Application Development

Asynchronous Processing in Node.js

Here’s Part 1 in a four-part series on asynchronous Node.js development.

By Dan McGhan

July/August 2017

JavaScript has been called the most popular programming language. Why is it so popular? It likely has something to do with the fact that JavaScript is the programming language of the web, that web browsers are ubiquitous, and that websites and apps are more popular than ever.

JavaScript was first released in 1995 as an event-driven language. As users interacted with content in a web page, events were triggered and JavaScript code registered with those events was executed.

A platform for running server-side JavaScript, Node.js was introduced in 2009. Node.js uses an asynchronous, event-driven I/O model that makes it efficient and scalable. But what exactly does that mean, and how are developers supposed to write asynchronous code? This Oracle Magazine article series will answer these questions and provide readers with a solid foundation for doing asynchronous work with Node.js. Here’s a list of the topics that will be covered in this Node.js series:

  1. Introduction to asynchronous processing in Node.js
  2. Using the Async module
  3. The promise of promises
  4. Async done right with async/await

In this first installment in the series, you’ll learn some of the basics related to asynchronous processing with Node.js. Subsequent articles will build on the concepts covered in this article to do some more-complicated tasks.

The code in this article will be somewhat simple, enabling you to focus on core concepts. If you’d like to set up a test environment that you can use to work through the examples, check out “Creating a Sandbox for Learning Node.js and Oracle Database” on the Oracle and JavaScript blog.

Beginning at the Beginning

Let’s start with an overview of the Node.js architecture. Once Node.js is installed, you can run it from the command line as “node.” Running “node” without any arguments starts the Node.js interactive mode, which is basically a read-eval-print loop (REPL).

$ node
> 1 + 2
3
> console.log('Hello world!');
Hello world!
undefined
> process.exit(0);
$

As you can see, JavaScript code can be entered and executed interactively. Note that process.exit is used to exit the REPL (pressing Ctrl-C twice will also exit). The “process” object is a global object that provides programmatic access to the currently running Node.js process. It is important to understand that Node.js runs as an operating system process.
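As a small illustration of the “process” object, the sketch below reads a few of its standard properties. The variable name info is just a placeholder chosen for this example, and the values printed will differ from machine to machine.

```javascript
// process is a global object, so no require() call is needed.
const info = {
    pid: process.pid,           // operating system process ID
    version: process.version,   // Node.js version string, which starts with 'v'
    platform: process.platform  // for example 'darwin', 'linux', or 'win32'
};

console.log(info);
```

The process ID printed here is the same one you would see for the Node.js process in a tool such as Activity Monitor.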

Although Node.js is often described as a single-threaded environment, that description isn’t exactly complete. In fact, as shown in Figure 1, just by running Node.js and looking up the process with Activity Monitor on my Mac, I can see that the Node.js process is running 10 threads!

Figure 1: Single-threaded Node.js and its 10 threads

Most of these threads are created by two open source libraries that Node.js depends on: V8 and Libuv. V8 is the high-performance JavaScript engine that’s used by the Chrome web browser, and Libuv is the cross-platform framework that provides the evented architecture that makes Node.js so efficient.

When Node.js is described as a single-threaded environment, it’s because only one thread will be dedicated to running Libuv’s event loop. This thread is often referred to as the main thread, and it’s where all the JavaScript code in a Node.js application will run. And because all JavaScript code shares the same thread, it’s very important that no application code consume much time on the thread. Enter asynchronous Node.js code.
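To make the cost of hogging the main thread concrete, here is a minimal sketch. It assumes a made-up helper, blockFor, that stands in for slow synchronous work by spinning in a loop. A timer is requested for 10 milliseconds, but its callback cannot run until the blocking code has finished.

```javascript
const start = Date.now();
let timerDelay = null;

// Ask for a callback after 10 milliseconds.
setTimeout(function () {
    timerDelay = Date.now() - start;
    console.log('timer ran after ' + timerDelay + ' ms');
}, 10);

// Hypothetical stand-in for slow synchronous work: spin for a while.
function blockFor(ms) {
    const end = Date.now() + ms;
    while (Date.now() < end) {
        // Busy-wait; nothing else can run on the main thread.
    }
}

blockFor(200);
console.log('done blocking');
```

When run, “done blocking” should appear first, and the timer should report a delay of roughly 200 milliseconds rather than the 10 that were requested.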

Getting to Asynchronous

Many languages have features that allow for asynchronous programming, and JavaScript is particularly adept at this, due to its event-driven nature. Let’s get into some code examples that demonstrate the difference between synchronous code and asynchronous Node.js code.

Here’s that “hello world” you knew was coming:

console.log('hello');
console.log('world');

Add those lines to a file named sync.js. To run this script with Node.js, open a terminal, change directories to where sync.js was created, and run the following command.

$ node sync.js
hello
world

As you can see, “hello” appeared first in the console, followed by “world”—completely synchronous. So how can the code be made to run asynchronously? The answer is to use one of the asynchronous APIs provided by a built-in or third-party module. Node.js has asynchronous APIs for many different types of operations, including timers, disk and network I/O, and CPU-intensive tasks (such as encryption and compression). These operations should all be done asynchronously.
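As one possible illustration of an asynchronous API for a CPU-intensive task, the sketch below uses crypto.pbkdf2 from the built-in crypto module to derive a key from a password. The password, salt, and iteration count are placeholder values chosen for this example, and the events array exists only to record the order in which things happen.

```javascript
const crypto = require('crypto');
const events = [];

// Asynchronous key derivation: the work is handed off, and the callback
// runs on the main thread once the derived key is ready.
crypto.pbkdf2('secret', 'salt', 10000, 32, 'sha256', function (err, key) {
    if (err) {
        console.error(err);
        return;
    }
    events.push('key ready');
    console.log('key ready: ' + key.toString('hex'));
});

// This code runs right away, before the callback above.
events.push('after call');
console.log('pbkdf2 started; the main thread is free');
```

The message about the main thread being free should appear before the derived key, because the callback is not invoked until the key derivation has finished.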

Here’s an example of asynchronous code that uses a simple timer:

setTimeout(function () {
    console.log('hello');
}, 3000);
console.log('world');

Save the script to a file named async.js, and run it with Node.js (as you ran the sync.js script before). You should see “world” appear immediately and “hello” appear three seconds later.

$ node async.js
world
hello

Surprising, no? The setTimeout function is an asynchronous API that takes two parameters: a callback function and the number of milliseconds to wait before running the function. setTimeout was implemented as an asynchronous API because pausing on the main thread would prevent all JavaScript code from running for the specified length of time. When setTimeout has finished doing its work (just waiting, in this case), it places the callback function in a queue to be executed as soon as the main thread is free.
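One way to see the callback queue at work is to ask for a delay of zero milliseconds. In the sketch below, the order array is an addition for this example that records what ran when. Even with no delay, the callback has to wait until the rest of the script has finished running on the main thread.

```javascript
const order = [];

// A delay of 0 does not mean "run now"; the callback is still queued.
setTimeout(function () {
    order.push('timer callback');
    console.log('timer callback');
}, 0);

// The rest of the script finishes before any queued callback runs.
order.push('end of script');
console.log('end of script');
```

Running this prints “end of script” first and “timer callback” second.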

Figure 2 illustrates how asynchronous APIs work in Node.js.

Figure 2: How asynchronous Node.js APIs work

Now that you have a better understanding of how asynchronous work is handled in Node.js, let’s talk about an issue newcomers sometimes run into: the pyramid of doom! The pyramid of doom, aka callback hell, results from the way anonymous callback functions are often nested inside one another, with each level indented further to keep the code readable.

Here’s a Node.js example that makes three asynchronous API calls nested within each other to control the order of execution:

setTimeout(function () {
    console.log('1: three seconds after the start');
    setTimeout(function () {
        console.log('2: two seconds after 1');
        setTimeout(function () {
            console.log('3: one second after 2');
        }, 1000);
    }, 2000);
}, 3000);

Save the script to a file named pyramid-of-doom.js, and run it. You should see the following output, with each timer’s start relative to when the timer before it finished.

$ node pyramid-of-doom.js
1: three seconds after the start
2: two seconds after 1
3: one second after 2

Looking back at the example code, do you see the horizontal pyramid (with white space on the left) beginning to take shape as you nest callback functions? That’s the pyramid of doom. In this example, it’s not so bad, but in real-world applications, this issue can become difficult to manage.

Fortunately, there are many solutions that can help developers avoid this problem. The simplest way to avoid the pyramid of doom is to avoid nesting anonymous callback functions by using named functions instead. The previous example can be rewritten as follows:

function doWork1 () {
    setTimeout(function () {
        console.log('1: three seconds after the start');
        doWork2();
    }, 3000);
}
function doWork2 () {
    setTimeout(function () {
        console.log('2: two seconds after doWork1');
        doWork3();
    }, 2000);
}
function doWork3 () {
    setTimeout(function () {
        console.log('3: one second after doWork2');
    }, 1000);
}
doWork1(); // Starts the function chain

Save the script to a file named named-functions.js, and run it as before. The timing should work as in the previous example.

$ node named-functions.js
1: three seconds after the start
2: two seconds after doWork1
3: one second after doWork2

Although they’re effective at limiting the level of indentation, named functions alone can help you only so much with respect to asynchronous programming. For example, this technique can be used only for sequential processing. For more-complex flows, such as those involving parallel processing, you’ll need some better tools.
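To get a feel for why parallel flows call for better tools, consider the rough sketch below. It starts three timers at the same time and uses a hand-rolled counter to detect when all of them have finished. The taskDone helper and the pending and results variables are made up for this example; they are not part of any Node.js API.

```javascript
const results = [];
let pending = 3; // number of timers still running

// Hypothetical helper: each timer calls this when it finishes.
function taskDone(name) {
    results.push(name);
    pending -= 1;
    if (pending === 0) {
        console.log('all done, finished in this order: ' + results.join(', '));
    }
}

// All three timers start at once, so the shortest one finishes first.
setTimeout(function () { taskDone('A'); }, 300);
setTimeout(function () { taskDone('B'); }, 200);
setTimeout(function () { taskDone('C'); }, 100);
```

The tasks finish in the order C, B, A, about 300 milliseconds after the start. The bookkeeping is manageable for three timers, but it grows quickly once error handling and results from real work are involved.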

In the articles that follow, I will explore more-robust solutions for handling asynchronous flows in Node.js, including the Async module, promises, and a new feature of JavaScript that builds on promises and greatly simplifies asynchronous code: async/await. I’ll explore each of these solutions in the context of a database query.

Next Steps

LEARN more about JavaScript and Oracle.

 
