5. Modules

In the Node.js servers and scripts you have written thus far, you have already consumed external functionality in the form of modules. In this chapter, I explain how this all works and how to write your own. In addition to all the powerful and functional modules that Node already provides for you, there is a huge community further developing modules that you can take advantage of in your programs, and indeed you can even write your own to give something back!

One of the cool things about Node is that you don’t really distinguish between modules that you have produced and modules that you consume from external repositories, such as those you see later in this chapter via npm, the Node Package Manager. When you write separate classes and groups of functions in Node, you put them in basically the same format—perhaps with a bit less dressing and documentation—as modules you download from the Internet and use. In fact, it usually takes only an extra bit of JSON and maybe a line or two of code to prepare your code for consumption by others!

Node ships with a large number of built-in modules, all of which are packaged in the node executable on your system. You can view their source if you download the Node source code from the nodejs.org website. They all live in the lib/ subdirectory.

Writing Simple Modules

At a high level, modules are a way to group common functionality in Node.js. If you have a library of functions or classes for working with a particular database server, for example, it would make a lot of sense to put that code into a module and package it for consumption.

Every file in Node.js is a module, although modules do not necessarily have to be this simple. You can package complex modules with many files, unit tests, documentation, and other support files into folders and consume them in the same way you would a module with only a single JavaScript file (see “Writing Modules” later in this chapter).

To write your own module that exposes, or exports, a function called hello_world, you can write the following and save it to mymodule.js:

exports.hello_world = function () {
    console.log("Hello World");
}

The exports object is a special object created by the Node module system in every file you create and is returned as the value of the require function when you include that module. It lives off the module object that every module has and is used to expose functions, variables, or classes. In the simple example here, the module exposes a single function on the exports object, and to consume it, you could write the following and save it to modtest.js:

var mm = require('./mymodule');
mm.hello_world();

Running node modtest.js causes Node to print out "Hello World" exactly as you would expect. You can expose as many functions and classes off the exports object as you’d like. For example:

function Greeter (lang) {
    this.language = lang;
    this.greet = function () {
        switch (this.language) {
          case "en": return "Hello!";
          case "de": return "Hallo!";
          case "jp": return "こんにちは!";
          default: return "No speaka that language";
        }
    }
}

exports.hello_world = function () {
    console.log("Hello World");
}

exports.goodbye = function () {
    console.log("Bye bye!");
}

exports.create_greeter = function (lang) {
    return new Greeter(lang);
}

The module variable given to each module contains information such as the filename of the current module, its child modules, its parent module, and more.

Modules and Objects

You frequently return objects from modules that you write. There are two key patterns through which you do this.

The Factory Model

The previous sample module contains a class called Greeter. To get an instance of a Greeter object, you call a creation function—or factory function—to create and return an instance of this class. The basic model is as follows:

function ABC (parms) {
    this.varA = ...;
    this.varB = ...;
    this.functionA = function () {
        ...
    }
}

exports.create_ABC = function (parms) {
    return new ABC(parms);
}

The advantage to this model is that the module can still expose other functions and classes via the exports object.

The Constructor Model

Another way to expose classes from a module you write would be to completely replace the exports object in the module with a class that you want people to use:

function ABC () {
    this.varA = 10;
    this.varB = 20;
    this.functionA = function (var1, var2) {
        console.log(var1 + " " + var2);
    }
}

module.exports = ABC;

To use this module, you would change your code to be the following:

var ABCClass = require('./conmod2');
var obj = new ABCClass();
obj.functionA(1, 2);

Thus, the only thing you are really exposing from the module is a constructor for the class. This approach feels nice and OOP-y, but has the disadvantage of not letting you expose much else from your module; it also tends to feel a bit awkward in the Node way of doing things. I showed it to you here so that you can recognize it for what it is when you see it, but you will almost never use it in this book or your projects—you will largely stick with the factory model.

npm: The Node Package Manager

Apart from writing your own modules and using those provided by Node.js, you will frequently use code written by other people in the Node community and published on the Internet. The most common way to get such code today is by using npm, the Node Package Manager. npm is installed along with Node (as you saw in Chapter 1, “Getting Started”), and you can go to the command line and type npm help to verify that it’s there and working.

To install modules via npm, you use the npm install command, which requires only the name of the module package you want to install. Many npm modules have their source code hosted on github.com, and their documentation usually tells you the package name you need; for example:

host:ch5 marcw$ npm install mysql
learning_node_js@1.0.0 /Users/marcwan/src/misc/LearningNodeJS/Chapter05
└─┬ mysql@2.11.1
  ├── bignumber.js@2.4.0
  ├─┬ readable-stream@1.1.14
  │ ├── core-util-is@1.0.2
  │ ├── inherits@2.0.1
  │ ├── isarray@0.0.1
  │ └── string_decoder@0.10.31
  └── sqlstring@2.0.1

If you’re not sure of the name of the package you want to install, you can use the npm search command, as follows:

npm search sql

This command prints the name and description of all matching modules.

However, you’re going to have a far richer and easier experience if you search by visiting npmjs.org and looking there.

npm installs module packages to the node_modules/ subdirectory of your project. If a module package itself has any dependencies, they are installed to a node_modules/ subdirectory of that module’s folder.

+ project/
    + node_modules/
        module1
        module2
            + node_modules/
                dependency1
    main.js

To see a list of all modules that a project is currently using, you can use the npm ls command:

host:ch05 marcwan$ npm ls
learning_node_js@1.0.0 /Users/marc/src/misc/LearningNodeJS/Chapter05
└─┬ mysql@2.11.1
  └── bignumber.js@2.4.0

To update an installed package to a newer version, use the npm update command. If you specify a package name, it updates only that one. If you do not specify a package name, it updates all packages to their latest version. If there are no changes to the package, it will print out nothing:

host:ch5 marcw$ npm update mysql
host:ch5 marcw$

Consuming Modules

As you have already seen, to include a module in a Node file that you are writing, you use the require function. To be able to reference the functions and/or classes on that module, you assign the results (the exports object of the loaded module) to a variable:

var http = require('http');

Included modules are private to the module that includes them, so if a.js loads the http module, then b.js cannot reference it, unless it itself also loads http.

Searching for Modules

Node.js uses a pretty straightforward set of rules for finding modules requested with the require function:

1. If the requested module is a built-in one—such as http or fs—Node uses that.

2. If the module name in the require function begins with a path component (./, ../, or /), Node looks in the specified location for that module and tries to load it there. If you don’t specify a .js extension on your module name, Node first looks for a folder-based module of that name. If it does not find that, it adds the extensions .js, .json, and .node, in that order, and tries to load modules of those types. (Modules with the extension .node are compiled add-on modules.)

3. If the module name does not have a path component at the beginning, Node looks in the node_modules/ subfolder of the current folder for the module. If it is found, that is loaded; otherwise, Node works its way up the directory tree of the current location, looking in node_modules/ folders along the way. If those all fail, it looks in some standard default locations, such as /usr/lib or /usr/local/lib on Linux and Mac machines, or C:\Users\<username>\AppData\Roaming\npm if you’re running on Windows.

4. If the module isn’t found in any of these locations, an error is thrown.

Module Caching

After a module has been loaded from a particular file or directory, Node.js caches it. Subsequent calls to require that would load the same module from the same location get the exact same code, with any initialization or other work that has taken place. Where this becomes interesting is in situations where we have a few different people asking for the same module. Consider the following project structure:

+ my_project/
    + node_modules/
        + special_widget/
            + node_modules/
                mail_widget (v2.0.1)
        mail_widget (v1.0.0)
    main.js
    utils.js

In this example, if either main.js or utils.js requires mail_widget, it gets v1.0.0 because Node’s search rules find it in the node_modules/ subdirectory of my_project. However, if they require special_widget, which in turn wishes to use mail_widget, special_widget gets its own privately included version of mail_widget, the v2.0.1 one in its own node_modules/ folder.

This is one of the most powerful and awesome features of the Node.js module system! In so many other systems, modules, widgets, or dynamic libraries are all stored in a central location, which creates versioning nightmares when you require packages that themselves require different versions of some other module. In Node, they are free to include these different versions of the other modules, and Node’s namespace and module rules mean that they do not interfere with each other at all! Individual modules and portions of a project are free to include, update, or modify included modules as they see fit without affecting the rest of the system.

In short, Node.js works intuitively, and for perhaps the first time in your life, you don’t have to sit there endlessly cursing the package repository system you’re using.

Cycles

Consider the following situation:

- a.js requires b.js.

- b.js requires a.js.

- main.js requires a.js.

You can see that you clearly have a cycle in the preceding modules. Node stops cycles from being a problem by simply returning uninitialized modules when it detects one. In the preceding case, the following happens:

1. main.js is loaded, and code runs that requires a.js.

2. a.js is loaded, and code runs that requires b.js.

3. b.js is loaded, and code runs that requires a.js.

4. Node detects the cycle and returns an object referring to a.js, but does not execute any more code—the loading and initialization of a.js are unfinished at this point!

5. b.js, a.js, and main.js all finish initializing (in that order), and then the reference from b.js to a.js is valid and fully usable.

Writing Modules

Recall that every file in Node.js is itself a module, with a module and exports object. However, you also should know that modules can be a bit more complicated than that, with a directory to hold its contents and a file containing packaging information. For those cases in which you want to write a bunch of support files, break up the functionality of the module into separate JavaScript files, or even include unit tests, you can write modules in this format.

The basic format is as follows:

1. Create the folder to hold the module contents.

2. Put a file called package.json into this folder. This file should contain at least a name for the module and the main JavaScript file that Node should load initially for that module.

3. If Node cannot find the package.json file or no main JavaScript file is specified, it looks for index.js (or index.node for compiled add-on modules).

Creating Your Module

Now take the code you wrote for managing photos and albums in the preceding chapter and put it into a module. Doing so lets you share it with other projects that you write later and isolate the code so you can write unit tests, and so on.

First, create the following directory structure in the source scratch directory (that is, ~/src/scratch or wherever you’re playing around with Node):

+ album_mgr/
    + lib/
    + test/

In the album_mgr folder, create a file called package.json and put the following in it:

{ "name": "album-manager",
  "version": "1.0.0",
  "main": "./lib/albums.js" }

This is the most basic of package.json files; it tells npm that the package should have the friendly name album-manager and that the “main,” or starting, JavaScript file for the package is albums.js in the lib/ subdirectory. package.json files can contain many other fields, including descriptions, author information, licensing, and so on; the npm documentation covers them in detail.
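For illustration only (the extra field values here are invented), a slightly fuller package.json for this module might look like the following:

```json
{ "name": "album-manager",
  "version": "1.0.0",
  "main": "./lib/albums.js",
  "description": "A simple directory-based photo album manager",
  "author": "Your Name <you@example.com>",
  "license": "MIT" }
```

All of these fields are standard ones recognized by npm; run npm help json to see the full list.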

The preceding directory structure is by no means mandatory or written in stone; it is simply one of the common layouts for packages that I have found to be useful and have thus latched on to. You are under no obligation to follow it. I do, however, recommend that you start doing things this way and start experimenting with different layouts only after you’re comfortable with the whole system.

Sites such as github.com that frequently host Node module source automatically display Readme documentation if they find it. Thus, it is pretty common for people to include a Readme.md file (“md” stands for Markdown, the documentation format github.com uses). You are highly encouraged to write documentation for your modules to help people get started using them. For the album-manager module, I wrote the following Readme file:

# Album-Manager

This is our module for managing photo albums based on a directory. We
assume that, given a path, there is an albums sub-folder, and each of
its individual sub-folders are themselves the albums. Files in those
sub-folders are photos.

## Album Manager

The album manager exposes a single function, `albums`, which returns
an array of `Album` objects, one for each album it finds.

## Album Object

The album object has the following two properties and one method:

* `name` -- The name of the album
* `path` -- The path to the album
* `photos()` -- Calling this method will return all the album's photos

Now you can write your actual module files. First, start with the promised lib/albums.js, which is just some of the album-loading code from Chapter 4, “Writing Applications,” repackaged into a module-like JavaScript file:

var fs = require('fs'),
    album = require('./album.js');

exports.version = "1.0.0";

exports.albums = function (root, callback) {
    // we will just assume that any directory in our 'albums'
    // subfolder is an album.
    fs.readdir(root + "/albums", (err, files) => {
        if (err) {
            callback(err);
            return;
        }

        var album_list = [];

        (function iterator(index) {
            if (index == files.length) {
                callback(null, album_list);
                return;
            }

            fs.stat(root + "/albums/" + files[index], (err, stats) => {
                if (err) {
                    callback(make_error("file_error",
                                        JSON.stringify(err)));
                    return;
                }
                if (stats.isDirectory()) {
                    var p = root + "/albums/" + files[index];
                    album_list.push(album.create_album(p));
                }
                iterator(index + 1);
            });
        })(0);
    });
};

function make_error(err, msg) {
    var e = new Error(msg);
    e.code = err;
    return e;
}

One of the standard things to provide in the exported functionality of modules is a version member field. Although I don’t always use it, it can be a helpful way for calling code to check the module’s version and execute different code depending on what it finds.

You can see that the album functionality is split into a new file called lib/album.js, and there is a new class called Album. This class looks as follows:

function Album (album_path) {
    this.name = path.basename(album_path);
    this.path = album_path;
}

Album.prototype.name = null;
Album.prototype.path = null;
Album.prototype._photos = null;

Album.prototype.photos = function (callback) {
    if (this._photos != null) {
        callback(null, this._photos);
        return;
    }

    fs.readdir(this.path, (err, files) => {
        if (err) {
            if (err.code == "ENOENT") {
                callback(no_such_album());
            } else {
                callback(make_error("file_error", JSON.stringify(err)));
            }
            return;
        }

        var only_files = [];

        var iterator = (index) => {
            if (index == files.length) {
                callback(null, only_files);
                return;
            }

            fs.stat(this.path + "/" + files[index], (err, stats) => {
                if (err) {
                    callback(make_error("file_error",
                                        JSON.stringify(err)));
                    return;
                }
                if (stats.isFile()) {
                    only_files.push(files[index]);
                }
                iterator(index + 1);
            });
        };
        iterator(0);
    });
};

If you’re confused by the prototype keyword used a few times in the preceding source code, perhaps now is a good time to jump back to Chapter 2 and review the section on writing classes in JavaScript. The prototype keyword here is simply a way to set properties on all instances of our Album class.
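As a tiny standalone refresher (the names here are invented for the example), anything placed on the prototype is shared by every instance created with new:

```javascript
function Album(album_path) {
    this.path = album_path;
}

// One function object, shared by all Album instances via the prototype.
Album.prototype.describe = function () {
    return "Album at " + this.path;
};

var a1 = new Album("/photos/italy2012");
var a2 = new Album("/photos/japan2010");

console.log(a1.describe === a2.describe);   // true -- shared function
console.log(a1.describe());                 // Album at /photos/italy2012
```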

Again, this is pretty much what you saw in Chapter 4 with the basic JSON server. The only real difference is that it is packaged into a class with a prototype object and method called photos.

I hope you also noted the following two things:

1. You now use a new built-in module called path, and you use the basename function on it to extract the album’s name from the path.

2. By using arrow functions for anonymous callbacks within this class, we avoid the problems with the this pointer mentioned in “Who Am I? Maintaining a Sense of Identity” in Chapter 3, “Asynchronous Programming.” If you’re not sure what we’re talking about here, please take a moment to refer back to that section.
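If you’d like a quick reminder of the problem, here is a small contrived sketch: a plain function used as a callback gets its own this, while an arrow function captures the this of the method that created it:

```javascript
function Album(name) {
    this.name = name;
}

Album.prototype.get_name_fn = function () {
    // a plain function gets its own 'this' when invoked later
    return function () { return this && this.name; };
};

Album.prototype.get_name_arrow = function () {
    // an arrow function captures this Album instance's 'this'
    return () => this.name;
};

var a = new Album("italy2012");
console.log(a.get_name_fn()());      // not "italy2012" -- 'this' was lost
console.log(a.get_name_arrow()());   // italy2012
```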

The rest of the album.js file is simply as follows:

var path = require('path'),
    fs = require('fs');

// Album class code goes here

exports.create_album = function (album_path) {
    return new Album(album_path);
};
function make_error(err, msg) {
    var e = new Error(msg);
    e.code = err;
    return e;
}
function no_such_album() {
    return { error: "no_such_album",
             message: "The specified album does not exist." };
}

And that is all you need for your album-manager module! To test it, go back to the scratch directory and enter the following test program as atest.js:

var amgr = require('./album_mgr');  // Our module is in the album_mgr dir as per above

amgr.albums('./', function (err, albums) {
    if (err) {
        console.log("Unexpected error: " + JSON.stringify(err));
        return;
    }

    var iterator = (index) => {
        if (index == albums.length) {
            console.log("Done");
            return;
        }

        albums[index].photos(function (err, photos) {
            if (err) {
                console.log("Err loading album: " + JSON.stringify(err));
                return;
            }

            console.log(albums[index].name);
            console.log(photos);
            console.log("");
            iterator(index + 1);
        });
    }
    iterator(0);
});

Now, all you have to do is ensure you have an albums/ subfolder in the current directory, and you should be able to run atest.js and see something like the following:

hostname:Chapter05 marcw$ node atest
australia2010
[ 'aus_01.jpg',
  'aus_02.jpg',
  'aus_03.jpg',
  'aus_04.jpg',
  'aus_05.jpg',
  'aus_06.jpg',
  'aus_07.jpg',
  'aus_08.jpg',
  'aus_09.jpg' ]

italy2012
[ 'picture_01.jpg',
  'picture_02.jpg',
  'picture_03.jpg',
  'picture_04.jpg',
  'picture_05.jpg' ]

japan2010
[ 'picture_001.jpg',
  'picture_002.jpg',
  'picture_003.jpg',
  'picture_004.jpg',
  'picture_005.jpg',
  'picture_006.jpg',
  'picture_007.jpg' ]

Done

Developing with Your Module

You now have a module for working with albums. If you would like to use it in multiple projects, you could copy it to the node_modules/ folder of your other projects, but then you would have a problem: What happens when you want to make a change to your albums module? Do you have to copy the source code over to all the locations it is being used each and every time you change it? Ideally, we’d like to be able to use npm even for our own private modules but not risk having them get uploaded to the actual npm repository on the Internet.

Fortunately, npm solves both of these problems for us. You can modify the package.json file to add the following:

{ "name": "album-manager",
  "version": "1.0.0",
  "main": "./lib/albums.js",
  "private": true }

This line tells npm never to publish the module to the live npm repository, which is exactly what you want for this module right now.

Then, you can use the npm link command, which tells npm to put a link to the album-manager package in the local machine’s default public package repository (such as /usr/local/lib/node_modules on Linux and Mac machines, or C:\Users\<username>\AppData\Roaming\npm on Windows).

host:Chapter05 marcw$ cd album_mgr
host:album_mgr marcw$ sudo npm link
/usr/local/lib/node_modules/album-manager ->
/Users/marcw/src/scratch/Chapter05/album_mgr

Note that depending on how the permissions and such are set up on your local machine, you might need to run this command as super-user with sudo (Windows users will certainly not need to).

Now, to consume this module, you need to do two things:

1. Refer to 'album-manager' instead of 'album_mgr' in the code (because npm uses the name field in package.json).

2. Create a reference to the album-manager module with npm for each project that wants to use it. You can just type npm link album-manager:

host:Chapter05 marcw$ mkdir test_project
host:Chapter05 marcw$ cd test_project/
host:test_project marcw$ npm link album-manager
/Users/marcw/src/scratch/Chapter05/test_project/node_modules/album-manager ->
   /usr/local/lib/node_modules/album-manager ->
   /Users/marcw/src/scratch/Chapter05/album_mgr
host:test_project marcw$ dir
drwxr-xr-x   3 marcw  staff  102 11 20 18:38 node_modules/
host:test_project marcw$ dir node_modules/
lrwxr-xr-x  1 marcw  staff   41 11 20 18:38 album-manager@ ->
   /usr/local/lib/node_modules/album-manager

Now, you are free to make changes to your original album manager source, and all referencing projects will see changes right away.

Publishing Your Modules

If you have written a module that you would like to share with other users, you can publish it to the official npm registry using npm publish. This requires you to do the following:

1. Remove the "private": true line from the package.json file.

2. Create an account on the npm registry servers with npm adduser.

3. Optionally, fill in more fields in package.json (run npm help json to get more information on which fields you might want to add) with things such as a description, author contact information, and host website.

4. Finally, run npm publish from the module directory to push it to npm. That’s it!

host:album_mgr marcw$ npm adduser
Username: learningnode_test
Password:
Email: (this IS public) [email protected]
Logged in as learningnode_test on https://registry.npmjs.org/.
host:album_mgr marcw$ npm publish
+ album-manager@1.0.0

If you accidentally publish something you didn’t mean to or otherwise want to remove from the npm registry, you can use npm unpublish:

host:album_mgr marcw$ npm unpublish
npm ERR! Refusing to delete entire project.
npm ERR! Run with --force to do this.
npm ERR! npm unpublish <project>[@<version>]
host:album_mgr marcw$ npm unpublish --force
npm WARN using --force I sure hope you know what you are doing.
- album-manager@1.0.0

If you see the following when trying to publish a module:

npm ERR! publish Failed PUT 403
npm ERR! Darwin 15.6.0
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "publish"
npm ERR! node v6.3.1
npm ERR! npm  v3.10.3
npm ERR! code E403

npm ERR! you do not have permission to publish "album-manager". Are you logged in as
the correct user? : album-manager

It most likely means that somebody else has registered a module with this name. Your best bet is to choose another name.

Managing Asynchronous Code

You have already used a few of the Node.js built-in modules in code written thus far (http, fs, path, querystring, and url), and you will use many more throughout the rest of the book. However, there are one or two modules you will find yourself using for nearly every single project to manage a problem every Node.js programmer runs into: managing asynchronous code. We show two solutions here.

The Problem

Consider the case in which you want to write some asynchronous code to

- Open a handle to a path.

- Determine whether or not the path points to a file.

- Load in the contents of the file if the path does point to a file.

- Close the file handle and return the contents to the caller.

You’ve seen almost all this code before, and the function to do this looks something like the following:

var fs = require('fs');

function load_file_contents(path, callback) {
    fs.open(path, 'r', (err, f) => {
        if (err) {
            callback(err);
            return;
        } else if (!f) {
            callback({ error: "invalid_handle",
                       message: "bad file handle from fs.open"});
            return;
        }
        fs.fstat(f, (err, stats) => {
            if (err) {
                callback(err);
                return;
            }
            if (stats.isFile()) {
                var b = Buffer.alloc(stats.size);
                fs.read(f, b, 0, stats.size, null, (err, br, buf) => {
                    if (err) {
                        callback(err);
                        return;
                    }

                    fs.close(f, (err) => {
                        if (err) {
                            callback(err);
                            return;
                        }
                        callback(null, b.toString('utf8', 0, br));
                    });
                });
            } else {
                callback({ error: "not_file",
                          message: "Can't load directory" });
                return;
            }
        });
    });
}

As you can see, even for a short, contrived example such as this, the code is starting to nest pretty seriously and deeply. Nest more than a few levels deep, and you’ll find that you cannot fit your code in an 80-column terminal or one page of printed paper anymore. It can also be quite difficult to read the code, figure out what variables are being used where, and determine the flow of the functions being called and returned.

Our Preferred Solution—async

To solve this problem, you can use an npm module called async. The async module provides an intuitive way to structure and organize asynchronous calls, and removes many, if not all, of the tricky parts of asynchronous programming you encounter in Node.js.

Executing Code in Serial

You can execute code serially in async in two ways: through the waterfall function or the series function (see Figure 5.1).

Figure 5.1 Serial execution with async.waterfall

The waterfall function takes an array of functions and executes them one at a time, passing the results from each function to the next. At the end, a resulting function is called with the results from the final function in the array. If an error is signaled at any step of the way, execution is halted, and the resulting function is called with that error instead.

For example, you could easily rewrite the previous code cleanly (it’s in the GitHub source tree) using async.waterfall:

var fs = require('fs');
var async = require('async');

function load_file_contents(path, callback) {
    async.waterfall([
        function (callback) {
            fs.open(path, 'r', callback);
        },
        // the f (file handle) was passed to the callback at the end of
        // the fs.open function call. async passes all params to us.
        function (f, callback) {
            fs.fstat(f, function (err, stats) {
                if (err)
                    // abort and go straight to resulting function
                    callback(err);
                else
                    // f and stats are passed to next in waterfall
                    callback(null, f, stats);
            });
        },
        function (f, stats, callback) {
            if (stats.isFile()) {
                var b = Buffer.alloc(stats.size);
                fs.read(f, b, 0, stats.size, null, function (err, br, buf) {
                    if (err)
                        callback(err);
                    else
                        // f and string are passed to next in waterfall
                        callback(null, f, b.toString('utf8', 0, br));
                });
            } else {
                callback({ error: "not_file",
                           message: "Can't load directory" });
            }
        },
        function (f, contents, callback) {
            fs.close(f, function (err) {
                if (err)
                    callback(err);
                else
                    callback(null, contents);
            });
        }
    ]
      // this is called after all have executed in success
      // case, or as soon as there is an error.
    , function (err, file_contents) {
        callback(err, file_contents);
    });
}

Although the code has grown a little bit in length, when you organize the functions serially in an array like this, the code is significantly cleaner looking and easier to read.

The async.series function differs from async.waterfall in two key ways:

- Results from one function are not passed to the next; instead, they are collected in an array, which becomes the “results” (second) parameter to the final resulting function. Each step of the serial call gets one slot in this results array.

- You can pass an object to async.series, and it enumerates the keys and executes the functions assigned to them. In this case, the results are passed not as an array, but as an object with the same keys as the functions called.

Consider this example:

var async = require("async");

async.series({
    numbers: (callback) => {
        setTimeout(function () {
            callback(null, [ 1, 2, 3 ]);
        }, 1500);
    },
    strings: (callback) => {
        setTimeout(function () {
            callback(null, [ "a", "b", "c" ]);
        }, 2000);
    }
},
(err, results) => {
    console.log(results);
});

This function generates the following output:

{ numbers: [ 1, 2, 3 ], strings: [ 'a', 'b', 'c' ] }

Executing in Parallel

In the previous async.series example, there was no reason to use a serial execution sequence for the functions; the second function did not depend on the results of the first, so they could have executed in parallel (see Figure 5.2). For this, async provides async.parallel, as follows:

var async = require("async");

async.parallel({

    numbers: function (callback) {
        setTimeout(function () {
            callback(null, [ 1, 2, 3 ]);
        }, 1500);
    },
    strings: function (callback) {
        setTimeout(function () {
            callback(null, [ "a", "b", "c" ]);
        }, 2000);
    }
},
function (err, results) {
    console.log(results);
});

Figure 5.2 Parallel execution with async.parallel

This function generates the exact same output as before.
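To see why the parallel version finishes sooner, here is a minimal hand-rolled sketch of what async.parallel does with an object of tasks (again, an illustration of the idea, not the library's source): every task is started immediately, and the final callback fires once the last one reports in.

```javascript
// A minimal sketch of parallel execution with an object of tasks.
// Nothing waits in line: all tasks start right away, and we count
// down until every one has called back.
function miniParallel(tasks, done) {
    var keys = Object.keys(tasks);
    var results = {};
    var remaining = keys.length;
    var finished = false;

    keys.forEach(function (key) {
        // every task is kicked off immediately
        tasks[key](function (err, result) {
            if (finished) return;        // ignore stragglers after an error
            if (err) {
                finished = true;
                return done(err);
            }
            results[key] = result;
            if (--remaining === 0) {
                finished = true;
                done(null, results);
            }
        });
    });
}

miniParallel({
    numbers: function (callback) {
        setTimeout(function () { callback(null, [ 1, 2, 3 ]); }, 15);
    },
    strings: function (callback) {
        setTimeout(function () { callback(null, [ "a", "b", "c" ]); }, 20);
    }
},
function (err, results) {
    console.log(results);   // { numbers: [ 1, 2, 3 ], strings: [ 'a', 'b', 'c' ] }
});
```

With the timeouts above, the whole run takes roughly as long as the slowest task (about 20ms), rather than the sum of both.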

Mixing It Up

The most powerful function of them all is the async.auto function, which lets you mix ordered and unordered functions together into one powerful sequence of functions. To it, you pass an object whose keys contain either

• A function to execute, or

• An array of dependencies followed by a function to execute. These dependencies are strings and are the names of other properties in the object provided to async.auto. The auto function waits for these dependencies to finish executing before calling the provided function.

The async.auto function figures out the required order to execute all the functions, including which can be executed in parallel and which need to wait for others (see Figure 5.3). As with the async.waterfall function, you can pass results from one function to the next via the callback parameter:

var async = require("async");

async.auto({
    numbers: (callback) => {
        setTimeout(() => {
            callback(null, [ 1, 2, 3 ]);
        }, 1500);
    },
    strings: (callback) => {
        setTimeout(() => {
            callback(null, [ "a", "b", "c" ]);
        }, 2000);
    },
    // do not execute this function until numbers and strings are done
    // thus_far is an object with numbers and strings as arrays.
    assemble: [ 'numbers', 'strings', (thus_far, callback) => {
        callback(null, {
            numbers: thus_far.numbers.join(",  "),
            strings: "'" + thus_far.strings.join("',  '") + "'"
        });
    }]
},
// this is called at the end when all other functions have executed. Optional
(err, results) => {
    if (err)
        console.log(err);
    else
        console.log(results);
});

Figure 5.3 Mixing execution models with async.auto

The results parameter passed to the final resulting function is an object in which the properties hold the results of each of the functions executed on the object:

{ numbers: [ 1, 2, 3 ],
  strings: [ 'a', 'b', 'c' ],
  assemble: { numbers: '1,  2,  3', strings: "'a',  'b',  'c'" } }

Looping Asynchronously

In Chapter 3, I showed you how you can use the following pattern to iterate over the items in an array with asynchronous function calls:

var iterator = (i) => {
    if (i < array.length) {
        async_work(function () {
            iterator(i + 1);
        });
    } else {
        callback(results);
    }
};
iterator(0);

Although this technique works great and is indeed gloriously geeky, it’s a bit more complicated than I’d like. The async module comes to the rescue again with async.forEachSeries. It iterates over every element in the provided array, calling the given function for each. However, it waits for each to finish executing before calling the next in the series:

async.forEachSeries(
    arr,
    // called for each element in arr
    (element, callback) => {
         // use element
         callback(null);  // YOU MUST CALL ME FOR EACH ELEMENT!
    },
    // called at the end
    function (err) {
        // was there an error?  err will be non-null then
    }
);

To simply loop over every element and then have async wait for all of them to finish, you can use async.forEach, which is called in exactly the same way but differs in that it does not execute the functions serially.
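Under the hood, the serial variant is essentially the iterator pattern from Chapter 3, packaged up. A hand-rolled sketch (the idea, not the library's source) makes the connection explicit:

```javascript
// A sketch of serial iteration: call the iterator for one element,
// and only move to the next element once its callback fires.
function miniForEachSeries(arr, iterator, done) {
    var step = function (i) {
        if (i >= arr.length) return done(null);
        iterator(arr[i], function (err) {
            if (err) return done(err);   // stop on the first error
            step(i + 1);                 // otherwise advance
        });
    };
    step(0);
}

var seen = [];
miniForEachSeries([ 10, 20, 30 ],
    function (element, callback) {
        seen.push(element);   // "use" the element
        callback(null);       // you must call this for each element!
    },
    function (err) {
        console.log(seen);    // [ 10, 20, 30 ], in order
    });
```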

The async module contains a ton of other functionality and is truly one of the indispensable modules of Node.js programming today. I highly encourage you to browse the documentation at https://github.com/caolan/async and play around with it. It truly takes the already-enjoyable Node.js programming environment and makes it even better.

Making Promises and Keeping Them

While the methods we have looked at thus far for managing asynchronous programming—patterns and the async module—are the primary ways you’ll work with Node.js throughout this book, another popular pattern for managing asynchronous programming is promises.

Promises come in various implementations and flavors. The one we discuss here comes via the bluebird module in Node, which can be installed by running npm install bluebird. There are a number of other promise modules for Node, most notably promises and Q, but we'll use bluebird for now because it has the ability to "promisify" entire modules in Node, which we'll find quite useful: working with promises requires APIs to be written differently, and the better promise packages can take a regular callback-based module and wrap it in promise-ready versions.

Just like async, promises seek to make writing asynchronous code easier for you by automatically passing parameters from callbacks to the next function invocation. Similarly, they aim to centralize all error processing in one place at the end.

Looking back to the above example of opening a file, seeing if it’s actually a file, reading its contents, and then closing it, we can rewrite this using promises in the following manner:

var Promise = require("bluebird");
var fs = Promise.promisifyAll(require("fs"));

function load_file_contents2(filename, callback) {
    var errorHandler = (err) => {
        console.log("SO SAD: " + err);
        callback(err, null);
    }

    fs.openAsync(filename, 'r', 0)
    .then(function (fd) {                                  // 1
        return fs.fstatAsync(fd)
        .then(function (stats) {
            if (stats.isFile()) {                          // 2
                var b = Buffer.alloc(stats.size);
                return fs.readAsync(fd, b, 0, stats.size, null)
                    .then(function () {
                        // close only after the read has finished
                        return fs.closeAsync(fd);
                    })
                    .then(function () {
                        callback(null, b.toString('utf8'));
                    });
            } else {
                // mirror the "not a file" error from the waterfall version
                throw new Error("Can't load directory");
            }
        });
    })
    .catch(errorHandler);
}

Using APIs with the promise pattern requires modification and conversion to promise-compatible versions. For example, we want to use the File System (fs) APIs with promises in the above example, so we import bluebird and convert the module as follows:

var Promise = require("bluebird");
var fs = Promise.promisifyAll(require("fs"));

The bluebird module leaves all the original functions in place and adds new promisified versions of every function that passes an error as the first parameter to its callback. These modified versions have Async appended to the function name.
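To get a feel for what promisification involves, here is a simplified, hand-rolled sketch of wrapping a single callback-style function; bluebird's promisifyAll does considerably more, applying this transformation to every function in a module. The delayedDouble function is a made-up stand-in for a typical Node API that follows the callback(err, result) convention:

```javascript
// A sketch of promisifying one callback-style function: return a new
// function that yields a Promise instead of taking a callback.
function promisify(fn) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        return new Promise(function (resolve, reject) {
            fn.apply(null, args.concat(function (err, result) {
                if (err) reject(err);    // error-first callback rejects
                else resolve(result);    // success value resolves
            }));
        });
    };
}

// a stand-in for a typical callback-style Node API
function delayedDouble(n, callback) {
    setTimeout(function () { callback(null, n * 2); }, 10);
}

var delayedDoubleAsync = promisify(delayedDouble);
delayedDoubleAsync(21).then(function (result) {
    console.log(result);   // 42
});
```

Node's built-in util.promisify works along the same lines for individual functions.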

Effectively, when calling the then function:

• The provided function is executed.

• If the callback has a non-null error, all further promises are skipped until the catch method is reached, and it is passed the error.

• If the callback has a value, this is passed to the next then function in the chain.
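These are the standard then/catch mechanics, and you can see them with plain built-in Promises, no bluebird required: a value flows to the next then, while a rejection skips every remaining then and lands in the catch.

```javascript
// Demonstrating promise chain mechanics with built-in Promises:
// values flow down the chain until an error short-circuits to catch.
Promise.resolve(2)
    .then(function (n) {
        return n * 10;                 // 20 is passed to the next then
    })
    .then(function (n) {
        throw new Error("boom");       // simulate a failed async step
    })
    .then(function (n) {
        console.log("never reached");  // skipped entirely
    })
    .catch(function (err) {
        console.log("caught: " + err.message);   // caught: boom
    });
```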

In the above promises code, we have to deal with a couple of interesting problems. For the part marked with // 1, we originally wanted to write the code like this:

    fs.openAsync("promises.js", 'r')
        .then(fs.fstatAsync)          // fd passed from openAsync to us by promises
        .then(function (stats) {
            if (stats.isFile()) {
               fs.readAsync(fd, ...); // etc

This, however, presents a problem. Once you've verified the path is a file, you want to read from it, which requires the fd (file descriptor) parameter that openAsync passes to its callback. Because you just consumed this quietly in the promise chain above, you don't have it anywhere to pass to readAsync. Thus, you can see in the final code that you actually create a new function with fd as one of its parameters so that code anywhere in that scope can refer to the file descriptor.

Our second problem is then how to deal with the branching in // 2. You want to execute different code depending on whether the given path is a file or not. Looking at the above code, you can see that the way to do this is to just start another promise chain within the first promise chain—there is no limit to how deeply you nest them!

So, while our code still nests a bit to deal with all the possible paths, we've managed to make it significantly more compact, and also to factor all the error-handling code into one place, which is a big improvement over our original function. Much like async, the promises model of asynchronous programming comes with solutions for asynchronous looping and the parallel execution constructs you would expect from the asynchronous world of Node.

How We’ll Manage Asynchronous Code

While promises can be an effective and common solution to managing asynchronous code, we’ll continue using mostly async and regular callbacks in this book as I find promises to have two shortcomings that don’t really work for me:

1. Using promises requires either rewriting your APIs to be promise-enabled or using promise modules such as bluebird that provide promisify functionality. The former adds complexity and differs between the various promises systems, while the latter is limited and can’t always provide promisified versions of things (e.g., the old fs.exists function, which never returned an (err) in its callback).

2. I find the code you write in promises not particularly readable. To my eyes, it looks complicated and introduces too many new concepts and functions you have to get used to in order to solve all the different problems. In this regard, I find async produces far cleaner code.

As always, you are encouraged to play around with all the different paradigms out there (there are other approaches to solving the asynchronous programming problem!) and choose which one works best for you.

Summary

In this chapter you were more formally introduced to modules in Node.js. Although you have seen them before, now you finally know how they’re written, how Node finds them for inclusion, and how to use npm to find and install them. You can write your own complex modules now with package.json files and link them across your projects or even publish them for others to use via npm.

Finally, you are now armed with knowledge of various approaches to cleaning up asynchronous programming, in particular async, one of the modules that you will use in nearly every single Node project you write from now on.

Next up: Putting the “web” back in web servers. You look at some cool ways to use JSON and Node in your web apps and how to handle some other core Node technologies such as events and streams.
