Jan 12 2014

Hooking into Node loader for fun and profit

Log loaded files, add code coverage and extra features on the fly.

When you call require("filename") in Node, the system resolves filename to a full path (for example /user/me/filename.js), reads the source from disk, and evaluates the loaded JavaScript. It turns out you can substitute your own logic for the default reading and evaluating steps. For example, you can hook into the loading of .js files and strip all console.log calls! I have written node-hook, which hides the mechanics and needs only your own transform function. I will show several useful ways the loaded source code can be transformed.

Note: Node caches the evaluated JavaScript. If you need to transform code, install the hook before any other files are loaded with require. I recommend putting all hook installations on the very first lines of the package's main file.
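To make the mechanics concrete, here is a minimal sketch of how such a hook can be installed by hand. It relies on require.extensions (deprecated but still present in CommonJS) and on the internal module._compile method. The helper names installHook and stripConsoleLog and the temporary noisy.js file are made up for this illustration, and the regular expression only handles console.log calls that sit alone on one line. This is not necessarily how node-hook is implemented.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: replaces the loader for one extension and returns
// a function that puts the previous loader back.
function installHook(extension, transform) {
    var previous = require.extensions[extension];
    require.extensions[extension] = function (module, filename) {
        var source = fs.readFileSync(filename, 'utf8');
        var transformed = transform(source, filename);
        // module._compile is an internal Node method that evaluates the text
        module._compile(transformed, filename);
    };
    return function restore() {
        require.extensions[extension] = previous;
    };
}

// Naive transform: blanks out lines that contain only a console.log(...) call.
function stripConsoleLog(source) {
    return source.replace(/^[ \t]*console\.log\(.*\);?[ \t]*$/gm, '');
}

// Demo: write a throwaway module and load it through the hook.
var demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hook-demo-'));
var demoFile = path.join(demoDir, 'noisy.js');
fs.writeFileSync(demoFile, 'console.log("loading noisy");\nmodule.exports = 42;\n');

var restore = installHook('.js', stripConsoleLog);
var noisy = require(demoFile); // the console.log line has been stripped
restore();
```

Because the result is cached after the first require, loading the same file again after restore() still returns the transformed module, which is why the hook has to be installed first.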

Logging files loaded and evaluated

The simplest transformation: let's prepend to each loaded source file a console.log call that prints the file name.

var hook = require('node-hook');
function logLoadedFilename(source, filename) {
    // JSON.stringify escapes quotes and backslashes (e.g. Windows paths)
    return 'console.log(' + JSON.stringify(filename) + ');\n' + source;
}
hook.hook('.js', logLoadedFilename);
require('./dummy');
// prints the full dummy.js filename, then runs dummy.js

The custom transform function logLoadedFilename gets two arguments: the original loaded source and the full filename. It should return the transformed JavaScript code as text. If nothing is returned, the hook prints an error message and continues (without evaluating the original source!).
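If you would rather fall back to the untransformed source when a transform forgets to return a value, you can wrap it yourself. The sketch below is one possible approach; withFallback and forgetsToReturn are hypothetical names, and this wrapper is not part of node-hook.

```javascript
// Hypothetical wrapper: returns the original source when the transform
// does not produce a string.
function withFallback(transform) {
    return function (source, filename) {
        var result = transform(source, filename);
        if (typeof result !== 'string') {
            console.error('transform returned nothing for ' + filename);
            return source;
        }
        return result;
    };
}

// A buggy transform: builds the new text but never returns it.
function forgetsToReturn(source, filename) {
    'console.log(' + JSON.stringify(filename) + ');\n' + source;
}

var safeTransform = withFallback(forgetsToReturn);
var fallbackResult = safeTransform('var a = 1;', 'a.js');
// fallbackResult is the unchanged source 'var a = 1;'
```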

Mix and match different languages

Any language that can be compiled to JavaScript can be loaded directly from Node. For example, we could install a hook that automatically transpiles CoffeeScript files to JavaScript:

var hook = require('node-hook');
var coffee = require('coffee-script');
function coffeeToJs(source, filename) {
    return coffee.compile(source, {
        filename: filename
    });
}
hook.hook('.coffee', coffeeToJs);
require('./dummy.coffee');

Note: you don't actually need node-hook to support CoffeeScript. It installs its own hook automatically when you call require("coffee-script").
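The same pattern works for any format you can turn into JavaScript text. As a toy illustration, assume a made-up mini-language where every line is "key = value"; the transform compiles it into a module.exports assignment. The names keyValueToJs and evaluateModule are hypothetical, and evaluateModule only imitates the loader with a fake module object.

```javascript
// Hypothetical mini-language: each non-empty line is "key = value".
function keyValueToJs(source) {
    var settings = {};
    source.split(/\r?\n/).forEach(function (line) {
        var separator = line.indexOf('=');
        if (separator === -1) {
            return; // skip lines without a key/value pair
        }
        var key = line.slice(0, separator).trim();
        var value = line.slice(separator + 1).trim();
        if (key) {
            settings[key] = value;
        }
    });
    return 'module.exports = ' + JSON.stringify(settings) + ';';
}

// Evaluates generated code against a fake module object, roughly the way
// the loader would.
function evaluateModule(code) {
    var fakeModule = { exports: {} };
    new Function('module', 'exports', code)(fakeModule, fakeModule.exports);
    return fakeModule.exports;
}

var settings = evaluateModule(keyValueToJs('name = dummy\nport = 8080\n'));
// settings.name is 'dummy'; settings.port is the string '8080'
```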

Strip C-style comments from JSON files

Node can load and parse JSON files when you call require() with a .json filename. I love JSON files but always felt bad that they do not allow comments. There is a limited workaround that I describe in Angular and JS nuggets, but I always wanted full C-style comments (// and /* */). Sindre Sorhus wrote strip-json-comments, which has a single method that strips comments from any given source string. Combining it with a require hook strips comments from JSON files automatically:

var hook = require('node-hook');
var strip = require('strip-json-comments');
function stripJson(source) {
    var ret = strip(source);
    // str will be evaluated by Nodejs, just like eval(...)
    var str = 'module.exports = ' + ret;
    return str;
}
hook.hook('.json', stripJson);

This transformation is implemented in autostrip-json-comments.
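To show what the stripping step has to be careful about, here is a small hand-written stripper. It is a simplified stand-in, not the strip-json-comments implementation: it walks the text once and tracks whether it is inside a string so that "http://..." values survive. The function names and the sample JSON are assumptions made for this sketch.

```javascript
// Simplified comment stripper: removes // and /* */ comments outside strings.
function stripComments(text) {
    var out = '';
    var inString = false;
    var i = 0;
    while (i < text.length) {
        var ch = text[i];
        var next = text[i + 1];
        if (inString) {
            out += ch;
            if (ch === '\\' && next !== undefined) {
                // copy the escaped character so \" does not end the string
                out += next;
                i += 2;
                continue;
            }
            if (ch === '"') {
                inString = false;
            }
            i += 1;
        } else if (ch === '"') {
            inString = true;
            out += ch;
            i += 1;
        } else if (ch === '/' && next === '/') {
            // line comment: skip to the end of the line, keep the newline
            while (i < text.length && text[i] !== '\n') {
                i += 1;
            }
        } else if (ch === '/' && next === '*') {
            // block comment: skip past the closing */
            var end = text.indexOf('*/', i + 2);
            i = end === -1 ? text.length : end + 2;
        } else {
            out += ch;
            i += 1;
        }
    }
    return out;
}

function stripJson(source) {
    return 'module.exports = ' + stripComments(source);
}

// Evaluates generated code against a fake module object.
function evaluateModule(code) {
    var fakeModule = { exports: {} };
    new Function('module', code)(fakeModule);
    return fakeModule.exports;
}

var jsonWithComments = [
    '{',
    '  // server settings',
    '  "url": "http://example.com", /* slashes inside strings stay */',
    '  "port": 8080',
    '}'
].join('\n');
var config = evaluateModule(stripJson(jsonWithComments));
// config.url is 'http://example.com'; config.port is 8080
```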

Extend the JavaScript language

I write a lot of asynchronous code using promises, and often find myself using function.bind(...) for context binding and partial application. For example, when I want to print a message after a promise is resolved:

// using separate function
get('http://www.google.com')
.then(function () {
    console.log('google is working');
});
// OR shorter using .bind
get('http://www.google.com')
.then(console.log.bind(null, 'google is working'));

Often I see every link in a promise chain using .bind, which increases code size and obscures the actual intent. To avoid writing bind over and over, I wrote dotdot, a small syntax shortcut. It hooks into the .js file loader and uses a regular expression to replace foo..bar() with foo.bar.bind(foo).

var dotdot = require('./src/dotdot');
var hook = require('node-hook').hook;
hook('.js', dotdot);
// now I can replace code like this
asyncSquare(2)
    .then(console.log.bind(null, '2 ='))
    .then(asyncSquare.bind(null, 3))
    .then(console.log.bind(null, '3 ='))
    .then(asyncSquare.bind(null, 4))
    .then(console.log.bind(null, '4 ='));
// with
asyncSquare(2)
    .then(console..log('2 ='))
    .then(asyncSquare..(3))
    .then(console..log('3 ='))
    .then(asyncSquare..(4))
    .then(console..log('4 ='));
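A rough idea of how such a rewrite can be done with regular expressions is sketched below. This is a simplified stand-in and not the actual dotdot source: the name dotdotSketch is made up, and it handles only plain identifiers and argument lists without nested parentheses. Method calls are bound to their object, bare function calls to null.

```javascript
// Simplified stand-in for dotdot, for illustration only.
function dotdotSketch(source) {
    // foo..bar(args) becomes foo.bar.bind(foo, args)
    var methodCall = /([A-Za-z_$][\w$]*)\.\.([A-Za-z_$][\w$]*)\(([^()]*)\)/g;
    // foo..(args) becomes foo.bind(null, args)
    var functionCall = /([A-Za-z_$][\w$]*)\.\.\(([^()]*)\)/g;
    return source
        .replace(methodCall, function (match, object, method, args) {
            var bound = args.trim() ? object + ', ' + args : object;
            return object + '.' + method + '.bind(' + bound + ')';
        })
        .replace(functionCall, function (match, fn, args) {
            var bound = args.trim() ? 'null, ' + args : 'null';
            return fn + '.bind(' + bound + ')';
        });
}

var rewrittenMethod = dotdotSketch("asyncSquare(2).then(console..log('2 ='))");
// rewrittenMethod is "asyncSquare(2).then(console.log.bind(console, '2 ='))"
var rewrittenFunction = dotdotSketch('asyncSquare..(3)');
// rewrittenFunction is 'asyncSquare.bind(null, 3)'
```

A pure text replacement like this also rewrites matches inside strings and comments, which is one reason the AST-based approach used by coverage tools is more robust.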

Code coverage

An excellent pure JavaScript code coverage library istanbul installs its own hook, making very drastic changes to the source code before evaluating. Basically, it adds a separate counter and increment for each original source line:

dummy.js

console.log('first line');    // line 0
console.log('second line');   // line 1
// becomes
__counters['dummy.js'][0]++; console.log('first line');
__counters['dummy.js'][1]++; console.log('second line');

After the execution has finished, one just looks at the __counters structure to get the coverage information.
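The counter idea can be tried out with a toy line-based instrumenter. This sketch assumes every line holds one complete statement, which real code rarely guarantees, and it is not how istanbul works internally. The names instrumentLines and runInstrumented are hypothetical.

```javascript
// Toy instrumenter: prefixes each non-empty line with a counter increment.
function instrumentLines(source, name) {
    var key = JSON.stringify(name);
    return source.split('\n').map(function (line, index) {
        if (!line.trim()) {
            return line; // leave blank lines alone
        }
        return '__counters[' + key + '][' + index + ']++; ' + line;
    }).join('\n');
}

// Runs the instrumented source and returns the per-line hit counts.
function runInstrumented(source, name) {
    var counters = {};
    counters[name] = source.split('\n').map(function () {
        return 0;
    });
    var run = new Function('__counters', instrumentLines(source, name));
    run(counters);
    return counters[name];
}

var hits = runInstrumented([
    'var answer = 42;',
    'return answer;',
    'answer = 0;' // never reached because of the return above
].join('\n'), 'dummy.js');
// hits is [1, 1, 0]: the last line was never executed
```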

The actual transformation is not purely text-based like dotdot. Instead, the source is first parsed into an Abstract Syntax Tree (AST), new nodes are inserted (to keep track of branches, statements, and function calls), and then the updated tree is serialized back to a source string. You can read more about these types of code transformation in Toby Ho's post Falafel, Source Rewriting, and a Magicial Assert.

Conclusion

Transforming source code automatically on load is a powerful tool that lets you extend the JavaScript language itself. There is definitely a trade-off between power and maintainability, because the source code you see is no longer the source that runs.

The current node-hook implementation is rudimentary. For example, it does not allow chaining transformations: only a single transform function per file extension is supported. Another nice-to-have feature would be a user-supplied filter function, so that only certain source files are transformed rather than everything.
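Until such features exist, both can be approximated in user code by composing transform functions before handing a single one to the hook. The helpers chain and onlyFor below are hypothetical, as are the sample transforms; node-hook does not provide them.

```javascript
// Hypothetical helper: runs several transforms in order, feeding each one
// the output of the previous one.
function chain() {
    var transforms = Array.prototype.slice.call(arguments);
    return function (source, filename) {
        return transforms.reduce(function (current, transform) {
            return transform(current, filename);
        }, source);
    };
}

// Hypothetical helper: applies the transform only when filter(filename)
// is truthy, otherwise returns the source untouched.
function onlyFor(filter, transform) {
    return function (source, filename) {
        return filter(filename) ? transform(source, filename) : source;
    };
}

function useStrict(source) {
    return '"use strict";\n' + source;
}
function addBanner(source, filename) {
    return '// loaded from ' + filename + '\n' + source;
}
function notInNodeModules(filename) {
    return filename.indexOf('node_modules') === -1;
}

var combined = onlyFor(notInNodeModules, chain(useStrict, addBanner));
var ownFile = combined('var a = 1;', '/app/a.js');
// ownFile is '// loaded from /app/a.js\n"use strict";\nvar a = 1;'
var thirdParty = combined('var a = 1;', '/app/node_modules/x/index.js');
// thirdParty is the unchanged 'var a = 1;'
```

The combined function has the same (source, filename) signature as any other transform, so it can be passed to hook.hook('.js', combined) as a single transform.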

These transformations are not limited to Node. Any AMD-style JavaScript loader does essentially the same thing: it downloads the module source and then evaluates it. So you could install a hook and transform the code before it is evaluated. You might need to add the ability to run a user-supplied transform to the AMD loader yourself, since, as far as I know, none of them provide this feature. It should be simple to do.

Gleb Bahmutov PhD