An interesting observation came across my desk a few days ago: according to "Node’s require is dog slow", the Node require "hunts" for files to load when resolving third-party names. For example, when you call require("express") in your application source file, Node first tries to load node_modules/express.js and fails, then tries node_modules/express.json and fails, then tries node_modules/express.node and fails again. Finally it "gives up" and loads node_modules/express/package.json to read the proper main filename. Only then does it read node_modules/express/index.js from the disk!
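To make the probing order concrete, here is a minimal sketch that lists the paths described above for a given module name. The function name `candidatePaths` and the `/app` base folder are made up for illustration, and this is a simplification: the real Node resolver does more (for example, it also walks up through parent `node_modules` folders).

```javascript
var path = require('path');

// Simplified illustration only, not Node's actual resolver.
// Lists the paths probed for a third-party name, in the order described above.
function candidatePaths(baseDir, name) {
  var root = path.join(baseDir, 'node_modules', name);
  return [
    root + '.js',                   // tried first, usually missing
    root + '.json',                 // tried second
    root + '.node',                 // tried third (native addon)
    path.join(root, 'package.json') // finally read to find the "main" file
  ];
}

console.log(candidatePaths('/app', 'express'));
```

For a typical package, the first three entries are file system calls that are expected to fail.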
You can see this for yourself if you profile your own Node application using the dtruss program (included with Mac OS). Just start the profiling from the first terminal:
sudo dtruss -d -n 'node' > /tmp/require.log 2>&1
Then go to the second terminal window and start the application. For example, I will load express, and that is it. Because require calls are synchronous, I can simply time the call using a high-resolution timer:
function time(fn) {
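A minimal sketch of such a timing helper, assuming `process.hrtime` as the high-resolution timer; the original listing may differ. The module name is a placeholder: a core module is used here so the snippet runs anywhere, and you would swap in 'express' if it is installed locally.

```javascript
// Sketch of a timing helper: runs fn synchronously and returns elapsed milliseconds.
function time(fn) {
  var start = process.hrtime();     // [seconds, nanoseconds]
  fn();
  var diff = process.hrtime(start); // elapsed time since start
  return diff[0] * 1e3 + diff[1] / 1e6;
}

// Placeholder module name; replace with 'express' if it is installed.
var moduleName = 'http';
var ms = time(function () {
  require(moduleName);
});
console.log('require("' + moduleName + '") took ' + ms.toFixed(1) + ' ms');
```

Only the first require of a module is worth timing this way, because later calls are served from Node's in-memory module cache.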
Run this program, then load /tmp/require.log in a text editor. The result shows lots of calls just to find the right source file to start loading the express library!
# microseconds call
664730 stat64(".../test/node_modules/express\0", 0x7FFF5FBFECF8, 0x204) = 0 0
664784 stat64(".../test/node_modules/express.js\0", 0x7FFF5FBFED28, 0x204) = -1 Err#2
664834 stat64(".../test/node_modules/express.json\0", 0x7FFF5FBFED28, 0x204) = -1 Err#2
664859 stat64(".../test/node_modules/express.node\0", 0x7FFF5FBFED28, 0x204) = -1 Err#2
664969 open(".../test/node_modules/express/package.json\0", 0x0, 0x1B6) = 11 0
664976 fstat64(0xB, 0x7FFF5FBFEC38, 0x1B6) = 0 0
665022 read(0xB, "{\n \"name\": \"express\", ...}", 0x103D) = 4157 0
665030 close(0xB) = 0 0
The first column shows the timestamp in microseconds. Each wasted file system call takes less than 100 microseconds, but the tiny delays add up to hundreds of milliseconds and eventually seconds for larger frameworks.
I will show the end-to-end timing results later.
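If you would rather count the wasted probes than eyeball the log, a small script can do it. This is only a sketch: the function name `countFailedStats` is made up, and it assumes the log lines look like the excerpt above, where `Err#2` (ENOENT, no such file) marks a failed lookup.

```javascript
// Counts stat64 calls that failed with Err#2 (ENOENT) in dtruss output.
function countFailedStats(logText) {
  return logText.split('\n').filter(function (line) {
    // \b keeps fstat64 from matching; only plain stat64 probes are counted
    return /\bstat64\(/.test(line) && /= -1 Err#2\b/.test(line);
  }).length;
}

// A few lines copied from the log excerpt above
var sample = [
  '664730 stat64(".../test/node_modules/express\\0", 0x7FFF5FBFECF8, 0x204) = 0 0',
  '664784 stat64(".../test/node_modules/express.js\\0", 0x7FFF5FBFED28, 0x204) = -1 Err#2',
  '664834 stat64(".../test/node_modules/express.json\\0", 0x7FFF5FBFED28, 0x204) = -1 Err#2',
  '664859 stat64(".../test/node_modules/express.node\\0", 0x7FFF5FBFED28, 0x204) = -1 Err#2',
  '664976 fstat64(0xB, 0x7FFF5FBFEC38, 0x1B6) = 0 0'
].join('\n');

console.log(countFailedStats(sample)); // 3
```

To run it on the real log, read /tmp/require.log with `fs.readFileSync` and pass the text to the function.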
Cache path resolution
Luckily we can easily hook into the Node.js loader, override the require calls and cache the resolved filenames. I wrote cache-require-paths that does this. The entire source is only a generous 30 lines, and here is the main gist: wrap Module.prototype.require and save the resolved filenames into an object on the first run. On the second run, if the name cache already holds the resolved path for the requested name, load the module from that path.
var Module = require('module');
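The gist can be sketched roughly as follows. This is an illustration of the technique, not the actual cache-require-paths source: the cache file location and the variable names are assumptions, and `Module._resolveFilename` is an internal, undocumented Node function that may change between versions.

```javascript
var Module = require('module');
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical cache location, chosen only for this sketch.
var CACHE_FILE = path.join(os.tmpdir(), 'require-paths-sketch.json');

var nameCache = {};
try {
  nameCache = JSON.parse(fs.readFileSync(CACHE_FILE, 'utf8'));
} catch (err) {
  // No usable cache yet (first run): start with an empty one.
}

var originalRequire = Module.prototype.require;
var has = Object.prototype.hasOwnProperty;

Module.prototype.require = function cachedRequire(name) {
  // Resolution depends on the requiring file, so keep one map per file.
  var key = this.filename || '<unknown>';
  if (!has.call(nameCache, key)) {
    nameCache[key] = {};
  }
  var resolved = nameCache[key];
  if (!has.call(resolved, name)) {
    // Internal Node function: performs the full (slow) lookup once.
    resolved[name] = Module._resolveFilename(name, this);
  }
  return originalRequire.call(this, resolved[name]);
};

// Persist the cache so the next run can skip the file system hunt.
process.once('exit', function () {
  fs.writeFileSync(CACHE_FILE, JSON.stringify(nameCache));
});
```

Note that a stale entry, such as a path to a module that has since moved, would make require fail, so a real implementation needs some way to invalidate the cache.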
One can simply load this as the first line in the application and get the cache benefits:
npm install --save cache-require-paths
// first line of your app.js
require('cache-require-paths');
The cache mechanism avoids a lot of wasted file system calls (always slow!) and generates the following results for a couple of popular libraries.
Using node 0.10.37
require('X') | standard (ms) | with cache (ms) | speedup (%)
------------------------------------------------------------------
[email protected] | 72 | 46 | 36
[email protected] | 230 | 170 | 26
[email protected] | 120 | 95 | 20
[email protected] | 170 | 120 | 29
Using node 0.12.2 - all startup times became slower.
require('X') | standard (ms) | with cache (ms) | speedup (%)
------------------------------------------------------------------
[email protected] | 90 | 55 | 38
[email protected] | 250 | 200 | 20
[email protected] | 150 | 120 | 20
[email protected] | 200 | 145 | 27
Interesting, isn't it? A large startup performance boost just by using a single require!
Of course, I still need to add cache invalidation, for example when the module's dependencies change. Luckily this is simple to do: just look at the list of dependencies in the package.json file!
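One possible way to do that, sketched under my own assumptions rather than taken from cache-require-paths: store a fingerprint of the dependency lists next to the cached paths, and throw the cache away when the fingerprint no longer matches. The helper names, the shape of the saved cache and the version ranges below are all made up for illustration.

```javascript
var crypto = require('crypto');

// Hypothetical helper: hashes the dependency lists of a parsed package.json.
function dependencyFingerprint(pkg) {
  var entries = [];
  ['dependencies', 'devDependencies'].forEach(function (section) {
    var deps = pkg[section] || {};
    // Sort names so that key order in package.json does not matter
    Object.keys(deps).sort().forEach(function (name) {
      entries.push(section + ':' + name + '@' + deps[name]);
    });
  });
  return crypto.createHash('sha1').update(entries.join('\n')).digest('hex');
}

// Returns the saved cache if it is still valid, otherwise a fresh empty one.
function loadValidCache(saved, pkg) {
  var current = dependencyFingerprint(pkg);
  if (saved && saved.fingerprint === current) {
    return saved;
  }
  return { fingerprint: current, paths: {} };
}

// Illustrative data only
var pkg = { dependencies: { express: '^4.0.0' } };
var saved = {
  fingerprint: dependencyFingerprint(pkg),
  paths: { express: '/app/node_modules/express/index.js' }
};
console.log(loadValidCache(saved, pkg) === saved); // true

pkg.dependencies.express = '^5.0.0';
console.log(Object.keys(loadValidCache(saved, pkg).paths).length); // 0
```

This only detects changes declared in package.json; a module reinstalled at a different location under the same version range would go unnoticed.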