I was reading an interesting book, "Secure Your Node.js Web Application" by Karl Düüna, and page 44 in chapter 4, "Avoid code injections", caught my eye. In the example, the user passes an arithmetic formula from the web page to be evaluated on the server. For example, the user could enter "2 + 4" and the server would do something like this:
app.post('/calc', function (req, res) {
The author goes on to show that a malicious user can send an input that will cause problems. For example, entering 3; process.exit() as the input formula would stop the server!
The book goes on to show some solutions to this problem, like whitelisting the allowed input types, etc.
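To make the whitelisting idea concrete, here is a minimal sketch of my own (it is not the code from the book). The names ALLOWED_FORMULA, isSafeFormula and evaluateFormula, as well as the 100 character limit, are assumptions made for illustration. The input is rejected unless it consists only of digits, whitespace, parentheses, decimal points and the four arithmetic operators.

```javascript
'use strict'

// Hypothetical whitelist: digits, whitespace, decimal points, parentheses
// and the four basic arithmetic operators. Anything else is rejected.
const ALLOWED_FORMULA = /^[\d\s+\-*\/().]+$/

function isSafeFormula (input) {
  return typeof input === 'string' &&
    input.length > 0 &&
    input.length <= 100 && // arbitrary limit chosen for this sketch
    ALLOWED_FORMULA.test(input)
}

function evaluateFormula (input) {
  if (!isSafeFormula(input)) {
    throw new Error('Formula contains characters that are not allowed')
  }
  // Only whitelisted characters reach the evaluator, so no identifiers
  // (like process or require) can appear in the evaluated text
  return Function('"use strict"; return (' + input + ')')()
}

console.log(evaluateFormula('2 + 4'))
console.log(isSafeFormula('3; process.exit()'))
```

A dedicated arithmetic parser would avoid dynamic evaluation entirely, which is probably the better choice for production code; the sketch only shows how a whitelist stops the process.exit() input.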
The danger
The code passed to eval could be quite large and malicious. Or it could be loaded from other modules deep down the dependency graph. It can do anything, really.
Imagine we execute malicious code that goes through the list of loaded modules and finds every module that exports a method with the word login in its name. The malicious code can then easily wrap these methods and steal ALL logins:
Object.keys(require.cache).map(k => require.cache[k].exports).filter(e => typeof e === 'object')
That's it, every login executed from now on will go through this code. Dangerous, because we never suspected that other modules can spy on us, right?
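To show how little code the wrapping step needs, here is a small sketch that works on a local object instead of the real require.cache. The authModule object, the wrapLoginMethods function and the captured array are hypothetical names invented for this illustration.

```javascript
'use strict'

// Hypothetical stand-in for a module's exports object found in require.cache
const authModule = {
  login (user, password) {
    return user === 'alice' && password === 'secret'
  }
}

const captured = []

// Wrap every exported function whose name contains "login"
function wrapLoginMethods (exported) {
  Object.keys(exported)
    .filter(name => /login/i.test(name) && typeof exported[name] === 'function')
    .forEach(name => {
      const original = exported[name]
      exported[name] = function (...args) {
        captured.push({ name, args }) // the wrapper sees every argument
        return original.apply(this, args)
      }
    })
}

wrapLoginMethods(authModule)
const loginResult = authModule.login('alice', 'secret')
console.log(loginResult, captured.length)
```

The caller still gets the original result, so nothing looks wrong from the outside. This is why freezing exported objects, discussed later in the post, matters.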
Aside from eval(req.body...), what are the other ways to inject malicious code into our application?
Here are some possibilities I have found.
Sneak malicious code in the new package version
If we declare a dependency on a module using a fuzzy version like A@^X.X.X, an attacker could add malicious code and publish A@X.X.X+1. By automatically installing the latest patch, our server loads the malicious code - and then all bets are off.
Solution: use the exact versions yourself and shrinkwrap dependencies to make sure the entire tree of dependencies is locked.
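As a rough aid, a few lines of code can flag dependencies that are not pinned to an exact version. This is only a sketch: findFuzzyDependencies and the sample dependency list are made up for illustration, and the check deliberately treats anything other than a plain x.y.z string (including prerelease tags) as fuzzy.

```javascript
'use strict'

// Returns the names of dependencies whose version is not an exact x.y.z pin.
// Simplification: prerelease or build suffixes are also reported as fuzzy.
function findFuzzyDependencies (dependencies) {
  const exact = /^\d+\.\d+\.\d+$/
  return Object.keys(dependencies)
    .filter(name => !exact.test(dependencies[name]))
}

// Hypothetical dependency list, as it would appear in package.json
const sampleDependencies = {
  express: '^4.13.4',
  lodash: '4.6.1',
  debug: '~2.2.0'
}

const fuzzy = findFuzzyDependencies(sampleDependencies)
console.log(fuzzy)
```

Pinning the top-level versions only covers direct dependencies, which is why the shrinkwrap step is still needed for the rest of the tree.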
Overwrite source files
Imagine a malicious dependency module that rewrites your JavaScript files (but in a very obfuscated manner).
If we load that malicious dependency first, it changes the source for files loaded next, ensuring more malicious code is loaded by require.
Solution: set the source folder to read-only and always use a separate folder for storing the data.
Forcing module reload
This attack seems harmless but shows that Node has a vulnerability that is unique to interpreted languages. When you execute a compiled program (like a C++ binary), the machine code is loaded into a memory region that is normally read-only. Thus the program cannot easily change itself at runtime.
Node programs can change themselves at any time. A basic way to change the running code is by changing the loaded modules, all accessible using the require.cache object.
Take a simple program that loads the module started.js. The "started" module keeps the timestamp when the application has started. Because Node modules are cached, we assume that this value will never change, right?
// started.js
Note that we use the strict mode and declare every variable constant in both modules. Yet, the output shows the discrepancy: the third value is different from the first two.
$ node when-started.js
While simple, this shows that giving the "user" code access to the list of loaded modules can lead to unexpected results.
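The reload trick can be reproduced in a single file. The sketch below is my own reconstruction, not the original when-started.js: it writes a stand-in started.js to a temporary folder (an assumption made so the example is self-contained) and exports an object, so that we can compare identities instead of timestamps.

```javascript
'use strict'

const fs = require('fs')
const os = require('os')
const path = require('path')

// Write a hypothetical stand-in for started.js into a temporary folder
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'started-'))
const file = path.join(dir, 'started.js')
fs.writeFileSync(file, "'use strict'\nmodule.exports = { started: Date.now() }\n")

const first = require(file)
const second = require(file) // served from require.cache

// Removing the cache entry forces the next require to re-run the module
delete require.cache[require.resolve(file)]
const third = require(file)

console.log(first === second) // same cached object
console.log(first === third) // a fresh object after the forced reload
```

The first two values are the very same object, while the third comes from a second evaluation of the module source.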
Solution: once all necessary modules are loaded, you can use Object.seal(require.cache) to prevent loading new modules or deleting modules that are already loaded.
Unapply style attacks
Read how changing the prototype methods is dangerous in Unapply attack post.
Feeding new code to the next require
Note: if you want to see the debug log from the Node module load system, turn on the debug messages with the NODE_DEBUG=module node ... environment setting.
Note: before reading the rest of this blog post, you might want to read How require() Actually Works.
Changing the timestamp by breaking the cached module assumption is small potatoes.
We can directly set / change the module value to whatever we want! Using the same started.js module, but instead of deleting the loaded module from cache, we can set the exported value.
// set-started.js
The program's output
$ node set-started.js
See the next section on how to prevent changing values inside the loaded modules.
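Here is a self-contained sketch of overwriting the exported value through the cache. It is not the original set-started.js; the temporary started.js file and the variable names are assumptions made so the snippet runs on its own.

```javascript
'use strict'

const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical stand-in for started.js, written to a temporary folder
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'set-started-'))
const startedFile = path.join(tmpDir, 'started.js')
fs.writeFileSync(startedFile, "'use strict'\nmodule.exports = Date.now()\n")

const originalValue = require(startedFile)

// Overwrite the exports of the cached module record
require.cache[require.resolve(startedFile)].exports = 'tampered'

// Every later require returns whatever was placed in the cache
const afterTampering = require(startedFile)
console.log(typeof originalValue, afterTampering)
```

Code that already holds the original primitive value keeps it, but every consumer that requires the module afterwards receives the replaced export.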
Changing loaded code without reload
Even more dangerous, we do not have to require a module to change its behavior.
Most modules do not export primitive values (like numbers or strings), but return objects.
For example, let us change the configuration value:
// config.js
This code shows a much more dangerous behavior - the user account has been changed from whatever the configured value was to root:
$ node set-config.js
Even worse, the config.user value also changes for the code that loaded the module first!
// set-config.js
Again, note that the JavaScript const keyword only locks the reference config and not the object itself.
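A short sketch of the difference between locking the reference and locking the object. The object literal stands in for the exported config, and sameConfig plays the role of a second require call that returns the same cached object; both are assumptions for illustration.

```javascript
'use strict'

const config = { user: 'limited' }
const sameConfig = config // like a second require returning the cached object

// Changing a property is allowed, const does not protect the object
sameConfig.user = 'root'

// Rebinding the constant itself is what const forbids
let reassignError = null
try {
  config = { user: 'other' }
} catch (e) {
  reassignError = e
}

console.log(config.user)
console.log(reassignError instanceof TypeError)
```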
Solution 1: lock down sensitive objects inside the module itself using deep freeze.
// config.js
The same set-config.js code now fails:
$ node set-config.js
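For reference, a deep freeze can be written in a few lines. This is a minimal sketch and not necessarily the helper used in this post; deepFreeze, frozenConfig and tryToChangeHost are names chosen for the example, and the WeakSet is only there to survive circular references.

```javascript
'use strict'

// Recursively freezes an object and everything reachable from its own properties
function deepFreeze (value, seen = new WeakSet()) {
  const isObject = value !== null &&
    (typeof value === 'object' || typeof value === 'function')
  if (!isObject || seen.has(value)) {
    return value
  }
  seen.add(value)
  Object.getOwnPropertyNames(value).forEach(name => {
    deepFreeze(value[name], seen)
  })
  return Object.freeze(value)
}

// Hypothetical configuration object with a nested section
const frozenConfig = deepFreeze({ user: 'limited', db: { host: 'localhost' } })

function tryToChangeHost (config) {
  'use strict' // writes to frozen properties throw only in strict mode
  config.db.host = 'evil.example'
}

let freezeError = null
try {
  tryToChangeHost(frozenConfig)
} catch (e) {
  freezeError = e
}
console.log(freezeError instanceof TypeError, frozenConfig.db.host)
```

Note that in sloppy mode the same assignment fails silently instead of throwing, so the value stays protected but the attempt goes unnoticed.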
Solution 2: use functions instead of objects to keep sensitive information private via a closure.
Instead of exporting a config object, return a config getter function. It will be much harder to change the data inside the function's closure than to modify a property of an object.
// config.js
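A minimal sketch of the getter approach, assuming a hypothetical makeConfigGetter factory (in a real config.js the returned function would be assigned to module.exports). The settings object lives only inside the closure.

```javascript
'use strict'

function makeConfigGetter () {
  // Private to the closure, there is no reference to it from outside
  const settings = { user: 'limited' }

  return function getConfig (name) {
    // Only own settings are returned, not inherited properties
    return Object.prototype.hasOwnProperty.call(settings, name)
      ? settings[name]
      : undefined
  }
}

const getConfig = makeConfigGetter()

// This adds a property to the function object, the settings stay untouched
getConfig.user = 'root'

console.log(getConfig('user'))
```

If a setting is itself an object, the getter should return a copy or a frozen value, otherwise callers could still mutate it. The exported function can also still be swapped through require.cache, which is what the rest of this post tries to prevent.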
Nuclear option - control the require.cache
In all previous cases, we were able to change the running code by modifying the require.cache object. This object starts empty and keeps growing as more user code is loaded (native modules are not stored there). If we want to make sure the loaded code stays original, we should make require.cache an add-only object.
Loading code protection first
In every prevention method, we rely on some of our trusted code being loaded first or at startup. The way to load a module before running a program is by using the -r CLI argument.
console.log('preloaded')
$ node -r ./preload.js when-started.js
There is even a tool to preload code as plugins, see the module toolbag.
Controlling module cache via proxies
The only reliable way of making require.cache add-only I could come up with was using ES6 Proxies.
First, we can write a function that can make any object add-only.
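One possible shape for such a function is sketched below. It is my own reconstruction based on the behavior described in this section, not necessarily the original makeAddOnly, and it needs a runtime with Proxy support. New properties can be added and read, but changing, redefining or deleting a property throws.

```javascript
'use strict'

function makeAddOnly (target = {}) {
  const has = key => Object.prototype.hasOwnProperty.call(target, key)

  return new Proxy(target, {
    set (obj, key, value) {
      if (has(key)) {
        throw new Error('Cannot modify existing property ' + String(key))
      }
      obj[key] = value
      return true
    },
    defineProperty (obj, key, descriptor) {
      if (has(key)) {
        throw new Error('Cannot redefine existing property ' + String(key))
      }
      return Reflect.defineProperty(obj, key, descriptor)
    },
    deleteProperty (obj, key) {
      throw new Error('Cannot delete property ' + String(key))
    }
  })
}

const addOnly = makeAddOnly()
addOnly.foo = 'foo' // adding works

const addOnlyErrors = []
try { addOnly.foo = 'bar' } catch (e) { addOnlyErrors.push(e.message) }
try { delete addOnly.foo } catch (e) { addOnlyErrors.push(e.message) }

console.log(addOnly.foo, addOnlyErrors)
```

Because the traps throw explicitly, the protection does not depend on the calling code running in strict mode.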
Here is an example that makes a new object, allows adding and getting properties, but not modifying them or deleting properties. It works today in Chrome (v49) and Chrome Canary (v51).
const f = makeAddOnly()
If we could use proxies from Node today, we could have made require.cache add-only and prevented code modification attacks (assuming we can freeze the exports too).
It would be great, but proxies cannot be used from Node v5, or even transpiled or polyfilled yet.
Today we are limited to observing the cache using the deprecated (but available) Object.observe method.
Observing cache changes
Instead of proxies, we can monitor the cache changes using the following approach.
We will be notified every time someone adds (loads) a new module. When a new module is loaded, we will deep freeze it. If someone else tries to delete or alter an already loaded module, we throw an error. We should probably even exit the process!
Let us try deleting a loaded module
const config = require('./config')
$ node observe-cache.js
Note that because Object.observe is asynchronous, we will get the changes only AFTER they have happened. This makes our checks and errors reactive - the change has already happened!
const config = require('./config')
The above code runs without errors, because we loaded and modified the config before we had a chance to freeze it from the addOnly change callback.
$ node observe-cache.js
But if we assume that the config module finishes loading before anyone tries to modify it, then we do get an error.
const config = require('./config')
Now we are getting errors
1: config.user limited
Not the most reliable solution, but it works if we assume the malicious code gets injected after the initial code has been loaded.
Preload all modules and STOP
We can generate a snapshot of all modules loaded during a "normal" run of the server, then preload all these modules ourselves and seal the module cache.
For example, cache-require-paths monitors the loaded modules and saves all resolved paths into a JSON file. We can similarly save the loaded module paths or even load all modules listed in the package.json file and then disable loading anything else.
In the example below, let us have two files: started.js and when-started.js.
We will create another file that will bootstrap everything and then will lock down the cache.
We will run it as
node -r ./preload.js when-started.js
We need to load the started.js file - that is easy. But then we need to load when-started.js without compiling it (because this immediately triggers the code we want to protect against!). Instead we will create a dummy object in the cache that will only allow us to set the module once.
In addition, we will deep freeze every exported object for each module.
// preload.js
The normal execution proceeds just fine. We can load ./started several times.
// when-started.js
$ node -r ./preload.js when-started.js
But if we try to load a new module - we cannot:
const when1 = require('./started')
$ node -r ./preload.js when-started.js
The error is a little cryptic, because it looks like it is trying to delete the root module, but I think this is because it fails to internally update it.
Let us try deleting a module to force its reload.
const when1 = require('./started')
Same exception happens again - we cannot delete properties from the sealed require.cache object.
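It helps to see exactly what Object.seal does and does not protect. The sketch below uses a plain object as a stand-in for require.cache (the paths and the attempt helper are hypothetical), so it can run without affecting the real module system.

```javascript
'use strict'

// Plain object standing in for require.cache (hypothetical paths)
const fakeCache = {
  '/app/started.js': { exports: { started: 1 } }
}
Object.seal(fakeCache)

// Runs an action and records the label if the action throws
function attempt (label, action, failures) {
  try { action() } catch (e) { failures.push(label) }
}

const sealFailures = []
attempt('add', function () {
  'use strict'
  fakeCache['/app/new.js'] = {}
}, sealFailures)
attempt('delete', function () {
  'use strict'
  delete fakeCache['/app/started.js']
}, sealFailures)
attempt('modify', function () {
  'use strict'
  // Sealing does not protect the values stored in the object
  fakeCache['/app/started.js'].exports.started = 2
}, sealFailures)

console.log(sealFailures, fakeCache['/app/started.js'].exports.started)
```

Adding and deleting entries fails, but the stored module records and their exports remain writable. That is why the preload script also has to freeze the exports and guard each entry against being replaced.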
If we try to change a loaded module after the fact, it fails, because the exports object has been frozen.
const config = require('./config')
This method works, but requires you to really know which modules are necessary. This is hard to estimate, since blindly loading the entire node_modules folder is going to take a while!
Prefreeze all modules and STOP
Let us make a slight twist on the previous approach. Instead of loading a module, we are going to put a placeholder into require.cache that will freeze it in the future, just like we did for the single root module above. This method is a lot more efficient because we are NOT preloading (compiling) all modules from the node_modules folder. Instead we are just setting placeholders in the require.cache with a couple of hooks.
First, form fully resolved module paths
Second, create a property manually for each module - a placeholder:
// add placeholders for each module and only allow
Every time a real module is loaded, it will be passed into the set function.
We cannot freeze the exported object just yet - the module is only starting to load and has not been compiled yet.
There is a special property called loaded on the module instance passed to set.
In order to really know when the module is loaded, we can define a complex property again!
set: function setModule(moduleValue) {
Finally, we can seal the require.cache because all the placeholder properties for all possible modules have been created.
console.log('preloaded all modules')
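The placeholder idea can be tried out on a plain object before pointing it at the real require.cache. The following is a simplified sketch under my own assumptions, not the original prefreeze.js: addPlaceholder, freezeWhenLoaded and the fake module record are hypothetical, and a shallow Object.freeze stands in for the deep freeze to keep the example short.

```javascript
'use strict'

// Redefine "loaded" so we learn when the loader marks the module as finished
function freezeWhenLoaded (moduleValue) {
  let loaded = false
  Object.defineProperty(moduleValue, 'loaded', {
    get () { return loaded },
    set (value) {
      loaded = value
      if (value) {
        Object.freeze(moduleValue.exports) // shallow freeze for brevity
      }
    }
  })
}

// Create a set-once accessor property for one resolved module path
function addPlaceholder (cache, filename) {
  let stored // stays undefined until the module is registered
  Object.defineProperty(cache, filename, {
    enumerable: true,
    configurable: false,
    get () { return stored },
    set (moduleValue) {
      if (stored) {
        throw new Error('Module ' + filename + ' is already set')
      }
      stored = moduleValue
      freezeWhenLoaded(moduleValue)
    }
  })
}

const placeholderCache = {}
addPlaceholder(placeholderCache, '/app/config.js')
Object.seal(placeholderCache)

// Simulate what a module loader would do with a module record
const fakeModule = { exports: {}, loaded: false }
placeholderCache['/app/config.js'] = fakeModule // registered once
fakeModule.exports.user = 'limited' // the module body runs
fakeModule.loaded = true // loading finished, exports get frozen

const placeholderErrors = []
try {
  placeholderCache['/app/config.js'] = { exports: {} }
} catch (e) { placeholderErrors.push('replace') }
try {
  (function () { 'use strict'; fakeModule.exports.user = 'root' })()
} catch (e) { placeholderErrors.push('modify') }

console.log(placeholderErrors, fakeModule.exports.user)
```

Sealing the object after creating the accessors still lets the setter run, because the property already exists; only new properties and deletions are blocked.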
Normal load works fine (I kept the debug log statements)
$ node -r ./prefreeze.js set-config.js
Now, let us try changing the exported value via the require.cache shortcut:
const config = require('./config')
$ node -r ./prefreeze.js set-config.js
The error is very clear if the attack is deferred.
$ node -r ./prefreeze.js set-config.js
Boom, the attack has been repelled because the fast preload ensured that the entire require.cache has been sealed and frozen!
Conclusion
We need to think about how to protect a running Node process from changing its own code in case of malicious code injection. Most fixes require deep freezing sensitive objects and code fragments. While external solutions are possible, none is available at the moment. This leaves preloading and freezing the cache as the only way of making the program's code "read-only".