Mar 3 2016

Playing havoc with Node module system

Node.js is really really really susceptible to code injection attacks.

I was reading an interesting book, "Secure Your Node.js Web Application" by Karl Düüna, and page 44 in chapter 4, "Avoid code injections", caught my eye. In the example, the user passes an arithmetic formula from the web page to be evaluated on the server. For example, the user could enter "2 + 4" and the server would do something like this:

app.post('/calc', function (req, res) {
  var result;
  eval('result = ' + req.body.formula);
  res.send('the result is: ' + result);
});

The author goes on to show that a malicious user can send an input that will cause problems. For example, entering 3; process.exit() as the input formula would stop the server!

The book goes on to show some solutions to this problem, like whitelisting the allowed input types.
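One way to picture the whitelisting approach is to reject any formula that contains characters outside a small allowed set before it ever reaches the evaluator. The sketch below is not the book's code; the names isSafeFormula and calc, the 100 character limit and the allowed character set are all assumptions made for illustration.

```javascript
'use strict'
// Hypothetical whitelist: digits, whitespace, decimal points,
// parentheses and the four basic arithmetic operators only.
const ALLOWED = /^[0-9+\-*\/().\s]+$/

function isSafeFormula(formula) {
  return typeof formula === 'string' &&
    formula.length <= 100 &&
    ALLOWED.test(formula)
}

function calc(formula) {
  if (!isSafeFormula(formula)) {
    throw new Error('Formula contains characters that are not allowed')
  }
  // Without letters there is no way to spell an identifier or a keyword,
  // so the input cannot reach process, require or any other object.
  const result = new Function('return (' + formula + ')')()
  if (typeof result !== 'number') {
    throw new Error('Formula did not produce a number')
  }
  return result
}

console.log(calc('2 + 4'))
```

With this check in place, the input 3; process.exit() is rejected before anything is evaluated, because the letters and the semicolon are not in the allowed set.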

The danger

The code passed to eval could be quite large and malicious. Or it could be loaded from other modules deep down the dependency graph. It can do anything, really.

Imagine we execute malicious code that goes through the list of loaded modules and finds every module that exports a method with the word login in its name. The malicious code can then easily wrap these methods and steal ALL logins:

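Object.keys(require.cache)
  // the cache keys are file names, so look up each module to get its exports
  .map(name => require.cache[name].exports)
  .filter(e => e && typeof e === 'object')
  .forEach(e => Object.keys(e).filter(name => /login/.test(name)).forEach(method => {
    // keep the original method, otherwise the wrapper would call itself forever
    const original = e[method]
    e[method] = function () {
      // send arguments to remote server!
      // let the login work as usual
      return original.apply(e, arguments)
    }
  }))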

That's it, every login executed from now on will go through this code. Dangerous, because we never suspected that other modules can spy on us, right?

Aside from eval(req.body...), what are the other ways to inject malicious code into our application?

Here are some possibilities I have found.

Sneak malicious code in the new package version

If we declare a dependency on a module using a fuzzy version range like A@^X.X.X, an attacker could add malicious code and publish A@X.X.X+1. By automatically installing the latest patch, our server loads the malicious code - and then all bets are off.

Solution: use the exact versions yourself and shrinkwrap dependencies to make sure the entire tree of dependencies is locked.
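As a rough illustration of the "exact versions" half of this advice, a small script can flag every dependency that is declared with a range instead of a pinned version. This is only a sketch: findFuzzyVersions is a made-up helper, the sample package names and versions are example values, and the check covers direct dependencies only, so the rest of the tree still needs shrinkwrap.

```javascript
'use strict'
// Matches only exact versions such as 1.2.3 or 1.2.3-beta.1
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/

// Returns the names of dependencies whose version is a range, a tag or a wildcard.
function findFuzzyVersions(dependencies) {
  return Object.keys(dependencies || {}).filter(function (name) {
    return !EXACT_VERSION.test(dependencies[name])
  })
}

// Example "dependencies" object, as it could appear in a package.json file
const sample = {
  express: '^4.13.4',
  lodash: '4.6.1',
  debug: '~2.2.0',
  'left-pad': '*'
}
const fuzzy = findFuzzyVersions(sample)
console.log(fuzzy)
```

In a real project the argument would come from the dependencies field of the package.json file, and a non-empty result could fail the build.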

Overwrite source files

Imagine a malicious dependency module that rewrites your JavaScript files (but in a very obfuscated manner). If we load that malicious dependency first, it changes the source for files loaded next, ensuring more malicious code is loaded by require.

Solution: set the source folder to read-only and always use a separate folder for storing the data.

Forcing module reload

This attack seems harmless, but it shows that Node has a vulnerability that is unique to interpreted languages. When you execute a compiled program (like a C++ binary), all the machine code is loaded and stored in read-only memory space. Thus the program cannot change itself at runtime.

Node programs can change themselves at any time. A basic way to change the running code is by changing the loaded modules, which are all accessible through the require.cache object.
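To get a feel for what is stored there, the following sketch prints a summary of the cache. Each entry is keyed by the resolved file name and holds a module object with fields such as loaded and exports. The helper name describeCache is an assumption for this example.

```javascript
'use strict'
// Summarize every entry of a module cache: file name, load state, kind of export.
function describeCache(cache) {
  return Object.keys(cache).map(function (filename) {
    const m = cache[filename]
    return {
      filename: filename,
      loaded: m.loaded,
      exportsType: typeof m.exports
    }
  })
}

console.log(describeCache(require.cache))
```

Any code running in the same process can walk this list, which is exactly what the attacks below rely on.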

Take a simple program that loads module started.js. The "started" module keeps the timestamp when the application has started. Because Node modules are cached, we assume that this value will never change, right?

// started.js
'use strict'
const started = new Date()
module.exports = started
// when-started.js
'use strict'
const when1 = require('./started')
console.log('1: started program at', Number(when1))
// uses cached loaded value
const when2 = require('./started')
console.log('2: started program at', Number(when2))
const startedPath = require.resolve('./started')
// force reloading './started' next time
delete require.cache[startedPath]
setTimeout(function () {
  const when3 = require('./started')
  console.log('3: started program at', Number(when3))
}, 100)

Note that we use the strict mode and declare every variable constant in both modules. Yet, the output shows the discrepancy: the third value is different from the first two.

$ node when-started.js 
1: started program at 1457017164598
2: started program at 1457017164598
3: started program at 1457017164712

While simple, this shows that giving the "user" code access to the list of loaded modules can lead to unexpected results.

Solution: once all necessary modules are loaded, you can use Object.seal(require.cache) to prevent loading new modules or deleting modules that are already loaded.
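To see what sealing does and does not cover, here is a small sketch that uses a plain object as a stand-in for require.cache. The file names and the helper names sealDemo and attempt are made up for the example.

```javascript
function sealDemo() {
  'use strict' // failed writes must throw instead of failing silently
  // Stand-in for require.cache with a single loaded module
  const cache = { '/app/started.js': { exports: 1 } }
  Object.seal(cache)

  // Runs an action and reports either 'ok' or the name of the thrown error
  function attempt(action) {
    try {
      action()
      return 'ok'
    } catch (err) {
      return err.name
    }
  }

  return {
    add: attempt(() => { cache['/app/evil.js'] = { exports: 2 } }),
    remove: attempt(() => { delete cache['/app/started.js'] }),
    replace: attempt(() => { cache['/app/started.js'] = { exports: 1337 } })
  }
}

const sealResults = sealDemo()
console.log(sealResults)
// { add: 'TypeError', remove: 'TypeError', replace: 'ok' }
```

Adding and deleting entries both fail, but an existing entry can still be overwritten. Sealing alone therefore stops the forced reload, yet it does not stop someone from changing the value of a module that is already in the cache.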

Unapply style attacks

Read how changing the prototype methods is dangerous in Unapply attack post.

Feeding new code to the next require

Note: if you want to see the debug log from the Node module loading system, turn on the debug messages with the NODE_DEBUG=module node ... environment setting.

Note: before reading the rest of this blog post, you might want to read How require() Actually Works.

Changing the timestamp by breaking the cached module assumption is small potatoes. We can directly set / change the module value to whatever we want! Using the same started.js module, but instead of deleting the loaded module from cache, we can set the exported value.

// set-started.js
'use strict'
const when1 = require('./started')
console.log('1: started program at', Number(when1))
// uses cached loaded value
const when2 = require('./started')
console.log('2: started program at', Number(when2))
const startedPath = require.resolve('./started')
// change the value directly
require.cache[startedPath].exports = 1337
setTimeout(function () {
  const when3 = require('./started')
  console.log('3: started program at', Number(when3))
}, 100)

The program's output

$ node set-started.js 
1: started program at 1457017716062
2: started program at 1457017716062
3: started program at 1337

See the next section on how to prevent changing values inside the loaded modules.

Changing loaded code without reload

Even more dangerous, we do not have to require a module to change its behavior. Most modules do not export primitive values (like numbers or strings), but return objects. For example, let us change a configuration value:

// config.js
'use strict'
module.exports = { user: 'limited' }
// set-config.js
'use strict'
const config = require('./config')
console.log('1: config.user', config.user)
const configPath = require.resolve('./config')
require.cache[configPath].exports.user = 'root'
console.log('2: config.user', config.user)

This code shows far more dangerous behavior: the user account has been changed from the configured value to root.

$ node set-config.js 
1: config.user limited
2: config.user root

Even worse, the changed config.user value is visible everywhere the config module has already been required!

// set-config.js
'use strict'
const config = require('./config')
setTimeout(function () {
  console.log('1: config.user', config.user)
  // config.user root
}, 100)

Again, note that the JavaScript const keyword only locks the reference config and not the object itself.
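The difference between the reference and the object is easy to check in isolation. In this small sketch (the variable names are made up for the example) the property write succeeds, while rebinding the name throws.

```javascript
'use strict'
const settings = { user: 'limited' }

// Mutating the object behind a const binding is allowed
settings.user = 'root'

let reassignError = null
try {
  // Rebinding the const name itself is not allowed
  settings = { user: 'other' }
} catch (err) {
  reassignError = err
}

console.log(settings.user, reassignError instanceof TypeError)
// root true
```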

Solution 1: lock down sensitive objects inside the module itself using deep freeze.

// config.js
'use strict'
const freeze = require('deep-freeze')
module.exports = freeze({ user: 'limited' })

The same set-config.js code now fails

$ node set-config.js 
1: config.user limited
/set-config.js:7
require.cache[configPath].exports.user = 'root'
                                       ^
TypeError: Cannot assign to read only property 'user' of #<Object>
    at Object.<anonymous> (/set-config.js:7:40)
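For readers wondering what a deep freeze does, here is a minimal sketch of the idea: freeze the object, then recursively freeze every nested object. It is not the source of the deep-freeze package; the name deepFreezeSketch and the sample values are assumptions for illustration.

```javascript
// Recursively freezes an object and every object reachable through its own properties.
function deepFreezeSketch(obj) {
  // Freeze first, so objects that reference each other do not recurse forever
  Object.freeze(obj)
  Object.getOwnPropertyNames(obj).forEach(function (name) {
    const value = obj[name]
    if (value !== null && typeof value === 'object' && !Object.isFrozen(value)) {
      deepFreezeSketch(value)
    }
  })
  return obj
}

const frozenConfig = deepFreezeSketch({
  user: 'limited',
  db: { host: 'localhost' }
})
console.log(Object.isFrozen(frozenConfig), Object.isFrozen(frozenConfig.db))
// true true
```

A plain Object.freeze would protect user but leave db.host writable, which is why the nested pass matters for configuration objects.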

Solution 2: use functions instead of objects to keep sensitive information private via a closure.

Instead of exporting a config object, return a config getter function. It will be much harder to change the data inside the function's closure than to modify a property of an object.

// config.js
'use strict'
function config(field) {
  const settings = { user: 'limited' }
  return settings[field]
}
module.exports = config
// set-config.js
'use strict'
const config = require('./config')
console.log('1: config.user', config('user'))
const configPath = require.resolve('./config')
// cannot access 'user' inside the closure
// require.cache[configPath].exports ...

Nuclear option - control the require.cache

In all previous cases, we were able to change the running code by modifying the require.cache object. This object starts empty and keeps growing as more user code is loaded (native modules are not stored there). If we want to make sure the loaded code stays original, we should make require.cache an add-only object.

Loading code protection first

In every prevention method, we rely on some of our trusted code being loaded first, at startup. The way to load a module before running a program is the -r CLI argument.

// preload.js
console.log('preloaded')

$ node -r ./preload.js when-started.js 
preloaded
1: started program at 1457030822826
2: started program at 1457030822826
3: started program at 1457030822931

There is even a tool to preload code as plugins, see the module toolbag.

Controlling module cache via proxies

The only reliable way of making require.cache add-only I could come up with was using ES6 Proxies.

First, we can write a function that can make any object add-only.

'use strict'
function makeAddOnly(obj) {
  const addOnly = {
    set: function (target, property, value) {
      // works even if the target object has no prototype
      if (Object.prototype.hasOwnProperty.call(target, property)) {
        throw new Error('Cannot change property ' + String(property))
      }
      target[property] = value
      // a set trap must report success, otherwise strict mode code throws a TypeError
      return true
    },
    deleteProperty: function (target, property) {
      throw new Error('Cannot delete property ' + String(property))
    }
  }
  return new Proxy(obj, addOnly)
}

Here is an example that makes a new object, allows adding and getting properties, but not modifying them or deleting properties. It works today in Chrome (v49) and Chrome Canary (v51).

const f = makeAddOnly({})
f.foo         // undefined
f.foo = 42    // 42
delete f.foo  // Uncaught Error: Cannot delete property foo
f.foo = -1    // Uncaught Error: Cannot change property foo
f.bar = 'bar' // 'bar'
f             // Object {foo: 42, bar: "bar"}

If we could use proxies from Node today, we could have made require.cache add-only and prevent code modification attacks (assuming we can freeze the exports too).

'use strict'
import makeAddOnly from './add-only'
require.cache = makeAddOnly(require.cache)
// no more cache shenanigans

It would be great, but proxies cannot be used in Node v5, or even transpiled or polyfilled yet. Today we are limited to observing the cache using the deprecated (but available) Object.observe method.

Observing cache changes

Instead of proxies, we can monitor the cache changes using the following approach.

'use strict'
const freeze = require('deep-freeze')
function addOnly(changes) {
  changes.forEach(function (change) {
    if (change.type === 'add') {
      // property could have been deleted already!
      if (change.object[change.name]) {
        console.log('freezing new property', change.name)
        change.object[change.name] = freeze(change.object[change.name])
      }
    } else {
      // update or delete
      throw new Error('Cannot ' + change.type +
        ' existing property ' + change.name)
      // maybe even process.exit(-1)
    }
  })
}
Object.observe(require.cache, addOnly, ['add', 'update', 'delete'])

We will be notified every time someone adds (loads) a new module. When a new module is loaded, we deep freeze it. If someone else tries to delete or alter an already loaded module, we throw an error. We should probably even exit the process!

Let us try deleting a loaded module

const config = require('./config')
console.log('1: config.user', config.user)
console.log('deleting config module')
const configPath = require.resolve('./config')
delete require.cache[configPath]

$ node observe-cache.js 
1: config.user limited
deleting config module
Error: Cannot delete existing property /config.js

Note that because Object.observe is asynchronous, we get the changes only AFTER they have happened. This makes our checks and errors reactive - the change has already happened!

const config = require('./config')
console.log('1: config.user', config.user)
setTimeout(function () {
  console.log('first user is now', config.user)
}, 100)
console.log('changing config.user')
const configPath = require.resolve('./config')
require.cache[configPath].exports.user = 'root'
console.log('2: config.user', config.user)

The above code runs without errors, because we loaded and modified the config before we had a chance to freeze it from the addOnly change callback.

$ node observe-cache.js 
1: config.user limited
changing config.user
2: config.user root
freezing new property /config.js
first user is now root

But if we assume that the config module finishes loading before anyone tries to modify it, then we do get an error.

const config = require('./config')
console.log('1: config.user', config.user)
setTimeout(function () {
  console.log('first user is now', config.user)
}, 100)
setTimeout(function () {
  console.log('changing config.user')
  const configPath = require.resolve('./config')
  require.cache[configPath].exports.user = 'root'
  console.log('2: config.user', config.user)
}, 0)

Now we are getting errors

1: config.user limited
freezing new property /config.js
changing config.user
/observe-cache.js:40
  require.cache[configPath].exports.user = 'root'
                                         ^
TypeError: Cannot assign to read only property 'user' of #<Object>

Not the most reliable solution, but it works if we assume the malicious code gets injected after the initial code has loaded.

Preload all modules and STOP

We can generate a snapshot of all modules loaded during a "normal" run of the server, then preload all these modules ourselves and seal the module cache.

For example, cache-require-paths monitors the loaded modules and saves all resolved paths into a JSON file. We can similarly save the loaded module paths, or even load all modules listed in the package.json file, and then disable loading anything else.
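Saving the list of loaded module paths can be sketched in a few lines. This is not how cache-require-paths works internally; the function names saveLoadedModules and readLoadedModules and the temporary file location are assumptions for this example.

```javascript
'use strict'
const fs = require('fs')
const os = require('os')
const path = require('path')

// Writes the resolved paths of all currently loaded modules to a JSON file.
function saveLoadedModules(filename) {
  const loaded = Object.keys(require.cache)
  fs.writeFileSync(filename, JSON.stringify(loaded, null, 2))
  return loaded
}

// Reads the snapshot back, for example from a preload script.
function readLoadedModules(filename) {
  return JSON.parse(fs.readFileSync(filename, 'utf8'))
}

// Hypothetical snapshot location, unique per process
const snapshotFile = path.join(os.tmpdir(), 'loaded-modules-' + process.pid + '.json')
const saved = saveLoadedModules(snapshotFile)
console.log('saved', saved.length, 'module paths to', snapshotFile)
```

A server could call saveLoadedModules after it has handled some typical traffic, and the preload script could then require every path from the snapshot before sealing the cache.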

In the example below we have two files: started.js and when-started.js. We will create another file that bootstraps everything and then locks down the cache. We will run it as

node -r preload.js when-started.js

We need to load the started.js file - that is easy. But then we need to register when-started.js without compiling it (because compiling immediately runs the very code we want to protect against!). Instead we will create a placeholder property in the cache that only allows the module to be set once.

In addition, we will deep freeze the exported object of every module.

// preload.js
require('./started')
// any other modules ...
var rootModule
Object.defineProperty(require.cache,
  require.resolve('./when-started'),
  {
    enumerable: true,
    get: function () { return rootModule },
    set: function (val) {
      if (rootModule) {
        const err = new Error('root module has been already set')
        console.error(err.message)
        throw err
      }
      console.log('setting root value')
      rootModule = val
    }
  }
)
const freeze = require('deep-freeze')
console.log('preloaded all modules')
Object.seal(require.cache)
console.log('require cache sealed')
// freeze everything
Object.keys(require.cache).forEach(function (name) {
  const m = require.cache[name]
  try {
    if (typeof m.exports === 'object') {
      console.log('freezing', name)
      m.exports = freeze(m.exports)
    }
  } catch (err) {}
})
console.log('exports frozen')
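The reason the placeholder trick works together with Object.seal is that sealing forbids adding new properties, but assigning to a property that already exists as an accessor still goes through its setter. The sketch below shows this on a plain object standing in for require.cache; the function name setOnceDemo and the file name are made up for the example.

```javascript
function setOnceDemo() {
  'use strict'
  const cache = {} // stand-in for require.cache
  let stored
  let isSet = false

  // Placeholder that accepts exactly one assignment
  Object.defineProperty(cache, '/app/main.js', {
    enumerable: true,
    get: function () { return stored },
    set: function (value) {
      if (isSet) {
        throw new Error('/app/main.js has already been set')
      }
      stored = value
      isSet = true
    }
  })
  Object.seal(cache)

  // First assignment goes through the setter, even though the object is sealed
  cache['/app/main.js'] = { exports: 'first' }

  let secondError = null
  try {
    cache['/app/main.js'] = { exports: 'second' }
  } catch (err) {
    secondError = err
  }
  return { value: cache['/app/main.js'].exports, secondError: secondError }
}

const setOnceResult = setOnceDemo()
console.log(setOnceResult.value, setOnceResult.secondError.message)
// first /app/main.js has already been set
```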

The normal execution proceeds just fine. We can load ./started several times.

// when-started.js
const when1 = require('./started')
console.log('1: started program at', Number(when1))
// uses cached loaded value
const when2 = require('./started')
console.log('2: started program at', Number(when2))

$ node -r ./preload.js when-started.js 
preloaded all modules
require cache sealed
exports frozen
setting root value
1: started program at 1457038848046
2: started program at 1457038848046

But if we try to load a new module - we cannot

const when1 = require('./started')
console.log('1: started program at', Number(when1))
// try loading another module
require('./foo')

$ node -r ./preload.js when-started.js 
preloaded all modules
require cache sealed
setting root value
exports frozen
1: started program at 1457038939506
2: started program at 1457038939506
module.js:318
      delete Module._cache[filename];
TypeError: Cannot delete property '/when-started.js' of #<Object>

The error is a little cryptic, because it looks like it is trying to delete the root module, but I think this is because it fails to internally update it.

Let us try deleting a module to force its reload.

const when1 = require('./started')
console.log('1: started program at', Number(when1))
const startedPath = require.resolve('./started')
delete require.cache[startedPath]

Same exception happens again - we cannot delete properties from the sealed require.cache object.

If we try to change a loaded module after the fact, it fails, because the exports object has been frozen.

const config = require('./config')
console.log('1: config.user', config.user)
const configPath = require.resolve('./config')
require.cache[configPath].exports.user = 'root'
console.log('2: config.user', config.user)
// throws an Error

This method works, but requires you to really know what modules are necessary. This is hard to estimate, since blindly loading the entire node_modules folder is going to take a while!

Prefreeze all modules and STOP

Let us make a slight twist on the previous approach. Instead of loading a module, we are going to put a placeholder into require.cache that will freeze it in the future, just like we did for the single root module above. This method is a lot more efficient because we are NOT preloading (compiling) all modules from the node_modules folder. Instead we are just setting placeholders in the require.cache with a couple of hooks.

First, form fully resolved module paths

'use strict'
const freeze = require('deep-freeze')
const allModules = ['./config', './set-config'] // or read all modules from node_modules!
  .map(function (name) {
    return require.resolve(name)
  })

Second, create a property manually for each module - a placeholder

// add placeholders for each module and only allow
// loading it once, also freezing exports
allModules.forEach(function loadOnce(fullName) {
  var thisModule
  Object.defineProperty(require.cache, fullName,
    {
      enumerable: true,
      get: function () { return thisModule },
      set: function setModule(moduleValue) {
        console.log('setting module value', fullName)
        if (thisModule) {
          const err = new Error('module has been already set: ' + fullName)
          console.error(err.message)
          throw err
        }

        // freezing code will be here
        thisModule = moduleValue
      }
    }
  )
})

Every time a real module is loaded, it is passed into the set function. We cannot freeze the exported object just yet - the module is only starting to load and has not been compiled yet. The module instance passed to set has a special property called loaded. To know when the module has really finished loading, we can define an accessor property again!

set: function setModule(moduleValue) {
  // check if module has been loaded already code
  // freeze exports when the module gets loaded
  var loaded
  Object.defineProperty(moduleValue, 'loaded', {
    enumerable: true,
    get: function () { return loaded },
    set: function (moduleLoaded) {
      console.log('module', fullName, 'loaded', moduleLoaded)
      loaded = moduleLoaded
      if (typeof moduleValue.exports === 'object') {
        console.log('freezing exports', moduleValue.exports)
        moduleValue.exports = freeze(moduleValue.exports)
      }
    }
  })
}
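The same hook can be tried out without the module loader. In the sketch below a plain object plays the role of the module instance, and a shallow Object.freeze stands in for deep-freeze to keep the example free of dependencies. The names freezeWhenLoaded and fakeModule are assumptions for illustration.

```javascript
'use strict'
// Replaces the 'loaded' data property with an accessor that
// freezes the exports as soon as the flag is switched to true.
function freezeWhenLoaded(moduleValue) {
  let loaded = moduleValue.loaded
  Object.defineProperty(moduleValue, 'loaded', {
    enumerable: true,
    get: function () { return loaded },
    set: function (isLoaded) {
      loaded = isLoaded
      const exported = moduleValue.exports
      if (isLoaded && exported !== null && typeof exported === 'object') {
        Object.freeze(exported) // shallow stand-in for deep-freeze
      }
    }
  })
  return moduleValue
}

// Simulate what happens while a module is being loaded
const fakeModule = freezeWhenLoaded({ loaded: false, exports: {} })
fakeModule.exports.user = 'limited' // module code is still running, writes are fine
fakeModule.loaded = true            // the loader marks the module as loaded

console.log(fakeModule.loaded, Object.isFrozen(fakeModule.exports))
// true true
```

Freezing only when the flag becomes true leaves the module free to build up its exports while its own code is still running.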

Finally, we can seal the require.cache because all the placeholder properties for all possible modules have been created.

console.log('preloaded all modules')
Object.seal(require.cache)
console.log('require cache sealed')

Normal load works fine (I kept the debug log statements)

$ node -r ./prefreeze.js set-config.js 
preloaded all modules
require cache sealed
setting module value /set-config.js
setting module value /config.js
loading config.js
module /config.js loaded true
freezing exports { user: 'limited' }
1: config.user limited
2: config.user limited
module /set-config.js loaded true
freezing exports {}

Now, let us try changing the exported value via require.cache shortcut

const config = require('./config')
console.log('1: config.user', config.user)
const configPath = require.resolve('./config')
require.cache[configPath].exports.user = 'root'
console.log('2: config.user', config.user)

$ node -r ./prefreeze.js set-config.js 
preloaded all modules
require cache sealed
setting module value /set-config.js
setting module value /config.js
loading config.js
module /config.js loaded true
freezing exports { user: 'limited' }
1: config.user limited
module.js:318
      delete Module._cache[filename];
                           ^
TypeError: Cannot delete property '/set-config.js' of #<Object>

The error is very clear if the attack is deferred.

'use strict'
const config = require('./config')
console.log('1: config.user', config.user)
setTimeout(function tryAttacking() {
  console.log('trying to attack')
  const configPath = require.resolve('./config')
  require.cache[configPath].exports.user = 'root'
  console.log('2: config.user', config.user)
}, 100)

$ node -r ./prefreeze.js set-config.js 
preloaded all modules
require cache sealed
setting module value /set-config.js
setting module value /config.js
loading config.js
module /config.js loaded true
freezing exports { user: 'limited' }
1: config.user limited
module /set-config.js loaded true
freezing exports {}
trying to attack
/set-config.js:9
  require.cache[configPath].exports.user = 'root'
                                         ^
TypeError: Cannot assign to read only property 'user' of #<Object>
    at tryAttacking [as _onTimeout] (/set-config.js:9:42)
    at Timer.listOnTimeout (timers.js:92:15)

Boom, the attack has been repelled because the fast preload ensured that the entire require.cache has been sealed and frozen!

Conclusion

We need to think about how to protect a running Node process from changing its own code in case of malicious code injection. Most fixes require deep freezing sensitive objects and code fragments. While external solutions are possible, none is available at the moment. This leaves preloading and freezing the cache as the only way of making the program's code "read-only".

Better world by better software

Gleb Bahmutov PhD