Faking lexical scope

Feeding fake variables to a function taken out of its lexical scope.

I have described how to test a function's purity against a set of unit tests. For example, we can take a pure function add and recreate it as another function by evaluating the function's string representation.

function add(a, b) { return a + b; }
var recreatedAdd = eval('(' + add.toString() + ')');
recreatedAdd(2, 3); // 5

I prefer eval to new Function because add.toString() returns the complete function source, including the signature, while new Function requires me to list the argument names and the function body separately.
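To make the comparison concrete, here is a minimal sketch that recreates add both ways. The names viaEval and viaFunction are mine, used only for illustration. The new Function variant has to spell out the parameter names and the body by hand instead of reusing add.toString().

```javascript
function add(a, b) { return a + b; }

// eval: the string from toString() is already a complete function expression
var viaEval = eval('(' + add.toString() + ')');

// new Function: parameter names and body are passed as separate strings
var viaFunction = new Function('a', 'b', 'return a + b;');

console.log(viaEval(2, 3)); // 5
console.log(viaFunction(2, 3)); // 5
```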

We can even create the add function inside a private closure, just to confirm that the evaluated copy shares nothing with the original function:

var add = (function () {
  function add(a, b) { return a + b; }
  return add;
}());
var recreatedAdd = eval('(' + add.toString() + ')');
recreatedAdd(2, 3); // 5

I had difficulty testing functions that rely on their lexical scope, because the evaluated expression lives in a different lexical scope. The following fragment shows the problem:

var add2 = (function () {
  var two = 2;
  function add2(a) { return a + two; }
  return add2;
}());
var recreatedAdd2 = eval('(' + add2.toString() + ')');
recreatedAdd2(3);
// the evaluated source is
//   (function add2(a) { return a + two; })
// and the call throws
//   ReferenceError: two is not defined

When we execute eval('(' + add2.toString() + ')') we are only evaluating the source of the function add2 returned from the anonymous closure. The evaluation does not have access to the original lexical scope containing the variable two. How could it? All it knows is that it needs to evaluate a source fragment, without knowing where it came from. Can we still recreate a working function in this case?

It turns out that we can. A direct call to eval uses the current lexical scope during the evaluation. While this presents a potential security hole, it can be used to our advantage. If we know that the source fragment needs a variable named two, we can create a fake variable with that name in the scope where eval is called.

var add2 = (function () {
  var two = 2;
  function add2(a) { return a + two; }
  return add2;
}());
// create fake variable in the current lexical scope
var two = 3;
var recreatedAdd2 = eval('(' + add2.toString() + ')');
recreatedAdd2(3); // 6
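The fake variable is picked up only because eval is called directly. As a hedged illustration (the function names below are hypothetical), the same source evaluated through an alias of eval, a so-called indirect eval, runs in the global scope and does not see the local fake variable:

```javascript
var source = '(function add2(a) { return a + two; })';

function recreateWithDirectEval() {
  var two = 3; // fake variable, local to this function
  return eval(source); // direct eval: sees the local `two`
}

function recreateWithIndirectEval() {
  var two = 3; // same fake variable, but invisible to the indirect eval below
  var indirectEval = eval;
  return indirectEval(source); // indirect eval: evaluated in the global scope
}

console.log(recreateWithDirectEval()(3)); // 6

var indirectError = null;
try {
  recreateWithIndirectEval()(3);
} catch (err) {
  indirectError = err;
}
// assuming no global variable named `two` exists
console.log(indirectError && indirectError.name); // ReferenceError
```

So the fake variables have to be declared in the same scope as the direct eval call, or in one of its enclosing scopes.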

How can we find the free variables referenced inside the function? By parsing the function source, for example with esprima. I prefer to use esprima via the falafel wrapper for simplicity.

var add2 = (function () {
  var two = 2;
  function add2(a) { return a + two; }
  return add2;
}());
var falafel = require('falafel');
falafel(add2.toString(), function (node) {
  if (node.type === 'Identifier') {
    console.log('variable', node.name);
  }
});
// prints
// variable add2
// variable a
// variable a
// variable two

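We can filter out both the function's name and its argument names. Since falafel visits child nodes before their parents, the FunctionDeclaration node is processed last, after all of its identifiers have been collected: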

// add2 is the function defined in the previous example
var falafel = require('falafel');
var freeVariables = {};
falafel(add2.toString(), function (node) {
  if (node.type === 'Identifier') {
    // collect every identifier first
    freeVariables[node.name] = true;
  } else if (node.type === 'FunctionDeclaration') {
    // filter out function name and arguments
    delete freeVariables[node.id.name];
    node.params.forEach(function (param) {
      delete freeVariables[param.name];
    });
  }
});
console.log('free variables', Object.keys(freeVariables));
// prints
// free variables [ 'two' ]

Thus we can determine that two must come from the outer lexical scope.

Update 1: fake dynamic set of variables

In the above example we knew the variable to be faked (called two). Then I described how to find the set of variables that are still free inside the function to be called. If we discover the set of variables by inspecting the function at runtime, how do we actually put them into the lexical scope? By putting them into the text of the evaluated expression.

var add2 = (function () {
  var two = 2;
  function add2(a) { return a + two; }
  return add2;
}());
function iife(code) {
  return '(function (){\n' + code + '\n}());';
}
// create dynamic fake variable in the current lexical scope
var injectName = 'two';
var injectValue = 3;
var code =
  'var ' + injectName + ' = ' + injectValue + ';\n' +
  'return ' + add2.toString();
console.log(iife(code));
var recreatedAdd2 = eval(iife(code));
console.log(recreatedAdd2(3));

I wrap the dynamically injected variable and the function inside an IIFE, and the above code prints

(function (){
var two = 3;
return function add2(a) { return a + two; }
}());
6

This way I can dynamically recreate all necessary variables, including their values.
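The same idea extends to several variables at once. Below is a minimal sketch, assuming the fake values can be written out as source text: the helper recreateWithFakes and the greet example are hypothetical names of mine, and I use JSON.stringify to turn each value into a literal so that strings get quoted correctly.

```javascript
// Hypothetical helper: rebuilds fn inside an IIFE that declares each fake variable
function recreateWithFakes(fn, fakes) {
  var declarations = Object.keys(fakes).map(function (name) {
    // JSON.stringify writes the value as a source literal,
    // so only JSON-serializable values survive the round trip
    return 'var ' + name + ' = ' + JSON.stringify(fakes[name]) + ';';
  });
  var code = '(function () {\n' +
    declarations.join('\n') + '\n' +
    'return ' + fn.toString() + ';\n' +
    '}());';
  return eval(code);
}

var greet = (function () {
  var greeting = 'hello';
  var punctuation = '!';
  function greet(name) { return greeting + ', ' + name + punctuation; }
  return greet;
}());

var fakeGreet = recreateWithFakes(greet, { greeting: 'howdy', punctuation: '?' });
console.log(fakeGreet('world')); // howdy, world?
console.log(greet('world')); // hello, world!
```

Values that cannot be serialized this way, such as functions or objects with methods, would need a different mechanism, for example being passed as arguments to the wrapping function.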

Update 2: real world use cases

  • I used faked lexical scopes, including dynamic variable injection, to mock HTTP ajax requests in a live AngularJS application. The code wraps the selected scope method and substitutes a fake $http variable. See ng-wedge.

  • Overwriting and patching the Node.js module load function in really-need.

The original function that compiles the loaded source, Module.prototype._compile, passes a require function to the new module. I needed to pass all arguments through to require, so I patched the source of _compile and then evaluated it.

// original Module.prototype._compile
Module.prototype._compile = function(content, filename) {
  var self = this;
  // remove shebang
  content = content.replace(/^\#\!.*/, '');
  function require(path) {
    return self.require(path);
  }
  require.resolve = function(request) {
    return Module._resolveFilename(request, self);
  };
  ...
};
// require replacement from really-need
// these variables are needed inside eval _compile
var runInNewContext = require('vm').runInNewContext;
var runInThisContext = require('vm').runInThisContext;
var path = require('path');
// my patched version
// (_compile is assumed to hold a reference to the original Module.prototype._compile)
var _compileStr = _compile.toString();
// pass all arguments from loaded module to our self.require
_compileStr = _compileStr.replace('self.require(path);',
  'self.require.apply(self, arguments);');
/* jshint -W061 */
var patchedCompile = eval('(' + _compileStr + ')');
Module.prototype._compile = function(content, filename) {
  return patchedCompile.call(this, content, filename);
};

The patch in the above example was simple enough to need only text substitution.

// original require only passed the name
function require(path) {
  return self.require(path);
}
// patched require passes all arguments
function require(path) {
  return self.require.apply(self, arguments);
}

See Hacking Node require for all details.
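The patch-by-substitution idea can be tried in isolation. The sketch below is not the real Node.js code: loader is a hypothetical stand-in object whose compile method mimics the shape of _compile. It also adds a check of my own that the text to be replaced is actually present, because String.prototype.replace leaves the source unchanged when nothing matches.

```javascript
// Hypothetical stand-in for a module loader; not the real Node.js implementation
var loader = {
  require: function () {
    // report every argument received, to make the difference visible
    return Array.prototype.slice.call(arguments);
  },
  compile: function () {
    var self = this;
    function require(path) {
      return self.require(path);
    }
    return require;
  }
};

var original = 'self.require(path);';
var replacement = 'self.require.apply(self, arguments);';

var compileStr = loader.compile.toString();
// fail loudly instead of silently evaluating unpatched code
if (compileStr.indexOf(original) === -1) {
  throw new Error('Cannot patch: expected text not found');
}
var patchedCompile = eval('(' + compileStr.replace(original, replacement) + ')');

var originalRequire = loader.compile();
var patchedRequire = patchedCompile.call(loader);

console.log(originalRequire('a', 'b')); // [ 'a' ]
console.log(patchedRequire('a', 'b')); // [ 'a', 'b' ]
```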

Limitations

Dynamic evaluation assumes that you know the names of the variables that need to be faked. These names usually change when the code is minified, so the above technique is very fragile in a production environment.