jscodeshift example

Simple example showing automated client code change when module API changes.

Initial code

Take a simple calc.js that exports a single function that adds two numbers.

// calc.js
module.exports = function add(a, b) { return a + b }

Any client can load the function directly using require('./calc') and use it. I am using relative paths, but the same holds for module names.

// index.js
const add = require('./calc')
console.log('2 + 3 =', add(2, 3))
// node index.js
// 2 + 3 = 5

So far so good.

calc API has changed

Let us change the API exported by the calc.js file. Instead of directly exporting a single function, let us export an object with add property. This allows us to extend the API with sub, mul and other functions in the future.

// calc.js
module.exports = {
  add: function add(a, b) { return a + b }
}

Our module calc.js changed its external "public" API, thus this is a major change according to semantic versioning. Every existing client will crash when trying to use the new version.

console.log('2 + 3 =', add(2, 3))
                       ^
TypeError: add is not a function

Let us create a code transform that will change any client from using the exported function to use the exported "add" property.

// existing client
const add = require('./calc')
// transformed client
const add = require('./calc').add
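To see the breakage and the fix side by side without touching any files, the two versions of the module can be mimicked with plain values. This is only a minimal sketch: oldCalc and newCalc are hypothetical stand-ins for what require('./calc') returns before and after the API change.

```javascript
// Hypothetical stand-ins for the value returned by require('./calc')
const oldCalc = function add (a, b) { return a + b }
const newCalc = { add: function add (a, b) { return a + b } }

// the existing client calls the export directly, which only works with the old API
console.log(typeof oldCalc) // function
console.log(typeof newCalc) // object, so calling it throws a TypeError

// the transformed client reads the "add" property instead
const add = newCalc.add
console.log('2 + 3 =', add(2, 3)) // 2 + 3 = 5
```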

Transform setup

Let us initialize the transform that does not change the source code yet.

// transform.js
const j = require('jscodeshift')

function transform (file, api, options) {
  console.log('transforming', file.path)

  const parsed = j(file.source)
  const transformed = parsed

  const outputOptions = {
    quote: 'single'
  }
  return transformed.toSource(outputOptions)
}
module.exports = transform

We can add the script command to the package.json and the dependency on jscodeshift

{
  "scripts": {
    "test": "jscodeshift index.js -t transform.js --dry --print"
  },
  "devDependencies": {
    "jscodeshift": "0.3.30"
  }
}

We are running the transform in "dry" mode, which will NOT overwrite the source file index.js. The --print option also prints the transformed source code to the terminal. This combination is perfect while developing a transform.

Input abstract syntax tree

The input program is first parsed, then transformed, then converted back into a string. The parsed program becomes a root of a tree, each node being an instance of "NodePath". The purpose of "NodePath" is to keep links to its parent and children "NodePath" instances. The actual information is contained in the "value" property object.
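The relationship between a path and its node can be pictured with a few plain objects. The sketch below is a simplified illustration only, not the real "NodePath" class: makePath is a hypothetical helper that keeps just the value, parentPath and name properties.

```javascript
// Simplified illustration only; the real NodePath class does much more
function makePath (value, parentPath, name) {
  return {
    value: value,                   // the actual AST node
    parentPath: parentPath || null, // link to the enclosing path
    name: name === undefined ? null : name // property name inside the parent node
  }
}

// ESTree-shaped node for the expression require('./calc')
const call = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'require' },
  arguments: [{ type: 'Literal', value: './calc' }]
}

const callPath = makePath(call, null, null)
const calleePath = makePath(call.callee, callPath, 'callee')

console.log(calleePath.value.name)            // require
console.log(calleePath.name)                  // callee
console.log(calleePath.parentPath.value.type) // CallExpression
```

The node itself knows nothing about its parent; only the wrapper does, which is why the transform code keeps reaching for path.value.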

We can print the abstract syntax tree of an example program index.js. Usually I just use AST explorer to parse and view the tree (more on this later), but for now we can just use the terminal.

Let us remove all code from the index.js leaving only the const add = require('./calc') line for simplicity. Let us also print the parsed object inside the transform.js

const parsed = j(file.source)
console.log(parsed)

Calling npm test produces the following

transforming index.js
Collection {
  _parent: undefined,
  __paths:
   [ NodePath {
       value: [Object],
       parentPath: null,
       name: null,
       __childCache: null } ],
  _types: [ 'File', 'Node', 'Printable' ] }

Ok, just printing the top level "NodePath" object is not good enough. We really want to traverse all nodes in the tree and only print the require('./calc') calls. Luckily, the parsed object implements "Collections" methods, just like an Array. We can print all "CallExpression" nodes for example.

const parsed = j(file.source)
parsed.find(j.CallExpression)
  .forEach(function (path) {
    console.log(path.value)
  })

The above code finds a single node

transforming index.js
Node {
  type: 'CallExpression',
  start: 12,
  end: 29,
  loc:
   SourceLocation {
     start: Position { line: 1, column: 12 },
     end: Position { line: 1, column: 29 },
     lines: Lines {},
     indent: 0 },
  callee:
   Node {
     type: 'Identifier',
     start: 12,
     end: 19,
     loc: SourceLocation { start: [Object], end: [Object], lines: Lines {}, indent: 0 },
     name: 'require',
     typeAnnotation: null },
  arguments:
   [ Node {
       type: 'Literal',
       start: 20,
       end: 28,
       loc: [Object],
       value: './calc',
       rawValue: './calc',
       raw: '\'./calc\'',
       regex: null } ],
  trailingComments: null }
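Conceptually, a search by node type walks the tree and collects every node whose type matches. The sketch below does the same by hand over a plain ESTree-shaped object. findByType is a hypothetical helper, not part of jscodeshift, and it returns bare nodes rather than "NodePath" objects; the tree is hand-written and omits the position information.

```javascript
// Hypothetical helper: depth-first search for nodes with a given type
function findByType (node, type, found = []) {
  if (Array.isArray(node)) {
    node.forEach(child => findByType(child, type, found))
  } else if (node && typeof node === 'object') {
    if (node.type === type) found.push(node)
    Object.keys(node).forEach(key => findByType(node[key], type, found))
  }
  return found
}

// Hand-written tree for: const add = require('./calc')
const program = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'const',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'add' },
      init: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'require' },
        arguments: [{ type: 'Literal', value: './calc' }]
      }
    }]
  }]
}

const calls = findByType(program, 'CallExpression')
console.log(calls.length)         // 1
console.log(calls[0].callee.name) // require
```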

We are only interested in the call expressions require('./calc') thus we can filter our node collection. Let us filter "NodePath" objects

const isRequire = n =>
  n && n.callee && n.callee.name === 'require'
const isUnary = args =>
  Array.isArray(args) && args.length === 1
const isCalc = arg => arg.value === './calc'
const isRequireCalc = n =>
  isRequire(n) && isUnary(n.arguments) && isCalc(n.arguments[0])
function transform (file, api, options) {
  const parsed = j(file.source)
  parsed.find(j.CallExpression)
    .filter(path => isRequireCalc(path.value))
    .forEach(function (path) {
      console.log(path.value)
    })
}

This should produce the same list, but if we had other function calls in our program, only the require('./calc') would be processed.
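The predicates only look at plain node properties, so they can be checked without parsing anything. A small sketch follows; the call helper is hypothetical and builds just enough of a "CallExpression" node for the predicates to inspect.

```javascript
const isRequire = n =>
  n && n.callee && n.callee.name === 'require'
const isUnary = args =>
  Array.isArray(args) && args.length === 1
const isCalc = arg => arg.value === './calc'
const isRequireCalc = n =>
  isRequire(n) && isUnary(n.arguments) && isCalc(n.arguments[0])

// Hypothetical helper: builds a plain CallExpression-shaped node
const call = (name, ...values) => ({
  type: 'CallExpression',
  callee: { type: 'Identifier', name: name },
  arguments: values.map(value => ({ type: 'Literal', value: value }))
})

console.log(isRequireCalc(call('require', './calc'))) // true
console.log(isRequireCalc(call('require', 'fs')))     // false, different module
console.log(isRequireCalc(call('load', './calc')))    // false, different function
console.log(isRequireCalc(call('require')))           // false, no arguments
```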

Note: you can provide search parameters to the parsed.find function as a second argument, and it will handle the edge cases (such as missing properties) for you. In the above case, finding all require calls would be

parsed.find(j.CallExpression, {callee: {name: 'require'}})

We would still need to keep only the calls whose first argument is './calc' after that.

Desired output

I found that the easiest way to transform one abstract syntax tree into another is to write the desired output program first. In our case, let us just create a file desired.js with a single require('./calc').add line.

Then paste the input and desired source into http://astexplorer.net/ and see both trees.

File index.js with just require('./calc') source code has the following tree.

{
  "type": "Program",
  "start": 0,
  "end": 18,
  "body": [
    {
      "type": "ExpressionStatement",
      "start": 0,
      "end": 17,
      "expression": {
        "type": "CallExpression",
        "start": 0,
        "end": 17,
        "callee": {
          "type": "Identifier",
          "start": 0,
          "end": 7,
          "name": "require"
        },
        "arguments": [
          {
            "type": "Literal",
            "start": 8,
            "end": 16,
            "value": "./calc",
            "raw": "'./calc'"
          }
        ]
      }
    }
  ],
  "sourceType": "module"
}

Same file with require('./calc').add produces slightly more complex tree

{
  "type": "Program",
  "start": 0,
  "end": 22,
  "body": [
    {
      "type": "ExpressionStatement",
      "start": 0,
      "end": 21,
      "expression": {
        "type": "MemberExpression",
        "start": 0,
        "end": 21,
        "object": {
          "type": "CallExpression",
          "start": 0,
          "end": 17,
          "callee": {
            "type": "Identifier",
            "start": 0,
            "end": 7,
            "name": "require"
          },
          "arguments": [
            {
              "type": "Literal",
              "start": 8,
              "end": 16,
              "value": "./calc",
              "raw": "'./calc'"
            }
          ]
        },
        "property": {
          "type": "Identifier",
          "start": 18,
          "end": 21,
          "name": "add"
        },
        "computed": false
      }
    }
  ],
  "sourceType": "module"
}

Thus we need to transform every require('./calc') "CallExpression" into a "MemberExpression" with additional property "add". We can visualize this as follows

require('./calc')   .add
-----------------   ----------
 CallExpression     Identifier
------------------------------
       MemberExpression
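The same wrapping can be spelled out with plain objects before any jscodeshift code is written. This is only a sketch of the target tree shape: wrapInMember is a hypothetical helper, and the nodes leave out the start and end positions.

```javascript
// ESTree-shaped node for require('./calc'), without position info
const callExpression = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'require' },
  arguments: [{ type: 'Literal', value: './calc' }]
}

// Hypothetical helper: wraps any expression node as <object>.<propertyName>
const wrapInMember = (object, propertyName) => ({
  type: 'MemberExpression',
  object: object,
  property: { type: 'Identifier', name: propertyName },
  computed: false
})

const member = wrapInMember(callExpression, 'add')
console.log(member.type)                      // MemberExpression
console.log(member.object === callExpression) // true
console.log(member.property.name)             // add
```

The original call node is not copied or rebuilt; it simply becomes the object of the new parent node.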

Transformation

We are going to replace each filtered call expression with a member expression. We can tell the Collections API to replace the current syntax tree node with a new value using the replaceWith method.

parsed.find(j.CallExpression)
  .filter(path => isRequireCalc(path.value))
  .replaceWith(function (path) {
    // return new AST node
  })

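jscodeshift includes helpful builder functions whose names match the AST node type names one to one, just starting with a lowercase letter (for example, j.memberExpression builds a "MemberExpression" node). Here is our transformation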

.replaceWith(function (path) {
  return j.memberExpression(
    path.value,
    j.identifier('add')
  )
})

Notice the trick - we are reusing the existing "CallExpression" in path.value so we do not have to construct require('./calc') node again. We just use it as the first argument (the target object) to the j.memberExpression.
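Putting the pieces together, the whole transform could look like the sketch below. One assumption differs from the setup shown earlier: it takes jscodeshift from the api argument (api.jscodeshift) instead of requiring it at the top of the file. Everything else reuses the predicates and builders from the text.

```javascript
// transform.js (sketch; jscodeshift is taken from api.jscodeshift)
const isRequire = n =>
  n && n.callee && n.callee.name === 'require'
const isUnary = args =>
  Array.isArray(args) && args.length === 1
const isCalc = arg => arg.value === './calc'
const isRequireCalc = n =>
  isRequire(n) && isUnary(n.arguments) && isCalc(n.arguments[0])

function transform (file, api, options) {
  const j = api.jscodeshift
  const parsed = j(file.source)

  // wrap every require('./calc') call into require('./calc').add
  parsed.find(j.CallExpression)
    .filter(path => isRequireCalc(path.value))
    .replaceWith(function (path) {
      return j.memberExpression(
        path.value,
        j.identifier('add')
      )
    })

  return parsed.toSource({ quote: 'single' })
}
module.exports = transform
```

Note that this sketch is not idempotent: running it a second time over already transformed code would find the same require('./calc') call again and append another .add.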

The transformation prints the result: require('./calc').add. This is exactly what we need. Let us remove "--dry" parameter and save the output file. The diff shows the change.

-const add = require('./calc')
+const add = require('./calc').add
 console.log('2 + 3 =', add(2, 3))

The transformed index.js now works with our new API.

node index.js
2 + 3 = 5

Final thoughts

Codemods for semantic versioning

By definition, clients require code modifications when a module has a major update. I can already enforce the "do not break the dependent projects when releasing minor or patch versions" rule using the dont-break and dont-crack tools. It would be nice to extend these tools to check the codemods included with a major release. If the codemods successfully fix all dependent projects, then release the new major version; the clients will be able to update successfully.

Automated codemod generation

I once built a primitive data transform solver using the Ramda library, called Rambo. Just give Rambo one or more input and output data examples, and it will (in very, very simple cases) give you source code that transforms the input into the output.

I wish there were a simple solver for codemods. Just give it the input source and the corresponding desired code, and wait for a transformation function. Such a tool, if successful in most cases, would lead to wider codemod adoption and fewer obstacles to successful code upgrades.

Relevant links