Optimizing nth

Profiling and speeding up a function using Chrome and v8-natives.

Recently I came across nth micro library. It adds a suffix to a number

1
2
3
nth(1); // "1st"
nth(2); // "2nd"
nth(212): // "212th"

The library is very small - just a single function with a few lines of code. I wanted to check how fast it was, and if it could be made faster. Here are the steps that sped it up by a factor of 5x, and in a particular use case by a factor of 50x (using caching).

Initial performance

I grabbed the original source code and pasted into index.html.

1
2
3
4
5
6
7
8
9
10
function nth(n) {
if (/^(string|number)$/.test(typeof n) === false) { return n; }
var suffixes = ['th', 'st', 'nd', 'rd', 'th', 'th', 'th', 'th', 'th', 'th'],
match = n.toString().match(/\d$/),
suffix;
if (!match) { return n; }
suffix = suffixes[match[0]];
if (/1[123]$/.test(n)) { suffix = 'th'; }
return [n, suffix].join('');
}

To test it, I am computing 1 million random numbers in the range from 1 to 1000, then calling nth on each number.

1
2
3
4
5
6
7
8
9
10
11
12
var results;
var k, n = 1000000;
results = new Array(n);
for (k = 0; k < n; k += 1) {
results[k] = Math.round(Math.random() * 1000);
}
function benchmarkNth() {
for (k = 0; k < n; k += 1) {
results[k] = nth(results[k]);
}
}
benchmarkNth();

In order to accurately profile the JavaScript, I will use console.profile methods.

1
2
3
4
5
console.profile('nth');
benchmarkNth();
console.profileEnd('nth');
console.log('first 10 results');
console.log(results.slice(0, 10));

The initial function takes almost 1 second to process 1 million numbers.

step-1

Profile individual functions

Chrome profiler does not show which statements are bottlenecks - it works at the function level. Thus I need to refactor the code to find the actual bottleneck.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
function isStringOrNumber(x) {
return /^(string|number)$/.test(typeof n);
}
function findMatch(n) {
return n.toString().match(/\d$/);
}
function joinSuffix(n, suffix) {
return [n, suffix].join('');
}
var th = /1[123]$/;
function check11th(n, suffix) {
return th.test(n) ? 'th' : suffix;
}
var suffixes = ['th', 'st', 'nd', 'rd', 'th', 'th', 'th', 'th', 'th', 'th'];
function nth(n) {
if (!isStringOrNumber(n)) { return n; }
var match = findMatch(n);
if (!match) { return n; }
var suffix = suffixes[match[0]];
suffix = check11th(n, suffix);
return joinSuffix(n, suffix);
}

This is the same logic as the original function, but now Chrome profiler can show which parts take longer.

step-2

Hmm, all bottlenecks are native functions (red rectangle). None of my little functions are taking too long.

Confirm function optimization

Chrome profiler does not show any obvious bottlenecks in this code. I wanted to see if any of my functions has not been optimized by the v8 engine. I showed how to check function's status in Detecting function optimizations in V8. I copied the nth source code into index.js and used v8-natives module to profile and query function's optimization status.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
// index.js
// same code as above
function callNth() {
return nth(Math.round(Math.random() * 1000));
}
// profile callNth calls
var n = 1000000;
var v8 = require('v8-natives');
var took = v8.helpers.benchmark(n, callNth);
console.log('each call to nth took', took/n, 'nanoseconds');
// print v8 engine optimization status for my functions
[nth, isStringOrNumber, findMatch, joinSuffix, check11th].forEach(function (fn) {
v8.helpers.printStatus(fn);
});

I need to run node using a special flag to allow native syntax query

$ node --allow-natives-syntax index.js
each call to nth took 512.916498 nanoseconds
Function nth is optimized
Function isStringOrNumber is optimized
Function findMatch is optimized
Function joinSuffix is optimized
Function check11th is optimized

Every function has been optimized, so nothing is obviously slow. Where is the bottleneck?

Solving the bottleneck

I went back to the Chrome profile results. Where is the call set length coming from? We do not get length of a string or an Array anywhere in our code. I also wanted to know why the garbage collector runs so often. Seems we are creating a lot of objects somewhere.

Inspecting the tree view in the profile, I noticed that set length seems to be called from joinSuffix function.

1
2
3
function joinSuffix(n, suffix) {
return [n, suffix].join('');
}

This is a red flag - we are using a tiny array to concatenate a number and a string. Imagine what the JavaScript engine has to do in this case:

  • creates a new array
  • places a number n into the first position
  • places a string suffix into the second position
  • calls Array.prototype.join and returns string result

Let us change this function to use regular string concatenation

1
2
3
function joinSuffix(n, suffix) {
return n + suffix;
}

The performance has improved dramatically

step-3

Application-specific optimization

While we can profile and improve each individual function, this application has its specific obvious optimization to make. We are computing suffix for 1 million numbers limited to 1000 range. Thus we can cache the computed results once and avoid the multiple computation.

1
2
3
4
5
6
7
8
9
var cache = new Array(1000);
function cachedNth(n) {
if (!cache[n]) {
cache[n] = nth(n);
}
return cache[n];
}
...
results[k] = cachedNth(results[k]);

This optimization leads to the most dramatic improvement

step-4

Conclusion

It is important to accurately profile the JavaScript code before making any optimizations. In some cases the true bottleneck is non-obvious and requires inspecting the source code for potentially expensive operations that are still optimized by v8 engine. While micro-optimizations are important, one can get large performance gains by using the application-specific shortcuts.

Related