Jun 5 2017

How To Draw An Owl

Refactor code using functional approach (Maybe, immutable).

Intro

I like simple code. I consider functional code simpler than OO code. Yet, I am not an advanced practitioner of functional programming; and I easily get lost when people throw words like "comonads" around. After writing [a lot of blog posts][../tags/functional/] about functional way of writing JavaScript, I only scratch a surface of actually using the full power of FP in JS.

In fact, using a little bit of functional programming in JavaScript is super easy - just take advantage of built-in Array callbacks! Yet mastering an advanced topic like combining several Monad types together is much much harder. See if you can read this explanation of Freer Monad and be ready to use it in your code!

A lot of FP explanations seems to start with basics and then immediately switch to controlling side effects via Abstract Data Types (like Functor, Monad). Meanwhile I am sitting there reading about it and feeling like I am learning to draw an owl:

How to draw an owl: simplified version

In this blog post I will try to refactor a piece of my code applying the following functional principles:

pure functions
object lenses
immutable data structure
Maybe monad

and I will also do it slowly, explaining each code transformation. Hopefully, I will better understand the FP myself, and fill in the details on how to draw an owl.

How to really draw an owl

Initial code

As an example, I will use a little utility I wrote some time ago called Rocha - it is a wrapper around super popular unit testing framework Mocha that randomizes the order of unit tests before running them. This comes in handy when flashing dependencies between tests. If the tests pass - great, and next time they will run in new order. But if the tests fail, the running order is saved as a JSON file, and the next test run will try the same failing test order. Thus you can work on finding and fixing the test dependencies, making them truly independent.

The code fragment we are going to look at is the shuffling code. Each Mocha test file can have blocks of tests called suites. The suites can be nested, and each suite can have additional tests. The suite is defined by describe function and each test by it function.

// typical spec file
it('has a test', () => {...})
describe('a suite', () => {
  it('a test inside a suite', () => {...})
  it('second', () => {...})
  it('third', () => {...})
  describe('nested suite', () => {
    it('has another test', () => {...})
  })
  describe('second nested suite', () => {
    it('yet another test', () => {...})
  })
})

Our Rocha wrapper looks at the previous run results, and if there is no previously saved failing test run order, shuffles the top level suite object.

mocha.suite.beforeAll(function () {
  const cachedOrder = cache.load()
  if (cachedOrder) {
    log('reordering specs like last time')
    order.set(mocha.suite, cachedOrder)
  } else {
    order.shuffle(mocha.suite)
  }
})

The initial implementation of order.shuffle is in function shuffleDescribes that you can see here

function shuffleDescribes (suite) {
  if (suite.suites && suite.suites.length) {
    log('shuffling %d describe blocks in "%s"',
      suite.suites.length, suite.title)
    suite.suites = _.shuffle(suite.suites)
    shuffleTests(suite)
    suite.suites.forEach(shuffleDescribes)
  }
}

Before we refactor, we must write unit tests. I quickly wrote a few sanity checks using snap-shot utility to add snapshot testing to Mocha framework. For sanity, I added returning the suite reference from shuffleDescribes

function shuffleDescribes (suite) {
  ...
  return suite
}

the unit tests are very simple. I am using la assertion function from lazy-ass - which just checks predicate and throws an exception formed from all arguments passed to it.

const la = require('lazy-ass')
const is = require('check-more-types')
const _ = require('lodash')
describe.only('shuffle', () => {
  it('is a function', () => {
    la(is.fn(shuffle), shuffle)
  })
  it('does not shuffle undefined', () => {
    la(_.isUndefined(shuffle()))
  })
  it('does not shuffle empty', () => {
    la(snapshot(shuffle({})))
  })
  it('does not shuffle empty suites', () => {
    const s = {
      suites: []
    }
    la(snapshot(shuffle(s)))
  })
  it('returns shuffled suites', () => {
    const s = {
      suites: R.range(1, 100)
    }
    const shuffled = shuffle(s)
    la(!_.isEqual(shuffled.suites, s.suites),
      'shuffled suites are the same', shuffled.suites)
  })
})

Note in the last test we are getting shuffled test, but because we are mutating the s.suites array directly, our initial object s does NOT hold the original list of items from 1 to 100.

const s = {
  suites: R.range(1, 100)
}
const shuffled = shuffle(s)
la(_.isEqual(shuffled.suites, s.suites),
  'shuffled suites are the same', shuffled.suites)

This is the major downside to mutable data - the original reference all of the sudden has changed data; this is very unexpected and hurts our test! We will take care of this later, but for now we are going to add an explicit check to compare object reference returned from the transformation - for now the returned object is going to be the same as the input one.

const s = {
  suites: R.range(1, 100)
}
const shuffled = shuffle(s)
la(shuffled === s, 'returns same reference')
la(_.isEqual(shuffled.suites, s.suites),
  'shuffled suites are the same', shuffled.suites)

Comment the code

Two quick notes on refactoring: we need to have unit tests, and we need to understand what is going on (to the best of our abilities) before changing any code. To better understand, I usually write a few comments to see if I understand the coded algorithm. Isn't this what we do every day: read and understand what the code does (and then change it to really do what it claims to do?)

function shuffleDescribes (suite) {
  // if the `suite` has nested suites
  if (suite.suites && suite.suites.length) {
    // side-effect: print log message
    log('shuffling %d describe blocks in "%s"',
      suite.suites.length, suite.title)
    // shuffle sub-suites
    suite.suites = _.shuffle(suite.suites)
    // shuffle tests inside this suite
    shuffleTests(suite)
    // recursive call: shuffle each suite inside
    suite.suites.forEach(shuffleDescribes)
  }
}

Note that we already found a problem. A suite can have just tests, and no nested suites; but our shuffleDescribes only calls shuffleTests(suite) if there are nested suites! This is a common occurrence; reading the code and trying to document each step finds flaws in the coded logic.

Before we begin refactoring, I must say that I usually have both Lodash docs https://lodash.com/docs/ and Ramda docs http://ramdajs.com/docs/ open in my browser as I program day to day. We are going to use a lot of little functions from these libraries. Whenever I need a function that operates on a collection for example, I look through each set of docs; they both have split all available functions into categories: Function, Logic, List, Objects, etc, making the search easy.

Refactoring the condition

Let us start with the guard condition at the start of the function shuffleDescribes.

1
2
3

if (suite.suites && suite.suites.length) {
 ...
}

The condition checks if the object suite has a suites property, and if the value is non-empty. Yet, the check is not very robust. For example, it assumes that the argument suite is a valid object. It also does not check if the suite.suites is an Array - only that its length property is truthy; this allows suite.suites to be a string, invoking recursive shuffle!

The condition is just AND of several predicates. I have shown how to refactor this case in this blog post. We just need to clearly express each check and then use R.allPass to combine them.

step 1

const R = require('ramda')
const positive = R.lt(0)
function isValidSuite (suite) {
  const p1 = R.is(Object)
  const p2 = R.has('suites'),
  const p3 = R.pathSatisfies(positive, ['suites', 'length'])
  return p1(suite) && p2(suite) && p3(suite)
}
function shuffleDescribes (suite) {
  if (isValidSuite(suite)) {
    ...
  }
  return suite
}

The last predicate p3 is unusual - it makes use of Ramda.pathSatisfies function that can check if a value in an object at the path satisfies a given predicate function, in this case the predicate itself is a function we derive using Ramda.lt function.

const positive = R.lt(0)
positive(0)   // false
positive(100) // true
const p3 = R.pathSatisfies(positive, ['suites', 'length'])
p3({suites: []})        // false
p3({suites: [1, 2, 3]}) // true

step 2

Let us replace p1(suite) && p2(suite) && p3(suite) with a single call - and we can do this because all predicate functions are unary and applied to the same object suite.

const R = require('ramda')
const positive = R.lt(0)
function isValidSuite (suite) {
  const p1 = R.is(Object)
  const p2 = R.has('suites'),
  const p3 = R.pathSatisfies(positive, ['suites', 'length'])
  const isValid = R.allPass([p1, p2, p3])
  return isValid(suite)
}

Now the trick to get rid of the extra function. Notice that our predicates p1, p2 and p3 no longer use the argument suite. Even allPass does not need it right away - it is curried after all. Thus the function isValid and isValidSuite is really the same function and we can "collapse" the condition into a single call

const positive = R.lt(0)
// isValidSuite :: suite -> bool (depending on the suites object)
const isValidSuite = R.allPass([
  R.is(Object),
  R.has('suites'),
  R.pathSatisfies(positive, ['suites', 'length'])
])

The rest of the code stays unchanged. Function isValidSuite is waiting (patiently) for an object suite to inspect.

Refactoring imperative code

Now that we have a clean self-explaining and robust condition function, let us see how to write better code in its if/else branches. Our current code required comments in order to be clear (otherwise a logical error we just found would be tough to hide). One good way to avoid using comments, is to split the code in small functions. Each function should be so small that it should have a single purpose and good descriptive name. Then the code would not require a lot of comments, because function names would explain what is happening step by step.

We have 3 new candidate functions to write looking at the code. Just look at each line with a comment!

function shuffleDescribes (suite) {
  if (isValidSuite(suite)) {
    // side-effect: print log message
    log('shuffling %d describe blocks in "%s"',
      suite.suites.length, suite.title)
    // shuffle sub-suites
    suite.suites = _.shuffle(suite.suites)
    shuffleTests(suite)
    // recursive call: shuffle each suite inside
    suite.suites.forEach(shuffleDescribes)
  }
  return suite
}

Functio shuffleTests already exists, so we only need new functions to log debug message, shuffle suites order and recursively shuffle each suite. Moving each line into its own function makes the code self-documenting

const logShuffle = (s) =>
  log('shuffling %d describe blocks in "%s"',
    s.suites.length, s.title)

const shuffleSuites = (s) => {
  s.suites = _.shuffle(s.suites)
  return s
}

function shuffleNestedSuites (s) {
  s.suites.forEach(shuffleDescribes)
  return s
}

function shuffleDescribes (suite) {
  if (isValidSuite(suite)) {
    logShuffle(suite)
    suite = shuffleSuites(suite)
    suite = shuffleTests(suite)
    suite = shuffleNestedSuites(suite)
  }
  return suite
}

Great; each function has a single purpose and is simple to read and understand. Every function (except for logging one) is closer to being pure. The functions still mutate the input object, but at least they do not access outside variables, except for logShuffle.

Use object lens to modified properties

Look at the function that shuffles s.suites array. It operates on the parent object s (which is our suite), reaches in, grabs the value of the property s.suites, modifies it and then sets it back.

const shuffleSuites = (s) => {
  s.suites = _.shuffle(s.suites)
  return s
}

Is there a functional way to read a property, run a function to modified it (in this case by calling _.shuffle) and then write the value back into the object? There is and it is called lenses - and Ramda includes functions to do so. Watch this Egghead lesson https://egghead.io/lessons/javascript-change-object-properties-with-ramda-lenses to get a taste of them. In our case, we are reading and writing the same property, which has a Ramda shortcut called R.lensProp - it is a lens focused on an object's property. We can create a lens and then specify a function to transform the value and then write it back - using R.over function.

const suitesLens = R.lensProp('suites')
const shuffleSuites = R.over(suitesLens, _.shuffle)
// call shuffleSuites just like before
// suite = shuffleSuites(suite)

Hmm, on of our unit tests fails!

1 2	shuffle returns shuffled suites: Error: returns same reference

Turns out Ramda lenses, clone the given object when setting the property. It plays right into our hands and gives us data immutability at this particular step.

We can update our tests and move on.

it('returns different object', () => {
  const s = {
    suites: [1, 2, 3]
  }
  const shuffled = shuffle(s)
  la(shuffled !== s, 'returns new object')
})

it('returns shuffled suites', () => {
  const s = {
    suites: R.range(1, 100)
  }
  const shuffled = shuffle(s)
  la(!_.isEqual(shuffled.suites, s.suites),
    'shuffled suites are the same',
    shuffled.suites, 'initial', s.suites)
})

But there are other places were we can use lenses. For example shuffling each child suite. Our current code does this in "side-effecty" way using Array.forEach

function shuffleNestedSuites (s) {
  s.suites.forEach(shuffleDescribes)
  return s
}

We can first refactor this code to be pure and use Array.map

function shuffleNestedSuites (s) {
  s.suites = s.suites.map(shuffleDescribes)
  return s
}

s.suites = s.suites.map(...) looks almost like a property lens, yet instead of a transformation function the code calls a method on the array. Not going to work. Luckily, Ramda provides the map function we can use instead, passing the array.

// use Ramda.map
function shuffleNestedSuites (s) {
  s.suites = R.map(shuffleDescribes, s.suites)
  return s
}
// use Ramda.map partially applied for clarity
function shuffleNestedSuites (s) {
  s.suites = R.map(shuffleDescribes)(s.suites)
  return s
}

The above code can now be rewritten using a lens - and we can even reuse the same property lens, because we are operating on the property name suites!

1
2
3

const suitesLens = R.lensProp('suites')
const shuffleSuites = R.over(suitesLens, _.shuffle)
const shuffleNestedSuites = R.over(suitesLens, R.map(shuffleDescribes))

Note: sometimes I use R. prefix in my code to make clear that lensProp is coming from Ramda library. Usually I prefer destructuring assignment to import only the desired properties from the library to make code more readable.

const {lensProp, over, map} = require('ramda')
const suitesLens = lensProp('suites')
const shuffleSuites = over(suitesLens, _.shuffle)
const shuffleNestedSuites = over(suitesLens, map(shuffleDescribes))

Beautiful, lenses can really focus on object's properties. Sigh, it is a bad pun.

Maybe

The last refactoring I am going to use is to replace the imperative if/else statement with a Maybe monad. Our code currently evaluates a condition and then proceeds to call series of functions, passing an argument. The code also receives the result from each function and passes it to the next one - super ugly!

function shuffleDescribes (suite) {
  if (isValidSuite(suite)) {
    logShuffle(suite)
    suite = shuffleSuites(suite)
    suite = shuffleTests(suite)
    suite = shuffleNestedSuites(suite)
  }
  return suite
}

What we are describing is a computation that starts with a value. Then, if the value passes the condition isValidSuite it transforms this value through several transformation functions. If the value does not pass isValidSuite condition, all these transformations are skipped. We do not have to reinvent the wheel and code it ourselves - this pattern has been implemented before (many times). Here is how it would look in principle.

function shuffleDescribes (suite) {
  return
  wrap(suite)  // <-- predicate goes here
    .map(logShuffle)    // ok, just believe me
    .map(shuffleSuites) // puts shuffleSuites(s) result back
    .map(shuffleTests)  // into the "wrapper object"
    .map(shuffleNestedSuites)
  }
}

Ramda itself does not implement this wrapper type, but a companion library ramda-fantasy does. In case we are going to use the Maybe type which represents a value that might or might not be there.

step 1

How do we create a Maybe? We don't. We create one of the subtypes - if the value is invalid / does not exist we will create Maybe.Nothing, and if the value is valid we are going to create Maybe.Just(x) passing actual value as x argument. Here is our condition code

const {Just, Nothing} = require('ramda-fantasy').Maybe
function maybeSuites (suite) {
  return isValidSuite(suite) ? Just(suite) : Nothing()
}
function shuffleDescribes (suite) {
  return maybeSuites(suite)
  // ...
  // we need the actual value?
}

See how we stick suites value into Maybe? Now we can map over it, passing our transform functions. But first, we need to decide what to do at the end. Function shuffleDescribes does not return the Maybe[suites] - it returns just suites reference. Thus we need to extract the value before returning it. At the end the value Maybe[suites] could be Just[something] or Nothing(). We must handle both cases. Ramda-fantasy supplies a good method maybe.getOrElse to extract the value something from the Just[something] case, or a default if the value is Nothing(). In our case, the else branch of the imperative code just returned the suites argument - and we can do the same.

function shuffleDescribes (suite) {
  return maybeSuites(suite)
  // ...
  .getOrElse(suite)
}

Perfect, we are conforming to the function's API. We are constructing a Maybe[suites] value and we are getting value back before returning from the function shuffleDescribes.

step 2

We are shuffling suites inside the top level suite. Before we were just calling a function and mutated the variable suites. When we use a Maybe it has a method .map just like a JavaScript array has; supply a transform function and the Maybe will receive and store updated value.

// before
/*
  suite = shuffleSuites(suite)
  suite = shuffleTests(suite)
  suite = shuffleNestedSuites(suite)
*/
// after switching to Maybe
function shuffleDescribes (suite) {
  return maybeSuites(suite)
    .map(shuffleSuites)
    .map(shuffleTests)
    .map(shuffleNestedSuites)
    .getOrElse(suite)
}

Nice, the imperative logic "call function F and store result; call function G and store result ..." has been replaced with .map(F).map(G)... fluent interface. But we have one function that does NOT return suites - the log function. It returns nothing (which is a red flag that the function is not pure). We cannot put logShuffle into our Maybe chain as a callback because it will overwrite suites value with undefined. We could modify logShuffle to return the argument it receives:

const logShuffle = (s) => {
  log('shuffling %d describe blocks in "%s"',
    s.suites.length, s.title)
  return s
}

That's ugly, and it is easy to forget to return the argument s. Thus we better use Ramda.tap to wrap a function and always return the input argument. Tap is one of my favorite and most often used function adaptors.

function shuffleDescribes (suite) {
  return maybeSuites(suite)
    .map(tap(logShuffle))
    .map(shuffleSuites)
    .map(shuffleTests)
    .map(shuffleNestedSuites)
    .getOrElse(suite)
}

Is this really better than imperative code? I would argue yes.

We had to explicitly deal with invalid suite data.
Each step is a pure function, operating on an immutable data object.
We only used a few "primitive" functions: map, lens, etc. With fewer moving parts, we did not really have to write new code; our code is just a series of data transforms.

And that is how you draw an owl.

More info

More writings on refactoring and writing code functional style

Better world by better software

Gleb Bahmutov PhD

Our planet 🌏 is in danger

Act today: what you can do