Aug 16 2016

Monads

Array, Promise and Maybe monads. Plus Docker is a functor.

Big thank you to Vincent Orr and especially to Luis Atencio for reading an early draft of this post and providing generous feedback.

There is a new excellent explanation of the Maybe monad written by James Sinclair. I highly recommend reading his long and comprehensive essay. I just want to give a slightly different reason for having monads around.

Simple function for primitives

Whenever I code, I always prefer to code the simplest functionality first. For example, an addition function would just return the sum of its two arguments:

function add(a, b) {
  return a + b
}
console.log(add(2, 3))
// 5

What if I want to log the input arguments before adding them? I do not add the logging into the function add - this adds a second concern to the function, making it harder to understand, test and reuse.

// BAD
function add(a, b) {
  console.log(a, b)
  return a + b
}
console.log(add(2, 3))
// 2 3
// 5

Instead, we can compose the logging and add functions.

function add(a, b) {
  return a + b
}
function logArguments(fn) {
  return function () {
    console.log.apply(console, arguments)
    return fn.apply(null, arguments)
  }
}
console.log(logArguments(add)(2, 3))
// 2 3
// 5

We can refactor the logArguments further to separate the log call from applying the given function, but the main point is this:

We kept the function add as is; add has a very clear purpose and works on its simple arguments.
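As a rough sketch of that further refactoring, the logging call can live in its own function and be combined with add by a small helper. The names before and logAll are hypothetical, introduced only for this illustration.

```javascript
function add(a, b) {
  return a + b
}
// hypothetical helper: calls sideEffect with the arguments, then calls fn
function before(sideEffect, fn) {
  return function (...args) {
    sideEffect(...args)
    return fn(...args)
  }
}
// the logging concern now lives in its own tiny function
function logAll(...args) {
  console.log(...args)
}
const loggedAdd = before(logAll, add)
console.log(loggedAdd(2, 3))
// 2 3
// 5
```

The function add is still untouched, and before can attach any other side effect, not just logging.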

Let us take a look at another simple function, for example double

function double(x) {
  return x + x
}
console.log(double(10))
// 20

Again, double only handles the simplest case possible - a single input argument. If we wanted to extend double with additional features, like logging its arguments, we would create a second function via composition.

console.log(logArguments(double)(10))
// 10
// 20

Working with arrays

What if we wanted to double many numbers? Given a list of numbers, could we double every element in the list by reusing the original double that operates on a single item at a time?

function map(fn) {
  return function (list) {
    const result = []
    for (var k = 0; k < list.length; k += 1) {
      result.push(fn(list[k]))
    }
    return result
  }
}
const numbers = [1, 2, 3]
console.log(map(double)(numbers))
// [2, 4, 6]

map

The function double stays the same, but it was adapted to work on a list of numbers using another function we wrote. What if we wanted to log each number before doubling it? Just compose double, logArguments and map functions!

console.log(map(logArguments(double))(numbers))
// 1
// 2
// 3
// [2, 4, 6]

What is so interesting about the map function we have just written? Its purpose is not really to adapt the double function per se. It does not extend its functionality the way logArguments does. No, the map function adapts the function double to work with a different type of inputs - it "teaches" double how to work with multiple numbers stored in an Array.

In functional jargon we say that map "lifts" function double to work on a non primitive type.

We can write "type signature" next to each function for clarity. Even if the original function double works with any argument due to JavaScript's dynamic nature, let us pretend that we are only passing numbers to it.

                       // name   :: argument -> return type
double                 // double :: number   -> number
logArguments(double)   //        :: number   -> number
map(double)            //        :: [number] -> [number]

The function map(double) expects a list of numbers, expressed in our type signature as [number] and returns a list of numbers, also written as [number].

Function map adapts any function that works on a simple single argument to also work on an entire list of values. It is so useful and common that it became a built-in method in the ES5 JavaScript standard, and it belongs to the Array type. Thus our program should really be written simply as

console.log(numbers.map(double))
// [2, 4, 6]

Great, what does this have to do with monads? Well, we are almost there. The Array instance is a container. It just stores many values and has a method map that "unwraps" the values and gives them one by one to a simple function like double that only knows how to deal with just a simple single value.

The Array is so convenient and is used so often, that sometimes we run into troubles. For example, what if the function we are mapping does not return a primitive value, but (drum roll please) an already wrapped value?

Consider for example a function that returns all distinct characters in a given string.

// nice ES6 trick to get distinct values here
function distinct(str) {
  return [... new Set(str)]
}
console.log(distinct('foo'))
// ['f', 'o']

What happens when we want to find distinct characters across a list of words? Well, we are going to map over an Array, right?

const words = ['foo', 'bar', 'baz']
console.log(words.map(distinct))
// [ [ 'f', 'o' ], [ 'b', 'a', 'r' ], [ 'b', 'a', 'z' ] ]

Hmm, this does not work very well, because the function distinct is a little more complicated than double. Their signatures show this

double   // double   :: number -> number
distinct // distinct :: string -> [string]

When we map over distinct we put back into the Array more little Arrays instead of primitive types (that should be string in this case). The assumption that map makes is that the function it adapts takes the primitive value and returns a primitive unwrapped value.

We can try using something else in this case. For example, we could call join on the array, get a single string and then map it, but that does not work since Array.prototype.join does not put the result back into a wrapped value that has map:

console.log(words.join('').map(distinct))
// TypeError: words.join(...).map is not a function

We could of course "wrap" the value returned by the join ourselves, then we could call map if we wanted.

console.log([words.join('')].map(distinct))
// [ [ 'f', 'o', 'b', 'a', 'r', 'z' ] ]

flatMap

Almost correct, we really want to handle the case when the mapped function returns an already wrapped value! Let us write a function that is just like map function, but knows how to NOT doubly wrap the result. Because in this case its purpose is to avoid nested arrays, we need to flatten the result, and thus we will call it flatMap - it maps the value and then flattens the returned wrapped result to avoid wrapping it twice.

// Note: ES2019 added a native Array.prototype.flatMap;
// this version is written out by hand to show how it works
Array.prototype.flatMap = function flatMap(fn) {
  var result = []
  for (var k = 0; k < this.length; k += 1) {
    // concat adds the items of the returned Array, not the Array itself
    result = result.concat(fn(this[k]))
  }
  // return a new Array and leave the original untouched, just like map does
  return result
}

The above function calls the simple fn that MUST return an Array. The Array is the wrapped value in this case, and we prevent putting an Array into an Array by concatenating the items of each returned Array onto a single flat result.

Here is how we use it

console.log([words.join('')].flatMap(distinct))
// [ 'f', 'o', 'b', 'a', 'r', 'z' ]
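The name says what the operation does: map, then flatten one level. Here is a minimal sketch of that reading, written as standalone curried functions in the style of the earlier map function. The helper flatten and the standalone flatMap are introduced here only for illustration.

```javascript
function distinct(str) {
  return [...new Set(str)]
}
// removes exactly one level of nesting: [[1], [2, 3]] -> [1, 2, 3]
function flatten(listOfLists) {
  return listOfLists.reduce(function (flat, list) {
    return flat.concat(list)
  }, [])
}
// maps with fn (which must return an Array), then flattens the result
function flatMap(fn) {
  return function (list) {
    return flatten(list.map(fn))
  }
}
const words = ['foo', 'bar', 'baz']
console.log(flatMap(distinct)(words))
// [ 'f', 'o', 'b', 'a', 'r', 'b', 'a', 'z' ]
```

Because the sketch builds a new Array, the original words list is left unchanged.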

We got ourselves an Array monad! It is nothing but a container data type that can use simple functions with signatures x -> x by mapping over them. The result of a simple function is placed back into the Array - the container lives on! And most importantly, if we add flatMap to the Array, we get ourselves a monad - a container that knows how to handle simple functions that return a container instead of a primitive value. A monad does not nest values when using flatMap - it knows that it should discard the second wrapper and just keep one, otherwise it will soon get into a Russian nested doll situation where the underlying primitive value gets lost inside all the returned wrapping.

It is almost like re-gifting a present. You do not just place the wrapped unwanted present into a new gift bag and give it to someone else. You first unwrap and discard the wrapping paper with the personal greeting card, wrap the present in your paper and add a new greeting card.

Note: the method flatMap is also called chain sometimes. I prefer flatMap of course because it parallels the Array operation.

Combining map with flatMap

We can chain both map and flatMap method calls, depending on the return type of the lifted (adapted) function. For example we can flat map strings in an Array to letters and then map each letter to upper case.

function toUpper(s) {
  return s.toUpperCase()
}
console.log(
  words.flatMap(distinct).map(toUpper)
)
// const words = ['foo', 'bar', 'baz']
// [ 'F', 'O', 'B', 'A', 'R', 'B', 'A', 'Z' ]

The map vs flatMap choice should naturally depend on the function used. For example, we could use flatMap to square numbers, but it would be a lot more efficient and intuitive to use a simple map:

// BAD
[1, 2, 3].flatMap(x => [x * x]) // [1, 4, 9]
// GOOD
[1, 2, 3].map(x => x * x) // [1, 4, 9]

Working with async code

The above examples are all synchronous. The result is available right away, that is why we could print the returned value. The Arrays are very suitable for working with synchronous lifted functions; but they break down when the items are generated asynchronously, like reading values from a database or network resource.

Promises are abstractions that allow us to work efficiently with values that will be generated asynchronously. They are part of the ES6 standard and are widely available in browsers.

const promise = http.get('/some/value')
// http.get call returns a "promise" that will later contain the actual value
promise.then(function (value) {
  console.log('we got', value)
})

Promises are monads! They wrap a value that will be there in the future, and allow modifying the value using a "map" operation, called then. In fact, Promises are even more user friendly than the above Array monad because both map and flatMap are handled by a single method, then. The Promise.prototype.then is "smart" enough to determine how it should treat the value returned from the callback function.

If the callback function returns a non-Promise instance, it will be treated like map, placing the value back into the Promise "wrapper".

Promise.resolve(42).then(double).then(console.log)
// 84

If the passed function returns a Promise object, then knows NOT to double wrap it, and acts like flatMap instead:

Promise.resolve(42).then(() => Promise.resolve(100)).then(console.log)
// 100

The anonymous function in the middle has returned a Promise object (with value 100 inside), but what got printed next was just the primitive value 100, not [Promise{100}] because then acted like flatMap in this case.
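This flattening is what makes it practical to chain synchronous and asynchronous steps with the same method. Here is a small sketch, where delay and slowDouble are hypothetical helpers standing in for any function that returns a Promise:

```javascript
function double(x) {
  return x + x
}
// hypothetical helper: a Promise that resolves with value after ms milliseconds
function delay(ms, value) {
  return new Promise(function (resolve) {
    setTimeout(resolve, ms, value)
  })
}
// returns an already wrapped value, so "then" acts like flatMap for it
function slowDouble(x) {
  return delay(10, double(x))
}
const result = Promise.resolve(21)
  .then(double)      // acts like map: 21 -> 42
  .then(slowDouble)  // acts like flatMap: 42 -> Promise of 84
result.then(console.log)
// 84
```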

Monads of different type

The method flatMap (and then for Promises) is nice, but it only flattens the returned value of the same type. For example, if we resolve a Promise with an Array, it will keep the Array, since Promise only understands how to flatten other Promises, not other monad container types.

Promise.resolve(42).then(() => Promise.resolve(numbers)).then(console.log)
// [1, 2, 3]

Similarly, we can map or flatMap an Array of Promises, but we still will get the Promise objects inside.

console.log(numbers.flatMap(x => Promise.resolve(x)))
// [ Promise { 1 }, Promise { 2 }, Promise { 3 } ]

If we really want to get the primitive value from a monad container inside another monad container, we have to map with custom code. For example, the Promise API has a method that converts an Array of Promises into a single Promise of an Array of resolved primitive values.

Promise.all(numbers.flatMap(x => Promise.resolve(x)))
  .then(values => console.log('resolved values', values))
// resolved values [ 1, 2, 3 ]

The call Promise.all in the first line takes as an argument an Array monad with each individual item being a Promise monad. It returns a single Promise monad that contains an Array monad. This switcheroo gives us the actual values.
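To make the switcheroo concrete, here is a hand-written sketch of turning an Array of Promises into a Promise of an Array. The name sequence is hypothetical, and this is an illustration of the idea only, not a description of how Promise.all is implemented.

```javascript
// hypothetical helper: Array of Promises -> Promise of Array
function sequence(promises) {
  return promises.reduce(function (collected, promise) {
    return collected.then(function (values) {
      // returning a Promise here makes "then" act like flatMap
      return Promise.resolve(promise).then(function (value) {
        return values.concat([value])
      })
    })
  }, Promise.resolve([]))
}
const promises = [1, 2, 3].map(function (x) {
  return Promise.resolve(x)
})
sequence(promises).then(function (values) {
  console.log('resolved values', values)
})
// resolved values [ 1, 2, 3 ]
```

In real code Promise.all remains the right tool; the sketch only shows that the conversion is ordinary "custom code" built from then.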

Maybe monad

Note: this Maybe monad is slightly different from the typical implementation. It is only supposed to be an example how to safely perform numerical division.

We saw a monad that helped deal with multiple values (an Array monad), and a monad that helped deal with asynchronous values (a Promise monad). There is an infinite number of useful Monad types. They just need to wrap a value and provide map and flatMap methods. The map is especially useful because it allows us to quickly reuse the simplest function that does not care about any of the Monad mumbo jumbo, and only deals with a primitive value.
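As a minimal sketch of that recipe, here is about the smallest container that provides both methods. The name Box is made up for this illustration and it adds no behavior of its own; it only shows the shape shared by the monads in this post.

```javascript
function Box(value) {
  this.value = value
}
Box.of = function of(value) {
  return new Box(value)
}
// map: call the simple function, then wrap its plain result again
Box.prototype.map = function (fn) {
  return Box.of(fn(this.value))
}
// flatMap: fn already returns a Box, so do not wrap a second time
Box.prototype.flatMap = function (fn) {
  return fn(this.value)
}
function double(x) {
  return x + x
}
function boxedDouble(x) {
  return Box.of(x + x)
}
console.log(Box.of(21).map(double))
// Box { value: 42 }
console.log(Box.of(21).flatMap(boxedDouble))
// Box { value: 42 }
```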

The Maybe monad is useful for dealing with non-existent values without a pyramid of if - else blocks. Take a look at the typical divide function.

function divide(a, b) {
  return a / b
}
console.log(divide(10, 2)) // 5
console.log(divide(10, 0)) // Infinity

Without guarding for the zero value of the second argument, we can get quite a large number! We could add the guard logic into the function itself, but this goes against our principle - add new features to the existing function by composing functions, not by putting more code inside of them.

function divide(a, b) {
  if (b === 0) {
    return 'nope'
  }
  return a / b
}
console.log(divide(10, 2)) // 5
console.log(divide(10, 0)) // nope

Notice we had to hard code the action to take if the second argument was zero, in this case returning 'nope' string (or maybe throwing an exception).

We can handle the above situation differently. Let us wrap the data in a new data type for storing just two numbers

function Maybe(a, b) {
  this.a = a
  this.b = b
}
var maybe = new Maybe(10, 2)
console.log(maybe)
// Maybe { a: 10, b: 2 }

Using keyword new is a pain so we can just add a utility method of to make creating a Maybe object of two numbers easier

function Maybe(a, b) {
  this.a = a
  this.b = b
}
Maybe.of = function of(a, b) {
  return new Maybe(a, b)
}
var maybe = Maybe.of(10, 2)
console.log(maybe)
// Maybe { a: 10, b: 2 }

We have an existing function divide(a, b) that we want to reuse safely.

function divide(a, b) {
  return a / b
}

The function divide operates on primitive values and returns a primitive, thus it is a good candidate for the map method. We will add a map method to the Maybe object and place the guard logic there!

Maybe.prototype.map = function (fn) {
  if (!this.b) {
    // do not divide, but return invalid object
    return Maybe.of(null, null)
  }
  // put the result into first position
  return Maybe.of(fn(this.a, this.b))
}

We can then safely use the division: the result, if there is one, will be in the a property. If a is null then the division was invalid because the second number was zero.

console.log(Maybe.of(10, 2).map(divide))
// Maybe { a: 5, b: undefined }
console.log(Maybe.of(10, 0).map(divide))
// Maybe { a: null, b: null }

What about flatMap? We need to handle a case when the function returns an instance of Maybe.

Maybe.prototype.flatMap = function (fn) {
  if (!this.b) {
    // do NOT call a function, since this division is invalid
    return Maybe.of(null, null)
  }
  return fn(this.a, this.b)
}

We can use it on a somewhat artificial example

// safe division
function maybeDivide(x, y) {
  if (y) {
    return Maybe.of(x, y).map(divide)
  }
  return Maybe.of(null, null)
}
console.log(Maybe.of(10, 2).flatMap(maybeDivide))
// Maybe { a: 5, b: undefined }

Notice that flatMap cannot be interchanged with map: if the given function returns an already wrapped value, we need to use flatMap to avoid double wrapping. If we forget this rule and use map, the result will be a Maybe nested inside another one.

console.log(Maybe.of(10, 2).map(maybeDivide))
// Maybe { a: Maybe { a: 5, b: undefined }, b: undefined }

Getting the result out of the above Maybe monad is kind of awkward. Thus most Maybe implementations provide convenience methods. These methods also help avoid hard coding the logic in the original simple function. Let us add a method that gets the computed value or returns the default value if the division was invalid

Maybe.prototype.orElse = function (defaultValue) {
  if (this.a === null) {
    return defaultValue
  }
  return this.a
}
console.log(Maybe.of(10, 2).map(divide).orElse('noop'))
// 5
console.log(Maybe.of(10, 0).map(divide).orElse('noop'))
// 'noop'

Libraries like Ramda Fantasy and Folktale provide good implementations of Maybe and other monads with lots of convenience methods.
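For comparison, a more conventional Maybe wraps a single value that may be missing, rather than a pair of numbers. The following is only a rough sketch under that assumption. It does not reproduce the API of Ramda Fantasy or Folktale, and the names SimpleMaybe and safeDivide are hypothetical.

```javascript
function SimpleMaybe(value) {
  this.value = value
}
SimpleMaybe.of = function of(value) {
  return new SimpleMaybe(value)
}
SimpleMaybe.prototype.isNothing = function () {
  return this.value === null || this.value === undefined
}
// skip the function entirely when there is no value
SimpleMaybe.prototype.map = function (fn) {
  return this.isNothing() ? this : SimpleMaybe.of(fn(this.value))
}
SimpleMaybe.prototype.flatMap = function (fn) {
  return this.isNothing() ? this : fn(this.value)
}
SimpleMaybe.prototype.orElse = function (defaultValue) {
  return this.isNothing() ? defaultValue : this.value
}
// hypothetical curried helper: divides x by b, or gives an empty SimpleMaybe
function safeDivide(b) {
  return function (x) {
    return b === 0 ? SimpleMaybe.of(null) : SimpleMaybe.of(x / b)
  }
}
function double(x) {
  return x + x
}
console.log(SimpleMaybe.of(10).flatMap(safeDivide(2)).map(double).orElse('noop'))
// 10
console.log(SimpleMaybe.of(10).flatMap(safeDivide(0)).map(double).orElse('noop'))
// noop
```

Once a step produces an empty SimpleMaybe, every later map and flatMap is skipped, so the default is chosen only once, at the end of the chain.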

Docker is a functor

If monads wrap a value and have both map and flatMap methods, then what do we call types that only have the map method? They are called functors; a standard ES5 Array is an example. Functors only "safely" apply the given callback function, giving it the unwrapped value and placing the primitive result back into the container without any thinking.

Sometimes a very unusual structure turns out to be a functor. Docker images are built from simple text files. Each Dockerfile specifies a base image and then each command (like install a software module) creates a new derived image. Here is a Dockerfile that installs gulp in a Node 6 base image

FROM mhart/alpine-node:6
RUN npm install -g gulp

We can take this file and run docker build command.

$ docker build .
Step 1 : FROM mhart/alpine-node:6
6: Pulling from mhart/alpine-node
Status: Downloaded newer image for mhart/alpine-node:6
 ---> ecd37ad77c2b
Step 2 : RUN npm install -g gulp
 ---> Running in c2568b777d7d
/usr/bin/gulp -> /usr/lib/node_modules/gulp/bin/gulp.js
 ---> f3158cbb038c
Removing intermediate container c2568b777d7d
Successfully built f3158cbb038c

The above build step downloaded an image "mhart/alpine-node:6", which contains the NodeJS binary, ran the command npm install -g gulp inside that "environment" and created a new image with id "f3158cbb038c".

Conceptually, this looks a lot like our monads! The Docker image wraps a value (in the example a NodeJS binary, then a NodeJS binary and "gulp" installed). Each statement RUN <command> is like a JavaScript function. The command docker build plays the map method role - it allows a simple "dumb" command npm install -g gulp to actually run on the value inside the image and then places the value back into the Docker image.

(I am avoiding using the word "container" because Docker actually has a specific meaning for this term - a Docker container is an image being executed).

So the above program is almost like the following JavaScript

var image = DockerImage('mhart/alpine-node:6')
  .map('npm install -g gulp')
  .save('f3158cbb038c')

We can repeat the process multiple times. Another Dockerfile can use the produced image and run ("map") a simple command that will operate inside the image.

FROM f3158cbb038c
RUN gulp -v

$ docker build -f Dockerfile2 .
Sending build context to Docker daemon 7.168 kB
Step 1 : FROM f3158cbb038c
 ---> f3158cbb038c
Step 2 : RUN gulp -v
 ---> Running in c46144b84495
[02:24:44] CLI version 3.9.1
 ---> 35e23a678ef0
Removing intermediate container c46144b84495
Successfully built 35e23a678ef0

which is equivalent (combined with the previous step) to:

var image = DockerImage('mhart/alpine-node:6')
  .map('npm install -g gulp')
  .map('gulp -v') // CLI version 3.9.1
  .save('35e23a678ef0')
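The fake notation can be turned into a small runnable toy. The sketch below is purely a model: it never talks to Docker, the DockerImage function and its toDockerfile method are invented for this illustration, and all it does is record commands and print the Dockerfile they would correspond to.

```javascript
// toy model only: an "image" is a base name plus a list of RUN commands
function DockerImage(base, commands) {
  const steps = commands || []
  return {
    // "map" returns a new image with one more command applied
    map: function (command) {
      return DockerImage(base, steps.concat([command]))
    },
    // hypothetical helper: render the recorded steps as Dockerfile text
    toDockerfile: function () {
      const lines = ['FROM ' + base]
      steps.forEach(function (command) {
        lines.push('RUN ' + command)
      })
      return lines.join('\n')
    }
  }
}
const image = DockerImage('mhart/alpine-node:6')
  .map('npm install -g gulp')
  .map('gulp -v')
console.log(image.toDockerfile())
// FROM mhart/alpine-node:6
// RUN npm install -g gulp
// RUN gulp -v
```

Each map call leaves the previous image object untouched and returns a new one, which mirrors how every RUN step produces a new derived image.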

Why Docker is not a monad

In order for Docker to be a "monad" it would have to support flatMap. A callback function for flatMap could return a new Docker image and the original container would have to know how to switch to it. That is, the returned value would be a Docker image that would become the new base image.

Since there is no Docker functionality like this (there is Docker in Docker, but it does not support replacing the original image with the image returned by the build command), we can only express what it would be like in fake JavaScript notation

var image = DockerImage('mhart/alpine-node:6')
  .map('npm install -g gulp')
  .map('gulp -v') // CLI version 3.9.1
  .flatMap('FROM mhart/alpine-node:6') // wipes out gulp install again
  .map('gulp -v') // gulp: command not found

Thus the Docker system is not a full monad, but only a functor: it only knows how to map shell commands.

To read more about functors, I have a couple of other blog posts:

I also have Docker basics covered in a single gist.

Conclusion

Monads are a way to reuse simple functions. Monads wrap a primitive value that the simple function expects. The wrapping logic could be used to iterate over items in a list, handle asynchronous values or guard against values that would break the simple functions.

A monad has a map method that calls the simple function with the value, and wraps the result. A monad also has a flatMap method that calls a function that returns a wrapped value. The flatMap prevents double wrapping.

Docker is a functor - it only supports "map" method, but not the "flatMap" method.

Further info

Functional Programming in JavaScript by Luis Atencio has a great chapter on the proper Maybe monad.

Better world by better software

Gleb Bahmutov PhD
