Trying Redis

Playing with a remote Redis service for quick caching.

I have never had to manage or really use Redis in my professional life. Other people have set one up for me for session storage or other needs. But recently I needed a quick cache for checking external urls, so I decided to finally really use Redis.

The goal of the Redis NoSQL DB is simple: given a key and a value (almost any serializable value will work), write the value into the database. You can even set an expiration duration on the key - after a certain time the value will be automatically deleted from the Redis database.

// Redis set command
set <key> <value>
get <key>
// you can expire values after a certain time
set <key> <value> EX <seconds>
// set command options (https://redis.io/commands/set)
// EX seconds -- Set the specified expire time, in seconds.
// PX milliseconds -- Set the specified expire time, in milliseconds.
// NX -- Only set the key if it does not already exist.
// XX -- Only set the key if it already exists.
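In plain JavaScript terms, these semantics can be sketched over a Map. This is just an illustration of the behavior, not a real client - the function names and option shape are mine:

```javascript
// Minimal in-memory sketch of Redis SET/GET semantics with EX and NX.
// A Map stands in for the database; this is an illustration, not a client.
const store = new Map()
const timers = new Map()

function set (key, value, opts = {}) {
  // NX: only set the key if it does not already exist
  if (opts.nx && store.has(key)) return null
  // replace any previous expiration timer for this key
  if (timers.has(key)) clearTimeout(timers.get(key))
  store.set(key, value)
  // EX: delete the value after the given number of seconds
  if (opts.ex) {
    const t = setTimeout(() => store.delete(key), opts.ex * 1000)
    if (t.unref) t.unref() // do not keep the process alive for the timer
    timers.set(key, t)
  }
  return 'OK'
}

function get (key) {
  // like Redis, a missing key reads as null
  return store.has(key) ? store.get(key) : null
}

set('foo', 'bar', { ex: 5 })
console.log(get('foo')) // bar
set('foo', 'baz', { nx: true }) // ignored - the key already exists
console.log(get('foo')) // still bar
```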

Using the popular Node Redis client ioredis, this looks like:

const url = process.env.REDIS_URL
const Redis = require('ioredis')
const redis = new Redis(url)
// expire this value after 5 seconds
redis.set('foo', 'bar', 'ex', 5)
// getting value back is async
const value = await redis.get('foo')
console.log(value)

Great, so let's set up a shared Redis instance - I don't want this to work just on my machine!

Setting up remote Redis

I have set up a free Redis instance at https://redislabs.com. A total of 30MB should be plenty for my needs. To connect I will need the url with the password included.

Redis setup information

I will place the full connection url into the ~/.as-a/.as-a.ini file like this

.as-a.ini
[redis-labs]
REDIS_URL=redis://:<password>@<url>.redislabs.com:13654

Using the CLI tool as-a, I can quickly run my script with the REDIS_URL environment variable (or any other collection of variables)

$ as-a redis-labs node .

Here is my first script that stores and retrieves values from a remote Redis server

const url = process.env.REDIS_URL
const Redis = require('ioredis')
const redis = new Redis(url)

console.log(new Date())

// expire this value after 5 seconds
redis.set('foo', 'bar', 'ex', 5)
const value = await redis.get('foo')
console.log(value)

// trying to get a value that does not exist
const bar = await redis.get('bar')
console.log('bar is', bar) // bar is null

process.exit(0)

We can see that the value under key foo is only stored for 5 seconds. Comment out the line redis.set('foo', 'bar', 'ex', 5) and run the program again quickly - the string "bar" will still be returned. But run the program again after 5 seconds and null will appear. Here is a "normal" run, then a run with the line commented out after 4 seconds, then another run after 4 more seconds.

$ as-a redis-labs node .
2018-04-27T19:48:40.644Z
bar
bar is null

$ as-a redis-labs node .
2018-04-27T19:48:44.053Z
bar
bar is null

$ as-a redis-labs node .
2018-04-27T19:48:47.958Z
null
bar is null

The value has expired.

Redis vs memory

To simplify testing, instead of always going through the real Redis instance, I have switched to keyv, which lets me use either an in-memory store or Redis.

npm install --save keyv @keyv/redis

The Keyv API for set and get is almost the same as the "classic" Redis client's and is enough for my needs.

const Keyv = require('keyv')
// if process.env.REDIS_URL is a `redis:...` url, Keyv will use
// the Redis client. Otherwise it uses an in-memory cache
const keyv = new Keyv(process.env.REDIS_URL)

const seconds = (n) => 1000 * n

// keyv.set returns a promise!
// expire this value after 5 seconds
await keyv.set('foo', 'bar', seconds(5))
console.log(new Date(), await keyv.get('foo'))

setInterval(async () => {
  console.log(new Date(), await keyv.get('foo'))
}, 1000)

Run this and see the value expire after 5 seconds.

$ node .
2018-04-27T20:18:05.746Z 'bar'
2018-04-27T20:18:06.751Z 'bar'
2018-04-27T20:18:07.754Z 'bar'
2018-04-27T20:18:08.760Z 'bar'
2018-04-27T20:18:09.762Z 'bar'
2018-04-27T20:18:10.767Z undefined
2018-04-27T20:18:11.768Z undefined
^C

Or against a Redis instance

$ as-a redis-labs node .
2018-04-27T20:19:04.430Z 'bar'
2018-04-27T20:19:05.470Z 'bar'
2018-04-27T20:19:06.472Z 'bar'
2018-04-27T20:19:07.476Z 'bar'
2018-04-27T20:19:08.479Z 'bar'
2018-04-27T20:19:09.482Z undefined
2018-04-27T20:19:10.484Z undefined
^C

Beautiful, but note that keyv returns undefined and not null. This might be significant for some use cases, but not for mine.
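If the distinction does matter for you, a thin wrapper can normalize the result back to the Redis-style null. Here is a sketch - the Map-backed `store` object below is a stand-in I wrote for any keyv-like store with a Promise-returning get, not keyv itself:

```javascript
// Normalize a keyv-style `undefined` cache miss back to a Redis-style `null`.
// `store` is a hypothetical stand-in with a Promise-returning get, like keyv's.
const store = {
  data: new Map(),
  get: async (key) => store.data.get(key) // resolves to undefined on a miss
}

const getOrNull = async (key) => {
  const value = await store.get(key)
  return value === undefined ? null : value
}

store.data.set('foo', 'bar')
// getOrNull('foo') resolves to 'bar'
// getOrNull('nope') resolves to null, never undefined
```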

Do not prevent Node from exiting

By default, an open Redis connection will prevent the Node process from exiting, just like listening on a port prevents the process from terminating. The Redis client exposes a client.unref() method for this. I have forked @keyv/redis and modified its code to expose the actual client in the Keyv constructor. Now the following process just exits.

const Keyv = require('keyv')
const keyv = new Keyv(process.env.REDIS_URL)
keyv.opts.store.client.unref()

While pull request 16 stays open (or if it gets declined), you can use my fork directly from GitHub.

npm i -S bahmutov/keyv-redis#5850d5999ca897ba832c751c0574d77c7b566034

Running the above test program confirms a normal process exit

$ as-a redis-labs node .
2018-04-27T21:23:45.445Z 'bar'

Top level async / await

You may have noticed that I am using the await keyword at the top level of my program. To make this work, I recommend top-level-await. Just load this module from index.js and move the "actual" source code into app.js

index.js
require('top-level-await')
require('./app')
app.js
const Keyv = require('keyv')
const keyv = new Keyv(process.env.REDIS_URL)
const seconds = (n) => 1000 * n
// expire this value after 5 seconds
await keyv.set('foo', 'bar', seconds(5))
console.log(new Date(), await keyv.get('foo'))
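If pulling in an extra module feels heavy, the classic alternative is to wrap the program body in an async IIFE. A sketch with the same shape as app.js - note the `keyv` object here is a Promise-based stub I wrote over a Map so the example runs on its own, not the real keyv module:

```javascript
// Alternative to top-level-await: wrap the body in an async IIFE.
// A Promise-based stub stands in for keyv so this sketch is self-contained.
const cache = new Map()
const keyv = {
  set: async (key, value) => { cache.set(key, value) },
  get: async (key) => cache.get(key)
}

;(async () => {
  await keyv.set('foo', 'bar')
  console.log(new Date(), await keyv.get('foo'))
})().catch((err) => {
  // without this, a rejected promise would be silently swallowed
  console.error(err)
  process.exit(1)
})
```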

Using Redis for caching

So now it is time to actually use Redis for a task. The Cypress documentation is an open source project that lives at github.com/cypress-io/cypress-documentation. The documentation uses the Hexo static site generator to transform Markdown into a static site. We have extended Hexo with a few additional helpers. One of them transforms urls into anchor links. Here are a couple of examples, including links through the Cypress redirection service on.cypress.io.

// link to an external page
{% url 'https://github.com/stanleyhlng/mocha-multi-reporters' %}
// link to https://on.cypress.io/visit
{% url 'visit' visit %}
// link to https://on.cypress.io/configuration#Screenshots
{% url 'screenshotsFolder' configuration#Screenshots %}

When generating the static documentation site, we want to validate the links to make sure they are still valid. The url helper does the check:

  • if the url has no hash part, then we can check whether a HEAD <url> request responds with a 200 status
  • if the url does have a hash part, like configuration#Screenshots, then we need to fetch the full page and check whether there is an element with the ID screenshots
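The branching above can be sketched as a small helper that splits the href first. The function names here (parseHref, checkKind) are mine for illustration, not the actual cypress-documentation code:

```javascript
// Split an href into the page url and the optional hash fragment,
// to decide between a cheap HEAD check and a full-page element check.
// Names are illustrative, not the actual cypress-documentation code.
function parseHref (href) {
  const hashIndex = href.indexOf('#')
  if (hashIndex === -1) {
    return { url: href, hash: null }
  }
  return {
    url: href.slice(0, hashIndex),
    hash: href.slice(hashIndex + 1)
  }
}

function checkKind (href) {
  // no hash: HEAD request, expect a 200 status
  // hash: fetch the page, look for an element with that ID
  return parseHref(href).hash === null ? 'head' : 'element-id'
}

console.log(parseHref('configuration#Screenshots'))
// { url: 'configuration', hash: 'Screenshots' }
```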

Because urls repeat, caching the checks speeds up the site build a lot - there are almost 1800 urls in the docs as of April 2018! The caching right now uses a plain JavaScript object.

// cache validations
const cache = {}

function validateAndGetUrl (sidebar, href, source, text, render) {
  // do we already have a cache for this href?
  const cachedValue = cache[href]

  // if we got it, return it!
  if (cachedValue) {
    return Promise.resolve(cachedValue)
  }

  if (isExternalHref(href)) {
    // cache this now even though
    // we haven't validated it yet
    // because it will just fail later
    cache[href] = href

    if (args.validate === false) {
      return Promise.resolve(href)
    }

    return validateExternalUrl(href, source)
      .return(href)
  }
  // other code
}

Great, the code is already using Promises to do its work, so moving it to Keyv is very straightforward. Even better, without REDIS_URL it automatically falls back to the in-memory cache, which acts the same way as using a plain object.

const Keyv = require('keyv')
const keyv = new Keyv(process.env.REDIS_URL)
if (process.env.REDIS_URL) {
  debug('using external Redis server to store HREF checks')
  // allow the process to exit when done
  // otherwise the Redis connection will block it forever
  keyv.opts.store.client.unref()
} else {
  debug('storing external HREF checks in memory')
}

function validateAndGetUrl (sidebar, href, source, text, render) {
  // do we already have a cache for this href?
  return keyv.get(href).then((cachedValue) => {
    // if we got it, return it!
    if (cachedValue) {
      debug('key found %s -> %s', href, cachedValue)
      return cachedValue
    }
    // rest of the code
  })
}

You can see the pull request with my changes. When it gets merged, I can set the REDIS_URL environment variable on the CI that does the site build and deployment, and make the cache duration something longer, like one day. This will ensure that external links are rechecked, yet multiple deploys per day stay fast.
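That one-day expiration is just the third argument to keyv.set. Here is a sketch of the whole caching flow with a TTL - the `keyvLike` store is a Map-backed stub I wrote in place of keyv, and `validateExternalUrl` is a hypothetical stand-in for the real network check:

```javascript
// Sketch of caching validated urls with a one-day TTL.
// `keyvLike` is a Map-backed stub standing in for keyv;
// `validateExternalUrl` is a hypothetical stand-in for the real HEAD check.
const DAY = 24 * 60 * 60 * 1000

const cache = new Map()
const keyvLike = {
  set: async (key, value, ttl) => {
    cache.set(key, { value, expires: ttl ? Date.now() + ttl : Infinity })
  },
  get: async (key) => {
    const entry = cache.get(key)
    if (!entry) return undefined
    if (Date.now() > entry.expires) {
      cache.delete(key) // expired, behave like a miss
      return undefined
    }
    return entry.value
  }
}

async function validateExternalUrl (href) {
  // the real code would make a HEAD request here
  return true
}

async function validateAndGetUrl (href) {
  const cached = await keyvLike.get(href)
  if (cached) return cached // cache hit - skip the network check
  await validateExternalUrl(href)
  await keyvLike.set(href, href, DAY) // recheck after one day
  return href
}
```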