Try RethinkDB

Initial local experiments with a modern NoSQL database.

I was fed up with the callback-based MongoDB API, and the Firebase API just disappoints. I had heard a lot about RethinkDB and finally decided to give it a try. Here is my experience. While the official RethinkDB documentation uses the terms "table" and "document", I prefer "table" and "row", even if each row is an individual JSON object. In fact, the database API uses the .row() method anyway to refer to each document.

Install

  • Download an official package from the install page.
  • Run the installer
  • Start the local DB from the command line by running rethinkdb

The DB is up and running, and even has a cool web interface running by default at localhost:8080.

Note: by default the database saves its data in a rethinkdb_data folder inside the current working folder. If you stop the database with Ctrl-C, make sure to restart it from the same folder to continue with the same data.

Create a test table using Data Explorer

The default installation has a test database without any tables. The RethinkDB home page has a few commands that initialize a sample table with data to play around with. All examples are in JavaScript and can be pasted directly into the Data Explorer in the local web interface.

Run each line separately

r.db('test').tableCreate('tv_shows');
// press the "Run" button or Shift+Enter to run the query
r.table('tv_shows').insert([{ name: 'Star Trek TNG', episodes: 178 },
{ name: 'Battlestar Galactica', episodes: 75 }]);
// press the "Run" button or Shift+Enter to run the query
r.table('tv_shows').count();
// 2
// see all rows / documents in the table
r.table('tv_shows');
/*
{
"episodes": 75 ,
"id": "80cdcf8c-8e26-4769-8f78-fad5d580dc27" ,
"name": "Battlestar Galactica"
} {
"episodes": 178 ,
"id": "acac3257-508f-4bfb-a96f-72d8ffd2f754" ,
"name": "Star Trek TNG"
}
*/
r.table('tv_shows').filter(r.row('episodes').gt(100));
/*
{
"episodes": 178 ,
"id": "766fb292-b7bb-44ec-9a92-c9b8716f2183" ,
"name": "Star Trek TNG"
}
*/
// delete all items from the table
r.table('tv_shows').delete();

Simple!
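
To make the filter query concrete, here is a rough plain-JavaScript equivalent that works on an in-memory array instead of the database. It only illustrates which rows r.row('episodes').gt(100) selects; the shows array and the longRunning name are made up for this sketch, and the real query is evaluated by the RethinkDB server.

```javascript
// Stand-in for the rows in the tv_shows table (no database involved).
var shows = [
  { name: 'Star Trek TNG', episodes: 178 },
  { name: 'Battlestar Galactica', episodes: 75 }
];

// Same predicate as r.row('episodes').gt(100):
// keep only the rows with more than 100 episodes.
var longRunning = shows.filter(function (row) {
  return row.episodes > 100;
});

console.log(shows.length); // number of rows, like .count()
console.log(longRunning);  // rows that pass the predicate
```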

Create a test table from Node

Let us perform the same operations from Node.js without using any additional libraries, just the standard RethinkDB client.

$ npm install rethinkdb --save
rethinkdb@… node_modules/rethinkdb
└── bluebird@…

I notice that the only dependency is the excellent promise library bluebird. So far this is very promising.

I deleted the tv_shows table using the Data Explorer web interface, then wrote the following code:

use-rethink-db-directly.js
require('console.json');
var r = require('rethinkdb');
r.connect({ host: 'localhost', port: 28015 }, function(err, connection) {
  if (err) throw err;
  console.log('got localhost connection');
  r.db('test').tableCreate('tv_shows').run(connection, function(err, result) {
    if (err) throw err;
    console.json(result);
    process.exit(0);
  });
});

I am using my own console.json to simplify pretty-printing the operation results.

output
$ node use-rethink-db-directly.js
got localhost connection
{
  "config_changes": [
    {
      "new_val": {
        "db": "test",
        "durability": "hard",
        "id": "8a1d2cb4-02ff-457d-9067-0a0a7ca72293",
        "name": "tv_shows",
        "primary_key": "id",
        ...
      },
      "old_val": null
    }
  ],
  "tables_created": 1
}

Cool, but I prefer promises to Node-style error callbacks. Luckily, the RethinkDB API is consistent, so we can convert its methods to promise-returning ones quickly, for example using bluebird. Here is the code to connect to the database and then create the table.

require('console.json');
var Promise = require('bluebird');
var r = require('rethinkdb');
Promise.promisify(r.connect, r)({ host: 'localhost', port: 28015 })
  .then(function (connection) {
    console.log('got localhost connection');
    var createTable = r.db('test').tableCreate('tv_shows');
    return Promise.promisify(createTable.run, createTable)(connection)
      .then(console.json);
  })
  .catch(console.error)
  .finally(process.exit);

Since the table already exists, this shows an error. The error is caught because we added a .catch callback at the end of the promise chain. Thus the error handling is a lot more robust than when we checked the err argument in each callback. Plus we are guaranteed that no matter where the error happens or if there is a success, the process will exit because of the .finally callback.

$ node use-rethink-db-promises.js
got localhost connection
{ [RqlRuntimeError: Table test.tv_shows already exists in:
r.db(test).tableCreate(tv_shows)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^]

We can delete the table using the Data Explorer web interface and try again to get the correct response.
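
The way a single .catch covers every step can be tried without a database. The sketch below is an illustration under assumptions, not the RethinkDB client: fakeConnect and fakeTableCreate are hypothetical stand-ins for r.connect and query.run, and Node's built-in util.promisify plays the role of bluebird's Promise.promisify.

```javascript
var util = require('util');

// Hypothetical stand-ins for r.connect and query.run: Node-style functions
// that report their outcome through an (err, value) callback.
function fakeConnect(options, callback) {
  setImmediate(function () { callback(null, { host: options.host }); });
}
function fakeTableCreate(connection, callback) {
  setImmediate(function () {
    callback(new Error('Table test.tv_shows already exists'));
  });
}

var steps = [];

// util.promisify (built into Node) stands in for bluebird's Promise.promisify.
var done = util.promisify(fakeConnect)({ host: 'localhost' })
  .then(function (connection) {
    steps.push('connected');
    return util.promisify(fakeTableCreate)(connection);
  })
  .then(function () {
    steps.push('created'); // skipped, because the previous step rejected
  })
  .catch(function (err) {
    steps.push('caught: ' + err.message); // one handler for every step above
  })
  .finally(function () {
    steps.push('cleanup'); // runs after success and after failure
    console.log(steps);
  });
```

The 'created' step never runs, while the cleanup step always does, which is the behavior the .catch and .finally pair gives the real script.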

Inserting and querying data from Node

Once we have the table, let us insert a few records using the promise-returning API. Once the data is in, I will fetch the number of TV shows and print it to the console.

require('console.json');
var Promise = require('bluebird');
var r = require('rethinkdb');
Promise.promisify(r.connect, r)({ host: 'localhost', port: 28015 })
  .then(function (connection) {
    console.log('got localhost connection');
    return connection;
  })
  .then(function (connection) {
    var query = r.table('tv_shows').insert([{ name: 'Star Trek TNG', episodes: 178 },
      { name: 'Battlestar Galactica', episodes: 75 }]);
    return Promise.promisify(query.run, query)(connection)
      .then(function () {
        return connection;
      });
  })
  .then(function (connection) {
    var query = r.table('tv_shows').count();
    return Promise.promisify(query.run, query)(connection)
      .then(function (n) {
        console.log('there are', n, 'tv shows');
        return connection;
      });
  })
// output
// got localhost connection
// there are 2 tv shows

Notice that once we have the connection, we keep passing it along to avoid polluting the global scope. We also form each query first and then promisify its run method using code like

var query = r.table('tv_shows').count();
return Promise.promisify(query.run, query)(connection)
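
The "run the query, then return the connection" step repeats in every link of the chain, so it could be folded into a small helper. This is only a sketch: runWith is a name invented here, fakeQuery is a hypothetical stand-in for a RethinkDB query object, and Node's util.promisify is used in place of bluebird so the example runs without a server.

```javascript
var util = require('util');

// Hypothetical stand-in for a query object: run() takes a connection and a
// Node-style callback, and depends on `this` the way a real method would.
function fakeQuery(result) {
  return {
    result: result,
    run: function (connection, callback) {
      var self = this;
      setImmediate(function () { callback(null, self.result); });
    }
  };
}

// Returns a .then() handler that runs the query, hands the result to onResult
// (if given), and resolves with the connection so the next step can reuse it.
function runWith(query, onResult) {
  return function (connection) {
    // bind(query) keeps `this` correct, which is the job of the second
    // argument in Promise.promisify(query.run, query).
    var run = util.promisify(query.run.bind(query));
    return run(connection).then(function (result) {
      if (onResult) onResult(result);
      return connection;
    });
  };
}

var connection = { name: 'fake connection' };
var seen = [];
var chain = Promise.resolve(connection)
  .then(runWith(fakeQuery({ inserted: 2 })))
  .then(runWith(fakeQuery(2), function (n) {
    seen.push(n);
    console.log('there are', n, 'tv shows');
  }));
```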

Using RethinkDB via ORM

Writing each database operation using low-level queries quickly leads to a lot of boilerplate code. We mostly need tables that store objects and allow the same standard operations:

  • Create new table if it does not exist
  • Add new item to the table
  • Query items
  • Update or delete an item

Other programmers have written object-relational mapping (ORM for short) libraries to abstract these operations. For RethinkDB I have tried three such libraries: reheat, rethinkdbdash and thinky. I only used very basic features and spent a short amount of time with each library, but I picked thinky because it is simple, has good examples and docs at thinky.io, and worked out of the box. It is built on top of rethinkdbdash and adds better connection management plus lots of model relations stuff.
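
To show the shape of the interface such a library hides, here is a toy in-memory version of the four operations listed above. ToyDb and all of its method names are hypothetical and exist only for this sketch; it is not how thinky or RethinkDB are implemented.

```javascript
// A toy in-memory "database" (hypothetical) with the four standard operations.
function ToyDb() {
  this.tables = {};
  this.nextId = 1;
}

// Create the table only if it does not exist yet, then return its rows.
ToyDb.prototype.table = function (name) {
  if (!this.tables[name]) this.tables[name] = [];
  return this.tables[name];
};

// Add a new item, assigning it an id.
ToyDb.prototype.add = function (name, item) {
  var row = Object.assign({ id: String(this.nextId++) }, item);
  this.table(name).push(row);
  return row;
};

// Query items with a predicate function.
ToyDb.prototype.query = function (name, predicate) {
  return this.table(name).filter(predicate);
};

// Update an item by id; returns the updated row, or undefined if not found.
ToyDb.prototype.update = function (name, id, changes) {
  var row = this.table(name).find(function (r) { return r.id === id; });
  return row && Object.assign(row, changes);
};

// Delete an item by id; returns true if something was removed.
ToyDb.prototype.remove = function (name, id) {
  var rows = this.table(name);
  var index = rows.findIndex(function (r) { return r.id === id; });
  if (index === -1) return false;
  rows.splice(index, 1);
  return true;
};

var db = new ToyDb();
var tng = db.add('tv_shows', { name: 'Star Trek TNG', episodes: 178 });
var bsg = db.add('tv_shows', { name: 'Battlestar Galactica', episodes: 75 });
db.update('tv_shows', tng.id, { finished: true });
db.remove('tv_shows', bsg.id);
console.log(db.query('tv_shows', function (row) { return row.episodes > 100; }));
```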

thinky example
var thinky = require('thinky')(/* options */);
// Create a model - the table is automatically created
var Show = thinky.createModel('tv_shows', {
  id: String,
  name: String,
  episodes: Number
});
console.log('Show is', Show);
Show.getJoin()
  .run()
  .then(function (all) {
    console.log(all);
    process.exit();
  });

The default options are enough to connect to the local server and use the test database. Output shows the automatically created table without any TV shows (I have deleted the table before running this example).

Show is r.table("tv_shows")
[]

Let us insert a couple of new documents using the Show model, then query and print all of them:

var Promise = require('bluebird');
var starTrek = new Show({ name: 'Star Trek TNG', episodes: 178 });
var bg = new Show({ name: 'Battlestar Galactica', episodes: 75 });
Promise.all([starTrek.saveAll(), bg.saveAll()]) // 1
  .then(function () {
    return Show.getJoin()
      .run()
      .then(function (all) {
        console.log(all);
      });
  })
  .catch(console.error)
  .finally(process.exit);

[ { episodes: 75,
    id: 'c9f893b6-0c97-44b3-956b-bcce6cbb8344',
    name: 'Battlestar Galactica' },
  { episodes: 178,
    id: 'dde2be0b-3e3e-4277-adc8-0b64508820e8',
    name: 'Star Trek TNG' } ]

How cool is this! I am using the standard promise API to wait for all objects to be saved (line // 1), then query the table model and print the updated information. This is exactly what I needed for my simple data needs.
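
The waiting behavior of Promise.all can be seen in isolation with a stubbed save. In this sketch fakeSave is a hypothetical stand-in for a document's save method (it is not thinky's saveAll), and the native Promise is used instead of bluebird.

```javascript
// Hypothetical stand-in for saving a document: resolves after delayMs
// with a copy of the document plus a generated id.
function fakeSave(doc, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve(Object.assign({ id: 'id-' + doc.name }, doc));
    }, delayMs);
  });
}

var saved = Promise.all([
  fakeSave({ name: 'Star Trek TNG', episodes: 178 }, 20),
  fakeSave({ name: 'Battlestar Galactica', episodes: 75 }, 5)
]).then(function (docs) {
  // Runs only after both saves have finished. The results keep the order of
  // the input array, even though the second save completed first.
  console.log(docs.map(function (d) { return d.name; }));
  return docs;
});
```

If any of the saves rejected, Promise.all would reject too, and a .catch at the end of the chain would receive that error.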

If I need more, Thinky provides more API functions, and also exposes the original r object as thinky.r. Using the r object I can perform any advanced operations that Thinky does not wrap.

If you need more details, there is a good (but old) blog post, "Blog example with Thinky", showing how to model a dependency among objects.

Thinky also has lots of other goodies I found:

  • Adding instance (individual document) methods
  • Adding model (static or table) methods
  • Model.save method for inserting an array of items at once.

To delete all documents from a table for a model, just use ModelName.delete().run() which returns a promise.

Extras