Flexible Cypress Data Setup And Validation

Setting up, caching, and re-creating the test data using the cypress-data-session plugin.

This post introduces a powerful way of creating and re-using data in your Cypress tests. By re-using expensive-to-create objects like users and projects, you will make your tests much faster, and potentially easier to read and maintain.

Introduction

Imagine your Cypress test needs to create a user before logging in. Is creating a user an instant step? No, it probably takes time, especially if you go through the app's user interface without using API calls or App Actions. So you want to create a user and keep it around, to avoid re-creating it for each test. Sometimes you do want to check if the user object or some other piece of data is still valid; maybe another test has cleared the database, removing all the users. So you need a mechanism for validating the user before the test proceeds.

These actions (creating a piece of test data, storing it for other tests to use, validating it, and re-creating it if the validation has failed) are very common. Thus I have written a plugin called cypress-data-session to avoid re-implementing the same boilerplate in my code. This blog post gives an introduction to the plugin's use in real-world scenarios.

Creating the user

Let's create a user for our application; a user is required to log in and create a chat room. A typical test would do something like this:

cypress/integration/register-user.js
/// <reference types="cypress" />

it('registers user', () => {
  cy.visit('/')

  const username = 'Test'
  const password = 'MySecreT'

  cy.get('#create-account').should('be.visible').click()
  cy.get('.register-form')
    .should('be.visible')
    .within(() => {
      cy.get('[placeholder=username]')
        .type(username)
        .should('have.value', username)
      cy.get('[placeholder=password]').type(password)

      cy.contains('button', 'create').click()
    })

  cy.get('.login-form')
    .should('be.visible')
    .within(() => {
      cy.get('[placeholder=username]')
        .type(username)
        .should('have.value', username)
      cy.get('[placeholder=password]').type(password)

      cy.contains('button', 'login').click()
    })

  // if the user has been created and could log in
  // we should be redirected to the home page with the rooms
  cy.location('pathname').should('equal', '/rooms')
})

🎁 You can find the source code examples used in this blog post in the repo bahmutov/chat.io.

The first time this test runs, everything goes well. The user is created and can log in.

The user is created successfully

But when we rerun the test, it fails, since the user with the same username already exists.

Cannot register the same user twice

Sure, the failure is expected. We have four choices:

  • delete all users before each test so we can create the user Test with the password MySecreT.
  • delete just the user Test if it exists.
  • create a user with unique random name just for this test.
  • cache the created user and reuse it.

The last option is the hardest to implement, but can offer substantial speed savings, since creating a user (or some complicated piece of test data) can be slow.
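
For comparison, the third option (a unique name per test) could look something like the sketch below. The helper uniqueUsername is hypothetical; it is not part of the plugin or of the example application, and the name format is just an assumption for illustration.

```javascript
// Hypothetical helper, not part of cypress-data-session or the chat app.
// Combines a timestamp (unique across runs) with a counter (unique within a run).
let usernameCounter = 0

function uniqueUsername(prefix = 'Test') {
  const timestamp = Date.now().toString(36)
  usernameCounter += 1
  return `${prefix}-${timestamp}-${usernameCounter}`
}

// inside a test, this would replace the hard-coded name:
// const username = uniqueUsername()
```

The downside is that every run leaves another user behind in the database, and the user still has to be created each time, so this approach does not make the test any faster.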

Separate creation from logging in

Before we proceed, I want to point out that the test above mixes creating the user object and using it. We probably want to keep the act of creating the user really clear, thus I will rewrite the test a little bit.

cypress/integration/register-user.js
function registerUser(username, password) {
  cy.visit('/')

  cy.get('#create-account').should('be.visible').click()
  cy.get('.register-form')
    .should('be.visible')
    .within(() => {
      cy.get('[placeholder=username]')
        .type(username)
        .should('have.value', username)
      cy.get('[placeholder=password]').type(password)

      cy.contains('button', 'create').click()
    })
  // if everything goes well
  cy.contains('.success', 'Your account has been created')
    .should('be.visible')
}

function loginUser(username, password) {
  cy.visit('/')

  cy.get('.login-form')
    .should('be.visible')
    .within(() => {
      cy.get('[placeholder=username]')
        .type(username)
        .should('have.value', username)
      cy.get('[placeholder=password]').type(password)

      cy.contains('button', 'login').click()
    })
}

it('registers user', () => {
  const username = 'Test'
  const password = 'MySecreT'

  registerUser(username, password)
  loginUser(username, password)

  // if the user has been created and could log in
  // we should be redirected to the home page with the rooms
  cy.location('pathname').should('equal', '/rooms')
})

The above test fails if we re-run it, so let's take care of that. Let's delete the user before each test. I have registered a task in the plugin file to connect to the database and clear all users. Just for kicks I also added a task to find a user by username.

cypress/plugins/index.js
const database = require('../../app/database')

async function clearUsers() {
  console.log('clear users')
  await database.models.user.deleteMany({})
  return null
}

async function findUser(username) {
  console.log('find user', username)
  if (typeof username !== 'string') {
    throw new Error('username must be a string')
  }
  return database.models.user.findOne({ username })
}

module.exports = (on, config) => {
  on('task', {
    clearUsers,
    findUser,
  })
}

At the start of the test, we can delete all users in the database (which is fine, we are running one test at a time).

it('registers user', () => {
  const username = 'Test'
  const password = 'MySecreT'

  cy.task('clearUsers')
  registerUser(username, password)
  loginUser(username, password)

  // if the user has been created and could log in
  // we should be redirected to the home page with the rooms
  cy.location('pathname').should('equal', '/rooms')
})

Clear all existing users before creating the test user

Nice, the test can be re-run multiple times. But we can do better - let us avoid deleting all users. We can quickly check if the already created user is still valid.

it('registers user', () => {
  const username = 'Test'
  const password = 'MySecreT'

  cy.task('findUser', username).then((user) => {
    if (!user) {
      registerUser(username, password)
    }
  })
  loginUser(username, password)

  // if the user has been created and could log in
  // we should be redirected to the home page with the rooms
  cy.location('pathname').should('equal', '/rooms')
})

The above test is much faster, since it reuses the previously created user and avoids recreating it unnecessarily.

Finds the previously created user

Data session

Now let us rewrite the above test using cypress-data-session plugin. We will import the plugin from the support file, which gives us the cy.dataSession command. First, let's recreate the original "naive" create and log in behavior:

it('registers user using data session', () => {
  const username = 'Test'
  const password = 'MySecreT'

  cy.dataSession({
    name: 'user',
    setup() {
      registerUser(username, password)
    },
  })
  loginUser(username, password)

  // if the user has been created and could log in
  // we should be redirected to the home page with the rooms
  cy.location('pathname').should('equal', '/rooms')
})

The test passes on the first attempt if there are no users in the database.

The user was created using the setup method

Great, we created the data item (the user) using the setup method, and gave the data session an alias "user". We will use that alias later; it gives access to the object created by the setup method.

Notice, if we re-run the test, it fails when it tries to run the setup method again.

The data session tries to recompute the item

The data session does not know that the user object is still valid, and should not be recomputed. Let's tell the data session how to check. We will add the validate method that can run Cypress commands and resolves with a boolean value. If we yield true, the data session will skip recomputing the user again.

cy.dataSession({
  name: 'user',
  setup() {
    registerUser(username, password)
  },
  validate() {
    return cy.task('findUser', username).then(Boolean)
  },
})
loginUser(username, password)

The test now immediately logs in - because the user is still valid.

The user is validated, no recomputing necessary

Register the user via API call

We can optimize how we create the user. Instead of filling the form fields and submitting the form, we could simply post it using the cy.request command.

function registerApi(username, password) {
  cy.request({
    method: 'POST',
    url: '/register',
    form: true,
    body: {
      username,
      password,
    },
  })
}

cy.dataSession({
  name: 'user',
  setup() {
    registerApi(username, password)
  },
  validate() {
    return cy.task('findUser', username).then(Boolean)
  },
})
loginUser(username, password)

The test passes and is faster.

Register the user with cy.request command

Register the user via task command

We can bypass the API and directly create the user in the database (of course, we can use the application database model layer to avoid creating an invalid entity) by calling the plugin code via cy.task command.

cypress/plugins/index.js
const database = require('../../app/database')
const { registerUser } = require('../../app/models/user')

async function findUser(username) {
  console.log('find user', username)
  if (typeof username !== 'string') {
    throw new Error('username must be a string')
  }
  return database.models.user.findOne({ username })
}

async function getUser(id) {
  console.log('get user with id %s', id)
  return database.models.user.findOne({ _id: id })
}

async function makeUser(credentials) {
  console.log('makeUser', credentials?.username)
  const errorMessageOrUser = await registerUser(credentials)
  if (typeof errorMessageOrUser === 'string') {
    throw new Error(errorMessageOrUser)
  }
  console.log(
    'made user %s id %s',
    credentials.username,
    errorMessageOrUser._id,
  )
  return errorMessageOrUser._id
}

module.exports = (on, config) => {
  on('task', {
    findUser,
    getUser,
    makeUser,
  })
}

📝 For more examples on how to connect to the MongoDB database from Cypress tests, read the blog post Testing Mongo with Cypress or How To Verify Phone Number During Tests Part 2.

We can now register a user really quickly.

cy.dataSession({
  name: 'user',
  setup() {
    cy.task('makeUser', { username, password })
  },
  validate() {
    return cy.task('findUser', username).then(Boolean)
  },
})

Register the user with cy.task command

Caching data

The cy.dataSession command above helped us organize the user creation a little bit, but its power is in caching a piece of created data. For example, why do we need the username to check if the user is still valid? A user object should be checked by its ID! The user ID is returned by the cy.task('makeUser'), so let's store it somewhere. That is precisely what cy.dataSession can do internally, so you do not have to do it! It can even store it across the specs, so it survives hard reloads and opening a different spec.

In fact, the user ID has already been stored - because that is what the Cypress command chain inside the setup method yields. That ID is stored in the session, and we can see what the session stores using a static method Cypress.getDataSession added to the global Cypress object by the plugin.

The user ID was stored inside the data session "user"

We can store any object there, let's keep the user ID, the username, and the password together.

cy.dataSession({
  name: 'user',
  setup() {
    cy.task('makeUser', { username, password }).then((id) => {
      return { id, username, password }
    })
  },
  validate() {
    return cy.task('findUser', username).then(Boolean)
  },
})

We can store an entire object inside the data session

What about validating the data? Does it need to rely on the external closure variable username? No - the data session code automatically passes the stored data to the validate method! We can rewrite the validate method to use the parameter instead of the external variable:

cy.dataSession({
  name: 'user',
  setup() {
    cy.task('makeUser', { username, password }).then((id) => {
      return { id, username, password }
    })
  },
  validate({ username }) {
    return cy.task('findUser', username).then(Boolean)
  },
})

Even better, we can grab the ID property and use the task getUser to validate the user more quickly:

cy.dataSession({
  name: 'user',
  setup() {
    cy.task('makeUser', { username, password }).then((id) => {
      return { id, username, password }
    })
  },
  validate({ id }) {
    return cy.task('getUser', id).then(Boolean)
  },
})

Notice how the Command Log shows the cy.task checking the user ID now

Validating the user by the stored ID

The stored user object is automatically available under a Cypress alias and under the test context property.

const username = 'Test'
const password = 'MySecreT'

cy.dataSession({
  name: 'user',
  setup() {
    cy.task('makeUser', { username, password }).then((id) => {
      return { id, username, password }
    })
  },
  validate({ id }) {
    return cy.task('getUser', id).then(Boolean)
  },
})

cy.get('@user')
  .should('have.property', 'username', username)
  // or access the alias using the test context property
  // after it has been set
  .then(function () {
    expect(this.user).to.have.keys('id', 'username', 'password')
  })

The name we gave the data session, "user", became a Cypress alias, reachable using the cy.get command or via the test context property.

Accessing the computed cached data via an alias "user"

Tip: the plugin adds static methods to the global Cypress object that allow you to inspect individual sessions, clear them, or disable the plugin completely, all from the browser's DevTools console. See the README.

Logging in using API call

In our code, we are using the cy.task to create the user if necessary, but we still log in using the page form. To log in faster, we can use cy.request command. This command can submit the /login form and set any returned cookies, like the connect.sid session cookie the application is using.

export const loginViaApi = (username, password) => {
  cy.log(`log in user **${username}**`)
  cy.request({
    method: 'POST',
    url: '/login',
    form: true,
    body: {
      username,
      password,
    },
  })
  return cy.wrap({ username, password })
}

it('register and log in using cy.request', () => {
  const username = 'Test'
  const password = 'MySecreT'

  cy.dataSession({
    name: 'user',
    setup() {
      cy.task('makeUser', { username, password }).then((id) => {
        return { id, username, password }
      })
    },
    validate({ id }) {
      return cy.task('getUser', id).then(Boolean)
    },
  })

  cy.get('@user')
    .should('have.property', 'username', username)
    // or access the alias using the test context property
    // after it has been set
    .then(function () {
      expect(this.user).to.have.keys('id', 'username', 'password')
    })

  loginViaApi(username, password)

  // if the user is logged in correctly,
  // the session cookie is set, and when we visit the page
  // we are automatically redirected to the list of chat rooms
  cy.visit('/')
  cy.location('pathname').should('equal', '/rooms')
})

Looks good, and it is faster too, since we avoid visiting the page, typing into the form, and submitting it.

Logging in using cy.request command

Caching the session cookie

When we log in using the form or via a cy.request call, the browser receives the session cookie from the backend. This cookie is associated with the user we have created. If this cookie is removed, the server redirects the page back to the login page.

The session cookie used to authenticate the user

If we log in once, and save this cookie in a variable, we could log in instantly the second time by setting it before cy.visit. This sounds a lot like ... cy.dataSession command. Let's "build" it by first just using the setup method and storing the cookie inside the data session. Change the loginViaApi function to yield the cookie value, and call this method from the setup method - this will store the cookie in the data session cache.

export const loginViaApi = (username, password) => {
  cy.log(`log in user **${username}**`)
  cy.request({
    method: 'POST',
    url: '/login',
    form: true,
    body: {
      username,
      password,
    },
  })
  // after cy.request, the cookie should exist in the browser
  return cy.getCookie('connect.sid')
}

it('register and log in using data sessions', () => {
  const username = 'Test'
  const password = 'MySecreT'

  cy.dataSession({
    name: 'user',
    setup() {
      cy.task('makeUser', { username, password }).then((id) => {
        return { id, username, password }
      })
    },
    validate({ id }) {
      return cy.task('getUser', id).then(Boolean)
    },
  })

  cy.dataSession({
    name: 'logged in user',
    setup() {
      // yields the connect.sid cookie
      return loginViaApi(username, password)
    },
  })

  // if the user is logged in correctly,
  // the session cookie is set, and when we visit the page
  // we are automatically redirected to the list of chat rooms
  cy.visit('/')
  cy.location('pathname').should('equal', '/rooms')
})

The second cy.dataSession always recreates the cookie as the captured movie below shows - because we do not have the validate method yet.

The cookie is recomputed for every test

Let's think about when the cookie is valid: when it has any value. Of course, we could validate the cookie fully by making an authorized request and checking if it fails. But for now, let's assume that IF we have a cookie, then it is ok to use it again. We can tell the data session that any previous value is ok to reuse by using the validate: true value.

cy.dataSession({
  name: 'logged in user',
  setup() {
    // yields the connect.sid cookie
    return loginViaApi(username, password)
  },
  // any non-null cookie value is valid
  validate: true,
})

Hmm, the test has failed - the user was not logged in when the test visited the page.

The user was redirected back to the login page

Cypress clears all cookies before each test. Thus the data session has the cookie from the previous run, BUT only in memory; it is not set in the browser. We told cy.dataSession that the cookie is valid, but we also need to tell it how to restore the cookie and set it in the browser before proceeding. We have the method recreate for this; it receives the value from the data session storage (just like the method validate does).

cy.dataSession({
  name: 'logged in user',
  setup() {
    // yields the connect.sid cookie
    return loginViaApi(username, password)
  },
  // any non-null cookie value is valid
  validate: true,
  // if we have the previous valid cookie
  // set it in the browser before any cy.visit
  recreate(cookie) {
    cy.setCookie('connect.sid', cookie.value)
  },
})

// if the user is logged in correctly,
// the session cookie is set, and when we visit the page
// we are automatically redirected to the list of chat rooms
cy.visit('/')
cy.location('pathname').should('equal', '/rooms')

Recreate the browser session by setting the cookie from the data session

Reusing the previous session cookie is very very fast, even compared to logging in using the cy.request command.
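
Earlier we assumed that any stored cookie is fine to reuse. If you want the stricter check (an authorized request that fails for a stale cookie), a rough sketch could look like the code below. The function names isAuthorized and validateCookie are my own, and I am assuming that the /rooms page answers with status 200 for a logged-in user and with a redirect otherwise; adjust this to how your server behaves.

```javascript
// Pure helper: decides whether the server accepted the session cookie.
// Assumption: a signed-in request to /rooms answers 200, while an anonymous
// request is redirected (3xx) to the login page.
function isAuthorized(response) {
  return response.status === 200
}

// Hypothetical stricter validate callback for the "logged in user" session.
// It only works inside a Cypress test, where the global cy object exists.
function validateCookie(cookie) {
  // Cypress cleared the cookies before the test, so set the cached one first
  cy.setCookie('connect.sid', cookie.value)
  return cy
    .request({ url: '/rooms', followRedirect: false, failOnStatusCode: false })
    .then(isAuthorized)
}
```

You would pass it as validate: validateCookie instead of validate: true. The extra request costs a little time on every test, which is the price for catching expired cookies.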

Data session methods

The cy.dataSession command lets you create the initial item using setup and stores it. On later runs it checks whether the previous item is still valid using validate, and if it is, sets it back into the browser with any Cypress commands inside the recreate method. One could summarize the logic using the following list:

  • if there is no previous item for the session named X
    • run the setup and store the item under name X
  • else
    • there is a previous item
    • check if it is still valid using validate
      • if still valid, call recreate if provided
      • otherwise call the setup again
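
The steps above can be modeled in plain JavaScript. This is only a sketch of the decision logic, not the plugin's source code: the real cy.dataSession works with Cypress command chains, while here every callback is an ordinary synchronous function and the cache is a simple Map.

```javascript
// A simplified model of the data session logic, NOT the plugin's implementation.
const cache = new Map()

function dataSession({ name, setup, validate, recreate }) {
  if (cache.has(name)) {
    const previous = cache.get(name)
    // validate can be a function or a plain boolean like "validate: true";
    // without validate the item is always considered stale
    const isValid =
      typeof validate === 'function' ? validate(previous) : Boolean(validate)
    if (isValid) {
      // the previous item is still good, restore any side effects
      if (recreate) recreate(previous)
      return previous
    }
  }
  // first time, or the previous item is no longer valid
  const item = setup()
  cache.set(name, item)
  return item
}

// usage: a Set stands in for the users table
const users = new Set()
let setupCalls = 0
const options = {
  name: 'user',
  setup() {
    setupCalls += 1
    users.add('Test')
    return { username: 'Test' }
  },
  validate: ({ username }) => users.has(username),
}

dataSession(options) // runs setup
dataSession(options) // validate passes, setup is skipped
users.clear() // another test wipes the database
dataSession(options) // validate fails, setup runs again
console.log(setupCalls) // 2
```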

There are a few other lifecycle methods in cy.dataSession to make dealing with the item more explicit, see the README for details.

Dependent sessions

Multiple sessions store their data separately. We can check what the session stores from the DevTools console.

Printing the saved data from each data session

The two data sessions have a dependency; if the user object is recreated, then the previously stored cookie becomes invalid. One cannot log in using a session for a user that does not exist! Thus we need to invalidate the data session "logged in user" when running the "user" session setup. We can do it explicitly from the "user" data session. When running the setup function, call Cypress.clearDataSession('logged in user') and it is deleted.

cy.dataSession({
  name: 'user',
  setup() {
    Cypress.clearDataSession('logged in user')
    cy.task('makeUser', { username, password }).then((id) => {
      return { id, username, password }
    })
  },
  validate({ id }) {
    return cy.task('getUser', id).then(Boolean)
  },
})

In the recording below all data sessions are set, but I clear the users from the database table and clear the "user" data session, forcing the cy.dataSession to recreate the user. The setup runs and clears the "logged in user" data session. That's why you see the message "first time for session logged in user" in the Cypress Command Log.

The second session is recomputed because the user session clears it

There is an alternative way to re-compute the data session which I prefer. Instead of the "user" data session clearing every session that might need to be recomputed, why don't we tell the "logged in user" that it depends on the "user" session? There is a parameter that specifies it:

cy.dataSession({
  name: 'user',
  setup() {
    cy.task('makeUser', { username, password }).then((id) => {
      return { id, username, password }
    })
  },
  validate({ id }) {
    return cy.task('getUser', id).then(Boolean)
  },
})

cy.dataSession({
  name: 'logged in user',
  dependsOn: ['user'],
  setup() {
    // yields the connect.sid cookie
    return loginViaApi(username, password)
  },
  // any non-null cookie value is valid
  validate: true,
  // if we have the previous valid cookie
  // set it in the browser before any cy.visit
  recreate(cookie) {
    cy.setCookie('connect.sid', cookie.value)
  },
})

Notice how the second data session declares all sessions it depends on using dependsOn: ['user'] parameter. Under the hood, each data session generates a new random UUID when it is computed using the setup call. Every session with dependencies keeps a list of UUIDs for the sessions it depends on. During the validate step, if any of the upstream data sessions have a different UUID from its list, then it must have been recomputed, and thus the current data session is no longer valid. Clean and simple!
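
To make the idea concrete, here is a tiny plain JavaScript model of that bookkeeping. It is my simplification, not the plugin's code: the helper names compute and parentsUnchanged are made up, and a counter stands in for the random UUID.

```javascript
// Simplified model of dependency tracking, NOT the plugin's implementation.
let counter = 0
const newId = () => `id-${++counter}` // stand-in for a random UUID
const sessions = new Map()

// computes a session and remembers the current ids of its parents
function compute(name, setup, dependsOn = []) {
  const parents = {}
  for (const parent of dependsOn) {
    parents[parent] = sessions.get(parent)?.id
  }
  sessions.set(name, { id: newId(), data: setup(), parents })
}

// true only if every parent still has the id we saw at compute time
function parentsUnchanged(name) {
  const { parents } = sessions.get(name)
  return Object.entries(parents).every(
    ([parent, id]) => sessions.get(parent)?.id === id,
  )
}

compute('user', () => ({ username: 'Test' }))
compute('logged in user', () => 'cookie-1', ['user'])
console.log(parentsUnchanged('logged in user')) // true

// the "user" session is recomputed and gets a new id
compute('user', () => ({ username: 'Test' }))
console.log(parentsUnchanged('logged in user')) // false
```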

The second session is recomputed when the session it depends on is recomputed

More info

I believe the cypress-data-session plugin provides a very flexible and powerful way for creating and re-using any data during Cypress tests. It can do all the things I have shown in this blog post and more. For example, it can share the data across specs! For more information, see the plugin's README, and the example application bahmutov/chat.io. You can also find lots of example videos, some of them linked here:

See the cypress-data-session README Videos section for the up-to-date list.

Read the blog post Faster User Object Creation.