Burning Tests with cypress-grep

How to run the same test again and again to confirm it is flake-free

Recently a project of mine, bahmutov/cypress-grep-example, showed two flaky tests.

Cypress Dashboard showing two flaky tests

Are the tests really showing a problem with the application? Or are the tests themselves unreliable? Would these tests show failures if we run them 100 times in a row?

This is where the cypress-grep plugin comes in very handy. Just install it and add it to the support file:

$ npm i -D cypress-grep
// add to cypress/support/index.js
// load and register the grep feature
// https://github.com/bahmutov/cypress-grep
require('cypress-grep')()

The project has multiple spec files. The first flaky test, "should cancel edits on escape", is located in the spec file editing-spec.js. Let's run this test by itself 10 times.

$ npx cypress run --spec cypress/integration/editing-spec.js \
--env grep="should cancel edits on escape",burn=10
cypress-grep: tests with "should cancel edits on escape" in their names
...

The spec runs and the test we are grepping by title text "should cancel edits on escape" is repeated 10 times. The other tests are all pending.
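
To make the selection logic concrete, here is a conceptual sketch of what "grep by title and burn" amounts to. It is not the actual cypress-grep implementation: `planRuns` is a hypothetical helper, the second test title is made up for illustration, and the sketch assumes a plain substring match on the title.

```javascript
// Conceptual sketch only, NOT the cypress-grep source code.
// planRuns is a hypothetical helper that decides, for each test title,
// how many times the test would run and whether it is left pending.
function planRuns(testTitles, grep, burn = 1) {
  return testTitles.map((title) => {
    // assumption: a test is selected if its title contains the grep string
    const selected = title.includes(grep)
    return { title, runs: selected ? burn : 0, pending: !selected }
  })
}

const plan = planRuns(
  // the second title is hypothetical, used only for illustration
  ['should cancel edits on escape', 'should save edits on blur'],
  'should cancel edits on escape',
  10,
)
console.log(plan)
```

The selected test is scheduled 10 times, while every test that does not match is skipped and reported as pending.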

Burning the selected test

So there is definitely something wrong with this test or the application. We can grab any screenshot - they all show the same failure: the first letter is missing from the title.

The missing letter

Often the application is not ready to receive the cy.type command while still loading. In our case, this seems unlikely - after all, the failure happens in the 3rd todo item, not at the very first item. Maybe something is wrong with typing the characters? Or editing them? Let's make sure the 3 todo items created before each test are typed correctly.

cypress/integration/editing-spec.js
beforeEach(function () {
  cy.createDefaultTodos().as('todos')
})

We are using a custom command cy.createDefaultTodos to enter the 3 todo items. We can add assertions checking the input field values right there.

cypress/support/commands.js
Cypress.Commands.add('createDefaultTodos', function () {
  ...
  // type the three default todo items into the input field
  cy.get('.new-todo', { log: false })
    .type(`${TODO_ITEM_ONE}{enter}`, { log: false })
    .type(`${TODO_ITEM_TWO}{enter}`, { log: false })
    .type(`${TODO_ITEM_THREE}{enter}`, { log: false })

  cy.get('.todo-list li', { log: false })
    .should('have.length', 3)
    .and(($listItems) => {
      // check the text in each list item
      expect($listItems[0], 'first item').to.have.text(TODO_ITEM_ONE)
      expect($listItems[1], 'second item').to.have.text(TODO_ITEM_TWO)
      expect($listItems[2], 'third item').to.have.text(TODO_ITEM_THREE)
    })
})

And let's burn the test again.

The first letter was missing on creation

The problem seems to be in typing the initial text, not in editing it afterwards. Cypress types pretty quickly, much faster than a normal human being. Maybe the application cannot keep up for some reason? Let's add a delay of 20ms after each character.

cypress/support/commands.js
const opts = { log: false, delay: 20 }
cy.get('.new-todo', opts)
  .type(`${TODO_ITEM_ONE}{enter}`, opts)
  .type(`${TODO_ITEM_TWO}{enter}`, opts)
  .type(`${TODO_ITEM_THREE}{enter}`, opts)

Time to burn it to find out if we have fixed it.

... --env grep="should cancel edits on escape",burn=100

The test now seems to be stable: all 100 runs pass.
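
Why burn 100 times rather than 10? A rough back-of-the-envelope sketch, assuming every run fails independently with the same probability (a simplification, real flakes are often correlated). The helper `chanceAllPass` and the 5% failure rate are made up for illustration.

```javascript
// Hypothetical helper: probability that every one of `runs` burn iterations
// passes, assuming each run fails independently with `failureRate`.
function chanceAllPass(failureRate, runs) {
  return Math.pow(1 - failureRate, runs)
}

// With an assumed 5% flake rate:
console.log(chanceAllPass(0.05, 10).toFixed(3)) // prints "0.599"
console.log(chanceAllPass(0.05, 100).toFixed(3)) // prints "0.006"
```

Under these assumptions, a test that flakes 5% of the time would still sail through 10 runs more often than not, but it would almost never survive 100 runs. A clean burn of 100 is much stronger evidence than a clean burn of 10.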

Burning tests with delay

Now we can decide if we want to move on, or keep digging into the application's code to find why the first letter is sometimes lost. There is one more thing to consider: we have slowed down every test that creates the default todo items by 400-500ms. Is this a good trade-off to make 1 or 2 tests stable?
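
The size of the slowdown can be estimated from the number of keystrokes. The sketch below is a rough estimate only: it assumes the typing time is simply keystrokes multiplied by the per-key delay, it relies on the default `cy.type` delay being 10ms, and the three todo texts are hypothetical stand-ins since the real `TODO_ITEM_*` values are not shown here.

```javascript
// Default delay between keystrokes used by cy.type
const DEFAULT_DELAY_MS = 10

// Each character is one keystroke, plus one {enter} per item.
function countKeystrokes(items) {
  return items.reduce((sum, text) => sum + text.length + 1, 0)
}

// Hypothetical helper: extra time spent typing compared to the default delay.
function extraTypingMs(items, delayMs, defaultDelayMs = DEFAULT_DELAY_MS) {
  return countKeystrokes(items) * (delayMs - defaultDelayMs)
}

// Hypothetical todo texts, NOT the real TODO_ITEM_* constants
const items = ['write code', 'run the tests', 'fix the flaky test']
console.log(countKeystrokes(items)) // prints 44
console.log(extraTypingMs(items, 20)) // prints 440
```

The actual number depends on the real todo texts, but a few dozen keystrokes at an extra 10ms each lands in the range of several hundred milliseconds per test.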

Or are test retries a better answer? In this instance I would prefer to get to the bottom of the problem and not use test retries.
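
One reason to be wary of retries is that they can turn a flaky test green without fixing anything. A small sketch, assuming each attempt fails independently with the same probability (a simplification); `chanceReportedFailure` is a hypothetical helper and the 5% rate is made up.

```javascript
// Hypothetical helper: probability that a test is reported as failed when it
// is allowed `retries` extra attempts, assuming independent attempts.
// The test is reported failed only if the first attempt AND every retry fail.
function chanceReportedFailure(failureRate, retries) {
  return Math.pow(failureRate, retries + 1)
}

console.log(chanceReportedFailure(0.05, 0).toFixed(6)) // prints "0.050000"
console.log(chanceReportedFailure(0.05, 2).toFixed(6)) // prints "0.000125"
```

With two retries the failure almost disappears from the reports, yet the first letter is still lost just as often as before.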

Bonus: video

See how I am burning a test in this short video below

Bonus 2: burn tests on CircleCI

Read the blog post Burn Cypress Tests on CircleCI

Bonus 3: burning new or changed specs

If you watch my presentation about slicing and dicing E2E tests, you will see that at Mercari US we run changed and new Cypress specs first before running the rest of the tests. We now also burn the changed and new specs, running them 5 times in a row. This prevents flaky tests from sneaking in.
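
As a rough illustration of how such a step could be wired up, the sketch below builds a Cypress command line for a list of changed specs. It is not the actual Mercari US setup: `burnCommand` is a hypothetical helper, the second spec filename is made up, and it assumes that passing `burn` without `grep` repeats every test in the given specs.

```javascript
// Hypothetical helper: builds the CLI command as a string, it does not run it.
// Assumption: burn without grep repeats every test in the listed specs.
function burnCommand(specs, burn) {
  if (specs.length === 0) {
    return null // nothing changed, nothing to burn
  }
  // --spec accepts a comma-separated list of spec files
  return `npx cypress run --spec ${specs.join(',')} --env burn=${burn}`
}

// The list of changed specs would come from the version control system;
// the second filename here is hypothetical.
const changed = [
  'cypress/integration/editing-spec.js',
  'cypress/integration/routing-spec.js',
]
console.log(burnCommand(changed, 5))
```

Returning `null` for an empty list lets the CI job skip the burn step entirely when no spec files were added or modified.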