Cypress Flakiness Examples

A few examples of solving test flake plus how to use GitHub Copilot to write Cypress tests.

Recently I watched Filip Hric's YouTube livestream Debugging test flakes in Cypress. Good video. He showed several examples of flaky tests caused by bad test design. In this blog post, I will give my take on a solid test design that avoids all the pitfalls shown.

🎁 The source code for this blog post is in the public repo bahmutov/cypress-flakiness-debug-examples. I have grabbed the initial code from filiphric/cypress-flakiness-debug-examples branch customer-subscriptions. The examples in this blog post are using the application and the tests in the subfolder customer-subscriptions.

The application

The application displays a list of subscriptions. Some subscriptions are active and some are not.

Customer subscriptions app

The list is generated dynamically using the @faker-js/faker module. The number of items can be from 1 to 6:

customer-subscriptions/src/modules/generateCustomers.ts
import { faker } from '@faker-js/faker'
import { Subscription } from '@/interfaces/subscription'

export const generateCustomers = () => {
  let customers: Subscription[] = []
  let customerCount = Math.floor(Math.random() * 6) + 1

  for (let index = 0; index < customerCount; index++) {
    let firstName = faker.person.firstName()
    let lastName = faker.person.lastName()
    let status = ['active', 'inactive', 'trial'] as const

    customers.push({
      id: faker.string.uuid(),
      fullName: `${firstName} ${lastName}`,
      email: faker.internet.exampleEmail({ firstName, lastName }),
      status: status[Math.floor(Math.random() * 3)],
    })
  }

  return customers
}

There is also a loading "splash" screen before the items load.

customer-subscriptions/src/components/SubscriptionList.tsx
useEffect(() => {
  // Fetch subscriptions from API
  fetch('/api/subscriptions')
    .then((response) => {
      if (!response.ok) {
        throw new Error('Network response was not ok')
      }
      return response.json()
    })
    .then((data) => {
      setSubscriptions(data)
      setLoading(false)
    })
    .catch((error) => {
      setError(error.message)
      setLoading(false)
    })
}, [])

if (loading) {
  return <div>Loading...</div>
}

if (error) {
  return <div>Error: {error}</div>
}

Let's see how Cypress end-to-end tests for this app can be flaky or solid.

The loading

If you have just started the application, the web server might take longer to bundle and return the homepage, especially in dev mode. To better simulate the unpredictable initial load, I will add a random delay between 1 and 11 seconds to the fetch call.

useEffect(() => {
  setTimeout(
    () => {
      // Fetch subscriptions from API
      fetch('/api/subscriptions')
      ...
    },
    Math.random() * 10_000 + 1000,
  )
}, [])

Here is the first test as shown by Filip. It is flaky on purpose. Can you see at least two potential problems?

customer-subscriptions/cypress/e2e/spec.cy.ts
it('Activates a subscription', () => {
  let randomItem = Math.floor(Math.random() * 7)

  cy.visit('/')
  cy.get('[data-cy=customer-item]').eq(randomItem).click()
  cy.contains('Activate Subscription').click()
  cy.contains('Subscription was activated').should('be.visible')
})

The loading time

The first problem we see when running the test is the command cy.get('[data-cy=customer-item]') failing.

The test fails to find any items

The test runner simply did not "see" the subscriptions list in time. To confirm this, click on the failed GET command and look at the restored DOM snapshot: the app was still showing the "Loading" message.

The app snapshot at the moment the GET command failed

But sometimes the test succeeds. If you inspect the same command GET using the time-traveling debugger, instead of "Loading" you see the items.

The app snapshot at the moment the GET command succeeded

This is test flake: depending on the application's speed, the same command fails or succeeds. We need to take the maximum loading time into account. The error message "Timed out retrying after 4000ms: Expected to find element: [data-cy=customer-item], but never found it." points to the solution: we must increase the timeout for the GET command because the application might not show the items until 11 seconds have passed. Let's fix the test:

customer-subscriptions/cypress/e2e/spec.cy.ts
cy.visit('/')
cy.get('[data-cy=customer-item]', { timeout: 11_000 })
  .eq(randomItem)
  .click()

The command now retries for 11 seconds instead of the default 4. You can see the slower progress bar.

The GET command can retry finding items up to 11 seconds
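
To get a feel for why the default timeout was not enough, here is a back-of-the-envelope sketch in plain TypeScript. It assumes the only wait is the artificial delay, spread uniformly between 1 and 11 seconds, and it ignores the time needed to load the page and fetch the data, so real pass rates would be a little lower. The function name chanceItemsAppearInTime is made up for this illustration.

```typescript
// Rough estimate: how likely is it that the items show up before the
// command timeout expires?
// Assumption: the delay is uniform in [minDelayMs, maxDelayMs) and nothing
// else (page load, network) adds to the wait.
function chanceItemsAppearInTime(
  timeoutMs: number,
  minDelayMs = 1_000,
  maxDelayMs = 11_000,
): number {
  if (timeoutMs <= minDelayMs) return 0
  if (timeoutMs >= maxDelayMs) return 1
  return (timeoutMs - minDelayMs) / (maxDelayMs - minDelayMs)
}

// default Cypress command timeout of 4 seconds: (4000 - 1000) / 10000
const withDefaultTimeout = chanceItemsAppearInTime(4_000)
// the increased timeout covers the whole simulated delay range
const withLongerTimeout = chanceItemsAppearInTime(11_000)
```

Under these assumptions the 4-second default lets the GET command succeed only about 30% of the time, which fits the "sometimes passes, sometimes fails" behavior. Since the fetch itself also takes time, a little extra margin on top of 11 seconds would not hurt.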

The number of items

Here is the second problem with the test. The application can show between 1 and 6 items, so the valid zero-based indices run from 0 up to at most 5. The test picks a random index between 0 and 6. If the test picks an index equal to or larger than the number of items, the test will fail, since there is no such item. We must pick one of the existing items. Filip changed his test to do so:

customer-subscriptions/cypress/e2e/spec.cy.ts
it('Activates a subscription', () => {
  cy.visit('/')
  cy.get('[data-cy=customer-item]', { timeout: 11_000 })
    .its('length')
    .then((numberOfItems) => {
      const randomItem = Cypress._.random(0, numberOfItems - 1)
      cy.get('[data-cy=customer-item]').eq(randomItem).click()
    })
  cy.contains('Activate Subscription').click()
  cy.contains('Subscription was activated').should('be.visible')
})
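
How bad was the original index pick? Here is a small plain TypeScript sketch that enumerates every combination of item count (1 to 6) and index from Math.floor(Math.random() * 7) (0 to 6). It assumes both values are uniform and independent, as in the code shown earlier; the helper name chanceIndexOutOfRange is made up for this illustration.

```typescript
// Enumerate all (item count, picked index) pairs and count how often
// the picked index points past the end of the list.
function chanceIndexOutOfRange(maxItems = 6, indexChoices = 7): number {
  let misses = 0
  let total = 0
  for (let count = 1; count <= maxItems; count++) {
    for (let index = 0; index < indexChoices; index++) {
      total += 1
      // valid zero-based indices are 0 .. count - 1
      if (index >= count) {
        misses += 1
      }
    }
  }
  return misses / total
}

// 21 misses out of 42 combinations
const outOfRangeChance = chanceIndexOutOfRange()
```

Half of all runs would pick an item that does not exist, even before any other source of flake comes into play. Deriving the index from the actual list length brings this down to zero.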

💡 If you do need to pick a random number between min and max in your Cypress tests, please use the bundled Lodash function _.random:

// instead of this
let randomItem = Math.floor(Math.random() * 7)
// use Cypress._.random function
const randomItem = Cypress._.random(0, 6)

The above test is good. It certainly removes the flake from trying to pick an item that is not there. I would also print the picked item index to make it very clear which item we are activating.

.then((numberOfItems) => {
  const randomItem = Cypress._.random(0, numberOfItems - 1)
  cy.log(`Activating subscription for item #${randomItem}`)
  cy.get('[data-cy=customer-item]').eq(randomItem).click()
})

Active subscriptions

The test is less flaky than before but occasionally it still fails. Here is a good example of the failing test:

The test fails to activate the subscription

We only have one item, so we pick it to activate the subscription. But the item is already active. We cannot click on it to activate again. When picking an item, we must only consider the items that are "trial" or "inactive".

Can only activate trial or inactive subscriptions

We can look at the HTML markup to see if the "trial" and "inactive" items have any HTML attributes that we can use to easily query them while omitting the "active" subscriptions.

Subscription items HTML markup

Hmm, nothing. No biggie, we can add a data-status attribute to our SubscriptionItem component.

customer-subscriptions/src/components/SubscriptionItem.tsx
const SubscriptionItem: React.FC<SubscriptionItemProps> = ({
  subscription,
  onOpenModal,
}) => {
  return (
    <div
      className="flex items-center cursor-pointer hover:bg-slate-100 px-4 py-2 rounded-md"
      data-cy="customer-item"
      data-status={subscription.status}
      onClick={() => onOpenModal(subscription)}
    >
    ...

Tip: I use a separate data- attribute to pass the status, following the advice in my blog post Do Not Put Ids Into Test Ids.

Now our test can be very explicit and only consider the "trial" or "inactive" items by using the OR CSS selector.

cy.get(
  '[data-cy=customer-item][data-status=trial], [data-cy=customer-item][data-status=inactive]',
  { timeout: 11_000 },
)

We can also go the other way and filter out all active subscriptions using the cy.not command.

cy.get('[data-cy=customer-item]', { timeout: 11_000 })
  .not('[data-status=active]')
  .its('length')
  .then((numberOfItems) => {
    const randomItem = Cypress._.random(0, numberOfItems - 1)
    cy.log(`Activating subscription for item #${randomItem}`)
    cy.get('[data-cy=customer-item]').eq(randomItem).click()
  })

We still have two problems with this test that will cause flake. Do you see them? One is caused by the random data, another by the test design.

Sample data

Here is the problem caused by the test design. In the failure below we see that we are picking the item number one (zero-based index).

Failed to activate the subscription number one

Hmm, let's hover over the items we picked initially using cy.get('[data-cy=customer-item]', { timeout: 11_000 }).not('[data-status=active]') command. Seems we correctly picked 4 subscriptions.

Subscriptions that can be activated

Now hover over the EQ 1 command. Why is it picking the already activated subscription that is NOT part of the original four items?

Why is it picking the active subscription to try to activate it?

Ohhh, we picked the "trial" or "inactive" subscriptions initially to pick the random item index. But then we applied this index to all items:

cy.get('[data-cy=customer-item]', { timeout: 11_000 })
  .not('[data-status=active]')
  .its('length')
  .then((numberOfItems) => {
    const randomItem = Cypress._.random(0, numberOfItems - 1)
    cy.log(`Activating subscription for item #${randomItem}`)
    // WRONG: we must pick one of the original (non-active) items
    cy.get('[data-cy=customer-item]').eq(randomItem).click()
  })
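
The same mistake is easy to reproduce with plain arrays, without any browser. In this small TypeScript sketch the list of statuses is made up for illustration; the point is only that an index computed against the filtered list means something different in the full list.

```typescript
type Status = 'active' | 'inactive' | 'trial'

// hypothetical page: two active items first, then three that can be activated
const allStatuses: Status[] = ['active', 'active', 'inactive', 'inactive', 'trial']

// the items the test is allowed to click on
const canBeActivated = allStatuses.filter((status) => status !== 'active')

// an index that is valid for the filtered list (0, 1, or 2)
const pickedIndex = 1

// RIGHT: apply the index to the same filtered list
const pickedFromFiltered = canBeActivated[pickedIndex]
// WRONG: apply the index to the full list
const pickedFromAll = allStatuses[pickedIndex]
```

Here pickedFromFiltered is "inactive" while pickedFromAll is "active": the same index lands on a subscription that cannot be activated.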

Let's fix it. Just apply the same logic to filter items.

it('Activates a subscription', () => {
  cy.visit('/')
  cy.get('[data-cy=customer-item]', { timeout: 11_000 })
    .not('[data-status=active]')
    .its('length')
    .then((numberOfItems) => {
      const randomItem = Cypress._.random(0, numberOfItems - 1)
      cy.log(`Activating subscription for item #${randomItem}`)
      cy.get('[data-cy=customer-item]')
        .not('[data-status=active]')
        .eq(randomItem)
        .click()
    })
  cy.contains('Activate Subscription').click()
  cy.contains('Subscription was activated').should('be.visible')
})

The test works pretty well. But we can express our test even more simply by using my cypress-map plugin. Like Lodash, cypress-map provides a lot of "missing" queries and commands that make Cypress tests much simpler and more stable. In our case, we need something like Lodash's _.sample function.

_.sample([1, 2, 3, 4]);
// => 2

customer-subscriptions/cypress/e2e/spec.cy.ts
// https://github.com/bahmutov/cypress-map
import 'cypress-map'

it('Activates a subscription', () => {
  cy.visit('/')
  cy.get('[data-cy=customer-item]', { timeout: 11_000 })
    .not('[data-status=active]')
    .sample()
    .click()
  cy.contains('Activate Subscription').click()
  cy.contains('Subscription was activated').should('be.visible')
})

Using cy.sample query command simplifies the test
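
Conceptually, sampling boils down to picking one random element from a list. Here is a minimal plain TypeScript sketch of that idea. It is my own illustrative helper, not the actual implementation inside cypress-map or Lodash, and the input list is made up.

```typescript
// Pick one random element from a list; yields undefined for an empty list.
function sample<T>(items: readonly T[]): T | undefined {
  if (items.length === 0) {
    return undefined
  }
  // random integer between 0 and items.length - 1
  const index = Math.floor(Math.random() * items.length)
  return items[index]
}

// hypothetical statuses of the items that can be activated
const picked = sample(['inactive', 'trial', 'inactive'])
// nothing to pick from
const nothing = sample([])
```

Because the element comes straight from the list we pass in, there is no separate index that could be applied to the wrong list by accident.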

Boom. Simple and solid. Almost.

GitHub Copilot trick

Here is my #1 tip for you when writing Cypress tests. I stated it many years ago at a few conference presentations. When writing Cypress tests, first write "directions" for a human user. For example:

List of test steps for a human tester

Write the steps as comments, telling the tester what to do, but not how to do it. Here are the steps I wrote as comments inside an empty Cypress test:

customer-subscriptions/cypress/e2e/spec.cy.ts
// https://github.com/bahmutov/cypress-map
import 'cypress-map'

it('Activates a subscription', () => {
  cy.visit('/')
  // get all items with data-cy attribute "customer-item"
  // omit all items with data-status attribute "active"
  // sample an item using cy.sample command
  // click on the item
})

Now add an empty line in your VSCode and it should trigger GitHub Copilot. Here is what happens for me:

GitHub Copilot suggests the right Cypress commands

Nice! Our comments got "translated" into correct Cypress code that even uses the custom cy.sample command from cypress-map. Press "Tab" to accept the suggested code, and we have a passing test.

The AI-generated test passes

Let's continue generating our test. Write more user instructions.

There should be a link to activate the subscription

Then we need to confirm the subscription was activated.

GitHub Copilot suggestion how to check the successfully activated subscription

The test is complete.

The full generated test

Not bad, right?

The data edge case

Yet, there is one more source of test flake that we did not consider. Sometimes the test fails.

The test can still fail sometimes

Again, by inspecting the Command Log column, you can see the problem. The test did find items. The test failed to find non-active items because all randomly generated subscriptions were already "active". Our test always assumed there would be some inactive subscriptions, but that is an invalid assumption. We can do conditional testing in Cypress, even if it is an anti-pattern. The simplest way to run test commands only if there are items to be activated is by using my cypress-if plugin.

customer-subscriptions/cypress/e2e/spec.cy.ts
// https://github.com/bahmutov/cypress-map
import 'cypress-map'
// https://github.com/bahmutov/cypress-if
import 'cypress-if'

it('Activates a subscription', () => {
  cy.visit('/')
  // get all items with data-cy attribute "customer-item"
  // omit all items with data-status attribute "active"
  // sample an item using cy.sample command
  // click on the item
  cy.get('[data-cy=customer-item]')
    // the elements are there, the list will not change
    // thus we can set the timeout to 0
    // to quickly move to the next steps
    .not('[data-status=active]', { timeout: 0 })
    .if('exists')
    .sample()
    .click()
    .then(() => {
      // find the button that contains the text "Activate Subscription"
      // and click on it
      cy.contains('Activate Subscription').click()
      // the page should contain a visible element
      // with text "Subscription was activated"
      cy.contains('Subscription was activated').should('be.visible')
    })
    .else('Could not find inactive subscription')
})

If the NOT [data-status=active] command yields no elements, we take the ELSE branch where we simply print the info message.

Handle the data case when there are no items to activate
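
How often should we expect this "everything is already active" case? Here is a quick estimate in plain TypeScript. It assumes the generator shown at the start of the post: 1 to 6 items with equal probability, and each item independently getting one of three statuses with equal probability. The helper name chanceAllActive is made up for this illustration.

```typescript
// Chance that every generated subscription is already "active".
// For a list of n items it is (1/3)^n; we average over n = 1..maxItems.
function chanceAllActive(maxItems = 6, statusChoices = 3): number {
  let sum = 0
  for (let count = 1; count <= maxItems; count++) {
    sum += Math.pow(1 / statusChoices, count)
  }
  return sum / maxItems
}

// roughly 0.083, or about one run in twelve
const allActiveChance = chanceAllActive()
```

So even with every other problem fixed, about 8% of the runs would have nothing to activate, which is far too often to ignore.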

Control the data

Finally, let's confirm that our code works no matter what the backend sends us. We need to control the data, and the best way is to use the cy.intercept command to stub the loading network call. We can use fixtures of different types (no inactive subscriptions, one item, a mixture of items) to make sure our testing code can handle each possible situation. We can copy the starting response data straight from the browser.

Copy the network response into a JSON fixture file

The first JSON fixture will have a mixture of items, with "active" items first. This will test the case we discussed in the "Sample data" section above. The index of the inactive item should not accidentally apply to all items.

customer-subscriptions/cypress/fixtures/mixed.json
[
  {
    "id": "d216125a-426a-40f9-bfa4-03bfd96cff1c",
    "fullName": "Herman Stark",
    "email": "[email protected]",
    "status": "active"
  },
  {
    "id": "ff86ef5d-da46-4079-8ac7-44135e399627",
    "fullName": "Liliana Parisian",
    "email": "[email protected]",
    "status": "active"
  },
  {
    "id": "4852ee4c-8d75-49de-8d99-fb49cc03318d",
    "fullName": "Grover Bartoletti",
    "email": "[email protected]",
    "status": "inactive"
  },
  {
    "id": "5e4cef59-116c-4f70-8e3f-fd8cf270c0af",
    "fullName": "Lacy Abshire",
    "email": "[email protected]",
    "status": "inactive"
  },
  {
    "id": "3291bea3-71a8-42e0-9199-0240827e58f1",
    "fullName": "Gia Marvin",
    "email": "[email protected]",
    "status": "trial"
  }
]

Our next fixture will have only active subscriptions to test the "ELSE" logic.

customer-subscriptions/cypress/fixtures/active-only.json
[
  {
    "id": "d216125a-426a-40f9-bfa4-03bfd96cff1c",
    "fullName": "Herman Stark",
    "email": "[email protected]",
    "status": "active"
  },
  {
    "id": "ff86ef5d-da46-4079-8ac7-44135e399627",
    "fullName": "Liliana Parisian",
    "email": "[email protected]",
    "status": "active"
  }
]

Finally, we want to make sure we test the "trial" items and that they can be activated:

customer-subscriptions/cypress/fixtures/trial.json
[
  {
    "id": "3291bea3-71a8-42e0-9199-0240827e58f1",
    "fullName": "Gia Marvin",
    "email": "[email protected]",
    "status": "trial"
  }
]

Let's write the tests. We can refactor the code to avoid the duplication.

customer-subscriptions/cypress/e2e/spec.cy.ts
// https://github.com/bahmutov/cypress-map
import 'cypress-map'
// https://github.com/bahmutov/cypress-if
import 'cypress-if'

function activateOneSubscription() {
  cy.visit('/')
  // get all items with data-cy attribute "customer-item"
  // omit all items with data-status attribute "active"
  // sample an item using cy.sample command
  // click on the item
  cy.get('[data-cy=customer-item]')
    // the elements are there, the list will not change
    // thus we can set the timeout to 0
    // to quickly move to the next steps
    .not('[data-status=active]', { timeout: 0 })
    .if('exists')
    .sample()
    .click()
    .then(() => {
      // find the button that contains the text "Activate Subscription"
      // and click on it
      cy.contains('Activate Subscription').click()
      // the page should contain a visible element
      // with text "Subscription was activated"
      cy.contains('Subscription was activated').should('be.visible')
    })
    .else('Could not find inactive subscription')
}

it('Activates a subscription', () => {
  cy.intercept('GET', '/api/subscriptions', {
    fixture: 'mixed.json',
  }).as('subscriptions')
  activateOneSubscription()
  // confirm the subscription network call happened
  cy.wait('@subscriptions')
})

it('Activates a trial subscription', () => {
  cy.intercept('GET', '/api/subscriptions', {
    fixture: 'trial.json',
  }).as('subscriptions')
  activateOneSubscription()
  // confirm the subscription network call happened
  cy.wait('@subscriptions')
})

it('Does not activate an active subscription', () => {
  cy.intercept('GET', '/api/subscriptions', {
    fixture: 'active-only.json',
  }).as('subscriptions')
  activateOneSubscription()
  // confirm the subscription network call happened
  cy.wait('@subscriptions')
})

Three different tests use different data to confirm our logic works

Response data changes

When using a network stub there is a danger that the server changes its response and our tests don't catch it. We can prevent this by adding a quick API test or a spy E2E test. Since we only want to ensure the properties / types of the objects in the response, I recommend using my cy-spok plugin.
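
To make the idea concrete, here is roughly what "checking the properties and types" means, sketched as a plain TypeScript type guard. This only illustrates the check itself; it is not how cy-spok works internally. The Subscription shape is inferred from the fixtures above, and the sample object is made up.

```typescript
type Status = 'active' | 'inactive' | 'trial'

// shape inferred from the JSON fixtures (an assumption, not the app's own type)
interface Subscription {
  id: string
  fullName: string
  email: string
  status: Status
}

const isStatus = (s: unknown): s is Status =>
  s === 'active' || s === 'inactive' || s === 'trial'

// true only if the value has exactly the expected keys with the expected types
function isSubscription(value: unknown): value is Subscription {
  if (typeof value !== 'object' || value === null) {
    return false
  }
  const item = value as Record<string, unknown>
  const keys = Object.keys(item).sort().join(',')
  return (
    keys === 'email,fullName,id,status' &&
    typeof item.id === 'string' &&
    typeof item.fullName === 'string' &&
    typeof item.email === 'string' &&
    isStatus(item.status)
  )
}

// hypothetical response item, for illustration only
const sampleItem = {
  id: 'abc-123',
  fullName: 'Test User',
  email: 'test.user@example.com',
  status: 'trial',
}
const sampleIsValid = isSubscription(sampleItem)
```

If the server renames a property, adds a new one, or introduces a new status value, a check like this fails right away, while a stubbed E2E test would keep passing against stale fixture data.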

API test

In this test we will make a network call ourselves using the cy.request command and will validate the response. The response should be an array, and we can validate the first object using the built-in assertions plus spok property predicates.

customer-subscriptions/cypress/e2e/api.cy.js
import spok from 'cy-spok'

const isStatus = (s) =>
  s === 'active' || s === 'inactive' || s === 'trial'

it('responds with a subscription object', () => {
  cy.request('/api/subscriptions/')
    .its('body')
    .should('be.an', 'array')
    .its(0, { timeout: 0 })
    // confirm each item can only have these properties
    .should('have.keys', ['id', 'fullName', 'email', 'status'])
    // use cy-spok to check the types of each property
    .and(
      spok({
        id: spok.string,
        fullName: spok.string,
        email: spok.string,
        status: isStatus,
      }),
    )
})

There is no web page in this test, since we never called cy.visit. Instead we simply see the assertions in the Command Log.

The API test confirms the server response schema

Network spy test

Sometimes it is hard to make a valid request from the test: the format of the call might be complicated and might require authentication. It might be easier to just spy on the call made by the application. The same logic applies.

customer-subscriptions/cypress/e2e/spy.cy.js
import spok from 'cy-spok'

const isStatus = (s) =>
  s === 'active' || s === 'inactive' || s === 'trial'

it('spies on the server response', () => {
  // set up the network spy before the page loads
  cy.intercept('GET', '/api/subscriptions').as('subscriptions')
  cy.visit('/')
  cy.wait('@subscriptions')
    .its('response.body')
    .should('be.an', 'array')
    .its(0, { timeout: 0 })
    // confirm each item can only have these properties
    .should('have.keys', ['id', 'fullName', 'email', 'status'])
    // use cy-spok to check the types of each property
    .and(
      spok({
        id: spok.string,
        fullName: spok.string,
        email: spok.string,
        status: isStatus,
      }),
    )
})

Spying on the network call to confirm its schema

Beautiful.

🎓 In this blog post I used a lot of Cypress plugins. If you want to learn them better, I have an online hands-on course Cypress Plugins. You can also level up your network testing by taking my Cypress Network Testing Exercises course.