Best articles on how to debug Cypress flakes

Here’s a collection of articles we’ve found helpful when trying to understand why flakes occur and how to write more resilient tests.

By Filip Hric
  1. Using explicit waiting
  1. Using unreadable selectors
  1. Selecting elements improperly
  1. Ignoring requests in your app
  1. Overlooking DOM re-rendering
  1. Creating inefficient command chains
  1. Overusing UI
  1. Repeating the same set of actions
1. Unprepared Element
2. Flickering Element
3. Impatient Test
Kailash Pathak
1. Adding.Exist,.visible before Clicking on the element
2. Test Retries
3. Use timeouts instead of waits
4. Use cy. intercept() to wait for the XHR request to finish execution.
By Jori Gelbaugh
  1. Reduce Inconsistent Interactions in the DOM
  1. Reduce Inconsistency in Requests
Causes
1. TEST-SIDE CAUSES 
2. ENVIRONMENT-SIDE CAUSES 
3. PRODUCT-SIDE CAUSES 
Ways To Fight Flakiness 
  1. FOCUS ON YOUR TEAM #
  1. KEEP TESTS ISOLATED #
  1. FURTHER OPTIMIZE THE TEST STRUCTURE #
  1. WAIT! TEST RETRIES ARE SOMETIMES OK? #
  1. USING DYNAMIC WAITING TIMES #
SPOTTING THE RED FLAGS #
  • The test is large and contains a lot of logic.
  • The test covers a lot of code (for example, in UI tests).
  • The test makes use of fixed waiting times.
  • The test depends on previous tests.
  • The test asserts data that is not 100% predictable, such as the use of IDs, times, or demo data, especially randomly generated ones.

Methods for identifying and dealing with flaky tests

by Jason Palmer
Causes of flakiness
There are many causes of flakiness but the few I highlight in my talk are:
  1. Inconsistent assertion timing – when your application state is not consistent between test runs you will notice that expect/assert statements fail randomly. The fix for this is to construct tests so that you wait for the application to be in a consistent state before asserting. I am not talking about “wait” statements either. You should have predicates in place to poll the application state until it reaches a known good state where you can assert.
  1. Reliance on test order – global state is the main culprit that causes tests to be reliant on other tests. If you see that you cannot run a test in isolation and it only passes when the whole suite is run then you have this problem. The solution is to entirely reset the state between each test run and reduce the need for global state.
Flakybot
So we have the tools to tell you which tests are flaky, but how will an engineer know they have fixed the flakiness problem? This is why we created Flakybot. This is an internal tool at Spotify to help engineers determine if their test(s) are flaky before merging code to master.