Flaky test detection, reporting, prevention, and minimization

Description

Flaky tests are a major problem in many teams' CI/CD pipelines. Some flaky tests are avoidable and some aren't. Even when they are avoidable, making them stable can be hard and sometimes prohibitively expensive. We should make it easy to detect flaky tests, report on them, and possibly work around their flakiness.

Note that tests may fail for many reasons, from unexpected state to network outages. A test retried right away may still fail because of an ongoing external incident, so an exponential backoff on retries might be helpful here.
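To make the backoff idea concrete, here is a minimal sketch of retrying a single test with exponentially growing delays. Everything in it is an assumption for illustration: the function name `retry_with_backoff`, the `run_test` callable, and the default attempt count and delays are hypothetical, not an existing runner API.

```python
import random
import time
from typing import Callable, Tuple


def retry_with_backoff(
    run_test: Callable[[], bool],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Run a test until it passes or attempts run out.

    `run_test` is a hypothetical callable that returns True on success.
    Returns (passed, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        if run_test():
            return True, attempt
        if attempt < max_attempts:
            # The delay doubles after each failure: base, 2*base, 4*base, ...
            # and is capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter (50-100% of the delay) keeps parallel jobs from
            # retrying against a struggling service at the same moment.
            sleep(delay * random.uniform(0.5, 1.0))
    return False, max_attempts
```

A result of `(True, n)` with `n > 1` means the test only passed on a retry, which is exactly the signal worth recording for flakiness reporting rather than silently discarding.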

Proposal

Use JUnit XML as the standard report format from all test runners
Detect patterns that indicate flakiness, such as:
- Tests that fail, but then succeed on retries
- Tests that fail a high percentage of the time
Report the flakiest tests (so that someone can work on fixing them) (https://gitlab.com/gitlab-org/gitlab-ee/issues/3673)
Flag flaky tests for automatic retries
Block MRs that introduce new flaky tests
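The two detection patterns above can be sketched over a stream of JUnit XML reports. This is only an illustration under stated assumptions: reports are assumed to arrive in chronological order and to be grouped by commit SHA, so that several reports for the same commit count as retries. The function names, the threshold, and the minimum run count are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Iterable, List, Set, Tuple


def parse_junit(xml_text: str) -> List[Tuple[str, bool]]:
    """Return (test_id, passed) for each non-skipped <testcase>."""
    root = ET.fromstring(xml_text)
    results = []
    for case in root.iter("testcase"):
        if case.find("skipped") is not None:
            continue
        test_id = f"{case.get('classname', '')}.{case.get('name', '')}"
        failed = case.find("failure") is not None or case.find("error") is not None
        results.append((test_id, not failed))
    return results


def find_flaky(
    reports: Iterable[Tuple[str, str]],
    rate_threshold: float = 0.3,  # hypothetical cutoff for "high percentage"
    min_runs: int = 3,            # ignore tests with too little history
) -> Tuple[Set[str], Set[str]]:
    """Scan (commit_sha, junit_xml) pairs in chronological order.

    Returns (passed_on_retry, high_failure_rate) sets of test ids.
    """
    failed_before = set()    # (commit, test_id) pairs that have failed
    passed_on_retry = set()  # failed, then passed on the same commit
    totals = defaultdict(int)
    failures = defaultdict(int)
    for commit, xml_text in reports:
        for test_id, passed in parse_junit(xml_text):
            totals[test_id] += 1
            if passed:
                if (commit, test_id) in failed_before:
                    passed_on_retry.add(test_id)
            else:
                failures[test_id] += 1
                failed_before.add((commit, test_id))
    high_failure_rate = {
        test_id
        for test_id, runs in totals.items()
        if runs >= min_runs and failures[test_id] / runs >= rate_threshold
    }
    return passed_on_retry, high_failure_rate
```

Either set could feed the report of flakiest tests or the automatic-retry flag. A test that fails on every run would also land in the second set, so a real implementation would probably want to separate consistently broken tests from intermittent ones.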

Links / references

From https://gitlab.com/gitlab-org/gitlab-ce/issues/32308#note_34748569

Documentation blurb

Overview

What is it? Why should someone use this feature? What is the underlying (business) problem? How do you use this feature?

Use cases

Who is this for? Provide one or more use cases.

Feature checklist

Make sure these are completed before closing the issue, with a link to the relevant commit.

Feature assurance
Documentation
Added to features.yml

Edited Oct 06, 2017 by Mark Pundsack
