Feature Request: means of troubleshooting intermittent failures that do not depend on sequence

Created by: brandondrew

Subject of the issue

Feature Request: means of gathering statistics on intermittent tests that do not depend on sequence

Your environment

Ruby version: 2.7.8
rspec-core version: 3.9.3

Steps to reproduce

I'm aware of rspec --bisect for isolating specs that fail intermittently depending on the order in which examples run. But I have a case in a legacy app (I've never worked in this codebase before) where specs fail intermittently even when run in isolation, so sequence is not a factor.

In other words, if I run

rspec ./spec/services/locate_missing_approver_service_spec.rb:5

it will fail perhaps 1 time in 10. That 10% figure is a guess based on my experience of running this, but it might be useful if I had a way of accurately (and quickly, and not manually) gathering statistics like that, so I could use git bisect to determine where the intermittent failures begin (or if they were always present since the beginning of that spec).
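A minimal sketch of how such a check could be plugged into git bisect today, without any new RSpec feature. The function name passes_every_time and the run count are hypothetical choices for illustration. It relies on rspec exiting non-zero when an example fails, and on git bisect run treating exit status 0 as "good" and 1 through 127 (except 125) as "bad".

```shell
#!/usr/bin/env bash
# Hypothetical predicate for `git bisect run`: returns 0 (good) only if the
# given command passes on every run, and 1 (bad) as soon as one run fails.
# Usage: passes_every_time RUNS COMMAND [ARGS...]
passes_every_time() {
  local runs=$1
  shift
  local i
  for ((i = 1; i <= runs; i++)); do
    # Discard output; only the exit status matters to git bisect.
    "$@" > /dev/null 2>&1 || return 1
  done
  return 0
}

# Example wiring (commented out so the block is safe to source):
# passes_every_time 50 rspec ./spec/services/locate_missing_approver_service_spec.rb:5
```

Saved as a script that ends with the call above, this could be passed to git bisect run after marking a good and a bad commit. If the real failure rate is near 10% and runs are independent, 50 passing runs would wrongly mark a flaky commit as good about 0.5% of the time (0.9^50 ≈ 0.005). Note that a commit where the spec cannot load at all would also be reported as bad.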

At the moment I'm manually running a function to check whether the failures occur in any given commit or with any temporarily changed code, and I could extend it to gather stats, but I'm not sure if that is the best path to go down.

# Re-run the spec until it fails, then stop so the failure output stays on screen.
function test-flaky-spec() {
  while true; do
    rspec ./spec/services/locate_missing_approver_service_spec.rb:5 || break
    sleep 0.1
  done
}
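Extending that loop to gather statistics might look something like the sketch below. The function name flaky_stats, its argument order, and the CSV columns are all hypothetical, not anything RSpec provides. It runs a command a fixed number of times, records each exit status, and prints a failure count.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command RUNS times and log each exit status as CSV.
# Usage: flaky_stats RUNS CSV_FILE COMMAND [ARGS...]
flaky_stats() {
  local runs=$1 csv=$2
  shift 2
  local failures=0 i status
  echo "run,exit_status" > "$csv"
  for ((i = 1; i <= runs; i++)); do
    # Discard the command's output; only its exit status is recorded.
    "$@" > /dev/null 2>&1
    status=$?
    echo "$i,$status" >> "$csv"
    if [[ $status -ne 0 ]]; then
      failures=$((failures + 1))
    fi
  done
  echo "$failures/$runs runs failed"
  # Return success only when every run passed.
  [[ $failures -eq 0 ]]
}

# Example (commented out so the block is safe to source):
# flaky_stats 100 missing_approver.csv rspec ./spec/services/locate_missing_approver_service_spec.rb:5
```

Each rspec invocation boots the app again, so 100 runs this way can be slow; that start-up cost is part of why a built-in option would be appealing.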

If gathering statistics is the best means of troubleshooting that we can hope for, it would be nice to have some options built into RSpec, such as those illustrated here, specifying how many times to run the spec and where to save the data:

rspec --loop=100 --stats=missing_approver.csv ./spec/services/locate_missing_approver_service_spec.rb:5

But perhaps there is something better than such statistics, and those with more experience diagnosing this sort of problem can offer better ways to support such troubleshooting—I'm sorry that this feature request is still somewhat abstract.

Expected behavior

Some means of diagnosing intermittent failures (that is, those that don't depend on the sequence of examples).

Actual behavior

As far as I can see, we're pretty much on our own in this situation. (But maybe I'm overlooking something! I checked rspec --help on the latest version and I don't see anything there either, so I don't think I'm overlooking anything.)

Admin message