Feature Request: means of troubleshooting intermittent failures that do not depend on sequence
Created by: brandondrew
Subject of the issue
Feature Request: means of gathering statistics on intermittent tests that do not depend on sequence
Your environment
- Ruby version: 2.7.8
- rspec-core version: 3.9.3
Steps to reproduce
I'm aware of rspec --bisect for isolating specs that fail intermittently based on the sequence of examples run. But I have a case in a legacy app (I've never worked in this codebase before) where specs fail intermittently even when run in isolation, so sequence is not a factor.
In other words, if I run
rspec ./spec/services/locate_missing_approver_service_spec.rb:5
it will fail perhaps 1 time in 10. That 10% figure is a guess based on my experience of running this, but it would be useful to have a way of gathering such statistics accurately (and quickly, and not manually), so I could use git bisect to determine where the intermittent failures begin (or whether they have been present since that spec was first written).
At the moment I'm manually running a function to check whether the failures occur in any given commit or with any temporarily changed code, and I could extend it to gather stats, but I'm not sure if that is the best path to go down.
function test-flaky-spec() {
  # Re-run the single example until it fails once, then stop.
  while true; do
    rspec ./spec/services/locate_missing_approver_service_spec.rb:5
    if [[ $? -ne 0 ]]; then
      break
    fi
    # Brief pause between runs.
    sleep 0.1
  done
}
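A minimal sketch of how the loop above could be extended to gather stats, under stated assumptions: the function name flaky_stats, its argument order, and the CSV columns are all hypothetical, and nothing here is an RSpec feature. It runs any command a fixed number of times, logs each exit status to a CSV file, and prints the observed failure count.

```shell
# Hypothetical helper (not part of RSpec): run a command a fixed number of
# times, record every exit status in a CSV file, and report the failure count.
# Usage: flaky_stats RUNS CSV_FILE COMMAND [ARGS...]
# e.g.:  flaky_stats 100 missing_approver.csv \
#          rspec ./spec/services/locate_missing_approver_service_spec.rb:5
flaky_stats() {
  local runs="$1" csv="$2"
  shift 2
  local failures=0 i=1 status

  # Start a fresh CSV with a header row.
  echo "run,exit_status" > "$csv"

  while [ "$i" -le "$runs" ]; do
    # Discard the command's own output; only the exit status is recorded.
    "$@" > /dev/null 2>&1
    status=$?
    echo "$i,$status" >> "$csv"
    if [ "$status" -ne 0 ]; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done

  echo "$failures/$runs runs failed"

  # Succeed only if every run passed, so the function can drive other tools.
  [ "$failures" -eq 0 ]
}
```

Because the function returns a non-zero status when any run fails, it could in principle be wrapped in a script and handed to git bisect run, which treats exit status 0 as good and 1 through 127 (except 125) as bad. The number of runs matters: if the true failure rate is about 10%, 50 consecutive passes still happen by chance roughly 0.5% of the time (0.9^50), so a commit can occasionally be misclassified as good.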
If gathering statistics is the best means of troubleshooting that we can hope for, it would be nice to have some options built into RSpec, such as those illustrated here, specifying how many times to run the spec and where to save the data:
rspec --loop=100 --stats=missing_approver.csv ./spec/services/locate_missing_approver_service_spec.rb:5
But perhaps there is something better than such statistics, and those with more experience diagnosing this sort of problem can offer better ways to support such troubleshooting—I'm sorry that this feature request is still somewhat abstract.
Expected behavior
Some means of diagnosing intermittent failures (those which don't depend on the sequence of examples, that is).
Actual behavior
As far as I can see, we're pretty much on our own in this situation. (But maybe I'm overlooking something! I checked rspec --help on the latest version and I don't see anything there either, so I don't think I'm overlooking anything.)