test: improve test-cluster-disconnect-suicide-race

Previously, test-cluster-disconnect-suicide-race had two issues:

  • Magic numbers: The number of workers to spawn was determined empirically. As new platforms and new CPU/RAM configurations are tested, these magic numbers need more and more refinement. This brings us to...
  • Non-determinism: The test appears to fail consistently when the bug it tests for is present, but that is a judgment based on sampling, not a guarantee. "Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try 16..."

This revised version of the test takes a different approach. With the fix for the bug this test was written for, the disconnect event fires on a subsequent tick. The test now checks for exactly that, so it still fails when the fix is not in the code base and passes when it is.
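As a rough illustration of that kind of check (not the test itself), the sketch below uses a plain EventEmitter as a stand-in for the worker. The helper name `firesOnLaterTick` and both stand-in emitters are hypothetical and exist only for this example. The helper reports whether an event is still pending when a `process.nextTick` callback, queued right after the triggering call, runs.

```javascript
'use strict';
const { EventEmitter } = require('events');

// Hypothetical helper (not part of Node.js or of the actual test).
// Resolves to true if `eventName` has NOT fired by the time a
// process.nextTick callback queued right after `trigger()` runs,
// and to false if the event fired before that point.
function firesOnLaterTick(emitter, eventName, trigger) {
  return new Promise((resolve) => {
    let eventFired = false;
    emitter.once(eventName, () => { eventFired = true; });
    trigger();
    process.nextTick(() => resolve(!eventFired));
  });
}

// Stand-in that emits before the triggering call returns.
const eager = new EventEmitter();
const eagerResult = firesOnLaterTick(
  eager, 'disconnect', () => eager.emit('disconnect'));

// Stand-in that defers the emit to a later turn of the event loop.
const deferred = new EventEmitter();
const deferredResult = firesOnLaterTick(
  deferred, 'disconnect',
  () => setImmediate(() => deferred.emit('disconnect')));

Promise.all([eagerResult, deferredResult]).then(([early, late]) => {
  console.log('eager emitter fired on a later tick:', early);
  console.log('deferred emitter fired on a later tick:', late);
});
```

In a cluster worker, the emitter would presumably be `cluster.worker` and the trigger a call to `cluster.worker.disconnect()`; that mapping is an assumption here, not a description of the committed test. Because the check depends only on event ordering, it needs no worker counts tuned per platform.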

Advantages of this approach include:

  • The test runs much faster.
  • The test should be reliable on any new platform regardless of CPU and RAM.

Ref: #4674

cc @santigimeno @iWuzHere
