Skip to content

Improve performance of bisect using new forking bisect runner

gitlab-qa-bot requested to merge myron/bisect-forker into master

Created by: myronmarston

Until now, --bisect has shelled out to run each subset. I've realized this is pretty inefficient: you keep paying the cost to boot RSpec and the application environment over and over again. I thought we could perhaps improve performance by running the specs using fork instead of shelling out. This PR is the result of my attempt.

Results

The improvement of the new forking bisect runner can make a big difference, but the amount of improvement depends a lot on an individual project's spec suite. Spec suites that boot quickly, and have a runtime dominated by the specs themselves, will see only marginal improvements. Any spec suite that has significant boot up time (e.g. to load rails or some other heavyweight dependency) should see a noticeable improvement.

I did some testing using the unit specs of plines, and old project I worked on years ago. It contains 343 unit specs which run pretty quickly (time bin/rspec spec/unit takes about 2 seconds). Running bin/rspec spec/unit --bisect using the existing shell runner takes 9.8 seconds. With the new fork runner, it drops to 6.3 seconds--about a 33% improvement.

However, if we simulate the typical boot time of a rails app by adding a sleep 5 to spec_helper, the difference is much more dramatic. With the shell runner, a bisect run takes 108.9 seconds. With the fork runner, it takes 11.7 seconds.

If you really care about speed, you can pass additional --require options (or even require a file that pre-loads all of your application) when bisecting to ensure as much is pre-loaded as possible.

Compatibility concerns

As nice as the performance improvements from the fork runner are, I don't think it's going to work for all spec suites. Here are some potential issues:

  • Platforms that don't support fork. AFAIK, this is only windows, but we have to support it, obviously. We can keep using the shell runner for these environments. This PR does that by checking Process.respond_to?(:fork) before deciding which bisect runner to default to.
  • Projects that use spring or some other forking runner. I don't know much about spring (or any of the other similar projects: zeus, spork, etc), but I understand it uses forking to work. I don't know if there will be any conflict, if RSpec uses forking internally for --bisect and spring also does.
  • Projects that put one-time setup at the top level of spec_helper instead of in a before(:suite) hook. The forking runner loads --require files only once (which is a big part of where it gets its perf improvements!) but that means that spec suite bootstrapping logic written at the top level of spec_helper will only get executed once, and not once per spec run, as with the shell runner. Usually this isn't a big deal, but imagine a spec suite that booted selenium at the top level of spec_helper, and then shut down selenium in an after(:suite) hook. In such a case, selenium would get shutdown after the first spec run, and it would not be available for use in subsequent spec runs. With the shell runner, this isn't a problem, because it re-loads spec_helper each time.

Given these compatibility concerns, I'm not sure if we should default to the new :fork runner on platforms that support :fork or not. If we stick with :shell as the default and make users opt-in to :fork, most users will miss out on the performance improvement. But I also don't want to break bisect for some users. Then again, I spent a bunch of time trying to create a situation like that last case, and failed to actually trigger a problem. So maybe the forking runner isn't going to cause problems for any users.

What do others think?

Merge request reports