Avoid bisect command to get stuck
Created by: benoittgt
Fix: https://github.com/rspec/rspec-core/issues/2637
From Palkan in benoittgt/rspec_repro_bisect_deadlock#1
First, I've tried to play with the number of specs which led to the interesting conclusion: the process hangs only at 1548+ specs.
RSpec.describe "a bunch of nothing" do
(0...(ENV.fetch('N', 3000).to_i)).each do |t|
it { expect(t).to eq t }
end
end
Try to run with N=1547
and N=1548
.
Seems suspicious, right?
Let's add pry-byebug
to the equation (or Gemfile).
In order it to work we need to tweak our runner code a bit:
- $stdout = $stderr = @spec_output
+ # $stdout = $stderr = @spec_output
After a bit of puts
debugging I localized the problem:
@channel.send
.
Channel#send
calls IO#write
here
https://github.com/rspec/rspec-core/blob/7b6b9c3/lib/rspec/core/bisect/utilities.rb#L41:
def send(message)
packet = Marshal.dump(message)
@write_io.write("#{packet.bytesize}\n#{packet}")
end
Do you know, what is the packet.bytesize
for N=1548
? It's 65548.
This number is very important: the pipe size is only 65536 on MacOS
(see docs for IO#write_nonblock
for more).
That makes @write_io.write
hangs forever, because no one reads the
buffer: we call Channel#receive
only after Process.waitpid(pid)
,
thus waiting for the write operation to complete.
A basic proposal is to use WNOHANG. From waitpid doc:
If WNOHANG was specified in options and there were no children in a waitable state, then waitid() returns 0 immediately (...)
To validate this proposal on OSX we run just before running bisect:
lsof -n -P -r1 -c ruby | grep -e 'PIP' -e '===' -e 'COMMAND'
This will give us in loop the PIPE sizes of Ruby processes. Without our patch we see that quickly we hit 65536 bytes on two pipes, with the patch we keep pipes at the right size.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ruby 40134 benoit 3 PIPE 0xf3b025a6a6cd6005 16384 ->0xf3b025a6a6cd5045
ruby 40134 benoit 4 PIPE 0xf3b025a6a6cd5045 16384 ->0xf3b025a6a6cd6005
ruby 40134 benoit 5 PIPE 0xf3b025a6a6cd7805 16384 ->0xf3b025a6a6cd7145
ruby 40134 benoit 7 PIPE 0xf3b025a6a6cd7145 16384 ->0xf3b025a6a6cd7805
ruby 40134 benoit 10 PIPE 0xf3b025a6a6cd6fc5 16384 ->0xf3b025a6a6cd5a05
ruby 40134 benoit 11 PIPE 0xf3b025a6a6cd5a05 16384 ->0xf3b025a6a6cd6fc5
ruby 40144 benoit 3 PIPE 0xf3b025a6a6cd5d05 16384 ->0xf3b025a6a6cd5c45
ruby 40144 benoit 4 PIPE 0xf3b025a6a6cd5c45 16384 ->0xf3b025a6a6cd5d05
ruby 40144 benoit 5 PIPE 0xf3b025a6a6cd7085 16384 ->0xf3b025a6a6cd6785
ruby 40144 benoit 7 PIPE 0xf3b025a6a6cd6785 16384 ->0xf3b025a6a6cd7085
ruby 40144 benoit 10 PIPE 0xf3b025a6a6cd6fc5 16384 ->0xf3b025a6a6cd5a05
ruby 40144 benoit 11 PIPE 0xf3b025a6a6cd5a05 16384 ->0xf3b025a6a6cd6fc5
Improvements: The bisect command request lot's of ram. The next step should be to reduce that consumption.
Thanks to Palkan, julik and Samuel Williams for the help.