rak
" aliased to "bundle exec rake
")
rak test
which internally expands to the equivelent of:
rak test:spec test:int
which runs my specs and integration tests in that order.
Then an odd thing happened. My specs failed but then the integration tests ran anyway and scrolled the spec failure off the screen to report happy success. After some digging around I discovered the following:
- I'd had this test failure for a few commits without noticing.
- My CI builds on Travis CI were all reporting success (although the failure message was there in the build logs should one manually check.)
- Rake itself and RSpec's rake task were both fine.
- Running my tests directly with the
rspec
CLI, I was getting an exit code of 0 on both success and failure.
RSpec was returning an error/exit/status code of 0 despite test failure. It should be non-zero so that external processes like Travis CI and Rake can determine that something's gone wrong and react accordingly.
I'm going to cut a tedious story down to the result here. After investigation I learned that rspec
worked as expected again when I avoided this piece of code in my tests:
What this piece of code does is:
- create a temporary directory the first time it's called
- reuse that temporary directory on subsequent calls
- remove the directory at the end of the process's lifecycle
Why does that affect RSpec returning non-zero on failure? Because RSpec itself doesn't run immediately; it wants to wait until all of your specs have been loaded first. The way it accomplishes that is it registers itself to run via an at_exit
block and then calls exit
when it's finished with your specs. Still sounds like there's no problem right? Well this is what happens from that point on...
- RSpec finishes running tests and calls
exit
with 0 for success or 1 for failure. - Accordingly, Ruby creates an instance of
SystemExit
and plonks it into the$!
global variable. - Now that the process is shutting down, my
at_exit
in the snippet above starts running. - My cleanup code (correctly, validly and legally) runs Ruby's
FileUtils.remove_entry_secure
to remove the temporary directly created during tests. This isn't a problem in itself. - Here's the gotcha:
FileUtils.remove_entry_secure
removes the directory and sets$!
tonil
to indicate that no exception occurred. - Ruby ends the process and sets the exit code to the result of
$!.status
which was lost in the previous step.
That was the problem. Now what's the solution?
Simple. Just take care to preserve the exit status in your at_exit
so that it ends with what it started with. Here is a helper method that I created:
(You can also view this file
directly on Github in a utility library of mine.)
Then the solution becomes simply use at_exit_preserving_exit_status
instead of at_exit
in first snippet, and everything works again! Happy days!
Be careful that you don't corrupt the value of $!
when using at_exit
. If you're not careful (or don't use a handy, safe function like presented above), then you can corrupt the exit status of RSpec in particular, and other libraries that work in a similar fashion.
Out of curiosity, why preserve only SystemExit instead of any exception?
ReplyDeleteOriginally I wrote a fn that preserved any kind of status. Then I went and changed it about half an hour later, thinking "ooh good catch! There are scenarios where that would produce an unexpected result". I was slightly proud of myself for realising and anticipating that scenario.... now I can't for the life of me remember what I was thinking. I'll think a bit more over the weekend and comment again if it comes back to me.
DeleteHi David,
ReplyDeleteI was trying to see in the FileUtils.remove_entry_secure, where it sets $! to nil.
http://www.ruby-doc.org/stdlib-1.9.3/libdoc/fileutils/rdoc/FileUtils.html#method-c-remove_entry_secure
I'm unable to set $! to nil myself, so was curious where this code lives.
$ ruby -e '$! = nil'
-e:1:in `': $! is a read-only variable (NameError)
Cheers,
Phil
Hi mate. Yeah I haven't looked to see where (or how) the code sets $!. If it's not in the Ruby source it's probably in the C source. It's definitely happening somewhere in there though, have a look at this:
ReplyDeleterequire 'fileutils'
at_exit{ p $!; FileUtils.remove_entry_secure '/tmp/blaa'; p $! }
Dir.mkdir '/tmp/blaa'; exit 7
When run produces this:
#<SystemExit: exit>
nil
Because $! is thread-local magic variable, this works and preserves the original exception object:
ReplyDeletemodule Kernel
def at_exit_preserving_exit_status
at_exit { Thread.new { yield }.join }
end
end