Dot, dot, dot, dot, dot — tests are passing, looks like it’s time for lunch — dot, dot, dot, dot, F. F? F? But the code works. I know it does. I think it does. Why is my test failing?
One of the most frustrating moments for a TDD developer is when a test fails and you don’t know why, as opposed to the more normal case where the test fails as expected. Here’s a grab bag of tips, tricks, hints, and thoughts to get us all through that difficult time.
Something Must Have Changed
This may be the most obvious piece of advice in the history of ever, but I find it’s worth repeating, mantra-like, when confronted with a bad bug:
When a formerly-passing test fails, it means something changed.
It may be in the code, or the system, or the test. But it’s probably not sunspots, and it’s probably not evil spirits possessing your MacBook. (Unless you are either living in a Charles Stross novel or writing Perl, but I digress…)
Looking through recent changes can help figure out what the cause of the failure is. Git’s bisect tool can do this automatically, or you can just look through recent changes in your source control viewer of choice. If the test was passing at one time, there’s a good chance the answer is in there somewhere.
This is a great argument in favor of committing to your source control very, very frequently (especially when you are using git and can do local commits), so that your changes are very granular.
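As a rough sketch of the automated bisect route mentioned above: git bisect run takes a command and reads its exit status, where 0 marks the commit as good, 125 means "this commit can't be tested, skip it", and any other code from 1 to 127 marks it as bad. The Ruby script below wraps a test run in that convention. The file name bisect_check.rb, the test path, and the helper names are made up for illustration and are not from any real project.

```ruby
# bisect_check.rb (hypothetical name): a checker script for `git bisect run`.

# Map the return value of Kernel#system onto git bisect's exit codes.
#   true  -> command exited 0        -> 0   (commit is good)
#   false -> command exited non-zero -> 1   (commit is bad)
#   nil   -> command could not run   -> 125 (skip this commit)
def bisect_status(system_result)
  case system_result
  when true  then 0
  when false then 1
  else 125
  end
end

# Run one test file and translate the outcome for git bisect.
def run_check(test_file)
  bisect_status(system("ruby", "-Itest", test_file))
end

# To use this as a script, uncomment the next line.
# The default path is an assumption; point it at your failing test.
# exit run_check(ARGV[0] || "test/unit/user_test.rb")
```

With the last line enabled, a session would look something like this, where the two commits are whatever you know to be bad and good:

    git bisect start <bad-commit> <good-commit>
    git bisect run ruby bisect_check.rb
    git bisect reset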
Isolate
When looking at a small number of failing tests, it’s helpful to be able to run just those tests. Autotest is outstanding for this, since it will run the failing tests over and over until they pass. This is especially helpful if you have a number of failing tests that are not in the same test class.
This little code and terminal snippet is very helpful for quickly running one class at a time, which is almost like isolating a failing test, or at least close enough to be useful. Depending on your IDE and test framework of choice, you may also be able to run individual tests from the IDE.
Isolating tests makes the tests run faster when you are focused on just a few tests, and also makes any diagnostics you insert easier to interpret.
Two tips that I’ve stolen from listening to and reading Kent Beck:
- Back out your entire most recent change since your last passing test and start over. This works best if you work in very small increments, but it gets you out of the “I know I typed something wrong but I just can’t see it” nightmare.
- Replace all the expressions in the method under test with literals — if that passes, then put the expressions back one by one until you find the culprit.
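Here is a small sketch of the second tip. The order_total method, its bug, and the numbers are all invented for illustration; each step is written as a separate method only so the stages can sit side by side, whereas in practice you would edit the one method in place. The imaginary test expects a total of 33 for prices of 10 and 20 with a tax of 10 percent.

```ruby
# Step 0: the method as written. The test expects 33 and gets 60.
def order_total_original(prices, tax_percent)
  subtotal = prices.inject(0) { |sum, price| sum + price }
  tax = subtotal + tax_percent / 100 # the typo we can't see: + instead of *
  subtotal + tax
end

# Step 1: replace every expression with the literal the test expects.
# If this passes, the test itself and the final addition are fine.
def order_total_literals(prices, tax_percent)
  subtotal = 30
  tax = 3
  subtotal + tax
end

# Step 2: put the subtotal expression back. The test still passes,
# so the subtotal calculation is not the culprit.
def order_total_subtotal_restored(prices, tax_percent)
  subtotal = prices.inject(0) { |sum, price| sum + price }
  tax = 3
  subtotal + tax
end

# Step 3 would restore the tax expression, which brings back Step 0 and
# the failure, so the bug has to be on the tax line.
p order_total_original([10, 20], 10)          # 60
p order_total_literals([10, 20], 10)          # 33
p order_total_subtotal_restored([10, 20], 10) # 33
```

The point is less the arithmetic than the narrowing: each restored expression either keeps the test green or turns it red, and the first one to turn it red is where to look.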
Diagnose
I have to say, I’m not a big fan of stop-and-step debuggers. I’ve used them inside an IDE and mostly found that not to be a great experience, and I’ve never really used the Rails command line debugger.
Normally, to diagnose what’s going on in a test, I either add assertions to the test or have the code print information to the console. When I diagnose via assertions, I’m generally checking the values of variables in more detail.
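To make the assertion approach concrete, here is a minimal sketch using Minitest, which is my choice for the example and not something the rest of this post depends on. The Cart class and its numbers are hypothetical. The extra assertions check intermediate values on the way to the one the test really cares about, so if the last assertion fails, the first diagnostic to fail shows where things went wrong.

```ruby
require "minitest/autorun"

# Hypothetical code under test: a cart that takes 10% off orders of 100 or more.
class Cart
  attr_reader :prices

  def initialize(prices)
    @prices = prices
  end

  def subtotal
    prices.inject(0) { |sum, price| sum + price }
  end

  def discount
    subtotal >= 100 ? subtotal / 10 : 0
  end

  def total
    subtotal - discount
  end
end

class CartTest < Minitest::Test
  def test_total_with_discount
    cart = Cart.new([60, 40])

    # Temporary diagnostic assertions: each one pins down an intermediate
    # value that feeds into the total.
    assert_equal [60, 40], cart.prices
    assert_equal 100, cart.subtotal
    assert_equal 10, cart.discount

    # The assertion the test is really about.
    assert_equal 90, cart.total
  end
end
```

Once the culprit is found, the diagnostic assertions can either come back out or, if they turned out to be checking something worthwhile, become tests of their own.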
For some reason, I see a lot of people using Ruby’s puts method to write to the console. I recommend p, which calls inspect on the object before printing and generally results in more informative output. As a matter of course, I add require 'pp', which lets me use pp to get pretty-printed output, which is nice for nested data structures. Also, y gives a YAML representation of the object, which is very readable for ActiveRecord objects.
>> x = {1 => ['a', 'b'], 2 => 'c'}
>> puts x
1ab2c
>> p x
{1=>["a", "b"], 2=>"c"}
>> pp x
{1=>["a", "b"], 2=>"c"}
>> y x
---
1:
- a
- b
2: c
Especially if I have autotest running just the one test, I’ve been known to bury print statements all over the place — controllers, Rails itself (often educational). Just remember to take them out when you are done.
Clear Your Head
Take a walk. Force your pair to solve the problem. Get a cup of coffee (actually, I hate coffee, get a Diet Coke). Take a nap. All those silly clear your head things really do work sometimes.
Band Aids
It’s tempting sometimes to comment out the offending test so that your suite passes and all is well with the world again. That’s generally a bad idea (although sometimes a major refactoring can genuinely make tests obsolete).
Hope this helps. What do you do to fix stubborn tests?


Wow, this brought back some bad memories from last night. On my Moqueue project (mock objects for the amqp library, it’s a bear to test otherwise) I’m using the examples from the amqp library as integration/acceptance tests. One of these has a Logger class that demonstrates fanout queues, and it suddenly started failing, which confused the (perl code here) out of me. The culprit was wycats/bundler which was pulling “logger” in from stdlib when rubygems got required.
The moral: be very thorough when thinking about what’s changed since the test passed.
If a test is sometimes failing and other times passing depending on how it’s run, it’s likely due to some state left over from a previous test.
To help debug this, try running your tests in reverse order. With rspec you can use the “-R” option.
As a rule of thumb, every test should start with the same clean environment. Ensure the setup/teardown hooks are removing any changes left over from a previous test.
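Here’s a toy illustration of that in plain Ruby, with no test framework so it stays self-contained. The Registry class, the two tests, and the run_suite helper are all hypothetical; in a real suite the reset would live in your framework’s setup hook. It shows a test that fails in one order, passes in reverse, and passes in both orders once every test starts from a clean slate.

```ruby
# Hypothetical class-level state shared by every test in the process.
class Registry
  @items = []

  class << self
    attr_reader :items

    def add(item)
      @items << item
    end

    def reset!
      @items = []
    end
  end
end

# Two toy tests, each returning true when it passes.
TESTS = {
  "adds one item" => lambda { Registry.add(:a); Registry.items.size == 1 },
  "starts empty"  => lambda { Registry.items.empty? }
}

# Run the named tests in order, calling the optional setup hook before each.
# Returns a hash of test name => passed?
def run_suite(names, setup = nil)
  Registry.reset! # stands in for starting a fresh process
  names.each_with_object({}) do |name, results|
    setup.call if setup
    results[name] = TESTS[name].call
  end
end

forward = TESTS.keys
reverse = forward.reverse
clean_slate = lambda { Registry.reset! }

p run_suite(forward)              # "starts empty" fails: it sees the leftover :a
p run_suite(reverse)              # both pass, so the failure depends on order
p run_suite(forward, clean_slate) # both pass once setup resets the state
p run_suite(reverse, clean_slate) # and the order no longer matters
```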
Nice post. This scenario is rarely encountered when using continuous integration – the offending commit is caught within seconds so you still have some fresh context. Running cruisecontrol for Rails and Java projects has all but removed this issue.
On projects where there isn’t a little robot friend checking all the commits, a tool like fisheye is fantastic for walking history and looking at the diffs. I used this quite a bit on a highly distributed team while cruise was down (being moved).
@David Hainlin:
It’s also avoidable when running autotest/autospec. You don’t even have to commit; just save a file and watch your tests crash.
–
“Tests failing and app working” is something that regularly happens to me. Each time, I think “oh no, please tests, you’re supposed to save my time”.
I think the biggest problem is the discordance of environments. It’s especially true when doing integration testing with cucumber/webrat. We don’t actually test our app, we test our test-app, which can have a really different environment from the production one.
I have thought about this several times, without finding a perfect solution. Maybe something like using git to branch the app in a test branch, and duplicating dev db in test db without using fixtures or mocks for integration tests, testing more the value change between actions than the value itself…