Thoughts On Unit Testing

In a recent seminar at our company we talked about unit testing and during the discussion that ensued I found that I had a few things to say about the topic. There’s probably no wrong or right here (apart from the fact that you must test!) but there are things that I found work well in practice and others that really only look good on paper. At lot of it has to do with how you end up working, especially when under pressure, and, of course, to some extent with personal preference.

With that out of the way, the following are my observations collected over the course of some relatively large projects.

Don’t bloat test numbers

A lot of IDEs these days provide automation for tasks that are tedious. One of these tasks is setting up your test infrastructure. There are tools that auto-generate tests or test stubs whenever you add a method to a class or something similar. While this is sure to increase your coverage I’m not convinced this is a good idea in the long run.

What’s going to happen is that you generate a large number of trivial tests that you otherwise wouldn’t have written. The kind of trivial tests that come to mind are accessors, for example. What’s the point of testing methods for attribute assignment/read-out? Especially if they’re auto-generated anyway.

Effectively, what you’ll do is make it harder to manage tests overall. For instance, when your test logs are spammed with meaningless succeeding tests you’re more likely to miss a problem. Also, updating expectation values should your inputs change is going to generate a lot of work (maybe to the effect that you’ll hold off on it, which is never good). But most importantly, your test run time is going to increase and in consequence may make you run the full suite less often, especially when you’re in a hurry.

Similarly, if you yourself have written code that auto-generates other code, don’t auto-generate test code as well. Create tests for your generator and make sure it works but don’t spawn code and tests – you’re essentially re-testing the same thing with predictable outcome.

Make sure your tests add value

In the same vein, make sure your tests actually increase the value of your test suite. There’s no point copy-pasting a test and just changing parameters to have yet another test unless it actually ends up running a new code path or testing an edge case (NULL parameters etc).

This may sound trivial but perhaps a less obvious case is a wrapper method that calls a complicated one. Is it really worthwhile to add a test for a trivial method which basically only replicates the complicated test? Blindly adding tests for sake of coverage can lead to the problem of long-running test suites with no or little extra value, as mentioned above. Either use the trivial, higher-level method to test the complicated under-lying one or test the latter directly.

Coverage is great but don’t over-interpret its value. If full coverage leads to a test suite that is so slow you’ll only run it once a month you’ll end up testing less, not more.

Adapt your test strategy to your type of software

There are different kinds of software and along with them come different approaches to unit testing.

I believe the main distinctions to make are the following:

  • library code / APIs
  • faceless application
  • GUI application

Libraries & APIs

Library code is probably the easiest to tackle with traditional unit testing and the one where the strictest rules apply. I would maintain that every public API has to be covered by unit tests, typically including edge case parameters. It’s pretty embarrassing to ship an API and find the advertised calls into your library don’t work. The best way to ensure it does is to be your own (and first!) client by running your test suite against the full set of APIs. It doesn’t stop at the public API, of course, but it’s really the place to start, also to get a feeling if the interface is good. Test driven design works great for libraries.

What makes it easier to maintain full or extensive coverage in library unit testing, is that there’s typically much less state involved than in application code – it’s much simpler to maintain test data.

Faceless Application

A ‘faceless application’ is an application that interfaces to users by some means other than GUI or code, for example file based exchange or network sockets.

Therefore, testing the interface of a ‘faceless application’ is quite different from library or GUI testing. Where in the case of a library you write code to test your interfaces, here you are going to spend much more of your time setting up your test fixtures in the form of files or network services (or mock interfaces for that matter).

I believe in unit testing it really comes down to what the interfaces to your code are. In case of the library, it’s really other code that connects with yours and therefore the best way to test is to write calls into your library. With a faceless application you have quite different interfaces. So while internally, i.e. inside your application, you still use test driven design to cover your internal interfaces to some extent, you also have to address a different test infrastructure to your public interface.

Typically setting up this kind of test infrastructure is quite a bit more complicated than in the case of library code. You’ll probably end up doing some “test triage”: You can’t be everywhere, so to speak, and I believe the most important coverage is that of your public interface.

GUI Applications

Finally, GUI applications. I have not really found a good way to do GUI application unit testing. Naturally one will have “back-end” code that can and must be tested in “traditional” ways but most importantly you really want to have tests in place for your public interfaces, i.e. do common user tasks so you can be sure that you (or rather your test) has successfully clicked all the essential buttons before you ship.

I know there are tools like “Eggplant” that claim to cover this. They may well do so but I’ve never tried any of them for the few (small) GUI applications I’ve written.

What I did try was use AppleScript automation for an Objective-C/Cocoa/OSX application (I’m sure there are similar scripting tools for other platforms/languages) but in the end it was too slow to allow running a big test suite. Also, this approach is limited in its result checking: One can read out controls (“did the expected text appear when I clicked the ‘Transform’ button?”) but obviously result checking is not always that simple.

Actually, I’m not convinced that automated unit testing is really the proper model for GUI applications. One reason for this is that they allow for so much freedom in how you can chain together actions that it’s virtually impossible to instrument all combinations in unit tests. Plus if you do, you may end up being constrained by your tests not to make GUI changes for fear of breaking a huge suite of unit tests.

You may argue that you’re expecting the same “breaking with habit” from your users but GUIs are much more about design and “feel” and you sometimes need to “force” the direction. Legacy unit tests can make you keep a “stale” GUI when you should really move ahead. For library code this is really the other way around: Unit tests ensure that you maintain source compatibility and make you think twice if it’s worthwhile to make an incompatible change because you’re the first one that’s going be hit by it: you’re going to have to update all your tests – the same you’re asking your software developer clients to do. From how much work this update is going to be for you you can judge if the change is really worth it. The defining difference is probably that in one case you have persisted behavior (code) whereas in the other it’s transient – the users’ “muscle memory” or habits.

The solution for GUI applications is probably: Rely on good beta-testers. (See also: http://www.wilshipley.com/blog/2005/09/unit-testing-is-teh-suck-urr.html)