Serverless Continuous Integration

The source of all truth, Wikipedia, defines Continuous Integration as “the practice of merging all developer working copies to a shared mainline several times a day.” In practice this typically translates to something like the following:

  • a git master branch
  • a build server that monitors that branch
  • a test that runs on every commit to this branch
  • a test report that gives the build a green or red label

That is usually the gist of it. Sometimes there are also additional health checks, like coverage or style guides, which can be regarded as part of the overall test that green-lights the build or not.

Continuous Integration is a fantastic concept to keep a project “in the green”. Surely everyone would agree that in an ideal world we would always have one watching our backs.

The problem, however, is that this build server doesn’t come for free. There’s always this extra step, this extra task, to get one up and running when all you want to do is get started with your project.

Wouldn’t it be great if you could have a build server without the server bit?

Actually, you can and it’s something I’ve been using in one way or another since about 2004 but never really thought of in the sense that it resembles a CI “system”, mainly because it seems quite trivial on the face of it. However, I haven’t come across this elsewhere so maybe there is something worth talking about.

What I’m referring to is the practice of logging your test results to a test.log file and committing this file as part of your changes.
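In its simplest form that looks something like this (a rough sketch; the test command and paths are just placeholders for whatever your project uses):

# Run the tests and capture their output in the log
nosetests -v 2>&1 | tee test.log

# Commit the log together with the change it belongs to
git add data_loader/ test.log
git commit -m "Add data loader; tests pass, see test.log"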

Why commit test results?

What a CI system does is tell you if a build is ok, and it does so not just now but also when you come back and want to know if this particular version worked at some point in the past. This is very valuable not just because it’s faster than going back to an old version and running the tests again but also because you may not actually be able to run those tests anymore, for various reasons:

  • your build system is on a newer version and you’d have to revert your whole toolchain to an older version (hello Swift and Xcode), which could be anywhere from tedious to impractical to impossible
  • your build might be for a platform you don’t support and maintain anymore
  • your tests might depend on external fixtures that don’t exist anymore (yes, it’s not advisable to integrate tests with external resources but especially in times before Docker you sometimes had to make do with a test database that was set up for your integration tests)

And this is why a test.log next to your commits is very useful: It tells you what happened when the test was run right there in your version control system, right there on GitHub for example. You don’t even need to look it up in Jenkins, you can see it side by side with your code.
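For example, digging up a past result is just a matter of asking Git for the file as it was back then (the commit id below is made up):

# Show test.log as it was at some past commit
git show 1a2b3c4:test.log

# Or follow how the results evolved over time
git log -p -- test.log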

Credit to this idea goes to my colleague Andrea Valassi who started committing the test result logs of the COOL project we were working on together starting in 2004. COOL is a library to access data relating to the Large Hadron Collider at CERN and works with SQLite, MySQL, and Oracle databases across a range of Unix platforms, and it’s an example of a software package facing some of the test maintenance issues listed above.

As part of the project we committed test logs to the repository, per platform. For example, here’s the one for Red Hat 7.3, compiled with gcc 3.2. Remember when I said you may want to check if something worked a while back on a platform you don’t support or maintain anymore? I can tell you from looking at that file that these tests passed when they were run 11 years ago. And the beauty is it lives with the code.

Think about how many projects have a CI server while they’re actively maintained, only for it to disappear after a while or be replaced by some other system without migrating the test result data. I believe it is quite powerful to have these logs around. I like to think of it as the ASCII text format of CI systems: it will pretty much always be available as long as the source code is available.

It also serves as a sign-off or a proof to collaborators that you’ve run the tests and they worked when you did so. No need to wait for another Jenkins run after committing to tell you there was something wrong, just send the “receipt” along with your commit.

This works great even if this collaborator is just your future self. No more head scratching about whether you ran those tests last month and what the results were, even when it’s a small project that doesn’t really warrant setting up a CI server.

Caveats

This does not, however, replace a full-blown CI system when things get more complex, because it comes with a few caveats:

  • What works for you may not work for others. What the test.log proves is works for me(TM). Your mileage may vary. Your personal build system may be particular to you or you may be relying on test dependencies others don’t have. This is fine if you’re currently the only one working on it and don’t want to go through the hassle of making the tests independent of your set-up.
  • When your tests get really slow (tip: don’t let them) it will get annoying to have to wait for that test run before you commit. Instead of running them less often, now is the time to offload them asynchronously to a build server (or better make them fast again, or both).
  • Log files are just text and while text is a great and timeless format, it is undeniable that the ability to click through test results to test sources is a great boon, especially when your test suite is large and you need to navigate through lots of failures when something goes wrong.

Logging it

Creating a test.log is a simple matter of redirecting stdout and stderr to a file. For example, I use the following in a Python project:

#!/bin/bash
LOGFILE="test.log"
# Record when and where the tests were run
date 2>&1 | tee "$LOGFILE"
uname -a 2>&1 | tee -a "$LOGFILE"
# Make sure a failing nosetests run isn't masked by tee's exit status
set -o pipefail
nosetests -v --with-id 2>&1 | tee -a "$LOGFILE"

which creates the following log file:

Wed Apr  6 09:49:49 UTC 2016
Linux 0dc61e369c7c 4.1.19-boot2docker #1 SMP Mon Mar 7 17:44:33 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
#124 test_01_validate_data (data_loader.tests.test_load_file.TestDataLoader) ... ok
...

So at a glance I know when this was run, on what platform, and which tests passed.

Note that the set -o pipefail command ensures errors in a pipeline aren’t masked. In this case it makes sure that if nosetests fails, its failure isn’t masked by tee returning 0. This matters less if you are only interested in the script’s output to test.log. However, if you also want to run an actual CI server from the same script, you will need the nosetests return value to indicate success or failure.
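To illustrate, here’s a rough sketch of a CI job reusing that script (assuming it’s saved as run_tests.sh, a name I’m making up here):

#!/bin/bash
# Thanks to pipefail, the script's exit status reflects whether
# nosetests passed, so it can drive a build status directly.
if ./run_tests.sh; then
    echo "Build green"
else
    echo "Build red" >&2
    exit 1
fi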

Enforcing it

A CI system is only useful when it’s actually running and this naturally also applies to test.log. With a little discipline you need nothing more than a simple test runner script to log your results. If you don’t want to rely on discipline to run the tests with each commit you can add a simple pre-commit hook to .git/hooks:

#!/bin/sh
# A pre-commit hook to require the presence of a test log file in the commit.

LOGFILE="test.log"

# Check whether the staged changes include the test log
if git diff --cached --name-only | grep --quiet "$LOGFILE"
then
    exit 0
else
    echo "Error: $LOGFILE needs to be part of the commit." >&2
    exit 1
fi

In case you want to commit something without running the hook, use the --no-verify or -n option of git commit:

git commit -n ...

You could also do a similar thing only when merging or pushing changes, or use any of the other Git hooks available.
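As an illustration, here’s a rough sketch of a pre-push hook along those lines (the *.py pattern and the file-modification-time heuristic are just assumptions to keep the example short):

#!/bin/sh
# .git/hooks/pre-push - sketch: refuse to push when test.log looks stale,
# i.e. some source file was modified after the log was written.
# (Staleness by modification time is only a rough heuristic.)

LOGFILE="test.log"

if [ ! -f "$LOGFILE" ]; then
    echo "Error: $LOGFILE not found, run the tests first." >&2
    exit 1
fi

# Any Python source newer than the log means the tests should be rerun
stale=$(find . -name '*.py' -newer "$LOGFILE" -not -path './.git/*' | head -n 1)
if [ -n "$stale" ]; then
    echo "Error: $stale is newer than $LOGFILE, rerun the tests before pushing." >&2
    exit 1
fi

exit 0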

This concept of “Serverless Continuous Integration” has served me well over the years and it’s something I set up as soon as I have more than a handful of tests around. The effort is so low there’s simply no good reason not to do it. The same can’t quite be said for Jenkins and friends.