See Stefanos Zachariadis speak at JAX DevOps on 5 April 2017 in London!

“Nobody expects that value there!” Making sure your acceptance tests work properly

Stefanos Zachariadis

We’re super excited for Stefanos Zachariadis’ talk at JAX DevOps. Here, he explains how to make sure your acceptance tests work properly, using FinTech examples.

It’s 2017. You’re asked to implement a feature in your team. The feature is distilled into a story with acceptance criteria. You start by writing some honest end-to-end acceptance tests and then test drive the development of the implementation. One iteration later, all the tests are passing and the feature goes into production. Where it then proceeds to fail. Excuses are thrown around. Sound familiar? The development process above sounds almost textbook. So what could have gone wrong?

setUp to fail

Let’s assume we’re developing a piece of software for retail banking. What could have gone wrong with our release that our acceptance tests in our CI environment did not catch?

Making our data invalid

Let’s assume that we model bank accounts as follows:

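 enum AccountType {
      PRIVATE // the only value named in the text; other values elided
 }

 public class Account {
      private final String name;
      private final AccountType type;
      // constructor and accessors elided
 }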

Now, our bank decides to get out of the private banking business. We then proceed to engage ourselves in one of the most satisfying experiences a developer can have: deleting code. Our code no longer supports the creation of PRIVATE AccountTypes and we deleted the enum value and supporting code. And of course, any acceptance test that used to work with private banking accounts is gone. Hey, we don’t support this type of account any more.

Except that we forgot that there are still private banking accounts in production, so as soon as we release the code, reporting starts failing with exceptions!

Nobody expects that value there!

In our domain, we’re assuming that every account has a name. That’s the name you see when you log in to your online banking portal. Armed with this assumption, we add an equals() method to our Account. Unfortunately, one of the previous revisions had a bug that under some circumstances created accounts with null names; we fixed the bug a long time ago, so we can no longer create null name accounts in our CI environment. Or perhaps an account name was optional in the distant past. Either way, there are now accounts in production with null name fields, which are impossible to create in our CI environment. As soon as we release our code and equals() is run on the right account, it fails with a NullPointerException.
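To make the failure concrete, here is a minimal sketch. The article does not show its equals() implementation, so everything below is an assumption: `NaiveAccount` calls `name.equals(...)` directly, while `NullSafeAccount` uses `java.util.Objects.equals`, which tolerates null on either side. All class names are hypothetical.

```java
import java.util.Objects;

// All names here are hypothetical; the article does not show its equals() code.
final class EqualsDemo {

    // Assumes name is never null, as the domain model promised.
    static final class NaiveAccount {
        private final String name;

        NaiveAccount(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof NaiveAccount)) {
                return false;
            }
            // Throws NullPointerException when this.name is null.
            return name.equals(((NaiveAccount) other).name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Tolerates legacy accounts whose name is null.
    static final class NullSafeAccount {
        private final String name;

        NullSafeAccount(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof NullSafeAccount)) {
                return false;
            }
            return Objects.equals(name, ((NullSafeAccount) other).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // True if comparing two null-name accounts with the naive equals() throws.
    static boolean naiveEqualsThrows() {
        try {
            new NaiveAccount(null).equals(new NaiveAccount(null));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("naive equals throws: " + naiveEqualsThrows());
        System.out.println("null-safe equals: "
                + new NullSafeAccount(null).equals(new NullSafeAccount(null)));
    }
}
```

The point is not that the null-safe variant is the right fix for every domain, only that the naive variant cannot be exercised in CI if CI can no longer produce a null name.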

Did our migration work?

Our bank is expanding! It used to be the case that every account was implicitly a USD account, but we are now going to support accounts of different currencies. We modify our account to look like

 public class Account {
      private final String name;
      private final AccountType type;
      private final CurrencyCode currency; // new: previously implicit USD
      // constructor and accessors elided
 }

We migrate (or perhaps have our DBA migrate) all production accounts to have a default currency code of USD, as they were all created when USD was the only account currency supported. How do we actually know that our migration worked? How can we rely on the fact that all our accounts have a non-null USD currency code? After all, we cannot create null currency code accounts in our CI environment.

This is just a small sample of problems that traditional CI may miss. So how can we avoid them?

The problem with tests

Software tests are the best means of defence we have against bugs and regressions. And in modern software development, passing tests around a new feature is how we know that feature is done and working. But tests have limitations. Here’s what a typical test for our banking software could look like, assuming it’s an end-to-end test:

public void getsTheCorrectBalanceAfterADeposit() {
     Account account = new Account();

     account.deposit(200); // deposit(...) is an assumed method name

     assertEquals(200, account.getBalance());
}

Basically, given an account, when I deposit $200 into it, then the balance of the account is $200. This is nice and simple, right?

Software is complicated

Unfortunately, real systems are rarely that simple. Take an example system that has 200 boolean variables. Even such a trivial system has 2^200 possible states it can be in. This is a huge number, larger than the number of stars in the known universe. A system also does not stay static: we all like frequent releases, and for very good reasons, but frequent releases also mean frequent exposure to change and therefore new risk.
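To put that number in perspective, a quick calculation shows that 2^200 has 61 decimal digits, i.e. it is roughly 1.6 × 10^60. The class and method names below are hypothetical; this is only a sketch of the arithmetic.

```java
import java.math.BigInteger;

final class StateCount {

    // A system with n independent boolean variables can be in 2^n states.
    static BigInteger states(int booleans) {
        return BigInteger.valueOf(2).pow(booleans);
    }

    public static void main(String[] args) {
        BigInteger states = states(200);
        System.out.println(states + " (" + states.toString().length() + " digits)");
    }
}
```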

Our hypothetical system has 200 booleans that are interpreted in a certain way at a particular release. So as if 2^200 possible states weren’t hard enough on their own, we now need to add a time dimension to that complexity. This is because most systems produce durable data and that data needs to be read and dealt with sensibly in any future release. After all, you wouldn’t want to lose the money in your bank account when your bank releases a new version of their software.

Continuous delivery & durable data make it even harder

Our given-when-then test above shows us that the system works when I’ve just created a bank account and deposited $200. In reality, the balance in your account is the result of a sequence of operations you’ve instigated over a long period of time, actions other actors (e.g. direct debits) have performed, system releases and data migrations. Real data (your bank account) is therefore the result of a very complex set of interactions that testing is unlikely to cover. Testing cannot be exhaustive: our systems are far too complex, and they produce and amend data over long periods and across continuous releases.

Durable data over multiple releases means that data is being produced and added to by multiple versions of your software. A CI environment and continuous delivery pipeline by definition test a single snapshot in time: the software revision / commit that we’re deciding whether to release to production. So even if you somehow think you can do exhaustive testing, it’s impossible to address the time dimension using standard CI acceptance tests: to address it, you’d need to run the acceptance tests for your nth release on top of data generated by the tests from the previous n - 1 releases. Clearly this does not scale. Your new release is therefore going to touch data generated by sequences of operations that are not tested in your CI environment, no matter what you do!

What we’d like to test

There are three data attributes we’d like to reason about:

Data validity

Validity is an indication of the system’s correctness. An invalid state can lead our system to catastrophic failure (e.g. exceptions) or to undefined behavior that’s not predicted by our tests. There are two aspects to validity: 1) all data that could be loaded at runtime should load without error, which tests for things like unexpected (even corrupt) data that the system can no longer read; and 2) each application has its own business-specific criteria for validity, which can be asserted on. In our fictional banking software, it may be that no account can have a balance below its (negative) overdraft limit.

These tests could be expressed rather simply:

public void allAccountsAreReadable() {
     final AccountDao accountDao = new AccountDao();
     for (User user : users) {
          // loading must succeed, without throwing, for every user
          assertNotNull(accountDao.loadAccountFor(user));
     }
}

public void allAccountsHaveABalanceThatsAboveTheOverdraftLimit() {
     final AccountDao accountDao = new AccountDao();
     for (User user : users) {
          Account account = accountDao.loadAccountFor(user);
          assertTrue(account.getBalance() > NEGATIVE_OVERDRAFT_LIMIT);
     }
}

Data invariance

The state of the system before and after a release should be the same (there may be some exceptions, of course). Given that data is the result of a complex set of operations that we have no hope of fully replicating in our CI environment, we can’t really make concrete assertions about what individual data values should be. To put this another way, if you’re a customer of our fictional bank, we can’t write an assertion on exactly what your balance should be. What we can do is compare revisions: if revision X of our banking software is out in production and we’re taking revision Y through our continuous delivery pipeline, then in a controlled environment revision Y should report the same balance as revision X. In code, this could look as follows:

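public void accountBalanceIsMaintained() {
     final AccountDao accountDao = new AccountDao();
     for (User user : users) {
          Account account = accountDao.loadAccountFor(user);

          // productionBalanceFor(...) is a hypothetical helper: the balance
          // reported for this user by the revision currently in production
          BigDecimal productionBalance = productionBalanceFor(user);

          assertEquals(productionBalance, account.getBalance());
     }
}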

More generally, invariant testing involves the following:

  1. capture invariants
  2. perform action
  3. verify invariants

In our case, the action is deploying the new version of the system in a controlled environment.
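The three steps above can be sketched generically. This is a minimal illustration under stated assumptions, not the author's implementation: balances are held in an in-memory map keyed by user id, the "release" is modelled as a function over that map, and all names (`InvariantCheck`, `brokenInvariants`, the two sample releases) are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class InvariantCheck {

    // Returns the ids of users whose balance changed or vanished after the action.
    static List<String> brokenInvariants(Map<String, BigDecimal> balances,
                                         UnaryOperator<Map<String, BigDecimal>> action) {
        // 1. capture invariants
        Map<String, BigDecimal> before = new LinkedHashMap<>(balances);
        // 2. perform action (on a copy, so the captured values stay untouched)
        Map<String, BigDecimal> after = action.apply(new LinkedHashMap<>(balances));
        // 3. verify invariants
        List<String> broken = new ArrayList<>();
        for (Map.Entry<String, BigDecimal> entry : before.entrySet()) {
            BigDecimal now = after.get(entry.getKey());
            // compareTo ignores scale, so 200 and 200.00 count as the same balance
            if (now == null || now.compareTo(entry.getValue()) != 0) {
                broken.add(entry.getKey());
            }
        }
        return broken;
    }

    static Map<String, BigDecimal> sampleBalances() {
        Map<String, BigDecimal> balances = new LinkedHashMap<>();
        balances.put("alice", new BigDecimal("200"));
        balances.put("bob", new BigDecimal("-35.50"));
        return balances;
    }

    // A release that only changes the representation (scale), not the value.
    static UnaryOperator<Map<String, BigDecimal>> harmlessRelease() {
        return balances -> {
            balances.replaceAll((user, balance) -> balance.setScale(2));
            return balances;
        };
    }

    // A release with a bug that silently drops the fractional part.
    static UnaryOperator<Map<String, BigDecimal>> buggyRelease() {
        return balances -> {
            balances.replaceAll((user, balance) -> balance.setScale(0, RoundingMode.DOWN));
            return balances;
        };
    }

    public static void main(String[] args) {
        System.out.println("harmless: " + brokenInvariants(sampleBalances(), harmlessRelease()));
        System.out.println("buggy:    " + brokenInvariants(sampleBalances(), buggyRelease()));
    }
}
```

Collecting every broken invariant, rather than stopping at the first, is a design choice for this sketch: with production-sized data it is usually more useful to see the full list of affected accounts.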

Migration integrity

At some point in a system’s life, we’re going to change the way the data is stored. Put another way, a continuously delivered system cannot have data persistence that’s set in stone. One way or another we’re going to have to migrate our data from the current representation in production to a different representation. If our bank is expanding to include accounts in different currencies as explained above, how can we make sure that the migration of all existing accounts to be explicitly USD accounts happens correctly? Well, here’s a test:

public void allAccountsAreInUSD() {
     final AccountDao accountDao = new AccountDao();
     for (User user : users) {
          Account account = accountDao.loadAccountFor(user);
          assertEquals(CurrencyCode.USD, account.getCurrency());
     }
}

What’s stopping us from writing these kinds of tests? The missing link is data: real, production data. Despite agile development techniques, continuous delivery, iterating over requirements and implementation, test driven development, CI and the like, our code tends to touch production data for the first time at the time of release. This is as late as possible and smells an awful lot like waterfall development.

If it’s painful, do it early and often

If we were able to get production data into a CI environment, we could import it into a database or a running system and write these kinds of tests.

Unfortunately, this would probably be illegal. If you’re a customer of our fictional bank, you wouldn’t want your details to be available in a CI environment for all devs to see. But what if we could get something as close as possible to production data into CI, but something without personally identifying information?

Data sanitization to the rescue

The solution to this problem is to include a data sanitization service and ship it to production along with the rest of the code. The service, a cleanser, is responsible for taking a copy of production data, removing personally identifying information, and shipping it to the CI environment. In a database environment, this could be done as follows:

  • take a database slave out of the pool
  • run code that removes personally identifying information. For example all names could change to variations of First name, Last name. This is the sanitization or cleansing process.
  • archive the cleansed result and upload it to CI
  • undo / rollback the changes and return the database to the pool

The cleanser is an integral part of our application and should be deployed with it. This is because new versions of the application may include new types of sensitive data and the cleanser needs to be updated to sanitize it.
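The cleansing step in the list above can be sketched as follows. This is an illustration only: the row shape (`id`, `name`, `balanceCents`) and all names are hypothetical, and it works on an in-memory list rather than a real database. One deliberate choice in this sketch is to keep null names as null, since odd production values are exactly what the data tests need to see.

```java
import java.util.ArrayList;
import java.util.List;

final class Cleanser {

    // Hypothetical row shape: the name is personally identifying, the rest is not.
    static final class AccountRow {
        final long id;
        final String name;
        final long balanceCents;

        AccountRow(long id, String name, long balanceCents) {
            this.id = id;
            this.name = name;
            this.balanceCents = balanceCents;
        }
    }

    // Returns a sanitized copy; the input rows are left unmodified.
    static List<AccountRow> sanitize(List<AccountRow> rows) {
        List<AccountRow> cleansed = new ArrayList<>();
        int n = 1;
        for (AccountRow row : rows) {
            // Replace real names with variations of "First name, Last name",
            // but preserve nulls so that legacy data shapes survive cleansing.
            String name = row.name == null ? null : "First name " + n + ", Last name " + n;
            cleansed.add(new AccountRow(row.id, name, row.balanceCents));
            n++;
        }
        return cleansed;
    }

    public static void main(String[] args) {
        List<AccountRow> production = new ArrayList<>();
        production.add(new AccountRow(1, "Jane Doe", 20000));
        production.add(new AccountRow(2, null, -3550));
        for (AccountRow row : sanitize(production)) {
            System.out.println(row.id + " | " + row.name + " | " + row.balanceCents);
        }
    }
}
```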

The migration blues

A cleanser will allow us to ship our data from production to our CI environment. But can the latest version of our software in CI read that data? It may be able to, but what if the current version assumes a different data structure, because, for example, we’re now supporting multiple account currencies in our retail banking software? If only we could migrate the data to the latest schema.

The solution presents itself:

Data is owned by the application and therefore data migration is integral to it. This means that the migration process is owned by the application, and new migrations should ship with the application and be performed with the deployment of every new version. This allows for a CI process that does the following:

  1. Import the sanitized data that the cleanser produced
  2. Capture invariants (e.g. account balances)
  3. Migrate it to the latest version of the application
  4. Run the tests
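The four steps above can be sketched as a single routine. Everything here is an assumption made for illustration: accounts are in-memory objects rather than database rows, the migration is passed in as a function, and the names (`DataPipeline`, `migrateToUsd`, `run`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class DataPipeline {

    static final class Account {
        final String id;
        final long balanceCents;
        final String currency; // null before the multi-currency migration

        Account(String id, long balanceCents, String currency) {
            this.id = id;
            this.balanceCents = balanceCents;
            this.currency = currency;
        }
    }

    // The migration under test: every existing account becomes an explicit USD account.
    static List<Account> migrateToUsd(List<Account> accounts) {
        List<Account> migrated = new ArrayList<>();
        for (Account account : accounts) {
            migrated.add(new Account(account.id, account.balanceCents, "USD"));
        }
        return migrated;
    }

    // Runs steps 2-4 on already imported, sanitized data; returns the failures found.
    static List<String> run(List<Account> imported, UnaryOperator<List<Account>> migration) {
        // 2. capture invariants (account balances)
        Map<String, Long> balancesBefore = new HashMap<>();
        for (Account account : imported) {
            balancesBefore.put(account.id, account.balanceCents);
        }
        // 3. migrate to the latest version of the application
        List<Account> migrated = migration.apply(imported);
        // 4. run the data tests
        List<String> failures = new ArrayList<>();
        if (migrated.size() != imported.size()) {
            failures.add("account count changed");
        }
        for (Account account : migrated) {
            if (!"USD".equals(account.currency)) {
                failures.add(account.id + ": currency is " + account.currency);
            }
            Long before = balancesBefore.get(account.id);
            if (before == null || before.longValue() != account.balanceCents) {
                failures.add(account.id + ": balance changed");
            }
        }
        return failures;
    }

    // 1. stands in for importing the sanitized data that the cleanser produced
    static List<Account> sample() {
        return Arrays.asList(new Account("a1", 20000, null), new Account("a2", -3550, null));
    }

    public static void main(String[] args) {
        System.out.println("failures: " + run(sample(), DataPipeline::migrateToUsd));
    }
}
```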

Volumes and frequency

This approach sounds good in principle, but how often should you do it? When should the data be refreshed? How frequently should the data tests run in CI? What do you do if you have terabytes of data?

As is often the case with difficult questions, the answer is: it depends. In principle, we want to do these things as often as possible. This includes refreshing data and running the tests in CI. After all, frequent refreshes and test runs have a chance of catching bugs earlier. But often we have to be pragmatic. Here are some techniques to consider:

  • Sampling: If your data volumes are just too large, consider making the cleanser sample the data that it sends over, rather than sending the whole lot.
  • Incremental updates: Rather than sending the complete volume of data every time, consider sending an incremental update, through a diff mechanism.
  • Rolling back: Consider rolling back after a test run – this should make the next run much faster.
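As one illustration of the sampling technique above, the cleanser could pick accounts deterministically from their id, so that every refresh selects the same accounts. This is a sketch under assumptions: numeric account ids and the names `Sampler`, `isSampled` and `sample` are hypothetical, and if real ids are not evenly distributed, hashing the id first would be a reasonable variation.

```java
import java.util.ArrayList;
import java.util.List;

final class Sampler {

    // Keeps roughly `percent` out of every 100 ids; the same id always gets the same answer.
    static boolean isSampled(long accountId, int percent) {
        // floorMod keeps the bucket in 0..99 even for negative ids
        return Math.floorMod(accountId, 100L) < percent;
    }

    static List<Long> sample(List<Long> accountIds, int percent) {
        List<Long> sampled = new ArrayList<>();
        for (Long id : accountIds) {
            if (isSampled(id, percent)) {
                sampled.add(id);
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 0; id < 1000; id++) {
            ids.add(id);
        }
        System.out.println("sampled " + sample(ids, 10).size() + " of " + ids.size());
    }
}
```

A stable sample also plays well with incremental updates, since the set of accounts being refreshed does not change between runs.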

A note on staging

If you’ve implemented this approach, consider using it in your staging environment too. Once you have real (sanitized) production data, why not import it into your staging environment prior to release? That way you will end up with an environment that’s as close as possible to production, both in topology and in data, running the code that you wish to release to production.


Data testing can help you catch a whole new category of bugs and errors before your users do. We hope that this article outlined some approaches for doing this.


Stefanos Zachariadis will be delivering a talk at JAX DevOps focusing on how to avoid those pesky acceptance errors and how to catch them before your users find them.


Stefanos Zachariadis

Stefanos loves to code and has done so professionally for over 12 years. His career has taken various twists and turns: from academia to writing satellite software for the European Space Agency, flight search software for a major airline, test automation for various banks and steam turbine design software, leading up to writing low latency code that can process 50,000 orders a second on a single CPU core as a team lead for LMAX Exchange, the pioneers of Continuous Delivery. He is now an independent developer through motocode ltd and the coder behind CycleMaps, one of the leading cycling apps in the UK. His hobbies include more development, music, photography and backpacking.


Find him on twitter at @thenewstef.
