End-to-end tests: not the be-all and end-all
There's more to software quality than a few unit tests and system tests. All levels of your software should be covered, writes Sebastian Bergmann.
Software is never perfect. Software failures make the news and damage the reputation of companies and organizations. A rather spectacular example of such a failure led to the loss of the Mars Climate Orbiter. The mission of this robotic space probe, which cost roughly $655 million, was to study the atmosphere and the climate of our planetary neighbour. On September 23, 1999, the probe approached Mars at the wrong angle and disintegrated. The cause of the loss of the Mars Climate Orbiter was published on September 30, 1999: “The peer review preliminary findings indicate that one team used English units (e.g., inches, feet and pounds) while the other used metric units for a key spacecraft operation. This information was critical to the maneuvers required to place the spacecraft in the proper Mars orbit.”
We can assume that the two teams mentioned in the report above rigorously tested their individual components of the software with unit tests, each in isolation from the other component. Considering the outcome, however, we have to assume that not a single integration test was performed to ensure that the two components work correctly when used together. It appears that these two components collaborated for the first time in the orbit of Mars. This is not an integration test but rather a “disintegration test”.
Dr. Edward Weiler, NASA’s Associate Administrator for Space Science, made an interesting observation in the peer review preliminary findings: “The problem here was not the error, it was the failure of NASA’s systems engineering, and the checks and balances in our processes to detect the error. That’s why we lost the spacecraft.” The cause of the loss of the Mars Climate Orbiter was not just a technical error in the navigation unit’s source code; it was rooted in communication and process problems. Had the two teams communicated with each other, they would have been aware that one team used metric units while the other team used English units. And had the process called for integration testing of the two components in addition to testing each of them in isolation, the mistake would have been exposed as well.
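The difference between the two kinds of test can be made concrete with a minimal sketch. It assumes, purely for illustration, that the mismatched quantity is an impulse that one component reports in pound-force seconds and the other expects in newton-seconds. All function names are hypothetical and are not taken from the actual flight or ground software.

```python
import math

# Exact conversion factor: 1 pound-force second in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


def thruster_impulse(burn_seconds, thrust_lbf):
    """Hypothetical component of team A: returns impulse in pound-force seconds."""
    return burn_seconds * thrust_lbf


def delta_v(impulse_n_s, mass_kg):
    """Hypothetical component of team B: expects impulse in newton-seconds."""
    return impulse_n_s / mass_kg


def delta_v_wired_naively(burn_seconds, thrust_lbf, mass_kg):
    # Connects the two components without converting units.
    return delta_v(thruster_impulse(burn_seconds, thrust_lbf), mass_kg)


def delta_v_wired_with_conversion(burn_seconds, thrust_lbf, mass_kg):
    # Converts pound-force seconds to newton-seconds at the boundary.
    impulse_n_s = thruster_impulse(burn_seconds, thrust_lbf) * LBF_S_TO_N_S
    return delta_v(impulse_n_s, mass_kg)


def unit_checks_pass():
    # Each component is correct when looked at in isolation.
    return (
        math.isclose(thruster_impulse(10.0, 2.0), 20.0)
        and math.isclose(delta_v(100.0, 50.0), 2.0)
    )


def integration_check_passes(wiring):
    # 1 lbf applied for 1 s to a mass of 1 kg must change its velocity
    # by 4.4482216152605 m/s. Only a test across both components sees this.
    return math.isclose(wiring(1.0, 1.0, 1.0), LBF_S_TO_N_S)
```

In this sketch the unit checks pass for both components, yet the integration check only passes for the wiring that converts units at the boundary. The naive wiring is off by a factor of roughly 4.45, and no test of a single component could reveal that.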
One of the most important tasks in software testing is to find the smallest scope in which a test case can be implemented. The smaller the scope in which a test is run, the faster it can be executed and the more precisely its result points to the cause of a failure. Unit Tests test one unit of code in isolation from all collaborators. Integration Tests verify the interaction of two or more collaborators in isolation from the rest of the system. Edge-to-Edge Tests exercise the software as end-to-end as possible in a single process (and without using a web browser or a web server). End-to-End Tests, or System Tests, look at the whole system: in the case of a web application, they send an HTTP request from a web browser to a web server running the software and inspect the HTTP response that is sent back.
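The two smallest scopes can be sketched as follows. The classes `PriceCalculator`, `TaxRates` and `FixedRateStub` are hypothetical and serve only as an illustration: the unit test replaces the collaborator with a stub, while the integration test uses the real collaborator and nothing else from the system.

```python
class TaxRates:
    """Hypothetical collaborator: looks up a tax rate by country code."""

    def __init__(self, rates):
        self._rates = dict(rates)

    def rate_for(self, country):
        return self._rates[country]


class PriceCalculator:
    """Hypothetical unit under test: turns a net price into a gross price."""

    def __init__(self, tax_rates):
        self._tax_rates = tax_rates

    def gross(self, net_cents, country):
        rate = self._tax_rates.rate_for(country)
        return round(net_cents * (1 + rate))


class FixedRateStub:
    """Test double that stands in for TaxRates and always returns 50 %."""

    def rate_for(self, country):
        return 0.5


def unit_test_gross_price():
    # Unit scope: PriceCalculator in isolation from its real collaborator.
    calculator = PriceCalculator(FixedRateStub())
    assert calculator.gross(1000, "XX") == 1500


def integration_test_gross_price():
    # Integration scope: PriceCalculator together with the real TaxRates.
    calculator = PriceCalculator(TaxRates({"DE": 0.19}))
    assert calculator.gross(1000, "DE") == 1190
```

If the unit test fails, the defect is in `PriceCalculator`. If only the integration test fails, the defect lies in `TaxRates` or in the way the two objects work together.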
In a previous installment of this column I wrote that “[a]cceptance tests tell you that you are building the right product by ensuring that the software does what it is supposed to be doing. Unit Tests tell that you are building the product right by ensuring that the code works correctly.”
It is easy, and seductive, to implement acceptance tests using tools that exercise the software in an end-to-end fashion. Teams that are new to testing, or that have to deal with legacy software that is not testable at the unit level, are especially likely to walk into the trap of testing their application’s core domain logic through the frontend. This indirect way of testing is slow and fragile. It is slow because the whole application is executed in a large scope to test an aspect of the application that should be tested in a small scope. It is fragile because the tests for the domain logic have to be adapted when the frontend’s HTML templates change. To make matters worse, these tests are performed in a scope that is so large that a failing test only tells the developer that something does not work, without providing information that points to the root cause. While there is a place for these kinds of tests in the test mix for an application, for instance to test cross-browser compatibility, a team would be ill-advised to rely solely on end-to-end tests, as these are cumbersome to write and maintain, prone to errors, and slow to execute.
Acceptance tests should instead be implemented using edge-to-edge tests. These are easier to write and faster to execute than old-fashioned end-to-end tests. More importantly, they require minimal maintenance and deliver highly reliable results. When the architecture of the software allows for both unit tests and edge-to-edge tests, it will also be easy to adopt the practices of Experiment-Driven Development and Testing in Production that Eric Ries wrote about in his book “The Lean Startup”. The promise of being able to develop both the business model and the software that implements it should be reason enough for enterprises to invest in a modern, highly decoupled software architecture. And when the members of the software development team communicate well, both among themselves and with the other stakeholders, there is not much that can really impede the success of the project.
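What an edge-to-edge test might look like can be shown with a small sketch. It assumes an application whose entry point accepts a request object and returns a response object, so that the test can call it in the same process without a web server or a browser. `Request`, `Response`, `Application` and the `/cart/total` route are hypothetical names chosen for this illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Request:
    method: str
    path: str
    params: dict = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str


def cart_total(prices_cents):
    """Hypothetical domain logic: sums the prices of the items in a cart."""
    return sum(prices_cents)


class Application:
    """Hypothetical front controller: maps a request to a response."""

    def handle(self, request):
        if request.method == "GET" and request.path == "/cart/total":
            prices = [int(price) for price in request.params.get("prices", [])]
            return Response(200, json.dumps({"total": cart_total(prices)}))
        return Response(404, json.dumps({"error": "not found"}))


def edge_to_edge_test_cart_total():
    # The request enters at one edge of the application and the response
    # leaves at the other; no HTTP connection and no HTML is involved.
    response = Application().handle(
        Request("GET", "/cart/total", {"prices": ["250", "350"]})
    )
    assert response.status == 200
    assert json.loads(response.body) == {"total": 600}


def edge_to_edge_test_unknown_route():
    response = Application().handle(Request("GET", "/does-not-exist"))
    assert response.status == 404
```

Because the test asserts on structured data and not on rendered HTML, a change to the frontend's templates does not force a change to the test.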
Sebastian Bergmann is a mastermind of PHP development and PHP quality assurance. He has contributed instrumentally to transforming PHP into a reliable platform for large-scale, mission-critical projects. Companies and PHP developers around the world benefit from the tools that he has written.