Avoiding disaster

How code testing can do more harm than good

Albert Ziegler

There’s nothing worse than repairing a bug and subsequently introducing four new ones. In this article, Albert Ziegler explains that while testing can improve certain areas of code, out of sight, other areas can be impacted and languish.

Reworking your code is a recommended and often necessary strategy, but it can also be dangerous.

There’s nothing worse than repairing a bug and subsequently introducing four new ones. If the bugs turn out to be security vulnerabilities, then the stakes are even higher because that could expose your applications and data to remote exploits and lead to data breaches and other attacks. Automated tests are widely used to prevent exactly that, but while testing can improve certain areas of code, out of sight, other areas can be impacted and languish.

For example, a developer might not find problems in the documentation, or might fail to flag a piece of code that appears to work correctly but is simply not maintainable, or an algorithm that returns the right result but takes longer than necessary. The only way out of this dilemma is to always check your blind spots. Depending on the programming language, though, there are different patterns to look out for and different priorities to keep in mind when testing and writing code. We analyzed code from an open source community of 700,000 developers at LGTM to find recurring themes in the most common languages on LGTM: JavaScript, Python, and Java. Here are some key areas that can be adversely impacted by testing or that can’t be tested at all.


Modularity, which speaks to how the code is organized, is generally weakened by tests. If the code is organized in a sensible way, that will pay dividends when you rework, extend or repair it because it will be easier to find bugs and eliminate them. Automated testing supposedly makes it safe to rework your code as much as you’d like. The thinking is that if the tests don’t flag any problems, there aren’t any. However, tests only check that the current code returns the right result. The quickest way to make them pass is often not the way that keeps your code organized so that it can be extended later. We found that the modularity of code degrades significantly with testing.


Most projects only write unit tests for correct program behavior and not for efficient program behavior. When you are free to rework your code as long as the result stays the same, that usually leads to patterns that are less efficient than they could be. With JavaScript, the more tests a project had, the more efficiency alerts it had compared to other projects of similar size. With Python, by contrast, we found that projects with more tests had fewer efficiency alerts.
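To make the point concrete, here is a minimal sketch in Python. It is not taken from the LGTM analysis; the function names and the check are hypothetical and chosen only for illustration. Both implementations satisfy the same correctness test, yet one compares every pair of elements while the other makes a single pass.

```python
def has_duplicates_slow(items):
    """Quadratic: compares every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_fast(items):
    """Linear on average: remembers elements already seen in a set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


def check_correctness(func):
    """A typical unit test: it looks at results only, never at running time."""
    assert func([1, 2, 3]) is False
    assert func([1, 2, 1]) is True
    assert func([]) is False
    return True


# Both versions pass, so the test suite cannot tell them apart.
check_correctness(has_duplicates_slow)
check_correctness(has_duplicates_fast)
```

A test that only asserts on return values gives no signal when a rework swaps the fast version for the slow one. Catching that takes a benchmark, a static analysis alert, or a reviewer.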


In our analysis at LGTM, we found that tests can have a negative impact on documentation. If you rework your code, you need to update your documentation. But tests have no handle on documentation because it can’t be executed. So if you rely on tests alone, your documentation will get out of sync. There are many ways to write a program that performs a specific task. It can be written so that it is easy to understand, to debug and to extend with new features, as when a programmer wants to change a specific aspect. Or it can be written in a way that makes none of these easy to accomplish. A programmer might change the signature of a function in order to make a test pass and forget to update the documentation. Good documentation is essential for maintainable code. Developers should make it easy for their fellow developers and themselves to go back into the code and improve it later.
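The signature example can be sketched as follows. This is an illustration under stated assumptions, not the check LGTM runs: `send_report` and `undocumented_params` are hypothetical names, and the docstring is assumed to use `:param name:` fields. The function still behaves correctly, so a behavioral test passes, but the docstring describes a parameter that no longer exists.

```python
import inspect


def send_report(recipient, subject, retries=3):
    """Send a report.

    :param recipient: address to send to
    :param title: subject line of the message
    """
    # 'title' was renamed to 'subject' and 'retries' was added later,
    # but the docstring above was never updated.
    return f"to={recipient} subject={subject} retries={retries}"


def undocumented_params(func):
    """Return parameter names that lack a ':param name:' docstring entry."""
    doc = inspect.getdoc(func) or ""
    return [
        name
        for name in inspect.signature(func).parameters
        if f":param {name}:" not in doc
    ]


# A behavioral test is satisfied even though the documentation is stale.
assert send_report("a@example.com", "Q1") == "to=a@example.com subject=Q1 retries=3"

print(undocumented_params(send_report))  # ['subject', 'retries']
```

A check like this only catches missing names. Whether the description of a parameter is still accurate is something only a human reviewer can judge.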


Overall, testing improves security: the number of security alerts goes down. But only some of the dangerous patterns get better, while others get worse. One example is cross-site scripting (XSS), one of the most common vulnerabilities found in web applications. We found that the more tests a project has relative to its size, the more likely it is to have an XSS vulnerability, which could enable remote attackers to bypass access controls and inject malicious client-side script into web pages.
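As a simplified illustration of how a tested function can still carry an XSS flaw, consider the following sketch. The function names are hypothetical and no particular web framework is assumed. A test that only exercises a harmless input passes for both versions, but the first one copies attacker-controlled markup straight into the page.

```python
import html


def render_greeting_unsafe(name):
    """Interpolates user input directly into HTML."""
    return f"<p>Hello, {name}!</p>"


def render_greeting_safe(name):
    """Escapes HTML special characters before interpolation."""
    return f"<p>Hello, {html.escape(name)}!</p>"


# The kind of test most projects write: a well-behaved input.
assert render_greeting_unsafe("Alice") == "<p>Hello, Alice!</p>"
assert render_greeting_safe("Alice") == "<p>Hello, Alice!</p>"

# A hostile input shows the difference that the test above never exercises.
payload = "<script>alert(1)</script>"
assert "<script>" in render_greeting_unsafe(payload)
assert "<script>" not in render_greeting_safe(payload)
```

Real applications would normally rely on a template engine with automatic escaping rather than hand-written string formatting. The sketch only shows why a green test suite says little about this class of bug.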


If you want to extend your code or maintain it, you first need to be able to understand it. But even perfectly clear code can become unintelligible after being reworked again and again: naming conventions stop making sense and expressions become more and more complex. Developers may make whatever changes get the tests to pass without paying attention to readability. Your tests don’t care, but your fellow developers and your future self definitely will.


Overall, the benefits of testing outweigh the disadvantages, but it’s important to understand where testing won’t help and may even hurt. To avoid being impacted, spend more time on code review. In fact, a brand new analysis of pull requests of GitHub projects shows that simply increasing the amount of code review appears to be the best way to increase code quality. Make it a common practice to have a formal review of every bit of code, whether someone goes through every line manually for an in-depth review or just takes a general look. Have senior programmers review the code. Run code analysis to make sure you haven’t introduced any new problems. You can either integrate automated tooling checks into the code review process or run them regularly on your code base. Either way, get feedback into the coding process, whether it’s human or automated. Don’t assume testing will save you.


Albert Ziegler

Albert Ziegler is a data scientist at Semmle, where he performs data-driven research into the process and the results of collaborative software development.
After completing his PhD in pure mathematics, Albert worked as a software developer, then as a data scientist, and finally as a data scientist researching software development. His interests are the drivers behind differences in code quality and software productivity. He’s also a contributor to the blog.
