Size matters

Report: Open source code higher quality – until it supersizes

Elliot Bentley

Which code is better-written: open source or proprietary? The answer depends on the size of the codebase, it seems.

An analysis has revealed that while the difference is
negligible among smaller projects, OSS tends to produce
higher-quality mid-sized codebases – but when it comes to projects
with over a million lines of code, proprietary software wins.

The report comes from Coverity, a company that provides
static analysis of code quality for both open and closed source
projects. This year marks the fifth anniversary of Coverity’s
annual report, which took into account over 450,000,000 lines of
code.

Coverity measures software quality by “defect density”
– the number of detectable high- or medium-impact defects per 1,000
lines of code. The 254 open-source projects in the study, which
include Linux, PHP and Apache, had an average defect density of
0.69 – almost exactly the same as the 300 proprietary codebases
sampled, which had an average of 0.68.
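
As a rough illustration of the metric, defect density is simply the defect count divided by the number of thousands of lines of code. The sketch below is not Coverity's tooling: the function name and the sample project figures are hypothetical, chosen only so that the result matches the open-source average quoted above.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per 1,000 lines of code (hypothetical helper)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# Hypothetical project: 345 defects detected in 500,000 lines of code.
print(defect_density(345, 500_000))
```

On this definition, a codebase needs fewer than one detectable defect per 1,000 lines to beat a density of 1.0.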

Both figures are well below the “accepted industry
standard defect density” of 1.0. However, this may be an
unrealistic reflection of the software industry at large: those
uninterested in producing high-quality code are unlikely to bother
having it analysed. In addition, the lack of formal deadlines and
highly transparent nature of open-source development probably
encourage higher-quality code.

More intriguing is the correlation between large
codebases and quality. In projects with between 500,000 and one
million lines of code, defect density rose to 0.98 in
proprietary software but fell to 0.44 in open-source software. Above
one million lines, however, the positions reversed: 0.66 for
proprietary and 0.75 for open source.

Codebases this large are apparently too much for
open-source projects to manage; Coverity itself puts
it down to “differing dynamics within open source and proprietary
development teams, as well as the point at which these teams
implement formalized development testing processes”.

A notable exception to this trend is the Linux
codebase, which has consistently scored above average in terms of
code quality. In fact, this year’s scan is the best on record – the
7.6 million lines of code analysed were found to have a defect
density of 0.59.
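
To put that density in absolute terms, the definition can be run in reverse to estimate how many defects it implies. This is back-of-the-envelope arithmetic rather than a figure from the report; the function name is hypothetical, and because both inputs are rounded the result is only approximate.

```python
def implied_defects(density: float, lines_of_code: int) -> int:
    """Estimate the defect count from defects per 1,000 lines (hypothetical helper)."""
    return round(density * lines_of_code / 1000)


# Linux figures quoted above: density 0.59 across 7.6 million lines.
print(implied_defects(0.59, 7_600_000))
```

That works out to roughly 4,500 detectable high- or medium-impact defects across the scanned kernel code.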

You can read the full report for free over on Coverity’s website.

Photo by Michael
