How Artifactory became 10 times faster
New version of binary management tool scraps Jackrabbit storage engine and reaps massive rewards.
It takes a brave developer to release a major software update featuring only improvements to performance. However, the developers of Artifactory – who are today releasing version 3.0 – have achieved no mean feat in ramping up its speed by a factor of ten.
Reportedly in use at LinkedIn, Netflix, Twitter, Oracle, EMC2, Apple and VMware, Artifactory is a tool for managing third-party binaries and libraries. Parent company JFrog, based in Israel but now expanding to California, recently launched a GitHub-like binary database with social features called Bintray. The company claims that Bintray, which opened to the public in March, now boasts almost 60,000 software packages.
However, that’s something of a side project: JFrog’s flagship product is Artifactory, which today receives a massive speed boost – achieved by ripping out the existing Jackrabbit storage engine and replacing it with a custom-made solution.
To eke out the performance improvements they wanted, the development team had to “tune every SQL query”, JFrog Chief Architect and co-founder Fred Simon told JAXenter. The company’s engineers worked “like a game developer, when they go all the way back to C”.
“We went very low-level, so people will say, ‘woah, you’re crazy, why are you rewriting the wheel in terms of going so low-level about data storage?’.” But despite “some doubt along the way”, the eight-month project paid off. “To tell you the truth, we surprised ourselves,” said Simon.
The benchmarks shared with JAXenter by JFrog support these claims: with 50 users, Artifactory 3.0 is able to serve an average of 1298 user requests per second – over ten times that of current version Artifactory 2.6.7, which can only do 127 per second. With 200 users, this difference extends to a 13x speed boost. These improvements also mean that Artifactory requires less disk space, CPU and memory.
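As a quick sanity check on the headline figure, the 50-user numbers quoted above can be turned into a speedup ratio. This is only an illustrative sketch: the variable names are our own, and only the two throughput figures come from JFrog's benchmarks (no raw numbers were given for the 200-user case, so it is left out).

```python
# Requests per second with 50 users, as quoted from JFrog's benchmarks.
requests_per_sec_v3 = 1298    # Artifactory 3.0
requests_per_sec_v267 = 127   # Artifactory 2.6.7

# Speedup is simply the ratio of the two throughputs.
speedup_50_users = requests_per_sec_v3 / requests_per_sec_v267

print(f"Speedup with 50 users: {speedup_50_users:.2f}x")
```

The ratio comes out at a little over ten, which is consistent with the "over ten times" claim.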
If Jackrabbit was such a bottleneck, however, why did they choose to use it in Artifactory in the first place?
“It was good for what we wanted to do [initially],” responded Simon, “but once we saw that the model of the content management was really stable, it was more of a burden than a help.” With Artifactory becoming an increasingly mature piece of software, the team were able to write a replacement for Jackrabbit tailored to Artifactory’s needs.
“And the other thing is that we had enough resources and time to decide that we need to push the performance of Artifactory,” he added. “The main thing also is that there are a lot of other features we want to add to our software, and we knew that if we kept adding features on top of the current storage of Jackrabbit, we would have been pushing it to the limit.”
When asked if previous versions of Artifactory could be considered slow, Simon drew a comparison with Twitter’s frequent downtime in its early days. By moving away from “high level” technologies and optimising their stack for specific use cases, he said, both have achieved considerable performance improvements.
If it looks like a Black Duck…
The other big feature in Artifactory 3.0 is integration with Black Duck, whose software analyses third-party libraries for licensing issues. However, this analysis is usually done after a build is complete (or even sent to production), clashing with modern continuous integration practices.
By integrating the two products, users can get the “best of both worlds”, claimed Simon: the license-checking demanded by the legal team and low-friction deployment demanded by the developers. With Black Duck enabled, any binaries added to Artifactory are automatically scanned, and the results sent to the necessary party. Artifactory has included a license-checking feature for a while, but – as Simon admits – it is “kind of simple” in comparison.
“The process here is, for the developer, it’s easy, and for the people that are watching [for licensing issues], it’s immediate also. They are getting the real information about what’s really built, as soon as it is built.”
This Black Duck integration is only available in the paid ‘Pro’ version of Artifactory, while the performance improvements will be included in the open-source edition too.
Photo by Nathan E.