DataStax CTO Jonathan Ellis on gearing up for future market tussles, and nurturing Facebook
When we meet at the Cassandra Summit Europe, DataStax CTO Jonathan Ellis is fresh from announcing the online ‘Java Development with Apache Cassandra’ virtual training centre. Whilst the European Cassandra community is relatively small, judging from the buzz of positive feedback on the floor, we can expect it to grow at a robust pace in the next few years.
The Apache NoSQL database’s future didn’t always look so certain though. Developed by Facebook to power its Inbox feature, Cassandra went on to become one of its many open sourced projects. With a few exceptions (Thrift, for example), this is, according to Ellis, usually to the detriment of the project, with the aim often being more to demonstrate the power of the all-conquering Facebook dev machine than to grow and foster a community around it.
Enter Ellis, who at the time was looking to solve a file metadata conundrum for US company Mozy, centred on the fact that the Oracle database they were employing simply couldn’t keep pace with their millions of users – a problem many businesses at the time were grappling with.
He told Rackspace that he was interested in building a scalable database, pointing out that, “More and more companies are building these web applications where that’s the bottleneck”. The upshot of this was a new role for Ellis, building a scalable database that Rackspace could use internally. In the process, he was tasked with evaluating the different NoSQL systems available at the time, which included MongoDB, Voldemort (remember them?), and of course, Cassandra.
Having handed the code and trademark of Cassandra to the Apache Foundation, Facebook had dusted off their hands and left the project to the mercy of the wild. At which point, Ellis went to Apache and said, “I’ve been maintaining this for a few months now, I’ve been dealing with bug reports and fixes and new features, so you should probably make me a committer and I can help you guys out”. The rest, as they say, is history.
Unusually, Ellis found himself in the position of simultaneously building a community around the product as he was developing it. After working at Rackspace for a year and a half, he took the decision to form DataStax and take it “to the next level”.
From the outset, DataStax had the objective of building a more interesting product on top of Cassandra, and, about a year and a half after its inception, the company released DataStax Enterprise 1.0. The product integrates analytics and search support on top of Cassandra, security features and ops centre integration, as well as the recent addition of DevCenter (more on that in a second), making it “really a full featured data platform on top of open source Cassandra”.
Inspired by schema browsers and query tools in the relational world, Ellis is keen to emphasise that DevCenter is the first graphical query manager for a NoSQL product. Among other features, it allows users to connect to their local test centre, save scripts, and replay them later.
Breaking down ‘NoSQL’
Although some may draw comparisons between Cassandra and other products in the sector, Ellis notes that one of the current issues of flying under the NoSQL banner is that the label can cover so many different kinds of system.
“You’ve got graph databases, key value databases, document databases, document stores… And then you’ve got things like Cassandra, and we’re like, ‘Hey, we’re pretty relation-ish, we’ve got CQL, and we’ve got rows and columns, but we’re also about scaling. We’re not doing ACID transactions because that stops us from scaling the way we want to.’ There’s a lot of different things in NoSQL”.
Although Ellis believes that it was valuable early on in the NoSQL movement to be able to pin a concrete label on an Oracle alternative, and be able to show that “not every nail is fit for that hammer”, the downside is that, “there are people saying, well, Cassandra and MongoDB, they’re both NoSQL so they must be interchangeable at that level. Not really the case”.
Well, for now at least. Ellis predicts a lot of consolidation in the NoSQL market moving forward, with clear market leaders and ‘also rans’ emerging in the next few years. For now, he believes MongoDB is more of a MySQL competitor, commenting that, “MySQL is owned by Oracle now, and at DataStax we’re driving Cassandra at the high end of the Oracle market”.
That being said, he is well aware that MongoDB will tell prospective clients that they can scale as well as Cassandra, something that he says is, “not true – but that’s what they’d tell you. So there’s definitely some competition looming on the horizon there, and at the same time, we’re looking at making Cassandra easier to use and kind of moving into MongoDB’s home turf there as well”.
A battle with one of the world’s most valuable (on paper at least) NoSQL providers might have some frantically diversifying – but Ellis is sticking to his guns. And having survived Facebook’s absentee parenting, seen off a host of competitors, and fostered a whole new wave of Cassandra aficionados, DataStax might just be the one to watch to KO the mighty Mongo one day.