Could Cassandra be the first breakout NoSQL database?
DataStax’s recent $25m round of funding should see the non-relational database flourish in a Big Data platform. Can it become the non-relational weapon of choice for developers?
Years of misunderstanding haven’t been kind to the NoSQL
database. Aside from the confusing name (generally understood to
mean ‘not only SQL’), there’s always been an air of reluctance from
the enterprise world to move away from Oracle’s steady relational
database, until there was a definite need
to switch from tables to documents.
The emergence of Big Data in the past few years has been the kickstart NoSQL distributors needed. Relational databases cannot cope with the sheer amount of data coming in and can’t provide the immediacy large-scale enterprises need to obtain information.
Open source offerings have been lurking in the background for a while, with the highly tunable Apache Cassandra quickly becoming a community favourite. Cassandra graduated from the Apache Incubator in February 2010 and reached version 1.0 in October 2011; its beauty lies in its flexible schema, its hybrid data model (lying somewhere between a key-value and tabular database) and its high availability. Being from the Apache Software Foundation, it also has intrinsic links to the big data ‘kernel’ Apache Hadoop and search server Apache Solr, giving users an extra dimension to their data processing and storage.
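To give a rough feel for what a hybrid of key-value and tabular storage means, here is a toy sketch in Python. It is not Cassandra's API or storage engine: the class name ToyColumnFamily and the sample user data are invented purely for illustration. Each row is looked up by key, as in a key-value store, but holds named columns, as in a table, and different rows are free to carry different columns.

```python
# Toy model of a column-family store. This is NOT Cassandra's API;
# ToyColumnFamily and the sample data are hypothetical illustrations.
from collections import defaultdict


class ToyColumnFamily:
    """Maps a row key to a sparse set of named columns."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def insert(self, row_key, columns):
        # Rows need not share the same columns: the schema is flexible.
        self._rows[row_key].update(columns)

    def get(self, row_key, column=None):
        # Look the row up by key, as in a key-value store.
        row = self._rows.get(row_key, {})
        if column is None:
            # Return the whole row with columns sorted by name.
            return dict(sorted(row.items()))
        return row.get(column)


users = ToyColumnFamily()
users.insert("alice", {"email": "alice@example.com", "city": "London"})
users.insert("bob", {"email": "bob@example.com", "last_login": "2012-09-01"})
# A new column can be added to one row without touching any other row.
users.insert("alice", {"plan": "premium"})
```

Here the row for "bob" has no "city" column at all, so nothing is stored for it, whereas a relational table would typically hold a NULL in that position.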
Using NoSQL on cheap servers for processing and querying data is proving an enticing option for companies of all sizes, especially in combination with MapReduce technology to crunch it all.
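For readers unfamiliar with the MapReduce idea, the following single-process Python sketch shows its three stages: map, shuffle and reduce. It is only an illustration of the pattern, not Hadoop's API, and the page-view records and function names are made up for the example. In a real cluster the map and reduce calls would be spread across many of those cheap servers.

```python
# Single-process sketch of the MapReduce pattern (not Hadoop's API).
# The record layout and function names are hypothetical.
from collections import defaultdict


def map_phase(record):
    # Emit one (key, 1) pair per page view.
    yield record["page"], 1


def shuffle(pairs):
    # Group all emitted values by key, as the framework would between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Collapse the values for one key into a single total.
    return key, sum(values)


def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


logs = [{"page": "/home"}, {"page": "/pricing"}, {"page": "/home"}]
counts = run_job(logs)
```

Because each map call looks at one record and each reduce call looks at one key, the work can be split up without the pieces needing to talk to each other, which is what makes the approach suit large volumes of data.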
One company that appears to be leading this data-driven charge is DataStax, who this week announced the completion of a $25 million Series C round of funding. Having already permeated the environments of some large companies (notably Netflix), the San Mateo startup are making big noises about their enterprise platform, melding the worlds of Cassandra and Hadoop together. Netflix is a client worth crowing about, with DataStax’s enterprise option being used as one of their primary data stores.
DataStax CEO Billy Bosworth said in a press release that the funding, led by Meritech (investors in Facebook, Cloudera and Greenplum, no less), would be used “to further grow” their customer base internationally, particularly “businesses whose applications require massive scalability and continuous availability.”
By offering both the cruncher and the storage in one package, could DataStax become a viable contender to Hadoop vendors like Cloudera, MapR and Hortonworks? Bosworth seems to think so, telling Investor Place:
We are a fast-growing software company with over 200 customers, ranging from startups to 15 of the Fortune 100. DataStax Enterprise not only delivers production-ready Cassandra, but also goes one step further by integrating the best-of-breed Big Data technologies like Apache Hadoop for analytics, and Apache Solr for search.
Another key player in the space is 10gen’s MongoDB, which has
garnered a community backing that surpasses Cassandra’s. Its focus
on web-based server applications and its JSON-like storage format
are two reasons why it’s in use at leading news websites, The
Guardian and The New York Times. And its appeal is not limited to
newspaper sites: MongoDB is also in use at Barclays, foursquare and
The more mature MongoDB has gained such a significant foothold in this NoSQL battle that it might be difficult for Cassandra to make an imprint, even with Twitter as a big user. As with any technology shift at an enterprise level, considerations need to be made before picking a non-relational database.
Oracle should be quaking in their boots, though. Whilst it may take some years to displace their relational database stronghold, there are signs that databases from the likes of DataStax and 10gen are now making a sizable impact at some of the largest companies around. The rise of NoSQL has been a long time coming, but it seems that it could be on the cusp of mass-market adoption.