Digg Moves From MySQL to NoSQL
Back in September 2009, Ian Eure announced that Digg were planning to move to Cassandra, claiming that the vertically partitioned master-slave configuration of MySQL was simply not the way forward for Digg. The team looked at a number of non-relational data stores (HBase, Hypertable, Tokyo Cabinet/Tyrant, Voldemort, and Dynomite) before settling on Cassandra. What gave Cassandra the edge for Digg was its column-oriented data storage and distributed, peer-to-peer cluster.
In a blog post announcing the move's completion, John Quinn cites the major reason for the switch from MySQL to NoSQL as “the increasing difficulty of building a high performance, write intensive, application on a data set that is growing quickly, with no end in sight.” The company also plans to replace failed nodes with no downtime and to add capacity, something that Quinn acknowledges is tricky with MySQL. Digg cites Google and Amazon as inspirations for its NoSQL adoption (they use BigTable and Dynamo, respectively), alongside its own commitment to the development of open source software.
Currently, most of Digg's functionality has been reimplemented using Cassandra as the primary data store, and the Digg team are working on updating the operational capabilities of Cassandra to better suit Digg's needs.
“We'll continue to lead the way in championing Cassandra's development and adoption,” writes Quinn.
The latest release of Apache Cassandra is version 0.5.1.