Digg Moves From MySQL to NoSQL

Jessica Thornsby

Digg have completed their move from MySQL. The social media news website now runs on Cassandra.

Back in September 2009, Ian Eure announced that Digg were planning to move to Cassandra,
claiming that the vertically partitioned master-slave configuration
of MySQL was simply not the way forward for Digg. The team looked
at a number of non-relational data stores – HBase, Hypertable,
Tokyo Cabinet/Tyrant, Voldemort, and Dynomite – before settling on
Cassandra. What gave Cassandra the edge for Digg was its
column-oriented data storage and its distributed, peer-to-peer
cluster design.

In a blog post announcing the move’s completion,
John Quinn cites the major reason for the switch from MySQL to
NoSQL as “the increasing difficulty of building a high
performance, write intensive, application on a data set that is
growing quickly, with no end in sight.” The company also wants to
add capacity and replace failed nodes with no downtime, something
that Quinn acknowledges is tricky with MySQL. Quinn cites Google
and Amazon as inspirations for Digg’s NoSQL adoption (they use
BigTable and Dynamo, respectively), alongside Digg’s own commitment
to the development of open source software.

Currently, most of Digg’s functionality has been reimplemented
using Cassandra as the primary data store, and the Digg team are
working on improving the operational capabilities of Cassandra to
better suit Digg’s needs.

“We’ll continue to lead the way in championing Cassandra’s
development and adoption,” writes Quinn.

The latest release of Apache Cassandra is version 0.5.1.
