The little Big Data project that could
Apache Cassandra v2.0 unleashed
The Apache Software Foundation this week announced the release of Apache Cassandra 2.0, the second major version of its open source distributed database for big data. Since its 2010 ‘graduation’, the highly scalable database has achieved huge adoption and counts mega sites like Netflix, CERN, Reddit, and Instagram among its users. With heavy hitters like these relying on it, every release needs to deliver the goods.
The open source distributed database system is intended for storing and managing large amounts of data across commodity servers, and is designed to serve as both a real-time operational data store for online transactional applications and a read-intensive database for large-scale business intelligence (BI) systems. In addition, Cassandra's fully distributed architecture should, in theory, provide high fault tolerance.
According to the Foundation’s press release, the new version of Apache Cassandra adds lightweight transactions and triggers, along with enhancements to CQL (Cassandra Query Language).
So what’s in the new release? The highlights:
- Lightweight transactions that offer linearizable consistency.
- Experimental support for triggers.
- Numerous enhancements to CQL as well as a new and better version of the native protocol.
- Compaction improvements (including a hybrid strategy that combines leveled and size-tiered compaction).
- A new, faster Thrift server implementation based on LMAX Disruptor.
- Eager retries: avoid query timeouts by sending data requests to other replicas if too much time passes on the original request.
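
To make the eager-retry item in the list above more concrete, here is a minimal Python sketch of the general idea: ask one replica, and if it has not answered within a threshold, ask the next one as well and take whichever reply arrives first. This is an illustration only, not Cassandra's implementation. The names read_with_eager_retry, slow_replica, fast_replica and the retry_after_s parameter are hypothetical, and replicas are modelled as plain Python callables rather than network nodes.

```python
import concurrent.futures as cf
import time


def read_with_eager_retry(replicas, key, retry_after_s):
    """Return the first reply for `key` from a list of replica callables.

    Hypothetical helper: replicas are asked one at a time, and a further
    replica is only contacted when no reply has arrived within
    `retry_after_s` seconds of the previous request.
    """
    if not replicas:
        raise ValueError("at least one replica is required")

    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    try:
        pending = set()
        for replica in replicas:
            # Send the request to the next replica in line.
            pending.add(pool.submit(replica, key))
            # Give the outstanding requests a short window to answer.
            done, pending = cf.wait(
                pending, timeout=retry_after_s, return_when=cf.FIRST_COMPLETED
            )
            if done:
                # .result() re-raises if that replica failed.
                return next(iter(done)).result()
        # Every replica has been asked; block until the first one replies.
        done, _ = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Do not block on requests that are still in flight.
        pool.shutdown(wait=False)


def slow_replica(key):
    """Stand-in for an overloaded node (assumption for the demo)."""
    time.sleep(0.5)
    return ("slow", key)


def fast_replica(key):
    """Stand-in for a healthy node (assumption for the demo)."""
    return ("fast", key)


if __name__ == "__main__":
    # The slow replica misses the 50 ms window, so the fast one is asked too.
    print(read_with_eager_retry([slow_replica, fast_replica], "user:42", 0.05))
```

The trade-off in this kind of scheme is extra load on the other replicas in exchange for lower tail latency, so the threshold is normally chosen to fire only on unusually slow requests.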
You can download both the source and binary distributions of Cassandra 2.0.0 at: http://cassandra.apache.org/download/