A MapReduce alternative?

Welcome Cascading 2.0 – the open source API to ease your Hadoop woes

Chris Mayer

Big Data headache? Worry no more – Cascading 2.0 is here to solve it.

Ever wanted to harness the best of Hadoop, but struggled with the logistical headache the transition caused? A new release might be the answer: Concurrent has released Cascading 2.0 – an open source data workflow API, positioned as an alternative to writing MapReduce jobs by hand.

Cascading 2.0 is a Java application framework that enables developers to build robust data management applications on Apache Hadoop, whether they run in the cloud or in-house. It has already attracted some impressive users – Etsy, Razorfish and, most notably, Twitter, probably the most interesting use case to study.

Twitter has recently undergone an open source drive, releasing some very interesting tools to the masses, but it has also realised the need for a flexible infrastructure that can cope with fluctuating load. The Cascading framework is designed to let data scientists, Hadoop administrators and application developers collaborate to rapidly develop and deploy scalable Big Data applications, and to analyse unstructured and semi-structured data in any format, for example from NoSQL databases.

Arguably the neatest thing about Cascading is its multilingual approach: you can build and test applications from the desktop in the language of your choosing, with Java, Scala, Clojure and JRuby all welcomed into the fold.

This release represents five years of work by founder Chris Wensel, Concurrent’s CEO, with the first code base appearing in 2007 – well ahead of the Big Data curve. Now Cascading 2.0 arrives with an Apache 2.0 license to boot – hopefully giving it the adoption it deserves.

“Building applications on Hadoop, despite its growing adoption in the enterprise, is notoriously difficult. We are driving the future of application development and management on Hadoop, by allowing enterprises to quickly extract meaningful information from large amounts of distributed data and better understand the business implications. We make it easy for developers to build powerful data processing applications for Hadoop, without requiring months spent learning about the intricacies of MapReduce,” said Wensel.
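For readers who have not met the model Wensel is referring to, here is a minimal sketch of the three MapReduce phases (map, shuffle, reduce) applied to a word count. It is plain, single-machine Java and uses neither the Hadoop nor the Cascading API; the class and method names (`WordCountSketch`, `mapPhase`, `shufflePhase`, `reducePhase`) are made up for illustration. The point is only to show the shape of the thinking a developer has to adopt before any distributed plumbing enters the picture.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a single-process imitation of the MapReduce phases.
// None of these names come from Hadoop or Cascading.
public class WordCountSketch {

    // "Map" phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> mapPhase(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // "Shuffle" phase: group every emitted value under its key.
    static Map<String, List<Integer>> shufflePhase(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // "Reduce" phase: collapse each key's list of values into one total.
    static Map<String, Integer> reducePhase(Map<String, List<Integer>> grouped) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            int sum = 0;
            for (int value : group.getValue()) {
                sum += value;
            }
            totals.put(group.getKey(), sum);
        }
        return totals;
    }

    // Chain the three phases over a list of input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(mapPhase(line));
        }
        return reducePhase(shufflePhase(pairs));
    }

    public static void main(String[] args) {
        // prints {big=2, data=2, headache=1, workflow=1}
        System.out.println(wordCount(List.of("big data big headache", "data workflow")));
    }
}
```

Even this toy version forces a simple question ("how often does each word appear?") into key/value pairs and separate phases. On a real cluster each phase also has to be configured, serialised and chained across jobs, which is the kind of detail a workflow API aims to hide.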

It appears other Big Data experts are following Cascading’s rise with interest.

“MapR shares a commitment to the growing, innovative and rich Hadoop development community. Cascading is already integrated and distributed as part of our MapR Distribution, and is widely used across organizations that depend on Big Data analysis. Cascading lets enterprise developers focus on the business of applications and data processing, while handling the complexities of development,” said MapR Technologies’ CEO and co-founder John Schroeder.

An alternative to MapReduce will certainly liven up the Big Data processing world, and Cascading is well placed to be a viable competitor.
