A MapReduce alternative?

Welcome Cascading 2.0 – the open source API to ease your Hadoop woes

Chris Mayer

Big Data headache? Worry no more – Cascading 2.0 is here to solve it.

Ever wanted to harness the best of Hadoop, but struggled with
the logistical headache the transition caused? A new release
might be the answer: Concurrent has released Cascading 2.0, an
open source data workflow API positioned as an alternative to
MapReduce.

Cascading 2.0 is a Java application framework that enables
developers to build robust data management applications on Apache
Hadoop, whether deployed in the cloud or kept in-house. It has
already attracted some impressive users: Etsy, Razorfish and, most
notably, Twitter, probably the most interesting use case to study.

Twitter has recently undergone an open source drive, releasing
some very interesting tools to the masses, but it has also
realised the need for a flexible infrastructure that can cope
with fluctuating demand. The Cascading framework is designed to let
data scientists, Hadoop administrators and application developers
collaborate to rapidly develop and deploy scalable Big Data
applications, and to analyse unstructured and semi-structured
data in any format, from NoSQL databases for example.

Arguably the neatest thing about Cascading is its multilingual
approach: you can build and test applications from the desktop in
the JVM language of your choosing, with Java, Scala, Clojure and
JRuby all welcomed into the fold.

This release represents five years of work by founder Chris
Wensel, Concurrent’s CEO, with the first code base appearing in
2007, well ahead of the Big Data curve. Now Cascading 2.0 enters
the open with an Apache 2.0 license to boot, hopefully giving it
the adoption it deserves.

“Building applications on Hadoop, despite its growing adoption
in the enterprise, is notoriously difficult. We are driving the
future of application development and management on Hadoop, by
allowing enterprises to quickly extract meaningful information from
large amounts of distributed data and better understand the
business implications. We make it easy for developers to build
powerful data processing applications for Hadoop, without requiring
months spent learning about the intricacies of MapReduce,” said
Wensel.

It appears other Big Data experts are following Cascading’s rise
with interest.

“MapR shares a commitment to the growing, innovative and
rich Hadoop development community. Cascading is already integrated
and distributed as part of our MapR Distribution, and is widely
used across organizations that depend on Big Data analysis.
Cascading lets enterprise developers focus on the business of
applications and data processing, while handling the complexities
of development,” said MapR Technologies’ CEO and co-founder John
Schroeder.

An alternative to MapReduce will certainly liven up the
Big Data processing world, and Cascading is well placed to be a
viable competitor.
