Concurrent introduce Hadoop machine learning tool Pattern
The Cascading company reveal their plans following March’s $4m investment and, unsurprisingly, they centre on the application framework.
Concurrent have made their latest Big Data move, pushing a new free
Hadoop workflow project out into the open.
One of Hadoop’s main enterprise pain points is
the difficulty in integrating with analytics systems. Without
comprehensive links to established analytics systems such as R or
SAS, Hadoop will struggle to be accepted as the data processing
gold standard without simplified methods of gleaning
Scoring engine Pattern
allows users to quickly deploy machine learning models in
Hadoop clusters, making it far easier, in theory, to perform
statistical analysis on applications. Data scientists can export
models either through the Pattern Java API or via the popular
XML-based Predictive Model Markup Language (PMML).
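To give a flavour of what a scoring engine does with an exported model, the sketch below reads a tiny hand-written PMML (Predictive Model Markup Language) document and applies it to one record. It is not Pattern's API and says nothing about how Pattern works internally. It is plain Python using only the standard library, it handles only a single linear regression table, and the field names (visits, basket, spend) and coefficients are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML document. The element names follow the PMML
# RegressionModel schema; the fields and numbers are made up.
PMML_DOC = """<?xml version="1.0"?>
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="illustrative linear model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="basket" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="visits"/>
      <MiningField name="basket"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="visits" coefficient="2.0"/>
      <NumericPredictor name="basket" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# PMML 4.1 elements live in this XML namespace.
NS = {"p": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Return (intercept, {field: coefficient}) from a PMML regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model to one record given as a dict of field values."""
    intercept, coefficients = model
    return intercept + sum(
        coefficient * record[name] for name, coefficient in coefficients.items()
    )


model = load_linear_model(PMML_DOC)
print(score(model, {"visits": 3.0, "basket": 4.0}))  # 1.5 + 2.0*3.0 + 0.5*4.0 = 9.5
```

The point of the format is visible even at this scale: the model travels as a self-describing XML file, so whatever tool trained it and whatever engine scores it need only agree on the schema.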
It is the second project Concurrent has launched
to complement Cascading. Back in February, the
company launched Lingual, a tool designed to help SQL get up to
speed with Hadoop. Over the last few months, many Hadoop vendors
have been thinking along the same lines. Application framework
Cascading however has been around since is already in the
production environments of Twitter, eBay and Etsy.
CTO Chris Wensel
admitted to GigaOM that Pattern alone “isn’t
the real takeaway,” but the three projects in unison is the real
“When combined, Cascading, Lingual and Pattern
close the modeling, development and production loop for all data
oriented applications. The combination of the three is the
application ensemble for further enabling enterprises to drive
differentiation through data,” he explained in a press
Pattern’s nearest competitor is Apache Mahout, a
fellow scalable machine learning library which arrived in 2009.
However, Concurrent are keen to point
out its differences, believing that Apache Mahout
is merely a set of HDFS-focused algorithms while Pattern “can
leverage resources beyond Hadoop while complying best practice for
Wensel points out that while all will remain open source, the
company plan to create “a suite” of products that centre on
Cascading, as promised in March. Concurrent’s best chance of
generating noise and cash lies with the application framework as the
main layer to create and deploy Hadoop applications, with Lingual
and Pattern dovetailing with it as enterprise sweeteners.
Image courtesy of Frédéric