Straight out of Hadoop Summit 2012

Amazon teams up with MapR for alternative Hadoop option

Chris Mayer

Yet more exciting Hadoop announcements this week, as MapR deploy Version 2.0 of their MapR Distribution and reveal a tie-in with the Amazon Elastic MapReduce service.

JAXenter has become Hadoop Central over the past few days. This
week has seen a number of vendors show their Hadoop hand at Hadoop
Summit in San Jose, with almost all the key players, such as
Cloudera and Hortonworks, detailing their strategies for the
coming year.

Another specialist, MapR, didn’t want to be left out and have
made two seismic moves, coupling their 2.0 Hadoop distribution
release with the ability to use MapR’s Hadoop distribution in
Amazon Elastic MapReduce.

Turning to the new distribution first, MapR Distribution 2.0
(available in open source M3 and open-core M5) includes new
features such as multi-tenancy, advanced monitoring, management,
isolation and security for Hadoop. It appears that the first of
these, multi-tenancy, has made the AWS partnership possible, as it
targets multiple customers at once. As others have done this week,
MapR’s distro also hooks up to other Hadoop side-projects such as
MapReduce, Hive, Pig or Cascading for some deep-dive Hadoop
analytics.

It’s a big coup for one of the main Hadoop specialists, becoming
the first non-Amazon option in EMR and enabling users to tap into
their fine-tuned Hadoop distro. The pairing of scalable MapR
clusters with the flexibility of AWS might just give AWS users the
edge when provisioning, not to mention the ability to interweave
other services such as Amazon DynamoDB and CloudWatch.

Speaking to Talking Cloud, MapR’s VP of Marketing, Jack Norris,
described how ‘very excited’ MapR were with this ‘excellent
partnership’, saying:

The benefits that MapR provides to Amazon
customers are high availability, your services will continue to be
available, data protection, inter-cluster mirroring, so you can
mirror across availability zones within Amazon to provide that high
availability and continuity, as well as between on-premise and
cloud deployments…we think customers are going to be very

And so they should be. We were wondering how Amazon were going to
fit into this seemingly quadratic Hadoop equation, and it’s MapR
who’ve made the decisive move, seeing that 90% of EMR users run
their Hadoop clusters entirely through AWS. With the MapR code in
the Amazon fabric, those customers get an alternative
high-performance option, with
Amazon doing all the billing and MapR offering 24/7

The state of play, then: the Hadoop community at large has
addressed HDFS problems in the latest 2.0 alpha release, and
Cloudera and MapR are plumping for that version despite it not
being fully fleshed out. MapR’s offering, currently in public
beta, will probably go GA at some point in Q3. Hortonworks,
however, have opted to stick with the current 1.0 codeline, albeit
with a heavily enhanced offering in their Data Platform thanks to
pairing up with VMware. Will they live to regret holding
back or will their safe play result in a stronger move? Time will
tell.
