Straight out of Hadoop Summit 2012
Amazon teams up with MapR for alternative Hadoop option
JAXenter has become Hadoop Central over the past few days. This week has seen a number of vendors show their Hadoop hand at Hadoop Summit in San Jose, with almost all the key players like Cloudera and Hortonworks detailing their strategy for the coming year.
Another specialist, MapR, didn't want to be left out and have made two seismic moves - coupling their 2.0 Hadoop distribution release with the ability to use MapR's Hadoop distribution in Amazon Elastic MapReduce (EMR).
Turning to the new distribution first, MapR Distribution 2.0 (available in open source M3 and open-core M5 editions) includes new features such as multi-tenancy, advanced monitoring, management, isolation and security for Hadoop. It appears that the first of these, multi-tenancy, has made the AWS partnership possible, since it allows multiple customers to be targeted at once. As others have done this week, MapR's distro also hooks up to Hadoop components and side-projects such as MapReduce, Hive, Pig or Cascading for some deep-dive Hadoop analytics.
It's a big coup for one of the main Hadoop specialists, which becomes the first non-Amazon option in EMR, enabling users to tap into MapR's fine-tuned Hadoop distro. The combination of scalable MapR clusters and the flexibility of AWS might just give AWS users the edge when provisioning, not to mention the ability to interweave other services - such as Amazon S3, DynamoDB and CloudWatch.
Speaking to Talking Cloud, Jack Norris, VP of Marketing at MapR, described how 'very excited' MapR were about this 'excellent partnership', saying:
The benefits that MapR provides to Amazon customers are high availability, your services will continue to be available, data protection, inter-cluster mirroring, so you can mirror across availability zones within Amazon to provide that high availability and continuity, as well as between on-premise and cloud deployments...we think customers are going to be very pleased.
And so they should be. We were wondering how Amazon were going to fit into this seemingly quadratic Hadoop equation, and it's MapR who've made the decisive move, seeing that 90% of EMR users run their Hadoop clusters entirely through AWS. With the MapR code built into the Amazon fabric, those customers get an alternative high-performance option, with Amazon doing all the billing and MapR offering 24/7 support.
The state of play, then: the Hadoop community at large has addressed HDFS problems in the latest 2.0 alpha release, and Cloudera and MapR are plumping for that version, despite it not being fully fleshed out. MapR's offering, currently in public beta, will probably go GA at some point in Q3. Hortonworks, however, have opted to stick with the current 1.0 codeline, but with a significantly enhanced offering in their Data Platform - thanks to pairing up with VMware. Will they live to regret holding back, or will their safe play result in a stronger move? Time will tell.