Someone's moving quickly
VMware makes Hadoop advances - Project Serengeti and Spring for Apache Hadoop 1.0 M2
VMware has certainly been busy this week mapping out their Hadoop strategy. Earlier today we mentioned how they helped provide the cloud infrastructure for Hortonworks Data Platform 1.0; now they've taken two more steps towards Big Data dominance.
First up is the arrival of a new open source project, Serengeti, whose primary goal is to make VMware's virtualised environments 'Hadoop-aware', enabling enterprises to quickly deploy, manage, and scale Apache Hadoop in virtual and cloud environments. This olive branch to the Big Data behemoth appears to have come about after VMware realised how strong the Apache community behind Hadoop has become since the beginning of the year.
Project Serengeti will be a one-click deployment toolkit that removes the stumbling blocks and makes it possible to deploy Hadoop clusters in mere minutes - fully utilising an array of Hadoop-related projects such as Apache Pig and Apache Hive. Although still nascent in its 0.5 release, Serengeti could well become a vibrant habitat for Cloud Foundry developers to utilise Hadoop. It's available under an Apache 2.0 license.
"Hadoop must become friendly with the technologies
and practices of enterprise IT if it is to become a first-class
citizen within enterprise IT infrastructure. The resource-intensive
nature of large Big Data clusters make virtualization an important
piece that Hadoop must accommodate," said Tony Baer, Principal
Analyst at OVUM. "VMware's involvement with the Apache Hadoop
project and its new Serengeti Apache project are critical moves
that could provide enterprises the flexibility that they will need
when it comes to prototyping and deploying
With five distros now available (Cloudera, Greenplum, Hortonworks, IBM and MapR) and VMware playing a part in all of them (most recently Hortonworks), it looks like the virtualisation specialist has been proactive of late. If you thought Hadoop was in its infancy, it might be time to reassess - business has picked up, with everyone gunning for Hadoop-centric stacks, new providers and much, much more.
This aggressive Hadoop approach continues with the release of Spring for Apache Hadoop 1.0 M2, which takes into account the latest developments in the space. Costin Leau revealed all that was new for the project, which began in February, noting big changes such as improved Hadoop Data Access Object (HDAO) support and new integration with the fledgling Cascading library (by introducing dedicated Spring Framework Taps and Spring Integration resources).
Also new to the project is more robust security for File-System, Map/Reduce and Pig components, perhaps a signal that the enterprise can tap into this Hadoop goldmine. Things have certainly got interesting as VMware pulls back the curtain on their previously shrouded Hadoop development - they're readying for the next generation of Big Data. It's definitely been a Big Data week - what will happen next?