VMware makes Hadoop advances – Project Serengeti and Spring for Apache Hadoop 1.0 M2

Chris Mayer

Once you’ve made one Hadoop announcement, the rest come in droves. Welcome Project Serengeti, which aims to ease the transition to Hadoop, and Spring for Apache Hadoop 1.0 M2, which brings in enterprise-oriented features.

VMware has certainly been busy this week mapping out their
Hadoop strategy. Earlier today we mentioned how they helped provide
the cloud infrastructure for Hortonworks Data Platform 1.0, and now
they’ve taken two further steps towards Big Data dominance.

First up is the arrival of a new open source project, Serengeti,
with the primary goal of making their virtualised
environments ‘Hadoop-aware’, enabling enterprises to
quickly deploy, manage, and scale Apache Hadoop in virtual and
cloud environments. This olive branch to the Big Data behemoth
appears to have come about after realising how strong the Apache
community behind it has become since the beginning of the

Serengeti will be a one-click deployment toolkit that removes
the stumbling blocks and makes it possible to deploy Hadoop
clusters in mere minutes, making full use of an array of Hadoop-related
projects such as Apache Pig and Apache Hive. Although
quite nascent at the moment, in its 0.5 release, Serengeti could
well become a vibrant habitat for Cloud Foundry developers to
utilise Hadoop. It’s available under an Apache 2.0 licence.

“Hadoop must become friendly with the technologies
and practices of enterprise IT if it is to become a first-class
citizen within enterprise IT infrastructure. The resource-intensive
nature of large Big Data clusters make virtualization an important
piece that Hadoop must accommodate,” said Tony Baer, Principal
Analyst at OVUM. “VMware’s involvement with the Apache Hadoop
project and its new Serengeti Apache project are critical moves
that could provide enterprises the flexibility that they will need
when it comes to prototyping and deploying

With five distros now available (Cloudera,
Greenplum, Hortonworks, IBM and MapR) and VMware playing a part in
all of them (Hortonworks most recently), it looks like the virtualisation
specialist has been proactive of late. If you thought Hadoop was in
its infancy, it might be time to reassess – business has just
picked up as all are gunning for Hadoop-centric stacks, new
providers and much, much more.

This aggressive Hadoop approach continues with the release of
Spring for Apache Hadoop 1.0 M2, taking into account the latest
developments in the space.
Costin Leau revealed all that was new for the project, which
began in February, noting big changes such as improved
Hadoop Data Access Object (HDAO) support and new
integration with the fledgling Cascading library
(by introducing dedicated Spring Framework and Spring
Integration resources).

Also new to the project is more robust
security for File-System, Map/Reduce and Pig components,
perhaps a signal that the enterprise can tap into this Hadoop
goldmine. Things have certainly got interesting as VMware pulls
back the curtain on their previously shrouded Hadoop development
 - they’re readying for the next generation of Big Data. It’s
definitely been a Big Data week – what will happen

