Someone's moving quickly

VMware makes Hadoop advances - Project Serengeti and Spring for Apache Hadoop 1.0 M2

VMware has certainly been busy this week mapping out their Hadoop strategy. Earlier today we mentioned how they helped provide the cloud infrastructure for Hortonworks Data Platform 1.0; now they've taken two more steps towards Big Data dominance.

First up is the arrival of a new open source project, Serengeti, with the primary goal of making their virtualised environments 'Hadoop-aware', enabling enterprises to quickly deploy, manage, and scale Apache Hadoop in virtual and cloud environments. This olive branch to the Big Data behemoth appears to have come about after realising how strong the Apache community behind it has become since the beginning of the year.

Project Serengeti will be a one-click deployment toolkit that removes the stumbling blocks and makes it possible to deploy Hadoop clusters in mere minutes - fully utilising an array of Hadoop-related projects such as Apache Pig and Apache Hive. Although still nascent in its 0.5 release, Serengeti could well become a vibrant habitat for Cloud Foundry developers to utilise Hadoop. It's available under an Apache 2.0 license.

"Hadoop must become friendly with the technologies and practices of enterprise IT if it is to become a first-class citizen within enterprise IT infrastructure. The resource-intensive nature of large Big Data clusters make virtualization an important piece that Hadoop must accommodate," said Tony Baer, Principal Analyst at OVUM. "VMware's involvement with the Apache Hadoop project and its new Serengeti Apache project are critical moves that could provide enterprises the flexibility that they will need when it comes to prototyping and deploying Hadoop."

With five distros now available (Cloudera, Greenplum, Hortonworks, IBM and MapR) and VMware playing a part in all of them (Hortonworks most recently), it looks like the virtualisation specialist has been proactive of late. If you thought Hadoop was in its infancy, it might be time to reassess - business has just picked up as all are gunning for Hadoop-centric stacks, new providers and much, much more.

This aggressive Hadoop approach continues with the release of Spring for Apache Hadoop 1.0 M2, which takes into account the latest developments in the space. Costin Leau revealed all that was new for the project, which began in February, noting big changes such as improved Hadoop Data Access Object (HDAO) support and new integration with the fledgling Cascading library (by introducing dedicated Spring Framework Taps and Spring Integration resources).

Also new to the project is more robust security for File-System, Map/Reduce and Pig components, perhaps a signal that the enterprise can tap into this Hadoop goldmine. Things have certainly got interesting as VMware pulls back the curtain on their previously shrouded Hadoop development - they're readying for the next generation of Big Data. It's definitely been a Big Data week - what will happen next?

Chris Mayer
