High availability on the agenda

Hortonworks bolster Data Platform with 1.1 release

Not much has gone on in the world of Hadoop since June’s big data extravaganza at Hadoop Summit 2012, which key vendors used as a soapbox to outline their strategies for the coming year.

Whilst Cloudera and MapR chose to pursue the possibilities of Hadoop 2.0 (still in alpha), Yahoo! spinoff Hortonworks unveiled their long-awaited Hortonworks Data Platform that would stick to the proven codebase. It left many puzzled, but on reflection, opting for reliability was a wise decision while the enterprise world still has cold feet over Hadoop distributions. But what else can Hortonworks offer that’s innovative?

Three months on and the team have released Hortonworks Data Platform 1.1, with some notable additions to the stack that calm concerns that they were being too conservative.

Arguably the biggest improvement is the set of extra high availability options, which now cover the latest versions of Red Hat Enterprise Linux. This is quite a big deal, as it widens the choice for newcomers, who can now use Linux or solutions from VMware.

Elsewhere, streaming data collector Apache Flume makes its debut within the distribution, to help Hadoop get over its real-time headache. The incubating project aims to provide a distributed, available system to collect the morass of log data into one centralised store. It has already created a big stir, promising to glean insight from data that was too clunky for the old Hadoop to deal with.

Ops is another enterprise stumbling block. Keeping tabs on the Hadoop infrastructure can initially be overwhelming, so HDP 1.1 has created a deeper ops-centre of sorts to manage clusters and integrate more third-party tools, all in one place. There’s also the claim that this platform makes mincemeat of MapReduce jobs, with a 10% performance boost. No mean feat when you consider the already swift nature of Hadoop.

With the final Apache Hadoop 2.0 still some time away, Hortonworks’ decision to get the best out of the current stable release could prove an astute business move. Offering substantial high availability options on steady grounding, and crucially before the next-generation MapReduce (YARN) and HDFS properly come to fruition, might give Hortonworks the edge.

Chris Mayer
