Hadoop Central finally gets first product
Hortonworks announces Version 1 of its flagship Data Platform
The team at the heart of Apache Hadoop development has finally released its first dedicated stack, as Hortonworks announced the availability of Hortonworks Data Platform 1.0 at Hadoop Summit 2012 yesterday.
Since the release of Apache Hadoop 1.0, a number of contenders have entered the fray with Hadoop-based distros, but until now, the team central to Apache Hadoop development from the start had held off. Whilst Hadoop's open source core is driven out of the Apache Software Foundation, support comes from companies like Cloudera and MapR (often partnering up with larger vendors to provide proprietary extensions), but Hortonworks were keen to wait until now to show their hand. And with good reason.
The company, formed a year ago as a spinout of the Yahoo! Hadoop team, have been instrumental in the first major release, having been entrusted with creating one key aspect of Hadoop: the data-cruncher MapReduce. They've also laid the foundations for Hadoop 2.0, which recently saw an alpha release. Hortonworks are offering something slightly different in Data Platform: it is the only option thus far that is 100% based on open source Apache Hadoop.
Speaking to The Register, John Kreisa, vice president of marketing at Hortonworks, explained why they were going for a pure Hadoop stack:
There are no proprietary bits in the Hortonworks Data Platform, no lock in, and we (sic) doing this because we believe that it will greatly facilitate Hadoop adoption.
Given their instrumental part in getting Hadoop this far, it's a shrewd move. Many of those already running Hadoop in production will want no ties to other vendors, and will want the level of quality assurance that comes from the company most involved with Apache Hadoop development itself.
The platform bills itself as 'an enterprise-class solution architecture for high availability'. Teaming up with VMware, who have provided the infrastructure in vSphere, Hortonworks Data Platform brings together other spinoff Hadoop projects such as the scripting project Pig, querying hub Hive, metadata king HCatalog (Hortonworks stands alone in offering that) and ZooKeeper to keep the whole thing ticking over smoothly. The standard HDFS and MapReduce duo sits within the architecture too, as expected.
As is company policy, Hortonworks will contribute code from its HA solution back into the Hadoop project itself. The flexibility offered here is sure to entice many to the distro, as long as it gets the right marketing: Data Platform can adapt to the upcoming Hadoop 2.0 release when it arrives.
Now that there are five stacks at the table (Cloudera, EMC Greenplum, Hortonworks, IBM and MapR), all with merit, the Hadoop choice has just become a little harder for enterprises. But if you're looking for no tie-ins, seamless integration and pure, tried-and-tested Hadoop 1.0, Hortonworks Data Platform might just be the one for you.