Getting up to speed

Hadoop Hive-as-a-service Qubole notches up $7m funding

Chris Mayer

Created by those behind Apache Hive and Facebook’s analytics platform, is Qubole more than just another Big-Data-as-a-service startup?

The slew of Big Data-as-a-service startups, or the money being
pumped into them, shows no sign of letting up, with Hadoop churner
Qubole raising $7m in their
opening round of funding.

Qubole is aimed predominantly at data
scientists and ETL engineers who want little fuss when it comes
to analysing data pipelines.

Its founders, Ashish Thusoo and Joydeep Sen
Sarma, are responsible for Hadoop’s data warehousing system Hive
and its querying language, which they co-authored while working on
Facebook’s analytics platform.

Naturally, the project is at the heart of the
Qubole Data Service, and could be considered a cloud version of
Hive itself. Qubole processes unstructured data and
lets users run quick Hive jobs within
Amazon Web Services. The platform calls upon a number of analytics
tools like R, SQL sources such as MySQL, as well as NoSQL databases
like MongoDB, before pushing the data to typical business intelligence
applications.

Since coming out of beta in December, Qubole has
processed around half a petabyte of data from clients. Thusoo
believes this quick milestone “demonstrates Qubole’s growth and
viability”.

“We are very excited to raise the bar again as
we continue to innovate on behalf of our users,” he said in a press
release.

The duo’s Hadoop heritage has helped them
optimise the framework, with claims that their platform runs Hive
queries and Hadoop jobs five times faster than Amazon Elastic
MapReduce does.

The hookup to Amazon Web Services is undoubtedly
Qubole’s strongest selling point, with a ready-made customer base
at their disposal, who won’t have to learn something new when keeping
tabs on their cluster.

But is Qubole just another name in an already
competitive field? Data analysis platform Hadapt, for
example, offers a similar concept and was
launched in 2011. Hive itself has been
around even longer and is beginning to show its age.

Leading Hadoop vendors are moving on from Hive, or are opting to
renovate it in their distributions. Cloudera have Impala, MapR have
Drill and Hortonworks have the Stinger Initiative, which is
promising to make Hive 100 times
quicker with a new processing framework called Tez. It seems in
this world, you have to adapt to survive – can Qubole do the same
in the long run?
