NoSQL in the firing line

MapR release HBase-heavy Hadoop distro, M7

Chris Mayer

Every Hadoop vendor wants to be different. MapR opt for a NoSQL approach to their special Hadoop sauce.

Not wanting Cloudera to steal the limelight with Impala,
MapR have unleashed their latest
distribution, with yet another take on how best to push Hadoop.

MapR were one of the first Hadoop vendors on the scene,
making their name with their advanced flavour of Hadoop. But now
it appears offering non-relational capabilities is at the centre of
MapR’s minds with the crosshairs firmly fixed on HBase, the NoSQL
distributed database.

The company first outlined plans to speed up the HBase
layer of Hadoop back in October in their whitepaper for the M7
distribution. With research telling them that
of Hadoop shops
[PDF] are using HBase in
production, it seems like a natural choice for MapR to fine-tune
the NoSQL layer.

The top-end M7
edition aligns HBase with MapR’s closed source version of the
Hadoop filesystem, so they share a single data layer. M7 splits up
HBase database tables and stores them within the MapR filesystem,
giving a huge boost in performance. The company boldly boast that
M7 delivers a rapid “one million operations/sec with a ten node
cluster”, supports one trillion tables in one cluster and “ensures
99.999% availability for HBase and Hadoop applications”.
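To put those headline numbers in perspective, here is a rough back-of-the-envelope sketch. It takes the quoted figures at face value and assumes, purely for illustration, that load is spread evenly across nodes and that a year has 365 days; the function names are hypothetical and none of this comes from MapR's own material.

```python
# Back-of-the-envelope arithmetic on the vendor-quoted figures.
OPS_PER_SEC = 1_000_000   # quoted: "one million operations/sec"
NODES = 10                # quoted: "ten node cluster"
AVAILABILITY = 0.99999    # quoted: "99.999% availability"

MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def ops_per_node(total_ops: float, nodes: int) -> float:
    """Average operations per second per node, assuming an even spread."""
    return total_ops / nodes


def downtime_minutes_per_year(availability: float) -> float:
    """Downtime per year that a given availability fraction still allows."""
    return (1 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    print(f"{ops_per_node(OPS_PER_SEC, NODES):,.0f} ops/sec per node")
    print(f"{downtime_minutes_per_year(AVAILABILITY):.2f} minutes of downtime per year")
```

In other words, the claim works out to roughly 100,000 operations per second per node, and "five nines" still leaves room for a little over five minutes of downtime a year.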

M7 is the top-end distribution offered by MapR
and is primarily targeted at heavy
NoSQL enterprise users. M3 is the free community version of MapR’s
Hadoop stack, while M5 is the half-way house which
opens up extra features such as mirroring
within the company’s filesystem, with
additional tech support.

The competition is heating up as vendors look
for new ways to distinguish themselves from the pack. Parallel to
rolling out the release, MapR have announced the inclusion of a
search engine within their distribution, in conjunction with
Lucidworks, the company behind Apache Lucene and Solr. Currently in
private beta, the integration lets users index and search standard files without
needing to perform conversion or transformation, as well as clone
and snapshot files within the filesystem.

There’s no room for Apache Drill,
the Google Dremel-mimicking system for analysing large
datasets, within M7 just yet. The project, led by MapR, is still
under heavy development in the Apache Incubator.
