Hadoop just got a speed power-up
Twitter's real-time Hadoop front-end tool now cloud-ready, thanks to Nodeable's StreamReduce
Big Data analytics is big business these days - more and more firms want to grasp key trends across huge datasets as fast as possible. What was once interest from afar is now firm demand among the larger enterprises. Yet the Big Data revolution has yet to fully realise its potential with real-time analytics, simply because that capability doesn't currently exist within batch-oriented Hadoop. And that functionality isn't coming for the foreseeable future either.
However, a number of options are appearing on the horizon that seek to change this. The latest to spot room in the market is Nodeable, who have diversified radically away from the cloud analytics systems manager they launched last year, towards a real-time model. Their banner product is StreamReduce, billed as the first real-time Hadoop pre-processing engine, which aims to make sense of the reams of multi-structured data as it hits the system.
Based on open source Storm, a real-time analytics framework that Twitter acquired from BackType last year and now uses internally, StreamReduce is essentially a cloudy version of its predecessor, yet it is an important step for Big Data. It finally opens the doors to the wider developer community, a move which Nodeable are keen to shout about to gain appeal. Now, sophisticated data analytics are available not just to data scientists but also to those who aren't so eager to shell out huge capital to get their hands on critical insight. Pricing is set at $99 per month.
Nodeable say that StreamReduce is the ideal 'cloud-based complement to batch processing via Apache Hadoop and Amazon Web Services Elastic Map Reduce', providing a much-needed speed boost to MapReduce. That boost is badly needed, given the murmurs of discontent around, and Nodeable are already getting glances from suitors. The most notable so far is Cloudera.
Mike Olson, CEO of Cloudera and also on the board at Nodeable, said: “As the demand for Hadoop and big data insights continues to grow, Nodeable’s pre-processing engine helps large enterprises get more from Hadoop faster and its real-time cloud delivery brings the benefits of big data analytics to companies without the dedicated IT staff and resources of the Global 2000.”
“For almost a year the engineering team at Nodeable has worked closely with more than 400 beta users who’ve told us that a real-time analytics complement to Hadoop is a top priority,” said Dave Rosenberg, Founder and CEO of Nodeable. “Batch workflows are too slow for turning data into useful, actionable information. StreamReduce solves that problem with a simple cloud-based solution.”
The UI for StreamReduce will look familiar to those who've already used the original Nodeable product, and comes with a nice Twitter-like gloss. StreamReduce runs in the Amazon Web Services cloud, picking the best of AWS tools, as well as popular NoSQL choices like MongoDB and Amazon’s DynamoDB. Rosenberg, however, has told GigaOM that this might change to Cassandra to suit higher volumes of traffic.
The ideal use cases for StreamReduce, according to Nodeable, are pretty broad and include:
- Log and clickstream analysis
- Anomaly detection in Amazon Web Services EC2 instances
- Security and fraud detection
- Mobile and geo-location measurement
- Pinpointed advertising and marketing
The key here is that Hadoop might have eagle-eye accuracy with data that has already been processed, but it isn't capable of harvesting similar insights from data there and then. Thankfully, up steps Nodeable with StreamReduce, a compelling solution to this complex issue at least.
Expect more real-time analytics companies to crop up with their own tools and buddy up with the Hadoop providers in the next few months. However, Nodeable and Cloudera have already gained a potentially pivotal head start on their respective competitors.