Time to get real
SpringSource announce next Big Data steps in Spring XD
Following the successful launch of their Hadoop project in February, SpringSource have unveiled their next steps into the world of Big Data.
Spring XD (short for eXtreme Data, not the emoticon) is set to tackle “Big Data complexity”, according to SpringSource engineer Mark Fisher.
SpringSource spent the last year laying the groundwork with Spring for Apache Hadoop, which can run MapReduce jobs and provides helper classes for offshoot projects like Hive and Pig. With that in place, Spring XD is focused on dealing with common Big Data use cases.
The reveal of Spring XD comes mere hours before the launch of Pivotal, the new “platform” for VMware and EMC’s line of big data and cloud products.
Fisher outlines four ambitious key goals for Spring XD. These centre on “high throughput” data ingestion and the ability to export data to relational and NoSQL databases. The red-hot trend of real-time analytics will also be addressed, with Spring XD set to include “real-time analytics at ingestion time”, although how advanced that will be remains to be seen. The project will also introduce Hadoop workflow management through batch jobs, which will interact with fellow Hadoop projects such as Cascading, as well as standard enterprise systems.
Although many of these use cases are already covered in SpringSource’s Spring Data book (which has sample code on GitHub), Fisher believes the arrival of Spring XD will provide a “consistent” and “familiar” model for Spring developers. Further down the roadmap, Spring XD will provide an out-of-the-box executable server, pluggable modules and a model for collecting instances on or off the Hadoop cluster.
The repository can already be forked, however. Fisher admits that the project is in its infancy, but says he wanted the announcement to come out as soon as possible so that community members could get their teeth into it. Further milestones are expected in May, June and August before a release candidate arrives in September.