Movin' on up

Typesafe survey reveals increased exploration of Apache Spark

JAXenter Editorial Team

Typesafe found what it likes to call a ‘golden nugget’ in some of its latest developer survey results and set out to investigate the unexpected uptake of Apache Spark. The responses paint a promising and exciting picture for Spark adoption.

In a recent Typesafe survey on Java 8 that drew more than 3,000 participants, 17% of respondents reported using Apache Spark in production, a surprising and interesting tidbit for the Typesafe team. This sparked (zing!) another survey, which has provided more accurate information about the use of Big Data tools.

A new wave of ‘Reactive Big Data’?

A total of 2,136 developers and other IT professionals participated in the Typesafe survey. Of these respondents, 13% said they already use Apache Spark in production, while 31% are currently evaluating it. A further 20% plan to begin using Apache Spark in 2015, and an additional 2% expect to start in 2016.

Another 6% reported that, after testing Apache Spark, they had decided not to use it. On the other hand, a whopping 28% of survey participants had never even heard of the tool. That lack of awareness might well prove damaging to what Typesafe believes is the next wave of ‘Reactive Big Data’.
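For readers who want to sanity-check the breakdown, here is a minimal Python sketch that tallies the shares quoted above and converts them into rough headcounts. The category labels are our own shorthand, and the headcounts are only approximations, since the survey figures are quoted as rounded percentages.

```python
# Shares quoted in the article, as whole percentages of all respondents.
# The labels are shorthand chosen for this sketch, not Typesafe's wording.
TOTAL_RESPONDENTS = 2136

shares = {
    "in production": 13,
    "evaluating": 31,
    "starting in 2015": 20,
    "starting in 2016": 2,
    "tested and rejected": 6,
    "never heard of it": 28,
}


def approximate_headcounts(shares, total):
    """Convert percentage shares into rounded, approximate respondent counts."""
    return {label: round(total * pct / 100) for label, pct in shares.items()}


counts = approximate_headcounts(shares, TOTAL_RESPONDENTS)

# The six categories should account for every respondent.
print("sum of shares:", sum(shares.values()), "%")

for label, pct in shares.items():
    print(f"{label:<20} {pct:>3}%  ~{counts[label]} respondents")
```

The shares add up to 100%, so the six answers cover the whole sample.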

Of those who are already using Spark, 78% said they hope the tool will help them with the rapid batch processing of large amounts of data. The biggest hurdle to the effective use of Spark, as identified by the survey, is a lack of awareness and/or experience.


As reported on the Typesafe blog, Apache Spark, which describes itself as a “fast and general engine for large-scale data processing”, is experiencing remarkable growth in both adoption and awareness. According to the project, Spark has an advanced DAG execution engine that supports cyclic data flow and in-memory computing, which enables it to run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

The full report, which touches on some other hot topics, including what developers are using Spark for, is available for download (registration required).
