Mark Brewer: “Predictable DevOps is a critical requirement for fast data”
© Shutterstock / garagestock
Lightbend recently revealed the findings of a new survey of 2,457 global developers. We talked with Mark Brewer, CEO of Lightbend, about the key findings, the differences between fast data and big data, misconceptions and more.
Lightbend recently surveyed almost 2,500 global developers in order to better understand the current state of fast data.
The survey aims to offer answers to questions about the alignment between fast data and business value, patterns and challenges early adopters face and the impacts on software development and tool choices.
The big data market is undergoing a rapid transformation from data at rest to data in motion. Analysts indicate that fast data is being adopted three times faster than traditional Hadoop.
Executive Summary — Lightbend Global Developer Survey
The results can be summarized in just a couple of words: speed matters.
Over 50 percent of the respondents choose new frameworks and languages based on fast data requirements. And although respondents seem to be “more confident working with disparate data sources and continuous streams of input than operationalizing their systems in production,” integrating, scaling, debugging, and monitoring are posing challenges for developers.
Results also show that “choosing the right tools and techniques for fast data can be daunting as the emerging ecosystem of streaming frameworks is constantly shifting and not fully understood by developers and architects.”
Download the full report.
JAXenter: Analysts indicate that fast data is being adopted three times faster than traditional Hadoop, according to Lightbend’s latest survey. Why is the fast data market moving so fast? What’s the explanation?
Mark Brewer: If we go back 15-20 years when the Internet kicked off, companies worried about things like — what do we do about all these unprecedented large datasets we’re capturing? The thinking was let’s just capture it somehow and then look at it after the fact. That was the backdrop that led to the Hadoop ecosystem — this process of capturing data as it came in, and then running large batch jobs or interactive data warehouse jobs after the fact, to understand it.
The use cases driving today’s new applications are all about using the data faster. AI, machine learning, IoT, predictive analytics — processing streaming data has become a first-class concern. If you are an auto loan company, the new customer prospect wants a quote instantaneously, or they will go to a competitor. If you are delivering a smart home product, real-time is table stakes for even shipping a product. The market opportunities and forces of competition are pushing for faster value from data, faster answers and now, smarter answers, using AI and ML.
JAXenter: What is the main difference between fast data and big data?
Mark Brewer: The data conversation is moving beyond analytics and into the application. The early waves of big data were really dominated by this concept of having very large datasets that were external to the applications and systems, and the predominant concern was — how do I use data science and analytics to extract meaning out of these datasets, with batch operations?
The market opportunities and forces of competition are pushing for faster value from data, faster answers and now, smarter answers, using AI and ML.
There has long been this concept of the “three Vs” within big data conversations: volume, variety and velocity. It’s a distinction that the analytics vendors emphasized to explain to the market how they should think about their data assets. What we’re seeing with developers today is that the concept of velocity, frequently expressed by the term “fast data,” is the overwhelming priority as they build a new class of applications and systems, where data is no longer external to the system. Instead data, or streaming/fast data, to be precise, is internal and integral to the application infrastructure itself.
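The batch-to-streaming shift Brewer describes can be sketched in a few lines of plain Python (an illustration only; production fast data systems rely on engines such as Spark, Flink, or Akka Streams). A batch job computes its answer over the complete dataset after the fact, while a streaming computation keeps the answer current as each event arrives:

```python
from statistics import mean

# Batch ("data at rest"): compute over the complete dataset after the fact.
readings = [3.0, 5.0, 4.0, 6.0]
batch_avg = mean(readings)

# Streaming ("data in motion"): update the answer incrementally as each
# event arrives, so the current value is available at any moment.
def running_mean():
    count, total = 0, 0.0
    avg = None
    while True:
        value = yield avg
        count += 1
        total += value
        avg = total / count

stream = running_mean()
next(stream)  # prime the generator
for value in readings:
    current = stream.send(value)

# Both arrive at the same answer; the streaming version just never had
# to wait for the dataset to be complete.
assert current == batch_avg
```

The point of the sketch is architectural, not arithmetic: in the streaming version the computation lives inside the application and has an up-to-date answer at every step, which is what “data internal and integral to the application infrastructure” means in practice.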
SEE ALSO: Big Data becomes Fast Data
JAXenter: What is the biggest misconception about fast data?
Mark Brewer: I think the biggest misconception is that it’s going to entirely replace the old Hadoop ecosystem. For many, simply speeding up batch jobs in Hadoop is adequate for moving, say, from a job taking a week, to a day. And there are many use cases where batch will continue to be “good enough” for the foreseeable future. So it’s not that we’re going to turn off our Hadoop clusters or stop doing data warehousing – they’re still very important in the larger ecosystem of things. But increasingly, if we can get value out of our data as quickly as possible, that opens up new opportunities to provide value to customers, to improve business operations, and create brand new services and offerings that were just not possible before.
Any industry where response time matters — whoever harnesses fast data has a major advantage.
Another misconception is that fast data equates to “real-time”, with “real-time” carrying the same meaning for everyone. What we’re seeing is that for some businesses, “real-time” means accelerating a batch job from once per day to intra-hour. For others, “real-time” literally means sub-millisecond response time. So there’s a lot of room for interpretation about “how fast is fast?”— but the one thing that’s clear is that most businesses are indeed trying to get faster.
JAXenter: The survey also shows that more than half of the respondents choose new frameworks and languages based on fast data requirements. Does this mean that fast data is shaking up the traditional stack?
Mark Brewer: Absolutely. 37 percent of developers we surveyed are choosing new frameworks based on their ability to handle data more effectively. 18 percent are choosing new languages based on their ability to handle data more effectively. And 30 percent are modernizing old systems specifically to be more compatible with new data requirements.
Lightbend is addressing the market of 10 million-plus JVM-based developers who, as they get into these fast data use cases, discover that the Java EE platform really wasn’t designed to solve these types of challenges at runtime. The explosion of new frameworks has been a response to this general inability of the Java ecosystem to keep pace with the modern class of data-driven applications, and that’s the real basis for the renaissance you’re seeing in new technology choices for the developer.
JAXenter: How important is it to use more fast data right now? What are the biggest benefits?
Mark Brewer: For every business, time equals money: the sooner I can get answers or value out of the data, the more valuable it is. I would say that fast data matters more today because the tools to achieve it are more readily available, and you see the ripple effect through every industry as the companies that do it well displace the ones that do not. The same way that Netflix killed Blockbuster, that Uber and Lyft disrupted the taxi industry: in any industry where response time matters, whoever harnesses fast data has a major advantage. The biggest benefits are delivering more value in products and services, competitive advantages, and unlocking new business opportunities and revenues out of existing data assets.
The explosion of new frameworks has been in response to this general inability of the Java ecosystem to keep pace with the modern class of data-driven applications.
JAXenter: Is choosing the right frameworks and tools confusing?
Mark Brewer: Absolutely. There isn’t a one-size-fits-all in this space. There is an explosion of new tools and frameworks, and selecting the “right” ones is not only very time-consuming but also has major implications for project success. Even at the streaming engine level, there are many different choices, different streaming use cases that dictate which is the best fit, and precious little market feedback. That’s exactly why Lightbend Fast Data Platform includes expert guidance for developers and architects so they can pick the tools that are the best fit for their team, the project at hand and their company.
JAXenter: How important is it right now to embrace DevOps? Does a successful DevOps strategy ensure fast data success?
Mark Brewer: Predictable DevOps is a critical requirement for fast data, and we see enterprises struggling in a couple of key areas. First, after they navigate the gauntlet of choosing from the many different available fast data frameworks, designing their systems, and integrating it all (something Lightbend Fast Data Platform addresses with its curated, pre-integrated platform and expert advice throughout the process), they find themselves at runtime wondering how to pinpoint the cause of problems.
Lightbend Fast Data Platform meets that challenge in two ways. The first is one-stop support, filling the void for a “single throat to choke” when enterprises run into runtime challenges with all these frameworks, whose operational model is still very early. But even before getting to that, DevOps teams need monitoring to keep an eye on what’s going on. So Lightbend Fast Data Platform includes purpose-built monitoring that can be an ally to DevOps teams, helping them identify problems and performance issues affecting their fast data applications.
But more broadly, the DevOps challenge with fast data is that it’s always-on. With batch processes, jobs ran external to production apps; failures were tolerable, and maintenance could happen at predictable intervals. With fast data use cases, keeping the system fully operational when there is a never-ending stream of data is driving developers and operations teams toward the tenets of Reactive. By including Reactive tooling in the form of the Reactive Platform, Lightbend Fast Data Platform helps deliver the extreme resilience that DevOps teams need and rely on when they become responsible for these critical applications.
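One concrete Reactive tenet behind that resilience is backpressure: an always-on stream must slow its producers down rather than buffer input without bound when consumers fall behind. A minimal sketch of the principle using only a bounded queue from the Python standard library (this is not Lightbend’s tooling, just an illustration of the idea):

```python
import queue
import threading

# A bounded queue provides simple backpressure: when the consumer falls
# behind, the producer blocks instead of exhausting memory.
buffer = queue.Queue(maxsize=4)
consumed = []

def consumer():
    while True:
        item = buffer.get()
        if item is None:  # sentinel: the stream has been closed
            break
        consumed.append(item)
        buffer.task_done()

worker = threading.Thread(target=consumer)
worker.start()

# The producer emits 100 events but can never get more than 4 ahead of
# the consumer; put() blocks whenever the buffer is full.
for event in range(100):
    buffer.put(event)
buffer.put(None)  # close the stream
worker.join()

assert consumed == list(range(100))
```

Streaming frameworks implement the same contract at a much larger scale (Reactive Streams formalizes it as demand signaling between publishers and subscribers), which is what keeps an always-on system stable when load spikes instead of failing at an unpredictable moment.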