Containerization and artificial intelligence: The keys to unlocking next-gen analytics
What is the next evolutionary step for data analysis? In this article, Piet Loubser argues that the future lies in containerization and in embedding AI to make data pipelines more intelligent.
For all the talk about data fueling digital transformation, the world seems to remain in a strange limbo: a few companies embrace the concept, many continue to be slow followers, and others are not yet familiar with the idea at all.
A new report from Gartner, titled Applied Infonomics: Seven Steps to Monetize Available Information Assets (Nov 2018) shows top performers are more than twice as likely as typical performers to have monetized information assets, and eight times more likely to have done so than trailing organizations.
From my experience over the past 20+ years in analytics and data management, the main challenge for the companies in the middle and for the laggards is that they generally lack a vision of what could be done differently from their traditional information processes. They are waiting for vendors to bring them best practices, use cases and ideas. While vendors strive to do this – and often do bring these new use cases – the problem is that the top performers in any particular industry are often running away with your market while you wait.
A new mindset for data practitioners: analysts, engineers, and data scientists
The key challenge in unlocking the true value of data in and for our organizations is that we have to unlearn much of what we were taught over the past 20+ years of data warehousing and Business Intelligence (BI) and embrace entirely new concepts and skills.
The traditional BI world was built and designed at a time when repeatedly answering a few well-known questions – such as how much of product “A” we sold in North America – was enough. The entire data-to-insights value chain was designed around that concept and took into consideration the massive cost of processing and storing data for analytics. However, neither of these assumptions holds true today. Global competition, the elevation of the customer as the true king, and cloud computing have all but invalidated these old data processes.
Containerization and AI emerge as the key enablers for self-service data prep
In a world where analytical exploration has become the key to an organization’s ability to constantly innovate, we have to empower those with the best business context and understanding to take the lead. These are often the data and/or business analysts. The traditional approach where these experts had to ask developers to pull data together is no longer acceptable because it takes too long and runs the risk of removing much of the possible insights before the business users have a chance to look at it.
Self-service data prep enables the business to operate at the speed of business across the entire data value chain. Another critical need is to assist the non-technical business user with intelligence and to guide them so that they do not get themselves into trouble. With this in mind, expect to see the following trends in the coming year:
- Containerization, powered by Kubernetes. Containerization has been around for decades, but Kubernetes takes it to the next level with extremely scalable container distribution and management. Analytical workloads today can no longer be confined to a single location. They have to be built in such a way that they can be deployed to any location, run for as long as needed, and then release their resources upon completion. Kubernetes is the cornerstone of the agility we require, as it opens the door for us to provide zero-admin workloads that can be deployed across multiple cloud or on-prem infrastructures, wherever data gravity or processing power is available.
- Embedded AI to make data pipelines more intelligent. While we clearly see the value of using AI and related techniques, such as Machine Learning (ML), to understand customer purchase behaviors, we have not done enough with AI and ML to help us better understand – and make smarter – our own data processes. Using AI, organizations can better understand what ingested data means and how it can be enriched. A key trend over the coming year will be to deploy algorithms that help with every step of the data value chain – from ingesting the data, to understanding what it means, to standardizing it and, finally, enriching it with additional data elements.
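The run-to-completion analytical workloads described above map naturally onto the Kubernetes Job resource. As a minimal sketch (the job name, image, and resource figures are hypothetical), a batch data-prep step might be declared like this, with a TTL so the cluster reclaims resources shortly after the run finishes:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: analytics-prep            # hypothetical job name
spec:
  ttlSecondsAfterFinished: 300    # clean up and free resources 5 min after completion
  template:
    spec:
      containers:
      - name: prep
        image: registry.example.com/data-prep:latest   # hypothetical image
        resources:
          requests:
            cpu: "2"
            memory: 4Gi
      restartPolicy: Never        # run to completion; do not restart after success
```

Because the manifest is declarative, the same Job can be applied unchanged to any conformant cluster – cloud or on-prem – wherever the data happens to live.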
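To make the second trend concrete, here is a minimal, self-contained sketch of an “intelligent” ingestion step. A production pipeline would use trained ML models; the regex and date-format heuristics below stand in for the idea of inferring what a raw column means and then standardizing it automatically (`infer_type`, `standardize`, and `DATE_FORMATS` are illustrative names, not any vendor’s API):

```python
import re
from datetime import datetime

# Toy stand-in for an ML-driven pipeline step: guess the semantic type of a
# column from sample values, then normalize the values accordingly.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y"]

def infer_type(values):
    """Guess the semantic type of a column from its sample values."""
    if all(re.fullmatch(r"-?\d+(\.\d+)?", v) for v in values):
        return "number"
    for fmt in DATE_FORMATS:
        try:
            for v in values:
                datetime.strptime(v, fmt)
            return "date"
        except ValueError:
            continue  # try the next candidate date format
    if all(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) for v in values):
        return "email"
    return "text"

def standardize(value, inferred):
    """Normalize a single value according to its inferred type."""
    if inferred == "number":
        return float(value)
    if inferred == "date":
        for fmt in DATE_FORMATS:
            try:
                return datetime.strptime(value, fmt).date().isoformat()
            except ValueError:
                continue
    if inferred == "email":
        return value.lower()
    return value.strip()

column = ["03/11/2021", "15/06/2020", "01/01/2019"]
kind = infer_type(column)
print(kind, [standardize(v, kind) for v in column])
# → date ['2021-11-03', '2020-06-15', '2019-01-01']
```

In a real pipeline, the inferred type would also drive the enrichment step – a column recognized as a location, for instance, could be joined against geographic reference data.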
While self-service data prep is a game changer for data analysts, data will remain a team sport. It will continue to require a collaborative effort among the business (who know the context), IT (who know the data technologies), and data scientists (who know the statistical and mathematical algorithms to exploit it). It will also require a new mindset in which shadow IT is no longer something we dread, but where we embrace our counterparts across the organization, knowing everyone is part of the data value chain. In this way, each group will be able to apply its expertise and play a critical role in unlocking that breakthrough insight that will deliver great results for our business.