Exploration, training and deployment

Why Kubernetes and containers are the perfect fit for machine learning

Murli Thirumale

Both machine learning and the use of cloud-native environments built on containers are becoming more commonplace in the enterprise. Luckily, Kubernetes and containers are a perfect match for ML. The cloud-native model has many advantages that can be brought over to machine learning and other forms of artificial intelligence for more effective, practical business strategies.

Machine learning is permeating every corner of the industry, from fraud detection to supply chain optimization to personalizing the customer experience. McKinsey has found that nearly half of enterprises have infused AI into at least one of their standard business processes, and Gartner says seven out of 10 enterprises will be using some form of AI by 2021. That’s a short two years away.

But for businesses to take advantage of AI, they need an infrastructure that allows data scientists to experiment and iterate with different data sets, algorithms, and computing environments without slowing them down or placing a heavy burden on the IT department. That means they need a simple, automated way to quickly deploy code in a repeatable manner across local and cloud environments and to connect to the data sources they need.

A cloud-native environment built on containers is the most effective and efficient way to support this type of rapid development, as evidenced by announcements from big vendors like Google and HPE, which have each released new software and services to enable machine learning and deep learning in containers. Just as containers speed the deployment of enterprise applications by packaging code together with its runtime requirements, those same qualities make them highly practical for machine learning.

Broadly speaking, there are three phases of an AI project where containers are beneficial: exploration, training, and deployment. Here’s a look at what each involves and how containers can assist with each by reducing costs and simplifying deployment, allowing innovation to flourish.

Exploration

To build an AI model, data scientists experiment with different data sets and machine learning algorithms to find the combination that predicts outcomes with maximum accuracy and efficiency. There are various libraries and frameworks for creating machine learning models for different problem types and industries. Speed of iteration and the ability to run tests in parallel are essential for data teams as they try to uncover new revenue streams and meet business goals in a reasonable timeframe.

Containers provide a way to package up these libraries for specific domains, point to the right data source and deploy algorithms in a consistent fashion. That way, data scientists have an isolated environment they can customize for their exploration, without needing IT to manage multiple sets of libraries and frameworks in a shared environment.
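As a minimal sketch of what that packaging can look like (the base image, library versions, and notebook entry point here are illustrative assumptions, not a prescribed stack), an exploration image might be defined like this:

```dockerfile
# Hypothetical exploration image: pins a Python data-science stack so
# every data scientist runs identical libraries, without IT having to
# maintain them on a shared host.
FROM python:3.11-slim

# Pin the frameworks under evaluation; versions are illustrative.
RUN pip install --no-cache-dir \
    scikit-learn==1.4.2 \
    pandas==2.2.2 \
    jupyterlab==4.1.8

WORKDIR /workspace

# Launch a notebook server as the isolated exploration environment.
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--no-browser", "--allow-root"]
```

Building the image once (`docker build -t ml-explore .`) gives every team member the same isolated environment, started with a single `docker run`.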


Training

Once an AI model has been built, it needs to be trained against large volumes of data across different platforms to maximize accuracy and minimize resource utilization. Training is highly compute-intensive, and containers make it easy to scale workloads up and down across multiple compute nodes quickly. A scheduler identifies the optimal node based on available resources and other factors.
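A hedged sketch of what such a scaled-out training workload can look like as a Kubernetes Job; the image name, training script, and resource figures are illustrative assumptions:

```yaml
# Hypothetical training Job: the Kubernetes scheduler places each of
# the parallel pods on whichever nodes have the requested resources.
apiVersion: batch/v1
kind: Job
metadata:
  name: train-model                # illustrative name
spec:
  parallelism: 4                   # scale training workers up or down here
  completions: 4
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: trainer
        image: registry.example.com/ml/trainer:latest   # assumed image
        command: ["python", "train.py"]                 # assumed script
        resources:
          requests:
            cpu: "4"
            memory: 8Gi
```

Raising or lowering `parallelism` is all it takes to scale the run; the scheduler finds a node with the requested CPU and memory for each pod.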

A distributed cloud environment also allows compute and storage to be managed separately, which cuts storage utilization and therefore costs. Traditionally, compute and storage were tightly coupled, but containers, combined with a modern data management plane, allow compute to be scaled independently and moved close to the data, wherever it resides.
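In Kubernetes terms, one way this decoupling appears is a PersistentVolumeClaim consumed by a training pod; the names and sizes below are illustrative assumptions:

```yaml
# Storage is claimed separately from compute: the PVC below can be
# managed or rebound without touching the pod spec, and the pod can be
# rescheduled onto other nodes while the data stays put.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: training-data              # illustrative name
spec:
  accessModes: ["ReadOnlyMany"]
  resources:
    requests:
      storage: 500Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  containers:
  - name: trainer
    image: registry.example.com/ml/trainer:latest   # assumed image
    volumeMounts:
    - name: dataset
      mountPath: /data             # training code reads data from here
  volumes:
  - name: dataset
    persistentVolumeClaim:
      claimName: training-data
```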

With compute and storage separated, data scientists can run their models on different types of hardware, such as GPUs and specialized processors, to determine which configuration delivers the greatest accuracy and efficiency. They can also improve accuracy incrementally by adjusting weights, biases, and other parameters.
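As a sketch of selecting hardware this way (assuming a cluster with NVIDIA's device plugin installed; the pod and image names are illustrative), a GPU is requested through the pod's resource limits:

```yaml
# Requesting a GPU lets the scheduler place this pod only on nodes
# that expose one; changing this request is how the same container
# image gets tried on a different hardware class.
apiVersion: v1
kind: Pod
metadata:
  name: trainer-gpu
spec:
  containers:
  - name: trainer
    image: registry.example.com/ml/trainer:latest   # assumed image
    resources:
      limits:
        nvidia.com/gpu: 1          # requires the NVIDIA device plugin
```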

Deployment

In production, a machine learning application will often combine several models that serve different purposes. One model might summarize the text in a social post, for example, while another assesses sentiment. Containers allow each model to be deployed as a microservice — an independent, lightweight program that developers can reuse in other applications.
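A hedged sketch of one such model deployed as a microservice; the sentiment-analysis naming, image, and port are illustrative assumptions:

```yaml
# Each model runs as its own Deployment and is reachable through a
# stable Service, so other applications can reuse it independently.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sentiment-model            # illustrative model service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sentiment-model
  template:
    metadata:
      labels:
        app: sentiment-model
    spec:
      containers:
      - name: model
        image: registry.example.com/ml/sentiment:1.0   # assumed image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: sentiment-model
spec:
  selector:
    app: sentiment-model
  ports:
  - port: 80
    targetPort: 8080
```

Other applications call the model through the Service's stable name, so it can be scaled or replaced without touching its consumers.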

Microservices also make it easier to deploy models in parallel in different production environments for purposes such as A/B testing, and the smaller programs allow models to be updated independently of the larger application, speeding release times and reducing the room for error.
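One common pattern for this (a sketch, assuming a Service for a hypothetical `sentiment-model` that selects pods only by the shared `app` label) is to run a second Deployment for the candidate version alongside the current one:

```yaml
# Two versions of the same model run side by side; a Service whose
# selector matches only the shared "app" label spreads requests across
# both Deployments, roughly in proportion to their replica counts.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sentiment-model-b          # candidate version under test
spec:
  replicas: 1                      # small traffic share for the B variant
  selector:
    matchLabels:
      app: sentiment-model
      version: b
  template:
    metadata:
      labels:
        app: sentiment-model       # shared label: the Service matches this
        version: b                 # distinguishing label: the Service ignores it
    spec:
      containers:
      - name: model
        image: registry.example.com/ml/sentiment:2.0-rc1   # assumed image
        ports:
        - containerPort: 8080
```

Because the Service ignores the `version` label, the B variant receives a slice of live traffic and can be promoted or deleted independently of the A version.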


Closing thoughts

At each stage of the process, containers allow data teams to explore, test, and improve their machine learning programs more quickly and with minimal support from IT. Containers provide a portable, consistent runtime that can be deployed rapidly across local and cloud infrastructure to maximize the accuracy, performance, and efficiency of machine learning applications.

The cloud-native model has revolutionized how enterprise applications are deployed and managed by speeding innovation and reducing costs. It’s time to bring these same advantages to machine learning and other forms of AI so that businesses can better serve their customers and compete more effectively.


Murli Thirumale

Murli Thirumale is co-founder and CEO of Portworx, a container-native storage and data management startup. Prior to founding Portworx, Murli founded and sold two companies – Ocarina Networks (acquired by Dell in 2010) and Net6 (acquired by Citrix in 2004). He was also a GM at HP for a decade before that.
