The importance of planning ahead

Avoiding cloud lock-in must be about more than orchestration

Patrick Callaghan

Containers can provide an easier way to deploy software by stripping out extraneous material and reducing image sizes to solely what is required. At the same time, microservices break down applications into smaller components that fulfill specific roles within the application. Combined, containers and microservices can make applications much easier to scale.

Choosing a platform for your application can seem like the simplest decision, but it can have the most impact on your future development choices. For some teams, the choice of cloud provider, operating system or programming language can be the cause of much wailing and gnashing of teeth internally.

To overcome this, many development teams are looking at software containers and microservices. Containers can provide an easier way to deploy software by stripping out extraneous material and reducing image sizes to solely what is required. Removing these additional and unnecessary elements is a core reason for implementing containers, as it makes developers more productive – around 57 percent of those surveyed by 451 Research listed developer productivity as a key reason to move.

At the same time, microservices break down applications into smaller components that fulfill specific roles within the application. By using smaller building blocks, these applications can be scaled up and down more easily. This combination of containers and microservices is essential for companies that want to scale up extremely rapidly in response to peaks in demand from customers that want service right now.

Hybrid cloud and multi-cloud strategies with containers

Alongside this application development trend, the role of infrastructure is changing. More companies are adopting hybrid cloud and multi-cloud strategies in order to keep up with changing levels of demand on IT. Gartner has estimated that around 70 percent of enterprises will be pursuing multi-cloud strategies by 2019, compared to only 10 percent in 2016.

Forrester Consulting found a similar trend taking place in 2018 – around 60 percent of companies have already deployed production applications in the public cloud, with the main driving force here being business performance and operational efficiency. This rapid shift is being led by developers, as they want more flexibility around how they run and deploy applications.

This move to hybrid and multi-cloud is also being driven by the fear of lock-in. While public cloud services can offer huge compute, storage, and service availability at the drop of a hat, many within IT are cautious of putting all their eggs in one metaphorical basket. Relying on a single public cloud provider is risky if the specific services you depend on are later changed or withdrawn. Similarly, companies in the retail and e-commerce sector have been wary of running on public cloud services that are owned by one of the biggest retailers in the world.

Containers have been posited as a way to avoid this lock-in. By abstracting the application and its essential elements away from the underlying hardware and platform, any container image should be able to run on any cloud service. This has been one of the reasons for the growth of Kubernetes, the open source container management and orchestration platform released by Google in 2014 and now maintained under the Cloud Native Computing Foundation (CNCF).

Kubernetes offers a way to run containers efficiently at scale, automating many of the management steps that are involved in running clusters. By making it simpler to orchestrate clusters and pods of containers, Kubernetes has made container-based applications easier to operate and manage over time. It has also removed some of the challenges around scaling up applications to meet changing demand levels, even when operating in hybrid and multi-cloud environments.

More importantly, multiple public clouds now support Kubernetes and offer managed Kubernetes services too, making it both easier to deploy and to move if you decide to change platforms. This ubiquity is one element that – so the theory goes – makes it easier to avoid lock-in. Don’t like your current public cloud provider? Simply move those containers to another provider, or work with another service to manage them on your behalf.

Service ubiquity versus data autonomy

However, containers are only one element of many modern applications. While many services will run in containers, they are often stateless and are not used for storing and analyzing data. As applications operate, they will create data that has to be stored and managed over time.

In order to truly achieve application autonomy, it’s important to look at containers and data together. If your containers are portable but your data is not, then you are effectively locked into a specific provider for that element of your application. Similarly, if you are tied to public cloud provider tools for analytics or data storage, then you are going to find it harder to migrate away if you ever need to.

To avoid this, it’s important to look at what data management, storage and analysis requirements you have. Are you carrying out analytics on the data that your applications create, and how soon after any data is created will this take place? Are you looking at historical trends or do you need to make analytical decisions in the moment?

For some applications, historic analysis of information held in a data lake may be enough. For most applications today, however, data has to be used as soon as it is created. For e-commerce and retail companies, moves like personalization and recommendation of products have to take place as close to a customer action as possible to have a chance of success. Providing a great recommendation after the fact is interesting, but not likely to change behavior compared to a personalized response that takes place in real time.

Similarly, analytics can be based on multiple different ways of storing data. Relational, NoSQL and graph databases all provide different methods for handling data and gleaning information from the huge volume of information that applications create. With all of these different options available, it is worth looking at how these different data models and databases can be implemented alongside your container-based application or as containers themselves.

If your application relies on specific functionality for data storage or analytics that is based on a specific public cloud service, then you will have to rely on that service over time whether you use containers or not. This creates an element of lock-in. To get around this, your data layer should be able to run across multiple locations or cloud providers in the same way that containers can achieve. You need to have the flexibility to move data in and out of various clouds whenever you want, with no change to your application.
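One way to picture "no change to your application" is to have application code depend only on a small storage interface rather than on a provider's SDK. The following is a minimal sketch under that assumption. The names `DataStore`, `InMemoryStore`, `record_order` and `order_status` are hypothetical and are not taken from any real library; a real backend would wrap an actual database driver instead of a dictionary.

```python
from typing import Dict, Optional, Protocol


class DataStore(Protocol):
    """Hypothetical minimal interface that the application codes against."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    """Stand-in backend for illustration; 'location' is only a label."""

    def __init__(self, location: str) -> None:
        self.location = location
        self._rows: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._rows[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._rows.get(key)


def record_order(store: DataStore, order_id: str, status: str) -> None:
    """Application logic: knows the interface, not where the data lives."""
    store.put(f"order:{order_id}", status)


def order_status(store: DataStore, order_id: str) -> Optional[str]:
    """Read back a status; returns None for an unknown order."""
    return store.get(f"order:{order_id}")


if __name__ == "__main__":
    # The same application functions run unchanged against either backend.
    for store in (InMemoryStore("on-premises"), InMemoryStore("public-cloud")):
        record_order(store, "1001", "shipped")
        print(store.location, order_status(store, "1001"))
```

The design choice here is that anything provider-specific stays behind the interface, so moving the data layer means swapping the backend rather than rewriting `record_order`.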

There are two sides to this consideration. The first is that a cloud-independent approach should let you migrate between internal data center infrastructure and a public cloud provider, or run across multiple public cloud services as required. This “active everywhere database” approach should involve the same database or service on every platform. An example here would be Apache Cassandra, which can run across multiple clouds or in hybrid environments independently. This approach can ensure that data is not locked into a specific cloud, while still benefiting from the performance advantages of the cloud.
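As a rough illustration of the Cassandra example, replication across locations is configured per keyspace using `NetworkTopologyStrategy`, with one replica count per datacenter. The sketch below only builds the CQL statement as a string. The helper `keyspace_cql`, the keyspace name `shop` and the datacenter names `onprem_dc` and `cloud_dc` are assumptions for illustration; in a real cluster the datacenter names have to match the names the cluster itself reports.

```python
import re
from typing import Mapping


def keyspace_cql(name: str, replicas_per_dc: Mapping[str, int]) -> str:
    """Build a CREATE KEYSPACE statement replicated across datacenters.

    Hypothetical helper: it only formats CQL text and does not talk to
    a cluster.
    """
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        raise ValueError(f"invalid keyspace name: {name!r}")
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")

    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, count in sorted(replicas_per_dc.items()):
        if count < 1:
            raise ValueError(f"replica count for {dc!r} must be >= 1")
        parts.append(f"'{dc}': {count}")

    options = ", ".join(parts)
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        "WITH replication = {" + options + "};"
    )


if __name__ == "__main__":
    # Assumed layout: three replicas on-premises, three in a public cloud.
    print(keyspace_cql("shop", {"onprem_dc": 3, "cloud_dc": 3}))
```

With a layout like this, each location holds its own replicas of the data, which is what lets the application read and write locally in either place.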

The second is running across multiple cloud services in order to take advantage of what each one offers. In this instance, a database that can provide a single data fabric across multiple clouds lets you use the specific services of each cloud platform. For example, you might use Azure IoT to ingest data, Google Cloud Machine Learning to process that data, and then AWS S3 to store the resulting models and reports. Each of these platforms can meet its own specific goal within an application, while the application still retains its independence.

Containers, data, and scale

For applications that are creating data at scale, running databases as images or containers may not be suitable for production environments. Whereas application components can be stateless, data storage has to be stateful, so may not be best suited to running as containers. Instead, containers can connect to the data layer via APIs. At the same time, this data should be independent of any public cloud so it can be run in hybrid or multi-cloud environments, just as containers can.
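To keep a containerized service stateless and portable, the location of the data layer can be passed in from outside the image, for example through environment variables. This is a minimal sketch under that assumption. The variable names `DATA_LAYER_HOSTS` and `DATA_LAYER_PORT` and the `DataLayerConfig` type are hypothetical; the default of 9042 is Cassandra's standard CQL native port, used here only as an example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class DataLayerConfig:
    """Where the stateless service should reach the data layer."""

    contact_points: Tuple[str, ...]
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> DataLayerConfig:
    """Read data layer settings from the environment (hypothetical names)."""
    raw_hosts = env.get("DATA_LAYER_HOSTS", "")
    hosts = tuple(h.strip() for h in raw_hosts.split(",") if h.strip())
    if not hosts:
        raise ValueError("DATA_LAYER_HOSTS must list at least one host")
    port = int(env.get("DATA_LAYER_PORT", "9042"))
    return DataLayerConfig(contact_points=hosts, port=port)


if __name__ == "__main__":
    # The same image can be pointed at a different environment by changing
    # only these values at deploy time.
    example_env = {"DATA_LAYER_HOSTS": "db-a.example.internal,db-b.example.internal"}
    print(load_config(example_env))
```

Because nothing about the data layer's location is baked into the image, the same container can be redeployed against an on-premises or public cloud data layer without a rebuild.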

Containers offer a great way to deliver scalable applications that can react to user demand levels faster and more efficiently than traditional infrastructures can support. However, adopting containers cannot avoid lock-in on its own. You also need to analyze your overall application requirements, from containers through to data management, analytics and storage over time, and then look at how to run these other elements independently. By planning ahead, developers can help their companies adopt better hybrid cloud strategies that ensure data autonomy, prevent lock-in and meet the needs of the “right now economy” that all companies have to operate in.

This article is part of the latest JAX Magazine issue. You can download it now for free.


Patrick Callaghan is a solutions architect at DataStax. He works with companies on implementing and supporting large-scale, mission-critical applications across hybrid cloud environments while preserving data autonomy. Prior to DataStax, he held roles in the banking and finance sector as a Java consultant covering real-time applications for tracking market risk.
