The Hard Thing About Kubernetes; The Great Kubernetes Irony
Provisioning your first bare-bones Kubernetes cluster may take 15 minutes or less, but it’s the start of a long journey. No one ends up running a single cluster, or even just two. Every cluster you create will require ongoing maintenance and the installation of dozens of components to make it work in an enterprise environment, and each of those components will need to be upgraded as new versions become available. You will also need proper role-based access control in place to ensure cross-cloud governance. You may also consider investing in GitOps tooling so newly developed software can be deployed to the cluster automatically, without developers needing to learn the vagaries of Kubernetes. And last but not least, you need to set up auditing of all activity by everyone involved, across all environments.
Kubernetes is an incredibly powerful enabler for enterprise-wide app modernization strategies, but if not done right, the cost of modernization can take a toll.
The Great Kubernetes Irony is that the very technology that was created to streamline the management of modern applications is, itself, incredibly difficult to manage. Why? Well, here is just the bare minimum set of systems and tools needed to make Kubernetes work:
Cluster Provisioning is the process of creating clusters in one or more environments. You can do this yourself, of course, using each of the K8s distributions your enterprise uses. Tools include kops, OpenShift, PKS, and Terraform for EKS, AKS, and GKE. Be sure to invest in automation for this task so clusters can be (re)provisioned easily by anyone on the operations team.
Cluster Blueprinting creates an approved set of standardized cluster configurations, along with the tools to apply them, that the enterprise can reuse to deploy applications more quickly and support them more easily. Without this, enterprises very quickly end up with snowflake clusters, which makes a cluster fleet operationally difficult to manage.
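To make the snowflake problem concrete, here is a minimal Python sketch of a drift check against a blueprint. It assumes a blueprint is simply a mapping of add-on name to approved version; the `Blueprint` class, the `find_drift` function, the cluster contents, and every version number are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Blueprint:
    """Approved baseline: add-on name -> required version (hypothetical schema)."""
    name: str
    addons: dict = field(default_factory=dict)


def find_drift(blueprint: Blueprint, installed: dict) -> dict:
    """Return add-ons that differ from the blueprint.

    The result maps add-on name -> (expected, actual). None as the actual
    value means the add-on is missing; None as the expected value means the
    add-on is installed but not part of the blueprint.
    """
    drift = {}
    for addon, expected in blueprint.addons.items():
        actual = installed.get(addon)
        if actual != expected:
            drift[addon] = (expected, actual)
    for addon, actual in installed.items():
        if addon not in blueprint.addons:
            drift[addon] = (None, actual)
    return drift


# Illustrative data only; these versions are made up for the example.
baseline = Blueprint("prod-baseline", {"ingress-nginx": "1.9.4", "fluentd": "1.16.2"})
cluster_a = {"ingress-nginx": "1.9.4", "fluentd": "1.16.2"}
cluster_b = {"ingress-nginx": "1.8.0", "cert-manager": "1.13.0"}

print(find_drift(baseline, cluster_a))  # no drift: empty dict
print(find_drift(baseline, cluster_b))  # three differences
```

A cluster that reports an empty result matches the blueprint; anything else is a candidate snowflake that should be reconciled or explicitly approved.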
Continuous Deployment (CD) is a process by which containerized applications can be deployed frequently and automatically to one or more clusters. There are a number of open source and commercial options for this on the market, and having a well-thought-through CD strategy with the right tooling for your use case is key to ensuring your developers are moving as fast as possible.
Cluster RBAC and SSO solutions are required to govern access and authorization, by systems and users, to all of your clusters across clouds and data centers. You can build this yourself or use open-source tools. You’ll also need to integrate with your enterprise’s SSO tool of choice, such as Okta or Ping Identity.
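As a rough illustration of what per-cluster RBAC involves, the Python sketch below builds a read-only Kubernetes Role and a RoleBinding that grants it to a group, then prints them as JSON. The field layout follows the `rbac.authorization.k8s.io/v1` API; the namespace, the role name, and the group name (which is assumed to arrive from your SSO provider) are hypothetical.

```python
import json

RBAC_API = "rbac.authorization.k8s.io"


def read_only_role(namespace: str, role_name: str = "app-viewer") -> dict:
    """Build a namespaced Role that only allows reading common workload objects."""
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {
                # "" is the core API group (pods, services); "apps" holds deployments.
                "apiGroups": ["", "apps"],
                "resources": ["pods", "services", "deployments"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


def bind_group(namespace: str, role_name: str, group: str) -> dict:
    """Build a RoleBinding that grants the Role to every member of a group."""
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-{group}", "namespace": namespace},
        "subjects": [{"kind": "Group", "name": group, "apiGroup": RBAC_API}],
        "roleRef": {"kind": "Role", "name": role_name, "apiGroup": RBAC_API},
    }


# Hypothetical namespace and group name, for illustration only.
role = read_only_role("payments")
binding = bind_group("payments", role["metadata"]["name"], "payments-devs")
manifests = {"apiVersion": "v1", "kind": "List", "items": [role, binding]}

print(json.dumps(manifests, indent=2))
```

Generating the same manifests from one function for every cluster is one way to keep access rules consistent across a fleet, rather than editing them by hand per cluster.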
Secrets Management solutions are needed to securely handle the credentials and other sensitive data that applications need to deliver value. It’s no longer industry best practice to build this yourself, and a number of great platforms, such as HashiCorp Vault, exist to address the secrets management needs of enterprises. But making Vault work with your Kubernetes fleet is a massive undertaking, and one you will need to solve one way or another.
Log Collection solutions are used to collect application and system logs from across the cluster fleet. Fluentd is the most popular tool for collecting logs. Enterprises need to review strategies for centrally collecting Fluentd-generated logs from clusters in a seamless, repeatable fashion.
Metrics Collection solutions collect the vitals for clusters, applications, and their components. Prometheus is the most widely used tool here. Here too, enterprises need to review strategies for centrally collecting the metrics scraped by Prometheus in each cluster in a seamless, repeatable fashion.
Ingress & Service Mesh components manage application traffic to and from clusters and provide a number of application services such as transport security and tracing. Tools include NGINX and Istio/Envoy. It’s important to think through a multi-cluster strategy to ensure that the same application security rigor is applied to all clusters, everywhere.
Cluster Upgrading is the process of upgrading not just the Kubernetes distribution running on your clusters, but also the core components that help Kubernetes function. New K8s versions are made available roughly three times per year to deliver new capabilities to the community. The best practice is to be no more than a few months behind the latest K8s version for your production clusters.
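One way to keep an eye on upgrade lag across a fleet is to compare each cluster's minor version with the latest release. The Python sketch below assumes version strings of the form `v1.29.4`, optionally with a vendor suffix, and assumes all clusters share the same major version. The `clusters_behind` helper, the fleet contents, the version numbers, and the one-minor-version threshold are all hypothetical choices for the example.

```python
import re


def minor_version(version: str) -> int:
    """Extract the minor version from strings like 'v1.29.4' or '1.28.2-gke.100'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(2))


def clusters_behind(fleet: dict, latest: str, max_lag: int = 1) -> dict:
    """Return cluster name -> minor versions behind, for clusters past max_lag."""
    newest = minor_version(latest)
    lagging = {}
    for name, version in fleet.items():
        lag = newest - minor_version(version)
        if lag > max_lag:
            lagging[name] = lag
    return lagging


# Illustrative fleet; names and versions are made up for the example.
fleet = {
    "prod-us": "v1.29.4",
    "prod-eu": "v1.26.9",
    "staging": "v1.28.2-gke.100",
}

print(clusters_behind(fleet, latest="v1.29.0"))
```

In practice the versions would come from each cluster's API server rather than a hard-coded dictionary, and the acceptable lag is a policy decision for your team.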
Storage Management is an acute problem for enterprises running Kubernetes clusters on premises. Many teams invest time and effort productizing tools such as OpenEBS, GlusterFS, and Ceph/Rook. Commercial solutions are also available to address this requirement.
Backup & Restore is critical to ensure continuous operations for production applications if and when disaster strikes. Companies use open source tools such as Velero, and commercial tools such as Kasten, to address this requirement.
Kubernetes API Endpoint Security is, arguably, the most important requirement for enterprise-grade Kubernetes management. Be sure to make your cluster’s API endpoint only available in a private subnet, lest your clusters get attacked like those at Tesla.
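A simple first check along these lines is to flag any cluster whose API endpoint has an address outside the private ranges. The Python sketch below uses the standard `ipaddress` module and assumes you already have the resolved addresses for each endpoint; the `publicly_exposed` helper, the cluster names, and the addresses are hypothetical.

```python
import ipaddress


def publicly_exposed(endpoints: dict) -> list:
    """Return names of clusters with at least one non-private API address."""
    exposed = []
    for name, addresses in endpoints.items():
        if any(not ipaddress.ip_address(addr).is_private for addr in addresses):
            exposed.append(name)
    return sorted(exposed)


# Illustrative inventory: cluster name -> resolved API endpoint addresses.
endpoints = {
    "prod-us": ["10.0.12.5"],
    "prod-eu": ["192.168.4.10", "8.8.8.8"],
    "staging": ["172.16.0.9"],
}

print(publicly_exposed(endpoints))
```

A private address alone does not prove an endpoint is unreachable from the internet (load balancers and NAT can still expose it), so treat this as a quick inventory check rather than a security audit.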
And this is just a short sample of the items that need to be managed; there are more, including DNS and networking configurations as well as continuous integration (CI) and alerting tools. To get Kubernetes right, your DevOps team needs to be proficient in ALL of these areas in order to deploy and operate an enterprise-grade Kubernetes environment. And that’s the hard thing about Kubernetes. Sure, it’s incredibly powerful. But without a clear understanding of everything involved in properly managing Kubernetes clusters, or fleets of clusters, enterprise DevOps teams will have a difficult time getting past some significant hurdles. By focusing on the areas outlined above, your team will be better equipped to gain the full spectrum of benefits Kubernetes was designed to provide.