Interview with Ajay Tripathy, Chief Technology Officer and a co-founder at Kubecost

“Kubernetes infrastructure can get expensive quite quickly without accurately monitoring”

JAXenter Editorial Team

We spoke with Ajay Tripathy, Chief Technology Officer and a co-founder at Kubecost, about the Cloud Native Computing Foundation’s survey into Kubernetes spending. In what areas is Kubernetes overspending most rampant and ripe for improvement, and how can developers reduce their Kubernetes spend?

JAXenter: The Cloud Native Computing Foundation (CNCF) just announced the results of its first-ever survey into Kubernetes spending. What are the most critical takeaways from the data that developers and DevOps teams need to pay attention to?

Ajay Tripathy: Kubernetes infrastructure can get expensive quite quickly without accurately monitoring how and where you are spending. 68% of respondents, from startups to enterprises, saw their Kubernetes costs increase over the past year. For half of those respondents, costs went up more than 20%. That’s a significant increase for a lot of teams over a short time, and the root of this is often related to poor or insufficient monitoring and visibility into Kubernetes spending.

A big takeaway from CNCF’s survey is that many teams either do not monitor Kubernetes spend at all (24%) or rely on rough monthly estimates (44%). Only 13% rely on accurate showbacks, and just 14% reported having a chargeback program in place. Of those who are actively monitoring spend more granularly, most use cloud-native tools (such as the open source project Kubecost) or ad-hoc spreadsheets (though the latter gets harder to do at scale). CNCF’s survey is worth a read – it’s the first of its kind and really paints a clear picture of a growing concern for organizations now using Kubernetes.

Source: CNCF report “FinOps for Kubernetes”
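To make the showback idea concrete, here is a minimal sketch of splitting a shared cluster bill across teams in proportion to the compute they consumed. It is not how Kubecost or any other tool computes allocation; the function name `showback`, the team names, the core-hour figures, and the bill amount are all hypothetical, and real allocation would also account for memory, storage, network, and idle capacity.

```python
def showback(total_cost, core_hours_by_team):
    """Split a cluster bill across teams in proportion to CPU core-hours used.

    Assumption for illustration: CPU core-hours are the only cost driver.
    """
    total_hours = sum(core_hours_by_team.values())
    if total_hours == 0:
        # Nothing was used, so nothing can be attributed.
        return {team: 0.0 for team in core_hours_by_team}
    return {
        team: round(total_cost * hours / total_hours, 2)
        for team, hours in core_hours_by_team.items()
    }


# Hypothetical monthly usage and bill.
usage = {"payments": 600.0, "search": 300.0, "batch": 100.0}
costs = showback(10_000.0, usage)

for team, cost in costs.items():
    print(f"{team}: ${cost:,.2f}")
```

A showback only reports these numbers back to each team; a chargeback program would go one step further and bill them against the team's own budget.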

JAXenter: In the rush to cloud native environments, do you think it has been too easy for Kubernetes costs to spin out of control? Why or why not?

Ajay Tripathy: Kubernetes was first and foremost about engineering productivity, scale, and rapid delivery. The orchestration platform has done that incredibly well for many organizations, transforming the way they deliver software in fundamental ways. Before Kubernetes, companies might buy or rent hardware in a data center, with clear costs and amortization. Or they’d buy flexible capacity to run virtual machines. That moved CAPEX to OPEX, but spending was still somewhat predictable. One of the challenges that organizations are finding with the flexibility and real-time scaling of Kubernetes is that it also allows for cloud spend to fluctuate wildly. As more organizations scale up their Kubernetes environments, teams are increasingly wrestling with the financial costs of this platform modernization.


JAXenter: What are some things developers and DevOps teams can do – specifically – to reduce their Kubernetes spend? In what areas is Kubernetes overspending most rampant and ripe for improvement?

Ajay Tripathy: There are often many opportunities to reduce spend, from more efficient bin packing of workloads, to right-sizing applications, to determining the right resource class for compute and storage, and more. The engineers I talk to are well aware of the fact that they can do better, but in most cases they don’t know where to optimize. In other words, most engineering teams fly blind. They don’t know which parts of their systems are oversized or underperforming. I hear stories from engineers of applications requesting large numbers of pods “to be safe,” only to find out later that all those pods were not necessary (or that they were causing outages in other parts of the cluster).
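As a rough illustration of what right-sizing can look like in practice, the sketch below compares what each workload requested with what it actually used and flags the ones that look oversized. Everything in it is an assumption made for the example rather than something from the interview: the `WorkloadUsage` type, the 95th-percentile usage figures, the 20% headroom, and the 50% utilization cutoff. In a real cluster the usage numbers would come from a monitoring system.

```python
from dataclasses import dataclass


@dataclass
class WorkloadUsage:
    name: str
    cpu_requested: float  # CPU cores requested by the workload
    cpu_used_p95: float   # 95th-percentile CPU cores actually used


def rightsizing_report(workloads, headroom=1.2, waste_threshold=0.5):
    """Return (name, requested, suggested) for workloads that look oversized.

    A workload is flagged when its p95 usage is below `waste_threshold`
    of its request; the suggestion is p95 usage plus `headroom`.
    """
    report = []
    for w in workloads:
        utilization = w.cpu_used_p95 / w.cpu_requested if w.cpu_requested else 0.0
        if utilization < waste_threshold:
            suggested = round(w.cpu_used_p95 * headroom, 3)
            report.append((w.name, w.cpu_requested, suggested))
    return report


# Hypothetical workloads: one heavily over-requested, one reasonably sized.
workloads = [
    WorkloadUsage("checkout", cpu_requested=4.0, cpu_used_p95=0.5),
    WorkloadUsage("search", cpu_requested=2.0, cpu_used_p95=1.6),
]

for name, requested, suggested in rightsizing_report(workloads):
    print(f"{name}: requested {requested} cores, suggested ~{suggested}")
```

Here only `checkout` is flagged, since it uses about an eighth of what it requests. The headroom factor is the knob that trades cost savings against the "to be safe" margin teams want to keep.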

JAXenter: Are Kubernetes performance, stability, and reliability issues a concern for those considering whether to be more aggressive with Kubernetes cost-saving measures? Is this a case of knowingly overspending to ensure peace of mind that their cloud native applications will avoid preventable issues, or something else?

Ajay Tripathy: Teams often overprovision resources and therefore overspend due to a lack of visibility or awareness. When you don’t know the true cost of compute resources, it’s very natural to over-request! Many organizations are of course concerned with ensuring that applications serve their businesses as they scale, that they meet all SLAs, and that they are performant and reliable. The budget naturally grows with actual business needs. But there’s a difference between spending to ensure those performance goals are continually met and putting cloud budget toward resources they don’t actually need or use. The latter is still very much the case. For SREs, for example, knowing which resources are available is crucial to achieving their goals, and they can’t get full visibility without a cost monitoring strategy.


JAXenter: Two years from now, if CNCF conducts a similar survey across this cross-section of developers and DevOps teams, what data do you expect will be similar, and what do you think will be markedly different?

Ajay Tripathy: I would hope to see far fewer teams relying on ad-hoc monitoring or no data at all, limiting their visibility into their Kubernetes-related spending. And I do think more companies will have started their crawl-walk-run journey to FinOps practices over the next year or two: starting from Kubernetes cost monitoring, moving to showback and chargeback, and then continually iterating and expanding their precision. As the CNCF survey shows, costs are going up, and the potential to save significantly by reining those in is getting higher every day. I also expect to see an increasing number of teams introducing automation for managing and governing cloud spend. Especially at scale, there are a lot of budget efficiency gains there for the taking right now for many teams.
