Organizations face common challenges with Continuous Delivery
Organizations still face challenges and toil getting software across the finish line to production. Investments in Continuous Delivery aim to bring more consistent and safer approaches to getting features into production.
As engineers, we are always seeking to better our craft. Investments in engineering efficiency raise the collective bar for all. At Harness, we’re fortunate to have the opportunity to talk to many organizations along their Continuous Delivery journeys. Part of what we do when engaging with a customer or prospect is run a Continuous Delivery Capability Assessment (CDCA) to catalog and measure maturity.
Over the past year, we have analyzed and aggregated our capability assessments, uncovering common challenges organizations face. In our Continuous Delivery Insights report, we identified the time, effort, cost and velocity associated with their current Continuous Delivery process. Our data shows that velocity is up – but complexity and cost are also on the rise.
We measured Continuous Delivery performance metrics across over 100 firms and found that, as sophisticated as many are, a lot of effort is still required to get features and fixes into production. At organizations looking to strengthen or further their Continuous Delivery goals, we observed the following median or average values.
We define deployment frequency as the number of times a build is deployed to production. In a microservices architecture, deployment frequency typically rises because each service usually has a one-to-one relationship with its build. For the sample set we interviewed, the median interval between production deployments is ten days — roughly two to three deployments per month — which shows that deploying a few times a month is becoming the norm.
These deployments might be on-demand, but the lead times start to add up. Lead time is the amount of time needed to validate a deployment once the process has started. Our sample shows that organizations typically require an average of eight hours; that is, eight hours of advance notice to allow validation and sign-off of a deployment.
During those eight hours of lead time and validation steps, if a decision is made to undo a change, we saw that organizations averaged 60 minutes to restore service, i.e., to roll back or roll forward. An hour might not seem long to some, but for engineers, every second can feel stressful as you race to restore your SLAs.
Adding up all the effort from different team members during a deployment, getting an artifact into production represented an average of 25 human hours of work. Certainly, different team members will have varying levels of involvement throughout the build, deploy, and validation cycles, but this represents more than half a week of a full-time employee’s work in total burden.
Software development is full of unknowns; core to innovation is developing approaches and features for the first time. The expectation is for us to iterate and learn from failures. We certainly have gotten better at deployment and testing methodologies. One way to measure this is through change failure rate, or the percentage of deployments that fail. Across our sample set, an average of 11% of deployments failed. Even as modern approaches mature, organizations still have a long way to go in adopting them to increase the velocity and safety of their deployments.
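To make the metric concrete, change failure rate is simply failed production deployments divided by total deployments over a window. The sketch below computes it from a hypothetical deployment log; the record format and values are illustrative assumptions, not data from our report.

```python
# Hypothetical deployment log: (release_id, succeeded_in_production)
deployments = [
    ("rel-101", True),
    ("rel-102", True),
    ("rel-103", False),  # failed in production and required a rollback
    ("rel-104", True),
    ("rel-105", True),
    ("rel-106", True),
    ("rel-107", True),
    ("rel-108", True),
    ("rel-109", False),  # failed post-deploy validation
]

# Change failure rate = failed deployments / total deployments
failures = sum(1 for _, ok in deployments if not ok)
change_failure_rate = failures / len(deployments)
print(f"Change failure rate: {change_failure_rate:.0%}")  # prints "Change failure rate: 22%"
```

A team tracking toward the 11% average in our sample would watch this number per service and per time window, since a single flaky service can dominate an org-wide aggregate.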
Modern Approaches and Challenges
The goal of Continuous Delivery is to collectively raise the bar around software delivery. For organizations looking to increase agility and safety, a canary deployment seems like an obvious choice for safer deployments. A canary deployment is essentially a safe approach to releasing where you send in a canary [release candidate] and incrementally shift traffic away from the stable version until the canary takes over. If at any point the canary is not promoted, the stable version remains in place.
As simple as this concept is to grasp, in practice it can be difficult. Deciding when to promote or roll back a canary is challenging to automate. In Continuous Delivery Insights 2020, we found that only about four percent of organizations were taking a canary-based approach somewhere in their organization.
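The promotion logic that is hard to automate can be sketched in a few lines: shift traffic to the canary in stages, evaluate a health gate at each stage, and roll back on the first failure. This is a minimal illustration, not the logic of any particular tool; the stage percentages, the `run_canary` helper, and the 5% error-rate threshold are all assumptions chosen for the example.

```python
STAGES = [10, 25, 50, 100]  # percent of traffic routed to the canary at each stage

def gate_passes(error_rate, threshold=0.05):
    """Promotion gate: the canary passes while its error rate stays under the threshold."""
    return error_rate < threshold

def run_canary(observe_error_rate):
    """Shift traffic in stages; roll back at the first stage whose gate fails."""
    for pct in STAGES:
        if not gate_passes(observe_error_rate(pct)):
            return f"rolled back at {pct}% traffic"
    return "canary promoted to 100%"

# Simulated metric source: the canary stays healthy at every stage.
print(run_canary(lambda pct: 0.01))
# Simulated regression that only surfaces once the canary receives 50% of traffic.
print(run_canary(lambda pct: 0.20 if pct >= 50 else 0.01))
```

Even in this toy form, the hard questions are visible: which metrics feed the gate, how long to observe at each stage, and what threshold separates noise from a real regression — which is exactly why so few organizations have automated it.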
When a failure does arise with a production deployment, you are left with two choices, each of which impacts mean time to recovery [MTTR]. Organizations can roll back to the last known working version, minimizing MTTR and leaving proper time to fix, but delaying the features. Alternatively, organizations can roll forward with a fix rather than restore, which is riskier for MTTR but gets the features deployed faster. Among the organizations we interviewed, 85% take the more conservative approach: roll back to restore service, then fix and redeploy.
We are excited to start tracking these metrics year over year as we work toward improving both the numbers and the adoption of Continuous Delivery approaches. Organizations still face challenges and toil getting software across the finish line to production, and investments in Continuous Delivery aim to bring more consistent and safer approaches to getting features there.