Smoothing the continuous delivery path
To paraphrase Wikipedia, Continuous Delivery is a software engineering approach that produces valuable software in short cycles and enables production releases to be made at any time.
Continuous Delivery is gaining recognition as a best practice, but adopting it and iteratively improving it is challenging. Given the diversity of teams and architectures that do Continuous Delivery well, it’s clear that there is no single, golden path.
This article explores how two very different teams successfully practiced and improved Continuous Delivery. Both teams were sizeable and mature in their use of agile and lean practices. One team chose microservices, Scala, MongoDB and Docker on a greenfield project. The other faced the constraints of a monolithic architecture, legacy code, .NET, MySQL and Windows.
Patterns for successful practice
From observing both teams, some common patterns were visible that contributed to their successful Continuous Delivery.
Continuous Integration that works
Continuous Integration (CI) is the foundation that enables Continuous Delivery. To be a truly solid foundation, though, the CI system must remain healthy, which only happens if the team exercises it and cares for it. Team members need to integrate their changes regularly (multiple times per day) and respond promptly to red builds.
The team should also eliminate warnings and address long-running CI steps. These behaviours ensure that release candidates can be created regularly, efficiently and quickly. Once the process starts taking hours instead of minutes, Continuous Delivery becomes a burden instead of an enabler.
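This kind of CI hygiene can be checked mechanically. The sketch below is a hypothetical build gate (the step names and budget values are assumptions, not from either team) that fails the pipeline when a step runs too long or emits warnings:

```python
# Hypothetical budgets; real values depend on the team's pipeline.
MAX_STEP_SECONDS = 300  # fail any CI step slower than 5 minutes
MAX_WARNINGS = 0        # treat every compiler/test warning as a failure

def check_step(name, duration_seconds, warning_count):
    """Return a list of problems for one CI step; empty means healthy."""
    problems = []
    if duration_seconds > MAX_STEP_SECONDS:
        problems.append(f"{name}: too slow ({duration_seconds}s > {MAX_STEP_SECONDS}s)")
    if warning_count > MAX_WARNINGS:
        problems.append(f"{name}: {warning_count} warning(s) not eliminated")
    return problems

def gate(steps):
    """Report every problem and return True only when the build is green."""
    problems = [p for step in steps for p in check_step(*step)]
    for p in problems:
        print(p)
    return len(problems) == 0
```

Wiring a gate like this into the pipeline turns "we should fix warnings and slow steps" from a team aspiration into an enforced property of every build.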
The right mix of automated tests
Managing the complexity of software is extremely challenging. The right mix of automated tests helps address the risk of changing a complex system by flagging areas of high risk (e.g. a lack of test coverage, or broken tests) that need further investigation. When practicing automated testing, it's important to get the right distribution of unit, integration and end-to-end tests (the well-documented "test pyramid").
Both teams I worked with moved towards a tear-drop distribution: a very small number of end-to-end tests, sitting on top of a high number of integration tests, with a moderate number of unit tests at the base. This provided the best balance between behavioural coverage and cost of change, which, in turn, allowed the risk present in a software increment to be more easily identified.
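The tear-drop shape can be stated precisely: integration tests dominate, unit tests form a moderate base, and end-to-end tests are a very small cap. A minimal sketch (the function names and counts are illustrative assumptions):

```python
def distribution(unit, integration, end_to_end):
    """Return each layer's share of the total test count."""
    total = unit + integration + end_to_end
    return {
        "unit": unit / total,
        "integration": integration / total,
        "end_to_end": end_to_end / total,
    }

def is_tear_drop(unit, integration, end_to_end):
    """Tear-drop shape: integration tests dominate, with a moderate
    unit-test base and a very small number of end-to-end tests."""
    d = distribution(unit, integration, end_to_end)
    return d["integration"] > d["unit"] > d["end_to_end"]
```

A suite of, say, 300 unit, 700 integration and 20 end-to-end tests fits the shape; a classic pyramid (700 unit, 200 integration, 20 end-to-end) does not.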
Low cost deployment (and rollback)
Once a release candidate has been produced by the CI system, and the team is happy with its level of risk, one or more deployments take place to a variety of environments (typically QA, Staging/Pre-Production and Production).
When practicing Continuous Delivery, it's typical for these deployments to happen multiple times per week, if not per day. A key success factor is thus to minimise the time and effort these deployments take. The microservice team reduced this overhead to minutes, which enabled multiple deployments per day; the monolith team reduced it to hours, achieving weekly deployments.
Regardless of how frequently production deployments happen, the cost and impact of rolling back must be tiny (seconds) to minimise service downtime. This makes rolling back pain-free, rather than a "bad thing" to do.
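One common way to get second-scale rollback (neither team's actual mechanism is described here; this is an illustrative sketch) is to keep every release on disk and make "deploy" nothing more than an atomic pointer swap, so rollback is the identical swap to a previous version:

```python
from pathlib import Path

def deploy(releases_dir, current_link, version):
    """Point the 'current' symlink at a release directory.

    Because the release is already on disk, switching versions is an
    atomic rename, so both deploy and rollback take seconds regardless
    of how large the release is.
    """
    target = Path(releases_dir) / version
    if not target.is_dir():
        raise FileNotFoundError(f"release {version} not found")
    link = Path(current_link)
    tmp = link.with_suffix(".tmp")
    tmp.symlink_to(target, target_is_directory=True)
    tmp.replace(link)  # atomic swap on POSIX filesystems

def rollback(releases_dir, current_link, previous_version):
    """Rollback is just a deploy of an earlier version."""
    deploy(releases_dir, current_link, previous_version)
```

The design choice matters: when rollback is the same cheap operation as deploy, it stops being an emergency procedure and becomes a routine one.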
Monitoring and alerting
No matter how much testing (manual or automated) a release candidate has had, there is always a risk that something will break when it goes into Production. Both teams were able to monitor the impact of a release in near real time using tools such as Elasticsearch, Kibana, Papertrail, Splunk and New Relic. Having such tools readily available is great, but they're next to useless unless people look at them and they are coupled with automated alerting (such as PagerDuty).
This required a culture of “caring about Production”, so that the whole team (not just Operations, QA or Development) knew what “normal” looked like, and noticed when Production’s vital signs took a turn for the worse.
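"Knowing what normal looks like" can itself be partly automated. The sketch below (the window size and multiplier are hypothetical, not values either team used) tracks a rolling baseline of per-minute error counts and flags a sample that deviates sharply from it, which is where a hook to an alerting service like PagerDuty would go:

```python
from collections import deque

class ErrorRateMonitor:
    """Track a rolling window of per-minute error counts and alert when
    the latest sample exceeds the window's average by a multiplier."""

    def __init__(self, window=60, multiplier=3.0):
        self.samples = deque(maxlen=window)  # the recent "normal"
        self.multiplier = multiplier

    def record(self, errors_per_minute):
        """Return True when this sample should trigger an alert."""
        alert = False
        if self.samples:
            baseline = sum(self.samples) / len(self.samples)
            # max(baseline, 1) avoids alerting on noise around zero errors
            alert = errors_per_minute > max(baseline, 1) * self.multiplier
        self.samples.append(errors_per_minute)
        return alert
```

The automation only closes the loop, though; someone still has to own the pager and treat the alert as the whole team's problem.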
This article has highlighted how two teams with very different architectures both successfully practiced Continuous Delivery, and has touched on some of the shared patterns that enabled this.
If you’d like to hear more about their Continuous Delivery journeys, including the different blockers and accelerators they faced and the ever-present impact of Conway’s Law, I’ll be speaking on this topic at JAX London on 13-15th October 2015.