Taking a new approach to Observability – what are you looking for and why?
Observability is growing in importance for all those involved in software engineering. From developers cutting code and pulling together application components through to testing, security and operations teams, everyone needs more insight into what is taking place within a service. With so many different requirements to follow, more business goals to meet, and more complex architectures like microservices and containers to track, getting data on what is taking place is becoming more important.
This is where observability comes in. Without good data, it’s hard to narrow down what the problems are and how to fix them. Yet observability itself can be hard to define in practice.
Understanding what observability is – and equally what it is not – is essential for anyone involved in the wider software engineering sector. By looking at what changes are taking place within application design, development and management, and how those trends are affecting the ways we choose to monitor, measure and understand these services, we can see how approaches like observability can help across the whole process. Equally, we can see where more data is needed, or where it’s about putting the data we have into better context.
Observability has its roots in engineering and control theory, notably in papers published in the 1960s by Rudolf Kalman. In that setting, observability describes how well the internal state of a system can be inferred from the data it outputs. More recently, observability in software development has evolved as a successor to traditional monitoring, built on combining three sets of data – logs, metrics and traces – to provide a more complete picture of what is taking place.
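As a rough illustration of how the three data sets complement each other, the sketch below groups log lines and trace spans by a shared request identifier, then uses an aggregate metric as the trigger for looking at them. The field names (trace_id, duration_ms), the sample records and the 5% threshold are all assumptions made up for this example, not the schema of any particular tool.

```python
from collections import defaultdict

# Hypothetical telemetry; every field name and value here is illustrative only.
logs = [
    {"trace_id": "t1", "level": "ERROR", "message": "payment gateway timeout"},
    {"trace_id": "t2", "level": "INFO", "message": "cart updated"},
]
spans = [
    {"trace_id": "t1", "service": "checkout", "duration_ms": 5200},
    {"trace_id": "t1", "service": "payments", "duration_ms": 5000},
    {"trace_id": "t2", "service": "cart", "duration_ms": 40},
]
metrics = {"checkout.error_rate": 0.07}  # aggregate signal, not tied to one request
ERROR_RATE_THRESHOLD = 0.05  # assumed alerting threshold


def correlate(logs, spans):
    """Group log lines and spans by trace_id so one request can be read end to end."""
    by_trace = defaultdict(lambda: {"logs": [], "spans": []})
    for record in logs:
        by_trace[record["trace_id"]]["logs"].append(record)
    for span in spans:
        by_trace[span["trace_id"]]["spans"].append(span)
    return dict(by_trace)


def failing_traces(view):
    """Return the trace_ids that have at least one ERROR log line."""
    return [
        trace_id
        for trace_id, trace in view.items()
        if any(line["level"] == "ERROR" for line in trace["logs"])
    ]


def slowest_service(trace):
    """Return the service that owns the longest span in a correlated trace."""
    return max(trace["spans"], key=lambda span: span["duration_ms"])["service"]


view = correlate(logs, spans)

# The metric says something is wrong; the traces and logs say where and why.
if metrics["checkout.error_rate"] > ERROR_RATE_THRESHOLD:
    suspects = {tid: slowest_service(view[tid]) for tid in failing_traces(view)}
```

Here the metric only signals that the error rate is elevated. It is the correlated logs and spans that narrow the problem down to a specific request and service.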
However, looking at observability solely as an exercise in acquiring “more data” misses the point. Instead, observability has to include an understanding of the purpose that the application as a whole is supposed to serve, the objectives of the team building the application, and the goals of the business at large. In fact, most of those objectives will boil down to reliability. In other words, is the application reliably meeting both the uptime and the performance standards of the business? Without this understanding, “observability” serves only to increase the noise coming from the application without solving real problems for teams and their users.
Alongside this context, it’s also important to recognize that today’s IT infrastructure and applications are more complex and more dynamic than they were in the past. Where teams once ran simple monolithic apps and three-tier web applications, today’s microservices-based implementations are made up of multiple discrete components that interact with each other through APIs. This adoption of microservices is both enabled and mirrored by greater use of cloud services – whether public, private or hybrid – and new platforms like Kubernetes and serverless.
In this world of dynamic microservices, service components are constantly scaling up and down to meet demand, while those same components are being continually updated and upgraded as part of a Continuous Delivery model. This means that these services are so dynamic and complex that they seem almost organic. In this scenario, traditional red light/green light monitoring and reactive approaches are woefully inadequate. Additionally, modern software engineering teams are no longer siloed by skill sets like database administration or backend engineering. Rather, these teams own the full lifecycle of service components from writing code to monitoring alerts and fixing issues in production.
All of this means that a new approach is needed: observability. Proper observability remains focused on the end goal – providing a reliable service – while serving the needs of the new integrated model of production support. This means that modern observability tools must not only be able to stitch together constantly changing data sources from dynamic architectures, but also present that data in an easily consumed, integrated way so that those teams can react quickly to fix reliability issues.
Making more of your data
As part of adopting observability, it’s worth spending time on defining the objectives that your observability data will be measured against. This defines how you will use your data and the results you want to achieve, making your approach more proactive than simply diving into data if and when a problem comes up.
Setting out the right objectives for your developers as a team matters, because it helps you focus on how your software performs over time and how it contributes to the business goals you want to achieve. Putting this mindset in place helps engineering teams move their software updates from initial development through into production, and then gives them a feedback loop once those changes have been made. By looking at business processes, changes and software performance together in context, you should get more value out of the data that your applications produce over time.
Additionally, these objectives have to mature from simple system load and failure monitoring to goals aligned with end-users. For example, rather than setting an objective on server load, a better measure is to set objectives against important end-user needs like successfully adding items to shopping carts, page load times, or credit-card transactions completing. Typically, these kinds of objectives will tie directly back to the way the service generates revenue or value for the business. This focuses the minds of engineers on how their applications’ performance delivers value to the business and customers, rather than merely watching low-level indicators.
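One way to make an end-user objective concrete is to express it as a target success ratio and track how much of the allowed failure budget has been used. The sketch below does this for a shopping-cart action. The Objective class, the function names, the 99% target and the event counts are all hypothetical choices for illustration, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    """An end-user objective, e.g. 'adding items to the cart succeeds'."""
    name: str
    target: float  # required fraction of successful events, e.g. 0.99


def success_ratio(good: int, total: int) -> float:
    """Fraction of user actions that succeeded; no traffic counts as success."""
    return 1.0 if total == 0 else good / total


def meets_objective(objective: Objective, good: int, total: int) -> bool:
    """True when the observed success ratio is at or above the target."""
    return success_ratio(good, total) >= objective.target


def error_budget_remaining(objective: Objective, good: int, total: int) -> float:
    """Share of the allowed failures still unspent; negative when overspent."""
    failed = total - good
    allowed = (1.0 - objective.target) * total
    if allowed <= 0:
        # No failures are permitted (or there was no traffic at all).
        return 1.0 if failed == 0 else 0.0
    return 1.0 - failed / allowed


# Assumed example: 99% of add-to-cart requests should succeed.
cart = Objective(name="add_to_cart_success", target=0.99)
healthy = meets_objective(cart, good=9950, total=10000)
budget = error_budget_remaining(cart, good=9950, total=10000)
```

Measuring against the user-facing action rather than server load means a busy but working service stays green, while a lightly loaded service that fails checkouts does not.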
This approach to observability has to run all the time, it has to provide useful data back to teams, and it has to be part of their workflows. For software developers, this data should provide insight into the impact that their changes have made, whether that is fixing a particular problem or improving throughput to meet a business request. Either way, making use of observability data should deliver better applications that more closely meet what the business is after.
Gartner terms this approach ‘continuous intelligence,’ where data from applications and cloud infrastructure is gathered, analyzed and parsed to provide recommendations back to the teams involved. The firm has also predicted that more than half of all new business applications developed by 2022 will have continuous intelligence baked in. For developers, this approach should be a natural extension of what they already do around continuous integration, deployment, and analysis. Observability will play a critical role in delivering this kind of improvement, but only based on using contextual data and good objective setting together.