Interview with Björn Rabenstein, Production Engineer at SoundCloud

“Prometheus itself is a product of a DevOps mindset”

Gabriela Motroc

A lot of companies and organizations have adopted Prometheus and the project quickly gained an active developer and user community. It is currently a standalone open source project maintained independently of any company. In 2016, Prometheus joined the Cloud Native Computing Foundation as the second hosted project after Kubernetes. We talked to Björn Rabenstein, engineer at SoundCloud and Prometheus core developer, about how Prometheus can help companies adopt DevOps.

JAXenter: Would you call Prometheus a DevOps tool?

Björn Rabenstein: Absolutely. I mean, there are so many different understandings of what DevOps actually means, but I dare to say that Prometheus fits most of them. Prometheus itself is a product of a DevOps mindset: Engineers at SoundCloud who found themselves unable to fulfill the operational necessity of monitoring with the tools available solved the problem with software: they developed Prometheus.

Before we dive into more concrete aspects, let me quickly mention PromQL, the powerful Prometheus expression language, and how it relates to “configuration as code”. Prometheus instrumentation is quite simple, actually. It uses very fundamental metrics in the monitored code, with almost no state or logic in them. The processing and evaluation logic happens on the Prometheus server, expressed in PromQL. The logic is exactly where it needs to be. This makes the configuration of your monitoring system, dashboarding queries, and alerting rules a lot more code-like. Fun fact: PromQL is Turing-complete, as proven in an excellent (and humorous) lightning talk by Brian Brazil at PromCon.
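
To illustrate that split, here is a minimal sketch in plain Python. It does not use the real Prometheus client library or PromQL: the `Counter` class, the `per_second_rate` helper, and the metric name `http_requests_total` are assumptions made up for this example. The instrumented code only counts; the rate is derived afterwards from scraped samples, roughly what a PromQL expression such as `rate(http_requests_total[30s])` would do on the server.

```python
class Counter:
    """Hypothetical stand-in for a client-library counter: it only counts."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters only go up")
        self.value += amount

    def expose(self):
        """Render the counter in the Prometheus text exposition format."""
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {self.value}\n"
        )


def per_second_rate(samples):
    """Server-side logic: average per-second increase over scraped samples.

    `samples` is a list of (timestamp_seconds, counter_value) pairs.
    This is a simplification: no counter-reset handling, no extrapolation.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


requests_total = Counter("http_requests_total", "Total HTTP requests handled.")

scraped = []
for timestamp in (0, 15, 30):
    # The monitoring side records the current value every 15 seconds ...
    scraped.append((timestamp, requests_total.value))
    # ... while the application just keeps counting, 30 requests per interval.
    for _ in range(30):
        requests_total.inc()

print(per_second_rate(scraped))
```

The application never computes a rate or keeps a time window; changing how the metric is evaluated only means changing the query, not redeploying the instrumented code.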

JAXenter: If a company has DevOps deeply embedded into their culture, does that make them a good fit for using Prometheus?

Björn Rabenstein: The best source of metrics for Prometheus is instrumentation of your own code. While instrumentation is quite easy, as mentioned above, it suddenly makes monitoring a development concern, too. That’s a good thing. It’s hard to identify problems in complex systems with only a traditional black-box monitoring approach. Prometheus naturally involves developers in monitoring and cannot really work with a “throw it over the fence” mentality. As a byproduct, Prometheus instrumentation has reportedly helped developers to debug and improve their code.

JAXenter: If a company is adopting DevOps, how can using Prometheus help them on their way?

Björn Rabenstein: What’s really interesting is the operational simplicity of Prometheus. At SoundCloud, teams failed to spin up their own StatsD/Graphite stack. If you look closer at it, it’s quite a contraption, involving many different components written in different languages, requiring specific runtime environments and all that. As a consequence, we needed a few expert ops folks who essentially had to offer StatsD/Graphite as a service for the rest of the company.

The Prometheus server, in contrast, is just a static Go binary, which you drop somewhere with some minimal configuration for a start, and you get things going in no time. At SoundCloud, each team runs their own Prometheus servers, both for production and to play with during testing and development. SoundCloud follows a “you build it, you run it” philosophy. Prometheus enabled us to include monitoring in that concept.

As developing software now includes developing the monitoring setup, it is immensely helpful that you can just spin up a new Prometheus server with new configs and alerting rules to try things out. That’s easily possible because Prometheus uses a pull approach; you don’t have to reconfigure the world to run another monitoring server. Point your Prometheus test server at production targets to test your monitoring. Or point it at test targets to monitor your software under development. All at your fingertips.
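
The pull approach described here can be sketched with nothing but the Python standard library. Everything in this example is an assumption for illustration: the metric `demo_requests_total`, the handler class, and the `scrape` helper are made up, and a real Prometheus server does far more than a single HTTP GET. The point is only that the target serves its metrics to whoever asks, so a second (test) server can scrape it without the target being reconfigured.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

METRICS = "# TYPE demo_requests_total counter\ndemo_requests_total 42\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """The target: serves its current metrics to whoever asks."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = METRICS.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def scrape(url):
    """What a monitoring server does: fetch the target's metrics page."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# Port 0 lets the OS pick a free port; the target knows nothing about scrapers.
target = HTTPServer(("127.0.0.1", 0), MetricsHandler)
threading.Thread(target=target.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{target.server_port}/metrics"

production_view = scrape(url)  # the existing monitoring server
test_view = scrape(url)        # a freshly started test server, same target

target.shutdown()
target.server_close()

print(production_view == test_view)
```

With a push-based system, adding the second consumer would mean telling every target about a new destination; here the only new configuration lives on the scraping side.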

JAXenter: What is the importance of Prometheus for containers and microservices?

Björn Rabenstein: The following three projects all began at SoundCloud in 2012: the migration towards microservices, an in-house container orchestration platform, and – obviously – Prometheus. That’s no coincidence. To run microservices on bare metal, there is no way around containers. To run containers at scale, you need an orchestration platform. And to monitor both your microservices and the container orchestration, you need a new kind of monitoring system. One followed the other, and the rest is history. This natural evolution explains why Prometheus is such a good fit for monitoring containers and microservices.

JAXenter: What’s next for Prometheus?

Björn Rabenstein: The hottest topic for users right now is the evolving integration with Kubernetes. All the pieces are in place, like the newly written service discovery code. Now it’s the users’ turn to make use of it and to not only monitor Kubernetes clusters but also to run Prometheus itself on Kubernetes. The Prometheus Operator recently released by CoreOS is a great tool to accomplish the latter easily. More details in Fabian Reinartz’s blog post.

For Prometheus development, a replicated long-term storage layer seems to be some kind of holy grail. Several knights have gone on their quests already. Check out Cortex by Weaveworks and Vulcan by DigitalOcean. These approaches help the respective companies to offer Prometheus as a service to their customers. For the community in general, I believe and hope we will see even more additions to the list of directly instrumented software. Docker is jumping on the bandwagon as we speak. It is important to note that Prometheus instrumentation is not a one-way road into Prometheus usage. The Prometheus exposition format is documented and is understood by more tools than just Prometheus, e.g. InfluxData’s Telegraf.
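
Because the text exposition format is line-based, other tools can read it with little effort. The following is a minimal sketch of such a reader in plain Python, not the official parser: the function name `parse_exposition` and the sample metrics are assumptions for this example, and it handles only a simple subset of the format (no escaped characters or `}` inside label values, and any trailing timestamp is ignored).

```python
import re

SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional {label="value",...}
    r'\s+(?P<value>\S+)'                     # sample value
)
LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # this sketch silently skips lines it cannot read
        labels = dict(LABEL_PAIR.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


EXAMPLE = """\
# HELP http_requests_total Total HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="400"} 3
process_open_fds 12
"""

parsed = parse_exposition(EXAMPLE)
for name, labels, value in parsed:
    print(name, labels, value)
```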

Furthermore, the Prometheus client libraries make it easy to present metrics in other formats, too. For example, with the Go, Java, and Python libraries, you can push the Prometheus metrics into Graphite. “Instrument once, use it everywhere”, if I may modify a well-known tag line. All of the above is so easy because the Prometheus data model is so rich that it can easily be mapped into other metrics systems.
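
As a rough illustration of such a mapping, the sketch below flattens labelled samples into lines of Graphite's plaintext protocol (`path value timestamp`). It does not use the Prometheus client libraries mentioned above; the function names, the `prometheus` prefix, and the choice to append labels as `name.value` path segments in sorted order are assumptions for this example, one possible mapping among several.

```python
import re


def sanitize(text):
    """Replace characters that would break a dotted Graphite path."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", text)


def to_graphite_lines(samples, timestamp, prefix="prometheus"):
    """Map (name, labels, value) samples to Graphite plaintext lines."""
    lines = []
    for name, labels, value in samples:
        parts = [prefix, name]
        # Sort labels so the same series always maps to the same path.
        for label_name in sorted(labels):
            parts.extend([label_name, sanitize(labels[label_name])])
        path = ".".join(parts)
        lines.append(f"{path} {value} {timestamp}")
    return lines


samples = [
    ("http_requests_total", {"method": "get", "code": "200"}, 1027.0),
    ("process_open_fds", {}, 12.0),
]

for line in to_graphite_lines(samples, timestamp=1500000000):
    print(line)
```

The mapping loses nothing here because each label becomes part of the hierarchical path; going the other way, from a flat dotted name back to labels, would require knowing which segments are names and which are values.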

Thank you very much!

Gabriela Motroc
Gabriela Motroc is an online editor for JAXenter. Before working at S&S Media she studied International Communication Management at The Hague University of Applied Sciences.
