Why time series may be seriously underestimated
Software Engineer Gianluca Arbezzano (CurrencyFair) is one of the speakers at the upcoming DevOps Conference in Berlin (13 – 16 June 2016), where he will cover time series data in relation to AWS CloudFormation. In this article, he offers a sneak peek at his talks "AWS under the Hood" and "Listen to your Infrastructure and please sleep".
I wrote this post to share my experience with time series and to explain why, in my opinion, this type of data helps us understand how our applications and servers work.
Time series are everywhere: in logs, in sequences of temperature measurements, in user growth over time. Logs are a time series because each entry has a value and a timestamp, in this case an HTTP request at 10/Oct/2000:13:55:36 -0700.
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
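To make the idea concrete, here is a small Python sketch that reduces the log row above to a (timestamp, value) pair. The regex is illustrative, not a complete Common Log Format parser:

```python
import re
from datetime import datetime

LOG_LINE = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
            '"GET /apache_pb.gif HTTP/1.0" 200 2326')

# Extract the timestamp, the request, the status code and the response size.
PATTERN = re.compile(r'\[(?P<ts>[^\]]+)\] "(?P<req>[^"]+)" (?P<status>\d+) (?P<size>\d+)')

def to_point(line):
    """Turn one Apache log line into (timestamp, status, size)."""
    m = PATTERN.search(line)
    ts = datetime.strptime(m.group('ts'), '%d/%b/%Y:%H:%M:%S %z')
    return ts, int(m.group('status')), int(m.group('size'))

ts, status, size = to_point(LOG_LINE)
```

Once every row is a timestamped point like this, the log behaves exactly like any other time series: you can bucket it, aggregate it, and graph it.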
This example of an Apache log row shows just how widespread this type of data is.
The best time series are made of a number and a timestamp, because they are easy to manipulate and to group. The IoT is packed with this type of data: many sensors return plain numbers, and you can use a series to understand their trends. You can monitor the temperature of your apartment, or, if you have sensors in different houses, compare the data to understand which thermostat works better.
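As a toy illustration of that thermostat comparison, the sketch below uses two made-up temperature series (the data and the `stability` metric are my own assumptions, not from the article). A steadier series means the thermostat holds its target better:

```python
# Hourly temperature readings from two hypothetical houses.
house_a = [21.0, 21.2, 20.8, 21.1, 21.0]
house_b = [19.5, 22.7, 18.9, 23.0, 19.2]

def mean(xs):
    return sum(xs) / len(xs)

def stability(xs):
    # Average absolute deviation from the mean:
    # the lower the value, the steadier the thermostat.
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

# house_a fluctuates far less than house_b around its mean,
# so its thermostat is doing the better job.
```

The same comparison scales to any number of houses once the readings are stored as tagged series in a time series database.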
From June 13th to 16th the next DevOps Conference will be presented by JAXenter in Berlin. This time the main subjects of the conference are Continuous Delivery, Microservices, Docker, Clouds and Lean Business. More than 30 national and international renowned speakers will share their knowledge in more than 35 workshops, sessions and keynotes! Find out more here.
The DevOps world is made of this data: CPU, memory usage, number of deploys and everything else you can collect. I use InfluxDB and its ecosystem: Grafana or Chronograf to create my dashboards and see my data, and collectors such as Telegraf, your own application, or anything else that can send points.
Now we are ready to create an example of CPU monitoring made with InfluxDB, Telegraf with the CPU plugin, and Chronograf to watch a pretty graph of our CPU usage.
In order to install these services I will use a set of Docker images:
docker pull gianarb/influxdb:0.10.0
docker pull gianarb/chronograf:0.10.0
docker pull gianarb/telegraf:0.10.4.1
The first thing to do is start our InfluxDB server.
docker run -t -p 8086:8086 -p 8083:8083 gianarb/influxdb:0.10.0
Open your browser and go to http://container-ip:8083 to see the admin page.
InfluxDB supports a SQL-like query language called InfluxQL. You can create your first database:
CREATE DATABASE cpu_mac
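InfluxDB also accepts InfluxQL over its HTTP `/query` endpoint, so the same statement can be sent programmatically. A minimal Python sketch that builds such a request URL (the `create_database_url` helper is hypothetical, and the host/port are the defaults from this tutorial):

```python
from urllib.parse import urlencode

def create_database_url(host, db):
    # InfluxDB exposes InfluxQL over HTTP: GET /query?q=<statement>.
    query = urlencode({'q': 'CREATE DATABASE %s' % db})
    return 'http://%s:8086/query?%s' % (host, query)

url = create_database_url('localhost', 'cpu_mac')
# Fetching this URL (e.g. with urllib.request or curl) would create the database.
```

Using the admin page at port 8083, as described above, does exactly the same thing under the hood.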
Now we are ready to collect our CPU metric. Telegraf helps us grab information from our machine and push it to InfluxDB. In this example we will use just the CPU plugin. Write this configuration file, for example at /tmp/telegraf.conf:
[tags]
  dc = "local-mac"

# OUTPUTS
[outputs]
[outputs.influxdb]
  url = "http://<container-id>:8086" # EDIT, USE YOUR SERVER!
  database = "cpu_mac" # EDIT, USE THE DATABASE YOU ALREADY CREATED.

[cpu]
  percpu = false
  totalcpu = true
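Under the hood, Telegraf ships each metric to InfluxDB as a point in the line protocol format: measurement name, comma-separated tags, fields, and an optional timestamp. Here is a simplified Python sketch of that format (the `line_protocol` helper is my own illustration; the real protocol also escapes special characters and types its field values):

```python
def line_protocol(measurement, tags, fields, timestamp=None):
    """Build a simplified InfluxDB line protocol string:
    measurement[,tag=val...] field=val[,field=val...] [timestamp]"""
    tag_part = ''.join(',%s=%s' % kv for kv in sorted(tags.items()))
    field_part = ','.join('%s=%s' % kv for kv in sorted(fields.items()))
    point = '%s%s %s' % (measurement, tag_part, field_part)
    if timestamp is not None:
        point += ' %d' % timestamp
    return point

point = line_protocol('cpu_usage_system', {'dc': 'local-mac'}, {'value': 12.5})
```

This is why the `dc = "local-mac"` tag from the configuration above shows up later in our queries: every point Telegraf writes carries it.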
We are ready to start Telegraf in order to collect our CPU information.
docker run -t -v /tmp/telegraf.conf:/etc/telegraf.conf gianarb/telegraf:0.10.4.1
Every 10 seconds this tool will send your data to InfluxDB. The last step is to actually see our data. To begin, we can use the admin panel:
SELECT * FROM cpu_usage_system WHERE dc='local-mac'
The raw result is quite difficult to read, which is exactly why we want a visualization tool.
Chronograf is a dashboard for data visualization:
docker run -t -p 8086:8086 -p 10000:10000 gianarb/chronograf:0.10.0
You can visit http://container-ip:10000 in order to see the dashboard. The wizard is very easy to follow: all you need to do is add your InfluxDB server and create your first widget. This is my result.
We have created a slim monitor for our local machine, but we can extend it in order to monitor a cluster of servers:
Give each server its own tag in the Telegraf configuration, then use Chronograf to filter your data by tag and create your personal dashboards.
[tags]
  dc = "web1"
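With a distinct `dc` tag per server, a single query such as `SELECT mean(value) FROM cpu_usage_system GROUP BY dc` can compare the whole cluster. In Python terms, the grouping amounts to the sketch below (the sample points are made up for illustration):

```python
from collections import defaultdict

# Points as (dc_tag, cpu_usage) pairs, as Telegraf would report them
# from two tagged servers.
points = [('web1', 10.0), ('web2', 30.0), ('web1', 20.0), ('web2', 50.0)]

# Equivalent of: SELECT mean(value) FROM cpu_usage_system GROUP BY dc
groups = defaultdict(list)
for dc, value in points:
    groups[dc].append(value)

means = {dc: sum(vals) / len(vals) for dc, vals in groups.items()}
```

Each tag value becomes its own series, so one dashboard widget per `dc` tag gives you a per-server view of the same metric.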
Everything we used in this article is open source: InfluxDB, Telegraf, Chronograf, the Docker images, etc. You can contribute, or you can simply try them and share your opinion.