
Why time series may be seriously underestimated

Gianluca Arbezzano

Software Engineer Gianluca Arbezzano (CurrencyFair) is one of the speakers at the upcoming DevOps Conference in Berlin (13 – 16 June 2016), where he will cover time series data in relation to AWS CloudFormation. In this article, he offers a sneak peek at his talks "AWS under the Hood" and "Listen to your Infrastructure and please sleep".

I wrote this post to share my experience with time series and to explain why, in my opinion, this type of data helps us understand how our applications and servers work.

Time series

Time series are everywhere: in logs, in sequences of temperature measurements, in user growth over time. Logs are a time series because each row has a value and a timestamp; in the row below, the value is an HTTP request and the timestamp is 10/Oct/2000:13:55:36 -0700.

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

This Apache log row shows just how widespread this type of data is.
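To make the "value plus timestamp" idea concrete, here is a minimal sketch that turns the log row above into a point. It assumes the row follows the Apache Common Log Format; the function name `parse_log_line` and the choice of fields to keep are my own, for illustration only.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Turn one access-log row into a (timestamp, values) pair."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("not a Common Log Format line: %r" % line)
    # The timestamp carries its own UTC offset, so the result is timezone-aware.
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    size = match.group("size")
    values = {
        "status": int(match.group("status")),
        "bytes": 0 if size == "-" else int(size),  # "-" means no body was sent
        "request": match.group("request"),
    }
    return timestamp, values

line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
timestamp, values = parse_log_line(line)
print(timestamp.isoformat(), values["status"], values["bytes"])
```

Once every row is a timestamp with numeric values attached (status code, response size), the log can be stored and queried like any other series.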

The best time series are those made of a number and a timestamp, because they are easy to manipulate and to group. The IoT is packed with this type of data: many sensors return numbers, and you can use a series to understand their trends. You can monitor the temperature of your apartment or, if you have sensors in different houses, compare the data to understand which thermostat works better.
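Grouping such number-and-timestamp points can be sketched in a few lines. The readings below are made up, and the sensor names `house-a` and `house-b` are hypothetical; the point is only to show how little code it takes to bucket values by hour and compare two sensors.

```python
from collections import defaultdict
from datetime import datetime

# Made-up readings: (sensor name, timestamp, temperature in degrees Celsius).
readings = [
    ("house-a", datetime(2016, 3, 1, 8, 5), 19.0),
    ("house-a", datetime(2016, 3, 1, 8, 35), 21.0),
    ("house-a", datetime(2016, 3, 1, 9, 5), 22.0),
    ("house-b", datetime(2016, 3, 1, 8, 10), 17.0),
    ("house-b", datetime(2016, 3, 1, 8, 40), 18.0),
    ("house-b", datetime(2016, 3, 1, 9, 10), 20.0),
]

def hourly_mean(points):
    """Group points by (sensor, hour) and average each group."""
    buckets = defaultdict(list)
    for sensor, timestamp, value in points:
        # Truncate the timestamp to the start of its hour to get the bucket key.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[(sensor, hour)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}

means = hourly_mean(readings)
for (sensor, hour), mean in sorted(means.items()):
    print(sensor, hour.isoformat(), mean)
```

A time series database does this kind of grouping for you at query time, but the principle is the same.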


The DevOps world is made of this data: CPU, memory usage, number of deployments and everything else that you can collect. I use InfluxDB and its ecosystem: Grafana or Chronograf to build dashboards and view my data, plus collectors such as Telegraf, your own application, or anything else that can send points.
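As a rough idea of what "sending a point" from your own application could look like, the sketch below formats one point in InfluxDB's line protocol and posts it to the HTTP `/write` endpoint. The measurement name `deploys`, the field name `value` and the helper names are hypothetical, and the formatter assumes names and tag values without spaces, commas or equals signs, so it skips escaping.

```python
import urllib.request

def to_line(measurement, tags, value, timestamp_ns):
    """Format one point as InfluxDB line protocol.

    Assumes names and tag values contain no spaces, commas or equals signs,
    so no escaping is performed.
    """
    tag_part = "".join(",%s=%s" % (key, tags[key]) for key in sorted(tags))
    return "%s%s value=%s %d" % (measurement, tag_part, float(value), timestamp_ns)

def send_point(base_url, database, line):
    """POST one line to the /write endpoint (requires a running InfluxDB)."""
    url = "%s/write?db=%s" % (base_url, database)
    request = urllib.request.Request(url, data=line.encode("utf-8"), method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status  # InfluxDB answers 204 when the write succeeds

# One deployment event, timestamp in nanoseconds since the Unix epoch.
line = to_line("deploys", {"dc": "local-mac"}, 1, 1456876032000000000)
print(line)  # deploys,dc=local-mac value=1.0 1456876032000000000

# Needs a live server, so it is left commented out:
# send_point("http://container-ip:8086", "cpu_mac", line)
```

In practice a client library would handle escaping and batching; this only shows that a point is nothing more than a name, some tags, a value and a timestamp.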

Example

Now we are ready to build an example of CPU monitoring with InfluxDB, Telegraf with the CPU plugin, and Chronograf to watch a pretty graph of our CPU usage.

To install these services I will use a set of Docker images:

docker pull gianarb/influxdb:0.10.0
docker pull gianarb/chronograf:0.10.0
docker pull gianarb/telegraf:0.10.4.1

The first thing to do is start our InfluxDB server.

docker run -t -p 8086:8086 -p 8083:8083 gianarb/influxdb:0.10.0

Open your browser and go to http://container-ip:8083 to see the admin page.
InfluxDB supports an SQL-like query language. You can create your first database:

CREATE DATABASE cpu_mac

Now we are ready to collect our CPU metrics. Telegraf grabs information from our machine and pushes it to InfluxDB. In this example we will use just the CPU plugin. Write this configuration file, for example at /tmp/telegraf.conf.

[tags]
dc = "local-mac"

# OUTPUTS
[outputs]
[outputs.influxdb]
url = "http://<container-ip>:8086" # EDIT, USE YOUR SERVER!
database = "cpu_mac" # EDIT, USE YOUR DATABASE ALREADY CREATED.

[cpu]
percpu = false
totalcpu = true

We are ready to start Telegraf in order to collect our CPU information.

docker run -t -v /tmp/telegraf.conf:/etc/telegraf.conf gianarb/telegraf:0.10.4.1

Every 10 seconds this tool will send your data to InfluxDB. The last step is to actually see our data. To begin, we can run a query in the admin panel:

SELECT * FROM cpu_usage_system WHERE dc='local-mac'

The result will be like this (very difficult to read).

[Screenshot: query result in the InfluxDB admin panel]
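If you would rather read the result from a script than from the admin panel, the same query can go through InfluxDB's HTTP `/query` endpoint, which answers with JSON. The sketch below builds the request URL and flattens a response into one row per point. The sample response is hand-written in the shape the API returns, its numbers are made up, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

def query_url(base_url, database, query):
    """Build the URL for a GET request against the /query endpoint."""
    return "%s/query?%s" % (base_url, urlencode({"db": database, "q": query}))

def rows(response_text):
    """Flatten a /query JSON response into one dict per point."""
    result = []
    for statement in json.loads(response_text).get("results", []):
        for series in statement.get("series", []):
            for values in series["values"]:
                # Pair each column name with the value in the same position.
                result.append(dict(zip(series["columns"], values)))
    return result

# Hand-written sample in the shape the HTTP API returns; the numbers are made up.
sample = '''{"results": [{"series": [{"name": "cpu_usage_system",
  "columns": ["time", "dc", "value"],
  "values": [["2016-03-01T22:47:10Z", "local-mac", 1.5],
             ["2016-03-01T22:47:20Z", "local-mac", 2.25]]}]}]}'''

print(query_url("http://container-ip:8086", "cpu_mac",
                "SELECT * FROM cpu_usage_system WHERE dc='local-mac'"))
for row in rows(sample):
    print(row["time"], row["dc"], row["value"])
```

This is still not as pleasant as a graph, which is where a dashboard comes in.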

Chronograf is a dashboard for data visualization:

docker run -t -p 10000:10000 gianarb/chronograf:0.10.0

Visit http://container-ip:10000 to see this dashboard. The wizard is very easy to follow: all you need to do is add your InfluxDB server and create your first widget. This is my result.

[Screenshot: Chronograf widget showing CPU usage]

We created a slim monitor for our local machine, but we can extend it in order to monitor a cluster of servers:

[Screenshot: dashboard monitoring a cluster of servers]

Give each server's Telegraf configuration its own tag when collecting data, then use those tags in Chronograf to build your own dashboards.

[tags]
dc = "web1"

Everything we used in this article is open source: InfluxDB, Graphite, the Docker images, etc. You can contribute to them, or simply try them and share your opinion.

See you during the DevOps Conference and follow me on Twitter and GitHub.

Author
Gianluca Arbezzano


Software Engineer at CurrencyFair, a financial technology company. I am a PHP developer, but I work on different layers of the stack: automation, scalability and HA. I am an open source contributor to several projects, above all Zend Framework, and a member of the Doctrine ORM development team. I am a strong believer in development best practices and a supporter of several user groups. My fields of interest are varied and constantly evolving: in the last year I worked a lot on scalable infrastructures, building some of them on top of AWS, DigitalOcean and OpenStack.
