Top 10 Docker logging gotchas every Docker user should know
Docker changed the way applications are deployed, as well as the workflow for log management. In this article, Stefan Thies reveals the top 10 Docker logging gotchas every Docker user should know.
One of the first commands Docker users learn after “docker run” is “docker logs”. But did you know that the “docker logs” command doesn’t always work? That might sound surprising, but it’s true, and we will get to the reasons shortly.
Docker changed not only the way applications are deployed, but also the workflow for log management. Instead of writing logs to files, containers write logs to the console (stdout/stderr) and Docker logging drivers forward the logs to their destination. A quick look at Docker’s GitHub issues shows that users run into various problems when dealing with Docker logs.
Managing logs with Docker is tricky and requires deeper knowledge of the Docker logging driver implementations, as well as of the alternatives that work around the issues people report. So what are the top 10 Docker logging gotchas every Docker user should know?

Docker Logging Drivers overview

Let’s start with an overview of Docker logging drivers and the options for shipping logs to centralized log management solutions such as the Elastic Stack (formerly known as the ELK Stack) or Sematext Cloud.
In the early days of Docker, container logs were only available via the Docker remote API, i.e. via the “docker logs” command and a few advanced log shippers. Later on, Docker introduced logging drivers as plugins, opening Docker up to integrations with various log management tools. These logging drivers are implemented as binary plugins in the Docker daemon. Recently, the plugin architecture was extended to support logging plugins running as external processes, which can register as plugins and retrieve logs via Linux FIFO files. Currently, the logging drivers shipped with the Docker binaries are binary plugins, but this might change in the near future.
Docker logging drivers receive container logs and forward them to remote destinations or files. The default logging driver is “json-file”, which stores container logs in JSON format on the local disk. Thanks to Docker’s plugin architecture for logging drivers, plugins are available for both open source and commercial tools:
- Journald – stores container logs in the system journal
- Syslog – supports UDP, TCP and TLS transports
- Fluentd – supports TCP or Unix socket connections to fluentd
- Splunk – HTTP/HTTPS forwarding to a Splunk server
- GELF – UDP log forwarding to Graylog2
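A logging driver can be selected per container at run time with the `--log-driver` and `--log-opt` flags. A minimal sketch, using the fluentd driver (the address is a placeholder for your own fluentd endpoint):

```
# Route this container's stdout/stderr to a local fluentd instance,
# tagging events with the container name via a Go template.
docker run --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag="{{.Name}}" \
  nginx
```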
For a complete log management solution, additional tools need to be involved:
- Log parsers to structure logs, typically part of log shippers (fluentd, rsyslog, logstash, logagent, …)
- Log indexing, visualisation and alerting:
  - Elasticsearch and Kibana (Elastic Stack, also known as the ELK stack)
  - Graylog OSS / Enterprise
  - Sematext Cloud / Enterprise
  - and many more …
To ship logs to one of these backends, you might need to select a logging driver or logging tool that supports your log management solution of choice. For example, if your tool requires Syslog input, you might choose the Syslog driver.
1. Docker logs command works only with json-file logging driver
The default logging driver, “json-file”, writes logs to the local disk, and it is the only driver that works in parallel with the “docker logs” command. As soon as one uses an alternative logging driver, such as Syslog, GELF or Splunk, the Docker logs API calls start failing and the “docker logs” command shows an error reporting the limitation instead of displaying the logs on the console. Not only does the “docker logs” command fail; many other tools that use the Docker API for logs, such as Docker user interfaces like Portainer or log collection containers like Logspout, are unable to show container logs in this situation.
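A quick way to see this for yourself, assuming a local Docker daemon on a Linux host (the syslog driver defaults to the local syslog socket):

```
# Start a container with a non-json-file driver, then try to read its logs.
docker run -d --name hello --log-driver=syslog alpine echo hello
docker logs hello
# Fails with an error along the lines of:
#   Error response from daemon: configured logging driver does not support reading
```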
2. Docker Syslog driver can block container deployment and lose logs when Syslog server is not reachable
Using the Docker Syslog driver with TCP or TLS is a reliable way to deliver logs. However, the Syslog logging driver requires an established TCP connection to the Syslog server when a container starts up. If this connection can’t be established at container start time, the container fails to start with an error message like
docker: Error response from daemon: Failed to initialize logging driver: dial tcp
This means a temporary network problem or high network latency could block the deployment of containers. In addition, a restart of the Syslog server could tear down all containers logging via TCP/TLS to a central Syslog server, which is definitely a situation to avoid.
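To illustrate, a sketch of the failing configuration (the hostname and port are placeholders): if the endpoint below is unreachable at start time, the container never starts.

```
# If syslog.example.com:514 cannot be reached when this runs, the start
# fails with "Failed to initialize logging driver: dial tcp ...".
docker run --log-driver=syslog \
  --log-opt syslog-address=tcp://syslog.example.com:514 \
  alpine echo hello
```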
3. Docker syslog driver loses logs when destination is down
Similar to the issue above: Docker logging drivers lack the ability to buffer logs on disk when they can’t be delivered to a remote destination, which causes log loss. Here is an interesting issue to watch.
4. Docker logging drivers don’t support multi-line logs like error stack traces
When we talk about logs, most people think of simple single-line logs, such as Nginx or Apache access logs. However, logs can also span multiple lines. For example, exception traces typically span multiple lines, so to help Logstash users we’ve shared how to handle stack traces with Logstash.
Things are no better in the world of containers, where things get even more complicated because logs from all apps running in containers get emitted to the same output: stdout. No wonder so many people were disappointed to see issue #22920 closed with “Closed. Don’t care.” Luckily, there are tools like Sematext Docker Agent that can parse multi-line logs out of the box, as well as apply custom multi-line patterns.
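The core idea behind multi-line handling is simple: lines that don’t look like the start of a new log event are appended to the previous one. A minimal sketch with awk, assuming (as many shippers do by default) that continuation lines are indented; real log shippers apply the same idea with configurable patterns:

```shell
# Join indented continuation lines (e.g. stack-trace frames) to the
# preceding log event; print each reassembled event on one line.
printf '2024-01-01 12:00:00 ERROR boom\n  at com.example.Foo(Foo.java:42)\n2024-01-01 12:00:01 INFO ok\n' | awk '
  /^[[:space:]]/ { event = event " | " $0; next }   # continuation line
  { if (event != "") print event; event = $0 }      # new event starts
  END { if (event != "") print event }              # flush the last event
'
# → 2 lines: the stack-trace frame is appended to its ERROR event
```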
5. Docker service logs command hangs with non-json logging driver
While the json-file driver seems robust, other logging drivers can unfortunately still cause trouble with Docker Swarm mode. See this GitHub issue.
6. Docker daemon crashes if fluentd daemon is gone and buffer is full
Another scenario where a logging driver causes trouble is when the remote destination is not reachable: in this particular case, the logging driver throws exceptions that crash the Docker daemon.
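One partial mitigation the fluentd driver offers is asynchronous connection handling, so a container can at least start while fluentd is down. A sketch (the fluentd host is a placeholder); note this does not solve the buffering problem itself:

```
# fluentd-async-connect lets the container start even if fluentd is
# currently unreachable; logs emitted meanwhile may still be lost.
docker run --log-driver=fluentd \
  --log-opt fluentd-address=fluentdhost:24224 \
  --log-opt fluentd-async-connect=true \
  nginx
```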
7. Docker container gets stuck in Created state on Splunk driver failure
If the Splunk server returns a 504 on container start, the container is actually started, but Docker reports it as having failed to start. Once in this state, the container no longer appears under docker ps, and its process cannot be stopped with docker kill. The only way to stop the process is to kill it manually.
8. Docker logs skipping/missing application logs (journald driver)
It turns out that this issue is caused by journald rate limits, which need to be increased: Docker forwards logs from all running applications to journald, and journald may skip some of them due to its rate-limit settings. So be aware of your journald settings when you connect Docker to it.
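The limits live in /etc/systemd/journald.conf. A sketch with illustrative values (tune them to your log volume; older systemd versions call the first option RateLimitInterval), followed by a restart of journald to apply them:

```
[Journal]
RateLimitIntervalSec=30s
RateLimitBurst=10000
```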
9. Gelf driver issues
The GELF logging driver lacks a TCP or TLS option and supports only UDP, which risks losing log messages when UDP packets get dropped. Some issues also report DNS resolution/caching problems with the GELF driver, so your logs might be sent to “Nirvana” when your Graylog server IP changes, and this can happen quickly with container deployments.
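For reference, a sketch of a GELF driver setup (hostname and port are placeholders). Because the transport is UDP, delivery is fire-and-forget, and the DNS caveat above applies to the address you configure here:

```
# Forward logs via GELF over UDP; dropped packets mean lost log lines.
docker run --log-driver=gelf \
  --log-opt gelf-address=udp://graylog.example.com:12201 \
  nginx
```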
10. Docker does not support multiple log drivers
It would be nice to have logs stored locally on the server while also shipping them to remote servers. Currently, Docker does not support multiple logging drivers, so users are forced to pick a single one. Not an easy decision, knowing the various issues listed in this post.
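A common compromise is to keep the default json-file driver, with rotation limits so local logs don’t fill the disk, and ship logs via a separate tool that reads them through the Docker API. A daemon.json sketch with illustrative rotation values:

```
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "5"
  }
}
```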
That’s it! These are my top 10 Docker Logging Gotchas!
Alternatives to Docker Log Drivers
With so many issues around Docker logging drivers, are there alternatives? It turns out there are: Docker API based log shippers to the rescue!
Here are a few good reasons to look at such alternatives:
- The json-file driver is the default and reliable; a local copy of the logs is always available, and both the “docker logs” command and the Docker API calls for logs just work.
- Ability to filter logs by various dynamic criteria like image name or labels
- Better metadata, having full access to Docker API
- No risk of crashing the Docker daemon, because such log shippers run in a container with limited resource usage and disk space consumption (e.g. put the buffer directory in a volume and set sensible limits)
Please note that a third tool which more or less fits in this category is Elastic Filebeat. Filebeat collects the log files generated by the json-file logging driver; only the enrichment with container metadata is done via Docker API calls.
Logspout provides multiple outputs and can route logs from different containers to different destinations without changing the application container’s logging settings. It also handles ANSI escape sequences (like color codes in logs), which would otherwise be problematic for full-text search in Elasticsearch.
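Running Logspout is a one-liner: it attaches to the Docker socket and forwards all container logs to the route given as its argument. A sketch with a placeholder destination, based on the gliderlabs/logspout README:

```
# Forward all container logs to a remote syslog endpoint over TLS.
docker run -d --name=logspout \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gliderlabs/logspout \
  syslog+tls://logs.example.com:5000
```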
Like Logspout, Sematext Docker Agent (SDA) is API based, supports log routing and handles ANSI Escape sequences for full-text search. However, Sematext Docker Agent is actually more than just a simple log shipper.
SDA takes care of many issues raised by Docker users, such as multi-line logs, log format detection and log parsing, complete metadata enrichment for containers (labels, geoip, Swarm and Kubernetes specific metadata), disk buffering and reliable shipping via TLS. It is open source on GitHub, can be used with the Elastic Stack or Sematext Cloud, and can collect not just container logs, but also container events, plus Docker host and container metrics.
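Like Logspout, SDA only needs access to the Docker socket. A sketch of a typical invocation (the token is a placeholder for your own logs token; check the project README for the options current at the time you deploy):

```
# Run the agent alongside your containers; it reads logs via the Docker API.
docker run -d --name sematext-agent-docker \
  -e LOGSENE_TOKEN=your-logs-token \
  -v /var/run/docker.sock:/var/run/docker.sock \
  sematext/sematext-agent-docker
```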
Differences between three logging solutions that work well with the json-file driver and Docker Remote API
The clear recommendation for API based log shippers might change in the future, as Docker logging drivers improve over time and the new plugin mechanism via Unix sockets allows logging drivers to run as separate processes. This feature really improves Docker’s logging plugin architecture and is a good sign that Docker takes logging issues seriously.
In the meantime, consider Docker API based log collectors like Sematext Docker Agent and Logspout to avoid running into issues with Docker logs, like the 10 gotchas described here.