Transforming disorganized logging streams

Advantages and challenges of using Kafka for logging

Ryan Staatz

Kafka has become a go-to platform for organizations that need to stream log data between their infrastructure and the tools they use to analyze and audit their networks. It has many advantages, but it also presents challenges, such as limited scalability and bottlenecks, that can reduce a system’s effectiveness.

With all of the infrastructure that powers IT, organizations need tools that can make use of logged events across their entire environment. Having a single source of log data helps teams analyze and audit internal systems so they can detect and block potential cyber-attacks, diagnose application issues, and more.

Kafka can turn disorganized logging streams into an understandable, analyzable output that IT departments can use to monitor all of their environments and detect potential and ongoing problems.

What advantages does Kafka bring?

Easy-to-Read Outputs

Logging is a crucial aspect of an organization’s infrastructure, as logs help IT teams keep environments running properly, from debugging code and system issues to detecting and blocking cyber-attacks. Kafka is designed to ingest, process, store, and route logging data, all in real time. Once data is processed, Kafka can turn disorganized logging streams into a more understandable, easy-to-read output for IT staff to monitor. With information that’s easy to read, teams can spend less time parsing through logs and more time responding to issues that could lead to downtime.
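To make the idea of turning a disorganized stream into readable output more concrete, here is a minimal sketch of the kind of processing step that could sit between ingestion and routing. It is not part of Kafka itself. The raw line format, the field names, and the functions `structure_log_line` and `to_json_record` are all assumptions made for illustration.

```python
import json
import re

# Hypothetical raw format (an assumption, not a Kafka convention):
#   "<timestamp> <LEVEL> <service>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>[\w.-]+):\s*(?P<message>.*)$"
)


def structure_log_line(raw: str) -> dict:
    """Turn one raw log line into a dict with a consistent set of keys."""
    line = raw.strip()
    match = LINE_PATTERN.match(line)
    if match is None:
        # Keep lines that do not fit the pattern instead of dropping them silently.
        return {"timestamp": None, "level": "UNKNOWN", "service": None, "message": line}
    return match.groupdict()


def to_json_record(raw: str) -> str:
    """Serialize the structured record, e.g. before publishing it to a topic."""
    return json.dumps(structure_log_line(raw), sort_keys=True)
```

A line such as `2022-03-01T12:00:00Z ERROR auth-service: login failed for user 42` becomes a record with separate `timestamp`, `level`, `service`, and `message` fields, which is much easier to filter and search than free text.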

Simple Deployments

Adding Kafka to an environment is relatively easy for most organizations that already have the infrastructure to run multiple large storage clusters and the TCP networking to connect them. With the basics established, like the ability to publish and subscribe to logs, Kafka just needs the right cluster and storage infrastructure. Typically, organizations establish a group of servers across multiple data centers to provide redundancy, so that if one server fails another can take over, which improves uptime.
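In Kafka, this kind of redundancy comes from replicating each topic partition across several brokers. The sketch below does the basic arithmetic, assuming the standard `replication.factor` and `min.insync.replicas` settings and producers that write with `acks=all`. The function name and the returned keys are hypothetical, and the result ignores practical details such as where replicas are placed across data centers.

```python
def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int = 1) -> dict:
    """Rough failure tolerance for a replicated topic (simplified model)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min_insync_replicas must be between 1 and replication_factor")
    return {
        # Committed data survives as long as one replica that has it remains.
        "without_losing_committed_data": replication_factor - 1,
        # acks=all writes are accepted only while enough replicas are in sync.
        "while_still_accepting_acks_all_writes": replication_factor - min_insync_replicas,
    }
```

With three replicas and `min.insync.replicas` set to 2, for example, the topic keeps committed data through two broker failures but only keeps accepting `acks=all` writes through one.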

Ecosystem Integrations

In addition, Kafka is an open-source project of the Apache Software Foundation, so it is continually tested and updated and works well with other open-source Apache projects. An open-source log broker like Kafka gives an organization a wide range of tools to tie in whenever needed. With Kafka client libraries available for a wide range of programming languages, such as Python, C, C++, and Go, administrators can choose the language that works best for their environment. While some people may worry about an open-source tool’s longevity, Kafka has been around for more than a decade and is used by millions of developers for mission-critical applications, so users can be confident it will be available and supported for the foreseeable future.

Challenges of logging with Kafka

Kafka can be a powerful and valuable tool that helps organizations safeguard their servers and clusters by making log data visible and easy to read. However, it comes with some significant challenges to be aware of before implementing it throughout your systems.

Difficult to Scale

One of the biggest challenges for an organization that implements Kafka is that it is hard to scale cost-effectively in large enterprise environments. As a company grows, it has to process more logs, and Kafka requires it to build out more infrastructure to meet the increased demand for server memory, network bandwidth, and disk capacity. Kafka can be particularly useful for small organizations, but the resources it needs at higher volumes mean that the cost of maintaining it doesn’t scale well.
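A back-of-the-envelope calculation shows how quickly disk capacity grows with log volume. The following sketch assumes that storage is simply ingest rate times retention times replication, plus some headroom. The function name, the default replication factor of 3, and the 30% headroom are illustrative assumptions, and the model ignores compression and index overhead.

```python
SECONDS_PER_DAY = 86_400


def estimate_cluster_disk_tb(
    lines_per_second: float,
    avg_line_bytes: float,
    retention_days: float,
    replication_factor: int = 3,  # assumed default for this sketch
    headroom: float = 0.3,        # assumed 30% spare capacity
) -> float:
    """Estimate total disk needed across the cluster, in terabytes (10**12 bytes)."""
    bytes_per_day = lines_per_second * avg_line_bytes * SECONDS_PER_DAY
    stored_bytes = bytes_per_day * retention_days * replication_factor
    return stored_bytes * (1 + headroom) / 1e12
```

Under these assumptions, 50,000 lines per second at 300 bytes each, kept for seven days, needs roughly 35 TB of disk. Because the estimate is linear in the ingest rate, ten times the log volume means ten times the disk, before counting the extra memory and network bandwidth.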

Spikes Causing Bottlenecks

On top of this, bottlenecks in Kafka become increasingly noticeable as an organization scales, which can be extremely dangerous for a company that doesn’t catch its event logs quickly enough. The main reason is that Kafka keeps log messages only for a configured retention period, or up to a configured size, before deleting them. If resources are not correctly scaled, an organization risks the deletion of important events before they can be viewed or addressed. Even small organizations can feel similar effects if there is a significant spike in logged events before they have had the chance to scale their infrastructure. Events can easily be timed out, deleted, or lost during a spike without an administrator knowing, putting an organization at significant risk of downtime, a data breach, or worse.
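The risk described above can be approximated with a simple model: if consumers read more slowly than producers write, the oldest unread message keeps getting older, and once its age reaches the retention period it becomes eligible for deletion. The sketch below assumes constant produce and consume rates and treats retention as a hard cutoff. Real Kafka deletes whole log segments, so actual timing will differ, and the function name is hypothetical.

```python
from typing import Optional


def seconds_until_unread_data_expires(
    retention_seconds: float,
    oldest_unread_age_seconds: float,
    produce_rate: float,
    consume_rate: float,
) -> Optional[float]:
    """Time left before the oldest unread message ages past retention.

    Returns None when the consumer keeps up (consume_rate >= produce_rate).
    """
    if produce_rate <= 0 or consume_rate < 0:
        raise ValueError("produce_rate must be positive and consume_rate non-negative")
    if oldest_unread_age_seconds >= retention_seconds:
        return 0.0  # unread data is already eligible for deletion
    if consume_rate >= produce_rate:
        return None  # the backlog is not growing, so nothing expires unread
    # The consumer falls behind by this many seconds of data per second of wall time.
    age_growth_per_second = 1 - consume_rate / produce_rate
    return (retention_seconds - oldest_unread_age_seconds) / age_growth_per_second
```

For instance, with a one-hour retention, a backlog already ten minutes old, and a spike of 20,000 messages per second against a consumer that handles 5,000, this model gives a little over an hour (4,000 seconds) before unread events start to disappear.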

When it comes to logging, organizations need to ensure that their solution is robust and ready to handle fluctuations in logged events. For a company with a smaller IT environment, an open-source solution like Kafka can be the answer. It comes out of the box with many integrations and easy-to-read data and is simple to implement. However, once an organization starts to grow and needs to scale its systems to read millions of log lines per second, quickly creating terabytes of data daily, it needs to implement a more robust solution before Kafka’s weak points start to cause problems.


Ryan Staatz

Ryan Staatz is a Software Engineer IV at LogDNA, where he migrated the company’s infrastructure from VMs to Kubernetes. His team partners with large enterprise companies, such as IBM, to establish stability across deployments, expand LogDNA’s compliance repertoire and improve observability at scale. Ryan has presented on scaling Elasticsearch on Kubernetes, handling challenges with a multicloud infrastructure, running Kubernetes on bare metal and managing dozens of separate production environments. Ryan holds a BA in Human Biology from Stanford University.
