Interview with Erez Berkner, co-founder and CEO of Lumigo

“Lumigo is purpose-built for cloud-native”

JAXenter Editorial Team
© Shutterstock / ArtemisDiana

Erez Berkner, the co-founder and CEO of Lumigo, spoke with JAXenter about the cloud observability space, elaborating on what it’s trying to solve and how it’s different from monitoring. He explains in detail how Lumigo helps make sense of the ever-growing complexity of cloud-native applications and allows developers to find the root cause of an issue in highly distributed systems. He also discusses Lumigo’s decision to expand to Kubernetes and hybrid apps and reveals the company’s plans for further down the road.

JAXenter: How is observability different from cloud monitoring and testing?

Erez Berkner: Monitoring enables teams to keep an eye on the state of their systems and understand them. It involves collecting predefined metrics or logs. People usually use monitoring to keep track of a system’s health. They do that by collecting error logs and system metrics, and then using those to alert about issues.

Observability, on the other hand, lets teams troubleshoot and debug their system. It spots patterns that aren’t defined upfront. With observability, we gather insights and act on them.

JAXenter: What are the main pain points cloud observability is trying to solve?

Erez Berkner: Cloud systems, and particularly those built with serverless and microservices architectures, are typically highly distributed, with many dependencies across internal and third-party services. This creates a particularly painful challenge when trying to get to the root cause of an issue, such as a bug or performance problem.

JAXenter: How does Lumigo solve these pain points?

Erez Berkner: At Lumigo, we use an approach known as automated distributed tracing, which tracks a service request across all of the services – whether internal or external – that are required to complete a transaction. In addition, it gathers in one place all of the relevant information that would help a developer identify the root cause of an issue and resolve it. Our platform correlates logs and creates a virtual stack trace of your distributed environment.
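As a rough illustration of the general idea behind distributed tracing (not Lumigo's implementation, which the interview does not detail), the sketch below passes one trace ID along a chain of simulated service calls so that every hop can be tied back to the same request. The header names `x-trace-id` and `x-parent`, the `Tracer` class, and the three service functions are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

TRACE_HEADER = "x-trace-id"   # hypothetical header name
PARENT_HEADER = "x-parent"    # hypothetical header name


@dataclass
class Span:
    trace_id: str
    service: str
    parent: Optional[str]


@dataclass
class Tracer:
    """Collects spans in memory; a real tracer would ship them to a backend."""
    spans: list = field(default_factory=list)

    def record(self, headers, service):
        # Reuse the caller's trace ID, or start a new trace at the entry point.
        trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
        self.spans.append(Span(trace_id, service, headers.get(PARENT_HEADER)))
        # Headers to pass on to any downstream call made by this service.
        return {TRACE_HEADER: trace_id, PARENT_HEADER: service}


def third_party_api(tracer, headers):
    # Stand-in for an external service reached at the end of the chain.
    tracer.record(headers, "third-party-api")


def payments(tracer, headers):
    outgoing = tracer.record(headers, "payments")
    third_party_api(tracer, outgoing)


def checkout(tracer, headers):
    outgoing = tracer.record(headers, "checkout")
    payments(tracer, outgoing)
    return outgoing[TRACE_HEADER]


tracer = Tracer()
trace_id = checkout(tracer, {})
for span in tracer.spans:
    print(span.trace_id == trace_id, span.parent, "->", span.service)
```

Because each service forwards the ID it received, the internal and external hops end up in one trace, which is what allows a tool to show a single request end to end.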

JAXenter: How do you correlate millions of log lines, traces and metrics across distributed services? Can you explain how this works under the hood?

Erez Berkner: Sure. Lumigo’s correlation engine uses innovative algorithms that were developed specifically for cloud services. The engine uses data observed by the Lumigo tracer (a code library) to deterministically identify a request and correlate the logs, inputs, environment variables and traces across distributed services. It’s a no-code, no-deployment concept that is able to correlate synchronous managed services (e.g., Lambda or Stripe) as well as asynchronous managed services (such as DynamoDB).
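The interview does not describe the engine's algorithms, so the following is only a minimal sketch of the underlying idea: once every log record carries the identifier of the request that produced it, records from different services can be grouped and ordered into a single per-request timeline. The record fields (`request_id`, `ts`, `service`, `message`) and both function names are assumptions made for illustration.

```python
from collections import defaultdict


def correlate(records):
    """Group log records by request ID and order each group by timestamp."""
    by_request = defaultdict(list)
    for record in records:
        by_request[record["request_id"]].append(record)
    for group in by_request.values():
        group.sort(key=lambda r: r["ts"])
    return dict(by_request)


def virtual_stack_trace(group):
    """Render one request's records as a readable, ordered list of steps."""
    return [f'{r["service"]}: {r["message"]}' for r in group]


# Hypothetical log records emitted by three services, arriving out of order.
records = [
    {"request_id": "req-1", "ts": 3, "service": "payments", "message": "card declined"},
    {"request_id": "req-2", "ts": 1, "service": "checkout", "message": "order received"},
    {"request_id": "req-1", "ts": 1, "service": "checkout", "message": "order received"},
    {"request_id": "req-1", "ts": 2, "service": "inventory", "message": "stock reserved"},
]

groups = correlate(records)
for line in virtual_stack_trace(groups["req-1"]):
    print(line)
```

The hard part in practice is attaching a reliable request identifier across asynchronous hops; this sketch assumes that step has already been done.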

JAXenter: Cloud-native applications are getting more complex every day. What future do you see for cloud-native applications?

Erez Berkner: The good news is that new cloud-native services are released by cloud providers all the time, which makes developers’ lives easier in certain respects, and lets applications run more efficiently. But there is a flip side, which is increased complexity. We definitely see this trend continuing and it’s really the reason the Lumigo platform exists: to cut through all the complexity and give developers an easy way to understand what’s going on with their application.

JAXenter: What is the demand for your platform? What kind of response are you getting from developers?

Erez Berkner: The demand for our platform is directly tied to the maturity level of cloud-native usage, meaning in-production workloads. So as you can imagine, it’s quite high right now… The response has been tremendous. On our Slack, we have an internal channel we call #customer-compliments, which gets multiple posts every day, with things like “We connected Lumigo 20 minutes ago and immediately found problems we weren’t even aware that we had and are now working on implementing fixes!” It’s very satisfying and gets us excited every time.

JAXenter: How is Lumigo unique?

Erez Berkner: I’d say the main way in which we’re unique is that we are completely obsessed with cloud-native. We are not a legacy monitoring tool that’s trying to create a patchwork of solutions to address these new environments. Lumigo is purpose-built for cloud-native. We are literally 100% cloud-native ourselves and our platform was developed by cloud-native developers for cloud-native developers. The way it manifests itself in the product is by having the right metric or debugging info available for the developer at the right time, as well as having the right cloud-native out-of-the-box metrics (for example, cold starts or service latencies). These features really speak to cloud-native developers and make them say “these guys get it”.

JAXenter: Lumigo recently raised $29M to expand the platform to Kubernetes and hybrid apps. Why did you choose to focus your efforts in this direction?

Erez Berkner: So first, I want to make sure your readers know that we are still very much focused on serverless. The important word in your question is “expand”. The reason we’re doing this is that our customers are telling us that in the real world, their applications are usually built on a mix of serverless and non-serverless environments. They use serverless and managed services but also containers, Kubernetes and even VMs. And our goal is to give them a holistic view of what’s happening in their applications and let them easily resolve any issues.

JAXenter: What can we expect from Lumigo in the future?

Erez Berkner: One important direction is exactly what we mentioned above: making sure we give developers a complete end-to-end picture, regardless of technology or specific cloud services. We want them to be able to use Lumigo on any modern cloud technology. And as the cloud-native ecosystem evolves and grows, so will Lumigo.

The second area of focus goes back to how we opened this discussion with the difference between monitoring and observability. We want to give developers the best tools to find and resolve issues and we’ve added many capabilities for that purpose lately, such as identifying rogue deployments or seeing a “Live tail” of the application logs. Many more are on the way.
