What exactly is Knative? An introduction with Evan Anderson from Google
What are the benefits of serverless computing? What exactly is Knative, and which features are still in development? In our interview with Evan Anderson, Senior Staff Software Engineer at Google, he gives an introduction to the shiny new serverless tooling built on Kubernetes. He also talks about the benefits and downsides of serverless computing and why it is such a big topic at the moment.
JAXenter: Hello Evan and thanks for taking the time to answer our questions. After Docker came Kubernetes, and now the new hot topic is Knative. What is Knative all about?
Evan Anderson: Kubernetes was an attempt to elevate the conversation about cloud computing using some of the lessons learned at Google about container orchestration. Knative is a follow-up on that success, building up the stack into the “serverless” space (scalable event-driven compute with a high degree of management automation) to enable developers to focus on business solutions rather than infrastructure.
We’ve broken down the project into three main pillars at the moment:
Build focuses on taking source and producing Docker images in a repeatable server-side way.
Serving provides scalable (start from zero, scale to many) request-driven serving of stateless Docker containers, including the tools needed to inspect and debug those services.
Eventing orchestrates on- and off-cluster event sources and enables delivery to multiple compute endpoints (Knative Serving, but also Kubernetes Services or even raw VMs).
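To make the Serving pillar concrete, here is a minimal sketch of a Knative Service manifest. The service name and image are hypothetical, and the exact field names have shifted across Knative releases (this follows the early `v1alpha1` shape), so treat it as an illustration rather than a copy-paste recipe:

```yaml
# Hypothetical example: a stateless container deployed as a Knative Service.
# Knative creates the Deployment, routing, and autoscaling for you.
apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: hello            # hypothetical service name
spec:
  runLatest:             # always route traffic to the latest Revision
    configuration:
      revisionTemplate:
        spec:
          container:
            image: gcr.io/example/hello  # hypothetical image reference
```

Applying this with `kubectl apply -f service.yaml` is all a developer needs to do; the platform handles scaling the container from zero to many instances based on request load.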
One of the key insights that our team had at the start of the project was that the Function as a Service (FaaS) paradigm that’s currently dominant in the serverless marketplace was narrower than necessary, and we could implement both FaaS and Platform as a Service (PaaS) on top of stateless, request-driven Container as a Service. Based on previous experience building serverless platforms at Google, we had a good idea of how the serving and supporting components should look. But we also knew that great OSS software comes from the community, so we found a number of strong partners to make Knative a reality.
At its core, Knative has two goals:
Create building blocks to enable a high-quality OSS serverless stack.
Drive improvements to Kubernetes that support both serverless and general-purpose computing.
As a set of OSS building blocks, Knative allows on-premises IT departments and cloud providers to provide a common serverless development experience for web sites, stateless API services, and event processing.
JAXenter: The Knative project is relatively new. What functions are still being developed? Approximately, when will it be production ready?
Evan Anderson: Knative implements a change-tracked server-side workflow where each deployment and configuration change results in a new server-side Revision. Revisions allow easy canary and rollback of production changes, and were part of our initial design process based on experience with App Engine. Every Knative Revision is backed by a standard Kubernetes Deployment and an Istio VirtualService, so we deliver a lot of functionality out of the box. Further, we integrate with systems like Prometheus, Zipkin, and ELK to collect observability information, which can be a challenge in serverless environments.
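The canary-and-rollback workflow described above can be sketched as a Route that splits traffic by percentage between two Revisions. The revision names below are hypothetical (Knative generates them when a Configuration changes), and the API version reflects the early `v1alpha1` releases:

```yaml
# Hypothetical example: canary 10% of traffic to a new Revision
# while keeping 90% on the known-good one. Rollback is just
# editing the percentages back.
apiVersion: serving.knative.dev/v1alpha1
kind: Route
metadata:
  name: hello
spec:
  traffic:
  - revisionName: hello-00001   # hypothetical: the current stable Revision
    percent: 90
  - revisionName: hello-00002   # hypothetical: the new candidate Revision
    percent: 10
```

Because every deployment produces a new immutable Revision, shifting traffic between them is a pure routing change, with no rebuild or redeploy required.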
Having worked in Google production on two SRE teams, I hesitate to say something is “production ready” until a customer has tried it in their own environment. Based on the work the community is doing, we are on track to reach the following targets in the next few releases:
Reactive autoscaling: scale-from-zero (aka cold start) in under 2s with appropriate runtime and cluster configuration, and scaling to 1000 instances (possibly much more than 1000 qps) in less than a minute.
Automatic metrics, logs and telemetry collection, so you have visibility into what your code is doing.
7+ languages (Node, Java, Python, Ruby, C#, PHP, Go) with high-quality build templates where you don’t need to write a Dockerfile to get started.
Automatic TLS configuration, data plane security and rate-limiting controls.
A few dozen event sources (GitHub, Slack, etc) which can deliver to functions or applications.
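As a rough illustration of how the autoscaling target above is tuned in practice, Knative reads per-Revision annotations. The annotation keys below follow the `autoscaling.knative.dev` convention used in later Knative releases; exact names and supported values may differ by version, so this is a sketch of the mechanism rather than a definitive reference:

```yaml
# Hypothetical fragment of a revision template: hint the autoscaler
# about the desired load per pod.
metadata:
  annotations:
    # Assumed annotation: target number of concurrent requests per pod;
    # the autoscaler adds pods when observed concurrency exceeds this.
    autoscaling.knative.dev/target: "10"
```

The key idea is that the developer declares an intent (concurrency per instance) and the platform decides how many instances to run, including zero when there is no traffic.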
There are a lot of interesting open questions in both the Build and Eventing prongs of the project – Build naturally extends into CI/CD and multi-stage validation and rollout, while the intersection of Eventing and serverless architecture is still evolving rapidly once you get past “send an event from A to B”.
JAXenter: Which features are still being worked on and which ones are planned for future releases?
Evan Anderson: I think we’ll always be working on autoscaling and networking features. Right now, we’re focused on enabling core user scenarios like “run my application with autoscaling”, “let me plug in my own build”, and “I need to debug my application”. One of the biggest requests we received from users with our initial release was to make each component more independent and pluggable, which is a big challenge while also trying to make the whole platform coherent.
There are also some features that don’t land on our roadmap until we get more experience running the system, such as live upgrades from one release to the next, component packaging, and automatic generation of release notes.
JAXenter: Knative is built on top of Kubernetes; how is Istio involved in the Knative ecosystem?
Evan Anderson: Kubernetes provides a robust container scheduling layer, but Kubernetes networking tools (Services) were too primitive for the features we knew we wanted to build in Knative. Istio enables core Knative features such as percentage-based canarying of new code and configuration, and common security controls in the request path. Istio’s service mesh model matched our experience at Google and gave us a lot of the network routing capabilities for free (meaning that we only needed to write the configuration to request it).
One of the interesting post-launch feedback items we heard from several members of the community was that they couldn’t use Istio in their environment, but still wanted to use Knative. So something we’ve been looking at for future releases is creating internal custom resources (CRDs) to represent the desired Knative state for routing or load-balancing, and then using those CRDs as extension points so you could replace Istio with Contour or Linkerd if desired. In some cases, you might end up with a slightly longer network path or fewer authentication features, but it widens the set of use cases that Knative can address.
JAXenter: How is Knative different from AWS Lambda and Google Cloud Functions?
Evan Anderson: First of all, Knative is a set of software building blocks, not a hosted and packaged solution like Lambda and Cloud Functions. Being a set of building blocks is different from hosted and packaged services in a few ways:
You can run it locally using minikube, or install it onto a Kubernetes cluster anywhere (including on AWS or Google). Because you can run Knative yourself, we divide up our customers into three groups:
Developers: these are our end-users who are solving business problems by writing code. Typically, they want their code to just run and scale with no extra work on their part.
Operators: these are the IT professionals running Knative to provide a serverless experience to developers. This could be a cloud provider or in-house private cloud – in either case, they are managing servers and upgrades.
Contributors: these are the community members who are working on Knative itself. Contributors need ways to build and test Knative itself.
Existing hosted FaaS offerings target source deployments (i.e. upload your Node or Python code as a zip file). Knative can take a source upload and build a container, but it also works with any existing workflow that produces a container, including languages that Knative doesn’t have any native support for yet. (For example, we even have Dart, Kotlin, Rust, and Haskell samples!)
Portability: Knative isn’t tied to a single cloud vendor – it’s hybrid and cross-cloud. There will be plugins to enable vendor-specific features (like Google Stackdriver for logs), but the core experience and runtime behavior will translate from one cloud to another.
It’s OSS! You can download the source code, contribute your bug fixes, and participate in the working groups to make it better.
JAXenter: Serverless is on the rise, although the name is misleading – servers are still needed at some level. But why is serverless such a big topic at the moment?
Evan Anderson: I’m not quite sure why serverless didn’t take off sooner. The PaaS phenomenon in 2008-2012 (Heroku, App Engine, Cloud Foundry, and others) had most of the same key ingredients: stateless process scale-out, server-side build, integrated monitoring. Probably one of the key areas that’s gotten a lot better is the actual server-side infrastructure. Back when Google App Engine launched in 2008, there were a lot of limitations that were a product of the times: limited language choice, single vendor, no escape hatch to other styles of compute. Today, something like Cloud Functions or Lambda is still single-vendor, but the language choices are wider and the overall cloud ecosystem has gotten a lot deeper.
Regardless of “why now”, there are a lot of benefits to adopting a stateless, request-driven compute model (combined with pay-per-use, I think this is the core of serverless). Much like the event-driven model in desktop UI, handling a single event or request at a time often leads to simpler, more stable code. The architecture also lends itself fairly naturally towards developing in the 12-factor pattern, and a lot of the undifferentiated heavy lifting around monitoring, request delivery, and identity is taken care of for you, which is a huge productivity win. This also makes serverless systems stable and self-repairing, which is great for both hobby and enterprise projects where maintenance work is expensive.
JAXenter: Thank you very much!