Kubeflow: Bringing together Kubernetes and machine learning
Kubeflow brings composable, easier-to-use ML stacks with more control and portability to Kubernetes deployments, and it works for all ML frameworks, not just TensorFlow.
Introducing Kubeflow, the new project to make machine learning on Kubernetes easy, portable, and scalable. Kubeflow should be able to run in any environment where Kubernetes runs. Rather than recreating existing services, Kubeflow distinguishes itself by spinning up the best available solutions for Kubernetes users.
Why switch to Kubeflow?
Kubeflow is intended to make ML easier for Kubernetes users. How? By letting the system take care of the details (within reason) and supporting the kind of tooling ML practitioners want and need.
Kubernetes already offers:
- Easy, repeatable, portable deployments on diverse infrastructure (laptop <-> ML rig <-> training cluster <-> production cluster)
- Deployment and management of loosely coupled microservices
- Scaling based on demand
Kubeflow aims to give users simple manifests so they can have an easy-to-use ML stack anywhere Kubernetes is already running. Plus, it should self-configure based on the cluster it deploys into.
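As a sketch of what that looks like in practice, the early Kubeflow docs drive deployment through the ksonnet CLI. The registry path and component names below follow the project's getting-started guide but may differ in current releases, so treat this as illustrative rather than a verbatim recipe:

```shell
# Initialize a ksonnet app pointed at your existing cluster
ks init my-kubeflow
cd my-kubeflow

# Register the Kubeflow package repository and install the core package
ks registry add kubeflow github.com/kubeflow/kubeflow/tree/master/kubeflow
ks pkg install kubeflow/core

# Generate the core components (tf-operator, JupyterHub, etc.) and apply them
ks generate kubeflow-core kubeflow-core
ks apply default -c kubeflow-core
```

The ksonnet layer is what lets the same manifests self-configure for different clusters: environment-specific parameters are resolved at apply time rather than baked into the YAML.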
In particular, Kubeflow contains support for deploying JupyterHub. Users can create multi-user Hubs for Jupyter notebooks. The Hub can offer notebook servers to a class of students, a corporate data science workgroup, a scientific research project, or a high performance computing group.
Kubeflow users can also create a TensorFlow Training Controller that can be configured to use CPUs or GPUs, and that lets them adjust the size of a training cluster with a single setting. They can also create a TF Serving container.
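To give a feel for that "single setting," here is a minimal training-job manifest in the shape used by Kubeflow's TFJob custom resource. The job name and image are hypothetical placeholders, and the exact API version depends on your Kubeflow release:

```yaml
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: mnist-train            # hypothetical job name
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 2              # resize the training cluster by changing this one value
      template:
        spec:
          containers:
          - name: tensorflow
            image: my-registry/mnist-train:latest   # hypothetical training image
            resources:
              limits:
                nvidia.com/gpu: 1   # request a GPU; drop this block to train on CPUs
```

Scaling the job up or down is then just a matter of editing `replicas` and re-applying the manifest; the controller handles scheduling the new workers.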
How is it better than a plain Docker image on Kubernetes? Well, first of all, Kubeflow is great for anyone already using Kubernetes. It keeps everyone on the same toolchain rather than adding an extra step, and it brings scalability to people with existing on-premises or cloud-based servers.
However, if you have just a single container and a simple pipeline, Kubeflow might be a bit much. The Google Cloud ML Engine is more for people who want to run in the cloud and who want a layer of abstraction. In general, if you’re wiring together five or more services and systems to create an ML stack, then Kubeflow should simplify your workload.
Obviously, the baseline assumes you’ve already got a Kubernetes cluster available. Additional configuration may be necessary for specific Kubernetes installations. More information can be found on GitHub.