Fresh out of the oven: Torus is a Docker-based toolkit for machine learning projects
Manifold may just have the solution for a problem that has been facing many ML teams. Let’s take a look at Torus: a new toolkit that promises to bring DevOps practices to machine learning. Open up the box and see what’s inside.
DevOps for machine learning?
By now, DevOps is a common practice that has changed the software development landscape. But what if DevOps practices were applied to other kinds of projects? Manifold, an AI engineering services firm, asked this question about machine learning.
As AI continues to be a hot-button topic in the news, the number of AI projects and the focus on machine learning keep growing. There is still a barrier between machine learning engineers and development and operations teams, and there is currently no defined toolkit for machine learning engineers. Teams feel the pressure to keep up in this fast-growing sector but also face internal challenges that must be solved first, chiefly finding a more reliable way to work on projects together and communicate more easily.
What was the solution? It’s called Torus, and if that name just gave you flashbacks to geometry, you’re right on the money. (For those of us who weren’t so keen on mathematics, it’s a donut shape. Frosting not included.) Torus is open source, so you can reap the benefits of Manifold’s work. But before you go running to GitHub, let’s look under the hood at what Torus is, what it can do, and what it hopes to accomplish.
From Manifold’s engineering blog: “The goal of Torus is to help data science teams adopt Docker and apply Development Operations (DevOps) best practices to streamline machine learning delivery pipelines.” Torus began as an internal project at Manifold, and the teams involved have all seen increased productivity by using Docker. Of course, rolling out any new technology across a team can be difficult, no matter the team’s size. There are bound to be some bumps in the road when learning a new platform. But by providing Torus, Manifold hopes to make this easier and pass on its productivity gains. (Another win for open source projects.)
Let’s get Dockerized!
Torus uses Docker to keep everyone working in the same environment. This way, every project is shared across platforms and across the entire team, which encourages good communication and lowers the walls between engineers. It’s DevOps practice for a new breed of engineer.
Time to check out the goods. The Torus package includes:
Installation takes about as long as brewing a pot of coffee, and then you are ready for your Dockerized world of high productivity. All of the common libraries will be available through a Jupyter notebook, already installed and ready to use across the team. Have a preferred browser and IDE that you absolutely refuse to change? No problem. You can keep using them, since Torus is cross-compatible. The containers are all nice, neat, and separate, so each project has its own tidy space (just like separate donut flavors at the bakery).
GitHub lists what you’ll get from scaffolding your data science projects with Docker:
- Project Docker image built with your own Dockerfile for project specific requirements
- Docker Compose configuration that dynamically binds to a free host port and forwards to the jupyter server listening port inside the container
- Shared volume configuration for accessing and executing all your project code inside of the controlled container environment
- Ability to edit code using your favorite IDE on your host machine and seeing real-time changes to the runtime environment
- Jupyter notebook fully configured with nb-extensions ready for development and feature engineering
- Common data science and plotting libraries pre-installed in the container environment to start working immediately
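To make the "free host port" idea from the list a little more concrete, here is a minimal Python sketch of the underlying concept. It is not Torus code, and Torus itself relies on Docker Compose for this step. The function names `find_free_port` and `port_mapping` are hypothetical, and 8888 is assumed as the container port because it is Jupyter’s default.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "pick any free port"
        return sock.getsockname()[1]


def port_mapping(container_port: int = 8888) -> str:
    """Build a Docker-style 'HOST:CONTAINER' port mapping string.

    8888 is Jupyter's default port; treat it as an assumption here.
    """
    return f"{find_free_port()}:{container_port}"


if __name__ == "__main__":
    # Prints something like "49731:8888"; the host port varies per run.
    print(port_mapping())
```

Note that the port is released as soon as the socket closes, so another process could grab it before it is reused. Letting Docker choose the host port itself avoids that race, which is one reason to leave this job to the tooling.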
We will be keeping an eye on Torus to see what comes of its release. Will Torus help your team bridge the gap?