You'll need the right tools

Machine learning A-team: TensorFlow, Apache Spark MLlib, MOA and more

Gabriela Motroc

Machine learning experts are in high demand right now. As tech giants rely heavily on machine learning and AI these days, it comes as no surprise that their ML hiring spree has intensified. If you want to jump on the ML bandwagon, you’ll need the right tools.

Machine learning is gaining momentum and whether we want to admit it or not, it has become an essential part of our lives. As Adam Geitgey, Director of Software Engineering at Groupon, told JAXenter a few months ago, “anyone who knows how to program can use machine learning tools to solve problems.”

“I think that in five years, machine learning won’t be thought of as ‘magic’ anymore. It will be a very common tool that nearly all programmers use to solve problems – just like how most programmers today know about databases and networking.”

Geitgey explained that although you don’t need a deep mathematical background to apply machine learning, learning Python, “by far the most popular programming language today for machine learning,” is a must.

Most popular machine learning tools

When asked to recommend frameworks for those who want to make use of machine learning, Geitgey mentioned TensorFlow, Theano, Torch, Keras and tflearn. Let’s see what else is out there:

Apache Spark MLlib

JAXenter talked to Xiangrui Meng, Apache Spark PMC member and software engineer at Databricks, about MLlib and what lies underneath the surface. Databricks uses Scala to implement core algorithms and utilities in MLlib and exposes them in Scala as well as Java, Python, and R. Users can pick their favorite language and get started with MLlib.

“MLlib’s mission is to make practical machine learning easy and scalable. We want to make it easy for data scientists and machine learning engineers to build real-world machine learning (ML) pipelines. MLlib provides scalable implementations of popular machine learning algorithms, which lets users train models from big datasets and iterate fast.”
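To make the idea of an ML pipeline a little more concrete, here is a minimal pure-Python sketch. It is not the MLlib API: the class names `MeanCenter`, `ThresholdClassifier` and `Pipeline` are hypothetical and only illustrate how stages can be fitted in order, with each stage transforming the data for the next one.

```python
# Minimal sketch of the "pipeline" idea: stages are fitted in order, and
# each fitted stage transforms the data for the next one.
# All class names are hypothetical; this is not the MLlib API.

class MeanCenter:
    """Stage that learns a mean during fit and subtracts it in transform."""

    def fit(self, rows):
        xs = [x for x, _ in rows]
        self.mean = sum(xs) / len(xs)
        return self

    def transform(self, rows):
        return [(x - self.mean, y) for x, y in rows]


class ThresholdClassifier:
    """Final stage: predicts 1 when the centered feature is positive."""

    def fit(self, rows):
        return self  # nothing to learn in this toy model

    def transform(self, rows):
        return [1 if x > 0 else 0 for x, _ in rows]


class Pipeline:
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        # Fit every stage on the output of the stages before it.
        for stage in self.stages[:-1]:
            rows = stage.fit(rows).transform(rows)
        self.stages[-1].fit(rows)
        return self

    def predict(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


# Toy training data: (feature, label) pairs.
train = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
model = Pipeline([MeanCenter(), ThresholdClassifier()]).fit(train)
predictions = model.predict([(0.5, None), (7.5, None)])
```

The point of the design is that the whole chain behaves like a single model: it is fitted once and then reused for prediction, so preprocessing cannot drift apart from the classifier it feeds.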



Weka

Weka provides a uniform interface to a collection of machine learning algorithms. It is implemented in Java, but there are packages for Weka that enable the use of code written in Python, and R can also be used from Weka. Dr. Eibe Frank, Associate Professor of Computer Science at the University of Waikato, New Zealand, told JAXenter that it is also possible to script Weka using Groovy or Jython.

Weka’s strength lies in classification, so applications that require automatic classification of data can benefit from it, but it also supports clustering, association rule mining, time series prediction, feature selection, and anomaly detection.



MOA

MOA is open source software for machine learning and data mining on data streams in real time. Albert Bifet, co-leader of MOA and author of the book “Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams”, told JAXenter that MOA is developed in Java and can easily be used with Weka and ADAMS.

It is very easy to use MOA objects inside Scala.
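Learning from a data stream differs from batch learning in that each instance is seen once and the model is updated incrementally. The sketch below illustrates a common way to evaluate such a learner, often called test-then-train (prequential) evaluation. It is written in plain Python rather than against MOA itself, and `NearestMeanLearner`, `prequential_accuracy` and `toy_stream` are hypothetical names chosen for this example.

```python
# Sketch of "test-then-train" (prequential) evaluation on a data stream.
# Each instance is first used to test the model and then to update it, so
# the stream never has to be held in memory.
# All names are hypothetical; this is not MOA's API.

class NearestMeanLearner:
    """Incremental classifier that keeps one running mean per class."""

    def __init__(self):
        self.count = {}
        self.mean = {}

    def predict(self, x):
        if not self.mean:
            return None  # no data seen yet
        return min(self.mean, key=lambda label: abs(x - self.mean[label]))

    def learn(self, x, label):
        n = self.count.get(label, 0) + 1
        old = self.mean.get(label, 0.0)
        self.count[label] = n
        self.mean[label] = old + (x - old) / n  # incremental mean update


def prequential_accuracy(stream, learner):
    correct = seen = 0
    for x, label in stream:
        if learner.predict(x) == label:  # test first...
            correct += 1
        learner.learn(x, label)          # ...then train
        seen += 1
    return correct / seen if seen else 0.0


def toy_stream():
    # Generator standing in for an unbounded source of (feature, label) pairs.
    for i in range(100):
        value = float(i % 10)
        yield value, ("low" if value < 5 else "high")


learner = NearestMeanLearner()
accuracy = prequential_accuracy(toy_stream(), learner)
```

Because the learner only stores a count and a mean per class, memory use stays constant no matter how long the stream runs, which is the property that matters for real-time settings.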


Amazon Machine Learning

According to the product description, Amazon Machine Learning is a managed service for building ML models and generating predictions, enabling the development of robust, scalable smart applications. One of the reasons behind its popularity is the fact that users don’t require an extensive background in machine learning algorithms and techniques.

Google Cloud Machine Learning

Google Cloud Machine Learning allows users to build sophisticated, large-scale machine learning models in a short amount of time. It’s portable, fully managed and scalable, and promises to take care of everything from data ingestion through to prediction.

Cloud Machine Learning supports any TensorFlow model, so you can build and use models that work on any type of data across a wide variety of scenarios.


TensorFlow

TensorFlow is an open-source software library for Machine Intelligence. It comes with an easy-to-use Python interface and no-nonsense interfaces in other languages to build and execute computational graphs.

The new release candidate contains new Android demos, a Java API, Python 3 Docker images and more.
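For readers unfamiliar with the phrase “build and execute computational graphs”, the following plain-Python sketch shows the general idea of describing a computation first and running it later. It does not use TensorFlow, and `Node`, `constant`, `placeholder`, `add`, `multiply` and `run` are hypothetical names invented for this illustration.

```python
# Sketch of the "build a graph first, execute it later" model.
# All names are hypothetical; this is not TensorFlow's API.

class Node:
    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op          # "const", "placeholder", "add" or "mul"
        self.inputs = inputs  # upstream nodes this node depends on
        self.value = value    # only used by constants
        self.name = name      # only used by placeholders


def constant(value):
    return Node("const", value=value)


def placeholder(name):
    return Node("placeholder", name=name)


def add(a, b):
    return Node("add", (a, b))


def multiply(a, b):
    return Node("mul", (a, b))


def run(node, feed):
    """Evaluate a node by recursively evaluating its inputs."""
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed[node.name]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Building the graph computes nothing yet: y = 3 * x + 1
x = placeholder("x")
y = add(multiply(constant(3.0), x), constant(1.0))

# Execution happens only when the graph is run with concrete inputs.
result_a = run(y, {"x": 2.0})
result_b = run(y, {"x": -1.0})
```

Separating the description of a computation from its execution is what lets a framework inspect, optimize or relocate the graph before any numbers flow through it.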

Microsoft Azure Machine Learning Studio

Microsoft Azure Machine Learning Studio is a GUI-based integrated development environment for constructing and operationalizing machine learning workflows on Azure. It is a collaborative, drag-and-drop tool you can use to build, test, and deploy predictive analytics solutions on your data.

No programming is required: you visually connect datasets and modules to construct your predictive analysis model.


Caffe

Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC) with expression, speed, and modularity in mind. According to the project description, its speed makes Caffe perfect for research experiments and industry deployment.

Furthermore, models and optimization are defined by configuration without hard-coding. One can switch between CPU and GPU by setting a single flag, which makes it possible to train on a GPU machine and then deploy to commodity clusters or mobile devices.

What did we miss? Tell us your favorites in the comments section.


Gabriela Motroc
Gabriela Motroc was editor of and JAX Magazine. Before working at Software & Support Media Group, she studied International Communication Management at the Hague University of Applied Sciences.
