“Python is the most popular programming language today for machine learning”
Machine learning may sound futuristic, but it is not. Speech recognition systems such as Cortana, or search features in e-commerce systems, have already shown us the benefits and challenges that go hand in hand with these systems. In our machine learning series we introduced you to several tools that make all this possible. Now it's time for software developer Adam Geitgey to talk about the ABCs of machine learning and teach you how to make use of ML.
This interview is part of a Machine Learning series. We invited Adam Geitgey, Director of Software Engineering at Groupon, to talk about the difference between machine learning and the older artificial intelligence effort and the progress we’ve made so far.
JAXenter: How are you involved in machine learning?
Adam Geitgey: My professional background is primarily in traditional software development, not machine learning. I’ve worked on scaling large-scale websites, building backend systems, building mobile apps and other things like that.
But along the way, I’ve always been fascinated with the idea of using computers to solve problems that weren’t so linear and clear. I was first introduced to the idea of probabilistic programming about 10 years ago in an article by Peter Norvig about how spelling correction works using probability. It was so interesting to me that I looked for similar topics which led me to Machine Learning. Once I found that, I was hooked and started writing articles and working on projects.
JAXenter: The domain of artificial intelligence is not new; it's been around since the 1960s. Expectations were high, but many of them were not really met. It seems that only now, with machine learning and deep learning, are we seeing real progress in this domain. What differentiates this newer approach from the older AI effort?
Adam Geitgey: The biggest difference is that modern deep neural networks are able to figure out how to interpret messy raw data on their own without needing humans to explain the format of the data. This makes them easily adaptable to many different kinds of problems. The exact same neural network can be used to recognize speech or instead recognize images just by showing it different training data.
In the past, AI research didn’t work this way. Instead, one group of researchers would work on speech recognition, another group would work on image recognition, and so on. And each of those different groups would spend a lot of time figuring out custom ways to capture their raw data in a format that could be understood by a computer.
Before deep neural networks, a researcher who wanted to build a system to recognize pictures of apples would have to first create a system that could recognize the edges of objects. Then using that edge detector system, they could build a slightly more complicated system to recognize basic shapes. Then using that shape detector, they could build an even more complicated system to recognize apple-shaped objects. It was a lot of custom, one-off work.
With deep neural networks, none of that custom work is required. The computer does all those in-between steps for you. The researcher can just show pictures of apples to a deep neural network and it figures out how to recognize apples by itself. And if instead you feed in an audio file of someone saying the word “apple”, it could learn to recognize that instead.
This means that as soon as one problem has been solved with deep neural networks, someone can often apply that same approach to another field and make another breakthrough. Being able to share advancements between fields has really sped up the pace of innovation.
JAXenter: In which domains of ML do you see the highest level of progress right now?
Adam Geitgey: The number of breakthroughs happening in machine translation (the technology that powers systems like Google Translate) is amazing. There are so many breakthroughs happening that it's hard to keep track of them. Over the next few years, these systems will get better at understanding the context of the documents they are translating instead of just translating every sentence individually.
Another very new area with a lot of development happening is called Generative Adversarial Networks (GANs). These kinds of systems are a new way of teaching computers to generate data from scratch based on past data. For example, they can be used to build a system where you describe a picture in a few words and then the computer creates a real picture based on your description from scratch. There are endless possible applications.
JAXenter: One domain of ML is speech recognition and natural language processing. With bots like Siri, Google Now, and Cortana, ML is entering the sphere of daily life. But we are not yet ready to have a “real” conversation with a machine. What is the technical reason for the missing piece? Why are we not ready yet to converse naturally with a machine?
Adam Geitgey: Deep neural networks have improved the accuracy of speech recognition significantly compared to older methods. That's why these systems are entering daily life. Now that they've crossed the 95% accuracy threshold, they are finally convenient to use.
But there's a huge difference between the computer understanding which words you said and the computer understanding what those words actually mean. Researchers are just now making baby steps towards building systems that can actually understand meaning. There's years of work left before you will be able to have a real conversation with your phone.
JAXenter: In your first Medium post about ML you mentioned that machine learning only works if the problem is actually solvable with the data that you have. What happens when you don’t have enough data? Can machine learning help you fill the gaps?
Adam Geitgey: What happens if you don't have enough training data is that the machine learning system ends up being very inaccurate. Since there isn't enough data for the system to learn how to solve the problem correctly, it will generate incorrect answers.
Sometimes there are tricks that can be used to create more data. For example, image recognition researchers will create blurry or flipped versions of their training images to increase the amount of training data. But this only helps in limited cases. Many times the only answer is to just go and collect more data.
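The flipping trick Geitgey mentions can be sketched in a few lines. This is a toy illustration, not code from the interview: it represents a hypothetical grayscale image as a list of pixel rows and mirrors it horizontally to produce one extra training example.

```python
# Hypothetical 3x3 grayscale "image": each inner list is a row of pixel values.
image = [
    [0, 10, 20],
    [30, 40, 50],
    [60, 70, 80],
]

def flip_horizontal(img):
    """Mirror each row left-to-right, yielding a new (but still valid) training example."""
    return [list(reversed(row)) for row in img]

flipped = flip_horizontal(image)
print(flipped[0])  # -> [20, 10, 0]
```

A flipped apple is still an apple, so the label carries over unchanged; that is what makes this kind of augmentation cheap compared to collecting new data.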
JAXenter: How much theory – e.g. statistics, algorithms, neural networks – does one need to grasp before diving into ML?
Adam Geitgey: Anyone who knows how to program can use machine learning tools to solve problems. There’s no reason to be afraid.
I think that in five years, machine learning won’t be thought of as “magic” anymore. It will be a very common tool that nearly all programmers use to solve problems – just like how most programmers today know about databases and networking.
If you want to develop brand new machine learning algorithms or read the latest research papers, then you’ll need to get pretty deep into linear algebra and statistics. But you don’t need a deep mathematical background to be able to apply machine learning.
JAXenter: There are a lot of projects, libraries, frameworks out there that help develop ML applications. For example, Google recently open sourced TensorFlow. Which technologies do you recommend for someone who wants to make use of ML?
Adam Geitgey: Definitely start by learning Python. It’s by far the most popular programming language today for machine learning.
For solving most machine learning problems (which don't require deep learning), the answer is easy. You just need to install a few Python libraries: scikit-learn, NumPy and pandas. These tools are free and designed to work well together. And if you have a large classification problem that runs too slowly on a single CPU using scikit-learn, you can use the xgboost library to run it on multiple CPUs.
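A minimal sketch of the workflow those three libraries enable, assuming they are installed. The feature names and the synthetic dataset here are invented for illustration only: pandas holds the table, NumPy generates the numbers, and scikit-learn trains and evaluates a classifier.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy dataset: two numeric features and a binary label.
rng = np.random.RandomState(0)
df = pd.DataFrame({
    "feature_a": rng.rand(200),
    "feature_b": rng.rand(200),
})
df["label"] = (df["feature_a"] + df["feature_b"] > 1.0).astype(int)

# Hold out a test set so accuracy is measured on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    df[["feature_a", "feature_b"]], df["label"], random_state=0)

model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

Swapping in a different scikit-learn estimator (or an `xgboost.XGBClassifier` for larger problems) leaves the rest of this code unchanged, which is a big part of why this stack is the common recommendation.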
For deep learning, there isn’t one clear winner yet. Everyone is using their own favorite tool and you will have to install several different toolkits to be able to replicate the work of other researchers. TensorFlow is probably the most well-known, but there are several other popular frameworks like Theano and Torch.
TensorFlow, Theano and Torch are low-level frameworks. There are also higher-level frameworks like Keras and tflearn that build on top of the lower-level frameworks but provide simpler programming interfaces. These are great choices for implementing your own deep learning solutions.
JAXenter: AI always had the power to feed the imagination of mankind – or at least of science fiction writers ;-). What do you think personally about the potential of ML/AI? What does the future really hold for machine learning?
Adam Geitgey: I think that machine learning will eventually allow us to automate many jobs that are currently done by people. This means that we might need far fewer people working in order for society to function as it does. It will be very interesting to see how this impacts society and how we all decide to deal with that change. Hopefully we can figure out how to make sure AI benefits everyone.
We asked Adam Geitgey to finish the following sentences:
In 50 years’ time machine learning will be involved somewhere in the production or functionality of every product you buy.
If machines become more intelligent than humans then hopefully we can use that incredible capability to build a better society for everyone.
Compared to a human being, a machine will never… I don't know! I'm not sure there's an answer to this question!
Without the help of machine learning, mankind would never be able to automate away all the boring jobs in the world – which might end up being a good or bad thing!