Why deep learning is an essential tool for developers
The field of artificial intelligence has shown tremendous progress in the past decade. But there’s more to AI than chess-playing robots. Mat Leonard, the Head of Udacity’s School of AI, explains how the history of deep learning is the history of a programming revolution. Are you ready for Software 2.0?
The last few years have seen amazing progress in the field of artificial intelligence. AlphaGo beat the world’s Go grandmasters, a feat thought impossible. The next version, AlphaZero, became the world’s best chess player in four hours. Cars are driving themselves, smartphones can identify skin cancer as well as trained dermatologists, and we’re on the verge of human-level universal translation. All of this is the result not of machine learning in general, but of a specific subset of machine learning called deep learning. But what is deep learning, what can it do, and how did we get to this point?
When people talk about AI these days, they are actually talking about deep learning. Artificial intelligence is a broad field that includes pathfinding algorithms such as A*, logic and planning algorithms, and machine learning. The field of machine learning consists of various algorithms with internal parameters found from example data through optimization. Deep learning is a branch of machine learning utilizing “deep” neural networks, that is, artificial neural networks with dozens of layers and millions of parameters. The recent advances in AI such as speech recognition, realistic image generation, and AlphaZero are all based on deep learning models.
The history of deep learning
Artificial neural networks have existed since the 1950s, when they were known as perceptrons. The perceptron was an algorithm that roughly approximated the way neurons operate. An individual neuron (or “unit”) summed its input values, each multiplied by a connection strength, or weight. That weighted sum was then passed through an activation function to produce the output of the neuron. The neurons could be combined into layers with multiple neurons in each layer, using the output of neurons in one layer as the input to neurons in the next layer. The weights had to be set such that the network performed some specified behavior, but in most cases setting them by hand was practically impossible. Instead, the network was “trained” using example data: the input data was labeled, and the weights were adjusted until the network could reproduce the correct labels from the data.
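To make this description concrete, here is a minimal sketch of a single perceptron in plain Python. The step activation, the bias term, the AND-gate examples, the learning rate, and all function names are assumptions chosen for illustration; they are not taken from any particular historical implementation.

```python
def step(total):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if total >= 0 else 0

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(total)

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Adjust the weights from labeled examples using the perceptron rule."""
    n_inputs = len(examples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in examples:
            # error is +1, 0, or -1: nudge the weights only when the guess is wrong.
            error = label - neuron(inputs, weights, bias)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Labeled example data: the logical AND of two inputs (an illustrative choice).
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_EXAMPLES)
predictions = [neuron(inputs, weights, bias) for inputs, _ in AND_EXAMPLES]
print(predictions)  # [0, 0, 0, 1]
```

The update used here is the classic perceptron learning rule. It is only guaranteed to settle on correct weights when the examples are linearly separable, as they are for AND.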
After the initial excitement, researchers were stuck because they couldn’t train neural networks with more than two layers, restricting the ability of the networks to perform complex behaviors. Two decades later in the 1980s, a solution was found in the backpropagation algorithm, which allowed error information to flow backward through the network, from the output layer to the input layer. Suddenly, researchers could train deeper neural networks with multiple layers. However, the process of training was computationally expensive and, while there were some successes, neural networks weren’t seen as better alternatives to other machine learning algorithms.
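The core idea of backpropagation can be sketched in a few lines of plain Python. The example below is a minimal illustration, not a historical implementation: the sigmoid activation, the three hidden units, the squared-error loss, the learning rate, and the XOR task are all assumptions chosen to keep it small. The `backprop` function computes an error signal at the output and passes it backward to the hidden layer, which is what makes the hidden weights trainable.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Run one input through a hidden layer and a single output unit."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(params["w_hidden"], params["b_hidden"])]
    output = sigmoid(sum(w * h for w, h in zip(params["w_out"], hidden))
                     + params["b_out"])
    return hidden, output

def loss(params, data):
    """Mean squared error between network outputs and labels."""
    return sum((forward(params, x)[1] - y) ** 2 for x, y in data) / len(data)

def backprop(params, data):
    """Gradient of the loss for every parameter, computed output-to-input."""
    n_hidden, n_inputs = len(params["w_hidden"]), len(params["w_hidden"][0])
    grads = {
        "w_hidden": [[0.0] * n_inputs for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [0.0] * n_hidden,
        "b_out": 0.0,
    }
    for x, y in data:
        hidden, output = forward(params, x)
        # Error signal at the output unit (chain rule through the sigmoid).
        delta_out = 2.0 * (output - y) * output * (1.0 - output) / len(data)
        grads["b_out"] += delta_out
        for j, h in enumerate(hidden):
            grads["w_out"][j] += delta_out * h
            # Send the error back through the output weight to hidden unit j.
            delta_hidden = delta_out * params["w_out"][j] * h * (1.0 - h)
            grads["b_hidden"][j] += delta_hidden
            for i, xi in enumerate(x):
                grads["w_hidden"][j][i] += delta_hidden * xi
    return grads

def gradient_step(params, grads, lr):
    """Move every parameter a small step against its gradient."""
    return {
        "w_hidden": [[w - lr * g for w, g in zip(w_row, g_row)]
                     for w_row, g_row in zip(params["w_hidden"], grads["w_hidden"])],
        "b_hidden": [b - lr * g
                     for b, g in zip(params["b_hidden"], grads["b_hidden"])],
        "w_out": [w - lr * g for w, g in zip(params["w_out"], grads["w_out"])],
        "b_out": params["b_out"] - lr * grads["b_out"],
    }

# XOR: a task that a single-layer perceptron cannot represent.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

rng = random.Random(0)  # fixed seed so runs are repeatable
params = {
    "w_hidden": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)],
    "b_hidden": [rng.uniform(-1, 1) for _ in range(3)],
    "w_out": [rng.uniform(-1, 1) for _ in range(3)],
    "b_out": rng.uniform(-1, 1),
}

initial_loss = loss(params, XOR)
for _ in range(2000):
    params = gradient_step(params, backprop(params, XOR), lr=0.5)
final_loss = loss(params, XOR)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

How far the loss falls depends on the random starting weights and the learning rate, so treat the printed numbers as indicative only.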
In the 2012 ImageNet competition, Alex Krizhevsky and Ilya Sutskever from Geoffrey Hinton’s lab trained a deep neural network with 60 million parameters on two GPUs for a week. The goal of the competition was to identify objects in images with the lowest error rate, using a dataset of 1.2 million images as training examples. Their algorithm dominated the field with an error rate of 15%, beating the next best attempt by ten percentage points. Afterwards, deep neural networks became the only choice for computer vision problems. This combination of massive neural networks, trained on gigantic datasets and using GPUs to increase computational efficiency, is the basis of deep learning and all the amazing breakthroughs we’re seeing in AI.
A programming revolution
Deep learning has created a new paradigm for software development. Traditional development involves a programmer building applications line by line, instructing the computer to perform specific behaviors. With deep learning, the software is written in the internal parameters of a neural network. These parameters are found by specifying the desired behavior of the program (usually with example data) and optimizing the network to reproduce that behavior. Andrej Karpathy, the Director of AI at Tesla, calls this “Software 2.0”, contrasted with “Software 1.0”, the familiar procedural and object-oriented paradigms.
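The contrast can be shown with a deliberately tiny example. In the sketch below, the temperature-conversion task, the function names, the example values, and the learning rate are all assumptions for illustration, and a single linear unit stands in for a neural network. The first function is written by hand; the second gets its behavior from two parameters found by optimizing against example data.

```python
def fahrenheit_v1(celsius):
    """Software 1.0: the programmer writes the rule explicitly."""
    return celsius * 9 / 5 + 32

# Software 2.0 starts from examples of the desired behavior.
# In practice the labels would come from measurements; here they are
# generated from the known rule so the result can be checked.
EXAMPLES = [(c, fahrenheit_v1(c)) for c in (-20, -10, 0, 10, 20)]

def fit_linear(examples, steps=5000, learning_rate=0.004):
    """Find weight and bias by gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2 * (weight * x + bias - y) * x for x, y in examples) / n
        grad_b = sum(2 * (weight * x + bias - y) for x, y in examples) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

weight, bias = fit_linear(EXAMPLES)

def fahrenheit_v2(celsius):
    """The 'program' is now just the two learned parameters."""
    return weight * celsius + bias

print(round(weight, 3), round(bias, 3))  # 1.8 32.0
```

Nobody typed 1.8 or 32 into `fahrenheit_v2`; both values were recovered from the examples. Real deep learning models follow the same pattern with millions of parameters instead of two.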
Karpathy notes, “It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data (or more generally, identify a desirable behavior) than to explicitly write the program.” This can be clearly seen in the domain of computer vision. Researchers worked for decades on hard coding feature detectors so that computers could understand the contents of images. Deep learning models called convolutional neural networks learn these features from the images themselves, performing far better than any algorithm written procedurally. By collecting a large dataset of labeled images, deep learning researchers made the entire history of computer vision research obsolete.
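As an illustration of what a hard-coded feature detector looks like, the sketch below applies a Sobel kernel, a classic hand-designed vertical-edge filter, to a tiny made-up image. The image and the function name are assumptions for illustration. A convolutional neural network performs the same sliding-window operation, but treats the kernel values as parameters to be learned from data instead of fixing them in advance.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    Strictly speaking this is cross-correlation, which is the operation
    deep learning frameworks usually call convolution.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k_rows) for j in range(k_cols))
             for c in range(out_cols)]
            for r in range(out_rows)]

# Hand-designed kernel that responds to dark-to-bright vertical edges.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# A 4x6 image: dark (0) on the left half, bright (1) on the right half.
IMAGE = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

response = convolve2d(IMAGE, SOBEL_X)
print(response)  # [[0, 4, 4, 0], [0, 4, 4, 0]]
```

The response is zero over the flat regions and large where the window straddles the edge, which is exactly the behavior the kernel’s designer built in by hand.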
Fifteen years ago, cloud computing and AWS didn’t exist, but today they are essential tools for programmers. Five years ago, Docker didn’t exist; now it’s ubiquitous and another tool that developers are expected to know. Similarly, five to ten years from now, deep learning will likely be an essential tool. The best solution for a large range of applications will be to collect and label data for a deep learning model, rather than the traditional process of hard-coding behavior. Karpathy continues, “A large portion of the programmers of tomorrow do not maintain complex software repositories, write intricate programs, or analyze their running times. They collect, clean, manipulate, label, analyze and visualize data that feeds neural networks.” For developers and engineers, deep learning is becoming an essential skill.
How to learn deep learning
Today, there is a prevalent but incorrect assumption that only people with PhDs can understand deep learning. It is true that for the most part deep learning practitioners have come from academia wielding PhDs in computer science. However, modern deep learning frameworks such as PyTorch and Keras allow anyone with programming experience to build their own deep learning applications. Working developers also have a strong advantage because they have shipped code to production. The skill most lacking in the AI field is deploying deep learning models to production and maintaining those models afterwards. In many cases, an experienced developer who learns how to build deep learning models is more employable than a computer science PhD.
There are many options for learning deep learning. Bootcamps such as Galvanize teach deep learning as part of their data science curriculum. There are also online options such as Google’s Machine Learning Crash Course and Udacity’s School of AI. A great way to gain experience is to work on Kaggle competitions. You can also find deep learning papers on arXiv and implement the models; you’ll often be able to find implementations on GitHub to guide you. Regardless of the education method, it’s important to practice lifelong learning to keep up with the rapidly evolving technology industry.
Deep learning has fundamentally changed how humans interact with machines and it’s clear that AI will impact nearly every industry. Within the next five to ten years, deep learning will be another essential skill in a developer’s toolkit. Now is the perfect time to get involved in the deep learning community, just as the new era in software begins.