How blockchain can save AI from itself
How do we prevent failures in A.I. such as Microsoft’s offensive chatbot and Tesla’s autopilot car crash? Blockchain may have the answer. Rob May discusses the future of A.I. training.
With good reason, the general public has become increasingly wary about the sudden growth in artificial intelligence software, but another high-growth area of software—blockchain—could solve many of the issues holding back modern A.I. development and adoption.
As a community, we must first acknowledge that real-world artificial intelligence has suffered from some rather spectacular public failures of late. Microsoft, one of the most technically advanced companies on the planet, saw its Tay A.I. chatbot turn racist almost immediately after its release in March of 2016.
Meanwhile, Tesla recently admitted that a January 2018 crash occurred because its autopilot A.I. technology couldn’t recognize a parked fire truck.
In the two years between the Tay incident and the Tesla fire truck crash, A.I. has been required to accomplish more dangerous and complex work, without having shed its capacity for serious, unexpected failure.
Blockchain has the ability to change that.
Blockchain’s distributed ledger could not only help prevent incidents like the Tay corruption from happening, but could also unlock collaborative A.I. innovation, ensuring failures like the Tesla fire truck blind spot don’t happen again.
Most artificial intelligence programs in use today are variants on “deep learning” algorithms. These systems learn how to recognize, categorize, and respond to inputs by ingesting training data.
For example, a facial recognition program could “learn” how to recognize faces by being fed millions of images (some depicting faces, some not), with each training image labeled as containing a face or containing no faces. The A.I. would then develop its own set of criteria for distinguishing faces from non-faces, test those criteria against more training data, grade its results, and repeat until it gained highly reliable competence in recognizing faces.
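The train, grade, and repeat cycle described above can be sketched in a few lines of Python. This is a toy illustration, not how a production face recognizer works: it uses a simple perceptron rather than a deep network, and the two numeric "features" and the labeling rule in `make_example` are invented stand-ins for real labeled images.

```python
import random

def make_example(rng):
    """Return (features, label). Hypothetical stand-in for a labeled image:
    label 1 means "face", 0 means "no face"."""
    x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    label = 1 if x1 + x2 > 0 else 0
    return (x1, x2), label

def predict(weights, bias, features):
    """Apply the learned criteria to one example."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

def accuracy(weights, bias, examples):
    """Grade the model: fraction of examples classified correctly."""
    correct = sum(predict(weights, bias, f) == y for f, y in examples)
    return correct / len(examples)

def train(train_set, held_out, target=0.95, max_rounds=50, lr=0.1):
    """Adjust the criteria on labeled data, grade on held-out data, repeat."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_rounds):
        for features, label in train_set:
            error = label - predict(weights, bias, features)
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
        if accuracy(weights, bias, held_out) >= target:
            break  # stop once the grade is good enough
    return weights, bias

rng = random.Random(0)
train_set = [make_example(rng) for _ in range(500)]
held_out = [make_example(rng) for _ in range(200)]
weights, bias = train(train_set, held_out)
score = accuracy(weights, bias, held_out)
print(f"held-out accuracy: {score:.2f}")
```

Everything the model ends up "knowing" comes from the labels in `train_set`, which is why the quality of that data matters so much.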
The key to this performance is the quality and content of the algorithm’s training data. Tay went off the rails by adapting to the conversations it had with real, online users. Those users quickly realized that Tay was mimicking some of their own conversational tics, and decided to amuse themselves by coaching Tay into using offensive language.
More simply, some Twitter trolls intentionally corrupted Tay’s training data.
Blockchain could prevent this by keeping a public record of which core algorithms were developed using which sets of training data. More broadly, a blockchain could record: who wrote an original A.I. algorithm; what data was used to train that algorithm; and who is deploying that “trained” algorithm now.
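To make the idea of such a record concrete, here is a minimal sketch of a hash-chained log that tracks who wrote an algorithm, what data trained it, and who deployed it. It is an assumption-laden illustration: it runs in a single process with no network, consensus, or signatures, so it shows only the tamper-evidence property of a ledger, and the event names, party names, and byte strings are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block

def fingerprint(data: bytes) -> str:
    """SHA-256 digest used to identify an algorithm or data set."""
    return hashlib.sha256(data).hexdigest()

def make_block(prev_hash: str, record: dict) -> dict:
    """Bundle a record with the previous block's hash, then hash the bundle."""
    body = {"prev_hash": prev_hash, "record": record}
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return {**body, "hash": fingerprint(encoded)}

def append(chain: list, record: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(make_block(prev_hash, record))

def is_valid(chain: list) -> bool:
    """Recompute every hash; any edited record or broken link fails."""
    prev_hash = GENESIS
    for block in chain:
        expected = make_block(block["prev_hash"], block["record"])["hash"]
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True

# Hypothetical history of one algorithm.
algorithm_id = fingerprint(b"model source code")
chain = []
append(chain, {"event": "authored", "author": "example-lab",
               "algorithm": algorithm_id})
append(chain, {"event": "trained", "algorithm": algorithm_id,
               "training_data": fingerprint(b"labeled images v1")})
append(chain, {"event": "deployed", "algorithm": algorithm_id,
               "deployer": "example-carmaker"})
print(is_valid(chain))
```

If anyone later rewrites the "trained" record to point at a different data set, `is_valid` returns `False`, because the stored hash no longer matches the record's contents.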
(Well-annotated training data is actually fairly hard to come by, so we can expect common training sets to be cataloged and either sold, licensed, or open-sourced for broader use. It will likely be similar to how academic science makes test samples available for use by multiple institutions.)
Companies like Facebook and Google hoard their in-house training data as a competitive advantage, but with a blockchain system in place to ensure that no one can use their data sets without proper licensure, they could be more likely to share both the training data and the algorithms that data could produce.
The no-middleman, high-transparency nature of a blockchain ledger would encourage these A.I. developers to share their data and their work products without fear of an “A.I. arbitrator” favoring their competitors or stealing their intellectual property.
A blockchain-based A.I. certification would also encourage sharing by ensuring that an A.I. agent came from the source it claims to come from, was trained the way it claims to have been trained, and that all parties involved get appropriate payment and credit for their work.
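One way to picture the certification check is as a comparison between what a vendor claims and what was registered on the ledger. The sketch below assumes a ledger entry that stores SHA-256 fingerprints; the entry layout, the `certify` function, and the sample byte strings are hypothetical. Matching fingerprints show only that the delivered files are the registered ones. They do not prove how the model was actually trained, and payment and credit are outside the scope of this example.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """SHA-256 digest used to identify a model or data set."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical entry as it might have been registered on a ledger.
LEDGER_ENTRY = {
    "publisher": "example-lab",
    "model": fingerprint(b"trained model weights"),
    "training_data": fingerprint(b"labeled images v1"),
}

def certify(entry: dict, publisher: str, model: bytes, training_data: bytes) -> list:
    """Return the names of claims that do not match the ledger entry.
    An empty list means every claim checks out."""
    problems = []
    if entry["publisher"] != publisher:
        problems.append("publisher")
    if entry["model"] != fingerprint(model):
        problems.append("model")
    if entry["training_data"] != fingerprint(training_data):
        problems.append("training_data")
    return problems

genuine = certify(LEDGER_ENTRY, "example-lab",
                  b"trained model weights", b"labeled images v1")
tampered = certify(LEDGER_ENTRY, "example-lab",
                   b"trained model weights", b"trolled conversations")
print(genuine, tampered)
```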
Which brings us back to Tesla’s fire truck problem. The Tesla autopilot software is an amalgamation of several different A.I. algorithms, each stitched together into a navigation solution that helps keep a car in its lane and helps that same car avoid running into any other vehicles, obstacles, or—most importantly—living creatures.
The list of items that the Tesla autopilot software has to quickly recognize, categorize, and then predict the behavior of is staggering. Most deep-learning algorithms are sort of A.I. idiot savants, extremely capable of sorting the kind of data to which they are exposed, but ignorant of any larger context.
In our earlier example of the facial recognition algorithm, the A.I. wouldn’t natively be able to tell a male face from a female face, or a happy face from a sad face, without richer training data. It also couldn’t produce a “rules engine” that could be easily tweaked to distinguish fire trucks from cargo trucks, for example, because there is no abstracted image-recognition learning here.
A fire-truck recognizer needs to be trained on fire trucks. A pedestrian recognizer needs to be trained on pedestrian data. A pothole recognizer needs to be trained on pothole images. There is no generic “see and recognize everything” algorithm available, because A.I. isn’t human-like in its intelligence.
The range of training data needed to prepare the autopilot for every reasonably predictable situation it might encounter on the highway is all but impossible for any one source to produce.
With a blockchain system licensing both wide swaths of standardized training data and fully polished single-use algorithms, Tesla could integrate several independently developed A.I. “skills” into a widely competent virtual driver, and then test the end-product with a vast array of publicly developed training scenarios to ensure that the integrated product was up to par.
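The integrate-then-test workflow described here can be sketched as a set of independent "skills" plugged into one driver and graded against a shared scenario list. This is a deliberately simplified illustration and not a description of Tesla's software: the skill functions, the scene format, and the scenarios are all hypothetical, and each "recognizer" is a trivial lookup standing in for a separately trained model.

```python
from typing import Callable, Dict, List

# A "skill" is any function that inspects a scene and reports its hazard.
Skill = Callable[[dict], bool]

def sees_fire_truck(scene: dict) -> bool:
    return "fire_truck" in scene["objects"]

def sees_pedestrian(scene: dict) -> bool:
    return "pedestrian" in scene["objects"]

def sees_pothole(scene: dict) -> bool:
    return "pothole" in scene["objects"]

def build_driver(skills: Dict[str, Skill]) -> Callable[[dict], List[str]]:
    """Combine independently developed skills into one hazard detector."""
    def hazards(scene: dict) -> List[str]:
        return [name for name, detect in skills.items() if detect(scene)]
    return hazards

# Hypothetical shared test scenarios, as a public catalog might provide.
SCENARIOS = [
    {"name": "parked fire truck", "scene": {"objects": ["fire_truck"]},
     "hazard_expected": True},
    {"name": "pedestrian crossing", "scene": {"objects": ["pedestrian"]},
     "hazard_expected": True},
    {"name": "empty highway", "scene": {"objects": []},
     "hazard_expected": False},
]

def failed_scenarios(driver, scenarios) -> List[str]:
    """Names of scenarios where the driver's verdict differs from the expected one."""
    return [s["name"] for s in scenarios
            if bool(driver(s["scene"])) != s["hazard_expected"]]

full_driver = build_driver({"fire_truck": sees_fire_truck,
                            "pedestrian": sees_pedestrian,
                            "pothole": sees_pothole})
partial_driver = build_driver({"pedestrian": sees_pedestrian,
                               "pothole": sees_pothole})
print(failed_scenarios(full_driver, SCENARIOS))
print(failed_scenarios(partial_driver, SCENARIOS))
```

The driver assembled without a fire-truck skill fails exactly the scenario that skill would have covered, which is the kind of gap a broad, shared test suite is meant to expose before a product ships.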
Blockchain would both encourage and expedite A.I. innovation, because blockchain is uniquely capable of ensuring no one can illicitly copy or train an A.I. system. Moreover, the decentralized nature of blockchain also makes it a neutral party between A.I. competitors like Google, Facebook, Tesla, Uber, and dozens of other high-profile technology companies.
A.I. isn’t living up to the hype because training data is rare and expensive, and creating broadly competent A.I. agents requires combining several different A.I. skills into a single solution. Blockchain has the ability to solve both problems, as well as save A.I. development from itself.