Combatting AI bias: remembering the human at the heart of the data is key
When AI algorithms are biased, the results can be unethical, which in turn can lead to PR disasters for businesses. In this article, you will learn about three different types of AI bias – algorithmic, technical, and emergent – and the measures that can be used to limit them.
Artificial Intelligence (AI) – once considered the stuff of science fiction – has now permeated almost every aspect of our society. From making decisions regarding arrests and parole, to assessing health risks and suitability for jobs, we are seeing algorithms take on the challenge of quantifying, understanding, and making recommendations in place of a human arbiter.
For businesses this has presented a wealth of opportunities to streamline and hone processes, along with providing critical services such as facial recognition and healthcare screening to governments and nations across the world. With this growing demand comes an increased need for developers who can create and build algorithms with the level of complexity and sophistication needed to make decisions on a global scale. However, according to a 2019 survey by Forrester, only 29% of developers have worked on AI projects, despite 83% expressing a desire to learn and take them on.
As with any developer project, working on AI brings its own unique set of challenges that businesses and developers must be aware of. In the case of AI, the chief issue is bias. A biased algorithm can be the difference between a reliable, trustworthy, and useful product, and an Orwellian nightmare resulting in prejudiced, unethical decisions and a PR catastrophe. It is therefore crucial that businesses and developers understand how to mitigate these effects from the beginning, and that an awareness of bias is built into the heart of the project.
The three kinds of bias: algorithmic, technical, and emergent
Every developer (and every person, for that matter) has conscious and unconscious biases that inform the way they approach data collection and the world in general. These can range from the mundane, such as a preference for the colour red over blue, to the more sinister, such as assumptions about gender roles, racial profiling, and historical discrimination. The prevalence of bias throughout society means that the training data from which algorithms learn reflects these assumptions, resulting in decisions which are skewed for or against certain sections of society. This is known as algorithmic bias.
Whilst the vast majority of developers would never intend to cause any kind of unfairness or suffering to end users – in fact, most of these products are designed to help people – the results of unintentional AI bias can often be devastating. Take the case of Amazon’s recruitment algorithm, which scored women lower based on historical data in which the majority of successful candidates (and indeed the only applicants) were men. Or the infamous US COMPAS system, which – based on data shaped by racial profiling – ranked African-American prisoners at a far higher risk of re-offending than their white counterparts, regardless of their crimes or previous track record.
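To see how a model can absorb this kind of skew without anyone intending it, consider a minimal sketch in Python. All names and figures here are hypothetical: a naive "model" simply learns each group's historical hire rate, and because the historical pool was overwhelmingly male, the learned scores reproduce the imbalance.

```python
from collections import Counter

# Hypothetical historical hiring records: (gender, hired) pairs.
# The applicant pool itself is skewed -- almost all past applicants were men.
history = [("m", True)] * 80 + [("m", False)] * 15 + [("f", False)] * 5

# A naive "model" that simply learns each group's historical hire rate.
applicants = Counter(gender for gender, _ in history)
hires = Counter(gender for gender, hired in history if hired)
hire_rate = {g: hires[g] / applicants[g] for g in applicants}

# Men score highly, women score zero: the bias is learned, not invented.
print(hire_rate)
```

The point is that the algorithm does nothing "wrong" mathematically – it faithfully reflects a biased history, which is exactly the problem.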
Technical bias is a second possibility when developing an algorithm. It occurs when the training data does not reflect all the scenarios the algorithm may encounter – a particular danger when it is used for life-saving or critical functions. In 2016, Tesla’s first known Autopilot fatality occurred as a result of the AI being unable to identify the white side of a tractor-trailer against a brightly lit sky, resulting in the autopilot not applying the brakes. This kind of accident highlights the need to provide the algorithm with constant, up-to-date training and data reflecting myriad scenarios, along with the importance of testing in-the-wild, in all kinds of conditions.
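One practical way to catch this kind of gap is a simple coverage audit of the test set. The sketch below is illustrative only – the scenario labels and threshold are hypothetical – but it shows the idea: count how often each condition appears and flag anything dangerously underrepresented before the system ever reaches the road.

```python
from collections import Counter

# Hypothetical scenario labels attached to a driving-test data set.
scenes = (["sunny"] * 900 + ["rain"] * 80
          + ["bright_sky_white_vehicle"] * 2 + ["night"] * 18)

counts = Counter(scenes)
total = len(scenes)

# Flag any scenario making up less than 5% of the test data.
underrepresented = [scene for scene, n in counts.items() if n / total < 0.05]
print(underrepresented)
```

A report like this does not fix the bias by itself, but it tells the team exactly where to gather more real-world data.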
Finally, we have emergent bias. This occurs when the algorithm encounters new knowledge, or when there is a mismatch between the user and the system’s design. An excellent example of this is Amazon’s Echo smart speaker, which has mistaken countless different words for its wake word, “Alexa”, resulting in the device responding and recording information it was never asked to. Here, it’s easy to see how incorporating a broader range of dialects, tones, and potential missteps into the training process might have helped to mitigate the issue.
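Emergent bias of this sort only surfaces once you measure behaviour per user group. As a rough sketch – the dialect names and numbers below are entirely hypothetical – a team could log wake-word trials by speaker dialect and compare false-accept rates, which quickly shows whether some groups are served far worse than others.

```python
from collections import defaultdict

# Hypothetical logs: each entry records a speaker's dialect and whether the
# device woke up without the wake word being said (a false accept).
logs = ([("us_general", False)] * 95 + [("us_general", True)] * 5
        + [("scottish", False)] * 60 + [("scottish", True)] * 40)

trials = defaultdict(int)
false_accepts = defaultdict(int)
for dialect, false_accept in logs:
    trials[dialect] += 1
    false_accepts[dialect] += false_accept  # True counts as 1

rates = {d: false_accepts[d] / trials[d] for d in trials}
print(rates)  # one group suffers eight times the false-accept rate
```

A gap like this is the signal to widen the training data before launch, not after the complaints arrive.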
Bringing testing back to humans
Whilst companies are increasingly researching methods to spot and mitigate biases, many fail to realise the importance of human-centric testing. At the heart of each of the data points feeding an algorithm lies a real person, and it is essential to have a sophisticated, rigorous form of software testing in place that harnesses the power of crowds – something that simply cannot be achieved in a static QA lab.
All of the biases outlined above can be limited by working with a truly diverse data set, one which reflects the mix of languages, races, genders, locations, cultures, and hobbies that we see in day-to-day life. In-the-wild testers can also help to reduce the likelihood of accidents by spotting errors which AI might miss, or simply by asking questions which the algorithm has not been programmed to comprehend. Considering the vast wealth of human knowledge and insight available at our fingertips via the web, failing to make use of this opportunity is a significant oversight. Spotting these kinds of obstacles early can also be incredibly beneficial from a business standpoint, allowing the development team to create a product which truly meets the needs of the end user and the purpose it was created for.
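In practice, crowd-sourced testing feeds back into the kind of per-group analysis a QA team can automate. The sketch below is a hypothetical example – the group labels and results are invented – of slicing in-the-wild test outcomes by tester group so that an accuracy gap becomes visible before release rather than after.

```python
from collections import defaultdict

# Hypothetical in-the-wild test results: (tester_group, prediction_correct).
results = ([("group_a", True)] * 48 + [("group_a", False)] * 2
           + [("group_b", True)] * 30 + [("group_b", False)] * 20)

totals = defaultdict(int)
correct = defaultdict(int)
for group, ok in results:
    totals[group] += 1
    correct[group] += ok  # True counts as 1

accuracy = {g: correct[g] / totals[g] for g in totals}
# A large gap between groups is a red flag worth investigating before launch.
print(accuracy)
```

The numbers themselves matter less than the habit: break every headline metric down by the people behind the data points.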
As AI continues on its path to omnipresence in our lives, it’s crucial that those tasked with building our future make it as fair and inclusive as possible. This is no easy task, but with a considered approach – one which stops to remember the human at the heart of the data – we are one step closer to achieving it.