Robots and bots are becoming notorious for misbehaving. Social media is full of stories about mishaps caused by poorly designed or inadequately tested Artificial Intelligence (AI) programs.
Sometimes machine learning, a subset of AI, goes off the rails. Voice assistants such as Alexa, Siri and Google Assistant, used for tasks like online shopping, selecting music and searching for information, and robots that perform physical tasks, such as Boston Dynamics' Atlas, have made some noteworthy mistakes.
The stories of AI gone awry are not urban myths. There's the self-driving car that killed a pedestrian after its automated emergency braking had been disabled. Another t-boned a tractor-trailer that ran through a stop sign – it literally didn't see that coming.
An online merchant's voice recognition bot mistakenly allowed a six-year-old to order herself an elaborate dollhouse without her parents' knowledge. In another incident, a robot demonstrating how it detects and jumps over obstacles of different types and sizes tripped on the edge of a curtain and fell as it left the stage.
If you search the internet for AI failures, a good starting point is 2018 in Review: 10 AI Failures. Poor AI programming can produce funny outcomes, but it can also be much worse, so these mistakes must be avoided.
Eight ways to avoid mistakes
- Rigorous measurement and quality control. Use proven methods to measure accuracy: well-known metrics, separate training and test sets, cross-validation, stratification and more. Make sure your measurements are trustworthy, and when the alarm bells ring, dig in.
- Steady, phased roll-outs. Release new AI technology to only a subset of your users at a time, then monitor the real-world results. Don't do a massive release to everyone at once. With phased roll-outs, you can limit the damage if anything goes wrong.
- Annotate real-world data as it comes in. Don't use strictly "research" data. The real world is much messier than the clean, self-contained datasets used in the academic community. By always including real-world data in your dataset, you can avoid many of the "Well, it worked in the lab…" problems seen in research.
- Check models for unintended bias. Ask yourself whether your model is susceptible to learning unwanted human biases. Might your model's results change depending on gender, race, sexual orientation, age or background? Then, more importantly, actually check. Use statistical techniques, such as a chi-squared test of independence or a stratified accuracy analysis, to verify that your model does not exhibit these unintended biases.
- Set realistic expectations with customers. AI isn't perfect. When you sell AI, what your clients buy should be treated more like a specialized employee than a piece of software guaranteed to execute the same way every time. Conventional software is written to be completely accurate; AI is built to be mostly correct, with a given probability. Set expectations accordingly so clients don't confuse AI with regular software.
- Keep humans in the loop. Every AI system is based on data, and some of that data must be provided by humans. For example, voice recognition systems are trained on audio clips paired with written transcripts of what was said. To avoid an AI disaster, have humans review the real-world data entering your system and compare it with the results produced by the AI algorithm. Spot-checking 0.5% to 1% of the examples in your system is usually enough to prevent major disasters.
- Test in an "adversarial" environment, instead of testing with "friendlies". Invite people who were not involved in the development process, such as future users, to try to "break the program". Engineers often subconsciously test software in ways they know will work, following the standard "happy path" and feeling good as a result. To avoid AI disasters, bring in people to deliberately try to break it: people who will try unusual sentences, upload strange or unexpected images, or use your AI in weird and different scenarios.
- Assume nothing. Test everything. Developing AI isn't easy. Just because you saw one great internal demo does not mean your technology is ready for a public audience. Test the bejeebers out of it. Test it like it's a Boeing 737 MAX airplane, and when your engineers are ready to quit because you've driven them nuts, start a new round of testing right then.
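To make the first recommendation concrete, here is a minimal sketch of stratified cross-validation with scikit-learn. It assumes a scikit-learn environment; the dataset and model are stand-ins for your own, chosen only so the example runs end to end.

```python
# Sketch: measuring accuracy with stratified cross-validation, so the
# estimate doesn't depend on one lucky train/test split.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # placeholder dataset
model = LogisticRegression(max_iter=5000)   # placeholder model

# Stratification keeps class proportions consistent across all 5 folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print(f"fold accuracies: {scores.round(3)}")
print(f"mean accuracy:   {scores.mean():.3f} (std {scores.std():.3f})")
```

A high mean with a low standard deviation across folds is the "trustworthy measurement" the bullet calls for; a fold that scores far below the others is an alarm bell worth digging into.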
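A phased roll-out is often implemented by deterministically bucketing users. This sketch uses only the standard library; the `in_rollout` helper and the 5% starting cohort are illustrative choices, not a prescribed design.

```python
# Sketch: hash-based cohort assignment for a phased roll-out.
# The same user always lands in the same bucket, so the cohort is
# stable while you monitor real-world results before widening it.
import hashlib

def in_rollout(user_id: str, percent: float) -> bool:
    """Return True if this user falls inside the roll-out percentage."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000 * 100  # value in [0, 100)
    return bucket < percent

# Start with 5% of users; widen only after monitoring goes well.
users = [f"user_{i}" for i in range(100_000)]
cohort = [u for u in users if in_rollout(u, 5.0)]
print(f"{len(cohort)} of {len(users)} users see the new model")
```

Because the bucket comes from a hash rather than a coin flip, raising `percent` from 5 to 20 keeps the original 5% in the cohort and only adds new users, which keeps monitoring data comparable across phases.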
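The bias check can be sketched with a chi-squared test of independence, as the bullet suggests. This assumes SciPy is available; the contingency-table counts below are made-up numbers purely for illustration.

```python
# Sketch: testing whether model decisions are associated with a
# protected attribute, using a chi-squared test of independence.
from scipy.stats import chi2_contingency

# Rows: demographic group A, group B. Columns: approved, denied.
# These counts are hypothetical.
table = [[420, 80],
         [310, 190]]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}, dof = {dof}")
if p_value < 0.05:
    print("Outcome rates differ significantly by group - investigate.")
```

A significant result does not prove the model is unfair on its own, but it is exactly the kind of alarm bell that should trigger the deeper stratified accuracy analysis the list recommends.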
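Finally, the human-in-the-loop spot check from the list can be as simple as randomly sampling roughly 1% of incoming examples for review. This standard-library sketch is one possible implementation; the function name and sampling rate are illustrative.

```python
# Sketch: flagging ~1% of incoming real-world examples for human review,
# in line with the 0.5%-1% spot-checking guideline above.
import random

def select_for_review(examples, rate=0.01, seed=None):
    """Randomly flag about `rate` of the examples for a human spot check."""
    rng = random.Random(seed)
    return [ex for ex in examples if rng.random() < rate]

incoming = [f"example_{i}" for i in range(10_000)]
to_review = select_for_review(incoming, rate=0.01, seed=7)
print(f"{len(to_review)} of {len(incoming)} examples flagged for review")
```

The flagged examples would then be compared by a person against the AI's outputs; a rising disagreement rate is an early warning long before a public failure.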
AI is quickly being incorporated into daily operations across many industries. Essential services, such as healthcare and transportation, are seeing a dramatic increase in AI and machine learning development. AI streamlines processes, reduces costs and produces better outcomes for service users and providers.
At PureFacts, we've seen many of the challenges that arise with AI technology, and we've developed strategies to mitigate the risks involved. Our hands-on implementation experience will help ensure that your next AI project is a success instead of just another click-bait AI disaster story. Talk to us ... before disaster strikes!