People interact with AI more with every passing year. From chatbots that can answer whatever question you ask, to self-driving cars, to image classifiers that tell you whether your medical scans show evidence of cancer, few areas of life are likely to remain unchanged by the advance of AI. In many ways this makes it just like countless technologies that have come before it, yet AI remains unique in its ability to lie.
We don’t always know why AI gets things wrong. Its tendency to hallucinate is well documented but has no comprehensive solution. We can mitigate it through assurance processes and by treating results with appropriate suspicion, but even that becomes difficult when AI can’t be trusted to report accurate confidence in its own predictions.
When an AI system produces any sort of classification or label, it usually does so with an associated confidence score. If an object detection model is shown an image of a stop sign, it may say that it is 90% confident that it can see a stop sign. If a sentiment analysis model is shown a product review, it may identify it as having positive sentiment with 60% confidence. Because model assessments usually cite accuracy (whether or not predictions are correct) rather than the truthfulness of those confidence scores, a model can appear to perform well while still being woefully miscalibrated.
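To make that concrete, here is a minimal sketch (in Python with NumPy, using made-up logits for an imagined sign classifier) of where such a confidence score typically comes from: the model's raw outputs are pushed through a softmax, and the largest resulting probability is reported as the confidence.

```python
import numpy as np

def softmax(logits):
    exps = np.exp(logits - np.max(logits))  # subtract the max for numerical stability
    return exps / exps.sum()

# Hypothetical raw outputs for the classes ["stop sign", "yield sign", "speed limit"]
logits = np.array([3.1, 0.4, -1.2])
probs = softmax(logits)

predicted_class = int(np.argmax(probs))
confidence = float(probs[predicted_class])
print(f"predicted class {predicted_class} with confidence {confidence:.0%}")

# Accuracy only checks whether predicted_class matches the true label;
# it says nothing about whether `confidence` itself can be trusted.
```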
Within the context of AI systems which produce classifications with confidence scores, calibration is defined as the extent to which those confidence scores are representative of accuracy. For example, if predictions have been made with 100% confidence you would expect them all to be correct, while if predictions have been made with 50% confidence, you would expect half of them to be correct. A model which meets this criterion is a well calibrated model, while a model which produces confidence scores that are not representative of accuracy is a badly calibrated model.
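This definition suggests a direct way to measure calibration. The sketch below, assuming we already have arrays of confidence scores and of whether each prediction was correct, follows the common expected calibration error (ECE) recipe: bin predictions by confidence, then compare each bin's average confidence against its actual accuracy.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            avg_confidence = confidences[in_bin].mean()  # what the model claims
            accuracy = correct[in_bin].mean()            # what actually happened
            ece += in_bin.mean() * abs(avg_confidence - accuracy)
    return ece

# Toy example: at 70% confidence only 3 of 5 predictions were right,
# so the gap shows up in the score (roughly 0.1 here).
print(expected_calibration_error([0.7, 0.7, 0.7, 0.7, 0.7, 0.9, 0.9],
                                 [1,   1,   1,   0,   0,   1,   1]))
```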
There is a simple way to address the common issue of badly calibrated AI: measuring and improving calibration as a quantity in its own right rather than as a by-product of loss function optimisation. Methods for improvement range from restructuring the neural architecture of a network to slapping an extra layer on the end that squishes confidence scores closer to a mean value. While it can be challenging to work out which method suits a given use case, the sheer range of options means there are few situations in which none will be applicable. With these methods, AI may still produce incorrect results, but we gain an accurate picture of how much trust to place in a given output.
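As a flavour of the "extra layer on the end" family of fixes, here is a minimal sketch of temperature scaling, one widely used post-hoc method (chosen purely as an illustration; the text above does not name a specific technique). The held-out logits, labels, and helper names are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    exps = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return exps / exps.sum(axis=axis, keepdims=True)

def nll(logits, labels, temperature):
    # Negative log-likelihood of the true labels under temperature-scaled probabilities.
    probs = softmax(logits / temperature)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

def fit_temperature(val_logits, val_labels, candidates=np.linspace(0.5, 5.0, 91)):
    # Simple grid search over T on held-out data; real implementations usually
    # optimise T by gradient descent, but the idea is the same.
    return min(candidates, key=lambda t: nll(val_logits, val_labels, t))

# Hypothetical validation logits and labels from an already-trained classifier.
val_logits = np.array([[4.0, 0.5, -1.0], [3.5, 3.0, -2.0], [0.2, 2.5, 0.1]])
val_labels = np.array([0, 1, 1])
T = fit_temperature(val_logits, val_labels)
calibrated = softmax(val_logits / T)
# Dividing by T does not change which class wins, so accuracy is untouched;
# only the confidence scores are softened (or sharpened) to better match reality.
```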
AI might still lie, but at least it will tell us when it does.