Learning AI #17

The danger of edge cases in AI

AI is great, until it's not. Every system makes mistakes. The problem is that AI often has no idea when it is operating in territory it has never seen before. It just keeps going, confident as ever, straight into the ditch.

That is what an edge case is. It is a situation that falls outside the data the system was trained on.

AI models were built to handle the normal stuff, the common stuff, the stuff that shows up over and over again in the training data. When something unusual pops up, a well-designed system should recognize the gap and ask for help. A lot of systems don't, and that's the danger. They pattern-match to whatever looks closest and give you an answer anyway.
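To make that distinction concrete, here is a minimal sketch of "recognize the gap and ask for help" versus "answer anyway." It is not drawn from any real product; the 0.85 threshold and the "needs human review" handoff are illustrative assumptions.

```python
# Sketch of a classifier that defers to a human when its best guess is weak.
# The threshold and the review handoff are hypothetical, not from a real system.
import numpy as np

def classify_with_fallback(probabilities, labels, threshold=0.85):
    """Return a label only when the model is reasonably confident."""
    probabilities = np.asarray(probabilities)
    best = int(probabilities.argmax())
    confidence = float(probabilities[best])

    if confidence < threshold:
        # The honest path: admit the input looks unfamiliar and hand it off.
        return ("needs human review", confidence)

    # The dangerous path is a system with no check at all: it always
    # returns its closest match, however weak that match is.
    return (labels[best], confidence)

# Three mediocre scores spread across classes -> defer to a human.
print(classify_with_fallback([0.40, 0.35, 0.25], ["car", "truck", "open sky"]))
```

The point of the threshold is not the particular number; it is that the system has some path other than returning its closest match.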

Graphic by author

This is where AI stops being annoying and starts being dangerous.

In 2016, Joshua Brown became the first person killed in a Tesla Autopilot crash. The car was on a Florida highway when a white tractor-trailer crossed in front of it. Autopilot failed to distinguish the trailer from the bright sky behind it. It had been trained on thousands of highway scenarios. A white truck against a white sky at that angle was not one of them. The car never braked. The system was not confused. It was certain it was looking at open sky.

In 2018, an Uber self-driving car struck and killed Elaine Herzberg as she crossed a street in Tempe, Arizona. The system detected her six seconds before impact. It classified her as an unknown object, then a vehicle, then a bicycle, changing its assessment repeatedly as she approached. Because engineers had programmed the system to ignore unreliable detections and avoid false alarms, the braking alert was suppressed.

By the time the system decided something was wrong, it was too late. The edge case was simple: a pedestrian crossing outside a crosswalk at night, pushing a bike. The system had no clean category for it and kept guessing until it ran out of time.
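A purely hypothetical sketch (not Uber's actual logic, and none of these numbers come from the investigation) shows how a rule meant to filter out false alarms can also smother a real one: if the system demands a stable classification before it acts, every flip of the label resets the clock.

```python
# Illustrative only: brake only after the same label is seen several
# frames in a row. Every reclassification resets the counter, so a
# flip-flopping track never accumulates enough evidence to trigger action.

def should_brake(frame_labels, required_stable_frames=3):
    """Return True once one label has held for enough consecutive frames."""
    stable_count = 0
    previous = None
    for label in frame_labels:
        stable_count = stable_count + 1 if label == previous else 1
        previous = label
        if stable_count >= required_stable_frames:
            return True   # finally confident enough to act
    return False          # ran out of frames before acting

# The label keeps changing, so the system never acts, even though
# something was clearly there the whole time.
print(should_brake(["unknown", "vehicle", "unknown", "bicycle", "vehicle"]))  # False
print(should_brake(["pedestrian"] * 4))                                       # True
```

The filter does exactly what it was designed to do. The design just never accounted for an object the system could not name.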

The third example is less dramatic but more troubling. IBM spent years building Watson for Oncology, an AI system designed to recommend cancer treatments. Hospitals across India, South Korea, and elsewhere paid serious money for it.

When doctors started comparing Watson's recommendations to their own clinical judgment, they found that Watson was frequently suggesting treatments that were unsafe or just wrong.

The system had been trained on a small set of hypothetical cases constructed by oncologists at Memorial Sloan Kettering Cancer Center in New York, not on the full complexity and breadth of real patients across the globe.

When it encountered patients with unusual presentations or treatment histories that didn't fit its narrow training, it didn't flag uncertainty. It gave confident recommendations anyway. Doctors described some of them as dangerous. This is the essence of an edge case.

Without a human in the loop (the doctor), Watson's treatment plans could not be trusted. Perhaps Watson was better at winning Jeopardy! than at diagnosing and treating cancer.

The common thread across all three examples is not a software bug or a rogue algorithm. It is a system built for the common case and then deployed without a clear mechanism for recognizing when it was no longer in the common case. The AI had no way to say "I haven't seen this before." It just kept doing what it was built to do.

This is why the human-in-the-loop conversation matters so much. Not because AI is bad, but because it is selectively blind and doesn't know which eye is covered.

Things I think about

The average lifespan of a webpage is about 100 days before it's changed or removed.

**********