What AI Gets Wrong — And Why That's Built Into How It Works

AI makes mistakes in ways that are different from software bugs — they're structural, predictable, and worth understanding before you rely on these systems for anything that matters.

When a traditional software program produces the wrong answer, something has gone wrong that can, in principle, be fixed. A bug is a bug: find it, patch it, move on. AI doesn't work this way, and understanding why is one of the most practically useful things you can take away from this section.

The failures that matter most in AI systems aren't accidents. They're properties of how the systems are built.

Take hallucination, the term used when an AI generates confident, fluent information that is simply wrong. A language model isn't retrieving facts from a database; it's producing text that fits the statistical patterns it learned during training. When it doesn't know something, it has no built-in mechanism that makes it say so. It produces whatever is most plausible given the surrounding context, whether or not it's true. As Kristian Hammond, director of Northwestern's Center for Advancing Safety of Machine Intelligence, has put it: "They are designed to write. They are just not designed to tell the truth." Hallucination isn't a bug to be patched; it's a consequence of what these systems are optimized to do.
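
To make the mechanism concrete, here is a deliberately tiny sketch in Python. It is not how a real language model is built; it assumes a made-up three-sentence corpus and a simple word-pair counter (both hypothetical, chosen only for illustration). What it shares with the real thing is the key property: the generator picks the most common continuation, and no step anywhere checks whether the result is true.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus (an assumption for illustration only).
# The false statement appears more often than the true one.
CORPUS = [
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is canberra",
]


def train_bigrams(sentences):
    """Count which word follows each word in the training text."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def complete(counts, prompt, max_words=5):
    """Greedily append the most frequent next word. Nothing here checks facts."""
    words = prompt.split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # no known continuation, so stop
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


model = train_bigrams(CORPUS)
answer = complete(model, "the capital of australia is")
print(answer)
```

Because "sydney" follows "is" more often than "canberra" in this toy corpus, the sketch completes the sentence with the wrong city, and does so just as fluently as it would have produced the right one.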

Brittleness is a related problem. AI systems trained on one distribution of data can fail badly when the real world presents something slightly outside that distribution. A model that performs beautifully on thousands of test cases can stumble on a case that looks, to a person, nearly identical to ones it handled correctly. The failure isn't random; it reflects the edges of what the training data covered.
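
A small numerical sketch can show the same effect. It assumes a stand-in "model" (a straight line fitted by least squares) and a stand-in "world" (the curve y = x squared); neither comes from any real AI system. The line is fitted only on inputs between 0 and 1, where it tracks the curve closely, and is then asked about an input outside that range.

```python
def true_value(x):
    """The hypothetical 'real world' the model is trying to capture."""
    return x * x


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Training data covers only the range 0.0 to 1.0.
train_xs = [i / 10 for i in range(11)]
train_ys = [true_value(x) for x in train_xs]
slope, intercept = fit_line(train_xs, train_ys)


def predict(x):
    return slope * x + intercept


# Worst error on inputs like the ones seen in training.
in_range_error = max(abs(predict(x) - true_value(x)) for x in train_xs)

# Error on an input outside the range the training data covered.
out_of_range_error = abs(predict(3.0) - true_value(3.0))

print(f"worst error inside training range: {in_range_error:.2f}")
print(f"error at x = 3.0:                  {out_of_range_error:.2f}")
```

Inside the training range the fitted line is never far off; at x = 3.0 the error is many times larger. Nothing in the model changed between the two cases. Only the input moved past the edge of what the training data covered.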

Then there's bias. Because these systems learn from human-generated data, they absorb the patterns in that data, including the inequitable ones. A hiring model trained on historical decisions will tend to reproduce historical preferences. A facial recognition system trained mostly on certain demographics will perform worse on others. These aren't oversights that better intentions would fix; they're reflections of what the training data contained.
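
The hiring example can be sketched in a few lines. Everything here is hypothetical: the records are invented, the groups are labeled "A" and "B" purely as placeholders, and the "model" is nothing more than a table of past hire rates. Real systems are far more complex, but the pattern is the same: a model fitted to past decisions learns whatever regularities those decisions contain.

```python
from collections import defaultdict

# Invented historical records: (group, qualified, hired).
# Every candidate below is qualified; past decisions favored group A.
HISTORY = (
    [("A", True, True)] * 8
    + [("A", True, False)] * 2
    + [("B", True, True)] * 3
    + [("B", True, False)] * 7
)


def fit_hire_rates(records):
    """'Train' by recording the past hire rate for each (group, qualified) pair."""
    totals = defaultdict(int)
    hires = defaultdict(int)
    for group, qualified, hired in records:
        key = (group, qualified)
        totals[key] += 1
        hires[key] += int(hired)
    return {key: hires[key] / totals[key] for key in totals}


def recommend(rates, group, qualified, threshold=0.5):
    """Recommend hiring when the learned rate meets the threshold."""
    return rates.get((group, qualified), 0.0) >= threshold


rates = fit_hire_rates(HISTORY)
decision_a = recommend(rates, "A", qualified=True)
decision_b = recommend(rates, "B", qualified=True)
print("qualified candidate, group A:", decision_a)
print("qualified candidate, group B:", decision_b)
```

Two equally qualified candidates receive different recommendations, and the only difference between them is the group label. The code contains no instruction to discriminate; the disparity comes entirely from the data it was fitted to.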

None of this means AI is unreliable in any absolute sense. It means that AI is reliable in specific, bounded ways and unreliable in others, and that the boundary between the two isn't always obvious from the outside. The systems sound equally confident whether they're right or wrong; they don't signal uncertainty the way a careful person does.

The practical implication is straightforward: understanding how AI fails is as important as understanding what it can do. The articles ahead get into the mechanics of how these systems actually process information, which is where the failure modes start to make intuitive sense.