Base Rate Neglect
Base rate neglect is the cognitive error of ignoring prior probabilities — the statistical frequency of a condition in a population — when evaluating the likelihood of a specific case, in favor of individuating information that seems more diagnostic but is statistically less decisive. The bias was first systematically documented by psychologists Amos Tversky and Daniel Kahneman in their research on heuristics and biases, and it remains one of the most robust and replicated findings in the study of human judgment under uncertainty.
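The canonical demonstration is the "medical diagnosis problem." A test for a disease is 95% accurate, meaning it detects 95% of true cases and returns a false positive for 5% of healthy people, and the disease occurs in 1% of the population. A patient tests positive. What is the probability they actually have the disease? Most people answer approximately 95%. The correct answer, applying Bayes' theorem, is roughly 16%: of every 10,000 people tested, about 95 of the 100 who have the disease test positive, but so do about 495 of the 9,900 who do not, so only 95 of the 590 positive results are true positives. The base rate of the disease (1%) is overwhelmed in the judgment by the apparent diagnostic power of the test (95%), even though the base rate is the decisive statistical fact. When a condition is rare relative to the test's false-positive rate, false positives outnumber true positives even when the test is highly accurate.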
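The arithmetic behind the 16% figure can be sketched in a few lines of Python. This is a minimal illustration, not a clinical tool: the function name `posterior_given_positive` is hypothetical, and it assumes that "95% accurate" means both a 95% detection rate (sensitivity) and a 5% false-positive rate.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem."""
    # Share of the whole population that is sick AND tests positive.
    true_positives = prevalence * sensitivity
    # Share of the whole population that is healthy AND tests positive.
    false_positives = (1.0 - prevalence) * false_positive_rate
    # Among all positive results, the fraction that are true positives.
    return true_positives / (true_positives + false_positives)


# Assumed reading of the example: 1% prevalence, 95% sensitivity,
# 5% false-positive rate.
p = posterior_given_positive(prevalence=0.01, sensitivity=0.95,
                             false_positive_rate=0.05)
print(f"{p:.3f}")  # 0.161
```

Raising `prevalence` to 0.20 with the same test pushes the result above 80%, which shows how strongly the answer depends on the base rate rather than on the test alone.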
Mechanisms and Explanations
Base rate neglect is not merely a failure of arithmetic. It is a systematic mismatch between the structure of human cognition and the structure of probabilistic reasoning. Several mechanisms have been proposed:
Representativeness. Tversky and Kahneman argued that people judge probability by the degree to which a description resembles a prototype, rather than by the statistical base rate. A patient who "eats health food, exercises, and is slim" seems unlikely to have a heart attack regardless of how common heart attacks are in the general population. The individuating information feels causally potent; the base rate feels abstract and irrelevant.
Causal discounting. People tend to treat base rates as statistical background and individuating information as causally specific. A description that "explains" the outcome overrides the prior probability. This is related to the broader phenomenon of the conjunction fallacy, in which a more specific story is judged more probable than a more general one because it feels more coherent.
Framing and narrative. When base rates are presented as frequencies ("1 out of 100 people") rather than as probabilities ("1%"), people are more likely to use them correctly. One proposed explanation is that frequency formats engage the mental machinery of counting and proportion, which humans may have evolved to handle, whereas probability formats require abstract reasoning. This finding has been interpreted, notably by Gerd Gigerenzer and colleagues, as evidence that the mind is not a general-purpose Bayesian calculator but a collection of domain-specific inference engines tuned to the statistical structure of ancestral environments.
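To make the contrast concrete, the medical diagnosis problem can be restated as whole-number counts. The sketch below is illustrative only: the function name `natural_frequencies` and the population of 10,000 are assumptions chosen so that every count comes out as a whole number, and "95% accurate" is again read as 95% sensitivity with a 5% false-positive rate.

```python
def natural_frequencies(population, prevalence, sensitivity, false_positive_rate):
    """Restate a diagnostic problem as head counts instead of probabilities."""
    sick = round(population * prevalence)
    healthy = population - sick
    # People who are sick and test positive.
    true_positives = round(sick * sensitivity)
    # People who are healthy but test positive anyway.
    false_positives = round(healthy * false_positive_rate)
    return true_positives, false_positives


tp, fp = natural_frequencies(10_000, 0.01, 0.95, 0.05)
print(f"{tp} of the {tp + fp} people who test positive have the disease")
# 95 of the 590 people who test positive have the disease
```

Stated this way, the answer needs no formula: the reader only compares 95 with 590.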
Base Rate Neglect as a Systemic Failure
While base rate neglect is studied as an individual cognitive bias, its consequences are systemic. In medical practice, physicians who ignore the base rate of a disease when interpreting test results can drive overdiagnosis and overtreatment. In legal reasoning, ignoring base rates when evaluating eyewitness testimony or forensic evidence can contribute to wrongful convictions. In financial markets, ignoring base rates of business failure when evaluating startup narratives can feed investment bubbles. The bias is not a private error; it is a distributed epistemic failure that propagates through institutions.
The connection to Goodhart's Law is instructive. When a diagnostic test (like a startup's pitch deck or a medical screening) becomes the target of attention, it ceases to be a good measure of the underlying condition. The base rate is the information that corrects for this — it is the statistical reality that the diagnostic target obscures. Ignoring the base rate is not a failure of intelligence; it is a failure to maintain the feedback loop between local signal and global distribution.
Correction and Containment
Base rate neglect is difficult to correct through education alone. Teaching people Bayes' theorem does not reliably eliminate the bias; the pull of individuating information is too strong. More effective interventions include:
- Frequency framing. Presenting probabilities as natural frequencies rather than percentages.
- Visual aids. Icon arrays and tree diagrams that make the base rate visually salient.
- Forced integration. Requiring decision-makers to explicitly estimate the base rate before evaluating the specific case.
- Algorithmic support. Clinical decision support systems and diagnostic algorithms that incorporate base rates automatically, removing the burden from human judgment.
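The forced-integration and algorithmic-support ideas can be combined in a small sketch: a helper that cannot return an estimate unless the caller supplies a base rate. This is a hypothetical illustration, not the interface of any real clinical decision support system. The name `posterior_from_base_rate` is an assumption, and the update uses the odds form of Bayes' theorem, in which the likelihood ratio is the probability of the evidence if the condition is present divided by its probability if the condition is absent.

```python
def posterior_from_base_rate(base_rate, likelihood_ratio):
    """Update a base rate with case-specific evidence (odds form of Bayes' theorem).

    base_rate is a required argument, so the caller must state a prior
    before any case-specific evidence is weighed.
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = base_rate / (1.0 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Assumed example: a positive result from a test with 95% sensitivity and a
# 5% false-positive rate has a likelihood ratio of 0.95 / 0.05, about 19.
estimate = posterior_from_base_rate(base_rate=0.01, likelihood_ratio=0.95 / 0.05)
print(f"{estimate:.3f}")  # 0.161
```

Making the base rate a mandatory input is the design point: the estimate cannot be produced from the individuating evidence alone.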
The last point raises a deeper question. If base rate neglect is a persistent feature of human cognition, then systems that require accurate probabilistic reasoning — medical diagnosis, criminal justice, strategic forecasting — must be designed not around the assumption that humans will reason correctly, but around the assumption that they will not. The bias is not a bug to be patched in the user; it is a design constraint on the system.
Base rate neglect is not a failure of reasoning. It is a failure of architecture. The mind was not designed to process abstract probabilities; it was designed to track frequencies in small social groups. When we ask it to operate in environments of statistical complexity, the error is not in the mind but in the mismatch. The solution is not to train humans to be better Bayesians; it is to build institutions that do not require them to be.