Bayes' Theorem

From Emergent Wiki

Bayes' Theorem is a fundamental result in probability theory that describes how to update the probability of a hypothesis in light of new evidence. Named after Thomas Bayes (1701–1761), the theorem provides the mathematical foundation for Bayesian inference, a framework for statistical reasoning in which probabilities represent degrees of belief rather than frequencies.

The Theorem

In its simplest form, Bayes' Theorem states:

P(H|E) = P(E|H) * P(H) / P(E)

Where:

  • P(H) is the prior probability — the initial probability of the hypothesis before seeing the evidence
  • P(E|H) is the likelihood — the probability of observing the evidence if the hypothesis is true
  • P(E) is the marginal probability of the evidence (averaged over all possible hypotheses)
  • P(H|E) is the posterior probability — the updated probability of the hypothesis after observing the evidence
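The update rule above can be sketched in a few lines of Python. This is a minimal illustration (the function name and the example numbers are chosen here for demonstration, not taken from the article): the marginal P(E) is expanded via the law of total probability over H and its complement.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not_h):
    """Posterior P(H|E) from the prior P(H), the likelihood P(E|H),
    and P(E|not-H), expanding P(E) by the law of total probability."""
    marginal = likelihood * prior + likelihood_given_not_h * (1 - prior)
    return likelihood * prior / marginal

# Illustrative numbers: prior 0.5, P(E|H) = 0.8, P(E|not-H) = 0.2
print(bayes_posterior(0.5, 0.8, 0.2))  # 0.8
```

With a flat prior of 0.5, the posterior equals the normalized likelihood, which is why the example comes out to 0.8.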

Interpretation and Significance

Bayes' Theorem formalizes a pattern of reasoning that humans use intuitively: we start with some belief, observe evidence, and revise our belief accordingly. What the theorem adds is precision: it quantifies exactly how much the evidence should shift our belief, given the prior and the likelihood.

The theorem is neutral with respect to the interpretation of probability. Under the frequentist interpretation, probabilities are long-run frequencies and Bayes' theorem is a mathematical identity with limited scope. Under the Bayesian interpretation, probabilities are degrees of rational belief and Bayes' theorem is the engine of learning itself.

Applications

Bayes' Theorem is applied across many domains:

  • Medical diagnosis — updating the probability of a disease given a test result
  • Signal detection — distinguishing signal from noise in communication systems
  • Machine learning — Naive Bayes classifiers, Bayesian networks, and probabilistic graphical models
  • Cognitive science — models of human reasoning and belief updating
  • Legal reasoning — evaluating the probative force of evidence
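The medical-diagnosis case above is the classic illustration of how a prior can dominate a seemingly strong test result. The numbers below are hypothetical, chosen only to show the base-rate effect: a rare disease plus an imperfect test yields a surprisingly low posterior.

```python
# Hypothetical test characteristics (not from the article):
prevalence = 0.01       # P(disease) — the prior
sensitivity = 0.99      # P(positive | disease) — the likelihood
false_positive = 0.05   # P(positive | no disease)

# Marginal probability of a positive result, via total probability
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)

# Posterior probability of disease given a positive test
posterior = sensitivity * prevalence / p_positive
print(round(posterior, 3))  # 0.167
```

Despite a 99% sensitive test, a positive result here implies only about a 17% chance of disease, because true positives from the rare disease are outnumbered by false positives from the healthy majority.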

The Bayesian-Frequentist Debate

The theorem sits at the center of a century-long debate in statistics. Frequentists reject the use of prior probabilities for hypotheses, arguing that they introduce subjective judgment into what should be objective science. Bayesians respond that all inference requires judgment — frequentists merely hide theirs in model selection, significance thresholds, and stopping rules. The debate is not merely methodological. It is about what probability means and what statistics is for.

The theorem itself is uncontroversial — it is a mathematical identity derivable from the axioms of probability. The controversy is about when it is legitimate to apply it, what priors are reasonable, and whether the Bayesian framework provides a complete theory of inference.

Historical Note

Bayes' essay containing the theorem was published posthumously in 1763 by Richard Price, who recognized its significance. The theorem remained obscure until Pierre-Simon Laplace independently rediscovered it and developed it extensively in the late 18th and early 19th centuries. The modern Bayesian revival began in the mid-20th century with Leonard Jimmie Savage, Bruno de Finetti, and others, and accelerated with the advent of computational methods — particularly Markov Chain Monte Carlo — that made Bayesian computation feasible for complex models.