Breakdown Point

The breakdown point of an estimator is the smallest proportion of contaminated observations (outliers or arbitrarily large errors) that can drive the estimate arbitrarily far from its value on the clean data. It is one of the most widely used measures of robustness in statistics and was introduced by Frank Hampel in 1971.
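
One common way to make this definition precise is the finite-sample replacement form, sketched below. The notation is an assumption introduced here for illustration and does not come from the text above: T is the estimator, X is a sample of n observations, and X'_m is any sample obtained by replacing m of those observations with arbitrary values.

```latex
\varepsilon_n^{*}(T, X)
  = \min\left\{ \frac{m}{n} \;:\;
      \sup_{X'_m} \bigl\lVert T(X'_m) - T(X) \bigr\rVert = \infty
    \right\}
```

The percentage usually quoted as an estimator's breakdown point is the limit of this quantity as n grows.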

Examples

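The sample mean has a breakdown point of 0%: a single observation with an arbitrarily large value is enough to carry the mean arbitrarily far, so in a sample of n points the breakdown fraction is 1/n, which tends to zero as n grows. The sample median has a breakdown point of 50%: as long as fewer than half of the observations are corrupted, however badly, the median stays within the range of the uncorrupted data. The trimmed mean, which discards a fixed percentage of observations from each tail, has a breakdown point equal to that per-tail percentage; a 10% trimmed mean, for example, tolerates up to 10% contamination.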
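
The three cases above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper names (`trimmed_mean`, `contaminate`, `smallest_breaking_m`), the 20-point sample, the stand-in "huge" value 1e12, and the threshold 1e6 are all assumptions chosen for illustration. It replaces m observations with a huge value and reports the smallest m at which each estimator leaves a bounded range.

```python
import math
from statistics import mean, median


def trimmed_mean(values, alpha):
    """Mean after dropping floor(alpha * n) points from EACH tail (hypothetical helper)."""
    xs = sorted(values)
    k = math.floor(alpha * len(xs))
    return mean(xs[k:len(xs) - k])


def contaminate(values, m, bad=1e12):
    """Replace the first m observations with one huge value (stand-in for 'arbitrarily large')."""
    return [bad] * m + list(values[m:])


def smallest_breaking_m(estimator, values, limit=1e6):
    """Smallest number of replaced points that pushes the estimate past `limit`.

    `limit` is an illustrative threshold standing in for 'unbounded'.
    """
    for m in range(1, len(values) + 1):
        if abs(estimator(contaminate(values, m))) >= limit:
            return m
    return None


# Clean sample: the numbers 1..20, so n = 20.
clean = [float(i) for i in range(1, 21)]

estimators = {
    "mean": mean,
    "median": median,
    "10% trimmed mean": lambda v: trimmed_mean(v, 0.1),
}

for name, est in estimators.items():
    m = smallest_breaking_m(est, clean)
    print(f"{name}: breaks at m = {m} of {len(clean)} (fraction {m / len(clean):.2f})")
```

On this sample the mean breaks at the first replaced point, the median holds until half the points are replaced, and the 10% trimmed mean survives two bad points and breaks at the third. The trimmed mean's finite-sample fraction is therefore slightly above the trimming percentage; the two agree in the limit of large n.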

Significance

The breakdown point is not merely a technical property. It reveals what an estimator assumes about the relationship between data and the underlying process. A low breakdown point means the estimator trusts the data implicitly; a high breakdown point means the estimator is prepared for the data to lie. The choice of breakdown point is a choice about how much disorder the world is permitted to contain.

The mean's 0% breakdown point is not a bug but a confession: it was designed for a world that never lies.