Influence Function

From Emergent Wiki

The influence function of a statistical estimator measures how much a single observation at a given point in the sample space affects the estimator's output. Introduced by Frank Hampel in the 1970s as part of the robust statistics framework, it provides a local, infinitesimal measure of an estimator's sensitivity to individual data points.

Mathematical Formulation

For an estimator T, viewed as a functional of the underlying distribution F, the influence function at point x is defined as the directional (Gateaux) derivative of T at F in the direction of a point mass at x. Intuitively, it answers: if an infinitesimal fraction of the distribution is replaced by contamination at x, how much does the estimator change per unit of contamination?
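In symbols, a common way to write this definition is given below. The contamination fraction ε and the point mass δ_x are notation assumed here rather than taken from the text above. The worked lines apply the definition to the mean, T(F) = μ, as a check.

```latex
\mathrm{IF}(x; T, F) = \lim_{\varepsilon \to 0^{+}}
  \frac{T\bigl((1-\varepsilon)F + \varepsilon\,\delta_x\bigr) - T(F)}{\varepsilon}

% Worked example: the mean, T(F) = \int y \, dF(y) = \mu
T\bigl((1-\varepsilon)F + \varepsilon\,\delta_x\bigr) = (1-\varepsilon)\mu + \varepsilon x
\quad\Longrightarrow\quad
\mathrm{IF}(x; T, F) = \lim_{\varepsilon \to 0^{+}} \frac{\varepsilon\,(x - \mu)}{\varepsilon} = x - \mu
```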

Boundedness and Robustness

The influence function is bounded for robust estimators such as the median and the trimmed mean, and unbounded for non-robust estimators such as the mean. Boundedness is the mathematical signature of robustness: the effect any single observation can have is capped, however extreme its value. For the median and the trimmed mean the cap takes a particularly simple form: once an observation is sufficiently far from the center of the data, moving it further does not change the estimator. For the mean, the influence function grows linearly with the distance from the center, meaning extreme observations exert unbounded leverage.
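The contrast can be checked numerically with a finite-sample stand-in for the influence function, often called the sensitivity curve: add one observation at x to a fixed sample and rescale the resulting change in the estimate. The following is a minimal sketch, assuming NumPy is available; the helper name `sensitivity_curve`, the sample size, and the seed are illustrative choices, not part of the original article.

```python
import numpy as np

def sensitivity_curve(estimator, sample, x):
    """Finite-sample analogue of the influence function (hypothetical helper).

    Returns (n + 1) * (T(sample plus x) - T(sample)), where n = len(sample).
    """
    n = len(sample)
    augmented = np.append(sample, x)
    return (n + 1) * (estimator(augmented) - estimator(sample))

# Fixed standard-normal sample; the seed and size are arbitrary assumptions.
rng = np.random.default_rng(0)
sample = rng.standard_normal(999)

# Move the added observation further and further from the center.
for x in [1.0, 10.0, 100.0, 1000.0]:
    sc_mean = sensitivity_curve(np.mean, sample, x)
    sc_median = sensitivity_curve(np.median, sample, x)
    print(f"x={x:7.1f}  mean: {sc_mean:9.2f}  median: {sc_median:6.2f}")
```

For the mean, the printed value equals x minus the sample mean, so it keeps growing with x. For the median, the value stops changing once x lies beyond the middle order statistics of the sample.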

Relation to the Breakdown Point

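The influence function measures local sensitivity; the breakdown point measures global tolerance, namely the largest fraction of arbitrarily bad observations an estimator can absorb before it can be driven to an arbitrary value. An estimator can have a bounded influence function but a low breakdown point, or vice versa. The ideal robust estimator has both: a bounded influence function and a high breakdown point.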
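One way to see the difference between the two notions is to corrupt a growing fraction of a sample and watch when each estimator gives way. The sketch below, assuming NumPy, uses a 10% trimmed mean as an example of an estimator with bounded influence but a modest breakdown point (equal to its trimming fraction). The helpers `trimmed_mean` and `contaminate`, the outlier value, and the seed are hypothetical choices made for illustration.

```python
import numpy as np

def trimmed_mean(values, trim=0.1):
    """Symmetric trimmed mean: drop the lowest and highest `trim` fraction."""
    ordered = np.sort(np.asarray(values, dtype=float))
    k = int(trim * len(ordered))
    return ordered[k:len(ordered) - k].mean()

def contaminate(sample, fraction, value=1e6):
    """Return a copy with the first `fraction` of entries set to a gross outlier."""
    corrupted = np.array(sample, dtype=float)
    m = int(round(fraction * len(corrupted)))
    corrupted[:m] = value
    return corrupted

# Clean standard-normal data; the seed and size are arbitrary assumptions.
rng = np.random.default_rng(1)
clean = rng.standard_normal(1000)

for fraction in [0.0, 0.05, 0.2, 0.4]:
    data = contaminate(clean, fraction)
    print(f"{fraction:4.0%} bad  mean={np.mean(data):12.1f}  "
          f"trimmed={trimmed_mean(data):12.1f}  median={np.median(data):6.2f}")
```

With 5% contamination the mean is already ruined while the trimmed mean and the median stay near the bulk of the data. At 20% the trimmed mean breaks as well, because the outliers outnumber what it trims from one tail. The median still returns a value from the clean part of the sample at 40%.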

The unbounded influence function of the mean is a mathematical portrait of credulity: it believes every data point, no matter how absurd, and adjusts its belief proportionally to the absurdity.