Rao-Blackwell theorem

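The Rao-Blackwell theorem states that if an estimator is conditioned on a sufficient statistic, the resulting estimator is never worse, and is typically better, than the original. More precisely, given any estimator with finite mean squared error and a sufficient statistic for the parameter being estimated, the conditional expectation of the estimator given the sufficient statistic is a new estimator with the same expectation (and hence the same bias) whose mean squared error is no larger than the original's at every parameter value. The inequality is strict unless the original estimator is already, with probability one, a function of the sufficient statistic. Sufficiency is what makes the construction legitimate: because the conditional distribution of the data given the statistic does not depend on the parameter, the conditional expectation can be computed without knowing the parameter. The theorem is named after C.R. Rao and David Blackwell, who established it independently in the 1940s (Rao in 1945, Blackwell in 1947), and it is one of the foundational results that justify the very concept of sufficiency: a sufficient statistic captures all the information the data carry about the parameter, so any estimator that is not a function of it contains extra noise that can be averaged out.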
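
The conditioning step can be sketched on a small example. The following is a minimal illustration rather than a general implementation, assuming n independent Bernoulli(p) observations, the deliberately crude unbiased estimator "first observation only", and the sample total as the sufficient statistic. The helper names (outcomes, rao_blackwellize, mse, crude, improved) are hypothetical and chosen for this sketch. Everything is computed exactly by enumerating all 2^n samples, so no simulation noise is involved.

```python
from collections import defaultdict
from itertools import product


def outcomes(n, p):
    """Yield (sample, probability) for every 0/1 sample of n iid Bernoulli(p) draws."""
    for x in product((0, 1), repeat=n):
        k = sum(x)
        yield x, p ** k * (1 - p) ** (n - k)


def rao_blackwellize(estimator, statistic, n, p):
    """Return the table t -> E[estimator(X) | statistic(X) = t]."""
    weighted_sum = defaultdict(float)
    mass = defaultdict(float)
    for x, prob in outcomes(n, p):
        t = statistic(x)
        weighted_sum[t] += prob * estimator(x)
        mass[t] += prob
    return {t: weighted_sum[t] / mass[t] for t in mass}


def mse(estimator, n, p):
    """Exact mean squared error of an estimator of p."""
    return sum(prob * (estimator(x) - p) ** 2 for x, prob in outcomes(n, p))


def crude(x):
    """Unbiased but wasteful: looks at the first observation only."""
    return x[0]


n = 5

# Sufficiency: the conditional expectation given the total is the same table
# whichever p is used to compute it (here it equals t / n for each total t).
table_a = rao_blackwellize(crude, sum, n, 0.3)
table_b = rao_blackwellize(crude, sum, n, 0.7)


def improved(x):
    """The Rao-Blackwellized estimator: look up E[crude | total]."""
    return table_a[sum(x)]


for p in (0.1, 0.3, 0.5, 0.9):
    print(f"p={p}: MSE crude={mse(crude, n, p):.4f}, "
          f"MSE improved={mse(improved, n, p):.4f}")
```

In this setting the improved estimator turns out to be the sample mean, and its mean squared error is smaller than that of the crude estimator by a factor of n at every value of p strictly between 0 and 1.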

The theorem itself does not require the sufficient statistic to be complete. Completeness, meaning that no non-trivial function of the statistic has expectation zero for all parameter values, enters through the related Lehmann-Scheffé theorem: when the statistic is both sufficient and complete, Rao-Blackwellizing any unbiased estimator produces the same result, namely the unique minimum-variance unbiased estimator. When completeness fails, the conditioning step still never makes the estimator worse, but it may not yield the optimal one, and different starting estimators can lead to different answers. The interplay between sufficiency and completeness is a central theme of classical estimation theory, and it reappears in modern form as the question of whether a neural network's latent representation is sufficient for downstream tasks, a question that machine learning has largely rediscovered without citing the classical results that addressed it decades ago.
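
The role of completeness can be illustrated with a small exact computation. This is a sketch under stated assumptions, not a general result: four independent Bernoulli(p) observations, two unbiased starting estimators (the first and the last observation), and two sufficient statistics. The total is complete for this model. The pair of half-sample sums is still sufficient but not complete, because the difference of its two components has expectation zero for every p. All function names are hypothetical and chosen for this sketch.

```python
from collections import defaultdict
from itertools import product

N, P = 4, 0.3  # sample size and the value of p used for the exact calculations


def outcomes():
    """Yield (sample, probability) for every 0/1 sample of N iid Bernoulli(P) draws."""
    for x in product((0, 1), repeat=N):
        k = sum(x)
        yield x, P ** k * (1 - P) ** (N - k)


def condition(estimator, statistic):
    """Return the estimator x -> E[estimator(X) | statistic(X) = statistic(x)]."""
    weighted_sum = defaultdict(float)
    mass = defaultdict(float)
    for x, prob in outcomes():
        t = statistic(x)
        weighted_sum[t] += prob * estimator(x)
        mass[t] += prob
    table = {t: weighted_sum[t] / mass[t] for t in mass}
    return lambda x: table[statistic(x)]


def mse(estimator):
    """Exact mean squared error of an estimator of P."""
    return sum(prob * (estimator(x) - P) ** 2 for x, prob in outcomes())


def first(x):
    return x[0]


def last(x):
    return x[-1]


def pair_sums(x):
    """Sufficient but not complete: E[(x0 + x1) - (x2 + x3)] = 0 for every p."""
    return (x[0] + x[1], x[2] + x[3])


# Complete sufficient statistic: both starting points collapse to one estimator.
total_from_first = condition(first, sum)
total_from_last = condition(last, sum)

# Sufficient but incomplete statistic: the result depends on the starting point.
pairs_from_first = condition(first, pair_sums)
pairs_from_last = condition(last, pair_sums)

sample = (1, 1, 0, 0)
print("given total:", total_from_first(sample), total_from_last(sample))
print("given pairs:", pairs_from_first(sample), pairs_from_last(sample))
print("MSE first:", mse(first))
print("MSE first given pairs:", mse(pairs_from_first))
print("MSE first given total:", mse(total_from_first))
```

Conditioning on the pair of sums halves the mean squared error of the starting estimator in this example, so the step still helps, but conditioning on the total does better again and gives the same estimator from either starting point.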