Rademacher Complexity

Rademacher complexity is a measure of the richness of a hypothesis class in statistical learning theory, offering a data-dependent alternative to the VC dimension. Where the VC dimension measures worst-case expressivity — the ability to fit arbitrary labelings — Rademacher complexity measures the expected ability of the class to fit random noise. It is defined as the expected supremum of the correlation between the class's predictions and a vector of independent Rademacher random variables (±1 with equal probability). Because it depends on the actual data distribution, Rademacher complexity can yield tighter generalization bounds than the distribution-free VC bound, particularly when the data has structure that the worst-case analysis ignores. The concept connects to empirical risk minimization through symmetrization arguments and to modern deep learning theory through PAC-Bayesian bounds and margin-based analyses. In practice, Rademacher complexity is harder to compute than VC dimension, but it captures a more realistic notion of effective complexity: not what the model class could do, but what it is likely to do given the data it actually sees.