Mercer's Theorem

Mercer's theorem, proven by James Mercer in 1909, states that a continuous, symmetric, positive semi-definite kernel K on a compact domain admits the expansion K(x, y) = Σ_i λ_i φ_i(x) φ_i(y), where the φ_i are orthonormal eigenfunctions of the integral operator induced by K, the eigenvalues λ_i are non-negative, and the series converges absolutely and uniformly. This spectral decomposition is the mathematical foundation of the kernel method: it guarantees that every kernel satisfying these conditions acts as an inner product in a Hilbert space, with the feature map sending x to the sequence (√λ_i φ_i(x))_i. The theorem thus transforms an analytic property, positive semi-definiteness, into a geometric one: the kernel is the inner product of a feature map into a potentially infinite-dimensional space, and its values on any finite set of points form the Gram matrix of their images.
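The decomposition can be illustrated numerically on a finite grid. The following is a minimal sketch, not the theorem itself: it assumes a Gaussian (RBF) kernel on the interval [0, 1], and the names rbf_kernel and mercer_check, the length scale, the grid size, and the truncation rank are all hypothetical choices made for illustration. It checks that the Gram matrix has no significantly negative eigenvalues and that a truncated eigen-expansion reproduces the kernel as an inner product of finite-dimensional features.

```python
import numpy as np


def rbf_kernel(x, y, length_scale=0.2):
    """Gaussian (RBF) kernel: continuous, symmetric, positive semi-definite."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff ** 2 / (2.0 * length_scale ** 2))


def mercer_check(n=200, rank=30):
    """Discrete analogue of the Mercer expansion on a grid over [0, 1].

    Returns the eigenvalues of the Gram matrix (descending) and the largest
    entrywise error of the rank-truncated reconstruction.
    """
    # Discretize the compact domain [0, 1] (illustrative choice of grid).
    x = np.linspace(0.0, 1.0, n)
    K = rbf_kernel(x, x)

    # Eigendecomposition of the symmetric Gram matrix.
    eigvals, eigvecs = np.linalg.eigh(K)

    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Truncated feature map: row i holds sqrt(lambda_k) * v_k[i] for k < rank.
    # Clipping guards against tiny negative values caused by round-off.
    top = np.clip(eigvals[:rank], 0.0, None)
    features = eigvecs[:, :rank] * np.sqrt(top)

    # Inner products of the features approximate the kernel values.
    K_approx = features @ features.T
    max_err = np.max(np.abs(K - K_approx))
    return eigvals, max_err


if __name__ == "__main__":
    eigvals, max_err = mercer_check()
    print("smallest eigenvalue:", eigvals.min())
    print("max reconstruction error:", max_err)
```

The smallest eigenvalue may come out as a tiny negative number because of floating-point round-off; the clipping step handles this. The rapid decay of the eigenvalues is what lets a small number of features reproduce the kernel closely in this example.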

Mercer's theorem is not merely a technical result. It is the bridge between the analytic tradition of functional analysis and the algorithmic tradition of machine learning, revealing that the kernel trick is not just a computational shortcut but a manifestation of deep spectral structure.