Reproducing kernel Hilbert space

A reproducing kernel Hilbert space (RKHS) is a Hilbert space of functions in which point evaluation is a continuous linear functional. By the Riesz representation theorem, the value of any function f at any point x can then be computed as an inner product with a kernel function centered at that point: f(x) = ⟨f, k(·, x)⟩. This identity is the reproducing property. Developed systematically by Nachman Aronszajn in 1950, the RKHS framework turns function approximation into geometry: finding the right function becomes finding the right vector in a Hilbert space, and the kernel encodes the similarity structure of the domain.

In machine learning, RKHS theory underpins kernel methods such as support vector machines and Gaussian processes, and provides the setting in which the neural tangent kernel is analyzed. For common kernels such as the Gaussian kernel, the RKHS norm measures smoothness, which helps explain why the minimum norm interpolant can generalize well: among all functions that fit the data it is the smoothest in this sense, and smoothness is often correlated with generalization. The spectral decay of the kernel operator, that is, how quickly its eigenvalues shrink, is a key factor in whether benign overfitting is possible in high dimensions.
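As a minimal sketch of the reproducing property and the minimum norm interpolant, the following NumPy example uses a Gaussian kernel on a handful of made-up one-dimensional points. The data, the bandwidth of 0.5, and the helper names gaussian_kernel, inner_product, and interpolant are illustrative assumptions, not part of any standard API.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=0.5):
    """Gaussian kernel matrix k(x_i, y_j) for 1-D point arrays x and y."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def inner_product(coef_f, centers_f, coef_g, centers_g):
    """RKHS inner product of f = sum_i a_i k(., x_i) and g = sum_j b_j k(., z_j).

    For functions in the span of kernel sections this reduces to a^T K(x, z) b.
    """
    return coef_f @ gaussian_kernel(centers_f, centers_g) @ coef_g

# Hypothetical training data (illustrative only).
x_train = np.array([-1.0, -0.3, 0.4, 1.2])
y_train = np.sin(2.0 * x_train)

# Minimum norm interpolant: f = sum_i alpha_i k(., x_i) with K alpha = y.
K = gaussian_kernel(x_train, x_train)
alpha = np.linalg.solve(K, y_train)

def interpolant(x_new):
    """Evaluate the interpolant at one or more new points."""
    return gaussian_kernel(np.atleast_1d(x_new), x_train) @ alpha

# Reproducing property: evaluating f at x0 equals <f, k(., x0)>.
x0 = 0.1
value_by_evaluation = interpolant(x0)[0]
value_by_inner_product = inner_product(
    alpha, x_train, np.array([1.0]), np.array([x0])
)

# Squared RKHS norm of the interpolant: alpha^T K alpha.
rkhs_norm_squared = alpha @ K @ alpha
```

Here value_by_evaluation and value_by_inner_product agree up to floating-point error, and rkhs_norm_squared is the quantity that the minimum norm interpolant keeps as small as possible among all functions in the space that fit the data.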

The Kernel as Feedback Topology

From a systems-theoretic perspective, the kernel function in an RKHS is more than a similarity measure: it acts as a feedback structure that determines how information from one point in the domain propagates to every other point. The kernel encodes the coupling between observations. A Gaussian kernel with a large bandwidth couples distant points strongly (high coupling, low localization), while one with a small bandwidth makes influence local and rapidly decaying (low coupling, high localization). The choice of kernel is therefore a choice of feedback topology, a decision about how the system's components are connected.
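The effect of bandwidth on coupling can be made concrete with a small NumPy sketch. The grid of points, the two bandwidth values, and the helper mean_coupling are assumptions chosen for illustration; the average off-diagonal kernel entry is used here as one simple proxy for how strongly distinct points influence one another.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel matrix k(x_i, y_j) for 1-D point arrays x and y."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def mean_coupling(x, bandwidth):
    """Average kernel value between distinct points (off-diagonal entries)."""
    K = gaussian_kernel(x, x, bandwidth)
    off_diagonal = K[~np.eye(len(x), dtype=bool)]
    return off_diagonal.mean()

# Hypothetical domain: 20 evenly spaced points on [0, 1].
x = np.linspace(0.0, 1.0, 20)

wide = mean_coupling(x, bandwidth=2.0)     # large bandwidth: near-global coupling
narrow = mean_coupling(x, bandwidth=0.02)  # small bandwidth: near-local coupling

print(f"mean coupling, wide kernel:   {wide:.3f}")
print(f"mean coupling, narrow kernel: {narrow:.3f}")
```

With the wide kernel the matrix is close to all ones, so every observation affects predictions everywhere; with the narrow kernel it is close to the identity, so each observation affects only its immediate neighborhood.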

This topological interpretation suggests one reason kernel methods generalize. The norm in an RKHS penalizes functions that vary rapidly: in the eigenbasis of the kernel operator, each component of a function is weighted by the inverse of its eigenvalue, and for kernels such as the Gaussian the small eigenvalues belong to the most oscillatory eigenfunctions. Penalizing the norm therefore amounts to penalizing the high-frequency feedback loops that would amplify noise. The minimum norm interpolant is the function that uses the smoothest possible feedback structure to fit the data; in this sense it finds the simplest topology that explains the observations. This is structurally analogous to the principle of negative feedback in control theory: just as a negative feedback loop dampens perturbations and stabilizes a system, the RKHS norm dampens high-frequency variations and stabilizes the function estimate.
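The claim that the norm penalizes rapid variation can be checked numerically on a finite sample. The sketch below, under the assumption of a Gaussian kernel with bandwidth 0.1 on 12 evenly spaced points, compares the squared norm of the minimum norm interpolant for two unit-length label vectors: the top eigenvector of the kernel matrix (slowly varying) and the bottom eigenvector (rapidly oscillating). The helper names are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel matrix k(x_i, y_j) for 1-D point arrays x and y."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def sign_changes(v):
    """Count sign changes along a vector, as a crude measure of oscillation."""
    s = np.sign(v)
    return int(np.sum(s[:-1] * s[1:] < 0))

def interpolant_norm_squared(K, y):
    """Squared RKHS norm of the minimum norm interpolant of labels y: y^T K^{-1} y."""
    return float(y @ np.linalg.solve(K, y))

# Hypothetical sample points and bandwidth (illustrative only).
x = np.linspace(0.0, 1.0, 12)
K = gaussian_kernel(x, x, bandwidth=0.1)

# eigh returns eigenvalues in ascending order, with unit-norm eigenvectors.
eigvals, eigvecs = np.linalg.eigh(K)
smooth_labels = eigvecs[:, -1]  # direction with the largest eigenvalue
rough_labels = eigvecs[:, 0]    # direction with the smallest eigenvalue

norm_smooth = interpolant_norm_squared(K, smooth_labels)  # equals 1 / largest eigenvalue
norm_rough = interpolant_norm_squared(K, rough_labels)    # equals 1 / smallest eigenvalue

print("sign changes (smooth, rough):", sign_changes(smooth_labels), sign_changes(rough_labels))
print("squared norm (smooth, rough):", norm_smooth, norm_rough)
```

Both label vectors have the same Euclidean length, yet fitting the oscillatory one costs far more norm. This is the sense in which a norm penalty dampens high-frequency variation.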

The neural tangent kernel (NTK) makes this connection explicit. In the infinite-width limit, under a suitable parameterization, the gradient descent training dynamics of a neural network are governed by a kernel that stays fixed throughout training and determines how gradient information flows from one training point to another. The NTK is the feedback topology of gradient descent: it specifies which points influence which other points during training, and its spectral properties shape how quickly the network converges, how well it generalizes, and how susceptible it is to adversarial perturbations. In this view the kernel is not a passive similarity measure but the active architecture of information flow.
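The following is a minimal sketch of how a tangent kernel couples training points, using a finite-width network rather than the infinite-width limit. It assumes a hypothetical two-layer tanh network with a 1/sqrt(width) output scaling, a squared loss, and hand-written gradients; the width, data, learning rate, and function names are all illustrative choices. The empirical kernel is the Gram matrix of parameter gradients, and to first order one small gradient step changes the outputs by the kernel applied to the residuals.

```python
import numpy as np

def init_params(width, rng):
    """Random parameters for f(x) = sum_j a_j tanh(w_j x + b_j) / sqrt(width)."""
    return {
        "w": rng.normal(size=width),
        "b": rng.normal(size=width),
        "a": rng.normal(size=width),
    }

def forward(params, x):
    """Network outputs at the 1-D inputs x."""
    z = np.outer(x, params["w"]) + params["b"]  # shape (n, width)
    return np.tanh(z) @ params["a"] / np.sqrt(len(params["a"]))

def jacobian(params, x):
    """Gradient of each output with respect to all parameters, ordered [w, b, a]."""
    m = len(params["a"])
    z = np.outer(x, params["w"]) + params["b"]
    h = np.tanh(z)
    dh = 1.0 - h**2                        # derivative of tanh
    d_a = h / np.sqrt(m)
    d_b = dh * params["a"] / np.sqrt(m)
    d_w = d_b * x[:, None]
    return np.concatenate([d_w, d_b, d_a], axis=1)  # shape (n, 3 * width)

def empirical_ntk(params, x1, x2):
    """Empirical tangent kernel: inner products of parameter gradients."""
    return jacobian(params, x1) @ jacobian(params, x2).T

rng = np.random.default_rng(0)
params = init_params(512, rng)

# Hypothetical training data (illustrative only).
x_train = np.linspace(-1.0, 1.0, 8)
y_train = np.sin(3.0 * x_train)

theta = empirical_ntk(params, x_train, x_train)

# One small gradient step on the loss 0.5 * ||f(x_train) - y_train||^2.
lr = 1e-4
residual = forward(params, x_train) - y_train
grad = jacobian(params, x_train).T @ residual
g_w, g_b, g_a = np.split(grad, 3)
new_params = {
    "w": params["w"] - lr * g_w,
    "b": params["b"] - lr * g_b,
    "a": params["a"] - lr * g_a,
}

actual_change = forward(new_params, x_train) - forward(params, x_train)
predicted_change = -lr * theta @ residual  # first-order prediction from the kernel
```

Entry (i, j) of theta measures how much a gradient step driven by the error at point j moves the output at point i, which is the coupling the text describes. In this finite-width sketch the kernel is only approximately constant during training; the fixed-kernel statement applies to the infinite-width limit.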