Gaussian Process
A Gaussian process is a non-parametric probabilistic model that defines a distribution over functions, such that any finite collection of function values has a joint Gaussian distribution. It is the canonical surrogate model in Bayesian optimization because it provides a full predictive distribution, summarized by a mean and a variance, rather than a point prediction alone. The mean captures the expected value of the unknown function, and the variance captures the uncertainty around that expectation. The key modeling decision is the covariance kernel, which encodes assumptions about smoothness, periodicity, and correlation structure: a squared exponential kernel implies infinitely differentiable functions, while a Matérn kernel permits rougher, often more realistic landscapes.
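To make the predictive mean and variance concrete, the following is a minimal sketch of Gaussian process regression in NumPy. It assumes a zero prior mean, a squared exponential kernel with fixed hyperparameters, and Gaussian observation noise; the function names (rbf_kernel, gp_posterior) and the toy data are illustrative choices, not part of any particular library.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_test.

    `noise` is the assumed observation noise variance.
    """
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)    # train/test covariances
    K_ss = rbf_kernel(x_test, x_test)    # prior covariance at test inputs

    # A Cholesky factorization avoids forming an explicit inverse.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha

    # Posterior variance = prior variance minus the part explained by data.
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# Toy data (assumed for illustration): noiseless samples of sin(x).
x_train = np.array([-2.0, -1.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.array([0.0, 5.0])  # one observed input, one far from the data

mean, var = gp_posterior(x_train, y_train, x_test)
print("mean:", mean)
print("variance:", var)
```

The variance at the observed input should be small, while at the distant input it should revert toward the prior variance. That is the behavior Bayesian optimization exploits when trading off exploration against exploitation.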
Gaussian processes connect probability theory to function approximation in a way that makes uncertainty explicit and tractable. They are used not only for optimization but also for spatial statistics, robotics, and any domain where predictions must come with calibrated confidence. Exact inference requires factorizing an n-by-n covariance matrix, so its cost grows as O(n^3) in time and O(n^2) in memory for n observations, which makes Gaussian processes expensive for large datasets. This limitation has driven research into sparse approximations and related kernel methods that retain the probabilistic framework while reducing the computational burden.
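As one illustration of a sparse approximation, the sketch below compares the exact predictive mean with a subset-of-regressors approximation that summarizes the data through m inducing inputs, so only an m-by-m system is solved. This is one of several sparse schemes and is chosen here for brevity. The function names, the evenly spaced inducing inputs, the jitter value, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def exact_mean(x_train, y_train, x_test, noise=1e-2):
    """Exact GP predictive mean: solves an n-by-n system, O(n^3)."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    return rbf_kernel(x_test, x_train) @ np.linalg.solve(K, y_train)

def sor_mean(x_train, y_train, x_test, x_inducing, noise=1e-2):
    """Subset-of-regressors mean: solves an m-by-m system, O(n m^2)."""
    K_mn = rbf_kernel(x_inducing, x_train)
    K_mm = rbf_kernel(x_inducing, x_inducing)
    A = K_mn @ K_mn.T + noise * K_mm
    A += 1e-8 * np.eye(len(x_inducing))  # small jitter for stability
    return rbf_kernel(x_test, x_inducing) @ np.linalg.solve(A, K_mn @ y_train)

# Synthetic data (assumed): noisy samples of sin(x) on [0, 10].
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 10.0, size=200))
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(200)
x_inducing = np.linspace(0.0, 10.0, 15)  # m = 15 inducing inputs
x_test = np.linspace(1.0, 9.0, 5)

gap = np.max(np.abs(exact_mean(x_train, y_train, x_test)
                    - sor_mean(x_train, y_train, x_test, x_inducing)))
print("max difference between exact and sparse means:", gap)
```

When the inducing inputs coincide with the training inputs, the approximate mean reduces to the exact one. Using fewer inducing inputs trades accuracy for speed. This approximation's predictive variance is known to be overconfident far from the inducing inputs, which is one reason more refined sparse methods exist.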