Bayesian Neural Networks

From Emergent Wiki

Bayesian neural networks (BNNs) are machine learning models that place a probability distribution over network weights rather than learning a single point estimate. Where a standard neural network produces a fixed mapping from inputs to outputs, a BNN produces a distribution over outputs by integrating predictions across the posterior distribution of weights given training data. This is the theoretically principled approach to uncertainty quantification in deep learning — and the computationally intractable one.
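The integration over the weight posterior can be made concrete on a model small enough to admit an exact posterior. The sketch below uses conjugate Bayesian linear regression with a single weight as a stand-in for a network; the prior precision `alpha`, noise precision `beta`, and sample count are illustrative choices, not values from this article. Predictions are averaged over weights drawn from the posterior, which is the Monte Carlo form of the integral described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise. A one-weight linear model stands in for a network.
X = rng.normal(size=(50, 1))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=50)

# Conjugate Bayesian linear regression: prior w ~ N(0, 1/alpha),
# likelihood y ~ N(Xw, 1/beta). The posterior over w is Gaussian in closed form.
alpha, beta = 1.0, 100.0                      # illustrative precisions
S_inv = alpha * np.eye(1) + beta * X.T @ X    # posterior precision
S = np.linalg.inv(S_inv)                      # posterior covariance
m = beta * S @ X.T @ y                        # posterior mean

# Posterior predictive at a test input: integrate over the weight posterior
# by Monte Carlo -- sample weights, predict with each, average.
x_star = np.array([1.0])
w_samples = rng.multivariate_normal(m, S, size=1000)
preds = w_samples @ x_star                    # one prediction per weight sample

# preds.mean() is the predictive mean; preds.std() is the epistemic
# (weight-uncertainty) component of the predictive spread.
print(preds.mean(), preds.std())
```

A deep network replaces the closed-form posterior with one of the approximations discussed below, but the sampling-and-averaging step is the same.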

The posterior over weights in a modern neural network is a distribution over billions of parameters, shaped by a non-convex loss landscape with many local minima and saddle points. Exact Bayesian inference over this distribution is analytically and computationally intractable. All practical BNN methods are approximations: variational inference replaces the posterior with a tractable family; the Laplace approximation fits a Gaussian to the posterior around a MAP estimate; Markov chain Monte Carlo methods such as Hamiltonian Monte Carlo draw asymptotically exact samples but scale so poorly to large networks that practical variants rely on stochastic gradients, reintroducing bias. Each approximation introduces errors that tend to worsen out-of-distribution, precisely where calibrated uncertainty matters most.
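Of the three, the Laplace approximation is the easiest to sketch end to end. The toy below is a hypothetical one-parameter logistic regression standing in for a network's final layer: Newton steps find the MAP weight, and the inverse Hessian of the negative log posterior at that point supplies the variance of the Gaussian fit. The prior precision and iteration count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy problem: 1-D logistic regression with data generated
# from a known weight, standing in for a network's last layer.
X = rng.normal(size=200)
w_true = 1.5
y = (rng.uniform(size=200) < 1.0 / (1.0 + np.exp(-w_true * X))).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

alpha = 1.0  # prior precision (illustrative): w ~ N(0, 1/alpha)

# Step 1: Newton iterations to the MAP estimate of w.
w = 0.0
for _ in range(25):
    p = sigmoid(w * X)
    grad = -np.sum((y - p) * X) + alpha * w      # d/dw of neg log posterior
    hess = np.sum(p * (1.0 - p) * X**2) + alpha  # d2/dw2 of neg log posterior
    w -= grad / hess

# Step 2: Laplace posterior is N(w_map, 1/H), with the Hessian H
# re-evaluated at the MAP.
p = sigmoid(w * X)
hess = np.sum(p * (1.0 - p) * X**2) + alpha
w_map, laplace_var = w, 1.0 / hess
print(w_map, laplace_var)
```

For this convex one-parameter posterior the Gaussian fit is accurate; in a deep network the Hessian is enormous and the posterior multi-modal, which is where the approximation errors described above enter.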

The promise of BNNs, that they will know what they do not know, has so far outpaced their empirical performance. Whether the gap reflects the inadequacy of current approximations or a more fundamental computational intractability in the problem itself remains contested.