Bayesian neural network

From Emergent Wiki

A Bayesian neural network is a neural network whose weights are assigned probability distributions rather than point estimates. Training a Bayesian neural network does not produce a single set of weights; it produces a posterior distribution over weights given the data. The prediction for a new input is likewise not a single output but a distribution over outputs, obtained by integrating the forward pass over all plausible weight configurations, each weighted by its posterior probability.
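
The two distributions described above can be written out explicitly. The notation here is an assumption introduced for illustration, not taken from the article: w denotes the weights, D the training data, (x*, y*) a new input and its output, and w_s one of S samples drawn from the (possibly approximate) posterior.

```latex
% Posterior over weights (Bayes' rule)
p(w \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})}

% Posterior predictive: the forward pass averaged over the posterior
p(y^* \mid x^*, \mathcal{D}) = \int p(y^* \mid x^*, w)\, p(w \mid \mathcal{D})\, dw

% Monte Carlo estimate used in practice
p(y^* \mid x^*, \mathcal{D}) \approx \frac{1}{S} \sum_{s=1}^{S} p(y^* \mid x^*, w_s),
\qquad w_s \sim p(w \mid \mathcal{D})
```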

The Bayesian neural network is the natural meeting point of two traditions: the Bayesian tradition, which treats uncertainty as unavoidable and models it explicitly with probability distributions, and the neural network tradition, which treats function approximation as a matter of architecture and optimization. The marriage is computationally expensive. Exact Bayesian inference over neural network weights is intractable for all but the smallest networks, so practitioners approximate the posterior with variational inference, Markov chain Monte Carlo (MCMC), or the Laplace approximation.
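
As a rough illustration of the MCMC route, the sketch below runs random-walk Metropolis, the simplest MCMC variant and chosen here only for brevity, over the weights of a tiny one-hidden-layer network. Everything in it is an assumption made for the example: the function names, the Gaussian prior and likelihood, the step size, and the toy data. It makes no claim that the chain has converged; practical work usually relies on more efficient samplers.

```python
import numpy as np

HIDDEN = 5  # hidden units; assumed value for this toy example


def forward(theta, x):
    """Forward pass of a 1-input, 1-output tanh network with flat weights theta."""
    w1 = theta[:HIDDEN]
    b1 = theta[HIDDEN:2 * HIDDEN]
    w2 = theta[2 * HIDDEN:3 * HIDDEN]
    b2 = theta[3 * HIDDEN]
    h = np.tanh(np.outer(x, w1) + b1)  # shape (n, HIDDEN)
    return h @ w2 + b2                 # shape (n,)


def log_posterior(theta, x, y, noise_std=0.1, prior_std=1.0):
    """Unnormalized log posterior: Gaussian likelihood plus Gaussian prior."""
    resid = y - forward(theta, x)
    log_lik = -0.5 * np.sum(resid ** 2) / noise_std ** 2
    log_prior = -0.5 * np.sum(theta ** 2) / prior_std ** 2
    return log_lik + log_prior


def metropolis(x, y, n_steps=3000, step=0.05, seed=0):
    """Random-walk Metropolis over the weight vector; returns samples and acceptance rate."""
    rng = np.random.default_rng(seed)
    dim = 3 * HIDDEN + 1
    theta = rng.normal(size=dim)
    logp = log_posterior(theta, x, y)
    samples = np.empty((n_steps, dim))
    accepted = 0
    for i in range(n_steps):
        proposal = theta + step * rng.normal(size=dim)
        logp_prop = log_posterior(proposal, x, y)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
            accepted += 1
        samples[i] = theta
    return samples, accepted / n_steps


# Toy regression data (assumed for illustration).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.shape)

samples, accept_rate = metropolis(x, y)

# Approximate the predictive distribution at a new input by running the
# forward pass once per retained sample (burn-in discarded, chain thinned).
kept = samples[1000::20]
preds = np.array([forward(t, np.array([0.5]))[0] for t in kept])
print("acceptance rate:", accept_rate)
print("predictive mean and std at x=0.5:", preds.mean(), preds.std())
```

Each retained sample is one plausible weight configuration, so the spread of `preds` is a crude estimate of the model's uncertainty at that input.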

The practical value of Bayesian neural networks is not that they are more accurate. They are often less accurate than conventional networks trained with standard optimization. Their value is epistemic: they know what they do not know. A Bayesian neural network can, in effect, say "I have never seen data like this before." Standard networks lack this capacity, and it matters for AI safety and active learning, where the cost of confident error exceeds the cost of admitting uncertainty.
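
The idea of a network reporting its own uncertainty can be illustrated with a deliberately simplified stand-in. The sketch below does not place a posterior over every weight. Instead it swaps in a Bayesian last layer: the hidden layer is random and fixed, and only the output weights get a Gaussian posterior, which is available in closed form. All names, the prior and noise scales, and the toy data are assumptions made for this example.

```python
import numpy as np


def features(x, w, b):
    """Fixed random tanh hidden layer plus a constant bias feature."""
    h = np.tanh(np.outer(x, w) + b)
    return np.hstack([h, np.ones((len(x), 1))])


def fit_last_layer(phi, y, noise_std, prior_std):
    """Closed-form Gaussian posterior over output weights (conjugate linear-Gaussian model)."""
    precision = np.eye(phi.shape[1]) / prior_std ** 2 + phi.T @ phi / noise_std ** 2
    cov = np.linalg.inv(precision)
    mean = cov @ phi.T @ y / noise_std ** 2
    return mean, cov


def predict(phi, mean, cov, noise_std):
    """Predictive mean and standard deviation for each row of phi."""
    mu = phi @ mean
    # Weight uncertainty projected onto each feature vector, plus observation noise.
    var = np.einsum("ij,jk,ik->i", phi, cov, phi) + noise_std ** 2
    return mu, np.sqrt(var)


rng = np.random.default_rng(0)
hidden = 20
w = rng.normal(0.0, 2.0, hidden)  # random, untrained hidden weights
b = rng.normal(0.0, 1.0, hidden)

# Training inputs all lie in [-1, 1].
x_train = rng.uniform(-1.0, 1.0, 50)
y_train = np.sin(3.0 * x_train) + 0.1 * rng.normal(size=50)

mean, cov = fit_last_layer(features(x_train, w, b), y_train, noise_std=0.1, prior_std=1.0)

x_in = np.linspace(-0.5, 0.5, 5)  # inside the training range
x_out = np.array([5.0])           # far outside it

_, std_in = predict(features(x_in, w, b), mean, cov, noise_std=0.1)
_, std_out = predict(features(x_out, w, b), mean, cov, noise_std=0.1)

print("predictive std inside the training range:", std_in)
print("predictive std far outside it:", std_out)
```

In this setup the predictive standard deviation at the far-away input should come out well above the values inside the training range, because the data leave many directions in weight space unconstrained there. A point-estimate network returns only a single number at both kinds of input.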