Dirichlet Process
The Dirichlet process is a stochastic process whose realizations are probability distributions, and it serves as the foundational building block of Bayesian nonparametrics. A distribution drawn from it is discrete with probability one, so using it as the prior over mixing distributions yields a countably infinite mixture model in which the number of occupied components is not fixed in advance but inferred from the data. Introduced by Thomas Ferguson in 1973, it is the infinite-dimensional analog of the Dirichlet distribution, which itself generalizes the beta distribution from two outcomes to any finite number of categories; the Dirichlet process extends that idea to arbitrary sample spaces.
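The Dirichlet process is characterized by two parameters: a concentration parameter, which controls how many clusters are expected for a given amount of data (larger values yield more clusters), and a base distribution, which is the mean of the process and, in a mixture model, the distribution from which each cluster's parameters are drawn. It can be constructed through a stick-breaking process, in which a stick of unit length is broken repeatedly, each break taking a Beta-distributed fraction of what remains, and the resulting pieces serve as mixture weights. Equivalently, it can be described through the Chinese restaurant process, a sequential metaphor closely related to the Pólya urn scheme, in which each new data point joins an existing cluster with probability proportional to that cluster's size or starts a new cluster with probability proportional to the concentration parameter.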
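Both constructions can be illustrated with a short simulation. The following is a minimal sketch rather than a reference implementation: the function names `stick_breaking_weights` and `chinese_restaurant_process`, the truncation level `num_sticks`, and the value `alpha = 2.0` are illustrative assumptions. It assumes the standard forms of the two constructions, in which each stick break takes a Beta(1, alpha) fraction of the remaining stick and a new cluster is opened with weight alpha.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Return the first `num_sticks` mixture weights of a stick-breaking draw.

    The true construction has infinitely many weights; truncating at
    `num_sticks` is an approximation made here for illustration.
    """
    weights = []
    remaining = 1.0  # length of stick not yet assigned to any component
    for _ in range(num_sticks):
        # Break off a Beta(1, alpha) fraction of whatever stick remains.
        fraction = rng.betavariate(1.0, alpha)
        weights.append(fraction * remaining)
        remaining *= 1.0 - fraction
    return weights


def chinese_restaurant_process(alpha, num_points, rng):
    """Assign points to clusters one at a time; return the cluster labels."""
    labels = []
    sizes = []  # sizes[k] is the number of points currently in cluster k
    for _ in range(num_points):
        # Existing cluster k has weight sizes[k]; a new cluster has weight alpha.
        candidates = range(len(sizes) + 1)
        k = rng.choices(candidates, weights=sizes + [alpha])[0]
        if k == len(sizes):
            sizes.append(1)  # open a new cluster
        else:
            sizes[k] += 1  # join an existing cluster
        labels.append(k)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    alpha = 2.0  # illustrative concentration parameter

    weights = stick_breaking_weights(alpha, num_sticks=50, rng=rng)
    labels = chinese_restaurant_process(alpha, num_points=200, rng=rng)

    print("mass covered by 50 sticks:", sum(weights))
    print("clusters used by 200 points:", len(set(labels)))
```

Rerunning the sketch with a larger `alpha` tends to spread the stick-breaking mass over more components and to open more clusters for the same number of points, which is the sense in which the concentration parameter governs the expected number of clusters.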
The Dirichlet process is not merely a mathematical curiosity. It is among the simplest models that capture the principle that a learning system should not commit to a fixed number of categories before seeing the data, a principle that applies to clustering, topic modeling, and any domain where the true complexity is unknown.