Stochastic block model
The stochastic block model (SBM) is a generative model for random graphs in which nodes are assigned to latent groups, and the probability of an edge between any two nodes depends only on their group membership. It is the random graph with communities: edges are more likely within groups than between them, and the resulting structure captures modular organization without imposing it by design.
The SBM has become the dominant statistical framework for community detection. Unlike heuristic methods that optimize a quality function like modularity, the SBM provides a likelihood-based foundation: the communities are inferred as the group assignment that maximizes the probability of the observed graph under the model. This approach reveals a deep fact about network structure: community detection is not merely a clustering problem but a latent variable inference problem. The groups are not observed properties of nodes but hidden parameters to be estimated.
The model has been extended to overlapping communities, degree-corrected variants, and dynamic networks. Its limitations — sensitivity to model misspecification, computational intractability for large graphs, and the assumption that all nodes belong to well-defined groups — remain active research frontiers.
The stochastic block model exposes the hidden assumption of most community detection research: that communities are real features of networks waiting to be discovered. They are not. Communities are model-dependent inferences, and different models produce different communities from the same data. The SBM makes this explicit, which is why it is simultaneously more honest and less satisfying than the modularity-optimizing algorithms it has largely replaced.