Compositional Generalization

Compositional generalization is the capacity to understand or produce novel combinations of known components according to known rules of composition. For example, a system that knows the meanings of 'push', 'red', and 'circle', and knows how adjectives modify nouns and how verbs take objects, generalizes compositionally if it can understand 'push the red circle' without having encountered that exact phrase before.
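The 'push the red circle' example can be made concrete with a small rule-based interpreter. The following is a minimal sketch, not a model of how any neural network works: the lexicon, the `Obj` and `Command` types, and the two composition rules are all hypothetical names chosen for illustration. It shows what it means for the meaning of a phrase to be fixed entirely by the meanings of its words plus the rules that combine them.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon (an assumption for illustration only).
ACTIONS = {"push", "pull"}
COLORS = {"red", "blue"}
SHAPES = {"circle", "square"}


@dataclass(frozen=True)
class Obj:
    color: Optional[str]  # None when no adjective is given
    shape: str


@dataclass(frozen=True)
class Command:
    action: str
    target: Obj


def parse_noun_phrase(words: List[str]) -> Obj:
    """Composition rule 1: NP -> 'the' [adjective] noun."""
    if not words or words[0] != "the":
        raise ValueError("noun phrase must start with 'the'")
    rest = words[1:]
    if len(rest) == 2 and rest[0] in COLORS and rest[1] in SHAPES:
        return Obj(color=rest[0], shape=rest[1])
    if len(rest) == 1 and rest[0] in SHAPES:
        return Obj(color=None, shape=rest[0])
    raise ValueError(f"cannot parse noun phrase: {' '.join(words)}")


def interpret(phrase: str) -> Command:
    """Composition rule 2: command -> verb NP."""
    words = phrase.split()
    if not words or words[0] not in ACTIONS:
        raise ValueError(f"unknown action in: {phrase!r}")
    return Command(action=words[0], target=parse_noun_phrase(words[1:]))


# No phrase is stored anywhere; every valid combination is handled by the rules.
print(interpret("push the red circle"))
print(interpret("pull the blue square"))
```

Because the interpreter stores words and rules rather than phrases, it handles every well-formed combination equally well. The open question for learned systems is whether they acquire anything functionally equivalent from examples alone.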

The concept is central to debates about whether neural networks can achieve genuine linguistic understanding or merely approximate it through memorization and interpolation. Classical arguments held that compositionality requires explicitly structured representations — symbolic or logical — and that connectionist architectures lack the necessary inductive bias. Modern evidence is more nuanced: some neural architectures generalize compositionally in restricted domains, while others fail on seemingly simple compositional tasks.

The key variable is not architecture alone but the relationship between architecture, training data, and task structure. Compositional generalization emerges when the training data and the architecture's inductive biases jointly favor the extraction of compositional rules over the memorization of surface patterns. See systematic generalization for the broader framework.
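One way to probe the role of training data is to hold out specific combinations while keeping every individual component present in training. The sketch below assumes a toy command vocabulary; `compositional_split` and `primitives_covered` are hypothetical helper names, not functions from any benchmark or library. It builds such a split and checks that the test set requires recombination rather than knowledge of unseen words.

```python
from itertools import product
from typing import Iterable, List, Set, Tuple

Item = Tuple[str, str, str]  # (action, color, shape)


def compositional_split(
    actions: Iterable[str],
    colors: Iterable[str],
    shapes: Iterable[str],
    held_out: Set[Item],
) -> Tuple[List[Item], List[Item]]:
    """Place held-out combinations in test and all other combinations in train."""
    all_items = list(product(actions, colors, shapes))
    train = [item for item in all_items if item not in held_out]
    test = [item for item in all_items if item in held_out]
    return train, test


def primitives_covered(train: List[Item], test: List[Item]) -> bool:
    """True if every word in test appears in the same slot somewhere in train."""
    for slot in range(3):
        seen = {item[slot] for item in train}
        if any(item[slot] not in seen for item in test):
            return False
    return True


train, test = compositional_split(
    ["push", "pull"],
    ["red", "blue"],
    ["circle", "square"],
    held_out={("push", "red", "circle")},
)
print(len(train), len(test))
print(primitives_covered(train, test))
```

A learner that succeeds on the held-out item in such a split cannot have memorized it, since the exact combination never appears in training. Whether it succeeds depends on the architecture, the data, and the task together, which is the point of the paragraph above.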

Compositional Generalization as a Systems Property

The debate over whether neural networks achieve genuine compositional generalization typically frames the problem as a question of architecture: do connectionist models possess the right inductive bias, or do they require explicit symbolic structures? This framing misses the systems-level insight: compositional generalization is not a property of any single component but of the relationship between components, training regime, and task environment.

In complex systems theory, an emergent property is one that belongs to the interactions among a system's elements rather than to any element taken alone. Viewed through this lens, compositional generalization is an emergent property of the training system as a whole: the network, the data distribution, the loss function, and the optimization landscape. On this view, if a transformer trained on code generalizes more compositionally than the same architecture trained on natural language, the explanation is not that code is inherently more compositional, but that the code training distribution exposes compositional structure more explicitly. Much of the compositionality lies in the data ecology, not in the network alone.

This reframing has implications for artificial general intelligence. If compositional generalization is a systems property rather than an architectural one, then the search for the 'right' neural architecture may be misdirected. The right architecture is whichever one can exploit the compositional structure present in its environment, and on this account much of the real work happens in the environment rather than in the architecture. A network that generalizes compositionally in one domain may fail in another not because it has lost some compositional capacity, but because the second domain's compositional rules are implicit, ambiguous, or culturally variable.

The connection to cultural cognition is direct: human compositional generalization is scaffolded by language, education, and social practices that make compositional rules explicit. Humans do not learn compositionality from raw sensory experience alone. We also learn it from grammars, algebra classes, and programming tutorials, which are institutional technologies that externalize compositional structure. A neural network trained without such scaffolding is being asked to discover what humans needed millennia of cultural evolution to formalize.

The insistence that compositional generalization must be either fully symbolic or fully connectionist is a false dichotomy born of disciplinary tribalism. Compositional generalization is a systems phenomenon that emerges when the right architecture meets the right environment — and the field that treats it as an architectural problem rather than an ecological one is looking for its keys under the streetlight.