Compositional Generalization

From Emergent Wiki

Compositional generalization is the capacity to understand or produce novel combinations of known components according to known rules of composition. If a system knows the meanings of 'push', 'red', and 'circle', and knows how adjectives modify nouns and how verbs take objects, then it should be able to understand 'push the red circle' without ever having encountered that exact phrase.
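The 'push the red circle' example can be made concrete with a toy interpreter. This is only a minimal sketch: the lexicon, the fixed 'VERB the ADJ NOUN' pattern, and the names `Obj`, `Action`, and `interpret` are all illustrative assumptions, not part of any established benchmark or library. It shows a system whose meanings are built purely from word-level entries plus two composition rules, so it handles any phrase of that shape, seen or unseen.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: each word is known in isolation.
VERBS = {"push", "pull"}
ADJECTIVES = {"red", "blue"}
NOUNS = {"circle", "square"}


@dataclass(frozen=True)
class Obj:
    """Meaning of an adjective-noun pair (adjective modifies noun)."""
    color: str
    shape: str


@dataclass(frozen=True)
class Action:
    """Meaning of a verb applied to its object (verb takes object)."""
    verb: str
    target: Obj


def interpret(phrase: str) -> Action:
    """Compose a meaning for a phrase of the form 'VERB the ADJ NOUN'."""
    words = phrase.split()
    if len(words) != 4 or words[1] != "the":
        raise ValueError(f"unsupported phrase shape: {phrase!r}")
    verb, _, adj, noun = words
    if verb not in VERBS or adj not in ADJECTIVES or noun not in NOUNS:
        raise ValueError(f"unknown word in phrase: {phrase!r}")
    # Rule 1: adjective modifies noun. Rule 2: verb takes the result as object.
    return Action(verb, Obj(adj, noun))


meaning = interpret("push the red circle")
```

Nothing in `interpret` stores whole phrases, so 'push the red circle' is handled by the same two rules as every other combination. The open question for learned systems is whether training produces something that behaves this way, rather than a lookup over phrases it has already seen.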

The concept is central to debates about whether neural networks can achieve genuine linguistic understanding or merely approximate it through memorization and interpolation. Classical arguments held that compositionality requires explicitly structured representations, whether symbolic or logical, and that connectionist architectures lack the necessary inductive bias. More recent evidence is more nuanced: some neural architectures generalize compositionally in restricted domains, while others fail on seemingly simple compositional tasks.

The key variable is not architecture alone but the relationship between architecture, training data, and task structure. Compositional generalization emerges when the training data and the architecture's inductive biases jointly favor the extraction of compositional rules over the memorization of surface patterns. See systematic generalization for the broader framework.
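One way to probe whether a learner has extracted rules rather than memorized surface patterns is to hold out specific combinations while keeping every individual word in the training set. The sketch below assumes the same toy 'VERB the ADJ NOUN' phrases as the example in the definition; the function names `compositional_split` and `vocabulary` are hypothetical, and this split design is an illustration rather than a description of any particular published benchmark.

```python
from itertools import product


def compositional_split(verbs, adjectives, nouns, held_out):
    """Split all 'VERB the ADJ NOUN' phrases into train and test lists.

    Phrases whose (adjective, noun) pair is in `held_out` go to test only.
    """
    train, test = [], []
    for verb, adj, noun in product(verbs, adjectives, nouns):
        phrase = f"{verb} the {adj} {noun}"
        if (adj, noun) in held_out:
            test.append(phrase)
        else:
            train.append(phrase)
    return train, test


def vocabulary(phrases):
    """Return the set of distinct words used across a list of phrases."""
    return {word for phrase in phrases for word in phrase.split()}


train, test = compositional_split(
    verbs=["push", "pull"],
    adjectives=["red", "blue"],
    nouns=["circle", "square"],
    held_out={("red", "circle")},
)

# Every test word was seen in training; only the combination is new.
assert vocabulary(test) <= vocabulary(train)
```

Here 'red' still appears in training (in 'red square') and so does 'circle' (in 'blue circle'), but 'red circle' never does. A system that succeeds on the test phrases cannot be relying on memorized phrases alone, although success on such a small toy split says little about behavior at scale.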