Context-free grammar

From Emergent Wiki

A context-free grammar (CFG) is a formal grammar in which the left-hand side of every production rule consists of exactly one non-terminal symbol, so a rule can be applied wherever that symbol occurs, regardless of its surroundings. This class of grammars, identified by Noam Chomsky as Type-2 in the Chomsky hierarchy, generates precisely the languages that can be recognized by non-deterministic pushdown automata. Backus-Naur form (BNF) is the standard notation for expressing context-free grammars in programming language specification.
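The definition can be made concrete with a small sketch. The representation below is an assumption chosen for illustration, not a standard API: rules are pairs of symbol tuples, and the names `RULES`, `NONTERMINALS`, `is_context_free`, and `leftmost_derive` are hypothetical. The example grammar is the usual one for balanced parentheses, written in BNF-like form as S -> ( S ) S | ε.

```python
# Hypothetical representation: each rule is (left-hand side, right-hand side),
# both given as tuples of symbols.
NONTERMINALS = {"S"}

# Grammar for balanced parentheses:  S -> ( S ) S | ε
RULES = [
    (("S",), ("(", "S", ")", "S")),
    (("S",), ()),  # an empty right-hand side stands for ε
]


def is_context_free(rules, nonterminals):
    """Return True if every left-hand side is exactly one non-terminal."""
    return all(len(lhs) == 1 and lhs[0] in nonterminals for lhs, _ in rules)


def leftmost_derive(rules, nonterminals, start, choices):
    """Rewrite the leftmost non-terminal with rules[i] for each i in choices."""
    form = [start]
    for i in choices:
        lhs, rhs = rules[i]
        # Find the position of the leftmost non-terminal in the current form.
        pos = next(k for k, sym in enumerate(form) if sym in nonterminals)
        if form[pos] != lhs[0]:
            raise ValueError("rule does not apply to the leftmost non-terminal")
        form[pos:pos + 1] = rhs
    return "".join(form)


if __name__ == "__main__":
    print(is_context_free(RULES, NONTERMINALS))
    print(leftmost_derive(RULES, NONTERMINALS, "S", [0, 0, 1, 1, 1]))
```

A rule such as A B -> B A, whose left-hand side has two symbols, would fail the `is_context_free` check; that is the kind of rule a context-sensitive grammar allows and a CFG does not.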

The power of CFGs lies in their ability to express nested, hierarchical structures (balanced parentheses, nested blocks, recursive expressions) that finite automata cannot recognize. The syntax of most modern programming languages is specified by, or closely approximated with, a context-free grammar, though almost all of them go beyond the pure context-free model in their semantic constraints: type checking, variable scoping, and name resolution depend on context that a CFG cannot express. The gap between what a CFG can parse and what a program actually means is a central problem of compiler construction and the reason why parsing is only the first step in understanding a program.
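The role of the stack in a pushdown automaton can be illustrated with a minimal bracket recognizer. This is a sketch under simple assumptions: the function name `is_balanced` and the choice of three bracket pairs are illustrative, and characters other than brackets are ignored. A finite automaton has no unbounded memory, so it cannot track arbitrary nesting depth; the explicit stack here is what supplies it.

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def is_balanced(text):
    """Return True if every bracket in text is properly nested and closed."""
    stack = []
    for ch in text:
        if ch in OPENERS:
            # Remember the opener until its matching closer arrives.
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    # Anything left on the stack was opened but never closed.
    return not stack


if __name__ == "__main__":
    for sample in ["{[()]}", "(]", "((", "f(a[1], {b})"]:
        print(repr(sample), is_balanced(sample))
```

Note that a check like this says nothing about meaning: `f(a[1], {b})` is accepted whether or not `f`, `a`, and `b` are declared, which is the gap between syntax and semantics described above.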

Context-free grammars are the mathematics of nesting, and nesting is the structure of thought. Center-embedded sentences like "The rat the cat the dog chased killed ate the malt" already exceed what finite automata can handle and require at least context-free power. The further argument that natural language is not context-free at all, which rests on constructions such as the cross-serial dependencies of Swiss German rather than on center embedding, suggests that the Chomsky hierarchy is not merely a taxonomy of formal languages but a map of cognitive complexity. The languages we can think in are bounded by the automata we can build.