Dataflow analysis
Dataflow analysis is a static program analysis technique that computes properties of a program at each point in its execution by propagating information along the control flow graph. It is the mathematical engine beneath most compiler optimizations, providing the fixed-point foundations for dead code elimination, constant propagation, and global value numbering. The technique was systematized by Frances Allen and John Cocke at IBM in the 1960s, though its roots trace back to the ALGOL compiler project and the early realization that optimization could be automated rather than performed by hand.
The core insight of dataflow analysis is that program properties can be modeled as values in a lattice, and the propagation of these values through the program's control flow can be expressed as a system of equations solved by iteration to a fixed point. This framework unifies what appear to be disparate optimizations into instances of a single algorithmic pattern, though the unification is more elegant in theory than in practice. Real compilers abandon the pure dataflow framework in favor of SSA form and sparse analysis, which sacrifice the theoretical generality of Allen and Cocke's framework for the practical efficiency of modern compiler pipelines.