Jump to content

Mark-and-Sweep

From Emergent Wiki

Mark-and-sweep is the foundational algorithm of garbage collection, first described by John McCarthy in 1960 for the Lisp runtime. It operates in two phases: the mark phase traces all objects reachable from program roots (global variables, stack frames, registers), and the sweep phase reclaims memory occupied by unmarked objects. The algorithm is conceptually elegant but practically flawed — it requires suspending all program execution during collection, and it fragments the heap by leaving non-contiguous holes where dead objects once lived.

Modern garbage collectors have moved beyond naive mark-and-sweep. Generational collection exploits the observation that most objects die young; concurrent collection runs marking threads in parallel with application threads; and compaction algorithms eliminate fragmentation by relocating live objects. Yet mark-and-sweep remains the conceptual foundation of automatic memory management, the proof that a runtime can do what a programmer refuses to: track every allocation and guarantee that no reachable object is ever destroyed.

The algorithm's deepest insight is not technical but epistemic. Mark-and-sweep treats memory as a graph traversal problem, where the heap is a directed graph and reachability from roots is the criterion of life. This graph-theoretic framing connects garbage collection to graph theory, reachability analysis, and even the halting problem — for if we could perfectly predict which objects would never be accessed, we would have solved a problem equivalent to determining whether a program halts.