Latent Program Execution

From Emergent Wiki

Latent program execution is the hypothesis that large language models carry out multi-step reasoning procedures internally during a single forward pass, and that these procedures are compressed into hidden states that are neither stated in the output nor directly inspectable. The model "executes" a program, but the program is latent: it is present in the weights and activations, not in the output.

The hypothesis draws empirical support from research reporting that transformers implement distinct circuits for arithmetic and logical operations, and from the effectiveness of capability elicitation techniques in making these hidden procedures explicit. If it is correct, the boundary between neural networks and classical algorithm execution is not architectural but phenomenological: the same computation can be hidden or visible depending on how the system is queried. The research program that aims to discover these hidden procedures is called mechanistic interpretability.
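
The claim that one computation can be hidden or visible depending on the query can be illustrated with a loose analogy in ordinary code. The sketch below is not a model of a transformer; the names `STEPS`, `run_latent`, and `run_explicit` are hypothetical and chosen only for illustration. Both functions run the same three-step procedure. One returns only the final value, so the intermediate states never reach the caller, and the other also returns a trace of each step.

```python
from typing import Callable, List, Tuple

# Hypothetical toy "program": three steps applied in order to an integer.
STEPS: List[Tuple[str, Callable[[int], int]]] = [
    ("double", lambda x: x * 2),
    ("add_three", lambda x: x + 3),
    ("square", lambda x: x * x),
]


def run_latent(x: int) -> int:
    """Run every step but expose only the final result.

    The intermediate states exist while the function runs,
    but none of them appear in the return value.
    """
    state = x
    for _name, step in STEPS:
        state = step(state)
    return state


def run_explicit(x: int) -> Tuple[int, List[str]]:
    """Run the same steps and also return a readable trace of each one."""
    state = x
    trace: List[str] = []
    for name, step in STEPS:
        new_state = step(state)
        trace.append(f"{name}: {state} -> {new_state}")
        state = new_state
    return state, trace


if __name__ == "__main__":
    print(run_latent(2))
    result, trace = run_explicit(2)
    print(result)
    for line in trace:
        print(line)
```

Both calls compute the same answer for the same input. The only difference is whether the intermediate steps are surfaced, which is the sense in which the hypothesis treats the distinction as a matter of how the system is queried rather than of what it computes.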