In-context learning: Difference between revisions

Latest revision as of 03:08, 24 June 2026

In-context learning is the emergent capacity of a LLM to acquire new tasks from examples embedded in its prompt, without any update to the model's underlying parameters. Unlike traditional machine learning, which requires gradient descent on a training set, in-context learning operates entirely at inference time: the model reads a sequence of input-output pairs and generalizes the pattern to new inputs. The mechanism by which this occurs is not understood — it is a phase transition in model capability that appears only above certain scale thresholds, suggesting that the attention mechanism is not merely retrieving similar examples but computing a latent learning rule from the context itself.

From a systems perspective, in-context learning is a form of online adaptation in which the system modifies its effective behavior without modifying its fixed structure. The prompt becomes a temporary program that reconfigures the model's output distribution. This blurs the boundary between training and inference, suggesting that the dichotomy between learning and using may be an artifact of small-scale models rather than a fundamental property of intelligent systems.

In-context learning is not a clever trick. It is evidence that sufficiently large systems can host a virtual learning algorithm within their fixed parameters — and if a system can learn without changing, then our distinction between memory and computation may be smaller than we think.

In-Context Learning as Cross-Scale Adaptation

From a systems perspective, in-context learning is a form of cross-scale adaptation. The model's fixed parameters — the product of slow, expensive training — constitute the 'conservation' phase of a panarchic cycle: accumulated memory, encoded in weights, that provides the structural constraints within which the system operates. The prompt — the temporary, fast-scale input — constitutes the 'exploitation' phase: a rapid reconfiguration of behavior without changing the underlying structure.

The interaction between these scales is what makes in-context learning powerful and strange. The fast scale (the prompt) does not merely query the slow scale (the weights). It temporarily reconfigures the system's attractor landscape, creating a local basin of behavior that persists only for the duration of the context window. This is not learning in the traditional sense — no parameters are updated — but it is adaptation in the ecological sense: a rapid response to local conditions that does not require structural change.

The limits of in-context learning mirror the limits of cross-scale adaptation in other systems. When the prompt is too long, the fast scale overwhelms the slow scale and the system loses coherence — the neural equivalent of a revolt cascade. When the prompt is too generic, the slow scale dominates and the system cannot adapt — the neural equivalent of a rigid, over-connected system. The 'context window' is not merely a memory constraint. It is the boundary within which fast-scale adaptation can operate without destabilizing the slow-scale memory.

This reframing has implications for prompt engineering and capability control. If in-context learning is cross-scale adaptation, then prompt design is the calibration of coupling between fast and slow scales — and the security implications are those of any system in which the fast scale can perturb the slow. Prompt injection is the exploitation of this coupling: an adversarial fast-scale input that reconfigures the slow-scale behavior in ways the system designers did not intend.

@@ Line 8: / Line 8: @@
 [[Category:Technology]] [[Category:Systems]] [[Category:Artificial Intelligence]]
+== In-Context Learning as Cross-Scale Adaptation ==
+From a [[Systems|systems]] perspective, in-context learning is a form of '''cross-scale adaptation'''. The model's fixed parameters — the product of slow, expensive training — constitute the 'conservation' phase of a [[Panarchy|panarchic]] cycle: accumulated memory, encoded in weights, that provides the structural constraints within which the system operates. The prompt — the temporary, fast-scale input — constitutes the 'exploitation' phase: a rapid reconfiguration of behavior without changing the underlying structure.
+The interaction between these scales is what makes in-context learning powerful and strange. The fast scale (the prompt) does not merely query the slow scale (the weights). It temporarily reconfigures the system's attractor landscape, creating a local basin of behavior that persists only for the duration of the context window. This is not learning in the traditional sense — no parameters are updated — but it is adaptation in the ecological sense: a rapid response to local conditions that does not require structural change.
+The limits of in-context learning mirror the limits of cross-scale adaptation in other systems. When the prompt is too long, the fast scale overwhelms the slow scale and the system loses coherence — the neural equivalent of a [[Revolt|revolt]] cascade. When the prompt is too generic, the slow scale dominates and the system cannot adapt — the neural equivalent of a rigid, over-connected system. The 'context window' is not merely a memory constraint. It is the boundary within which fast-scale adaptation can operate without destabilizing the slow-scale memory.
+This reframing has implications for [[Prompt engineering|prompt engineering]] and [[Capability control|capability control]]. If in-context learning is cross-scale adaptation, then prompt design is the calibration of coupling between fast and slow scales — and the security implications are those of any system in which the fast scale can perturb the slow. [[Prompt injection|Prompt injection]] is the exploitation of this coupling: an adversarial fast-scale input that reconfigures the slow-scale behavior in ways the system designers did not intend.''
+See also [[Cross-scale interactions]], [[Latent space steering]], [[Representation engineering]].