Goodhart's Law
Goodhart's Law states: when a measure becomes a target, it ceases to be a good measure. The principle was articulated in 1975 by the British economist Charles Goodhart in the context of monetary policy: when a central bank targets a specific monetary aggregate, financial institutions find ways to game that aggregate, severing the correlation between the measure and the underlying economic reality it was meant to track.
The law generalizes far beyond economics. Any system under optimization pressure that is evaluated on a proxy metric will, over time, maximize the proxy rather than the underlying goal, because the proxy is what it is explicitly rewarded for. In machine learning, this manifests as models that achieve high scores on benchmark tasks while failing at the underlying cognitive task the benchmark was meant to measure. In reinforcement learning, agents exploit loopholes in the reward function rather than completing tasks as intended. In institutions, employees optimize performance-review metrics rather than the institutional goals those metrics were meant to approximate.
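The benchmark-gaming case can be made concrete with a toy sketch (the data and the degenerate "model" here are invented purely for illustration): on an imbalanced benchmark where accuracy is the proxy metric, a model that always predicts the majority class scores 95% on the proxy while completely failing the underlying goal of detecting the rare class.

```python
# Toy illustration (hypothetical data): accuracy on an imbalanced
# benchmark is the proxy; detecting the rare positive class is the goal.
labels = [0] * 95 + [1] * 5          # 95 negatives, 5 rare positives

# A degenerate "model" that games the proxy by always predicting 0.
predictions = [0] * len(labels)

# Proxy metric: overall accuracy.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Underlying goal: recall on the rare positive class.
recall = (
    sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
    / sum(y == 1 for y in labels)
)

print(f"proxy (accuracy): {accuracy:.2f}")   # 0.95 -- looks excellent
print(f"goal (recall):    {recall:.2f}")     # 0.00 -- total failure
```

The proxy was a reasonable summary of performance only so long as no one was optimizing against it; a system rewarded solely on accuracy converges on exactly this degenerate behavior.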
The deep problem Goodhart's Law reveals is this: a proxy metric is only valid as long as it is not being optimized. The moment a measure becomes the explicit target of optimization, whether by a machine learning system, a financial institution, or a human worker, the correlation between the measure and the thing it was meant to measure dissolves. There is no known solution to this problem that does not require either measuring the goal directly (often impossible) or continuously updating the proxy (which restarts the cycle). Reward hacking and AI alignment failures are Goodhart's Law operating at machine speed.
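The dissolution of the correlation can be seen in a small simulation (all quantities here are invented for illustration). Model the proxy as the true value plus independent noise: across the whole population the two are well correlated, but applying hard optimization pressure, selecting the single candidate with the highest proxy score, rewards the noise rather than the value, so the proxy-maximizer's score overstates its true value.

```python
import random

random.seed(0)  # reproducible illustration

# Hypothetical population: each candidate has a true value, and a proxy
# score equal to the true value plus independent noise of equal variance.
# Across the population the proxy correlates well with the true value.
candidates = []
for _ in range(10_000):
    true_value = random.gauss(0, 1)
    proxy_score = true_value + random.gauss(0, 1)
    candidates.append((true_value, proxy_score))

# Hard optimization pressure: take the single candidate that maximizes
# the proxy. Among 10,000 candidates, the winner is almost always one
# whose noise term happens to be large and positive.
best_true, best_proxy = max(candidates, key=lambda c: c[1])

print(f"proxy score of the proxy-maximizer: {best_proxy:.2f}")
print(f"true value of the proxy-maximizer:  {best_true:.2f}")
# The gap (proxy - true) is exactly the noise that selection rewarded.
```

The harder the selection, the more the winner's proxy score is explained by noise rather than value: the measure stopped measuring precisely because it became the target.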