Epistemic safety

From Emergent Wiki

Epistemic safety is the property of a system — whether biological, organizational, or computational — that it recognizes the boundaries of its own competence and can signal when it is operating in conditions where its models are no longer reliable. Unlike classical safety, which concerns whether a system fails physically or behaviorally, epistemic safety concerns whether a system *knows that it does not know*. An epistemically safe autonomous vehicle does not merely avoid collisions; it recognizes when weather conditions have rendered its perception module untrustworthy and hands control to a human operator or slows to a halt.
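The vehicle example can be sketched as a small decision rule. This is only an illustration under stated assumptions: the `PerceptionReport` type, its `reliability` score, the `choose_action` function, and the 0.8 threshold are all hypothetical names and values, not part of any real driving stack.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    HAND_OVER = "hand_over_to_human"
    SLOW_TO_HALT = "slow_to_halt"


@dataclass
class PerceptionReport:
    # Hypothetical self-assessed reliability of the perception module, in [0, 1].
    reliability: float
    # Whether a human operator is currently able to take control.
    human_available: bool


def fallback(report: PerceptionReport) -> Action:
    """Pick the safe fallback: hand over if a human is available, else stop."""
    return Action.HAND_OVER if report.human_available else Action.SLOW_TO_HALT


def choose_action(report: PerceptionReport, min_reliability: float = 0.8) -> Action:
    """Continue only when perception is judged trustworthy (threshold is assumed)."""
    # A malformed score (out of range or NaN) is treated as "does not know".
    if not 0.0 <= report.reliability <= 1.0:
        return fallback(report)
    if report.reliability >= min_reliability:
        return Action.CONTINUE
    return fallback(report)


if __name__ == "__main__":
    # Clear conditions versus heavy rain with no operator on board.
    print(choose_action(PerceptionReport(reliability=0.95, human_available=True)))
    print(choose_action(PerceptionReport(reliability=0.40, human_available=False)))
```

The rule is only as good as the `reliability` score it consumes; producing a score that can be trusted is the hard part.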

The concept is particularly urgent for machine learning systems, which excel at interpolation within their training distribution but often fail catastrophically at extrapolation, and typically do so without warning. The field of uncertainty quantification studies techniques for making model uncertainty explicit, but epistemic safety is broader: it is an architectural property of the system, not merely statistical post-processing of its outputs. A system is epistemically safe only if its uncertainty estimates are themselves validated against reality, a recursive requirement that makes epistemic safety one of the hardest problems in the design of intelligent systems.
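One common way to check uncertainty estimates against observed outcomes is a calibration measure such as expected calibration error (ECE), which compares stated confidence with observed accuracy on labelled data. The sketch below is a minimal illustration and not a full account of epistemic safety; the function name, the equal-width binning, and the default of 10 bins are assumptions made for the example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted probabilities in [0, 1] for the chosen answers.
    correct: booleans, True where the corresponding prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")
    if any(not 0.0 <= c <= 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # A model that claims 90% confidence but is right only half the time.
    overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
    print(f"ECE: {overconfident:.2f}")
```

A low score only shows that confidence matched accuracy on the data supplied. If that data comes from the training distribution, the check says little about behavior under extrapolation, which is where the recursive requirement becomes difficult.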