Latency-critical systems

Latency-critical systems are computational systems in which the time between a stimulus and a response is not merely a performance metric but a correctness condition. In general-purpose computing, lower latency is preferable but a slow answer is still an answer; a latency-critical system fails when its timing constraints are violated. An edge computing node that detects an obstacle in an autonomous vehicle's path must respond within milliseconds; a hundred milliseconds of delay can turn a safe stop into a collision. The system does not degrade gracefully. It fails categorically.
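To make the stakes concrete, the following minimal sketch works through the arithmetic. The vehicle speed and the 10 ms response budget are hypothetical figures chosen for illustration, not values taken from any particular system, and the function names are invented here.

```python
def distance_before_response_m(speed_m_per_s: float, latency_s: float) -> float:
    """Distance covered at constant speed before the system has reacted at all."""
    return speed_m_per_s * latency_s


def response_is_correct(latency_s: float, deadline_s: float) -> bool:
    """In a latency-critical system, a late answer counts as a wrong answer."""
    return latency_s <= deadline_s


# Hypothetical figures, chosen only for illustration.
SPEED_M_PER_S = 30.0   # about 108 km/h
DEADLINE_S = 0.010     # assumed 10 ms response budget

for latency_s in (0.005, 0.100):
    travelled = distance_before_response_m(SPEED_M_PER_S, latency_s)
    ok = response_is_correct(latency_s, DEADLINE_S)
    print(f"latency {latency_s * 1000:.0f} ms: {travelled:.2f} m travelled, correct={ok}")
```

Under these assumed figures, a 100 ms delay corresponds to roughly 3 m of travel before braking even begins. The check is deliberately a boolean and not a score: there is no partial credit for a response that arrives late.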

The design of latency-critical systems requires abandoning many of the abstractions that make general computing tractable. Garbage collection, demand-paged virtual memory, and time-sharing schedulers tuned for throughput or fairness introduce timing variability that is invisible to most applications but catastrophic to control loops. These systems often require deterministic networking, meaning protocols that guarantee bounded transmission delays, and hard real-time operating systems whose scheduling overhead is itself bounded, so that tasks with analyzed worst-case execution times can be shown to meet their deadlines.
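One classical illustration of what a provable scheduling guarantee looks like is the Liu and Layland utilization bound for rate-monotonic scheduling. The text above does not name a specific analysis, so this is an illustrative choice. The sketch assumes independent periodic tasks on a single processor, deadlines equal to periods, and worst-case execution times that have already been established by some separate analysis. The task set is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    wcet_ms: float    # worst-case execution time, assumed already established
    period_ms: float  # activation period; deadline assumed equal to the period


def utilization(tasks: list[Task]) -> float:
    """Fraction of processor time the task set demands in the worst case."""
    return sum(t.wcet_ms / t.period_ms for t in tasks)


def rm_bound(n: int) -> float:
    """Liu-Layland utilization bound for n tasks: n * (2^(1/n) - 1)."""
    if n < 1:
        raise ValueError("need at least one task")
    return n * (2 ** (1 / n) - 1)


def guaranteed_by_rm_bound(tasks: list[Task]) -> bool:
    """Sufficient, not necessary: True proves all deadlines are met under
    rate-monotonic priorities; False only means this test proves nothing."""
    return utilization(tasks) <= rm_bound(len(tasks))


# Hypothetical control-loop task set, for illustration only.
tasks = [Task(wcet_ms=1, period_ms=10),
         Task(wcet_ms=2, period_ms=20),
         Task(wcet_ms=5, period_ms=50)]

print(f"utilization = {utilization(tasks):.3f}, bound = {rm_bound(len(tasks)):.3f}")
print("guaranteed:", guaranteed_by_rm_bound(tasks))
```

The whole guarantee rests on the `wcet_ms` inputs. If those numbers are measurements and not proven bounds, the output of the test is no longer a proof.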

The latency-critical systems community has spent decades designing for worst-case behavior, but the emergence of machine learning at the edge is forcing a confrontation with a deeper problem: neural network inference, as it is typically deployed, comes with no proven worst-case execution time. We are building control systems on substrates whose timing behavior is empirically observed but not analytically bounded. This is not merely an engineering challenge; it is an epistemic crisis.
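The gap between observing and bounding can be seen in a small measurement sketch. The function `stand_in_inference` is a hypothetical placeholder for a model's forward pass, not a real inference call; the point is only what the resulting numbers can and cannot tell us.

```python
import statistics
import time


def sample_latencies_ns(fn, runs: int) -> list[int]:
    """Time `runs` calls of fn with a monotonic high-resolution clock."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples: list[int]) -> dict:
    """The maximum here is the worst case seen so far, not the worst case."""
    return {
        "median_ns": statistics.median(samples),
        "max_observed_ns": max(samples),
    }


def stand_in_inference() -> None:
    # Hypothetical placeholder workload standing in for model inference.
    sum(i * i for i in range(1000))


samples = sample_latencies_ns(stand_in_inference, runs=1000)
print(summarize(samples))
```

However many runs are collected, `max_observed_ns` is only a lower bound on the true worst case: one more run can always exceed it. A schedulability proof needs an upper bound, and measurement alone never supplies one.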