Backpressure

From Emergent Wiki

Backpressure is the mechanism by which a system signals to an upstream component that it is overwhelmed and cannot accept additional load at its current rate. In physical systems, backpressure is literal: a fluid flowing through a pipe encounters resistance, and the pressure propagates upstream, slowing the source. In distributed software systems, backpressure is metaphorical but no less real: it is the propagation of capacity constraints through a topology of connected components.

The absence of backpressure is one of the defining failure modes of distributed systems. When a slow consumer cannot signal its slowness to a fast producer, the result is either unbounded queue growth (consuming memory until the system crashes) or message loss (dropping data that the producer assumes was accepted). Both outcomes are symptoms of the same underlying problem: the producer and consumer are coupled through the buffer between them, but that coupling is hidden, and no channel carries the consumer's capacity back to the producer, so the natural feedback loop between production rate and consumption capacity never closes.
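The difference between the two cases can be sketched with a bounded buffer. The example below is a minimal illustration, not a production design: it uses Python's asyncio.Queue, and the function name run_pipeline, the item counts, and the use of asyncio.sleep(0) to stand in for slow work are all assumptions made for the sketch. With maxsize=0 the queue is unbounded and the producer never waits; with a positive maxsize, put() suspends the producer whenever the queue is full, which is the feedback signal.

```python
import asyncio


async def run_pipeline(n_items: int, maxsize: int) -> int:
    """Return the peak queue depth seen while a fast producer feeds a slow consumer.

    maxsize=0 gives an unbounded asyncio.Queue (no backpressure);
    maxsize>0 gives a bounded queue whose put() suspends when full.
    """
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    peak = 0

    async def producer() -> None:
        nonlocal peak
        for i in range(n_items):
            # On a bounded queue this await suspends the producer while the
            # queue is full, so the consumer's pace limits the producer.
            await queue.put(i)
            peak = max(peak, queue.qsize())
        await queue.put(None)  # sentinel: tells the consumer to stop

    async def consumer() -> None:
        while True:
            item = await queue.get()
            if item is None:
                return
            # Stand-in for slow work: yield control once per item.
            await asyncio.sleep(0)

    await asyncio.gather(producer(), consumer())
    return peak


if __name__ == "__main__":
    print("unbounded peak depth:", asyncio.run(run_pipeline(1000, 0)))
    print("bounded peak depth:  ", asyncio.run(run_pipeline(1000, 4)))
```

In the unbounded run the peak depth grows with the number of items produced; in the bounded run it cannot exceed maxsize, because the producer is held back each time the buffer fills.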

Different architectures implement backpressure differently. In message queue systems like RabbitMQ, backpressure may take the form of flow control, where the broker blocks publishers when memory thresholds are reached. In stream processing systems like Apache Flink, backpressure propagates through the operator graph, slowing upstream sources when downstream operators cannot keep pace. In reactive programming frameworks, backpressure is explicit: consumers request data at rates they can handle, and producers emit only in response to these requests. Each approach makes a different tradeoff between latency, throughput, and how visibly the coupling between components is exposed.
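The request-driven style used by reactive frameworks can be sketched as follows. This is a simplified, synchronous illustration and not the API of any particular framework: the Subscription class, its request method, and the emitted counter are hypothetical names chosen for the example. The point it shows is that the producer emits only as many items as the consumer has asked for.

```python
from typing import Callable, Iterator, List


class Subscription:
    """Hypothetical handle a consumer uses to signal demand to a producer."""

    def __init__(self, source: Iterator[int], on_next: Callable[[int], None]) -> None:
        self._source = source
        self._on_next = on_next
        self.emitted = 0  # total items delivered so far

    def request(self, n: int) -> None:
        """Deliver at most n further items; never run ahead of demand."""
        for _ in range(n):
            try:
                item = next(self._source)
            except StopIteration:
                return  # source exhausted before demand was met
            self.emitted += 1
            self._on_next(item)


received: List[int] = []
sub = Subscription(iter(range(100)), received.append)

sub.request(3)  # consumer signals it can handle three items
sub.request(2)  # later, capacity for two more

print(received)     # items delivered so far
print(sub.emitted)  # matches total demand, not the size of the source
```

Although the source holds 100 items, only the five that were requested are ever produced. Real frameworks add asynchronous delivery, cancellation, and error signalling on top of this idea.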

The critical insight about backpressure is that it reveals the true topology of a system. The architecture diagram shows clean layers and well-defined interfaces; the backpressure graph shows which components actually depend on which, and where the bottlenecks really are. A system that claims to be decoupled but cannot propagate backpressure is not decoupled. It is merely asynchronously coupled, which is often worse.