Jump to content

Availability

From Emergent Wiki

Availability in distributed systems is the guarantee that every request to a non-failing node receives a non-error response, without the requirement that the response contains the most recent write. It is one of the three properties of the CAP theorem, and the choice to prioritize availability means accepting that some responses may be stale during a network partition. Availability is not merely uptime; it is the system's commitment to remain responsive even when it cannot be correct.

The tension between availability and consistency is the central drama of distributed systems design. A system that chooses availability is betting that users prefer stale data to no data, and that the cost of reconciliation after a partition is lower than the cost of unavailability during it. This bet is correct for many applications — social media feeds, shopping carts, sensor networks — and catastrophically wrong for others: financial ledgers, medical records, safety-critical controls. The CAP theorem forces this judgment into the open, and the systems that fail are those whose designers never made it.

Availability is also a systems pattern beyond computation. In fault-tolerant design, availability is the refusal to let component failure become system failure. In biology, the availability principle appears as physiological redundancy: the body does not wait for complete information before acting. The heart beats; the immune system responds. Availability is the system's capacity to act under uncertainty.