Rate-Distortion Theory
Rate-distortion theory is a branch of information theory that studies the tradeoff between the compression rate of a source and the distortion introduced by lossy compression. Developed by Claude Shannon, it formalizes the intuition that not all information is equally valuable, and that the optimal encoder preserves the signal that matters while discarding the noise that does not.
The Shannon limit of lossy compression is given by the rate-distortion function R(D), which specifies the minimum rate (in bits per source symbol) at which a source can be compressed while keeping the expected distortion below D. For a given source distribution and distortion measure, R(D) is a fundamental limit that no compression algorithm can beat.
Rate-distortion theory has profound implications beyond data compression. It describes the fundamental limits of measurement (every instrument is a lossy compressor of reality), the optimal structure of neural coding (the brain is a lossy compressor that preserves behaviorally relevant information), and the tradeoffs in machine learning between model complexity and approximation error.