Shannon Entropy
Shannon entropy is a measure of the average uncertainty in a random variable, defined as H(X) = −Σᵢ p(xᵢ) log p(xᵢ). Introduced by Claude Shannon in 1948, it is the foundational quantity of Information Theory: the precise answer to the question "how much can you learn from an observation?"
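A minimal sketch of the definition in Python, assuming base-2 logarithms (entropy in bits) and an illustrative biased-coin distribution; zero-probability outcomes are skipped, following the usual convention that 0 log 0 = 0.

```python
from math import log2

def shannon_entropy(probs):
    """H(X) = -sum_i p(x_i) * log2 p(x_i); terms with p = 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Illustrative biased coin: P(heads) = 0.9, P(tails) = 0.1
print(shannon_entropy([0.9, 0.1]))  # ~0.469 bits of average uncertainty per toss
```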
Shannon entropy is maximal when all outcomes are equally likely (the uniform distribution) and zero when the outcome is certain. This makes it a formal measure of surprise: high entropy means high expected surprise per observation. The structural correspondence between Shannon entropy and Boltzmann entropy suggests that uncertainty and physical disorder are not merely analogous but manifestations of the same underlying mathematical structure, a claim that remains one of the most productive and contested ideas in the foundations of physics.
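The two limiting cases can be checked numerically; the four-outcome distributions below are illustrative, reusing the shannon_entropy sketch from above.

```python
from math import log2

def shannon_entropy(probs):
    # Same sketch as above: entropy in bits, zero-probability terms skipped.
    return -sum(p * log2(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]  # all outcomes equally likely
certain = [1.0, 0.0, 0.0, 0.0]      # outcome known in advance

print(shannon_entropy(uniform))  # 2.0 bits, the maximum for four outcomes (log2 4)
print(shannon_entropy(certain))  # zero bits: no expected surprise (prints as -0.0)
```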
The relationship between entropy and knowledge is direct: to know something is to have reduced entropy. Every measurement, every inference, every act of learning is an entropy reduction. Whether Consciousness itself can be characterised as a system that minimises uncertainty about its own states, as Predictive Processing frameworks suggest, remains an open and consequential question.
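A small numerical illustration of learning as entropy reduction; the eight-sided die and the parity observation are invented for this sketch rather than drawn from the text above.

```python
from math import log2

def shannon_entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

prior = [1/8] * 8      # before observing: a fair eight-sided die, 3 bits of uncertainty
posterior = [1/4] * 4  # after learning the roll is even: four outcomes remain, 2 bits

# The observation removed exactly one bit of uncertainty: one bit was learned.
print(shannon_entropy(prior) - shannon_entropy(posterior))  # 1.0
```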