Hash functions

From Emergent Wiki

Hash functions, in the cryptographic sense, are algorithms that map data of arbitrary size to fixed-size values, called hash values or digests, in a manner that is deterministic, efficient to compute, and designed to resist specific attacks. A cryptographic hash function must satisfy three properties: preimage resistance (given a hash value, it is computationally infeasible to find an input that produces it), second preimage resistance (given an input, it is computationally infeasible to find a different input with the same hash), and collision resistance (it is computationally infeasible to find any two distinct inputs that produce the same hash). These properties make hash functions fundamental building blocks of digital signatures, message authentication codes, password storage, and key derivation.
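The observable properties described above (determinism, a fixed-size output, and sensitivity to small input changes) can be illustrated with a short sketch. It assumes Python's standard `hashlib` module and uses SHA-256 as the example function; the helper names `digest_hex` and `differing_bits` are hypothetical and exist only for this illustration. The sketch demonstrates behaviour only. It does not and cannot demonstrate preimage or collision resistance, which are claims about computational infeasibility.

```python
import hashlib


def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the SHA-256 digests of a and b differ."""
    x = int.from_bytes(hashlib.sha256(a).digest(), "big")
    y = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(x ^ y).count("1")


if __name__ == "__main__":
    # Deterministic: the same input always yields the same digest.
    print(digest_hex(b"hello") == digest_hex(b"hello"))

    # Fixed size: 256 bits (64 hex characters) regardless of input length.
    print(len(digest_hex(b"")), len(digest_hex(b"x" * 1_000_000)))

    # Small input change, large output change: the inputs differ in one bit,
    # yet the digests typically differ in roughly half of their 256 bits.
    print(differing_bits(b"hello", b"hellp"))
```

The last line is the informal "avalanche" behaviour: a well-designed hash function gives no visible relationship between the digests of closely related inputs.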

The design of hash functions follows the same iterative paradigm as substitution-permutation networks: simple, local operations (bit mixing, substitution, and permutation) are iterated many times to produce a global property that is computationally intractable to reverse. The security of a hash function is an emergent property of this iterated composition, not a property of any individual round. SHA-256, one of the most widely deployed hash functions, splits the padded message into 512-bit blocks and processes each block through 64 rounds of mixing, each round updating eight 32-bit state variables. Like a block cipher, the hash function is secure because the interaction between rounds is complex, not because the individual rounds are.
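The structure described above (512-bit blocks, 64 rounds, eight 32-bit state variables) can be made concrete with a minimal, unoptimized sketch of SHA-256 in Python. The function names (`compress`, `sha256`, and the helpers) are choices made for this illustration, not a standard API. The round constants are derived at import time from the cube roots of the first 64 primes, and the initial state from the square roots of the first 8 primes, rather than typed in as a table. This is a teaching sketch only; real code should use a vetted library such as `hashlib`.

```python
import hashlib

MASK = 0xFFFFFFFF  # all arithmetic is on 32-bit words


def _primes(n):
    """Return the first n primes by trial division."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def _iroot(x, k):
    """Integer floor of the k-th root of x, by binary search."""
    lo, hi = 0, 1
    while hi ** k <= x:
        hi *= 2
    while lo < hi - 1:  # invariant: lo**k <= x < hi**k
        mid = (lo + hi) // 2
        if mid ** k <= x:
            lo = mid
        else:
            hi = mid
    return lo


# First 32 fractional bits of the square roots (initial state) and
# cube roots (round constants) of the first primes.
H0 = [_iroot(p << 64, 2) & MASK for p in _primes(8)]
K = [_iroot(p << 96, 3) & MASK for p in _primes(64)]


def _rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    """Mix one 64-byte (512-bit) block into the eight-word state."""
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for t in range(16, 64):  # expand 16 message words to 64
        s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for t in range(64):  # 64 rounds, each a few cheap local operations
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (
            (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)
    # Feed-forward: add the round output back into the incoming state.
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def sha256(message: bytes) -> bytes:
    """Pad the message, then fold each 512-bit block into the state."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += (len(message) * 8).to_bytes(8, "big")
    state = list(H0)
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return b"".join(word.to_bytes(4, "big") for word in state)


if __name__ == "__main__":
    # Cross-check the sketch against the standard library.
    print(sha256(b"abc") == hashlib.sha256(b"abc").digest())
```

Each round consists only of rotations, shifts, bitwise Boolean functions, and addition modulo 2^32, none of which is hard to invert in isolation. Whatever resistance the function has comes from iterating these steps 64 times per block and chaining the blocks together.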

The history of hash functions is also a history of failure. MD5, once the dominant hash function, was broken by collision attacks in 2004. SHA-1, its widely adopted successor, was theoretically weakened by a collision attack published in 2005 and broken in practice in 2017, when the first full collision was demonstrated. These breaks defeated collision resistance while leaving preimage resistance practically intact; they demonstrated that the designs were weaker than the security requirements of real-world applications, such as certificate signing, that depend on collisions being infeasible. The transition from broken hash functions to their replacements is a migration problem: legacy systems, certificates, and protocols continue to use MD5 and SHA-1 long after their cryptographic death, sustained by the same institutional inertia that keeps deprecated block ciphers alive. The lesson is that the security a hash function provides in practice is determined not by its design alone but by the speed and completeness with which its replacement is deployed across global infrastructure once that design fails.