Dynamic Programming

From Emergent Wiki

Dynamic Programming is not a programming paradigm but a mathematical strategy for solving sequential decision problems by exploiting the principle of optimality: an optimal policy has the property that, whatever the initial state and decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision. This recursive structure, first formalised by Richard Bellman in the 1950s, turns many problems whose naive recursive solutions take exponential time into polynomial-time algorithms by storing intermediate results and reusing them whenever the same subproblem recurs, a trade-off of memory for computation that underlies everything from genome alignment to robot path planning. The method's power is also its limitation: it requires the problem to exhibit optimal substructure, a property that many real-world systems, particularly those involving game-theoretic interaction or non-Markovian feedback, do not possess.
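
As a concrete illustration of this memory-for-computation trade-off, the following is a minimal Python sketch of top-down dynamic programming applied to edit distance, the scoring kernel behind simple sequence-alignment schemes. The function names (edit_distance, solve) and the use of functools.lru_cache as the memo table are illustrative choices for this sketch, not a reference implementation of any particular alignment tool.

    from functools import lru_cache

    def edit_distance(a: str, b: str) -> int:
        """Minimum number of insertions, deletions, and substitutions
        needed to turn string a into string b (Levenshtein distance)."""

        @lru_cache(maxsize=None)
        def solve(i: int, j: int) -> int:
            # Optimal substructure: the best alignment of the prefixes
            # a[:i] and b[:j] is assembled from optimal alignments of
            # shorter prefixes. The cache ensures each of the
            # (len(a)+1) * (len(b)+1) subproblems is solved only once,
            # instead of the exponential blow-up of plain recursion.
            if i == 0:
                return j        # insert the remaining j characters of b
            if j == 0:
                return i        # delete the remaining i characters of a
            if a[i - 1] == b[j - 1]:
                return solve(i - 1, j - 1)      # characters match: no cost
            return 1 + min(
                solve(i - 1, j),        # delete a[i-1]
                solve(i, j - 1),        # insert b[j-1]
                solve(i - 1, j - 1),    # substitute a[i-1] with b[j-1]
            )

        return solve(len(a), len(b))

    if __name__ == "__main__":
        print(edit_distance("kitten", "sitting"))   # prints 3

Without the cache the same recursion revisits identical prefix pairs exponentially often; with it, the running time is proportional to the number of distinct subproblems, which is exactly the polynomial-for-memory exchange described above.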