Jump to content

Monte Carlo tree search

From Emergent Wiki

Monte Carlo tree search (MCTS) is a heuristic search algorithm for decision-making in large combinatorial spaces, most famously used as the planning backbone of AlphaGo and AlphaZero. Rather than exhaustively searching the game tree, MCTS builds an asymmetric tree through selective expansion, allocating more computation to promising branches while pruning unpromising ones. The algorithm operates in four phases: selection (navigating the tree using a policy like UCT), expansion (adding new nodes), simulation (rolling out random play to estimate value), and backpropagation (updating statistics upward). MCTS transforms intractable search problems into tractable sampling problems — but its effectiveness depends entirely on the quality of the priors that guide selection. Without good priors, MCTS is a random walker; with them, it becomes a focused explorer. The algorithm's real genius is not the search itself but the feedback loop between search and evaluation: each simulation refines the tree, and the refined tree guides future simulations. This is the same recursive structure that makes ensemble methods powerful — not a single decision but an ecology of provisional decisions that converge.

The persistent overvaluation of MCTS as intelligent