Problem Distribution

Problem distribution is the structure, or more precisely the probability distribution, over the space of problems that a learning algorithm faces. It is the hidden variable that largely determines how well a learner performs, and it is the reason the No Free Lunch theorem is not a practical death sentence for machine learning. The theorem says that all algorithms achieve the same off-training-set performance when averaged uniformly over all possible problems. Real learners do not face uniform distributions. They face skewed distributions in which a few problems are common and most almost never occur, so an algorithm whose assumptions match the skew can outperform one whose assumptions do not.
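The contrast between a uniform and a skewed distribution can be checked by brute force on a toy problem space. The sketch below is illustrative only: the eight-point input space, the two learners (`majority`, `anti_majority`), and the particular `skewed_weight` function are assumptions chosen for the example, not part of any standard formulation.

```python
from fractions import Fraction
from itertools import product

N_POINTS = 8  # toy input space: 8 points, so 2**8 = 256 possible labelings
N_TRAIN = 5   # the learner sees labels for points 0..4 and is tested on 5..7


def majority(train_labels):
    """Predict the most common training label for every unseen point."""
    return int(2 * sum(train_labels) > len(train_labels))


def anti_majority(train_labels):
    """Predict the opposite of the majority learner."""
    return 1 - majority(train_labels)


def uniform_weight(labels):
    """Every labeling of the input space is equally likely."""
    return 1


def skewed_weight(labels):
    """Hypothetical skew: lopsided labelings are far more likely than balanced ones."""
    ones = sum(labels)
    return 2 ** abs(2 * ones - len(labels))


def expected_accuracy(learner, weight):
    """Weighted average off-training-set accuracy over all 256 labelings."""
    total_weight = Fraction(0)
    total_score = Fraction(0)
    for labels in product((0, 1), repeat=N_POINTS):
        train, test = labels[:N_TRAIN], labels[N_TRAIN:]
        guess = learner(train)
        accuracy = Fraction(sum(y == guess for y in test), len(test))
        w = weight(labels)
        total_weight += w
        total_score += w * accuracy
    return total_score / total_weight


for learner in (majority, anti_majority):
    print(learner.__name__,
          expected_accuracy(learner, uniform_weight),
          expected_accuracy(learner, skewed_weight))
```

Under `uniform_weight` both learners score exactly 1/2 on the unseen points, because the unseen labels are independent of the training labels. Under `skewed_weight` the majority learner rises above 1/2 and the anti-majority learner falls below it by the same margin.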

The concept is central to algorithmic probability and Solomonoff induction, where the problem distribution is explicitly defined: the universal prior assigns higher probability to hypotheses with shorter descriptions, which means the algorithm expects the world to be structured in ways that admit compression. This is a non-uniform distribution, and it is exactly this kind of assumption that makes learning possible. Without it, the learner is blind; with it, the learner has a flashlight in a dark room.
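The universal prior itself is uncomputable, but its flavor can be shown with a heavily simplified stand-in. In the sketch below, a "hypothesis" is a short bit pattern repeated forever, and each pattern of length n receives prior weight 4^(-n). The hypothesis class, the cap `MAX_LEN`, and the weighting are all assumptions made for illustration; this is not Solomonoff induction, only a length-weighted Bayesian mixture in the same spirit.

```python
from fractions import Fraction
from itertools import product

MAX_LEN = 4  # hypothetical cap on pattern length, to keep the class finite


def hypotheses():
    """Yield (pattern, prior) pairs; shorter patterns get more prior weight."""
    for length in range(1, MAX_LEN + 1):
        for pattern in product((0, 1), repeat=length):
            # 2**length patterns of this length share total mass 2**(-length).
            yield pattern, Fraction(1, 4 ** length)


def predicts(pattern, index):
    """Bit that the infinitely repeated pattern places at position index."""
    return pattern[index % len(pattern)]


def prob_next_is_one(observed):
    """Posterior probability that the next bit is 1, given the observed prefix."""
    consistent = Fraction(0)
    one = Fraction(0)
    for pattern, prior in hypotheses():
        if all(predicts(pattern, i) == bit for i, bit in enumerate(observed)):
            consistent += prior
            if predicts(pattern, len(observed)) == 1:
                one += prior
    if consistent == 0:
        raise ValueError("no hypothesis in the toy class fits the observations")
    return one / consistent


print(prob_next_is_one([0, 1, 0]))
```

After observing 0, 1, 0, four patterns remain consistent (01, 010, 0100, 0101). The shortest one, 01, carries most of the surviving prior mass, so the mixture predicts a 1 with probability 17/22 rather than 1/2.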

In practice, problem distributions are never known exactly. Practitioners approximate them through domain knowledge, data collection, and architectural choices. A convolutional neural network encodes a belief that spatial locality matters; a recurrent network encodes a belief that sequential order matters. These inductive biases are bets on the problem distribution, made implicitly by the network designer. The field's challenge is to make these bets explicit, testable, and adjustable: to move from implicit priors to explicit ones.
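The convolutional bet can be made concrete in a few lines. The sketch below uses a one-dimensional circular convolution in plain Python as a stand-in for a convolutional layer; the function names, the example signal, and the kernel are assumptions for illustration. It shows the two consequences of the locality assumption: far fewer parameters than a fully connected map, and outputs that shift when the input shifts.

```python
def circular_conv(signal, kernel):
    """Apply the same small kernel at every position (wrapping at the ends)."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, steps):
    """Rotate the signal to the right by the given number of positions."""
    steps %= len(signal)
    return signal[-steps:] + signal[:-steps]


signal = [3, 1, 4, 1, 5, 9, 2, 6]  # arbitrary example input
kernel = [1, -2, 1]                # arbitrary 3-tap kernel

# Parameter counts: a dense layer mapping 8 inputs to 8 outputs needs a
# full weight matrix, while the convolution reuses one 3-entry kernel.
dense_params = len(signal) * len(signal)
conv_params = len(kernel)

# Shifting the input and then convolving gives the same result as
# convolving and then shifting the output.
equivariant = circular_conv(shift(signal, 3), kernel) == shift(
    circular_conv(signal, kernel), 3
)
print(dense_params, conv_params, equivariant)
```

If the problems a learner actually meets have this shift structure, the small kernel loses nothing and needs far less data to fit. If they do not, the same constraint becomes a handicap, which is what makes it a bet rather than a free improvement.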