Genetic drift

Genetic drift is the change in allele frequencies in a population due to random sampling — the statistical noise that arises because reproduction is a finite sampling process, not an infinite one. In an infinite population, only selection and mutation matter: beneficial alleles increase in frequency, deleterious ones decrease, and the dynamics are deterministic. In a finite population, chance matters. An allele can increase in frequency not because it confers advantage but because the individuals carrying it happened to reproduce more. This is drift.

The term was introduced by Sewall Wright in 1929, though the mathematical foundation goes back to R.A. Fisher's treatment of sampling variance. Wright recognized that drift is not a perturbation to ignore — it is a fundamental force in evolution, particularly in small populations, and it can overpower selection when selection coefficients are small. The debate between Wright and Fisher about the relative importance of drift versus selection structured population genetics for decades. Fisher emphasized selection in large populations. Wright emphasized drift in subdivided populations and the role of random fluctuations in crossing fitness valleys.

The Mathematics

In a population of $N$ diploid individuals, each new generation is formed by sampling $2N$ alleles from the previous generation's gene pool. If an allele has frequency $p$ in the current generation, the number of copies in the next generation follows a binomial distribution with $2N$ trials and success probability $p$, so the next-generation frequency has mean $p$ and variance $p(1-p)/(2N)$.

The variance term is critical. It tells you that:

- Drift is stronger in small populations ($N$ small → variance large)
- Drift is strongest when alleles are at intermediate frequencies (maximum variance at $p = 0.5$)
- Drift vanishes in the infinite-population limit ($N \to \infty$ → variance → 0)
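
The sampling step and its variance, $p(1-p)/(2N)$, can be checked numerically. The following is a minimal Python sketch, not a definitive implementation. It assumes an idealized Wright-Fisher population (constant size, random mating, non-overlapping generations, no selection or mutation), and the function names `next_generation_frequency` and `empirical_variance` are hypothetical, chosen only for illustration.

```python
import random


def next_generation_frequency(p, n_individuals, rng):
    """Draw 2N alleles with replacement and return the new allele frequency."""
    n_alleles = 2 * n_individuals  # diploid: two allele copies per individual
    # Each sampled allele is the focal type with probability p (binomial sampling).
    copies = sum(1 for _ in range(n_alleles) if rng.random() < p)
    return copies / n_alleles


def empirical_variance(p, n_individuals, replicates, rng):
    """Estimate the one-generation variance of the allele frequency."""
    draws = [next_generation_frequency(p, n_individuals, rng)
             for _ in range(replicates)]
    mean = sum(draws) / replicates
    return sum((x - mean) ** 2 for x in draws) / replicates


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    p = 0.5
    for n in (10, 100, 1000):
        expected = p * (1 - p) / (2 * n)
        observed = empirical_variance(p, n, 2000, rng)
        print(f"N={n}: expected {expected:.6f}, observed {observed:.6f}")
```

With enough replicates, the observed variance should sit close to the expected value and shrink roughly tenfold for each tenfold increase in $N$.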

The long-term effect of drift is fixation or loss: because reproduction is stochastic, allele frequencies execute a random walk, and a random walk on a finite state space with absorbing boundaries eventually hits one. Given enough time, and in the absence of new mutation or migration, every neutral allele either fixes (frequency = 1) or is lost (frequency = 0). For a new neutral allele that does fix, the expected time to fixation is roughly $4N$ generations. For large populations, this is very slow: drift operates on evolutionary timescales.
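
The random walk to fixation or loss can be simulated directly. The sketch below is illustrative only. It assumes a neutral allele in an idealized Wright-Fisher population with no mutation or migration, and the names `generations_until_absorbed` and `fixation_fraction` are hypothetical. It relies on one standard result not stated above: a neutral allele's probability of eventual fixation equals its starting frequency.

```python
import random


def generations_until_absorbed(p0, n_individuals, rng):
    """Run neutral Wright-Fisher sampling until the allele is fixed or lost.

    Returns (fixed, generations); fixed is True if the frequency reached 1.
    """
    n_alleles = 2 * n_individuals
    copies = round(p0 * n_alleles)  # starting number of allele copies
    generations = 0
    while 0 < copies < n_alleles:
        p = copies / n_alleles
        # Resample all 2N alleles from the current gene pool.
        copies = sum(1 for _ in range(n_alleles) if rng.random() < p)
        generations += 1
    return copies == n_alleles, generations


def fixation_fraction(p0, n_individuals, replicates, rng):
    """Fraction of replicate populations in which the allele fixes."""
    fixed = sum(generations_until_absorbed(p0, n_individuals, rng)[0]
                for _ in range(replicates))
    return fixed / replicates


if __name__ == "__main__":
    rng = random.Random(2)  # fixed seed so the run is repeatable
    print(fixation_fraction(0.2, 10, 2000, rng))
```

Starting from a frequency of 0.2, the fraction of replicates that fix should land near 0.2, and increasing `n_individuals` lengthens the time each replicate takes to reach a boundary.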

Drift vs. Selection

The balance between drift and selection depends on the product of population size and selection coefficient: $Ns$. When $Ns \gg 1$, selection dominates and drift is negligible. When $Ns \ll 1$, drift dominates and selection is ineffective. This has immediate implications:

Nearly neutral mutations — Mutations with $|s| < 1/N$ are effectively neutral: selection is too weak to reliably fix or eliminate them, so their fate is determined by drift. Motoo Kimura's neutral theory (1968) argued that most molecular evolution is driven by drift acting on nearly neutral mutations, not by positive selection. This was controversial when proposed — it appeared to contradict Darwin — but it is now the null hypothesis in molecular evolution. The controversy was semantic: Kimura was not claiming adaptation is unimportant, but that most sequence changes at the DNA level are invisible to selection because they do not affect fitness.

Population bottlenecks — A sharp reduction in population size (disease, habitat loss, founder event) increases drift temporarily and can lead to loss of genetic diversity even for beneficial alleles. The cheetah and northern elephant seal are canonical examples: extreme bottlenecks reduced their genetic diversity to levels where even small deleterious mutations cannot be efficiently purged. The population survives but with reduced adaptive potential.

Wright's shifting balance theory — Wright proposed that evolution in subdivided populations can cross fitness valleys via drift in small subpopulations, followed by selection once a new fitness peak is reached. The idea is that drift allows the population to escape local optima that selection alone could not traverse. This theory is difficult to test empirically and remains controversial, but it highlights drift's constructive role: randomness is not merely noise — it is exploration.
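
The $Ns$ threshold that opens this section can be made concrete with Kimura's diffusion approximation for the fixation probability of a new mutation, $(1 - e^{-2s}) / (1 - e^{-4Ns})$. The sketch below is a hedged illustration: it assumes genic (additive) selection, a single initial copy at frequency $1/(2N)$, and an effective size equal to the census size. The function name `fixation_probability` is hypothetical.

```python
import math


def fixation_probability(s, n_individuals):
    """Kimura's diffusion approximation for one new mutant copy.

    Assumes additive selection and effective size equal to census size.
    """
    if s == 0:
        return 1.0 / (2 * n_individuals)  # neutral limit: initial frequency
    try:
        # expm1(x) = exp(x) - 1, which keeps precision for tiny exponents.
        return math.expm1(-2.0 * s) / math.expm1(-4.0 * n_individuals * s)
    except OverflowError:
        # Strongly deleterious alleles: the denominator is astronomically
        # large, so fixation is effectively impossible.
        return 0.0


if __name__ == "__main__":
    n = 1000
    neutral = 1.0 / (2 * n)
    for s in (-1e-2, -1e-4, 0.0, 1e-4, 1e-2):
        prob = fixation_probability(s, n)
        print(f"N*s = {n * s:+.1f}: P(fix) = {prob:.3e} "
              f"({prob / neutral:.2f} x neutral)")
```

When $|Ns| \ll 1$ the result is close to the neutral value $1/(2N)$ regardless of the sign of $s$. When $Ns \gg 1$ it approaches $2s$ and no longer depends on $N$.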

Drift and Information

From an information-theoretic perspective, genetic drift is entropy increase: allele frequency information is lost due to random sampling. Selection is entropy decrease: fitness differentials impose structure on allele frequencies. Evolution is the interplay between these two forces.

In small populations, drift dominates and the population loses information: diversity collapses toward fixation of random alleles. In large populations, selection dominates and information is preserved in proportion to fitness structure. The transition between these regimes, the drift barrier, is determined by $Ns$. A population of size $N$ cannot maintain adaptations whose selection coefficients fall below roughly $1/N$, no matter how beneficial those adaptations would be in principle.

This has implications for molecular evolution, where many functional constraints operate at the level of individual nucleotides with very small fitness effects. A sufficiently small population cannot maintain such fine-grained adaptations: they are swamped by drift. Michael Lynch's work on genome complexity argues that the complexity ceiling for genome architecture is set by the drift barrier: features requiring selection coefficients below $1/N$ cannot evolve, regardless of their potential benefit.

Drift as a Systems Phenomenon

Genetic drift is often taught as a population genetics problem, but it is structurally identical to many other systems where finite sampling produces random fluctuations:

- Diffusion in statistical mechanics (Brownian motion is drift for particles)
- Innovation dynamics in technology adoption (early random success can lock in standards)
- Cultural evolution (ideas propagate stochastically in small communities)

The common structure: a finite system, a stochastic sampling process, and the resulting random walk of system state. Wright's population genetics formalism is a special case of a broader class of stochastic processes in complex adaptive systems.

The lesson: randomness is not the opposite of structure. It is a mechanism for exploration and for escaping local optima. Systems that eliminate randomness in the name of optimization become brittle, because they lose the variability necessary for adaptation. Drift is the price of finite populations, and within any single population it erodes variation. Across subdivided populations, however, it generates the between-population differences on which selection can act. Evolution requires both.

Genetic drift is what happens when you build a system out of finite samples rather than infinite ensembles. It is not a mistake to be corrected — it is the signature of a system operating under resource constraints, where every decision is a finite bet and chance is inescapable. The question is not whether drift happens, but how its exploratory potential is harnessed without collapsing into noise.

Drift in Fragmented Landscapes

The population genetics of drift takes on particular urgency when populations are embedded in real ecological landscapes — fragmented, heterogeneous, and subject to ongoing habitat loss. Laboratory models assume idealized populations with stable size and random mating. Real populations exist in patches connected by dispersal, with effective population sizes that vary in time and space and that are routinely far smaller than census sizes suggest.

The key concept is effective population size ($N_e$): the size of an idealized Wright-Fisher population that would experience the same rate of drift as the actual population. Because of variance in reproductive success, fluctuating population size, sex ratio asymmetries, and geographic structure, $N_e$ is almost always substantially smaller than the census count. In many vertebrate species, $N_e$ is one to two orders of magnitude smaller than the number of living individuals. This means drift is operating far more powerfully than the naive headcount suggests.
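
Two of the factors listed above have simple textbook approximations that show how quickly effective size falls below the census count. The sketch below is illustrative: the function names are hypothetical, and each formula treats one factor in isolation under otherwise idealized conditions. For an unequal breeding sex ratio it uses $4 N_m N_f / (N_m + N_f)$. For fluctuating size it uses the harmonic mean of the per-generation sizes.

```python
def ne_unequal_sex_ratio(n_males, n_females):
    """Effective size when breeding males and females differ in number."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def ne_fluctuating_size(sizes):
    """Effective size over several generations: harmonic mean of the sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    # 1,000 breeders, but only 10 of them are males.
    print(ne_unequal_sex_ratio(10, 990))
    # Two generations at 1,000 and one generation crashed to 10.
    print(ne_fluctuating_size([1000, 1000, 10]))
```

The first case gives about 39.6 and the second about 29.4. In both, the rarest sex or the smallest generation dominates the result, which is why a single bad season or a highly skewed mating system can leave a lasting mark.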

Conservation biology has been transformed by this recognition. The minimum viable population concept, once stated as a simple threshold of individual count, must be restated as a function of $N_e$. A population of 1,000 individuals with an $N_e$ of 50 is functionally equivalent, from a drift perspective, to an idealized population of 50. The genetic consequences are the same: loss of adaptive variation, inbreeding depression, and accumulation of deleterious mutations that can feed into mutational meltdown.
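
The gap between an effective size of 50 and one of 1,000 can be put in numbers using the standard expectation that heterozygosity declines by a factor of $1 - 1/(2N_e)$ per generation under drift alone. The sketch below is a rough illustration that ignores mutation, migration, and selection. The function name `heterozygosity_retained` is hypothetical.

```python
def heterozygosity_retained(ne, generations):
    """Expected fraction of initial heterozygosity left after drift alone."""
    return (1.0 - 1.0 / (2.0 * ne)) ** generations


if __name__ == "__main__":
    for ne in (50, 1000):
        kept = heterozygosity_retained(ne, 100)
        print(f"N_e = {ne}: {kept:.3f} of heterozygosity after 100 generations")
```

After 100 generations, a population with an effective size of 50 is expected to retain about 37% of its initial heterozygosity, whereas one with an effective size of 1,000 retains about 95%. The census size does not appear in the calculation at all.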

Landscape genetics asks how the spatial arrangement of habitat patches shapes gene flow and drift across the landscape. Habitat corridors that facilitate dispersal between patches increase effective population size by allowing genetic exchange, which offsets local drift. The same trophic cascade logic that ecologists use to understand community structure (remove the apex predator, alter the whole system) applies to genetic drift in fragmented landscapes. Remove the corridor, and the patch populations begin drifting independently toward different random fixation outcomes. They lose shared variation and accumulate incompatibilities that can eventually cause reproductive isolation, the first step in speciation.
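
The trade-off between local drift and gene flow has a classic approximation in Wright's island model: at equilibrium, differentiation among patches is roughly $F_{ST} \approx 1/(1 + 4 N_e m)$, where $m$ is the fraction of each patch replaced by migrants per generation. The sketch below is an assumption-laden illustration (many equally sized patches, neutral loci, small $m$, equilibrium reached), and the function name `equilibrium_fst` is hypothetical.

```python
def equilibrium_fst(ne, migration_rate):
    """Island-model approximation to equilibrium differentiation among patches."""
    return 1.0 / (1.0 + 4.0 * ne * migration_rate)


if __name__ == "__main__":
    ne = 50
    for m in (0.0, 0.001, 0.02, 0.1):
        migrants = ne * m  # expected migrants per patch per generation
        print(f"m = {m}: about {migrants:.2f} migrants/generation, "
              f"F_ST = {equilibrium_fst(ne, m):.3f}")
```

With no migration the approximation gives complete differentiation, and with one migrant per generation ($N_e m = 1$) it gives 0.2. Under these assumptions, even a low rate of corridor use is enough to keep patches from diverging fully.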

The empirical lesson is uncomfortable for conservation practice: genetic considerations must enter landscape planning at the design stage, not as an afterthought. A reserve network that preserves census numbers but severs dispersal corridors is not maintaining viable populations. It is creating an archipelago of slowly diverging genetic isolates, each accumulating its own genetic load of deleterious mutations, each losing the adaptive variation it will need to respond to climate-driven environmental change. The timescale for these effects is decades to centuries: too slow to be visible in project review cycles, too fast to be reversed once populations are already in decline.

The uncomfortable claim: the systematic exclusion of population genetics from landscape planning decisions is not a technical oversight. It reflects the persistent institutional separation of ecology from genetics — two disciplines that study the same biological systems using different tools and, too often, without reading each other's literature. The cost is borne by the populations being managed.