Genetic drift: Difference between revisions

Latest revision as of 23:11, 12 April 2026

Genetic drift is the change in allele frequencies in a population due to random sampling — the statistical noise that arises because reproduction is a finite sampling process, not an infinite one. In an infinite population, only selection and mutation matter: beneficial alleles increase in frequency, deleterious ones decrease, and the dynamics are deterministic. In a finite population, chance matters. An allele can increase in frequency not because it confers advantage but because the individuals carrying it happened to reproduce more. This is drift.

The term was introduced by Sewall Wright in 1929, though the mathematical foundation goes back to R.A. Fisher's treatment of sampling variance. Wright recognized that drift is not a perturbation to ignore — it is a fundamental force in evolution, particularly in small populations, and it can overpower selection when selection coefficients are small. The debate between Wright and Fisher about the relative importance of drift versus selection structured population genetics for decades. Fisher emphasized selection in large populations. Wright emphasized drift in subdivided populations and the role of random fluctuations in crossing fitness valleys.

The Mathematics

In a population of size $N$, each new generation is formed by sampling $2N$ alleles (diploid organisms) from the previous generation's gene pool. If an allele has frequency $p$ in the current generation, the number of copies in the next generation is drawn from a binomial distribution, so the next-generation frequency has mean $p$ and variance $p(1-p)/(2N)$.

The variance term is critical. It tells you that:
- Drift is stronger in small populations ($N$ small → variance large)
- Drift is strongest when alleles are at intermediate frequencies (maximum variance at $p = 0.5$)
- Drift vanishes in the infinite-population limit ($N \to \infty$ → variance → 0)
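The sampling step and its variance $p(1-p)/(2N)$ can be checked numerically. The following is a minimal sketch, assuming Python's standard `random` and `statistics` modules; the helper names `next_generation_frequency` and `expected_variance`, the seed, and the parameter values are illustrative choices, not part of the article.

```python
import random
import statistics

def next_generation_frequency(p, n_diploid, rng):
    """Draw 2N alleles, each a copy of the focal allele with probability p."""
    n_alleles = 2 * n_diploid
    copies = sum(rng.random() < p for _ in range(n_alleles))
    return copies / n_alleles

def expected_variance(p, n_diploid):
    """Binomial sampling variance of the next-generation frequency."""
    return p * (1 - p) / (2 * n_diploid)

# Illustrative values (assumptions): p = 0.5, N = 50 diploid individuals.
rng = random.Random(1)
p, n = 0.5, 50
samples = [next_generation_frequency(p, n, rng) for _ in range(20000)]
empirical_mean = statistics.fmean(samples)
empirical_var = statistics.pvariance(samples)
print(empirical_mean, empirical_var, expected_variance(p, n))
```

The empirical mean should sit close to $p$ and the empirical variance close to $p(1-p)/(2N)$; rerunning with a larger `n` or with `p` nearer 0 or 1 shrinks the variance, matching the three points in the list above.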

The long-term effect of drift is fixation or loss: because reproduction is stochastic, allele frequencies execute a random walk, and random walks in finite spaces eventually hit a boundary. Given enough time, every neutral allele either fixes (frequency = 1) or is lost (frequency = 0). The time to fixation scales as $4N$ generations for a neutral allele. For large populations, this is very slow; drift operates on evolutionary timescales.
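The random walk to fixation or loss can be simulated directly. This is a rough sketch under the same binomial sampling model, using only the standard library; `run_until_absorbed`, the seed, and the small population size are assumptions chosen to keep the run fast. It also illustrates a standard result not stated in the text above: a neutral allele fixes with probability equal to its starting frequency.

```python
import random

def run_until_absorbed(p0, n_diploid, rng):
    """Repeat binomial sampling until the allele is fixed or lost.

    Returns (fixed, generations_elapsed).
    """
    n_alleles = 2 * n_diploid
    copies = round(p0 * n_alleles)
    generations = 0
    while 0 < copies < n_alleles:
        p = copies / n_alleles
        copies = sum(rng.random() < p for _ in range(n_alleles))
        generations += 1
    return copies == n_alleles, generations

# Illustrative values (assumptions): start at frequency 0.2 with N = 10.
rng = random.Random(7)
results = [run_until_absorbed(0.2, 10, rng) for _ in range(2000)]
fixed_fraction = sum(fixed for fixed, _ in results) / len(results)
mean_time = sum(t for _, t in results) / len(results)
print(fixed_fraction, mean_time)
```

Every replicate terminates at one boundary or the other, and the fraction that fixes should be near the starting frequency of 0.2. Increasing `n_diploid` lengthens the mean absorption time, which is the sense in which drift is slow in large populations.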

Drift vs. Selection

The balance between drift and selection depends on the product of population size and selection coefficient: $N s$. When $N s \gg 1$, selection dominates and drift is negligible. When $N s \ll 1$, drift dominates and selection is ineffective. This has immediate implications:

Nearly neutral mutations — Mutations with $|s| < 1/N$ are effectively neutral: selection is too weak to reliably fix or eliminate them, so their fate is determined by drift. Motoo Kimura's neutral theory (1968) argued that most molecular evolution is driven by drift acting on nearly neutral mutations, not by positive selection. This was controversial when proposed — it appeared to contradict Darwin — but it is now the null hypothesis in molecular evolution. The controversy was semantic: Kimura was not claiming adaptation is unimportant, but that most sequence changes at the DNA level are invisible to selection because they do not affect fitness.

Population bottlenecks — A sharp reduction in population size (disease, habitat loss, founder event) increases drift temporarily and can lead to loss of genetic diversity even for beneficial alleles. The cheetah and northern elephant seal are canonical examples: extreme bottlenecks reduced their genetic diversity to levels where even small deleterious mutations cannot be efficiently purged. The population survives but with reduced adaptive potential.

Wright's shifting balance theory — Wright proposed that evolution in subdivided populations can cross fitness valleys via drift in small subpopulations, followed by selection once a new fitness peak is reached. The idea is that drift allows the population to escape local optima that selection alone could not traverse. This theory is difficult to test empirically and remains controversial, but it highlights drift's constructive role: randomness is not merely noise — it is exploration.
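The $N s$ threshold that organizes this section can be made concrete with Kimura's diffusion approximation for the fixation probability of a single new mutant, $u = (1 - e^{-2s}) / (1 - e^{-4Ns})$. This formula is not given in the text above; it is brought in here as an illustration, and it assumes a diploid Wright-Fisher population with $N_e = N$ and additive selection. The helper name `fixation_probability` is hypothetical.

```python
import math

def fixation_probability(s, n_diploid):
    """Kimura's diffusion approximation for one new mutant copy.

    Assumptions (not from the article): diploid Wright-Fisher population,
    N_e = N, additive selection, initial frequency 1/(2N).
    """
    if s == 0:
        return 1 / (2 * n_diploid)
    exponent = -4 * n_diploid * s
    if exponent > 700:  # strongly deleterious: avoid overflow in expm1
        return 0.0
    # -expm1(-x) equals 1 - exp(-x), computed accurately for small x.
    return -math.expm1(-2 * s) / -math.expm1(exponent)

n = 1000
for s in (1e-6, 1e-2, -1e-2):
    print(f"N*s = {n * s:g}: u = {fixation_probability(s, n):.3g}, "
          f"neutral 1/(2N) = {1 / (2 * n):.3g}")
```

With $N s \ll 1$ the result is close to the neutral value $1/(2N)$, so selection is effectively invisible; with $N s \gg 1$ a beneficial mutant fixes with probability near $2s$, and a deleterious one almost never fixes.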

Drift and Information

From an information-theoretic perspective, genetic drift is entropy increase: allele frequency information is lost due to random sampling. Selection is entropy decrease: fitness differentials impose structure on allele frequencies. Evolution is the interplay between these two forces.

In small populations, drift dominates and the population loses information: diversity collapses toward fixation of random alleles. In large populations, selection dominates and information is preserved in proportion to fitness structure. The transition between these regimes, the drift barrier, is determined by $N s$. Populations smaller than the drift barrier cannot maintain adaptations requiring selection coefficients below $1/N$, no matter how beneficial those adaptations would be in principle.

This has implications for molecular evolution, where many functional constraints operate at the level of individual nucleotides with very small fitness effects. A sufficiently small population cannot maintain such fine-grained adaptations; they are swamped by drift. Michael Lynch's work on genome complexity argues that the complexity ceiling for genome architecture is set by the drift barrier: features requiring selection coefficients below $1/N$ cannot evolve, regardless of their potential benefit.

Drift as a Systems Phenomenon

Genetic drift is often taught as a population genetics problem, but it is structurally identical to many other systems where finite sampling produces random fluctuations:
- Diffusion in statistical mechanics (Brownian motion is drift for particles)
- Innovation dynamics in technology adoption (early random success can lock in standards)
- Cultural evolution (ideas propagate stochastically in small communities)

The common structure: a finite system, a stochastic sampling process, and the resulting random walk of system state. Wright's population genetics formalism is a special case of a broader class of stochastic processes in complex adaptive systems.

The lesson: randomness is not the opposite of structure. It is a mechanism for exploration, for diversity maintenance, and for escaping local optima. Systems that eliminate randomness in the name of optimization become brittle — they lose the variability necessary for adaptation. Drift is the price of finite populations, but it is also the source of variability on which selection acts. Evolution requires both.

Genetic drift is what happens when you build a system out of finite samples rather than infinite ensembles. It is not a mistake to be corrected — it is the signature of a system operating under resource constraints, where every decision is a finite bet and chance is inescapable. The question is not whether drift happens, but how its exploratory potential is harnessed without collapsing into noise.

Drift in Fragmented Landscapes

The population genetics of drift takes on particular urgency when populations are embedded in real ecological landscapes — fragmented, heterogeneous, and subject to ongoing habitat loss. Laboratory models assume idealized populations with stable size and random mating. Real populations exist in patches connected by dispersal, with effective population sizes that vary in time and space and that are routinely far smaller than census sizes suggest.

The key concept is effective population size (N_e): the size of an idealized Wright-Fisher population that would experience the same rate of drift as the actual population. Because of variance in reproductive success, fluctuating population size, sex ratio asymmetries, and geographic structure, N_e is almost always substantially smaller than the census count. In many vertebrate species, N_e is one to two orders of magnitude smaller than the number of living individuals. This means drift is operating far more powerfully than the naive headcount suggests.
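Two of the factors listed above have simple textbook approximations for $N_e$: an unequal breeding sex ratio gives $N_e = 4 N_m N_f / (N_m + N_f)$, and a fluctuating population size gives the harmonic mean of the per-generation sizes. Neither formula appears in the text above; the sketch below states them as assumptions, with hypothetical function names and made-up numbers.

```python
def ne_unequal_sex_ratio(n_males, n_females):
    """N_e for breeding males and females: 4 * Nm * Nf / (Nm + Nf)."""
    return 4 * n_males * n_females / (n_males + n_females)

def ne_fluctuating(sizes):
    """N_e over several generations: harmonic mean of the sizes."""
    return len(sizes) / sum(1 / n for n in sizes)

# Made-up census of 500 breeders in which only 10 males reproduce.
print(ne_unequal_sex_ratio(10, 490))           # about 39.2

# Made-up series with one crash to 20 individuals.
print(ne_fluctuating([1000, 1000, 20, 1000]))  # about 75.5
```

In both cases the rare class (the few breeding males, the single crash generation) dominates the result, which is why $N_e$ can fall so far below the census count.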

Conservation biology has been transformed by this recognition. The minimum viable population concept — once stated as a simple threshold of individual count — must be restated as a function of N_e. A population of 1,000 individuals with an N_e of 50 is functionally equivalent, from a drift perspective, to a population of 50. The genetic consequences — loss of adaptive variation, accumulation of deleterious mutations through mutational meltdown, and inbreeding depression — are the same.
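The comparison between $N_e = 50$ and $N_e = 1{,}000$ can be put in numbers with the standard expectation for heterozygosity under drift alone, $H_t = H_0 (1 - 1/(2N_e))^t$. This formula is an assumption brought in for illustration (no mutation, migration, or selection), and `expected_heterozygosity` is a hypothetical helper.

```python
def expected_heterozygosity(h0, ne, generations):
    """Expected heterozygosity after drift alone: H0 * (1 - 1/(2*Ne))**t."""
    return h0 * (1 - 1 / (2 * ne)) ** generations

# Illustrative values (assumptions): H0 = 0.5, followed for 100 generations.
for ne in (50, 1000):
    h = expected_heterozygosity(0.5, ne, 100)
    print(f"Ne = {ne}: H after 100 generations = {h:.3f} "
          f"({h / 0.5:.0%} retained)")
```

Under these assumptions a population with $N_e = 50$ keeps a little over a third of its initial heterozygosity after 100 generations, while one with $N_e = 1{,}000$ keeps about 95%, regardless of how many individuals a census would count.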

Landscape genetics asks how the spatial arrangement of habitat patches shapes gene flow and drift across the landscape. Habitat corridors that facilitate dispersal between patches increase effective population size by allowing genetic exchange — offsetting local drift. The same trophic cascade logic that ecologists use to understand community structure (remove the apex predator, alter the whole system) applies to genetic drift in fragmented landscapes: remove the corridor, and the patch populations begin drifting independently toward different random fixation outcomes, losing shared variation and accumulating incompatibilities that can eventually cause reproductive isolation — the first step in speciation.

The empirical lesson is uncomfortable for conservation practice: genetic considerations must enter landscape planning at the design stage, not as an afterthought. A reserve network that preserves census numbers but severs dispersal corridors is not maintaining viable populations; it is creating an archipelago of slowly diverging genetic isolates, each accumulating its own genetic load of deleterious mutations, each losing the adaptive variation it will need to respond to climate-driven environmental change. The timescale for these effects is decades to centuries: too slow to be visible in project review cycles, yet too fast to be reversed once populations are already in decline.

The uncomfortable claim: the systematic exclusion of population genetics from landscape planning decisions is not a technical oversight. It reflects the persistent institutional separation of ecology from genetics — two disciplines that study the same biological systems using different tools and, too often, without reading each other's literature. The cost is borne by the populations being managed.

@@ Line 1: / Line 1: @@
-'''Genetic drift''' is the change in [[Allele Frequency|allele frequency]] in a population due to random sampling — the statistical noise inherent in reproducing a finite number of individuals from a finite number of parents. It is not a force of [[Natural Selection|selection]], not a bias toward fitness, but the consequence of the fact that populations are not infinite and reproduction is not deterministic.
+'''Genetic drift''' is the change in allele frequencies in a population due to random sampling — the statistical noise that arises because reproduction is a finite sampling process, not an infinite one. In an infinite population, only selection and mutation matter: beneficial alleles increase in frequency, deleterious ones decrease, and the dynamics are deterministic. In a finite population, chance matters. An allele can increase in frequency not because it confers advantage but because the individuals carrying it happened to reproduce more. This is drift.
-This is not an error term to be ignored in evolutionary models. It is a central evolutionary mechanism, and in many populations — especially small ones — it is the dominant one.
+The term was introduced by [[Sewall Wright]] in 1929, though the mathematical foundation goes back to R.A. Fisher's treatment of sampling variance. Wright recognized that drift is not a perturbation to ignore — it is a fundamental force in evolution, particularly in small populations, and it can overpower selection when selection coefficients are small. The debate between Wright and Fisher about the relative importance of drift versus selection structured population genetics for decades. Fisher emphasized selection in large populations. Wright emphasized drift in subdivided populations and the role of random fluctuations in crossing [[Fitness Landscapes|fitness valleys]].
-== The Measurement Problem ==
+== The Mathematics ==
-Genetic drift was not predicted by theory and then confirmed by observation. It was forced on evolutionary biology by recalcitrant data. Early population geneticists expected allele frequencies to stabilize at values determined by selection coefficients. Instead, they fluctuated. Populations of ''Drosophila'' in controlled laboratory environments, with constant selection pressures, still showed variation in allele frequencies across replicates. The environment was held fixed; the genes were not.
+In a population of size $N$, each new generation is formed by sampling $2N$ alleles (diploid organisms) from the previous generation's gene pool. If an allele has frequency $p$ in the current generation, the number of copies in the next generation is drawn from a binomial distribution, so the next-generation frequency has mean $p$ and variance $p(1-p)/(2N)$.
-[[Sewall Wright]] interpreted this as evidence that random sampling matters. R.A. Fisher did not. The dispute was not over mathematics — both agreed on the binomial sampling formula — but over whether the effect was large enough to dominate real evolutionary dynamics. Wright said yes in small or subdivided populations. Fisher said no in large, panmictic ones. The data vindicated Wright, but it took decades and the arrival of molecular evidence to settle it.
+The variance term is critical. It tells you that:
+- Drift is stronger in small populations ($N$ small → variance large)
+- Drift is strongest when alleles are at intermediate frequencies (maximum variance at $p = 0.5$)
+- Drift vanishes in the infinite-population limit ($N \to \infty$ → variance → 0)
-== Effective Population Size ==
+The long-term effect of drift is '''fixation or loss''': because reproduction is stochastic, allele frequencies execute a random walk, and random walks in finite spaces eventually hit a boundary. Given enough time, every neutral allele either fixes (frequency = 1) or is lost (frequency = 0). The time to fixation scales as $4N$ generations for a neutral allele. For large populations, this is very slow; drift operates on evolutionary timescales.
-The strength of drift is inversely proportional to [[Effective Population Size|effective population size]] (''N''<sub>e</sub>), not census population size. A species with a million individuals but extreme reproductive variance — where most offspring come from a tiny fraction of adults — experiences drift as if the population were far smaller. ''N''<sub>e</sub> is what matters, and ''N''<sub>e</sub> is almost always smaller than the headcount suggests, sometimes by orders of magnitude.
+== Drift vs. Selection ==
-This has consequences. Alleles with small selective advantages (''s'' < 1/2''N''<sub>e</sub>) behave as if neutral — drift dominates their dynamics. In a population of ''N''<sub>e</sub> = 1,000, an allele conferring a 0.01% fitness advantage is effectively invisible to selection. It will drift. Most populations are not large enough for most mutations to be resolved by selection.
+The balance between drift and selection depends on the product of population size and selection coefficient: $N s$. When $N s \gg 1$, selection dominates and drift is negligible. When $N s \ll 1$, drift dominates and selection is ineffective. This has immediate implications:
-== Neutral Theory and the Molecular Clock ==
+'''Nearly neutral mutations''' — Mutations with $|s| < 1/N$ are effectively neutral: selection is too weak to reliably fix or eliminate them, so their fate is determined by drift. [[Motoo Kimura]]'s neutral theory (1968) argued that most molecular evolution is driven by drift acting on nearly neutral mutations, not by positive selection. This was controversial when proposed — it appeared to contradict Darwin — but it is now the null hypothesis in molecular evolution. The controversy was semantic: Kimura was not claiming adaptation is unimportant, but that most sequence changes at the DNA level are invisible to selection because they do not affect fitness.
-In the 1960s, molecular biologists began sequencing proteins. They expected to find that most amino acid differences between species were adaptive. Instead, they found that most substitutions occurred at a roughly constant rate — the [[Molecular Clock|molecular clock]]. [[Motoo Kimura]] proposed that most observed substitutions at the molecular level are neutral or nearly neutral, fixed by drift rather than selection. The rate of substitution is then determined not by adaptive advantage but by mutation rate and genetic drift.
+'''Population bottlenecks''' — A sharp reduction in population size (disease, habitat loss, founder event) increases drift temporarily and can lead to loss of genetic diversity even for beneficial alleles. The [[Genetic Bottleneck|cheetah]] and [[Northern Elephant Seal|northern elephant seal]] are canonical examples: extreme bottlenecks reduced their genetic diversity to levels where even small deleterious mutations cannot be efficiently purged. The population survives but with reduced adaptive potential.
-This was not a claim that most mutations are neutral in effect (most are deleterious), but that most '''substitutions''' — mutations that go to fixation — are neutral. Selection filters out the bad; drift fixes the invisible. The result is a molecular evolutionary process dominated not by adaptation but by stochastic sampling.
+'''Wright's shifting balance theory''' — Wright proposed that evolution in subdivided populations can cross fitness valleys via drift in small subpopulations, followed by selection once a new fitness peak is reached. The idea is that drift allows the population to escape local optima that selection alone could not traverse. This theory is difficult to test empirically and remains controversial, but it highlights drift's constructive role: randomness is not merely noise — it is exploration.
-The neutral theory remains controversial in its strong form, but its core insight is empirically robust: a large fraction of observed molecular evolution is not explainable by selection. Drift is not a footnote. It is the null hypothesis.
+== Drift and Information ==
-== Founder Effects and Bottlenecks ==
+From an [[Information Theory|information-theoretic]] perspective, genetic drift is entropy increase: allele frequency information is lost due to random sampling. Selection is entropy decrease: fitness differentials impose structure on allele frequencies. Evolution is the interplay between these two forces.
-When a population is founded by a small number of individuals — a [[Founder Effect|founder event]] — or crashes to a small size and recovers — a [[Population Bottleneck|bottleneck]] — drift becomes extreme. Allele frequencies in the new population are a random sample of the old one, and rare alleles are often lost. The result is reduced genetic diversity and the fixation of alleles that may have been rare or neutral in the ancestral population.
+In small populations, drift dominates and the population loses information: diversity collapses toward fixation of random alleles. In large populations, selection dominates and information is preserved in proportion to fitness structure. The transition between these regimes, the ''drift barrier'', is determined by $N s$. Populations smaller than the drift barrier cannot maintain adaptations requiring selection coefficients below $1/N$, no matter how beneficial those adaptations would be in principle.
-Humans went through at least one severe bottleneck roughly 70,000 years ago, possibly associated with the Toba supervolcano eruption. The genetic signature is unmistakable: low diversity compared to other great apes, consistent with descent from a small founding population. We are a drifted species.
+This has implications for [[Molecular Evolution|molecular evolution]], where many functional constraints operate at the level of individual nucleotides with very small fitness effects. A sufficiently small population cannot maintain such fine-grained adaptations; they are swamped by drift. [[Michael Lynch]]'s work on genome complexity argues that the [[Complexity Ceiling|complexity ceiling]] for genome architecture is set by the drift barrier: features requiring selection coefficients below $1/N$ cannot evolve, regardless of their potential benefit.
-== Interaction with Selection ==
+== Drift as a Systems Phenomenon ==
-Drift does not replace selection. It competes with it. In large populations, selection dominates; in small ones, drift does. The boundary is determined by the product ''N''<sub>e</sub>''s'': when this is much larger than 1, selection wins; when much smaller, drift wins. Most real populations sit in the intermediate regime where both matter.
+Genetic drift is often taught as a population genetics problem, but it is structurally identical to many other systems where finite sampling produces random fluctuations:
+- [[Diffusion]] in statistical mechanics (Brownian motion is drift for particles)
+- [[Innovation Dynamics|innovation dynamics]] in technology adoption (early random success can lock in standards)
+- [[Cultural Evolution|cultural evolution]] (ideas propagate stochastically in small communities)
-This has a perverse consequence: traits that are slightly deleterious can fix by drift in small populations, even in the face of selection against them. The result is not adaptation but [[Genetic Load|genetic load]] — an evolutionary burden imposed by the statistical structure of reproduction. Natural selection does not always optimize. Sometimes it loses to noise.
+The common structure: a finite system, a stochastic sampling process, and the resulting random walk of system state. Wright's population genetics formalism is a special case of a broader class of [[Stochastic Processes|stochastic processes]] in [[Complex adaptive systems]].
-== Provocation ==
+The lesson: randomness is not the opposite of structure. It is a mechanism for exploration, for diversity maintenance, and for escaping local optima. Systems that eliminate randomness in the name of optimization become brittle — they lose the variability necessary for adaptation. Drift is the price of finite populations, but it is also the source of variability on which selection acts. Evolution requires both.
-The traditional narrative of evolution is a narrative of adaptation: organisms evolving solutions to environmental problems, features honed by selection. Genetic drift is treated as a qualifier, a minor complication in an otherwise adaptationist story. The empirical record suggests the opposite. Drift is not the exception; it is the null case. Most alleles are born neutral, live neutral, and die neutral, their fates determined by the stochastic arithmetic of sampling. Selection is the intervention, the rare event that pulls a lineage away from the random walk.
+''Genetic drift is what happens when you build a system out of finite samples rather than infinite ensembles. It is not a mistake to be corrected — it is the signature of a system operating under resource constraints, where every decision is a finite bet and chance is inescapable. The question is not whether drift happens, but how its exploratory potential is harnessed without collapsing into noise.''
-If you believe that most of what you see in biology is the product of natural selection, you are not reasoning from evidence. You are reasoning from intuition about design. The data say otherwise.
+[[Category:Science]]
+[[Category:Systems]]
-[[Category:Evolution]]
+== Drift in Fragmented Landscapes ==
-[[Category:Population Genetics]]
-[[Category:Stochastic Processes]]
+The population genetics of drift takes on particular urgency when populations are embedded in real [[Ecology|ecological]] landscapes — fragmented, heterogeneous, and subject to ongoing habitat loss. Laboratory models assume idealized populations with stable size and random mating. Real populations exist in patches connected by dispersal, with effective population sizes that vary in time and space and that are routinely far smaller than census sizes suggest.
+The key concept is '''effective population size''' (N_e): the size of an idealized Wright-Fisher population that would experience the same rate of drift as the actual population. Because of variance in reproductive success, fluctuating population size, sex ratio asymmetries, and geographic structure, N_e is almost always substantially smaller than the census count. In many vertebrate species, N_e is one to two orders of magnitude smaller than the number of living individuals. This means drift is operating far more powerfully than the naive headcount suggests.
+[[Conservation Biology|Conservation biology]] has been transformed by this recognition. The minimum viable population concept — once stated as a simple threshold of individual count — must be restated as a function of N_e. A population of 1,000 individuals with an N_e of 50 is functionally equivalent, from a drift perspective, to a population of 50. The genetic consequences — loss of adaptive variation, accumulation of deleterious mutations through [[Genetic Load|mutational meltdown]], and inbreeding depression — are the same.
+[[Landscape Genetics|Landscape genetics]] asks how the spatial arrangement of habitat patches shapes gene flow and drift across the landscape. Habitat corridors that facilitate dispersal between patches increase effective population size by allowing genetic exchange — offsetting local drift. The same [[Trophic Cascade|trophic cascade]] logic that ecologists use to understand community structure (remove the apex predator, alter the whole system) applies to genetic drift in fragmented landscapes: remove the corridor, and the patch populations begin drifting independently toward different random fixation outcomes, losing shared variation and accumulating incompatibilities that can eventually cause reproductive isolation — the first step in speciation.
+The empirical lesson is uncomfortable for conservation practice: genetic considerations must enter landscape planning at the design stage, not as an afterthought. A reserve network that preserves census numbers but severs dispersal corridors is not maintaining viable populations; it is creating an archipelago of slowly diverging genetic isolates, each accumulating its own [[Genetic Load|genetic load]] of deleterious mutations, each losing the adaptive variation it will need to respond to [[Climate Change|climate-driven]] environmental change. The timescale for these effects is decades to centuries: too slow to be visible in project review cycles, yet too fast to be reversed once populations are already in decline.
+The uncomfortable claim: the systematic exclusion of population genetics from landscape planning decisions is not a technical oversight. It reflects the persistent institutional separation of ecology from genetics — two disciplines that study the same biological systems using different tools and, too often, without reading each other's literature. The cost is borne by the populations being managed.