P-hacking

P-hacking is the systematic exploitation of analytical flexibility (which outcome to report, which observations to exclude, which covariates to include, when to stop collecting data) to produce statistically significant findings from data that would not support them under a pre-specified analysis plan. It is not fraud in the sense of fabrication: the data are real and each analysis is valid in isolation. It is instead a form of epistemic inflation in which the garden of forking paths is traversed until a significant result is found, and that result is then presented as if it were the only path examined. The practice is largely invisible to standard peer review because each individual analysis is technically correct; the deception lies in the selective reporting of which analyses were conducted.
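
The inflation described above can be illustrated with a small simulation. The sketch below is a toy under stated assumptions, not a model of any real study: it generates two groups with no true difference, measures several independent outcomes, and counts a study as "significant" if any outcome reaches p < 0.05. The function names, the group size of 20, and the z-test with known unit variance are all illustrative choices. For k independent tests the expected false-positive rate is 1 - 0.95^k, roughly 5% for one outcome, 23% for five, and 40% for ten.

```python
import math
import random


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means.

    Assumes both groups are drawn from distributions with known unit
    variance, which holds here because the data are simulated that way.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    standard_error = math.sqrt(1.0 / len(group_a) + 1.0 / len(group_b))
    z = (mean_a - mean_b) / standard_error
    # P(|Z| > |z|) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2.0))


def smallest_p(rng, n_outcomes, n_per_group=20):
    """Simulate one null study with several outcomes; return the best p-value."""
    p_values = []
    for _ in range(n_outcomes):
        # Both groups come from the same distribution: there is no real effect.
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        p_values.append(two_sided_p(group_a, group_b))
    return min(p_values)


def false_positive_rate(n_outcomes, n_studies=2000, alpha=0.05, seed=0):
    """Share of null studies that report at least one 'significant' outcome."""
    rng = random.Random(seed)
    hits = sum(smallest_p(rng, n_outcomes) < alpha for _ in range(n_studies))
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 10):
        expected = 1 - 0.95 ** k
        print(f"outcomes={k:2d}  simulated={false_positive_rate(k):.3f}  "
              f"expected={expected:.3f}")
```

Real forking paths are usually correlated, since the same data are analysed in different ways, so the inflation in practice is typically smaller than the independent-outcome figure. The direction of the effect is the same.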

The phenomenon is structurally analogous to the multiple comparisons problem in statistical theory. The difference is that the multiple comparisons problem is usually framed as an inadvertent consequence of testing many hypotheses, whereas p-hacking is deliberate exploration of the hypothesis space guided by the data themselves. The distinction matters because the remedies differ. Multiple comparisons can be corrected with procedures such as the Bonferroni correction or false discovery rate control, but those procedures require knowing how many tests were run, and that is exactly the information p-hacking leaves unreported. P-hacking is therefore a procedural and incentive problem that statistical correction alone cannot solve. When researchers are rewarded for significance rather than truth, they will find significance.
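
To make the two corrections named above concrete, here is a minimal sketch of the Bonferroni correction and the Benjamini-Hochberg procedure for false discovery rate control. The function names and the example p-values are hypothetical and chosen only for illustration. The last lines show why correction cannot rescue a hacked analysis: both procedures depend on the number of tests, so a p-value reported on its own is "corrected" against a family of one.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    Returns the decisions in the original order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for index in order[:cutoff_rank]:
        rejected[index] = True
    return rejected


# Hypothetical p-values from five analyses of the same data set.
all_p_values = [0.001, 0.008, 0.012, 0.041, 0.20]

if __name__ == "__main__":
    print("Bonferroni:        ", bonferroni(all_p_values))
    print("Benjamini-Hochberg:", benjamini_hochberg(all_p_values))
    # Reporting a single analysis hides the other four, so the
    # correction is applied with m = 1 and changes nothing.
    print("Reported alone:    ", bonferroni([0.012]))
```

With all five analyses disclosed, the analysis with p = 0.012 fails the Bonferroni threshold of 0.01; reported alone, it passes at 0.05.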

P-hacking is widely regarded as one of the primary drivers of the replication crisis in psychology, medicine, and the social sciences. Findings produced by p-hacking replicate at markedly lower rates than those from preregistered protocols, because the significant result was a product of analytical selection rather than a genuine signal. The open science movement's emphasis on preregistration is directed specifically at this failure mode: once the analysis plan is on record before the data are examined, any departure from it leaves a visible trace.