Power failure: why small sample size undermines the reliability of neuroscience
Plain English Summary
Most neuroscience studies are woefully underpowered, meaning they use too few participants to reliably detect real effects. This landmark analysis of 730 studies found the typical study had only a 21% chance of catching a true effect, dropping to a dismal 8% for brain volume research. When underpowered studies do land on significant results, those results are more likely to be flukes than real discoveries. The "winner's curse" inflates initial effect sizes by 25–50%, and the authors found far more significant results than expected (349 versus 254), a telltale sign of reporting bias. This matters hugely for parapsychology, where small studies dominate and meta-analyses pool many underpowered experiments. The fix? Pre-register studies, plan sample sizes in advance, share data, and run large collaborative replications.
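To see why low power makes significant results less trustworthy, here is a minimal sketch of the positive predictive value relation, PPV = (1 − β)R / ((1 − β)R + α), used in analyses of this kind; the pre-study odds R below is a hypothetical value for illustration, not a figure from the paper.

```python
def ppv(power, alpha=0.05, prior_odds=0.25):
    """Positive predictive value: the probability that a statistically
    significant finding reflects a true effect.

    power      : 1 - beta, the chance of detecting a real effect
    alpha      : the significance threshold
    prior_odds : R, pre-study odds that a probed effect is real
                 (0.25 here is a hypothetical 1-in-5 chance)
    """
    return (power * prior_odds) / (power * prior_odds + alpha)

print(f"PPV at 80% power: {ppv(0.80):.2f}")  # ~0.80
print(f"PPV at 21% power: {ppv(0.21):.2f}")  # ~0.51; 21% is the median power reported here
print(f"PPV at  8% power: {ppv(0.08):.2f}")  # ~0.29; the brain volume figure
```

Holding the prior odds fixed, dropping power from 80% to the 21% median roughly turns a four-in-five chance that a significant result is real into a coin flip.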
Research Notes
Landmark paper quantifying the replication crisis in neuroscience, directly relevant to psi research where small-N studies dominate. The PPV framework and winner's curse analysis provide a quantitative basis for skeptical critiques of parapsychological effect sizes, especially when aggregated via meta-analysis from underpowered constituent studies.
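The concern about meta-analyses built from underpowered studies can be illustrated with a small simulation; the true effect size, the group size, the crude "only significant, positive results get published" filter, and the unweighted mean used in place of a proper meta-analytic estimate are all hypothetical assumptions for this sketch, not the paper's own analysis.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)

def published_effects(true_d=0.2, n=12, n_studies=5000, alpha=0.05):
    """Simulate small two-group studies of a fixed true effect and keep only
    those reaching a significant, positive result (a crude file-drawer model)."""
    kept = []
    for _ in range(n_studies):
        control = rng.normal(0.0, 1.0, n)
        treated = rng.normal(true_d, 1.0, n)
        t_stat, p_val = ttest_ind(treated, control)
        if p_val < alpha and t_stat > 0:
            pooled_sd = np.sqrt((control.var(ddof=1) + treated.var(ddof=1)) / 2)
            kept.append((treated.mean() - control.mean()) / pooled_sd)  # observed Cohen's d
    return kept

published = published_effects()
# With n = 12 per group, only observed effects of roughly d > 0.8 can reach
# significance, so the "published" average lands well above the true d of 0.2.
print(f"{len(published)} of 5000 studies published; mean published d = {np.mean(published):.2f}")
```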
Low statistical power in neuroscience studies reduces both the chance of detecting true effects and the probability that significant findings reflect genuine effects (positive predictive value). Analysis of 49 neuroscience meta-analyses published in 2011, comprising 730 primary studies, reveals a median power of 21%, falling to 18% when high-power neurological outliers are excluded. Brain volume studies show a median power of just 8%. An excess significance test finds more significant results than expected (349 vs. 254, p < 0.0001), indicating reporting biases. The winner's curse further inflates initial effect estimates by 25–50% at typical power levels. These problems undermine reproducibility and waste resources, including animal lives in preclinical research. Recommendations include a priori power calculations, pre-registration, transparent reporting, data sharing, and large-scale collaborative replication.
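For a concrete sense of what a 21% median power means, here is a minimal power sketch using a normal approximation to a two-sided, two-sample test; the effect size (d = 0.5) and group sizes are illustrative assumptions, not values taken from the paper.

```python
from scipy.stats import norm

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison
    (normal approximation; exact t-test power is slightly lower)."""
    se = (2 / n_per_group) ** 0.5        # standard error of the observed d
    z_crit = norm.ppf(1 - alpha / 2)     # critical z for the two-sided test
    z_shift = effect_size / se           # location of the statistic under the alternative
    return 1 - norm.cdf(z_crit - z_shift) + norm.cdf(-z_crit - z_shift)

print(f"d = 0.5, n = 12 per group: power ≈ {approx_power(0.5, 12):.2f}")  # ≈ 0.23
print(f"d = 0.5, n = 64 per group: power ≈ {approx_power(0.5, 64):.2f}")  # ≈ 0.81
```

Run in reverse, this is what an a priori power calculation does: fix the smallest effect worth detecting and the desired power, then solve for the sample size before collecting any data.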
Links
Related Papers
Cites
Companion
- Testing for Questionable Research Practices in a Meta-Analysis: An Example from Experimental Parapsychology – Bierman, Dick J (2016)
- Editors' Introduction to the Special Section on Replicability in Psychological Science: A Crisis of Confidence? – Pashler, Harold (2012)
- A Practical Solution to the Pervasive Problems of p Values – Wagenmakers, Eric-Jan (2007)
- Mindless Statistics – Gigerenzer, Gerd (2004)
- Experimenter Fraud: What Are Appropriate Methodological Standards? – Kennedy, J.E (2017)
- Registered Reports: A Method to Increase the Credibility of Published Results – Nosek, Brian A (2014)
- Equivalence Tests: A Practical Primer for t Tests, Correlations, and Meta-Analyses – Lakens, Daniël (2017)
- The "File Drawer Problem" and Tolerance for Null Results – Rosenthal, Robert (1979)
- Small Telescopes: Detectability and the Evaluation of Replication Results – Simonsohn, Uri (2015)
Cited By
- Testing for Questionable Research Practices in a Meta-Analysis: An Example from Experimental Parapsychology – Bierman, Dick J (2016)
- The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No "Fishing Expedition" or "P-Hacking" and the Research Hypothesis Was Posited Ahead of Time – Gelman, Andrew (2013)
More in Methodology
Paranormal belief, conspiracy endorsement, and positive wellbeing: a network analysis
Planning Falsifiable Confirmatory Research
Addressing Researcher Fraud: Retrospective, Real-Time, and Preventive Strategies – Including Legal Points and Data Management That Prevents Fraud
Quantum Aspects of the Brain-Mind Relationship: A Hypothesis with Supporting Evidence
Paranormal beliefs and cognitive function: A systematic review and assessment of study quality across four decades of research
Cite this paper
Button, Katherine S., Ioannidis, John P. A., Mokrysz, Claire, Nosek, Brian A., Flint, Jonathan, Robinson, Emma S. J., & Munafò, Marcus R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365–376. https://doi.org/10.1038/nrn3475
@article{button_2013_power_failure,
  title = {Power failure: why small sample size undermines the reliability of neuroscience},
  author = {Button, Katherine S. and Ioannidis, John P. A. and Mokrysz, Claire and Nosek, Brian A. and Flint, Jonathan and Robinson, Emma S. J. and Munafò, Marcus R.},
  year = {2013},
  journal = {Nature Reviews Neuroscience},
  volume = {14},
  pages = {365--376},
  doi = {10.1038/nrn3475},
}