Power failure: why small sample size undermines the reliability of neuroscience

πŸ“„ Original study
Button, Katherine S., Ioannidis, John P. A., Mokrysz, Claire, Nosek, Brian A., Flint, Jonathan, Robinson, Emma S. J., Munafò, Marcus R. • 2013 • Modern Era • methodology

πŸ“Œ Appears in:

Plain English Summary

Most neuroscience studies are woefully underpowered: they use too few participants to reliably detect real effects. This landmark analysis of 730 studies found the typical study had only a 21% chance of catching a true effect, dropping to a dismal 8% for brain volume research. When an underpowered study does land on a significant result, that result is much less likely to reflect a genuine effect. The "winner's curse" inflates initial effect sizes by 25–50%, and the authors found far more significant results than expected (349 versus 254), a telltale sign of reporting bias. This matters hugely for parapsychology, where small studies dominate and meta-analyses pool many underpowered experiments. The fix? Pre-register studies, plan sample sizes in advance, share data, and run large collaborative replications.
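The winner's curse mentioned above can be illustrated with a short Monte Carlo simulation: among studies that happen to clear the significance threshold, the surviving effect estimates systematically overshoot the truth. The design choices here (a one-sample two-sided z-test, n = 20, a true effect of d = 0.5) are illustrative assumptions, not parameters from the paper:

```python
import random
import statistics

def mean_significant_estimate(true_effect=0.5, n=20, trials=20_000, seed=1):
    """Average effect estimate among simulated studies that reach p < .05.

    Effects are in standard-deviation units (Cohen's d), tested with a
    two-sided z-test; all parameter values are illustrative assumptions.
    """
    random.seed(seed)
    se = 1 / n ** 0.5          # standard error of the mean (sigma = 1)
    crit = 1.96 * se           # two-sided 5% significance threshold
    winners = [est for est in
               (random.gauss(true_effect, se) for _ in range(trials))
               if abs(est) > crit]
    return statistics.mean(winners)

# "Published" (significant) estimates overstate the true effect of 0.5,
# and the inflation grows as the sample size shrinks:
print(mean_significant_estimate())        # inflated above 0.5
print(mean_significant_estimate(n=8))     # smaller n, worse inflation
```

Selecting on significance truncates the sampling distribution from below, so the estimates that survive are biased upward; the lower the power, the harsher the truncation and the larger the inflation, which is exactly the mechanism behind the 25–50% figure quoted above.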

Research Notes

Landmark paper quantifying the replication crisis in neuroscience, directly relevant to psi research where small-N studies dominate. The PPV framework and winner's curse analysis provide a quantitative basis for skeptical critiques of parapsychological effect sizes, especially when aggregated via meta-analysis from underpowered constituent studies.

Low statistical power in neuroscience studies reduces both the chance of detecting true effects and the probability that significant findings reflect genuine effects (positive predictive value). Analysis of 49 neuroscience meta-analyses (730 studies published in 2011) reveals median power of 21%, falling to 18% when high-power neurological outliers are excluded. Brain volume studies show median power of just 8%. An excess significance test confirms more significant results than expected (349 vs. 254, p < 0.0001), indicating reporting biases. The winner's curse further inflates initial effect estimates by 25–50% at typical power levels. These problems undermine reproducibility and waste resources, including animal lives in preclinical research. Recommendations include a priori power calculations, pre-registration, transparent reporting, data sharing, and large-scale collaborative replication.
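The PPV framework can be made concrete in a few lines. In the Ioannidis-style formulation the authors build on, PPV = (1 − β)R / [(1 − β)R + α], where 1 − β is power, α the type I error rate, and R the pre-study odds that a probed effect is real. The value R = 0.25 below is an illustrative assumption, not a figure from the paper:

```python
def ppv(power, alpha=0.05, prior_odds=0.25):
    """Positive predictive value: probability a significant finding is true.

    power      -- 1 - beta, the chance of detecting a real effect
    alpha      -- type I error rate (conventional 0.05)
    prior_odds -- R, pre-study odds that a probed effect is real
                  (0.25 here is an illustrative assumption)
    """
    return power * prior_odds / (power * prior_odds + alpha)

# At the median 21% power reported for neuroscience, barely half of
# significant findings would be true under these odds; at 80% power,
# four in five would be.
print(round(ppv(0.21), 2), round(ppv(0.80), 2))
```

The same arithmetic explains why pooling many underpowered studies in a meta-analysis does not automatically rescue credibility: each constituent significant result enters with a low PPV, and reporting bias skews which results enter at all.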


πŸ“‹ Cite this paper
APA
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376. https://doi.org/10.1038/nrn3475
BibTeX
@article{button_2013_power_failure,
  title = {Power failure: why small sample size undermines the reliability of neuroscience},
  author = {Button, Katherine S. and Ioannidis, John P. A. and Mokrysz, Claire and Nosek, Brian A. and Flint, Jonathan and Robinson, Emma S. J. and Munaf{\`o}, Marcus R.},
  year = {2013},
  journal = {Nature Reviews Neuroscience},
  volume = {14},
  number = {5},
  pages = {365--376},
  doi = {10.1038/nrn3475},
}