Skeptical & Critical

A curated collection of research papers taking skeptical and critical perspectives. Explore the methodology, key findings, and ongoing debates in this area.

Total Papers: 28
Year Range: 1978–2024
Top Contributors: Hyman, Ray; Alcock, James E; Wagenmakers, Eric-Jan

Recent Publications

Cognitive Styles and Psi: Psi Researchers Are More Similar to Skeptics Than to Lay Believers

Pehlivanova, M; Weiler, M; Greyson, B 2024 Frontiers in Psychology

Cross-sectional survey comparing cognitive styles among four groups: academic psi researchers (N=44), lay psi believers (N=32), academic skeptics (N=35), and lay skeptics (N=33). Measured actively open-minded thinking (AOT) and need for closure (NFC) using validated scales, plus psi beliefs/experiences via the NEBS. Found significant group differences in AOT (F(3,138)=4.8, p=0.003, η²=0.09): psi researchers scored identically to academic skeptics (4.5±0.3 vs 4.5±0.3, p=0.91) and lay skeptics (p=0.80), while lay psi believers scored significantly lower (4.2±0.4, ps: 0.005-0.04). No differences in NFC (p=0.67). Results held after controlling for age and education. The AOT-belief inverse correlation was driven entirely by skeptics (r=-0.29, p=.01) and was null in psi groups (r=-0.03, p=.78).

#cognitive_styles #actively_open_minded_thinking #need_for_closure #paranormal_belief #psi_researchers

Searching for the Impossible: Parapsychology's Elusive Quest

Reber, Arthur S; Alcock, James E 2019 American Psychologist

A broad-based critique of parapsychology published as a direct rebuttal to Cardeña (2018) in American Psychologist. Argues psi phenomena are impossible because they violate four fundamental principles: causality (no mechanism exists), time-asymmetry (precognition requires time reversal unsupported by physics), thermodynamics (psychokinesis creates energy in closed systems), and the inverse square law. Dismisses quantum mechanics and relativity as scaffolding for psi, critiques parapsychological meta-analyses as built on marginal individual studies, and highlights Bem's 2017 interview admission that his precognition studies were "rhetorical devices." Concludes parapsychology persists due to isolation from mainstream science, unfalsifiability, and appeal of secular dualism.

#fundamental_physics_argument #meta_analysis_critique #quantum_mechanics_misuse #replication_failure #unfalsifiability

False-Positive Effect in the Radin Double-Slit Experiment on Observer Consciousness as Determined with the Advanced Meta-Experimental Protocol

Walleczek, Jan; von Stillfried, Nikolaus 2019 Frontiers in Psychology

A conceptual replication of the Radin double-slit (DS) experiment was commissioned using 10,000 test trials performed blindly by the same investigator who reported the original results. The Advanced Meta-Experimental Protocol (AMP) implemented systematic negative, positive, and time-reversed controls alongside a sham-experiment conducted without test subjects. The replication failed to confirm the original anomalous consciousness effect (0% true-positive match rate). Critically, the sham-experiment revealed a statistically significant false-positive effect (p = 0.021, σ = −2.02, N = 1,250) in exactly the test category predicted for a true-positive result. The false-positive effect size (~0.01%) was within an order of magnitude of the claimed consciousness effect (0.001%), and its statistical significance matched that of the original study. These findings demonstrate that the DS-apparatus produces significant effects without observers present, calling into question all prior claims of anomalous observer consciousness effects.

#double_slit #false_positive #systematic_error #confirmatory_replication #sham_experiment
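
The sham-experiment figures in the entry above (σ = −2.02, p = 0.021) can be checked against the standard normal distribution. The sketch below assumes the reported p-value is a one-tailed normal tail probability; that reading is an assumption, not something the summary states.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# z-score reported for the sham experiment (no test subjects present)
z_sham = -2.02

# Lower-tail probability, and the two-tailed value for comparison
p_one_tailed = normal_cdf(z_sham)
p_two_tailed = 2.0 * p_one_tailed

print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")
```

The one-tailed value comes out near 0.022, close to the reported p = 0.021; the two-tailed value is roughly twice that.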

Cross-Examining the Case for Precognition: Comment on Mossbridge and Radin (2018)

Houran, James; Lange, Rense; Hooper, Dan 2018 Psychology of Consciousness: Theory, Research, and Practice

A multidisciplinary commentary challenging Mossbridge and Radin's (2018) case for precognition, co-authored by anomalistic psychologists and Fermilab theoretical astrophysicist Dan Hooper. On statistical grounds, the authors argue that extremely small effect sizes in precognition meta-analyses do not exceed the 'crap factor' (systematic measurement artifacts inherent in psychological experiments) and that standard null hypothesis testing is biased toward rejection with increasing sample sizes. On theoretical grounds, Hooper argues that nothing in special/general relativity, quantum mechanics, or quantum field theory permits retrocausal information transfer, and any such mechanism would violate the second law of thermodynamics. The authors propose transliminality and intuitive thinking as conventional neuropsychological explanations for apparent presentiment effects.

#precognition_critique #meta_analysis_critique #retrocausation #transliminality #effect_size_interpretation

N,N-Dimethyltryptamine and the Pineal Gland: Separating Fact from Myth

Nichols, David E 2017 Journal of Psychopharmacology

Examines the popular claim that the pineal gland secretes N,N-dimethyltryptamine (DMT) in amounts sufficient to produce near-death and out-of-body experiences. Reviews the biochemistry of indolethylamine N-methyltransferase (INMT), DMT receptor binding affinities, dose-response data from human IV studies, and evidence for brain accumulation. The adult pineal weighs <0.2 g and produces only ~30 µg/day of melatonin; producing the ~25 mg DMT needed for psychoactive effects is implausible by three orders of magnitude. No credible evidence supports active DMT accumulation in neurons. Alternative mechanisms—dynorphin/kappa-opioid activation, massive neurotransmitter surges during asphyxia (norepinephrine >30-fold, serotonin >20-fold), and glutamate excitotoxicity—more parsimoniously explain near-death altered states.

#endogenous_dmt #pineal_gland #near_death_neurochemistry #kappa_opioid #neurotransmitter_surge

Paranormal psychic believers and skeptics: a large-scale test of the cognitive differences hypothesis

Gray, Stephen J; Gallo, David A 2016 Memory & Cognition

Across three studies, strong psychic believers and strong skeptics (screened from 2,541 adults using a modified Australian Sheep-Goat Scale, matched on age, sex, and education) completed multiple cognitive tasks. No consistent group differences emerged on episodic memory distortion (DRM false recall, criterial recollection, imagination inflation) or working memory. However, skeptics reliably outperformed believers on Shipley Logic (pooled d = 0.46) and Vocabulary (d = 0.62), and believers disproportionately endorsed conspiracy theories (interaction eta-squared = .104, pBIC > .99). Believers also reported higher dissociative experiences (d = 0.84) and absorption (d = 1.30). Both groups equally endorsed Darwinian evolution, and psychic belief positively predicted life satisfaction (beta = .19, N = 2,541).

#paranormal_belief #cognitive_differences #analytical_thinking #false_memory #conspiracy_endorsement

Meta-Analyses Are No Substitute for Registered Replications: A Skeptical Perspective on Religious Priming

van Elk, Michiel; Matzke, Dora; Gronau, Quentin F; Guan, Maime; Vandekerckhove, Joachim; Wagenmakers, Eric-Jan 2015 Frontiers in Psychology

Critique of the Shariff et al. (2015) meta-analysis claiming that religious priming has a small but reliable effect on prosocial behavior. Re-analyzes the same 92-study dataset using PET-PEESE and Bayesian bias correction (BBC) methods. PET-PEESE finds no evidence for an effect after correcting for publication bias (intercept = -0.002, p = 0.97); the BBC method reaches the opposite conclusion, with strong evidence for a real effect (~0.3). Argues that the contradictory results demonstrate meta-analysis alone cannot resolve disputed effects, because it cannot disentangle true effects from publication bias and experimenter bias. Concludes preregistered large-scale replications are the sole remedy.

#registered_replications #meta_analysis #publication_bias #pet_peese #bayesian_bias_correction
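
The PET step mentioned in the entry above regresses study effect sizes on their standard errors, weighting by inverse variance, and reads the intercept as the bias-corrected effect. The following is a minimal sketch of that idea only. The function name and the toy data are hypothetical; it does not reproduce the authors' analysis, the PEESE step, or the 92-study dataset.

```python
def pet_regression(effects, std_errors):
    """Weighted least squares of effect size on standard error.

    Weights are 1 / SE^2. Returns (intercept, slope); the intercept is the
    estimated effect for a hypothetical study with zero standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    mean_se = sum(w * se for w, se in zip(weights, std_errors)) / total
    mean_eff = sum(w * e for w, e in zip(weights, effects)) / total
    sxy = sum(w * (se - mean_se) * (e - mean_eff)
              for w, se, e in zip(weights, std_errors, effects))
    sxx = sum(w * (se - mean_se) ** 2 for w, se in zip(weights, std_errors))
    slope = sxy / sxx
    intercept = mean_eff - slope * mean_se
    return intercept, slope

# Hypothetical toy data: observed effects grow with the standard error,
# the pattern expected when small studies are published only if "significant".
toy_std_errors = [0.10, 0.15, 0.20, 0.30, 0.40]
toy_effects = [2.0 * se for se in toy_std_errors]

intercept, slope = pet_regression(toy_effects, toy_std_errors)
print(f"PET intercept = {intercept:.3f}, slope = {slope:.3f}")
```

In this constructed case every observed effect is explained by the standard error, so the intercept is zero even though the unweighted mean effect is clearly positive.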

We Should Have Seen This Coming

Schwarzkopf, D. Samuel 2014 Frontiers in Human Neuroscience

Precognition claims violate the second law of thermodynamics and would invalidate baseline correction procedures fundamental to experimental research. Six objections are raised against the Mossbridge et al. (2012, 2014) presentiment meta-analysis: questionable primary study quality including circular inference in fMRI data, failure to include broader non-parapsychological literature using similar designs, neglect of ~2:1 trial imbalances enabling learned stimulus predictions, potential baseline correction artifacts from slow post-stimulus signal decay, inadequate testing of expectation bias, and biological implausibility of one neural mechanism producing precognitive effects across measures with vastly different temporal scales.

#presentiment_critique #falsifiability #parsimony #baseline_correction_artifacts #expectation_bias

The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No "Fishing Expedition" or "P-Hacking" and the Research Hypothesis Was Posited Ahead of Time

Gelman, Andrew; Loken, Eric 2013 Columbia University Department of Statistics Working Paper

Researcher degrees of freedom can produce a multiple comparisons problem even when scientists perform only a single analysis on their data. Using case studies from published psychology — including Bem's (2011) precognition experiments, menstrual-cycle effects on voting, and upper-body strength and political attitudes — a four-level typology of testing procedures is proposed, distinguishing deliberate fishing from the more common pattern where a single analysis path is chosen that appears predetermined but is actually contingent on the observed data. Pre-registration and pre-publication replication are recommended as solutions.

#researcher_degrees_of_freedom #forking_paths #multiple_comparisons #pre_registration #replication_crisis

A Bayes Factor Meta-Analysis of Recent Extrasensory Perception Experiments: Comment on Storm, Tressoldi, and Di Risio (2010)

Rouder, Jeffrey N; Morey, Richard D; Province, Jordan M 2013 Psychological Bulletin

Reassessing Storm, Tressoldi, and Di Risio's (2010) meta-analysis of 67 free-response ESP experiments using Bayes factors, the full dataset yields evidence of approximately 6 billion to 1 in favor of psi. However, studies using manual randomization show significantly higher hit rates than computer-randomized studies (BF ≈ 6,350 to 1 for the difference), suggesting procedural flaws rather than genuine psi. Excluding manually randomized studies and including omitted null conditions reduces the evidence to approximately 32–328 to 1 depending on model assumptions. The residual evidence is argued to be unpersuasive given the absence of any plausible mechanism and likely additional unreported null results.

#bayesian_meta_analysis #bayes_factor #ganzfeld_free_response #randomization_quality #study_selection

Too Good to Be True: Publication Bias in Two Prominent Studies from Experimental Psychology

Francis, Gregory 2012 Psychonomic Bulletin & Review

Applying the Ioannidis and Trikalinos (2007) test for excess significance to Bem's (2011) ten psi experiments and a set of verbal overshadowing studies, this analysis finds that the observed number of null hypothesis rejections substantially exceeds what would be expected given the experiments' statistical power. Bem's studies yield a pooled effect size of g* = 0.186, predicting 6.27 rejections out of 10, yet 9 were reported (p = .058). The verbal overshadowing literature shows a similar pattern (p = .022). These results indicate publication bias contaminates both literatures, rendering them uninformative as scientific evidence. Bayesian data analysis is proposed as a partial remedy.

#publication_bias #test_for_excess_significance #statistical_methodology #replication #bem_critique
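
The logic of the excess-significance test in the entry above can be sketched with a binomial tail. As a simplifying assumption, every experiment is given the same power (6.27 / 10); the published analysis uses each experiment's own power, so the figure below only approximates the reported p = .058.

```python
from math import comb

def prob_at_least(k: int, n: int, power: float) -> float:
    """P(at least k rejections in n experiments) if each rejects with probability `power`."""
    return sum(comb(n, i) * power ** i * (1.0 - power) ** (n - i)
               for i in range(k, n + 1))

n_experiments = 10
expected_rejections = 6.27      # from the entry above
observed_rejections = 9         # from the entry above

# Simplifying assumption: identical power across experiments
mean_power = expected_rejections / n_experiments

p_excess = prob_at_least(observed_rejections, n_experiments, mean_power)
print(f"P(>= {observed_rejections} rejections) = {p_excess:.3f}")
```

Under the equal-power assumption the probability is about 0.065, in the same range as the reported value. Tests of this kind are conventionally judged against a 0.10 criterion rather than 0.05.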

Correcting the Past: Failures to Replicate Psi

Galak, Jeff; LeBoeuf, Robyn A; Nelson, Leif D; Simmons, Joseph P 2012 Journal of Personality and Social Psychology

Across seven experiments (N=3,289), the retroactive facilitation of recall paradigm from Bem's (2011) Experiments 8 and 9 was replicated using computer-standardized delivery, predetermined sample sizes, and no data inspection before stopping. Six of seven experiments found no evidence of precognition; the combined effect was d≈0.01 with Bayesian BF=70.48 providing 'extreme' support for the null. A meta-analysis of all 19 known replication attempts (N=4,091) yielded an overall effect of d=0.04, 95% CI [-0.00, 0.09], indistinguishable from zero. The only significant moderator was whether Bem himself conducted the experiment (d=0.29 vs. d=0.02 for all others).

#replication_failure #retroactive_facilitation_recall #bayesian_analysis #researcher_degrees_of_freedom #bem_replication

Failing the Future: Three Unsuccessful Attempts to Replicate Bem's 'Retroactive Facilitation of Recall' Effect

Ritchie, Stuart J; Wiseman, Richard; French, Christopher C 2012 PLoS ONE

Three pre-registered, independent replication attempts of Bem's Experiment 9 ('retroactive facilitation of recall') were conducted at Edinburgh, Goldsmiths, and Hertfordshire, each with 50 participants (combined N = 150, 99.92% power to detect original d = .42). Using Bem's original software and procedure, none produced significant effects: Replication 1 DR% = 0.19% (p = .46), Replication 2 DR% = −2.72% (p = .94), Replication 3 DR% = 2.58% (p = .61); combined p = .83 (one-tailed). A methodological improvement over Bem's original used blind raters for ambiguous word coding. The authors favour experimental artifacts as the explanation for Bem's original result.

#bem_replication #failed_replication #retroactive_facilitation #pre_registered #multi_site

Results from a Confirmatory Replication Study of Bem (2011): Precognitive Detection of Erotic Stimuli?

Wagenmakers, Eric-Jan; Wetzels, Ruud; Borsboom, Denny; van der Maas, Han L. J; Kievit, Rogier A 2012

Pre-registered confirmatory replication of Bem's (2011) Experiment 1, testing whether participants can detect erotic pictures behind curtains at above-chance rates. One hundred female participants each completed two sessions of 60 trials (15 erotic, 45 neutral). All methods and Bayesian analyses were specified and posted online before data collection. Six pre-specified Bayes factor tests — comparing erotic vs. neutral performance, erotic vs. chance, extraversion correlations, and cross-session consistency — all yielded evidence favoring the null hypothesis. Combined-session Bayes factors reached BF₀₁ = 16.6, providing strong evidence against precognition. A small positive extraversion-performance correlation (r = 0.13) was not supported by Bayesian analysis (BF₀₁ = 3.64).

#precognition_replication #bayesian_analysis #pre_registration #null_result #feeling_the_future

Back from the Future: Parapsychology and the Bem Affair

Alcock, James E 2011 Skeptical Inquirer

An experiment-by-experiment methodological critique of Daryl Bem's nine 'Feeling the Future' precognition experiments. Identifies multiple procedural irregularities: protocols changed mid-study in Experiments 1 and 2 (after 40/100 and 100/150 participants), at least seven t-tests without correction for multiple comparisons rendering the headline p=.01 nonsignificant (~.06 after correction), ad hoc two-item stimulus-seeking scales with no psychometric validation, and deliberate use of one-tailed tests. Effect sizes correlate negatively with sample size across all nine experiments (r=-0.91). Placed in historical context alongside Rhine, Schmidt, Targ/Puthoff, PEAR, and ganzfeld as recurring cycles of psi breakthroughs followed by methodological critiques.

#bem_critique #questionable_research_practices #multiple_testing #historical_context #methodology_critique

A Bayes Factor Meta-Analysis of Bem's ESP Claim

Rouder, Jeffrey N; Morey, Richard D 2011 Psychonomic Bulletin & Review

Bayesian meta-analysis of the nine experiments in Bem's (2011) 'Feeling the Future' paper, using a newly developed meta-analytic extension of the JZS default Bayes factor t-test. Excluding three retroactive mere-exposure experiments as uninterpretable, the remaining data were categorized by stimulus type. Evidence for ESP with erotic stimuli was slight (BF = 3.23), with neutral stimuli negligible (BF = 1.57), but with emotionally valenced nonerotic stimuli noteworthy (BF = 38.7). The analysis also demonstrated that simply multiplying individual Bayes factors across experiments — as Wagenmakers et al. (2011) implicitly did — systematically underestimates cumulative evidence. Despite the BF of ~40 for emotional stimuli, the authors argued this is insufficient to overcome appropriate prior skepticism about ESP given the absence of plausible physical mechanisms.

#bayes_factor #meta_analytic_method #precognition_reanalysis #bayesian_statistics #feeling_the_future

False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant

Simmons, Joseph P; Nelson, Leif D; Simonsohn, Uri 2011 Psychological Science

Demonstrates that flexibility in data collection, analysis, and reporting dramatically inflates false-positive rates beyond the nominal 5%. Monte Carlo simulations of 15,000 samples show four common researcher degrees of freedom -- flexible dependent variables, optional stopping, covariate selection, and condition dropping -- individually raise false-positive rates to 7.7-12.6% and in combination produce a 60.7% rate. Two actual experiments exploit these freedoms to produce statistically significant evidence for an impossible hypothesis (that listening to a Beatles song makes people younger, p=.040). Proposes six author disclosure requirements and four reviewer guidelines as remedy.

#p_hacking #false_positives #researcher_degrees_of_freedom #replication_crisis #methodology
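
One of the researcher degrees of freedom listed in the entry above, optional stopping, is easy to simulate. This is a minimal sketch rather than the authors' simulation: it uses a z-test with known unit variance instead of a t-test, and the sample sizes (20 per group, then 10 more if the first test is not significant) and the seed are illustrative assumptions.

```python
import math
import random

def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means, assuming known unit variance."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    z = diff / math.sqrt(1.0 / n_a + 1.0 / n_b)
    return math.erfc(abs(z) / math.sqrt(2.0))

def simulate(n_sims=10_000, first_look=20, extra=10, alpha=0.05, seed=1):
    """Return false-positive rates for a fixed design and for optional stopping."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        # Both groups come from the same distribution, so the null is true.
        a = [rng.gauss(0.0, 1.0) for _ in range(first_look + extra)]
        b = [rng.gauss(0.0, 1.0) for _ in range(first_look + extra)]
        # Fixed design: a single test at the full sample size.
        full_significant = two_sided_p(a, b) < alpha
        if full_significant:
            fixed_hits += 1
        # Optional stopping: test early, collect more data only if not significant.
        early_significant = two_sided_p(a[:first_look], b[:first_look]) < alpha
        if early_significant or full_significant:
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims

fixed_rate, peeking_rate = simulate()
print(f"fixed design: {fixed_rate:.3f}, optional stopping: {peeking_rate:.3f}")
```

The fixed design stays near the nominal 5%, while a single extra look pushes the rate noticeably above it. Combining several such freedoms is what produces the much larger rates the paper reports.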

Why Psychologists Must Change the Way They Analyze Their Data: The Case of Psi

Wagenmakers, Eric-Jan; Wetzels, Ruud; Borsboom, Denny; van der Maas, Han 2011 Journal of Personality and Social Psychology

Reanalysis of Bem's (2011) nine precognition experiments using a default Bayesian t-test reveals that the statistical evidence for psi is weak to nonexistent. Of 10 critical tests, only one yields 'substantial' Bayesian evidence for psi (BF01 = 0.17); three yield 'substantial' evidence for the null hypothesis (BF01 = 3.14 to 7.61); the remaining six produce only 'anecdotal' evidence in either direction. The paper identifies three flaws in Bem's approach: conflation of exploratory and confirmatory analyses, the fallacy of the transposed conditional, and reliance on p-values that overstate evidence against the null. Proposes six guidelines for confirmatory research including pre-registration, Bayesian testing, and adversarial collaboration.

#bayesian_hypothesis_testing #confirmatory_research #p_value_critique #bem_critique #reanalysis

Meta-Analysis That Conceals More Than It Reveals: Comment on Storm et al. (2010)

Hyman, Ray 2010 Psychological Bulletin

Responding to Storm, Tressoldi, and Di Risio's (2010) meta-analysis of ganzfeld studies, this commentary argues that meta-analytic aggregation manufactures apparent consistency from fundamentally heterogeneous data. The original ganzfeld database's significant hit rate derived almost entirely from four experimenters (44% hit rate) while others obtained chance-level results (26%). The autoganzfeld's significance came only from dynamic targets (37%), with static targets at chance (~26%), constituting a failed replication of the original static-target database. Autoganzfeld II, meeting all of Storm et al.'s criteria for a reliable study, yielded hit rates of 26.5% (N=151) and 25.8% (N=209) — chance level. Hyman concludes that parapsychology requires prospective, independently replicable evidence rather than retrospective meta-analytic consistency.

#ganzfeld_replicability #meta_analysis_critique #experimenter_effects #prospective_replication #autoganzfeld
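
The claim in the entry above that the Autoganzfeld II hit rates sit at chance can be checked with an exact binomial tail. Two assumptions are made here that the summary does not state: a chance rate of 25% (the usual four-choice ganzfeld design), and hit counts of 40 and 54 back-calculated from the reported percentages.

```python
from math import comb

def binom_upper_tail(hits: int, n: int, p: float) -> float:
    """Exact P(X >= hits) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(hits, n + 1))

CHANCE = 0.25  # assumed four-choice design

# (hits, sessions); hit counts are inferred from 26.5% of 151 and 25.8% of 209
samples = {"N=151": (40, 151), "N=209": (54, 209)}

p_values = {}
for label, (hits, n) in samples.items():
    p_values[label] = binom_upper_tail(hits, n, CHANCE)
    print(f"{label}: hit rate = {hits / n:.3f}, one-tailed p = {p_values[label]:.2f}")
```

Both tail probabilities are far above any conventional significance threshold, consistent with the chance-level reading.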

Of Two Minds: Sceptic-Proponent Collaboration within Parapsychology

Schlitz, Marilyn J; Wiseman, Richard; Watt, Caroline; Radin, Dean 2006 British Journal of Psychology

Third in a series of joint sceptic-proponent collaborations on remote staring detection. The first two studies (Wiseman & Schlitz, 1997, 1999) found that the proponent experimenter (Schlitz) obtained significant EDA effects (es=0.50, es=0.33) while the skeptic (Wiseman) did not (es=0.11, es=0.07). This third study employed a 2x2 cross-over design (N=100) at IONS to determine whether the earlier experimenter effects arose from the greeter role or the sender role. Neither main effects of greeter (F(4,93)=0.46, p=.50) nor sender (F(4,93)=0.21, p=.64) reached significance. The condition replicating the original protocol yielded es=-0.03, p=.87. Results are consistent with either genuine psi disrupted by uncontrolled factors, or chance/artifact explanations of the earlier studies.

#experimenter_effects #remote_staring #dmils #skeptic_proponent #electrodermal_activity

Give the Null Hypothesis a Chance: Reasons to Remain Doubtful about the Existence of Psi

Alcock, James E 2003 Journal of Consciousness Studies

Invited commentary for a JCS special issue on parapsychology enumerating reasons to maintain the null hypothesis regarding psi. Alcock presents a structured case built on: (1) lack of subject-matter definition, (2) negative definition of constructs, (3) failure to achieve replication — highlighting Jeffers' ignored null double-slit results and the Jahn consortium's null PortREG outcome, (4) multiplication of entities (psi-experimenter effect, sheep-goat, psi-missing, decline effects) to immunize against falsification, (5) unfalsifiability, (6-7) unpredictability and lack of cumulative progress despite technological advances, (8) unique reliance on statistical significance to infer the phenomenon's existence, and (9) incompatibility with established physics and neuroscience. Also engages with other special-issue contributors including Parker, Palmer, French, Brugger & Taylor, and Dean & Kelly.

#null_hypothesis #replication_failure #unfalsifiability #experimenter_effect #philosophy_of_science

Was There Evidence of Global Consciousness on September 11, 2001?

Scargle, Jeffrey D 2002 Journal of Scientific Exploration

A critical commentary on two accompanying papers by Nelson and Radin analyzing Global Consciousness Project (GCP) random number generator data from September 11, 2001. Scargle, a NASA astrophysicist, identifies several methodological concerns: the XOR bit-flipping operation renders the GCP insensitive to direct coherent effects on bit frequencies; cumulative sums of chi-squared statistics create misleading visual structure resembling 1/f noise even in purely random data; and the GCP prediction registry lacks sufficiently specific hypotheses to eliminate post-hoc 'fiddle room.' When independent (non-overlapping) running means are applied instead, the data resemble white noise. Scargle concludes that none of the reported results are compelling and recommends Bayesian analysis, stricter prediction protocols, and blind parallel testing.

#gcp_critique #statistical_methodology #cumulative_sums #exploratory_vs_confirmatory #bayesian_analysis

Fundamentally Misunderstanding Visual Perception: Adults’ Belief in Visual Emissions

Winer, Gerald A; Cottrell, Jane E; Gregg, Virginia; Fournier, Jody S; Bica, Lori A 2002 American Psychologist

A review of research documenting widespread extramission beliefs among adults — the conviction that vision involves emissions from the eyes. Across multiple studies using computer animations, drawings, and verbal forced-choice items, 41–67% of college students affirmed extramission representations; drawing tasks yielded rates as high as 86%. At least 70% of believers judged emissions as functionally necessary for seeing. Standard educational interventions (textbook readings, introductory psychology coursework) failed to reduce the misconception. Refutational teaching produced short-term gains (100% correct on immediate posttest) that vanished within 3–5 months. The authors attribute the belief’s persistence to primitive phenomenological experiences of outer-directed vision that syncretically fuse with lay theories of the visual process.

#extramission_beliefs #visual_perception_misconception #science_education #staring_detection #cognitive_development

Evaluation of a Program on Anomalous Mental Phenomena

Hyman, Ray 1996 Journal of Scientific Exploration

Commissioned alongside Jessica Utts to evaluate the U.S. government-funded Stargate remote viewing program at SRI and SAIC (1973–1994), Hyman focuses on the ten most recent SAIC experiments. He concedes these experiments are methodologically superior to earlier SRI work and that statistical effects are too large to dismiss as chance flukes. However, he argues Utts’ conclusion that psychic functioning has been proven is premature: the experiments were conducted in secrecy precluding peer review, relied on the same viewers, targets, and a single judge (the principal investigator) across all studies, and have not been independently replicated. He identifies key inconsistencies between ganzfeld and remote viewing findings and argues that without a positive theory of anomalous cognition, statistical departures from chance alone cannot establish its existence.

#remote_viewing #ganzfeld_autoganzfeld #stargate_program #replication #methodology_critique

Anomaly or Artifact? Comments on Bem and Honorton

Hyman, Ray 1994 Psychological Bulletin

Reanalysis of 11 autoganzfeld experiments (N = 330 sessions) from Bem and Honorton (1994) reveals inconsistencies challenging the claim of a replicable psi effect. Hit rate correlates strongly with target occurrence frequency (Spearman r = .83, p = .013), and a significant interaction with experimenter prompting shows hit rates jumping from .140 (first occurrences) to .445 (later occurrences; chi-squared = 14.702, p = .0001). The overall effect is a composite of different rates for dynamic (.372) versus static (.271) targets, undermining claimed consistency with the original ganzfeld database. Hyman concludes that inadequate randomization testing and target-frequency patterns cast doubt on whether results reflect psi or artifact.

#ganzfeld_autoganzfeld #randomization_critique #target_frequency_artifact #experimenter_prompting #methodology_critique

Bayesian Analysis of Random Event Generator Data

Jefferys, William H 1990 Journal of Scientific Exploration

Applying Bayesian hypothesis testing to Jahn, Dunne & Nelson's (1987) PEAR random event generator dataset of 104.49 million trials reveals that the Jeffreys-Lindley paradox undermines the strong classical p-values reported. Under a uniform prior on the alternative hypothesis, the Bayes factor is B = 12, which actually increases confidence in the null. For nearly all reasonable prior distributions on effect size, B exceeds 1 (favoring no anomaly). Even the most favorable prior yields B approximately 30 times larger than the classical p-value, showing the frequentist test overestimates significance by at least a factor of 20. Jefferys concludes these data are insufficient to shift the opinions of observers with even moderate priors, and advocates Bayesian methods as more appropriate for parapsychology.

#bayesian_statistics #random_event_generator #jeffreys_lindley_paradox #statistical_methodology #pear_lab
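
The Jeffreys-Lindley paradox described in the entry above can be illustrated with a normal approximation to the binomial. The trial count is taken from the summary; the z-score of 3.6 is an illustrative assumption standing in for the dataset's actual deviation, and the function name is hypothetical.

```python
import math

def bayes_factor_null_uniform(n: int, z: float) -> float:
    """Approximate Bayes factor in favor of H0: theta = 0.5.

    H1 puts a Uniform(0, 1) prior on the hit probability, so the marginal
    probability of any hit count is 1 / (n + 1). Under H0 the hit count is
    approximated as normal with standard deviation sqrt(n) / 2.
    """
    sd = math.sqrt(n) / 2.0
    density_h0 = math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    marginal_h1 = 1.0 / (n + 1)
    return density_h0 / marginal_h1

n_trials = 104_490_000   # from the entry above
z_assumed = 3.6          # illustrative assumption, not taken from the summary

p_two_sided = math.erfc(z_assumed / math.sqrt(2.0))
b01 = bayes_factor_null_uniform(n_trials, z_assumed)
print(f"classical two-sided p = {p_two_sided:.5f}, Bayes factor for the null = {b01:.1f}")
```

With a sample this large, a deviation that looks decisive by p-value still yields a Bayes factor above 1 in favor of the null under the uniform prior, which is the pattern the paper describes.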

Parapsychological Research: A Tutorial Review and Critical Appraisal

Hyman, Ray 1986 Proceedings of the IEEE

Invited review surveying 130 years of parapsychological research for a mainstream engineering audience. Examines historically prominent evidence from 1850s spiritualism (Crookes/Home, the Creery sisters) through Rhine's card-guessing program, the Soal-Shackleton experiments (later shown fraudulent by Markwick in 1978), and contemporary ganzfeld, RNG, and remote viewing paradigms. Introduces the 'False Dichotomy' concept: critics feel forced to accept psi or accuse fraud, missing subtler explanations. Argues that parapsychological evidence is fundamentally non-cumulative—each generation's best cases are discredited and replaced by new paradigms repeating the same patterns. Notes Akers' finding that 85% of 54 selected ESP experiments had serious methodological flaws.

#historical_review #methodological_critique #ganzfeld_autoganzfeld #false_dichotomy #fraud_detection

Statistical Problems in ESP Research

Diaconis, Persi 1978 Science

Landmark statistical critique of ESP research by statistician Persi Diaconis (then at Stanford University), published in Science (Vol. 201, No. 4351, pp. 131–136). Analyzes four classes of methodological problems that generate spurious positive results: (1) optional stopping: analyzing data repeatedly and halting collection when significance is reached, without a pre-specified stopping rule; (2) multiple testing: examining many subjects, conditions, and measures, then reporting only significant outcomes; (3) inadequate randomization: pseudo-random target sequences in Rhine-era experiments had detectable statistical regularities that subjects' guessing strategies exploited; and (4) sensory leakage: insufficient physical isolation in card-guessing paradigms provided olfactory, visual, and auditory cues. Concludes that better-controlled experiments consistently yield weaker or null results, and that statistical analysis alone cannot validate ESP claims without rigorous experimental design. Directly motivated methodological reforms including pre-registration, automated randomization, and double-blind protocols.

#statistical_critique #methodology #optional_stopping #multiple_testing #esp
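
The multiple-testing problem listed in the entry above has a simple arithmetic core: the chance of at least one spurious "significant" result grows quickly with the number of tests. The sketch below assumes independent tests, each run at α = 0.05; real analyses are usually correlated, so the figures are indicative only.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

ALPHA = 0.05
rates = {k: familywise_error(ALPHA, k) for k in (1, 7, 20)}

for k, rate in rates.items():
    print(f"{k:>2} tests: P(at least one false positive) = {rate:.3f}")
```

Seven uncorrected tests already give roughly a 30% chance of a false positive, and twenty give about 64%, which is why reporting only the significant outcomes is so misleading.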