The “reproducibility crisis” in science is erupting again. A research project attempted to replicate 21 social science experiments published between 2010 and 2015 in the prestigious journals Science and Nature. Only 13 replication attempts succeeded. The other eight were duds, with no observed effects consistent with the original findings.
The failures do not necessarily mean the original results were erroneous, as the authors of this latest replication effort note; the replication attempts themselves could have gone wrong in unanticipated ways. But the authors also noted that even in the replications that succeeded, the observed effect was on average only about 75% as large as the first time around.
The researchers conclude that there is a systematic bias in published findings, “partly due to false positives and partly due to the overestimated effect sizes of true positives.”
The two-year replication project, published in the journal Nature Human Behaviour, is likely to roil research institutions and scientific journals that in recent years have grappled with reproducibility issues. The ability to replicate a finding is fundamental to experimental science. This latest project provides a reminder that the publication of a finding in a peer-reviewed journal does not make it true.
Scientists are under attack from ideologues, special interests and conspiracy theorists who reject the evidence-based consensus in such areas as evolution, climate change, the safety of vaccines and cancer treatment. The replication crisis is different; it is largely an in-house problem with experimental design and statistical analysis.
Refreshingly, other scientists have a pretty good detector for which studies are likely to stand the test of time. In this latest effort, the researchers asked more than 200 peer scientists to forecast which studies would replicate and how closely the effect sizes would match, in part by betting on the outcomes in a prediction market. The market got it remarkably right. The study’s authors suggest that scientific journals could tap into the “wisdom of crowds” when deciding how to treat submitted papers with novel results.