The Replication Crisis in Epidemiology: Snowball, Snow Job, or Winter Solstice?

Lash, Timothy L.; Collin, Lindsay J.; Van Dyke, Miriam E.

doi:10.1007/s40471-018-0148-x

Epidemiologic Methods (R MacLehose, Section Editor)
Published: 12 April 2018

Current Epidemiology Reports, Volume 5, pages 175–183 (2018)

Timothy L. Lash¹,
Lindsay J. Collin¹ &
Miriam E. Van Dyke¹

Abstract

Purpose of Review

Like a snowball rolling down a steep hill, the most recent crisis over the perceived lack of reproducibility of scientific results has outpaced the evidence of crisis. It has led to new actions and new guidelines that have been rushed to market without plans for evaluation, metrics for success, or due consideration of the potential for unintended consequences.

Recent Findings

The perception of the crisis is at least partly a snow job, heavily influenced by a small number of centers lavishly funded by a single foundation, with undue and unsupported attention to preregistration as a solution to the perceived crisis. At the same time, the perception of crisis provides an opportunity for introspection. Two studies’ estimates of association may differ because of undue attention to null hypothesis statistical testing, because of differences in the distribution of effect modifiers, because of differential susceptibility to threats to validity, or for other reasons. Perhaps the expectation of what reproducible epidemiology ought to look like is more misguided than the practice of epidemiology. We advocate for the idea of “replication and advancement.” Studies should not only replicate earlier work, but also improve on it by enhancing the design or analysis.

Summary

Abandoning blind reliance on null hypothesis significance testing for statistical inference, finding consensus on when preregistration of non-randomized study protocols has merit, and focusing on replication and advancement are the most certain ways to emerge from this solstice for the better.

Acknowledgments

Richard MacLehose reviewed and served as section editor for this manuscript. He has previously consulted with the Nutritional Science Initiative which received funds from the Arnold Foundation.

Author information

Authors and Affiliations

Department of Epidemiology, Rollins School of Public Health, Emory University, 1518-002-3BB, 1518 Clifton Rd. NE, Atlanta, GA, 30033, USA
Timothy L. Lash, Lindsay J. Collin & Miriam E. Van Dyke

Corresponding author

Correspondence to Timothy L. Lash.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflicts of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

This article is part of the Topical Collection on Epidemiologic Methods

About this article

Cite this article

Lash, T.L., Collin, L.J. & Van Dyke, M.E. The Replication Crisis in Epidemiology: Snowball, Snow Job, or Winter Solstice? Curr Epidemiol Rep 5, 175–183 (2018). https://doi.org/10.1007/s40471-018-0148-x

Published: 12 April 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s40471-018-0148-x
