Abstract
Replicability is usually considered to be one of the cornerstones of science; however, the growing recognition of nonreplicable experiments and studies in scientific journals—a phenomenon that has been called ‘replicability crisis’—has spurred a debate on the meaning, function, and significance of replicability in science. Amid this discussion, it has become clear that replicability is not a monolithic concept; what is still controversial is exactly how the distinction between different kinds of replicability should be laid out terminologically and conceptually, and to what extent it bears on the more general debate on the centrality of replicability in science. This paper’s goals are to clarify the different uses of the terms related to replicability and, more importantly, to conceptually specify the kinds of replicability and their respective epistemic functions.
Notes
For instance, the case of cold fusion, as it is usually presented in the literature (but see Norton 2015 for a discussion on this point).
For instance, the experiments by Bednorz and Muller, which revolutionized super-conductivity (Di Bucchianico 2014).
For instance, I will not discuss Machery’s resampling account of replication, which is based on conceptual engineering and is relevant only in those specific scientific fields that are based on resampling.
As they put it: “More is needed for achieving reproducible research. Responsibility for accomplishing this goal begins with adopting a universal lexicon of terms and concepts.” (Pellizzari et al. 2017, p. 47).
Usually, the expression ‘independent experiment’ refers to any experiment whose methodology differs from the methodology of the original experiment.
In this paper, I differentiate exact and direct replications following Westfall et al. (2015) and Nosek and Errington (2017); however, it is common to identify exact with direct replications (Stroebe and Strack 2014; Ward and Kemp 2019). The reason why I keep the distinction between exact and direct replications is that, within this taxonomy, direct replications recreate only the critical elements of an experiment or study, not all of its aspects. Whenever exact and direct replications are regarded as synonymous, it is because direct replications are defined as replications that attempt to recreate all aspects of an original experiment or study. Another use of ‘exact replication’ is to regard exact replications as kinds of direct replications (LeBel et al. 2017).
Exact replications are sometimes considered only theoretically possible, but practically impossible (Westfall et al. 2015). So, most of the time, replications that keep all aspects invariant and replications that keep only relevant aspects invariant are treated together.
This way of characterizing the controversy may sound naïve, but presenting the debate in these terms helps frame the issue more clearly and, as we will see, it captures a real controversy in disciplines such as psychology, which has been accused of not being replicable because it cannot implement direct replications.
According to Schmidt (2009), direct replications can: i. address sampling error; ii. control for artefacts; iii. address research fraud; iv. test generalizations to different populations; while conceptual replications can test the same hypothesis of a prior study using a different procedure.
Before getting into the issue in question, a few remarks are in order. First of all, my discussion will touch on the notions of random and systematic errors, precision, accuracy, reliability, and validity, all terms that have been discussed in the literature in many different ways. The meanings and relations of these terms, as with the terms related to replicability, vary across different fields and even within the same field. In order to limit controversy, I use the terminology that is usually employed in textbooks on error analysis or in physics textbooks (Taylor 1997; Squires 2001). Secondly, this section does not aim to be a technical review of error analysis; rather, it only aims to support the claim that different kinds of replicability serve different functions and that, for this reason, any hope of coming up with a general principle stating which kind of replicability provides the gold standard is bound to fail.
Here I am following a very standard categorization of errors that distinguishes random errors from systematic errors (Taylor 1997).
The standard way is to associate systematic errors with the methodology or experimental procedure. However, exceptions are possible. For instance, systematic errors can also be caused by environmental interferences or other aspects.
One caveat is important at this point. Even though each kind of replicability is particularly relevant and particularly efficient for assessing one particular kind of error (see Table 3), each of them may also indirectly indicate, in some particular circumstances, the presence of other kinds of errors. For instance, suppose that I do not know whether a measurement performed with a manual-stepper pipette was affected by a random error, and I perform a replication that replaces the manual-stepper pipette with a digital one. Further suppose that I obtain a different result. The discrepancy in results can of course be due to a difference in systematic error, but it can also reveal the possible presence of a random error, which I had omitted to check and quantify by performing exact replications. This is why it is always important to evaluate all kinds of errors by performing different kinds of replications.
References
Association for Computing Machinery (2018) Artifact review and badging. https://www.acm.org/publications/policies/artifact-review-badging. Accessed: May 2019
Baker M (2016) Biotech giant publishes failures to confirm high profile science. Nature 530:141. https://doi.org/10.1038/nature.2016.19269
Barba LA (2018) Terminologies for reproducible research. arXiv:1802.03311
Barlow R (2003) Introduction to statistical issues in particle physics. Preprint arXiv:physics/0311105
Barsalou LW (2016) Situated conceptualization offers a theoretical account of social priming. Curr Opin Psychol 12:6–11
Braude SE (1979) ESP and psychokinesis. A philosophical examination. Temple University Press, Philadelphia
Cesario J (2014) Priming, replication, and the hardest science. Perspect Psychol Sci 9(1):40–48
Chen X (1994) The rule of reproducibility and its applications in experiment appraisal. Synthese 99(1):87–109
Coyne JC (2016) Replication initiatives will not salvage the trustworthiness of psychology. BMC Psychol 4(1):1–11
Crandall CS, Sherman JW (2016) On the scientific superiority of conceptual replications for scientific progress. J Exp Soc Psychol 66:93–99
Di Bucchianico M (2014) A matter of phronesis: experiment and virtue in physics, a case study. In: Fairweather A (ed) Virtue epistemology naturalized. Springer International Publishing, Cham, pp 291–312
Dunlap K (1926) The experimental methods of psychology. Powell Lecture in Psychological Theory, Clark University, Worcester, MA, April 21, 1925. Clark University Press
Fanelli D, Costas R, Ioannidis JP (2017) Meta-assessment of bias in science. Proc Natl Acad Sci 114(14):3714–3719
Fidler F, Chee YE, Wintle BC, Burgman MA, McCarthy MA, Gordon A (2017) Metaresearch for evaluating reproducibility in ecology and evolution. Bioscience 67(3):282–289
Flier JS (2017) Irreproducibility of published bioscience research: Diagnosis, pathogenesis and therapy. Mol Metabol 6(1):2–9
Freedman WL, Madore BF, Hatt D, Hoyt TJ, Jang IS, Beaton RL et al (2019) The Carnegie-Chicago hubble program. VIII. An independent determination of the Hubble constant based on the tip of the red giant branch. Astrophys J 882(1):34
Goodman SN, Fanelli D, Ioannidis JP (2016) What does research reproducibility mean? Sci Transl Med 8(341):341ps12
JCGM (2008) JCGM 200:2008 International vocabulary of metrology—basic and general concepts and associated terms (VIM). Joint Committee for Guides in Metrology
Jasny BR et al (2011) Again, and again, and again …. Science 334:1225
LeBel EP, Berger D, Campbell L, Loving TJ (2017) Falsifiability is not optional. J Pers Soc Psychol 113(2):254–261
Leonelli S (2018) Rethinking reproducibility as a criterion for research quality. In: Including a symposium on Mary Morgan: curiosity, imagination, and surprise. Emerald Publishing Limited, pp 129–146
Lynch JG Jr, Bradlow ET, Huber JC, Lehmann DR (2015) Reflections on the replication corner: in praise of conceptual replications. Int J Res Mark 32(4):333–342
Machery E (2020) What is a replication? Philos Sci (forthcoming)
Makel MC, Plucker JA, Hegarty B (2012) Replications in psychology research: how often do they really occur? Perspect Psychol Sci 7(6):537–542
Manninen T, Aćimović J, Havela R, Teppola H, Linne ML (2018) Challenges in reproducibility, replicability, and comparability of computational models and tools for neuronal and glial networks, cells, and subcellular structures. Front Neuroinform 12:20
Miłkowski M, Hensel WM, Hohol M (2018) Replicability or reproducibility? On the replication crisis in computational neuroscience and sharing only relevant detail. J Comput Neurosci 45(3):163–172
Von Neumann J (1955) Mathematical foundations of quantum mechanics. Princeton University Press
Norton JD (2015) Replicability of experiment. THEORIA. Revista de Teoría, Historia y Fundamentos de la Ciencia 30(2):229–248
Nosek BA, Errington TM (2017) Reproducibility in cancer biology: making sense of replications. Elife 6:e23383
Nosek BA, Spies JR, Motyl M (2012) Scientific Utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspect Psychol Sci 7(6):615–631. https://doi.org/10.1177/1745691612459058
Open Science Collaboration (2015) Estimating the reproducibility of psychological science. Science 349(6251). https://doi.org/10.1126/science.aac4716
Pashler H, Harris CR (2012) Is the replicability crisis overblown? Three arguments examined. Perspect Psychol Sci 7(6):531–536
Pedersen-Bjergaard S, Gammelgaard B, Halvorsen TG (2019) Introduction to pharmaceutical analytical chemistry. Wiley
Pellizzari E, Lohr K, Creel D, Blatecky A (2017) Reproducibility: a primer on semantics and implications for research. RTI International
Peng RD (2011) Reproducible research in computational science. Science 334(6060):1226–1227
Peng RD, Dominici F, Zeger SL (2006) Reproducible epidemiologic research. Am J Epidemiol 163(9):783–789
Plesser HE (2018) Reproducibility vs. replicability: a brief history of a confused terminology. Front Neuroinform 11:76
Popper K (1934) The logic of scientific discovery. Routledge (edition consulted)
Radder H (2012) The material realization of science: a philosophical view on the experimental natural sciences, developed in discussion with Jürgen Habermas. Springer, Cham
Redish AD, Kummerfeld E, Morris RL, Love AC (2018) Opinion: reproducibility failures are essential to scientific inquiry. Proc Natl Acad Sci 115(20):5042–5046
Riess AG, Macri LM, Hoffmann SL, Scolnic D, Casertano S, Filippenko AV et al (2016) A 2.4% determination of the local value of the Hubble constant. Astrophys J 826(1):56
Romero F (2019) Philosophy of science and the replicability crisis. Philos Compass 14(11):e12633-1
Schmidt S (2009) Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Rev Gen Psychol 13(2):90–100. https://doi.org/10.1037/a0015108
Schnall S (2014) Moral intuitions, replication, and the scientific study of human nature. Edge. https://www.edge.org/conversation/simone_schnall-simone-schnall-moral-intuitions-replication-and-the-scientific-study-of
Squires GL (2001) Practical physics. Cambridge University Press
Steen RG, Casadevall A, Fang FC (2013) Why has the number of scientific retractions increased? PLOS ONE 8(7):e68397
Stroebe W, Strack F (2014) The alleged crisis and the illusion of exact replication. Perspect Psychol Sci 9(1):59–71
Taylor J (1997) Introduction to error analysis, the study of uncertainties in physical measurements. University Science Books
Velden T, Hinze S, Scharnhorst A, Schneider JW, Waltman L (2018) Exploration of reproducibility issues in scientometric research Part 2: Conceptual reproducibility. arXiv:1804.05026
Ward M, Kemp S (2019) The probability of conceptual replication and the variability of effect size. Methods Psychol 1:100002
Wen H, Wang HY, He X, Wu CI (2018) On the low reproducibility of cancer studies. Nat Sci Rev 5(5):619–624
Westfall J, Judd CM, Kenny DA (2015) Replicating studies in which samples of participants respond to samples of stimuli. Perspect Psychol Sci 10(3):390–399
Zirpel M (2013) Repeatable measurements and the collapse postulate. arXiv:1311.1152
Zwaan RA, Etz A, Lucas RE, Donnellan MB (2018) Making replication mainstream. Behav Brain Sci 41
Acknowledgements
I would like to thank the members of the research colloquium in philosophy of science at the University of Bern, in particular Claus Beisbart, for stimulating questions and insightful comments on an earlier version of this article. I also would like to thank Casey McCoy for reading different versions of this paper and for his painstaking comments. Finally, I am grateful to Kevin Heng and the audience of the 2020 CSH symposium at the University of Bern for their thought-provoking questions on the topic of this paper.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Matarese, V. Kinds of Replicability: Different Terms and Different Functions. Axiomathes 32 (Suppl 2), 647–670 (2022). https://doi.org/10.1007/s10516-021-09610-2