Abstract
The ability to independently verify and replicate observations made by other researchers is a hallmark of science. In this article, we provide an overview of recent discussions concerning replicability and best practices in mainstream psychology with an emphasis on the practical benefits to both researchers and the field as a whole. We first review challenges individual researchers face in producing research that is both publishable and reliable. We then suggest methods for producing more accurate research claims, such as transparently disclosing how results were obtained and analyzed, preregistering analysis plans, and publicly posting original data and materials. We also discuss ongoing changes at the institutional level to incentivize stronger research. These include officially recognizing open science practices at the journal level, disconnecting the publication decision from the results of a study, training students to conduct replications, and publishing replications. We conclude that these open science practices afford exciting low-cost opportunities to improve the quality of psychological science.
Notes
“Chance” here refers to fluctuating factors that are uncontrolled/unaccounted for in an experiment, as well as the sampling procedures used to obtain one particular subset of a population, as opposed to another. It does not signify that behavior itself is ultimately random or probabilistic. As Poincaré (1914/1952) observed, “Every phenomenon, however trifling it be, has a cause, and a mind infinitely powerful and infinitely well-informed concerning the laws of nature could have foreseen it. . . . Chance is only the measure of our ignorance” (p. 65). To say that a difference is “due to chance” is to say that, for unspecifiable reasons not involving systematic treatment, one’s sample is atypical of the population.
Version control allows users to maintain a history of their files and to fall back to any previous version. This is useful in the case that one deletes material from a manuscript that should later be added back in, changes something in analytic code that accidentally breaks everything, or accidentally saves changes to the original raw data file. Version control eliminates the need for a cluttered folder full of files named “manuscript.docx,” “manuscript_final.docx,” and “manuscript_final2.docx.” In general, it is not considered an open-science development, but it is a useful tool nonetheless.
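The history-and-rollback behavior described in this note can be sketched with Git, one widely used version-control system (the commands and filenames below are illustrative, not part of the article):

```shell
# Hypothetical illustration using Git; filenames are invented for the example.
cd "$(mktemp -d)"                      # work in a fresh scratch directory
git init -q
git config user.name "Author"          # identity required for commits
git config user.email "author@example.com"
echo "Original sentence." > manuscript.txt
git add manuscript.txt
git commit -q -m "First draft"
echo "Heavily revised sentence." > manuscript.txt
git commit -q -am "Second draft"
# Every prior version stays retrievable; no "manuscript_final2.docx" needed.
git show HEAD~1:manuscript.txt         # prints: Original sentence.
```

Because each committed version is recoverable by reference (here, `HEAD~1`), deleted passages or broken analysis code can be restored without keeping duplicate files.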
References
Abelson, R. P. (1995). Statistics as principled argument. Mahwah, NJ: Lawrence Erlbaum Associates.
Baron, A., & Perone, M. (1999). Experimental design and analysis in the laboratory study of human operant behavior. In K. Lattal & M. Perone (Eds.), The handbook of research methods in human operant behavior (pp. 45–91). Boston, MA: Springer.
Berkowitz, L. (1971). Reporting an experiment: A case study in leveling, sharpening, and assimilation. Journal of Experimental Social Psychology, 7, 237–243.
Branch, M. (2014). Malignant side-effects of null-hypothesis significance testing. Theory & Psychology, 24, 256–277. https://doi.org/10.1177/0959354314525282.
Branch, M. (2018). The “reproducibility crisis”: Might the methods used frequently in behavior-analysis research help? Perspectives on Behavior Science. https://doi.org/10.1007/s40614-018-0158-5.
Brandt, M. J., IJzerman, H., Dijksterhuis, A., Farach, F. J., Geller, J., Giner-Sorolla, R., et al. (2014). The replication recipe: What makes for a convincing replication? Journal of Experimental Social Psychology, 50, 217–224. https://doi.org/10.1016/j.jesp.2013.10.005.
Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., et al. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2, 637–644. https://doi.org/10.1038/s41562-018-0399-z.
Campbell, L. (2016, September 27). Campbell Lab: OSF Research Milestones. Retrieved from osf.io/jrd8f
Center for Open Science (2018). Registered reports: Peer review before results are known to align scientific values and practices. https://cos.io/rr/
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Cook, F. L. (2018, March 9). Dear colleague letter: Achieving new insights through replicability and reproducibility. National Science Foundation. Retrieved from https://www.nsf.gov/pubs/2018/nsf18053/nsf18053.jsp
Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25, 7–29. https://doi.org/10.1177/0956797613504966.
Davison, M. (1999). Statistical inference in behavior analysis: Having my cake and eating it? Behavior Analyst, 22, 99–103. https://doi.org/10.1007/BF03391986.
Dienes, Z. (2014). Using Bayes to get the most out of non-significant results. Frontiers in Psychology, 5, 1–17. https://doi.org/10.3389/fpsyg.2014.00781.
Eastwick, P. (2018, January 9). Two lessons from a registered report. Retrieved from http://pauleastwick.blogspot.com/2018/01/two-lessons-from-registered-report.html
Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian statistical inference for psychological research. Psychological Review, 70, 193–242.
Eich, E. (2014). Business not as usual. Psychological Science, 25, 3–6. https://doi.org/10.1177/0956797613512465.
Fabrigar, L. R., & Wegener, D. T. (2016). Conceptualizing and evaluating the replication of research results. Journal of Experimental Social Psychology, 66, 68–80. https://doi.org/10.1016/j.jesp.2015.07.009.
Faith, M. S., Allison, D. B., & Gorman, B. S. (1996). Meta-analysis of single-case research. In R. D. Franklin, D. B. Allison, & B. S. Gorman (Eds.), Design and analysis of single-case research (pp. 245–277). Hillsdale, NJ: Lawrence Erlbaum Associates.
Fanelli, D. (2012). Negative results are disappearing from most disciplines and countries. Scientometrics, 90, 891–904. https://doi.org/10.1007/s11192-011-0494-7.
Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149–1160. https://doi.org/10.3758/BRM.41.4.1149.
Finkel, E. J., Eastwick, P. W., & Reis, H. T. (2015). Best research practices in psychology: Illustrating epistemological and pragmatic considerations with the case of relationship science. Journal of Personality and Social Psychology, 108, 275–297. https://doi.org/10.1037/pspi0000007.
Finkel, E. J., Eastwick, P. W., & Reis, H. T. (2017). Replicability and other features of high-quality science: Toward a balanced empirical approach. Journal of Personality & Social Psychology, 113, 244–253. https://doi.org/10.1037/pspi0000075.
Fisch, G. S. (2001). Evaluating data from behavioral analysis: Visual inspection or statistical models? Behavioural Processes, 54, 137–154. https://doi.org/10.1016/S0376-6357(01)00155-3.
Fiske, S. T. (2016). A call to change science’s culture of shaming. APS Observer. Retrieved from https://www.psychologicalscience.org/observer/a-call-to-change-sciences-culture-of-shaming.
Fox, A. E. (2018). The future is upon us. Behavior Analysis: Research & Practice, 18, 144–150. https://doi.org/10.1037/bar0000106.
Francis, G. (2012). Too good to be true: Publication bias in two prominent studies from experimental psychology. Psychonomic Bulletin & Review, 19, 151–156. https://doi.org/10.3758/s13423-012-0227-9.
Frank, M. C., & Saxe, R. (2012). Teaching replication. Perspectives on Psychological Science, 7, 600–604. https://doi.org/10.1177/1745691612460686.
Funder, D. C., Levine, J. M., Mackie, D. M., Morf, C. C., Sansone, C., Vazire, S., & West, S. G. (2014). Improving the dependability of research in personality and social psychology: Recommendations for research and educational practice. Personality & Social Psychology Review, 18, 3–12. https://doi.org/10.1177/1088868313507536.
Gelman, A., & Loken, E. (2014). The statistical crisis in science. American Scientist, 102, 460–465.
Gilbert, D. T., King, G., Pettigrew, S., & Wilson, T. D. (2016). Comment on “Estimating the reproducibility of psychological science.” Science, 351, 1037. https://doi.org/10.1126/science.aad7243.
Giner-Sorolla, R. (2012). Science or art? How aesthetic standards grease the way through the publication bottleneck but undermine science. Perspectives on Psychological Science, 7, 562–571. https://doi.org/10.1177/1745691612457576.
Goh, J. X., Hall, J. A., & Rosenthal, R. (2016). Mini meta-analysis of your own studies: Some arguments on why and a primer on how. Social & Personality Psychology Compass, 10, 535–549. https://doi.org/10.1111/spc3.12267.
Grahe, J. E., Reifman, A., Hermann, A. D., Walker, M., Oleson, K. C., Nario-Redmond, M., & Wiebe, R. P. (2012). Harnessing the undiscovered resource of student research projects. Perspectives on Psychological Science, 7, 605–607. https://doi.org/10.1177/1745691612459057.
Grahe, J. E. (2014). Announcing Open Science badges and reaching for the sky. Journal of Social Psychology, 154, 1–3. https://doi.org/10.1080/00224545.2014.853582.
Grahe, J. E. (2018). Another step towards scientific transparency: Requiring research materials for publication. Journal of Social Psychology, 158, 1–6. https://doi.org/10.1080/00224545.2018.1416272.
Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1–20. https://doi.org/10.1037/h0076157.
Greenwald, A. G. (1976). Within-subjects designs: To use or not to use? Psychological Bulletin, 83, 314–320.
Hales, A. H. (2016). Does the conclusion follow from the evidence? Recommendations for improving research. Journal of Experimental Social Psychology, 66, 39–46. https://doi.org/10.1016/j.jesp.2015.09.011.
Hawkins, R. X. D., Smith, E. N., Au, C., Arias, J. M., Catapano, R., Hermann, E., et al. (2018). Improving the replicability of psychological science through pedagogy. Advances in Methods & Practices in Psychological Science, 1, 7–18.
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral & Brain Sciences, 33, 61–135. https://doi.org/10.1017/S0140525X0999152X.
Higgins, J. P., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. Statistics in Medicine, 21, 1539–1558. https://doi.org/10.1002/sim.1186.
Hennes, E. P., & Lane, S. P. (2017, April). Power to the people: Simulation methods for conducting power analysis for any model. Workshop presented at the 89th Annual Meeting of the Midwestern Psychological Association, Chicago, IL.
Huitema, B. E., & McKean, J. W. (2000). Design specification issues in time-series intervention models. Educational & Psychological Measurement, 60, 38–58. https://doi.org/10.1177/00131640021970358.
Iyengar, S. S., & Lepper, M. R. (2000). When choice is demotivating: Can one desire too much of a good thing? Journal of Personality & Social Psychology, 79, 995–1006. https://doi.org/10.1037/0022-3514.79.6.995.
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth-telling. Psychological Science, 23, 524–532. https://doi.org/10.1177/0956797611430953.
Jones, L. V., & Tukey, J. W. (2000). A sensible formulation of the significance test. Psychological Methods, 5, 411–414. https://doi.org/10.1037//1082-989X.5.4.411.
Kanyongo, G. Y., Brook, G. P., Kyei-Blankson, L., & Gocmen, G. (2007). Reliability and statistical power: How measurement fallibility affects power and required sample sizes for several parametric and nonparametric statistics. Journal of Modern Applied Statistical Methods, 6, 81–90. https://doi.org/10.22237/jmasm/1177992480.
Kawakami, K. (2015). Editorial. Journal of Personality and Social Psychology: Interpersonal Relations & Group Processes, 108, 58–59. https://doi.org/10.1037/pspi0000013.
Kidwell, M. C., Lazarević, L. B., Baranski, E., Hardwicke, T. E., Piechowski, S., Falkenberg, L.-S., et al. (2016). Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLoS Biology, 14, e1002456. https://doi.org/10.1371/journal.pbio.1002456.
Kirkpatrick, K., Marshall, A. T., Steele, C. C., & Peterson, J. R. (2018). Resurrecting the individual in behavioral analysis: Using mixed effects models to address nonsystematic discounting data. Behavior Analysis: Research & Practice, 18, 219–238. https://doi.org/10.1037/bar0000103.
Kitayama, S. (2017). Editorial. Journal of Personality and Social Psychology: Attitudes and Social Cognition, 112, 357–360. https://doi.org/10.1037/pspa0000077.
Kochari, A., & Ostarek, M. (2018, January 17). Introducing a replication-first rule for PhD projects (BBS commentary on Zwaan et al., Making replication mainstream). Retrieved from psyarxiv.com/6yv45
Kyonka, E. G. E. (2018). Tutorial: Small-N power analysis. Perspectives on Behavior Science. https://doi.org/10.1007/s40614-018-0167-4.
Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44, 701–710. https://doi.org/10.1002/ejsp.2023.
Lakens, D., & Etz, A. J. (2017). Too true to be bad: When sets of studies with significant and nonsignificant findings are probably true. Social Psychological & Personality Science, 8, 875–881. https://doi.org/10.1177/1948550617693058.
Lindsay, D. S. (2015). Replication in psychological science. Psychological Science, 26, 1827–1832. https://doi.org/10.1177/0956797615616374.
Makel, M. C., Plucker, J. A., & Hegarty, B. (2012). Replications in psychology research: How often do they really occur? Perspectives on Psychological Science, 7, 537–542. https://doi.org/10.1177/1745691612460688.
Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: Causes, consequences, and remedies. Psychological Methods, 9, 147–163. https://doi.org/10.1037/1082-989X.9.2.147.
Mayer, H. (Producer/Director) & Norris, M. (Writer). (1970). The social animal [Motion picture]. (Available from Indiana University Audio-Visual Center, Bloomington, Indiana).
McCabe, C. J., Kim, D. S., & King, K. M. (2018). Improving present practices in the visual display of interactions. Advances in Methods & Practices in Psychological Science, 1, 147–165. https://doi.org/10.1177/2515245917746792.
McShane, B. B., & Böckenholt, U. (2017). Single-paper meta-analysis: Benefits for study summary, theory testing, and replicability. Journal of Consumer Research, 43, 1048–1063. https://doi.org/10.1093/jcr/ucw085.
Mellor, D. T., Esposito, J., DeHaven, A. C., & Stodden, V. (2018, October 24). Resources. Open Science Framework. Retrieved from osf.io/kgnva
Meyer, M. N. (2018). Practical tips for ethical data sharing. Advances in Methods & Practices in Psychological Science, 1, 131–144. https://doi.org/10.1177/2515245917747656.
Meyvis, T., & van Osselaer, S. M. J. (2018). Increasing the power of your study by increasing your effect size. Journal of Consumer Research, 44, 1157–1173. https://doi.org/10.1093/jcr/ucx110.
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2017, August 24). The preregistration revolution. Retrieved from osf.io/2dxu5
Nosek, B. A., & Lakens, D. (2013). Replications of important results in social psychology. [Special issue]. Social Psychology, 44, 59–60.
Nosek, B. A., & Lakens, D. (2014). Registered reports: A method to increase the credibility of published results. Social Psychology, 45, 137–141.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7, 615–631. https://doi.org/10.1177/1745691612459058.
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716.
Patil, P., Peng, R. D., & Leek, J. T. (2016). What should researchers expect when they replicate studies? A statistical view on replicability in psychological science. Perspectives on Psychological Science, 11, 539–544. https://doi.org/10.1177/1745691616646366.
Perone, M. (1999). Statistical inference in behavior analysis: Experimental control is better. The Behavior Analyst, 22, 109–116.
Perone, M. (2018). How I learned to stop worrying and love replication failures. Perspectives on Behavior Science. https://doi.org/10.1007/s40614-018-0153-x.
Perrino, T., Howe, G., Sperling, A., Beardslee, W., Sandler, I., Shern, D., et al. (2013). Advancing science through collaborative data sharing and synthesis. Perspectives on Psychological Science, 8, 433–444. https://doi.org/10.1177/1745691613491579.
Poincaré, H. (1952). Science and method. London, UK: Thomas Nelson (Original work published 1914).
Ramsey, P. H. (1990). "One-and-a-half-tailed" tests of significance. Psychological Reports, 66, 653–654. https://doi.org/10.2466/PR0.66.2.653-654.
Rosenthal, R. (1984). Meta-analytic procedures for social research. (Applied social research methods series, Vol. 6). Newbury Park, CA: Sage.
Sagarin, B. J., Ambler, J. K., & Lee, E. M. (2014). An ethical approach to peeking at data. Perspectives on Psychological Science, 9, 293–304. https://doi.org/10.1177/1745691614528214.
Sakaluk, J. K. (2016). Exploring small, confirming big: An alternative system to the new statistics for advancing cumulative and replicable psychological research. Journal of Experimental Social Psychology, 66, 47–54. https://doi.org/10.1016/j.jesp.2015.09.013.
Schachter, S. (1951). Deviation, rejection and communication. Journal of Abnormal & Social Psychology, 46, 190–207.
Schimmack, U. (2012). The ironic effect of significant results on the credibility of multiple-study articles. Psychological Methods, 17, 551–566. https://doi.org/10.1037/a0029487.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366. https://doi.org/10.1177/0956797611417632.
Simons, D. J., Holcombe, A. O., & Spellman, B. A. (2014). An introduction to registered replication reports at Perspectives on Psychological Science. Perspectives on Psychological Science, 9, 552–555. https://doi.org/10.1177/1745691614543974.
Simons, D. J., Shoda, Y., & Lindsay, D. S. (2017). Constraints on generality (COG): A proposed addition to all empirical papers. Perspectives on Psychological Science, 12, 1123–1128. https://doi.org/10.1177/1745691617708630.
Simonsohn, U. (2013). Just post it: The lesson from two cases of fabricated data detected by statistics alone. Psychological Science, 24, 1875–1888. https://doi.org/10.1177/0956797613480366.
Simonsohn, U. (2015). Small telescopes: Detectability and the evaluation of replication results. Psychological Science, 26, 559–569. https://doi.org/10.1177/0956797614567341.
Spellman, B. A. (2015). A short (personal) future history of Revolution 2.0. Perspectives on Psychological Science, 10, 886–899. https://doi.org/10.1177/1745691615609918.
Vasishth, S., & Gelman, A. (2017). The statistical significance filter leads to overconfident expectations of replicability. arXiv. Retrieved April 27, 2018, from https://arxiv.org/abs/1702.00556
Vazire, S. (2016). Editorial. Social Psychological & Personality Science, 7, 3–7. https://doi.org/10.1177/1948550615603955.
Wagenmakers, E., Wetzels, R., Borsboom, D., van der Maas, H. J., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7, 632–638. https://doi.org/10.1177/1745691612463078.
Wagenmakers, E., & Dutilh, G. (2016). Seven selfish reasons for preregistration. APS Observer, 29, 13–14.
Wahrman, R., & Pugh, M. D. (1972). Competence and conformity: Another look at Hollander’s Study. Sociometry, 35, 376–386.
Wang, Y. A., Sparks, J., Gonzales, J. E., Hess, Y. D., & Ledgerwood, A. (2017). Using independent covariates in experimental designs: Quantifying the trade-off between power boost and Type I error inflation. Journal of Experimental Social Psychology, 72, 118–124. https://doi.org/10.1016/j.jesp.2017.04.011.
Wesselmann, E. D., Williams, K. D., Pryor, J. B., Eichler, F. A., Gill, D. M., & Hogue, J. D. (2014). Revisiting Schachter’s research on rejection, deviance, and communication (1951). Social Psychology, 45, 164–169.
Wesselmann, E., Wirth, J. H., & Grahe, J. E. (2018, February 27). “Sources of Ostracism” Hub. https://doi.org/10.17605/OSF.IO/ENV5W
Zwaan, R. A., Etz, A., Lucas, R. E., & Donnellan, M. B. (2017). Making replication mainstream. Behavioral & Brain Sciences, 41. Advance online publication. https://doi.org/10.1017/S0140525X17001972.
Acknowledgements
We thank Thomas Critchfield for valuable comments on a draft of this article.
Hales, A.H., Wesselmann, E.D. & Hilgard, J. Improving Psychological Science through Transparency and Openness: An Overview. Perspect Behav Sci 42, 13–31 (2019). https://doi.org/10.1007/s40614-018-00186-8