We thank Dr. Macnaughton for the correction [1]. In its official statement [2], the ASA did say that “A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.” and “By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.” The follow-up editorial by Wasserstein et al. [3], which indeed was not an official ASA statement, asserted that in the twenty-first century “researchers are free to treat ‘p = 0.051’ and ‘p = 0.049’ as not being categorically different, [and] … authors no longer find themselves constrained to selectively publish their results based on a single magic number. In this world, … studies with ‘p < 0.05’ and studies with ‘p > 0.05’ are not automatically in conflict….”
We understand that after teaching and using significance testing for a professional lifetime, some statisticians feel compelled to defend this methodology despite its proclivity to spur misinterpretations of data. We believe that is the reason why many statisticians would prefer to walk back the ASA statement underscoring the problems with significance testing. We emphasize that the p-value itself, while often misinterpreted to mean a probability that chance explains an association, is not the problem. The problem is the conversion of a p-value into a dichotomy of “significant” or “not significant” and the resulting misinterpretation that so often ensues.
A striking illustration of the harm that has been done to science by reliance on statistical significance testing was given by van Zwet and Cator [4] (Fig. 1). They examined the distribution of over 1,000,000 z-values from Medline, showing that the distribution is clearly distorted because of the artificial dichotomization of p-values in significance testing and the biased publication decisions that ensue. Reliance on significance testing has distorted many individual analyses, and furthermore has had a damaging effect on the literature of science as a whole. Realization of the harm done by significance testing is a crucial reason that many experienced statisticians and researchers regard “statistical significance” as an antiquated tool that is long overdue for replacement (e.g., [5,6,7,8,9,10,11]).
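The distortion described above can be illustrated with a simple simulation. The sketch below (our illustration, not the analysis of van Zwet and Cator; the true effect size and sample count are arbitrary assumptions) draws z-values around a modest true effect and then applies the “significance filter,” keeping only results with p < 0.05. The mean of the surviving z-values overstates the true effect, which is the winner’s curse:

```python
import random
import statistics

random.seed(1)

# Hypothetical "true" study results: z-values around a modest true effect of 1.0.
all_z = [random.gauss(1.0, 1.0) for _ in range(100_000)]

# The significance filter: only |z| > 1.96 (two-sided p < 0.05) gets published.
published = [z for z in all_z if abs(z) > 1.96]

# The published mean z-value is far larger than the true mean of 1.0,
# so published effect estimates are systematically exaggerated.
print(statistics.mean(all_z))      # close to 1.0
print(statistics.mean(published))  # substantially larger
```

Filtering on a significance threshold thus biases the published literature even when every individual study is conducted flawlessly.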
References
1. Macnaughton DB. Re: Statistical inference and effect measures in abstracts of randomized controlled trials, 1975–2021, by Stang A, Rothman KJ. Eur J Epidemiol. 2023;38:1035–42.
2. Wasserstein RL, Lazar NA. The ASA’s statement on p-values: context, process, and purpose. Am Stat. 2016;70(2):129–33.
3. Wasserstein RL, Schirm AL, Lazar NA. Moving to a world beyond “p < 0.05”. Am Stat. 2019;73:1–19. https://doi.org/10.1080/00031305.2019.1583913.
4. van Zwet EW, Cator EA. The significance filter, the winner’s curse and the need to shrink. Stat Neerl. 2021;75(4):437–52. https://doi.org/10.1111/stan.12241.
5. Rothman KJ. A show of confidence. N Engl J Med. 1978;299(24):1362–3. https://doi.org/10.1056/NEJM197812142992410.
6. Rothman KJ. Significance questing. Ann Intern Med. 1986;105(3):445–7.
7. Cumming G. Understanding the new statistics: effect sizes, confidence intervals, and meta-analysis. New York: Routledge; 2011.
8. Gigerenzer G. Statistical rituals: the replication delusion and how we got there. Adv Methods Pract Psychol Sci. 2018;1(2):198–218. https://doi.org/10.1177/2515245918771329.
9. Amrhein V, Greenland S, McShane B. Retire statistical significance. Nature. 2019;567:305–7.
10. McShane BB, Gal D, Gelman A, Robert C, Tackett JL. Abandon statistical significance. Am Stat. 2019;73:235–45. https://doi.org/10.1080/00031305.2018.1527253.
11. Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Med Res Methodol. 2020;20(1). https://doi.org/10.1186/s12874-020-01105-9.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Cite this article
Stang, A., Rothman, K.J. Authors’ Reply: Statistical inference and effect measures in abstracts of randomized trials, 1975–2021. Eur J Epidemiol 39, 567–568 (2024). https://doi.org/10.1007/s10654-023-01081-6