Abstract
Purpose
The primary objective of our study was to determine how lowering a P value threshold from 0.05 to 0.005 would affect the statistical significance of previously published randomized controlled trials (RCTs) in major anesthesiology journals.
Methods
We searched the PubMed database for studies electronically published in 2020 within three major general anesthesiology journals as indexed by both Google Scholar Metrics and Scimago Journal & Country Rank. We included RCTs that were published in 2020 in Anesthesiology, Anesthesia & Analgesia, or the British Journal of Anaesthesia; had a primary endpoint; and used a P value threshold to determine the effect of the intervention. We performed screening and data extraction in a masked, duplicate fashion.
Results
Ninety-one RCTs met inclusion criteria. The most frequently studied type of intervention was drugs (44/91, 48%). From the 91 trials, 99 primary endpoints, and thus P values, were obtained. Fifty-eight (59%) endpoints had a P value < 0.05 and 41 (41%) had a P value ≥ 0.05. Of the 58 primary endpoints previously considered statistically significant, 21 (36%) P values would maintain statistical significance at P < 0.005, and 37 (64%) would be reclassified as “suggestive.”
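The reclassification arithmetic can be checked with a short sketch. It uses only the endpoint counts reported in this abstract. The function names `classify` and `pct` are illustrative assumptions, not part of the study's methods, and the three P values in the final loop are hypothetical examples rather than study data.

```python
# Counts as reported in the Results paragraph of the abstract.
TOTAL_ENDPOINTS = 99
BELOW_0005 = 21        # P < 0.005: would remain statistically significant
SUGGESTIVE = 37        # 0.005 <= P < 0.05: would become "suggestive"
AT_OR_ABOVE_005 = 41   # P >= 0.05: not significant under either threshold


def classify(p, conventional=0.05, proposed=0.005):
    """Label a P value under the conventional and proposed thresholds."""
    if p < proposed:
        return "significant"
    if p < conventional:
        return "suggestive"
    return "not significant"


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


previously_significant = BELOW_0005 + SUGGESTIVE  # 58 endpoints with P < 0.05

print(pct(previously_significant, TOTAL_ENDPOINTS))  # 59
print(pct(AT_OR_ABOVE_005, TOTAL_ENDPOINTS))         # 41
print(pct(BELOW_0005, previously_significant))       # 36
print(pct(SUGGESTIVE, previously_significant))       # 64
print(pct(SUGGESTIVE, TOTAL_ENDPOINTS))              # 37

# Hypothetical P values, for illustration only (not taken from the study).
for p in (0.001, 0.02, 0.3):
    print(p, classify(p))
```

The last percentage (37 of all 99 endpoints) is the share of interpretations that would change. The 64% figure applies only to the 58 endpoints that were previously significant.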
Conclusions
Lowering the P value threshold from 0.05 to 0.005 would have altered roughly one third (37/99, 37%) of significance interpretations of RCTs in the surveyed anesthesiology literature. Thus, it is important for readers to consider post hoc probabilities when evaluating clinical trial results. Although the present study focused on the anesthesiology literature, we suggest that our results warrant further research within other fields of medicine to help avoid clinical misinterpretation of RCT findings and improve quality of care.
Author contributions
Philo Waters contributed to all aspects of this manuscript including the conception and design; acquisition, analysis, and interpretation of data; and drafting of the article. Brayden Rucker contributed to the conception of the study, the interpretation of data, and drafting of the article. Mitchell Love contributed to the acquisition of data and drafting of the article. Matt Vassar contributed to the conception and design; the analysis and interpretation of data; and drafting of the article.
Disclosures
No financial or other support was provided for the development of this manuscript. Dr. Vassar reports receiving funding from the National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, the US Office of Research Integrity, and internal grants from Oklahoma State University Center for Health Sciences, all of which are outside of the present work. All remaining authors have nothing to disclose.
Funding statement
This study was not externally funded.
Editorial responsibility
This submission was handled by Dr. Stephan K. W. Schwarz, Editor-in-Chief, Canadian Journal of Anesthesia/Journal canadien d’anesthésie.
Supplementary Information
Supplementary file 1 (PDF, 214 KB) contains eTable 1 (P values for each study or reasons for exclusion) and eTable 2 (primary endpoint designation for each study).
About this article
Cite this article
Waters, P., Rucker, B., Love, M. et al. Lowering the statistical significance threshold of randomized controlled trials in three major general anesthesiology journals. Can J Anesth/J Can Anesth 70, 1441–1448 (2023). https://doi.org/10.1007/s12630-023-02529-9
DOI: https://doi.org/10.1007/s12630-023-02529-9