Lowering the statistical significance threshold of randomized controlled trials in three major general anesthesiology journals

Waters, Philo; Rucker, Brayden; Love, Mitchell; Vassar, Matt

doi:10.1007/s12630-023-02529-9

Reports of Original Investigations
Published: 10 August 2023

Volume 70, pages 1441–1448, (2023)
Canadian Journal of Anesthesia/Journal canadien d'anesthésie

Philo Waters IV BS¹ (ORCID: orcid.org/0000-0003-0918-1517),
Brayden Rucker DO¹,
Mitchell Love BS¹ &
Matt Vassar PhD¹,²

Abstract

Purpose

The primary objective of our study was to determine how lowering a P value threshold from 0.05 to 0.005 would affect the statistical significance of previously published randomized controlled trials (RCTs) in major anesthesiology journals.

Methods

We searched the PubMed database for studies electronically published in 2020 within three major general anesthesiology journals as indexed by both Google Scholar Metrics and Scimago Journal & Country Rank. We included RCTs published in 2020 in Anesthesiology, Anesthesia & Analgesia, and the British Journal of Anaesthesia that had a primary endpoint and used a P value threshold to determine the effect of the intervention. We performed screening and data extraction in a masked, duplicate fashion.

Results

Ninety-one RCTs met inclusion criteria. The most frequently studied type of intervention was drugs (44/91, 48%). From the 91 trials, 99 primary endpoints, and thus P values, were obtained. Fifty-eight (59%) endpoints had a P value < 0.05 and 41 (41%) had a P value ≥ 0.05. Of the 58 primary endpoints previously considered statistically significant, 21 (36%) P values would maintain statistical significance at P < 0.005, and 37 (64%) would be reclassified as “suggestive.”
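The reclassification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names `classify` and `percent` and the example P values are hypothetical, and only the counts (99, 58, 41, 21, 37) come from the abstract.

```python
from collections import Counter


def classify(p, conventional=0.05, proposed=0.005):
    """Label a P value under the proposed three-tier scheme."""
    if p < proposed:
        return "significant"
    if p < conventional:
        return "suggestive"
    return "not significant"


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


# Hypothetical P values for illustration only; these are not study data.
example_p_values = [0.001, 0.004, 0.012, 0.03, 0.049, 0.2]
labels = Counter(classify(p) for p in example_p_values)

# Counts reported in the abstract.
total_endpoints = 99
below_005_threshold = 21   # still significant at P < 0.005
suggestive = 37            # 0.005 <= P < 0.05
not_significant = 41       # P >= 0.05

print(percent(below_005_threshold + suggestive, total_endpoints))  # share with P < 0.05
print(percent(suggestive, below_005_threshold + suggestive))       # share reclassified
```

Under these definitions, a P value of exactly 0.005 falls in the "suggestive" tier and a P value of exactly 0.05 is not significant, which matches the strict inequalities used in the abstract.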

Conclusions

Lowering the P value threshold from 0.05 to 0.005 would have altered roughly one third of the significance interpretations (37 of 99 primary endpoints) of RCTs in the surveyed anesthesiology literature. Thus, it is important for readers to consider post hoc probabilities when evaluating clinical trial results. Although the present study focused on the anesthesiology literature, our results warrant further research in other fields of medicine to help avoid clinical misinterpretation of RCT findings and improve quality of care.
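One way to make the idea of a post hoc probability concrete is the false positive risk: the probability that a result declared significant comes from a trial in which there is no true effect. The sketch below uses the standard Bayes' rule form of that quantity. It is an illustration under stated assumptions, not a calculation from the study: the function name `false_positive_risk`, the assumed power of 0.8, and the assumed prior probability of 0.1 are all hypothetical choices.

```python
def false_positive_risk(alpha, power, prior):
    """Probability that a 'significant' result is a false positive.

    alpha: significance threshold (type I error rate)
    power: probability of detecting a true effect (assumed, not from the study)
    prior: prior probability that the effect is real (assumed, not from the study)
    """
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    false_positives = alpha * (1 - prior)   # null is true, test rejects anyway
    true_positives = power * prior          # effect is real, test detects it
    return false_positives / (false_positives + true_positives)


for alpha in (0.05, 0.005):
    risk = false_positive_risk(alpha, power=0.8, prior=0.1)
    print(f"alpha={alpha}: false positive risk = {risk:.1%}")
```

With these assumed inputs, the risk is 36.0% at a threshold of 0.05 and 5.3% at a threshold of 0.005. The exact figures depend entirely on the assumed power and prior, so they should be read as an illustration of the direction of the effect rather than as estimates for the surveyed trials.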

Author contributions

Philo Waters contributed to all aspects of this manuscript including the conception and design; acquisition, analysis, and interpretation of data; and drafting of the article. Brayden Rucker contributed to the conception of the study, the interpretation of data, and drafting of the article. Mitchell Love contributed to the acquisition of data and drafting of the article. Matt Vassar contributed to the conception and design; the analysis and interpretation of data; and drafting of the article.

Disclosures

No financial or other support was provided for the development of this manuscript. Dr. Vassar reports receiving funding from the National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, the US Office of Research Integrity, and internal grants from Oklahoma State University Center for Health Sciences, all of which are outside of the present work. All remaining authors have nothing to disclose.

Funding statement

This study was not externally funded.

Editorial responsibility

This submission was handled by Dr. Stephan K. W. Schwarz, Editor-in-Chief, Canadian Journal of Anesthesia/Journal canadien d’anesthésie.

Author information

Authors and Affiliations

Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, OK, USA
Philo Waters IV BS, Brayden Rucker DO, Mitchell Love BS & Matt Vassar PhD
Department of Psychiatry and Behavioral Sciences, Oklahoma State University Center for Health Sciences, Tulsa, OK, USA
Matt Vassar PhD

Corresponding author

Correspondence to Philo Waters IV BS.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 214 KB)

eTable 1: P values for each study or reasons for exclusion.
eTable 2: Primary endpoint designation for each study.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Waters, P., Rucker, B., Love, M. et al. Lowering the statistical significance threshold of randomized controlled trials in three major general anesthesiology journals. Can J Anesth/J Can Anesth 70, 1441–1448 (2023). https://doi.org/10.1007/s12630-023-02529-9

Received: 15 September 2022
Revised: 09 January 2023
Accepted: 10 January 2023
Published: 10 August 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s12630-023-02529-9
