Lowering the statistical significance threshold of randomized controlled trials in three major general anesthesiology journals

  • Reports of Original Investigations
Canadian Journal of Anesthesia/Journal canadien d'anesthésie

Abstract

Purpose

The primary objective of our study was to determine how lowering the P value threshold from 0.05 to 0.005 would affect the statistical significance of previously published randomized controlled trials (RCTs) in major anesthesiology journals.

Methods

We searched the PubMed database for studies electronically published in 2020 in three major general anesthesiology journals, as indexed by both Google Scholar Metrics and Scimago Journal & Country Rank. We included RCTs that were published in 2020 in Anesthesiology, Anesthesia & Analgesia, or the British Journal of Anaesthesia; had a primary endpoint; and used a P value threshold to determine the effect of the intervention. We performed screening and data extraction in a masked, duplicate fashion.

Results

Ninety-one RCTs met the inclusion criteria. The most frequently studied type of intervention was drugs (44/91, 48%). The 91 trials yielded 99 primary endpoints, and thus 99 P values. Fifty-eight endpoints (59%) had a P value < 0.05 and 41 (41%) had a P value ≥ 0.05. Of the 58 primary endpoints previously considered statistically significant, 21 (36%) would maintain statistical significance at P < 0.005, and 37 (64%) would be reclassified as “suggestive.”
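The reclassification rule and the reported percentages can be checked with a short sketch. This is an illustration only, not the authors' analysis code: the counts are copied from the paragraph above, while the names `classify`, `pct`, and `summarize`, and the label "not significant" for P ≥ 0.05, are assumptions made for this example.

```python
# Counts as reported in the Results paragraph; nothing here is re-derived
# from the underlying trials.
TRIALS = 91
DRUG_TRIALS = 44
ENDPOINTS = 99
BELOW_0_05 = 58    # primary endpoints with P < 0.05
BELOW_0_005 = 21   # of those, endpoints with P < 0.005


def classify(p_value, conventional=0.05, proposed=0.005):
    """Label a P value under the proposed two-threshold scheme (hypothetical helper)."""
    if p_value < proposed:
        return "significant"
    if p_value < conventional:
        return "suggestive"
    return "not significant"


def pct(part, whole):
    """Percentage rounded to the nearest whole number, as in the abstract."""
    return round(100 * part / whole)


def summarize():
    """Recompute the abstract's percentages from its reported counts."""
    suggestive = BELOW_0_05 - BELOW_0_005
    return {
        "drug_trials_pct": pct(DRUG_TRIALS, TRIALS),
        "below_0_05_pct": pct(BELOW_0_05, ENDPOINTS),
        "at_or_above_0_05_pct": pct(ENDPOINTS - BELOW_0_05, ENDPOINTS),
        "still_significant_pct": pct(BELOW_0_005, BELOW_0_05),
        "suggestive_count": suggestive,
        "suggestive_pct_of_significant": pct(suggestive, BELOW_0_05),
        "suggestive_pct_of_all": pct(suggestive, ENDPOINTS),
    }


if __name__ == "__main__":
    for label, value in summarize().items():
        print(f"{label}: {value}")
```

Note that the 37 reclassified endpoints can be expressed against two denominators: the 58 previously significant endpoints or all 99 primary endpoints. The sketch reports both so the two figures are not confused.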

Conclusions

Lowering the P value threshold from 0.05 to 0.005 would have altered the significance interpretation of 37 of the 99 primary endpoints (37%), or roughly one third, in the surveyed anesthesiology RCT literature. Thus, it is important for readers to consider post hoc probabilities when evaluating clinical trial results. Although the present study focused on the anesthesiology literature, we suggest that our results warrant further research in other fields of medicine to help avoid clinical misinterpretation of RCT findings and improve quality of care.
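The abstract does not define "post hoc probabilities." One common way to frame the idea is the false positive risk: the share of results crossing a significance threshold that are false positives, given an assumed prior probability of a real effect and an assumed power. The sketch below illustrates that framing only. The function name `false_positive_risk` and the input values (power 0.8, prior 0.1) are hypothetical choices for this example and are not taken from the study.

```python
def false_positive_risk(alpha, power, prior):
    """Share of 'significant' results expected to be false positives.

    alpha: significance threshold (type I error rate)
    power: probability of detecting a real effect (assumed)
    prior: fraction of tested hypotheses that are real effects (assumed)
    """
    false_positives = alpha * (1 - prior)   # true nulls that cross the threshold
    true_positives = power * prior          # real effects that cross the threshold
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Hypothetical inputs; power is held fixed for simplicity, although in
    # practice it falls when alpha is lowered at the same sample size.
    for alpha in (0.05, 0.005):
        risk = false_positive_risk(alpha, power=0.8, prior=0.1)
        print(f"alpha={alpha}: false positive risk = {risk:.3f}")
```

Under these assumed inputs, the stricter threshold gives a lower false positive risk, which is the intuition behind the proposal; the actual size of the change depends entirely on the prior and power one is willing to assume.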

Author contributions

Philo Waters contributed to all aspects of this manuscript including the conception and design; acquisition, analysis, and interpretation of data; and drafting of the article. Brayden Rucker contributed to the conception of the study, the interpretation of data, and drafting of the article. Mitchell Love contributed to the acquisition of data and drafting of the article. Matt Vassar contributed to the conception and design; the analysis and interpretation of data; and drafting of the article.

Disclosures

No financial or other support was provided for the development of this manuscript. Dr. Vassar reports receiving funding from the National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, the US Office of Research Integrity, and internal grants from Oklahoma State University Center for Health Sciences—all of which are outside of the present work. All remaining authors have nothing to disclose.

Funding statement

This study was not externally funded.

Editorial responsibility

This submission was handled by Dr. Stephan K. W. Schwarz, Editor-in-Chief, Canadian Journal of Anesthesia/Journal canadien d’anesthésie.

Author information

Corresponding author

Correspondence to Philo Waters IV BS.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (PDF 214 KB)

eTable 1: P values for each study or reasons for exclusion.
eTable 2: Primary endpoint designation for each study.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Waters, P., Rucker, B., Love, M. et al. Lowering the statistical significance threshold of randomized controlled trials in three major general anesthesiology journals. Can J Anesth/J Can Anesth 70, 1441–1448 (2023). https://doi.org/10.1007/s12630-023-02529-9

  • DOI: https://doi.org/10.1007/s12630-023-02529-9
