Abstract
Purpose
The primary objective of our study was to determine how lowering the P value threshold from 0.05 to 0.005 would affect the statistical significance of previously published randomized controlled trials (RCTs) in major anesthesiology journals.
Methods
We searched the PubMed database for studies electronically published in 2020 within three major general anesthesiology journals as indexed by both Google Metrics and Scimago Journal & Country Rank. Included studies were RCTs published in 2020 in Anesthesiology, Anesthesia & Analgesia, or the British Journal of Anaesthesia; had a primary endpoint; and used a P value threshold to determine the effect of the intervention. We performed screening and data extraction in a masked, duplicate fashion.
Results
Ninety-one RCTs met inclusion criteria. The most frequently studied type of intervention was drugs (44/91, 48%). From the 91 trials, 99 primary endpoints, and thus P values, were obtained. Fifty-eight (59%) endpoints had a P value < 0.05 and 41 (41%) had a P value ≥ 0.05. Of the 58 primary endpoints previously considered statistically significant, 21 (36%) P values would maintain statistical significance at P < 0.005, and 37 (64%) would be reclassified as “suggestive.”
Conclusions
Lowering the P value threshold from 0.05 to 0.005 would have altered more than one third of the significance interpretations of RCTs in the surveyed anesthesiology literature. It is therefore important for readers to consider post hoc probabilities when evaluating clinical trial results. Although the present study focused on the anesthesiology literature, we suggest that our results warrant further research within other fields of medicine to help avoid clinical misinterpretation of RCT findings and improve quality of care.
Study interventions are often evaluated using hypothesis testing, and statistical analyses are performed to evaluate the likelihood that treatment effects are attributable to the intervention rather than chance. The P value is the method most commonly used to summarize the statistical significance of results in research publications. Within statistics, there are generally two hypotheses: the null hypothesis and the alternative hypothesis. The null hypothesis most commonly indicates no association between the factors investigated.1 A P value is a measure of the strength of evidence against the null hypothesis.2 Thus, for example, a P value of 0.2 implies that, if there is truly no difference in outcome between the factors examined, the probability of observing the same or more extreme data is 20%. In the 1920s, Fisher proposed the concept of the P value and suggested 0.05 as a threshold of significance, implying that if the P value is less than 0.05 there is evidence to reject the null hypothesis.3 Ever since, clinical researchers have maintained the typical P value threshold of 0.05 to indicate a statistically significant result.4,5
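As a concrete illustration of how a single P value is read against the two thresholds discussed in this article, the following sketch (in Python, with made-up data that appear in no cited trial) runs a two-sample t test and interprets the result under both the conventional 0.05 and the proposed 0.005 cut-off.

```python
# Illustration only: two small hypothetical treatment groups, not data from any
# trial cited in this article.
from scipy import stats

control      = [12.1, 14.3, 11.8, 13.5, 12.9, 14.0, 13.2, 12.4]
intervention = [11.5, 12.7, 11.4, 12.3, 13.0, 11.9, 12.5, 11.6]

# Two-sided t test of the null hypothesis that the group means are equal.
t_stat, p_value = stats.ttest_ind(intervention, control)
print(f"P = {p_value:.4f}")   # with these made-up values, P is roughly 0.03

if p_value < 0.005:
    print("significant under both the 0.05 and the proposed 0.005 threshold")
elif p_value < 0.05:
    print("significant at 0.05 but only 'suggestive' under the proposed 0.005 threshold")
else:
    print("not significant under either threshold")
```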
Because of their common use in the medical literature, P values have far-reaching effects on clinical research, and thus clinical practice. Clinical practice is often based on guidelines that provide evidence-based treatment recommendations. For half a century, randomized controlled trials (RCTs) have been considered the gold standard of medical research and are the most commonly used study design cited as supporting evidence in clinical practice guidelines.6,7 Randomized controlled trials often investigate treatment interventions, and P values are reported as evidence of intervention efficacy. Clinical practice relies on interventions found to be efficacious by way of a statistically significant difference between the intervention and standard of care groups. Thus, the use of the conventional 0.05 threshold as a bright-line criterion for statistical significance is an important consideration that warrants investigation.
Despite the importance placed on statistical significance, some researchers suggest that P values are often misinterpreted, misrepresented, and overtrusted.5,8 Many researchers argue that placing the P value threshold at 0.05 results in high rates of false positives.8,9,10 Others have highlighted the practice of “p-hacking,” described as selective reporting of data until statistical significance is reached.11 This practice of outcome-tampering by authors is likely a result of many journals’ tendency to favour publishing studies with statistically significant (“positive”) results over those with “negative” ones.11 A study by Greenberg et al. reported an association between rainy days and the Society for Pediatric Anesthesia’s (SPA) annual meeting with P = 0.006, a seemingly significant result.12 Nevertheless, the purpose of that study was not to point out the SPA leadership’s “rainmaking abilities,” but rather to provide a striking illustration of common shortcomings associated with P values, such as publication bias, retrospective bias, and reproducibility bias.13 The risks of false positives and “p-hacking,” combined with an overreliance on P values by journals, have led to the development—not without controversy—of potential solutions to increase the validity of clinical findings. One proposal that many researchers have recently supported is to lower the P value threshold of statistical significance from 0.05 to 0.005, and to reclassify P values within the range of 0.005 to 0.05 as statistically “suggestive.”8,9,14,15 These researchers argue that lowering the threshold that defines statistical significance would improve reproducibility, reduce false positives, minimize p-hacking, and promote the conduct of more carefully designed studies with sufficient power. Moreover, it has been shown that P values offer little information regarding the reproducibility of results and need to be exceedingly small to approach a desirable chance of replication.13 Cumming16 found that, for an obtained two-sided result of P = 0.05, the probability that a one-sided replication would again reach P < 0.05 is only 41%; for an obtained P < 0.005 it is approximately 80%, and for P < 0.0001 nearly 95%. Thus, the premise of the transition to P < 0.005 as the significance threshold is to achieve a replication probability of approximately 80%.
It has been hypothesized that shifting the P value threshold from 0.05 to 0.005 would alter about one third of the statistically significant results in the published biomedical literature.8 In addition, the effect of lowering the P value threshold to < 0.005 has been investigated in general medicine, as well as in other specialized fields of medicine.14,17,18,19 Nevertheless, the effects of such a change in anesthesiology are unknown. In a systematic review, Grolleau et al. calculated the fragility index of statistically significant results from RCTs in anesthesia and critical care.20 They found that RCT results are often “fragile,” suggesting a need to increase the strength and validity of trials within the anesthesia literature.
Given the influence of P values on anesthesia practice, we sought to investigate how lowering the P value threshold from 0.05 to 0.005 could potentially affect the statistical significance of previously published RCTs in anesthesiology. We aimed to determine the percentage of primary endpoints reported in anesthesiology RCTs that would remain statistically significant, be reclassified as “suggestive,” or remain statistically nonsignificant.
Methods
This study was conducted and is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement. Additionally, in an effort to promote transparency and reproducibility, our protocol, data extraction form, search string return, statistical analysis plan, and raw data have been uploaded to the Open Science Framework and made publicly available (https://osf.io/gj2w5/).21 Our protocol with detailed methodology was uploaded a priori on 9 December 2021. These documents were uploaded prior to initiating eligibility screening and data extraction, which began on 15 December 2021.
Search strategy
We used the PubMed database to search for RCTs and clinical trials electronically published (ePub) in 2020 in the three major general anesthesiology journals as indexed by both Google Metrics22 and Scimago Journal & Country Rank.23 These journals are ranked in order as follows: Anesthesiology, Anesthesia & Analgesia, and the British Journal of Anaesthesia. The following search string was used to retrieve articles: (“Anesthesiology”[Journal] OR “Anesthesia and analgesia”[Journal] OR “British journal of anaesthesia”[Journal]) AND ((clinicaltrial[Filter] OR randomized controlled trial[Filter]) AND (2020:2020[pdat])).
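The search string above was run through the PubMed interface. As an aside, a roughly equivalent query can be issued programmatically against the NCBI E-utilities esearch endpoint, as in the sketch below; the JSON output format and the retmax cap are our illustrative choices, not part of the study's methods, and the record count returned today may differ from the number retrieved at the time of the search.

```python
# A sketch of running the same search string against PubMed via the NCBI E-utilities
# esearch endpoint; the study itself used the PubMed website. retmax is an arbitrary cap.
import requests

SEARCH = (
    '("Anesthesiology"[Journal] OR "Anesthesia and analgesia"[Journal] '
    'OR "British journal of anaesthesia"[Journal]) '
    'AND ((clinicaltrial[Filter] OR randomized controlled trial[Filter]) '
    'AND (2020:2020[pdat]))'
)

resp = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={"db": "pubmed", "term": SEARCH, "retmode": "json", "retmax": 200},
    timeout=30,
)
result = resp.json()["esearchresult"]
print(result["count"], "records;", len(result["idlist"]), "PMIDs retrieved")
```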
Inclusion criteria and exclusion criteria
To be included in our study, articles had to be RCTs; be published in 2020 in Anesthesiology, Anesthesia & Analgesia, or the British Journal of Anaesthesia; have a primary endpoint; and use a P value to determine the effect of the intervention. All studies that did not specifically state a primary endpoint or that did not provide a P value were excluded.
Screening process for eligible randomized controlled trials
Following the literature search, the returns were uploaded to Rayyan (Qatar Computing Research Institute, Doha, Qatar),24 a platform for screening and selecting studies for systematic literature reviews. Title and abstract screening were conducted by two investigators (P. W. and M. L.) in a masked, duplicate manner. Following the initial screening, investigators settled any disagreements through discussion with a third investigator (B. R.) who was available for arbitration. For the studies not included in the final analysis, reasons for exclusion are detailed in Electronic Supplementary Material (ESM) eTable 1.
Data collection process
Two investigators (P. W. and M. L.) carried out masked, duplicate data extraction using a Google Form. A third independent investigator (B. R.) was available for arbitration. In addition to the P value of the primary endpoints, the following study characteristics were extracted from each trial: article title, journal name, funding source, sample size, type of endpoint (subjective or objective), type of intervention, name of intervention, and setting (single institution, multicentred, etc.). The P value for each included study is provided in ESM eTable 1. Regarding adjudication of the type of endpoint, a subjective endpoint was defined as an outcome based on or influenced by the patient’s thoughts, feelings, opinions, or interpretation (e.g., patient-reported outcomes, pain). In contrast, an objective endpoint was defined as an outcome not based on or influenced by the patient’s thoughts, feelings, opinions, or interpretation (e.g., laboratory values, imaging, physical examination). The primary endpoint from each study and its designation as objective or subjective are provided in ESM eTable 2.
Statistical analysis
We identified the proportion of endpoints that retained statistical significance at P < 0.005 and the proportion that would be redefined as “suggestive.” We then applied a binary logistic regression model to evaluate whether particular study characteristics were associated with maintaining significance under the lower threshold. We used logistic regression to generalize the estimated association probabilities to the general anesthesiology literature beyond our sample of studies. Thus, for each logistic regression analysis, the study characteristic was used as the independent variable and whether or not the P value maintained significance under the proposed threshold was used as the dependent variable. All logistic regression analyses were performed using Stata version 15.1 (StataCorp LLC, College Station, TX, USA); no automated selection algorithm was used. We also calculated the false discovery rate (FDR) using the Benjamini–Hochberg method in Microsoft® Excel (Microsoft Corporation, Redmond, WA, USA). This methodology was adopted from a study previously published in the Journal of the American Medical Association (JAMA) that assessed the effects of reducing the P value threshold to < 0.005 in three major general medical journals.17 It was necessary to collapse categories of the funding variable because of the small sample sizes in some of these categories.
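As a concrete sketch of these two steps, the following Python code is our illustration only; the study itself used Stata and Excel, and the data frame, column names, and funding categories here are hypothetical placeholders. It fits a binary logistic regression of “maintained significance at P < 0.005” on a collapsed funding variable and then applies the Benjamini–Hochberg adjustment to the resulting P values.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

# Hypothetical extraction sheet: one row per primary endpoint with P < 0.05.
# "maintained" is 1 if the endpoint's P value is also < 0.005, 0 if it falls in
# the 0.005-0.05 ("suggestive") band; "funding" is a collapsed funding category.
df = pd.DataFrame({
    "maintained": [1, 0, 0, 1, 0, 0, 1, 1, 0],
    "funding": ["industry", "industry",
                "internal", "internal", "internal",
                "none", "none",
                "multiple", "multiple"],
})

# Binary logistic regression: study characteristic as the independent variable,
# maintenance of significance under P < 0.005 as the dependent variable.
model = smf.logit("maintained ~ C(funding)", data=df).fit(disp=False)
odds_ratios = np.exp(model.params)   # odds ratios; 95% CIs via np.exp(model.conf_int())

# Benjamini-Hochberg false discovery rate adjustment of the regression P values
# (performed in Excel in the study itself).
reject, p_adjusted, _, _ = multipletests(model.pvalues, alpha=0.05, method="fdr_bh")

print(pd.DataFrame({"OR": odds_ratios, "p": model.pvalues, "p_BH": p_adjusted}))
```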
Results
The literature search returned 134 articles, 27 of which were excluded after title and abstract screening. An additional 16 articles were excluded during full-text screening and data extraction. Our final sample included 91 RCTs. A screening flow diagram documenting the reasons for exclusion is presented in the Figure.
Characteristics of included randomized controlled trials
Of the 91 RCTs, 37 (40%) were published in the British Journal of Anaesthesia, 27 (30%) were published in Anesthesiology, and 27 (30%) were published in Anesthesia & Analgesia (Table 1). Objective endpoints were the most common primary endpoints used within the studies (60/91, 66%). The most frequently studied type of intervention was drugs (44/91, 48%). Of those studies evaluating drugs, 19 related to pain, 14 to general anesthesia, ten to cardiovascular anesthesia, and one to obstetric anesthesia. Additionally, of the same 44 studies evaluating drugs, 14 were comparative studies, four were dose-finding studies, 19 looked at adverse outcomes of a specific drug, and seven looked at physiologic effects of a specific drug. Most studies were conducted at a single centre (65/91, 71%). Of the RCTs that included a funding statement (n = 90), 39% (35/90) reported receiving internal hospital/university funding. Few studies were supported by nonprofit (2/90, 2%) and industry/private (8/90, 9%) sources. The sample sizes of the RCTs ranged from ten to 10,010 participants, with a median of 105.
Primary endpoint analysis
Because some trials reported multiple primary outcomes, and thus multiple P values, a total of 99 primary endpoints were included in our analysis. A total of 58 (59%) endpoints had a P value < 0.05 and 41 (41%) had a P value ≥ 0.05. Of the 58 P values < 0.05, 21 (36%) would maintain statistical significance under the proposed threshold of 0.005, and 37 (64%) would be reclassified as “suggestive.” Of these 37 reclassified endpoints, 27% (10/37) appeared in trials published in Anesthesiology, 38% (14/37) in the British Journal of Anaesthesia, and 35% (13/37) in Anesthesia & Analgesia. After adjusting for covariates, trials evaluating devices or “other” interventions, as well as those with industry/private or multiple funding sources, were more likely to report outcomes that maintained statistical significance (odds ratios with 95% confidence intervals; Table 2). Nevertheless, after adjusting the corresponding P values for the FDR using the Benjamini–Hochberg method, no study characteristic remained associated with maintaining significance (Table 2).
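The tallies above follow from a simple three-way rule applied to each extracted P value; the sketch below shows that rule with placeholder values standing in for the 99 P values listed in ESM eTable 1.

```python
# Illustration only: the short list of P values below is a placeholder; the 99
# extracted primary endpoint P values are listed in ESM eTable 1.
def classify(p, lower=0.005, upper=0.05):
    """Three-way reading of a primary endpoint P value under the proposal."""
    if p < lower:
        return "significant"        # retains significance at P < 0.005
    if p < upper:
        return "suggestive"         # 0.005 <= P < 0.05: reclassified
    return "nonsignificant"         # P >= 0.05: unchanged

p_values = [0.001, 0.03, 0.20, 0.004, 0.049, 0.62]
counts = {}
for p in p_values:
    counts[classify(p)] = counts.get(classify(p), 0) + 1
print(counts)  # {'significant': 2, 'suggestive': 2, 'nonsignificant': 2}
```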
Discussion
We found that nearly four out of five primary endpoints from the RCTs included in our study would not hold statistical significance under the proposed threshold of 0.005, and nearly two of every three primary endpoints that were previously statistically significant would be relabelled as “suggestive.” These results have important implications for anesthesiologists drawing conclusions about an RCT’s findings on the basis of the reported P value.
Our findings suggest a potential need to improve how we interpret the strength of intervention efficacy. Lowering the P value threshold to < 0.005 may serve as a temporizing mechanism until other solutions can be evaluated and implemented. For one, “p-hacking” is a practice widely used by clinical trialists and data analysts. Lowering the threshold at which a study may be considered significant would make p-hacking more difficult, likely reducing its use. Furthermore, a P value of 0.05 can lead to false positive rates as high as 30%.25 Therefore, in the clinical setting, critical treatment decisions based on RCTs deemed significant at P < 0.05 could have negative consequences. Statistical significance is driven by many factors. For example, a large enough sample size will almost always yield statistically significant results.26 Additionally, given their nature, subjective outcomes suffer from poorer reliability and greater recall and reporting bias than their objective counterparts.27 These factors may be especially relevant within anesthesia and other surgical specialties, as large studies and objective outcomes are less common than in general medicine trials. Nevertheless, we found no association between sample size or type of outcome and the maintenance of statistical significance (Table 2). Thus, in our sample, the effect of lowering the P value threshold did not appear to depend on sample size.
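To show the arithmetic behind that false positive figure, the sketch below works through a simplified false discovery calculation in the spirit of Colquhoun’s analysis; the assumed prior probability that a tested intervention truly works (0.10) and the assumed power (0.80) are illustrative round numbers, not values estimated from our sample.

```python
# Illustration only: the prior probability that a tested intervention truly works
# (0.10) and the statistical power (0.80) are assumed round numbers in the spirit
# of Colquhoun's analysis, not values estimated from the trials in our sample.
def false_discovery_rate(alpha, power=0.80, prior=0.10):
    """Expected share of 'significant' findings that are false positives."""
    true_positives = power * prior          # real effects correctly detected
    false_positives = alpha * (1 - prior)   # null effects crossing the threshold
    return false_positives / (false_positives + true_positives)

for alpha in (0.05, 0.005):
    print(f"threshold {alpha}: false discovery rate ~ {false_discovery_rate(alpha):.0%}")
# Under these assumptions, a 0.05 threshold gives roughly 36%; 0.005 gives roughly 5%.
```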
With nearly two thirds of the previously statistically significant anesthesia trial endpoints in our study shifting to the “suggestive” category, our results exceed the one-third shift that Ioannidis predicted for the past biomedical literature.8 In fact, the results of our study parallel those from studies in other medical specialties. Bruno et al. analyzed 202 RCTs published in obstetrics and gynecology journals between 2017 and 2019.28 Of the 90 RCTs with statistically significant outcomes (P < 0.05), nearly half would be relabelled as “suggestive.” A study evaluating RCTs within orthopedic sports medicine reported that over half of statistically significant primary outcomes would be reclassified as “suggestive.”14 Whereas these studies evaluated particular clinical specialties, a study of clinical trials published in three high impact factor general medical journals found that only 29.3% of statistically significant P values would be reclassified as “suggestive.”17 Our study indicates that fewer primary outcomes measured in anesthesiology RCTs would maintain statistical significance compared with trials published in high impact factor journals in other specialties. Our findings suggest that the clinical application of evidence-based medicine within anesthesiology would shift dramatically under a proposed lowering of the P value threshold. As medical advancements continue to rely on the proper understanding of peer-reviewed medical literature, the classification and interpretation of clinical findings are becoming more crucial. Furthermore, as surgical procedure rates continue to rise,29 an evidence-based approach becomes even more necessary for practicing anesthesiologists to provide the most beneficial and proven therapeutic strategies.
Study limitations
This study is strengthened by sound methodology, including masked duplicate screening and extraction—the gold standard method established by the Cochrane Collaboration.30 Nevertheless, we acknowledge that this study is not without limitations. One limitation is that only the three major general anesthesiology journals were included, and only over a one-year period. Therefore, the results may not be generalizable to all anesthesiology RCTs. In addition, because we conducted our search through PubMed—which uses the MEDLINE database—it is possible that our search did not return all studies published in these three journals.
Conclusion
Overall, our study found that lowering the P value threshold from 0.05 to 0.005 would alter the statistical significance of over one third of the primary endpoints of published RCTs in major anesthesiology journals. Thus, it is critical that readers consider post hoc probabilities, as well as other contributing factors, when evaluating and interpreting clinical trial results. Reducing the P value threshold and applying a reclassification of “statistically suggestive” may serve as a temporary means to reduce potential misinterpretation of clinical trial results. Reliable, evidence-based literature is necessary for anesthesiologists to administer well-informed and skillful care to patients. Although our study provides substantial findings regarding a change to the P value threshold in the anesthesiology literature, we suggest that future studies further explore this proposal within other fields of medicine to avoid clinical misinterpretation of RCT findings and improve quality of care.
References
Panagiotakos DB. Value of p-value in biomedical research. Open Cardiovasc Med J 2008; 2: 97–9. https://doi.org/10.2174/1874192400802010097
Dorey F. The p value: what is it and what does it tell you? Clin Orthop Relat Res 2010; 468: 2297–8. https://doi.org/10.1007/s11999-010-1402-9
Fisher RA. Statistical Methods for Research Workers. In: Kotz S, Johnson NL (Eds.). Breakthroughs in Statistics, Volume II: Methodology and Distribution. New York: Springer; 1992: 66–70.
Altman DG, Bland JM. Statistics notes: absence of evidence is not evidence of absence. BMJ 1995; 311: 485. https://doi.org/10.1136/bmj.311.7003.485
Nahm FS. What the P values really tell us. Korean J Pain 2017; 30: 241–2. https://doi.org/10.3344/kjp.2017.30.4.241
Jones DS, Podolsky SH. The history and fate of the gold standard. Lancet 2015; 385: 1502–3. https://doi.org/10.1016/s0140-6736(15)60742-5
Thiese MS. Observational and interventional study design types; an overview. Biochem Med 2014; 24: 199–210. https://doi.org/10.11613/bm.2014.022
Ioannidis JP. The proposal to lower P value thresholds to .005. JAMA 2018; 319: 1429–30. https://doi.org/10.1001/jama.2018.1536
Benjamin DJ, Berger JO, Johannesson M, et al. Redefine statistical significance. Nat Hum Behav 2018; 2: 6–10. https://doi.org/10.1038/s41562-017-0189-z
Piroli F, Angelini F, D’Ascenzo F, De Ferrari GM. Does lowering p value threshold to 0.005 impact on evidence-based medicine? An analysis of current European Society of Cardiology guidelines on STEMI. Eur J Intern Med 2020; 79: 147–8. https://doi.org/10.1016/j.ejim.2020.05.036
Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD. The extent and consequences of p-hacking in science. PLoS Biol 2015; 13: e1002106. https://doi.org/10.1371/journal.pbio.1002106
Greenberg RS, Bembea M, Heitmiller E. Rainy days for the Society for Pediatric Anesthesia. Anesth Analg 2012; 114: 1102–3. https://doi.org/10.1213/ane.0b013e318248e383
Shafer SL, Dexter F. Publication bias, retrospective bias, and reproducibility of significant results in observational studies. Anesth Analg 2012; 114: 931–2. https://doi.org/10.1213/ane.0b013e31824a0b5b
Evans S, Anderson JM, Johnson AL, et al. The potential effect of lowering the threshold of statistical significance from P < .05 to P < .005 in orthopaedic sports medicine. Arthroscopy 2021; 37: 1068–74. https://doi.org/10.1016/j.arthro.2020.11.041
Navon D, Cohen Y. Consider avoiding the .05 significance level, 2016. Available from URL: https://arxiv.org/abs/1606.09017 (accessed April 2023).
Cumming G. Replication and p intervals: p values predict the future only vaguely, but confidence intervals do much better. Perspect Psychol Sci 2008; 3: 286–300. https://doi.org/10.1111/j.1745-6924.2008.00079.x
Wayant C, Scott J, Vassar M. Evaluation of lowering the P value threshold for statistical significance from .05 to .005 in previously published randomized clinical trials in major medical journals. JAMA 2018; 320: 1813–15. https://doi.org/10.1001/jama.2018.12288
Johnson AL, Evans S, Checketts JX, et al. Effects of a proposal to alter the statistical significance threshold on previously published orthopaedic trauma randomized controlled trials. Injury 2019; 50: 1934–7. https://doi.org/10.1016/j.injury.2019.08.012
Malhotra A, Le Grice K, Shah N. Lowering the p value threshold in recently published Respiratory Medicine RCTs. Eur Respir J 2019; 54 (Suppl. 63): PA1484. Available from URL: https://erj.ersjournals.com/content/54/suppl_63/PA1484
Grolleau F, Collins GS, Smarandache A, et al. The fragility and reliability of conclusions of anesthesia and critical care randomized trials with statistically significant findings: a systematic review. Crit Care Med 2019; 47: 456–62. https://doi.org/10.1097/ccm.0000000000003527
Waters P, Love M, Rucker B. Lowering the statistical significance threshold in general anesthesia; 2021. Available from URL: https://osf.io/gj2w5/ (accessed August 2023).
Google Scholar Metrics. Anesthesiology. Available from URL: https://scholar.google.com/citations?view_op=top_venues&hl=en&vq=med_anesthesiology (accessed August 2023).
Scimago Lab. Scimago scientific journal rankings. Available from URL: https://www.scimagojr.com/journalrank.php?category=2703 (accessed January 2023).
Rayyan. Homepage. Available from URL: https://www.rayyan.ai (accessed January 2023).
Colquhoun D. An investigation of the false discovery rate and the misinterpretation of p-values. R Soc Open Sci 2014; 1: 140216. https://doi.org/10.1098/rsos.140216
Sullivan GM, Feinn R. Using effect size-or why the P value is not enough. J Grad Med Educ 2012; 4: 279–82. https://doi.org/10.4300/jgme-d-12-00156.1
Mobbs RJ. From the subjective to the objective era of outcomes analysis: how the tools we use to measure outcomes must change to be reflective of the pathologies we treat in spinal surgery. J Spine Surg 2021; 7: 456–7. https://doi.org/10.21037/jss-2021-2
Bruno AM, Shea AE, Einerson BD, et al. Impact of the p-value threshold on interpretation of trial outcomes in obstetrics and gynecology. Am J Perinatol 2021; 38: 1223–30. https://doi.org/10.1055/s-0041-1731345
Singh Bajwa SJ. Anesthesiology research and practice in developing nations: economic and evidence-based patient-centered approach. J Anaesthesiol Clin Pharmacol 2013; 29: 295–6. https://doi.org/10.4103/0970-9185.117039
The Cochrane Collaboration. Chapter 5: Collecting data. Available from URL: https://training.cochrane.org/handbook/current/chapter-05 (accessed January 2023).
Author contributions
Philo Waters contributed to all aspects of this manuscript including the conception and design; acquisition, analysis, and interpretation of data; and drafting of the article. Brayden Rucker contributed to the conception of the study, the interpretation of data, and drafting of the article. Mitchell Love contributed to the acquisition of data and drafting of the article. Matt Vassar contributed to the conception and design; the analysis and interpretation of data; and drafting of the article.
Disclosures
No financial or other support was provided for the development of this manuscript. Dr. Vassar reports receiving funding from the National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, the US Office of Research Integrity, and internal grants from Oklahoma State University Center for Health Sciences—all of which are outside of the present work. All remaining authors have nothing to disclose.
Funding statement
This study was not externally funded.
Editorial responsibility
This submission was handled by Dr. Stephan K. W. Schwarz, Editor-in-Chief, Canadian Journal of Anesthesia/Journal canadien d’anesthésie.
Supplementary Information
eTable 1: P values for each study or reasons for exclusion. eTable 2: Primary endpoint designation for each study.
Cite this article
Waters, P., Rucker, B., Love, M. et al. Lowering the statistical significance threshold of randomized controlled trials in three major general anesthesiology journals. Can J Anesth/J Can Anesth 70, 1441–1448 (2023). https://doi.org/10.1007/s12630-023-02529-9