Our investigation found that SRs cited in colorectal guidelines are frequently at unclear or high risk of bias and do not report key SR items that are important for the critical appraisal of results. Specifically, the fact that our predominant risk of bias judgment was “unclear” signals that many critical SR methodological items were missing or poorly described. Our finding that SRs adhered to a median of 20/27 PRISMA items may appear at odds with our risk of bias findings. However, the difference between these two findings highlights our key takeaway: an SR item may be reported but still represent a flawed method, thus placing the SR at risk of bias. Our findings therefore identify two key action items for future and ongoing SRs in colorectal cancer: ensure SRs report all items from PRISMA, and ensure SRs describe methods in enough detail to facilitate critical appraisal of results.
Two key examples of how missing or poorly described information may affect the critical appraisal of an SR relate to study protocols and risk of bias evaluations. In our sample, SRs rarely directed the reader to a publicly available, a priori protocol (2/63, 3.2%). It has been shown that SRs, like randomized controlled trials,21,22 exhibit significant rates of selective outcome reporting, defined as the selective inclusion, omission, or alteration of study outcomes, often on the basis of statistical significance.23 Thus, the lack of a publicly available protocol leaves open the possibility that SR results are published at the authors’ discretion rather than as dictated by a prespecified protocol. Similarly, a lack of detail regarding risk of bias evaluations may compromise the validity of meta-analytic effects in an SR. In our study, authors often reported that a risk of bias evaluation was conducted (46/63, 73.0%), but further inspection of the risk of bias methods showed that many authors used outdated, flawed tools. For example, authors frequently used the Jadad scale to assess the risk of bias of included clinical trials. The Jadad scale is notorious for its omission of allocation concealment as a bias domain, and according to the Cochrane Handbook, use of the Jadad scale is “explicitly discouraged.”24 Thus, the use of the Jadad scale leaves open the possibility that intervention effects shown in the included colorectal SRs are confounded by bias that went undetected by the SR authors. Furthermore, even when authors used the Cochrane risk of bias tool, they often reported only the judgment for individual risk of bias domains, without an accompanying comment explaining that judgment. It has been shown previously that authors frequently make erroneous judgments (i.e., judgments that are not in line with the accompanying comment, and thus not in line with the recommendations in the Cochrane Handbook).25,26,27 Therefore, inadequate reporting of Cochrane risk of bias assessments prevents readers from verifying the accuracy of authors’ judgments.
The cohort of SRs we analyzed is unique because these SRs informed the evidence base of the NCCN colorectal guidelines. However, this sample of SRs is likely not the only, or even the primary, source of evidence for most NCCN recommendations, since the field of oncology relies heavily on randomized controlled trial data. Indeed, the NCCN categories of recommendations simply state that “high-level evidence” and “uniform NCCN consensus” are necessary to achieve level 1 evidence status.28 Nonetheless, the findings from our study warrant concern because of the predominance of unclear or high risk of bias judgments and the variability in reporting quality. For example, in the NCCN rectal cancer guidelines, seven SRs were cited in the discussion of laparoscopy vs. open resection (Jiang et al. 2015; Zhao et al. 2016; Zhang et al. 2014; Xiong et al. 2012; Vennix et al. 2014; Arezzo et al. 2013; Trastulli et al. 2012). Five of these SRs were at high or unclear risk of bias, while two were at low risk, including the only Cochrane review. There was no discussion of the risk of bias for any of these SRs. This oversight may be reasonable in this case because of the wealth of other data available and cited for laparoscopy, all pointing to a fairly certain conclusion about its risks and benefits. Moreover, in this case, the low risk of bias SRs had findings similar to those of the high and unclear risk of bias SRs. However, even this scenario highlights an important point: risk of bias assessments are crucial to reasoned discussions and serve to augment the ongoing, skillful clinical appraisal inherent to CPG panel discussions. Here, where the benefits and risks of laparoscopy are fairly well established, omitting risk of bias from a CPG discussion may do little harm. For emerging therapies with less certain benefit, however, risk of bias evaluations are necessary because false positive or false negative results may have a broad impact on CPG recommendations and clinical practice.
This study has several key limitations. First, our findings may not be generalizable to all colorectal SRs, since we evaluated only SRs cited by the NCCN rather than all colorectal SRs available. Next, we discourage interpreting our findings to mean that NCCN recommendations are at risk of bias, since the NCCN recommendations rely on other robust research, such as clinical trials, that we did not include in our investigation. Any judgments about the quality of NCCN recommendations would need to be supported by a thorough assessment of all included evidence, using validated tools for the assessment of clinical guidelines. Moreover, the NCCN guidelines we examined contained 1698 total references, so our 63 included SRs represent only a small fraction of the cited evidence. Finally, this study is limited by investigating only guidelines written for healthcare professionals, rather than NCCN guidelines for patients.
In conclusion, our investigation of the risk of bias and quality of reporting of SRs referenced by the NCCN guidelines for colon and rectal cancer found that SRs are commonly at unclear or high risk of bias and do not fully report key items. Specifically, we found that an SR item may be mentioned but may describe a flawed method or incompletely report all aspects of the item. The implication for the treatment and management of colon and rectal cancer, which relies on high-quality evidence for demographically diverse patients, is that summary effects may not merit the trust normally placed in systematic reviews and meta-analyses. Further, even though the objective of our investigation is not to question the strength of NCCN guideline recommendations, our findings may be of concern to oncologists who rely heavily on NCCN recommendations. The NCCN developers use what literature is available to formulate recommendations, and thus we recommend that journals enforce more stringent SR methodology and reporting. When readers or guideline developers encounter a biased SR, we recommend careful critical appraisal of the results and conclusions, since bias may result in false positive or false negative results.