Introduction

According to the World Health Organisation (WHO), colorectal cancer (CRC) is the third most common cancer worldwide with 1.80 million cases resulting in 862,000 deaths in 2018 [1]. Screening programmes can be effective in reducing the number of deaths attributed to cancer through early detection. However, a national audit found that only 58% of people in England, United Kingdom (UK), completed bowel screening and only 10% of all CRC patients are diagnosed through bowel screening [2]. Inequalities in bowel screening uptake are consistently demonstrated: participation is typically lower among those with low socio-economic status (SES) [3,4,5]. The COVID-19 pandemic has potentially exacerbated these inequalities in uptake, with reduced access to screening. Innovations such as stratified screening may make screening more efficient, and better able to deal with increasing colonoscopy demands.

There have been growing calls for cancer screening programmes, including bowel screening, to be risk-stratified [6], moving away from a ‘one size fits all’ approach to a more personalised one. The premise of risk stratification is that more precise knowledge about one’s risk of CRC can be used to determine which screening modality and intensity (type of test, when screening should start/finish, frequency) should be offered to patients with varying levels of risk. Higher-risk individuals have more to gain from screening, and targeting them would potentially be a more efficient and cost-effective approach. This would, however, require significant change and investment [7]; for example, screening hubs would need to adapt their IT systems to accommodate different screening regimes for different groups. With questions over ethical, legal and social implications of risk-stratified cancer screening [8], screening participants and their healthcare providers (HCPs) would need to find this approach acceptable, and the information needs of patients, in understanding this more complex approach, would need to be addressed. At present, we do not know how feasible these changes would be. Given this limited knowledge, we carried out a scoping review, an approach appropriate for a field containing large numbers of complex and heterogeneous studies. Arksey and O’Malley [9] present four purposes of a scoping review: to examine the extent and range of research activity; to determine the value of undertaking a full systematic review; to summarise research findings; and to identify research gaps. Our objective was to examine international evidence and identify evidence gaps relating to the feasibility and acceptability of risk-stratified approaches to bowel screening to inform future research, policy and practice.
Specifically, we sought evidence on organisational aspects of risk-stratified screening, its potential to worsen health inequalities, parameters of diagnostic performance, available models and tools to risk stratify, acceptability of these approaches and evidence-based guidelines.

Methods

The scoping review protocol is registered with the Open Science Framework [10]. We have used the PRISMA Extension for Scoping Reviews checklist [11] in the reporting of this review (Supplementary file 1).

Inclusion/exclusion criteria

Any study, both primary and secondary, which examined risk-stratified bowel screening was eligible. We included theoretical/modelling studies developing risk scores if they had undertaken either internal or external validation. Non-English studies, those which lacked sufficient detail for data extraction, protocols, and studies which included different cancer types but lacked specific data on bowel screening, were all excluded. Studies which included patients with existing health conditions (e.g. Lynch syndrome) were also excluded as this study is about screening people who are asymptomatic.

Search strategy

Searches were conducted on six electronic databases: Medline All, Embase and PsycINFO via OVID, CINAHL Complete via EBSCOHost, The Cochrane Database of Systematic Reviews and Cochrane Central Register of Controlled Trials. The Medline strategies are available in Supplementary file 2 and combined text word searching with database-specific indexed terms. The initial search period was from database inception to 26 June 2020, combining search terms for three major concepts (bowel cancer, screening and risk stratification) with search filters for systematic reviews and randomised controlled trials for non-Cochrane databases. A second search combined the three major concepts with other terms of interest including feasibility, acceptability and inequalities. Supplementary searches were also conducted on PMC Europe Grant Finder, Bielefeld Academic Search Engine (BASE) and Google Scholar to identify additional relevant studies and grey literature. Forwards and backwards citation searches were also conducted via Web of Science using studies identified after the initial search and screening phase. The entire database search was updated on 18 October 2021.

Screening and data charting

After deduplication, title, abstract and full-text screening were undertaken against the inclusion/exclusion criteria using Covidence software. The main reviewer (JC) screened 100% and two additional reviewers (SG/OB) independently screened approximately 50% each. Conflicts were resolved through discussion. A data chart was created in Excel. Data charting was carried out primarily by JC but checked by SG/OB (25% each). No quality appraisal was undertaken for this scoping review as the aim was to summarise existing evidence on the topic to inform future research, policy and practice, not to include or exclude studies based on quality [5].

Results

In total, 4,340 records were identified through database searching and an additional 588 through forward and backward citation searching of initially included studies, bringing the total to 4,928. There were 3,629 records after duplicates were removed. These were title and abstract screened; 3,416 records were excluded at this stage. The full texts of the remaining 213 records were assessed for eligibility against the inclusion/exclusion criteria. Of these, 111 were excluded, with reasons listed in the PRISMA flow diagram (Fig. 1), and 102 unique studies (some records were merged if they were part of the same study) were included in this review.
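The record flow reported above can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the counts stated in the text; the number of duplicates removed is derived here rather than reported, and the variable names are illustrative.

```python
# Record counts as reported in the Results text.
database_records = 4340          # identified through database searching
citation_records = 588           # identified through citation searching
after_deduplication = 3629       # records remaining after duplicates removed
excluded_title_abstract = 3416   # excluded at title/abstract screening
excluded_full_text = 111         # excluded at full-text assessment

# Derived quantities (the duplicate count is not stated in the text).
total_identified = database_records + citation_records
duplicates_removed = total_identified - after_deduplication
full_text_assessed = after_deduplication - excluded_title_abstract
included = full_text_assessed - excluded_full_text

print(total_identified, duplicates_removed, full_text_assessed, included)
# prints: 4928 1299 213 102
```

The derived totals agree with the 4,928 records identified, 213 full texts assessed and 102 studies included.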

Fig. 1 PRISMA 2009 Flow Diagram

Overview of the current state of evidence

Most studies were conducted in the US (n = 28) followed by China (n = 13), Australia (n = 11), UK (n = 8), Netherlands (n = 7), South Korea (n = 7), Germany (n = 4), Japan (n = 3), Thailand (n = 2) and one each from Canada, Belgium, France, Iran, Lebanon and Spain; 13 were multi-country studies (see Fig. 2). The studies varied in their methodological designs (Tables 1, 2, 3, 4, 5, and 6, Supplementary file 3) which ranged from primary research (mostly observational or experimental studies) (n = 79) to systematic (n = 6) and non-systematic reviews/evidence-based commentaries/editorials (n = 17). We did not perform a quality appraisal of the included studies as our objective was to summarise the extent and full range of evidence on the topic. We have organised the findings into the following groups: (1) the diagnostic performance of risk-stratified bowel cancer screening approaches; (2) the effectiveness of risk prediction models; (3) the use of risk prediction tools in clinical environments; (4) the acceptability of risk-based bowel screening approaches to patients and HCPs; (5) cost-effectiveness; and (6) evidence-based guidelines and recommendations for future risk-stratified bowel screening.

Fig. 2 Map of included studies

Table 1 Studies examining diagnostic performance of risk-stratified approaches
Table 2 Systematic review studies summarising risk prediction models for risk-stratified screening
Table 3 Studies evaluating risk assessment tools
Table 4 Studies examining acceptability of risk-stratified approaches
Table 5 Cost-effectiveness studies examining risk-stratified scenarios
Table 6 Studies examining risk-stratified guidelines and evidence-based recommendations

Diagnostic performance of risk-stratified bowel cancer screening approaches

Thirteen studies [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27] examined diagnostic performance of risk-stratified approaches to bowel screening in comparison to the Faecal Immunochemical Test (FIT). Various outcome measures of diagnostic performance were used including diagnostic yield, detection rate/prevalence, odds ratios, positive predictive values (PPV), negative predictive values (NPV), sensitivity and specificity. Only five reported discriminatory power, ranging from 0.676 to 0.86 AUC (Table 1).
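For readers less familiar with the outcome measures listed above, the sketch below shows how sensitivity, specificity, PPV and NPV follow from a 2×2 table of test result against colonoscopy-confirmed outcome. The counts are hypothetical and chosen only for illustration; they do not come from any included study, and the names `ConfusionCounts` and `diagnostic_measures` are our own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of screening test result against the reference standard."""
    tp: int  # test positive, lesion present
    fp: int  # test positive, lesion absent
    fn: int  # test negative, lesion present
    tn: int  # test negative, lesion absent


def diagnostic_measures(c: ConfusionCounts) -> dict:
    """Return the standard diagnostic accuracy measures as proportions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of lesions detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-cases testing negative
        "ppv": c.tp / (c.tp + c.fp),          # share of positives with a lesion
        "npv": c.tn / (c.tn + c.fn),          # share of negatives without a lesion
    }


# Hypothetical cohort of 1,000 screened participants (illustrative only).
example = ConfusionCounts(tp=40, fp=160, fn=10, tn=790)
for name, value in diagnostic_measures(example).items():
    print(f"{name}: {value:.3f}")
```

Lowering a FIT positivity threshold typically moves counts from the negative to the positive row, which tends to raise sensitivity while lowering specificity and PPV; this is the trade-off several of the studies below report.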

An ongoing randomised controlled trial (RCT) conducted in China [12,13,14] found that its risk-adapted approach based on the Asia Pacific Colorectal Scoring System (APCS) had a high participation rate and superior diagnostic yield of colorectal cancer (CRC)/advanced colorectal neoplasia (ACRN) compared to FIT but inferior yield to colonoscopy. For some sub-groups (e.g. men or 60–74-year-olds), risk-adapted screening showed a similar detection rate to colonoscopy. A post-hoc analysis of one arm of the trial examined risk-based screening based on lifestyle and polygenic risk score (PRS) and found a larger PPV (ACRN) for the combined approach when compared to colonoscopy, lifestyle or PRS only, suggesting a cumulative effect. A feasibility trial conducted in Thailand [15] found a greater detection rate of ACRN using the APCS in combination with FIT (6.15-fold, 3.72–10.17 in the high risk with positive FIT group), although the study used a lower-than-usual threshold for FIT positivity (50 ng/mL) which may have resulted in a higher number of false positives; 1 in 7 cancers were still missed. A population-based trial in the Netherlands [16,17,18] further identified participants who had a positive FIT and/or a positive family health questionnaire (FHQ) result, confirmed after genetic counselling, and referred them for a colonoscopy. There was no increased diagnostic yield for the combined FIT and FHQ approach, and it had a high false-positive rate (35%). Participants who returned the FHQ tended to be younger and to have higher SES, possibly due to costs of genetic testing. A similar study [19] compared FIT with a questionnaire-based risk assessment (QRA) and found that FIT was superior to the QRA or combined FIT and QRA approach. However, another study [20] found an increased detection rate for the combined FIT and FHQ approach when adjusting the FIT cut-off points (10/15/20 µg Hb/g).
A few other studies also looked at the impact of changing the FIT cut-off but, instead of using family history, they adjusted according to age/sex. For instance, a Spanish cohort study [21] found higher odds of detecting ACRN for men than women and, when combined with faecal haemoglobin concentration levels, the risk of ACRN increased 11.46-fold amongst individuals in the highest versus those in the lowest risk category. Similar results were found by a cohort study conducted in Belgium [22], indicating that FIT may be an effective tool not only as a screening modality but also for risk stratification. However, another study using data from the Colonoscopy or Colonography for Screening (COCOS) Netherlands trial [23] found no statistically significant differences between different FIT cut-offs and matched positivity thresholds. The absolute differences between sensitivities were higher at lower FIT cut-offs, suggesting that models using age and sex may have greater benefit at low positivity thresholds. A Chinese cohort study [24] found that prior negative FIT results could be used as a risk stratification tool since detection of ACRN was greater than in the combined colonoscopy and FIT group but inferior to colonoscopy alone. A Japanese cross-sectional study [25] also examined the role of FIT as a risk stratification tool, this time in combination with age, and found higher detection of CRC among those aged 50 years and over with a positive 2-day FIT. They showed that 2-day FIT had a higher yield than one positive FIT result. Therefore, it is proposed that a 2-day FIT could help to prioritise patients for colonoscopy. Another Japanese study [26] evaluated the performance of an 8-point risk score (based on age, sex, CRC family history, BMI and smoking), alone and in combination with FIT at different thresholds for 1 and 2 days. PPV was higher in the combined risk score and FIT group, with increased sensitivity but lower specificity.
Lastly, a cross-sectional study conducted in the Netherlands [27] found that a risk-based model (age, CRC family history, smoking, BMI, regular aspirin use/nonsteroidal anti-inflammatory drug use, total calcium intake and physical activity) had better discrimination in distinguishing ACRN and greater sensitivity compared to FIT alone. They found that with the risk-based screening the same number of colonoscopies would lead to the detection of five more cases of ACRN, thus this combined approach has better accuracy than FIT alone and may help to reduce the number of colonoscopies required.

Overall, it is difficult to draw definitive conclusions about the efficacy of the risk-based screening approaches in comparison/combination with FIT since the results were mixed. However, diagnostic performance did improve in some studies, which shows promise for risk-adapted bowel screening and may help to prioritise colonoscopies for those at highest risk. Review findings suggest models based on more than just family history lead to better detection of ACRN when used in conjunction with FIT.

Risk prediction model validation studies

Thirty-five studies [28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62] examined the detection of CRC, ACRN or advanced proximal neoplasia by modelling various risk prediction scoring systems (Supplementary file 3). Of the 35 risk prediction models, 15 achieved good discriminatory power (AUC/C-statistic ≥ 0.70) while 11 were externally validated. The studies used a variety of risk models, most notably the APCS, which was originally developed in 14 Asian sites [62] and has since been externally validated outside of Asia [32]. The APCS was adapted by some studies, for example in a Korean version [42]. Additionally, risk scoring systems comprising factors such as age, gender, lifestyle factors, and polygenic risk scores were evaluated. There are too many to summarise here but many of them have been summarised in previous systematic reviews, detailed in Table 2. These reviews synthesised various risk scoring systems based on socio-demographics (age/sex), lifestyle (smoking, obesity/BMI), medication use, family history, and biomarkers. They typically found that the models had modest performance in predicting ACRN.
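To make the discrimination threshold above concrete, the sketch below computes the AUC (equivalently, the C-statistic for a binary outcome) as the probability that a randomly chosen participant with a lesion has a higher risk score than a randomly chosen participant without one, counting ties as half. The scores are hypothetical and the function name `auc` is illustrative; the included studies may have estimated AUC with other software or methods.

```python
def auc(scores_cases, scores_controls):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Counts the share of case/control pairs in which the case has the
    higher risk score; tied pairs contribute one half.
    """
    wins = 0.0
    for s_case in scores_cases:
        for s_control in scores_controls:
            if s_case > s_control:
                wins += 1.0
            elif s_case == s_control:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Hypothetical integer risk scores (illustrative only).
cases = [5, 6, 4, 7, 3]        # participants with advanced neoplasia
controls = [2, 3, 4, 1, 5, 2]  # participants without

value = auc(cases, controls)
print(f"AUC = {value:.2f}, meets 0.70 threshold: {value >= 0.70}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values at or above 0.70 are treated as good discrimination in this section.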

In summary, there is a wealth of studies examining a broad range of risk prediction models that could be used to stratify risk as part of bowel screening programmes but most models do not have an acceptable level of discriminatory power while others need to be externally validated, particularly in more ethnically diverse populations. This should be the focus of future studies looking at ways to stratify risk.

Studies evaluating risk assessment tools in clinical practice

Sixteen studies [63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82], of various study designs, examined the clinical utility of risk stratification tools to accurately classify patients into risk groups for various cancers based on personal and family history, provide recommendations for guidance-concordant screening and promote adherence. Eleven tools were identified in total: Colorectal cancer RISk Predictor (CRISP) [63,64,65]; MeTree [66,67,68]; Family Healthware [69]; Cancer Risk Intake System (CRIS) [70,71,72]; an online family history tool [73,74,75]; National Cancer Institute Colorectal Cancer Risk Assessment Tool (CCRAT) [76, 77]; Personal or Family History Questionnaire [78]; family history questionnaire followed by a geneticist review [79]; Your Disease Risk [80]; Persian risk assessment [81]; genetic risk score and family history tools [82]. Apart from five studies [64, 65, 73,74,75, 79], the rest were US-based.

These tools (Table 3) were evaluated for their ability to accurately predict the presence of CRC when a referral is made [67, 68], their utility and accuracy in assigning patients to risk categories or in re-classifying/refining previous estimates of risk categories [63, 64, 66, 73,74,75, 79], their concordance with existing referral guidance [71, 72, 80] and their impact on screening participation [69,70,71,72].

The studies typically found the tools to be helpful in assisting with referrals, albeit with mixed evidence on whether they had improved sensitivity and specificity when compared with referral decisions based on existing practice. Utility in assigning patients to risk categories as a basis for more- or less-intense screening, or in refining categories based on less detailed information was typically reported. The accuracy of these risk assignments was assessed in several ways, including comparisons with clinical records [78] and the opinion of clinicians [79, 81]. Overall, the tools examined showed high concordance with existing guidance (that is, similar numbers of patients, with similar characteristics, would have been referred), but ability to achieve compliance with screening recommendations, in the absence of an organised programme, was less encouraging [72, 80]. While improved levels of uptake were achievable [69], the ability of participants to complete the tools without assistance was questioned in some of the studies [63, 64].

Authors of the studies raised concerns around a few issues, including comprehension of the tools by patients, potential to increase referrals and overwhelm diagnostic services, inappropriate assignment to a lower-intensity screening regime and burden of completion of the tools for patients and HCPs. Concerns were also raised about the quality of information used to inform risk stratification; family history is not always well-recorded, and self-reports may be inaccurate [83]. Indeed, one study [78] showed that clinician-led history taking was superior to a self-administered family/personal history questionnaire. Nevertheless, overall, these risk assessment tools showed improvements in stratification of risk based on personal or family history and, in some cases, in bowel screening uptake. Future studies examining the clinical utility of risk assessment tools should consider ways in which they can be easily integrated into routine practice.

Studies examining acceptability of risk-stratified screening to patients and providers

The principal focus of ten included studies [83,84,85,86,87,88,89,90,91,92,93] was attitudes towards, and acceptability of, risk-based screening. They are summarised in Table 4.

Risk-stratified approaches had variable levels of acceptability among study participants. Discomfort with being assigned to a less-intensive screening regime featured [84], mediated by factors such as trust in the treating physician, belief in the efficacy of screening and perceived threat from CRC. One study noted that HCPs were typically supportive of risk assessment tools to inform decision-making [85], but did not necessarily agree with the resulting decision, as colonoscopy was seen as the ‘gold standard’. This is an important caveat for implementing these approaches. Concerns were also sometimes expressed over the extra burden, in terms of workload and time, that risk-based strategies could entail. In general, there was a preference for systems which can readily be accommodated within routine clinical practice [86, 87], and some HCPs questioned the clinical accuracy of the tool [88]. Similarly, patients will not necessarily comply with risk-based recommendations, particularly if these are at odds with their screening preferences [89], even if the risk information enables them to make a more informed decision [90]. There is mixed evidence on whether receipt of information about higher CRC risk leads to increased anxiety. For instance, an online risk assessment test in the Netherlands [91] did not increase anxiety levels following receipt of risk information and, because it was able to acquire novel family history information in 40% of participants, the authors recommend using the test in bowel screening. However, an RCT [92] conducted in Scotland, UK, found that although the personalised CRC risk information was easy to understand, it was distressing for some. They also found that intention to undergo colonoscopy was greatest amongst the highest risk groups, but even in the lowest risk group over 50% would undergo colonoscopy. Therefore, regardless of level of risk, the results show that two-thirds would opt for colonoscopy, increasing demand on existing services.
Promisingly, a study [93] conducted in Canada showed that adherence to risk-stratified screening guidelines increased with CRC risk, but the authors call for future research to address low adherence among average and moderate risk groups. However, another study [83] found that in Australia the rate of screening advice ever received was low (only a third), which suggests that more could be done to communicate risk between patient and HCP.

Cost-effectiveness studies examining risk-stratified scenarios

Five studies [94,95,96,97,98] examined the cost-effectiveness of risk-stratified bowel screening. Two US studies [94, 95] showed that even though optimal risk-stratified bowel screening policies may not be cost-effective, they are associated with reduced CRC mortality and higher total quality-adjusted life years (QALYs). False positives were reduced by more than 48.6% and perforations were reduced by at least 9.9% in one study [94], while in the other study the optimal policies suggest that females will undergo less frequent screening compared to males with corresponding risk levels [95]. Findings from a UK-based study [96] suggest that risk-stratified screening based on genetic and/or phenotypic risk scores, as opposed to age alone, is likely to save costs and reduce CRC incidence and mortality without significantly increasing resource use, provided that the cost of risk assessment is kept to £114 per person. According to this study, risk-stratified screening is likely to benefit men more than women. A study in Japan [97] evaluated three screening strategies (1: colonoscopy, 2: FIT, 3: risk score) compared to no screening and found that colonoscopy (based on 60% uptake) was the most effective in terms of highest number of QALYs and lowest CRC incidence and deaths. However, it requires a large number of colonoscopy procedures, which may put additional strain on resource use. Lastly, a study in the Netherlands [98] showed that both uniform and personalised risk-based screening led to similar yield in QALYs (0.11–0.32% versus 0.02–0.32%) but risk-based screening cost more due to the costs associated with risk stratification. On the whole, based on these modelling studies, risk-stratified bowel screening is likely to cost more while generating a similar reduction in CRC deaths and a similar number of QALYs. However, these approaches are likely to reduce the burden on resource use and the frequency of screening for those deemed low risk, and may therefore be beneficial.
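As background to the comparisons above, a common summary in cost-effectiveness modelling is the incremental cost-effectiveness ratio (ICER): the extra cost of one strategy over a comparator divided by the extra QALYs it yields. The sketch below is illustrative only. It assumes an ICER-style comparison, which this review does not attribute to any specific included study, and the per-person costs and QALYs are hypothetical.

```python
def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost per QALY gained of a new strategy over a reference.

    Raises ValueError when the strategies yield identical QALYs, because
    the ratio is undefined in that case.
    """
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly == 0:
        raise ValueError("QALYs are identical; the ICER is undefined")
    return (cost_new - cost_ref) / delta_qaly


# Hypothetical per-person lifetime values (not taken from any study).
uniform_cost, uniform_qalys = 1000.0, 10.00        # uniform screening
stratified_cost, stratified_qalys = 1050.0, 10.02  # risk-stratified screening

ratio = icer(stratified_cost, stratified_qalys, uniform_cost, uniform_qalys)
print(f"Incremental cost per QALY gained: {ratio:.0f}")
```

When two strategies yield nearly the same QALYs, as reported for the Netherlands study, even a small added cost for risk assessment produces a large ratio, which is consistent with the sensitivity of these results to the per-person cost of stratification.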

Evidence-based guidelines and recommendations for risk-stratified bowel screening

The remaining seventeen papers [99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114] examined the current national guidelines for their respective countries and/or put forward recommendations for risk-stratified bowel screening based on evidence. The US, Australia and Canada have developed evidence-based risk-stratified bowel screening guidance which is based not just on age but also on personal/family history [99,100,101,102], and it is argued that such guidelines may pave the way for risk stratification in other countries. Some researchers have proposed that ethnicity should also be included in risk stratification due to the increased incidence of CRC for some groups [103]. For instance, one paper refers to the American College of Gastroenterology, which recommends that bowel screening should start at age 45 (as opposed to age 50) for African Americans given that they have the highest incidence of CRC of all ethnic groups in the US [103]. A Delphi study conducted to update the Asian guidelines on bowel screening [104] recommended using a risk-stratified scoring system combining four risk factors (age, sex, family history and smoking status) to select patients for colonoscopy, which may help to reduce cost and workload. An evidence-based commentary by an author in Belgium [115] recommended screening those at intermediate risk (due to, for instance, having a first-degree relative with CRC) at an earlier age, given that they have a two- to three-fold increased risk of developing CRC. This was also suggested by two other papers [105, 114], while an Australian paper recommends taking into account additional factors (age, gender, lifestyle, SES and genetic profiling) as well as family history in future risk-stratified approaches [106]. A UK-based study calls for the use of risk scoring systems in combination with FIT since some studies have shown improved sensitivity of predictive models [113].
However, there was consensus that more needs to be done to validate risk scoring systems [107,108,109,110,111]. Furthermore, there are calls for more research to examine the acceptability [108, 109, 112], organisational implications [108, 112] and cost-effectiveness [109] of risk-stratified bowel screening going forward.

Discussion

The review identified important research gaps, most notably in relation to the organisation of screening services: few studies have piloted risk-stratified approaches, with most studies to date having developed models/tools to aid with risk stratification. Since adoption of risk stratification would involve profound organisational change within screening programmes, there would be constraints in terms of organisational resistance, IT infrastructure limitations and human behaviour. More research on this process of organisational change is vital if risk-stratified screening is to be introduced. Further, we identified no studies which directly examined the potential impact of risk-stratified approaches on health inequalities. However, several studies mentioned limitations that may have salience for health inequalities. For instance, studies noted that participants tended to be from higher SES backgrounds [79] with a lack of ethnic diversity [69], higher screening adherence and greater likelihood of having medical insurance [69, 89]. One of the studies demonstrated that higher income was associated with increased risk-stratified screening compliance [93]; it is therefore possible that risk-stratified bowel screening may widen pre-existing health inequalities, and this needs careful analysis. However, if we look at acceptability of risk-stratified screening for other screening programmes, it is promising to see that ethnic minority groups may look favourably on it if risk is communicated clearly and translated where necessary [116].

There are some limitations to our review. Information on risk stratification in bowel screening is difficult to categorise resulting in some overlap between the six categories we applied. Further, there were some challenges in identifying studies focused on risk-stratified screening, with some lack of clarity over what constitutes risk stratification, and outcomes of interest. Nevertheless, strengths of our study included its development according to a predefined protocol, systematic and transparent approach to identification of studies, having multiple reviewers at each stage and being reported according to the PRISMA extension for scoping reviews.

Based on the review findings, we have developed recommendations for future research, practice and policy. See Box 1.

Conclusion

This scoping review mapped out the international literature on risk-stratified bowel screening. Despite over 20 years of studies and growing calls for risk stratification, we have found a limited number of studies which have actually piloted such an approach and there are mixed results. Risk stratification has the potential to improve diagnostic performance but introducing it in national bowel screening programmes can be a challenging process. Programmes have, on the whole, been established on an ‘average risk’ basis – that is, they offer the same screening regime to everyone in the population, unless they have familial/genetic conditions (such as Lynch Syndrome) in which case they would fall under surveillance programmes instead of screening [117]. Even with this ‘one-size-fits-all’ approach, there are enormous challenges facing bowel screening programmes. These include maintaining sufficient uptake to ensure population impact on CRC outcomes, and disparities in uptake due to ethnic differences and socio-demographic factors. Screening programmes are complex, requiring systems to identify eligible patients, invite them and follow-up non-responders, provide diagnostic and treatment services with sufficient capacity to accommodate screen-detected cancers, and quality assurance protocols to ensure the maintenance of high standards. It is little wonder then, that there are few examples of attempts to incorporate risk-stratification into these complex processes – quantifying risk in target populations and offering tailored screening regimes based on this risk introduce new demands in areas such as recruitment processes, organisational systems, IT infrastructure, patient and provider education and ethical considerations.