Background

Colorectal cancer (CRC) is the third most common cancer and second most lethal cancer worldwide [1]. In terms of cancer incidence and mortality in China, in 2020, there were 0.56 million new cases of CRC, and CRC was responsible for 0.29 million deaths [2]. Generally, the development of CRC is a multistage and slow process occurring over several decades, with colorectal neoplasia (CRN) and advanced colorectal neoplasm (ACN) representing key steps in progression to CRC [3]. If detected and treated in the early stages, the prognosis is favourable, with a five-year survival rate reaching 90% [4].

Current guidelines recommend that individuals over 45 years old should undergo CRC screening to reduce CRC-related incidence and mortality [5,6,7]. Many countries and regions, such as Korea, America and China, have implemented CRC screening programs [8, 9]. The faecal occult blood test (FOBT)/faecal immunochemical test (FIT) and colonoscopy are the two most common screening tools [10]. However, extensive screening programs with colonoscopy have not been implemented in some countries due to resource and staffing constraints [11, 12]. FIT is relatively simple and inexpensive, but its sensitivity and specificity are limited, which may lead to missed diagnosis in high-risk persons [13]. Therefore, a risk stratification strategy is needed for persons undergoing CRC screening to make the most of the limited resources and improve the efficiency of screening.

Multiple risk scoring systems have been introduced in various countries for CRC screening [14,15,16]. In China, the high-risk factor questionnaire (HRFQ) was proposed for risk stratification and was based on the CRC screening project in two counties of Zhejiang Province [17]. Subsequently, the combination of the HRFQ and FIT has been adapted by the Chinese Ministry of Health since 2006 [18] and implemented in many cities across China, including Shanghai, Guangzhou, and Tianjin, and significant results have been achieved [19,20,21,22]. In this screening procedure, subjects with positive FIT or HRFQ were classified as high risk and recommended to undergo colonoscopy [21]. Nevertheless, previous studies have found that adherence to further colonoscopy follow-up was generally low, and less than 30% of high-risk participants underwent colonoscopy [20, 23]. The current risk stratification has a high sensitivity but a high false positive rate, leading to unnecessary colonoscopy, which results in doubts about the effectiveness of screening and decreases compliance [24]. In addition, some of the latest well-documented risk factors, such as smoking status, alcohol intake and BMI, were not included in the current HRFQ and may affect the discriminatory ability of the screening procedure [25].

In this study, we developed a risk scoring model for predicting colorectal lesions based on the high-risk population in the Tianjin CRC screening programme from 2012–2020. Our findings may help to improve the commonly used risk stratification methods, promote adherence to colonoscopy and increase the efficiency of CRC screening in China.

Methods

Study setting

This study was conducted at Tianjin Union Medical Center as a part of the Tianjin CRC screening programme. The Tianjin CRC screening programme was established in 2012 by the government, and it already provided free CRC screening for more than 6 million residents aged > 40 years in Tianjin from 2012–2020. A two-step method similar to that of Jiashan County [18, 26] was applied in the CRC screening programme, in which HRFQ and FIT were used for initial screening, and subjects with any positive HRFQ or FIT were recommended to undergo colonoscopy screening. The population of this study was a subset of the screening programme, and this study was approved by the Ethics Committee of Tianjin Union Medical Center.

Study participants

This study retrospectively analysed the data obtained from the prospective screening programme. The inclusion criteria included the following: (1) Participants defined as high-risk in initial screening, with positive FIT (ABON, China) or (and) positive-HRFQ. Positive HRFQ was defined as subjects meeting any of following conditions: (a) history of any cancer or colorectal polyps; (b) history of CRC in first-degree relatives; (c) history of two or more of the following symptoms: chronic diarrhoea, chronic constipation, serious unhappy life events such as death among first-degree relatives, mucus or bloody stool, chronic appendicitis or appendectomy, chronic cholecystitis or cholecystectomy; (2) those who underwent subsequent colonoscopy after being identified as high risk by FIT or (and) HRFQ; and (3) those who wished to participate and signed informed consent form. Subjects with a history of CRC were excluded from this study.

Colonoscopy procedures

All endoscopic examinations were performed in the hospital by experienced endoscopists who had at least 5 years of experience and were all board certified to perform endoscopy. All abnormal findings were confirmed by expert gastrointestinal pathologists following up-to-date clinical guidelines. Only high-quality colonoscopies were included, with adequate bowel preparation, photo documentation of caecal landmarks, and a withdrawal time > 6 min.

Colorectal lesions were classified as ulcerative colitis or Crohn’s disease, chronic inflammatory bowel disease, ulcer, adenoma, and CRC [20]. ACN was defined as CRC or advanced adenoma ≥ 10 mm in diameter or with villous components or high-grade dysplasia. CRN was defined as cancer or any adenoma.

Measurements and definitions

In this study, smoking status was categorized as never smoker and current or past smoker. Alcohol consumption was categorized as never or occasional alcohol intake and weekly alcohol intake. Regular exercise was defined as at least 30 min of exercise more than once weekly over the last year; otherwise, it was classified as ‘physical inactivity’. Obesity was defined as BMI ≥ 28 kg/m2 according to Chinese criteria [27]. Educational level was categorized as low (primary education or below), intermediate (secondary education, high education or lower vocational education) and high (higher vocational education, university or above).

Development of risk scores

The association between the prevalence of ACN and clinical risk factors was assessed by univariate and multivariate logistic regression models. The risk factors examined included sex, age, body mass index (BMI), smoking status, alcohol intake and physical activity. All risk factors with p values less than 0.05 in univariate analysis were included in the binary logistic regression model for ACN. A scoring model for predicting ACN was developed based on the results of multivariate analysis, and each variable was assigned a weight using the respective adjusted OR rounded to the nearest integer. The final score of each participant was the sum of scores for each risk factor.

Statistical analysis

All statistical analyses were performed using R software (V.4.1.2). The prevalence of CRN, ACN and CRC was calculated in participants stratified by clinical parameters and assessed with the chi-square test. The odds ratio (OR) with 95% confidence interval (95% CI), beta-confidence and standard error of each variable were calculated by a logistic regression model. The prevalence of ACN was evaluated according to each score, and then screened individuals were divided into three subgroups according to the final scores: “low risk (LR)”, “moderate risk (MR)” and “high risk (HR)”. A bootstrapping test with 1000 replicates was performed to internally validate the new scoring model. To evaluate the discriminatory capability of the model, a receiver operating characteristic (ROC) curve was plotted, and c-statistics were calculated. P values less than 0.05 were considered statistically significant.

Results

Characteristics of participants

The population-based CRC screening process is shown in Fig. 1. The clinical parameters and colorectal lesions of the participants are summarized in Table 1. There were 29,504 high-risk participants enrolled in this study, with 20,299 (68.8%) having a positive FIT and 13,460 (45.6%) having positive HRFQ results. Of these, 14,677 (49.7%) individuals were found to have CN, including 2324 (7.9%) diagnosed with ACN. The average age of all subjects was 63.53 years (SD 7.47), 15,710 (53.2%) were female participants, 16.0% had a BMI greater than 28 kg/m2, and 24.4% were current or past smokers.

Fig. 1
figure 1

Flowchart depicting the process of patient inclusion. CRC, colorectal cancer

Table 1 Prevalence of colorectal neoplasia, advanced neoplasia, and colorectal cancer in the cohort according to risk factors

Univariate and multivariate predictors of ACN

The association between the prevalence of ACN and risk factors evaluated by univariate and multivariate analyses is shown in Table 2. From multivariate analysis adjusted for educational level and marital status, each 10-year increase in age from 50 years onwards (OR = 2.14–5.52), male sex (OR = 1.63, 95% CI = 1.49–1.79), BMI ≥ 28 kg/m2 (OR = 1.20, 95% CI = 1.08–1.33), ever smoking (OR = 1.17, 95% CI = 1.06–1.29) and weekly alcohol intake (OR = 1.22, 95% CI = 1.08–1.38) were significantly correlated with the presence of ACN, while physical inactivity showed no correlation with ACN in univariate or multivariate analysis.

Table 2 Univariate and multivariate predictors of colorectal advanced neoplasia in this cohort

Development of the risk score

A scoring model for predicting ACN was developed, and points were assigned to relative risk factors as follows (Table 3): male sex (2), female sex (0), age < 50 years (0), age 50–60 years (2), age 60–70 years (4), age ≥ 70 years (6), BMI ≥ 28 kg/m2 (1), BMI < 28 kg/m2 (0), current or past smoking (1), never smoking (0), weekly alcohol intake (1), and never or occasional alcohol intake (0). The scoring system ranged from 0–11, and the points were weighted according to ORs from the multivariate analysis rounding to the nearest whole number. The number of participants with different scores and the prevalence of ACN by scores are presented in Table 4.

Table 3 Scoring model for the prediction of risk for advanced colorectal neoplasia
Table 4 Prevalence of advanced colorectal neoplasia by risk tier and risk score

Discriminatory ability and reliability of the risk score

Subjects were divided into three risk tiers: LR, low risk (score 0–3); MR, moderate risk (score 4–6); and HR, high risk (score 7–11). The proportions of LR, MR and HR were 17.9% (5267/29504), 53.8% (15,888/29504) and 28.3% (8349/29504), respectively. Subjects with MR had a prevalence of ACN similar to the overall prevalence (7.1% vs. 7.9%). Compared with subjects with LR, those with MR and HR had ORs of 2.47 (2.09–2.93) and 4.59 (3.86–5.44), respectively (Table 4). Furthermore, internal validation showed a c-statistic of 0.64 (95% CI = 0.63–0.65) in the 1000 bootstrapped samples, which showed that it has a relatively stable risk-stratification ability.

Discussion

In this study, we developed a scoring system for further risk stratification of a CRC high-risk population classified by the current screening procedure. The risk score derived from the logistic regression model included five well-recognized risk factors (sex, age, BMI, alcohol intake and smoking status) that were not included in the current screening tools. The scoring model performed well, and the ACN risk was found to be 2–sixfold higher in subjects with scores over 6 than in other individuals. Given the high false positive rate and low adherence to colonoscopy in participants, our findings have the potential to serve as a powerful complement to existing screening strategies and improve the effectiveness of screening programs.

After risk stratification by this scoring system, the prevalence of ACN in the high-risk group increased from 7.9% to 12.4%. Colonoscopy resources remain limited across China, and the application of this scoring system could take better advantage of these resources, which may help increase the discovery rate of colorectal lesions and help more patients with colorectal neoplasia diagnosed at an early stage. Integrating this scoring model with current screening methods could improve the effectiveness of screening programs, which will make populations more trusting of the results of risk stratification and more willing to receive colonoscopy tests.

Multiple scoring models for predicting ACN have been constructed in previous studies, containing common variables (sex, age, smoking status and family history) and subtle differences in other factors [14, 15, 28, 29]. Sekiguchi et al. proposed a scoring system for risk stratification in Japanese average-risk populations using five risk factors, including sex, age, family history, BMI and smoking history, which stratified screened subjects into three risk groups [14]. The most well-known Asia–Pacific Colorectal Screening score (APCS) also incorporated age, sex, family history and smoking to predict the risk for ACN, which was created based on the population from 11 countries in Asia [28], and the factors included in our scoring model were similar to those of previous studies. In addition, some well-recognized factors, such as family CRC history and personal cancer history, were not considered in our score, since they were included in the HRFQ and have been evaluated before.

The predictive ability of our score was consistent with that of previous studies; participants with higher scores had a significantly higher risk of colorectal neoplasia. The prevalence of ACN in our study was 12.2%, which was slightly higher than that in studies conducted in Japan (10.2%) [14] and the USA (9.0%) [16]. In addition, participants with high and moderate risk had a 4.59- and 2.40-fold higher risk of ACN, respectively, than those with low risk, which was consistent with the results of the APCS study [28]. The c-statistic of this score was 0.64, superior to that established by Liang et al. (c-statistic = 0.62) [30] and comparable to multiple previous scoring systems [14, 15].

Previous studies mainly focused on average-risk asymptomatic populations [14, 28], while we pioneered to focus on a relatively high-risk population according to HRFQ and FIT screening. In mainland China, nearly every large-scale CRC screening programme has adapted a two-step procedure, using HRFQ as the first step and colonoscopy as the second step [19,20,21,22]. Nevertheless, the HRFQ was implemented by the Chinese Ministry of Health in 2006, and some newly identified factors were not included in it [18]. Thus, it is of vital importance to develop a scoring system incorporating the latest well-recognized neoplasia-related risk factors to improve the current imperfect HRFQ. For instance, male sex and older age were found to be associated with ACN in various studies [28, 31, 32].

Individuals with overweight and obesity account for at least 11% of CRC cases in Europe, and each 1-kg/m2 increase in BMI confers an additional risk of CRC (HR = 1.03) [33, 34]. Lee et al. reported that the prevalence of colorectal polyps was 3.5 times higher in current smokers than in never smokers [35]. Alcohol intake is a well-documented risk factor for CRC, and a recent meta-analysis based on 8 large-scale studies revealed that heavy consumption was associated with an increased risk of CRC [36, 37]. Moreover, the latest study in Lancet reported that the leading risk factors contributing to global cancer burden were smoking, alcohol use and high BMI [38]. The scoring models involving the factors above have been implemented in many countries, such as Korea [15], Japan [14] and the USA [29], and incorporating these factors into Chinese official CRC screening programmes will improve the effectiveness of screening and increase adherence to colonoscopy.

There were several limitations to this study. First, although we have done our best to analyse all CRC-related factors, some potential related factors, such as diabetes [39], dietary habits and lifestyle [40], were still not considered since the information was not collected. Second, external validation of this risk score was not performed due to the lack of external data. Third, most indicators were self-reported by participants, which may introduce recall bias. Only subjects who underwent colonoscopy were involved in this study, which may cause selection bias. Finally, this score was developed based on participants in Tianjin, and generalizability to other settings or cities should be done with caution.

Conclusions

In conclusion, we developed a scoring system for further risk stratification of a high-risk population classified by HRFQ and FIT that could identify very high-risk populations with colorectal neoplasia. Combining this risk score with current Chinese screening methods may improve the effectiveness of CRC screening and adherence to colonoscopy.