Background

Rectal cancer is a common malignant tumor worldwide. Its high incidence and mortality threaten patients’ health and quality of life [1]. Locally advanced rectal cancer is generally treated with multimodal strategies involving neoadjuvant chemotherapy, radiotherapy, and total mesorectal excision to improve survival. However, data from the past decade have shown that locally advanced rectal cancer has a high metastasis rate (29–39%) [2]. Even with postoperative adjuvant chemotherapy, the metastasis rate was more than twice the rate of primary tumor recurrence [2, 3]. The heterogeneity of molecular pathological characteristics between and within tumors leads to different clinical outcomes [4]; thus, accurate prognostic markers are necessary to facilitate and optimize therapeutic decision-making [5]. Rectal cancer marker detection can indicate the biological characteristics of tumors, which benefits the evaluation of therapeutic efficacy and patient prognosis [6,7,8].

Among tumor markers, Ki-67 expression is key in the assessment of rectal cancer. Ki-67 is a nuclear antigen associated with cell proliferation that is detected with a monoclonal antibody. Research has indicated that Ki-67 expression levels are closely related to the differentiation, infiltration, metastasis, and prognosis of rectal cancer [9, 10]. However, Ki-67 expression can only be determined from invasive biopsy or surgical pathology tissues [11]. A preoperative examination that assesses Ki-67 expression, and thereby helps predict prognosis without invasive examination or surgery, would therefore benefit patients.

MRI is fundamental for preoperative diagnosis in cancer staging, evaluation of therapeutic efficacy, and postoperative follow-up [12,13,14]; it facilitates comprehensive evaluation of multiple crucial prognostic factors in rectal cancer. Clinically, doctors evaluate tumor characteristics from the imaging features of the lesions. However, this approach depends on the experience and specialty of the doctor and lacks repeatability [15, 16]. Therefore, a noninvasive imaging biomarker that predicts Ki-67 status prior to surgery would offer additional prognostic value and allow more individualized management of patients with rectal cancer.

Radiomics uses high-throughput quantitative image analysis to represent tumors and the relevant microenvironment and can identify more features than visual inspection [16, 17]. Radiomics has extensive applications in rectal cancer research, including: (1) prediction of the therapeutic efficacy of neoadjuvant methods [18]; (2) evaluation of tumor, node, metastasis (TNM) staging [19, 20] and neurovascular invasion [21]; (3) analysis of survival gains after clinical treatment [22]. The performance of established prediction models varies. Radiomics has been used to evaluate Ki-67 expression as a tumor prognostic indicator in breast cancer [23], bladder cancer [24], and gastrointestinal stromal tumors [25, 26]. However, most studies have relied on a single scanning sequence or a single modality [25, 26], and such evaluations have rarely been reported in rectal cancer.

In this study, we developed a multi-parametric MRI radiomics model for preoperative prediction of Ki-67 expression in patients with rectal cancer. Clinical data were also used to construct a combined model. The stability and reliability of the model were validated using internal and external data from two centers.

Methods

Participants

All experimental protocols were approved by the ethics committees of Tongde Hospital of Zhejiang Province (Center 1, No.TD2021-96) and Shanghai Putuo District People’s Hospital (Center 2, No.PT2022-2). The need for informed consent was waived by both ethics committees owing to the retrospective study design. All procedures were performed in accordance with the Declaration of Helsinki and relevant policies in China. From January 2015 to August 2021, a total of 259 patients with rectal adenocarcinoma were included as participants (163 males and 96 females). The average age was 65.0 ± 11.6 years. The participants were divided into training (139 cases from Center 1), internal validation (in-valid, 60 cases from Center 1), and external validation (ex-valid, 60 cases from Center 2) cohorts. The 199 patients from Center 1 were randomly assigned to the training cohort and the in-valid cohort at a ratio of 7:3. The model was constructed on the training cohort. Internal and external validation were conducted to verify the reliability of the model.
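The 7:3 random assignment of the 199 Center 1 patients can be illustrated with a minimal Python sketch. The patient identifiers, the seed, and the function name split_cohort are hypothetical, and the study reports using the caret package in R for this step, so this only illustrates the idea and is not the authors' code.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient IDs into training and internal validation lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility only
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 199 hypothetical identifiers standing in for the Center 1 patients
center1_ids = [f"C1-{i:03d}" for i in range(1, 200)]
train_ids, in_valid_ids = split_cohort(center1_ids)
print(len(train_ids), len(in_valid_ids))
```

With 199 patients, a 0.7 fraction rounds to 139 training cases and leaves 60 for internal validation, matching the cohort sizes reported above. A plain shuffle does not stratify by Ki-67 status; a stratified split may be preferable when one class is rare.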

The patient inclusion criteria were as follows: (1) The patient had rectal adenocarcinoma confirmed by surgical pathology and had received immunohistochemistry (IHC) tests for Ki-67. (2) The patient underwent contrast-enhanced rectal MRI within 2 weeks before surgery. (3) Complete images, clinical data, surgical data, and pathological data were available. The exclusion criteria were as follows: (1) The patient had not received surgery, or the lesion was pathologically confirmed not to be rectal adenocarcinoma. (2) The clinical or image data were incomplete. (3) The MRI images had severe artifacts that made them difficult to interpret. A flow diagram of patient recruitment is shown in Fig. 1.

Fig. 1

Flow diagram of enrolled patients in this two-center study

MRI protocol and image analysis

MRI scanning was performed with SIEMENS 3.0T Verio and SIEMENS 1.5T Avanto scanners (Siemens, Germany). The selected sequences included conventional axial T2WI, DWI, and contrast-enhanced T1WI (CE-T1). The MRI scanners and scanning parameters are summarized in Table 1.

Table 1 MRI scanning parameters in two centers

Prior to contrast-enhanced MRI scanning, the contrast agent gadopentetate dimeglumine (Beilu Pharmaceutical Co., China) was injected through the dorsal metacarpal veins at a rate of 2.5–3.5 ml/s and a dosage of 0.1 mmol/kg, followed by a 20 ml saline flush. The late arterial phase images were selected for processing because tumors are more strongly enhanced in this phase and show good contrast with the surrounding tissues.

MRI images were reviewed by two radiologists with 8 and 15 years of experience in abdominal diagnosis, respectively. The radiologists were blinded to the patients’ histopathological results. They observed the images, identified lesions, and made a diagnosis at a picture archiving and communication system workstation. They reported the following: (1) patients’ basic information (sex, age, serum CEA level); (2) neoadjuvant chemotherapy; (3) tumor location (upper, middle, low); (4) apparent diffusion coefficient (ADC) value: a circular region of interest (ROI) covering as much of the tumor as possible was placed on the slice showing the largest tumor cross-section in the ADC image; (5) long diameter: tumor length measured on sagittal T2WI; (6) infiltration depth: infiltration depth of the tumor measured on axial T2WI; (7) circum-involvement ratio (CIR): the largest percentage of the rectal wall circumference invaded by the tumor; (8) MRI-detected circumferential resection margin (mrCRM): positive mrCRM was defined as tumor lying within 1 mm of the mesorectal fascia; (9) MRI-detected T (mrT) stage: T1, tumor signal intensity is limited to the submucosal layer; T2, tumor signal intensity extends beyond the mucosal layer but does not exceed the muscular layer; T3, tumor signal intensity grows through the muscle layer and penetrates into the mesorectal fat; T4, tumor signal intensity infiltrates the visceral peritoneum and/or invades the adjacent organs; (10) MRI-detected lymph node (mrN) stage: N0 (no positive lymph node), N1 (1–3 positive lymph nodes), or N2 (≥ 4 positive lymph nodes). A lymph node was considered positive if it showed suspicious morphology (round shape, irregular border, heterogeneous signal) and a short-axis diameter > 9 mm [13]; (11) enhancement pattern (EP): mild/moderate or obvious enhancement of the tumor compared with normal rectal wall enhancement (Fig. 2a,b); and (12) rM: presence of metastasis determined by radiological examination. Discrepancies between the radiologists were resolved by consensus after joint re-evaluation of the images and confirmed by another radiologist with 20 years of experience in abdominal diagnosis.

Fig. 2

Sketch map of infiltration depth (ID), circum-involvement ratio (CIR), MRI-detected circumferential resection margin (mrCRM), MRI-detected T (mrT) and MRI-detected lymph node (mrN) (a) and enhancement pattern (EP) of rectal cancer (b). Mild/moderate (left, white arrow) or obvious (right, red arrow) enhancement of the tumor compared with normal rectal wall enhancement (middle)

Pathological analysis

The parenchymal parts of the surgically removed tumor were stained with regular hematoxylin-eosin (HE) and the IHC marker Ki-67. Ki-67 was interpreted by visual assessment under light microscopy. Ki-67-positive cells appeared brown-yellow or brown after staining. After the entire pathological slice had been reviewed, positive cells were counted and the ratio of positive cells was calculated. The Ki-67 proliferation index was calculated as the percentage of Ki-67-positive cells among all carcinoma cells (reported at 10% intervals as 10%, 20%, 30%, etc.). A cutoff of 50% was used to divide Ki-67 expression into low expression (< 50%) and high expression (≥ 50%) groups [27, 28]. Ki-67 expression in rectal cancer was assessed by 2 clinical pathologists (with 10 and 15 years of work experience). Disagreements were resolved by discussion and confirmed by a third clinical pathologist (with 22 years of work experience).
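The index calculation and the 50% cutoff described above amount to a simple rule, sketched here in Python. The function names and the example cell counts are hypothetical; the sketch does not reproduce the 10%-interval reporting or the visual counting done by the readers.

```python
def ki67_index(positive_cells, total_cells):
    """Ki-67 proliferation index as a percentage of positive carcinoma cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return 100.0 * positive_cells / total_cells


def ki67_group(index_percent, cutoff=50.0):
    """Return 'high' for an index at or above the cutoff, otherwise 'low'."""
    return "high" if index_percent >= cutoff else "low"


# Hypothetical counts: 120 positive cells out of 200 carcinoma cells
index = ki67_index(120, 200)
print(index, ki67_group(index))
```

Note that an index of exactly 50% falls in the high-expression group, consistent with the ≥ 50% definition.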

Image segmentation and feature extraction

ROIs were delineated on the T2WI, DWI, and contrast-enhanced T1WI (CE-T1) sequences, and the images were exported in DICOM format. The ROIs were drawn manually in the open-source software ITK-SNAP 3.4.0. A radiologist with 8 years of experience in abdominal imaging diagnosis outlined the tumor boundary slice by slice. Another radiologist with 15 years of experience in abdominal imaging diagnosis then checked and corrected the contours before the whole-tumor volume was delineated and a 3D mask was generated. To reduce data heterogeneity, all DICOM images were normalized and resampled to the same resolution (1 mm × 1 mm × 1 mm). The Python package pyradiomics was used for feature extraction. The DWI image with the maximum b value was used as the original image, and the resulting 3D mask was copied to the ADC map for feature extraction. Z-score standardization was applied to bring each feature to the same order of magnitude before feature selection. Features were then filtered using the Max-Relevance and Min-Redundancy (mRMR) and recursive feature elimination methods. Finally, least absolute shrinkage and selection operator (LASSO) regression was used to reduce the feature dimensions and identify the features most strongly correlated with Ki-67 expression for radscore model construction.
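As an illustration of the Z-score standardization step, the following Python sketch estimates the mean and standard deviation of each feature on a training set and applies them to another set. The feature names and values are toy data, and fitting the scaling parameters on the training cohort only is an assumption; the paper does not state how the parameters were estimated.

```python
from statistics import mean, stdev


def fit_zscore(train_columns):
    """Estimate (mean, sample SD) for each feature from the training data."""
    return {name: (mean(vals), stdev(vals)) for name, vals in train_columns.items()}


def apply_zscore(columns, params):
    """Standardize each feature with previously estimated parameters."""
    scaled = {}
    for name, vals in columns.items():
        mu, sd = params[name]
        # A constant feature (SD of 0) carries no information; map it to 0.0
        scaled[name] = [(x - mu) / sd if sd > 0 else 0.0 for x in vals]
    return scaled


# Toy feature table (hypothetical values on very different scales)
train = {
    "original_firstorder_Mean": [310.0, 295.0, 402.0, 350.0],
    "original_shape_Sphericity": [0.61, 0.55, 0.70, 0.66],
}
valid = {
    "original_firstorder_Mean": [330.0, 380.0],
    "original_shape_Sphericity": [0.58, 0.72],
}

params = fit_zscore(train)
train_scaled = apply_zscore(train, params)
valid_scaled = apply_zscore(valid, params)
```

After scaling, each training feature has mean 0 and standard deviation 1, so features measured on very different scales contribute comparably to mRMR and LASSO selection.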

Model construction and validation

Each potential clinical risk factor and radiomics marker in the training cohort was analyzed through univariate and multivariate logistic regression to screen for independent predictors of Ki-67 expression in rectal cancer and to construct the prediction models. Ki-67 was used as the dependent variable, and radscore and clinical information were used as the independent variables to calculate the regression coefficients. Through weighted linear combination, a combined model was constructed and nomograms were generated. Decision curve analysis (DCA) was adopted to evaluate clinical utility. Calibration curves were used to assess the consistency between the model-predicted probability and the actual probability of Ki-67 expression (Fig. 3).
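Decision curve analysis rests on the net benefit at a threshold probability p_t, commonly defined as TP/N − (FP/N) × p_t/(1 − p_t). The Python sketch below computes this quantity for a model and for the "treat all" reference strategy. The labels and predicted probabilities are toy values, not study data, and the study itself used the rmda package in R.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as positive when y_prob >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that labels every patient positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data: 1 = high Ki-67 expression, 0 = low (hypothetical)
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.55]

for pt in (0.3, 0.5, 0.7):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve plots these values across a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the "treat all" value and zero ("treat none").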

Fig. 3

Workflow of multi-parametric MRI radiomics for predicting Ki-67 expression in rectal cancer

Statistical analysis

R software (version 3.6.1, http://www.r-project.org) was used for statistical analysis. Normally distributed continuous variables are presented as mean ± standard deviation. Categorical variables are presented as frequency and percentage. The caret package was used to split the cohort and preprocess features. A confusion matrix was established to obtain accuracy, sensitivity, and specificity. Multivariate logistic regression analysis was performed to select clinical features. Receiver operating characteristic (ROC) curves were drawn to evaluate each model’s prediction performance via the area under the curve (AUC), accuracy, and 95% confidence interval (CI). The ROC curves of the prediction models were compared using DeLong’s test. The rms and rmda packages in R were used for calibration curve analysis and DCA, respectively. Two-tailed p < 0.05 indicated statistical significance.
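The confusion-matrix metrics and the AUC named above can be written out explicitly. The following Python sketch uses toy labels and scores (not study data), a hypothetical 0.5 classification cutoff, and the rank-based (Mann-Whitney) formulation of the AUC; it does not implement DeLong's test or confidence intervals.

```python
def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc(y_true, y_score):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Toy data: 1 = high Ki-67 expression, 0 = low (hypothetical)
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # 0.5 cutoff is an assumption

metrics = confusion_metrics(y_true, y_pred)
print(metrics, auc(y_true, y_score))
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients.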

Results

Comparison of clinical and image features between high and low Ki-67 expression groups in patients with rectal cancer

Clinical and imaging data from the two centers are summarized in Table 2. In all three cohorts, age, sex, CEA expression, neoadjuvant chemotherapy, location, long diameter, and rM stage did not differ significantly between the high and low Ki-67 expression groups (p > 0.05). In contrast, the ADC value and mrT stage differed significantly between patients with high and low Ki-67 expression (p < 0.05). In the training and in-valid cohorts, mrN stage differed significantly between patients with high and low Ki-67 expression (p < 0.05); however, no significant difference was observed in the ex-valid cohort (p > 0.05). In the training cohort, CIR differed significantly between patients with high and low Ki-67 expression (p < 0.05); however, no significant difference was observed in the other two cohorts. In the training and ex-valid cohorts, mrCRM differed significantly (p < 0.05), whereas no significant difference was observed in the in-valid cohort (Table 3).

Table 2 Clinical and imaging data in two centers
Table 3 Comparison of Clinical and imaging characteristics with different Ki67 status in three cohorts

Radiomics characteristics

In the training cohort, image features from T2WI, DWI, and CE-T1 were selected after consistency evaluation. A total of 2553 features were initially extracted. Then mRMR and LASSO were used for dimensionality reduction to select the 18 features with the strongest correlation (5 for T2WI, 8 for DWI, and 5 for CE-T1) (Table 4). Logistic regression was then used, with Ki-67 as the dependent variable, to calculate the regression coefficients of the DWI, T2WI, and CE-T1 radscores, yielding the radiomics marker radscore and the radscore model (Radscore = -1.601121 + 0.497129 × CE-T1_Radscore + 0.010777 × T2WI_Radscore + 0.021327 × DWI_Radscore).
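The radscore formula can be evaluated directly, as in the Python sketch below. The intercept and coefficients are those reported above; the per-sequence input values are hypothetical, and converting the Radscore to a probability with the logistic function is an assumption based on the logistic regression used to fit it.

```python
import math

# Intercept and coefficients as reported for the radscore model
INTERCEPT = -1.601121
COEFFICIENTS = {"CE-T1": 0.497129, "T2WI": 0.010777, "DWI": 0.021327}


def combined_radscore(ce_t1, t2wi, dwi):
    """Weighted linear combination of the three per-sequence radscores."""
    return (
        INTERCEPT
        + COEFFICIENTS["CE-T1"] * ce_t1
        + COEFFICIENTS["T2WI"] * t2wi
        + COEFFICIENTS["DWI"] * dwi
    )


def logistic(x):
    """Map a linear predictor to a probability (assumed interpretation)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical per-sequence radscores for one patient
score = combined_radscore(ce_t1=2.0, t2wi=1.5, dwi=-0.5)
print(score, logistic(score))
```

The CE-T1 coefficient is much larger than the other two, but the relative contribution of each sequence also depends on the scale of its radscore, which is not given in the text.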

Table 4 The selected radiomics features (5 for T2WI, 8 for DWI, and 5 for CE-T1) and their coefficients in the DWI, T2WI, and CE-T1 radscores

Features for Model construction

In the training cohort, the features with significant differences, namely Radscore, CIR, ADC value, mrT stage, mrN stage, and mrCRM, were analyzed using multivariate logistic regression. Radscore, mrT stage, and ADC value were identified as independent factors for predicting Ki-67 expression in rectal cancer (Table 5; Fig. 4). The mrT stage and ADC value were then used to construct a clinical model. Finally, the clinical and radscore models were combined to construct a combined model.

Table 5 Univariate and multivariate analyses of factors for assessing the status of Ki67
Fig. 4

COX regression forest of the independent factors for predicting Ki-67 expression in rectal cancer

ROC analysis

In all cohorts, the combined model had the highest AUC, followed by the radscore model and then the clinical model (Fig. 5a-c). For the radscore model, the AUCs in the training, in-valid, and ex-valid cohorts were 0.81, 0.83, and 0.78, respectively; for the combined model, they were 0.84, 0.88, and 0.85, respectively. The accuracy, AUC, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of all models are presented in Table 6. DeLong tests were conducted for all models in the three cohorts (Table 7). In all three cohorts, the combined model predicted Ki-67 expression significantly better than the clinical model (p < 0.05), whereas no significant difference was observed between the radscore model and the combined model (p > 0.05).

Fig. 5

ROC analysis of the prediction model in the training cohort (a), in-valid cohort (b), and ex-valid cohort (c)

Table 6 The ROC analysis of the different models
Table 7 Delong test for different models in three groups

Clinical application of nomograms

The combined model was used to construct the nomogram. The nomogram calibration curves indicated that the predicted Ki-67 expression in the training and validation datasets was highly consistent with the postoperative pathological IHC results. DCA revealed that the nomogram based on the combined model had relatively good clinical performance (Fig. 6a-c).

Fig. 6

The nomogram of the combined model, which generated corresponding evaluation scores according to the respective contributions of the radiomics marker value, mrT staging, and ADC value to Ki-67 expression in the regression model. The score for each factor was summed to obtain the total score for the probability of predicting Ki-67 expression (a). Calibration curves of the nomogram. Calibration curves with slope near 1 indicate good fit and accurate nomograms (b). DCA of the nomogram, radiomics model and clinical model. Nomogram had superior capabilities for determining Ki-67 expression in rectal cancer. The y- and x-axes indicate net benefits and threshold probability, respectively (c)

Discussion

In this study, multi-parametric MRI radiomics was used for preoperative evaluation of Ki-67 expression in rectal cancer. The patients were divided into three cohorts (training, in-valid, and ex-valid cohorts). Three prediction models (namely, the radscore, clinical, and combined models) were established to predict Ki-67 expression in rectal cancer and were evaluated in each cohort. In all three cohorts, the combined model had a higher AUC than the other two models. The nomograms constructed from the combined model could serve as intuitive and easy-to-use prediction tools for clinicians. The simple scores they provide give individualized prediction information, which may facilitate clinical decision-making and help improve the prognosis of patients with rectal cancer.

Previous studies [27, 28] indicated that high Ki-67 expression (≥ 50%) in patients with colorectal cancer was associated with poor tumor differentiation and a high risk of metastatic recurrence. Ki-67 expression is therefore regarded as an independent predictor of poor prognosis in colorectal cancer, with 50% Ki-67 expression defined as the cutoff value [10, 27, 28]. Currently, Ki-67 expression in rectal cancer is generally obtained from invasive biopsy or surgical pathology tissues [29].

This study extracted features from three preoperative MRI sequences of patients with rectal cancer in the training cohort. Then 18 features significantly correlated with Ki-67 expression in rectal cancer were selected to construct models, among which wavelet features accounted for the majority (14/18, 77.8%), followed by original_shape features (3/18, 16.7%) and original_firstorder features (1/18, 5.6%). Other studies [30, 31] also showed that “wavelet” features had powerful prognostic abilities and were major components of radiomic models or signatures, which is consistent with our study. “Wavelet” features [30, 32], which are derived from the wavelet transform algorithm, can describe the texture information of images at different scales and provide valuable information for the discrimination and classification of lesions that cannot be identified by the naked eye. “Original_shape” features describe the geometry of the segmented lesion, providing quantifiable indicators for tumor morphological analysis. “Original_firstorder” features describe the distribution of gray values within an image for the discrimination and classification of lesions. These features not only represent morphological characteristics but also reflect tumor heterogeneity, which is correlated with tumor proliferation and prognosis.

By further fitting the three sequences, we constructed a radscore model, which yielded AUCs of 0.81, 0.83, and 0.78 in the training, in-valid, and ex-valid cohorts, respectively. Previous studies [33, 34] have suggested that models built on multiple sequences outperform those based on a single sequence in evaluating extramural venous invasion (EMVI) status, T staging, and neoadjuvant chemotherapy outcomes in rectal cancer. Shu et al. [33] used multi-sequence MRI to select 20 features for construction of a radscore model that successfully evaluated EMVI in rectal cancer. The AUCs for the training and validation cohorts were 0.744 and 0.738, respectively. You et al. [34] used T2WI and ADC maps to construct a radscore model to evaluate T staging of rectal cancer; the model’s AUCs were higher than those obtained with single sequences. However, these multi-sequence radscore models lacked external validation to confirm their stability in evaluating the postoperative pathological status of rectal cancer. By contrast, our study included both an internal validation cohort and an external validation cohort to demonstrate the reliability and stability of the constructed model.

In the training cohort, patient clinical information and preoperative MRI findings were evaluated to construct a clinical model. Multivariate regression analysis revealed the ADC value and mrT stage to be independent predictors of Ki-67 expression in rectal cancer, and they were thus used to establish the clinical model. Previous reports [35, 36] showed that accurate clinical staging of rectal cancer was closely correlated with the selection of individualized treatment plans and prognosis. Different stages of rectal cancer require correspondingly individualized treatment methods, including resection surgery, chemotherapy, neoadjuvant chemotherapy, or neoadjuvant radiotherapy [36]. DWI, which reflects tumor cell density and necrosis, is the only noninvasive method for detecting water molecule diffusion in living tissues, and the ADC value is a quantitative indicator of DWI [37]. Many studies have indicated that the ADC value is useful for predicting rectal cancer prognosis [37, 38]. A study by our team [1] analyzed the correlation between T staging and ADC in 77 patients with rectal cancer and found that Ki-67 expression in rectal cancer is negatively correlated with ADC: higher Ki-67 expression corresponds to a lower ADC value. Consistently, the pathological T (pT) staging of rectal cancer is negatively correlated with ADC: the higher the pT stage, the lower the ADC value. Another study [37] investigated 91 patients with rectal cancer and reported that the ADC value was positively correlated with histological differentiation but negatively correlated with Ki-67 expression. According to the 8th AJCC stratification system, the anatomic extent (T stage) is one of the most important prognostic factors for primary colorectal cancer. The 5-year disease-free survival (DFS) and 5-year overall survival (OS) differ among patients with different T stages: the higher the T stage, the lower the 5-year DFS and 5-year OS. It should be noted that other factors also contribute to patient prognosis, such as tumor differentiation, lymph node metastasis, lymphovascular invasion, and perineural invasion [39, 40]. Hence, these factors should be comprehensively considered before making an individualized treatment plan.

The present study revealed that the combined model successfully predicted Ki-67 expression levels in the training, in-valid, and ex-valid cohorts. The AUC of the combined model was higher than that of the other models. DeLong test results revealed that the prediction ability of the combined model was superior to that of the clinical model in all cohorts. However, the performance of the combined model and the radscore model did not differ significantly in any of the three cohorts. The nomogram revealed that higher radscore values, deeper tumor infiltration, and lower ADC values were correlated with higher Ki-67 expression. We further used DCA to quantify the net benefit of the nomogram for individualized prediction under different threshold probabilities, which revealed that the net benefit of the nomogram exceeded that of the clinical model and the radscore model. Cai et al. [5] collected 149 patients with rectal cancer and plotted ROIs on T2WI, DWI, CE-T1, and ADC maps. These ROIs were then used to screen features and construct a radiomics signature for predicting the tumor-stroma ratio in rectal cancer. Both the mean ADC and the rad-score showed a positive correlation with the tumor-stroma ratio in the training group. However, the AUCs of the rad-score were better than those of the mean ADC in the training and validation groups. Although the model achieved good predictive performance, it lacked external validation to determine whether it was suitable for data from other centers. Meng et al. [30] used multi-parametric MRI data to construct radiomic models for predicting multiple biological characteristics (Ki67 expression, lymph node metastasis, tumor differentiation, HER-2 and KRAS-2 mutation) of rectal cancer, with AUC values ranging from 0.651 to 0.699. Compared with these studies [5, 30], our model had the following advantages: (1) The model achieved satisfactory prediction performance in both internal and external validation, implying that it was stable and reliable. (2) Adding clinical information improved the predictive performance of the combined model. Hence, the nomogram based on the combined model can serve as an easy-to-use tool for Ki-67 prediction in patients with rectal cancer and has potential for clinical application.

The present study had the following limitations. (1) Retrospective data from patients diagnosed with rectal cancer through surgery were collected for analysis, which may have introduced selection bias. (2) Few patients with rectal cancer had low Ki-67 expression. This may explain the poor prognosis of most rectal cancer cases and requires more data for validation. (3) Although the predictive performance of the model was satisfactory, the training sample was limited. More data in the training cohort may further improve the performance of the radiomics prediction model. Therefore, a larger sample will be collected in future studies.

Conclusions

In conclusion, we constructed a multi-parametric MRI radiomics model for preoperative prediction of Ki-67 expression in patients with rectal cancer. The proposed prediction model performed well and was validated in two centers. Internal and external validation showed the model to be stable and reliable. The combined model had the best prediction performance; thus, the nomograms constructed from the combined model could provide doctors with a noninvasive and accurate preoperative tool to support clinical decision-making.