Introduction

Cervical cancer is the fourth most common malignancy and the fourth leading cause of cancer-associated death in females [1]. An estimated 604,000 new cases of cervical cancer and 342,000 deaths occurred in 2020, posing a serious threat to women’s health worldwide [2]. Over the past few decades, an increasing number of cervical cancer patients have been detected at an early stage owing to the spread of cervical cancer screening [3]. Deep stromal invasion is an important pathological factor associated with the treatment and prognosis of cervical cancer patients [4, 5]. Patients with moderate or 1/3 deep stromal invasion are recommended to receive adjuvant radiotherapy after radical hysterectomy (RH), especially those with vascular infiltration and other risk factors [5, 6]. At present, the diagnosis of deep stromal invasion is mainly confirmed by postoperative pathology [7]. Accurate determination of deep stromal invasion before RH is therefore of great value for early clinical treatment decision-making and for improving the prognosis of these patients.

Magnetic resonance imaging (MRI) is a routine imaging method for the diagnosis, staging, and monitoring of cervical cancer [8]. Currently, MRI features and quantitative imaging parameters are mostly assessed by the naked eye, which can distinguish only a limited range of image gray levels; some microscopic imaging features related to clinical outcomes may therefore be lost, hampering the accurate representation of tumor heterogeneity [9, 10]. The visual assessment of MRI features by trained radiologists is also prone to interobserver variability and lacks generalizability across institutions [11]. Radiomics is an emerging technology in which quantitative features are extracted from radiographic medical images by data-characterization algorithms, with the aim of developing prognostic prediction tools and treatment decision support tools in cancers [12]. The predictive value of MRI-based radiomics for preoperative lymph node metastasis, vascular invasion, and parametrial invasion of early cervical cancer has been confirmed previously [10, 13, 14]. Recently, Ren et al. constructed an MRI-based radiomics model to predict preoperative deep stromal invasion; the AUC of the logistic regression model built on radiomics features was 0.879, and the AUC rose to 0.886 when clinical features were added [15]. Nonetheless, the predictive performance of models for preoperative deep stromal invasion in patients with early cervical cancer still needs improvement.

The conventional logistic regression model can only capture linear associations and cannot handle nonlinear associations, so the accuracy of the resulting prediction models is not always satisfactory [16]. The lack of high-quality datasets for algorithm training and development, and of proper validation using more up-to-date methods, might be major drawbacks of current clinical approaches to predicting preoperative deep stromal invasion in patients with early cervical cancer. To improve the accuracy of clinical diagnosis and prediction, machine learning has gradually been applied to the construction of clinical models and has shown better performance than traditional models such as logistic regression [17, 18]. Machine learning uses computer algorithms to derive predictive models from data; these algorithms identify mathematical functions that describe the relationships between features within a given dataset [19]. Recently, a growing number of studies have shown that integrating radiomics and machine learning enables the development of classification models for the targeted diagnosis of various diseases [19, 20]. However, no study has combined radiomics and machine learning methods to construct prediction models for the preoperative diagnosis of deep stromal invasion in patients with early cervical cancer. The light gradient boosting machine (LightGBM) is a machine learning method that can reduce computation time and tolerate missing values during prediction, which gives it advantages over the conventional logistic regression model [21]. Compared with deep learning and other traditional machine learning algorithms, LightGBM has shown better generalization ability [22]. Whether LightGBM can improve the accuracy of preoperative diagnosis of deep stromal invasion in patients with cervical cancer based on radiomics data remains unclear.

In the present study, a machine learning method was used to construct three preoperative diagnostic models for deep stromal invasion in patients with early cervical cancer, based on clinical data, radiomics data, and combined clinical and radiomics data, respectively. The predictive efficacy of the three models was compared. The findings might help identify a novel tool for faster and more accurate risk stratification of deep stromal invasion in patients with early cervical cancer. This might help clinicians make appropriate treatment adjustments for these patients and improve their prognosis.

Methods

Study Design and Population

This cross-sectional study enrolled 245 patients with early cervical cancer who received RH combined with pelvic lymph node dissection (PLND) in the local hospital. The inclusion criteria were as follows: (1) age ≥ 18 years, (2) primary cervical cancer confirmed by pathology, (3) RH combined with PLND, (4) MRI examination within 2 weeks before surgery, and (5) complete clinical data. The exclusion criteria were as follows: (1) other malignant tumors, (2) palliative tumor resection, (3) pregnancy or lactation, (4) neoadjuvant therapy before surgery, and (5) MRI data that did not meet the requirements of post-processing. After excluding participants who received neoadjuvant therapy before surgery, subjects who received RH combined with PLND at another hospital, and patients with a positive circumferential resection margin, 229 patients were included. This study was approved by the Ethics Committee of the local hospital. Informed consent was obtained from all individual participants included in the study.

Radiomic Features Extraction

T2-weighted images and contrast-enhanced T1-weighted images were exported from the workstation of the picture storage and transmission system in Digital Imaging and Communications in Medicine (DICOM) format. A semi-automatic threshold classification method was used to select the region of interest (ROI) on MRI using the 3D region-growing GrowCut algorithm from the 3D Slicer medical image analysis and visualization platform (version 4.3.1). Given a set of initial label points, the algorithm automatically segments the remaining image through cellular automata, which achieves reliable and reasonably fast segmentation of moderately difficult objects in 2D and 3D using an iterative labeling procedure resembling competitive region growing [23]. Because the MRI scans were collected from different devices, the images were normalized before extraction, and all images were resampled to a resolution of 1 × 1 mm by interpolation. The ROI covered the entire tumor region. For each patient, a total of 2632 features (T2-weighted images + contrast-enhanced T1-weighted images) were extracted using the “PyRadiomics” package in Python 3.11.1 (Supplementary Table 1).
The features comprised three groups. The first group, extracted from the original images, included first-order features (n = 18), shape-based features (n = 14), and texture features derived from the grey-level co-occurrence matrix (n = 24), grey-level run length matrix (n = 16), grey-level size zone matrix (n = 16), grey-level dependence matrix (n = 14), and neighboring gray tone difference matrix (n = 5). The second group, wavelet transform features, included first-order features (n = 144) and features derived from the grey-level co-occurrence matrix (n = 192), grey-level dependence matrix (n = 112), grey-level run length matrix (n = 128), grey-level size zone matrix (n = 128), and neighboring gray tone difference matrix (n = 40). The third group, local binary pattern features, included first-order features (n = 90) and features derived from the grey-level co-occurrence matrix (n = 120), grey-level dependence matrix (n = 70), grey-level run length matrix (n = 80), grey-level size zone matrix (n = 80), and neighboring gray tone difference matrix (n = 25).
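The feature counts listed above can be cross-checked against the stated total of 2632 with a few lines of arithmetic. The sketch below assumes, as the text implies but does not state outright, that the same set of features was extracted from each of the two sequences; the dictionary names are illustrative only.

```python
# Feature counts per image sequence, copied from the text above.
original = {"first_order": 18, "glcm": 24, "glrlm": 16, "glszm": 16,
            "gldm": 14, "ngtdm": 5, "shape": 14}
wavelet = {"first_order": 144, "glcm": 192, "gldm": 112, "glrlm": 128,
           "glszm": 128, "ngtdm": 40}
lbp = {"first_order": 90, "glcm": 120, "gldm": 70, "glrlm": 80,
       "glszm": 80, "ngtdm": 25}

# Sum the three groups to get the number of features per sequence.
per_sequence = sum(original.values()) + sum(wavelet.values()) + sum(lbp.values())

# Assumption: identical feature sets for the T2-weighted and the
# contrast-enhanced T1-weighted sequence.
n_sequences = 2
total = per_sequence * n_sequences

print(per_sequence, total)
```

Under this assumption the per-sequence count is 1316 and the total matches the 2632 features reported in the text.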

Clinical Variables

The following clinical variables were analyzed: age (years), body mass index (BMI, kg/m²), menopausal status (premenopausal, perimenopausal, or postmenopausal), International Federation of Gynecology and Obstetrics (FIGO) stage (IA, IIA, IB, or IIB), marital status (married or unmarried), preterm birth history (yes or no), reproductive history (primipara or multipara), history of abortion (yes or no), histological subtype (adenocarcinoma, squamous cell carcinoma, or other), comorbidity with other diseases (yes or no), red blood cell (RBC) count, white blood cell (WBC) count, platelet (PLT) count, neutrophil percentage (NEU; %), lymphocyte percentage (LYM; %), monocyte percentage (MONO; %), eosinophil percentage (EOS; %), basophil percentage (BASO; %), absolute NEU (10⁹/L), LYM (10⁹/L), MONO (10⁹/L), EOS (10⁹/L), and BASO (10⁹/L) counts, tumor size, carcinoembryonic antigen (CEA; normal or abnormal; ng/mL), squamous cell carcinoma antigen (SCC-Ag; normal or abnormal; ng/mL), carbohydrate antigen-125 (CA125; normal or abnormal; ng/mL), and carbohydrate antigen-199 (CA199; normal or abnormal; U/mL).

Building Prediction Classifiers

The radiomics features were extracted after image segmentation on the original MRI images to delineate the ROI, and features with statistical significance (P < 0.05) were retained (SciPy version 1.10.0). Pearson’s correlation coefficient was then applied; when the Pearson correlation coefficient between two features was > 0.85, the feature with the higher P-value was excluded (Pandas version 1.5.3). Next, analysis of variance (ANOVA) was applied to select the top 15 radiomics features with high variance (scikit-learn version 1.2.1). Finally, the least absolute shrinkage and selection operator (LASSO) with fivefold cross-validation was applied to further screen the features (coefficient ≠ 0). Univariable and multivariable logistic regression analyses were applied to identify clinical predictors associated with deep stromal invasion in patients with early cervical cancer, and variables with a statistically significant association with deep stromal invasion (P < 0.05) were included as clinical predictors. All subjects were randomly divided into a training set (n = 160) and a testing set (n = 69) at a ratio of 7:3. Three LightGBM models were constructed in the training set: a radiomics model constructed with radiomics features alone (model 1), a clinical model constructed with clinical features alone (model 2), and a combined model constructed with both radiomics features and clinical predictors (model 3). The parameters set for training each model are shown in Table 1. During the training of each model, the Optuna hyperparameter optimization tool was used to tune the parameters, the optimized model was evaluated in the training set, and the corresponding evaluation indexes were calculated. The predictive performance of the models was then verified in the testing set. The overall architecture of the proposed model is shown in Fig. 1.
The pseudocode for the proposed work is shown as follows:

figure a
figure b
figure c
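As a rough illustration of the correlation-filtering step described above, the following sketch drops one feature from each highly correlated pair, keeping the one with the lower P-value. It is not the authors' code: the function name `drop_correlated`, the greedy visiting order, the use of the absolute correlation, and the synthetic data and P-values are all assumptions made for the example.

```python
import numpy as np

def drop_correlated(X, p_values, threshold=0.85):
    """Return sorted indices of the columns kept after Pearson filtering.

    Hypothetical helper: features are visited from the lowest to the
    highest p-value, and a feature is kept only if its absolute Pearson
    correlation with every feature kept so far is <= threshold.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    order = np.argsort(p_values)  # most significant feature first
    kept = []
    for j in order:
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return sorted(int(j) for j in kept)

# Synthetic demonstration data (not study data).
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 3))
# Column 3 is a near-copy of column 0, so the two are highly correlated.
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=100)])
p_values = np.array([0.001, 0.02, 0.03, 0.04])  # hypothetical p-values

kept = drop_correlated(X, p_values)
print(kept)
```

In this toy example the near-duplicate column has the higher P-value and is the one removed. A strictly pairwise reading of the rule could give slightly different results when correlations form chains, which is why the greedy order is stated as an assumption.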
Table 1 The parameters used for training each prediction model
Fig. 1 The proposed model’s whole architecture

Measurement of the Performance of the Prediction Model

The proposed model more accurately predicted the deep stromal invasion in patients with early cervical cancer. The robustness of the model was assessed in the training set and the testing set. The F1 score, accuracy, sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and area under the curve (AUC) were used to evaluate the predictive value of the models. Receiver operating characteristic (ROC) curves and Kolmogorov–Smirnov (KS) curves were plotted.

The evaluation metrics were calculated as follows:

$$\mathrm{F1\ score}=\frac{2\times \mathrm{SN}\times \mathrm{PRE}}{\mathrm{SN}+\mathrm{PRE}}$$
$$\mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
$$\mathrm{Sensitivity}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
$$\mathrm{Specificity}=\frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}}$$

SN, sensitivity; TP, true positive; TN, true negative; FP, false positive; FN, false negative; PRE, precision, calculated as TP/(TP + FP).
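The formulas above translate directly into code. The sketch below computes them from the four confusion-matrix counts; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study. PPV is identical to PRE as defined above, and NPV is computed with its usual definition, TN/(TN + FN).

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the evaluation metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)           # SN
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)                   # PRE in the F1 formula
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * sensitivity * ppv / (sensitivity + ppv)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts chosen only to illustrate the arithmetic.
m = classification_metrics(tp=30, tn=50, fp=10, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```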

Statistical Analysis

Normally distributed measurement data were expressed as mean and standard deviation (Mean (SD)), and the t test was used to compare differences between two groups. Non-normally distributed measurement data were described using the median and quartiles, and the Wilcoxon rank sum test was used to compare differences between two groups. Enumeration data were presented as the number of cases and percentages, and the chi-square test was used to compare differences between groups. The radiomics features were extracted, and features were selected via Pearson’s correlation coefficient, ANOVA, and LASSO regression analysis with fivefold cross-validation. Univariable and multivariable logistic regression analyses were applied to identify clinical predictors associated with deep stromal invasion in patients with early cervical cancer. All subjects were randomly split into a training set (n = 160) and a testing set (n = 69) at a ratio of 7:3. Three LightGBM models were constructed in the training set: model 1 included radiomics features, model 2 included clinical predictors, and model 3 included both radiomics features and clinical predictors. The models were verified in the testing set, and the ROC and KS curves were plotted. The significance level was set at alpha = 0.05. R (Institute for Statistics and Mathematics, Vienna, Austria) was used for data analysis. Python 3.11.1 was used for radiomics feature extraction and model construction.

Results

Identification of Predictors in the Models for Deep Stromal Invasion in Patients with Early Cervical Cancer

In total, 245 patients with early cervical cancer who underwent RH combined with PLND in the local hospital were enrolled. Among them, participants who received neoadjuvant therapy before surgery (n = 6), subjects who received RH combined with PLND at another hospital (n = 1), and patients with a positive circumferential resection margin (n = 9) were excluded. Finally, 229 patients were included. The screening process of the participants is shown in Fig. 2.

Fig. 2 The screening process of the participants

A total of 2632 features were extracted from MRI, and those with statistical significance (P < 0.05) were kept. When the Pearson correlation coefficient between two features was > 0.85, the feature with the higher P-value was excluded. Next, ANOVA was applied to select the top 15 radiomics features with high variance. Finally, LASSO regression analysis was applied to screen the features (Fig. 3, Table 2). Fivefold cross-validation was used to find the optimal value of the regularization parameter lambda according to the mean squared error (MSE), which varied with lambda. The optimal lambda value used for variable selection was approximately 0.0192 (Fig. 4). The coefficients of the features finally included are shown in Table 2 and Fig. 5.

Fig. 3 The results of LASSO regression analysis for radiomics features

Table 2 The radiomics features associated with deep stromal invasion in patients with early cervical cancer screened by LASSO
Fig. 4 The optimal lambda value of LASSO regression analysis

Fig. 5 The coefficients of features screened out by LASSO regression analysis

As presented in Table 3, age, postmenopausal status, FIGO-IIA, LYM, tumor size, SCC-Ag, and CA125 might be associated with deep stromal invasion in patients with early cervical cancer. Multivariable logistic regression analysis revealed that FIGO-IIA (OR = 2.43, 95% CI 1.36–4.37), FIGO-IB (OR = 1.87, 95% CI 1.05–3.33), FIGO-IIB (OR = 3.42, 95% CI 1.28–9.15), and SCC-Ag (OR = 1.38, 95% CI 1.19–1.59) were associated with deep stromal invasion in patients with early cervical cancer.

Table 3 Clinical predictors for deep stromal invasion in patients with early cervical cancer

Construction of the Prediction Models for Deep Stromal Invasion in Patients with Early Cervical Cancer

All the samples were randomly divided into the training set (n = 160) and the testing set (n = 69). According to the results of the equilibrium test, there was no statistical difference between the data in the training set and the testing set (all P > 0.05) (Table 4). The numbers of samples with deep stromal invasion < 1/3 and deep stromal invasion ≥ 1/3 in the different datasets are presented in Table 5. The percentages of patients with different FIGO stages and with abnormal SCC-Ag (11.86% vs 54.46%) were statistically different between the deep stromal invasion < 1/3 group and the deep stromal invasion ≥ 1/3 group. The radiomics features were also statistically different between the two groups (Table 6).

Table 4 Comparisons of the variables in the training set and the testing set
Table 5 The numbers of samples with deep stromal invasion < 1/3 and deep stromal invasion ≥ 1/3 in different datasets
Table 6 Comparisons of variables of patients between deep stromal invasion < 1/3 group and deep stromal invasion ≥ 1/3 group

Evaluation of the Predictive Performance of the Prediction Models for Deep Stromal Invasion in Patients with Early Cervical Cancer

In the training set, the AUC was 0.951 (95% CI 0.922–0.980) for the prediction model based on radiomics features, 0.769 (95% CI 0.703–0.835) for the model based on clinical predictors, and 0.969 (95% CI 0.947–0.990) for the model based on radiomics features combined with clinical predictors (Table 7). In the testing set, the AUC of the model based on radiomics features combined with clinical predictors was 0.914 (95% CI 0.848–0.980) (Table 7, Fig. 6). The KS curves of the prediction models based on radiomics features, clinical predictors, and radiomics features combined with clinical predictors were plotted. The KS test was used to assess the agreement between the predicted and actual probabilities of deep stromal invasion, with higher KS values indicating a greater ability of the model to discriminate between samples. Generally, KS > 0.2 denotes a strong risk differentiation ability of the model. The KS values of the prediction models based on radiomics features, clinical predictors, and radiomics features combined with clinical predictors were 0.59 (Fig. 7), 0.47 (Fig. 8), and 0.69 (Fig. 9), respectively. The variable importance of all the predictors in the prediction model based on radiomics features combined with clinical predictors is presented in Fig. 10.
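For a binary classifier, the KS value is commonly computed as the largest gap between the cumulative distributions of predicted scores in the two classes, which is equivalent to the maximum of (true positive rate minus false positive rate) over all thresholds. The sketch below shows that computation; the text does not state how the KS values were obtained in this study, so the function `ks_statistic` and the toy labels and scores are assumptions for illustration only.

```python
import numpy as np

def ks_statistic(y_true, y_score):
    """Maximum gap between the class-wise empirical CDFs of the scores."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    thresholds = np.unique(y_score)        # sorted unique score values
    pos = np.sort(y_score[y_true == 1])    # scores of positive samples
    neg = np.sort(y_score[y_true == 0])    # scores of negative samples
    # Fraction of each class with a score <= each threshold.
    cdf_pos = np.searchsorted(pos, thresholds, side="right") / pos.size
    cdf_neg = np.searchsorted(neg, thresholds, side="right") / neg.size
    return float(np.max(np.abs(cdf_neg - cdf_pos)))

# Toy labels and predicted probabilities (not study data).
y_true = [0, 0, 1, 0, 1, 1]
y_score = [0.1, 0.3, 0.2, 0.6, 0.7, 0.9]
ks = ks_statistic(y_true, y_score)
print(round(ks, 3))
```

A KS value of 0 means the two score distributions are indistinguishable, and a value of 1 means they are perfectly separated.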

Table 7 The predictive values of the models
Fig. 6 The ROC curves showing the AUCs of different models in the testing set

Fig. 7 The KS curves of the prediction model based on radiomics features

Fig. 8 The KS curves of the prediction model based on clinical predictors

Fig. 9 The KS curves of the prediction model based on radiomics features combined with clinical predictors

Fig. 10 The variable importance of all the predictors in the prediction model based on radiomics features combined with clinical predictors

Discussion

The present study used a machine learning method to construct three preoperative diagnostic models for deep stromal invasion in patients with early cervical cancer, based on clinical data, radiomics data, and combined clinical and radiomics data. The model combining radiomics features and clinical predictors showed better predictive performance than the models based on radiomics features or clinical predictors alone. The findings might provide an effective tool to help clinicians identify patients with deep stromal invasion early and guide treatment accordingly.

Several prediction models based on MRI data have previously been proposed for deep stromal invasion in patients with cervical cancer. Song et al. constructed a prediction model based on amide proton transfer weighted imaging combined with dynamic contrast-enhanced MRI and found that Ktrans + SCC-Ag had an AUC of 0.819 for predicting deep stromal invasion in patients with IB1-IIA1 cervical cancer [24]. Another prospective multicenter study constructed a preoperative prediction model for deep stromal invasion in women with invasive cervical cancer using 2D and 3D ultrasound and reported an AUC of 0.93 [25]. These models were mostly constructed using the conventional logistic regression model, which can only capture linear associations, and their predictive ability still needs improvement [16]. Using a machine learning algorithm to train and validate the prediction model might help improve the accuracy of predicting deep stromal invasion in patients with early cervical cancer. In our study, the model based on radiomics features had an AUC of 0.951 in the training set, and the AUC of the model based on radiomics features and clinical predictors was 0.969 in the training set. The models presented better predictive performance for deep stromal invasion in patients with early cervical cancer than previous models. Detailed information on the database, computational complexity, and reliability of our model and previous prediction models is presented in Table 8. MRI has the advantages of relatively low cost, high spatial resolution and contrast of pelvic tissues and organs, and no radiation [26, 27]. MRI is highly individual specific and non-invasive, and it has been applied to clinical decision support to improve screening accuracy, diagnosis, and prognosis prediction [28].
The prediction model in our study was constructed using LightGBM, which uses a histogram-based split-finding algorithm instead of a pre-sorted traversal algorithm, and which reduces the number of data instances and features through gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB), respectively [29]. LightGBM has higher efficiency and accuracy [30] and better generalization ability [22]. The model combining LightGBM and MRI in the current study might provide a convenient and easy-to-use tool for the early identification of patients with early cervical cancer who are at high risk of deep stromal invasion. The accuracy of predicting deep stromal invasion in patients with early cervical cancer was improved compared with previous models, which might help guide the treatment options for patients at high risk of deep stromal invasion, and early interventions might improve their prognosis.
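To make the GOSS idea concrete, the sketch below follows the sampling rule described in the LightGBM paper: keep the fraction a of instances with the largest absolute gradients, randomly sample a fraction b of all instances from the remainder, and up-weight the sampled small-gradient instances by (1 − a)/b so that their contribution stays roughly unbiased. This is an illustration only, not LightGBM's internal implementation; the function name `goss_sample`, the default values of a and b, and the synthetic gradients are assumptions, and EFB is not shown.

```python
import numpy as np

def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return (indices, weights) chosen by gradient-based one-side sampling."""
    g = np.asarray(gradients, dtype=float)
    n = g.size
    top_n = int(round(a * n))    # large-gradient instances, always kept
    rand_n = int(round(b * n))   # small-gradient instances, sampled
    order = np.argsort(-np.abs(g))  # largest |gradient| first
    top_idx = order[:top_n]
    rng = np.random.default_rng(seed)
    rand_idx = rng.choice(order[top_n:], size=rand_n, replace=False)
    weights = np.ones(n)
    weights[rand_idx] = (1.0 - a) / b  # amplify sampled small gradients
    used = np.concatenate([top_idx, rand_idx])
    return used, weights[used]

# Synthetic gradients for demonstration (not study data).
g = np.linspace(-1.0, 1.0, 100)
used, w = goss_sample(g)
print(len(used), w.sum())
```

With a = 0.2 and b = 0.1, only 30% of the instances are used to evaluate each split, while the weights of the retained instances still sum to the original sample size.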

Table 8 Comparisons of our prediction model and previous prediction models for deep stromal invasion in patients with early cervical cancer

MRI is a vital examination for the initial assessment of loco-regional involvement of cervical cancer. Previous studies have applied MRI to evaluate the early response to radiochemotherapy before image-guided brachytherapy in patients with locally advanced cervical cancer [31]. Multiparametric MRI–derived radiomics has also been applied to predict disease-free survival in early-stage squamous cervical cancer [32]. Multimodal MRI was reported to have good diagnostic value for discriminating metastatic from non-metastatic pelvic lymph nodes in cervical cancer [33]. Another prospective preliminary study applied synthetic MRI to evaluate prognostic factors in cervical cancer [34]. These studies support the results of this study, which showed that MRI-derived radiomics features were important predictors of deep stromal invasion in patients with early cervical cancer. Cancer staging is an essential index for the diagnosis, prognosis, and treatment of cervical cancer [35]. The FIGO staging system is widely applied in cervical cancer [36] and was reported to be associated with treatment outcomes in early-stage cervical cancer patients [37]. Herein, FIGO stage was also found to be an important predictor of deep stromal invasion in patients with early cervical cancer. Another predictor of deep stromal invasion in this study was SCC-Ag, which is supported by previous evidence. SCC-Ag has been used for outcome prediction after concurrent chemo-radiotherapy and for treatment decisions in patients with cervical cancer [38]. Changes in SCC-Ag in patients with locally advanced cervical cancer were one of the parameters of prognostic evaluation [39].

The current study compared the predictive abilities of three machine learning models for the preoperative non-invasive diagnosis of deep stromal invasion in patients with early cervical cancer, based on clinical data, radiomics data, and combined clinical and radiomics data, respectively. The model based on combined clinical and radiomics data showed good predictive performance. The findings might provide a tool to help clinicians identify deep stromal invasion in patients with early cervical cancer and formulate treatment strategies accordingly. This study has several limitations. Firstly, the participants were from a single center, and there might be selection bias. Secondly, the MRI images were collected from different devices, which might affect the stability of the radiomics features. To mitigate this, the images were normalized before feature extraction, and all images were resampled to a resolution of 1 × 1 mm. This standardization process was considered a useful way to promote feature robustness in cervical cancer. In recent years, a growing number of deep learning methods, such as automated in-depth feature learning algorithms [40] and deep convolutional neural network-based approaches [41], have been applied to disease prediction and prognosis evaluation. These methods are based on unsupervised active learning, which increases the efficiency and accuracy of disease and prognosis prediction, including in cancers [42]. Applying deep learning in cervical cancer might help integrate medical images and clinical data to construct more reliable prediction models. In the future, more well-designed studies using deep learning methods are needed to verify the results of this study.

Conclusions

The AUC of the prediction model for deep stromal invasion in patients with early cervical cancer based on clinical and radiomics data was 0.969 in the training set and 0.914 in the testing set, indicating better predictive performance than previous prediction models. The prediction model might help clinicians identify patients at high risk of deep stromal invasion early and accurately, and provide timely interventions.