Background

A global increase in the prevalence of endometrial pathologies parallels escalating levels of obesity, progressive aging of the population, and increasing trends in delaying childbearing [1, 2]. In clinical practice, patients are diagnosed with suspected endometrial lesions due to abnormal uterine bleeding, infertility, or even an abnormal appearance of the endometrium as an incidental finding on imaging performed for other indications [3,4,5]. An accurate preoperative diagnosis of endometrial lesions is crucial for radiologists and gynecologists because it helps avoid unnecessary surgical procedures and preserve patients’ fertility.

Endometrial sampling biopsy with curettage or hysteroscopy serves as the primary diagnostic approach. Still, this method is invasive and not always possible (e.g., for patients with cervical stenosis or those unable to tolerate the procedure). Transvaginal ultrasound is a cost-efficient alternative and usually the first-choice examination, but it has relatively low specificity and is highly operator-dependent [6, 7]. Magnetic resonance imaging (MRI) is recommended for cases with inconclusive sonographic findings. Nevertheless, pre-surgical evaluation of uterine cavity abnormalities by conventional MRI remains challenging [8]. This difficulty can be attributed to the variable and potentially overlapping imaging features of a large spectrum of benign and malignant endometrial lesions [9].

Apparent diffusion coefficient (ADC) values obtained from diffusion-weighted imaging (DWI) have shown promise in characterizing endometrial pathologies as benign or malignant [10,11,12,13]. The ADC measures the random motion of water molecules and decreases with increasing tumor cellularity, as seen in malignant lesions [14]. However, drawing a region of interest on a single representative section of the tumor may bias the ADC values and increase interobserver variability [15, 16].

Whole-lesion histogram-based ADC analysis offers a more comprehensive assessment of a given abnormality than traditional ADC measures and provides an option for quantifying the overall heterogeneity of the tumor. Recently, ADC histogram analysis has been increasingly applied in genitourinary imaging [17,18,19] to differentiate histological types, predict tumor grade, and assess treatment response. A previous study has demonstrated that ADC histogram metrics may help radiologists differentiate benign from malignant endometrial lesions in premenopausal patients [20], although it was limited by a fairly small sample size (n = 54). To the best of our knowledge, no studies have reported on ADC histogram analysis for differentiating benign endometrial lesions (BELs) from International Federation of Gynecology and Obstetrics (FIGO) stage IA endometrial carcinoma (EC; the tumor is limited to the uterine corpus with no or less than 50% myometrial invasion). Moreover, several studies have shown that ADC histogram parameters may reflect different histopathological features and assist in evaluating tumor proliferation. For instance, ADC histogram parameters have recently been reported to be associated with the expression of Ki-67, epidermal growth factor receptor (EGFR), and histone 3 in uterine cervical cancer [21, 22], and with the expression of p53 in epithelial ovarian cancer [23].

Therefore, we aimed to evaluate the role of whole-lesion ADC histogram analysis in differentiating stage IA EC from BELs in premenopausal and postmenopausal patients and characterizing histopathologic features of stage IA EC preoperatively.

Methods

Study population

This retrospective study was approved by our institutional review board, and the requirement for written informed consent was waived. After reviewing the medical records of our hospital between January 2011 and December 2019, 232 patients diagnosed with BELs or EC were selected. All lesions were pathologically confirmed after hysterectomy or hysteroscopic resection. The benign pathologies included endometrial polyp, endometrial hyperplasia without atypia, and atypical endometrial hyperplasia. All of the ECs were FIGO 2018 stage IA.

Inclusion criteria were: (1) pelvis MR imaging with DWI performed within 20 days before surgery; (2) no tumor-related therapy received before MR examination. Exclusion criteria were the following: (1) endometrium too thin (maximum thickness was less than 5 mm on sagittal T2-weighted images or sketchable layers were less than two) to be accurately measured; (2) DWI with non-standard b values (other than 0, 800 s/mm2); (3) poor image quality or noticeable artifacts; (4) incomplete clinical data. The flowchart of patient enrollment is shown in Fig. 1.

Fig. 1

The flowchart of patient enrollment

MRI protocol

All pelvis MR examinations were performed using 3.0T scanners (Signa HDxt and Discovery MR 750, GE Medical System) equipped with an eight-element phased coil with patients in the supine position. Patients with no contraindications received 10 mg raceanisodamine hydrochloride injection intramuscularly before image acquisition to reduce the bowel motion artifacts. DWI was obtained in the axial plane using a single-shot echo-planar imaging technique before the injection of the contrast agent. Diffusion gradients were applied in three orthogonal directions with b values of 0 and 800 s/mm2. More detailed sequence scanning parameters are shown in Table 1.

Table 1 MR imaging protocol

Imaging analysis

ADC maps were manually generated from DWI on the post-processing workstation (Advantage Workstation 4.6; GE Medical System). Two radiologists (J.Z. and X.Y., with 6 and 18 years of experience in gynecologic MR imaging, designated reader 1 and reader 2, respectively) retrospectively and independently reviewed all images while blinded to the clinical and pathological information.

The ITK-SNAP software (version 3.8.0, www.itksnap.org) was used for segmentation. The volume of interest (VOI) covering the whole tumor was manually drawn by reader 1 along the boundary of the tumor, or of the entire endometrium if no tumor was visible, on all slices of the DWI images (b = 800 s/mm2). T1-weighted images, T2-weighted images, and dynamic contrast-enhanced images were used as references to keep necrotic, cystic, and hemorrhagic areas and adjacent normal tissues out of the VOIs. The VOIs were then automatically copied to the ADC maps. After a one-month interval, readers 1 and 2 independently repeated the VOI drawing. Inter- and intra-observer agreements of the ADC histogram metrics were determined by calculating intraclass correlation coefficients (ICC). Cases with apparently inconsistent VOIs between readers 1 and 2 were reassessed by another radiologist (H.O., with 30 years of experience in gynecologic imaging) to ensure high-quality final segmentation results.

The ADC histogram analysis was performed with the open-source PyRadiomics software [24] to obtain the volume of tumors and 18 first-order parameters, including 10th percentile ADC (ADC10th), 90th percentile ADC (ADC90th), ADCmin, ADCmax, ADCmean, ADCmedian, interquartile range (IQR), range, mean absolute deviation (MAD), robust mean absolute deviation (rMAD), root mean squared (RMS), energy, total energy, entropy, skewness, kurtosis, variance, and uniformity. Image normalization was applied on the ADC map before parameter extraction using the PyRadiomics normalization method by centering it at the mean with standard deviation based on all gray values in the image (not just those inside the segmentation).
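The histogram parameters listed above can be illustrated with a minimal, dependency-free sketch. It is not the PyRadiomics implementation used in the study; it only follows the textbook definitions of the 18 first-order features. The function names, the linear-interpolation percentile rule, and the default `bin_width` are assumptions made for illustration, and the bin width in particular would need to match the intensity scale of the (normalized) ADC map.

```python
import math
from collections import Counter


def normalize(image_values, scale=1.0):
    """Z-score ALL gray values of the image (not only the VOI): scale * (x - mean) / std."""
    vals = [float(v) for v in image_values]
    mu = sum(vals) / len(vals)
    sd = math.sqrt(sum((v - mu) ** 2 for v in vals) / len(vals))
    return [scale * (v - mu) / sd for v in vals]


def percentile(sorted_vals, q):
    """Percentile with linear interpolation between neighboring ranks (assumed rule)."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def first_order_features(voxels, voxel_volume_mm3=1.0, bin_width=25.0):
    """First-order histogram features of the voxel values inside one VOI."""
    x = sorted(float(v) for v in voxels)
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n  # population variance
    std = math.sqrt(variance)
    p10, p25, p50, p75, p90 = (percentile(x, q) for q in (10, 25, 50, 75, 90))
    # rMAD uses only the voxels between (or equal to) the 10th and 90th percentiles.
    inner = [v for v in x if p10 <= v <= p90]
    inner_mean = sum(inner) / len(inner)
    energy = sum(v * v for v in x)
    # Fixed-bin-width histogram for entropy and uniformity.
    counts = Counter(math.floor(v / bin_width) for v in x)
    probs = [c / n for c in counts.values()]
    return {
        "p10": p10, "p90": p90, "min": x[0], "max": x[-1],
        "mean": mean, "median": p50, "iqr": p75 - p25, "range": x[-1] - x[0],
        "mad": sum(abs(v - mean) for v in x) / n,
        "rmad": sum(abs(v - inner_mean) for v in inner) / len(inner),
        "rms": math.sqrt(energy / n),
        "energy": energy,
        "total_energy": voxel_volume_mm3 * energy,
        "entropy": -sum(p * math.log2(p) for p in probs),
        "skewness": (sum((v - mean) ** 3 for v in x) / n) / std ** 3 if std > 0 else 0.0,
        # Non-excess kurtosis: a normal distribution gives 3, not 0.
        "kurtosis": (sum((v - mean) ** 4 for v in x) / n) / variance ** 2 if std > 0 else 0.0,
        "variance": variance,
        "uniformity": sum(p * p for p in probs),
    }
```

In this sketch, `normalize` would be applied to the whole ADC map first, and `first_order_features` to the normalized values that fall inside the VOI.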

Histopathologic analysis

A pathologist (Y.S.) with 20 years of experience in gynecologic pathology reviewed the postoperative pathological data, including tumor classification, grading, and Ki-67 testing, while blinded to the clinical and imaging data. The EC tumor grade was established first (Grade 1 = well differentiated; Grade 2 = moderately differentiated; Grade 3 = poorly differentiated). According to the literature [25], the following histological subtypes of EC were classified as Grade 3: serous carcinoma, clear cell carcinoma, mixed carcinoma, and carcinosarcoma. Then, the tumors were divided into two groups: high-grade (Grade 3) and low-grade (Grade 1 and 2). The Ki-67 labeling index was estimated by counting the positively stained tumor cell nuclei in all images of each lesion. A cut-off value of 30% was used to divide Ki-67 expression into a low-proliferation group (< 30%) and a high-proliferation group (≥ 30%) [26, 27].
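The grouping rules above can be written down as a short sketch. The function names are hypothetical, and expressing the labeling index as the percentage of positively stained nuclei among all counted tumor nuclei is an assumption (the usual definition), since the text does not spell out the denominator.

```python
def ki67_labeling_index(positive_nuclei, total_nuclei):
    """Assumed definition: percentage of positively stained tumor cell nuclei."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    return 100.0 * positive_nuclei / total_nuclei


def proliferation_group(ki67_percent, cutoff=30.0):
    """Low proliferation below the 30% cut-off, high proliferation at or above it."""
    return "high" if ki67_percent >= cutoff else "low"


def grade_group(grade):
    """Grade 3 is high-grade; Grades 1 and 2 are low-grade."""
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    return "high-grade" if grade == 3 else "low-grade"
```

For example, 45 positive nuclei out of 150 counted gives an index of 30%, which falls into the high-proliferation group because the cut-off is inclusive.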

Statistical analysis

Categorical variables were analyzed using the chi-square test, or Fisher’s exact test when the expected value in any cell was less than five. Continuous variables were analyzed using a t-test or the Mann–Whitney U test after checking for normality with the Kolmogorov–Smirnov test. Receiver operating characteristic (ROC) curves were constructed for all significant variables to assess their diagnostic performance. Based on the ROC analysis, the optimal cut-off value was determined using the maximum Youden index (i.e., sensitivity + specificity − 1).
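As an illustration of the cut-off selection, the sketch below scans every observed value as a candidate threshold and keeps the one with the largest Youden index. It assumes that lower values indicate the positive (malignant) class, as is the case for ADC values; the function name and the toy data are hypothetical, and the study itself performed this analysis in R.

```python
def youden_cutoff(values, labels):
    """Pick the threshold maximizing sensitivity + specificity - 1.

    labels: 1 = malignant, 0 = benign. A case is called positive when its
    value is <= threshold (assumption: lower values indicate malignancy).
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    best = None
    for t in sorted(set(values)):
        sens = sum(v <= t for v in pos) / len(pos)
        spec = sum(v > t for v in neg) / len(neg)
        j = sens + spec - 1
        if best is None or j > best["youden"]:
            best = {"cutoff": t, "sensitivity": sens, "specificity": spec, "youden": j}
    return best


# Toy ADC-like values (arbitrary units), not study data.
values = [0.8, 0.9, 1.0, 1.1, 1.35, 1.3, 1.4, 1.5, 1.6]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0]
result = youden_cutoff(values, labels)
```

When several thresholds tie on the Youden index, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.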

Before the feature selection process, we applied Z score normalization (standardization) to ensure that the histogram parameters were measured on the same scale. Variables with p < 0.1 on univariate analysis entered the multivariate analysis, in which multivariate logistic regression with a forward stepwise selection procedure was used to select features and construct the diagnostic models. The diagnostic performance of each model was assessed using the ROC curve with the corresponding area under the curve (AUC), sensitivity, specificity, and accuracy. Calibration curves and the Hosmer–Lemeshow test were used to evaluate the goodness-of-fit of the models. DeLong’s test was used to compare the AUCs of the models. Internal validation was performed using the enhanced bootstrap resampling method (n = 1000), which estimates the optimism of the regression models to provide a bias-corrected AUC value. Spearman's rank correlation coefficient was used to calculate the correlation between ADC histogram parameters and the Ki-67 labeling index. Statistical analyses were performed using R software (version 4.0.3; http://www.Rproject.org). A p < 0.05 was considered statistically significant.
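The enhanced bootstrap procedure can be sketched as follows. This is a generic optimism-correction loop, not the R code used in the study: `fit` is a hypothetical callable that trains a model and returns a scoring function, and `fit_direction` is a deliberately trivial stand-in for the logistic regression so that the example stays self-contained.

```python
import random


def auc(scores, labels):
    """Rank-based AUC: chance that a positive case scores higher than a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def optimism_corrected_auc(rows, labels, fit, n_boot=1000, seed=0):
    """Return (apparent AUC, bias-corrected AUC) using bootstrap optimism."""
    rng = random.Random(seed)
    n = len(rows)
    model = fit(rows, labels)
    apparent = auc([model(r) for r in rows], labels)
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # a resample with a single class cannot yield an AUC
        b_model = fit(b_rows, b_labels)  # refit on the bootstrap sample
        auc_boot = auc([b_model(r) for r in b_rows], b_labels)
        auc_orig = auc([b_model(r) for r in rows], labels)
        optimism.append(auc_boot - auc_orig)
    return apparent, apparent - sum(optimism) / len(optimism)


def fit_direction(rows, labels):
    """Toy one-feature 'model': learns only whether high or low values mean class 1."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda r: sign * r


# Toy data, not study data.
rows = [0.8, 0.9, 1.0, 1.1, 1.35, 1.3, 1.4, 1.5, 1.6]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0]
apparent, corrected = optimism_corrected_auc(rows, labels, fit_direction, n_boot=200, seed=1)
```

The key design point is that each bootstrap model is evaluated twice, on its own resample and on the original data, and the average gap between the two is subtracted from the apparent AUC.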

Results

Patient characteristics

A total of 232 patients were enrolled in our study, including 106 BEL patients (age range, 34–77 years; median age, 49 years) and 126 stage IA EC patients (age range, 28–77 years; median age, 53 years). Tables 2 and 3 show patients’ clinical and histopathological characteristics, respectively. Representative cases of BELs and stage IA EC are presented in Fig. 2.

Table 2 Summary of patients' clinical characteristics
Table 3 Histopathological features of stage IA EC
Fig. 2

Findings of three patients with histopathologically proven endometrial polyp (a–d), atypical endometrial hyperplasia (e–h), and stage IA EC (i–l). Sagittal T2WI (a, e, i), axial DWI (b = 800 s/mm2; b, f, j), axial ADC maps (c, g, k), and ADC histogram (d, h, l). All three lesions showed moderate hyperintensity on T2WI and hyperintensity on DWI. The BELs (endometrial polyp and hyperplasia) showed slight hyperintensity or isointensity, while stage IA EC showed hypointensity on the ADC maps. The ADC histograms reflect the differences in the frequency distribution of voxels between benign and malignant endometrial tumors

Reliability of ADC histogram analysis

All measurements of whole-lesion histogram analysis showed excellent intraobserver and interobserver reliability (ICC = 0.955–0.998 and ICC = 0.926–0.997, respectively; Table 4).

Table 4 Comparison of ADC histogram parameters between stage IA EC and BELs

Comparison of ADC histogram parameters between BELs and stage IA EC

The results and distribution of ADC histogram parameters between stage IA EC and BELs are shown in Table 4 and Fig. 3. The ADC values of the stage IA EC, including ADC10th, ADC90th, ADCmin, ADCmax, ADCmean, ADCmedian, and IQR, were all significantly lower than those of BELs (all p < 0.05). The MAD, rMAD, RMS, energy, total energy, entropy, and variance of the stage IA EC were also significantly lower than those of BELs, and the skewness, kurtosis, and uniformity were higher (all p < 0.05). There were no significant differences in the tumor volume and range between stage IA EC and BELs (all p > 0.05). Spearman correlation coefficients of the significant ADC histogram parameters are shown in Fig. 4.

Fig. 3

Boxplots graphically depict the quartiles and distributions of normalized volumetric ADC histogram parameters between BELs and stage IA EC. The colors are grouped based on histopathological results: BEL in red and EC in blue. *p < 0.05, **p < 0.01, ***p < 0.001. IQR, interquartile range; MAD, mean absolute deviation; rMAD, robust mean absolute deviation; RMS, root mean squared

Fig. 4

Spearman correlation coefficients of the significant ADC histogram parameters. The text in bold showed the parameters included in the ADC histogram model. IQR, interquartile range; MAD, mean absolute deviation; rMAD, robust mean absolute deviation; RMS, root mean squared

Diagnostic efficacy of ADC histogram parameters in differentiating BELs and stage IA EC

The results of ROC curve analysis are summarized in Table 5. For the discrimination of stage IA EC from BELs, ADCmedian generated the highest AUC (AUC = 0.928; 95% CI 0.895–0.960; cut-off value = 1.161 × 10−3 mm2/s; sensitivity = 88.9%; specificity = 83.0%), followed by ADCmean, RMS, and ADC10th (AUC = 0.926, 0.925, and 0.920, respectively).

Table 5 Diagnostic performance of ADC histogram parameters in differentiating between stage IA EC and BELs

Multivariate logistic regression models based on clinical and ADC histogram parameters

Clinical model

In the clinical model, multivariate regression analysis showed that age (51–64 years) (odds ratio [OR] = 4.106, 95% confidence interval [CI] 2.269–7.429; p < 0.001), nulliparity (OR = 1.433, 95% CI 1.066–16.472; p = 0.040), and long-term tamoxifen therapy (OR = 0.140, 95% CI 0.038–0.524; p = 0.003) were significantly associated with differential diagnosis of stage IA EC from BELs. This clinical model achieved an AUC of 0.705 (95% CI 0.638–0.773, sensitivity = 65.1%; specificity = 70.8%).

ADC histogram model

After univariate and multivariate regression analysis, ADC10th, rMAD, total energy, and skewness were retained to fit the ADC histogram model. ADC-score calculated as the linear combination of these features with the logistic regression model coefficients was as follows:

$${\text{ADC-score}} = 0.257 - 3.270 \times {\text{ADC}}_{10\text{th}} - 0.710 \times {\text{rMAD}} + 1.025 \times {\text{Total energy}} + 0.605 \times {\text{Skewness}}$$
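A minimal sketch of how this score could be evaluated for one lesion is given below. It assumes that the four inputs are the Z-score-standardized feature values described in the statistical analysis, that the ADC-score is the linear predictor (log-odds) of the logistic model, and that stage IA EC is coded as the positive class; the dictionary keys and function names are hypothetical.

```python
import math

# Coefficients taken from the ADC-score equation above.
COEFFICIENTS = {
    "intercept": 0.257,
    "adc10th": -3.270,
    "rmad": -0.710,
    "total_energy": 1.025,
    "skewness": 0.605,
}


def adc_score(z):
    """Linear combination of standardized features (z: dict of Z-scores)."""
    return COEFFICIENTS["intercept"] + sum(
        COEFFICIENTS[name] * z[name]
        for name in ("adc10th", "rmad", "total_energy", "skewness")
    )


def predicted_probability(score):
    """Logistic transform of the score (assumes the score is the log-odds of EC)."""
    return 1.0 / (1.0 + math.exp(-score))


# A lesion with all features at the cohort mean (all Z-scores equal to 0).
average_lesion = {"adc10th": 0.0, "rmad": 0.0, "total_energy": 0.0, "skewness": 0.0}
score = adc_score(average_lesion)
probability = predicted_probability(score)
```

Under these assumptions, a lesion whose ADC10th lies one standard deviation above the cohort mean, with the other features at the mean, has a score of 0.257 − 3.270 = −3.013, which corresponds to a low predicted probability of EC.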

Moreover, when combining clinical parameters and ADC-score, ADC-score was the only significant independent predictor (OR = 2.641, 95% CI 2.045–3.411; p < 0.001). All data from multivariate logistic regression models are summarized in Table 6.

Table 6 Results of multivariate logistic regression models for differentiating stage IA EC from BELs

ROC analysis showed that the ADC histogram model had a significantly higher AUC of 0.941 (95% CI 0.912–0.970, sensitivity = 88.1%; specificity = 89.6%) than the clinical model (p < 0.001). Although the AUC of the ADC histogram model was higher than that of ADCmedian, the difference was not significant (p = 0.071). The bias-corrected AUC generated through the enhanced bootstrap resampling process showed a slight reduction for the ADC histogram model, from 0.941 to 0.937. The ROC curves of the models are shown in Fig. 5a, b. The calibration curve showed a good fit for the ADC histogram model (Hosmer–Lemeshow test, p = 0.504) (Fig. 5c).

Fig. 5

a ROCs of the ADC histogram parameters and combined ADC histogram model. b ROCs of the ADC histogram model and clinical model. c The calibration plot of the ADC histogram model. Patient risk scores output by the ADC histogram model for d premenopausal and e postmenopausal patients, while red bars show scores for those with stage IA EC. rMAD, robust mean absolute deviation

Subgroup analysis revealed that the ADC histogram model achieved an AUC of 0.919 (95% CI 0.866–0.973), a sensitivity of 0.881, a specificity of 0.896, and an accuracy of 0.888 in the premenopausal group. The model achieved an even higher AUC of 0.957 (95% CI 0.922–0.991), a sensitivity of 0.886, a specificity of 0.935, and an accuracy of 0.905 in the postmenopausal group. Figure 5d, e illustrates the performance of the ADC histogram model in the premenopausal and postmenopausal populations. In addition, the ADC histogram model also performed well in distinguishing BELs from stage IA endometrioid ECs, with an AUC of 0.943 (95% CI 0.913–0.973), a sensitivity of 0.892, a specificity of 0.896, and an accuracy of 0.894.

Diagnostic efficacy of ADC histogram parameters in characterizing histopathologic features of stage IA EC

Grade 3 stage IA ECs showed significantly lower ADCmin and ADC10th values than Grade 1/2 tumors (p = 0.022 and 0.047, respectively; Additional file 1: Table S1). ROC analysis showed that ADCmin was more effective than the other parameters. Using the Youden index, a threshold value of 0.583 × 10−3 mm2/s for ADCmin was identified. This threshold yielded an AUC of 0.641 (95% CI 0.518–0.763), a sensitivity of 40.7%, a specificity of 85.4%, and an accuracy of 75.6% (Additional file 1: Table S2).

The Ki-67 proliferation index was available for 80 EC patients. For these 80 stage IA EC lesions, the Ki-67 index on pathologic evaluation ranged from 1% to 90% (median, 30%). Spearman’s rank correlation coefficients showed no correlations between ADC histogram parameters and the expression of Ki-67 in stage IA EC (all p > 0.05; Additional file 1: Table S3). Also, there were no significant differences in the ADC parameters between the low- and high-Ki-67 expression groups (all p > 0.05; Additional file 1: Table S4).

Discussion

Our study demonstrated that ADC histogram parameters derived from whole-lesion assessment could help distinguish stage IA EC from BELs preoperatively. The ADCmedian yielded the highest AUC of 0.928 for differentiating BELs from stage IA EC among all the ADC histogram parameters. Furthermore, multivariate analysis showed that ADC-score (ADC10th + skewness + rMAD + total energy) was the only significant independent predictor for stage IA EC when considering the clinical parameters. This ADC histogram model (ADC-score) achieved an AUC of 0.941 and bias-corrected AUC of 0.937 and performed well for premenopausal and postmenopausal patients.

In the present study, the ADC values, including ADC10th, ADC90th, ADCmin, ADCmax, ADCmean, ADCmedian, and IQR of stage IA EC, were all significantly lower than those of BELs, which is consistent with previous studies [10, 28, 29]. The denser cellularity in malignant lesions restricts the diffusion of water molecules and correspondingly decreases ADC values [12]. However, most previous studies included EC of different stages with relatively small sample sizes. In this study, we analyzed the capacity of ADC values for discriminating early-stage EC from BELs. Whereas prior studies mainly focused on the role of standard mean ADC values, we observed that the whole-lesion ADCmedian, ADCmean, and ADC10th all showed high classification potential and could be easily applied in clinical practice.

Several previous studies have suggested that low percentiles of ADC are more helpful in diagnosing and classifying malignancies than the mean ADC or high percentiles [30,31,32,33]. Kierans et al. [20] suggested that ADC10th may predict malignant endometrial lesions more accurately than ADCmean. In our study, although ADC10th was not superior to the mean or median ADC in the classification task, it showed reasonably good discriminative performance and was selected as one of the optimal parameters of the ADC histogram model. A possible explanation is that lower-percentile ADC values may better represent aggressive solid components within endometrial malignancies, while high-percentile ADC values might be affected by cystic or necrotic components [34]. In clinical work, such microcystic changes may fail to be excluded from the VOI because of the limits of visual detection. Therefore, it was unsurprising that ADC10th effectively discriminated between the two lesion types with distinct compactness.

The ADC histogram analysis represents texture-based statistics of the variation and frequency of ADC values within a given tissue. It can assess the deviation of the histogram from a normal distribution as a marker of structural heterogeneity and complexity. Previous studies demonstrated greater heterogeneity in more aggressive lesions [35,36,37,38]. Besides the quantitative ADC values, we found that the histogram parameters MAD, rMAD, RMS, energy, total energy, entropy, and variance of stage IA EC were significantly lower, whereas skewness, kurtosis, and uniformity were significantly higher, than those of BELs. After univariate and multivariate regression analysis, skewness, total energy, and rMAD were included in the final ADC histogram model.

Skewness reflects the asymmetry of the ADC histogram distribution. Positive skewness indicates that most voxels contain ADC values below the mean and that the long tail of the curve extends to the right. Prior studies have demonstrated significantly higher skewness in soft tissue sarcomas than in benign peripheral neurogenic tumors [35], and in invasive compared with noninvasive intraductal papillary neoplasms of the bile ducts [36]. Therefore, our observation of greater ADC skewness in stage IA EC probably reflects increased structural heterogeneity within the lesions, with a predominance of lower ADC values indicating the reduction in ADC arising from neoplasia-related cellularity.

Energy refers to the magnitude of voxel values in the image; it is volume-dependent, and larger values imply a higher sum of the squares of these values. The total energy is the value of the energy feature scaled by the voxel volume in cubic mm [24]. Since no significant difference existed in the volume of VOIs between these two groups in the current study, we concluded that stage IA EC had significantly lower energy and total energy than BELs due to lower voxel intensity values within the entire tumor in ADC maps.
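For reference, the two features can be written out explicitly. The notation is an assumption chosen for illustration: $X(i)$ is the value of voxel $i$, $N_p$ is the number of voxels in the VOI, $V_{\text{voxel}}$ is the volume of one voxel in mm³, and $c$ is an optional constant shift applied to the voxel values (zero if no shift is used).

$$\text{Energy} = \sum_{i=1}^{N_p} \left( X(i) + c \right)^2, \qquad \text{Total energy} = V_{\text{voxel}} \sum_{i=1}^{N_p} \left( X(i) + c \right)^2$$

Both sums grow with the number of voxels as well as with their magnitudes, which is why comparable VOI volumes between the two groups matter for the interpretation given above.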

In our study, significantly higher rMAD was observed in BELs than in stage IA EC. The rMAD is the mean distance of all intensity values from the mean value, calculated on the subset of the image array with gray levels between, or equal to, the 10th and 90th percentiles; it is a more robust variant of MAD [24]. A larger MAD indicates a higher contrast between high and low intensity in a tumor. Because we excluded necrotic, cystic, and hemorrhagic areas from the VOIs, we considered that stage IA EC lesions had increased tumor cellularity, resulting in a relatively uniform reduction in ADC values. In contrast, benign lesions, such as endometrial polyps, contain endometrial glands and stroma of focally or diffusely dense fibrous or smooth muscle tissue [6]. Cystic glandular hyperplasia commonly occurs within the polyp. This tissue composition can cause a more dispersed ADC distribution in the VOIs.

Previous studies demonstrated that endometrial pathologies share common predisposing risk factors, such as age, obesity, diabetes, postmenopausal status, nulliparity, and long-term tamoxifen therapy [39]. Our data suggested that women aged 51–64 were more likely than those under 50 or over 65 to have EC rather than BELs. Similarly, Abid et al. [40] found that more progressive lesions, such as EC, were associated with the peri- and postmenopausal age groups, yet endometrial polyp was the most common pathology in postmenopausal women. Meanwhile, nulliparity is an established risk factor for endometrial cancer, and each pregnancy provides an additional risk reduction [41]. However, the mechanisms and hormone profiles that underlie alterations in endometrial cancer risk are not fully understood. Prolonged tamoxifen use is associated with an increased incidence of various endometrial lesions, including endometrial polyp, endometrial hyperplasia with or without atypia, EC, and sarcoma [42]. However, most endometrial lesions detected in tamoxifen users are benign rather than EC, and endometrial polyps are the most common among them [43, 44]. In the current study, we found that long-term tamoxifen therapy was significantly associated with a diagnosis of BELs, consistent with prior studies. Nevertheless, the efficacy of the above-mentioned clinical model was moderate, and no clinical parameter remained significant in the multivariate regression analysis once the ADC-score was included. In contrast to a previous study [20], we showed, with a larger sample size, that the ADC histogram model could predict the presence of malignancy in both the postmenopausal and premenopausal groups.

As the tumor progresses, changes in the tumor microenvironment enhance the proliferative ability. ADC values based on water molecular diffusion may reflect the microstructure of relevant tissues in this respect. Recently, numerous studies have analyzed the associations between ADC values and histopathological features like tumor grade and Ki-67 in different tumors [45, 46]. In our research, ADCmin and ADC10th values for Grade 3 stage IA EC were significantly lower than those for Grade 1/2 tumors, in line with the results of previous studies [47, 48]. Similar to the differentiation of malignant from benign lesions, the region showing minimum ADC values may reflect the highest cellular area within the tumor, which is more representative of tumor grade or aggressiveness. However, the diagnostic efficiency of ADCmin is insufficient, with an AUC of 0.641 and a sensitivity of 40.7%. Also, no significant correlation was found between ADC histogram parameters and the expression of Ki-67 in stage IA EC. Therefore, it is necessary to explore more sensitive indicators to predict tumor aggressiveness of early-stage EC in the future, such as parameters derived from amide proton transfer-weighted imaging, which have shown promising prospects in this regard [26].

This study has a few limitations. First, this was a single-center retrospective study with inherent selection bias. Further external validation in independent data sets with a large number of patients is necessary in the future. Second, we did not investigate the potential value of ADC histogram parameters in classifying BELs as polyps, hyperplasia without atypia, or atypical hyperplasia, because BELs usually coexist pathologically and can have similar signals on conventional MRI, which makes precise segmentation difficult. Third, although most endometrial lesions could be contoured on DWI with multimodal MR images as references, there were still obscure boundaries between the lesions and normal endometrium in some BELs, such as endometrial hyperplasia, where we contoured the entire endometrium as the VOI. The bias introduced by inconsistency in VOI drawing is difficult to avoid in clinical practice; in our study, it was minimized by consulting another experienced radiologist. Finally, patients underwent MRI with different MR equipment and protocols. The histogram metrics, rather than the ADC values, can directly reflect the ADC distribution and are expected to remain reliable across different MRI systems.

Conclusions

Our study suggested that whole-lesion ADC histogram analysis can promote preoperative differentiation of stage IA EC from BELs and histopathologic grading of stage IA EC in premenopausal and postmenopausal patients, thereby contributing to clinical treatment planning.