Background

Intrahepatic cholangiocarcinoma (ICC) is the second most common primary liver malignancy, and its incidence is rising worldwide [1, 2]. It originates from the intrahepatic biliary epithelium and is classified morphologically into three types: mass forming, periductal infiltrating, and intraductal growing [3]. Intrahepatic mass-forming cholangiocarcinoma (IMCC) accounts for a large percentage of ICC. Partial hepatectomy is the first-choice curative treatment for IMCC and can prolong survival [4]. However, even after a curative resection, the 5-year survival rate is only 20–35% [5, 6]. The main reason for the poor outcome is recurrence, the incidence of which can be as high as 54–71% [5, 7, 8].

The time interval from resection to IMCC recurrence is an independent prognostic factor of survival [9]. Approximately 78.8% of recurrences develop within 24 months, which is defined as early recurrence (ER), and the prognosis for patients with ER is worse than that for those with late recurrence. Adjuvant trans-arterial chemoembolization (TACE) or chemotherapy after surgery was associated with better survival among IMCC patients with early recurrence [10, 11]. Therefore, patients at high risk of ER need to be identified precisely so that effective adjuvant treatment and closer postoperative follow-up can be provided.

Previous studies have revealed several pathological tumor characteristics associated with postoperative ER of IMCC (e.g. tumor size, satellite lesions, lymph node metastasis, lymphatic invasion, microvascular invasion and stage) [9]. In addition, certain immunohistochemical molecules have been reported as predictive markers of ICC prognosis. Iguchi et al. found that P53 and Ki67, markers indicating cell proliferation, were related to the overall survival of ICC patients [12]. Epidermal growth factor receptor (EGFR) expression was demonstrated to be an independent predictor of ICC prognosis [13]. As an anti-angiogenesis therapeutic target, vascular endothelial growth factor receptor (VEGFR) expression has been correlated with the prognosis of many cancers (e.g. breast cancer, ovarian cancer, lung cancer and lymphoma) [14,15,16,17]. However, the value of these factors for predicting ER has differed noticeably among studies. Therefore, it is necessary to further explore whether they are effective predictors of ER in IMCC.

Magnetic resonance imaging (MRI) is widely used in the diagnosis and treatment planning of liver tumors. Previous studies revealed some radiological features that might be predictors of IMCC prognosis, such as the degree of diffusion restriction on diffusion weighted images (DWI), enhancement pattern of contrast enhanced MRI (CE-MRI), and intensity on the hepatobiliary phase of gadoxetic acid-enhanced MRI [18,19,20]. However, these qualitative assessments were unable to quantify tumor heterogeneity. Radiomics is considered to be an emerging quantitative technique for evaluating the entire underlying intra-tumor heterogeneity by extracting numerous features from radiologic images. Studies in many cancers (e.g. colorectal cancer, breast cancer, lung cancer, esophageal cancer and hepatocellular carcinoma) have shown that radiomics has the potential for prognosis prediction [21,22,23,24,25].

We combined the above pathological characteristics and immunohistochemical molecules with both visible and invisible radiological features for better ER prediction. To our knowledge, combining immunohistochemistry and radiomics features for ER prediction in IMCC patients has not yet been investigated. Therefore, the aim of this study was to develop a nomogram based on radiological features, immunohistochemical markers, and radiomics features for predicting the ER of IMCC to create a better stratification of IMCC patients thus improving personalized treatment.

Materials and methods

Patients

This retrospective study was approved by our Institutional Review Board, and the need for informed consent was waived. We collected electronic medical records from May 2011 to August 2016 at our institution. A total of 219 patients who underwent a curative-intent resection and lymph node dissection, with histopathologically confirmed IMCC, were identified. Among these patients, the study population was selected using the following inclusion criteria: (a) patients who underwent preoperative liver CE-MRI within 4 weeks of their surgery, (b) patients without a history of adjuvant treatment before the surgery, (c) patients with histopathologically proven IMCC and a negative resection margin (R0), with combined hepatocellular-cholangiocarcinoma excluded, (d) patients without a history of other tumors, and (e) patients who completed at least 2 years of follow-up. Consequently, 47 patients were enrolled in our study and were divided into an ER group (n = 31) and a non-ER group (n = 16), with ER being defined as the development of intrahepatic or extrahepatic recurrence within 2 postoperative years (Fig. 1).

Fig. 1

Flowchart of the enrolled patients. IMCC, intrahepatic mass-forming cholangiocarcinoma; MRI, magnetic resonance imaging; CE-MRI, contrast enhanced magnetic resonance imaging; ER, early recurrence

Clinical and pathologic characteristics

Clinical and pathological characteristics consisted of age, gender, hepatitis, carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA199), satellite lesions, maximum tumor diameter (MTD), tumor location, degree of tumor differentiation, stage and lymph node metastases. Lymph node metastasis was determined by the pathological results of lymph node dissection and preoperative CT/MR imaging. The threshold values chosen for CA199 and CEA levels were based on the normal ranges used at our institution (0–37 U/ml for CA199 and 0–5 ng/ml for CEA).

Immunohistochemistry

All pathology sections and macroscopic pictures of the resected specimens were retrospectively reviewed. Together with immunohistochemistry, post-operative pathology confirmed IMCC in all 47 patients. Immunohistochemical staining was performed on formalin-fixed, paraffin-embedded tissue using standard immunohistochemical methods. Five-micron-thick sections were cut, and antibodies specific for EGFR, VEGFR, P53, and Ki67 (Beijing Zhongshan Golden Bridge Biotechnology Co. LTD, China) were used for staining. All samples were analyzed by an anatomic pathologist with 10 years of experience, who was unaware of the patients' outcomes. Samples with less than 10% positive staining were classified as negative expression, and those with more than 10% positive staining were classified as positive expression.

Follow-up

All patients underwent contrast enhanced CT or MRI every 3–6 months after surgery in the first 2 years. Images were analyzed to identify ER, which was determined as the presence of new intrahepatic lesions with typical imaging features of IMCC, atypical lesions with histopathological confirmation, or extrahepatic metastasis (lymph node metastases or distant metastasis) within 2 postoperative years.

Magnetic resonance imaging acquisition and analysis

All patients underwent 3.0 T MRI scans (Signa Excite HDxt, GE Healthcare, Milwaukee, USA) with an eight-element phased-array torso coil. After nonenhanced axial breath-hold T1 weighted imaging, fat suppression T2 weighted imaging (T2WI/FS) and DWI (b-values of 0 and 800 s/mm²), a contrast enhanced T1-weighted three-dimensional (3D) spoiled gradient echo sequence (LAVA) was performed. Gadodiamide (Omniscan 0.5 mmol/ml; GE Healthcare, Ireland) was injected as a bolus at a dose of 0.2 mL per kilogram and a rate of 2 mL per second by an automatic pump injector, followed by a 20 mL 0.9% sterile saline flush. Contrast enhanced imaging was performed in the arterial phase (AP) (30 s), portal venous phase (PVP) (60 s), and delayed phase (DP) (180 s). Images of AP, PVP, and DP were all used for radiomics feature extraction and analysis. Additional technical details are provided in Table 1.

Table 1 The details of MR imaging sequences parameters

MR images were reviewed independently by two radiologists who had 5 and 10 years of experience in abdominal MRI, respectively. Both radiologists were blinded to the clinical data of the patients when they evaluated the MR images. Disagreements were resolved by discussion until a consensus was reached. The basic imaging traits potentially associated with ER included lesion shape, contour, biliary dilation, capsular retraction, DWI intensity and the enhancement pattern. The target appearance was defined as peripheral hyperintensity compared to the center on high b-value DWI. The enhancement pattern was assessed using the following subdivisions: (a) gradual enhancement (the enhancement area gradually increased from the periphery to the center of the tumor), (b) persistent enhancement (enhancement remained through all phases), (c) wash in and wash out (hyperenhancement in the AP followed by washout), and (d) minimal or no enhancement.

Radiomics features extraction and analysis

MR images (AP, PVP, DP, T2WI/FS sequence) were loaded into ITK-SNAP software (version 2.2.0, www.itksnap.org) for 3D manual segmentation. A radiologist with 10 years of MRI experience (reader 1) performed the tumor segmentations in all 47 patients. After 2 weeks, images of all patients were segmented again by reader 1 and another radiologist (reader 2) with 5 years of experience of MRI diagnosis to assess intra-/inter-reader agreement in the feature analysis. All outcomes were based on the features extracted by the first segmentation from reader 1.

Artificial Intelligence Kit software (A.K. software; GE Healthcare, Life Sciences, Beijing, China) was used to extract 396 parameters from each sequence. Those parameters include first order histogram features (n = 42), grey-level co-occurrence matrix (GLCM) features (n = 144), grey-level run-length matrix features (n = 180), Haralick features (n = 10), morphological features (n = 9) and grey-level zone size matrix features (n = 11).
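As a quick arithmetic check on the feature classes listed above, the following minimal Python sketch tallies the per-class counts and confirms that they sum to the stated 396 parameters per sequence. The dictionary keys and variable names are illustrative only and are not taken from the A.K. software.

```python
# Feature classes and counts exactly as listed in the text above.
# The key names are illustrative labels, not A.K. software identifiers.
feature_counts = {
    "first-order histogram": 42,
    "GLCM": 144,
    "grey-level run-length matrix": 180,
    "Haralick": 10,
    "morphological": 9,
    "grey-level zone size matrix": 11,
}

# Total number of parameters extracted from one sequence.
per_sequence_total = sum(feature_counts.values())

# Four sequences were segmented: AP, PVP, DP and T2WI/FS.
sequences = ["AP", "PVP", "DP", "T2WI/FS"]
all_sequences_total = per_sequence_total * len(sequences)

print(per_sequence_total)
print(all_sequences_total)
```

With 47 patients, the candidate feature pool is far larger than the cohort, which is why the reproducibility and selection steps that follow are needed before any model is fitted.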

The extracted parameters were analyzed for consistency and correlation. First, the intraclass correlation coefficient was calculated for each parameter to test inter-observer and intra-observer reproducibility, and features with intraclass correlation coefficient values less than 0.8 were excluded. Second, stratified analyses were conducted using the Wilcoxon signed-rank test to identify potential associations between the remaining parameters and ER status, followed by univariate logistic regression. To retain discriminative parameters, we set a threshold of 0.1. A variance inflation factor was then used to eliminate parameters with high multicollinearity. Finally, multivariate logistic regression was applied to evaluate the performance of distinguishing ER status in each sequence. Different sequence combinations (AP + PVP, AP + PVP + DP, AP + PVP + DP + T2WI) were also tried to explore the best model using the same methods described above. Each model was validated using leave-one-out cross-validation. Thus, three predictive models were built: a best performance radiomics model, a clinicoradiologic-pathologic (CRP) model, and a combined model with both selected radiomics features and clinicoradiologic-pathologic features (Fig. 2).
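The leave-one-out validation step can be illustrated with a minimal Python sketch. It is not the authors' implementation (the analysis was run in R and SPSS, and the fitting routine is not described here). The gradient-descent logistic regression, the function names, the learning rate and the toy data are all assumptions made for illustration.

```python
import math

def predict_proba(w, b, x):
    """Logistic probability for one case given weights w and bias b."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent (illustrative stand-in)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def leave_one_out_probs(X, y):
    """Predict each case with a model fitted on all the other cases."""
    probs = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        w, b = fit_logistic(X_train, y_train)
        probs.append(predict_proba(w, b, X[i]))
    return probs

# Made-up toy data: one feature, two well-separated classes.
X = [[-3.0], [-2.5], [-2.0], [-1.5], [1.5], [2.0], [2.5], [3.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

probs = leave_one_out_probs(X, y)
predicted = [1 if p >= 0.5 else 0 for p in probs]
accuracy = sum(int(p == t) for p, t in zip(predicted, y)) / len(y)
print(accuracy)
```

Because every case is predicted by a model that never saw it, the pooled out-of-sample probabilities can be used to build ROC curves without a separate test set, which is useful for a cohort of this size.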

Fig. 2

Study workflow. Basic features of biliary dilation on T2WI/FS (a), capsular retraction on T2WI/FS (b), and target appearance on DWI (c). ER, early recurrence; ROI, region of interest; HE, hematoxylin and eosin; VEGFR, vascular endothelial growth factor receptor; T2WI/FS, fat suppression T2-weighted imaging; PVP, portal venous phase; AP, arterial phase; DP, delayed phase; DWI, diffusion weighted image

Statistics

The differences in patient characteristics between the two groups were assessed using the t-test or the Mann-Whitney U test for continuous variables and the chi-square test or Fisher exact test for categorical variables. Kappa tests were used to determine inter-observer agreement for qualitative MRI features. Kappa values of 0.81–1.00 indicated excellent agreement, 0.61–0.80 signified substantial agreement, and 0.41–0.60 denoted moderate agreement. Receiver operating characteristic (ROC) curves were constructed for each model. The area under the curve (AUC) of the ROC curves, accuracy, sensitivity, specificity, positive predictive values (PPVs) and negative predictive values (NPVs) were obtained, and the performance of the three final models was compared using the DeLong test. All statistical analyses were performed using RStudio Server (Version 3.5.0; RStudio, Inc., Boston, MA, USA) and SPSS (version 20.0; IBM, Armonk, NY, USA). A two-sided P value less than 0.05 indicated a statistically significant difference.
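Two of the quantities named above can be made concrete with a short Python sketch: Cohen's kappa for two readers, mapped to the agreement bands given in the text, and the AUC computed as the proportion of positive-negative pairs that are ranked correctly. The function names and the toy ratings and scores are made up for illustration. The DeLong comparison is not implemented here.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

def agreement_band(kappa):
    """Map kappa to the bands stated in the text (lower bands are not defined there)."""
    if kappa >= 0.81:
        return "excellent"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    return "below moderate (not defined in the text)"

def auc_rank(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Made-up ratings from two hypothetical readers (they disagree on one case).
reader_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
reader_2 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
kappa = cohen_kappa(reader_1, reader_2)
band = agreement_band(kappa)

# Made-up model scores for positive and negative cases.
auc = auc_rank([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

print(round(kappa, 3), band)
print(round(auc, 3))
```

The pairwise definition of the AUC is equivalent to the area under the empirical ROC curve, which is why it can be computed without tracing the curve itself.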

Results

The median age of the patients was 57 years (range, 35 to 78 years), and ER was found in 31 (66.0%) patients. The median MTD was 5.73 cm, ranging from 1.5 to 12.8 cm. The median follow-up was 34 months (range, 25 to 87 months). Clinical and pathological characteristics in the ER group and non-ER group are summarized in Table 2. There were no significant differences between the ER group and non-ER group in terms of these characteristics. Radiological features and immunohistochemical markers are listed in Table 3. Among these factors, enhancement patterns and VEGFR expression showed significant differences between the ER group and non-ER group (P = 0.001 and 0.034, respectively). Additionally, among 33 IMCC patients with a gradual enhancement pattern, 26 (78.8%) patients were in the ER group, while all 5 (100%) patients with a wash in and wash out enhancement pattern were in the non-ER group (P = 0.002). The inter-observer agreement for the radiological features was excellent (κ = 0.811–0.849).

Table 2 Characteristics of patients
Table 3 Radiological features and immunohistochemistry

We began by developing radiomics models based on T2WI/FS, AP, PVP, and DP images separately. The PVP model demonstrated better accuracy, sensitivity, specificity, PPV and NPV (0.872, 0.75, 0.936, 0.857, and 0.879, respectively), although it presented a slightly lower AUC (0.841, 95% confidence interval (CI): 0.697–0.984) than the AP model (0.871, 95% CI: 0.761–0.981). Next, radiomics models based on multiple sequences were built according to the aforementioned results, including AP + PVP (the two sequences with higher AUC), AP + PVP + DP (multi-phase contrast enhanced sequences), and T2WI/FS + AP + PVP + DP (all sequences) models. The AP + PVP + DP model showed the highest AUC (0.889, 95% CI: 0.783–0.996) among all the radiomics models and was used in the subsequent analysis. This model indicated that the four most important parameters for predicting ER were AP_skewness and PVP_Variance, both derived from the histogram, as well as AP_ClusterShade_AllDirection_offset7_SD and AP_GLCMEntropy_angle45_offset7, both derived from the GLCM. The accuracy, sensitivity, specificity, PPV, NPV and AUC of the seven radiomics models are presented in Table 4.

Table 4 Predictive performance of the radiomics model

The clinicoradiologic-pathologic (CRP) model contained enhancement pattern and VEGFR. The combined model incorporated radiomics features, clinicoradiological features and pathological factors. The predictive performance of the CRP model, radiomics model, and combined model is listed in Table 5, and the ROC curves are shown in Fig. 3. The nomogram for the combined model is presented in Fig. 4. The combined model displayed the best accuracy, sensitivity, specificity, PPV, NPV and AUC (0.872, 0.938, 0.839, 0.750, 0.963 and 0.949, respectively). Also, the combined model significantly improved on the predictive performance of the CRP model in predicting ER of IMCC (P = 0.009).
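The relationship between the reported threshold metrics can be checked with a small worked example. The sketch below computes accuracy, sensitivity, specificity, PPV and NPV from a 2×2 table. The four counts are not reported in this text: they are back-calculated here as one table of 47 patients that reproduces the rounded values for the combined model, and they should be read as an assumption. The text does not state which group is treated as the positive class.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard metrics derived from a 2x2 classification table."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# ASSUMPTION: back-calculated counts (not reported in the paper) for a
# positive class of 16 patients and a negative class of 31 patients.
inferred = diagnostic_metrics(tp=15, fn=1, fp=5, tn=26)

# Values reported in the text for the combined model.
reported = {
    "accuracy": 0.872,
    "sensitivity": 0.938,
    "specificity": 0.839,
    "ppv": 0.750,
    "npv": 0.963,
}

for name, value in inferred.items():
    print(f"{name}: computed {value:.3f}, reported {reported[name]:.3f}")
```

Under this assumed table, each computed value agrees with the reported figure to three decimal places, so the five threshold metrics are mutually consistent for a cohort of 47. The AUC cannot be recovered from a single 2×2 table because it depends on the full ranking of predicted probabilities.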

Table 5 Predictive performance of three models
Fig. 3

The receiver operating characteristics curves of the radiomics, CRP, and combined models. AUC, area under the curve; CRP, clinicoradiologic-pathological

Fig. 4

Nomogram with rad_score, enhancement pattern of pre-operative MRI and VEGFR. MRI, magnetic resonance imaging; VEGFR, vascular endothelial growth factor receptor

Discussion

Combining radiomics features, enhancement patterns, and VEGFR led to significant improvements in the AUC, sensitivity, specificity and accuracy for predicting ER compared with the radiomics model or CRP model alone, which indicated that the combination of qualitative and quantitative MRI features with immunohistochemical markers maximizes the predictive performance for ER. Radiomics models based on CE-MRI sequences (AP, PVP, or DP) showed better specificity and AUC for predicting ER than the T2WI model, though they exhibited relatively lower sensitivity.

Previous studies have investigated the prediction of ER in IMCC. For instance, Liang et al. developed a novel nomogram to predict the recurrence of ICC with AUC, sensitivity, and specificity values of 0.90, 0.74, and 0.89, respectively [26]. This nomogram was based on radiomics and clinical stage, and the radiomic signatures were extracted only from AP images, with a relatively low sensitivity. Jeong et al. established a predictive nomogram of IMCC recurrence based only on clinical characteristics (lymph node metastasis, tumor size, surface antigen of the hepatitis B virus, and Child–Pugh score), with a concordance index (C-index) of 0.71 (95% CI: 0.65–0.77) [27]. Our study demonstrated better sensitivity and AUC compared with these studies. Furthermore, our combined model that included radiomics features, clinicoradiological and pathological factors was the superior predictive model. This suggested that combining morphology, quantification of tumor heterogeneity and molecular pathology could better reflect aggressive malignant tumor biology.

An increasing number of studies have shown the potential of radiomics based on MR images for diagnosis and prognosis assessment in specific tumors. In our study, the radiomics features were extracted from T2WI/FS and from the AP, PVP and DP of contrast enhanced images to build the best radiomics model. Radiomics models of AP, PVP and DP all provided better AUC and specificity for predicting ER than the T2WI/FS model, which suggested that CE-MRI contains more information on tumor heterogeneity. Catharina et al. also revealed the exceptional discriminating ability provided by CE-MRI among T1WI, T2WI, T2WI/short inversion time recovery, and contrast enhanced sequences in differentiating low-grade chondrosarcoma and enchondroma by texture analysis (TA) [28]. Furthermore, the AP + PVP + DP model showed the highest AUC among all the TA models, which was consistent with previous studies. Ueno et al. [29] found that a TA model based on texture parameters from several sequences led to a better predictive value than that from a single sequence, demonstrating that multi-phase CE-MRI could provide added value.

We found that VEGFR was a predictor of ER, which was consistent with previous studies. Sang et al. found that ICC with positive VEGFR expression represented aggressive malignancy, because inhibition of VEGFR-2 expression increased apoptosis and decreased cell proliferation [30]. The enhancement patterns were also related to ER owing to the histopathologic basis of fibrous stroma in tumors. A gradual enhancement pattern was related to a large amount of fibrous stroma in the tumor and indicated a poor prognosis [31]. Small IMCC with a diameter less than 3 cm in the cirrhotic liver frequently showed an atypical wash in and wash out enhancement pattern [32]. IMCC with a wash in and wash out enhancement pattern demonstrated less central fibrous stroma and more cellular areas than IMCC with a gradual enhancement pattern; also, hyperenhancement on AP was an independent factor for longer survival.

Whether MTD and CA199 levels could serve as predictors of ER is currently controversial, perhaps due to the heterogeneous population of IMCC patients. Several studies reported no association between MTD and IMCC prognosis [33]. Nevertheless, other studies found that MTD was associated with ER [27, 34]. Similarly, a few previous studies found CA199 to be a preoperative predictor of prognosis [34, 35], while others excluded CA199 as an independent prognostic factor of ER [27]. In our study, MTD and CA199 levels displayed no significant correlation with ER.

Our study had some limitations. First, it was a retrospective study performed in a single center, so it lacked the heterogeneity of MR images and cohort populations found across other institutions, and selection bias may exist. We will explore the prediction model using MR images from multiple centers in the future. Second, the cohort was relatively small, owing to the incidence of IMCC and the need to obtain pathological sections for immunohistochemistry. Finally, we only developed predictive models for ER without including long-term survival analysis. Prediction of long-term survival should be included in future studies.

Conclusions

Our study showed that the combined model was the superior predictive model of ER compared with the radiomics or CRP model alone. Combining qualitative and quantitative MRI features with VEGFR might be useful for predicting ER and guiding personalized treatment in patients with IMCC.