Introduction

Combined hepatocellular carcinoma-cholangiocarcinoma (cHCC-CCA) is a rare subtype of primary liver cancer (PLC) that contains varying proportions of both hepatocytic and biliary components, with an incidence of 0.4–14.2% among PLCs [1,2,3,4]. Partly because of its rarity and histologic heterogeneity, the prognosis and treatment of cHCC-CCA have long been controversial. Appropriate identification of prognostic factors would therefore facilitate risk stratification and expedite individualized management in cHCC-CCA.

Microvascular invasion (MVI) is a well-defined risk factor in certain tumors [5,6,7,8], and the relationship between MVI and the prognosis of cHCC-CCA has been verified by several previous works [9, 10]. Therefore, some researchers, especially radiologists, have paid close attention to the preoperative prediction of MVI to better inform clinical practice. Some conventional imaging features and clinical biomarkers, such as the Liver Imaging Reporting and Data System (LI-RADS) categorization, irregular arterial peritumoral enhancement, and serum AFP elevation, have already been identified as significant risk factors for MVI in cHCC-CCA; however, they show relatively suboptimal interobserver consistency or low sensitivity [10].

Radiomics, a noninvasive tool for extracting quantitative information that is invisible to the naked eye from medical images [11], can potentially capture markers that guide clinical decisions and may be a promising method for predicting MVI in cHCC-CCA preoperatively. Moreover, significant differences in gene expression have been demonstrated between MVI-present and MVI-absent groups in HCC [12,13,14], and increasing evidence supports an intimate connection between radiomics features and specific biological portraits [15,16,17,18]. Thus, further study is warranted to investigate the biological information underlying radiomics, to validate its clinical value, and to further promote its clinical translation in cHCC-CCA.

Therefore, the purpose of the present study was to establish a robust MRI-based radiomics model for predicting MVI status of cHCC-CCA, and to investigate the underlying biologic processes of the radiomics model by analyzing RNA sequencing data.

Materials and methods

This study was approved by the Institutional Review Board and informed consent was required from every enrolled patient.

Study patients

For MRI-based radiomics model construction, a total of 158 patients with pathologically confirmed cHCC-CCA who underwent surgical resection at Zhongshan Hospital and Shanghai Geriatric Medical Center between January 2019 and December 2021 were retrospectively enrolled according to the following inclusion criteria: (1) pathologic diagnosis of cHCC-CCA based on the 2019 WHO classification; (2) preoperative contrast-enhanced MRI performed within 2 weeks; and (3) solitary lesion without intrahepatic metastasis or multiple origins. Forty patients were excluded according to the following criteria: (1) any preoperative treatment prior to MRI; (2) insufficient MR image quality; (3) incomplete pathological description data; and (4) presence of macrovascular invasion. Finally, 118 patients were included in our study and were randomly divided into a training set and a validation set at a ratio of 7:3 (Fig. 1a).
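The 7:3 random division described above can be illustrated with a minimal Python sketch. The patient identifiers, the fixed seed, and the use of simple (non-stratified) shuffling are assumptions made for illustration; the text does not state how the randomization was implemented.

```python
import random

def split_patients(patient_ids, train_fraction=0.7, seed=42):
    """Shuffle patient IDs reproducibly and split them into two disjoint sets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # seed value is an arbitrary, hypothetical choice
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # floor of 118 * 0.7 is 82
    return ids[:n_train], ids[n_train:]

# 118 hypothetical patient identifiers (not real study IDs)
patients = [f"P{i:03d}" for i in range(1, 119)]
train_ids, val_ids = split_patients(patients)
print(len(train_ids), len(val_ids))  # prints "82 36"
```

With 118 patients, flooring the training fraction gives the 82 and 36 patients reported for the training and validation sets.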

Fig. 1 Flowcharts of the patient recruitment process. a Training set and validation set. b Test set. cHCC-CCA, combined hepatocellular carcinoma-cholangiocarcinoma

For prospective biologic verification of the radiomics model, 25 patients with pathologically confirmed cHCC-CCA and RNA sequencing data who underwent surgical resection between March 2022 and December 2022 were enrolled according to the above-mentioned inclusion criteria (Fig. 1b); these patients formed the test set. This data set was also included in an unpublished paper aiming to explore specific biological portraits of each component in cHCC-CCA.

Clinicopathological data evaluation

Relevant clinical and pathological data of cHCC-CCA patients were retrieved from medical records retrospectively or prospectively, including age, gender, hepatitis virus infection, tumor size, tumor biomarkers (AFP, CEA, and CA 19-9), and MVI status (MVI + refers to a tumor nest of ≥ 50 suspended tumor cells found within the lumen of the endothelium-lined vessels which is visible only at microscopy). For the evaluation of MVI status, hepatectomy specimens from each patient were viewed microscopically by two pathologists independently.

MRI technique and conventional MR image analysis

All MR images were acquired via a 1.5-T MR scanner (uMR 560, United Imaging Healthcare). Gadobutrol (Gadavist; Bayer HealthCare) was intravenously administered at a rate of 2 mL/s for a total dose of 0.1 mmol/kg. Routine contrast-enhanced MR imaging protocol included T1-weighted in-phase and out-of-phase sequences, transverse T2-weighted fast spin-echo sequence, diffusion-weighted imaging (DWI) with b values of 0 s/mm², 50 s/mm², and 500 s/mm², pre- and post-contrast three-dimensional T1-weighted imaging at arterial phase (20–30 s), portal venous phase (70–90 s), and delayed phase (160–180 s). All detailed parameters of each sequence were previously reported [10].

The MR images were analyzed by two experienced radiologists, C.Y. and C.W.Z., with 15 years and 14 years of expertise in abdominal imaging analysis, respectively. Any discrepancies between the two radiologists were resolved by consensus through thorough discussion. The evaluation focused on several contrast-enhanced MR features, including enhancement patterns (nonrim arterial phase hyperenhancement (APHE) and rim APHE), washout patterns (nonperipheral washout and peripheral washout), enhancing capsule, delayed central enhancement, and corona enhancement. Additionally, intratumoral hemorrhage, fat deposition, diffusion restriction status (present or absent, rim or nonrim), cholangiectasis, nodule-in-nodule architecture, mosaic architecture, and hepatic capsule retraction were assessed. Targetoid appearance was defined as the presence of any of the following features: rim APHE, peripheral washout, targetoid restriction, and delayed central enhancement. The detailed definitions of these MR features can be found in Table S1.

Radiomics analysis

A radiologist (Y.Y.X., with 7 years of abdominal imaging analysis experience) performed tumor segmentation using ITK-SNAP software, and the segmentation results were checked by a senior radiologist (C.Y., with 15 years of abdominal imaging analysis experience). Volumes of interest were manually delineated on six sequences: pre-T1WI, AP, PVP, DP, T2WI-FS, and DWI with a b value of 500 s/mm². In addition, MR images of 30 randomly selected lesions were delineated again after 1 month by Y.Y.X. to assess intra-observer reproducibility, and these 30 lesions were also delineated independently by another radiologist (C.W.Z., with 14 years of abdominal imaging analysis experience) to evaluate inter-observer reproducibility.

All MR imaging voxels were isotropically resampled to 1 × 1 × 1 mm³ to eliminate acquisition-related voxel heterogeneity. Radiomic features were extracted using the uAI Portal (version: 20230715), in which the PyRadiomics tool was embedded, and the Z-score method was used to acquire normalized values of the radiomic features.
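As an illustration of Z-score normalization, the following Python sketch standardizes one radiomic feature. It is not the uAI Portal implementation: the feature values are hypothetical, and both the use of the population standard deviation and the choice to fit the parameters on reference (for example, training) data before applying them to new cases are assumptions not stated in the text.

```python
from statistics import mean, pstdev

def fit_zscore(reference_values):
    """Estimate the mean and standard deviation of one feature from reference data."""
    return mean(reference_values), pstdev(reference_values)

def apply_zscore(values, mu, sigma):
    """Standardize values with a previously fitted mean and standard deviation."""
    if sigma == 0:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(v - mu) / sigma for v in values]

# Hypothetical values of a single radiomic feature
train_feature = [10.0, 12.0, 14.0, 16.0, 18.0]
new_feature = [11.0, 20.0]

mu, sigma = fit_zscore(train_feature)
train_z = apply_zscore(train_feature, mu, sigma)  # mean 0, standard deviation 1
new_z = apply_zscore(new_feature, mu, sigma)      # same scale as the reference data
```

Reusing the fitted mean and standard deviation for new cases keeps every data set on the same scale, which is one common way to avoid information leaking from held-out data into the normalization.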

Follow-up of recurrence-free survival (RFS) and overall survival (OS)

The RFS time referred to the time interval from surgery to the date of recurrence, death or the last follow-up, while the OS time was defined as the time interval from the surgery to death, the date of the last follow-up or the study end date of July 31, 2023.
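The two time-to-event definitions can be sketched in Python as follows. The function names and example dates are hypothetical, and several details are assumptions because the text does not specify them: death without recorded recurrence is counted as an RFS event, follow-up is capped at the study end date for both endpoints, and durations are returned in days.

```python
from datetime import date

STUDY_END = date(2023, 7, 31)  # study end date stated in the text

def rfs_time(surgery, last_follow_up, recurrence=None, death=None):
    """Return (days, event) for RFS; event is 1 if recurrence or death was observed."""
    observed = [d for d in (recurrence, death) if d is not None]
    if observed:
        return (min(observed) - surgery).days, 1
    # No event: censor at the last follow-up (capped at the study end date)
    return (min(last_follow_up, STUDY_END) - surgery).days, 0

def os_time(surgery, last_follow_up, death=None):
    """Return (days, event) for OS; event is 1 if death was observed by study end."""
    if death is not None and death <= STUDY_END:
        return (death - surgery).days, 1
    return (min(last_follow_up, STUDY_END) - surgery).days, 0

# Hypothetical patient: surgery in January 2021, recurrence in November 2021, alive at last visit
surgery = date(2021, 1, 1)
rfs_days, rfs_event = rfs_time(surgery, date(2023, 9, 1), recurrence=date(2021, 11, 1))
os_days, os_event = os_time(surgery, date(2023, 9, 1))
```

Each function returns a duration together with an event indicator, which is the input format expected by Kaplan–Meier and log-rank analyses.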

Statistical analysis

Intra- and inter-observer reproducibility was evaluated using the intraclass correlation coefficient (ICC), and radiomic features with ICC ≥ 0.80 in both intra- and inter-observer settings were selected for further analysis. Spearman correlation analysis, max-relevance and min-redundancy, and least absolute shrinkage and selection operator methods were successively performed to obtain optimal radiomic features. Uni- and multivariate logistic regression analyses were used to develop a clinical-imaging model in the training set. The diagnostic performance parameters of each predictive model, such as the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, precision, and F1-score, were calculated. The DeLong test was performed to compare AUCs, and McNemar’s test was performed to compare accuracy, sensitivity, and specificity; p values were corrected for the false discovery rate (FDR) using the Benjamini–Hochberg method. The Hosmer–Lemeshow goodness-of-fit test was performed, and calibration curves were then generated. A decision curve was used to evaluate the clinical practicability.
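To make the FDR correction concrete, the following is a minimal pure-Python sketch of the Benjamini–Hochberg adjustment. The study's analyses were run in R, so this is an illustration of the procedure rather than the code that was used, and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted values
    # never increase as the raw p value decreases (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p values from four pairwise model comparisons
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

Each raw p value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values ordered consistently with the raw ones.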

Patients in the prospective RNA sequencing group were divided into low- and high-score groups according to the lower quartile of the radiomic score. We then used the DESeq2 package to identify differentially expressed genes (DEGs) with |log2 (fold change)| > 1 and FDR-adjusted p < 0.05 between the low- and high-score groups. Statistically significant DEGs were then used to identify distinct gene ontology (GO)-based biological processes. GO analysis highlights the processes most enriched in DEGs and systematically links those genes to biological processes.
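The grouping rule and the DEG thresholds can be sketched in Python as follows. The differential expression test itself was performed with DESeq2 in R and is not reproduced here; the scores, patient labels, and gene names are hypothetical. The inclusive quartile method and the assignment of a score equal to the threshold to the low-score group are assumptions.

```python
from statistics import quantiles

def split_by_lower_quartile(scores):
    """Split {patient_id: radiomic_score} into low (<= Q1) and high (> Q1) groups."""
    q1 = quantiles(scores.values(), n=4, method="inclusive")[0]
    low = [pid for pid, s in scores.items() if s <= q1]
    high = [pid for pid, s in scores.items() if s > q1]
    return low, high, q1

def filter_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Keep genes with |log2 fold change| > cutoff and adjusted p < cutoff."""
    return [
        r["gene"]
        for r in results
        if r["padj"] is not None  # skip genes without an adjusted p value
        and abs(r["log2fc"]) > lfc_cutoff
        and r["padj"] < padj_cutoff
    ]

# 25 hypothetical radiomic scores, evenly spaced for illustration
scores = {f"T{i:02d}": (i - 12) / 4 for i in range(25)}
low, high, q1 = split_by_lower_quartile(scores)

# Hypothetical differential expression results
results = [
    {"gene": "GENE_A", "log2fc": 2.3, "padj": 0.001},
    {"gene": "GENE_B", "log2fc": 0.4, "padj": 0.001},   # effect size too small
    {"gene": "GENE_C", "log2fc": -1.8, "padj": 0.03},
    {"gene": "GENE_D", "log2fc": 3.0, "padj": 0.20},    # not significant
    {"gene": "GENE_E", "log2fc": -2.5, "padj": None},   # no adjusted p value
]
degs = filter_degs(results)
```

With 25 distinct scores, this rule places seven patients in the low-score group and 18 in the high-score group, matching the group sizes reported for the test set.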

Continuous variables were compared using the Student’s t-test, ANOVA, Mann–Whitney U-test or Kruskal–Wallis H-test, and categorical variables were compared using the χ2 test or Fisher’s exact test among different groups. Survival curves were generated by the Kaplan–Meier method and compared by the log-rank test. Statistical analyses were performed using R software (version 4.1.1). p values less than 0.05 were considered statistically significant.
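For readers unfamiliar with the Kaplan–Meier method, the product-limit estimator can be sketched in a few lines of Python. The study's survival analyses were performed in R; this sketch only illustrates the estimator, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up durations (for example, in months)
    events -- 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored patients leave the risk set
    return curve

# Hypothetical follow-up times (months) and event indicators
times = [5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the number at risk until their last follow-up but never lower the survival estimate themselves, which is why the curve only steps down at observed event times.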

Results

Patient characteristics

A total of 143 patients (mean age, 56.4 ± 10.5 years; 114 men) were enrolled. Eighty-two and 36 patients were assigned to the training and validation sets, respectively, and 25 patients were enrolled in the test set. The clinicopathologic characteristics of the three data sets are presented and compared in Table 1. Patients in the validation set had lower rates of hepatitis B infection (63.9%, p = 0.015), and patients in the test set exhibited larger tumor size (5 [4–6], p = 0.029). The patient characteristics in the training and validation sets according to MVI status are summarized in Table 2. In the training set, 38 patients were assigned to the MVI + group, and compared with MVI − patients, these patients were more likely to show larger tumor size (MVI − vs MVI +: 2.6 [1.95–4] vs 3.75 [2.5–6.375], p = 0.018), more surface retraction (6.8% vs 31.6%, p = 0.004), and more intratumoral hemorrhage (0.0% vs 21.1%, p = 0.005). In the validation set, serum AFP level was the only factor that differed significantly between the MVI + and MVI − groups.

Table 1 Baseline information of patients with cHCC-CCA in the three data sets
Table 2 Patient characteristics in the training set and validation set according to the MVI status

Construction of prediction model and performance comparison

Tumor size (OR = 2.041, p = 0.015) and surface retraction (OR = 4.688, p = 0.032) were predictors of MVI status in both univariate and multivariate logistic analysis in the training set, and these two features were then used to construct a clinical-imaging model (Table 3). This model showed unsatisfactory predictive performance, with AUCs in the training set and validation set of 0.673 (0.554–0.792) and 0.630 (0.442–0.818), respectively. However, it showed a higher AUC of 0.815 (0.648–0.981) in the test set (Table 4).

Table 3 Uni/multivariate logistic regression analysis of MVI status based on clinical and MR imaging features in patients with cHCC-CCA
Table 4 Diagnostic performance of predictive models

A total of 62 significant radiomic features were extracted from six single MR sequences (Table S2), and the prediction performance of each single-sequence model in the training and validation sets is presented in Table S3 and Fig. S1. Among all single MR sequence models, the pre-T1WI, AP, and PVP models showed the most stable and best diagnostic performance, with AUCs ranging from 0.797 to 0.958 in the training set and from 0.759 to 0.794 in the validation set. To establish a robust multi-sequence radiomics model, these three single MR sequence models were combined; the combined model is hereafter referred to as the radiomics model. This radiomics model exhibited satisfactory predictive performance, with AUCs of 0.935 (0.885–0.986), 0.873 (0.760–0.986), and 0.779 (0.580–0.978) in the training set, validation set, and test set, respectively. The radiomics model also showed significantly higher AUCs than the clinical-imaging model in the training set (0.935 vs 0.673, p < 0.001) and validation set (0.873 vs 0.630, p = 0.007), but not in the test set (0.779 vs 0.815, p = 0.781). Moreover, the prediction performance of the radiomics model was not inferior to that of the clinical-imaging-radiomics model in the training set (0.935 vs 0.937, p = 0.859), validation set (0.873 vs 0.873, p > 0.999), and test set (0.779 vs 0.786, p = 0.845). The diagnostic performance of each prediction model is detailed in Table 4 and Fig. S2.

The calibration curves show the goodness of fit between the predicted and actual MVI status in the three sets (Fig. S3d–f), and the clinical-imaging model, the radiomics model, and the clinical-imaging-radiomics model all showed FDR-adjusted p values of the Hosmer–Lemeshow test higher than 0.05 in all three sets (Table S4). Decision curves of the clinical-imaging model, the radiomics model, and the clinical-imaging-radiomics model in the three sets are presented in Fig. S3a–c. Two examples of applications for MVI status prediction in cHCC-CCA using our prediction models are provided in Fig. 2.

Fig. 2 Two examples of applications for MVI status prediction in cHCC-CCA. a, b Images of a 46-year-old male with a 10.0 cm MVI-positive cHCC-CCA. Based on the radiomics model calculation, the radiomics score for this case is 0.928, and T1-weighted imaging shows homogeneous hypointensity of the lesion, with surface retraction (b). The predicted MVI status was positive. c, d Images of a 51-year-old male with a 2.5 cm MVI-negative cHCC-CCA. Based on the radiomics model calculation, the radiomics score for this case is 0.018, and T1-weighted imaging shows homogeneous hypointensity of the lesion, without surface retraction (d). The predicted MVI status was negative

Predictive value of prediction model for survival

All 118 patients in the training and validation sets were followed up after the initial hepatectomy, with a median follow-up time of 21 (range, 3–56) months. The overall recurrence rate was 50.8% (60/118) and the overall death rate was 25.4% (30/118).

The median RFS of all patients was 14 (range, 1–56) months: 10 (range, 2–55) months for MVI + patients and 18 (range, 1–56) months for MVI − patients (log-rank test, p = 0.042). When patients were stratified by the radiomics model, the median RFS was 10.5 (range, 1–56) months for predicted MVI + patients and 18 (range, 2–54) months for predicted MVI − patients, a difference that did not reach statistical significance (log-rank test, p = 0.100) (Fig. 3a, c).

Fig. 3 Survival curves according to histological MVI status and predicted MVI status by radiomics model on RFS (a, c) and OS (b, d). MVI, microvascular invasion; RFS, recurrence-free survival; OS, overall survival

The median OS for all patients was 21 (range, 3–56) months: 18 (range, 3–55) months for MVI + patients and 25 (range, 6–56) months for MVI − patients (log-rank test, p = 0.023). Similar results were found in patients stratified by the radiomics model: the median OS was 18 (range, 3–56) months for predicted MVI + patients and 25 (range, 6–56) months for predicted MVI − patients (log-rank test, p = 0.008) (Fig. 3b, d).

Biological processes associated with radiomic score

In the test set with RNA sequencing data, all 25 patients were assigned to low- and high-score groups according to the lower quartile (− 0.976) of the radiomic score, which placed seven patients in the low-score group and 18 in the high-score group. Forty-six DEGs were identified between the low-score and high-score groups (Fig. 4a). GO analysis based on these 46 DEGs showed that, of the top ten biological processes correlated with the radiomics model, five were implicated in immune response, such as production of molecular mediator of immune response and cytokine production involved in the immune response. The p values and the numbers of genes involved in these biological processes are shown in Fig. 4b.

Fig. 4 Radiogenomic analysis of biological process associated with the radiomics model. a Volcano plot showed the DEGs in the high-score group compared with the low-score group. b GO analysis revealed several biological processes associated with radiomics score. GO, gene ontology; BP, biological process

Discussion

Here, we constructed a radiomics model to noninvasively predict MVI status in patients with cHCC-CCA, with AUCs of 0.935, 0.873, and 0.779 in the training set, validation set, and test set, respectively. Importantly, our findings based on RNA sequencing data uncovered the underlying biological processes (mainly implicated in immune response) associated with the radiomics model.

This study first established a clinical-imaging model to preoperatively predict the MVI status of cHCC-CCA. The results of univariate and multivariate logistic regression analyses showed that tumor size and surface retraction were independent predictors of MVI status. Tumor size has always been a reliable predictor of MVI status [19,20,21]: based on the hypothesis of tumor progression [22, 23], the histological grade and invasiveness of tumors increase with increasing tumor size, and thus the risk of MVI also increases with increasing tumor size. Zhou et al [24] also demonstrated that tumor size is an independent predictor of MVI status in cHCC-CCA. In addition, Liao et al [25] explored the application value of clinical and CT imaging features in predicting MVI status in patients with cHCC-CCA, and found that surface retraction is an independent predictor of MVI, which is consistent with our research results. However, the predictive performance of the clinical-imaging model established in this study was not ideal. Therefore, we further established a radiomics model and a clinical-imaging-radiomics model, and compared their predictive performances. The results showed that the predictive performance of the radiomics model was significantly better than that of the clinical-imaging model and was not inferior to that of the clinical-imaging-radiomics model. Based on this, we further explored the prognosis and potential biological significance of the radiomics model.

In recent years, a growing body of evidence has demonstrated that radiomics analysis holds the potential to address diagnostic ambiguity, monitor response to adjuvant therapies, enhance prognostic models, and even visualize the connection between histologic and biologic features of tumors [26,27,28,29,30,31]. In the current study, we performed a canonical selection of radiomic features for the MVI status prediction model, and then carried out multiple validations to verify its robustness. In the test set, the AUC of the clinical-imaging model did not differ significantly from that of the radiomics model, whereas the AUCs of the radiomics model were significantly higher than those of the clinical-imaging model in the training and validation sets; the relatively small sample size of the test set may account for this discrepancy. Regardless, the radiomics model we established showed notable and reproducible performance in predicting MVI status in cHCC-CCA, suggesting that it may generalize to other patient samples.

The prognostic aspects of our radiomics model were also investigated. Histologic MVI of cHCC-CCA has been reported to be a significant prognostic factor in many studies [9, 10, 32,33,34], which was also verified by our study. In addition, the radiomics model constructed in our study was not only capable of accurately predicting MVI status but was also associated with OS in cHCC-CCA patients. Therefore, our work goes further by showing a radiomics link among MR imaging, MVI status, and clinical outcomes after surgical resection, shedding light on risk stratification and individualized management for patients with cHCC-CCA, with considerable clinical translational potential.

One of the primary challenges in radiomic research is the obscurity regarding the underlying biological explanations of radiomic features. Although a fundamental hypothesis behind radiomics is the association between radiomics features and gene profile, no studies have directly investigated this link in cHCC-CCA. In this study, radiogenomic analysis revealed that the radiomics features were associated with several biological processes, most of which were involved in regulating the immune response. The tumor immune microenvironment plays a crucial role in tumor progression and prognosis. A previous study by Nguyen et al [35] determined that, compared with the immune-low subtype cHCC-CCA, the immune-high subtype responded better to immunotherapy and exhibited improved OS; Zheng et al [36] also constructed an immune score based on the densities of immune cells, which holds promise as a valuable prognostic predictor for patients with cHCC-CCA. As the correlation between radiomic score and biological processes involved in regulating immune response was discovered in the present study, the utilization of the radiomics approach to characterize the MVI status will offer valuable insights for selecting patients with cHCC-CCA who may have up- or down-regulated genes associated with regulating immune response, and who may benefit from immunotherapy, thus guiding immunotherapy strategies and risk stratification in cHCC-CCA.

This study has limitations. First, the prediction models were constructed based on retrospectively gathered data, in which selection bias was inevitable. Second, to simplify model construction in the current study, we only enrolled patients with surgically resected single cHCC-CCA, which limits the generalizability of the model. Third, several studies focusing on the preoperative prediction of MVI in cHCC-CCA indicated that arterial peritumoral enhancement was a significant predictor [10, 37], so radiomics features extracted from the peritumoral area should be incorporated in future work. Finally, the sample size in the present study, especially in the test set, was relatively small, so a multicenter study with a large sample size is warranted to obtain more convincing results and to enable more comprehensive and in-depth transcriptomic analysis.

In conclusion, we established a robust MRI-based radiomics model for predicting MVI status in cHCC-CCA, which demonstrated good diagnostic performance and potential prognostic value. Additionally, the study revealed potential biological processes that regulate immune response underlying the radiomics model, which will offer valuable insights for guiding immunotherapy strategies and risk stratification in cHCC-CCA.