Introduction

Nasopharyngeal carcinoma (NPC) is a prevalent type of head and neck cancer arising in the nasopharynx. Its geographical distribution is remarkably uneven, with a high incidence in Southeast Asia (Bray et al. 2018). Although survival rates have improved significantly with intensity-modulated radiation therapy (IMRT) and concurrent chemotherapy, a substantial proportion of NPC patients (20–30%) still succumb to locoregional relapse and distant metastasis (Pan et al. 2016). Thus, early prognostic prediction of NPC is critical for guiding treatment decisions and enabling personalized precision therapy.

Currently, therapeutic decisions and prognostic assessments of NPC rely mainly on the tumor-node-metastasis (TNM) staging system (Pan et al. 2016). However, patients with the same clinical stage can have highly variable treatment outcomes (Chen et al. 2017), indicating that the TNM staging system may not provide sufficient precision. Additionally, several prognostic factors have been identified to distinguish NPC patients with poor clinical outcomes. These include the plasma load of Epstein-Barr virus DNA (You et al. 2019), the expression of programmed cell death protein 1/programmed cell death ligand 1 (Ono et al. 2018), and the C-reactive protein/albumin ratio (He et al. 2016), which may serve as independent prognostic indicators for NPC. However, their clinical use remains limited. Therefore, there is a need for improved and effective prognostic markers to stratify patients with NPC before treatment and refine the selection of therapeutic regimens.

Radiomics transforms macroscopic medical images into quantitative information by extracting features that are not visible to the naked eye (Hatt et al. 2017; Kuo and Jamshidi 2014). This technique has the potential to provide crucial information regarding tumor phenotypes. Radiomics features hold promise as potential biomarkers for predicting treatment efficacy and evaluating prognosis in patients with NPC (Spadarella et al. 2021). A recent systematic review highlighted several studies aimed at predicting the prognosis of NPC based on magnetic resonance imaging (MRI) radiomics (Lee et al. 2022). However, the results of these studies varied considerably in the efficacy of radiomics for predicting progression-free survival (PFS) in NPC. Although radiomics shows great potential for prognostic prediction of NPC, its performance and robustness require further improvement.

The emergence of digital pathology facilitated by digital pathological section scanners and whole-slide image (WSI) has revolutionized the pathology field. By digitizing pathological section information, digital pathology can provide valuable insights into the molecular characteristics and genetic patterns that correlate with therapeutic response and tumor prognosis. Recent studies have applied artificial intelligence to mine digital pathology data to aid risk stratification, genotype prediction, and outcome assessment of various tumors (Kather et al. 2019; Pantanowitz et al. 2020; Skrede et al. 2020). To date, a few studies have combined pathological microfeatures with deep learning to predict survival risk, and the results have demonstrated the reliability of pathological models in predicting NPC prognosis (Liu et al. 2020). However, further research is warranted to explore the full potential of digital pathology and artificial intelligence for improving the prognostic prediction of NPC.

Multi-omics research, encompassing genomics, proteomics, pathomics, radiomics, and other fields, has emerged as a prominent area of study. Researchers are actively exploring the relationship between multi-omics data and diseases. Radiopathomics, a significant branch of the multi-omics approach, bridges radiomics and pathomics data, providing a multidimensional perspective for investigating tumor heterogeneity across macroscopic and microscopic scales. Radiopathomics has shown promise in predicting the prognosis of solid tumors (Alvarez-Jimenez et al. 2020; Feng et al. 2022; Saednia et al. 2022; Shao et al. 2020), demonstrating that digital pathology can complement tumor heterogeneity information and improve the prognostic performance of existing radiomics models. However, the application of radiopathomics to NPC remains largely unexplored. The aim of this study was to construct and validate an integrative radiopathomics model to predict the prognosis of patients with NPC.

Materials and methods

Compliance with ethical standards

The local institutional review board approved this retrospective study (approval number: 2014-8), and the requirement for written informed consent was waived.

Patients

We reviewed and screened eligible patients diagnosed at our center between January 2014 and December 2019. The inclusion criteria were as follows: (1) newly diagnosed, histologically confirmed non-metastatic NPC; (2) availability of a pretreatment MRI scan; (3) biopsy under nasopharyngoscopy at our hospital with complete hematoxylin-eosin (H&E)-stained slides; and (4) regular post-treatment follow-up and complete clinical data. The exclusion criteria were: (1) incomplete pretreatment head and neck MRI; (2) poor-quality H&E-stained slides, including excessively light or dark staining; (3) other malignancies or receipt of other anti-cancer therapies; or (4) distant metastasis before treatment. The final study cohort consisted of 357 eligible patients (Fig. 1).

Fig. 1

Flowchart of the patient recruitment process. Abbreviations: NPC = nasopharyngeal carcinoma, MRI = magnetic resonance imaging

Treatment, follow-up, and clinical endpoint

All NPC patients were staged based on the 7th or 8th edition of the American Joint Committee on Cancer TNM classification system. As per the guidelines for NPC management, patients with stages I–II (T1N0, T2N0) disease (n = 5) received radiotherapy alone, whereas those with stage II (T1-2N1, T3N0) disease (n = 18) received concurrent chemoradiotherapy (CCRT). Patients with stages III–IVA disease (n = 334) received CCRT combined with adjuvant chemotherapy (AC) or induction chemotherapy (IC). The IC and AC regimens consisted of cisplatin-based therapies administered every 3 weeks for 2–4 cycles. All patients received radiotherapy at a total dose of 70–76 Gy, delivered in 30–33 fractions. During radiotherapy, the CCRT regimen involved the use of cisplatin for 3–6 cycles.

All patients underwent regular follow-up according to the National Comprehensive Cancer Network guidelines (Pfister et al. 2020). MRI examinations were performed every 3 months during the first year, every 6 months during the second year, and annually thereafter. The primary endpoint was PFS, defined as the time from the initiation of therapy to disease progression (locoregional recurrence or distant metastasis), death from any cause, or the last follow-up visit (censored). Disease progression was confirmed through pathological and/or imaging examinations, such as MRI, computed tomography (CT), and positron emission tomography-CT.
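The endpoint definition above can be expressed as a small helper. This is a minimal sketch rather than the procedure used in the study: the function name `pfs_months`, its arguments, and the average month length used for unit conversion are all assumptions introduced for illustration.

```python
from datetime import date
from typing import Optional, Tuple

# Average month length in days; an assumed convention for converting days to months.
DAYS_PER_MONTH = 30.4375


def pfs_months(start: date,
               last_follow_up: date,
               progression: Optional[date] = None,
               death: Optional[date] = None) -> Tuple[float, int]:
    """Return (time in months, event flag) for one patient.

    The event flag is 1 if progression or death was observed (whichever came
    first) and 0 if the patient was censored at the last follow-up visit.
    """
    observed = [d for d in (progression, death) if d is not None]
    if observed:
        end, event = min(observed), 1
    else:
        end, event = last_follow_up, 0
    return (end - start).days / DAYS_PER_MONTH, event
```

A patient with no recorded progression or death contributes the time to the last follow-up with an event flag of 0, which is the form expected by standard survival analyses.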

MRI protocols

MRI was performed using a 1.5-Tesla scanner with a combined head and neck coil. The MRI protocols were as follows: (1) axial T1-weighted spin-echo images (T1WI): repetition time (TR)/echo time (TE), 580/7.8 ms; slice thickness, 5 mm; slice number, 36; slice spacing, 1 mm; and number of excitations (NEX), 2; (2) axial T2-weighted spin-echo images with fat suppression (T2WI): TR/TE, 6289/85 ms; slice thickness, 5 mm; slice number, 36; slice spacing, 1 mm; and NEX, 2; and (3) axial contrast-enhanced T1-weighted spin-echo images (CET1-w): TR/TE, 500/8 ms; field of view, 22 × 22 cm; NEX, 2; slice thickness, 4 mm; and interslice gap, 0.8 mm. To acquire axial CET1-w images, gadodiamide was administered intravenously at a dose of 0.1 mmol/kg of body weight and a rate of 3.5 mL/s.

Radiomics signature construction and validation

Given the intra- and inter-patient variation in MRI images, preprocessing is crucial. MRI data were preprocessed using the AK software, which has obtained registration and approval. The preprocessing steps included resampling, skull stripping, and intensity standardization. During resampling, the image resolution was set to 1 × 1 × 1 mm, and the slice thickness of all MRI images was adjusted to 1 mm using linear interpolation. Finally, the gray levels were rescaled to a uniform range of 0–255.
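Two of the preprocessing steps can be illustrated in a few lines. The sketch below is not the AK software implementation, whose internals are not described here. It assumes min-max scaling for the gray-level adjustment and shows linear interpolation on a one-dimensional intensity profile along the slice axis; both function names and the spacing parameters are hypothetical.

```python
from typing import List, Sequence


def rescale_gray_levels(voxels: Sequence[float]) -> List[int]:
    """Min-max rescale intensities to integer gray levels in [0, 255] (assumed scheme)."""
    lo, hi = min(voxels), max(voxels)
    if hi == lo:  # constant image: avoid division by zero
        return [0 for _ in voxels]
    return [int(round((v - lo) * 255.0 / (hi - lo))) for v in voxels]


def resample_profile(values: Sequence[float],
                     spacing_mm: float,
                     new_spacing_mm: float = 1.0) -> List[float]:
    """Linearly interpolate a 1-D intensity profile onto a finer, regular grid."""
    if len(values) < 2:
        return list(values)
    extent = (len(values) - 1) * spacing_mm        # distance from first to last sample
    n_new = int(extent // new_spacing_mm) + 1      # new samples that fit in that extent
    out = []
    for j in range(n_new):
        pos = j * new_spacing_mm / spacing_mm      # position in units of original samples
        i = min(int(pos), len(values) - 2)         # index of the left neighbour
        frac = pos - i                             # fractional distance to the right neighbour
        out.append(values[i] * (1.0 - frac) + values[i + 1] * frac)
    return out
```

For example, `resample_profile([0.0, 10.0], spacing_mm=5.0)` yields six samples spaced 1 mm apart between the two original values. A full pipeline would apply the same interpolation in three dimensions.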

Two radiologists, Radiologist A and Radiologist B, with 13 and 14 years of experience in head and neck MRI, respectively, performed the segmentation using the ITK-SNAP software. Blinded to the prognostic outcomes of the NPC patients, the radiologists independently outlined the boundaries of the primary NPC lesion slice by slice. The regions of interest (ROIs) were then automatically converted into volumes of interest (VOIs) and saved in NIfTI (.nii) format.

Intraclass correlation coefficients (ICCs) were used to evaluate the intra- and inter-observer reproducibility of the segmentation. Both radiologists independently segmented the MRI images of 50 randomly selected NPC patients. One month later, Radiologist A repeated the same process. Good consistency was obtained, with ICCs > 0.75 (details of the results are provided below). As a result, Radiologist A segmented all the remaining images.
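The ICC form used is not specified in the text. As an illustration, the sketch below assumes ICC(2,1), the two-way random-effects, absolute-agreement, single-measurement form, computed for one feature from a subjects-by-raters table. The function name `icc2_1` is hypothetical.

```python
from typing import Sequence


def icc2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for a table where ratings[i][j] is subject i measured by rater j.

    Two-way random effects, absolute agreement, single measurement
    (an assumed choice; other ICC forms use different denominators).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

For inter-observer agreement the columns would hold the two radiologists' values of a feature. For intra-observer agreement they would hold Radiologist A's two sessions. Features with an ICC above the 0.75 threshold would then be treated as reproducible.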

A total of 792 radiomic features were obtained from T2WI and CET1-w images, with 396 features extracted from each MRI sequence using the AK software. These features were grouped into six categories: histogram, gray-level co-occurrence matrix, gray-level size zone matrix, Haralick, form factor, and run-length matrix.

To identify the features most predictive of PFS, two feature selection methods were employed: minimum redundancy maximum relevance (mRMR) and the least absolute shrinkage and selection operator (LASSO). First, mRMR was used to eliminate redundant and irrelevant features by ranking them with a heuristic relevance-redundancy score. Subsequently, LASSO was applied to choose an optimized subset of features, with 10-fold cross-validation used to prevent overfitting. The radiomics signature (Rad-score) was then developed through LASSO-Cox regression analysis as a linear combination of the selected features weighted by their respective coefficients for each patient.
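Once the coefficients have been estimated, computing the Rad-score for a patient is a weighted sum. The sketch below shows only this final step. The feature names and coefficient values in the usage note are hypothetical placeholders, not the values reported in the Supplementary Material.

```python
from typing import Mapping


def rad_score(features: Mapping[str, float],
              coefficients: Mapping[str, float]) -> float:
    """Linear predictor: selected feature values weighted by their coefficients.

    Features absent from `coefficients` (those not retained by the selection
    step) are ignored.
    """
    return sum(coef * features[name] for name, coef in coefficients.items())
```

With hypothetical coefficients `{"feature_a": 0.5, "feature_b": -0.25}` and a patient whose values are `{"feature_a": 2.0, "feature_b": 4.0}`, the score is 0.5 × 2.0 − 0.25 × 4.0 = 0.0. A higher score corresponds to a higher predicted hazard under the Cox model.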

Pathomics signature construction and validation

All H&E-stained slides of the enrolled patients with NPC were digitized into WSIs using a Pannoramic MIDI II digital slide scanner at a magnification of ×40, with a resolution of 0.14 μm/pixel, and saved in SVS format.

The Swin Transformer, a deep learning architecture, was employed as a deep feature extractor. The process involved the following steps: (1) the tissue regions were segmented using morphological methods, including Gaussian blur, the Otsu algorithm, and morphological closing; (2) a sliding window (224 × 224 pixels) traversed the entire WSI from top left to bottom right to obtain tissue-containing patches, each comprising ≥ 50% pathological tissue; (3) each patch was fed into the Swin Transformer, which had been pretrained on the ImageNet dataset and had its classification head removed, to extract 768 features for survival analysis of PFS; and (4) the features were averaged across all patches for each patient. The neural network was implemented using the PyTorch 1.4.1 library in Python 3.7.0 (https://pytorch.org).
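Steps (2) and (4) above can be sketched independently of the network. The example below assumes a binary tissue mask (1 = tissue) has already been produced by step (1), and it assumes non-overlapping windows, since the stride is not stated. The function names are hypothetical, and the patch-level feature vectors would in practice come from the pretrained network.

```python
from typing import List, Sequence, Tuple

PATCH = 224        # window size in pixels, as stated in the text
MIN_TISSUE = 0.5   # minimum tissue fraction per patch, as stated in the text


def tissue_patches(mask: Sequence[Sequence[int]],
                   patch: int = PATCH,
                   min_tissue: float = MIN_TISSUE) -> List[Tuple[int, int]]:
    """Return (row, col) origins of windows whose tissue fraction is high enough.

    Windows are scanned from top left to bottom right with a stride equal to
    the patch size (non-overlapping; an assumption).
    """
    height, width = len(mask), len(mask[0])
    origins = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            tissue = sum(sum(row[left:left + patch]) for row in mask[top:top + patch])
            if tissue / (patch * patch) >= min_tissue:
                origins.append((top, left))
    return origins


def mean_pool(patch_features: Sequence[Sequence[float]]) -> List[float]:
    """Average patch-level feature vectors into one patient-level vector."""
    n = len(patch_features)
    return [sum(column) / n for column in zip(*patch_features)]
```

Mean pooling gives every retained patch equal weight, so the patient-level vector has the same length as a patch-level vector (768 in this study) regardless of how many patches a slide yields.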

Radiopathomics model construction and validation

Univariate and multivariate Cox regression analyses were carried out to identify independent clinical prognostic factors from all clinical parameters, and a clinical model was constructed from these factors. Subsequently, a radiopathomics nomogram was constructed to predict PFS by incorporating the radiomics signature, pathomics signature, and independent clinical factors utilizing logistic regression analysis. The prognostic performance of the radiomics, pathomics, clinical, and radiopathomics models was quantitatively evaluated using the concordance index (C-index). Additionally, a calibration curve was produced to evaluate the calibration of the radiopathomics model, and its goodness-of-fit was assessed using the Hosmer-Lemeshow test. Furthermore, risk stratification analysis was performed to investigate the discriminability of the radiomics, pathomics, clinical, and radiopathomics models. The workflow of this study is illustrated in Fig. 2.
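Risk stratification of this kind can be sketched as a cutoff on the model's risk score followed by a survival estimate for each group. The example below is illustrative only. It assumes the median score as the default cutoff, which is not stated in the text, and uses the Kaplan-Meier product-limit estimator. The function names are hypothetical.

```python
from statistics import median
from typing import List, Optional, Sequence, Tuple


def split_by_cutoff(scores: Sequence[float],
                    cutoff: Optional[float] = None) -> List[bool]:
    """Flag each patient as high-risk (True) or low-risk (False)."""
    if cutoff is None:
        cutoff = median(scores)  # assumed default; the study's threshold may differ
    return [s > cutoff for s in scores]


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(event time, survival probability)] for one group of patients."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        n_leaving = sum(1 for ti in times if ti == t)   # events plus censored at t
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve
```

Applying `kaplan_meier` separately to the high- and low-risk groups gives the two curves that a log-rank test would then compare.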

Fig. 2

Workflow of this study. Note: Pretreatment MRI (T2WI, CET1-w) and WSI images were retrospectively collected and segmented for feature extraction. After feature selection and modelling, two signatures (radiomics and pathomics) were generated and combined with the independent clinical factors to construct the radiopathomics model. Abbreviations: T2WI = T2-weighted imaging, CET1-w = contrast-enhanced T1-weighted image, WSI = whole-slide image

Statistical analysis

All statistical analyses were conducted with R software (version 3.3.3, http://www.R-project.org). The C-index was used to evaluate the prognostic performance of the radiomics, pathomics, clinical, and radiopathomics models. The compareC package was used to compare the C-indices of the different models. All tests were two-tailed, and statistical significance was defined as a P-value < 0.05.
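For reference, Harrell's C-index can be computed directly from its definition. The sketch below is in Python for illustration rather than the R code used in this study, and the function name `concordance_index` is hypothetical. It counts a pair as comparable when the patient with the shorter time had an observed event, and it scores tied risk values as half-concordant.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C: fraction of comparable pairs ranked correctly by risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is anchored on a patient with an observed event
        for j in range(n):
            if times[i] < times[j]:          # patient i progressed before j's time
                comparable += 1
                if risks[i] > risks[j]:      # higher risk, shorter PFS: concordant
                    concordant += 1.0
                elif risks[i] == risks[j]:   # tied risk scores count as one half
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the four models are compared.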

Results

Baseline patient characteristics

Table 1 summarizes the demographic factors and clinical statuses of the patients in both the training and validation cohorts. The median follow-up intervals were 60.02 months (interquartile range [IQR]: 45.25–72.33) and 54.23 months (IQR: 43.33–67.60) for the training and validation cohorts, respectively. No significant differences were observed in any clinical factor between the two cohorts.

Table 1 Characteristics of patients in the training and validation sets

Single-modality models

The intra-reader ICC for measurements performed by Radiologist A ranged from 0.769 to 0.926. Additionally, the inter-reader ICC between the two radiologists ranged from 0.834 to 0.915.

Following feature selection with the mRMR and LASSO algorithms, 7 features derived from T2WI (n = 1) and CET1-w (n = 6) were retained to construct the final radiomics model. The detailed feature selection process is depicted in Fig. S1, and the Rad-score formula can be found in the Supplementary Material. The radiomics model exhibited poor prognostic performance in both the training and validation cohorts, with C-indices of 0.619 (95% confidence interval [CI]: 0.553–0.706) and 0.640 (95% CI: 0.571–0.771), respectively (Table 2).

Table 2 Prognostic performance of four models in NPC patients

In contrast, the pathomics signature achieved good C-indices of 0.732 (95% CI: 0.662–0.802) and 0.744 (95% CI: 0.655–0.834) in the training and validation cohorts, respectively (Table 2).

In the clinical model, T stage (hazard ratio [HR]: 1.512, P = 0.013), N stage (HR: 1.859, P = 0.013), family history of cancer (HR: 2.173, P = 0.033), and lactate dehydrogenase (LDH) level (HR: 1.007, P < 0.001) were significantly associated with PFS. However, the clinical model yielded poor C-indices of 0.655 (95% CI: 0.581–0.728) and 0.576 (95% CI: 0.484–0.627) in the training and validation cohorts, respectively (Table 2).

Radiopathomics model

The radiopathomics nomogram, which integrated the independent clinical predictors, the radiomics signature, and the pathomics signature, is shown in Fig. 3. Among the four models, the radiopathomics model exhibited the highest predictive performance in the training set, with a C-index of 0.791 (95% CI: 0.724–0.871). Similarly, in the validation set, the combined model also demonstrated the highest predictive performance (C-index 0.785, 95% CI: 0.716–0.875) (Table 2). Significant differences were observed in the C-indices between the radiopathomics model and the radiomics (P < 0.001), pathomics (P = 0.036), and clinical models (P < 0.001).

Fig. 3

Radiopathomics model integrating radiomics signature, pathomics signature, and clinical factors for the prediction of PFS in patients with NPC. Abbreviations: PFS = progression-free survival, NPC = nasopharyngeal carcinoma, LDH = lactate dehydrogenase

The calibration curves of the radiopathomics model showed excellent agreement between predicted and observed outcomes in both the training (Fig. 4a) and validation sets (Fig. 4b). The Hosmer–Lemeshow test indicated no significant lack of fit in either the training or the validation set (P > 0.05). Furthermore, decision curve analysis indicated that the integrative model exhibited the highest performance compared to the other three models (Fig. 4c).

Fig. 4

Calibration curve of radiopathomics model in training set (a) and validation set (b) and decision curve analysis (c) of the radiopathomics, radiomics, pathomics and clinical models for the prediction of PFS in NPC patients. Abbreviations: PFS = progression-free survival, NPC = nasopharyngeal carcinoma

When the constructed models were used to categorize patients into high- and low-risk groups with a threshold value derived from their PFS risk score, the radiomics, pathomics, and radiopathomics models successfully stratified patients into two groups with distinct differences in PFS in both the training and validation sets (all P < 0.001) (Fig. 5a, b, d), with the low-risk group exhibiting a longer PFS than the high-risk group. However, the clinical model stratified patients into low- and high-risk groups in the training set (P < 0.001) but not in the validation set (P = 0.065) (Fig. 5c).

Fig. 5

Risk stratification analysis of the radiomics (a), pathomics (b), clinical (c), and radiopathomics (d) models for predicting PFS in the training and validation sets in NPC patients. Abbreviations: PFS = progression-free survival, NPC = nasopharyngeal carcinoma

Discussion

In clinical practice, obtaining comprehensive prognostic information for treatment optimization based on single radiographic or clinical data can be challenging. Here, we addressed this issue by constructing a radiopathomics model that integrates a quantitative radiomics signature, pathomics signature, and clinical factors to predict NPC prognosis. The radiopathomics model demonstrated superior prediction performance compared with any of the single-modality models.

Pathological microscopic features can provide valuable insights into tissue heterogeneity and have been utilized in constructing prognostic models for various tumors, such as lung cancer, prostate cancer, and NPC (Lambrechts et al. 2018; Pantanowitz et al. 2020; Travis et al. 2013; Wang et al. 2018). In our study, we observed that quantitative pathological information derived from WSI was valuable for enriching lesion descriptors to predict NPC prognosis. Among the three single-modality models, the pathomics model exhibited the best prognostic performance for PFS in both the training and validation sets. This finding is consistent with a previous study that reported a similar predictive performance (C-index: 0.723) for PFS in NPC using a pathological model constructed with handcrafted features extracted from WSI (Liu et al. 2020). In contrast, the present study used an end-to-end approach to develop the pathomics model, which automatically extracted representative information and avoided the potential issues of manual segmentation, e.g., incompleteness and instability. Furthermore, we found that the radiopathomics model outperformed the radiomics and pathomics models alone in predicting PFS. This result further supports the essential role of the pathomics signature in predicting PFS and its ability to enhance the predictive power of the radiomics model. These findings align with a previous report involving 220 patients with NPC (Zhang et al. 2020).

We constructed a prognostic radiomics model using seven MRI-based features to predict PFS in patients with NPC. However, the radiomics model performed poorly compared with the findings of a previous meta-analysis, which reported a mean C-index of 0.76 (Lee et al. 2022). The heterogeneity observed across the studies may be attributed to several factors. First, the number of selected features has been reported as a significant factor influencing C-index performance (Chu et al. 2019; Skrede et al. 2020). Studies with a larger number of features tend to yield higher C-indices; however, this may lead to overfitting and reduced reproducibility on external datasets. We selected a limited number of features to maintain a balance between predictive performance and generalizability. Second, our study included patients with NPC at various stages (T1–T4), whereas previous studies focused only on advanced stages. This broad inclusion may have influenced the model’s predictive performance. A radiomics model based on a cohort of patients at stage T1–2 or T3–4 had a better prognostic performance than that based on a dataset of patients at all stages (Wu et al. 2021). Nevertheless, the broad scope of the model reflects real-world clinical situations and enhances its ability to be generalized to different patient populations. Interestingly, when we incorporated radiomics features into the pathomics model, the predictive performance improved. This finding suggests that the combination of radiomics and pathomics signatures adds complementary value to prognosis prediction.

Although clinical factors and biochemical indicators are related to survival, their individual prognostic values are limited. In our study, T stage, N stage, family history of cancer, and LDH levels were identified as independent factors, yet the clinical model based solely on these factors displayed poor predictive performance. However, when these clinical factors were integrated into the radiomics, pathomics, or radiopathomics models, the combined models showed significantly improved predictive abilities compared with the original models. Our results align with those of prior studies (Bologna et al. 2020; Yang et al. 2019; Zhang et al. 2017), emphasizing the complementary nature of all prognostic information, which should be thoroughly explored for a comprehensive understanding of patient outcomes.

Our study has a few limitations. First, all patients were recruited from a single center, necessitating multicenter validation using independent datasets and prospective studies to confirm the model’s robustness and generalizability. Second, NPC biopsy specimens often contain abundant benign lymphoid tissue interspersed with atypical squamous cells, so analyzing the entire slide may have influenced the accuracy of the final results. Future investigations should explore manual labeling and delineation of cellular components to better elucidate the value of digital pathology for the prognostic evaluation of NPC. Third, the sample size in our study may require further expansion to enhance the statistical power and reliability of our findings.

Conclusion

In summary, we have successfully constructed and validated an integrative radiopathomics model that combines radiological information from pretreatment MRI and pathological information from WSI for predicting NPC prognosis. The radiopathomics model demonstrated superior discrimination of PFS and effectively stratified individual patients into distinct prognostic groups. Due to its non-invasive and effective nature, the radiopathomics model holds promise for prognosis prediction and risk stratification of NPC, facilitating individualized clinical decision-making.