Introduction

Up to 50% of patients who undergo radical prostatectomy (RPE) as initial treatment for localized prostate cancer (PCa) experience biochemical recurrence (BCR) within 5 years after treatment [1]. Adverse pathological features are associated with higher rates of BCR [2]. Salvage radiotherapy (sRT) with or without androgen deprivation therapy (ADT) is the only curative treatment option for these patients and should be initiated at low prostate-specific antigen (PSA) levels [3]. Since response rates are heterogeneous [4], tools for improved risk stratification are warranted to identify patients who are at higher or lower risk of relapse after sRT and thus might be candidates for therapy intensification or de-intensification.

Positron-emission tomography targeting prostate-specific membrane antigen (PSMA-PET) combined with computed tomography (CT) has significantly improved detection rates of local (LR) and nodal recurrence (NR) after RPE [5] and altered treatment management [6], but prospective evidence of putative beneficial effects on outcomes is pending.

Despite the great diagnostic value of modern imaging technologies, the computer-based extraction and analysis of image features (radiomics) offers new opportunities for improved image analysis and can provide additional information about tumor characteristics, as shown for multiple cancer entities [7,8,9,10]. Various studies have reported improvements in PCa detection and in the prediction of International Society of Urological Pathology (ISUP) grade, extracapsular extension (ECE), and BCR in primary PCa patients, but no data exist in the setting of sRT [11].

In order to identify novel markers for personalized risk stratification, this retrospective multicenter study aims to evaluate the impact of PSMA-PET-guided CT-based radiomic features (RF) derived from PSMA-PET/CT scans on BCR-free survival (BRFS) in patients who receive sRT due to recurrent or persistent PCa after RPE.

Material and methods

Patients and treatment

This retrospective multicenter study pooled patients from three high-volume centers in Germany (University Medical Centre Freiburg (UKF), Klinikum Rechts der Isar Technical University Munich (TUM), University Hospital of the Ludwig-Maximilians-University Munich (LMU)). Each center received institutional review board approval for this study (Freiburg No.: 15/18; TUM: 466/16 S; LMU: 17-765). Written informed consent was waived due to the retrospective nature of the study.

Data were collected from patients who underwent radical surgery, received 68Ga-PSMA11-PET/CT due to PSA persistence (PSA after surgery ≥ 0.1 ng/ml) or recurrence (PSA ≥ 0.2 ng/ml after the post-surgical nadir), and were subsequently treated with PSMA-PET-guided sRT. From the total cohort, only patients who received a contrast-enhanced CT were included in this analysis. Patients with distant metastases (lymph nodes above the iliac bifurcation, bone metastases, or visceral metastases) on PSMA-PET/CT and patients who received ADT prior to the PSMA-PET/CT scan were excluded. Treatment decisions were taken locally at the discretion of the treating physicians according to standards of care at the time of treatment [3] and based on PSMA-PET/CT findings. See Supplemental Methods Table S1 for details on salvage RT concepts for each center. ADT was administered at the discretion of the treating physician. In total, 99 patients with PET-positive local recurrence treated with sRT between 2014 and 2020 met the inclusion criteria.

Data collection and follow-up

The following clinical data were collected: age at sRT, International Society of Urological Pathology (ISUP) grade, pathological T-stage and N-stage, initial PSA and PSA prior to sRT, presence of nodal recurrence, administration and duration of ADT, and sRT doses. Follow-up assessments included serum PSA testing at regular intervals based on institutional clinical standards.

68Ga-PSMA11 PET/CT

68Ga-PSMA11 was synthesized according to good manufacturing practice in all centers and in accordance with international procedural guidelines [12]. Acquisition protocols and scanner types are provided in the Supplemental Methods.

All scanners fulfilled the requirements indicated in the European Association of Nuclear Medicine (EANM) imaging guidelines and obtained EANM Research Ltd. (EARL1) accreditation during acquisition.

See [13] for details on PSMA image acquisition and reconstruction algorithms.

All PSMA-PET/CT images were reviewed locally prior to data sharing according to international reporting guidelines [14] by two nuclear medicine physicians with experience in PCa imaging. Disagreements were resolved by consensus.

Segmentation

Further image processing was performed using 3D Slicer v4.10.0 [23]. Two separate segmentation strategies were followed. First, considering the local nuclear medicine report, PSMA-PET-positive PCa lesions were manually contoured within the CT image by one reader (SS) with >3 years’ experience in PSMA-PET/CT segmentation, guided by the PSMA-PET signal and using validated segmentation approach levels [15]. Second, 20% of the maximum standardized uptake value (SUVmax) of the lesions was used as a threshold for PET-based semi-automatic segmentation.
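
The threshold-based strategy can be illustrated with a minimal Python sketch. It is not the 3D Slicer workflow used in this study: the function name, the flat list of SUV values, and the choice to include voxels exactly at the threshold (>= rather than >) are assumptions made for illustration only.

```python
def threshold_mask(suv_values, fraction=0.20):
    """Binary mask of voxels with SUV >= fraction * SUVmax.

    `suv_values` is a flat list of SUV values around one lesion
    (hypothetical input; a real pipeline would use a 3D image array).
    Including voxels exactly at the threshold is an assumption.
    """
    if not suv_values:
        raise ValueError("empty SUV list")
    cutoff = fraction * max(suv_values)
    return [1 if v >= cutoff else 0 for v in suv_values]


# Toy lesion: SUVmax is 10.0, so the 20% threshold is 2.0.
suv = [0.5, 1.0, 2.5, 10.0, 4.0, 1.9]
mask = threshold_mask(suv)  # -> [0, 0, 1, 1, 1, 0]
```

Because the threshold scales with SUVmax, the resulting volume depends on the peak uptake of each lesion, which is one reason why features were later checked for robustness against the segmentation type.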

Radiomic feature extraction

Radiomic feature extraction and preprocessing were performed using the pyradiomics package (version 3.0.1) in Python (version 3.7.9) [16]. For preprocessing, a fixed bin width of 5 Hounsfield units (HU) was used for image discretization [17]. Isotropic resampling to a voxel size of 1 × 1 × 1 mm was performed using B-spline interpolation. Shape, first-order, and texture features were computed from the original image according to the “image biomarker standardization initiative” guidelines [18]. Texture matrices were aggregated by averaging over 3D directions for GLCM and GLRLM, or computed in 3D for GLSZM, NGTDM, and GLDM features. See Supplemental Table S2 for a list of all 104 features.
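
To make the discretization step concrete, the following Python sketch bins intensities with a fixed bin width. It assumes that the bin index is computed as floor(x / W) - floor(min(x) / W) + 1, which is the fixed-bin-width rule described in the pyradiomics documentation; the function name and the toy intensities are hypothetical, and the sketch does not call pyradiomics itself.

```python
import math


def discretize_fixed_bin_width(values, bin_width=5.0):
    """Map intensities to 1-based bin indices using a fixed bin width.

    Assumed rule: floor(x / W) - floor(min(x) / W) + 1.
    """
    lowest_bin = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest_bin + 1 for v in values]


# Toy CT intensities in HU inside one volume of interest.
hu = [-12.0, -3.0, 0.0, 4.9, 5.0, 19.7, 42.0]
bins = discretize_fixed_bin_width(hu)  # -> [1, 3, 4, 4, 5, 7, 12]
```

With a fixed bin width, two voxels fall into the same bin only if their intensities lie within the same 5 HU interval, so the number of gray levels varies with the intensity range of each lesion.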

Modeling strategy and statistical analyses

The modeling steps were performed using the familiar package (0.0.0.53) in R (version 4.1.2, R core team, Vienna, Austria) (https://github.com/alexzwanenburg/familiar). For signature building, a recently published approach was chosen [19]. See Supplemental Methods for a detailed description. In brief, the radiomic feature space was reduced in three steps. First, features susceptible to small differences in the segmentation type (manual vs. PET threshold-based) were excluded (intraclass correlation coefficient (ICC 3.1) <0.8). Second, features highly correlated with clinical variables (PSA initial, ISUP, rcN, max PSA) were excluded, retaining only those with a Spearman coefficient ≤0.6. Third, we performed hierarchical clustering with complete linkage and Spearman correlation as a distance metric, keeping one representative feature from each cluster.
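
The robustness filter can be sketched as follows. The code computes ICC(3,1) (two-way mixed effects, single measurement, consistency) from its standard mean-square definition, (MS_rows - MS_error) / (MS_rows + (k - 1) MS_error). It is an illustration in Python rather than the R code of the familiar package, and the function name and toy feature values are hypothetical.

```python
def icc_3_1(ratings):
    """ICC(3,1) for `ratings`: one row per lesion, one column per
    segmentation type (e.g. manual, PET threshold-based)."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Feature A differs by a constant offset between segmentations (robust).
feature_a = [[10.0, 12.0], [20.0, 22.0], [30.0, 32.0], [40.0, 42.0]]
# Feature B changes its ranking between segmentations (not robust).
feature_b = [[10.0, 30.0], [20.0, 15.0], [30.0, 35.0], [40.0, 20.0]]

robust = {name: icc_3_1(vals) >= 0.8
          for name, vals in [("A", feature_a), ("B", feature_b)]}
```

In this toy example only feature A passes the 0.8 threshold. Because ICC(3,1) measures consistency rather than absolute agreement, a systematic offset between the two segmentation types does not reduce it.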

Finally, Cox proportional hazard models were calculated in 10 iterations of fivefold nested cross validation with different feature selection methods (Spearman correlation (spearman), concordance index (concordance), minimum redundancy maximum relevance (mrmr), mutual information feature selection (mifs), and random selection as control) (see Supplemental Figure S1 for a detailed graphical depiction). The manual segmentation was used. Prior to analysis, Yeo-Johnson transformation and z-transformation to mean zero and standard deviation of one were performed. For each iteration of the outer folds, the internal cross validation folds were repeated 11 times to select the median signature size and the top ranking features. The predictive performance in the outer folds was aggregated over all 10 iterations.
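
The resampling scheme can be outlined with a short Python sketch that only generates the index splits. It is a simplified illustration, not the familiar implementation: the function names are hypothetical, the number of inner folds (five) is an assumption, and the splits are neither stratified nor repeated 11 times in the inner loop.

```python
import random


def k_fold_indices(indices, k, rng):
    """Shuffle `indices` and partition them into k disjoint folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def repeated_nested_cv(n_patients, n_repeats=10, n_outer=5, n_inner=5, seed=0):
    """Return one entry per outer fold: (repeat, fold, train, test, inner).

    `inner` holds (fit, validation) index pairs drawn from the outer
    training set only, so test patients never influence feature selection.
    """
    rng = random.Random(seed)
    all_idx = list(range(n_patients))
    splits = []
    for rep in range(n_repeats):
        outer_folds = k_fold_indices(all_idx, n_outer, rng)
        for fold, test_idx in enumerate(outer_folds):
            test_set = set(test_idx)
            train_idx = [i for i in all_idx if i not in test_set]
            inner = []
            for val_idx in k_fold_indices(train_idx, n_inner, rng):
                val_set = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val_set]
                inner.append((fit_idx, val_idx))
            splits.append((rep, fold, train_idx, test_idx, inner))
    return splits


splits = repeated_nested_cv(n_patients=99)
```

For 99 patients this yields 10 × 5 = 50 outer test sets, and each patient appears in exactly one test set per repetition, which is what allows predictions to be aggregated per patient over all repetitions.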

We developed models comprising clinical features (Clinical, including the following variables: age, ISUP grade after surgery, initial PSA, maximum PSA prior to sRT, and rcN status) and radiomic features (Radiomics). Finally, a combined clinical-radiomic model was generated by using clinical and radiomic features as input into the same pipeline.

Statistical analysis

Descriptive statistics were performed with Excel 2016 (Microsoft Corporation, USA). Statistical analysis and model building were performed using R (version 4.1.2, R core team, Vienna, Austria). The primary endpoint of the study was BRFS, defined as time to serum PSA >0.2 ng/ml above the post-sRT nadir without initiation of additional salvage therapies or death from any cause.

In order to compare the predictive value of the developed models in the test sets within the nested cross validation approach, the following methods were used: time-dependent receiver-operating characteristic (ROC) curves [20], calibration curves (see Supplemental Figure S2) [21], time-dependent integrated discrimination improvement (tdIDI), time-dependent net reclassification improvement (tdNRI) (see Supplemental Methods) [22], and a decision curve analysis (DCA) [23]. The median predictor over all 10 × 5 outer testing sets was determined for each patient. Kaplan-Meier analysis [24] was conducted by recording the median value of the predictions in each training set and by applying it as a cut-off value for classification in the respective test sets for all patients. The final classification was determined by majority voting over all 10 iterations.
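
The cut-off and voting rule can be illustrated with a minimal Python sketch. The function names are hypothetical, and two details are assumptions not stated in the text: predictions strictly above the training median are labeled high risk, and a 5:5 tie over the 10 iterations is resolved as low risk.

```python
from statistics import median


def classify_by_training_median(train_predictions, test_predictions):
    """Label test patients as high risk (1) if their predicted risk
    exceeds the median prediction of the corresponding training set."""
    cutoff = median(train_predictions)
    return [1 if p > cutoff else 0 for p in test_predictions]


def majority_vote(votes):
    """Final risk group from one 0/1 vote per repetition.

    Ties are resolved as low risk (0); this tie rule is an assumption.
    """
    return 1 if 2 * sum(votes) > len(votes) else 0


# One outer fold: the cut-off comes from training predictions only.
fold_labels = classify_by_training_median([0.1, 0.4, 0.9], [0.3, 0.5])

# One patient classified as high risk in 7 of 10 repetitions.
final_label = majority_vote([1, 1, 1, 0, 1, 0, 1, 1, 0, 1])
```

Deriving the cut-off from the training set alone keeps the stratification of each test patient independent of that patient's own outcome.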

The C-index and ROC area under the curve (AUC) were calculated as performance metrics. The Wilcoxon rank-sum test was used for comparison of values at a significance level of 0.05.
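
For readers less familiar with the C-index, the following Python sketch computes Harrell's concordance index for right-censored data. Whether the familiar package uses exactly this estimator is an assumption; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index.

    A pair (i, j) is comparable if patient i had an event and a shorter
    observed time than patient j. It is concordant if patient i also has
    the higher predicted risk; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: months to BCR or censoring, event indicator, predicted risk.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
c_good = concordance_index(times, events, [0.9, 0.4, 0.6, 0.1])
c_bad = concordance_index(times, events, [0.1, 0.6, 0.4, 0.9])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which puts the reported ranges of 0.51 to 0.61 (clinical) and 0.66 to 0.71 (radiomic) into context.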

Decision curve analysis was performed according to Vickers et al. to compare the clinical net benefit of the developed models [25]. Decision curves for “treating no patient” and “treating all patients” were depicted as reference.
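
The net benefit underlying a decision curve can be sketched in Python as follows. The sketch uses the binary-outcome definition, net benefit = TP/n - (FP/n) × pt / (1 - pt) at threshold probability pt, and therefore ignores censoring, which a time-to-event analysis would have to handle; this simplification, the function names, and the toy data are assumptions for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Toy cohort: 1 = BCR within the time horizon, with predicted risks.
y_true = [1, 0, 1, 0, 0]
y_prob = [0.8, 0.3, 0.6, 0.7, 0.1]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
nb_none = 0.0  # treating no patient yields zero net benefit by definition
```

In this toy example the model has a net benefit of 0.2 at a threshold of 0.5, compared with -0.2 for treating all patients and 0 for treating none. A decision curve repeats this comparison over a range of thresholds.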

Results

Patient characteristics

Ninety-nine patients with a median follow-up (FU) of 29 months (range 3–79 months) were included in this analysis (Fig. 1). No patient died during FU. See Table 1 for details about patient characteristics.

Fig. 1 CONSORT flow diagram. Abbreviations: PSMA-PET/CT = positron-emission tomography targeting prostate-specific membrane antigen combined with computed tomography

Table 1 Patient characteristics

Radiomic models outperform clinical models for prediction of biochemical failure

The developed clinical signatures achieved low to moderate performance for prediction of BRFS with a C-index ranging between 0.51 and 0.61 in the test sets. The radiomic signature achieved superior prediction of BRFS with good performance in both training and test set with a C-index ranging between 0.66 and 0.71 in the test set. Combined clinical-radiomic models achieved only moderate performance in the test set with a C-index of 0.60–0.65. The models based on random feature selection performed worse for the radiomic models and similar to other clinical models. See Table 2 for details. Feature reduction with mrmr was chosen for further analyses of all models due to its performance close to the median overall training performance and narrow 95% confidence interval for the clinical and radiomic models (excluding random). For consistency, mrmr was also selected for the combined model.

Table 2 Performance of clinical, radiomic, and clinical-radiomic signatures for prediction of biochemical failure after salvage radiotherapy. Results of various feature selection methods are shown

We further stratified patients into low or high probability of BRFS based on predictions of the respective models. Only the radiomic models resulted in significantly different survival probabilities (p < 0.001). Time-dependent ROC analysis showed consistent AUC values over time of up to 0.8 for the radiomic signatures up to 60 months of follow-up. Combination of clinical and radiomic signatures showed lower AUCs than the radiomic signature alone. See Fig. 2 for details.

Fig. 2 The Kaplan–Meier survival curves and time-dependent area-under-the-curve (AUC) results for the clinical (A), radiomic (B), and combined clinical and radiomic (C) signatures obtained from repeated nested cross validation

At 24-month FU, the clinical signatures, radiomic signatures, and combined clinical and radiomic signatures achieved an AUC of 0.53, 0.73, and 0.63, respectively. See Fig. 3 for details.

Fig. 3 Results of the receiver operating characteristic analysis for the clinical signatures, radiomic signatures, and clinical-radiomic signature at 24 months of follow-up based on repeated nested cross validation (test results in the outer folds)

Clinical relevance of the radiomic signature

Decision curve analysis reflects the highest clinical net benefit for the radiomic signatures compared to the two alternative signatures while the combined model also performed better than the clinical model. See Fig. 4 for details.

Fig. 4 Decision curves for clinical, radiomic, and clinical-radiomic signatures based on repeated nested cross validation results (test sets within the outer folds). Decision curves for “treating no patient” and “treating all patients” were depicted as reference

Despite lower AUC values, the combined clinical-radiomic signatures achieved higher improvements in sensitivity than the radiomic signatures alone compared to the clinical signatures in the test sets of the outer folds (0.78 vs. 0.26). The clinical-radiomic signatures also demonstrated improved tdIDI over the clinical signatures (0.766, p = 0.027). The radiomic and clinical-radiomic models achieved a significantly better tdNRI (0.392 and 0.762, respectively) compared to the clinical model (p < 0.005) (Table 3).

Table 3 shows results of time-dependent reclassification analysis for the clinical, radiomics, and combined clinical-radiomic model based on repeated nested cross validation results (test sets within the outer folds)

CT mean intensity and PSA initial as most important features

For the radiomic signature, the feature “firstorder_mean,” i.e., the mean CT intensity value, was selected as predictive feature for all feature reduction methods with a frequency of 98% in the case for mrmr (see Supplemental Material Table S3 for selected features). For the clinical model, PSA initial was predominantly selected with a frequency of 40% for mrmr. The same two features were the most often selected features in the clinical-radiomic model (40% and 23%, respectively). See Table 4 for intensity values stratified by BCR.

Table 4 CT intensity values in Hounsfield units stratified by biochemical recurrence (BCR) are shown

The maximally selected rank statistics on the complete dataset revealed a mean intensity of 19.7 Hounsfield units (HU) as the optimal cut point. A univariate Cox proportional hazard model for CT mean intensity (HR 0.99, p = 0.012, Fig. 5) and the respective nomogram are provided in the supplement (Supplemental Material Table S4 and Figure S3). An exemplary patient case is provided in Fig. 6. In a multivariate Cox model including PSA initial (HR 1.01, p = 0.052) and CT mean intensity (HR 0.99, p = 0.012), only the latter was significant.

Fig. 5 Kaplan-Meier curves for biochemical recurrence-free survival (BRFS) stratified by mean intensity of <20 and >20 Hounsfield units

Fig. 6 Exemplary patient case. A shows the PET-positive local recurrence in the prostatic fossa/seminal vesicle fossa (segmentation in green). B shows the respective segmentation in the CT scan (segmentation in green). Mean intensity of the segmentation was 15.8 HU and the radiomic majority vote was high risk. C shows an axial slice of the radiotherapy plan in colorwash with dose escalation in the area of the local recurrence. D shows the prostate-specific antigen (PSA) values over time. Time points of radical prostatectomy (RPE), biochemical recurrence (BCR), and salvage radiotherapy (sRT) are highlighted

Discussion

In this study, we have developed PSMA-PET-guided CT-based radiomic signatures for prediction of BRFS after sRT due to PCa recurrence using a multicenter cohort from three high-volume centers. We designed the model as a pre-therapeutic tool to guide treatment decision and to identify patients who are at higher or lower risk of relapse and might thus be candidates for treatment intensification or de-intensification. Consequently, treatment-specific parameters, such as delivered radiation dose, were not considered for model building. The developed radiomic signatures yielded good predictive performances and outperformed clinical signatures based on classical histological and clinical parameters. The radiomic model achieved significant patient stratification and demonstrated durable prediction of BRFS in time-dependent ROC analysis. To the best of our knowledge, this is the first study assessing CT-based radiomics in patients who underwent PSMA-PET-based sRT and thus provides novel insights into this field of research.

Analyses of RFs have extensively been performed in primary prostate cancer patients [11]. Most of these studies are based on MRI and demonstrated the ability of radiomics to non-invasively characterize and detect clinically significant PCa and extracapsular extension or predict BCR [26,27,28].

Fewer studies reported on CT-based radiomics and all of these were performed in the primary setting. Three studies developed CT-based radiomic classifiers with good performance to predict Gleason score and risk groups (AUC 0.70–0.83) [29,30,31].

Based on PSMA-PET/CT scans, Peeken et al. developed a CT-based model to detect lymph node metastases, which outperformed conventional CT parameters with an AUC of 0.95 in external testing [32], addressing the limited ability of conventional imaging to detect PCa-positive lymph nodes. Acar et al. used CT-based RFs to differentiate between bone metastases and sclerotic areas with good accuracy (AUC 0.76) [33].

Since patients with BCR after surgery experience heterogeneous response rates [4, 34], our study aims to improve risk stratification for patients receiving sRT using commonly available, state-of-the-art diagnostics and to identify patients who might benefit from treatment de-intensification or intensification.

In our study, clinical signatures showed insufficient prognostic value for BRFS after sRT in the test sets, which reflects the deficiency of classical clinical and pathological parameters for prognostication demonstrated by retrospective and prospective studies. In contrast, the developed radiomic signatures outperformed the clinical models with good prognostic values in the test sets. This held across the various feature selection methods, which demonstrates a certain robustness of these signatures. The inferior performance of the combined clinical and radiomic signatures might be explained by the poor prognostic value of clinical parameters and the low patient number for effective model building.

Since no other studies evaluated CT-based radiomics to predict BCR, we cannot directly compare our signatures with other CT-based models. Nevertheless, in comparison with mpMRI-derived RF, the radiomic models in our study performed similarly well with a C-index of >0.7, considering different clinical scenarios between these studies. DCA demonstrates a net benefit of the radiomic signatures, suggesting that clinical utilization of radiomics can help to identify patients who are at higher risk of BCR after sRT. Whether these patients benefit from intensified treatments and, if so, which kind of treatment intensification is optimal need to be evaluated in future studies.

CT-based radiomics might play an even more important role in the future, since technical advances such as dual-energy CT provide more image information and may thus allow for more differentiated radiomic analyses. In addition, the prognostic capability of PSMA-based radiomic signatures needs to be evaluated in future studies.

Due to the small patient number, we were not able to separate an external testing cohort, but rather obtained high statistical robustness by applying a nested cross validation approach. Future studies should focus on external validation to demonstrate transferability of models.

The mean intensity within the VOI was selected as the most important RF. Lower intensity values were associated with decreased BRFS. To provide a simple cut-off metric, we applied the maximally selected rank statistics. A cut-off of 19.7 HU was determined as optimal cut-off point for BCR. However, unlimited reduction of HU values is not plausible, since we expect local recurrence to have HU values greater than fat tissue. Thus, this cut-off value should be validated in further studies. Moreover, we provide a univariate Cox model and nomogram trained on the complete cohort for future external validation.

There are several limitations in our study. First, the retrospective design carries a risk of selection bias. Second, we have included patients with LR and NR in this analysis, who experience different outcomes. Separation of both cohorts would have resulted in an insufficient sample size. Nevertheless, development of CT-based radiomic signatures might be influenced to a lesser extent by this heterogeneity than functional imaging methods. Third, we used an internal validation due to the low number of patients within each institution. However, we applied a sophisticated nested cross validation approach to overcome methodological disadvantages. Fourth, the inclusion of patients who received ADT may have biased outcome predictions. Again, exclusion of these patients would have significantly reduced the sample size. Finally, the FU in our cohorts is relatively short with a median FU of 29 months.

Summary

The developed CT-based radiomic signatures outperform clinical/clinical-radiomic signatures for prediction of BCR after sRT and demonstrated durable prediction of biochemical recurrence in time-dependent ROC analysis. Decision curve analysis demonstrates a net benefit for clinical utilization of the radiomic signature. Future studies need to evaluate whether these improved prognostications can be transferred into personalized treatments.