Introduction

The incidence of neuroendocrine tumors of pancreatic origin (pNET), an orphan malignancy, is continuously rising, mainly due to technical progress in diagnostic imaging and improved awareness among treating physicians [1, 2]. Surgical resection is the only curative approach [3]. In advanced settings, treatment options include cytotoxic chemotherapy, somatostatin analogs, or targeted therapies such as tyrosine kinase and mTOR inhibitors [4,5,6,7]. Recently, favorable results have been reported for unresectable midgut as well as for bronchial NET using peptide receptor radionuclide therapy (PRRT) with [177Lu]DOTA-D-Phe-Tyr3-octreotate ([177Lu]DOTATATE) [8, 9].

Tailored medical treatment mainly focuses on proteomics or gene sequencing; however, their prognostic ability is rather limited due to small sample sizes, ongoing tumor development, and incomplete reflection of the entire tumor burden [10, 11]. Recently, the Delphic Consensus Assessment for Gastroenteropancreatic (GEP)-NET disease management reported on the limitations of chromogranin A (CgA) alterations as well as Ki67 in the identification of therapy responders. The need for more precise clinical decision-making has increased the demand for real-time multidimensional information regarding tumor behavior [12]. Non-invasive determination of intratumoral heterogeneity as assessed by baseline somatostatin receptor (SSTR)-positron emission tomography (PET) before PRRT has already proven its prognostic performance by outperforming conventional PET parameters, such as mean/maximum standardized uptake values (SUVmean/max), in a mixed cohort of patients scheduled for endoradiotherapy [13]. However, for pNET in particular, PRRT efficacy prediction has not yet been elucidated due to considerable heterogeneity, earlier relapse of pNET patients undergoing radionuclide therapy, or mechanisms of tumor escape in dedifferentiated tumors [14,15,16]. Aiming to decode a general prognostic phenotype [10], we hypothesized that intratumoral textural feature (TF) analysis assessed by a baseline SSTR-PET might address the urgent clinical need for prognostication in G1/2 pNET patients prior to PRRT. Patients with a potentially poor response to PRRT may be identified, and different therapeutic regimens might be applicable (e.g., systemic therapies). Therefore, we aimed to elucidate the prognostic capability of a baseline PET scan in a homogeneous cohort of G1/G2 pNET patients.

Materials and Methods

Since our study comprises a retrospective analysis of routinely acquired data, the local ethics committees waived the need for further approval. All patients gave written informed consent to the procedures and to the scientific analysis of the obtained data.

Patient Population

A total of 31 subjects (14/31 females (45.2 %), mean 60 ± 10 years (y), range, 39–79 y) from four university medical centers with histologically proven pNET were enrolled. The patients enrolled in the present subanalysis were part of a larger patient cohort [13]. The study population was restricted to G1/2 pNET, as patients with G3 tumors usually experience rapid disease progression under PRRT [17]. Ki67 ranged between 1 and 20 % with a median of 5 % for the entire cohort (n = 31). Eight out of thirty-one (25.8 %) were classified as G1 NET and 23/31 (74.2 %) as G2 NET. In G2 NET, the median Ki67 was 8 % (range, 4–20 %).

Analysis of CgA levels before PRRT revealed a range between 35 and 64,700 μg/l (median, 924 μg/l). Twenty-five out of thirty-one (80.6 %) patients were pre-treated (somatostatin analogs, n = 19/31 (61.3 %); surgery, n = 13/31 (41.9 %); chemotherapy, n = 9/31 (29 %); and external beam radiation, n = 1/31 (3.2 %)). Clinical characteristics of the patient cohort are given in Table 1.

Table 1 Detailed patient characteristics according to Ki67/grading

PRRT was performed with a mean of 7.2 ± 1.0 GBq (194.6 ± 27 mCi; range, 3.3–8.9 GBq, 89.2–240.5 mCi) per cycle using [177Lu]DOTATATE. In total, the enrolled subjects underwent 112 treatment cycles (median, 4; range, 1–6; mean, 3.6 ± 1.2) aiming at a standard interval of 3 months on a compassionate use basis [18, 19]. The majority of cases (21/31, 67.7 %) received at least four treatment cycles. PRRT was performed according to the joint IAEA, EANM, and SNMMI practical guidance or in accordance with the Rotterdam protocol as published by Kwekkeboom et al., i.e., at the time point of disease progression [18, 19]. Long-acting and short-acting release formulations of somatostatin analogs were also discontinued according to [18]. Follow-up imaging including functional (SSTR-PET) and/or morphologic (CT) modalities was performed every 3–6 months after PRRT [18, 19].

Progression-free survival (PFS) was defined according to Response Evaluation Criteria in Solid Tumors 1.1 (RECIST1.1) by follow-up examinations starting from the time point of baseline imaging [18, 20]. For the calculation of overall survival (OS), the time interval between the baseline SSTR-PET examination and date of death was analyzed.

PET/CT Imaging, Imaging Interpretation

As a prerequisite for treatment initiation, all patients had to demonstrate sufficient uptake in pre-therapeutic SSTR-PET/computed tomography (CT) [18, 19], i.e., lesional uptake higher than physiological liver uptake [21]. A mean of 132 ± 35.7 MBq (3.6 ± 0.9 mCi; range, 72–185 MBq, 1.9–5 mCi) of [68Ga]DOTATATE/-TOC (n = 27, [68Ga]DOTATATE and n = 4, [68Ga]DOTATOC) was administered intravenously. After 60 min, imaging was performed using the following scanners: Bonn, Biograph 2 PET/CT (Siemens Medical Solutions, Erlangen, Germany); Wuerzburg, Biograph 64 (Siemens Medical Solutions, Erlangen, Germany); Munich, Gemini TF PET/CT (Philips Medical, Eindhoven, Netherlands) or Siemens Biograph 64 (Siemens Medical Solutions, Erlangen, Germany); Hannover, Biograph 2 (Siemens Medical Solutions, Erlangen, Germany). System spatial resolutions are 4.8 mm for the Gemini TF, 4.4 mm for the Biograph 64, and 9.3 mm for the Biograph 2 [22,23,24]. All data were reconstructed using iterative algorithms implemented by the manufacturer and depending on the routine protocol of the different medical centers. Scatter and attenuation correction was performed based on the different transmission data [13]. To allow for valid pooling of the results between Siemens and Philips PET/CT scanners, phantom studies based on the National Electrical Manufacturers Association NU2-2001 standard were conducted in Munich. According to a recently published study investigating the robustness of TF in GEP-NET patients using SSTR-PET in a multicentric setting, the following TF were taken into account [25]: from the gray-level co-occurrence matrix (Entropy, Homogeneity), from the gray-level run length matrix (high gray-level run emphasis (HGRE)), and from the gray-level size zone matrix (intensity variation, high gray-level zone emphasis (HGZE), zone length non-uniformity (ZLNU), short-zone high gray-level emphasis (SZHGE), zone percentage (ZP)).
In addition, metabolic tumor volume (MTV) and total receptor expression (TRE) were assessed. Lesions were identified by reviewing the SSTR-PET, CT, and fused hybrid imaging by board-certified nuclear medicine physicians. In case of multiorgan involvement, a maximum of three lesions per organ (largest in size and metabolically most active lesion) was segmented. A manual segmentation method was preferred in order to exclude adjacent physiological SSTR-avid structures on PET/CT images [26]. TF analysis was performed using the Interview Fusion Workstation (Mediso Medical Imaging Systems Ltd., Budapest, Hungary) [13]. As previously described, CT images were available only for localization and were not used to guide delineation of the volume of interest (VOI) [27]. Further, conventional PET parameters (SUVmean/max, MTV, and TRE, calculated as MTV × SUVmean) were also investigated [13]. The radiotracer concentration in the ROIs was normalized to the injected dose per kilogram of patient body weight to derive the SUVs. For the assessment of TF, 162 VOIs (median, 5; range, 1–12 per patient) were manually segmented. In the majority of the cases (22/31, 71 %), at least four lesions were investigated. Metastases with an MTV smaller than 10 cm3 were excluded [28].
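The conventional parameters described above can be illustrated with a minimal sketch. It assumes decay-corrected activity concentrations in kBq/ml and a tissue density of 1 g/ml; the function names, the lesion dictionary layout, and the example values are hypothetical and do not reflect the workstation software used in this study.

```python
MIN_MTV_CM3 = 10.0  # exclusion threshold stated in the text


def suv_body_weight(activity_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight-normalized SUV.

    Assumes decay-corrected activity and a tissue density of 1 g/ml.
    """
    if injected_dose_mbq <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    # MBq/kg is numerically equal to kBq/g, so no extra unit factor is needed.
    dose_per_gram_kbq = injected_dose_mbq / body_weight_kg
    return activity_kbq_per_ml / dose_per_gram_kbq


def receptor_expression(mtv_cm3, suv_mean):
    """TRE = MTV x SUVmean, as defined in the text."""
    return mtv_cm3 * suv_mean


def eligible_lesions(lesions):
    """Keep lesions whose MTV is at least MIN_MTV_CM3 (hypothetical dict layout)."""
    return [les for les in lesions if les["mtv_cm3"] >= MIN_MTV_CM3]


# Hypothetical example values, not patient data.
lesions = [
    {"id": "liver_1", "mtv_cm3": 24.0, "suv_mean": 11.5},
    {"id": "node_1", "mtv_cm3": 6.0, "suv_mean": 9.0},
]
for les in eligible_lesions(lesions):
    les["tre"] = receptor_expression(les["mtv_cm3"], les["suv_mean"])
```

In this sketch, the second lesion is dropped by the volume criterion before TRE is computed, mirroring the order of steps described in the text.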

Statistical Analysis

Statistical analysis was performed using SPSS Statistics 22 and MedCalc (Vers. 17.4.4). The cutoff values of each parameter for the prediction of PFS and OS were identified through receiver operating characteristic (ROC) analysis using the Youden Index for maximization of specificity and sensitivity [29]. Kaplan–Meier analysis (univariate analysis) was performed using thresholds established by ROC analysis in cases in which ROC showed statistically significant results. A multivariate Cox hazard analysis was conducted to determine independent prognostic parameters as well as relative risks (RR) [26, 30]. Non-parametric log-rank tests were used to assess the differences in the Kaplan–Meier curves; a p value < 0.05 was considered statistically significant.
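The Youden Index step can be sketched as follows. This is an illustrative pure-Python version, not the SPSS or MedCalc routine used in the study; it assumes that values at or above a candidate cutoff count as test-positive and that `events` flags the outcome of interest. The function name and the example data are hypothetical.

```python
def youden_cutoff(values, events):
    """Return (cutoff, J, sensitivity, specificity) maximizing J = sens + spec - 1.

    Assumption: a value >= cutoff is classified as test-positive.
    Ties are resolved in favor of the lowest cutoff.
    """
    positives = sum(1 for e in events if e)
    negatives = len(events) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best = None
    for cutoff in sorted(set(values)):
        # True positives: outcome present and value at or above the cutoff.
        tp = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        # True negatives: outcome absent and value below the cutoff.
        tn = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Hypothetical feature values and outcome labels, not study data.
example_values = [5.9, 6.1, 6.4, 6.8, 7.0, 7.3]
example_events = [False, False, False, True, True, True]
example_result = youden_cutoff(example_values, example_events)
```

Every observed value is tried as a candidate threshold, which is sufficient for small cohorts; dedicated statistics packages may additionally interpolate between neighboring values.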

Results

Almost all subjects suffered from liver metastases (30/31, 96.8 %), less than half of the cohort demonstrated lymph node metastases (14/31, 45.2 %), 8/31 suffered from bone lesions (25.8 %), and 1/31 (3.2 %) showed pulmonary metastases (Table 1).

During a median follow-up of 3.7 years, disease progression occurred in 21/31 subjects (67.7 %) after a median of 1.5 y from the baseline PET scan (range, 0.8 months–4.5 y). Thirteen out of thirty-one (41.9 %) patients died from their tumor after a median of 1.9 y (range, 0.8 months–4.6 y). Of those, 11/13 (84.6 %) belonged to the G2 group (mean Ki67, 11 %). The median proliferation index Ki67 in the deceased patients was 5 % (range, 2–20 %).

Entire Cohort

In ROC analysis of TF, entropy demonstrated a significant predictive ability for OS (cutoff = 6.7, AUC = 0.71, p = 0.02) with an accuracy of 71 %. Higher entropy predicted longer survival (> 6.7, OS = 2.5 y, 17/31), whereas lower entropy portended inferior outcome (< 6.7, OS = 1.9 y, 14/31, Table 2, Supplementary Table a: see electronic supplementary material (ESM)). All the investigated conventional PET parameters (SUVmean/max, MTV, TRE) failed in response prediction (Supplementary Table b).

Table 2 Receiver operating characteristic (ROC) analysis for progression-free (PFS) and overall survival (OS) for the textural feature entropy (independent according to Cox analysis)

Subsequent Kaplan–Meier analysis revealed a significant distinction between high- and low-risk patients for OS using entropy (p = 0.045) in the whole cohort (PFS, n.s.).

In Cox hazard analysis, entropy and intensity variation demonstrated significance for OS (p < 0.05, respectively). For PFS, none of the investigated conventional PET parameters (SUVmean/max) or other TF were significant. Regarding clinical parameters, the cumulative administered dose reached significance for OS (p = 0.04, r = 0.37) and Ki67 for PFS prediction (p = 0.002, r = − 0.54). For those patients below the ROC-derived threshold for entropy, the RR of cancer-related death after PRRT was 2.73 (n = 31, CI 1.07–7.01; p = 0.04) (Fig. 1, Supplementary Table c in ESM).

Fig. 1. Relative risk (RR) charts with 95 % confidence interval (CI) using the ROC-derived threshold (Table 2) of entropy for overall survival. a Entire cohort (n = 31) and b G2 neuroendocrine tumor subgroup (n = 23). When the RR is exactly 1, the risk is unchanged. For those patients below the ROC-derived threshold of entropy, the RR of cancer-related death after PRRT increases (indicating worse outcome, applies to both the entire cohort and the G2 subcohort). Asterisk denotes statistical significance.

Neither Ki67 nor grading demonstrated significant correlation with the independent heterogeneity parameters (e.g., entropy/Ki67, r = − 0.27, n.s.).

Subanalysis of G2 NET

In ROC analysis of G2 NET, entropy reached significance for OS prediction with an accuracy of 70 % (ROC, cutoff = 6.9, AUC = 0.72, p = 0.03). Regarding OS prediction, the findings of the entire cohort were supported in a subanalysis of G2 NET (> 6.9, OS = 2.8 y, 9/23 vs. < 6.9, OS = 1.9 y, 14/23, Table 2).

In Kaplan–Meier analysis, no statistical significance was reached in the G2 subgroup (p = 0.072). Results are displayed for the entire cohort and the G2 group (Table 3); respective Kaplan–Meier plots for OS are given in Fig. 2.

Table 3 Results of Kaplan–Meier analysis for overall survival (OS) for the entire cohort (n = 31) and G2 neuroendocrine tumors (NET, n = 23) for the textural feature entropy. Asterisk denotes statistical significance

Fig. 2. Kaplan–Meier plots and number-at-risk tables for the probability of overall survival. a Entire cohort, n = 31, and b G2 neuroendocrine tumor subgroup, n = 23. Low-risk group (solid lines) was identified by various textural parameters measured on somatostatin receptor-positron emission tomography/computed tomography (SSTR-PET/CT) before peptide receptor radionuclide therapy. Cutoff values derived by receiver operating characteristics (ROC) analysis were used (Table 2). Only entropy was significant in both ROC and Cox analysis; d days.

In Cox analysis, entropy reached significance (p = 0.03) for OS prediction. In accordance with the findings for the entire cohort, the RR of cancer-related death after PRRT was 2.89 (CI 0.8–10.44; p = 0.1) for the G2 subgroup (Fig. 1, Supplementary Table c in ESM). The parameter intensity variation [13] showed a trend toward significance in Cox analysis (p = 0.05).

For both the entire cohort as well as the G2 subgroup, results for ROC and Cox analyses of investigated parameters are presented in Supplementary Table b in ESM.

Discussion

This is the first study to assess intratumoral heterogeneity as a risk stratification tool for pNET patients scheduled for PRRT. Entropy, reflecting derangement on a voxel-by-voxel level, outperformed conventional PET parameters in prognostication. These findings were further corroborated in a G2 subanalysis. However, this group per se includes a wide range of Ki67 values from 2 to 20 %, i.e., the therapeutic response of a “low” G2 NET scheduled for PRRT might differ from that of a G2 NET with an increased Ki67 [16].

Biopsy carries the potential for tumor under-sampling, and as a consequence, inaccurate therapeutic decisions can be made [31]. Hence, as a non-invasive whole-body molecular tool considering the extent of disease, PET-based assessment of intratumoral heterogeneity might serve as a novel diagnostic biomarker reflecting the entire phenotypical tumor burden. As previously described, the prognostic value of TF derived by PET has been successfully investigated in different tumor types [26, 32,33,34]. In our previous trial investigating various disease entities, TF analysis of a baseline SSTR-PET/CT proved to be of prognostic value in PRRT candidates [13]. In the present study, we focused exclusively on subjects with NET of pancreatic origin. For NET, treatment options have improved in recent years [35]: The NETTER-1 trial revealed impressive findings using PRRT in midgut NET [8]. Moreover, recent developments of systemic agents such as everolimus or axitinib have also demonstrated favorable results. However, attention should also be paid to cardiac adverse events (grade 3/4 hypertension) leading to axitinib withdrawal in 20 % of patients [6, 36]. Hence, novel risk stratification approaches for this tumor entity are being intensively sought: As demonstrated in this study, imaging-based survival prediction using TF analysis might be helpful to differentiate between low-risk and high-risk groups. Of note, entropy reached significance in all three statistical tests (ROC, Kaplan–Meier, Cox analysis), emphasizing its potential in response prediction independent of other investigated variables, at least for the entire cohort. However, clinical implications have to be drawn with extreme caution, as the findings presented herein should rather be interpreted as a “proof-of-concept,” and further research in larger, more homogeneous cohorts is warranted.

Analyzing pre-therapeutic [68Ga]DOTATOC scans of liver metastases in pNET patients scheduled for [Y-90/Lu-177] treatment, an SUVmax threshold of > 16.4 for achieving radiologic response was proposed [37]. In our study, a cohort treated with the less nephrotoxic and more common Lu-177 was enrolled [38]. Although comparable thresholds were reached, the SUV was not significant in our analysis. Similar to our findings, Gabriel also reported that the SUV profile of a baseline [68Ga]DOTATOC PET does not provide additional information for response prediction in PRRT patients [39]. Sansovini et al. have recently proven that a negative 2-deoxy-2-[18F]fluoro-D-glucose (FDG) PET scan in advanced pNET patients treated with [177Lu]DOTATATE was linked to a better outcome after PRRT; however, [18F]FDG PET is not routinely assessed in treatment planning [40].

Higher entropy values are related to superior outcome in our study. A multivariate Cox analysis corroborated these findings: the RR for cancer-related death for those patients below the ROC-derived threshold of entropy indicated an almost threefold increased mortality compared to that for the low-risk group (Fig. 1). These results are contrary to findings in [18F]FDG PET studies investigating TF in pancreatic ductal carcinoma or non-small cell lung cancer (low entropy associated with longer OS) [41, 42]. Understandably, results from [18F]FDG PET in highly metabolically active tumors cannot be directly transferred to SSTR-PET; however, these findings emphasize the value of tumor heterogeneity assessment.

The value of entropy in patients with esophageal cancer undergoing RTx has been recently evaluated. Although responders were associated with greater local heterogeneity than non-responders, responders presented lower entropy values [33]. The response of NET tumor tissue to radiation exposure, however, might vary [43], and the included patients in the present study were heavily pre-treated with CTx and RTx (30 %), which could also have a certain impact on the SSTR fluctuations on the tumor cell surface. Moreover, as intratumoral phenotypic heterogeneity is frequently observed in NET even between synchronous or metachronous metastases, no attempt was made to correlate these histopathological findings with patient outcome [15]. Wetz et al. have recently reported on the predictive role of asphericity in GEP-NET patients scheduled for PRRT: a higher level of asphericity was associated with poorer outcome. However, in contrast to the present study investigating SSTR-PET, heterogeneity parameters were derived from [111In-DTPA0]octreotide scintigraphy [44], which has a lower affinity to SSTR2A compared to its PET counterparts [45]. Moreover, entropy and asphericity differ in their equations, which also serves as a possible explanation for the different results: The latter quantifies the deviation of the shape of the MTV from a sphere, and it takes both the mean surface S and the mean volume V into account [44, 46]. In contrast, entropy considers I as the voxel value in the ROI and P(I) as the probability of the occurrence of that voxel value [47]. Apart from that, SSTR-PET was used in the present study, whereas Wetz and co-workers used a SPECT approach [44]. Taken together, the exact association between the imaging-derived analysis of tumor lesion texture presented herein and the underlying tumor biology must be further determined in prospective, longitudinal studies.
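To make the entropy definition above concrete, the following minimal sketch computes a Shannon-type entropy, H = −Σ P(I) · log2 P(I), from the voxel values of a single ROI. The discretization into 64 equal-width bins, the base-2 logarithm, and the function name are assumptions made for illustration; the workstation implementation used in this study (including its gray-level co-occurrence matrix step) may differ, so values from this sketch are not comparable with the cutoffs reported here.

```python
import math
from collections import Counter


def histogram_entropy(voxel_values, n_bins=64):
    """Shannon entropy (in bits) of the discretized voxel-value distribution.

    Assumption: values are resampled into n_bins equal-width bins
    between the ROI minimum and maximum before P(I) is estimated.
    """
    if not voxel_values:
        raise ValueError("ROI must contain at least one voxel")
    lo, hi = min(voxel_values), max(voxel_values)
    if hi == lo:
        return 0.0  # perfectly homogeneous ROI: only one gray level occurs
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in voxel_values]
    counts = Counter(bins)
    n = len(bins)
    # P(I) is the relative frequency of each occupied gray level.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

With this definition, an ROI in which all voxels share one value yields an entropy of zero, and entropy grows as the voxel values spread more evenly over the available gray levels.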

In contrast to previous findings, hepatic tumor burden did not emerge as an independent survival predictor, mainly because almost the entire cohort of our patients (97 %) suffered from liver metastases [48]. Of note, Ki67 did not correlate significantly with the investigated heterogeneity parameters, emphasizing their independence. Correlating the proliferation index with outcome variables, Ki67 demonstrated its potential in PFS prediction but failed for OS. This might also be caused by sampling variability, as needle biopsies are typically not guided to regions with a higher proliferative rate [15].

This multicenter analysis has several limitations. First, only a limited number of patients could be included in this study, although this has to be seen in light of the low annual incidence of pNET [2]. Additionally, imaging protocols differed slightly from center to center, including various PET reconstruction algorithms and different PET scanners. Moreover, no harmonization between the Biograph 2 and Biograph 64 PET scanners was performed. Compared to previous investigations [37], this might explain why the SUVmax did not turn out to be a significant predictive parameter; other conventional PET parameters, like SUVpeak, could be the subject of future studies. The OS reported herein for pNET patients under PRRT is significantly lower than described previously [16, 49]; however, it remains a matter of debate whether OS should be defined from diagnosis and treatment initiation or from baseline SSTR-PET. Moreover, therapeutic algorithms might also vary between centers, as the treating nuclear medicine physician has to adjust treatment planning to current circumstances (e.g., due to renal impairment); nonetheless, this reflects a typical clinical situation. Furthermore, in our cohort, the number of treatment cycles ranged from 1 to 6; however, the majority of the cases (67.7 %) received at least four radiopeptide administrations (median, 4 cycles). Changes of imaging-derived parameters between subsequent scans might also be of prognostic value; however, functional follow-up imaging was not available in every patient. Moreover, in only 71 % of the cases, at least four lesions could be manually segmented, and metastases with an MTV smaller than 10 cm3 were not considered. Ki67 is prone to sampling bias as well as to inaccuracy arising from the time lag between its assessment and the subsequent SSTR-PET scan. Due to the different affinities to SSTR2A, the use of [68Ga]DOTATATE/-TOC might have also led to a certain data bias [45].
In accordance with the reported robustness of certain TF published in [25], a pre-selection of heterogeneity parameters was performed in the present study. Hence, only a limited number of TF was investigated, and therefore, no correction of p values was applied to adjust for multiple tests, but such procedures could be the subject of future studies [50]. Consequently, as a Bonferroni correction was not applied, the findings derived herein must be interpreted with caution. Moreover, a more homogeneous study setting might strengthen our preliminary findings, in particular by enrolling a larger, prospective cohort using the same scanners and without variances in the imaging protocol.
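For orientation, a Bonferroni correction of the kind mentioned above could be sketched as follows. The feature names and p values are hypothetical placeholders rather than results of this study, and the number of tests that would have to be counted in practice is an assumption left to the analyst.

```python
def bonferroni(p_values, alpha=0.05):
    """Map each test name to (adjusted p value, significant after correction).

    The per-test threshold is alpha divided by the number of tests;
    equivalently, each p value is multiplied by the number of tests (capped at 1).
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p value is required")
    threshold = alpha / m
    return {
        name: (min(p * m, 1.0), p < threshold)
        for name, p in p_values.items()
    }


# Hypothetical p values for four parameters, not study results.
example_p = {"feature_a": 0.004, "feature_b": 0.03, "feature_c": 0.2, "feature_d": 0.6}
example_adjusted = bonferroni(example_p)
```

In this hypothetical example, only the first parameter would remain below the corrected threshold of 0.0125, which illustrates why uncorrected p values close to 0.05 call for cautious interpretation.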

Conclusion

As demonstrated in this multicenter study, application of entropy as obtained by baseline SSTR-PET might be useful for differentiating high-risk from low-risk groups in pNET patients scheduled for PRRT.