The value of 18F-FDG PET before and after induction chemotherapy for the early prediction of a poor pathologic response to subsequent preoperative chemoradiotherapy in oesophageal adenocarcinoma

Purpose The purpose of our study was to determine the value of 18F-FDG PET before and after induction chemotherapy in patients with oesophageal adenocarcinoma for the early prediction of a poor pathologic response to subsequent preoperative chemoradiotherapy (CRT).
Methods In 70 consecutive patients receiving a three-step treatment strategy of induction chemotherapy and preoperative chemoradiotherapy for oesophageal adenocarcinoma, 18F-FDG PET scans were performed before and after induction chemotherapy (before preoperative CRT). SUVmax, SUVmean, metabolic tumour volume (MTV), and total lesion glycolysis (TLG) were determined at these two time points. The predictive potential of (the change in) these parameters for a poor pathologic response, progression-free survival (PFS) and overall survival (OS) was assessed.
Results A poor pathologic response after induction chemotherapy and preoperative CRT was found in 27 patients (39 %). Patients with a poor pathologic response experienced less of a reduction in TLG after induction chemotherapy (p < 0.01). The change in TLG was predictive for a poor pathologic response at a threshold of −26 % (sensitivity 67 %, specificity 84 %, accuracy 77 %, PPV 72 %, NPV 80 %), yielding an area-under-the-curve of 0.74 in ROC analysis. Also, patients with a decrease in TLG of less than 26 % had a significantly worse PFS (p = 0.02), but not OS (p = 0.18).
Conclusions 18F-FDG PET appears useful to predict a poor pathologic response as well as PFS early after induction chemotherapy in patients with oesophageal adenocarcinoma undergoing a three-step treatment strategy. As such, the early 18F-FDG PET response after induction chemotherapy could aid in individualizing treatment by modification or withdrawal of subsequent preoperative CRT in poor responders.


Introduction
The long-term survival of patients with locoregionally advanced oesophageal cancer remains quite poor despite considerable advances in surgery, radiotherapy, and chemotherapy, with 5-year survival rates still below 50 % [1,2]. Multimodality treatment strategies have been implemented in an effort to improve the outcome achieved with surgery alone [3]. Since early studies showed that adjuvant therapy did not improve outcomes [4][5][6][7], contemporary research mainly focused on neoadjuvant strategies, which resulted in improved resection rates, pathologic downstaging, and a reduction in disease recurrences [3]. As a result, preoperative concurrent chemoradiotherapy (CRT) followed by oesophagectomy is commonly applied in clinical practice [8].
An important observation in patients treated with trimodality therapy (i.e., preoperative CRT followed by oesophagectomy) is that the most common pattern of treatment failure is now distant progression [8,9]. In an attempt to eliminate micrometastases and thereby improve the distant failure rate and overall outcome, additional induction chemotherapy before trimodality therapy has been investigated in the United States and Europe, as well as in Asia [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29]. Results of comparative studies have been inconclusive, with some studies reporting a benefit of induction chemotherapy [15,16], while others were equivocal [27,29]. Nonetheless, induction chemotherapy is thought to have a number of potential advantages including improvement of swallowing/nutritional status and obviating the need for feeding tubes in patients presenting with dysphagia [11,12,14,18,19,22,24]. More importantly, it has been suggested that the use of induction chemotherapy may permit early identification of poorly responding patients in whom neoadjuvant treatment is ineffective or even harmful [24,30-32].
18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) is a well-established imaging modality for initial staging and re-staging after preoperative CRT for the detection of distant (interval) metastases [33][34][35][36][37]. 18F-FDG PET has been shown to be more accurate than other modalities in predicting pathologic response to neoadjuvant chemotherapy or CRT for oesophageal cancer [38,39]. However, current evidence is limited with regard to the value of 18F-FDG PET for response prediction in the setting of a three-step strategy of induction chemotherapy and preoperative CRT followed by oesophagectomy. Therefore, the aim of this study was to determine the value of 18F-FDG PET scanning at baseline and after induction chemotherapy for the early prediction of a poor versus good pathologic response (i.e., >10 % versus ≤10 % residual carcinoma) to subsequent preoperative CRT.

Material and methods
This retrospective study was approved by our Institutional Review Board, and the need for written informed consent was waived. The study was conducted in accordance with the Health Insurance Portability and Accountability Act (HIPAA) and the checklist from the STAndards for the Reporting of Diagnostic accuracy studies (STARD) statement (http://www.stard-statement.org) [40].

Study population
From a prospectively acquired database, we extracted all consecutive patients with a biopsy-proven, potentially resectable adenocarcinoma of the oesophagus or gastro-oesophageal junction and no distant metastases who underwent a three-step treatment strategy of induction chemotherapy and preoperative chemoradiotherapy followed by surgery at our institution from March 2006 to February 2013. Patients were excluded if one of the two 18F-FDG PET scans of interest was either not available or acquired at another institution. Also, non-FDG-avid tumours at baseline, Siewert type 3 gastro-oesophageal junction tumours, and patients with a stent in situ at the time of scanning were excluded. Finally, patients with a time interval between completion of preoperative chemoradiation and surgery of less than 5 weeks or more than 14 weeks (indicating urgent and salvage resections, respectively) were excluded.

Histopathologic assessment
Histopathologic examination of the resected specimen was standardized in accordance with the seventh edition of the American Joint Committee on Cancer protocol for TNM classification [41]. The degree of pathologic response to neoadjuvant treatment was graded as follows [42]: complete absence of residual cancer (tumour regression grade [TRG] 1), 1-10 % residual carcinoma (TRG 2), 11-50 % residual carcinoma (TRG 3), and >50 % residual carcinoma (TRG 4). A poor pathologic response (defined as TRG 3-4) as opposed to a good pathologic response (defined as TRG 1-2) was considered the reference standard of this study.

Image acquisition
18F-FDG PET/computed tomography (CT) scans were performed on an integrated PET/CT system (Discovery RX, ST, or STE; GE Medical Systems, Milwaukee [WI], USA). Before 18F-FDG PET, a CT scan was acquired (120 kVp, 300 mA, 0.5 seconds rotation, pitch of 1.375, slice thickness 3.75 mm, and slice interval 3.27 mm) for attenuation correction purposes. 18F-FDG PET scans were acquired 60-90 minutes after administration of 18F-FDG with a dose of 555-740 MBq, in either two-dimensional (2-D) or three-dimensional (3-D) acquisition mode at 3-5 minutes per bed position. Images were reconstructed using ordered-subset expectation maximization for 2-D images or iterative reconstruction for 3-D images. All analyses were performed on the attenuation-corrected images.

Image analysis
The primary tumour was defined as the volume of interest (VOI) and delineated on the 18F-FDG PET scans using a semi-automatic gradient-based delineation method from commercially available software (MIM Software, Cleveland [OH], USA). This contouring method has recently been validated in a multi-observer study that showed superiority over manual and threshold methods [43]. The following quantitative features were extracted from the VOIs of the 18F-FDG PET scans at baseline and after induction chemotherapy (before preoperative CRT): maximum and mean standardized uptake value (SUVmax and SUVmean), metabolic tumour volume (MTV) and total lesion glycolysis (TLG). The MTV was automatically calculated by the software by summing up the areas within each two-dimensional transverse tumour contour multiplied by the corresponding slice thickness. The TLG was calculated by multiplying MTV by SUVmean [44]. In addition, the relative changes (in %) of these parameters between 18F-FDG PET at baseline and 18F-FDG PET after induction chemotherapy were calculated and included in the analysis.
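The definitions above can be illustrated with a minimal Python sketch. It is not the commercial software's implementation; the function names, contour areas, slice thickness, and SUV values are hypothetical and chosen only for illustration. The relative change here is computed on untransformed values, whereas the statistical analysis used logarithmically transformed values for several parameters.

```python
def metabolic_tumour_volume(contour_areas_cm2, slice_thickness_cm):
    """MTV in cm^3: sum of per-slice contour areas times the slice thickness."""
    return sum(area * slice_thickness_cm for area in contour_areas_cm2)


def total_lesion_glycolysis(mtv_cm3, suv_mean):
    """TLG: metabolic tumour volume multiplied by the mean SUV."""
    return mtv_cm3 * suv_mean


def relative_change_percent(baseline, follow_up):
    """Relative change in % from baseline; negative values mean a decrease."""
    return 100.0 * (follow_up - baseline) / baseline


# Hypothetical example: three transverse contours on 0.327 cm slices.
mtv = metabolic_tumour_volume([2.0, 3.5, 3.0], 0.327)

# Hypothetical TLG at baseline and after induction chemotherapy.
tlg_baseline = total_lesion_glycolysis(20.0, 5.0)
tlg_after = total_lesion_glycolysis(15.0, 4.0)
delta_tlg = relative_change_percent(tlg_baseline, tlg_after)
print(mtv, tlg_baseline, tlg_after, delta_tlg)
```

In this toy case the TLG falls from 100 to 60, a relative change of −40 %.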

Statistical analysis
First, the association between clinical parameters and poor versus good pathologic response was studied using the chi-square test (or Fisher's exact test in case of small cell count) for categorical parameters, and Student's t-test for parametric continuous parameters. The association between the quantitative 18F-FDG PET parameters and pathologic response was quantified using logistic regression analysis providing odds ratios (ORs) with 95 % confidence intervals (CIs). Multiple 18F-FDG PET parameters were logarithmically transformed to meet the assumption of linearity on the logit scale. For these parameters, the relative changes (%) were calculated using the logarithmically transformed parameter values before and after induction chemotherapy.
Second, receiver operating characteristics (ROC) curve analyses (providing area-under-the-curve [AUC] values) were used to assess the potential of the studied 18F-FDG PET parameters to discriminate poor responders from good responders. For the 18F-FDG PET parameter with the highest discriminatory ability (AUC), the sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) were calculated for an optimal threshold that was determined by giving equal weight to sensitivity and specificity on the ROC curve.
Third, the Kaplan-Meier method was applied to estimate progression-free and overall survival differences among patients predicted to have a poor versus good response based on the 18F-FDG PET parameter with the highest discriminatory ability. For the survival analysis, the log-rank test was used to determine significance. Progression-free survival and overall survival were calculated from the starting date of induction chemotherapy to the date of disease progression after surgery or the date of death, respectively. In patients who were free of disease progression or alive at last follow-up, the date of last follow-up was used to censor progression-free or overall survival times, respectively. Statistical analysis was performed using SPSS 23.0 (IBM Corp., Armonk [NY], USA) and R 3.1.2 open-source software (http://www.R-project.org). A p-value <0.05 was considered statistically significant.
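One common way to give equal weight to sensitivity and specificity is to maximise Youden's index (sensitivity + specificity − 1). The text does not name the exact criterion used, so the following Python sketch is an assumption, and the function name and toy data are hypothetical. A patient is predicted to be a poor responder when the relative change lies above the threshold, i.e. when the decrease is smaller.

```python
def youden_threshold(values, is_poor):
    """Return (threshold, sensitivity, specificity) maximising Youden's index.

    A patient is predicted 'poor' when value > threshold. Candidate
    thresholds are midpoints between consecutive distinct values; on ties
    the lowest candidate is kept.
    """
    n_pos = sum(is_poor)
    n_neg = len(is_poor) - n_pos
    uniq = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    best = None
    for t in candidates:
        tp = sum(1 for v, y in zip(values, is_poor) if y and v > t)
        tn = sum(1 for v, y in zip(values, is_poor) if not y and v <= t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical relative changes (%) and reference standard (True = poor response).
delta = [-60, -50, -40, -30, -20, -10, 0, 10]
poor = [False, False, False, True, False, True, True, True]
print(youden_threshold(delta, poor))
```

With real data, the threshold found this way would still need external validation, since it is tuned to the cohort it was derived from.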
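The censoring rule described above can be made concrete with a small product-limit (Kaplan-Meier) estimator. This is an illustrative Python sketch, not the SPSS or R routine used in the study; the function name and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time per patient (e.g. months since start of induction
            chemotherapy)
    events: True if the event (progression or death) occurred at that time,
            False if the patient was censored at last follow-up
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        at_this_time = [e for tt, e in data if tt == t]
        n_events = sum(1 for e in at_this_time if e)
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        n_at_risk -= len(at_this_time)
        i += len(at_this_time)
    return curve


# Hypothetical cohort of five patients; two are censored (event = False).
curve = kaplan_meier([5, 10, 10, 20, 30], [True, True, False, True, False])
print(curve)
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the curve only steps down at event times.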

Results
From a total of 132 patients with an oesophageal adenocarcinoma who underwent induction chemotherapy and preoperative chemoradiotherapy followed by surgery in the study period, 70 were considered eligible for analysis. Some excluded patients missed at least one of two 18F-FDG PET scans of interest performed at our institution (n = 28); these patients had similar response and survival rates compared to the included cohort. Other excluded patients had a Siewert type 3 gastro-oesophageal junction tumour (n = 15), a non-FDG-avid tumour (n = 6), a stent in situ at the time of scanning (n = 1), or underwent an urgent or salvage oesophagectomy (n = 1 and n = 11, respectively).
Patients with a poor response had a mean age of 60 years and 96 % (n = 26) of them were male, whereas patients with a good response had a mean age of 59 years and 88 % (n = 38) of them were male. None of the studied baseline characteristics were significantly related to the pathologic response to neoadjuvant treatment (Table 1). More specifically, only small non-significant differences regarding pathologic response for the various induction chemotherapy regimens, radiation therapy characteristics and concurrent chemotherapy regimens were found. However, worse tumour characteristics (i.e., higher clinical T-stage, signet ring cell adenocarcinoma, poor differentiation grade) and co-morbidities (i.e., cardiac co-morbidity, diabetes mellitus, chronic obstructive pulmonary disease, and smoking at diagnosis) were consistently observed more frequently in the poor response group.
Table footnote: Data are presented as numbers with percentages in parentheses. †: Expressed as mean ± SD. COPD: Chronic obstructive pulmonary disease. EUS: Endoscopic ultrasound. IMRT: Intensity-modulated radiotherapy.
Baseline 18F-FDG PET parameters, SUVmax, and SUVmean after induction chemotherapy were not related to pathologic poor versus good response (Table 2). However, both a larger MTV and a larger TLG after induction chemotherapy were significantly related to a higher chance of a poor pathologic response (p = 0.01). The relative changes after induction chemotherapy in 18F-FDG PET intensity parameters (i.e., ΔSUVmax and ΔSUVmean) and metabolic tumour volume (i.e., ΔMTV) were also significantly related to pathologic response (p = 0.01), and their discriminatory ability appeared to be superior compared with single time point measurements (AUC range 0.71-0.72 vs. 0.52-0.69; Table 2). The association of the relative change in (the logarithmically transformed) total lesion glycolysis (ΔTLG) with pathologic response was highly significant (p < 0.01) and this parameter yielded the highest discriminatory ability (AUC 0.74).
The ideal cut-off value for ΔTLG to distinguish poor pathologic responders from good responders was statistically determined at −26 % (i.e., a 26 % decrease). Patients with a ΔTLG above (n = 25) versus below (n = 45) this threshold had a poor pathologic response in 72 % versus 20 % of cases, respectively. At the threshold of −26 %, the ΔTLG yielded a sensitivity of 67 % (95 % CI: 51-79 %), specificity of 84 % (95 % CI: 74-91 %), accuracy of 77 % (95 % CI: 65-86 %), PPV of 72 % (95 % CI: 55-85 %), and NPV of 80 % (95 % CI: 71-87 %) for predicting a poor pathologic response (Fig. 1).
Post-operative 30-day and 90-day mortality rates were 1 % (1 of 70) and 4 % (3 of 70), respectively. These three patients (who were part of the predicted good responders group) were excluded from survival analysis. For patients alive at last follow-up, the median follow-up duration was 48 months (range 15 to 99). In the 25 patients with a predicted poor response based on (the logarithmically transformed) ΔTLG, the median progression-free survival was 17 months, whereas the median progression-free survival in the 42 patients with a predicted good response was not reached (Fig. 2a). The progression-free survival was significantly better for the predicted good responders compared to the predicted poor responders based on ΔTLG (p = 0.02). Although overall survival rates appeared higher in patients with a predicted good response (median, not reached) compared to predicted poor responders (median, 70 months), this difference was not statistically significant (p = 0.18; Fig. 2b).
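The diagnostic indices reported for the −26 % threshold can be checked with a short Python sketch. The 2×2 counts below are not stated explicitly in the text; they are back-calculated from the reported group sizes and percentages (72 % of 25 predicted poor responders and 20 % of 45 predicted good responders had a poor pathologic response), and the function name is hypothetical.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Diagnostic performance for predicting a poor pathologic response."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / total,
    }


# Back-calculated counts (assumption): 18 of 25 predicted poor responders and
# 9 of 45 predicted good responders had a poor pathologic response.
indices = diagnostic_indices(tp=18, fp=7, fn=9, tn=36)
for name, value in indices.items():
    print(f"{name}: {100 * value:.0f} %")
```

Under these counts the rounded values agree with the reported sensitivity (67 %), specificity (84 %), accuracy (77 %), PPV (72 %), and NPV (80 %), and with the overall prevalence of a poor response (27 of 70, 39 %).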

Discussion
In this study, the value of 18F-FDG PET before and after induction chemotherapy for the prediction of response to neoadjuvant treatment was investigated in patients undergoing induction chemotherapy followed by trimodality therapy for oesophageal adenocarcinoma. Significant associations were found between treatment-induced changes in studied 18F-FDG PET parameters and histopathologic tumour regression defined as poor response (TRG 3-4) versus good response (TRG 1-2).
A decrease of less than 26 % in (the logarithmically transformed) TLG after induction chemotherapy, indicating only a mild reduction in intensity and volume of FDG uptake of the primary tumour, predicted a poor pathologic response with a specificity of 84 % and PPV of 72 %. This implies that the baseline (a priori) chance of a poor pathologic response of 39 % (i.e., the overall prevalence) almost doubled to 72 % (i.e., the PPV) in predicted poor responders. This is particularly interesting when considering modification of the chemotherapy regimen administered concurrently with preoperative CRT after induction chemotherapy (e.g., in patients with burdening toxicity from induction chemotherapy) or even omission of ineffective and toxic preoperative CRT in predicted poor responders. On the other hand, a strong reduction of more than 26 % in TLG after induction chemotherapy predicted a good pathologic response with a sensitivity of 67 % and NPV of 80 %. This implies that the baseline (a priori) chance of a good pathologic response of 61 % (i.e., the overall prevalence) increased to 80 % (i.e., the NPV) in predicted good responders. This indicates that 18F-FDG PET before and after induction chemotherapy provides a reasonable basis to encourage good responders to have induction chemotherapy and to proceed with preoperative chemoradiotherapy.
Several single-arm phase I-II studies [10-14, 19, 21-23, 25] and two retrospective comparative studies [15,16] found promising results with the three-step treatment strategy compared to preoperative CRT without induction chemotherapy in terms of treatment response, R0 resection rates, and survival rates. However, this potential superiority was not found in a retrospective comparative study [17] and two prospective randomized phase II studies [27,29]. One study suggested that only patients with stage III and IVa (and not stage II) disease who received induction chemotherapy had a significant survival advantage over preoperative CRT alone [16]. The three-step approach has not been evaluated in the context of a phase III trial. Therefore, the use of induction chemotherapy to improve oncologic outcomes remains a subject of debate. Nonetheless, the response to induction chemotherapy may serve as a marker for tumour sensitivity indicating whether benefit is to be expected from subsequent CRT or whether different chemotherapeutic agents should be incorporated into the preoperative CRT [24][25][26].
Since oesophageal cancer patients with a poor pathologic response to neoadjuvant treatment do not seem to benefit from this treatment but are exposed to its treatment-related toxicity [11,13,30,31], accurately predicting pathologic response before or early during treatment would produce much-needed knowledge to help individualize therapy. In this regard, the predictive value of 18F-FDG PET response has previously been reported in preoperative chemotherapy studies of patients with oesophageal adenocarcinoma [45,46]. In the subsequent MUNICON trial from that group [32], 18F-FDG PET-based poor responders early during preoperative chemotherapy were referred for immediate surgery rather than continuation of preoperative chemotherapy, and this discontinuation of ineffective chemotherapy did not adversely affect outcome compared with continuing such therapy [32].
Figure caption: Horizontal continuous lines represent group means and the dotted line represents the optimal discriminatory cut-off level for ΔTLG of −26 %.
The current study demonstrates that 18F-FDG PET before and after induction chemotherapy yields a moderate ability to predict a poor pathologic response to subsequent preoperative CRT. The value of 18F-FDG PET in this setting has been previously described in four smaller cohorts [20,24,26,47], one of which had no histopathologic reference as no surgery was performed [26]. Similar to the current study, three previous studies with 45, 55, and 46 patients, respectively [20,24,47], performed 18F-FDG PET before and after induction chemotherapy and reported a significant association between early 18F-FDG PET response and histopathologic tumour regression. Two studies reported the predictive performance of 18F-FDG PET for predicting a poor pathologic response with sensitivities of 52 % and 68 %, and specificities of 60 % and 52 % [20,47]. The differences with the current study (sensitivity 67 %, specificity 84 %) may be explained by varying 18F-FDG PET hardware, scan protocols, and reconstruction algorithms between studies [20,47] and within one multicenter study [47], by the different applied thresholds for 18F-FDG PET response [20,47], and by the different treatment regimens used in other studies [20,47]. One previous study only reported on the value of 18F-FDG PET before and after induction chemotherapy to predict residual cancer as opposed to a pathologic complete response (i.e., TRG 2-4 vs. 1), and found a sensitivity of 61 % and specificity of 89 % [24]. These results led investigators to examine the use of 18F-FDG PET to direct preoperative therapy in patients with oesophageal cancer in the Cancer and Leukemia Group B trial 80803, which was opened in 2011 [24]. Results of that trial, in which the chemotherapy regimen to be used during preoperative CRT will be selected by 18F-FDG PET response after induction chemotherapy, are currently awaited.
Although 18F-FDG PET before and after induction chemotherapy appears to have a reasonable discriminatory ability for predicting pathologic response, it remains suboptimal. Studies have been focusing mainly on quantitative parameters, but subjective assessment by clinicians is thought to have some additional potential, as it is felt that on post-treatment scans more focused 18F-FDG avidity instead of linear uptake may be indicative of a poor response. Unfortunately, other modalities that have been extensively studied for predicting pathologic response, including endoscopic biopsy, endoscopic ultrasonography, and CT, yielded unsatisfactory results [38,48]. Recently, diffusion-weighted magnetic resonance imaging has been suggested as a potentially powerful tool for this purpose [49], but this tool has not yet been described in the setting of a three-step treatment strategy and requires further validation.
Besides pathologic response, 18F-FDG PET response (ΔTLG) after induction chemotherapy was also significantly associated with progression-free survival (p = 0.02), but not with overall survival (p = 0.18), in the current study. This finding is supported by a previous prospective study in which 18F-FDG PET responders to induction chemotherapy had significantly improved progression-free survival (p = 0.02), but not overall survival (p = 0.29) [24]. In this way, the early response to induction chemotherapy apparently is an indicator of tumour biology and the likelihood of treatment failure. As such, the early 18F-FDG PET response after induction chemotherapy could aid in patient selection for treatment intensification or modification aiming to reduce the high risk of locoregional and distant recurrences in the poor responders.
Certain limitations apply to this study. First, the study was retrospective by nature. Second, different regimens of induction chemotherapy and preoperative chemoradiotherapy were applied in this study. However, our analysis was strengthened by including the largest sample size for this topic so far, using a prospectively maintained database, and using modern 18F-FDG PET techniques and imaging analysis.
In conclusion, this study demonstrated that 18F-FDG PET seems useful to predict a poor pathologic response early after induction chemotherapy in patients with oesophageal adenocarcinoma undergoing a three-step treatment strategy. As such, the early 18F-FDG PET response after induction chemotherapy has the potential to aid in individualized treatment decision-making in this group of patients. However, the standard use of 18F-FDG PET for this indication cannot yet be recommended, as the findings (e.g., the determined threshold) of the current exploratory study require external validation. Also, a larger sample size is desired as the 95 % CIs of the estimated diagnostic performance indices in the current study were relatively wide. Finally, additional studies are required to determine and validate whether 18F-FDG PET alone or in combination with other modalities provides sufficient accuracy to justify modification or withdrawal of subsequent CRT prior to surgery.

Compliance with Ethical Standards
Funding This study was funded in part by The University of Texas MD Anderson Cancer Center and by the National Cancer Institute Cancer Center Support Grant CA016672.
Conflicts of interest All authors have no conflict of interest to declare.
Ethical approval All procedures performed in this study were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent This retrospective study was approved by our Institutional Review Board, and the need for written informed consent was waived.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.