Volumetric study reveals the relationship between outcome and early radiographic response during bevacizumab-containing chemoradiotherapy for unresectable glioblastoma

Purpose Although we have shown the clinical benefit of bevacizumab (BEV) in the treatment of unresectable newly diagnosed glioblastomas (nd-GBM), the relationship between early radiographic response and survival outcome remains unclear. We performed a volumetric study of early radiographic responses in nd-GBM treated with BEV. Methods Twenty-two patients with unresectable nd-GBM treated with BEV during concurrent temozolomide radiotherapy were analyzed. An experienced neuroradiologist interpreted early responses on fluid-attenuated inversion recovery (FLAIR) and gadolinium-enhanced T1-weighted images (GdT1WI). Volumetric changes were evaluated using diffusion-weighted imaging (DWI) and GdT1WI according to the Response Assessment in Neuro-Oncology (RANO) criteria. The results were categorized into improved (complete response [CR] or partial response [PR]) or non-improved (stable disease [SD] or progressive disease [PD]) groups; outcomes were compared using Kaplan–Meier analysis. Results Volumetric GdT1WI improvement was a significant predictive factor for overall survival (OS) prolongation (p = 0.0093, median OS: 24.7 vs. 13.6 months); however, FLAIR and DWI changes were not predictive. The threshold for the neuroradiologist’s interpretation of improvement on GdT1WI was a volume reduction of nearly 20%, which is lower than the 50% reduction that defines PR in the RANO criteria. Nevertheless, even this less stringent neuroradiologist interpretation predicted OS prolongation (improved vs. non-improved: p = 0.0067, median OS: 17.6 vs. 8.3 months). A significant impact of the early volumetric GdT1WI response on OS was observed within the cut-off range of 20–50% (20%, p = 0.0315; 30%, p = 0.087; 40%, p = 0.0456). Conclusions Early response during BEV-containing chemoradiation can be a predictive indicator of patient outcome in unresectable nd-GBM.


Introduction
Glioblastoma (GBM) is one of the most common malignant brain tumors and has a poor prognosis. The current standard treatment for newly diagnosed GBM (nd-GBM) is maximal safe removal with concurrent temozolomide and radiation (TMZ-RT), followed by maintenance TMZ with, if possible, tumor-treating fields [1]. Despite such multimodal treatment, the median overall survival (OS) remains less than 2 years.
In addition to these treatments, the FDA approved bevacizumab (BEV), an anti-VEGF antibody and molecular-targeted drug that exerts indirect antitumor activity by inhibiting tumor angiogenesis [2], as a treatment for recurrent GBM in 2009. Thereafter, two randomized trials, AVAglio and RTOG0825, were conducted to verify the efficacy of BEV for the treatment of nd-GBM; both showed only progression-free survival (PFS) prolongation and no impact on OS [3,4]. Accordingly, there is no robust evidence supporting the efficacy of BEV treatment for nd-GBM; however, BEV has been approved in Japan as an insurance-covered first-line drug for GBM concurrently with its second-line application, considering the benefit of maintaining patient performance status [5]. Since then, Japanese institutes, including ours, have launched several real-world studies, which indicate the clinical benefits of optional first-line BEV for patients with severe clinical conditions [1,6–10]. In practice, we selected first-line BEV for patients with unresectable GBM and accumulated the clinical data [11]. These case series revealed that the radiographic course following first-line BEV for unresectable tumors varied among patients, and how this course relates to outcome remains an unresolved issue.
In the response judgment of GBM, gadolinium-enhanced T1-weighted image (GdT1WI) measurement based on the Macdonald criteria has generally been applied [12]. However, in BEV treatment for GBM, an apparent tumor reduction on GdT1WI, the so-called pseudoresponse, can be observed early after treatment. Therefore, the evaluation of response using GdT1WI alone may overestimate the therapeutic effect of BEV. To account for this complicated radiographic response during BEV treatment, the Response Assessment in Neuro-Oncology (RANO) criteria added the evaluation of non-enhancing lesions using T2/FLAIR [13]. Consequently, integrated evaluation based on multiple magnetic resonance imaging (MRI) sequences became essential for the assessment of treatment response; however, the association between this complicated radiographic response and clinical outcome remains controversial.
Only a limited number of studies have explored the relationship between radiographic response and clinical outcome following BEV treatment for nd-GBM [14,15]. In this study, we retrospectively examined the detailed radiographic response using MRI scans during TMZ-RT combined with BEV for unresectable nd-GBM, and aimed to elucidate the relationship between early radiographic response and clinical outcome.

Patients
Since the Japanese approval of BEV for GBM in 2013, 72 adult (> 18 y) patients with IDH-wt nd-GBM were registered in our brain tumor database. Adaptive add-on BEV treatment to the Stupp regimen, described previously [8–10], was selected for patients with unresectable GBM in our institute. Patients whose postsurgical residual tumors were radiographically evident were included in the present study. Two patients were excluded because BEV was added for the treatment of a novel lesion during radiotherapy, or because radiographic total resection had been performed previously. Finally, 22 patients were enrolled in the present study (Table 1). During radiotherapy, all enrolled patients underwent biweekly BEV administrations in combination with TMZ (mean 2.81 times; min 1; max 4). Steroid (betamethasone) was administered in three patients for neurological symptom control during concurrent chemoradiation therapy (CCRT). Two patients

Neuroimaging findings
We evaluated the change in MRI images between pre- and post-CCRT (pre-RT and post-RT). FLAIR was evaluated according to the RANO criteria [13]. Improvement in diffusion-weighted imaging (DWI) was evaluated using the measurement methods proposed by another group [16]. RANO criteria-based judgments were also performed on GdT1WI by measuring enhancing lesions as the sum of the products of the perpendicular diameters (Gd-SPPD), and patients were categorized into two groups: improvement (complete response [CR] or partial response [PR]) and non-improvement (stable disease [SD] or progressive disease [PD]) [13].
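The Gd-SPPD categorization described above can be illustrated with a short sketch. It is a simplification and not the authors' measurement pipeline: it applies only the size thresholds of the RANO criteria (at least a 50% decrease for PR, at least a 25% increase for PD) and ignores the other RANO components such as T2/FLAIR change, steroid dose, clinical status, and confirmation scans. The function names and the lesion diameters in the usage note are hypothetical.

```python
from typing import List, Tuple


def sppd(lesions: List[Tuple[float, float]]) -> float:
    """Sum of the products of perpendicular diameters (mm^2) over all lesions."""
    return sum(a * b for a, b in lesions)


def percent_change(baseline: float, follow_up: float) -> float:
    """Signed percent change from baseline; negative values mean shrinkage."""
    if baseline <= 0:
        raise ValueError("baseline SPPD must be positive")
    return 100.0 * (follow_up - baseline) / baseline


def size_category(baseline: float, follow_up: float) -> str:
    """Size-only category (assumption: other RANO components are ignored)."""
    if follow_up == 0:
        return "CR"  # enhancing lesion no longer measurable
    change = percent_change(baseline, follow_up)
    if change <= -50.0:
        return "PR"  # at least 50% decrease in SPPD
    if change >= 25.0:
        return "PD"  # at least 25% increase in SPPD
    return "SD"


def is_improved(category: str) -> bool:
    """Dichotomize as in the text: CR/PR are improved, SD/PD are non-improved."""
    return category in ("CR", "PR")
```

For example, a hypothetical single lesion shrinking from 40 × 20 mm to 30 × 10 mm gives an SPPD change of -62.5% and is categorized as PR (improved), whereas shrinkage to 35 × 20 mm (-12.5%) is categorized as SD (non-improved).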

Statistical analysis
Chi-squared and Fisher's exact tests were used to investigate the relationship of neuroimaging changes with patient characteristics and molecular genetic stratification. Kaplan-Meier analysis was conducted to evaluate OS, and the log-rank test was used to compare survival distributions. The level of statistical significance was set at p < 0.05. All statistical analyses were performed using JMP Pro version 13 (SAS Institute Inc., NC, USA).
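For readers unfamiliar with these methods, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit estimator and a two-group log-rank test. It is illustrative only: the analyses in this study were performed in JMP Pro, and the function names here are hypothetical. The p-value uses the chi-squared distribution with one degree of freedom, whose upper tail equals erfc(sqrt(x/2)).

```python
import math
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, S(time)) at each distinct event time; events: 1 = death, 0 = censored."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects leaving the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First event time at which S(t) falls to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def logrank_test(times_a, events_a, times_b, events_b) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-squared statistic, p-value)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in pooled if g == 0 and u >= t)
        n_b = sum(1 for u, _, g in pooled if g == 1 and u >= t)
        d_a = sum(1 for u, e, g in pooled if g == 0 and u == t and e)
        d_b = sum(1 for u, e, g in pooled if g == 1 and u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for a single subject
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

In practice, a validated statistics package should be preferred; the sketch only makes explicit what is being compared when two survival curves are tested against each other.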

Background characteristics
Patient characteristics and molecular genetic stratifications of tumors are summarized in Table 1. Improvement on DWI, FLAIR, and Gd-SPPD was judged in 8 (36.4%), 16 (72.7%), and 13 (59.1%) patients, respectively. Table 2 shows the genetic markers analyzed in this study. In the univariate analysis, unmethylated MGMT status and CDKN2A homozygous deletion showed a trend toward poor prognosis (unmethylated MGMT status: HR 2.54, 95% confidence interval [CI] 0.86-7.5, CDKN2A: HR 2.49, 95% CI 0.86-7.27, p = 0.093). Prognostic significances of analyzed genetic markers were also evaluated in the multivariate analysis, and unmethylated MGMT was the only  No significant bias in clinical and genetic backgrounds was detected between patients whose radiographic findings improved with each of the three judgments (Table 3).

Survival outcome of radiographic findings
Kaplan-Meier analyses revealed that only Gd-SPPD improvement was a significant predictive factor for OS prolongation (p = 0.0093). The median OS was 24.7 months when Gd-SPPD improved and 13.6 months when it did not. In contrast, FLAIR and DWI images were not predictive of OS outcome (FLAIR improved vs. non-improved: p = 0.13, 16.9 vs. 16.3 months; DWI improved vs. non-improved: p = 0.48, 17.6 vs. 16.3 months) (Fig. 2).
In addition to the evaluation of Gd-SPPD, GdT1WI improvement was interpreted by a neuroradiologist (Gd-IP). Improvement in Gd-IP was associated with OS prolongation (improvement vs. non-improvement: p = 0.0067, 17.6 vs. 8.3 months). We compared the discrepant judgment between Gd-SPPD and Gd-IP results for seven cases in which the reduction was less than 50% in the measurement, as shown in Fig. 3 (Table 4).
To determine a suitable threshold of GdT1WI improvement for outcome prediction, we varied the cut-off for PR judgment in the measurement from 10% to 70% and performed Kaplan-Meier analysis at each cut-off (Fig. 4). OS prolongation was observed at cut-offs from 20% to 50% (20% improvement vs. non-improvement: p = 0.0315, 30% improvement vs. non-improvement: p = 0.087, 40% improvement vs. non-improvement: p = 0.0456).
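The cut-off scan can be sketched as follows. This is only an illustration of the dichotomization step under stated assumptions: the per-patient reduction values in the usage note are hypothetical, the function names are not from the study, and whether a reduction exactly equal to the cut-off counts as improved is an assumption (here it does). Each resulting split would then be passed to a Kaplan-Meier analysis with a log-rank test.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def split_by_cutoff(reductions: Sequence[float], cutoff: float) -> Tuple[List[int], List[int]]:
    """Return patient indices (improved, non_improved) for one cut-off.

    reductions: percent reduction in Gd-SPPD per patient (positive = shrinkage).
    Assumption: a reduction equal to the cut-off is counted as improved.
    """
    improved = [i for i, r in enumerate(reductions) if r >= cutoff]
    non_improved = [i for i, r in enumerate(reductions) if r < cutoff]
    return improved, non_improved


def scan_cutoffs(
    reductions: Sequence[float],
    cutoffs: Iterable[int] = range(10, 80, 10),  # 10%, 20%, ..., 70%
) -> Dict[int, Tuple[int, int]]:
    """Group sizes (improved, non_improved) at each cut-off."""
    sizes = {}
    for c in cutoffs:
        improved, non_improved = split_by_cutoff(reductions, c)
        sizes[c] = (len(improved), len(non_improved))
    return sizes
```

With hypothetical reductions of 5, 15, 25, 35, 55, 75, and -10 percent, the 20% cut-off places four patients in the improved group and three in the non-improved group, while the 50% cut-off places two and five. Group sizes become unbalanced toward the extremes of the scanned range, which reduces the power of the comparison at those cut-offs.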

BEV toxicity
During the course of CCRT, the only obvious BEV-related toxicity was deep venous thrombosis, which was identified in a single patient and led to a temporary discontinuation of BEV administration.

Discussion
We analyzed the impact of radiographic changes during BEV-containing chemoradiotherapy for unresectable nd-GBM. Changes in DWI and FLAIR images did not have a significant impact, and only GdT1WI improvement was associated with significant OS prolongation. The Macdonald criteria applied to GdT1WI have been the standard for determining the treatment response of GBM. The RANO criteria added FLAIR image progression to detect pseudo-response [12,13]. In previous reports, the relationship between the radiographic response assessed by the RANO criteria and survival outcome following BEV treatment was analyzed in patients with recurrent GBM. Ellingson et al. reported that the changes in FLAIR and GdT1WI were related to neither PFS nor OS, and that the pre-treatment ratio of FLAIR to contrast-enhancing volume was a predictive marker of both PFS and OS [27]. Boxerman et al. reported that early progression of GdT1WI was a poor prognostic factor for OS and that changes in FLAIR images showed no significant impact on OS [28]. Both studies investigated recurrent GBM, and the therapeutic situations were different from those for nd-GBM. These two studies indicated that quantitative FLAIR improvement showed no significant correlation with OS because BEV treatment improved FLAIR hyperintensity through an anti-permeability effect. They support our result that an early response on FLAIR images is not likely to reflect the survival outcome after BEV treatment. It is speculated that FLAIR progression is useful for differentiating pseudo-response, whereas FLAIR improvement does not indicate an antitumor effect. An exploratory analysis of AVAglio classified the type of radiologic progression of nd-GBM treated with TMZ-RT and BEV, revealing that the group with CR on GdT1WI showed longer OS than the PR group [12]. Our results indicated that even PR on GdT1WI could have a survival impact in real-world clinics.
The discrepancy between AVAglio and our results might be due to differences in the background characteristics. Our case series consisted of patients with severe clinical conditions who were more likely to be excluded from clinical trials because of strict inclusion criteria. In addition, the extent of resection should be taken into consideration because CR on GdT1WI is likely to occur after treatment of patients with near completely resected tumors, while very few such cases were included in our cohort. The impact of GdT1WI improvement in clinical practice is currently an unresolved issue to be elucidated by the accumulation of clinical reports from Japanese institutes where BEV is approved for treatment of nd-GBM. Other MRI sequences were also analyzed for their impact on outcome. DWI has been recognized as a promising sequence for the prediction of the response to BEV treatment because the apparent diffusion coefficient level can reflect the cellularity of tumor tissues [29]. Yamasaki et al. reported that DWI can distinguish the pseudo-response from true response after BEV treatment for recurrent GBM, and the evaluation based on the RANO criteria predicted OS more precisely when combined with DWI change, suggesting that DWI can clearly demonstrate the true extent of the tumor area at an early point [12]. In this study, DWI evaluation was performed according to the method of Yamasaki et al.; however, there was no correlation between the OS and DWI responses. It is noteworthy that the impact of radiotherapy should be considered when discussing these issues because the treatment situation is different between nd- and recurrent GBM. Regarding nd-GBM, relative cerebral blood volume (rCBV) is attracting attention for its association with the response to BEV treatment. An exploratory analysis of RTOG0825 revealed OS prolongation with BEV treatment in the high pretreatment rCBV group compared to the placebo group [30]. However, rCBV changes during BEV treatment did not show an impact on OS in either nd- or recurrent GBM [30,31]. Another recent study focused on the contrast between DWI and perfusion images to generate an automated threshold by measuring the hypercellular tumor volume and hyperperfused tumor volume and showed that the ratio changes of these two values during chemoradiotherapy had an impact on OS [32]. On the other hand, they reported that significant GdT1WI volume reduction during chemoradiotherapy was also observed; however, the GdT1WI volume change was not correlated with OS. Nonetheless, the treatment situation in that study was also different from that in ours, in which chemoradiation included BEV administration. Further studies including multiple MRI sequences are warranted to confirm the relationship between early radiographic response and outcome in clinical situations where first-line BEV is approved.

[Figure caption, beginning missing] We performed partial removal of the tumor. The patient was treated with temozolomide (TMZ; 75 mg/m²/day), radiotherapy (intensity-modulated RT, 60 Gy), and bevacizumab (four times in total). On the pre-RT magnetic resonance imaging (MRI), the enhancing lesion measured for the sum of the product of the perpendicular diameter (Gd-SPPD) was 37 × 19 mm. After RT, GdT1WI showed a reduction in interpretation by the neuroradiologist. However, Gd-SPPD was 30 × 15 mm and the change in SPPD was 32.4%, which was determined as SD based on the Response Assessment in Neuro-Oncology (RANO) criteria. OS was 22.8 months. GBM glioblastoma; RT radiotherapy; SD stable disease; TMZ temozolomide

Table 4 Cases that were judged SD in the Gd-SPPD due to reduction of less than 50%

Our study revealed that Gd-improvement evaluated not only by the RANO criteria but also by a neuroradiologist's impression can predict the outcome of unresectable GBM treated with a BEV-combined regimen. In addition, Gd-SPPD improvement was associated with outcome at cut-offs ranging from 20% to 50%, a range that spans the Gd-IP and Gd-SPPD thresholds. In other words, OS prolongation can be predicted even in cases where Gd-improvement is insufficient to determine PR according to the RANO criteria. These results indicate that, for outcome prediction, evaluation by a neuroradiologist hinted at a clinically appropriate response judgment of BEV-combined treatment for nd-GBM. The criteria of radiographic response based on measuring contrast-enhancing lesions have been consistently used from the Macdonald criteria to the RANO criteria [12,13]. The Macdonald criteria were created based on the WHO oncology response criteria, which are a general diagnostic imaging standard for solid tumors [33]; therefore, the measurement protocol for contrast-enhancing lesions has not been changed for more than 20 years. Our results raise the possibility that some patients evaluated as SD by the RANO criteria may have a good prognosis and suggest an alternative threshold value for identifying the group with a good prognosis [26].
Our study has several limitations. First, this was a single-center, non-randomized, retrospective study that included a small number of patients. Hence, our results should be verified in a larger cohort. As first-line BEV for GBM has not been approved outside of Japan, a multi-institutional clinical study involving several Japanese facilities is desirable. Second, subsequent treatments were inconsistent among the enrolled patients, which might have affected the outcome. Third, the analyzed image sequences were limited to DWI, Gd, and FLAIR, and other sequences such as rCBV may be more significant predictors. Fourth, while there was a correlation between the extent of Gd-improvement and outcome, how the background bioactivity contributed to this relationship was unclear. In the present study, univariate analysis for molecular markers revealed that an unmethylated MGMT status and CDKN2A homozygous deletion showed a trend toward poor prognosis. Our recent study also reported that MGMT and CDKN2A status could stratify Japanese GBM patients into three race-specific groups with different prognoses [26]. Further accumulation of studies including molecular-genetic signatures and evidence beyond real-clinic data is warranted to evaluate the significance of image changes during BEV-included regimens for unresectable GBM.

Conclusions
We examined the radiographic response in multiple MRI sequences (FLAIR, Gd-SPPD, and DWI) in patients with unresectable nd-GBM treated with BEV-including chemoradiotherapy and proved that the Gd-SPPD improvement group showed a significant prolongation of OS. Furthermore, the OS impact was significant even with less strict judgment of radiographic response compared with that of the RANO criteria. This raised the possibility that some patients evaluated as SD by the RANO criteria may have a good prognosis, and the interpretation of neuroradiologists likely hinted at an alternative evaluation for outcome prediction.

Fig. 4 Correlation between Gd-improvement and outcome. A Significant overall OS prolongation is revealed in the Gd-interpretation improvement group. C-F OS prolongation is observed in the improvement group according to a cut-off line from 20 to 50% (20% improved vs. non-improved: p = 0.0315, 30% improved vs. non-improved: p = 0.087, 40% improved vs. non-improved: p = 0.0456). B, G, H On the other hand, this outcome disappeared when using a 10%, 60%, and 70% cut-off line. OS overall survival; PR partial response

Funding This work was supported by a Japanese Society for the Promotion of Science Grants-in-Aid for Scientific Research (JSPS KAKENHI) Award (Grant No. 19K17673, 21H03044, 21K09128, 20K17972, and JP20K09392).

Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The present investigation was approved by the ethics committee (Ethics review number: 2019-090).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.