Static FET PET radiomics for the differentiation of treatment-related changes from glioma progression

Purpose To investigate the potential of radiomics applied to static clinical PET data using the tracer O-(2-[18F]fluoroethyl)-l-tyrosine (FET) to differentiate treatment-related changes (TRC) from tumor progression (TP) in patients with gliomas.
Patients and Methods One hundred fifty-one (151) patients with histologically confirmed gliomas and post-therapeutic progressive MRI findings according to the response assessment in neuro-oncology criteria underwent a dynamic amino acid PET scan using the tracer O-(2-[18F]fluoroethyl)-l-tyrosine (FET). Thereof, 124 patients were investigated on a stand-alone PET scanner (data used for model development and validation), and 27 patients on a hybrid PET/MRI scanner (data used for model testing). Mean and maximum tumor to brain ratios (TBRmean, TBRmax) were calculated using the PET data from 20 to 40 min after tracer injection. Logistic regression models were evaluated for the FET PET parameters TBRmean, TBRmax, and for radiomics features of the tumor areas as well as combinations thereof to differentiate between TP and TRC. The best performing models in the validation dataset were finally applied to the test dataset. The diagnostic performance was assessed by receiver operating characteristic analysis.
Results Thirty-seven patients (25%) were diagnosed with TRC, and 114 (75%) with TP. The logistic regression model comprising the conventional FET PET parameters TBRmean and TBRmax resulted in an AUC of 0.78 in both the validation (sensitivity, 64%; specificity, 80%) and the test dataset (sensitivity, 64%; specificity, 80%). The model combining the conventional FET PET parameters and two radiomics features yielded the best diagnostic performance in the validation dataset (AUC, 0.92; sensitivity, 91%; specificity, 80%) and demonstrated its generalizability in the independent test dataset (AUC, 0.85; sensitivity, 81%; specificity, 70%).
Conclusion The developed radiomics classifier allows the differentiation between TRC and TP in pretreated gliomas based on routinely acquired static FET PET scans with a high diagnostic accuracy.


Introduction
During the follow-up of glioma patients, treatment-related changes often cannot be reliably differentiated from tumor progression by structural magnetic resonance imaging (MRI) alone. Yet a false diagnosis will either result in the continuation of an ineffective treatment or a premature termination of an effective one, both negatively impacting patients' prognosis. To improve the differentiation of treatment-related changes and tumor progression, advanced MRI techniques such as perfusion-weighted imaging as well as MR spectroscopy are under investigation, yielding accuracies of about 80% for this clinically challenging and highly important differential diagnosis [1].
Besides advanced MRI, amino acid positron emission tomography (PET) has demonstrated its potential to differentiate treatment-related changes from tumor progression in initial studies [2][3][4]. Previous studies using the amino acid PET tracer O-(2-[18F]fluoroethyl)-l-tyrosine (FET) showed that the combination of static and dynamic parameters discriminates treatment-related changes from tumor progression in recurrent gliomas with an accuracy of up to 90% [5][6][7]. These studies, however, require dynamic FET PET parameters based on a 40-50 min PET scan that is more time consuming in clinical routine than a static scan from 20 to 40 min post injection. Other studies demonstrated an improved diagnostic accuracy by combination of FET PET with advanced MRI methods using hybrid PET/MRI [8,9].
Despite these recent advances, a method to improve the diagnostic performance of FET PET without the need for a time-consuming and expensive dynamic acquisition or additional, dedicated MRI scans would be of clinical relevance.
In recent years, methods based on artificial intelligence and machine learning have become increasingly important and found their way into medical image analysis. Several methods from this field are under investigation also in brain tumor patients and promise to improve diagnosis by extracting additional imaging features from routinely acquired imaging data. These features are usually not accessible through conventional image analysis and can be used to generate prognostic or predictive mathematical models. This methodology is also referred to as radiomics [10,11].
FET PET radiomics has already demonstrated its potential in neuro-oncology for the prediction of the isocitrate dehydrogenase genotype [12], the diagnosis of pseudoprogression [13,14], the differentiation of treatment-related changes from recurrent brain metastases after radiosurgery [15,16], and the prediction of the BRAF mutational status in patients with melanoma brain metastases [17].
The goal of our study was to investigate the potential of FET PET radiomics for the differentiation between treatment-related changes and tumor progression in patients with glioma based on routinely acquired static FET PET when added to clinically established FET PET parameters.

Patients
The patient group was partly included in a previous study concerning the diagnostic performance of perfusion-weighted MRI and dynamic FET PET for the differentiation of treatment-related changes from glioma progression [18].
One hundred and fifty-one patients (n = 54 females, n = 97 males; median age, 52.3 years; age range 20.4-78.0 years) with histologically confirmed WHO grade II-IV glioma according to the 2016 WHO Classification of Tumors of the Central Nervous System [19] were included in this retrospective study [mostly WHO grade IV glioblastoma, IDH-wildtype (n = 71); WHO grade III anaplastic astrocytoma, IDH-mutant (n = 20); WHO grade II astrocytoma, IDH-mutant (n = 14)]. All patients presented with post-therapeutic MRI findings suspicious for tumor progression according to the Response Assessment in Neuro-Oncology (RANO) criteria and were hence investigated using FET PET. All patients included here showed an increased FET uptake in the area of the primary lesion. Complete patient characteristics are provided in Table 1.

Diagnosis of treatment-related changes and tumor progression
Diagnosis was based on histopathology in 46 patients (30%), and on clinicoradiological follow-up in 105 patients (70%). For histopathologic diagnosis, tissue samples were obtained by resection or biopsy and analyzed as described previously [5].
For WHO grade II gliomas, both the clinical and the radiological situation had to be stable or improved for at least 12 months without change in therapy to exclude tumor progression [20]. For WHO grade III-IV gliomas, the diagnosis of treatment-related changes required at least 6 months of stable or improved clinical and radiological condition [21], as well as no change in tumor treatment. Tumor progression was diagnosed if lesions continued to increase in size on at least two subsequent MRI scans according to the RANO criteria, accompanied by a deterioration in performance status, or if a patient died of glioma, whichever occurred first. Of note, the applied classification criteria are in accordance with previous studies [5,22,23].

FET PET imaging
The amino acid FET was produced and applied as described previously [24]. According to international guidelines for brain tumor imaging, all patients fasted for at least 4 h before the PET measurement [25].
All patients underwent a dynamic PET scan from 0 to 50 min post injection of 3 MBq of FET per kg of body weight. 124 patients were examined on a stand-alone PET scanner (ECAT EXACT HR+, Siemens Healthcare, Erlangen, Germany) in 3D mode, and 27 patients on a high-resolution 3 T hybrid PET/MRI scanner (BrainPET, Siemens Healthcare, Erlangen, Germany). The BrainPET is a compact cylinder that fits into the bore of the Magnetom Trio MR scanner [26,27].
As described before [28], the iterative reconstruction parameters were 16 subsets and six iterations using the OSEM algorithm for the ECAT HR+ PET scanner, and two subsets and 32 iterations using the OP-OSEM algorithm for the BrainPET. Data were corrected for random and scattered coincidences, dead time, and motion for both systems. Attenuation correction for the ECAT HR+ PET was based on a transmission scan. For the BrainPET, a template-based approach was used [26]. The reconstructed dynamic data sets consisted of 16 time frames (5 × 1 min; 5 × 3 min; 6 × 5 min) for both scanners. To optimize comparability of the results with respect to the two different PET scanners, reconstruction parameters, and post-processing steps, a 2.5 mm 3D Gaussian filter was applied to the BrainPET data before further processing. In previous phantom experiments with spheres of different sizes that simulated lesions, this filter kernel demonstrated the best comparability of PET data obtained from the ECAT HR+ PET and the BrainPET scanner [29].
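The frame layout above determines which reconstructed frames contribute to the static 20 to 40 min image used later. The following is a minimal sketch in Python that derives the frame boundaries from the stated schedule; the function names are illustrative and not part of the original processing pipeline.

```python
# Frame schedule stated in the text: 5 x 1 min, 5 x 3 min, 6 x 5 min.
FRAME_SCHEDULE = [(5, 1), (5, 3), (6, 5)]  # (number of frames, duration in min)


def frame_boundaries(schedule):
    """Return one (start_min, end_min) tuple per frame, in acquisition order."""
    frames, t = [], 0
    for count, duration in schedule:
        for _ in range(count):
            frames.append((t, t + duration))
            t += duration
    return frames


def frames_in_window(frames, start, end):
    """0-based indices of frames lying entirely inside [start, end] minutes."""
    return [i for i, (a, b) in enumerate(frames) if a >= start and b <= end]


frames = frame_boundaries(FRAME_SCHEDULE)
static_idx = frames_in_window(frames, 20, 40)
print(len(frames), frames[-1], static_idx)
```

Under this schedule, the 20 to 40 min window corresponds to the first four of the six 5-min frames.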

Evaluation of FET PET parameters
The FET uptake was expressed as standardized uptake value (SUV) by dividing the radioactivity in the tissue (kBq/ml) by the radioactivity injected per gram of body weight.
Semi-automated segmentation of the suspected brain lesion was performed in the summed PET images from 20 to 40 min post-injection using the Pmod Biomedical Image Quantification Software (Version 3.806, PMOD Technologies, Zurich, Switzerland). For assessment of the FET uptake in healthy brain tissue, a region-of-interest was positioned in the semioval center of the unaffected hemisphere contralateral to the tumor, involving grey and white matter [25]. As described before [28], tumor segmentation was performed using a 2-dimensional auto-contouring process in the transversal slice containing the voxel with the maximum tracer uptake using a tumor-to-brain ratio (TBR) of 1.6 or more. In a previous study, this threshold has been shown to best separate between vital tumor and healthy brain parenchyma in FET PET [30].
Maximum and mean TBRs (TBRmax, TBRmean) were calculated by dividing the maximum or mean SUV of the tumor by the mean SUV of healthy brain tissue. The segmentations were validated by an experienced, board-certified specialist in neuroradiology with broad experience in nuclear medicine. Figure 1 illustrates the segmented lesions in representative patients.
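The SUV and TBR definitions above can be sketched as follows. This is a simplified illustration, not the PMOD workflow: it thresholds a flat list of lesion voxel values at a TBR of 1.6 instead of performing the 2-dimensional auto-contouring described in the text, and all function names and numeric inputs are hypothetical.

```python
from statistics import mean


def suv(activity_kbq_per_ml, injected_kbq, body_weight_g):
    """Tissue activity divided by injected activity per gram of body weight."""
    return activity_kbq_per_ml / (injected_kbq / body_weight_g)


def tumor_to_brain_ratios(lesion_suv, background_suv, threshold=1.6):
    """Return TBRmean and TBRmax over lesion voxels with TBR >= threshold.

    Simplification: voxel-wise thresholding stands in for the contouring
    step; returns None if no voxel reaches the threshold.
    """
    background = mean(background_suv)
    ratios = [value / background for value in lesion_suv]
    tumor = [r for r in ratios if r >= threshold]
    if not tumor:
        return None
    return {"TBRmean": mean(tumor), "TBRmax": max(tumor)}


# Hypothetical example: 70 kg patient, 3 MBq/kg injected (210 MBq in total).
example_suv = suv(6.0, injected_kbq=210_000, body_weight_g=70_000)
example_tbr = tumor_to_brain_ratios([1.2, 1.6, 2.0, 3.0], [1.0, 1.0])
print(example_suv, example_tbr)
```

Because the background mean is a constant, the mean of the voxel-wise ratios equals the mean tumor SUV divided by the mean brain SUV, which matches the definition in the text.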

Image preprocessing and radiomics feature extraction
The group of patients scanned on the stand-alone PET scanner was divided into a training and a validation dataset in a ratio of 3/1 with an equal ratio of tumor progression to treatment-related changes. The group of patients scanned on the 3 T hybrid PET/MR scanner was used for model testing.
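A stratified 3/1 split of this kind can be sketched as shown below. The class counts (31 treatment-related changes, 93 tumor progression) are those reported for the stand-alone scanner in the Results; the random seed, the rounding rule, and the function name are assumptions, so the resulting partition is illustrative and not the one used in the study.

```python
import random


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split indices into training/validation sets, preserving class ratios."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)  # assumed rounding rule
        train += idx[:cut]
        valid += idx[cut:]
    return sorted(train), sorted(valid)


labels = ["TRC"] * 31 + ["TP"] * 93
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

With these counts the sketch yields 93 training and 31 validation cases, each with roughly one quarter treatment-related changes.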
Feature extraction was performed by the RadiomiX toolbox (supported by Oncoradiomics, Liège, Belgium) [31] implemented in Matlab 2017a (MathWorks, Natick, MA, USA), including International Biomarker Standardization Initiative (IBSI)-compliant [32] radiomic features as well as others. A total of 221 features were extracted. No spatial resampling was performed. Absolute intensity resampling was performed using a fixed bin width of 0.1 according to current recommendations [33].
The definitions and detailed feature descriptions are provided elsewhere [10]. Detailed mathematical definitions of the features are available in the RadiomiX toolbox manual.
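To illustrate absolute intensity resampling with a fixed bin width, the sketch below maps each voxel value to a 1-based bin index. It follows the common fixed-bin-size convention, floor((x - minimum) / width) + 1; the lower bound of 0 and the function name are assumptions, and the toolbox's exact implementation may differ.

```python
from math import floor


def discretize_fixed_bin_width(values, bin_width=0.1, minimum=0.0):
    """Map intensities to 1-based bin indices using a fixed bin width.

    Assumption: bins start at `minimum` (0 here), so equal intensities fall
    into the same bin in every patient, independent of the lesion's range.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    return [floor((v - minimum) / bin_width) + 1 for v in values]


bins = discretize_fixed_bin_width([0.05, 1.25, 2.37])
print(bins)
```

Values that lie exactly on a bin edge can land in either neighboring bin because of floating-point rounding, which a production implementation would need to handle explicitly.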

Feature selection
Using a large number of features on a limited number of patients for model calculation may result in overfitting. Overfitting is a methodological error in which a generated model corresponds too closely to, or even exactly matches, the analyzed dataset. This results in a perfect classification accuracy on the dataset that has been used for training but renders the model too specialized to classify new or additional imaging data or reliably predict future observations. To lower the risk of overfitting, the most important features must be identified in a process called feature selection before model generation [34]. Feature selection was performed using the RadiomiX toolbox. First, the machine learning module eliminates features with (near) zero variance and features with an inter-feature correlation of 0.9 or more to remove redundancies within the feature set. A threshold of 0.9 is generally considered high enough to eliminate high correlation from the dataset. Second, a stepwise forward feature selection using stratified cross-validation with logistic regression is used to further reduce the number of radiomics features, i.e., it stops adding features if the inclusion of the next feature does not add more than 0.005 to the average cross-validation AUC.
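The two selection steps can be sketched as follows. This is not the RadiomiX implementation: which member of a correlated pair is dropped, the variance tolerance, the baseline AUC of 0.5, and the caller-supplied `score` function (standing in for the average cross-validated AUC of a logistic regression) are all assumptions made for illustration, and the toy feature values are invented.

```python
from math import sqrt


def variance(x):
    m = sum(x) / len(x)
    return sum((a - m) ** 2 for a in x) / len(x)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def prefilter(features, min_variance=1e-12, max_corr=0.9):
    """Drop (near) zero-variance features and features correlated >= max_corr
    with an already kept feature (assumption: the first one seen is kept)."""
    kept = []
    for name, values in features.items():
        if variance(values) <= min_variance:
            continue
        if any(abs(pearson(values, features[k])) >= max_corr for k in kept):
            continue
        kept.append(name)
    return kept


def forward_select(candidates, score, min_gain=0.005):
    """Greedy forward selection; stops when the best candidate adds no more
    than min_gain to the score. Baseline 0.5 (chance-level AUC) is assumed."""
    selected, best, remaining = [], 0.5, list(candidates)
    while remaining:
        top_score, top = max((score(selected + [c]), c) for c in remaining)
        if top_score - best <= min_gain:
            break
        selected.append(top)
        remaining.remove(top)
        best = top_score
    return selected


# Invented toy features: "b" duplicates "a", "c" is constant.
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [5, 5, 5, 5], "d": [1, -1, 1, -1]}
kept = prefilter(toy)
toy_auc = {("a",): 0.80, ("d",): 0.70, ("a", "d"): 0.803}  # invented scores
chosen = forward_select(kept, lambda names: toy_auc[tuple(sorted(names))])
print(kept, chosen)
```

In the toy run, adding "d" to "a" raises the score by only 0.003, which is below the 0.005 threshold, so selection stops after one feature.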

Model generation and validation
Logistic regression models were generated on the training dataset using the 'tidyverse/ggplot2' packages in R (version 4.0.5, R Studio, Inc., Boston, MA, USA). Logistic regression models were fitted separately for the conventional FET PET parameters TBRmean and TBRmax, as well as for the selected radiomics features and a combination of conventional and radiomics features. Finally, the models were applied to the holdout validation dataset that was not part of model generation.
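For readers unfamiliar with the model class, the following is a minimal Python sketch of a logistic regression fitted by batch gradient descent on two predictors. It is not the R code used in the study; the learning rate, number of epochs, and the six (TBRmean, TBRmax) pairs are invented purely to make the example runnable.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    return exp(z) / (1.0 + exp(z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive class (here: tumor progression)."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Invented toy data: (TBRmean, TBRmax); 1 = tumor progression, 0 = TRC.
X = [(1.8, 2.5), (1.9, 2.8), (2.0, 3.0), (2.1, 3.6), (2.2, 4.0), (2.3, 4.5)]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
p_low, p_high = predict_proba(w, b, X[0]), predict_proba(w, b, X[-1])
print(w, b, p_low, p_high)
```

A fitted model of this form outputs a probability per patient, which is then thresholded at a decision cutoff as described in the statistical evaluation.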

Model testing
The best performing models were applied to the test dataset acquired on the BrainPET scanner. Since the test dataset was not involved in the process of model training and validation and was acquired on a different PET scanner, it represents an

Statistical evaluation
Descriptive statistics are provided as mean and standard deviation or as median and range. The Mann-Whitney U test was used for intergroup comparison. The diagnostic performance of the PET parameters, the machine learning models, and combinations thereof was assessed by receiver operating characteristic (ROC) analysis. The decision cutoff was considered optimal when the product of paired values for sensitivity and specificity reached its maximum. Fisher's exact test for 2 × 2 contingency tables was used for statistical evaluation of the parameters. P-values of less than 0.05 were considered statistically significant. Statistical analyses were performed using SPSS (SPSS Statistics 24, IBM, New York, USA) and Microsoft Excel (Excel:Mac 2020, Version 16.53, Microsoft, Redmond, WA, USA).
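The cutoff criterion above (maximum product of sensitivity and specificity) and the AUC can be sketched in a few lines of Python. The scores below are invented model outputs, and the tie handling and the convention that scores at or above the cutoff are called positive are assumptions of this sketch, not details taken from the SPSS analysis.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


def best_cutoff(scores_pos, scores_neg):
    """Cutoff maximizing sensitivity x specificity; score >= cutoff is positive."""
    best = None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= c for s in scores_pos) / len(scores_pos)
        spec = sum(s < c for s in scores_neg) / len(scores_neg)
        if best is None or sens * spec > best[0]:
            best = (sens * spec, c, sens, spec)
    return {"cutoff": best[1], "sensitivity": best[2], "specificity": best[3]}


# Invented scores: positives = tumor progression, negatives = TRC.
pos = [0.9, 0.8, 0.6, 0.4]
neg = [0.7, 0.3, 0.2]
roc_auc = auc(pos, neg)
cut = best_cutoff(pos, neg)
print(roc_auc, cut)
```

Maximizing the product, rather than the sum used by the Youden index, penalizes cutoffs where one of the two measures is very low.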

Treatment-related changes and tumor progression
Of the 124 patients examined on the stand-alone PET scanner, 31 (25%) were diagnosed with treatment-related changes and 93 (75%) with tumor progression. The test dataset consisted of 27 patients examined on the BrainPET scanner. Thereof, six (22%) were diagnosed with treatment-related changes and 21 (78%) with tumor progression.

Group comparison of FET PET parameters
TBRmean was significantly higher for patients with tumor progression compared to patients with treatment-related changes (mean TBRmean ± standard deviation, 2.1 ± 0.3 vs. 1.9 ± 0.3; p < 0.001). TBRmax was also significantly higher for patients with tumor progression compared to patients with treatment-related changes (mean TBRmax ± standard deviation, 3.7 ± 0.9 vs. 2.8 ± 0.7; p < 0.001).

Performance of machine learning models in the training and validation dataset
The two most important radiomics features according to the feature selection were Informational Measure of Correlation 2 calculated from the GLCM, and Intensity Non-Uniformity Normalized from the GLSZM.
Further details on the model performances in the validation and test dataset are summarized in Table 2 and Fig. 3.
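To give an intuition for the two selected texture features, the sketch below computes them from small, hand-made matrices. The formula for Informational Measure of Correlation 2 follows the standard definition based on GLCM entropies. For the GLSZM feature, it is assumed that the toolbox's "Intensity Non-Uniformity Normalized" corresponds to the commonly used normalized grey-level non-uniformity; the toolbox manual remains the authoritative definition.

```python
from math import exp, log2, sqrt


def imc2(glcm):
    """Informational Measure of Correlation 2 from a normalized GLCM
    (list of rows whose entries sum to 1)."""
    px = [sum(row) for row in glcm]        # row marginals
    py = [sum(col) for col in zip(*glcm)]  # column marginals
    hxy = -sum(p * log2(p) for row in glcm for p in row if p > 0)
    hxy2 = -sum(a * b * log2(a * b) for a in px for b in py if a * b > 0)
    # Clamp at 0 to guard against tiny negative values from rounding.
    return sqrt(max(0.0, 1.0 - exp(-2.0 * (hxy2 - hxy))))


def grey_level_nonuniformity_normalized(glszm):
    """Assumed definition: sum over grey levels of (zones per level)^2,
    divided by (total number of zones)^2. Rows = grey levels,
    columns = zone sizes, entries = zone counts."""
    per_level = [sum(row) for row in glszm]
    total = sum(per_level)
    return sum(s * s for s in per_level) / (total * total)


independent = imc2([[0.25, 0.25], [0.25, 0.25]])  # no grey-level dependence
dependent = imc2([[0.5, 0.0], [0.0, 0.5]])        # neighbors always equal
glnn_even = grey_level_nonuniformity_normalized([[2, 0], [0, 2]])
glnn_single = grey_level_nonuniformity_normalized([[4, 0], [0, 0]])
print(independent, dependent, glnn_even, glnn_single)
```

The first feature is 0 when neighboring grey levels are statistically independent and approaches 1 with increasing dependence; the second reaches 1 when all zones share a single grey level.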

Performance of radiomics models in the test dataset
The model based on PET parameters resulted in an AUC of 0.78 (95% confidence interval, 0.67-0.88; sensitivity, 66%; specificity, 80%) in the test dataset.
The model based solely on radiomics features resulted in an AUC of 0.85 (95% confidence interval, 0.77-0.94; sensitivity, 73%; specificity, 80%) in the test dataset. The model combining FET PET parameters and radiomics features resulted in an AUC of 0.85 (95% confidence interval, 0.77-0.94; sensitivity, 81%; specificity, 70%). Further details on the model performances in the test dataset are summarized in Table 2 and Fig. 3.

Discussion
The main finding of our study is that a machine learning model based on static FET PET radiomics features differentiates treatment-related changes from tumor progression in patients with gliomas with a high diagnostic accuracy, i.e., an AUC of 0.85, and outperforms conventional FET PET analysis (AUC, 0.78). Interestingly, combining FET PET radiomics features with conventional FET PET parameters further improved the overall diagnostic performance, especially the sensitivity of the model. Further, the developed radiomics model was evaluated in a small, but independent test dataset acquired with a different PET scanner, demonstrating its reliability and robustness regardless of the scanners and imaging parameters used. Since the developed model is based on routinely acquired FET PET scans and can be applied in a fully automated manner on a conventional computer in a few minutes, the approach seems feasible for clinical implementation.
Several studies already investigated the potential of FET PET for the differentiation of treatment-related changes from tumor progression in glioma patients [7,20,35,36]. Although these studies demonstrate high diagnostic accuracies, the number of patients in these studies was either low, and/or only a small fraction of patients were diagnosed with treatment-related changes. A more recent study by Maurer and colleagues [5] using partially overlapping patients investigated the value of static and dynamic FET PET parameters for the differentiation of treatment-related changes from glioma progression. This retrospective analysis of 127 patients with WHO grade II-IV gliomas yielded a diagnostic accuracy of 81% (sensitivity, 86%; specificity, 67%) by combining TBRmax and the dynamic FET PET parameter slope. In our study, the combination of static FET PET parameters TBRmean and TBRmax resulted in a comparable diagnostic performance (AUC, 0.78; sensitivity, 64%; specificity, 80%) in the validation dataset. Nevertheless, the use of FET PET radiomics parameters alone already outperformed the conventional FET PET parameters with an AUC of 0.90 (sensitivity, 87%; specificity, 80%). Combining radiomics and conventional FET PET parameters further increased the diagnostic performance in the validation dataset (AUC, 0.92; sensitivity, 91%; specificity, 80%) and demonstrated its generalizability in the external test dataset (AUC, 0.85; sensitivity, 81%; specificity, 70%). Our results concerning FET PET are also comparable with a recent study of our group investigating the value of combining perfusion-weighted MRI with dynamic FET PET [37]. In that study, accuracy of PWI, which can be performed easily during routine conventional MR scanning, was poor in differentiating treatment-related changes and tumor progression (accuracy, 63%). However, the high positive predictive value of PWI (100%) allowed a correct diagnosis of treatment-related changes in 42% of the patients.
In the remaining patients, PWI was nondiagnostic, but FET PET still achieved an accuracy of 78% leading to the recommendation of a sequential use of perfusion-weighted MRI and dynamic FET PET in clinical practice. In this context, the developed radiomics classifier in our study may achieve a higher diagnostic performance based on 20 min static FET PET images. Since the analysis can be performed fully automated on a conventional computer in a few minutes, this combination appears promising in terms of a clinical translation.
Over the past years, the value of machine learning techniques and FET PET radiomics for the diagnosis of treatment-related changes such as pseudoprogression in patients with glioma [13,14,38] or radiation injury in patients with brain metastases [16] has been demonstrated. Interestingly, in these studies, different patterns of tracer uptake could already be distinguished visually. Patients with pseudoprogression or radiation injuries showed a more homogeneous uptake of FET compared to a more heterogeneous uptake of FET in patients with tumor progression.
In our study, different patterns of FET uptake between patients with treatment-related changes and glioma progression could not be identified by visual evaluation (Fig. 1). This might be due to the more inhomogeneous group of patients that included a broad range of glioma subtypes and treatment regimens. Nevertheless, we identified the two textural features Informational Measure of Correlation 2 calculated from the GLCM, and Intensity Non-Uniformity Normalized from the GLSZM as being discriminative between treatment-related changes and glioma progression. Both textural features describe differences in tumor heterogeneity not accessible by means of human perception. Even though visually accessible differences are desirable for a better presentation of the results, the real benefit and concept of radiomics becomes more apparent if this is not the case: "images are more than pictures, they are data" [39].
Besides a visual interpretation of radiomics features, further efforts are needed for a deeper understanding of the biological meaning of features and machine learning models. This might be essential for a potential clinical translation and acceptance of radiomics in clinical routine. Hence, the correlation of radiomics features with local tissue samples including extensive neuropathological workup is necessary in future studies.
Although promising, our results must be further validated in a larger group of patients from multiple institutions. Although the number of patients in our study is larger than in other studies investigating PET radiomics for the differentiation of treatment-related changes and tumor progression, the generally low number of patients available in neuro-oncology remains a limitation. Nonetheless, our model has shown its value in an external test dataset without extensive preprocessing, acquired on a different PET scanner, so further model evaluation in other centers is warranted and feasible.
Another limitation of our study might be the heterogeneous composition of patients in terms of glioma subtypes and treatment regimens and the relatively low proportion of diagnoses validated by histopathology. Further, the group of patients is likely biased towards more challenging cases as only patients with equivocal MRI findings and remaining therapeutic options usually undergo FET PET scans. However, this dataset is representative of a clinical situation and further underlines the value of the model as it does not require an extensive preselection of patients.
Additionally, the developed machine learning model is based on FET PET alone and does not include structural or advanced MRI. Future studies should hence address the potential additional value of a combined FET PET/MRI radiomics analysis, also considering advanced MRI methods such as PWI or MR spectroscopy.

Conclusion
The results from our study suggest that the developed radiomics model is of clinical value for the differentiation between treatment-related changes and tumor progression in patients with gliomas regardless of tumor type or pretreatment. The radiomics model is based on routinely acquired static 20 min FET PET scans facilitating the translation into clinical routine. Especially in combination with other clinical parameters, the developed radiomics model might have an additional diagnostic value once translated into clinical routine.

Data availability
All data supporting the findings presented in this manuscript are available upon request directly to the corresponding author, PL. These data are not part of public domain or database as they are part of the patient protected medical record and public sharing would compromise the privacy of the research participants.

Conflict of interest
The authors have no relevant financial or non-financial interest to disclose.

Ethical approval
The local ethics committees approved the retrospective analysis of the data (project number, SNO-8-2018). The study adheres to the standards established in the Declaration of Helsinki.

Informed consent
Informed consent was obtained from all individual participants included in the study.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.