Diagnostic value of 18F-FDG PET-CT in detecting malignant peripheral nerve sheath tumors among adult and pediatric neurofibromatosis type 1 patients

Purpose Detecting malignant peripheral nerve sheath tumors (MPNSTs) remains difficult. 18F-FDG PET-CT has been shown to be helpful, but ideal threshold values of semi-quantitative markers remain unclear, partially because of variation among scanners. Using EU-certified scanners, the diagnostic accuracy of ideal and commonly used 18F-FDG PET-CT thresholds was investigated and differences between adult and pediatric lesions were evaluated.
Methods A retrospective cohort study was performed including patients from two hospitals with a clinical or radiological suspicion of MPNST between 2013 and 2019. Several markers were studied for ideal threshold values and differences among adults and children. A diagnostic algorithm was subsequently developed.
Results Sixty patients were included (10 MPNSTs). Ideal threshold values were 5.8 for SUVmax (sensitivity 0.70, specificity 0.92), 5.0 for SUVpeak (sensitivity 0.70, specificity 0.97), 1.7 for TLmax (sensitivity 0.90, specificity 0.86), and 2.3 for TLmean (sensitivity 0.90, specificity 0.79). The standard TLmean threshold value of 2.0 yielded a sensitivity of 0.90 and specificity of 0.74, while the standard SUVmax threshold value of 3.5 yielded a sensitivity of 0.80 and specificity of 0.63. SUVmax and adjusted SUV for lean body mass (SUL) were lower in children, but tumor-to-liver ratios were similar in adult and pediatric lesions. Using TLmean > 2.0 or TLmean < 2.0 and SUVmax > 3.5, a sensitivity and specificity of 1.00 and 0.63 can be achieved.
Conclusion 18F-FDG PET-CT offers adequate accuracy to detect MPNSTs. SUV values in pediatric MPNSTs may be lower, but tumor-to-liver ratios are not. By combining TLmean and SUVmax values, 100% sensitivity can be achieved with acceptable specificity.
Supplementary Information The online version contains supplementary material available at 10.1007/s11060-021-03936-y.


Introduction
Peripheral nerve sheath tumors (PNSTs) are relatively common and include both benign and malignant tumors. Schwannomas are the most common benign nerve sheath tumors (BPNSTs) and neurofibromas make up the largest proportion of remaining BPNSTs [1,2]. Nerve sheath tumors may arise sporadically or in association with neurofibromatosis. Neurofibromatosis type 1 (NF1) patients are at increased risk for developing PNSTs, often with a high tumor burden of neurofibromas [1-4]. Importantly, these neurofibromas may act as precursor lesions and can transform into malignant peripheral nerve sheath tumors (MPNSTs) [5]. MPNSTs are aggressive soft tissue sarcomas (STS), accounting for 2-3% of all STS [6,7]. Although MPNSTs are rare in the general population, NF1 patients have an 8-13% lifetime risk of developing an MPNST. MPNSTs generally have poor clinical outcomes, being the leading cause of mortality in NF1 patients [8,9]. The median survival of localized disease ranges from 5 to 6 years, demanding aggressive treatment [10,11]. Surgical resection is the only curative therapeutic option improving survival, as MPNSTs respond poorly to chemo- and radiotherapy [10-12]. While the resection of MPNSTs commonly results in high postoperative morbidity and motor deficits, BPNSTs may be removed by intracapsular resections, minimizing neurologic damage [13-15]. BPNSTs only require resection in selected cases, making adequate preoperative differentiation crucial. 18F-FDG PET-CT, using standardized uptake values (SUVs) and tumor-to-liver ratios as semi-quantitative metabolic imaging markers, has been increasingly used as a non-invasive diagnostic tool for the characterization of PNSTs in NF1 patients. However, ideal parameters and their corresponding thresholds have yet to be elucidated [16]. There is large variation in current literature regarding this matter, part of which might be caused by variation among scanners and scanning protocols [17-20].
Suggested optimal threshold values of semi-quantitative parameters vary greatly, but the SUVmax threshold of ≥ 3.5 is commonly cited [21-24]. However, its value has been doubted since it may yield high false-positive rates [22]. Additional concerns arise regarding scanning in pediatric NF1 populations, as few studies have investigated the diagnostic accuracy in this subpopulation. By using European Association of Nuclear Medicine (EANM) Research Ltd. (EARL) protocol certified scanners, results are reproducible for any center utilizing a scanner of that kind.
Given current uncertainties of accurately distinguishing MPNSTs and BPNSTs using 18F-FDG PET-CT, this study investigated the diagnostic accuracy of optimal and commonly used thresholds of semi-quantitative 18F-FDG PET-CT markers using EARL-certified scanners and evaluated possible differences between adult and pediatric populations.

Study population
Patient data was retrospectively collected from two neurofibromatosis expertise centers. Patients with NF1 (fulfilling the NIH criteria and/or genetically proven) who underwent 18F-FDG PET-CT examination for suspected MPNST based on clinical symptoms and/or radiological examination were included. The EARL protocol is used for performance harmonization of semi-quantitative imaging markers of 18F-FDG PET-CT, enabling comparison of imaging markers among patients and sites, regardless of the 18F-FDG PET-CT scanner used. To increase homogeneity between scans, only patients scanned according to the EARL protocol were included; consequently, only patients who underwent scans after 2013 were included. Patients with BPNSTs, either suspected or confirmed by biopsy, with less than 12 months of follow-up were excluded. Patients who had received radiotherapy, chemotherapy, or surgical excision of the lesion prior to 18F-FDG PET-CT were excluded, as this may alter tumor imaging features. Patient data was obtained from electronic medical files, including demographic information, histopathological outcomes, and (semi-quantitative) scan characteristics. This study was approved by the Ethics Committee of both participating centers with waiver of individual patient consent.
18F-FDG PET-CT scans were performed using a Siemens Biograph mCT PET/CT scanner (Siemens Healthineers, Erlangen, Germany) and a Philips Gemini 64 TOF (Philips Medical Systems International BV, Best, The Netherlands). After fasting for approximately 4-6 h, the patients received intravenous administration of 18F-2-fluoro-2-deoxy-d-glucose (FDG). In adults, the dose of FDG in MBq was based on weight in one center and on weight adjusted to body surface area (range 113-385) in the other. Pediatric patients received weight-dependent administration of FDG based on the pediatric dose card of the EANM [25]. Administration of tracer took place after confirming blood glucose levels were within normal range. If blood glucose levels were greater than 10 mmol/L, the study was rescheduled. Whole-body attenuation-corrected images were acquired approximately 60 min after tracer injection. During this uptake phase, patients were instructed to rest in a warm, dimly lit room with minimal stimulation. According to the scanning protocol, a whole-body low-dose CT was acquired first for attenuation correction and localization purposes (120 kV, Quality reference mAs 40, rotation time 0.5 s, pitch of 0.8 mm, slice thickness of 3 mm; reconstructed slice thickness 3 mm). Directly after the low-dose CT, PET acquisition started in list-mode, using 6 to 7 bed positions per patient (from skull base to inguinal region). All scans were corrected for scatter and attenuation using the low-dose CT and reconstructed using ordered subset expectation maximization (OSEM) and time of flight (TOF). Where logistic time constraints allowed, delayed imaging was performed after 3 h.
In the neurofibromatosis expertise centers, semi-quantitative analysis was performed by a nuclear medicine physician with over 3 years of experience, blinded to both clinical history and pathology results. Maximum, mean, and peak standardized uptake values (SUVmax, SUVmean, and SUVpeak) were determined by drawing a volume of interest (VOI) around the target lesion or in the liver as reference (Fig. 1). Tumor-to-liver ratios were determined by drawing a VOI with a diameter of 3 cm in the center of the right liver lobe. Care was taken that the whole VOI was inside the liver. The SUVmax and SUVmean of this region were measured.
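The tumor-to-liver ratios can be illustrated with a minimal Python sketch. The text does not spell out the exact definitions, so the sketch assumes that TLmax is the lesion SUVmax divided by the liver SUVmax and that TLmean is the lesion SUVmax divided by the liver SUVmean. The class and function names, as well as the example values, are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class VoiStats:
    """SUV statistics read from one volume of interest (VOI)."""
    suv_max: float
    suv_mean: float


def tumor_to_liver_ratios(lesion: VoiStats, liver: VoiStats) -> dict:
    """Return TLmax and TLmean for one lesion.

    Assumed definitions (not stated explicitly in the text):
      TLmax  = lesion SUVmax / liver SUVmax
      TLmean = lesion SUVmax / liver SUVmean
    """
    if liver.suv_max <= 0 or liver.suv_mean <= 0:
        raise ValueError("liver reference uptake must be positive")
    return {
        "TLmax": lesion.suv_max / liver.suv_max,
        "TLmean": lesion.suv_max / liver.suv_mean,
    }


# Hypothetical values for illustration only, not patient data.
lesion = VoiStats(suv_max=6.0, suv_mean=3.5)
liver = VoiStats(suv_max=3.0, suv_mean=2.0)
ratios = tumor_to_liver_ratios(lesion, liver)
print(ratios)
```

Because both ratios normalize lesion uptake to a reference region in the same patient, they are less sensitive to factors that scale all uptake values together, which may explain why they are of interest when comparing adults and children.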

Histological analysis
Histology was considered the gold standard and was performed according to institutional standards. Tumors were classified as typical neurofibroma, atypical neurofibroma, or malignant peripheral nerve sheath tumor, using established pathologic criteria [20,26,27].

Statistical analysis
The following semi-quantitative imaging markers were analyzed for potential use to differentiate malignant transformation in neurofibromas: SUVmax, SUVpeak, SUVmax adjusted to lean body mass (SULmax), SUVpeak adjusted to lean body mass (SULpeak), delayed SUVmax, delayed SUVpeak, delayed SULmax, delayed SULpeak, TLmax, and TLmean. Lean body mass (LBM) was calculated using Janmahasatian's formula [25-29]. Receiver operating characteristic (ROC) analysis was performed for each semi-quantitative imaging marker and optimal threshold values were determined using Youden's index. For each ideal threshold, sensitivity, specificity, positive likelihood ratio (pLR), and negative likelihood ratio (nLR) were determined. Diagnostic accuracy was described using the area under the ROC curve (AUC). Optimization of the diagnostic algorithm was performed using the commonly used imaging markers SUVmax and TLmean. Performance of commonly used threshold values for SUVmax (3.0-6.0) and TLmean (1.5-3.0) was assessed. Steps of 0.5 were used to improve generalizability. Additionally, threshold values yielding 100% sensitivity or 100% specificity were assessed. Combinations of these parameters were manually assessed to identify the diagnostic algorithm with highest sensitivity and acceptable specificity. Patients were stratified by age (adults vs. children) and subgroup analysis was performed for MPNSTs and BPNSTs. Nine patients received more than one 18F-FDG PET-CT.
Fig. 1 Maximum, mean, and peak standardized uptake values determined by drawing a volume of interest around the target lesion in MPNST and neurofibroma. MPNST malignant peripheral nerve sheath tumor; SUV standardized uptake value
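As an illustration of the threshold selection described in the statistical analysis, the following Python sketch picks the cut-off that maximizes Youden's index (J = sensitivity + specificity - 1) and derives the likelihood ratios from it. The study itself used R; this is a plain-Python re-expression under the assumption that a lesion is called positive when its marker value is at or above the threshold. The function names and the SUVmax values are made up for illustration and are not study data.

```python
def roc_points(values, labels):
    """Yield (threshold, sensitivity, specificity) for each candidate cut-off.

    A lesion is called positive when its marker value is >= threshold.
    labels: 1 for MPNST, 0 for BPNST.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for thr in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= thr)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < thr)
        yield thr, tp / n_pos, tn / n_neg


def youden_threshold(values, labels):
    """Return the (threshold, sens, spec) maximizing J = sens + spec - 1."""
    return max(roc_points(values, labels), key=lambda p: p[1] + p[2] - 1)


def likelihood_ratios(sens, spec):
    """pLR = sens / (1 - spec); nLR = (1 - sens) / spec."""
    plr = sens / (1 - spec) if spec < 1 else float("inf")
    nlr = (1 - sens) / spec if spec > 0 else float("inf")
    return plr, nlr


# Made-up SUVmax values for illustration only (not study data).
suv = [1.2, 2.0, 2.8, 3.1, 3.6, 4.0, 5.9, 6.5, 7.2, 9.8]
is_mpnst = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
thr, sens, spec = youden_threshold(suv, is_mpnst)
plr, nlr = likelihood_ratios(sens, spec)
print(thr, sens, round(spec, 2), round(plr, 2), nlr)
```

Applied to the rounded values reported in the abstract for the SUVmax threshold of 5.8 (sensitivity 0.70, specificity 0.92), the same formulas give a pLR of roughly 8.75 and an nLR of roughly 0.33.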
Differences among PNSTs and between subgroups were analyzed using the chi-square test for categorical variables and, for continuous variables, a one-way test or t-test, depending on normality of distribution based on the Shapiro-Wilk test. For non-normally distributed variables, the Kruskal-Wallis or Wilcoxon test was used. As recent literature indicates that PNSTs are at risk of undergoing malignant transformation at any point in time, each tumor was investigated for malignant transformation at every 18F-FDG PET-CT independently of previous measurements [23,30-32]. Typical and atypical neurofibromas were evaluated together as they are both considered benign lesions. Statistical significance was established for p-values < 0.05. All statistical analyses were performed using R version 4.0.3 (R Core Team 2020).

Study population
Sixty patients were included, undergoing 18F-FDG PET-CT examinations for seventy tumors: 10 MPNSTs and 60 BPNSTs (Table 1). Forty lesions were found in females and thirty in males. Nineteen of seventy lesions had delayed scans, including 3 MPNSTs. Fifteen lesions were evaluated in children (≤ 18 years). Mean duration of follow-up was 3.5 ± 1.6 years. At last follow-up, 7 MPNST patients and 4 BPNST patients were deceased.

Differences between adults and children
Statistically significant differences between adults and children were found in MPNSTs for mean SUVmax (11.56 vs. 3.10, p = 0.037) and SUVpeak (7.48 vs. 2.14, p = 0.037), but not in BPNSTs (  Table 3). This diagnostic algorithm resulted in 100% sensitivity and 63% specificity, requiring 22/60 BPNSTs to undergo biopsy (Fig. 2). Additionally, using the optimal threshold of TLmean found in this study (≥ 2.3), specificity may be increased to 65%, resulting in one fewer BPNST requiring biopsy.
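The link between the reported biopsy counts and the specificity figures can be checked with a short calculation. The sketch below uses only the counts stated in the text (60 BPNSTs, 22 of which would undergo biopsy, and one fewer with the TLmean threshold of 2.3); the function name is hypothetical.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Specificity = TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)


n_bpnst = 60        # benign lesions in the cohort
fp_standard = 22    # BPNSTs flagged for biopsy by the algorithm
spec_standard = specificity(n_bpnst - fp_standard, fp_standard)  # 38/60

fp_optimal = fp_standard - 1  # one fewer BPNST biopsied with TLmean >= 2.3
spec_optimal = specificity(n_bpnst - fp_optimal, fp_optimal)     # 39/60

print(round(spec_standard, 2), round(spec_optimal, 2))  # 0.63 0.65
```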

Discussion
This retrospective study found that PET scans offer adequate accuracy for detecting malignant transformation of neurofibromas both in adults and children. Combining SUVmax and TLmean threshold values in a diagnostic algorithm increases specificity while retaining 100% sensitivity.

Optimal thresholds in PET scans
In the past decades, 18F-FDG PET-CT scans have increasingly been used to detect malignancy in NF1 patients. Though numerous studies aimed to identify ideal  [20, 25-27, 31, 33-40]. Studies evaluating TL ratio reported ideal thresholds varying from 1.4 to 3.0 [17,25,31,35,37,39,41]. This study found ideal threshold values for SUVmax and TL ratio consistent with those reported in literature, and delayed imaging did not improve diagnostic accuracy. However, using these thresholds some MPNSTs may be missed.

Children vs. adult populations
Malignant transformation of neurofibromas also occurs in children [12,42]. As detection of MPNST at early stages could increase the possibility of curative resections, frequent and serial imaging for surveillance of lesions is often performed. However, this practice may lead to harmful long-term radiation effects [22,35,39,43,44].
Unfortunately, only few published 18F-FDG PET-CT studies have included children for analysis and no analysis has been performed comparing imaging marker values between adult and pediatric NF1 patients. Studies that combined data from both adults and children with NF1 found an optimal threshold value of SUVmax ranging from 3.90 to 4.00 with sensitivity ranging from 82 to 100% and specificity ranging from 66 to 94% [25-27]. Studies including only adult NF1 patients found a wider range of optimal threshold values for SUVmax ranging from 1.8 to 7.0, suggesting that children may have lower SUVmax values compared to adults [20-23, 28, 30-33, 35, 39, 41, 42, 44-46]. SUV values in adults may be higher because the administered dose is adjusted by weight: adults have comparatively more fat tissue, which has relatively low FDG uptake, so the uptake in lesions and normal organs is higher. Adjusting SUV to lean body mass may correct for body composition as a contributing factor for SUV differences found between adult and pediatric patients. Recent studies have investigated the use of SUL using James's formula to improve diagnostic accuracy in differentiation of PNSTs in the adult population [20,29,39,42,47,48]. This study adjusted SUV to lean body mass using a recently proposed formula by Janmahasatian, as it is suggested to be more accurate for use in children [25-29]. Significantly lower SUVmax and SUVpeak values in MPNSTs in children were found. Even after adjusting for lean body mass, SUVmax and SUVpeak remained significantly lower in MPNSTs in children, suggesting it is less likely that differences in body composition significantly contribute to SUV differences found between adults and children [29]. Though based on only 2 MPNSTs, significantly lower SUV values were found in children. This may be due to the large spread in uptake values in adults, which require relatively low SUVmax thresholds.
Nevertheless, based on the significant differences in SUV values between adults and children, caution should be taken in interpreting SUV thresholds on their own in children.
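To make the lean body mass adjustment concrete, the sketch below implements the Janmahasatian equations in their commonly published adult form and converts a body-weight SUV to SUL by scaling with LBM over body weight. Whether the study applied exactly this form, or a variant adapted for children, is not stated in the text, so this should be read as an assumption. Function names and the example patient are hypothetical.

```python
def lean_body_mass_janmahasatian(weight_kg: float, height_m: float, sex: str) -> float:
    """Lean body mass (kg) from the commonly published Janmahasatian equations.

    male:   LBM = 9270 * weight / (6680 + 216 * BMI)
    female: LBM = 9270 * weight / (8780 + 244 * BMI)
    """
    bmi = weight_kg / height_m ** 2
    if sex == "male":
        return 9270.0 * weight_kg / (6680.0 + 216.0 * bmi)
    if sex == "female":
        return 9270.0 * weight_kg / (8780.0 + 244.0 * bmi)
    raise ValueError("sex must be 'male' or 'female'")


def sul_from_suv(suv: float, weight_kg: float, lbm_kg: float) -> float:
    """Rescale a body-weight-normalized SUV to lean body mass (SUL)."""
    return suv * lbm_kg / weight_kg


# Hypothetical adult, for illustration only.
lbm = lean_body_mass_janmahasatian(weight_kg=70.0, height_m=1.75, sex="male")
sul = sul_from_suv(suv=5.0, weight_kg=70.0, lbm_kg=lbm)
print(round(lbm, 1), round(sul, 2))
```

Since LBM is always smaller than total body weight, SUL is always lower than the corresponding SUV, and the size of the reduction depends on body composition.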

Optimal PET algorithm
A threshold of 3.5 for SUVmax has often been proposed as the ideal threshold [21-23]. A recent meta-analysis pooled individual level patient data from 11 different study populations and found a threshold of 3.5 provided the highest sensitivity (0.99) and acceptable specificity (0.75) [24]. The main argument against using this threshold has been its low specificity. This study found a sensitivity of 0.80 and specificity of 0.63 using a threshold of 3.5 for SUVmax. In this study, TLmean yielded slightly better accuracy (0.92) compared to SUVmax, while there was no significant difference between adults and children in proportional values. In contrast to previous studies, the current study combined the use of SUVmax and TLmean, proposing an algorithm aimed to achieve optimal sensitivity while retaining acceptable specificity. Using a threshold of TLmean ≥ 2.0 or TLmean < 2.0 and SUVmax ≥ 3.5, a sensitivity of 1.00 and a specificity of 0.63 were achieved. As TL values did not differ between the adult and pediatric population, there does not seem to be a rationale to have separate diagnostic algorithms. Using single semi-quantitative imaging markers, sensitivity of 1.00 is often not achieved or comes at the cost of lower specificity. A single marker's threshold may also be less reproducible in other populations.
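The combined rule can be written as a short decision function. The Python sketch below is one reading of the algorithm, using the ≥ form of the thresholds given in this section (the abstract states them with >): a lesion is flagged when TLmean ≥ 2.0, and otherwise when SUVmax ≥ 3.5, which is logically the same as flagging when either threshold is met. The function names and the example lesions are hypothetical and are not study data.

```python
TLMEAN_CUTOFF = 2.0  # standard TLmean threshold discussed in the text
SUVMAX_CUTOFF = 3.5  # standard SUVmax threshold discussed in the text


def suspicious_for_mpnst(tl_mean: float, suv_max: float) -> bool:
    """Flag a lesion when TLmean >= 2.0, or otherwise when SUVmax >= 3.5."""
    if tl_mean >= TLMEAN_CUTOFF:
        return True
    return suv_max >= SUVMAX_CUTOFF


def sensitivity_specificity(lesions):
    """lesions: iterable of (tl_mean, suv_max, is_mpnst) tuples."""
    tp = fn = tn = fp = 0
    for tl_mean, suv_max, is_mpnst in lesions:
        flagged = suspicious_for_mpnst(tl_mean, suv_max)
        if is_mpnst and flagged:
            tp += 1
        elif is_mpnst:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up lesions for illustration only (not study data).
example = [
    (2.6, 7.1, True),   # flagged by TLmean
    (1.8, 3.9, True),   # missed by TLmean, caught by SUVmax
    (1.2, 2.0, False),  # below both thresholds
    (1.6, 3.7, False),  # false positive via SUVmax
    (2.1, 3.0, False),  # false positive via TLmean
    (0.9, 1.5, False),  # below both thresholds
]
sens, spec = sensitivity_specificity(example)
print(sens, spec)
```

The second example lesion shows why the two markers are combined: a lesion below the TLmean threshold can still be caught by SUVmax, which raises sensitivity at the cost of additional false positives.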

Strengths and limitations
This study is limited by its relatively small population, which is mainly a result of the strict inclusion criteria. The inclusion of only symptomatic lesions and EARL-adhering scans is stricter than in previous studies. As EARL criteria were adopted in both participating centers only in 2013 and a follow-up of a year for benign lesions was required, the study period was relatively short. Therefore, subgroup analysis of pediatric patients should be interpreted with caution. Despite these limitations, the results of this study are reproducible for any center using PET-scanners that adhere to EARL criteria. Additionally, this study used a combination of SUVmax and TLmean and developed an optimal diagnostic work-up algorithm to identify all MPNSTs while minimizing the number of false positives. To the best of our knowledge, this is the first study to compare semi-quantitative imaging marker values between adult and pediatric patients. This study found that while SUVmax and SUL were significantly lower for MPNSTs in children, TL values were not. Based on the findings of this study, future research should investigate several knowledge gaps. First, the semi-quantitative characteristics evaluated in this study should be validated in large prospective cohort studies with PET scanners adhering to EARL criteria. This may identify ideal threshold values for accurate detection of malignant transformation of PNSTs. Secondly, the use of the proposed diagnostic algorithm should be replicated in a large database of adult and pediatric NF1 patients. Additionally, differences in semi-quantitative imaging marker values between adult and pediatric NF1 patients should be studied further. Though adjusting optimal threshold values based on age did not impact the diagnostic accuracy of the proposed algorithm, potential differences in diagnostic accuracy between these populations may nevertheless necessitate different diagnostic guidelines.
Altogether, the results from these studies will provide a framework that may enable optimal diagnostic algorithms to be formulated. This study only assessed the diagnostic accuracy of 18F-FDG PET-CT. A recently published meta-analysis reported that although conventional MRI yields varying degrees of accuracy, some studies have shown high accuracies in functional MRI [24]. Though further research is required on this modality, reducing the need for 18F-FDG PET-CT may diminish radiation exposure that accumulates due to numerous follow-up scans necessary in NF1 patients prone to tumorigenesis.

Conclusion
In EARL-adhering PET-scanners, semi-quantitative imaging markers offer acceptable diagnostic accuracy for detecting malignant transformation of PNSTs in NF1. An algorithm was proposed, combining SUVmax and TLmean, which maximizes sensitivity while simultaneously reducing the number of false positives, thus reducing the number of unnecessary biopsies. This algorithm can readily be used in any center using EARL-adhering PET-scanners. In pediatric MPNSTs, SUVmax values were significantly lower even after correction for lean body mass, yet TL values were similar to adult cases. These potential differences between uptake values of adults and children did not impact the diagnostic algorithm.
Author contributions All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by RG and EM. The first draft of the manuscript was written by RG and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript. The manuscript for this protocol was drafted by RG and EM. All authors have made substantial contributions to the development of the design and conceptualization of the protocol. All authors read, provided feedback, and approved the final protocol.
Funding No funding was received from any extramural sources.

Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Conflict of interest
No author has any form of disclosure.
Ethical approval This study was approved by the Ethics Committee of participating centers with waiver of individual patient consent.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.