The dilemma of 18F-FDG PET/CT thyroid incidentaloma: what we should expect from FNA. A systematic review and meta-analysis

Purpose 18F-FDG thyroid incidentaloma (TI) occurs in ~2% of PET/CT examinations with a cancer prevalence of up to 35–40%. Guidelines recommend fine-needle aspiration cytology (FNA) if a focal 18F-FDG TI corresponds to a sonographic nodule >1 cm. The aim of this systematic review and meta-analysis was to provide evidence-based data on the diagnostic distribution of 18F-FDG TIs across the six subcategories of the Bethesda system for reporting thyroid cytopathology (BETHESDA). Methods Original studies reporting 18F-FDG TIs that were cytologically classified according to BETHESDA were included. Six separate meta-analyses were performed to obtain the pooled prevalence (95% confidence interval, 95% CI) of 18F-FDG TIs in the six BETHESDA subcategories. Results Fifteen studies were finally included. Nine studies were from Asian/Eastern and six from Western countries. FNA data according to BETHESDA were available in 2304 cases. The pooled prevalence of 18F-FDG TIs according to BETHESDA was BETHESDA I 10% (6–14), BETHESDA II 45% (37–53), BETHESDA III 8% (3–13), BETHESDA IV 8% (5–12), BETHESDA V 6% (4–9), BETHESDA VI 19% (13–25). A significantly different prevalence was found in BETHESDA IV between Asian/Eastern (2%) and Western (19%) studies. Conclusion Two-thirds of focal 18F-FDG TIs undergoing FNA have either malignant (BETHESDA VI) or benign (BETHESDA II) cytology, while a minority will have indeterminate (BETHESDA III or IV) FNA results. Significant differences between Asian/Eastern and Western studies are also present in the prevalence of indeterminate FNA results.


Introduction
The advent in recent years of high-performance medical imaging tools, such as ultrasonography (US), computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography/computed tomography (PET/CT) with various tracers such as fluorine-18-fluorodeoxyglucose (18F-FDG), radiolabelled choline, and radiolabelled prostate-specific membrane antigen, has improved the management of patients [1,2]. However, these imaging modalities have also led to the frequent detection of unexpected asymptomatic lesions, or "incidentalomas", an increasingly important topic in clinical practice [3].
Because of the high incidence of nodular thyroid disease in the general population, thyroid incidentalomas (TI) are identified frequently in clinical practice [4,5]. According to evidence-based data, the prevalence of 18F-FDG TI is about 2–3% of all PET/CTs, with approximately two in three showing focal uptake [5]. Focal 18F-FDG TI is defined as any focal uptake corresponding to a given thyroid nodule. The cancer detection rate in focal 18F-FDG TIs is reported to be up to 35–40% [5–7].
The 2015 American Thyroid Association guidelines for adult patients with thyroid nodules and differentiated thyroid cancer recommend performing fine-needle aspiration cytology (FNA) in all 18F-FDG PET/CT TIs with a sonographically confirmed thyroid nodule >1 cm [8]. Although US is the key modality in the initial diagnostic assessment of thyroid nodules, the diagnostic accuracy and performance of thyroid imaging reporting and data systems (TIRADSs) in 18F-FDG TI need to be further validated [9]. Moreover, significant selection bias may influence the calculation of the risk of malignancy (ROM) of 18F-FDG TIs for several reasons. Firstly, the majority of studies of 18F-FDG TI only report outcomes for nodules undergoing thyroid diagnostic work-up, which represent a minority of most institutional series of patients. Oncological patients with 18F-FDG TI have a low likelihood of thyroid surgery if the comorbid non-thyroidal malignancy is more aggressive and/or of worse prognosis [10]. Secondly, most studies use histopathological assessment as the reference standard for malignant lesions and non-neoplastic FNA cytology as the reference standard for benign lesions, as benign lesions are often not operated on [11]. Thirdly, 18F-FDG TIs with indeterminate FNA, i.e., subcategories III and IV of the Bethesda system for reporting thyroid cytopathology (BETHESDA) [12], without histological diagnoses, are generally not included in statistical analyses of ROM even if the expected ROM is not negligible.
In view of the above, before undertaking FNA in an 18F-FDG TI, it is important to know the relative frequency of the BETHESDA cytological subcategories in 18F-FDG TIs. For all thyroid nodules, a recently published meta-analysis by Vuong et al. [13] shows pooled BETHESDA frequencies as follows: Category I non-diagnostic 12.2%, Category II benign 62.3%, Category III AUS/FLUS 8.0%, Category IV follicular neoplasm/suspicious for follicular neoplasm 6.1%, Category V suspicious for malignancy 3.7%, and Category VI malignant 7.4%. The purpose of the current study was to ask whether the cytologic findings in 18F-FDG TI differ from those in thyroid nodules not detected by 18F-FDG PET/CT. It is known, for example, that Hürthle cell/oncocytic thyroid lesions (HTL) may be over-represented in 18F-FDG-avid thyroid nodules [14]. With this information, the clinician can better manage 18F-FDG TI patients using FNA, particularly for HTL. Both benign and malignant HTLs are known to be 18F-FDG avid [15]. However, HTL usually falls into class IV of BETHESDA [12], for which the reported resection rate was 60.5% and the ROM 28.9% [13].
This study was designed to obtain evidence-based information on the relative distribution of 18F-FDG TIs in the various cytological categories of BETHESDA (non-diagnostic, benign, atypia of undetermined significance or follicular lesion of undetermined significance, follicular neoplasm/suspicious for a follicular neoplasm, suspicious for malignancy, and malignant) [12], providing information useful for the clinical management of 18F-FDG TI.

Guidelines followed
In this study, all procedures utilized were consistent with PRISMA guidelines [16].

Search strategy
Three investigators (LS, AP, and PT) independently conducted a comprehensive literature search of the online databases MEDLINE (PubMed) and Scopus using the following search terms and their combinations: thyroid, nodule, incidentaloma, FDG, PET, positron. No start date limit was applied. The last search was carried out on 31 October 2020. No language restrictions were imposed. The search was restricted to human studies. The same three investigators independently screened the titles and abstracts of the retrieved articles, reviewed the full texts, and then selected articles for inclusion. References from included studies were also screened for additional articles.

Eligibility criteria
The major inclusion criterion was original studies reporting 18F-FDG TIs undergoing FNA and classified according to BETHESDA. The following studies were excluded: (1) those with overlapping patient or nodule data; (2) those reporting only some BETHESDA cytologic subcategories (because these results did not allow calculation of the frequency of each cytological subclass of BETHESDA); (3) those with ≤10 18F-FDG TIs. Three researchers (LS, AP, and PT) applied the above criteria to select studies for inclusion. Disagreements were resolved via online consensus discussion among all the authors.

Data extraction
For the included studies, the following data were extracted independently and coded in duplicate by three investigators (LS, AP, and PT), using a pilot form: (a) author, publication year, country, study design; (b) number of PET scans performed during the study period; (c) patients' age and gender; (d) SUVmax value; (e) size of 18F-FDG TIs; (f) number of 18F-FDG TIs undergoing FNA; (g) number of 18F-FDG TIs in each of the six BETHESDA categories; (h) number of 18F-FDG TIs with a histological diagnosis. The collated details were cross-checked, and any discrepancies were fully reconciled by joint re-evaluation among the authors.

Study quality assessment
The risk of bias of the included studies was assessed independently by three investigators (LS, AP, and PT) using the National Heart, Lung, and Blood Institute Quality Assessment Tool (https://www.nhlbi.nih.gov/health-topics/study-quality-assessment-tools).

Data analysis
The characteristics of the included studies were summarized. A proportion meta-analysis was used to obtain the pooled rate of 18F-FDG TIs assessed in each BETHESDA category [12]. Six separate meta-analyses were performed to obtain the pooled prevalence (95% confidence interval, 95% CI) of 18F-FDG TIs in the six different BETHESDA categories. Heterogeneity between studies was assessed using I², with values of 50% or higher regarded as high heterogeneity. If heterogeneity was identified, further analyses were performed to explain it. Egger's test was carried out to evaluate the possible presence of significant publication bias. For statistical pooling of data, a random-effects model was used. A p < 0.05 was regarded as significant. All analyses were performed using StatsDirect statistical software (StatsDirect Ltd; Birkenhead, Merseyside, UK).
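The pooling step described above can be illustrated with a minimal sketch. The source does not state which transformation or estimator StatsDirect applied, so the code below assumes a DerSimonian-Laird random-effects model on untransformed proportions; the function name `pool_proportions`, the continuity correction, and the study counts in the usage lines are hypothetical and serve only as an illustration.

```python
import math


def pool_proportions(events, totals):
    """Random-effects (DerSimonian-Laird) pooling of raw proportions.

    Assumption: untransformed proportions with a binomial variance;
    this is an illustrative sketch, not the StatsDirect procedure.
    """
    props, variances = [], []
    for x, n in zip(events, totals):
        # Continuity correction keeps the variance above zero
        # when a study reports 0 or n events.
        if x == 0 or x == n:
            x, n = x + 0.5, n + 1.0
        p = x / n
        props.append(p)
        variances.append(p * (1 - p) / n)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * pi for wi, pi in zip(w, props)) / sum_w
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, props))
    df = len(props) - 1

    # I^2: share of total variability attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance tau^2 (method of moments).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled estimate, and 95% CI.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * pi for wi, pi in zip(w_re, props)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "pooled": pooled,
        "ci_low": max(0.0, pooled - 1.96 * se),
        "ci_high": min(1.0, pooled + 1.96 * se),
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts of BETHESDA II results in three studies.
result = pool_proportions(events=[40, 55, 20], totals=[100, 110, 60])
print(f"pooled = {result['pooled']:.3f}, I2 = {result['i2']:.1f}%")
```

Under the threshold stated above, an `i2` of 50 or more would be read as high heterogeneity and would prompt the further analyses described in the text.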

Study selection
Literature searches using the above algorithms yielded 407 studies. All the records were assessed as depicted in Fig. 1. Of these, 289 were screened, 84 were assessed as eligible, and 15 [17–31] were included in the final systematic review and meta-analysis.

Study quality assessment
Table 1 summarizes the quality assessment of the 15 included studies. The risk of bias for each study was judged as low for 12 items. All studies were high risk with respect to sample size. None of the studies reported power or sample size justification.

Qualitative analysis (systematic review)
Table 2 summarizes the main features of the 15 included studies. The 15 studies were published between 2012 and 2020. All studies but one [29] were reported in the English language. Nine studies were performed by Asian/Eastern [17–20, 22–24, 27, 31] and six by Western [21, 25, 26, 28–30] authors. They were single-center or observational cohort studies. The total number of 18

Quantitative analysis (meta-analysis)
The distribution of the 2304 18F-FDG TIs according to BETHESDA was evaluated. Table 3 shows the pooled prevalence results. The most frequent cytologic category was benign (BETHESDA II), comprising 45% of 18F-FDG TIs. The malignant category (BETHESDA VI) was the second most common cytologic subcategory (19%). Inconsistency was found in all six categories. Publication bias was present in two categories.
In an attempt to explain the above heterogeneity, further analyses were performed. One sensitivity analysis included only those studies with more than 100 TIs [18,19,22,23,25,27]. This analysis included 1904 18F-FDG TIs. The prevalence in the BETHESDA subcategories was substantially unchanged (I 12%, II 47%, III 6%, IV 4%, V 6%, and VI 21%) and the inconsistency remained high (detailed data not shown). A second analysis was performed considering separately the nine studies from Asian/Eastern countries [17–20, 22–24, 27, 31] and the six studies from Western authors [21, 25, 26, 28–30], as shown in Table 4. There was a significantly different prevalence in BETHESDA category IV between Asian/Eastern and Western studies. With this sub-analysis, no heterogeneity was identified in category IV of the Asian/Eastern studies or in categories II, III, and V of the Western studies. Unfortunately, data on the frequency and presence or absence of HTL were not available to perform further specific analysis. Forest and funnel plots are included as supplemental material.
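As a rough illustration of how these pooled figures compare with the frequencies for all thyroid nodules quoted in the Introduction from Vuong et al. [13], the sketch below tabulates the per-category differences. The percentages are the pooled point estimates reported in this article; the variable and function names are hypothetical, and the comparison is purely descriptive (it ignores confidence intervals and is not a statistical test).

```python
# Pooled BETHESDA prevalence (%) in 18F-FDG TIs, as reported in this meta-analysis.
fdg_ti = {"I": 10, "II": 45, "III": 8, "IV": 8, "V": 6, "VI": 19}

# Pooled frequencies (%) for all thyroid nodules, as quoted from Vuong et al. [13].
all_nodules = {"I": 12.2, "II": 62.3, "III": 8.0, "IV": 6.1, "V": 3.7, "VI": 7.4}


def percentage_point_difference(a, b):
    """Per-category difference a - b, in percentage points."""
    return {category: round(a[category] - b[category], 1) for category in a}


diff = percentage_point_difference(fdg_ti, all_nodules)

# Share of 18F-FDG TIs with a definitive (benign or malignant) cytology.
definitive = fdg_ti["II"] + fdg_ti["VI"]

for category, delta in diff.items():
    print(f"BETHESDA {category}: {delta:+.1f} percentage points")
print(f"Benign or malignant cytology: {definitive}%")
```

The sum of BETHESDA II and VI (45% + 19% = 64%) corresponds to the "two-thirds" figure used elsewhere in the article.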

Discussion
An evidence-based review detailing information on BETHESDA FNA subcategory outcomes in focally 18F-FDG-avid nodules is not currently available in the literature. This systematic review with meta-analysis provides additional evidence-based data on the distribution of 18F-FDG TIs across the full range of BETHESDA cytologic subcategories, enabling more detailed consideration of clinical management in patients undergoing cytologic assessment for 18F-FDG TIs [8].
This study shows that FDG-avid nodules have a relative excess of BETHESDA category IV and V FNA results (see Table 4). Although the published studies included in this meta-analysis do not give specific information on the prevalence of Hürthle cell neoplasms (HCN), the finding of relatively higher frequencies of category IV and V FNA results in 18F-FDG TIs suggests that this excess is due to the higher underlying clinical ROM of PET-avid thyroid nodules, in combination with a relative excess of HCN among FDG-avid nodules, which would typically fall in BETHESDA categories III and IV [14, 32–34].
Diagnostic assessment of 18F-FDG TIs represents a major challenge in clinical practice. The majority of focal 18F-FDG PET/CT-avid thyroid nodules are benign. Unfortunately, we could not derive conclusions about potential relationships between SUVmax values and BETHESDA categories due to the paucity of data. Patients undergoing 18F-FDG PET/CT generally have more aggressive non-thyroidal comorbid tumors; hence, investigation of a potential co-existent thyroid carcinoma may not be a therapeutic priority [10]. Although up to 35–40% of 18F-FDG TIs are malignant, this figure is likely to be biased due to the inclusion of patients with non-thyroid tumors [10]. 18F-FDG-avid thyroid cancer is reported to be more aggressive than non-18F-FDG-avid thyroid carcinoma [35].
This study does not address cancer prevalence among focal 18F-FDG TIs; rather, it describes how a cytologic report can influence the clinical management of patients with clinically suspected 18F-FDG TI, as this information is important, especially for patients with more aggressive [...] [13]. This study therefore indicates that 18F-FDG TIs have a higher ROM than that observed in the general population of thyroid nodules and thus require further investigation. Overall, 45% of 18F-FDG TIs were cytologically assessed as benign, enabling the use of FNA as a rule-out test in these patients, most of whom have a more aggressive non-thyroidal cancer.
Fig. 1 Diagram of the flow of searching data. US ultrasound, CT computed tomography, MRI magnetic resonance imaging, pts patients, FNA fine-needle aspiration, BETHESDA Bethesda system for reporting thyroid cytopathology
Pooled prevalence data show that, overall, nearly two-thirds of 18F-FDG TIs have either a malignant or benign FNA report, which provides reassurance because the false positive and false negative rates of these categories are negligible (i.e., 2–3% and 0–3%, respectively) [12]. Overall, combining results from Western and Asian/Eastern studies, 13% of 18F-FDG TIs are indeterminate (8% BETHESDA III and 5% BETHESDA IV), although over one-quarter of patients will have an inconclusive FNA based on Western studies, whereas the figure for Asian/Eastern patients is much lower. Because the ROM for the indeterminate FNA categories approaches 30% [13], this represents a challenge for clinical practice. A relatively high number of BETHESDA IV FNA results among patients with 18F-FDG TIs are oncocytic/Hürthle cell lesions [36,37], although most studies do not separately record these lesions in published series.
Ultrasound assessment is known to be suboptimal for the detection of follicular and oncocytic thyroid carcinomas compared with papillary thyroid carcinoma; yet some thyroid cancers with poor prognosis, e.g., some follicular and oncocytic carcinomas, are typically classified as BETHESDA III or IV [38,39]. Moreover, 18F-FDG-avid thyroid cancer is generally more biologically aggressive [40–42]. The possibility of an indeterminate FNA result should be taken into account when requesting FNA in 18F-FDG TI. In addition, the presence of noninvasive follicular thyroid neoplasm with papillary-like nuclear features in BETHESDA III and IV categories, as well as in the other categories, should also be considered [43,44].
High heterogeneity was found in these findings. However, this heterogeneity can be partly explained by the sub-analysis separating the studies published by Asian/Eastern and Western authors. As shown in Table 4, heterogeneity was no longer present in some categories, and there was a significant difference between Asian/Eastern and Western studies in BETHESDA category IV. The latter finding corroborates the results obtained by Vuong et al. [13]. However, in the Vuong et al. meta-analysis [13] the prevalence of category IV was 7.9% in Western and 3.5% in Asian/Eastern studies, while here we found 19% and 2%, respectively. The reasons for the published differences between Western and Asian/Eastern cytopathology practice are unclear. There may also be differences in nuclear thresholds for papillary thyroid carcinoma, although data on this are lacking.
Beyond any consideration of the risk of the 18F-FDG-avid thyroid nodule itself (clinical information, biochemical tests, and US features), the overall clinical context should always be kept in mind. Indeed, the extrathyroidal PET/CT findings should be carefully considered prior to the decision to undertake thyroid FNA.
In this context, there are probably two optimal imaging scenarios in which to perform FNA biopsy: patients with complete remission of their non-thyroidal cancer and patients with 18F-FDG PET/CT findings suspicious for a new diagnosis of metastatic thyroid cancer. In other words, the higher the risk of finding thyroid cancer and the higher the likelihood of detecting a highly aggressive primary thyroid malignancy, the more appropriate the indication for FNA.
The pooled data of this meta-analysis could be affected by various biases, which should be discussed. The studies included in this systematic review recorded 4031 focal 18F-FDG TIs, while they reported the results of 2304 (57.1%) FNAs. It is unclear whether patients were managed according to specific clinical features, US-related risk, SUVmax value, or other characteristics associated with the cancers for which PET/CT was indicated (i.e., selection bias). Whether the cytopathologists were influenced by the FNA indication (i.e., 18F-FDG TI in an oncological patient) is also not reported. The studies report the results of retrospective reviews of single-center series of patients. The PET/CT systems utilized differed in sensitivity and resolution. Finally, high statistical heterogeneity was found among the included studies, although this heterogeneity can be partially explained by the sub-analysis of the two groups of Asian/Eastern and Western studies.

Conclusions
This meta-analysis shows for the first time that two in three 18F-FDG TIs undergoing FNA have a malignant (BETHESDA VI) or benign (BETHESDA II) cytology. Approximately one-quarter of cases have an indeterminate FNA, and the remainder are non-diagnostic. A significant difference between Asian/Eastern and Western studies is present in the prevalence of the indeterminate category IV. Thyroidologists should be aware of these data to enable better management of patients, especially when an aggressive non-thyroid cancer is present. These evidence-based data support guiding clinical decision-making according to the patient's clinical context, including the indication for FNA and the extra-thyroidal findings of 18F-FDG PET/CT.
Author contributions PT and DNP: conceptualization. LS and PT: investigation and data curation. LS, AP, and GT: formal analysis, methodology. LS, PT, and DNP: original draft and writing. All authors have reviewed the manuscript and agreed to their individual contributions prior to submission.
Funding Open access funding provided by Università degli Studi della Campania Luigi Vanvitelli within the CRUI-CARE Agreement.

Compliance with ethical standards
Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.