Thyroid nodules are common, but seldom harbour malignancy [1, 2]. Ultrasonography and fine needle aspiration cytology (FNAC) adequately differentiate benign from malignant thyroid nodules in approximately 70% of patients, but diagnostic dilemmas remain for nodules with indeterminate cytology, including atypia of undetermined significance or follicular lesion of undetermined significance (Bethesda III, AUS/FLUS) and (suspicious for a) follicular neoplasm (Bethesda IV, FN/SFN) or Hürthle cell neoplasm (Bethesda IV, HCN/SHCN) [2, 3]. The follicular lesions of which this group largely consists require histopathological assessment of capsular and vascular invasion to obtain a conclusive benign or malignant diagnosis [3]. Current international guidelines recommend repeat FNAC in Bethesda III nodules and consideration of clinical and ultrasound characteristics and patient preference in both Bethesda III and IV nodules, before deciding to proceed with either active surveillance or diagnostic surgery [3, 4]. When diagnostic surgery is performed, a mere one in four indeterminate thyroid nodules harbours malignancy. Thus, the surgery is futile in approximately 75% of these patients, with associated morbidity, risk of surgical complications, higher health care costs, and possible negative influence on the patients’ health-related quality of life (HRQoL) [3, 5,6,7]. A more accurate preoperative differentiation is needed to avoid futile diagnostic surgeries for benign nodules.

Positron emission tomography/computed tomography (PET/CT) using 2-[18F]fluoro-2-deoxy-D-glucose ([18F]FDG) visualises metabolic activity in tissues. A meta-analysis of the earlier small, non-randomised studies demonstrated that [18F]FDG-PET/CT reliably ruled out malignancy with 95% sensitivity in indeterminate thyroid nodules, increasing to 100% for nodules above 15 mm in diameter [5]. Consequently, [18F]FDG-PET/CT-driven management may cost-effectively reduce the fraction of futile surgeries from ~ 75% to ~ 40%, with an expected reduction in direct healthcare costs while maintaining HRQoL [5, 7]. More recent studies reported sensitivities ranging from 71% to 100%, with most trials confirming the safety of [18F]FDG-PET/CT-driven management [8, 9]. International guidelines acknowledged the potential but stopped short of recommending the routine use of [18F]FDG-PET/CT for indeterminate thyroid nodules, as randomised controlled trials underpinning the impact of [18F]FDG-PET/CT on improved patient outcomes are lacking [4].

Here, we report the first randomised controlled trial evaluating the implementation of [18F]FDG-PET/CT as a rule-out test in the diagnostic workup of indeterminate thyroid nodules. The primary objective was to accurately reduce unbeneficial patient management, i.e., avoid diagnostic surgery for benign nodules and avoid surveillance for malignant and borderline nodules requiring surgical resection. Secondary objectives were to determine the impact of [18F]FDG-PET/CT-driven management on the surgical complication rate, HRQoL, societal costs, and consequences of incidental PET/CT findings and to assess the implementability of [18F]FDG-PET/CT.

Material and methods

Study design and participants

The Efficacy of [18F]FDG-PET in Evaluation of Cytological indeterminate Thyroid nodules prior to Surgery (EfFECTS) trial was a blinded, randomised controlled multicentre trial performed in all eight academic and seven large community hospitals in the Netherlands (Supplementary Data p3). At all study sites, local investigators and physicians were highly experienced in the multidisciplinary diagnosis and treatment of thyroid nodules and thyroid carcinoma and worked in accordance with national and international guidelines [4, 10]. Adult euthyroid patients in whom diagnostic surgery was scheduled for an indeterminate thyroid nodule, defined as Bethesda III (confirmed on two subsequent FNAC procedures) or Bethesda IV cytology, were eligible for study participation [3]. Bethesda III or IV diagnosis was established by blinded central review by two dedicated thyroid pathologists (AE and BK). Prior to inclusion in the trial, clinical and ultrasound characteristics of the index nodule were considered in a multidisciplinary setting by the local physicians to establish the indication for diagnostic surgery, in accordance with current guidelines [4]. Patients were excluded from study participation if they had contraindications for [18F]FDG-PET/CT or a higher a priori risk of thyroid malignancy based on their presentation or history (i.e., unexplained stridor, vocal cord paralysis or radiation exposure to the thyroid), if they already underwent any non-routine preoperative diagnostic stratification (i.e., [18F]FDG-PET/CT or molecular diagnostics) or were unable to undergo randomisation (e.g., patient preference for surgery) [11]. Full eligibility criteria are listed in the study protocol (Supplementary Data). Written informed consent was obtained from all participants prior to any study activity. The study protocol was approved by the Medical Research Ethics Committee on Research Involving Human Subjects region Arnhem-Nijmegen, Nijmegen, the Netherlands. 
The trial was overseen by a trial steering committee and an independent study safety committee. The funder of the study had no role in its design, data collection and analysis, or writing of this report.


Patients were individually randomly assigned to the [18F]FDG-PET/CT-driven group or diagnostic surgery group in a 2:1 ratio. To pursue an even distribution of risk factors for differentiated thyroid carcinoma across both arms, stratification was applied by patient sex, age (dichotomised at 45 years), ultrasonographic thyroid nodule size (0–10, 11–20, 21–40, or > 40 mm), Bethesda classification (III or IV), and study site. Randomisation was performed in the trial management system, Castor Electronic Data Capture (Castor EDC, Amsterdam, the Netherlands), which uses a validated variable block randomisation model.
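As an illustration of stratified variable block randomisation with 2:1 allocation, the following is a minimal sketch only. The internal algorithm of Castor EDC is not described here, so the function name `block_schedule`, the block sizes (3 or 6), and the stratum labels are all assumptions made for this example.

```python
import random

def block_schedule(n_blocks, ratio=(2, 1), multipliers=(1, 2), seed=None):
    """Allocation list for ONE stratum, built from randomly sized permuted blocks.

    Hypothetical helper: block sizes are ratio-sum * multiplier (3 or 6 here).
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        m = rng.choice(multipliers)  # variable block size hides the next allocation
        block = (["PET/CT-driven"] * (ratio[0] * m)
                 + ["diagnostic surgery"] * (ratio[1] * m))
        rng.shuffle(block)           # random order within the block
        schedule.extend(block)
    return schedule

# One independent schedule per stratum; the labels below are illustrative only.
stratum = ("female", "age < 45", "21-40 mm", "Bethesda IV", "site 01")
schedules = {stratum: block_schedule(n_blocks=4, seed=1)}
```

Because every completed block respects the 2:1 ratio, the allocation stays close to 2:1 within each stratum even when recruitment per stratum is small.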


All patients underwent an [18F]FDG-PET/CT of the neck, acquired by 20 PET/CT scanners at 12 EARL-accredited study sites (Supplementary Table 1) using a standard acquisition and reconstruction protocol in accordance with European Association of Nuclear Medicine (EANM) guidelines [11, 12]. In summary, patients fasted for at least 6 h prior to the injection of the radiopharmaceutical (activity adjusted for patient body weight, time per bed position, and PET/CT scanner sensitivity). Approximately 60 min (range 55–70 min) after intravenous [18F]FDG administration, patients were scanned from the external acoustic meatus to the aortic arch in a supine position, with at least 2 min per bed position. A low-dose non-contrast–enhanced CT scan was performed. EARL-reconstructed, pseudonymised scans were stored in a central, password-protected online database within the National Biomedical Imaging Archive environment (NBIA, National Cancer Institute, Bethesda, MD, USA). Next, scans were centrally assessed by two independent, experienced nuclear medicine physicians (LG and DV). Any focal [18F]FDG-uptake within the thyroid that was visually higher than the background [18F]FDG-uptake of the surrounding normal thyroid tissue and that corresponded to the index thyroid nodule in size and location was considered positive. To support the visual assessment, [18F]FDG-uptake was quantified as maximum and peak (ø1-cm sphere) standardised uptake values (SUVmax, SUVpeak), normalised to body weight. In case of a discordant assessment, a third reviewer (WO) was consulted for a consensus meeting. All image analyses were performed using OsiriX Lite DICOM-viewer (Pixmeo SARL, Bernex, Switzerland).
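For readers less familiar with the quantity, a body-weight-normalised SUV can be sketched as below. This is a generic textbook definition, not the implementation of the viewing software; the function name and the example values are hypothetical, and the inputs are assumed to be decay-corrected to the same time point.

```python
def suv_body_weight(activity_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Body-weight-normalised SUV in g/mL.

    Assumes the tissue activity concentration and the injected dose are
    decay-corrected to the same time point (an assumption of this sketch).
    With kBq/mL, MBq and kg the unit factors cancel, leaving g/mL.
    """
    if injected_dose_mbq <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return activity_kbq_per_ml * body_weight_kg / injected_dose_mbq

# Hypothetical voxel: 5 kBq/mL in a 70 kg patient injected with 200 MBq.
example_suv = suv_body_weight(5.0, 200.0, 70.0)
```

SUVmax applies this to the hottest voxel in the nodule, whereas SUVpeak averages over a small sphere, which makes it less sensitive to image noise.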

One project team member (EK) combined the patient allocation and the [18F]FDG-PET/CT result into a preformulated treatment advice. A written report containing only this advice was presented to the patient’s local physician; the patient’s allocation and [18F]FDG-PET/CT result remained concealed. If present, incidental [18F]FDG-PET/CT findings outside the index nodule with potential diagnostic and/or therapeutic consequences were also reported in an appendix to the report, to be evaluated by the local physician in the context of the patient’s medical history. In the [18F]FDG-PET/CT-driven group, the treatment advice was based on the [18F]FDG-PET/CT results. When the index nodule was [18F]FDG-positive, patients were advised to proceed to the scheduled diagnostic surgery. When the nodule was [18F]FDG-negative, patients were advised to refrain from surgery and undergo active surveillance of the nodule, which was defined as at least one follow-up ultrasound exam of the neck and outpatient clinic visit after one year. Any additional follow-up visits during study participation were permitted at the discretion of the local physician. The nodule was presumed benign when it remained unchanged in size and appearance on the one-year ultrasound. In case of significant growth (> 50% volume change or > 20% increase in at least two dimensions, excluding cystic components) or changed ultrasound appearance including newly observed suspicious characteristics, further evaluation by repeat FNAC was recommended. Suspicious ultrasound characteristics were defined as a marked hypoechoic solid nodule, irregular shape (i.e., taller-than-wide), irregular margins (i.e., lobulated, infiltrative), and/or presence of microcalcifications.
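The growth criterion can be expressed as a small check. The sketch below is an illustration only: it assumes an ellipsoid volume approximation, reads "volume change" as a volume increase, and does not model the exclusion of cystic components; the function names are hypothetical.

```python
from math import pi

def ellipsoid_volume(d1, d2, d3):
    """Ellipsoid approximation of nodule volume from three orthogonal diameters."""
    return pi / 6 * d1 * d2 * d3

def significant_growth(baseline_mm, followup_mm):
    """True if volume rose by > 50% or at least two diameters rose by > 20%.

    Both arguments are (d1, d2, d3) in mm, measured in the same order.
    """
    v0 = ellipsoid_volume(*baseline_mm)
    v1 = ellipsoid_volume(*followup_mm)
    volume_criterion = (v1 - v0) / v0 > 0.50
    grown_dims = sum((f - b) / b > 0.20 for b, f in zip(baseline_mm, followup_mm))
    return volume_criterion or grown_dims >= 2

# Hypothetical nodule growing in one dimension only: below both thresholds.
one_dim_only = significant_growth((20, 15, 12), (26, 15, 12))
```

A nodule that grows by 30% in a single diameter, as in the usage line, meets neither threshold and would remain under surveillance under this reading.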

In the diagnostic surgery group, the treatment advice for all patients was to proceed to the scheduled diagnostic surgery, in accordance with the current international guidelines [4, 10]. In both study groups, patients and their physicians were free to deviate from the study treatment advice at any time.
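The mapping from allocation and scan result to the reported advice can be summarised in a few lines. This is a sketch of the decision rule as described in the text, with hypothetical labels; it is not the software used in the trial.

```python
def treatment_advice(allocation, fdg_positive):
    """Return the advice reported to the local physician.

    Only this advice is reported; allocation and scan result stay concealed.
    """
    if allocation == "diagnostic surgery":
        return "surgery"  # the scan result is not used in this arm
    if allocation == "PET/CT-driven":
        return "surgery" if fdg_positive else "surveillance"
    raise ValueError(f"unknown allocation: {allocation!r}")

# A surgery advice alone does not reveal the arm or the scan result.
advice_a = treatment_advice("diagnostic surgery", fdg_positive=False)
advice_b = treatment_advice("PET/CT-driven", fdg_positive=True)
```

Both usage lines yield the same advice, which is why blinding holds for every patient except those who receive a surveillance advice.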

All postoperative patient management was based on the local histopathological diagnosis and current international guidelines [4]. After completion of all study procedures and data collection, all histopathology was centrally reviewed by a dedicated thyroid pathologist (AE). In case of a discordant review, a second central pathologist was consulted for a consensus meeting. Incidentally detected (micro)carcinomas located outside the index nodule were not considered for the main outcome measures.

HRQoL and societal costs were assessed during one year, calculated from the date of the [18F]FDG-PET/CT scan. Patients were asked to complete the EuroQol 5-dimension 5-level questionnaire (EQ-5D-5L), the iMTA Medical Consumption Questionnaire (iMCQ), and the iMTA Productivity Costs Questionnaire (iPCQ) at 0 (baseline), 3, 6, and 12 months (Supplementary Data p9) [13,14,15,16]. Societal costs (in €) included all direct medical costs for thyroid-related and other healthcare consumption, patient costs (i.e., informal care, travel expenses), and productivity losses. Volumes of healthcare consumption were extracted from individual patient files and the iMCQ. Costs were valued using reference prices and the 2019 reimbursement rates of the Dutch System of Diagnosis-Treatment Combinations, where appropriate and available (Supplementary Data p10). The estimated cost of one partial-body [18F]FDG-PET/CT scan was €754 [17, 18]. All [18F]FDG-PET/CT-related costs, including the costs of the scan itself and any additional healthcare consumption for incidental [18F]FDG-PET/CT findings, were only taken into account for the [18F]FDG-PET/CT-driven group.


Patients, all local study site personnel, and all pathologists were blinded to [18F]FDG-PET/CT results and allocation. Central pathologists were additionally blinded to the local cyto- and histopathological diagnoses. Central nuclear medicine physicians were blinded to allocation and all clinicopathological data except for the ultrasonographic size and location of the index nodule. Other study investigators assessing outcomes were blinded to allocation. Patients allocated to the [18F]FDG-PET/CT-driven group with an [18F]FDG-negative nodule could inevitably deduce their allocation and [18F]FDG-PET/CT result from the surveillance advice.


The primary outcome was the fraction of patient management that was considered unbeneficial one year after the [18F]FDG-PET/CT scan. Unbeneficial management was defined as futile diagnostic surgery for histopathologically benign nodules (including hyperplastic nodules, follicular adenoma, and Hürthle cell adenoma) and/or unjustified surveillance for histopathological malignant or borderline tumours. According to novel insights arising during the trial, nodules diagnosed as noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) or follicular tumour of uncertain malignant potential (FT-UMP) are considered benign yet potentially premalignant (i.e., borderline) lesions, for which surgery is considered justified [19, 20]. Following broad acceptance of these insights, we added a refinement concerning these borderline tumours to the study protocol during the trial. As histopathology was not reviewed until trial completion, this modification did not in any way influence trial execution and primary endpoints. The results for the primary outcome for a strictly benign-malignant differentiation are reported in the Supplementary Data (p16).

Secondary outcomes of the trial included the differences in surgical complication rates, one-year HRQoL expressed in quality-adjusted life years (QALYs), and societal costs (in €) between both strategies. The diagnostic accuracy of [18F]FDG-PET/CT (whole-group analysis) was estimated: [18F]FDG-positive or [18F]FDG-negative nodules confirmed as malignant or borderline tumours on histopathology were considered true-positive or false-negative, respectively; [18F]FDG-positive or [18F]FDG-negative nodules confirmed as benign on histopathology and those that remained unchanged on the one-year ultrasound were considered false-positive or true-negative, respectively. Finally, the number of incidental [18F]FDG-PET/CT findings with diagnostic and/or therapeutic consequences in the scanned area (descriptive whole-group analysis), implementability of [18F]FDG-PET/CT (i.e., diagnostic confidence) defined as the fraction of patients not reassured by a negative [18F]FDG-PET/CT result, and survival were assessed. Per protocol, the follow-up for all endpoints was set at one year after [18F]FDG-PET/CT. Whenever available and relevant, data beyond one year of follow-up (censored 1 October 2021) are presented.

Statistical analysis

The trial was designed to have 80% power to detect a reduction in unbeneficial management from ~ 75% to ~ 40% at a significance level of 0.05. At least 90 evaluable patients with nodules > 10 mm were required (2:1 allocation). After correction for 82.7% expected nodule size > 10 mm and 15% estimated data-attrition, the sample size was set at 132 patients [5, 7].

After half of the anticipated patients of the diagnostic surgery group were recruited, the study safety committee conducted a predetermined interim analysis and reported no objections to safe continuation of the trial.

Descriptive statistics are reported as mean ± standard deviation or median and interquartile range for continuous variables, and as absolute numbers and relative frequencies (%) for categorical variables. Intention-to-treat analysis was performed. Categorical primary and secondary outcomes were compared between allocated groups using Pearson’s chi-squared or Fisher’s exact tests, where appropriate. We adjusted for the stratifying variables using binary logistic regression; the corrected p values are reported together with adjusted odds ratios and their 95% confidence intervals (CI) [21].

Sensitivity, specificity, and negative and positive predictive value (NPV, PPV) were calculated using the traditional formulas. 95% CIs were calculated using the β-distribution (Clopper-Pearson interval). For EQ-5D-5L, iMCQ, and iPCQ questionnaires, we used multiple imputation to account for possible selectively missing values. To estimate HRQoL, we calculated Dutch utility scores from the EQ-5D-5L and estimated the mean one-year QALYs as the area under the utility curves (Supplementary Data p8). One-year societal costs were estimated as the mean sum of volume × cost over all components. QALYs and costs are presented as mean and 95% CI and compared using independent-samples t-tests with unequal variances. In the analysis of QALYs and costs, we adjusted for the stratifying variables and additionally adjusted for possible influences of the unevenly distributed malignancy/borderline rate (see Results, and Supplementary Tables 4 and 5), using a generalized linear model (GLM). The local benign/malignant histopathological diagnosis was included in the GLM as a covariate, as this diagnosis determined the patients’ postoperative course of treatment and thus contributed to the costs and perceived HRQoL. The adjusted means, p values, and mean differences are presented.
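The area-under-the-curve estimate of one-year QALYs can be sketched as follows. The trapezoidal rule (linear interpolation between questionnaire time points) is an assumption of this example, since the exact method is given in the Supplementary Data; the utility values are hypothetical.

```python
def qalys_auc(months, utilities):
    """QALYs as the trapezoidal area under a utility curve.

    months: measurement times in months (ascending); utilities: matching scores.
    """
    if len(months) != len(utilities) or len(months) < 2:
        raise ValueError("need at least two matching time points")
    total = 0.0
    for i in range(len(months) - 1):
        years = (months[i + 1] - months[i]) / 12         # interval width in years
        total += years * (utilities[i] + utilities[i + 1]) / 2
    return total

# Hypothetical patient measured at baseline, 3, 6 and 12 months.
example_qalys = qalys_auc((0, 3, 6, 12), (0.80, 0.70, 0.80, 0.85))
```

In this hypothetical example, a temporary dip in utility at three months lowers the one-year total to just under 0.79 QALYs.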

Two prespecified subgroup analyses for the primary outcome were performed: one for nodules > 10 mm (ultrasonographic largest diameter) and one for Hürthle cell nodules (defined as Bethesda IV HCN/SHCN cytology) and non-Hürthle cell nodules (defined as Bethesda III AUS/FLUS and Bethesda IV FN/SFN cytology).

Data collection was performed using Castor EDC (Castor EDC, Amsterdam, the Netherlands). Statistical analysis was performed using SPSS Statistics version 26 (IBM Corp, Armonk, NY, USA). This trial is registered with ClinicalTrials.gov, NCT02208544 (5 August 2014).

Results


After screening 260 patients for eligibility, we enrolled 132 patients with a cytologically indeterminate thyroid nodule and scheduled diagnostic thyroid surgery between 1 July 2015 and 16 October 2018 (Fig. 1). Their mean age was 54.5 ± 13.6 years; 107 (81.1%) patients were female. A total of 91 (69%) patients were randomly allocated to the [18F]FDG-PET/CT-driven group and 41 (31%) to the diagnostic surgery group. Baseline characteristics, including stratifying variables and PET/CT parameters, were balanced across both allocation groups, except for two patient-reported complaints upon the first presentation: subjectively increased size of a known thyroid nodule (p = 0.01) and dysphagia (p = 0.02) (Table 1). Suspicious ultrasound characteristics were present at baseline in 40% (36/91) of patients in the [18F]FDG-PET/CT-driven group and 46% (19/41) of patients in the diagnostic surgery group (p = 0.47).

Fig. 1 Trial profile. The dashed line indicates the patients who deviated from the treatment advice per protocol. NIFTP, non-invasive follicular thyroid neoplasm with papillary-like nuclear features. FT-UMP-OV, follicular tumour of uncertain malignant potential, Hürthle cell type. *A specification of reasons for ineligibility is provided in Supplementary Table 2

Table 1 Baseline characteristics of the study population and [18F]FDG-PET/CT parameters

[18F]FDG-PET/CT results showed a visually [18F]FDG-negative index nodule in 41 of 132 (31%) patients: 26 of 91 (29%) in the [18F]FDG-PET/CT-driven group and 15 of 41 (37%) in the diagnostic surgery group (p = 0.36). All 26 patients with an [18F]FDG-negative index nodule in the [18F]FDG-PET/CT-driven group were advised active surveillance. After one year, 23 had not undergone surgery. On the one-year ultrasound, 21 of 23 nodules (91%) were unchanged in size and appearance; they were considered benign. Two of 23 (9%) nodules had increased by 28–37% in largest diameter on the one-year ultrasound. To date, after a median follow-up of 29 months (IQR 24–45) until their latest ultrasound exam, 20 of 23 (87%) nodules have remained unchanged. Three patients, including the two with an apparently growing nodule on the one-year ultrasound, experienced local discomfort attributed to local compression of the nodule and underwent diagnostic surgery outside the study follow-up (20, 35, and 41 months after [18F]FDG-PET/CT, respectively). Histopathology was benign, showing two follicular adenomas and one hyperplastic nodule.

All 41 patients in the diagnostic surgery group and the 65 patients with an [18F]FDG-positive index nodule in the [18F]FDG-PET/CT-driven group were advised to proceed to the scheduled diagnostic surgery. One patient in the diagnostic surgery group and two patients in the [18F]FDG-PET/CT-driven group, all with [18F]FDG-positive nodules, waived surgery. To date, after 28–42 months of follow-up and repeated ultrasound exams, none of these nodules have changed; they are considered benign (false-positive).

In total, 106 of 132 (80.3%) patients underwent diagnostic surgery during study follow-up (Supplementary Table 4). The central review of the histopathology was discordant with the local diagnosis in six cases (6%) (Supplementary Table 5). A total of 34 (26%) nodules had a histopathological diagnosis that justified surgery, including 25 malignancies, five NIFTP, three FT-UMP, and one paraganglioma. A total of 72 (55%) nodules had benign histopathology. In addition to the 26 nodules that were presumed benign during active surveillance, in total, 98 of 132 (74.2%) nodules were considered benign: 63 of 91 (69%) in the [18F]FDG-PET/CT-driven group and 35 of 41 (85%) in the diagnostic surgery group. Despite successful stratified randomisation, the rate of malignant/borderline nodules appeared higher in the [18F]FDG-PET/CT-driven group (28/91, 31%) than in the diagnostic surgery group (6/41, 15%). After adjusting for the stratifying variables, the difference was not statistically significant (p = 0.08). All patients completed all study-related procedures and one year of follow-up. There were no adverse events.

Primary outcomes

After one year of follow-up, patient management had been unbeneficial in 38 of 91 (42% [95% CI, 32–53%]) patients in the [18F]FDG-PET/CT-driven group, compared to 34 of 41 (83% [95% CI, 68–93%]) patients in the diagnostic surgery group (p < 0.001, OR 0.1 [95% CI, 0.1–0.4]). These were all futile diagnostic surgeries for histopathologically benign nodules. There was no unjustified surveillance: no malignancies or borderline tumours were observed in patients under surveillance. [18F]FDG-PET/CT-driven management avoided surgery for 25 of 63 (40% [95% CI, 28–53%]) benign nodules. In comparison, only one of 35 (3% [95% CI, 0–15%]) patients in the diagnostic surgery group did not undergo the recommended surgery and was considered benign on ultrasound follow-up (p = 0.002, OR 26.9 [95% CI, 3.3–219.0]) (Table 2).
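As a plausibility check on the primary comparison, a crude (unadjusted) odds ratio with a Wald confidence interval can be computed directly from the counts above. This sketch is not the adjusted logistic regression used in the trial, so its result should only land in the neighbourhood of the reported adjusted value; the function name is hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald 95% CI from a 2x2 table.

    a, b: events / non-events in group 1; c, d: events / non-events in group 2.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Unbeneficial management: 38 of 91 (PET/CT-driven) versus 34 of 41 (diagnostic surgery).
crude_or, crude_lower, crude_upper = odds_ratio_wald(38, 91 - 38, 34, 41 - 34)
```

The crude estimate is roughly 0.15, in line with the adjusted odds ratio of 0.1 reported above; the adjustment for stratifying variables accounts for the remaining difference.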

Table 2 Therapeutic yield after one year of follow-up

Secondary outcomes

Sensitivity, specificity, NPV, PPV, and benign call rate of [18F]FDG-PET/CT were 94.1% (95% CI, 80.3–99.3%), 39.8% (95% CI, 30.0–50.2%), 95.1% (95% CI, 83.5–99.4%), 35.2% (95% CI, 25.4–45.9%), and 31.1% (95% CI, 23.3–39.7%), respectively (Table 3). Two of 132 (1.5%) [18F]FDG-PET/CT scans were false-negative (both in the diagnostic surgery group). In both cases, the corresponding index nodules had caused extensive debate during the blinded interpretation of the histopathology (i.e., benign or malignant diagnosis). One was a 15 mm, RAS-mutated, non-invasive neoplasm with uncommon spindle cell metaplasia, which was ultimately classified as a papillary thyroid carcinoma (PTC, TNM pT1b). The other was a 32 mm, predominantly cystic, non-invasive neoplasm with a solid component of 8 mm. It was only considered malignant (follicular variant of PTC, TNM pT2) after detection of an ETV6-NTRK3 fusion during the central review of the histopathology (details provided in the Supplementary Data p18).
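The accuracy parameters and their exact intervals can be reproduced with standard-library Python. In the sketch below, the 2×2 counts are reconstructed from figures given in the text (34 malignant or borderline nodules, 2 false-negative scans, 41 [18F]FDG-negative nodules, 132 patients), and the Clopper-Pearson bounds are found by bisection on the binomial distribution rather than with the β-distribution quantile function; the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _solve_decreasing(f, target, iterations=60):
    """Bisection for f(p) = target, where f decreases in p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for k successes in n trials."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

# Counts reconstructed from the text (an assumption of this sketch).
tp, fn, fp, tn = 32, 2, 59, 39
metrics = {
    "sensitivity": (tp, tp + fn),
    "specificity": (tn, tn + fp),
    "NPV": (tn, tn + fn),
    "PPV": (tp, tp + fp),
    "benign call rate": (tn + fn, tp + fn + fp + tn),
}
results = {name: (k / n, *clopper_pearson(k, n)) for name, (k, n) in metrics.items()}
```

Each entry of `results` holds the point estimate followed by the lower and upper bound, which can be compared against Table 3.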

Table 3 Diagnostic accuracy parameters, including results for non-Hürthle and Hürthle cell subgroups

No difference in the surgical complication rate was observed between both groups. Still, following the reduction in diagnostic surgeries, the rate of new levothyroxine suppletion-dependent hypothyroidism due to partial thyroidectomy procedures was only 6% (5/86) in the [18F]FDG-PET/CT-driven group as compared to 17% (7/41) in the diagnostic surgery group (p = 0.07, OR 0.3 [95% CI, 0.1–1.1]). Other surgical complications infrequently occurred (Table 4).

Table 4 Secondary outcomes

EQ-5D-5L questionnaires were completed by 69 of 91 (76%) patients in the [18F]FDG-PET/CT-driven and 29 of 41 (71%) patients in the diagnostic surgery group (p = 0.54). Perceived HRQoL during the first year after the [18F]FDG-PET/CT scan was similar in both groups. Adjusted for the stratifying variables and malignancy/borderline rate, a mean of 0.793 (95% CI, 0.753–0.833) QALYs was estimated in the [18F]FDG-PET/CT-driven group, as compared to 0.725 (95% CI, 0.651–0.799) QALYs in the diagnostic surgery group (p = 0.11). The adjusted mean societal costs during the first year were significantly lower in the [18F]FDG-PET/CT-driven group than in the diagnostic surgery group: €14,800 (95% CI, €12,600 to €17,000) as compared to €21,700 (95% CI, €16,800 to €26,600) per patient, respectively, with an adjusted mean difference of −€6,900 (95% CI, −€12,100 to −€1,600; p = 0.01) (Table 5 and Supplementary Data p8-10).

Table 5 Secondary outcomes: HRQoL and societal costs

Incidental [18F]FDG-PET/CT findings with diagnostic or therapeutic consequences were reported for 22 of 132 (17%) [18F]FDG-PET/CT scans (Table 4, Supplementary Data p26-27). These included 21 [18F]FDG-positive thyroid incidentalomas in 19 (14%) patients, for which 13 additional FNAC procedures were performed. Eleven of 21 (52%) incidentalomas were surgically resected. Two ipsilateral incidentalomas were malignant. In four patients, the initially scheduled hemithyroidectomy was extended to a total thyroidectomy to include a contralateral incidentaloma; all were histopathologically benign: one follicular adenoma and three hyperplastic nodules. These total thyroidectomy procedures in 4 of 132 (3%) patients are considered overtreatment due to the [18F]FDG-PET/CT.

Diagnostic confidence in [18F]FDG-PET/CT was high: only one of six patients who underwent surgery (three during and three after study follow-up) despite advised surveillance (Fig. 1) was not fully reassured by the negative [18F]FDG-PET/CT result. The main reason for surgery in all six patients, however, was not the fear or suspicion of cancer but increasing compressive symptoms causing discomfort. Non-compliance with the surveillance advice did not change the one-year therapeutic yield in the [18F]FDG-PET/CT-driven group, as patient crossover between surgical and non-surgical management occurred in both directions (Fig. 1). Based on theoretical full compliance with the given treatment advice, a maximum 41% reduction in futile diagnostic surgeries for benign nodules (i.e., 26 [18F]FDG-negative nodules of 63 benign nodules) was estimated following full implementation of [18F]FDG-PET/CT (p = 0.86).

Subgroup analysis of nodules > 10 mm (128/132 patients, excluding two 10 mm nodules from each group) demonstrated similar therapeutic yield and diagnostic accuracy as compared to the main results (Supplementary Data p12).

Subgroup analysis of the 101 non-Hürthle cell nodules (60 AUS/FLUS and 41 FN/SFN) and 31 Hürthle cell (HCN/SHCN) nodules was performed. The malignant/borderline rate was 17% (10/60) in Bethesda III as compared to 33% (24/72) in Bethesda IV nodules (p = 0.03), of which 37% (15/41) in FN/SFN and 29% (9/31) in HCN/SHCN nodules (p = 0.50).

In non-Hürthle cell nodules, the fractions of unbeneficial management and prevented surgeries for benign nodules after one year were 37% (95% CI, 25–49%) and 48% (95% CI, 33–63%) in the [18F]FDG-PET/CT-driven group, as compared to 85% (95% CI, 68–95%) (p < 0.001) and 0% (95% CI, 0–18%) (p < 0.001) in the diagnostic surgery group (Table 6). Sensitivity, specificity, NPV, PPV, and benign call rate in non-Hürthle cell nodules were 92.0% (95% CI, 74.0–99.0%), 50.0% (95% CI, 38.3–61.7%), 95.0% (95% CI, 83.1–99.4%), 37.7% (95% CI, 25.6–51.0%), and 39.6% (95% CI, 30.0–49.8%), respectively (Table 3). Therapeutic yield and diagnostic accuracy were similar in AUS/FLUS and FN/SFN nodules. In Hürthle cell nodules, [18F]FDG-PET/CT showed a benign call rate of only 3.2% (1/31). Consequently, [18F]FDG-PET/CT-driven management did not improve the diagnostic workup: the fractions of unbeneficial management and prevented surgeries for benign Hürthle cell nodules were low and similar in both study groups (p = 1) and included one patient in each group who declined the advised surgery (Table 6). [18F]FDG-PET/CT-driven management avoided significantly more futile surgeries in non-Hürthle cell nodules (48% [95% CI, 33–63%]) than in Hürthle cell nodules (13% [95% CI, 2–40%]) (p = 0.02).

Table 6 Subgroup analysis: therapeutic yield after one year of follow-up in AUS/FLUS, FN/SFN, and HCN/SHCN nodules

Discussion


The EfFECTS trial demonstrated that [18F]FDG-PET/CT-driven management avoided 40% of futile surgeries for benign nodules after one year. The high 94.1% sensitivity of [18F]FDG-PET/CT ensures that omitting diagnostic surgery does not compromise oncological safety. Despite patient crossover between surgical and non-surgical management strategies, these results are in line with our previous meta-analysis, in which we estimated that [18F]FDG-PET/CT-driven management could accomplish a reliable maximum 47% reduction in diagnostic surgeries for benign nodules [5, 7]. The secondary outcomes of the trial showed significantly lower one-year societal costs of [18F]FDG-PET/CT-driven management, with other cost savings amply compensating for the additional cost of the [18F]FDG-PET/CT scan (€754). Combined with similar HRQoL in both groups, this suggests a high likelihood of cost-effectiveness of an [18F]FDG-PET/CT-driven diagnostic workup. Finally, a trend towards fewer cases of postoperative medication-dependent hypothyroidism after hemithyroidectomy was demonstrated.

The Hürthle cell nodules in our population were nearly all [18F]FDG-positive, irrespective of malignant or benign histopathology. Visual assessment of [18F]FDG-PET/CT did not contribute to any reduction of futile surgeries in this subgroup. To prevent the unbeneficial application of [18F]FDG-PET/CT and optimise its therapeutic yield, it should only be offered to patients with non-Hürthle cell AUS/FLUS or FN/SFN cytology. Nodules with Hürthle cell cytology should be excluded from visual analysis with [18F]FDG-PET/CT. Quantitative [18F]FDG-PET/CT assessment methods, such as SUV-derived analysis, texture analysis, and radiomics, have shown potential in indeterminate thyroid nodules and [18F]FDG-positive thyroid incidentaloma, although the current evidence is limited and further studies are required [22,23,24]. Other diagnostics, such as molecular analysis for specific driver mutations, mitochondrial DNA mutations, and copy number variations, should be considered for Hürthle cell nodules [25,26,27].

The earlier [18F]FDG-PET/CT studies repeatedly demonstrated sensitivities up to 100%, while more recent studies did report some missed cancer diagnoses [5, 8, 9, 28]. A recent meta-analysis hypothesised that the progress from stand-alone PET to hybrid PET/CT techniques likely increased the false-negative rate because PET/CT provides a better anatomical correlation [29]. Although [18F]FDG uptake in ipsilateral multinodular disease could complicate exact anatomical correlation, we consider this a highly unlikely explanation. The improved spatial resolution and decreased detection limit (now ~ 10 mm diameter to reliably exclude [18F]FDG-uptake) of newer PET/CT scanners likely result in fewer false-negative as well as more false-positive readings. Rather than the impact of improved technology, between-study heterogeneity may result from varying thresholds for the definition of an [18F]FDG-negative nodule, in combination with global variations in case-mix, ranging from variable malignancy rates to differences in histopathological subtypes, genomic patterns, and altered protein expression levels related to the glycolysis pathway [25, 30].

Other diagnostics can be considered for indeterminate nodules [25]. Although they were initially developed for pre-FNAC risk assessment of thyroid nodules, various ultrasound classification systems, such as the American Thyroid Association (ATA) and Thyroid Imaging Reporting and Data System (TIRADS) classifications, have increasingly demonstrated added diagnostic value in nodules with indeterminate cytology [31]. TIRADS assessment may also improve the diagnostic accuracy of [18F]FDG-PET/CT [8, 9]. The current study focussed purely on [18F]FDG-PET/CT. Patients were only included after their indication for diagnostic surgery had been established based on cytology and on clinical and ultrasound parameters, in accordance with international guidelines. This prevented considerations regarding ultrasound characteristics from interfering with the assessment of the impact of [18F]FDG-PET/CT-driven management, our primary objective. When the current study was initiated in 2015, TIRADS was less established and had been implemented only to a very limited extent in the Netherlands. Its prospective assessment was not part of the study procedures. We considered it inappropriate to retrospectively reassess stored baseline ultrasound captures, as ultrasound is a dynamic technique.

Molecular diagnostics are undeniably gaining traction in clinical practice and are increasingly applied in the preoperative workup of thyroid nodules. Besides aiding the differentiation between benign and malignant nodules, these have the added advantage of risk stratification based on the type of genetic alteration found [32]. However, few tests meet the rule-out and/or rule-in requirements for safe implementation of an ancillary test [4, 25]. [18F]FDG-PET/CT meets this rule-out criterion (i.e., a false-negative rate lower than or equal to that of a benign (Bethesda II) cytological diagnosis), as do some commercial gene mutation classifiers with similar sensitivity. These panels appear to outperform [18F]FDG-PET/CT on specificity and benign call rate but have major downsides: limited global availability and very high costs per patient (a Medicare reimbursement rate of $3,600 = €3,109; €1 = $1.18 on 01-10-2021), in addition to practical challenges concerning the required quality, quantity, and storage of the cytological material [33, 34]. In a European setting, with relatively limited costs for surgery and hospital admission, cost-effectiveness of these commercial molecular tests seems unattainable [7]. Locally developed and less comprehensive European molecular panels are available, but their diagnostic accuracy appears too limited for routine application in daily practice [35,36,37]. Due to large global variations in local healthcare expenses, case-mix, and availability of techniques, the cost-utility and convenience of any diagnostic workup will vary greatly among healthcare systems [7, 25, 38, 39]. Following previous model-based assumptions and the significant difference in costs demonstrated in the current study, life-long real-world cost-effectiveness of [18F]FDG-PET/CT is currently being modelled using the results of the EfFECTS trial [7].
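
The rule-out criterion described above can be illustrated with a short calculation. The Python sketch below is not part of the study analysis; it merely shows how the false-negative rate among test-negative nodules (one minus the negative predictive value) follows from sensitivity, specificity, and malignancy rate. The 95% sensitivity and the one-in-four malignancy rate are the figures quoted in the introduction; the specificity of 0.50 and the alternative malignancy rates are hypothetical placeholders chosen only to show the dependence on case-mix.

```python
def false_negative_rate(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Return the fraction of test-negative nodules that are malignant (1 - NPV)."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # Malignant nodules that the test misses, as a fraction of all nodules.
    false_negatives = (1.0 - sensitivity) * prevalence
    # Benign nodules that the test correctly calls negative.
    true_negatives = specificity * (1.0 - prevalence)
    negatives = false_negatives + true_negatives
    if negatives == 0.0:
        raise ValueError("no negative test results; rate is undefined")
    return false_negatives / negatives


if __name__ == "__main__":
    SENSITIVITY = 0.95          # pooled estimate quoted in the introduction [5]
    ASSUMED_SPECIFICITY = 0.50  # hypothetical placeholder, not a trial result
    # 0.25 matches "one in four"; 0.15 and 0.35 are hypothetical case-mix variations.
    for prevalence in (0.15, 0.25, 0.35):
        rate = false_negative_rate(SENSITIVITY, ASSUMED_SPECIFICITY, prevalence)
        print(f"malignancy rate {prevalence:.0%}: false-negative rate {rate:.1%}")
```

Under this definition, a test meets the rule-out criterion when the computed rate does not exceed the malignancy rate accepted for a benign (Bethesda II) cytological diagnosis; that benchmark should be taken from the applicable guideline rather than from this sketch.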

Implementability of [18F]FDG-PET/CT was assessed in patients with an [18F]FDG-negative nodule in the [18F]FDG-PET/CT-driven group, who could deduce their allocation and [18F]FDG-PET/CT result from the surveillance advice. In spite of the suspense of participating in a clinical trial, the observed high therapy compliance reflects the patients’ and physicians’ diagnostic confidence and adoption of [18F]FDG-PET/CT as a trustworthy diagnostic tool. During study participation as well as afterwards, compressive symptoms were the principal reason for surgery in patients with a negative [18F]FDG-PET/CT result and surveillance advice. This demonstrates that shared decision-making remains crucial: it helps to select patients for [18F]FDG-PET/CT who would not prefer surgery regardless of the result (because of discomfort from compressive symptoms, fear of malignancy, or other reasons), to optimise the (long-term) therapeutic yield of [18F]FDG-PET/CT, and to limit non-beneficial use of resources.

A potential limitation of our study is its per-protocol one-year follow-up period for [18F]FDG-negative nodules. Concerns about missed cancer diagnoses were mitigated by the extended median ultrasound follow-up of 29 months. Whether any very slow-growing malignant or borderline thyroid tumours in these nodules will lower the diagnostic accuracy of [18F]FDG-PET/CT and limit the [18F]FDG-PET/CT-driven reduction in futile diagnostic surgeries remains to be established. Similarly, additional long-term false-negative results were recently reported for a well-known molecular classifier [40]. It seems clinically unlikely that a delayed diagnosis of any missed slow-growing malignancies or borderline tumours following [18F]FDG-PET/CT-driven management will alter the patients’ prognosis, as such tumours are likely indolent in nature. False-negative results in previous [18F]FDG-PET/CT studies in indeterminate nodules mostly concerned low-risk (T1) cancers [8, 9, 28]. In differentiated thyroid carcinoma, [18F]FDG uptake is inversely related to prognosis, and [18F]FDG-negative carcinomas showed fewer aggressive characteristics on histopathology [41, 42]. In the current study, the two reported false-negative nodules were difficult to diagnose and were only classified as malignant after extensive assessment, including molecular analysis, by multiple expert thyroid pathologists.

Limitations of the routine use of [18F]FDG-PET/CT include the limited worldwide availability of PET/CT scanners and adherence to standardised international scanning protocols, exposure to a low dose of ionising radiation (~ 4 mSv), and the diagnostic and therapeutic consequences (including costs) of incidental findings. Our study showed that incidental findings caused overtreatment in 4 of 132 (3%) patients, even though treatment was compliant with the current guidelines (Supplementary Data p26). These individual cases underline that further diagnostic options should be carefully explored, especially when drastic management changes are the consequence.

In conclusion, this randomised controlled trial shows that an [18F]FDG-PET/CT-driven diagnostic workup of indeterminate thyroid nodules leads to practice-changing patient management, accurately and oncologically safely ruling out malignancy, reducing futile surgeries by 40%, and saving approximately €6,900 per patient. Its use should be limited to nodules with non-Hürthle cell cytology to further optimise its therapeutic yield to 48%, as [18F]FDG-PET/CT does not contribute to the management of patients with Hürthle cell nodules.