Background

According to the literature, thyroid nodules can be detected in up to 76% of the adult population on thyroid ultrasonography, of which 15% can be malignant [1]. A considerable percentage of these cancers arise in nodules with a diameter smaller than 10 mm (microcarcinomas) [2]. Also, owing to the increasing number of imaging studies (ultrasound, computed tomography, etc.) performed for reasons other than thyroid assessment, incidental small thyroid nodules (≤ 10 mm) are now frequently found in practice [2, 3].

Ultrasonography and fine-needle aspiration (FNA) cytology are the first-line methods routinely used for discriminating malignant from benign thyroid nodules [4]. For better diagnostic management of thyroid nodules, different ultrasound-based risk stratification systems have been developed over the past years, such as the American College of Radiology Thyroid Imaging Reporting and Data System (ACR-TIRADS), the Korean TIRADS (K-TIRADS), and the American Thyroid Association (ATA) guideline [5]. These guidelines present specific recommendations for ultrasound-guided FNA based on the imaging features and size of the nodules. Generally, they do not recommend FNA for thyroid nodules smaller than 10 mm, irrespective of sonographic characteristics [6]; therefore, few data exist on the diagnostic performance of these guidelines for risk stratification of nodules with a diameter up to 10 mm.

The purpose of the present study was to investigate the performance of three different ultrasound-based risk stratification systems (ACR-TIRADS, K-TIRADS, and ATA guidelines) in predicting malignancy in small thyroid nodules (≤ 10 mm). We also sought to identify the ultrasonographic characteristics potentially associated with the risk of malignancy.

Methods

This prospective cross-sectional study was performed on patients with a diagnosis of small thyroid nodules (≤ 10 mm) who were referred by an endocrinologist to the radiologists for sonography and FNA between May 2019 and July 2021. The nodules were primarily detected by an endocrinologist in the clinics of our institution. The inclusion criteria were the presence of thyroid nodules ≤ 10 mm on ultrasound and classification of the nodules according to ACR-TIRADS, K-TIRADS, and ATA during ultrasound assessment. Nodules with a purely cystic component and/or an atypical diagnosis on cytology were excluded. Patients who were not willing to participate in the study were also excluded.

Two senior radiologists with more than 15 years of experience performed ultrasonography of the thyroid nodules using a Samsung H60 ultrasound machine with a 3–14 MHz linear probe prior to the FNA procedure. Ultrasonographic features of the nodules were recorded, including size, echogenic foci (punctate, coarse, peripheral), margins (regular, ill-defined, irregular), echogenicity (hyperechogenicity, isoechogenicity, hypoechogenicity), composition (solid-cystic, solid), and shape (taller-than-wide, wider-than-tall). The radiologists reviewed the thyroid nodules independently, and any disagreements were resolved by consensus.

Given the sonographic characteristics recorded, thyroid nodules were categorized as per the ACR-TIRADS [7, 8], K-TIRADS (suggested by the Korean Society of Thyroid Radiology) [9], and ATA-2015 [10] guidelines, separately. ACR-TIRADS is scored based on echogenicity, shape, margin, echogenic foci, and composition of the thyroid nodules. According to K-TIRADS, irregular margins, solid component, taller-than-wide shape, microcalcifications, and hypoechogenicity are defined as ultrasound features of high suspicion for malignancy. Concerning ATA-2015, ultrasound features of microcalcifications, irregular margins, hypoechogenicity, taller-than-wide shape, and rim calcifications are suggestive of nodule malignancy. Table 1 summarizes different classifications of these systems.

Table 1 Classifications of the three ultrasound-based risk stratification systems

A senior radiologist conducted the FNA procedure using a 5 mL plastic syringe attached to a 23-gauge needle with the freehand biopsy technique under ultrasound guidance. The aspirates were then smeared on microscope glass slides, air-dried, and fixed with 95% alcohol. Two expert pathologists, who were blinded to the sonographic diagnosis of the thyroid nodules, performed the cytological assessment.

All statistical analyses were performed using SPSS v22. Descriptive analysis was used to calculate the performance of the three ultrasound classification systems (ACR-TIRADS, ATA-2015, K-TIRADS) in the diagnosis of malignant thyroid nodules, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy. To estimate the ability of the three ultrasound classification systems to predict malignancy, we used receiver operating characteristic (ROC) analysis, summarized by the area under the curve (AUC). We performed these analyses for cut-off values of 4 and 5 for each guideline, separately. To evaluate the association between sonographic features and malignancy risk, logistic regression analysis was used; the results were presented as odds ratios (ORs) along with 95% confidence intervals (CIs). A p value less than 0.05 was considered significant. The interobserver agreement between the radiologists was assessed with Kappa statistics.
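As an illustration of the performance measures named above, the following minimal Python sketch derives sensitivity, specificity, PPV, NPV, and accuracy from a 2x2 table of ultrasound category (at or above a cut-off) against cytology. The study itself used SPSS; the class name, function name, and the counts below are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table: ultrasound category at/above the cut-off vs. cytology."""
    tp: int  # test positive, malignant on cytology
    fp: int  # test positive, benign on cytology
    fn: int  # test negative, malignant on cytology
    tn: int  # test negative, benign on cytology


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the five standard diagnostic performance measures."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Hypothetical counts for illustration only; NOT the study's data.
example = ConfusionCounts(tp=30, fp=20, fn=10, tn=140)
metrics = diagnostic_metrics(example)
```

Lowering the cut-off (for example from category 5 to category 4 or 5) moves nodules from the test-negative to the test-positive row, which typically raises sensitivity and NPV at the expense of specificity and PPV.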

The details of this study were explained to the patients, and written informed consent was then obtained from all of them. The study protocol was approved by the ethics committee of Babol University of Medical Sciences (code: IR.MUBABOL.REC.1400.155). The patients’ information was kept confidential.

Results

In total, 287 thyroid nodules from 256 subjects (64 men and 192 women) were included in the study (Fig. 1). The mean age of the cases was 45.1 ± 12.5 years, ranging from 18 to 76 years. Out of 287 nodules, 37 (12.9%) were malignant. According to cytology, 33 malignant nodules were consistent with papillary thyroid carcinoma, and four were follicular neoplasms. There was good interobserver agreement between the two radiologists (Kappa = 0.72, 95% CI 0.58–0.86).
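The interobserver agreement statistic reported above can be illustrated with a short Python sketch. It assumes unweighted Cohen's kappa for two raters; the source does not state which kappa variant was used or which variable was rated, and the ratings below are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical category assignments for illustration only.
reader_1 = ["TR3", "TR4", "TR5", "TR4", "TR3", "TR5", "TR4", "TR3"]
reader_2 = ["TR3", "TR4", "TR5", "TR3", "TR3", "TR5", "TR4", "TR4"]
kappa = cohen_kappa(reader_1, reader_2)
```

Kappa corrects the raw agreement rate for the agreement that two raters would reach by chance alone, which is why it is lower than the simple proportion of matching ratings.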

Fig. 1 Flowchart of the thyroid nodules inclusion (≤ 10 mm)

Table 2 shows the ultrasound characteristics of the thyroid nodules, including size, echogenic foci, margin, echogenicity, composition, and shape. Most of the nodules had regular margin (n = 224), hyperechogenicity (n = 173), solid composition (n = 254), wider-than-tall shape (n = 273), and no echogenic foci (n = 210). Table 3 shows the distribution of benign and malignant thyroid nodules according to the three ultrasound classification systems. Regarding ACR-TIRADS, the prevalence of malignancy for classes TR2-TR5 was 0.0%, 4.4%, 17.6%, and 56.8%, respectively. This rate for ATA-2015 classes very low suspicion to high suspicion was 3.4%, 7.0%, 21.9%, and 60.0%, respectively. For K-TIRADS classes 2–5, the prevalence of malignancy was 3.4%, 7.0%, 20.6%, and 65.2%, respectively.

Table 2 Characteristics of the thyroid nodules ≤ 10 mm (n = 287)

Regarding the performance of the ultrasound classification systems for malignant thyroid nodules, the accuracy of ACR-TIRADS categories TR5 and TR4/5 was 88.9% and 72.1%, respectively. This rate for ATA-2015 classes high suspicion and intermediate suspicion/high suspicion was 88.9% and 82.6%, respectively. For K-TIRADS classes 5 and 4/5, the diagnostic accuracy was 89.6% and 82.9%, respectively. Table 4 shows these results in detail.

Table 3 Distribution of benign and malignant thyroid nodules (≤ 10 mm) according to the three ultrasound classification systems

The ROC curves for the ability of the three ultrasound classification systems to predict malignant thyroid nodules are shown according to their categories in Fig. 2. For category 5, ACR-TIRADS had the greatest predictive ability (AUC = 0.706), followed by K-TIRADS (AUC = 0.687) and ATA-2015 (AUC = 0.683). For category 4 or 5, ACR-TIRADS still had the greatest predictive ability (AUC = 0.759), and K-TIRADS and ATA-2015 had an equal value (both AUC = 0.727). Figure 3 shows the ultrasound-guided FNA of a benign nodule, and Figs. 4 and 5 show the ultrasound-guided FNA of malignant nodules.
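When a classification system is dichotomized at a single cut-off, its ROC curve has one operating point, and the trapezoidal AUC reduces to the mean of sensitivity and specificity. The sketch below shows this relationship in Python. Whether the AUCs reported here were computed on dichotomized categories is an assumption, as the source does not specify it; the function name and input values are hypothetical.

```python
def auc_single_cutoff(sensitivity, specificity):
    """Trapezoidal AUC of the ROC curve through (0, 0), (1 - spec, sens), (1, 1)."""
    fpr = 1.0 - specificity
    # Triangle under the segment from (0, 0) to the operating point.
    left = 0.5 * fpr * sensitivity
    # Trapezoid under the segment from the operating point to (1, 1).
    right = 0.5 * (1.0 - fpr) * (sensitivity + 1.0)
    return left + right


# Hypothetical values for illustration only; NOT the study's results.
auc = auc_single_cutoff(sensitivity=0.75, specificity=0.875)
```

The two areas sum algebraically to (sensitivity + specificity) / 2, so under this assumption a cut-off with higher specificity but lower sensitivity can yield a smaller AUC than a more inclusive cut-off.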

Fig. 2 Receiver operating characteristic (ROC) curve of different ultrasound classification systems (category 5; category 4 or 5) for predicting malignancy of thyroid nodules smaller than 10 mm

Fig. 3 The ultrasound-guided fine-needle aspiration from an isoechoic nodule with incomplete peripheral echogenic foci and a diameter of 8 mm (ACR-TIRADS-4, ATA-2015-Low suspicion, K-TIRADS-3), which was proved by cytology to be a nodular goiter

Fig. 4 The ultrasound-guided fine-needle aspiration from a hypoechoic solid nodule with regular margin, punctate echogenic foci, and a diameter of 6 mm (ACR-TIRADS-5, ATA-2015-High suspicion, K-TIRADS-5), which was proved by cytology to be a papillary carcinoma

Fig. 5 The ultrasound-guided fine-needle aspiration from a hypoechoic solid nodule with regular margin, punctate echogenic foci, and a diameter of 8 mm (ACR-TIRADS-5, ATA-2015-High suspicion, K-TIRADS-5), which was proved by cytology to be a papillary carcinoma

Table 5 presents the association between the different sonographic features of the thyroid nodules and the risk of malignancy. Significant direct associations were found between malignancy and punctate echogenic foci (OR = 6.46, 95% CI 2.37–17.43), hypoechogenicity (OR = 6.39, 95% CI 2.26–18.07), ill-defined margin (OR = 4.38, 95% CI 1.72–11.15), and irregular margin (OR = 7.33, 95% CI 1.01–33.05).
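For a single binary feature, the odds ratio from a univariable logistic regression equals the cross-product ratio of the 2x2 table, and its Wald 95% CI follows from the standard error of the log odds ratio. The Python sketch below illustrates that calculation. It is not the study's SPSS analysis, which may have used different reference categories or adjustment, and the counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Wald CI from a 2x2 table.

    a: feature present, malignant    b: feature present, benign
    c: feature absent, malignant     d: feature absent, benign
    z: standard normal quantile (default gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT the study's data.
or_value, ci_lower, ci_upper = odds_ratio_wald_ci(a=20, b=40, c=10, d=130)
```

An interval whose lower bound exceeds 1 indicates a significant direct association at the 5% level. Small cell counts widen the interval considerably, which is one reason a feature seen in only a few nodules can produce a very wide CI.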

Table 4 Diagnostic performance values of the three ultrasound classification systems for malignant thyroid nodules ≤ 10 mm
Table 5 Association between sonographic features and cytology results of the thyroid nodules ≤ 10 mm

Discussion

The ultrasound-based risk stratification system for thyroid nodule malignancy was initially introduced by Horvath et al. [11] in 2009; thereafter, different ultrasound reporting systems have been developed. However, most of the studies evaluating the diagnostic performance of these systems were performed on nodules larger than 10 mm, and limited evidence is available on the performance of different risk stratification systems in the diagnosis of malignancy in smaller thyroid nodules. Therefore, in the present study, we attempted to assess the accuracy of ACR-TIRADS, K-TIRADS, and ATA guidelines (separately for cut-off values of 4 and 5 for each guideline) in the diagnosis of malignancy in small thyroid nodules (≤ 10 mm). We observed that 81% of malignant thyroid nodules presented with an ACR-TIRADS TR4 or 5, while this rate for ATA-2015 intermediate suspicion/high suspicion and K-TIRADS 4 or 5 classifications was about 60%. Additionally, for category 5, we found that specificity and NPV values were higher than sensitivity and PPV values, respectively, for all three guidelines; however, the accuracy of these systems was acceptable (nearly 90%). For category 4 or 5, sensitivities and NPVs increased relative to category 5; conversely, specificities, PPVs, and accuracies decreased.

In the study by Schenke et al. [12], the authors reported that the sensitivity of ACR-TIRADS TR4 and TR5 was 100% for thyroid nodules ≤ 10 mm, while this rate for Kwak-TIRADS 4C and 5, and EU-TIRADS 5 was about 97%. Overall, these rates were higher than those obtained in our study. On the other hand, Schenke et al. reported that the specificities of ACR-TIRADS TR4 and TR5, Kwak-TIRADS 4C and 5, and EU-TIRADS 5 were 40.6%, 55.1%, and 49.3%, respectively, which were lower than those found in the present study. We did not assess Kwak-TIRADS and EU-TIRADS in our study; however, these comparisons can help in gaining a better insight into the existing evidence.

Consistent with other guidelines, ACR-TIRADS does not recommend FNA biopsy for small thyroid nodules (< 10 mm); however, owing to the importance of papillary thyroid microcarcinomas, 5–9 mm TR5 nodules can be biopsied under certain conditions based on this system [7]. Regarding K-TIRADS, radiologists selectively recommend FNA for category 5 small nodules (5–10 mm) when extrathyroidal extension, trachea or recurrent laryngeal nerve invasion, cervical lymph node or distant metastasis, or tumor progression is found [13]. Regarding the ATA guideline, the size criterion for high suspicion nodules was initially > 5 mm in 2009, but it changed to ≥ 10 mm in 2015 [10, 13].

Overall, active surveillance of thyroid nodules smaller than 10 mm without FNA may be associated with anxiety for the patients because of remaining uncertainties in the nodules’ cytopathology. Moreover, these small nodules still carry a likelihood of malignancy; for example, about 13% of the thyroid nodules in our study were malignant. Therefore, FNA and cytopathological assessment still seem necessary for high-risk small nodules. Our results showed that the rate of malignancy among low-risk thyroid nodules (including ACR-TIRADS TR2/3, ATA-2015 very low suspicion/low suspicion, and K-TIRADS 2/3) was low. These results are concordant with studies by Schenke et al. [12], Ha et al. [14], and Mendes et al. [2], demonstrating that use of these ultrasound reporting systems can prevent overdiagnosis and overtreatment of low-risk thyroid nodules smaller than 10 mm.

In the present study, we also investigated the sonographic features potentially associated with the risk of malignancy in thyroid nodules. In this regard, punctate echogenic foci, ill-defined and irregular margins, and hypoechogenicity were predictive of thyroid cancer. No significant association was found between nodule size and malignancy risk. These findings are in agreement with most of the previously published data [15, 16]. On the other hand, solid component and taller-than-wide shape have been reported as suggestive of nodule malignancy in the literature [15, 16], which was not consistent with the results of the present study. It is noteworthy that the number of nodules with a taller-than-wide shape was low; therefore, these results should be interpreted with caution.

A strength of this study was its prospective data collection, ensuring that cases were assessed strictly before analysis. On the other hand, a limitation was the lack of histological results for the malignant nodules. Also, we did not use ultrasound elastography for assessment of the nodules. Finally, multicenter studies with larger sample sizes are suggested for the future.

Conclusions

According to our findings, ACR-TIRADS with a cut-off ≥ TR4 had the highest sensitivity and NPV, whereas K-TIRADS with a cut-off ≥ 4 and ATA-2015 classes intermediate suspicion and high suspicion had equally the highest specificity and PPV. These differences should be considered by clinicians and radiologists in the management of thyroid nodules smaller than 10 mm. Finally, we found that punctate echogenic foci, ill-defined and irregular margins, and hypoechogenicity were predictive of thyroid malignancy.