Diagnosis of thyroid micronodules on ultrasound using a deep convolutional neural network

Rho, Miribi; Chun, Sei Hyun; Lee, Eunjung; Lee, Hye Sun; Yoon, Jung Hyun; Park, Vivian Youngjean; Han, Kyunghwa; Kwak, Jin Young

doi:10.1038/s41598-023-34459-3

Article
Open access
Published: 04 May 2023

Volume 13, article number 7231, (2023)

Scientific Reports

Miribi Rho¹,
Sei Hyun Chun¹,
Eunjung Lee²,
Hye Sun Lee³,
Jung Hyun Yoon¹,
Vivian Youngjean Park¹,
Kyunghwa Han¹ &
Jin Young Kwak¹

Abstract

We assessed the performance of a deep convolutional neural network (CNN) in discriminating malignant from benign thyroid nodules < 10 mm in size and compared its diagnostic performance with that of radiologists. Computer-aided diagnosis was implemented with a CNN trained on ultrasound (US) images of 13,560 nodules ≥ 10 mm in size. US images of nodules < 10 mm examined between March 2016 and February 2018 were retrospectively collected at the same institution. All nodules were confirmed as malignant or benign by aspiration cytology or surgical histology. The diagnostic performances of the CNN and radiologists were compared in terms of area under the curve (AUC), sensitivity, specificity, accuracy, positive predictive value, and negative predictive value. Subgroup analyses were performed by nodule size with a cut-off value of 5 mm. The categorization performances of the CNN and radiologists were also compared. A total of 370 nodules from 362 consecutive patients were assessed. The CNN showed a higher negative predictive value (35.3% vs. 22.6%, P = 0.048) and AUC (0.66 vs. 0.57, P = 0.04) than radiologists. The CNN also showed better categorization performance than radiologists. In the subgroup of nodules ≤ 5 mm, the CNN showed a higher AUC with borderline significance (0.63 vs. 0.51, P = 0.08) and higher specificity (68.2% vs. 9.1%, P < 0.001) than radiologists. A CNN trained with thyroid nodules ≥ 10 mm in size showed overall better performance than radiologists in the diagnosis and categorization of thyroid nodules < 10 mm, especially in nodules ≤ 5 mm.

Introduction

The detection of thyroid nodules has substantially increased with the widespread use of high-resolution ultrasound (US), resulting in a high prevalence of 19–67% for thyroid nodules in the general population^1,2. Approximately 7–15% of detected thyroid nodules are thyroid cancers³. In thyroid micronodules (< 10 mm), fine-needle aspiration (FNA) remains controversial because papillary thyroid microcarcinomas, defined as tumors < 10 mm in size, have shown near-zero cancer-specific mortality⁴. As it is difficult to predict which thyroid microcarcinomas will progress to clinical significance, most guidelines simply state FNA as an available option, leaving the decision to clinicians based on clinical settings and patient preference^5,6,7,8.

Multifocality and bilaterality are common features of papillary thyroid carcinoma, with a reported frequency of 18–87%⁹, and are known risk factors for nodal metastasis, distant metastasis, and regional recurrence after initial therapy¹⁰. The American Thyroid Association guideline first recommends lobectomy for unifocal papillary thyroid microcarcinoma without extrathyroidal extension but also notes that the presence of a bilateral nodule can suggest the need for a bilateral thyroidectomy to address the possibility of bilaterality⁵. Of note, the US features used to differentiate benign and malignant thyroid nodules are applied equally to macronodules and micronodules^11,12. Considering that physicians’ visual analysis of micronodules on US, especially of nodules smaller than 5 mm, has shown high false-positive rates, the preoperative detection of micronodules may lead to additional FNA^13,14. Furthermore, given the high nondiagnostic rate of FNA, the preoperative diagnosis of micronodules remains challenging^10,14.

The convolutional neural network (CNN) is a deep learning model that enables high-performance visual recognition and classification by automatically learning representative features from a training set^15,16. The characteristics of the training set are therefore critical to the performance of a CNN. CNN-based methods have been investigated for differentiating malignant and benign thyroid nodules and have shown non-inferior or comparable diagnostic performance to radiologists^{17,18,19,20,21,22,23,24,25}. Most studies have been conducted on thyroid nodules over 10 mm, and only three included thyroid nodules larger than 5 mm, but their mean size was larger than 10 mm^20,24,25. Three other investigations have shown validation results for nodules corresponding to the same size criteria with training sets made up of nodules larger than 10 mm^18,21 or 5 mm²⁰, while no other study has reported nodule size criteria for both the training and validation of the CNN^{17,19,22,23,24,25}. To the best of our knowledge, no study has applied a CNN-based model to thyroid nodules beyond the size criteria of the training set. In this study, we investigated the diagnostic performance of a CNN that was previously trained with thyroid nodules ≥ 10 mm in discriminating malignant and benign thyroid nodules < 10 mm and compared its diagnostic performance with that of radiologists.

Methods

The institutional review board of Severance Hospital (Seoul, South Korea) approved this retrospective study, with a waiver for informed consent (IRB number: 2020-3659-001). Signed informed consent for biopsy or surgical procedures was obtained preoperatively from all patients. All methods were performed in accordance with relevant guidelines and regulations.

Patients

This study was performed at a single tertiary referral center from March 2016 to February 2018, during which 4110 nodules in 3716 consecutive patients were referred for US-guided FNA. The initial FNA was performed in 3323 nodules in 3240 patients, of which 698 nodules in 683 patients were < 10 mm. Our study included nodules < 10 mm if they (a) were cytologically confirmed as benign or malignant (Bethesda category II or VI) or (b) were confirmed as malignant on postsurgical histology. We excluded nodules without a confirmed diagnosis and nodules lost to follow-up. Finally, a total of 370 thyroid nodules in 362 patients were included and analyzed (Fig. 1). Eight patients contributed two nodules each: in 6 patients both nodules were malignant, and in 2 patients one nodule was benign and the other malignant.

US imaging

US examinations of both thyroid glands and neck areas were performed using a 5–12 MHz linear array transducer (iU22, Philips Healthcare, Amsterdam, Netherlands). Real-time US scans and subsequent US-FNA were performed by 12 radiologists with 1–20 years of experience in thyroid imaging.

Each radiologist who performed the US and US-FNA/core biopsy procedures interpreted each US scan of the thyroid nodules and recorded US features prospectively in our institutional database^26,27. US features including composition, echogenicity, margin, calcifications, and shape were recorded using descriptors that have been used from June 2012 to the present in our institution²⁸. Each thyroid nodule was categorized according to the Thyroid Imaging Reporting and Data System suggested by the Korean Society of Thyroid Radiology (KSThR TIRADS) using pre-recorded US features⁷.

Image acquisition and CNN evaluation

A radiologist with 20 years of experience dedicated to thyroid imaging, who was blinded to clinical information and pathological results, selected and retrieved a representative US image for each thyroid nodule from the PACS and stored it in JPEG format. For each image, the same radiologist manually labeled a square ROI enclosing the entire targeted thyroid nodule using the Paint program of Windows 10.

We used a computer-aided diagnosis (CAD) program to assess the malignancy risk of the 370 thyroid nodules on US images. The performance of a CNN algorithm depends heavily on the data used to train the network. Many pre-trained models are available, and test results for several of them (accuracy, sensitivity, and specificity on the 370-nodule test set) are reported in Supplemental Table S1. Because ResNet101 showed one of the best performances with the current US images, this paper focuses on the results of transfer learning with ResNet101. The pretrained CNN model ResNet101^29,30 was fine-tuned with 13,560 US images of thyroid nodules ≥ 10 mm in size (further details on the CAD program are provided in the Supplemental Material)²¹. ResNet101 is a deep neural network that was originally trained on 1000 object classes with 1,281,167 training images and 50,000 validation images. The basic algorithm of the residual network family (ResNet-18, -34, -50, -101, and -152) has been introduced previously²⁹; that paper achieved state-of-the-art results in image classification by taking a standard feed-forward ConvNet and adding skip connections that bypass a few convolution layers at a time. Each bypass (shortcut) forms a residual block in which the convolution layers predict a residual that is added to the block’s input tensor. ResNet101 consists of 347 layers capable of learning rich feature representations of images, with an image input size of 224-by-224. For transfer learning, 13,560 US images comprising 7160 malignant and 6400 benign nodule images were used. To balance the two classes, we applied left–right mirroring augmentation to 760 randomly selected benign images, so that a final total of 14,320 images were used in training.
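The class balancing described above can be sketched in a few lines. This is only an illustration under stated assumptions: images are represented as nested lists of pixel rows, and the helper names `mirror_left_right` and `balance_with_mirroring` are hypothetical and not taken from the CAD program.

```python
import random

def mirror_left_right(image):
    """Return a left-right mirrored copy of an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]

def balance_with_mirroring(majority, minority, seed=0):
    """Add mirrored copies of randomly chosen minority images until classes match.

    Assumes the deficit is no larger than the minority class, so each
    selected image is mirrored at most once.
    """
    deficit = len(majority) - len(minority)
    if deficit < 0 or deficit > len(minority):
        raise ValueError("deficit must be between 0 and the minority class size")
    rng = random.Random(seed)
    chosen = rng.sample(minority, deficit)
    return minority + [mirror_left_right(img) for img in chosen]

# Counts reported in the text: 7160 malignant and 6400 benign images.
n_malignant, n_benign = 7160, 6400
n_mirrored = n_malignant - n_benign
n_total = n_malignant + n_benign + n_mirrored
print(n_mirrored, n_total)  # prints: 760 14320
```

The arithmetic at the end reproduces the reported figures: mirroring 760 benign images brings the benign class to 7160 and the training set to 14,320 images.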
Since the fully connected layer and classification layer at the end of the original pretrained network were configured for 1000 classes, they were replaced with new layers adapted to the new data set (benign and malignant), with learning rates for weights and biases set to 10 each. In the fine-tuning process, stochastic gradient descent with momentum was used as the optimizer, the initial learning rate was set to 10⁻⁴, training ran for 10 epochs, and the mini-batch size was set to 50. The momentum of the optimizer was set to 0.9, and the learning rate was multiplied by 0.5 every 4 epochs. The model was validated with internal data (95 benign, 539 malignant) and external data from three different hospitals (429 benign, 761 malignant).
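The step schedule described above can be written out explicitly. The sketch below assumes zero-based epoch indices, a drop applied after every fourth completed epoch, and that a trailing partial mini-batch is discarded; none of these details are stated in the text, and the function names are hypothetical.

```python
def step_learning_rate(epoch, initial_lr=1e-4, drop_factor=0.5, drop_period=4):
    """Piecewise-constant learning rate for a zero-based epoch index."""
    return initial_lr * drop_factor ** (epoch // drop_period)

def iterations_per_epoch(n_images, batch_size):
    """Number of full mini-batches per epoch (a trailing partial batch is dropped)."""
    return n_images // batch_size

# Ten epochs, as reported; the rate halves at epochs 5 and 9 (one-based).
schedule = [step_learning_rate(e) for e in range(10)]
for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch:2d}: lr = {lr:.2e}")

# 14,320 training images with a mini-batch size of 50.
print(iterations_per_epoch(14320, 50))
```

Under these assumptions the rate stays at 10⁻⁴ for the first four epochs and is halved twice by the end of training.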

Using the CAD program, we calculated the risk of malignancy of each nodule as a continuous value ranging from 0 to 100% (CAD value). We also assigned each nodule a category based on its CAD value (CNN TIRADS), using the malignancy risk ranges predicted for each KSThR TIRADS category. CNN TIRADS category 2 was assigned to nodules with a malignancy probability < 3%, category 3 to a probability of 3% to < 15%, category 4 to a probability of 15% to < 60%, and category 5 to a probability ≥ 60%⁷.
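The thresholds above amount to a simple lookup. A minimal sketch, assuming the CAD value is expressed as a percentage and that each lower bound is inclusive (the function name `cnn_tirads_category` is hypothetical):

```python
def cnn_tirads_category(cad_value):
    """Map a CAD malignancy probability (percent, 0-100) to a CNN TIRADS category."""
    if not 0 <= cad_value <= 100:
        raise ValueError("CAD value must be a percentage between 0 and 100")
    if cad_value < 3:
        return 2
    if cad_value < 15:
        return 3
    if cad_value < 60:
        return 4
    return 5

# Example CAD values chosen for illustration only.
for value in (1.0, 10.0, 56.1, 90.3):
    print(value, cnn_tirads_category(value))
```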

Statistical analysis

For the reference standard, histopathologic results from FNA or surgery were used to confirm the final diagnosis of each thyroid nodule. If there was a discrepancy between the two results, the reference standard was the histopathologic result from the surgical specimen.

Baseline patient characteristics and nodal US features were compared between malignant and benign nodules with the Student’s t-test and Pearson’s χ²-test at the patient level and with logistic regression using the generalized estimating equation method for clustered data at the nodule level. Areas under the receiver operating characteristic curve (AUCs) with 95% CIs were obtained, and the TIRADS category and CAD value of each thyroid nodule were dichotomized as positive or negative at the cut-off that maximized the Youden index. We compared the diagnostic performances of the TIRADS category and CNN by analyzing the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value using logistic regression with the generalized estimating equation method. AUC values were compared with the Obuchowski algorithm for clustered data³¹. The same statistical analysis was performed separately for the subgroups defined by nodule size with a cut-off value of 5 mm.
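For readers unfamiliar with the Youden index, the cut-off selection can be sketched as follows. This is an illustrative implementation on invented toy data, not the SAS or R procedure used in the study; it assumes a score strictly above the cut-off is called positive and that both classes are present.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score > cutoff is called positive.

    labels: 1 for malignant, 0 for benign.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Candidate cutoffs are the observed scores; ties keep the lowest cutoff.
    """
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        j = sens + spec - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Toy data, invented purely for illustration (not from the study).
toy_scores = [10, 20, 30, 40, 50, 60, 70, 80]
toy_labels = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff, j = youden_cutoff(toy_scores, toy_labels)
print(cutoff, round(j, 3))
```

On the toy data the best cut-off is 40, where four of five malignant and all three benign cases are classified correctly.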

We assessed the categorization performances of CNN TIRADS and KSThR TIRADS using the likelihood ratio χ²-test and the linear trend χ²-test for each categorization system to determine homogeneity (small differences in risk of malignancy among nodules in the same category) and monotonicity of gradients (whether the risk of malignancy of nodules increases as the category increases), respectively^32,33. We also used the Akaike information criterion, which is a widely used estimator for model selection. Smaller Akaike information criterion values indicate a more informative model in terms of goodness of fit³⁴.

Statistical analysis was performed using statistical software (SAS version 9.4, SAS Institute, Cary, NC, USA) and the R Statistical Package (Version 4.0.2, Institute for Statistics and Mathematics, Vienna, Austria). Two-sided P values < 0.05 were considered to indicate statistical significance.

Results

Patients and nodules characteristics

A total of 370 nodules in 362 patients (mean age, 46 ± 12 years; range, 20–76 years), comprising 289 women (79.8%; mean age, 46 ± 12 years; range, 20–76 years) and 73 men (20.2%; mean age, 45 ± 12 years; range, 26–73 years), were included in the final study population (Fig. 1). Of these, 347 (93.8%) nodules were confirmed by surgery and 23 (6.2%) by FNA. FNA was performed in the 370 nodules because of requests from physicians at outside clinics (n = 127), high suspicion nodules > 5 mm (n = 123)⁷, the need to determine surgical extent in patients with bilateral nodules (n = 83), patient request (n = 30), and cervical lymph node metastasis (n = 7).

Among the 370 nodules, 323 nodules were confirmed as malignant and 47 nodules were confirmed as benign (Figs. 2 and 3). Of these malignant nodules, 322 nodules were confirmed as papillary thyroid carcinoma and 1 nodule as medullary thyroid carcinoma. The mean nodule size of the malignant and benign nodules was 5.3 ± 1.5 mm and 5.8 ± 2.2 mm, respectively (P = 0.14, Table 1). No significant difference was observed between the malignant and benign nodules for age (46.0 years vs. 45.9 years, P = 0.97) and female proportion (79.2% vs. 85.1%, P = 0.34).

Table 1 Patient demographics and nodal US features.

Comparison of diagnostic performance between the CNN and radiologists

The optimal cut-off points, set by the Youden index, were probability > 56.1% for CNN and KSThR TIRADS category 5 for radiologists. CNN showed higher AUC values than radiologists when diagnosing thyroid nodules (0.66 vs. 0.57, P = 0.04, Table 2). CNN also showed higher values for sensitivity (89.8% vs. 87.3%, P = 0.26), specificity (38.3% vs. 25.5%, P = 0.10), accuracy (83.2% vs. 79.5%, P = 0.08), positive predictive value (90.9% vs. 89.0%, P = 0.07) and negative predictive value (35.3% vs. 22.6%, P = 0.048).
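The reported percentages can be checked against a 2 × 2 table. In the sketch below, the cell counts are back-calculated from the reported sensitivity and specificity together with the 323 malignant and 47 benign nodules; they are an inference for illustration, not values quoted from Table 2 (the radiologists' counts do coincide with the KSThR category 5 counts given in the text). The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard 2x2 diagnostic metrics, returned as percentages rounded to one decimal."""
    total = tp + fn + tn + fp
    metrics = {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
    return {name: round(100 * value, 1) for name, value in metrics.items()}

# Counts back-calculated from the reported percentages with 323 malignant and
# 47 benign nodules; they are an inference, not values quoted from Table 2.
cnn = diagnostic_metrics(tp=290, fn=33, tn=18, fp=29)
radiologists = diagnostic_metrics(tp=282, fn=41, tn=12, fp=35)
print(cnn)
print(radiologists)
```

With these counts, all five metrics for both readers round to the percentages reported above, which suggests the reported values are internally consistent.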

Table 2 Diagnostic performance of the CNN and radiologists.

Among the 370 nodules, 179 nodules were > 5 mm and 191 nodules were ≤ 5 mm. The characteristics of the patients and nodules are presented in Supplemental Table S2. Age and the proportion of malignancy did not differ between the subgroups divided by nodule size.

Cut-off values for the malignancy probability from CNN were redefined as > 55.8% for nodules > 5 mm and > 90.3% for nodules ≤ 5 mm. AUC values for diagnosing thyroid nodules did not differ between the CNN and radiologists in nodules > 5 mm (0.69 vs. 0.62, P = 0.25), while CNN showed higher AUC values than radiologists in nodules ≤ 5 mm with borderline significance (0.63 vs. 0.51, P = 0.08, Supplemental Table S3). In nodules ≤ 5 mm, CNN showed lower values for sensitivity (56.8% vs. 92.3%, P < 0.001) and accuracy (58.1% vs. 82.7%, P < 0.001) but higher values for specificity (68.2% vs. 9.1%, P < 0.001).

Comparison of categorization performance between the CNN and radiologists

Among 323 malignant nodules, 4 (1.2%) nodules were category 3, 37 (11.5%) nodules were category 4 and 282 (87.3%) nodules were category 5 according to KSThR TIRADS (Table 1). Among 47 benign nodules, 3 (6.4%) nodules were category 3, 9 (19.2%) nodules were category 4 and 35 (74.5%) nodules were category 5. TIRADS categorization according to CNN showed higher values in the linear trend χ²-test (20.3 vs. 7.0) and likelihood ratio χ²-test (20.9 vs. 6.3) and lower Akaike information criterion values (264.8 vs. 279.4) than KSThR TIRADS assessed by radiologists, suggesting better categorization performance (Table 3).
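To make the likelihood ratio χ² statistic and the Akaike information criterion concrete, the sketch below computes both from the KSThR TIRADS counts given above. It assumes a saturated model with one malignancy rate per category; the exact model fitted in the study is not described here, so the printed values need not match Table 3. All function names are hypothetical.

```python
import math

def binomial_log_likelihood(groups):
    """Log-likelihood when each group uses its own observed malignancy rate.

    groups: list of (n_malignant, n_benign) pairs, one per category.
    """
    total = 0.0
    for malignant, benign in groups:
        n = malignant + benign
        for count in (malignant, benign):
            if count > 0:  # 0 * log(0) is treated as 0
                total += count * math.log(count / n)
    return total

def likelihood_ratio_chi2(groups):
    """2 * (log-likelihood of per-category model - log-likelihood of pooled model)."""
    pooled = [(sum(m for m, _ in groups), sum(b for _, b in groups))]
    return 2 * (binomial_log_likelihood(groups) - binomial_log_likelihood(pooled))

def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood

# KSThR TIRADS counts quoted in the text, categories 3, 4, 5: (malignant, benign).
ksthr = [(4, 3), (37, 9), (282, 35)]
print(round(likelihood_ratio_chi2(ksthr), 2))
print(round(aic(binomial_log_likelihood(ksthr), n_parameters=len(ksthr)), 2))
```

A larger likelihood ratio χ² means the categories separate malignancy risk better than a single pooled rate, while a smaller AIC indicates a better fit after penalizing the number of parameters.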

Table 3 Comparison of categorization performance between the CNN and radiologists.

Discussion

Our study demonstrated that, when diagnosing thyroid nodules < 10 mm, a CNN trained with thyroid nodules ≥ 10 mm showed better performance than radiologists. The CNN also showed a higher AUC than radiologists in very small nodules ≤ 5 mm, with borderline significance. We used a pretrained CNN that was fine-tuned with 13,560 images of thyroid nodules ≥ 10 mm and applied it to smaller thyroid nodules < 10 mm.

CNN is an end-to-end model that automatically extracts features from digital images to enable pattern recognition, object detection, and classification. Since LeCun et al. proposed LeNet, the first CNN model in 1989, CNN has rapidly developed and various CNNs such as AlexNet or ResNet have been introduced³⁵. The CNN-based diagnosis of thyroid nodules has shown comparable performance to experienced radiologists (Table 4). CNN has also shown significantly higher AUC values in recent studies using training sets with large numbers of nodules^19,21,22,25. In addition, CNN has shown higher specificity than radiologists with similar levels of sensitivity (except in some studies using specific commercially available CAD)^19,21,25.

Table 4 Comparison of diagnostic performance between CNN and radiologists in previous studies.

To the best of our knowledge, no studies have validated the diagnostic performance of CNN on a test set that has a size range different from that of the training set. Our study shows that CNN can diagnose nodules that are completely different in size from those in the training set with significantly better AUC and negative predictive value than experienced radiologists. This is largely consistent with previous studies^19,21. Our study also shows that the differences in specificity and AUC between the CNN and radiologists are more pronounced in very small nodules ≤ 5 mm. Considering the high false-positive rate of FNA in very tiny nodules, we can expect CNN to reduce unnecessary FNA in clinical practice, especially in thyroid micronodules¹³.

In our study, the categorization of nodules based on CAD values showed comparable or better stratification ability than KSThR TIRADS in terms of discriminatory ability and homogeneity^32,33,34. Since the CNN TIRADS defines categories according to the predicted risk of malignancy suggested by KSThR TIRADS, CNN can help clinicians decide the next management step for patients, such as whether to follow up or perform FNA, under the existing TIRADS guideline. CNN has the potential to be used as a convenient tool that reduces the burden of clinically triaging thyroid micronodules.

We acknowledge that there are several limitations to our study. First, the number of benign nodules was markedly lower than that of malignant nodules. Because micronodules underwent FNA only when they showed highly suspicious features, FNA-confirmed benign nodules were relatively rare, resulting in low negative predictive values for both the CNN and radiologists. Second, a majority of the malignant nodules were papillary thyroid carcinoma. Because follicular neoplasms and the follicular variant of papillary thyroid carcinoma exhibit distinctive US features, our results cannot be generalized to the diagnosis of other pathologic disease entities³⁶. Third, radiologists manually selected key images and drew the ROIs to be entered into the CNN, implying that the calculations made by the CNN are inevitably operator-dependent. In a past study using support vector machine-based CAD, the diagnostic performance of computer-aided diagnosis for thyroid nodules varied significantly according to the experience of radiologists^37,38. Further studies are needed to evaluate the reproducibility of CNN.

Conclusion

The deep convolutional neural network trained with thyroid nodules ≥ 10 mm showed overall better diagnostic and categorization performance than radiologists in thyroid nodules < 10 mm, especially those ≤ 5 mm.

Data availability

The raw data analyzed in the study are available from the corresponding author on reasonable request.

Abbreviations

US: Ultrasound
FNA: Fine-needle aspiration
CNN: Convolutional neural network
TIRADS: Thyroid Imaging Reporting and Data System
KSThR: Korean Society of Thyroid Radiology
CAD: Computer-aided diagnosis
AUC: Area under the receiver operating characteristic curve

References

Davies, L., Ouellette, M., Hunter, M. & Welch, H. G. The increasing incidence of small thyroid cancers: Where are the cases coming from?. Laryngoscope 120, 2446–2451. https://doi.org/10.1002/lary.21076 (2010).
Mitchell, J. & Parangi, S. The thyroid incidentaloma: An increasingly frequent consequence of radiologic imaging. Semin. Ultrasound CT MR 26, 37–46. https://doi.org/10.1053/j.sult.2004.10.004 (2005).
Mandel, S. J. A 64-year-old woman with a thyroid nodule. JAMA 292, 2632–2642. https://doi.org/10.1001/jama.292.21.2632 (2004).
Baudin, E. et al. Microcarcinoma of the thyroid gland: The Gustave–Roussy Institute experience. Cancer 83, 553–559. https://doi.org/10.1002/(sici)1097-0142(19980801)83:3%3c553::aid-cncr25%3e3.0.co;2-u (1998).
Haugen, B. R. et al. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: The American Thyroid Association guidelines task force on thyroid nodules and differentiated thyroid cancer. Thyroid 26, 1–133. https://doi.org/10.1089/thy.2015.0020 (2016).
Russ, G. et al. European Thyroid Association guidelines for ultrasound malignancy risk stratification of thyroid nodules in adults: The EU-TIRADS. Eur. Thyroid J. 6, 225–237. https://doi.org/10.1159/000478927 (2017).
Shin, J. H. et al. Ultrasonography diagnosis and imaging-based management of thyroid nodules: Revised Korean Society of Thyroid Radiology consensus statement and recommendations. Korean J. Radiol. 17, 370–395. https://doi.org/10.3348/kjr.2016.17.3.370 (2016).
Tessler, F. N. et al. ACR Thyroid Imaging, Reporting and Data System (TI-RADS): White paper of the ACR TI-RADS Committee. J. Am. Coll. Radiol. 14, 587–595. https://doi.org/10.1016/j.jacr.2017.01.046 (2017).
Iacobone, M., Jansson, S., Barczynski, M. & Goretzki, P. Multifocal papillary thyroid carcinoma—A consensus report of the European Society of Endocrine Surgeons (ESES). Langenbecks Arch. Surg. 399, 141–154. https://doi.org/10.1007/s00423-013-1145-7 (2014).
So, Y. K., Kim, M. W. & Son, Y. I. Multifocality and bilaterality of papillary thyroid microcarcinoma. Clin. Exp. Otorhinolaryngol. 8, 174–178. https://doi.org/10.3342/ceo.2015.8.2.174 (2015).
Remonti, L. R., Kramer, C. K., Leitão, C. B., Pinto, L. C. & Gross, J. L. Thyroid ultrasound features and risk of carcinoma: A systematic review and meta-analysis of observational studies. Thyroid 25, 538–550. https://doi.org/10.1089/thy.2014.0353 (2015).
Kwak, J. Y., Kim, E. K., Kim, M. J. & Son, E. J. Significance of sonographic characterization for managing subcentimeter thyroid nodules. Acta Radiol. 50, 917–923. https://doi.org/10.1080/02841850903062724 (2009).
Mazzaferri, E. L. & Sipos, J. Should all patients with subcentimeter thyroid nodules undergo fine-needle aspiration biopsy and preoperative neck ultrasonography to define the extent of tumor invasion?. Thyroid 18, 597–602. https://doi.org/10.1089/thy.2008.0100 (2008).
Moon, H. J., Son, E., Kim, E. K., Yoon, J. H. & Kwak, J. Y. The diagnostic values of ultrasound and ultrasound-guided fine needle aspiration in subcentimeter-sized thyroid nodules. Ann. Surg. Oncol. 19, 52–59. https://doi.org/10.1245/s10434-011-1813-1 (2012).
Shin, H. C. et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35, 1285–1298. https://doi.org/10.1109/tmi.2016.2528162 (2016).
Tajbakhsh, N. et al. Convolutional neural networks for medical image analysis: Full training or fine tuning?. IEEE Trans. Med. Imaging 35, 1299–1312. https://doi.org/10.1109/tmi.2016.2535302 (2016).
Ma, J., Wu, F., Zhu, J., Xu, D. & Kong, D. A pre-trained convolutional neural network based method for thyroid nodule diagnosis. Ultrasonics 73, 221–230. https://doi.org/10.1016/j.ultras.2016.09.011 (2017).
Ko, S. Y. et al. Deep convolutional neural network for the diagnosis of thyroid nodules on ultrasound. Head Neck 41, 885–891. https://doi.org/10.1002/hed.25415 (2019).
Li, X. et al. Diagnosis of thyroid cancer using deep convolutional neural network models applied to sonographic images: A retrospective, multicohort, diagnostic study. Lancet Oncol. 20, 193–201. https://doi.org/10.1016/s1470-2045(18)30762-9 (2019).
Park, V. Y. et al. Diagnosis of thyroid nodules: Performance of a deep learning convolutional neural network model vs. radiologists. Sci. Rep. 9, 17843. https://doi.org/10.1038/s41598-019-54434-1 (2019).
Koh, J. et al. Diagnosis of thyroid nodules on ultrasonography by a deep convolutional neural network. Sci. Rep. 10, 15245. https://doi.org/10.1038/s41598-020-72270-6 (2020).
Buda, M. et al. Management of thyroid nodules seen on US images: Deep learning may match performance of radiologists. Radiology 292, 695–701. https://doi.org/10.1148/radiol.2019181343 (2019).
Han, M., Ha, E. J. & Park, J. H. Computer-aided diagnostic system for thyroid nodules on ultrasonography: Diagnostic performance based on the thyroid imaging reporting and data system classification and dichotomous outcomes. Am. J. Neuroradiol. https://doi.org/10.3174/ajnr.A6922 (2020).
Kim, H. L., Ha, E. J. & Han, M. Real-world performance of computer-aided diagnosis system for thyroid nodules using ultrasonography. Ultrasound Med. Biol. 45, 2672–2678. https://doi.org/10.1016/j.ultrasmedbio.2019.05.032 (2019).
Wang, L. et al. Automatic thyroid nodule recognition and diagnosis in ultrasound imaging with the YOLOv2 neural network. World J. Surg. Oncol. 17, 12. https://doi.org/10.1186/s12957-019-1558-z (2019).
Kim, E. K. et al. New sonographic criteria for recommending fine-needle aspiration biopsy of nonpalpable solid nodules of the thyroid. AJR Am. J. Roentgenol. 178, 687–691. https://doi.org/10.2214/ajr.178.3.1780687 (2002).
Kwak, J. Y. et al. Thyroid imaging reporting and data system for US features of nodules: A step in establishing better stratification of cancer risk. Radiology 260, 892–899. https://doi.org/10.1148/radiol.11110206 (2011).
Yoon, J. H., Lee, H. S., Kim, E. K., Moon, H. J. & Kwak, J. Y. Malignancy risk stratification of thyroid nodules: Comparison between the thyroid imaging reporting and data system and the 2014 American Thyroid Association management guidelines. Radiology 278, 917–924. https://doi.org/10.1148/radiol.2015150056 (2016).
He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 770–778. https://doi.org/10.1109/CVPR.2016.90 (2016).
He, K., Zhang, X., Ren, S. & Sun, J. Identity Mappings in Deep Residual Networks ArXiv Web site. arXiv:1603.05027 (2016).
Obuchowski, N. A. Nonparametric analysis of clustered ROC curve data. Biometrics 53, 567–578 (1997).
An, C., Choi, G. H., Lee, H. S. & Kim, M. J. Assessment of preoperative magnetic resonance imaging staging in patients with hepatocellular carcinoma undergoing resection compared with the seventh American Joint Committee on Cancer System. Investig. Radiol. 47, 634–641. https://doi.org/10.1097/RLI.0b013e3182630e8d (2012).
Marrero, J. A. et al. Prognosis of hepatocellular carcinoma: Comparison of 7 staging systems in an American cohort. Hepatology 41, 707–716. https://doi.org/10.1002/hep.20636 (2005).
Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723. https://doi.org/10.1109/TAC.1974.1100705 (1974).
LeCun, Y. et al. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1, 541–551. https://doi.org/10.1162/neco.1989.1.4.541 (1989).
Yoon, J. H., Kwon, H. J., Kim, E. K., Moon, H. J. & Kwak, J. Y. The follicular variant of papillary thyroid carcinoma: Characteristics of preoperative ultrasonography and cytology. Ultrasonography 35, 47–54. https://doi.org/10.14366/usg.15037 (2016).
Jeong, E. Y. et al. Computer-aided diagnosis system for thyroid nodules on ultrasonography: Diagnostic performance and reproducibility based on the experience level of operators. Eur. Radiol. 29, 1978–1985. https://doi.org/10.1007/s00330-018-5772-9 (2019).
Zhou, Y. et al. Differentiating non-lactating mastitis and malignant breast tumors by deep-learning based AI automatic classification system: A preliminary study. Front. Oncol. 12, 997306. https://doi.org/10.3389/fonc.2022.997306 (2022).

Acknowledgements

This study was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (2021R1A2C2007492).

Author information

Authors and Affiliations

Department of Radiology, Severance Hospital, Research Institute of Radiological Science, Yonsei University College of Medicine, Seoul, Korea
Miribi Rho, Sei Hyun Chun, Jung Hyun Yoon, Vivian Youngjean Park, Kyunghwa Han & Jin Young Kwak
School of Mathematics and Computing, Yonsei University, Seoul, Korea
Eunjung Lee
Biostatistics Collaboration Unit, Yonsei University College of Medicine, Seoul, Korea
Hye Sun Lee

Contributions

M.R., S.H.C. and J.Y.K. contributed to the study design and writing the first draft of manuscript. E.L. developed the AI-CAD program and collected data from the program. Statistical analysis of data was performed by H.S.L. and K.H. All authors reviewed, contributed to, and approved the manuscript.

Corresponding author

Correspondence to Jin Young Kwak.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Rho, M., Chun, S.H., Lee, E. et al. Diagnosis of thyroid micronodules on ultrasound using a deep convolutional neural network. Sci Rep 13, 7231 (2023). https://doi.org/10.1038/s41598-023-34459-3

Received: 08 September 2022
Accepted: 30 April 2023
Published: 04 May 2023
