Computer-aided diagnosis system for thyroid nodules on ultrasonography: diagnostic performance and reproducibility based on the experience level of operators
Objectives
To evaluate the diagnostic performance and reproducibility of a computer-aided diagnosis (CAD) system for diagnosing thyroid cancer on ultrasonography (US) according to the operator’s level of experience.
Materials and methods
Between July 2016 and October 2016, 76 consecutive patients with 100 thyroid nodules (≥ 1.0 cm) were prospectively included. An experienced radiologist performed the US examinations with a real-time CAD system integrated into the US machine, and three operators with different levels of US experience (0–5 years) independently applied the CAD system. We compared the diagnostic performance of the CAD system according to the operators’ experience and calculated the interobserver agreement for cancer diagnosis and for each US descriptor.
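As an illustration of how interobserver agreement between two operators can be quantified, the following is a minimal sketch using Cohen's kappa with the Landis and Koch interpretation bands. It rests on assumptions: the text above does not name the agreement statistic used, a multi-rater statistic may have been applied across the operators, and the ratings below are hypothetical values invented for the example, not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch descriptive band."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical final diagnoses for 10 nodules (1 = malignant, 0 = benign).
operator_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
operator_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

kappa = cohen_kappa(operator_1, operator_2)
print(f"kappa = {kappa:.3f} ({landis_koch(kappa)})")
```

In this hypothetical example the operators agree on 8 of 10 nodules, but because agreement expected by chance is 0.52, kappa is lower than the raw agreement suggests.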
Results
The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy of the CAD system were 88.6%, 83.9%, 81.3%, 90.4%, and 86.0%, respectively. The sensitivity and accuracy of the CAD system were not significantly different from those of the radiologist (p > 0.05), while the specificity was higher for the experienced radiologist (p = 0.016). For the less-experienced operators, the sensitivity was 68.8–73.8%, specificity 74.1–88.5%, PPV 68.9–73.3%, NPV 72.7–80.0%, and accuracy 71.0–75.0%. The less-experienced operators showed lower sensitivity and accuracy than the experienced radiologist. The interobserver agreement was substantial for the final diagnosis and for most US descriptors, but moderate for margin and composition.
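The relationship between the five reported metrics can be made concrete with a short sketch. The 2×2 counts used here are an assumption: they are not stated in the text but were back-calculated from the reported percentages for 100 nodules (44 malignant, 56 benign), and they reproduce the reported CAD figures to within rounding.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard diagnostic performance metrics as percentages."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": 100 * tp / (tp + fn),  # malignant nodules correctly flagged
        "specificity": 100 * tn / (tn + fp),  # benign nodules correctly cleared
        "ppv": 100 * tp / (tp + fp),          # positive calls that are malignant
        "npv": 100 * tn / (tn + fn),          # negative calls that are benign
        "accuracy": 100 * (tp + tn) / total,  # all correct calls
    }


# Assumed counts, inferred from the reported percentages (not given in the text):
# 39 true positives, 5 false negatives, 47 true negatives, 9 false positives.
cad = diagnostic_metrics(tp=39, fn=5, tn=47, fp=9)

for name, value in cad.items():
    print(f"{name}: {value:.2f}%")
```

Under these assumed counts the PPV is 39/48 = 81.25%, which corresponds to the reported 81.3% after rounding up. With the same 44/56 split, the radiologist's reported 84.1% sensitivity and 96.4% specificity would correspond to 37/44 and 54/56.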
Conclusions
The CAD system may have a role in thyroid cancer diagnosis. However, its performance remains operator dependent, and this needs improvement.
Key Points
• The sensitivity and accuracy of the CAD system did not differ significantly from those of the experienced radiologist (88.6% vs. 84.1%, p = 0.687; 86.0% vs. 91.0%, p = 0.267), while the specificity was significantly higher for the experienced radiologist (83.9% vs. 96.4%, p = 0.016).
• However, the diagnostic performance varied according to the operator’s experience (sensitivity 70.5–88.6%, accuracy 72.0–86.0%) and was lower for the less-experienced operators than for the experienced radiologist.
• The interobserver agreement was substantial for the final diagnosis and for most US descriptors, but moderate for margin and composition.
Keywords
Artificial intelligence; Fine-needle aspiration; Thyroid nodule; Thyroid cancer; Ultrasonography
Abbreviations
AUC: Area under the receiver operating characteristic curve
NPV: Negative predictive value
PPV: Positive predictive value
PTC: Papillary thyroid carcinoma
ROC: Receiver operating characteristic
The authors state that this work was supported by the National Research Foundation of Korea (grant no. 2017R1C1B5016217).
Compliance with ethical standards
The scientific guarantor of this publication is Eun Ju Ha.
Conflict of interest
The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.
Statistics and biometry
No complex statistical methods were necessary for this paper.
Written informed consent was obtained from all patients before they underwent US.
This study was approved by our institutional review board.
• Prospective case-control study