
Deep learning-based detection system for multiclass lesions on chest radiographs: comparison with observer readings



Objectives

To investigate the feasibility of a deep learning–based detection (DLD) system for multiclass lesions on chest radiographs, in comparison with human observers.


Methods

A total of 15,809 chest radiographs were collected from two tertiary hospitals (7204 normal and 8605 abnormal with nodule/mass, interstitial opacity, pleural effusion, or pneumothorax). Except for the test set of 100 normal and 100 abnormal radiographs (nodule/mass, 70; interstitial opacity, 10; pleural effusion, 10; pneumothorax, 10), the radiographs were used to develop a DLD system for detecting multiclass lesions. The diagnostic performance of the developed model and that of nine observers with varying levels of experience was evaluated and compared using the area under the receiver operating characteristic curve (AUROC) on a per-image basis and the jackknife alternative free-response receiver operating characteristic figure of merit (FOM) on a per-lesion basis. The false-positive fraction was also calculated.
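The per-image AUROC used above can be sketched via its Mann–Whitney interpretation: the probability that a randomly chosen abnormal radiograph receives a higher model score than a randomly chosen normal one. A minimal sketch in Python; the labels and scores are illustrative, not the study's data:

```python
def auroc(labels, scores):
    """Image-wise AUROC as P(score_abnormal > score_normal); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]  # abnormal images
    neg = [s for y, s in zip(labels, scores) if y == 0]  # normal images
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three normal and three abnormal images
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
print(auroc(labels, scores))  # 1.0: every abnormal score exceeds every normal score
```

The per-lesion FOM is computed analogously but over lesion localizations rather than image-level scores, which requires mark-rating data not reproduced here.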


Results

Compared with the group-averaged observer performance, the DLD system demonstrated significantly higher performance in both image-wise normal/abnormal classification and lesion-wise detection with pattern classification (AUROC, 0.985 vs. 0.958; p = 0.001; FOM, 0.962 vs. 0.886; p < 0.001). In lesion-wise detection, the DLD system outperformed all nine observers. In the subgroup analysis, the DLD system exhibited consistently better performance for both nodule/mass (FOM, 0.913 vs. 0.847; p < 0.001) and the other three abnormal classes (FOM, 0.995 vs. 0.843; p < 0.001). The false-positive fraction for all abnormalities was 0.11 for the DLD system and 0.19 for the observers.
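The false-positive fraction reported above can be read as the average number of false-positive lesion marks per image in the test set (an assumed definition; the counts below are illustrative, not the study's data):

```python
def false_positive_fraction(n_fp_marks, n_images):
    """Average number of false-positive lesion marks per image read."""
    return n_fp_marks / n_images

# e.g. 22 false-positive marks across a 200-image test set:
print(false_positive_fraction(22, 200))  # 0.11
```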


Conclusions

The DLD system showed potential for lesion detection and pattern classification on chest radiographs, performing image-wise normal/abnormal classification with high diagnostic performance.

Key Points

The DLD system was feasible for detection and pattern classification of multiclass lesions on chest radiographs.

The DLD system achieved high performance in image-wise classification of chest radiographs as normal or abnormal (AUROC, 0.985), with especially high specificity (99.0%).

In lesion-wise detection of multiclass lesions, the DLD system outperformed all nine observers (FOM, 0.962 vs. 0.886; p < 0.001).





Abbreviations

AUC: Area under the curve

AUROC: Area under the receiver operating characteristic curve

CAD: Computer-aided detection

DLD: Deep learning–based detection

FOM: Figure of merit

FP: False positive

JAFROC: Jackknife alternative free-response receiver operating characteristic

ROC: Receiver operating characteristic

TP: True positive




Funding

This study has received funding from the Industrial Strategic Technology Development Program (10072064, Development of Novel Artificial Intelligence Technologies To Assist Imaging Diagnosis of Pulmonary, Hepatic, and Cardiac Diseases and Their Integration into Commercial Clinical PACS Platforms), funded by the Ministry of Trade, Industry and Energy (MI, South Korea).

Author information



Corresponding authors

Correspondence to Sang Min Lee or Kyung Hee Lee.

Ethics declarations


The scientific guarantor of this publication is Sang Min Lee.

Conflict of interest

The authors declare that they have no conflict of interest.

Statistics and biometry

The statistician of our institution (Seon Ok Kim) kindly provided statistical advice for this manuscript.

Informed consent

Written informed consent was waived by the institutional review board.

Ethical approval

Institutional review board approval was obtained.


Methodology

• retrospective

• diagnostic or prognostic study

• multicenter study

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material


(DOCX 18 kb)



Cite this article

Park, S., Lee, S.M., Lee, K.H. et al. Deep learning-based detection system for multiclass lesions on chest radiographs: comparison with observer readings. Eur Radiol 30, 1359–1368 (2020).



Keywords

  • Deep learning
  • Thoracic radiography
  • Automated pattern recognition
  • Classification