Learning Lung Nodule Malignancy Likelihood from Radiologist Annotations or Diagnosis Data

  • Luís Gonçalves
  • Jorge Novo
  • António Cunha
  • Aurélio Campilho
Original Article


Lung cancer is the world’s most lethal type of cancer, and early diagnosis is crucial for successful treatment. Computer-aided diagnosis can play an important role in lung nodule detection and in establishing the nodule malignancy likelihood. This paper contributes to the design of a learning approach based on computed tomography images. Our methodology measures a set of features in the nodular image region and trains classifiers, such as K-nearest neighbor or support vector machine (SVM), to compute the malignancy likelihood of lung nodules. For this purpose, we use the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) database because of its size and nodule variability, and because it is publicly available. For training, we used both the radiologists’ labels and annotations and the diagnosis data, such as biopsy, surgery and follow-up results. We obtained promising results: an area under the receiver operating characteristic curve (AUC) of 0.962 ± 0.005 for the Radiologists’ data and 0.905 ± 0.04 for the Diagnosis data, using an SVM with an exponential kernel combined with a correlation-based feature selection method.
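As a rough illustration of the pipeline summarised above (features measured per nodule, a classifier producing a malignancy likelihood, and evaluation by AUC), the following minimal sketch uses a K-nearest-neighbor scorer and a rank-based AUC. It is not the authors' implementation: the two-feature vectors, the labels, the choice of k = 3, and the use of the fraction of malignant neighbors as the likelihood are all assumptions made for illustration, and the function names are hypothetical.

```python
import math


def knn_likelihood(train_x, train_y, query, k=3):
    """Return the fraction of the k nearest training nodules labelled malignant (1)."""
    # Pair each Euclidean distance with its label, then sort by distance.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / k


def auc(scores, labels):
    """AUC as the probability that a malignant case outscores a benign one.

    Ties between a malignant and a benign score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical feature vectors (for example a size and a texture measure),
# invented for this sketch; 0 = benign, 1 = malignant.
train_x = [(0.2, 0.1), (0.3, 0.2), (0.25, 0.3), (0.8, 0.9), (0.7, 0.8), (0.9, 0.7)]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [(0.15, 0.2), (0.85, 0.8), (0.5, 0.5), (0.3, 0.25)]
test_y = [0, 1, 1, 0]

scores = [knn_likelihood(train_x, train_y, q, k=3) for q in test_x]
print("likelihoods:", scores)
print("AUC:", auc(scores, test_y))
```

In practice the features would be scaled and selected beforehand, and the AUC would be averaged over repeated train/test splits, which is presumably how a mean ± deviation figure such as the ones reported here arises.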


Lung nodules; Computer-aided diagnosis; Thoracic computed tomography (CT) imaging; Feature selection; Malignancy likelihood



This work is financed by the ERDF—European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation—COMPETE 2020 Programme, and by National Funds through the Portuguese funding agency, FCT—Fundação para a Ciência e a Tecnologia, within the project with code POCI-01-0145-FEDER-016673 and the Grant Contract SFRH/BPD/85663/2012 (J. Novo).



Copyright information

© Taiwanese Society of Biomedical Engineering 2017

Authors and Affiliations

  1. INESC TEC - INESC Technology and Science, Porto, Portugal
  2. Department of Computing, University of A Coruña, A Coruña, Spain
  3. Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal
  4. Faculty of Engineering, University of Porto, Porto, Portugal
