Journal of Signal Processing Systems, Volume 87, Issue 2, pp 179–196

Lung-Nodule Classification Based on Computed Tomography Using Taxonomic Diversity Indexes and an SVM

  • Antonio Oseas de Carvalho Filho
  • Aristófanes Corrêa Silva
  • Anselmo Cardoso de Paiva
  • Rodolfo Acatauassú Nunes
  • Marcelo Gattass


This work develops a methodology for classifying lung nodules using the LIDC-IDRI image database. The methodology is based on image-processing and pattern-recognition techniques. To describe the texture of nodule and non-nodule candidates, we use the Taxonomic Diversity and Taxonomic Distinctness Indexes from ecology. These indexes are calculated from phylogenetic trees, which in this work are applied to characterize the candidates. Finally, we apply a Support Vector Machine (SVM) as a classifier. In the testing stage, we used 833 exams from the LIDC-IDRI image database. To apply the methodology, we divided the complete database into two groups, one for training and one for testing, using training/testing partitions of 20/80%, 40/60%, 60/40%, and 80/20%. Each division was repeated five times at random. The methodology shows promising results for classifying nodules and non-nodules, with a mean accuracy of 98.11%. Lung cancer has the highest mortality rate and one of the lowest survival rates after diagnosis. Therefore, the earlier the diagnosis, the higher the chances of a cure for the patient. In addition, the more information available to the specialist, the more precise the diagnosis will be. The methodology proposed here contributes to both goals.
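As a rough illustration of the two texture descriptors named above, the sketch below computes the taxonomic diversity and taxonomic distinctness indexes in their standard ecological form: the abundance-weighted sum of pairwise taxonomic distances, divided by the number of pairs of individuals (diversity) or by the number of pairs of individuals from different species (distinctness). It assumes, for illustration only, that each gray level of a candidate region is a "species", that its voxel count is the abundance, and that the distance between two gray levels is their absolute difference. The function name `taxonomic_indexes` and the distance choice are hypothetical; the distances the authors derive from phylogenetic trees are not specified in this abstract.

```python
from collections import Counter
from itertools import combinations


def taxonomic_indexes(values, distance=lambda a, b: abs(a - b)):
    """Return (taxonomic diversity, taxonomic distinctness) for gray-level values.

    Assumption: `distance` stands in for the tree-based taxonomic distance;
    the default absolute difference is a placeholder, not the paper's method.
    """
    counts = Counter(values)      # x_i: abundance of each "species" (gray level)
    n = sum(counts.values())      # N: total number of "individuals" (voxels)
    weighted = 0.0                # sum over i < j of w_ij * x_i * x_j
    cross = 0.0                   # sum over i < j of x_i * x_j
    for a, b in combinations(sorted(counts), 2):
        pair = counts[a] * counts[b]
        weighted += distance(a, b) * pair
        cross += pair
    # Diversity averages over all pairs of individuals, N(N-1)/2.
    diversity = weighted / (n * (n - 1) / 2) if n > 1 else 0.0
    # Distinctness averages only over pairs drawn from different species.
    distinctness = weighted / cross if cross > 0 else 0.0
    return diversity, distinctness
```

For example, `taxonomic_indexes([0, 0, 1, 3])` gives a weighted sum of 10 over 6 pairs of individuals and 5 cross-species pairs. In a pipeline like the one described, such values would be computed per candidate and passed as features to the SVM.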


Lung cancer; Phylogenetic trees; Taxonomic diversity index; Taxonomic distinctness; Medical image



The authors acknowledge the Coordination for the Improvement of Higher Education Personnel (CAPES), the National Council for Scientific and Technological Development (CNPq), and the Foundation for the Protection of Research and Scientific and Technological Development of the State of Maranhão (FAPEMA) for financial support.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Antonio Oseas de Carvalho Filho (1)
  • Aristófanes Corrêa Silva (1)
  • Anselmo Cardoso de Paiva (1)
  • Rodolfo Acatauassú Nunes (2)
  • Marcelo Gattass (3)

  1. Federal University of Maranhão - UFMA, Applied Computing Group - NCA, São Luís, Brazil
  2. State University of Rio de Janeiro, Rio de Janeiro, Brazil
  3. Department of Computer Science, Pontifical Catholic University of Rio de Janeiro - PUC-Rio, Rio de Janeiro, Brazil
