Abstract
In this paper, we compare five common classifier families in their ability to categorize six lung tissue patterns in high-resolution computed tomography (HRCT) images of patients with interstitial lung diseases (ILD) and of healthy tissue. The evaluated classifiers are naive Bayes, k-nearest neighbor, J48 decision trees, multilayer perceptron, and support vector machines (SVM). The dataset contains 843 regions of interest (ROI) covering healthy tissue and five pathologic lung tissue patterns, identified by two radiologists at the University Hospitals of Geneva. The correlation among the 39 texture attributes that make up the feature space is studied. A grid search for optimal parameters is carried out for each classifier family. Two complementary metrics, based on McNemar’s statistical test and on global accuracy, are used to characterize classification performance. SVM reached the best values for each metric and achieved a mean correct prediction rate of 88.3% with high class-specific precision on testing sets of 423 ROIs.
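As an illustration of the pairwise comparison mentioned above, the following minimal sketch applies McNemar's test to the predictions of two classifiers on the same test set. It assumes the continuity-corrected chi-square form of the test with one degree of freedom; the abstract does not state which variant was used. The function name `mcnemar_test` and the toy labels are hypothetical and are not taken from the study.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Compare two classifiers evaluated on the same test samples.

    Returns (statistic, p_value) for the continuity-corrected McNemar test.
    Assumption: chi-square approximation with 1 degree of freedom.
    """
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("all three sequences must have the same length")

    only_a_correct = 0  # samples classifier A gets right and B gets wrong
    only_b_correct = 0  # samples classifier B gets right and A gets wrong
    for truth, a, b in zip(y_true, pred_a, pred_b):
        a_ok = a == truth
        b_ok = b == truth
        if a_ok and not b_ok:
            only_a_correct += 1
        elif b_ok and not a_ok:
            only_b_correct += 1

    discordant = only_a_correct + only_b_correct
    if discordant == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return 0.0, 1.0

    statistic = (abs(only_a_correct - only_b_correct) - 1) ** 2 / discordant
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Made-up labels for illustration only (not data from the paper).
    truth = [0] * 20
    classifier_a = [0] * 20             # always correct
    classifier_b = [1] * 10 + [0] * 10  # wrong on the first ten samples
    stat, p = mcnemar_test(truth, classifier_a, classifier_b)
    print(f"statistic={stat:.3f}, p-value={p:.4f}")
```

Only the samples on which exactly one of the two classifiers is correct enter the statistic, which is why the test complements global accuracy: two classifiers with similar accuracy can still differ significantly in which ROIs they misclassify.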
Acknowledgments
We thank Dr. Mélanie Hilario for her valuable comments on the methodology for benchmarking the classifiers. This work was supported by the Swiss National Science Foundation (FNS) with grant 200020-118638/1, the equalization fund of University and Hospitals of Geneva (grant 05-9-II), and the EU 6th Framework Program in the context of the KnowARC project (IST 032691).
Cite this article
Depeursinge, A., Iavindrasana, J., Hidki, A. et al. Comparative Performance Analysis of State-of-the-Art Classification Algorithms Applied to Lung Tissue Categorization. J Digit Imaging 23, 18–30 (2010). https://doi.org/10.1007/s10278-008-9158-4
DOI: https://doi.org/10.1007/s10278-008-9158-4