Development of a Computer-Aided Differential Diagnosis System to Distinguish Between Usual Interstitial Pneumonia and Non-specific Interstitial Pneumonia Using Texture- and Shape-Based Hierarchical Classifiers on HRCT Images
A computer-aided differential diagnosis (CADD) system that distinguishes between usual interstitial pneumonia (UIP) and non-specific interstitial pneumonia (NSIP) using high-resolution computed tomography (HRCT) images was developed, and its results compared against the decision of a radiologist. Six local interstitial lung disease patterns in the images were determined, and 900 typical regions of interest were marked by an experienced radiologist. A support vector machine classifier was used to train and label the regions of interest of the lung parenchyma based on the texture and shape characteristics. Based on the regional classifications of the entire lung using HRCT, the distributions and extents of the six regional patterns were characterized through their CADD features. The disease division index of every area fraction combination and the asymmetric index between the left and right lungs were also evaluated. A second SVM classifier was employed to classify the UIP and NSIP, and features were selected through sequential-forward floating feature selection. For the evaluation, 54 HRCT images of UIP (n = 26) and NSIP (n = 28) patients clinically diagnosed by a pulmonologist were included and evaluated. The classification accuracy was measured based on a fivefold cross-validation with 20 repetitions using random shuffling. For comparison, thoracic radiologists assessed each case using HRCT images without clinical information or diagnosis. The accuracies of the radiologists’ decisions were 75 and 87%. The accuracies of the CADD system using different features ranged from 70 to 81%. Finally, the accuracy of the proposed CADD system after sequential-forward feature selection was 91%.
Keywords: Computer-aided differential diagnosis; Usual interstitial pneumonia; Non-specific interstitial pneumonia; Regional lung disease patterns; SVM classifier
Diffuse interstitial lung disease (DILD) is a type of chronic disorder that infiltrates the lung parenchyma (functional tissue) and leads to respiratory problems if the cause is not removed or if therapy fails. Idiopathic interstitial pneumonia (IIP) is a type of DILD that consists of seven clinical-radiologic-pathologic entities, including usual interstitial pneumonia (UIP) and non-specific interstitial pneumonia (NSIP). Specifically, UIP and NSIP account for two-thirds of IIP cases and differ in prognosis as measured by the five-year survival rate. Differentiating between UIP and NSIP is clinically important because their therapies and prognoses differ.
Because of the rapid development of computed tomography (CT), high-resolution computed tomography (HRCT) has become an important tool for characterizing various types of lung parenchyma disorders, particularly DILD [3, 4]. The texture and shape characteristics of the local lung parenchyma of DILD patients have potential importance for understanding the various lung diseases that correlate with disease pathology [5, 6, 7]. Several lung disease quantification methods employing textural and shape features have been verified for accurate regional disease differentiation and a reproducible assessment [8, 9, 10, 11].
In this paper, we present a computer-aided differential diagnosis system for distinguishing between usual interstitial pneumonia (UIP) and non-specific interstitial pneumonia (NSIP) by employing HRCT lung quantification methods. The proposed system consists of two classification steps. First, the DILD regional disease–pattern classifier quantifies the lung parenchyma into one normal and five regional pulmonary disease patterns (ground-glass opacity, consolidation, reticular opacity, emphysema, and honeycombing) using textural and shape features extracted from HRCT images. Subsequently, the computer-aided differential diagnosis (CADD) classifier differentiates the HRCT images into UIP and NSIP, based on their quantified lung characteristics.
Materials and Methods
The Asan Medical Center’s institutional review board for human investigations approved the study protocol, removed all patient identifiers, and waived the informed-consent requirements owing to the retrospective nature of this study.
For the lung quantification, HRCT images were selected retrospectively from images obtained from 14 healthy subjects, 16 patients with emphysema, 35 patients with cryptogenic-organizing pneumonia, 36 patients with usual interstitial pneumonia, 4 patients with pneumonia, and 1 patient with acute interstitial pneumonia. (See “Lung quantification”).
For modeling the CADD classifier, images from 26 different patients with UIP and 28 patients with NSIP, diagnosed both clinically and pathologically, were selected as the dataset. This diagnosis, based on a combination of clinico-radiologico-pathological discussions and consensus, has been regarded as the gold standard according to the latest official statement by ATS/ERS/JRS/ALAT, i.e., Idiopathic Pulmonary Fibrosis: Evidence-based Guidelines for Diagnosis and Management.
The entire chest was covered within the scanned field of view. Images were acquired at 10-mm intervals, giving 30 to 40 slices per patient, on a 16-multidetector CT scanner (Sensation 16, Siemens, Erlangen, Germany) and reconstructed with a 1-mm slice thickness and an edge-enhancing reconstruction kernel (B70f).
In this study, we employed our previous work on regional DILD disease-pattern classification for lung quantification. We used the same dataset and classifiers with similar parameters.
A thoracic radiologist with 10 years of experience marked 900 typical regions of interest (ROI), including normal (NL, n = 150), ground-glass opacity (GGO, n = 150), reticular opacity (RO, n = 150), honeycombing (HC, n = 150), emphysema (EMPH, n = 150), and consolidation (CONS, n = 150), using a circular mask with a 20-pixel diameter. To prevent a clustering effect, only one ROI was selected in each image. To characterize the six types of regional DILD disease patterns, we extracted 28 textural and shape features from the ROI of an HRCT image [9, 14], e.g., histogram, gradient, run-length matrix, co-occurrence matrix, cluster analysis, and top-hat transform. An SVM was employed to quantify the lung parenchyma into six classes. We applied sequential-forward feature selection, and the SVM was trained using the 900-ROI dataset with a radial basis function (RBF) kernel and optimized parameters.
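As an illustration of the kind of textural feature listed above, the following minimal sketch computes three co-occurrence matrix statistics inside a circular ROI with a 20-pixel diameter. It is not the feature extractor used in the study: the function names, the single horizontal pixel offset, the symmetric counting, and the assumption that the image has already been quantized to integer gray levels are all choices made here for illustration only.

```python
from collections import Counter


def circular_mask(diameter):
    """Return the set of (row, col) positions inside a disc of the given pixel diameter."""
    radius = diameter / 2.0
    centre = (diameter - 1) / 2.0  # geometric centre of the diameter x diameter grid
    return {
        (i, j)
        for i in range(diameter)
        for j in range(diameter)
        if (i - centre) ** 2 + (j - centre) ** 2 <= radius ** 2
    }


def glcm_features(image, mask, offset=(0, 1)):
    """Contrast, energy, and homogeneity of a symmetric, normalized co-occurrence matrix.

    image  -- 2-D list of integer gray levels (assumed already quantized)
    mask   -- set of (row, col) positions that belong to the ROI
    offset -- displacement between the two pixels of a pair (assumption: one offset only)
    """
    di, dj = offset
    counts = Counter()
    for (i, j) in mask:
        neighbour = (i + di, j + dj)
        if neighbour in mask:  # count only pairs that lie fully inside the ROI
            a = image[i][j]
            b = image[neighbour[0]][neighbour[1]]
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # symmetric counting
    total = sum(counts.values())
    if total == 0:
        raise ValueError("mask contains no pixel pairs for this offset")
    p = {pair: n / total for pair, n in counts.items()}
    return {
        "contrast": sum((a - b) ** 2 * v for (a, b), v in p.items()),
        "energy": sum(v * v for v in p.values()),
        "homogeneity": sum(v / (1 + abs(a - b)) for (a, b), v in p.items()),
    }


roi = circular_mask(20)
flat = [[3] * 20 for _ in range(20)]                            # uniform patch
checker = [[(i + j) % 2 for j in range(20)] for i in range(20)]  # alternating patch
flat_features = glcm_features(flat, roi)
checker_features = glcm_features(checker, roi)
```

In such a sketch, a uniform patch yields zero contrast, whereas a finely alternating patch yields high contrast and low homogeneity. A feature vector of this kind, one per ROI, is what a classifier such as the SVM would receive as input.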
Computer-Aided Differential Diagnosis
After the lung quantification, the entire lung area is represented as an area composed of six regional disease patterns. The distribution of the six classes provides important evidence for differentiating UIP and NSIP [15, 16]. We defined CADD features for quantifying the distribution characteristics after consulting with experienced radiologists: area fraction (AF), directional probability density function (dPDF), regional cluster distribution pattern (RCDP), disease division index (DDI), and asymmetric index (AI).
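To make the simplest of these features concrete, the sketch below computes the area fraction of each regional pattern from a list of per-pixel class labels and a left-right asymmetry value. The text above does not give the formula for the asymmetric index, so the normalized absolute difference used here is an assumption made for illustration, as are the function names and the label strings.

```python
from collections import Counter

# Class abbreviations follow the six regional patterns named in the text.
PATTERNS = ("NL", "GGO", "RO", "HC", "EMPH", "CONS")


def area_fractions(labels):
    """Fraction of classified lung pixels assigned to each regional pattern.

    labels -- iterable of pattern names, one per classified lung pixel
    """
    counts = Counter(labels)
    total = sum(counts[p] for p in PATTERNS)
    if total == 0:
        raise ValueError("no classified lung pixels")
    return {p: counts[p] / total for p in PATTERNS}


def asymmetry_index(left_fraction, right_fraction):
    """ASSUMED definition: |L - R| / (L + R), taken as 0 when both fractions are 0."""
    s = left_fraction + right_fraction
    return 0.0 if s == 0 else abs(left_fraction - right_fraction) / s


# Toy label maps for the two lungs (hypothetical data, not from the study).
left_lung = ["NL"] * 6 + ["HC"] * 4
right_lung = ["NL"] * 9 + ["HC"] * 1
af_left = area_fractions(left_lung)
af_right = area_fractions(right_lung)
ai_hc = asymmetry_index(af_left["HC"], af_right["HC"])
```

Under the assumed definition, the index is 0 for a perfectly symmetric pattern and 1 when a pattern appears in one lung only.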
An SVM classifier was again used to classify the UIP and NSIP using the CADD features. Before training the SVM, meaningful features were selected from a number of CADD features to maximize the classification accuracy and avoid the curse of dimensionality. Sequential-forward floating feature selection (SFFS) was employed for selecting the CADD features. A grid search algorithm was applied to optimize the parameters, including the SVM cost and gamma, using the training data. Various cost and gamma pairs were tried, and the pair with the best classification performance was selected. The details of training and testing of the classifiers are described in the following section.
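The floating selection procedure can be sketched as follows. This is a generic outline of sequential-forward floating selection rather than the implementation used in the study: the `sffs` function, its arguments, and the toy criterion are hypothetical. In practice, `score` would return something like the cross-validated accuracy of the SVM trained on the given feature subset.

```python
def sffs(n_features, score, k):
    """Sequential-forward floating selection of up to k features.

    n_features -- number of candidate features, indexed 0 .. n_features - 1
    score      -- callable taking a sorted tuple of feature indices, higher is better
    Returns (best_subset, best_score) over all subset sizes visited.
    """
    if k < 1 or n_features < 1:
        raise ValueError("need at least one feature to select")
    selected = []
    best = {}  # subset size -> (score, subset): best subset seen at that size
    while len(selected) < min(k, n_features):
        # Forward step: add the single feature that maximizes the criterion.
        remaining = [f for f in range(n_features) if f not in selected]
        added = max(remaining, key=lambda f: score(tuple(sorted(selected + [f]))))
        selected = selected + [added]
        current = score(tuple(sorted(selected)))
        size = len(selected)
        if size not in best or current > best[size][0]:
            best[size] = (current, tuple(sorted(selected)))

        # Floating step: drop features while the reduced subset beats the best
        # subset previously recorded at that smaller size.
        while len(selected) > 2:
            def without(f):
                return tuple(sorted(x for x in selected if x != f))

            dropped = max(selected, key=lambda f: score(without(f)))
            reduced = without(dropped)
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][0]:
                selected = list(reduced)
                best[len(reduced)] = (reduced_score, reduced)
            else:
                break

    top_score, top_subset = max(best.values())
    return top_subset, top_score


# Toy criterion (hypothetical): features 1 and 3 are informative,
# and every selected feature costs one point.
GAIN = {1: 30, 3: 25}


def toy_score(subset):
    return 50 + sum(GAIN.get(f, 0) for f in subset) - len(subset)


chosen, chosen_score = sffs(n_features=5, score=toy_score, k=4)
```

The backward step is what distinguishes the floating variant from plain forward selection: a feature added early can be discarded later if a smaller subset without it turns out to score higher.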
To evaluate the proposed differential diagnosis system, an HRCT dataset of 54 patients, who were clinically and pathologically diagnosed with UIP (n = 26) and NSIP (n = 28), was used. In this study, we carried out two experiments, including a radiologic decision and a classification-based decision.
Two thoracic radiologists were recruited and asked to review the HRCT images and diagnose each case as either UIP or NSIP, based on a visual assessment without clinical information or diagnosis. During the review, the radiologists assessed each HRCT image and scored 21 entries: five disease-pattern quantifications (five entries, each on a 20-level scale from 0 to 100%), three-directional distributions of five disease patterns (15 entries, each on a 20-level scale from 0 to 100%), and a radiologic decision (one entry, five-level scale).
The entries were used as feature sets for the SVM classifier. Effective features were selected through sequential-forward selection, and the classifier parameter was optimized using the grid search algorithm. The trained classifier was evaluated using a five-fold cross-validation with 20 repetitions. The average accuracies of the two radiologists' decisions were 0.75 and 0.87, respectively.
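The evaluation protocol of five-fold cross-validation with 20 repetitions and random shuffling can be outlined as below. This is a minimal sketch under stated assumptions: the function names, the fixed random seed, and the `fit_predict` callback are hypothetical, and the split is not stratified by diagnosis, which the actual evaluation may or may not have been.

```python
import random


def repeated_kfold(n_samples, n_folds=5, n_repeats=20, seed=0):
    """Yield (repetition, train_indices, test_indices) with a fresh shuffle per repetition."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices into n_folds nearly equal folds.
        folds = [order[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield rep, train, test


def repeated_cv_accuracy(labels, fit_predict, n_folds=5, n_repeats=20, seed=0):
    """Mean accuracy over all splits.

    fit_predict -- hypothetical callback: (train_indices, test_indices) -> predicted
                   labels for test_indices, e.g. by training a classifier on the
                   training cases and applying it to the held-out cases
    """
    scores = []
    for _, train, test in repeated_kfold(len(labels), n_folds, n_repeats, seed):
        predicted = fit_predict(train, test)
        hits = sum(p == labels[i] for p, i in zip(predicted, test))
        scores.append(hits / len(test))
    return sum(scores) / len(scores)


# 54 cases as in the study: 26 UIP and 28 NSIP.
labels = ["UIP"] * 26 + ["NSIP"] * 28
splits = list(repeated_kfold(len(labels)))
# A perfect "oracle" predictor, used only to check the bookkeeping.
oracle_accuracy = repeated_cv_accuracy(labels, lambda tr, te: [labels[i] for i in te])
```

With 54 cases, each repetition produces four test folds of 11 cases and one of 10, and every case is tested exactly once per repetition. Feature selection and parameter tuning would need to be carried out inside `fit_predict`, on the training indices only, to keep the accuracy estimate unbiased.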
Qualitative Analysis of Lung Quantification and CADD Features
To verify whether the lung quantification and CADD feature extraction can represent lung-regional disease patterns and characteristics for the differentiation, we compared the quantification results and extracted features with radiological knowledge. For each UIP and NSIP case, the lung quantification results were captured and their CADD features were extracted.
CADD features of a UIP case
CADD features of an NSIP case
The present study aims to differentiate between usual interstitial pneumonia and non-specific interstitial pneumonia using HRCT images, without any clinical or pathological information. As shown in the “Results” section, the performance of the computer-aided differential diagnosis system is comparable to the visual assessment of experienced radiologists. To the best of our knowledge, this is the first development and validation trial of a CADD system for UIP and NSIP, including the semi-automatic quantification of regional disease patterns of DILD from HRCT images.
A total of 16 of the most accurate CADD features were selected. We found that the features were well fitted to the radiological knowledge for differentiating between UIP and NSIP. The decision procedures of the lung quantification and classifier, using different combinations of the proposed CADD features, were similar to the diagnosis-decision procedures of the radiologists, and showed a similar differentiation performance.
Our computer-aided differential diagnosis system for UIP and NSIP included two steps: quantifying the lung and classifying between UIP and NSIP. For the lung quantification, the trained SVM classifier classified the lung parenchyma into one normal and five regional disease patterns. If the performance of the SVM classifier can be improved using a well-controlled dataset, the trained classifier will consistently produce quality results. Moreover, in our previous study, we found that intra-reader variability exists in the visual assessment of HRCT images, which might have depended on the experience of the radiologist. In this situation, a semi-automatic assessment method can be useful for supporting the decisions of the clinicians or as an initial screening when experts are unavailable.
There are several limitations to the present study. First, the study is restricted to two diagnostic categories, UIP and NSIP, among the various types of lung disease, because the different kinds of DILD are not easy to differentiate clearly and our aim was to demonstrate the validity of the proposed method. Extending the study to differentiate among UIP, possible UIP, and images inconsistent with UIP, or among various other types of DILD, is a topic for further research. Second, this is a retrospective study that uses a dataset collected from patients with regional UIP and NSIP disease patterns. Finally, a lack of consensus among radiologists remains problematic for the type of supervised learning algorithm applied, because the gold standard is unclear even to expert radiologists. Unsupervised learning could be one way to address this problem.
In this study, we proposed a computer-aided differential diagnosis (CADD) system that differentiates between usual interstitial pneumonia (UIP) and non-specific interstitial pneumonia (NSIP) using high-resolution computed tomography (HRCT) images. Lung quantification was presented to automatically classify the voxels of the HRCT images into one normal and five regional disease patterns. Based on the lung quantification, the CADD features that characterize each regional disease pattern throughout the entire lung were extracted. Using these CADD features, a CADD classifier was able to predict the patient HRCT images as either UIP or NSIP cases.
To evaluate the proposed system, we compared its accuracy against the determinations of radiologists. The results of the comparison indicate that the proposed system can be a robust and quantitative tool supporting the decisions of clinicians and providing an initial screening for UIP and NSIP.
This work was supported by the Industrial Strategic Technology Development Program (10072064) funded by the Ministry of Trade, Industry and Energy (MI, Korea).
Compliance with Ethical Standards
Conflict of Interest
Namkug Kim and Joon Beom Seo have conflicts of interest regarding royalties received for a patent on classifying regional diseased patterns of diffuse interstitial lung disease, and as stockholders of Coreline Soft, Inc. The other authors have no relevant conflicts of interest to disclose.
- 1. Travis WD, King TE, Bateman ED, Lynch DA, Capron F, Center D, Colby TV, Cordier JF, DuBois RM, Galvin J: American Thoracic Society/European Respiratory Society international multidisciplinary consensus classification of the idiopathic interstitial pneumonias. American Journal of Respiratory and Critical Care Medicine 165(2):277–304, 2002
- 11. Kim N, Seo JB, Sung YS, Park B-W, Lee Y, Park SH, Lee YK, Kang S-H: Effect of various binning methods and ROI sizes on the accuracy of the automatic classification system for differentiation between diffuse infiltrative lung diseases on the basis of texture features at HRCT, presented at the Medical Imaging, 2008 (unpublished).
- 12. Raghu G, Collard HR, Egan JJ, Martinez FJ, Behr J, Brown KK, Colby TV, Cordier JF, Flaherty KR, Lasky JA, Lynch DA, Ryu JH, Swigris JJ, Wells AU, Ancochea J, Bouros D, Carvalho C, Costabel U, Ebina M, Hansell DM, Johkoh T, Kim DS, King TE Jr, Kondoh Y, Myers J, Muller NL, Nicholson AG, Richeldi L, Selman M, Dudden RF, Griss BS, Protzko SL, Schunemann HJ: An official ATS/ERS/JRS/ALAT statement: Idiopathic pulmonary fibrosis: evidence-based guidelines for diagnosis and management. American Journal of Respiratory and Critical Care Medicine 183(6):788–824, 2011