Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis. According to the World Health Organization (WHO), almost 10 million people were diagnosed with TB in 2018, of whom 1.45 million died (including 0.25 million with HIV) [1]. TB, along with HIV, is among the deadliest diseases of the current century. TB spreads through the sneezing or coughing of a person with the active form of the disease. TB is most prevalent in Africa and Southeast Asia, mainly due to limited resources and relatively high poverty rates. Pakistan, India, Bangladesh and China are among the high-burden TB countries [2]. Early diagnosis is crucial in combating TB effectively and can reduce the death rate significantly. However, the lack of medical facilities in under-developed countries makes early detection difficult.
Although TB's cure rate with antibiotics is quite high, its mortality rate is also high, which suggests that TB cases either remain undetected or are detected at an advanced stage. Sputum smear microscopy [3] and chest X-ray (CXR) are the most common methods for TB detection. CXR has higher sensitivity than verbal screening for identifying pulmonary TB [3]. However, despite being effective for TB detection, CXR also poses challenges: interpreting CXR images requires expert personnel, and TB manifests in the lungs in different ways. Common TB manifestations include infiltrates, consolidation and cavitation [4]. Figure 1 shows sample CXR images with different TB manifestations.
TB affects the shape and texture of the lungs in a chest radiograph. The job of a qualified radiologist is to identify the disease in a CXR accurately. Unfortunately, there are not enough radiologists available, especially in high-burden TB countries [5]. Computer-aided diagnosis (CAD) is a step forward for the initial screening of TB. Through CAD, TB can be detected automatically in CXRs. This can help decrease the death rate, especially in resource-limited areas, by reducing the need for qualified radiologists [6].
A typical CAD system for TB detection consists of three stages, namely (i) lung field segmentation, (ii) feature extraction, and (iii) classification. Lung segmentation in CXRs is often carried out as a pre-processing step to extract the region of interest (ROI). These ROIs are normally required for further analysis and can be susceptible to abnormalities. For example, clavicle segmentation can play a key role in early diagnosis because TB and many other lung diseases most commonly manifest in the lung apex [7]. Furthermore, segmentation can support region-based processing, such as contrast enhancement and bone suppression [8].
Once segmentation is done, the next step is to extract visual features that effectively represent these ROIs. Several texture features (e.g., wavelets, local binary pattern (LBP)), shape features (e.g., ellipticity, circularity), and combinations of both have been employed to characterize these lung regions [9,10,11]. Further, various classifiers such as Support Vector Machine (SVM), Neural Network (NN), Random Forest (RF) and Bayesian network (BN) have been explored in Refs. [10, 12] to classify CXRs as normal or abnormal.
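To make the idea of a hand-crafted texture feature concrete, the following is a minimal sketch of the basic 8-neighbour LBP descriptor in plain Python. It is an illustration only, not the implementation used in the cited works: the function names, the clockwise bit ordering and the 256-bin histogram are assumptions chosen for clarity.

```python
from typing import List


def lbp_code(patch: List[List[int]]) -> int:
    """Return the basic 8-neighbour LBP code for the centre of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left pixel.
    # This ordering is an assumed convention; others are equally valid.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image: List[List[int]]) -> List[int]:
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The resulting histogram is the kind of fixed-length texture vector that could then be passed to a classifier such as an SVM or a random forest.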
Since the emergence of deep learning (DL) algorithms and their promising results in various medical applications, significant progress has been made in developing DL systems [13,14,15,16,17,18,19,20,21] to detect pulmonary TB and other lung abnormalities. Among DL algorithms, the deep convolutional neural network (DCNN), a type of supervised machine learning algorithm, has emerged as an attractive technique for TB surveillance and detection [19]. A DCNN consists of multiple convolution layers, pooling layers and fully-connected layers. Each layer is connected to the previous layer via kernels that have a predefined, fixed-size receptive field. The weights are shared within each layer to reduce complexity and computation. A convolutional neural network (CNN) model typically uses a large dataset to learn its parameters and extracts the global and local features that are most discriminative in the image. In contrast to approaches based on handcrafted features, a CNN model does not require domain-specific knowledge and has strong feature representation ability. AlexNet was the first CNN model used for CXR TB classification, in Ref. [21]. In addition, a pre-trained CNN can be fine-tuned to fit a different dataset, which is referred to as transfer learning. Transferring parameters learned from a larger dataset is quite effective in comparison to training the CNN from scratch, especially with limited datasets [22]. In the following, we briefly review the related work, highlighting the challenges that have motivated our work in this paper.
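The weight sharing and fixed-size receptive field described above can be sketched with a single convolution in plain Python. This is a simplified illustration under stated assumptions (one channel, stride 1, no padding, no bias), and the function name is hypothetical. As in most deep learning frameworks, the operation shown is technically a cross-correlation.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide one shared kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # The same kh x kw weights are reused at every position (i, j),
            # so the parameter count does not grow with the image size.
            for u in range(kh):
                for v in range(kw):
                    acc += image[i + u][j + v] * kernel[u][v]
            row.append(acc)
        out.append(row)
    return out
```

Each output value depends only on a kh x kw window of the input, which is the receptive field of that unit.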
Han et al. [23] proposed an automatic recognition system for the cavity imaging sign in lung computed tomography (CT). Hand-crafted and deep features were fused, and hybrid resampling was used. Multi-feature fusion worked better than any single feature class and achieved high sensitivity compared with the rest. Ma et al. [24] proposed a multi-level similarity technique for the retrieval of common lung disease signs in lung CT scans. Similarity was measured at low, mid and high levels of scale, and the final similarity score was obtained as the weighted sum of the scores at each level.
Wang et al. [25] proposed a thoracic disease classification scheme based on a regularized deep neural network. The proposed network, named Thorax-Net, is composed of an attention branch and a classification branch. The output diagnosis is obtained by averaging the outputs of the two branches. Thorax-Net achieved higher area under curve (AUC) values than other deep learning models. Based on the observation that TB-infected CXRs reveal deformed thoracic edge maps, Santosh et al. [26] proposed a TB screening system based on such edge maps. They implemented five ROI localization methods to find the best-performing model.
Govindarajan et al. [27] proposed a TB classification scheme using the 'Speeded Up Robust Feature' (SURF) descriptor and the 'Bag of Features' approach. A distance regularized level set was used to segment the lung field, and a multilayer perceptron was used to classify normal and TB-infected images. Vajda et al. [10] proposed optimal feature selection from a wide variety of lung region features. Lung segmentation was performed to keep the focus of feature extraction on the lung region. Three different subsets were formed from the initial pool of features. Each feature set consisted of different types of features, such as shape, edge, intensity, sharpness and gradient. The feature set consisting of shape, texture and edge descriptors performed best at classifying CXR images as TB-infected or normal.
Lopes et al. [28] proposed a transfer learning approach in which pre-trained model weights were used with some fine-tuning of the final layers. The CNN architectures deployed in the proposed scheme were GoogleNet [29], ResNet [30] and VggNet [31]. The study conducted three different experiments. In the first experiment, input images were fed directly to the neural network after being downsized to fit the respective CNN. Because image downsizing may result in the loss of important information, in the second experiment the input images were divided into smaller parts, referred to as bags of features. In the third experiment, the outputs of all three CNN architectures were combined through Ensemble Learning. Deep CNNs have a high computational cost, which makes them difficult to deploy on mobile devices. Pasa et al. [32] proposed an efficient CNN model with five convolutional layers followed by average pooling layers and a softmax layer. The size, complexity and computational cost were reduced while preserving the accuracy of the model.
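One common way to combine the outputs of several classifiers is soft voting, in which the predicted TB probabilities are averaged per image. The sketch below illustrates that idea only; the source does not state which combination rule Lopes et al. used, and the function names and the 0.5 threshold are assumptions.

```python
from typing import List


def ensemble_average(prob_lists: List[List[float]]) -> List[float]:
    """Average per-image TB probabilities across models (soft voting).

    prob_lists[m][i] is the probability that model m assigns to image i.
    """
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    return [sum(p[i] for p in prob_lists) / n_models for i in range(n_samples)]


def predict_labels(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Map averaged probabilities to labels: 1 = TB, 0 = normal."""
    return [1 if p >= threshold else 0 for p in probs]
```

Averaging tends to smooth out the errors of individual models when their mistakes are not strongly correlated, which is the usual motivation for ensembling.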
Generally, deep learning models are tested on the same dataset on which they are trained, so a model may become biased towards that specific dataset. To address this problem, Das et al. [33] proposed a cross-population train/test model to measure the performance of a deep learning classifier. In cross-population train/test, the training and test datasets come from different sources.
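The cross-population protocol can be sketched as a split keyed on the origin of each image rather than a random split. The record layout and the source names used in the usage note are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def cross_population_split(
    records: List[Record], train_source: str, test_source: str
) -> Tuple[List[Record], List[Record]]:
    """Train on images from one source and test on images from another."""
    if train_source == test_source:
        raise ValueError("train and test sources must differ")
    train = [r for r in records if r["source"] == train_source]
    test = [r for r in records if r["source"] == test_source]
    return train, test
```

For example, with records tagged `"site_A"` and `"site_B"`, calling `cross_population_split(records, "site_A", "site_B")` yields a test set that shares no acquisition source with the training set.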
In summary, several efforts have been made to build a fully automatic TB CAD system. Earlier research was limited to hand-crafted features, but recently the focus has shifted towards deep learning models. However, the low accuracy of the reported systems is still an unresolved issue. The primary reason for this low accuracy is the diversity of TB manifestations on a chest radiograph. These different types of manifestations pose a challenge for CAD-based systems. To cope with them, a robust system is needed that can reliably identify and differentiate between TB and non-TB manifestations.
In this paper, we propose a fully automatic CAD system for the effective detection of TB. We use different pre-trained CNN architectures and supervised learning to predict TB in CXR images, and compare the performance of the deployed CNN architectures. Next, we experiment with the Gabor filter and evaluate its performance on TB detection. Finally, we use Ensemble Learning to combine the individual classifier outputs and report the results. Our proposed method achieves better results than existing schemes. Further, the proposed methodology works without lung segmentation and requires minimal pre-processing. The main contributions of this work are summarized below:
- A fully automatic computer-aided TB detection scheme using CXR images is proposed, which can be deployed for initial screening purposes.
- A performance analysis of notable pre-trained CNN architectures is made for effective detection.
- Hand-crafted features are fused with deep features, and Ensemble Learning is deployed to improve the detection performance.
- A detailed comparison is made with state-of-the-art techniques for TB detection.
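As background on the Gabor filter mentioned above, the following is a minimal sketch of the real part of a standard 2D Gabor kernel: a Gaussian envelope multiplied by a cosine carrier along a chosen orientation. The parameter names and default values are assumptions for illustration and are not the settings used in this work.

```python
import math
from typing import List


def gabor_kernel(
    size: int,
    sigma: float,
    theta: float,
    lambd: float,
    gamma: float = 0.5,
    psi: float = 0.0,
) -> List[List[float]]:
    """Real part of a 2D Gabor kernel on a size x size grid (size must be odd).

    sigma: width of the Gaussian envelope; theta: orientation in radians;
    lambd: wavelength of the carrier; gamma: aspect ratio; psi: phase offset.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel
```

Convolving an image with a bank of such kernels at several orientations and wavelengths yields texture responses that can serve as hand-crafted features.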
The rest of the paper is structured as follows: the next section presents the methodology adopted in the proposed method. Experimental results are presented in the “Results” section, followed by the “Discussion”. Finally, the paper is summarized and concluded with future directions in the “Conclusion” section.