ET-NET: an ensemble of transfer learning models for prediction of COVID-19 infection through chest CT-scan images

Kundu, Rohit; Singh, Pawan Kumar; Ferrara, Massimiliano; Ahmadian, Ali; Sarkar, Ram

doi:10.1007/s11042-021-11319-8

1192: Pioneering AI, Data Science and Multimedia Techniques and Findings for COVID-19
Published: 31 August 2021

Volume 81, pages 31–50, (2022)
Multimedia Tools and Applications

Abstract

The COVID-19 virus has caused a worldwide pandemic, affecting numerous individuals and accounting for more than a million deaths. Countries around the world had to declare complete lockdowns when the coronavirus led to community spread. Although the real-time Polymerase Chain Reaction (RT-PCR) test is the gold-standard test for COVID-19 screening, it is not satisfactorily accurate and sensitive. On the other hand, Computed Tomography (CT) scan images are much more sensitive and can be suitable for COVID-19 detection. To this end, in this paper, we develop a fully automated method for fast COVID-19 screening from chest CT-scan images using Deep Learning techniques. For this supervised image classification problem, a bootstrap aggregating or Bagging ensemble of three transfer learning models, namely, Inception v3, ResNet34 and DenseNet201, has been used to boost the performance of the individual models. The proposed framework, called ET-NET, has been evaluated on a publicly available dataset, achieving $97.81\pm 0.53\%$ accuracy, $97.77\pm 0.58\%$ precision, $97.81\pm 0.52\%$ sensitivity and $97.77\pm 0.57\%$ specificity on 5-fold cross-validation, outperforming the state-of-the-art method on the same dataset by 1.56%. The relevant code for the proposed approach is available at: https://github.com/Rohit-Kundu/ET-NET_Covid-Detection

1 Introduction

The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, originated in Wuhan, China and has affected more than 100 million people worldwide with more than 2 million deaths during January-October 2020. Although the mortality rate has dropped, the pandemic is not over yet. The tests performed for the detection of COVID-19 are the rapid IgM-IgG combined antibody test [32] and the real-time Polymerase Chain Reaction (RT-PCR) test [48]. The RT-PCR test has several limitations: (1) a long time is required to obtain the test results; (2) it is a costly test that requires experts to perform it and analyze the results; (3) it has a high false-negative rate (sensitivity of 71%) [55]. Although the rapid antibody test can produce results within 15 minutes by detecting the IgG and IgM antibodies simultaneously in human blood, it might take several days for the body to form these antibodies, so there is a risk that the virus spreads before it is detected. This leads to a very high false-negative rate. Hence, as an alternative, an automated diagnosis tool is required that is both sensitive and specific to COVID-19 and can deliver fast predictions.

Currently, in the worldwide scenario, there are 108.3 million total COVID-19 cases and 2.38 million total deaths. The plots for the total cases and daily cases are shown in Fig. 1a, b, respectively. In India, there are more than 10.8 million cases in total and 155,000 deaths as shown in Fig. 2a, and the daily cases and mortality rates are shown in Fig. 2b (all data for the graphs have been collected from the publicly available data by Roser et al. [35]). Due to the acute shortage of RT-PCR test kits, especially in developing countries like India, population-wide screening is not possible, which has led to uncontrolled community spread of the virus. Also, the RT-PCR test is a tedious and time-consuming process, so an appropriate and viable option can be the use of chest CT-scan images for COVID-19 screening.

Computed Tomography or CT-scan is a relatively common test [34] and can be performed more widely. It is much more sensitive (98%) than RT-PCR (71%), as established by Fang et al. [18]. Figure 3 shows two CT images, one of a COVID-19 infected patient and the other of a patient who tested negative. The most common finding in the chest CTs is “ground-glass opacities” scattered throughout the lungs. They represent tiny air sacs, or alveoli, filling with fluid; these appear as a shade of grey in the CT-scan and turn into a white consolidation in more severe cases, as marked by the red circle in Fig. 3a. The disease severity is proportionate to the lung findings, meaning that sicker individuals have more such opacities in one or both lungs in chest CT-scans.

Also, to aid clinicians in COVID-19 screening, automated methods need to be developed that are both reliable and fast. Hence, researchers around the world have tried developing Computer-Aided Diagnosis tools for the detection of COVID-19 from chest X-rays or chest CT images. Chest CT images reveal more details than chest X-rays, and hence ET-NET considers chest CT images for the prediction of COVID-19 positive patients. Deep learning [30] is a powerful machine learning tool that uses structured or unstructured data for classification using a complex decision-making process.

The current image classification problem [13] is a supervised learning task. Supervised learning [4, 22] refers to the learning procedure in which an algorithm is trained on a labelled dataset, meaning that the true classes of the samples are already provided for the model to tune its parameters based on the training accuracy. Transfer learning is a technique in which a deep learning model trained for one task is reused for a separate task. This method is particularly effective when little data is available for training on the task at hand: the parameters learned on the previous task are loaded and then fine-tuned with the new data.

Ensemble learning allows the fusion of the salient features of multiple base learners, leading to more accurate predictions than the individual models. Such a learning scheme is robust since the variance in the prediction errors is reduced upon ensembling. An ensemble model aims to capture the complementary information from the base models and make superior predictions. In the present study, the Bagging technique is used to fuse the important aspects of all the transfer learning models considered here to form the ensemble. Bagging is preferred over Boosting in the present work because the available dataset is small, and Boosting might lead to excessive overfitting on it. Bagging, on the other hand, reduces overfitting and hence is beneficial for the current problem. Thus, the ET-NET or Ensemble Transfer learning Network is proposed in this paper.

2 Literature survey

A vast amount of research is being conducted to help stop the COVID-19 pandemic [5]. However, the existing methods are time consuming and expensive, while also being less accurate. Yang et al. [54] showed that chest CT-scans can serve as an important supplement for the diagnosis of COVID-19. They used respiratory samples including nasal and throat swabs, and bronchoalveolar lavage fluid (BALF) to draw comparisons. The accuracy in the detection of COVID-19 was only 88.9% for severe cases and 82.2% for mild cases using sputum samples. Nasal swabs and throat swabs gave even lower accuracies (73.3% for nasal swabs and 60.0% for throat swabs). Table 1 shows some of the recent methods proposed in the literature for the automated diagnosis of COVID-19 from either CT-scan or chest X-ray images.

Table 1 Some recent methods for COVID-19 detection

Several automated frameworks for the screening of COVID-19 infected patients have been proposed since the outbreak of the pandemic, a majority of which have used chest X-ray images [11, 50]. According to clinicians and doctors, CT-scans are more reliable and sensitive than radiograph (X-ray) images, and hence a better input for screening.

Deep learning has been widely used as a computer-aided detection tool for COVID-19 screening like in [31, 57]. Gozes et al. [23] utilized deep learning by fusing two subsystems, one being a 2D slice model and the other a 3D volumetric model for CT image classification. Li et al. [31] developed COVNet for extracting visual features from volumetric chest CT images. The COVNet they developed extracted both 2D local and 3D global features using a ResNet50 backbone and fused the features using a max-pooling layer and employing a final fully connected layer for generating the probability scores.

Zhang et al. [56] proposed a novel deep learning model for utilizing 3D chest CT volumes for the classification of infected patients and localization of swelling regions in the CT-scans. They used a pretrained U-net for segmentation of the 3D CT-scans and fed the 3D segmented chest areas into a deep neural network for forecasting the infection probability. The computation time for the detection of test images in their method is only 1.93 seconds per image. Abdel et al. [1] proposed a semi-supervised meta learning-based lung segmentation model for COVID-19 detection. Karbhari et al. [29] proposed a Generative Adversarial Network (GAN) framework to address the challenge of data scarcity for COVID-19 detection and used the generated data for training a classification model. Das et al. [14] proposed a bi-level classification model that uses pre-trained VGG-19 for feature extraction and then a shallow classifier for the final predictions. Sen et al. [42] and Chattopadhyay et al. [11] proposed deep features extraction and classification framework using meta-heuristics to reduce the feature set dimensionality. Garain et al. [21] developed a Spiking Neural Network-based model for the detection of COVID-19 from CT-scan images.

Most of the previous methods shown in Table 1 use a single model for the predictions; in contrast, we propose an ensemble scheme for the detection of COVID-19. Using the complementary information provided by the different base classifiers, based on the confidence scores, enhances the overall performance and robustness by reducing the variation in prediction errors. The ensemble method is a kind of fusion mechanism that uses the outputs or features from more than one model to compute the final prediction of the input [36, 39, 40]. It aims to enhance the performance of the framework beyond the reach of the individual models. Ensemble learning works better than the individual models because of the diversification of the information considered. When more than one model’s opinion is accounted for, less noisy predictions are produced. Hence, such a technique has been employed in the present work. A large variety of ensemble techniques [17, 25, 36, 52] have been proposed in the literature, two of the most popular being Bagging [10, 38] and Boosting [9].

2.1 Motivation and contributions

In light of the current pandemic situation, medical practitioners and healthcare professionals are working tirelessly to fight the disease. However, the current gold standard method for COVID-19 screening, the RT-PCR test, is slow and tedious, and hence inadequate for population-wide screening, resulting in an uncontrolled number of infected individuals. Several researchers, therefore, are trying to develop systems for faster and more efficient screening of infected patients, which is the primary motivation behind the current paper. Ensemble learning allows the fusion of salient properties of the base classifiers, thus achieving an overall enhanced performance. Such models are robust since computing the ensemble decreases the spread (or dispersion) of the predictions of the base models. That is, the variance in the prediction errors is diminished and complementary information is captured. Figure 4 shows a diagram depicting the overall workflow of the proposed ET-NET model.

The contributions of this paper are as follows:

1. An ensemble-based COVID detection approach has been used that boosts the performance of the individual CNN classifiers: Inception v3 [47], ResNet34 [24] and DenseNet201 [27]. For this, a bagging ensemble technique has been used that takes the average of the decision scores generated by each model for each class of the dataset.
2. The proposed model, called ET-NET, has been evaluated on a publicly available dataset [45] using 5-fold cross-validation, outperforming the previous state-of-the-art method by 1.56%.
3. Most of the previous works considered chest X-ray images, which are less sensitive than the lung CT images used in this work. To account for the limited availability of public data, Transfer Learning has been used to generate the decision scores. The ensembling technique helps capture complementary information, thus outperforming individual models.

CT-scan images have been used, generally requiring no prior segmentation, for the classification of the chest CT-scans into two categories: COVID or Non-COVID.

The rest of the paper has been organized as follows: Sect. 3: Proposed Method, explains in detail the working of ET-NET in the current study; Sect. 4: Results and Discussion, highlights the results obtained by the ET-NET on a publicly available dataset, compares it to existing models and discusses the efficacy of ET-NET and Sect. 5: Conclusions, concludes the findings and contributions of this paper, and discusses the possibilities of future works on the proposed model.

3 Proposed method

Convolutional Neural Networks (CNNs) are preferred for image classification problems since an image is a 2D matrix of pixel intensities, and it often helps to look at an image in parts. For example, a 300x300 image can be scanned one 3x3 patch at a time, say for object detection, which is what the convolution operation achieves. The pooling operation [53] helps in dimensionality reduction. CNNs are shift-invariant [49, 58] and have fewer parameters than deep fully connected neural networks, and hence are computationally more efficient even when the network is very deep [19, 20].
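The patch-wise scanning and the pooling step described above can be illustrated with a minimal pure-Python sketch. The 5x5 toy image, the 3x3 vertical-edge kernel and the function names are assumptions made for illustration only and are not taken from the ET-NET implementation. As in most deep learning frameworks, the sliding-window product is computed without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 5x5 "image" with a vertical edge between columns 1 and 2 (hypothetical data).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 3x3 vertical-edge kernel.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = conv2d_valid(image, kernel)  # 3x3 map, strong response near the edge
pooled = max_pool2d(feature_map, size=2)   # 1x1 map after 2x2 pooling
```

The same nine kernel weights are reused at every position, which is why a convolutional layer needs far fewer parameters than a fully connected layer over the same image.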

In the proposed work, three models namely, Inception v3 [47], ResNet34 [24] and DenseNet201 [27] pretrained on ImageNet [15] have been used, which are then fine-tuned using the chest CT-scan dataset. The number of layers and parameters of each deep transfer learning model have been shown in Table 2.

Table 2 Number of layers and parameters in each network

3.1 Inception v3

The characteristic features of the Inception v3 model developed by Szegedy et al. [47] in 2016, are the three types of inception block, which have parallel convolutions. Such modules account for more efficient computation in the deep architecture, while also addressing the overfitting problem. The architecture of the Inception v3 CNN has been illustrated in Fig. 5a.

3.2 ResNet34

The salient feature of Residual Networks or ResNets, developed by He et al. [24] in 2016, is the skip connection, which adds the features from an earlier layer directly to the output of the current layer, thereby preserving features from earlier layers that might be important. ResNet34 is one such network that is 34 layers deep (including the final fully connected classification layer), the architecture of which is shown in Fig. 5b.
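As a minimal sketch of a skip connection: in the formulation of He et al. [24], the identity shortcut is an element-wise addition of the block input to the block output, followed by the activation. The `halve` function below is a hypothetical stand-in for the convolutional layers inside a block, and feature maps are reduced to flat lists for brevity.

```python
def relu(values):
    """Element-wise ReLU on a flat list of activations."""
    return [max(0.0, v) for v in values]


def residual_block(x, transform):
    """Return relu(transform(x) + x): the shortcut adds the input back element-wise."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("an identity shortcut needs matching shapes")
    return relu([f + xi for f, xi in zip(fx, x)])


def halve(x):
    """Hypothetical stand-in for the learned layers of the block."""
    return [0.5 * v for v in x]


x = [1.0, -2.0, 4.0]
y = residual_block(x, halve)

# If the learned transform outputs zeros, the block reduces to the identity
# (for non-negative inputs), which is what makes very deep stacks trainable.
identity_out = residual_block([1.0, 2.0], lambda v: [0.0] * len(v))
```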

3.3 DenseNet201

In DenseNets, proposed by Huang et al. [27] in 2017, the input to each layer is a concatenation of the feature maps of all preceding layers. As a result, each layer can be compact (that is, have fewer channels), making these networks efficient in terms of computation and memory while still providing a rich feature representation of the input images. The architecture of DenseNet201 is shown in Fig. 5c.
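The effect of dense connectivity on channel counts can be sketched as follows. The values used here (64 initial channels, growth rate 32, a block of 6 layers) are illustrative assumptions, and the toy layers that each emit a single new feature are hypothetical.

```python
def dense_block_input_channels(initial_channels, growth_rate, num_layers):
    """Channels seen by each layer when every layer appends `growth_rate` feature maps."""
    return [initial_channels + growth_rate * layer for layer in range(num_layers)]


def dense_block_forward(x, layers):
    """Concatenate each layer's output onto the running list of features."""
    features = list(x)
    for layer in layers:
        features = features + layer(features)  # concatenation, not addition
    return features


# Each layer only adds 32 channels, yet sees everything produced before it.
channels = dense_block_input_channels(initial_channels=64, growth_rate=32, num_layers=6)

# Toy forward pass: every hypothetical layer emits one new feature, the sum of its inputs.
toy_layers = [lambda f: [sum(f)], lambda f: [sum(f)]]
toy_out = dense_block_forward([1.0, 2.0], toy_layers)
```

In contrast to a residual shortcut, earlier features are kept unchanged next to the new ones, so later layers can reuse them directly.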

3.4 Loss function

A loss function is a measure of the performance of a deep learning model. The main objective when training a deep learning model is to minimize the error between the predicted and the original labels; the gradients of this error are calculated during backward propagation [12] in a neural network.

In the current study, the cross-entropy loss function is used, which evaluates the performance of a classifier that outputs a matrix of probabilities (each probability value between 0 and 1). Since the present classification problem deals with only two classes, the loss function is called the Binary Cross-Entropy loss function. The cross-entropy loss function is chosen since it performs well for binary classification problems which have a large decision boundary [33]. This loss function also helps curb the vanishing gradient problem, since the logarithm cancels the exponential behaviour introduced by the sigmoid (or softmax) activation function. The logarithm avoids saturation of the gradients at extreme values, which is beneficial since large gradients are essential for making significant progress through the iterations.

Suppose for an input x, the true label is y and the predicted label from the classifier is $\hat{y}$, which is given by Eq. 1, where w is the weight matrix associated with the neural network and b is the bias matrix associated with it. f is the non-linear activation function associated with the layers in the neural network. For the present work, the activation function Rectified Linear Unit or ReLU [3] has been used.

$$\begin{aligned} \hat{y} = f(w^{T}x+b) \end{aligned}$$

(1)

The ReLU activation function is given as in Eq. 2.

$$\begin{aligned} \mathrm{ReLU}(x) = \max (0,x) \end{aligned}$$

(2)

Then the loss function L for a single sample is given by Eq. 3, where N denotes the number of classes in the problem ($N=2$ for the present study), and $y_{j}$ and $\hat{y}_{j}$ denote the true and predicted probabilities of class j.

$$\begin{aligned} L(\hat{y},y) = -\sum _{j=1}^{N}y_{j}\log \hat{y}_{j} \end{aligned}$$

(3)

For m training samples, the cost function is the mean of the per-sample losses, as given by Eq. 4, where the superscript (i) indexes the training samples.

$$\begin{aligned} J(w,b) = \frac{1}{m}\sum _{i=1}^{m}L(\hat{y}^{(i)},y^{(i)}) \end{aligned}$$

(4)

Using the cost function in Eq. 4, the weights and biases associated with the layers in the neural networks are updated.
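A minimal sketch of the cross-entropy loss and the resulting cost is given below, taking the cost as the mean of the per-sample losses so that lower values indicate better predictions. The one-hot labels, the predicted probabilities and the clipping constant `eps` are assumptions for illustration; they are not values from the ET-NET experiments.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Per-sample loss: -sum_j y_j * log(y_hat_j), with y_true one-hot encoded."""
    # eps guards against log(0); it is an implementation convenience, not part of Eq. 3.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def cost(labels, predictions):
    """Mean of the per-sample losses over the m training samples."""
    m = len(labels)
    return sum(cross_entropy(y, p) for y, p in zip(labels, predictions)) / m


# Hypothetical one-hot labels and predicted class probabilities for two samples.
labels = [[1, 0], [0, 1]]
predictions = [[0.9, 0.1], [0.2, 0.8]]
j = cost(labels, predictions)

# A confident wrong prediction is penalised far more than a confident right one.
loss_right = cross_entropy([1, 0], [0.9, 0.1])
loss_wrong = cross_entropy([1, 0], [0.1, 0.9])
```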

3.5 Ensemble

The ensemble approach adopted for the current study is the bootstrap aggregating or “Bagging” ensemble [8]. This machine learning-based ensemble technique was developed to make classification algorithms more stable and accurate. Bagging helps to reduce overfitting, in contrast to the Boosting ensemble technique [41], which can aggravate overfitting because each stage of the Boosting algorithm concentrates on the samples misclassified in the previous stage.

In the current study, the Bagging ensemble technique uses the same training set for training the three pretrained models (Inception v3, ResNet34 and DenseNet201) independently and then predicts the class probabilities of the samples in the test set by the fine-tuned models to calculate the average probability score, thus giving equal weightage to all the three classifiers.

Suppose m models (classifiers) numbered as $1,2,\dots,m$ are used for a classification task of n classes, and the prediction probability scores are denoted by P. The prediction scores for a single image from model i can be expressed as a matrix as in Eq. 5.

$$\begin{aligned} P^{(i)} = \left[ p^{(i)}_1 \;\; p^{(i)}_2 \;\; \dots \;\; p^{(i)}_n \right] \end{aligned}$$

(5)

So the final prediction score $P^{ensemble}$ using the average probability ensemble technique is given by Eq. 6.

$$\begin{aligned} P^{ensemble} &= \frac{\sum _{i=1}^{m} P^{(i)}}{m} \\ &= \left[ \frac{\sum _{i=1}^{m} p^{(i)}_1}{m} \;\; \frac{\sum _{i=1}^{m} p^{(i)}_2}{m} \;\; \dots \;\; \frac{\sum _{i=1}^{m} p^{(i)}_n}{m} \right] \\ &= \left[ p^{\prime }_1 \;\; p^{\prime }_2 \;\; \dots \;\; p^{\prime }_n \right] \end{aligned}$$

(6)

Now, the class having the maximum probability out of the values $p^{\prime }_1, p^{\prime }_2, ... , p^{\prime }_n$ is decided as the predicted class, which is then compared with the true labels to obtain the accuracy. In the current problem, there are 3 models and 2 categories to sort the images into, accounting for $m=3$ and $n=2$ in Eqs. 5 and 6.
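The averaging in Eqs. 5 and 6 and the final arg-max decision can be sketched in a few lines. The probability values and the class order (index 0 for COVID, index 1 for Non-COVID) are assumptions for illustration, as is the function name; the released ET-NET code may organise this step differently.

```python
def average_ensemble(per_model_probs):
    """Average class probabilities over m models and return (scores, predicted index)."""
    m = len(per_model_probs)
    n = len(per_model_probs[0])
    scores = [sum(p[k] for p in per_model_probs) / m for k in range(n)]
    predicted = max(range(n), key=lambda k: scores[k])
    return scores, predicted


# Hypothetical outputs for one CT image, ordered as [P(COVID), P(Non-COVID)].
probs = [
    [0.60, 0.40],  # e.g. Inception v3
    [0.30, 0.70],  # e.g. ResNet34
    [0.80, 0.20],  # e.g. DenseNet201
]
scores, predicted = average_ensemble(probs)
```

In this toy case the second model disagrees with the other two, but the averaged score still favours class 0, showing how equal weighting lets the majority of confident models decide.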

4 Results and discussion

In this section, we briefly describe the dataset used for the current study in Sect. 4.1 and the evaluation metrics used for comparing and validating ET-NET in Sect. 4.2. The implementation of the developed methodology and the results obtained are described in detail in Sect. 4.3, and the comparison with the existing literature and standard models is made in Sect. 4.5.

4.1 Dataset description

For evaluating the performance of the proposed methodology, the dataset used is publicly available on Kaggle (footnote 1, see Notes) and was developed by Soares et al. [45]. The dataset consists of a total of 2481 CT-scan images unevenly distributed into COVID and Non-COVID categories as shown in Table 3. For the proposed framework, 70% of the images (1736 scans) are used as training data and the remaining 30% (745 scans) are used as testing data.
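The split sizes quoted above follow from taking the floor of 70% of 2481. A minimal sketch of such a split over image indices is shown below; the shuffling, the fixed seed and the function name are assumptions, since the text does not state how the images were assigned to each subset.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # seed is an arbitrary assumption
    n_train = int(n_samples * train_fraction)  # floor of 70% of the samples
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(2481)
print(len(train_idx), len(test_idx))
```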

Table 3 Class-wise distribution of images in the Kaggle dataset

4.2 Evaluation metrics

For evaluating the performance of ET-NET on the binary classification task at hand, metrics such as accuracy, precision, recall (or sensitivity), F1 score and specificity are used. To define them, the terms True Positive, True Negative, False Positive and False Negative first need to be defined.

In a binary classification problem, suppose the two classes are a positive class and a negative class. True Positive (TP) refers to a sample belonging to the positive class, being classified correctly. False Positive (FP) refers to a sample belonging to the negative class, but classified to be belonging to the positive class. Similarly, True Negative (TN) refers to a sample being classified correctly as belonging to the negative class. False Negative (FN) refers to a sample belonging to the positive class, but classified as being part of the negative class. Now the metrics can be defined as follows:

$$\begin{aligned} Accuracy = \frac{TP+TN}{TP+FP+TN+FN} \end{aligned}$$

(7)

$$\begin{aligned} Precision = \frac{TP}{TP+FP} \end{aligned}$$

(8)

$$\begin{aligned} Recall\, (or\, Sensitivity) = \frac{TP}{TP+FN} \end{aligned}$$

(9)

$$\begin{aligned} F1\, Score = \frac{2}{\frac{1}{Precision}+\frac{1}{Recall}} \end{aligned}$$

(10)

$$\begin{aligned} Specificity = \frac{TN}{TN+FP} \end{aligned}$$

(11)
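Equations 7 to 11 translate directly into code. The sketch below computes all five metrics from the four confusion-matrix counts; the counts used in the example are hypothetical and do not correspond to any result reported in this paper, and the function assumes that no denominator is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of Eqs. 7-11 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                # also called sensitivity
    f1 = 2 / (1 / precision + 1 / recall)  # harmonic mean of precision and recall
    specificity = tn / (tn + fp)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
```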

4.3 Implementation

The CNN transfer learning models have been trained for 100 epochs, and the loss curves of the models have been shown in Fig. 6. The predictions of the models on the test set images have been saved. The hyperparameters used for training the three models are shown in Table 4.

Table 4 Hyperparameters used for training each model

The probability prediction matrices from the three classifiers have been averaged per sample to get the final prediction scores, from which the predicted class for every image is obtained.

The class-wise metrics obtained have been shown in Table 5 and the net result has been shown in Table 6. The confusion matrix for the test set has been shown in Fig. 7 and the Receiver Operating Characteristics (ROC) curves of the individual models and the proposed ET-NET are shown in Fig. 8.

Table 5 Class-wise evaluation metrics generated by the base classifiers and the proposed ET-NET model on Fold-4 (best fold) of 5-fold cross-validation

Table 6 Evaluation metrics produced by the proposed ET-NET model on 5-fold cross-validation of the dataset

4.4 Error analysis

ET-NET performs very well for the current classification problem. Examples of correctly classified images from each class are shown in Fig. 9. In both images, a part of the lungs has not been captured properly by the CT-scan, so the images themselves are flawed. The contrast of Fig. 9a is also too high, while Fig. 9b is hazy. Even with these limitations of the images in the dataset, ET-NET was able to classify them correctly, suggesting that the model is reliable even under imperfect imaging conditions and that slightly noisy images do not affect its performance.

Figure 10 shows one misclassified image from each class of the dataset. Figure 10a belongs to the “COVID” class of the dataset but was classified by ET-NET as “Non-COVID”. The prime reason is that the CT-scan depicts a mild COVID condition, in which prominent ground-glass opacity has not yet developed in the lung alveoli. So, ET-NET was unable to detect the presence of COVID-19 infection at such a preliminary stage. Figure 10b, on the other hand, is a sample belonging to the “Non-COVID” class of the dataset, but ET-NET predicted it to be a “COVID” condition. One reason is that the quality of the lung CT-scan is inadequate, since the lung shape has visibly not been captured properly. The other reason might be that the CT-scan is very hazy, unlike the low level of noise present in Fig. 9b.

4.5 Comparison with existing models

Several transfer learning models have been used for comparing the performance of the proposed approach, which has been shown in Table 7. Table 8 shows the comparison of ET-NET with some existing methods that use the same dataset.

Angelov and Soares [6] extracted features from non-pretrained GoogLeNet [46] and used a Multi-Layer Perceptron (MLP) for final classification. Panwar et al. [37] used the VGG19 [44] transfer learning model and added five more layers ahead and trained the network. Jaiswal et al. [28] used deep transfer learning technique with DenseNet201 [27] for feature extraction and classification.

Table 7 Comparison of ET-NET with some standard deep learning models

Table 8 Comparison of the proposed ET-NET with some existing models in literature on the Kaggle dataset

4.6 Statistical test

McNemar’s test [16] is performed to statistically compare the proposed ET-NET ensemble model with the base CNN classifiers used to form the ensemble, and with other standard transfer learning classifiers. McNemar’s test is a non-parametric test for paired nominal data. Table 9 displays the results obtained from McNemar’s test on the Kaggle dataset. The null hypothesis is that the two models being compared perform similarly; the $p$-value is the probability of observing a disagreement at least as large as the measured one if this hypothesis were true, so a lower $p$-value is desired. To reject the null hypothesis, the $p$-value needs to be smaller than $5\%$; that is, if $p<0.05$, the two models under consideration can be regarded as significantly different.

Table 9 Results of the McNemar’s test performed between ET-NET and standard CNN models on the Kaggle dataset: Null hypothesis is rejected for all cases

In Table 9, it can be noted that for every model with which ET-NET is compared, the $p$-value is below 0.05, thus rejecting the null hypothesis. So, it can be said that the proposed ensemble model captures complementary information from the constituent base classifiers, thus producing superior results while making the ensemble model markedly dissimilar from the base classifiers.
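For reference, a minimal sketch of McNemar’s test on the two disagreement counts of a pair of classifiers is given below. It uses the chi-square approximation with continuity correction and one degree of freedom; whether this variant or the exact binomial form was used for Table 9 is not stated here, so this choice is an assumption. The counts `b` and `c` in the example are hypothetical.

```python
import math


def mcnemar_test(b, c):
    """McNemar's chi-square test with continuity correction.

    b: samples classified correctly by model A but wrongly by model B
    c: samples classified wrongly by model A but correctly by model B
    Returns (statistic, p_value).
    """
    if b + c == 0:
        return 0.0, 1.0  # the two models never disagree
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical disagreement counts between an ensemble and one base model.
stat, p = mcnemar_test(b=25, c=8)
reject_null = p < 0.05

# Balanced disagreements give no evidence that the models differ.
stat_eq, p_eq = mcnemar_test(b=10, c=10)
```

Only the samples on which the two models disagree enter the statistic; samples that both models classify correctly, or both misclassify, carry no information about which model is better.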

5 Conclusions

The spread of COVID-19 has crippled economies around the world and caused numerous deaths, and people are still suffering due to this pandemic situation. Although RT-PCR is used for the screening of COVID-19 patients, it is a tedious process with low sensitivity. ET-NET uses a more sensitive CT-scan based detection using Computer-Aided Diagnosis. Deep transfer learning and an average probability-based ensemble approach have been utilized for the binary classification task, obtaining results superior to existing CT-scan based screening models and achieving an accuracy of 97.73%, which is impressive for the small dataset used. Also, the sensitivity and specificity of the proposed ET-NET are better than those of RT-PCR, and hence it can be used as a reliable and robust COVID-19 detection mechanism. The proposed ET-NET model is also domain-independent, and can be extended to problems in gait detection [2], action recognition [7], etc.

The primary limitation of this method is that the limited availability of data makes it difficult to establish the robustness and generalization ability of the method. Deep learning models generally perform best with a very large database, but the dataset used in this study has only 2481 images, whereas more efficient deep learning models need to be trained on larger datasets, depending on the complexity of the problem, for optimal performance. As a result, we had to use transfer learning models that were pretrained on ImageNet, consisting of 14 million images, and then fine-tuned using the chest CT-scan images from this study. Also, other pulmonary diseases like the Middle East respiratory syndrome (MERS) and Chronic obstructive pulmonary disease (COPD) are possible sources of bias in the present work as compared to RT-PCR and IgG-IgM antibody tests. It might also be important to perform segmentation to improve the Non-COVID control group design, which we intend to address in the future.

We aim to perform more experiments once more extensive datasets of chest CT-scan become available and develop better models for classification. We shall try to use image enhancement techniques to address the limitations mentioned in Sect. 4.4. We may try more pretrained models to form the ensemble and try more sophisticated ensemble approaches like Dempster-Shafer theory, Choquet fuzzy integral or rank based fusions.

Notes

https://www.kaggle.com/plameneduardo/sarscov2-ctscan-dataset

References

Abdel-Basset M, Chang V, Hawash H, Chakrabortty RK, Ryan M (2021) Fss-2019-ncov: A deep learning architecture for semi-supervised few-shot segmentation of covid-19 infection. Knowl-Based Syst 212:106647
Achanta SDM, Karthikeyan T, Vinothkanna R (2019) A novel hidden markov model-based adaptive dynamic time warping (hmdtw) gait analysis for identifying physically challenged persons. Soft Comput 23(18):8359–8366
Agarap AF (2018) Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375
Al Hasan M, Chaoji V, Salem S, Zaki M (2006) Link prediction using supervised learning. In SDM06: workshop on link analysis, counter-terrorism and security 30:798–805
Ali A, Zhu Y, Zakarya M (2021) A data aggregation based approach to exploit dynamic spatio-temporal correlations for citywide crowd flows prediction in fog computing. Multimed Tools Appl pp 1–33
Angelov P, Soares E (2020) Explainable-by-design approach for covid-19 classification via ct-scan. medRxiv
Banerjee A, Singh PK, Sarkar R (2020) Fuzzy integral based cnn classifier fusion for 3d skeleton action recognition. IEEE Trans Circuits Syst Video Technol
Breiman L (1996) Stacked regressions. Mach Learn 24(1):49–64
Bühlmann P, Hothorn T et al (2007) Boosting algorithms: Regularization, prediction and model fitting. Stat Sci 22(4):477–505
Bühlmann PL (2003) Bagging, subagging and bragging for improving some prediction algorithms. In Research report/Seminar für Statistik, Eidgenössische Technische Hochschule (ETH), vol 113
Chattopadhyay S, Dey A, Singh PK, Geem ZW, Sarkar R (2021) Covid-19 detection by optimizing deep residual features with improved clustering-based golden ratio optimizer. Diagnostics
Chauvin Y, Rumelhart DE (1995) Backpropagation: theory, architectures, and applications. Psychology Press
Ciregan D, Meier U, Schmidhuber J (2012) Multi-column deep neural networks for image classification. In 2012 IEEE conference on computer vision and pattern recognition pp 3642–3649
Das S, Roy SD, Malakar S, Velásquez JD, Sarkar R (2021) Bi-level prediction model for screening covid-19 patients using chest x-ray images. Big Data Research 25:100233
Article Google Scholar
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conf Comput Vis Pattern Recognit pp 248–255
Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1923
Article Google Scholar
Dietterich TG (2000) Ensemble methods in machine learning. In International workshop on multiple classifier systems. Springer pp 1–15
Fang Y, Zhang H, Xie J, Lin M, Ying L, Pang P, Ji W (2020) Sensitivity of chest ct for covid-19: comparison to rt-pcr. Radiology p 200432
Fukushima K (1988) Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Netw 1(2):119–130
Article Google Scholar
Fukushima K, Miyake S (1982) Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition. In Competition and cooperation in neural nets. Springer pp 267–285
Garain A, Basu A, Giampaolo F, Velasquez JD, Sarkar R (2021) Detection of covid-19 from ct scan images: A spiking neural network-based approach. Neural Comput Applic pp 1–14
Ghahramani Z, Jordan MI (1994) Supervised learning from incomplete data via an em approach. In Adv Neural Inf Proces Syst pp 120–127
Gozes O, Frid-Adar M, Greenspan H, Browning PD, Zhang H, Ji W, Bernheim A, Siegel E (2020) Rapid ai development cycle for the coronavirus (covid-19) pandemic: Initial results for automated detection & patient monitoring using deep learning ct image analysis. arXiv preprint arXiv:2003.05037
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In Proc IEEE Conf Comput Vis Pattern Recognit pp 770–778
Hoeting JA, Madigan D, Raftery AE, Volinsky CT (1999) Bayesian model averaging: a tutorial. Stat Sci pp 382–401
Huang G, Liu Z, Pleiss G, Van Der Maaten L, Weinberger K (2019) Convolutional networks with dense connectivity. IEEE Trans Pattern Anal Mach Intell
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In Proc IEEE Conf Comput Vis Pattern Recognit pp 4700–4708
Jaiswal A, Gianchandani N, Singh D, Kumar V, Kaur M (2020) Classification of the covid-19 infected patients using densenet201 based deep transfer learning. J Biomol Struct Dyn pp 1–8
Karbhari Y, Basu A, Geem Z-W, Han G-T, Sarkar R (2021) Generation of synthetic chest x-ray images and detection of covid-19: a deep learning based approach. Diagnostics 11(5):895
Article Google Scholar
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Article Google Scholar
Li L, Qin L, Xu Z, Yin Y, Wang X, Kong B, Bai J, Lu Y, Fang Z, Song Q et al (2020) Artificial intelligence distinguishes covid-19 from community acquired pneumonia on chest ct. Radiology
Li Z, Yi Y, Luo X, Xiong N, Liu Y, Li S, Sun R, Wang Y, Hu B, Chen W et al (2020) Development and clinical application of a rapid igm-igg combined antibody test for sars-cov-2 infection diagnosis. J Med Virol 92(9):1518–1524
Article Google Scholar
Mannor S, Peleg D, Rubinstein R (2005) The cross entropy method for classification. In Proceedings of the 22nd international conference on Machine learning pp 561–568
Masood A, Sheng B, Li P, Hou X, Wei X, Qin J, Feng D (2018) Computer-assisted decision support system in pulmonary cancer detection and stage classification on ct images. J Biomed Inform 79:117–128
Article Google Scholar
Ortiz-Ospina E, Roser M, Ritchie H, Hasell J (2020) Coronavirus pandemic (covid-19). Our World in Data. https://ourworldindata.org/coronavirus
Opitz D, Maclin R (1999) Popular ensemble methods: An empirical study. J Artif Intell Res 11:169–198
Article Google Scholar
Panwar H, Gupta P, Siddiqui MK, Morales-Menendez R, Bhardwaj P, Singh V (2020) A deep learning and grad-cam based color visualization approach for fast detection of covid-19 cases using chest x-ray and ct-scan images. Chaos, Solitons Fractals p 110190
Pham H, Olafsson S (2019) Bagged ensembles with tunable parameters. Comput Intell 35(1):184–203
Article MathSciNet Google Scholar
Polikar R (2006) Ensemble based systems in decision making. IEEE Circuits Syst Mag 6(3):21–45
Article Google Scholar
Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33(1–2):1–39
Article Google Scholar
Schapire RE (1990) The strength of weak learnability. Mach Learn 5(2):197–227
Google Scholar
Sen S, Saha S, Chatterjee S, Mirjalili S, Sarkar R (2021) A bi-stage feature selection approach for covid-19 prediction using chest ct images. Appl Intell pp 1–16
Silva P, Luz E, Silva G, Moreira G, Silva R, Lucio D, Menotti D (2020) Covid-19 detection in ct images with deep learning: A voting-based scheme and cross-datasets analysis. Informatics in Medicine Unlocked 20
Article Google Scholar
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Soares E, Angelov P, Biaso S, Froes MH, Abe DK (2020) Sars-cov-2 ct-scan dataset: A large dataset of real patients ct scans for sars-cov-2 identification. medRxiv
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In Proc IEEE Conf Comput Vis Pattern Recognit pp 1–9
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In Proc IEEE Conf Comput Vis Pattern Recognit pp 2818–2826
Tahamtan A, Ardebili A (2020) Real-time rt-pcr in covid-19 detection: issues affecting the results. Expert Rev Mol Diagn 20(5):453–454
Article Google Scholar
Waibel A, Hanazawa T, Hinton G, Shikano K, Lang KJ (1989) Phoneme recognition using time-delay neural networks. IEEE Trans Acoust Speech Signal Process 37(3):328–339
Article Google Scholar
Wang L, Wong A (2020) Covid-net: A tailored deep convolutional neural network design for detection of covid-19 cases from chest x-ray images. arXiv preprint arXiv:2003.09871
Wang Z, Liu Q, Dou Q (2020) Contrastive cross-site learning with redesigned net for covid-19 ct classification. IEEE J Biomed Health Inform 24(10):2806–2813
Article Google Scholar
Wolpert DH (1992) Stacked generalization. Neural Netw 5(2):241–259
Article Google Scholar
Yamaguchi K, Sakamoto K, Akabane T, Fujimoto Y (1990) A neural network for speaker-independent isolated word recognition. In First International Conference on Spoken Language Processing
Yang Y, Yang M, Shen C, Wang F, Yuan J, Li J, Zhang M, Wang Z, Xing L, Wei J et al (2020) Laboratory diagnosis and monitoring the viral shedding of 2019-ncov infections. MedRxiv
YAP JCH, ANG IYH, TAN SHX, Jacinta I, Pei C, LEWIS RF, Qian Y, YAP RKS, NG BXY, TAN HY (2020) Covid-19 science report: diagnostics. NUS Libraries
Zhang J, Chu Y, Zhao N (2020) Supervised framework for covid-19 classification and lesion localization from chest ct. The Ethiopian Journal of Health Development (EJHD) 34(4)
Zhang J, Xie Y, Li Y, Shen C, Xia Y (2020) Covid-19 screening on chest x-ray images using deep learning based anomaly detection. arXiv preprint arXiv:2003.12338
Zhang W, Itoh K, Tanida J, Ichioka Y (1990) Parallel distributed processing model with local space-invariant interconnections and its optical architecture. Appl Opt 29(32):4790–4797
Article Google Scholar

Download references

Acknowledgements

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Ali Ahmadian was financially supported by the Ministry of Higher Education, Malaysia, under the Fundamental Research Grant Scheme (FRGS), reference no. FRGS/1/2018/STG06/UPM/02/6.

Author information

Authors and Affiliations

Department of Electrical Engineering, Jadavpur University, Kolkata, 700032, India
Rohit Kundu
Department of Information Technology, Jadavpur University, Kolkata, 700106, India
Pawan Kumar Singh
Department of Law, Economics and Human Sciences & Decisions Lab, Mediterranea University of Reggio Calabria, Reggio Calabria, 89125, Italy
Massimiliano Ferrara
ICRIOS - The Invernizzi Centre for Research in Innovation, Organization, Strategy and Entrepreneurship, Bocconi University - Department of Management and Technology, Via Sarfatti 25, Milano, 20136, MI, Italy
Massimiliano Ferrara
Institute of IR 4.0, The National University of Malaysia, Bangi, 43600 UKM, Selangor, Malaysia
Ali Ahmadian
Department of Mathematics, Near East University, Nicosia, TRNC, Mersin 10, Turkey
Ali Ahmadian
Institute for Mathematical Research, Universiti Putra Malaysia, Seri Kembangan, Selangor, 43400 UPM, Malaysia
Ali Ahmadian
Department of Computer Science & Engineering, Jadavpur University, Kolkata, 700032, India
Ram Sarkar

Corresponding author

Correspondence to Ali Ahmadian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Kundu, R., Singh, P.K., Ferrara, M. et al. ET-NET: an ensemble of transfer learning models for prediction of COVID-19 infection through chest CT-scan images. Multimed Tools Appl 81, 31–50 (2022). https://doi.org/10.1007/s11042-021-11319-8

Received: 10 November 2020
Revised: 07 July 2021
Accepted: 21 July 2021
Published: 31 August 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s11042-021-11319-8

1 Introduction