1 Introduction

The emergence of COVID-19 has critically affected the social and economic structures of both developing and developed countries since December 2019 [12]. Researchers and health care workers around the globe are trying to understand the etiology of COVID-19 and its effect on quality of life [30].

Computed tomography (CT) scans are emerging as a screening alternative to the conventional reverse transcription-polymerase chain reaction (RT-PCR) test. This is owing to the limited availability and reproducibility of RT-PCR kits and the significant number of false-negative outcomes they produce.

Distinctive CT scans having patchy ground-glass opacities are important biomarkers that can aid in speedy detection and isolation of the subject [1, 16, 31].

As cases are increasing at an alarming rate, an automated COVID-19 detection system is urgently needed. Such a system can assist in faster virus detection at different stages, thereby relieving healthcare professionals of the manual annotation task.

Several artificial intelligence (AI) based methods have been developed that automatically diagnose whether a CT image is COVID +ve or not [3, 4, 8, 19, 20, 21, 29, 32, 34].

Soares et al. [28] built a publicly available CT scan dataset for severe acute respiratory syndrome coronavirus 2 (SARS-COV-2) comprising a total of 2482 images. The authors proposed a non-iterative algorithm named eXplainable deep network (x-DNN), based on recursive calculation, for categorizing the data as non-COVID or COVID +ve. They attained a promising F1 value of 97.31% on the validation data.

Nour et al. [21] developed a diagnosis model for SARS-COV-2 infection detection using deep discriminative features and a Bayesian algorithm. The authors used a model with five convolutional layers as a deep feature extractor; thereafter, the features were fed to standard K-nearest neighbor (KNN), support vector machine (SVM), and decision tree classifiers whose hyperparameters were fine-tuned with a Bayesian optimization procedure. The authors validated their method on a publicly available X-ray image dataset and concluded that the SVM algorithm provided the best prediction, with an accuracy of 98.97% under a data partitioning ratio of 7:3.

He et al. [11] introduced a dataset of CT scan images with hundreds of COVID +ve images freely available for research. The authors also developed the Self-Trans method integrating self-supervised contrastive learning with the transfer learning-based mechanism for the separation of infected scans from normal ones. The researchers attained an F1-value of 0.85 using a split proportion of 0.6, 0.25, and 0.15 for training, testing, and validation.

Saygılı [25] proposed an automated system for separating COVID +ve CT scans from non-infected scan images. The pipeline consists of dataset acquisition; image pre-processing, which includes rgb2gray transformation, image resizing, and image sharpening; feature extraction using Local Binary Patterns and Histograms of Oriented Gradients; feature reduction using Principal Component Analysis; and classification using standard machine learning (ML) algorithms. An accuracy of 98.11% was achieved on the dataset provided by Soares et al. [28] using the handcrafted features with a tenfold data partitioning scheme.

Kaur and Gandhi [13] designed a method for COVID detection based upon the concatenation of deep features from learned ResNet50 and MobileNetv2 models. The deep features were obtained by taking activations from the ‘avg_pool’ and ‘Conv1’ layers of the learned models. Thereafter, the features were concatenated and given as input to an SVM for classification. Despite its high dimensionality of 63,232, the feature fusion approach yielded a validation accuracy of 98.35% on the benchmark COVID CT dataset.

In another work, Kaur et al. [15] developed a diagnosis scheme for COVID-19 signature detection using deep features and a Parameter Free-BAT (PF-BAT) enhanced Fuzzy-KNN (FKNN). First, the pre-trained MobileNetv2 model was fine-tuned on chest CT scans. Thereafter, the features were extracted by taking activations from the fully connected layer of the fine-tuned model. The features, along with the corresponding labels, were fed to the FKNN classifier, whose hyperparameters, i.e., the number of nearest neighbors (‘k’) and the fuzzy strength measure (‘m’), were fine-tuned via PF-BAT. Experiments on the dataset by Soares et al. [28] revealed that the system achieved an average accuracy of 99.18%.

Goel et al. [7] designed an automated method for SARS-COV-2 detection using a framework that employs a Generative Adversarial Network (GAN) for augmentation, Whale optimization for hyperparameter tuning of the GAN, and a transfer learned Inception V3 model for classification. The researchers achieved a prediction accuracy of 99.22% on the benchmark CT scan dataset using a 7:3 train-test split.

Sen et al. [26] developed a feature selection approach for COVID-19 signature detection from lung CT scans. The detection framework uses a Convolutional Neural Network (CNN) architecture as a deep feature extractor. Thereafter, feature selection was done in two stages: first, a filter-based method was used to rank the features, and then the Dragonfly optimization algorithm was applied over the ranked features. The selected features were then used by an SVM for classification. The prediction rates were 90% and 98.39% on the benchmark CT scan image datasets.

A survey of the recent literature reveals that the use of CT images for COVID-19 detection is evolving rapidly owing to the shortage and the limited sensitivity of RT-PCR detection kits [17]. Furthermore, transfer learning has proven to be quite advantageous for automating and expediting the screening of distinctive CT scans, especially when the dataset is small [13, 15]. Additionally, to allow the performance of different algorithms to be compared on a common benchmark CT scan dataset, a public database was provided by Soares et al. [28]. The survey also highlights that the prediction accuracy on this dataset is still limited, and there is scope to improve it further via effective mathematical models.

With the aim of improving the prediction results, variants of the ResNet model are explored for detection. ResNet model variants are selected because of their proven better performance on the SARS-COV-2 detection task using CT scans [19].

Additionally, the potential of the learned image features from the transfer learned model is investigated for COVID-19 detection when combined with established ML algorithms (SVM, KNN, and Logistic Regression (LR)). A classification fusion mechanism is also proposed that combines the predictions from the different ML algorithms via majority voting. The main contributions of the proposed work are summarized as follows:

  1. A comparative study is performed with various deep CNN (DCNN) architectures, namely ResNet18, ResNet50, and ResNet101, for COVID-19 detection using the transfer learning concept.

  2. Results revealed that the transfer learning-based deep ResNet50 model exhibited the best performance among the residual network variants.

  3. The potential of the image attributes extracted from the various layers of the transfer learned model is also investigated by combining them with well-established ML algorithms like SVM, KNN, and LR.

  4. A classification fusion scheme is also proposed that combines the predictions from the different classifiers via majority voting to further boost the classification performance.

In this present work, Sect. 2 provides the materials and methods, followed by Sect. 3, which describes the experimental results. Discussion and conclusions are illustrated in Sects. 4 and 5, respectively.

2 Materials and Methods

2.1 COVID Dataset Depiction

The proposed method has been analyzed on the publicly available benchmark COVID CT scan dataset provided by Soares et al. [28]. The statistics relating to the data are provided in Table 1. Sample CT scans are shown in Fig. 1.

Table 1 Train validation splits
Fig. 1 CT scans from the dataset: COVID +ve (upper row) and non-infected by SARS-COV-2 (lower row)

2.2 Performance Measures

In the present paper, Accuracy, Recall, Precision, F-score, and Area under the Curve (AUC) are selected to quantitatively validate the competence of the designed method.
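As a minimal sketch of how these measures are commonly computed for a two-class problem, the snippet below derives accuracy, precision, recall, and F-score from confusion-matrix counts, and AUC from the pairwise ranking of scores. The function names and the toy counts are illustrative assumptions and are not taken from the paper's tables.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy counts and scores (hypothetical, for illustration only).
m = classification_metrics(tp=8, fp=2, fn=1, tn=9)
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form of AUC is quadratic in the number of samples, which is acceptable for a small validation set; a sort-based computation would be the usual choice for larger data.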

2.3 Transfer Learning

Transfer learning is an effective mechanism that makes use of pre-trained deep models for image classification under limited-data scenarios. In the present work, variants of a pre-trained ResNet model are used [11]. ResNet models follow a feed-forward neural network architecture with “shortcut connections” to train CNNs [10, 24]. Gradients can propagate easily through these “shortcut connections”, which makes training faster. Specifically, ResNet18 [24], ResNet50 [10], and ResNet101 [10] with transfer learning are used in this work, as they have exhibited better performance than other computational models [19]. Only the last three layers of each model variant are fine-tuned to accommodate the new image categories, as shown in Fig. 2 [33].
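To illustrate the idea of a shortcut connection, the following is a simplified sketch of a residual block in which the input is added back to the output of two stacked layers. It uses small dense layers instead of the convolutional layers of an actual ResNet, and the names `residual_block`, `W1`, and `W2` are assumptions made for this illustration.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """Compute relu(F(x) + x), where F is two dense layers with a ReLU between.

    The '+ x' term is the shortcut connection: it passes the input (and,
    during training, the gradient) around the stacked layers unchanged.
    """
    hidden = relu(matvec(W1, x))
    residual = matvec(W2, hidden)
    return relu([r + xi for r, xi in zip(residual, x)])


# With all-zero weights the block reduces to the identity for a
# non-negative input, which is what makes very deep stacks easy to train.
x = [1.0, 2.0, 0.5]
zeros = [[0.0] * 3 for _ in range(3)]
y = residual_block(x, zeros, zeros)
```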

Fig. 2 Visualization for transfer learning using pre-trained models [14]

Fig. 3 Proposed methodology

2.4 Feature Classification Using Conventional Machine Learning Algorithms

Apart from using the transfer learned model for classification, the potential of the learned image features is also explored for COVID-19 detection. The network activations from different layers of the learned model are extracted and fed to well-established classifiers, namely SVM, KNN, and LR. The following subsections briefly outline the mathematical formulation of SVM, KNN, and LR.

2.4.1 Support Vector Machine (SVM)

SVM is among the most popular supervised ML algorithms and functions by constructing an optimal hyperplane [9]. Let M training examples \((p_{i}, z_{i})\) be represented in an N-dimensional sample space, where \(p_{i}\) is an example pattern and \(z_{i} \in \{ - 1,1\}\) is its label. Let the kernel value matrix be represented by K, and let \(\alpha_{i}\) be the Lagrange coefficients to be calculated via an optimization procedure. Solving the quadratic programming problem given below results in an optimal separating hyperplane:

$$ \max_{\alpha} \,W(\alpha ) = - \frac{1}{2}\sum\limits_{i = 1}^{M} \sum\limits_{j = 1}^{M} \alpha_{i} \alpha_{j} z_{i} z_{j} K(p_{i}, p_{j}) + \sum\limits_{i = 1}^{M} \alpha_{i} $$
(1)

The problem given above is subject to the constraints \(0 \le \alpha_{i} \le C,\forall i\) and \(\sum\nolimits_{i = 1}^{M} {\alpha_{i} z_{i} = 0}\).
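As a small worked example of Eq. (1), the sketch below evaluates the dual objective and checks the two constraints for a toy problem with two training points and a linear kernel. The helper names (`dual_objective`, `is_feasible`) and the toy data are assumptions for illustration; in practice the coefficients are found by a quadratic programming solver rather than by hand.

```python
def linear_kernel(p, q):
    """K(p, q) as a plain dot product."""
    return sum(a * b for a, b in zip(p, q))


def dual_objective(alpha, z, P, kernel=linear_kernel):
    """W(alpha) from Eq. (1) for patterns P with labels z."""
    M = len(alpha)
    quad = sum(
        alpha[i] * alpha[j] * z[i] * z[j] * kernel(P[i], P[j])
        for i in range(M)
        for j in range(M)
    )
    return -0.5 * quad + sum(alpha)


def is_feasible(alpha, z, C, tol=1e-9):
    """Check 0 <= alpha_i <= C and sum_i alpha_i * z_i = 0."""
    in_box = all(-tol <= a <= C + tol for a in alpha)
    balanced = abs(sum(a * zi for a, zi in zip(alpha, z))) <= tol
    return in_box and balanced


# Two points on opposite sides of the origin (toy data).
P = [[1.0, 0.0], [-1.0, 0.0]]
z = [1, -1]

# For alpha = (a, a) the objective is -2a^2 + 2a, maximized at a = 0.5.
best = dual_objective([0.5, 0.5], z, P)
lower = dual_objective([0.4, 0.4], z, P)
higher = dual_objective([0.6, 0.6], z, P)
```

Here the constraint \(\sum_{i} \alpha_{i} z_{i} = 0\) forces the two coefficients to be equal, so the problem reduces to a one-dimensional quadratic whose maximum can be verified directly.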

2.4.2 K-Nearest Neighbor (KNN)

KNN is one of the simplest nonparametric pattern recognition techniques [5, 18]. In the KNN algorithm, a sample is assigned the most common label among its k-nearest neighbors. The main advantages of the KNN classifier are its simple implementation and the small number of parameters to tune, i.e., the distance metric and k.

Step-by-step procedure for the KNN algorithm:

  (i) Initialize the number of nearest neighbors (k).

  (ii) Compute the distance between the test image and all the training images. Any distance metric could be used; the Euclidean distance is the most common choice and is given by

    $$ {\text{Distance}}(a,b) = \left\| a - b \right\|_{2} $$
    (2)

    where a and b are two different samples in the feature space.

  (iii) Sort the distances in ascending order and select the k training samples with the smallest distances as the nearest neighbors.

  (iv) Get the corresponding labels of these k training samples.

  (v) Take the majority label among the k-nearest neighbors as the output label.
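The steps above can be sketched in a few lines of Python. This is a minimal illustration on toy two-dimensional points, not the implementation used in the experiments; the function name `knn_predict` and the data are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Step (ii): Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Step (iii): sort in ascending order of distance and keep the k nearest.
    distances.sort(key=lambda pair: pair[0])
    # Step (iv): collect the labels of the k nearest samples.
    k_labels = [label for _, label in distances[:k]]
    # Step (v): return the most frequent label.
    return Counter(k_labels).most_common(1)[0][0]


# Toy feature vectors (hypothetical): two well-separated clusters.
train_X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
train_y = ["non-COVID", "non-COVID", "non-COVID", "COVID", "COVID", "COVID"]

label_a = knn_predict(train_X, train_y, [0.05, 0.05], k=3)
label_b = knn_predict(train_X, train_y, [0.95, 1.0], k=3)
```

Choosing an odd k for a two-class problem avoids tied votes.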

2.4.3 Logistic Regression (LR)

The LR method uses the sigmoid (logistic) function. Consider a two-class problem with \(f_{0}(X), f_{1}(X)\) as the class-conditional densities, \(q_{0}(X), q_{1}(X)\) as the posterior probabilities, and \(p_{0}, p_{1}\) as the prior probabilities. Then, according to Bayes' rule [2],

$$ q_{0} (X) = \frac{{f_{0} (X)p_{0} }}{{f_{0} (X)p_{0} + f_{1} (X)p_{1} }} = \frac{1}{1 + \exp ( - \xi )} $$
(3)

where ξ is defined as

$$ \xi = - \ln \left( {\frac{{f_{1} (X)p_{1} }}{{f_{0} (X)p_{0} }}} \right) = \ln \left( {\frac{{f_{0} (X)p_{0} }}{{f_{1} (X)p_{1} }}} \right) $$
(4)

Alternatively,

$$ \ln \left( {\frac{{f_{0} (X)p_{0} }}{{f_{1} (X)p_{1} }}} \right) = W^{{\text{T}}} X + w_{0} $$
(5)

The above equation holds if \(f_{0}\) and \(f_{1}\) are Gaussian with the same covariance matrix, in which case logistic regression results in an optimal classifier. In logistic regression, the goal is to find \(W\) and \(w_{0}\) that minimize \(\frac{1}{2}\sum\nolimits_{i = 1}^{n} {\left( {h\left( {W^{T} X_{i} + w_{0} } \right) - y_{i} } \right)^{2} }\), where \(h(a) = (1 + \exp ( - a))^{ - 1}\) is the sigmoid or logistic function and \(y_{i} \in \{ 0,1\}\) are the targets.
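The link between Eqs. (3) to (5) can be checked numerically. The sketch below assumes two Gaussian class densities that share an isotropic covariance \(\sigma^{2} I\) (a special case of a shared covariance matrix), for which \(W = (\mu_{0} - \mu_{1})/\sigma^{2}\) and \(w_{0} = (\|\mu_{1}\|^{2} - \|\mu_{0}\|^{2})/(2\sigma^{2}) + \ln (p_{0}/p_{1})\). The means, priors, and function names are illustrative assumptions.

```python
import math


def gaussian_density(x, mean, sigma):
    """Isotropic Gaussian density with covariance sigma^2 * I."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    norm = (2 * math.pi * sigma ** 2) ** (d / 2)
    return math.exp(-sq_dist / (2 * sigma ** 2)) / norm


def posterior_bayes(x, mu0, mu1, sigma, p0, p1):
    """q_0(X) computed directly from Bayes' rule, Eq. (3)."""
    a = gaussian_density(x, mu0, sigma) * p0
    b = gaussian_density(x, mu1, sigma) * p1
    return a / (a + b)


def linear_log_odds_params(mu0, mu1, sigma, p0, p1):
    """W and w_0 such that the log-odds of Eq. (5) equal W^T X + w_0."""
    W = [(m0 - m1) / sigma ** 2 for m0, m1 in zip(mu0, mu1)]
    w0 = (sum(m * m for m in mu1) - sum(m * m for m in mu0)) / (2 * sigma ** 2)
    w0 += math.log(p0 / p1)
    return W, w0


def sigmoid(a):
    """Logistic function h(a)."""
    return 1.0 / (1.0 + math.exp(-a))


# Hypothetical class means, shared spread, and priors.
mu0, mu1, sigma, p0, p1 = [1.0, 0.0], [-1.0, 0.5], 1.5, 0.4, 0.6
W, w0 = linear_log_odds_params(mu0, mu1, sigma, p0, p1)

points = [[0.0, 0.0], [2.0, -1.0], [-0.5, 3.0]]
direct = [posterior_bayes(x, mu0, mu1, sigma, p0, p1) for x in points]
via_sigmoid = [sigmoid(sum(w * xi for w, xi in zip(W, x)) + w0) for x in points]
```

The two lists agree to floating-point precision, which is the statement that the posterior is a sigmoid of a linear function of X when the covariances are equal.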

3 Experimental Results

3.1 Experimental Setup

The experimental settings for the proposed work include using the ‘Adam’ optimizer with an initial learning rate of 0.0006 to minimize the cross-entropy loss. The model variants were trained for 20 epochs with a batch size of 150. The image data were processed on a 1.8 GHz Intel Core i7-4500U CPU with 8 GB RAM, using the MATLAB R2019a platform. Other training options include shuffling the data before each epoch and using an L2 regularizer with a weight decay value of 0.05 to circumvent overfitting. Moreover, the train/validation data splits were used as per the base paper by Soares et al. [28]. Soares et al. [28] provide .mat files for the train-validation splits that do not contain subject-related information, i.e., there is no metadata on the images. The images are designated only by numbers, without any indication of whether the splits are subject-independent or not.

3.2 Results

The experimental results obtained by deploying the learned ResNet model variants for the COVID-19 detection task are given in Table 2. The tabulated entries show that the best results are attained by the ResNet50 model, which achieved a precision of 98.02%, recall of 98.80%, AUC of 0.9994, F1-score of 98.41%, and a validation accuracy of 98.35%. The other models taken for comparison, i.e., ResNet18 and ResNet101, achieved validation accuracies of 97.12% and 96.71%, respectively. The confusion matrices for the three learned variants are given in Fig. 4. The smallest misclassification error is achieved by ResNet50, followed by ResNet18 and then ResNet101. The accuracy/loss versus epoch plot for the best performing model is given in Fig. 5, which indicates that the validation curve closely follows the training curve. The AUC plot for ResNet50 is shown in Fig. 6, which shows that the model attains an AUC of 0.9994 for the true positive rate versus false positive rate curve.

Table 2 Performance comparison of the ResNet model variants over validation data
Fig. 4 Confusion matrix: (a) ResNet18, (b) ResNet50, (c) ResNet101, (d) classification fusion

Fig. 5 Accuracy versus epoch and loss versus epoch plots for the best performing transfer learned model (ResNet50)

Fig. 6 AUC curve for the best performing transfer learned model (ResNet50)

Figure 7 shows the occlusion sensitivity maps for the learned ResNet50 model. These maps indicate which area of the scan is most decisive for classification, i.e., the area whose occlusion results in the maximum drop in the probability score. The regions that contribute positively to the probability score are shown in red.

Fig. 7 Activation maps obtained via transfer learned ResNet50 model
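The idea behind an occlusion sensitivity map can be sketched as follows: a patch of the image is masked, the class score is recomputed, and the drop in score is recorded for that patch. This is a simplified illustration with non-overlapping patches and a toy scoring function standing in for the network; `occlusion_map`, `toy_score`, the patch size, and the fill value are all assumptions and do not reproduce the MATLAB routine used for Fig. 7.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Return a map of score drops obtained by masking one patch at a time."""
    rows, cols = len(image), len(image[0])
    base_score = score_fn(image)
    heat = [[0.0] * cols for _ in range(rows)]
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            # Copy the image and mask the current patch.
            occluded = [row[:] for row in image]
            for i in range(r, min(r + patch, rows)):
                for j in range(c, min(c + patch, cols)):
                    occluded[i][j] = fill
            # A large positive drop means the patch supported the prediction.
            drop = base_score - score_fn(occluded)
            for i in range(r, min(r + patch, rows)):
                for j in range(c, min(c + patch, cols)):
                    heat[i][j] = drop
    return heat


def toy_score(img):
    """Stand-in for a classifier: depends only on the top-left 2x2 region."""
    return (img[0][0] + img[0][1] + img[1][0] + img[1][1]) / 4.0


image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_score, patch=2)
```

In this toy case only the top-left patch receives a non-zero value, since it is the only region the scoring function looks at.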

In the present work, we also investigated the efficacy of the activations from several layers of the transfer learned ResNet50 model for the COVID-19 detection task. These activations are used in conjunction with well-established classifiers such as SVM, KNN, and LR. Beginning from the fully connected layer, activations from specific layers are taken, and the detection results are reported in Table 3. For example, the activations extracted from the layer ‘res5b_branch2c’ proved to be decisive for classification using the SVM and LR classifiers, yielding a value of 99.20% for precision, recall, and F1-score, an accuracy of 99.18%, and an AUC of 0.9997.

Table 3 Performance metrics for transfer learned ResNet50 using activations from specific layers

To further improve the classification performance, a fusion strategy is also proposed as outlined in Fig. 8, where predictions from the three different classifiers are fused according to the majority voting rule. The confusion matrix resulting from the classification fusion is shown in Fig. 4d indicating that the misclassifications are reduced to merely three samples.

Fig. 8 Classification fusion strategy

On fusing the predictions from SVM, KNN, and LR using features from the ‘res5b_branch2c’ layer, a validation accuracy of 99.38% and an F1-score of 99.40% are achieved. The advantage of classification fusion is also evident for the activations from the ‘avg_pool’ layer, where a validation accuracy of 99.38% is achieved using a feature dimension of 2048 rather than a feature space of 100,352.
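A minimal sketch of the majority voting rule is given below. It takes one list of predicted labels per classifier and returns the label chosen by most classifiers for each sample. The function name and the toy predictions are assumptions for illustration and are not outputs of the trained classifiers.

```python
from collections import Counter


def majority_vote(*predictions):
    """Fuse per-sample predictions from several classifiers by majority vote."""
    fused = []
    for votes in zip(*predictions):
        # Most frequent label among the classifiers for this sample.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Hypothetical predictions for four samples (1 = COVID +ve, 0 = non-COVID).
svm_pred = [1, 0, 1, 1]
knn_pred = [1, 1, 0, 1]
lr_pred = [0, 0, 1, 1]

fused = majority_vote(svm_pred, knn_pred, lr_pred)
```

With three classifiers and two classes a tie cannot occur, so each fused label is supported by at least two of the three predictions.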

4 Discussion

The proposed fusion approach has also been compared with other model architectures that have used the same dataset. As apparent from Table 4, the proposed prediction fusion mechanism yields a precision of 99.60%, F-score of 99.40%, recall of 99.20%, and accuracy of 99.38%, which is higher than that of the existing network architectures (detection accuracies of 97.38%, 98.35%, 98.39%, 98.37%, 98.99%, 94.04%, and 92% reported in Kaur and Gandhi [13], Panwar et al. [22], Pathak et al. [23], Silva et al. [27], Soares et al. [28], Fouladi et al. [6], and Sen et al. [26]). In contrast to the works reported in the literature, the proposed prediction fusion mechanism offers some benefits in addition to achieving a promising level of detection performance: (1) It employs a single pre-trained network architecture for COVID-19 classification, unlike the combined use of VGG-16 and xDNN proposed in Soares et al. [28]. (2) It does not employ any optimization procedure for fine-tuning the model hyperparameters, unlike Kaur et al. [15] and Pathak et al. [23], where the authors employed PF-BAT and Memetic Adaptive Differential Evolution optimization for hyperparameter tuning of FKNN and of a deep bidirectional long short-term memory network with a Mixture Density model (DBM), respectively.

Table 4 Comparison of the proposed prediction fusion scheme with the recent state-of-the-art works

5 Conclusion

In the present paper, we have investigated the efficacy of different pre-trained network architectures with transfer learning for COVID-19 detection using a limited CT scan dataset. The investigation reveals that the transfer learned ResNet50 model performed best, achieving an accuracy of 98.35%, which is superior to the other considered models and to existing state-of-the-art works in the literature. Moreover, the potential of the activations from different layers of the learned ResNet50 network is also explored for detection using established ML algorithms. The exploration reveals that the activations from some specific layers of the learned ResNet50 model are quite decisive for classification, yielding an accuracy of 99.18% using the SVM and LR classifiers. A classification fusion strategy is also proposed that further improved the accuracy to 99.38% by combining the predictions from the different classifiers via majority voting.

The proposed automated system can assist the healthcare professionals in rapid detection of the virus at different stages.