1 Introduction

According to current knowledge, SARS-CoV-2, the virus that causes COVID-19, originated in bats and was transmitted to humans in December 2019. The epidemic that went on to affect the whole world began at the Huanan Seafood Market in Wuhan. The World Health Organization (WHO) documented COVID-19 symptoms such as fever and coughing [1]. The outbreak profoundly affected people's minds and lives. As time progressed, new vaccines were discovered and new prevention methods were implemented. Preventative measures included thermal screening, social distancing, self-quarantine, antigen testing, and RT-PCR testing. Diagnosis beyond these tests relied on images from CT scans, X-rays, and ultrasounds. Many restrictions exist on how medical data may be used because of Protected Health Information (PHI) and Personally Identifiable Information (PII) regulations. As a result, there is a shortage of high-quality datasets.

This problem was partly resolved by constructing datasets from open-source information and documents readily accessible on the internet. In preliminary experiments, pure image processing methods were applied to the collected images to enhance texture and features so that they could serve as material for further studies. Before these imaging techniques were applied, images were processed using histogram equalization, morphological operations, image segmentation, region-of-interest (ROI) extraction, mean equalization, image fill operations, and image enhancement. Other techniques, such as edge and contour detection, were also used to prepare the images for processing [2, 3]. These methods were used to estimate infection rates, but as the technology sector progressed, more robust and dependable approaches, such as artificial intelligence-assisted disease diagnosis systems, came into existence. Machine learning and artificial intelligence have displaced traditional rule-based systems in favor of mathematical modeling and design concepts. These models serve as a way of training machines to perform a particular activity, hence the designation machine learning.

However, ML carries the overhead of feature engineering, often known as feature selection. This is the most crucial and time-consuming step in the whole procedure: one must first identify the dependent and independent characteristics and then choose the most dominant features that will influence the model learning process. As a result, the best-matched features for a problem statement are not always discovered immediately. To cope with the burden of feature engineering, more complex ideas were developed under the name deep learning (DL). DL is built from neural networks (NNs), networks that attempt to simulate the human brain. NNs may be considered a subset of machine learning, although they differ in processing and architecture; features in an NN are learned through the forward and backward passes rather than selected by hand. In recent years, deep neural networks (DNNs) have become increasingly popular for classification problems. These models may be trained in two ways: the traditional technique, in which pre-processed data are fed into the network and training proceeds from scratch, and transfer learning, in which a model trained on one dataset is reused and re-trained on another. Transfer learning helps in dealing with insufficient data and may also speed up training, resulting in better outcomes with fewer epochs. Over-fitting is possible when using such models, but it is also controllable [4].

Early detection of the illness is critical for isolating positive individuals and preventing the disease from spreading across the population [5]. X-rays and CT scans are commonly used to assess the severity of infection in the lung region because it is the primary site of infection [6, 7]. X-ray imaging is commonly used to diagnose COVID-19 because of its widespread availability, rapid processing time, and reasonable cost. CT imaging, on the other hand, is favored because it provides extensive information about the affected area [8].

As a result of an incomplete understanding of the illness, even expert radiologists have found it challenging to predict infection from medical images. Using medical images in conjunction with deep learning algorithms has been a beneficial option for diagnosing COVID-19, resulting in quicker and more accurate findings [9, 10]. The primary purpose of a CNN here is to identify COVID-19 from medical images [11]. COVID-19 image classification has been suggested by several authors using ML-based algorithms. Decision trees, random forests, SVMs, and ensemble-based classifiers were among those employed; these systems required processed images as input for learning [12]. Nowadays, the standard approach to COVID-19 categorization is to apply transfer learning, altering the hyper-parameters and allowing the model to pick up new features from the dataset. Publicly accessible datasets such as COVID-CT [13] and SARS-COV-2 CT [8] have been utilized in studies dealing with COVID-19.

Several studies have indicated that DNNs and CNNs classify diseases in medical image datasets better than other methods [14, 15]. It has also been observed that most authors present methods that are effective for only a single type of image collection for categorizing COVID-19, whether CT-scan, ultrasound, or X-ray images. In this investigation, we address this gap by developing a system that functions for both CT-scan and X-ray images. We have not included ultrasound data since it is available as video rather than still images; instead, we have utilized an extended COVID-19 dataset that contains both CT-scan and X-ray images. The main contributions of this work are as follows:

  • Designed a system that works for both image types, i.e., CT and X-ray.

  • Developed a cloud-based strategy to offer a system capable of handling large amounts of traffic and scaling appropriately.

  • Devised an architecture that classifies COVID-19 by utilizing the best existing systems proposed by different authors.

  • Provided a common solution for COVID-19 classification.

  • Created a new model (our base model) based on the Inception architecture.

  • Trained the model on an extended COVID-19 dataset.

This work aims to present a common cloud-based architecture capable of classifying COVID-19 in both types of images. It automatically scales as the volume of traffic rises, resulting in a dependable system that can serve millions of people with little downtime and latency. The primary goal is thus to overcome the limitation of existing systems, which are restricted to a single type of image. Our literature review found that existing works propose architectures and report performance metrics after training on either CT or X-ray images, showing that some models work well on CT scans and others on X-ray images. However, none demonstrates an architecture that works for both image types while providing better results. We have tried to eliminate this difficulty in our work.

In addition, we have proposed an architecture with an interface, which can be developed using any frontend framework, such as ReactJS or Angular, or using standard HTML, CSS, and JavaScript. This interface is connected to our base model, built on top of the Inception architecture and deployed to the cloud for computation. We trained our base model on the extended COVID-19 dataset with early stopping and a new classifier layer. The base model classifies an image as CT or X-ray, and based on its output, we transmit the image to the best available model for COVID-19 classification. Various performance measures are used to evaluate the proposed model.

The rest of the paper is organized as follows: Sect. 2 reviews COVID-19 categorization in CT-scan and X-ray images using deep learning models. The proposed architecture is described in Sect. 3. Section 4 summarizes our outcomes, and Sect. 5 concludes.

2 Related Work

Automatic COVID-19 image classification started at the beginning of 2020, once many diagnosis reports became available. Only limited samples and images were available to researchers at the initial stage. Nowadays, researchers apply various models for automatic COVID-19 detection, but at the beginning, CNN-based models such as VGG 16 [16], ResNet [17], and Inception-ResNet V2 [18] were built for COVID-19 image classification. These models are based on transfer learning and can be used to train new models. Both CT-scan and X-ray images are used for automatic COVID-19 detection, and models can be categorized according to image type. Researchers use X-ray images to determine the absence or presence of COVID-19 [19], while CT-scan images may be used to detect both the presence of COVID-19 and the severity of the infection. Inf-Net [20] is used to segment COVID-19 lung infection in CT-scan images; it consists of a parallel partial decoder, edge attention modules, a global map, and semi-supervised segmentation, and the problem of a smaller dataset is addressed using a noise-resistant Dice loss during training.

In addition, a multitasking model [21] for classifying CT-scan images based on segmentation was presented. This model can identify the infected lung area in CT-scan images; the researchers segmented the features and classified the lesion sites using a standard encoder. Zhou et al. [22] proposed automatically segmenting COVID-19 infection in CT-scan images, successfully separating diseased regions with symmetric features of the lungs and tissues. MSD-Net [23] was proposed as an alternative for multi-class segmentation of CT-scan images; it uses a pyramidal convolutional block with variable-size kernels, together with channel attention blocks. A weakly supervised deep learning framework has also been suggested to find lesion locations in CT-scan images, built around a pre-trained U-Net model. DeepCov-Net is a three-dimensional (3D) deep neural network that performs categorization using the segmented image as input; it also uses U-Net and residual blocks [24].

Wang et al. [25] proposed a deep learning-based approach for categorizing CT-scan images; excellent accuracy was observed using the GoogleNet InceptionV3 architecture. Xu et al. [26] proposed a 3D-CNN-based approach to isolate infected areas, in which the ResNet-18 model extracts features and multiple classification algorithms classify the images. A variety of deep learning and transfer learning algorithms have been used to identify COVID-19 in X-ray images [27], including Xception-based methods [28] that classify images with pre-trained models. A different feature extraction method has been proposed by Abraham et al. [29]: pre-trained CNN models extract characteristics, a correlation-based technique selects them, and a Bayes net classifier performs the classification. FractalCovNet, trained on chest X-ray images, is used in the suggested technique and outperforms ImageNet-pretrained models.

COVID-19 in X-ray images was first detected using a sequential CNN model by Haque et al. [30]; the authors report that this method outperforms the alternatives then in use. COVID-GAN [31] is used to generate synthetic X-ray images and reaches a 95% accuracy rate. For the classification of COVID-19 X-ray images, the COVIDNet [32] model was presented; it was pre-trained on the ImageNet dataset before being fine-tuned for classification. Jin et al. [33] propose an AI-based method for detecting COVID-19 patients, and CT-scan images of COVID-19 patients show the infected regions [34]. This AI-based method combines human expertise with the model's output, which demands additional time from medical personnel.

2.1 X-Ray-Based COVID-19 Recognition

Many studies have utilized X-ray images for training deep learning models to detect COVID-19 because X-ray datasets are wider than those of other medical imaging modalities, making them a good choice for COVID-19 detection. Examples of normal and COVID-19-positive X-ray images are shown in Fig. 1. This subsection addresses deep learning approaches in newly suggested systems that identify COVID-19 from X-rays.

Fig. 1 Sample X-ray images

The pre-trained CNN model DenseNet121 was used by Arellano et al. [35] to identify COVID-19 in chest radiographs from available datasets; an accuracy of 94.7% was achieved since the network had previously been trained to identify various lung diseases. A three-stage approach has been used to identify COVID-19 and pneumonia [36]. In the first stage, a C-GAN network segments the lung regions from CXR images. In the second stage, features are extracted from the segmented lung images using classic feature extraction methods and deep neural networks. The retrieved characteristics are then used to categorize the CXR images with several machine-learning classifiers; in conjunction with VGG-19, this achieved the maximum classification accuracy of 96.6%.

Another COVID-19 detection method, named COVID-CheXNet, has been proposed by Waisy et al. [37]. It uses ResNet34 and HRNet, achieves an accuracy of 99.9%, and gives superior results over other methods. Wang et al. [32] proposed a DCNN model to detect COVID-19 from CXR images; this was the first open-source design for COVID-19 detection. The authors also introduced an open-source dataset, COVIDx, assembled from five separate open-source databases, and evaluated it using the VGG-19 and ResNet50 architectures. Yang et al. [38] studied and evaluated multiple deep learning-enhanced approaches for identifying COVID-19 in medical images.

Six modified pre-trained models were evaluated in a study published by Ahsan et al. [39]; for identifying COVID-19 patients, VGG-16 and MobileNet V2 reached accuracy levels of up to 100%. A Fourier-Bessel series expansion-based dyadic decomposition was developed by Chaudhary et al. [40], in which deep features are retrieved from each sub-band image using the ResNet50 model. With the collected characteristics, the softmax classifier distinguished pneumonia caused by COVID-19 from other types of pneumonia with a 98.66% accuracy rate.

According to the findings of Shamsi et al. [41], transfer learning-based uncertainty-aware algorithms can identify COVID-19-infected patients from X-ray and CT images. Pre-trained models extract features, which are then fed into deep learning models for classification; the ResNet 50 model combined with an SVM classifier yielded the highest results. Researchers in [42] constructed a multi-input deep convolutional attention network to process 3D CT and 2D X-ray images simultaneously; the model's accuracy improves to 98% after adding a convolutional block attention module (CBAM). Ouchicha et al. [43] proposed CVDNet to recognize COVID-19 in X-ray images. To capture both local and global features of the data, it uses parallel columns with the same structure but varied kernel sizes; an accuracy of 96.69% was achieved by concatenating the outputs of the two columns.

Sarki et al. [44] use transfer learning-based VGG-16, Xception, and InceptionV3 models to classify images into two and three classes. The first scenario considers only two classes (normal, COVID-19); the second classifies images into three classes (normal, COVID-19, pneumonia). The authors also built a five-layer CNN architecture to categorize images and achieved good results, using publicly available image datasets for training and validation. They reached accuracies of 100% and 87.50% for the binary and three-class settings, respectively. A summary of COVID-19 detection using X-ray images is presented in Table 1.

Table 1 A comparative review of COVID-19 detection using X-ray images

2.2 COVID-19 Detection Using CT Images

CT scanning is a beneficial technique for the primary recognition of COVID-19. It is chosen because it offers a 3-D image of the lung containing extensive information about the affected area. Figure 2 shows example CT-scan images. Several newly suggested transfer learning algorithms that use CT scans as the imaging modality are briefly covered in this subsection.

Fig. 2 Sample CT-scan images

COVID-AL was suggested by Wu et al. [45] to diagnose COVID-19 from CT scans, with the lung area segmented using a pre-trained 2D U-Net. The network achieved 95% accuracy with just 30% of the data labeled, reducing the cost of human labeling. COVIDNet-CT was developed by Gunraj et al. [46] to identify COVID-19 from CT images using a deep CNN; an algorithmic design exploration technique automatically identified the ideal architecture, and an accuracy of 99.1% was achieved with minimal computational complexity. CovTANet is a hybrid neural network suggested by Mahmud et al. [47]; its accuracy for severity prediction was 95.8% when used in conjunction with segmentation.

Using 3D CT images, Wang et al. [48] demonstrated DeCovNet for lesion localization and COVID-19 identification; without requiring the COVID-19 lesions in CT images to be labeled, the suggested model achieved a validation accuracy of 90.1%. Ten standard CNN models were evaluated on a proposed AI-based CAD system to detect COVID-19 from CT scans [49], with results showing that the models performed well; the best result was achieved by ResNet-101, with an accuracy of 99.51%. Li et al. [50] suggested a transfer learning technique for training a model with restricted CT data; this strategy used CheXNet and obtained an accuracy of 87% in severity evaluation.

Shah et al. [51] suggested an approach based on the COVID-CT dataset. They use the VGG-19 architecture with pre-trained ImageNet weights for the model's primary layers. The classifier layer uses dropout with a rate of 0.3, and the final layer consists of a single neuron with a sigmoid activation function for binary classification. In the reported comparison, this VGG-19 model outperformed the other proposed methods, including the ResNet50, CTnet-10, DenseNet-169, and InceptionV3 models, with an accuracy of 94.5%.
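For concreteness, the following is a minimal Keras sketch of the kind of classifier head described above: a frozen VGG-19 base with ImageNet weights, dropout of 0.3, and a single sigmoid output. The input size and the hidden Dense width are our assumptions, not values taken from [51].

```python
# Hedged sketch of a VGG-19 binary classifier in the style of Shah et al. [51].
# Only the 0.3 dropout rate and single sigmoid neuron come from the text;
# the 224x224 input and Dense width of 256 are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.VGG19(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # transfer learning: freeze the pre-trained layers

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),   # illustrative width
    layers.Dropout(0.3),                    # dropout rate stated in [51]
    layers.Dense(1, activation="sigmoid"),  # binary COVID / non-COVID output
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```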

The article [52] recommends a DCNN model named ReCOV-101 to identify COVID-19 from CT-scan data, with ResNet-101 acting as the system's backbone. The dataset is expanded via data augmentation and transfer learning, and the model is improved using skip connections. The overall percentage of correct predictions is 94.9%. The effectiveness of a deep learning system for screening COVID-19 from CT scans was investigated by Wang et al. [25], achieving 85.2% accuracy. Dubey et al. [4] created a transfer learning-based artificial intelligence model that may assist physicians and other individuals in determining whether a person is suffering from COVID-19. The authors employed a CNN-based VGG-19 model with the open-source COVID-CT dataset and attained a 95% accuracy rate. Table 2 summarizes various authors' previous research on CT images.

Table 2 A comparative review of COVID-19 detection using CT images

Although there have been several ways to detect COVID-19 in chest X-ray and CT-scan images, many techniques rely on models that have already been trained. This study proposes a common cloud-based architecture capable of identifying COVID-19 in both image types while also being simple to deploy. Because it is designed to grow automatically as traffic increases, it yields a reliable system that serves millions of users with little latency and downtime. Overcoming the limits of present systems, which are confined to a particular kind of image, such as CT or X-ray, is thus the first step toward solving the problem. When we receive a CT or X-ray image, we first classify its type using our base model and then, depending on the output, submit it to the best models available in the literature for COVID-19 categorization.

2.3 COVID-19 Detection Using CT and X-ray Images

One of the significant roadblocks to the investigation of COVID-19 AI-based solutions is the unavailability of a publicly available dataset containing both CXR and CT-scan images. Maghdid et al. [62] attempted to solve this issue by presenting a large dataset consisting of both types of images. In its initial form, the dataset included 170 CXR images and 361 CT-scan images. They also demonstrated results on the CXR and CT-scan images using a basic CNN and a tweaked pre-trained AlexNet model; the first model obtained an accuracy of up to 94.1% on the experimental data, while the second reached an accuracy of up to 98%.

For COVID-19 classification, Ravi et al. [60] present a large-scale learning strategy that utilizes stacked ensemble meta-classifiers and a deep learning-based feature fusion method. Features were taken from the global average pooling of EfficientNet-based pre-trained models, and their dimensionality was lowered using kernel principal component analysis. Afterward, a feature fusion method combined the different retrieved features, and a stacked ensemble meta-classifier performed the classification in two stages: a random forest and a support vector machine (SVM) produced first-stage predictions, which were aggregated and fed to a logistic regression classifier that labels each CT or CXR sample as COVID-19 or non-COVID-19. The authors report a classification accuracy of 99%. A modified MobileNet is proposed by Jia et al. [59] for the classification of COVID-19 CXR images, and a modified ResNet architecture for the classification of CT images. Specifically, the modified CNN approach is meant to overcome the gradient vanishing problem and enhance classification performance by dynamically integrating features from different layers of a CNN. The modified MobileNet is used to classify CXR images into COVID-19, tuberculosis, viral pneumonia (excluding COVID-19), bacterial pneumonia, and normal controls, while the modified ResNet classifies CT scans into COVID-19 infections, non-COVID-19 infections, and normal controls. According to the findings, the suggested approaches obtained a test accuracy of 99.6% on the five-category CXR image dataset and 99.3% on the CT image dataset. Comparative studies use six different advanced CNN architectures and two distinct COVID-19 detection models, COVID-Net and COVIDNet-CT.
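To illustrate the two-stage stacking idea described for [60], the following scikit-learn sketch chains kernel PCA with a random forest and an SVM whose predictions feed a logistic regression meta-classifier. Feature extraction from EfficientNet is assumed to have happened upstream, and the file names and hyper-parameters are our assumptions.

```python
# Hedged sketch of a two-stage stacked ensemble in the style of Ravi et al.
# [60]: fused deep features -> kernel PCA -> (random forest, SVM) -> logistic
# regression meta-classifier. All file names and hyper-parameters are
# illustrative, not taken from the paper.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X = np.load("fused_efficientnet_features.npy")  # hypothetical feature file
y = np.load("labels.npy")                        # 1 = COVID-19, 0 = non-COVID

stacked = make_pipeline(
    KernelPCA(n_components=64, kernel="rbf"),     # dimensionality reduction
    StackingClassifier(
        estimators=[("rf", RandomForestClassifier()),
                    ("svm", SVC(probability=True))],  # stage-one learners
        final_estimator=LogisticRegression()))        # stage-two meta-classifier
stacked.fit(X, y)
```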

A Graph Isomorphism Network (GIN) based model called GraphCovidNet was suggested by Saha et al. [61] for detecting COVID-19 in patients' CT scans and CXRs. Following a GIN-based design, their model accepts only graphs as input, so the first step is pre-processing, which transforms the image data into an undirected graph in which only the image edges need to be considered. To gauge how well the model performs, it is tested on several standard datasets: the SARS-COV-2 CT-Scan dataset, the COVID-CT dataset, a combination of the COVID-CT and chest X-ray datasets, the chest X-ray Images (Pneumonia) dataset, and the CMSC-678 ML-Project dataset. The model achieves 99% accuracy across all datasets, and for the binary classification task of recognizing COVID-19 images, its predictions are 100% accurate.

3 Proposed Work

The recommended model is built from different components, as shown in Fig. 3, which gives a complete high-level system design for distinguishing COVID-19 using X-ray and CT images. The system consists of the following components:

  1. User Interface to upload images (Upload UI in Fig. 3)

  2. Base Model tied to an Azure Function (Base Model Image Classifier)

  3. CT model using ResNet-101 [49] tied to another Azure Function

  4. X-ray model using VGG-16 and MobileNet V2 [39] tied to another Azure Function

Fig. 3 Cloud-based proposed architecture

To show the flow through this system, each step is numbered in Fig. 3. The following describes each step.

Step 1. Users upload one type of image from the Upload UI; this may be a CT-scan or an X-ray.

Step 2. Once our first Azure Function, the image classifier function, receives this image, it is sent to our Base Model, which is trained to identify the image type as CT-scan or X-ray. After that, the image is forwarded to other well-trained COVID-19 identification models.

Step 3. As per the output from step 2, a condition is checked.

  • 3.1. If the output type is 'CT', pass this image to the CT image classifier function

  • 3.2. If the output type is 'XRAY', pass this image to the XRAY image classifier function

Step 4. The image received from step 3 is fed to the respective model, which classifies whether the patient is COVID-positive or not and returns this as a response.

Step 5. The response from step 4 is sent to the UI with the outcome and the image, where both can be displayed in report format for better visuals.
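The end-to-end flow of Steps 1-5 can be condensed into one handler, sketched below in Python. In the actual deployment each stage runs inside its own Azure Function; the model file names and class-index conventions here are hypothetical assumptions for illustration.

```python
# Illustrative sketch of the routing flow in Fig. 3 (Steps 1-5). Model file
# names and the convention that class index 0 means 'CT' (base model) or
# 'COVID-positive' (disease models) are assumptions, not from the paper.
import numpy as np
import tensorflow as tf

base_model = tf.keras.models.load_model("base_model.h5")   # CT vs. X-ray
ct_model = tf.keras.models.load_model("ct_model.h5")       # ResNet-101 based
xray_model = tf.keras.models.load_model("xray_model.h5")   # VGG-16/MobileNetV2

def handle_upload(image: np.ndarray) -> dict:
    batch = image[np.newaxis, ...]                          # add batch dimension
    # Step 2: the base model identifies the modality
    modality = "CT" if np.argmax(base_model.predict(batch)) == 0 else "XRAY"
    # Steps 3.1 / 3.2: route to the matching disease classifier
    model = ct_model if modality == "CT" else xray_model
    covid_positive = bool(np.argmax(model.predict(batch)) == 0)  # Step 4
    return {"modality": modality, "covid_positive": covid_positive}  # Step 5
```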

Within the scope of this study, we have integrated two pre-trained networks alongside our suggested base model: ResNet-101 and VGG 16. The classification of CT images is handled by ResNet-101, whereas X-ray image classification is handled by VGG 16. ResNet-101 is a ResNet variant with its own distinctive residual block; it is a deep neural network consisting of 101 layers, beginning with a convolution layer, followed by 33 residual blocks, and finishing with a fully connected layer. Figure 4 illustrates the primary constituents of the residual blocks that make up ResNet-101 [49].
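As a rough illustration of the residual blocks in Fig. 4, the following is a simplified Keras sketch of a standard bottleneck block; batch normalization and other details of the production architecture are omitted for brevity.

```python
# Simplified sketch of a ResNet bottleneck residual block
# (1x1 -> 3x3 -> 1x1 convolutions plus a skip connection).
# ResNet-101 stacks 3 + 4 + 23 + 3 = 33 such blocks after the stem.
# Batch normalization is omitted here for brevity.
from tensorflow.keras import Input, Model, layers

def bottleneck_block(x, filters, stride=1):
    shortcut = x
    y = layers.Conv2D(filters, 1, strides=stride, activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(y)
    y = layers.Conv2D(4 * filters, 1)(y)               # expand to 4x width
    if stride != 1 or shortcut.shape[-1] != 4 * filters:
        shortcut = layers.Conv2D(4 * filters, 1, strides=stride)(shortcut)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))

inp = Input(shape=(56, 56, 64))            # illustrative feature-map shape
out = bottleneck_block(inp, filters=64)
Model(inp, out).summary()
```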

Fig. 4 ResNet-101 architecture [49]

In most situations, a network previously trained on a bigger dataset has already learned a useful feature hierarchy from which to extract features; such a "pre-trained" network performs much better on smaller datasets. The VGG16 architecture is a good illustration of this. Ahsan et al. [39] made certain changes to the VGG 16 fine-tuning sequence, which are illustrated in Fig. 5. They adjusted the VGG 16 model as follows: AveragePooling2D (pool size = (4, 4)) → Flatten → Dense → Dropout(0.5) → Dense (activation = "softmax"), using a batch size of 50, 100 epochs, and a learning rate of 0.001.
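A minimal Keras sketch of this fine-tuning head follows, using the layer sequence and training settings stated above; the intermediate Dense width is our assumption, since the text does not give it.

```python
# Sketch of the VGG 16 fine-tuning head per Ahsan et al. [39]:
# AveragePooling2D(4,4) -> Flatten -> Dense -> Dropout(0.5) -> Dense(softmax),
# batch size 50, 100 epochs, learning rate 0.001. The width of the
# intermediate Dense layer (64) is an assumption.
import tensorflow as tf
from tensorflow.keras import layers, models

vgg = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                  input_shape=(224, 224, 3))
vgg.trainable = False  # reuse the pre-trained convolutional base

model = models.Sequential([
    vgg,
    layers.AveragePooling2D(pool_size=(4, 4)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),      # assumed width
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),    # COVID-19 vs. normal
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy", metrics=["accuracy"])
```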

Fig. 5 VGG 16 architecture [39]

4 Experimental Outcomes

4.1 Computing Resources

The tests were carried out on a 64-bit Windows 10 Pro 20H2 desktop computer with an Intel® i7-4790 processor running at 3.60 GHz and 32 GB of RAM. The machine has Anaconda installed with Python 3.9; other software on the system includes JupyterLab and TensorFlow.

4.2 Experimental Settings

To conduct this research, we used the Inception framework and compared the performance of this design to other architectures available in the literature. The base model was designed with the original pre-trained weights, a required image size of 124 × 124 pixels, and a transfer learning approach. The final layer of the original Inception network was removed and replaced with one we developed specifically for classifying CT or X-ray images according to our requirements. Figure 6 depicts the applied classifier layer.

Fig. 6 Base model classifier architecture
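Since Fig. 6 carries the exact classifier layout, the following is only a minimal Keras sketch of the base-model construction described above: InceptionV3 with pre-trained weights, the original top removed, a 124 × 124 input, and a new classifier head. The widths inside the head are illustrative assumptions.

```python
# Minimal sketch of the base model: InceptionV3 with ImageNet weights,
# include_top=False, a 124x124 input, and a new CT-vs-X-ray classifier head.
# Dense widths are illustrative; the exact head is shown in Fig. 6.
import tensorflow as tf
from tensorflow.keras import layers, models

inception = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=(124, 124, 3))
inception.trainable = False   # transfer learning: reuse pre-trained features

base_model = models.Sequential([
    inception,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),   # illustrative width
    layers.Dropout(0.5),                    # dropout follows each dense layer
    layers.Dense(2, activation="softmax"),  # CT vs. X-ray
])
```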

The experiments are performed on mixed CT-scan and X-ray data. This dataset [53] is 3.72 GB in size and consists of 1.10 GB of CT images and 2.66 GB of X-ray images. Both the CT and X-ray portions contain COVID-positive and COVID-negative images. We mixed the data into train and validation folders for training and validating our base model. In the train folder, there are a total of 8054 CT images and 9544 X-ray images.

Similarly, in the validation folder, there are a total of 6862 CT images and 6800 X-ray images. These images have different dimensions; some are rotated and distorted. By keeping such images in our training and validation data, we wanted our base model to learn to classify them, making it more robust and reliable.
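A short sketch of how the train and validation folders described above can be consumed is given below, assuming a directory layout of train/{CT,XRAY} and validation/{CT,XRAY}; the directory names are our assumption.

```python
# Hedged sketch of loading the mixed CT/X-ray dataset [53], assuming a
# train/{CT,XRAY} and validation/{CT,XRAY} folder layout (names assumed).
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "train", image_size=(124, 124), label_mode="categorical", batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "validation", image_size=(124, 124), label_mode="categorical",
    batch_size=32)
```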

4.3 Evaluation Criteria for Measuring the Model's Effectiveness

The confusion matrix is used to evaluate the various models under consideration. Table 3 depicts a simplified form of a confusion matrix [4], which distinguishes between predicted and actual values.

Table 3 Confusion matrix

Precision (P), recall (R), accuracy, and F-measure are the leading key performance indicators used to compare the performance of various classifiers [4]. Accuracy is the fraction of all predictions that match the true categories.

$$\mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
(1)

Precision is the fraction of the labels the model predicts as positive that are actually positive.

$$P=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
(2)

Recall measures the fraction of actual positive labels that we correctly predicted.

$$R=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
(3)

The F1-score, in turn, is the harmonic mean of the precision and recall scores.

$$F1\,\mathrm{score}=\frac{2\cdot P\cdot R}{P+R}$$
(4)

The ROC curve is a graph that shows how well a classification model performs at all classification thresholds for a given class. The curve plots the true-positive rate against the false-positive rate. Lowering the classification threshold labels more cases as positive, raising both the false-positive and true-positive rates.

The area under the ROC curve (AUC) estimates the area under the complete ROC curve and is defined as

$$\mathrm{AUC}=\frac{R-\frac{\mathrm{FP}}{\mathrm{FP}+\mathrm{TN}}+1}{2}$$
(5)
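To make Eqs. (1)-(5) concrete, the short Python sketch below computes each metric from confusion-matrix counts; the counts themselves are made up purely for illustration.

```python
# Worked sketch of Eqs. (1)-(5) from confusion-matrix counts.
# The sample counts are hypothetical, for illustration only.
TP, TN, FP, FN = 90, 85, 5, 10   # hypothetical counts

accuracy = (TP + TN) / (TP + TN + FP + FN)           # Eq. (1) -> ~0.921
precision = TP / (TP + FP)                           # Eq. (2) -> ~0.947
recall = TP / (TP + FN)                              # Eq. (3) ->  0.900
f1 = 2 * precision * recall / (precision + recall)   # Eq. (4) -> ~0.923
fpr = FP / (FP + TN)                                 # false-positive rate
auc = (recall - fpr + 1) / 2                         # Eq. (5) -> ~0.922
```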

4.4 Results and Discussions

Compared to other authors' base models, our base model, developed on top of the Inception architecture, achieved the highest accuracy, precision, recall, ROC-AUC, specificity, and sensitivity scores without extensive equalization. Our base model has shown 100% accuracy, recall, precision, and F1-score, and an AUC score of 99.6-100%. As far as we know from the currently available literature, no author has presented a model that can categorize image classes with such high accuracy; existing studies instead address models that identify whether or not a person has COVID-19 from images of one particular modality. Comparisons of several CT and X-ray image models are presented in Tables 1 and 2, which include the best models for each image type; we selected the top model from each table to construct a complete system that is more reliable and robust. Based on the literature, the VGG-16 and MobileNet V2 [39] models are utilized to categorize X-ray images, and the ResNet-101 [49] network classifies CT-scan images alongside our proposed base model.

Using dropouts after each dense layer in this classifier, we addressed the issue of over-fitting. We also made other configuration adjustments in the model training process to cope with over-fitting, implementing a custom model checkpoint callback and an early stopping callback. The model checkpoint callback saved the best model during training, i.e., the one with the lowest validation loss across epochs. The early stopping callback monitored the validation loss for four consecutive epochs and stopped training if it was not decreasing. The base model is also compiled with categorical cross-entropy and the Adam optimizer, which aids convergence by reducing the loss and updating the weights according to the loss reduction after each epoch, lowering the model's likelihood of misclassification.

We then trained our base model for up to 200 epochs at a learning rate of 1 × 10⁻³. Thanks to the early stopping callback that prevented over-fitting, our base model converged entirely after 7 epochs, at which point the validation loss had stopped changing. The model history was kept for interpretation and plotted; the accuracy vs. loss plot is shown in Fig. 7.
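Continuing the sketches above (reusing the base_model, train_ds, and val_ds names introduced there), the training configuration just described can be written as follows; the checkpoint file name and restore_best_weights setting are our assumptions.

```python
# Sketch of the training setup described above: Adam at 1e-3, categorical
# cross-entropy, a checkpoint keeping the lowest-validation-loss weights,
# and early stopping with a patience of four epochs. The file name and
# restore_best_weights choice are assumptions.
import tensorflow as tf

base_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="categorical_crossentropy", metrics=["accuracy"])

callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        "best_base_model.h5", monitor="val_loss", save_best_only=True),
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=4, restore_best_weights=True),
]
history = base_model.fit(train_ds, validation_data=val_ds,
                         epochs=200, callbacks=callbacks)
```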

Fig. 7 Model accuracy vs. loss plot

The same experiments were attempted using the VGG19 architecture with the same classifier layer, but the results were poor enough that we decided to stick with the model based on the Inception architecture.

Figure 8 shows the confusion matrix on the validation data. A classification report is created from the information in the confusion matrix and is shown in Fig. 9.

Fig. 8 Confusion matrix on validation data

Fig. 9 Classification report on validation data

The TP, FP, TN, and FN entries of the confusion matrix represent the different outcome types. TP counts the CT images the base model correctly categorized as CT, and TN counts the X-ray images correctly classified as X-ray. FP denotes X-ray images misclassified as CT, while FN denotes CT images misclassified as X-ray. These are the quantities behind the measures described in Sect. 4.3.

A ROC curve constructed for a particular class depicts the performance of a classification model across all classification thresholds. It is constructed from two parameters, the true-positive rate and the false-positive rate. If the classification threshold is lowered, more cases are categorized as positive, raising both the false-positive and true-positive rates. Figure 10 shows the ROC curve and the AUC score for the validation data of our base model.

Fig. 10 ROC and AUC score on validation data

To check the performance of our base model on additional real-world data, we tested it using various datasets [13, 54]. These images were acquired from various online sources and are included in the publicly accessible dataset [54], which comprises images of COVID-19-positive patients. All 900 images in this dataset are chest X-rays; out of these, we chose 167 at random.

Similarly, dataset [13] is a widely used dataset, primarily known as the COVID-CT dataset. It contains a total of 349 CT images from 216 patients, taken from a variety of COVID-19 studies. To determine whether our model would accurately categorize all 349 images, we utilized all of them and examined the results.

As a result of these studies, we discovered that our base model could accurately classify all the images across all the datasets, as shown in Fig. 11. The base model showed 100% accuracy, 100% precision, 100% recall, and a 100% F1-score on both CT and X-ray images when tested on this cross-validation data. Figure 12 illustrates the confusion matrix on the cross dataset, which helps show how these values were produced.

Fig. 11 Classification report on cross dataset

Fig. 12 Confusion matrix on cross dataset

In addition, a comparison of the proposed model to existing approaches is shown in Table 4. For this comparison, we included in our proposed model the best classification methodologies for COVID-19 identification on X-ray and CT-scan images. In light of the comparison, we can assert that the proposed model offers significant improvements over the currently used models.

Table 4 Comparison between the proposed base model and existing approaches (%)

An additional comparison is carried out against the existing DFC method [55], which concatenates the datasets. Within this framework, Saad et al. [55] developed two distinct approaches to COVID-19 detection using CT and X-ray images. The first approach categorizes each dataset independently before using the DFC technique to integrate the features of both datasets; the second concatenates the features extracted from each dataset to produce separate findings. Based on the performance comparison in Table 5, we conclude that the architecture proposed in this study is the most effective method for classifying COVID-19, whether using a CT-scan image or an X-ray image.

Table 5 Comparison of the proposed work on X-ray and CT-scan images

5 Conclusion and Future Work

This paper presents an architecture that can be used as a common platform to classify CT and X-ray images and predict whether a patient is COVID-positive or COVID-negative. Because infection rates rose quickly, people rushed to doctors for diagnosis, putting pressure on doctors and hospitals alike: hospital accommodation is limited, and doctors, being human, cannot deal with such a population size. With this in mind, we designed a scalable system that scales and load-balances as the request volume increases. From the literature, we found that most studies look for solutions that identify COVID-positive and COVID-negative cases using a single image type, whether CT-scan, X-ray, or ultrasound, but none presents a solution capable of dealing with each of them. From the experiments and results, we can say that our system can be used as a primary screening tool alongside conventional methods such as the antigen test and the RT-PCR test. These conventional methods have drawbacks: they require time, human intervention, and in-person meetings, and social distancing can be compromised because sample collection for antigen and RT-PCR tests requires close contact, increasing the chance of transmission. With our approach, a person who has CT or X-ray scans can obtain a preliminary recommendation report and send it to a doctor working from home, reducing points of contact and the time spent traveling to hospitals for report results. Nonetheless, AI-enabled solutions cannot take the place of the tried-and-true procedures for COVID-19 testing. A second opinion from a physician must therefore be sought, since incorrect findings may lead to inappropriate therapies, which, depending on the severity of the situation, may harm an individual's health or even result in death.

For future work, one could incorporate ultrasound images into the existing system. Beyond that, systems could be designed in which doctors provide inputs from which the system learns, akin to reinforcement learning, yielding more precise and specialized systems that benefit society.

6 Code Link

https://github.com/ankitdubey987/Combined-Cloud-based-inference-system-for-the-classification-of-Covid-19-in-CT-Scan-and-X-Ray-Images