1 Introduction

A novel coronavirus (COVID-19) has spread from Wuhan, China, to several other countries since December 2019. Over 73.6 million confirmed cases and more than 1.64 million deaths had been registered worldwide by April 18 [1]. Since no preventive vaccine or specific treatment for COVID-19 is available, early diagnosis is of great value because it allows effective isolation of the patient and decreases the risk of infection for the healthy population. Reverse-transcription polymerase chain reaction (RT-PCR) and gene sequencing of respiratory or blood specimens were implemented as the main COVID-19 screening methods [2]. However, the overall RT-PCR positive rate for throat-swab samples is estimated to be between 30 and 60%, leaving undiagnosed patients who may infect large populations of healthy individuals [3]. Chest X-ray imaging is a simple method for quick diagnosis of the disease, as it is also used for pneumonia diagnosis. Chest computed tomography (CT) scanning has higher diagnostic sensitivity for COVID-19 [4], whereas chest X-ray images provide visual indexes related to COVID-19 [5]. Multilobar involvement and peripheral airspace opacities were reported in chest imaging. The most widely recorded opacities are ground glass (57%) and mixed attenuation (29%) [6]. During the early phase of COVID-19, ground-glass patterns are found in areas on the edge of the pulmonary vessels that are difficult to recognize visually [7]. Asymmetric patchy or diffuse airspace opacities are also noted for COVID-19 [8]. Only specialist radiologists may perceive such subtle anomalies. With the enormous number of suspected cases and few qualified radiologists, efficient and robust automated systems for detecting such diseases will support the diagnosis process and improve early detection rates with high precision. Machine learning (ML)-based automated systems are effective tools for solving such problems.

The Internet of Things (IoT) has developed from the interconnection of embedded computer systems to the interconnection of intelligent sensor devices. However, it opens up problems such as restricted processing capacity and low storage capacity in a smart city environment. Meanwhile, cloud computing provides storage and fast processing. Therefore, IoT-cloud integration is required to deal with highly challenging intelligent health care [9]. Patient surveillance and real-time contact have always been the core concepts of intelligent healthcare systems. However, the need for a cognitive system with IoT-cloud technology that offers patient-centered, high-quality intelligent health care at low cost keeps increasing. With artificial intelligence (AI) and deep learning (DL) techniques, the incorporation of human-like intelligence into intelligent health frameworks is timely.

Recently, IoT and cloud technology have improved significantly and have helped deliver intelligent healthcare services in real time. With IoT-cloud integration, there is a huge demand for a smart and intelligent healthcare system that provides a seamless and rapid response. DL and AI can enhance cognitive behavior and decision-making. Advanced electronic applications and innovations, in addition to smart sensor devices, are available to smart city stakeholders. Nevertheless, it is challenging to find or access medical professionals and hospitals in a smart city environment. Critical patients often need a fast response and urgent attention to save their lives. Hence, data recorded from patients must be transferred and interpreted with minimal delay, and the results must be precise enough to be used by medical professionals for the initial examination. Therefore, an intelligent healthcare system is needed that can solve these challenges using the technology and services available in a smart city environment.

The healthcare industry is also one of the fastest expanding markets with great demand. It offers essential services to patients and contributes significant profits to the health sector. Given these technical advancements, we need a healthcare system with intelligent decision-making capacity. Many researchers have attempted to incorporate cognitive behavior in the development of intelligent IoT frameworks [10, 11]. Because healthcare frameworks are multimodal and involve intelligent decision-making, cognitive behavior is becoming increasingly important [11]. In a smart city model, the smart healthcare system uses IoT sensors attached to or placed around a patient to collect data such as images, gestures, speech, EEG, ECG, and body temperature and assesses the patient's condition. The system evaluates all health factors and decides when an emergency response or advanced medical treatment is needed. It also keeps all interested smart city stakeholders updated about the patient's state and tracking outcomes. However, the idea of smart health care remains incomplete without cognitive functions, and smart city services will never be fully exploited without such knowledge. Researchers are therefore now making substantial efforts in this direction [10]. In [9], the authors outlined the complexities of using smart cloud-connected sensors for smart health care in a smart city environment. Environmental conditions, such as humidity and temperature, must be controlled to ensure the quality of smart health care. Several researchers have used IoT and the cloud to view medical information and track patient status [11].

Patient diagnostics based on CT scans are also commonly studied for smart healthcare applications. Such patients need urgent action in the event of an emergency; any delay in delivering care or the unavailability of qualified doctors can be harmful. A smart healthcare system that tracks the status of patients is therefore critical for these individuals. However, every such system should be intelligent and mature enough to be accurate. Specialized physicians may have access to reports and patient records and may give guidance and recommendations daily. In emergency cases, mobile systems, such as smart ambulances, or mobile support, such as smart clinics, may be dispatched to patients.

To resolve the above concerns, we propose a smart healthcare IoT architecture for the detection of COVID-19. The proposed method classifies patients with COVID-19 versus non-COVID normal subjects using CT scan images acquired and forwarded to the cloud by IoT sensors. These images are sent to the deep learning cognitive module, which analyzes the data in real time and, by performing COVID-19 detection, determines future tasks and courses of action based on the patient's status. The cognitive structure then collects the processed outcome and decides on an emergency response before submitting the findings to the stakeholders involved for further review. The CT scan image is analyzed and labeled as normal or as COVID-19 on the cloud. In the end, medical professionals review the findings and track patients. If the patient needs emergency treatment, the appropriate care can be determined.

Various research studies have proposed automated detection or recognition of COVID-19 cases using chest CT scans or X-ray images, but the results achieved are not up to the mark due to the lack of public image databases of COVID-19 patients. A small dataset containing COVID-19 X-ray images has recently been released, enabling researchers to train automated COVID-19 X-ray diagnostic systems [12]. These images were taken from research publications that presented COVID-19 findings on CT and X-ray images. A board of radiologists analyzed the complete database of images, and only those images that the radiologists labeled as COVID-19 patients were retained. Figure 1 illustrates three images with the marked regions: Fig. 1a, c, e shows images related to COVID-19, and Fig. 1b, d, f shows the regions in these images infected by COVID-19. We then used a subset of the Chex-Pert dataset images as negative COVID-19 samples [13]. We prepared a Covid-Xray dataset of 6000 images from the two available datasets, i.e., the Covid-Chestxray dataset and the Chex-Pert dataset. Out of the 6000 images, 1200 are used for testing and 4800 for training.

Fig. 1

a, c, e Three images related to COVID-19 and b, d, f regions in the images infected by COVID-19

In this study, an ML-based system is proposed to detect COVID-19 from chest X-ray images. In contrast to the hand-engineered feature extraction and classification approach for medical images, we utilize a deep learning-based end-to-end prediction system that identifies COVID-19 cases from the input images without any manual feature extraction. Among the different deep learning models, the convolutional neural network (CNN) has achieved much better performance in many applications. It is a type of artificial neural network with two main advantages: local connectivity and weight sharing. These advantages make it appropriate for high-dimensional signals such as images. CNNs have been employed in different image enhancement, segmentation, feature extraction, and classification problems [14,15,16,17,18,19,20,21,22,23]. We developed a pattern recognition system for COVID-19 detection based on a deep learning approach. An overview of the system is shown in Fig. 2.

Fig. 2

The architecture of the proposed system

This study uses a state-of-the-art ResNet50 CNN model for this problem and evaluates its performance for COVID-19 detection. As the number of X-ray images for COVID-19 is limited, we adopted two strategies in this study:

  • A data augmentation technique is used to increase the dataset size by a factor of 5. For this purpose, we use small rotations, flipping, and the addition of a small amount of distortion (a minimal augmentation sketch is given after this list).

  • We have fine-tuned the last layer of the ResNet50 CNN model pre-trained on ImageNet, rather than learning the model from scratch. Thus, fewer samples of each class are needed to train the model.
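A minimal sketch of the augmentation step in the first strategy is shown below, using PyTorch/torchvision for illustration only (the system itself was implemented in MATLAB, Sect. 4.2); the rotation range, flip probability, and distortion strength are assumed values, since the paper does not report the exact settings.

```python
# Illustrative augmentation sketch (PyTorch / torchvision). The rotation
# range, flip probability, and affine distortion strength are assumptions,
# not the paper's exact parameters.
import torch
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=10),                   # small rotation
    transforms.RandomHorizontalFlip(p=0.5),                  # flipping
    transforms.RandomAffine(degrees=0, translate=(0.02, 0.02),
                            shear=2),                        # small distortion
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def augment_image(path, n_copies=5):
    """Return n_copies augmented variants of one chest X-ray image."""
    img = Image.open(path).convert("RGB")
    return torch.stack([augment(img) for _ in range(n_copies)])
```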

These two strategies allowed us to train a network with the available images and obtain high performance on the 1200 test images. Due to the limited number of COVID-19 images, we also compute the confidence interval (C.I.) of the performance metrics. To assess the performance of the ResNet50 model, the receiver-operating characteristic (ROC) curve is also computed.

Hence, the proposed system uses deep learning for feature extraction and classification from CT scan images in a smart healthcare environment. The main contributions of this study are as follows:

  1. Proposed a smart healthcare framework for the integration of IoT-cloud technologies.

  2. Prepared a dataset of 6000 images. The COVID-19 images were labeled by expert radiologists, clearly marked, and are used only for research purposes.

  3. Used the state-of-the-art ResNet50 CNN model to classify COVID-19 patients versus non-COVID normal subjects. We trained the ResNet50 model on 4800 images and tested its performance on 1200 images. An accuracy of 98.6%, a sensitivity of 97.3%, a specificity of 98.2%, and an F1-score of 97.87% were achieved using the proposed method.

  4. Performed a detailed experimental analysis using the histogram of predicted scores, the ROC curve, accuracy, sensitivity, specificity, and F1-score.

  5. Used the t-SNE plot to visualize the features that discriminate between the two classes.

The rest of the paper is organized as follows. Related studies are reviewed in Sect. 2. The proposed smart healthcare framework and COVID-19 detection method, including the dataset description, are explained in Sect. 3. In Sect. 4, we present experimental results and discussion. In Sect. 5, we conclude the paper.

2 Related study

In this section, we review the research studies related to smart health care and COVID-19 detection.

Cognitive smart health care has recently revolutionized healthcare systems, especially for smart city applications. IoT and integrated smart healthcare sensors, together with cloud technologies, have changed the idea of smart health care. These healthcare applications include online patient monitoring and observation, smart illness diagnosis, emergency management, mobile health care, smart health records, smart alarms, smart medication distribution, and remote service and control of medical devices. Such systems assist in medical emergencies by providing a prompt response. They are connected to several smart healthcare sensors inside and outside the human body, receiving and tracking multimodal data in real time. Some researchers have used 5G technologies to further improve connectivity in these cognitive health frameworks [24]. They also combined cognitive health programs with AI technology, such as Kinect, which has been commonly used for behavior recognition.

Several cognitive IoT systems for various domains have been reviewed in the literature. In [25], a cognitive system was suggested to make smart city modeling more sustainable. A multilayer cognitive system with a high degree of intelligence for human behavioral cognition was introduced in [26]. Another cognitive simulation system for human intelligence that can process relative knowledge was suggested in [27]. A cognitive paradigm based on NLP with the potential to answer questions was proposed in [25]. Researchers used cognitive actions to evaluate massive data in [28]. Cognitive intelligence has now been incorporated into healthcare applications, such as psychological [24] and physiological [29] applications. An emotionally aware computational framework that uses cloud computing was presented in [30]. An emotional cognitive structure that senses facial expressions was suggested in [26], and another such framework [29] also detects emotions using facial expressions.

In consideration of their tremendous economic and social benefits, smart healthcare systems have recently gained substantial interest. Many academic studies [25], frameworks [31, 32], and services [26, 33] based on IoT-cloud convergence have been suggested for smart health care. In [10], a smart healthcare system was proposed to support patients and hospitals using smart technology. Several studies [26] suggested systems for the collection and preservation of electronic health records. A smart cognitive system for glucose control was proposed in [28] to track the behaviors of diabetic patients. A cognitive ambulance operated by robotics for handling cardiac patients in need of emergency assistance was also suggested in [33]. Some mechanisms have already been proposed to detect medical forgery in the area of smart health care [34].

In [35], the researchers examine climate variables (pressure, humidity, temperature, and wind speed) in relation to COVID-19 risk at urban and rural locations. The research seeks to supplement the assessment of high COVID-19 risk in districts where the virus is anticipated to occur with knowledge of climatic and socioeconomic parameters. In [36], the authors proposed a lightweight and secure method for key and secret authentication between nodes to increase protection in the IoT framework. Formal and informal security reviews assessed the efficacy of the MASK protocol, and the analysis showed that the sensor node is protected from both hard and non-deductive attacks. In [37], the authors introduced an effective method for integrating two existing testing algorithms for WSN packet routing and surveillance into IoT networks by exploiting the new media compression standard, High Efficiency Video Coding (HEVC). The performance evaluation reveals the potential of the proposed system with respect to three competing concerns: consumer safety, media protection, and sensor node specifications. In [38], the authors developed a novel method for securely watermarking images in smart cities using a CNN. They used a novel neural network algorithm that incorporates synergetic learning. The proposed model obtains an optimal PSNR, better than the results of existing models. In [39], the authors describe a promising new stable network architecture that relies on sixth-generation (6G) wireless technology. This project aims to develop a novel and stable caching mechanism for wireless networks operating in the IoT with large-scale data.

In [40], the authors proposed a new optimistic concurrency control design variant. They investigated the concurrency of three implementation techniques with partial validation, namely AOCCRBSC, AOCCRB, and STUB. They completed an extensive series of trials to see which protocols performed best with and without the new configuration. Overall, the analysis observed decreased "mismatch percentage, reset, and message delivery errors" in all tests. This finding demonstrated that the proposal substantially reduces the contact delay, which will support low-sensitivity fog computing applications that require short response times. In [41], the authors proposed a deep learning algorithm to identify COVID-19 disease. They used one of the best deep learning techniques, based on ConvLSTM and CNN. The results for the suggested modalities are based on two separate datasets, and the study findings show that the proposed COVID-based modalities can be implemented. In [42], the authors proposed a method for exploring bi-level semantic representations to more efficiently harness semantic representations of MED videos learned from various sources. Feature interaction is used to describe the semantic interaction within the semantic representation, which is then used in the weight learning context. Sparse learning is then applied to mitigate the negative effect of noisy/irrelevant items on event recognition. As a result, they obtained classification templates for event identification on training videos along with appropriate weights. They conducted experiments using the TRECVID MED13 EK10 and MED14 EK10 datasets. In [43], the authors proposed a method for ranking video shots based on a novel concept of semantic saliency. They developed the nearly isotonic support vector machine (NI-SVM) classifier, which can exploit the carefully constructed ordering information and exhibits increased discriminative ability in event analysis tasks. They carried out detailed experiments on three real-world video datasets and obtained positive results.

The smart healthcare architecture introduced in this study seeks to resolve the problems and concerns in this area and provides a COVID-19 detection mechanism. Deep learning has been extensively used in medical image recognition and signal processing. A CNN with a drop-out technique was suggested in [44] to track seizures. A study [45] used multichannel EEG and a CNN to detect seizures. In another study [46], a CNN was used in combination with auto-encoders to classify EEG signals. As CNNs have been successfully used in numerous medical applications, we use the ResNet50 architecture in this study to detect COVID-19 in a smart healthcare environment.

3 Smart healthcare framework

This section discusses the proposed IoT-cloud-based smart healthcare framework and COVID-19 detection method.

3.1 Smart healthcare scenario

Smart healthcare frameworks are developed for the smart city environment. They allow physicians, stakeholders, and residents to monitor health through smart sensor devices and to view electronic health information from anywhere at any time using cloud and IoT technology. Cognitive capability makes decisions accurate and intelligent. The cognitive system analyzes, tracks, and integrates information in real time and allows patients to choose the best possible care. Health reports are transferred to the cloud and are remotely accessible to medical providers, who may counsel patients appropriately.

The main priorities of the smart healthcare system are effective diagnosis, low cost, minimized patient expenses, quick access, and increased overall quality of life. To meet these goals, we propose a healthcare system based on IoT-cloud technology. Residents must register with the smart city infrastructure to use its utilities. The registration process creates a safe channel between residents and healthcare providers and allows all approved stakeholders to use the cognitive module to retrieve patient information and health records securely. The location of the patient is constantly monitored to provide support in the event of an emergency. The cognitive system assesses the status of the patient and transmits the CT scan image to the cloud to be analyzed by the deep learning cognitive module. The deep learning module detects COVID-19 and returns the result of the binary classification. Based on these findings, the cognitive system plans future tasks. These results are shared with health professionals as health reports for a thorough review. The cognitive system produces alarms and notifications in the event of an emergency, and a smart ambulance or mobile clinic can locate and reach the patient in a minimum amount of time. The smart traffic system also allows emergency services to reach the location via the shortest path in minimum time. In this way, the cognitive smart healthcare system digitally delivers essential healthcare facilities to all residents.

3.2 System architecture

The design of the proposed intelligent health system is shown in Fig. 3. The CT scan images are transmitted using smart IoT sensors. The LAN consists of low-range networking equipment; this layer transmits the acquired signals from the intelligent IoT sensors and devices to the next layer, called the hosting layer. The hosting layer includes various intelligent devices, including handheld multimedia devices and laptops, that can store and transmit the signals. The intelligent devices are connected to the wide area network (WAN), which transfers data from the intelligent devices to the cloud.

Fig. 3

Smart healthcare framework for COVID-19 detection

The WAN layer uses specialized networks, such as cellular LAN, 4G, or 5G, to transfer data to the cloud in real time. The cloud manager in the cloud authenticates patient information and sends it to the deep learning cognitive module.

Intelligent IoT sensors are used for data transmission. Some of these sensors are incorporated into the patient's environment, and these devices can also connect with other IoT devices. The LAN is composed of short-range networking protocols, including Zigbee, 6LoWPAN, and Bluetooth.

Smart devices, including multimedia smartphones, notebooks, tablets, and personal digital assistants, are available in the hosting layer. These devices store data locally and run specialized programs to process the received signals. Users may obtain general, tentative health reports from these small processing systems. Data is then transmitted via the WAN layer to the cloud processing unit.

The cloud layer is made up of a cloud manager and a deep learning cognitive module. The cloud manager is in charge of data flow and applies all authentication measures to validate the identities of all smart city actors. The DL cognitive module analyzes the data after patient verification and evaluates the patient's condition. It makes intelligent decisions to identify COVID-19 based on CT scan images. The deep learning model submits the identification outcome to the cognitive module, which in turn determines the patient's situation and informs the stakeholders concerned about the results. Hospital providers then review the clinical data and outcomes and monitor patients.

3.3 COVID-19 detection and classification system

In the DL cognitive module, our objective is to develop a method for COVID-19 detection using deep learning. Deep learning has shown outstanding performance and outperforms traditional techniques based on hand-engineered features [47, 48]. We therefore use deep learning to design the proposed method. In this section, we present the details of the proposed method.

3.3.1 Dataset

In this study, a dataset of 6000 images is prepared from two existing datasets and divided into two parts, i.e., training and testing. The training and testing sets consist of 4800 and 1200 images, respectively.

One of the datasets used in this study is the Covid-Chestxray dataset, which was released recently and includes images collected by Joseph Paul Cohen from publications on COVID-19 subjects [12, 49]. This dataset comprises a combination of chest X-ray and CT scan images. As of May 3, 2020, the dataset contained 250 radiographs of COVID-19 patients. Out of these 250 images, 184 images that clearly identify COVID-19 patients were selected for this study. The dataset is continuously being updated and also includes some metadata, such as the age and gender of each patient. All COVID-19 images from this dataset were selected for our study: 100 images are used for testing and 84 for training, and all 184 images are associated with COVID-19 patients. A data augmentation scheme is used to increase the number of COVID-19 training samples from 84 to 420. We ensure that images of the same patient do not overlap between the training and testing sets; the images of a patient are included either in the testing set or in the training set.

To handle the smaller number of non-COVID images in the dataset [49], we took additional images from the Chex-Pert dataset [13]. The Chex-Pert dataset is a large publicly available dataset consisting of 224,316 chest radiographs of 65,240 patients. These images are labeled for 14 different sub-categories (pneumonia, edema, etc.). For training the proposed model, we used 4380 images from the Chex-Pert dataset: 480 sample images from the no-finding class and 300 sample images from each of the other 13 classes. For testing, we used 1100 images from the Chex-Pert dataset: 450 sample images from the no-finding class and 50 sample images from each of the other 13 classes. Table 1 shows the number of images used for training and testing.

Table 1 Number of images used in dataset preparation

Figure 4 presents 16 sample images from the dataset. The first row shows COVID-19 images, the second row shows normal images from the Chex-Pert dataset, and the third and fourth rows show the images affected by one of the 13 diseases in the Chex-Pert dataset.

Fig. 4.

16 sample images from the dataset. The first row shows COVID-19 images, the second row shows normal images from the Chex-Pert dataset, and the third and fourth rows show images affected by one of the 13 diseases in the Chex-Pert dataset

3.3.2 Preprocessing

The resolution of the images in this dataset varies considerably. In the COVID-19 class, we have some high-resolution images, i.e., more than 1900 × 1400 pixels, and some low-resolution images, i.e., 400 × 400 pixels. Handling this variation is appropriate for the proposed model because, after training, it can achieve good results regardless of the resolution of the sample image and the image capturing technique. Collecting data in an extremely controlled environment, such as capturing only high-resolution images and cleaning the data after preprocessing, is not viable. As the machine learning field advances, the focus shifts toward more sophisticated frameworks and models that work well in uncontrolled conditions, such as variation in image resolution and quality and small-scale labeled datasets. The original dataset collector gathered the COVID-19 images from different sources. Therefore, to tackle the resolution problem, we normalized the training images before training the model, so the model becomes less sensitive to different resolutions because all images follow the same distribution.
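A minimal sketch of this normalization step is given below; the 224 × 224 target size follows Sect. 4.1, while the normalization statistics (ImageNet mean and standard deviation) are assumptions, since the paper does not specify them.

```python
# Preprocessing sketch: resize every image to a common resolution and
# normalize pixel intensities so that all images follow the same distribution.
# The ImageNet mean/std values are an assumption.
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),                        # scales pixels to [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```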

3.3.3 Transfer learning

This study used the state-of-the-art ResNet50 convolutional neural network (CNN) model to classify COVID-19 patients versus non-COVID normal subjects. For this purpose, we used a transfer learning approach to fine-tune the ResNet50 CNN model on the training dataset. In this approach, a model trained for one particular task is adapted to another similar task. To apply task-specific learning on a smaller dataset, we can start from a model pre-trained on ImageNet, a well-known dataset consisting of millions of labeled images. Transfer learning is beneficial for tasks for which a large number of appropriate training samples is not available, such as medical images associated with various diseases. This approach is suitable for models with high complexity that require a large number of parameters to be trained. With transfer learning, models begin with good initial values and require only minor adjustments to tackle the new problem.

A pre-trained model can be used for a particular task in two main ways. In the first method, the pre-trained model is used to extract features, and a classifier is trained on these features to classify the data. In the second approach, part of the model network or the entire network is fine-tuned on the new task; the pre-trained model weights are used as the initial values and are updated during the training procedure.

In this study, we fine-tuned the final layer of the CNN, since the number of images in the COVID-19 class is very small, and used the pre-trained model to extract discriminative features. Then, the output of the ResNet50 model [50] is evaluated. The architecture of the ResNet50 model and its utilization in this problem are discussed in the next section. Figure 5 displays the ResNet50 CNN model architecture.
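The sketch below illustrates this setup in PyTorch: ResNet50 pre-trained on ImageNet is loaded, the convolutional backbone is frozen so that it acts as a fixed feature extractor, and the final fully connected layer is replaced with a two-class head (COVID-19 vs. non-COVID). This is an illustrative sketch only; the original system was implemented in MATLAB.

```python
# Transfer-learning sketch: freeze the pre-trained backbone and fine-tune
# only a new final layer for the two-class problem.
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

for param in model.parameters():                 # freeze pre-trained layers
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 2)    # new trainable final layer
```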

Fig. 5

ResNet50 CNN model architecture

3.3.4 ResNet50 CNN model for COVID-19 detection

In this study, we used the pre-trained ResNet50 CNN model. This model is trained on the well-known ImageNet dataset. The ResNet50 model is one of the most popular CNN architectures for robust training and was the winner of the ImageNet competition in 2015. The key concept in the ResNet50 architecture is the identity shortcut connection, which skips layers and allows faster learning. This architecture gives the initial layers a direct link through the network, which makes it simpler for the gradients to reach and update the initial layers.
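The following simplified residual block illustrates the identity shortcut described above: the input is added back to the block's output, so gradients can flow directly to the earlier layers. It is a basic block for illustration, not the exact bottleneck block used inside ResNet50.

```python
# Simplified residual block with an identity shortcut (illustrative only).
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                               # identity shortcut
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)           # skip connection
```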

3.3.5 Model training

In this study, the proposed ResNet50 model is trained with a cross-entropy loss function, which minimizes the difference between the target and predicted probability values. It is defined by the following equation:

$$ L_{{{\text{CE}}}} = - \sum_{i = 1}^{N} p_{i} \log q_{i} $$
(1)

where pi and qi represent the actual and predicted probability values for every sample image, and N is the number of samples. We then use the stochastic gradient descent algorithm to minimize the loss function.
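As a small illustration of Eq. (1), the PyTorch snippet below computes the cross-entropy loss for one image from the raw model outputs (logits); nn.CrossEntropyLoss applies the softmax internally. The label coding (0 = non-COVID, 1 = COVID-19) and the numbers are assumptions for illustration.

```python
# Cross-entropy loss illustration for Eq. (1); values are hypothetical.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[2.1, -1.3]])   # hypothetical model output for one image
target = torch.tensor([0])             # 0 = non-COVID, 1 = COVID-19 (assumed coding)
loss = criterion(logits, target)
print(loss.item())
```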

4 Experimental results and discussion

4.1 Model parameters

During the experiment, the ResNet50 model is fine-tuned for 120 epochs. The batch size is set to 30. The Adam optimizer is used to optimize the cross-entropy loss function, with a learning rate of 0.0001, beta-1 of 0.9, and beta-2 of 0.999. The resolution of all images was set to 224 × 224 before they were fed to the network.
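A training-loop sketch with these hyperparameters is given below; it continues the earlier PyTorch sketches (reusing the `model` and `preprocess` objects defined there), and the dataset path and DataLoader setup are placeholders, not the authors' actual pipeline.

```python
# Training-configuration sketch: Adam with lr 0.0001, beta-1 0.9, beta-2 0.999,
# batch size 30, 120 epochs, 224x224 inputs. The dataset path is hypothetical.
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision import datasets

train_set = datasets.ImageFolder("covid_xray/train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=30, shuffle=True)

optimizer = optim.Adam(model.fc.parameters(), lr=1e-4, betas=(0.9, 0.999))
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(120):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```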

4.2 Evaluation metrics

Tenfold cross-validation was used to evaluate the proposed method: the dataset is divided into ten folds, and each time nine folds are used for training and one fold for testing. For generalization, this process is repeated for each fold, so every fold is used for both training and testing. The training set was further divided into 90% for training the model and 10% for validation. The performance of the proposed system is evaluated using the following metrics:

$$ {\text{Accuracy}}\,\left( {{\text{Acc}}} \right) = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{Total}}\,{\text{Samples}}}} $$
(2)
$$ {\text{Specificity}}\,\left( {{\text{Spec}}} \right)=\frac{{{\text{TN}}}}{{{\text{TN}} + {\text{FP}}}} $$
(3)
$$ {\text{Sensitivity}}\,\left( {{\text{Sens}}} \right) = \frac{{{\text{TP}}}}{{{\text{FN}} + {\text{TP}}}} $$
(4)
$$ {\text{F}}1{\text{-Score}} = \frac{{2*{\text{TP}} }}{{\left( {2*{\text{TP}} + {\text{FP}} + {\text{FN}}} \right)}} $$
(5)

where TP (true positives) is the number of COVID-19 images identified as COVID-19, FN (false negatives) is the number of COVID-19 images predicted as non-COVID, TN (true negatives) is the number of non-COVID images identified as non-COVID, and FP (false positives) is the number of non-COVID images predicted as COVID-19. We used MATLAB (2020b) to implement the proposed system on a server with an Intel (R) Xeon (R) CPU-F8-2920 @ 3.5 GHz (30 CPUs), 64 GB RAM, and an 11 GB Nvidia graphics card.
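The metrics in Eqs. (2)-(5) can be computed directly from these counts; the helper below is a small NumPy sketch for illustration (the original implementation used MATLAB).

```python
# Evaluation metrics of Eqs. (2)-(5) computed from TP/TN/FP/FN counts.
import numpy as np

def classification_metrics(y_true, y_pred):
    """y_true, y_pred: arrays of 0 (non-COVID) and 1 (COVID-19)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy    = (tp + tn) / len(y_true)          # Eq. (2)
    specificity = tn / (tn + fp)                   # Eq. (3)
    sensitivity = tp / (fn + tp)                   # Eq. (4)
    f1_score    = 2 * tp / (2 * tp + fp + fn)      # Eq. (5)
    return accuracy, specificity, sensitivity, f1_score
```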

4.3 Model predicted scores

In this study, we used the ResNet50 CNN model for the detection of COVID-19 disease. The model predicts a probability score for every input image, indicating the probability that the image belongs to the COVID-19 class. A binary label can be derived by comparing this value to a threshold, indicating whether the input image is associated with COVID-19 or not. An ideal model should estimate the likelihood of all non-COVID samples close to 0 and of all COVID-19 samples close to 1.

Figure 6 displays the distribution of the predicted probability scores of the ResNet50 model for the test images. In this study, the non-COVID class consists of normal subjects and subjects affected by other diseases. Therefore, the distribution of predicted scores is presented for three classes, i.e., normal (non-COVID), other diseases (non-COVID), and COVID-19. From Fig. 6, it is noted that the probability scores of the normal class (non-COVID) are lower than those of the other-diseases class (non-COVID). Therefore, images of the other-diseases class are easily differentiated from the normal class, but they are harder to differentiate from the COVID-19 class, as both of these classes have higher probability scores. Overall, the probability scores of non-COVID images from the normal and other-diseases classes are lower than those of COVID-19 images. This indicates that the model has learned to differentiate between images of the non-COVID and COVID-19 classes.

Fig. 6

Predicted probability values with ResNet50 CNN model

4.4 Results and discussion

The probability score generated by the ResNet50 model decides whether a test image belongs to the COVID-19 class or the non-COVID class. These scores are compared to a threshold value to determine whether the input image is associated with COVID-19 or not. The predicted labels are then used to determine the model's sensitivity and specificity. In this study, we used four different threshold values, i.e., 0.15, 0.20, 0.25, and 0.30. Out of these four threshold values, we observe that the ResNet50 model achieved the best results with a threshold of 0.15. The average performance results are given in Table 2.
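The thresholding step can be sketched as follows, reusing the `classification_metrics` helper from Sect. 4.2; the score and label arrays are synthetic placeholders, not the actual test predictions.

```python
# Thresholding sketch: convert predicted COVID-19 probabilities into binary
# labels for each candidate cutoff. Scores and labels are synthetic.
import numpy as np

scores = np.array([0.05, 0.12, 0.40, 0.91, 0.88])   # hypothetical predicted probabilities
y_true = np.array([0, 0, 0, 1, 1])                  # 1 = COVID-19

for threshold in (0.15, 0.20, 0.25, 0.30):
    y_pred = (scores >= threshold).astype(int)
    print(threshold, classification_metrics(y_true, y_pred))
```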

Table 2 Performance of ResNet50 CNN model

It should be noted that there are only one hundred testing images in the COVID-19 class; therefore, the estimated specificity and sensitivity of the proposed system are not highly reliable, as the number of sample images in the COVID-19 class is small. To achieve a more accurate estimate of the sensitivity and specificity rates, more test images labeled as COVID-19 are needed. However, we can also compute the 95% C.I. (confidence interval) of the obtained specificity and sensitivity values. The purpose of estimating the C.I. is to determine the range of specificity and sensitivity values for the test instances of each class. The C.I. can be defined as:

$$ r = z{\sqrt {\frac{{{\text{accuracy}}\;\left( {1-{\text{accuracy}}} \right)}}{N}}} $$
(6)

where z represents the significance level of the C.I. and N represents the total number of instances of the particular class. In our study, we used a C.I. of 95%, so the corresponding z value is 1.96. It is imperative to have a high-performance model for the detection of COVID-19. For this purpose, the cutoff threshold value is selected with respect to the 97.3% sensitivity of the ResNet50 model.
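Equation (6) can be evaluated with a few lines of code; the helper below is a straightforward sketch of the binomial confidence-interval half-width.

```python
# Confidence-interval radius of Eq. (6): r = z * sqrt(p * (1 - p) / N).
import math

def ci_radius(rate, n, z=1.96):
    """Half-width of the 95% C.I. for a rate estimated on n samples."""
    return z * math.sqrt(rate * (1.0 - rate) / n)

# e.g. ci_radius(sensitivity, n_covid_test_images) for the COVID-19 class
```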

As it is vital to have a sensitive model for the diagnosis of COVID-19, we choose the threshold value that gives the ResNet50 CNN model a sensitivity of 97.3% and compare the specificity rates of the model at this operating point. From Table 3, we can observe that the C.I. is 2.4% for sensitivity and 1.3% for specificity. Since there are 1100 sample test images in the non-COVID class, the C.I. for sensitivity is wider than that for specificity.

Table 3 Specificity and sensitivity values of ResNet50 CNN model

Figure 7 shows the ROC curve on the testing data. It can be noticed from the curve that the AUC reaches a high value (0.98) for the proposed method, which means that the model can distinguish between the COVID-19 and non-COVID classes.
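The ROC curve and AUC of Fig. 7 can be reproduced from the predicted scores as sketched below with scikit-learn; the scores and labels shown are synthetic placeholders.

```python
# ROC / AUC computation sketch; scores and labels are synthetic.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1])
scores = np.array([0.02, 0.20, 0.35, 0.60, 0.85, 0.97])

fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)
print(f"AUC = {auc:.2f}")
```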

Fig. 7

ROC curve of proposed system

Table 4 shows the confusion matrix on the testing data: three COVID-19 images are misclassified as non-COVID, and thirteen non-COVID images are misclassified as COVID-19. The confusion matrix results in Table 4 show that the proposed system classifies the test data with a high accuracy rate.

Table 4 Confusion matrix with ResNet50 model

Figure 8 visualizes the t-SNE plot of the features. In Fig. 8, we observe two separate clusters representing the COVID-19 and non-COVID classes. This plot shows the clear discrimination of the features belonging to the two classes; only a few images of either class are misclassified. The t-SNE plot confirms the strength of the proposed system.
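A t-SNE visualization in the spirit of Fig. 8 can be produced as sketched below; which layer's features are embedded is an assumption (the penultimate 2048-dimensional ResNet50 features are used here), and the feature matrix in the sketch is random.

```python
# t-SNE sketch for Fig. 8: embed per-image features into 2-D and color by class.
# The feature matrix is a random stand-in for the ResNet50 features.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.random.rand(1200, 2048)                     # stand-in for test-set features
labels = np.concatenate([np.ones(100), np.zeros(1100)])   # 1 = COVID-19

embedded = TSNE(n_components=2, random_state=0).fit_transform(features)
plt.scatter(embedded[:, 0], embedded[:, 1], c=labels, cmap="coolwarm", s=5)
plt.title("t-SNE of features (COVID-19 vs. non-COVID)")
plt.show()
```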

Fig. 8

t-SNE plot for the two-class problem (COVID-19 vs. non-COVID) for feature visualization

4.5 Comparison

The comparison results are shown in Table 5. We have taken the results reported in the corresponding research papers to avoid any bias due to parameter tuning. It is noted that the proposed method outperforms the other state-of-the-art methods. From this comparison, we conclude that the proposed end-to-end model based on a ResNet50 CNN outperforms recent works on COVID-19 versus non-COVID classification by achieving an accuracy of 98.6%, a sensitivity of 97.3%, a specificity of 98.2%, and an F1-score of 97.87%.

Table 5 Comparison of ResNet50 model with other research studies

This study developed a state-of-the-art framework based on deep learning for COVID-19 detection using chest X-ray images. For this purpose, we fine-tuned the pre-trained ResNet50 CNN model on our training set. A dataset of 6000 images was prepared, and a detailed experimental analysis was performed to evaluate the performance of the ResNet50 CNN model using accuracy, specificity, sensitivity, and F1-score. For a sensitivity rate of 97%, the proposed model achieved a specificity rate of 98% on average. This is encouraging, as it shows promising results using X-ray images for COVID-19 detection. The study is carried out on publicly available datasets that consist of 520 COVID-19 and 5480 non-COVID images, respectively. The results indicate that the proposed system gives an accuracy of 98.6%, sensitivity of 97.3%, specificity of 98.2%, and F1-score of 97.87%. The comparison shows that the proposed system performs better than the existing systems.

4.6 Future work

COVID-19 detection from chest X-ray images is a challenging problem, and many issues still need to be overcome. There are many future directions related to the proposed work. Probably the most important challenge for our future work is to further improve the accuracy rate. In the future, we intend to design a more generalized and powerful model by increasing the depth of the model to see how it affects the accuracy. For this purpose, we will use state-of-the-art and sophisticated deep learning techniques to develop such models. For the current study, two publicly available datasets are used; one future direction is to use more images for the COVID-19 detection problem.

In the future, it would also be of interest to investigate the identification of different categories/multi-classes for COVID-19 detection. Though the proposed system gives good performance on a publicly available dataset, its deployment in a real-time environment for the healthcare sector is also future work. It would also be interesting to observe if more subjects in the experiments would positively or negatively impact the results as the amount of data for the classifier increases.

The proposed DL-based technique will be helpful in medical diagnosis research and healthcare systems. It will also support medical experts in COVID-19 screening and provide a valuable second opinion.

5 Conclusion

In this study, a smart healthcare system is proposed that incorporates IoT-cloud technologies to detect and classify COVID-19. The system uses smart sensors to collect data from medical images. These images are stored in the cloud and used to assess the status of patients. The system then advises on the facilities and medical assistance required by patients. The image is sent to the deep learning cognitive module, which detects COVID-19 and informs all stakeholders of the patient's situation so that follow-up procedures can be executed. In the DL cognitive module, we have proposed an intelligent and robust system for detecting coronavirus disease (COVID-19) using a state-of-the-art deep learning approach. We used chest X-ray images as the dataset and fine-tuned the ResNet50 CNN model on our training dataset. We validate the robustness and effectiveness of the proposed system using two benchmark publicly available datasets (the Covid-Chestxray dataset and the Chex-Pert dataset). First, a dataset of 6000 images is prepared from the Covid-Chestxray and Chex-Pert datasets. The proposed system is trained on 80% of the data and tested on the remaining 20%. The results clearly show that the performance metrics of the proposed method are high. Cross-validation is done using a tenfold cross-validation technique. A detailed experimental analysis is performed to evaluate the performance of the proposed system. The results indicate that the proposed system gives an accuracy of 98.6%, sensitivity of 97.3%, specificity of 98.2%, and F1-score of 97.87%. The comparison shows that the proposed system performs better than existing systems. The proposed DL-based technique will be helpful in medical diagnosis research and healthcare systems. It will also support medical experts in COVID-19 screening and provide a valuable second opinion.