1 Introduction

The lungs are among the primary organs of the human respiratory system and are located on either side of the heart, near the backbone. Their function in the respiratory system is to extract oxygen from the atmosphere and transfer it into the bloodstream, and to release carbon dioxide from the bloodstream into the atmosphere, in the process of gas exchange. Lung tissue can be attacked by bacteria, viruses, and, rarely, parasites; as a result, a lung infection may occur. Such an infection might be a sign of various diseases, such as pneumonia and, nowadays, the world's nightmare, the coronavirus.

Pneumonia is swelling and inflammation of the tissue in one or both lungs. It causes the air sacs, or alveoli, of the lungs to fill up with fluid or pus. Traditionally, pneumonia is diagnosed by reviewing the patient's medical history, a physical exam, and other diagnostic tests such as a blood test, a sputum test, and a pulse oximetry test; moreover, the chest X-ray is the main diagnostic tool. The disease can be seen at any age, but children under 2 years of age, people over 65, and people with weakened immune systems or existing health problems are at high risk. According to the data of the World Health Organization (WHO), 450 million pneumonia cases are recorded every year in the world [36].

The coronavirus, a commonly used name for the disease caused by the novel coronavirus SARS-CoV-2, was first seen in Wuhan, China in December 2019. The disease has damaged both human health worldwide and the world's economic health. It has also nearly collapsed the health systems of many countries, and thus has been declared by WHO a global pandemic [28, 42]. Indeed, the disease tied to the new coronavirus was originally called novel coronavirus-infected pneumonia (NCIP) and later named COVID-19 (Corona Virus Disease 2019) by WHO [20]. COVID-19 has negatively affected many people around the world and caused many deaths, at a rate of 6.14% globally [16]. The virus behind COVID-19 is almost 79 percent similar to SARS-CoV, which caused the SARS epidemic of 2002-2003, and 96 percent similar to coronaviruses seen in bats. The pathologic result of both SARS and COVID-19 is diffuse alveolar damage with fibrin-rich hyaline membranes and a few multinucleated giant cells [31], but COVID-19 is more deadly than SARS: the number of deaths from COVID-19 has already overtaken that of the SARS epidemic.

Similar to viral pneumonia (non-COVID-19), the main feature of COVID-19, which is understood to be the result of a new mutation, is easy attachment to type 2 alveolar cells of the human lung [7]. COVID-19 testing is not possible everywhere due to the absence of diagnostic kits, and even where a kit exists, its false-negative rate (giving a negative result for a person infected with COVID-19) is high. Furthermore, early detection of COVID-19 is crucial to keep its morbidity and mortality rates low. These reasons were the main motivations to develop other diagnostic methods. It is known that X-rays are used to analyze the health of the lungs of COVID-19 and pneumonia patients. However, a specialist radiologist must analyze chest X-rays or computed tomography (CT) scans to observe structural changes in the chest. With the rise of the pandemic, the need for radiologists has increased considerably, not only to detect COVID-19 but also to identify other abnormalities it causes. Moreover, the diagnosis process is time-consuming and subject to human error.

On the other hand, advances in computer technology play important roles in medicine, as in many areas of life. In medicine, computerized systems assist doctors during interpretation, diagnosis, and treatment as a second reader to reduce medical errors and save time [25]. Convolutional Neural Networks (CNNs) have been a dominant method in medical imaging, used in computer-aided diagnostic systems with successful results in classification and feature extraction problems due to their ability to take advantage of spatio-temporal correlation structures. There is a vast number of successful CNN-based studies in medical imaging; COVID-19 image classification [37], detection of breast cancer [29], diagnosis of lung lesions [38], detection of hand osteoarthritis [44], and brain tumor classification [18] are some of them. Designing a successful CNN model requires a diverse but labeled data corpus; the quality of the images and an appropriate network architecture are also important components [19, 21, 40]. However, in situations where enough data is not available, data augmentation and transfer learning can be used [23, 27]. Transfer learning allows a pre-trained model to be applied to a different problem after some modifications to the model's structure and re-training of the modified model for fine-tuning on the problem's dataset. This is a crucial advantage of transfer learning, because it is difficult to obtain adequately annotated medical data [8].

The current study offers an automated computerized CNN-based multiclass model to differentiate viral pneumonia, COVID-19, and normal cases using chest radiographs only, without the presence of a specialist. The model takes chest X-ray images and monitors structural differences in the lungs caused by the diseases. The model reaches 98.86% average accuracy, 98.29% average recall, and 98.37% average precision. Moreover, the average false-positive and false-negative rates, which significantly impact the pandemic, are 0.0085 and 0.0171, respectively. The model was also evaluated with previously unused chest images randomly selected from the COVID-19 Radiography Dataset [14, 35], 20 from each class; it accurately predicted 58 of these 60 cases. Thus, the model can be used to assist radiologists as a second reader and is capable of reducing variability in the interpretation of images among radiologists that may be caused by differences in their experience.

The main advantages and contributions of this research are:

  • The proposed model improves the correct detection of pneumonia, differentiates non-COVID-19 pneumonia from COVID-19, and reduces both false-negative and false-positive diagnoses.

  • Deep medical imaging is utilized to highlight differences in chest X-rays among cases, namely positive COVID-19, non-COVID-19 pneumonia, and normal cases.

  • CNNs are capable of extracting radiographic patterns from chest X-ray images and turning them into valuable information in medical imaging.

  • An advantage of this work is the use of the pre-trained NASNet-Mobile CNN, an architecture generated by artificial intelligence. To prevent overfitting, a dropout layer was added to the original network.

  • By working on raw data without pre-processing, even images that are difficult to classify were classified accurately in the shortest time possible. The recommended method is automatic and user-independent.

The rest of the paper is organized as follows. Section 2 overviews the importance of X-rays in monitoring COVID-19 and summarizes studies about COVID-19. Materials and methods utilized in the study are described in Section 3. Section 4 presents simulation results. A comparison to existing studies is given in Section 5.

2 Related studies

Recent studies show that coronavirus causes structural effects in the lungs, such as disruption of normal lung structure and transformation of respiratory tissue into fibrotic material [10], which can be monitored using chest X-rays and CT scans. On the other hand, scientific developments based on deep learning have begun to emerge in medical image analysis, both in extracting features that represent data and in diagnosing diseases. CNNs, which make use of the structural and configurational information available in the images, produce expert-level performances in the fields of medical imaging. For example, Padma and Kumari applied 2D convolution techniques to open-source COVID-19 datasets and proposed a binary classifier that takes chest X-ray images as input to identify COVID-19 with 99% accuracy and 98% verification capability [33]. Moreover, Catak and Sahinbas used 278 positive and 66 negative X-ray images for training, and 39 positive and 30 negative images for testing. In their study, VGG (Visual Geometry Group) 16, VGG19, ResNet, DenseNet, and Inception models were used; VGG16 achieved the highest accuracy among the models, at 75% [12]. In another study, Shan et al. [39] used VB-Net neural networks to detect COVID-19 infection sites on CT images. The system was trained using 249 COVID-19 patients and validated using 300 new COVID-19 patients. The proposed model gave a Dice similarity coefficient of 91.6% ± 10.0% between automatic and manual segmentation and an average POI estimation error of 0.3% for the whole lung on the validation dataset [39]. Akram et al. [1] proposed a method that automatically classifies COVID-19 CT images. The method uses features extracted by wavelet transform and fractal texture analysis, and applies feature selection with a genetic algorithm. They achieved a 92.6% accuracy rate using Naive Bayes as the classifier. Khan et al. [26] classified contrast-enhancement preprocessed COVID-19 CT images with the pre-trained DenseNet-201. As a result, they achieved a high accuracy rate of 94.76%. Sahlol et al. [37] proposed a hybrid method using an Inception CNN and a feature selection algorithm based on the marine predators algorithm (MPA) to classify COVID-19 X-ray images. They achieved high accuracy rates of 98.7% and 99.6% in their trials on two different datasets.

Pneumonia is a significant manifestation of COVID-19 and should be diagnosed urgently along with its underlying causes. Mahmud et al. [30], using a rather small number of COVID-19 chest X-rays, conducted deep learning-assisted automatic detection of COVID-19 and pneumonia types. They proposed a CNN-based architecture, called CovXNet, that uses convolutions with varying dilation rates to efficiently extract various features from chest X-rays. At first, chest X-rays of normal and (viral/bacterial) pneumonia patients were used to train CovXNet. This initial phase was carried out with a limited number of chest X-rays; later, with some additional layers, the model was trained further with chest X-ray images of COVID-19 and other pneumonia patients. In the proposed method, different forms of CovXNet were designed and trained with X-ray images of various resolutions, and a stacking algorithm was used for further optimization of their estimates. Finally, gradient-based differential localization was integrated to distinguish the abnormal regions of X-ray images that refer to different types of pneumonia. Extensive experiments on two different datasets provided satisfactory detection performance, with accuracies of 97.4% for COVID/Normal, 96.9% for COVID/Viral pneumonia, 94.7% for COVID/Bacterial pneumonia, and 90.2% for multiclass COVID/Normal/Viral/Bacterial pneumonia [30].

According to WHO, radiographic patterns in chest CT scans showed higher accuracy for detecting COVID-19 than RT-PCR, which had a relatively low positive detection rate in the early stages. Butt et al. [11] conducted a study comparing multiple convolutional neural network (CNN) models to classify CT samples as COVID-19, influenza viral pneumonia, or infection-free. In that study, they built on existing 2D and 3D deep learning models and, combining these with the latest clinical understanding, achieved an AUC of 0.996 (95% CI: 0.989-1.00) for distinguishing coronavirus from non-coronavirus cases, with a sensitivity of 98.2% and a specificity of 92.2% in thoracic CT studies.

Panwar et al. [34] studied the visual indicators found in the lungs. They proposed nCOVnet, a deep learning neural network-based method, as an alternative rapid screening method that can detect COVID-19 by analyzing patients' X-rays. Dansana et al. [17] used VGG-19, Inception-V2, and a decision tree model for binary classification of pneumonia on a dataset of 360 X-ray and CT scan images. They concluded that the fine-tuned VGG-19 performed quite satisfactorily, with training and verification accuracy rising to 91%, compared with the Inception-V2 (78%) and decision tree (60%) models [17]. Table 1 summarizes these studies.

Table 1 Overview of recent studies related to the current study

The current study proposes a computerized multiclass model to distinguish between viral pneumonia, COVID-19, and normal cases using chest radiographs only. The model takes raw chest X-ray images and tracks structural differences in the lungs caused by the diseases. Performance measures such as accuracy, recall, and precision indicate that the model is promising.

3 Materials and methods

Dataset

The dataset for the study is a collection of the Cohen dataset [15] and the Kermany dataset [24]. The latter contains normal (i.e., healthy) and viral pneumonia chest radiographs, while the former contains only COVID-19 images. The distribution of images is 131 normal, 130 viral pneumonia, and 129 COVID-19, i.e., almost the same number of images in each class. Figure 1 shows a sample image from each class.

Fig. 1

Original raw images from the dataset: left, normal; middle, viral pneumonia; right, COVID-19

Later, the dataset was randomly split into two parts using a stratified selection strategy to allocate 70% of the images for training and 30% for validation. Table 2 shows the training and validation set counts.
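The stratified split can be sketched as follows (an illustrative Python sketch, not the study's original MATLAB code; the function and variable names are hypothetical):

```python
import random

def stratified_split(items_by_class, train_frac=0.7, seed=0):
    """Stratified split: sample the train fraction within each class separately."""
    rng = random.Random(seed)
    train, val = [], []
    for label, items in items_by_class.items():
        shuffled = items[:]
        rng.shuffle(shuffled)
        cut = round(train_frac * len(shuffled))
        train += [(x, label) for x in shuffled[:cut]]
        val += [(x, label) for x in shuffled[cut:]]
    return train, val
```

With the class sizes above (131/130/129), this yields 273 training and 117 validation images while preserving the class balance.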

Table 2 Train and validation sets statistics

Data augmentation

The performance of CNN-based models highly depends on the amount of diverse data available; otherwise, the model might not generalize well, i.e., the model produces good results on the seen data (training set) but performs poorly on the unseen data (test set). Moreover, spatio-temporal correlation structures have been proven to improve the effectiveness of models [3,4,5,6]. Therefore, the dataset was augmented artificially to increase its effective size, as image data augmentation not only prevents a CNN from learning irrelevant patterns, overfitting, and memorizing exact details of training images, but also supports temporal and spatial relationships. Herein, a series of online data augmentations such as reflection, translation, and scaling was applied during training.

Classification model

Classical deep learning models accept inputs in vectorial form; however, especially in medical images, structural information contained in nearby pixels or voxels is important, and vectorization destroys this structural information. CNNs aim to take advantage of spatial and configurational information. They have the ability to learn hierarchical features from the data using a backpropagation algorithm. A typical CNN pipeline in image processing contains a pile of convolution, data reduction, and accumulation layers. Convolution layers are used as small-scale detectors to explore the features of an image. Suppose I represents an input image of size m × n in the image domain D, where I(i,j,3) is the pixel value at location (i,j), with i = 1,2,⋯m and j = 1,2,⋯n. Here, the third coordinate represents the number of channels; for the sake of succinctness, it is usually omitted, and I is thus considered a two-dimensional function on D. It is worth mentioning that each operation mentioned below is applied to each channel separately. CNNs are composed of a variety of layers, such as convolution layers, pooling layers, and fully connected layers.

Each convolution layer uses filters, say F of size (2p + 1) × (2q + 1), to output so-called feature maps (aka activation maps) by applying the convolution operation to I, that is,

$$ (I\ast F)(i,j)=\sum\limits_{r=-p}^{p}\sum\limits_{s=-q}^{q}I(i-r,j-s)F(r,s) $$
(1)

where i = 1,2,⋯m and j = 1,2,⋯n. Handling the border values is a matter of choice; the general approach is padding to obtain an output of size m × n (see [22] for details). The convolutional layer is completed by adding a bias term b(i,j), i = 1,2,⋯m and j = 1,2,⋯n, followed by applying a nonlinear activation function σ(x) to each element of the output in (1) to introduce nonlinearity. Thus, under the assumption that the input and output of (1) have the same size, we obtain

$$ S(i,j)=\sigma ((I\ast F)(i,j)+b(i,j)),\quad i=1,2,{\cdots} m\quad\text{and}\quad j=1,2,{\cdots} n. $$
(2)

For the choices of nonlinear activation functions, see [2].
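As a minimal illustration of Eqs. (1)-(2), the following single-channel Python sketch uses zero padding at the borders and ReLU as σ (both assumptions for the example; this is not the study's MATLAB implementation):

```python
import numpy as np

def conv2d(I, F, b, sigma=lambda x: np.maximum(x, 0.0)):
    """Single-channel convolution of Eqs. (1)-(2), zero-padded to m x n."""
    m, n = I.shape
    fp, fq = F.shape                    # filter is (2p+1) x (2q+1)
    p, q = fp // 2, fq // 2
    Ipad = np.pad(I, ((p, p), (q, q)))  # zero padding at the borders
    S = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for r in range(-p, p + 1):
                for s in range(-q, q + 1):
                    # I(i-r, j-s) * F(r, s), shifted by (p, q) into padded coords
                    S[i, j] += Ipad[i - r + p, j - s + q] * F[r + p, s + q]
    return sigma(S + b)                 # bias and activation, Eq. (2)
```

With the identity filter (1 at the center, 0 elsewhere) and zero bias, the output reproduces the input, which is a quick sanity check of the indexing in (1).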

After that, in the pooling layer, a downsampling operation is applied to each feature map. With an equal-dimensional pooling window of size k, the windowed area of size k × k over the feature map S is summarized. The most commonly used pooling function ρ is max pooling, since it preserves detected features. The output of the pooling layer, say W, also depends on the stride value s, the amount the window is moved after each pooling operation, and the padding t. Thus,

$$ W(i,j)=\rho(S,k,s) $$
(3)

where \(i=1,2,{\cdots } \lfloor \frac {m+2t-k}{s}\rfloor +1\) and \(j=1,2,{\cdots } \lfloor \frac {n+2t-k}{s}\rfloor +1\). We define \(u= \lfloor \frac {m+2t-k}{s}\rfloor +1\) and \(v=\lfloor \frac {n+2t-k}{s}\rfloor +1\). We also define \(\bar {p}=2p+1\) and \(\bar {q}=2q+1\). Then, one can represent the result of (1), (2) and (3) in implicit form as

$$ W={\varGamma}(I_{m\times n},F_{\bar{p}\times\bar{q}},b_{m\times n},S,k,s;\ast,\sigma,\rho) $$
(4)

which is the downsampled feature map for the filter F and is an input to the next layer. We call this combination of layers a CNN block. If we ask a CNN block to produce r such outputs, we have to provide r filters, which can be formulated by

$$ W^{r}={\varGamma}(I_{m\times n},F^{r}_{\bar{p}\times\bar{q}},b^{r}_{m\times n},S^{r},k,s;\ast,\sigma,\rho). $$
(5)

Note that (5) produces a total of r downsampled feature maps, one corresponding to each filter. Moreover, if the initial input image set has w images, then (5) produces rw downsampled feature maps.
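The pooling step in (3), with the output size u × v given above, can be sketched as follows (an illustrative Python sketch of max pooling, not taken from the paper):

```python
import numpy as np

def max_pool(S, k, s, t=0):
    """Max pooling of Eq. (3): k x k window, stride s, padding t."""
    m, n = S.shape
    Sp = np.pad(S, t, constant_values=-np.inf) if t else S
    u = (m + 2 * t - k) // s + 1    # floor((m + 2t - k) / s) + 1
    v = (n + 2 * t - k) // s + 1    # floor((n + 2t - k) / s) + 1
    W = np.empty((u, v))
    for i in range(u):
        for j in range(v):
            # summarize each k x k window by its maximum
            W[i, j] = Sp[i * s:i * s + k, j * s:j * s + k].max()
    return W
```

For a 4 × 4 map with k = 2, s = 2, t = 0, the formula gives a 2 × 2 output, each entry being the maximum of one non-overlapping window.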

An efficient CNN model is composed of many such CNN blocks, say l, where the output of one is the input to the next. We can formulate the following recurrence relation

$$ W^{r_{i}}_{i}={\varGamma}(W^{r_{i-1}}_{i-1},F^{r_{i}}_{\bar{p_{i}}\times\bar{q_{i}}},b^{r_{i}}_{i},S^{r_{i}}_{i},k_{i},s_{i};\ast,\sigma,\rho) $$
(6)

where \(W^{r_{0}}_{0}=I^{w}_{m\times n}\) and i = 1,2,⋯,l.

Later, all of the downsampled feature maps \({W^{i}_{l}}, i=1,2,{\cdots } r_{l}\), are stretched and concatenated into a single column vector, say x, and fed into one or more fully connected layers, which we denote by f. The output size of the last fully connected layer equals the number of classes, say c, in the target data. The softmax activation function 𝜃 normalizes the output so that all output values are positive and sum to one. Then, the predicted class for the input image I with label y is given by

$$ \hat{y}=\underset{1\le i\le c}{\arg\max}\ \theta_{i}(f(x)). $$
(7)

During the training process, a loss function ℓ of y and \(\hat {y}\) is computed for each item in the training dataset Ω of size ν, and the following minimization problem

$$ \min_{F^{r_{i}}_{\bar{p_{i}}\times\bar{q_{i}}},b^{i}_{m_{i}},0\le i\le l} \sum\limits_{j=1}^{\nu}\ell (y_{j},\hat{y}_{j}). $$
(8)

is solved.
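The prediction in (7) and one term of the sum in (8) can be sketched as follows (an illustrative Python sketch; cross-entropy is used as the loss ℓ here, an assumption since the paper does not name it):

```python
import numpy as np

def softmax(z):
    """Theta: normalizes outputs to be positive and sum to one."""
    e = np.exp(z - np.max(z))   # shift for numerical stability
    return e / e.sum()

def predict(logits):
    """Predicted class of Eq. (7): index of the largest softmax output."""
    return int(np.argmax(softmax(logits)))

def cross_entropy(y, logits):
    """One term ell(y, y_hat) of the sum in Eq. (8), for true class index y."""
    return float(-np.log(softmax(logits)[y]))
```

The loss is small when the softmax probability of the true class is high, which is what the minimization over the filters and biases in (8) drives toward.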

This is a good place to point out that when the model is trained from scratch, the recurrence relation (6) is initiated by assigning random values to the contents of the filters \(F^{r_{0}}_{\bar {p_{0}}\times \bar {q_{0}}}\). However, with the idea of using others' experience, so-called transfer learning, we prefer to transfer filters from pre-existing models such as Xception [13], VGG-19 [41], AlexNet [27], and NASNet-Mobile [48]. We constructed models on each of these, and NASNet-Mobile achieved the highest accuracy; thus, the proposed classification model was built on NASNet-Mobile, which has 5,326,716 learnable parameters.

NASNet-Mobile, a state-of-the-art pre-trained network, is a convolutional neural network that was trained on 1.2 million training images from the ImageNet database. The model has a total of 913 layers. The image input size for the network is 224 × 224 × 3 [48].

The initial construction of the model was NASNet-Mobile. The last three layers, named 'predictions', 'predictions_softmax', and 'ClassificationLayer_predictions', were replaced with the appropriate layers (i.e., fullyConnectedLayer, softmaxLayer, and classificationLayer). Local hyperparameters were defined for the final fullyConnectedLayer such that 'WeightLearnRateFactor' and 'BiasLearnRateFactor' were both assigned 10. A dropout layer with dropout probability 0.5 was added just before the final fullyConnectedLayer, so that only randomly selected neurons enter the following layer. The first 10 layers, corresponding to 2018 learnable parameters, were kept frozen during training, while the higher layers were fine-tuned. Figure 2 shows the graphical abstraction of the model.
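The freeze-and-fine-tune idea can be sketched abstractly (a toy Python sketch, not MATLAB code; the layer names and parameter counts are hypothetical and do not reflect the real NASNet-Mobile layer list):

```python
# Each layer carries a trainable flag; "freezing" simply clears that flag
# so the optimizer excludes those weights from fine-tuning.
layers = [{"name": f"layer_{i}", "params": 100, "trainable": True}
          for i in range(20)]

# Freeze the first 10 layers, as done for NASNet-Mobile in the text.
for layer in layers[:10]:
    layer["trainable"] = False

frozen = sum(l["params"] for l in layers if not l["trainable"])
trainable = sum(l["params"] for l in layers if l["trainable"])
```

Only the `trainable` parameters are updated during fine-tuning; the frozen early layers keep the generic low-level filters learned on ImageNet.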

Fig. 2

Graphical abstraction of the classification model. The model was built on NASNet-Mobile. The classification layer was updated, a dropout layer was added, and the initial layers were kept unchanged

Before fine-tuning the model, both the training and validation sets went through a sequence of online data augmentation. Each image was reflected horizontally with a 50% probability. Also, all images were translated both horizontally and vertically with translation distance picked randomly from a continuous uniform distribution within the interval [− 30,30]. Moreover, horizontal and vertical scaling was applied using a scale factor selected randomly from a continuous uniform distribution within the interval [0.9,1.1].
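The augmentation parameters described above can be sampled as follows (an illustrative Python sketch; the study's augmentation ran inside MATLAB, so the function here is hypothetical):

```python
import numpy as np

def sample_augmentation(rng):
    """Draw one set of the online augmentation parameters described above."""
    return {
        "reflect_x": bool(rng.random() < 0.5),      # horizontal flip, p = 0.5
        "shift": rng.uniform(-30.0, 30.0, size=2),  # x/y translation in pixels
        "scale": rng.uniform(0.9, 1.1, size=2),     # x/y scale factors
    }

rng = np.random.default_rng(0)
params = sample_augmentation(rng)
```

Because the parameters are drawn afresh each time an image is fed to the network, every epoch effectively sees a slightly different version of each training image.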

Table 3 shows additional hyperparameters and their assigned values along with frozen weights.

Table 3 Model’s training hyperparameters, frozen weights, training elapsed time

4 Simulation results

The implementation and training of the proposed transfer model were carried out in the MATLAB® programming environment. The model was trained using 273 chest radiographs, among them 92 normal, 90 COVID-19, and 91 viral pneumonia. Accuracy was checked at the beginning of training and every five epochs thereafter via the validation set, which contained 117 chest images distributed evenly among the normal, viral pneumonia, and COVID-19 classes. Furthermore, the training and validation losses were measured using the training and validation sets, respectively. Figure 3 outlines the training progress: Fig. 3a shows plots of accuracy for both training and validation, and Fig. 3b shows the same for the loss.

Fig. 3

Transfer learning using pretrained NASNet-Mobile with fine-tuning

Figure 4 shows the confusion matrix for the validation set. The model correctly predicted 115 of 117 cases, with only two mispredictions: two viral pneumonia cases predicted as normal.

Fig. 4

NASNet-Mobile confusion matrix for the validation set. The rows correspond to the true class and the columns correspond to the predicted class. The bottom table shows the model’s true and false predictions ratio to the validation set size in terms of percentages

Moreover, the efficiency of the transfer model was evaluated using performance metrics such as accuracy, recall, precision, F1 score, false-positive rate, false-negative rate, false discovery rate, false omission rate, critical success index, and Matthews correlation coefficient. These metrics are computed from the confusion matrix given in Fig. 4, and Table 4 presents the evaluation results along with their macro-average values. Note that the model uses only pixel information. The macro-average accuracy of the model was 0.9886, while the macro-average recall and precision were 0.9829 and 0.9837, respectively. Notably, both the false-positive and false-negative rates for COVID-19 were 0.0000.
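The macro averages can be reproduced from the confusion matrix (a Python sketch; the 3 × 3 matrix below is reconstructed from the counts stated in the text, with rows as the true classes normal / viral pneumonia / COVID-19 and columns as the predicted classes):

```python
import numpy as np

# Reconstructed validation confusion matrix: 39 images per class,
# with two viral pneumonia cases predicted as normal.
C = np.array([[39,  0,  0],
              [ 2, 37,  0],
              [ 0,  0, 39]])

N = C.sum()
acc, rec, prec = [], [], []
for k in range(3):
    tp = C[k, k]
    fn = C[k].sum() - tp       # class-k cases predicted as something else
    fp = C[:, k].sum() - tp    # other classes predicted as k
    tn = N - tp - fn - fp
    acc.append((tp + tn) / N)
    rec.append(tp / (tp + fn))
    prec.append(tp / (tp + fp))

macro_accuracy = float(np.mean(acc))    # 0.9886
macro_recall = float(np.mean(rec))      # 0.9829
macro_precision = float(np.mean(prec))  # 0.9837
```

Rounded to four decimals, these reproduce the macro averages reported in Table 4.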

Table 4 NASNet-Mobile performance statistics for the validation set

As mentioned above, throughout our experiments we also constructed models by modifying Xception, VGG-19, and AlexNet; their highest accuracies were 94.87%, 93.16%, and 94.94%, respectively.

The model was further tested with previously unused chest images, namely, images used neither in training nor in validation. The test set contained a total of 60 images, 20 from each class, randomly selected from the COVID-19 Radiography Dataset [14, 35]. Figure 5 presents the confusion matrix for the test set. The model accurately predicted 58 of 60 cases; one COVID-19 case was predicted as normal, and one normal case was predicted as viral pneumonia.

Fig. 5

NASNet-Mobile confusion matrix for the test set. The rows correspond to the true class and the columns correspond to the predicted class. The bottom table shows the model’s true and false predictions ratio to the test set size in terms of percentages

Moreover, Fig. 6 shows the receiver operating characteristic (ROC) curve for the model, computed in a one-versus-rest fashion. Observe that the class prediction probability values are high, that is, the model is robust.

Fig. 6

NASNet-Mobile ROC curve for the test set

5 Discussion

According to World Health Organization data for December 2020, there have been more than 79 million cases and more than 1.7 million deaths worldwide since the beginning of the pandemic (WHO, 2020). The restrictions applied in the manufacturing industry have had serious negative effects on the global supply chain. Currently, with the continuing high rate of spread of COVID-19 worldwide, states of emergency have been declared in many countries: schools are being closed, and online work for both government and private agencies is promoted. As economists warn, the epidemic could cost $1.1 trillion globally.

It is known that RT-PCR testing of 2019-nCoV RNA is capable of making the definitive diagnosis of COVID-19, distinguishing it from influenza-A viral pneumonia. However, nucleic acid testing has some disadvantages, including delays in results, a relatively low detection rate, and shortages of supplies. Moreover, in the early stages of COVID-19, some patients may already have positive pulmonary imaging findings but lack sputum, and therefore may receive negative RT-PCR test results from nasopharyngeal swabs. These patients are not diagnosed as suspected or confirmed cases, are not isolated or treated, and become potential sources of infection.

Mahmud et al. [30] identified the distinctive localization of abnormal areas on X-rays that might help diagnose variations in the clinical features of pneumonia. They stated that this result, obtained with a small number of images, can be improved by increasing the number of images. Panwar et al. [34] pointed out that economic differences may limit testing and that rapid determinations made through such images can help hospital administration and medical professionals. Dansana et al. [17] drew attention to the importance of early detection of COVID-19 and analyzed the data with three different models, achieving prediction accuracies between 60% and 91%.

Although WHO strongly recommends testing as many people as possible, it has not been fulfilled by many countries due to the apparent lack of resources/staff and the lack of RT-PCR testing. However, the use of CT imaging with artificial intelligence methods might help detect COVID-19 by offering a rapid alternative and thus helping to limit the disease spread.

Developments in computer technology in recent years have facilitated the acquisition of high-resolution images and their processing. These developments have revolutionized the fields of medical imaging and artificial intelligence, and many successful studies have been conducted in medical imaging. In particular, transfer learning-based CNN models have achieved expert-level performance, and successful studies exist even for COVID-19, nowadays the world's nightmare. Below we name some of them.

Apostolopoulos and Mpesiana [9] designed a transfer learning-based CNN model for automatic detection of COVID-19. Their dataset contained X-ray images of normal, common bacterial pneumonia, and COVID-19 cases. Their VGG-19 based CNN achieved the highest accuracy of 93.48%, with a sensitivity of 92.85% and a specificity of 98.75%.

Narin et al. [32] proposed three transfer learning-based deep learning models, based on ResNet50, InceptionV3, and Inception-ResNetV2, to diagnose COVID-19. The study included chest X-ray images of 50 COVID-19 and 50 normal cases. The ResNet50-based model reached the highest classification accuracy of 98.0%; the accuracies of InceptionV3 and Inception-ResNetV2 were 97.0% and 87%, respectively.

Zhang et al. [47] proposed a ResNet-based transfer model for detecting COVID-19 using X-rays. The dataset contained 70 COVID-19 cases and 1008 non-COVID-19 pneumonia cases. Sensitivity and specificity were 96.0% and 70.7%, respectively.

In another study, Wang et al. [45] designed a deep convolutional neural network-based model (COVID-Net) to detect COVID-19 cases. The dataset comprised 5941 chest X-ray images in total, among them 1203 normal, 931 bacterial pneumonia, 660 viral pneumonia, and 45 COVID-19. COVID-Net achieved 83.5% test accuracy.

Judged by performance metrics alone, compared with the above studies, the model proposed in this study achieves the highest validation accuracy, recall, and precision. Even though the performance of the model was slightly lower on the test images, it is still competitive.

6 Conclusion

Herein, a fine-tuned CNN model was developed to characterize chest X-ray images as viral pneumonia, COVID-19, or normal cases. Performance metrics for both the validation and test sets, such as accuracy, recall, and precision, along with the others presented in Tables 4 and 5, indicate that the model is encouraging. For example, the model's macro-average accuracy rates on the validation and test sets are 0.9886 and 0.9778, respectively. Moreover, macro-average recall is 98.29% on the validation set and 96.67% on the test set, and macro-average precision is 98.37% and 96.67%, respectively. We conclude that the model is suitable for use in a laboratory environment. As an outcome of this study, the use of CNN technology can assist physicians in their final decisions and increase accuracy in the decision-making process.

Table 5 NASNet-Mobile performance statistics for the test set