1 Introduction

The latest coronavirus (COVID-19) pandemic started in December 2019 in Wuhan, China, and has become a major global public health issue [1, 2]. The virus responsible for COVID-19 was named SARS-CoV-2 [3], the severe acute respiratory syndrome coronavirus 2. Coronaviruses (CoV) are a large family of viruses that cause respiratory conditions such as Middle East Respiratory Syndrome (MERS-CoV) and Severe Acute Respiratory Syndrome (SARS-CoV). The novel coronavirus (COVID-19) was detected in 2019 and had never before been observed in humans.

The zoonotic (animal-to-human) transmission of coronaviruses was reported in [4]. Research has shown that SARS-CoV was transmitted from civet cats to human beings and that MERS-CoV was transmitted from dromedary camels to human beings [5]. Table 1 shows the definition, mortality rate, and origin of these coronaviruses.

Table 1 Detail of coronavirus

Person-to-person respiratory transmission caused the disease to spread rapidly. The signs of infection include respiratory problems, fatigue, cough and dyspnea. In more severe cases, the disease can lead to severe acute respiratory syndrome, septic shock, multi-organ failure and death [6]. Men have been found to be more severely affected than women, and no deaths have been reported among children aged 0–9 years [7]. Respiratory rates in patients with COVID-19 pneumonia were shown to be higher than in healthy people [8]. Even in many developed countries, the increasing demand for intensive care facilities has brought the health system to a standstill. As stated by the Chinese government, the diagnosis of COVID-19 should be confirmed by reverse transcription polymerase chain reaction (RT-PCR) or gene sequencing of respiratory or blood samples, as a key indicator for hospitalisation. In the current public health emergency, however, the low sensitivity of RT-PCR obstructs the detection and treatment of many COVID-19 patients. Moreover, because of the extremely infectious nature of the virus, a wider population is at risk of infection [9]. Diagnosis therefore now includes all individuals who display the characteristic COVID-19 pneumonia pattern in the chest, rather than only patients who have waited for a positive virus test. This allows authorities to isolate and treat patients more quickly. Even when COVID-19 is not fatal, some survivors are left with irreversible lung damage. According to the World Health Organisation, COVID-19, like SARS, also opens holes in the lungs, giving them a ‘honeycomb-like appearance’ [7].

One of the methods used to diagnose pneumonia is chest computed tomography (CT). Automated image analysis tools based on artificial intelligence (AI) have been developed to detect, quantify and monitor coronavirus and to differentiate between patients with coronavirus and disease-free individuals [10]. Fei et al. [11] developed a deep-learning system to automatically segment the whole lung and the infection sites in chest CT. Xiaowei et al. [12] aimed to develop an early screening model that distinguishes COVID-19 pneumonia from influenza-A viral pneumonia and healthy cases using CT images and deep-learning techniques. Shuai et al. [13] developed a deep-learning system based on the radiographic changes of COVID-19 in CT images that can extract the graphical characteristics of COVID-19 before pathogenic testing, thus saving crucial time for the diagnosis of the disease. Hamimi’s [14] analysis of MERS-CoV found that characteristics such as pneumonia can be present in chest X-ray and CT images. Xuanyang et al. [15] used data-mining techniques to differentiate between SARS and conventional pneumonia based on X-ray images.

In a study of 41 COVID-19 patients, Huang et al. [16] identified clinical characteristics indicating that fever, cough, and myalgia or fatigue were typical onset symptoms. All 41 patients had pneumonia, and their chest CT findings were abnormal. The first evidence of human-to-human transmission of COVID-19 was found at the University of Hong Kong by the Kok-KH team [17]. Zhao et al. [18] proposed a mathematical model to estimate the actual number of unreported COVID-19 cases during the first half of January 2020. They concluded that 469 cases went unreported from 1 to 15 January 2020, and that cases increased 21-fold after 17 January 2020. Nishiura et al. [19] proposed a model to predict the COVID-19 infection rate in Wuhan, China, based on data from 565 Japanese citizens evacuated from Wuhan on 29–31 January 2020. They estimated an infection rate of 9.5% and a mortality rate of 0.3–0.6%. However, the number of Japanese people evacuated from Wuhan is too small to estimate infection and death rates reliably. Tang et al. [20] proposed a mathematical model to assess the transmission risk of COVID-19. They concluded that the basic reproduction number could be as high as 6.47. They also estimated the number of confirmed cases over 7 days (23–29 January 2020) and expected the peak to be reached after 2 weeks (from 23 January 2020). Data from 47 patients were used in [21] to estimate the risk of sustained human-to-human transmission of COVID-19. The author concluded that the probability of sustained transmission is 0.4, but that it falls to 0.012 if the time from symptom onset to hospitalisation is halved. The authors of [22] presented a model to assess the risk of death from COVID-19. The estimates for two different scenarios are 5.1% and 8.4%, and the reproduction number for the two scenarios was estimated at 2.1 and 3.2, respectively.

X-ray imaging is used to scan the body for fractures, bone dislocations, lung infections, pneumonia and tumours, and studies have shown that it could also be used in the COVID-19 pandemic. CT scanning is an advanced form of X-ray imaging that gives a clearer view of the soft internal tissues and organs of the examined part of the body [23]. Plain X-ray imaging is easier, more practical and less hazardous than CT. If COVID-19 pneumonia is not recognised and treated quickly, mortality can increase.

In this paper, we present an automated COVID-19 prediction approach based on deep convolutional networks, using pre-trained transfer-learning models and chest X-ray images. The chest X-ray images of 50 COVID-19 patients are taken from Dr. Joseph Cohen’s open-source GitHub repository [24]. Deep features are extracted from this dataset using a collection of deep-learning models such as AlexNet, VGG16, VGG19, GoogleNet, ResNet18, ResNet50 and ResNet101. The deep features obtained from these models are classified using the J48 algorithm. The performance of deep-learning models is sensitive to parameter tuning. To address this problem, a multi-objective algorithm [25] is used to tune the CNN model parameters efficiently. Finally, we verify the findings across the deep feature extraction methods (see Fig. 1).

Fig. 1 COVID-19 classification approach

The rest of this paper is structured as follows: Sect. 2 presents the optimization process used in this research. Sects. 3 and 4 discuss the deep-learning and J48 models. The dataset is described in Sect. 5, followed by the performance metrics in Sect. 6. Experimental results and discussions are presented in Sect. 7. Finally, conclusions are drawn in Sect. 8.

2 Optimization

2.1 Emperor Penguin Optimizer (EPO)

Optimization is the process of finding the best solution among a set of candidate solutions [26,27,28,29,30,31,32,33,34]. The emperor penguin optimizer [35] is inspired by the huddling behaviour of emperor penguins, which are native to the Antarctic. Emperor penguins usually move in colonies when foraging, and this huddling behaviour is a unique characteristic of these social animals. Hence, in the mathematical model, the prime objective is to identify an effective mover from the swarm. To achieve this, the distances between Emperor Penguins (EPs) \((Z_{{{\text{ep}}}} )\) are calculated, followed by their temperature profile \((T_{{{\text{mp}}}} )\). From these, the effective mover is identified, and the locations of the other EPs are updated to approach the optimum value.

The temperature profile of the Emperor Penguins is calculated as:

$$T_{{{\text{mp}}}}^{^{\prime}} = \left( {T_{{{\text{mp}}}} - \frac{{{\text{Itr}}_{{{\text{max}}}} }}{{k - {\text{Itr}}_{{{\text{max}}}} }}} \right),$$
(1)
$$T_{{{\text{mp}}}} = \left\{ {\begin{array}{*{20}c} {0 \;{\text{if}} R_{{{\text{nd}}}} > 0.5} \\ {1 \;{\text{if}} R_{{{\text{nd}}}} < 0.5} \\ \end{array} } \right.,$$
(2)

where \(k\) is the current iteration, \({\text{Itr}}_{{{\text{max}}}}\) is the maximum number of iterations and \(R_{{{\text{nd}}}}\) is a random number in the range [0, 1].

The generated huddle boundary determines the distance of each EP from the best solution found so far, that is, the solution whose fitness value is nearest to the optimum. The other emperor penguins update their positions relative to this solution as:

$$\overrightarrow {{D_{{{\text{eps}}}} }} = \left| {X_{{\text{s}}} \left( {\vec{A}} \right) \cdot \overrightarrow {{Z_{{\text{b}}} \left( k \right)}} - \vec{C} \cdot \overrightarrow {{Z_{{{\text{ep}}}} \left( k \right)}} } \right|.$$
(3)

Here, \(\overrightarrow {{D_{{{\text{eps}}}} }}\) is the distance from an EP to the best solution. \(\vec{A}\) and \(\vec{C}\) are two vectors responsible for avoiding collisions with other EPs. \(\overrightarrow {{Z_{{\text{b}}} }}\) is the best solution found so far, and \(\overrightarrow {{Z_{{{\text{ep}}}} }}\) is the EP’s position vector. \(X_{{\text{s}}} ()\) represents the social forces of the EPs.

Since EPs generally huddle together to maintain temperature, special care must be taken to avoid collisions among neighbours. For this reason, two vectors (\(\vec{A}\)) and (\(\vec{C}\)) are calculated as:

$$\vec{A} = \left\{ {M \times \left( {T_{{{\text{mp}}}}^{^{\prime}} + X_{{{\text{grid}}}} \left( {{\text{acrcy}}} \right)} \right) \times {\text{Rand}}()} \right\} - T_{{{\text{mp}}}}^{^{\prime}} .$$
(4)
$$\vec{C} = {\text{Rand}}().$$
(5)
$$X_{{{\text{grid}}}} \left( {{\text{acrcy}}} \right) = \left| {\vec{Z}_{{\text{b}}} - \vec{Z}_{{{\text{ep}}}} } \right|.$$
(6)

Here, \(M\) is the movement parameter, set to 2; \({\text{Rand}}\) is a random value in the range [0, 1]; and \(X_{{{\text{grid}}}} \left( {{\text{acrcy}}} \right)\) is the absolute difference between the position of an EP and the best solution. The social forces of the EPs are computed as:

$$X_{{\text{s}}} \left( {\vec{A}} \right) = \left( {\sqrt {f.e^{{ - \frac{k}{v}}} - e^{ - k} } } \right).$$
(7)

Here, \(e\) is the exponential function, and \(f\) and \(v\) are control parameters for better exploration and exploitation, with values in the ranges [2, 3] and [1.5, 2], respectively. The positions of the EPs are then updated with respect to the best agent obtained as:

$$\vec{Z}_{{{\text{ep}}}} \left( {k + 1} \right) = \vec{Z}_{{\text{b}}} \left( k \right) - \vec{A} \cdot \vec{D}_{{{\text{eps}}}} .$$
(8)

The Multi-objective Emperor Penguin Optimizer (MOEPO) algorithm [25] extends EPO with archive and grid mechanisms. In addition, a group selection method is employed to update the search agents for better exploration and exploitation. This algorithm is used to tune the parameters of the CNN model.
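
The single-objective update rules of Eqs. (1)–(8) can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not model the archive, grid or group selection of MOEPO [25]. The function name `epo_minimize`, the sphere test function, the box bounds with clipping, the per-agent redraw of \(f\) and \(v\), and the best-so-far record are all assumptions made for illustration. The loop stops before \(k = {\text{Itr}}_{{{\text{max}}}}\) because Eq. (1) is undefined at that point.

```python
import math
import random


def epo_minimize(fitness, dim, bounds, n_agents=20, itr_max=50, M=2.0, seed=0):
    """Illustrative single-objective EPO loop following Eqs. (1)-(8)."""
    rng = random.Random(seed)
    lo, hi = bounds
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=fitness)[:]           # Z_b: best solution so far
    best_fit = fitness(best)
    history = [best_fit]

    for k in range(1, itr_max):                  # k < itr_max keeps Eq. (1) defined
        for i, z_ep in enumerate(agents):
            t_mp = 0.0 if rng.random() > 0.5 else 1.0               # Eq. (2)
            t_prime = t_mp - itr_max / (k - itr_max)                # Eq. (1)
            f = rng.uniform(2.0, 3.0)            # assumed: redrawn per agent
            v = rng.uniform(1.5, 2.0)            # assumed: redrawn per agent
            x_s = math.sqrt(f * math.exp(-k / v) - math.exp(-k))    # Eq. (7)
            new_pos = []
            for zb, ze in zip(best, z_ep):
                x_grid = abs(zb - ze)                               # Eq. (6)
                a = M * (t_prime + x_grid) * rng.random() - t_prime  # Eq. (4)
                c = rng.random()                                    # Eq. (5)
                d = abs(x_s * zb - c * ze)                          # Eq. (3)
                z = zb - a * d                                      # Eq. (8)
                new_pos.append(min(hi, max(lo, z)))  # assumed: clip to bounds
            agents[i] = new_pos
        candidate = min(agents, key=fitness)
        candidate_fit = fitness(candidate)
        if candidate_fit < best_fit:             # keep the best-so-far solution
            best, best_fit = candidate[:], candidate_fit
        history.append(best_fit)
    return best, best_fit, history


if __name__ == "__main__":
    sphere = lambda z: sum(x * x for x in z)     # hypothetical test function
    _, fit, _ = epo_minimize(sphere, dim=3, bounds=(-10.0, 10.0))
    print("best fitness found:", fit)
```

In a tuning setting, `fitness` would be replaced by a function that trains or evaluates the CNN for a given parameter vector and returns an error to minimise; that mapping is not specified in the text and is left open here.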

3 Deep-Learning Method

Deep learning is a sub-branch of machine learning, inspired by the structure of the brain. Deep-learning approaches used in the analysis of medical images have shown positive results in many fields in recent years. Deep-learning models are used to process the image and signal data collected with medical imaging technologies such as MRI, computed tomography (CT) and X-ray. Through such analysis, diseases such as diabetes mellitus, brain tumours, skin cancer and breast cancer have been detected and diagnosed more easily [36, 37].

Convolutional neural networks (see Figs. 2, 3), like traditional neural networks, are inspired by the neurons of the human nervous system. Apart from the input and output layers, every odd-numbered layer is a convolution layer and every even-numbered layer is a pooling (subsampling) layer. The CNN architecture has 8 layers. For the convolution layers, we used 12, 8 and 6 features, each connected to five-kernel pooling layers. The batch size was set to 100 and the sample cap was set to 100 × 1.

Fig. 2 Neural network

Fig. 3 Architecture of CNN

4 Decision Tree (J48 Algorithm)

Decision trees are supervised algorithms for classification or regression. Among the decision tree algorithms, J48 is selected for COVID-19 recognition because of its popularity and the accuracy of its results. It is an extension of ID3 (Iterative Dichotomiser 3). Additional features of J48 include (a) derivation of rules, (b) pruning of trees, (c) handling of missing values and (d) continuous attributes. It is one of the best classification algorithms for analysing both continuous and categorical data. The main aim is to divide the data into subsets that are as uniform in class as possible, so that the target variable can be predicted as accurately as possible. J48 allows classification based on either derived rules or a decision tree. It aims to reduce the impurity or uncertainty in the data. To handle a continuous attribute, its list of values is divided at a threshold into those above the threshold and those below or equal to it. Missing values are marked with “?” and are not included in the entropy and gain calculations.

The algorithm takes a set of labelled (classified) training data as input and generates a decision tree as output, in which every leaf node is a decision and every non-leaf node is a test. Following the path of tests from the root node leads to a leaf node, which indicates the class to which the instance belongs. After the tree is built, it is used to classify each tuple in the database. J48 ignores missing values when the tree is formed; the value of such an item can be predicted from the known attribute values of other records. The classification model is built as a tree in a top–down manner, using a uniform criterion for dividing the data. The information gain of each attribute is measured according to entropy. The attribute with the highest normalised information gain is chosen for decision making. The best attribute is selected as the root, and the sub-trees below it are then constructed recursively in the same way.
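
The entropy-based split selection described above can be sketched in a few lines of Python. This is a simplified illustration rather than the J48 implementation: J48 is the Weka implementation of C4.5, which additionally weights the gain by the fraction of known values and prunes the tree. The function names `entropy` and `best_threshold_split` and the toy attribute values are hypothetical.

```python
import math
from collections import Counter

MISSING = "?"  # marker for missing values, as in the text


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_threshold_split(values, labels):
    """Return (threshold, information_gain, gain_ratio) for one continuous attribute.

    Records whose value is MISSING are left out of the entropy and gain calculation.
    """
    known = sorted((float(v), y) for v, y in zip(values, labels) if v != MISSING)
    ys = [y for _, y in known]
    base = entropy(ys)
    best = (None, 0.0, 0.0)
    for i in range(1, len(known)):
        low, high = known[i - 1][0], known[i][0]
        if low == high:
            continue                              # no threshold between equal values
        threshold = (low + high) / 2              # simplification: use the midpoint
        left, right = ys[:i], ys[i:]              # <= threshold, > threshold
        w_left = len(left) / len(ys)
        w_right = len(right) / len(ys)
        gain = base - (w_left * entropy(left) + w_right * entropy(right))
        split_info = entropy(["L"] * len(left) + ["R"] * len(right))
        ratio = gain / split_info if split_info > 0 else 0.0
        if gain > best[1]:
            best = (threshold, gain, ratio)
    return best


if __name__ == "__main__":
    # Hypothetical values of one deep feature for five images, one of them missing.
    feature = [1.0, 2.0, MISSING, 8.0, 9.0]
    classes = ["normal", "normal", "covid", "covid", "covid"]
    print(best_threshold_split(feature, classes))
```

Repeating this computation for every attribute and splitting on the one with the highest normalised gain gives the root test; applying the same step to each resulting subset builds the sub-trees recursively.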

5 Dataset Description

In this analysis, chest X-ray images of 50 COVID-19 patients [24] were taken from the open-source GitHub repository supplied by Dr. Joseph Cohen. This archive consists of chest X-ray/CT images, mainly of patients with ARDS, COVID-19, Middle East Respiratory Syndrome (MERS), pneumonia and severe acute respiratory syndrome (SARS). In addition, 50 normal chest X-ray images were taken from the Kaggle repository called “Pneumonia” [38]. Our studies were therefore performed using a dataset of 50 normal patients [38] and 50 COVID-19 patients [24]. All images in this dataset were resized to 280 × 280 pixels. Chest X-ray images of COVID-19 and normal patients are shown in Figs. 4 and 5.

Fig. 4 X-ray images of coronavirus (COVID-19) disease affected patients

Fig. 5 X-ray images of normal patients

In this study, we classify COVID-19 chest X-ray images using the AlexNet, VGG16, VGG19, GoogleNet, ResNet18, ResNet50, ResNet101, InceptionV3, InceptionResNetV2, DenseNet201 and XceptionNet deep convolutional neural networks (CNNs). The deep features are extracted from the fully connected layer and fed to the classifier for training. The J48 decision tree classifier is applied to the deep features obtained from each CNN. Classification is then performed and the performance of all classification models is measured. For each CNN model, the deep features are extracted from a particular layer, and these features are passed to the J48 classifier for the detection of COVID-19. Table 2 describes the feature layer and feature vector of each model.

Table 2 Feature layer and feature vector characteristics of CNN models

6 Performance Metrics

Five well-known performance metrics are used in this paper to evaluate the deep-learning models:

$${\text{Accuracy}} = \left( {{\text{TN}} + {\text{TP}}} \right)/\left( {{\text{TN}} + {\text{TP}} + {\text{FN}} + {\text{FP}}} \right),$$
(9)
$${\text{Recall}} = {\text{TP}}/\left( {{\text{TP}} + {\text{FN}}} \right),$$
(10)
$${\text{Specificity}} = {\text{TN}}/\left( {{\text{TN}} + {\text{FP}}} \right),$$
(11)
$${\text{Precision}} = {\text{TP}}/\left( {{\text{TP}} + {\text{FP}}} \right),$$
(12)
$${\text{F1 - score}} = {2} \times \left( {\left( {{\text{precision }} \times {\text{recall}}} \right)/\left( {{\text{precision}} + {\text{recall}}} \right)} \right),$$
(13)

where TP, FP, TN, and FN represent the number of true positives, false positives, true negatives and false negatives, respectively. TP is the number of positive (COVID-19) cases correctly labelled by the model as COVID-19; FP is the number of negative (normal) cases mislabelled as positive (COVID-19); TN is the number of negative (normal) cases correctly labelled as normal; FN is the number of positive (COVID-19) cases mislabelled as negative (normal) by the model.
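
Eqs. (9)–(13) translate directly into code. The short Python sketch below computes all five metrics from the four counts; the function name `classification_metrics`, the zero-denominator convention and the example counts are assumptions for illustration and are not results from this paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return the five metrics of Eqs. (9)-(13) as fractions in [0, 1]."""

    def ratio(num, den):
        # Assumed convention: a metric with an empty denominator is reported as 0.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + tn + fp + fn)                   # Eq. (9)
    recall = ratio(tp, tp + fn)                                    # Eq. (10)
    specificity = ratio(tn, tn + fp)                               # Eq. (11)
    precision = ratio(tp, tp + fp)                                 # Eq. (12)
    f1_score = ratio(2 * precision * recall, precision + recall)   # Eq. (13)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1_score,
    }


if __name__ == "__main__":
    # Hypothetical test split of 20 images: 10 COVID-19 and 10 normal.
    for name, value in classification_metrics(tp=9, fp=0, tn=10, fn=1).items():
        print(f"{name}: {value:.4f}")
```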

7 Experimental Results and Discussions

In this study, we analysed the performance of COVID-19 recognition classification models based on 11 CNN models. The experiments were conducted with MATLAB 2019a. All programmes were run in a Microsoft Windows environment on an 8th-generation Core i7 processor with 8 GB of main memory. Well-known performance metrics, namely Accuracy, Recall, Specificity, Precision and F1-Score, are used for each classifier. The results in Table 3 are based on an average of 50 independent simulations. The training, validation and test ratio for each execution is 60:20:20, and the random selection of the training, validation and test sets is redrawn for each execution.
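
The evaluation protocol described above (a fresh random 60:20:20 split per run, averaged over 50 runs) can be sketched as follows. The experiments in this paper were run in MATLAB; this Python version is only an illustration, and the names `split_indices` and `average_over_runs`, the use of the run index as the random seed, and the `evaluate` callback are assumptions.

```python
import random


def split_indices(n_samples, ratios=(0.6, 0.2, 0.2), seed=None):
    """Randomly partition sample indices into training, validation and test sets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(round(ratios[0] * n_samples))
    n_val = int(round(ratios[1] * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]             # the remainder is the test set
    return train, val, test


def average_over_runs(evaluate, n_samples=100, n_runs=50):
    """Average a score over independent runs, each with a new random split."""
    scores = []
    for run in range(n_runs):
        train, val, test = split_indices(n_samples, seed=run)  # assumed seeding
        scores.append(evaluate(train, val, test))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    train, val, test = split_indices(100, seed=0)
    print(len(train), len(val), len(test))
```

Here `evaluate` stands for any function that trains a classifier on the training indices, uses the validation indices for model selection, and returns one metric computed on the test indices.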

Table 3 The obtained results on different models using performance metrics

Figures 6, 7, 8, 9 and 10 show the performance metric values of the different models. The results show that ResNet101 plus J48 is superior to the other classification models in terms of Accuracy, Recall, Specificity, Precision and F1-Score. Hence, the ResNet101- and J48-based CNN method gives the best classification for the detection of COVID-19, with Accuracy, Recall, Specificity, Precision and F1-Score of 98.50%, 100%, 97.20%, 100% and 98.40%, respectively.

Fig. 6 The accuracy results using different classification models

Fig. 7 The recall results using different classification models

Fig. 8 The specificity results using different classification models

Fig. 9 The precision results using different classification models

Fig. 10 The F1-score results using different classification models

The CNN deep-learning model efficiently detects COVID-19 from the chest X-ray images of normal and coronavirus-affected patients. The detected parts of the normal images and the COVID-19 images are shown in Figs. 11 and 12. It can be seen from these figures that the segmented part of the chest X-ray images of COVID-19 patients is smaller than that of the normal patients. Moreover, the computational times of the different classification models are also calculated (see Fig. 13). Overall, it can be concluded that the CNN deep-learning model combined with the J48 decision tree approach is able to classify the chest X-ray images of coronavirus patients. Identification of coronavirus (COVID-19) is now a vital task for physicians and researchers. The WHO declared the spread of COVID-19 a global pandemic in March 2020. To reduce the spread of COVID-19 and initiate early medical treatment, it is a crucial priority to identify infected individuals so that preventive procedures can be performed.

Fig. 11 Segmented chest area of normal patients using CNN approach

Fig. 12 Calculated computational time to predict the COVID-19 disease using different CNN models

Fig. 13 Segmented chest area of COVID-19 patients using CNN approach

8 Conclusions

The contents of this paper are based on data available from the WHO, the EDC (an agency of the European Union) and other official websites. The chest X-ray images used for simulation purposes were collected from the GitHub and Kaggle repositories for coronavirus identification using deep features and the J48 approach. The features are extracted using 11 pre-trained CNN models and supplied individually to the J48 classifier and the MOEPO algorithm. Statistical analysis is conducted to select the best classification model. The ResNet101 plus J48 classification model performs statistically better than the other ten competitor models.