Introduction

A strain of the severe acute respiratory syndrome virus has been reported as the cause of the respiratory disease COVID-19 (Gorbalenya et al. 2020). The World Health Organization (WHO) has issued a global warning about the outbreak of the coronavirus (Chan et al. 2020; WHO 2020). This disease is reported to be very aggressive and it has affected millions of people around the world (Ivanov 2020; Rahimi et al. 2020). Attempts have been made around the world to deal with COVID-19 from a variety of medical, psychological, and engineering perspectives (Pan et al. 2020). Despite suggestions for treatment and prevention of the disease, there is still no definitive method against it (Di Lorenzo et al. 2020; Saghazadeh and Rezaei 2020). Quarantine and social distancing have been proposed as preventive measures for cities with high prevalence (Peak et al. 2020; Thu et al. 2020). Several advanced epidemiological models have been designed to provide prevention, detection, and prediction approaches and to present the effects of this disease (Koolhof et al. 2020). The impact of underlying diseases such as cardiovascular disease on COVID-19 patients has been studied (Bansal 2020). The virus can also increase the risk of death in people with chronic obstructive pulmonary disease, diabetes mellitus, and kidney failure (Rahmani and Mirmahaleh 2021). Some studies have shown that pregnant women suffer worse than non-pregnant women (Fan et al. 2020). Multi-type hierarchical clustering has been used to predict possibly unknown ncRNA-disease relationships (Barracchia et al. 2020). Some papers aim to find possible treatments or drug/gene-disease association clustering (Loucera et al. 2020). These articles show the high complexity of this disease for different patients. However, an exact medical method for diagnosing the different levels of symptomatic COVID-19 patients has not yet been identified.
Moreover, accurate identification by physicians of the relationships among patient characteristics takes a long time. Machine learning methods, on the other hand, can learn many relationships among different features of asymptomatic and symptomatic COVID-19 patients based on their data. These methods perform well in diagnosing and predicting the prevalence of COVID-19 (Rekha 2020). It is noteworthy that radiological imaging techniques, X-rays, and CT scans are valuable complements for the diagnosis of different levels of symptomatic and asymptomatic COVID-19 patients (Rubin et al. 2020; Shi et al. 2020). Machine learning methods can consider different features of COVID-19 patients along with their image data to detect their levels.

Therefore, a new hierarchical model is proposed in this paper to find different levels of symptomatic COVID-19 patients. Detecting whether symptomatic COVID-19 patients need the ICU and whether or not they are in the end-stage are the important targets. In this regard, two real datasets about COVID-19 patients are selected and their features are studied to choose suitable ones among those available. One of these datasets has both clinical and image data about COVID-19 patients. Therefore, some image processing and deep learning methods are utilized to extract meaningful features from the available images. It should be noted that physicians first utilize clinical features to detect the severity level of COVID-19 patients. Some symptomatic COVID-19 patients are sent home after examination and training, and the rest are hospitalized based on medical diagnoses. Then, physicians decide whether hospitalized symptomatic COVID-19 patients need the ICU and whether or not they are in the end-stage according to their clinical and image data. Accordingly, the model proposed in this paper is quite analogous to what happens in reality.

Then, different types of well-known classification methods including case-based reasoning (CBR), decision tree (DT), convolutional neural networks (CNN), K-nearest neighbors (KNN), learning vector quantization (LVQ), multi-layer perceptron (MLP), Naive Bayes (NB), radial basis function network (RBF), support vector machine (SVM), recurrent neural networks (RNN), fuzzy type-I inference system (FIS), and adaptive neuro-fuzzy inference system (ANFIS) are designed for these datasets and their results are analyzed for different random groups of the train and test data. The best classifier for these datasets is selected based on various indicators including accuracy, sensitivity, specificity, precision, F-score, and G-mean. Among all classifiers, ANFIS achieves the best performance on most indicators for both datasets. Besides, the fuzzy C-mean (FCM) clustering method is utilized to cluster different groups of COVID-19 patients and reinforce ANFIS classification learning. It is noteworthy that there are various linear and nonlinear relationships between the features of patients and the class. This greatly increases the complexity of classification learning. Consequently, the FCM clustering method is utilized to deal with it. The divide-and-conquer approach imposed by the FCM clusters allows classifiers to be more specialized in COVID-19 subpopulations, typically leading to higher classification performance.

It is noteworthy that one of the selected datasets has image data. Therefore, some features are extracted from the image data and added to the clinical data. Then, the number of features is reduced using the principal component analysis (PCA) method to improve the results of classification. The computational results verify the superiority of the proposed hierarchical model compared to the other classifiers utilized in this paper. Moreover, the results of the Wilcoxon signed-rank test support the effectiveness of the extracted image-based features and the PCA feature reduction method. The model is compatible both with the dataset that has only clinical data and with the dataset that has clinical and image data.

The rest of the article is organized as follows. The “Literature review” section reviews relevant studies, and the “Contributions” section summarizes the contributions of this paper. Next, the selected datasets, utilized methods, the proposed hierarchical model, and the different types of extensions are presented in the “Methods” section. The clinical and image data of the datasets are then introduced and the relevant experimental results are presented. The last section is about conclusions and future studies.

Literature review

Many diseases affect people around the world. There are different linear and nonlinear relationships among features of patients that create complexities for medical treatments. Therefore, machine learning methods are widely utilized in these fields to find better diagnoses and treatment plans (Ershadi and Seifi 2020a, b). These methods are appropriate for detecting infectious diseases, especially COVID-19, a pandemic that has spread widely across the world. The following is a quick review of the most important research studies that focus on detecting different diseases using learning approaches.

Machine learning methods perform well in the differential diagnosis of COVID-19 (Dai et al. 2020; Rauschecker et al. 2020). Using machine learning and data from 29 patients at Tongji Hospital in China, another algorithm was developed to find the mortality risk of infected COVID-19 patients (Yan et al. 2020). On the other hand, these methods present better performances in comparison with medical diagnoses based on experts’ knowledge (Ershadi and Seifi 2020a, b). Another study focuses on the diagnosis of diabetes type-II patients and proposes a hybrid machine learning-based ensemble model for this purpose (Sarwar et al. 2020). Heart disease diagnosis is studied using a new expert system based on a fuzzy Bayesian network. This study presents the advantages of a machine learning method based on experts’ knowledge (Zarandi et al. 2017).

Another study uses medical information such as age, sex, income level, place of residence, household type, disability, respiratory symptoms, route of infection, and medical background to extract new meaningful features for COVID-19 diagnosis. The extracted features can improve the learning of classification methods (including RBF, SVR, and KNN) in comparison with clinical data alone (An et al. 2020). Newly extracted features are considered in another paper for COVID-19 patients, as well. This paper showed that machine learning methods can save radiologists time for diagnosis and can be more cost-effective than standard COVID-19 tests (Bullock et al. 2020). Another method is proposed for accurate and automatic diagnosis of COVID-19 patients based on advanced artificial intelligence using chest CT. This method can classify the chest image with high accuracy according to extensive computational results (Ozturk et al. 2020). A clinical study with 1014 patients in Wuhan reported that chest CT achieved 60% accuracy, 97% sensitivity, and 25% specificity for COVID-19 diagnosis (Ai et al. 2020). A convolutional neural network model is proposed for the automatic diagnosis of COVID-19 from chest X-ray images of patients (Elaziz et al. 2020). The diagnosis of respiratory decompensation in COVID-19 patients is studied in another article using a machine-learning approach (Burdick et al. 2020). Other recently published papers in this area are compared in Table 1 in terms of different features, including the type of learning algorithms utilized and the modality of images.

Table 1 Some research papers and their techniques

According to the literature review, the diagnosis of COVID-19 patients is helpful and most papers proposed different machine-learning methods for this purpose (Lalmuanawma et al. 2020). However, the increasing number of COVID-19 patients in different countries leads to many problems in hospitalization. COVID-19 has new symptoms and effects in comparison with other infectious diseases, which make it difficult to find suitable medical treatments for hospitalized COVID-19 patients. Accordingly, several critical evaluations of the significant papers in the literature are in order. First, although the methods proposed in the literature are applicable for disease diagnosis, finding different levels of hospitalized symptomatic COVID-19 patients is ignored (see Table 1 and Appendix). Secondly, evaluations of the proposed models are limited to accuracy in most papers, and other measurements including sensitivity and specificity are neglected. Third, most papers consider COVID-19 datasets without images and only a few classifiers. These points lead to different research gaps. First, COVID-19 diagnosis alone cannot support physicians in finding the correct levels of COVID-19 patients. Besides, the accuracies of a few classifiers are evaluated regardless of their other measurements and of the performances of other types of classifiers.

Contributions

According to the literature, the contributions of this paper are summarized as follows to close part of the mentioned gaps:

  1. A new hierarchical model is proposed using ANFIS classifiers and the FCM clustering method in this paper. Its structure is designed based on experts’ knowledge and the real medical process. FCM reinforces the ANFIS classification learning phase based on the features of COVID-19 patients

  2. Two real datasets about COVID-19 patients are studied in this paper. One of these datasets has both clinical and image data. Therefore, appropriate features are extracted from its image data and considered together with the available meaningful clinical data. Different levels of hospitalized symptomatic COVID-19 patients are considered in this paper, including whether patients need the ICU and whether or not they are in the end-stage

  3. Well-known classification methods including case-based reasoning (CBR), decision tree, convolutional neural networks (CNN), K-nearest neighbors (KNN), learning vector quantization (LVQ), multi-layer perceptron (MLP), Naive Bayes (NB), radial basis function network (RBF), support vector machine (SVM), recurrent neural networks (RNN), fuzzy type-I inference system (FIS), and adaptive neuro-fuzzy inference system (ANFIS) are designed for these datasets and their results are analyzed for different random groups of the train and test data

  4. Because the utilized datasets are unbalanced, different performance indicators of the classifiers including accuracy, sensitivity, specificity, precision, F-score, and G-mean are compared to find the best classifier. ANFIS classifiers have the best results for both datasets

  5. To reduce the computational time, the effects of the principal component analysis (PCA) feature reduction method on the performances of the proposed model and classifiers are studied. According to the results and the statistical test, the proposed hierarchical model has the best performance among the classifiers utilized.
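The six indicators named in the fourth contribution can all be derived from a binary confusion matrix. The following is a minimal sketch under stated assumptions: the function name `binary_metrics` and the example counts are hypothetical, and G-mean is taken in its common form as the geometric mean of sensitivity and specificity, which the text does not define explicitly.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute six indicators from binary confusion-matrix counts.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    # Assumption: G-mean = sqrt(sensitivity * specificity).
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "g_mean": g_mean,
    }

# Hypothetical unbalanced test set: 13 positive and 87 negative patients.
m = binary_metrics(tp=8, fp=2, tn=85, fn=5)
```

On such an unbalanced example, accuracy stays high even though more than a third of the positive cases are missed, which is why sensitivity, F-score, and G-mean are reported alongside it.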

Methods

Before getting into the details of the proposed hierarchical model, some brief explanations about the scheme of research, selected datasets, data preprocessing, utilized feature reduction, classifications, FCM clustering method, and image processing are presented in the following to make this paper self-contained.

Schema of research

Fig. 1 illustrates the scheme utilized in this research.

Fig. 1 Utilized scheme of research

Selected datasets

Two real datasets about symptomatic COVID-19 patients are utilized in this paper as follows:

There is a collection of radiographic and CT imaging studies for patients who tested positive for COVID-19 in the second dataset. Clinical data correlates with key radiology for every patient from the same population. Detailed descriptions of the second dataset are demonstrated in Table 2.

Table 2 Detailed description of the second dataset

The size of the first dataset is less than 50 MB and there is no image data about patients in this dataset. These datasets help to evaluate the performances of the proposed model both for clinical data only and for clinical data combined with image data. It is noteworthy that the compatibility of classification models with image data is important for some diseases, especially COVID-19, due to its characteristics.

Data preprocessing

There are both quantitative and qualitative clinical features in the selected datasets. First, the qualitative features are categorized based on the items of each attribute to obtain new quantitative features. Secondly, data with missing features are deleted from the first dataset because of the large number of available data in this dataset. However, missing features of the second dataset are replaced with the average of the available values because of the limited number of patients in this dataset. In the third step, features with equal values for all patients (e.g., name of the hospital) and features with unique values for every patient (e.g., patient ID) are eliminated in both datasets because these columns carry no meaningful information for the decision-making process. Moreover, the remaining clinical features have different dimensions and scales in both datasets. Therefore, in the fourth step, they are normalized such that their values lie between 0 and 1. Finally, in the fifth step, the prepared datasets are divided into two sections called train and test data. It is noteworthy that these sections are selected randomly for both datasets to reduce dependency on the data.
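Steps two to five of this preprocessing can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `preprocess`, its parameters, and the toy matrix are hypothetical, qualitative features are assumed to be already encoded as numbers (step one), missing values are assumed to be stored as NaN, and deletion is assumed to mean dropping whole records.

```python
import numpy as np

def preprocess(X, y, id_columns=(), impute=True, test_fraction=0.3, seed=0):
    """Clean, normalise, and randomly split a numeric feature matrix."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Step 2: missing values are mean-imputed (small dataset)
    # or the affected records are dropped (large dataset).
    if impute:
        X = np.where(np.isnan(X), np.nanmean(X, axis=0), X)
    else:
        keep_rows = ~np.isnan(X).any(axis=1)
        X, y = X[keep_rows], y[keep_rows]

    # Step 3: drop identifier columns and columns with one constant value.
    keep_cols = [j for j in range(X.shape[1])
                 if j not in id_columns and X[:, j].max() > X[:, j].min()]
    X = X[:, keep_cols]

    # Step 4: min-max normalisation of every remaining feature to [0, 1].
    lo, hi = X.min(axis=0), X.max(axis=0)
    X = (X - lo) / (hi - lo)

    # Step 5: random split into train and test data.
    order = np.random.default_rng(seed).permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Toy data: column 0 is a patient ID, column 1 is constant, one value is missing.
X_raw = [[1, 7, 10.0, 5], [2, 7, np.nan, 3], [3, 7, 30.0, 1], [4, 7, 20.0, 3]]
y_raw = [0, 1, 1, 0]
X_train, y_train, X_test, y_test = preprocess(
    X_raw, y_raw, id_columns=(0,), test_fraction=0.25)
```

The sketch follows the order given in the text (normalise, then split). Fitting the scaling on the train data only would be a stricter alternative, but that is not what the text describes.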

Utilized feature reduction, classification, and clustering methods

Principal component analysis

PCA is a feature reduction method that computes the eigenvalues and eigenvectors of the covariance matrix of the dataset’s features. The eigenvectors define the principal components, and each eigenvalue gives the amount of variation captured along its component. The dimensions of the new space are usually chosen as the components with the largest eigenvalues. Finally, all data are projected into this new space, which has fewer features that are mutually uncorrelated. Therefore, these new features may increase the performances of classifiers because irrelevant features have been omitted. In this paper, the top components with the highest eigenvalues are retained such that they account for more than 99% of the variation.
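The 99% rule above can be illustrated with a short eigendecomposition-based sketch. The function name `pca_reduce` and the synthetic data are hypothetical, and at least two features are assumed.

```python
import numpy as np

def pca_reduce(X, variance_to_keep=0.99):
    """Project X onto the fewest principal components that together
    explain at least `variance_to_keep` of the total variance."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre every feature
    cov = np.cov(Xc, rowvar=False)               # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()
    # Index of the first cumulative ratio that reaches the target, plus one.
    k = int(np.searchsorted(explained, variance_to_keep) + 1)
    return Xc @ eigvecs[:, :k], eigvals[:k]

# Synthetic example: the third feature is almost a sum of the first two,
# so two components are expected to carry nearly all of the variance.
rng = np.random.default_rng(0)
a, b = rng.normal(size=200), rng.normal(size=200)
X_demo = np.column_stack([a, b, a + b + 1e-3 * rng.normal(size=200)])
Z, kept_eigvals = pca_reduce(X_demo)
```

The projected columns of `Z` are uncorrelated by construction, which matches the "no correlation" property mentioned in the text.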

Case-based reasoning

CBR tries to find the class of test data based on its similarity to other data. In this paper, Euclidean distances between different features of two data are considered as the similarity. The minimum number of similar data, related weights, and selection type of CBR were selected and optimized using the design of experiments (DoE) and Taguchi method before applying this classifier in this paper.

Decision tree

This classifier considers different subsets for a dataset in the form of a tree structure. Creating decision nodes and leaf nodes in this method is obtained by combining mentioned subsets. Every decision node has two or more branches and every leaf node presents a decision. There are different algorithms to make a decision tree for a dataset. The C4.5 algorithm is utilized in this paper for this purpose. Criterion type, splitter, maximum depth, minimum samples for split, minimum samples for leaf, minimum weight fraction, maximum features, related impurities, and class weight of this classifier were selected and optimized using DoE and Taguchi method before applying it in this paper.

Convolutional neural network

CNN is a class of artificial neural networks based on the shared-weight architecture of the convolution kernels or filters that slide along input features and provide translation equivariant responses known as feature maps. CNNs take advantage of the hierarchical pattern in data and assemble patterns of increasing complexity using smaller and simpler patterns embossed in their filters. Kernel size, number of filters, filter size, activation function, pooling type, and its size of this classifier were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

K-Nearest neighbors

KNN is one of the instance-based learning methods that consider the distances between different features of test data and some train data to find the appropriate class. Therefore, different methods for calculating the distance can be used in this method. Euclidean distance is considered in this paper for this purpose. The number of neighbors, weights, learning algorithm type, leaf size, and metric of KNN were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.
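A minimal sketch of KNN with Euclidean distance and majority voting is given below. The function name `knn_predict`, the value of `k`, and the toy data are hypothetical and do not reflect the tuned settings mentioned above.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Return the majority class among the k nearest training points."""
    X_train = np.asarray(X_train, dtype=float)
    # Euclidean distance from the query point to every training point.
    dist = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dist)[:k]
    votes = Counter(np.asarray(y_train)[nearest].tolist())
    return votes.most_common(1)[0][0]

# Toy normalised features with two classes.
X_knn = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.85, 0.7]]
y_knn = [0, 0, 1, 1, 1]
pred_low = knn_predict(X_knn, y_knn, [0.15, 0.15])   # closest to class 0 points
pred_high = knn_predict(X_knn, y_knn, [0.8, 0.8])    # closest to class 1 points
```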

Learning vector quantization

LVQ is one of the artificial neural network methods based on the winner-take-all approach. The type of utilized artificial neural networks in this classification method is self-organizing maps that consider a learning algorithm to find the best winner for every test data. Margins, likelihood ratio, distance type, related kernel, dissimilarities, and learning type of LVQ were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Multi-layer perceptron

MLP is one of the feedforward artificial neural network methods that consider different layers of perceptron with special threshold activation for them. It utilizes backpropagation as its supervised learning technique. MLP classifiers present good performances to detect data whether or not they are linearly separable. Training methods, size of hidden layers, activation type, learning type, solver function, alpha, beta, epsilon, batch size, learning rate, maximum iteration, shuffle, verbose, and validation fraction of MLP were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Naive Bayes

NB is one of the probabilistic classifiers based on Bayes’ theorem. It assumes independence between the features and finds a suitable probability associated with each feature category. This classifier presents acceptable performances with simple assumptions in different applications. Parameter estimation type, variation smoothing function, and divisions’ size of NB were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Radial basis function network

RBF is one of the artificial neural networks with activation functions based on radial basis functions. The class of every test data is a linear combination of radial basis functions of its features in this method. The size of hidden layers, activation type, batch size, maximum iteration, shuffle, and verbose of RBF were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Support vector machine

SVM classifiers try to find the best hyperplanes among different classes of train data of a dataset. This hyperplane has the highest distance to the nearest training-data point of different classes. Then, the class of every test data is defined based on its position and defined hyperplanes. Kernels, degree, gamma, shrinking, probability, class weight, verbose, maximum iteration, decision function shape, and penalty cost of SVM were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Recurrent neural network

An RNN is a class of artificial neural networks where connections between nodes form a directed or undirected graph along a temporal sequence. In other words, an RNN works on the principle of saving the output of a particular layer and feeding it back to the input in order to predict the output of the layer. Kernel size, number of filters, filter size, activation function, pooling type, and its size of this classifier were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Fuzzy type-I inference system

FIS classifiers are one type of fuzzy classifiers based on fuzzy logic applications. They consider some if–then rules according to the Takagi–Sugeno method and define fuzzy membership functions for different features of a dataset. Finally, features of test data are analyzed based on extracted rules, and the best classes are selected for them. The number of fuzzy rules, type of related fuzzy membership functions, number of neurons in different layers, connection weights, and summation function type of FIS were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Adaptive neuro-fuzzy inference system

ANFIS is one of the artificial neural networks based on the Takagi–Sugeno fuzzy inference system. It utilizes the advantages of both fuzzy logic principles and artificial neural networks. The artificial neural network in ANFIS tries to find the optimal parameters for the Takagi–Sugeno fuzzy inference system using a supervised learning approach. The number of neurons in different layers, activation type, learning type, alpha, beta, epsilon, batch size, learning rate, maximum iteration, related membership functions, connection weights, and summation function type of ANFIS were selected and optimized using DoE and Taguchi method before applying this classifier in this paper.

Fuzzy C-mean

Clustering methods attempt to find clusters such that data in the same cluster are as similar as possible (compactness measure) and data in different clusters are as dissimilar as possible (separation measure). Unlike hard clustering methods, FCM defines a membership function that gives the membership degree of each data point to each cluster. Therefore, each data point can belong to more than one cluster, which gives this soft clustering method more flexibility. The number of clusters, maximum iteration, minimum threshold, array exponentiation applied to the membership function, initial fuzzy c-partitioned matrix, initial seed, and stopping criterion of FCM were selected and optimized using DoE and Taguchi method before applying this clustering method in this paper.
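The soft assignment described above can be illustrated with the standard FCM update rules. This is a minimal sketch: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the tolerance, and the toy data are hypothetical defaults, not the values tuned in this paper.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy C-means. Returns (centers, U), where U[i, k] is the
    membership degree of point i in cluster k and every row of U sums to 1."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # random fuzzy partition
    for _ in range(max_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted means
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)              # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)   # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy data: two well-separated groups around (0, 0) and (1, 1).
rng = np.random.default_rng(1)
X_fcm = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(1.0, 0.1, (50, 2))])
centers, U = fuzzy_c_means(X_fcm, c=2)
```

Each row of `U` holds the degrees to which one point belongs to every cluster, so a point near the boundary can contribute to more than one cluster.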

Preprocessing for images

There are two different datasets in this study. The first one is based on clinical data and the second one has both clinical and image data. Clinical data of both datasets are studied in this paper using the mentioned classifiers and the proposed hierarchical model. Besides, some features are extracted based on the image data of the second dataset and added to its available features. Therefore, a new dataset is created based on its clinical features and the new extracted features based on the chest radiograph images. Then, the performances of utilized classifiers and the proposed hierarchical model are obtained for the mentioned new dataset to find the effects of added features on their learning phase. The performances of utilized classifiers and the proposed hierarchical model can indicate whether or not the extracted features from the images are effective to find different levels of hospitalized symptomatic COVID-19 patients. In this subsection, preprocessing steps for radiology images of the second dataset are presented as follows:

  1. The suitable chest radiograph image for each patient is selected in the first step. Given that there are several images for each patient, the image with the greatest number of white pixels in the chest area is considered. This image corresponds to the most critical condition of the patient, with the greatest number of dead cells in his/her lungs. It is noteworthy that the areas covered in the photo, the angle, and the intensity of the images are similar for different patients

  2. In the second step, superfluous information in the chest radiograph images, including shoulders, non-lung tissues, and separate parts of the body, is eliminated using appropriate cropping operations

  3. Noise is removed using the intensity adjustment function, which also makes the pictures brighter. Then, all pictures are converted to gray-level pictures

  4. An FCM clustering method is utilized to find three clusters on the final image from the previous step. These clusters group pixels based on their values, and their centers are near 0, 0.5, and 1, respectively. The results show that the last two clusters are more suitable than the first cluster

  5. Black and white segments are made for each image using the threshold \(=\frac{\max_{\mathrm{cluster}=2}(\text{gray-level value})+\min_{\mathrm{cluster}=3}(\text{gray-level value})}{2}\)
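The threshold in the last step can be sketched as follows, assuming a gray-level image scaled to [0, 1] and a label image whose values 1, 2, and 3 correspond to the clusters with centers near 0, 0.5, and 1. The function name is hypothetical. Mapping pixels above the threshold to white is an assumption. The nearest-center labelling in the example is only a stand-in for the hard labels that an FCM run would provide.

```python
import numpy as np

def binarize_with_cluster_threshold(gray, labels):
    """Threshold = midpoint between the brightest pixel of cluster 2 and
    the darkest pixel of cluster 3. Pixels above it become white (1)."""
    gray = np.asarray(gray, dtype=float)
    labels = np.asarray(labels)
    threshold = (gray[labels == 2].max() + gray[labels == 3].min()) / 2.0
    binary = (gray > threshold).astype(np.uint8)
    return threshold, binary

# Toy 2x3 image with gray levels in [0, 1].
gray_demo = np.array([[0.05, 0.45, 0.95],
                      [0.10, 0.55, 0.90]])
# Stand-in for FCM labels: assign each pixel to the nearest assumed center.
assumed_centers = np.array([0.0, 0.5, 1.0])
labels_demo = np.abs(gray_demo[..., None] - assumed_centers).argmin(axis=-1) + 1
threshold, binary = binarize_with_cluster_threshold(gray_demo, labels_demo)
```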

After performing the above steps, some features are extracted based on prepared images as presented in the following subsection.

Extracted features based on images

According to physicians’ and experts’ knowledge, the coronavirus attacks different parts of the body and causes inflammation. The lung is ground zero for COVID-19, and its healthy cells are affected by the resulting inflammation. This causes the death of respiratory cells in different parts of the lungs. Therefore, the lungs of affected people show more white areas in the chest radiograph image than those of healthy people. As a result, it is necessary to extract features from the image that are sensitive to its white pixels. Consequently, some statistical features are extracted from the prepared images. Consider an image of size \(M\times N\), and let \(p(i,j)\) denote the pixel value at point \((i,j)\). Then, the mean is calculated using Eq. (1) as the first statistical feature. It helps classifiers capture the overall intensity of the image.

$$\mu =\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}p(i,j)$$
(1)

The second statistical feature is entropy to show distribution diversity as represented in Eq. (2). It defines the distribution diversity of an image based on determining the number of pixels with a special level.

$$h=-\sum_{k=0}^{L-1}\frac{{z}_{k}}{MN}\log\frac{{z}_{k}}{MN}$$
(2)

In this equation, \({z}_{k}\) is the total number of pixels with the level \(k\), and \(L\) is the total number of levels.

Skewness is the third statistical feature, represented in Eq. (3). It measures the degree of asymmetry of the pixel-value distribution around its mean.

$$s=\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}{(\frac{p\left(i,j\right)-\mu }{\sigma })}^{3}$$
(3)

For the fourth statistical feature, kurtosis, which measures the flatness of the distribution relative to a normal distribution, is considered and represented in Eq. (4). This feature describes how the pixel values are scattered around the distribution mean.

$$k=\left\{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}{\left(\frac{p\left(i,j\right)-\mu }{\sigma }\right)}^{4}\right\}-3$$
(4)

The fifth feature extracted from the prepared images is the standard deviation, as represented in Eq. (5). This feature indicates the degree of COVID-19 spread in the patients’ lungs.

$$\sigma =\sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}{(p\left(i,j\right)-\mu )}^{2}}$$
(5)
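Equations (1) to (5) can be computed together in a few lines. The sketch below makes assumptions that the text leaves open: integer gray levels in the range 0 to \(L-1\), the natural logarithm in Eq. (2), terms with \({z}_{k}=0\) contributing zero to the entropy, and \(\sigma >0\). The function name `image_statistics` and the toy image are hypothetical.

```python
import numpy as np

def image_statistics(img, levels=256):
    """Mean, entropy, skewness, excess kurtosis, and standard deviation
    of the pixel values, following Eqs. (1) to (5)."""
    p = np.asarray(img, dtype=float)
    mn = p.size                                          # M * N
    mu = p.sum() / mn                                    # Eq. (1)
    sigma = np.sqrt(((p - mu) ** 2).sum() / mn)          # Eq. (5)
    # z_k: number of pixels at gray level k (levels with z_k = 0 are skipped).
    z = np.bincount(p.astype(int).ravel(), minlength=levels)
    prob = z[z > 0] / mn
    entropy = -(prob * np.log(prob)).sum()               # Eq. (2), natural log
    standardized = (p - mu) / sigma
    skewness = (standardized ** 3).sum() / mn            # Eq. (3)
    kurtosis = (standardized ** 4).sum() / mn - 3.0      # Eq. (4)
    return {"mean": mu, "entropy": entropy, "skewness": skewness,
            "kurtosis": kurtosis, "std": sigma}

# Toy 2x2 image: half black (0) and half white (255).
feats = image_statistics([[0, 0], [255, 255]])
```

For this half-black, half-white image the mean and standard deviation are both 127.5, the entropy is \(\ln 2\), the skewness is 0, and the kurtosis is \(-2\).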

The proposed hierarchical model

In real medical diagnosis, experts consider a special set of features for different groups of COVID-19 patients to find their treatment plans. Physicians pay attention to the special features of different groups of patients and classify them. This approach is reflected in the structure of the proposed hierarchical model. In the proposed model, clustering methods are applied to the patients’ data to determine some clusters. Then, a classifier is trained for each cluster in a hierarchical model. FCM and ANFIS are used to design the proposed hierarchical model for the following reasons.

ANFIS is a classifier that utilizes an adaptive neural network to search systematically for the optimal parameters of a Takagi–Sugeno-type fuzzy model. This classification technique has been widely utilized in the literature to diagnose different diseases and cancers. ANFIS consists of 5 layers and performs well in many applications because it uses the advantages of both fuzzy logic principles and artificial neural networks. Let (\({x}_{1},\dots ,{x}_{n}\)) be the \(n\) inputs, let there be \(R=m\) fuzzy rules, and let \({I}_{k}^{l}\) be the output of the \({k}^{th}\) node in the \({l}^{th}\) layer. Then, the structure of the ANFIS utilized in this paper is presented in Fig. 2.

Fig. 2 Structure of utilized ANFIS in this paper with n inputs and m fuzzy rules

Layer 1 has \(n\times m\) nodes, and \({I}_{k}^{1}={\mu }_{i}\left({x}_{j}\right)={\left(1+{\left|\frac{{x}_{j}-{c}_{k}}{{a}_{k}}\right|}^{2{b}_{k}}\right)}^{-1}\), \(\forall i\in \{1,\dots ,m\},\ j\in \{1,\dots ,n\},\ k\in \{1,\dots ,n\times m\}\), is the membership degree of the \({j}^{th}\) input for the \({i}^{th}\) rule. The premise parameters (\({a}_{k}\), \({b}_{k}\), and \({c}_{k}\)) change the shape of the bell-shaped membership function utilized in this layer. Layer 2 has \(m\) nodes, one per fuzzy rule, and \({I}_{k}^{2}={w}_{i}=\prod_{j=1}^{n}{\mu }_{i}\left({x}_{j}\right)\), \(\forall k=i\in \{1,\dots ,m\}\), is the firing strength of the \({i}^{th}\) fuzzy rule. Layer 3 also has \(m\) nodes, and \({I}_{k}^{3}=\overline{{w}_{i}}=\frac{{w}_{i}}{\sum_{i'=1}^{m}{w}_{i'}}\), \(\forall k=i\in \{1,\dots ,m\}\), is the firing strength of the \({i}^{th}\) rule after normalization. Layer 4 has \(m\) nodes as well, and its \({k}^{th}\) node finds the contribution of the \({i}^{th}\) rule towards the overall output with the node function \({I}_{k}^{4}=\overline{{w}_{i}}{f}_{i}=\overline{{w}_{i}}\left(\sum_{j=1}^{n}{p}_{i,j}{x}_{j}+{r}_{i}\right)\), \(\forall k=i\in \{1,\dots ,m\}\), based on the consequent parameters \({p}_{i,j}\) and \({r}_{i}\). Finally, the single node in layer 5 finds \({I}_{1}^{5}=\sum_{i=1}^{m}\overline{{w}_{i}}{f}_{i}\). Therefore, the training process of ANFIS attempts to find the best values of the premise and consequent parameters. Typical ANFIS and the other classifiers in this paper follow the general process shown in Fig. 3 for classifying data.

Fig. 3

General process for classifying data for all types of classifications in this paper

Although ANFIS performs best among all classifiers used in this paper for finding the appropriate levels of hospitalized symptomatic COVID-19 patients, it shows significant weaknesses in some computational experiments. After careful examination, we found that COVID-19 patients fall into different groups, each with specific characteristics. However, these groups also share many common characteristics. This overlap makes it very hard to find the appropriate levels for symptomatic COVID-19 patients. To cope with this issue, an FCM clustering method is applied before ANFIS classification to cluster the COVID-19 patients. Hard clustering methods assign each data point to a single cluster, which does not suit groups that share many characteristics. Therefore, we select FCM, which determines the membership degree of each data point in every cluster. Besides, a threshold of 0.05 is defined: a data point is removed from every cluster in which its membership degree is less than 0.05. This reduces the computational complexity. Then, ANFIS classification is performed separately on the COVID-19 patients belonging to each cluster. It is noteworthy that a patient can belong to several clusters because of the fuzzy nature of the FCM method. Figure 4 presents more details. There are 4 diverse groups of COVID-19 patients in this figure, and the classifiers are learned based on the FCM results.

Fig. 4

A 2D view of the proposed model

Therefore, the outputs of all ANFIS classifiers for the different clusters are combined using the following algorithm.

1. Find the appropriate number of clusters (\(c\)) for FCM in each dataset, among candidate values between 2 and \(\sqrt{\mathrm{number}\;\mathrm{of}\;\mathrm{records}\;\mathrm{in}\;\mathrm{the}\;\mathrm{dataset}}\), using the criterion (separation minus compactness) computed on the train data, where separation is the average Euclidean distance between the cluster centers and compactness is the average Euclidean distance of the data within the clusters

2. Find the membership degree of data/patient \(i\) in cluster \(j\), \({u}_{ji}=\frac{1}{\sum_{k=1}^{c}{\left(\frac{{d}_{ji}}{{d}_{jk}}\right)}^{2/(m-1)}}\), where \(c\) is the number of clusters, \(m\) is the fuzziness exponent and is equal to 2, \({d}_{ji}\) is the Euclidean distance of the \({i}^{th}\) data from the center of the \({j}^{th}\) cluster, and \({d}_{jk}\) is the Euclidean distance of the \({j}^{th}\) cluster from the \({k}^{th}\) cluster \((k=1,\dots ,c \;\mathrm{except}\; j)\)

3. Find all \({u}_{ji}\) less than the mentioned threshold (0.05) and replace them by zero

4. Learn a separate ANFIS classifier for each cluster based on its train data. The \({i}^{th}\) data belongs to the \({j}^{th}\) cluster if its membership degree (\({u}_{ji}\)) is greater than zero

5. Find the appropriate cluster or clusters \((J)\) for each test data \((t\in T)\) and feed its features to the corresponding learned ANFIS classifiers. The outputs of the different ANFIS classifiers are stored in \({O}_{t}^{j} \forall j\in J, t\in T\)

6. Find the class of the \({t}^{th}\) test data based on \(\arg \max_{j}{R}_{tj}\), where \({R}_{tj}={O}_{t}^{j}*{u}_{jt}\)
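Steps 2 to 4 of the algorithm can be illustrated with a short Python sketch. It assumes the standard FCM membership form, in which the distance of a data point to the center of cluster \(j\) is compared with its distance to every cluster center, and it assumes the cluster centers have already been estimated. The names `fcm_memberships` and `assigned_clusters` are hypothetical.

```python
import math


def fcm_memberships(point, centers, m=2.0, threshold=0.05):
    """Membership of one data point in each cluster (standard FCM form).

    Memberships below `threshold` are replaced by zero (step 3).
    """
    d = [math.dist(point, c) for c in centers]
    zeros = [dj == 0.0 for dj in d]
    if any(zeros):
        # The point coincides with one or more centers: share full membership.
        share = 1.0 / sum(zeros)
        u = [share if z else 0.0 for z in zeros]
    else:
        power = 2.0 / (m - 1.0)
        u = [1.0 / sum((dj / dk) ** power for dk in d) for dj in d]
    return [uj if uj >= threshold else 0.0 for uj in u]


def assigned_clusters(u):
    """Step 4: a data point belongs to every cluster with nonzero membership."""
    return [j for j, uj in enumerate(u) if uj > 0.0]


# Hypothetical 2D example with two cluster centers
u = fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
clusters = assigned_clusters(u)
```

With the threshold in place, a patient far from a cluster center drops out of that cluster's training set, which is what keeps the per-cluster ANFIS training sets smaller.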

The proposed hierarchical model and its algorithm improve the learning of the ANFIS classifiers. According to this model, a separate ANFIS classifier is learned for each cluster of COVID-19 patients. Each cluster attempts to consider both the special and the common characteristics of each group of COVID-19 patients. Figure 5 presents the process of the proposed hierarchical model in this paper.

Fig. 5

Process of classifying data for the proposed hierarchical model
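Steps 5 and 6 of the algorithm can be sketched in Python as follows. The sketch computes \({R}_{tj}={O}_{t}^{j}*{u}_{jt}\) for every cluster with a nonzero membership and returns the cluster with the largest score. How the winning output is mapped to a class label is not specified in the text, so the function only reports the winning cluster and all scores; the name `combine_outputs` and the example values are hypothetical.

```python
def combine_outputs(outputs, memberships):
    """Pick the cluster j that maximizes R_tj = O_t^j * u_jt for one test data.

    outputs     : dict mapping cluster index j to the ANFIS output O_t^j
    memberships : dict mapping cluster index j to the membership degree u_jt
    """
    scores = {j: outputs[j] * memberships[j]
              for j in outputs
              if memberships.get(j, 0.0) > 0.0}
    if not scores:
        raise ValueError("test data has no cluster with nonzero membership")
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical outputs of three per-cluster classifiers for one patient
best, scores = combine_outputs({0: 0.9, 1: 0.4, 2: 0.8},
                               {0: 0.3, 1: 0.7, 2: 0.0})
```

Cluster 2 is skipped here because its membership was zeroed by the threshold, so only the classifiers of clusters 0 and 1 compete.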

Results

In this section, more details about the clinical and image data of selected datasets are presented. Then, the effects of PCA on both datasets are evaluated as a feature reduction method. Finally, different groups of results and related statistical tests are presented.

Clinical data

Each patient in the utilized datasets has different features. Some of these features are removed in the preprocessing steps, and the remaining features are described in Table 3.

Table 3 Clinical features of both datasets

Besides, there are image data in the second dataset. More details about them are presented in the next subsection.

Image data

Each patient in the second dataset has clinical data together with different types of CT, CR, and DX images. A patient may see a physician multiple times and have multiple types of images ordered at different times. Therefore, the total number of records for these patients is greater than the number of patients (31,935 images recorded for 105 patients). We used the images in the latest version of this dataset, as updated on Dec 17, 2020. According to the properties of the different types of features, CR images are selected for the 105 patients. Then, the image preprocessing steps are applied to them. To illustrate the effects of these steps, four images obtained after preprocessing are presented for each of three patients in Figs. 6, 7, and 8.

Fig. 6

Different images for a patient (ICU admit: No and mortality: No)

Fig. 7

Different images for a patient (ICU admit: Yes and mortality: No)

Fig. 8

Different images for a patient (ICU admit: Yes and mortality: Yes)

Applying PCA feature reduction

In this paper, some features based on image data are added to the second dataset. These features are the mean, entropy, skewness, kurtosis, and standard deviation of the images. Considering these features along with the clinical features (see Table 3) could increase the complexity and decrease classification performance. Accordingly, the PCA feature reduction method is applied to the second dataset before classification to eliminate meaningless features. For this purpose, PCA sorts all features by their eigenvalues, so the first one has the largest eigenvalue. Moreover, the cumulative percentage of variation covered changes with the eigenvalues. Figure 9 shows the eigenvalues of the different features and the cumulative percentage of variation covered. In Fig. 9, features 1 to 5 are the features extracted from the images and features 6 to 23 are the 18 clinical features of the second dataset listed in Table 3.
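The five image statistics named above can be computed from the gray levels of a preprocessed image as in the following sketch. The exact definitions used by the authors are not stated, so the choices here are assumptions: population moments (division by the number of pixels), non-excess kurtosis, and Shannon entropy in bits over the gray-level histogram. The function name `image_features` is hypothetical.

```python
import math
from collections import Counter


def image_features(pixels):
    """Mean, standard deviation, entropy, skewness, kurtosis of gray levels.

    pixels : flat, non-empty list of integer gray levels (for example 0..255)
    """
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    # Shannon entropy (bits) of the empirical gray-level distribution
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    if std == 0.0:
        # Skewness and kurtosis are undefined for a constant image
        skewness = kurtosis = float("nan")
    else:
        skewness = sum((p - mean) ** 3 for p in pixels) / n / std ** 3
        kurtosis = sum((p - mean) ** 4 for p in pixels) / n / std ** 4
    return {"mean": mean, "std": std, "entropy": entropy,
            "skewness": skewness, "kurtosis": kurtosis}


# Hypothetical 2x2 image flattened to a list
features = image_features([0, 0, 255, 255])
```

Each image then contributes five numbers that can be appended to the patient's clinical feature vector before feature reduction.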

Fig. 9

The eigenvalue of features (left axis) and the cumulative percentages of covering data in the new space (right axis) after running PCA

According to Fig. 9, the first 18 features out of 23 cover more than 99% of the variations. Therefore, based on the PCA results, we discard the remaining features, which together cover less than 1% of the variations. This helps the classifiers to learn better relationships among the new features in less time.
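The selection rule described here, keeping the leading features until their cumulative share of the variation reaches a target, can be sketched as follows. The eigenvalues in the example are hypothetical and are not the values plotted in Fig. 9; the name `components_to_keep` is also an assumption.

```python
def components_to_keep(eigenvalues, coverage=0.99):
    """Smallest number of leading eigenvalues whose cumulative share
    of the total reaches `coverage`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= coverage:
            return k
    return len(ordered)


# Hypothetical eigenvalues (not taken from the paper)
k = components_to_keep([50.0, 30.0, 15.0, 4.5, 0.5], coverage=0.99)
```

In this toy example the last eigenvalue contributes only 0.5% of the total, so it is the only one dropped.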

Computational results

MATLAB R2022a (64-bit) is used on a computer with an Intel(R) Core(TM) i5 CPU @ 2.30 GHz processor and 4.00 GB of RAM for all runs in this research. There are two different target classes in the second dataset. Therefore, the performances of the different classifiers and the proposed hierarchical classifier are obtained for three settings: the first dataset; the clinical data of the second dataset for both target classes; and the clinical features combined with the extracted image features of the second dataset for both target classes.

Because the utilized datasets are imbalanced, different measurements including accuracy, sensitivity, specificity, precision, F-score (\(F.\mathrm{mea}=\frac{2*\mathrm{sensitivity}*\mathrm{precision}}{\mathrm{sensitivity}+\mathrm{precision}}\)), and G-mean (\(GSS=\sqrt{\mathrm{sensitivity}*\mathrm{specificity}}\)) are used to show the performances of the classifiers. Besides, the average of these measurements over 30 different groups of train and test data based on standard tenfold cross validation is reported in this paper. These groups are created from 30 random seeds and are the same for all classifiers in this paper. It is noteworthy that about 80% of the data are considered as train data. Accordingly, 453,281 records of the first dataset and 84 records of the second dataset are used as train data and the rest are test data. The number of experimental trials is set to 25 for each classifier and each group of train and test data to check its stability. Besides, the test data results were consistent with the training data results, which validates the classification process.
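The six measurements can be computed from the counts of a binary confusion matrix as in the sketch below. The function name and the example counts are hypothetical; the sketch also assumes that every denominator is nonzero, which may not hold for very small or single-class test sets.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision, F-score, and G-mean
    from true/false positive and negative counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # recall of the positive class
    specificity = tn / (tn + fp)          # recall of the negative class
    precision = tp / (tp + fp)
    f_score = 2 * sensitivity * precision / (sensitivity + precision)
    g_mean = math.sqrt(sensitivity * specificity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision,
            "f_score": f_score, "g_mean": g_mean}


# Hypothetical imbalanced test set: 13 positives, 87 negatives
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

In this example the accuracy is 0.93 although only 8 of the 13 positive cases are found, which is why F-score and G-mean are reported alongside accuracy for imbalanced data.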

First, the results of the classifiers and the proposed hierarchical model for the clinical data of both datasets are presented in Tables 4 and 5. The ICU target class is considered for all results in both tables. Besides, these results are based on all features of the datasets after data preparation. In other words, the PCA feature reduction is not applied yet. The top three results for each measurement are highlighted and the best of them is underlined. According to these results, ANFIS has the largest number of top results among the individual classifiers. Therefore, we use ANFIS in the proposed model to improve the related measurements. Besides, the proposed hierarchical model has all of the best results in both tables.

Table 4 The performances of all classifiers for the first dataset
Table 5 The performances of all classifiers for the second dataset based on ICU target class

The performances of the classifiers in different situations for the second dataset before applying PCA feature reduction are presented in Table 6. The average and standard deviation of three measurements (accuracy, F-score, and G-mean) for 10 different groups of train and test data are presented for this purpose. In addition, the results for the same situations after applying PCA feature reduction are presented in Table 7.

Table 6 The performances of all classifiers for the second dataset in different situations before feature reduction
Table 7 The performances of all classifiers for the second dataset in different situations after feature reduction

As illustrated in Tables 6 and 7, the proposed hierarchical model has the best average values of the selected measurements among all classifiers. Besides, ANFIS has the best performance among the individual classifiers according to these results. PCA improves the performances of most of the classifiers and of the proposed model as well. Also, all of the standard deviations in Tables 6 and 7 are less than 0.005. This shows that each measurement of a classifier varies only slightly across the different random groups of train and test data. Therefore, the results can be considered stable for the utilized classifiers and the proposed hierarchical model in this paper.

To determine whether the proposed image features and feature reduction make statistically significant differences, the non-parametric two-sided Wilcoxon signed-rank test has been carried out between different groups of results (Taheri and Hesamian 2013). This test does not assume that the data follow a normal distribution, and outliers have little effect on its performance (Derrac et al. 2011). It tests for the presence of a statistically significant difference between two groups of results. The difference between each pair of groups is evaluated based on the average of the measures from the classifiers on the selected dataset. The related p-values are reported in Table 8. In this decision test, H = 1 is the logical value that indicates a statistically significant difference between the results of two groups at the given significance level. In this research, α = 0.05, 0.01, and 0.005 are considered as the levels of significance. According to the results, the proposed image features and feature reduction method are statistically significantly better than the other groups of results for the considered metrics.
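For illustration, a two-sided Wilcoxon signed-rank test on paired results can be sketched in plain Python as follows. This is a simplified exact version that assumes no tied absolute differences and drops zero differences; it is not the routine used by the authors, and the paired values in the example are hypothetical. For real analyses with ties or large samples, a statistics library is the safer choice.

```python
def wilcoxon_signed_rank(x, y, alpha=0.05):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Assumes no tied absolute differences; zero differences are dropped.
    Returns (W, p_value, H), with H = 1 when p_value < alpha.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences from 1 (smallest) to n (largest)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Null distribution: number of subsets of {1..n} with each rank sum
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    p_value = min(1.0, 2 * sum(counts[: w + 1]) / 2 ** n)
    return w, p_value, int(p_value < alpha)


# Hypothetical paired scores of six classifiers with and without a change
w, p, h = wilcoxon_signed_rank([12, 15, 11, 17, 14, 16],
                               [10, 12, 7, 12, 8, 9])
```

Running the same call with α = 0.01 or 0.005 gives the decision value H at the stricter significance levels considered in the text.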

Table 8 The results of the Wilcoxon signed-rank test

According to these results, the proposed features of images and the feature reduction method enhance the performances of the proposed hierarchical model and make it more efficient for the diagnosis of different levels of symptomatic COVID-19 patients.

Discussion

Two real datasets about symptomatic COVID-19 patients are used in this paper: the COVID-19 patient pre-condition dataset and Chest Imaging with Clinical and Genomic Correlates Representing a Rural COVID-19 Positive Population (COVID-19-AR). The first dataset has clinical data and the second one has both clinical and image data. The performances of the utilized classifiers and the proposed hierarchical model are compared based on different measurements including accuracy, sensitivity, specificity, precision, F-score, and G-mean. To obtain more precise results, we applied principal component analysis (PCA) to the clinical features and the extracted image features of the second dataset. All results showed that the proposed hierarchical model has the best performances among the utilized classifiers for both datasets. It achieves 92% accuracy and 98% accuracy for the first and second datasets, respectively. The proposed features extracted from image data and the utilized feature reduction method improve the performances of the classifiers in comparison with the other groups of results.

Different research studies have used these datasets as well. Dutta et al. (2020) used the COVID-19 patient pre-condition dataset and proposed a stacked Gated Recurrent Unit (GRU) based model to identify whether a patient can be infected by this disease or not. The accuracy of their proposed method was 66%. Different properties of the second dataset are evaluated in the paper of Santa Cruz et al. (2021). They found the second dataset appropriate for the proper assessment of the risk of bias. Desai et al. (2020a, b) explained the different features of the second dataset to guide other researchers about this dataset. They determined different features of this dataset as target classes that could be helpful for different purposes. Tang et al. (2020) proposed a segmentation model for identifying the opacity regions from the COVID-19-positive chest X-rays including haziness, ground-glass opacity, and lung consolidation. Although their results are accurate and robust, these results are not analyzed using other classifiers to build an appropriate diagnosis system. Sarv Ahrabi et al. (2021) proposed a convolutional neural network (CNN) with optimized parameters for the second dataset. They used deep learning (DL) paradigms for analyzing X-rays in this dataset and achieved an accuracy of 93%. According to these results and the literature review, the proposed model in this paper could support medical efforts to fight COVID-19 (Rahimi Rise et al. 2022; Rise and Ershadi 2022; Sadat et al. 2022).

Conclusions

In today’s world, many new methods and algorithms are being developed to deal with infectious diseases. Most of these methods help with disease diagnosis, while other areas, such as detecting the levels of symptomatic patients in a hospital, are neglected. Besides, most of the methods proposed in different papers consider only clinical data, and the image data are ignored. It is noteworthy that image data play an important role in finding the level of hospitalized symptomatic COVID-19 patients. This gap becomes very serious for the various types of Coronavirus disease because of its effects on patients. Timely detection of this disease in its early stages would remarkably increase the chance of complete treatment in a shorter time. Furthermore, on-time detection of the patients’ level would directly affect the service plans in the hospital.

Therefore, in this paper, for the first time to the best of our knowledge, we used both clinical and image data to find the levels of different hospitalized COVID-19 patients. Whether these patients need ICU admission and whether or not they are in the end stage are considered as the target classes. A new hierarchical model is proposed for this aim. This model clusters all patients based on predefined measurements and learns a different ANFIS for each of the clusters. Different groups of COVID-19 patients have special and common characteristics. Therefore, we select the FCM clustering method, which considers the membership degree of each data point in the different clusters. Besides, ANFIS classifiers have the best performances among the 12 utilized classifiers in this paper, including case-based reasoning (CBR), decision tree, convolutional neural networks (CNN), K-nearest neighbors (KNN), learning vector quantization (LVQ), multi-layer perceptron (MLP), Naive Bayes (NB), radial basis function network (RBF), support vector machine (SVM), recurrent neural networks (RNN), fuzzy type-i inference system (FIS), and adaptive neuro-fuzzy inference system (ANFIS). Consequently, ANFIS classification methods are selected to understand the relationships among the features of each cluster.

Accordingly, the results of the proposed model and the image features determined in this paper could support medical decisions about COVID-19 patients. In hospitals, the proposed hierarchical model could help to find the appropriate levels for different hospitalized symptomatic COVID-19 patients. A graphical user interface is designed based on the proposed hierarchical model and presented in Fig. 10.

Fig. 10

The designed user interface for the proposed hierarchical model in this paper

The most important limitation of this work is the dependence of the proposed model on data. Therefore, the limitations of the data and the choice of appropriate features have a high impact on its performance. It is noteworthy that groups of COVID-19 patients differ across cities and countries. Therefore, this graphical user interface has to be updated based on datasets about hospitalized symptomatic COVID-19 patients in the specific city or country.

In future studies, other classification and clustering methods with non-Euclidean distances could be used and their performances compared with those of the methods used here. Furthermore, different combinations of clustering and classification methods could be tailored to geographical situations.