Predictive Value of Machine Learning Models in Mortality of Coronavirus Disease 2019 (COVID-19) Pneumonia

Rostami, Atefeh; Mousavi, Faezeh; Javadinia, Seyed Alireza; Robatjazi, Mostafa; Mehrpouyan, Mohammad

doi:10.1007/s44196-024-00633-2

Research Article
Open access
Published: 26 August 2024

Volume 17, article number 221, (2024)
International Journal of Computational Intelligence Systems

Abstract

Coronavirus disease 2019 (COVID-19) has been among the most widespread and lethal diseases of the last 3 years. Early predictions could optimize the decision-making process, improve healthcare outcomes, and support effective use of healthcare resources during peaks. This study set out to predict the mortality risk of COVID-19 patients by investigating 14 machine learning (ML) models using extensive clinical, laboratory, and image-based features. Feature importance in each model and the influence of individual features on the mortality prediction of the ML models were also evaluated. Data from 252 patients during the 5th peak of the COVID-19 pandemic (July 2021–September 2021) with 42 features were used for the training of ML models. Fourteen ML models were built using fivefold cross-validation, and each model was trained on a training-validation dataset with its own optimized parameters. Model performance was evaluated in terms of accuracy, precision, sensitivity, specificity, AUC, and F1 score. The highest values of accuracy (87.30%), precision (100%), sensitivity (77.27%), specificity (100%), AUC (91.90%), and F1 score (77.99%) were observed for the linear discriminant analysis (LDA), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (GNB), KNN, Passive Aggressive Classifier (PAC), and LDA models, respectively, when training was performed with all 42 features. With feature selection, the support vector classifier (SVC) model with 10 features achieved the highest AUC of 93.40%. The features of mechanical ventilation, consolidation, fatigue, malignancy, dry cough, level of consciousness (LOC), gender, diarrhea, O2 therapy, and SpO2 are potential predictors of mortality in COVID-19 patients.

1 Introduction

The worldwide pandemic of coronavirus disease 2019 (COVID-19) has been the most serious health problem of the last 3 years [1]. With considerably higher transmission and mortality rates, COVID-19 has become one of the most widespread and lethal diseases in human history [2, 3]. Nevertheless, most patients present with mild symptoms, including fever, myalgia, cough, headache, gastrointestinal problems, shortness of breath, and loss of smell and taste, although consciousness disorders can also occur [4, 5]. Up to 30% of patients need hospitalization during the course of the disease, with an intensive care unit (ICU) admission rate of 23%, accompanied by mechanical ventilation in most of these cases. These rates increase with age and with pre-existing high-risk comorbidities [6,7,8]. Timely detection and treatment of severe cases of COVID-19 are essential to reduce the mortality rate and to avoid unnecessary occupation of hospital and ICU beds. Artificial intelligence (AI) algorithms based on machine learning (ML) and deep learning (DL) models may help distinguish these patients from those who can be treated with hospitalization alone.

With the development of digital systems and the emergence of big data, AI algorithms have been widely integrated into healthcare systems and have increasingly supplanted classical statistical analysis [9]. ML, as a branch of AI, can be divided into two main categories: supervised and unsupervised learning. In supervised learning, the data set carries pre-defined labels. Classification and regression, the two main types of supervised learning, are used to predict categorical and continuous outputs, respectively. By contrast, unsupervised algorithms attempt to identify patterns in unlabeled data sets, for example for clustering and dimensionality reduction of large data sets [10]. AI algorithms have been used to automate the analysis of various data types, including text, signals, sound, images, and video, in medical applications [11,12,13,14].

ML algorithms may have a great impact on various aspects of COVID-19 management, from diagnosis (early prediction of disease severity, ICU admission, and mortality risk) to therapy (evaluation of treatment response, drug discovery, and social control) [15,16,17,18,19]. In addition to ML algorithms, DL models have been used in the diagnosis and treatment of COVID-19, with applications including outbreak prediction, virus spread tracking, and vaccine and drug discovery research. DL models, with their ability to extract features from images, have received much attention for early diagnosis of COVID-19 and for tracking its spread based on lung computed tomography and radiography images [20].

Early severity and mortality predictions in COVID-19 patients could optimize the decision-making process, improve healthcare outcomes, and facilitate effective usage of healthcare resources during peaks [21]. Mortality prediction in COVID-19 patients has been investigated in a number of studies with various data sets, ML and DL algorithms, and feature selection methods [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48]. Although these studies have no obvious specific limitations, each used different models and features, drawn from a different study population, for mortality prediction. Healthcare systems therefore need these findings to be reconfirmed in other populations and with other feature sets.

This study set out to predict the mortality risk of patients with COVID-19 disease by investigating 14 ML models using extensive clinical, laboratory, and image-based features.

2 Materials and Methods

The following flowchart outlines the step-by-step procedures employed in this study to ensure replicability and transparency. It provides a visual representation of the study’s processes, guiding researchers through the various stages from data collection to analysis and interpretation (Fig. 1).

2.1 Data of the Study

In this study, data from 252 COVID-19 patients were used for building, comparing, and evaluating the applied models. All diagnoses were confirmed by reverse transcription-polymerase chain reaction (RT-PCR) diagnostic assay. The data were collected retrospectively during the 5th peak of the COVID-19 pandemic (July 2021–September 2021) from Vasei Hospital of Sabzevar in Iran, with the approval of the Ethics Committee of Sabzevar University of Medical Sciences (IR.MEDSAB.REC.1400.132). No data-correction algorithm was applied. At the beginning of the study, the patients' data were checked in terms of the recorded features. In the data-cleaning step, features that were not reported for all patients were excluded, as were patients whose information was incomplete. After cleaning, 42 clinical, laboratory, and image features remained for each patient and were used for the training of the ML models.

2.2 Statistical Data Evaluation

Raw data were checked for null features. Statistical analysis and correlations of the various features were evaluated with the SciPy library in Python. Finally, all quantitative features were scaled for the training of the ML models.
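The null check and scaling step can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic stand-ins for three quantitative features, and StandardScaler is an assumed choice, since the paper does not say which scaling method was used.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in for three quantitative features of 252 patients
# (the real feature values are not publicly available).
X_quant = rng.normal(loc=[60.0, 90.0, 5.0], scale=[15.0, 5.0, 2.0], size=(252, 3))

# Null check: count missing entries before any modelling.
n_missing = int(np.isnan(X_quant).sum())

# Standardise each feature to zero mean and unit variance.
# Assumption: the paper only says features were "scaled".
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_quant)
```

In a train/test workflow the scaler would normally be fitted on the training portion only and then applied to the test portion, to avoid leaking test-set statistics.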

2.3 Training of ML Models

In this study, the performance of 14 ML models in predicting mortality risk in COVID-19 patients was investigated: Logistic Regression (LR), Passive Aggressive Classifier (PAC), Support Vector Classifier (SVC), K-Nearest Neighbors (KNN), Decision Tree Classifier (DTC), Random Forest Classifier (RFC), Gradient Boosting Classifier (GBC), Hist Gradient Boosting Classifier (HGBC), AdaBoost Classifier (ABC), Bagging Classifier (BC), Extra Trees Classifier (ETC), Multi-Layer Perceptron Classifier (MLPC), Gaussian Naive Bayes (GNB), and Linear Discriminant Analysis (LDA). The theoretical background of these models is described on the website of the Python machine learning library scikit-learn and in related articles [50, 51].

Models were developed on a randomly drawn 70% of the data (the training-validation data set), and their performance was evaluated on the remaining 30% (the test data set). Because K-fold cross-validation and the GridSearchCV technique were used to select the optimum hyperparameters, the training-validation data set was not split further at this first step. GridSearchCV performs hyperparameter tuning to determine the optimal values for a given model. There is no way to know the best hyperparameter values in advance, so ideally all candidate values need to be tried. Doing this manually could take a considerable amount of time and resources, so we used GridSearchCV to automate the tuning of hyperparameters.

In the first step, the GridSearchCV technique together with fivefold cross-validation was used to control the learning process of the models and to optimize their parameters based on the highest accuracy on the training data set. Each model was then trained with its own optimized parameters.
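The split-then-tune procedure described above might look like the following sketch. The data are synthetic, and the pipeline layout, the choice of SVC as the example model, and the parameter grid are all illustrative assumptions; the paper does not report the grids it searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with the same shape as the study data (252 x 42).
X, y = make_classification(n_samples=252, n_features=42, n_informative=10,
                           random_state=0)

# 70% training-validation set, 30% held-out test set.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

# Scaling inside the pipeline so each CV fold is scaled on its own training part.
pipeline = make_pipeline(StandardScaler(), SVC())

# Hypothetical grid; the real search space is not given in the paper.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}

# Fivefold cross-validation, selecting hyperparameters by accuracy.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_trainval, y_trainval)

# By default GridSearchCV refits the best configuration on the full
# training-validation set; the test set is touched only once, here.
best_model = search.best_estimator_
test_accuracy = best_model.score(X_test, y_test)
```

The same pattern applies to each of the 14 models by swapping the estimator and its grid.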

2.4 Evaluation of ML Models

The designed models were evaluated using an independent test data set. The confusion matrix was calculated, and the metrics of accuracy, precision, sensitivity, specificity, AUC, and F1 score were reported for each model.
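As an illustration of how these metrics follow from the confusion matrix, the sketch below computes them for a small made-up set of labels, predictions, and scores (1 = died, 0 = alive). The numbers are hypothetical and are not taken from the study.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Hypothetical test-set labels, hard predictions, and predicted scores.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 1])
y_score = np.array([0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.05, 0.6])

# For binary labels {0, 1}, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
sensitivity = tp / (tp + fn)   # recall, true positive rate
specificity = tn / (tn + fp)   # true negative rate
f1 = 2 * precision * sensitivity / (precision + sensitivity)

# AUC is computed from continuous scores, not from hard predictions.
auc = roc_auc_score(y_true, y_score)
```

Note that AUC depends on the ranking of the scores, which is why a model can have perfect precision or specificity at one threshold and still a modest AUC.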

2.5 Feature Importance

To reduce the number of features and the complexity of the models, and to improve their accuracy, various feature selection methods can be used to compare candidate models and build the final classification model [31, 32]. Feature importance indicates the contribution of each feature to a model's prediction. Feature selection methods can guide the removal of features that have a low impact on the model's predictions, so that model improvement can focus on the significant features. In simple terms, the main purpose of feature selection is to maximize the performance of a model with a minimal number of features [33].

Feature importance for all of the models was investigated by three methods: (a) the coef_ attribute for the linear models LR and PAC; (b) the feature_importances_ attribute for the DTC, RFC, GBC, ABC, and ETC models; and (c) permutation_importance for the SVC, KNN, HGBC, BC, MLPC, GNB, and LDA models.
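The three routes to feature importance can be sketched with one representative model each. The data are synthetic, and the choice of LR, RFC, and KNN as representatives, as well as the settings shown, are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Small synthetic data set (8 features) standing in for the clinical data.
X, y = make_classification(n_samples=252, n_features=8, n_informative=4,
                           random_state=0)

# (a) Linear models: magnitude of the fitted coefficients (coef_).
lr = LogisticRegression(max_iter=1000).fit(X, y)
coef_importance = np.abs(lr.coef_[0])

# (b) Tree-based models: impurity-based importances (feature_importances_).
rfc = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
tree_importance = rfc.feature_importances_

# (c) Models with neither attribute: permutation importance, i.e. the drop
# in score when one feature column is shuffled. Computed on the training
# data here for brevity; held-out data is the more usual choice.
knn = KNeighborsClassifier().fit(X, y)
perm = permutation_importance(knn, X, y, n_repeats=10, random_state=0)
perm_importance = perm.importances_mean
```

Coefficient magnitudes are only comparable across features when the features are on a common scale, which is one reason the scaling step matters for the linear models.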

All 14 ML models were trained and evaluated using all 42 features. Then, to assess the effect of the number of features on model performance, the four models with the highest AUC were chosen. The wrapper-type feature selection algorithm Recursive Feature Elimination (RFE) was used to select 20, 10, and 5 features for these four models. The four models were then trained and evaluated with 20, 10, and 5 features, following the methods explained in Sects. 2.3 and 2.4.
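A minimal sketch of RFE at the three feature counts is given below, on synthetic data. RFE needs an estimator that exposes coef_ or feature_importances_, so a linear-kernel SVC is assumed here; the paper does not state which kernel or RFE settings were used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic stand-in with the same shape as the study data (252 x 42).
X, y = make_classification(n_samples=252, n_features=42, n_informative=10,
                           random_state=0)

selected = {}
for k in (20, 10, 5):
    # Assumption: linear kernel, because RFE ranks features via coef_.
    rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=k, step=1)
    rfe.fit(X, y)
    # Column indices of the k features that survive elimination.
    selected[k] = rfe.get_support(indices=True)
```

Each reduced feature set would then go through the same tuning and evaluation steps as the full 42-feature models. To keep the test-set estimate unbiased, the selection itself should be fitted on the training-validation data only.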

3 Results

3.1 Data Description

Descriptions of the continuous and categorical features are shown in Tables 1 and 2, respectively. The distribution of the continuous variables is shown in Fig. 2.

Table 1 The statistical description of quantitative continuous features (output 1 and 0 are for died and alive patients, respectively)

Table 2 The statistical description of categorical features

3.2 Data Correlations

The assessment of data correlation was performed at two stages including (a) full data correlation between all of the features and (b) data correlation of features just with the outcome or target. The correlation between features has been shown in Fig. 3 using a heatmap.

The correlation of each feature with the outcome is shown in Fig. 4, and the sorted values are listed in Table 3. The five features most strongly correlated with the outcome were mechanical ventilation, ICU admission, malignancy, steroid therapy, and level of consciousness (LOC).
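Ranking features by their correlation with the outcome can be sketched as follows. The feature names and values are hypothetical, and the Pearson coefficient (which equals the point-biserial coefficient when one variable is binary) is an assumed choice; the paper says only that correlations were computed with SciPy.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 252

# Hypothetical binary outcome (1 = died, 0 = alive) and three made-up features.
outcome = rng.integers(0, 2, size=n)
features = {
    "strong_feature": outcome + rng.normal(0.0, 0.5, n),
    "weak_feature": 0.2 * outcome + rng.normal(0.0, 1.0, n),
    "noise_feature": rng.normal(0.0, 1.0, n),
}

# Correlation of each feature with the outcome.
correlations = {}
for name, values in features.items():
    r, _p_value = pearsonr(values, outcome)
    correlations[name] = r

# Sort by absolute correlation, strongest first.
ranked = sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)
```

A high correlation with the outcome does not by itself make a feature useful for early prediction; features such as ICU admission are recorded during treatment and partly reflect the severity that the model is trying to predict.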

Table 3 The correlation values of output (died or alive) with all 42 clinical, laboratory, and image features

3.3 Evaluation of ML Models

The performance of the considered models on the test data set is reported in Table 4. The highest values of accuracy, precision, sensitivity, specificity, AUC, and F1 score were seen for LDA, KNN, GNB, KNN, PAC, and LDA, respectively. In line with similar studies, the AUC was considered the most important parameter for model evaluation. Table 4 shows that four models, PAC, SVC, LDA, and ETC, perform better than the other ML models based on AUC values. The ROC curves of these four models are shown in Fig. 5.

Table 4 Evaluation metrics of 14 optimized ML models using all 42 clinical, laboratory, and image features

3.4 Feature Importance and Feature Selection

To evaluate how each variable (feature) influences mortality prediction in ML models, we performed feature importance analysis for PAC, SVC, LDA, and ETC as they were the best-performing models (based on AUC scores) among all 14 ML models for mortality prediction.

Based on the results of Fig. 6, the 10 most important features that had the largest impact on the performance of the PAC, SVC, ETC, and LDA models have been reported in Table 5.

Table 5 The 10 most important features for the performance of the four ML models PAC, SVC, ETC, and LDA (AUC > 90%)

The performance of the four selected models with 20, 10, and 5 features is reported in Table 6. Interestingly, feature reduction had no significant influence on the performance of the models. In some models, reducing the number of features even led to an increase in the evaluation parameters. The highest AUC of 93.40% was obtained for the SVC model with 10 features.

Table 6 Evaluation metrics (Accuracy, Precision, Sensitivity, Specificity, AUC, F1 score) of PAC, SVC, ETC, and LDA ML models using 20, 10, and 5 selected features

4 Discussion

In this study, we evaluated the ability of 14 ML algorithms to predict the clinical outcome of mortality using data available from 252 COVID-19 patients.

Based on the AUC evaluation parameter, we found that the linear method PAC performed slightly better than the other ML algorithms when all 42 features were used. The KNN and DTC models had the lowest AUC values, 72% and 75%, respectively, while the other models, with only slight differences among them, showed an AUC of more than 82% in predicting patient death.

Some studies have predicted the severity of the disease or the possibility of death using the biomarkers LDH, hs-CRP, ferritin, and IL-10 [23, 29]. In the current study, by contrast, mechanical ventilation, ICU admission, LOC, malignancy, steroid therapy, calcification, consolidation, and fatigue emerged as important features for early prediction of patient death, based on the features used to train the ML models and on the top four models selected by AUC.

The evaluation metrics for the PAC, SVC, ETC, and LDA models with 20, 10, and 5 features, reported in Table 6, showed that feature reduction lowers the complexity of the models and, in some cases, increases their performance. Among these four models, the SVC model with 10 features had the highest AUC of 93.40%. With just five features, an AUC of 92.07% was obtained for the LDA model. The best models for the prediction of mortality risk with 42, 20, 10, and 5 features, based on the AUC metric, are shown in Table 7.

Table 7 The best ML models in the prediction of mortality risk of COVID-19 patients with 42, 20, 10, and 5 features

Among all the models evaluated with varying numbers of features, the SVC model was found to be the best-performing, achieving an AUC of 93.40% using 10 features. The SVC is a machine learning model that constructs a maximum-margin hyperplane to separate data points belonging to different classes. This hyperplane is positioned to maximize the distance between the decision boundary and the nearest data points from each class, creating a robust classification model. The SVC algorithm identifies the optimal hyperplanes that divide the input data into distinct classes. It then determines the boundaries between the input classes, with the input elements defining these boundaries. The resulting maximum-margin hyperplane provides the best separation between the training data samples belonging to the different classes [48].
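For reference, the maximum-margin idea described above is commonly written as the soft-margin optimisation problem below. The notation is standard textbook notation and an assumption here, not taken from the paper: training pairs (x_i, y_i) with labels y_i in {-1, +1}, weight vector w, bias b, slack variables ξ_i, and penalty parameter C. The paper does not state which kernel or formulation was used.

```latex
\[
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to}\quad & y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1,\dots,n
\end{aligned}
\]
```

The margin has width 2/‖w‖, so minimising ‖w‖ maximises the margin, while C controls how strongly margin violations (ξ_i > 0) are penalised.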

Mortality prediction of COVID-19 patients has been carried out in a number of studies with various machine learning algorithms [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49]. Several studies that investigated and compared different models are reviewed in the text below, and the others are shown in Table 8.

Table 8 The prediction of mortality risk in COVID-19 patients in similar studies with various data sets and ML models in comparison with the current study

An et al. investigated the performance of the least absolute shrinkage and selection operator (LASSO), linear support vector machine (SVM), random forest (RF), and k-nearest neighbors (KNN) models for mortality prediction of 10,237 COVID-19 patients within 14 and 30 days after the initial diagnosis. In their study, linear SVM achieved the highest performance, with an AUC of 0.962, compared to the other models. They also found that age, chronic lung disease, diabetes mellitus, and cancer can increase the risk of mortality [34]. Pourhomayoun et al. used Artificial Neural Network (ANN), Decision Tree (DT), Logistic Regression (LR), SVM, RF, and KNN models for mortality prediction using 57 features in three categories: symptoms, pre-existing conditions (or comorbidities), and demographics. In the best case, ANN with tenfold cross-validation predicted the mortality of patients with an accuracy of 89.98% [33].

Aljameel et al. developed LR, RF, and extreme gradient boosting (XGB) models to predict the severity of disease in COVID-19 patients. Their findings indicated that the best model was RF, which achieved an accuracy of 0.952 and an AUC of 0.99 [35]. Yu et al. compared the performance of CatBoost, a novel gradient-boosting algorithm, with the XGBoost model in predicting mechanical ventilation and mortality. The study found that CatBoost achieved an accuracy of 86.2% for predicting mechanical ventilation and 80% for predicting mortality, which was either comparable to or better than the performance of the XGBoost model [36]. Using two datasets from Wolfram and GitHub, Li et al. examined the accuracy of autoencoder, LR, RF, and SVM models, all of which achieved accuracy values above 0.9 according to their findings [37].

Subudhi et al. compared 18 models for the prediction of severity and mortality. They indicated that ensemble-based models, with mean F1 scores ≥ 0.8, had the highest performance. After these, LR, DT, LDA, Quadratic Discriminant Analysis (QDA), and MLPClassifier (MLPC) also had high F1 scores, between 77% and 79%. In contrast, PAC, perceptron, and linear SVC models had relatively low F1 scores [38].

In another study, by Jamshidi et al., the efficiency of LR, RF, ANN, KNN, LDA, and Naive Bayes models was tested for predicting the risk of mortality in 23,749 patients. They showed that the RF method outperformed the other methods, with an AUC value of 0.79 on the test data [9]. Rustam et al., in a time-series study, investigated four regression models, LR, LASSO, SVM, and Exponential Smoothing (ES), for the prediction of recovery and death rates. According to their findings, the ES algorithm demonstrated the best performance, followed by LR and LASSO, whereas SVM exhibited poor performance across all prediction scenarios using the available dataset [3]. Table 8 shows some studies and the relevant AUC scores related to the prediction of COVID-19 mortality compared to the current study.

In our study, suffering from malignancies, being unconscious at admission, and undergoing mechanical ventilation, ICU admission, and steroid therapy during the course of treatment were the strongest predictors of mortality in patients hospitalized due to COVID-19 infection. Previous research has indicated that patients with pre-existing conditions, including cancer, who contract COVID-19 are at an increased risk of mortality. Despite the possibility of atypical symptoms, the risk of death is significant for these patients [52,53,54]. This finding can be attributed to several factors, including delayed diagnosis resulting from patients being asymptomatic or presenting with uncommon symptoms. Additionally, other confounding factors, such as older age, are more common in this population, which may contribute to the delayed diagnosis [52, 55]. Moreover, while altered consciousness is not a frequent symptom in patients with COVID-19, its presence indicates severe disease and potential underlying multi-organ failure, which together are accompanied by higher mortality rates [56, 57].

Regarding the increased mortality rates in COVID-19 patients receiving steroid therapy, it is worth mentioning that corticosteroids were initially prescribed only for severely ill patients. Only later did clinical trials show the importance of earlier initiation of corticosteroids, not only in the management of the acute phase of the disease but also in the mitigation of long-term complications of COVID-19 [58, 59]. Therefore, steroid therapy, like mechanical ventilation and ICU admission, was a sign of a more severe COVID-19 infection. Disease severity is considered the main prognostic and predictive factor in COVID-19 infection by numerous investigators [60,61,62].

Our study has a few limitations from a clinical perspective. One of the most important is that the features were measured only at the time of the patient's initial hospitalization, 2 to 14 days after the first symptoms were observed, whereas time-series studies have shown that using features measured at different times can help predict the patient's acute condition [22].

It is also worth noting that the limited data availability and incomplete patient information necessitated the exclusion of some participants from the study. Future research focusing on training neural network and DL models with larger, more comprehensive datasets, potentially including time-series data, may enable earlier identification of high-risk individuals. This, in turn, could lead to changes in treatment regimens that could positively impact the mortality prediction capabilities of the ML models.

5 Conclusion

Performance evaluation of 14 ML models showed that the highest values of accuracy (87.30%), precision (100%), sensitivity (77.27%), specificity (100%), AUC (91.90%), and F1 score (77.99%) have been seen for LDA, KNN, GNB, KNN, PAC, and LDA models in training with all 42 features. By using feature selection techniques, the SVC model can predict the mortality risk of COVID-19 patients with an AUC of 93.40% with 10 features of mechanical ventilation, consolidation, fatigue, malignancy, dry cough, LOC, gender, diarrhea, O₂ therapy, and SpO₂.

From a technical point of view, correctly dividing COVID-19 patients into low-risk and high-risk groups can reduce unnecessary hospital visits, costs, and psychological and physical stress on medical staff. It can also speed up the treatment process and ultimately decrease mortality. There are several potential research directions to build upon the findings of this study: collecting and analyzing larger datasets of COVID-19 patients across multiple healthcare settings to further validate the predictive performance of the developed ML models; incorporating time-series data and tracking changes in patient features over the course of illness to improve the early identification of high-risk individuals; exploring deep learning algorithms, which may have greater capacity than traditional ML models to extract complex patterns from the clinical, laboratory, and imaging data; and investigating the incremental value of adding novel biomarkers or other data modalities (e.g., wearable sensor data) to further enhance mortality prediction.

By addressing these future research directions, the predictive models developed in this study could be further refined and optimized to provide more accurate and actionable insights for improving COVID-19 patient outcomes.

Data Availability

Due to ethical and legal concerns, the data and materials used in this study will not be publicly available. However, requests for access to the data and materials may be made to the corresponding author. We will make every effort to provide access to the data and materials in a manner that is consistent with ethical and legal standards.

References

Gandhi, R.T., Lynch, J.B., del Rio, C.: Mild or moderate COVID-19. N. Engl. J. Med. 383, 1757–1766 (2020)
Article Google Scholar
World Health Organization: Middle East respiratory syndrome coronavirus (MERS-CoV). https://www.who.int/emerg encie s/merscov (2020)
Rustam, F., Reshi, A., Mehmood, A., et al.: COVID-19 future forecasting using supervised machine learning models. IEEE Access 8, 101489–101499 (2020)
Article Google Scholar
Tabata, S., Imai, K., Kawano, S., et al.: Clinical characteristics of COVID-19 in 104 people with SARS-CoV-2 infection on the Diamond Princess cruise ship: a retrospective analysis. Lancet Infect. Dis. 20, 1043–1050 (2020)
Article Google Scholar
Jamshidi, E., Babajani, A., Soltani, P., Niknejad, H.: Proposed mechanisms of targeting COVID-19 by delivering mesenchymal stem cells and their exosomes to damaged organs. Stem Cell Rev. Rep. 17(1), 176–192 (2021)
Article Google Scholar
Chow, N., Fleming-Dutra, K., Gierke, R., et al.: Preliminary estimates of the prevalence of selected underlying health conditions among patients with coronavirus disease. MMWR Morb. Mortal. Wkly Rep. 69, 382–386 (2020)
Google Scholar
Gold, J.A.W., Wong, K.K., Szablewski, C.M., et al.: Characteristics and clinical outcomes of adult patients hospitalized with COVID-19. MMWR Morb. Mortal. Wkly Rep. 69(18), 545–550 (2020)
Article Google Scholar
Goh, K.J., Kalimuddin, S., Chan, K.S.: Rapid progression to acute respiratory distress syndrome: review of current understanding of critical illness from coronavirus disease 2019 (COVID-19) infection. Ann. Acad. Med. Singap. 49, 108–118 (2020)
Article Google Scholar
Jamshidi, E., Asgary, A., Tavakoli, N., et al.: Symptom prediction and mortality risk calculation for COVID-19 using machine learning. Front. Artif. Intell. 4, 673527 (2021)
Article Google Scholar
Alballa, N., Al-Turaiki, I.: Machine learning approaches in COVID-19 diagnosis, mortality, and severity risk prediction: a review. Inform. Med. Unlocked 24, 100564 (2021)
Article Google Scholar
Basu, K., Sinha, R., Ong, A., Basu, T.: Artificial intelligence: How is it changing medical sciences and its future? Indian J. Dermatol. 65(5), 365–370 (2020)
Article Google Scholar
Bhatti, U.A., et al.: Local similarity-based spatial–spectral fusion hyperspectral image classification with deep CNN and gabor filtering. IEEE Trans. Geosci. Remote Sens. 60, 1–15 (2022)
Article Google Scholar
Zeng, C., Liu, J., Li, J., et al.: Multi-watermarking algorithm for medical image based on KAZE-DCT. J. Ambient Intell. Human Comput. 15, 1735–1743 (2024)
Article Google Scholar
Bhatti, U.A., Yuan, L., Yu, Z., et al.: New watermarking algorithm utilizing quaternion Fourier transform with advanced scrambling and secure encryption. Multimed. Tools Appl. 80, 13367–13387 (2021)
Article Google Scholar
Chowdhury, M.E.H., Rahman, T., Khandakar, A., et al.: An early warning tool for predicting mortality risk of COVID-19 patients using machine learning. Cognit. Comput. 11, 1–16 (2021)
Google Scholar
Nemati, M., Ansary, J., Nemati, N.: Machine-learning approaches in COVID-19 survival analysis and dischargetime likelihood prediction using clinical data. Patterns 1(5), 100074 (2020)
Article Google Scholar
Dong, E., Du, H., Gardner, L.: An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 1, 30120–30121 (2020)
Google Scholar
Kafieh, R., Arian, R., Saeedizadeh, N., et al.: COVID-19 in Iran: a deeper look into the future. medRxiv (2020). https://doi.org/10.1101/2020.04.24.20078477
Article Google Scholar
Lalmuanawma, S., Hussain, J., Chhakchhuak, L.: Applications of machine learning and artificial intelligence for COVID-19 (SARS-CoV-2) pandemic: a review. Chaos Solitons Fractals 139, 110059 (2020)
Article MathSciNet Google Scholar
Bhattacharya, S., Maddikunta, P., Pham, Q., Gadekallu, T., Krishnan, S., Chowdhary, Ch., Alazab, M., Piran, M.: Deep learning and medical image processing for coronavirus (COVID-19) pandemic: a survey. Sustain. Cities Soc. 65, 102589 (2021)
Article Google Scholar
Zoabi, Y., Deri-Rozov, S., Shomron, N.: Machine learning-based prediction of COVID-19 diagnosis based on symptoms. npj Digit. Med. 4, 3 (2021)
Article Google Scholar
Parchure, P., Joshi, H., Dharmarajan, K., et al.: Development and validation of a machine learning-based prediction model for near-term in-hospital mortality among patients with COVID-19. BMJ Support. Palliat. Care 12, 1–8 (2020)
Google Scholar
Yan, L., Zhang, H., Goncalves, J., et al.: A machine learning-based model for survival prediction in patients with severe COVID-19 infection. medRxiv (2020). https://doi.org/10.1101/2020.02.27.20028027
Article Google Scholar
Wang, K., Zuo, P., Liu, Y., Zhang, M., Zhao, X., Xie, S., Zhang, H., Chen, X., Liu, C.: Clinical and laboratory predictors of in-hospital mortality in patients with coronavirus disease-2019: a cohort study in Wuhan, China. Clin. Infect. Dis. 71(16), 2079–2088 (2020)
Article Google Scholar
Rechtman, E., Curtin, P., Navarro, E., Nirenberg, S., Horton, M.K.: Vital signs assessed in initial clinical encounters predict COVID-19 mortality in an nyc hospital system. Sci. Rep. 10, 1–6 (2020)
Article Google Scholar
Guan, X., Zhang, B., Fu, M., Li, M., Yuan, X., Zhu, Y., Peng, J., Guo, H., Lu, Y.: Clinical and inflammatory features based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: results from a retrospective cohort study. Ann. Med. 53, 257–266 (2021)
Article Google Scholar
Hu, C., Liu, Z., Jiang, Y., Shi, O., Zhang, X., Xu, K., et al.: Early prediction of mortality risk among patients with severe COVID-19, using machine learning. Int. J. Epidemiol. 49(6), 1918–1929 (2020)
Article Google Scholar
Liu, Q., Song, N.C., Zheng, Z.K., Li, J.S., Li, S.K.: Laboratory findings and a combined multifactorial approach to predict death in critically ill patients with COVID-19: a retrospective study. Epidemiol. Infect. 148, 129 (2020)
Article Google Scholar
de Terwangne, C., Laouni, J., Jouffe, L., Lechien, J.R., Bouillon, V., Place, S., Capulzini, L., Machayekhi, S., Ceccarelli, A., Saussez, S., et al.: Predictive accuracy of COVID-19 world health organization (who) severity classification and comparison with a bayesian-method-based severity score (epi-score). Pathogens 9, 880 (2020)
Li, S., Lin, Y., Zhu, T., Fan, M., Xu, S., Qiu, W., Chen, C., Li, L., Wang, Y., Yan, J., et al.: Development and external evaluation of predictions models for mortality of COVID-19 patients using machine learning method. Neural Comput. Appl. 35, 13037–13046 (2023)
Cai, J., Luo, J., Wang, S., Yang, S.: Feature selection in machine learning: a new perspective. Neurocomputing 300, 70–79 (2018)
Jiang, X., Coffee, M., Bari, A., Wang, J., Jiang, X., Huang, J., Shi, J., Dai, J., Cai, J., Zhang, T., Wu, Z., He, G., Huang, Y.: Towards an artificial intelligence framework for data-driven prediction of coronavirus clinical severity. Comput. Mater. Contin. 63, 537–551 (2020)
Pourhomayoun, M., Shakibi, M.: Predicting mortality risk in patients with COVID-19 using machine learning to help medical decision-making. Smart Health 20, 100178 (2021)
An, C., Lim, H., Kim, D.-W., Chang, J.H., Choi, Y.J., Kim, S.W.: Machine learning prediction for mortality of patients diagnosed with COVID-19: a nationwide Korean cohort study. Sci. Rep. 10, 18716 (2020)
Aljameel, S.S., Khan, I.U., Aslam, N., Aljabri, M., Alsulmi, E.S.: Machine learning-based model to predict the disease severity and outcome in COVID-19 patients. Sci. Program. 2021, 5587188 (2021)
Yu, L., Halalau, A., Dalal, B., Abbas, A.E., Ivascu, F., Amin, M., et al.: Machine learning methods to predict mechanical ventilation and mortality in patients with COVID-19. PLoS ONE 16(4), e0249285 (2021)
Li, Y., Horowitz, M.A., Liu, J., Chew, A., Lan, H., Liu, Q., Sha, D., Yang, C.: Individual-level fatality prediction of COVID-19 patients using AI methods. Front. Public Health 8, 587937 (2020)
Subudhi, S., Verma, A., Patel, A.: Comparing machine learning algorithms for predicting ICU admission and mortality in COVID-19. npj Digit. Med. 4, 87 (2021)
Moulaei, K., Shanbehzadeh, M., Mohammadi-Taghiabad, Z., Kazemi-Arpanahi, H.: Comparing machine learning algorithms for predicting COVID-19 mortality. BMC Med. Inform. Decis. Mak. 22, 2 (2022)
Elshennawy, N.M., Ibrahim, D.M., Sarhan, A.M., Arafa, M.: Deep-risk: deep learning-based mortality risk predictive models for COVID-19. Diagnostics 12, 1847 (2022)
González-Cebrián, A., Borràs-Ferrís, J., Ordovás-Baines, J.P., Hermenegildo-Caudevilla, M., Climente-Marti, M., Tarazona, S., et al.: Machine-learning-derived predictive score for early estimation of COVID-19 mortality risk in hospitalized patients. PLoS ONE 17(9), e0274171 (2022)
Casillas, N., Torres, A.M., Moret, M., Gómez, A., Rius-Peris, J.M., Mateo, J.: Mortality predictors in patients with COVID-19 pneumonia: a machine learning approach using eXtreme gradient boosting model. Intern. Emerg. Med. 17, 1929–1939 (2022)
Talkhi, N., Akbari Sharak, N., Yousefi, R., Salari, M., Sadati, S.M., Shakeri, M.T.: Predicting COVID-19 mortality and identifying clinical symptom patterns in hospitalized patients: a machine-learning study. Iran. J. Health Sci. 12(1), 39–48 (2024)
Tamal, M., Marufur Rahman, M., Alhasim, M., Al Mulhim, M., Deriche, M.: Artificial intelligence (AI) based prediction of mortality, ICU admission and ventilation support requirement for COVID-19 patients using 122 clinical and demographic parameters. medRxiv 3, 1157 (2024)
Li, S., Lin, Y., Zhu, T., et al.: Development and external evaluation of predictions models for mortality of COVID-19 patients using machine learning method. Neural Comput. Appl. 35, 13037–13046 (2023)
Ustebay, S., Sarmis, A., Kaya, G., Sujan, M.: A comparison of machine learning algorithms in predicting COVID-19 prognostics. Intern. Emerg. Med. 18, 229–239 (2023)
Saadatmand, S., Salimifard, K., Mohammadi, R., et al.: Using machine learning in prediction of ICU admission, mortality, and length of stay in the early stage of admission of COVID-19 patients. Ann. Oper. Res. 328, 1043–1071 (2023)
Janmenjoy, N., Bighnaraj, N., Behera, H.: A comprehensive survey on support vector machine in data mining tasks: applications & challenges. Int. J. Database Theory Appl. 8(1), 169–186 (2015)
Banoei, M.M., Rafiepoor, H., Zendehdel, K., Seyyedsalehi, M.S., Nahvijou, A., Allameh, F., Amanpour, S.: Unraveling complex relationships between COVID-19 risk factors using machine learning based models for predicting mortality of hospitalized patients and identification of high-risk group: a large retrospective study. Front. Med. 10, 1170331 (2023)
Yang, L., Shami, A.: On hyperparameter optimization of machine learning algorithms: theory and practice. Neurocomputing 415, 295–316 (2020)
Alzubi, J., Nayyar, A., Kumar, A.: Machine learning from theory to algorithms: an overview. J. Phys. Conf. Ser. 1142, 012012 (2018)
Shahidsales, S., Aledavood, S.A., Joudi, M., Molaie, F., Esmaily, H., Javadinia, S.A.: COVID-19 in cancer patients may be presented by atypical symptoms and higher mortality rate, a case-controlled study from Iran. Cancer Rep. 4(5), e1378 (2021)
Taghizadeh-Hesary, F., Pejman Porouhan, P., Soroosh, D., PeyroShabany, B., et al.: COVID-19 in cancer and non-cancer patients. Int. J. Cancer Manag. 14(4), e110907 (2021)
Fazilat-Panah, D., Fallah Tafti, H., Rajabzadeh, Y., Fatemi, M.A., et al.: Clinical characteristics and outcomes of COVID-19 in 1294 new cancer patients: single-center, prospective cohort study from Iran. Cancer Invest. 40(6), 505–515 (2022)
Clark, A., Jit, M., Warren-Gash, C., Guthrie, B., Wang, H.H., Mercer, S.W., et al.: Global, regional, and national estimates of the population at increased risk of severe COVID-19 due to underlying health conditions in 2020: a modelling study. Lancet Glob. Health 8(8), e1003–e1017 (2020)
Kim, H.K., Cho, Y.J., Lee, S.Y.: Neurological manifestations in patients with COVID-19: experiences from the central infectious diseases hospital in South Korea. J. Clin. Neurol. 17(3), 435–442 (2021)
Chen, X., Laurent, S., Onur, O.A., Kleineberg, N.N., Fink, G.R., Schweitzer, F., et al.: A systematic review of neurological symptoms and complications of COVID-19. J. Neurol. 268, 392–402 (2021)
Horby, P., Lim, W.S., Emberson, J.R., Mafham, M., Bell, J.L., Linsell, L., et al.: Dexamethasone in hospitalized patients with COVID-19. N. Engl. J. Med. 384(8), 693–704 (2021)
Dhooria, S., Chaudhary, S., Sehgal, I.S., Agarwal, R., Arora, S., Garg, M., et al.: High-dose versus low-dose prednisolone in symptomatic patients with post-COVID-19 diffuse parenchymal lung abnormalities: an open-label, randomised trial (Acronym: COLDSTER). Eur. Respir. J. 59(2), 2102930 (2021)
Marcilio, I., Lazar Neto, F., Lazzeri Cortez, A., Miethke-Morais, A., et al.: Mortality over time among COVID-19 patients hospitalized during the first surge of the pandemic: a large cohort study. PLoS ONE 17(9), e0275212 (2022)
Patel, U., Malik, P., Usman, M.S., et al.: Age-adjusted risk factors associated with mortality and mechanical ventilation utilization amongst COVID-19 hospitalizations—a systematic review and meta-analysis. SN Compr. Clin. Med. 2, 1740–1749 (2020)
Khamis, F., Memish, Z., Bahrani, M.A., Dowaiki, S.A., Pandak, N., et al.: Prevalence and predictors of in-hospital mortality of patients hospitalized with COVID-19 infection. J. Infect. Public Health 14(6), 759–765 (2021)

Acknowledgements

The authors sincerely thank the Vasei Clinical Research Development Unit at Sabzevar University of Medical Sciences for providing advice and guidance in conducting this research.

Funding

This work was fully supported by Sabzevar University of Medical Sciences (grant number 400151).

Author information

Authors and Affiliations

Department of Medical Physics and Radiological Sciences, Sabzevar University of Medical Sciences, Sabzevar, Iran
Atefeh Rostami, Mostafa Robatjazi & Mohammad Mehrpouyan
Student Research Committee, Sabzevar University of Medical Sciences, Sabzevar, Iran
Faezeh Mousavi
Non-Communicable Diseases Research Center, Sabzevar University of Medical Sciences, Sabzevar, Iran
Seyed Alireza Javadinia & Mostafa Robatjazi

Authors

Atefeh Rostami
Faezeh Mousavi
Seyed Alireza Javadinia
Mostafa Robatjazi
Mohammad Mehrpouyan

Contributions

Atefeh Rostami: conceptualization, methodology, formal analysis, resource, writing—original draft, writing—review and editing, supervision, project administration. Faezeh Mousavi: methodology, software, validation, investigation, and data curation. Mostafa Robatjazi: conceptualization, methodology, resource, writing—review and editing, supervision, project administration. Seyed Alireza Javadinia: resource, writing—review and editing, supervision, project administration. Mohammad Mehrpouyan: resource, writing—review and editing, project administration.

Corresponding author

Correspondence to Mostafa Robatjazi.

Ethics declarations

Conflict of Interest

We declare that we have no financial or personal relationships with any individuals or organizations that could potentially influence the content of our article.

Consent for Publication

Not applicable.

Ethical Approval

This study was approved by the Ethics Committee of Sabzevar University of Medical Sciences (IR.MEDSAB.REC.1400.132). The requirement for informed consent was waived by the same committee. All methods were carried out in accordance with relevant guidelines and regulations.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Rostami, A., Mousavi, F., Javadinia, S.A. et al. Predictive Value of Machine Learning Models in Mortality of Coronavirus Disease 2019 (COVID-19) Pneumonia. Int J Comput Intell Syst 17, 221 (2024). https://doi.org/10.1007/s44196-024-00633-2

Received: 18 March 2024
Accepted: 13 August 2024
Published: 26 August 2024
DOI: https://doi.org/10.1007/s44196-024-00633-2

1 Introduction