Background

Predictive modeling has been a hot topic of coronavirus disease 2019 (COVID-19) research [1]. Since the very beginning of the epidemic, there has been a significant push towards developing predictive models for COVID-19 diagnosis and prognosis. Interest in developing such models was driven by the initial lack of knowledge about COVID-19 diagnosis, treatment, and prognosis and by the unexpected and dramatic pressure on the healthcare system, especially on intensive care units (ICUs) [2]. These models were intended to help physicians stratify patients’ risk of developing the outcome of interest, e.g., the need for hospitalization or mechanical ventilation.

A systematic review of the literature by Wynants et al. identified more than sixty predictive models already published at the beginning of the pandemic, i.e., by April 2020. The update of this systematic review recorded more than two hundred models [1]. Initially, most models focused on COVID-19 diagnosis, while the updated review showed that many more published models focused on patients' prognosis, particularly on predicting the risk of death.

The idea behind such algorithms was to characterize patients at higher risk of death from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection to help physicians identify the best treatment for each patient according to his/her characteristics. The final aim was to guarantee an efficient allocation of the healthcare resources given the dramatic shortage resulting from the outbreak.

Italy was the first European country hit by the COVID-19 outbreak. Lombardy and Veneto were the two Italian regions where COVID-19 spread first. Within a short time, healthcare authorities worked to activate emergency measures to contain the spread of the virus at the population level and to organize the healthcare system response to the sudden and unexpected increase in demand for healthcare assistance [2,3,4,5]. In the Veneto Region, the “COVID-19 VENETO ICU Network” was established [5]. It is an official task force aimed at optimizing ICU resource management through the identification of dedicated COVID-19 pathways and an increase in ICU bed capacity. Furthermore, the network aims to share experience on COVID-19 patients’ treatment among intensive care medicine specialists to standardize patient care. Finally, data on COVID-19 patients admitted to the COVID-19 ICUs of the network have been collected routinely, supporting both epidemiological surveillance of the phenomenon, e.g., to plan the activation of additional ICU beds, and clinical research.

The aim of the present study was to develop and validate a predictive model through a machine learning approach for ICU mortality in COVID-19 patients using VENETO ICU Network data.

Methods

We prospectively screened the records of all adult patients with confirmed SARS-CoV-2 infection admitted to the ICUs of the COVID-19 VENETO ICU network, between 28th of February 2020 and 4th of April 2021 [5, 6]. COVID-19 diagnosis was made according to the World Health Organization interim guidance (http://www.who.int/docs/default-source/coronaviruse/clinical-management-of-novel-cov.pdf).

The study was approved by the Institutional Ethical Committee of each participating center (coordinator center approval reference number 4853AO20) and informed consent was obtained for each patient in compliance with national regulation and the recommendations of the Institutional Ethical Committee of Padova University Hospital.

The study cohort was divided into two groups according to the time of ICU admission. The first group, i.e., the “training set,” included patients admitted to the ICUs from 28th of February to 28th of April 2020 and from 27th of November 2020 to 4th of March 2021, and was used for model training. The second group (named “test set 1”), composed of patients admitted to the ICUs from 5th of March to 4th of April 2021, was used for external validation of the model.

In addition, a third group (named “test set 2”), composed of patients admitted to the ICU of IRCCS Ca’ Granda Ospedale Maggiore Policlinico of Milan (Lombardy Region) in the same period of time, was also used for external validation.
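The assignment of patients to the three sets can be illustrated with a minimal Python sketch. The admission windows are taken from the text; the record layout (dictionaries with `site` and `admitted` keys), the site labels, and the function name are hypothetical. Two readings are assumed and flagged in the comments: that the November window starts in 2020, and that “the same period of time” for test set 2 refers to the test set 1 window.

```python
from datetime import date

# Admission windows from the Methods. Assumption: the window starting on
# 27 November falls in 2020, since the study period ends in April 2021.
TRAIN_WINDOWS = [
    (date(2020, 2, 28), date(2020, 4, 28)),
    (date(2020, 11, 27), date(2021, 3, 4)),
]
TEST_WINDOW = (date(2021, 3, 5), date(2021, 4, 4))


def assign_set(record):
    """Return 'training', 'test1', 'test2', or None for one admission record.

    `record` is a hypothetical dict with a 'site' label ('veneto' or 'milan')
    and an 'admitted' date; the real database layout is not described.
    """
    day = record["admitted"]
    in_test_window = TEST_WINDOW[0] <= day <= TEST_WINDOW[1]
    if record["site"] == "milan":
        # Assumption: the Milan cohort covers the same window as test set 1.
        return "test2" if in_test_window else None
    if any(start <= day <= end for start, end in TRAIN_WINDOWS):
        return "training"
    return "test1" if in_test_window else None
```

Splitting by calendar time rather than at random means the validation patients are always admitted after the training patients, which mimics how the model would be applied prospectively.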

At ICU admission, the physicians in charge of the patients prospectively collected a predefined set of clinical variables, as listed in Supplementary Materials (Table S1), and entered the data into a predesigned data collection form implemented in a web-based system. Moreover, the physicians recorded the need for respiratory support, tracheostomy, re-intubation, prone positioning, extracorporeal membrane oxygenation, continuous veno-venous hemofiltration, and vasoactive agents during the ICU stay, as well as any re-admission. Each investigator had a personal username and password. Patients’ privacy was protected by assigning a de-identified patient code. Prior to data analysis, two independent investigators and a statistician screened the database for errors against standardized ranges and contacted local investigators with any queries. Then, validated data were entered in the database for final analysis.

Models estimation

Three SuperLearner (SL) prediction tools were developed and validated on the ICU data (see Additional file 1, Table S1, for the complete list of variables included in each model):

  1. Model 1. The first model was tuned considering only the variables collected at ICU admission with less than 85% missing data (see Additional file 1, Table S1, for the complete list of variables included in the model). The external validation was performed on “test set 1” and “test set 2.”

  2. Model 2. The second model was tuned considering all the variables collected at ICU admission, including those with more than 85% missing data (Additional file 1, Table S1). The external validation was performed on “test set 1.”

  3. Model 3. The third model was tuned considering the variables collected at ICU admission and during the ICU stay, including those with more than 85% missing data (Additional file 1, Table S1). The external validation was performed on “test set 1.”
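The 85% missingness rule that separates Model 1 from Models 2 and 3 can be sketched as a simple filter. This is an illustrative Python sketch, not the authors' code: the function names, the variable keys, and the toy records are hypothetical, and missing values are assumed to be stored as `None`.

```python
def missing_fraction(rows, variable):
    """Fraction of records in which `variable` is absent or None."""
    missing = sum(1 for row in rows if row.get(variable) is None)
    return missing / len(rows)


def select_variables(rows, candidates, max_missing=0.85):
    """Keep candidate variables whose missing fraction is below `max_missing`."""
    return [v for v in candidates if missing_fraction(rows, v) < max_missing]


# Toy records, fabricated for illustration only (not study data).
rows = [
    {"age": 71, "sofa_total": 6, "pps_activity": None},
    {"age": 64, "sofa_total": None, "pps_activity": None},
    {"age": 58, "sofa_total": 4, "pps_activity": None},
    {"age": 80, "sofa_total": 9, "pps_activity": None},
]
# Model 1 style selection: only variables under the missingness threshold.
model1_vars = select_variables(rows, ["age", "sofa_total", "pps_activity"])
```

In this toy example the third variable is missing in every record and is therefore dropped, while the other two are retained.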

SuperLearner approach

SuperLearner (SL) is an ensemble machine learning algorithm that combines multiple machine learning techniques (MLTs), i.e., base learners, into a weighted combination chosen to achieve the best possible cross-validated performance [7,8,9]. A detailed description of the algorithms is provided in Additional file 1, Methods S2. Figure 1 presents a schematic representation of the base learners.

Fig. 1 Schematic representation of the base learners used in the SuperLearner ensemble model
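The weighting step of an ensemble of this kind can be illustrated with a short Python sketch. It assumes that out-of-fold predicted probabilities are already available for each base learner and searches a coarse grid of non-negative weights summing to one for the combination with the lowest squared error. The grid search is a simple stand-in chosen for readability, not the optimizer used in the study, and the function names and toy numbers are hypothetical.

```python
from itertools import product


def simplex_grid(n_learners, steps=20):
    """Yield weight vectors on a grid that are non-negative and sum to 1."""
    for combo in product(range(steps + 1), repeat=n_learners - 1):
        if sum(combo) <= steps:
            yield [c / steps for c in combo] + [(steps - sum(combo)) / steps]


def ensemble_loss(weights, cv_preds, y):
    """Mean squared error of the weighted average of base-learner predictions."""
    total = 0.0
    for preds, outcome in zip(cv_preds, y):
        blended = sum(w * p for w, p in zip(weights, preds))
        total += (blended - outcome) ** 2
    return total / len(y)


def fit_weights(cv_preds, y, steps=20):
    """Pick the convex combination with the lowest cross-validated loss."""
    n_learners = len(cv_preds[0])
    return min(simplex_grid(n_learners, steps),
               key=lambda w: ensemble_loss(w, cv_preds, y))


# Toy out-of-fold probabilities from three hypothetical base learners
# (one row per patient, one column per learner); not study data.
cv_preds = [
    [0.9, 0.6, 0.5],
    [0.2, 0.4, 0.5],
    [0.8, 0.7, 0.5],
    [0.1, 0.5, 0.5],
]
y = [1, 0, 1, 0]
weights = fit_weights(cv_preds, y)
```

Because the weights are fitted on predictions for patients that each base learner did not see during training, a learner that merely overfits its training folds gains no advantage in the combination.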

Performance measures

The sensitivity, specificity, F1 statistic, and balanced accuracy were computed. The training ROC plots were reported.
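For reference, these measures follow directly from the confusion matrix of predicted versus observed ICU deaths. The Python sketch below uses the standard definitions; the function name and the toy labels are hypothetical, and the sketch assumes that both outcome classes are present and that at least one death is predicted, so that no denominator is zero.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, F1, and balanced accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # deaths correctly predicted
    specificity = tn / (tn + fp)   # survivors correctly predicted
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Toy labels, fabricated for illustration (1 = died in ICU, 0 = survived).
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 0, 0, 0, 1, 0, 1],
)
```

Balanced accuracy averages sensitivity and specificity, so it is not inflated when one outcome class is much more frequent than the other.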

Internal cross-validation

The base models and the SuperLearner models underwent internal validation using a 5-fold cross-validation procedure.
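The fold construction behind a 5-fold procedure can be sketched as follows. This is a generic Python illustration with hypothetical function names; whether the folds in the study were shuffled or stratified by outcome is not stated, so plain shuffled folds with a fixed seed are assumed.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Split shuffled indices 0..n_samples-1 into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_validation_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold is held out exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, folds[held_out]


# Example with the size of the training set reported in the Results.
splits = list(train_validation_splits(1293, k=5))
```

Each patient appears in exactly one validation fold, so every prediction used to assess performance comes from a model that was fitted without that patient.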

Variable importance plot

The variable importance plots were reported. The importance measure was computed considering the mean decrease in the ROC measure resulting from the removal of the variable within the permutations, as recommended in the literature [10].
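A permutation-based importance measure of this kind can be sketched in Python as the mean drop in the area under the ROC curve after shuffling one predictor. This is a generic illustration under stated assumptions, not the procedure of reference [10]: `predict` is a hypothetical function mapping one row of predictors to a risk score, rows are plain lists, and the toy data are fabricated.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, column, n_repeats=20, seed=0):
    """Mean decrease in AUC after shuffling the values of one column of X."""
    rng = random.Random(seed)
    baseline = roc_auc(y, [predict(row) for row in X])
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:column] + [v] + row[column + 1:]
                  for row, v in zip(X, shuffled)]
        drops.append(baseline - roc_auc(y, [predict(row) for row in X_perm]))
    return sum(drops) / n_repeats


# Toy data (not study data): column 0 is age, column 1 is an unused variable.
X = [[82, 3], [77, 1], [75, 4], [51, 2], [48, 5], [44, 0]]
y = [1, 1, 1, 0, 0, 0]


def predict(row):
    """Hypothetical risk score that depends on age only."""
    return row[0] / 100


importance_age = permutation_importance(predict, X, y, column=0)
importance_unused = permutation_importance(predict, X, y, column=1)
```

Shuffling a predictor breaks its link with the outcome while leaving its distribution intact, so a variable the model does not rely on yields a decrease of zero.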

Descriptive statistics

Continuous data were reported as I quartile/median/III quartile; categorical data were reported as percentages and absolute frequencies.
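These summaries can be reproduced with the Python standard library, as in the sketch below. The function names and toy values are hypothetical, missing values are assumed to be stored as `None` and are excluded before summarizing, and the "inclusive" quartile method is an assumption, since quartile definitions differ slightly between software packages.

```python
from collections import Counter
from statistics import quantiles


def summarize_continuous(values):
    """Return (I quartile, median, III quartile), ignoring missing values."""
    observed = [v for v in values if v is not None]
    q1, median, q3 = quantiles(observed, n=4, method="inclusive")
    return q1, median, q3


def summarize_categorical(values):
    """Return {level: (percentage, absolute frequency)} over observed values."""
    observed = [v for v in values if v is not None]
    counts = Counter(observed)
    return {level: (100 * n / len(observed), n) for level, n in counts.items()}


# Toy values, fabricated for illustration only.
age_summary = summarize_continuous([50, 60, None, 70, 80, 90])
outcome_summary = summarize_categorical(["died", "alive", "alive", None, "alive"])
```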

Shiny web application

A Shiny web application was developed. The tool calculates the probability of ICU death from the patient’s characteristics under each of the estimated models.

Results

Study population

The overall population included 1616 patients. The first 1293 (80%) patients admitted to the ICUs of the VENETO ICU Network were used for model training (“training set”), while the following 124 (8%) patients were used for external validation (“test set 1”). In addition, a further cohort of 199 (12%) patients, admitted to the IRCCS Ca’ Granda Ospedale Maggiore Policlinico, was used for additional external validation (“test set 2”).

Table 1 presents the characteristics of the training and validation cohorts. The proportion of deaths was 39% in the cohort of 1417 patients admitted to the ICUs of the VENETO ICU Network and 28% in the cohort of 199 patients admitted to the IRCCS Ca’ Granda Ospedale Maggiore Policlinico.

Table 1 Training and test sets characteristics

Model 1 was trained on the overall ‘training set’ of patients (Table 1) because the model included a limited set of variables measured at ICU admission (Supplementary materials, Table S1) with less than 85% of missing data. Models 2 and 3 were trained on 656 out of 1293 patients because they also included variables with more than 85% of missing data (Table 1) (see Supplementary materials, Table S1, for the complete list of variables included in each model). The main difference between Model 2 and Model 3 is that Model 2 included only variables measured at ICU admission, while Model 3 included variables recorded at admission and also during the ICU stay.

Model 1 was validated on both the cohort of 124 patients belonging to the “test set 1” and on the cohort of 199 patients named “test set 2.” Models 2 and 3 were validated on the external cohort of 124 patients admitted to the ICUs of the COVID-19 VENETO ICU Network (“test set 1”).

Models’ performance

The three models showed similar performances in predicting ICU mortality (Table 2 and Additional file 1, Figure S3), with a training balanced accuracy that ranged between 0.72 and 0.90.

Table 2 Training and test validation performances

The cross-validation performance is shown in Fig. 2. The best performance was achieved by Model 3, with an area under the ROC curve of 0.85, while Models 1 and 2 both had a value of 0.75.

Fig. 2 Cross-validated performances. The figure presents cross-validated area under the ROC curves according to base learners and SuperLearner for the three models

Regarding the performance of the algorithms on which the SuperLearner was based, the random forest (RF) performed best for Model 3 and, together with the gradient boosting machine (GBM), for Model 2. For Model 1, the best performance was achieved by Bayesian Additive Regression Trees (BartMachine) (Fig. 2).

Variables importance in relation to the outcome

Age was the leading predictor for all the considered models, followed in Model 1 by the total SOFA score at ICU admission and the arterial partial pressure of oxygen to inspired oxygen fraction ratio used for SOFA calculation (SOFA PaO2/FiO2). The SOFA PaO2/FiO2 was also a relevant predictor for Model 2, as was the Palliative Performance Score (PPS) Activity variable. The PPS Activity was also among the top five parameters for Model 3, together with the need for O2 therapy, non-invasive ventilation, or invasive ventilation (Fig. 3).

Fig. 3 Variable importance plots. The ten most important predictors are reported in the plots. Abbreviations: PaO2/FiO2, arterial partial pressure of oxygen to inspired oxygen fraction ratio; SOFA, Sequential Organ Failure Assessment; CCI, Charlson Comorbidity Index; GCS, Glasgow Coma Scale; PPS, Palliative Performance Score.

The shiny app reporting the three ICU mortality prediction tools is available at https://r-ubesp.dctv.unipd.it/shiny/CoViD-19%20icupred/.

Discussion

The present study provides a tool for predicting ICU mortality in COVID-19 patients using data from a large cohort of patients admitted to the ICUs of the COVID-19 VENETO ICU Network. The three models, systematically built through a machine learning approach, showed good training and validation performances and yielded similar results in predicting ICU mortality.

In particular, age was identified as the most important predictive parameter in every model investigated. Secondarily, the total SOFA score at ICU admission, the level of daily activity, and the need for different types of respiratory support were important parameters for Models 1, 2, and 3.

This finding is in line with the current literature, which describes a strong impact of age on mortality in COVID-19 patients undergoing invasive and non-invasive ventilation [6, 11,12,13,14,15]. Karagiannidis et al., in the largest cohort of hospitalized COVID-19 patients, showed that mortality was high for patients receiving mechanical ventilation, particularly for patients aged 80 years or older and those requiring dialysis, and was considerably lower for patients younger than 60 years [11].

Similar findings were reported by Boscolo et al. and Vaschetto et al., who investigated in-hospital mortality of mechanically ventilated COVID-19 patients. In both studies, the cumulative incidence of mortality at 60 days was higher in older patients [6, 12].

Notably, since the beginning of the pandemic, several tools have been proposed for mortality prediction in COVID-19 patients; however, it is difficult to compare their performance because each model was developed on patients with different characteristics, using different sets of variables and different techniques for model development. Indeed, Wynants and colleagues have shown that all published models have several limitations [1], including small sample sizes and a lack of information and clarity in the reporting of algorithm development. For these reasons, it is difficult to compare models’ performance and to identify the most feasible model to be used in everyday clinical practice to assist physicians’ decisions. Our findings show that the SL is a feasible approach for use with clinical data, providing good predictive performance and good generalizability. Although machine learning approaches are increasingly used in the clinical setting, including in COVID-19 research [7,8,9,10], more traditional techniques, i.e., logistic regression for binary outcomes and survival regression models for time-to-event outcomes, are still widely used because they are much simpler to implement and interpret. However, machine learning approaches add value to predictive modeling, as they allow the detection of complex relationships between the outcomes of interest and the covariates, overcoming the limits of traditional analysis, especially when a high number of predictors is evaluated relative to a low number of events.

The predictive tools described in the present paper have several strengths, including the fact that they have been developed on a large multicenter cohort of patients admitted to the ICUs of one of the Italian regions most severely affected by the COVID-19 pandemic, the use of both internal and external validation, and the use of a machine learning tool instead of more traditional techniques to build the predictive model.

However, our study has some limitations. First, the clinical variables investigated in our study represent only a small number of the parameters potentially relevant to, and able to affect, critically ill patients’ outcomes. Second, several patients had incomplete records, which was due to the overwhelming workload for ICU physicians during the COVID-19 pandemic.

Conclusions

Our study provides a useful and reliable tool, built through a machine learning approach using the SL algorithm, for predicting ICU mortality in COVID-19 patients. Age was the most important predictor in all the estimated models.