Introduction

The Coronavirus Disease 2019 (COVID-19) pandemic has severely affected the well-being of many countries and their health-care systems.

COVID-19 has spread rapidly across the world, with a particularly steep incidence curve in Mediterranean countries such as Italy and Spain [1]. The first case in Madrid was reported on February 25th, 2020 [2], and our center diagnosed its first patient on March 1st. Despite efforts to flatten the epidemic curve, more than 3000 patients were attended in our hospital over the following weeks. The number of subjects needing critical-care assistance grew enormously, a picture that was the rule in the metropolitan area of Madrid and in some other highly populated Spanish regions. The magnitude of the epidemic, the uncertain benefit-harm balance of available treatments, and the ethical responsibility for fairly allocating medical resources have generated great stress among physicians [3].

In such a scenario, a rapid and early assessment of patients’ risk of progressing to respiratory failure is essential to manage hospital resources wisely. Recent studies have reported that elderly patients with high blood pressure who present with lymphopenia or with elevated C-reactive protein (CRP), Sequential Organ Failure Assessment index, or d-dimer are at higher risk of severe disease and death [4,5,6,7,8,9,10]. Still, these parameters may lack sensitivity and specificity, and they should be complemented by other strategies aiming to avoid or delay respiratory support. The ideal baseline evaluation of patients should be simple and quick to perform, but it should also lead to the right decisions. The aim of this study was to analyze the clinical characteristics of the initial wave of patients with COVID-19 attended in our center, and to create an easy-to-perform score to rapidly identify patients at risk of developing respiratory failure.

Methods

Study Population and design

This prospective observational study included all consecutive adult (≥ 18 years) patients with confirmed COVID-19 who were hospitalized at the University Hospital “12 de Octubre” from March 2nd to 18th, 2020. Our center is a 1200-bed teaching hospital with 56 ICU beds, serving as the referral hospital for a population of around 470,000 inhabitants in southern Madrid (Spain). Patients were enrolled at the time of diagnosis of COVID-19 and followed up to May 1st, 2020 or death, whichever came first.

Clinical characteristics, baseline features measured by Charlson Comorbidity Index [11], vital signs, respiratory status, radiological data, and laboratory values at admission, along with patient progress and complications during hospitalization were recorded in electronic medical records and concurrently extracted.

Vital signs were obtained in the Emergency Department triage and the first available laboratory data from each patient were used to calculate the score. Study data were collected and managed using REDCap electronic data capture tools hosted at the Research Institute of Hospital 12 de Octubre (imas+12). REDCap (Research Electronic Data Capture) is a secure, web-based software platform designed to support data capture for research studies [12, 13].

The protocol was approved by the Hospital 12 de Octubre Clinical Research Ethics Committee (reference 20/117) and granted a waiver of informed consent due to its observational design.

Microbiological methods

For the molecular diagnosis of SARS-CoV-2 infection, nasopharyngeal swabs [flocked swabs in UTM™ viral transport medium (Copan Diagnostics, Brescia, Italy)] were obtained from suspected cases and processed by automated extraction and specific PCR methods [14]. For real-time reverse transcription polymerase chain reaction (rRT-PCR), the LightCycler 480 System instrument (Roche Life Science, Indianapolis, IN, USA) was used.

Endpoint definition

Respiratory failure was defined as a partial pressure of arterial oxygen/fraction of inspired oxygen (PaO2/FiO2) ratio ≤ 200 mmHg [15], or the need for mechanical ventilation (either non-invasive positive pressure ventilation, that is, continuous (CPAP) or bilevel positive pressure ventilation, or invasive mechanical ventilation), including those patients who had a clinical indication for ventilatory support but for any reason were finally not ventilated. If PaO2 was not available, the estimated PaO2/FiO2 (ePaO2/FiO2) ratio was calculated from the pulse oximetry saturation/fraction of inspired oxygen (SpO2/FiO2) ratio by applying the formula SpO2/FiO2 = 64 + 0.84 × PaO2/FiO2 [16].
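
The endpoint rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names and the `ventilation_indicated` flag are hypothetical, FiO2 is assumed to be expressed as a fraction (0.21 for room air), and the estimate simply inverts the formula quoted in the text.

```python
def estimate_pf_ratio(spo2_percent, fio2_fraction):
    """Estimate PaO2/FiO2 by inverting SpO2/FiO2 = 64 + 0.84 * PaO2/FiO2."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    sf_ratio = spo2_percent / fio2_fraction
    return (sf_ratio - 64.0) / 0.84


def meets_endpoint(pf_ratio, ventilation_indicated):
    """Endpoint as described in the text: ratio <= 200 mmHg, or
    mechanical ventilation needed or clinically indicated."""
    return pf_ratio <= 200.0 or ventilation_indicated


# Illustrative patient (values invented): SpO2 of 90% on room air
epf = estimate_pf_ratio(90, 0.21)
print(round(epf, 1))
print(meets_endpoint(epf, ventilation_indicated=False))
```

Keeping the estimate and the endpoint check in separate functions lets a measured PaO2/FiO2 ratio be passed directly to `meets_endpoint` whenever arterial blood gas data are available.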

Statistical methods

The presence or absence of respiratory failure was determined blinded to clinical information. We did not calculate a formal sample size. Instead, all available data were used to maximize the power of the study.

The outcome (respiratory failure) was recorded in all patients. For the variables in the predictive model of respiratory failure [age, lymphocytes, SpO2%, CRP and lactate dehydrogenase (LDH)], only 18 patients had missing data, representing 3.45% of the entire cohort. Given the low percentage of patients excluded from the multivariate model, no imputation of data was performed and only complete cases were included in the development of the model.
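
A complete-case rule of this kind can be sketched as follows. The record layout and field names are assumptions made for illustration, and the two records are invented; only the counts 18 and 521 come from the text.

```python
# Field names are hypothetical; they mirror the five model predictors.
PREDICTORS = ("age", "lymphocytes", "spo2", "crp", "ldh")


def complete_cases(records, predictors=PREDICTORS):
    """Keep only records in which every predictor is present (not None)."""
    return [r for r in records
            if all(r.get(p) is not None for p in predictors)]


def percent_excluded(n_missing, n_total):
    """Share of the cohort dropped by the complete-case rule, in percent."""
    return 100.0 * n_missing / n_total


# Toy records (invented values, for illustration only)
records = [
    {"age": 60, "lymphocytes": 0.8, "spo2": 93, "crp": 5.0, "ldh": 315},
    {"age": 71, "lymphocytes": None, "spo2": 89, "crp": 12.0, "ldh": 420},
]
print(len(complete_cases(records)))          # 1
print(round(percent_excluded(18, 521), 2))   # 3.45
```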

Quantitative variables were described using median and interquartile range (IQR) or mean ± standard deviation (SD), and compared by Student’s t test for independent samples or the Mann–Whitney U test, as appropriate. All parameters were tested for normality of distribution by means of the Kolmogorov–Smirnov test. Categorical variables were expressed as absolute and relative frequencies, and compared by the χ2 test or Fisher’s exact test.

Univariate analysis was performed to establish the relationship of each variable to the development of respiratory failure, and p values adjusted for multiple comparisons were obtained using the method of Benjamini and Hochberg. Associations were expressed as odds ratios (ORs) with 95% confidence intervals (95% CI) for categorical variables, and effect sizes with 95% CI for continuous variables were calculated using Cohen’s d with the “esize” function of Stata 16. Thirty-two variables were considered for inclusion in the study of the presence of respiratory failure. The complete list of variables initially analyzed is shown in Supplementary Table 3. Secondly, a multivariate analysis of the significant risk factors for respiratory failure identified in the univariate analysis, as well as other risk factors that we considered clinically relevant, was performed using logistic regression. Backward selection with a type I error rate of 0.05 was used to reach a final reduced model containing 5 predictor variables (age, SpO2, lymphocytes, LDH and CRP). Discrimination of the final model was quantified via the C-statistic (ROC area), predictive ability was assessed with the Brier score and Nagelkerke R2, and the Hosmer and Lemeshow test was used to assess goodness-of-fit. The logistic regression model was converted to a more user-friendly integer score predicting an individual’s probability of respiratory failure. With each quantitative factor grouped into categories, an individual’s score increases by an integer amount for each level above the lowest category. Each integer amount is a rounding of the exact coefficient obtained from the logistic regression model. The lower the value of the score, the lower the risk of respiratory failure, and vice versa. This risk score, built on categories of increasing probability of respiratory failure, followed the risk score methodology implemented in the Framingham study [17].
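
The conversion of a regression coefficient into integer points can be illustrated with a short sketch in the spirit of the Framingham points-system approach. Everything numeric here is hypothetical: the coefficient, the category midpoints, and the choice of scaling constant are assumptions for illustration and are not the values of Table 2 or Table 3.

```python
def points_for_category(beta, category_midpoint, reference_midpoint, unit):
    """Integer points for one category of one predictor.

    beta: logistic coefficient (log-odds per unit of the predictor).
    unit: log-odds that correspond to one point (the scaling constant).
    """
    return round(beta * (category_midpoint - reference_midpoint) / unit)


# Hypothetical inputs, NOT the study's coefficients or categories:
beta_age = 0.03                # assumed log-odds per year of age
age_midpoints = [45, 65, 85]   # assumed midpoints of three age categories
unit = beta_age * 10           # one point = effect of 10 years of age

age_points = [points_for_category(beta_age, m, age_midpoints[0], unit)
              for m in age_midpoints]
print(age_points)  # [0, 2, 4]
```

The lowest category acts as the reference and receives 0 points, so a patient’s total is the sum of the points earned across all predictors.
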
A calibration plot was used to validate predicted probabilities against binary events. Model development was performed using the rms package (Frank E Harrell Jr (2015). rms: Regression Modeling Strategies. R package version 4.3–1. http://CRAN.R-project.org/package=rms). All statistical tests were 2-tailed and the threshold of statistical significance was p < 0.05. Additionally, a LOESS smoothing function was used to plot the probability of respiratory failure. Statistical analysis was performed with computer software (IBM SPSS Statistics for Windows, Version 21.0. Armonk, NY: IBM Corp).
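
As a reminder of what the Brier score measures, the following minimal sketch computes it from scratch. The predicted probabilities and outcomes are invented for illustration, and this is not the rms implementation used by the authors.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome.

    0 is a perfect score; a constant prediction of 0.5 always scores 0.25.
    """
    if len(predicted_probs) != len(outcomes):
        raise ValueError("inputs must have the same length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Invented example: four patients, two of whom developed the event
probs = [0.9, 0.2, 0.6, 0.1]
events = [1, 0, 1, 0]
print(round(brier_score(probs, events), 3))  # 0.055
```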

Results

Baseline and clinical characteristics

During the recruitment period, 521 patients were included, of whom 181 (34.7%) developed respiratory failure after a median time from the onset of symptoms of 9 days [interquartile range (IQR): 6–11]. Supplementary Fig. 1 shows the number of patients hospitalized at 12 de Octubre University Hospital in medical wards and ICU during the study period.

Demographic and clinical characteristics of the patients are shown in Table 1. Mean age was 64.6 ± 18.2 years, with 317 patients (60.8%) over 60 years (77.9% in the respiratory failure group vs 51.8% in the non-respiratory failure group, p < 0.0001). The Charlson Comorbidity Index was ≥ 1 in 50% of patients, and was higher in the respiratory failure group [1 (IQR 0–2) vs 0 (IQR 0–1), p < 0.0001]. The most frequent previous medical condition was hypertension (42%).

Table 1 Demographic, clinical, radiological, and laboratory findings at admission according to the development of respiratory failure

Patients’ characteristics at admission

The median time from the onset of symptoms to the first positive rRT-PCR was 5 days (IQR 3–7), exceeding 7 days in 116 patients (22.5%). Clinical, radiological and laboratory findings at admission are shown in Table 1. At admission, the initial chest X-ray showed abnormal findings in 416 cases (81.1%) (87.8% in the respiratory failure group vs 77.5% in the non-respiratory failure group, p = 0.004). The most common radiologic finding was bilateral ground-glass opacities (41.9%).

Hematologic and biochemical abnormalities

As shown in Table 1, significant differences were observed in most laboratory parameters between those who developed respiratory failure and those who did not. Partial pressure of arterial oxygen (PaO2) on room air was determined in 262 cases, whereas pulse oximetry saturation (SpO2) on room air was used in 512 patients. SpO2 (%) at admission was 93 ± 6, and SpO2 < 90% was present in 101 cases (19.7%). Median values of LDH, CRP, and lymphocyte count at admission were 328 U/l (IQR 265–413), 7.6 mg/dl (IQR 3.1–15), and 0.9 × 10^3 cells/µl (IQR 0.62–1.2), respectively.

Management and outcome

In the course of hospitalization, 347 patients (68%) received oxygen therapy for a median of 6 days (IQR 3–11). Four hundred seventy-six patients (91.5%) were treated with antibiotics, with azithromycin being used in 292 cases (56%). Likewise, lopinavir/ritonavir, hydroxychloroquine, interferon-β1b, corticosteroids, tocilizumab and remdesivir were prescribed in 60% (314), 75% (393), 24% (128), 25% (131), 8% (44), and 0.2% (1) of patients, respectively.

As shown in Supplementary Table 1, ICU admission occurred in 52 patients (10%), of whom 51 belonged to the respiratory failure group (28% vs 0.3%, p < 0.0001). Median length of stay in ICU was 12.5 days (IQR 6–18).

Median time from admission to respiratory failure was 3 days (IQR 1–6). In the respiratory failure group, 27% (49/181) of patients were treated with invasive mechanical ventilation for a median duration of 12 days (IQR 7–17), whereas 31 patients (17%) were managed with non-invasive mechanical ventilation. Extracorporeal membrane oxygenation and prone position were used in 1 and 43 patients, respectively.

Median time from admission to clinical improvement was 5 days (IQR 3–9), with a significant difference between groups [14.5 days (IQR 9–20) in the respiratory failure group vs. 5 days (IQR 3–7) in the non-respiratory failure group, p < 0.0001].

Overall mortality was 23.8% (124/521), with a significant difference between groups [65.7% (119/181) vs. 1.5% (5/340), p < 0.0001]. Median time from admission to discharge or death was 9 days (IQR 6–14) [11 days (IQR 7–19) vs. 8 days (IQR 5–12), p < 0.0001]. Among survivors, median hospital stay was 22 days (IQR 16.5–31) in the respiratory failure group vs. 8 days (IQR 5–12) in the non-respiratory failure group (p < 0.0001).

To analyze medical complications occurring in the course of hospitalization, patients were followed up for 1 month or until death (see Supplementary Table 1).

Risk estimation of respiratory failure

As Table 2 shows, a reduced model identifying respiratory failure was generated. Five variables remained independently associated with the primary endpoint: age (OR 1.026; 95% CI 1.019–1.042, p = 0.0004), SpO2(%) (OR 0.853; 95% CI 0.804–0.906, p < 0.0001), lymphocyte count (OR 0.414; 95% CI 0.232–0.737, p = 0.0029), LDH (OR 1.004; 95% CI 1.002–1.006, p = 0.0001) and CRP (OR 1.048; 95% CI 1.018–1.078, p = 0.0013). LOESS smoothing curves plotting the probability of respiratory failure against the variables included in the score are depicted in Fig. 1.
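
Because these odds ratios are expressed per unit of each predictor, they look small for variables measured on a fine scale. The sketch below shows how a per-unit odds ratio translates into a logistic coefficient and into the odds ratio for a larger change. It assumes the usual linearity on the log-odds scale and assumes that each OR refers to one unit as reported at admission (one year, one percentage point of SpO2, and so on); the helper names are illustrative.

```python
import math

# Per-unit odds ratios reported in the text for the final model
odds_ratios = {
    "age": 1.026,
    "spo2": 0.853,
    "lymphocytes": 0.414,
    "ldh": 1.004,
    "crp": 1.048,
}


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient implied by a per-unit odds ratio."""
    return math.log(odds_ratio)


def odds_ratio_for_change(odds_ratio, delta):
    """Odds ratio for a change of `delta` units (log-odds linearity assumed)."""
    return math.exp(delta * log_odds_coefficient(odds_ratio))


# Ten years older, and an SpO2 five percentage points lower
print(round(odds_ratio_for_change(odds_ratios["age"], 10), 2))
print(round(odds_ratio_for_change(odds_ratios["spo2"], -5), 2))
```

Under these assumptions, ten additional years of age multiply the odds of respiratory failure by roughly 1.3, and a five-point drop in SpO2 roughly doubles them.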

Table 2 Regression coefficients of the logistic regression model
Fig. 1

LOESS smoothing curves plotting the probability of respiratory failure against the variables included in the score

A score predicting the occurrence of respiratory failure was derived from the logistic regression model (Table 3). As an example, a patient aged 60 years, with 800 lymphocytes/µl, an SpO2 of 93%, an LDH of 315 U/l and a CRP of 5 mg/dl, receives a score of 11. Therefore, using this model, this patient would have an estimated probability of respiratory failure of 58.4%.

Table 3 Proposed score for predicting respiratory failure

This reduced model provided good discriminative ability (bootstrap-corrected c index 0.85), as Fig. 2 and Supplementary Table 2 show, and good goodness-of-fit (Hosmer and Lemeshow p = 0.49). According to Youden’s index J, the optimal cut-off for the score was 9 points (sensitivity of 82.66% and specificity of 71.96%). The corresponding sensitivity, specificity, and positive and negative likelihood ratios for different cut-off points are detailed in Supplementary Table 4.
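
Youden’s index and the likelihood ratios follow directly from sensitivity and specificity, as the sketch below shows. The sensitivity and specificity at 9 points are taken from the text; the operating points for 8 and 10 points are hypothetical placeholders, and the likelihood ratios are recomputed here rather than copied from Supplementary Table 4.

```python
def youden_j(sensitivity, specificity):
    """Youden's index: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for one cut-off."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def best_cutoff(operating_points):
    """Cut-off with the largest J; maps cut-off -> (sensitivity, specificity)."""
    return max(operating_points, key=lambda c: youden_j(*operating_points[c]))


operating_points = {
    8: (0.90, 0.55),       # hypothetical, for illustration only
    9: (0.8266, 0.7196),   # reported in the text
    10: (0.70, 0.80),      # hypothetical, for illustration only
}
print(best_cutoff(operating_points))
print(round(youden_j(0.8266, 0.7196), 4))
print(tuple(round(v, 2) for v in likelihood_ratios(0.8266, 0.7196)))
```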

Fig. 2

Receiver operating characteristic (ROC) curve of the score in discriminating the presence of respiratory failure

The calibration plot for the score is shown in Supplementary Fig. 2. The c-statistic (area under the ROC curve) for internal validation was 0.84.

Discussion

During the recruitment period of the study, our hospital experienced exponential growth in the number of COVID-19 patients in medical wards and ICU over a short period of time, as shown in Supplementary Fig. 1. This circumstance forced an adaptive restructuring aimed at rapidly increasing medical, respiratory intermediate care, and ICU beds. In a situation of overload such as the one we suffered during the first wave of the pandemic, the early identification of patients at high risk of respiratory failure seems mandatory to ensure appropriate infrastructure for respiratory support. In this context, a score able to facilitate the early triage of these patients is essential. Furthermore, this score should be simple and easily implemented, even in resource-limited settings.

In this regard, we present a simple and quick 5-item score which can be easily calculated at patients’ bedside at admission. A large proportion of subjects at a high risk of respiratory failure would be identified by this tool with a high discriminative ability (C-statistic = 0.85). Having this information at an early stage would allow a sound planning of hospital beds and use of respiratory resources. Of note, this score would also select patients who would benefit from the early use of anti-inflammatory drugs to manage the characteristic dysfunctional immune response in patients with severe SARS-CoV-2 infection.

As shown in Supplementary Table 2, substituting the Charlson Comorbidity Index for age could slightly increase the accuracy of the score, but the use of age is much simpler and more practical, and the discrimination and predictive ability of the final model were similar (pairwise comparison of ROC curves, p = 0.3). The score also includes SpO2%, which is easier to determine than PaO2 and was therefore collected in the majority of cases. It should also be noted that arterial blood gas analysis may be inaccessible in resource-constrained settings, pulse oximetry being a reliable alternative for obtaining a validated estimation of PaO2/FiO2, as proposed by the Kigali modification of the Berlin criteria [18]. We also believe that SpO2% is a more reliable parameter than dyspnea, since this symptom may not be reported by patients even in the presence of severe hypoxemia, probably because of the persistence of spared, normally compliant lung tissue surrounding affected areas with extreme intrapulmonary shunt [19].

The other three variables included in the score are laboratory values. Lymphopenia has been described as a prognostic marker in SARS-CoV-2 infection [20]. Others, such as interleukin-6, D-dimers, procalcitonin, and ferritin [10, 21, 22], have also been associated with a poor outcome, but their availability in many Emergency Departments is limited, which needs to be considered in a resource-shortage scenario. In our score, the laboratory values related to respiratory failure were LDH and CRP, which performed soundly in predicting respiratory failure (Fig. 3). As shown in Supplementary Table 2, ferritin did not improve the model accuracy.

Fig. 3

Predicted probability of respiratory failure

Different outcomes have been proposed in COVID-19 observational studies, mortality being the most frequently reported [7, 9, 10, 21]. Only one study reported the risk factors associated with acute respiratory distress syndrome, although it did not propose a score [10].

A recently published Chinese report [23] proposed a 10-item score for predicting the need of invasive ventilation, as part of a composite endpoint (which also included death and ICU admission). While of interest, the population analyzed was significantly different from that of our cohort regarding age and baseline features, COVID-19 respiratory involvement, and mortality. Our study may be more representative of the situation experienced in most Western countries. In this regard, our endpoint focuses on the most frequent and serious complication associated with COVID-19. Development of respiratory failure frequently leads to the need for invasive ventilation, ICU admission, and death (Liang's study endpoints), but it also adds the use of non-invasive mechanical ventilation outside ICU, which needs trained staff and sophisticated facilities, too. In addition, it also includes a significant number of patients with respiratory failure who would finally not be suitable for ventilation, but would still benefit from optimized medical therapy and medical resources for a long time. We believe that our primary endpoint offers a more realistic picture of the situation experienced in overloaded hospitals.

We want to point out that, although the majority of chest X-rays were abnormal, the score was also verified in the subgroup with a normal X-ray, with an AUC of 0.81 (p < 0.0001). These data might be of interest, since they could facilitate the use of the score even in a less severe population, although this recommendation should be taken with caution.

An important strength of our study was that possible biases have been minimized by including all consecutive adult hospitalized patients with confirmed COVID-19 and presenting a very low percentage of missing data in the main variables of the study, especially in vital signs and analytical values, as they were gathered automatically by electronic records.

Conversely, this single-center study has the inherent limitation of potential selection bias, depending on the main demographic features of the population attended. Additional prospective multicenter validation studies of the proposed score for predicting the occurrence of respiratory failure should be completed before clinical use. The only laboratory value with a high percentage of missing data was D-dimer, so we could not determine whether it is an independent risk factor. This parameter has been associated with worse prognosis in some studies [10] but not in all of them [9, 21], and its high cost and, in our case, the shortage of tests at some moments of the wave limited its use.

Summing up, we believe that the proposed score may have significant clinical implications. In comparison with other scores proposed for predicting an unfavorable outcome in COVID-19 patients, it has the advantage of simplicity and can be calculated at the patient’s bedside on arrival at the Emergency Department. It would be a useful tool to optimize the scarce resources available in a pandemic situation by identifying, at admission to the Emergency Department, patients at risk of developing respiratory failure. As some authors have pointed out, adoption of straightforward triage algorithms might be useful to optimize the management of hypoxic patients with severe disease [24]. Additionally, our endpoint includes all patients with respiratory failure (and not only those receiving invasive ventilation). This gives us an accurate idea of a subgroup of patients with more severe and life-threatening disease in whom early immunomodulatory drugs may be considered.

Conclusions

We propose a simple score for the early prediction of respiratory failure in COVID-19, intended to optimize antiviral and immunomodulatory therapy and to adapt health-care resources, including respiratory support, in a pandemic situation.