Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)–associated coronavirus disease 2019 (COVID-19) has rapidly spread around the world. The clinical spectrum of SARS-CoV-2 infection appears to be wide, encompassing asymptomatic infection, mild upper respiratory tract illness, and severe pneumonia with respiratory failure and death.1,2 COVID-19 was the third leading cause of death in the USA in 2020 in persons aged 45 through 84 years.3 Most studies have identified advanced age, male gender, presence of comorbidity,4,5,6 and a number of laboratory parameters, including D-dimers, lymphocytes, C-reactive protein (CRP), and lactate dehydrogenase (LDH), as independent risk factors for mortality and need for mechanical ventilation1,7,8,9,10,11,12,13,14 in hospitalized COVID-19 patients. Respiratory failure and death have been found to be associated with age and age-associated risk factors, but they also occur in younger populations; therefore, it is necessary to investigate whether the risk factors change with age.15,16,17,18

The current life expectancy in Spain is 83.2 years19 and similar figures can be found in many other Western countries and Japan. The median age of patients included in the SEMI-COVID Registry, for example, is 69.1 years,20 and similar values could be observed in other large series.1,21,22,23 Meanwhile, younger patients with COVID-19 requiring hospitalization are scarce in large series, and their clinical characteristics, risk factors, and outcomes may have been obscured or missed when analyzing these cohorts.24 At the same time, other risk factors often associated with a poor outcome in COVID-19 patients, such as dementia or cardiovascular disease, are highly associated with aging, but do not necessarily play a significant role in the severity of SARS-CoV-2 infection in younger patients.1,2,21,22,23,25,26 For this reason, we selected a younger population (18 to 50 years) with COVID-19 who required hospital admission27,28 in order to analyze their clinical characteristics and risk factors for the development of respiratory failure and outcome.


Study Population and Design

This is a multicenter retrospective cohort study of patients aged 18 to 50 years with confirmed SARS-CoV-2 infection who were hospitalized in Spanish hospitals between March 1, 2020, and July 2, 2020. All patient data were obtained from the SEMI-COVID-19 Registry, with 150 participating hospitals, which is coordinated by the Spanish Society of Internal Medicine. The SEMI-COVID-19 Registry is an ongoing, multicenter, observational, nationwide cohort of consecutive patients aged 18 years and older requiring hospitalization in Spanish hospitals for microbiologically confirmed COVID-19. Its main objective is to obtain detailed information on the epidemiology, clinical progress, and treatment received by patients with COVID-19 in real-world clinical practice at admission and during hospitalization.

The STROBE Statement guidelines were followed in the conduct and reporting of the study.

Ethical Approval

The study was defined as observational by the Spanish Agency of Medicines and Medical Products (AEMPS, its Spanish acronym), in accordance with the applicable regulations. The SEMI-COVID-19 Registry was initially approved by the Provincial Research Ethics Committee of Málaga (Spain). The study was carried out in accordance with the Declaration of Helsinki and approved by the Institutional Research Ethics Committees of each participating hospital. Written informed consent was obtained from all patients before enrollment. When there were biosafety concerns and/or the patient had already been discharged, verbal informed consent was requested and noted on the medical record.

Data Collection

All data were extracted from electronic medical records using a standardized electronic data capture system. A database manager verified data consistency. The database platform is hosted on a secure server. Patient identification is encrypted and anonymized. Information includes sociodemographic variables and previous medical history, first available vital signs, respiratory status, laboratory and radiology data at admission and complications during hospital admission, and clinical situation on day +30 of the evolution of the illness. Patients were managed according to local criteria, although national recommendations provided by the Spanish Ministry of Health guided clinical management, treatment, and discharge criteria.29 Laboratory confirmation of SARS-CoV-2 was performed by real-time reverse transcription polymerase chain reaction (rRT-PCR) on nasal/nasopharyngeal exudate.

Endpoint Definitions

The primary endpoint was the presence of respiratory failure during hospital admission. Respiratory failure was defined as the ratio of partial pressure of arterial oxygen to fraction of inspired oxygen (PaO2/FiO2 ratio) ≤200 mmHg30 or the need for mechanical ventilation (either noninvasive positive pressure ventilation—continuous (CPAP) or bilevel positive pressure ventilation—or invasive mechanical ventilation) and/or high-flow nasal cannula or the presence of acute respiratory distress syndrome (ARDS). The use of baseline CPAP is not a criterion for ventilatory failure unless there is respiratory impairment according to the definition used. If PaO2 was not available, an estimated PaO2/FiO2 (ePaO2/FiO2) ratio was calculated, using pulse oximetry saturation/fraction of inspired oxygen (SpO2/FiO2 ratio) and applying the formula SpO2/FiO2 = 64 + 0.84 × PaO2/FiO2.31 In-hospital mortality during the 30-day follow-up was recorded.
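The estimation step described above can be sketched as follows. This is only a minimal illustration, assuming that the stated formula is inverted to solve for PaO2/FiO2; the function names and input checks are hypothetical and are not part of the registry's software. The sketch covers only the oxygenation criterion (ratio ≤200 mmHg), not the ventilatory support or ARDS criteria of the endpoint.

```python
def estimate_pf_ratio(spo2_percent: float, fio2_fraction: float) -> float:
    """Estimate PaO2/FiO2 by inverting SpO2/FiO2 = 64 + 0.84 * PaO2/FiO2."""
    # Input ranges are an assumption made for this sketch.
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    if not 0 < spo2_percent <= 100:
        raise ValueError("SpO2 must be a percentage between 0 and 100")
    sf_ratio = spo2_percent / fio2_fraction
    return (sf_ratio - 64.0) / 0.84


def meets_oxygenation_criterion(pf_ratio: float, threshold: float = 200.0) -> bool:
    """True when the (estimated) PaO2/FiO2 ratio is at or below the threshold."""
    return pf_ratio <= threshold


# Hypothetical patient: SpO2 of 90% while breathing FiO2 0.5.
epf = estimate_pf_ratio(90, 0.5)
print(round(epf, 1), meets_oxygenation_criterion(epf))
```

For the hypothetical values used here, SpO2/FiO2 is 180, so the estimated ratio is (180 − 64)/0.84, which falls below the 200 mmHg threshold.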

Statistical Methods

Quantitative variables were described using medians and interquartile ranges (IQR) or means ± standard deviation (SD), and compared by Student’s t-test for independent samples or the Mann-Whitney U test, as appropriate, taking into account whether the variance was equal or unequal. The Kolmogorov-Smirnov test was used to test all parameters for normality of distribution. Categorical variables were expressed as absolute and relative frequencies and compared by χ2 test or Fisher’s exact test. The LOESS smoothing function was applied to plot the probability of respiratory failure of the quantitative variables (Supplementary Figure 2).

Univariate analysis was performed to establish the relationship of variables with the development of respiratory failure. Associations were expressed as odds ratios (ORs) with 95% confidence intervals (95% CI). Adjusted p values for multiple comparisons were obtained using the Benjamini and Hochberg method.32 The complete list of variables analyzed for respiratory failure is shown in Supplementary Table 1. A multivariate logistic regression model was then developed to identify independent predictive variables for the risk of respiratory failure. In order to explore the importance of certain clinical and analytical data, two multivariate analyses were performed: one with variables available in at least 90% of the population, and another with all recorded variables. In the second one, we decided to include a greater number of prognostic variables (such as ferritin or albumin), in accordance with previously published data, although these were collected from a smaller number of patients.
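The Benjamini and Hochberg adjustment mentioned above can be illustrated with a short sketch. It is a generic implementation of the standard step-up procedure, not the code used in the study (which relied on statistical software), and the p values in the example are invented for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down to the smallest so that
    # adjusted values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values from four univariate comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 3) for p in benjamini_hochberg(raw)])
```

Each raw p value is multiplied by the number of tests divided by its rank, and the result is capped so that the adjusted values remain monotone.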

Discrimination of the final models was measured using the C-Index (ROC area), and predictive ability was determined with the Brier score and Nagelkerke’s R2. All statistical tests were 2-tailed and the threshold of statistical significance was p<0.05. Statistical analysis was performed with computer software (IBM SPSS Statistics for Windows, Version 21.0. Armonk, NY: IBM Corp). Cutoff values for laboratory parameters were determined by MedCalc Version 19.2.0.
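For readers unfamiliar with these measures, the sketch below shows how the C-index and the Brier score are computed for a binary outcome. It is a plain-Python illustration with invented outcomes and predicted probabilities; it does not reproduce the SPSS analysis, and the bootstrap correction reported in the Results is not shown.

```python
def c_index(y_true, y_prob):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("Both outcome classes are required")
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes (1 = respiratory failure) and model probabilities.
outcomes = [1, 0, 1, 0, 0]
probabilities = [0.8, 0.3, 0.4, 0.5, 0.1]
print(round(c_index(outcomes, probabilities), 3))
print(round(brier_score(outcomes, probabilities), 3))
```

A C-index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, whereas lower Brier scores indicate predicted probabilities closer to the observed outcomes.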


Baseline Demographic and Clinical Characteristics of Patients at Admission

During the recruitment period, 15,034 patients were included in the SEMI-COVID Registry, of whom 2327 were younger than 50 years old (Supplementary Figure 1). The median age was 42.2 years (IQR: 36–46.7) and 59.5% of patients were male. A total of 9.4% were healthcare workers. Most patients were Caucasian (67.9%), followed by those of Hispanic ethnicity (27.8%). The most frequent previous medical conditions were obesity (defined as a body mass index (BMI) ≥30 kg/m2) (21.6%), hypertension (13%), asthma (9.7%), and diabetes (5.5%). The main demographic and clinical characteristics of the patients are shown in Table 1. The median time from onset of symptoms to the first positive rRT-PCR was 7 days (IQR: 4 to 9) and was less than 5 days in only 515 patients (23.9%). At admission, the initial chest X-ray showed abnormal findings in 2066 cases (90.1%). Significant differences in most laboratory parameters were observed at admission between those who developed respiratory failure and those who did not, as shown in Table 2. Extended data of Tables 1 and 2 are available in the Supplementary File.

Table 1 Demographic and Clinical Characteristics of the Study Population (n=2327)
Table 2 Laboratory Findings at Admission According to the Development of Respiratory Failure

Management and Outcome

In the course of hospitalization, antibiotics were given to 1976 patients (84.9%), with azithromycin being used in 1381 cases (59.3%). Hydroxychloroquine was prescribed in 2058 (88.8%), lopinavir/ritonavir in 1605 (69%), corticosteroids in 507 (21.8%), interferon-β1b in 240 (10.3%), tocilizumab in 215 (9.2%), and remdesivir in 21 (0.9%) patients.

Respiratory failure was the most common complication (14.7%). ICU admission occurred in 180 patients (8.2%), and 172 of these belonged to the respiratory failure group (50.3% vs 0.4%, p<0.0001). As Table 3 shows, the frequency of complications and the length of hospital stay were significantly greater in the respiratory failure group than in patients who did not develop respiratory failure. Overall mortality was 2.3% (50/2327), with a significant difference between groups according to development of respiratory failure (12.5% (43/343) vs 0.4% (7/1984), p<0.0001). Thirty-seven patients (1.7%) suffered a venous thrombosis, with a significant difference between those who suffered respiratory failure and those who did not (6.7% (23/343) vs 0.8% (14/1984), p<0.001). Major adverse cardiovascular events (MACE) occurred in 2.7% (58/2327) and were more frequent in respiratory failure patients (13% (44/343) vs 0.8% (14/1984), p<0.00). Median hospital stay was 7 days (IQR: 5–10) overall and was longer in the respiratory failure group (15 (9–24) vs 6 (4–9) days, p<0.001).

Table 3 Complications and Relevant Outcomes

Analysis and Risk Estimation of Respiratory Failure

In the respiratory failure group, 37.9% (130/343) of patients were treated with invasive mechanical ventilation for a median duration of 10 days (IQR: 7–15), whereas 62 (18.1%) and 138 (40.2%) patients were managed with noninvasive mechanical ventilation and high-flow nasal cannula, respectively. The prone position was used in 40.2% (138/343) of patients. Patients with ARDS at different levels of severity received some of the therapies mentioned above. Of the total of 2327 patients, 259 (11.1%) had moderate (102 (4.4%)) or severe (157 (6.7%)) ARDS at admission.

Table 4 and Figure 1 summarize the univariate and multivariate analyses of parameters predicting respiratory failure. An extended univariate analysis is provided in Supplementary Table 4. LOESS smoothing curve plots of the probability of respiratory failure against quantitative variables included in the univariate analysis are shown in Supplementary Figure 2.

Table 4 Univariate Analysis of Respiratory Failure
Fig. 1 Risk factors for respiratory failure with multivariate analysis. A Risk factors for respiratory failure with multivariate analysis (model 1). B Risk factors for respiratory failure with multivariate analysis (model 2).

Finally, a reduced model identifying respiratory failure was generated. Ten variables remained independently associated with the primary endpoint: obesity, alcohol abuse, sleep apnea syndrome, Charlson index ≥1, fever (temperature ≥38°C), lymphocytes ≤1100 cells/μL, LDH >320 U/L, AST >35 mg/dL, sodium <135 mmol/L, and C-reactive protein (CRP) >8 mg/dL. Their corresponding odds ratios (OR) and 95% confidence intervals are shown in Figure 1A. This reduced model provided good discriminatory ability (bootstrap-corrected C-index 0.779 (95% CI: 0.757–0.800)) and goodness-of-fit (Hosmer and Lemeshow p = 0.36).

In a second analysis, we built another predictive model that prioritized the inclusion of a greater number of analytical variables in order to explore the influence of ferritin and albumin, although this approach meant a significant loss of patients. In this second model, seven variables remained independently associated with the presence of respiratory failure (obesity, leukocytes >6×10³ cells/μL, lymphocytes ≤1100 cells/μL, ferritin >1200 ng/mL, albumin ≤3.7 mg/dL, sodium <135 mmol/L, and CRP >8 mg/dL). Some of these differed from the original more inclusive model, as Figure 1B shows. This second multivariate model provided better discriminatory ability (bootstrap-corrected C-index 0.84 (95% CI: 0.809–0.874)) and goodness-of-fit (Hosmer and Lemeshow p = 0.81).


The primary goal of this study was to analyze the clinical characteristics of young patients (18–50 years) hospitalized with COVID-19 and the risk factors for the development of respiratory failure. To do this, we analyzed the SEMI-COVID-19 Registry, in which 150 Spanish hospitals participated. The first clinically relevant finding is that 15% of COVID-19 patients admitted to hospital were under 50 years of age. The strength of our study is that it is a real-life multicenter study with a large number of patients, a total of 2327 hospitalized young patients, which is to the best of our knowledge the largest report on this subgroup of COVID-19 patients.

Second, it provides insights into the evolution and prognosis of COVID-19 in patients who are younger than those generally enrolled in clinical studies. As a result, we have been able to analyze specific risk factors in this subgroup of patients without the interference of frailty or aging-associated comorbidities and to show the importance of obesity and a few laboratory parameters for making an early evaluation of the possible development of respiratory failure, which is the main cause of SARS-CoV-2-related death.

The third major conclusion is that this younger population is not exempt from serious complications, since respiratory failure, ICU admission, and in-hospital mortality were found in 14.7%, 8.2%, and 2.3% of patients, respectively. This mortality rate is similar to that described by other authors.33

The fourth major finding is that young patients with respiratory failure present more medical complications, as other studies have highlighted.16,18,24,34 There was a notable presence of acute kidney injury, venous thrombosis, and, especially, the development of MACE events. Hence, early identification of patients at potential risk of complications is essential to ensure adequate management.

The fifth conclusion is that comorbidity, obesity, alcohol abuse, and sleep apnea syndrome, as well as the presence of fever, were independent predictors of respiratory failure, which is consistent with previous publications,4,5,10,18 and lymphocytes, LDH, AST, sodium, and CRP were predictors of poor prognosis, as other studies have also pointed out.5,7,10,16,18

We decided to use respiratory failure as the main endpoint because, in the vast majority of COVID-19 patients, mortality is related to the presence of respiratory failure.35 We defined it in broader terms than invasive mechanical ventilation or ICU admission in order to obtain a more realistic view of the demands of the first wave of the pandemic in terms of healthcare pressure. This approach continues to be valid, given that the pandemic is currently global. If we focus on patients who developed respiratory failure, we observe that half the cases required ICU admission, while the other 50% were managed with non-invasive ventilatory support. Likewise, we observed that respiratory failure was more frequent in men, with a slight predominance of Hispanic ethnicity and subjects with previous comorbidities, notably sleep apnea syndrome and previous alcohol consumption. However, the most notable differences were observed in the predictive parameters, which already showed important predictive value for the development of respiratory failure at the time of hospital admission (Fig. 1A and B). Analytical variables were treated as continuous variables and categorized using ROC curves (Supplementary Table 2) to facilitate their management in the risk factor analysis. Although most of the analytical variables were collected in more than 90% of the subjects, a few other relevant variables were available in a smaller number of patients, which made them difficult to use in potential predictive models. This was the case for TnT, CPK, ferritin, albumin, prothrombin ratio, and interleukin-6 (IL-6). To deal with this, we created the first multivariate model (Fig. 1A), which included the largest possible number of patients (n = 1457). In order to analyze other widely reported prognostic variables such as ferritin and albumin,1,9,36 we constructed a second model (Fig. 1B), which included a larger number of analytical variables, although at the expense of reducing the number of patients (the final model included 503 patients). Following this approach, the only clinical variable that remained in the model was obesity. Of the analytical variables in model 1, lymphocytes, sodium, and CRP were retained, and leukocytes, ferritin, and albumin were added as prognostic markers. This second model relies on accessible laboratory tests and offers high prognostic value (AUC of 0.85) for predicting the development of respiratory failure. In both models (1 and 2 in Fig. 1), lymphocytes ≤1100 cells/μL, obesity, sodium <135 mmol/L, and C-reactive protein >8 mg/dL remain independent risk factors. In the first model (which includes a larger number of cases), alcohol abuse, sleep apnea syndrome, Charlson index ≥1, LDH >320 U/L, fever (≥38°C), and AST >35 mg/dL were additional risk factors. In the second model, which includes a smaller number of patients but more analytical variables, we would also add albumin ≤3.7 mg/dL, ferritin >1200 ng/mL, and leukocytes >6×10³ cells/μL.

These findings support an assessment at admission that includes the variables identified here, which allow populations at risk of developing respiratory failure to be recognized early and the necessary care resources to be optimized.

Unfortunately, the main limitation of our study is that it is a real-life registry so that, despite the inclusion of a large number of patients, some of the potential prognostic values indicated in the univariate study of biochemistry parameters, such as IL-6, lactate, and troponin, could not be confirmed in the multivariate study. These biomarkers were obtained in 338, 926, and 294 cases, respectively.

The results of this study suggest that the risk factors for poor outcome in younger patients may be different from those in older ones. To confirm this observation, a study comparing the two groups directly should be performed.37

Our work has other limitations. First, outcomes in March–July 2020 differed from current ones, since anticoagulation (prophylactic or therapeutic when appropriate) and corticosteroids are now used more consistently; thus, it may be hard to extrapolate our findings or to predict how similar patients would fare now.

Second, the laboratory variables are dichotomized. The relationships between laboratory variables and outcomes are unlikely to be dichotomous or even linear; thus, these relationships are overly simplified.

Third, this study was carried out in a single country in southern Europe; it would be valuable to know whether the results can be extrapolated to other regions of the world with socio-demographic characteristics different from ours.

In conclusion, a significant percentage of younger patients (18–50 years old) with COVID-19 requiring hospital admission presented respiratory failure, particularly those who were obese or had sleep apnea syndrome. This serious complication can be identified at admission by the usual laboratory tests.