Introduction

Coronavirus Disease 2019 (COVID-19), a newly recognized illness caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and formerly named 2019-nCoV-infected pneumonia (NCIP), broke out in Wuhan (Hubei, China) in December 2019 and rapidly spread throughout China and other regions of the world [1,2,3,4,5]. COVID-19 has caused more than 200,000 deaths around the world [5]. Although most cases of COVID-19 were mild or moderate, severe or critical cases progressed rapidly to severe pneumonia, acute respiratory distress syndrome (ARDS), coagulopathy, and septic shock [2]. Therefore, early identification of severe or critical patients is crucial to optimize patient stratification management and to potentially reduce fatality. A reliable tool for predicting mortality risk at an early stage among patients with COVID-19 would be highly valuable.

In this study, we investigated the demographics, clinical features and outcomes of patients with COVID-19, and developed a nomogram based on multiple risk factors to predict the death probability of these patients. We then derived an integrated score as a quantitative tool for the early stratification of COVID-19 patients and for guiding clinical management.

Methods

Study design

In this retrospective study, data were collected between January 1, 2020 and February 15, 2020 from two clinical centers for COVID-19 (Main District and Tumor Center) of the Union Hospital of Huazhong University of Science and Technology (Wuhan, China). A total of 301 adult patients (≥ 18 years old) with laboratory-confirmed COVID-19 were enrolled; 11 patients with COVID-19 in the Main District who were transferred to other designated hospitals and 3 patients in the Tumor Center without confirmed SARS-CoV-2 infection were excluded (Fig. 1). The derivation cohort consisted of 176 patients from the Main District of Union Hospital (located at 1277 JieFang Avenue, Wuhan, China). The validation cohort consisted of 125 patients admitted to the Tumor Center of Union Hospital (located at 109 MaChang Road, Wuhan, China). Definite outcomes (dead or discharged) of the 301 cases were followed up until March 15, 2020. All patients in this study were diagnosed according to the Guidelines of the Diagnosis and Treatment of Novel Coronavirus Pneumonia released by the China NHC [6]. This study was approved by the Ethics Committee of Tongji Medical College of Huazhong University of Science and Technology in Wuhan (2020-0058). Written informed consent was waived because of the public health emergency.

Fig. 1 Flow chart of study participants in the derivation and validation cohorts

Respiratory specimens (throat swabs) were collected from patients with suspected SARS-CoV-2 infection for laboratory confirmation, which was performed at the local Center for Disease Control and Prevention (CDC) in Wuhan or at Union Hospital (Wuhan, China). Methods for laboratory confirmation of COVID-19 have been described previously [1].

Data collection

All patients underwent a chest computed tomography (CT) scan and laboratory tests on admission, including leukocytes, neutrophils, lymphocytes, alanine aminotransferase (ALT), aspartate aminotransferase (AST), lactic dehydrogenase (LDH), creatinine, D-dimer, C-reactive protein (CRP), and procalcitonin (PCT). Data on epidemiological and demographic characteristics, clinical features, chronic medical histories, laboratory findings, radiological features, and clinical outcomes were obtained from the patients’ electronic medical and nursing records and evaluated independently by two trained physicians (ZHW and QZ). The primary outcome of this study was all-cause death.

Statistical analysis

Discrete variables were presented as frequencies and proportions, and the Fisher exact probability test was applied to compare the two groups (survivors and non-survivors). Continuous measurements were described using medians and interquartile ranges (IQRs), and Mann–Whitney U tests were adopted for comparison. Random forest imputation [7] was applied to variables with a missing proportion of less than 10%. Variables with a missing proportion larger than 10% were excluded from model development. A total of 24 candidate variables were included in model development according to clinicians’ recommendations and the existing literature.
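The screening step that precedes imputation can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function name `split_by_missingness`, the toy columns, and the choice to exclude a variable with exactly 10% missing values are all assumptions, and the random forest imputation itself is not reproduced here.

```python
def split_by_missingness(columns, max_missing=0.10):
    """Partition variables into those kept for imputation and those excluded.

    columns: dict mapping a variable name to a list of values, where a
        missing measurement is represented by None.
    """
    keep, drop = [], []
    for name, values in columns.items():
        missing_fraction = sum(v is None for v in values) / len(values)
        # Assumption: a variable at exactly the threshold is excluded.
        if missing_fraction < max_missing:
            keep.append(name)
        else:
            drop.append(name)
    return keep, drop


# Hypothetical data for 20 patients (values are illustrative only).
columns = {
    "age": [70, 43, 55, 61, 38, 49, 66, 72, 30, 58,
            47, 52, 64, 41, 36, 59, 68, 44, 50, 63],
    "crp": [13.7] * 19 + [None],       # 5% missing: kept, to be imputed
    "pct": [0.05] * 16 + [None] * 4,   # 20% missing: excluded
}
keep, drop = split_by_missingness(columns)
print("impute:", keep, "exclude:", drop)
```

Only the variables in `keep` would then be passed to an imputation routine.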

The least absolute shrinkage and selection operator (LASSO) is one of the most widely used automatic variable selection techniques. In this study, LASSO was adopted to reduce data dimensionality and to enhance the model’s interpretability [8]. Five-fold cross-validation was applied to tune the LASSO penalty parameter (lambda). To build the most parsimonious model, we chose the lambda within one standard error (SE) of the minimum cross-validation error. The predictors selected by LASSO were incorporated into a logistic regression model, which is suited to a binary outcome, to build the nomogram [9]. Because a standard logistic model may fit poorly when the data are over-dispersed, we calculated the dispersion parameter to check the model specification. Multicollinearity makes a model difficult to interpret, and the variance inflation factor (VIF) measures the degree of collinearity; we therefore calculated the VIF of each predictor to check for multicollinearity. For internal validation, we applied the Hosmer–Lemeshow test to assess goodness of fit and drew a calibration plot based on 1000 bootstrap resamples [9]. Internal discrimination was assessed by the median area under the receiver operating characteristic (ROC) curve (AUC) of the logistic model across 1000 bootstrap resamples. Similarly, to guard against overfitting, we redrew the calibration plot and estimated the AUC in the external validation cohort. Decision curve analysis takes both discrimination and calibration into consideration, and we used it [10] to compare the performance of the single-predictor models with that of the full model. We further classified the patients into three subgroups based on the total points derived from the predictive nomogram. Fisher’s exact test was applied to compare the actual fatality rates among the three subgroups.
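The "within 1 SE" rule for choosing the LASSO penalty can be illustrated with a short sketch. It assumes that the cross-validation error of each candidate lambda is already available per fold; the function name `select_lambda_1se` and all numbers are hypothetical, and the code is an illustration of the rule rather than the glmnet implementation used in the study.

```python
from math import sqrt
from statistics import mean, stdev


def select_lambda_1se(lambdas, fold_errors):
    """Return (lambda_min, lambda_1se) from per-fold cross-validation errors.

    lambdas: candidate penalty values.
    fold_errors: fold_errors[i][k] is the validation error (for example the
        deviance) obtained with lambdas[i] on fold k.
    """
    means = [mean(errs) for errs in fold_errors]
    # Standard error of the mean error across folds, per candidate.
    ses = [stdev(errs) / sqrt(len(errs)) for errs in fold_errors]
    i_min = min(range(len(lambdas)), key=lambda i: means[i])
    threshold = means[i_min] + ses[i_min]
    # Largest penalty whose mean error stays within one SE of the minimum,
    # which yields the sparsest model among the acceptable candidates.
    within = [lam for lam, m in zip(lambdas, means) if m <= threshold]
    return lambdas[i_min], max(within)


# Hypothetical five-fold errors for four penalties (illustrative numbers).
lambdas = [0.001, 0.01, 0.05, 0.2]
fold_errors = [
    [0.520, 0.550, 0.500, 0.540, 0.530],
    [0.480, 0.510, 0.470, 0.500, 0.490],
    [0.485, 0.515, 0.475, 0.505, 0.495],
    [0.600, 0.630, 0.590, 0.620, 0.610],
]
lam_min, lam_1se = select_lambda_1se(lambdas, fold_errors)
print("lambda_min:", lam_min, "lambda_1se:", lam_1se)
```

A larger penalty shrinks more coefficients to zero, so preferring the largest lambda within one SE trades a negligible loss in cross-validated fit for fewer predictors.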

All tests were two-tailed except the Hosmer–Lemeshow test, which was one-tailed; significance thresholds were set at 0.05 for two-tailed tests and 0.025 for the one-tailed test. In addition, we applied the Bonferroni correction to adjust significance thresholds for multiple comparisons. All statistical analyses were performed using R version 3.6.2 (https://www.r-project.org/). The following packages were used: “missForest”, “glmnet”, “rms”, “rmda”, “pROC” and “ggplot2”.

Results

Characteristics of COVID-19 patients

A total of 301 COVID-19 patients were enrolled from two clinical centers of Union Hospital (Wuhan, China) (Fig. 1). Among these 301 laboratory-confirmed cases, the case fatality rates (CFRs) of the derivation and validation cohorts were 11.9% (21/176) and 8.8% (11/125), respectively (p = 0.451). The sex distribution and the proportion of patients with diabetes were comparable between the two cohorts (p > 0.05, Table 1), whereas age, the proportions of patients with hypertension and coronary heart disease, days from illness onset to admission, and days from illness onset to discharge or death differed between the two cohorts (p < 0.05, Table 1).
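As an illustration of how two fatality rates of this kind can be compared, the sketch below implements a two-sided Fisher exact test for a 2 × 2 table in plain Python. The function name `fisher_exact_two_sided` is an assumption, and the code is not the routine used in the study (the analyses were run in R); only the death counts are taken from the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the first row contains x of the col1 events.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Deaths and survivors: 21/176 (derivation cohort), 11/125 (validation cohort).
p_value = fisher_exact_two_sided(21, 176 - 21, 11, 125 - 11)
print(f"two-sided p = {p_value:.3f}")
```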

Table 1 Characteristics between the derivation cohort and validation cohort

The characteristics of survivors and non-survivors in the derivation cohort are summarized in Table 2. The median age of the participants in the derivation cohort was 47.0 (33.0–62.0) years and more than half of them were female (58.5%). Non-survivors were older than survivors (70.0 vs 43.0 years, p < 0.001). The proportions of diabetes, hypertension and coronary heart disease were significantly different between the two groups (Table 2). Notably, no significant difference was found in the median time from illness onset to hospital admission between survivors and non-survivors (p = 0.391). Compared with survivors, non-survivors had higher white blood cell counts (7.2 vs 4.4 × 10⁹/L, p < 0.001), higher neutrophil counts (6.3 vs 2.8 × 10⁹/L, p < 0.001), lower lymphocyte counts (0.66 vs 1.06 × 10⁹/L, p < 0.001), higher CRP levels (83.15 vs 13.70 mg/L, p < 0.001), higher D-dimer levels (1.85 vs 0.39 mg/L, p < 0.001), and higher lactate dehydrogenase levels (451.0 vs 227.0 U/L, p < 0.001).

Table 2 The characteristics of patients with COVID-19 in derivation cohort

On admission, all patients in the derivation cohort had pneumonia diagnosed by chest CT scan, and the CT images of 161 (91.5%) patients showed bilateral lung impairment. All patients received antiviral treatment, such as ribavirin, arbidol hydrochloride, lopinavir and ritonavir, or interferon-α2b (nebulization inhalation). Other symptomatic and supportive treatments were given according to the Guidelines of the Diagnosis and Treatment of Novel Coronavirus Pneumonia published by the China NHC [6]. Acute respiratory distress syndrome, septic shock, acute cardiac injury, and acute renal injury were the common complications (Table 2).

Development of the nomogram

The nomogram (Fig. 2) contains four variables, namely age, neutrophil-to-lymphocyte ratio (NLR), D-dimer, and CRP, and is a visualization of the regression model for evaluating death probability. To use the nomogram, locate a patient’s values of age, NLR, D-dimer, and CRP on their respective axes and draw a vertical line from each to the “Points” axis. The intersection of each vertical line with the “Points” axis gives the score for that predictor. The sum of the four scores (named ANDC) can be converted to a death probability in the same way, by drawing a vertical line from the “Total points” axis to the “Death probability” axis. In this way, clinicians can easily predict the death probability and identify high-risk patients. In general, the higher the ANDC, the greater the probability of death (Fig. 2).

Fig. 2 Nomogram to predict the death probability of patients with COVID-19. The nomogram was constructed based on the following variables: age, NLR, D-dimer and CRP. Locate the values of a patient’s age, NLR, D-dimer, and CRP and draw a vertical line from each of the four predictors to the “Points” axis. The intersections between the vertical lines and the “Points” axis are the corresponding scores for the predictors. The sum of the scores from the four predictors (named ANDC) can be converted to the death probability of patients with COVID-19 by drawing another vertical line from the “Total points” axis to the “Death probability” axis. COVID-19, coronavirus disease 2019; NLR, neutrophils-to-lymphocytes ratio; CRP, C-reactive protein

Alternatively, the ANDC score can be calculated using the following formula: \( \text{Total points (ANDC)} = \left( 1.14 \times \text{age (years)} - 20 \right) + 1.63 \times \text{NLR} + 5.00 \times \text{D-dimer (mg/L)} + 0.14 \times \text{CRP (mg/L)} \). The ANDC scores and the corresponding death probabilities are listed in Additional file 1: Table S1. In particular, ANDC scores of 59 and 101 corresponded to death probabilities of 5% and 50%, respectively. We suggest that 59 and 101 be used as cutoff values to stratify COVID-19 patients into three groups: the death probability was less than 5% in the low-risk group (ANDC < 59), between 5% and 50% in the moderate-risk group (59 ≤ ANDC ≤ 101), and more than 50% in the high-risk group (ANDC > 101).
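The formula and cutoffs above translate directly into a few lines of code. The following Python sketch is an illustration only: the function names `andc_score` and `risk_group` and the example patient are hypothetical, and the conversion from score to death probability (Additional file 1: Table S1) is not reproduced because the table is not given here.

```python
def andc_score(age_years, nlr, d_dimer_mg_l, crp_mg_l):
    """Total points (ANDC), computed from the formula as printed in the text."""
    return (1.14 * age_years - 20) + 1.63 * nlr + 5.00 * d_dimer_mg_l + 0.14 * crp_mg_l


def risk_group(andc):
    """Map an ANDC score to the three risk groups defined by the cutoffs 59 and 101."""
    if andc < 59:
        return "low"
    if andc <= 101:
        return "moderate"
    return "high"


# Hypothetical patient; the values are chosen for illustration only.
score = andc_score(age_years=70, nlr=9.5, d_dimer_mg_l=1.85, crp_mg_l=83.15)
print(f"ANDC = {score:.1f}, risk group = {risk_group(score)}")
```

Both boundary scores (59 and 101) fall in the moderate-risk group, following the inequalities stated in the text.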

Furthermore, we compared the actual proportion of deaths with the predicted death probability in the three subgroups classified according to the ANDC score. As shown in Additional file 2: Table S2, the proportions of death were 0.9% (1/110) in the low-risk group, 18.0% (9/50) in the moderate-risk group and 68.8% (11/16) in the high-risk group. The actual fatality rates were significantly different (p < 0.001) among the three subgroups.

Performance of the nomogram

The dispersion parameter was 0.382 (less than 1) and the maximum VIF among the predictors in the full model was less than 1.25, indicating neither over-dispersion nor multicollinearity. The p value of the Hosmer–Lemeshow test was 0.751 (greater than 0.025), which demonstrated consistency between the predicted and observed probabilities of the outcome. In addition, as shown in Fig. 3, the bias-corrected curve in the calibration plot lay close to the diagonal line, supporting the same conclusion after bootstrap correction.

Fig. 3 Calibration plots comparing predicted and actual death probability of patients with COVID-19. The two panels show actual against predicted death probability of patients with COVID-19. a Internal validation. b External validation. The dotted curve represents the apparent curve without bootstrap correction. The solid curve represents the curve corrected by 1000 bootstrap resamples. The dashed curve represents the ideal fit. COVID-19, coronavirus disease 2019

The AUC of our model, a measure of discrimination, was 0.921 (95% CI 0.835–0.968) under bootstrap correction. As shown in Fig. 4, the net benefit of every single-predictor model was positive, indicating that every predictor contributed to the prediction of outcomes. The full model demonstrated the best performance, which supports combining the four predictors in the model.
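A bootstrap summary of the AUC can be sketched as follows. This is a plain percentile bootstrap on synthetic data, written in Python for illustration; it is not necessarily the exact procedure implemented in the R packages used in the study, and the function names, the simulated scores and the random seeds are all assumptions.

```python
import random
from statistics import median


def auc(labels, scores):
    """AUC as the probability that a random death outranks a random survivor.

    Ties between a death and a survivor count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_boot=1000, seed=0):
    """Median and approximate 2.5th/97.5th percentiles of AUC over resamples."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            values.append(auc(ys, [scores[i] for i in idx]))
    values.sort()
    return median(values), values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]


# Synthetic cohort: 20 deaths and 150 survivors with simulated risk scores.
data_rng = random.Random(42)
labels = [1] * 20 + [0] * 150
scores = ([data_rng.gauss(100, 25) for _ in range(20)]
          + [data_rng.gauss(50, 25) for _ in range(150)])
med, lower, upper = bootstrap_auc(labels, scores)
print(f"AUC median {med:.3f} (95% interval {lower:.3f} to {upper:.3f})")
```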

Fig. 4 Decision curve analysis comparing different models to predict the death probability of patients with COVID-19. The net benefit balances the mortality risk and potential harm from unnecessary over-intervention for patients with COVID-19. The full model incorporates the following predictors: age, NLR, D-dimer and CRP. COVID-19, coronavirus disease 2019; NLR, neutrophils-to-lymphocytes ratio; CRP, C-reactive protein
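The net benefit plotted in a decision curve follows the standard definition: the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The Python sketch below illustrates this calculation on hypothetical predictions; the function names and the ten example patients are assumptions and do not come from the study data.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: intervene on every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted death probabilities for ten patients (1 = died).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.80, 0.40, 0.30, 0.10, 0.05, 0.05, 0.02, 0.02, 0.01, 0.01]
for pt in (0.05, 0.20, 0.50):
    model_nb = net_benefit(labels, probs, pt)
    all_nb = net_benefit_treat_all(labels, pt)
    print(f"threshold {pt:.2f}: model {model_nb:.3f}, treat all {all_nb:.3f}")
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all strategy and zero, the net benefit of treating no one.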

External validation

To guard against overfitting, it is essential to evaluate model performance in a cohort that was not used to develop the prediction model. We therefore performed a series of external validation analyses. The patients in the validation cohort were divided into three subgroups according to the ANDC score. As shown in Additional file 3: Table S3, the proportions of death were 0.0% (0/35) in the low-risk group, 1.4% (1/71) in the moderate-risk group and 52.6% (10/19) in the high-risk group. The actual fatality rates were significantly different (p < 0.001) among the three subgroups. Consistent with the derivation cohort, the model performed well, with an AUC of 0.975 (95% CI 0.947–1.000), and the calibration plot indicated that the model remained reliable in an external dataset (Fig. 3).

Discussion

In this study, we investigated the correlation between the outcomes of COVID-19 patients and multiple factors. LASSO regression analysis identified age, NLR, D-dimer and CRP as predictors of a poor outcome in COVID-19 patients. A visualized nomogram was then established based on these four variables, and ANDC scores could be easily obtained for the early stratification of COVID-19 patients and for improving the clinical management of the disease (Additional files 4, 5: Figures S1, S2).

Most patients with COVID-19 had mild disease [4]. At the initial stage of this outbreak, most patients with mild disease were advised to take oral medications and self-isolate at home, which may explain the relatively high CFR among inpatients in our derivation cohort. In addition, the Tumor Center of Union Hospital had been a designated clinical center for patients with COVID-19 since February 2020. Severe or critical COVID-19 patients were therefore transferred there from other hospitals, which possibly also contributed to the higher CFR among hospitalized patients in the validation cohort in our study than previously reported [3, 4].

In accordance with previous studies on severe acute respiratory syndrome (SARS) [11], Middle East respiratory syndrome (MERS) [12] and COVID-19 [13], older age was identified as a predictor of poor prognosis in patients with COVID-19 in our study. It is hypothesized that immunosenescence and/or underlying comorbidities might render geriatric patients more vulnerable to developing severe COVID-19 illness.

Recently, Liu et al. [14] proposed that increased NLR was a risk factor for the early identification of severe COVID-19 illness. In this study, we found that higher NLR was associated with higher mortality. In patients with community-acquired pneumonia, extensive activation of the immune system and/or immune dysfunction can alter the ratio between serum neutrophil and lymphocyte levels [15]. Moreover, when immunosuppression and lymphocyte apoptosis are induced by various anti-inflammatory cytokines, neutrophils, especially immature neutrophils, are recruited from the bone marrow into the circulation [16], resulting in an increased NLR. Lu et al. [17] suggested that CRP tested on admission can predict short-term mortality related to confirmed or suspected COVID-19. CRP is synthesized by hepatocytes in response to cytokines derived from leukocytes stimulated by infection, inflammation, or tissue damage. CRP is now widely used clinically to evaluate disease progression, and it serves as an indicator for predicting bacterial infections in patients [18,19,20]. In our study, increased CRP levels measured at admission in patients with COVID-19 were associated with increased mortality risk. This suggests that severe inflammation or potentially a secondary infection had developed in these patients, and empirical antibiotic treatment might be necessary.

A previous study suggested that dysregulation of the urokinase, coagulation and fibrinolysin pathways during SARS-CoV and influenza virus infections results in more severe lung pathologies by disturbing the balance between the host coagulation and fibrinolysin pathways [21]. D-dimer is often regarded as an indicator of fibrinolytic system activity. Once inflammation has occurred, the alveolar hemostatic balance shifts toward prominent procoagulant activity, resulting in high D-dimer levels [22]. Furthermore, inflammatory cytokines can also activate the coagulation cascade and inhibit fibrinolysis in patients with severe sepsis [23]. On the other hand, D-dimer has been demonstrated to be a major indicator for diagnosing pulmonary embolism (PE) [24], which also affects prognosis. In the current study, the 21 fatal cases in the derivation cohort demonstrated elevated D-dimer levels on admission, which might indicate that a therapeutic approach targeting coagulopathy-related signaling pathways should be considered at that time.

Based on the above analyses, the four mortality predictors in our nomogram were associated with inflammation, immunity and coagulation function, which might contribute to the pathogenesis of COVID-19. We speculated that the inflammatory response to SARS-CoV-2 infection may be central to the pathogenesis of COVID-19, and that dysregulation of the immune and/or coagulation system results in worse disease outcomes, such as ARDS, coagulopathy, and septic shock. In our study, non-survivors had lower levels of lymphocytes and higher levels of neutrophils, D-dimer and CRP than survivors. An early intervention based on comprehensive consideration of the inflammatory response, immune dysfunction and coagulopathy might help in devising a reasonable and individualized therapeutic strategy for COVID-19 patients with high mortality risk.

Recently, Chen et al. proposed that older age, dyspnea, coronary heart disease, cerebrovascular disease, and elevated PCT and AST are independent risk factors associated with fatal outcome, and developed a nomogram to predict the survival of patients with COVID-19 in China, but without external validation. In our study, the aforementioned four predictors (age, NLR, D-dimer and CRP) obtained on admission were selected by the LASSO analysis to construct a predictive nomogram, which exhibited good discrimination and calibration in the individualized prediction of the death probability of COVID-19 patients. Furthermore, our nomogram was validated in an external heterogeneous cohort and appeared to be useful in different clinical settings. The application of our nomogram in the derivation and validation cohorts showed good discrimination, with AUC values of 0.921 and 0.975, which were higher than that of Chen’s nomogram (AUC = 0.849). Moreover, the ANDC score derived from the nomogram provided a quantitative tool for the early identification of patients with high mortality risk on admission and for guiding clinical management. Patients with COVID-19 were classified by the ANDC score obtained on admission into three risk groups with different mortality risks. Cases in the low-risk group should be isolated and treated in “Mobile Cabin Hospitals” [17]. Patients with moderate risk should be admitted to a designated hospital for comprehensive treatment in an isolation ward. Patients with high risk should be kept under intensive surveillance and, if necessary, transferred to the ICU for aggressive treatment and critical supportive care.

There were several limitations in our study. Firstly, this is a retrospective study and hence the model needs to be validated by multicenter prospective studies. Secondly, elevated D-dimer levels on admission may indicate underlying PE or a high risk of PE; however, owing to the retrospective study design, CT angiography, which is used to diagnose PE, was not performed in all COVID-19 patients.

Conclusions

In summary, based on multiple risk factors (age, NLR, D-dimer and CRP), our nomogram for predicting the prognosis of patients with COVID-19 showed good discrimination and calibration. The application of ANDC would help clinicians make a prompt and reasonable decision to optimize patient stratification management and to potentially reduce fatality. However, this quantitative tool needs to be validated by further large-scale prospective studies.