Introduction

Coronavirus disease 2019 (COVID-19) began to spread globally in December 2019, causing a pandemic that poses a serious threat to human health [1]. As of March 21, 2022, there had been over 469 million confirmed cases of COVID-19 worldwide and over 6.07 million deaths [2]. Currently, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes COVID-19, is detected using reverse transcription-polymerase chain reaction (RT-PCR) [3]. Although this method has reasonable specificity and sensitivity, it requires specialized equipment, reagents, and personnel training [4]. It is also costly and takes a relatively long time to return results [5]. A positive RT-PCR test result indicates a confirmed COVID-19 case [6]. Some countries discharge inpatients with COVID-19 after two negative RT-PCR tests taken more than 24 h apart [7]. However, discharge criteria vary widely among countries, and some countries have no specific discharge criteria [8]. Thus, with limited time and limited human and material resources, there is an urgent need for a rapid, simple, and affordable method of monitoring illness progression in patients and determining when they can be discharged from the hospital.

Most previous studies focused on imaging examinations and clinical symptoms of patients with COVID-19. These assessments frequently require additional expense and effort and burden patients physically or financially, and they do not offer a fast and cost-effective way to predict the course of a patient’s disease [9,10]. Blood interacts closely with various tissues and cells in the body and can provide a wide range of information [11]. A complete blood count (CBC) is the most common test performed in clinical practice on hospitalized patients [12]. It is a simple, quick, and low-cost test [13]. COVID-19 has been shown to affect the blood circulatory system, and obvious and persistent changes in blood cells can be detected during the infection [14,15]. Many studies have been performed on patients with mild and severe COVID-19 and have found that blood cell changes correlate strongly with the severity of COVID-19 [16,17,18]. However, few studies have been performed on patients with moderate cases. Of the 72,314 local COVID-19 cases reported by the Chinese Center for Disease Control and Prevention, the majority (81%) were mild or moderate [19]. According to the World Health Organization, as of March 21, 2022, the percentage of deaths following COVID-19 infection was approximately 1.3% and the rate of patients treated and discharged was about 86.6% [2]. Therefore, it is of greater practical significance to assess the progression of disease in patients with moderate COVID-19 and to find the factors related to improvement in their condition. This retrospective multicenter study analyzes, for the first time, the role of the CBC in the recovery of patients with moderate COVID-19. On this basis, an efficient and convenient multivariable combination model is developed to predict patient recovery.

Methods

Study design and patient population

We retrospectively analyzed data from 127 patients with COVID-19 in the electronic medical record systems of Changchun Chinese Medicine Hospital and Siping Infectious Disease Hospital from January 2020 to March 2021. The data included gender, age, comorbidities, clinical symptoms, length of hospitalization, and results of multiple laboratory tests after admission. All patients in the study had data recorded on admission, at multiple times after admission, at the time of turning negative, and after discharge. The classification and discharge criteria were taken from the Diagnosis and treatment protocol for novel coronavirus pneumonia (Trial Version 7) [20]. Cases were classified as follows. Mild cases had mild clinical symptoms and no signs of pneumonia on imaging. Moderate cases had fever and respiratory symptoms with imaging findings of pneumonia. Cases meeting any of the following criteria were defined as severe: respiratory distress (respiratory rate ≥ 30 breaths/min); oxygen saturation ≤ 93% at rest; or arterial oxygen partial pressure/fraction of inspired oxygen ≤ 300 mmHg. Patients whose lung imaging showed significant lesion progression within 24–48 h, with lesions occupying > 50% of the lung, were managed according to the protocols for severe cases. Cases meeting any of the following criteria were defined as critical: respiratory failure requiring mechanical ventilation; shock; or failure of other organs requiring care in the intensive care unit. Only moderate cases were included; mild, severe, and critical cases were excluded according to these criteria. Discharge criteria were as follows: body temperature had been normal for more than three days; respiratory symptoms had improved markedly; pulmonary imaging showed obvious absorption of inflammation; and nucleic acid tests on respiratory tract samples such as sputum and nasopharyngeal swabs were negative twice consecutively (sampling interval of at least 24 h).
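The case classification described above is rule based, so it can be written down as a short function. The following is only an illustrative sketch: the `Findings` record, its field names, and the order in which the rules are checked (critical before severe) are assumptions introduced here, not part of the protocol or of the study’s data pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Findings:
    """Hypothetical per-patient record; all field names are illustrative."""
    fever: bool
    respiratory_symptoms: bool
    pneumonia_on_imaging: bool
    respiratory_rate: Optional[float] = None     # breaths/min
    spo2_at_rest: Optional[float] = None         # oxygen saturation, %
    pao2_fio2: Optional[float] = None            # mmHg
    managed_as_severe_on_imaging: bool = False   # rapid progression, > 50% of lung
    mechanical_ventilation: bool = False         # respiratory failure needing ventilation
    shock: bool = False
    organ_failure_needing_icu: bool = False


def classify(f: Findings) -> str:
    """Return 'critical', 'severe', 'moderate', 'mild', or 'unclassified'."""
    # Assumption: the most serious category is checked first.
    if f.mechanical_ventilation or f.shock or f.organ_failure_needing_icu:
        return "critical"
    severe = (
        (f.respiratory_rate is not None and f.respiratory_rate >= 30)
        or (f.spo2_at_rest is not None and f.spo2_at_rest <= 93)
        or (f.pao2_fio2 is not None and f.pao2_fio2 <= 300)
        or f.managed_as_severe_on_imaging
    )
    if severe:
        return "severe"
    if f.pneumonia_on_imaging and f.fever and f.respiratory_symptoms:
        return "moderate"
    if not f.pneumonia_on_imaging:
        return "mild"
    # Pneumonia on imaging without fever or respiratory symptoms is not
    # covered by the criteria as summarized in the text.
    return "unclassified"


print(classify(Findings(fever=True, respiratory_symptoms=True, pneumonia_on_imaging=True)))
```

In this sketch only records returning "moderate" would be kept, mirroring the exclusion of mild, severe, and critical cases.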
The verification cohort consisted of 38 patients with moderate COVID-19 treated at Changchun Infectious Disease Hospital from January to March 2020. The data were divided into two subgroups according to the time of the laboratory test. The first test results after admission, usually obtained within 1–3 days after admission, formed the early onset group. The first test results obtained within 3 days before discharge formed the turning negative group. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the ethics committees of the three hospitals (the Ethics Committee of Changchun Infectious Disease Hospital, No. 2020-001; the Ethics Committee of Changchun Chinese Medicine Hospital, No. 2021-005; the Ethics Committee of Siping Infectious Disease Hospital, No. 2020-001). These ethics committees waived the requirement for written informed consent because of the study’s retrospective nature.

Data collection

COVID-19 results were confirmed by the Changchun Center for Disease Control and Prevention, the Siping Center for Disease Control and Prevention, or the Jilin Center for Disease Control and Prevention. Laboratory tests included hematology and biochemical tests. The biochemical and hematology analyzers used at Changchun Chinese Medicine Hospital were the B-S800M (Mindray Biomedical Electronics Corp., Shenzhen, China) and the BC-5390 (Mindray Biomedical Electronics Corp., Shenzhen, China), respectively. Those used at Siping Infectious Disease Hospital were the Pointcare M3i (Mnchip Technology Corp., Tianjin, China) and the ABX Pentra XL 80 (Horiba Medical, Montpellier, France). Those used at Changchun Infectious Disease Hospital were the CS-T300 (Dirui Industrial Corp., Changchun, China) and the DF53 (Dymind Biotechnology Corp., Shenzhen, China). The instruments underwent rigorous quality control testing. The three laboratories that participated in this study all passed the external quality assessment and proficiency certification of the Jilin Clinical Laboratory Center. All physicians, technicians, and nurses in this study received uniform training from the Health Commission of Jilin Province. The National Health Commission of the People’s Republic of China has issued reference interval standards for common biochemical analytes and blood cell analysis in Chinese adults [21,22]. The reference intervals used by the three hospitals followed these standards. The use of different instruments was therefore not considered to have influenced the test results.

Statistical analysis

The Kolmogorov–Smirnov test was used to test the normality of the quantitative data. Normally distributed quantitative data were compared using the independent-samples t-test and expressed as \(\overline{x}\) ± s (mean ± standard deviation); non-normally distributed quantitative data were compared using the Mann–Whitney U test and expressed as M (P25, P75) [median (25th percentile, 75th percentile)]. Qualitative data were compared using the chi-squared test or Fisher’s exact test and expressed as n (frequency). Three methods were used to screen for independent factors associated with the recovery of patients with moderate COVID-19. First, univariate logistic regression analysis was performed, and variables with P values greater than or equal to 0.05 were excluded. Second, the Spearman correlation was used to determine whether the remaining variables were significantly correlated with one another. Third, collinearity diagnostics were used to screen the variables to avoid possible multicollinearity in the model: variables with a variance inflation factor (VIF) greater than 10 and a tolerance less than 0.2 were considered to show possible multicollinearity and were excluded. Finally, the variables that met these requirements were entered into the multivariable logistic regression. The multivariable model was fitted using the Backward: Likelihood Ratio method to calculate the odds ratio (OR) and 95% confidence interval (CI) for each variable. The combined model was presented as a nomogram. The receiver operating characteristic (ROC) curve was used to assess the discrimination of the predictive model and to calculate the area under the curve (AUC) and its 95% CI. The model’s goodness-of-fit was assessed using the calibration curve and the Hosmer–Lemeshow test, and a P value greater than 0.05 was considered a satisfactory fit. The clinical usefulness of the model was evaluated using decision curve analysis (DCA). For all other tests, a P value less than 0.05 was considered statistically significant.
Stata 15, GraphPad Prism 8, and SPSS 23.0 were used for data analysis and graphical plotting.
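The collinearity screen can be illustrated with a small, self-contained sketch. It relies on the standard identity that the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictors’ correlation matrix, and that tolerance is the reciprocal of the VIF. The function names and the demonstration values below are invented for illustration; they are not the study data, and the study itself used SPSS and Stata rather than this code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_table(columns):
    """Map each predictor name to (VIF, tolerance); columns is name -> list of values."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: (inv[i][i], 1.0 / inv[i][i]) for i, name in enumerate(names)}


# Invented demonstration values: wbc is almost exactly ne + ly, so it is collinear.
demo = {
    "ne": [3.0, 4.1, 2.5, 5.2, 3.8, 4.6, 2.9, 5.0],
    "ly": [1.5, 1.2, 2.0, 0.9, 1.6, 1.1, 1.8, 1.0],
    "wbc": [4.55, 5.27, 4.52, 6.06, 5.41, 5.73, 4.68, 6.04],
    "mcv": [88, 91, 86, 90, 93, 87, 89, 92],
}

for name, (vif, tol) in vif_table(demo).items():
    flag = "exclude" if vif > 10 and tol < 0.2 else "keep"
    print(f"{name}: VIF={vif:.1f} tolerance={tol:.3f} -> {flag}")
```

Because tolerance is 1/VIF, a VIF above 10 always implies a tolerance below 0.1, so the combined rule used here (VIF > 10 and tolerance < 0.2) is equivalent to VIF > 10 alone.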

Ethics approval and consent to participate

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Committee of Changchun Infectious Disease Hospital (No. 2020-001), the Ethics Committee of Changchun Chinese Medicine Hospital (No. 2021-005), and the Ethics Committee of Siping Infectious Disease Hospital (No. 2020-001). The ethics committees waived the requirement for written informed consent because of the study’s retrospective nature.

Results

Clinical and laboratory characteristics of patients with moderate COVID-19

Of the 127 patients with COVID-19, we excluded 1 patient who died, 31 patients with mild cases, 5 patients with severe cases, and 4 patients with critical cases. A total of 86 patients with moderate COVID-19 were finally included; the patient selection flowchart is shown in Fig. 1. Their mean age was 53 years, their mean hospital stay was 20 days, the most common comorbidity was cardiovascular disease (25.6%), and the most common clinical symptom was cough (41.9%). Men made up 43% of the cohort. There were no statistically significant differences between men and women in length of hospital stay, age, comorbidities (except cerebrovascular disease), clinical symptoms, or medication use (P > 0.05). The clinical characteristics of these patients are shown in Table 1. Red blood cell count, hemoglobin, hematocrit, platelet count (PLT), mean platelet volume (MPV), creatinine (Cr), total protein (TP), albumin (ALB), and potassium (K) were normally distributed (P > 0.05). The independent-samples t-test and Mann–Whitney U test showed statistically significant differences between the early onset and turning negative data (P < 0.05) in white blood cells (WBC), neutrophil count (NE), lymphocyte count (LY), eosinophils (EO), basophils (BA), mean corpuscular volume (MCV), mean corpuscular hemoglobin concentration, red blood cell distribution width (RDW), PLT, MPV, platelet distribution width (PDW), glucose, Cr, urea, carbon dioxide combining power, TP, ALB, aspartate aminotransferase, alkaline phosphatase, γ-glutamyl transpeptidase, sodium, K, and chloride, as detailed in Table 2. We tracked longitudinal changes in the CBC of these patients during hospitalization and after discharge. Most CBC indicators remained within the reference ranges; the trajectory of each indicator over time is shown in Fig. 2.

Figure 1

The flowchart of this study.

Table 1 Clinical characteristics of patients with moderate COVID-19.
Table 2 Laboratory characteristics of patients with moderate COVID-19.
Figure 2

Changes in blood cell parameters over time in patients with COVID-19 during hospitalization and after discharge. M male, F female. Shaded areas indicate the reference intervals of the tests.

Establishing a predictive model for turning negative in patients with moderate COVID-19

We explored the relationship between blood cell parameters and the recovery of patients with moderate COVID-19. The univariate logistic regression analysis showed that the differences in WBC, NE, LY, EO, BA, MCV, RDW, PLT, MPV, and PDW between the early onset and turning negative periods were statistically significant (P < 0.05) (Fig. 3A). The Spearman correlation was performed on these variables, and the results showed significant correlations between most of them (P < 0.05) (Fig. 3B). To eliminate redundant indicators and avoid collinearity among highly correlated indicators, we performed collinearity diagnostics to screen the variables for subsequent inclusion in the multivariable model. The results showed that WBC, NE, LY, and EO had multicollinearity (VIF > 10 and tolerance < 0.2) (Fig. 3C), so these four variables were excluded from subsequent analyses. BA, MCV, RDW, PLT, MPV, and PDW were entered into the multivariable logistic regression, and the model was fitted using the Backward: Likelihood Ratio method. The results showed that BA (OR 6.372; 95% CI 3.284–12.363; P = 0.001), MCV (OR 1.244; 95% CI 1.088–1.422; P < 0.001), RDW (OR 2.585; 95% CI 1.261–5.297; P = 0.010), and PDW (OR 1.559; 95% CI 1.154–2.108; P = 0.004) could jointly predict recovery in patients with moderate COVID-19 (sensitivity 95.3%, specificity 91.9%). The combined model is presented as a nomogram. Each variable is assigned a score, and the total score, calculated by summing the individual scores, reflects the probability of a patient recovering from COVID-19 (Fig. 4A). In both the training and verification cohorts, BA, MCV, RDW, and PDW were lower in the early onset period than in the turning negative period, and the differences were statistically significant (P < 0.05) (Fig. 4B,C).
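The reported odds ratios and confidence intervals can be related to the underlying logistic regression coefficients with a short calculation. The sketch below assumes Wald-type intervals, so that the coefficient is ln(OR), the standard error is the width of the interval on the log scale divided by 2 × 1.96, and the two-sided P value comes from the standard normal distribution. This is an approximation from rounded published values, not a re-analysis of the study data, and the function name is illustrative.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

# (OR, lower 95% CI limit, upper 95% CI limit) as reported for the final model
reported = {
    "BA": (6.372, 3.284, 12.363),
    "MCV": (1.244, 1.088, 1.422),
    "RDW": (2.585, 1.261, 5.297),
    "PDW": (1.559, 1.154, 2.108),
}


def wald_summary(odds_ratio, lower, upper):
    """Back-calculate coefficient, standard error, z, and two-sided P (Wald assumption)."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p


for name, (or_value, lower, upper) in reported.items():
    beta, se, z, p = wald_summary(or_value, lower, upper)
    print(f"{name}: beta={beta:.3f} SE={se:.3f} z={z:.2f} P={p:.2g}")
```

Under these assumptions, the P values recovered for RDW and PDW (about 0.009 and 0.004) are close to the reported 0.010 and 0.004. The same calculation gives a value far below 0.001 for BA and about 0.001 for MCV.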

Figure 3

Screening for independent factors associated with patient improvement. (A) Forest plot based on univariate logistic regression analysis. (B) Correlation heat map of the 10 parameters with significant differences. (C) Collinearity diagnostics. CI confidence interval, VIF variance inflation factor. **P < 0.001, *P < 0.05.

Figure 4

Visual representation of the model. (A) Nomogram to illustrate how BA, MCV, RDW, PDW are related to recovery. (B) Distribution of BA, MCV, RDW, and PDW in training cohort. (C) Distribution of BA, MCV, RDW, and PDW in verification cohort.

Evaluating and validating a predictive model for turning negative in patients with moderate COVID-19

The ROC curves showed that the combined model had better discrimination than any single-variable model in the training cohort (AUC = 0.968; 95% CI 0.943–0.992; P < 0.001) (Fig. 5A) and in the external verification cohort (AUC = 0.870; 95% CI 0.793–0.948; P < 0.001) (Fig. 5B). The detailed parameters of the ROC curves are shown in Table 3. The Hosmer–Lemeshow goodness-of-fit test indicated a satisfactory fit in the training cohort (χ² = 8.804; P = 0.359) and the verification cohort (χ² = 7.502; P = 0.484), and the corresponding calibration curves are shown in Fig. 5C. The DCA showed that the net benefit of the combined model was higher than that of any single-variable model in the training and verification cohorts (Fig. 5D,E).
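The Hosmer–Lemeshow P values follow from the chi-square statistics once the degrees of freedom are fixed. The sketch below assumes 8 degrees of freedom (10 risk groups minus 2, the conventional setup, which the text does not state explicitly) and uses the closed-form upper-tail probability of the chi-square distribution that holds for even degrees of freedom. The function name is illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


DF = 8  # assumption: 10 risk groups minus 2

p_training = chi2_sf_even_df(8.804, DF)
p_verification = chi2_sf_even_df(7.502, DF)
print(f"training P = {p_training:.3f}, verification P = {p_verification:.3f}")
```

With 8 degrees of freedom this reproduces the reported P values of 0.359 and 0.484, both above the 0.05 threshold used to judge a satisfactory fit.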

Figure 5

Model performance in the training and verification cohorts. (A) Receiver operating characteristic curve of the training cohort. (B) Receiver operating characteristic curve of the verification cohort. (C) Calibration curve. (D) Decision curve analysis of the training cohort. (E) Decision curve analysis of the verification cohort. Model, combined model of basophils, mean corpuscular volume, red blood cell distribution width, and platelet distribution width. Shaded areas represent the 95% confidence intervals of the areas under the receiver operating characteristic curves.
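Decision curves such as those in panels D and E plot net benefit against the threshold probability. A minimal sketch of the conventional definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), is given below together with the "treat all" reference strategy. The function names, predicted probabilities, and outcomes are invented for illustration and are not the study data.

```python
def net_benefit(probabilities, outcomes, threshold):
    """Net benefit of acting on predictions at or above the threshold probability."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the reference strategy that classifies everyone as positive."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented predicted probabilities and observed outcomes (1 = event, 0 = no event)
probs = [0.9, 0.8, 0.6, 0.55, 0.7, 0.4, 0.3, 0.2]
observed = [1, 1, 1, 1, 0, 0, 0, 0]

for pt in (0.2, 0.5, 0.8):
    print(pt, net_benefit(probs, observed, pt), net_benefit_treat_all(observed, pt))
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the "treat all" line and zero (the "treat none" line).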

Table 3 Characteristics of the receiver operating characteristic curve.
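The AUC summarized in Table 3 has a simple rank interpretation: it is the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, which equals the Mann–Whitney U statistic divided by the product of the two group sizes. The sketch below computes it directly from that definition; the function name and the scores are invented for illustration.

```python
def auc_rank(scores_pos, scores_neg):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 1/2)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented model scores for positive and negative cases
positive = [0.9, 0.8, 0.6, 0.55]
negative = [0.7, 0.4, 0.3, 0.2]
print(auc_rank(positive, negative))  # 14 of 16 pairs are ranked correctly: 0.875
```

The pairwise loop is quadratic in the sample size, which is acceptable for cohorts of this size; a rank-based formula would be preferable for large data sets.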

Discussion

This study has the following strengths and findings: (1) It used a variety of statistical methods to screen for independent factors, including univariate logistic regression, Spearman correlation, collinearity diagnostics, and the Backward: Likelihood Ratio method, which helped in fitting a more efficient prediction model. (2) It evaluated the discrimination, calibration, and clinical usefulness of the model using training and external validation cohorts, which demonstrates the predictive ability of the CBC-based model more comprehensively. (3) It revealed that BA, MCV, RDW, and PDW can be used to predict the recovery of patients with moderate COVID-19. Notably, although the medians of BA, MCV, RDW, and PDW were within the normal ranges at both admission and discharge, these values were higher at discharge. Thus, the “elevation” described in this study is not an abnormal increase outside the reference range.

BA are rare blood leukocytes (approximately 2%) produced by bone marrow progenitor cells. Conceição-Silva et al. showed that they have extracellular traps with fungicidal and antifungal activity that might play a protective role during COVID-19 infection [23]. Rodriguez et al. found that BA in patients with severe COVID-19 increased significantly from the acute phase to the recovery phase [24]. Our findings are consistent with these studies, and we conclude that elevated BA predict improvement and could be a prognostic marker for recovery in patients with moderate COVID-19. However, other studies suggested that a progressive increase in BA was a risk factor for COVID-19 lethality, which contradicts our findings [25]. MCV and RDW are parameters used to assess the mean volume and size heterogeneity of erythrocytes, respectively. Our study showed no significant changes in erythrocyte morphology in patients with moderate COVID-19. A slight increase in MCV and RDW within the normal range could predict improvement in patients. Some studies concluded that uneven red blood cell distribution was closely related to poor prognosis and mortality in COVID-19, whereas others found no significant correlation between them [26,27,28]. These conflicting views suggest the need for further in-depth investigation. PDW is analogous to RDW and reflects the heterogeneity of platelet size. Wang et al. found that PDW was significantly higher at discharge than at admission in patients with mild COVID-19 and that PDW had potential diagnostic value for mild COVID-19 [29]. Our findings are consistent with these results and demonstrate that elevated PDW could be used to predict recovery in patients with moderate COVID-19. In contrast, Bommenahalli Gowda et al. showed that elevated PDW was significantly associated with increased mortality in COVID-19 [30].
Possible reasons for the differences between study results include the following: our study included moderate cases only, patients with other clinical subtypes were excluded, and the study was conducted in Jilin province, China, where COVID-19 severity was relatively low.

This study has several limitations: (1) The sample size was small, with only 86 patients included, which might have affected the statistical power. (2) Because this was a retrospective study, the results of some laboratory indicators were missing, so we could not show the changes in all laboratory indicators between admission and discharge. (3) Considering the advantages of the CBC in terms of simplicity, speed, and cost, this study built a prediction model from CBC indicators only, without incorporating other indicators that may have better predictive performance. (4) Changes in these parameters might have been influenced by medication, but owing to the retrospective design, we were unable to control the patients’ medication use. (5) Chest X-ray and computed tomography results were not collected in this study, so the influence of imaging features on patient recovery could not be analyzed.

Conclusion

This study developed and validated a reliable nomogram model for predicting recovery in patients with moderate COVID-19. We conclude that small elevations in BA, MCV, RDW, and PDW within the normal ranges can jointly predict disease progression in patients with moderate COVID-19 and help clinicians to better monitor disease progression in these patients.