Introduction

Coronavirus disease 2019 (COVID-19), a highly infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been rapidly spreading all over the world and remains a great threat to global public health [1]. The clinical symptoms of patients with COVID-19 vary widely. A significant proportion of patients with COVID-19 have mild symptoms, such as fever, muscle ache, cough, shortness of breath and fatigue, and about half of patients do not show any obvious symptoms [2, 3]. However, some severe cases with severe pneumonia can develop into acute respiratory distress syndrome (ARDS), pulmonary oedema or multiple organ dysfunction syndrome (MODS), hence leading to a high mortality [4,5,6]. Although many patients have mild symptoms, they may suddenly progress to ARDS, septic shock or even MODS [7]. Patients diagnosed with severe or critical illness have a poor prognosis. Hence, it is crucial for us to identify potentially severe or critical cases early and give timely treatments for targeted patients. Therefore, we can prevent the progression of COVID-19, save medical resources and reduce mortality.

Similar to patients with Middle East respiratory syndrome (MERS) and severe acute respiratory syndrome (SARS), dysregulated inflammation leading to cytokine storms is associated with worsening clinical outcomes in patients with COVID-19 [8,9,10]. Emerging evidences suggested that peripheral blood neutrophil-to-lymphocyte ratio (NLR) can be used as a marker of systemic inflammation. Furthermore, NLR has shown good predictive values on progression and clinical outcomes in various disease, such as solid tumours, chronic obstructive pulmonary disease (COPD), cardiovascular disease and pancreatitis [11,12,13,14]. Recently, several studies have reported that NLR may differentiate between mild/moderate and severe/critical groups and probability of death in patients with COVID-19 infection. In addition, a series of studies have suggested NLR is a reliable predictor of COVID-19 progression and elevated NLR is associated with high mortality [15,16,17,18,19,20].

NLR is a readily available biomarker that can be calculated from components of the differential white cell count (dividing neutrophil by lymphocyte count). We performed this systematic review and meta-analysis to evaluate the predictive values of NLR on disease severity and mortality in patients with COVID-19 and to provide a reliable marker for early identification of potentially severe or critically ill cases.

Methods

We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA statement) guidelines to perform this meta-analysis [21]. It was prospectively registered on PROSPERO database (Registration number: CRD42020203612).

Selection of studies

We reviewed PubMed, EMBASE, China National Knowledge Infrastructure (CNKI) and Wanfang databases through August 11, 2020. The search terms were as follows: (“Neutrophil to lymphocyte ratio” or “neutrophil lymphocyte ratio” or “neutrophil-to-lymphocyte ratio” or “neutrophil/lymphocyte ratio” or “NLR”) and (“Coronavirus disease 2019” or “2019 Novel Coronavirus” or “SARS-CoV-2” or “2019-nCoV” or “COVID-19”). The detail of search strategy of PubMed is shown in Additional file 1. No language restrictions were imposed. To find additional citations, the reference lists of the included studies and recent review articles were screened when necessary.

Two authors (X.L and C.L) independently screened all identified citations to find studies for the final analysis. Any disagreement was resolved through discussion. In case of persistent disagreement, we consulted the third reviewer (F.Z) for arbitration. Studies were selected if they met the following criteria: (1) The predictive value of NLR on disease severity or mortality in patients with COVID-19 was evaluated; (2) a 2 × 2 table of results could be constructed [sufficient information to calculate true positive (TP), false positive (FP), false negative (FN) and true negative (TN)]. The exclusion criteria were as follows: (1) case report, review, editorial, conference abstract, comment, letter, animal study; (2) unable to extract a 2 × 2 table of results.

Data extraction and quality assessment

Two authors (X.L and C.L) independently extracted relevant information from individual studies, including first author, publication year, country, publication language, number of patients (male/female), mean age, cut-off value, area under curve (AUC), TP, TN, FP, FN, sensitivity (SEN) and specificity (SPE). The extracted information was checked by a third author (Z.M). We used the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria to evaluate each of the included studies in 4 domains: patient selection; index test; reference standard; and flow and test timing [22].

Statistical analysis

The statistical analyses were conducted by STATA (version 14.0) using MIDAS module [23]. A bivariate random-effects regression model was performed to calculate SEN, SPE, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio (DOR) and corresponding 95% credible interval (CI). A summary receiver operating characteristic (SROC) curve was drawn to assess the overall diagnostic accuracy. The higher the AUC value, the better the diagnostic power [24]. We used Deek funnel plot to detect publication bias. If the P value is less than 0.1, publication bias may exist. I2 index was calculated to assess heterogeneity between studies, and I2 values above 50% were regarded as the indicative of substantial heterogeneity [25]. We conducted Fagan nomograph to explore the relationship between the pre-test probability, likelihood ratio and the post-test probability. To investigate potential sources of heterogeneity among included studies, sensitivity and subgroup analyses were conducted. In sensitivity analyses, we only included studies published in English. We did subgroup analyses based on cut-off value.

Results

Selection and characteristics of studies

As a result of the literature search, a total of 298 studies were identified, including 97 from PubMed, 91 from EMBASE, 62 from CNKI and 48 from Wanfang debase. Figure 1 shows the study selection process. In total, 111 duplicate publications were excluded. According to the inclusion and exclusion criteria, we excluded 145 studies by evaluating the titles and abstracts. The remaining 42 studies were further scrutinized by reading the full text. Finally, only 19 studies were included in this meta-analysis, of which 9 reported the predictive value on disease severity [26,27,28,29,30,31,32,33,34], 6 reported the predictive value on mortality [35,36,37,38,39,40], and 4 reported the predictive value on both disease severity and mortality [41,42,43,44].

Fig. 1
figure 1

Flow diagram for the identification of eligible studies

The characteristics of the included studies and the predictive value of NLR on disease severity or mortality in each study are presented in Table 1. Most studies were conducted in China. Twelve studies were published in English, six in Chinese and one in Spanish. Except one prospective study [27], all others were retrospective studies. The number of participants across studies ranged from 45 to 1004. Notably, the SEN, SPE, AUC and cut-off value of NLR predicting mortality and disease severity ranged greatly among the included studies. Except two studies [41, 42], all other studies defined severe patients as meeting at least one of the following criterions: shortness of breath, respiratory rate (RR) ≥ 30 times/min or oxygen saturation (resting state) ≤ 93%, or PaO2/FiO2 ≤ 300 mmHg.

Table 1 Characteristics of the included studies and diagnostic test performance of NLR for disease severity and mortality

Study quality and publication bias

The methodological quality of the included studies is presented in Additional file 2. One study only included patients classified as moderate [36], one included only severe patients [35], and another included only elderly patients [40]. Therefore, these three studies were considered to have a high risk of patient selection bias. One study included 32 moderate cases, and another 31 severe cases were included as a control group [32]. One study included 48 moderate cases, and another 37 severe cases were included as a control group [34]. One study included 50 moderate cases, and another 43 severe cases were included as a control group [43]. One study included 42 dead patients, and another 42 discharged patients were included as a control group [37]. These four studies were also assessed to show high risk of patient selection bias, because they did not avoid a case–control design. One study did not provide sufficient information about patients enrolled and leaded to a high risk of patient selection in our opinion [33]. Most studies were considered to have unclear risk of bias regarding index tests, because they did not report the blindness between index and reference tests. Deek funnel plot is shown in Additional file 3, and publication bias may exist among studies reporting the predictive value of NLR on disease severity (P = 0.04).

Predictive value of NLR on disease severity

Thirteen studies involving 1579 patients reported the predictive value of NLR on disease severity. The pooled SEN and SPE were 0.78 (95% CI 0.70–0.84, I2 = 71.83) and 0.78 (95% CI 0.73–0.83, I2 = 77.80), respectively (Fig. 2a). The positive likelihood ratio was 3.6 (95% CI 2.9–4.4), and the negative likelihood ratio was 0.28 (95% CI 0.21–0.38). The DOR was 13 (95% CI 9–18). The SROC curve is shown in Fig. 3a. The AUC of NLR for predicting disease severity was 0.85 (95% CI 0.81–0.88), indicating high diagnostic value. We can learn from Fagan nomogram (Fig. 4a) that if the pre-test probability was set to 50%, the post-test probability of NLR for the detection of severe cases was 78% when the NLR was above the cut-off value. On the contrary, when the NLR was below the cut-off value, the post-test probability was 26%.

Fig. 2
figure 2

a. Forest plot of the sensitivity and specificity of NLR for predicting disease severity in patients with COVID-19. The pooled sensitivity and specificity were 0.78 (95% CI 0.70–0.84) and 0.78 (95% CI 0.73–0.83), respectively. b. Forest plot of the sensitivity and specificity of NLR for predicting mortality in patients with COVID-19. The pooled sensitivity and specificity were 0.83 (95% CI 0.75–0.89) and 0.83 (95% CI 0.74–0.89), respectively

Fig. 3
figure 3

Summary receiver operating characteristic graph for the included studies. a. The AUC of NLR for predicting disease severity was 0.85 (95% CI 0.81–0.88). b. The AUC of NLR for predicting mortality was 0.90 (95% CI 0.87–0.92)

Fig. 4
figure 4

Fagan nomogram of NLR for predicting disease severity and mortality in patients with COVID-19. The pre-test probability was set to 50%. a. The post-test probability of NLR for the detection of severe cases was 78% when the NLR was above the cut-off value. The post-test probability was 22% when the NLR was below the cut-off value. b. The post-test probability of NLR for the detection of mortality was 83% when the NLR was above the cut-off value. The post-test probability was 17% when the NLR was below the cut-off value

Predictive value of NLR on mortality

Ten studies involving 2967 patients reported the predictive value of NLR on mortality. The pooled SEN and SPE were 0.83 (95% CI 0.75–0.89, I2 = 66.13) and 0.83 (95% CI 0.74–0.89, I2 = 90.34), respectively (Fig. 2b). The positive likelihood ratio was 4.8 (95% CI 3.3–7.0), and the negative likelihood ratio was 0.21 (95% CI 0.15–0.30). The DOR was 23 (95% CI 15–36). The SROC with pooled diagnostic accuracy was 0.90 (95% CI 0.87–0.92), presented in Fig. 3b. The Fagan nomogram showed that the post-test probability of NLR for the detection of mortality was 83% when the NLR was above the cut-off value and the post-test probability was 17% when the NLR was below the cut-off value (Fig. 4b).

Subgroup analyses and sensitivity analyses

We conducted the subgroup analyses based on the cut-off value. In terms of predicting disease severity, the cut-off value in six studies was higher than 4.5 and was termed the “high cut-off value” subgroup. Seven others used a lower cut-off value, which were included in the “low cut-off value” subgroup. The AUC was 0.86 (95% CI 0.83–0.89) and 0.82 (95% CI 0.78–0.85), respectively. Similarly, ten studies reporting the predictive value of NLR on mortality were divided into “high cut-off value” (cut-off ≥ 6.5) and “low cut-off value” (< 6.5) subgroups, and the AUC was 0.92 (95% CI 0.89–0.94) and 0.84 (95% CI 0.80–0.87), respectively. In the sensitivity analyses, we only included studies published in English. The pooled AUC for predicting disease severity and mortality was 0.83 (95% CI 0.80–0.86) and 0.90 (95% CI 0.87–0.92), respectively. Detailed results about subgroup analyses and sensitivity analyses are presented in Table 2.

Table 2 Results of sensitivity analysis and subgroup analysis

Discussion

Although in the clinical practice of treating patients with COVID-19, we have observed that the NLR of severe patients is higher than that in mild patients, there is no systematic review and meta-analysis to evaluate the predictive values of NLR on disease severity and mortality in patients with COVID-19. Studies have reported various thresholds to NLR. Clinicians are therefore unclear regarding the thresholds of NLR that should be applied in order to categorize severity of disease and predict prognosis. Our study suggested that NLR can not only be a good biomarker predicting disease severity in patients with COVID-19 (AUC = 0.85, SEN = 0.78 and SPE = 0.78), but also have value in predicting mortality (AUC = 0.90, SEN = 0.83 and SPE = 0.83).

COVID-19 spread rapidly and is an ongoing global pandemic. Medical workers from different countries make efforts to explore the best diagnostic method and the most effective treatment for COVID-19. More and more studies have focused on COVID-19 and published in different languages. To find enough studies that reported the predictive values of NLR on disease severity and mortality in patients with COVID-19, we did not impose any language restrictions. In our final analyses, twelve studies were published in English, six in Chinese and one in Spanish. To our knowledge, English is the most widely used language in the world. Studies published in English may have a wider readership and receive peer review from different countries, while studies published in other languages may be available only to native speakers. Therefore, we performed sensitivity analyses by omitting studies not published in English. The results of the sensitivity analyses were in accordance with the main analyses, indicating that the publish language was not a confounding factor.

To our knowledge, the treatments for mild cases and severe cases are greatly different. For mild cases, there is no need to intervene too much. Some patients can even recover without any treatments. However, for severe cases, even we take many kinds of measures, such as mechanical ventilation, extracorporeal membrane oxygenation (ECMO) and continuous renal replacement therapy (CRRT), the mortality is still high [45, 46]. Therefore, if the potentially severe cases were identified early and effective treatments were taken to prevent the progression of those patients, more patients’ lives may be saved.

The current criteria for classifying mild cases and severe cases are mainly based on RR, oxygen saturation and PaO2/FiO2. These indicators are important but lack specificity for COVID-19. In laboratory examination of patients with COVID-19, the absolute value of peripheral white blood cells is usually normal or low, and lymphopenia is common [47]. However, in severe or non-survival patients with COVID-19, the lymphocytes count decreases progressively, while the neutrophils count gradually increases. This may be due to excessive inflammation and immune suppression caused by SARS-CoV-2 infection. On the one hand, neutrophils are generally regarded as pro-inflammatory cells with a range of antimicrobial activities, which can be triggered by virus-related inflammatory factors, such as interleukin-6 and interleukin-8 [9]. On the other hand, systematic inflammation triggered by SARS-CoV-2 significantly depresses cellular immunity, leading to a decrease in CD3 + T cells, CD4 + T cells and CD8 + T cells. In addition, SARS-CoV-2-infected T cells may also cause cytopathic effects on T cells [10, 48,49,50]. Therefore, NLR, a cost-effective marker, can be easily calculated from peripheral blood routine tests and may be associated with the progression and prognosis of COVID-19. To date, four meta-analyses have reported that patients with severe COVID-19 infection had a higher NLR than those with non-severe COVID-19 infection [51,52,53,54]. However, none of them evaluated the predictive values of NLR on disease severity and mortality.

There are several limitations in this meta-analysis. First, all but one of the studies were retrospective, meaning the data were prone to confounding factors. Second, the progression and prognosis of disease were influenced by many factors, such as age, sex and comorbidities, while we did not evaluate other factors. Finally, there was considerable heterogeneity among the included studies. Although we conducted sensitivity and subgroup analyses, the heterogeneity was not significantly decreased. That may be caused by different cut-off values, different conditions of patients or different comorbidities among the included studies. Additional high-quality studies are required to shed light on the role of NLR in the progression and prognosis of COVID-19 and find the optimal cut-off value.

Conclusions

NLR has good predictive values on disease severity and mortality in patients with COVID-19 infection. Evaluating NLR can help clinicians identify potentially severe cases early, conduct early triage and initiate effective management in time, which may reduce the overall mortality of COVID-19.