Introduction

Since the outbreak of the pandemic in January 2020, the management of coronavirus disease 2019 (COVID-19) has rapidly become a priority in all healthcare organisations worldwide [1,2,3]. COVID-19 is a systemic disease primarily affecting the respiratory system, although 30% of all infected individuals report central and peripheral neurological manifestations [1,2,3]. In this regard, various pathogenetic pathways may lead to neuronal damage, such as direct viral invasion, cytokine storm, para- or post-infectious autoimmunity, and secondary effects of severe multi-organ dysfunction [4, 5]. The introduction of ultrasensitive immunoassays has allowed the assessment of blood neuronal and glial biomarkers, as correlates of central nervous system (CNS) involvement, in large and longitudinal cohorts of primary and non-primary neurological diseases, including COVID-19 [6,7,8,9]. In particular, neurofilament light chain protein (NfL) has gained significant attention as a marker of neuroaxonal injury, given its ability to accurately track subclinical axonal pathology, monitor disease course, and predict long-term outcomes in different neurological and systemic conditions [7, 9].

Cases with major COVID-19-associated CNS manifestations, such as acute cerebrovascular events, (meningo-)encephalitis and seizures/status epilepticus, showed significantly increased cerebrospinal fluid (CSF) and blood NfL concentrations due to ongoing neuronal damage [10,11,12,13,14]. However, blood NfL increases have also been reported in COVID-19 cases with only mild-to-moderate neurological symptoms (e.g., anosmia, headache) or without specific neurological symptoms [15, 16], suggesting that subtle neuronal damage might be even more frequent and still underestimated in COVID-19. On the other hand, cases with severe COVID-19 showed a sustained NfL elevation at follow-up, possibly reflecting a persistent neuronal injury [17, 18] (Fig. 1). Most interestingly, both prospective and cross-sectional studies have demonstrated an association between NfL and unfavourable clinical outcomes, encompassing death, intensive care unit (ICU) admission, and mechanical ventilation (MV) in hospitalized COVID-19 patients [15, 17,18,19,20,21].

Fig. 1

Mechanisms leading to blood neurofilament light chain (NfL) increase in COVID-19-associated central nervous system (CNS) damage. CNS neuro-axonal injury in COVID-19 may result from the interplay between different pathophysiological mechanisms including (1) direct viral invasion, (2) pro-inflammatory cytokine release and autoimmunity, (3) secondary damage due to systemic impairment (e.g., hypoxia due to concomitant COVID-19-related pneumonia). NfL is first released into the interstitium, then drained into the cerebrospinal fluid (CSF) or passes directly through a disrupted CSF-brain barrier, and finally reaches the bloodstream. In COVID-19, a sustained blood NfL increase could also be enhanced by concomitant blood–brain barrier breakdown due to inflammatory and hypoxia-related mechanisms [7]

However, median NfL levels varied largely among different cohorts, probably because of heterogeneous inclusion criteria, such as differences in disease duration and severity, and the lack of systematic NfL value adjustment for known influencing factors (e.g., age) [6, 7, 13, 22,23,24]. In this regard, the age-adjusted NfL Z score, derived from large healthy control cohorts, estimates the deviation from normal NfL concentrations and may thus help to assess which NfL changes are clinically relevant at the individual level [7, 22]. Further, the accuracy of NfL in outcome prediction varied broadly across studies, which has limited the derivation of standardised thresholds. To overcome these limitations, we conducted an individual participant data (IPD) meta-analysis to test whether the NfL Z score in the acute phase of COVID-19 may aid prognostication in the real-life scenario of hospitalized cases with COVID-19 without major COVID-19-associated CNS manifestations.

Methods

Our IPD meta-analysis protocol followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines for IPD systematic reviews (PRISMA-IPD) [25] and was registered on the PROSPERO registry (CRD42022358924).

Search strategy and selection criteria

Six authors (MF, AA, SAR, LB, RO, MR) systematically searched MEDLINE (PubMed) and Scopus for articles published from database inception to May 23rd, 2022 addressing the role of biomarkers in predicting the outcomes of interest. An a priori search string was developed with a 3-step Delphi method to include terms for (i) NfL as a biomarker and (ii) COVID-19 as the disease of interest (Supplementary Box 1). Results were restricted to original articles in English, German, or Italian. A priori criteria for inclusion were: (1) hospitalized individuals, (2) age ≥ 18 years, (3) PCR- or radiologically based COVID-19 diagnosis (Fig. 2, Supplementary Table 1), (4) a measurement of blood NfL during the acute phase and (5) available data regarding at least one outcome among ICU admission, need of MV and death (primary outcome). We excluded cases diagnosed with major COVID-19-associated CNS manifestations [acute cerebrovascular events, (meningo-)encephalitis, seizures or status epilepticus] [2] at the time of blood sampling. We included only studies with sufficient information to calculate the age-adjusted NfL Z score (see below). Study selection was conducted on the Rayyan online platform (rayyan.ai). Titles and abstracts were screened independently. Potentially relevant articles were acquired in full text and assessed for eligibility by the same six authors working in pairs. The final selection was shared among all six authors. Disagreements were resolved by consensus.

Fig. 2

PRISMA Flow-chart

Data extraction and processing

We invited authors of the included studies to participate by providing IPD on a standardized collection tool with case definitions (Supplementary Tables 1, 2 and 3). We collected demographic information, comorbidities and COVID-19 severity according to a 4-level scoring (mild, moderate, severe and critical) adapted from the World Health Organization (WHO) criteria for the clinical management of COVID-19 [26] (Supplementary Table 2); timing (i.e., days from onset to admission and to blood collection); the NfL assay and kit used; the biological matrix (plasma or serum); and values of NfL, Horowitz index (PaO2/FiO2 ratio) and other laboratory parameters [absolute lymphocyte and neutrophil count, lactate dehydrogenase (LDH), C-reactive protein (CRP), creatinine levels] which have been described as prognostic factors in COVID-19 [27]. Submitted datasets were processed by two investigators (FC, MF) to harmonise data recording across studies in accordance with pre-defined variable types. If a contributor was unable to harmonise data with our format, we allowed them to report the original study data; these data were extracted and fully checked by two reviewers (FC, MF) with a standardised approach, then the harmonisation was shared with all investigators.

All studies included in this IPD meta-analysis received ethics approval (see section below). All subjects gave written informed consent to be enrolled in the studies included in this IPD meta-analysis. Contributors ensured that local regulatory and data sharing agreements were in place.

Bias assessment

Quality assessment was performed with the Newcastle–Ottawa scale (NOS) [28]. The NOS assesses the selection of the explored cohort and of the control cohort, the length and adequacy of observation, and the comparability of control and experimental cohorts. We summarized the assessment as low, moderate, or high according to the overall score achieved by each study.

Statistical analysis

Statistical analyses were carried out with IBM SPSS Statistics V.21 (IBM, Armonk USA), GraphPad Prism V.7 (GraphPad Software, La Jolla, California, USA) and R version 4.2.2 (R Foundation, Vienna, Austria).

As NfL correlates with age [7, 22], age-adjusted NfL Z scores were calculated using an available large reference database (n = 4532 samples from control persons) [22]. A Generalized Additive Model for Location, Scale and Shape was used to model NfL variations with age and to derive individual NfL Z scores, a continuous measure indicating how strongly (in terms of number of standard deviations) the adjusted NfL value deviates from levels in healthy controls. Plasma NfL concentrations were converted into corresponding serum values for Z score calculations according to a published equation [22]. A one-sample Wilcoxon signed-rank test was used to compare the NfL Z scores in subjects with COVID-19 to the healthy population average (Z score of 0).
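To make the idea of an age-adjusted Z score concrete, the following is a minimal sketch and not the reference model from [22]. It assumes, purely for illustration, that log-transformed NfL in healthy controls is normally distributed with an age-dependent mean and a constant standard deviation. The function names and all numeric parameters (`REF_INTERCEPT`, `REF_SLOPE_PER_YEAR`, `REF_SD`) are hypothetical placeholders, not published reference values.

```python
import math

# Hypothetical reference parameters for log(NfL in pg/mL) in healthy controls.
# Placeholders for illustration only, NOT the values of the reference database [22].
REF_INTERCEPT = 1.0
REF_SLOPE_PER_YEAR = 0.02
REF_SD = 0.4


def expected_log_nfl(age_years: float) -> float:
    """Expected log(NfL) for a healthy person of this age (toy linear model)."""
    return REF_INTERCEPT + REF_SLOPE_PER_YEAR * age_years


def nfl_z_score(nfl_pg_ml: float, age_years: float) -> float:
    """Number of reference SDs by which log(NfL) deviates from the age-expected value."""
    if nfl_pg_ml <= 0:
        raise ValueError("NfL concentration must be positive")
    return (math.log(nfl_pg_ml) - expected_log_nfl(age_years)) / REF_SD


def z_to_percentile(z: float) -> float:
    """Percentile of the standard normal distribution corresponding to a Z score."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under a standard normal reference distribution, a Z score of 0 corresponds to the 50th percentile and a Z score of about 2.33 to roughly the 99th percentile, which is why Z scores and percentiles can be reported interchangeably.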

Meta-analysis was conducted using mixed-effects modelling, with centre/study implemented as random effect. Generalized linear mixed-effects models (GLMMs) were applied to test the associations between NfL Z scores (dependent variable) and clinical features or other potential prognostic variables (sex, PaO2/FiO2, hypertension, diabetes, absolute neutrophil and lymphocyte count, creatinine, LDH, CRP) [27]. Additional GLMMs with binary clinical outcomes (ICU admission, need of MV and death) as dependent variables were fitted to explore the association between NfL Z scores and the outcome of interest after correction for relevant covariates. Given the presence of missing data, covariates that were significant at univariate analyses were included in multivariable models only if reported in at least 50% of patients with available outcome data and if obtained for at least 2 centres. For each model, we reported the estimated coefficient and/or odds ratio (OR), the associated 95% confidence interval (95%CI) and the p-value.
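The data-availability rule for covariates can be sketched as a simple filter. This is an illustrative sketch only: it assumes each participant is a dictionary with a hypothetical `centre` key and `None` for missing values, and it interprets "obtained for at least 2 centres" as "non-missing in at least 2 centres among participants with a known outcome". The prior step of testing each covariate for significance at univariate analysis is not implemented here.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Optional[Any]]


def eligible_covariates(
    records: List[Record],
    outcome: str,
    candidates: List[str],
    min_fraction: float = 0.5,
    min_centres: int = 2,
) -> List[str]:
    """Return the candidate covariates reported often enough for a multivariable model."""
    # Only participants with a known outcome count towards the denominator.
    with_outcome = [r for r in records if r.get(outcome) is not None]
    if not with_outcome:
        return []
    selected = []
    for name in candidates:
        observed = [r for r in with_outcome if r.get(name) is not None]
        centres = {r["centre"] for r in observed}
        coverage = len(observed) / len(with_outcome)
        if coverage >= min_fraction and len(centres) >= min_centres:
            selected.append(name)
    return selected
```

Applying the filter separately for each outcome reflects the fact that the set of participants with available outcome data differs between ICU admission, need of MV and death.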

The lack of a common cut-off among studies precluded the traditional bivariate models for accuracy testing. First, we used a 2-stage random-effects model integrating multiple thresholds within each study to obtain a summary receiver-operating characteristic curve (SROC) from meta-analysis of diagnostic accuracy. Further, we investigated the performance of NfL Z score in the assessment of poor outcome by calculating the area under the curve (AUC) from the ROCs derived from GLMMs. We estimated optimal thresholds through the maximized Youden index (sensitivity + specificity − 1) or through the Youden index after weighting specificity at 75, 85, and 95%, defined a priori as progressive, reasonable thresholds for prognostication, and we reported the respective sensitivity for each. Results were considered statistically significant at p < 0.05.
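The threshold selection can be illustrated on a single set of scores. The sketch below is not the 2-stage multiple-threshold random-effects model used for the SROC; it only shows how a Youden-optimal cut-off is found, and it interprets the specificity targets as a floor (only cut-offs with specificity at or above the target are considered), which is an assumption. A case is classified as positive when its score is at or above the cut-off, and both outcome classes must be present in the data.

```python
from typing import Optional, Sequence, Tuple


def sens_spec(scores: Sequence[float], poor: Sequence[bool], cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff predicts the poor outcome."""
    tp = sum(1 for s, p in zip(scores, poor) if p and s >= cutoff)
    fn = sum(1 for s, p in zip(scores, poor) if p and s < cutoff)
    tn = sum(1 for s, p in zip(scores, poor) if not p and s < cutoff)
    fp = sum(1 for s, p in zip(scores, poor) if not p and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(
    scores: Sequence[float],
    poor: Sequence[bool],
    min_specificity: float = 0.0,
) -> Optional[Tuple[float, float, float]]:
    """Cut-off maximising the Youden index among those meeting a specificity floor.

    Returns (cutoff, sensitivity, specificity), or None if no cut-off qualifies.
    Ties are resolved in favour of the lowest cut-off.
    """
    best = None
    best_j = float("-inf")
    for cutoff in sorted(set(scores)):
        sens, spec = sens_spec(scores, poor, cutoff)
        if spec < min_specificity:
            continue
        j = sens + spec - 1.0  # Youden index
        if j > best_j:
            best_j, best = j, (cutoff, sens, spec)
    return best
```

Raising `min_specificity` can only keep or lower the attainable sensitivity, which is the trade-off behind reporting a sensitivity for each pre-specified specificity level.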

Results

We identified 382 records by database searches. Seven studies reached the final stage, providing IPD for 688 hospitalized COVID-19 cases [11, 12, 19,20,21, 29, 30]. The bias assessment did not reveal substantial selection or reporting bias, with all studies being of high quality. A total of 669 participants from 7 centres (Oslo n = 26, Drammen n = 20, Milan n = 104, Uppsala n = 19, Brescia n = 332, Basel n = 26, Jacksonville n = 142) met the inclusion criteria and were included in the analysis (Figs. 2 and 3, Supplementary Table 4).

Fig. 3

Location of the seven recruiting centres providing individual participant data (IPD). Country and city names and number of participants for which IPD were analyzed are displayed in boxes. Figure created with Biorender.com

Cohort description

Demographic, clinical, and laboratory features of included cohorts are reported in Table 1 and Supplementary Tables 4 and 5. Mean age at blood sampling was 66.2 ± 15.0 years (males n = 442, 68.1%), median disease duration from symptom onset to admission was 7 (IQR: 4–9) days, and from onset to biomarker assessment was 7 (IQR: 2–13) days. Hypertension and diabetes were described in 23.9% and 12.5% of cases with available data, respectively. COVID-19 diagnosis was confirmed by RT-PCR on nasopharyngeal swab in all but 5 cases (99.3%); these 5 cases had a diagnosis of COVID-19 pneumonia based on typical radiological findings in the early phase of the pandemic [20]. Most included participants had a critical disease course (391 out of 431 with available classification, 90.7%), while 13 (3.0%) and 27 (6.3%) had moderate and severe disease, respectively. Elevated CRP and LDH values, a normal absolute neutrophil count, and a low absolute lymphocyte count were typical findings in COVID-19 (Table 1). Compared to the healthy control range (i.e., NfL Z score of 0), median NfL Z scores were higher in the included COVID-19 cases (median: 2.37; interquartile range, IQR: 1.13–3.06 referring to the 99th percentile in healthy controls, p < 0.001) (Supplementary Figs. 1 and 2). Moreover, 336/669 (50.2%) patients had NfL values above the 99th percentile of the corresponding healthy range (median percentile: 99.2, IQR: 87.0–99.9, range: 0.01–100.0). The median hospitalization time was 14 days (IQR: 6–30). During the hospital stay, 74/233 subjects (31.8%) required MV with a median MV duration of 7.5 days (IQR: 3–16). Moreover, 316 out of 609 (51.9%) were admitted to the ICU. Finally, data on survival were available for all participants and death during hospitalization occurred in 180 cases (26.9%). In-hospital death was associated with COVID-19 itself or its related complications in all subjects.

Table 1 Demographic, clinical, and laboratory features of analysed participants with COVID-19
Fig. 4

Summary receiver operating characteristic curve (SROC) for the diagnostic accuracy of NfL Z score for predicting death. Specificity 75%: optimal cut-off 2.73 (sensitivity 0.49, specificity 0.79), specificity 85%: optimal cut-off 3.06 (sensitivity 0.36, specificity 0.86), specificity 95%: optimal cut-off 4.01 (sensitivity 0.09, specificity 0.96). At maximized Youden-Index (green diamond): optimal cut-off 1.96 (sensitivity 0.78, specificity 0.59)

Associations between NfL, clinical, and laboratory variables

We found significant associations between NfL Z scores (dependent variable) and most clinical and laboratory variables in univariable analyses (Table 2). NfL Z scores were significantly associated with COVID-19 severity, with NfL Z scores approximately 1.4 units higher in critical vs. moderate cases [estimate: 1.364 (95%CI: 0.686–2.042), p = 0.0001], and correlated with disease duration (time from onset to blood sampling) [estimate: 0.040 (95%CI: 0.024–0.056), p < 0.0001] (Table 2). Moreover, NfL Z scores were approximately 0.3 units higher in participants with a history of hypertension [estimate: 0.347 (95%CI: 0.072–0.623), p = 0.014] and approximately 0.7 units higher in those with diabetes mellitus [estimate: 0.728 (95%CI: 0.380–1.077), p = 0.0001]. Among laboratory parameters, CRP [estimate: 0.002 (95%CI: 0.0006–0.004), p = 0.005], LDH [estimate: 0.001 (95%CI: 0.0003–0.002), p = 0.003] and creatinine [estimate: 0.948 (95%CI: 0.613–1.283), p < 0.001], but not absolute lymphocyte and neutrophil counts, were related to NfL Z scores. Low values of PaO2/FiO2 ratio were associated with higher NfL Z scores [estimate: −0.003 (95%CI: −0.004 to −0.001), p = 0.0003].

Table 2 Associations between NfL Z scores and other demographical, laboratory and clinical variables (univariate GLMM, centre as random effect).

Associations between NfL and clinical outcome measures

In the whole cohort, univariate GLMM analyses identified NfL Z score, sex, disease duration (days from symptom onset to blood sampling), COVID-19 severity, diabetes mellitus, LDH and PaO2/FiO2 ratio as variables associated with ICU admission in COVID-19 cases (Table 3). Given the proportion of missing data, multivariable GLMMs for each outcome included all covariates that tested significant at univariate analyses, were available in at least 50% of patients and were obtained for at least 2 centres. In the multivariable analysis, NfL Z score remained a significant independent predictor of ICU admission (Table 3). In detail, each unit increase in NfL Z score was associated with a 2.5-fold increase in the odds of ICU admission [OR: 2.50 (95%CI 1.17–5.37), p = 0.018], after correction for sex, COVID-19 severity, presence of diabetes mellitus, and disease duration.

Table 3 Associations between NfL Z score or other variables and ICU admission (recruiting centre as random effect), 609 patients with available outcome.

When the need of MV was considered as the outcome, NfL Z score, sex, disease duration, and CRP were significant predictors in the univariate GLMM (Table 4). Similarly, in the multivariable GLMM, higher NfL Z score values were significantly associated with the need of MV (Table 4). After accounting for sex and disease duration, we found a 2.6-fold increase [OR: 2.63 (95%CI: 1.79–3.87), p < 0.0001] in the odds of needing MV with each unit increase in NfL Z score.

Table 4 Associations between NfL Z score or other variables and the need of mechanical ventilation (recruiting centre as random effect), 233 patients with available outcome

Further, NfL Z score, age, days from onset to admission, the presence of diabetes mellitus, absolute lymphocyte count, CRP, LDH, creatinine and PaO2/FiO2 ratio were significantly associated with death at univariate GLMM (Table 5). After accounting for covariates, NfL Z score still had a significant negative association with survival (Table 5). Here, each unit increase in NfL Z score was associated with 1.7-fold higher odds of death, after accounting for age and the presence of diabetes mellitus [OR: 1.70 (95%CI: 1.34–2.15), p < 0.0001].

Table 5 Associations between NfL Z score or other variables and death (recruiting centre as random effect), 669 patients with available outcome.
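Because the odds ratios above are expressed per unit of NfL Z score, they can be rescaled to larger differences. The following is a minimal sketch, assuming the log-odds are linear in the Z score (as in a logistic model) and that the confidence limits can be rescaled on the log scale in the same way; the function name is hypothetical.

```python
import math
from typing import Tuple


def scale_odds_ratio(
    or_per_unit: float, ci_low: float, ci_high: float, units: float
) -> Tuple[float, float, float]:
    """Odds ratio and confidence limits for a difference of `units` in the predictor."""
    if units <= 0:
        raise ValueError("units must be positive so that the limits keep their order")
    # On the log-odds scale the effect is linear, so a k-unit difference multiplies
    # the coefficient (and its limits) by k; exponentiating gives OR ** k.
    scaled = [math.exp(units * math.log(v)) for v in (or_per_unit, ci_low, ci_high)]
    return scaled[0], scaled[1], scaled[2]
```

For example, under these assumptions the reported OR for death of 1.70 (95%CI: 1.34–2.15) per unit corresponds to an OR of about 2.89 (about 1.80–4.62) for a two-unit difference in NfL Z score. Note that this rescales odds, not probabilities.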

ROC curve analyses

SROCs for NfL Z score with mortality as primary outcome are shown in Fig. 4. After setting specificity at 75%, 85% and 95%, the optimal NfL Z score thresholds for death were 2.73 (sensitivity: 0.49, specificity: 0.79), 3.06 (sensitivity: 0.36, specificity: 0.86), and 4.01 (sensitivity: 0.09, specificity: 0.96), respectively. At maximized Youden Index, a cut-off of 1.96 yielded 78% sensitivity and 59% specificity for mortality (AUC: 0.74, 95%CI: 0.60–0.83). SROCs for need of MV (AUC: 0.80, 95%CI: 0.64–0.89) and ICU admission (AUC: 0.71, 95%CI: 0.57–0.80) also showed a fair predictive value of NfL Z score, with sensitivity being low at the a priori set specificity boundaries (Supplementary Fig. 3). In the ROC analyses derived from univariate and multivariable GLMMs, the performance of NfL Z score in discriminating participants with poor outcome from those with good outcome was good, yielding an AUC > 0.70 (Supplementary Table 6). The best accuracy was yielded by the multivariable GLMM including NfL Z score, sex, time from onset to blood sampling, COVID-19 severity and diabetes mellitus as variables in the prediction of ICU admission (AUC: 0.92, 95%CI: 0.86–0.98) (Supplementary Table 6).
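As a worked check, the Youden index at each operating point for death follows directly from the sensitivities and specificities quoted above. The sketch below only performs this arithmetic; the names are illustrative.

```python
# Operating points for death as quoted in the text: (cut-off, sensitivity, specificity).
REPORTED_POINTS = [
    (1.96, 0.78, 0.59),  # maximized Youden index
    (2.73, 0.49, 0.79),  # specificity set at 75%
    (3.06, 0.36, 0.86),  # specificity set at 85%
    (4.01, 0.09, 0.96),  # specificity set at 95%
]


def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def summarise(points):
    """Return (cut-off, J rounded to two decimals) for each operating point."""
    return [(cutoff, round(youden_index(sens, spec), 2)) for cutoff, sens, spec in points]


if __name__ == "__main__":
    for cutoff, j in summarise(REPORTED_POINTS):
        print(f"cut-off {cutoff:.2f}: J = {j:.2f}")
```

The index is 0.37 at the unconstrained cut-off of 1.96 and falls to 0.28, 0.22 and 0.05 at the cut-offs of 2.73, 3.06 and 4.01, showing how much sensitivity is given up as the required specificity increases.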

Discussion

In this IPD meta-analysis, we investigated the prognostic role of blood NfL in a large and comprehensive cohort of 669 hospitalized adult participants with COVID-19 admitted to 7 hospitals worldwide.

We showed that blood NfL values were typically elevated in hospitalized COVID-19 cases even without major COVID-19-associated CNS manifestations. In line with previous reports, NfL Z scores correlated significantly with clinical severity, as higher values were found in the most severe cases [18, 31]. The significant associations between NfL Z score and CRP or Horowitz index (PaO2/FiO2 ratio), an established measure of lung injury severity, may support the complex interplay between hypoxic injury, inflammatory response and other mechanisms that contribute to neuroaxonal damage in COVID-19 [17, 18, 32]. Taken together, our data suggest that the rise of the biomarker in blood might reflect the degree of a multifactorial neuroaxonal injury, which occurs in the acute phase, even in the absence of major CNS manifestations, and, in turn, relates to disease severity.

Most interestingly, we provide here evidence of strong relationships between higher blood NfL values and a higher likelihood of ICU admission, need of MV, and death in hospitalized COVID-19 cases. Of note, NfL remained significantly associated with unfavourable outcomes, even after adjustment for covariates. However, a very recent and large cohort study from two centres [32] found contradictory results regarding the association between blood NfL and mortality in hospitalized participants with COVID-19. Nevertheless, Smeele et al. postulated a potential relationship between faster mortality and NfL values assessed at admission but not during disease course. In line with this hypothesis, most NfL values included in our meta-analysis were obtained at admission, which may explain our finding of a strong link between the marker and survival [32].

Nevertheless, at least in differentiating survivors from non-survivors, the accuracy of NfL Z score in isolation was still insufficient to provide meaningful and unequivocal information at the cut-offs set. Indeed, despite an overall AUC of 0.74, the low sensitivity at specificity thresholds below 100% is sufficient to discourage attempts to promote the single neuronal marker as the main driver of prognosis, particularly given that COVID-19 is a complex and multisystemic disorder. On the other hand, blood NfL might well represent a complementary, rapid and robust test within a multimodal assessment to simplify the early identification of cases at higher risk.

Our results are also in line with recent findings in other critical conditions, such as hypoxic-ischemic encephalopathy after cardiac arrest (CA), sepsis-associated encephalopathy (SAE) and traumatic brain injury (TBI), in which blood NfL values were significantly associated with disease severity as well as clinical and functional outcome [7, 33,34,35,36,37]. In patients after CA, blood NfL showed a higher prognostic value than classical investigations, such as blood markers [neuron-specific enolase (NSE), and S100 calcium-binding protein b (S100b)], head computed tomography (CT) and electroencephalogram [7, 34]. Similarly, in the context of TBI, the accuracy of NfL in detecting CT or magnetic resonance imaging pathology seemed to outperform that of S100b or tau and to be similar to that of Glial Fibrillary Acidic Protein (GFAP) [37]. Nevertheless, this topic is still largely unexplored in COVID-19.

Moreover, it might also be of interest to investigate whether repeated measurements capturing trends over time might further improve the accuracy of blood NfL in predicting clinical outcomes. Indeed, Maskevar et al. [29] found a progressive increase in biomarker levels close to the point of death, while such elevation was not observed in subjects who eventually survived. Furthermore, blood NfL should be tested in multicentric cohorts of cases with major COVID-19-associated CNS manifestations, in order to assess the prognostic performance in those subjects.

In addition, our study replicates, in a large cohort of participants, the potential influence of several physiological and pathophysiological factors on blood NfL. The strong correlation between blood NfL and age is well known in the literature, and it is possibly related to ageing and age-related comorbidities [6, 7]. Therefore, the adoption of NfL Z scores instead of raw biomarker concentrations [22] added robustness to our findings, overcoming the potentially lower consistency of unadjusted analyses and considerably reducing the comparability bias. As previously described [7, 13, 24, 38], the presence of renal dysfunction, hypertension and diabetes mellitus also influenced blood NfL variability in our cohort.

The major strength of our study lies in the generalisation of findings derived from small, heterogeneous and geographically distinct cohorts to the population at large, which helps to overcome a potential single-centre bias. Further, the large proportion of patients with critical disease included in the meta-analysis might highlight the potential of NfL at admission, alongside other prognostic markers, to predict clinical outcome and appropriately escalate treatment strategies in this population at risk. As the main limitation of the study, we acknowledge that many of the variables considered for the analyses, especially laboratory parameters (e.g., creatinine levels, PaO2/FiO2 ratio, etc.), comorbidities and the need of MV as outcome, did not have complete data, which prevented models including all participants. Further, we cannot exclude that some patients may have suffered from concomitant COVID-19-related peripheral nervous system manifestations which may also increase blood NfL levels (e.g., critical illness polyneuropathy, Guillain-Barré syndrome, etc.) [7, 39]. As additional limitations, we note the lack of a comparative analysis of NfL prognostic performance with that of established clinical scales [e.g. sequential organ failure assessment (SOFA), acute physiology and chronic health evaluation II (APACHE II) scores [40], etc.] or other candidate markers in blood or olfactory mucosa (e.g. GFAP, substance P) [8, 41].

In conclusion, the present IPD meta-analysis showed that blood NfL is significantly associated with disease severity and clinical outcomes in COVID-19 cases without major COVID-19-associated CNS manifestations. The assessment of NfL together with other biological and clinical markers of COVID-19 severity may allow the creation of reliable, easily implemented scores for outcome prognostication in clinical practice.