Introduction

Infectious diseases of the central nervous system (CNS), including meningitis, encephalitis, and meningoencephalitis, carry a high disease burden. These typically acute infections cause significant morbidity and mortality, especially in children [1]. Rapid diagnosis is crucial in these medical emergencies to guide early and appropriate therapy. Unfortunately, microbiological confirmation and laboratory diagnosis remain challenging for most of these infections. Current diagnostic workflows typically rely on differential diagnosis based on patient history, clinical presentation, and imaging findings, followed by serial targeted testing methods such as PCR assays for specific pathogens. Yet these methods are inefficient, as the aetiological agent of paediatric meningitis/encephalitis (M/ME) remains unidentified in up to 40–60% of patients [2, 3]. Rapid syndromic molecular arrays based on multiplex PCR tests that include the most prevalent microorganisms causing M/ME have been developed. For instance, EliTech commercializes CNS infection panels for detecting seven prevalent viruses causing M/ME (namely, Meningitis Viral ELITe MGB® Panel for HSV1, HSV2, and VZV; Meningitis Viral 2 ELITe MGB Panel for enterovirus, parechovirus and adenovirus; and EliTech HPeV RT-PCR Test for parechovirus (types 1–6)), and BioFire Diagnostics commercializes the FilmArray Meningitis/Encephalitis Panel [4], which detects 14 pathogens, seven of them viruses: cytomegalovirus (CMV), enterovirus (EV), herpes simplex virus type 1 (HSV-1), herpes simplex virus type 2 (HSV-2), human herpes virus type 6 (HHV-6), human parechovirus and varicella-zoster virus (VZV). These panels, along with in-house multiplex real-time PCR assays developed at reference laboratories and used elsewhere [5], are valuable for diagnosing their target pathogens.
Nevertheless, broader nucleic acid amplification tests for a wider range of potential pathogens in CSF are needed, as existing tools are limited to detecting known viruses or predefined panels of viral targets [4, 6, 7, 8]. The emergence and re-emergence of numerous clinically significant viruses causing CNS infections, not included in these panels and PCRs, highlight the need for novel or unexpected viral pathogen identification [9].

Next-Generation Sequencing (NGS) technology is transforming microorganism detection and characterization. Unlike the low throughput of Sanger sequencing, NGS is a high throughput technology that sequences multiple DNA molecules in parallel, allowing the sequencing of hundreds of millions of DNA molecules at once. A significant advantage of metagenomic NGS over targeted methods is its unbiased sequencing of all nucleic acid in a clinical sample, enabling the detection of both unexpected and expected pathogens. Additionally, metagenomic NGS can detect and quantify minor subpopulations of a specific pathogen. Incorporating pan-viral hybrid capture assays to enrich sequences covering all known families of human and animal viral pathogens in the workflow addresses the primary disadvantage of lower sensitivity compared to real-time PCR [10, 11]. These approaches necessitate further clinical interpretation by virologists, clinicians, and bioinformaticians.

This study aims to enhance understanding of viruses in the CSF of paediatric patients with clinically diagnosed M/ME of unknown aetiology using viral hybrid capture shotgun sequencing (HCSS). As NGS technologies for pathogen identification continue to evolve, determining the optimal approaches for the patient-care scenarios in which these tests are likely to be conducted becomes crucial.

Patients and methods

Patients and clinical samples

Prospective recruitment of patients aged ≥ 3 months and ≤ 14 years, who were treated for M/ME of unknown aetiology at the Sant Joan de Déu University Hospital (Barcelona), a renowned centre in paediatrics, was conducted from May 2021 to July 2022. All patients had previously tested negative in CSF using routine microbiological methods (bacterial cultures and FilmArray ME Panel (Biomérieux, Marcy-l'Étoile, France)). The FilmArray ME Panel tests for specific DNA/RNA fragments of: Escherichia coli K1, Haemophilus influenzae, Listeria monocytogenes, Neisseria meningitidis, Streptococcus agalactiae, Streptococcus pneumoniae, CMV, enterovirus, HSV-1, HSV-2, HHV-6, human parechovirus, VZV and Cryptococcus neoformans/gattii.

Inclusion criteria were as follows: patients whose parents or guardians agreed to participate in the study under informed consent and who had surplus CSF sample available (at least 200 μl). Diagnostic criteria for meningitis included fever, headache, neck stiffness or bulging fontanelle, with or without altered mental status, and pleocytosis in CSF (CSF white cell count > 5/mm3) [12]. Encephalitis diagnostic criteria included altered mental status (defined as decreased or altered level of consciousness, lethargy, or personality change) lasting ≥ 24 h with no alternative identified cause (major criterion) and at least two minor criteria (documented fever ≥ 38 °C within 72 h before or after presentation; generalized or partial seizures not fully attributable to a pre-existing seizure disorder; new onset of focal neurologic findings; CSF white cell count ≥ 5/mm3; abnormality of brain parenchyma on neuroimaging suggestive of encephalitis that is either new from prior studies or appears acute in onset; abnormality on electroencephalography consistent with encephalitis and not attributable to another cause) [13]. Exclusion criteria included lack of informed consent or inability to obtain the minimum surplus volume of CSF sample for HCSS analysis.
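The encephalitis case definition above is a simple rule: the major criterion plus at least two of the six minor criteria. The short Python sketch below illustrates that rule only; the class, field and function names are hypothetical and are not part of the study protocol.

```python
from dataclasses import dataclass


@dataclass
class EncephalitisFindings:
    """Hypothetical record of the criteria listed in the text."""
    altered_mental_status_24h: bool      # major criterion, no alternative cause
    fever_38c_within_72h: bool           # minor criteria follow
    new_seizures: bool
    new_focal_neurologic_findings: bool
    csf_wbc_per_mm3: float
    neuroimaging_abnormality: bool
    eeg_abnormality: bool


def meets_encephalitis_criteria(f: EncephalitisFindings) -> bool:
    """Return True when the major criterion and at least two minor criteria are met."""
    minor = [
        f.fever_38c_within_72h,
        f.new_seizures,
        f.new_focal_neurologic_findings,
        f.csf_wbc_per_mm3 >= 5,          # CSF white cell count >= 5/mm3
        f.neuroimaging_abnormality,
        f.eeg_abnormality,
    ]
    return f.altered_mental_status_24h and sum(minor) >= 2
```

A record with altered mental status, fever and a CSF white cell count of 12/mm3 would satisfy the rule, whereas the same record with a count of 2/mm3 would not.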

Shotgun sequencing

Sequencing was performed using a pan-viral (DNA and RNA viruses) metagenomic approach as previously reported [5, 14].

Sample processing

Total nucleic acid was extracted from 200 µl of cerebrospinal fluid using the QIAamp MinElute Virus Spin Kit (Qiagen, Hilden, Germany) without carrier RNA, eluted in 30 µl of nuclease-free water, aliquoted and stored at -80 °C until further processing. One microlitre was used to quantify RNA using the QuantiFluor® RNA System (Promega, Madison, WI, USA). A CSF sample previously positive for enterovirus by PCR was processed through all steps and served as a positive control. Four negative controls (HyClone™ HyPure Molecular Biology Grade Water), one for each round of the extraction procedure, were used. These controls underwent the entire shotgun sequencing protocol described herein.

Library preparation

Previous studies carried out in our laboratory [5, 14] indicated that active infections caused by DNA viruses were detectable through an RNA shotgun sequencing protocol. This approach allows for the identification of viral gene expression and the precise determination of the viral pathogen responsible for the active infection.

Libraries were constructed using the NEBNext® Ultra II Directional RNA Library Prep Kit for Illumina® (New England Biolabs, Ipswich, MA, USA), following the manufacturer’s protocol. The initial amount of RNA was 1 ng/μl in a final volume of 5 μl for the CSF samples; the fragmentation time was 13 min at 94 °C, the dilution of the NEBNext-adaptor was 1:25; and the number of cycles in the amplification step was 13. In addition to the above-described RNA Library Prep Kit, NEBNext Multiplex Oligos for Illumina®- Index Primer Set 1 (New England Biolabs, Ipswich, MA, USA) and AMPure XP SPRI Reagent by Beckman Coulter (Life Sciences Division Headquarters, Indianapolis, IN, USA) were used to complete the library construction. Nucleic acid was quantified by QuantiFluor® dsDNA System (Promega, Madison, WI, USA), and quality (integrity and size of libraries) was verified by Bioanalyzer High Sensitivity DNA Analysis System (Agilent Technologies, Inc., Santa Clara, CA, USA). Samples underwent further processing when they met the manufacturer’s recommended criteria in terms of quality (300–500 base pairs) and quantity (at least 1 ng/μl).

Viral nucleic acid enrichment by hybrid capture

Twist Target Enrichment Standard Hybridization v2 (Twist Bioscience, South San Francisco, CA, USA) was used following the manufacturer’s protocol. The Twist Target Enrichment protocol was used to generate viral-enriched DNA libraries for sequencing on Illumina next-generation sequencing (NGS) systems. This method is based on hybridization probes and covers reference sequences for 3,153 viruses, including 15,488 different strains. Individual libraries were combined by equal mass (33.33 ng per library) into six capture enrichment pools. Library pools were processed at high drying speed using the Savant SpeedVac DNA130 Vacuum Concentrator (Thermo Fisher Scientific, Waltham, MA, USA). Then, the library pools were resuspended and hybridized with the custom biotin-labelled probe panel for a minimum of 16 h, and then hybrids were captured using streptavidin according to the manufacturer’s protocol. In the amplification step, the maximum number of cycles (× 15) allowed by the protocol was used.
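Pooling by equal mass means that the volume taken from each library depends on its measured concentration (volume = mass / concentration). The Python sketch below shows this arithmetic for the 33.33 ng per library stated above; the function name and the example concentrations are hypothetical and are not taken from the study data.

```python
from typing import Dict

TARGET_MASS_NG = 33.33  # mass per library stated in the text


def pooling_volumes(conc_ng_per_ul: Dict[str, float],
                    target_mass_ng: float = TARGET_MASS_NG) -> Dict[str, float]:
    """Volume (µl) of each library that contains `target_mass_ng` of DNA."""
    volumes = {}
    for name, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has a non-positive concentration")
        volumes[name] = target_mass_ng / conc
    return volumes


# Hypothetical concentrations (ng/µl) for three libraries in one capture pool.
example = {"lib_A": 10.0, "lib_B": 4.0, "lib_C": 33.33}
for name, vol in pooling_volumes(example).items():
    print(f"{name}: {vol:.2f} µl")
```

More dilute libraries contribute a larger volume, so that every library contributes the same mass to the capture pool.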

Sequencing

DNA from hybrid capture libraries was quantified with the QuantiFluor dsDNA System (Promega, Madison, WI, USA) and quality verified for sequencing by the Bioanalyzer High Sensitivity DNA Analysis System (Agilent Technologies, Inc., Santa Clara, CA, USA). Shotgun sequencing of the captured libraries was performed on a NovaSeq 6000 Illumina platform (Illumina Inc., San Diego, CA, USA) using the NovaSeq 6000 Standard S2 Reagent Kit v1.5 (300 cycles, with 150 cycles per strand in a paired-end format).

Data analysis and identification of virus

Sequencing raw data were processed and analysed using the Chan-Zuckerberg ID (CZ ID) metagenomics pipeline (https://czid.org/). We applied a background model using four negative controls. Viral discovery was performed with the PikaVirus pipeline (developed at the Bioinformatics Unit, National Centre for Microbiology, Spain, https://github.com/BU-ISCIII/PikaVirus). In both pipelines, the analysis involved multiple steps, including quality control, removal of human and non-relevant sequences, assembly of viral genomes or contigs, and the identification of viral sequences through sequence alignment and comparison to known viral databases.

Viral PCR confirmation

Viral pathogens detected by shotgun sequencing were confirmed using specific PCR tests, when available. Enterovirus and parechovirus were confirmed through multiplex real-time PCR as previously described by Cabrerizo et al. [15]. Similarly, multiplex real-time PCR was used for EBV, CMV, and HHV-7 as previously described by Recio et al. [16]. The detection of HERV-K113 was confirmed via a PCR assay as detailed by Moyes et al. [17]. For rotavirus, PCR methods described by Mijatovic-Rustempasic et al. [18], along with the Allplex™ GI-Virus Assay (Seegene Inc., Seoul, South Korea), were utilised. Influenza A virus was identified by PCR as described by Ruiz-Carrascoso et al. [19]. HSV-1 and VZV were confirmed through multiplex RT real-time PCR as previously described by Castellot et al. [5]. Lastly, the presence of human polyomavirus (BKV and JCV) was established by a real-time PCR assay as described by Bárcena-Panero et al. [20].

Statistical analysis

Comparisons of categorical data were conducted using the Pearson chi-square test or the Fisher exact test. For continuous variables that were not normally distributed, the Mann–Whitney U test was used. A p-value of less than 0.05 was considered statistically significant. Statistical analysis was performed with SPSS v22.0 software (IBM Corp, Armonk, NY, USA).
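For readers who wish to reproduce a 2×2 comparison outside SPSS, the Fisher exact test can be computed directly from the hypergeometric distribution. The following Python sketch is a minimal, standard-library illustration of the two-sided test; it is not the SPSS implementation used in this study, and the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))


# Hypothetical counts: rows = detection / no detection,
# columns = outcome present / outcome absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

The two-sided p-value is obtained here by summing the probabilities of all tables that are no more likely than the observed one, which is one common convention for this test.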

Results

Patients’ main characteristics

During the study, 48 out of 55 episodes of M/ME had no aetiological diagnosis following routine clinical tests. Out of these, 40 had sufficient CSF sample for NGS analysis and were included in the study. One patient developed two different episodes. The median age of the patients was 3.6 years (IQR: 1.3–6.6), with 25 (62%) being male. Twenty (50%) had a previously known condition, primarily oncologic-haematologic diseases, autistic spectrum disorders, and other chronic neurological disabilities. One patient was tested with two different samples; therefore 41 samples were analysed.

Hybrid capture shotgun sequencing

The forty-one CSF samples, along with one enterovirus-positive CSF sample used as a positive control and four negative controls, were shotgun sequenced and processed using the CZ ID metagenomics pipeline v8.2. The average number of reads per sample was 43.47 million, with an average of 13.18 million reads per sample passing the host and low-quality filters. Significant detections were obtained in 22 CSF samples using a background model computed from the four negative controls in the CZ ID pipeline. A Z-score of 100 indicates that the virus was not detected in the negative controls. HERV-K113 was detected in 13 CSF samples by the PikaVirus pipeline, of which 9 showed co-detection with other viruses. Notably, the CZ ID pipeline did not detect HERV-K113 in any sample, presumably because this virus is considered part of the human genome. Significant results from the CZ ID metagenomics pipeline v8.2 are shown in Table 1. Virus detection was further confirmed by specific PCR in three HHV-7 cases, one BKV case, and one EBV case.

Table 1 Significant results and metagenomic data following background model generation in the CZ ID metagenomics pipeline

Other viruses were detected in 17 CSF samples. However, these hits were not considered reliable results because they were also present in the negative controls, resulting in Z-scores far from 100 according to the CZ ID background model. Nonetheless, PCR testing was conducted when available, revealing the detection of human papillomavirus 115 in one CSF sample and additional HERV-K113 detections in four of these samples. Data from these non-significant detections are shown in Table 2.

Table 2 Non-significant results and metagenomic data following background model generation in the CZ ID metagenomics pipeline

Clinical characteristics

The main clinical characteristics of the patients are summarized in Table 3. When considering only significant results from the CZ ID HCSS analysis, there were no differences in terms of age, sex, or previously known conditions between individuals with a detection and those without any detection. Similarly, no differences were observed in the main clinical symptoms, CSF characteristics, or blood characteristics. However, patients requiring admission to the paediatric intensive care unit (PICU) had a higher rate of positive CZ ID HCSS results compared to those who did not require PICU admission. The length of hospital stay and the incidence of sequelae were similar between the two groups.

Table 3 Main clinical characteristics and differences between patients with and without significant CZ ID HCSS detections

The specific clinical characteristics of each patient with HCSS detections are summarized in Table S1 of the supplementary information file.

Discussion

HCSS proved to be a successful approach for detecting both RNA and DNA viruses in CSF from children with meningoencephalitis of unknown aetiology. Compared to the standard-of-care screening using the FilmArray Meningitis/Encephalitis (FA/ME) panel, HCSS achieved additional significant viral detection in 30 cases, some involving viruses not included in the FA/ME panel. These detections included six cases of parechovirus A, three of enterovirus ACD, three of HHV-7, two of BK virus, four of polyomavirus 5, one of HSV-1, one of VZV, two of CMV, one of EBV, one of influenza A virus, and one of rhinovirus, all obtained using the CZ ID pipeline. In addition, 13 detections of HERV-K113 were found using the PikaVirus pipeline. Of all these, one sample with BKV, three with HHV-7, one with EBV, and all with HERV-K113 were further confirmed by PCR.

The rare positive identifications through HCSS, in the absence of specific PCR confirmation, were challenging to explain. We introduced negative controls at the outset of each round of nucleic acid extraction as a standard procedure for contamination control. These negative controls underwent the same processing steps as our experimental samples, from nucleic acid extraction to sequencing. Containing no target DNA or RNA, they served as a baseline for detecting any contamination that might occur during sample handling and processing or arise from kits, reagents and instrumentation. Furthermore, the CZ ID pipeline expressed the normalized results of each sample in terms of reads per million (rPM), Z and Z-score metrics, which were highly valuable for interpreting results and making them readily comparable to data generated across the scientific community. The Z-score metric in the CZ ID sample report is based on the prevalence of each virus in the selected negative control samples, with which the CZ ID pipeline built the background model. It uses rPM as the metric of relative abundance, and rPM values are normalized for sequencing depth. In analysing results from a particular sample, the Z-score metric can be used to provide insight into whether a particular virus was present in the negative control samples. A virus present at a higher abundance in the sample than in the controls will have a Z-score > 1. If a virus is not found in the set of control samples, the Z-score is set to 100, and if the virus is not found in the sample but is present in the controls, the Z-score is set to -100.
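The background-model logic described above can be illustrated with a short calculation: rPM normalizes read counts for sequencing depth, and the Z-score compares the sample rPM with the mean and standard deviation of the rPM values in the negative controls, z = (rPM_sample - mean_controls) / sd_controls. The Python sketch below is a simplified illustration and not the CZ ID implementation; the function names are hypothetical, and the use of the population standard deviation, the clamping of computed values to ±99 and the handling of a zero standard deviation are assumptions made here for the example.

```python
from statistics import mean, pstdev
from typing import Sequence


def reads_per_million(taxon_reads: int, total_reads: int) -> float:
    """Normalize a read count for sequencing depth."""
    return taxon_reads / total_reads * 1_000_000


def background_z_score(sample_rpm: float, control_rpms: Sequence[float]) -> float:
    """Z-score of a sample rPM against negative-control rPM values."""
    in_controls = any(r > 0 for r in control_rpms)
    if sample_rpm <= 0 and not in_controls:
        # Assumption: a virus absent everywhere is simply not reported.
        raise ValueError("virus absent from both sample and controls")
    if not in_controls:
        return 100.0    # not found in any control (convention from the text)
    if sample_rpm <= 0:
        return -100.0   # found in controls but not in the sample
    mu = mean(control_rpms)
    sd = pstdev(control_rpms)   # assumption: population standard deviation
    if sd == 0:
        # Assumption: identical control values, so only the sign is informative.
        return 99.0 if sample_rpm > mu else (-99.0 if sample_rpm < mu else 0.0)
    z = (sample_rpm - mu) / sd
    return max(-99.0, min(99.0, z))   # assumption: keep +/-100 as special values
```

For instance, with hypothetical control rPM values of 1, 3, 1 and 3 (mean 2, standard deviation 1), a sample rPM of 4 gives a Z-score of 2, whereas the same sample rPM against controls with no reads for that virus gives 100.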

Regarding herpesviruses, establishing HHV-7, EBV, and CMV as the cause of encephalitis is often challenging because viral nucleic acids can also be detected during latency and asymptomatic viral reactivation. Interestingly, three cases of HHV-7 were detected by shotgun sequencing and further confirmed by PCR, with co-detection of parechovirus A and human polyomavirus 5 in two cases, respectively. Ideally, a cohort of healthy children would assist with interpreting results, but obtaining cerebrospinal fluid from healthy children is legally challenging. Using patient groups with non-infectious diseases may lead to misinterpretations, since herpesviruses are known to reactivate during inflammatory processes. Since parechovirus A is a common cause of neurological disease in young children, it is reasonable to consider this virus as the cause of encephalitis, even though only HHV-7 was further confirmed by PCR in this CSF. Most neurological diagnostics rely on molecular tests for specific pathogens, making it difficult to identify multi-viral infections. However, when a metagenomics approach was used, some co-infections were observed [20]. Single case reports and case series have previously described HHV-7-related encephalitis or encephalopathy in both immunocompetent and immunocompromised children and adults [21, 22]. The detection of HHV-7 DNA by shotgun sequencing, further confirmed by PCR in the CSF of the patient with neurological disease, alongside the exclusion of all alternative aetiological causes (in accordance with the clinical practice guidelines of the Infectious Diseases Society of America [23]), supports HHV-7 as a possible cause of the encephalitis. This is based on the criteria established by Schwartz et al. [24]. Nevertheless, HHV-7 is highly prevalent and has been detected in normal brain tissue [25]. Clinical judgment is crucial in determining the clinical significance of HHV-7 detection in the CSF.
In a multicentre prospective surveillance study of viral agents causing meningoencephalitis, HHV-7 and EBV were found in 10% and 6% of cases, respectively [26]. Again, the role of EBV as a cause of CNS disease must be deduced from its clinical setting.

We detected one case of HSV-1 and another of VZV, which could not be confirmed by PCR due to insufficient clinical samples and were not previously detected with the FA/ME test. A recent study conducting diagnostic test accuracy meta-analysis (including sensitivity and subgroup analyses) reported suboptimal sensitivities for HSV-1, concluding that the FA/ME test is excellent for ruling in but limited for ruling out CNS infections [27]. Nonetheless, the patient improved with only 48 h of specific treatment and showed no typical radiological signs of herpetic encephalitis. We also detected CMV, which is one of the most clinically recognized viral pathogens in immunocompromised patients, in two cases not confirmed by specific PCR, possibly due to low viral load as suggested by the low rPM parameter from the CZ ID pipeline. For genomic detection of low virus concentrations, HCSS has proven more sensitive compared to untargeted shotgun sequencing [28].

Regarding enterovirus and parechovirus, we detected three and six cases, respectively. The species Parechovirus A currently consists of 19 different human parechovirus (HPeV) types, 1 to 19. Parechovirus A and enteroviruses (EV) are common viral causes of meningoencephalitis in children [29, 30]. However, the FA/ME assay failed to detect them in the CSF samples, possibly due to the lower sensitivity of PCR compared to HCSS or because the PCR primers used in the initial screening lacked the sensitivity and capacity to amplify EV sequences of all known genotypes [31]. The recently available EliTech HPeV RT-PCR Test [6] increases the chances of HPeV detection, being able to detect up to six different types. However, reduced sensitivity due to reagent competition and lack of flexibility to modify the panel are major obstacles to the application of multiplex CNS pathogen detection panels in clinical laboratories [32].

Regarding polyomavirus, we detected BKV and Merkel-cell polyomavirus (MCPyV; Human polyomavirus 5) in two and four samples, respectively. BKV is common in kidney disease of immunocompromised patients but rarely reported in neurological infection. However, in a study of 2,062 CSF samples from neurological patients, BKV was detected by specific PCR in 20 patients diagnosed with progressive multifocal leukoencephalopathy or multiple sclerosis [33]. MCPyV infection is usually asymptomatic, but it can cause life-threatening pathologies such as Merkel cell carcinoma in immunocompromised hosts. MCPyV was detected in 22 neurological patients with acute ME by PCR elsewhere, but likely as a bystander rather than as an aetiological agent [34].

The presence and confirmation of HERV-K113 by specific PCR is relevant. HERVs are genetic elements in the human genome that originated from ancient retroviral germline infections [35]. Of note, HERV-K113 is the only endogenous retrovirus known to produce viral infective particles [36] and is associated with certain autoimmune diseases [17]. The clinical implications of this retrovirus in numerous pathologies are still debated, but a study by Moyes et al. [17] found its prevalence significantly higher in multiple sclerosis and Sjögren's syndrome.

Rhinovirus is a common respiratory pathogen in children year-round; however, its CNS involvement is extremely rare, with few reported cases. Current literature does not definitively identify this virus as a causative pathogen. Furthermore, it has been associated with cerebellitis [37], a condition absent in our case. Although rhinovirus is not typically tested in cerebrospinal fluid (CSF), it belongs to the genus Enterovirus (family Picornaviridae), and most PCR tests cannot distinguish it from other enteroviruses. Nervous system injury associated with influenza is a leading cause of influenza-related child deaths, with fatality rates up to 30% [38]. Neurological symptoms of brain injury typically appear on the same day or within days after cold symptom onset, commonly involving convulsions and consciousness alterations. Common types of nervous system injury associated with influenza include influenza-associated encephalopathy (IAE) [39], Reye’s syndrome, Guillain-Barré syndrome, haemorrhagic shock encephalopathy syndrome [40], and acute necrotizing encephalopathy (ANE) [39], with ANE being the most severe [41]. The relevance of the influenza A detection in our case, which was not linked to respiratory symptoms, remains uncertain.

As discussed, using a background model in the CZ ID pipeline facilitated results interpretation, distinguishing contamination or artifacts. Human coronavirus HCoV-NL63, HCoV-229E, HCoV-HKU1, and HCoV-OC43 account for 15%–30% of common cold cases [42]. Human coronavirus 229E causes common colds and self-limiting respiratory infections. However, the aetiologic role of HCoVs in neurological diseases remains unproven [43]. Anelloviruses, such as Alphatorquevirus (TTV), are ubiquitous in humans and are nearly constantly shed by infants [44], indicating early-life infections. TTV is recognized as the main component of the human viral flora [45]. While they are considered non-pathogenic, Anelloviridae might be associated with the occurrence of some disorders, including respiratory disorders in childhood [46]. Though exceptional Torque teno virus cases have been described in the literature [47], it is usually considered a contaminant or bioinformatics artifact. Few laboratory reagents appear to be entirely free from contamination, particularly by ssDNA viruses, and predominantly circoviruses [48,49,50,51,52,53].

This study’s findings may have significant implications for patient management, treatment strategies, and public health interventions. However, limitations include the necessity of meticulous sample collection, storage, and processing to enhance the accuracy of viral pathogen detection. Enrichment procedures, like the hybrid capture used here, can increase the percentage of viral nucleic acid relative to host nucleic acid. Yet, the choice of method greatly affects results. For instance, depleting methylated DNA to eliminate human DNA can also inadvertently remove viral DNA from viruses that rely on human machinery, such as Epstein-Barr virus (EBV). Utilising appropriate positive and negative controls is crucial for adjusting the assay threshold and evaluating the plausibility of identified pathogens. The methodology used in this study enabled detecting expected and unexpected pathogens, requiring further clinical interpretation by virologists, bioinformaticians, and the judgment of experienced physicians.