Background

Tuberculosis (TB) is a leading cause of death in children [1]. Calculating accurate mortality rates in children is difficult since many cases are never diagnosed or reported [2, 3]. Microbiological diagnosis of TB enables confirmation of disease and initiation of appropriate treatment, including treatment for drug-resistant TB when indicated, through detection of resistance to antimicrobial agents. However, this is challenging in children because they often have paucibacillary disease, and most young children cannot voluntarily produce good quality sputum specimens, the standard sample collected in adults [4]. Underdiagnosis is therefore common, with most pediatric TB deaths occurring in those who did not receive treatment [5].

The World Health Organization (WHO) has recommended induced sputum, gastric aspirate (GA), stool, and nasopharyngeal aspirate (NPA) as alternative samples for diagnosing pediatric TB [6]. Sputum induction requires electricity and equipment for the nebulization [7] and a well-ventilated area with adequate infection control measures to mitigate the transmission risk [8]. Overnight fasting is needed for good-quality GA samples, often necessitating hospital admission [6]. Sputum induction and gastric aspiration can thus be challenging to implement at lower-level health facilities due to operational and resource limitations, including adequately trained staff [9]. Whilst stool collection is non-invasive, stool can rarely be passed on demand, and there is a potential for invalid results or errors using molecular detection techniques [10].

Nasopharyngeal aspiration involves inserting a small catheter into the nasopharynx to stimulate a cough reflex, with aspiration of secretions into a mucus trap [11]. It does not require hospital admission like GA and has fewer transmission risks than the collection of induced sputum [7]. Although trained personnel and equipment are still needed, results from a large randomized trial found that 97% of children with symptoms of pneumonia had an NPA successfully obtained. In comparison, only 81% of children had stool collected [12]. NPA collection has the potential to be implemented across varying levels of the healthcare system, thereby increasing access to TB diagnosis. However, further information on its diagnostic yield using existing TB diagnostic tools is needed.

We conducted a systematic review and meta-analysis on detecting Mycobacterium tuberculosis (Mtb) using culture or nucleic acid amplification testing (NAAT) on NPA from children evaluated for pulmonary TB (PTB). Our primary aim was to estimate the proportion of children diagnosed by NPA compared to a microbiological reference standard (MRS) and, where available, compared to a composite or clinical reference standard (CRS). As secondary aims, we estimated the incremental yield of two NPA samples compared to one and summarized information on operational aspects of NPA collection and processing. To our knowledge, this is the first systematic review focusing on both the diagnostic yield and operational aspects of NPA for pediatric TB.

Methods

This systematic review was reported according to the PRISMA diagnostic test accuracy (DTA) guidelines [13]. The PRISMA checklist is available in Additional file 1.

Protocol and registration

The protocol for this systematic review is registered at PROSPERO — CRD42021283965 (https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=283965).

Search strategy

We conducted a systematic search of PubMed, Embase, and the Cochrane Library for studies published up to 24 November 2022, with no other time limits. The search strategy was constructed with a medical librarian and incorporated text words and database subject headings related to the index specimen (“nasopharyngeal aspirate”) and the target condition (“tuberculosis”). Complete search strategies for each database are presented in the supplementary material (Additional file 2). We also checked reference lists of included studies and review articles. For unpublished or ongoing studies, we searched ClinicalTrials.gov and the WHO International Clinical Trials Registry Platform and contacted study authors when potentially eligible unpublished studies were identified.

Eligibility criteria

We included studies that reported the number of participants under 18 years with presumed PTB and the number that was diagnosed using culture or NAAT on NPA in comparison to an appropriate MRS, irrespective of HIV status, previous TB testing, or anti-TB treatment of any duration. Original data studies written in English, French, Italian, Portuguese, German, and Dutch, utilizing any study design or enrolment timing and evaluating fresh or banked specimens, were eligible. We excluded conference proceedings, editorials, reviews, and studies using mixed adult and pediatric populations, unless they reported accuracy results for children separately. We also excluded studies if data were available only on a per-specimen basis rather than on a per-child basis, which we deemed more meaningful for clinical practice, where usually multiple tests and sample types per child are used to diagnose TB.

Study screening and selection

After removing duplicates, two reviewers (N. K. and E. B.) independently screened titles and abstracts per eligibility criteria, followed by full-text review for inclusion in the systematic review. Any disagreement was resolved through discussion with a third reviewer (L. O.).

Data extraction

We designed an Excel data extraction form and piloted it on two studies, after which the form was optimized and used for all selected full-text articles. Two reviewers (N. K. and E. B.) independently extracted data for the diagnostic yield of NPA culture or NAAT compared to the MRS as defined below and, where available, a CRS. We also collected information on study characteristics and population and data on NPA sample collection and processing for a post hoc analysis on operational aspects of NPA. Disagreements were discussed until consensus was reached. We contacted study investigators regarding missing data and clarification and stratification of diagnostic performance, if needed.

Quality assessment

Two reviewers (N. K. and E. B.) independently assessed the methodological quality of included studies using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) framework [14]. The adapted tool was first piloted with two studies. Discrepancies were resolved by discussion between N. K. and E. B., with a third reviewer (L. O.) consulted if needed. The QUADAS-2 tool with signaling questions tailored to this review and justification for assigning levels of bias is included in the supplementary material (Additional file 3).

Reference standards

We defined the MRS as mycobacterial culture and/or a WHO-endorsed NAAT on any clinical specimen for diagnosing childhood PTB, including induced sputum, GA, NPA, stool, string test, expectorated sputum, and bronchoalveolar lavage as per international case definitions for pediatric intrathoracic TB [15]. Children who were MRS positive were defined as having confirmed TB. WHO-endorsed NAATs include Xpert MTB/RIF (Xpert) and Xpert MTB/RIF Ultra (Ultra) (Cepheid, USA), Truenat MTB (Molbio, India), and moderate complexity automated NAATs [16]. Since inclusion of positive TB cases by NPA in the MRS could overestimate the diagnostic yield, we also defined a modified MRS where NPA was not included. We anticipated definitions of the CRS to be heterogeneous across studies and used the definitions in original publications. CRS in studies included children with confirmed TB and children with clinically diagnosed TB based on symptoms and signs, radiological changes, exposure history, immunological evidence, and treatment response (unconfirmed TB) [15].
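The per-child logic of the MRS and the modified MRS can be sketched as follows. This is an illustrative sketch only, not the review's extraction tool: the class name `SpecimenResult`, the function `is_mrs_positive`, and the test labels in `REFERENCE_TESTS` are hypothetical, and the label set is simplified (moderate complexity automated NAATs are omitted).

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical, simplified labels for culture and WHO-endorsed NAATs.
REFERENCE_TESTS = frozenset({"culture", "xpert", "ultra", "truenat"})


@dataclass(frozen=True)
class SpecimenResult:
    specimen: str   # e.g. "NPA", "GA", "stool", "induced sputum"
    test: str       # one of the labels in REFERENCE_TESTS
    positive: bool  # True if Mtb was detected


def is_mrs_positive(results: Iterable[SpecimenResult],
                    exclude_npa: bool = False) -> bool:
    """Return True if any reference test on any specimen is positive.

    With exclude_npa=True, NPA results are ignored, which corresponds
    to the modified MRS described in the text.
    """
    for r in results:
        if exclude_npa and r.specimen == "NPA":
            continue
        if r.test in REFERENCE_TESTS and r.positive:
            return True
    return False


# Hypothetical child whose only positive result comes from NPA.
child = [
    SpecimenResult("NPA", "xpert", True),
    SpecimenResult("GA", "culture", False),
]
print(is_mrs_positive(child))                    # MRS including NPA
print(is_mrs_positive(child, exclude_npa=True))  # modified MRS
```

A child confirmed by NPA alone counts in the MRS but drops out of the modified MRS, which is why the two denominators can differ.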

Data synthesis and statistical analysis

For the primary objective, we calculated the diagnostic yield of NPA with 95% confidence intervals (CI) for individual studies. We defined this as the proportion of children diagnosed with PTB using either culture or NAAT on NPA compared to the number of children positive by MRS (confirmed TB) and, where available, to the number of children positive by CRS (confirmed + unconfirmed TB). Diagnostic yield was based on one NPA sample. In studies evaluating multiple NPA specimens, the first NPA sample was used. Secondarily, to assess the incremental yield of a second NPA specimen versus the MRS, we included studies where data could be extracted for both the first and second NPA samples.
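As a worked illustration of the per-study calculation, the sketch below computes a diagnostic yield and a 95% CI from two counts. The text does not state which interval method was used, so the Wilson score interval here is an assumption, and the function name `diagnostic_yield` and the example counts are hypothetical.

```python
from math import sqrt


def diagnostic_yield(npa_positive: int, reference_positive: int,
                     z: float = 1.96):
    """Return (yield, lower, upper) as proportions.

    The interval is a Wilson score interval (an assumed choice; the
    source does not name the CI method).
    """
    if reference_positive <= 0:
        raise ValueError("reference_positive must be greater than 0")
    if not 0 <= npa_positive <= reference_positive:
        raise ValueError("npa_positive must lie between 0 and reference_positive")
    n = reference_positive
    p = npa_positive / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: 12 of 20 MRS-positive children detected on one NPA.
y, lower, upper = diagnostic_yield(12, 20)
print(f"yield {y:.0%} (95% CI {lower:.0%} to {upper:.0%})")
```

The same function applies to the CRS comparison by passing the number of CRS-positive children (confirmed plus unconfirmed TB) as the denominator.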

We performed meta-analyses to estimate the pooled diagnostic yield for culture or NAAT on one NPA with univariate random-effect hierarchical models. All studies were included irrespective of the risk of bias. In a prespecified sensitivity analysis, we calculated the pooled diagnostic yield after excluding studies with a high or unclear risk of bias for the reference standard. This was used as a proxy for the quality of the study. Observed proportions from individual studies were transformed to a natural logarithm scale to account for skewed data and extreme proportions. Results from individual studies and summary estimates were presented in forest plots, with the I² statistic (95% CI) used to quantify between-study heterogeneity. To explore sources of heterogeneity, we conducted sub-analyses stratified by HIV status and age. All analyses were conducted using the “metafor” and “meta” packages in R version 4.2.2 [17, 18].
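To make the pooling step concrete, the sketch below pools log-transformed proportions with a random-effects model and reports I². It is a simplified stand-in, not the analysis code: the review used the R packages named above, whereas this pure-Python version assumes the DerSimonian-Laird estimator of between-study variance, which may differ from the hierarchical model actually fitted. The function name `pool_log_proportions` and all counts are hypothetical.

```python
from math import exp, log, sqrt


def pool_log_proportions(events, totals, z: float = 1.96):
    """Random-effects pooling of proportions on the natural log scale.

    Uses the DerSimonian-Laird estimator (an assumed, simplified choice).
    Each study needs 0 < events < total; studies with 0% or 100% yield
    would need a continuity correction, which is not handled here.
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("need at least two studies with matching counts")
    for e, n in zip(events, totals):
        if not 0 < e < n:
            raise ValueError("each study needs 0 < events < total")

    y = [log(e / n) for e, n in zip(events, totals)]   # log proportions
    v = [1 / e - 1 / n for e, n in zip(events, totals)]  # delta-method variances
    w = [1 / vi for vi in v]                           # inverse-variance weights

    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                      # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = sqrt(1 / sum(w_re))
    return {
        "pooled": exp(mu),
        "lower": exp(mu - z * se),
        "upper": min(1.0, exp(mu + z * se)),  # a proportion cannot exceed 1
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical counts: NPA-positive children / MRS-positive children per study.
result = pool_log_proportions([5, 30, 12], [30, 40, 25])
print({k: round(val, 3) for k, val in result.items()})
```

Back-transforming with `exp` returns the pooled estimate and its interval to the proportion scale, and I² expresses the share of total variation attributed to between-study heterogeneity rather than chance.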

Results

Search results

We identified 1483 unique studies, of which 54 were selected for full-text review and 12 met our eligibility criteria (Fig. 1). We identified three unpublished studies (NCT04121026, NCT04240990, NCT04038632) for which data were unavailable from the authors. Three eligible studies were excluded because data for our primary objective were unavailable despite contacting authors. Specifically, two studies only reported the combined NPA diagnostic yield based on two samples [19, 20]; one stopped NPA collection during the study, and data extraction on NPA yield or the MRS was not possible [21]. The remaining nine studies were included in this systematic review.

Fig. 1 PRISMA study flow diagram

Study and patient characteristics

Study and patient characteristics are presented in Table 1. Participants were recruited across eight high-TB burden countries, mostly within Africa, with 7/9 studies including cohorts with a high HIV prevalence as per WHO definition [22]. Seven studies recruited in hospitals, with one additionally recruiting from the community [23], and two solely from primary-level health facilities [24, 25]. The most common exclusion criterion was current or previous TB treatment within varying time periods. The prevalence of children positive by MRS (confirmed TB) varied widely, ranging from 3% [24] to 41% [11]. The prevalence of children positive by CRS (confirmed and unconfirmed TB) ranged from 51% [25] to 90% [26]. Additional file 4: Table S1 summarizes the MRS and CRS definitions of the included studies.

Table 1 Study and patient characteristics

NPA collection, processing, and applied microbiological tests varied between studies (Table 2). The proportion of children with one NPA collected was high, ranging from 96 to 100% [11, 24, 26, 28], although collection rates for two samples across consecutive days were slightly lower (83/105, 79%) [24]. Operational aspects that might affect diagnostic yield were not uniformly reported. The target volume of NPA was only described in one study (2–5 ml) [26], and no study reported the actual volume collected. The proportion of uninterpretable NPA NAAT results was less than 5% [24, 25, 26]. Only one study reported the total proportion of contaminated NPA cultures (31/184, 17%) [24]. The culture method differed across studies, including liquid culture: mycobacteria growth indicator tube (MGIT) 960 or microscopic observation drug susceptibility (MODS) and solid culture: Löwenstein–Jensen or 7H11. For NAAT, most studies used Xpert (5/9), with the remaining using Ultra (1/9), the real-time RealArt™ PCR kit (1/9), or in-house hemi-nested PCRs (2/9). For studies that tested both NAAT and culture [11, 23, 24, 26, 27, 28, 29], NPA specimens were split for separate testing.

Table 2 Summary of NPA collection, testing, and processing across studies

Quality assessment

Figures 2 and 3 and Additional file 5: Table S2 summarize the QUADAS-2 assessments. Seven out of nine studies had a low or unclear risk of bias (ROB) for patient selection. Two had a high ROB: one for excluding the clinically unwell and children above 10 years [24] and one for comparing cases to healthy controls [27]. Applicability concerns were overall low, except for one case–control study which enrolled asymptomatic children with a positive tuberculin skin test, a test not routinely used for TB screening in most high-burden settings [27].

Fig. 2 Summary of risk of bias and applicability concerns using QUADAS-2 tool. The review authors’ judgements about each domain are presented for each included study

Fig. 3 Summary of risk of bias and applicability concerns using QUADAS-2 tool. The review authors’ judgements about each domain are presented as percentages across the included studies

For the index test domain, most included studies had a low ROB since they used tests with automatically generated results and pre-specified thresholds (Xpert, Ultra, and MGIT). Most studies reported an appropriate method of mucus extraction with suction, so applicability concerns were overall low, except for two studies [11, 28].

The reference standard domain scored most poorly. We only scored three studies as having a low ROB since they collected multiple different specimens and used both culture and a WHO-endorsed NAAT [23, 24, 26]. Applicability concerns were high in four studies for not reporting speciation methods to distinguish Mtb from other mycobacteria [11, 24, 27, 28].

Most studies had a low ROB for the flow and timing domain. One study included substantially fewer children in the analyses than the number enrolled (loss of 20%) [29]. In another, children received different culture reference tests [26], which are known to have differing sensitivities [31]. Both were scored as having a high ROB.

Diagnostic yield of NPA

Seven studies (including 242 children with confirmed TB) evaluated the diagnostic yield of culture on one NPA against the MRS [11, 23, 24, 26, 27, 28, 29]. A total of 17 to 88% of children with confirmed TB were diagnosed using culture on NPA (Fig. 4, Additional file 6: Table S3). The pooled estimate was 58% (95% CI 42–73%). Nonoverlapping CIs between some studies and an I² value of 77% (95% CI 57–98%) indicated considerable between-study heterogeneity.

Fig. 4 NPA diagnostic yield compared to children positive for MRS, according to study test. A Culture on NPA. B NAAT on NPA

The diagnostic yield of NAAT on one NPA versus the MRS could not be extracted in three studies which used in-house PCRs or the RealArt™ PCR kit [11, 27, 28]. In the remaining six studies (including 256 children with confirmed TB), 31 to 60% of children with confirmed TB were diagnosed using NAAT on one NPA (Fig. 4, Additional file 6: Table S3) [23, 24, 25, 26, 29, 30]. The pooled estimate was 44% (95% CI 36–51%) with CIs largely overlapping. The I² value was 25% (95% CI 0–88%).

We also calculated the diagnostic yield of NPA against a modified MRS from which NPA results were excluded (Additional file 7: Table S4). These data were only available from 2/6 studies for NAAT and 5/7 studies for culture. Diagnostic yield relative to this modified MRS was very similar to diagnostic yield relative to the original MRS, except for one study with very small numbers of children with TB [24].

Based on three studies with data available against a CRS, 1 to 15% of children with confirmed and unconfirmed TB were diagnosed using culture on one NPA [24, 26, 29]. Based on five studies, 2 to 14% of children with confirmed and unconfirmed TB were diagnosed using NAAT on one NPA [24, 25, 26, 29, 30] (Additional file 8: Table S5). Given the small number of studies and the significant heterogeneity observed in CRS definitions, meta-analyses were not done.

Testing two NPA samples compared to single sample testing increased the diagnostic yield by 4–35% for culture [23, 27, 29] and by 8–19% for NAAT [23, 25, 29, 30] versus an MRS (Fig. 5). The percentage of children with microbiologically confirmed TB by testing of other specimens who were not detected by two NPAs varied from 28 to 48% for culture and from 41 to 63% for NAAT.

Fig. 5 Incremental diagnostic yield of a second NPA using culture or NAAT compared to an MRS. The number in bars refers to the diagnostic yield of either the 1st or 2nd NPA sample in %. n refers to the total number of children with microbiologically confirmed TB in each study (MRS positive)
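The incremental yield of a second sample can be illustrated with a short calculation. This sketch assumes that the increments are expressed as absolute percentage points of MRS-positive children, which is one reading of how the figure presents them; the function name `incremental_yield` and the counts are hypothetical.

```python
def incremental_yield(first_npa_pos: int, second_npa_only_pos: int,
                      mrs_pos: int) -> dict:
    """Yield of one and two NPA samples among MRS-positive children.

    first_npa_pos: children detected on the first NPA.
    second_npa_only_pos: children missed by the first NPA but detected
        on the second.
    mrs_pos: all MRS-positive children (the denominator).
    """
    if mrs_pos <= 0:
        raise ValueError("mrs_pos must be greater than 0")
    if first_npa_pos < 0 or second_npa_only_pos < 0:
        raise ValueError("counts must be non-negative")
    if first_npa_pos + second_npa_only_pos > mrs_pos:
        raise ValueError("detected children cannot exceed mrs_pos")
    one = first_npa_pos / mrs_pos
    two = (first_npa_pos + second_npa_only_pos) / mrs_pos
    return {
        "one_sample": one,
        "two_samples": two,
        "increment": two - one,    # absolute gain from the second sample
        "missed_by_two": 1 - two,  # confirmed only by other specimens
    }


# Hypothetical study: 25 MRS-positive children, 10 detected on the first
# NPA and 3 more on the second.
for name, value in incremental_yield(10, 3, 25).items():
    print(f"{name}: {value:.0%}")
```

The `missed_by_two` value corresponds to children confirmed by other specimens who remained undetected after two NPA samples.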

We undertook two sensitivity analyses for the meta-analyses (Additional file 9: Table S6). Firstly, we only included the three studies with a low ROB for the MRS [23, 24, 26]. Pooled diagnostic yields for culture (63%, 95% CI 51–74%) and NAAT (53%, 95% CI 41–64%) were similar to the pooled estimates from all the studies. Secondly, three studies in the main analyses for culture on NPA did not include both NAAT and culture in the MRS [11, 27, 28]. An MRS that only includes one reference test may detect fewer confirmed cases in the denominator for diagnostic yield, which could lead to overestimation of the result. To address this, these three studies were excluded post hoc, which did not change the pooled estimate (57%, 95% CI 46–68%) compared to our main analyses. I² values were lower in both sensitivity analyses, suggesting less heterogeneity.

We also explored study heterogeneity based on HIV status and age. There were too few studies to derive pooled estimates for subgroups and conduct meta-regression; however, inspection of forest plots suggested a trend towards higher NPA yield among children living with HIV (CLHIV) than HIV-negative children for culture and NAAT (Additional file 10: Fig. S1) and a trend towards higher diagnostic yield in younger children for NAAT (Additional file 10: Fig. S2). Since diagnostic yield of NPA is influenced by the number of MRS-positive children in a study, we also conducted a post hoc analysis to explore this. Visual inspection of scatterplots suggested a positive relationship between microbiological confirmation rate and NPA culture diagnostic yield (Additional file 10: Fig. S3). This association was not observed for NAAT (Additional file 10: Fig. S3).

Discussion

In this systematic review and meta-analysis, microbiological testing of one NPA specimen detected Mtb in approximately half of all children with microbiologically confirmed TB. The summary diagnostic yield of culture (58%; 95% CI 42–73%) was slightly higher than the summary estimate of Xpert or Xpert Ultra (44%; 95% CI 36–51%). Whilst we confirmed the added value of repeated NPA samples to increase microbiological yield by 4–35% for culture and 8–19% for NAAT, two samples will, at best, still miss a third of children with TB.

We identified between-study heterogeneity in NPA diagnostic yield, especially for culture. Whereas all studies in our meta-analysis for NAAT used the GeneXpert Systems, culture methods varied. Liquid culture is more sensitive than solid culture [32], and using both improves Mtb recovery if contamination occurs [33]. Differences in the reference standard likely contributed to heterogeneity, reflected in the lower I² in the sensitivity analyses only including studies with a low ROB for the MRS and studies with two reference tests as opposed to one, although this should be interpreted with caution given the few studies and the wide I² 95% CIs [34]. Diagnostic yield also depends on the quality and volume of the specimen. The minimum volume for NPA recommended by the WHO is 2 ml, although larger volumes are associated with greater bacteriological yield [35]. Limited data on NPA volumes and other aspects of the collection process made it difficult to evaluate the impact on yield.

Heterogeneity in yield can be due to variation in study population and the pre-test probability of TB. Indeed, the microbiological confirmation rate, which is highly influenced by the patient population, appeared to be related to the yield for culture on NPA. Patients were recruited from different levels of healthcare facilities, with tertiary referral centers more likely to have children with advanced disease and higher mycobacterial burdens [36]. The trend for a higher NPA yield in CLHIV compared to HIV-negative children suggested in our review has been noted in other diagnostic specimens [37,38,39] and could be related to the greater risk of TB and more advanced disease. In contrast, the trend for a greater NPA yield in younger children is surprising since they often have paucibacillary disease, although other factors may contribute to these findings.

Operational factors including feasibility and acceptability influence the choice of specimen collection [6]. The high proportion of children with successful NPA collection (> 95%) across different levels of healthcare and the low rate of indeterminate results with NAAT (< 5%) in our review support the feasibility of NPA. This is consistent with the recent TB-speed pneumonia study where 97% (1140/1169) of children with symptoms of pneumonia across six high TB incidence countries had NPA collected, and only 1.3% (15/1132) of Ultra results on NPA were invalid or had errors, although this study only recruited from hospitals [12]. No study in our review provided data on the acceptability of NPA. Preliminary findings from a cross-sectional qualitative study within the TB-speed pneumonia project identified that whilst NPA collection was perceived as painful by nurses and parents, it was overall well-accepted and judged to be quicker and less invasive than GA [40].

Our diagnostic yield estimates for NPA were lower than sensitivity estimates for Xpert Ultra on expectorated or induced sputum (75.3%), GA (70.4%), and stool (56.1%) based on a reference standard of culture in another meta-analysis for pediatric TB [38]. However, the use of different MRS definitions hampers this comparison, and indirect comparison of specimens between studies can be biased by differences in population and setting. Although testing of NPA will miss some children with TB, detection is significantly improved when a combination of different samples is utilized [26]. Obtaining different specimens in 1 day may be simpler than collecting samples over consecutive days. In a study of children with presumptive TB in South Africa, a combination of one induced sputum and NPA using Ultra identified 80% of children with confirmed TB [30]. Similarly, in a pediatric study in Kenya, testing one NPA and stool with MGIT and Xpert had a diagnostic yield of 71%, which was comparable to the yield from two GA (77%) over multiple days [23]. NPA, as a relatively easy procedure, can increase access to microbiological testing, with yield improved if feasible by testing additional specimens.

There are several strengths to our review. Our search strategy covered six languages and included the three key bibliographic databases recommended for diagnostic studies [41] and trial registers for unpublished studies. Although our inclusion criteria were limited to European languages, we did not find any article that could not be screened due to language restrictions. Most studies in our review included CLHIV and had an average age of under 5 years, suggesting our results are highly applicable to key diagnostic groups. Our dataset included children from different levels of health facilities across three continents, improving the generalizability of our findings. We conducted multiple sensitivity analyses to check our assumptions and explore alternative explanations for our findings. Finally, we considered diagnostic yield estimates separately for NAAT and culture. Access to culture is restricted to highly specialized health facilities [42], whereas automated NAAT has lower technical and infrastructure requirements and is more suitable for lower-level health settings [7]. Distinguishing these two tests reflects their different potential roles in TB diagnostic algorithms.

This review and evidence base do have limitations. Firstly, whilst pooled estimates can summarize information across multiple studies, between-study heterogeneity, especially for culture, means that they must be interpreted with caution, and readers are encouraged to consider the variety in yield estimates as shown in the forest plots. Although we performed sub-analyses based on HIV and age, paucity of data meant we could not conduct meta-regression to fully explore how these variables contributed to differences in NPA diagnostic yield. Secondly, we included NPA in the MRS, which can potentially overestimate the diagnostic yield. However, diagnostic yield was very similar for nearly all studies when using a modified microbiological reference standard in which NPA results were excluded [15]. Thirdly, whereas all studies using NAAT on NPA had culture and NAAT in their MRS, some studies only had culture in their MRS, potentially skewing estimates. However, our sensitivity analysis showed minimal change to diagnostic yield. Finally, despite contacting authors, we had to exclude three eligible studies as data for our primary aim could not be extracted.

Whilst the feasibility of NPA supports decentralization to lower levels of healthcare, we identified several gaps in the evidence to be addressed. Firstly, more qualitative research is needed on the perspectives of children, caregivers, and health workers on NPA, especially regarding acceptability, repeated sampling, and barriers to collection. Although reporting was incomplete, we noted variation between studies in many aspects of NPA collection. Protocols for NPA themselves are not uniform; whereas the WHO suggests 2 h of fasting prior to NPA collection [6], other national and international bodies do not [43,44,45]. Operational research into standardizing and optimizing sample processing and collection in low-resource settings to enhance recovery of bacilli from NPA is recommended. Finally, improved reporting on the performance of NPA specifically for children under 5 could help researchers better understand its value where it is most clinically relevant.

Conclusions

Our systematic review and meta-analysis confirm the suitability of NPA as an alternate specimen for the microbiological confirmation of pediatric PTB. Despite suboptimal diagnostic yield, the high rates of successful collection across different levels of healthcare help improve access to microbiological testing. This supports the inclusion of NPA in diagnostic algorithms for TB, especially if sampling is repeated or in combination with other specimens.