Introduction

Acute lower respiratory infections (ALRIs) are a major cause of clinic presentations [1] and hospitalizations [2] of young Indigenous children living in remote regions of Australia. One-fifth of Indigenous infants born in Australia’s Northern Territory (NT) are hospitalized with an ALRI in the first year of life [2]. The reported incidence of radiologically confirmed pneumonia is 27 per 1000 child-years among NT children aged < 5 years [3], and for bronchiolitis it is 380 per 1000 child-years in those aged < 12 months [2]. Moreover, the 7-valent pneumococcal conjugate vaccine (PCV7), introduced in 2001, did not significantly reduce the incidence of WHO radiologically confirmed pneumonia among children aged < 18 months [4].

Despite the importance of ALRIs in our setting, the microbiological associations remain poorly characterized with only two published studies to date [5, 6]. These were both hospital-based. Worldwide, many studies have identified the presence of viruses and bacteria in children’s respiratory secretions during an ALRI [7]; however, the contribution to infection of individual microbes, polymicrobial communities, and microbe-host interactions at the individual and population levels remains poorly understood. Furthermore, the value of nasopharyngeal swabs (NPS) for informing microbiology in the lung has been questioned. For example, in NT children hospitalized with pneumonia, nasopharyngeal bacteria and viruses were not associated with clinical signs on admission or during recovery [6], questioning the utility of NPS taken at the time of ALRI in clinical diagnosis or for prognosis in children with already high carriage of respiratory pathogens. Recent data from the PERCH study confirmed the limited value of cross-sectional NPS for diagnosing ALRI [8].

Microaspiration of secretions from the upper airways is a likely precursor to ALRI [9], and when nasopharyngeal secretions are abundant, these bacteria-laden secretions are more likely to be aspirated, particularly in young children who have immature laryngeal chemoreflexes [10]. Indigenous children in remote NT communities are arguably at high risk of microaspiration of respiratory bacteria (e.g. Streptococcus pneumoniae, Haemophilus influenzae) or viruses as these microbes are detected in 62–79% of NPS taken from asymptomatic infants and young children enrolled in community-based studies [11, 12, 13]. Therefore, in this setting examining nasopharyngeal microbes before, rather than during, an ALRI may provide important information on the initial pathogenesis of these infections.

ALRIs remain a complex problem. This is made more challenging by the lack of methods to determine etiology, compromising preventive and treatment strategies. Given the limited available data on the etiology of community-based ALRIs and difficulty sampling the lung, our aim was to investigate whether the prior presence or load of specific bacteria or viruses in the nasopharynx was associated with developing an ALRI episode in a cohort of young Indigenous children living in the NT. We hypothesized that the prevalence of both respiratory bacterial and viral pathogens in NPS specimens would be higher before ALRI episodes than in controls.

Materials and methods

Study design and participants

A retrospective, nested case-control, case-crossover analysis (Fig. 1) of clinical data and samples collected from Indigenous children aged 90 days to 2 years enrolled previously in two completed studies was undertaken (Table 1) [14, 15]. We aimed to achieve 1:1 matching of cases and controls, but other ratios were allowed when alternative controls were not available. NPS were stored in skim-milk tryptone glucose glycerol broth (STGGB) at − 80 °C [16].

Fig. 1

Schematic representation of case and control selection. Case NPS (0–21 days before ALRI), SCC NPS (90–180 days before the ALRI event), and DCC NPS from time (± 21 days) and age (± 60 days) matched children without ALRI. For case NPS from children aged < 180 days, an SCC NPS was chosen 25–90 days before the event. ALRI, acute lower respiratory infection. DCC, different child control. NPS, nasopharyngeal swab. SCC, same child control

Table 1 Description of original studies

ALRI episodes

Research nurses identified ALRI cases by reviewing health centre and hospital medical records. Hospitalized ALRI episodes were defined by fever (> 38 °C) with at least one of the following within 24 h of presentation: abnormal chest radiograph showing pulmonary infiltrates; age-adjusted respiratory rate above the WHO threshold (≥ 50 breaths/min [2–12 months]; ≥ 40 breaths/min [1–5 years]); crackles, wheeze or bronchial breathing on auscultation. Community ALRI episodes were defined by fever with chest wall recession or tachypnea with/without crackles or wheeze. Pneumonia and bronchiolitis diagnoses were as recorded in the health centre notes or medical admission.
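The hospitalized case definition above can be expressed as a short rule. The following is a minimal sketch only, not the procedure used by the research nurses: the names `who_tachypnea`, `Presentation` and `is_hospitalized_alri` are hypothetical, and assigning a child aged exactly 12 months to the older age band is an assumption, because the two quoted bands overlap at that age.

```python
from dataclasses import dataclass


def who_tachypnea(age_months: float, resp_rate: int) -> bool:
    """Age-adjusted fast breathing using the thresholds quoted in the text."""
    if 2 <= age_months < 12:
        return resp_rate >= 50
    # Assumption: a child aged exactly 12 months is placed in the 1-5 year band.
    if 12 <= age_months < 60:
        return resp_rate >= 40
    raise ValueError("thresholds are quoted only for ages 2 months to 5 years")


@dataclass
class Presentation:
    """Findings recorded within 24 h of presentation (hypothetical structure)."""
    age_months: float
    temp_c: float
    resp_rate: int
    infiltrates_on_cxr: bool = False
    abnormal_auscultation: bool = False  # crackles, wheeze or bronchial breathing


def is_hospitalized_alri(p: Presentation) -> bool:
    """Fever (> 38 °C) plus at least one supporting respiratory finding."""
    if p.temp_c <= 38.0:
        return False
    return (
        p.infiltrates_on_cxr
        or p.abnormal_auscultation
        or who_tachypnea(p.age_months, p.resp_rate)
    )
```

The community definition is not encoded here because the text gives no numeric fever threshold for it.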

Selection of NPS (Fig. 1)

NPS sampling strategy [14, 15]

NPS were collected from participants every 2–4 weeks for the first 6–12 months of life, then 6-monthly until age 2 years (Table 1).

Case NPS

Case NPS were collected 0–21 days before ALRI episodes. To limit potential age bias [12], NPS were only selected from infants aged > 90 days.

Same child control (SCC) NPS

SCC NPS were chosen 90–180 days before ALRI episodes (for case-crossover analyses). For children aged < 180 days when the case NPS was collected, an SCC NPS was chosen 25–90 days before ALRI to avoid selection of an NPS collected when the child was < 90 days of age.

Different child control (DCC) NPS

DCC NPS were selected if they were collected within 21 days of the case NPS and if the control child was within 60 days of age of the case child, to match for seasonality and age (for case-control analysis). DCC NPS were excluded if collected within 21 days of an ALRI episode in the control child.
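The three selection windows can be summarised as date arithmetic. This is an illustrative sketch under stated assumptions rather than the selection code used in the study: the function names are hypothetical, window boundaries are treated as inclusive, and the DCC exclusion is applied to ALRI episodes on either side of the control swab.

```python
from datetime import date


def is_case_swab(swab_date: date, alri_date: date, age_days_at_swab: int) -> bool:
    """Case NPS: collected 0-21 days before the ALRI, child aged > 90 days."""
    days_before = (alri_date - swab_date).days
    return 0 <= days_before <= 21 and age_days_at_swab > 90


def is_scc_swab(swab_date: date, alri_date: date, age_days_at_case_swab: int) -> bool:
    """Same child control: 90-180 days before the ALRI, or 25-90 days
    when the child was aged < 180 days at the case swab."""
    days_before = (alri_date - swab_date).days
    if age_days_at_case_swab < 180:
        return 25 <= days_before <= 90
    return 90 <= days_before <= 180


def is_dcc_swab(case_swab_date: date, case_age_days: int,
                ctrl_swab_date: date, ctrl_age_days: int,
                ctrl_alri_dates: list) -> bool:
    """Different child control: matched on calendar time (within 21 days)
    and age (within 60 days), with no ALRI near the control swab."""
    if abs((ctrl_swab_date - case_swab_date).days) > 21:
        return False
    if abs(ctrl_age_days - case_age_days) > 60:
        return False
    # Assumption: "within 21 days" excludes ALRI episodes before or after the swab.
    return all(abs((ctrl_swab_date - d).days) > 21 for d in ctrl_alri_dates)
```

For example, with an ALRI on 30 June 2005, a swab taken on 20 June from a 200-day-old child qualifies as a case NPS, whereas a swab taken on 1 June (29 days earlier) does not.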

Other clinical data and potential confounders

In the original studies [14, 15], otitis media was identified by video otoscopy. Clinic records were used to identify recent antibiotic prescription (within 21 days of the NPS) and PCV7 vaccination status (defined as ≥ 2 doses administered at least 2 weeks prior), at the time of NPS collection.

Bacterial and viral analyses of NPS

Bacterial and viral testing of stored NPS in STGGB has previously been performed successfully [12].

Viral PCR detection and typing

Specimens were batch-tested for respiratory syncytial virus (RSV-A and B); human rhinovirus (HRV); influenza virus (IFV-A and B); parainfluenza virus (PIV-1, 2, 3); human metapneumovirus (hMPV); human coronavirus (HCoV- NL63, OC43, 229E, HKU1); human enterovirus (HEV); human adenovirus (HAdV); human bocavirus-1 (HBoV-1); and human polyomaviruses (HPyV-WU, KI) using previously validated, real-time PCR assays [17]. HAdV and HRV typing was performed as described [18, 19].

Bacterial PCR detection and quantification

S. pneumoniae, H. influenzae, Moraxella catarrhalis, and Staphylococcus aureus were detected and quantified by PCR following published methods [20]. Chlamydophila pneumoniae and Mycoplasma pneumoniae detection by PCR was as described [17].

Statistical analysis

Summary statistics are presented as medians and ranges for continuous variables and frequencies for categorical variables. Conditional logistic regression was used to compare characteristics, pathogen presence/absence and (log-transformed) respiratory bacterial loads between cases and controls. Conditional logistic regression models also assessed the influence of antibiotics, PCV7 status and otitis media. Samples were not all independent of one another; e.g. whereas a child could supply a case and SCC swab, they could also supply a DCC for another child. Use of single swabs more than once and matching ratios other than 1:1 were allowed. Analysis was performed using Stata version 14 and a p value < 0.05 was considered statistically significant.
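For intuition about what conditional logistic regression estimates, consider the special case of strictly 1:1 matched sets and a single binary exposure. There, the conditional maximum-likelihood odds ratio reduces to the ratio of discordant pairs. The sketch below illustrates that special case only; it is not the Stata analysis described above, which also allowed other matching ratios and adjustment for covariates. The function name, the Wald-type interval and the example counts are all hypothetical.

```python
import math


def matched_pair_or(pairs):
    """Odds ratio and Wald 95% CI for 1:1 matched pairs.

    pairs: iterable of (case_exposed, control_exposed) booleans,
    one tuple per matched set.
    """
    pairs = list(pairs)
    # Only discordant pairs carry information in the conditional likelihood.
    b = sum(1 for case, ctrl in pairs if case and not ctrl)
    c = sum(1 for case, ctrl in pairs if ctrl and not case)
    if b == 0 or c == 0:
        raise ValueError("needs discordant pairs in both directions")
    or_hat = b / c
    se_log_or = math.sqrt(1 / b + 1 / c)
    lower = math.exp(math.log(or_hat) - 1.96 * se_log_or)
    upper = math.exp(math.log(or_hat) + 1.96 * se_log_or)
    return or_hat, lower, upper


# Hypothetical counts, not study data: 12 pairs with only the case positive,
# 4 with only the control positive, and 23 concordant pairs.
example = (
    [(True, False)] * 12
    + [(False, True)] * 4
    + [(True, True)] * 3
    + [(False, False)] * 20
)
or_hat, lower, upper = matched_pair_or(example)
```

The concordant pairs in the example do not affect the estimate, which is why high background carriage in both cases and controls leaves few informative pairs.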

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Results

Selection of cases and controls

From the 222 eligible children, we identified 782 ALRI episodes in 179 children (Fig. 2). A total of 182 ALRI case episodes (104 children) were able to be matched to controls from either the same child (case-SCCs) or a different child (case-DCCs). Most children (85%; 89/104) contributed one or two ALRI cases. Among the 182 ALRI episodes, 102 were specified as bronchiolitis cases and 12 were specified as pneumonia (3 X-ray confirmed) cases, while the remainder met inclusion criteria for an ALRI, but could not be categorized further.

Fig. 2

Participant outline demonstrating flow of selection of case and control nasopharyngeal swabs. ALRI, acute lower respiratory infection. DCC, different child control. NPS, nasopharyngeal swab. SCC, same child control. *Whereas most case-SCC (91%) and case-DCC (93%) groups were a 1:1 match, some nasopharyngeal swabs were matched more than once

For case-crossover analysis, we matched 120 cases (75 children) with 112 eligible SCCs (75 children) for 110 case-SCC groups in total. For case-control analysis, we matched 170 cases (101 children) with 155 eligible DCCs (n = 86 children) for 154 case-DCC groups in total. Most case-SCC (91%; 100/110) and case-DCC (93%; 143/154) groups were a 1:1 match; otherwise, controls were used more than once where alternative controls were not available. In the same child comparison, the median time between case NPS and ALRI episode was − 7.5 days (range 0 to − 21) and between SCC and their paired case NPS was − 83 days (range − 27 to − 173). In the different child comparison, median time between case NPS and ALRI episode was − 7.0 days (range 0 to − 21) and between DCC and their paired case NPS was 0 days (range − 20 to 20).

Comparison of participant characteristics of cases and controls

The children’s clinical characteristics (Table 2) showed high prevalence of concurrent ear disease and a male gender imbalance of the matched cases for DCC (p = 0.029). Receipt of two or more PCV7 doses (scheduled at 2, 4 and 6 months of age) was least frequent before the SCC NPS due to the selection strategy where SCC NPS were chosen earlier than case NPS. Recent antibiotic use (within 21 days of the NPS) more commonly preceded control NPS (SCC 17%; 19/112 and DCC 19%; 29/155) compared to the respective case NPS (11% each); SCC p = 0.038; DCC p = 0.031.

Table 2 Participant characteristics at time of nasopharyngeal swab collection

Identification of microbes and genotypes in NPS

Overall, 432 NPS were analysed. Data were incomplete for one NPS tested for typical respiratory bacteria (n = 431) and for one NPS tested for viruses (n = 431), whilst only 318 NPS were available for M. pneumoniae and C. pneumoniae testing due to insufficient sample. Presence of typical respiratory bacteria was high (96%, 415/431): S. pneumoniae in 74% (320/431), H. influenzae in 75% (325/431) and M. catarrhalis in 88% (378/431). Virus detection was also high (70%, 300/431) in NPS specimens. HRV was the most prevalent (48%; 208/431), followed by HPyV-WU (14%; 60/431), HAdV (12%; 52/431), HCoV (7%; 32/431), and HBoV-1 (7%; 30/431). Other bacteria and viruses were detected in < 5% of NPS. HAdV and HRV genotyping demonstrated wide diversity. Typed HAdV were primarily group C (50%; 26/52), with 12 genotypes in groups A–D in total. Among 208 HRV-positive samples tested, 52 different HRV genotypes were identified from 88 typed, primarily from HRV species A (39%; 34/88) and C (49%; 43/88) and one novel HRV genotype (Online Resource Table 1).

Comparison of microbial presence/load between cases and controls

Viruses

In unadjusted conditional logistic regression analyses (Table 3, Fig. 3), HAdV detection was significantly more frequent among ALRI cases than SCCs: cases (18%; 21/119) versus SCCs (7%; 8/112), unadjusted OR = 3.08, 95%CI 1.22–7.76 (p = 0.017). The association was in the same direction, but not statistically significant, between ALRI cases and DCCs: cases (15%; 26/169) versus DCCs (10%; 15/155), unadjusted OR = 1.93, 95%CI 0.97–3.87 (p = 0.063). This association was tested in a post hoc subanalysis using only case NPS (and matched controls) collected up to 7 days before the ALRI event. A similar trend to the complete dataset was seen for HAdV, reaching significance in cases (18%; 21/119) versus DCCs (10%; 11/105), unadjusted OR = 2.32, p = 0.046 (Online Resource Table 2).
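As an arithmetic cross-check, a crude (unmatched) odds ratio can be computed from the marginal counts quoted above. The sketch below is illustrative only and the function name is hypothetical. The crude values need not equal the reported conditional estimates (3.08 and 1.93), because the conditional model compares each case only with its own matched control and draws its information from discordant sets.

```python
def crude_odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Unmatched odds ratio from a 2x2 table of marginal counts."""
    a = exposed_cases
    b = total_cases - exposed_cases
    c = exposed_controls
    d = total_controls - exposed_controls
    return (a * d) / (b * c)


# HAdV counts as quoted in the text.
hadv_scc = crude_odds_ratio(21, 119, 8, 112)    # cases vs same child controls, about 2.79
hadv_dcc = crude_odds_ratio(26, 169, 15, 155)   # cases vs different child controls, about 1.70
```

Both crude values point in the same direction as the matched estimates, which is the most that should be read from this check.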

Table 3 Prevalence of respiratory viral and bacterial pathogens in cases and controls (unadjusted)
Fig. 3

Statistical comparison of respiratory pathogens in cases versus controls. Conditional logistic regression was used to measure the association of respiratory pathogens in cases versus both same child controls (filled circles) and different child controls (open circles). Odds ratios > 1 favour an association with cases, odds ratios < 1 favour controls. RSV and IFV odds ratios were generated using an exact logistic regression model to manage completely determined outcomes, although the upper boundaries were infinite (RSV-DCC, IFV-SCC). Further confidence interval truncations were made for visual purposes (indicated by +). DCC, different child control. SCC, same child control. HRV, human rhinovirus; HAdV, human adenovirus; HPyV, polyomaviruses; HCoV, human coronavirus; HBoV-1, human bocavirus-1; PIV, parainfluenza virus; RSV, respiratory syncytial virus; IFV, influenza virus; HEV, human enterovirus; hMPV, human metapneumovirus

Presence of HPyV-WU was significantly negatively associated with ALRI in the case-SCC analysis; cases (8%; 9/119) versus SCC (17%; 19/112), unadjusted OR = 0.32, 95%CI 0.12–0.87 (p = 0.026). This negative association was not seen in the DCC analysis (Table 3, Fig. 3). Similarly, presence of the other viruses tested was not associated with subsequent ALRI in SCC and DCC analyses. Conditional logistic regression analysis adjusted for recent antibiotic use (case-SCC analysis), or gender and antibiotic use (case-DCC analysis), did not change the interpretation of the unadjusted analysis. The prevalences of HCoV, RSV, IFV, PIV, HAdV and HRV subtypes are presented by case-control group in Online Resource Table 1. The small number of individual genotypes precluded analysis for association with ALRI.

Bacteria

Neither bacterial presence nor density was associated with subsequent ALRI. Bacterial load data for S. pneumoniae, H. influenzae, and M. catarrhalis identified similar distributions when comparing cases and controls.

Virus-bacteria co-detections

In a conditional logistic regression analysis to compare microbe co-detection between cases and controls where overall prevalence was above 10%, the magnitude and significance of the case-SCC associations for HAdV (Table 4) remained consistent with analysis of HAdV independently associated with ALRI onset; S. pneumoniae-HAdV (15 versus 4%) OR 4.31, p = 0.009; H. influenzae-HAdV (15 versus 7%) OR 2.69, p = 0.039; M. catarrhalis-HAdV (17 versus 7%) OR 3.03, p = 0.019; and HRV-HAdV (11 versus 3%) OR 4.32, p = 0.023. HPyV co-detections (S. pneumoniae, H. influenzae, and HRV) similarly remained significantly associated with SCCs. No significant co-detection associations were identified for other microbe pairs (Table 4), nor any for case versus DCC comparisons (not shown).

Table 4 Co-detection of commonly identified pathogens among cases and same child controls

In an analysis of pooled data across all NPS specimens, we found a significant positive association for co-detections between H. influenzae and HRV (p < 0.001), even after adjusting for other co-detections, PCV7, recent antibiotics and age. However, H. influenzae-HRV co-detection (Table 4) was not associated with ALRI cases.

Discussion

Children in remote communities in northern Australia are at very high risk of ALRI. In this cohort of 222 children, 80% of the children had 782 discrete episodes of ALRI recorded in the first 2 years of life. NPS specimens tested by PCR for 24 respiratory bacterial and non-bacterial pathogens were characterized by a high prevalence of S. pneumoniae (74%), H. influenzae (75%) and M. catarrhalis (88%) and any respiratory virus (70%, predominantly HRV). In 120 ALRI episodes where matched NPS data were available from SCCs and 170 ALRI episodes with matched contemporaneous NPS data from DCCs, neither presence nor load of the major respiratory bacterial pathogens S. pneumoniae, H. influenzae and M. catarrhalis was associated with ALRI onset. Carriage prevalence of these pathogens was high in cases and controls, potentially complicating analyses for associations due to saturation. HAdV was the only microbial signal associated with ALRI onset; the odds of a nasopharyngeal HAdV detection were 3-fold higher in the period immediately before an ALRI episode compared to an earlier ALRI-free period in the same child (p = 0.017).

HAdV was identified as a pathogen of interest in a previous study in this population, which found that HAdV detected in NPS specimens was independently associated with severe otitis media [12]. In a large childhood pneumonia aetiology study in South Africa where a high incidence of pneumonia is reported, RSV in NP swabs was most strongly associated with pneumonia in case-control comparisons followed by IFV, HBoV, HAdV, and cytomegalovirus [21]. Additional analysis of induced sputum specimens also identified Bordetella pertussis, H. influenzae type b, M. pneumoniae and PIV as associated with pneumonia [21]. Whilst a meta-analysis of 23 studies demonstrated that RSV, IFV, PIV, hMPV and HRV, were associated with ALRI in children aged < 5 years, there was no significant difference in detection of HAdV between cases and controls [22]. A recent prospective, community-based, longitudinal birth cohort study involving non-Indigenous Australian children also could not attribute parent-reported ALRI symptoms to detecting HAdV by PCR in weekly collected nasal swab specimens [23]. Nonetheless, HAdV disease outbreaks are known to occur; for example, types 2, 3, and 7 were observed in children during a 2011 Taiwan community outbreak, with type 7 associated with more severe disease [24]. HAdV outbreaks also occur in military training centres associated with crowded living conditions; however, reintroduction of a type 4 and 7 vaccine in 2011 was followed by a reduction in acute respiratory infection in this population [25]. Whilst HAdV type 7 (HAdV B) was detected in our study, HAdV C types were most common, as was shown previously in respiratory specimens in another Australian study [26].

Presence of a microbe may contribute to ALRI development, but may not be the primary or sole cause. Respiratory viruses can promote bacterial superinfection by multiple mechanisms that decrease bacterial clearance, increase bacterial adherence, and suppress immunity during recovery (reviewed [7]). Bacteria can also promote viral infection by facilitating attachment to host cells or contributing to elevated virus production or release from airway cells (reviewed [7]). However, a meta-analysis of 19 studies found that evidence to support the role of viral co-infection in increasing disease severity was inconclusive [27]. In our study, the only significant association identified between pathogens was that between H. influenzae and HRV [28].

A study strength was the case-crossover analysis designed to account for proximal risk factors, such as host (e.g. genetic) or environmental factors (e.g. smoke exposure) that could influence susceptibility to ALRI. Nevertheless, we performed a secondary analysis of specimens and data from studies not specifically designed to address our aim. Furthermore, diagnostic classification relied on clinic records which may lack accuracy. Additionally, ALRI episodes ranged from mild to severe disease, with an under-representation of hospital cases. This clinical heterogeneity may translate to microbial differences, and thus future studies should analyse the range of disease phenotypes. Indeed, where we previously found that 42% of NPS from Indigenous infants hospitalized with bronchiolitis were RSV-positive [29], few NPS were RSV-positive for cases in the current study, despite bronchiolitis being a common diagnosis (56% of 182 episodes). This difference may be due to sampling of primarily non-hospitalized ALRI cases in infants outside the first 6 months of life.

A limitation of ALRI studies based on NPS microbiology relates to inferring causation of infection at distal areas of the respiratory tract [8]. Whereas most studies have investigated microbial exposures concurrent with the outcome, our study captured the nasopharyngeal microbiology prior to ALRI onset to account for the possibility of the original infecting organism being cleared. It is possible that pathogens detected in the case NPS were cleared before the ALRI episode, or that relevant pathogens were acquired after NPS collection; however, a subanalysis of case NPS collected within 7 days of the ALRI episode identified HAdV trends consistent with the main analysis. Here, we investigated a high-risk population for ALRI (80% affected with 49% experiencing multiple ALRI episodes in the first 2 years of life), making us question the suitability of controls from this cohort. Whilst we attempted to prevent misclassification bias using strict criteria for control NPS specimens, the high background rate of pathogen-positive NPS and the design element that allowed case and control NPS to be sought from the same set of children may bias the results towards the null hypothesis. Furthermore, the cross-sectional design limits conclusions that can be confidently drawn since our study was not designed to detect the contribution of new strain acquisitions of bacteria or viruses to ALRI onset, which is important in other respiratory disorders [30], and is an area requiring further investigation in this population with high rates of respiratory bacterial colonization.

In this pediatric population at high risk of ALRI, a high prevalence of several respiratory bacterial and viral pathogens was identified in NPS collected from cases in the period prior to ALRI onset, and in controls. Whilst our retrospective analysis was limited by several factors, HAdV was the only pathogen significantly associated with ALRI onset. We require a large set of longitudinal host, environmental and microbial data (including clinical phenotype and microbial strain) modelled to account for multiple parallel effects to unpack these complex interactions and to inform the use or development of targeted interventions to reduce the ALRI burden in this high-risk population.