Background

Antiretroviral therapy (ART) has substantially increased the lifespan of people living with human immunodeficiency virus (HIV) (PLWH) and reduced their acquired immunodeficiency syndrome-related morbidity and mortality [1, 2]. The aging of PLWH nonetheless places them at higher risk for developing age-related comorbidities, including chronic obstructive pulmonary disease (COPD). PLWH have been shown to have an elevated risk for COPD [3,4,5], which appears to be independent of smoking history [3, 6] and also worsens their mortality risk [7]. As the population of PLWH ages, they are likely to bear an increasing burden of COPD.

There are many hypotheses regarding the mechanisms of HIV-associated COPD, including but not limited to longer exposure to risky behaviours [8], side effects from ART [8, 9], chronic inflammation [10], and Pneumocystis colonization [11, 12]. Speculation that microbial dysbiosis in the lung, the result of repeated pulmonary infections and antibiotic exposure, could also drive obstructive lung disease has led to multiple studies investigating the lung microbiome in PLWH [13,14,15,16]. For instance, we previously reported on decreased microbial diversity and community shifts in the airway epithelium of PLWH compared to HIV-uninfected patients [16]. The characterization of the airway epithelial microbiome specifically in PLWH with COPD, however, has yet to be reported. Moreover, although dysbiosis may conceivably lead to profound changes in host molecular mechanisms such as transcription, epigenetic regulation, metabolism, and immunity, these disruptions have not yet been fully described.

In this study, we hypothesize that microbial disruptions in the airway epithelium of PLWH with COPD are associated with key transcriptomic and epigenetic alterations that can help us gain insights into the disease pathogenesis of HIV-associated COPD. We simultaneously profile the microbiome, methylome, and transcriptome from airway epithelial cells collected via bronchoscopy in PLWH with COPD (COPD+HIV+) to (1) characterize the distinct microbiome features distinguishing them from PLWH without COPD (COPD−HIV+), HIV-uninfected patients with COPD (COPD+HIV−), and healthy controls (COPD−HIV−) and (2) link these features with epigenetic and transcriptomic alterations to better understand the host response to dysbiosis. To the best of our knowledge, no study has integrated and examined all three ’omes together in the context of the HIV airway.

Methods

Study population and design

Airway epithelial cell (AEC) brushings were obtained from 76 adult patients (18 COPD+HIV+, 16 COPD−HIV+, 22 COPD+HIV−, and 20 COPD−HIV−) at St. Paul’s Hospital, Vancouver, BC, who consented to undergo bronchoscopic collection of research specimens under the University of British Columbia Research Ethics Board Certificates H11-02713 and H15-02166. Bronchial brushings were obtained according to previously published methods [16,17,18]. Briefly, the bronchoscope was guided to either the right or left upper lobe segment and a cytology brush was inserted until resistance was met (airways approximately 2 mm in diameter), at which point gentle brushings were obtained for AEC collection. PLWH also provided background controls for the study, including oral washings, reagent controls, bronchoscope channel washings, water rinsed over unused cytology brushes, and extraction negative samples, specifically to investigate any cross-contamination with AEC samples. COPD was defined based on a pulmonologist’s diagnosis of COPD and either a pre-bronchodilator forced expiratory volume in one second (FEV1)/forced vital capacity (FVC) ≤ the lower limit of normal [19] or clear evidence of emphysema on visual inspection of computed tomography imaging. PLWH were defined as subjects with documented HIV-1 infection. An overview of the study design is provided in Additional file 1: Fig. S1. Details of the cohort and the methylation and transcriptome profiling of the samples have previously been reported [18]. Briefly, methylation profiles were obtained using the Illumina Infinium MethylationEPIC BeadChip microarray, which captures the methylation status of 863,904 CpG sites across the genome. Paired-end RNA sequencing was performed to a depth of 50 million reads using the Illumina NovaSeq 6000 platform.

Microbiome profiling

Bacterial genomic DNA was extracted from bronchial brushings, and microbiome profiles were obtained using touchdown droplet digital polymerase chain reaction, followed by 16S amplicon sequencing using the Illumina MiSeq® platform at the Sequencing and Bioinformatics Consortium at the University of British Columbia (Vancouver, Canada). Sequencing data were processed following the QIIME 2™ workflow, using the Divisive Amplicon Denoising Algorithm (DADA2) to denoise sequences. During these steps, the sequencing reads were demultiplexed, merged, and resolved into amplicon sequence variants (ASVs). Further quality filtering steps were performed to remove contaminating host mitochondrial or chloroplast sequences, ASVs present in PCR controls, ASVs with significantly fewer sequences than the majority, and ASVs present in only one sample (singletons). Taxonomy was assigned to ASVs using a pre-trained naive Bayes classifier artifact trained against Greengenes (13_8 revision), trimmed to include only the V4 hypervariable region and pre-clustered at 99% sequence identity. Phylogenetic trees were generated using the MAFFT program in QIIME 2™ and were subsequently used as input to compute phylogenetic diversity measures.
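The quality filtering steps described above can be illustrated with a minimal Python sketch. It is not the QIIME 2™ pipeline used in the study; the function name `filter_asvs`, the input layout, and the `min_total_reads` cutoff are hypothetical, since the text does not state the threshold behind "significantly fewer sequences than the majority".

```python
def filter_asvs(counts, taxonomy, control_samples, min_total_reads=10):
    """Return the ASVs that survive four illustrative filters.

    counts: {asv_id: {sample_id: read_count}}
    taxonomy: {asv_id: lineage string}
    control_samples: set of sample ids that are PCR/negative controls
    min_total_reads: hypothetical low-abundance cutoff (an assumption)
    """
    kept = {}
    for asv, per_sample in counts.items():
        lineage = taxonomy.get(asv, "").lower()
        # 1. Drop host organelle sequences.
        if "mitochondria" in lineage or "chloroplast" in lineage:
            continue
        # 2. Drop ASVs detected in any control sample.
        if any(per_sample.get(s, 0) > 0 for s in control_samples):
            continue
        # Keep only non-zero counts from biological samples.
        biological = {s: n for s, n in per_sample.items()
                      if s not in control_samples and n > 0}
        # 3. Drop ASVs with very few reads overall.
        if sum(biological.values()) < min_total_reads:
            continue
        # 4. Drop ASVs seen in only one sample.
        if len(biological) < 2:
            continue
        kept[asv] = biological
    return kept
```

Applying the filters in this order means an ASV is discarded at the first criterion it fails; the order does not change which ASVs are retained.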

Alpha diversity was measured using the Shannon Diversity Index, a metric of community richness and evenness. T-tests were used to identify differences in Shannon Diversity between PLWH and HIV-uninfected groups and between COPD and non-COPD groups, while analysis of variance (ANOVA) and Mann–Whitney U-tests were used to identify differences between the COPD−HIV−, COPD+HIV−, COPD−HIV+, and COPD+HIV+ groups. Beta diversity was measured using Bray–Curtis dissimilarity, which quantifies differences in taxon abundance between pairs of communities, tested with permutational multivariate analysis of variance (PERMANOVA) (adonis function in the vegan R package [20]), and visualized using principal coordinate analysis (PCoA) plots. For both alpha and beta diversity, a second model was fit to adjust for age, sex, and smoking status: Diversity Metric ~ Age + Sex + Smoking Status + COPD/HIV. We also examined the interaction effects between COPD and HIV on alpha and beta diversity metrics. Average relative taxon abundance comparisons were performed between the COPD+HIV+, COPD−HIV+, COPD+HIV−, and COPD−HIV− groups at the phylum and genus levels. The Kruskal–Wallis and Dunn’s tests were used to identify between-group differences. Significant taxon differences were identified at a false discovery rate (FDR) < 0.05 using the Benjamini–Hochberg method.
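For readers less familiar with these metrics, the following Python sketch shows how the Shannon Diversity Index, Bray–Curtis dissimilarity, and Benjamini–Hochberg adjustment are computed from first principles. It is illustrative only and is not the QIIME 2™ or R code used in the study; the function names are hypothetical, and the natural logarithm is an assumption because the text does not state the log base.

```python
import math


def shannon_index(counts, base=math.e):
    """Shannon index H = -sum(p_i * log(p_i)) over taxa with non-zero counts.

    The log base is an assumption; the source does not specify it.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total, base)
                for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A perfectly even community of four taxa gives a Shannon index of ln(4), two communities sharing no taxa give a Bray–Curtis dissimilarity of 1, and a taxon is called significant when its adjusted p-value falls below 0.05.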

Multi ’omic integration

The microbiome, transcriptome, and methylome were integrated using Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO) implemented in the mixOmics R package. This method simultaneously identifies key ’omics features among heterogeneous datasets and their respective correlations [21, 22].

This integration analysis included features from the three datasets as input: microbiome (126 ASVs), transcriptome (28 genes), and methylome (4404 CpGs). The ASVs were obtained after the quality filtering and taxonomy assignment steps described above. Top genes and CpG sites were selected based on robust linear modeling examining the interaction effect of COPD*HIV on gene expression and methylation, respectively (further methods are provided in Additional file 1: Methods). The following design matrix, with values ranging from 0 (indicating no correlation between ’omics datasets) to 1 (indicating maximum correlation), was chosen to strike a compromise between correlation and discrimination of the features across the different datasets.

figure a
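The general shape of such a design matrix can be sketched in Python as follows. The study's actual weights are not reproduced in this text, so the value 0.1 below is purely a placeholder, and the helper `diablo_design` is hypothetical rather than part of mixOmics. The sketch assumes the usual DIABLO convention of a square, symmetric matrix with one row and column per dataset and zeros on the diagonal.

```python
def diablo_design(blocks, weight):
    """Build a symmetric block-by-block design matrix as nested dicts.

    blocks: names of the 'omics datasets
    weight: off-diagonal value in [0, 1]; the study's value is not known here
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return {a: {b: (0.0 if a == b else weight) for b in blocks}
            for a in blocks}


# Placeholder weight for illustration only (an assumption, not the study's value).
design = diablo_design(["microbiome", "transcriptome", "methylome"], 0.1)
for name, row in design.items():
    print(name, [row[other] for other in design])
```

Smaller off-diagonal weights favour discrimination between groups, while weights closer to 1 favour correlation between datasets, which is the compromise described above.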

Subsequently, a DIABLO model with no variable selection was fit to evaluate the global performance and choose the number of components (ncomp) for the final model. The ncomp was chosen by considering the centroid distance measure and the balanced error rate after tenfold cross-validation repeated 50 times. Sparse partial least squares discriminant analysis was then used to obtain the optimal number of variables for each component in the three datasets, again after tenfold cross-validation repeated 50 times. Using the chosen parameters, the final DIABLO models were fit to identify key interactions (using an absolute correlation threshold of 0.7) between the microbiome, methylome, and transcriptome. Similar methods were used to integrate just two datasets at a time, namely (a) the microbiome and methylome and (b) the microbiome and transcriptome, to identify correlations between the corresponding ’omes.
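Two quantities in this workflow, the balanced error rate and the correlation threshold, can be made concrete with a short Python sketch. It is a simplification under stated assumptions: plain Pearson correlation on raw feature values is swapped in as a stand-in, whereas DIABLO derives its associations from latent components, and treating the 0.7 cutoff as inclusive is also an assumption. All function names are hypothetical.

```python
import math
from collections import defaultdict


def balanced_error_rate(y_true, y_pred):
    """Mean of the per-class misclassification rates."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth != pred:
            errors[truth] += 1
    return sum(errors[c] / totals[c] for c in totals) / len(totals)


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def strong_pairs(block_a, block_b, threshold=0.7):
    """Cross-block feature pairs whose |r| meets the threshold.

    block_a, block_b: {feature_name: list of per-sample values}
    """
    pairs = []
    for name_a, values_a in block_a.items():
        for name_b, values_b in block_b.items():
            r = pearson(values_a, values_b)
            if abs(r) >= threshold:
                pairs.append((name_a, name_b, r))
    return pairs
```

Because the balanced error rate averages the error within each class, it is not dominated by the largest group, which matters when group sizes differ as they do in this cohort.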

Results

Table 1 provides a summary of participant demographics. The study cohort was composed of 76 participants (18 COPD+HIV+, 16 COPD−HIV+, 22 COPD+HIV−, and 20 COPD−HIV−). The mean ± standard deviation age was 61.3 ± 11.6 years, with males (n = 48) making up 63.2% of the total. Of the PLWH (n = 34), 30 (88.2%) were receiving ART and 26 (76.5%) had undetectable HIV plasma viral load (< 40 copies/mL), and the mean CD4 count was 439 cells/mm³.

Table 1 Demographics and clinical features of the study cohort

PLWH with COPD feature reduced airway epithelial microbiome diversity and microbial community shifts

There were no significant differences in 16S rRNA gene copy levels between the COPD+HIV+, COPD+HIV−, COPD−HIV+, and COPD−HIV− groups (overall Kruskal–Wallis p = 0.300, Additional file 1: Fig. S2). Alpha diversity measured using the Shannon Diversity Index is represented in Fig. 1A–C. We observed significant differences between the (a) COPD+ and COPD− groups (median[interquartile range] 2.80[1.73] vs. 3.99[1.29]; p = 0.003, adjusted p = 0.046), (b) HIV+ and HIV− groups (2.66[1.74] vs. 3.83[1.15]; p = 0.002, adjusted p = 0.021), and (c) combined COPD and HIV groups: COPD+HIV+, COPD+HIV−, COPD−HIV+, and COPD−HIV− (2.56[1.61] vs. 3.19[1.33] vs. 3.48[1.94] vs. 4.13[0.91]; p = 1.5e−04, adjusted p = 0.016). PLWH with COPD featured the lowest alpha diversity of the four groups. There were no statistically significant differences in alpha or beta diversity measures by sex groups.

Fig. 1

Shannon Diversity Index differences between (a) COPD+ and COPD− groups, (b) HIV+ and HIV− groups, and (c) COPD+HIV+, COPD+HIV−, COPD−HIV+, and COPD−HIV− groups. The COPD+HIV+ group featured the lowest Shannon Diversity Index of all four groups. P-values were adjusted for age, sex, and smoking status. Microbial community structures in AECs according to (d) COPD status (COPD− (N): red circles; COPD+ (Y): blue circles), (e) HIV status (HIV− (Negative): red circles; HIV+ (Positive): blue circles), and (f) combined COPD and HIV status (COPD+HIV+: purple circles; COPD+HIV−: blue circles; COPD−HIV+: green circles; COPD−HIV−: red circles) based on Bray–Curtis distances; the centroids for each group are also shown. Permutational multivariate ANOVA (PERMANOVA) was used for comparisons of microbial community structures between groups, with adjustment for age, sex, and smoking status

Beta diversity measured between the COPD−HIV−, COPD+HIV−, COPD−HIV+, and COPD+HIV+ groups is shown in Fig. 1D–F. The principal coordinate plots and PERMANOVA analysis show that there were significant differences between the resident microbial communities of the COPD+ and COPD− groups (p = 0.001, adjusted p = 0.009), HIV+ and HIV− groups (p = 0.009, adjusted p = 0.041), and the combined COPD and HIV groups (p = 0.001, adjusted p = 0.006). Consistent with these findings, smoking status was also associated with significant differences in resident microbial communities (p = 0.037, Additional file 1: Fig. S3). There were no significant interaction effects between COPD and HIV on alpha and beta diversity metrics (Additional file 1: Fig. S4). Removal of patients using inhaled corticosteroids (n = 4) did not significantly change relationships in alpha and beta diversity (Additional file 1: Figs. S5, S6).
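The PERMANOVA comparisons reported above rest on a pseudo-F statistic whose significance is judged by permuting group labels. The Python sketch below shows that logic for a single grouping factor. It is a simplified, unadjusted illustration and not the covariate-adjusted adonis model used in the study; the function names, the number of permutations, and the seed are assumptions, and it expects at least some within-group variation (otherwise the denominator is zero).

```python
import random


def pseudo_f(dist, groups):
    """One-factor PERMANOVA pseudo-F from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx)
                         for j in idx[k + 1:]) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # Count the observed labelling itself as one permutation.
    return observed, (hits + 1) / (n_perm + 1)
```

In practice `dist` would hold the Bray–Curtis dissimilarities between samples and `groups` the COPD, HIV, or combined status of each sample.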

Relative abundances of phyla and genera are shown in Fig. 2 and in Additional file 1: Tables S1 and S2. The most abundant phylum in airway epithelial cells was Firmicutes, followed by Bacteroidetes and Proteobacteria. At the genus level, Prevotella, Veillonella, and Streptococcus were the most abundant. The COPD− group showed higher relative abundance of phyla Bacteroidetes and Fusobacteria, and genera Prevotella, Veillonella, Megasphaera, Neisseria, Selenomonas, and Fusobacterium compared to the COPD+ group. Similarly, the HIV− group had higher abundance of phylum Fusobacteria, and genera Prevotella, Neisseria, Selenomonas, and Fusobacterium compared to the HIV+ group. Phyla Fusobacteria and Bacteroidetes, and genera Prevotella, Megasphaera, Neisseria, Selenomonas, and Fusobacterium showed significant differences between the four groups (COPD−HIV−, COPD−HIV+, COPD+HIV−, and COPD+HIV+). There were no individual genera or phyla, however, that were significantly correlated with FEV1 percent predicted.

Fig. 2

Average relative taxa abundance comparisons at the phylum level between (a) COPD+ and COPD− groups, (b) HIV+ and HIV− groups, and (c) COPD+HIV+, COPD+HIV−, COPD−HIV+, and COPD−HIV− groups. Average relative taxa abundance comparisons at the genus level between (d) COPD+ and COPD− groups, (e) HIV+ and HIV− groups, and (f) COPD+HIV+, COPD+HIV−, COPD−HIV+, and COPD−HIV− groups

To address possible microbial contamination in AEC brushings, oral washings, reagent controls, bronchoscope channel washings, water rinsed over unused cytology brushes, and extraction negative samples were collected from the PLWH group. 16S rRNA gene copies were significantly elevated in the AEC brushings and oral wash controls compared to the background environmental samples (Additional file 1: Fig. S7A). PCA plots demonstrated that each sample type had a distinct community profile (Additional file 1: Fig. S7B, Table S3), thus suggesting that there was minimal cross-contamination.

Multi ’omic integration using DIABLO

The microbiome, transcriptome, and methylome were integrated using DIABLO to identify any ‘between’ and ‘within’ ’ome correlations based on (a) COPD, (b) HIV, and (c) COPD*HIV statuses (Additional file 1: Fig. S8). The top three ASV-Gene, ASV-CpG, and Gene-CpG pairs and their respective correlation values are shown in Table 2. The balanced error rate, used as an estimate of model performance, was ∼30–40% in each component. Network interactions between the microbiome, methylome, and transcriptome for the strongest correlations (absolute correlation > 0.75) are shown in Fig. 3. Complete tables of pairwise correlations for COPD, HIV, and COPD*HIV are provided in Additional files 2, 3, and 4, respectively.

Table 2 Top ASV-Gene, ASV-CpG and Gene-CpG pairs and their respective correlation values
Fig. 3

Network visualization of correlations between the microbiome, methylome, and transcriptome in (a) COPD, (b) HIV, and (c) COPD*HIV analyses. Nodes represent multiomic features (ASVs: pink triangles; CpGs: purple circles; genes: yellow squares), and the edge connecting any two nodes corresponds to their correlation (positive correlation: green; negative correlation: red). Correlation cutoff = 0.75

In both our COPD effect and combined COPD*HIV effect analyses, we identified Bacteroidetes Prevotella as a top ASV correlated with features of the transcriptome and the epigenome. In the former analysis, Bacteroidetes Prevotella was correlated with genes WDR72, AKR1C2, and SETDB1, and methylation sites CpG-TIMP3;SYN3, CpG-UTP11L, and CpG-PHACTR2; CpG-UTP11L was in turn correlated with genes WDR72 and AKR1C2, and CpG-TIMP3;SYN3 was correlated with gene WDR72. In the COPD*HIV analysis, Bacteroidetes Prevotella was correlated with genes FASTKD3, FUZ, and ACVR1B, and with CpG-FUZ and CpG-PHLDB3.

Discussion

In support of the hypothesis that HIV infection is associated with alterations in the airway epithelial microbiome, our analysis yielded two novel findings: (1) PLWH with COPD had significantly lower microbial diversity and a distinct microbial community in their airway epithelium compared to PLWH without COPD, HIV-negative COPD patients, and patients with neither HIV nor COPD; and (2) microbial features that appeared disrupted in PLWH with COPD were correlated with methylation and transcriptomic alterations along genes not previously recognized to be part of the pathogenesis of HIV-associated COPD.

Our analysis of the microbiome revealed that relatively “healthier” individuals were enriched in characteristic phyla Fusobacteria (in the COPD−, HIV−, and COPD−HIV− groups) and Bacteroidetes (in the COPD− and COPD−HIV− groups), and that decreases in these phyla may be associated with disease. However, we found no significant differences in the relative abundance of other characteristic phyla Proteobacteria and Firmicutes, contrary to the findings of Sze et al. [23], Xu et al. [16], and Ramsheh et al. [24]. Lower in the taxonomic hierarchy, we noted differences in relative abundance of genera Prevotella, Selenomonas, Neisseria, Fusobacterium, Streptococcus, and Veillonella in the group with both COPD and HIV. These microbes are known to be oral commensals and have previously been implicated in lower airway colonization and driving severity of disease in COPD [25,26,27]. Our results reinforce the notion that diversity and composition are important components of a “healthy” microbiome [28]. However, how exactly lung microbial dysbiosis translates to the clinical disease presentations observed in patients with HIV and COPD is still unclear. This relationship is likely multifactorial, with variations in host processes such as transcription, metabolism, and immunity contributing to a certain extent.

In light of this uncertainty, we carried out multi ’omic integration to combine information from three ’omes (microbiome, methylome, and transcriptome) and identified highly correlated ’omic features that may be relevant in the combined COPD and HIV states. Our integrative analyses consistently identified the single microbiome feature Bacteroidetes Prevotella, which was reduced in relative abundance in PLWH with COPD. Although Prevotella is a widely studied microbe, its exact role in the respiratory system is still poorly understood. Certain pathobiontic strains have been implicated as promoters of subclinical inflammation, particularly through increased Th-17 cytokine expression [29]. Twigg et al. observed that long-term ART use (> 3 years), which would normally be associated with a “healthier” phenotype, was associated with decreased Prevotella abundance in nine PLWH [14]. On the other hand, Prevotella has also been described in terms of healthy microbial ecosystems. Within HIV-uninfected patients with COPD, airway epithelial abundance of Prevotella has been associated with increased lung function, reduced dyspnea scores and inflammation, and expression of epithelial genes involved in tight junction promotion [24]. Many of these properties of Prevotella may be due to its interactions with other microbes and its dynamic role within the respiratory ecosystem. Recent studies have shown that Prevotella may exert its anti-inflammatory effects by inhibiting cytokine production by other gram-negative bacteria like Haemophilus influenzae or may co-aggregate and form heterotrophic biofilms with microbes like Porphyromonas [30, 31]. Future research based on cell culture models can help determine the strain-specific phenotypic response of the host.

Multi ’omics integration allows for greater insight into the impact microbial dysbiosis might have on the host airway response. For instance, Bacteroidetes Prevotella was highly correlated with the methylation and expression of specific genes in the COPD and COPD*HIV interaction effect multi ’omic analyses. Moreover, we found that in PLWH with COPD, reduced Prevotella abundance was correlated with increased methylation of CpG sites along the genes PHLDB3 and FUZ, decreased expression of gene FUZ, and increased expression of gene FASTKD3. The consistent appearance of FUZ in relation to Prevotella abundance in our multi ’omic integration analyses suggests a strong relationship between these two features along both epigenetic and transcription pathways. FUZ encodes a planar cell polarity protein that plays a prominent role in ciliogenesis [32]. Within the lung, higher methylation and lower expression of FUZ have previously been associated with tumour promotion and poor prognosis in lung adenocarcinoma [33], a compelling finding given the known increased rate of lung cancer in PLWH [34]. In our analysis, FUZ expression was also positively correlated with ACVR1B and PTPRF and inversely correlated with FASTKD3. ACVR1B, part of the transforming growth factor-beta family, has established associations with COPD pathogenesis, as an identified expression quantitative trait locus in COPD blood, sputum, and lung [35, 36] and as a causal gene in emphysema distribution [37]. While the associations of PTPRF and FASTKD3 with COPD are not as clear, links between these two genes and lung cancer prognosis [38, 39] suggest some degree of activity within the lung through their roles in apoptosis, cell growth and differentiation, and oncogenesis. Interestingly, PTPRF has been noted to have a role in regulating the assembly and contraction of actin and actomyosin, and the formation of tight junctions, with potential barrier function against HIV entry into target cells [40]. Whether modulation of Prevotella abundance within the airway epithelium can subsequently affect the activity of these downstream genes in PLWH with COPD is unknown, but may be a worthy area for future study.

Although this study is one of the largest to evaluate the airway epithelial microbiome in HIV-associated COPD, it has several limitations. First, although we showed that microbial disruptions are evident in the airways of PLWH with COPD, this study was not designed to prove causation of airway injury. Future experimentation using cell culture models or germ-free animals mimicking HIV infection may provide greater insight into the direction of the microbe-gene relationships identified in this study. Second, since the majority of our cohort were receiving ART and had undetectable HIV plasma viral loads, these results may not reflect microbiome and gene changes observed in PLWH not on ART. Third, we collected brushings from only one upper lobe lung segment per patient and acknowledge that there may be regional variation in the microbiome that we would not have been able to detect. Fourth, we did not have independently verified records of the cohort’s previous exacerbation and antibiotic use history. Fifth, there were demographic differences between our study groups, and future studies with greater numbers of female PLWH would be welcomed. Greater balance of concurrent asthma and bronchiectasis, as well as of lung function amongst the COPD subgroups, would also be beneficial. Finally, while multi ’omic integration is a promising approach for uncovering novel biologic relationships, it brings further challenges. Different ’omics datasets are often generated via varied technologies and platforms, and the search for a “gold-standard” workflow for data filtering, normalization, and integration continues [21]. “Over-fitting” the data in these workflows is a common concern, as it can cause predictive performance to suffer in other cohorts. Future work using a validation dataset can address these shortcomings and provide better assurance of model performance [41].

Conclusions

In this study, we demonstrate that multi ’omics integration can yield new insights into the impact of microbial dysbiosis in the airway epithelium of PLWH, identifying novel genes in the pathogenesis of HIV-associated COPD. These genes could be further explored as potential biomarkers or drug targets specific to PLWH.