The sBNs were obtained by prudent use of BNs in conjunction with CoNs. The main contribution of this paper is evidence supporting the claim that sBNs can help make inferences about colonization order. In some niche environments, research has shown that microbes colonize the niche in specific orders, with early colonizers often recruiting late colonizers or creating conditions that make the niche more attractive to specific late colonizers [25]. We have observed that the edges of sBNs are consistent with known colonization orders with high accuracy. In particular, we show that the sBNs can capture colonization order when augmented with the correlation coefficient. The findings were validated by analyzing oral, infant gut, and vaginal microbiome data sets, for which prior published information on colonization order was available. The colonization order was also retained in our experiments with the semi-synthetic data sets.
The sBNs generated from the data sets mentioned above were visualized with Cytoscape. In all the sBNs generated (Figs. 1, 2, 3, 4 and Additional files 1–6), nodes correspond to bacterial taxa, node size is proportional to the average abundance of the taxon, edge thickness is proportional to the absolute value of the Pearson correlation coefficient (i.e., a measure of co-occurrence), and edge opacity is proportional to the bootstrap value of the edge. Edges are colored green and red for positive and negative correlations, respectively. Purple and red nodes correspond to the bacterial taxa that are described in the published literature as early and late colonizers, respectively [26–28]. Black nodes indicate colonizers whose order has not been described previously. We note (data not shown) that while there are many strongly connected clusters in CoNs, these nodes remain connected in sBNs (as expected), but relatively sparsely because of the stringent conditional probability tests.
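The visual encoding described above can be sketched as a small mapping from edge statistics to drawing attributes. This is only an illustration, not the Cytoscape style used for the figures: the function name `edge_style`, the `max_width` scale, and the gray color for a correlation of exactly zero are assumptions introduced here.

```python
def edge_style(pearson_r, bootstrap, max_width=8.0):
    """Map one edge's statistics to drawing attributes (hypothetical helper).

    pearson_r -- Pearson correlation between the two taxa, in [-1, 1]
    bootstrap -- bootstrap support of the edge, in [0, 1]
    max_width -- assumed drawing width for |r| = 1 (not from the paper)
    """
    if not -1.0 <= pearson_r <= 1.0:
        raise ValueError("pearson_r must lie in [-1, 1]")
    if not 0.0 <= bootstrap <= 1.0:
        raise ValueError("bootstrap must lie in [0, 1]")

    if pearson_r > 0:
        color = "green"   # positive correlation
    elif pearson_r < 0:
        color = "red"     # negative correlation
    else:
        color = "gray"    # assumption: the text does not cover r == 0

    return {
        "width": max_width * abs(pearson_r),  # thickness ~ |r|
        "color": color,
        "opacity": bootstrap,                 # opacity ~ bootstrap support
    }
```

For example, `edge_style(-0.5, 0.9)` yields a red edge of half the maximum width drawn at 90% opacity.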
Semi-synthetic data from infant gut microbiome – sBN edges are consistent with temporal order
The infant gut data set was temporally aligned as described earlier. We then divided the timeline into k periods, with k=1,2,…, and created an sBN from each period. The goal was to see whether any of the known orders of colonization could be observed in the figures, even after the time axis of each subject had been modified differently.
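The partitioning step can be illustrated with a minimal sketch. It assumes equal-width periods over the aligned time axis, which the text does not specify; the function name `split_into_periods` and the `(time, value)` sample layout are hypothetical.

```python
def split_into_periods(samples, k):
    """Split (time, value) samples into k equal-width time periods.

    Assumption: periods have equal width between the earliest and latest
    time point; the paper does not state how period boundaries were chosen.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not samples:
        return [[] for _ in range(k)]

    times = [t for t, _ in samples]
    t_min, t_max = min(times), max(times)
    span = t_max - t_min

    periods = [[] for _ in range(k)]
    for t, value in samples:
        if span == 0:
            idx = 0
        else:
            # Scale t into [0, k); the latest sample goes into the last period.
            idx = min(int((t - t_min) / span * k), k - 1)
        periods[idx].append((t, value))
    return periods
```

Each returned list would then be used to learn a separate sBN, one per period.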
The infant gut is dominated by three classes that generally appear and colonize in a sequential order: Bacilli (Firmicutes) soon after birth, which then give way to Gammaproteobacteria (Proteobacteria), followed by Clostridia (Firmicutes) [29]. When we partitioned the time series into k=2 periods, the sBN from the first period had a directed edge from Bacilli to Gammaproteobacteria. The red-colored edge indicated a negative correlation, as would be expected if this inference reflected colonization order. Additionally, the sBN generated from the second period showed a directed edge from Gammaproteobacteria to Clostridia, also colored red (Fig. 5).
When the time series were partitioned into three periods, the same two edges were represented strongly in periods 2 and 3, respectively. The strengths of the two edges in the three periods were (1) 0.4 and 0.16 (i.e., both weak), (2) 0.94 and 0.16, and (3) 0.61 and 0.80. These observations strongly suggest that the transition from Bacilli to Gammaproteobacteria occurs before the transition from Gammaproteobacteria to Clostridia, and that the colonization order is supported in the sBNs.
We therefore conclude that sBNs are capable of capturing colonization order using the methods suggested above. For both edges, the red color (negative correlation) is consistent with a model in which one taxon declines in abundance while the other increases.
Oral microbiome – sBN edges are consistent with colonization order
In the oral cavity, early and late bacterial colonizers have been identified and reviewed in the literature [26]. Many species from the genus Streptococcus are early primary colonizers, accounting for 60%–90% of the early abundance profile [30]. The following taxa have been identified as early and late colonizers for oral microbiomes [26–28]. Early: Streptococcus gordonii, Streptococcus mitis, Streptococcus oralis, Streptococcus sanguis, Actinomyces israelii, Actinomyces naeslundii, Propionibacterium acnes. Late: Selenomonas flueggei, Treponema spp., Porphyromonas gingivalis.
Comparison of the sBNs for all oral microbiomes (Figs. 1–2 and Additional files 1–6) showed that the keratinized gingiva (Fig. 1) and tongue dorsum (Fig. 2) have the fewest distinct taxa. The sBNs for these two sites were more distinctive than those derived from other sites and showed stronger correlations between taxa. The saliva, subgingival, and palatine tonsil sites harbored a higher number of taxa and exhibited weaker correlations. Note that not every taxon is present in every oral site, which explains the differences in the set of nodes present in each sBN.
The sBNs for the oral microbiomes had a combined total of 716 edges. Of these, 78 edges connected vertices that were associated with known early or late colonizers. Table 1 summarizes the directed edges between early and late colonizers, whether they are consistent with the known colonization order, and the sign of the correlation (negative/positive edges) between them. More than 90% of the sBN edges for the oral microbiome were directed, with the exceptions of saliva and buccal mucosa, for which only 83–84% were directed. Of the 78 edges connecting labeled vertices, all except two were consistent with the known colonization order, i.e., directed from early to late colonizers (Table 1). These two edges are shown as dashed lines in the corresponding sBNs (see Additional file 2 and Additional file 5). In summary, for the oral microbiome the directed sBN edges go from early to late colonizers, with few exceptions. For example, the sBN from keratinized gingiva (Fig. 1) has three directed edges (Actinomyces2-Porphyromonas1, Streptococcus1-Porphyromonas1, and Streptococcus2-Porphyromonas1) from early colonizers to late colonizers and none from late to early colonizers. Note that all taxonomic names have been abbreviated in the figures to the first five characters plus a number; each name refers to a distinct OTU. The sBNs for the buccal mucosa (Additional file 1), palatine tonsils (Additional file 2), saliva (Additional file 3), subgingival plaque (Additional file 4), supragingival plaque (Additional file 5), and throat (Additional file 6) are included in the supplementary files.
Table 1 Inferring colonization order in oral microbiomes
Oral microbiome – sBN edges with negative correlation are consistent with colonization order
As mentioned above, two of the 78 edges are exceptions to the rule that no edges in the sBNs are directed from late to early colonizers. In particular, one edge goes from Trepo5 (Treponema, labeled as a late colonizer) to Actin3 (Actinomyces, labeled as an early colonizer) in palatine tonsils. Similarly, another edge goes from Porph3 (Porphyromonas, labeled as a late colonizer) to Actin3 (Actinomyces, labeled as an early colonizer) in supragingival plaque. However, the correlation coefficient of both of these edges is positive. Thus, the accuracy in terms of direction is 97.4% (76 of 78 edges), and all correctly directed edges have negative correlations. According to Kolenbrander et al., the bacterial taxa representing early colonizers coaggregate with only a specific set of other early colonizers, and not with any of the late colonizers [26]. Our findings, albeit limited, are consistent with this observation in that all edges directed from early to late colonizers are negatively correlated (red edges).
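The direction accuracy reported above is a simple ratio, which the following sketch makes explicit. It assumes that only edges joining one early-labeled and one late-labeled taxon are counted; the function name `direction_accuracy` and the toy edge list, which reuses node names quoted in the text, are illustrative and do not reproduce Table 1.

```python
def direction_accuracy(edges, early, late):
    """Count directed edges consistent with early -> late colonization.

    edges -- iterable of (source, target) node names
    early -- set of node names labeled as early colonizers
    late  -- set of node names labeled as late colonizers
    Returns (consistent, total), where total counts only edges that join
    an early and a late colonizer (assumption made for this sketch).
    """
    consistent = 0
    total = 0
    for source, target in edges:
        if source in early and target in late:
            consistent += 1
            total += 1
        elif source in late and target in early:
            total += 1
    return consistent, total


# Toy example built from node names mentioned in the text.
early = {"Actinomyces2", "Streptococcus1", "Streptococcus2", "Actin3"}
late = {"Porphyromonas1", "Trepo5"}
edges = [
    ("Actinomyces2", "Porphyromonas1"),
    ("Streptococcus1", "Porphyromonas1"),
    ("Streptococcus2", "Porphyromonas1"),
    ("Trepo5", "Actin3"),  # one of the two late -> early exceptions
]
consistent, total = direction_accuracy(edges, early, late)

# The reported figure corresponds to 76 consistent edges out of 78.
reported_accuracy = round(100 * 76 / 78, 1)
```

In the toy example three of four edges are consistent; with the counts given in the text, 76 of 78 edges give 97.4%.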
Infant gut microbiome
The abundance of microbes in neonates over the course of the first few weeks of their lives has been reported [29]. In two infant gut microbiome studies, the classes Bacteroidetes and Gammaproteobacteria were observed early, followed by Bacilli, Clostridia and Gammaproteobacteria [29, 31]. Over time, there was a significant decrease in Bacilli, and the infant's gut appeared to undergo a tug-of-war between the two classes Gammaproteobacteria and Clostridia [31]. When the sBNs were constructed with the infant gut microbiome data, we obtained a directed network that supported the claim that sBNs shed light on the colonization pattern (Fig. 3). There were directed edges from Bacteroidetes, Bacilli, and Clostridia to Gammaproteobacteria (Fig. 3). The results also supported the prior knowledge that Clostridia precedes Bacilli in the colonization order. These taxa are mostly negatively correlated (red edges), as shown in Fig. 3, reinforcing the point that a directed edge combined with a negative correlation is strongly suggestive of colonization order.
Vaginal microbiome
A healthy vaginal microbiome is dominated mainly by Lactobacillus species [32]. When women of reproductive age suffer from bacterial vaginosis (BV), the Lactobacillus species are replaced by Gardnerella, Peptostreptococcus, Atopobium, Sneathia, Parvimonas, and Corynebacterium, among others [33]. Figure 4 shows three sBNs for vaginal microbiomes associated with low (healthy), medium (early BV), and high (advanced BV) Nugent scores. All samples were analyzed for the abundance of the same set of 23 genera. Overall, the predominant genera observed were Lactobacillus, Atopobium, Gardnerella, Parvimonas, and Prevotella (Fig. 4).
In the sBN associated with the healthy “vaginome”, the abundance of Lactobacillus was comparatively higher, as expected. The Lactobacillus species, especially L. crispatus and L. iners (data not shown), displayed an antagonistic relationship with the BV-associated Gardnerella.
In the sBN for the medium Nugent score cohort, indicative of early vaginosis, the BV-associated genera Atopobium, Sneathia, and Gardnerella were significantly increased in abundance and appeared as early colonizers. The abundance of all the BV-associated pathogens was negatively correlated with Lactobacillus, reaffirming an antagonistic relationship.
In the sBN for the advanced BV cohort, characterized by higher Nugent scores, a proportional increase in abundance was observed for Atopobium, followed by Gardnerella. Despite their antagonistic relationship with Lactobacillus, the BV-associated pathogenic genera, especially Atopobium, Gardnerella, and Sneathia, are connected by a directed edge to Lactobacillus. The appearance of the pathogenic genera as late colonizers is consistent with clinical findings [34]. Strong positive relationships were observed between Prevotella and Peptostreptococcus, and between Peptostreptococcus and Parvimonas. This may suggest that the presence of Prevotella enables the colonization of Peptostreptococcus, followed by Parvimonas.
To check robustness, we also experimented with a larger number of taxa, i.e., by including all taxa whose abundance added up to 99.99%. We found that sBNs can retrieve the known colonization order even when taxa with small abundance are included (from 99% to 99.99% of the most abundant taxa; see Additional file 7).
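One way to read the cumulative-abundance cutoff used in this robustness check is sketched below. The selection rule (take taxa in decreasing order of abundance until the running total reaches the chosen fraction) is an assumption about how such a cutoff might be applied; the function name and the toy abundance values are hypothetical.

```python
def taxa_up_to_cumulative_fraction(abundance, fraction):
    """Return the most abundant taxa whose abundances sum to `fraction`.

    abundance -- dict mapping taxon name to total abundance (any unit)
    fraction  -- cumulative share to cover, e.g. 0.99 or 0.9999
    Assumption: taxa are added in decreasing order of abundance until the
    running total reaches fraction * total.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")

    total = sum(abundance.values())
    ranked = sorted(abundance.items(), key=lambda item: item[1], reverse=True)

    selected = []
    running = 0
    for taxon, value in ranked:
        if running >= fraction * total:
            break
        selected.append(taxon)
        running += value
    return selected


# Toy abundances (not from the paper).
toy = {"A": 60, "B": 30, "C": 9, "D": 1}
common = taxa_up_to_cumulative_fraction(toy, 0.95)
with_rare = taxa_up_to_cumulative_fraction(toy, 0.9999)
```

Raising the fraction only ever adds lower-abundance taxa, so the node set at 99.99% is a superset of the node set at 99%.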