Impact of early events and lifestyle on the gut microbiota and metabolic phenotypes in young school-age children
The gut microbiota evolves from birth and is in early life influenced by events such as birth mode, type of infant feeding, and maternal and infant antibiotics use. However, we still have a gap in our understanding of gut microbiota development in older children, and to what extent early events and pre-school lifestyle modulate the composition of the gut microbiota, and how this impinges on whole body metabolic regulation in school-age children.
Taking advantage of the KOALA Birth Cohort Study, a long-term prospective birth cohort in the Netherlands with extensive collection of high-quality host metadata, we applied shotgun metagenomics sequencing and systematically investigated the gut microbiota of children at 6–9 years of age. We demonstrated an overall adult-like gut microbiota in the 281 Dutch school-age children and identified 3 enterotypes dominated by the genera Bacteroides, Prevotella, and Bifidobacterium, respectively. Importantly, we found that breastfeeding duration in early life and pre-school dietary lifestyle correlated with the composition and functional competences of the gut microbiota in the children at school age. The correlations between pre-school dietary lifestyle and metabolic phenotypes exhibited a striking enterotype dependency. Thus, an inverse correlation between high dietary fiber consumption and low plasma insulin levels was only observed in individuals with the Bacteroides and Prevotella enterotypes, but not in Bifidobacterium enterotype individuals in whom the gut microbiota displayed overall lower microbial gene richness, alpha-diversity, functional potential for complex carbohydrate fermentation, and butyrate and succinate production. High total fat consumption and elevated plasma free fatty acid levels in the Bifidobacterium enterotype are associated with the co-occurrence of Streptococcus.
Our work highlights the persistent effects of breastfeeding duration and pre-school dietary lifestyle in affecting the gut microbiota in school-age children and reveals distinct compositional and functional potential in children according to enterotypes. The findings underscore enterotype-specific links between the host metabolic phenotypes and dietary patterns, emphasizing the importance of microbiome-based stratification when investigating metabolic responses to diets. Future diet intervention studies are clearly warranted to examine gut microbe-diet-host relationships to promote knowledge-based recommendations in relation to improving metabolic health in children.
KeywordsSchool-age children Gut microbiota Enterotype Metabolic phenotypes
Branched-chain amino acid
Dirichlet multinomial mixtures
Evolutionary genealogy of genes: Non-supervised Orthologous Groups
G protein-coupled receptors
Integrated gene catalog
Kyoto Encyclopedia of Genes and Genomes
Principal component analysis
Permutational multivariate analysis of variance.
Short-chain fatty acids
Microbial colonization in early life is crucial for infant health and may affect health status in later life [1, 2]. Substantial effort has been devoted into studying the development of the gut microbiota during infancy. Throughout the first year of life, the gut microbiota increases dramatically in diversity and stability, and reportedly reaches an adult-like configuration in the subsequent years [3, 4, 5]. Many studies have demonstrated that early events such as birth mode, type of infant feeding, the presence of older siblings, and maternal and infant antibiotics use affect the establishment and composition of gut microbiota during infancy [4, 6, 7, 8]. After weaning, dietary patterns moreover have a profound impact on shaping the childhood gut microbiota . However, we still have a gap in knowledge of the roles played by early events and lifestyle in the development of the gut microbiota during childhood.
Studies conducted with animal models and adult human beings have provided strong indications of metabolic cross-talk between gut microbes and the host. The gut microbiota influences the development and regulation of the immune system, and energy and metabolic homeostasis of the host via the production of a vast array of metabolites such as short-chain fatty acids (SCFAs) and secondary bile acids [10, 11, 12]. For instance, SCFAs produced by gut bacterial fermentation of complex dietary carbohydrates interact with G protein-coupled receptors (GPCRs) and affect adiposity and insulin resistance . Nevertheless, our understanding of how dietary patterns interact with the gut microbiota and subsequently affect metabolic traits of children is limited.
In this study, we examined the composition and functional potential of the gut microbiota of 281 Dutch children at early school-age (6–9 years of age) and revealed the impact of early events and pre-school dietary patterns on the gut microbiota of children. We identified enterotype-based differences of not only the structure and functional potential of the microbial communities, but also differences in the correlations between metabolic phenotypes and dietary patterns in children. Our results provide new insights into the normal developmental trajectories of the gut microbiota and the environmental factors affecting microbiota development in school-age children, and increase our understanding of the microbiota-dependent interactions between diets and host metabolic phenotypes.
Characterizing the gut microbiome of early school-age Dutch children
Comparison of gut microbiota in Dutch children and adults, and overweight and lean children
Comparison of gut microbial gene diversity in Dutch adults and children revealed an adult-like alpha diversity in the school-age children (Wilcoxon rank-sum test, P > 0.05, Fig. 1b). Principal component analysis (PCA) based on genus profiles showed no separation between Dutch children and adults (Fig. 1c). Within- and between-group β diversity of genus profiles further revealed a slightly more similar microbial composition within children compared to within adults (Fig. 1d). In line with previous studies comparing healthy infants, children (1–12 years), and adults [3, 17, 18], Dutch children showed a relative enrichment in abundance of the genus Bifidobacterium compared to adults (adjusted P < 0.05, Fig. 1e, Additional file 1: Table S4). Of note, the abundance of Bifidobacterium adolescentis, a reported adult-type Bifidobacterium species with no or poor ability for human milk oligosaccharides (HMOs) degradation [9, 19, 20], showed no differences between children and adults (Wilcoxon rank-sum test, P > 0.05, Additional file 1: Table S5). Comparisons further revealed that children were enriched in bacteria from the phylum Bacteroidetes including Bacteroides and Prevotella, while Firmicutes assigned to the genera Eubacterium, Clostridium, Dorea, and Coprococcus were more abundant in adults (Additional file 2: Figure S1a) (adjusted P < 0.05, Fig. 1e). A recent large gut microbiome cohort study on 1135 Dutch adult participants also reported higher abundances of Firmicutes (63.7%) than Bacteroidetes (8.1%) . By contrast, findings from a USA cohort revealed greater abundances of Firmicutes and significantly lower abundances of Bacteroidetes in healthy children aged 7–12 years than in healthy adults , indicating that the composition and development of bacterial communities varies in populations with different geographic and genetic origins.
At the functional level, a total of 6771 KEGG (Kyoto Encyclopedia of Genes and Genomes) orthologues (KOs) were identified in the childhood samples (median of 4695 per individual) and assigned to eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) functional categories. An adult-like stable composition based on 25 eggNOG functional categories was observed in children (Additional file 2: Figure S1b).
In order to determine if the previously reported relation between bacterial gene richness and BMI in adults  was also observed in children, we examined whether bacterial gene richness differed between overweight children (BMI z-score ≥ 1.04, n = 23) and lean children (BMI z-score < 1.04, n = 258). A BMI z-score ≥ 1.04 was used as the threshold for identification of overweight in children as recommended by the Dutch National Growth Study . Interestingly, we observed a bimodal distribution of bacterial gene counts in the overweight group (Additional file 3: Figure S2) where the children with a gene number < 600,000 (n = 8) exhibited significantly higher BMI z-score (Wilcoxon rank-sum test, P = 0.016). By contrast, no such gene distribution pattern was observed in lean children. We did not identify significant differences in relative species abundances between overweight and lean children (Wilcoxon rank-sum test, adjusted P > 0.05).
Stratification of Dutch children based on their gut microbiome
Although, the concept of gut microbiota enterotyping has been highly debated , a consensus concerning enterotypes, including guidelines for rational enterotyping, was recently achieved . Previous studies have suggested that human adult fecal metagenomes can be stratified into enterotypes that are independent of nationality, health status, age, BMI, or sex [26, 27], but associate strongly with long-term dietary patterns . To examine this phenomenon in children, we here conducted a Dirichlet multinomial mixtures (DMM)-based enterotype analysis  to investigate the presence and characteristics of enterotypes in children based on the underlying compositional structure of their gut microbiome (see details in “Methods” section).
Comparative analyses further revealed distinct compositional and functional differences of the gut microbiota between the three enterotypes. Interestingly, children with a gut microbiota belonging to E1 and E2 showed similar gene count and diversity (Dunn’s post-hoc test, P > 0.05) while children with E3 harbored about 110,000 (15%) fewer genes (median value: 604,692 genes/child, Dunn’s post-hoc test, P < 0.05) compared to the children with the other two enterotypes (Additional file 1: Table S6, Additional file 3: Figure S2a). Kruskal-Wallis tests on differences between the numbers of genes in each genus further demonstrated that E3 harbored significantly fewer genes from unannotated bacteria and several abundant genera including Bacteroides, butyrate-producing Faecalibacterium, Eubacterium, and Roseburia (Additional file 1: Table S9). E3 also exhibited lower Shannon diversity (Dunn’s post-hoc test, P < 0.05, Additional file 4: Figure S3b), and higher reads mapping rates (median of 64.2%) to taxonomically annotated genes than the other two enterotypes (Dunn’s post-hoc test, P < 0.05, Additional file 4: Figure S3c). At the functional level, E3 showed a slightly lower number of KOs than E1 (Dunn’s post-hoc test, P < 0.05, Additional file 4: Figure S3d). However, compared to the other enterotypes, E3 displayed similar functional diversity (Dunn’s post-hoc test, P > 0.05, Additional file 4: Figure S3e) and higher reads mapping ratios to the KO annotated genes (Dunn’s post-hoc test, P < 0.05, Additional file 4: Figure S3f). Lower β diversity was identified between E1 and adults, as compared to that of adults vs. E2 or E3 (Kruskal-Wallis test, P < 0.05, Additional file 4: Figure S3g), overall suggesting that the gut microbiota development may continue beyond 6–9 years of age. Altogether, our data points to a more adult-like gut microbiome in children belonging to E1 and much simpler structured gut microbiome in children belonging to E3.
KEGG pathway enrichment analysis revealed distinct differences in microbial functional patterns between enterotypes. Metabolic modules involved in biosynthetic pathways for leucine, lysine, serine, methionine, proline, and histidine production, together with amino acid transport systems for glutamate, branched-chain amino acid (BCAA), and d-methionine, were highly enriched in the microbiome of E3 (Reporter score > 1.96, Additional file 5: Figure S4a). Conversely, the module for cysteine biosynthesis was less represented in E3 (Reporter score < − 1.96, Additional file 5: Figure S4a), in agreement with prior studies reporting the lack of cysteine synthetase in the genomes of Bifidobacterium species [30, 31]. Furthermore, compared to the other two enterotypes, the gut microbiota of children belonging to E3 showed higher enrichment of functions involved in metabolism of simple sugars including glycolysis and the pentose-phosphate pathway, while functions for utilizing complex carbohydrates such as pectin, uronic acids, and glycosaminoglycan degradation were depleted (Additional file 1: Table S10, Additional file 5: Figure S4a). Consistently, Bacteroides and Prevotella have been reported to possess large numbers of genes for fermentation and utilization of complex polysaccharides . In contrast, in vitro culture experiments reported that Bifidobacterium strains grew well on glucose or ribose containing media but exhibited poor or no growth on media containing non-HMO-derived complex carbohydrates such as inulin or exopolysaccharide [33, 34]. Besides these differences, we also observed enterotype-dependent differences in the potential for biosynthesis of several water and lipid-soluble vitamins. For instance, children with a gut microbiota belonging to E1 showed higher potential for biosynthesis of cobalamin (B12) and biotin (B7), whereas children belonging to E2 exhibited higher potential for biosynthesis of menaquinone (vitamin K), pantothenate (B5), and riboflavin (B2) (Additional file 5: Figure S4a). A list of KEGG modules that differed significantly in abundance between enterotypes is provided in Additional file 1: Table S10. A heatmap showed that the relative abundance profiles of eight selected KOs, which are responsible for key steps in amino acid biosynthesis and carbohydrate metabolism, allowed to distinguish children in E3 from the two other groups (Additional file 5: Figure S4b).
Associations between early events and pre-school lifestyle and gut microbiota in school-age children
Spearman’s rank correlation analyses further revealed significant correlations between early events, pre-school lifestyle, and the relative abundance of certain microbial species (Additional file 6: Figure S5a, adjusted P < 0.05). Interestingly, bacteria enriched in E3 such as B. adolescentis, B. angulatum, and B. breve were negatively correlated with breastfeeding duration (Additional file 6: Figure S5a). The relative abundances of several Streptococcus species, reported to be part of the oral cavity microbiota of children [35, 36, 37], were positively related to maternal BMI (Additional file 6: Figure S5a, adjusted P < 0.05). The relative abundances of B. angulatum, B. dentium, and Streptococcus mitis were negatively correlated with plant-based protein intake (adjusted P < 0.05). Moreover, several species showed correlations with both early events and pre-school dietary patterns before age 5. For instance, Streptococcus parasanguinis and Streptococcus gordonii that associated with maternal BMI were also positively correlated with animal-based protein intake (adjusted P < 0.05). Similarly, the relative abundances of Collinsella intestinalis showed a positive correlation with maternal BMI and a negative correlation with dietary fiber intake (adjusted P < 0.05). Collectively, our findings indicate that both early events and pre-school dietary lifestyle contribute to shaping of the gut microbiota in school-age Dutch children, with different factors influencing each enterotype.
Enterotype-dependent associations between school-age metabolic phenotypes and pre-school lifestyle
An increasing number of studies has linked the gut microbiota with host metabolic phenotypes, including glucose and insulin homeostasis, and amino acid metabolism [38, 39, 40]. We examined the association between the gut microbiota and metabolic phenotypes including glucose, insulin, and amino acids levels measured in blood samples collected 3.5 h after the last meal on the same day as the fecal samples were collected (Additional file 6: Figure S5b). We observed that the relative abundances of Bacteroides spp., including Bacteroides xylanisolvens, B. dorei, B. vulgatus, and B. eggerthii, enriched in E1, were negatively correlated with plasma branched-chain amino acid (BCAA) levels. It has been demonstrated in a single mouse study that certain Bacteroides strains may contribute to BCAA degradation and hence reduce the circulating levels of BCAA in the host . In addition, we also observed positive correlations between Dorea longicatena, Coprococcus comes, and plasma glutamate levels, which is in line with a recent study reporting an enrichment of these species and plasma glutamate levels in young obese Chinese adults . The abundance of S. gordonii positively correlated with plasma total triglyceride and glucose.
Next, we examined if the measured levels of blood metabolic parameters, such as insulin and glucose, in school-age children differed between the three enterotypes. No differences were observed between individuals within the three enterotypes (Additional file 1: Table S11, Kruskal-Wallis test, P > 0.05).
Given the multi-relationships between gut microbiota and the above-mentioned early events and pre-school lifestyle, we next asked whether these factors would impact metabolic responses in the early school-age children. We first conducted Spearman’s rank correlation analyses between early factors and school-age metabolic phenotypes across the entire children cohort. We found that insulin levels exhibited negative associations with the pre-school lifestyle related to intake of plant-based protein and dietary fiber (Spearman’s correlation, adjusted P < 0.05, Additional file 7: Figure S6a). However, by stratifying according to enterotypes, we observed different correlation patterns among children within the different enterotypes. For instance, only children within E1 and E2 exhibited negative associations between insulin levels and intake of plant-based protein and dietary fiber (adjusted P < 0.05, Additional file 7: Figure S6b, c). In addition, the negative correlations between increased pre-school dietary fiber intake and reduced total serum triglyceride (TG) levels at school-age were solely present in E2 (Additional file 7: Figure S6c). Despite no such relationships were seen for E3 children, we found plasma free fatty acid (FFA) levels positively correlated with animal-based protein (Additional file 7: Figure S6d), and additionally, a few Streptococcus spp. to be positively correlated with the plasma FFAs levels only in this group (Additional file 8: Figure S7). Most of these enterotype-dependent associations persisted after adjusting for multiple covariates including sex, age, and BMI z-score at stool collection and early events by linear regression models (Additional file 1: Table S12).
Collectively, our results illustrate that early events and pre-school dietary lifestyle impact on the development of the gut microbiota, which in turn seems to affect host metabolic responses.
Comprehensive information on the structural and functional configuration of the gut microbiota of early school-age children is limited. Previous studies have surveyed this age-specific population using 16S rRNA gene amplicon sequencing or metagenomic shotgun sequencing with relatively small sample sizes [17, 18]. In this study, we provide a comprehensive overview of the gut microbial characteristics of 281 early school-age Dutch children, displaying an enrichment of bacteria from the Bacteroidetes and Actinobacteria phyla, and a similar functional composition as compared to Dutch healthy adults. Furthermore, we identified three enterotypes in healthy children, with children dominated by Bifidobacterium showing the lowest gene number and the lowest diversity compared to children enriched by Bacteroides or Prevotella, suggesting stratified developmental trends of the childhood gut microbiota toward an adult-like configuration.
Further supporting the concept of the importance of environmental factors in shaping the development of the gut microbiota, we found several factors, especially breastfeeding duration in early life and plant-based food intake (dietary fiber and plant-based protein) in pre-school children (adjusted P < 0.05), to significantly correlate with the gut microbiota composition in school-age children. Moreover, we identified a shorter breast-feeding duration in the Bifidobacterium-dominated enterotype (E3) in school-age children than in other groups. This was reflected in significant negative correlations between a few Bifidobacterium species and breastfeeding duration including the most abundant adult-type B. adolescentis. In pre-school children, less than average dietary fiber intake was also observed in E3, in accordance with previous studies that reported significantly lower abundance of Bifidobacterium in healthy subjects on fiber-blend formula than subjects on fiber-free formula , and lower Bifidobacterium in vegans than in vegetarians . Functional analyses demonstrated that children with a gut microbiota enriched in Bifidobacterium strains favored utilization of simple sugars but lacked the potential for complex carbohydrate utilization. In addition, a recent Dutch adult cohort study reported that the relative abundance of Bifidobacterium in the gut could be affected by an interaction between genotype and intake of dairy products, with adults of the GG genotype on rs4988235 in the MCM6 gene showing a positive relationship between Bifidobacterium abundance and milk product consumption . Combined, the present and previous findings suggest that intestinal Bifidobacterium levels might not only be influenced by early events and dietary patterns but also by host genetics. However, the detailed mechanisms underlying colonization and establishment of Bifidobacterium species in the human gastrointestinal tract remain to be determined.
Dietary patterns have been shown to substantially impact on the gut microbiota and host metabolism. A key finding of this study is the enterotype-stratified association pattern between school-age metabolic phenotypes and pre-school dietary patterns. Although the benefits of a high-fiber diet have been well documented in epidemiological studies, we only observed significant negative correlations between plant-based diet intake (dietary fiber and plant-based protein) and insulin levels in children with high abundances of Bacteroides or Prevotella, but not in children with high Bifidobacterium abundance. Similarly, the dietary fiber-induced improvement in postprandial glucose response has also been shown to be associated with increased abundance of Prevotella in a recent human intervention study . In our search for possible functional explanations for the identified enterotype-dependent correlations, we discovered very distinct gut microbial functional capacities between enterotypes, particularly in relation to the potential for transformation of metabolites derived from microbial fermentation. Thus, the functional potential for fermentation of complex carbohydrate was less in children with high Bifidobacterium abundance compared to others. Focusing on the two groups with higher fermentation potential, we observed that children with a Bacteroides-driven enterotype exhibited a higher abundance of a set of genes (buk and but) related to butyrate biosynthesis, while children with a Prevotella-driven enterotype possessed higher potential for succinate production. Both butyrate and succinate have been demonstrated to exert beneficial effects in relation to glucose and insulin homeostasis [13, 43]. Together, our findings support the notion that host metabolic benefits of plant-based diets might be conferred by specific bacteria and specific metabolites derived from carbohydrate fermentation. In addition, several Streptococcus spp. exhibited negative correlations with dietary fiber and plant-based protein intake in E2, but not in E1 and E3. Streptococcus spp. being high in E3 might also play a role in the enterotype-based metabolic responses to dietary patterns in this group due to the reported positive correlation with total fat consumption and elevated plasma free fatty acid levels. Further well-controlled dietary intervention studies are warranted to confirm the metabolic responses to dietary factors and establish the causal roles for these bacterial taxa and functional pathways in determining the host metabolic benefits observed in this study.
In conclusion, this study reveals important characteristics of the gut microbial structure and function in healthy early school-age Dutch children, which are affected by certain early life factors, such as breastfeeding duration and pre-school dietary habits, while the influence from physical activity is negligible. Of specific note, our findings suggest that distinct metabolic responses to dietary lifestyle are strongly governed by the composition and functional potentials of the gut microbiota, implying that stratification of children according to gut microbiota enterotypes may well be included in future investigations on the relationship between dietary intake and metabolic health in children.
KOALA study population
The 281 Dutch children were part of the KOALA Birth Cohort Study. The design of the study has been described in detail elsewhere . Inclusion criteria for the present study were the availability of fecal and blood samples collected in children 6 to 9 years of age (72–108 months). Children born prematurely, twins, children with abnormalities linked to growth (such as Down’s syndrome, Turner syndrome, Fallot’s tetralogy, multiple disabilities, and cystic fibrosis), and children treated with antibiotics within 4 weeks prior to fecal sample collection were excluded. Fecal samples were collected as previously described . Blood samples and anthropometric measures were collected during home visits within the week of fecal collection. Height and weight were measured by trained research assistants. BMI measurements were converted into age- and gender-specific z-scores using data from children enrolled in the Dutch National Growth Study as the reference population and converted into dichotomous outcomes: lean vs. overweight (BMI z-score > 1.04, corresponding to the 85th percentile according to standard guidelines ).
Blood samples were collected approximately 3.5 h after the last meal intake. Data on early events were collected through repeated parent-reported questionnaires during pregnancy and early life. Lifestyle characteristic including dietary intake (Food Frequency Questionnaire) and physical activity were collected through questionnaires at 4–5 years as described previously . Blood biochemical indices were measured by radioimmunoassay (insulin, adiponectin, leptin) or with enzymatic/colorimetric methods on an automated spectrophotometric analyzer (free fatty acids, glucose, total and HDL cholesterol, triglycerides, high-sensitivity C-reactive protein). Plasma amino acids were measured by ultra-performance liquid chromatography–tandem mass spectrometry (UPLC-MS/MS) . The summary of sample information is presented in Additional file 1: Table S1.
Generation of children fecal metagenomic data
Libraries with an insert size of 350 base pair (bp) were constructed from metagenomic DNA for each sample following the manufacturer’s instructions (Illumina, San Diego, California, USA). Illumina sequencing with 100 bp paired-end reads was applied to all 281 samples.
Availability of public Dutch adult metagenomic data
Illumina-based paired-end metagenomic sequencing data were collected from 62 healthy Dutch adult controls from the Maastricht IBS cohort (MIBS-CO) . The sample information is presented in Additional file 1: Table S3.
Profiling of metagenomic samples
The raw sequencing data from the KOALA cohort were processed for quality control using the FASTX toolkit in the MOCAT pipeline . We trimmed reads with continuous bases from the 3′-end of a read with average Phred score ≤ 20 and kept the remaining high-quality reads with length larger than 30 bp. The high-quality reads were then aligned to hg19 using SOAP2.2 (identity ≥ 0.9) to remove human reads. The high-quality non-human reads were defined as clean reads and aligned against the human gut microbial integrated gene catalog (IGC) to generate count profiles using SOAP2.2 (identity ≥0.95) .
To eliminate the influence of sequencing depth in comparative analyses, unique IGC mapped reads of each sample were downsized to 20 million for each child. After this, we identified 268,363 to 1,069,059 microbial genes in the 281 samples, with an average of 694,404 genes per sample. The relative abundance profiles of genes, genera, species, and KOs were determined by summing the relative abundance of genes from each taxon or KO using the downsized mapped reads per sample . Due to the relatively limited amount of data (average 13.5 million uniquely mapped reads per adult), the metagenomic datasets from Dutch adults were processed to calculate Shannon index and generate gene, genus, KO, and eggNOG profiles using the same pipeline as described above without reads downsizing. Data statistics for each sample are provided in Additional file 1: Table S2.
Richness, diversity, and enterotype analysis
Alpha diversity (within-sample diversity) was quantified by the Shannon index using gene relative abundance profiles . Beta diversity (between-sample diversity) was calculated using Bray-Curtis distance (R 3.2.5, vegan package 2.4-4). For Dutch children, the gene and KO counts which represented gene or functional richness were calculated in each downsized sample in accordance with the previous study .
Genus-level enterotypes analysis was performed according to the Dirichlet multinomial mixtures (DMM) and partitioning around medoid (PAM)-based clustering protocols using Jensen-Shannon divergence (PAM-JSD) and Bray-Curtis (PAM-BC) (Additional file 9: Figure S8) [29, 27]. The optimal cluster number of the protocols was calculated using Laplace approximation and Calinski–Harabasz index [29, 27], with all protocols indicating an optimal cluster number of three among the 281 Dutch children (Additional file 9: Figure S8a–c). We next randomly selected 200, 220, 240, 260, and 280 samples for enterotype clustering to evaluate clustering stability of the three protocols, by repeating the above sampling and clustering process 10 times for each sample size and generating enterotyping results of 50 times for each method. The DMM-based protocol, which showed a higher consistency than the two PAM-based protocols, was chose for this study (Additional file 9: Figure S8d–f).
KEGG enrichment analysis
Differentially enriched KEGG modules were identified according to reporter Z-scores. One-tail Wilcoxon rank-sum test was performed on all the KOs that occurred in more than six samples and adjusted for multiple testing using the Benjamini-Hochberg method . The Z-score for each KO was calculated based on the adjusted P value for a particular KO and the aggregated Z-score for each module was calculated from the Z-scores of all KOs involved in the module . An absolute value of reporter score of 1.96 (95% confidence according to normal distribution) or higher was used as the detection threshold for significance.
Before comparison, the taxa with low coverage of annotated gene number (less than 100) in the entire cohort were filtered for further analyses and this confined our analysis to 82 genera and 226 species. Principal component analysis (PCA) was implemented using the function prcomp in R 3.2.5.
Wilcoxon rank-sum tests were conducted to detect differences in the gut microbial characteristics, including gene count, Shannon index, and the relative abundances of genera, species, and KOs between the Dutch children and adults, and between overweight and lean children. Kruskal-Wallis tests were performed to assess the differences in gut microbial characteristics and continuous phenotypic variables between enterotypes. Dunn’s post-hoc tests followed by pairwise comparisons were performed to explore the differences between two groups. To detect the differences in categorical phenotypic variables, Chi-square tests were conducted. The Benjamini-Hochberg method was used for multiple testing correction , with cutoff for adjusted P value at 0.05.
Permutational multivariate analysis of variance (PERMANOVA) was performed to assess the correlation between gene-level microbial profiles and phenotypic factors including early events and lifestyle (Bray-Curtis distance). The pseudo F statistics and P values were calculated using the function adonis from vegan package in R 3.2.5 based on 9999 permutations. The cutoff was set as adjusted P < 0.05.
SparCC was run with default parameters and 1000 bootstraps to test for correlations between the relative abundances of species . Pseudo P values were calculated as the proportion of simulated bootstrapped data sets with a correlation at least as extreme as the one computed for the original data set. The significant cutoff for SparCC was set at pseudo P < 0.01. The co-occurrence network of species was visualized in Cytoscape 3.5.1.
Spearman’s rank coefficient correlation (SCC) analysis was used for correlations between continuous phenotypic factors and between continuous phenotypic factors and microbial species or KOs. The significant cutoff for SCC was set at an adjusted P < 0.05.
General linear model (GLM) regression analyses were conducted to validate the enterotype-dependent associations between pre-school lifestyle and school-age blood metabolic parameters determined by Spearman correlation analysis. Two confounder adjustment models were applied, with model 1 adjusting for gender, age, and BMI z-score and model 2 adjusting for gender, age, BMI z-score, and all early events. Shapiro-Wilk test was conducted to test for normality of residuals of the regression models. Box-Cox transformation was conducted on dependent variables of non-normally distributed residuals. Residuals were considered as normally distributed with P > 0.05 (Shapiro-Wilk test). The regression coefficient (β) and two-tailed P values for the coefficient were calculated (glm in R 3.2.5). The P < 0.05 was regarded as significant.
We thank all the family volunteers who participated in this study. We thank Assistant Professor Qiang Sun from University of Toronto for providing useful statistical suggestions. We also thank Dr. Cong Lin at BGI for helpful discussion and suggestions on developing the manuscript. We gratefully acknowledge colleagues at BGI for DNA extraction, library preparation and sequencing experiments, and helpful discussions.
This work was supported by grants from the EFSD/Chinese Diabetes Society/Lilly Program for Collaborative Diabetes Research between China and Europe, Netherlands Asthma Foundation (grant no. 3.2.07.022 & 3.2.03.48), Stichting Astma Bestrijding, Dutch Biobanking and Biomolecular Resources Infrastructure (BBMRI-NL), and Shenzhen Municipal Government of China (No. JSGG20160229172752028). J.P. is supported by a grant from ZonMW/JPI HDHL Intestinal Microbiomics (grant 50-52905-98-599).
Availability of data and materials
Metagenomic sequencing data for all 281 children samples have been deposited in the European Bioinformatics Institute (EBI) database under accession code (ENA) PRJEB26795 and China Nucleotide Sequence Archive (CNSA): CNP0000037.
CT, IA, and JP designed the KOALA study. CT and JP coordinated the follow-up of KOALA children at 6–9 years. DJ and AM designed and coordinated the study among healthy adults and provided the metagenomic sequencing related to the healthy adults. HZ, ZS, HR, CF, KC, QD, and JL performed the bioinformatic analyses of the study guided by KK and SB. HZ, JP, IA, KC, SB, and KK interpreted the data. HZ, JP, IA, ZS, HR, SB, and KK wrote the manuscript. HZ, JP, IA, CT, ZS, HR, CF, KC, JL, XX, HY, JianW, JunW, SB, and KK participated in discussions. All authors read and approved the final manuscript.
Ethics approval and consent to participate
For the KOALA Birth Cohort Study, written informed consent was given by all parents, and the study was approved by the Medical Ethics Committee of Maastricht University Medical Centre (MUMC) and the National Committee of Medical Research. The MIBS-CO study was also approved by the institutional ethics review board of the MUMC and all participants signed an informed consent form.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 19.Biavati B, Castagnoli P, Trovatelli LD. Species of the genus Bifidobacterium in the feces of human adults. Microbiol. 1986;9:39–46.Google Scholar
- 33.Rios-Covian D, Arboleya S, Hernandez-Barranco AM, Alvarez-Buylla JR, Ruas-Madiedo P, Gueimonde M, et al. Interactions between Bifidobacterium and Bacteroides species in cofermentations are affected by carbon sources, including exopolysaccharides produced by bifidobacteria. Appl Environ Microbiol. 2013;79:7518–24.CrossRefGoogle Scholar
- 35.Ghasempour M, Rajabnia R, Irannejad A, Hamzeh M, Ferdosi E, Bagheri M. Frequency, biofilm formation and acid susceptibility of streptococcus mutans and streptococcus sobrinus in saliva of preschool children with different levels of caries activity. Dent Res J. 2013;10:440–5.Google Scholar
- 50.Waterval WAH, Scheijen JLJM, Ortmans-Ploemen MMJC, Habets-van der Poel CD, Bierau J. Quantitative UPLC-MS/MS analysis of underivatised amino acids in body fluids is a reliable tool for the diagnosis and follow-up of patients with inborn errors of metabolism. Clin Chim Acta. 2009;407:36–42.CrossRefGoogle Scholar
- 52.Hochberg B. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc. 1995;57:289-300Google Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.