Isolation of gut organisms involved in metabolizing dietary components
Studies on the metabolism of the soy isoflavone daidzein to equol provide good examples of methods commonly used to identify specific organisms involved in gut microbiota metabolism of dietary compounds. The identification of equol-producing organisms has been the subject of several studies as a consequence of the potential importance of equol in human health and the intriguing observation that only about 30% of people appear to be capable of its production.
Matthies et al. isolated a novel strain from an equol-producing subject by serial dilution of a faecal homogenate and incubation in a nutrient broth containing 100 µM daidzein and tetracycline. The tetracycline inhibited the growth of the majority of the faecal microbiota without affecting the metabolism of daidzein. From the highest dilution that contained equol-producing microorganisms, further serial dilutions were prepared, and the procedure was repeated until a pure culture was obtained. On the basis of phenotypic and phylogenetic characterization, the culture was identified as a new species and named Slackia isoflavoniconvertens.
In their study to identify equol-producing organisms, Decroos et al.  serially diluted a faecal sample from an equol producer and plated on a nutrient agar. Single colonies from the plates were tested for ability to metabolize daidzein. From one such colony a stable, mixed culture capable of converting daidzein to equol was obtained and shown to comprise four bacterial strains identified as Lactobacillus mucosae, Enterococcus faecium, Finegoldia magna, and a Veillonella sp. The first three were obtained as pure cultures, but interestingly, none was capable of producing equol in pure culture, and the complete consortium was required for the conversion. These isolation attempts illustrate the difficulties that can be experienced in obtaining a pure culture of a bacterium that is intimately dependent on another bacterial species/strain to provide essential growth co-factor(s).
Enrichment techniques have been used extensively in environmental microbiology to isolate organisms capable of degrading contaminants and other xenobiotics in the environment. These techniques usually involve either suspension batch cultures or continuous culture enrichment methods in which the mixed culture is incubated with the xenobiotic as a selection factor, usually as the sole carbon source. These techniques would lend themselves to the isolation of organisms or consortia capable of metabolizing dietary compounds, but they have not been widely used in the human gut microbiota area. A recent study by Ziemer illustrates the potential of the technique as applied to the ruminant gut. In this study, continuous culture fermenters containing nutrient medium with cellulose or xylan-pectin as sole carbon sources were inoculated with cattle faeces and run for 8 weeks under operating conditions that modelled the caecum and colon of cattle. Samples were then serially diluted and plated onto carbohydrate-specific agar to isolate colonies that were then identified by 16S rRNA gene sequencing. The communities that arose during the enrichment had a broad microbial diversity representing six phyla (Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, Synergistetes, and Fusobacteria). Many of the Firmicutes and Bacteroidetes isolates were related to species demonstrated to possess enzymes involved in fermenting plant cell wall components, but interestingly did not exhibit a high identity to cultured bacteria with sequences in the Ribosomal Database Project and so represented novel genera or species. In fact, over 98% of the isolates had not previously been cultured. This methodology, therefore, could provide new opportunities to characterize the metabolic capacities of members of the gut microbiota.
Although the approach of isolating strains capable of metabolizing dietary components provides insight into the potential microorganisms involved in vivo, there are drawbacks. In particular, it clearly focuses only on those gut microorganisms that can be cultured in vitro. Furthermore, the ability of a single strain to metabolize a compound in vitro may not translate into metabolism under the different physico-chemical conditions of the host gut, in the presence of millions of other bacteria that may be competing for the substrate or acting in partnership to degrade it.
Gut microbial enzyme activity
Much of the focus of recent microbiota research has utilized sequencing methods to describe the composition and relative abundance of the colonic community. Less attention has been paid to the assessment of specific microbial functions, which could be more useful in elucidating the gut metabolism of dietary components and links between the microbiota and health.
Measurement in faecal or colonic samples of the activity of enzymes involved in metabolism of dietary and endogenous compounds has been described for many years. The enzymes β-glycosidase (catalysing the hydrolysis of plant polyphenol glycosides), β-glucuronidase (cleavage of glucuronidated hepatic dietary metabolites), and various polysaccharide-degrading enzymes have been particularly well described . In most cases, this approach of assaying enzyme activities ignores the contribution of individual bacterial types and focuses instead on overall activity in faecal samples. One of the limitations of this approach is that the activities are measured in vitro in faecal suspensions usually using model substrates so may not reflect the activity in vivo where the substrate concentrations and environmental conditions such as pH could be very different.
Studies have also been conducted using a range of gut bacterial isolates to identify the main organisms involved. For example, Dabek et al.  screened 40 bacterial strains representative of the main bacterial groups in human faeces for β-glucosidase and β-glucuronidase activity. There was a higher prevalence of β-glucosidase producers (23/40 strains, including most of the Bifidobacterium spp. and Bacteroides thetaiotaomicron and over half of low G + C% Gram-positive Firmicutes) than β-glucuronidase producers (9/40 strains mainly members of clostridial clusters XIVa and IV). There was also evidence of dramatic strain specificity in β-glucuronidase activity in three F. prausnitzii isolates. The study also tested whether exposure to glycoside and glucuronide substrates induced enzyme activity. While there was no effect on most strains, a few exhibited several fold (4–12) increases in activity suggesting that changes in overall faecal enzyme activities in response to dietary exposure may be due to changes in the number of microbes possessing those activities and also enzyme induction in certain strains. McIntosh et al.  combined an enzymatic approach and a clone library analysis to study distribution of the β-glucuronidase genes gus and BG in the microbiota. Firmicutes accounted for 96% of amplified gus sequences, while 59% of BG sequences were attributed to Bacteroidetes.
It should be noted that measurement of enzyme activities of individual strains in vitro does not necessarily reflect activity in vivo, where the environmental conditions, including pH, and the relative abundance of the microbial types may be very different. For example, Cole et al. compared the activity of enzymes measured after in vitro culture and also after the same strains had been introduced into germ-free rats and found significant differences.
More recently, a variety of molecular methods have been exploited to explore enzymatic diversity in the highly complex gut ecosystem. El Kaoutari et al.  have designed a custom microarray of non-redundant DNA probes for over 6500 genes coding for enzymes involved in dietary polysaccharide breakdown. It allows the detection of carbohydrate-degrading enzymes present in low abundance bacterial species in the gut. Alternatively, gene-specific primers can be used to enumerate all bacteria capable of performing a specific role in the gut, using qPCR, as has been done for butyrate producing bacteria . However, both these techniques only identify the presence of genes, and not whether they are actively expressed at a specific time.
There is a growing awareness of the importance of the gut microbiome in the overall system of the host. This has led to the inclusion of top-down approaches studying the composition and functionality of the microbiota, so-called ‘-omics’ approaches. Metagenomics provides insight into the genes that could be expressed, while metatranscriptomics reveals information about regulatory networks and gene expression. Combined with metaproteomics and metabolomics, which inform about the functionality of the microbiota, these approaches provide strong insights into microbial activities in the gut.
Studies are performed in an unbiased fashion with the focus on hypothesis generation rather than hypothesis testing. This has proven particularly effective for studying the gut microbiota due to the relatively limited understanding of this multi-dimensional dynamic variable. Each ‘-omic’ technology provides its own unique perspective of the microbiota and its impact on the host, so to fully exploit their potential multiple ‘-omic’ approaches can be applied simultaneously and results integrated, preferably from the same sample. With the help of mathematical modelling, this enables a comprehensive understanding of the microbial ecosystem to be gleaned and its contribution to the overall biological system to be studied at the molecular level. This represents a significant technical and bioinformatic challenge, although a new methodological framework developed by Roume et al.  for the co-extraction of DNA, large and small RNA, proteins, and polar and non-polar metabolites from single samples of microbial communities represents a significant step in this process.
Metagenomics has extensively been used to investigate differences in microbiota composition in disease states such as inflammatory bowel disease, obesity, and diabetes compared to healthy individuals, but it has also revealed novel changes in microbiota function in some diseases . For example, Wei et al.  reported that the faecal microbiota of 20 patients with hepatitis B cirrhosis of the liver showed enrichment of metabolism of glutathione, branched-chain amino acids, nitrogen, lipids, and gluconeogenesis, and a decrease in aromatic amino acids and bile acid-related metabolism in comparison to control subjects.
Metagenomic analysis is being increasingly used to study functional genes of the gut microbiota. Jones et al.  used this methodology to study the distribution of BSH genes. Via metagenomic analyses, they identified functional BSH in all the main bacterial divisions and Archaea in the gut and demonstrated that BSH is a conserved adaptation to the amount of conjugated bile acids in the gut and exhibits a high level of redundancy. Of particular relevance to the present review is the approach taken in a recent paper by Mohammed and Guda . The authors developed an ensemble of machine learning methods termed ECemble (Enzyme Classification using ensemble approach) to model and predict enzymes from protein sequences and identify enzyme classes and subclasses at high resolution. The method was then applied to predict enzymes encoded by the human gut microbiome from gut metagenomic samples, and to study the role of microbe-derived enzymes in the human metabolism. They identified 48 pathways that have at least one bacteria-encoded enzyme. The pathways were primarily involved in the metabolism of amino acids, lipids, co-factors, and vitamins. Subsequently, the methods were used to demonstrate differences in the profiles of gut microbiota-derived enzymes in lean and obese subjects and in patients with IBD. For example, the microbiota of obese subjects was enriched in polygalacturonase, which is encoded by Bacteroides and Prevotella species. In contrast, urease-encoding bacteria were found in fewer numbers in obese versus lean subjects.
A number of metagenomic studies have focused on so-called carbohydrate active enzymes (CAZymes) due to the critical role that the gut microorganisms play in the breakdown of dietary fibre and other non-absorbed carbohydrates in the gut. Such an approach is not restricted to the study of the enzymatic activity of cultivable microbes and has revealed a wide diversity of CAZymes of at least 81 families of glycoside-hydrolases. For example, Tasse et al.  using in-depth pyrosequencing, discovered 73 CAZymes from 35 different families and also identified 18 multigenic clusters encoding complementary enzyme activities for fibre degradation.
Single cell genomics is an emerging technology in which single microbial cells are isolated from a sample, their DNA extracted and amplified and then shotgun sequenced . The advantage of this approach is that genomic data can be placed in a phylogenetic context even where the function of a putative gene is unknown and information from rare or uncharacterized species can be obtained. This has the potential to complement metagenomics by aiding the functional assignment of metagenomic data.
Although metagenomics is a powerful tool for investigating the gut microbiota, it does have limitations. These have been discussed in detail by Wang et al.  but include the requirement for sufficient high-quality DNA, the impact of different DNA extraction methods and kits on results, and of particular relevance for functional metagenomics, the limitations in the size and quality of reference databases, which impedes the assignment of functions to the data obtained. Finally, the presence of a gene does not inform us about gene expression patterns. Metatranscriptomics, metaproteomics, and metabonomics enable the latter to be more effectively addressed.
Metatranscriptomics extracts and sequences mRNAs from a microbial ecosystem to determine the genes that may be expressed in that community. It usually involves reverse transcription to generate cDNA, which is then sequenced using methodologies similar to those used for metagenomics. Metatranscriptomics allows the identification of novel non-coding RNAs, including small RNAs thought to play important roles in biological processes such as quorum sensing and stress response. The approach has mostly been applied to samples from water and soil environments and less frequently to the gut microbiota [117, 118]. The gut microbiota studies need to be interpreted with considerable caution, given the major limitation of the short half-life of bacterial mRNAs, although this is less of an issue for studies using ribosomal RNAs, which are more stable.
Gosalbes et al. performed a metatranscriptomic analysis of faecal microbiota from 10 healthy subjects. Microbial cDNAs from each sample were sequenced by 454 methodology, and analysis of the 16S rRNA transcripts revealed that Firmicutes and Bacteroidetes were the sources of the greatest number of transcripts (49 and 31%, respectively) with smaller numbers from Proteobacteria (3.7%), Actinobacteria (0.4%), and Lentisphaerae (0.2%). The majority of the Firmicutes sequences fell into the Lachnospiraceae and Ruminococcaceae families, which contain pectin and cellulose degraders. In the Bacteroidetes phylum, the Bacteroidaceae, Prevotellaceae, and Rikenellaceae families were functionally the most important. Interestingly, the most active families were the same in all the volunteers.
The non-ribosomal transcripts from the faecal samples were searched by BLASTX against an established NCBI COG database to obtain a functional distribution for each sample. The pattern was very similar for all the samples with carbohydrate transport and metabolism, energy production and conversion, and synthesis of cellular components being the main activities. Other areas such as amino acid and lipid metabolism, cell motility, and secondary metabolite biosynthesis were underrepresented in the metatranscriptome. These results are consistent with an earlier, smaller study by Turnbaugh et al.  in monozygotic twins in which the genes with higher relative expression included those for carbohydrate metabolism, energy metabolism, nucleotide metabolism, and those associated with essential cell processes, e.g., RNA polymerase and glycolysis.
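The tallying step behind such a functional distribution can be sketched in a few lines. This is only a minimal illustration, assuming that each transcript has already been assigned one or more single-letter COG category codes by an upstream BLASTX search; the function name `cog_distribution` and the example hits are hypothetical and are not taken from the studies cited.

```python
from collections import Counter

def cog_distribution(hits):
    """Return the fraction of category assignments per COG category.

    `hits` maps a transcript ID to a string of single-letter COG
    category codes (e.g. "G" or "CG"); transcripts with no
    assignment (empty string) contribute nothing.
    """
    counts = Counter()
    for categories in hits.values():
        counts.update(categories)  # each letter counts as one assignment
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: n / total for cat, n in counts.items()}

# Hypothetical input: transcript ID -> COG category letters
example_hits = {
    "read_1": "G",   # G: carbohydrate transport and metabolism
    "read_2": "G",
    "read_3": "C",   # C: energy production and conversion
    "read_4": "CG",  # transcript assigned to two categories
    "read_5": "",    # no COG hit
}
distribution = cog_distribution(example_hits)
```

Computing one such distribution per sample allows the profiles of different subjects to be compared category by category.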
As with all the ‘-omics’ approaches, metatranscriptomics has its limitations and studies are challenging both technically and in terms of bioinformatics. The short half-life of mRNA leads to difficulty in the detection of short-term responses to environmental changes; consequently, extrapolating results obtained from transcriptional analysis of faecal samples to functions within the large intestine itself can present problems [115, 117].
Metaproteomics aims to characterize the complete profile of gene translation products and can yield additional information about post-translational modifications and localization over that provided by metatranscriptomics measurements . One of the advantages of metaproteomics is that it is possible to link proteins to specific taxonomic groups, thus providing insight into the microbes at species and strain level involved in specific catalytic functions and pathways, i.e., genotype–phenotype linkages .
Methodologies for metaproteomics are in a state of development, but typically they involve heat treatment of the faecal sample and extensive bead beating to extract and denature the proteins, which are subsequently enzymatically digested to peptides. Peptide analysis is usually by nano-2D-LC-MS-MS and COG assignments are determined for each peptide sequence by BLAST against the NCBI COG database. Microbial community functions are analyzed by grouping proteins into COG categories.
Metaproteomic studies on the gut microbiota to date have been performed in small numbers of subjects (usually n = 1–3), which limits the conclusions that can be drawn, but the results have shown some consistencies. Verberkmoes et al.  conducted a faecal metaproteomic analysis of a pair of adult female monozygotic twins. Analysis was by nano-2D-LC-MS-MS and the proteins identified by database searches were classified into COG categories. In both subjects, the most abundant COG functions were energy production, amino-acid metabolism, nucleotide metabolism, carbohydrate metabolism, translation, and protein folding. The authors compared the metaproteomic profile with a previously published metagenomic profile of two individuals which revealed that in contrast to the most abundant functions identified in the metaproteome above, the metagenome was dominated by proteins involved in inorganic ion metabolism, cell wall and membrane biogenesis, cell division, and secondary metabolite biosynthesis.
Kolmeder et al. investigated composition and temporal stability of the faecal metaproteome in samples collected at 2 time points from 3 healthy subjects over a period of 6–12 months. The results indicated that the faecal metaproteome is subject-specific and is stable over a 1-year period. A stable common core of about 1000 proteins was recognised in each of the subjects. The most abundant core protein was found to be glutamate dehydrogenase, and this enzyme showed a high level of redundancy in the intestinal tract, since it was associated with a number of microbial families: Lachnospiraceae, Bacteroidaceae, Ruminococcaceae, and Bifidobacteriaceae. Other high abundance proteins included pyruvate-formate lyase, which converts pyruvate to acetyl-CoA and formate, and chaperone proteins involved in protein folding and Fe-S cluster formation. About 10% of the total proteome comprised proteins involved in carbohydrate transport and metabolism including ABC sugar transporters and glycolytic enzymes. When the COGs were mapped onto pathways, the main functional categories were metabolism of carbohydrates, nucleotides, energy, amino acids, and co-factors and vitamins (especially B12 and folic acid).
Metaproteomics has also been applied to faecal samples from a lean and an obese subject and to comparisons of Crohn’s Disease patients and healthy subjects (reviewed by Xiong et al. ). Young et al.  used shotgun proteomics to characterize the functional changes in the faecal microbiota 7–21 days after birth of a preterm infant. The results suggested that the developing microbial community initially focuses its resources on cell division, protein production, and lipid metabolism later switching to more complex metabolic functions, such as carbohydrate metabolism, and secreting and trafficking proteins. It is noteworthy that this functional distribution seen after 3 weeks was similar to that observed in the adult human gut .
Metaproteomics is a developing technology and has its limitations. In particular, there is no reference protocol, so it can be difficult to compare studies, and the bioinformatic systems for metaproteomics are less well developed than those for metagenomics. Kolmeder and de Vos have discussed in detail published methodologies, highlighting the importance of sampling techniques and sample preparation and processing.
Metabolic profiling (metabonomics/metabolomics)
Metabolic profiling has emerged as a powerful systems biology approach simultaneously measuring the low-molecular weight compounds in a biological sample, capturing the metabolic profile or phenotype. In the host, these metabolic signatures contain thousands of molecular components that arise from endogenous and exogenous metabolic processes, environmental inputs, and metabolic interactions between the host and environment. The environmental inputs can include dietary components and products of gut microbial activity. A major strength of using metabonomics to study the gut microbiota is the ability to measure metabolites in host samples that derive directly from the microbiome, for example the SCFAs. This provides a direct read-out of gut microbial activity and variations due to diet. Furthermore, upon absorption from the gut, microbial products can enter host metabolic processes resulting in downstream metabolic perturbations and the generation of microbial-host co-metabolites, all of which can be captured by metabolic profiling.
Practically, metabolic profiling can be applied to a range of different sample types and experimental models. It can be used to characterize the metabolites in samples from in vitro experiments, including pure cultures and complex gut models, and various sample types collected from in vivo studies. In vivo samples can include biofluids, such as urine, blood, faecal water, saliva, and cerebrospinal fluid, and various tissue samples such as those collected from the gut, liver and brain. To measure the metabolic profile of a sample, two analytical platforms are typically used: 1H NMR spectroscopy and mass spectrometry (MS). Both techniques are capable of simultaneously capturing quantitative and structural information on a broad range of metabolites in an unbiased manner in a single measurement. Comprehensive reviews on these analytical techniques have been published [124–126]. 1H NMR spectroscopy measures protons (1H) on metabolites in a sample and MS measures the exact mass of molecular ions in a sample and how they fragment. This information can then be used to identify the metabolites present and their abundance. MS is usually preceded by a separation step to allow the analysis of complex mixtures, which includes liquid or gas chromatography, or in some cases capillary electrophoresis. Although a single analytical technique is routinely applied in metabonomic studies, these two techniques are complementary and their parallel application can provide wide metabolome coverage.
The metabolic phenotype acquired from these techniques is multivariate in nature containing hundreds to thousands of metabolites. To extract latent information associated with gut microbial function or their influence on host metabolism from this multi-dimensional data, a range of pattern recognition techniques are applied [127–129]. Standard multivariate statistical techniques used for metabolic profiling studies include the unsupervised approaches, principal components analysis (PCA), and hierarchical clustering analysis (HCA) and the supervised approach, projection to latent structures (PLS) analysis. Unsupervised methods are concerned with modelling variation within the data and have no a priori knowledge of sample classification. In contrast, supervised methods use known information of the samples (e.g., germ-free versus conventional status; placebo versus prebiotic intervention) to extract information in the metabolic data that are related to this information.
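As an illustration of the unsupervised case, the first principal component of a small metabolite table can be computed by mean-centring the columns and applying power iteration. This is a minimal sketch in plain Python rather than the workflow of any cited study; the intensity values and the two sample groups are invented for the example, and real analyses would normally scale the variables and use a dedicated statistics package.

```python
import math

def first_principal_component(X, iterations=200):
    """Return (loadings, scores) for the first principal component.

    X is a list of samples, each a list of metabolite intensities.
    Columns are mean-centred and the leading eigenvector of Xc^T Xc
    is found by power iteration. No class labels are used.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]

    # Arbitrary non-uniform starting direction (an assumption of this
    # sketch); it must not be orthogonal to the leading eigenvector.
    start = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in start))
    w = [v / norm for v in start]

    for _ in range(iterations):
        t = [sum(r[j] * w[j] for j in range(p)) for r in Xc]  # t = Xc w
        w_new = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(p)]  # Xc^T t
        norm = math.sqrt(sum(v * v for v in w_new))
        if norm == 0:
            break
        w = [v / norm for v in w_new]

    scores = [sum(r[j] * w[j] for j in range(p)) for r in Xc]
    return w, scores

# Hypothetical intensities for three metabolites; the first three rows
# stand for one group of animals and the last three for another.
X = [
    [1.0, 2.0, 0.5],
    [1.2, 2.1, 0.4],
    [0.9, 1.9, 0.6],
    [3.0, 4.1, 0.5],
    [3.1, 3.9, 0.4],
    [2.9, 4.0, 0.6],
]
loadings, scores = first_principal_component(X)
```

In this toy table the two groups separate along the first component even though the group labels were never supplied, which is the sense in which PCA is unsupervised. The loadings indicate which metabolites drive the separation.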
The utility of this approach for studying the microbiome has been demonstrated in animal models of altered microbial status, such as germ-free, gnotobiotic, antibiotic-treated animals, and also in human studies [82, 130–132]. These studies have shown the extensive reach of the gut microbiota throughout the metabolic system of the host and the diverse pathways modulated. This is not just restricted locally to the immediate environment of the gut but systemically to peripheral tissues such as the heart and brain. For example, dietary choline from sources such as red meat and eggs can be metabolized by the microbes in the gut to trimethylamine and dimethylamine. Trimethylamine is toxic and requires oxidation in the liver by the flavin-containing monooxygenase 3 (FMO 3) enzyme before being excreted as trimethylamine-N-oxide (TMAO). TMAO can be used as an electron acceptor by Escherichia coli, and, interestingly, has been implicated as a risk factor for CVD .
A vast amount of information is captured with metabolic profiling and various inherent (e.g., genetic constitution, age, and gender) and environmental (e.g., diet, alcohol intake, and drug therapy) factors can influence the metabolic phenotype. Unrelated variation in these metabolic signatures can often mask or obscure the variation resulting from gut microbial activity. As such, careful study design is essential when investigating the role of the gut microbiota on host metabolism to minimise this unrelated noise. Although this can be tightly controlled in animal studies, it represents a major challenge for human studies. One way to overcome this issue is through the use of statistical approaches. Orthogonal projections to latent structures (OPLS) is one such method, applying an orthogonal signal correction (OSC) to remove metabolic variation unrelated to the variable being studied. This improves the interpretation of the data, enabling the influence of the gut microbiota to be illuminated. Another limitation in metabolic profiling studies is the metabolite identification stage. Once significant metabolic associations are discovered, assigning an identity, pathway and/or function to these features can represent a bottleneck in the metabonomic workflow. Advancements in databases, software platforms, and analytical approaches are helping to overcome these limitations and expedite this process.
Stable isotope probing (SIP)
The use of stable isotopes (e.g., 13C, 15N, and 18O) can help elucidate the fate of specific compounds within complex microbial systems such as the gut and can be particularly useful if combined with ‘-omics’ techniques. Tannock et al. used the technique to identify the main microbial users of inulin in the rat gut. Inulin labelled with 13C was fed to rats, and RNA extracted from caecal contents was fractionated on isopycnic buoyant density gradients to detect labelled RNA from cells that had metabolized the inulin. 16S rRNA genes amplified from cDNA from the labelled fractions were sequenced and showed that Bacteroides uniformis, Blautia glucerasea, Clostridium indolis, and Bifidobacterium animalis were the main species utilizing inulin in these rats.
Use of mathematical modelling to mimic the gut ecosystem
Meta-omic analyses have resulted in a tremendous amount of data on the composition, encoded functionalities, and metabolic output of the human gut microbiota . However, the inherent complexity of the gut ecosystem hinders the interpretation of this wealth of data . A systems-level understanding of the microbiota has to include the underlying complex interactions, since the gut ecosystem as a whole is more than the sum of its parts . A complete, integrated view of the human gut microbiota requires the use of mathematical models.
Identifying novel food ingredients which may have beneficial effects on the gut microbiota when provided as dietary supplements is often difficult due to the large number of variables that need to be compared in well-designed controlled human intervention studies. Experimental models (small animals and fermenter systems) have proved extremely useful, but even they prove time consuming and not all variables can be compared. Mathematical models offer an alternative to try and evaluate bacterial interactions and the impact of different dietary components on microbial composition and activity, at least with the aim of refining the choices to be used in the experimental situation.
One approach applied to simulate the behaviour of gut microbial communities is kinetic modelling. A kinetic model showed the role of bacterial cross-feeding in the conversion of lactate to butyrate by two distinct but abundant human gut bacteria, Eubacterium hallii and Anaerostipes coli. Kettle et al. [138, 139] created a minimal model of the intestinal ecosystem, distilling the microbiota down into 10 bacterial functional groups, each comprising a mixture of at least 10 bacteria. Pure and mixed culture data were used to provide assumptions of growth rates, substrate specificity and metabolic activities for each of the functional groups. The model successfully predicted the switch between high butyrate production at pH 5.5 and high propionate production at pH 6.5 that had occurred in a previous continuous flow fermenter experiment [139, 140]. The model was also used to estimate the effect of removing entire functional groups (or single strains within a functional group) on the community profile and activity, revealing changes in both SCFA concentrations and abundances of other groups. Such models could be of huge benefit in assessing the consequence of bacterial species shown to be missing, or extra, in disease states on the development of the disease. They would also show whether simply adding back a ‘missing’ bacterium would be sufficient to potentially change the course of the development of the disease. Such models could also be used to illustrate the potential consequences to microbial composition and metabolite production during periods of starvation or substrate depletion or replacement.
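The general form of such a kinetic model can be sketched with Monod growth terms and a simple Euler integration. The example below is a generic two-species cross-feeding scheme, not the published model of any cited study: species A ferments a primary substrate to lactate and species B converts lactate to butyrate. All parameter values, yields and stoichiometries are illustrative assumptions.

```python
def simulate_cross_feeding(hours=48.0, dt=0.01):
    """Euler integration of a two-species Monod cross-feeding model.

    Species A ferments a primary substrate S to lactate L;
    species B consumes lactate and produces butyrate P.
    Every parameter value here is an illustrative assumption.
    """
    # State: substrate, lactate, butyrate (mM); biomass of A and B (g/L)
    S, L, P, XA, XB = 20.0, 0.0, 0.0, 0.01, 0.01
    # Max growth rate (1/h), half-saturation constant (mM), yield (g/mmol)
    mu_A, K_A, Y_A = 0.6, 1.0, 0.05
    mu_B, K_B, Y_B = 0.4, 2.0, 0.03
    lactate_per_S = 1.5   # mmol lactate formed per mmol S consumed
    butyrate_per_L = 0.5  # mmol butyrate formed per mmol lactate consumed

    steps = int(round(hours / dt))
    for _ in range(steps):
        growth_A = mu_A * S / (K_A + S) * XA  # Monod growth of A on S
        growth_B = mu_B * L / (K_B + L) * XB  # Monod growth of B on lactate
        uptake_S = growth_A / Y_A
        uptake_L = growth_B / Y_B
        S += dt * (-uptake_S)
        L += dt * (lactate_per_S * uptake_S - uptake_L)
        P += dt * butyrate_per_L * uptake_L
        XA += dt * growth_A
        XB += dt * growth_B
    return {"substrate": S, "lactate": L, "butyrate": P,
            "biomass_A": XA, "biomass_B": XB}

result = simulate_cross_feeding()
```

Removing species B from this scheme (for example by setting its initial biomass to zero) leaves lactate to accumulate, which is the kind of "missing member" question that functional-group models are used to explore.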
Agent-based modelling (ABM) is another approach frequently employed to study the dynamic behaviour of ecological systems in silico. In ABM, objects with well-defined properties (representing, e.g., bacterial cells), are allowed to interact with each other, resulting in a dynamic model that can depict real-time behaviour of a biological system . A recent study used ABM to simulate the positive (e.g., mutualistic) and negative (e.g., competitive) interactions between two bacteria representing a typical Bacteroidetes and Firmicutes, respectively . A simulation of exposure to antibiotics and subsequent recovery confirmed that feedback mechanisms between bacterial species enabled the restoration of system stability after antibiotic perturbation .
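A toy version of this idea is sketched below. It is not the model used in the cited study: the two agent types, the shared carrying capacity (a competitive interaction), the growth bonus that type B receives from type A (a stand-in for cross-feeding), and every rate are hypothetical choices made only to show how individual-level rules produce population-level dynamics, including recovery after an antibiotic pulse.

```python
import random

def run_abm(seed=1, capacity=200, growth_steps=30,
            antibiotic_steps=5, recovery_steps=40):
    """Simulate two bacterial types as individual agents.

    Returns a list of (count_A, count_B) tuples, one per time step.
    All rules and rates are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded for reproducible runs
    agents = ["A"] * 50 + ["B"] * 50
    base_birth, base_death, antibiotic_death = 0.3, 0.02, 0.3
    cross_feeding_bonus = 0.5  # B divides faster when A is abundant
    history = []
    total_steps = growth_steps + antibiotic_steps + recovery_steps
    for step in range(total_steps):
        on_antibiotic = growth_steps <= step < growth_steps + antibiotic_steps
        death = antibiotic_death if on_antibiotic else base_death
        frac_a = agents.count("A") / len(agents) if agents else 0.0
        n = len(agents)  # live population size, updated as agents die or divide
        next_agents = []
        for kind in agents:
            if rng.random() < death:
                n -= 1
                continue
            next_agents.append(kind)
            birth = base_birth
            if kind == "B":
                birth *= 1 + cross_feeding_bonus * frac_a
            # Competition: division becomes unlikely as the shared niche fills
            if rng.random() < birth * (1 - n / capacity):
                next_agents.append(kind)
                n += 1
        agents = next_agents
        history.append((agents.count("A"), agents.count("B")))
    return history

history = run_abm()
totals = [a + b for a, b in history]
```

Because each agent is updated individually, it is straightforward to extend such a model with spatial position, additional species or per-agent state, at the cost of longer run times than an equation-based kinetic model.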
Topological analysis of metabolic networks
A genome-scale metabolic reconstruction summarizes known biochemical reactions of a target organism in a well-structured manner. Several studies have applied metabolic networks of the human gut microbiome, revealing topological properties of its global metabolic network and characterizing the microbiome’s metabolic potential (reviewed in Manor et al. and Heinken and Thiele [136, 142]). Recently, metabolic networks of gut microbes have been integrated with a Boolean dynamic model constructed from time series metagenomic data . The model predicted that the commensal Barnesiella intestinihominis can inhibit the growth of Clostridium difficile, which was validated experimentally . Another study integrated community-wide metabolic networks with metagenomic and metabolomic data and the community-wide metabolic turnover was subsequently predicted for the vaginal and the gut microbiome . While correlation-based statistical analyses of metabolomic measurements are not mechanistic, this framework has the advantage of proposing mechanisms for the contributions of species to the turnover of particular metabolites .
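The mechanics of a Boolean dynamic model can be shown with a three-node toy network. The rules below are hand-written and hypothetical; they are not the rules inferred from time series metagenomic data in the cited work, and the node names are used only to echo the example in the text.

```python
def step(state, rules):
    """Synchronously update every node from the current state."""
    return {node: rule(state) for node, rule in rules.items()}

def simulate(initial, rules, n_steps):
    """Return the trajectory of states, starting with the initial state."""
    trajectory = [initial]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], rules))
    return trajectory

# Hypothetical update rules (assumptions made for illustration only)
rules = {
    # External input, held at its initial value
    "antibiotic": lambda s: s["antibiotic"],
    # The commensal is present only when no antibiotic is applied
    "barnesiella": lambda s: not s["antibiotic"],
    # The pathogen persists only while the commensal is absent
    "c_difficile": lambda s: s["c_difficile"] and not s["barnesiella"],
}

untreated = simulate(
    {"antibiotic": False, "barnesiella": False, "c_difficile": True}, rules, 3)
treated = simulate(
    {"antibiotic": True, "barnesiella": False, "c_difficile": True}, rules, 3)
```

Under these toy rules the untreated trajectory settles into a state in which the commensal is present and the pathogen is absent, whereas the antibiotic-treated trajectory retains the pathogen. Predictions of this qualitative kind are what can then be tested experimentally.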
Modelling emerging phenotypic properties
The constraint-based reconstruction and analysis (COBRA) approach uses genome-scale reconstructions (GENREs) that are constructed from the genome of a target organism and curated and validated against the available literature. More than 20 GENREs have been built for microorganisms inhabiting the human body. GENREs are converted into mathematical models that are tailored to specific conditions by enforcing constraints: physico-chemical (e.g., mass–charge balance), environmental (e.g., nutrient availability), and regulatory (e.g., gene expression). Once an objective such as biomass production has been defined, the metabolic fluxes through the network that optimize this objective can be predicted. For instance, by imposing constraints on nutrient availability, the growth requirements of a reconstructed organism can be predicted. This approach resulted in the prediction and subsequent experimental validation of a defined medium for Faecalibacterium prausnitzii. Moreover, GENREs enable the evaluation of a species’ functional repertoire. In a large-scale study, metabolic networks were retrieved from the genomes of 301 representative human gut microbes and their metabolic capabilities were systematically compared in the context of the phylogenetic distance between species. The analysis revealed an exponential relationship between metabolic and phylogenetic distance, with closely related species being more metabolically diverse than would be expected under a linear relationship.
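The optimization described above is most commonly posed as a linear programme (flux balance analysis). The symbols below are notation introduced here for illustration and do not appear in the original text: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c a vector that selects the objective (e.g., 1 for the biomass reaction and 0 elsewhere), and v^min and v^max the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

The equality constraint expresses the steady-state mass balance, while environmental constraints enter through the bounds. For example, setting the bounds of a nutrient's exchange reaction so that no uptake is possible removes that nutrient from the simulated medium; if the optimal biomass flux then drops to zero, the nutrient is predicted to be essential for growth.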
Of particular interest is the in silico prediction of interactions within microbial communities, and between the gut microbiota and the human host. Similar to the kinetic model described above, such a multi-species model can predict the effect of perturbing community composition (e.g., the removal of key species). Moreover, the effects of varying nutrient environments (e.g., different diets) on the community can be explored. In a first effort to model a host-gut microbe symbiosis, a metabolic model of the mouse was joined with one of Bacteroides thetaiotaomicron and their mutually beneficial cross-feeding was simulated on five dietary regimes. Moreover, the rescue of lethal host gene defects by Bacteroides thetaiotaomicron was predicted. The effect of a simplified model gut microbiota on human metabolites was modelled by joining the human reconstruction Recon2 with 11 published, manually curated GENREs of gut microbes from three phyla. The combined contribution of microbial secretion products and dietary input to the host’s metabolome was quantified on four simulated diets. The mechanisms underlying the microbial contribution to host secretion were predicted for glutathione, taurine, and leukotrienes.
Constraint-based modelling is commonly applied to the in silico prediction of microbe–microbe interactions (reviewed in Heinken and Thiele, Biggs et al., and Zomorrodi et al. [142, 151, 152]). Typically, two or more GENREs are joined to construct community models with well-defined species–species boundaries, thus allowing the prediction of metabolic cross-feeding between species. To investigate the effect of varying metabolic environments on microbe–microbe interactions, 11 published gut microbe GENREs were joined in all pairwise combinations and the outcome (positive, neutral, or negative interaction) was predicted. The metabolic exchange and the distribution of resources between the pairs were simulated in four gut microenvironments and on three diets. The model predicted that the need to regenerate reducing equivalents enforced mutualism in certain pairs under anoxic conditions. In vitro screens of pairwise interactions between microbes are laborious, and the described in silico framework constitutes an important first step towards predicting candidate pairs of interest that can subsequently be validated experimentally.
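One simple way to turn predicted growth rates into a pairwise outcome is sketched below. This is an assumed post-processing step, not the cited study's exact procedure: each species' growth rate in the pair is compared with its growth rate alone, a change of more than a hypothetical 10% tolerance is counted as a positive or negative effect, and the two effects are mapped to standard ecological categories. The function names and the threshold are illustrative.

```python
def effect(alone, paired, tol=0.1):
    """Sign of the partner's effect on growth: '+', '-' or '0' (within tolerance)."""
    if paired > alone * (1 + tol):
        return "+"
    if paired < alone * (1 - tol):
        return "-"
    return "0"


# Standard ecological categories, keyed by the sorted pair of effect signs
PAIR_TYPES = {
    ("+", "+"): "mutualism",
    ("+", "-"): "parasitism",
    ("+", "0"): "commensalism",
    ("-", "-"): "competition",
    ("-", "0"): "amensalism",
    ("0", "0"): "neutralism",
}


def classify_pair(alone_a, alone_b, paired_a, paired_b, tol=0.1):
    """Classify an interaction from growth rates alone and in co-culture."""
    signs = (effect(alone_a, paired_a, tol), effect(alone_b, paired_b, tol))
    key = tuple(sorted(signs, key="+-0".index))  # order signs as '+', '-', '0'
    return PAIR_TYPES[key]


# Hypothetical growth rates (1/h): both species grow faster together
print(classify_pair(0.2, 0.3, 0.3, 0.4))  # mutualism
# Species A is slowed, species B is unaffected
print(classify_pair(0.2, 0.3, 0.1, 0.3))  # amensalism
```

Applied to every pair and every simulated environment, a rule of this kind yields a table of outcomes from which candidate pairs can be shortlisted for in vitro testing.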
GENREs also provide a useful framework for the contextualization of meta-omic data, with a variety of constraint-based methods for the integration of such measurements already available.
In summary, important advances have been made in the construction of mathematical models that capture key aspects of the gut microbiota and its interactions with its host, summarize the current knowledge on its metabolism, and propose hypotheses that can be experimentally validated. In future efforts, such models will result in the elucidation of previously unknown, non-intuitive relationships between the gut microbiota and host physiological states. Moreover, dietary and drug interventions to favourably manipulate the human gut microbiota may be predicted in silico prior to experimental validation. Knowledge of an individual’s microbial composition will ultimately enable models to be tailored to predict effects of specific diets or supplements at an individual level.