Introduction

In Europe, flammable swamp gas has presumably been known since Roman times. Microbial anaerobic digestion (AD) of plant material as the origin of burnable biogas has been scientifically recognized and analyzed in detail since the mid-twentieth century. Complex microbial consortia are responsible for the successive degradation of organic biomass to biogas consisting of methane (CH4) and carbon dioxide (CO2) and, in smaller proportions, of other gases. In industrial-scale biogas plants (BGPs), biogas is produced by AD from agriculturally produced renewable resources such as maize, grass, and sugar beet, and even biodegradable organic wastes can be used as substrates (Weiland 2010; Zhang et al. 2016). Today, biogas production contributes considerably to the recovery of energy from renewable resources, thereby also positively affecting the balance of the climate-relevant gas carbon dioxide. In Germany, nearly 9000 biogas plants (including about 450 bio-waste biogas plants), with 4.5 gigawatts of installed electric power and about 1 gigawatt of thermal energy usage, are in operation (FNR 2017).

The biomethanation process is formally subdivided into four phases, i.e., (i) hydrolysis/cellulolysis of complex organic compounds, namely carbohydrates, proteins, and lipids, to the corresponding oligomers and monomers, (ii) acidogenesis (fermentation) of the latter metabolites to the intermediates propionate, butyrate, other short-chain volatile fatty acids (VFA), and alcohols, (iii) acetogenesis of primary fermentation products to acetic acid, CO2, and H2, and (iv) methanogenesis resulting in CH4 and CO2 (Angenent et al. 2004). The first three phases are solely performed by fermentative Bacteria. Only certain methanogenic Archaea are able to synthesize CH4 from the end-products of bacterial fermentation. Despite the technical improvements in anaerobic wastewater treatment in the early and mid-twentieth century, anaerobic and biomass-degrading microorganisms were for a long time regarded as a “relatively unimportant group of organisms” (McBee 1950). However, their potential for the development of economically valuable bioconversion processes that convert cellulosic or organic waste materials into valuable end-products has since been recognized.

First attempts to study microbial biogas communities relied on cultivation-based approaches and started in the beginning of the twentieth century (e.g., Schnellen 1947; McBee 1954) resulting in over 150 newly discovered species of microorganisms (Söhngen et al. 2016). Technical improvements concerning anaerobic cultivation of microorganisms are still leading to the isolation and characterization of new type strains for hydrolytic/acidogenic Bacteria such as Clostridium bornimense, Herbinix hemicellulosilytica, Herbinix luporum, Herbivorax saccincola, Proteiniphilum saccharofermentans, Petrimonas mucosa, Fermentimonas caenicola, and Proteiniborus indolifex (Hahnke et al. 2014; Koeck et al. 2015; Hahnke et al. 2016; Koeck et al. 2016a; Koeck et al. 2016b; Hahnke et al. 2018). Likewise, new species for methanogenic Archaea such as Methanobacterium aggregans or Methanosarcina flavescens were described (Kern et al. 2015, 2016). However, the options of cultivation-based approaches to uncover all members of biogas communities are intrinsically highly limited. Thus, cultivation-independent methods are indispensable to tackle the whole complexity of biogas communities.

Metagenomics gained importance for the dissection of microbial assemblages as the performance and efficiency of next-generation sequencing (NGS) technologies advanced. Inevitably, anaerobic digestion communities were elucidated by applying methods of metagenome, genome, and post-genome research, taking advantage of high-throughput sequencing of environmental whole-community DNA and RNA. In the present review, state-of-the-art metagenomics approaches are presented to illustrate their usefulness in analyses tackling microbial communities residing in BGPs. To elucidate the metabolically active biogas community, metatranscriptome, metaproteome, and metabolome studies are addressed in an integrated manner.

It is commonly accepted that biogas producing microbial communities are the key for process shaping and development of optimization strategies since they provide opportunities for their management and engineering (Carballa et al. 2015; Koch et al. 2014). Accordingly, this article reviews current knowledge on structure and performance of microbial communities residing in full-scale biogas-producing reactors considering microbiome management and monitoring options. Previous reviews addressing microbial communities of anaerobic digestion systems were mainly focused on laboratory-scaled reactors and classical molecular methods to uncover community compositions and functions (e.g., Demirel and Scherer 2008; Nasir et al. 2012; Čater et al. 2013; Venkiteshwaran et al. 2015; Schnürer 2016; Demirel 2014; Vanwonterghem et al. 2014). In the present review, microbial communities of BGPs converting agriculturally produced renewable primary products, organic residues, and manure were considered. Laboratory systems were only included if they feature pilot character exemplifying fundamental methodological approaches or important insights in microbial community structure and functionality.

Taxonomic characterization of microbial communities by high-throughput 16S rRNA gene amplicon sequencing

To obtain direct and immediate insights into microbial community compositions and the phylogenetic relationship of community members, specific marker genes were studied by their PCR-amplification from whole community (metagenomic) DNA. A widely and commonly used approach for microbial community profiling without prior cultivation is the analysis of the 16S small subunit ribosomal-RNA (rRNA) gene sequence (Woese 1987; Lebuhn et al. 2014; Simó et al. 2014). As an integral part of the ribosome, this rRNA is ubiquitously present in prokaryotic organisms. With a size of about 1500 bp, the 16S rRNA gene comprises nine hypervariable regions (V1-V9) separated by conserved regions. The hypervariable regions provide organism specific sequences enabling identification of taxa. The whole 16S rRNA gene or specific regions can be considered as taxonomic marker gene (Yang et al. 2016). Currently, as taxonomic thresholds for the species level, 98.65% and for the genus level, 94.5% 16S rRNA gene nucleotide sequence identity are proposed (Kim et al. 2014; Yarza et al. 2014).
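The identity thresholds quoted above can be applied mechanically once two 16S rRNA gene sequences have been aligned. The following is a minimal sketch, assuming the alignment has already been computed elsewhere and that identity is counted only over columns without gaps (identity definitions differ between tools, so this convention is an assumption). The function names and return labels are hypothetical.

```python
# Proposed thresholds quoted in the text (Kim et al. 2014; Yarza et al. 2014).
SPECIES_THRESHOLD = 98.65  # percent 16S rRNA gene sequence identity
GENUS_THRESHOLD = 94.5     # percent 16S rRNA gene sequence identity


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def shared_rank(identity: float) -> str:
    """Lowest taxonomic rank two sequences may share under the thresholds."""
    if identity >= SPECIES_THRESHOLD:
        return "same species"
    if identity >= GENUS_THRESHOLD:
        return "same genus"
    return "different genera"


# Toy example with very short, made-up sequences.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, shared_rank(identity))
```

Real comparisons would use near full-length sequences of about 1500 bp; for short amplicons covering only one or two hypervariable regions, the thresholds are at best an approximation.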

To study the highly diverse and dynamic microbial community structures present in environmental systems and to minimize the costs for DNA sequencing, initially, several fingerprinting techniques targeting the 16S rRNA gene were developed such as denaturing gradient gel electrophoresis (DGGE) analysis (Muyzer et al. 1993), amplified ribosomal DNA restriction analysis (ARDRA) of 16S rRNA gene libraries (Moyer et al. 1994), and terminal restriction fragment length polymorphisms (T-RFLP) analysis (Liu et al. 1997). These approaches were applied in previous studies on microbial structures present in anaerobic digesters and biogas reactors (Cabezas et al. 2015), as examples, DGGE for the analysis of mesophilic anaerobic digestion of municipal waste (Silvey et al. 2000), ARDRA for the analysis of mesophilic methanogenic bioreactors supplied with artificial media (Fernandez et al. 2000), and ARDRA combined with T-RFLP for the analysis of psychrophilic anaerobic digestion of synthetic industrial wastewaters (Collins et al. 2003).

The advantage of DGGE and ARDRA as low-cost methods providing rapid insights into community structures and dynamics resulted in their continued application, e.g., to study the impact of the addition of cellulolytic/hydrolytic enzymes to anaerobic digestion (Wang et al. 2016) or to follow microbial community dynamics in a BGP (Yamei et al. 2017). Semi-automated, computer-aided T-RFLP analysis also remains a valuable tool to study the development of microbial populations in response to changes in technical process parameters. For example, Goux et al. (2015) analyzed the response of microbial communities to stress factors in laboratory-scale experiments. Krakat et al. (2010), Argyropoulos et al. (2013), and Theuerl et al. (2015) showed that even the microbiome of a well-running BGP undergoes structural fluctuations, and Witarsa et al. (2016) characterized the influence of different starter inocula.

However, due to the rapidly decreasing costs for high-throughput DNA sequencing, fingerprinting tools for monitoring of microbial community structures are now commonly applied in combination with 16S rRNA gene targeted amplicon NGS, e.g., DGGE together with Ion Torrent sequencing (Akyol et al. 2016), or T-RFLP together with 454-pyrosequencing or Illumina sequencing (Goux et al. 2015; Sun et al. 2016; Liu et al. 2017; Ozbayram et al. 2017; Ziganshina et al. 2017). Fingerprinting techniques allow the rapid pre-screening of samples for NGS, or the focusing on particular functional groups of microorganisms, e.g., by targeting glycoside hydrolase genes to access cellulose-degrading Bacteria (Sun et al. 2016) or the genes for the methyl-coenzyme-M-reductase to monitor exclusively methanogenic Archaea (Ozbayram et al. 2017).

The development of NGS platforms led to an optimized 16S rRNA gene analysis because they enabled direct sequencing of DNA libraries, so that several hundred samples can be sequenced in parallel (see Fig. 1). NGS platforms usually generate short read lengths. Therefore, only a single hypervariable 16S rRNA gene region or a combination of adjacent regions can be sequenced. It was shown that despite these short read lengths, a sufficiently accurate taxonomic classification of microbial communities can be achieved (Liu et al. 2008). The 16S rRNA gene fragments can be amplified by PCR applying universal primers (e.g., Takahashi et al. 2014) or bacterial and archaeal specific primers. In this context, the choice of the amplified hypervariable regions and of the sequencing technology used for taxonomic community profiling should be considered carefully, since both factors have a strong impact on the obtained results, as shown previously by Tremblay et al. (2015) analyzing a mock community composed of a known number of species. Moreover, an essential step after sequencing of constructed amplicon libraries is quality control, which increases accuracy and prevents overestimation of community diversity. Pre-processing of amplicon datasets involves several steps (Jünemann et al. 2017) including removal of chimeric sequences. Subsequently, sequences are subjected to clustering into operational taxonomic units (OTUs), usually at a cutoff of 97% sequence similarity, followed by taxonomic classification and statistical analysis. For reference-based taxonomic classification, 16S rRNA gene reference databases like SILVA (Quast et al. 2013) and RDP (Cole and Tiedje 2014) are consulted. For further analyses of sequence datasets such as rarefaction estimations, calculation of diversity metrics, principal component analyses (PCA), or calculation of UniFrac distances, different bioinformatics tools are available (Schloss et al. 2009; Caporaso et al. 2010).

Despite the benefits of taxonomic community profiling based on 16S rRNA gene sequencing, this method faces some limitations. The PCR-based amplification of the target region is biased due to primer properties. Moreover, the resolution of the method is limited due to short read lengths, which may lead to underestimation of species diversity. Since the 16S rRNA gene copy numbers vary between species, taxa abundance estimations may be biased (Větrovský and Baldrian 2013). This problem can be addressed bioinformatically, provided that the gene copy numbers of the species involved are known, which frequently is not the case for unknown microbial communities. Moreover, 16S rRNA gene amplicon analyses depend on the completeness and correctness of the corresponding reference databases for sequence classification (Ranjan et al. 2016). Nevertheless, the 16S rRNA gene amplicon sequencing method has often been used for characterizing microbial communities in biogas plants because of its advantageous cost-benefit relation. To compensate for resolution biases (like underestimation of species diversity) associated with sequencing of individual variable regions, the approach of “full-length” 16S rRNA amplicon sequencing by means of the PacBio© single molecule, real-time (SMRT) technology can be considered (Wagner et al. 2016).
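The bioinformatic correction for varying 16S rRNA gene copy numbers mentioned above can be illustrated as follows. This is a minimal sketch, assuming that a copy number is known for every taxon in the profile; the taxon names, read counts, and copy numbers are invented for illustration and are not taken from any cited study.

```python
def copy_number_corrected_abundance(read_counts, copy_numbers):
    """Relative abundances after dividing read counts by 16S gene copy number.

    read_counts:  dict taxon -> number of amplicon reads assigned to it
    copy_numbers: dict taxon -> 16S rRNA gene copies per genome (assumed known)
    """
    missing = set(read_counts) - set(copy_numbers)
    if missing:
        raise KeyError(f"no copy number known for: {sorted(missing)}")
    # Reads per gene copy approximate the relative number of genomes (cells).
    genome_estimates = {
        taxon: read_counts[taxon] / copy_numbers[taxon] for taxon in read_counts
    }
    total = sum(genome_estimates.values())
    if total == 0:
        raise ValueError("profile contains no reads")
    return {taxon: value / total for taxon, value in genome_estimates.items()}


# Hypothetical two-taxon profile: taxon_A carries 6 gene copies, taxon_B 2.
raw = {"taxon_A": 600, "taxon_B": 400}
copies = {"taxon_A": 6, "taxon_B": 2}
corrected = copy_number_corrected_abundance(raw, copies)
print(corrected)
```

In this toy case taxon_A accounts for 60% of the reads but only one third of the estimated genomes, which shows how uncorrected profiles can overstate taxa with many gene copies. The function deliberately raises an error for taxa without a known copy number, since that is the situation in which the correction cannot be applied.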

Fig. 1 Schematic overview on taxonomic profiling of biogas-producing microbial communities applying 16S rRNA gene amplicon sequencing. After extraction of whole community DNA, 16S rRNA gene amplicon libraries were constructed and subsequently sequenced. Obtained sequences were processed with the program QIIME (Caporaso et al. 2010) to calculate taxonomic community profiles

Taxonomic composition of bacterial communities residing in biogas plants

Microbial communities of biogas fermenter samples consist of bacterial and archaeal sub-communities. These differ in their diversity and abundances depending on the BGP operating parameters such as temperature, fed substrates, pH, and reactor and fermentation type (Weiland 2010; Yu et al. 2014; Abendroth et al. 2015). The community composition was analyzed for several BGPs operated under mesophilic (35–45 °C) or thermophilic (45–60 °C) conditions and fed with different substrates like agricultural residues, manure, and/or sewage sludge. Bacteria were reported to account for 80 to 100% of the community within these fermenters (Sundberg et al. 2013; Maus et al. 2017). The remaining portion is mainly occupied by methanogenic Archaea (see chapter 3.2.). While the four phases of AD and their principal pathways are known, the exact taxonomic compositions and network dynamics of the corresponding communities are still only partly understood. Since 16S rRNA gene amplicon sequencing conveniently provides insights into the taxonomic composition of complex microbial communities, it has often been the method of choice for studies on the correlation between process parameters and/or environmental conditions and community structure (refer to chapter 2.).

By these approaches, dominant and therefore important taxa of the biogas formation process were identified. Regarding the bacterial part of the community, the majority of relevant studies described predominance of the bacterial phyla Firmicutes and, under certain conditions, of Bacteroidetes and Thermotogae. An example of the mesophilic and thermophilic bacterial community composition is shown in Fig. 2 (Maus et al. 2016b). These taxa are believed to belong to the core microbiome of biogas-producing microbial communities (e.g., Rui et al. 2015). However, the ratio of these phyla is very much related to the respective temperature, fed substrates, and process conditions (Sundberg et al. 2013; Rui et al. 2015). Furthermore, representatives belonging to other phyla such as Proteobacteria, Spirochaetes, Tenericutes, Verrucomicrobia, candidate phylum Cloacimonetes (previously named WWE1), Acidobacteria, and Chloroflexi have prevalently been detected in mesophilic reactors, but typically at comparatively lower abundances (Klocke et al. 2007; Sundberg et al. 2013; Ziganshin et al. 2013; St-Pierre and Wright 2014; Rui et al. 2015; Li et al. 2013; Sun et al. 2016; Stolze et al. 2016).

Fig. 2 Taxonomic profiling of microbial communities residing in exemplary mesophilic and thermophilic biogas plants fed with agricultural (by-) products based on 16S rRNA gene amplicon sequencing (Maus et al. 2016b). For amplicon processing, a pipeline including FLASH (Magoč and Salzberg 2011), UPARSE (Edgar 2013), Usearch 8.0 (Edgar 2010), and RDP classifier (Wang et al. 2007) was used as described recently by Maus et al. (2016b). Relative abundances of the most abundant classes and families of bacterial (left) and archaeal (right) communities were shown

In mesophilic full-scale reactors fed with maize silage and manure as substrates, the classes Clostridia and Bacilli were highly abundant within the phylum Firmicutes (Stolze et al. 2016; Treu et al. 2016a). In particular, many species belonging to the genus Clostridium are involved in decomposition of complex carbohydrates including cellulose, xylan, amylose, and amylopectin and represent hydrolysis key players (Labbe and Duncan 1975; Dürre 2005). Some mesophilic Clostridia species such as Acetoanaerobium sticklandii (Fonknechten et al. 2010) and Butyrivibrio proteoclasticus (Attwood et al. 1996) are able to degrade proteins in addition to complex carbohydrates (Schnürer 2016; Li et al. 2013).

The class Clostridia also comprises species capable of performing acetogenesis as well as syntrophic fatty acid degradation. Corresponding syntrophs belong to the families Thermoanaerobacteriaceae, Clostridiaceae, and Syntrophomonadaceae (Schnürer 2016). In addition, the phylum Proteobacteria also includes many syntrophs belonging to the genera Syntrophus, Pelobacter, Smithella, Syntrophorhabdus, and Syntrophobacter described to live in association with methanogenic Archaea (Jackson et al. 1999; Bok et al. 2001; Qiu et al. 2008).

Mesophilic digesters fed with protein-rich and hardly digestible substrates such as straw showed prevalence of the phylum Bacteroidetes, including the families Porphyromonadaceae and Marinilabiaceae (Sun et al. 2015; Moset et al. 2015). Bacteroidetes representatives are known to ferment sugars to acetate and propionate. Some studies reported higher abundances of Porphyromonadaceae members in reactors operating at, e.g., high organic loading rates (OLRs) or high nitrogen/ammonia levels caused by a high protein content of the used substrates (Goux et al. 2015; Müller et al. 2016). This observation led to the suggestion that these may serve as potential marker microorganisms for deteriorated biogas process conditions. In general, despite the overall dominance of a few taxa, a high degree of variation is often seen within mesophilic bacterial communities, driven by the composition of the fed substrate, ammonia levels, and the operating conditions applied.

In thermophilic BGPs, members of the phylum Firmicutes also dominate the bacterial community, followed by Thermotogae and Bacteroidetes. Members of the class Clostridia were more abundant in thermophilic than in mesophilic digesters. One of the most prominent representatives of the genus Clostridium is C. thermocellum, which is known to be a very efficient thermophilic cellulose degrader (Akinosho et al. 2014; Koeck et al. 2014). Likewise, the families Lachnospiraceae and Halanaerobiaceae are prevalent under thermophilic conditions (Maus et al. 2016b; Stolze et al. 2016). Cellulolytic Lachnospiraceae species are mainly responsible for the degradation of complex plant material. For example, Herbinix hemicellulosilytica T3/55T isolated from a thermophilic full-scale BGP (Koeck et al. 2015) is able to digest cellulose. Likewise, Halocella cellulolytica (Halanaerobiaceae) was described to degrade cellulose, producing acetate, ethanol, lactate, H2, and CO2 as end-products. Moreover, Halocella spp. are capable of tolerating high salt concentrations frequently occurring in biogas reactor environments (Simankova et al. 1993).

In recent years, the taxon Thermotogae was recognized to be of importance for the biogas-production process. Within this class, the genera Defluviitoga (Ben Hania et al. 2012) and Petrotoga (Lien et al. 1998) were identified as predominant genera in particular BGPs (Maus et al. 2016a; Stolze et al. 2016). The corresponding species utilize a large variety of saccharides (Maus et al. 2016a; Lien et al. 1998) for acetate, ethanol, CO2, and H2 production.

However, it should be noted that BGP microbiomes still comprise a high diversity of uncharacterized Bacteria (from 14 to 41%, Stolze et al. 2015; Maus et al. 2016b) such as the candidate taxa Hyd24–12 (Kirkegaard et al. 2016), OD1 (Peura et al. 2012), TM7 (also known as Candidatus Saccharibacteria) (Ferrari et al. 2014), and SR1 (Harris et al. 2004), often only known by their 16S rRNA gene sequence. Clarifying the ecological role of these uncultured Bacteria is still the subject of ongoing research.

Taxonomic composition of archaeal communities residing in biogas plants

To evaluate the occurrence of methanogenic Archaea in biogas communities, 78 full-scale anaerobic digesters described in 17 recent publications (2008–2017) were considered. The main attributes and features of identified archaeal biogas community members are summarized in Table 1. The comparison of the process parameters of BGPs with respect to the observed dominant genera of methanogens shows that apparently fed substrates affect the microbial community more than temperature or hydraulic retention time (Cardinali-Rezende et al. 2012; Franke-Whittle et al. 2014; Han et al. 2017; Lee et al. 2014; Lucas et al. 2015; Nettmann et al. 2008; St-Pierre and Wright 2013; Zhu et al. 2011). Different substrates account for different ammonium/ammonia contents in the fermentation sludge which significantly affect the composition of the microbial community (Fotidis et al. 2014). Moreover, the amount and type of nutrients and minerals present in the substrate also shape the methanogenic sub-community (Fontana et al. 2016; Luo et al. 2016). Previous reviews indicated a dominance of the hydrogenotrophic pathway in anaerobic digesters with a high nutrient content like BGPs fed with agricultural material or municipal bio-waste (Demirel and Scherer 2008; Demirel 2014). An increased hydrogen partial pressure like under thermophilic conditions may favor hydrogenotrophic methanogens for thermodynamic reasons (Zinder 1990).

Table 1 Summarized data of 78 mesophilic and thermophilic BGPs (as referenced in Supplemental Table S1) regarding predominant methanogenic genera, type of biogas plant, temperature, hydraulic retention time, and ammonia content

In municipal sewage digesters, the acetoclastic genus Methanosaeta consistently dominated (see Fig. 3). Due to its high substrate affinity with a KS value of 0.4 to 1.2 mM (Jetten et al. 1990), it outcompetes other methanogens at low acetate concentrations. Digested sewage sludges are known to contain generally low amounts of easily degradable compounds but a high substrate diversity. This is consistent with pioneering investigations by isotopic methods (Kaspar and Wuhrmann 1978). Additionally, a correlation between high concentrations of total ammonia (sum of NH3 and NH4+) and the absence of Methanosaeta in biogas reactors was observed (Karakashev et al. 2005; Nettmann et al. 2010).
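Why a high substrate affinity pays off at low acetate concentrations can be made concrete with Monod kinetics, in which the specific growth rate is mu = mu_max * S / (KS + S). The sketch below is illustrative only: the KS of 0.8 mM for the high-affinity organism lies within the range quoted above, whereas both maximum growth rates and the KS of the low-affinity competitor are hypothetical values chosen to show the effect, not measured parameters.

```python
def monod_rate(mu_max: float, k_s: float, substrate: float) -> float:
    """Specific growth rate from the Monod equation mu_max * S / (K_S + S)."""
    if k_s <= 0 or substrate < 0:
        raise ValueError("K_S must be positive and substrate non-negative")
    return mu_max * substrate / (k_s + substrate)


# K_S in mM acetate, mu_max per day. Only the 0.8 mM value falls in the range
# given in the text; all other numbers are hypothetical.
HIGH_AFFINITY = {"mu_max": 0.1, "k_s": 0.8}   # slow grower, high affinity
LOW_AFFINITY = {"mu_max": 0.3, "k_s": 4.0}    # fast grower, low affinity


def faster_grower(acetate_mm: float) -> str:
    """Name of the organism type growing faster at the given acetate level."""
    high = monod_rate(HIGH_AFFINITY["mu_max"], HIGH_AFFINITY["k_s"], acetate_mm)
    low = monod_rate(LOW_AFFINITY["mu_max"], LOW_AFFINITY["k_s"], acetate_mm)
    return "high_affinity" if high > low else "low_affinity"


for acetate in (0.1, 10.0):
    print(f"{acetate} mM acetate: {faster_grower(acetate)} grows faster")
```

With these made-up parameters the two curves cross at 0.8 mM acetate: below that concentration the high-affinity organism grows faster, above it the organism with the higher maximum rate does.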

Fig. 3 The predominant archaeal methanogenic genera as summarized by recent literature covering 78 mesophilic and thermophilic BGPs (as referenced in Supplemental Table S1). *Methanosarcina mainly occurred as single cells instead of typical aggregates (single-coccoid form, ≤ 1 μm)

Usually, agricultural BGPs are operated under mesophilic conditions (35–45 °C). In contrast, thermophilic full-scale plants are managed at temperatures of approx. 50 to 55 °C. BGPs of the agricultural type are characterized mainly by high salt (10–20 g KCl equivalent per liter) and high ammonium concentrations. They are dominated by the genera Methanoculleus, Methanosarcina, and Methanobrevibacter. All three genera include species adapted to either thermophilic or mesophilic temperatures.

At high ammonia levels (> 2.8 g L−1 NH4+-N), methane production in BGPs generally occurs through syntrophic acetate oxidation and hydrogenotrophic methanogenesis (Ek et al. 2011; Fotidis et al. 2014). This suggests that hydrogenotrophic methanogenic Archaea belonging to the orders Methanomicrobiales and Methanobacteriales are not susceptible to ammonia toxicity. The prevalent occurrence of the order Methanomicrobiales could also be promoted by a high hydrogen affinity. The corresponding species exhibit a low threshold concentration for hydrogen of about 0.1 μM, corresponding to an H2 partial pressure of about 15 Pa (Lee and Zinder 1988), possibly providing an advantage over certain members of the order Methanobacteriales.
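The relation between the dissolved hydrogen threshold and the quoted partial pressure follows from Henry's law. The following sketch is a rough plausibility check, assuming a Henry solubility for H2 in pure water at 25 °C of about 7.8 × 10−4 mol L−1 atm−1; this constant is an assumption brought in for illustration, and the value in a warm, salt-rich digester will differ somewhat.

```python
PA_PER_ATM = 101325.0

# Assumed Henry solubility of H2 in pure water at 25 degrees C.
# Not taken from the reviewed studies; digester conditions will deviate.
HENRY_H2_MOL_PER_L_ATM = 7.8e-4


def dissolved_h2_to_partial_pressure_pa(
    concentration_mol_per_l: float,
    henry_mol_per_l_atm: float = HENRY_H2_MOL_PER_L_ATM,
) -> float:
    """Equilibrium H2 partial pressure (Pa) for a dissolved concentration."""
    if concentration_mol_per_l < 0 or henry_mol_per_l_atm <= 0:
        raise ValueError("concentration must be >= 0 and Henry constant > 0")
    pressure_atm = concentration_mol_per_l / henry_mol_per_l_atm
    return pressure_atm * PA_PER_ATM


threshold_pa = dissolved_h2_to_partial_pressure_pa(0.1e-6)  # 0.1 micromolar
print(f"{threshold_pa:.1f} Pa")
```

Under this assumption, 0.1 μM dissolved H2 corresponds to roughly 13 Pa, which is of the same order as the value of about 15 Pa given in the text.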

The genus Methanoculleus has previously been reported to be associated with relatively elevated ammonium levels (Schnürer et al. 1999; Schnürer and Nordberg 2008; Nettmann et al. 2010; Ziganshin et al. 2013). Besides its tolerance towards ammonia (1.4–5.4 g L−1 NH4+-N, Table 1), Methanoculleus appears to be highly adaptable to process parameters such as temperature (35–45 or 50–62 °C) and hydraulic retention time (3–108 days) (see Table 1). Therefore, the reason why Methanosarcina dominates over Methanoculleus in some BGPs, which mainly ferment different types of manure and agricultural by-products, cannot be directly correlated to any single process parameter listed in Supplemental Table S1. However, it is well known that Methanosarcina spp. are robust towards different detrimental conditions such as high ammonia or salt concentrations, or fluctuating operational conditions such as temperature shifts (De Vrieze et al. 2012). The third most predominant methanogenic genus is Methanobrevibacter. It prevails over other genera especially in the fermentation of fecal matter in mesophilic biogas reactors (Table 1; Supplemental Table S1). The genera Methanobacterium and Methanothermobacter often seem to be accompanied by a second abundant methanogen in thermophilic BGPs fed with variable agricultural by-products and the organic fraction of municipal solid waste including fat-rich waste (Supplemental Table S1).

The only example for a predominant methanogen which is involved in the methylotrophic pathway (Deppenmeier et al. 1996) is the genus Methanomethylovorans. It was found to be highly abundant together with Methanosaeta in a mesophilic anaerobic digestion reactor fed with municipal and industrial sewage sludge. The analyzed reactor featured a very low OLR (0.5 kg VS m−3 d−1) (Table 1; Supplemental Table S1). However, the abundance of Methanomethylovorans might be connected to the presence of particularly high amounts of oil and alcohols such as methanol, since the analyzed digester was supplemented with remnants from biodiesel production.

Regarding the percentages of the dominant genera, niche formation is noticeable in case of changing substrates. For example, over 90% of the archaeal community is dominated by one genus when substrates such as food-waste-recycling wastewater (FRW), the organic fraction of municipal solid waste (OFMSW) or slaughterhouse waste (SHW) were fed. Only in two biogas plants, the archaeal sub-community is composed of only two genera (see Supplemental Table S1).

In terms of diversity, Kirkegaard et al. (2017) showed that the archaeal community of thermophilic samples among 32 sewage sludge digesters of 20 wastewater treatment plants featured a lower diversity than that of the mesophilic samples. In contrast, other studies revealed that the impact of temperature on archaeal diversity was weaker than that of the substrates and the resulting nutrient and ammonium contents (Sundberg et al. 2013; Luo et al. 2016). While the sewage sludge digesters revealed a diversity of up to 16 different methanogenic genera (6–16, on average 11, n = 30 full-scale digesters, 24 under mesophilic [34–40 °C] and six under thermophilic conditions [51–55 °C]), the co-digesters (agricultural by-products, organic fraction of municipal solid waste, slaughterhouse waste) achieved a diversity of up to 8 different methanogenic genera (1–8, on average 4, n = 20 full-scale BGPs, 11 under mesophilic [37–40 °C] and nine under thermophilic conditions [50–55 °C]) (Sundberg et al. 2013; Luo et al. 2016; Kirkegaard et al. 2017).

The genomic potential of communities in biogas plants by sequencing of community DNA

NGS 16S rRNA gene amplicon sequencing enabled high-resolution taxonomic profiling of biogas-producing microbial communities. However, to gain insights into their functional potential, sequencing of the whole metagenome is indispensable. Metagenome sequence data can be analyzed in two ways: (i) functional information can be assigned to single metagenome sequence reads, regarding them as Environmental Gene Tags (EGTs) (Krause et al. 2008); and (ii) metagenome assembly and binning strategies can be followed to gain deeper insights into genome sequence information of so far non-cultivable biogas community members (see Fig. 4).

Fig. 4 Workflow for functional profiling of microbial biogas communities exploiting metagenome sequence data. After sampling at biogas reactors, total DNA was extracted for construction of whole metagenome shotgun libraries which were subsequently sequenced on high-throughput sequencing platforms. Resulting sequencing data were quality checked and functionally characterized based on single read sequences in order to deduce functional profiles of the underlying biogas community. Moreover, metagenome assembly followed by a binning approach was applied to compile metagenome-assembled genomes (MAGs), which were then analyzed for their metabolic potential

The genomic potential of microbial communities by single-read metagenomics

Different approaches are applied to assign functional information to single metagenome reads (see Fig. 4). Common tools for this task are for example MG-RAST (Meyer et al. 2008), MEGAN 6 (Huson et al. 2016), or eggNOG (Huerta-Cepas et al. 2016). Implemented bioinformatics pipelines usually comprise one or a combination of different functional classification systems referring to COG (cluster of orthologous groups) categories (Tatusov et al. 2000), SEED subsystems (Overbeek et al. 2014), eggNOG orthologous groups (Huerta-Cepas et al. 2016), or InterPro entries (Finn et al. 2017) which includes different protein signature databases (e.g., Pfam). The reconstruction of metabolic pathways of the anaerobic digestion process can be done by using databases like KEGG (Kyoto Encyclopedia of Genes and Genomes; Kanehisa et al. 2017), MetaCyc (Caspi et al. 2016), and BRENDA (Chang et al. 2009). Potential functions of the microbial community as well as abundances of involved genes are determined by assigning metagenome reads (e.g., by BLASTX) to defined pathways (Cai et al. 2016).
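Once each read has been assigned to a functional category, a functional profile is essentially a table of category fractions. The sketch below assumes that the per-read assignments already exist, for instance parsed from the output of one of the tools named above; the read identifiers and the input format are hypothetical, and fractions are expressed relative to all reads, with unassigned reads kept as their own class.

```python
from collections import Counter


def functional_profile(read_assignments):
    """Fraction of all reads per functional category.

    read_assignments: dict read_id -> category name, or None if the read
    could not be assigned. Unassigned reads are reported as "Unassigned".
    """
    if not read_assignments:
        return {}
    counts = Counter(
        category if category is not None else "Unassigned"
        for category in read_assignments.values()
    )
    total = len(read_assignments)
    return {category: n / total for category, n in counts.items()}


# Hypothetical assignments of five reads to top-level COG categories.
assignments = {
    "read_1": "Metabolism",
    "read_2": "Metabolism",
    "read_3": "Cellular processes and signaling",
    "read_4": "Information storage and processing",
    "read_5": None,
}
for category, fraction in sorted(functional_profile(assignments).items()):
    print(f"{category}: {fraction:.0%}")
```

Whether percentages refer to all reads or only to assigned reads changes the numbers noticeably, so the denominator should always be stated alongside a profile.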

Functional classification according to COG categories of a metagenome from a mesophilic full-scale BGP fed with renewable primary products was first applied by Schlüter et al. (2008). The major identified COG categories were metabolism, cellular processes and signaling, and information storage and processing. These findings agree with several studies in which different full-scale biogas reactors (fed with agricultural material, manure, wastewater sludge, or municipal sludge) were functionally analyzed. Metabolic functions (43 to 45% of the total reads), cellular processes and signaling (about 20% of the total reads), and basic house-keeping genes, e.g., information storage and processing (about 22% of the total reads), were the major COG categories (Guo et al. 2015; Cai et al. 2016). Within the COG category related to metabolism, the most abundant sub-categories were energy production and conversion, amino acid transport and metabolism, carbohydrate transport and metabolism, as well as lipid transport and metabolism. These metabolic functions are linked to the conversion of the fed substrates into smaller molecules and finally to methane (Schlüter et al. 2008; Li et al. 2013; Guo et al. 2015; Cai et al. 2016; Maus et al. 2016b). Within the amino acid transport and metabolism COG sub-category, genes involved in the biosynthesis of valine, leucine, and isoleucine as well as the metabolism of glycine, serine, threonine, cysteine, and methionine were identified (Guo et al. 2015). These amino acids are known to be commonly involved in Stickland reactions, indicating the importance of amino acid fermentation pathways in the analyzed biogas reactors (Guo et al. 2015). Reads assigned to the carbohydrate metabolism COG category referred to genes of the sub-categories glycolysis/gluconeogenesis, pentose phosphate pathway, and amino sugar and nucleotide sugar metabolism (Guo et al. 2015) as well as processing of monosaccharides and disaccharides (Schlüter et al. 2008). These findings indicated that abundant species within the analyzed biogas reactors are involved in carbohydrate digestion and energy conversion. Additionally, assignments to enzymes like cellobiose phosphorylase, glucosidase, and cellulase/cellobiase indicated the potential for cellulose degradation within agricultural BGPs (Jaenicke et al. 2011; Maus et al. 2016b).

Several studies analyzing different full-scale anaerobic digesters (fed with manure or sludge from wastewater treatment plants and operated under mesophilic or thermophilic conditions) revealed highly similar results for the major level 1 SEED sub-systems. These refer to metabolism of carbohydrates, followed by clustering-based systems, protein metabolism, and amino acids and derivatives (Yang et al. 2014b; Guo et al. 2015; Luo et al. 2016). These major subsystems were also identified in other microbiomes from different ecosystems, like soil, freshwater or wastewater treatment plants and most of them can be related to the degradation of organic matter which occurs in many different habitats (Cai et al. 2016). However, differences in the functional genes were uncovered for anaerobic digesters fed with manure or activated wastewater sludge. The metagenomes of the sludge-based digesters featured significantly higher abundances of genes involved in nitrogen metabolism, phosphorus metabolism, and aromatic compound metabolism. These findings correlate with the composition of the wastewater sludge, where higher amounts of nitrite, aromatic compounds, organic contaminants, and phosphate are present compared to manure (Luo et al. 2016).

Since carbohydrate metabolism was shown to be the major SEED level 1 subsystem for several different anaerobic digesters, this subsystem was analyzed in detail at level 2. The major level 2 carbohydrate subsystems refer to the central carbohydrate metabolism and one-carbon metabolism (Yang et al. 2014b; Guo et al. 2015). Identified functions comprise transport and oxidation of the main carbon sources, which are then enzymatically converted into metabolic precursors for the generation of cell biomass (Yang et al. 2014b). The one-carbon metabolism in particular plays an important role in methanogenesis (Ferry 1999).

Assignment of metagenome reads to KEGG metabolic pathways or Pfam families led to the identification of the major methanogenesis pathway performed in the analyzed biogas reactors (Yang et al. 2014b; Guo et al. 2015; Cai et al. 2016; Luo et al. 2016; Stolze et al. 2015). In principle, three methane synthesis pathways were described, namely hydrogenotrophic, acetoclastic, and methylotrophic methanogenesis. For biogas reactors fed with, e.g., sewage sludge, it was shown that the abundance of genes involved in acetoclastic methanogenesis was higher than the abundance of genes involved in hydrogenotrophic methane synthesis (Yang et al. 2014b; Guo et al. 2015; Cai et al. 2016). In contrast, genes involved in the hydrogenotrophic pathway were more abundant in biogas reactors fed with agricultural residues, maize silage, and manure (Stolze et al. 2015; Jaenicke et al. 2011). The methylotrophic pathway seems to be of minor importance in the analyzed biogas reactors (Guo et al. 2015).

The most abundant genes within the acetoclastic pathway were acetyl-CoA synthetase (EC: 6.2.1.1) and acetyl-CoA decarbonylase/synthase complex (ACDS) (Yang et al. 2014b; Luo et al. 2016). These genes are essential in the synthesis of acetyl-CoA from acetate (Li et al. 2013). Since these genes are also present in Bacteria where they are involved in other metabolic pathways, the identification of the major methanogenesis pathway based on the abundance of these genes is biased (Luo et al. 2016). On the other hand, the most dominant genes of the hydrogenotrophic pathway were formate dehydrogenase (EC: 1.2.1.2) and formylmethanofuran dehydrogenase (EC: 1.2.99.5) (Yang et al. 2014b; Luo et al. 2016).

Functional interpretation of metagenomes representing biogas communities elucidated their functional potential, which strongly depends on the fed substrates. It does not, however, provide insights into the metabolically active community, which in principle can be addressed by metatranscriptome/metaproteome analyses (see below) (Yang et al. 2014b; Guo et al. 2015; Cai et al. 2016).

Single-read assembly and contig binning to compile genomes of biogas community members

A pilot study on functional characterization of the biogas microbiome employing a novel metagenome assembly and binning strategy was carried out by Campanaro et al. (2016). These authors analyzed laboratory-scale continuously stirred tank reactors (CSTRs) fed with cattle manure under thermophilic conditions. Approximately 51 Gb of metagenomic sequence information obtained on the Illumina HiSeq system was assembled and binned based on contig tetranucleotide frequencies and coverage of contigs by sequence reads. Coverage information resulted from metagenome sequencing of eight reactors varying in their OLRs, assuming that different OLRs affect abundances of taxa residing in the system. The binning approach resulted in the reconstruction of 106 MAGs taxonomically assigned to the phyla Firmicutes, Proteobacteria, Bacteroidetes, Synergistetes, Actinobacteria, Thermotogae, Spirochaetes, and Euryarchaeota. However, only ten of the 106 MAGs could be assigned at the genus level, whereas most of them were only classified at higher taxonomic ranks, indicating that genomes closely related to the newly reconstructed biogas MAGs are currently missing in databases. This means that the analyzed biogas community is a rich source of new, so far uncharacterized species that probably cannot easily be obtained by cultivation-based approaches. The three most abundant species represented by MAGs in the system were assigned to the Clostridiales (phylum Firmicutes), Methanoculleus sp. (phylum Euryarchaeota), and Rikenellaceae (phylum Bacteroidetes).
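
The two signals used for binning, sequence composition and differential coverage, can be expressed as one feature vector per contig. The sketch below is illustrative only and is not the procedure of Campanaro et al. (2016): the function names, the example sequence, and the coverage values are hypothetical, and reverse-complementary tetranucleotides are not merged, although binning tools commonly do so.

```python
from itertools import product
import math

# All 256 possible tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetranucleotide_frequencies(sequence):
    """Relative frequency of each tetranucleotide in a contig."""
    seq = sequence.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows containing N or other codes are skipped
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[k] / total for k in TETRAMERS]

def contig_features(sequence, coverages):
    """Composition profile followed by log-scaled coverage per sample."""
    return tetranucleotide_frequencies(sequence) + [math.log1p(c) for c in coverages]

# Hypothetical contig with mean coverage in three samples (e.g., three reactors).
features = contig_features("ACGTACGTNNACGT", [12.0, 0.0, 3.5])
```

Contigs whose feature vectors lie close together are candidates for the same bin; the clustering step itself is omitted here.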

In this regard, a new Methanoculleus species, tentatively designated Candidatus Methanoculleus thermohydrogenotrophicum (sp. nov.) was reconstructed. This new species reached dominance levels of up to 20% in a particular thermophilic biogas reactor (Kougias et al. 2017). The dominance of species belonging to the genus Methanoculleus has previously been reported for mesophilic as well as thermophilic biogas systems (Jaenicke et al. 2011; Stolze et al. 2015; Maus et al. 2016b; Maus et al. 2017) (also refer to chapter 2.2.). The assembly/binning approach also led to the identification of other new taxa of higher ranks. This was exemplified by recognition of one MAG assigned to the phylum Euryarchaeota. The corresponding, so far uncultured species most probably represents a new class of this phylum.

Moreover, the compilation of MAG sequence information allows for reconstruction of the corresponding metabolic pathways based on the organism’s genome. Species represented by MAGs were classified according to key pathways of the biogas process such as carbohydrate utilization, fatty acid degradation, amino acid fermentation, propionate and butanoate metabolism, acetogenesis, and methanogenesis. As a perspective, functional interactions between biogas community members can be deduced from MAG-derived sequence information.

A similar metagenomic assembly/binning approach was applied to mesophilic and thermophilic two-stage lab-scale CSTRs digesting cattle manure (Treu et al. 2016b; Bassani et al. 2015). This study yielded 157 new MAGs. However, settings for compilation of MAGs were less stringent compared to the previous study (completeness higher than 20% and contamination rate lower than 50%). Overall, the taxonomic distribution of identified taxa was similar to the study described above. However, rare biogas taxa representing the phyla Acidobacteria, Fibrobacteres, Lentisphaerae, Planctomycetes, and Thermotogae were also identified in the set of MAGs. Moreover, comparative analyses made it possible to determine a core group of microorganisms that are important for the biogas process comprising, among others, Methanoculleus, Methanothermobacter, Syntrophomonas, and Proteobacteria.
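
The quality settings quoted above translate into a simple filter over per-bin estimates. The sketch below assumes that completeness and contamination are available as percentages for each bin; the class name, the function name, and the example bins are hypothetical, and how the estimates were obtained in the cited study is not covered here.

```python
from typing import NamedTuple

class MagQuality(NamedTuple):
    name: str
    completeness: float   # estimated completeness in percent
    contamination: float  # estimated contamination in percent

def filter_mags(mags, min_completeness=20.0, max_contamination=50.0):
    """Keep bins above the completeness and below the contamination threshold."""
    return [
        m for m in mags
        if m.completeness > min_completeness and m.contamination < max_contamination
    ]

# Hypothetical bins for illustration only.
mags = [
    MagQuality("bin_001", 92.4, 1.8),
    MagQuality("bin_002", 25.0, 48.0),
    MagQuality("bin_003", 18.5, 0.5),   # fails completeness
    MagQuality("bin_004", 60.0, 55.0),  # fails contamination
]
kept = filter_mags(mags)
```

With thresholds this permissive, bins such as the hypothetical `bin_002` pass although they are far from complete genomes, which is why MAG sets from different studies are not directly comparable.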

The first metagenome assembly for a full-scale mesophilic BGP fed with maize silage and pig manure was published in 2015 and featured a sequencing depth of 17 Gb (Bremges et al. 2015). This approach yielded an assembly size of 228 Mb, and 250,596 protein-coding genes were predicted on the assembled contiguous sequences. As an example, almost all gene products of the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway map “Methane Metabolism” are encoded on the compiled contigs. Reuse of these data was encouraged since all analysis tools and nucleotide sequences were provided as one Docker container accessible at “Docker-Hub” (Docker-Hub Registry).

Likewise, the microbiome of a mesophilic (40 °C) agricultural full-scale BGP loaded with maize silage and farm animal manure was analyzed by metagenome sequencing (Güllert et al. 2016). Metagenomic sequence reads were assembled to a size of 1.25 Gb comprising approx. two million predicted coding regions. Binning of contigs based on sequence composition and differential coverage resulted in 104 high-quality MAGs representing the taxa Firmicutes, Bacteroidetes, Fibrobacteres, Spirochaetes, Actinobacteria, Verrucomicrobia, and Euryarchaeota. Only very few MAGs could be assigned at the species or genus level. Examples are MAGs classified to belong to the genera Fibrobacter, Clostridium, Paludibacter, Methanosarcina, and the family Porphyromonadaceae. MAGs assigned to the genus Clostridium and the family Lachnospiraceae (phylum Firmicutes) possess scaffoldin genes encoding the structural scaffoldin subunit of cellulosomes. This finding indicates that the corresponding, so far unknown Bacteria are able to hydrolyze cellulose by employment of cellulosomes. Availability of the identified MAGs now enables analysis of the genomic context regarding the utilization potential of complex carbohydrates.

The deepest metagenome sequencing to date was performed for three mesophilic and one thermophilic full-scale BGPs digesting renewable raw materials, mainly maize silage, together with different types of manure (Stolze et al. 2016). In total, 328 Gb of sequence information was obtained for these four BGPs, yielding an assembly size of approx. 1.5 Gb. Assembled contigs could be binned into 532 MAGs, five of which represented noticeably distinct taxa of the BGPs analyzed. The latter five MAGs were assigned to the phyla Thermotogae, Fusobacteria, Spirochaetes, and Cloacimonetes. Reliability of the assembly/binning approach could be demonstrated by the example of the Thermotogae MAG since it is very closely related to the isolate Defluviitoga tunisiensis L3 obtained from the same thermophilic BGP sampled for metagenome sequencing (Maus et al. 2016b). The other reconstructed MAGs represent novel and uncharacterized species. Based on the MAG sequence information, the metabolism of the corresponding species was reconstructed, allowing classification of the Fusobacteria and Cloacimonetes taxonomic units as amino acid fermenting and carbon dioxide/hydrogen producing Bacteria, whereas the Thermotogae and Spirochaetes MAGs were predicted to represent sugar utilizing and acetate, carbon dioxide, and hydrogen producing Bacteria. Moreover, the results obtained indicated a syntrophic association for the taxa analyzed. The described studies impressively illustrate the usefulness of metagenome assemblies combined with binning methods to uncover so-far unknown species of the biogas microbiome residing in biogas reactor systems.

The metagenome assembly/binning approach has also been applied for other anaerobic digestion habitats to reconstruct MAGs representing novel taxa. For example, three MAGs assigned to the new candidate phylum Hyd24-12 of the superphylum Fibrobacteres-Chlorobi-Bacteroidetes were reconstructed from metagenome sequences originating from full-scale mesophilic digesters of wastewater treatment plants (Kirkegaard et al. 2016). Hyd24-12 members were predicted to produce acetate and hydrogen by fermenting sugars and may utilize sulfur as terminal electron acceptor. Likewise, five MAGs representing members of the insufficiently analyzed class Anaerolinea (phylum Chloroflexi) were obtained from lab-scale thermophilic cellulose-fermenting reactors (Xia et al. 2014; Xia et al. 2016). Reconstruction of their metabolism indicates a carbohydrate-based lifestyle with acetate, lactate, and hydrogen as fermentation end-products.

In summary, metagenome assembly and binning approaches have contributed, and will continue to contribute, to the compilation of a resource covering genome sequence information of biogas community members that so far have not been cultivated. Future metagenome assembly/binning approaches will benefit from sequencing technologies producing longer read lengths such as single molecule real-time (SMRT) sequencing (PacBio©). Longer reads significantly enhance taxonomic binning and genome compilation, as recently exemplified for a microbiome sample from a commercial biogas reactor located in Sweden and fed with slaughterhouse waste, food waste, and plant biomass (Frank et al. 2016).

The functional characterization of microbial communities residing in biogas plants by analyzing community RNA and community proteins

Metagenome analyses of biogas communities revealed their genetic potential. However, they do not allow conclusions on the metabolic activity of community members. To profile the metabolically active biogas community, metatranscriptome and metaproteome analyses were conducted. Metatranscriptome analyses provided insights into the transcriptional activity of biogas microbiomes (see Fig. 5). However, expression of enzymes, the catalysts of metabolism, involves translation of messenger RNAs, which implies the possibility of regulation at the post-transcriptional level. Analysis of the biogas microbiome’s proteome was addressed in metaproteome studies (see Fig. 6).

Fig. 5

Metatranscriptome-based analyses of biogas-producing microbial communities. After sampling, whole community RNA was extracted followed by depletion of ribosomal RNAs. Metatranscriptome cDNA libraries were prepared and sequenced. Resulting metatranscriptome reads were mapped on corresponding metagenome data or MAGs. Finally, transcripts per million (TPM) values were calculated for each gene to deduce transcriptional profiles of biogas microorganisms
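
The TPM normalization named in the caption divides the read count of each gene by the gene length and then rescales these rates so that they sum to one million. A minimal sketch follows; the function name is hypothetical, the gene labels are used only as examples, and the counts and lengths are invented for illustration.

```python
def transcripts_per_million(read_counts, gene_lengths_bp):
    """Compute TPM values from mapped read counts and gene lengths in bp."""
    # Length-normalized rate: reads per base of each gene.
    rates = {g: read_counts[g] / gene_lengths_bp[g] for g in read_counts}
    scale = sum(rates.values())
    if scale == 0:
        return {g: 0.0 for g in rates}
    # Rescale so that all values sum to one million.
    return {g: 1e6 * r / scale for g, r in rates.items()}

# Hypothetical counts and lengths for three genes of one MAG.
counts = {"mcrA": 300, "fdhA": 100, "acs": 100}
lengths = {"mcrA": 1500, "fdhA": 1000, "acs": 2000}
tpm = transcripts_per_million(counts, lengths)
```

In this toy example the two genes with equal read counts receive different TPM values because the shorter gene has the higher read density.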

Fig. 6

Metaproteomics workflow comprising sampling of the microbial biogas community, protein extraction, tryptic digestion of proteins, mass spectrometry of resulting peptides, and database searching using mass spectrometry data to identify proteins within the metaproteome analyzed
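
For the database search step, protein sequences are digested in silico so that theoretical peptides can be compared with measured spectra. The sketch below applies the commonly used trypsin rule (cleavage after lysine or arginine unless the next residue is proline); it ignores missed cleavages and modifications, and the function name and the example sequence are hypothetical.

```python
def tryptic_peptides(protein, min_length=1):
    """Split a protein sequence into fully tryptic peptides."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        # Cleave C-terminal to K or R, but not when followed by P.
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return [p for p in peptides if len(p) >= min_length]

# Hypothetical sequence: the R at position 6 is followed by P and is not cleaved.
peptides = tryptic_peptides("MKTAYRPLLKGG")
```

Search engines typically also allow a limited number of missed cleavages, which would add longer peptides to this list.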

The functional characterization of biogas communities by sequencing the metatranscriptome

The first high-throughput metatranscriptome sequencing for a full-scale BGP fed with renewable primary products and manure was based on total RNA preparations from biogas community members (Zakrzewski et al. 2012). Since the majority of transcriptome reads represented ribosomal RNAs, 16S rRNA tags were used for profiling of the transcriptionally active community members. Most sequences were assigned to the phyla Euryarchaeota and Firmicutes followed by Bacteroidetes, Actinobacteria, and Synergistetes, indicating metabolic activity of microorganisms belonging to these phyla. Among the messenger-RNA (mRNA) tags within the latter dataset, sequences assigned to genes with predicted functions in hydrolysis, volatile fatty acid (VFA) and acetate formation, and methanogenesis were identified. High relative abundance of transcripts encoding key methanogenesis enzymes such as the methyl-coenzyme-M-reductase indicated high activity of hydrogenotrophic archaeal species in the analyzed BGP.

These results were confirmed in an independent study for another agricultural BGP fed with maize silage, cow and chicken manure (Güllert et al. 2016). In addition, Bacteria of the family Peptococcaceae and the order Halanaerobiales appeared to be transcriptionally very active in the latter study. Members of the Firmicutes residing in the biogas fermenter actively transcribed a diverse set of genes encoding glycosyl hydrolases some of which are involved in hydrolysis of lignocellulose. The authors of the latter study also took advantage of MAGs compiled from corresponding metagenome sequence data. In a genome-enabled metatranscriptomics approach, they mapped transcriptome sequences to MAGs allowing detection of putatively cellulolytic Polysaccharide Utilization Loci (PUL) in bacteroidetal MAGs. Likewise, genes for cellulosome-associated cellulases were predominantly transcribed in MAGs assigned to the Firmicutes (Güllert et al. 2016). Importance of PUL identified in Bacteroidetes members for polysaccharide decomposition was also shown for a thermophilic lab-scale reactor with microcrystalline cellulose as substrate by analyzing the microbiome’s metatranscriptome (Xia et al. 2014).

Metatranscriptome analyses were also reported for a thermophilic full-scale BGP digesting maize silage, barley, and cattle manure (Maus et al. 2016b). Whole metatranscriptome sequencing without prior removal of ribosomal RNAs revealed high transcriptional activities of the taxa Defluviitoga (Thermotogae), Methanoculleus (Euryarchaeota), Clostridium cluster III (Firmicutes), Tepidanaerobacter (Firmicutes), Anaerobaculum (Synergistetes), and Cellulosibacter (Firmicutes) in decreasing order regarding their contribution to 16S rRNA-derived sequence tags. In contrast, members of the genus Halocella and other genera appeared to be transcriptionally less active compared to their relative abundance estimated by evaluation of corresponding metagenome-derived 16S rRNA gene sequences. Importance of Defluviitoga species for the thermophilic full-scale biogas process with renewable primary products as substrates was previously shown (Maus et al. 2016b).
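
The comparison described above relates the share of a taxon among 16S rRNA tags (metatranscriptome) to its share among 16S rRNA gene sequences (metagenome). One simple way to express this is a ratio of the two relative abundances, sketched below; the function names, the placeholder taxa, and the counts are hypothetical and are not taken from the cited study.

```python
def relative_abundance(counts):
    """Convert raw counts per taxon into fractions of the total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def activity_ratios(rrna_counts, rrna_gene_counts):
    """Ratio of the rRNA share to the rRNA gene share for each taxon.

    Values above 1 suggest higher transcriptional activity than expected
    from abundance; values below 1 suggest lower activity.
    """
    rna = relative_abundance(rrna_counts)
    dna = relative_abundance(rrna_gene_counts)
    return {t: rna[t] / dna[t] for t in rna if dna.get(t, 0) > 0}

# Hypothetical tag counts for two placeholder taxa.
ratios = activity_ratios(
    {"taxon_A": 60, "taxon_B": 40},   # 16S rRNA tags (metatranscriptome)
    {"taxon_A": 30, "taxon_B": 70},   # 16S rRNA gene tags (metagenome)
)
```

Such ratios are only a rough indicator, since rRNA content per cell and rRNA gene copy number differ between taxa.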

The genome-enabled metatranscriptomics approach also was exemplified in a very recent study in which metatranscriptome data from full-scale BGPs were mapped to metagenomically assembled and binned genomes representing distinctively abundant taxa (Stolze et al. 2016). Evaluation of transcriptional activities revealed that the metabolism of two MAGs assigned to the phyla Thermotogae and Spirochaetes is based on sugar utilization whereas amino acid fermentation was predicted for a Fusobacteria and a Cloacimonetes MAG.

Further metatranscriptome studies were conducted for lab-scale biogas systems addressing specific questions. Treu et al. (2016a) analyzed the effect of long-chain-fatty-acid (LCFA) addition on the biogas microbiome at the transcriptional level. Genes of species belonging to the genus Syntrophomonas were upregulated in response to the feeding of LCFA, indicating that these species are important for LCFA degradation. The underlying analysis also took advantage of the availability of Syntrophomonas sp. MAGs compiled in an accompanying study (Campanaro et al. 2016). Genome-enabled metatranscriptomics elucidated the transcriptional profile of Syntrophomonas sp. under the conditions tested. Other community members expressed protective mechanisms against the effects of LCFA. Likewise, the influence of the process temperature was investigated using lab-scale biogas reactors fermenting swine manure at 25 to 55 °C (Lin et al. 2016). Metatranscriptome sequencing revealed that methane production was regulated by limiting the diversity of functional pathways at higher temperatures. This means that functional pathways are centralized under thermophilic conditions. This observation correlates with lower diversities of thermophilic biogas microbiomes (see above).

The functional characterization of biogas communities by analyzing the metaproteome

The first metaproteome analyses were conducted for a thermophilic lab-scale biogas reactor fed with beet silage and rye (Hanreich et al. 2012). Separation of total protein preparations by 2-dimensional gel electrophoresis and subsequent analysis of tryptically digested peptides by nanoHPLC-nanoESI-MS/MS led to the identification of housekeeping proteins and enzymes assigned to the archaeal acetoclastic and hydrogenotrophic methanogenesis pathways, demonstrating high activity of these routes under thermophilic conditions.

Likewise, metaproteome analyses were done for a lab-scale biogas system digesting the lignocellulose-rich substrates straw and hay to study hydrolysis pathways (Hanreich et al. 2013). Bacteroidetes members were found to express ABC transporter proteins predicted to be involved in sugar uptake and TonB-dependent receptors which probably represent components encoded by PUL. Firmicutes taxa appeared to be responsible for cellulose degradation whereas species of the Bacteroidetes mainly participated in the digestion of other polysaccharides. Again, key methanogenesis enzymes were found to be highly expressed by species of the genera Methanobacterium, Methanosaeta, and Methanoculleus.

Higher throughput regarding identification of non-redundant protein functions within the metaproteome of biogas-producing microbial communities was achieved for a laboratory biogas system digesting office paper and utilizing the organic fraction of municipal solid waste as inoculum. Detected proteins covered, among others, the central carbon metabolism, fermentation, acetogenesis, syntrophic acetate oxidation, proteolysis, and methanogenesis (Lü et al. 2014).

Subsequently, first metaproteome analyses were performed for full-scale agricultural BGPs (Heyer et al. 2013). Obtained protein profiles allowed differentiation of mesophilic and thermophilic processes. Moreover, progression of the biogas process could be followed enabling for example prediction of disturbances such as acidification. Abundance of the key methanogenesis enzyme methyl-coenzyme-M-reductase assigned to the order Methanosarcinales decreased prior to acidification of the reactor and a decline of the methane yield (Heyer et al. 2013).

Improvements of metaproteome analyses of biogas communities were achieved by implementing different separation dimensions prior to protein identification. For example, application of liquid isoelectric focusing (IEF) as one of three dimensions resulted in the identification of approx. 750 to 1650 proteins within the metaproteome of communities residing in a mesophilic and a thermophilic BGP (Kohrs et al. 2014). Enzymes assigned to the four common phases of anaerobic digestion could be recognized in this approach and enabled differentiation of the mesophilic vs. the thermophilic process.

A more comprehensive study comprising 35 full-scale BGPs was conducted recently (Heyer et al. 2016). Samples were processed without elaborate pre-fractionation since high coverage of protein identifications could be achieved by applying the sensitive Orbitrap mass spectrometry technology. Proteotyping revealed different profile clusters for mesophilic BGPs, thermophilic BGPs, upflow anaerobic sludge blanket (UASB) fermenters, and fermenters fed with sewage sludge. Correlations between expressed proteins and the process parameters (i) ammonia concentration, (ii) sludge retention time, (iii) OLR, and (iv) temperature were detected. Most notably, high ammonia concentrations were associated with protein profiles indicating hydrogenotrophic methanogenesis and syntrophic acetate oxidation (Heyer et al. 2016).

Compilation of protein databases specific for biogas communities substantially improved protein identification in metaproteome analyses (Heyer et al. 2016; Ortseifen et al. 2016). Metagenome sequences representing biogas-producing microbial communities were assembled to deduce encoded gene products for compilation of biogas-specific protein databases. Metagenome assemblies also paved the way for obtaining genetic context information regarding proteins identified in metaproteome analyses. Metagenome contigs encoding metaproteome proteins provide information on genes located in the vicinity of the target gene. For example, an assembled contig from an agricultural BGP assigned to the genus Methanoculleus encoded several methanogenesis enzymes, three of which were also detected as abundant proteins in the community’s metaproteome (Ortseifen et al. 2016).

Quantitative metaproteomics applying nano-liquid-chromatography-(LC)-MS/MS elucidated the degradation of lignocellulosic biomass with corn stover as an example substrate (Zhu et al. 2016). A great diversity of enzymes predicted to be involved in xylan degradation was identified. Most of these enzymes, including xylanases, xylosidases, and cellulases, were secreted by members of the Firmicutes. Interactions between bacterial species and enzymatic synergism with respect to hemicellulose digestion were elucidated in that study.

Likewise, in a bioprospecting approach involving induction of enzyme expression in response to cellulose addition, several enzymes predicted to be associated with cellulose metabolism were identified (Speda et al. 2017). For example, different cellulases, 1,4-β-cellobiosidases, endo 1-4-β-xylanases, cellobiose phosphorylases, and glycosyl hydrolase family proteins were identified in the extracellular metaproteome of the community. Since a corresponding metagenome had not been sequenced in that study, genes encoding enzymes of interest were not accessible, again demonstrating the advantages of combined metagenome/metaproteome analyses.

In a very recent study, the metaproteome of a community digesting grass in a two-stage biogas production system was analyzed under mesophilic vs. thermophilic conditions, addressing the acidogenesis phase of the decomposition process (Abendroth et al. 2017). The metaproteome reflected the microbiome’s activity in polysaccharide utilization and sugar fermentation leading to the formation of short-chain fatty acids. Compared to mesophilic process conditions, the thermophilic community (55 °C) expressed a more stable protein profile, suggesting that the thermophilic process reached its steady state early.

Commonly, metatranscriptome and/or metaproteome studies are complemented by metabolome analyses to correlate gene expression with metabolite profiles. However, to our knowledge, elaborate metabolome profiling has mainly been done for laboratory biogas-producing systems. A recent study revealed more than 200 metabolite peaks for a two-stage lab-scale fermentation reactor digesting corn stalk (Yang et al. 2014a). The acidogenic phase was characterized by high levels of fatty acids, whereas sugars and sugar alcohols accumulated during methanogenesis. The authors concluded that metabolome analyses should be interpreted considering metagenome data of the biogas community studied.

Future prospects

Analyses of biogas producing communities by integrating findings from genome and metagenome sequencing, metatranscriptomics, metaproteomics, and metabolomics approaches provided deep insights into their compositions, performance, interrelations between community members, and dependencies concerning fed substrates and process parameters. Due to further improvements and developments in high-throughput sequencing technologies, large-scale taxonomic profiling will routinely be done for biogas communities to follow their succession in the course of fermentation processes and for monitoring purposes. In this context, full length HT-sequencing of the 16S rRNA marker gene on third- and fourth-generation sequencing platforms is of importance since longer sequences enable more precise taxonomic classifications.

Although hundreds of isolates were obtained from biogas producing communities by elaborate cultivation techniques, these do not represent their entire complexity. Cultivation of other community members may be challenging which most probably is due to specific growth requirements, trophic dependencies, and/or syntrophic associations. Moreover, it appeared that genome sequences of biogas producing microorganisms are underrepresented in public nucleotide sequence repositories implicating that more reference genomes are needed to evaluate metagenome, metatranscriptome, and metaproteome data from biogas microbiomes. Advanced bioinformatics strategies and deep metagenome sequencing led to assemblies and binning of genome sequences representing species that received little attention so far and/or are acting as key-players in AD. Accordingly, the latter approach provided access to the non-cultivable fraction of biogas communities.

Recently, a considerable number of MAGs from biogas communities was collected in databases and still offers the chance to discover new taxa. This resource, combined with metatranscriptome, metaproteome, and metabolome datasets, provides the opportunity to holistically study the metabolic potential and performance of taxa represented by MAGs and relate their activities to changing environmental conditions and process parameters. Increasingly, future research will also include network analyses since data collected by -omics technologies facilitate predictions concerning interactions between different taxa, trophic/syntrophic relationships, and other associations within the microbial community. Implementation of radio-isotopic labelling experiments even makes it possible to follow the fate of particular metabolites and their fluxes.

In the future, metagenome assembly/binning studies will be complemented by single cell genomics (Yilmaz and Singh 2012) involving sorting and separation of single cells from complex communities followed by genome amplification (multiple displacement amplification, MDA) and next-generation sequencing (NGS). Single amplified genomes (SAGs) will complement the repository of biogas microbiome members. It can be expected that results from MAG and SAG analysis will also stimulate the development of new culturomics strategies to enable physiological analysis of new taxa (Lagier et al. 2015). Regarding exploitation and application of the compiled knowledge on biogas communities, the design of inocula (starter cultures) for biogas processes as well as new and innovative approaches for monitoring, management, and engineering of anaerobic digestion assemblages are being considered.