Background

Methane is estimated to have ~ 82.5 times the global warming potential of carbon dioxide over a 20-year timescale [1, 2]. Animal agriculture is believed to be the largest source of anthropogenic methane emissions (95–109 Tg CH4/year), with ruminant livestock responsible for at least 80% (87–97 Tg CH4/year) of these emissions via feed digestion and associated microbial fermentation [3,4,5]. Typically, herbivorous diets consisting of foliage or lignin-rich plant products favour microbial communities with increased methanogen diversity and abundance, largely due to the increased availability of substrates through the bacterial hydrolysis and fermentation of plant polysaccharides [6]. Interestingly, some of Australia’s native marsupial herbivores appear to be ‘low-methane’ emitters. For instance, kangaroos and wallabies (members of the Macropodidae family) are foregut fermenters and eruct less methane (when corrected for digestible energy intake) compared to ruminant livestock when reared on the same diet [7]. Further work has similarly shown two kangaroo species, Macropus fuliginosus and M. rufus, to produce lower concentrations of methane compared to ruminant animals [8].

Several surveys of the foregut digesta contents from these animals suggested methanogen communities are much smaller in both relative and absolute abundances compared to ruminants and in some cases undetectable [9,10,11]. Some methanogen genera are common to the ruminant and macropodid foregut including Methanobrevibacter, Methanosphaera, and Methanomassiliicoccus (originally classified as Thermoplasmatales) [11, 12]. However, there is evidence that members of these genera are distinct between ruminants and marsupials. For example, marsupial Methanosphaera species have smaller genomes and broader methanogenic substrate profiles than their ruminant counterparts [12, 13]. Novel lineages of methanogens have also been observed in metagenomics analysis of the gut microbiome of koala (Phascolarctos cinereus) and southern hairy-nosed wombat (Lasiorhinus latifrons), which showed uncharacterised Methanocorpusculum species in both animals, with a greater relative abundance (2.14%) in the wombat compared to koala (0.11%) [14], suggesting that this lineage may represent a prominent group of methanogens in marsupials. A recent analysis of host-associated archaeal diversity using 16S rRNA gene amplicon data further supports that Methanocorpusculum is a significant archaeal genus that resides within the digestive tract of diverse animal species [15]. However, there is currently a paucity of knowledge about the ecology and evolution of host-associated Methanocorpusculum lineages due to a lack of available genomes and cultured isolates.

Here, we describe the expansion of the phylogenetic and functional understanding of host-associated Methanocorpusculum species using axenic isolates recovered from the common wombat (Vombatus ursinus) and mahogany glider (Petaurus gracilis), as well as MAGs produced from native Australian herbivores and other publicly available datasets. Using these MAGs and isolate genomes, we subsequently perform the first comparative genomic analysis of the genus Methanocorpusculum and identify host-specific genetic adaptations of novel Methanocorpusculum species.

Results

Methanocorpusculum detected in Australian marsupials

To further explore the prevalence of Methanocorpusculum in the faecal microbiome of Australian marsupials, the faecal DNA of 23 marsupial species (n = 102) was screened using universal 16S rRNA amplicon sequencing (Additional file 1: Table S1). Only two archaeal OTUs were detected: one Methanobrevibacter OTU closely related to Methanobrevibacter gottschalkii HO [16] (98.81% sequence identity) and one OTU with 99.2% sequence identity to Methanocorpusculum labreanum Z [17, 18]. Of the marsupial faecal samples, 68% (69/102) contained at least one detectable methanogen OTU (Additional file 1: Table S1). Only 25% (26/102) of samples contained both OTUs, though the majority of these (58%) belonged to the Macropodidae. Interestingly, the average relative abundance of Methanocorpusculum was significantly lower in the marsupial samples that also contained Methanobrevibacter, 0.10 ± 0.22% compared to 0.99 ± 1.56% (p = 0.0013; Fig. 1A, Additional file 1: Table S1), suggesting there may be competition between these two groups of methanogens.

Fig. 1

Methanogen profiles detected in marsupial species with 16S rRNA amplicon sequencing. The phylogenetic tree was built using exon 28 of the von Willebrand factor (vWF) gene of the respective species, available from the NCBI nucleotide database. MEGA-X [19] with MUSCLE was used to align the genes, and phylogeny was inferred using maximum likelihood and 1000 bootstrap replications. A The average relative abundance of the respective OTUs; empty cells indicate that no methanogen signal was detected. B The prevalence of the respective OTUs for each species. The host diet composition is displayed as per the legend, adapted from Shiffman et al. [14]. The number of samples is displayed to the right of the respective species.

This competition is especially evident in the wombats with Methanocorpusculum detected in all wombat samples except for one southern hairy-nosed wombat, which had a high abundance of Methanobrevibacter (5.52%; Fig. 1B). Furthermore, the wombat samples with Methanobrevibacter contained a lower average relative abundance of Methanocorpusculum (0.04 ± 0.06%; n = 4) compared to those without (2.85 ± 2.20%; n = 7; Fig. 1B). The southern hairy-nosed wombat also contained the greatest abundance of Methanocorpusculum (2.72 ± 2.35%; Fig. 1A), while the common wombat had a substantially lower abundance (0.25 ± 0.24%). Similarly, all squirrel gliders (4/4) and 89% (8/9) of mahogany gliders contained Methanocorpusculum, and only a single mahogany glider contained Methanobrevibacter (Fig. 1B). The mahogany glider samples contained the second greatest average abundance of Methanocorpusculum at 1.0 ± 1.2%, with the greatest individual faecal abundance of 3.26% (Fig. 1A, Additional file 1: Table S1). The Macropodidae contained a lower abundance of Methanocorpusculum at 0.11 ± 0.22% potentially due to a higher prevalence of Methanobrevibacter (70%; Fig. 1A, B). The faecal samples from koalas contained both methanogen OTUs, though only three samples were positive for Methanobrevibacter and two for Methanocorpusculum. The average abundance of Methanobrevibacter was greater than that of the kangaroo and wallaby; however, this was attributed to a single koala sample with 2.6%, greater than any other marsupial sample except for the wombat methanogen communities.

Isolation of novel host-associated Methanocorpusculum species enriched in marsupials

Given the high abundance and prevalence of Methanocorpusculum in the common wombat (CW) and mahogany glider (MG), as well as their high methane production in faecal enrichments relative to other marsupials (Additional file 2: Fig. S1), efforts were made to obtain pure cultures of Methanocorpusculum from the enrichments. Cultures supplemented with a combination of CO2/H2 and sodium acetate produced high microbial growth coupled with methane production. Amplicons produced from the CW and MG enrichment cultures both clustered within the order Methanomicrobiales, forming a deep lineage to other available Methanocorpusculum 16S rRNA sequences (Additional file 2: Fig. S2). Both enrichment cultures were representatives of the Methanocorpusculum OTU (100% sequence identity) identified in the marsupial faecal samples (see above; Fig. 1) and distinct from other cultured Methanocorpusculum spp., with 97% identity to the 16S rRNA of Methanocorpusculum labreanum. Single colonies were picked after ~ 4 weeks of incubation at 37 °C. Axenic cultures are henceforth referred to as Methanocorpusculum sp. MG for the mahogany glider isolate and Methanocorpusculum sp. CW153 for the common wombat isolate.

The whole-genome sequences for Methanocorpusculum sp. CW153 and MG are near complete (97.69 and 98.01%, respectively) with low contamination (1.96 and 1.31%, respectively; Table 1). Compared to the three other previously published Methanocorpusculum isolate genomes, all of which are from environmental sources, the genomes of CW153 and MG are larger and also possess a greater number of predicted coding genes according to IMG JGI based annotations (Table 1; Additional file 1: Table S2). Taxonomic classification using GTDB-Tk showed CW153 to be the same species as Methanocorpusculum sp001940805 represented by Phil4, a MAG produced from a faecal sample of a southern hairy-nosed wombat [14]. Strain MG was only classified to the genus level and thus likely represents a novel Methanocorpusculum species. Indeed, the pairwise average nucleotide identity (ANI) of CW153 and MG was only 88%, indicating that the two isolates represent distinct species of host-associated Methanocorpusculum, according to the operational ≥ 95% ANI threshold commonly used for species demarcation [20] (Additional file 1: Table S3).
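The species call above rests on a simple threshold comparison. The following is a minimal sketch of that rule, assuming pairwise ANI values have already been computed by an external tool; the function and constant names are hypothetical and not part of the original analysis.

```python
# Operational species threshold cited in the text (>= 95% ANI) [20].
SPECIES_ANI_THRESHOLD = 95.0


def same_species(ani_percent: float, threshold: float = SPECIES_ANI_THRESHOLD) -> bool:
    """Return True when a pairwise ANI value meets the species threshold."""
    return ani_percent >= threshold


# Pairwise ANI reported in the text for CW153 versus MG (88%).
cw153_vs_mg = same_species(88.0)
print(cw153_vs_mg)  # False: the two isolates fall below the threshold
```

Under this rule, an ANI of 88% places CW153 and MG in different species, which matches the conclusion drawn in the text.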

Table 1 Summary of genome statistics for cultured Methanocorpusculum strains. Strain designation, estimated completeness, estimated contamination, genome size, number of contigs, N50, guanine-cytosine (GC) content, coding density, number of tRNAs out of the 20 canonical amino acids, 5S rRNA count, 16S rRNA count, 23S rRNA count, geographical location, isolation source, and GTDB classification are shown for each available cultured isolate. Only cultured isolates with available genomic information were included

MG and CW153 stained Gram-negative and presented as pleomorphic cells, ~ 0.5 to 1.5 μm in diameter (Additional file 2: Fig. S3B). Viable cells of the two strains were auto-fluorescent at 420 nm (470 nm emission), due to the presence of the reduced form of cofactor F420 (Additional file 2: Fig. S3B) [23, 24]. Transmission electron micrographs (TEM) of MG and CW153 again showed pleomorphism, with both possessing a thin cell wall and a single membrane with no obvious capsule-like structures (Additional file 2: Fig. S3A).

Recovery of Methanocorpusculum-associated MAGs from diverse animal hosts

To further characterise novel host-associated lineages of Methanocorpusculum, 130 MAGs assigned to the genus Methanocorpusculum were recovered from publicly available metagenomes (Additional file 1: Table S4), with 24 of these MAGs being high-quality (HQ; ≥ 90% completeness, ≤ 5% contamination). These MAGs were combined with nine MAGs produced from southern hairy-nosed wombats and mahogany gliders in this study, four human-associated MAGs [25], four environmental MAGs [26], one MAG produced from a wombat [14], 15 MAGs produced from ruminants [27], one MAG from a chicken [28], and 10 other MAGs identified as Methanocorpusculum on the NCBI genome database (Additional file 1: Table S5).
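The quality tiers used here can be expressed as a small filter. This is an illustrative sketch only: it uses the HQ thresholds stated above together with the inclusion thresholds given in the Fig. 2 legend (≥ 50% completeness, ≤ 10% contamination). The `Genome` class, the `quality_tier` function, and the two "hypothetical" entries are assumptions made for the example; the CW153 and MG values are those reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    name: str
    completeness: float   # estimated completeness (%)
    contamination: float  # estimated contamination (%)


def quality_tier(genome: Genome) -> str:
    """Assign a genome to a quality tier using the thresholds quoted in the text."""
    if genome.completeness >= 90 and genome.contamination <= 5:
        return "HQ"        # high-quality: eligible for the comparative analysis
    if genome.completeness >= 50 and genome.contamination <= 10:
        return "included"  # meets the Fig. 2 inclusion thresholds only
    return "excluded"


genomes = [
    Genome("CW153", 97.69, 1.96),             # isolate values reported in the text
    Genome("MG", 98.01, 1.31),                # isolate values reported in the text
    Genome("hypothetical_MAG_a", 72.4, 3.0),  # made-up example
    Genome("hypothetical_MAG_b", 45.0, 1.0),  # made-up example
]
tiers = {g.name: quality_tier(g) for g in genomes}
print(tiers)
```

Checking the HQ condition first keeps the tiers mutually exclusive, so every HQ genome would also pass the broader inclusion filter.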

A genome tree comprising the resultant 176 Methanocorpusculum genomes was constructed using a concatenated set of 122 archaeal marker genes that shows a striking diversity of host-associated Methanocorpusculum species (Fig. 2). There are at least six environment-associated (ENC) and 23 host-associated (HAC) Methanocorpusculum species (≥ 95% AAI), of which 17 are novel (Fig. 2, Additional file 1: Table S5) as of GTDB release 06-RS202. MAGs recovered from rhinoceros, elephant, and horse samples contained the greatest diversity, comprising nine species (Fig. 2; HAC001, 002–012, 020–021). Two MAGs from rhinoceros produced an outlying clade compared to all other Methanocorpusculum genomes (Fig. 2; HAC001) likely representing a novel Methanocorpusculaceae genus. One MAG recovered from a sperm whale was phylogenetically distinct but grouped closest with HAC003 and HAC004 recovered from elephant and rhinoceros, respectively. The MAGs recovered from domesticated (i.e. sheep, cows, and goats) and wild (i.e. water buffalo and water deer) ruminants were assigned to five species (HAC013–HAC017). Additionally, four human-derived MAGs were also assigned to HAC017 (Fig. 2). The MAGs recovered from ptarmigan represent a divergent Methanocorpusculum species (Fig. 2; HAC018), potentially resulting from the geographic isolation of this avian species. Methanocorpusculum MAGs and genomes were recovered from two Australian marsupials and separated into two distinct species: those produced from mahogany gliders (M. petauri sp. nov.; see below) and those from wombats (M. vombati sp. nov.; see below), except for one wombat MAG which clustered with M. petauri. Two MAGs produced from rhesus macaque were also assigned to M. vombati, and two MAGs recovered from elephants represent two species (HAC020, HAC021) closely related to but distinct from M. vombati. HAC023 is the most highly represented species with 66 MAGs, accounting for the most represented host (Fig. 2B; chicken) and geographical location (Fig. 2C; India), though this is likely a sampling artefact due to a large number of chicken metagenomes analysed compared to other animals.

Fig. 2

Phylogenetic distribution of Methanocorpusculum MAGs and isolate genomes. A Concatenated archaeal marker gene files were produced using GTDB-Tk (v.1.3.0) [29], with Methanomicrobium mobile BP used as the outgroup. Phylogeny was inferred using FastTree (v2.1.10) [30] and visualisation by iToL (https://itol.embl.de/). MAGs and isolate genomes of ≥ 50% completeness and ≤ 10% contamination were included, and HQ MAGs were identified by blue circles. Cultured isolates were identified by a red star. Bootstrap values are shown by the red (≥ 0.7) and black (≥ 0.9) circles. All MAGs and isolate genomes from environmental sources were identified as ‘environmental’ under the host description. B The host distribution and C geographical distribution of Methanocorpusculum genomes

Methanocorpusculum genotypes cluster according to the host environment

To explore the genetic potential of Methanocorpusculum, 52 high-quality MAGs and isolate genomes representing 14 Methanocorpusculum species were compared (Fig. 3A, Additional file 1: Table S5). These included five ENC and nine HAC species, noting that no species had both environmental and host-associated representatives. AAI confirmed the separation of these species based on an operational definition (≥ 95% AAI; Additional file 2: Fig. S4). Analysis of the core and pan genomes, as defined by Chaudhari et al. [31], showed 8290 genes contributing to the Methanocorpusculum pangenome and only 149 core genes (~ 1.8%; Additional file 2: Fig. S5A). When the 11 ENC and 41 HAC genomes were analysed separately, the number of core genes for the ENC genomes increased to 890 out of 3097 (~ 29% of the pangenome). However, the core genes for the HAC genomes only increased to 179 out of 6559 (~ 2.7% of the pangenome), again showing the diversity of the HAC species (Additional file 2: Fig. S5B-C). It is worth noting the HAC maintained a significantly smaller core genome when 10 genomes were randomly sampled from both groups; 533 for the HAC compared to 918 for the ENC (P < 0.0001, Additional file 2: Fig. S5B-C). As expected, genomes from individual species contained a greater number of core genes (848–1441), although the MAGs from ptarmigan contained a smaller number of core genes, likely attributed to the comparatively smaller genome size and number of coding genes (Additional file 2: Fig. S6).
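The core and pangenome proportions quoted above can be checked with a short calculation. The sketch below assumes a simplified definition (pangenome = union of gene families, core = families present in every genome); it does not reproduce the clustering pipeline of Chaudhari et al. [31]. The function names and the toy gene sets are hypothetical.

```python
def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (pangenome, core genome) from per-genome sets of gene families."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)        # families seen in at least one genome
    core = set.intersection(*gene_sets)  # families shared by every genome
    return pan, core


def core_percent(core_count: int, pan_count: int) -> float:
    """Core genome size as a percentage of the pangenome."""
    return 100.0 * core_count / pan_count


# Toy presence data (illustrative gene labels only, not real annotations).
toy = {
    "genome_1": {"mcrA", "mcrB", "adh"},
    "genome_2": {"mcrA", "mcrB", "budC"},
    "genome_3": {"mcrA", "mcrB", "nifH"},
}
pan, core = pan_and_core(toy)
print(len(pan), len(core))  # 5 2

# Counts reported in the text: (core genes, pangenome genes).
reported = {"all": (149, 8290), "ENC": (890, 3097), "HAC": (179, 6559)}
for label, (n_core, n_pan) in reported.items():
    print(f"{label}: {core_percent(n_core, n_pan):.1f}%")
```

With the reported counts, the percentages come out at roughly 1.8%, 28.7%, and 2.7%, consistent with the rounded values in the text.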

Fig. 3

Phylogenetic and genotypic distribution of high-quality Methanocorpusculum MAGs and isolate genomes. A Phylogenetic tree of concatenated archaeal marker gene files was produced using GTDB-Tk (v.1.3.0) [29], with Methanomicrobium mobile BP used as the outgroup. Phylogeny was inferred using FastTree (v2.1.10) [30] and visualisation by iToL (https://itol.embl.de/). MAGs and isolate genomes of ≥ 90% completeness and ≤ 5% contamination were included. Cultured isolates are identified by a red star. Bootstrap values are shown by the red (≥ 0.7) and black (≥ 0.9) circles. The black arrow denotes the common ancestor of Methanocorpusculum, the white arrow denotes the common ancestor of the Env clade and host clade 2, and the blue arrow denotes the common ancestor of the Env clade. B, C The genetic variance based on the presence/absence of KO and ortholog annotation, respectively. Gene annotations and PCA plots were generated using EnrichM (v0.4.15, https://github.com/geronimp/enrichM) with the ‘--ko’ (B) and ‘--orthologs’ (C) analyses. Genomes are coloured according to species, as per the legend

Based on a well-supported phylogeny, we infer that the ancestor of the genus Methanocorpusculum was host-associated and that this trait was lost in one line of descent (arrowed in Fig. 3A). This environmental clade currently comprises six species, including M. labreanum and M. parvum, and effectively splits host-associated species into two distinct clades: host clade 1 (four species found in birds, marsupials, and one species of old-world monkey) and host clade 2 (five species found in pachyderms and ruminants; Fig. 3A). These two distinct clades likely correspond to the dominant host-associated Methanocorpusculum clades recently identified by Thomas et al. [15] using 16S rRNA sequencing. This phylogenetic separation is reflected in predicted functional differences between the clades including pathways involved in amino acid, carbohydrate, energy, and lipid metabolism, as well as membrane transport, signalling and cellular processes, and genetic information processing (Fig. 3B, C; Additional file 1: Table S6). Genes encoding homocitrate synthase, homoisocitrate dehydrogenase, and homoaconitase were significantly enriched in the environmental Methanocorpusculum spp. Interestingly, the ptarmigan genomes were phylogenetically distinct (Fig. 3A) but share functional features with genomes from the ruminant group (Fig. 3B), suggesting these two separate lineages show some convergence in their adaptation to a similar habitat provided by their host. Indeed, further evidence of host-specific adaptations is reflected in the gene annotations lacking a current functional assignment (Fig. 3C).

Host-associated Methanocorpusculum species have unique metabolic potential

According to KO annotations, 413 genes were significantly differentially enriched between the Methanocorpusculum species (Additional file 1: Table S7). Most of these KOs were associated with metabolism (53%), followed by transport (~ 25%) and genetic information processing (~ 12%). KOs assigned to the metabolism of cofactors and vitamins, amino acid metabolism, cell motility and defence, and transport proteins were significantly differentially enriched between the host groups (Additional file 1: Table S8, Additional file 2: Figs. S7-S12).

In terms of genes encoding functions for methanogenesis and energy production, the environmental genomes were significantly enriched for alcohol dehydrogenase (adh) AKR1A1 (K00002; Fig. 4A). This gene was described in M. parvum and inferred to be involved in the use of short-chain alcohols as alternative reductants in CO2-dependent methanogenesis (Fig. 4A) [32]. The absence of this gene in host-associated Methanocorpusculum species suggests that they are unable to use short-chain alcohols through this pathway. However, the M. vombati (HAC022) genomes were significantly enriched for a predicted meso-butanediol dehydrogenase (budC), which may allow for the utilisation of meso-2,3-butanediol, (S)-acetoin and/or (S,S)-butane-2,3-diol for the NADH-dependent production of hydrogen (Fig. 4A) [33]. Additionally, the genomes from ruminant hosts (HAC016-017) were enriched for a different alcohol dehydrogenase family (adh2), which has been characterised in Gordonia for its use of 2-propanol [34].

Fig. 4

Differential enrichment of carbohydrate and energy metabolism-associated genes in Methanocorpusculum. KO annotation and statistical analyses were performed using the ‘annotate’ and ‘enrichment’ functions of EnrichM (v0.4.15; https://github.com/geronimp/enrichM). Genomes were grouped by host species and compared by Fisher’s exact test, where KOs with corrected p values of < 0.05 were retained and considered significant. Heatmap values are colour-coded according to the legend and represent the proportion of respective genomes for a given host group. The Methanocorpusculum are also labelled as environmental clade (green), host clade 2 (orange), and host clade 1 (blue), as per Fig. 3. A Carbohydrate metabolism. B Energy metabolism

Some species of Methanocorpusculum are also enriched for genes which could support additional or alternative pathways for growth. For example, HAC003, M. vombati, and M. petauri encode genes annotated as benzoyl-CoA reductase subunit A (badF; Additional file 2: Fig. S12), which catalyses intermediate steps in benzoate degradation and is closely related to 2-hydroxyglutaryl-CoA dehydratase of amino acid fermenting Gottschalkia. Although badF is a key marker gene for aromatic compound metabolism for many bacteria [35, 36], our Methanocorpusculum-affiliated MAGs and isolate genomes do not possess genes with confirmed or putative capacity for the complete metabolic pathway. As such, while our results raise the intriguing possibility of there being “ancillary” metabolic scheme(s) in these species, coupled or uncoupled from methanogenesis and/or growth, more detailed studies are needed to functionally validate these predictions. The M. vombati genomes were significantly enriched for adh1, with the protein sequence sharing 56% similarity (E = 6e−157) to the predicted phosphonoacetaldehyde reductase of the bacterium Natronincola peptidivorans (Additional file 2: Fig. S12). Phosphoenolpyruvate phosphomutase (PPM) and phosphonopyruvate decarboxylase (PPD) were also enriched in the marsupial genomes (Additional file 2: Fig. S12), with all three genes involved in phosphonate metabolism. These genes are found within a single gene cluster in the Methanocorpusculum vombati CW153 genome flanked by genes with similarity to bacterial transposases (CDS358) and, as such, may have been acquired through horizontal gene transfer.

Furthermore, multiple genes of bacterial origin were found to be enriched in the marsupial lineages. For instance, M. petauri was significantly enriched for KOs associated with nitrogen assimilation (nifDEKNU, nifHD1/2; Fig. 4B, Additional file 2: Fig. S6), which also look to be of bacterial origin. Additionally, the marsupial Methanocorpusculum species were differentially enriched for specific CAZymes, including glycosyltransferase (GT) families 2, 39, and 83 (Additional file 2: Fig. S13). M. vombati specifically encodes for a greater number of GT4 and GT66, as well as carbohydrate-binding module (CBM) 44 that has been shown to bind both cellulose and xyloglucan (Additional file 2: Fig. S13) [37]. Comparatively, M. petauri encodes for a greater number of GT8, with the genome of M. petauri MG specifically encoding for the greatest number of GT2, as well as GT11, GT111, and GT10 (Additional file 2: Fig. S13). These unique CAZymes show the greatest similarity to bacterial protein sequences, suggesting they may also be of bacterial origin.

Marsupial-associated Methanocorpusculum isolates have simplified substrate utilisation

As expected, and like their environmental counterparts, both M. vombati and M. petauri could use CO2 and H2 for growth [17, 38], but the removal of sodium acetate and sodium formate from the basal medium significantly reduced the maximum yield of both strains (Fig. 5). While formate is a likely substrate for methanogenesis via formate dehydrogenase-mediated activation, acetate is unlikely to be used for methanogenesis and instead likely serves as a source for central carbon assimilation, as shown for M. parvum [32, 38].

Fig. 5

Primary in vitro substrate utilisation profile of M. vombati CW153 and M. petauri MG. Strains were grown using BRN-RF10 medium without added sodium formate and sodium acetate at 37 °C with 1% (v/v) supplementation of each substrate, as listed in the legend. CO2/H2 was used for the positive control. CO2/H2 with ‘BRN+’ (i.e. with sodium formate and sodium acetate added) was used to show growth in basal BRN-RF10 medium. A, B M. vombati CW153 cultured with a headspace of CO2 and H2, respectively. C, D M. petauri MG cultured with a headspace of CO2 and H2, respectively. CO2 and H2 alone were used for negative controls. Growth was measured by optical density at 600 nm (OD600) at ~ 2 h intervals.

Despite the lack of M. parvum adh homologue in our cultured isolates, another type of adh was predicted to be present. As such, it was unclear whether these new host-associated Methanocorpusculum spp. could utilise short-chain alcohols for growth via an alternative, currently uncharacterised pathway(s). To that end, both M. vombati CW153 and M. petauri MG were cultured using a basal medium prepared to contain various short-chain alcohols to determine their potential to perform alcohol-dependent methanogenesis. Contrary to the environmental Methanocorpusculum isolates, neither M. vombati nor M. petauri showed significant growth with short-chain alcohols as substrates for methanogenesis, even with prolonged incubation (500 h; data not shown). It is worth noting that M. petauri grown with a headspace of CO2 and 1-butanol supplementation did produce a statistically significant increase in yield (OD600 = 0.022) after ~ 200 h of incubation; however, such a small increase in yield suggests CO2 and 1-butanol are likely not viable substrates for M. petauri (Additional file 2: Fig. S14). Such findings suggest that both M. vombati and M. petauri are effectively incapable of growth with short-chain alcohols as alternative substrates for methanogenesis, consistent with the absence of an adh homologue in these species (AKR1A1; Fig. 4).

Discussion

Members of the genus Methanocorpusculum have long been regarded as “environmental” methanogens, principally isolated from soils and hydrocarbon-rich bogs [17, 21]. However, there are sporadic reports of 16S rRNA gene amplicons affiliated with Methanocorpusculum from stool samples of various animal species [22, 39]. Recently, Thomas et al. [15] showed the presence of Methanocorpusculum-associated sequences across diverse lineages of animal hosts, suggesting that this genus represents an important but largely uncharacterised host-associated group of methanogens. We have expanded on previous studies [9,10,11, 14] to show the presence of Methanocorpusculum in the faecal microbiomes of a wide variety of marsupial species, including a relatively high abundance in wombat and glider species. Furthermore, we have isolated two host-associated representatives of this genus from a common wombat and mahogany glider. Based on our phylogenetic, AAI, and comparative genomic analyses, these isolates represent two novel host-associated species of Methanocorpusculum, for which we propose the names M. vombati (strain CW153) and M. petauri (strain MG), respectively. These strains constitute the second and third host-associated isolates for this genus, after M. aggregans BU5 isolated from a buffalo [40], and the first two for which genomic information is available.

Shiffman and colleagues [14] successfully recovered a MAG from the faecal metagenome of a southern hairy-nosed wombat, Methanocorpusculum sp. Phil4, representing the first host-associated genome of Methanocorpusculum. Subsequently, Methanocorpusculum MAGs have been produced as part of several metagenomic studies of both human and non-human hosts [25, 27, 28, 41]. Through our analysis, we have further expanded the number of available Methanocorpusculum MAGs to include more than 20 distinct species from the faecal microbiomes of various mammalian and avian animals. Interestingly, the phylogenetic placement and genomic profile of each species were consistent except for HAC018, which grouped phylogenetically with host clade 1 but genetically with host clade 2 (Fig. 3). Like the hoatzin, ptarmigan have evolved a prominent crop for the microbial fermentation of a highly herbivorous diet that may have induced similar genetic adaptations to those observed in the Methanocorpusculum recovered from ruminant hosts [42, 43].

Environmental Methanocorpusculum species appear to have a wider capacity for the utilisation of short-chain alcohols that was not observed in the host-associated strains. Previous analyses of the environmental species, M. parvum and M. bavaricum, demonstrated that they can utilise short-chain alcohols in CO2-dependent methanogenesis, with M. parvum able to use 2-propanol and 2-butanol [21]. Analysis of the respective adh further predicted the utilisation of cyclopentanol, 2,3-butanediol, ethanol, and 1-propanol for M. parvum and cyclopentanol and 2,3-butanediol for M. bavaricum [44]. The utilisation of short-chain alcohols by M. parvum was attributed to an adh (AKR1A1) that likely functions by converting 2-propanol and NADP to acetone and NADPH [45]. The absence of homologues of the M. parvum adh in all host-associated MAGs and isolate genomes suggests they are unable to utilise short-chain alcohols through this pathway. Indeed, we showed both M. vombati and M. petauri were unable to use short-chain alcohols as primary substrates for methanogenesis (Fig. 5). This simplified substrate profile is in contrast to Methanosphaera, which are found in the kangaroo gut and can utilise ethanol in methanol-dependent methanogenesis unlike other Methanosphaera spp. [12].

Contrary to the prevailing idea that Methanocorpusculum is primarily an environmental genus of methanogens, our expanded dataset indicates that this lineage was ancestrally host-associated and that one branch of the genus comprising the best characterised isolates (M. parvum, M. labreanum, and M. bavaricum) transitioned to an environmental lifestyle. Furthermore, the absence of alcohol-utilising M. parvum adh (AKR1A1) homologues in all host-associated lineages suggests the capacity to utilise short-chain alcohols through this pathway is a unique acquisition by the environmental Methanocorpusculum lineage (blue arrow in Fig. 3) and may indicate an adaptation to the hydrocarbon-rich environments in which these lineages are often found [17, 21]. However, M. vombati, HAC016, and HAC017 do contain homologues of uncharacterised archaeal and bacterial alcohol dehydrogenases, suggesting specific species of host-associated Methanocorpusculum may also have the capacity to utilise short-chain alcohols despite the absence of AKR1A1, though this requires validation.

Similarly, the host-associated Methanocorpusculum also contain a subset of genes indicative of host-specific adaptations. M. vombati is enriched for a cluster of genes characterised in the biosynthesis of phosphonate compounds such as the antibiotic dehydrophos produced by Streptomyces luridus and fosfomycin from Streptomyces wedmorensis [46,47,48], suggesting this lineage of Methanocorpusculum may also produce antibiotic phosphonate compounds to improve persistence within the gut. The ruminant Methanocorpusculum species contain a predicted bile salt hydrolase for the detoxification of bile acids [49], which is similarly encoded by the bovine-derived Methanosphaera sp. BMS [13]. In a recent analysis of bile acid metabolism in dairy cows, over one-third of MAGs analysed contained bile acid transformation pathways, including Methanobrevibacter and Methanocorpusculum, indicating the importance of bile acid metabolism as an adaptation to the bovine gastrointestinal tract [50]. Furthermore, the host-associated Methanocorpusculum contained a reduced capacity for the biosynthesis of tryptophan and other aromatic amino acids. This suggests the intestinal tract provides a constant source of exogenous amino acids to the host-associated species, whereas the inconsistent availability of amino acids in the environment has necessitated biosynthesis in the environmental Methanocorpusculum.

Conclusions

Through our characterisation of host-associated MAGs and isolate genomes, we have confirmed that the genus Methanocorpusculum is widely found in the gastrointestinal tract of herbivores and that each of the species encodes unique genetic adaptations to its host environment. Furthermore, the ancestor of the genus Methanocorpusculum was likely host-associated, and these host-associated traits were lost in the environmental lineage. Future studies are required to determine how these host-associated Methanocorpusculum species interact with the wider gut microbiome compared to other groups of methanogenic archaea, such as the typically dominant Methanobrevibacter.

Methanocorpusculum vombati (sp. nov.)

Methanocorpusculum sp. CW153T represents a novel species of Methanocorpusculum isolated from the Australian common wombat (Vombatus ursinus), for which we propose the name Methanocorpusculum vombati (sp. nov.; from Vombatus, denoting the genus of the common wombat from which it was isolated). This organism is a hydrogenotrophic methanogenic archaeon, for which the type strain of the species is strain CW153T.

Methanocorpusculum petauri (sp. nov.)

Methanocorpusculum sp. MGT represents a novel species of Methanocorpusculum isolated from the Australian mahogany glider (Petaurus gracilis), for which we propose the name Methanocorpusculum petauri (sp. nov.; from Petaurus, denoting the genus of the mahogany glider from which it was isolated). This organism is a hydrogenotrophic methanogenic archaeon, for which the type strain of the species is strain MGT.

Methods

Marsupial sample collection and storage

Marsupial faecal samples were collected from sanctuaries and zoos in South-East Queensland (Lone Pine Koala Sanctuary, Brisbane and David Fleay Wildlife Park, Burleigh Heads) and North Queensland (Wildlife Habitat, Port Douglas and Cairns Tropical Zoo). Faecal material was collected from 23 marsupial species, including greater glider (n = 4), mahogany glider (n = 9), squirrel glider (n = 4), yellow-bellied glider (n = 1), eastern grey kangaroo (n = 13), red kangaroo (n = 12), koala (n = 125), Lumholtz’s tree-kangaroo (n = 7), red-legged pademelon (n = 4), common brushtail possum (n = 10), common ringtail possum (n = 4), green ringtail possum (n = 5), Herbert River ringtail possum (n = 2), mountain brushtail possum (n = 1), short-eared possum (n = 3), striped possum (n = 2), agile wallaby (n = 3), northern nail-tail wallaby (n = 3), parma wallaby (n = 3), red-necked wallaby (n = 3), swamp wallaby (n = 2), common wombat (n = 6), and southern hairy-nosed wombat (n = 8). Ethical permission for the collection of all samples was granted by the Animal Welfare Unit, the University of Queensland, Brisbane, Australia, under ANRFA/SCMB/099/14. Half of each sample was placed in Eppendorf tubes and frozen at − 80 °C within 24 h of collection, where it was kept until processing. The other half was resuspended in RF30 [12, 51] and incubated at 37 °C for 24 h. Subsequently, ~ 2 mL of headspace gas was retrieved from each tube using a sterile, gas-tight syringe and subjected to gas chromatography analyses, as described by Gagen et al. [52], using a Shimadzu GC-2014 (Shimadzu, Kyoto, Japan) fitted with a flame ionisation detector for CO2, H2, and CH4. A subsample of each culture was stored in anaerobic glycerol [53] at − 80 °C.

Faecal sample DNA extraction, amplicon sequencing, and metagenome sequencing

A subset of collected marsupial faecal samples was chosen for DNA extraction (Additional file 1: Table S1). The faecal DNA extraction methods are described by Shiffman et al. [14]. For the 16S rRNA analyses, the V6–V8 hypervariable region (926F–1392R [54]) of the 16S rRNA gene was amplified by PCR in 50 μL volumes containing 25 ng of DNA, 5 μL of 10× buffer, 1.5 μL of Bovine Serum Albumin (Roche diagnostic, Australia), 0.2 μL of 1 U Fisher Taq DNA polymerase (Thermo Fisher Scientific Inc., USA), 1 μL of dNTP mix (each at a concentration of 10 mM), 4 μL of 25 mM MgCl2, and 1 μL each of 10 mM 926F and 1392R primers [54] ligated to Illumina adapter sequences. Each reaction was performed using the following cycling conditions: 95 °C for 3 min, followed by 30 cycles of 95 °C for 30 s, 55 °C for 30 s, and 74 °C for 30 s, and a final extension at 74 °C for 10 min. AMPure XP beads (Beckman Coulter, Brea, CA, USA) were used to purify the resulting amplicons, as per the manufacturer’s instructions. Each sample was then indexed with unique 8-bp barcodes using the Illumina Nextera XT V2 Index Kit Set A-D (Illumina FC-131-1002; Illumina, San Diego, CA, USA) under standard PCR conditions. Equimolar indexed amplicons were pooled and sequenced at the Australian Centre for Ecogenomics, using the Illumina MiSeq platform with the version 3 reagent kit for 300 cycles, according to the manufacturer’s instructions. The raw data were demultiplexed and processed as per Shiffman et al. [14].

Four mahogany glider and four southern hairy-nosed wombat samples containing a high abundance of Methanocorpusculum were chosen for metagenomic sequencing. Aliquots of the extracted DNA were subjected to double-size selection for Illumina library preparation. First, 60 μL of AMPure XP beads (Beckman Coulter, Brea, CA, USA) was mixed with 100 μL of the DNA extract, vortexed, and held at room temperature for 5 min. The sample tubes were then placed on a magnetic stand for ~ 5 min, and once the solution was clear, the supernatant containing the desired DNA fragments was transferred to another sterile tube and the beads discarded. This process was repeated with 10 μL of AMPure XP beads (Beckman Coulter, Brea, CA, USA), and then the sample tube was placed on a magnetic stand as above, after which the supernatant was discarded. While remaining on the magnetic stand, the beads were washed by two rounds of exposure to 200 μL of 80% (v/v) ethanol for 30 s, with the ethanol removed at each step via pipette. The beads were then air-dried for ~ 15 min on the magnetic stand, and 25 μL of nuclease-free water was added, vortexed, and held at room temperature for 2 min. The mixtures were then placed on the magnetic stand for ~ 1 min (or until the solution was clear) and the liquid containing the eluted DNA was harvested via pipette and transferred to a new sterile tube. The DNA libraries for each sample were constructed using the Illumina Nextera XT DNA library preparation kit (Illumina, San Diego, CA, USA) and ~ 3 nM of each library was then sequenced using the Illumina NextSeq 500 platform with 2 × 150-bp paired-end chemistry, using standard protocols at the Australian Centre for Ecogenomics.
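The two bead additions in the double-size selection can be expressed as cumulative bead-to-sample volume ratios, which is how such selections are conventionally described. The short sketch below works through that arithmetic from the volumes given above. The function name is hypothetical, and the assumption that the first 60 μL of bead buffer carries over in the transferred supernatant is ours rather than stated in the protocol.

```python
def cumulative_bead_ratios(sample_ul, bead_additions_ul):
    """Return the cumulative bead:sample volume ratio after each bead addition.

    Assumes (not stated in the text) that the bead buffer from earlier
    additions stays with the supernatant carried into the next step.
    """
    ratios = []
    total_beads = 0.0
    for volume in bead_additions_ul:
        total_beads += volume
        ratios.append(total_beads / sample_ul)
    return ratios


# Volumes from the protocol: 100 uL DNA extract, then 60 uL and 10 uL of beads.
ratios = cumulative_bead_ratios(100.0, [60.0, 10.0])
for step, ratio in enumerate(ratios, start=1):
    print(f"bead addition {step}: {ratio:.1f}x")
```

Under that assumption the first addition corresponds to a 0.6× ratio, after which the beads are discarded, and the second brings the cumulative ratio to 0.7×, after which the beads are kept.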

Assessment of archaea prevalence and diversity in metagenomic sequencing datasets

A phylogenetic tree was constructed using the von Willebrand factor (vWF) of respective species available from the NCBI nucleotide database. The vWF genes were aligned using MUSCLE in MEGA-X [19], and phylogeny was inferred using maximum likelihood with 1000 bootstraps. The abundance of respective archaeal OTUs (above) was visualised against the marsupial phylogeny using pheatmap (v1.0.12) in RStudio (v2022.02.0-443).

Recovery of archaeal MAGs from marsupial metagenomic sequencing datasets

SeqPurge with default settings (v.2018_11) [55] was used for adaptor trimming of the raw reads. metaSPAdes (v3.13.0) [56] was used to produce contiguous sequences, with auto PHRED offset and k-mer assembly lengths of 21, 33, and 55. BamM (v1.7.3; https://github.com/Ecogenomics/BamM) was used to map the paired-end reads of samples back to respective sample types (kangaroo, koala, glider, or wombat) and to produce contig coverage. UniteM (v0.0.16; https://github.com/dparks1134/UniteM) was used to recover MAGs from each sample, and CheckM (v1.0.12) [57] was used to assess quality. MAGs with ≥ 50% completeness and ≤ 10% contamination were retained and taxonomically assigned using GTDB-Tk (v1.0.2) [29]. The mean coverage of contigs for each MAG was determined using CoverM (v.0.4.0; https://github.com/wwood/CoverM) in contig mode.

Methanogen enrichment and isolation from marsupial faecal samples

The wombat and mahogany glider samples that produced the highest CH4 for the given species were chosen for methanogen isolation. A 200-μL subsample of the wombat faecal slurry (CW153) was inoculated into 10 mL volumes of anaerobic BRN-RF10 medium [58] in Balch tubes and pressurised to 150 kPa with either H2:CO2 (80:20) or H2 gas. Cultures with H2 alone were supplemented with 1% (v/v) combinations of methanol (Sigma-Aldrich; 179337), ethanol (Sigma-Aldrich; E7023), 2-propanol (Sigma-Aldrich; I9516), 1-butanol (Sigma-Aldrich; 360465), 2 M sodium acetate, or 2 M trimethylamine (TMA) filter sterilised solutions. Streptomycin (600 μg/mL), ampicillin (200 μg/mL), and erythromycin (100 μg/mL) were added to all cultures, before incubation at 37 °C with rotational agitation at 100 rpm. Once subcultures of the enrichments were determined to be free of bacteria by PCR (27F/1492R) [59], CW153 cultures were then 10-fold serially diluted and 0.5 mL of the highest dilution which showed growth was transferred to BRN-RF10 (0.7%) agar roll tubes containing respective substrates. The roll tubes were incubated at 37 °C for 4–6 weeks. Random colonies were aseptically picked using a sterile glass Pasteur pipette and propagated in BRN-RF10 containing the respective gas/substrate combination.

For the mahogany glider enrichment, streptomycin and ampicillin were used but were changed for erythromycin (100 μg/mL) and vancomycin (50 μg/mL) after 10 subcultures to further suppress bacterial growth. One hundred microlitres of the bacteria-free enrichment cultures was spread on anaerobic BRN-RF10 agar (1.5% w/v), supplemented with respective substrates, in an anaerobic chamber (Coy Laboratory Products, MI, USA) with an atmosphere of CO2:H2:N2 (15:5:85). The agar plates were incubated at 37 °C for 4–6 weeks. Single colonies were picked from the agar and propagated in a broth medium as described above. The broth cultures from both rounds of enrichment were then evaluated for their purity and taxonomic origin using archaeal-specific 16S rRNA PCR (86F/1492R) [60]. PCR amplicons were cleaned using the Wizard SV Gel and PCR Clean-Up System gel extraction protocol, as per the manufacturer’s instructions, and sequenced at AGRF (https://www.agrf.org.au/). MEGA-X [19] was used to align the amplicon sequences with reference methanogen 16S rRNA sequences downloaded from the NCBI nucleotide database. Sequences were aligned using MUSCLE in MEGA-X, and phylogeny was inferred using maximum likelihood with 1000 bootstraps. The phylogenetic tree was visualised using iTOL (https://itol.embl.de/). Axenic broth cultures of the novel methanogens, hereafter referred to as Methanocorpusculum sp. CW153 and Methanocorpusculum sp. MG, were stored in anaerobic 30% glycerol solution at − 80 °C, prepared as per Teh et al. [53].

Methanogen whole-genome sequencing

High-molecular weight genomic DNA was extracted from 10-mL cultures of each isolate using the consecutive freeze-thaw method described by Hoedt et al. [13], with 15 sets of consecutive freeze-thaws on dry ice for 5 min and 55 °C for 3 min. The quality and quantity of the genomic DNA samples were confirmed by Nanodrop and agarose gel electrophoresis, prior to genome sequencing at the Australian Centre for Ecogenomics. The Nextera DNA Flex Library Preparation Kit (Illumina #20018705) was used according to the manufacturer’s instructions, and the Mantis Liquid Handler (Formulatrix) was used for library preparation and cleanup. Each library was quality assessed using the TapeStation 4200 (Agilent #G2991AA) with Agilent D1000 HS tapes (#5067-5582) and quantified using the Quant-iT™ dsDNA HS Assay Kit (Invitrogen), as per the manufacturer’s instructions. Each library was sequenced using the Illumina NextSeq500 platform with NextSeq 500/550 High Output v2 2 × 150 bp paired-end chemistry and a sequencing depth of 1 Gbp for each sample.

Sequences were trimmed using Trimmomatic (v0.32) [61] and assembled using SPAdes (v3.14.1) [56]. The quality of each genome assembly was determined using CheckM (v1.1.2) [57], and taxonomic classification was performed using GTDB-Tk (v1.3.0) [29]. The coverage of each genome was determined using BamM (v1.7.3; https://github.com/Ecogenomics/BamM) and samtools mpileup. Predicted coding sequences were annotated using Prokka (v1.14.6) [62] and the IMG Annotation Pipeline (v5.0.23; https://img.jgi.doe.gov/submit/) [63, 64]. BlastKOALA was used to assign KEGG Orthologs to the protein sequences of each genome [65].

Microscopy and transmission electron microscopy of marsupial methanogen isolates

Methanogen isolates were cultured in BRN-RF10 medium with 150 kPa of CO2:H2 (20:80) headspace gas and respective substrates for methylotrophic strains. For light microscopy, samples of the cultures were heat-fixed on glass slides and stained using standard Gram staining protocols. Gram-stained slides were then imaged using a Nikon Eclipse 50i, under 100× magnification. Wet mount slides of each culture were visualised using a Zeiss AX10 epifluorescence microscope at 420 nm with a cyan (47 HE) filter set. Transmission electron microscopy of each isolate was conducted by Dr. Rick Webb at the University of Queensland Centre for Microscopy and Microanalysis (https://cmm.centre.uq.edu.au/). Cultures of each isolate were pelleted and mixed with low-gelling temperature agarose made with uninoculated BRN-RF10 medium. The sample was then immediately frozen using a Leica EMPACT2 high-pressure freezer. Each sample was then freeze-substituted (1% osmium tetroxide, 0.5% uranyl acetate, and 5% water in acetone), as per McDonald and Webb [66]. Samples were brought to room temperature and washed with acetone. Epon resin was used for infiltration and allowed to polymerise for 2 days at 60 °C. A Leica Ultracut UC6 ultramicrotome was used to produce ultrathin sections, which were picked up on Formvar-coated copper grids. The sections were stained with Reynolds lead citrate for 1 min and 5% uranyl acetate in 50% ethanol for 2 min and re-stained in Reynolds lead citrate again for 1 min, with a water wash after each subsequent step [67]. The sections were visualised, and micrographs were taken using a Hitachi HT7700 transmission electron microscope operated at 80 kV.

Recovery of Methanocorpusculum MAGs from publicly available datasets

A total of 1276 metagenomes from 20 publicly available datasets were downloaded from the NCBI SRA database (https://www.ncbi.nlm.nih.gov/sra) between 18/05/2019 and 15/12/2020 (Additional file 1: Table S4). Each metagenome was trimmed using Trimmomatic (v0.32) [61] and assembled using MEGAHIT (v1.1.1) [68]. BamM (v1.7.3) was used to map the reads back to the assembly, and MetaBAT (v2.12.1) was used to produce genome bins with a minimum contig size of 1500 bp. Bin coverage was estimated using BamM and samtools as above. Bin quality was assessed using CheckM (v1.0.7) [57], and GTDB-Tk (v1.3.0) [29] was used to taxonomically assign each archaeal MAG. MAGs with ≥ 50% completeness and ≤ 10% contamination were classified as medium-quality (MQ), and those with ≥ 90% completeness and ≤ 5% contamination were classified as high-quality (HQ). Predicted tRNAs and rRNAs were annotated using Aragorn and Barrnap within Prokka (v1.14.6) [62], with the archaeal kingdom modifier. Multiple archaeal MAGs produced from a single metagenome were dereplicated using dRep (v2.4.0) [69].
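The quality tiers above amount to a simple two-threshold rule on the CheckM completeness and contamination estimates. The following minimal sketch illustrates that rule. The class and function names and the example bins are hypothetical and are not taken from the study, and the 10% limit for the MQ tier is read as a contamination threshold, matching the criterion used for the marsupial MAGs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinQuality:
    name: str
    completeness: float   # percent, e.g. as estimated by CheckM
    contamination: float  # percent, e.g. as estimated by CheckM


def classify_mag(q: BinQuality) -> str:
    """Return 'HQ', 'MQ', or 'discard' using the thresholds given in the text."""
    if q.completeness >= 90 and q.contamination <= 5:
        return "HQ"
    if q.completeness >= 50 and q.contamination <= 10:
        return "MQ"
    return "discard"


# Hypothetical bins for illustration only; these are not values from the study.
bins = [
    BinQuality("bin_a", 97.4, 1.2),
    BinQuality("bin_b", 72.0, 6.5),
    BinQuality("bin_c", 45.0, 2.0),
    BinQuality("bin_d", 93.0, 8.0),
]
tiers = {b.name: classify_mag(b) for b in bins}
print(tiers)
```

Note that a bin such as the hypothetical `bin_d`, which is highly complete but exceeds 5% contamination, falls into the MQ tier rather than the HQ tier.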

Phylogenetic analysis and average nucleotide identity of Methanocorpusculum genomes

Recovered MAGs, isolate genomes, and Methanocorpusculum genomes downloaded from NCBI were taxonomically assigned, and concatenated archaeal marker gene files were produced, using GTDB-Tk (v1.6.0) [29]. Phylogeny was inferred using FastTree (v2.1.10) [30] and visualised using iTOL (https://itol.embl.de/). The average nucleotide identity (ANI) was determined using FastANI (v1.1) [70]. Average amino acid identity (AAI) was determined using EzAAI [71]. AAI was displayed as a heatmap using pheatmap (v1.0.12) in RStudio (v2022.02.0-443).

Comparative analysis of Methanocorpusculum isolate genomes and MAGs

The HQ Methanocorpusculum MAGs and isolate genomes were included in the comparative genomic analyses. PCA plots of gene orthologs and KEGG Orthology variance were generated using EnrichM, with ‘--orthologs’ and ‘--ko’ annotation and subsequent enrichment functions (v0.4.15; https://github.com/geronimp/enrichM). Statistical analyses between the host groups were performed in EnrichM by Fisher’s exact test and Mann–Whitney U test. Corrected p-values of < 0.05 were considered significant. The percentage of genomes within host groups containing annotations was visualised using the pheatmap (v1.0.12) package in RStudio (v2022.02.0-443).
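To illustrate the presence/absence comparison between host groups, the sketch below implements a two-sided Fisher's exact test on a 2 × 2 table and then corrects the resulting p-values. It is not EnrichM's code. The Benjamini-Hochberg procedure is used here as an assumption, because the text does not name the correction method, and the gene names and counts are hypothetical. The Mann–Whitney U test is not shown.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are annotation present / absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x 'present' genomes in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (as improbable) as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed correction method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: (present, absent) in 8 host-associated genomes,
# then (present, absent) in 4 environmental genomes.
tables = {
    "gene_x": (8, 0, 0, 4),
    "gene_y": (4, 4, 2, 2),
    "gene_z": (7, 1, 1, 3),
}
names = list(tables)
p_values = [fisher_exact_two_sided(*tables[n]) for n in names]
adjusted = benjamini_hochberg(p_values)
significant = [n for n, q in zip(names, adjusted) if q < 0.05]
print(significant)
```

With these made-up counts, only the annotation that is present in every host-associated genome and absent from every environmental genome remains below the 0.05 threshold after correction.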

Methanocorpusculum growth kinetics and substrate utilisation

Growth curves for M. petauri MG and M. vombati CW153 were generated in BRN-RF10 medium [58], prepared without the addition of sodium acetate and sodium formate and sparged using N2 gas. Ten-millilitre aliquots were prepared in Balch tubes and each pressurised with CO2, H2, or CO2/H2 (20:80) to 150 kPa. Cultures with CO2/H2 were used as positive controls, and cultures with CO2 or H2 alone were used as negative controls. Substrate test cultures contained either CO2 or H2, along with 1% (v/v) supplementation of N2-sparged and filter-sterilised substrates. Test substrates included methanol (Sigma-Aldrich; 179337), ethanol (Sigma-Aldrich; E7023), 1-propanol (Sigma-Aldrich; 402893), 2-propanol (Sigma-Aldrich; I9516), 1-butanol (Sigma-Aldrich; 360465), 2-butanol (Sigma-Aldrich; 19440), iso-butanol (Sigma-Aldrich; 320048), tert-butanol (Sigma-Aldrich; 360538), 1-pentanol (Sigma-Aldrich; 76929), 2-pentanol (Sigma-Aldrich; P8017), cyclopentanol (Sigma-Aldrich; C112208), cyclohexanol (Sigma-Aldrich; 105899), 2,3-butanediol (Sigma-Aldrich; B84904), or glycerol (Chem Supply; GA010). Sodium acetate (Sigma-Aldrich; S2289), sodium formate (Sigma-Aldrich; 798630), and methylamine (Sigma-Aldrich; M0505) were also included as 2 M solutions prepared in N2-sparged Milli-Q H2O. Parent cultures were grown to the mid-exponential phase (0.2 OD600). Two hundred microlitres of parent culture was aseptically inoculated into each tube in triplicate, and the tubes were incubated horizontally at 37 °C with 100 rpm rotational agitation. Growth was measured by OD600 at two-hourly intervals for ~ 36 h and then every ~ 24 h thereafter until ~ 500 h. Growth curves were visualised using GraphPad Prism 9.
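The text reports only that the OD600 curves were plotted in GraphPad Prism and does not describe any rate calculation. As an illustration of how such time series are commonly summarised, the sketch below estimates a specific growth rate as the least-squares slope of ln(OD600) against time over the exponential phase, and from it a doubling time. The function name and the readings are hypothetical. The readings are generated from an idealised culture that doubles every 8 h, so they are not measurements from the study.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in units of per hour.

    Only meaningful when the readings are restricted to the exponential phase.
    """
    logs = [math.log(v) for v in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx


# Hypothetical two-hourly readings from an ideal culture doubling every 8 h.
times = [0, 2, 4, 6, 8]
ods = [0.02 * 2 ** (t / 8) for t in times]

mu = specific_growth_rate(times, ods)
doubling_time_h = math.log(2) / mu
print(f"mu = {mu:.4f} per h, doubling time = {doubling_time_h:.1f} h")
```

For triplicate tubes, one option would be to fit each replicate separately and report the mean and spread of the fitted rates, rather than fitting the averaged curve.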