Introduction

Nodulated legumes are important and established components of Australian agricultural systems: the atmospheric nitrogen (N2) fixed by rhizobia in symbiotic association with these legumes is estimated to be worth more than $2 billion annually [1, 2]. The major agricultural region of south-western Australia has a Mediterranean climate, with soils that are often acid, have a low clay content and low organic matter, and tend to be inherently infertile [3, 4]. The last forty years, however, have seen a sharp decrease in average winter rainfall of about 15–20% [5]. This, together with the development of dryland salinity [6], has challenged the sustainability of using the commonly sown subterranean clover and annual medics as pasture legumes in these systems. Alternative perennial legume species (and their associated rhizobia) are therefore being sought [2]. We have identified a suite of South African perennial, herbaceous forage legumes, including several species in the crotalarioid genus Listia (previously Lotononis) [7], that are potentially well-adapted to the arid climate and acid, infertile soils of the target agricultural areas.

Listia species are found in seasonally wet habitats throughout southern and tropical Africa [8]. They produce stoloniferous roots [8, 9] and form lupinoid nodules rather than the indeterminate type found in other crotalarioid species [7, 10]. Rhizobial infection occurs by epidermal entry rather than via root hair curling [7]. Listia-rhizobia symbioses are highly specific. The tropically distributed L. angolensis forms effective (i.e. N2-fixing) nodules with newly described species of Microvirga [11], while all other studied Listia species are only nodulated by strains of pigmented methylobacteria [7, 10, 12]. Unlike the methylotrophic Methylobacterium nodulans, which specifically nodulates some species of Crotalaria [13], the Listia methylobacteria are unable to utilize methanol as a sole carbon source [14]. In Australia, strains of pigmented methylobacteria have been used as commercial inoculants for Listia bainesii and are able to persist in acidic, sandy, infertile soils, while remaining symbiotically and serologically stable [10, 15].

A pigmented Methylobacterium strain, WSM2598, isolated from a root nodule of L. bainesii cv. “Miles” in South Africa in 2002, was found to be a highly effective nitrogen-fixing microsymbiont of both L. bainesii and Listia heterophylla (previously Lotononis listii) [10]. Here we present a set of preliminary classification and general features for Methylobacterium sp. strain WSM2598, together with the description of the genome sequence and annotation.

Organism information

Methylobacterium sp. strain WSM2598 is a motile, non-sporulating, non-encapsulated, Gram-negative rod with one to several flagella. It is a member of the family Methylobacteriaceae in the class Alphaproteobacteria. The rod-shaped form varies in size, with dimensions of approximately 0.5 μm in width and 1.0–1.5 μm in length (Figure 1, Left and Center). WSM2598 is medium to slow growing, forming colonies 0.5–1.5 mm in diameter within 6–7 days at 28°C. WSM2598 is pigmented, an unusual property for rhizobia. When grown on half strength Lupin Agar (½LA) [10], WSM2598 forms dark pink pigmented, opaque, slightly domed colonies with smooth margins (Figure 1, Right).

Figure 1

Images of Methylobacterium sp. strain WSM2598 using scanning (Left) and transmission (Center) electron microscopy as well as light microscopy to visualize colony morphology on solid ½LA [10] (Right).

WSM2598 alkalinizes ½LA containing universal indicator (BDH Laboratory Supplies). WSM2598 cultured in minimal medium [16] is unable to utilize arabinose, galactose, glucose, mannitol, methanol, methylamine or formaldehyde as sole carbon sources, but grows poorly on formate and well on succinate and glutamate [14]. Minimum Information about the Genome Sequence (MIGS) is provided in Table 1 and Additional file 1: Table S1.

Table 1 Classification and general features of Methylobacterium sp. strain WSM2598 according to the MIGS recommendations [17, 18]

Figure 2 shows the phylogenetic neighborhood of Methylobacterium sp. WSM2598 in a 16S rRNA gene sequence-based tree. The 16S rRNA gene sequence of WSM2598 has 99% (1,358/1,364 bp) and 98% (1,334/1,365 bp) sequence identity to the 16S rRNA gene sequences of the fully sequenced strains Methylobacterium sp. 4-46 (Gc00857) and M. nodulans ORS2060 (Gc00935), respectively.
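The identity percentages above follow directly from the reported match counts. The short Python sketch below reproduces that arithmetic; it assumes that identity is defined as matching positions divided by aligned positions, and the function name percent_identity is illustrative rather than part of any tool used in this study.

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Return percent identity: matching positions over aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return 100.0 * matches / aligned_length


# Match counts and alignment lengths as reported in the text.
comparisons = {
    "Methylobacterium sp. 4-46": (1358, 1364),
    "M. nodulans ORS2060": (1334, 1365),
}

for strain, (matches, length) in comparisons.items():
    print(f"{strain}: {percent_identity(matches, length):.2f}% identity")
```

Under this definition the two comparisons work out to roughly 99.6% and 97.7%, in line with the 99% and 98% quoted above.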

Figure 2

Phylogenetic tree showing the relationships of Methylobacterium sp. WSM2598 (shown in blue print) with some of the root nodule bacteria in the order Rhizobiales based on aligned sequences of the 16S rRNA gene (1,340 bp internal region). All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5 [28]. The tree was built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis [29] with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Brackets after the strain name contain an accession number. Strains with a genome sequencing project registered in GOLD [30] are in bold print and the GOLD ID is mentioned after the accession number. Published genomes are designated with an asterisk.

Symbiotaxonomy

Methylobacterium sp. WSM2598 forms nodules on (Nod+), and fixes N2 (Fix+) with, southern African species of Listia. On Listia angolensis, some species of the crotalarioid genus Leobordea and the promiscuous legume Macroptilium atropurpureum, WSM2598 forms white, ineffective (Fix-) nodules. It does not form nodules on other tested legumes [7] (Table 2).

Table 2 Compatibility of Methylobacterium sp. WSM2598 with 11 host legume genotypes for nodulation (Nod) and N2-fixation (Fix)

Genome sequencing and annotation information

Genome project history

This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Community Sequencing Program at the U.S. Department of Energy, Joint Genome Institute (JGI) for projects of relevance to agency missions. The genome project is deposited in the Genomes OnLine Database [30] and an improved-high-quality-draft genome sequence is deposited in IMG. Sequencing, finishing and annotation were performed by the JGI. A summary of the project information is shown in Table 3.

Table 3 Genome sequencing project information for Methylobacterium sp. WSM2598

Growth conditions and DNA isolation

Methylobacterium sp. WSM2598 was grown to mid-logarithmic phase in TY rich medium on a gyratory shaker at 28°C [32]. DNA was isolated from 60 mL of cells using a CTAB (cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [33].

Genome sequencing and assembly

The draft genome of Methylobacterium sp. WSM2598 was generated at the DOE Joint Genome Institute (JGI) using Illumina technology [34, 35]. For this genome, we constructed and sequenced an Illumina short-insert paired-end library with an average insert size of 270 bp, which generated 19,048,548 reads, and an Illumina long-insert paired-end library with an average insert size of 6,354.14 ± 3,100.07 bp, which generated 18,876,864 reads, totaling 5,689 Mbp of Illumina data (unpublished, Feng Chen). All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website. The initial draft assembly contained 141 contigs in 41 scaffolds. The initial draft data were assembled with Allpaths, version 39750, and the consensus was computationally shredded into 10 Kbp overlapping fake reads (shreds). The Illumina draft data were also assembled with Velvet, version 1.1.05 [36], and the consensus sequences were computationally shredded into 1.5 Kbp overlapping fake reads (shreds). The Illumina draft data were then assembled again with Velvet, using the shreds from the first Velvet assembly to guide the next assembly. The consensus from the second Velvet assembly was shredded into 1.5 Kbp overlapping fake reads. The fake reads from the Allpaths assembly and both Velvet assemblies, together with a subset of the Illumina CLIP paired-end reads, were assembled using parallel phrap, version 4.24 (High Performance Software, LLC). Possible mis-assemblies were corrected with manual editing in Consed [37–39]. Gap closure was accomplished using repeat resolution software (Wei Gu, unpublished) and sequencing of bridging PCR fragments with Sanger and/or PacBio (unpublished, Cliff Han) technologies. One round of manual/wet lab finishing was also completed. Seventeen PCR PacBio consensus sequences were completed to close gaps and to raise the quality of the final sequence. The estimated total size of the genome is 8.3 Mbp, and the final assembly is based on 5,689 Mbp of Illumina draft data, which provides an average 685× coverage of the genome.
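Two steps in this workflow are simple enough to illustrate in a few lines of Python: shredding a consensus sequence into overlapping fake reads, and deriving average coverage from the sequencing yield. The sketch below is illustrative only and is not the JGI pipeline. The function names are hypothetical, and the overlap between adjacent shreds is an assumed parameter, since the text states the shred lengths (10 Kbp and 1.5 Kbp) but not the overlap.

```python
def shred(consensus: str, shred_len: int, overlap: int) -> list[str]:
    """Cut a consensus sequence into overlapping fixed-length fake reads.

    The overlap value is an assumption for illustration; it is not
    specified in the text.
    """
    if shred_len <= 0 or not 0 <= overlap < shred_len:
        raise ValueError("require shred_len > 0 and 0 <= overlap < shred_len")
    step = shred_len - overlap
    shreds = []
    for start in range(0, len(consensus), step):
        shreds.append(consensus[start:start + shred_len])
        # Stop once a shred has reached the end of the consensus.
        if start + shred_len >= len(consensus):
            break
    return shreds


def average_coverage(total_data_mbp: float, genome_size_mbp: float) -> float:
    """Average coverage: total sequenced bases divided by genome size."""
    if genome_size_mbp <= 0:
        raise ValueError("genome_size_mbp must be positive")
    return total_data_mbp / genome_size_mbp


# Toy consensus, shredded into 4-base reads overlapping by 2 bases.
print(shred("ABCDEFGHIJ", shred_len=4, overlap=2))

# Figures from the text: 5,689 Mbp of Illumina data, 8.3 Mbp estimated size.
print(f"{average_coverage(5689, 8.3):.0f}x")
```

With the values reported above, the coverage calculation gives approximately 685×, matching the stated figure.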

Genome annotation

Genes were identified using Prodigal [40] as part of the DOE-JGI Annotation pipeline [41], followed by a round of manual curation using the JGI GenePRIMP pipeline [42]. Within the Integrated Microbial Genomes (IMG-ER) system [43], predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE [44], RNAmmer [45], Rfam [46], TMHMM [47], and SignalP [48]. Additional gene prediction analyses and functional annotation were performed within IMG.

Genome properties

The genome is 7,669,765 nucleotides long with 71.17% GC content (Table 4) and comprises 5 scaffolds (Figure 3) of 83 contigs. Of a total of 7,349 genes, 7,236 were protein-encoding and 18 were RNA-only-encoding genes. The majority of genes (71.22%) were assigned a putative function, whilst the remaining genes were annotated as hypothetical. The distribution of genes into COGs functional categories is presented in Table 5.
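For readers who want to recompute summary statistics of this kind from the scaffold sequences, the following Python sketch shows one common way to calculate GC content and a simple percentage of a gene total. It is a minimal illustration, not the IMG implementation: the function names and the toy sequences are hypothetical, and the choice to ignore ambiguous bases such as N is an assumption.

```python
def gc_content(sequence: str) -> float:
    """Return percent G+C over unambiguous bases (A, C, G, T).

    Ambiguous bases such as N are ignored; this is an assumption made
    for the sketch.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def percent_of_total(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Hypothetical toy scaffolds standing in for the real sequences.
toy_scaffolds = ["GGCCGCAT", "GCGNNNCGTA"]
print(f"Pooled GC content: {gc_content(''.join(toy_scaffolds)):.2f}%")

# Gene counts from the text: 7,236 protein-encoding genes out of 7,349.
print(f"Protein-encoding: {percent_of_total(7236, 7349):.2f}% of genes")
```

Pooling the scaffolds before counting weights each scaffold by its length, which is the usual convention for a genome-wide GC figure.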

Table 4 Genome statistics for Methylobacterium sp. WSM2598

Figure 3

Graphical map of the 5 scaffolds assembled for the genome of Methylobacterium sp. WSM2598. From top to bottom, the scaffolds are: WSM2598: MET2598DRAFT_scaffold1.1, WSM2598: MET2598DRAFT_scaffold2.2, WSM2598: MET2598DRAFT_scaffold3.3, WSM2598: MET2598DRAFT_scaffold4.4, and WSM2598: MET2598DRAFT_scaffold5.5. From the bottom to the top of each scaffold: Genes on forward strand (color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Table 5 Number of protein coding genes of Methylobacterium sp. WSM2598 associated with the general COG functional categories

Conclusion

WSM2598 was sequenced as part of the DOE Joint Genome Institute GEBA-RNB project. In common with other sequenced rhizobial strains, WSM2598 has a comparatively large genome of around 7.67 Mbp, with a high proportion of genes assigned to the COG functional categories associated with transcription control and signal transduction (14.69%), transport and metabolism (29.38%) and secondary metabolite biosynthesis (3.12%). These features are characteristic of soil bacteria, which inhabit oligotrophic environments with typically diverse but scarce nutrient sources. Rhizobial methylobacteria are unusual, however, in that they form symbiotic associations exclusively with African crotalarioid legume hosts, several species of which are well-adapted to arid climates and acid, infertile soils and are therefore potentially useful pasture plants in marginal agricultural systems. The molecular basis for this symbiotic specificity has yet to be determined. As WSM2598 is highly effective for N2-fixation on several of these hosts, its sequenced genome is a valuable resource for gaining an understanding of symbiotic specificity and N2-fixation in a currently understudied group of legumes and rhizobia.