Introduction

Bradyrhizobium sp. strain Ai1a.2 is a representative of a distinctive lineage affiliated with the Bradyrhizobium elkanii superclade [1]. The B. elkanii superclade is one of the three main branches of the genus, together with the B. japonicum / B. diazoefficiens superclade [2,3], and the group encompassing photosynthetic Aeschynomene symbionts [4].

Members of the lineage represented by strain Ai1a.2 are readily diagnosed because they share a distinctive length variant in helix 9 within the 5′ intervening sequence region of the 23S rRNA gene [5]. Strain Ai1a.2 and its relatives have an insertion of 16 nucleotides in this region in comparison to B. elkanii USDA76, which can be identified by a straightforward PCR assay [6]. In a survey of 420 Bradyrhizobium strains from 25 countries [1], only 2% of the strains had this 23S rRNA length variant. These strains all clustered together into a strongly supported clade based on concatenated data for 23S rRNA and five protein-coding genes [1].

This clade was placed as the most basally diverging lineage within the B. elkanii superclade, and it included strains from three locations: Central America, the Caribbean, and South Africa. Strain Ai1a.2 was sampled in Costa Rica from the tree Andira inermis [6], and highly similar strains are also known to occur as symbionts of the same host legume in Panama [7]. Parker and Rousteau [8] also detected strains from this group in nodule samples from the beach legume Canavalia rosea in two Caribbean locations (Guadeloupe and Puerto Rico). Two Bradyrhizobium strains from distantly related legume hosts (Leobordea spp.) in South Africa (WSM2632, WSM2783) also belong to this clade [9].

Andira inermis, the host of strain Ai1a.2, is a large tree (up to 35 m in height) commonly found in riparian habitats from southern Mexico through northern South America [10]. Andira was traditionally considered to be an early-diverging lineage within the Tribe Dalbergieae [11], but more recent phylogenetic analyses have suggested that it forms a separate lineage with an unclear relationship to dalbergioid legumes [12]. Here we provide an analysis of the high-quality permanent draft genome sequence of Bradyrhizobium strain Ai1a.2. The fact that the genome of its close relative WSM2783 has also been sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project [13] will enable detailed comparative analysis of this group.

Organism information

Classification and features

Bradyrhizobium sp. Ai1a-2 is a motile, non-sporulating, non-encapsulated, Gram-negative strain in the order Rhizobiales of the class Alphaproteobacteria. The rod-shaped form has dimensions of approximately 0.5 μm in width and 1.5-2.0 μm in length (Figure 1 Left and Center). It is relatively slow growing, forming colonies after 6–7 days when grown on half-strength Lupin Agar (½LA) [14], tryptone-yeast extract agar (TY) [15] or a modified yeast-mannitol agar (YMA) [16] at 28°C. Colonies on ½LA are opaque, slightly domed and moderately mucoid with smooth margins (Figure 1 Right).

Figure 1

Images of Bradyrhizobium sp. Ai1a-2 using scanning (Left) and transmission (Center) electron microscopy as well as light microscopy to visualize colony morphology on solid media (Right).

Figure 2 shows the phylogenetic relationship of Bradyrhizobium sp. Ai1a-2 in a 16S rRNA gene sequence based tree. The 16S rRNA gene sequence of Ai1a-2 (using a 1,310 bp intragenic sequence) is identical to that of Bradyrhizobium sp. WSM2783. Bradyrhizobium sp. Ai1a-2 is also closely related to Bradyrhizobium sp. Cp5.3 and Bradyrhizobium sp. Th.b2, with 16S rRNA gene sequence identities of 99.77% and 99.23%, respectively, as determined using NCBI BLAST analysis [17]. The highest identity (99.16%) of the 16S rRNA gene sequence of strain Ai1a-2 to type strain sequences occurs with Bradyrhizobium icense LMTR 13T and Bradyrhizobium paxllaeri LMTR 21T, based on alignment using the EzTaxon-e server [18,19].
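The identity values above come from NCBI BLAST and the EzTaxon-e server, each of which applies its own alignment and gap handling. As a rough illustration only, the sketch below computes percent identity for two sequences that are assumed to be already aligned. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from either tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumes both inputs are rows of the same pairwise alignment, with '-' as
    the gap character. This is a simplification and not the BLAST definition.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap-containing columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: one mismatch in four ungapped columns.
toy = percent_identity("ACGT", "ACGA")

# Synthetic example: 3 mismatches across 1,310 ungapped columns.
synthetic = percent_identity("A" * 1310, "A" * 1307 + "C" * 3)
print(f"toy = {toy:.2f}%, synthetic = {synthetic:.2f}%")
```

At this alignment length each mismatch changes the identity by less than 0.1 percentage points, so the reported differences between close relatives correspond to only a handful of nucleotide positions.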

Figure 2

Phylogenetic tree showing the relationship of Bradyrhizobium sp. Ai1a-2 (shown in blue print) relative to other type and non-type strains in the Bradyrhizobium genus using a 1,310 bp intragenic sequence of the 16S rRNA gene. The Azorhizobium caulinodans ORS 571T sequence was used as an outgroup. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [37]. The tree was built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Strains with a genome sequencing project registered in GOLD [20] are shown in bold with the GOLD ID after the strain number; otherwise the NCBI accession number is provided.

Minimum Information about the Genome Sequence (MIGS) is provided in Table 1 and Additional file 1: Table S1.

Table 1 Classification and general features of Bradyrhizobium sp. Ai1a-2 in accordance with the MIGS recommendations [38] published by the Genome Standards Consortium [39]

Symbiotaxonomy

Strain Ai1a.2 was isolated from the tree Andira inermis in Costa Rica [6]. Its symbiotic ability could not be authenticated on this host because seeds could not be obtained. The symbiotic capability of strain Ai1a.2 was instead tested on Macroptilium atropurpureum, which it was able to nodulate. Acetylene reduction assays showed that established nodules contained active nitrogenase, indicating an effective symbiosis with this host [6].

Genome sequencing information

Genome project history

This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Genomic Encyclopedia of Bacteria and Archaea, Root Nodulating Bacteria (GEBA-RNB) project at the U.S. Department of Energy, Joint Genome Institute (JGI). The genome project is deposited in the Genomes OnLine Database [20] and a high-quality permanent draft genome sequence is deposited in IMG [21]. Sequencing, finishing and annotation were performed by the JGI using state-of-the-art sequencing technology [22]. A summary of the project information is shown in Table 2.

Table 2 Project information

Growth conditions and genomic DNA preparation

Bradyrhizobium sp. Ai1a-2 was cultured to mid-logarithmic phase in 60 ml of TY rich medium on a gyratory shaker at 28°C [23]. DNA was isolated from the cells using a CTAB (cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [24].

Genome sequencing and assembly

The draft genome of Bradyrhizobium sp. Ai1a-2 was generated at the DOE Joint Genome Institute (JGI) using Illumina technology [25]. An Illumina standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 21,669,974 reads totaling 3,250.5 Mbp. All general aspects of library construction and sequencing were performed at the JGI and details can be found on the JGI website [26]. All raw Illumina sequence data were passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts (Mingkun L, Copeland A, Han J, Unpublished). The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet (version 1.1.04) [27], (2) 1–3 Kbp simulated paired end reads were created from Velvet contigs using wgsim [28], (3) Illumina reads were assembled with simulated read pairs using Allpaths-LG (version r42328) [29]. Parameters for the assembly steps were: (1) Velvet (velveth: 63 -shortPaired and velvetg: -very_clean yes -exportFiltered yes -min_contig_lgth 500 -scaffolding no -cov_cutoff 10), (2) wgsim (-e 0 -1 100 -2 100 -r 0 -R 0 -X 0), (3) Allpaths-LG (PrepareAllpathsInputs: PHRED_64 = 1 PLOIDY = 1 FRAG_COVERAGE = 125 JUMP_COVERAGE = 25 LONG_JUMP_COV = 50, RunAllpathsLG: THREADS = 8 RUN = std_shredpairs TARGETS = standard VAPI_WARN_ONLY = True OVERWRITE = True). The final draft assembly contained 247 contigs in 246 scaffolds. The total size of the genome is 9.0 Mbp and the final assembly is based on 1,081.2 Mbp of Illumina data, which provides an average 119.7X coverage of the genome.
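The yield and coverage figures in this section can be checked with simple arithmetic. The sketch below is an illustration rather than part of the JGI pipeline. It assumes a read length of 150 bp, which is not stated in the text but is consistent with the reported read count and yield. It also uses the exact genome size of 9,029,266 bp given under Genome properties. The function names are hypothetical.

```python
def sequencing_yield_mbp(n_reads: int, read_length_bp: int) -> float:
    """Total sequenced bases in Mbp, assuming every read has the same length."""
    return n_reads * read_length_bp / 1e6


def average_coverage(data_mbp: float, genome_mbp: float) -> float:
    """Average fold coverage: sequenced bases used divided by genome size."""
    return data_mbp / genome_mbp


# Values reported in the text; the 150 bp read length is an assumption.
raw_yield = sequencing_yield_mbp(21_669_974, 150)
coverage = average_coverage(1_081.2, 9_029_266 / 1e6)

print(f"raw yield: {raw_yield:.1f} Mbp")
print(f"average coverage: {coverage:.1f}X")
```

Using the rounded genome size of 9.0 Mbp instead of the exact value would give about 120.1X, so the reported 119.7X appears to be based on the unrounded assembly length.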

Genome annotation

Genes were identified using Prodigal [30], as part of the DOE-JGI genome annotation pipeline [31,32]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAScanSE tool [33] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [34]. Other non-coding RNAs such as the RNA components of the protein secretion complex and the RNase P were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [35]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes-Expert Review (IMG-ER) system [36] developed by the Joint Genome Institute, Walnut Creek, CA, USA.

Genome properties

The genome is 9,029,266 nucleotides in length with 62.56% GC content (Table 3) and comprises 246 scaffolds. Of a total of 8,584 genes, 8,482 were protein-encoding genes and 102 were RNA-only encoding genes. The majority of genes (75.10%) were assigned a putative function, whilst the remaining genes were annotated as hypothetical. The distribution of genes into COGs functional categories is presented in Table 4.
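The summary statistics above are straightforward to derive from an assembly and its annotation. The sketch below is illustrative only. It shows one common way to compute GC content from a nucleotide string and checks the gene-count arithmetic reported here. The function name `gc_content` and the toy sequence are hypothetical, and ambiguous bases are ignored by assumption.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous A/C/G/T bases (other symbols are ignored)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Gene counts reported in the text.
total_genes = 8_584
protein_coding = 8_482
rna_genes = 102

# The two categories should account for every predicted gene.
counts_consistent = protein_coding + rna_genes == total_genes
protein_share = 100.0 * protein_coding / total_genes

print(f"GC of toy sequence: {gc_content('GGCCAATT'):.2f}%")
print(f"counts consistent: {counts_consistent}")
print(f"protein-coding share: {protein_share:.2f}%")
```

For a multi-scaffold assembly such as this one, the same function would be applied to the concatenation of all scaffold sequences, or equivalently to base counts summed across scaffolds.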

Table 3 Genome statistics for Bradyrhizobium sp. Ai1a-2
Table 4 Number of genes associated with the general COG functional categories

Conclusions

Bradyrhizobium sp. Ai1a-2 is a member of a widely distributed Bradyrhizobium lineage, isolated from diverse legume hosts in North, Central and South America and South Africa. Little is currently known of the symbiotic associations of its host Andira inermis, apart from the discovery that the Puerto Rican isolate Bradyrhizobium sp. EC3.3 can also establish a symbiosis with this host [8]. The 16S rRNA gene sequence of the Costa Rican isolate Ai1a-2 is distinct from that of EC3.3 but identical to that of the South African isolate Bradyrhizobium sp. WSM2783. Phylogenetically, Ai1a-2 is closely related to Bradyrhizobium sp. Cp5.3 and Bradyrhizobium sp. Th.b2 from Panama and the USA, respectively. The genomes of Bradyrhizobium sp. Ai1a-2 and Bradyrhizobium sp. WSM2783 were sequenced along with 23 other Bradyrhizobium genomes as part of the GEBA-RNB project. Of these 25 sequenced strains, Bradyrhizobium spp. Ai1a-2, WSM2783, Cp5.3, Th.b2 and B. elkanii USDA76T are affiliated with the Bradyrhizobium elkanii superclade. The Bradyrhizobium sp. Ai1a-2 genome has the second lowest genome size (9 Mbp), gene count (8,584) and signal peptide percentage (9.75%) among these five strains. Comparing the genome attributes of Bradyrhizobium sp. Ai1a-2 with those of other sequenced Bradyrhizobium genomes will be important for understanding the biogeography of Bradyrhizobium spp. and the interactions required for the successful establishment of effective symbioses with their diverse hosts.