Introduction

Legumes establish symbiotic associations with a group of soil bacteria, rhizobia, able to fix atmospheric nitrogen (N2). Rhizobia elicit the formation of a symbiotic organ called a nodule comprising differentiated plant and bacterial cells. Differentiated rhizobia within nodules are termed bacteroids, and acquire the ability to fix nitrogen. Rhizobia are phylogenetically diverse including genera from the Alphaproteobacteria (Allorhizobium, Azorhizobium, Bradyrhizobium, Ensifer, Mesorhizobium, Rhizobium, etc.) as well as from the Betaproteobacteria (Burkholderia, Cupriavidus) [1, 2].

The biological nitrogen fixation process significantly contributes to the development of sustainable agriculture reducing the use of supplies dependent on fuel and alleviating environmental impacts produced by the addition of chemical fertilizer [3]. Moreover, forestation with leguminous trees associated with rhizobia, “nitrogen-fixing trees”, has been successfully used for recovering degraded soils [4].

Parapiptadenia rigida (Benth.) Brenan, is a “nitrogen-fixing tree” belonging to the Piptadenia group from the Mimosoideae subfamily [5]. It is a multipurpose tree, very appreciated because of its timber and therefore used in high quality furniture and construction. It is also used for gums, tannins and essential oil extraction, has medicinal properties and is included in agroforestry and reforestation programs [4, 6, 7]. Taulé et al.[8] demonstrated that this species could be nodulated either by Alpha-rhizobia (Rhizobium) or by Beta-rhizobia (Burkholderia and Cupriavidus) with Burkholderia being the preferred natural symbiont of this legume. In the case of Cupriavidus sp. UYPR2.512, this strain was isolated from a nodule of a P. rigida plant grown in soils collected from Mandiyú native forest in Artigas, Uruguay. Isolated bacterial colonies of Cupriavidus sp. UYPR2.512 were able to nodulate and to promote the growth of P. rigida, as well as Mimosa pudica plants [8].

To our knowledge, the only published sequenced genome of a Beta-rhizobia belonging to the Cupriavidus genus so far is that of C. taiwanensis LMG 19424T[9]. Interestingly, the closest relative of Cupriavidus sp. UYPR2.512 is C. necator ATCC 43291T[8]. Here, we present the description of the Cupriavidus sp. UYPR2.512 high-quality permanent draft genome sequence and its annotation.

Organism information

Classification and features

Cupriavidus sp. strain UYPR2.512 is a motile, Gram-negative, non-spore-forming rod (Figure 1 Left, Center) in the order Burkholderiales of the class Betaproteobacteria. The rod-shaped form varies in size with dimensions of 0.5-0.7 μm in width and 0.9-1.2 μm in length (Figure 1 Left). It is fast growing, forming 0.5-0.8 mm diameter colonies after 24 h when grown on TY [10] at 28°C. Colonies on TY are white-opaque, slightly domed, moderately mucoid with smooth margins (Figure 1 Right).

Figure 1
figure 1

Images of Cupriavidus sp. strain UYPR2.512 using scanning (Left) and transmission (Center) electron microscopy and the appearance of colony morphology on solid media (Right).

Figure 2 shows the phylogenetic relationship of Cupriavidus sp. strain UYPR2.512 in a 16S rRNA gene sequence based tree. This strain is the most similar to Cupriavidus necator ATCC 43291T, Cupriavidus oxalaticus DSM 1105T and Cupriavidus taiwanensis LMG 19424T based on the 16S rRNA gene alignment with sequence identities of 99.32%, 98.49% and 98.42%, respectively, as determined using the EzTaxon-e server [11]. Cupriavidus necator ATCC 43291T has been isolated from soil and is a non-obligate predator causing lysis of various Gram-positive and Gram-negative bacteria in the soil [12]. Cupriavidus taiwanensis LMG 19424T is a plant symbiont and was isolated from root nodules of Mimosa pudica collected from three fields at Ping-Tung Country in the southern part of Taiwan [1]. Minimum Information about the Genome Sequence (MIGS) is provided in Table 1 and Additional file 1: Table S1.

Figure 2
figure 2

Phylogenetic tree highlighting the position of Cupriavidus sp. strain UYPR2.512 (shown in blue print) relative to other type and non-type strains in the Cupriavidus genus using a 1,034 bp internal region of the 16S rRNA gene. Several Alpha-rhizobia sequences were used as an outgroup. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [13]. The tree was built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Strains with a genome sequencing project registered in GOLD [14] are shown in bold and have the GOLD ID mentioned after the strain number, otherwise the NCBI accession number has been provided. Finished genomes are designated with an asterisk.

Table 1 Classification and general features of Cupriavidus sp. strain UYPR2.512 in accordance with the MIGS recommendations [15] published by the Genome Standards Consortium [16]

Symbiotaxonomy

Cupriavidus sp. strain UYPR2.512 was isolated from Parapiptadenia rigida, a Mimosoideae legume native to Uruguay [8]. This tree is native to South America, including south Brazil, Argentina, Paraguay, and Uruguay, and used by locals for timber and as a source of gums, tannins and essential oils [8]. Cupriavidus sp. strain UYPR2.512 is able to renodulate its original host and is highly efficient in fixing nitrogen with this host [8]. A selection of other host plants, including Trifolium repens, Medicago sativa, Peltophorum dubium and Mimosa pudica were tested for their ability to nodulate with UYPR2.512. Of these plants, strain UYPR2.512 was only able to nodulate and fix nitrogen effectively with M. pudica[8].

Genome sequencing information

Genome project history

This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Genomic Encyclopedia of Bacteria and Archaea, The Root Nodulating Bacteria chapter (GEBA-RNB) project at the U.S. Department of Energy, Joint Genome Institute [25]. The genome project is deposited in the Genomes OnLine Database [14] and the high-quality permanent draft genome sequence in IMG [26]. Sequencing, finishing and annotation were performed by the JGI using state of the art sequencing technology [27]. A summary of the project information is shown in Table 2.

Table 2 Genome sequencing project information for Cupriavidus sp. strain UYPR2.512

Growth conditions and DNA isolation

Cupriavidus sp. strain UYPR2.512 was grown to mid logarithmic phase in TY rich media [10] on a gyratory shaker at 28°C. DNA was isolated from 60 mL of cells using a CTAB (Cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [29].

Genome sequencing and assembly

The draft genome of Cupriavidus sp. UYPR2.512 was generated at the DOE Joint Genome Institute [27]. An Illumina Std shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform which generated 29,312,424 reads totaling 4,396.9 Mbp [30]. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI web site [31]. All raw Illumina sequence data was passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts (Mingkun L, Copeland A, Han J. unpublished). Artifact filtered sequence data was then screened and trimmed according to the k–mers present in the dataset. High–depth k–mers, presumably derived from MDA amplification bias, cause problems in the assembly, especially if the k–mer depth varies in orders of magnitude for different regions of the genome. Reads with high k–mer coverage (>30x average k–mer depth) were normalized to an average depth of 30x. Reads with an average kmer depth of less than 2x were removed. Following steps were then performed for assembly: (1) normalized Illumina reads were assembled using Velvet version 1.1.04 [32] (2) 1–3 Kbp simulated paired end reads were created from Velvet contigs using wgsim [33] (3) normalized Illumina reads were assembled with simulated read pairs using Allpaths–LG (version r41043)[34]. Parameters for assembly steps were: 1) Velvet (velveth: 63 –shortPaired and velvetg: -very clean yes –exportFiltered yes –min contig lgth 500 –scaffolding no –cov cutoff 10) 2) wgsim (-e 0 –1 100 –2 100 –r 0 –R 0 –X 0) 3) Allpaths–LG (PrepareAllpathsInputs: PHRED 64 = 1 PLOIDY = 1 FRAG COVERAGE = 125 JUMP COVERAGE = 25 LONG JUMP COV = 50, RunAllpathsLG: THREADS = 8 RUN = std_shredpairs TARGETS = standard VAPI_WARN_ONLY = True OVERWRITE = True). The final draft assembly contained 369 contigs in 365 scaffolds. The total size of the genome is 7.9 Mbp and the final assembly is based on 839.6 Mbp of Illumina data, which provides an average of 106.8x coverage.

Genome annotation

Genes were identified using Prodigal [35], as part of the DOE-JGI genome annotation pipeline [36, 37] followed by a round of manual curation using GenePRIMP [38] for finished genomes and Draft genomes in fewer than 10 scaffolds. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAScanSE tool [39] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [40]. Other non–coding RNAs such as the RNA components of the protein secretion complex and the RNase P were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [41]. Additional gene prediction analysis and manual functional annotation was performed within the Integrated Microbial Genomes-Expert Review (IMG-ER) system [42] developed by the Joint Genome Institute, Walnut Creek, CA, USA.

Genome properties

The genome is 7,858,949 nucleotides with 65.25% GC content (Table 3) and comprised of 365 scaffolds and 369 contigs (Figure 3). From a total of 7,487 genes, 7,411 were protein encoding and 76 RNA only encoding genes. The majority of genes (75.64%) were assigned a putative function whilst the remaining genes were annotated as hypothetical. The distribution of genes into COG functional categories is presented in Table 4.

Table 3 Genome statistics for Cupriavidus sp. strain UYPR2.512
Figure 3
figure 3

Graphical map of the four largest scaffolds of the genome of Cupriavidus sp. strain UYPR2.512. From the bottom to the top of each scaffold: Genes on forward strand (color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Table 4 Number of protein coding genes of Cupriavidus sp. strain UYPR2.512 associated with the general COG functional categories

Conclusion

Cupriavidus sp. UYPR2.512 belongs to a group of Beta-rhizobia isolated from Parapiptadenia rigida, a native tree from Uruguay belonging to the Mimosoideae legume group [8]. This tree is also native to the south of Brazil, Argentina and Paraguay [8]. Greenhouse experiments from previous studies have shown that Cupriavidus sp. UYPR2.512 is also able to nodulate and fix nitrogen with Mimosa pudica, an invasive species in many regions around the world [8]. Phylogenetic analysis revealed that UYPR2.512 is the most closely related to Cupriavidus necator ATCC 43291T, Cupriavidus oxalaticus DSM 1105T and Cupriavidus taiwanensis LMG 19424T . In contrast to the other two strains, Cupriavidus taiwanensis LMG 19424T is a microsymbiont that is able to nodulate and fix nitrogen in association with Mimosa species [43]. In total five Cupriavidus strains (AMP6, LMG 19424T, STM6018, STM6070 and UYPR2.512), which can form a symbiotic association have now been sequenced. A comparison of these strains reveals that UYPR2.512 has the largest genome (7.9 Mbp), with the highest KOG count (1398), the lowest G + C (65.25%) and signal peptide (9.3%) percentages in this group. All of these genomes share the nitrogenase-RXN MetaCyc pathway catalyzed by a multiprotein nitrogenase complex. Out of five Cupriavidus strains (AMP6, LMG 19424T, STM6018, STM6070 and UYPR2.512), which contain the N-fixation pathway, only Cupriavidus sp. UYPR2.512 has been shown to nodulate and fix effectively with Parapiptadenia rigida. The genome attributes of Cupriavidus sp. UYPR2.512 will therefore be important for ongoing molecular analysis of the plant microbe interactions required for the establishment of leguminous tree symbioses with this host.