Introduction

Strain OK10T (= DSM 16294 = ATCC BAA-671 = JCM 11897) is the type strain of Sulfurimonas autotrophica [1], which is the type species of its genus Sulfurimonas [1,2]. Together with S. paralvinellae and S. denitrificans, the latter of which was formerly classified as Thiomicrospira denitrificans [3], there are currently three validly named species in the genus Sulfurimonas [4,5]. Autotrophic and mixotrophic sulfur-oxidizing bacteria such as the members of the genus Sulfurimonas are believed to contribute significantly to the global sulfur cycle [6]. The genus name derives from the Latin word ‘sulphur’ and the Greek word ‘monas’, meaning a unit, in order to indicate a “sulfur-oxidizing rod” [1]. The species epithet derives from the Greek word ‘auto’, meaning self, and from the Greek adjective ‘trophicos’, meaning nursing, tending or feeding, in order to indicate its autotrophy [1]. S. autotrophica strain OK10T, like S. paralvinellae strain GO25T (= DSM 17229), was isolated from the surface of a deep-sea hydrothermal sediment on the Hatoma Knoll in the Mid-Okinawa Trough hydrothermal field [1,2]. Thus, the members of the genus Sulfurimonas appear to be free-living, whereas the other members of the family Helicobacteraceae, the genera Helicobacter and Wolinella, appear to be strictly associated with the human stomach and the bovine rumen, respectively. Here we present a summary classification and a set of features for S. autotrophica OK10T, together with the description of the complete genomic sequencing and annotation.

Classification and features

There are currently no experimental reports of further cultivated strains of this species. The type strains of S. denitrificans and S. paralvinellae share 93.5% and 96.3% 16S rRNA gene sequence similarity with strain OK10T, respectively. Further analysis also revealed that strain OK10T shares high similarity (99.1%) with the uncultured clone sequence PVB-12 (U15104) obtained from a microbial mat near a deep-sea hydrothermal vent at the Loihi Seamount, Hawaii [7]. This further corroborates the distribution of S. autotrophica in hydrothermal vents. The 16S rRNA gene sequence similarities of strain OK10T to metagenomic libraries (env_nt) were 87% or less, indicating the absence of further members of the species in the environments screened so far (status August 2010).

Figure 1 shows the phylogenetic neighborhood of S. autotrophica OK10T in a 16S rRNA based tree. The sequences of the four 16S rRNA gene copies in the genome differ from each other by up to four nucleotides, and differ by up to three nucleotides from the previously published sequence (AB088431).
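Differences of this kind can be counted directly from an alignment. The following is a minimal sketch, assuming the gene copies have already been aligned to equal length; the function names and the toy sequences are hypothetical and are not taken from the genome, and the similarity values reported above may have been computed with a different method.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count mismatched columns between two pre-aligned sequences.

    A gap ('-') opposite a base is counted as a difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy sequences for illustration only (not real 16S rRNA data).
    copy_1 = "ACGTACGT"
    copy_2 = "ACGTTCGA"
    print(count_differences(copy_1, copy_2))
    print(percent_identity(copy_1, copy_2))
```

Applied to each pair of the four gene copies, the first function would give the pairwise nucleotide differences summarized in the text.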

Figure 1.

Phylogenetic tree highlighting the position of S. autotrophica OK10T relative to the type strains of the other species within the genus and the type strains of the other genera within the order Campylobacterales. The tree was inferred from 1,327 aligned characters [8,9] of the 16S rRNA gene sequence under the maximum likelihood criterion [10] and rooted in accordance with current taxonomy [11]. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 350 bootstrap replicates [12] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [13] are shown in blue, published genomes in bold [14,15], such as the recently published GEBA genomes from Sulfurospirillum deleyianum [16] and Arcobacter nitrofigilis [17].

The cells of strain OK10T are Gram-negative, occasionally slightly curved rods of 1.5–2.5 × 0.5–1.0 µm (Figure 2 and Table 1) [1]. On solid medium, the cells form white colonies [1]. Under optimal conditions, the generation time of S. autotrophica strain OK10T is approximately 1.4 h [1,2]. The reductive tricarboxylic acid (rTCA) cycle for autotrophic CO2 fixation is present in strain OK10T, as shown by PCR amplification of the respective genes [28]. Moreover, the activities of several rTCA key enzymes (ACL, ATP-dependent citrate lyase; POR, pyruvate:acceptor oxidoreductase; OGOR, 2-oxoglutarate:acceptor oxidoreductase; ICDH, isocitrate dehydrogenase) have been determined, also in comparison to S. paralvinellae and S. denitrificans [28]. No enzyme activities for the phosphoenolpyruvate and ribulose 1,5-bisphosphate (Calvin-Benson) pathways were detected in strain OK10T [28], though the latter is apparently active in S. thermophila [28]. Soluble hydrogenase activity was also not found in strain OK10T [28]. With respect to sulfur oxidation, enzyme activity was detected for SOR (sulfite oxidoreductase) but not for APSR (adenosine 5′-phosphosulfate reductase) or TSO (thiosulfate-oxidizing enzymes) [28]. A detailed comparison of these enzyme activities to those of S. paralvinellae and S. denitrificans is given in Takai et al. [28]. Elemental sulfur, thiosulfate or sulfide is utilized as the sole electron donor for chemolithoautotrophic growth with O2 as electron acceptor. In the process, thiosulfate is oxidized to sulfate [1]. Organic substrates and H2 are not utilized as electron donors, and only oxygen is utilized as an electron acceptor [28]. Strain OK10T requires 4% sea salt for growth [1] and is not able to reduce nitrate [2].

Figure 2.

Scanning electron micrograph of S. autotrophica OK10T

Table 1. Classification and general features of S. autotrophica OK10T according to the MIGS recommendations [18]

Chemotaxonomy

The major cellular fatty acids found in strain OK10T are C14:0 (8.4%), C16:1 cis (45.2%), C16:0 (37.1%) and C18:1 trans (9.4%) [1]. Further fatty acids were not reported [1]. The only polyamine identified in S. autotrophica is spermidine [29]. Spermidine was also found in another representative of the order Campylobacterales, Sulfuricurvum kujiense. For comparison, Hydrogenimonas thermophila, the type species of the type genus of the family Hydrogenimonaceae in the order Campylobacterales, contains both spermidine and spermine as the major polyamines [29]. The cellular fatty acid composition of S. autotrophica was compared with that of other autotrophic Epsilonproteobacteria from deep-sea hydrothermal vents: Nautilia profundicola AmHT, Lebetimonas acidiphila Pd55T, Hydrogenimonas thermophila EP1-55-1%T, and Nitratiruptor tergarcus MI55-1T [30]. S. autotrophica strain OK10T was found to have a much higher level of the fatty acid C16:1 cis (45.2%) than the other Epsilonproteobacteria from hydrothermal vents (3.6%–28.8%) [2,30]. On the other hand, the percentage of C18:1 trans was lowest in S. autotrophica (9.4%), while the other Epsilonproteobacteria contained 20.0%–73.3% [30]. C14:0 (8.4%) was also more abundant in strain OK10T than in the other strains [30].

Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position [31], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [32]. The genome project is deposited in the Genome OnLine Database [13] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

S. autotrophica strain OK10T, DSM 16294, was grown in DSMZ medium 1011 (MJ medium) [33] at 24°C. DNA was isolated from 0.5–1 g of cell paste using the MasterPure Gram Positive DNA Purification Kit (Epicentre MGP04100) following the standard protocol as recommended by the manufacturer, with modification st/LALM for cell lysis as described in Wu et al. [32].

Genome sequencing and assembly

The genome was sequenced using a combination of Sanger, 454 and Illumina sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website. Illumina sequencing data was assembled with VELVET [34], and the consensus sequences were shredded into overlapping 1.5 kb fake reads and used for the assembly with 454 and Sanger data. Contigs resulting from a 454 Newbler (2.0.00.20-PostRelease-11-05-2008-gcc-3.4.6) assembly were shredded into 2 kb fake reads, which were assembled with Sanger data. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with Dupfinisher or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, custom primer walks or PCR amplification (Roche Applied Science, Indianapolis, IN) [35]. A total of 790 additional custom primer reactions were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to improve the final consensus quality using an in-house developed tool, the Polisher [36]. Together, the combination of the Illumina and 454 sequencing platforms provided 155.4× coverage of the genome. The error rate of the completed genome sequence is less than 1 in 100,000.
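The shredding step can be illustrated with a short sketch. This is not the JGI implementation: the function name `shred` is hypothetical, and the 500 bp overlap is an assumed value, since the text gives only the fake-read lengths (1.5 kb and 2 kb) and not the overlap.

```python
from typing import List


def shred(consensus: str, read_len: int = 1500, overlap: int = 500) -> List[str]:
    """Cut a consensus sequence into overlapping fixed-length "fake reads".

    read_len follows the 1.5 kb length mentioned in the text; overlap is an
    assumed parameter for illustration. The last read may be shorter than
    read_len when the sequence length is not an exact fit.
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must satisfy 0 <= overlap < read_len")
    if not consensus:
        return []

    step = read_len - overlap
    reads = []
    start = 0
    while True:
        reads.append(consensus[start:start + read_len])
        # Stop once the current read reaches the end of the consensus.
        if start + read_len >= len(consensus):
            break
        start += step
    return reads


if __name__ == "__main__":
    # Tiny toy example so the overlap is easy to see.
    for read in shred("ABCDEFGHIJ", read_len=4, overlap=2):
        print(read)
```

Each fake read shares its last `overlap` bases with the start of the next one, which lets an overlap-based assembler such as phrap rejoin them alongside the Sanger reads.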

Genome annotation

Genes were identified using Prodigal [37] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [38]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [39].

Genome properties

The genome consists of a 2,153,198 bp long chromosome with a 35.2% GC content (Table 3 and Figure 3). Of the 2,220 genes predicted, 2,165 were protein-coding genes and 55 were RNA genes; seven pseudogenes were also identified. The majority of the protein-coding genes (69.1%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
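As a quick consistency check on these figures, the arithmetic can be sketched as follows. The input numbers come from the paragraph above; the function name is hypothetical, and the derived counts are approximations because the 69.1% value is itself rounded.

```python
GENOME_SIZE_BP = 2_153_198      # chromosome length from the text
TOTAL_GENES = 2_220             # all predicted genes
PROTEIN_CODING = 2_165          # protein-coding genes
RNA_GENES = 55                  # RNA genes
FRACTION_WITH_FUNCTION = 0.691  # share of protein-coding genes with a putative function


def summarize() -> dict:
    """Derive approximate counts from the reported genome statistics."""
    with_function = round(PROTEIN_CODING * FRACTION_WITH_FUNCTION)
    return {
        # The two gene classes should add up to the reported total.
        "counts_consistent": PROTEIN_CODING + RNA_GENES == TOTAL_GENES,
        # Approximate, because 69.1% is a rounded percentage.
        "approx_with_function": with_function,
        "approx_hypothetical": PROTEIN_CODING - with_function,
        # Average genome length per predicted gene, in bp.
        "bp_per_gene": GENOME_SIZE_BP / TOTAL_GENES,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

Under these assumptions, roughly 1,496 protein-coding genes carry a putative function and about 669 are hypothetical; the exact counts should be read from Table 3.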

Figure 3.

Graphical circular map of the genome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.
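The two innermost tracks of such a map are typically computed in sliding windows along the chromosome. The sketch below assumes the common definition of GC skew, (G − C)/(G + C); the function names, window size and step are illustrative assumptions and are not the settings used to draw Figure 3.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, defined here as (G - C) / (G + C)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(
    seq: str, window: int = 10_000, step: int = 10_000
) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, GC content, GC skew) for each full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk), gc_skew(chunk)


if __name__ == "__main__":
    # Toy sequence for illustration only.
    for row in sliding_windows("GGCCAATT", window=4, step=4):
        print(row)
```

Plotting the second and third values of each tuple against the window start position gives linear versions of the GC content and GC skew rings.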

Table 3. Genome Statistics
Table 4. Number of genes associated with the general COG functional categories