Introduction

The genus Thermovibrio consists of three validly published, named species: T. ammonificans strain HB-1T [1], T. ruber strain ED11/3LLKT [2] and T. guaymasensis strain SL19T [3]. All three Thermovibrio spp. are anaerobic, chemolithoautotrophic bacteria that grow on mineral salts in the presence of carbon dioxide and hydrogen, reducing nitrate or sulfur to ammonium or hydrogen sulfide, respectively. T. ammonificans was isolated from an active high-temperature deep-sea hydrothermal vent located on the East Pacific Rise at 9° North, while T. ruber was isolated from shallow-water hydrothermal vent sediments in Papua New Guinea and T. guaymasensis from a deep-sea hydrothermal vent chimney in the Guaymas Basin [1-3]. Anaerobic chemolithoautotrophic bacteria mediate the transfer of energy and carbon from a geothermal source to the higher trophic levels. These anaerobic primary producers, which depend on inorganic chemical species of geothermal origin (i.e., carbon dioxide, hydrogen and sulfur), are completely independent of photosynthetic processes and represent an important component of the deep-sea hydrothermal vent ecosystem. Furthermore, microorganisms such as T. ammonificans, which couple autotrophic carbon dioxide fixation with nitrate respiration, are of particular interest because they link the carbon and nitrogen cycles, the latter of which has been under-studied at deep-sea hydrothermal vents. Here we present a summary of the features of T. ammonificans strain HB-1T and a description of its genome.

Classification and features

Thermovibrio ammonificans strain HB-1T (=DSM 15698T =JCM 12110T) is a member of the phylum Aquificae, a group of thermophilic, deeply branching bacteria thought to be among the oldest on Earth. The phylum Aquificae consists of a single order, the Aquificales, which is composed of three families, Aquificaceae, Hydrogenothermaceae and Desulfurobacteriaceae (Figure 1). The genus Thermovibrio belongs to the family Desulfurobacteriaceae, which also includes the genera Desulfurobacterium, Balnearium and the newly described Phorcysia [6-8]. While the genomes of several members of the families Aquificaceae and Hydrogenothermaceae have been sequenced, the only genome sequences publicly available for the Desulfurobacteriaceae are those of T. ammonificans and Desulfurobacterium thermolithotrophum [9].

Figure 1.

Phylogenetic position of Thermovibrio ammonificans HB-1T relative to other type strains within the Aquificae. Sequences were aligned automatically using CLUSTAL X and the alignment was manually refined using SEAVIEW [4,5]. The neighbor-joining tree was constructed with Phylo_Win, using the Jukes-Cantor correction [4]. Bootstrap values are based on 100 replications. Bar, 0.02 substitutions per nucleotide position.
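As a rough illustration of the Jukes-Cantor correction named in the legend, the sketch below converts the observed proportion p of differing sites between two aligned sequences into a corrected distance d = -(3/4) ln(1 - (4/3) p), expressed in substitutions per nucleotide position. The function name, the handling of gaps and the toy sequences are assumptions made for illustration; this is not the Phylo_Win implementation.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: only columns where both sequences carry an unambiguous
    # base are compared; gaps and ambiguity codes are skipped.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites in the alignment")
    # Observed proportion of differing sites (the uncorrected p-distance).
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # Toy 10-column alignment with a single mismatch (hypothetical data).
    print(jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA"))
```

The correction matters most for divergent pairs: for small p the corrected distance is close to p itself, and it grows faster than p as multiple substitutions at the same site become more likely.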

Table 1 summarizes the classification and general features of Thermovibrio ammonificans HB-1T. Cells of T. ammonificans are Gram-negative, motile rods of about 1.0 µm in length and 0.6 µm in width (Figure 2). Growth occurs between 60 and 80 °C (optimum at 75 °C), 0.5 and 4.5% (w/v) sodium chloride (optimum at 2%) and pH 5 and 7 (optimum at 5.5). Generation time under optimal conditions is 1.5 h. Growth occurs under chemolithoautotrophic conditions in the presence of hydrogen and carbon dioxide, with nitrate or sulfur as the electron acceptor and with concomitant formation of ammonium or hydrogen sulfide, respectively. Thiosulfate, sulfite and oxygen are not used as electron acceptors. Acetate, formate, lactate and yeast extract inhibit growth. No chemoorganoheterotrophic growth was observed on peptone, tryptone or Casamino acids. The genomic DNA G+C content is 52.1 mol% [1].

Figure 2.

Electron micrograph of a platinum-shadowed cell of Thermovibrio ammonificans strain HB-1T showing multiple flagella. Bar, 1 µm.

Table 1. Classification and general features of Thermovibrio ammonificans HB-1T

Chemotaxonomy

None of the classical chemotaxonomic features (peptidoglycan structure, cell wall sugars, cellular fatty acid profile, respiratory quinones, or polar lipids) are known for Thermovibrio ammonificans strain HB-1T.

Genome sequencing information

Genome project history

T. ammonificans was selected for genome sequencing because of its phylogenetic position within the phylum Aquificae and because of its ecological function as a primary producer at deep-sea hydrothermal vents. Sequencing, finishing and annotation were carried out by the US DOE Joint Genome Institute (JGI). Table 2 shows a summary of the project information and its association with MIGS version 2.0 compliance [17].

Table 2. Project information

Growth conditions and DNA isolation

T. ammonificans was grown in two liters of modified SME medium at 75 °C under a H2/CO2 gas phase (80:20; 200 kPa) with CO2 as the carbon source and nitrate as the electron acceptor [1]. Genomic DNA was isolated from 0.5–1 g of pelleted cells using a protocol that included a lysozyme/SDS lysis step, followed by two extractions with phenol:chloroform:isoamyl alcohol (50:49:1) and ethanol precipitation. This procedure yielded about 25 µg of genomic DNA, which was submitted to the DOE JGI for sequencing.

Genome sequencing and assembly

The genome of Thermovibrio ammonificans was sequenced at the DOE JGI [18] using a combination of Illumina [19] and 454 platforms [20]. The following libraries were used: 1) an Illumina GAii shotgun library, which generated 102,555,615 reads totaling 7,794 Mb; 2) a 454 Titanium standard library, which generated 186,945 reads; and 3) a paired-end 454 library with an average insert size of 11.895 +/− 2.973 kb, which generated 115,495 reads totaling 104.7 Mb of 454 data. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website [21]. The initial draft assembly contained 16 contigs in 2 scaffolds. The 454 Titanium standard data and the 454 paired-end data were assembled together with Newbler, version 2.3. The Newbler consensus sequences were computationally shredded into 2 kb overlapping fake reads (shreds). Illumina sequencing data were assembled with VELVET, version 0.7.63 [22], and the consensus sequences were computationally shredded into 1.5 kb overlapping fake reads (shreds). The 454 Newbler consensus shreds, the Illumina VELVET consensus shreds and the read pairs in the 454 paired-end library were integrated using parallel phrap, version SPS - 4.24 (High Performance Software, LLC). The software Consed [23] was used in the finishing process. Illumina data were used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI (Alla Lapidus, unpublished). Possible mis-assemblies were corrected using gapResolution (Cliff Han, unpublished), Dupfinisher [24], or by sequencing cloned bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR (J-F Cheng, unpublished) primer walks. A total of 46 additional reactions and 1 shatter library were necessary to close gaps and to raise the quality of the finished sequence. The total size of the genome is 1,759,526 bp (chromosome and plasmid) and the final assembly is based on 67.7 Mb of 454 draft data, which provide an average 40× coverage of the genome, and 7,284 Mb of Illumina draft data, which provide an average 4,285× coverage of the genome.
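Two of the computations described above can be sketched in a few lines of Python: cutting a consensus sequence into overlapping shreds, and estimating average coverage as sequenced bases divided by genome size. This is a minimal sketch under stated assumptions, not the JGI pipeline: the function names are hypothetical, and the step between shred start positions is a free parameter because the overlap used is not given in the text.

```python
def shred(consensus: str, shred_len: int, step: int) -> list[str]:
    """Cut a consensus sequence into overlapping pseudo-reads (shreds).

    shred_len is the shred length (for example 2000 or 1500 bases) and
    step is the distance between consecutive start positions. The step
    is an assumption; the overlap is not stated in the text.
    """
    if shred_len <= 0 or not 0 < step <= shred_len:
        raise ValueError("need shred_len > 0 and 0 < step <= shred_len")
    if len(consensus) <= shred_len:
        return [consensus] if consensus else []
    starts = list(range(0, len(consensus) - shred_len + 1, step))
    # Add a final window flush with the end so the tail is always covered.
    if starts[-1] + shred_len < len(consensus):
        starts.append(len(consensus) - shred_len)
    return [consensus[s:s + shred_len] for s in starts]


def mean_coverage(total_bases_mb: float, genome_size_bp: int) -> float:
    """Average fold coverage: sequenced bases divided by genome size."""
    return total_bases_mb * 1e6 / genome_size_bp


if __name__ == "__main__":
    genome_size_bp = 1_759_526  # total genome size reported in the text
    print(shred("ABCDEFGHIJK", shred_len=4, step=2))
    print(mean_coverage(67.7, genome_size_bp))    # 454 draft data
    print(mean_coverage(7_284, genome_size_bp))   # Illumina draft data
```

With the finished genome size as the denominator, this simple ratio gives roughly 38× for the 454 data and roughly 4,140× for the Illumina data. The reported 40× and 4,285× would match a denominator of about 1.7 Mb; the text does not state how the averages were computed.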

Genome annotation

Genes were identified using Prodigal [25] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [26]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE [27], RNAmmer [28], Rfam [29], TMHMM [30], and SignalP [31].

Genome properties

The genome includes one circular chromosome and one plasmid, for a total size of 1,759,526 bp (chromosome size: 1,682,965 bp; GC content: 52.13%). Of the 1,888 genes predicted from the genome, 1,831 are protein-coding genes. Of the protein-coding genes, 1,279 were assigned a putative function, and the remainder were annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Figure 3 and Tables 3 and 4.
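The figures in this paragraph imply a few values that are not spelled out, and the GC content itself is a simple base count. The sketch below derives the implied numbers and shows one way to compute GC percentage from a sequence string. The function and variable names are hypothetical, and excluding ambiguous bases from the denominator is an assumption rather than the method used for the reported 52.13%.

```python
def gc_percent(sequence: str) -> float:
    """GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    # Assumption: ambiguity codes such as N are excluded from the total.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Values reported in the text.
TOTAL_SIZE_BP = 1_759_526
CHROMOSOME_BP = 1_682_965
TOTAL_GENES = 1_888
PROTEIN_CODING = 1_831
WITH_FUNCTION = 1_279

# Values implied by the reported figures.
plasmid_bp = TOTAL_SIZE_BP - CHROMOSOME_BP          # plasmid size
hypothetical = PROTEIN_CODING - WITH_FUNCTION       # hypothetical proteins
non_protein_coding = TOTAL_GENES - PROTEIN_CODING   # genes not protein-coding
pct_with_function = 100.0 * WITH_FUNCTION / PROTEIN_CODING

if __name__ == "__main__":
    print(plasmid_bp, hypothetical, non_protein_coding)
    print(round(pct_with_function, 1))
    print(gc_percent("GGCCAATT"))  # toy sequence, not genome data
```

From the reported totals, the plasmid accounts for 76,561 bp, 552 protein-coding genes are annotated as hypothetical proteins, and about 70% of protein-coding genes carry a putative function.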

Figure 3.

Graphical circular map of the genome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs cyan, rRNAs red, other RNAs blue), GC content, GC skew.

Table 3. Genome statistics
Table 4. Number of genes associated with the 25 general COG functional categories