Introduction

Strain 21T (= DSM 1279 = ATCC 35948 = VKM B-1258) is the type strain of the species Meiothermus ruber, which is the type species of the genus Meiothermus [1]. Strain 21T was first described as a member of the genus Thermus by Loginova and Egorova in 1975 [2], but the species name to which it was assigned was not included on the Approved Lists of Bacterial Names [3]. Consequently, the name Thermus ruber was revived in 1984 [5] according to Rule 28a of the International Code of Nomenclature of Bacteria [4]. It received its current name in 1996, when it was transferred from the genus Thermus into the then novel genus Meiothermus by Nobre et al. [1]. Currently, there are eight species placed in the genus Meiothermus [6]. The genus name derives from the Greek words ‘meion’ and ‘thermos’ meaning ‘lesser’ and ‘hot’ to indicate an organism in a less hot place [1,6]. The species epithet derives from the Latin word ‘ruber’ meaning red, to indicate the red cell pigmentation [5,6]. Members of the genus Meiothermus were isolated from natural hot springs and artificial thermal environments [1] in Russia [5], Central France [7], both Northern and Central Portugal [8,9], North-Eastern China [10], Northern Taiwan [11] and Iceland [12]. Interestingly, the genus Meiothermus is heterogeneous with respect to pigmentation. The yellow pigmented species also form a distinct group on the basis of 16S rRNA gene sequence similarity, with the red/orange pigmented strains forming two groups, one comprising M. silvanus and the other the remaining species [9,10]. As in all members of the Deinococci, the lipid composition of the cell membrane of members of the genus Meiothermus is based on unusual and characteristic structures. Here we present a summary classification and a set of features for M. ruber 21T, together with the description of the complete genomic sequencing and annotation.

Classification and features

The 16S rRNA genes of the seven other type strains in the genus Meiothermus share between 88.7% (M. silvanus) [13] and 98.8% (M. taiwanensis) [14] sequence identity with strain 21T, whereas the other type strains from the family Thermaceae share 84.5 to 87.6% sequence identity [15]. Thermus sp. R55-10 from the Great Artesian Basin of Australia (AF407749), as well as other reference strains, e.g. 16105 and 17106 [12], and the uncultured bacterial clone 53-ORF05 from an aerobic sequencing batch reactor (DQ376569) show full-length 16S rRNA sequences identical to that of strain 21T. A rather large number of isolates and clones with almost identical 16S rRNA gene sequences originate from the Great Artesian Basin of Australia (clone R03; AF407684), from various hot springs in Hyogo, Japan (strain H328; AB442017), Liaoning Province, China (strain L462; EU418906, and others), and Thailand (strain O1DQU; EU376397), and from a Finnish paper production facility (strain L-s-R2A-3B.2; AM229096), among others. They also include the not validly published ‘M. rosaceus’ (99.9%) [16] from Tengchong hot spring in Yunnan (China). Sequences from environmental samples and metagenomic surveys do not surpass 81–82% sequence similarity to the 16S rRNA gene sequence of strain 21T. Taken together, this gives a rather mixed picture of the environmental importance of strains belonging to the species M. ruber, which appear to occur only in very restricted extreme habitats (status August 2009).

Figure 2 shows the phylogenetic neighborhood of M. ruber 21T in a 16S rRNA based tree. The sequences of the two 16S rRNA gene copies in the genome are identical and differ by only one nucleotide from the previously published sequence generated from ATCC 35948 (Z15059).
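The identity values quoted above are pairwise comparisons of aligned 16S rRNA gene sequences. As a minimal sketch of how such a percentage can be computed, assuming the two sequences have already been aligned to equal length and that gapped columns are excluded (one common convention, not necessarily the one used for the values in this report), the function name and the toy sequences below are hypothetical:

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            # Assumption: gapped columns are skipped; other conventions
            # count them as mismatches and give lower values.
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (made-up sequences, not real 16S rRNA data).
toy_a = "ACGTACGT-ACG"
toy_b = "ACGTTCGTAACG"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

With this convention, a single-nucleotide difference between two otherwise identical full-length sequences, as reported for the genome copies versus Z15059, corresponds to an identity just below 100%.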

A detailed physiological description based on five strains has been given by Loginova et al. [5]. The cells are described as Gram-negative nonmotile rods that are 3 to 6 by 0.5 to 0.8 µm (Table 1), have rounded ends, and are nonsporeforming [5]. In potato-peptone-yeast extract broth incubated at 60°C, filamentous forms (20 to 40 µm in length) are observed along with shorter rods (Figure 1) [5]. No filamentous forms are observed after 16 h of incubation. M. ruber is obligately thermophilic [5]. On potato-peptone-yeast extract medium, the temperature range for growth is approx. 35–70°C, with an optimum temperature at 60°C (the generation time is then 60 min) [5]. A bright red intracellular carotenoid pigment is produced, which resembles retro-dehydro-γ-carotene (neo A, neo B) in its spectral properties [2]. The absorption spectra of acetone, methanol-acetone (1:1), and hexane extracts show three maxima at 455, 483, and 513 nm. The major carotenoid has since been identified as a 1′-β-glucopyranosyl-3,4,3′,4′-tetradehydro-1′,2′-dihydro-β,ψ-caroten-2-one, with the glucose acetylated at position 6 [30]. One strain (strain INMI-a) contains a bright yellow pigment resembling neurosporaxanthine in its spectral properties [5], although it may well have been misidentified, since other species within the genus Meiothermus are yellow pigmented [8,9]. M. ruber is obligately aerobic [5]. It grows in minimal medium supplemented with 0.15% (wt/vol) peptone as an N source, 0.05% (wt/vol) yeast extract, and one of the following carbon sources at a concentration of 0.25% (wt/vol): D-glucose, sucrose, maltose, D-galactose, D-mannose, rhamnose, D-cellobiose, glycerol, D-mannitol, acetate, pyruvate, succinate, fumarate, or DL-malate (sodium salts). No growth occurs if the concentration of D-glucose in the medium is raised to 0.5% (wt/vol) [5]. Only moderate growth occurs when ammonium phosphate (0.1%, wt/vol) is substituted for peptone as the N source.
No growth occurs in the control medium without a carbon source. No growth occurs on minimal medium supplemented with 0.25% (wt/vol) D-glucose, 0.05% (wt/vol) yeast extract and one of the following nitrogen sources at a concentration of 0.1% (wt/vol): L-alanine, glycine, L-asparagine, L-tyrosine, L-glutamate, ammonium sulfate, nitrate, or urea. Further lists of carbon source utilization, which differ in part from the above list, are published elsewhere [7,10–12]. Nitrates are not reduced and milk is not peptonized [5], but M. ruber strain 21T is positive for catalase and oxidase [10]. The most comprehensive and updated list of physiological properties is probably that given by Albuquerque et al. [7].

Figure 1. Scanning electron micrograph of M. ruber 21T

Figure 2. Phylogenetic tree highlighting the position of M. ruber 21T relative to the type strains of the other species within the genus and to the type strains of the other species within the family Thermaceae. The tree was inferred from 1,403 aligned characters [31,32] of the 16S rRNA gene sequence under the maximum likelihood criterion [33] and rooted in accordance with the current taxonomy [34]. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 1,000 bootstrap replicates [35] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [36] are shown in blue, published genomes in bold (Thermus thermophilus; AP008226).

Table 1. Classification and general features of M. ruber 21T according to the MIGS recommendations [17]

Chemotaxonomy

Initial reports on the polar lipids of M. ruber indicated that they consist of two major glycolipids GL1a (∼ 42%) and GL1b (∼ 57%) and one major phospholipid PL2 (∼ 93%), with small amounts of two other phospholipids PL1 and PL3 [37]. Detailed work indicates that in strains of Thermus oshimai, T. thermophilus, M. ruber, and M. taiwanensis the major phospholipid is a 2’-O-(1,2-diacyl-sn-glycero-3-phospho)-3’-O-(α-N-acetyl-glucosaminyl)-N-glyceroyl alkylamine [38]. This compound is related to the major phosphoglycolipid reported from Deinococcus radiodurans [39] and can be considered an unambiguous chemical marker for this major evolutionary lineage. The glycolipids are derivatives of a Glcp -> Glcp -> GalNAcyl -> Glcp -> diacyl glycerol [40]. Based on mass spectral data it appears that there may be three distinct derivatives, differing in the fatty acid amide linked to the galactosamine [40]. These may be divided into one compound containing exclusively 2-hydroxylated fatty acids (mainly 2-OH iso-17:0) and a mixture of two compounds, carrying either 3-hydroxylated fatty acids or unsubstituted fatty acids, that cannot be fully resolved by thin layer chromatography. The basic glycolipid structure dihexosyl - N-acyl-hexosaminyl - hexosyl - diacylglycerol is a feature common to all members of the genera Thermus and Meiothermus examined to date. There is currently no evidence that members of the family Thermaceae (as currently defined) produce significant amounts of polar lipids containing only two aliphatic side chains. The consequences for membrane structure of having polar lipids containing three aliphatic side chains have yet to be examined. Such peculiarities also indicate the value of membrane composition in helping to unravel evolution at a cellular level.
The major fatty acids of the polar lipids are iso-C15:0 (30–40%) and iso-C17:0 (13–17%), followed by anteiso-C15:0, C16:0, iso-C16:0, anteiso-C17:0, iso-C17:0-2OH, and, at least in some studies, iso-C17:1 ω9c (with values ranging from 3 to 10%). Other fatty acid values are below 2%, including 3-OH branched chain fatty acids. The values vary slightly between the different studies [7,9,11,12,37]. Detailed structural studies suggest that long chain diols may be present in small amounts, substituting for the 1-acyl-sn-glycerol [38]. Although not routinely reported, the presence of alkylamines (amide linked to the glyceric acid of the major phospholipid) can be deduced from detailed structural studies of the major phospholipid [38]. Menaquinone 8 is the major respiratory quinone, although it is not clear which pathway is used for the synthesis of the naphthoquinone ring nucleus [41]. Ornithine is the major diamino acid of the peptidoglycan in the genus Meiothermus [1].

Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position [42], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [43]. The genome project is deposited in the Genome OnLine Database [36] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

M. ruber 21T, DSM 1279, was grown in DSMZ medium 256 (Nutrient Agar) [44] at 50°C. DNA was isolated from 0.5–1 g of cell paste using the Qiagen Genomic 500 DNA Kit (Qiagen, Hilden, Germany) following the standard protocol as recommended by the manufacturer, with modification L for cell lysis as described in Wu et al. [43].

Genome sequencing and assembly

The genome was sequenced using a combination of Sanger and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website (http://www.jgi.doe.gov/). Pyrosequencing reads were assembled using the Newbler assembler version 1.1.02.15 (Roche). Large Newbler contigs were broken into 3,428 overlapping fragments of 1,000 bp and entered into the assembly as pseudo-reads. The sequences were assigned quality scores based on Newbler consensus q-scores, with modifications to account for overlap redundancy and to adjust inflated q-scores. A hybrid 454/Sanger assembly was made using the PGA assembler. Possible misassemblies were corrected and gaps between contigs were closed by primer walks off Sanger clones and bridging PCR fragments and by editing in Consed. A total of 431 Sanger finishing reads were produced to close gaps, to resolve repetitive regions, and to raise the quality of the finished sequence. Illumina reads were used to improve the final consensus quality using an in-house developed tool (the Polisher [45]). The error rate of the completed genome sequence is less than 1 in 100,000. Together, the combination of the Sanger and 454 sequencing platforms provided 37.24× coverage of the genome. The final assembly contains 30,479 Sanger reads and 371,362 pyrosequencing reads.
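The step in which large contigs are broken into overlapping 1,000 bp pseudo-reads can be illustrated with a short sketch. This is not the JGI pipeline: the function name is hypothetical, and the step size of 900 bp (that is, a 100 bp overlap) is an assumption, since the text gives only the fragment length and not the overlap.

```python
from typing import List


def shred_contig(contig: str, fragment_len: int = 1000, step: int = 900) -> List[str]:
    """Cut a contig into overlapping fragments of fragment_len bases.

    step is the distance between fragment starts; the overlap between
    neighbouring fragments is fragment_len - step (an assumed value here).
    """
    if fragment_len <= 0 or not 0 < step <= fragment_len:
        raise ValueError("need fragment_len > 0 and 0 < step <= fragment_len")
    if len(contig) <= fragment_len:
        return [contig]  # short contigs are kept whole
    last_start = len(contig) - fragment_len
    fragments = [contig[s:s + fragment_len] for s in range(0, last_start + 1, step)]
    if last_start % step != 0:
        # Add one fragment anchored at the contig end so the tail is covered.
        fragments.append(contig[last_start:])
    return fragments


# Toy contig of 2,500 bases (made-up sequence).
toy_contig = "ACGT" * 625
pseudo_reads = shred_contig(toy_contig)
print(len(pseudo_reads), [len(r) for r in pseudo_reads])
```

Anchoring the last fragment at the contig end keeps every pseudo-read at the full fragment length, at the cost of a larger overlap for the final pair.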

Genome annotation

Genes were identified using Prodigal [46] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [47]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [48].

Genome properties

The genome consists of a 3,097,457 bp long chromosome with a 63.4% GC content (Table 3 and Figure 3). Of the 3,105 genes predicted, 3,052 were protein-coding genes and 53 were RNA genes; thirty-eight pseudogenes were also identified. The majority of the protein-coding genes (71.8%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.
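The counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the text; the function name is hypothetical, and the numbers of genes with and without a putative function are back-calculated from the rounded 71.8% and are therefore approximate, not values taken from Table 3.

```python
def summarize_gene_counts(total_genes: int, protein_coding: int,
                          rna_genes: int, pct_with_function: float) -> dict:
    """Check that the gene categories add up and derive approximate tallies."""
    if protein_coding + rna_genes != total_genes:
        raise ValueError("protein-coding and RNA gene counts do not sum to the total")
    # Approximate: pct_with_function is a rounded percentage.
    with_function = round(protein_coding * pct_with_function / 100)
    return {
        "protein_coding_pct": 100 * protein_coding / total_genes,
        "with_function_approx": with_function,
        "hypothetical_approx": protein_coding - with_function,
    }


# Figures as quoted in the text above.
summary = summarize_gene_counts(3105, 3052, 53, 71.8)
for key, value in summary.items():
    print(key, round(value, 1))
```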

Figure 3. Graphical circular map of the genome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.
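The two innermost rings of such a map are usually computed in windows along the chromosome. The following is a minimal sketch, assuming the common definition of GC skew as (G − C)/(G + C); the window and step sizes, the function names, and the toy sequence are illustrative assumptions and not the settings used to draw Figure 3.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    acgt = g + c + seq.count("A") + seq.count("T")
    return 100.0 * (g + c) / acgt if acgt else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 if the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


# Toy sequence (made up); a real analysis would read the chromosome from FASTA.
toy_seq = "GGGCATGCCCATGGGCAT" * 10
for start, win in sliding_windows(toy_seq, window=60, step=60):
    print(start, round(gc_content(win), 1), round(gc_skew(win), 3))
```

Applied to the whole chromosome as a single window, `gc_content` corresponds to the overall GC content reported for the genome.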

Table 3. Genome Statistics
Table 4. Number of genes associated with the general COG functional categories