Introduction

Strain 1T (= DSM 74 = ATCC 33905 = LMG 10896) is the type strain of the species Spirosoma linguale, which is the type species of the genus Spirosoma. The genus currently consists of five species [1]. Strain 1T is reported to have been isolated from a laboratory water bath (websites of DSMZ and ATCC); however, a proper reference could not be identified. Another strain of S. linguale was isolated from fresh water from deep wells in Long Beach, California, USA [2]. Other strains from the genus Spirosoma were isolated from high arctic permafrost soil in Norway [3], soil from a ginseng field in Pocheon province, South Korea [4], and fresh water from the Woopo wetlands, South Korea [5]. These isolation sites are consistent with the hypothesis that S. linguale is a free-living species with a worldwide distribution. The genus name Spirosoma derives from ‘spira’, Latin for ‘coil’, combined with ‘soma’, Greek for ‘body’, resulting in ‘coiled body’ [1]. Spirosoma was the first genus in the family Spirillaceae in Migula’s “System der Bakterien” [6]. The species name was effectively published by Migula in 1894 [7] and validly published by Skerman in 1980 [8]. Various taxonomic treatments have placed this organism either in the family “Flexibacteraceae” or in the family Cytophagaceae. This appears to be due to a number of nomenclatural problems. The family “Flexibacteraceae” as outlined in TOBA 7.7 would include Cytophaga hutchinsonii, which is the type species of the genus Cytophaga, which, in turn, is the type of the family Cytophagaceae, a name that may not be replaced by the family name “Flexibacteraceae” as long as Cytophaga hutchinsonii is one of the included species. However, the topology of the 16S rDNA-based dendrogram indicates that it may be possible to define a second family that includes the genus Spirosoma but excludes Cytophaga hutchinsonii. At the same time, the family Cytophagaceae may be defined to exclude the type species of the genus Flexibacter and members of the genus Spirosoma.
It should also be remembered that the genus Spirosoma is the type of the family Spirosomaceae Larkin and Borrall 1978. At present, the higher taxonomic ranks of this group of organisms lack formal modern descriptions and circumscriptions, making it difficult to make definitive statements that would hold over the next few years. Here we present a summary classification and a set of features for S. linguale 1T, together with the description of the complete genomic sequencing and annotation.

Classification and features

Uncultured clone sequences in GenBank showed 96% or less sequence identity to the 16S rRNA gene sequence (AM000023) of S. linguale strain 1T. No reasonable sequence similarity (>87%) to sequences from any metagenomic survey was reported by the NCBI BLAST server (October 2009).

Figure 1 shows the phylogenetic neighborhood of S. linguale 1T in a 16S rRNA-based tree. The sequences of the four identical 16S rRNA gene copies in the genome of S. linguale 1T are also identical to the previously published 16S rRNA sequence generated from LMG 10896 (AM000023).

Figure 1.

Phylogenetic tree highlighting the position of S. linguale 1T and the type strains of the other species within the genus relative to the other type strains within the family Cytophagaceae. The tree was inferred from 1,320 aligned characters [9,10] of the 16S rRNA gene sequence under the maximum likelihood criterion [11] and rooted with the type strain of the family Sphingobacteriaceae. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 1,000 bootstrap replicates if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [12] are shown in blue, published genomes such as the one of Dyadobacter fermentans [13] in bold.

On TGEY medium [14], strain S. linguale 1T forms mucoid, opaque, and smooth colonies with a yellowish nondiffusible pigment [15]. The colony size is 2–4 mm, circular, with entire margins and convex elevation. In broth, growth is aerobic (Table 1) with even turbidity and flaky sediment [15]. The Gram-negative cells have round ends and show vibroid, horseshoe, and ring-like shapes, as well as coils and spiral forms [Figure 2 and ref. 15]. The cell width is 0.5–1.0 µm, and the outer ring diameter is 1.5–3.0 µm. The cell length is 2.0–5.0 µm [22]. Reports on filaments are conflicting [15,22].

Figure 2.

Scanning electron micrograph of S. linguale 1T

Table 1. Classification and general features of S. linguale 1T according to the MIGS recommendations [16]

Strain S. linguale 1T oxidatively produces acid from arabinose, ribose, xylose, rhamnose, fructose, galactose, glucose, mannose, α-methyl-D-glucoside, salicin, cellobiose, lactose, maltose, melibiose, sucrose, trehalose, raffinose, dextrin and inulin, but not from sorbose, glycerol, erythritol, dulcitol, mannitol, or sorbitol [22]. On the enzymatic level, strain S. linguale 1T is positive for oxidase, catalase, the ONPG reaction, and, albeit weakly, for phosphatase, but negative for urease, lecithinase, lysine decarboxylase, phenylalanine deaminase, and hemolysin, as well as for the indole, methyl red, Voges-Proskauer, NO3 reduction and H2S reactions [22]. Strain S. linguale 1T hydrolyzes esculin, tributyrin, gelatin, and, less well, starch and casein, but not cellulose or chitin [22]. For growth on basal medium [25], it utilizes glycerol phosphate, succinate, tartrate, and malonate as single carbon sources, but not acetate, benzoate, citrate, formate, methylamine, propionate, or methanol [22]. Strain S. linguale 1T grows well on nutrient agar, nutrient agar + 5% sucrose, Microcyclus-Spirosoma agar, and yeast extract tryptone agar, weakly on peptonized milk agar, blood agar, and chocolate agar, and not on eosin methylene blue agar, phenol red mannitol salt agar, phenyl ethyl alcohol agar, trypticase soy agar (TSA), TSA + 3% glucose, TSA + 3% sucrose, MacConkey agar, bismuth sulfite agar, or Salmonella-Shigella agar [22]. Strain S. linguale 1T is susceptible to actinomycin D (100 µg/ml), ampicillin (10 µg), aureomycin (15 µg), carbenicillin (50 µg), erythromycin (15 µg), furadantin/macrodantin (300 µg), gentamicin (10 µg), kanamycin (30 µg), mitomycin C (1 µg/ml), neomycin (30 µg), penicillin G (10 units), streptomycin (10 µg), sulfamethoxazole/trimethoprim (25 µg), sulfathiazole (300 µg), and tetracycline (30 µg), but resistant to colistin (10 µg), polymyxin B (300 units), and triple sulfa (1 mg) [22].

Chemotaxonomy

Earlier studies report C16:1 to be the dominant fatty acid (47.9%), followed by iso-C17:0 (20.1%), C16:0 (14.2%), iso-C15:0 (11.0%) and iso-C13:0 (3.4%). Anteiso and hydroxy fatty acids are each below 2.1% [26]. The fatty acids comprise a complex mixture of straight-chain saturated and unsaturated fatty acids, together with iso-branched and 3-hydroxylated iso-branched fatty acids: iso-C13:0 (2.2%), iso-C15:0 (9.3%), iso-C15:0 3-OH (3.4%), anteiso-C15:0 (2.6%), C16:0 (3.6%), C16:0 3-OH (2.2%), C16:1ω5c (22.2%), C17:0 2-OH (1.0%), iso-C17:0 3-OH (8.6%), C17:1ω9c (1.2%) and C16:1ω7c and/or iso-C15:0 2-OH (42.4%). The polar lipids comprise phosphatidylethanolamine and a number of lipids and aminolipids that were not further characterized. The fatty acid pattern is typical of the evolutionary group currently defined as the phylum Bacteroidetes. Furthermore, the presence of phosphatidylethanolamine as the predominant or sole diglyceride-based phospholipid is also typical of the vast majority of the phylum Bacteroidetes. Limited detailed studies indicate that this phospholipid contains both saturated and unsaturated straight-chain fatty acids. Hydroxylated fatty acids are not present in this compound. In contrast, the limited studies on the aminolipids of Flavobacterium johnsoniae indicate that they are amino acid based, with a 3-OH fatty acid in amide linkage with a free amino group of the amino acid. The 3-OH fatty acid is further esterified with either a non-hydroxylated fatty acid or a second hydroxylated fatty acid. The presence of non-staining lipids running close to the major aminolipid may also be indicative of capnines.
The failure to resolve the fatty acids reported by the MIDI Sherlock MIS system as “summed feature 4” (C16:1ω7c and/or iso-C15:0 2-OH) is problematic for genomics, since it indicates either that two mechanisms of introducing double bonds into fatty acids are present (yielding C16:1ω5c and C16:1ω7c) or that a fatty acid 2-hydroxylase is present. Furthermore, the distribution of 3-OH and 2-OH fatty acids among the amino- and non-staining lipids may also be characteristic. The main isoprenoid quinone is MK-7 (91.5%), followed by MK-8 (7.2%) and MK-6 (1.3%) [26].

Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position, and is part of the Genomic Encyclopedia of Bacteria and Archaea project. The genome project is deposited in the Genome OnLine Database [12] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

S. linguale 1T, DSM 74, was grown in DSMZ medium 7 [27] at 28°C. DNA was isolated from 0.5–1 g of cell paste using the Qiagen Genomic 500 DNA Kit (Qiagen, Hilden, Germany) with cell lysis modification st/L [28] and a one-hour incubation at 37°C.

Genome sequencing and assembly

The genome was sequenced using a combination of Sanger and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at http://www.jgi.doe.gov/. 454 Pyrosequencing reads were assembled using the Newbler assembler version 1.1.02.15 (Roche). Large Newbler contigs were broken into 9,401 overlapping fragments of 1,000 bp and entered into assembly as pseudo-reads. The sequences were assigned quality scores based on Newbler consensus q-scores with modifications to account for overlap redundancy and to adjust inflated q-scores. A hybrid 454/Sanger assembly was made using the parallel phrap assembler (High Performance Software, LLC). Possible mis-assemblies were corrected with Dupfinisher [29] or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, custom primer walk or PCR amplification. A total of 974 Sanger finishing reads were produced to close gaps, to resolve repetitive regions, and to raise the quality of the finished sequence. Illumina reads were used to improve the final consensus quality using an in-house developed tool (the Polisher). The error rate of the completed genome sequence is less than 1 in 100,000. Together all sequence types provided 28.5× coverage of the genome. The final assembly contains 87,186 Sanger and 666,973 pyrosequence reads.
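The shredding of large contigs into overlapping pseudo-reads can be illustrated with a minimal sketch. It is not the JGI pipeline: the function name `shred_contig` is hypothetical, and the 100 bp overlap is an assumption chosen for illustration, since only the 1,000 bp fragment length is stated above.

```python
def shred_contig(contig, fragment_len=1000, overlap=100):
    """Split a contig into overlapping fragments (pseudo-reads).

    fragment_len follows the 1,000 bp stated in the text; the overlap
    of 100 bp is an assumed value, not taken from the source.
    """
    if overlap >= fragment_len:
        raise ValueError("overlap must be smaller than fragment_len")
    step = fragment_len - overlap  # distance between fragment starts
    fragments = []
    start = 0
    while start < len(contig):
        fragments.append(contig[start:start + fragment_len])
        # Stop once the current fragment reaches the end of the contig.
        if start + fragment_len >= len(contig):
            break
        start += step
    return fragments


# Toy 2,500 bp contig; a real contig would come from the assembler output.
toy_contig = "ACGT" * 625
pseudo_reads = shred_contig(toy_contig)
```

The last fragment may be shorter than `fragment_len`; whether such short tails are kept or discarded in practice is a design choice that the text does not specify.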

Genome annotation

Genes were identified using Prodigal [30] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [31]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes Expert Review (IMG-ER) platform [32].

Genome properties

The genome consists of an 8,078,757 bp long chromosome and eight plasmids of 6,072 to 189,452 bp in length (Table 3 and Figure 3). Of the 7,129 genes predicted, 7,069 were protein-coding genes and 60 were RNA genes; 131 pseudogenes were also identified. The majority of the protein-coding genes (61.5%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.
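The gene counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the helper name `split_by_function` is hypothetical, and the resulting counts are approximations derived from the rounded 61.5% figure, not values taken from Table 3.

```python
TOTAL_GENES = 7129
PROTEIN_CODING = 7069
RNA_GENES = 60
FRACTION_WITH_FUNCTION = 0.615  # rounded percentage from the text


def split_by_function(protein_coding, fraction_with_function):
    """Estimate genes with a putative function vs. hypothetical proteins."""
    with_function = round(protein_coding * fraction_with_function)
    hypothetical = protein_coding - with_function
    return with_function, hypothetical


# Protein-coding and RNA genes should add up to the total gene count.
counts_consistent = PROTEIN_CODING + RNA_GENES == TOTAL_GENES

with_function, hypothetical = split_by_function(
    PROTEIN_CODING, FRACTION_WITH_FUNCTION
)
print(counts_consistent, with_function, hypothetical)
```

Because 61.5% is itself rounded, the estimated counts may differ from the exact tabulated values by a few genes.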

Figure 3.

Graphical circular map of the chromosome (A) and the eight plasmids: pSLIN01 (B), pSLIN02 (C), pSLIN03 (D), pSLIN04 (E), pSLIN05 (F), pSLIN06 (G), pSLIN07 (H), pSLIN08 (I). Plasmids not drawn to scale. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.
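The two innermost tracks of such a map, GC content and GC skew, are conventionally computed per window along the sequence, with GC skew defined as (G − C)/(G + C). The following is a minimal sketch of that calculation, not the tool used to draw Figure 3; the function name and the 1,000 bp window and step sizes are assumptions made for illustration.

```python
def gc_content_and_skew(seq, window=1000, step=1000):
    """Return (start, gc_content, gc_skew) for each full window of seq.

    gc_content = (G + C) / window length
    gc_skew    = (G - C) / (G + C), set to 0.0 when a window has no G or C
    Window and step sizes are assumed values, not taken from the source.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results


# Toy sequence: a G-rich window followed by a C-rich window.
toy_seq = "GGGC" * 250 + "ATCC" * 250
tracks = gc_content_and_skew(toy_seq)
```

A sign change in GC skew along a bacterial chromosome is commonly used as a hint for the location of the replication origin and terminus, which is why the track is a standard part of circular genome maps.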

Table 3. Genome Statistics
Table 4. Number of genes associated with the general COG functional categories