Introduction

Strain AQ1.S1T (= DSM 17230 = JCM 13409) is the type strain of the species Ignisphaera aggregans, which is the type species of the genus Ignisphaera [1], one of nine genera in the family Desulfurococcaceae [25]. The generic name derives from the Latin word ‘ignis’ meaning ‘fire’, and ‘sphaera’ meaning ‘ball’, referring to coccoid cells found in high-temperature environments such as hot springs [1]. The species epithet is derived from the Latin word ‘aggregans’ meaning ‘aggregate-forming or aggregating, clumping’, referring to the appearance of the cells when grown on mono-, di- or polysaccharides [1]. Strain AQ1.S1T is of particular interest because it is able to ferment a considerable number of polysaccharides and complex proteinaceous substrates [1]. Here we present a summary classification and a set of features for I. aggregans AQ1.S1T, together with the description of the complete genomic sequencing and annotation.

Classification and features

Strain AQ1.S1T was isolated from a near neutral, boiling spring situated in Kuirau Park, Rotorua, New Zealand [1]. Interestingly, strains of I. aggregans could not be cultivated from pools with similar characteristics in Yellowstone National Park [1]. Only three cultivated strains are reported for the species I. aggregans in addition to AQ1.S1T: strains Tok37.S1, Tok10A.S1 and Tok1 [1]. The 16S rRNA sequence of AQ1.S1T is 99% identical to that of Tok37.S1, and 98% identical to those of Tok10A.S1 and Tok1. Sequence similarities between strain AQ1.S1T and members of the family Pyrodictiaceae range from 93.0% for Pyrodictium occultum to 93.4% for P. abyssi [6], whereas similarities with members of the family Desulfurococcaceae, in which I. aggregans is currently classified (Table 1), range from 89.7% for Ignicoccus islandicus to 93.5% for Staphylothermus hellenicus [6]. GenBank [16] currently contains only three 16S rRNA gene sequences with significantly high identity values to strain AQ1.S1T: clone YNP_BP_A32 (96%, DQ243730) from hot springs of Yellowstone National Park, clone SSW_L4_A01 (95%, EU635921) from mud hot springs, Nevada, USA, and clone DDP-A02 (94%, AB462559) from a Japanese alkaline geothermal pool. These hits do not necessarily indicate the presence of I. aggregans, but probably the presence of other, yet to be identified species in the genus Ignisphaera. Environmental samples and metagenomic surveys featured in GenBank contain not a single sequence with >87% sequence identity (as of June 2010), indicating that I. aggregans might play a rather limited and regional role in the environment.

Table 1. Classification and general features of I. aggregans AQ1.S1T according to the MIGS recommendations [7]

The cells of strain AQ1.S1T are regular to irregular cocci which occur singly, in pairs or as aggregates of many cells [1]. They usually measure 1–1.5 µm (Figure 1). Aggregation of cells is common when AQ1.S1T is grown on mono-, di- or polysaccharides [1]. Strain AQ1.S1T is hyperthermophilic and grows optimally between 92°C and 95°C; the temperature range for growth is 85–98°C. The pH range for growth is 5.4–7.0, with an optimum at pH 6.4. The strain grows in the presence of up to 0.5% NaCl, but it grows optimally without NaCl. The doubling time is 7.5 h under optimal conditions [1]. I. aggregans strain AQ1.S1T is strictly anaerobic and grows heterotrophically on starch, trypticase peptone, lactose, glucose, konjac glucomannan, mannose, galactose, maltose, glycogen, and β-cyclodextrin. Growth on beef extract and glucose is weak, and no growth is observed on yeast extract, cellobiose, methanol, ethanol, trehalose, pyruvate, acetate, malate, casamino acids (0.1% w/v), carboxymethylcellulose, amylopectin (corn), xanthan gum, locust gum (bean), guar gum, dextran, xylan (oat spelts, larch or birch), xylitol, xylose or amylose (corn and potato) [1]. Mono- and disaccharides accumulate in AQ1.S1T cultures grown in media containing konjac glucomannan, but not in sterile media that had been exposed to the same temperature as the inoculated medium, nor in the stock of konjac glucomannan [1]. As hypothesized by Niederberger et al. [1], this most probably indicates that the konjac glucomannan is hydrolyzed enzymatically by AQ1.S1T into sugars for metabolism. Removal of cystine from the growth medium does not affect cell density significantly. Hydrogen sulfide is also detected in AQ1.S1T cultures grown in enrichment media. Strain AQ1.S1T is resistant to novobiocin and streptomycin but sensitive to erythromycin, chloramphenicol and rifampicin [1].

Figure 1.

Scanning electron micrograph of I. aggregans AQ1.S1T

Chemotaxonomy

No chemotaxonomic data are currently available for I. aggregans strain AQ1.S1T, and chemotaxonomic information for the family Desulfurococcaceae is scarce. What is known is that the type species of this family, Desulfurococcus mucosus, lacks a murein cell wall and contains phytanol and polyisoprenoid dialcohols as major components of the cellular lipids [3].

Figure 2 shows the phylogenetic neighborhood of I. aggregans AQ1.S1T in a 16S rRNA based tree. The sequence of the single 16S rRNA gene copy in the genome of strain AQ1.S1T does not differ from the previously published 16S rRNA sequence from DSM 17230 (DQ060321).

Figure 2.

Phylogenetic tree highlighting the position of I. aggregans AQ1.S1T relative to the type strains of the other genera within the order Desulfurococcales. The tree was inferred from 1,329 aligned characters [17,18] of the 16S rRNA gene sequence under the maximum likelihood criterion [19] and rooted with the type strains of the genera of the neighboring order Acidilobales. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 250 bootstrap replicates [20], if greater than 60%. Lineages with type strain genome sequencing projects registered in GOLD [21] are shown in blue, published genomes in bold ([22–25], CP000504 and CP000852).

Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position [26], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [27]. The genome project is deposited in the Genome OnLine Database [21] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

I. aggregans AQ1.S1T, DSM 17230, was grown anaerobically in DSMZ medium 1043 (Ignisphaera medium) [28] at 92°C. DNA was isolated from 0.5–1 g of cell paste using the MasterPure Gram Positive DNA Purification Kit (Epicentre MGP04100). One µl of lysozyme and five µl each of mutanolysin and lysostaphin were added to the standard lysis solution for one hour at 37°C, followed by 30 min incubation on ice after the MPC-step.

Genome sequencing and assembly

The genome of strain AQ1.S1T was sequenced using a combination of Illumina and 454 technologies. An Illumina GAii shotgun library with reads of 152 Mb, a 454 Titanium draft library with average read length of 320 bases, and a paired end 454 library with average insert size of 15 kb were generated for this genome. All general aspects of library construction and sequencing can be found at http://www.jgi.doe.gov/. Illumina sequencing data were assembled with VELVET, and the consensus sequences were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data. Draft assemblies were based on 177 Mb of 454 draft data and on 454 paired end data. The Newbler parameters were -consed -a 50 -l 350 -g -m -ml 20. The initial assembly contained 20 contigs in 1 scaffold. The initial 454 assembly was converted into a phrap assembly by making fake reads from the consensus and collecting the read pairs in the 454 paired end library. The Phred/Phrap/Consed software package (http://www.phrap.com) was used for sequence assembly and quality assessment [29] in the following finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution (http://www.jgi.doe.gov), Dupfinisher [29], or sequencing cloned bridging PCR fragments with subcloning or transposon bombing (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J.-F. Chan, unpublished). A total of 32 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to improve the final consensus quality using an in-house developed tool (the Polisher [30]). The error rate of the final genome sequence is less than 1 in 100,000.
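The shredding of a consensus sequence into overlapping fake reads can be illustrated with a minimal Python sketch. This is not the JGI implementation: the function name shred_consensus is hypothetical, and the 500 bp overlap is an assumed value chosen for illustration, since the text gives only the 1.5 kb read length.

```python
def shred_consensus(consensus: str, read_len: int = 1500, overlap: int = 500) -> list[str]:
    """Cut a consensus sequence into overlapping pseudo-reads.

    read_len follows the 1.5 kb stated in the text; overlap is an
    assumed illustration value, not one taken from the paper.
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and smaller than read_len")
    if not consensus:
        return []

    step = read_len - overlap  # distance between consecutive read starts
    reads = []
    start = 0
    while True:
        # Slicing past the end is safe in Python; the last read may be shorter.
        reads.append(consensus[start:start + read_len])
        if start + read_len >= len(consensus):
            break  # this read already reaches the end of the consensus
        start += step
    return reads


if __name__ == "__main__":
    # Toy 4 kb consensus built deterministically for demonstration only.
    toy = "".join("ACGT"[(i * 7 + i // 3) % 4] for i in range(4000))
    for i, read in enumerate(shred_consensus(toy)):
        print(i, len(read))
```

Consecutive fake reads share their overlapping ends, which lets an overlap-based assembler join them with the 454 reads covering the same region.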

Genome annotation

Genes were identified using Prodigal [31] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [32]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation was performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [33].

Genome properties

The genome consists of a 1,875,953 bp long chromosome with a 35.7% G+C content (Table 3 and Figure 3). Of the 2,061 genes predicted, 2,009 were protein-coding genes, and 52 RNAs; 79 pseudogenes were also identified. The majority of the protein-coding genes (56.2%) were assigned a putative function while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
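As an illustration of how summary statistics of this kind are derived, the following Python sketch computes the G+C content of a sequence and checks that the reported gene counts are internally consistent. The function name gc_content and the decision to ignore ambiguous bases such as N are assumptions made for this example, not details of the IMG-ER pipeline.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; ignoring other
    symbols such as N is an assumption made for this sketch.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Gene counts as reported in the text above.
TOTAL_GENES = 2061
PROTEIN_CODING = 2009
RNA_GENES = 52

if __name__ == "__main__":
    # The two gene classes should add up to the reported total.
    assert PROTEIN_CODING + RNA_GENES == TOTAL_GENES
    coding_share = 100.0 * PROTEIN_CODING / TOTAL_GENES
    print(f"protein-coding share of genes: {coding_share:.1f}%")
    print(f"G+C of a toy sequence: {gc_content('GGCCAATT'):.1f}%")
```

Applied to the finished chromosome sequence, a function of this kind would be expected to reproduce the G+C value listed in Table 3, up to the treatment of ambiguous bases.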

Figure 3.

Graphical circular map of the genome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 3. Genome Statistics
Table 4. Number of genes associated with the general COG functional categories

Insights from the genome sequence

Even though the tree depicted in Figure 2 is not particularly well resolved, the fact that I. aggregans does not cluster with the Desulfurococcaceae in 16S rRNA gene sequence-based phylogenies calls for a more detailed whole-genome-based analysis [34]. Both in Figure 2 and in the All-Species-Living-Tree [35], I. aggregans is located deep on the branch leading to the Thermoproteaceae (and Sulfolobaceae). Fortunately, the class Thermoprotei within the phylum Crenarchaeota already offers a reasonably large set of reference genomes required for such an analysis. We thus assembled a dataset comprising all publicly available genomes from the set of organisms represented in the 16S rRNA tree (Fig. 2). Pairwise distances were calculated using the GBDP algorithm [36,37], which has recently been used to mimic DNA-DNA-hybridization values [37,38]. Here we applied the logarithmic version of formula (3) in [34,38]. The NeighborNet algorithm as implemented in SplitsTree version 4.10 [39] was used to infer a phylogenetic network from the distances, which is shown in Fig. 4.

The results indicate that the placement of I. aggregans as sister group of Thermoproteales (Fig. 2) is an artifact of the 16S rRNA analysis. The whole-genome network, while showing some conflicting signal close to the backbone, is in agreement with the splitting of the considered genera into the orders Desulfurococcales and Thermoproteales. However, the analysis provides some evidence that Aeropyrum pernix (Desulfurococcaceae) is more closely related to Pyrodictiaceae (represented by Hyperthermus and Pyrolobus) than to the remaining Desulfurococcaceae. The numerous additional type strain genome sequencing projects in the Desulfurococcales (Fig. 2) are likely to shed even more light on the phylogenetic relationships within this group by enabling future whole-genome phylogenies based on many more taxa.

A separate status of I. aggregans within the Desulfurococcaceae is supported by a lack of genes encoding membrane-bound multienzyme complexes that are thought to participate in the energy metabolism of members of this group. Operons encoding a MBX-related ferredoxin-NADPH oxidoreductase and a dehydrogenase-linked MBX complex are lacking in I. aggregans, although both are present in the completed genome sequences of Thermosphaera aggregans [24], Staphylothermus marinus [25] and Desulfurococcus kamchatkensis. The genome of A. pernix also lacks genes for the MBH-related energy-coupling hydrogenase, which are found in most members of the Desulfurococcaceae including I. aggregans (Igag_1902 - Igag_1914).

Figure 4.

Phylogenetic network inferred from whole-genome (GBDP) distances, showing the relationships between Desulfurococcaceae (Aeropyrum, Ignisphaera, Staphylothermus and Thermosphaera), Pyrodictiaceae (Hyperthermus and Pyrolobus), Thermoproteaceae (Caldivirga, Pyrobaculum, Thermoproteus and Vulcanisaeta) and Thermofilaceae (Thermofilum).