Introduction

Strain ALA-1T (= DSM 12261) is the type strain of the species Aminobacterium colombiense, which is the type species of the genus Aminobacterium [1,2]. The genus name refers to the organism's ability to ferment amino acids, and the species name refers to the origin of the isolate, Colombia [1]. Currently, the genus Aminobacterium consists of only two species [1,3,4]. Strain ALA-1T was isolated from an anaerobic dairy wastewater lagoon in 1998 or before [1]. At present, strain ALA-1T is the only known isolate of this species. Highly similar (98%), nearly complete (>1,400 bp) 16S rRNA gene sequences from uncultured clones have frequently been obtained from anaerobic habitats, e.g., from anaerobic municipal solid waste samples in France [5], from a biogas fermentation enrichment culture in China (GU476615), from swine wastewater anaerobic digestion in a UASB reactor in China (FJ535518), and from a mesophilic anaerobic BSA digester in Japan [6], suggesting a substantial contribution of Aminobacterium to anaerobic prokaryotic communities. The type strain of the only other species in the genus, A. mobile [3], shares 95% 16S rRNA sequence identity with A. colombiense, whereas the type strains of the other species in the family Synergistaceae share between 84.3 and 88.3% 16S rRNA sequence identity [7]. Environmental samples and metagenomic surveys detected only one significantly similar phylotype (BABF01000111, 92% sequence similarity), in a human gut microbiome [7], with all other phylotypes sharing less than 84% 16S rRNA gene sequence identity, indicating a rather limited general ecological importance of the members of the genus Aminobacterium (status April 2010). Here we present a summary classification and a set of features for A. colombiense ALA-1T, together with a description of the complete genome sequencing and annotation.

Classification and features

Figure 1 shows the phylogenetic neighborhood of A. colombiense ALA-1T in a 16S rRNA based tree. The sequences of the three identical copies of the 16S rRNA gene in the genome differ by 14 nucleotides (0.9%) from the previously published 16S rRNA sequence generated from DSM 12261 (AF069287), which contains 3 ambiguous base calls. These differences are most likely due to sequencing errors in AF069287.
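
A difference count of this kind can be illustrated by comparing two aligned sequences position by position. The following is a minimal sketch, not the procedure used for this genome: the function names, the decision to skip gaps and ambiguous base calls, and the toy sequences are all assumptions made for illustration.

```python
# Unambiguous nucleotide symbols; anything else (N, R, Y, '-') is skipped.
UNAMBIGUOUS = set("ACGTU")


def count_differences(seq_a, seq_b):
    """Return (mismatches, compared_positions) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip columns where either sequence has a gap or ambiguity code.
        if a not in UNAMBIGUOUS or b not in UNAMBIGUOUS:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    return mismatches, compared


def percent_difference(seq_a, seq_b):
    """Percentage of compared positions that differ."""
    mismatches, compared = count_differences(seq_a, seq_b)
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy sequences (hypothetical, not the real 16S rRNA data).
genome_copy = "ACGTACGTAC"
reference = "ACGTNCGTTC"
print(count_differences(genome_copy, reference))
```

By the same arithmetic, 14 differences at 0.9% correspond to on the order of 1,500 compared positions, which is consistent with a nearly full-length 16S rRNA gene.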

Figure 1.
Phylogenetic tree highlighting the position of A. colombiense ALA-1T relative to the other type strains within the phylum Synergistetes. The tree was inferred from 1,282 aligned characters [8,9] of the 16S rRNA gene sequence under the maximum likelihood criterion [10] and rooted in accordance with the current taxonomy [11]. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 250 bootstrap replicates [12] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [13] are shown in blue, published genomes in bold, e.g. the recently published GEBA genome of Thermanaerovibrio acidaminovorans [14].

The cells are rod-like, occasionally slightly curved, 3–4 µm long and 0.2–0.3 µm wide (Figure 2 and Table 1) [1]. The colonies are up to 1.0 mm in diameter and are round, smooth, lens-shaped, and white [1]. Strain ALA-1T requires yeast extract for growth and ferments serine, glycine, threonine, and pyruvate in its presence [1]. Poor growth is obtained on casamino acids, peptone, biotrypcase, cysteine and α-ketoglutarate [1]. The fermentation end-products include acetate and H2, and also propionate in the case of α-ketoglutarate fermentation. Carbohydrates (such as glucose, saccharose, ribose, xylose, cellobiose, melibiose, maltose, galactose, mannose, arabinose, rhamnose, lactose, sorbose and mannitol), gelatin, casein, glycerol, ethanol, acetate, propionate, butyrate, lactate, citrate, fumarate, malate, succinate and the other amino acids tested are not utilized [1].

Figure 2.
Scanning electron micrograph of A. colombiense ALA-1T

Table 1. Classification and general features of A. colombiense ALA-1T according to the MIGS recommendations [16]

As is typical for organisms from anoxic habitats, strain ALA-1T engages in syntrophic interactions: alanine, glutamate, valine, isoleucine, leucine, methionine, aspartate and malate are oxidized only in the presence of the hydrogenotroph Methanobacterium formicicum strain DSM 1525 [1]. In addition, the utilization of cysteine, threonine and α-ketoglutarate is improved in the presence of M. formicicum [1]. An 80% hydrogen atmosphere (supplied as H2-CO2 (80:20) at 2 bar pressure) inhibits growth of strain ALA-1T on threonine and α-ketoglutarate, whereas glycine degradation is not affected [1]. Serine and pyruvate degradation are partially affected by the presence of hydrogen. Sulfate, thiosulfate, elemental sulfur, sulfite, nitrate, and fumarate are not utilized as electron acceptors [1]. Strain ALA-1T does not perform the Stickland reaction when alanine is provided as an electron donor and glycine, serine, arginine or proline is provided as an electron acceptor.

As noted above, alanine is oxidized only in the presence of the hydrogenotroph M. formicicum, which consumes the H2 produced [1]. In the absence of an H2-consuming organism, the H2 partial pressure would rapidly reach a level that thermodynamically inhibits further fermentation [21]. Adams and colleagues used an H2-purging culture vessel in place of the H2-consuming syntrophic partner in order to study in detail the energetics of alanine consumption by strain ALA-1T in pure culture [21].
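
The thermodynamic argument can be made concrete with the H2-dependent term of the Gibbs free energy of reaction, n·R·T·ln p(H2). The sketch below is illustrative only and does not reproduce the calculations in [21]. It assumes that two H2 are released per alanine oxidized to acetate (the commonly cited stoichiometry), that the temperature is 37°C, and that all other reactants and products are held at fixed activities; the function name `h2_term_kj_per_mol` is hypothetical.

```python
import math

R_J_PER_MOL_K = 8.314462618  # molar gas constant, J mol^-1 K^-1


def h2_term_kj_per_mol(p_h2_atm, n_h2=2, temp_k=310.15):
    """Contribution of the H2 partial pressure to the reaction free energy.

    Returns n_h2 * R * T * ln(p_h2) in kJ per mol of substrate.
    n_h2=2 and temp_k=310.15 (37 degrees C) are assumptions, not values
    taken from the text.
    """
    if p_h2_atm <= 0:
        raise ValueError("partial pressure must be positive")
    return n_h2 * R_J_PER_MOL_K * temp_k * math.log(p_h2_atm) / 1000.0


# Each tenfold drop in p(H2) makes the reaction more favourable by the
# same fixed amount under the assumptions above.
for p in (1.0, 1e-2, 1e-4):
    print(f"p(H2) = {p:g} atm -> {h2_term_kj_per_mol(p):.1f} kJ/mol")
```

Under these assumptions, every tenfold decrease in p(H2) lowers the free energy of reaction by about 11.9 kJ per mol of alanine, which is why removing H2, whether by a methanogenic partner or by purging, can make an otherwise unfavourable oxidation proceed.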

Strain ALA-1T is non-motile [1], whereas the other species in the genus, A. mobile, is motile by means of lateral flagella [3]. A parallel situation exists in the genus Anaerobaculum (Figure 1), where Anaerobaculum thermoterrenum is non-motile [22] but Anaerobaculum mobile is motile by means of lateral flagella [23]. In fact, non-motility versus motility by means of lateral flagella is heterogeneously distributed among the organisms depicted in Figure 1. This may suggest that the last common ancestor of the group shown in Figure 1 was motile by flagella and that the selection pressure for functioning flagella is currently relaxed in this group, leading to mutational inactivation of the flagella in individual strains. Interestingly, the genome annotation gives no indication of any genes related to flagellar assembly. The only genes related to cellular motility belong to the type II secretory pathway and to pilus assembly. This is surprising, as it is hardly probable that strain ALA-1T lost all genes for flagellar assembly after its evolutionary separation from its closely related sister species A. mobile. A different outcome has been observed in the non-motile strain Alicyclobacillus acidocaldarius 104-IAT, which has several motile sister species in the genus Alicyclobacillus [24]: the genome of this non-motile strain still contains most of the genes needed for flagellar assembly [24]. Thus, the genotypic status of flagellar motility in the genus Aminobacterium remains unclear.

Chemotaxonomy

Ultrathin sections of strain ALA-1T revealed a thick cell wall with an external S-layer similar to that of Gram-positive type cell walls [1]. Unfortunately, no chemotaxonomic data have been published for the genus Aminobacterium. Among the organisms depicted in Figure 1, chemotaxonomic data are available for Dethiosulfovibrio peptidovorans, Jonquetella anthropi, Pyramidobacter piscolens, Cloacibacillus evryensis, and Synergistes jonesii, though the data are not always present in the original species description publications [25–27]. The organisms shown in Figure 1 are highly similar in major phenotypes, such as being strictly anaerobic, staining Gram-negative within the usually Gram-positive Firmicutes, and mostly also in their ability to degrade amino acids, which may justify a comparison of their chemotaxonomic features as well. The major fatty acids in different strains of Jonquetella are iso-C15:0 (25-43%) and C16:0 (14-21%); other iso-branched and unbranched fatty acids are present in smaller amounts, and anteiso-C15:0 is below 5% [26]. In Dethiosulfovibrio, the major fatty acid is iso-C15:0 (59.7%), followed by C18:0 (9.0%) and C16:0 (8.5%) [26]. Dethiosulfovibrio differs qualitatively from Jonquetella by the absence of anteiso-branched fatty acids and by the presence of C18:1ω9c (3.0%) [26]. The major fatty acids in two strains of P. piscolens are C14:0 (16-19%) and C13:0 (12-14%) [27]. The cellular fatty acids of C. evryensis are characterized by a mixture of saturated, unsaturated, hydroxy- and cyclopropane fatty acids [25]. The major fatty acids are iso-C15:0 (16.6%), iso-C15:0 3-OH (12.4%) and C17:1ω6c (9.5%) [25]; the major fatty acids in its closest relative, Synergistes jonesii, are C15:0 (16.0%), C20cyc (14.0%) and C17:1ω6c (9.0%) [25]. The polar lipid profile of C. evryensis (data not shown in the original publication) revealed diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine and phosphatidylmonomethylamine [25].

Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its phylogenetic position [28], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [29]. The genome project is deposited in the Genome OnLine Database [13] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

A. colombiense ALA-1T, DSM 12261, was grown anaerobically in DSMZ medium 846 (Anaerobic serine/arginine medium) [30] at 37°C. DNA was isolated from 1–1.5 g of cell paste using the MasterPure Gram Positive DNA Purification Kit (Epicentre MGP04100), with an additional 1 µl lysozyme and 5 µl mutanolysin added to the standard lysis solution, followed by incubation for 40 min at 37°C.

Genome sequencing and assembly

The genome was sequenced using a combination of Illumina and 454 technologies. An Illumina GAii shotgun library that yielded 909 Mb of reads, a 454 Titanium draft library with an average read length of 283 bases, and a paired end 454 library with an average insert size of 12 kb were generated for this genome. All general aspects of library construction and sequencing can be found at http://www.jgi.doe.gov/. Draft assemblies were based on 169 Mb of 454 draft data and 454 paired end data (543,550 reads). Newbler (version 2.0.0-PostRelease-10/28/2008) was used with the parameters -consed -a 50 -l 350 -g -m -ml 20. The initial Newbler assembly contained 18 contigs in 1 scaffold. The initial 454 assembly was converted into a phrap assembly by making fake reads from the consensus and collecting the read pairs in the 454 paired end library. Illumina sequencing data were assembled with VELVET [31], and the consensus sequences were shredded into 1.5 kb overlapping fake reads and assembled together with the 454 data. The Phred/Phrap/Consed software package (www.phrap.com) was used for sequence assembly and quality assessment in the subsequent finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution (http://www.jgi.doe.gov/), Dupfinisher, or sequencing of cloned bridging PCR fragments with subcloning or transposon bombing [32]. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J-F.Cheng, unpublished). A total of 113 additional Sanger reactions were necessary to close gaps and to raise the quality of the finished sequence. The error rate of the completed genome sequence is less than 1 in 100,000.
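
The shredding of a consensus sequence into overlapping fake reads can be sketched as follows. This is an illustration of the general idea only and is not the JGI pipeline code: the function name `shred_consensus` is hypothetical, and the 500 bp overlap is an assumed value, since the text specifies only the 1.5 kb read length.

```python
def shred_consensus(consensus, read_len=1500, overlap=500):
    """Cut a consensus sequence into overlapping fake reads.

    read_len=1500 follows the 1.5 kb figure in the text; overlap=500 is
    an assumption made for this sketch.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be >= 0 and smaller than read_len")
    step = read_len - overlap
    reads = []
    start = 0
    while start < len(consensus):
        reads.append(consensus[start:start + read_len])
        # Stop once the current read reaches the end of the consensus.
        if start + read_len >= len(consensus):
            break
        start += step
    return reads


# Toy usage with a short string and small window so the overlap is visible.
print(shred_consensus("ABCDEFGHIJ", read_len=4, overlap=2))
```

Because consecutive fake reads share sequence, a downstream assembler can rejoin them and merge them with reads from the other platform, which is the purpose the shredding serves here.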

Genome annotation

Genes were identified using Prodigal [33] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [34]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [35].

Genome properties

The genome consists of a 1,980,592 bp long chromosome with an overall GC content of 45.3% (Table 3 and Figure 3). Of the 1,970 genes predicted, 1,914 were protein-coding genes and 56 were RNA genes; 38 pseudogenes were also identified. The majority of the protein-coding genes (77.2%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.
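
The figures above can be cross-checked with simple arithmetic, and GC content can be computed directly from a sequence. The sketch below is a hedged illustration rather than the method used for Table 3: the function name `gc_content` is hypothetical, and the estimated count of genes with a putative function is derived from the rounded percentage quoted in the text, so it is approximate.

```python
# Values quoted in the text.
TOTAL_GENES = 1970
PROTEIN_CODING = 1914
RNA_GENES = 56
FUNCTION_ASSIGNED_PCT = 77.2  # percent of protein-coding genes


def gc_content(sequence):
    """Percentage of G and C among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / acgt


# Protein-coding and RNA genes should add up to the total gene count.
genes_consistent = PROTEIN_CODING + RNA_GENES == TOTAL_GENES

# Approximate counts derived from the rounded percentage (assumption:
# the percentage refers to protein-coding genes, as the text states).
with_function = round(PROTEIN_CODING * FUNCTION_ASSIGNED_PCT / 100)
hypothetical = PROTEIN_CODING - with_function

print(genes_consistent, with_function, hypothetical)
print(gc_content("ATGCGC"))  # toy sequence, not genome data
```

Applied to the full chromosome sequence, `gc_content` would be expected to return a value close to the reported 45.3%.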

Figure 3.
Graphical circular map of the genome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 3. Genome Statistics
Table 4. Number of genes associated with the general COG functional categories