Introduction

Strain Sjm18-20T (=NBRC 101213T =DSM 18026T) is the type strain of the species Oscillibacter valericigenes, which is the type species of the monotypic genus Oscillibacter Iino et al. 2007 [1]. The strain was isolated from the alimentary canal of a Japanese corbicula clam (Corbicula japonica) collected on a seacoast in Shimane Prefecture in Japan. The strain belongs to clostridial cluster IV, one of the 19 clusters of clostridial bacteria proposed based on 16S rRNA gene sequences [2]. Clostridial cluster IV includes phenotypically heterogeneous bacteria, most of which were isolated from digestive tracts and feces of various organisms as well as from anaerobic sewage sludge. While draft genome sequences of some species affiliated with cluster IV, including those obtained as reference sequences for human microbiome projects, have been published, complete genome sequences have thus far been limited to those of the ruminal cellulolytic bacterium Ruminococcus albus 7 [3] and the ethanologenic sludge bacterium Ethanoligenens harbinense YUAN-3, both of which are phylogenetically distant from O. valericigenes, with 16S rRNA gene sequence similarities of 86% and 84%, respectively. In addition, the 16S rRNA gene sequence of strain Sjm18-20T is only distantly related to those of the other species within the cluster (Figure 1), with similarity values of less than 91%. Instead, the 16S rRNA gene sequence is most similar to those of uncultured bacteria in digestive tracts and feces of both herbivorous and omnivorous animals. The latter include the large cells enriched from sheep rumen contents by flow cytometric sorting, which are thought to represent Oscillospira guilliermondii Chatton and Perard 1913 [5]. Although O. guilliermondii was first described nearly a century ago [6] and has attracted much attention due to its conspicuous morphology, growth in pure culture has not been achieved.
In the current NCBI taxonomy database [7], the family Oscillospiraceae Peshkoff 1940 [8] is tentatively classified within the order Clostridiales to accommodate O. valericigenes and related environmental samples, although there is no type strain for the type species Oscillospira guilliermondii. Accordingly, the strain Sjm18-20T is currently the only strain in this family having a validly published name.

Figure 1.

Phylogenetic tree highlighting the position of O. valericigenes strain Sjm18-20T relative to other representative type strains within the clostridial cluster IV. The tree was constructed by the neighbor-joining method [4] based on an alignment of 1,339 bp 16S rRNA gene sequences. Corresponding INSDC accession numbers are shown in parentheses. Numbers at nodes indicate support values obtained from 1,000 bootstrap replications. Type strains within the clostridial cluster III were used as the outgroup. Species for which draft assembly sequences are available are labeled with one asterisk, while species for which complete genome sequences are available are labeled with two asterisks. INSDC accession numbers for draft and complete genome sequences are as follows: A. colihominis, ABGD00000000; C. leptum, ABCB00000000; C. methylpentosum, ACEC00000000; E. harbinense, CP002400; E. siraeum, ABCA00000000; F. prausnitzii, ACOP00000000; F. plautii, AGCK00000000; O. valericigenes, AP012044; P. capillosus, AAXG00000000; R. albus, CP002403; R. bromii, FP929051; R. flavefaciens, ACOK00000000; S. variabile, ACBY00000000; C. cellulolyticum, CP001348; C. thermocellum, CP000568.
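The 16S rRNA gene similarity values quoted in this section (86%, 84% and less than 91%) are pairwise identities between aligned sequences. The text does not state how these values were calculated, so the following Python sketch only illustrates one common convention: matching positions are counted over the alignment columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same alignment and therefore
    have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # gap-free columns
    matches = 0   # gap-free columns with identical bases
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (illustrative only, not real 16S rRNA gene data).
seq_a = "ACGTACGT-A"
seq_b = "ACGAACGTTA"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Other conventions (for example, counting gap columns as mismatches) give slightly different values, so similarity figures are only comparable when computed the same way.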

Organism information

Strain Sjm18-20T is a mesophilic, neutrophilic, strictly anaerobic bacterium with features as summarized in Table 1 [1]. Unlike other clostridial bacteria, which are typically characterized as low G+C content, Gram-positive, endospore-forming anaerobes, Sjm18-20T is Gram-stain negative and non-sporulating. Cells are straight to slightly curved rods, 0.4–0.6 × 2.5–6.0 µm in size. Cells elongate after prolonged cultivation and often reach 30 µm in length. Optimum growth is observed at 30°C and pH 6.0–6.5. The strain tolerates up to 4% NaCl, but growth is also observed in the absence of NaCl.

Table 1. Classification and general features of Oscillibacter valericigenes Sjm18-20T

Cells are motile with oscillatory movements. Electron microscopic observation demonstrated the presence of peritrichous flagella [1]. In agreement with this observation, the genome encodes genes necessary for flagellar synthesis and chemotaxis, as is typical in many Gram-positive bacteria. In contrast, while some clostridial bacteria, including the pathogenic species Clostridium perfringens, are known to utilize type IV pili for their gliding motility [20], neither genes encoding the constituents of type IV pili, nor the gld motility genes of Flavobacterium johnsoniae [21], were found in the Sjm18-20T genome, suggesting that flagella are solely responsible for the oscillatory movements.

Strain Sjm18-20T grows poorly even in medium supplemented with 0.5% each of yeast extract and polypeptone, with a generation time of 18.3 hours under optimum growth conditions [1]. Based on the genome sequence, strain Sjm18-20T appears able to synthesize most amino acids, with the exception of branched-chain amino acids. The genome does, however, encode several ABC transporters possibly involved in the uptake of branched-chain amino acids (OBV_11160-11200, OBV_36860-36900 and OBV_40040-40050).
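To put the reported generation time of 18.3 hours in perspective, the short Python sketch below converts it into a specific growth rate and into the time needed for a given fold increase in cell number. It assumes ideal, uninterrupted exponential growth, which is a simplification; the function names are hypothetical.

```python
import math

GENERATION_TIME_H = 18.3  # generation (doubling) time reported for the strain


def specific_growth_rate(generation_time_h: float) -> float:
    """Specific growth rate (per hour) for a given doubling time: ln(2) / g."""
    return math.log(2) / generation_time_h


def fold_increase(elapsed_h: float, generation_time_h: float) -> float:
    """Fold increase in cell number after elapsed_h hours: 2 ** (t / g)."""
    return 2 ** (elapsed_h / generation_time_h)


def hours_for_fold_increase(fold: float, generation_time_h: float) -> float:
    """Hours of exponential growth needed for a given fold increase."""
    return generation_time_h * math.log2(fold)


print(f"specific growth rate: {specific_growth_rate(GENERATION_TIME_H):.4f} per hour")
print(f"hours for ten doublings: {hours_for_fold_increase(2 ** 10, GENERATION_TIME_H):.1f}")
```

Under this assumption, ten doublings (about a 1,000-fold increase) take roughly 183 hours, or more than a week.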

The strain grows fermentatively and produces acids from D-glucose, L-arabinose, D-ribose and D-xylose, with n-valeric acid being the major end product from glucose [1]. Consistent with these observations, genes encoding catabolic enzymes and possible transporters for these sugars could be assigned on the genome. However, we could not identify a gene encoding the authentic form of enolase (EC 4.2.1.11), which catalyzes the penultimate step of glycolysis. Trials of genomic PCR using degenerate primers designed based on the enolase gene sequences of related clostridial bacteria also failed (data not shown). Considering the fermentative phenotype of the strain, either an alternative enolase-like enzyme or a novel metabolic pathway which directs the glycolytic flow towards the synthesis of pyruvate could be present.

Phylogenetic analysis based on 16S rRNA gene sequences unequivocally placed strain Sjm18-20T within the clostridial cluster IV [1] (Figure 1). In addition, phylogenetic analysis based on protein-coding genes such as ileS, valS, gyrB and rplKLM, which were extracted from genomic sequences, consistently placed strain Sjm18-20T within the cluster IV (data not shown).

Genome sequencing information

Genome project history

O. valericigenes Sjm18-20T was selected for sequencing because of its isolated phylogenetic position and characteristics which distinguish this strain from other described clostridial species. Table 2 presents the project information and its association with MIGS version 2.0 compliance [25].

Table 2. Project information

Growth conditions and DNA isolation

O. valericigenes Sjm18-20T cells were grown in a 200 ml volume at 30°C under N2 atmosphere in GYP medium in which air had been replaced with nitrogen gas by flushing [1]. DNA was isolated from 1 g of wet cells by manual extraction after lysis with lysozyme and SDS.

Genome sequencing and assembly

The genome of O. valericigenes Sjm18-20T was sequenced using the conventional whole-genome shotgun sequencing method. DNA shotgun libraries with average insert sizes of 1.7 kb and 4.6 kb were generated in pUC18 (TaKaRa), while a fosmid library with an average insert size of 40 kb was constructed in pCC1FOS (EPICENTRE) as described previously [26]. A total of 37,824 clones (20,352, 12,288 and 5,184 clones from libraries with 1.7 kb, 4.6 kb and 40 kb inserts, respectively) were sequenced from both ends of the inserts on an ABI 3730xl DNA Analyzer (Applied Biosystems). Sequence reads were trimmed at a Phred score threshold of 20 and assembled using the Phrap and CONSED assembly tools [22,23]. Optical Mapping (OpGen) was used for alignment and validation of contigs. Gaps between contigs were closed by sequencing PCR products that bridge two neighboring contigs. Finally, every base of the genome was confirmed to have been sequenced from multiple clones, either from both directions with a Phrap quality score ≥ 70 or from one direction with a Phrap quality score ≥ 40.
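The trimming threshold of 20 refers to the Phred scale, on which a quality score Q corresponds to an estimated base-call error probability of 10^(-Q/10); a score of 20 therefore means roughly one error per 100 bases. The Python sketch below converts scores to error probabilities and trims low-quality bases from both ends of a read. The trimming rule is a deliberately simple illustration and is not the procedure implemented in Phred or Phrap; the function names and the example read are hypothetical.

```python
def phred_to_error_prob(q: float) -> float:
    """Estimated base-call error probability for a Phred quality score q."""
    return 10 ** (-q / 10)


def trim_read_ends(bases: str, quals: list[int], threshold: int = 20):
    """Remove bases from both ends of a read until a base with quality >= threshold.

    Simplified illustration only: internal low-quality bases are kept.
    Returns the trimmed bases and their quality scores.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")

    start, end = 0, len(quals)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return bases[start:end], quals[start:end]


# Hypothetical read with poor quality at both ends.
read = "ACGTTGCA"
scores = [8, 15, 30, 35, 40, 22, 12, 5]
trimmed_bases, trimmed_scores = trim_read_ends(read, scores)
print(trimmed_bases, trimmed_scores)
print(f"error probability at Q20: {phred_to_error_prob(20):.2%}")
```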

Genome annotation

Complete sequences of the chromosome and the plasmid were analyzed using Glimmer3 [24] for predicting protein-coding genes, tRNAscan-SE [27] and ARAGORN [28] for tRNA genes, and RNAmmer [29] for rRNA genes. The functions of predicted protein-coding genes were assigned manually, using the in-house genome annotation system OCSS (unpublished), by comparison with the UniProt [30], InterPro [31], HAMAP [32] and KEGG [33] databases.

Genome properties

The genome of O. valericigenes Sjm18-20T consisted of a circular chromosome of 4,410,036 bp and a circular plasmid of 60,586 bp (Figure 2). The chromosome was predicted to contain 4,656 protein-coding genes, 58 tRNA genes, 9 rRNA genes and 5 other RNA genes, whereas the plasmid contained 67 predicted protein-coding genes. Of the total of 4,723 protein-coding genes predicted in the genome, 2,483 (52.6%) were assigned known functions, 1,499 (31.7%) were similar to genes with unknown function in other bacterial genomes, and 741 (15.7%) had no similarity with other genes. Average G+C contents of the chromosome and the plasmid were 53.3% and 43.3%, respectively. The properties and the statistics of the genome are summarized in Tables 3 and 4.
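The gene counts and percentages in the preceding paragraph can be cross-checked with a few lines of Python. The numbers are taken directly from the text; the variable names and category labels are shorthand introduced only for this check.

```python
# Protein-coding gene counts as reported in the text.
chromosome_cds = 4656
plasmid_cds = 67
total_cds = chromosome_cds + plasmid_cds

# Breakdown of protein-coding genes by annotation status (labels are shorthand).
annotation_counts = {
    "known function": 2483,
    "similar to genes of unknown function": 1499,
    "no similarity to other genes": 741,
}


def percent(count: int, total: int) -> float:
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The three categories should account for every predicted protein-coding gene.
assert sum(annotation_counts.values()) == total_cds

print(f"total protein-coding genes: {total_cds}")
for label, count in annotation_counts.items():
    print(f"{label}: {count} ({percent(count, total_cds)}%)")
```

The totals agree with the text: the chromosomal and plasmid genes sum to 4,723, and the three categories reproduce the reported 52.6%, 31.7% and 15.7%.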

Figure 2.

Circular representation of the O. valericigenes Sjm18-20T chromosome and the plasmid. From outside to the center: circles 1 and 2, predicted protein coding genes on the forward and reverse strands, respectively; circle 3, tRNA genes; circle 4, rRNA operons; circle 5, G+C content; circle 6, GC skew. Predicted protein coding genes are colored according to their assigned COG functional categories (see Table 4).
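Circles 5 and 6 of Figure 2 plot G+C content and GC skew along the replicons. As a minimal sketch of how such tracks are commonly computed, the Python code below calculates G+C content as (G+C)/length and GC skew as (G-C)/(G+C) in sliding windows. The window and step sizes used for the figure are not stated in the text, so the values here, like the function names and the toy sequence, are assumptions for illustration only.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (ambiguous bases count toward the length)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); returns 0.0 if the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def sliding_windows(seq: str, window: int, step: int):
    """Yield (start, subsequence) for each full-length window along a linear sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


# Toy sequence; window and step sizes are arbitrary illustrative choices.
toy = "GGGCATATGCCCGGATGCGC"
for start, win in sliding_windows(toy, window=10, step=5):
    print(start, f"GC={gc_content(win):.2f}", f"skew={gc_skew(win):+.2f}")
```

For a circular replicon, windows would also need to wrap around the origin of the sequence, which this linear sketch does not do.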

Table 3. Nucleotide content and gene count levels of the genome
Table 4. Number of genes associated with the 25 general COG functional categories

Similarities to O. guilliermondii

Phylogenetic analysis based on 16S rRNA gene sequences revealed that O. valericigenes Sjm18-20T is closely related to the uncultivated cells thought to represent Oscillospira guilliermondii Chatton and Perard 1913 [5]. In addition, strain Sjm18-20T shares some phenotypic characteristics with those described for O. guilliermondii, including the elongated and often curved cell morphology, the oscillatory motility by means of peritrichous flagella, and the Gram-negative staining [1]. However, while O. guilliermondii was reported to form endospores, spore formation was not detected in O. valericigenes Sjm18-20T, either by microscopic observation or by heat treatment to test for heat-resistant bodies such as spores. In phylogenetically related bacteria such as Bacillus and Clostridium, phosphorylation of Spo0A, a master regulatory factor, is known to initiate the process of sporulation through the successive synthesis of sporulation stage-specific sigma factors. We found that the genome of strain Sjm18-20T encoded the Spo0A factor (OBV_15500) and all sporulation sigma factors known in other bacteria, i.e., sigma H (OBV_22080), sigma E (OBV_21490), sigma F (OBV_29180), sigma K (OBV_12200), and sigma G (OBV_24420), as well as other regulatory proteins related to the sigma cascade.

In contrast, genes necessary for the later stages of sporulation, i.e., the formation of cortex and spore coat, seemed either largely different or partly missing. For example, cotF, cotS and yabQ genes, widely found in the genomes of clostridial species, could not be found in the genome of strain Sjm18-20T. Strain Sjm18-20T might have the potential for sporulation, although it needs to be further investigated whether conditions exist under which this bacterium would actually sporulate.