Introduction

Anaerococcus pacaensis strain 9403502T (= CSUR P122 = DSM 26346) is the type strain of Anaerococcus pacaensis sp. nov. and a member of the genus Anaerococcus. This bacterium is a Gram-positive, anaerobic, non-spore-forming, indole-negative coccus that was isolated from a blood sample during a study prospecting anaerobic isolates from deep samples [1].

The “gold standard” method to define a new bacterial species or genus is DNA-DNA hybridization and G+C content determination [2]. These methods are expensive and poorly reproducible, and bacterial species can currently be classified with PCR and sequencing methods, particularly 16S rRNA gene sequencing with internationally validated cutoffs [3]. More recently, an increasing number of new bacterial genera and species have been described using high-throughput genome sequencing and mass spectrometric analyses, which give access to a wealth of genetic and proteomic information [4,5]. Previous studies have described new bacterial species and genera using genome sequencing, MALDI-TOF spectra and main phenotypic characteristics [6-23], and we propose here to describe a new species within the genus Anaerococcus in the same way.

Here we present a summary classification and a set of features for A. pacaensis sp. nov. strain 9403502T (= CSUR P122 = DSM 26346), together with the description of the complete genome sequencing and annotation. These characteristics support the circumscription of a novel species, Anaerococcus pacaensis sp. nov., within the genus Anaerococcus and the Clostridiales Family XI Incertae sedis.

The genus Anaerococcus was first described in 2001 [24] and belongs to the Clostridiales Family XI Incertae sedis. This family is defined mainly on the basis of phylogenetic analyses of 16S rRNA sequences, and all bacteria in the genus Anaerococcus are anaerobic Gram-positive cocci. Based on comparison of the 16S rRNA gene sequence, the species most closely related to Anaerococcus pacaensis sp. nov. is Anaerococcus prevotii. It was first described in 1948 by Foubert and Douglas [25] and later reclassified in the genus Anaerococcus [24]. The second most closely related species is A. octavius, which was first described as Peptostreptococcus octavius, isolated from a human sample in 1998 by Murdoch et al. [26]. It was later reclassified in the genus Anaerococcus as A. octavius [24].

Classification and features

A blood sample was collected from a patient during a study analyzing emerging anaerobes with MALDI-TOF and 16S rRNA gene sequencing [1]. The specimen was sampled in Marseille and preserved at −80°C after collection. Strain 9403502T (Table 1) was isolated in July 2009 by anaerobic cultivation on 5% sheep blood-enriched Columbia agar (BioMérieux, Marcy l’Etoile, France). This strain exhibited a 95% nucleotide sequence similarity with Anaerococcus prevotii [24,25]. This similarity value is lower than the threshold recommended to delineate a new species without carrying out DNA-DNA hybridization [38]. In the inferred phylogenetic tree, it forms a distinct lineage close to A. octavius (Figure 1).

Figure 1.

Phylogenetic tree highlighting the position of Anaerococcus pacaensis strain 9403502T relative to other type strains within the genus Anaerococcus. GenBank accession numbers are indicated in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the maximum-likelihood method within the MEGA 4 software [39]. Numbers at the nodes are bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree. Clostridium butyricum was used as the outgroup. The scale bar represents a 2% nucleotide sequence divergence.

Table 1. Classification and general features of Anaerococcus pacaensis strain 9403502T

Different growth temperatures (23°C, 25°C, 28°C, 32°C, 35°C, 37°C and 50°C) were tested. No growth occurred at 23°C, 25°C, 28°C or 50°C; growth occurred between 32°C and 37°C, and optimal growth was observed at 37°C.

Colonies are punctiform, very small, grey, dry and round on blood-enriched Columbia agar under anaerobic conditions using GENbag anaer (BioMérieux). Growth was tested on blood-enriched Columbia agar (BioMérieux), in BHI broth and in Trypticase soy (TS) broth, under anaerobic conditions using GENbag anaer (BioMérieux), under microaerophilic conditions using GENbag microaer (BioMérieux), and in the presence of air with 5% CO2. Growth was also tested under anaerobic conditions on BHI agar and on BHI agar supplemented with 1% NaCl. Growth was achieved only anaerobically, on blood-enriched Columbia agar, and weakly on BHI agar and on BHI agar supplemented with 1% NaCl after 72 h of incubation. Gram staining showed round, non-spore-forming, Gram-positive cocci (Figure 2). The motility test was negative. Cells grown anaerobically in TS broth have a mean diameter of 1.140 µm (min = 0.955 µm; max = 1.404 µm), as determined by electron microscopy after negative staining (Figure 3).

Figure 2.

Gram staining of A. pacaensis strain 9403502T

Figure 3.

Transmission electron microscopy of A. pacaensis strain 9403502T, using a Morgagni 268D (Philips) at an operating voltage of 60 kV. The scale bar represents 500 nm.

Strain 9403502T exhibited catalase activity but no oxidase activity. Using API 20A, only a weak positive reaction, for gelatinase, was observed. Using API ZYM, positive reactions were observed for alkaline phosphatase (5 nmol of hydrolyzed substrate), acid phosphatase (5 nmol), naphthol-AS-BI-phosphohydrolase (5 nmol), and hyaluronidase (40 nmol). Using API Rapid ID 32A, positive reactions were observed only for β-glucuronidase and pyroglutamic acid arylamidase. Regarding antibiotic susceptibility, A. pacaensis was susceptible to penicillin G, amoxicillin, cefotetan, imipenem, metronidazole and vancomycin. When compared with representative species within the genus Anaerococcus, A. pacaensis exhibits the phenotypic characteristics detailed in Table 2 [40].

Table 2. Differential characteristics of Anaerococcus pacaensis sp. nov., strain 9403502T, A. octavius strain NCTC 9810T, and A. tetradius strain DSM 2951T.

Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [41]. A pipette tip was used to pick one isolated bacterial colony from a culture agar plate and to spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonics, Germany). Ten distinct deposits were made for strain 9403502T from ten isolated colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of alpha-cyano-4-hydroxycinnamic acid in 50% acetonitrile and 2.5% trifluoroacetic acid) and allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot. The ten 9403502T spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 5,697 bacteria in the BioTyper database. The identification method includes m/z values from 3,000 to 15,000 Da. For every spectrum, at most 100 peaks were taken into account and compared with the spectra in the database. The resulting score determined whether the tested isolate could be identified: a score ≥ 2 with a validated species enabled identification at the species level; a score ≥ 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain 9403502T, the best score obtained was 1.265, which is not significant, suggesting that our isolate was not a member of a known species. Our database was incremented with the reference spectrum from strain 9403502T (Figure 4).
A dendrogram was constructed with the MALDI BioTyper software (version 2.0, Bruker), comparing the reference spectrum of strain 9403502T with reference spectra of 26 bacterial species, all belonging to the order Clostridiales. In this dendrogram, strain 9403502T appears as a separate branch within the genus Anaerococcus (Figure 5).
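The score thresholds described above amount to a simple decision rule. The following is a minimal Python sketch of that rule only; the function name `interpret_biotyper_score` and the returned labels are illustrative assumptions and are not part of the Bruker software.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a pattern-matching score to the identification level given in the text.

    Hypothetical helper: thresholds (2.0 and 1.7) are taken from the text,
    the labels are chosen here for illustration.
    """
    if score >= 2.0:
        return "species-level identification"
    if score >= 1.7:
        return "genus-level identification"
    return "no identification"


# Best score reported for strain 9403502T
print(interpret_biotyper_score(1.265))
```

With the best score reported for strain 9403502T (1.265), the rule falls into the last branch, consistent with the absence of a significant match.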

Figure 4.

Reference mass spectrum from A. pacaensis strain 9403502T. Spectra from 10 individual colonies were compared and a reference spectrum was generated.

Figure 5.

Dendrogram based on the comparison of the A. pacaensis strain 9403502T MALDI-TOF reference spectrum with those of 26 other species of the order Clostridiales.

Genome sequencing and annotation

Genome project history

The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the genus Anaerococcus, and is part of a study prospecting anaerobic bacteria in several clinical deep samples. It was the first genome of the new species Anaerococcus pacaensis sp. nov. and the seventh genome of Anaerococcus sp.

The GenBank accession number is CAJJ020000000 (CAJJ020000001-CAJJ020000053); the genome consists of 14 scaffolds with a total of 53 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance.

Growth conditions and DNA isolation

A. pacaensis sp. nov. strain 9403502T (= CSUR P122 = DSM 26346) was grown on blood agar medium at 37°C under anaerobic conditions. Eight Petri dishes were spread, and the growth was resuspended in 5 × 100 µL of G2 buffer. A first mechanical lysis was performed with glass powder on the FastPrep-24 device (Sample Preparation System, MP Biomedicals, USA) for 2 × 20 seconds. DNA was then treated with lysozyme (30 minutes at 37°C) and extracted with the BioRobot EZ1 Advanced XL (Qiagen). The DNA was then concentrated and purified on a QIAamp kit (Qiagen). The concentration, measured with the Quant-iT PicoGreen kit (Invitrogen) on the Genios Tecan fluorometer, was 15.7 ng/µL.

Genome sequencing and assembly

A 3-kb paired-end library was pyrosequenced on the 454 Roche Titanium. This project was loaded on a 1/4 region of a PTP PicoTiterPlate. Five micrograms of DNA were mechanically fragmented on the Hydroshear device (Digilab, Holliston, MA, USA) with an enrichment size of 3–4 kb. The DNA fragmentation was visualized with the Agilent 2100 BioAnalyzer on a DNA LabChip 7500, with an optimal size of 3.2 kb. The library was constructed according to the manufacturer's 454 Titanium paired-end protocol. Circularization and nebulization were performed and generated a pattern with an optimum at 604 bp. After PCR amplification through 15 cycles followed by double size selection, the single-stranded paired-end library was quantified on the Agilent 2100 BioAnalyzer on an RNA Pico 6000 LabChip at 91 pg/µL. The library concentration equivalence was calculated as 2.76E+08 molecules/µL. The library was stored at −20°C until use.
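The conversion from a mass concentration to a molecule count can be illustrated with a short calculation. The sketch below is an assumption-laden reconstruction, not the authors' stated procedure: it assumes an average mass of 328.3 g/mol per nucleotide of single-stranded DNA and takes the 604 bp optimum as the mean fragment length. Under these assumptions, the reported 91 pg/µL corresponds to roughly 2.76E+08 molecules/µL.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
NT_MASS_G_PER_MOL = 328.3     # ASSUMED average mass of one ssDNA nucleotide


def library_molecules_per_ul(conc_pg_per_ul: float, mean_length_nt: float) -> float:
    """Convert a single-stranded library concentration (pg/µL) to molecules/µL.

    Hypothetical helper; the per-nucleotide mass is an assumption, not a value
    given in the text.
    """
    grams_per_ul = conc_pg_per_ul * 1e-12                  # pg -> g
    grams_per_mole = mean_length_nt * NT_MASS_G_PER_MOL    # mass of one mole of fragments
    return grams_per_ul * AVOGADRO / grams_per_mole


# Values from the text: 91 pg/µL and a fragment-size optimum of 604 nt
molecules = library_molecules_per_ul(91.0, 604.0)
print(f"{molecules:.2e} molecules per microliter")
```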

The library was clonally amplified with 0.5 and 1 copies per bead (cpb) in 2 emPCR reactions per condition with the GS Titanium SV emPCR Kit (Lib-L) v2. The yields of the emPCR were 10.46% and 11.53%, respectively, within the range of 5 to 20% expected from the Roche procedure. A total of 790,000 beads were loaded on the GS Titanium PicoTiterPlate PTP Kit 70x75 and sequenced with the GS Titanium Sequencing Kit XLR70.

The run was performed overnight and then analyzed on the cluster with gsRunBrowser and gsAssembler (Roche). The 221,117 passed-filter sequences generated 71.95 Mb with an average length of 325 bp.
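As a quick consistency check, the average read length follows directly from the total number of bases and the number of passed-filter reads. This is a minimal sketch; the function name `mean_read_length` is a hypothetical helper, and 1 Mb is assumed to mean 10^6 bases.

```python
def mean_read_length(total_bases: float, read_count: int) -> float:
    """Average read length in bases (hypothetical helper for illustration)."""
    if read_count <= 0:
        raise ValueError("read_count must be positive")
    return total_bases / read_count


# Values from the text: 71.95 Mb (assumed 1 Mb = 1e6 bases) over 221,117 reads
mean_len = mean_read_length(71.95e6, 221_117)
print(round(mean_len))
```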

The 454 sequencing generated 607,067 reads (105.03 Mb), which were assembled into contigs and scaffolds using Newbler version 2.7 (Roche) and Opera software v1.2 [42] combined with GapFiller V1.10 [43]. Finally, the available genome consists of 14 scaffolds and 53 contigs, with a coverage of 44.9×.

Genome annotation

Non-coding genes and miscellaneous features were predicted using RNAmmer [44], ARAGORN [45], Rfam [46] and PFAM [47]. Open reading frames (ORFs) were predicted using Prodigal [48] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. Functional annotation was achieved using BLASTP [49] against the GenBank database [50] and the Clusters of Orthologous Groups (COG) database [51,52].
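The exclusion of ORFs that span a sequencing gap can be sketched as an interval-overlap filter. The example below is illustrative only and does not invoke Prodigal: it assumes that gaps appear as runs of N in the scaffold sequence and that ORFs are given as 0-based, half-open coordinates. All function names are hypothetical.

```python
import re
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end)


def find_gaps(scaffold: str) -> List[Interval]:
    """Locate gap regions, ASSUMED here to be runs of N (or n) in the scaffold."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", scaffold)]


def spans_gap(orf: Interval, gaps: List[Interval]) -> bool:
    """Return True if the ORF interval overlaps any gap interval."""
    start, end = orf
    return any(start < gap_end and gap_start < end for gap_start, gap_end in gaps)


def filter_orfs(orfs: List[Interval], scaffold: str) -> List[Interval]:
    """Keep only ORFs that do not overlap a gap region."""
    gaps = find_gaps(scaffold)
    return [orf for orf in orfs if not spans_gap(orf, gaps)]


# Toy scaffold: two short coding stretches separated by a 5-base gap
scaffold = "ATGAAATAG" + "NNNNN" + "ATGCCCTGA"
orfs = [(0, 9), (6, 17), (14, 23)]  # the middle ORF runs into the gap
kept = filter_orfs(orfs, scaffold)
print(kept)
```

In the toy example only the ORF at positions 6 to 17 overlaps the run of N and is dropped; ORFs that merely abut the gap are kept.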

Genome properties

The genome of Anaerococcus pacaensis strain 9403502T is estimated to be 2.36 Mb long with a G+C content of 35.05% (Figure 6 and Table 3). A total of 2,186 protein-coding genes and 72 RNA genes, including 3 rRNA genes, 42 tRNA genes, 1 tmRNA gene and 26 miscellaneous other RNAs, were found. The majority of the protein-coding genes (74.1%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The properties and statistics of the genome are summarized in Tables 3 and 4. Table 6 presents the percentage of genes in each COG category for Anaerococcus pacaensis and Anaerococcus prevotii DSM 20548. The proportions of genes per COG category are highly similar between the two species. The maximum difference, for the COG category “Carbohydrate transport and metabolism”, does not exceed 1.94%. The distribution of genes into COG functional categories is presented in Table 5.

Figure 6.

Graphical circular map of the genome. From the outside to the center: scaffolds in grey (unordered), genes on the forward strand (colored by COG categories), genes on the reverse strand (colored by COG categories), RNA genes (tRNAs green, rRNAs red, tmRNAs black, misc_RNA pink), GC content (black/grey), and GC skew (purple/olive).

Table 3. Project information
Table 4. Nucleotide content and gene count levels of the genome
Table 5. Number of genes associated with the 25 general COG functional categories
Table 6. Percentage of genes associated with the 25 general COG functional categories for Anaerococcus pacaensis and Anaerococcus prevotii DSM 20548.

Insights into the genome sequence

We briefly compared the genome with that of Anaerococcus prevotii DSM 20548, which is currently the closest available genome. The A. prevotii genome contains 1 chromosome (accession number: NC_013171) and 1 plasmid (accession number: NC_013164).

The draft genome sequence of Anaerococcus pacaensis is larger than that of Anaerococcus prevotii (2.36 Mb and 1.99 Mb, respectively). The G+C content is also slightly higher than that of Anaerococcus prevotii (37.5% and 35.05%, respectively). Anaerococcus pacaensis has more genes (2,272 against 1,916); however, the numbers of genes per Mb are very similar (962.71 and 962.81, respectively).
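The gene density figures quoted above can be reproduced by dividing each gene count by the corresponding genome size. This is a minimal sketch using only the values given in the text; the function name `genes_per_mb` is a hypothetical helper.

```python
def genes_per_mb(gene_count: int, genome_size_mb: float) -> float:
    """Gene density as genes per megabase (hypothetical helper)."""
    return gene_count / genome_size_mb


# Gene counts and genome sizes as given in the text
pacaensis = genes_per_mb(2272, 2.36)
prevotii = genes_per_mb(1916, 1.99)
print(f"{pacaensis:.2f} vs {prevotii:.2f}")
```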

Conclusion

On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Anaerococcus pacaensis sp. nov., which contains the strain 9403502T. This bacterium was found in Marseille, France.

Description of Anaerococcus pacaensis sp. nov.

Anaerococcus pacaensis (pa.ca’en.sis L. gen. masc. n. pacaensis, of PACA, the acronym of Provence Alpes Côte d’Azur, the region where Anaerococcus pacaensis was isolated). Isolated from a blood sample from a patient in Marseille. A. pacaensis is a Gram-positive, obligately anaerobic, non-spore-forming coccus. Grows on axenic medium at 37°C in an anaerobic atmosphere. Negative for indole. Non-motile. The G+C content of the genome is 35.05%. The type strain is 9403502T (= CSUR P122 = DSM 26346).