Introduction

Strain G5T (= CSUR P290 = DSM 27179) is the type strain of Gorillibacterium massiliense gen. nov., sp. nov. This bacterium, which is proposed to belong to the family Paenibacillaceae, is a Gram-negative, flagellated, facultatively anaerobic, indole-negative bacillus that was isolated from a fecal sample of a wild western lowland gorilla from Cameroon during a culturomics study of the bacterial diversity of the feces of wild gorillas. Culturomics has been used successfully to explore the human gut microbiota, allowing the isolation of many new species and genera [1–3].

The newly proposed strategy of combining high-throughput genome sequencing and MALDI-TOF spectral analysis of cellular proteins with more traditional methods of phenotypic characterization has been demonstrated to be a useful approach for the description of new bacterial taxa [4–15]. A principal advantage is that this method circumvents the vagaries of methods that rely mainly on DNA-DNA hybridization to delineate species. Here, we applied this polyphasic approach to describe G. massiliense gen. nov., sp. nov. strain G5T.

The family Paenibacillaceae [16] belongs to the phylum Firmicutes and includes the following nine genera [17]: Paenibacillus [18,19], Ammoniphilus [20], Aneurinibacillus [21], Brevibacillus [21], Thermobacillus [22], Fontibacillus [23], Cohnella [24], Saccharibacillus [25] and Oxalophagus [26]. Members of this family have been isolated mainly from soil, roots, blood, feces and other sources [16]. To the best of our knowledge, this is the first report of the isolation of a novel genus from the fecal flora of a gorilla.

Here we present a summary classification and a set of features for G. massiliense gen. nov., sp. nov. strain G5T (= CSUR P290 = DSM 27179) together with the description of the complete genomic sequencing and its annotation. These characteristics support the circumscription of a novel genus, Gorillibacterium gen. nov. within the family Paenibacillaceae, with Gorillibacterium massiliense gen. nov., sp. nov. as the type species.

Classification and features

In July 2011, a fecal sample was collected from a wild Gorilla gorilla subsp. gorilla near Minton, a village in the south-central part of the Dja Faunal Park (Cameroon). The collection of the stool sample was approved by the Ministry of Scientific Research and Innovation of Cameroon. No experiments were conducted on this gorilla. The fecal specimen was preserved at −80°C after collection and sent to Marseille. Strain G5T (Table 1) was isolated in August 2012 by aerobic cultivation at 37°C on sterilized soil medium (12 g of soil [Latitude: N 43°17′20.151″; Longitude: E 5°24′15.3822″] / agar, 14 g/L). This strain exhibited a 93.72% 16S rRNA nucleotide sequence similarity with Paenibacillus turicensis, the phylogenetically closest validly published Paenibacillus species (Figure 1). This value was lower than the 95.0% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new genus without carrying out DNA-DNA hybridization [37].

Figure 1. Phylogenetic tree highlighting the position of Gorillibacterium massiliense strain G5T relative to other type strains within the Paenibacillaceae family. GenBank accession numbers are indicated in parentheses. Sequences were aligned using CLUSTAL X (V2), and phylogenetic inferences were obtained using the maximum-likelihood method within the MEGA 5 software [36]. Numbers at the nodes are percentages of bootstrap values obtained by repeating the analysis 1,000 times to generate a majority consensus tree. Brevibacillus brevis was used as the outgroup. The scale bar represents a 2% nucleotide sequence divergence.

Table 1. Classification and general features of Gorillibacterium massiliense strain G5T

Different growth temperatures (25, 30, 37 and 45°C) were tested. Growth occurred between 25 and 37°C, with optimal growth at 37°C, and no growth occurred at 45°C. Colonies were bright gray with a diameter of 1.0 mm on 5% blood-enriched Columbia agar. Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMérieux), and under aerobic conditions, with or without 5% CO2. Growth was observed under anaerobic and microaerophilic conditions, but optimal growth was obtained aerobically. Gram staining showed a Gram-negative rod (Figure 2). A motility test produced a negative result. Cells grown on agar did not sporulate. The rods exhibited peritrichous flagella and had a mean length of 1.75 µm and a mean diameter of 0.67 µm, as determined by negative staining transmission electron microscopy (Figure 3).

Figure 2. Gram staining of G. massiliense strain G5T.

Figure 3. Transmission electron microscopy of G. massiliense strain G5T using a Morgagni 268D (Philips) at an operating voltage of 60 kV. The scale bar represents 500 nm.

Strain G5T exhibited catalase activity but not oxidase activity. Using the API 50CH system (BioMérieux), a positive reaction was obtained for D-xylose, D-glucose, D-fructose, D-mannose, N-acetylglucosamine, aesculin, salicin, D-cellobiose, D-maltose, D-lactose, D-melibiose, D-saccharose, D-trehalose, inulin, D-melezitose, D-raffinose, glycogen, gentiobiose, D-turanose, methyl-α-D-glucopyranoside and hydrolysis of starch. A weak positive reaction was observed for L-arabinose. A negative reaction was observed for glycerol, ribose, D-galactose, L-rhamnose, L-sorbose, dulcitol, inositol, D-mannitol, D-sorbitol, methyl-α-D-mannopyranoside, D-arabinose, amygdalin, arbutin, potassium gluconate, potassium 2-ketogluconate, potassium 5-ketogluconate, adonitol and D-tagatose. Using the API ZYM system, positive reactions were obtained only for naphthol-AS-BI-phosphohydrolase, α-galactosidase, β-galactosidase, β-glucosidase, arginine arylamidase and arginine dihydrolase. The production of α-glucosidase, β-glucuronidase, esterase lipase, leucine arylamidase, cystine arylamidase, valine arylamidase, glycine arylamidase, phenylalanine arylamidase, lipase, alkaline phosphatase, acid phosphatase, N-acetyl-β-glucosaminidase and α-chymotrypsin was negative. The urease reaction and the reduction of nitrates to nitrogen were also positive. Indole production was negative. G. massiliense was susceptible to ticarcillin, amoxicillin, tobramycin, imipenem, vancomycin and rifampin but resistant to ceftazidime (Caz 30), colistin (CT50) and metronidazole.

When compared with representative species from the family Paenibacillaceae [38–42], G. massiliense gen. nov., sp. nov. strain G5T exhibited the phenotypic differences detailed in Table 2.

Table 2. Differential phenotypic characteristics between Gorillibacterium massiliense gen. nov., sp. nov., strain G5T and phylogenetically close Paenibacillaceae species.

Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [15] using a Microflex spectrometer (Bruker Daltonics, Leipzig, Germany). Twelve distinct deposits were made for strain G5T from 12 isolated colonies. The 12 G5T spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against 6,252 bacterial spectra used as reference data in the BioTyper database. A score enabled the presumptive identification of the isolate based on the following heuristic: a score ≥ 2 with a validated species enabled identification at the species level; a score ≥ 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain G5T, no significant score was obtained, suggesting that it was not a member of any known species or genus. We added the spectrum from strain G5T to our database (Figure 4). Spectral differences from other members of the Paenibacillaceae family are shown in Figure 5.
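The score interpretation described above can be sketched as a small function. This is only an illustration of the stated cut-offs; the function name and the returned labels are hypothetical and are not part of the BioTyper software.

```python
def interpret_maldi_score(score: float) -> str:
    """Map a pattern-matching score to an identification level.

    Cut-offs follow the heuristic stated in the text:
    >= 2 species level, >= 1.7 and < 2 genus level, < 1.7 no identification.
    """
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "none"


if __name__ == "__main__":
    # Hypothetical example scores, not measured values.
    for s in (2.3, 1.85, 1.2):
        print(s, interpret_maldi_score(s))
```

A strain for which every comparison falls in the "none" band, as reported for strain G5T, is not assigned to any species or genus in the reference database.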

Figure 4. Reference mass spectrum from G. massiliense strain G5T. Spectra from 12 individual colonies were compared and a reference spectrum was generated.

Figure 5. Gel view comparing Gorillibacterium massiliense gen. nov., sp. nov. strain G5T spectra with those of other members of the Paenibacillaceae family. The Gel View displays the raw spectra of all loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray-scale scheme code. The color bar and the right y-axis indicate the relation between the color a peak is displayed with and the peak intensity in arbitrary units. Displayed species are indicated on the left.

Genome sequencing information

Genome project history

The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the family Paenibacillaceae, and is part of a “culturomics” study of the gorilla flora that aims to isolate all bacterial species within gorilla feces. It was the 81st genome of the Paenibacillaceae family and the first genome of Gorillibacterium massiliense gen. nov., sp. nov. The genome is deposited in GenBank under accession number CBQR000000000 and consists of 176 large contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance [43].

Table 3. Project information

Growth conditions and DNA isolation

Gorillibacterium massiliense gen. nov., sp. nov., strain G5T (= CSUR P290 = DSM 27179) was grown aerobically on 5% sheep blood-enriched Columbia agar at 37°C. Four Petri dishes were spread, and the growth was resuspended in 3×500 µL of TE buffer and stored at −80°C. Then, 500 µL of this suspension was thawed, centrifuged for 3 minutes at 10,000 rpm and resuspended in 3×100 µL of G2 buffer (EZ1 DNA Tissue kit, Qiagen). A first mechanical lysis was performed with glass powder on the Fastprep-24 device (Sample Preparation system, MP Biomedicals, USA) using 2×20-second cycles. DNA was then treated with 2.5 µg/µL lysozyme (30 minutes at 37°C) and extracted using the BioRobot EZ1 Advanced XL (Qiagen). The DNA was then concentrated and purified using the QIAamp kit (Qiagen). The yield and the concentration were measured with the Quant-iT PicoGreen kit (Invitrogen) on the Genios Tecan fluorometer, giving 50 ng/µL.

Genome sequencing and assembly

The paired-end library was prepared with 5 µg of bacterial DNA using DNA fragmentation on a Covaris S-Series (S2) instrument (Woburn, Massachusetts, USA) with an enrichment size of 4.5 kb. DNA fragmentation was visualized with an Agilent 2100 BioAnalyzer on a DNA labchip 7500. The library was constructed according to the 454 GS FLX Titanium paired-end protocol (Roche). Circularization and nebulization were performed and generated a pattern with an optimum at 510 bp. After PCR amplification through 17 cycles followed by double size selection, the single-stranded paired-end library was quantified using a BioAnalyzer 2100 on an RNA pico 6000 labchip at 68 pg/µL. The library concentration equivalence was calculated as 2.45E+08 molecules/µL. The library was stored at −20°C until further use.
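The text does not give the formula behind the library concentration equivalence. As a hedged worked example, the conversion from a mass concentration to molecules per microliter can be sketched as follows, assuming an average fragment length equal to the 510-nucleotide optimum and an average mass of 328.3 g/mol per nucleotide of single-stranded DNA. Both assumptions, and the function name, are ours; under them the calculation gives about 2.45 × 10^8 molecules/µL, in line with the reported value.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_microliter(conc_pg_per_ul: float,
                             mean_length_nt: float,
                             mass_per_nt: float = 328.3) -> float:
    """Convert a ssDNA mass concentration (pg/uL) to molecules/uL.

    mass_per_nt (g/mol per nucleotide) is an assumed average, not a value
    taken from the text.
    """
    grams_per_ul = conc_pg_per_ul * 1e-12
    grams_per_mole = mean_length_nt * mass_per_nt
    return grams_per_ul / grams_per_mole * AVOGADRO


if __name__ == "__main__":
    # 68 pg/uL and a 510-nt mean fragment length, as assumed above.
    print(f"{molecules_per_microliter(68, 510):.2E}")
```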

The paired-end library was clonally amplified with 0.25 cpb and 0.5 cpb in 2 emPCR reactions with the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche). The emPCR yields were 5% and 6%, respectively, within the 5 to 20% range recommended by the Roche procedure.

Approximately 790,000 beads were loaded twice (i.e. two runs were performed using the same paired-end library) on a ¼ region of the GS Titanium PicoTiterPlate PTP Kit 70×75 and sequenced with the GS FLX Titanium Sequencing Kit XLR70 (Roche). The two runs were performed overnight and then analyzed on the cluster through the gsRunBrowser and Newbler assembler (Roche). A total of 387,157 passed-filter wells were obtained and generated 142.7 Mb of sequences with an average length of 369 bp. The passed-filter sequences were assembled using Newbler with 90% identity and a 40-bp overlap. The final assembly identified 12 scaffolds with 176 large contigs (>1.5 kb), generating a genome size of 5.5 Mb, which corresponds to a genome coverage of 25.71×.
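The coverage figure is the ratio of sequenced bases to genome size. The short sketch below reproduces that arithmetic. The function name is hypothetical, and the denominator of 5.55 Mb is our assumption, chosen because it reproduces the reported 25.71×. Using the exact assembly length of 5,546,433 bp reported under Genome properties gives a marginally higher value.

```python
def genome_coverage(sequenced_bases: float, genome_size: float) -> float:
    """Average depth of coverage: total sequenced bases / genome size."""
    return sequenced_bases / genome_size


if __name__ == "__main__":
    total_bases = 142.7e6  # 142.7 Mb of passed-filter sequence

    # Assumed rounded genome size of 5.55 Mb.
    print(round(genome_coverage(total_bases, 5.55e6), 2))

    # Exact assembly length from the genome properties.
    print(round(genome_coverage(total_bases, 5_546_433), 2))
```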

Genome annotation

Open Reading Frames (ORFs) were predicted using Prodigal [44] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database [45] and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAscan-SE tool [46] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [47] and BLASTn against the GenBank database. ORFans were identified if their BLASTP E-value was lower than 1e-03 for an alignment length greater than 80 amino acids. If the alignment length was smaller than 80 amino acids, we used an E-value of 1e-05.
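The length-dependent E-value cut-off can be written as a two-branch rule. The sketch below encodes only the threshold selection and the comparison stated above; the function names are hypothetical. The text does not say how an alignment of exactly 80 amino acids is treated, so grouping it with the shorter alignments here is an arbitrary assumption.

```python
def evalue_cutoff(alignment_length_aa: int) -> float:
    """Return the BLASTP E-value cut-off for a given alignment length.

    > 80 aa uses 1e-03; shorter alignments use 1e-05. Exactly 80 aa is not
    covered by the text and is treated as "short" here (assumption).
    """
    return 1e-03 if alignment_length_aa > 80 else 1e-05


def meets_cutoff(evalue: float, alignment_length_aa: int) -> bool:
    """True if the E-value is lower than the length-dependent cut-off."""
    return evalue < evalue_cutoff(alignment_length_aa)


if __name__ == "__main__":
    # Hypothetical hits: (E-value, alignment length in amino acids).
    for evalue, length in [(1e-04, 120), (1e-04, 50), (1e-07, 50)]:
        print(evalue, length, meets_cutoff(evalue, length))
```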

To estimate the mean level of nucleotide sequence similarity at the genome level between G. massiliense and two other members of the family Paenibacillaceae and Brevibacillus brevis, we used the Average Genomic Identity of Orthologous gene Sequences (AGIOS), a custom application we developed. Briefly, the AGIOS software uses Proteinortho [48] to detect orthologous proteins between genomes compared pairwise, then retrieves the corresponding genes and determines the mean percentage of nucleotide sequence identity among orthologous ORFs using the Needleman-Wunsch global alignment algorithm.
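The AGIOS code itself is not shown in the text. As a rough illustration of the final step only, the sketch below aligns pairs of orthologous gene sequences with a basic Needleman-Wunsch implementation and averages the percentage identity. The scoring scheme (match +1, mismatch −1, linear gap −1), the definition of identity over all alignment columns, the function names and the toy sequences are all assumptions made for this example, and ortholog detection with Proteinortho is not reproduced.

```python
from statistics import mean


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> tuple[str, str]:
    """Globally align a and b; return the two gapped sequences."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical columns as a percentage of the alignment length."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        raise ValueError("cannot compute identity for two empty sequences")
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def mean_identity(ortholog_pairs: list[tuple[str, str]]) -> float:
    """Mean percentage identity over a list of orthologous gene pairs."""
    return mean([percent_identity(a, b) for a, b in ortholog_pairs])


if __name__ == "__main__":
    # Toy sequences standing in for orthologous ORFs.
    pairs = [("ACGT", "ACGT"), ("ACGT", "ACGA")]
    print(mean_identity(pairs))
```

The quadratic-time dynamic programming shown here is adequate for gene-length sequences; a production tool would more likely call an optimized aligner.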

Genome properties

The genome is 5,546,433 bp long with a 50.39% G+C content (Figure 6 and Table 4). It is composed of 189 contigs (176 large contigs, 12 scaffolds). Of the 5,221 predicted genes, 5,145 were protein-coding genes and 76 were RNAs (one 16S rRNA gene, one 23S rRNA gene, five 5S rRNA genes and 69 tRNA genes). A total of 3,865 genes (75.12%) were assigned a putative function (by COGs or by NR BLAST). In addition, 272 genes (5.29%) were identified as ORFans. The remaining genes (680 genes, 13.22%) were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 5. The properties and the statistics of the genome are summarized in Tables 4 and 5.
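The percentages above are consistent with using the 5,145 protein-coding genes as the denominator rather than the 5,221 predicted genes. This is an inference from the arithmetic, not something the text states. A minimal check, with a hypothetical helper name:

```python
def percent_of(part: int, total: int) -> float:
    """Percentage of total represented by part, rounded to two decimals."""
    return round(100.0 * part / total, 2)


if __name__ == "__main__":
    protein_coding = 5145  # assumed denominator
    counts = {
        "putative function": 3865,
        "ORFans": 272,
        "hypothetical proteins": 680,
    }
    for label, n in counts.items():
        print(f"{label}: {n} genes, {percent_of(n, protein_coding)}%")

    # RNA genes: 1 (16S) + 1 (23S) + 5 (5S) + 69 (tRNA)
    print("RNA genes:", 1 + 1 + 5 + 69)
```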

Figure 6. Graphical circular map of the chromosome. From the outside to the center: genes on the forward strand colored by COG categories (only genes assigned to COG), genes on the reverse strand colored by COG categories (only genes assigned to COG), RNA genes (tRNAs green, rRNAs red), G+C content and GC skew. Purple and olive indicate negative and positive values, respectively.

Table 4. Nucleotide content and gene count levels of the chromosome
Table 5. Number of genes associated with the 25 general COG functional categories

Genomic comparison of G. massiliense and other members of the family Paenibacillaceae

The genome of G. massiliense strain G5T was compared to those of Paenibacillus elgii strain B69, Paenibacillus alvei strain DSM 29 and Brevibacillus brevis strain NBRC 100599 (Table 6A and Table 6B). The draft genome of G. massiliense is smaller than those of P. elgii, P. alvei and B. brevis (5.54 vs 7.96, 6.83 and 6.3 Mb, respectively). G. massiliense has a lower G+C content than P. elgii (50.39% vs 52.6%) but a higher G+C content than P. alvei and B. brevis (50.39% vs 45.9% and 47.3%, respectively). The protein content of G. massiliense is lower than those of P. elgii, P. alvei and B. brevis (5,145 vs 7,597, 6,823 and 5,946, respectively) (Table 6A and Table 6B). In addition, G. massiliense shares 2,122, 1,846 and 1,716 orthologous genes with P. elgii, P. alvei and B. brevis, respectively (Table 6). The nucleotide sequence identity of orthologous genes ranges from 66 to 67.6% among previously published genomes, and from 65.3 to 68.7% between G. massiliense and the other studied genomes (Table 6A and Table 6B). Table 6 summarizes the number of orthologous genes and the average percentage of nucleotide sequence identity between the different genomes studied.

Table 6A. Genomic comparison of G. massiliense gen. nov., sp. nov., strain G5T with four other members of the family Paenibacillaceae
Table 6B. Genomic comparison of G. massiliense gen. nov., sp. nov., strain G5T with four other members of the family Paenibacillaceae

Conclusion

On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Gorillibacterium massiliense gen. nov., sp. nov., which contains the strain G5T. This bacterium was isolated from a stool sample of a wild gorilla collected in Cameroon.

Description of Gorillibacterium gen. nov.

Gorillibacterium (go.ri.li.bac.te.ri’um. gor.il.i N.L. gen. fem., the genus name of the great ape; bac.ter’i.um N.L. neut. n. bacterium, a rod; Gorillibacterium, a rod-shaped bacterium isolated from a gorilla).

Gram-negative rod. Facultatively anaerobic. Mesophilic. Non-motile. Oxidase negative, catalase positive. Positive for urease, nitrate reduction, α- and β-galactosidase, arginine dihydrolase, arginine arylamidase, and β-glucosidase. Habitat: gorilla gut. Type species: Gorillibacterium massiliense.

Description of Gorillibacterium massiliense gen. nov., sp. nov.

Gorillibacterium massiliense (ma.si.li.en’se. L. gen. neut. n. massiliense, of Massilia, the ancient Roman name for Marseille, France, where the type strain was isolated).

G. massiliense is a Gram-negative rod. Facultatively anaerobic. Mesophilic. Optimal growth is achieved at 37°C. Non-sporulating and non-motile bacterium. Colonies are bright gray and 0.5–1 mm in diameter on blood-enriched Columbia agar. Cells are rod-shaped and have a mean diameter of 0.67 µm and a mean length of 1.75 µm.

Catalase positive, oxidase negative. Using the API 20NE system, positive reactions are observed for nitrate reduction and the urease reaction, but indole production is negative. Using the API 50CH system (BioMérieux), positive reactions are obtained for the fermentation of D-xylose, D-glucose, D-fructose, D-mannose, N-acetylglucosamine, aesculin, salicin, D-cellobiose, D-maltose, D-lactose, D-melibiose, D-saccharose, D-trehalose, inulin, D-melezitose, D-raffinose, glycogen, gentiobiose, D-turanose, methyl-α-D-glucopyranoside and starch. Negative reactions are observed for glycerol, ribose, D-galactose, L-rhamnose, L-sorbose, dulcitol, inositol, D-mannitol, D-sorbitol, methyl-α-D-mannopyranoside, D-arabinose, amygdalin, arbutin, potassium gluconate, potassium 2-ketogluconate, potassium 5-ketogluconate, adonitol and D-tagatose. Using the API ZYM system, positive reactions are observed for the production of naphthol-AS-BI-phosphohydrolase, α-galactosidase, β-galactosidase, β-glucosidase, arginine arylamidase and arginine dihydrolase. The production of α-glucosidase, β-glucuronidase, esterase lipase, leucine arylamidase, cystine arylamidase, valine arylamidase, glycine arylamidase, phenylalanine arylamidase, lipase, alkaline phosphatase, acid phosphatase, N-acetyl-β-glucosaminidase and α-chymotrypsin is negative. Susceptible to ticarcillin, amoxicillin, tobramycin, imipenem, vancomycin and rifampin but resistant to ceftazidime, colistin and metronidazole.

The G+C content of the genome is 50.39%. The 16S rRNA and genome sequences are deposited in GenBank under accession numbers KC193239 and CBQR000000000, respectively. The type strain G5T (= CSUR P290 = DSM 27179) was isolated from the fecal flora of a Gorilla gorilla gorilla from Cameroon.