Introduction

Bacillus massilioanorexius strain AP8T (= CSUR P201 = DSM 26092) is the type strain of B. massilioanorexius sp. nov. This bacterium is a Gram-positive, non-spore-forming, aerobic and motile bacillus that was isolated from the stool of a 21-year-old Caucasian French female who had suffered from a severe form of anorexia nervosa since the age of 12 years. The isolation was part of a “culturomics” study aiming at cultivating individually all species within human feces [13]. This bacterium was one of the 11 new bacterial species isolated from this single stool sample [3].

The current classification of Bacteria and Archaea remains a subject of debate and currently relies on a combination of phenotypic and genotypic characteristics [4]. Genomic data have not yet been routinely incorporated into descriptions. However, as more than 6,000 bacterial genomes have been sequenced, including those of 982 type strains [5,6], and another 15,000 genome projects are ongoing, including those of 2,120 type strains [5,6], we recently proposed integrating genomic information into the description of new bacterial species [7-28].

The genus Bacillus (Cohn 1872) was created in 1872 [29]. It consists mainly of Gram-positive, motile, spore-forming bacteria classified within 251 species and 3 subspecies with validly published names [30]. Members of the genus Bacillus are ubiquitous bacteria isolated from various environments including soil, fresh and sea water and food. In humans, Bacillus species may be opportunists in immunocompromised patients [31] or pathogens, such as B. anthracis [32] and B. cereus. In addition to these two species, various Bacillus species may be involved in a variety of nonspecific human infections, including cutaneous, ocular, central nervous system or bone infections, pneumonia, endocarditis and bacteremia [33].

Here we present a summary classification and a set of features for B. massilioanorexius sp. nov. strain AP8T (= CSUR P201 = DSM 26092), together with the description of the complete genomic sequence and its annotation. These characteristics support the circumscription of the species B. massilioanorexius.

Classification and information

A stool sample was collected from a 21-year-old Caucasian French female who had suffered from a severe restrictive form of anorexia nervosa since the age of 12 years. She was hospitalized in the nutrition unit of our hospital for recent aggravation of her medical condition. At the time of hospitalization, her weight and height were 27.7 kg and 1.63 m, respectively (BMI: 10.4 kg/m2). The patient gave informed and signed consent. This study and the assent procedure were approved by the Ethics Committee of the Institut Fédératif de Recherche IFR48, Faculty of Medicine, Marseille, France (agreement 09-022). The fecal specimen was preserved at −80°C after collection. Strain AP8T (Table 1) was isolated in March 2012 by aerobic cultivation on Columbia agar (BioMerieux, Marcy l’Etoile, France) after one month of preincubation of the stool sample, with the addition of 5 mL of sheep rumen, in a blood culture bottle. This strain exhibited 97% 16S rRNA gene sequence similarity with B. simplex [34], the phylogenetically closest validated Bacillus species (Figure 1). This value was lower than the 98.7% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA hybridization [35].

Figure 1.

Phylogenetic tree highlighting the position of Bacillus massilioanorexius strain AP8T relative to a selection of type strains of validly published species of the genus Bacillus. GenBank accession numbers are indicated in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the maximum-likelihood method within the MEGA program. Numbers at the nodes are percentages of bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree. Clostridium botulinum was used as the outgroup. The scale bar represents a 2% nucleotide sequence divergence.

Table 1. Classification and general features of Bacillus massilioanorexius strain AP8T

Different growth temperatures (25, 30, 37, 45°C) were tested. Growth was observed between 25 and 45°C, with optimal growth at 37°C after 24 hours of incubation. Colonies were 3 mm in diameter and 0.5 mm in thickness and gray in color with a coarse appearance on blood-enriched Columbia agar. Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMerieux), and under aerobic conditions, with or without 5% CO2. Growth was obtained under all of these conditions, although only weak growth was observed under anaerobic conditions. Gram staining showed Gram-positive rods (Figure 2). The motility test was positive. Cells grown on agar have a mean diameter of 0.77 µm and a mean length of 2.27 µm in electron microscopy (Figure 3).

Figure 2.

Gram staining of B. massilioanorexius strain AP8T

Figure 3.

Transmission electron microscopy of B. massilioanorexius strain AP8T, using a Morgagni 268D (Philips) at an operating voltage of 60 kV. The scale bar represents 900 nm.

Strain AP8T exhibited catalase and oxidase activity. Substrate oxidation and assimilation were examined with an API 50CH strip (BioMerieux) at the optimal growth temperature. Positive reactions were observed for D-glucose, D-fructose, D-saccharose, ribose, mannose, mannitol and D-trehalose, and weak reactions were observed for L-rhamnose, esculin, salicin, D-cellobiose and gentiobiose. Using an API 20E strip (BioMerieux, Marcy l’Etoile), positive reactions were observed for tryptophan deaminase, acetoin and gelatinase production. Negative reactions were found for urease and indole production.

B. massilioanorexius is susceptible to amoxicillin, rifampicin, ciprofloxacin, gentamicin, doxycycline and vancomycin but resistant to trimethoprim/sulfamethoxazole and metronidazole. When compared with representative species from the genus Bacillus, B. massilioanorexius strain AP8T exhibited the phenotypic differences detailed in Table 2.

Table 2. Differential characteristics of Bacillus massilioanorexius strain AP8T, B. timonensis strain DSM 25372, B. amyloliquefaciens strain FZB42, B. massiliosenegalensis strain JC6T, B. mycoides strain DSM 2048 and B. thuringiensis strain BMB171

Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [47] using a Microflex spectrometer (Bruker Daltonics, Leipzig, Germany). Twelve individual colonies were deposited on an MTP 384 MALDI-TOF target plate (Bruker). The twelve AP8T spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 3,769 bacteria, including 129 spectra from 98 validly named Bacillus species, used as reference data in the BioTyper database. A score enabled the presumptive identification and discrimination of the tested species from those in the database: a score ≥ 2 with a validated species enabled identification at the species level, and a score < 1.7 did not enable any identification. For strain AP8T, no significant score was obtained, suggesting that our isolate was not a member of any known species (Figures 4 and 5).
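The score interpretation described above can be sketched in a few lines of Python. This is only an illustration of the two thresholds quoted in the text. The function names and the example scores are hypothetical, and because the text does not say how scores between 1.7 and 2 are read, that range is reported as unspecified rather than guessed.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a BioTyper score onto the categories quoted in the text."""
    if score >= 2.0:
        return "species-level identification"
    if score < 1.7:
        return "no identification"
    # The text gives no rule for scores in [1.7, 2.0); do not invent one.
    return "not specified in the text"


def best_match(scores: dict[str, float]) -> tuple[str, str]:
    """Return the highest-scoring reference spectrum and its interpretation."""
    name, score = max(scores.items(), key=lambda item: item[1])
    return name, interpret_biotyper_score(score)


# Hypothetical scores for illustration only (not data from this study).
example_scores = {"reference spectrum A": 1.21, "reference spectrum B": 1.43}
print(best_match(example_scores))
```

With every score below 1.7, as in the hypothetical example, the best match is still reported as "no identification", which is the situation described for strain AP8T.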

Figure 4.

Reference mass spectrum from B. massilioanorexius strain AP8T. Spectra from 12 individual colonies were compared and a reference spectrum was generated.

Figure 5.

Gel view comparing B. massilioanorexius sp. nov. strain AP8T and other Bacillus species. The gel view displays the raw spectra of loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray-scale scheme. The color bar and the right y-axis indicate the relation between the color with which a peak is displayed and the peak intensity in arbitrary units. Displayed species are indicated on the left.

Genome sequencing information

Genome project history

The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the genus Bacillus, and is part of a “culturomics” study of the human digestive flora aiming at isolating all bacterial species within human feces. It was the twenty-seventh genome of a Bacillus species and the first genome of Bacillus massilioanorexius sp. nov. The GenBank accession number is CAPG00000000, and the assembly consists of 120 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance [48].

Table 3. Project information

Growth conditions and DNA isolation

Strain AP8T was grown aerobically in Columbia broth (BioMerieux, Marcy l’Etoile, France). Chromosomal DNA was extracted from 50 mL of a 48–72 h culture of B. massilioanorexius, centrifuged at 4°C and 2000 × g for 20 min. The cell pellets were resuspended in 1 mL Tris/EDTA/NaCl [10 mM Tris/HCl (pH 7.0), 10 mM EDTA (pH 8.0), and 300 mM NaCl] and centrifuged again under the same conditions. The pellets were then resuspended in 200 µL TE/lysozyme [25 mM Tris/HCl (pH 8.0), 10 mM EDTA (pH 8.0), 10 mM NaCl, and 10 mg lysozyme/mL]. The sample was incubated at 37°C for 30 min, after which 30 µL of 30% (w/v) sodium N-lauroyl-sarcosine (Sarcosyl) was added, followed by incubation for 20 min at 65°C and then for 5 min at 4°C. The DNA was purified with phenol/chloroform/isoamyl alcohol (25:24:1) and precipitated with ethanol. The DNA concentration was 64.3 ng/µL as determined with a Genios Tecan fluorometer, using the Quant-it Picogreen kit (Invitrogen).

Genome sequencing and assembly

A 3 kb paired-end sequencing strategy (Roche, Meylan, France) was used. Five µg of DNA were mechanically fragmented on the Covaris device (KBioScience-LGC Genomics, Middlesex, UK) through miniTUBE-Red with an enrichment size of 3–4 kb. The DNA fragmentation was visualized with the Agilent 2100 BioAnalyzer on a DNA labchip 7500, with an optimal size of 2.95 kb. The library was constructed according to the 454 GS FLX Titanium paired-end protocol. Circularization and nebulization were performed and generated a pattern with an optimal size of 553 bp. PCR amplification was performed for 17 cycles, followed by double size selection. The single-stranded paired-end library was quantified using the Quant-it Ribogreen kit (Invitrogen) with a Genios Tecan fluorometer, which yielded a concentration of 556 pg/µL. The library concentration equivalence was calculated as 1.82E+09 molecules/µL. The library was stored at −20°C until further use.
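The conversion from a mass concentration to a molecule count can be sketched as follows. The text does not give the formula that was used, so this is an assumption: it takes an average mass of 328.3 g/mol per nucleotide of single-stranded DNA and uses the 553 bp optimal size as the mean fragment length. The function and constant names are hypothetical.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
SSDNA_G_PER_MOL_PER_NT = 328.3  # assumed average mass of one ssDNA nucleotide


def molecules_per_ul(conc_pg_per_ul: float, mean_length_nt: float) -> float:
    """Convert a single-stranded library concentration to molecules per µL."""
    grams_per_ul = conc_pg_per_ul * 1e-12
    grams_per_mole = SSDNA_G_PER_MOL_PER_NT * mean_length_nt
    return grams_per_ul / grams_per_mole * AVOGADRO


# Values quoted in the text: 556 pg/µL and a 553 bp optimal fragment size.
estimate = molecules_per_ul(556, 553)
print(f"{estimate:.2e} molecules/µL")
```

Under these assumptions the estimate comes out at roughly 1.8E+09 molecules/µL, the same order as the reported 1.82E+09. The exact figure depends on the mean fragment length and per-nucleotide mass actually used, which the text does not state.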

The shotgun library was clonally amplified with 5 cpb in 4 emPCR reactions, and the 3 kb paired-end library was amplified with lower cpb, in 4 emPCR reactions at 1 cpb and 2 emPCR reactions at 0.5 cpb, with the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche). The yield of the shotgun emPCR reactions was 16.9 and 5.62% respectively for the two kinds of paired-end emPCR reactions, within the quality range expected (5 to 20%) from the Roche procedure. Two libraries were loaded on the GS Titanium PicoTiterPlates (PTP Kit 70x75, Roche) and pyrosequenced with the GS Titanium Sequencing Kit XLR70 and the GS FLX Titanium sequencer (Roche). The run was performed overnight and analyzed on the cluster through the gsRunBrowser and Newbler assembler (Roche). A total of 410,883 passed filter wells were obtained and generated 144.49 Mb with an average length of 344 bp. The passed filter sequences were assembled using Newbler with 90% identity and 40 bp as overlap. The final assembly identified 20 scaffolds and 120 contigs and generated a genome size of 4.61 Mb, which corresponds to a coverage of 31.34× genome equivalent.
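The coverage figure follows directly from the sequencing yield and the assembled genome size. A minimal sketch of that arithmetic, using only the two values quoted above (the function name is hypothetical):

```python
def genome_coverage(total_sequence_mb: float, genome_size_mb: float) -> float:
    """Sequencing depth expressed in genome equivalents."""
    return total_sequence_mb / genome_size_mb


# 144.49 Mb of passed-filter sequence over a 4.61 Mb assembly.
depth = genome_coverage(144.49, 4.61)
print(f"{depth:.2f}x genome equivalent")
```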

Genome annotation

Open Reading Frames (ORFs) were predicted using Prodigal [49] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database [50] and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAscan-SE tool [51] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [52] and BLASTn against the GenBank database. Lipoprotein signal peptides and the number of transmembrane helices were predicted using SignalP [53] and TMHMM [54], respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment length greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans. Ortholog sets composed of one gene from each of six genomes (B. massilioanorexius strain AP8T, B. timonensis strain DSM 25372 (GenBank accession number CAET00000000), B. amyloliquefaciens strain FZB42 (GenBank accession number NC_009725), B. massiliosenegalensis strain JC6T (GenBank accession number CAHJ00000000), B. mycoides strain DSM 2048 (GenBank accession number CM000742) and B. thuringiensis strain BMB171 (GenBank accession number CP001903)) were identified using the Proteinortho software (version 1.4) [55] with a 30% protein identity and a 1e-05 E-value. The average percentage of nucleotide sequence identity between corresponding orthologous sets was determined using the Needleman-Wunsch global alignment algorithm. Artemis [56] was used for data management and DNA Plotter [57] was used for visualization of genomic features. The Mauve alignment tool was used for multiple genomic sequence alignment and visualization [58].
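To make the identity calculation concrete, the sketch below implements a basic Needleman-Wunsch global alignment and derives a percent identity from it. It is illustrative only: the text names neither the software nor the scoring scheme, so the match, mismatch and linear gap scores, the choice of alignment length as the denominator, and all function names are assumptions.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> tuple[str, str]:
    """Globally align a and b; return the two gapped sequences."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of the alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


def mean_identity(pairs: list[tuple[str, str]]) -> float:
    """Average percent identity over a list of orthologous sequence pairs."""
    values = [percent_identity(*needleman_wunsch(x, y)) for x, y in pairs]
    return sum(values) / len(values)


# Toy sequences for illustration only.
print(needleman_wunsch("ACGT", "ACT"))
print(mean_identity([("ACGT", "ACT"), ("GATTACA", "GATTACA")]))
```

For gene-length sequences a library implementation would normally be preferred, since this pure-Python version fills a full quadratic score matrix.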

Genome properties

The genome of B. massilioanorexius strain AP8T is 4,616,135 bp long (1 chromosome, but no plasmid) with a 34.10% G + C content (Figure 6 and Table 4). Of the 4,519 predicted genes, 4,432 were protein-coding genes, and 87 were RNAs. Eight rRNA genes (one 16S rRNA, one 23S rRNA and six 5S rRNA) and 79 predicted tRNA genes were identified in the genome. A total of 3,290 genes (72.80%) were assigned a putative function. Three hundred fifty-four genes were identified as ORFans (7.98%). The remaining genes were annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Table 4. The distribution of genes into COG functional categories is presented in Table 5.
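The counts quoted above can be cross-checked with a short script. The denominators are not stated in the text, so the choices here are inferences: the 72.80% figure is reproduced when the 3,290 genes are taken over all 4,519 predicted genes, whereas the ORFan percentage comes out close to the reported 7.98% only when taken over the 4,432 protein-coding genes. Variable names are hypothetical.

```python
# Counts quoted in the text.
total_genes = 4519
protein_coding = 4432
rna_genes = 87
rrna_genes = {"16S": 1, "23S": 1, "5S": 6}
trna_genes = 79
with_putative_function = 3290
orfans = 354


def percent(part: int, whole: int) -> float:
    """Percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# The categories should add up to the reported totals.
assert protein_coding + rna_genes == total_genes
assert sum(rrna_genes.values()) + trna_genes == rna_genes

# Assumed denominators (see the note above).
pct_function = percent(with_putative_function, total_genes)
pct_orfans = percent(orfans, protein_coding)
print(pct_function, pct_orfans)
```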

Figure 6.

Graphical circular map of the chromosome. From the outside in, the outer two circles show open reading frames oriented in the forward and reverse directions (colored by COG categories), respectively. The third circle shows the rRNA gene operon (red) and tRNA genes (green). The fourth circle shows the G+C% content plot. The inner-most circle shows GC skew, purple and olive indicating negative and positive values, respectively.

Table 4. Nucleotide content and gene count levels of the genome
Table 5. Number of genes associated with the 25 general COG functional categories

Comparison with other Bacillus species genomes

Here, we compared the genome of B. massilioanorexius strain AP8T with those of B. timonensis strain DSM 25372, B. amyloliquefaciens strain FZB42, B. massiliosenegalensis strain JC6T, B. mycoides strain DSM 2048 and B. thuringiensis strain BMB171. The draft genome of B. massilioanorexius is larger than that of B. amyloliquefaciens (4.6 vs 3.9 Mb, respectively), similar in size to that of B. timonensis (4.6 Mb) and smaller than those of B. massiliosenegalensis, B. mycoides and B. thuringiensis (4.9, 5.5 and 5.6 Mb, respectively). The G+C content of B. massilioanorexius is lower than those of B. massiliosenegalensis, B. timonensis, B. amyloliquefaciens, B. mycoides and B. thuringiensis (34.10, 37.60, 37.30, 46.48, 35.21 and 35.18%, respectively). The gene content of B. massilioanorexius is larger than that of B. amyloliquefaciens (4,519 and 3,814, respectively) and smaller than those of B. massiliosenegalensis, B. timonensis, B. mycoides and B. thuringiensis (4,997, 4,684, 5,747 and 5,495, respectively). The ratio of genes per Mb of B. massilioanorexius is greater than that of B. amyloliquefaciens (982 and 978, respectively), comparable to that of B. thuringiensis (982) and smaller than those of B. massiliosenegalensis, B. timonensis and B. mycoides (1,019, 1,018 and 1,044, respectively). However, the distribution of genes into COG categories was not entirely similar in all six compared genomes (Figure 7). The nucleotide sequence identity ranged from 66.09 to 83.69% among Bacillus species, and from 66.09 to 70.10% between B. massilioanorexius and other Bacillus species, thus confirming its new species status. Table 6 summarizes the numbers of orthologous genes and the average percentage of nucleotide sequence identity between the different genomes studied.

Figure 7.

Distribution of functional classes of predicted genes in B. massilioanorexius (red), B. massiliosenegalensis (blue), B. timonensis (pink), B. amyloliquefaciens (yellow), B. mycoides (brown) and B. thuringiensis (green) chromosomes according to the clusters of orthologous groups of proteins.

Table 6. Orthologous gene comparison and average nucleotide identity of B. massilioanorexius (1) with B. massiliosenegalensis (2), B. timonensis (3), B. thuringiensis (4), B. mycoides (5) and B. amyloliquefaciens (6)†.

Conclusion

On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Bacillus massilioanorexius sp. nov., which contains the strain AP8T. This strain was isolated in Marseille, France.

Description of Bacillus massilioanorexius sp. nov.

Bacillus massilioanorexius (ma.si.li.o.a.no.rex’i.us. L. masc. adj. massilioanorexius, combination of Massilia, the Latin name of Marseille, France, where the type strain was isolated, and anorexia, the disease presented by the patient from whom the strain was cultivated).

Colonies are 3 mm in diameter and 0.5 mm in thickness, gray in color with a coarse appearance on blood-enriched Columbia agar. Cells are rod-shaped with a mean diameter of 0.77 µm. Optimal growth occurs aerobically; weak growth is observed under anaerobic conditions. Growth occurs between 25 and 45°C, with optimal growth observed at 37°C. Cells stain Gram-positive, are non-endospore-forming and are motile. Cells are catalase-positive and oxidase-positive. D-glucose, D-fructose, D-saccharose, D-trehalose, ribose, mannitol and mannose are used as carbon sources. Positive reactions are observed for tryptophan deaminase, acetoin and gelatinase production. Weak reactions are obtained for L-rhamnose, esculin, salicin, D-cellobiose and gentiobiose. Cells are susceptible to amoxicillin, rifampicin, ciprofloxacin, gentamicin, doxycycline and vancomycin but resistant to trimethoprim/sulfamethoxazole and metronidazole.

The G+C content of the genome is 34.10%. The 16S rRNA and genome sequences are deposited in GenBank under accession numbers JX101689 and CAPG00000000, respectively. The type strain AP8T (= CSUR P201 = DSM 26092) was isolated from the fecal flora of a female suffering from anorexia nervosa in Marseille, France.