Introduction

Bacillus thuringiensis is a rod-shaped, Gram-positive bacterium that has been isolated from a variety of ecological niches, including soil, aquatic environments, and dead insects [1]. B. thuringiensis is known for its utility as a bioinsecticide due to its ability to produce parasporal crystals that contain protein toxins (e.g. Cry proteins, also called δ-endotoxins) during sporulation and the stationary growth phase [2]. These protein toxins have also been successfully introduced into genetically modified crops, as exemplified by Bt corn, rendering these crops resistant to specific insect pests [3]. The protein toxins have been shown to be safe for plants, beneficial insects, and mammals because these organisms lack the specific receptors that are normally found only in the target organisms [4, 5]. The potential of B. thuringiensis to serve as an alternative to chemical insecticides has driven the discovery of new B. thuringiensis strains that may lead to the identification of novel protein toxins with potential use in pest management [1, 6]. Aside from its insecticidal properties, B. thuringiensis has also been reported to exhibit antibacterial, antifungal, antibiofilm and emulsifying activities [7, 8]. In general, Bacillus species are known to be rich sources of antimicrobial compounds [9,10,11,12]. The antibacterial effects of B. thuringiensis can be attributed to a wide range of compounds, including bacteriocins and lipopeptides [13], whereas its antifungal activity has been attributed to the production of compounds such as zwittermycin, chitinase, and lipopeptides [7]. In this study, the whole genome sequence of B. thuringiensis DNG9, which was isolated from an oil-contaminated slough in Baraki-Algiers, Algeria, was determined. This strain was chosen for sequencing due to its strong antimicrobial and emulsifying properties. The aim of this work was to obtain a better understanding of the observed bioactivities based on the genes encoded in its genome.

Organism information

Classification and features

Bacillus thuringiensis DNG9 was isolated from an oil-contaminated soil slough in Baraki-Algiers, Algeria. The samples were serially diluted in water, heat-shocked at 80 °C for 30 min, spread onto Luria Bertani (LB) agar and incubated at 35 °C for 24 h. Strain DNG9, like the majority of other reported B. thuringiensis strains, is a Gram-positive, aerobic to facultatively anaerobic bacterium [14]. The cells are rod-shaped, flagellated (Fig. 1a) and endospore-forming (Fig. 1b, c). The bacterium grows at temperatures from 10 to 48 °C, with optimal growth at 28–35 °C [15], and at pH 4.9–8.0, with an optimal pH of 7.0 [16, 17]. It produces parasporal bodies during the stationary phase of its growth cycle (Fig. 1c), which is consistent with the three cry genes predicted from its genome: two homologs of cry41 and one homolog of cry6, identified using the BtToxin Scanner server [18]. The key features of DNG9 are summarized in Table 1.

Fig. 1

General characteristics of Bacillus thuringiensis DNG9. Transmission electron micrographs (TEM) of DNG9 showing (a) a flagellated cell, (b) a subcentral endospore (ES), and (c) parasporal bodies (PB). (d) Spot-on-lawn assay showing the activity of DNG9 supernatant (labelled as 4) against the indicator strain Lactococcus lactis subsp. cremoris HP.

Table 1 Classification and general features of Bacillus thuringiensis strain DNG9 according to the MIGS recommendation [19]

Thirteen Bacillus strains and DNG9 were chosen for phylogenetic analysis. The chosen strains represent members of the B. cereus sensu lato supergroup [19]. These include the type strains B. thuringiensis Berliner ATCC 10792T and B. cereus ATCC 14579T, as well as B. anthracis Ames Ancestor. The 16S rRNA gene sequence from the type strain B. subtilis subsp. subtilis ATCC 6051T [20] was selected as an outgroup. The maximum likelihood method was used to construct the phylogenetic tree shown in Fig. 2. The phylogenetic tree supports the placement of strain DNG9 within the B. thuringiensis group together with the type strain B. thuringiensis Berliner ATCC 10792T.

Fig. 2

Maximum likelihood phylogeny based on the 16S rRNA gene of Bacillus thuringiensis DNG9, isolated from an Algerian oil-contaminated soil slough. Nucleic acid sequences were aligned using Geneious and the tree was compiled using RAxML. Numbers above the branches refer to bootstrap values. The tree was rooted using Bacillus subtilis subsp. subtilis ATCC 6051T. Type strains are indicated with T. All strains represent sequenced genomes. The scale bar indicates 2 nucleotide substitutions per 10 nucleotides. Accession numbers of publicly available sequences are given in brackets.

Genome sequencing information

Genome project history

The project information and associated MIGS (Minimum Information about a Genome Sequence) 2.0 compliance [21] are summarized in Table 2. This bacterium was selected for sequencing because it was determined to be one of the most promising strains for the discovery of compounds with strong antibacterial (Fig. 1d), antifungal and biosurfactant activities (Additional file 1: Figure S1). The availability of the draft genome of DNG9 may contribute to evolutionary and comparative genomics studies of the B. cereus sensu lato group and enable future investigations of its genome-encoded bioactive metabolites. This work provides a standard draft genome, and the assembled contigs have been deposited in public repositories. The PGAP- and JGI-IMG/M-annotated genomes were deposited to the DDBJ/ENA/GenBank databases under accession numbers MSTN00000000 and Ga0180945, respectively.

Table 2 Project information

Growth conditions and genomic DNA preparation

Genomic DNA was isolated from a single colony grown for 16 h on LB agar combined with a 2 ml liquid culture grown for 16 h in LB broth (150 rpm). Total nucleic acid was extracted using the method described previously [22]. Briefly, cells were harvested at 500×g for 2 min and resuspended in 100 μl 1× TE buffer (100 mM Tris-HCl, 50 mM EDTA, pH 8.0). The cell slurry was sequentially treated with 20 mg/ml lysozyme (37 °C, 30 min), 2 mg/ml proteinase K (56 °C, 30 min) and 0.5 mg/ml RNase A (37 °C, 30 min). The spheroplast suspension was lysed with 500 μl cell breakage buffer (0.4% sodium dodecyl sulfate, 0.5% N-lauroyl sarcosine, 0.5% Triton X-100, 50 mM Tris, 100 mM EDTA, pH 8.0), 400 μl phenol and 150 μl glass beads (0.5 mm diameter; Sartorius, Germany). The slurry was vortexed for 1 min and rested on ice for 1 min, for a total of 10 cycles, and finally clarified at 13,000×g for 5 min at room temperature. The aqueous layer was repeatedly extracted with an equal volume of phenol, followed by phenol:chloroform (1:1) and finally chloroform:isoamyl alcohol (24:1). The DNA was precipitated with 0.1 volume of 3 M sodium acetate (pH 5.2) and 2.5 volumes of absolute ethanol, washed with 70% ethanol and resuspended in 10 mM Tris buffer, pH 8.0. Quantity and quality were assessed using Qubit 2.0 fluorometry (Qiagen) and agarose gel electrophoresis, respectively.

Genome sequencing and assembly

The genome of Bacillus thuringiensis DNG9 was sequenced at The Applied Genomic Core, Department of Biochemistry, University of Alberta, using the Illumina paired-end sequencing platform and the Nextera XT DNA library kit (Illumina, USA). Whole genome sequencing was performed in duplicate using the MiSeq Reagent Kit v2. Sequencing with 250 bp paired-end reads yielded 3.69 M reads, which provided an average coverage of 317×. De novo assembly was performed using CLC Genomics Workbench v7.5.2 (CLC bio, Aarhus, Denmark), resulting in a 6,057,430 bp assembly in 38 contigs.
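
The relationship between read count, read length and sequencing depth can be illustrated with a short sketch. It uses the nominal formula coverage = (number of reads × read length) / genome size. The function name `nominal_coverage` and the `mates_per_read` parameter are illustrative assumptions and not part of the sequencing workflow described here. Because the text does not state whether the 3.69 M figure counts single reads or read pairs, both cases are computed; this nominal estimate is not expected to reproduce exactly the 317× reported from the assembly.

```python
def nominal_coverage(n_reads: float, read_len: int, genome_size: int,
                     mates_per_read: int = 1) -> float:
    """Nominal sequencing depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    total_bases = n_reads * read_len * mates_per_read
    return total_bases / genome_size


GENOME_SIZE = 6_057_430  # bp, draft genome size reported in the text
N_READS = 3.69e6         # reads reported in the text
READ_LEN = 250           # bp, nominal untrimmed read length (assumption)

# Case 1: the 3.69 M figure counts individual reads.
as_single = nominal_coverage(N_READS, READ_LEN, GENOME_SIZE)
# Case 2: the 3.69 M figure counts read pairs (two mates each).
as_pairs = nominal_coverage(N_READS, READ_LEN, GENOME_SIZE, mates_per_read=2)

print(f"counted as single reads: {as_single:.1f}x")
print(f"counted as read pairs:   {as_pairs:.1f}x")
```

Quality trimming, adapter removal and unmapped reads all lower the effective depth relative to this nominal value, so an assembler's reported coverage is the more reliable figure.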

Genome annotation

Gene prediction was performed using four automated genome annotation pipelines: (1) the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [23] using GeneMarkS+ and the best-placed reference protein set; (2) the Joint Genome Institute – Integrated Microbial Genomes and Microbiomes (JGI-IMG/M) pipeline [24] utilizing the Prodigal gene caller [25]; (3) the Rapid Annotation using Subsystem Technology (RAST) v2.0 server [26]; and (4) the Bacterial Annotation System (BASys) server [27]. CRISPR repeats were predicted using CRISPRFinder [28]. The draft genome of DNG9 was aligned with the closed genome of the type strain B. thuringiensis Berliner ATCC 10792T to generate a single scaffold using CONTIGuator v2 [29] and the Multi-Draft based Scaffolder (MEDUSA) [30]. A chromosome map was generated from the single scaffold using the BASys automated pipeline [27] and viewed using CGViewer [31].

Species assignment was established using the genome-wide Average Nucleotide Identity (gANI) metric and alignment fraction (AF), calculated within the JGI-IMG/M server using the Microbial Species Identifier (MiSI) calculator [32]. Strain-level relatedness was established using the Genome-to-Genome Distance Calculator (GGDC) 2.1 server, employing digital DNA:DNA hybridization (dDDH) and DNA G + C content [33].
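
A genome-based species call of this kind reduces to comparing a few pairwise metrics against cut-offs. The sketch below is a minimal illustration only: the class and function names are hypothetical, and the default thresholds (gANI ≥ 96.5%, AF ≥ 0.6, dDDH ≥ 70%) are commonly cited values assumed here, not values stated in this report. The example metrics are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomePairMetrics:
    gani: float  # genome-wide average nucleotide identity, percent
    af: float    # alignment fraction, between 0 and 1
    dddh: float  # digital DNA:DNA hybridization estimate, percent


def same_species(m: GenomePairMetrics,
                 gani_min: float = 96.5,   # assumed cut-off
                 af_min: float = 0.6,      # assumed cut-off
                 dddh_min: float = 70.0) -> bool:  # assumed cut-off
    """Return True only if all three metrics meet their thresholds."""
    return m.gani >= gani_min and m.af >= af_min and m.dddh >= dddh_min


# Hypothetical metric values for two genome pairs (not taken from Table S1).
close_pair = GenomePairMetrics(gani=99.2, af=0.90, dddh=95.5)
distant_pair = GenomePairMetrics(gani=91.0, af=0.70, dddh=45.0)

print(same_species(close_pair))
print(same_species(distant_pair))
```

Requiring all three metrics to agree is a conservative design choice; in practice the servers report each metric separately and borderline cases are judged individually.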

Genome properties

The draft genome of DNG9 is 6,057,430 bp with 34.9% GC content, similar to the genomes of other Bacillus thuringiensis strains [34,35,36], and contains 38 scaffolds with an N50 of 347,259 bp. A total of 135 RNA genes and 284 pseudogenes were annotated by IMG/M and PGAP, respectively (Table 3). Annotation using the DOE-JGI IMG/M pipeline revealed 6109 total coding sequences, of which 4463 have functional predictions. By comparison, the RAST annotation pipeline predicted 6055 coding sequences, NCBI-PGAP predicted 6213 coding genes, and BASys annotated 6102 coding sequences. The 4463 coding sequences with functional predictions in the IMG/M pipeline were placed in 25 general Clusters of Orthologous Groups (COG) functional categories. The distribution of these protein-coding genes based on COG function is listed in Table 4. The 6.06 Mbp draft genome map of DNG9, as aligned against the type strain B. thuringiensis Berliner ATCC 10792T, is presented in Fig. 3.
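
Two of the summary statistics in this section are simple to compute, and a brief sketch may help readers who wish to reproduce them on their own assemblies. The `n50` function and the toy contig lengths are illustrative assumptions (the individual DNG9 scaffold lengths are not given in the text); the coding-sequence counts are those reported above, and their mean rounds to 6120.

```python
from statistics import mean


def n50(lengths):
    """Return the length L such that sequences of length >= L
    together contain at least half of the total assembly length."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (bp); NOT the DNG9 scaffolds.
toy_contigs = [8, 8, 4, 3, 3, 2, 2, 2]
toy_n50 = n50(toy_contigs)
print("toy N50:", toy_n50)

# Coding-sequence counts per annotation pipeline, as reported in the text.
cds_counts = {"IMG/M": 6109, "RAST": 6055, "PGAP": 6213, "BASys": 6102}
mean_cds = mean(cds_counts.values())
print("mean CDS count:", round(mean_cds))
```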

Table 3 Genome statistics
Table 4 Number of genes associated with general COG functional categories
Fig. 3

Circular representation of the draft genome of DNG9 showing relevant genome features. The draft genome was aligned into one scaffold using the B. thuringiensis Berliner ATCC 10792T genome. The lines in each concentric circle represent the position of the indicated feature; the color legend is shown to the right of the map. The outermost circle shows COG functional categories of coding regions in the clockwise direction. The second circle shows predicted coding regions transcribed on the forward (clockwise) DNA strand. The third circle shows predicted coding regions transcribed on the reverse (counterclockwise) DNA strand. The fourth circle shows COG functional categories of coding regions in the counterclockwise direction. The fifth and sixth circles show the percent GC content of the genome and the percent GC deviation (skewness) by strand, respectively.

Insights from the genome sequence

B. thuringiensis DNG9 was found to be flagellated, to sporulate with a subcentral endospore and to produce insecticidal parasporal bodies (Fig. 1a, b, c). These phenotypes are supported by gene inventories found in the genome of DNG9 (Fig. 3). The RAST annotation allocated the genes of DNG9 into 490 subsystems, the most abundant of which are associated with amino acid and derivatives metabolism (15.5%), followed by carbohydrate metabolism (11.7%) and protein metabolism (7.6%).

DNG9 was found to be most active against Lactococcus lactis subsp. cremoris HP (Fig. 1d) [37, 38], and was also active against Carnobacterium divergens LV13 [39], Salmonella enterica Typhimurium ATCC 23564 [40], and Micrococcus sp. ATCC 700405 [41], but not against Escherichia coli JM109 [42, 43], Pseudomonas aeruginosa ATCC 14217 [42, 44], or Enterococcus faecalis 710C [45]. DNG9 was also found to be active against the fungus Galactomyces geotrichum MUCL 28959 but not against Aspergillus niger ATCC 9142 or Candida albicans ATCC 10231. The antiSMASH 4.0 server predicted that the DNG9 genome carries gene clusters responsible for the production of several secondary metabolites, including antibiotics, siderophores, and biopolymers. The genome was found to encode gene clusters with complete homology to the biosynthetic gene clusters of the antifungal compound zwittermycin A (Fig. 4a), the iron siderophore petrobactin (Fig. 4b), and the bioplastic precursor polyhydroxyalkanoate (PHA) (Fig. 4c). The aminopolyol compound zwittermycin A was previously shown to suppress fungal and oomycete diseases in plants [46, 47], suggesting that the antifungal activity of DNG9 could be attributed to this secondary metabolite. The presence of biosynthetic gene clusters for siderophores, such as petrobactin and bacillibactin, in the genome of DNG9 suggests an iron acquisition capability. These gene clusters are not exclusive to B. thuringiensis but are also found in the genomes of other members of the Bacillus cereus sensu lato group [48,49,50]. Both the antiSMASH 4.0 and BAGEL 4.0 servers also predicted a number of novel bacteriocins, mainly belonging to the class referred to as lanthipeptides (Fig. 4d, e, f). Lastly, BtToxin Scanner revealed that cry genes encoding the insecticidal proteins associated with B. thuringiensis are present in the DNG9 genome: two homologs of cry41 and one homolog of cry6. The wide biological target range of DNG9, including its antibacterial, antifungal and insecticidal properties, could be attributed to these bioactive compounds.

Fig. 4

Secondary metabolite biosynthetic gene cluster organization in DNG9. Gene clusters for zwittermycin A (a), petrobactin (b), and polyhydroxyalkanoate (c) biosynthesis as predicted by antiSMASH 4.0. The DNG9 biosynthetic gene cluster is color coded with respect to its homology (%) to the known biosynthetic gene cluster. Gene clusters for three class I lanthipeptides (d), a class I lanthipeptide (e) and a class II lanthipeptide (f) as predicted by BAGEL 4.0. The color legend for panels d, e and f is presented in panel g.

The genome of DNG9 is highly similar to those of B. thuringiensis Berliner ATCC 10792T, B. thuringiensis YBT-1518, and B. thuringiensis Bt407 based on average nucleotide identity (> 99%) and digital DNA:DNA hybridization (> 95%) (Additional file 2: Table S1), shared gene content (Fig. 5) and phylogenetic analysis of the 16S rRNA gene (Fig. 2). The functional comparison of the DNG9 genome composition with closely related Bacillus species (i.e. B. thuringiensis, B. cereus and B. anthracis) [19] is presented in Fig. 5. Bacillus subtilis subsp. subtilis ATCC 6051T was used as an outgroup in the map. Comparison of the DNG9 genome with seven closely related Bacillus genomes by uni- and bidirectional best BlastP hits implemented in RAST, cross-validated with IMG annotations and viewed in the IslandViewer 4 server [51], revealed strain-specific genes that encode hypothetical proteins, which are grouped into genomic islands (Fig. 5, Additional file 3: Table S2). These ORFs in DNG9 include a high proportion of mobile genetic elements, phage-like proteins, transposases and hypothetical proteins in five distinct genomic islands, including an intact prophage in region A, which is further supported by PHASTER server [52] analysis.
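
The idea behind a bidirectional best-hit comparison can be sketched in a few lines. This is a simplified illustration and not the RAST implementation: the function names, the (query, subject, bit score) tuple layout and the toy gene identifiers are all hypothetical. A reference gene with no reciprocal partner in any query genome is a candidate strain-specific gene.

```python
def best_hits(hits):
    """Map each query gene to its highest-scoring subject gene.

    hits: iterable of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return gene pairs (a, b) that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


def without_partner(reference_genes, pairs):
    """Reference genes that have no reciprocal best hit."""
    matched = {a for a, _ in pairs}
    return sorted(set(reference_genes) - matched)


# Hypothetical toy hit tables (reference genome A vs. query genome B).
a_vs_b = [("a1", "b1", 500.0), ("a1", "b2", 120.0),
          ("a2", "b2", 300.0), ("a3", "b1", 90.0)]
b_vs_a = [("b1", "a1", 495.0), ("b2", "a2", 310.0), ("b2", "a1", 110.0)]

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
candidates = without_partner(["a1", "a2", "a3"], pairs)
print(sorted(pairs))
print(candidates)
```

In this toy case gene a3 has a one-directional hit to b1 but no reciprocal partner, which is the distinction between the uni- and bidirectional comparisons.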

Fig. 5

Genomic comparison of DNG9 to other Bacillus spp. genomes conducted using RAST. Each track represents a pair-wise BLAST comparison between the open reading frames in a query genome and those in Bacillus thuringiensis DNG9 (Ref. = reference), with the percentage of similarity represented by the colors shown in the legend. Regions marked in the genomic map correspond to the gene numbers presented in Additional file 3: Table S2 (a = 250–313, b = 1882–2051, c = 2127–2374, d = 2785–2880, e = 5318–5365). Query genomes used in this analysis (outer ring to inner ring): B. thuringiensis Berliner ATCC 10792T, B. anthracis F34, B. cereus ATCC 14579T, B. cereus E41, B. thuringiensis YBT-1518, B. subtilis subsp. subtilis ATCC 6051T and B. anthracis Ames Ancestor.

Conclusions

In conclusion, we report a 6.06 Mbp draft genome of Bacillus thuringiensis DNG9, isolated from an oil-contaminated soil slough in Baraki-Algiers, Algeria. The final de novo assembly is based on 306.5 Mb of Illumina data, which provided an average coverage of 317×. The assembled genome contains 6120 coding sequences (average of four annotation pipelines), of which the most abundant are genes associated with amino acid metabolism (15.5%), followed by carbohydrate metabolism (11.7%) and protein metabolism (7.6%). The antimicrobial properties of this bacterium against several Gram-positive and Gram-negative bacteria, as well as fungal phytopathogens, could be explained in part by a number of gene inventories encoded in the draft genome. The comparative analysis with closely related bacterial genomes, alignment of the 16S rRNA gene sequences and prediction of gene inventories for insecticidal Cry protein biosynthesis placed strain DNG9 within Bacillus thuringiensis. This indicates that strain DNG9 could have several potential uses: as an insect biocontrol agent, a fungal phytopathogen control agent, and a source of biopolymers (PHA) and antibacterial compounds. Lastly, the genome sequence of DNG9 may provide another model system for studying pathogenicity against insect pests and plant diseases, for antimicrobial compound mining, and for phylogenetic studies within the Bacillus cereus sensu lato group.