Introduction

The family Bacillaceae [1, 2] is one of the largest bacterial families and currently consists of 57 genera [3]. The Bacillaceae are either rod-shaped (bacilli) or spherical (cocci) Gram-positive bacteria, the majority of which produce endospores [4]. Anoxybacillus [5, 6] is one of the genera within the Bacillaceae [1, 2], classified within the phylum Firmicutes [7], class Bacilli [8, 9], and order Bacillales [1, 10].

Anoxybacillus spp. are alkalo-thermophiles with optimum growth at temperatures between 50 °C and 65 °C and at pH 5.6–9.7 [4]. Most of the Anoxybacillus spp. are found in hot springs [4], but Anoxybacillus has also been found in animal manure [5], contaminated dairy and meat products [4], animals (e.g., fish gut) [4], insects (e.g., glassy-winged sharpshooter and spiraling whitefly) [11], and plants (e.g., Indian mulberry) [11]. To date, a total of 22 species and two subspecies of Anoxybacillus have been described [4, 12, 13].

Almost all members of the Bacillaceae are excellent industrial enzyme producers [4, 14, 15]. Members of the genus Anoxybacillus exhibit the additional advantage of thermostability compared to the mesophilic Bacillaceae. It has been reported that enzymes from Anoxybacillus spp. can degrade various substrates such as starches, cellulose, fats, and proteins [4]. Many carbohydrase-encoding genes have been identified in Anoxybacillus spp. genomes, and some of the well-studied starch-degrading enzymes are α-amylase [16], pullulanase [17], amylopullulanase [18], cyclomaltodextrinase (CDase) [19], and xylose-isomerase [20]. In addition, xylanolytic enzymes such as xylanase [21] and α-L-arabinofuranosidase [22] have been characterized from Anoxybacillus spp. Apart from their hydrolytic capabilities, Anoxybacillus spp. have been proposed as agents for bioremediation of Hg2+, Cr2+, Al3+, and As3+ ions [4, 23–25] and nitrogen oxide [26], and as possible candidates for biohydrogen production [4].

Among the members of the family Bacillaceae, intensive genome sequencing efforts have been undertaken for Geobacillus [27] (>80 projects) and Bacillus [1, 28] (>1,500 projects), which have been registered in the NCBI BioProject database. In contrast, genomic studies on Anoxybacillus are rather limited, with only 16 registered projects. At present, the genome of Anoxybacillus flavithermus WK1 is the only completely sequenced genome (BioProject accession number PRJNA59135) among the Anoxybacillus spp. [5, 29]. Draft genome sequences are available for Anoxybacillus ayderensis AB04T (PRJNA258494; this study) [30], Anoxybacillus sp. BCO1 (PRJNA261743) [31, 32], Anoxybacillus thermarum AF/04T (PRJNA260786) [33–35], Anoxybacillus gonensis G2T (PRJNA264351) [36], Anoxybacillus sp. ATCC BAA-2555 (PRJNA260743), Anoxybacillus sp. KU2-6(11) (PRJNA258246), Anoxybacillus tepidamans PS2 (PRJNA214279) [37], A. flavithermus 25 (PRJNA258119) [5, 38], A. flavithermus AK1 (PRJNA190633) [5, 39], Anoxybacillus kamchatkensis G10 (PRJNA170961) [40–42], A. flavithermus Kn10 (PRJDB1085) [5, 43], A. flavithermus TNO-09.006 (PRJNA169174) [5, 44], Anoxybacillus sp. SK3-4 (PRJNA174378) [45, 46], Anoxybacillus sp. DT3-1 (PRJNA182115) [45, 46], and A. flavithermus subsp. yunnanensis E13T (PRJNA213809) [35, 47, 48]. Therefore, the genomic study of Anoxybacillus spp. is essential not only to fully understand their biochemical networks, but also to discover their potential applicability in industrial processes.

In the present report, we describe the cellular features of A. ayderensis AB04T and present a high-quality annotated draft genome of the strain. We also provide a comparative analysis of the glycoside hydrolases (GHs) of strain AB04T and other sequenced Anoxybacillus spp., and we discuss the presence of other under-explored industrial enzymes and the potential applications of the bacterium.

Organism information

Classification and features

A. ayderensis AB04T (= NCIMB 13972T = NCCB 100050T) was isolated from mud and water samples from the Ayder hot spring located in the province of Rize in Turkey [30]. Microscopic examination revealed that colonies of strain AB04T were cream-colored, regular in shape with round edges, and 1–2 mm in diameter.

Phenotypic analysis revealed that strain AB04T is a Gram-positive, rod-shaped, motile, and spore-forming bacterium [30]. It is a facultatively anaerobic, moderate thermophile that grows well at 30–70 °C (optimum 50 °C) and at pH 6.0–11.0 (optimum pH 7.5–8.5) (Table 1). Field emission scanning electron microscopy (FESEM) showed that cells of strain AB04T were 0.7–0.8 × 3.5–5.0 μm in size (Fig. 1). The strain gave positive responses for catalase and oxidase activity, and was able to reduce nitrate to nitrite. Strain AB04T was capable of utilizing a wide range of carbon sources including starch, gelatin, d-glucose, d-raffinose, d-sucrose, d-xylose, d-fructose, l-arabinose, maltose, and d-mannose. The strain grew optimally in the presence of 1.5 % (w/v) NaCl, but it was able to grow in the absence of NaCl. Growth was inhibited in the presence of ampicillin (25 μg/ml), streptomycin sulphate (25 μg/ml), tetracycline (12.5 μg/ml), gentamicin (10 μg/ml), and kanamycin (10 μg/ml). The fatty acid methyl ester (FAME) profile showed that the major fatty acid in AB04T is C15:0 iso (48.17 %), followed by C17:0 iso (20.62 %), C17:0 anteiso (9.22 %), C16:0 (9.10 %), C16:0 iso (7.47 %), C15:0 anteiso (3.58 %), C14:0 (1.02 %), and C15:0 (0.83 %) [30].

Table 1 Classification and general features of A. ayderensis AB04T [74]
Fig. 1

FESEM micrograph of A. ayderensis AB04T. The micrograph was captured using FESEM (JEOL JSM-6701 F, Tokyo, Japan) operating at 5.0 kV at a magnification of 15,000 ×

The 16S rRNA-based phylogenetic tree constructed using MEGA6.0 [49] showed that strain AB04T clusters together with Anoxybacillus sp. SK3-4 [45, 46] and A. thermarum AF/04T [33–35] (Fig. 2). Pairwise 16S rRNA sequence similarities among the strains were determined using the EzTaxon server [50], revealing that AB04T shares 99.6 % and 99.2 % similarity with Anoxybacillus sp. SK3-4 [45, 46] and A. thermarum AF/04T [33–35], respectively.

Fig. 2

Phylogenetic tree based on 16S rRNA gene sequences showing the relationship between A. ayderensis AB04T and representative Anoxybacillus spp. The 16S rRNA accession number for each strain is shown in brackets. The 16S rRNA sequences were aligned using ClustalW and the tree was constructed using the ML method with 1000 bootstrap replicates embedded in the MEGA6.0 package [49]. The scale bar represents 0.01 nucleotide substitutions per position. Brevibacillus brevis NCIMB 9372T [77] was used as an out-group. Type strains are indicated with a superscript T. Published genomes are indicated in blue

Genome sequencing information

Genome project history

Genomic studies on the genus Anoxybacillus are relatively limited [45]. Hence, the findings on A. ayderensis AB04T presented in this study are important because they contribute to the body of knowledge on Anoxybacillus genomes. This whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number JXTG00000000. The NCBI BioProject accession number is PRJNA258494. The GOLD Project ID for strain AB04T is Gp0026071. Table 2 presents the project information and its association with MIGS version 2.0 compliance.

Table 2 Project information

Growth conditions and genomic DNA preparation

A. ayderensis AB04T was plated on Nutrient Agar (pH 7.5) and incubated at 50 °C for 18 h. A single colony was transferred into Nutrient Broth (pH 7.5) and incubated at 50 °C with rotary shaking at 200 rpm for 18 h. The cells were harvested by centrifugation at 10,000 × g for 5 min using a Microfuge® 16 centrifuge (Beckman Coulter, Brea, CA, USA). Genomic DNA was extracted using a Qiagen DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. The purity, quality, and concentration of the genomic DNA were determined using a 6 % (w/v) agarose gel, NanoDrop 1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA), and Qubit 2.0 fluorometer (Invitrogen, Merelbeke, Belgium).

Genome sequencing and assembly

The genome of A. ayderensis AB04T was sequenced using the Illumina MiSeq sequencing platform (Illumina, San Diego, CA, USA) with 300-bp paired-end reads. The adapter sequences were removed and low-quality regions and reads were filtered out using Trimmomatic [51] (Phred score = 25 (Q25), sliding window = 4 bp, leading and trailing qualities = 3, and minimum read length = 36 bp), Scythe (UC Davis Bioinformatics Core, Davis, CA, USA) (prior contamination rate = 0.3, minimum match length argument = 5, and minimum sequence to keep after trimming = 36 bp), and String Graph Assembler (SGA) [52] (k-mer threshold = 3, k-mer rounds = 10, and read error correction = 0.04). Next, the reads were subjected to de novo genome assembly using IDBA-UD 1.0.9 [53] (minimum k = 35).
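The quality-filtering parameters listed above can be illustrated with a minimal Python sketch. This is a simplified re-implementation written for illustration only, not the Trimmomatic code itself: the function names, the order of the steps (leading, trailing, sliding window, minimum length), and the toy read are assumptions, and the real tool handles further details that are not modelled here.

```python
def sliding_window_keep(quals, window=4, min_avg=25):
    """Return how many 5' bases to keep.

    Scans windows from the 5' end and cuts at the start of the first
    window whose mean Phred quality is below min_avg (simplified rule).
    """
    if not quals:
        return 0
    last_start = max(len(quals) - window, 0)
    for start in range(last_start + 1):
        chunk = quals[start:start + window]
        if sum(chunk) / len(chunk) < min_avg:
            return start
    return len(quals)


def trim_read(seq, quals, leading=3, trailing=3,
              window=4, min_avg=25, min_len=36):
    """Apply leading/trailing clipping, sliding-window trimming and a
    minimum-length filter. Returns (seq, quals) or None if discarded."""
    start = 0
    while start < len(quals) and quals[start] < leading:
        start += 1
    end = len(quals)
    while end > start and quals[end - 1] < trailing:
        end -= 1
    seq, quals = seq[start:end], quals[start:end]

    keep = sliding_window_keep(quals, window, min_avg)
    seq, quals = seq[:keep], quals[:keep]

    if len(seq) < min_len:
        return None
    return seq, quals


# Hypothetical toy read: one poor base, 40 good bases, a low-quality tail.
toy_quals = [2] + [35] * 40 + [10] * 6 + [2]
toy_seq = "A" * len(toy_quals)
result = trim_read(toy_seq, toy_quals)
print(len(result[0]) if result else "discarded")
```

With the thresholds from the text (Q25, 4-bp window, leading and trailing quality 3, minimum length 36 bp), the toy read keeps its high-quality 5' portion, while a read that ends up shorter than 36 bp after trimming is dropped.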

Genome annotation

Genes, tRNAs and tmRNAs, and rRNAs were predicted with Prodigal [54], ARAGORN [55], and RNAmmer [56], respectively. For functional annotation, the predicted coding sequences were translated and used to search for the closest matches in the NCBI non-redundant database and the UniProt [57], TIGRFAM [58], Pfam [59], CRISPRfinder [60], PRIAM [61], KEGG [62], COG [63], and InterProScan 5 [64] databases. The GHs were identified and verified using the dbCAN CAZy [65], NCBI BLASTp, and InterProScan 5 [64] databases. Genome comparison was done by the ANI function in the EzTaxon-e database [66].

Genome properties

The overall genome coverage was approximately 239-fold. The draft genome was assembled into 74 contigs with a total length of 2,832,347 bp and a G + C content of 41.8 % (Fig. 3 and Table 3). The longest and shortest contigs were 448,584 bp and 606 bp, respectively. The mean length of the contigs was 38,275 bp and the N50 contig length was 112,260 bp. We did not detect any additional DNA elements. The genome consisted of 2,998 predicted genes, of which 2,895 were protein-coding sequences and 103 were RNA genes including 14 rRNAs, 88 tRNAs, and 1 tmRNA. A total of 235 (8.1 %) genes were assigned a putative function. The remaining annotated genes (1023; 35.3 %) were hypothetical proteins. The properties and the statistics of the genome are summarized in Table 3. The distribution of genes into COGs and KEGG functional categories is presented in Table 4 and Fig. 3.
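The assembly and gene-count figures above can be cross-checked with a short Python sketch. The total length, contig count, and gene tallies are taken from the text; the `n50` helper and the toy contig list are illustrative assumptions, because the individual contig lengths of the AB04T assembly are not listed here.

```python
def n50(contig_lengths):
    """Length of the contig at which the running sum of contigs, sorted
    from longest to shortest, first reaches half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def mean_contig_length(total_bp, n_contigs):
    """Arithmetic mean contig length in bp."""
    return total_bp / n_contigs


# Figures reported in the text.
TOTAL_BP = 2_832_347
N_CONTIGS = 74
print(round(mean_contig_length(TOTAL_BP, N_CONTIGS)))  # 38275

# Gene tallies reported in the text.
protein_coding, rna_genes = 2895, 103
rrna, trna, tmrna = 14, 88, 1
assert protein_coding + rna_genes == 2998
assert rrna + trna + tmrna == rna_genes

# Hypothetical toy assembly, used only to show how N50 is derived.
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 40
```

The mean reproduces the reported 38,275 bp, and the RNA and protein-coding counts add up to the 2,998 predicted genes.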

Fig. 3

A graphical circular map of the A. ayderensis AB04T genome. From outside to the center: genes on the forward strand (colored by COG categories), genes on forward strand (red), genes on reverse strand (blue) and genes on the reverse strand (colored by COG categories)

Table 3 Genome statistics
Table 4 Number of genes associated with general COG functional categories

Insights from the genome sequence

Genome features of A. ayderensis AB04T and other Anoxybacillus spp

The genome sizes of the currently sequenced Anoxybacillus spp. are shown in Fig. 2. Most of the reported Anoxybacillus draft genome sizes are between 2.60 and 2.86 Mb [31, 33, 38–40, 43–45, 47], and the completely sequenced A. flavithermus WK1 genome has a size of 2.85 Mb [29]. The incomplete genome sequence of A. tepidamans PS2 has a size of 3.36 Mb (Fig. 2), which is the largest Anoxybacillus genome sequenced to date [37]. However, cumulative information on the Anoxybacillus genomes (Fig. 2) indicates that Anoxybacillus has a smaller genome size than the closest genus, Geobacillus (~3.50 Mb) [27, 45]. The genomes of other genera within Bacillaceae such as Bacillus [1, 28] and Lysinibacillus [67] are at least 40 % larger than that of Anoxybacillus [5, 6, 45]. The average G + C content of the Geobacillus spp. genomes (~50.0 %) [27, 45] is slightly higher than that of the A. ayderensis [30] genome (Fig. 2), while most Bacillus genomes have less than 40 % G + C content [1, 28, 45].

Table 5 summarizes the pairwise ANI values of Anoxybacillus spp. [66]. A. ayderensis AB04T showed the highest ANI of 97.6 % with Anoxybacillus sp. SK3-4 [46]. As this ANI value is greater than 95 % [68], Anoxybacillus sp. SK3-4 [45, 46] is likely to be a subspecies of A. ayderensis [30].
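The reasoning behind this inference is a simple threshold test, sketched below in Python. The 95 % cut-off and the 97.6 % value come from the text; the function name and the second example value are hypothetical and serve only to show the other branch.

```python
SPECIES_ANI_THRESHOLD = 95.0  # per cent, cut-off cited in the text [68]


def shares_species(ani_percent, threshold=SPECIES_ANI_THRESHOLD):
    """True if a pairwise ANI value exceeds the species-level cut-off."""
    return ani_percent > threshold


# Reported ANI between A. ayderensis AB04T and Anoxybacillus sp. SK3-4.
print(shares_species(97.6))

# Hypothetical value below the cut-off, for contrast only.
print(shares_species(90.0))
```

A value above the cut-off supports, but does not by itself prove, assignment to the same species; the text accordingly describes SK3-4 only as a likely subspecies.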

Table 5 Genomic comparison of A. ayderensis AB04T and 15 other sequenced Anoxybacillus spp. using ANI [66]

Analysis of the GHs in A. ayderensis AB04T and other Anoxybacillus genomes

We detected 14 genes in the AB04T genome encoding GH enzymes belonging to GH families 1, 10, 13, 31, 32, 51, 52, and 67 (Table 6). On average, the AB04T GHs shared 93.9 % similarity with GHs identified in other Anoxybacillus spp. The GHs could be grouped into two types according to their predicted catalytic ability (Table 6). Nine GH enzymes were predicted to be active on α-chain polysaccharides whereas the remaining five GH enzymes were specific for β-linked polysaccharides (i.e., cellulose and xylan).

Table 6 List of several glycoside hydrolases (GHs) identified in various Anoxybacillus genomes

Interestingly, we found two GH enzymes that were uniquely present in strain AB04T: endo-1,4-β-xylanase (NCBI locus ID: KIP21668) and α-glucuronidase (KIP21917) (Table 6). The closest homologs of endo-1,4-β-xylanase and α-glucuronidase were found in Geobacillus thermoglucosidans and Geobacillus stearothermophilus with 81.9 % and 87.1 % sequence similarity, respectively [27].

Genes coding for at least five of the aforementioned GHs including cell-bound α-amylase, pullulanase, CDase, oligo-1,6-glucosidase, and α-glucosidase were consistently found in the genomes of all Anoxybacillus spp. (Table 6). Therefore, these enzymes might play an important role in Anoxybacillus carbohydrate metabolism. A high molecular-mass amylopullulanase (>200 kDa) from Anoxybacillus sp. SK3-4 has been reported previously [18]. We detected this enzyme in other Anoxybacillus spp., for instance A. flavithermus WK1 [5, 29], A. flavithermus subsp. yunnanensis E13T [35, 47, 48], A. kamchatkensis G10 [40–42], A. flavithermus AK1 [5, 39], and A. flavithermus Kn10 [5, 43]. From the current analysis, it can be concluded that amylopullulanase is the GH with the greatest molecular mass in Anoxybacillus (Table 6). Despite their widespread distribution in Anoxybacillus spp., only a limited number of GHs have been studied intensively. At present, only α-amylase [16], pullulanase [17], amylopullulanase [18], CDase [19], and α-L-arabinofuranosidase [22] have been cloned, purified, and biochemically characterized (Table 6). The number of underexplored GH enzymes such as β-glucosidase, endo-1,4-β-xylanase, α-L-arabinofuranosidase, α-glucuronidase, and β-xylosidase remains high; however, because of their interesting applications and their important roles in second-generation biofuel production [69], these enzymes are worthy of examination in the near future.

Other A. ayderensis AB04T enzymes with potential applications

Apart from the GHs, we found that A. ayderensis AB04T has genes coding for other industrially important enzymes such as xylose isomerase, esterase, and aldolase. Xylose isomerase (EC 5.3.1.5) catalyzes the isomerization of xylose to xylulose and of glucose to fructose, which is important in the industrial production of high-fructose corn syrup [20]. Earlier, a xylose isomerase from A. gonensis G2T was characterized and the enzyme displays 96.8 % amino acid sequence similarity to the one identified in strain AB04T (KIP21927) [20].

Previous studies have indicated that A. gonensis G2T, A. gonensis A4, and Anoxybacillus sp. PDF-1 produce esterase [70–72]. We identified two esterases (KIP19922 and KIP21735) in the genome of strain AB04T, which shared 96.3 % and 96.0 % amino acid sequence similarity with the esterase from Anoxybacillus sp. PDF-1 [72] and A. gonensis G2T [70], respectively. In addition, a fructose-1,6-bisphosphate aldolase from A. gonensis G2T has been described [73]. Strain AB04T carries two aldolases, KIP21451 and KIP21450, which showed 95.9 % and 99.9 % amino acid similarity to aldolase from A. flavithermus WK1 [5, 29] and A. thermarum AF/04T [33–35], respectively. We did not biochemically characterize these enzymes from strain AB04T in the current study.

Thermophilic bacteria are highly sought after for their potential use in bioremediation processes. Several Anoxybacillus spp. efficiently reduce metal ions such as Hg2+, Cr2+, Al3+, and As3+ [4, 23–25]. The genome of strain AB04T contains at least six heavy metal resistance genes. Four genes are related to mercuric ion reduction; two of these are mercury resistance (mer) operons (KIP20706 and KIP20408) and the two other genes encode mercuric reductases, which catalyze the reduction of Hg2+ to Hg0 (KIP19952 and KIP20409). In addition, strain AB04T carries genes for an arsenate reductase (KIP20402) and an arsenic efflux pump protein (KIP20401). The function of these genes will be studied in the near future.

Conclusions

Knowledge of the genomics, industrial enzymes, and relevant applications of Anoxybacillus spp. is rather limited compared to that of their closest relatives, Geobacillus and Bacillus. In the present work we presented a whole-genome sequence of A. ayderensis AB04T and its annotation. Additionally, we provided insights into several GHs, under-explored enzymes, and putative applications of strain AB04T.