Introduction

Modern metagenomic approaches have provided insights into the evolution and functional capacity of microbial communities that are resistant to classical culture-based methods [1]. However, these classical techniques remain crucial for understanding the molecular adaptations of microbial guilds, especially those with potential biotechnological applications [2, 3]. Consequently, efforts to isolate novel taxa, particularly from environmentally extreme habitats, remain widespread [4, 5].

The genus Thermoactinomyces is a member of the family Thermoactinomycetaceae. The first known representative of this genus (Thermoactinomyces vulgaris) was isolated from decaying straw and manure [6]. Since then, a number of isolates from a wide array of extreme habitats [7–10] have been validly described. Currently, the genus comprises ten validly published species, including Thermoactinomyces vulgaris [6], Thermoactinomyces intermedius [11], Thermoactinomyces daqus [7] and Thermoactinomyces guangxiensis [8]. These species are all Gram-positive, aerobic, non-acid-fast, chemoorganotrophic, filamentous and thermophilic bacteria.

Here, we report the draft genome sequence of Thermoactinomyces sp. strain AS95, which was isolated from a sebkha (endorheic salt pan) in the Thamelaht region of Algeria. We present a summary of the classification and set of phenotypic features for Thermoactinomyces sp. strain AS95 together with the description of the non-contiguous genome sequence and its annotation with particular reference to ORFs encoding proteolytic enzymes.

Organism information

Classification and features

Thermoactinomyces sp. strain AS95 was isolated from a sebkha water sample collected in June 2013 from the Thamelaht region of Algeria (Table 1). This isolate is a Gram-positive, aerobic, thermophilic, filamentous bacterium (Fig. 1) belonging to the order Bacillales. In 16S rRNA gene sequence similarity searches by BLASTN against the NCBI-NT database, strain AS95 showed 97–99 % sequence similarity to members of the genus Thermoactinomyces. A 16S rRNA gene-based phylogenetic tree of Thermoactinomyces sp. strain AS95 was constructed (Fig. 2) using the neighbor-joining method and the maximum composite likelihood model with 1000 bootstrap replications in MEGA 7 [12]. The 16S rRNA gene sequence of Thermoactinomyces sp. strain AS95 (KU942442) exhibited high identity (99 %) with that of Thermoactinomyces vulgaris RVH210302 (AY114167), the closest validly published Thermoactinomyces species.
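As a rough illustration of what a percent-identity figure such as the 99 % above expresses, the following minimal Python sketch computes identity over two pre-aligned sequences. The function name `percent_identity` and the toy fragments are hypothetical and are not taken from the study. Gapped columns are skipped here, whereas BLASTN reports identity over the full alignment length including gaps, so the two values can differ slightly.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # simplification: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA sequences).
query = "ACGTACGT-ACGT"
subject = "ACGTTCGTAACGT"
print(round(percent_identity(query, subject), 2))
```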

Table 1 Classification and general features of Thermoactinomyces sp. strain AS95

Fig. 1 Scanning electron microscopy of Thermoactinomyces sp. strain AS95 using a Cryo-SEM (JEOL)

Fig. 2 Phylogenetic tree based on 16S rRNA gene sequences showing the relationship between strain AS95 (1435 bp) and strains of related genera of the family Thermoactinomycetaceae. GenBank accession numbers are shown in parentheses following each organism name. The tree was constructed using the neighbor-joining method with the maximum composite likelihood model implemented in MEGA 7, with the 16S rRNA gene sequence of Sulfobacillus acidophilus DSM 10332T as the outgroup. Bootstrap consensus trees were inferred from 1000 replicates; only bootstrap values >50 % are indicated. The scale bar represents 0.02 nucleotide changes per position. (♦) indicates the isolate assessed in the current study, Thermoactinomyces sp. strain AS95

The strain was cultivated on Thermus medium agar containing 2.0 g NaCl, 4.0 g yeast extract, 8.0 g peptone and 30.0 g agar per liter of distilled water. The bacterium grew optimally at 55 °C, with a broad growth range of 40 to 65 °C (Table 1). The strain grew in liquid media at pH values from 5.6 to 8.6, with optimal growth at pH 7.2. Morphologically, the isolate forms white colonies and abundant aerial mycelia with well-developed, branched and septate substrate mycelia. The micromorphology of the cells was examined using scanning electron microscopy (Fig. 1). The predominant menaquinone was MK-7. The major fatty acid was iso-C15:0, and significant amounts of iso-C17:0 were also present.

Genome sequencing information

Genome project history

A high-quality draft genome sequence has been deposited at DDBJ/EMBL/GenBank under the accession LSVF00000000 and consists of 11 scaffolds of 11 contigs. A summary of the project information and its compliance with MIGS version 2.0 is shown in Table 2 [13].

Table 2 Project information

Growth conditions and genomic DNA preparation

Thermoactinomyces sp. strain AS95 was grown aerobically on Thermus medium agar (pH 7.2) at 55 °C for 24 h. Genomic DNA was extracted using a modification of a previously described protocol [14]. The quantity and quality of the genomic DNA were measured using a NanoDrop Spectrophotometer and a Qubit™ Fluorometer (Thermo Fisher Scientific Inc.).

Genome sequencing and assembly

Genomic DNA samples of Thermoactinomyces sp. strain AS95 were sequenced at MR DNA (Shallowater, TX, USA). Genome sequencing was performed on a MiSeq (Illumina, Inc.), generating 2 × 300 bp paired-end libraries. The sequencing run produced a total of 5,085,250 reads, with a mean length of 265.58 bp. The raw paired-end sequences were quality-trimmed with the FASTX-Toolkit [15] using a Phred quality score threshold of ≥ 20. After trimming, a total of 3,013,639 reads with a mean length of 171.11 bp were assembled using SPAdes, version 3.5.0 [16]. The final assembly consisted of 11 scaffolds with a total genome size of 2.56 Mb.
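The trimming step and the read statistics above can be illustrated with a short Python sketch. It is a simplified example of 3'-end trimming at a Phred threshold of 20, assuming Phred+33 quality encoding; it is not the FASTX-Toolkit implementation, and the exact command used by the authors is not given in the text. The helper names (`phred_scores`, `trim_3prime`, `estimated_coverage`) and the toy read are hypothetical. The coverage value is a back-of-the-envelope estimate from the reported read count, mean read length and genome size, not a figure reported in the study.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, threshold: int = 20) -> tuple[str, str]:
    """Drop bases from the 3' end until one reaches the quality threshold."""
    scores = phred_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


def estimated_coverage(n_reads: int, mean_len: float, genome_size: int) -> float:
    """Approximate sequencing depth: total bases divided by genome size."""
    return n_reads * mean_len / genome_size


# Toy read (hypothetical): 'I' = Q40, '5' = Q20, '#' = Q2, '!' = Q0.
seq, qual = trim_3prime("ACGTACGT", "IIIII5#!")
print(seq, qual)

# Values reported in the text: trimmed reads, mean length, assembled genome size.
print(round(estimated_coverage(3_013_639, 171.11, 2_558_690), 1))
```

Under these assumptions the trimmed reads correspond to roughly 200-fold coverage of the 2.56 Mb assembly.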

Genome annotation

Genome annotation was carried out on the RAST server [17] and using the NCBI Prokaryotic Genome Annotation Pipeline tools [18]. This Whole Genome Shotgun sequence project has been deposited at DDBJ/EMBL/GenBank under accession LSVF00000000. The version described in this paper is version LSVF00000000.

Genome properties

The genome comprises 2,558,690 nucleotides with a G + C content of 47.95 % (Table 3), distributed across 11 scaffolds of 11 contigs. It contains a total of 2649 genes: 2550 protein-coding genes, 39 pseudogenes and 60 RNA-coding genes. The majority of protein-coding genes (75.45 %) were assigned a putative function, while the remaining genes were annotated as hypothetical. The distribution of genes in COG functional categories is presented in Table 4.
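For readers who wish to reproduce summary statistics of this kind from an assembly, the following Python sketch shows how G + C content can be computed from scaffold sequences and how the reported gene categories add up. The function `gc_content` and the toy sequences are hypothetical; only the gene counts and the 75.45 % figure come from the text, and the derived number of functionally annotated genes is an approximation.

```python
def gc_content(sequences: list[str]) -> float:
    """Percent G + C over all unambiguous bases (A, C, G, T) in the sequences."""
    gc = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        total += sum(s.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


# Toy scaffolds (hypothetical); a real run would read them from a FASTA file.
print(gc_content(["GGCC", "ATAT"]))

# Gene counts reported in the text.
protein_coding, pseudogenes, rna_genes = 2550, 39, 60
total_genes = protein_coding + pseudogenes + rna_genes
print(total_genes)

# Approximate number of protein-coding genes with a putative function (75.45 %).
with_function = round(protein_coding * 75.45 / 100)
print(with_function, protein_coding - with_function)
```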

Table 3 Genome statistics of the Thermoactinomyces sp. strain AS95
Table 4 Number of genes associated with general COG functional categories

A blastp comparison was conducted against the MEROPS database. A total of 64 protein-coding genes (2.4 %) were predicted to share homology with various categories of proteases (Table 5). Of these, 36 were predicted to be secreted via the classical pathway (SignalP), whereas the other 28 were predicted to be secreted via a non-classical pathway (SecretomeP). Only 2 of the 64 protein-coding genes share sequence similarity with proteases of Thermoactinomyces vulgaris and Thermoactinomyces sp. E79 in the MEROPS database.
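A screen of this kind is often summarized by keeping the best-scoring database hit per query. The Python sketch below illustrates that idea for BLAST tabular output (the standard 12-column `-outfmt 6` layout), followed by the arithmetic behind the reported proportions. The E-value cutoff of 1e-5, the function `best_hits`, and the gene and MEROPS identifiers in the toy lines are all hypothetical; the text does not state which thresholds the authors applied.

```python
def best_hits(tabular_lines: list[str], max_evalue: float = 1e-5) -> dict[str, tuple[str, float]]:
    """Return {query: (subject, bitscore)} for the top-scoring hit per query.

    Expects BLAST tabular columns: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore.
    """
    best: dict[str, tuple[str, float]] = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hypothetical cutoff, not taken from the study
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Toy hits with made-up identifiers.
lines = [
    "gene_0001\tMER0000001\t62.5\t240\t90\t0\t1\t240\t5\t244\t1e-80\t250",
    "gene_0001\tMER0000002\t40.0\t200\t120\t0\t1\t200\t1\t200\t1e-20\t90",
    "gene_0002\tMER0000003\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20",
]
print(best_hits(lines))

# Proportions from the reported counts: 64 predicted proteases out of 2649 genes.
classical, non_classical, total_genes = 36, 28, 2649
proteases = classical + non_classical
print(proteases, round(100 * proteases / total_genes, 1))
```

With the reported counts, 64 of the 2649 genes corresponds to the stated 2.4 %.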

Table 5 The four major types of proteases predicted in Thermoactinomyces sp. strain AS95

Conclusions

This study describes the draft genome sequence of Thermoactinomyces sp. strain AS95, which is associated with a high level of extracellular proteolytic activity. To date, only a few metabolic pathways involved in protein degradation have been characterized for the genus Thermoactinomyces [19]. The genome sequence and characteristics of strain AS95 will provide new insights into the mechanisms of protein degradation in the genus Thermoactinomyces and will contribute to establishing a comprehensive genomic catalog of the metabolic diversity of this genus.