Introduction

Since the discovery in 1944 of streptomycin, the first antibiotic isolated from bacteria, from Streptomyces griseus (Schatz et al. 1944), Actinobacteria have attracted considerable attention from the scientific community, mainly because of their ability to produce a wide spectrum of bioactive compounds with antimicrobial and antitumor activities (Demain 2014; Barka et al. 2016). To date, this class has provided more than 65% of the antibiotics and antifungals used in medicine, including over 10,000 bioactive compounds produced by members of the genus Streptomyces (Bérdy 2005, 2012; De Lima et al. 2012). This genus, first proposed by Waksman and Henrici (1943), belongs to the Gram-positive bacteria and is ubiquitous in nature (Kämpfer 2012). The classification of Streptomyces species is generally carried out using a polyphasic taxonomic approach that combines chemotaxonomic, phenotypic, and genotypic characteristics (Anderson and Wellington 2001). As of April 2020, about 841 species of the genus Streptomyces with valid names have been described and published (https://www.bacterio.net/streptomyces.html). Several studies have also been conducted to find new strains isolated from extreme habitats (Tiwari and Gupta 2012; Goodfellow 2013; Zhang et al. 2016). To cope with the major health problem of antibiotic-resistant microorganisms, many laboratories have focused on the discovery of new bioactive molecules produced by new species belonging to this genus. In this context, whole-genome sequencing (WGS) has opened new possibilities for better exploitation of the useful secondary metabolites produced by this genus (Harrison and Studholme 2014; Lee et al. 2018; Ward and Allenby 2018). This technology is supported by bioinformatics tools such as antiSMASH, which offers several features, including the prediction of gene clusters (Blin et al. 2017).

In this study, the genome of the new strain Streptomyces tunisialbus DSM 105760T, which produces a broad spectrum of secondary metabolites with antibacterial and antifungal effects (Ayed et al. 2018), was sequenced. The genome sequence of S. tunisialbus was established to broaden the genomic basis for functional and comparative analyses focusing on the secondary metabolite biosynthesis clusters of the strain.

Materials and methods

Genomic DNA was extracted from a culture grown for 3 days in ISP2 medium (Shirling and Gottlieb 1966) using a cetyltrimethylammonium bromide (CTAB)-based method (Hopwood 2000). The quality of the DNA was assessed by gel electrophoresis, and its quantity was determined by a fluorescence-based method using the Quant-iT PicoGreen dsDNA kit (Invitrogen) and the Tecan Infinite 200 Microplate Reader (Tecan Deutschland GmbH). For sequencing of the S. tunisialbus DSM 105760T genome, 4 μg of purified DNA was used to generate a paired-end library (TruSeq DNA PCR-Free Library Prep Kit, Illumina). The draft genome sequence of S. tunisialbus DSM 105760T was established by sequencing on an Illumina MiSeq system. The processed reads were assembled with the GS de novo Assembler software version 2.8 (Roche, Mannheim, Germany) using default settings, as recently described (Yücel et al. 2017). The draft genome was annotated with the GenDB 2.0 system (Meyer et al. 2003) and Prokka version 1.11 (Seemann 2014). To identify clusters involved in secondary metabolite biosynthesis, the annotated genome was analyzed with antiSMASH 5.1.2 (Blin et al. 2017). Based on the 16S rRNA gene analysis of Ayed et al. (2018), S. varsoviensis NRRL ISP-5346 was identified as the closest relative with a sequenced genome. S. varsoviensis NRRL ISP-5346 was found to produce multiple hygrolide sub-families, e.g., hygrobafilomycins (JBIR-100 and hygrobafilomycin) and bafilomycins (bafilomycins C1 and D) (Molloy et al. 2016). Finally, a comparative genome analysis of S. tunisialbus DSM 105760T and S. varsoviensis NRRL ISP-5346 was performed with the comparative genomics tool EDGAR (Blom et al. 2009, 2016). To determine the relatedness between S. tunisialbus DSM 105760T and S. varsoviensis NRRL ISP-5346 (accession number: JOBF0100001-118), average nucleotide identity (ANI) and average amino acid identity (AAI) analyses (Konstantinidis and Tiedje 2005) and a DNA-DNA hybridization (DDH) estimation were performed as described previously (Meier-Kolthoff et al. 2013).

Results and discussion

The MiSeq sequencing run (2 × 300 bp) yielded 2,092,356 reads, corresponding to approx. 627 Mb of sequence information and an average coverage of about 90-fold. The final draft genome established with the GS de novo Assembler software (version 2.8, Roche) consists of 35 contigs (N50: 455,831 bp) and 20 scaffolds (N50: 678,417 bp). Annotation of the draft genome of S. tunisialbus indicated a linear chromosome of 6,880,753 bp that encodes 5802 protein-coding sequences (CDSs). The chromosome also contains 72 tRNA genes and eight rRNA operons (16S-23S-5S rRNA). The average GC content is 71.85% (Table 1).
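The coverage and N50 figures quoted here follow from simple arithmetic, which the following minimal Python sketch illustrates. It is not the pipeline used in this study. The function names are hypothetical, the coverage is computed from the nominal read length (an upper bound, since processed reads are shorter), and the contig lengths passed to the N50 function are invented toy values, because the individual contig lengths are not given in the text.

```python
from typing import Iterable


def estimated_coverage(total_bases: int, genome_size: int) -> float:
    """Average sequencing depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-negative lengths


# Figures reported in the text
READ_COUNT = 2_092_356
READ_LENGTH = 300        # nominal MiSeq read length (assumption: no trimming)
GENOME_SIZE = 6_880_753  # draft chromosome size in bp

raw_bases = READ_COUNT * READ_LENGTH  # upper bound on sequenced bases
coverage = estimated_coverage(raw_bases, GENOME_SIZE)
print(f"raw bases: {raw_bases:,}; estimated coverage: {coverage:.1f}-fold")

# Hypothetical contig lengths, used only to illustrate the N50 definition
toy_contigs = [400, 300, 200, 100]
print("toy N50:", n50(toy_contigs))
```

With the nominal read length, the estimate lands slightly above 90-fold, in line with the reported value of approx. 627 Mb over a 6.88 Mb chromosome.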

Table 1 General features of S. tunisialbus DSM 105760T and S. varsoviensis NRRL ISP-5346

An ANI value of 84.93%, an AAI value of 87.43% and a DDH value of 25.00% were calculated for S. tunisialbus DSM 105760T and S. varsoviensis NRRL ISP-5346. High ANI and AAI values (above 95%) and high DDH estimates (Formula 2, above 70%) are usually observed for bacterial isolates representing the same species and are used as thresholds for bacterial species demarcation (Konstantinidis and Tiedje 2005). Since all three values fall clearly below these thresholds, the two strains, although closely related, do not belong to the same species.
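The species demarcation reasoning can be written as a small check. The sketch below is illustrative only: the thresholds are the ones stated in the text (95% for ANI and AAI, 70% for DDH), while the function names and the decision rule that all three criteria must be exceeded are assumptions made for this example.

```python
# Thresholds as stated in the text; values are percentages
SPECIES_THRESHOLDS = {"ANI": 95.0, "AAI": 95.0, "DDH": 70.0}


def exceeds_thresholds(values: dict) -> dict:
    """Report, per metric, whether the value lies above its species threshold."""
    return {name: values[name] > cutoff for name, cutoff in SPECIES_THRESHOLDS.items()}


def same_species(values: dict) -> bool:
    """Assumed rule for this sketch: every metric must exceed its threshold."""
    return all(exceeds_thresholds(values).values())


# Values reported for S. tunisialbus DSM 105760T vs. S. varsoviensis NRRL ISP-5346
observed = {"ANI": 84.93, "AAI": 87.43, "DDH": 25.00}

print(exceeds_thresholds(observed))
print("same species:", same_species(observed))
```

None of the three reported values reaches its threshold, so the check returns False for this strain pair.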

A comparative analysis of S. tunisialbus DSM 105760T and S. varsoviensis NRRL ISP-5346 was performed with EDGAR. A core set of 3830 genes is shared by both isolates, corresponding to 66% and 55% of all genes identified in the S. tunisialbus and S. varsoviensis draft genomes, respectively. These core genes mainly encode housekeeping functions, e.g., glycolysis. In addition, S. tunisialbus DSM 105760T possesses 1972 singletons, whereas 3062 singletons were identified for S. varsoviensis NRRL ISP-5346. Most of the S. tunisialbus DSM 105760T singletons were annotated as ‘hypothetical’ genes or transposases. However, genes encoding enzymes potentially involved in the synthesis of secondary metabolites were also identified among the unique genes of S. tunisialbus DSM 105760T, as were a few genes encoding different cytochromes. For S. varsoviensis NRRL ISP-5346, the clusters found to produce hygrobafilomycins (JBIR-100 and hygrobafilomycin) and bafilomycins (bafilomycins C1 and D) were unique.
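The relation between core genes, singletons and the quoted percentages can be checked with a short calculation. The sketch below assumes that, in a two-genome comparison, each genome's gene total equals the core set plus its singletons. This holds for S. tunisialbus (3830 + 1972 = 5802 CDSs), but for S. varsoviensis the resulting total of 6892 genes is an inference, not a figure stated in the text. The function name is hypothetical.

```python
def core_share(core: int, singletons: int) -> tuple:
    """Return (total genes, fraction of genes in the core set).

    Assumption for this sketch: total = core + singletons.
    """
    total = core + singletons
    return total, core / total


CORE_GENES = 3830  # shared by both isolates, as reported

total_tun, share_tun = core_share(CORE_GENES, 1972)  # S. tunisialbus singletons
total_var, share_var = core_share(CORE_GENES, 3062)  # S. varsoviensis singletons

print(f"S. tunisialbus: {total_tun} genes, core share {share_tun:.1%}")
print(f"S. varsoviensis: {total_var} genes, core share {share_var:.1%}")
```

Under this assumption the core share is about 66% for S. tunisialbus and between 55% and 56% for S. varsoviensis, close to the percentages given in the text.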

To identify all clusters involved in secondary metabolite biosynthesis, the S. tunisialbus DSM 105760T genome was analyzed by means of antiSMASH 5.1.2. The platform predicted 34 gene clusters encoding potential secondary metabolites, such as antibiotics, antitumor compounds, fungicides, and lantibiotics. Six of these gene clusters encode terpenes, such as 2-methylisoborneol [100%], a volatile organic compound well known for its biological activity; two further clusters are predicted for melanin biosynthesis and one for ectoine (Table 2). S. varsoviensis NRRL ISP-5346 has a similar portfolio of secondary metabolite clusters (Table 3).

Table 2 Secondary metabolite clusters of S. tunisialbus DSM 105760T
Table 3 Secondary metabolite clusters of S. varsoviensis NRRL ISP-5346

The large number of cryptic secondary metabolite gene clusters in the S. tunisialbus DSM 105760T genome offers previously unsuspected potential for synthesizing bioactive compounds. In addition, our study shows that sequencing new strains can enable the discovery of new secondary metabolites relevant to ecology, biotechnology and medicine.