Draft genome sequence of Streptomyces tunisialbus DSM 105760T

Streptomyces strains are well known as a promising source of bioactive secondary metabolites that are important in ecology, biotechnology and medicine. In this study, we present the draft genome of the new type strain Streptomyces tunisialbus DSM 105760T (= JCM 32165T), a rhizospheric bacterium with antimicrobial activity. The genome is 6,880,753 bp in size (average GC content, 71.85%) and contains 5802 protein-coding genes. Preliminary analysis with antiSMASH 5.1.2 revealed 34 predicted gene clusters for the synthesis of potential secondary metabolites, which were compared with those of Streptomyces varsoviensis NRRL ISP-5346.


Introduction
Since the discovery in 1944 of streptomycin from Streptomyces griseus, the first antibiotic isolated from bacteria (Schatz et al. 1944), Actinobacteria have attracted considerable attention from the scientific community, mainly owing to their ability to produce a wide spectrum of bioactive compounds with antimicrobial and antitumor activities (Demain 2014; Barka et al. 2016). Today, this class provides more than 65% of the antibiotics and antifungals used in medicine, including over 10,000 bioactive compounds produced by members of the genus Streptomyces (Bérdy 2005, 2012; De Lima et al. 2012). This genus, first proposed by Waksman and Henrici (1943), belongs to the Gram-positive bacteria and is ubiquitous in nature (Kämpfer 2012). Streptomyces species are generally classified using a polyphasic taxonomic approach that combines chemotaxonomic, phenotypic, and genotypic characteristics (Anderson and Wellington 2001). As of April 2020, about 841 species of the genus Streptomyces with valid names have been described and published (https://www.bacterio.net/streptomyces.html). Several studies have been conducted to find new strains in extreme habitats (Tiwari and Gupta 2012; Goodfellow 2013; Zhang et al. 2016). To cope with the major health problem of antibiotic-resistant microorganisms, many laboratories have focused on the discovery of new bioactive molecules produced by new species belonging to this genus. Whole-genome sequencing (WGS) has opened new possibilities for better exploitation of useful secondary metabolites produced by this genus (Harrison and Studholme 2014; Lee et al. 2018; Ward and Allenby 2018). This technology is supported by bioinformatics tools such as antiSMASH, whose features include the prediction of secondary metabolite gene clusters (Blin et al. 2017).
In this study, we sequenced the genome of the new strain Streptomyces tunisialbus DSM 105760T, which produces a broad spectrum of secondary metabolites with antibacterial and antifungal effects (Ayed et al. 2018). The genome sequence of S. tunisialbus was established to broaden the genomic basis for functional and comparative analyses focusing on the secondary metabolite biosynthesis clusters of the strain.

Materials and methods
Genomic DNA was extracted from a culture grown for 3 days in ISP2 medium (Shirling and Gottlieb 1966) using a cetyltrimethylammonium bromide (CTAB)-based method (Hopwood 2000). DNA quality was assessed by gel electrophoresis, and its quantity was determined by a fluorescence-based method using the Quant-iT PicoGreen dsDNA kit (Invitrogen) and the Tecan Infinite 200 Microplate Reader (Tecan Deutschland GmbH). For sequencing of the S. tunisialbus DSM 105760T genome, 4 μg of purified DNA was used to generate a paired-end library (TruSeq DNA PCR-Free Library Prep Kit, Illumina), which was sequenced on an Illumina MiSeq system. The processed reads were assembled with the GS de novo Assembler software version 2.8 (Roche, Mannheim, Germany) using default settings, as recently described (Yücel et al. 2017). The draft genome was annotated with the GenDB 2.0 system (Meyer et al. 2003) and Prokka version 1.11 (Seemann 2014). To identify clusters involved in secondary metabolite biosynthesis, the annotated genome was analyzed with antiSMASH 5.1.2 (Blin et al. 2017).

Results and discussion
The MiSeq sequencing run (2 × 300 bp) yielded 2,092,356 reads, corresponding to approx. 627 Mb of sequence information and an average coverage of about 90-fold. The final draft genome established by the GS de novo Assembler software (version 2.8, Roche) consists of 35 contigs (N50: 455,831 bp) in 20 scaffolds (N50: 678,417 bp). Annotation of the draft genome of S. tunisialbus indicated a linear chromosome of 6,880,753 bp that encodes 5802 protein-coding sequences (CDSs). The chromosome also contains 72 tRNA genes and eight rRNA operons (16S-23S-5S rRNA). The average GC content is 71.85% (Table 1). An ANI value of 84.93%, an AAI value of 87.43% and a DDH value of 25.00% were calculated for S. tunisialbus DSM 105760T and S. varsoviensis NRRL ISP-5346. High ANI and AAI values (above 95%) and high DDH estimates (Formula 2, above 70%) are usually observed for bacterial isolates representing the same species and are considered to mark the species boundary (Konstantinidis and Tiedje 2005). All three values obtained here lie well below these thresholds. Therefore, the two strains are closely related but do not belong to the same species.
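The sequencing and assembly figures above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names are hypothetical, the coverage estimate assumes that every read has the full length of 300 bp (an upper bound, since processed reads are usually shorter), and the N50 function is shown on made-up lengths because the individual contig lengths are not given in the text.

```python
from typing import Sequence


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average coverage = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def n50(lengths: Sequence[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def same_species(ani: float, aai: float, ddh: float,
                 ani_aai_cutoff: float = 95.0, ddh_cutoff: float = 70.0) -> bool:
    """Apply the ANI/AAI (95%) and DDH (70%) thresholds quoted in the text."""
    return ani >= ani_aai_cutoff and aai >= ani_aai_cutoff and ddh >= ddh_cutoff


# Values reported in the text
reads, read_len, genome = 2_092_356, 300, 6_880_753
print(f"{reads * read_len / 1e6:.1f} Mb")                     # 627.7 Mb
print(f"{mean_coverage(reads, read_len, genome):.1f}-fold")   # 91.2-fold
print(same_species(ani=84.93, aai=87.43, ddh=25.00))          # False

# Toy example only; these are not the real contig lengths
print(n50([100, 200, 300, 400]))                              # 300
```

Under the full-length assumption, the estimate of roughly 91-fold agrees with the reported coverage of about 90-fold, and the threshold check reproduces the conclusion that the two strains represent different species.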
A comparative analysis of S. tunisialbus DSM 105760T and S. varsoviensis NRRL ISP-5346 was performed with EDGAR. The core set of genes shared by both isolates comprises 3830 genes, corresponding to 66% and 55% of all genes identified in the S. tunisialbus and S. varsoviensis draft genomes, respectively. These core genes mainly encode housekeeping functions, e.g., glycolysis. S. tunisialbus DSM 105760T possesses 1972 singletons, whereas 3062 singletons were identified for S. varsoviensis NRRL ISP-5346. Most of the S. tunisialbus DSM 105760T singletons were annotated as 'hypothetical' genes or transposases. However, genes encoding enzymes potentially involved in the synthesis of secondary metabolites were also identified among the unique genes of S. tunisialbus DSM 105760T, as were a few unique genes encoding different cytochromes. For S. varsoviensis NRRL ISP-5346, the clusters found to produce hygrobafilomycins (JBIR-100 and hygrobafilomycin) and bafilomycins (bafilomycins C1 and D) were unique. To identify all clusters involved in secondary metabolite biosynthesis, the S. tunisialbus DSM 105760T genome was analyzed with antiSMASH 5.1.2. The platform revealed 34 predicted gene clusters encoding potential secondary metabolites, such as antibiotics, antitumor compounds, fungicides, and lantibiotics. Six of these gene clusters are predicted to encode terpenes, such as 2-methylisoborneol (100% similarity), a volatile organic compound well known for its biological activity; two clusters are predicted for melanin biosynthesis and one for ectoine (Table 2). S. varsoviensis NRRL ISP-5346 has a similar portfolio of secondary metabolite clusters (Table 3).
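The core-genome percentages reported above follow from the gene counts. The sketch below uses a hypothetical helper, `core_fraction`, and assumes that in this two-genome comparison each genome's gene total equals the core genes plus its singletons. This holds for S. tunisialbus (3830 + 1972 = 5802 CDSs) but is an assumption for S. varsoviensis, whose total gene count is not stated in the text.

```python
def core_fraction(core: int, singletons: int) -> float:
    """Percentage of a genome's genes that belong to the shared core set.

    Assumes total genes = core + singletons (two-genome comparison).
    """
    return 100.0 * core / (core + singletons)


CORE_GENES = 3830              # shared by both isolates
TUNISIALBUS_SINGLETONS = 1972
VARSOVIENSIS_SINGLETONS = 3062

# The S. tunisialbus total matches the reported number of CDSs
print(CORE_GENES + TUNISIALBUS_SINGLETONS)                          # 5802

print(round(core_fraction(CORE_GENES, TUNISIALBUS_SINGLETONS), 1))   # 66.0
print(round(core_fraction(CORE_GENES, VARSOVIENSIS_SINGLETONS), 1))  # 55.6
```

Under this assumption the core set covers about 66% of the S. tunisialbus genes and between 55% and 56% of the S. varsoviensis genes, consistent with the percentages given in the text.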
The presence of a large number of cryptic secondary metabolite gene clusters in the S. tunisialbus genome offers unsuspected potential for the synthesis of bioactive compounds. In addition, our study shows that sequencing new strains allows the discovery of new secondary metabolites relevant to ecology, biotechnology and medicine.

Acknowledgements
Open Access funding provided by Projekt DEAL. The bioinformatics support of the BMBF-funded project "Bielefeld-Gießen Center for Microbial Bioinformatics" (BiGi) (Grant 031A533) within the German Network for Bioinformatics Infrastructure (de.NBI) is gratefully acknowledged. This research work has been financed by the German Academic Exchange Service (DAAD) with funds from the German Federal Foreign Office in the frame of the Research Training Network "Novel Cytotoxic Drugs from Extremophilic Actinomycetes" (Project ID 57166072).

Data availability
The scaffolds of the whole-genome shotgun project have been deposited in the EMBL database (EBI) under the accession numbers OKRJ01000001-OKRJ01000020.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.