Introduction

Members of the genus Streptomyces, Gram-positive filamentous actinomycetes, are an attractive source of bioactive secondary metabolites. Terrestrial surface soil is the most common habitat of Streptomyces, but a recent survey has revealed that the genus is also widely distributed in marine environments. Marine Streptomyces are currently attracting much attention as an untapped resource of novel bioactive compounds useful for drug development [1–3]. In our screening for new anti-MRSA antibiotics, Streptomyces sp. TP-A0598, isolated from deep sea water, was found to produce lydicamycin and four new congeners of polyketide origin (Fig. 1) [4]. Lydicamycin is characterized by an unprecedented pyrrolidine ring modified by an aminoiminomethyl group. This ring is linked to one end of a polyketide-derived carbon chain bearing multiple hydroxyl and olefinic functionalities, and the other end of the chain is linked to an octalin modified by a tetramic acid. Despite these unique structural features, the biosynthetic genes for lydicamycin have not been reported to date. In this study, we conducted whole genome shotgun sequencing of strain TP-A0598 to identify the PKS gene cluster for lydicamycin. We herein present the draft genome sequence of Streptomyces sp. TP-A0598, together with a description of its genome properties and the annotation of its secondary metabolite genes. The putative lydicamycin biosynthetic gene cluster and a plausible biosynthetic pathway are also reported.

Fig. 1 Chemical structures of lydicamycin and its congeners produced by Streptomyces sp. TP-A0598

Organism information

Classification and features

In the course of screening for new bioactive molecules produced by marine microorganisms, Streptomyces sp. TP-A0598 was isolated by a membrane filter method from a seawater sample collected 2,600 m offshore at a depth of 321 m at Namerikawa, Toyama, Japan, and was found to produce lydicamycin and its novel congeners. The strain grew well on Bennett’s, ISP 3, ISP 4 and Yeast starch agars, whereas growth was poor on ISP 5, ISP 6 and ISP 7 agars. On ISP 3 agar, the aerial mycelia were grayish olive and the reverse side was pale yellow. Diffusible pigments were not formed on any of the agar media examined. Strain TP-A0598 formed spiral spore chains, and the spores were cylindrical, 0.5 × 0.9 μm in size, with a warty surface [4]. A scanning electron micrograph of this strain is shown in Fig. 2. Growth occurred at 15–37 °C (optimum 30 °C) and pH 5–9 (optimum pH 7). Strain TP-A0598 grew in the presence of 0–7 % (w/v) NaCl (optimum 0 % NaCl) and utilized D-glucose, sucrose, inositol, L-rhamnose, D-mannitol, D-raffinose, D-fructose, L-arabinose, and D-xylose for growth (Table 1) [4]. The strain was deposited in the NBRC culture collection under the registration number NBRC 110027. The 16S rRNA gene was amplified by PCR using two universal primers, 9F and 1541R. After purification of the PCR product with AMPure (Beckman Coulter), sequencing was carried out according to an established method [5]. A homology search of the sequence with EzTaxon-e [6] indicated the highest similarity (99.93 %, 1465/1466) to Streptomyces angustmyceticus NBRC 3934T (AB184817) [7] as the closest type strain. A phylogenetic tree was reconstructed from the 16S rRNA gene sequence together with those of phylogenetic neighbors showing over 98.5 % similarity (Fig. 3) using ClustalX2 [8] and NJplot [9]. The phylogenetic analysis confirmed that strain TP-A0598 belongs to the genus Streptomyces.
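The similarity value quoted above follows directly from the reported match counts. A minimal sketch of that arithmetic in Python is given here for illustration; the function name percent_identity is hypothetical and is not part of EzTaxon-e.

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Return pairwise identity as a percentage of aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


# Match counts reported in the text for TP-A0598 versus
# S. angustmyceticus NBRC 3934T: 1465 identical positions out of 1466.
identity = percent_identity(1465, 1466)
print(f"{identity:.2f} %")
```

With the counts from the text, a single mismatch over 1466 aligned positions gives the reported 99.93 %.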

Fig. 2 Scanning electron micrograph of Streptomyces sp. TP-A0598 grown on ten-fold diluted ISP 2 medium agar for 11 days at 28 °C. Bar, 5 μm

Table 1 Classification and general features of Streptomyces sp. TP-A0598
Fig. 3 Phylogenetic tree highlighting the position of Streptomyces sp. TP-A0598 relative to phylogenetically close type strains within the genus Streptomyces. The strains and their corresponding GenBank accession numbers for 16S rRNA genes are shown in parentheses. The sequences were aligned with ClustalX2 [8], and the tree was constructed by the neighbor-joining method [27]. All positions containing gaps were eliminated. Bootstrapping with 1000 replicates was used to generate a majority consensus tree [28], and only bootstrap values above 50 % are shown at branching points. Kitasatospora setae [29] was used as an outgroup

Chemotaxonomic data

The whole-cell hydrolysates of strain TP-A0598 contained L,L-diaminopimelic acid, glycine, ribose and madurose. The cellular fatty acids consisted of 21 % 14-methylpentadecanoic acid (iso C16:0), 9 % 13-methyltetradecanoic acid (iso C15:0), 8 % 12-methyltetradecanoic acid (anteiso C15:0) and other minor fatty acids [4].

Genome sequencing information

Genome project history

In a collaboration between Toyama Prefectural University and NBRC, the organism was selected for genome sequencing to elucidate the lydicamycin biosynthetic gene cluster. The genome project of Streptomyces sp. TP-A0598 has been completed as reported in this paper. The draft genome sequence data have been deposited in the INSDC database under the accession numbers BBNO01000001–BBNO01000020. The project information and its association with MIGS version 2.0 compliance are summarized in Table 2 [10].

Table 2 Project information

Growth conditions and genomic DNA preparation

A monoisolate of Streptomyces sp. TP-A0598 was grown on a polycarbonate membrane filter (Advantec) placed on two-fold diluted ISP 2 agar medium (0.2 % yeast extract, 0.5 % malt extract, 0.2 % glucose, 2 % agar, pH 7.3) at 28 °C. High-quality genomic DNA for sequencing was isolated from the mycelia with an EZ1 DNA Tissue Kit and a BioRobot EZ1 (Qiagen) according to the protocol for extraction of nucleic acid from Gram-positive bacteria. To assess the quality of the genomic DNA, its size, purity, and double-stranded DNA concentration were measured by pulsed-field gel electrophoresis, the ratio of absorbance at 260 nm to that at 280 nm, and the Quant-iT PicoGreen dsDNA Assay Kit (Life Technologies), respectively.

Genome sequencing and assembly

Shotgun and paired-end libraries were prepared and sequenced using 454 pyrosequencing technology and HiSeq1000 (Illumina) paired-end technology, respectively (Table 2). The 70 Mb of shotgun sequences and 702 Mb of paired-end sequences were assembled into 20 scaffolds larger than 500 bp using Newbler v2.6, and the assembly was subsequently finished using GenoFinisher [11].
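As a rough cross-check of sequencing depth, the combined read volume can be divided by the genome size reported in this paper. The sketch below assumes that "Mb" denotes 10^6 bases and that all reads contribute to the assembly; the function name fold_coverage is hypothetical, and the result is an estimate rather than a value taken from Table 2.

```python
# Read totals from the text; "Mb" is assumed here to mean 10**6 bases.
SHOTGUN_BASES = 70e6
PAIRED_END_BASES = 702e6
GENOME_SIZE_BP = 8_319_549  # draft genome size reported in this paper


def fold_coverage(total_bases: float, genome_size: int) -> float:
    """Estimate average sequencing depth as sequenced bases per genome base."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


coverage = fold_coverage(SHOTGUN_BASES + PAIRED_END_BASES, GENOME_SIZE_BP)
print(f"approx. {coverage:.1f}x")
```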

Genome annotation

Coding sequences were predicted by Prodigal [12], and tRNA genes were predicted by tRNAscan-SE [13]. Gene functions were annotated using an in-house genome annotation pipeline, and domains related to PKS and NRPS were searched for using the SMART and PFAM domain databases. PKS and NRPS gene clusters and their domain organizations were analyzed manually. A similarity search against the NCBI nr database was also used for functional prediction of genes in the lydicamycin biosynthetic gene cluster.

Genome properties

The total size of the genome is 8,319,549 bp and the GC content is 71.0 % (Table 3), similar to other genome-sequenced Streptomyces members. Of the total 7,344 genes, 7,240 are protein-coding genes and 104 are RNA genes. The classification of genes into COG functional categories is shown in Table 4. With regard to secondary metabolism, Streptomyces sp. TP-A0598 has two type I PKS, two type II PKS, two NRPS, and two hybrid PKS/NRPS gene clusters, suggesting a high capacity for the production of polyketides and nonribosomal peptides.
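The figures in this paragraph can be summarized with a few lines of arithmetic. The following Python sketch uses only values stated in the text; the variable names are illustrative, and the GC base count is approximate because the GC content is reported to one decimal place.

```python
GENOME_SIZE_BP = 8_319_549
GC_FRACTION = 0.710          # 71.0 %, as reported (rounded)
TOTAL_GENES = 7_344
PROTEIN_CODING_GENES = 7_240

# PKS/NRPS gene clusters by type, as listed in the text.
clusters = {
    "type I PKS": 2,
    "type II PKS": 2,
    "NRPS": 2,
    "hybrid PKS/NRPS": 2,
}

gc_bases = round(GENOME_SIZE_BP * GC_FRACTION)             # approximate G+C bases
coding_share = 100.0 * PROTEIN_CODING_GENES / TOTAL_GENES  # % of genes encoding proteins
total_clusters = sum(clusters.values())

print(f"G+C bases (approx.): {gc_bases:,}")
print(f"Protein-coding share of genes: {coding_share:.1f} %")
print(f"PKS/NRPS gene clusters: {total_clusters}")
```

The cluster total of eight is consistent with one lydicamycin cluster plus the seven orphan clusters mentioned in the Conclusions.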

Table 3 Genome statistics
Table 4 Number of genes associated with general COG functional categories

Insights from the genome sequence

The chemical structure of lydicamycin (Fig. 1) suggests that its carbon skeleton is assembled from eleven malonyl-CoA and six methylmalonyl-CoA precursors by a type I PKS pathway. In addition, this pathway should be combined with an NRPS pathway, since lydicamycin bears a tetramic acid moiety derived from the condensation of an amino acid with the polyketide chain. We therefore searched for a type I PKS gene cluster consisting of seventeen PKS modules and an NRPS module. A hybrid PKS/NRPS gene cluster in scaffold03 (Table 5, Fig. 4) consists of seventeen PKS modules and one NRPS module (Fig. 5b). According to the assembly line rule [14], the structure of the polyketide predicted from this hybrid PKS/NRPS gene cluster is in good agreement with the actual structure of lydicamycin (Fig. 5b). 4-Guanidinobutyryl-CoA is proposed as the starter unit for the polyketide assembly on the basis of the annotation of TPA0598_03_00880, TPA0598_03_00650 and TPA0598_03_00700. These genes were predicted to encode an amine oxidase, an acyl-CoA ligase, and a transacylase, respectively, by comparison with the corresponding genes in the ECO-02301 biosynthetic gene cluster. In the biosynthesis of ECO-02301, 4-aminobutyryl-CoA is supplied from L-arginine by the sequential action of an amine oxidase, an acyl-CoA ligase, and an amidinohydrolase, and is transferred to the ACP by a transacylase (Fig. 5a) [15]. In the lydicamycin cluster, the amine oxidase, acyl-CoA ligase, and transacylase genes are present in the region surrounding the PKS genes, but an amidinohydrolase gene responsible for hydrolysis of the guanidine residue to the primary amine is lacking (Fig. 5a, Table 5). After the 4-guanidinobutyryl starter is loaded onto the ACP of TPA0598_03_00840, the polyketide chain is extended by eight PKSs and a glycine is added to the polyketide terminus by an NRPS module (Fig. 5b), followed by the formation of an octalin and a tetramic acid ring (Fig. 5c). It was not possible to assign a gene responsible for the cyclization of the guanidino precursor into a pyrrolidine ring. A cytochrome P450 (TPA0598_03_00850) is likely responsible for the hydroxylation of the octalin carbon at C-8 (Fig. 5c). The production of deoxy and demethyl congeners suggests that substrate recognition by the AT domain in module 3 (second module of TPA0598_03_00740) and by the ER domain in module 11 (first module of TPA0598_03_00780) is likely not strict (Table 6).
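The module count used in the search above can be tallied from the proposed building blocks. The Python sketch below is a bookkeeping illustration only, not the authors' prediction method: it assumes the textbook rule that each PKS extender contributes two backbone carbons (plus one methyl branch for methylmalonyl-CoA), that the 4-guanidinobutyryl starter contributes four chain carbons plus the guanidino carbon, and that both glycine carbons are retained. The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    count: int
    backbone_carbons: int  # carbons each unit adds to the main chain
    branch_carbons: int    # extra carbons per unit (methyl branch, guanidino carbon)
    is_pks_extender: bool = False


# Building blocks as proposed in the text; carbon contributions are assumptions
# based on standard polyketide/nonribosomal peptide assembly logic.
units = [
    Unit("4-guanidinobutyryl starter", 1, 4, 1),
    Unit("malonyl-CoA extender", 11, 2, 0, is_pks_extender=True),
    Unit("methylmalonyl-CoA extender", 6, 2, 1, is_pks_extender=True),
    Unit("glycine (NRPS module)", 1, 2, 0),
]

# One PKS module is needed per extender unit.
pks_modules = sum(u.count for u in units if u.is_pks_extender)
total_carbons = sum(u.count * (u.backbone_carbons + u.branch_carbons) for u in units)

print(f"PKS modules required: {pks_modules}")
print(f"Carbons in the assembled skeleton: {total_carbons}")
```

Under these assumptions the extender tally gives the seventeen PKS modules sought in the genome, and the carbon total can be compared against the structure shown in Fig. 1.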

Table 5 Open reading frames in the lydicamycin biosynthetic gene cluster
Fig. 4 Genetic map of the lydicamycin biosynthetic gene cluster

Fig. 5 Proposed lydicamycin biosynthetic pathway. a starter synthesis compared with that of ECO-02301; b chain elongation; c cyclization and modification yielding final products

Table 6 Proposed mechanism to produce lydicamycin congeners

Conclusions

The 8 Mb draft genome of Streptomyces sp. TP-A0598, a producer of lydicamycins isolated from seawater, has been deposited at GenBank/ENA/DDBJ under accession number BBNO00000000. We successfully identified the PKS/NRPS hybrid cluster for lydicamycin biosynthesis and proposed a plausible biosynthetic pathway. In addition, the genome of strain TP-A0598 contains seven orphan PKS or NRPS gene clusters, but secondary metabolites from these orphan clusters have not yet been isolated. The genome sequence information disclosed in this study will be utilized for the investigation of additional new bioactive compounds from this strain and will also serve as a valuable reference for evaluating the metabolic potential of marine-derived Streptomyces.