Introduction

Rakicidin D is an inhibitor of tumor cell invasion isolated from the culture broth of actinomycete strain MWW064 of the genus Streptomyces [1]. To date, five congeners have been reported: rakicidins A, B, and E from Micromonospora and rakicidins C and D from Streptomyces [1–4]. Rakicidins share a 15-membered cyclic depsipeptide structure comprising three amino acids and a fatty acid modified with hydroxy and methyl substitutions. The most intriguing part of the rakicidins is a rare amino acid, 4-amino-2,4-pentadienoate (APDA), which is present in only a limited range of actinomycete secondary metabolites such as BE-43547 [5] and microtermolide [6, 7]. Despite the scarcity of the APDA unit in nature, nothing is known about its biosynthesis. Recently, putative biosynthetic genes for rakicidin D were reported [8], but the data are incomplete: the paper gives no detailed information, and the DNA sequences have not been registered in public databases. Hence, the biosynthesis of rakicidins remains unclear. In this study, we performed whole genome shotgun sequencing of strain MWW064 to elucidate the biosynthetic mechanism of rakicidin D. We herein present the draft genome sequence of Streptomyces sp. MWW064, together with the taxonomic identification of the strain, a description of its genome properties, and annotation of the gene cluster for rakicidin synthesis. We propose a rakicidin biosynthetic mechanism predicted by bioinformatic analysis and supported by precursor-incorporation experiments.

Organism information

Classification and features

In the course of screening for antitumor compounds from actinomycetes, Streptomyces sp. MWW064 was isolated from a marine sediment sample collected in Samut Sakhon province of Thailand and found to produce rakicidin D [1]. The general features of this strain are shown in Table 1. The strain grew well on ISP 2 and ISP 4 agars, whereas growth was poor on ISP 5 and ISP 7 agars. On ISP 2 agar, the aerial mycelia were white, the reverse side was pale red, and the diffusible pigments were dark orange. Strain MWW064 formed extensively branched substrate and aerial mycelia. The aerial mycelium formed flexuous spore chains at maturity. The spores were cylindrical with a smooth surface. A scanning electron micrograph of this strain is shown in Fig. 1. Growth occurred at 15–37 °C (optimum 28 °C) and pH 5–9 (optimum pH 7). Strain MWW064 grew in the presence of 0–3 % (w/v) NaCl (optimum 0 % NaCl) and utilized glucose and inositol for growth. The 16S rRNA gene sequence was obtained from the GenBank/EMBL/DDBJ databases (accession no. GU295447). A phylogenetic tree was reconstructed on the basis of the 16S rRNA gene sequence together with those of taxonomically close Streptomyces type strains using ClustalX2 [9] and NJPlot [10]. The phylogenetic analysis confirmed that strain MWW064 belongs to the genus Streptomyces (Fig. 2).

Table 1 Classification and general features of Streptomyces sp. MWW064 [13]
Fig. 1 Scanning electron micrograph of Streptomyces sp. MWW064 grown on 1/2 ISP 2 agar for 7 days at 28 °C. Bar, 2 μm

Fig. 2 Phylogenetic tree of Streptomyces sp. MWW064 and phylogenetically close type strains showing over 98.5 % 16S rRNA gene sequence similarity. Accession numbers for the 16S rRNA genes are shown in parentheses. Sequences were aligned with ClustalX2 [9], and the tree was constructed by the neighbor-joining method [35]. All positions containing gaps were eliminated. Bootstrap analysis with 1,000 replicates was used to generate a majority consensus tree, and only bootstrap values above 50 % are shown at branching points. Streptomyces albus NBRC 13014T was used as an outgroup
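
The similarity criterion used to select type strains for the tree (over 98.5 % 16S rRNA gene sequence similarity, with gapped positions eliminated) can be illustrated with a minimal Python sketch. This is not the ClustalX2/NJPlot workflow itself: the function name, the pairwise treatment of gaps, and the toy sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences, skipping gapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # eliminate positions containing gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA gene data).
query = "ACGT-ACGTTAGC"
reference = "ACGTTACGATAGC"

identity = percent_identity(query, reference)
# Threshold taken from the figure caption.
passes_threshold = identity > 98.5
```

In a full analysis, gapped columns would be removed across the whole multiple alignment rather than pair by pair, so values computed this way may differ slightly from those underlying Fig. 2.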

Chemotaxonomic data

The isomer of diaminopimelic acid in the whole-cell hydrolysate was analyzed according to the method described by Hasegawa et al. [11]. Isoprenoid quinones and cellular fatty acids were analyzed as described previously [12]. The whole-cell hydrolysate of strain MWW064 contained LL-diaminopimelic acid as its diagnostic peptidoglycan diamino acid. The predominant menaquinones were identified as MK-9(H2), MK-9(H4) and MK-9(H6); MK-10(H2), MK-10(H4) and MK-10(H6) were also detected as minor components. The major cellular fatty acids were found to be anteiso-C15:0, iso-C15:0, C16:0, anteiso-C17:0, iso-C17:0 and iso-C16:0.

Genome sequencing information

Genome project history

In a collaboration between Toyama Prefectural University and NBRC, the organism was selected for genome sequencing to elucidate the rakicidin biosynthetic pathway. The genome project of Streptomyces sp. MWW064 was completed as reported in this paper. The draft genome sequences have been deposited in the INSDC database under accession numbers BBUY01000001-BBUY01000099. The project information and its association with MIGS version 2.0 compliance are summarized in Table 2 [13].

Table 2 Project information

Growth conditions and genomic DNA preparation

Streptomyces sp. MWW064 was deposited in the NBRC culture collection under the registration number NBRC 110611. Its monoisolate was grown on a polycarbonate membrane filter (Advantec) placed on two-fold diluted ISP 2 agar medium (0.2 % yeast extract, 0.5 % malt extract, 0.2 % glucose, 2 % agar, pH 7.3) at 28 °C. High-quality genomic DNA for sequencing was isolated from the mycelia with an EZ1 DNA Tissue Kit and a Bio Robot EZ1 (Qiagen) according to the protocol for extraction of nucleic acid from Gram-positive bacteria. To assess the quality of the genomic DNA, its size, purity, and double-stranded DNA concentration were measured by pulsed-field gel electrophoresis, the ratio of absorbance values at 260 nm and 280 nm, and a Quant-iT PicoGreen dsDNA Assay Kit (Life Technologies), respectively.

Genome sequencing and assembly

Shotgun and paired-end libraries were prepared and subsequently sequenced using 454 pyrosequencing technology and HiSeq1000 (Illumina) paired-end technology, respectively (Table 2). The 70 Mb shotgun sequences and 739 Mb paired-end sequences were assembled using Newbler v2.8 and subsequently finished using GenoFinisher [14] to yield 99 scaffolds larger than 500 bp.
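
As a rough check on sequencing depth, the read totals above can be divided by the genome size reported under Genome properties. The short Python sketch below assumes that "Mb" means 10^6 bases and that all reads contribute to the assembly; the function name is hypothetical, and the result is only an approximation that may differ from any coverage value listed in Table 2.

```python
def fold_coverage(total_read_bases: float, genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return total_read_bases / genome_size_bp


shotgun_bases = 70e6        # 454 shotgun reads, from the text (assumes 1 Mb = 1e6 bases)
paired_end_bases = 739e6    # HiSeq1000 paired-end reads, from the text
genome_size = 7_870_697     # draft genome size in bp, from Genome properties

coverage = fold_coverage(shotgun_bases + paired_end_bases, genome_size)
print(f"approximate coverage: {coverage:.0f}x")
```

Under these assumptions the combined data correspond to roughly a hundred-fold coverage of the draft genome.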

Genome annotation

Coding sequences were predicted by Prodigal [15], and tRNA genes by tRNAscan-SE [16]. Gene functions were annotated using an in-house genome annotation pipeline, and PKS- and NRPS-related domains were searched using the SMART and PFAM domain databases. PKS and NRPS gene clusters and their domain organizations were determined as reported previously [17]. Substrates of adenylation (A) and acyltransferase (AT) domains were predicted using antiSMASH [18]. BLASTP searches against the NCBI nr database were also used to predict the functions of proteins encoded in the rakicidin biosynthetic gene cluster.

Genome properties

The total size of the genome is 7,870,697 bp and the GC content is 71.1 % (Table 3), similar to those of other genome-sequenced Streptomyces members. Of the total 7,206 genes, 7,135 are protein-coding genes and 71 are RNA genes. The classification of genes into COG functional categories is shown in Table 4. Regarding secondary metabolite pathways involving modular PKSs and NRPSs, Streptomyces sp. MWW064 has at least four hybrid PKS/NRPS gene clusters, three type I PKS gene clusters, and seven NRPS gene clusters. According to the assembly line mechanism [19], we predicted the chemical backbone that each cluster would synthesize (Table 5). These predictions suggest that Streptomyces sp. MWW064 has the potential to produce diverse polyketide and nonribosomal peptide compounds as secondary metabolites.

Table 3 Genome statistics
Table 4 Number of genes associated with general COG functional categories
Table 5 Modular PKS and NRPS gene clusters in Streptomyces sp. MWW064
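
The headline genome statistics can be reproduced in principle with a few lines of Python. The sketch below is illustrative only: the function name and the toy scaffolds are hypothetical, and ignoring ambiguous bases (such as N) is an assumption, since the text does not state how they were handled when the GC content was calculated.

```python
def gc_content(sequences):
    """GC percentage over an iterable of nucleotide strings (e.g. scaffolds)."""
    gc = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        # Count only unambiguous bases; N and other ambiguity codes are ignored.
        total += sum(s.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


# Gene counts as stated in the text.
protein_coding = 7135
rna_genes = 71
total_genes = protein_coding + rna_genes

# Toy scaffolds (hypothetical sequences, not from the MWW064 assembly).
toy_scaffolds = ["GGCCGCGATN", "CGCGTA"]
toy_gc = gc_content(toy_scaffolds)
```

Applied to the 99 deposited scaffolds, a calculation of this kind would be expected to give a value close to the 71.1 % reported in Table 3.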

Insights from the genome sequence

Rakicidin biosynthetic pathway in Streptomyces sp. MWW064

The chemical structure of rakicidin D suggests that it is synthesized by a hybrid PKS/NRPS pathway. Among the four hybrid PKS/NRPS gene clusters present in Streptomyces sp. MWW064 (Table 5), pks/nrps-1 is most likely responsible for rakicidin synthesis because the carbon backbone of its predicted product (R-C3-C3-Ser-C2-Gly-X) is in good accordance with that of rakicidin D. Genes in pks/nrps-1 (Table 6) encode the enzymes necessary for rakicidin biosynthesis (Fig. 3). This cluster contains three PKS genes (SSP35_09_01910, SSP35_09_01900, SSP35_09_01880) and three NRPS genes (SSP35_09_01890, SSP35_09_01870, SSP35_09_01860), corresponding to rakAB, rakC, rakEF, rakD, rakG, and rakH [8], respectively. Based on the collinearity rule of modular PKS/NRPS pathways, we deduce that RakAB loads a starter molecule (‘R’ in Fig. 3) and that RakAB and RakC then add a diketide chain to the starter by condensation of two methylmalonyl-CoA molecules, since the substrates of their AT domains are likely methylmalonyl-CoA (‘ATm’ in Fig. 3). The NRPS RakD and the remaining PKS RakEF are most likely involved in supplying APDA: the A domain of RakD has the signature amino acid residues for serine, and RakEF contains a set of domains (AT, KR, DH) for malonate incorporation, ketoreduction, and dehydration to provide the double bond between C9 and C10. In addition, we propose that the DH domain in RakEF is also responsible for dehydration of the primary hydroxy group of the incorporated serine, for the following reasons, although experimental evidence is required. First, no dehydratase gene is present near the rakicidin cluster, whereas in the biosynthesis of dehydroalanine in bacterial peptides such as lantibiotics, a dehydratase catalyzes exo-methylene formation from serine [20, 21]. Second, the order of the KR and DH domains in RakEF is unusual: among the three hundred type I PKS genes for eighty actinomycete polyketides, the order of the two domains is exclusively DH-KR [22]. The only exception is found in the PKS genes for enediynes, in which chain elongation is catalyzed iteratively, as in type II PKSs [23]. The unusual KR-DH order may confer an undescribed function on the DH domain of RakEF. After formation of the APDA moiety, RakG is likely responsible for the condensation of glycine and its subsequent N-methylation, and RakH for asparagine condensation. Hydroxylation of asparagine would be catalyzed by the asparagine hydroxylase encoded by rakO, downstream of the cluster, to yield rakicidin D. On the basis of this bioinformatic evidence, we propose the biosynthetic pathway of rakicidin D shown in Fig. 3.
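
The collinearity rule applied above amounts to reading the module order as the order of building blocks in the product. The following Python sketch illustrates that reasoning under simplifying assumptions: the Module class, the label table, and the one-module-per-entry layout are hypothetical constructs for illustration, and the final unit is kept as "X" as in the predicted product string, although the text proposes asparagine for RakH.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    enzyme: str     # protein carrying the module, as named in the text
    kind: str       # "PKS" or "NRPS"
    substrate: str  # predicted AT or A domain substrate


# Shorthand used in the predicted product string (R-C3-C3-Ser-C2-Gly-X).
UNIT_LABEL = {
    "starter": "R",
    "methylmalonyl-CoA": "C3",
    "malonyl-CoA": "C2",
    "serine": "Ser",
    "glycine": "Gly",
    "unknown": "X",
}

# Module order follows the pathway proposed in the text (an illustrative layout).
ASSEMBLY_LINE = [
    Module("RakAB", "PKS", "starter"),
    Module("RakAB", "PKS", "methylmalonyl-CoA"),
    Module("RakC", "PKS", "methylmalonyl-CoA"),
    Module("RakD", "NRPS", "serine"),
    Module("RakEF", "PKS", "malonyl-CoA"),
    Module("RakG", "NRPS", "glycine"),
    Module("RakH", "NRPS", "unknown"),
]


def predict_backbone(modules):
    """Join the building-block labels in module order (collinearity rule)."""
    return "-".join(UNIT_LABEL[m.substrate] for m in modules)


backbone = predict_backbone(ASSEMBLY_LINE)
print(backbone)  # prints "R-C3-C3-Ser-C2-Gly-X"
```

The sketch captures only the ordering of building blocks; tailoring steps such as ketoreduction, dehydration, N-methylation, and hydroxylation are not modeled.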

Table 6 ORFs in the rakicidin-biosynthetic gene cluster of Streptomyces sp. MWW064
Fig. 3 Genetic map of rakicidin biosynthetic gene cluster of Streptomyces sp. MWW064 and the biosynthetic mechanism of rakicidin D

Identification of biosynthetic precursors of the APDA moiety

To verify the predicted biosynthetic origin of the APDA unit, feeding experiments using 13C-labeled precursors were carried out. Inoculation, cultivation, extraction, and purification were performed as previously reported [1]. Addition of sodium [2-13C]acetate or [1-13C]-L-serine (20 mg/100 ml medium/flask; 10 flasks for [2-13C]acetate, 3 flasks for [1-13C]-L-serine) was initiated 48 h after inoculation and carried out every 24 h, four times. After further incubation for 24 h, the whole culture broths were extracted with 1-butanol, and several purification steps yielded 2.5 mg and 1.7 mg of 13C-labeled rakicidin D, respectively. The 13C NMR data for the labeled rakicidin D samples are shown in Table 7. Feeding of sodium [2-13C]acetate gave enrichments at C9 of the APDA unit and at three carbons, C18, C20, and C22, in the aliphatic chain of the fatty acid moiety. Feeding of [1-13C]-L-serine enriched C10 of the APDA unit and the carbonyl carbon of Gly (C5). These results unambiguously indicate that the APDA unit is derived from an acetate and a serine (Fig. 4). Labeling of C5 by serine feeding can be explained by the interconversion between glycine and serine by transformylase in primary metabolism for amino acid supply.

Table 7 Incorporation of 13C-labeled precursors into rakicidin D
Fig. 4 Incorporation of 13C-labeled precursors into rakicidin D
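
The feeding schedule described in this section can be laid out explicitly. The Python sketch below rests on two assumptions that the text does not state outright: that the four additions include the first one at 48 h, and that 20 mg of precursor was added to each flask at every addition. The function and variable names are hypothetical.

```python
def feeding_schedule(first_addition_h=48, interval_h=24, additions=4,
                     final_incubation_h=24):
    """Return the addition time points (h after inoculation) and the harvest time."""
    times = [first_addition_h + i * interval_h for i in range(additions)]
    harvest = times[-1] + final_incubation_h
    return times, harvest


times, harvest = feeding_schedule()

# Total labeled precursor used, assuming 20 mg per flask at each addition.
mg_per_addition = 20
acetate_total_mg = mg_per_addition * len(times) * 10  # 10 flasks for [2-13C]acetate
serine_total_mg = mg_per_addition * len(times) * 3    # 3 flasks for [1-13C]-L-serine
```

Under these assumptions the additions fall at 48, 72, 96, and 120 h, with harvest at 144 h after inoculation.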

Conclusions

The 7.9 Mb draft genome of Streptomyces sp. MWW064, a producer of rakicidin D isolated from marine sediment, has been deposited at GenBank/ENA/DDBJ under the accession number BBUY00000000. We identified the hybrid PKS/NRPS gene cluster for rakicidin synthesis and proposed a plausible biosynthetic pathway. Labeled-precursor incorporation experiments showed that the APDA moiety is synthesized from serine and malonate. These findings will open up possibilities for genetic engineering to synthesize more potent rakicidin-based antitumor compounds and for discovering new bioactive compounds possessing APDA units.