Background

As a widely distributed class of natural products, triterpenoids are a group of C30 compounds with over 20,000 structurally diverse members (Rascon-Valenzuela et al. 2017). In general, triterpenoids can be cyclic (e.g., lanostane-type, oleanane-type, ursane-type, lupane-type) (Chudzik et al. 2015) or linear (squalene-type). Squalene-type triterpenoids (STs) exhibit many promising biological activities, including anti-cancer (Cen-Pacheco et al. 2011; Nguyen et al. 2010), anti-oxidative (Warleta et al. 2010), and anti-inflammatory (Kouam et al. 2012) activities. In addition, many STs are oil-like chemicals with important applications in drug delivery and skin care (Kim and Karadeniz 2012). To realize the biotechnological potential of STs, efficient methods for their biosynthesis are needed.

STs are derived from squalene, which can be produced via the mevalonate (MVA) or 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway (Spanova and Daum 2011). After formation of squalene, a series of post-modifications (e.g., hydroxylation, epoxidation) generates the structural diversity of STs (Chen and Wang 2015; Kouam et al. 2012; Nguyen et al. 2010). Cytochrome P450s (CYPs), which catalyze various reactions including chemically challenging stereo- and regio-selective hydroxylations and epoxidations (Wang et al. 2017), are potential candidates for these post-modifications. In contrast to the role of CYPs in cyclic triterpenoid biosynthesis (Han et al. 2012; Seki et al. 2011; Yasumoto et al. 2017), the contribution of CYPs to ST biosynthesis is largely unknown. The currently characterized CYPs for cyclic triterpenoids prefer cyclized substrates (Brill et al. 2014; Dai et al. 2019; Han et al. 2011), suggesting that other enzymes are responsible for the post-modification of squalene.

Mushrooms are an important group of organisms that possess abundant and diverse CYPs (Chen et al. 2012). Known to produce a myriad of unique triterpenoids, mushrooms play a significant role in the ecosystem as well as in human nutrition and health (Valverde et al. 2015). To circumvent the immature genetic manipulation tools for mushrooms, we recently developed a synthetic biology platform to identify mushroom-derived CYPs using Saccharomyces cerevisiae as a screening host (Wang et al. 2018). We chose S. cerevisiae because: (i) it is a genetically tractable host; (ii) it naturally produces direct triterpenoid precursors (e.g., squalene, lanosterol); (iii) it has subcellular organelles (e.g., endoplasmic reticulum) to support the function of membrane-bound CYPs; and (iv) its well-characterized endogenous CYPs are less likely to interfere with exogenous ones (Kelly et al. 2001; Xiao et al. 2019). This strategy enabled the discovery of cyp5150l8 from a famous traditional medicinal mushroom, Ganoderma lucidum, as the first CYP responsible for biosynthesis of the cyclic triterpenoid ganoderic acid HLDOA (Wang et al. 2018).

Following the above-mentioned paradigm, we discovered that overexpression of cyp505d13 from G. lucidum in S. cerevisiae YL-T3 yielded many new compounds compared with the control strain. Three major compounds were purified and identified as STs, two of which are new. We further analyzed the production profile of these STs in the engineered yeast strain. This work is a valuable step towards the discovery of STs and their efficient bioproduction for biotechnological purposes.

Materials and methods

Strains and media

Escherichia coli DH5α (Tiangen Biotech, Beijing, China) was used for routine DNA cloning. CYP505D13 was overexpressed in either E. coli Rosetta™ (DE3) (Weidi Biotechnology, Shanghai, China) or S. cerevisiae YL-T3 (BY4742, Δtrp1, δDNA:: PPGK1-tHMG1-TADH1-PTEF1-LYS2-TCYC1, TRP:: HIS-PPGK1-ERG20-TADH1-PTEF1-ERG9-TCYC1-PTDH3-ERG1-TTPL1) (Wang et al. 2018). Engineered E. coli strains were grown in LB medium containing 50 µg/mL kanamycin and 34 µg/mL chloramphenicol at 37 °C and 220 rpm. Engineered yeast strains were grown either in SC-His-Ura-Leu medium or in YPD medium at 30 °C and 220 rpm.

Construction of plasmids and strains

The coding sequence (CDS) of cyp505d13 was cloned using primer pair GL17184-F and GL17184-R with the cDNA of G. lucidum as template (Additional file 1: Table S1). G. lucidum cDNA was prepared as described (Wang et al. 2018). Cyp505d13 was ligated to PmeI-linearized pRS426-HXT7p-FBA1t (Wang et al. 2018) to yield plasmid pRS426-HXT7p-CYP505D13-FBA1t, following the instructions of the SoSoo cloning kit (Tsingke, Beijing, China). Alternatively, the CDS of cyp505d13 was cloned using primer pair 28a-GL17184-F and 28a-GL17184-R (Additional file 1: Table S1) and ligated to NcoI- and XhoI-linearized pET28a to yield plasmid pET28a-CYP505D13, following the instructions of the Ezmax One-Step kit (Tolo Biotech, Shanghai, China). Plasmids pRS426-HXT7p-CYP505D13-FBA1t and pRS425-TEF1p-PGK1t were transformed into S. cerevisiae YL-T3 using the standard lithium acetate method (Gietz and Schiestl 2007) to yield the YL-T3-CYP505D13 strain. YL-T3 containing void plasmids pRS426-HXT7p-FBA1t and pRS425-TEF1p-PGK1t served as the control strain for the experiments presented in this work (Wang et al. 2018). Plasmid pET28a-CYP505D13 was transformed into E. coli Rosetta™ (DE3) to yield strain Rosetta-CYP505D13.

UPLC-MS and NMR analyses

UPLC-MS and NMR analyses were carried out as previously reported (Wang et al. 2018) with a minor modification to the UPLC conditions. Mobile phase A contained water/formic acid (100:0.1, v/v) and mobile phase B contained methanol/formic acid (100:0.1, v/v). A linear gradient from 70% B to 100% B in 10.5 min at 0.4 mL/min was adopted. Compounds were detected at a wavelength of 210 nm.
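As an illustration of the gradient described above, the following minimal Python sketch computes the mobile-phase composition at a given time and the total solvent volume delivered. It assumes an ideal linear ramp with no dwell volume or re-equilibration step; the function names are ours and are not part of any instrument software.

```python
def percent_b(t_min, start_b=70.0, end_b=100.0, duration_min=10.5):
    """Return %B at time t_min for an ideal linear gradient.

    Values are clamped to the start and end composition outside the
    gradient window. Defaults follow the UPLC method in the text.
    """
    if t_min <= 0:
        return start_b
    if t_min >= duration_min:
        return end_b
    return start_b + (end_b - start_b) * t_min / duration_min


def delivered_volume_ml(flow_ml_min=0.4, duration_min=10.5):
    """Total mobile-phase volume pumped during the gradient."""
    return flow_ml_min * duration_min


if __name__ == "__main__":
    for t in (0.0, 5.25, 10.5):
        print(f"t = {t:5.2f} min -> {percent_b(t):.1f}% B")
    print(f"solvent per run: {delivered_volume_ml():.1f} mL")
```

Under these assumptions the composition reaches 85% B at the midpoint of the run (5.25 min).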

Extraction and purification of compounds 1, 2, and 3

YL-T3-CYP505D13 was cultured in YPD medium for the purification of 1, 2, and 3. After 4 days of fermentation, 1.1 kg (wet cell weight) of cells was collected by centrifugation (3214 × g, 5 min, 4 °C) and extracted twice with 22 L of ethyl acetate, each time for 1 h with magnetic stirring. Thirty grams of crude extract was obtained after collection and evaporation of the organic phase. The crude extract (15 g) was subjected to silica gel column chromatography (6 × 40 cm) and eluted with a petroleum ether-ethyl acetate system. Fractions containing 1, 2, and 3 were further purified by preparative HPLC (Waldbronn, Germany) equipped with a preparative Kromasil 100-10-C18 column (20 × 250 mm) (Kromasil, Sweden). For preparative HPLC, mobile phase A was 100% water and mobile phase B was acetonitrile. For compound 1, a linear gradient of 85–90% B in 30 min at 10 mL/min was adopted, and 5.2 mg of purified 1 was obtained by collecting the eluate from 29 to 30 min. For compound 2, a linear gradient of 90–95% B in 30 min at 10 mL/min was chosen, yielding 20.3 mg of purified 2 by collecting the eluate from 18.6 to 21.2 min. For compound 3, a linear gradient of 95–100% B in 10 min at 10 mL/min was adopted, and 35.6 mg of purified 3 was obtained by collecting the eluate from 25.7 to 28 min.

Yeast fermentation, analyses of cell growth and product accumulation

For shake-flask fermentations, a single colony was picked into a 15 mL tube containing 4 mL of SC-His-Ura-Leu medium and grown to an OD600 of 3–4. Then, 1 mL of these starter cultures was diluted into 50 mL of SC-His-Ura-Leu medium in a 250 mL shake flask and grown to an OD600 of 2. These seed cultures were used to inoculate 50 mL of YPD medium in 250 mL shake flasks to achieve an initial OD600 of 0.05.
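The inoculation steps above follow a simple dilution balance. The sketch below is a hedged illustration, assuming OD600 scales linearly with cell density over this range and that volumes are additive; the helper names are hypothetical and not taken from the original protocol.

```python
def inoculum_volume_ml(seed_od, target_od, medium_ml):
    """Volume of seed culture to add to medium_ml of fresh medium so that
    the mixture starts at target_od (mass balance with additive volumes)."""
    if not 0 < target_od < seed_od:
        raise ValueError("target OD must be positive and below the seed OD")
    return medium_ml * target_od / (seed_od - target_od)


def od_after_dilution(seed_od, seed_ml, medium_ml):
    """Starting OD after adding seed_ml of culture to medium_ml of medium."""
    return seed_od * seed_ml / (seed_ml + medium_ml)


if __name__ == "__main__":
    # Seed culture at OD600 = 2 into 50 mL YPD, target initial OD600 = 0.05
    v = inoculum_volume_ml(seed_od=2.0, target_od=0.05, medium_ml=50.0)
    print(f"seed culture needed: {v:.2f} mL")
    # Starter culture (OD600 3-4): 1 mL into 50 mL SC-His-Ura-Leu
    for od in (3.0, 4.0):
        print(f"starter OD {od}: initial OD = {od_after_dilution(od, 1.0, 50.0):.3f}")
```

With these assumptions, roughly 1.3 mL of seed culture at OD600 = 2 gives an initial OD600 of 0.05 in 50 mL of medium.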

For fermentation in a 10 L stirred bioreactor (T&J Bioengineering, Shanghai, China), 140 mL of seed culture from shake flasks was inoculated into 6.5 L of YPD medium and agitated by a standard six‐blade turbine impeller at a speed of 300 rpm and an aeration rate of 1.2 vvm at 30 °C. Cell growth, glucose, ethanol, and acetate concentrations were determined as previously reported (Wang et al. 2018). For detection of compounds 1, 2, and 3, 20 mL of yeast culture was mixed with an equal volume of ethyl acetate and shaken for 30 min (220 rpm). The organic phase was collected by centrifugation and evaporated, and the resulting residue was re-dissolved in methanol for HPLC (Agilent, Waldbronn, Germany) analysis. Samples were assayed on an Agilent SB-C18 column (5 μm, 4.6 mm × 250 mm). Mobile phase A was 100% water, and mobile phase B contained methanol/acetic acid (100:0.1, v/v). A linear gradient from 80% to 100% B in 30 min at 1 mL/min was adopted.
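For orientation, the bioreactor settings above can be converted into absolute quantities. The sketch below assumes that vvm is referenced to the 6.5 L of medium rather than to the 10 L vessel volume, which the text does not state explicitly; the function names are illustrative only.

```python
def air_flow_l_min(vvm, liquid_volume_l):
    """Air flow in L/min for an aeration rate in vvm
    (volume of air per volume of liquid per minute)."""
    return vvm * liquid_volume_l


def inoculum_fraction(seed_ml, medium_ml):
    """Inoculum as a fraction of the total starting volume (v/v)."""
    return seed_ml / (seed_ml + medium_ml)


if __name__ == "__main__":
    # Assumption: 1.2 vvm is referenced to the 6.5 L of YPD medium
    print(f"air flow: {air_flow_l_min(1.2, 6.5):.1f} L/min")
    print(f"inoculum: {100 * inoculum_fraction(140.0, 6500.0):.1f}% (v/v)")
```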

Expression of CYP505D13 in E. coli

A single colony of E. coli strain Rosetta-CYP505D13 was picked into a 15 mL tube containing 4 mL of LB medium with appropriate antibiotics and grown to an OD600 of 2.5 at 37 °C and 220 rpm. Then, 9 mL of this starter culture was diluted into 800 mL of LB medium in a 2 L shake flask and grown to an OD600 of 0.8. CYP expression was then induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at 16 °C and 220 rpm for another 14–18 h. At the same time, 0.5 mM 5-aminolevulinic acid hydrochloride was added to promote heme biosynthesis.

CYP spectral assay

A total of 10 mg of protein from E. coli cell lysates was used for the CYP spectral assay as described (Guengerich et al. 2009). To obtain E. coli cell lysates, about 12 g (wet cell weight) of E. coli cells after IPTG induction was suspended in 120 mL of extraction buffer (50 mM Tris–HCl (pH 7.5), 20 mM β-mercaptoethanol, 1 mM EDTA, 20% glycerol) and disrupted with a Nano homogenizer (ATS Engineering Ltd., Suzhou, China) (cycle 2, 800 bar, 4 °C).

In vitro enzymatic assay

Microsomal isolation was performed as previously reported (Wang et al. 2018). The enzymatic assay was carried out in 0.5 mL of extraction buffer containing 1.25 mg of protein from E. coli cell lysates or 1 mg of microsomal protein from yeast strains, 1 μM FMN, 1 μM FAD, 2 mM NADPH, and 50 μM substrate. The enzymatic reaction was carried out at 30 °C and 120 rpm for 2 h. For the control experiments, the same amount of protein from cell lysates of Rosetta-CYP505D13 or from YL-T3-CYP505D13 microsomes was inactivated by incubation at 80 °C for 10 min prior to the assay. Alternatively, the same amount of microsomal protein from the control yeast strain (YL-T3 containing void plasmids) was used. After the 2 h incubation, the product was extracted three times with 0.5 mL of ethyl acetate. The ethyl acetate layers were collected, evaporated, and re-dissolved in methanol for HPLC analysis.
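To make the assay composition concrete, the sketch below converts the stated concentrations into absolute amounts per 0.5 mL reaction. It is a simple unit conversion (µM × mL = nmol) and introduces no values beyond those given above; the variable names are our own.

```python
ASSAY_VOLUME_ML = 0.5

# Final concentrations in the assay, in micromolar (2 mM NADPH = 2000 uM)
COMPONENTS_UM = {"FMN": 1.0, "FAD": 1.0, "NADPH": 2000.0, "substrate": 50.0}

# Protein added per reaction, in mg
PROTEIN_MG = {"E. coli lysate": 1.25, "yeast microsomes": 1.0}


def amount_nmol(conc_um, volume_ml):
    """Amount of solute in nmol: (umol/L) x (mL) = nmol."""
    return conc_um * volume_ml


amounts_nmol = {name: amount_nmol(c, ASSAY_VOLUME_ML)
                for name, c in COMPONENTS_UM.items()}
protein_mg_per_ml = {name: mg / ASSAY_VOLUME_ML
                     for name, mg in PROTEIN_MG.items()}

if __name__ == "__main__":
    for name, n in amounts_nmol.items():
        print(f"{name}: {n:g} nmol per reaction")
    for name, c in protein_mg_per_ml.items():
        print(f"{name}: {c:g} mg/mL protein")
    ratio = amounts_nmol["NADPH"] / amounts_nmol["substrate"]
    print(f"NADPH : substrate molar ratio = {ratio:g}")
```

Each reaction therefore contains 25 nmol of substrate and a 40-fold molar excess of NADPH.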

Results and discussion

Overexpression of CYP505D13 generates many new UPLC-detectable peaks compared with the control strain

Since CYP505D13 from G. lucidum was co-expressed with lanosterol synthase (Chen et al. 2012), it was initially considered a candidate lanosterol oxidase for the biosynthesis of ganoderic acids, a group of lanostane-type triterpenoids (Wang et al. 2018). To test the function of CYP505D13, the CDS of cyp505d13 was cloned into the yeast expression vector pRS426-HXT7p-FBA1t and transformed into S. cerevisiae YL-T3 to generate strain YL-T3-CYP505D13, in accordance with our previous paradigm for discovering lanosterol oxidase (Wang et al. 2018).

As shown in Fig. 1a, 20 new peaks were detected in the cell extracts of YL-T3-CYP505D13 after 96 h of shake-flask fermentation compared with the control strain (YL-T3 harboring void plasmids). Except for peak 18, with a detected m/z of 383, the other 19 peaks likely correspond to ST products with primary m/z values of 423 (peaks 10, 11, 12, 13), 425 (peaks 17, 19, 20), 439 (peaks 1, 2, 4, 7, 8, 9), 441 (peaks 14, 15, 16), and 457 (peaks 3, 5, 6) (Fig. 1 and Additional file 2: Fig. S1). The peaks with m/z 423 and 425 are likely hydrogenation products of 2,3-oxidosqualene, and the peaks with m/z 439 and 441 may be the hydrogenation products of oxidized 2,3-oxidosqualene. Lastly, the peaks with m/z 457 probably correspond to the oxidized products of the peaks with m/z 441.
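The grouping of peaks by primary m/z can be summarized by the nominal mass differences between groups. The following sketch is only a bookkeeping aid: it assumes that a nominal shift of 2 corresponds to two hydrogens and a shift of 16 to one oxygen, which is a common reading but does not by itself establish a formula (for example, ions formed by APCI may also lose water). The shift labels are our annotation, not assignments made in this study.

```python
from itertools import combinations

# Primary m/z value -> peak numbers, as listed in the text
PEAKS_BY_MZ = {
    423: [10, 11, 12, 13],
    425: [17, 19, 20],
    439: [1, 2, 4, 7, 8, 9],
    441: [14, 15, 16],
    457: [3, 5, 6],
}

# Nominal mass shifts and one possible elemental reading (assumption)
NOMINAL_SHIFTS = {
    2: "2 H",
    14: "O - 2 H",
    16: "O",
    18: "O + 2 H",
    32: "2 O",
    34: "2 O + 2 H",
}


def annotate_shifts(mz_values):
    """Map each pair of m/z groups (low, high) to a nominal-shift label."""
    shifts = {}
    for low, high in combinations(sorted(mz_values), 2):
        shifts[(low, high)] = NOMINAL_SHIFTS.get(high - low, "unassigned")
    return shifts


if __name__ == "__main__":
    total = sum(len(p) for p in PEAKS_BY_MZ.values())
    print(f"peaks grouped: {total}")
    for (low, high), label in annotate_shifts(PEAKS_BY_MZ).items():
        print(f"m/z {low} -> {high}: +{high - low} ({label})")
```

The five groups account for the 19 putative ST peaks, and neighboring groups differ by nominal shifts of 2 or 16.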

Fig. 1

UPLC-MS analysis of the 96 h fermentation extract of YL-T3-CYP505D13. a UPLC analysis of extracts of YL-T3-CYP505D13 (blue line) and the control strain (black line); b–d MS spectra of compound 1, compound 2, and compound 3 as indicated in a

Identification of 1, 2 and 3 as STs

To determine the chemical structures of the new compounds generated as a result of CYP505D13 overexpression, 6.9 mg, 20.3 mg, and 65.4 mg of 1, 2, and 3, corresponding to peaks 11, 13, and 20, respectively, were extracted and purified (Fig. 1a).

The molecular formula of 3 was established to be C30H50O2 based on high resolution atmospheric pressure chemical ionization (HRAPCI) MS (m/z 443.3885, calcd 443.3884 [M + H]+) and NMR analyses (Fig. 1d). The 1H NMR spectrum showed eight singlet methyl groups (δH 1.30, H-1/25/24/30; δH 1.25, H-27/28; δH 1.61, H-26/29), two oxygenated methine protons (δH 2.70, H-3/22), and four olefinic protons (δH 5.15, H-7/11/14/18) (Additional file 4: Fig. S3C, Table 3). With the aid of DEPT and HSQC spectra, 13C NMR showed 30 carbons including eight CH3 (δC 24.9, C-1/25/24/30; δC 18.8, C-27/28; δC 16.0, C-26/29), ten CH2 (δC 27.5, C-4/21; δC 36.3, C-5/20; δC 26.7, C-8/17; δC 39.7, C-9/16; δC 28.2, C-12/13), two oxygenated CH (δC 64.2, C-3/22), and eight olefinic carbons (Additional file 3: Fig. S2C, Additional file 5: Fig. S4C, Additional file 6: Fig. S5C and Table 3). Combined analysis of other 2D NMR spectra (HMBC and 1H–1H COSY) confirmed this compound to be 2,3;22,23-squalene dioxide (José-Luis et al. 1992) (Additional file 7: Fig. S6C and Additional file 8: Fig. S7C). Henceforth, this compound is referred to as ST-3.

The molecular formula of 2 was established as C30H50O3 based on HRAPCIMS (m/z 441.3726, calcd 441.3727 [M + H – H2O]+) and NMR analyses (Fig. 1c). The 1D NMR data of 2 were very similar to those of 3 (Additional file 3: Fig. S2B, C, Additional file 4: Fig. S3B, C, Additional file 5: Fig. S4B, C, Additional file 6: Fig. S5B, C, Tables 2 and 3). An obvious difference was an additional oxygenated CH (δC 65.6, C-8 and δH 4.42, H-8) found in 2 (Additional file 3: Fig. S2B, C, Additional file 4: Fig. S3B, C, Tables 2 and 3). 1H–1H COSY correlations between H-7 (δH 5.25)/H-8 (δH 4.42)/H-9 (δH 2.14) (Additional file 8: Fig. S7B and C), together with HMBC correlations from H-8 to C-6 and C-10 (Additional file 7: Fig. S6B and C), indicated a hydroxyl group at C-8. Accordingly, 2 was identified as 8-hydroxy-2,3;22,23-squalene dioxide. This novel compound is referred to as ST-2 from here on.

The molecular formula of 1 was established as C30H50O3 based on HRAPCIMS (m/z 441.3726, calcd 441.3727 [M + H – H2O]+) and NMR analyses (Fig. 1b). The 1D NMR data of 1 were very similar to those of 2 and 3 (Additional file 3: Fig. S2, Additional file 4: Fig. S3, Additional file 5: Fig. S4, Additional file 6: Fig. S5, Tables 1, 2 and 3). Compared with the 13C NMR spectrum of 2, there were two new olefinic carbons in 1 (δC 129.3, C-3 and δH 5.26, H-3; δC 131.8, C-2) and an oxygenated CH (δC 65.8, C-4 and δH 4.42, H-4) (Additional file 3: Fig. S2A, B, Additional file 4: Fig. S3A, B, Tables 1 and 2). The 1H–1H COSY correlation between H-3 (δH 5.26) and H-4 (δH 4.42) suggested an oxygenated CH at C-4 (Additional file 8: Fig. S7, Tables 1 and 2). In addition, the oxygenated CH (δC 64.2, C-3 and δH 2.70, H-3) in 2 was absent in 1 (Additional file 3: Fig. S2, Additional file 4: Fig. S3, Tables 1 and 2). These data showed that a double bond was formed between C-2 and C-3 and that a hydroxyl group was located at C-4, as supported by HMBC correlations from H-4 to C-2 and C-6 (Additional file 7: Fig. S6). Compound 1 was identified as 4,8-dihydroxy-22,23-oxidosqualene, which is a novel compound and is henceforth referred to as ST-1.
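The calculated m/z values quoted for 1, 2, and 3 can be reproduced from monoisotopic atomic masses. The sketch below is a minimal illustration rather than the software used in this study: it handles only C, H, and O, uses standard monoisotopic masses and the electron mass, and the function names are our own.

```python
import re

# Monoisotopic atomic masses (u) and the electron mass (u)
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON = 0.000548579909


def parse_formula(formula):
    """Parse a simple formula such as 'C30H50O2' into element counts."""
    counts = {}
    for elem, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[elem] = counts.get(elem, 0) + (int(n) if n else 1)
    return counts


def mono_mass(formula):
    """Monoisotopic mass of a neutral molecule."""
    return sum(MONO[e] * n for e, n in parse_formula(formula).items())


def mz_protonated(formula, water_losses=0):
    """m/z of [M + H - n H2O]+ for a singly charged ion."""
    proton = MONO["H"] - ELECTRON
    return mono_mass(formula) + proton - water_losses * mono_mass("H2O")


def ppm_error(observed, calculated):
    """Mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


if __name__ == "__main__":
    # ST-3: C30H50O2, [M + H]+, observed m/z 443.3885
    calc3 = mz_protonated("C30H50O2")
    print(f"ST-3 calcd {calc3:.4f}, error {ppm_error(443.3885, calc3):+.2f} ppm")
    # ST-1 and ST-2: C30H50O3, [M + H - H2O]+, observed m/z 441.3726
    calc12 = mz_protonated("C30H50O3", water_losses=1)
    print(f"ST-1/2 calcd {calc12:.4f}, error {ppm_error(441.3726, calc12):+.2f} ppm")
```

Both calculated values agree with those reported above (443.3884 and 441.3727), and the observed ions fall within 1 ppm of them.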

Table 1 NMR data of 1
Table 2 NMR data of 2
Table 3 NMR data of 3

Heterologous bioproduction of ST-1, ST-2, and ST-3

The fed-batch fermentation of YL-T3-CYP505D13 was performed in a 10 L bioreactor. After a lag phase of 9.5 h, the glucose was completely consumed within 24 h, while the S. cerevisiae cells grew rapidly until 45 h (Fig. 2a). The dissolved oxygen (DO) decreased quickly after 15 h, dropping to its lowest level of 9% of air saturation at 31 h before quickly rebounding to 70% after 40 h (Fig. 2b). The pH varied between 5.4 and 7.0 during the fermentation process (Fig. 2b). Both the production and the subsequent consumption of ethanol and acetic acid were observed (Fig. 2c). ST-3 production was first detected at 9.5 h, and production of ST-2 and ST-1 was subsequently observed at 24 h and 45 h, respectively (Fig. 2d). The highest titers of ST-1, ST-2, and ST-3 were achieved at 59 h, 45 h, and 59 h, reaching 3.28 mg/L, 14.29 mg/L, and 12.23 mg/L, respectively (Fig. 2d).
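As a rough point of comparison between the three products, the peak titers can be expressed as average volumetric productivities. This is a derived quantity that we did not measure directly: the sketch below simply divides each reported maximum titer by the time at which it was reached, counted from inoculation, and therefore ignores the lag phase and any later decline.

```python
# Reported maximum titers (mg/L) and the times (h) at which they were reached
PEAK_TITERS = {
    "ST-1": (3.28, 59.0),
    "ST-2": (14.29, 45.0),
    "ST-3": (12.23, 59.0),
}


def average_productivity(titer_mg_l, time_h):
    """Average volumetric productivity in mg/L/h from time zero."""
    if time_h <= 0:
        raise ValueError("time must be positive")
    return titer_mg_l / time_h


productivities = {name: average_productivity(titer, t)
                  for name, (titer, t) in PEAK_TITERS.items()}

if __name__ == "__main__":
    for name, p in productivities.items():
        print(f"{name}: {p:.3f} mg/L/h")
```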

Fig. 2

Fermentation of strain YL-T3-CYP505D13 in a 10 L bioreactor. Time profiles of a residual glucose and cell growth, b pH and dissolved oxygen (DO), c ethanol and acetate, and d accumulation of ST-1, ST-2, and ST-3

Surprisingly, none of the three identified STs generated by YL-T3-CYP505D13 could be detected in the fermentation extracts of G. lucidum (data not shown). Possible explanations include (1) repressed expression of cyp505d13 in G. lucidum; (2) fast conversion of these intermediates by other enzymes; and (3) a preference of CYP505D13 for other substrates in G. lucidum. It should be noted that ST-1 and ST-2 were discovered for the first time, highlighting the potential of our synthetic biology platform to discover novel compounds. Notably, the production yields of the STs using our yeast production platform are sufficient for future bioactivity profiling studies.

By exploring the genome of the famous traditional Chinese medicinal mushroom G. lucidum, we discovered the first CYP responsible for ST biosynthesis, CYP505D13, using S. cerevisiae as a heterologous host. This discovery is not only important for ST production, but also meaningful for the generation of cyclic triterpenoids. Formation of known cyclic triterpenoids predominantly involves cyclization of 2,3-oxidosqualene by terpenoid cyclases and post-modification by other enzymes including CYPs (Dong et al. 2018). By harnessing protein engineering of terpenoid cyclases and the promiscuity of CYPs (Xiao et al. 2019), structurally diversified cyclic triterpenoids could be generated from alternative substrates, including the novel STs discovered in this study.

In vitro enzymatic assay of CYP505D13

To characterize the function of CYP505D13, we first attempted to obtain functional purified CYP505D13 from E. coli or S. cerevisiae, but without success. Because no membrane-anchoring region was found in its sequence, CYP505D13 was predicted to be a soluble CYP. However, the typical CO-shift of a functional CYP was not detected in the cell lysates of Rosetta-CYP505D13 (Additional file 9: Fig. S8). It is therefore not surprising that biosynthesis of ST-1, ST-2, and ST-3 was not detected after the cell lysates were incubated with 2,3-oxidosqualene and squalene (Additional file 10: Fig. S9).

Notably, ST-3 can also be produced by S. cerevisiae under specific conditions. This is due to the low substrate specificity of the endogenous yeast squalene epoxidase ERG1, which can also act on 2,3-oxidosqualene (Field and CE 1977; Pollier et al. 2019). ERG1 is an essential yeast gene, and the corresponding knockout mutant is only viable under anaerobic conditions (Rosenfeld et al. 2003). To rule out the possibility that ERG1 expression in YL-T3 is responsible for the production of these STs, we tried to construct an erg1 knockout strain but failed after several attempts (data not shown).

Next, CYP505D13-containing microsomes from YL-T3-CYP505D13 were prepared and incubated with a potential substrate. Compared with the boiled microsomes, feeding microsomes with 2,3-oxidosqualene resulted in a significant increase of ST-3 and of an HPLC peak with a retention time (Rt) of 7.99 min, a decrease of squalene and 2,3-oxidosqualene, and no change in the levels of ST-1 and ST-2 (Additional file 11: Fig. S10). In addition, no production of ST-1, ST-2, or ST-3 was observed when the microsomes prepared from the control strain (YL-T3 containing void plasmids) were incubated with 2,3-oxidosqualene (Additional file 11: Fig. S10). These results indicate that CYP505D13 may catalyze the epoxidation of 2,3-oxidosqualene (or squalene) to generate ST-3, whereas ST-1 and ST-2 are not direct reaction products of CYP505D13.

Many CYP505 family members are able to oxidize different forms of fatty acids (straight aliphatic chains from C9 to C20) (Baker et al. 2017; Kitazume et al. 2002; Nakayama et al. 1996), which are structurally similar to squalene (a straight C30 chain). Interestingly, CYP505 proteins are naturally fused with a cytochrome P450 reductase (CPR) to execute efficient electron transfer, displaying faster catalytic kinetics than CYPs without a CPR domain (Bernhardt and Urlacher 2014). To our knowledge, no CYP has been reported to be active against long-chain substrates with more than 26 carbons (Hardwick 2008), let alone a CYP505 family member. Our study indicates that CYP505D13 is a subterminal epoxidase (catalyzing epoxidation between ω-1 and ω-2). This is consistent with the catalytic characteristics of known CYP505 proteins, which prefer to hydroxylate at positions from ω to ω-3 (Baker et al. 2017; Kitazume et al. 2002; Nakayama et al. 1996).

Conclusion

Using S. cerevisiae as a synthetic biology platform, we have shown that G. lucidum CYP505D13 is responsible for ST biosynthesis and identified two new STs [4,8-dihydroxy-22,23-oxidosqualene (ST-1) and 8-hydroxy-2,3;22,23-squalene dioxide (ST-2)] in addition to 2,3;22,23-squalene dioxide (ST-3). Overall, this work provides an alternative approach to discovering STs and facilitates their efficient bioproduction and application.