Introduction

The few plant pathways to specialized metabolites found in textbooks appear as discrete cascades to end-point molecules of interest (e.g. morphine biosynthesis). A paradigm of linear pathways still appears applicable to the universally shared pathways providing the C5 building blocks for terpenoid metabolism across all kingdoms of life. However, plant pathways of specialized metabolism have impressively diversified throughout evolution. Growing evidence, in recent years increasingly fueled by deep sequencing technologies, points to highly branched networks and complex anastomosing metabolic grids in rapidly evolving pathways. Because P450s typically catalyze irreversible reactions, they represent ideal points of control over metabolic bifurcations. In contrast to the view of P450s as highly specialized enzymes dedicated to specific pathways, evidence is emerging for substantial promiscuity of these enzymes. This may allow re-purposing and rapid evolution of novel pathways in plants and sets the stage for their combinatorial reconstitution in neo-natural pathways in biotechnology. Below we introduce the characteristics, applications and progress for each class of terpenoids individually, followed by a discussion of illustrative examples.

Oxidative monoterpenoid metabolism

With more than 4000 known structures, over 92% of which carry two or more oxygen atoms (Dictionary of Natural Products 23.1, Pateraki et al. 2015), plant monoterpenoids show an impressive structural diversity. The short C10 hydrocarbon backbone, typically derived from geranyl diphosphate (GDP), establishes their physicochemical properties and, with that, their industrial applications. Limonene, for example, a simple cyclic monoterpene alkene with a boiling point of 176 °C, is used as a green solvent. A high standard enthalpy of combustion (−6100 kJ mol−1) means limonene is a potential high-density biofuel. Oxygenations increase the polarity and modulate the properties and applications. Carvone or perillyl alcohol, oxidized derivatives of limonene, have a broadened spectrum of applications, such as flavor additives or emerging cancer therapeutics (Chen et al. 2015; de Carvalho and da Fonseca 2006), emphasizing the industrial relevance of the conversion.
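
To put the quoted enthalpy of combustion into the units usually used for fuels, the following minimal sketch converts it into a gravimetric energy density. It assumes the molecular formula C10H16 for limonene and standard atomic masses; the function names are illustrative only and do not come from the cited sources.

```python
# Gravimetric energy density of limonene from its molar enthalpy of combustion.
# The enthalpy (-6100 kJ/mol) is the value quoted in the text; the atomic masses
# are standard values and the formula C10H16 is assumed for limonene.

ATOMIC_MASS = {"C": 12.011, "H": 1.008}  # g/mol


def molar_mass(formula_counts):
    """Return the molar mass in g/mol for a dict of element -> atom count."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula_counts.items())


def gravimetric_energy_density(delta_h_comb_kj_per_mol, molar_mass_g_per_mol):
    """Return the energy released per unit mass in MJ/kg (numerically equal to kJ/g)."""
    return abs(delta_h_comb_kj_per_mol) / molar_mass_g_per_mol


limonene = {"C": 10, "H": 16}
m_limonene = molar_mass(limonene)
energy_density = gravimetric_energy_density(-6100.0, m_limonene)

print(f"Molar mass of limonene: {m_limonene:.2f} g/mol")
print(f"Gravimetric energy density: {energy_density:.1f} MJ/kg")
```

Under these assumptions the quoted enthalpy corresponds to roughly 45 MJ per kilogram of limonene.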

Menthol

A global demand of 30,000 MT/year, which is still increasing, already exceeds current supply. Novel semi-chemical strategies (e.g. BASF expects an operational (−)-menthol production unit online in 2017, https://www.basf.com/en/company/news-and-media/news-releases/2016/09/p-16-290.html, retrieved December 2016) complement extraction from the natural sources for a growing market estimated at $300 M (Kirby and Keasling 2009). Synthetic Biology may provide an alternative to petro-based production and extraction, and biosustainable access to these natural compounds, but requires knowledge of the pathways. The essential oils from spearmint and peppermint contain oxygenated monoterpenes with characteristic and distinct positions of oxygenation on the p-menthane backbone. The olefinic monoterpene precursor (−)-4S-limonene was established as the first cyclic intermediate for the oxygenated menthanes in mint species in groundbreaking feeding studies with isotopically labelled precursors almost four decades ago in the laboratory of Rodney Croteau (Kjonaas and Croteau 1983). This opened the door for trailblazing discoveries of the molecular underpinnings of terpenoid biosynthetic pathways. Peppermint, a hybrid mint (Mentha × piperita), produces nearly exclusively C-3 oxygenated monoterpenes, e.g. (−)-menthol, whereas spearmint (M. spicata) accumulates C-6 oxygenated monoterpenes, e.g. (−)-carvone (overview provided in Fig. 1). In peppermint, cytochromes P450 (P450s) CYP71D15 and CYP71D13 catalyze regio-specific C-3 allylic hydroxylation of (−)-4S-limonene to (−)-trans-isopiperitenol, which is subsequently converted to (−)-menthol (Lupien et al. 1999; Schalk and Croteau 2000). These P450s are members of the CYP71 clan, which is the major source of chemical diversification in specialized metabolism (reviewed in Hamberger and Bak 2013).
Spearmint CYP71D18, on the other hand, catalyzes C-6 allylic hydroxylation of (−)-4S-limonene yielding (−)-trans-carveol, which is further oxidized to (−)-carvone (Lupien et al. 1999; Schalk and Croteau 2000). These next oxidative steps are catalyzed by members of the short-chain dehydrogenase/reductase superfamily, which are orthologous enzymes with a high degree of identity in spearmint and peppermint. The dehydrogenases cannot distinguish (−)-trans-isopiperitenol from (−)-trans-carveol as substrate (Ringer et al. 2005). The contrasting regio-specificity of CYP71D15 and CYP71D18 allowed its molecular basis to be investigated by a combination of domain swapping and reciprocal site-directed mutagenesis. A single amino acid residue was identified as controlling the regio-selectivity: exchange of the residue (F363I) in the spearmint limonene-6-hydroxylase fully converted activity and catalytic efficiency to that of the peppermint limonene-3-hydroxylase (Schalk and Croteau 2000). Thus, the closely related but functionally distinct CYP71D15 and CYP71D18 play a critical role in bifurcating the pathway from (−)-4S-limonene to channel the carbon flow into distinct monoterpenes in different mint species. Perilla frutescens, a member of a different tribe within the Lamiaceae family that has not undergone selective breeding for specific monoterpene profiles, accumulates the (−)-4S-limonene derived (−)-perillyl aldehyde in a mixture of monoterpenes in the essential oil of glandular trichomes. A partial cDNA clone homologous to the known P450s, expressed as the recombinant chimeric variant CYP71D174, yielded 7-hydroxy limonene (perillyl alcohol) and, in total, oxidation at three of the four allylic carbons of the olefinic monoterpene (Mau et al. 2010). These studies provide a classical example of monoterpene biosynthetic pathways governed by regio-selectivity of P450s.
Structure–function analysis of the limonene hydroxylases inspired rational engineering of the regio-specificity of P450s by manipulation of selected residues, guided by natural variation (Lupien et al. 1999; Schalk and Croteau 2000). A complementary approach is discussed below in the paragraph for the sesquiterpenoid artemisinin, and discrete activities of members of another CYP71 subfamily.

Fig. 1

Regio-specific hydroxylation of (−)-4S-limonene by cytochromes P450. *Recombinant chimeric variant of CYP71D174 yields (−)-perillyl alcohol along with (−)-trans-isopiperitenol and (−)-trans-carveol in an in vitro assay

Within the same plant system, there is a second instance of pathway control by P450s. The hepatotoxic monoterpene (R)-(+)-menthofuran, which also negatively affects the commercial value of the essential oil, can accumulate to substantial levels under adverse growth conditions (Bertea et al. 2001). Specifically, a new P450 of the subfamily CYP71A was discovered in the peppermint oil glands. Functional characterization of heterologously expressed recombinant CYP71A32 indicated a role in the formation of menthofuran by converting (R)-(+)-pulegone to (R)-(+)-menthofuran via an allylic hydroxylation and spontaneous rearrangement yielding the furan ring (Bertea et al. 2001; Mizutani and Sato 2011). (R)-(+)-pulegone is also an intermediate in the route to (−)-menthol, with (+)-pulegone reductase catalyzing the conversion into the intermediate (−)-menthone (Ringer et al. 2003), effectively competing for the substrate with CYP71A32. In addition, under low ambient light, mint plants selectively sequestered menthofuran in the oil glands at concentrations sufficient for competitive inhibition of the pulegone reductase (Mahmoud and Croteau 2001; Rios-Estepa et al. 2008). In the mint system, improvement of the yield and purity from the natural sources was addressed by a multipronged approach, assisted by mathematical modeling: (i) overexpression of three steps of the plastidial precursor pathway to increase the supply of isoprenoid C5 building blocks, (ii) expression of a heterologous geranyl diphosphate synthase variant from a different species for the production of the general monoterpene precursor, geranyl diphosphate (GDP), (iii) anti-sense mediated suppression of the above menthofuran shunt pathway, and (iv) introduction of an engineered variant of a (+)-limonene synthase. This latter feature is non-native to this system and introduces a chemical watermark permitting convenient identification of the origin of the natural products of this engineered biosynthetic platform (Lange et al. 2011).
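
The competition for pulegone described above can be illustrated with a minimal kinetic sketch. It is not the mathematical model used in the cited studies: it assumes simple Michaelis-Menten kinetics for both enzymes, treats menthofuran as a classical competitive inhibitor of pulegone reductase, and uses hypothetical kinetic constants in arbitrary units chosen only for illustration.

```python
# Branch-point sketch: pulegone is consumed either by pulegone reductase (PR,
# towards menthone and menthol) or by CYP71A32 (towards menthofuran).
# Menthofuran is modeled as a competitive inhibitor of PR.
# All kinetic constants are hypothetical placeholders, not measured values.


def michaelis_menten(substrate, vmax, km):
    """Uninhibited Michaelis-Menten rate."""
    return vmax * substrate / (km + substrate)


def competitively_inhibited(substrate, inhibitor, vmax, km, ki):
    """Rate with a competitive inhibitor, which raises the apparent Km by (1 + I/Ki)."""
    return vmax * substrate / (km * (1.0 + inhibitor / ki) + substrate)


def menthone_flux_fraction(pulegone, menthofuran, pr, p450):
    """Fraction of pulegone consumption that proceeds through pulegone reductase."""
    v_pr = competitively_inhibited(pulegone, menthofuran, pr["vmax"], pr["km"], pr["ki"])
    v_p450 = michaelis_menten(pulegone, p450["vmax"], p450["km"])
    return v_pr / (v_pr + v_p450)


# Hypothetical parameters (arbitrary concentration and rate units).
PR = {"vmax": 10.0, "km": 5.0, "ki": 50.0}
P450 = {"vmax": 1.0, "km": 5.0}

for menthofuran in (0.0, 50.0, 500.0):
    fraction = menthone_flux_fraction(10.0, menthofuran, PR, P450)
    print(f"menthofuran = {menthofuran:6.1f}  ->  fraction towards menthone = {fraction:.2f}")
```

With these placeholder values, the share of pulegone flowing towards menthone falls as menthofuran accumulates, which is the qualitative behavior expected when the shunt product inhibits the main-pathway enzyme.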

Oxidized derivatives of linalool

Linalool is an acyclic monoterpene alcohol and contributes to the floral scents of various plants. Because of its characteristic odor, linalool is extensively used in perfumes, cosmetics, soaps and foods (Amiri et al. 2016). Linalool is synthesized from GDP by activity of the linalool synthase (TPS10 and TPS14 in Arabidopsis) and further oxidized by P450s, effectively generating a range of oxidized derivatives, and implying important roles for P450s in creating and directing the biodiversity of plant specialized metabolites (overview provided in Fig. 2). Two Arabidopsis P450s of distinct subfamilies, CYP71B31 and CYP76C3, with expression localized mainly in the flowers, were shown to catalyze oxidation of both (R)- and (S)-enantiomers of linalool to produce distinct but partially overlapping sets of hydroxylated or epoxidized products (Ginglinger et al. 2013). From a (3S)-linalool substrate, CYP71B31 yielded predominantly (3S)-1,2-epoxylinalool and diastereomeric 5-hydroxylinalool in (3S,5S)- and (3S,5R)-configuration along with traces of (3R)-4-hydroxylinalool. With (3R)-linalool as substrate, CYP71B31 afforded the analogous diastereomeric pair of (3R,5S)- and (3R,5R)-5-hydroxylinalool as the most abundant products with (3S)-4-hydroxylinalool as the minor product. Interestingly, the epoxide was not formed by CYP71B31 with (3R)-linalool as substrate. CYP76C3 accepted both (3S)- and (3R)-linalool, affording diastereomeric 5-hydroxylinalool as the major product along with 8-hydroxylinalool, 8-oxolinalool, and 9-hydroxylinalool in minor amounts (Ginglinger et al. 2013). Hence, CYP71B31 and CYP76C3 were shown to be involved in creating diversity in linalool-derived products in Arabidopsis. While they displayed a lack of stereospecificity, as observed for many other P450s, both were specific for linalool, as they did not accept geraniol, nerol, myrcene and ocimene among some other monoterpenes offered as substrates.
In general, P450s of subfamily CYP76C act as versatile monoterpene oxidases, and other members from Arabidopsis, CYP76C1, CYP76C2, and CYP76C4, were implicated in linalool metabolism as well, forming 8-hydroxylinalool as a major product and 9-hydroxylinalool as a minor product (Höfer et al. 2014). CYP76C4 and CYP76C2 were shown to additionally form 1,2-epoxylinalool, but unlike CYP76C3, these P450s are highly promiscuous in nature, accepting citronellol and lavandulol as substrates. CYP76C1 was found to contribute a range of oxidized linalool derivatives, including C-8 oxidized derivatives of linalool (8-hydroxy, 8-oxo, and 8-carboxy linalool) along with lilac aldehydes and alcohols, to the floral volatile emissions of Arabidopsis (Boachon et al. 2015). Additionally, CYP76C2 and CYP76C4 were found active with nerol, CYP76C1 and CYP76C4 with α-terpineol, and CYP76C4 toward geraniol as substrates. Differential expression of the P450s across tissues and highly variable levels of transcript accumulation suggest limited functional redundancy of these genes in Arabidopsis (Höfer et al. 2014). This large functional spectrum of the P450s also implied a large potential for exploitation in biotechnological conversion of non-native substrates (see paragraph below on monoterpene indole alkaloids).

Fig. 2

Cytochrome P450 mediated linalool metabolism in Arabidopsis. With exception of CYP76C4, the P450s are expressed in flowers. CYP76C4 has very low expression in roots. TPS: terpene synthase

Hydroxygeraniol

The dialdehyde 8-oxogeranial is a key intermediate in biosynthesis of the large group of monoterpene iridoids, carrying a characteristic bicyclic cyclopentane-pyran ring system. A rare scenario for formation of the terpenoid scaffold (see paragraph below on complex macrocyclic diterpenes for a second example) is the initial oxidative activation of geraniol, a simple monoterpene alcohol, by P450s and a dehydrogenase and subsequent reductive cyclization, catalyzed by the NADPH-dependent iridoid synthase (Geu-Flores et al. 2012; Kries et al. 2016). Iridoids are broadly found in plants and include monoterpene indole alkaloids, exhibiting a diverse range of bioactivities (Dinda et al. 2011; Tundis et al. 2008; Viljoen et al. 2012), fueling interest in biosynthetic production. 8-hydroxylation of geraniol constitutes the first committed step in iridoid biosynthesis (Collu et al. 2001; Höfer et al. 2013). In the quest to identify optimal P450s for production of iridoid intermediates, an extended functional probing of available enzymes of the CYP76 family was undertaken. Two P450s, the archetypical CYP76B6 from Catharanthus roseus and CYP76C4 from Arabidopsis, were shown to oxidize geraniol (Höfer et al. 2013). CYP76B6 catalyzed the two consecutive regio-selective C-8 oxidations to afford the aldehyde 8-oxogeraniol intermediate of the iridoid pathway, via 8-hydroxygeraniol. On the other hand, the related CYP76C4 was found to oxidize geraniol predominantly to 9-hydroxygeraniol, with 8-hydroxygeraniol only as a minor product (Fig. 3). In vivo reconstruction of the pathway in Nicotiana benthamiana, which allowed for convenient gene stacking, including CYP76C4, resulted in unspecific conversion of the generated dioxygenated 8-oxogeraniol, and to a minor extent the intermediate monoterpene diol, into a range of further oxidized, reduced and conjugated derivatives (Höfer et al. 2013). This finding pointed at potential complications with this system.
However, it also indicated that efficient channeling through engineered pathways can possibly outcompete endogenous, non-specific conversions, with an underlying mechanism remaining yet to be determined (see paragraph below on new perspectives).

Fig. 3

Cytochrome P450 mediated regio-specific oxidation of geraniol takes place at the marked C-8 and C-9 positions

Diversity in sesquiterpenoid metabolism

The regio- and stereospecificity of various P450s involved in sesquiterpene biosynthesis leads to broad diversification of oxygenated sesquiterpenes. Here, our current knowledge indicates that the sesquiterpene synthases, rather than the P450s, contribute more significantly to pathway bifurcations, leading from the most common precursor, farnesyl diphosphate (FDP), to distinct families of products. However, the substrate specificity and/or promiscuity of the relevant P450s typically drive chemical diversification of the different families of sesquiterpenoids. Illustrative examples of biotechnologically motivated discovery of P450s yielding the industrially relevant sesquiterpene feedstock nootkatone and the P450-driven diversification of the structurally intriguing sesquiterpene lactones, including the flagship molecule artemisinin, have been recently reviewed (e.g. Hamberger and Bak 2013; Pateraki et al. 2015). Here we discuss selected examples of sesquiterpene metabolite pathways, where the activity of P450s plays a key role, often at the end of the pathway.

Santalol

The sesquiterpene alcohols (Z)-α-santalol, (Z)-β-santalol, (Z)-epi-β-santalol and (Z)-α-exo-bergamotol are the predominant constituents of the essential oil of sandalwood, Santalum spp., and represent high-value commercial targets for the flavor and fragrance industry, and, with a higher market volume, as bio-insecticides and insect repellents (Roh et al. 2011). Challenges in sustainable extraction from its natural source, the mature heartwood of the trees, and formal chemical synthesis have raised interest in metabolic engineering to provide alternative access to these high-value targets. A metabolic engineering strategy would require knowledge of the biosynthetic route, which has led to an ongoing line of research. A panel of orthologous sesquiterpene synthases was identified from three sandalwood species in groundbreaking work in 2011. The santalene/bergamotene synthase (SSy) catalyzes formation of a blend of α-santalene, β-santalene, epi-β-santalene and α-exo-bergamotene, which were later reported to be decorated by P450s (Celedon et al. 2016; Diaz-Chavez et al. 2013; Jones et al. 2011). Here, initial transcriptome mining of the xylem of S. album permitted identification of an expanded family of candidate P450 genes in the CYP71 clan. Their functional characterization established a total of ten P450s of the CYP76F subfamily (overview provided in Fig. 4; Diaz-Chavez et al. 2013). In vitro and yeast in vivo assays demonstrated that nine out of these ten genes encoded multi-substrate santalene/bergamotene oxidases with some functional redundancy. The P450s were found to hydroxylate the terminal allylic methyl group of santalenes and bergamotene to predominantly yield the (E)-stereoisomers (Diaz-Chavez et al. 2013).
To identify the hypothesized P450 involved in formation of the sesquiterpene alcohols in (Z)-configuration, Celedon and co-workers integrated comparative transcriptomics across three different tissues, critically including the heartwood, to demonstrate a unique signature (Celedon et al. 2016). One P450 specifically fulfilled the criteria selected by the authors, of high and spatially exclusive expression in the heartwood (Fig. 4). Intriguingly, and despite representing a member of the CYP71 clan, the candidate showed no significant homology to related enzymes of terpenoid metabolism, or the previously identified members of subfamily CYP76F, which would have rendered simple homology-informed approaches inefficient in the identification. Co-expression of codon-optimized variants of SaCYP736A167 and SaPOR2 in yeast allowed in vitro assays with supplemented substrates, while engineering of a yeast strain with the corresponding SaFDPS and SaSSy confirmed results by stereoselective in vivo formation of the four main sandalwood oil sesquiterpenols in the correct (Z)-configuration. Here, the instructive strategy for identification of SaCYP736A167 as a stereo-selective P450 in sandalwood sesquiterpene alcohol formation showcases its role in in vivo pathway bifurcation to channel the olefinic substrates into the specific stereoisomeric alcohols naturally present in the sandalwood oil.

Fig. 4

Cytochrome P450 mediated santalol metabolism in S. album. Stereo-specificity of heartwood specific CYP736A167 channels the olefin precursor into the (Z)-isomers of the constituent santalols of the sandalwood oil. SaSSY: S. album santalene/bergamotene synthase

Rotundone

(−)-Rotundone is an oxygenated sesquiterpene and an important aroma constituent contributing a peppery scent in various herbs and spices, including pepper, oregano and basil. Despite being present only in very low concentrations, it is also a characteristic of Shiraz wine varieties (Huang et al. 2014; Takase et al. 2016; Wood et al. 2008). The terpene synthase forming the scaffold and precursor of rotundone was identified by Drew and co-workers in the challenging system of developing grapevine (Vitis vinifera) berries (Drew et al. 2016). The main hurdles tackled were an expected extremely low expression of the pathway in conjunction with an unprecedented number and genomic complexity of the terpene synthase family (Martin et al. 2010). The α-guaiene synthase was found as a novel allelic variant of VvTPS24, established as an enzyme forming a blend of selinene-type sesquiterpenes, with only two amino acid residues controlling the product profile. Inspired by earlier reports of P450s active in terpenoid metabolism, a functional P450 of the CYP71BE subfamily was discovered as specifically expressed in the Syrah grape exocarp, consistent with accumulation of (−)-rotundone (see overview provided in Fig. 5; Takase et al. 2016). In vitro assays with recombinant enzymes in microsomes from yeast demonstrated that CYP71BE5 could also catalyze the C-2 oxidation of (+)-valencene to β-nootkatol. However, the absence of these terpenes in the Syrah grape exocarp suggested that CYP71BE5 functions as α-guaiene 2-oxidase in planta.

Fig. 5

Cytochrome P450 mediated biodiversity of oxygenated sesquiterpenes in various plants. CYP71BE5 has been found to be active in Syrah grape exocarp, CYP71D20 in tobacco, CYP71D55 in henbane. Vv: Vitis vinifera, TPS: terpene synthase, EAS: 5-epi-aristolochene synthase, HPS: Hyoscyamus muticus premnaspirodiene synthase

Capsidol

Capsidol is a bicyclic, dihydroxylated sesquiterpene produced by many solanaceous species in response to various environmental cues including pathogen attack, elicitor challenge or exposure to UV light (Ralston et al. 2001; Takahashi et al. 2005). Biosynthesis of capsidol involves the 5-epi-aristolochene synthase, catalyzing formation of the bicyclic sesquiterpene olefin intermediate, 5-epi-aristolochene (5-EA), followed by P450-mediated dihydroxylation of 5-EA to capsidol (Facchini and Chappell 1992; Takahashi et al. 2005). The corresponding enzyme CYP71D20 from Nicotiana tabacum was found to catalyze the unique stereo- and regio-selective sequential dihydroxylation of 5-EA at C-1 and C-3 to afford capsidol (Greenhagen et al. 2003; Ralston et al. 2001; Takahashi et al. 2005). Investigation of the kinetic behavior of CYP71D20 established the putative sequence of oxidation of 5-EA at the C-1 position followed by the C-3 position, generating stereoselectively 1β-hydroxylated EA followed by 1β,3α-capsidol (Takahashi et al. 2005). CYP71D20 has also been found to catalyze the conversion of premnaspirodiene to solavetivone, albeit at very low rates (Greenhagen et al. 2003). Although the pathway bifurcation for the sesquiterpene hydrocarbon intermediates 5-EA and premnaspirodiene depends on the corresponding sesquiterpene synthase, the stereo- and regio-selective catalysis by CYP71D20 of the individual sesquiterpene hydrocarbon intermediate plays an important role in creating the diversity of the oxygenated sesquiterpenes (Fig. 5).

Solavetivone

Solavetivone, a potent antifungal phytoalexin, plays a role as a defense molecule in the solanaceous plants henbane (Hyoscyamus muticus) and potato. It is synthesized from the vetispirane-type sesquiterpene scaffold premnaspirodiene, which is formed by the H. muticus premnaspirodiene synthase. Another member of the CYP71D subfamily, CYP71D55, was found to catalyze the successive regio-selective oxidation at the C-2 position of premnaspirodiene to yield solavetivone (Fig. 5). In vitro assays demonstrated that CYP71D55 also converted valencene and 5-EA, which have an eremophilane-based sesquiterpene scaffold, but only to their corresponding mono-oxygenated products (Takahashi et al. 2007), unlike the related P450s catalyzing successive oxidations such as CYP71D20 and CYP71AV8 (Cankar et al. 2011; Ralston et al. 2001). As with previous examples, CYP71D55 is involved in increasing the biodiversity of the oxygenated sesquiterpenes.

Artemisinin

The founding member of the subfamily, CYP71AV1, from Artemisia annua catalyzes a three-step oxidation of amorphadiene to artemisinic acid, an off-product of the pathway to the anti-malaria pharmaceutical artemisinin (Ro et al. 2006). The homology-based identification of orthologous sequences in closely related Asteraceae led to the discovery of enzymes with lowered regioselectivity (i.e. formation of a distinct amorphadienol isomer), in addition to a lack of activity for the formation of artemisinic acid (Komori et al. 2013). The initial number of different amino acids over the entire sequence was narrowed down to a section carrying nine residues through functional testing of chimeric variants with swapped domains. Using structural modeling, four amino acids emerged, putatively residing in the catalytic site. Following site-directed mutagenesis and functional testing, conversion of a serine to phenylalanine in position 479 reduced the conversion of the alcohol to the aldehyde nearly 20-fold and limited the activity of the P450 to a single step (Komori et al. 2013). While not of immediate biotechnological relevance, as the pathway to artemisinin proceeds via the aldehyde, this approach highlights the rational engineering of the activity of a P450 with an important role at this metabolic junction, guided by natural variation and structural modeling.

Oxidation of labdane-type and macrocyclic diterpenes

With backbones consisting of 20 carbon atoms, the theoretically possible structural complexity of diterpenes far exceeds that of the classes using two or three C5 building blocks. Yet, among documented plant terpenoids, the number of diterpenoids is in the same range as that of sesquiterpenoids, with substantially fewer monoterpenoids known (diterpenoids, 12,505 vs. sesquiterpenoids, 13,981 vs. monoterpenoids, 4129; Dictionary of Natural Products 23.1; Pateraki et al. 2015). On the other hand, over 95% of known diterpenoid structures carry two or more oxygen atoms (sesquiterpenoids, 90%; monoterpenoids, 92%), indicating increased relevance of P450s in generation of structural complexity and chirality in the respective biosynthetic routes.

Gibberellins (GAs) are archetypal plant diterpenes and hormones involved in the regulation of plant growth and development, but also controlling developmental processes such as germination, flowering, and reproduction. Formation of GA12, the simplest of the GAs, is catalyzed in a linear route by two classes of enzymes: a pair of diterpene synthases (diTPS) forming ent-kaurene by cyclization of the common acyclic precursor geranylgeranyl diphosphate (GGDP), and a pair of sequentially acting P450s oxidizing ent-kaurene via ent-kaurenoic acid to GA12 (Helliwell et al. 2001). The P450s involved are noteworthy, as they are members of two divergent families, CYP701 and CYP88, which, after duplication, have repeatedly served as starting material for the evolution of enzymes in specialized metabolism. The concept of P450s as drivers of the evolution of specialized metabolism has been reviewed in Hamberger and Bak (2013), while a plant and a biotechnological perspective for their discovery are given in Nelson and Werck-Reichhart (2011) and Pateraki et al. (2015), respectively. CYP701 is the only member of the CYP71 clan known to be involved in terpenoid general metabolism and catalyzes the regio-specific three-step oxidation at carbon atom C-18 of ent-kaurene to ent-kaurenoic acid (Helliwell et al. 1999). CYP88 is found in the CYP85 clan and catalyzes further oxidation and contraction of the B-ring of ent-kaurenoic acid to GA12 (see overview provided in Fig. 6; Helliwell et al. 2001). Combinatorial biosynthesis, which assembles both natural and neo-natural pairs of diTPS into modules, has been used to biosynthesize a highly diverse array of diterpene scaffolds (Andersen-Ranberg et al. 2016). This approach was used to probe the activity of two representative members of mono- and dicotyledon CYP701A with 22 different labdane-type diterpenes (Mafu et al. 2016).
AtCYP701A3 was found to exhibit a very broad promiscuity and accepted two-thirds of the diterpenes, while the rice ortholog OsCYP701A6 was strictly specific towards ent-kaurene and ent-isokaurene. It was speculated that this phenomenon correlates with the accumulation of diverse labdane-type diterpenoids in rice, but not in Arabidopsis, which consequently requires higher specificity in rice (Mafu et al. 2016). Next to possible implications for enzyme evolution, this study was among the first to probe the promiscuity of plant P450s with terpenoids, which is of outstanding relevance for biotechnological applications (see below for another example, biosynthesis of steviol). Dedicated P450s of terpenoid specialized metabolism in rice have been discovered in multiple families, expanded in the CYP71 clan, and present an intriguing diversity. Their biochemical activity and involvement in control of the pathways to bioactive diterpenoids of the labdane class have been comprehensively reviewed (Schmelz et al. 2014). Recently, ent-10-oxodepressin, a novel casbane-type macrocyclic diterpene phytoalexin, was discovered in rice (Inoue et al. 2013). This indicates that, despite the substantial understanding gained since characterization of the first P450 in phytoalexin biosynthesis more than two decades ago (Kato et al. 1995), additional pathways remain to be discovered.

Fig. 6

Cytochrome P450 mediated regio-specific oxidation of ent-kaurene in GA12 biosynthesis. Regio-specificity of P450s (CYP714A2 and CYP88) leads to pathway bifurcation between general and engineered specialized metabolism

Casbene based diterpenes

Casbene (and related cembrene/neocembrene) and casbene-derived phorbol ester diterpenoids increase in complexity with additional ring closures to become lathyrane, jatrophane, tigliane and ingenol, and their derivatives. Together they constitute characteristic macrocyclic diterpenoids of the Euphorbiaceae family (Appendino 2016). Their pharmaceutical applications include anti-cancer, anti-viral and anti-tumor activities. The structural diversity of these compounds results from an unconventional biosynthetic route, requiring activation of the backbone of the simple precursor macrocyclic diterpene by P450s prior to additional cyclization and re-arrangement (Luo et al. 2016). The principle resembles formation of iridoid terpenoids, where the reductive cyclization was shown to entail activation of the monoterpene scaffold (Geu-Flores et al. 2012), in contrast to typical diterpenoid biosynthetic routes, where the cyclization represents the first committed entry step of the pathway. In mining for candidate P450s, a significant, Euphorbiaceae-specific bloom in subfamily CYP726A of tribe 71D (consisting of subfamily CYP71D and additional closely related subfamilies) was reported earlier in E. peplus, which accumulates ingenane diterpenoids. Inspired by an activity of the founding member from E. lagascae in formation of structurally unusual epoxy-fatty esters (CYP726A1, Cahoon et al. 2002), a function of CYP726A members in phorbol ester formation was proposed (Zerbe et al. 2013). In the Euphorbiaceae castor bean (Ricinus communis) it was shown that C-5 oxidation of casbene is catalyzed by CYP726A14, CYP726A17 and CYP726A18 (King et al. 2014; Boutanaev et al. 2015). 5-keto-casbene, characteristic of simpler bicyclic diterpenes in that species, was found to be further epoxidized by CYP726A16 (see overview provided in Fig. 7). The genes encoding these P450s were reported to be clustered in the genome of castor bean with the corresponding casbene synthase.
A shared pattern of co-expression, which implied a role in the pathway to oxidized casbenes in planta, facilitated their discovery (King et al. 2014). Another member of the same subfamily, CYP726A15, which is also located in the same gene cluster in castor bean, was found to be active on neocembrene, forming 5-keto-neocembrene. Recently, two casbene-activating P450s of the subfamilies CYP71D and CYP726A, CYP71D445 and CYP726A27, were isolated from E. lathyris (Luo et al. 2016), which allowed identification of functional orthologues also from E. peplus (CYP71D365 and CYP726A4, respectively). Key for their discovery was analysis of the mature seed transcriptome of E. lathyris, which, in contrast to the earlier developed transcriptome in E. peplus, was highly enriched in full-length sequences of P450s in those families. Functional characterization, both in the transient N. benthamiana system and engineered yeast, demonstrated that CYP71D445 and CYP71D365 catalyzed the regio-specific C-9 oxidation of casbene. Considering that this specific oxidized position is characteristic of complex multicyclic diterpene types, it was proposed that this activity controlled an early bifurcation step in the biosynthesis of macrocyclic diterpenoids. Combining CYP71D445 with CYP726A27 and CYP71D365 with CYP726A4 yielded oxidation at C-5 and formation of 9-keto casbene and 9-keto-5-hydroxy-casbene, mechanistically implausible intermediates to multicyclic diterpenes. Indeed, in conjunction with an enzyme of the alcohol dehydrogenase family, co-expressed with the casbene synthase and P450s, it was demonstrated that instead of the oxidized ketone derivatives, hydroxyl derivatives of casbene were effectively converted to the multicyclic jolkinol C (Luo et al. 2016). Notably, engineered strains of yeast for production of casbene and expression of the P450s were critical for independent confirmation of the enzyme activities and for isolating intermediates supporting the hypothesized pathway.
However, these strains did not afford jolkinol C, indicating that potential limitations of the current yeast system will need to be addressed for further engineering of biosynthetic production towards higher cyclized phorbol esters.

Fig. 7
Regio-specific oxidation of casbene by different orthologous members of cytochromes P450. ADH alcohol dehydrogenase

Carnosic acid, tanshinones, steviol and forskolin

Labdane-type diterpenoids are the most widely distributed type of diterpenoids in specialized metabolism (Zi et al. 2014). Carnosic acid and its related derivatives are diterpenes with a distinctive aromatic ring and exhibit a wide range of activities including antioxidant, anticancer and antimicrobial activities (Ignea et al. 2016; Scheler et al. 2016). These compounds belong to the group of labdane-type diterpenes with a bicyclic decalin core and are formed through the intermediates of miltiradiene, dehydroabietadiene (or abietatriene) and ferruginol. The P450s involved in the biosynthesis of carnosic acid have recently been identified from Salvia pomifera, Rosmarinus officinalis, and S. fruticosa (overview provided in Fig. 8; Ignea et al. 2016; Scheler et al. 2016). Scheler and co-workers reported in elegant work that RoCYP76AH4, RoCYP76AH22, RoCYP76AH23, SfCYP76AH24 from R. officinalis and S. fruticosa carry out two successive oxygenations of dehydroabietadiene to form ferruginol and 11-hydroxyferruginol, expanding on earlier reports of homologs which only yielded ferruginol (Bozic et al. 2015, see also below). Modeling studies and site directed mutagenesis established that individual amino acid residues in the active site confer specificity of the P450s towards either formation of ferruginol, or in case of the multifunctional enzymes, both formation of ferruginol and its further oxidation to 11-hydroxy ferruginol (CYP76AH4, CYP76AH22 to CYP76AH24). This work also identified three members of a closely related subfamily, SfCYP76AK6, RoCYP76AK7, and RoCYP76AK8 from S. fruticosa and R. officinalis as oxidases with a specificity for carbon atom C-20, and that accept both ferruginol and 11-hydroxyferruginol as well as miltiradiene to produce pisiferic acid, carnosic acid and miltiradien-20-al, respectively (Scheler et al. 2016). 
Elucidation of the complete biosynthetic pathway for carnosic acid/pisiferic acid and the rational site-directed mutagenesis of the required P450s to modulate their specificity is encouraging for future metabolic engineering of these phenolic diterpenes in microbial hosts.

Fig. 8
Regio-specific oxidation of miltiradiene by different orthologous members of cytochromes P450. Regio-specificity of P450s leads to pathway bifurcation and metabolic diversity. CPS copalyl diphosphate synthase, MiS miltiradiene synthase

Independent, complementary research found that CYP76AH24 and CYP76AK6 from S. pomifera, together with CYP76AH4 and CYP76AK8 from R. officinalis, account for the set of oxidations in the pathway from dehydroabietadiene to carnosic acid (Ignea et al. 2016). Functional analysis of the P450s in engineered yeast showed in addition that yet another member of a related subfamily, CYP71BE52, can oxidize ferruginol in the C-2 position to salviol. Hence, CYP76AH22/CYP76AH24/CYP76AH4, CYP76AK6/CYP76AK7/CYP76AK8, and CYP71BE52 control multiple pathway bifurcations leading to chemical diversification in dehydroabietadiene-based diterpenoid metabolism.
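The bifurcations described for the carnosic acid pathway can be made explicit by treating the reported conversions as a small directed graph. The following Python sketch is illustrative only: the reaction list is a simplified transcription of the conversions named in this section (grouping the CYP76AH and CYP76AK enzymes by subfamily is our shorthand, and the spontaneous step from miltiradiene to dehydroabietadiene follows Zi and Peters 2013), and the names REACTIONS and branch_points are hypothetical helpers, not part of any published tool.

```python
from collections import defaultdict

# (substrate, product, catalyst) triples, simplified from the conversions
# reported in this section. Enzyme grouping by subfamily is an assumption
# made for brevity, not a claim that every member performs every step.
REACTIONS = [
    ("miltiradiene", "dehydroabietadiene", "spontaneous"),
    ("dehydroabietadiene", "ferruginol", "CYP76AH4/22/23/24"),
    ("ferruginol", "11-hydroxyferruginol", "CYP76AH4/22/23/24"),
    ("ferruginol", "pisiferic acid", "CYP76AK6/7/8"),
    ("11-hydroxyferruginol", "carnosic acid", "CYP76AK6/7/8"),
    ("miltiradiene", "miltiradien-20-al", "CYP76AK6/7/8"),
    ("ferruginol", "salviol", "CYP71BE52"),
]


def branch_points(reactions):
    """Return {substrate: sorted products} for substrates with more than one fate."""
    fates = defaultdict(set)
    for substrate, product, _catalyst in reactions:
        fates[substrate].add(product)
    return {s: sorted(p) for s, p in fates.items() if len(p) > 1}


if __name__ == "__main__":
    for substrate, products in branch_points(REACTIONS).items():
        print(f"{substrate} -> {', '.join(products)}")
```

In this simplified representation, miltiradiene and ferruginol emerge as the branch points, consistent with the regio-specific P450s acting as gatekeepers at those intermediates.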

Salvia miltiorrhiza, another member of the family of Lamiaceae, is well known for accumulating tanshinones, which have expansive uses in traditional Chinese medicine and have also attracted interest due to anti-bacterial and a range of therapeutic activities. Tanshinones, like carnosic acid, are abietane-type diterpenoids, derived from ferruginol through a biosynthetic route involving members of subfamilies CYP76AH and CYP76AK: founding members CYP76AH1 and CYP76AK1 of both subfamilies were discovered in S. miltiorrhiza. CYP76AH1 was established as ferruginol synthase; however, a rather obscure mechanism was proposed to explain the apparent conversion of the diterpene olefin miltiradiene to ferruginol (Guo et al. 2013). This misperception was later corrected when the orthologous CYP76AH4 from R. officinalis was functionally characterized, demonstrating that ferruginol is produced by CYP76AH4-mediated oxidation of dehydroabietadiene (abietatriene), which is a spontaneous oxidation product from miltiradiene (Zi and Peters 2013). Subsequent identification and characterization of CYP76AH3 and CYP76AK1 by expression in yeast led to demonstration of a pathway with at least one intersection and bifurcation controlled by these P450s. CYP76AH3 catalyzed the oxidation to yield a C-11 hydroxyl function, as well as formation of 7-keto ferruginol (sugiol) and 7-keto-11-hydroxyferruginol (11-hydroxy sugiol). In contrast, CYP76AK1 showed regio-selectivity for oxidation at carbon atom C-20 of both 11-hydroxy ferruginol and 11-hydroxy sugiol (Guo et al. 2016).

Another group of labdane-type diterpenoids with commercial relevance are steviol-glucoside sweeteners accumulating in the leaves of the Asteraceae Stevia rebaudiana. These are based on a bifurcation, leading from the GA12 intermediate ent-kaurenoic acid through a single hydroxylation at carbon atom C-13 to steviol (Fig. 6). Recombinant Arabidopsis CYP714A2, with a role in GA metabolism, was shown to also yield steviol, when incubated with ent-kaurenoic acid (Nomura et al. 2013). Inspired by this finding, an active ent-kaurenoic acid hydroxylase was identified in S. rebaudiana. An engineered variant of CYP714A2 finally yielded over 15 mg L−1 of steviol, when expressed in a strain of E. coli dedicated for production of ent-kaurenoic acid (Fig. 6; Wang et al. 2016).

In the Lamiaceae Coleus forskohlii, diterpenes carrying both abietane and epoxy-labdane (13R-manoyl oxide) scaffolds are prevalent, and their biosynthetic origins have been elucidated (Pateraki et al. 2014). Of interest in this plant species is the root-specific accumulation of the 13R-manoyl oxide-derived, structurally complex diterpene forskolin, which consists of an oxygen heteroatom-containing labdane scaffold with five functionalized positions. Identification of a substantial bloom of subfamily CYP76AH yielded an intriguing number of enzymes with, in part, multifunctional activity towards 13R-manoyl oxide and, concomitantly, extensive chemical diversification of the product palette. Ultimately, combinatorial testing led to a minimal set sufficient to catalyze regio- and stereo-specific formation of deacetyl-forskolin. Isolation of a regio-selective acetyl-transferase completed a biosynthetic route to forskolin, which was stably integrated in an engineered yeast strain optimized for production of 13R-manoyl oxide (Pateraki et al. 2017; patents Andersen-Ranberg and Pateraki 2016; Hamberger et al. 2015, 2016).

Diterpene resin acids

The labdane-type diterpene resin acids are important constituents of the oleoresin defense of conifers and have both constitutive and induced roles in protecting these long-lived plants against fast evolving pests and pathogens. Industrial applications include the use of resin acids as biopolymers, constituents of high-end inks, in glues, tackifiers and as coatings (Bohlmann and Keeling 2008). The sequential activities of diterpene synthases (diTPSs) and cytochromes P450, creating a mixture of chemically divergent resin acids, and to a lesser extent aldehydes and alcohols, are increasingly well understood and may serve as an illustrative example of meticulous research providing deep insights into the genetic and biochemical underpinning of chemical diversity. The bifunctional diTPS catalyzing stereo-specific formation of (+)-abietadiene and the corresponding P450-related activity have been known for over two decades, with the first P450 cloned and characterized in 2005 (Funk and Croteau 1994; LaFever et al. 1994; Ro et al. 2005). Formation of the backbones of four distinct diterpenes through the activity of the Norway spruce (Picea abies) diTPS appeared established, until Keeling and co-workers demonstrated that the initial enzyme product is the thermally labile tertiary alcohol 13-hydroxy-8(14)-abietene, which, after water elimination, yields the known mix of four olefins, levopimaradiene, abietadiene, neoabietadiene and palustradiene (Keeling et al. 2011). Similarly, for the P450s, established members of the CYP720B subfamily, Pinus taeda (loblolly pine) CYP720B1 and Picea sitchensis (Sitka spruce) CYP720B4 from the CYP85 clan, were shown to oxidize a range of diterpene olefins to yield the corresponding resin acids in planta, with production in engineered yeast reaching 0.9 and 0.2 mg L−1 for isopimaric acid and abietic acid, respectively (see overview provided in Fig. 9, Hamberger et al. 2011; Ro et al. 2005). 
These enzymes are members of a largely expanded subfamily with about a dozen P450s in each conifer species investigated. Recently, a comprehensive investigation by Geisler and co-workers provided evidence for this impressive genetic diversity and the missing link with the unstable diterpene product. CYP720B2 and CYP720B12, members of a clade distant from the previously established P450s, were functionally characterized in three conifer species, Pinus banksiana (jack pine), Pinus contorta (lodgepole pine) and Picea sitchensis. While the enzymes did not accept diterpene olefins, they efficiently catalyzed regio-selective C-18 oxidation of abietaenol, converting the unstable diterpene alcohol (13-hydroxy-8(14)-abietene in Fig. 9) into the corresponding hydroxyl-resin acid, ultimately yielding, after non-enzymatic water loss, the panel of levopimaric acid, abietic acid, neoabietic acid and palustric acid. In contrast, the diterpene olefin isopimaradiene, which does not proceed via a tertiary alcohol during biosynthesis, is effectively oxidized by CYP720B4 and CYP720B1, but not by the enzymes accepting the diterpene alcohol as substrate, highlighting the exceptional modularity and plasticity of conifer diterpene metabolism (Geisler et al. 2016). The distinct difference in the substrate specificity of the P450s results in an alternate and extended but finally convergent route for conifer resin acid biosynthesis, with reconnected end products of both distant clades. From a biotechnological perspective, those multifunctional P450s have a high potential in synthetic pathways when combined with diterpene modules. For example, CYP720B4 has recently been used to engineer the in vivo production of dehydroabietic acid from glucose via miltiradiene in yeast, which was hypothesized as a key intermediate en route to the diterpene therapeutic triptolide from Tripterygium wilfordii (Forman et al. 2017).
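The convergence of the two routes can be illustrated by enumerating paths through a minimal graph of the conversions described above. The Python sketch below is a simplification under stated assumptions: only abietic acid is followed as a representative of the four-member panel, the node label "hydroxy-resin acid" stands in for the C-18 oxidized alcohol intermediate, and the names EDGES and routes are hypothetical. The graph is acyclic, so no cycle guard is included.

```python
# Conversions simplified from the paragraph above; each edge carries the
# catalyst (or non-enzymatic step) named there. Abietic acid is used as a
# representative end product; this restriction is an illustrative assumption.
EDGES = {
    "13-hydroxy-8(14)-abietene": [
        ("abietadiene", "non-enzymatic water elimination"),
        ("hydroxy-resin acid", "CYP720B2/CYP720B12 (C-18 oxidation)"),
    ],
    "abietadiene": [("abietic acid", "CYP720B1/CYP720B4")],
    "hydroxy-resin acid": [("abietic acid", "non-enzymatic water loss")],
    "isopimaradiene": [("isopimaric acid", "CYP720B1/CYP720B4")],
}


def routes(start, goal, edges, path=()):
    """Yield each route from start to goal as a tuple of (product, catalyst) steps."""
    if start == goal:
        yield path
        return
    for product, catalyst in edges.get(start, []):
        yield from routes(product, goal, edges, path + ((product, catalyst),))


if __name__ == "__main__":
    for route in routes("13-hydroxy-8(14)-abietene", "abietic acid", EDGES):
        print(" | ".join(f"{product} [{catalyst}]" for product, catalyst in route))
```

Enumerating routes from the unstable alcohol to abietic acid returns both the olefin route and the alcohol route, whereas isopimaradiene has a single route to isopimaric acid, mirroring the substrate specificities reported by Geisler et al. (2016).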

Fig. 9
Cytochrome P450 mediated regio-specific oxidation of various diterpene scaffolds leads to chemical diversity of diterpene resin acids in conifers. Example of P450s from different clades in subfamily CYP720B which yield the same products through distinct routes. *The pathway forming dehydroabietadiene from GGDP is not known in conifers. LAS levopimaradiene/abietadiene synthase, ISO isopimaradiene synthase, PIM pimaradiene synthase, diTPS diterpene synthase

Paclitaxel

Paclitaxel (Taxol®), a highly functionalized cancer therapeutic, and arguably one of the highest-value diterpenoids, is found in the bark of yew species (Taxus spp.) (Wani et al. 1971). Due to the low accumulation in its natural host (i.e. 0.02% yield from the concentrated alcohol extract of the stem bark of Taxus brevifolia; Wani et al. 1971), the biotechnological production of paclitaxel has captured the interest of scientists for a long time to complement production in cell culture (Howat et al. 2014). The biosynthesis of paclitaxel was estimated to involve approximately 20 discrete enzymatic steps (Chau et al. 2004). The first committed step is formation of taxa-4(5),11(12)-diene, the macrocyclic core skeleton of paclitaxel (Wildung and Croteau 1996). Based on the prevalent distribution of the pattern of hydroxylation, the next predicted step requires the P450 taxadiene-5α-hydroxylase, CYP725A4, which catalyzes the regio-selective hydroxylation of the 5α position of taxadiene to yield taxa-4(20),11(12)-dien-5α-ol (T5OH/T-5α-ol) (Fig. 10, Biggs et al. 2016b; Edgar et al. 2016; Hefner et al. 1996; Jennewein et al. 2004). The subsequent conversions include a series of P450-mediated oxygenations, providing the molecular handles for acylations and further conjugation of the skeleton. These oxygenations occur, in a sequence not yet fully elucidated, at carbon atoms C-2, C-9, C-10, C-13 and C-14 (Jennewein and Croteau 2001; Jennewein et al. 2003, 2001; Schoendorf et al. 2001; Walker and Croteau 2001). Much to the detriment of approaches building biosynthetic platforms, even the early steps show metabolic bifurcations. For example, acetylation of T5OH was found to re-route the taxoid into further hydroxylation at carbon C-10 or C-14 by two P450s, while the free alcohol T5OH underwent oxidation at carbon C-9 or C-13, depending on the P450 (reviewed in Kaspera and Croteau 2006). 
Potential multifunctional enzymes and uncertainties of the sequence of conversions dramatically complicated the rational design of routes to individual products. Hence, substantial effort was spent investigating the established first oxidative step, which is highly illustrative of the challenges encountered. Various research groups have described the heterologous expression and biosynthetic formation of taxa-4(5),11(12)-diene and its first oxidation by CYP725A4 in chassis organisms such as yeast, tobacco, E. coli and tomato (Ajikumar et al. 2010; Dejong et al. 2006; Engels et al. 2008; Huang et al. 2001; Kovacs et al. 2007; Rontein et al. 2008). Heterologous production of taxadiene was achieved in substantial amounts, and oxidation, resulting in further products, was accomplished to a minor degree (Zhou et al. 2015). However, metabolic engineering efforts showed that the formation of the CYP725A4-mediated 5-hydroxylated product is highly limited due to a behavior of this enzyme possibly dependent on the experimental context. It has been observed that the heterologous expression of taxadiene synthase and CYP725A4 in tobacco (Nicotiana sylvestris) results in the formation of 5(12)-oxa-3(11)-cyclotaxane (OCT) instead of T5OH (Rontein et al. 2008). Further studies described that expression of recombinant taxadiene synthase and CYP725A4 in an E. coli host leads to equal formation of T5OH and OCT and indicated that the lack of specificity of the oxygenation represents a first limitation to overcome (Ajikumar et al. 2010; Edgar et al. 2016; Zhou et al. 2015). Investigating the mechanism in more detail, earlier studies have suggested that promiscuous H-atom abstraction from both the isomeric olefinic precursors, taxa-4(5),11(12)-diene and taxa-4(20),11(12)-diene, forming an allylic radical followed by oxygen insertion, can lead to the formation of T5OH (Hefner et al. 1996; Jennewein et al. 2004). 
On the other hand, two independent recent studies have suggested an epoxide intermediate (Barton et al. 2016; Edgar et al. 2016) and formation of a mixture of products derived from taxa-4(5),11(12)-diene and a single oxidation product from taxa-4(20),11(12)-diene, concluding that an unstable epoxide intermediate may contribute to the native product profile of CYP725A4 (Edgar et al. 2016). In contrast, and supporting promiscuity of CYP725A4, Biggs and co-workers performed in vivo characterization in engineered E. coli and in vitro characterization in a nanodisc platform, confirming the formation of T5OH, OCT, iso-OCT and additional oxygenated taxoids at titers exceeding 500 mg L−1 (Biggs et al. 2016a). This example of paclitaxel may well reflect two other high profile scenarios, underscoring the challenges met by Synthetic Biology in production of industrially relevant targets at commercially competitive levels: artemisinin and morphine (Oye et al. 2015; Peplow 2016).

Fig. 10
CYP725A4 mediated regio-specific oxidation of taxadiene takes place at the marked C-5 position

Oxidative pathways of triterpene specialized metabolism

Saponins are a diverse group of sugar conjugated triterpenes detected in a broad variety of plants including medicinal plants such as licorice and ginseng, as well as crop plants, such as legumes and oats (Fukushima et al. 2011; Haralampidis et al. 2002). Biotechnological interest in saponins is fueled by applications in the pharmaceutical and agrochemical industries, but also their use in food and cosmetics (Huhman et al. 2005; Sparg et al. 2004; Suzuki et al. 2002; Tava and Avato 2006). Our knowledge of oxidative functionalization of the triterpene scaffold by P450s and the subsequent glycosylation has dramatically expanded during the last decade, since the first enzyme with activity towards both β-amyrin and sophoradiol was identified in soybean (Shibuya et al. 2006). An excellent overview of the history of discovery and the diverse reactions of triterpenoids catalyzed by P450s and corresponding glucosyl transferases was recently published by Seki and co-workers (Seki et al. 2015). Based on this substantial body of evidence, it is well established that regio- and stereospecific oxidations by P450s in triterpenoid metabolism represent critical points of divergence controlling subsequent modification and conjugation. For their illustrative nature, we briefly provide an example of regiospecific gatekeepers controlling the biosynthetic pathways of two types of saponins from β-amyrin, and then focus on the most recent developments and studies of biotechnological relevance (see overview provided in Figs. 11 and 12).

Fig. 11
Regio-specific oxidation of β-amyrin by different orthologous members of cytochromes P450. *Key cytochrome P450s controlling the regio-specific C-2 oxidation of various intermediates involved in sapogenin biosynthesis

Fig. 12
Regio-specific oxidation of α-amyrin, δ-amyrin, germanicol, and lupeol by various orthologous members of cytochromes P450

Recruitment of highly diverse P450s for triterpenoid oxidation

Studies involving selection of candidate genes by co-expression analysis in Medicago truncatula followed by their in vivo and in vitro functional characterization demonstrated that CYP716A12 acts as β-amyrin 28-oxidase (Carelli et al. 2011; Fukushima et al. 2011; Naoumkina et al. 2010). CYP716A12 catalyzes three sequential oxidation steps at C-28 position of β-amyrin to produce oleanolic acid, which gets further decorated by other P450s to produce hemolytic sapogenins. In transgenic yeast, CYP716A12 is also shown to oxidize α-amyrin and lupeol to ursolic acid and betulinic acid, respectively (CYP716A175 and CYP716A179 from apple and licorice, respectively, have identical activity, Fig. 11). On the contrary, CYP93E2 from M. truncatula has been found to oxidize β-amyrin at C-24 forming 24-hydroxy β-amyrin (and also probably β-amyrin-24-oic acid) which further leads to the biosynthesis of non-hemolytic sapogenins (soyasapogenols) (Fukushima et al. 2011). This highlights the recruitment of highly divergent P450s from different clans for controlling the metabolic junction in these routes and has implications for pathway discovery driven by identification of recent expansions of gene families. Specifically, CYP93E2 is a member of the notoriously enriched CYP71 clan, where numerous functions in terpenoid specialized metabolism have spawned. In contrast, CYP716A12 resides in the CYP85 clan, with broader involvement in terpenoid general metabolism, but which also carries evolutionarily old examples of specialized metabolism such as in the conifer lineage (Kaspera and Croteau 2006; Ro et al. 2005). In recent advances, M. truncatula CYP72A67 was identified through TILLING in a mutagenized population. 
Through functional characterization by genetic approaches and heterologous expression, CYP72A67 was established in the context of other related P450s as the key enzyme controlling oxidation at C-2 of several intermediates in the hemolytic sapogenin pathway to zanhic acid (Biazzi et al. 2015). This body of work, together with numerous studies in other plant systems, solidly established the P450 families CYP93, CYP716 and CYP72 as rich repositories for candidates involved in triterpenoid oxidation. However, isolated examples have also occurred in other families. This has inspired recent studies, where since 2015 members of CYP716 were implicated as key enzymes in C-28 oxidation of α-amyrin, β-amyrin, and lupeol leading to the corresponding acids in apple (see overview provided in Fig. 12, CYP716A175; Andre et al. 2016), and analogously in licorice, producing ursolic acid, oleanolic acid, and betulinic acid (CYP716A179, Tamura et al. 2017). Similar oxidation is also observed for germanicol to morolic acid by CYP716A175. In Artemisia annua, best known for the elucidated biosynthesis of the sesquiterpenoid artemisinin, CYP716A14v2 was shown to catalyze C-3 oxidation of α-amyrin, β-amyrin, and δ-amyrin to yield the 3-keto triterpenes (Moses et al. 2015). Founding members of the subfamily, Arabidopsis CYP716A1 and CYP716A2, were reported in a genomic cluster co-localized on chromosome 5 with a triterpene synthase. When co-expressed with the triterpene synthase, CYP716A1 displayed activity and afforded an oxidized tirucalla-7,24-dien-3β-ol (Boutanaev et al. 2015). Comprehensive testing of both P450s in yeast, engineered for production of α-amyrin, β-amyrin, and lupeol, established them as multifunctional enzymes with partially overlapping functions. CYP716A1 catalyzed multiple oxidations at specific positions of the triterpene scaffolds toward ursolic acid and oleanolic acid but not betulinic acid. 
In contrast, CYP716A2 was limited to a mono-oxidation, yielding the alcohol intermediates of the triterpenes, 22-α-hydroxy α-amyrin and traces of β-amyrin oxidized at C-28 and C-16. With that, CYP716A2 contributes triterpene oxidation of carbon C-22 to the existing toolbox enabling combinatorial biosynthesis (Yasumoto et al. 2016). Formation of additional, yet unidentified oxidized triterpenes by other relatives in subfamily CYP716A (Khakimov et al. 2015) and broad phylogenetic distribution of P450s with demonstrated activity in triterpenoid oxidation over at least eight subfamilies in four clans (Yasumoto et al. 2016) highlights a promising biosynthetic potential for biotechnological production of high-value triterpenoids.

Triterpenoid sweeteners

From a biotechnological perspective, triterpene saponins have attracted considerable interest for their therapeutic activity, and as non-sugar sweeteners. The pathway to glycyrrhizin, the main sweet-tasting triterpenoid saponin found in the roots of Chinese licorice (Glycyrrhiza spp.), is well understood. Involvement of CYP88D6 as an 11-oxidase of β-amyrin in the glycyrrhizin biosynthetic pathway has been established and was comprehensively reviewed recently (Seki et al. 2008, 2015). Another group of triterpenoid based sweeteners which has been the focus of intense research and seen substantial recent progress are mogrosides, accumulating in the Chinese cucurbit Siraitia grosvenorii (Cucurbitaceae). Mogrosides carry the scaffold of the cucurbitane triterpenoids, broadly distributed in the family, including the bitter cucurbitacins with a wide palette of pharmaceutical activities. In the biosynthetic route to cucurbitacins and mogrosides, cyclization of the general triterpenoid precursor oxidosqualene to cucurbitadienol was suggested to represent the first committed step, but alternative routes may exist (see below). The distinguishing feature of the different classes of bioactive cucurbitanes is a characteristic pattern of oxidative decoration of the tetracyclic scaffold. In a burst of recent advances, and elegantly integrating knowledge of genomic clustering of the key elements of the pathway, co-expression analysis and comparative genomics of closely related cucurbits, considerable light was shed by several teams on the evolution of the routes, including the governance P450s exert (see overview provided in Fig. 13, Itkin et al. 2016; Shang et al. 2014; Zhang et al. 2016; Zhou et al. 2016). Specifically, orthologous members of subfamily CYP87D, residing in the CYP85 clan, were discovered in Siraitia, cucumber, melon and watermelon. 
The P450s were shown to yield 11-oxo cucurbitadienol when combined with the triterpene synthase cucurbitadienol synthase in engineered yeast strains. In addition, 11-oxo-24,25-epoxy cucurbitadienol was found to accumulate, along with the monohydroxylated 11-hydroxy cucurbitadienol specifically in Siraitia (Zhang et al. 2016). The further oxidized 11-oxo-20-hydroxy cucurbitadienol was detected in cucumber, melon and watermelon, while co-expression with CYP81Q orthologs from melon and watermelon specifically yielded 11-oxo-2,20-dihydroxy cucurbitadienol (Zhou et al. 2016). This particular activity could not be confirmed in cucumber. Instead, cucumber and Siraitia carry members of the divergent subfamily CYP88L, yielding 19-hydroxy derivatives of cucurbitadienol, consistent with saponins found in these species (Itkin et al. 2016; Shang et al. 2014).

Fig. 13
Regio-specificity of orthologous members of cytochromes P450 in the metabolism of cucurbitadienol. SgCbQ S. grosvenorii cucurbitadienol synthase

Reconstruction of a complete mogroside recombinant pathway, including optimization of production and several steps catalyzed by UDP-dependent glycosyl transferases, has been reported (Liu et al. 2014). However, the origin of the vicinal C-24 and C-25 hydroxyl groups in the mogroside aglycon remained unclear, with alternatives possible (Itkin et al. 2016). C-24 and C-25 hydroxyl groups are rare among cucurbitane triterpenoids. Zhang and co-workers proposed an origin of the intermediate epoxide through activity of CYP87D18 (Zhang et al. 2016). In contrast, based on observations in yeast and transient expression in tobacco, where endogenously formed 2,3;22,23-diepoxy squalene is offered as substrate for the cucurbitadienol synthase, Itkin and co-workers suggested an initial di-epoxidation of squalene and cyclization to 24,25-epoxy cucurbitadienol as a plausible intermediate step. Hydrolytic ring opening to yield the vicinal 24,25-diol was suggested to be catalyzed by epoxide hydrolases and not to require activity of P450s (for an overview see Fig. 13; Itkin et al. 2016).

New perspectives

Regulation and organization

Two recent studies shed light on a previously unrecognized level of metabolic regulation and organization. Parage and co-workers suggest that control of the monoterpene indole alkaloid pathway in C. roseus and activity of the large group of P450s involved, including CYP76B6, is exerted through the enzyme providing the reduction equivalents, the NADPH-dependent cytochrome P450 oxidoreductase (POR, synonym CPR) (Parage et al. 2016). With exceptions (Apiaceae, Andersen et al. 2016), PORs are typically represented by two distinct classes. Specifically, in planta, but not in reconstituted or in vitro systems, a member of the C. roseus class II POR was shown to be essential for the specialized metabolism responding to external stimuli. In contrast, the class I POR was found to be associated with the general metabolism, with no measurable contribution in the metabolism of monoterpene indole alkaloids, as shown by deep co-expression and silencing studies (Parage et al. 2016). Further highlighting a critical mechanistic importance of the POR in specialized metabolism, a recent study of the protein–protein interaction in context of the local lipid environment demonstrated the existence of a dynamically assembled and disassembled metabolon (Laursen et al. 2016). The team led by Jean-Étienne Bassard proposed an operative stabilization of the metabolon when all enzymes are co-expressed and interact. The work establishes a higher order of complexity, including homo- and heterodimers of the POR, a soluble glycosyl transferase, and the two involved P450s, resulting in a highly efficient biosynthetic pathway. Despite using an experimental model outside of terpenoid metabolism, it is suggested that the principle of organization in dynamic metabolons may apply more broadly to biosynthetic pathways involved in specialized metabolism (Laursen et al. 2016). 
This principle may also have implications for the rational engineering of orchestrated pathways or scaffolded complexes in Synthetic Biology, where a critical goal is to enable effective channeling of intermediates while avoiding disadvantageous metabolic shunt pathways, leakage of labile intermediates or unspecific endogenous activities in the chassis organism.

The chassis

The insights discussed in this review highlight only a few of the complexities encountered during the engineering of biotechnological production platforms, particularly when these depend on heterologous enzymes which plausibly evolved to drive chemical diversification in their plant source species. On the other hand, the biotechnological host species, or chassis, presents conceptual challenges, which need to be overcome (reviewed in Renault et al. 2014). The use of heterotrophic microbes has been comprehensively reviewed (e.g. Li and Pfeifer 2014) and efficient gene-stacking of P450s remains a limitation. A recent study elegantly demonstrated the production of di-oxygenated and acetylated taxadiene through a synthetic consortium of both E. coli and S. cerevisiae. The novel approach successfully engineered interdependency between the species and split the pathway into two segments, each expressed in one of the microbes (Zhou et al. 2015). Photosynthetic hosts offer several potential advantages, including presence of reducing equivalents (electrons) derived from photosynthesis and a source of carbon. Even though there is currently no photosynthetic platform with reported scaled and stable production of multiple oxidized terpenoids at industrially relevant levels, recent proof-of-concept studies in cyanobacteria, algae, the lower land plant Physcomitrella patens and chloroplast engineering (reviewed in Nielsen et al. 2016) highlight the potential for these platforms. Finally, vascular land plants have evolved highly specialized anatomical structures dedicated for terpenoid storage, such as glandular trichomes, laticifers and resin ducts. This structural repertoire was recently suggested to include lipid droplets for storage of both simple and highly functionalized terpenoids (Pateraki et al. 2014). As terpenes were shown to potentially cause critical perturbations on the thermotropic and structural properties of lipid bilayers (Jagalski et al. 2016), the coordinated engineering of both terpenoid biosynthetic pathways and the formation of intracellular storage organelles may relieve this potential bottleneck.