Background

Bioenergy crops are grown with low inputs to generate lignocellulosic biomass that constitutes a sustainable source of renewable energy. One appealing purpose of bioenergy crops is the deconstruction of their biomass into aromatics and simple sugars for downstream conversion into bioproducts. This strategy has been put forward for the environmentally sustainable production of fuels and chemicals currently derived from petroleum refining [1]. However, several challenges still need to be overcome to render bio-based products cost-competitive vis-à-vis their petroleum-based counterparts. For biological conversion of plant biomass hydrolysates, robust engineered microbial strains are required to allow efficient production of the desired chemicals at high yields, titers, and rates using complex mixtures of low-molecular weight monomers as substrates. Moreover, achieving higher plant biomass yields at reduced cost and enabling efficient deconstruction of the recalcitrant lignocellulosic material represent two other important milestones towards the cost-effectiveness of biochemical production [2, 3].

Plant genetic engineering has the potential to overcome some of these challenges. For example, bioenergy crops can be genetically improved to produce more biomass, to be more resilient to stresses such as pathogens and drought, or to synthesize cell wall materials with reduced recalcitrance towards deconstruction processes [4, 5]. In addition, metabolic engineering offers the possibility to increase the value of biomass by overproducing a wide range of products in planta [6, 7]. These valuable chemicals can be naturally produced by certain plants, but are often present in quantities too low in dedicated bioenergy crops for their exploitation. In some cases, the target chemicals do not occur in plants and implementation of de novo metabolic pathways is required for their synthesis. Biochemicals isolated from engineered energy crops can be further modified biologically or catalytically to serve several industrial sectors. In particular, in the scope of a bio-based approach for producing large quantities of commodity chemicals to satisfy global markets, the use of bioenergy crops as green factories may represent a valuable option since these are expected to be grown on large acreages, including marginal lands not suitable for food crop cultivation. In fact, accumulation of bioproducts or metabolic intermediates in crops may benefit specific biorefinery configurations in which hydrolysates are fractionated to derive maximum value from each component [2]. For example, recent techno-economic analyses have documented that co-production of high-value chemicals in bioenergy crops has the potential to improve the economics of advanced biofuels obtained from lignocellulosic biomass [8].

In this review, we inventory metabolic routes that constitute or originate from the shikimate and isoprenoid pathways in plants. In particular, we summarize engineering approaches leading to the overproduction of several chemicals of interest derived from these two pathways (Fig. 1). In many instances, we illustrate how exploiting metabolic steps or enzymes found in non-plant organisms enables the production of specific biochemicals de novo or at higher levels. Various potential or already existing industrial applications for these plant-based chemicals are presented.

Fig. 1

Structures of the chemicals of interest described in this review

Biochemicals derived from the shikimate pathway

The shikimate pathway, which is confined to plastids in plants, is responsible for the synthesis of aromatic amino acids that are precursors to secondary metabolites such as pigments, alkaloids, hormones, and phenylpropanoids including lignin [9]. In microbes, the shikimate pathway has been exploited for the production of aromatic chemicals which are otherwise derived from petroleum-based benzene, toluene and xylene [10, 11]. Nevertheless, most aromatic compounds used for industrial applications are still synthesized chemically due to the inefficiency of current biological production methods. Notably, several metabolic steps from these engineering approaches developed in microbial systems have been successfully implemented in plants, which opens new avenues for the production of shikimate-derived metabolites in bioenergy crops (Fig. 2). Several metabolites derived from the shikimate and phenylpropanoid pathways find applications in medicine [12], and other emerging applications for these chemicals include the manufacturing of biopolymers [13]. In particular, because of their aromatic nature, intermediates of the shikimate pathway have the potential to generate bio-replacements for aromatics that are commonly derived from fossil fuels.

Fig. 2

Proposed metabolic steps for the synthesis of chemicals derived from the shikimate and general phenylpropanoid pathways. Enzyme names are indicated in the case of steps that have been the object of metabolic engineering. Green and red fonts are used to denote enzymes from plant and non-plant origins, respectively. Asterisks indicate that a mutant version of the enzyme was used. ADCL, 4-amino-4-deoxychorismate-lyase; ADCS, 4-amino-4-deoxychorismate synthase; ADHα, arogenate dehydrogenase alpha; ANT, anthranilate; AroG, 3-Deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase; AROG, arogenate; BAHDs, BAHD transferases; BX1, indole synthase; CAAT, coniferyl alcohol:acetyl-CoA transferase; CAD, cinnamyl alcohol dehydrogenase; CAF-CoA, caffeoyl-CoA; CAF, caffeate; CAld5H, coniferaldehyde 5-hydroxylase; CatA, catechol 1,2-dioxygenase; CAT, catechol; CA, cinnamate; CHA, chorismate; CONIFalc, coniferyl alcohol; CONIFald, coniferaldehyde; CONIF-acet, coniferyl acetate; CYP2A6, cytochrome P450 monooxygenase; DAHP, 3-Deoxy-d-arabino-heptulosonate 7-phosphate; DES, deep eutectic solvents; DHS, 3-dehydroshikimate; EGS, eugenol synthase; EUG, eugenol; E4P, erythrose 4-phosphate; FER-CoA, feruloyl-CoA; FER, ferulate; GAL, gallate; Hbald, 4-hydroxybenzaldehyde; HBA, 4-hydroxybenzoate; HCHL, hydroxycinnamoyl-CoA hydratase-lyase; HGA, homogentisate; Irp9, salicylate synthase; MA, muconic acid; NahG, salicylate hydroxylase; OAS1D, rice anthranilate synthase α-subunit; PAAS, phenylacetaldehyde synthase; PABA, p-aminobenzoate; PAR, phenylacetaldehyde reductase; PAT, phosphoribosyl transferase; pCA-CoA, p-coumaroyl-CoA; pCA-SHI, p-coumaroyl-shikimate; pCA, p-coumarate; PCA, protocatechuate; PEP, phosphoenolpyruvate; PHEald, phenylacetaldehyde; PHE, phenylalanine; PMT, p-coumaroyl-CoA:monolignol transferase; PobA, 4-hydroxybenzoate 3-monooxygenase; PPA, prephenate; PP, phenylpyruvate; QsuB, 3-dehydroshikimate dehydratase; SA, salicylate; SINAPald, sinapaldehyde; SYRINGald, syringaldehyde; TAL, tyrosine ammonia-lyase; TnaA, tryptophanase; TRP, tryptophan; TRP5, Arabidopsis feedback-insensitive anthranilate synthase α-subunit; TYR, tyrosine; UbiC, chorismate pyruvate-lyase; UGT, indoxyl glucosyltransferase; VAN, vanillin; 2-PE, 2-phenylethanol

4-Hydroxybenzoic acid

4-Hydroxybenzoic acid (4-HBA) is synthesized from benzene on industrial scale for the synthesis of liquid crystal polymers used to manufacture fibers such as Vectran™. Strategies have been explored for converting 4-HBA into the commodity chemical terephthalic acid, a precursor to the polyester polyethylene terephthalate (PET) used to make clothing and plastic bottles [14]. 4-HBA is also the precursor for parabens, which are preservatives used in cosmetic and pharmaceutical products [15,16,17]. Considering that 4-HBA is produced by plants and is potentially released from biomass during pretreatment processes, its biological upgrading represents a conceivable option towards valorization [18]. Microbial strains with the capacity to catabolize 4-HBA are therefore appropriate chassis for biological upgrading. For example, certain oleaginous bacteria from the Rhodococcus genus use 4-HBA as sole carbon source for the production of triacylglycerols that can be transesterified into fatty acid methyl esters for biodiesel applications [19]. Similarly, the betaproteobacterium Pandoraea sp. ISTKB was shown to use 4-HBA for growth and production of polyhydroxyalkanoate biopolyesters [20]. Other examples include the biological funneling of 4-HBA into valuable catabolic pathway intermediates such as muconic acid, 2-pyrone-4,6-dicarboxylic acid, beta-ketoadipic acid, and isocinchomeronic acid using engineered strains of Novosphingobium aromaticivorans and Pseudomonas putida [21,22,23].

Overproduction of 4-HBA has been achieved in plants by overexpression of bacterial chorismate pyruvate-lyase (UbiC) or hydroxycinnamoyl-CoA hydratase-lyase (HCHL) (Fig. 2) [24,25,26,27]. In the first case, the UbiC enzyme is fused to a chloroplast transit peptide to reroute the chorismate pool generated by the shikimate pathway [24], whereas native HCHL converts p-coumaroyl-CoA into 4-hydroxybenzaldehyde, which is then oxidized by endogenous dehydrogenase(s) into 4-HBA [25, 27]. For both strategies, 4-HBA accumulates as glucose-conjugated forms, presumably stored in vacuoles, suggesting an export of 4-HBA from chloroplasts to the cytosol in the case of the plastid-targeted UbiC approach. 4-HBA glucosides are readily extracted from plant biomass using aqueous methanol solvents [24,25,26,27]. Assessment of both strategies in sugarcane showed that HCHL was more efficient than UbiC, resulting in the production of 7.3% and 1.5% dry weight (DW) of 4-HBA glucosides in leaves and stems, respectively [26]. Moreover, analysis of lignin purified from Arabidopsis plants transformed with an HCHL gene showed the presence of 4-HBA and syringaldehyde units, as well as a reduction in the lignin degree of polymerization, which increases biomass saccharification efficiency [27, 28]. Therefore, the HCHL engineering strategy has the potential to bring two valuable traits to bioenergy crops, namely ‘enhanced biomass deconstructability’ and ‘value-added coproduct’. Lignin in cell walls could represent a preferred site for 4-HBA accumulation in case large amounts of intracellular 4-HBA lead to toxic effects. Although several plants such as poplar, aspen, willow, and certain palm species contain 4-HBA esters in their lignins, the exact mechanism for 4-HBA transfer onto monolignols and the transferase(s) involved remain to be elucidated [29].

Gallate is a derivative of 4-HBA that is obtained from the hydrolysis of plant gallotannins using microbial tannases. As a more sustainable alternative, gallate is produced biologically from 4-HBA in engineered microorganisms using a mutant version of the 4-HBA hydroxylase PobA from Pseudomonas aeruginosa [30, 31]. We previously showed that PobA could be functionally expressed in Arabidopsis plastids for production of protocatechuate from 4-HBA (Fig. 2) [32]. We also showed that a PobA mutant, when expressed transiently in tobacco leaves, can efficiently convert 4-HBA into gallate, which accumulates mainly as glucogallin (Lin et al., unpublished). This is of particular interest since gallate can be converted into epoxy resins used in several materials such as coatings, adhesives or laminates [33], or further esterified to produce the food additives E311, E312, and E313. Additionally, gallate can be easily decarboxylated to form pyrogallol, an important platform chemical used as a reducing agent in photography and as a dyeing agent in cosmetics. Finally, methyl gallate can be used as starting material for the synthesis of the drug precursor thebaine [34].

Muconic acid

Muconic acid (MA) is a platform chemical used as a precursor for the synthesis of products such as adipic acid, terephthalic acid, and caprolactam, which are widely used in the nylon and thermoplastic polymer industries. Current processes for the manufacturing of these products rely on non-renewable petroleum-based chemicals, require a high energy input, and yield large quantities of toxic by-products [35]. As an alternative, the biological production of MA using engineered microorganisms and inexpensive carbohydrate feedstocks has received increasing attention over the past 20 years [36]. Most biological routes established in microbes consist of the production of catechol and its subsequent conversion into MA by ring-cleaving catechol 1,2-dioxygenase. These routes exploit the intrinsic shikimate pathway for the biosynthesis of catechol precursors such as protocatechuate, anthranilate, salicylic acid (SA), and 2,3-dihydroxybenzoic acid [37]. The SA route was recently implemented in Arabidopsis and resulted in the production of readily extractable MA from plant biomass [38]. In these plants, SA pools were increased by co-expression of plastid-targeted bacterial feedback-resistant 3-deoxy-d-arabino-heptulosonate 7-phosphate synthase (AroG*) and SA synthase (Irp9). Conversion of SA into catechol and of catechol into MA was further achieved by co-expression of plastid-targeted bacterial SA hydroxylase (NahG) and catechol 1,2-dioxygenase (CatA), respectively (Fig. 2). Functional expression of CatA in plants was an important milestone towards developing crops that serve as production platforms for MA. Strategies for overproducing catechol precursors other than SA from the shikimate pathway (i.e., protocatechuate and anthranilate) have already been established, but their efficient conversion into catechol remains to be demonstrated. These other routes towards MA production could be more suitable in the case of crops that trigger stress responses upon SA signaling. Nevertheless, the SA route toward production of bio-derived MA could be appropriate for bioenergy crops from the Salicaceae family (e.g., poplar, willow), which are particularly productive at synthesizing multiple SA-derived compounds.

Besides its conversion into MA, catechol obtained from engineered bioenergy crops could represent a value-added product since it is a starting material for the manufacturing of insecticides (e.g., carbofuran and propoxur), fragrances, drugs, and polymerization inhibitors [39]. Recently, catechol was converted to a deep eutectic solvent (DES) that was found to be effective for the pretreatment of plant biomass and to facilitate the removal of lignin prior to enzymatic saccharification [40].

Protocatechuate

Protocatechuate (PCA) represents a valuable coproduct that has potential to add value to biomass of bioenergy crops as it possesses several pharmacological applications related to its antioxidant activities and anti-inflammatory properties [41]. In addition, several studies reported on the biological upgrade of PCA using engineered microbial strains. For these approaches, no purification step of PCA from biomass is required since engineered microbial strains utilize the various components of biomass hydrolysates for growth while funneling PCA into valuable chemicals. As examples, engineered Pseudomonas strains have been developed for efficient conversion of PCA into beta-ketoadipic acid, muconolactone, and 2-pyrone-4,6-dicarboxylic acid [42, 43], and engineered Rhodosporidium strains have been designed for conversion of aromatics into the biofuel precursor bisabolene [18].

Although PCA is commonly found in several plant species [41], no biosynthetic routes have been described so far. Two engineering approaches have resulted in higher PCA titers in plants. Expression of 3-dehydroshikimate dehydratase (QsuB) from Corynebacterium glutamicum allows conversion of 3-dehydroshikimate into PCA [44]. Co-expression of chorismate pyruvate-lyase (UbiC) from E. coli and 4-HBA hydroxylase (PobA) from Pseudomonas aeruginosa allows conversion of chorismate into PCA via 4-HBA (Fig. 2) [32]. For both approaches, bacterial enzymes were targeted to plastids in order to co-localize with their substrate, and PCA accumulated in green tissues mainly as conjugated forms (presumably glycosides), suggesting a transit of PCA from plastids to vacuoles via the cytosol. In tobacco plants expressing QsuB, PCA conjugates were readily extracted from senesced biomass using aqueous methanol as solvent, and free PCA recovered after an acid hydrolysis step was efficiently upgraded biologically into muconic acid using an engineered E. coli strain [45]. In connection with these observations, a techno-economic analysis in sweet sorghum assessed that achieving PCA titers of 5% DW in biomass could reduce the minimum selling price of cellulosic ethanol obtained from sorghum bagasse if PCA is efficiently converted biologically into muconic acid as coproduct [46]. Interestingly, PCA was shown to be a competitive inhibitor of the lignin biosynthetic enzyme hydroxycinnamoyl-CoA:shikimate hydroxycinnamoyl transferase (HCT); therefore, increasing PCA in crops is also a strategy for reducing lignin and improving biomass digestibility [32]. So far, the QsuB approach aiming at overproducing PCA in biomass has been successfully translated to bioenergy crops such as poplar (Shawn Mansfield, University of British Columbia, personal communication), sorghum and switchgrass (Eudes et al., unpublished).

2-Phenylethanol

2-Phenylethanol (2-PE) is found in essential oils of several plant species such as rose, carnation, hyacinth, and jasmine. It is one of the most used chemicals for fragrances and flavors due to its pleasant rose-like aroma, and it also represents a precursor for producing the flavoring agent 2-phenylethyl acetate. 2-PE also finds applications as antibacterial and antifungal agent [47]. Other studies have explored the use of 2-PE for the production of ethyl benzene, a monocyclic aromatic hydrocarbon important in the petrochemical industry for the synthesis of styrene, which is the precursor to the common plastic material polystyrene [48]. Currently, 2-PE, styrene, and ethyl benzene are produced from petrochemical sources. There is, however, a demand for bio-based 2-PE in the case of flavor and fragrance applications, with a selling price about two orders of magnitude higher compared to synthetic 2-PE [49].

Overproduction of 2-PE has been reported in plants. Transgenic hybrid poplar co-expressing tomato phenylacetaldehyde reductase (PAR) and rose phenylacetaldehyde synthase (PAAS), both under the control of a constitutive promoter, produced more than 3.5% DW of 2-PE in leaves (Fig. 2). 2-PE accumulated as a glucoside form that was readily extractable using methyl tert-butyl ether as solvent [50]. Interestingly, similar work conducted in Arabidopsis showed reduction of lignin content and improved biomass saccharification in addition to 2-PE production in engineered lines co-expressing PAR and PAAS genes [51]. Therefore, the approach may have the potential to reduce biomass recalcitrance for production of simple sugars in combination with the delivery of a valuable coproduct. The agronomic performance of such bioenergy crops remains to be evaluated. It would be interesting, for example, to assess their resistance to insect herbivores considering that certain poplar species produce 2-PE as a defense compound [52].

Eugenol

Eugenol, an aromatic compound found in essential oils from clove, nutmeg, cinnamon, basil and bay leaf, is used as a flavoring agent in perfumes, food and cosmetics. Currently, commercial eugenol is made by refining these oils, which are obtained by steam distillation [53, 54]. Eugenol is also used as a local antiseptic and analgesic in dentistry, and other pharmaceutical applications for eugenol or its derivatives have been explored. Moreover, eugenol can serve as substrate for biological conversion into vanillin through a two-step biotransformation process that uses engineered E. coli [55].

Eugenol is a phenylpropanoid that derives from coniferyl alcohol, which is also the precursor to guaiacyl units in lignin. In flowers and leaves that produce eugenol, coniferyl alcohol is converted into coniferyl acetate by coniferyl alcohol:acetyl-CoA transferase (e.g., petunia PhCAAT and creosote bush LtCAAT) [56, 57], which is in turn converted into eugenol by eugenol synthase (e.g., creosote bush LtAPS, sweet basil ObEGS and petunia PhEGS) (Fig. 2) [58, 59]. Constitutive expression of PhCAAT in hybrid aspen successfully resulted in the production of eugenol (> 80 μg/g fresh weight (FW)) and its glucosides, which were extracted from biomass using hexane and aqueous methanol solvents, respectively. Interestingly, co-expression of PhCAAT with PhEGS did not improve eugenol yields further [60]. Similarly, constitutive co-expression of LtCAAT and LtAPS in hybrid poplar resulted in the production of eugenol glucoside (~ 0.4% DW). Preliminary data did not reveal any significant differences for lignin content in eugenol overproducing trees, but an early flowering observed after 4‐year field trial growth suggested some developmental modifications [61].

p-Coumarate

p-Coumarate (pCA) is a hydroxycinnamate that serves as precursor for manufacturing valuable chemicals. These include, for example, 4-vinylphenol (or p-hydroxystyrene, p-HS), a versatile petroleum-derived platform chemical used to produce polyvinylphenol (PVP), which is employed in photoresist materials, elastomers, resins and coatings. p-HS is also used to produce flavoring and fragrance substances in the food, beverage and perfume industries. Several microbial strains have been developed for bio-based production of p-HS using pCA as direct precursor [62, 63]. Other engineered microbial strains have been designed to upgrade pCA into resveratrol, an antioxidant with potential benefits for human health [64,65,66], as well as the biofuel precursor bisabolene [18], the platform chemical muconic acid [67], the biodegradable polyester precursor lactic acid [68], and polyhydroxyalkanoate polyesters [20, 69, 70]. Moreover, pCA can be transformed chemically into deep eutectic solvents potentially useful for biomass pretreatments [40] and converted into different types of high-performance polymers [71, 72].

Biomass from bioenergy crops, especially grasses such as switchgrass, corn, and sorghum, contains non-negligible amounts of pCA, which is esterified to the hydroxyl group on the γ-carbon of lignin unit side chains, mostly on syringyl units [73]. Tyrosine ammonia-lyase (TAL) catalyzes the conversion of tyrosine into pCA, and expression of a bacterial TAL in Arabidopsis was sufficient to increase the accumulation of soluble pCA derivatives such as anthocyanins and flavonoids [74]. Therefore, higher content of pCA in cell wall biomass may be achieved by concomitantly boosting pCA production from tyrosine and promoting pCA transfer onto lignin monomers. Engineering strategies that enhance tyrosine in plants include expression of feedback-insensitive 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase (AroG*) from E. coli, which resulted in a ~ 20-fold increase in tyrosine content [75], and expression of an isoform of arogenate dehydrogenase with relaxed sensitivity to tyrosine negative feedback inhibition (ADHα from Beta vulgaris), which led to a ~ 100-fold increase in tyrosine accumulation (Fig. 2) [76]. Moreover, several enzymes named p-coumaroyl-CoA:monolignol transferases (PMT), which transfer the CoA-activated form of pCA onto monolignols, have been discovered in rice, Brachypodium, and maize (Fig. 2) [77,78,79]. Expression of these transferases in Arabidopsis and poplar resulted in increases of cell wall-bound pCA, which is otherwise found in low amounts in these two eudicot plants [80, 81]. Recently, overexpression of the maize PMT gene in maize led to a ~ 40% increase in lignin-bound pCA, which was released from biomass upon alkaline treatment [82]. Interestingly, higher amounts of pCA attached to lignins in Arabidopsis resulted in increased lignin solubility under alkaline treatment of biomass [81]. These observations are relevant considering that base-catalyzed depolymerization liquors generated from biomass are rich in pCA and other aromatic compounds that are suitable for microbial conversion [83, 84]. Alternatively, pCA can be recovered and purified as coproduct from the alkaline pretreatment stream after ethanol separation and precipitation under acidic conditions [82]. For such a biorefinery concept, a techno-economic analysis indicates that pCA content in plant biomass should be at least 5% DW in order to be economically attractive at current pCA market price [82]. Other purification methods potentially involve membrane fractionation [85] or ultrafiltration coupled to affinity adsorption [86]. Finally, in switchgrass, overexpression of a transferase from rice (OsAT10) was shown to increase cell wall-bound pCA and to enhance biomass saccharification efficiency [87]. Although it is part of the same ‘BAHD’ enzyme family that contains PMT transferases, OsAT10 belongs to a different clade and its substrates remain to be identified.

Caffeate and ferulate

As described previously for pCA, caffeic and ferulic acids are valuable building blocks for the manufacturing of advanced polymers, especially polyesters, which have a wide range of applications [88]. Moreover, in the case of ferulic acid, approaches for biological conversion into important chemicals such as vanillin [89, 90], polyhydroxyalkanoate [69], and muconic acid [91, 92] have also been reported. Ferulic acid was also recently used as starting material for the synthesis of ilicifoline, a dimeric berberine alkaloid with potential pharmacological and therapeutic effects [93]. Caffeic acid and its phenethyl ester (CAPE) are used in the pharmaceutical and cosmetic industries due to their anti-oxidant, anti-aging and anti-carcinogenic activities. For these markets, caffeic acid is extracted from plants whereas its ester is produced by chemical synthesis [94].

Caffeic and ferulic acids are two hydroxycinnamates that derive from the general phenylpropanoid pathway. Caffeic acid, in addition to being an intermediate of the monolignol biosynthetic pathway for lignin, is found as a constituent of lipid polymers that form cuticle and suberin [95]. It is also part of several conjugated molecules such as chlorogenic acids, clovamide, rosmarinic acid, and CAPE, as well as polyamine conjugates. Ferulic acid is also found in lipid polymers, polyamines, and certain chlorogenic acids and anthocyanins. Hemicelluloses represent an important source of ferulate in plant biomass: ferulate is esterified to arabinose residues in xylan and participates in crosslinking between hemicellulose and lignin [73, 96]. The release of ferulate from sugarcane biomass has been described using alkaline-sulfite chemi-thermomechanical pretreatment [97].

Enhancing ferulate and caffeate contents in plant biomass can be achieved by overexpressing BAHD transferases involved in the attachment of ferulate and caffeate moieties to polymers like suberin and cutin, or to compounds like chlorogenic acids [98, 99]. However, targeting xylan to increase the amount of ferulate in biomass could represent a more attractive strategy considering the large amount of hemicellulose present in cell walls. In sorghum, overexpression of CCoAOMT, an O-methyltransferase that methylates caffeoyl-CoA to generate feruloyl-CoA, led to an increase of cell wall-bound ferulates in biomass [100]. Similarly, overexpression in Brachypodium of a BAHD feruloyl-CoA transferase potentially involved in arabinoxylan feruloylation resulted in higher content of ferulates in cell walls [101]. These observations suggest that increasing both the pool of feruloyl-CoA and the expression of xylan-specific feruloyl-CoA transferases could lead to higher amounts of easily cleavable ferulate esters in biomass of bioenergy crops.

Vanillin and syringaldehyde

Vanillin is extensively used as flavor and fragrance in food, beverages, cosmetics, pharmaceutical formulations, and homecare products [102]. It is also a precursor for the manufacturing of bio-based epoxy resins [103], and, along with syringaldehyde, can be used to produce polymers with high thermostability [104]. In addition, syringaldehyde has bioactive properties and is therefore used in pharmaceuticals (e.g., trimethoprim antibiotics), foods, and cosmetics [105]. Lastly, vanillin represents a suitable precursor for biological conversion into platform chemicals such as 2-pyrone-4,6-dicarboxylic acid and muconic acid [45, 106].

Despite its economic importance, the vanillin/syringaldehyde biosynthetic pathway in plants has not been fully elucidated [107]. Nevertheless, several engineering approaches have resulted in increases of these two aromatics in biomass. Plants affected in the lignin biosynthetic enzyme cinnamyl alcohol dehydrogenase (CAD) are known to accumulate significantly higher amounts of vanillin and syringaldehyde [108]. In CAD-deficient pine and poplar, higher quantities of these hydroxybenzaldehydes are released from biomass after alcohol- or alkaline-based extractions [109, 110]. Furthermore, the effectiveness of hydrothermal treatments at releasing vanillin and syringaldehyde from biomass of CAD mutant plants was recently demonstrated [111, 112]. Development of environmentally friendly processes towards valorization of lignin streams generated in biorefineries has garnered interest, and it would be interesting to evaluate the yield of vanillin and syringaldehyde obtained from the depolymerization of lignins that derive from CAD mutants [113]. Interestingly, it has been demonstrated that the ratio of vanillin to syringaldehyde in biomass can be modulated by altering the expression of coniferaldehyde 5-hydroxylase (CAld5H) in a CAD mutant background (Fig. 2) [114]. Finally, in sorghum, overexpression of a MYB transcription factor (SbMyb60) that induces monolignol biosynthesis resulted in higher amounts of syringaldehyde crosslinked to cell walls [115].

Indican

Indican (indoxyl-beta-d-glucoside) is a metabolite naturally occurring at ~ 1–2% FW in leaves of Indigofera plants and other species such as the Japanese indigo plant (Persicaria tinctoria) [116]. Upon beta-glucosidase activity, indican is hydrolyzed to indoxyl, which spontaneously undergoes oxidative dimerization to form crystalline indigotine, an important blue chemical used as indigo dye [117]. Several thousand tons of synthetic indigo are produced each year from non-renewable petroleum-derived chemicals for the textile industry. Specifically, indigo is chemically synthesized from aniline, a toxic aromatic derived from benzene, in a process that involves the use of hazardous compounds such as formaldehyde, hydrogen cyanide, sodamide, and strong bases [118]. Alternatively, the water extraction of indican from Indigofera plant species and its conversion into indigotine have been practiced in various forms for hundreds of years throughout the world, and current demand for natural indigo is expanding [119]. Bioenergy crops, because of their higher biomass yield and amenability to metabolic engineering, could be used as a platform for bio-based and sustainable indican production. Considering that around 50 thousand tons of indigo are produced annually worldwide, 5 million tons of engineered biomass containing indoxyl at 2% DW could potentially supply the entire market each year, with the assumption of 100% extraction and recovery efficiencies. Based on the estimates from the U.S. Department of Energy’s 2016 Billion-Ton Report, ~ 27 million tons of switchgrass could be economically available annually by 2040 at an offered farmgate price ≤ $40 per dry ton and considering a baseline scenario of 1% yield growth per year [120]. This suggests that ~ 18.5% of switchgrass grown in the U.S. in the future would need to be engineered for indoxyl production (i.e., 2% DW) to potentially supply the entire current global indigo market.
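The estimate above can be retraced with a few lines of Python. This is only a minimal sketch and not part of the cited analyses: the function names and the `precursor_per_dye` and `recovery` parameters are hypothetical. The 5 million ton figure is recovered when roughly two tons of precursor are assumed per ton of indigo (100 thousand tons of precursor at 2% DW). That factor is an assumption chosen here to match the quoted numbers, and a strict molar-mass balance for indoxyl or indican would give a different value.

```python
def biomass_required(dye_tons, precursor_fraction_dw,
                     precursor_per_dye=2.0, recovery=1.0):
    """Dry biomass (tons) needed to supply `dye_tons` of indigo.

    precursor_fraction_dw: precursor content of the biomass (0.02 for 2% DW).
    precursor_per_dye:     tons of precursor per ton of indigo (ASSUMED 2.0,
                           chosen to match the 5 million ton figure in the text).
    recovery:              combined extraction/recovery efficiency (1.0 = 100%).
    """
    if not 0 < precursor_fraction_dw <= 1 or not 0 < recovery <= 1:
        raise ValueError("fractions must be in the interval (0, 1]")
    precursor_tons = dye_tons * precursor_per_dye / recovery
    return precursor_tons / precursor_fraction_dw


def share_of_supply(biomass_tons, available_tons):
    """Percentage of the available feedstock that would have to be engineered."""
    return 100.0 * biomass_tons / available_tons


INDIGO_MARKET_T = 50_000       # annual indigo production quoted in the text
SWITCHGRASS_T = 27_000_000     # switchgrass availability by 2040 quoted in the text

needed = biomass_required(INDIGO_MARKET_T, 0.02)
share = share_of_supply(needed, SWITCHGRASS_T)
print(f"{needed / 1e6:.1f} million tons, {share:.1f}% of supply")
# prints "5.0 million tons, 18.5% of supply"
```

Lowering `recovery` below 1.0 shows how quickly the required acreage grows once extraction losses are taken into account; for instance, a 50% recovery doubles the biomass requirement.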

Indican synthesis was achieved in tobacco by expression of indole synthase (BX1) from maize and a human cytochrome P450 monooxygenase (CYP2A6) (Fig. 2) [121]. The activity of endogenous glucosyltransferases allowed conversion of indoxyl into indican, which prevented formation of blue indigotine crystals in planta. Alternatively, as shown in transient expression studies in tobacco, enhancement of indole synthesis can be achieved by expressing a tryptophanase (TnaA) from E. coli, which also resulted in indican biosynthesis when co-expressed with CYP2A6 (Fig. 2) [122]. Therefore, engineering strategies aiming at increasing tryptophan content could be leveraged for higher production of indican in bioenergy crops. For example, expression in Arabidopsis of feedback-insensitive 3-deoxy-d-arabino-heptulosonate 7-phosphate synthase (AroG*) from E. coli led to a ~ 2.5-fold increase in tryptophan content [123]. Similarly, expression in rice of feedback-insensitive anthranilate synthase (OASA1D) from rice resulted in a 35-fold increase in tryptophan accumulation (Fig. 2) [124].

Aminobenzoates

p-Aminobenzoic acid (PABA) and its derivatives are commonly used as ultraviolet-B filters in cosmetic sunscreens. PABA also has applications as a cross-linking agent for polyurethane resins and dyes [125]. Its derivative ethyl PABA (anesthesin) is a local anesthetic of low toxicity used in dentistry, and potassium PABA (Potaba) has applications in pharmacy due to its anti-inflammatory and antifibrotic activities. PABA is currently produced chemically from 4-nitrobenzoic acid, which itself is produced from petroleum-derived toluene. 2-Aminobenzoic acid (anthranilic acid) is an intermediate in the production of the artificial sweetener saccharin [126] and can also act as a non-toxic corrosion inhibitor for metallic materials [127]. In addition, anthranilic acid esters are widely employed for the synthesis of azo dyes, pharmaceuticals (e.g., loop diuretics and fenamates), perfumes, and potential insect repellents [125, 128].

In plants, PABA derives from a two-step conversion of chorismate catalyzed by 4-amino-4-deoxychorismate synthase (ADCS) and lyase (ADCL), respectively (Fig. 2) [129]. Anthranilic acid derives from conversion of chorismate by anthranilate synthase which is composed of alpha and beta subunits [130, 131]. Arabidopsis plants affected in anthranilate phosphoribosyl transferase (PAT) activity accumulate large quantities of anthranilic acid and its glycosides, but also exhibit a dwarf phenotype compared to wildtype [132]. Interestingly, in order to alleviate the negative effect on biomass yield, we observed that overexpression of a feedback‐insensitive anthranilate synthase (TRP5) in an Arabidopsis background deficient in PAT activity was able to restore growth characteristics similar to wildtype plants while maintaining elevated levels of anthranilic acid and its glycosides (Berthomieu et al., unpublished) (Fig. 2). These observations, in conjunction with the opportunity of exploiting anthranilate-specific UDP-glucosyltransferases and BAHD acyltransferases to favor the formation of anthranilic acid conjugates that accumulate in vacuoles or cell walls [44, 133, 134], could guide the design of bioenergy crops delivering anthranilic acid as valuable coproduct. Finally, several studies reported on the overexpression of ADCS in fruits and seeds of various crops to greatly enhance PABA as part of folate biofortification strategies [135], and a PABA-specific glucosyltransferase that mediates vacuolar storage of PABA-glucose ester has been identified [136].

Biochemicals derived from the isoprenoid pathways

Terpenes and terpenoids are the largest and most diverse class of specialized metabolites (> 80,000 compounds [137]), which possess versatile biological and ecological functions for regulating plant development and responses to biotic and abiotic stresses. In general, the structural diversity of terpenes depends on the number of isoprene (C5H8) moieties that constitute them. With the exception of steroids (C27), terpenes can be subdivided into hemi- (C5), mono- (C10), sesqui- (C15), di- (C20), sester- (C25), tri- (C30), tetra- (C40) and polyterpenes (C5n, n > 8), based on the number of isoprene units [138]. In contrast to terpenes, which comprise solely hydrocarbons, terpenoids are more structurally diverse and consist of oxygen-containing or chemically modified terpene analogs [139]. For convenience, all the terpenes and terpenoids described in this review will be named “isoprenoids”. In this section, we will review several isoprenoids considered as potential bioproducts with commercial value for the production of nutraceuticals, flavors/scents/colors, biofuels and biopolymers.
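The size-based classification above can be expressed as a simple lookup. The sketch below is only an illustration of that naming scheme, assuming an intact skeleton built from whole C5 units; the function name is hypothetical.

```python
# Class names keyed by the number of C5 isoprene units, as listed in the text.
TERPENE_CLASSES = {
    1: "hemiterpene",
    2: "monoterpene",
    3: "sesquiterpene",
    4: "diterpene",
    5: "sesterterpene",
    6: "triterpene",
    8: "tetraterpene",
}


def classify_terpene(carbon_count: int) -> str:
    """Map a terpene carbon count to its class, assuming a skeleton of whole C5 units."""
    units, remainder = divmod(carbon_count, 5)
    if carbon_count <= 0 or remainder:
        raise ValueError("carbon count must be a positive multiple of 5")
    if units > 8:
        return "polyterpene"
    # C35 (7 units) is not named in the text, so it is reported as unclassified.
    return TERPENE_CLASSES.get(units, "unclassified")


for carbons in (5, 10, 15, 20, 25, 30, 40, 45):
    print(carbons, classify_terpene(carbons))
```

Steroids (C27) are deliberately rejected by this sketch, which mirrors the exception noted in the text.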

Isoprenoids biosynthetic pathways

Isoprenoids are all derived biosynthetically from the five-carbon precursors isopentenyl pyrophosphate (IPP) [140, 141] and its isomer dimethylallyl diphosphate (DMAPP) [142], for which the ratio is controlled by IPP isomerase (IDI/IPI) [143, 144]. The biosynthesis of IPP or DMAPP in plants occurs via two spatially distinct pathways, which are the cytosolic mevalonate (MVA) pathway [145] and the plastidial methylerythritol-phosphate (MEP) pathway [146] (Fig. 3). More generally, the MVA pathway is germane to Archaea and present in most eukaryotes, fungi (Saccharomyces cerevisiae), plant cytoplasm, and some bacteria (enterococci, staphylococci, and streptococci); while the MEP pathway is germane to Bacteria and exists in most cyanobacteria, algae, plant chloroplasts, and some protozoa [147,148,149]. Studies revealed that formation of diverse terpenes, including triterpenes (steroids), brassinosteroids, ubiquinones and most sesquiterpenes originates from the MVA pathway, whereas isoprene, hemiterpenes, monoterpenoids, some sesquiterpenes, diterpenoids, tetraterpenoids, abscisic acid, strigolactones, gibberellins, prenylquinones, and the phytol tail of chlorophylls are derived from the MEP pathway [140, 150, 151]. In addition, it has been suggested that both pathways are antagonistically regulated by the circadian clock, which may be exploited as an engineering strategy to orchestrate gene expression of the two pathways for production of desired products [140]. For monoterpene biosynthesis, geranyl diphosphate (GPP, C10) is formed from the condensation of DMAPP with IPP, which is catalyzed by geranyl diphosphate synthase (GPPS). For sesquiterpene biosynthesis, GPP is converted with one equivalent of IPP by farnesyl diphosphate synthase (FPPS) to farnesyl diphosphate (FPP, C15). Subsequently, FPP further condenses with one IPP to form geranylgeranyl diphosphate (GGPP, C20) by GGPP synthase (GGPPS) for diterpene biosynthesis. 
For sesterterpene synthesis, GGPP undergoes one more condensation with IPP to produce geranylfarnesyl diphosphate (GFPP, C25), which is catalyzed by GFPP synthase (GFPPS). Although these common precursors (e.g., IPP, GPP, FPP, and GGPP) are synthesized by the two distinct pathways, they can be transported freely to different cellular compartments. For higher carbon skeletons of terpenes such as triterpenes and tetraterpenes, two molecules of FPP or GGPP are typically dimerized to form the precursors squalene (C30) and phytoene (C40), respectively [152] (Fig. 3). Finally, to synthesize diverse functional terpene molecules, these precursors will further undergo a variety of reactions such as isomerization, cyclization, reduction, oxidation, and conjugation.
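The chain-length bookkeeping described above (one C5 unit added per condensation, then dimerization for the C30 and C40 precursors) can be sketched as follows. This is only a carbon-counting illustration under the assumptions stated in the comments, not a model of enzyme mechanism; the variable and function names are illustrative.

```python
IPP_CARBONS = 5  # each condensation adds one C5 IPP unit

# (synthase, product) in the order of successive IPP additions, starting from DMAPP.
ELONGATION_STEPS = [
    ("GPPS", "GPP"),
    ("FPPS", "FPP"),
    ("GGPPS", "GGPP"),
    ("GFPPS", "GFPP"),
]


def prenyl_chain() -> dict:
    """Return {product: carbon count} for the stepwise elongation of DMAPP (C5)."""
    carbons = 5  # DMAPP
    chain = {}
    for _synthase, product in ELONGATION_STEPS:
        carbons += IPP_CARBONS
        chain[product] = carbons
    return chain


chain = prenyl_chain()
# Dimerization of two identical precursors gives the C30 and C40 skeletons.
squalene_carbons = 2 * chain["FPP"]
phytoene_carbons = 2 * chain["GGPP"]
print(chain, squalene_carbons, phytoene_carbons)
```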

Fig. 3

A depiction of isoprenoid biosynthesis from cytosolic MVA and plastidial MEP pathways. AACA, acetoacetyl-CoA; AACT, acetoacetyl-CoA thiolase; ACA, acetyl-CoA; CDP-ME, 4-diphosphocytidyl-2-C-methylerythritol; CDP-MEP, CDP-ME 2-phosphate; CMK, CDP-ME kinase; CMS/MCT, CDP-ME synthase; DMAPP, dimethylallyl diphosphate; DXP, 1-deoxy-d-xylulose 5-phosphate; DXR, DXP reductoisomerase; DXS, DXP synthase; ER, endoplasmic reticulum; FPP, farnesyl diphosphate; FPPS, farnesyl diphosphate synthase; GA3P, glyceraldehyde-3-phosphate; GFPP, geranylfarnesyl diphosphate; GFPPS, geranylfarnesyl diphosphate synthase; GGPP, geranylgeranyl diphosphate; GGPPS, geranylgeranyl diphosphate synthase; GPP, geranyl diphosphate; GPPS, geranyl diphosphate synthase; HDR, HMB-PP reductase; HDS, HMB-PP synthase; HMB-PP, 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate; HMGR, HMG-CoA reductase; HMGCA, 3-hydroxy-3-methylglutaryl-CoA; HMGS, HMG-CoA synthase; IDI/IPI, IPP isomerase; IPP, isopentenyl pyrophosphate; ISPS, isoprene synthase; MCS/MDS, ME-cPP synthase; ME-cPP, ME 2,4-cyclodiphosphate; MVA, mevalonate; MVAP, mevalonate-5-phosphate; MVAPP, mevalonate 5-diphosphate; MVD/PMD, MVAPP decarboxylase; MVK, MVA kinase; PEP, phosphoenolpyruvate; PVK/PMK, MVAP kinase; MEP, 2-C-methyl-d-erythritol-4-phosphate

Engineering approaches for increasing the metabolic flux through isoprenoid biosynthesis

The amount of specialized isoprenoids in plants is low, and it is impractical to rely solely on natural plant resources for production and large-scale downstream applications [153]. In general, the structure of these compounds is too complex to achieve their chemical synthesis with high regio- and stereospecificity [154]. Therefore, in order to sustainably produce bioactive isoprenoids at higher titers, boosting the common precursors within the MVA or MEP pathways via plant metabolic engineering represents an attractive option. The biosynthetic pathways of isoprenoids are known to be highly regulated, and enzymes that control metabolic fluxes have been identified and studied towards maximizing product yields within the MVA or MEP pathways [155, 156]. For terpenoid biosynthesis, IDI/IPI and several other enzymes in the MVA pathway (HMGR, HMGS, MVK, and PMK/PVK) and the MEP pathway (DXS, DXR, and HDR) are rate-limiting enzymes [157, 158] (Fig. 3).

HMGR (EC 1.1.1.34) activity and transcript abundance can be regulated by multiple factors such as hormones, environmental signals, and metabolic needs [159, 160]. In contrast to a single HMGR gene found in animals, archaea, and eubacteria, higher plants possess multiple HMGR isozymes, which are spatially and temporally regulated at the gene expression level [161, 162]. These are 60–65 kDa in size and their structure can be divided into three regions: an N-terminal domain for subcellular compartmentalization, a central domain with two transmembrane spanning regions that can modulate protein level, and a C-terminal domain for catalytic activity [151, 163]. It has been shown that overexpression of an HMGR with a truncated N-terminal domain (t-HMGR) can improve the production of amorphadiene and taxadiene in S. cerevisiae [164, 165]. Recently, in tobacco, overexpression of biotin carboxyl carrier protein (BCCP) linked to t-HMGR by a cleavable 2A peptide increased C15 sesquiterpenes 20–40-fold and C30 β-amyrin (a triterpene) sixfold, which demonstrated a new strategy to improve terpenoid production by incorporating metabolic cross-talk [166]. In addition, heterologous expression in Arabidopsis of HMGR from ginseng enhanced the production of phytosterols and the two triterpenes α-amyrin and β-amyrin [167].

HMGS (EC 2.3.3.10) is also considered to be a rate-limiting enzyme [168] whose activity is inhibited by both substrate (acetoacetyl-CoA) and products (HMG-CoA and HS-CoA) [169]. Several functional HMGS have been cloned or characterized from plants such as Arabidopsis [170], pine [171], rubber tree [172], ginkgo [173], maize [174], and mustard [175]. Overexpression of GlHMGS from lingzhi mushroom can boost the production of ganoderic acid in Ganoderma lucidum [176]. Several mutants of BjHMGS1 from brown mustard (Brassica juncea) have been generated [169]: one of them (H188N) is insensitive to substrate inhibition but showed an eightfold decrease in enzyme activity, whereas overexpression of BjHMGS1 (S359A) improved the production of terpenoids such as sterol [177], α-tocopherol, carotenoid, squalene, and phytosterols [178].

DXS (EC 2.2.1.7) is the first entry point of the MEP pathway and a well-known rate-limiting enzyme in bacteria and plants [157, 179, 180]. In plants, based on sequence similarity, three different classes of DXS enzymes have been proposed: class 1 contains essential enzymes to synthesize terpenoids for photosynthesis, class 2 DXS is correlated with secondary terpenoids biosynthesis, and class 3 DXS are involved in the synthesis of isoprenoids (phytohormones) [181, 182]. Manipulating the activity of DXS has been considered as an effective strategy to fine-tune the biosynthesis of terpenoids (i.e., chlorophyll, carotenoids, tocopherols, abscisic acid, or gibberellin), as demonstrated in several plants, such as Arabidopsis [183,184,185], tomato [186, 187], ginkgo [188], potato [189] and carrot [190]. Moreover, DXS activity is subjected to negative feedback-regulation by IPP and DMAPP, which can bind and allosterically inactivate DXS [191, 192].

Downstream of DXS in the MEP pathway, DXR (EC 1.1.1.267) also represents a potential point of regulation [193]. Several transgenic approaches evidenced positive correlations between DXR expression levels and the content of isoprenoids. For example, overexpression of DXR increased the production of plastid isoprenoids (i.e., chlorophylls, carotenoids, and taxadiene) in Arabidopsis [194] and the level of monoterpenes in leucoplasts of peppermint [195]. Similarly, overexpression of Synechocystis DXR in tobacco chloroplasts increased the production of isoprenoids (i.e., chlorophyll a, β-carotene, lutein, antheraxanthin, solanesol and β-sitosterol) [196]. Co-expression of DXR and solanesyl diphosphate synthase (SDS) in potato can significantly enhance the levels of solanesol compared to DXR alone [197]. In addition, higher DXR levels have been associated with a role in arbuscule development in the roots of several monocot plants (e.g., wheat, maize, rice, and barley) along with increased apocarotenoid biosynthesis [198, 199].

Isoprenoid-derived biofuels and bioproducts

With recent advancements in plant synthetic biology, implementation in crops of multigene isoprenoid metabolic pathways to develop biofuels and bioproducts at industrial scale is becoming attainable [200,201,202]. The wondrous diversity and structural richness of isoprenoid molecules from plants provide a plethora of potent and useful target compounds, especially for advanced biofuels and value-added bioproducts [152, 203,204,205,206]. In relation to bioenergy crops, several efforts have been made to elucidate the genetic components that influence terpene yields, as well as to develop biomass pretreatment methods for simultaneous extraction of both terpenes and fermentable sugars [207, 208]. More generally, several methods have been developed for the extraction and isolation of terpenoids from plants [209], but these processes often remain major bottlenecks for the production of chemicals from plant biomass at industrial scale.

Isoprenoid-derived nutraceuticals

Carnosic acid (C20H28O4, diterpenoid)

Carnosic acid is considered a nutraceutical due to its antioxidant, preservative and antimicrobial capacities, and represents an important constituent in food, beverages, cosmetic, and medicinal products [210]. It has been identified exclusively in plant species of the Lamiaceae family, such as Salvia officinalis and rosemary (Rosmarinus officinalis), where its content can reach 10% DW in certain cultivars [210]. Carnosic acid and its oxidized derivative, carnosol (C20H26O4), are the two main bioactive components responsible for 90% of the antioxidant properties of rosemary extracts [210]. Recently, a method for their one-step isolation from rosemary in high yield and purity was developed using centrifugal partition chromatography [211]. The biosynthesis of carnosic acid is initiated by the conversion of GGPP into miltiradiene mediated by copalyl diphosphate synthase (CPS) and a kaurene synthase-like (KSL) protein. Following spontaneous oxidation of miltiradiene to abietatriene, members of the CYP76AH sub-family (CYP76AH1 and CYP76AH4) were proposed to catalyze the conversion of abietatriene to ferruginol [212]. Finally, consecutive oxidations of ferruginol by CYP76AH24 or CYP76AH4 and CYP76AK6 or CYP76AK8 produce carnosic acid [213] (Fig. 4a).

Fig. 4

Proposed metabolic steps for the synthesis of isoprenoid-derived nutraceuticals (blue), flavors/fragrances/cosmetics (black), biofuel (red) and polymer (green). a Biosynthesis of selected isoprenoids from GPP. b Biosynthesis of selected isoprenoids from FPP. c Biosynthesis of selected isoprenoids from GGPP. GES, geraniol synthase; LMS, limonene synthase; L3H, limonene 3-hydroxylase; ISPD, isopiperitenol dehydrogenase; ISPR, isopiperitenone reductase; ISPI, isopulegone isomerase; PR, pulegone reductase; MR, menthone reductase; BIS, bisabolene synthases; SQS, squalene synthase; SSL, squalene synthase-like enzyme; CPT, cis-prenyltransferase; GGR, geranylgeranyl reductase; VTE2, homogentisate phytyl transferase (HPT); VTE1, tocopherol cyclase (TC); VTE4, γ-tocopherol methyl transferase (γ-TMT); VTE3, 2-methyl-6-phytyl-1,4-benzoquinol methyl transferase; PSY, phytoene synthase; PDS, phytoene desaturase; Z-ISO, ζ-carotene isomerase; ZDS, ζ-carotene desaturase; CRTISO, carotenoid isomerase; LCTb, lycopene β-cyclase (LCYb); BKT, β-carotene ketolase; LCYe, lycopene ε-cyclase; ent-CPS, ent-copalyl diphosphate synthase; KS, kaurene synthase; KAO, ent-kaurenoic acid oxidase; KO, ent-kaurene oxidase; CPS, copalyl diphosphate synthase; KSL, kaurene synthase-like enzyme; HGA, homogentisate; DMBQ, 2-methyl-6-phytylbenzoquinol; DMPBQ, 2,3-dimethyl-6-phytyl-1,4-benzoquinol

Carotenoids (tetraterpenoids)

Carotenoids are universal red, orange, and yellow pigments found in plants, algae, fungi, and photosynthetic bacteria [214]. Due to their antioxidant and cytoprotective properties, carotenoids and their derivatives (i.e., apocarotenoids) are high-value supplements for the food, feed, beverage, and nutraceuticals/nutricosmetics industries [215,216,217]. The global market value for carotenoids is projected to reach US $1.53 billion by 2021 with a compound annual growth rate (CAGR) of 3.9% [214, 218]. Among different types of carotenoids, astaxanthin is more bioactive than zeaxanthin, lutein, and β-carotene, which is mainly due to the presence of a keto- and a hydroxyl group on each end of its molecule [219]. Engineered Arabidopsis overexpressing an algal β-carotene ketolase (CrBKT) accumulates high amounts of astaxanthin (2 mg/g DW) in the leaves [220] and shows enhanced oxidative stress tolerance and bacterial pathogen resistance [221].

The biosynthetic pathway of (apo)carotenoids and their industrial potential have been extensively reviewed [214, 222] (Fig. 4a). In brief, the first committed step for the biosynthesis of carotenoids is the production of phytoene catalyzed by phytoene synthase (PSY). Phytoene is then converted to cis-lycopene through two consecutive desaturation steps and one isomerization step catalyzed by phytoene desaturase (PDS), ζ-carotene isomerase (Z-ISO), and ζ-carotene desaturase (ZDS). cis-Lycopene is further isomerized to all-trans lycopene by carotenoid isomerase (CRTISO) [223]. Cyclization of all-trans lycopene by lycopene β-cyclase (LCYb) or lycopene ε-cyclase (LCYe) [224] creates a branch point for the synthesis of unique carotene backbones such as α-, β-, γ-, δ-, or ε-carotene. Finally, carotene molecules undergo a variety of decorations catalyzed by specific CYP450, epoxidase, ketolase or hydroxylase enzymes to produce the bioactive carotenoids [222]. Successful engineering of rice grain resulted in the accumulation of β-carotene, canthaxanthin, capsanthin and astaxanthin in the case of Golden Rice and aSTARice [225, 226]. Engineered maize grain with a high content of astaxanthin has been evaluated as fish feed supplement for effective pigmentation of rainbow trout flesh [227].

Tocopherols (C29H50O2, diterpenoids)

Tocopherols, also known as vitamin E, can be found in plants, algae, photosynthetic organisms, and, even, non-photosynthetic parasites [228, 229]. In nature, there are eight forms of vitamin E: four types of tocopherols (α-, β-, γ-, δ-forms) and four types of tocotrienols (α-, β-, γ-, δ-forms), named according to the number and the position of methyl groups on the chromanol ring [230]. Among the eight forms, α-tocopherol is the most biologically active [231], while γ-tocopherol is the major form in many plant seeds and in the US diet [232]. Along with carotenoids, α-tocopherol is considered as an effective antioxidant for protecting lipids against photooxidation by scavenging reactive oxygen species. Therefore, α-tocopherol has also been considered as a nutraceutical to reduce the risk of human diseases, such as cancer, aging, and cardiovascular diseases [233, 234]. The biosynthesis of vitamin E has been systematically reviewed [231, 235]. In brief, it requires two precursors from distinct biosynthetic pathways, which are polyprenyl precursors from the MEP pathway and homogentisate (HGA) from the shikimate pathway (Figs. 2, 4a). Phytyl diphosphate (phytyl-PP) generated by geranylgeranyl reductase (GGR) [236, 237] is condensed with HGA by HGA phytyltransferase (HPT) to generate tocopherol precursors, while tocotrienol precursors result from the coupling of GGPP to HGA mediated by HGA geranylgeranyltransferase (HGGT) [230, 238]. These precursors undergo cyclization by tocopherol cyclase (TC) and methylation by methyltransferase (MT) to produce the δ- and γ-form, respectively. Additional methylations catalyzed by γ-TMT generate the α- and β-forms of tocopherols. In Arabidopsis, VTE2, VTE3, VTE1 and VTE4 are the enzymes corresponding to HPT, MT, TC, and γ-TMT [238]. 
Exogenous expression of HvHGGT from barley in Arabidopsis resulted in a 10- to 15-fold increase in total vitamin E antioxidants (i.e., tocotrienols plus tocopherols), while in corn seeds, a sixfold increase was achieved using the same strategy [239]. Moreover, in sorghum, HvHGGT overexpression stacked with carotenoid biosynthesis can further enhance the stability of provitamin A by mitigating β-carotene oxidative degradation [240]. Similarly, a brown seed mutant caused by the deletion of homogentisate dioxygenase (HGO) shows reduced HGA catabolism and enhanced production of vitamin E [241]. As for the engineering cases previously cited for carotenoids, enhancement of tocopherol synthesis in non-reproductive green tissues of bioenergy crops remains to be demonstrated.
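The branching of the tocopherol pathway described above can be summarized as a small decision table. The sketch below assumes, as is commonly described for this pathway, that γ-TMT converts the γ-form to the α-form and the δ-form to the β-form; the text itself states only that γ-TMT generates the α- and β-forms. Function and variable names are illustrative.

```python
# Arabidopsis genes corresponding to each enzymatic activity, as listed in the text.
ARABIDOPSIS_ENZYMES = {"HPT": "VTE2", "MT": "VTE3", "TC": "VTE1", "γ-TMT": "VTE4"}


def tocopherol_form(methylated_by_mt: bool, methylated_by_gamma_tmt: bool) -> str:
    """Return the tocopherol form obtained after cyclization by TC.

    Assumption: gamma-TMT acts on the gamma-form to give alpha and on the
    delta-form to give beta (not spelled out in the text).
    """
    if methylated_by_mt:
        return "alpha" if methylated_by_gamma_tmt else "gamma"
    return "beta" if methylated_by_gamma_tmt else "delta"


for mt in (False, True):
    for tmt in (False, True):
        print(f"MT={mt}, gamma-TMT={tmt} -> {tocopherol_form(mt, tmt)}-tocopherol")
```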

Isoprenoid-derived flavors and fragrances

The global market value of flavors and fragrances will reach US $36.6 billion by 2024 with a CAGR of 4.3% (2019–2024) [242]. For industrial products, flavors and fragrances consist of essential oils (EO) in which isoprenoids are the principal components responsible for their characteristic scents [243,244,245], including monoterpenoids and sesquiterpenoids such as linalool, valencene, nootkatone, and santalol [204, 246]. Because of the extensive applications of EO in fragrance, but also as insect repellents and antimicrobial agents, their global market value is expected to reach US $11.67 billion by 2022 [247].

Geraniol (C10H18O, monoterpenoid)

Geraniol is a major component of the EO of several plants, such as geranium, lemongrass, rose, palmarosa, and citronella [248]. It has been used as an additive in the food and beverage industries because of its pleasant rose-like flavor, but also as a natural pest control and antimicrobial agent due to its repellent and microbiocidal properties [249]. Geraniol is one of the key ingredients for the global flavor and fragrance products, and its market was valued at US $18.6 billion in 2015 [248]. However, geraniol is present at negligible concentrations in most plants, which provides an opportunity for its overproduction using metabolic engineering [250].

Geraniol biosynthesis occurs via the MEP pathway from GPP and can be enhanced by overexpression of geraniol synthase (GES) when the supply of GPP is sufficient [251,252,253,254] (Fig. 4b). A large-scale statistical experimental design was performed to determine the essential cultivation parameters for geraniol production using tobacco cell suspension cultures. Under optimized conditions, cells harboring the GES from Valeriana officinalis (VoGES) were reported to produce geraniol at ~ 5.2 mg/L after 12 days of cultivation [255]. Overexpression of VoGES in hairy roots of tobacco led to the production of geraniol and its derivatives in comparable titers, ranging from ~ 150 μg/g DW [256] to ~ 200 μg/g DW [257]. In tomato fruits, co-expression of a noncatalytic small subunit of GPPS (GPPS-SSU) from snapdragon and ObGES from basil (Ocimum basilicum) enhanced the production of geraniol 6.9- and 19.2-fold compared to the ObGES and GPPS-SSU parental lines, respectively [258]. Similarly, in Catharanthus roseus, accumulation of secologanin, a geraniol-derived monoterpene indole alkaloid, was achieved upon overexpression of CrGES along with a bifunctional CrG(G)PPS [259, 260]. In addition, a combinatorial production of geraniol in tobacco has been examined using different subcellular compartments (i.e., plastids, cytosol or mitochondria): results showed that plastid-targeted VoGES combined with expression of GPPS from Picea abies (PaGDPS1) for GPP overproduction led to the highest production of geraniol [261].

Furthermore, considering that geraniol is a volatile organic compound, several promising approaches for its sequestration have been reported. A UDP-glucosyltransferase from kiwifruit (Actinidia deliciosa), AdGT4, showed significant activity towards geraniol, with the capacity to glycosylate geraniol when transiently expressed in tobacco leaves, and to enhance the production of geraniol glycosides up to ~ 0.5 μg/g FW when expressed in tomato fruits [262]. Recently, a promising biotransformation mechanism of geraniol to methyl geranate via geraniol glucoside has been demonstrated in Achyranthes bidentata upon methyl jasmonate elicitation [263].

Menthol (C10H20O, monoterpenoid)

Menthol is a major constituent of EO in peppermint (Mentha piperita, 30–55%) and cornmint (Mentha arvensis, 70–90%), the latter representing a natural source for the production of menthol crystals and natural menthol flakes by simple freeze-crystallization [264, 265]. Menthol is used in flavor and fragrance products for its cooling and refreshing sensation. It has an increasing global demand of 30,000 MT/year [153] and a market value estimated at US $3.85 billion in 2018, which is expected to reach US $5.59 billion by the end of 2025 with a CAGR of 4.8% (2019–2025) [266, 267].

The biosynthesis of menthol is proposed to be catalyzed by eight enzymes [268] (Fig. 4b). First, GPP produced by GPPS is converted to 4S-limonene by limonene synthase (LMS). Second, 4S-limonene is hydroxylated to trans-isopiperitenol by limonene 3-hydroxylase (L3H, CYP71D13/15). Third, trans-isopiperitenol is dehydrogenated to isopiperitenone by trans-isopiperitenol dehydrogenase (ISPD). Subsequently, isopiperitenone undergoes three steps of reduction and one step of isomerization to produce menthol, which are catalyzed by isopiperitenone reductase (ISPR) [269], isopulegone isomerase (ISPI), pulegone reductase (PR) [269] and menthone reductase (MR) [270]. Several genetic engineering approaches with these genes in peppermint led to improved oil compositions and menthol yield [271]. Recently, an engineered bacterial ketosteroid isomerase (KSI) with ISPI activity has been generated and could be useful for heterologous production of menthol considering that plant ISPI remains unidentified [272].
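The eight-enzyme sequence described above can be written as an ordered list and checked for continuity. This is a minimal sketch: the intermediates between isopiperitenone and menthol (isopulegone, pulegone, menthone) are inferred from the enzyme names rather than quoted from the text, and the data structure and function name are illustrative.

```python
# (enzyme, substrate, product, reaction type). Intermediates after isopiperitenone
# are inferred from the enzyme names and are assumptions, not quoted from the text.
MENTHOL_PATHWAY = [
    ("GPPS", "IPP + DMAPP", "GPP", "condensation"),
    ("LMS", "GPP", "4S-limonene", "cyclization"),
    ("L3H", "4S-limonene", "trans-isopiperitenol", "hydroxylation"),
    ("ISPD", "trans-isopiperitenol", "isopiperitenone", "dehydrogenation"),
    ("ISPR", "isopiperitenone", "isopulegone", "reduction"),
    ("ISPI", "isopulegone", "pulegone", "isomerization"),
    ("PR", "pulegone", "menthone", "reduction"),
    ("MR", "menthone", "menthol", "reduction"),
]


def validate_pathway(steps):
    """Check that each step consumes the previous product; return the enzyme order."""
    for previous, current in zip(steps, steps[1:]):
        if current[1] != previous[2]:
            raise ValueError(f"{current[0]} does not follow {previous[0]}")
    return [enzyme for enzyme, _, _, _ in steps]


enzymes = validate_pathway(MENTHOL_PATHWAY)
reductions = sum(1 for step in MENTHOL_PATHWAY if step[3] == "reduction")
print(" -> ".join(enzymes), f"({reductions} reductions)")
```

A representation of this kind makes it easy to see which single step (here ISPI) would need a substitute enzyme in a heterologous host.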

Isoprenoid-derived biofuel precursors

Several isoprenoids represent potential biofuel precursors because of their high energy density due to their cyclic nature, high octane/cetane numbers, greater molecular stability under high pressure related to their methyl branching, low freezing point through reduced molecule stacking, and high heat of combustion [205, 273]. Certain isoprenoids and their hydrogenated derivatives have been evaluated and proposed as biofuel precursors or fuel-alternatives, including monoterpenes (i.e., pinene dimers, camphene, limonene, myrcene and ocimene, and linalool), sesquiterpenoids (i.e., farnesane and bisabolene), and triterpenes (i.e., botryococcene) [274, 275].

4R-Limonene (C10H16, monoterpene)

4R-Limonene, a colorless cyclic monoterpene, is the major constituent (30–95%) of the Citrus essential oil contained in the fruit’s outer peel [276]. Several methods have been developed to extract efficiently Citrus essential oil, such as hydro-distillation, cold-press, instant controlled pressure drop, and steam explosion [277]. The enantiomer 4S-limonene is abundant in peppermint, spearmint, and perilla [278], and its biosynthesis has been extensively studied [279]. Because of a fairly high boiling point (176 °C) and high standard enthalpy of combustion (− 6100 kJ/mol), limonene represents a promising high-density isoprenoid-derived biofuel [267]. The biosynthesis of limonene requires two biosynthetic enzymes for the formation of GPP by GPPS and cyclization of GPP into limonene catalyzed by LMS [280]. Several attempts have been made to engineer limonene production in plants via overexpression of 4S-LMS (Fig. 4b). In peppermint, overexpression of spearmint 4S-LMS increased the content of menthone/menthofuran/pulegone and reduced menthol content [281], whereas its overexpression in spike lavender produced high limonene amount in the youngest leaves and revealed a possible developmental regulation of essential oil composition among the transgenic plants [282]. Overexpression of LMS from Perilla frutescens (PfLMS) in tobacco leaves led to higher limonene production in the case of plastid-targeted PfLMS (143 ng/g FW) compared to cytosol-targeted PfLMS (40 ng/g FW) [283]. Transgenic eucalyptus (Eucalyptus camaldulensis) constitutively expressing plastidic or cytosolic PfLMS produced 2.6- and 4.5-times more limonene in leaves, respectively [284]. Overexpression and plastid targeting of both Arabidopsis GPPS and Citrus LMS in tobacco increased limonene content 10–30 fold compared to the cytosolic-targeting strategy [285]. 
Recently, the overproduction of limonene has been successfully demonstrated in the oilseed crop camelina (Camelina sativa) by overexpressing LMS under the control of Arabidopsis promoters [286]. A recent study indicated that reaching limonene titers of 2.2% DW in bioenergy crops and extracting this coproduct from biomass at 70% efficiency could positively impact the economics of advanced biofuels [8].
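To put the 2.2% DW and 70% figures in perspective, the recoverable amount of coproduct per unit of dry biomass is simply the product of biomass, content, and extraction efficiency. The sketch below assumes a 1000 kg metric ton of dry biomass as the basis, which is a choice made here for illustration; the function name is hypothetical.

```python
def recoverable_coproduct_kg(biomass_dry_kg: float,
                             content_fraction_dw: float,
                             extraction_efficiency: float) -> float:
    """Mass of coproduct recovered from dry biomass (all inputs as plain fractions)."""
    for name, value in (("content_fraction_dw", content_fraction_dw),
                        ("extraction_efficiency", extraction_efficiency)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return biomass_dry_kg * content_fraction_dw * extraction_efficiency


# 2.2% DW limonene and 70% extraction efficiency are the values cited in the text;
# the 1000 kg basis is an assumption used only to express the result per metric ton.
limonene_kg_per_ton = recoverable_coproduct_kg(1000.0, 0.022, 0.70)
print(f"{limonene_kg_per_ton:.1f} kg limonene per metric ton of dry biomass")
```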

Bisabolene (C15H24, sesquiterpene)

Bisabolene has three isoforms (α-, β- and γ-bisabolene), which can be found in the EO of plants including Matricaria chamomilla, Duguetia gardneriana, opopanax, and ginger [287, 288]. Recently, the chemically hydrogenated product of bisabolene (i.e., bisabolane) has been proposed as a promising diesel D2 replacement because of its cetane number similar to that of diesel fuels, better cold properties, and high energy density [289].

As a type of sesquiterpene, the biosynthesis of bisabolene requires two condensations catalyzed by GPPS and FPPS to form FPP, while FPP is subsequently converted to bisabolene by bisabolene synthase (Fig. 4c). Five plant bisabolene synthases were screened for bisabolene production and α-bisabolene synthase from Abies grandis (Ag1) was functionally expressed in E. coli and S. cerevisiae [289]. In plants, overexpression of tomato α-zingiberene synthase (ZIS) elevated the level of bisabolene (4–148 ng/g FW) [290].

Botryococcene (C34H58, triterpene)

Botryococcus braunii is a freshwater, colonial green microalga that represents a potential species for the production of biofuel and bioproduct precursors due to its accumulation of triterpene oils, especially squalene, botryococcene and their methylated forms [291]. The accumulated hydrocarbon oils can represent up to 86% DW and are stored in intracellular oil bodies and the extracellular matrix [292].

Botryococcene is a valuable precursor for producing chemicals and high quality fuels (gasoline and jet fuel) by standard hydrocracking and distillation at high yields (97%) [293]. Biosynthesis of botryococcene in B. braunii has been elucidated [294]. Unlike squalene synthesis which involves the condensation of two FPP molecules at their carbon 1, botryococcene biosynthesis is achieved by coupling two FPP molecules at their carbon 1 and 3 mediated by squalene synthase-like enzyme (SSL-1) to form presqualene diphosphate (PSPP), which is then converted to C30 botryococcene by SSL-3 (Fig. 4c). C30 botryococcene is further methylated to produce C31, C32, C33, C34, C36 and C37 botryococcenes [295]. Two successful examples for biosynthesis and accumulation of botryococcene were achieved using plants as green factories. In transgenic tobacco, up to 544 μg/g FW of botryococcene was produced when biosynthesis was directed to chloroplasts, and methylated bioproducts were obtained by introducing triterpene methyltransferases from B. braunii [291]. Genetic engineering in Brachypodium resulted in higher titers of botryococcene (> 1 mg/g FW) and healthier plants were obtained using the cytosolic-targeting strategy instead of a plastid-targeting approach [296]. Although these plant engineering efforts are promising, improving growth rates and certain physiological characteristics of B. braunii strains represent concrete research avenues towards converting this natural host into a biorefinery organism for industrial production of long-chain non-oxygenated hydrocarbons [297].
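The carbon numbers of the methylated botryococcenes mentioned above can be related to the C34H58 formula in the section heading with a short calculation. The sketch assumes that C30 botryococcene has the formula C30H50 and that each methylation adds a net CH2 unit; neither assumption is stated in the text, and the function name is illustrative.

```python
def methylated_botryococcene_formula(n_methyl: int) -> str:
    """Molecular formula after n methylations of C30 botryococcene.

    Assumptions (not from the text): the C30 parent is C30H50, and each
    methylation replaces one H with CH3, i.e. a net gain of CH2.
    """
    if n_methyl < 0:
        raise ValueError("n_methyl must be non-negative")
    carbons = 30 + n_methyl
    hydrogens = 50 + 2 * n_methyl
    return f"C{carbons}H{hydrogens}"


for n in range(5):
    print(n, methylated_botryococcene_formula(n))
```

Under these assumptions, four methylations give the C34 species whose formula matches the one in the heading.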

An isoprenoid-derived polymer precursor: natural rubber ((C5H8)n)

Isoprenoid-derived precursors represent valuable building blocks for the production of polymers and industrial materials [298, 299]. Due to the complexity of plant biomass, it is difficult to isolate pure macromolecules; by comparison, biosynthesis and purification of bio-based monomers for functionalization and copolymerization are simpler and cheaper. In contrast to ring-opening polymerization processes, terpenes can be polymerized using radical initiators [300].

Natural rubber (NR), with an expected annual global consumption of 16.5 megatons by 2023 [301], is an industrially important isoprenoid-derived polymer because of its superior thermomechanical properties (elasticity, resilience, and heat and cold resistance) compared to synthetic polymers [302]. Milky latex made by specialized laticifer cells in the bark phloem of the perennial rubber tree (Hevea brasiliensis) is the main commercial source of NR (cis-1,4 polyisoprene) [303]. In general, most rubber products are made from two major types of raw materials derived from Hevea latex: liquid latex concentrate (60% v/v of polyisoprene) and solid dry rubber. Versatile commercial commodities such as gloves, balloons, and catheters are made from liquid latex concentrate, while tires, tubing, hoses, and footwear are made from solid dry rubber [303].

The biosynthesis of rubber can be divided into two modules. The first module involves the biosynthesis of isoprenoid precursors (IPP and DMAPP), which has been discussed previously. In H. brasiliensis, although the existence of HbDXS and HbDXR indicates some indirect roles of the MEP pathway in rubber biosynthesis, the biosynthesis of NR is generally believed to be dependent on the MVA pathway [304, 305, 306]. The second module consists of the polymerization of oligomeric allylic diphosphates (GPP, FPP, GGPP) to cis-1,4 polyisoprene by rubber transferase (RT-ase or cis-prenyltransferase, CPT) [307] (Fig. 4c). The polymerization of the rubber chain is initiated primarily from FPP and a CPT complex, consisting of a rubber elongation factor (REF), a small rubber particle protein (SRPP), and CPT-binding proteins (CPTBP). Finally, termination has been proposed to be controlled by ubiquitin-proteasome proteolysis [308, 309].
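To make the (C5H8)n notation concrete, the following sketch relates the number of isoprene repeat units in a cis-1,4 polyisoprene chain to its approximate molar mass. It is a simplified illustration that ignores the initiator and terminal end groups; the function names, the rounded atomic masses, and the chain lengths used in the example are hypothetical and are not taken from the cited studies.

```python
# Illustrative sketch: molar mass of a (C5H8)n polyisoprene chain.
# Assumption: end groups are ignored, so mass scales linearly with n.

MASS_C = 12.011  # g/mol, rounded standard atomic mass of carbon
MASS_H = 1.008   # g/mol, rounded standard atomic mass of hydrogen

# One C5H8 repeat unit (about 68.12 g/mol)
ISOPRENE_UNIT_MASS = 5 * MASS_C + 8 * MASS_H


def polyisoprene_molar_mass(n_units: int) -> float:
    """Approximate molar mass (g/mol) of a chain with n_units repeat units."""
    if n_units < 0:
        raise ValueError("n_units must be non-negative")
    return n_units * ISOPRENE_UNIT_MASS


def units_for_molar_mass(target_mass: float) -> int:
    """Approximate number of repeat units giving target_mass (g/mol)."""
    if target_mass < 0:
        raise ValueError("target_mass must be non-negative")
    return round(target_mass / ISOPRENE_UNIT_MASS)


if __name__ == "__main__":
    # Hypothetical chain lengths chosen only for illustration
    for n in (100, 1_000, 10_000):
        print(n, round(polyisoprene_molar_mass(n)))
```

Because the mass scales linearly with n in this simplification, a shorter chain translates directly into a proportionally lower molecular weight.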

Sequencing of the rubber tree genome revealed 84 rubber biosynthesis-related genes from 20 gene families, including 18 for the MVA pathway, 22 for the MEP pathway, 15 for the biosynthesis of cytosolic isoprenoid precursors, and 29 for REF genes [310]. A recent review summarized research developments in NR biosynthesis and advances in plant metabolic engineering towards improving NR content, and provided future perspectives for commercial development using alternative rubber crops, such as Parthenium argentatum (guayule) and Taraxacum kok-saghyz (rubber dandelion) [311]. Several attempts have been made to improve NR yield using transgenic plants. For example, overproduction of FPPS, GGPPS, or hexa-heptaprenyl pyrophosphate synthase (H-HPPS, a mutated form of GGPPS) in guayule led to increases in both rubber and resin content in field-grown engineered plants [312]. However, the rubber molecules in these transgenic plants had a lower molecular weight, suggesting that endogenous metabolic flux may leave an insufficient IPP pool to support rubber elongation [311]. Moreover, overexpression of HbHMGR1 from the rubber tree in Arabidopsis resulted in a 50% enlargement in leaf size and more vigorous growth compared to non-transgenic plants [313]. Overexpression of HbHMGR1 increased levels of photosynthetic pigments, protein content, and, most importantly, significantly enhanced latex yield [314]. In addition, CRISPR/Cas9 genome editing has been demonstrated in rubber dandelion, which provides an avenue to accelerate the investigation of rubber biosynthesis [315]. Based on a techno-economic analysis, the accumulation of latex in bioenergy crops at titers of 2.2% DW could significantly improve the economics of second-generation biofuels [8].

Conclusions and future perspectives

Plant metabolic engineering towards the production of chemicals has emerged as a promising approach to enhance crop value. With the advancement of biotechnological tools in both synthetic biology and plant transformation techniques, the understanding and implementation of metabolic pathways in plants have become feasible [316, 317]. These technical improvements dramatically accelerate the Design–Build–Test–Learn (DBTL) cycles of plant metabolic engineering, and substantial increases in target biochemicals in crops are now achieved at a faster pace. One remaining challenge is the testing of engineered crops under field conditions to assess stress resilience and possible yield penalties. As exemplified by the case study of crops engineered for the production of polyhydroxyalkanoates, subcellular compartmentalization represents an effective strategy to alleviate the toxicity associated with high titers of bioproducts [318]. Similarly, promising strategies have been developed for introducing organelles that allow bioproduct sequestration and accumulation in engineered plant tissues [319]. In several cases, extraction and purification of biochemicals from plant biomass represent other important challenges to overcome for rendering biorefineries economically attractive [8]. In addition to the work conducted by plant metabolic engineers to increase titers of specific chemicals in crops, which in some instances has resulted in remarkable increases of more than two orders of magnitude after several years of research [320], emphasis should also be given to the development of isolation and purification processes to render plant-derived chemicals economically competitive with their petroleum-derived equivalents. Moreover, engineering approaches and biomass processing should be evaluated with life cycle analyses to assess the environmental impacts of a specific bioproduct and to inform the choice of crops to improve.

The next generation of holistic biorefineries, which include upstream biomass extraction step(s) prior to hydrolysis and conversion of lignocellulose, is expected to benefit from these value-added coproduct traits implemented in bioenergy crops. The development of solvents and extraction methods compatible with existing biorefineries should enable the integration of novel streams that generate valuable coproducts while reducing recovery costs [321, 322]. Furthermore, in a one-pot biomass conversion concept, the release from engineered lignocellulosic feedstocks of either target bioproducts or their immediate metabolic precursors during biomass pretreatment and saccharification offers potential for increasing final bioproduct yields, but this approach will necessitate the development of microbial strains that are tolerant to inhibitors found in lignocellulosic hydrolysates [323, 324]. For our future bioenergy crops, exploiting diverse metabolic pathways inherent to plants, such as the shikimate and isoprenoid pathways, will certainly contribute to the supply of several valuable biochemicals that find multiple industrial applications. Such an endeavor is intended to reduce the production of fossil fuel-derived chemicals and our dependence on petroleum.