Metabolic engineering of Bacillus subtilis for terpenoid production

Terpenoids are the largest and most diverse group of small-molecule natural products, with more than 60,000 compounds made from isopentenyl diphosphate (IPP) and its isomer dimethylallyl diphosphate (DMAPP). They play an important role in the pharmaceutical, food, and cosmetic industries. For decades, Escherichia coli (E. coli) and Saccharomyces cerevisiae (S. cerevisiae) have been studied extensively as hosts for terpenoid biosynthesis, because both are fully amenable to genetic modification and have vast molecular resources. On the other hand, our literature survey (20 years) revealed that terpenoids are naturally more widespread in Bacillales. In the mid-1990s, an inherent methylerythritol phosphate (MEP) pathway was discovered in Bacillus subtilis (B. subtilis). Since B. subtilis is a generally recognized as safe (GRAS) organism and has long been used for the industrial production of proteins, attempts to biosynthesize terpenoids in this bacterium have aroused much interest in the scientific community. This review discusses metabolic engineering of B. subtilis for terpenoid production and the challenges encountered. We summarize major advances and outline future directions for exploiting the potential of B. subtilis as a “cell factory” to produce terpenoids.


Introduction
Nature provides an infinite treasure of complex molecules (Wilson and Danishefsky 2006) which have served as leads and scaffolds for drug discovery in the past centuries (Newman and Cragg 2007;Newman and Cragg 2012;Newman et al. 2003). Numerous reports have detailed their diverse structures and biological functions. The largest and most diverse class of small-molecule natural products is the terpenoids, also known as isoprenoids or terpenes (Köksal et al. 2011). The Dictionary of Natural Products describes approximately 359 types of terpenoids, which comprise 64,571 compounds (as of May 2015). Since these terpenoids account for ca. 24.11 % (64,571 of 267,783) of all natural products (recorded in the dictionary, http://dnp.chemnetbase.com/) and are required for biological functions in all living creatures, they indisputably play a dominant role in both the scientific community and the commercial world (Breitmaier 2006).
Along with growing interest in sustainable production, great interest has been expressed in the biotechnological production of chemical products in general and terpenoids in particular. Since the 1990s, interest in biosynthesizing terpenoids has risen sharply, especially for urgently needed efficacious drugs such as artemisinin (Chang et al. 2007;Martin et al. 2003;Newman et al. 2006;Paddon et al. 2013;Ro et al. 2006;Tsuruta et al. 2009;Westfall et al. 2012) and taxol (Ajikumar et al. 2010;Jiang et al. 2012). In the past 20 years, most research has focused on using Escherichia coli (E. coli), the host with the most advanced genetic tools, for biosynthesis of terpenoids (Fig. 1). Intensive experimentation in E. coli has led to high-yield production of some isoprenoids. However, uncertainty still looms around some aspects such as genetic engineering, characterization, reliability, quantitative strategy, and independence of biological modules (Kwok 2010). More options are needed to validate and optimize cell factories for terpenoid production. According to PubMed data, in comparison to other microorganisms, Bacillales (47.32 %) naturally possess more genes and proteins related to terpenoid biosynthesis pathways (Fig. 1), but surprisingly, little research effort has been devoted to the study of Bacillales as factories for natural products.
Zheng Guan and Dan Xue contributed equally to this work.
In the mid-1990s, it was discovered that Bacillus subtilis, a member of Bacillales that has a fast growth rate and is generally recognized as safe (GRAS) (FDA 1997;Schallmey et al. 2004;Widner et al. 2005), has inherent MEP pathway genes (Kuzma et al. 1995;Takahashi et al. 1998). Interest in B. subtilis rose further because it has been used extensively for the industrial production of proteins (Westers et al. 2004;Sauer et al. 1998;Stockton and Wyss 1946). In addition, Bacillus was reported to be the highest isoprene producer among all tested microorganisms, including E. coli, Pseudomonas aeruginosa, and Micrococcus luteus. The reported isoprene production rate (B. subtilis ATCC 6051) is 7 to 13 nmol per gram cells per hour (Kuzma et al. 1995). This high production rate makes it a promising microbial host for terpenoid biosynthesis (Julsing et al. 2007;Wagner et al. 2000). Furthermore, B. subtilis has a wide substrate range and is able to survive under harsh conditions. Owing to its innate cellulases, it can even digest lignocellulosic materials and use the pentose sugars as its carbon source, hence decreasing the cost of biomass pretreatment (Maki et al. 2009;Ou et al. 2009). Here, we review major progress in metabolic engineering of B. subtilis for synthesizing terpenoids. The related pathway enzymes, genetic engineering reports, and terpenoid detection methods, together with their advantages and challenges, are summarized and discussed. We hope to provide a comprehensive review for exploiting the potential of B. subtilis as a cell factory for terpenoid production.

Inherent terpenoid biosynthetic pathways of B. subtilis
Terpenoids are built from isoprene (C5) units. In terpenoid biosynthetic pathways, IPP and DMAPP (the C5 diphosphate forms of isoprene) are the basic terpenoid building blocks, generated by the mevalonate and MEP pathways (the upstream pathways of terpenoid backbone biosynthesis). The downstream terpenoid backbone pathway is responsible for biosynthesis of geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP), which are the precursors of monoterpenoids (C10), sesquiterpenoids (C15), and diterpenoids (C20), respectively. B. subtilis has 15 inherent enzymes, belonging to five terpenoid biosynthesis pathways: two terpenoid backbone biosynthesis upstream pathways (the mevalonate pathway and MEP pathway), the terpenoid backbone biosynthesis downstream pathway, carotenoid biosynthesis pathway, and ubiquinone and other terpenoid-quinone biosynthesis pathway (Table 1, Fig. 2). For decades, isoprene yield has been considered the bottleneck for all terpenoid biosynthesis. Thus, it is crucial to construct a cell platform that can produce and tolerate high amounts of isoprene and downstream intermediates. Since B. subtilis possesses all eight MEP pathway enzymes and can naturally produce high amounts of isoprene, overexpressing these enzymes appears to be an ideal way to increase isoprene production.
Fig. 1 Percent of terpenoid biosynthesis related articles and terpenoid related gene reports, by source. a Percent of terpenoid biosynthesis related articles, by source. b Publication amount of terpenoid biosynthesis related articles, by year. c Percent of terpenoid related gene reports, by source
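The C5 rule described above can be made concrete with a minimal sketch. The Python snippet below is illustrative only: the names `ISOPRENE_UNIT_CARBONS`, `UNITS_PER_PRECURSOR`, `carbon_count`, and `carbons_after_ipp_additions` are hypothetical, and the snippet encodes nothing beyond the unit counts stated in the text and the head-to-tail addition of IPP units to DMAPP.

```python
# Illustrative sketch: carbon counts of terpenoid precursors as multiples of C5.
# All names here are hypothetical and chosen only for this example.

ISOPRENE_UNIT_CARBONS = 5  # one isoprene (C5) unit

# Number of C5 units in each precursor named in the text.
UNITS_PER_PRECURSOR = {
    "IPP": 1,    # isopentenyl diphosphate
    "DMAPP": 1,  # dimethylallyl diphosphate
    "GPP": 2,    # geranyl diphosphate, precursor of monoterpenoids (C10)
    "FPP": 3,    # farnesyl diphosphate, precursor of sesquiterpenoids (C15)
    "GGPP": 4,   # geranylgeranyl diphosphate, precursor of diterpenoids (C20)
}


def carbon_count(precursor: str) -> int:
    """Return the number of backbone carbons of a named precursor."""
    return ISOPRENE_UNIT_CARBONS * UNITS_PER_PRECURSOR[precursor]


def carbons_after_ipp_additions(ipp_additions: int) -> int:
    """Backbone carbons after adding `ipp_additions` IPP units to one DMAPP."""
    if ipp_additions < 0:
        raise ValueError("ipp_additions must be non-negative")
    return ISOPRENE_UNIT_CARBONS * (1 + ipp_additions)


if __name__ == "__main__":
    for name in ("GPP", "FPP", "GGPP"):
        print(name, carbon_count(name))
```

Under these assumptions, one, two, and three IPP additions to DMAPP give the C10, C15, and C20 backbones of GPP, FPP, and GGPP, respectively.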
However, there are few reports on the B. subtilis MEP pathway. Most of the MEP pathway studies are based on E. coli. Withers and Keasling have described the MEP pathway of E. coli briefly (Withers and Keasling 2007). Kuzuyama and Seto (Kuzuyama and Seto 2012) clearly illustrated the enzymes and reactions involved in the MEP pathway. Carlsen summarized MEP pathway reactions and cofactors in a table (Carlsen et al. 2013). More details can be found in Zhao's review (Zhao et al. 2013). As the kinetics of the MEP pathway enzymes are still unknown, it is unclear which step represents the largest barrier. Thus, the lack of knowledge about the kinetic parameters of the key enzymes is the main obstacle facing metabolic engineering of the MEP pathway in B. subtilis to produce terpenoids. Besides that, the low number of reports about using the B. subtilis MEP pathway to produce terpenoids highlights the need for more research in this area.
Here, we summarize information about the MEP pathway:
1. The initial enzyme in the MEP pathway is 1-deoxy-D-xylulose-5-phosphate synthase (dxs), which forms 1-deoxy-D-xylulose 5-phosphate (DXP) by the condensation of D-glyceraldehyde 3-phosphate (GAP) and pyruvate. This enzyme not only serves the MEP pathway but also plays a role in thiamine metabolism (Sprenger et al. 1997), which shares the flux with the MEP pathway. Gene knockout results (Julsing et al. 2007) suggest that overexpressing dxs may result in a significant improvement in terpenoid production without notable toxicity to the host cell (Zhao et al. 2011;Zhou et al. 2013b). Previous studies in other bacteria also supported the theory that dxs may be the first rate-limiting step of the MEP pathway, as overexpressing dxs can increase isoprenoid production (Estévez et al. 2001;Kim et al. 2006;Xue and Ahring 2011). Moreover, the theoretical mass yield of terpenoids from glucose is 30 % via DXP, 5 % higher than the yield via the mevalonate (MVA) pathway (Rude and Schirmer 2009;Whited et al. 2010), which emphasizes the importance of dxs in the MEP pathway.
2. The enzymes 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (ispD), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (ispE), and 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (ispF) are required to convert MEP to 2-C-methyl-D-erythritol 2,4-cyclodiphosphate (MECDP) (Kuzuyama et al. 2000a;Kuzuyama et al. 2000b;Lüttgen et al. 2000;Rohdich et al. 1999). In most organisms containing MEP pathway homologs, the genes encoding ispD and ispF are neighbors on the chromosome, with ispE at a distal location. They are also regarded as key enzymes in the MEP pathway (Ajikumar et al. 2010;Lu et al. 2014;Yuan et al. 2006;Zhou et al. 2013b). IspD and ispF are essential for cell survival due to their significant impact on cell wall biosynthesis and depletion (Campbell and Brown 2002). IspE has also been identified as crucial for survival of pathogenic bacteria and essential in Mycobacterium smegmatis (Eoh et al. 2009).
It has been reported that ispG can effectively reduce the efflux of methylerythritol cyclodiphosphate (MECDP), resulting in a significant increase in downstream terpenoid production (Zhou et al. 2012). Additional information on the bio-organometallic chemistry of ispG and ispH can be found in Wang's review (Wang and Oldfield 2014).
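The yield comparison between the DXP and MVA routes mentioned above can be reproduced with simple stoichiometric arithmetic. The sketch below is an illustration under stated assumptions, not the calculation used in the cited works: it takes isoprene (C5H8) as the representative product, assumes 1.5 mol glucose per mol isoprene for the MVA route (three acetyl-CoA from 1.5 glucose), and assumes an effective 1.25 mol glucose per mol isoprene for the DXP route, a value chosen because it is consistent with the roughly 30 % figure quoted in the text. The function and constant names are hypothetical.

```python
# Illustrative sketch: theoretical mass yield of isoprene from glucose.
# ASSUMPTIONS (not from the source): isoprene as the product, and the
# glucose-per-isoprene stoichiometries listed in GLUCOSE_PER_ISOPRENE.

GLUCOSE_MW = 180.16   # g/mol, C6H12O6
ISOPRENE_MW = 68.12   # g/mol, C5H8

# Assumed mol glucose consumed per mol isoprene formed.
GLUCOSE_PER_ISOPRENE = {
    "MVA": 1.5,   # three acetyl-CoA derived from 1.5 glucose
    "DXP": 1.25,  # assumed effective value, including cofactor demand
}


def theoretical_mass_yield(pathway: str) -> float:
    """Return g isoprene per g glucose for the given pathway."""
    glucose_mass = GLUCOSE_PER_ISOPRENE[pathway] * GLUCOSE_MW
    return ISOPRENE_MW / glucose_mass


if __name__ == "__main__":
    for pathway in ("MVA", "DXP"):
        print(f"{pathway}: {theoretical_mass_yield(pathway) * 100:.1f} %")
```

With these assumed inputs the sketch gives about 25.2 % for MVA and 30.2 % for DXP, a difference of about five percentage points, which matches the comparison in the text.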

Genetic engineering of B. subtilis
Most of the knowledge about the MEP pathway was obtained from research in E. coli and other bacteria. Therefore, reviewing the progress of genetic engineering of MEP pathway enzymes in B. subtilis can provide more direct support for utilizing B. subtilis as a microbial host for terpenoid biosynthesis. Wagner et al. first described the phases of isoprene formation during growth and sporulation of B. subtilis (Wagner et al. 1999). They found that isoprene formation is linked to glucose catabolism, acetoin catabolism, and sporulation. One possible mechanism is that isoprene is an overflow metabolite released when the flow of carbon to higher isoprenoids is restricted. This phenomenon can be illustrated as follows: (a) when cells are rapidly metabolizing the available carbon sources, isoprene is released; (b) when less carbon is available during transitions in carbon assimilation pathways, isoprene production declines; and (c) when cell growth ceases and spore formation is initiated, production of isoprene continues. In 2000, it was confirmed that isoprene is a product of the MEP pathway in B. subtilis (Wagner et al. 2000). It was also reported that isoprene release might be used as a barometer of central carbon flux changes during the growth of Bacillus strains. Besides that, the activity of isoprene synthase (ISPS) was studied by using permeabilized cells. When grown in a bioreactor, B. subtilis cells released isoprene in parallel with the ISPS activity (Sivy et al. 2002).
In order to gain more insight into the MEP pathway of B. subtilis, conditional knockouts of the MEP pathway genes of B. subtilis were constructed, and the amount of emitted isoprene was analyzed. The results show that the emission of isoprene is severely decreased without the genes encoding dxs, ispD, ispF, or ispH, indicating their importance in the MEP pathway. In addition, idi has been proven not to be essential for the B. subtilis MEP pathway (Julsing et al. 2007). Xue and Ahring were the first to try to enhance isoprene production by modifying the MEP pathway in B. subtilis. They overexpressed the dxs and dxr genes. The strain that overexpressed dxs showed a 40 % increase in isoprene yield compared to the wild-type strain, whereas in the dxr overexpression strain, the isoprene level was unchanged. Furthermore, they studied the effect of external factors and suggested that 1 % ethanol inhibits isoprene production, but the stress factors heat (48 °C), salt (0.3 M), and H2O2 (0.005 %) can induce the production of isoprene. In addition, they found that these effects are independent of SigB, which is the general stress-responsive alternative sigma factor of B. subtilis (Xue and Ahring 2011). Hess et al. co-regulated the terpenoid pathway genes in B. subtilis. Transcriptomics results showed that the expression levels of dxs and ispD are positively correlated with isoprene production, while the expression levels of ispH, ispF, ispE, and dxr are inversely correlated with isoprene production. Moreover, their results supported Xue's conclusions about the effect of external factors (Hess et al. 2013).
In 2009, Yoshida et al. first successfully transcribed and transfected crtM and crtN genes into B. subtilis to direct the carbon flux from the MEP pathway to C30 carotenoid biosynthesis and successfully produced 4,4′-diapolycopene and 4,4′-diaponeurosporene (Yoshida et al. 2009). Thereafter, Maeda reported a method to produce glycosylated C30 carotenoic acid by introducing Staphylococcus aureus (S. aureus) crtP and crtQ genes into B. subtilis, together with crtM and crtN (Maeda 2012). Later, Zhou et al. overexpressed the dxs and idi genes and introduced ads (which encodes the synthase that cyclizes farnesyl diphosphate into amorphadiene) in B. subtilis and obtained the highest yield of amorphadiene (∼20 mg/L) at shake-flask scale. They considered the lack of genetic tools for fine-tuning the expression of multiple genes to be the bottleneck in production of terpenoids in B. subtilis. They therefore modified B. subtilis genes by using a two-promoter system to independently control the expression levels of two gene cassettes (Zhou et al. 2013a). After that, Xue et al. systematically studied the B. subtilis MEP pathway enzymes (Xue et al. 2015). A series of synthetic operons expressing MEP pathway genes were analyzed by using the level of C30 carotenoid production as a measure of the effect of those modulations. All of the overexpressed gene constructs showed higher production of carotenoids compared to wild type. Dxs and dxr (8-fold and 9.2-fold increase in carotenoid production) were validated as the most productive of the MEP pathway genes in this study.
Other reports are related to C35 terpenoids and their enzymes found in B. subtilis, such as the heterodimeric heptaprenyl diphosphate synthase (HepS and HepT) and tetraprenyl-β-curcumene synthase (YtpB), which are responsible for forming long prenyl diphosphate chains (C35) (Sato et al. 2011). As Heider noted in his review, B. subtilis has not yet been a major focus for carotenoid production (Heider et al. 2014). Furthermore, we could not find other research about terpenoid biosynthesis in B. subtilis. Since B. subtilis possesses many advantages, as mentioned in the introduction, biosynthesis of terpenoids via the B. subtilis MEP pathway could be both an opportunity and a challenge.

Detection and metabolomics methods for engineering terpenoid pathway
As is known, most metabolic engineering work is improved by using a combination of random and targeted approaches. Mariët and Renger (Wilson and Danishefsky 2006) pointed out that the selection of these targets has depended at best on expert knowledge but to a great extent also on “educated guesses” and “gut feeling.” Consequently, time and money are wasted on irrelevant targets or on targets that yield only minor improvements. Along with the development of systems biology, metabolomics, a technology that includes non-targeted, holistic metabolite analysis of the cellular and/or environmental changes combined with multivariate data analysis tools, is being increasingly used to replace empirical approaches for targeted natural product biosynthesis (Paddon et al. 2013). Gregory's group (Ajikumar et al. 2010) used metabolomics analysis of their previous strains, leading them to identify a noticeable metabolite by-product that inversely correlated with taxadiene accumulation. This hint helped them to achieve approximately 1 g per liter taxadiene from E. coli.
Because research on the Bacillus MEP pathway is still at an early stage, guidelines are urgently needed for unbiased selection of the best rational design approach to engineering the terpenoid pathway. The newest developments in metabolomics, meta-omics, computer science, and mathematics offer more options, not only for unbiased selection and ranking methods but also for high-throughput and more precise prediction models that enable a mechanistic description of microbial metabolic pathways (Breitmaier 2006;Martin et al. 2003). Scheme 1 summarizes the workflow, essential reports, and resources for the study of terpenoid microbial metabolomics.
To observe and optimize the terpenoid biosynthesis pathways, detection methods are also crucial. The techniques that are currently employed in the study of microbially produced terpenoids are usually gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-MS (LC-MS). Other techniques such as nuclear magnetic resonance (NMR) (Hecht et al. 2001) and Raman spectroscopic analysis (de Oliveira et al. 2010) are also used in terpenoid analysis, although compared with MS-based coupling techniques, they are less sensitive and/or reliable. Most likely, the currently existing methods for the quantitative determination of terpenoids in bacteria are sufficient. There are numerous articles about quantifying and identifying terpenoids (esp. carotenoids, see Foppen's tables (Foppen 1971)) in plants, microorganisms, and other organisms. Most of these methods can be applied in B. subtilis.
Scheme 1 Flowchart and resources for terpenoid microbial metabolomics study. a Microbial metabolic engineering workflow. b Related information of each step for microbial metabolic engineering. * Selected resources: 1. MS data of B. subtilis metabolites (Coulier et al. 2006;Koek et al. 2006;Soga et al. 2003). 2. The metabolomics standards initiative (Fiehn et al. 2007). 3. Microbial metabolomics study examples for terpenoid biosynthesis (Paddon and Keasling 2014;Zhou et al. 2012). 4. Databases, software packages, and protocols (Thiele and Palsson 2010) and http://omictools.com/. 5. Genome-scale data of reconstructed B. subtilis metabolic net (impact of single-gene deletions on growth in B. subtilis) (Oh et al. 2007). 6. Comparative microbial metabolomics study of E. coli, B. subtilis, and S. cerevisiae. 7. The complete genome sequence of B. subtilis (Kunst et al. 1997). 8. Constraint-based modeling methods (Bordbar et al. 2014). 9. Software applications for flux balance analysis (including a software comparative list) (Lakshmanan et al. 2012). 10. Sample treatment methods (Jia et al. 2004;Larsson and Törnkvist 1996;Maharjan and Ferenci 2003;Villas-Bôas and Bruheim 2007)
Abbreviations: HepPP, heptaprenyl-PP; UDPP, di-trans,poly-cis-undecaprenyl-PP; PDP, phytyl-PP; OPP, octaprenyl-PP
As is the case for biosynthesis of different chemical compounds, genetic modification often leads to dead ends. The difficulties in metabolic engineering of bacteria for terpenoid production normally lie not in terpenoid detection but in the complex metabolic net (Baidoo and Keasling 2013). Although the latest reports (Zhou et al. 2012;Zhou et al. 2013a) describe a promising method that can simultaneously detect MEP pathway intermediates, the repeatability is not as good for CDP-MEP as for the other intermediates, especially when the amount of CDP-MEP in bacteria is very low (summarized MS information of MEP pathway metabolites can be found in Table 3). In addition, even if the reported methods are sufficient to analyze all the MEP pathway intermediates, it is still difficult to predict and identify the unknown mechanisms for improving production of terpenoids and other relevant compounds, because all of the MEP pathway enzymes are also involved in other metabolic activities (http://www.kegg.jp/). Cho's untargeted metabolomics study (Cho et al. 2014) may have pointed out a direction that can help solve some of these problems, although few untargeted metabolomics studies of B. subtilis metabolic pathways can be found online. As mentioned above, integrated metabolomics studies and constraint-based models might orient future study of terpenoid biosynthesis (see Scheme 1). The current state of analysis methods that can be integrated into metabolomics research and used in terpenoid biosynthesis studies raises questions about the following issues: (1) detailed preparation work such as reproducible growth of B. subtilis, sampling, and quenching methods, which can be used in metabolomics studies to elucidate the mechanisms of the MEP pathway; (2) extraction methods that maintain the original structure of intermediates and subsequently allow the identification of those compounds and their accurate quantification; (3) extraction coupled quantification methods that can be used to quantify minor components from small-scale bacterial cultures to reduce the workload; and (4) data pre-processing, biostatistics, and bioinformatics methods for big data analysis, integration, and modeling that can reflect the cell bio-net, narrow the research scope, target the key products, genes, and enzymes, and finally lead us to further improvements.
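The repeatability concern raised above for low-abundance intermediates such as CDP-MEP is commonly expressed as a relative standard deviation (RSD) across replicate measurements. The following is a minimal sketch of such a check, not a method from the cited reports: the function names are hypothetical, the 20 % cut-off is an arbitrary assumption, and any input values used with it here are placeholders rather than measured data.

```python
# Illustrative sketch: flag metabolites whose replicate measurements show
# poor repeatability, using percent relative standard deviation (RSD).
# Function names and the default threshold are assumptions for this example.

from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return percent RSD (sample standard deviation / mean * 100)."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    average = mean(values)
    if average == 0:
        raise ValueError("mean is zero; RSD is undefined")
    return stdev(values) / average * 100.0


def flag_poor_repeatability(replicates_by_metabolite, max_rsd_percent=20.0):
    """Return {metabolite: RSD} for metabolites exceeding the RSD threshold."""
    flagged = {}
    for name, values in replicates_by_metabolite.items():
        rsd = relative_standard_deviation(values)
        if rsd > max_rsd_percent:
            flagged[name] = rsd
    return flagged
```

A call such as `flag_poor_repeatability({"metabolite_A": [9, 10, 11], "metabolite_B": [1, 5, 9]})` (placeholder values) would return only `metabolite_B`, whose replicates scatter far more widely around their mean.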

Summary
B. subtilis offers new opportunities and good prospects for terpenoid biosynthesis. This review provides a brief account of metabolic engineering of B. subtilis for terpenoid production, summarizing our understanding of B. subtilis, the MEP pathway, and related techniques. While the mevalonate pathway and terpenoid biosynthesis in E. coli have been studied for decades, research on the Bacillus MEP pathway is still at an early stage. For this reason, there are not yet sufficient data on Bacillus yields to make a fair comparison with published yields of terpenoids in E. coli and other cell factories. However, theoretically, B. subtilis has the potential to be optimized as a high-yield-producing cell factory. The advantages of studying terpenoid biosynthesis in B. subtilis include (1) its fast growth rate and ability to survive under harsh conditions, (2) its GRAS status, (3) its wide substrate range and inherent MEP pathway genes, (4) the fact that it is a naturally high isoprene producer, (5) its clear genetic background and abundant genetic tools, and (6) its innate cellulases, which can digest lignocellulosic materials so that the breakdown products can be used as its carbon source, which would decrease large-scale production costs. Still, B. subtilis shares some features of other gram-positive bacteria, such as plasmid instability. There are also some B. subtilis-specific engineering challenges that need to be explored. The catalytic mechanisms of two MEP pathway enzymes (IspG, IspH) in B. subtilis are still unclear. The importance of DXR and IDI in the MEP pathway is controversial. DXS has been generally regarded as the essential rate-limiting enzyme, but even the functional parameters of DXS in B. subtilis have not yet been reported. Many questions regarding the mechanism of the MEP pathway, the interactions of related enzymes and metabolites, and the kinetic parameters of MEP pathway enzymes in B. subtilis remain unanswered.
Obviously, the organism is promising and the questions are fascinating. There is thus significant reason for detailed investigations of terpenoid biosynthesis via the B. subtilis MEP pathway, particularly in metabolic engineering where there is not yet sufficient knowledge about the precise mechanisms or the effects of co-regulation of the enzymes.
Acknowledgments The authors gratefully acknowledge support from EU FP-7 grant 289540 (PROMYSE). Z.G. acknowledges financial support from the China Scholarship Council (201408330157). I.I.A. is a recipient of Erasmus Mundus Action 2, Strand 1, Fatima Al Fihri project ALFI1200161 scholarship and is on study leave from Faculty of Pharmacy, Alexandria University. They also thank Dr. Diane Black, Language Centre, University of Groningen for carefully reading the manuscript and correcting the English.

Compliance with ethical standards
Conflict of interest The authors declare that they have no competing interests.
Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.