Genome-scale metabolic model in guiding metabolic engineering of microbial improvement
- Cite this article as:
- Xu, C., Liu, L., Zhang, Z. et al. Appl Microbiol Biotechnol (2013) 97: 519. doi:10.1007/s00253-012-4543-9
In the past few decades, despite all the significant achievements in industrial microbial improvement, the approaches of traditional random mutation and selection as well as the rational metabolic engineering based on the local knowledge cannot meet today’s needs. With rapid reconstructions and accurate in silico simulations, genome-scale metabolic model (GSMM) has become an indispensable tool to study the microbial metabolism and design strain improvements. In this review, we highlight the application of GSMM in guiding microbial improvements focusing on a systematic strategy and its achievements in different industrial fields. This strategy includes a repetitive process with four steps: essential data acquisition, GSMM reconstruction, constraints-based optimizing simulation, and experimental validation, in which the second and third steps are the centerpiece. The achievements presented here belong to different industrial application fields, including food and nutrients, biopharmaceuticals, biopolymers, microbial biofuel, and bioremediation. This strategy and its achievements demonstrate a momentous guidance of GSMM for metabolic engineering breeding of industrial microbes. More efforts are required to extend this kind of study in the meantime.
Keywords: Genome-scale metabolic model; Systems biology; Metabolic engineering; Microbial improvement; Industrial application
With modern genome-sequencing capabilities, metabolic model has been developed into genome-scale reconstruction. This reconstruction tries to collect every reaction of target organism through integrating genome annotation and biochemical knowledge to reconstruct a stoichiometric mathematical model (Palsson 2006). It bridges the gaps between genome-derived biochemical information and metabolic phenotype in a principled manner, offering an ideal view of the whole cell (Durot et al. 2009).
The first genome-scale metabolic model (GSMM), for Haemophilus influenzae, was published 13 years ago (Edwards and Palsson 1999). In the decade since, great strides have been made in GSMM reconstruction, and both efficiency and quality have improved significantly (Feist et al. 2009; Kim et al. 2011b; Notebaart et al. 2006). Lately, the process of building a GSMM has been extensively described (Baart and Martens 2012; Durot et al. 2009; Thiele and Palsson 2010), which further stimulates GSMM reconstruction. With rapid reconstruction and improved quality, GSMM has become an indispensable tool for studying the systems biology of organism metabolism. It is widely used in many aspects, such as contextualization of high-throughput data, understanding complex biological phenomena, guidance of metabolic engineering, directing hypothesis-driven discovery, interrogation of multi-species relationships, and network property discovery (Durot et al. 2009; Feist and Palsson 2008; Liu et al. 2010; Oberhardt et al. 2009; Palsson 2009; Rocha et al. 2008).
In the past few decades, despite all the significant achievements in industrial microbial improvement, the approaches of traditional random mutation and selection as well as the rational metabolic engineering based on the local knowledge cannot meet today’s needs. Recently, systems metabolic engineering based on simulation and prediction of mathematic model has provided a reliable method for microbial improvement (Lee et al. 2011b; Park and Lee 2008). As a major tool of the systematic method, GSMM has been widely applied to guide metabolic engineering of microbial improvement (Oberhardt et al. 2009; Palsson 2009). This GSMM-guided metabolic engineering is a systematic process, generally named as systematic method (Alper et al. 2005a; Alper et al. 2005b) or in silico-aided metabolic engineering approach in industrial biotechnology (Bro et al. 2006). Many reviews included this application of GSMM but described the achievements inadequately (Oberhardt et al. 2009; Palsson 2009; Rocha et al. 2008). Other reviews focused on different aspects, such as strategy and method (Kim et al. 2008a, b), GSMM species (Milne et al. 2009), and chemical materials (Curran and Alper 2012; Lee et al. 2012). Although the use of in silico GSMMs is still in its early stages for delivering to industry, some significant successes in microbial improvement have been achieved in recent years. Thus, it is timely to collect the successful cases to summarize the application of GSMM in guiding microbial metabolic engineering.
This review presents a synthetical overview of GSMM in guiding microbial improvements, focusing on the systematic strategy and its achievements in different industrial fields. It begins with a brief introduction to available GSMMs and then highlights the systematic strategy, implementation methods, and the successful applied cases in industrial microbial improvements. Selected exemplary achievements of every field are described in detail, and the potential research and application of GSMM are outlined as well. Furthermore, future directions of GSMM-guided metabolic engineering are prospected, in which we highlight the issues, trends, and opportunities. It is expected that all the contents reviewed here could expand the overview of using genome-scale metabolic model to guide metabolic engineering of microbial improvement.
Currently available GSMMs
The species for which GSMMs have been built span three areas: basic study, medical biotechnology, and industrial biotechnology (Feist and Palsson 2008; Milne et al. 2009). Some model species such as E. coli and yeast are explored in basic studies by theoretical biologists to clarify biological network evolution (Papp et al. 2011) and discover network characteristics (Oberhardt et al. 2009). As the applied scope of GSMM has moved towards medical biotechnology, particularly anti-pathogen target discovery, GSMM reconstructions for pathogenic organisms have increased rapidly in recent years (Chavali et al. 2012; Kim et al. 2012a). Although many species have been built into GSMMs for the research fields mentioned above, applications in industrial biotechnology are the biggest motivation to reconstruct GSMMs for sequenced species (Blazeck and Alper 2010; Milne et al. 2009). These models are mainly used to design metabolic engineering strategies that enhance the yield of target products in microbial factories and that improve the metabolic degradation of pollutants in bioremediation (Izallalen et al. 2008). Therefore, the GSMM-guided metabolic engineering strategy and its achievements show the value and significance of this systematic strategy both in theory and practice.
Strategy for guiding metabolic engineering of microbial improvement
Essential data acquisition
In current biological research, a genome sequence is the first prerequisite for studying the internal metabolic state of any microbe at the system scale. Hence, genome sequencing and annotation are essential for the strategy. In recent years, sequencing technology has evolved into the so-called “next-generation sequencing”, which accelerates sequencing, decreases the cost, and, more importantly, improves the accuracy of the data (Hall 2007; Mardis 2008; Schuster 2008). Meanwhile, more and more genes and proteins with verified biological functions are collected in many excellent databases, such as NCBI (Sayers et al. 2010), KEGG (Kanehisa et al. 2006), BRENDA (Scheer et al. 2011), LIGAND (Goto et al. 1998), and BioCyc (Caspi et al. 2012). These achievements greatly improve genome annotation, providing researchers with more, and more reliable, genome resources for GSMM reconstruction. Other microbial data, such as biochemical and physiological information, are also important for GSMM building and simulation. They can be collected from the corresponding databases and the literature. In addition, recent omics technology has moved biological data acquisition into the high-throughput era, producing massive microbial information at the scale of the transcriptome, proteome, and metabolome. Thus, it is now feasible to reconstruct a high-quality GSMM and to introduce the systematic strategy into microbial improvement.
GSMM reconstruction
GSMM reconstruction is more mature than the reconstruction of other biochemical networks, such as transcriptional and translational networks and transcriptional regulatory networks (Feist et al. 2009). However, it is still time-consuming and labor-intensive. So far, no GSMM has been reconstructed without manual refinement.
The key to reconstruction lies in iterative refinement, whose major work is to complete and correct the enzymatic reactions and their associated genes. The major items requiring curation are enumerated in Fig. 3b. In addition, the non-enzymatic reactions, such as transport reactions, spontaneous reactions, exchange reactions, demand reactions, and sink reactions, should also be added correctly. During curation, gap analysis is the most immediately helpful approach. In most cases, gap analysis benefits from identifying missing enzymes through analysis of incomplete but essential metabolic pathways, from literature searches that reveal previously overlooked phenotypic data, and from analysis of high-throughput omics data (Oberhardt et al. 2009). Many computational methods, such as flux balance analysis (FBA) (Orth et al. 2010) and GapFind/GapFill (Satish Kumar et al. 2007), have been developed to analyze the gaps (Pitkanen et al. 2010). Drawing software that creates organism-specific metabolic maps, such as Cellular Overview (Latendresse and Karp 2011) and MyBioNet (Huang et al. 2011), is also very useful for gap analysis.
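The simplest form of gap analysis can be illustrated with a short sketch. The code below is not the GapFind formulation (which is an optimization problem); it only flags dead-end metabolites, i.e., metabolites that are consumed but never produced, or produced but never consumed. The toy reaction list and all names are hypothetical, and every reaction is assumed to be irreversible.

```python
# Toy reaction list (hypothetical names). Each reaction maps a metabolite to its
# stoichiometric coefficient: negative = consumed, positive = produced.
# All reactions are assumed irreversible for simplicity.
reactions = {
    "uptake_A": {"A": 1},             # exchange reaction: medium -> A
    "A_to_B":   {"A": -1, "B": 1},
    "B_to_C":   {"B": -1, "C": 1},
    "D_to_E":   {"D": -1, "E": 1},    # D has no producer, E has no consumer
    "export_C": {"C": -1},            # exchange reaction: C -> medium
}

def find_dead_ends(reactions):
    """Return (metabolites with no producer, metabolites with no consumer)."""
    produced, consumed = set(), set()
    for stoichiometry in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            if coefficient > 0:
                produced.add(metabolite)
            elif coefficient < 0:
                consumed.add(metabolite)
    return sorted(consumed - produced), sorted(produced - consumed)

no_producer, no_consumer = find_dead_ends(reactions)
print("consumed but never produced:", no_producer)
print("produced but never consumed:", no_consumer)
```

Each flagged metabolite points to a candidate missing reaction (for example, an unannotated enzyme or a missing transporter) that manual curation would then investigate.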
In addition to the crucial reaction refinements, some cellular parameters representing microbial physiological and biochemical processes are indispensable. One essential parameter is the biomass synthesis reaction, which accounts for all known biomass components (protein, DNA, RNA, lipids, peptidoglycan, glycogen, polyamines, etc.) and their fractional contributions to the whole cellular biomass (Thiele and Palsson 2010). Other important parameters, e.g., P/O ratio, ATP maintenance costs, and minimal medium, also need to be estimated or measured. Reported experimental data are the main resources. With the model and essential parameters, constraints-based modeling and analysis can effectively characterize cell metabolism and predict potential expression responses to environmental or genetic perturbations.
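To make the role of the biomass reaction concrete, the following minimal sketch builds a stoichiometric matrix for a hypothetical four-reaction network and checks whether a given flux vector is mass-balanced. The metabolite names and the biomass fractions (0.7 and 0.3) are invented for illustration and are not measured values.

```python
from fractions import Fraction

# Hypothetical toy network. The biomass reaction drains two precursors in a
# fixed proportion; the fractions below are made up for illustration.
reactions = {
    "uptake":       {"glc": 1},                  # medium -> glucose
    "make_aa":      {"glc": -1, "aa": 1},        # glucose -> amino-acid pool
    "make_fa":      {"glc": -1, "fa": 1},        # glucose -> fatty-acid pool
    "biomass":      {"aa": Fraction(-7, 10), "fa": Fraction(-3, 10)},
}

def stoichiometric_matrix(reactions):
    """Rows are metabolites (sorted), columns are reactions (insertion order)."""
    metabolites = sorted({m for s in reactions.values() for m in s})
    names = list(reactions)
    S = [[reactions[r].get(m, 0) for r in names] for m in metabolites]
    return metabolites, names, S

def mass_balance(S, v):
    """Return S * v: the net production rate of every metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

metabolites, names, S = stoichiometric_matrix(reactions)
balanced = [10, 7, 3, 10]      # fluxes in the order of `names`
unbalanced = [10, 7, 3, 5]     # biomass drain too small: precursors accumulate

print(dict(zip(metabolites, mass_balance(S, balanced))))
print(dict(zip(metabolites, mass_balance(S, unbalanced))))
```

A flux vector is compatible with the pseudo-steady-state assumption only when every entry of S·v is zero, as for the first vector above.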
Constraints-based optimizing simulation
This simulation process is implemented on GSMM through many constraints-based algorithms. So far, the number of these optimization algorithms has been developed up to dozens (Kim et al. 2012b; Lewis et al. 2012). They have played a significant role in predicting microbial metabolic capability after genetic manipulations (Lewis et al. 2012).
According to the cardinal principle, many algorithms have been developed to simulate biological phenotype (Lewis et al. 2012; Park et al. 2009). FBA is the most basic and simplest constraints-based method (Orth et al. 2010). Its constraints consist of three fundamental assumptions: pseudo-steady state (S·v = 0), mass conservation, and an optimizing objective. These constraints (assumptions) shrink the unconstrained flux distribution to a closed, finite flux space. The biomass synthesis flux (v_biomass) is then optimized through linear programming to find an optimal flux distribution. FBA can perform gene-deletion simulation to investigate gene essentiality, calculate growth rates under a given medium, and predict the yields of important cofactors (Orth et al. 2010).
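The FBA idea described above can be sketched on a toy network. The example below maximizes a biomass flux subject to S·v = 0 and finite flux bounds, and then repeats the calculation with one reaction blocked to mimic a gene deletion. To stay dependency-free, it replaces a real linear programming solver with brute-force enumeration of the vertices of the flux space; this stand-in only works for tiny networks in which S has full row rank and all bounds are finite. The network, the bounds, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def fba(S, bounds, objective):
    """Maximize objective . v subject to S v = 0 and lower <= v <= upper.

    Brute-force stand-in for an LP solver: at a vertex, n - m fluxes sit at a
    bound and the remaining m fluxes follow from the steady-state equations.
    """
    m, n = len(S), len(S[0])
    best = None
    for fixed in combinations(range(n), n - m):
        free = [j for j in range(n) if j not in fixed]
        A = [[S[i][j] for j in free] for i in range(m)]
        for values in product(*[bounds[j] for j in fixed]):
            b = [-sum(S[i][j] * val for j, val in zip(fixed, values))
                 for i in range(m)]
            solution = solve_square(A, b)
            if solution is None:
                continue
            v = [Fraction(0)] * n
            for j, val in zip(fixed, values):
                v[j] = Fraction(val)
            for j, val in zip(free, solution):
                v[j] = val
            if all(bounds[j][0] <= v[j] <= bounds[j][1] for j in range(n)):
                z = sum(c * x for c, x in zip(objective, v))
                if best is None or z > best[0]:
                    best = (z, v)
    return best

# Reactions (columns): uptake (-> A), main (A -> B), waste (A ->), biomass (B ->)
S = [[1, -1, -1, 0],    # metabolite A
     [0, 1, 0, -1]]     # metabolite B
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]
objective = [0, 0, 0, 1]                        # maximize the biomass flux

growth, fluxes = fba(S, bounds, objective)
print("wild type growth:", growth, "fluxes:", fluxes)

knockout_bounds = list(bounds)
knockout_bounds[1] = (0, 0)                     # block 'main' to mimic a deletion
ko_growth, ko_fluxes = fba(S, knockout_bounds, objective)
print("knockout growth:", ko_growth)
```

In this toy case the blocked reaction is the only route to biomass, so the simulated deletion predicts zero growth, which is how FBA flags a gene as essential. For genome-scale models, a dedicated solver such as the one used by the COBRA Toolbox would be needed.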
Table 1 Applications of algorithms in guiding metabolic engineering
References cited in the table: Yim et al. (2011); Hatzimanikatis et al. (2005); Fowler et al. (2009); Burgard et al. (2004); Kim et al. (2007); Lee et al. (2007); Delgado and Liao (1997); Choi et al. (2010); Bushell et al. (2006); Lun et al. (2009); Segre et al. (2002); Ranganathan et al. (2010); Patil et al. (2005); Burgard et al. (2003); Kim and Reed (2010); Pharkya and Maranas (2006); Pharkya et al. (2004); Shlomi et al. (2005); Yang et al. (2011); Kim et al. (2011a)
Gene amplification is another useful strategy, and some constraints-based frameworks focus on its simulation and prediction (Table 1). As an example, the method named flux scanning based on enforced objective flux (FSEOF) identifies gene amplification targets by scanning the changes of all the metabolic fluxes in response to the enhancement of the flux toward the desired biochemical (Choi et al. 2010). It was validated by identifying amplification targets that improved the production of lycopene in E. coli (Choi et al. 2010). Many FBA-based algorithms such as flux variability analysis (FVA) and flux sensitivity analysis (FSA), some OptKnock derivatives including OptReg, OptORF, and OptForce, and other independent frameworks have been developed to predict gene amplifications and to investigate up- or downregulation of genes in the target organism (Table 1).
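The scanning step of FSEOF can be illustrated with a minimal sketch. Here the product flux is enforced at increasing levels, the flux distribution is recomputed at each level, and reactions whose flux magnitude rises steadily without changing direction are reported as amplification candidates. This selection rule is a simplified reading of the method, not the published implementation. The function `toy_fba` is a hypothetical closed-form stand-in for "maximize biomass subject to an enforced product flux" on a toy network (uptake -> A; A -> biomass; A -> B -> product); a real application would call an FBA solver on the GSMM instead.

```python
def toy_fba(product_flux, uptake=10):
    """Hypothetical stand-in for an FBA run with an enforced product flux.

    On the toy network, all substrate not forced into product goes to biomass.
    """
    return {
        "uptake": uptake,
        "A_to_B": product_flux,
        "export_product": product_flux,
        "biomass": uptake - product_flux,
    }

def fseof_targets(solve, enforced_levels):
    """Return reactions whose |flux| strictly increases across the scan
    without reversing direction."""
    profiles = [solve(level) for level in enforced_levels]
    targets = []
    for reaction in profiles[0]:
        series = [profile[reaction] for profile in profiles]
        same_direction = (all(x >= 0 for x in series)
                          or all(x <= 0 for x in series))
        increasing = all(abs(a) < abs(b) for a, b in zip(series, series[1:]))
        if same_direction and increasing:
            targets.append(reaction)
    return targets

levels = [2, 4, 6, 8]                 # enforced product flux, arbitrary units
print(fseof_targets(toy_fba, levels))
```

In this toy scan the biomass flux falls and the uptake flux stays constant, so only the reactions on the route to the product are returned. The enforced export reaction itself is trivially included and would normally be ignored when choosing genes to amplify.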
In addition to perturbation of endogenous genes, heterologous pathway assembly and expression is another critical approach for strain improvement. However, the number of constraints-based algorithms able to predict foreign gene insertion is limited at present. OptStrain (Pharkya et al. 2004), an OptKnock derivative based on mixed-integer linear programming, is the most popular one with this ability. To confer nonnative functionality on a host organism for a desired phenotype, OptStrain first searches a universal reaction database for the minimal heterologous pathway that can achieve the maximum in silico yield of the desired metabolite and then applies the OptKnock framework to the GSMM augmented with the identified pathway. Other frameworks such as Biopathway predictor (Yim et al. 2011) and BNICE (Hatzimanikatis et al. 2005) have been developed to find an optimal insertion pathway or reaction for redesigning a metabolic network.
Each of these algorithms has particular characteristics that suit it to different strain design tasks. Selecting the wrong algorithm will result in misleading or erroneous interpretation, so caution is necessary when choosing one or more for guiding metabolic engineering. In general, the given constraints of the target organism and the simulation purpose determine the choice of constraints-based algorithms. In addition, many software packages have been developed for running these algorithms (Copeland et al. 2012; Wiechert 2002), and they make great contributions to constraints-based modeling. Among them, the COBRA Toolbox (a Matlab-based package) (Schellenberger et al. 2011) has become a near-standard tool in this field. It can perform many algorithms, e.g., FBA, OptKnock, OptStrain, and MOMA. Because the COBRA Toolbox lacks a user-friendly interface, other tools with graphical user interfaces, such as MetaFluxNet (Lee et al. 2003), BioMet Toolbox (Cvijovic et al. 2010), and OptFlux (Rocha et al. 2010), are frequently used as well.
Experimental validation
The purpose of reconstructing GSMMs and developing computational tools is to predict reliable engineering targets. These predictions then need to be validated in the wet lab. The validation process commonly comprises genetic manipulation, strain cultivation, and phenotype measurement, each of which involves many experimental methods.
Recombinant DNA techniques are the centerpiece of metabolic engineering (Tyo et al. 2010). In the achievements of GSMM-guided metabolic engineering, gene knockout is the principal genetic manipulation, which is always carried out by DNA homologous recombination (Capecchi 1989). Insertion mutagenesis (Klinner and Schäfer 2004) and RNAi (Agrawal et al. 2003) are the alternative gene-deletion technologies when homologous recombination is difficult for some organisms. In addition, other genetic manipulating strategies including genetic insertion and amplification are applied in the achievements of GSMM-guided metabolic engineering. As the most important and basic experimental techniques of molecular biology, these genetic manipulations reform and regroup the DNA of interest to modify a target industrial microbe, resulting in a possibility to create a biological factory for products of interest (Le Borgne 2012).
In addition to changes in the genome, microbial cultivation is equally essential for strain improvement. Laboratory research commonly starts with batch cultivation. However, to obtain more product from limited bioreactor capacity and to shorten the production cycle for industrial applications, more efficient cultivation modes, including fed-batch and continuous cultivation, are applied under aerobic or anaerobic conditions (Bro et al. 2006; Brochado et al. 2010; Choi et al. 2010; Lee et al. 2007; Park et al. 2011). Like genetic manipulation, the selection and optimization of cultivation conditions also contribute to strain improvement for industrial applications.
Authentication and measurement of microbial phenotype are the final challenging and indispensable tasks. In most cases, the detection techniques refer to qualification and quantification of a target metabolite. At present, metabolite quantification depends on either spectrophotometric assays (detection of single molecules) or simple chromatographic separation techniques (detection of molecules on mixtures of low complexity). For example, high-performance liquid chromatography has been used in different methods to analyze metabolites, such as reversed-phase chromatography used for detecting spectinomycin (Yan et al. 2009) and size-exclusion chromatography for analyzing heparin (Ziegler and Zaia 2006). In order to analyze complex mixtures of compounds in high accuracy and sensitivity, some advanced methods combining chromatographic techniques and spectrometry-based techniques have been established, such as gas chromatography-mass spectrometry, liquid chromatography-mass spectrometry, and nuclear magnetic resonance (NMR). All the representative methods in phenotype measurement greatly speed up the validation of in silico predictions.
It is obvious that only these experimental biotechnologies could make microbial improvements come true. Hence, efficient genetic tools and genetic manipulation systems, appropriate cultivation, and accurate phenotype measurement are the prerequisites to apply GSMM-guided metabolic engineering strategy for strain improvement.
Achievements in different industrial application fields
Achievements of industrial strain improvement using GSMM-guided systematic method
- Bro et al. (2006): 25 % improvement
- Asadollahi et al. (2009): An approximately 85 % increase in the final cubebol titer
- Brochado et al. (2010)
- Kennedy et al. (2009): Threefold higher in log-phase and the extracellular concentration got 16.5-fold increased
- Ng et al. (2012): 2,3-Butanediol titer (2.29 g/L) and yield (0.113 g/g) were achieved
- Alper et al. (2005a): Nearly 40 % increase
- Alper et al. (2005b): 8.5-fold increase over the wild strain
- Choi et al. (2010): Developed FSEOF algorithm to find gene amplification targets resulting in lycopene yields increased significantly
- Boghigian et al. (2012): Over 12-fold improvement
- Meng et al. (2011): E. coli and its DXP pathway were found with the most potential ability beneficial for taxadiene production
- Park et al. (2007): A high yield of 0.378 g of L-valine per gram of glucose
- Park et al. (2011): A high yield of 32.3 g/L L-valine in fed-batch cultivation
- Jung et al. (2010): Polylactic acid and its copolymers; PLA, P (3HB-co-LA), and 3HB.P (3HB-co-LA) produced up to 11 wt.%, 56 wt.%, and 46 wt.% from glucose, respectively
- Moon et al. (2008): 9.25 g/L could be obtained after 12 h of aerobic cultivation
- Lee et al. (2007): A high yield of 0.393 g per gram of glucose and 82.4 g/L threonine by fed-batch culture
- Lee et al. (2005a): Increased production by more than sevenfold and the ratio by ninefold
- Wang et al. (2006): A high yield of 1.29 mol succinate/mol glucose and high productivity
- Fong et al. (2005): Lactate titers ranged from 0.87 to 1.75 g/L and secretion rates were directly coupled to growth rates
- Chemler et al. (2010): 817 mg/L of leucocyanidin and 39 mg/L (+)-catechin with 10 g/L glucose, a fourfold and twofold increase, respectively
- Xu et al. (2011): A fourfold increase in the levels of intracellular malonyl-CoA
- Yim et al. (2011): Leading to a strain of E. coli capable of producing 18 g/L of this highly reduced, non-natural chemical
- Fowler et al. (2009): Increased by over 660 % for naringenin and by over 420 % for eriodictyol
- Li et al. (2012)
- Becker et al. (2011): 0.55 g per gram of glucose, a titer of 120 g/L lysine, and a productivity of 4.0 g/L/h
- van Ooyen et al. (2012): The best strain obtained 10 % higher yields
- Oddone et al. (2009): At least 15 % higher GFP per cell than the control strain
- Izallalen et al. (2008): Successfully increasing electron transfer as a result of higher respiratory rate
- Santala et al. (2011): Acinetobacter baylyi ADP1; 5.6-fold more triacylglycerol (milligrams per gram cell dry weight) and the proportion in total lipids was increased by eightfold
- Huang et al. (2012): Approximately 43.2 % higher than that of the parental strain
Food and nutrients
In food and nutrients industry, GSMMs are built to improve the yield of fermentation byproducts and explore metabolic mechanisms and processes. Guided by the systematic strategy, improving E. coli for the production of amino acid and organic acid is the most successful attempt (Becker and Wittmann 2012). In addition, certain non-model microbes such as lactic acid bacteria (LAB) and Corynebacterium glutamicum have been reconstructed into GSMM to investigate the global cell for strain improvement (Milne et al. 2009; Teusink et al. 2011).
Because of their importance in nutrition, amino acids are commonly used in nutrition supplements, fertilizers, and the food industry. Based on the GSMM, E. coli was improved for producing L-valine through systematic analysis and simulation (Park et al. 2007). In this study, Park et al. first constructed an L-valine-producing base strain by analyzing the metabolic and regulatory information available in the literature. Then, this base strain was improved stepwise, guided by new information obtained from transcriptome analysis and in silico gene knockout simulation. The final engineered E. coli strain attained a high yield of 0.378 g of L-valine per gram of glucose. In the GSMM-guided process, an E. coli GSMM named MBEL979, which included 979 metabolic reactions and 814 metabolites, was first obtained by slightly updating iJR904 (Lee et al. 2005b; Reed et al. 2003). Then, through simulation with the MOMA algorithm, the aceF, mdh, and pfkA genes were identified as the best triple knockout targets that could ensure a reasonable growth rate. After the genes were deleted experimentally, the L-valine concentration of the mutant strain was 2.27-fold higher than that of the corresponding recombinant starting strain, which agreed well with the in silico predictions. Four years later, Park et al. (2011) reported that the L-valine yield was further improved using the GSMM-guided method; in that work, the simulation algorithm was flux response analysis (FRA). In addition to valine, Lee et al. (2007) reported an improvement of a genetically modified L-threonine-overproducing strain, in which FRA was first developed and applied to a GSMM. In summary, these three representative studies demonstrate that the GSMM-guided metabolic engineering strategy has been applied efficiently to improve the prokaryote E. coli for producing primary metabolites.
Improving eukaryotic microbe S. cerevisiae for vanillin and malic acid production is another typical example (Brochado et al. 2010; Zelle et al. 2008). Vanillin is one of the most widely used flavoring agents in food industry and has been expressed in S. cerevisiae (Hansen et al. 2009). Recently, Brochado et al. (2010) improved vanillin production in baker’s yeast through in silico design. Several genetic targets were identified by OptGene and OptKnock on the GSMM iFF708 (Forster et al. 2003) while MOMA was used as the biological objective function. Subsequently, two of them (PDC1 and GDH1) were selected for further experimental verification, resulting in a Δpdc1 mutant with fivefold increase in production compared with previous works. In another study, S. cerevisiae was engineered to produce 59 g/l of malate, five times higher than earlier efforts (Zelle et al. 2008). The GSMM iND750 (Duarte et al. 2004) was used as the basis for the 13C flux model, so that the remarkable improvement could be validated by 13C-NMR flux determination. Therefore, the two studies demonstrate that GSMMs are not only applied to design metabolic engineering strategy but also to make further improvements on strains through interpreting experimental data.
Apart from the engineered model strains (E. coli and S. cerevisiae) mentioned above, a number of microbes are well known as natural producers of materials for the food and nutrients industry. C. glutamicum is one of the most important natural producers of various amino acids, and two high-quality GSMMs have been reconstructed for it (Kjeldsen and Nielsen 2009; Shinfuku et al. 2009). Its ability to overproduce L-lysine was also remarkably improved by GSMM-guided metabolic engineering (Becker et al. 2011). LAB are another group of useful nutrient-related microbes because of their powerful ability to produce bacteriocins, exopolysaccharides, polyols, vitamins, etc. (Zhu et al. 2009). Lactococcus lactis was the first of them whose metabolic network was reconstructed at genome scale to analyze metabolic capabilities and whole-cell function under aerobic and anaerobic continuous cultures (Oliveira et al. 2005). Based on this GSMM, Oddone et al. (2009) employed the dynamic flux balance analysis (DFBA) algorithm to predict gene targets to increase the expression of green fluorescent protein (GFP, a model heterologous protein) in L. lactis IL1403. The subsequent wet-lab experiments increased GFP production of L. lactis by 15 %, which validated the model-based prediction to a certain extent. In addition, some other LAB such as Lactobacillus plantarum (Teusink et al. 2006) and Streptococcus thermophilus (Pastink et al. 2009) have been reconstructed into GSMMs as well. Hence, in the industrial production of food and nutrients, applying these GSMMs of non-model organisms to microbial improvement will be a significant next step.
Biopharmaceuticals
Microbes have long been known as a source of pharmaceuticals. Many drugs such as penicillin, cephalosporin, and tetracycline are produced by natural or engineered microbes. Microbes are chosen as drug production factories because they offer advantages over total chemical synthesis or extraction from natural resources, including environmental friendliness, low costs, and higher production rates (Lee et al. 2009b). With the reconstructed GSMMs as a basis, the production of some biopharmaceuticals has benefited from the GSMM-guided metabolic engineering strategy (Alper et al. 2005a; Alper et al. 2005b; Asadollahi et al. 2009; Boghigian et al. 2012; Meng et al. 2011).
For example, lycopene is a valuable pharmaceutical and nutrient in our diets. It is beneficial to human health because of its ability to prevent cardiovascular disease and cancers of the prostate or gastrointestinal tract (Clinton 1998; Gerster 1997). More than 10 years ago, lycopene and other carotenoids had already attracted considerable attention and had been produced in recombinant microorganisms (Farmer and Liao 2000). To explore the guidance of GSMM in metabolic engineering, Alper et al. (2005a) performed an in silico analysis to investigate the putative genes impacting network properties and cellular phenotype. Using the GSMM iJE660 of E. coli (Edwards and Palsson 2000) and the MOMA algorithm, five genes were identified as candidates for experimental validation to improve lycopene production. After single and multiple gene knockouts were tested experimentally, the lycopene yield of the final engineered strain was nearly 40 % higher than that of the parental strain. Even more strikingly, the lycopene yield achieved an 8.5-fold increase over the recombinant K12 wild type after GSMM-based and combinatorial (transposon-based) methods were combined (Alper et al. 2005b). Recently, Lee’s group has also successfully employed the systematic method to identify genetic amplification targets in E. coli for enhancing lycopene production (Choi et al. 2010). In addition, GSMMs of E. coli are simulated to produce taxadiene (Boghigian et al. 2012; Meng et al. 2011) and sesquiterpene (Asadollahi et al. 2009) as well. The production of some drug precursors in E. coli, such as the L-valine and L-threonine mentioned above, has also been improved by this method. Among prokaryotic drug expression systems, the model organism E. coli is therefore the best choice for exploring GSMM-guided metabolic engineering.
Among eukaryotic organisms, the model microbe S. cerevisiae is generally designed as a microbial cell factory to produce pharmaceuticals. Sesquiterpene production in S. cerevisiae is a typical successful example of GSMM-guided metabolic engineering (Asadollahi et al. 2009). The GSMM iFF708 (Forster et al. 2003) was employed for constraints-based analyses. With OptGene as the modeling framework and MOMA as the objective function, GDH1, which encodes NADPH-dependent glutamate dehydrogenase, was identified as the best target gene for improving sesquiterpene biosynthesis in yeast. Deletion of GDH1 resulted in an approximately 85 % increase in the final cubebol titer, but it decreased the maximum specific growth rate significantly. Fortunately, this disadvantage was then mitigated by overexpression of GDH2. Thus, the complexity of eukaryotic organisms might pose a greater obstacle to using the GSMM-guided systematic strategy.
Aside from engineering model strains (E. coli and S. cerevisiae), there are lots of microbes well known for their natural production of biologically active drugs or precursors. However, so far only two popular microbes of this kind have their high-quality GSMMs, Bacillus subtilis (Henry et al. 2009; Oh et al. 2007) and Streptomyces coelicolor (Alam et al. 2010; Borodina et al. 2005). Streptomyces bacteria are the well-deserved microbial factories for antibiotics. It is said that almost two thirds of all known natural antibiotics are produced by Streptomyces (Borodina et al. 2008). S. coelicolor A3(2), the best genetically characterized strain in this genus, has become a preferred model organism in Streptomyces research. Jens Nielsen’s group had applied the GSMM of S. coelicolor A3(2) to display its global metabolism (Borodina et al. 2005). Then, it was predicted that decreased phosphofructokinase activity would lead to an increase in pentose phosphate pathway flux and in flux to pigmented antibiotics and pyruvate (Borodina et al. 2008). Alam et al. (2010) updated the GSMM and successfully predicted flux changes when the cell switches from biomass to antibiotic production. Recently, Huang et al. (2012) reconstructed a partial metabolic network of Streptomyces roseosporus based on the GSMM of S. coelicolor and successfully improved the strain in daptomycin yield using in silico metabolic flux analysis. Thus, the GSMM of S. coelicolor A3(2) shows its widespread applications in model reconstruction and prediction for other Streptomyces microbes. B. subtilis is another best-characterized drug-producing microbe with an ability to produce antibiotics. The first GSMM of B. subtilis was reconstructed based on the combination of genomic, biochemical, high-throughput phenotype, and gene essentiality data (Oh et al. 2007), and then it was updated as a result of the more accurate genomic annotations (Henry et al. 2009). 
Although these studies characterize the metabolic networks of these strains in detail, there are few reports of successful strain improvement for biopharmaceutical production driven by this systematic strategy.
Furthermore, many other important drug-producing microbes are used in industry yet have no high-quality GSMMs (Lee et al. 2009b). Therefore, in the biopharmaceutical field, both applications aimed at strain improvement and GSMM reconstruction itself still require further effort.
In the synthetic materials industry, microbes also make important contributions. Many polymer materials and their monomers can be produced by natural or engineered microbes, e.g., poly-3-hydroxyalkanoates (PHAs), polylactic acid (PLA), carboxylic acids, and butanediols (Lee et al. 2011a). Recently, the GSMM-guided metabolic engineering strategy has been successfully implemented to enhance the productivity of useful biopolymers and their precursors (Jung et al. 2010; Ng et al. 2012; Yim et al. 2011).
For example, PLA is a promising biomass-derived polymer which is considered to be biodegradable, biocompatible, and of low toxicity to humans. PLA can be synthesized by engineered E. coli, but at relatively low efficiency (Yang et al. 2010). To overcome this limitation, Jung et al. (2010) further improved this engineered E. coli based on in silico genome-scale metabolic flux analysis. In this case, MOMA, FBA, and FRA were used for in silico knockout and amplification studies with the GSMM EcoMBEL979. In silico gene knockout simulation showed that deleting the adhE gene could give a PLA production flux much higher than that predicted for the control strain while maintaining an acceptable growth rate. With two further genes, ackA and ppc, deleted, the triple knockout was considered the most beneficial strategy for maximizing the PLA flux. After additional promoter modification, the resulting strain allowed the most efficient production of PLA homopolymer and poly[3-hydroxybutyrate(3HB)-co-LA] copolymers, which agrees well with the in silico simulation results. This study enabled efficient bio-based one-step production of PLA and its copolymers. It is expected that this strategy might be generally useful for developing other engineered organisms capable of producing various unnatural polymers.
Beyond the full-length polymers, the production of monomers in microbial cell factories is an easier biosynthetic route. Some platform monomers, including propanediols, butanediols, diamines, and terpenoids, have been produced in microbes (Curran and Alper 2012; Lee et al. 2011a; Lee et al. 2012). Among them, the production of butanediols, including 2,3-butanediol (Ng et al. 2012) and 1,4-butanediol (Yim et al. 2011), was improved using the GSMM-guided metabolic engineering strategy. In addition, the production of some carboxylic acid monomers, such as formic acid, malic acid, and succinic acid, was also enhanced in engineered S. cerevisiae or E. coli under the guidance of GSMM (Kennedy et al. 2009; Lee et al. 2005a; Moon et al. 2008; Wang et al. 2006). All these monomers are important raw materials in the synthetic materials industry.
The achievements mentioned above in this industrial field were obtained in model species. However, many natural microbes are able to produce biopolymers and their monomers (Lee et al. 2011a; Lee et al. 2012). Pseudomonas putida is a typical microbe of this kind, and two GSMMs have been built for it to investigate the production of the biopolymer PHA (Nogales et al. 2008; Puchalka et al. 2008). However, GSMMs have been reconstructed for only a few of these microbes. Thus, in this application field, the GSMM-guided metabolic engineering strategy has considerable room for future development, although more effort is needed to reconstruct GSMMs for these natural species.
As one of the most important renewable energy sources, biofuel has gained increasing public and scientific attention, driven by factors such as oil price hikes, environmental concerns, and government subsidies (Stephanopoulos 2007). Microbes make a significant contribution to the production of biofuels, including bio-ethanol, bio-butanol, bio-gasoline, bio-diesel, and bio-hydrogen (Jang et al. 2012). However, native microbes need to be improved because of their low yields in biofuel production. A recent trend is the use of systems biology strategies for biofuel strain improvement (Gowen and Fong 2011; Mukhopadhyay et al. 2008), in which GSMM holds great promise for guiding strain design to improve biofuel production by microorganisms (Lee et al. 2008b).
A typical case is the increase in ethanol production by S. cerevisiae achieved by manipulating genetic targets predicted by in silico GSMM-guided simulation (Bro et al. 2006). First, different strategies were evaluated using the published GSMM of S. cerevisiae (Forster et al. 2003). One of them, insertion of the GAPN gene, was predicted to be the optimal genetic manipulation for ethanol production. In experiments, the first resulting strain had a 40 % lower glycerol yield on glucose, while the ethanol yield increased by 3 % without affecting the maximum specific growth rate. Subsequently, the GAPN gene was expressed in a strain harboring xylose reductase and xylitol dehydrogenase, and the ethanol production was finally increased by up to 25 %. Although the enhancement of ethanol production was not outstanding, this was at least the first successful attempt to use the GSMM-guided metabolic engineering strategy in the biofuel field. In addition, one recent study reported that the production of isobutanol in B. subtilis was enhanced by using elementary mode analysis based on an updated B. subtilis GSMM (Li et al. 2012). These cases indicate the practicability of this strategy in guiding microbial improvement in the biofuel field.
The potential microbes which have been reconstructed into GSMMs in biofuel field
Photosynthetic organisms as a source of hydrogen (Chang et al. 2011; de Oliveira Dal’Molin et al. 2011)
Of interest for industrial solvent (particularly bio-butanol) production (Lee et al. 2008a; Senger and Papoutsakis 2008)
A sustainable alternative to petroleum-based production of butanol (Milne et al. 2011)
Biochemically converting plant sugar and cellulose to ethanol (Roberts et al. 2010)
Producing many unique cofactors, coenzymes, and enzymes during methanogenesis (Tsoka et al. 2004)
A methanogen capable of producing methane (Satish Kumar et al. 2011)
The more obvious use is to produce methane as an alternative fuel (Feist et al. 2006)
Its potential application for production of carotenoids and alkanes (Rokem et al. 2011)
The organism creates ATP for an energy source and acetate, CO2 and H2 as bio-products (Sun et al. 2010)
Of interest as microbial fuel cells for production of ethanol and acetate (Sun et al. 2010)
To produce hydrogen, polyhydroxybutyrate or other hydrocarbons (Imam et al. 2011)
It can utilize a large host of electron donors (Risso et al. 2009)
Synechocystis sp. PCC6803: A cyanobacterium considered as a candidate photo-biological production platform for bio-hydrogen (Montagud et al. 2010; Yoshikawa et al. 2011; Nogales et al. 2012; Montagud et al. 2011)
A leading candidate for ethanol production (Widiastuti et al. 2011; Lee et al. 2010)
Given the successes in other fields, the advances in the genetic accessibility of biofuel species, and the growing body of knowledge on their metabolic systems, the GSMM-guided metabolic engineering approach is expected to be widely applied to improve biofuel microbes in the future.
The representative microbes with GSMMs in application of bioremediation
Of interest in environmental and biotechnological applications with large-spectrum biodegradation capabilities (Durot et al. 2008)
Widespread application in bioremediation of toxic, persistent, carcinogenic, and ubiquitous ground water pollutants (Islam et al. 2010)
Used for bioremediation and electricity generation from waste organic matter and renewable biomass (Sun et al. 2009)
Used for bioremediation and electricity generation from waste organic matter and renewable biomass (Mahadevan et al. 2006)
With the ability to degrade organic solvents such as toluene and also to convert styrene oil to biodegradable plastic polyhydroxyalkanoates (PHA) (Puchalka et al. 2008; Sohn et al. 2010; Nogales et al. 2008)
The remarkable catabolic diversity of R. erythropolis makes it an interesting organism for bioremediation and fuel desulfurization (Aggarwal et al. 2011)
Its cytochromes have been of particular interest in the field of research due to their potential of bioremediation of heavy metals (Pinchuk et al. 2010)
Geobacter spp., natural inhabitants of a diverse range of soils and aquatic sediments (Lovley et al. 2004) that can reduce insoluble metal oxides, are typical microorganisms in bioremediation. Geobacter sulfurreducens has become a model organism of this genus for studying the mechanism of Fe(III) respiration and the process of environmental remediation, because it has the earliest available sequenced genome (Methe et al. 2003), a workable system for genetic manipulation (Coppi et al. 2001), and an in silico GSMM (Mahadevan et al. 2006). The GSMM was reconstructed to investigate its central metabolism and electron transport, revealing that energy conservation with extracellular electron acceptors is limited compared with that associated with intracellular acceptors (Mahadevan et al. 2006). Guided by this prediction, Izallalen et al. (2008) achieved a strain improvement. The OptKnock algorithm (Burgard et al. 2003) was applied to the GSMM of G. sulfurreducens to determine the gene knockouts that would maximally increase the respiration rate. The in silico analysis indicated that gene deletions in central metabolism or in fatty acid and amino acid metabolism could increase respiration and cellular ATP demand. Subsequently, the prediction was validated experimentally by altering the F1 portion of the membrane-bound F0F1 ATP synthase. Because it reflects increased electron transfer, a higher respiratory rate benefits the bioremediation ability of the organism. This was the first report of metabolic engineering to improve the respiratory rate of a microorganism.
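As a rough sketch, OptKnock can be stated as a bilevel problem in which an outer problem selects knockouts and an inner problem lets the cell optimize its own objective. The symbols below are assumptions introduced for illustration and are not the notation of Izallalen et al. (2008): y_j is a binary variable (y_j = 0 removes reaction j), K is the maximum number of knockouts allowed, v_target is the flux to be increased (here it would be the respiration rate), and v_biomass stands for the cellular objective.

```latex
\begin{aligned}
\max_{y}\quad & v_{\mathrm{target}} \\
\text{s.t.}\quad & v \in \arg\max_{v}\ \left\{\, v_{\mathrm{biomass}} \;:\; S\,v = 0,\;\;
  v_j^{\min}\, y_j \le v_j \le v_j^{\max}\, y_j \ \ \forall j \,\right\}, \\
& \sum_{j} \left( 1 - y_j \right) \le K, \qquad y_j \in \{0, 1\} \ \ \forall j .
\end{aligned}
```

The inner problem is an ordinary FBA, so the bilevel program is typically converted to a single mixed-integer linear program through the duality of the inner linear problem before it is solved.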
Since the systematic approach to microbial strain improvement in bioremediation plays a significant role in environmental protection (de Lorenzo 2008), in silico GSMMs have been reconstructed for a number of relevant microbial species (Table 4). However, to date, successful cases using this systematic strategy remain very limited in this field. Hence, much more exploration is needed in the future.
Future improvements and directions
The achievements reviewed here were not easy to attain, because many obstacles still hinder the guidance of GSMM. In summary, challenges in the following three aspects remain to be overcome in the near future.
Firstly, because reconstructing a GSMM remains time-consuming, efforts to automate reconstruction have continued. A few methods, such as PathoLogic (used in the Pathway Tools software) and AUTOGRAPH (Notebaart et al. 2006), can assist automatic GSMM reconstruction. Some computational tools, such as Pathway Tools (Karp et al. 2002), metaSHARK (Pinney et al. 2005), and MetaNetMaker (Forth et al. 2010), have been developed to reconstruct a draft GSMM. Recently, the Model SEED (Henry et al. 2010), a web-based resource for high-throughput generation, optimization, and analysis of genome-scale metabolic models, has greatly accelerated the reconstruction of new GSMMs: it can automate the reconstruction process in approximately 48 h starting from a completed genome sequence. Another building strategy, based on a refined GSMM template, has been proven feasible (AbuOun et al. 2009; Liao et al. 2011; Vongsangnak et al. 2012). This method identifies orthologous genes between the target species and a template species with an extensively curated metabolic network, and then extracts the “orthologous” part of the GSMM from the well-studied species (AbuOun et al. 2009). Although these methods and tools shorten reconstruction time and improve model reliability to some degree, faster and more reliable automatic methods are still urgently needed to meet the requirement of high-throughput GSMM reconstruction for industrial microbes.
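The template-based strategy can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the reaction and gene identifiers, the ortholog table, and the simplification that the genes of a reaction are treated as isozymes (OR logic), whereas real gene-protein-reaction rules also contain enzyme complexes (AND logic). It is not the procedure of AbuOun et al. (2009), only an example of extracting the "orthologous" part of a template model.

```python
# Toy template model: reaction id -> genes and equation.
# All identifiers are hypothetical and chosen only for illustration.
TEMPLATE = {
    "PGI": {"genes": {"t_pgi"}, "equation": "g6p <=> f6p"},
    "PFK": {"genes": {"t_pfkA", "t_pfkB"}, "equation": "f6p + atp -> fdp + adp"},
    "ADH": {"genes": {"t_adhE"}, "equation": "acald + nadh -> etoh + nad"},
}

# Ortholog table (template gene -> target gene), e.g. from a
# reciprocal best-hit sequence search run beforehand.
ORTHOLOGS = {"t_pgi": "x_0001", "t_pfkA": "x_0456"}


def draft_from_template(template, orthologs):
    """Keep each template reaction that has at least one gene with an ortholog."""
    draft = {}
    for rxn_id, rxn in template.items():
        mapped = {orthologs[g] for g in rxn["genes"] if g in orthologs}
        if mapped:  # simplification: genes are treated as isozymes (OR logic)
            draft[rxn_id] = {"genes": mapped, "equation": rxn["equation"]}
    return draft


draft = draft_from_template(TEMPLATE, ORTHOLOGS)
for rxn_id in sorted(draft):
    print(rxn_id, sorted(draft[rxn_id]["genes"]), draft[rxn_id]["equation"])
```

Reactions without any mapped gene (ADH in this toy case) are dropped from the draft and would become candidates for manual curation or gap filling.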
Secondly, none of the existing models can provide a complete view of a natural cell (Baginsky et al. 2010). The accuracy of a GSMM and its predictive ability are the key issues for applying the GSMM-guided strategy more widely. Thus, manual refinement to ensure a high-quality GSMM is another important research aspect. In this process, the primary method is to search the literature and databases for physiological and biochemical information on the target organism, so comprehensive, non-redundant databases and sufficient phenotype data are needed. Rhea, a recently developed biochemical database that integrates data from more than ten different biochemical databases, is a comprehensive resource of expert-curated biochemical reactions. It provides a non-redundant set of chemical reactions, including enzyme-catalyzed reactions, transport reactions, and spontaneously occurring reactions, for GSMM reconstruction (Alcantara et al. 2012). The development of phenotype microarrays for microbes (Bochner 2009), which attempt to give a global view of cellular phenotypes, has greatly improved the reliability of GSMMs (Oh et al. 2007). In addition, some model-curation strategies, such as the popular gap analysis (Satish Kumar et al. 2007), have played a significant role in repairing flaws in GSMMs. A novel method that eliminates erroneous differences between species through comparative systems analysis has also been reported for improving GSMM reconstruction (Oberhardt et al. 2011). These methods and resources are expected to accelerate the reconstruction of high-quality GSMMs to meet industrial interest.
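A first structural pass of gap analysis can be sketched in a few lines of Python. This is an assumed, simplified illustration and not the published algorithm of Satish Kumar et al. (2007): it only flags metabolites that no reaction can produce or that no reaction can consume, on a hypothetical five-metabolite network, and it does not propagate the gaps downstream or propose filling reactions.

```python
def find_dead_ends(reactions):
    """Return (never_produced, never_consumed) metabolite lists.

    reactions maps a reaction id to (stoichiometry, reversible), where
    stoichiometry maps metabolite -> coefficient (negative = consumed).
    """
    produced, consumed = set(), set()
    for stoich, reversible in reactions.values():
        for met, coeff in stoich.items():
            # A reversible reaction can both make and use each participant.
            if coeff > 0 or reversible:
                produced.add(met)
            if coeff < 0 or reversible:
                consumed.add(met)
    all_mets = produced | consumed
    return sorted(all_mets - produced), sorted(all_mets - consumed)


# Hypothetical toy network; exchange reactions have a single metabolite.
TOY_NETWORK = {
    "EX_A_in": ({"A": 1}, False),            # uptake of A
    "R1": ({"A": -1, "B": 1}, False),        # A -> B
    "R2": ({"B": -1, "C": 1}, False),        # B -> C, but nothing uses C
    "R3": ({"D": -1, "E": 1}, False),        # D -> E, but nothing makes D
    "EX_E_out": ({"E": -1}, False),          # secretion of E
}

never_produced, never_consumed = find_dead_ends(TOY_NETWORK)
print("never produced:", never_produced)
print("never consumed:", never_consumed)
```

In a curation workflow, each flagged metabolite would prompt a search for a missing reaction, a missing transporter, or an incorrect reaction direction.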
Beyond increasing the quality of GSMMs, another desirable research direction is the further development of GSMM simulation methods. One of the most important features of constraints-based simulation is that it can depict a global metabolic system without kinetic and regulatory information. For the same reason, however, it cannot fully predict the true metabolism of a target microbe. To compensate, one promising approach is to combine multiple omics data and thermo-kinetic data with the constraints-based simulation. For example, some biochemical kinetic parameters, if incorporated into a GSMM, could significantly increase its predictive power (Blazeck and Alper 2010). Based on GSMMs, dynamic FBA (DFBA) was used to identify gene targets for increasing the specific expression of GFP in L. lactis IL1403 and to analyze the ethanol production of S. cerevisiae in fed-batch culture (Becker et al. 2011; Hjersted et al. 2007). As regulatory genes are important targets in metabolic engineering, incorporating gene regulatory information from transcriptome data can greatly increase prediction accuracy (Åkesson et al. 2004). For instance, target genes for increasing valine and threonine production in E. coli were successfully identified by genome-scale metabolic simulations using transcriptome profiling data (Lee et al. 2007; Park et al. 2007). Furthermore, 13C-based flux analysis was used to predict engineering targets and evaluate cellular physiology with relatively high accuracy based on GSMMs (Park et al. 2010; van Ooyen et al. 2012). Thus, the GSMM-guided strategy combined with other informative data is expected to be the most efficient method for improving microbes at present and in the near future.
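The way kinetic information enters a dynamic FBA simulation can be shown with a deliberately small Python sketch of the static optimization approach: at each time step a kinetic expression bounds substrate uptake, an inner optimization picks the fluxes, and the extracellular concentrations are updated by an Euler step. All parameter values and function names are hypothetical, the culture is a simple batch rather than the fed-batch systems cited above, and the inner problem (a linear program over the whole GSMM in practice) is reduced here to one variable whose optimum lies at the uptake bound.

```python
def solve_inner_fba(uptake_max, biomass_yield, ethanol_yield):
    """One-variable stand-in for the FBA linear program.

    Maximizes growth = biomass_yield * uptake subject to
    0 <= uptake <= uptake_max, so the optimum sits at the upper bound.
    """
    uptake = max(uptake_max, 0.0)
    return uptake, biomass_yield * uptake, ethanol_yield * uptake


def simulate_dfba(glucose, biomass, hours, dt=0.01, vmax=10.0, km=0.5,
                  biomass_yield=0.05, ethanol_yield=1.5):
    """Euler integration of a batch culture (mmol/L, gDW/L, hours)."""
    ethanol = 0.0
    for _ in range(int(round(hours / dt))):
        # Kinetic constraint: Michaelis-Menten bound on glucose uptake.
        uptake_max = vmax * glucose / (km + glucose) if glucose > 0 else 0.0
        # Never take up more glucose than is left in this time step.
        uptake_max = min(uptake_max, glucose / (biomass * dt))
        uptake, growth, secretion = solve_inner_fba(
            uptake_max, biomass_yield, ethanol_yield)
        glucose -= uptake * biomass * dt
        ethanol += secretion * biomass * dt
        biomass += growth * biomass * dt
    return glucose, biomass, ethanol


final_glucose, final_biomass, final_ethanol = simulate_dfba(20.0, 0.1, 24.0)
print(final_glucose, final_biomass, final_ethanol)
```

Because the toy yields are fixed, the final ethanol and biomass follow directly from the mass balance on the glucose consumed; with a genome-scale model the inner optimization would instead redistribute fluxes as the uptake bound changes over time.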
Thirdly, from the perspective of industrial application, microbial improvement guided by GSMM is expected to expand towards non-model microbes. From the collected examples, it is obvious that this systematic method is well explored in the model organisms E. coli and S. cerevisiae. However, non-model microbes of interest have the advantage of inherent metabolic and regulatory systems suited to a particular application, which makes them attractive platforms for material production or other industrial uses. In general, extensive application of this strategy in non-model microbes is hindered by three major obstacles: the costs of sequencing and analysis, the great effort required to collect data and build a GSMM, and the genetic accessibility of the non-model target microbe (Blazeck and Alper 2010). Compared with the cost in the Sanger sequencing era, recent advances in next-generation sequencing have greatly reduced this cost (Hall 2007; Mardis 2008; Schuster 2008). Other high-throughput biological technologies, e.g., RNA-seq, gene chips, phenotype microarrays, and NMR metabolite detection, provide abundant data for model building and simulation. Growing efforts to develop genetic manipulation systems for target industrial microbes or genetically close species create the prerequisites for applying GSMM-guided metabolic engineering to strain improvement. Profiting from these advances, some non-model microbes have been improved by this systematic strategy (Table 2). Although these obstacles must still be considered before using this strategy, it is expected to become a state-of-the-art technology for the improvement of industrial microbes.
With the rise in high-throughput measurement technologies and the growing number of sequenced genomes, the continued construction of in silico GSMMs will provide increasingly powerful tools to investigate biological systems and design efficient cell factories. Much more effort should be made in the future to accelerate GSMM reconstruction, improve model simulation, and expand the scope of the GSMM-guided strategy in the improvement of microbes of industrial interest.
The authors are grateful for the support of the major special project of science and technology of Zhejiang province, China (No. 2008C12G2020010), the Fundamental Research Funds for the Central Universities, and the NSFC project, China (No. 30971743).