Genome-scale modeling using flux ratio constraints to enable metabolic engineering of clostridial metabolism in silico
Cite this article as: McAnulty, M.J., Yen, J.Y., Freedman, B.G. et al. BMC Syst Biol (2012) 6: 42. doi:10.1186/1752-0509-6-42
Genome-scale metabolic networks and flux models are an effective platform for linking an organism genotype to its phenotype. However, few modeling approaches offer predictive capabilities to evaluate potential metabolic engineering strategies in silico.
A new method called “flux balance analysis with flux ratios (FBrAtio)” was developed in this research and applied to a new genome-scale model of Clostridium acetobutylicum ATCC 824 (iCAC490) that contains 707 metabolites and 794 reactions. FBrAtio was used to model wild-type metabolism and metabolically engineered strains of C. acetobutylicum where only flux ratio constraints and thermodynamic reversibility of reactions were required. The FBrAtio approach allowed solutions to be found through standard linear programming. Five flux ratio constraints were required to achieve a qualitative picture of wild-type metabolism for C. acetobutylicum for the production of: (i) acetate, (ii) lactate, (iii) butyrate, (iv) acetone, (v) butanol, (vi) ethanol, (vii) CO2 and (viii) H2. Results of this simulation study coincide with published experimental results and show that the knockdown of the acetoacetyl-CoA transferase increases butanol to acetone selectivity, while the simultaneous over-expression of the aldehyde/alcohol dehydrogenase greatly increases ethanol production.
FBrAtio is a promising new method for constraining genome-scale models using internal flux ratios. The method was effective for modeling wild-type and engineered strains of C. acetobutylicum.
Keywords: Genome-scale model; clostridia; flux ratio; flux balance analysis; metabolic engineering; systems biology
Modeling clostridial metabolism
Butanol is of considerable research interest as a potential biofuel, and its renewable production through fermentation is sought largely from the clostridia. In particular, Clostridium acetobutylicum ATCC 824 has been one of multiple clostridia researched for butanol production over the past few decades. In fact, the first applications of metabolic flux balancing were performed using a model of C. acetobutylicum primary metabolism to understand what caused this organism to produce butanol and the competing metabolic byproducts: (i) acetate, (ii) butyrate, (iii) lactate, (iv) acetone, (v) ethanol, and several others in small amounts [1, 2]. Flux modeling of the primary metabolism of C. acetobutylicum has led to a better understanding of the role cofactor balancing plays in directing global metabolic changes. It has played a significant role in metabolic engineering by identifying bottlenecks and critical flux distributions at metabolic branch points [3–8]. Multiple “genome-scale” metabolic network reconstructions now exist for C. acetobutylicum [9–12]. Similar networks and their corresponding genome-scale models have been reviewed extensively [10, 13–20]. In general, they are used to (i) complete genome annotation, (ii) predict optimal culturing conditions [22–24], (iii) discover genomic regulation [20, 25, 26], (iv) identify essential genes and drug targets [27–33], (v) study strain evolution [34, 35], and (vi) design productive strains [36–38]. Modeling results on the genome-scale have been applied to both “acidogenic” and “solventogenic” programs of clostridial metabolism. The acidogenic program is characterized by high acids (i.e., acetate and butyrate) production and high growth rates, and the solventogenic program (i.e., acetone and butanol production) largely coincides with the stationary growth phase of the culture. During solventogenesis, acetate and butyrate are re-consumed by the culture and converted to acetone and butanol.
The genetic program of this metabolic shift between acids and solvents production has been studied in detail. Several insights into C. acetobutylicum metabolism have been gained from “gap filling” the metabolic network by locating previously unknown enzymes and biochemical reactions [11, 12]. The total rate at which a cell produces/consumes protons through the several membrane transport mechanisms is termed the specific proton flux (SPF), and this parameter has been shown to significantly reduce the total number of flux “solutions” available for the under-determined genome-scale model of C. acetobutylicum. Reducing the number of degrees of freedom of these genome-scale models through application of genetic regulation and physicochemical constraints has been recognized as a key strategy for generating metabolic flux predictions that coincide with experimental observations.
Engineering clostridial metabolism
Metabolic engineering in silico
The goal of metabolic engineering in silico is to derive (or at least evaluate) potential metabolic engineering strategies prior to constructing them in the laboratory. For example, will a particular gene over-expression or knockout in C. acetobutylicum increase butanol production? Answering questions of this type is one of the potential uses of genome-scale modeling. However, with the initial genome-scale model for C. acetobutylicum [11, 12], these questions could not be addressed without constraints on acid/solvent production. These constraints artificially specified ranges for secretion rates of acid and solvent products. They were necessary due to the large number of degrees of freedom that exist in the under-determined genome-scale model and the high degree of branching in the primary metabolism of clostridia. Simply put, too many flux solutions were available if the user defined only the substrate uptake rate and a proper objective function. The production of products/byproducts by a metabolic network not only completes elemental balances but also regenerates and balances cofactors. In clostridial metabolism, ATP is regenerated by the production of acetate or butyrate, and NAD+ is regenerated by the production of (i) lactate, (ii) ethanol, or (iii) butanol. With several options to balance cofactors available, information about enzyme specificity is necessary to achieve reasonable selectivity. If constraints in a genome-scale model are simply placed around secretion of a product or byproduct, the model does not represent the cellular mechanisms that result in proper selection. Thus, an effective metabolic engineering strategy cannot be formulated in silico given these types of constraints.
With the ultimate goal of re-directing metabolic flux through the butanol production pathway in C. acetobutylicum, few tools, with the notable exception of OptKnock, exist for deriving a metabolic engineering strategy. Even with its many successes, OptKnock is restricted to gene knockouts and cannot suggest over-expression and partial gene knockdown strategies to engineer metabolism. However, the recently published OptForce algorithm provides the capability to identify both gene over-expressions and knockdowns required of a metabolic network to produce a targeted amount of a specified product. Ultimately, methods that target the regulatory network of the cell and re-direct metabolic flux at network branch points will enable even more effective metabolic engineering in silico. The research presented here is a first step to constraining metabolic branching based on enzyme specificity. This approach also enables simulation of gene over-expressions and partial gene knockdowns in addition to gene knockouts.
Considering metabolic flux ratios
The experimental determination of metabolic flux and pathway usage through the use of isotope tracers has significantly contributed to the overall understanding of regulated metabolism. One approach to characterize metabolism is through the use of metabolic flux ratio analysis (METAFoR) [45–47]. This method is used to determine the degree of converging pathway usage to produce a metabolite pool when multiple synthesis routes exist. For example, METAFoR can reveal the relative contributions of anaplerosis and the TCA cycle to the formation of the oxaloacetate pool. Early results revealed the robustness of central carbon metabolism of Escherichia coli [45, 47], as many calculated flux ratios were found impervious to genetic perturbations. Additional computational method development led to the formulation of constraints for flux balancing from measured flux ratios [48, 49]. The resulting algorithm was effective given small metabolic networks of primary metabolism and the use of nonlinear programming methods. Unfortunately, these aspects have limited the applicability to large genome-scale metabolic networks, which often must rely on linear programming.
Genome-scale modeling with flux ratios
Of the several successful (and unsuccessful) metabolic engineering strategies applied to clostridia (many of which are not mentioned here), it was not immediately apparent which design(s) would be successful upon conception. The mutant strains had to be created in the laboratory and analyzed. From these results, hypotheses were formed that guided more advanced designs. The purpose of metabolic engineering in silico is to analyze and optimize engineering strategies a priori so that only the most promising candidates are constructed in the laboratory. While genome-scale modeling has provided the necessary platform for metabolic engineering in silico, the large number of degrees of freedom of these models has been limiting. Here, a new approach called “flux balance analysis with flux ratios (FBrAtio)” is developed and applied. One significant advantage of FBrAtio is that flux ratio constraints are built into the stoichiometric matrix directly. This approach allows for multiple flux ratio constraints to be included simultaneously, and the flux balancing problem can be solved using simple linear programming. In particular, FBrAtio is used to show that the butanol to acetone production ratio of C. acetobutylicum increases in the presence of CoAT knockdown by antisense RNA (asRNA). This metabolic engineering strategy is also simulated in the presence of AAD knockdown and over-expression to show this method can predict these published outcomes [43, 50].
A new genome-scale model for C. acetobutylicum ATCC 824 was constructed by expanding the previously published model by Senger and Papoutsakis [11, 12]. The new model is called iCAC490 and contains 707 metabolites involved in 794 biochemical reactions, including 66 membrane transport reactions. The model includes 490 genes from the C. acetobutylicum genome. The newly updated iCAC490 model differs from the original Senger and Papoutsakis model [11, 12] in that it contains 242 more reactions (a 44% increase) and 285 more metabolites (a 68% increase). The new reactions added to create the iCAC490 model were obtained from the KEGG database and recent literature. The iCAC490 model also contains an updated TCA cycle that operates in both oxidative and reductive directions to succinate, as shown by recent fluxomics studies [52, 53]. The model allows the export of succinate since its metabolic fate has not yet been resolved conclusively. The iCAC490 model is also fully compartmentalized and allows the presence of chemical reactions in the extracellular environment. Thermodynamic reaction reversibility constraints based on Gibbs free energy calculations from the group contribution method [54, 55] have also been applied. The biomass equation was also updated for the iCAC490 model, using the initial version by Senger and Papoutsakis [11, 12] as a template. A nonlinear optimization procedure was applied (manuscript in preparation) to optimize the biomass equation given specific environmental conditions. The biomass equation derived for exponential growth was used extensively in simulation studies reported here. It was found that the exponential growth biomass equation could result in qualitatively accurate model predictions. It is acknowledged that an updated and dynamic biomass equation will be required to obtain model predictions that are quantitatively accurate. The reconstructed metabolic network of the iCAC490 model is included as Additional file 1.
The SBML formatted model is included as Additional file 2.
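The reported percentage increases can be checked against the stated counts. The sketch below assumes the increases are measured relative to the original model's totals, which are obtained here by subtraction rather than quoted from the source:

```latex
\begin{align*}
\text{reactions: }   & 794 - 242 = 552, \qquad \frac{242}{552} \approx 0.438 \;\; (\approx 44\%) \\
\text{metabolites: } & 707 - 285 = 422, \qquad \frac{285}{422} \approx 0.675 \;\; (\approx 68\%)
\end{align*}
```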
Flux balance analysis
The iCAC490 genome-scale model was simulated using flux balance analysis through the COBRA toolbox. The open-source GLPK linear programming software was used to solve the flux balance equation (S · v = 0), where S is a stoichiometric coefficient matrix and v is a vector of flux values. Methods related to construction of the stoichiometric matrix and the required steady-state approximation for intracellular metabolite concentrations have been detailed elsewhere. The objective functions used for all simulations were (i) maximizing the specific growth rate of the cell while (ii) minimizing the total flux of the system.
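One common way to express two such objectives with linear programming is a two-stage formulation, sketched here. The flux bounds $v^{\min}$ and $v^{\max}$, the symbol $v_{\text{growth}}$ for the biomass flux, and the ordering of the two stages are assumptions made for illustration; the text above does not state how the two objectives were combined.

```latex
\begin{align*}
\text{Stage 1: } \quad & \mu^{*} = \max_{v} \; v_{\text{growth}}
  \quad \text{s.t.} \quad S \cdot v = 0, \;\; v^{\min} \le v \le v^{\max} \\
\text{Stage 2: } \quad & \min_{v} \; \sum_{j} \lvert v_{j} \rvert
  \quad \text{s.t.} \quad S \cdot v = 0, \;\; v^{\min} \le v \le v^{\max}, \;\; v_{\text{growth}} = \mu^{*}
\end{align*}
```

The absolute values in Stage 2 remain compatible with linear programming if each reversible reaction is split into non-negative forward and backward fluxes.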
The specific proton flux
The concept of the specific proton flux (SPF) was first introduced by Senger and Papoutsakis and describes the total rate of proton influx/efflux through all membrane transport mechanisms. This value is negative when protons are leaving the cell and positive when protons are taken up by the cell. For the case of C. acetobutylicum, the SPF is highly negative during exponential growth (acidogenesis) and turns slightly positive during the stationary phase (solventogenesis). In this research, the SPF was fixed at specific values by constraining the proton exchange reaction (the total flux of protons into or out of the system boundary). The SPF range was between −30 (proton efflux) and 5 (proton influx), and the limits were chosen from experimental observations.
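Fixing the SPF amounts to setting the lower and upper bounds of the proton exchange flux to the same value. The following is a minimal sketch of scanning the stated SPF range in this way. The reaction identifier `EX_proton`, the step size of 5, the default bounds of ±1000, and both function names are hypothetical and do not come from the model files.

```python
def spf_scan(spf_min=-30, spf_max=5, step=5):
    """Return the SPF values to simulate, from spf_min to spf_max inclusive."""
    if step <= 0:
        raise ValueError("step must be positive")
    values = []
    spf = spf_min
    while spf <= spf_max:
        values.append(spf)
        spf += step
    return values


def fix_proton_exchange(bounds, spf, reaction_id="EX_proton"):
    """Return a copy of the bounds with the proton exchange flux pinned to spf.

    Sign convention follows the text: negative is proton efflux,
    positive is proton influx.
    """
    updated = dict(bounds)
    updated[reaction_id] = (spf, spf)  # lower bound == upper bound fixes the flux
    return updated


# Hypothetical starting bounds, stored as (lower, upper) per reaction.
base_bounds = {"EX_proton": (-1000, 1000)}

# One set of bounds per simulated SPF value; base_bounds is left unchanged.
scan = [fix_proton_exchange(base_bounds, spf) for spf in spf_scan()]
```

Each entry of `scan` would then be passed to the linear programming solver together with the stoichiometric matrix, giving one flux solution per SPF value.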
Next, a new row is added to the stoichiometric matrix. This row contains two nonzero values; all other entries are zero. In the column representing the reaction catalyzed by THL (Equation 1), the coefficient 1 is added. In the column representing the reaction catalyzed by PTA (Equation 2), the coefficient −2 is added. The new row therefore imposes v_THL − 2 · v_PTA = 0, so when the flux balance equation (S · v = 0) is solved, the ratio of the flux through the reaction of Equation 1 to the flux through the reaction of Equation 2 will be exactly 2. If the chosen flux ratio is infeasible for the metabolic network, no solution to the flux balance equation will be found.
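The row-addition step can be illustrated on a toy branch point. The sketch below is not the iCAC490 model: the five-reaction network, its column order, the example flux vectors, and the function names are all hypothetical and chosen only to show the mechanics of the added row.

```python
def add_flux_ratio_row(S, numerator_col, denominator_col, ratio):
    """Return a copy of S with one extra row enforcing
    v[numerator_col] - ratio * v[denominator_col] = 0."""
    n_reactions = len(S[0])
    new_row = [0] * n_reactions
    new_row[numerator_col] = 1
    new_row[denominator_col] = -ratio
    return [list(row) for row in S] + [new_row]


def residual(S, v):
    """Compute S . v, one entry per row (all zeros at steady state)."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


# Toy network (hypothetical). Columns: uptake, THL, PTA, export_1, export_2
UPTAKE, THL, PTA, EXPORT_1, EXPORT_2 = range(5)
S = [
    [1, -2, -1, 0, 0],   # acetyl-CoA: made by uptake, used by THL (x2) and PTA
    [0, 1, 0, -1, 0],    # acetoacetyl-CoA: made by THL, then exported
    [0, 0, 1, 0, -1],    # acetyl-phosphate: made by PTA, then exported
]

# Append the row [0, 1, -2, 0, 0], i.e. v_THL - 2 * v_PTA = 0.
S_ratio = add_flux_ratio_row(S, THL, PTA, ratio=2)

v_ok = [10, 4, 2, 4, 2]    # mass balanced, THL/PTA = 2
v_bad = [10, 3, 4, 3, 4]   # mass balanced, but THL/PTA = 0.75
```

Here `residual(S_ratio, v_ok)` is zero in every row, whereas `v_bad` satisfies the original mass balances in `S` but leaves a nonzero entry in the added row, so a linear programming solver working on `S_ratio` would exclude it.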
The goal of this research was to develop a method of constraining a metabolic network so that metabolic engineering can be performed in silico. Until now, gene knockouts have been the dominant approach for designing metabolic engineering strategies in silico. However, flux ratios offer the ability to include over-expression and flux re-direction at key branch points in a metabolic network. These results can offer a snapshot of the metabolic potential of the engineered cell and offer the metabolic engineer an experimental target to achieve these results. Simulations performed in this research focus on using FBrAtio to reproduce metabolic engineering strategies that have been experimentally validated in C. acetobutylicum [6, 43, 50]. In particular, simulations were performed with the iCAC490 model in which the glucose uptake rate and the SPF were the only specified membrane transport fluxes. Next, FBrAtio was applied to achieve the experimentally observed wild-type metabolic activity of C. acetobutylicum. The following metabolic characteristics were sought on a qualitative level: (i) at highly negative values of SPF (acidogenesis), acetate and butyrate are produced in high quantities, (ii) high hydrogen production accompanies acidogenesis, (iii) solvents are produced at SPF values close to zero and slightly positive, (iv) hydrogen production decreases during solventogenesis, (v) the maximum growth rate of the culture occurs during acidogenesis, (vi) the production of butyrate is slightly greater than the production of acetate and much greater than the production of lactate, and (vii) the production of butanol is greater than the production of acetone and is much greater than the production of ethanol. After a qualitatively accurate simulation of wild-type metabolism was obtained, additional flux ratios were applied through FBrAtio in an attempt to predict the following experimental observations [43, 50].
Knockdown of the CoAT (by asRNA) resulted in increased butanol to acetone selectivity, but this strategy resulted in decreased ethanol and butanol production. The asRNA was designed against the mRNA of the ctfB gene in particular, which is a part of the tricistronic operon (aad-ctfA-ctfB). It was hypothesized that AAD activity was also compromised by this asRNA construct, so aad was over-expressed under its native promoter. Significantly higher ethanol and butanol yields were observed as a result of this metabolic engineering strategy. Flux ratios were designed to (i) knock down CoAT activity only, (ii) knock down activity of both CoAT and AAD, and (iii) knock down CoAT while over-expressing AAD at and above wild-type levels.
Simulations with a minimal set of constraints
The iCAC490 model was simulated with a glucose uptake rate constrained to 10, and the SPF was varied between −30 and 5. Only thermodynamic reversibility constraints were used initially. Results showed acidogenic and solventogenic metabolic phases that coincided with SPF values. Results also showed a maximum specific growth rate at an SPF value of −10, which is consistent with previous findings [2, 12]. However, during acidogenesis, acetate was the primary acid produced, and acetone was the primary solvent produced during solventogenesis. Hydrogen (H2) production was also maximized during solventogenesis. These characteristics are not consistent with experimental observations. The fact that only acetate was produced in acidogenesis demonstrates that the network required the generation of ATP. Since neither butyrate nor ethanol was produced, NAD+ must have been regenerated in a futile cycle elsewhere in the network. The ability of the network to artificially balance NAD(P)+/NAD(P)H also explains why hydrogen production remained high during solventogenesis. By maximizing the specific growth rate of the cell and minimizing the total flux of the system, flux in longer pathways, such as butyrate/butanol production, was minimized in favor of shorter ATP (acetate) and NAD+ (futile cycle) regenerating pathways.
Approximating wild-type metabolism with FBrAtio
Reactions and updated constraints involving the ferredoxins
PFO: pyruvate ferredoxin oxidoreductase
FNO: ferredoxin NAD+ oxidoreductase
FNPO: ferredoxin NADP+ oxidoreductase
asRNA knockdown of CoAT only
Knockdown of both CoAT and AAD
Over-express AAD and knockdown CoAT
The large number of degrees of freedom in primary clostridial metabolism makes this system challenging to model. Initial efforts [1, 7, 8] relied on experimentally measured data to fit a basic metabolic model and back-calculate pathway fluxes. With the development of genome-scale models, there was initial enthusiasm that this approach would result in a model capable of predicting the metabolic response of the organism to genetic and environmental manipulations. However, this level of prediction was not achieved by the first genome-scale model for C. acetobutylicum [11, 12]. This original genome-scale model was updated in this research with additional reactions and thermodynamic constraints. Even with a more complete model and updated constraints, the number of degrees of freedom of the primary metabolic network proved too large to generate meaningful predictions, even of wild-type metabolism. This is evident from the results shown in Figure 2. To build a truly predictive model, care must be taken when determining how proper constraints are imposed. It is important that these constraints not only lead to accurate representations of metabolism but can also be manipulated to mimic genetic and environmental perturbations. For example, a common method is to artificially constrain the glucose uptake rate (as was done in this research). From there, constraints can be imposed on product (e.g., acetate, butyrate, butanol, etc.) secretion fluxes to mimic the wild-type metabolism. This approach is detrimental to metabolic engineering. For example, if constraints are placed on secretion of the end-products, how do these constraints change when a genetic manipulation is made elsewhere in the metabolic network (e.g., at the thiolase enzyme)? There is no clear mathematical relationship between a secretion flux constraint and the metabolic flux through an enzyme elsewhere in the network.
Thus, constraints that are imposed to achieve accurate representations of metabolism must be imposed at the metabolic engineering targets themselves. However, this leads to the questions, what is a metabolic engineering target? And, how can constraints be imposed there? This research has focused on “branch points” (or critical nodes) of the metabolic network as potential sites of metabolic engineering. The use of acetyl-CoA in clostridial metabolism is a good example of a metabolic branch point. Acetyl-CoA can be used in the production of (i) acetate, (ii) butyrate/butanol, (iii) ethanol, and (iv) macromolecules required for cell growth. Each of these routes produces/consumes different cofactors, and the balancing of these cofactors ultimately determines the cellular phenotype.
The use of metabolic flux ratio constraints through FBrAtio enabled qualitatively accurate modeling of acidogenic and solventogenic metabolism of C. acetobutylicum using the new iCAC490 genome-scale model. The use of flux ratios allows for constraints to be placed directly at points where metabolic engineering strategies can be applied. For example, flux ratios can be manipulated to achieve a desired result (e.g., maximized butanol production). Then, genetic manipulations such as (i) over-expression, (ii) knockout, and (iii) asRNA knockdown can be applied to achieve the optimum ratios. In this research, flux ratio constraints were implemented to achieve a qualitative picture of metabolism that mimics experimental observations. As a proof of concept, the wild-type and two engineered strains analyzed were consistent with published experimental results. The case of AAD over-expression went a step further and exposed a possible metabolic engineering limit to re-routing flux into the alcohol production pathways. This suggests that the approach of flux ratio constraints is tunable. The flux values obtained here were not converted into concentrations of metabolites and biomass and compared directly to published values. The results obtained here are qualitative (not quantitative) pictures of metabolism. There are several reasons for this. First, a fixed glucose uptake rate of 10 was used for all values of the SPF examined. Previous results have shown that the glucose uptake rate varies with the SPF. However, the relationship between the glucose uptake rate and the SPF remains uncharacterized. At best, a causal relationship can be established between these two with the current level of knowledge. Next, a single biomass equation was used for all values of the SPF examined. Previous research has shown that the biomass composition, including the maintenance ATP requirement, of C. acetobutylicum changes with the SPF [10, 12].
To obtain quantitatively accurate predictions, one must first understand the relationships that exist between glucose uptake and biomass composition with the SPF. While research is underway to uncover these relationships, the use of parameters associated with exponential growth seemed to be sufficient with the FBrAtio approach.
FBrAtio is a new method to derive metabolic engineering strategies to achieve optimum phenotypes. The concept of using metabolic flux ratios was initially developed with the METAFoR approach [45, 47]. It enabled researchers to determine how multiple biosynthetic pathways contributed to the production of a metabolite pool. This enabled identification of new metabolic pathways and regulatory mechanisms. Since the implementation of FBrAtio accommodates the use of linear programming, flux ratios found with METAFoR can now easily be applied to appropriate genome-scale models using the techniques described in the Methods section (see Equations 1–4). The FBrAtio approach is different from METAFoR in that it considers how a metabolite pool is distributed as a substrate among competing enzymes. Of course, this process is governed by thermodynamics. This means that enzyme availability and intermediate accumulation downstream (among other factors) are responsible for flux ratios in physical systems. The FBrAtio approach can lead a metabolic engineer to optimum flux ratios, and enzyme availability can be manipulated through gene (i) over-expression, (ii) knockout, or (iii) partial knockdown. However, the FBrAtio approach cannot predict the potential accumulation of downstream intermediates once flux is redirected. This remains a problem for the experimentalist that may be addressed through additional gene over-expression or enzyme engineering.
The FBrAtio approach is presented in detail here and is applied to model previously published metabolic engineering approaches in C. acetobutylicum. Obviously, the full potential of FBrAtio will be realized when it can be used systematically. To do this, algorithms are needed to identify critical nodes (metabolite pools) in the metabolic network where flux ratios can be optimized to produce a desired phenotype. Research is currently underway to address this challenging task. The end result will provide the metabolic engineer with a list of flux ratios that can be manipulated using existing toolsets. Although additional complications may be encountered in some cases due to unforeseen regulatory interactions, the FBrAtio approach has the potential to provide effective “fine-tuned” metabolic engineering strategies.
The FBrAtio approach for incorporating metabolic flux ratio constraints into a genome-scale metabolic network and generating solutions using simple linear programming was developed in this research. The approach proved effective in modeling wild-type metabolism of C. acetobutylicum. FBrAtio was then applied to metabolically engineered strains, and a high ethanol producing strain was effectively modeled. A nonlinear relationship exists between the flux ratios at a critical node and the resulting phenotype. FBrAtio is capable of capturing these nonlinearities. How flux ratio constraints can be used to design metabolic engineering strategies is a subject of ongoing and future research, and the developments presented here represent the first steps toward truly predictive genome-scale models that can accurately reflect the impacts of genetic and environmental manipulations.
This research was funded by the USDA AFRI Biobased Products and Bioenergy Production Program (Award Number 2010-65504-20346). BF was supported by the Doctoral Scholars Program of the Institute of Critical Technologies and Applied Science (ICTAS) at Virginia Tech. JY was supported by a grant from the Biodesign and Bioprocessing Research Center at Virginia Tech.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.