Introduction

Microbial production of biofuels and biochemicals from renewable resources or biomass has recently attracted attention from the viewpoints of global sustainability and environmental protection, and many attempts have been made to design cells by the metabolic engineering approach. However, practical application is limited in many cases, and more innovative design of cell factories is desired [1]. Meanwhile, molecular biology has made significant progress from the reductionist point of view. Molecular knowledge alone is, however, often insufficient to understand the behavior of the cell system, because that behavior emerges from the interactions among the characterized molecules [2]. The systems biology approach has therefore attracted attention in the post-genome era. The ultimate goal of systems biology is to reconstruct a cell system in the computer so that observable phenotypes can be predicted. If this could be attained, the effects of the culture environment and/or specific genetic mutations on the metabolism could be predicted without conducting many exhaustive experiments, and metabolic engineering could be made more efficient by experimentally verifying only the mutants and conditions selected by computer simulation. Thus, an appropriate model can contribute to the efficient design of cell factories for practical application.

It is quite important to quantitatively understand the complex and highly interrelated cellular behavior from biological science and metabolic engineering points of view. This may be attained with the help of informatics and systems biology by integrating different levels of ever increasing data with deep insight into the available data by biological knowledge [2-4].

In living organisms, the metabolic network, defined as the set and topology of metabolic biochemical reactions in a cell, plays an essential role in cell survival and is under well-organized control. Thousands of different biochemical reactions as well as transport processes are linked together to break down organic compounds to generate energy (catabolism) and to synthesize macromolecular compounds (anabolism) for cell synthesis. Similarly, complex signaling networks interconvert signals or stimuli that are important for cellular function and for interactions with the environment. This implies the importance of information transfer in signal transduction pathways and cascades, which are designed to maximize the efficiency of cellular responses to environmental perturbations.

In order to understand the cell system in response to culture environment, the coupling between the recognition or sensing of the environmental condition and adjustment of the metabolic system must be properly incorporated into the model. In particular, it is important to incorporate the coupling between enzymatic reactions and the transcriptional regulation [5]. Moreover, although local regulation mechanisms are known to exist, it is not clear how those local regulation systems are coordinated on the systems level, where this may be made by ‘distributed sensing of the intracellular metabolic fluxes’ [5].

In the present article, current status of kinetic modeling is overviewed from the point of view of proper modeling with incorporation of metabolic regulation mechanism. Metabolic regulation analysis is critical for the proper modeling and has to be made in evaluating the performance of the designed cell as well as for reengineering the cell factories [6]. In bacterial adaptation to the culture environment, the global regulators detect the change in culture environment and control the metabolic pathway genes [7-9]. Here, the modeling of the metabolic regulation is considered focusing on Escherichia coli (but not limited to E. coli) based on the kinetic modeling approach with consideration of metabolic regulation.

Basic modeling approach

A variety of models have been proposed in the past, where they are discriminated from others depending on the underlying assumptions for the modeling, the data they require, and the accuracy of the model prediction [10]. The types of modeling formalism depend on such characteristics [11].

Model development typically proceeds through specification of the network structure and kinetic rate expressions, determination of the model structure, parameter identification, and model validation, and the details may differ depending on the purpose of the model [12]. Care must be taken because determining kinetic rate expressions is not straightforward, owing to the difficulty in identifying the mechanisms of enzymes and transporters [13]; some appropriate model simplification may therefore be considered. Although parameter identification, sensitivity analysis, identifiability, experimental design, and optimization are important for modeling in practice [12,14,15], here we focus on kinetic modeling with consideration of metabolic regulation.

Metabolic flux analysis

Among the different levels of information, the metabolic fluxes sit at the top, and they are the most important information from the viewpoint of the fermentation phenotype [16-18]; they can also be used to analyze the effect of a specific pathway gene knockout on the metabolism [19,20]. 13C-metabolic flux analysis (13C-MFA) has been shown to be useful for metabolic regulation analysis [16,19-23]. However, this is essentially a method for analyzing the physiological state of the organism based on mass balances together with isotopomer balances, and it has no predictive capability. It is highly desirable to be able to predict the cell growth characteristics and the metabolic changes in response to changes in the culture environment and/or specific pathway mutations.

Flux balance analysis and its extensions

Flux balance analysis (FBA) and its extension to the genome scale have made significant progress, since they require only basic knowledge of the metabolic reaction stoichiometry and give reasonably accurate predictions. Significant efforts have also been made to integrate gene-level regulation and metabolic networks to reveal the regulation mechanism [24,25]. In such an approach, however, an appropriate objective function, such as maximization of the cell growth rate, the specific substrate consumption rate, and/or the metabolite production rate, must be introduced because of the excess degrees of freedom. It was shown, however, that no single objective function can accurately represent the flux data under different culture conditions [26]. Rather, a vector-valued objective function, i.e., multiple objective functions, must be considered, resulting in a Pareto optimal set that represents the metabolic fluxes [27], where the influential objective functions may be maximum ATP yield, maximum biomass yield, and minimum sum of absolute fluxes (which corresponds to minimum enzyme investment).

The formulation may be as follows:

$$ \max J={\left[{j}_1,{j}_2,\dots, {j}_k\right]}^T $$
(1a)

Subject to:

$$ {\displaystyle \sum_{j=1}^n{S}_{ij}{v}_j}=0\kern1em \left(i=1,2,\dots, m\right) $$
(1b)

where \( j_i \) is the ith objective function, \( S_{ij} \) is the stoichiometric coefficient of metabolite i in reaction j, and \( v_j \) is the flux of reaction j. However, it is not easy in practice to determine which point on the Pareto optimal set is realized, since this depends on the culture conditions.
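The multi-objective formulation of Eqs. 1a and 1b can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration only: the three-reaction toy network, the ATP coefficient, the uptake bound, and the use of brute-force enumeration on an integer grid in place of a linear programming solver. Only two of the objectives mentioned above (biomass and ATP formation) are used, and the minimum sum of absolute fluxes is omitted for brevity.

```python
from itertools import product

# Hypothetical toy network with one internal metabolite A and three reactions:
#   v_up:   -> A            (substrate uptake)
#   v_bio:  A -> biomass
#   v_resp: A -> CO2 + ATP
# S has one row per metabolite and one column per reaction (S_ij in Eq. 1b).
S = [[1.0, -1.0, -1.0]]
ATP_PER_RESP = 2.0   # assumed ATP gain per unit respiration flux
MAX_UPTAKE = 10      # assumed upper bound on the uptake flux


def is_steady(v, tol=1e-9):
    """Eq. 1b: sum_j S_ij * v_j = 0 for every metabolite i."""
    return all(abs(sum(s * x for s, x in zip(row, v))) < tol for row in S)


def objectives(v):
    """Vector-valued objective J of Eq. 1a; every entry is to be maximized."""
    _, v_bio, v_resp = v
    return (v_bio, ATP_PER_RESP * v_resp)


def dominates(p, q):
    """True if p is at least as good as q everywhere and better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_front(candidates):
    """Keep the steady-state flux vectors that no other candidate dominates."""
    scored = [(v, objectives(v)) for v in candidates if is_steady(v)]
    return [(v, j) for v, j in scored
            if not any(dominates(other, j) for _, other in scored)]


# Enumerate non-negative flux vectors on an integer grid.
grid = [float(i) for i in range(MAX_UPTAKE + 1)]
candidates = [(up, bio, up - bio) for up, bio in product(grid, grid) if bio <= up]
front = pareto_front(candidates)

for v, j in front:
    print(v, j)
```

In this toy case the Pareto set consists of the flux vectors that use the full uptake capacity and differ only in how the carbon is split between biomass and ATP formation, which is the kind of trade-off that a single objective function cannot resolve.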

The FBA approach together with MFA information may be used for metabolic engineering purposes, as in OptKnock [28], a bi-level programming framework for identifying gene knockouts for strain improvement. This has been extended as OptReg to consider not only knockouts but also overexpression and downregulation of various reactions in the network [29]. Other extensions include OptForce [30], OptFlux [31], and differential bees FBA (DBFBA) with OptKnock, which identifies the optimal gene knockout strategies for maximizing the yield of the desired phenotypes while sustaining the growth rate [32]. A further extension, OptStrain, aims at guiding pathway modifications of microbial networks, through reaction additions and deletions, for the overproduction of targeted compounds; it is based on a stoichiometrically balanced approach that imposes maximum product yield requirements, pinpoints the optimal substrates, and evaluates different microbial hosts such as Helicobacter pylori, E. coli, S. cerevisiae, and other microorganisms [33].

Stoichiometry-based strain design algorithms are often formulated as bi-level mixed integer linear programming problems [28-30,34,35], where the outer level optimizes the objective function(s), while the inner level optimizes the cellular system, which counteracts any externally imposed genetic or environmental perturbations [36,37]. Different fitness functions may be considered [38,39].

The linear property of stoichiometric equations underlying FBA is the computational advantage and allows for genome-scale extension. However, it is not easy to confirm the designed cell metabolism in view of enzymatic reactions with intracellular metabolite concentrations.

A problem with FBA and its genome-scale extension is the difficulty of dynamic analysis as compared to the kinetic modeling approach. Some extensions have been made by incorporating kinetic expressions for multiple carbon sources and other nutrients into the quasi-steady-state framework [40-42]. The dynamic multi-species metabolic modeling (DMMM) approach has been considered by incorporating the metabolite uptake kinetics into stoichiometric models of a microbial consortium [43,44]. On the other hand, steady-state flux distributions obtained from FBA and stoichiometric information have been used to parameterize genome-scale kinetic models applicable to small perturbations [45-48]. Lin-log kinetic expressions and thermodynamics may be incorporated to constrain FBA simulation [49].

Although some attempts have been made with hybrid stoichiometric/kinetics-based modeling approaches [45,50,51], their potential has not yet been fully investigated. Dynamic flux balance analysis (dFBA) may be considered for the diauxic growth of E. coli consuming glucose and acetate by taking into account the constraints that govern the cell growth at the different phases of the batch culture [52]. Moreover, dFBA may be used for co-cultures with multiple sugars for cellulosic biofuel production [53-55]. Recently, the OptForce formalism has been extended as k-OptForce, which bridges the gap between the stoichiometric and kinetics-based approaches by seamlessly integrating the mechanistic detail given by kinetic models within a constraint-optimization framework tractable for genome-scale models [56].

A proper formulation of the interaction between metabolism and gene expression based on the principle of growth optimization enables the accurate prediction of multi-scale phenotypes [25], where constitutively expressed genes show growth-rate-dependent expression trends [57,58]. This implies that the cell system is regulated economically in response to global changes in metabolic efficiency [59]. Moreover, such an optimality model may be used for adaptive laboratory evolution [60].

The construction of a virtual microbe is an ambitious but realistic target that would provide a novel resource with significant benefits in a variety of practical applications. As an extension of the constraint-based genome-scale models [61], a whole-cell computational model was developed for Mycoplasma genitalium, a urogenital parasite favored by synthetic biologists for its reduced genome [62]. This model comprises 28 processes of the cell's operation, including exchanges with the extracellular medium, all the metabolic fluxes, the state of the supercoiled chromosome, transcription of all active genes, processing of all mRNAs, translation of all proteins, formation of all macromolecular complexes including RNA polymerases and ribosomes, and the progress of cytokinesis and FtsZ polymerization [63]. This may be the dawn of virtual cell biology [64], and it might even go beyond the previous attempt at the so-called ‘grand challenge of the twenty-first century’ [65].

Once again, although powerful and attractive for the possible extension to the whole cell modeling or the so-called virtual microbes, the main drawback of the above approach is the difficulty in incorporating ‘explicitly’ the metabolic regulation mechanism.

Kinetic modeling and incorporation of metabolic regulation

Kinetic models for the metabolism require quantitative expressions that connect fluxes and metabolite concentrations. Those can be obtained with respect to time by solving a set of ordinary differential equations (ODEs) such as:

$$ \frac{d{x}_i}{dt}={\displaystyle \sum_{j=1}^{n_i}{S}_{ij}{v}_j\left({v_j}^{\max },x,p\right)}\kern1em \left(\mathrm{i}=1,\dots, \mathrm{m}\right) $$
(2a)

where the typical mechanistic expression may be the Michaelis-Menten type such as:

$$ {v}_j=\frac{{v_j}^{\max }S}{K_s+S} $$
(2b)

or a Hill-type expression may be considered, where \( {v_j}^{\max } \) and \( K_s \) are the model parameters, and S is the substrate concentration for the corresponding pathway reaction. These expressions require detailed enzyme reaction mechanisms and characterization [66,67]. In order to reduce the computational burden associated with kinetic modeling, various approximate kinetic forms such as lin-log [68-70] and log-lin [71] kinetics, power-law kinetic expressions such as the S-system [72], generalized mass action [45], and others [73-75] have been proposed.
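As an illustration of Eqs. 2a and 2b, the following sketch integrates a two-metabolite linear pathway with Michaelis-Menten kinetics. The pathway, the constant inflow, the parameter values, and the fixed-step fourth-order Runge-Kutta integrator are all assumptions chosen for brevity, and they are not taken from any of the cited models.

```python
# Hypothetical pathway: (inflow) -> x1 -> x2 -> (outflow)
# Rows of S are metabolites, columns are reactions [v_in, v1, v2] (Eq. 2a).
S = [[1, -1, 0],
     [0, 1, -1]]

# Assumed parameter values (arbitrary units).
PARAMS = {"v_in": 1.0, "vmax1": 2.0, "K1": 0.5, "vmax2": 4.0, "K2": 1.0}


def michaelis_menten(vmax, k_s, s):
    """Eq. 2b: v = vmax * S / (K_s + S)."""
    return vmax * s / (k_s + s)


def fluxes(x, p):
    """Reaction rates as functions of the metabolite concentrations x."""
    return [p["v_in"],
            michaelis_menten(p["vmax1"], p["K1"], x[0]),
            michaelis_menten(p["vmax2"], p["K2"], x[1])]


def rhs(x, p):
    """Right-hand side of Eq. 2a: dx_i/dt = sum_j S_ij * v_j."""
    v = fluxes(x, p)
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def rk4_step(x, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(x, p)
    k2 = rhs([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)], p)
    k3 = rhs([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)], p)
    k4 = rhs([xi + dt * ki for xi, ki in zip(x, k3)], p)
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def integrate(x0, p, dt=0.01, t_end=50.0):
    """Integrate from x0 to t_end and return the final concentrations."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        x = rk4_step(x, p, dt)
    return x


x_final = integrate([0.0, 0.0], PARAMS)
print(x_final)
```

At steady state both Michaelis-Menten rates must equal the inflow, so for these assumed parameters the concentrations should approach x1 = 0.5 and x2 = 1/3, which gives a simple check on the integration.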

As mentioned in the previous section, although the stoichiometric constraint-based genome-scale metabolic models have been developed for a variety of organisms [76], it is not easy to incorporate or express the effects of intracellular metabolites and enzyme activities appropriately with such approach. Although some attempts have been made for incorporating transcriptional regulation into FBA framework in the form of Boolean rules [77-80], the regulatory rules are not based on the metabolic regulation mechanisms, but on the basis of the available data which may be the manifestation of part or snapshot of the real regulation mechanism [81].

In contrast to the stoichiometric models, the kinetic modeling approach is attractive in the sense that such mechanism can be incorporated into the model appropriately. The primary attempts of incorporating the regulation mechanisms into kinetic models have been made by cybernetic modeling approach, where the organisms are considered to utilize the available nutrient sources with the maximum efficiency by the optimal strategy [82]. This approach has been extended to more structural models that contain detailed pathways [83]. More recently, this approach has been considered for the potential applications to metabolic engineering [84,85]. In such modeling approach, an elementary mode was considered as a metabolic subunit to model cellular regulatory processes, where the elementary modes are a set of metabolic pathways by which the cellular metabolic routes can be completely described, and any feasible fluxes can be represented by their combinations at steady state [86]. The elementary modes consist of a minimal set of reactions that function at steady state, which implies that the elementary mode cannot be a functional unit if any reaction is removed [86]. The hybrid type modeling has also been developed by assuming quasi-steady-state for the intracellular metabolites [87,88], where several applications were made for E. coli [88] and for yeast [81].

In the kinetic modeling approach, it is critical to identify kinetic parameter values and kinetic rate laws that remain applicable under a variety of genetic and/or environmental perturbations. Moreover, large-scale extension may be limited by the difficulty of unambiguous kinetic model parameterization [89]. Several attempts have been made towards postulating a generalized uniform kinetic expression such as approximate enzyme kinetic equations [68,73,90-93], the S-system formalism [94,95], or a combination of in vitro-based lumped and approximate rate equations [96,97]. However, the predictive accuracy may not yet be at a satisfactory level [68,75].

Recently, the ensemble modeling (EM) has been considered to cope with large-scale kinetic modeling by successively reducing the size of parameter space based on the available experimental data such as fluxes and/or intracellular metabolite concentrations together with thermodynamic constraints for the direction of the net fluxes [98]. In EM approach, any type of pathway reaction mechanism can be considered as well as already known mechanism, where each reaction is decomposed into elementary reaction steps with mass action kinetics [99] such as:

$$ A+E\underset{k_{-1}}{\overset{k_1}{\iff }}AE\underset{k_{-2}}{\overset{k_2}{\iff }} BE\underset{k_{-3}}{\overset{k_3}{\iff }}B+E $$
(3)

where A and B are the metabolites and E is the enzyme.
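The decomposition in Eq. 3 can be written down directly as mass action kinetics. The sketch below does this for a single reaction in a closed system; the rate constants, the initial concentrations, and the explicit Euler integrator are assumptions for illustration. It shows only the elementary-step representation, not the parameter sampling and screening steps of the EM procedure.

```python
# Assumed elementary rate constants for Eq. 3 (arbitrary units).
K = {"k1": 10.0, "k-1": 1.0, "k2": 5.0, "k-2": 1.0, "k3": 5.0, "k-3": 1.0}

# Assumed initial concentrations: free substrate A and free enzyme E only.
C0 = {"A": 1.0, "E": 0.1, "AE": 0.0, "BE": 0.0, "B": 0.0}


def elementary_rates(c, k):
    """Net rates of the three reversible elementary steps of Eq. 3."""
    r1 = k["k1"] * c["A"] * c["E"] - k["k-1"] * c["AE"]   # A + E <-> AE
    r2 = k["k2"] * c["AE"] - k["k-2"] * c["BE"]           # AE <-> BE
    r3 = k["k3"] * c["BE"] - k["k-3"] * c["B"] * c["E"]   # BE <-> B + E
    return r1, r2, r3


def derivatives(c, k):
    """Mass balances for the free and enzyme-bound species."""
    r1, r2, r3 = elementary_rates(c, k)
    return {"A": -r1, "E": r3 - r1, "AE": r1 - r2, "BE": r2 - r3, "B": r3}


def simulate(c0, k, dt=1e-3, t_end=20.0):
    """Explicit Euler integration; adequate here because dt is small."""
    c = dict(c0)
    for _ in range(int(round(t_end / dt))):
        d = derivatives(c, k)
        c = {name: c[name] + dt * d[name] for name in c}
    return c


c_end = simulate(C0, K)
total_enzyme = c_end["E"] + c_end["AE"] + c_end["BE"]
total_metabolite = c_end["A"] + c_end["AE"] + c_end["BE"] + c_end["B"]
print(c_end, total_enzyme, total_metabolite)
```

The total enzyme and the total metabolite pool are conserved by construction, which is a convenient sanity check when a larger network is decomposed in the same way.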

The EM procedure starts with initially assumed kinetic models that predict the experimentally observed phenotypic characteristics, and the additional data such as those of the strain under environmental and/or genetic perturbations are used to screen the models until a minimal set of kinetic models are obtained [99]. This modeling approach has been successfully applied for lysine production [100], fatty acid production [101], aromatic production [98], robustness analysis for engineered non-native pathways [102], and modeling cancer cells [103]. Moreover, this approach has been applied for the modeling of E. coli that reasonably predicts the fluxes and intracellular metabolite concentrations of wild type and its single gene knockout mutants [104] based on the available multi-omics data [105].

Modeling of the main metabolism for catabolite regulation

The metabolic reactions of the central metabolism play important roles in energy generation and in the production of precursors for biosynthesis, and they form the hub on which nearly all catabolic and anabolic processes are built. Metabolic regulation of the central metabolism plays a key role in the adaptation of organisms to changes in their environment. The overall structure of the central metabolic pathways is remarkably well conserved among living organisms. Thus, a metabolic model of the central metabolism will provide a platform for further extension to, and incorporation of, the peripheral metabolism.

An attempt has been made to model the main metabolic pathways in order to simulate the dynamic behavior of Saccharomyces cerevisiae in response to a pulse addition of the carbon source under carbon-limited growth conditions, as measured by a fast sampling system [106,107]. Kinetic model equations for the glycolysis and the pentose phosphate (PP) pathway have been developed for E. coli to simulate the transient data obtained by the fast sampling system [108]. Kinetic models for the tricarboxylic acid (TCA) cycle and anaplerotic pathways as well as the glycolysis and PP pathway were also considered to simulate typical batch and continuous cultures with a rule-based approach, where the cell growth rate was estimated from the specific ATP production rate computed from the fluxes [109]. Kinetic modeling of the main metabolism of E. coli has also been carried out based on fluxomics and metabolomics data [110].

A wealth of information is available on genetic regulation, biochemistry, and physiology of cellular metabolism in response to culture environment, and some attempt has been made for the modeling and simulation, where it is important to make modeling based on the integrated information from gene level to flux level by incorporating the roles of transcription factors [5,111,112]. The important steps are how to incorporate (i) the effect of culture environment on global regulators, (ii) the effects of global regulators on the metabolic pathway genes, (iii) the effects of metabolic pathway genes on the corresponding enzyme activities, as well as (iv) the effects of enzyme level regulations (Figure 1).

Figure 1. Overall metabolic regulation scheme.

Importance of the modeling for the main metabolic pathways

Although modeling of restricted metabolic pathways, such as the glycolysis alone or the glycolysis plus the PP pathway, may be useful depending on the purpose of the model, e.g., for short-time transient responses to a pulse addition of substrate, it is far more important to model the whole of the main metabolic pathways, i.e., the glycolysis, TCA cycle, and PP pathway together with the anaplerotic and gluconeogenic pathways. This enables us to simulate the typical batch culture, where the metabolic transition occurs from the glucose-rich (glycolytic) condition to the acetate-rich (gluconeogenic) condition in E. coli and other organisms.

In relation to the model development of the main metabolism, accurate estimation of the cell growth rate is critical from the practical application point of view. In general, the cell growth rate may be expressed as a function of the substrate concentration, as in the Monod-type model:

$$ \frac{dX}{dt}=\mu (S)X $$
(4a)

with:

$$ \mu (S)=\frac{\mu_mS}{K_s+S} $$
(4b)

where X and S are the cell and substrate concentrations, respectively, μ is the specific cell growth rate, \( \mu_m \) is the maximum specific growth rate, and \( K_s \) is the saturation constant. However, \( K_s \) is small, so that the dynamics are dominated by \( \mu_m \), which is usually treated as a constant; this makes it difficult to estimate a reasonable cell growth rate with the Monod-type model and its modifications. The importance of accurately estimating the cell growth rate is even more evident in the dynamic simulation of specific gene knockout mutants and of the effect of culture conditions. Thus, it is desirable to be able to predict the cell growth rate accurately in general.
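The behavior of Eqs. 4a and 4b in a batch culture can be sketched as follows. The substrate balance with a constant biomass yield (`yield_xs`) is an assumption added here to close the model, and all parameter values are hypothetical; the integration uses a simple explicit Euler scheme.

```python
def monod(s, mu_max, k_s):
    """Eq. 4b: specific growth rate as a function of substrate concentration."""
    s = max(s, 0.0)
    return mu_max * s / (k_s + s)


def simulate_batch(x0, s0, mu_max, k_s, yield_xs, dt=1e-3, t_end=12.0):
    """Explicit Euler integration of Eq. 4a plus an assumed substrate balance
    dS/dt = -mu(S) * X / yield_xs. Returns a list of (t, X, S) tuples."""
    x, s = x0, s0
    trajectory = [(0.0, x, s)]
    for step in range(1, int(round(t_end / dt)) + 1):
        growth = monod(s, mu_max, k_s) * x
        x += dt * growth
        s -= dt * growth / yield_xs
        trajectory.append((step * dt, x, s))
    return trajectory


# Hypothetical values: mu_max in 1/h, concentrations in g/L, yield in g/g.
MU_MAX, K_S, YIELD_XS = 0.7, 0.01, 0.5
X0, S0 = 0.01, 4.0

trajectory = simulate_batch(X0, S0, MU_MAX, K_S, YIELD_XS)
t_final, x_final, s_final = trajectory[-1]
print(t_final, x_final, s_final)
print(monod(S0, MU_MAX, K_S) / MU_MAX)  # fraction of mu_max at the start
```

Because the assumed saturation constant is small compared with the initial substrate level, the specific growth rate stays close to its maximum until the substrate is nearly exhausted, which illustrates why the Monod form alone carries little information about how the growth rate responds to mutations or culture conditions.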

Figure 2. Main metabolic pathways.

The cell growth rate is determined by catabolic reactions such as ATP production as well as by anabolic reactions under typical growth conditions. Once the main metabolic pathways have been appropriately modeled, the ATP production rate can be estimated. The model equations are established from mass balances with kinetic equations for the main metabolic pathways (Figure 2). The solution of these ODEs can be used to compute the fluxes with respect to time. This enables us to compute the specific ATP production rate with respect to time such as:

$$ {\nu}_{\mathrm{ATP}}=O{P}_{\mathrm{NADH}}+O{P}_{{\mathrm{FADH}}_2}+{v}_{\mathrm{Pgk}}+{v}_{\mathrm{Pyk}}+{v}_{\mathrm{Ack}}+{v}_{\mathrm{SCS}}-{v}_{\mathrm{Pfk}}-{v}_{\mathrm{Pck}}-{v}_{\mathrm{Acs}} $$
(5)

where v is the reaction rate of the pathway (Figure 2). \( O{P}_{\mathrm{NADH}} \) and \( O{P}_{{\mathrm{FADH}}_2} \) are the specific ATP production rates via oxidative phosphorylation by NADH and FADH2, respectively, and they may be expressed as:

$$ O{P}_{\mathrm{NADH}}=\left({v}_{\mathrm{GAPDH}}+{v}_{\mathrm{PDH}}+{v}_{\mathrm{KGDH}}\left(+{v}_{\mathrm{ICDH}}\right)+{v}_{\mathrm{MDH}}\right)\times \left(P/O\right) $$
(6a)

$$ O{P}_{{\mathrm{FADH}}_2}={v}_{\mathrm{SDH}}\times {\left(P/O\right)}^{\prime } $$
(6b)

where (P/O) and (P/O)' are the P/O ratios for NADH and FADH2, respectively, and those are most likely to be 2.5 and 1.5, respectively, under typical aerobic condition. Those may be considered as model parameters and can be adjusted by the experimental data [109].
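Equations 5, 6a, and 6b amount to a weighted sum over the fluxes, which the following sketch makes explicit. The flux values in `example_fluxes` are hypothetical numbers chosen only to exercise the function, and the `icdh_makes_nadh` flag is an illustrative way to represent the optional ICDH term in Eq. 6a.

```python
def atp_production_rate(v, po_nadh=2.5, po_fadh2=1.5, icdh_makes_nadh=False):
    """Specific ATP production rate following Eqs. 5, 6a, and 6b.

    v is a dict of fluxes keyed by enzyme name. The P/O ratios default to
    the values quoted in the text for typical aerobic conditions.
    """
    nadh_flux = v["GAPDH"] + v["PDH"] + v["KGDH"] + v["MDH"]
    if icdh_makes_nadh:          # optional term in parentheses in Eq. 6a
        nadh_flux += v["ICDH"]
    op_nadh = nadh_flux * po_nadh                # Eq. 6a
    op_fadh2 = v["SDH"] * po_fadh2               # Eq. 6b
    substrate_level = v["Pgk"] + v["Pyk"] + v["Ack"] + v["SCS"]
    consumption = v["Pfk"] + v["Pck"] + v["Acs"]
    return op_nadh + op_fadh2 + substrate_level - consumption   # Eq. 5


# Hypothetical flux values (e.g., mmol/gDCW/h), for illustration only.
example_fluxes = {
    "GAPDH": 16.0, "PDH": 10.0, "KGDH": 5.0, "MDH": 5.0, "ICDH": 6.0,
    "SDH": 5.0, "Pgk": 16.0, "Pyk": 8.0, "Ack": 3.0, "SCS": 5.0,
    "Pfk": 8.0, "Pck": 0.5, "Acs": 0.0,
}

v_atp = atp_production_rate(example_fluxes)
v_atp_with_icdh = atp_production_rate(example_fluxes, icdh_makes_nadh=True)
print(v_atp, v_atp_with_icdh)
```

Keeping the P/O ratios as keyword arguments mirrors the statement above that they may be treated as adjustable model parameters.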

Now, 13C-MFA shows a correlation between the specific ATP production rate and the specific cell growth rate [109,113-115]. This indicates that the above \( {\nu}_{\mathrm{ATP}} \) can be used to estimate the specific growth rate, and in fact, it was shown that this approach allows us to estimate, to some extent, the cell growth rate and fluxes of specific gene knockout mutants of E. coli [109,112].

In particular, in the case of anaerobic fermentation, NADH re-oxidation and substrate level phosphorylation for ATP generation are important, and ATP generation by acetate kinase (Ack) pathway is critical for survival in the case of using only xylose as a carbon source [116]. This may be simulated by the model with the cell growth rate taking into account the effect of ATP production rate as mentioned above.

Moreover, if the main metabolism was appropriately modeled, the specific CO2 production rate can be also estimated by:

$$ {\nu}_{{\mathrm{CO}}_2}={v}_{\mathrm{PGDH}}+{v}_{\mathrm{PDH}}+{v}_{\mathrm{ICDH}}+{v}_{\mathrm{KGDH}}+{v}_{\mathrm{Mez}}+{v}_{\mathrm{Pck}}-{v}_{\mathrm{Ppc}} $$
(7)

where this can be used to estimate CO2 evolution rate (CER), and thus the cell yield may be estimated together with other metabolite production rates and the cell growth rate. Since CER can be measured in practice, this may be also validated by the experimental data.

In relation to NADH production as mentioned above, the specific NADPH production rate can be also estimated as:

$$ {\nu}_{\mathrm{NADPH}}={v}_{\mathrm{G}6\mathrm{P}\mathrm{D}\mathrm{H}}+{v}_{\mathrm{PGDH}}\left(+{v}_{\mathrm{ICDH}}\right)+{v}_{\mathrm{Mez}} $$
(8)

where the term in parentheses reflects that NADH is produced at isocitrate dehydrogenase (ICDH) in many bacteria, while some bacteria such as E. coli produce NADPH at ICDH. It has also been observed that the specific NADPH production rate is linearly correlated with the specific growth rate, as expected from the anabolic demand. This means that, once the specific growth rate has been determined from the catabolic ATP production rate, the flux from G6P into the oxidative PP pathway can be estimated, as long as the oxidative PP pathway is dominant for NADPH production [109].
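The balances in Eqs. 7 and 8 can be evaluated in the same way as the ATP balance. In the sketch below, the flux values are hypothetical, and the `icdh_makes_nadph` flag is an illustrative way to switch the optional ICDH term of Eq. 8 on (as for E. coli) or off.

```python
def co2_production_rate(v):
    """Specific CO2 production rate following Eq. 7."""
    return (v["PGDH"] + v["PDH"] + v["ICDH"] + v["KGDH"]
            + v["Mez"] + v["Pck"] - v["Ppc"])


def nadph_production_rate(v, icdh_makes_nadph=True):
    """Specific NADPH production rate following Eq. 8.

    The ICDH term is included only for organisms whose ICDH is
    NADP-dependent, such as E. coli.
    """
    rate = v["G6PDH"] + v["PGDH"] + v["Mez"]
    if icdh_makes_nadph:
        rate += v["ICDH"]
    return rate


# Hypothetical flux values (e.g., mmol/gDCW/h), for illustration only.
example_fluxes = {
    "G6PDH": 3.0, "PGDH": 3.0, "PDH": 10.0, "ICDH": 6.0,
    "KGDH": 5.0, "Mez": 0.5, "Pck": 0.5, "Ppc": 2.0,
}

v_co2 = co2_production_rate(example_fluxes)
v_nadph = nadph_production_rate(example_fluxes)
v_nadph_no_icdh = nadph_production_rate(example_fluxes, icdh_makes_nadph=False)
print(v_co2, v_nadph, v_nadph_no_icdh)
```

Since the CO2 evolution rate can be measured, a function of this kind offers a direct point of comparison between simulated fluxes and experimental data.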

Metabolic regulation mechanisms to be incorporated in the kinetic model

As mentioned in the ‘Kinetic modeling and incorporation of metabolic regulation’ section, several efforts have been made for the appropriate kinetic models which can describe the metabolic regulation in response to genetic and/or environmental perturbations. Here, we consider the metabolic regulation mechanisms that have to be incorporated into the kinetic models, where the enzyme level regulation such as allosteric regulation may be incorporated into the kinetic rate expression, while the transcriptional regulation may be expressed as functions of transcription factors (TFs), where the activities of TFs may be considered to be functions of intracellular metabolites as will be mentioned next.

The cell system achieves the coupling between recognition and adjustment through TFs, whose activities respond to the culture environment and regulate the expression of the associated genes. This combined recognition and adjustment forms the reaction networks that overarch the metabolic and genetic layers [5]. In general, fast action is achieved by enzyme-level regulation such as the feed-forward activation of pyruvate kinase (Pyk) by fructose 1,6-bisphosphate (FBP) and the feedback inhibition of phosphofructokinase (Pfk) by phosphoenolpyruvate (PEP), a motif that enables a high level of the upstream metabolite to lower the level of the downstream metabolite [117] (Figure 3a). Slow action is achieved through transcriptional regulation, where cAMP-Crp activates the expression of the TCA cycle genes, while the (FBP-inhibited) catabolite repressor/activator Cra activates the expression of the gluconeogenic pathway genes as well as some of the TCA cycle genes and the glyoxylate pathway genes, and represses the expression of the glycolysis genes in the case of E. coli (Figure 3b), as will be explained later.

Figure 3. Metabolic regulation of the main metabolism. (a) Enzyme level regulation of glycolysis. (b) Transcriptional and enzyme level regulations of the main metabolic pathways.

The levels of the flux-signaling metabolites become coupled, enabling a robust, coherent response of the TFs. The coherent behavior of the overall system is not established by a common transcriptional master regulator, but arises from the molecular interactions within the system itself [5]. It may be considered that the system of reactions of the lower glycolysis and the feed-forward activation of Pyk by FBP translate flux information into the concentration of FBP, and that this feed-forward activation affects the linearization of the glycolytic kinetics, where the glycolysis from FBP to PEP may be expressed as the reversible Michaelis-Menten (MM) equation, while Pyk may be expressed as the irreversible MM equation [118] such as:

$$ v\to \mathrm{F}\mathrm{B}\mathrm{P}\underset{k_{-1}}{\overset{k_1}{\iff }}\mathrm{P}\mathrm{E}\mathrm{P}\overset{\mathrm{Pyk}}{\longrightarrow }\mathrm{P}\mathrm{Y}\mathrm{R} $$
(9)

where the feed-forward activation of Pyk by FBP may be expressed by Monod-Wyman-Changeux (MWC) kinetics [118]. In fact, feed-forward regulation is known to ensure structural robustness against perturbations [117]. This mechanism may be conserved in many organisms. For example, in S. cerevisiae, the sugar uptake rate is well correlated with the use of the respiratory and fermentative pathways, or with the specific ethanol production rate, and a similar relationship is seen between the glycolytic flux and FBP [119,120]; Pyk is also feed-forward activated by FBP in S. cerevisiae [121].

In order to realize the above mechanism, the kinetic expression of Pyk (and also phosphoenolpyruvate carboxylase (Ppc)) must be a positive function of FBP such as:

$$ {v}_{\mathrm{Pyk}}={v}_{\mathrm{Pyk}}\left(\left[\mathrm{P}\mathrm{E}\mathrm{P}\right],\left[\mathrm{P}\mathrm{Y}\mathrm{R}\right],\left[\mathrm{F}\mathrm{B}\mathrm{P}\right],{p}_{\mathrm{Pyk}}\right) $$
(10)

where [・] denotes the concentration, and PEP and pyruvate (PYR) are the substrate and product of the Pyk reaction, respectively. FBP is the allosteric activator, and \( p_{\mathrm{Pyk}} \) is the kinetic parameter vector for Pyk. The feed-forward activation of Pyk by FBP may be enhanced by the feedback inhibition of Pfk by PEP, where the kinetic expression for Pfk must be a negative function with respect to PEP:

$$ {v}_{\mathrm{Pfk}}={v}_{\mathrm{Pfk}}\left(\left[\mathrm{F}6\mathrm{P}\right],\left[\mathrm{F}\mathrm{B}\mathrm{P}\right],\left[\mathrm{P}\mathrm{E}\mathrm{P}\right],{p}_{\mathrm{Pfk}}\right) $$
(11)

where F6P and FBP are the substrate and product of the Pfk reaction, respectively, PEP is the allosteric inhibitor, and \( p_{\mathrm{Pfk}} \) is the parameter vector for the Pfk reaction. Namely, if the sugar uptake rate or the upper glycolytic flux increases, FBP increases and in turn allosterically activates Pyk, which decreases the PEP concentration; the feedback inhibition of Pfk by PEP is thereby relaxed, causing a further increase in the upper glycolytic flux (Figure 3a). Under nominal growth conditions, the feedback inhibition of Pfk by PEP may not be important, but it may cause oscillatory behavior under certain conditions [122].

Figure 4. Homeostasis of AcCoA by the activation of Ppc.

Moreover, the increased pyruvate is converted through pyruvate dehydrogenase (PDH) to acetyl-coenzyme A (AcCoA), whose level is kept homeostatic. Namely, an increase in AcCoA activates the Ppc reaction [123,124], which reduces the incoming Pyk-PDH flux, increases oxaloacetate (OAA), and thereby activates the citrate synthase (CS) reaction, i.e., the outgoing flux from AcCoA. In this way, the AcCoA concentration decreases, forming a feedback regulation against the initial increase in AcCoA (Figure 4). This mechanism can be realized by expressing the Ppc activity as a positive function of AcCoA as well as FBP such as:

$$ {v}_{\mathrm{Ppc}}={v}_{\mathrm{Ppc}}\left(\left[\mathrm{P}\mathrm{E}\mathrm{P}\right],\left[\mathrm{O}\mathrm{A}\mathrm{A}\right],\left[\mathrm{F}\mathrm{B}\mathrm{P}\right],\left[\mathrm{AcCoA}\right],{p}_{\mathrm{Ppc}}\right) $$
(12)

where PEP and OAA are the substrate and product of the Ppc reaction, respectively, and FBP and AcCoA are the allosteric activators. From another viewpoint, this phenomenon may be considered as feed-forward regulation in the sense that repression of the TCA cycle activity is detected as an increase in AcCoA, which activates the anaplerotic Ppc pathway and replenishes precursor metabolites such as OAA, which would otherwise decrease owing to the deactivated TCA cycle.
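One way to give the rate expressions of Eqs. 10 to 12 the required signs is a generic MWC-type rate law in which activators lower, and inhibitors raise, the effective allosteric constant. The sketch below is a minimal version under several assumptions: the functional form is the textbook MWC expression with exclusive substrate binding, product terms ([PYR], [FBP], [OAA]) are omitted, and all parameter names and values are hypothetical rather than those of any cited model.

```python
def mwc_rate(vmax, s, k_s, n, l0, activators=(), inhibitors=()):
    """Generic MWC-type rate law with exclusive substrate binding.

    activators and inhibitors are sequences of (concentration, constant)
    pairs. Activators decrease and inhibitors increase the effective
    allosteric constant, so the rate rises with activators and falls
    with inhibitors.
    """
    a = s / k_s
    q = 1.0
    for conc, k in inhibitors:
        q *= 1.0 + conc / k
    for conc, k in activators:
        q /= 1.0 + conc / k
    l_eff = l0 * q ** n
    return vmax * a * (1.0 + a) ** (n - 1) / ((1.0 + a) ** n + l_eff)


# Hypothetical parameter sets (arbitrary units).
P_PYK = {"vmax": 1.0, "K_pep": 0.3, "n": 4, "L": 1000.0, "K_fbp": 0.2}
P_PFK = {"vmax": 1.0, "K_f6p": 0.2, "n": 4, "L": 10.0, "K_pep": 0.5}
P_PPC = {"vmax": 1.0, "K_pep": 0.5, "n": 4, "L": 1000.0,
         "K_fbp": 0.5, "K_accoa": 0.1}


def v_pyk(pep, fbp, p=P_PYK):
    """Pyk: substrate PEP, feed-forward activator FBP (cf. Eq. 10)."""
    return mwc_rate(p["vmax"], pep, p["K_pep"], p["n"], p["L"],
                    activators=[(fbp, p["K_fbp"])])


def v_pfk(f6p, pep, p=P_PFK):
    """Pfk: substrate F6P, feedback inhibitor PEP (cf. Eq. 11)."""
    return mwc_rate(p["vmax"], f6p, p["K_f6p"], p["n"], p["L"],
                    inhibitors=[(pep, p["K_pep"])])


def v_ppc(pep, fbp, accoa, p=P_PPC):
    """Ppc: substrate PEP, activators FBP and AcCoA (cf. Eq. 12)."""
    return mwc_rate(p["vmax"], pep, p["K_pep"], p["n"], p["L"],
                    activators=[(fbp, p["K_fbp"]), (accoa, p["K_accoa"])])


print(v_pyk(0.2, 0.1), v_pyk(0.2, 1.0))            # rises with FBP
print(v_pfk(0.3, 0.1), v_pfk(0.3, 1.0))            # falls with PEP
print(v_ppc(0.2, 0.5, 0.05), v_ppc(0.2, 0.5, 0.5)) # rises with AcCoA
```

Other functional forms with the same monotonic dependence on the effectors would serve equally well here; the point of the sketch is only the sign of each dependence.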

As the glucose uptake rate increases, the TCA cycle flux tends to increase owing to the increased OAA and AcCoA, and NADH is then overproduced. The accumulated NADH allosterically inhibits CS and ICDH [125], forming a feedback regulation, and this results in AcCoA accumulation, which in turn causes acetate overflow metabolism. This enzyme level regulation by NADH in the TCA cycle can be verified by incorporating NADH oxidase (NOX) [126] or by using nicotinic acid [125], either of which activates the TCA cycle. This effect is enhanced in the arcA mutant [127]. In the long run, the expression of the TCA cycle genes is eventually repressed by the transcriptional regulation by cAMP-Crp toward the steady state, as will be explained later. The inhibitory effect of NADH on CS and ICDH may be expressed explicitly in the rate equations, but the NADH/NAD+ pool is difficult to estimate without detailed modeling of the respiratory chain, which is not easy at this stage.

The typical growth condition changes from glucose-rich to acetate-rich during batch culture. This requires a significant reorganization of the central metabolism from glycolysis to gluconeogenesis. Although the molecular mechanism underlying the metabolic transition from glucose to acetate has been extensively investigated in E. coli [128], its dynamics are poorly understood. Since it is critical for the cell to reprogram its metabolism quickly and efficiently under changing environmental conditions, the cell must have an elaborate management system.

Since the reaction rate for Ppc is expressed as a function of both FBP and AcCoA, as mentioned above, the ultrasensitive regulation of anaplerosis [129] can be simulated. Namely, after glucose depletion, the FBP concentration decreases, the Ppc and Pyk activities decrease in turn through allosteric regulation, and PEP consumption is almost completely turned off. As a result, the PEP concentration increases, and this buildup of PEP is maintained for a certain period [114], which may serve to take up glucose quickly by PTS if it becomes available again [129]. This mechanism is important for fed-batch culture operated with DO-stat or pH-stat, where carbon limitation often occurs periodically, because the carbon source can then be taken up quickly and efficiently without delay. Such a phenomenon can be simulated by the model mentioned above, in contrast to a model without the feed-forward regulation mechanism. This feed-forward regulation mechanism is also important for the modeling and simulation of lactic acid bacteria, where lactate dehydrogenase (LDH) as well as Pyk is activated by FBP, so that lactate is produced quickly and the pH around the cell is lowered as soon as glucose becomes available [130]. Although a kinetic model for lactic acid bacteria has been developed previously [131], the above mechanism is not incorporated, and thus the simulation result does not properly reflect the real characteristics.
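
The switch-like shutdown of PEP consumption can be illustrated with a small sketch in which both Pyk and Ppc are activated by FBP through a Hill function. The Hill form, the exponent `n`, the shared activation constant, and all numerical values are assumptions chosen only to show the qualitative effect; they are not parameters from [129] or [112].

```python
def hill(x, k_half, n):
    """Hill-type activation between 0 and 1."""
    return x**n / (k_half**n + x**n)


def pep_consumption(pep, fbp, v_max_pyk=10.0, v_max_ppc=2.0,
                    km_pep=0.3, ka_fbp=0.5, n=4):
    """Hypothetical PEP drain through Pyk and Ppc, both activated by FBP."""
    saturation = pep / (km_pep + pep)
    activation = hill(fbp, ka_fbp, n)
    return (v_max_pyk + v_max_ppc) * saturation * activation


fed = pep_consumption(pep=0.2, fbp=3.0)       # glucose available, FBP high
starved = pep_consumption(pep=0.2, fbp=0.05)  # glucose depleted, FBP low
print(f"PEP consumption falls by a factor of about {fed / starved:.0f}")
```

Because the drain on PEP collapses when FBP drops, a model with this kind of term lets PEP accumulate after glucose depletion, whereas a model with FBP-independent Pyk and Ppc rates would keep consuming PEP.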

Moreover, after glucose depletion, the FBP level drops and thus the Ppc activity decreases, while PEP carboxykinase (Pck) is upregulated by Cra, which is activated by the decreased FBP. This reveals the mechanism for avoiding the futile cycling caused by Ppc and Pck during the gluconeogenic phase [129], where ATP generation becomes important. During active glycolysis with enough sugar available, this futile cycling occurs and dissipates ATP, which may be regarded as the cost paid for the flexibility of the metabolic fluxes and the metabolic regulation [123]. This may be simulated by appropriate models that take into account both enzymatic and transcriptional regulations.

Now, the enzyme level regulation in the glycolysis by Pyk and Pfk through FBP and PEP, as mentioned above, keeps increasing the substrate uptake rate, which makes the system unstable. This is counterbalanced by the transcriptional regulation by cAMP-Crp: at a higher glucose consumption rate, the cAMP level decreases owing to the lower level of phosphorylated EIIA (EIIA-P) and the resulting lower activity of adenylate cyclase (Cya). Since ptsG, which encodes EIIBC of PTS, is under the control of cAMP-Crp, the glucose uptake rate is repressed by the lower level of cAMP-Crp (Figure 3b). Thus, the transcriptional regulation of PTS by cAMP-Crp must be incorporated into the model to realize such feedback regulation of the glucose uptake rate. The molecular mechanism of catabolite regulation has been illustrated by several researchers [132-136], and the PTS and catabolite regulation have been modeled by several researchers [5,109,111,137].
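
The stabilizing effect of this transcriptional loop can be sketched with a deliberately reduced toy model: the cAMP-Crp level is assumed to fall hyperbolically with the glucose uptake rate, and the PTS capacity is assumed to be proportional to cAMP-Crp through ptsG expression. Both relations, the function names, and the parameter values are hypothetical simplifications, not the formulation used in [112].

```python
def camp_crp(q, k_half=5.0):
    """Assumed cAMP-Crp level (0..1), falling as the uptake rate q rises."""
    return 1.0 / (1.0 + q / k_half)


def uptake_capacity(c, q_max=10.0):
    """Assumed PTS capacity, proportional to ptsG expression and hence to cAMP-Crp."""
    return q_max * c


def steady_uptake(q_max=10.0, k_half=5.0, tol=1e-10):
    """Bisection for the uptake rate at which capacity and uptake agree."""
    lo, hi = 0.0, q_max
    # capacity(q) - q is positive at q = 0, negative at q = q_max, and decreasing
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if uptake_capacity(camp_crp(mid, k_half), q_max) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(f"steady uptake with feedback: {steady_uptake():.3f} (q_max = 10.0)")
```

In this toy setting the loop settles at an uptake rate below the maximal capacity, which is the qualitative role the text assigns to cAMP-Crp; a realistic model would of course resolve EIIA-P and Cya explicitly.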

Moreover, the TCA cycle is transcriptionally repressed by the lower level of cAMP-Crp as the glucose consumption rate increases. In continuous culture (chemostat), the effect of the cell growth rate or the dilution rate on the metabolism can be appropriately simulated for E. coli [112]. Namely, as the specific growth rate increases, FBP increases owing to the increased fluxes of the upper glycolysis, and the Cra activity decreases. Moreover, the PEP concentration decreases as the PTS flux increases, since PEP is the co-substrate of PTS, and in turn EIIA-P decreases, resulting in a decrease in the Cya activity and the cAMP-Crp level. The decrease in cAMP-Crp causes the repression of the TCA cycle, and acetate overflow metabolism occurs as the specific growth rate increases. The trend of the simulation result reflecting the above mechanism [112] is similar to that of the experimental data [105] (Figure 5).

Figure 5. Comparison of the simulation results (a-c) [112] and the experimental data (d-f) [105,113].

In E. coli, acetate is formed from AcCoA by Pta-Ack and from pyruvate by pyruvate oxidase (Pox) [128]. Acetate can be metabolized to AcCoA either by the reversed reactions of Pta-Ack or by acetyl-coenzyme A synthetase (Acs). Acetate formation is known to be due to a metabolic imbalance, where the rate of AcCoA formation via glycolysis surpasses the capacity of the TCA cycle in E. coli [138]. Pox and Acs may be expressed as functions of the sigma factor RpoS (σ38), but the behavior of RpoS may be difficult to predict. Alternatively, Acs may be expressed as a function of cAMP-Crp, since Acs is activated by cAMP-Crp during the gluconeogenic phase [112].

Among intracellular metabolites, α-keto acids such as αKG turn out to be master regulators for catabolite regulation and the coordination of different regulations [139]. Namely, when a favored carbon source such as glucose is depleted, the αKG level falls, and cAMP increases to stimulate other carbon catabolic machinery. In other words, when a preferred carbon source such as glucose is abundant, the cell growth rate is higher with a lower cAMP level, whereas if it is scarce, the cell growth rate declines with a higher cAMP level. In particular, under nitrogen (N)-limitation, αKG accumulates owing to the decreased activity of glutamate dehydrogenase (GDH) and inhibits carbon assimilation, since there is less need for carbon-catabolic enzymes and more demand for those involved in the assimilation of the limiting nutrient. On the other hand, when an anabolic nutrient such as ammonia is in excess, the αKG concentration decreases owing to the activated GDH, which produces glutamate (Glu) from αKG, the cAMP level increases, and the carbon catabolic enzymes increase to accelerate carbon assimilation. Namely, αKG coordinates the carbon catabolite (C)-regulation and N-regulation by inhibiting EI of PTS [140] or cAMP production via Cya [58,141]. Moreover, the physiological function of cAMP signaling goes beyond simply enabling the hierarchical utilization of carbon sources, as will be mentioned later, and also controls the function of the proteome [139,142]. In order to model such phenomena, EI of PTS has to be expressed with inhibition by αKG, or Cya has to be expressed with inhibition by keto acids such as OAA and PYR as well as αKG; the modeling of nitrogen regulation will be mentioned later.
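
The two modeling options named above can be written down as simple inhibition factors. The sketch below assumes non-cooperative inhibition terms and arbitrary inhibition constants; the function names and all values are hypothetical and serve only to show where αKG and the other keto acids would enter the rate expressions.

```python
def ei_activity(akg, ki_akg=1.0):
    """Fraction of EI activity remaining under inhibition by alpha-KG."""
    return 1.0 / (1.0 + akg / ki_akg)


def cya_activity(eiia_p, akg, oaa, pyr,
                 ki_akg=1.0, ki_oaa=1.0, ki_pyr=1.0):
    """Cya activity: stimulated by EIIA-P (0..1), inhibited by keto acids."""
    inhibition = 1.0 + akg / ki_akg + oaa / ki_oaa + pyr / ki_pyr
    return eiia_p / inhibition


# Illustrative alpha-KG levels: low when ammonia is in excess, high under N-limitation
for label, akg in (("ammonia excess", 0.2), ("N-limitation", 2.0)):
    ei = ei_activity(akg)
    cya = cya_activity(eiia_p=0.5, akg=akg, oaa=0.1, pyr=0.3)
    print(f"{label}: EI activity {ei:.2f}, Cya activity {cya:.2f}")
```

In both expressions a higher αKG pool lowers the carbon uptake machinery or the cAMP signal, which is how a single metabolite pool can couple the C- and N-regulations in a kinetic model.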

In the case of biofuels production from cellulosic biomass, the hydrolyzed biomass contains multiple sugars, and those sugars are selectively assimilated with catabolite repression depending on the type of microorganism used [143,144]. The metabolic regulation differs depending on the carbon sources used.

Glycerol has recently attracted attention for the production of biofuels and biochemicals, since it is a by-product of biodiesel production [145-149]. In E. coli, glycerol is transported and phosphorylated to produce dihydroxyacetone phosphate (DHAP) of the central metabolism via the pathway encoded by glpF, glpK, and glpD, where ATP (or in certain cases PEP) is used for the phosphorylation in the glycerol kinase (GlpK) reaction, while NADH is produced in the glycerol 3-phosphate dehydrogenase (G3PDH) reaction under aerobic conditions (Figure 6). These genes are under catabolite regulation by cAMP-Crp, so that, if glucose co-exists, glycerol is assimilated after glucose is depleted. NADH production at G3PDH becomes important for biofuels production under anaerobic conditions, since it affects the NADH/NAD+ balance for the dehydrogenase reactions. When glycerol is used as a single carbon source, cAMP-Crp increases owing to the increase in the phosphorylated EIIAGlc, and cAMP-Crp induces the glpFKD genes. Since the FBP concentration decreases when glycerol is used as a carbon source, Cra is activated, and this, together with the upregulation of cAMP-Crp, causes the pckA gene as well as the TCA cycle genes to be upregulated [150,151]. Kinetic expressions for the glycerol uptake pathways have been proposed [152]. These, together with the inclusion of PTS and the transcriptional regulation mentioned above, enable the simulation of growth on multiple carbon sources such as glucose and glycerol. Moreover, the enhancement of the TCA cycle caused by the increase in cAMP-Crp can also be simulated for the case of glycerol as a single carbon source.

Figure 6. Glycerol-, fructose-, and xylose-assimilating pathways (a,b).

When fructose is used, it is transported by the fructose-PTS, which has its own HPr-like protein domain called FPr. Namely, the phosphate of PEP is first transferred to EI (as EI-P), but this phosphate is then transferred to FPr instead of HPr, and in turn via the fructose-specific EIIAFru and EIIBCFru to fructose, which becomes fructose 1-phosphate (F1P) [153]. The fruBKA operon is under the control of cAMP-Crp, and thus glucose is preferentially consumed by the glucose-PTS when glucose co-exists. On the other hand, this operon is repressed by Cra [154]. Because of this, cra gene knockout enables the co-consumption of glucose and fructose, with fructose consumed faster than glucose [155], where the activated FruB in the cra mutant competes with HPr (for glucose phosphorylation) for the phosphate of EI-P. Since the phosphorylation of EIIAGlc via HPr becomes lower [156], the glucose uptake rate decreases as compared to the wild-type strain [155]. This phenomenon may also be simulated by an expression similar to that for the glucose-PTS, in which the fructose-PTS competes with the glucose-PTS for the phosphate of EI-P (Figure 6).
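
The competition for the phosphate of EI-P can be sketched as a proportional split between the two acceptors. The sketch assumes that each acceptor draws phosphate in proportion to its abundance times a rate constant; the function name, the rate constants, and the relative FPr levels assigned to the wild type and the cra mutant are hypothetical.

```python
def ei_phosphate_split(hpr, fpr, k_hpr=1.0, k_fpr=1.0):
    """Shares of the EI-P phosphate flux going to HPr (glucose) and FPr (fructose)."""
    to_glucose = k_hpr * hpr
    to_fructose = k_fpr * fpr
    total = to_glucose + to_fructose
    return to_glucose / total, to_fructose / total


wild_type = ei_phosphate_split(hpr=1.0, fpr=0.2)   # fruBKA repressed by Cra
cra_mutant = ei_phosphate_split(hpr=1.0, fpr=2.0)  # fruBKA derepressed
print(f"glucose share, wild type: {wild_type[0]:.2f}, cra mutant: {cra_mutant[0]:.2f}")
```

A larger FPr pool in the cra mutant lowers the share of phosphate reaching HPr, which is consistent in direction with the reduced glucose uptake rate reported for that mutant.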

When xylose is used as a carbon source, it is transported either by an ATP-dependent high-affinity ABC transporter encoded by xylFGH or by an ATP-independent low-affinity proton symporter encoded by xylE [157,158]. In xylose utilization, the transcription factor XylR regulates xylAB/xylFGH [159], where xylR is under the control of cAMP-Crp, so that catabolite repression occurs when glucose co-exists and glucose is consumed first. A kinetic model for the xylose uptake pathways as well as the Entner-Doudoroff (ED) pathway has been proposed for Zymomonas mobilis [160]. In addition, it is necessary to incorporate the activation of the transporter by cAMP-Crp, which allows the catabolite repression in the presence of glucose to be represented [112] (Figure 6).

Modeling for the peripheral metabolism

Although it is critical to consider the main metabolism for the metabolic regulation as well as for the cell growth rate, the peripheral metabolic pathways become important for practical applications such as amino acid fermentation.

The kinetic model for lysine synthetic pathways from OAA in the TCA cycle has been proposed [161], and this can be used to apply sensitivity analysis such as metabolic control analysis (MCA) to identify the limiting pathways in Corynebacterium glutamicum [162]. This investigation revealed that lysine production is primarily controlled by aspartokinase (Ask) and lysine permease. This was verified by the experiment using the recombinant strain overexpressing Ask, resulting in the significant increase in lysine production, although that flux did not increase as much as would be expected by MCA [162].
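
To make the idea of MCA concrete, the sketch below estimates flux control coefficients numerically for a toy two-step pathway S -> X -> P with rate laws v1 = e1 (s - x) and v2 = e2 x. The pathway, the rate laws, and the enzyme levels are assumptions chosen so that the result can be checked by hand; this is not the lysine model of [161,162].

```python
import math


def steady_state_flux(e1, e2, s=1.0):
    """Flux of the toy pathway S -> X -> P with v1 = e1*(s - x) and v2 = e2*x."""
    x = e1 * s / (e1 + e2)   # intermediate level at which v1 equals v2
    return e2 * x


def flux_control_coefficient(flux, enzymes, index, rel_step=1e-6):
    """Central-difference estimate of d ln J / d ln e_index."""
    up, down = list(enzymes), list(enzymes)
    up[index] *= 1.0 + rel_step
    down[index] *= 1.0 - rel_step
    d_ln_flux = math.log(flux(*up)) - math.log(flux(*down))
    d_ln_enzyme = math.log(1.0 + rel_step) - math.log(1.0 - rel_step)
    return d_ln_flux / d_ln_enzyme


enzymes = (1.0, 4.0)
coefficients = [flux_control_coefficient(steady_state_flux, enzymes, i)
                for i in range(len(enzymes))]
print([round(c, 3) for c in coefficients])
```

For these values the analytical coefficients are e2/(e1 + e2) = 0.8 and e1/(e1 + e2) = 0.2, so the less abundant first enzyme holds most of the control, and the coefficients sum to one as required by the summation theorem of MCA. The same finite-difference procedure applies to a full kinetic model once its steady-state flux can be computed.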

Shikimic acid production and aromatic amino acid production may also be simulated based on the formation of precursor metabolites such as erythrose 4-phosphate (E4P) and PEP in the central metabolism [111]. Other amino acid fermentations may also be simulated using dynamic metabolic models [111,137,163].

Modeling for the metabolism under oxygen limitation

Most biofuel and biochemical production by microbes is carried out by fermentation under anaerobic conditions, and thus it is important to properly model such fermentation as well as the metabolism under aerobic conditions, where the latter is often employed to enhance the cell growth before the anaerobic phase in order to improve the productivity of the target metabolites.

Although models and computer simulations of microbial cells cultivated under anaerobic conditions, such as lactate fermentation [131] and acetone-butanol-ethanol fermentation [164-166], have been proposed by several researchers, the regulatory mechanisms are rarely incorporated. Moreover, cofactor balances such as the NADH balance become important under anaerobic conditions, and thus they should be incorporated appropriately in the model equations. However, this is not easy without proper modeling of the respiratory pathways under different oxygen concentrations.

In order to properly model the metabolic transition from aerobic to anaerobic conditions, the roles of global regulators such as ArcA/B and Fnr must be properly incorporated. The effect of the dissolved oxygen concentration on the activation of these TFs has been reported [167,168], and this may be taken into account for the simulation under microaerobic conditions. In particular, it is important to properly simulate the behavior at the branch point of PYR, where the reaction rate through PDH, v_PDH, must be a negative function of ArcA (or phosphorylated ArcA, ArcA-P), while the reaction rate through pyruvate formate lyase (Pfl), v_Pfl, is a positive function of ArcA and Fnr, where PYR is converted to formate (FOR) and AcCoA (Figure 7). Moreover, the ethanol-forming pathway from AcCoA through alcohol dehydrogenase (ADH) must be activated, where NADH is required for this reaction. As for the TCA cycle, α-ketoglutarate dehydrogenase (KGDH) may be repressed by ArcA, while fumarate reductase (Frd) is activated by Fnr, and thus the TCA cycle becomes branched under anaerobic conditions. Some attempts have been made to estimate the fluxes in relation to such global regulators [169].
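
The sign conventions at the PYR branch point can be sketched as follows, with the active fractions of ArcA-P and Fnr treated as inputs between 0 and 1. The linear dependence on the regulators, the function names, and all parameter values are assumptions for illustration; how the active fractions depend on dissolved oxygen is not modeled here.

```python
def pyruvate_branch(pyr, arca_p, fnr, v_max_pdh=10.0, v_max_pfl=10.0,
                    km_pyr=0.5, repression=0.9, basal=0.05):
    """Hypothetical PDH and Pfl rates; arca_p and fnr are active fractions (0..1)."""
    saturation = pyr / (km_pyr + pyr)
    # PDH: negative function of ArcA-P
    v_pdh = v_max_pdh * saturation * (1.0 - repression * arca_p)
    # Pfl: positive function of both ArcA-P and Fnr
    v_pfl = v_max_pfl * saturation * (basal + (1.0 - basal) * 0.5 * (arca_p + fnr))
    return v_pdh, v_pfl


def pfl_share(pyr, arca_p, fnr):
    """Fraction of the PYR-to-AcCoA flux that passes through Pfl."""
    v_pdh, v_pfl = pyruvate_branch(pyr, arca_p, fnr)
    return v_pfl / (v_pdh + v_pfl)


for label, level in (("aerobic", 0.0), ("microaerobic", 0.5), ("anaerobic", 1.0)):
    print(f"{label}: Pfl share {pfl_share(1.0, level, level):.2f}")
```

As the regulators become active, the flux shifts from PDH to Pfl, which is the qualitative behavior the model equations have to reproduce at this branch point.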

Figure 7. Anoxic regulation of the metabolic pathways by ArcA and Fnr.

Modeling for the nitrogen regulation

Next to carbon (C) catabolite regulation, nitrogen (N) regulation is also important [170], and silicon-cell models have been developed, based on experimental kinetic data for the enzymes involved, that predict the flux of assimilation of extracellular ammonia into glutamate in E. coli.

Glutamate (Glu) and glutamine play key roles in cellular metabolism and serve as precursors of protein synthesis. Glutamate can be synthesized by two different pathways: a single-step reaction from αKG catalyzed by glutamate dehydrogenase (GDH), and the pathway via glutamine synthetase (GS) and glutamate synthase (GOGAT) [171]. GS is active at low ammonium concentrations, while GDH is active at higher ammonium concentrations, where GS has a higher affinity for ammonia than GDH (Km = 0.1 and 1.1 mM, respectively) [172,173]. The activity of GS is controlled by PII, which acts in response to the concentrations of glutamine and αKG [174,175]. AmtB is used for the transport of NH3 when the ammonia concentration is lower than 0.05 μM [176], while AmtB is blocked when it is higher than 50 μM.
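
The consequence of the two Km values quoted above can be seen from the Michaelis-Menten saturation term alone. The sketch below uses only those two Km values from the text; the ammonia concentrations are arbitrary illustrative choices, and the covalent regulation of GS and the other substrates of both enzymes are deliberately ignored.

```python
KM_GS_MM = 0.1    # Km of GS for ammonia in mM, as quoted in the text
KM_GDH_MM = 1.1   # Km of GDH for ammonia in mM, as quoted in the text


def saturation(ammonia_mm, km_mm):
    """Michaelis-Menten saturation, i.e. the fraction of v_max that is reached."""
    return ammonia_mm / (km_mm + ammonia_mm)


for ammonia_mm in (0.05, 0.5, 5.0):   # illustrative concentrations in mM
    gs = saturation(ammonia_mm, KM_GS_MM)
    gdh = saturation(ammonia_mm, KM_GDH_MM)
    print(f"NH4+ = {ammonia_mm} mM: GS at {gs:.2f}, GDH at {gdh:.2f} of v_max")
```

At the lowest concentration GS still runs at about a third of its maximal rate while GDH is below a twentieth, whereas both enzymes approach saturation at the highest one. This affinity difference is only one part of the picture, since the switch between the pathways also depends on the regulation of GS described next.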

The inactivation (adenylylation) and activation (deadenylylation) of GS depend on the C/N ratio, such as the αKG (C)-to-glutamine (N) ratio. When the C/N ratio is high, GS is deadenylylated and thus active; otherwise, GS is adenylylated. Sensing of the C/N ratio for GS adenylylation involves the protein PII, which exists in two forms, the uridylylated PII-UMP and the deuridylylated PII. The uridylylation/deuridylylation of PII, catalyzed by the UT/UR enzyme, is regulated by glutamine (Figure 8).

Figure 8. Ammonia assimilation pathways.

The model developed by Bruggeman et al. [173] combined metabolic regulation with signal transduction through the covalent modification of PII and GS by uridylyl transferase (UTase) and adenylyl transferase (ATase). It shows that the regulation is distributed between these two modes of regulation. However, the model may be incomplete in the sense that the αKG pool size was assumed to be constant, whereas it changes significantly during nitrogen perturbations and is not only the substrate for ammonia assimilation but also a regulator of the GS covalent modification cascade [177]. Moreover, it is important to capture the interdependence of metabolite pools and growth, since the pool size of αKG affects the glucose uptake by inhibiting EI of PTS [140].

In order to see the effect of ammonium assimilation, the main metabolic pathway must be considered. Such a model involves the GDH, GS, and GOGAT pathways together with the nitrogen regulation mechanism. At present, several kinetic models have been proposed for ammonium assimilation [173,176,178], but the relationship between the cell growth rate and the NADPH production rate in relation to ammonium assimilation has hardly been analyzed. Moreover, it is highly desirable to combine the models for catabolite regulation and nitrogen regulation in order to simulate the coordination between C- and N-regulations via the dynamic behavior of the intracellular metabolite αKG.

Conclusions

A model need not be complete to improve predictions or rationalizations. The uncertainty of the model or of the simulation result comes from the uncertainty of the model parameters, from the uncertainty of the model structure due to ambiguity in the selection of rate laws, or from the neglect of regulation mechanisms, cofactor balances, etc. The importance of these factors depends on the strains and culture conditions. Modeling is useful for understanding cellular mechanisms, and the effort toward a model of the whole cell metabolism could well pay off. An appropriate model will be of immense value to the following:

  • Biotechnologists aiming to improve fermentation performances such as the yield and productivity of the target metabolite,

  • Microbial engineers aiming to design novel microbes able to capture available carbon and produce bio-fuels and biochemicals,

  • Basic scientists aiming to understand the metabolic regulation system in the living organisms, which can be used for the synthetic biology, and

  • Systems biologists aiming to advance the science of modeling.

The importance of the modeling approach will greatly exceed that of the microbial genome sequencing projects, as it comes much closer to understanding biological function and will have widespread practical applications.

The present article stresses the importance of appropriately incorporating the enzyme level and transcriptional regulations in the kinetic model in order to predict the cell's growth characteristics under environmental and/or genetic perturbations. The drawback of kinetic modeling is that the number of kinetic parameters increases as the system becomes large, and thus it may be difficult to expand the model to the genome scale. A reasonable approach may be to apply kinetic modeling only to the main metabolism and to use simplified models for the peripheral metabolism.

Moreover, it is quite important to combine the catabolite regulation model with the nitrogen regulation model to represent the coordination between C- and N-regulations, where the intracellular pool sizes of α-keto acids play important roles by affecting PTS and the cAMP level.

The simulation results based on the developed model must be verified by experiments, and the simulation results may in turn give hints for the design of additional experiments. In this way, the modeling approach together with experimental work contributes to innovation in the efficient design of cell factories for biofuels and biochemical production.