Multi-timescale analysis of a metabolic network in synthetic biology: a kinetic model for 3-hydroxypropionic acid production via beta-alanine

A biosustainable production route for 3-hydroxypropionic acid (3HP), an important platform chemical, would allow 3HP to be produced without using fossil fuels. We are interested in investigating a potential biochemical route to 3HP from pyruvate through β\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\beta $$\end{document}-alanine and, in this paper, we develop and solve a mathematical model for the reaction kinetics of the metabolites involved in this pathway. We consider two limiting cases, one where the levels of pyruvate are never replenished, the other where the levels of pyruvate are continuously replenished and thus kept constant. We exploit the natural separation of both the time scales and the metabolite concentrations to make significant asymptotic progress in understanding the system without resorting to computationally expensive parameter sweeps. Using our asymptotic results, we are able to predict the most important reactions to maximize the production of 3HP in this system while reducing the maximum amount of the toxic intermediate compound malonic semi-aldehyde present at any one time, and thus we are able to recommend which enzymes experimentalists should focus on manipulating.


Introduction
3-hydroxypropionic acid (3HP) can be used to produce many other valuable chemicals, such as acrylic acid, 1,3-propanediol, and biodegradable polyesters (Werpy et al. 2004;Jiang et al. 2009). 3HP can be derived from biological sources and, as current industrial methods to produce acrylic acid involve fossil fuels, a production route through 3HP provides a biosustainable alternative. Although an in vivo production line in a bacterial or fungal host is the eventual industrial target for 3HP, performing in vitro experiments is an important prior step. The controlled nature of in vitro experiments allows for more systematic deductions to be made, and hence in vitro experiments can highlight potential roadblocks that may be more difficult to analyse in a complex in vivo system. Mathematical modelling allows for further systematic progress to be made, and can reduce the experimental parameter space that needs to be searched.
In Kumar et al. (2013), three thermodynamically feasible pathways from pyruvate to 3HP are suggested. We are interested here in mathematically modelling the route through aspartate and β-alanine; this is the synthetic pathway successfully introduced to Saccharomyces cerevisiae in Borodina et al. (2015). Thus, we consider the reactions Oxaloacetate + GLU β-alanine + Pyruvate + BAPAT Malonic semialdehyde + HPDH and we show a schematic representation of this pathway in Fig. 1. Here, we do not consider enzyme mechanics for the first two reactions in (1). This is because pyruvate and oxaloacetate are required for the citric acid cycle, and we are not interested in altering any enzymatic processes that may significantly change this important metabolic cycle. Thus, the enzyme kinetics for the latter four reactions are the most appropriate targets for enhancement, and we therefore include these.
We note that (1e), the reaction between l-alanine and pyruvate, is not necessary for a connected pathway between pyruvate and 3HP. However, as (1e) recycles lalanine and α-ketoglutarate (AKG) back into pyruvate and glutamic acid (GLU), used in (1a) and (1b), respectively, the inclusion of (1e) may enhance the efficiency of the , and the dotted lines denote reactions which additionally require AKG and produce GLU. If we consider the enzyme mechanics for a reaction, we include the name of that enzyme next to the reaction arrow pathway. In Borodina et al. (2015), it was hypothesized that (1e) may have an effect on the overall 3HP production, and we investigate this hypothesis here. More generally, we are interested in determining which of the reactions are the most important for 3HP production. Additionally, malonic semialdehyde is a toxic compound, and so we will also look for ways to reduce the amount of malonic semialdehyde present in the system. To summarize, the general goal of this system, so far as possible, is to maximize the 3HP produced whilst minimizing the maximum level of malonic semialdehyde present in the system. We will assume that there are enough molecules of each metabolite to use the law of mass action and that the entire system is well-mixed and thus spatially independent. Moreover, in any given reaction, we neglect any inhibitory effects from metabolites not involved in the reaction and from any products of the reaction. We do so to limit the number of parameters in the model, and to enable us to make analytic progress, thus allowing us to develop physical insight into the system. Additionally, the production levels of pyruvate within the system will vary in time. We consider two limiting cases, pyruvate being a finite resource which is never replenished in the first, but an infinite resource held at a constant concentration in the second. Whilst the actual levels of pyruvate will, in practice, fall somewhere between these two cases, it is useful to analyse both in detail: exploring and comparing these two extreme cases will provide insight into the general pathway kinetics in intermediate cases.
As there are many parameters involved in this problem, a fully experimental approach would be very time consuming. For the same reason, investigating its mathematical model using a purely numerical approach would also be a protracted process, albeit shorter than the fully experimental one. We perform an asymptotic analysis (see, for example, Kevorkian and Cole 2013;Hinch 1991;O'Malley Jr (2012)) to investigate our model, thus enhancing our physical insight into the underlying system and enabling us to determine how the concentrations vary as functions of the experimental parameters. Due to this approach, exact knowledge of each kinetic parameter is not required, just an estimate of the order of magnitude of each parameter.
We introduce a mathematical model to describe the nonlinear reaction kinetics in §2. We solve this for non-replenished and continuously replenished pyruvate in §3 and §4, respectively, where we give both numerical and asymptotic solutions to describe the system behaviour. We also consider the problem of general pyruvate production in "Appendix C", where we derive asymptotic solutions in one time regime for the metabolite concentrations at leading order in terms of the pyruvate in the system. We finish by discussing our results and comparing the two regimes in §5, where we also suggest future avenues of inquiry that follow from this work.

Model description
We assume that the reaction kinetics are governed by the law of mass action, yielding the system The dimensional metabolite concentrations are denoted with square brackets and have units of moles per volume, as does S 0 All the variables introduced here are defined in Table 1, and the units and typical values for each kinetic parameter are given in Table 2. We model a scenario where each of the enzymes are introduced to a solution containing only pyruvate and glutamic acid, and this resulting mixture is instantaneously well-mixed. We assume that the initial concentration of pyruvate is [S 1 ](0) = S 0 , where S 0 is around 1 mM. We will state the remaining initial conditions after discussing how we deal with uncertainty in the kinetic parameters. We see in Table 2 that estimates for the kinetic parameters can vary over three orders of magnitude between different organisms. To investigate how the system behaves as these parameters vary we first form dimensionless variables, scaling each dimensional metabolite concentration with S 0 and time with 1/k 1 (the characteristic time of the first reaction between pyruvate and oxaloacetate). Moreover, we form dimensionless parameters by scaling the kinetic parameters with k 1 and the appropriate power of S 0 . We then introduce the artificial small dimensionless parameter ε = 10 −2 , and allow each dimensionless system parameter to be written as cε j where c is an O(1) parameter (between 0.1 and 10), and j is an integer. The dimensionless parameters in our system are given in the lower half of Table 2. This approach allows us to interrogate the system using an asymptotic analysis. Although (as always) there may theoretically be an issue with this method in equating terms with the same powers of ε when large (or small) O(1) parameters are multiplied together, we will show that our asymptotic Pyrococcus furiosus e 1 × 10 −2 -2 × 10 0 mM −2 s −1 e,l k −7 = 10 2 s −1 Pyrococcus furiosus e 5 × 10 0 -6 × 10 2 s −1 e,l k 8 = 10 2 s −1 Pyrococcus furiosus e 5 × 10 0 -6 × 10 2 s −1 e,l k −8 = 4 × 10 −2 mM −2 s −1 Pyrococcus furiosus e 1 × 10 −2 -1 × 10 0 mM −2 s −1 e,l k 9 = 10 2 mM −1 s −1 Metallosphaera sedula f 5 × 10 1 -2 × 10 2 mM −1 s −1 f k −9 = 5 s −1 Metallosphaera sedula f 1 × 10 0 -1 × 10 1 s −1 f k 10 = 30 s −1 Metallosphaera sedula f 1 × 10 0 -1 × 10 2 s −1 f,m Dimensionless parameter k 5 = k 5 a 2 S 2 0 /k 1 = 4a 2 k −5 = k −5 a 2 ε 2 /k 1 = 0.8a 2 k 6 = k 6 a 2 ε/k 1 = 1.5a 2 k 7 = k 7 a 4 S 2 0 /k 1 = 3a 4 k −7 = k −7 a 4 ε 2 /k 1 = a 4 k 8 = k 8 a 4 ε 2 /k 1 = a 4 k −8 = k −8 a 4 S 2 0 /k 1 = 4a 4 k 9 = k 9 a 3 S 0 ε 2 /k 1 = a 3 Table 2 continued Dimensionless parameter k −9 = k −9 a 3 ε/k 1 = 5a 3 k 10 = k 10 a 3 ε 2 /k 1 = 0.3a 3 In general, it is only ratios of kinetic parameters that are known for a given reaction, as substrate binding to an enzyme is too quick to accurately measure the reaction rate. We therefore choose parameter values (using the ratios given in the references) that lead to the enzyme complexes being formed over a time period that is much shorter than the time period over which the metabolite concentrations change. This, experimental uncertainty, and differences in enzymes in different organisms lead to a large possible range for the parameter values. In scaling the kinetic parameters with S 0 and powers of ε, we have used the values S 0 = 1mM, ε = 10 −2 . The values of k 1 and k 2 are chosen to be effective reaction rates as we are not interested in modifying native enzymes. Thus, k 1 is the ratio of the PYC turnover number to the Michaelis constant for pyruvate multiplied by an estimated concentration of PYC, chosen to be around 1 µM. k 2 is the ratio of the AAT turnover number to the product of the Michaelis constants for oxaloacetate and GLU, multiplied by an estimated concentration of AAT, chosen to be around 0.1 µM. The references are: a Branson et al. (2002), b Nobe et al. (1998), c Ramjee et al. (1997, d Nakano et al. (1977), e Ward et al. (2000), f Kockelkorn and Fuchs (2009), g Jitrapakdee et al. (2007), h Yagi et al. (1982), i Chopra et al. (2002), j Williamson and Brown (1979), k Hayaishi et al. (1961), l Umemura et al. (1994), m Berg et al. (2007) and numerical results show excellent agreement, and thus the approach is reliable for this system.
For the initial conditions, we assume that the initial levels of pyruvate and of glutamic acid are of comparable size and, to mimic the environment within a cell, that these concentrations are much larger than the initial concentrations of each enzyme. Additionally, we assume that the initial concentrations of each enzyme are of comparable size. Although enzyme-to-substrate levels will vary, a reasonable assumption is that this ratio is around 1:100 and thus of O(ε) for our problem (Albe et al. 1990). Thus, recalling that [S 1 ](0) = S 0 , we also have [R 1 ](0) = αS 0 , [E 1 ](0) = εa 1 S 0 , [E 2 ](0) = εa 2 S 0 , [E 3 ](0) = εa 3 S 0 , and [E 4 ](0) = εa 4 S 0 , noting that the remaining metabolite concentrations start at zero. The dimensionless parameters α, a 1 , a 2 , a 3 , and a 4 are of O(1). These conditions will allow us to investigate how the system changes as the initial conditions of the system vary when we perform our asymptotic analysis. From the initial conditions and (2i-2p), we can immediately deduce that where i = 1, 2, 3, 4.
As we now have our dimensionless variables in terms of powers of ε, we can scale each concentration variable with the appropriate power of ε to obtain the correct leading-order system for t = O(1). This will allow us to immediately perform an asymptotic analysis for small ε. We give the correct scalings for these variables in Table 1, obtained by considering a dominant balance for each equation in the system and the non-zero initial conditions discussed above. We will perform a similar procedure to determine the correct asymptotic scalings over longer timescales, for which we will use effective 'initial' conditions obtained by asymptotic matching between timescales. The dimensionless form of (3) yields the relationships E 1 = 1 − εC 1 , E 2 = 1 − ε 3 C 2 , E 3 = 1 − ε 4 C 3 , and E 4 = 1 − ε 2 C 4 . Replacing each instance of E i (for i ∈ {1, 2, 3, 4}) with the appropriate function of C i allows us to reduce the dimension of the system of ODEs, accounting for (2i-2l). The remaining dimensionless system iṡ where the dimensionless parameters are defined in Table 2 and an overdot denotes d/dt. In the following two sections, we solve the system presented in (4) for nonreplenished and constantly replenished pyruvate, respectively. The non-replenished case is governed by (4), and the replenished case by (4b-m), with (4a) replaced by S 1 ≡ 1. For future reference, we note the immediate consequence of (4c) and (4i), thatṠ Numerical solutions in the non-replenished pyruvate case for the dynamic concentrations of a malonic semialdehyde and b 3HP. We use the parameter values given in Table 2 for all simulations, and initial dimensionless enzyme concentrations of 1 unless specified in the legend. The solid black line corresponds to the reference simulation where a i = 1 for i = 1, 2, 3, 4 3 Non-replenished pyruvate

Numerical results
We solve the system (4) numerically, using ode15s in MATLAB. Our asymptotic solutions will reveal that the system is stiff, and our choice of numerical approach reflects this. We use the parameter values given in Table 2, and see that there are two important timescales, where t = O(1) and t = O(1/ε) (recalling that ε = 10 −2 ), respectively ( Fig. 2). Over each of these timescales, the levels of malonic semialdehyde rise then fall, and the levels of 3HP rise to a plateau. Additionally, the levels of malonic semialdehyde and 3HP can vary significantly when the initial concentration of different enzymes is increased (over-expression). We model the effect of over-expressing a particular enzyme by starting the simulation with twice the levels of that enzyme compared to the black reference line in Fig. 2. We emphasize that the parameter values we give are not likely to be exact but, as we will later derive solutions to our system as functions of these parameters, we will be able to determine which reactions are the most important to the system by analyzing the form of these solutions without knowledge of the exact values that these parameters take. We now discuss how the over-expression of a given enzyme affects the system. A schematic for the metabolic network is given in Fig. 1. Increasing the levels of PAND has a positive but small effect on the levels of both malonic semialdehyde and 3HP. Over-expressing BAPAT has a much larger effect, significantly increasing the levels of both malonic semialdehyde and 3HP. In contrast, we see that over-expressing HPDH does not appear to have any effect on the levels of 3HP and, in fact, significantly decreases the maximum amount of malonic semialdehyde in the system. Finally, we see that over-expressing ALT has a very small negative effect on the levels of malonic semialdehyde, and a significant positive effect on the levels of 3HP.
In the next section, we explore the system (4) using an asymptotic analysis. This will allow us to determine how the system depends on the dimensionless parameters, and thus to explore the experimental parameter space without resorting to computationally expensive parameter sweeps.

Asymptotic structure
We now investigate the system (4) by exploiting the small parameter, ε, using an asymptotic analysis. There are two main asymptotic regions of interest in time, where t = O(1) and t = O(1/ε), which we refer to as medium and long time, respectively (Fig. 2).
When t = O(1), the system is governed by the release of pyruvate and GLU through the main pathway to 3HP and towards the production of l-alanine. When t = O(1/ε), the levels of pyruvate have diminished, and the system is governed by the degradation of oxaloacetate with GLU into aspartate and AKG. Additionally, the l-alanine produced when t = O(1) is now converted back into pyruvate which, although only a small amount, is enough to have a leading-order effect on the reaction with β-alanine, and thus a leading-order effect on the amount of 3HP produced. Thus, the l-alanine acts as a store of pyruvate over t = O(1), and this is released over In the next section, we solve the leading-order versions of the system (4) in each of these asymptotic regions.

Asymptotic solutions
The early-time behaviour of the system is given in "Appendix A", obtained by Taylor expanding the initial conditions as t → 0 + . The more interesting behaviour for malonic semialdehyde and 3HP production occurs for medium and long time, and we now consider the system dynamics over these timescales in turn.
We solve the leading-order system (6) explicitly as ε → 0, and the solutions are given by where we define the parameters From Table 2, we see that the parameter ω =k 3k4 /k −3 is associated with (1c), the reaction from aspartate to β-alanine. That is, ω is proportional to both the ratio of the forward to backward reaction rate constants in (1c) and to the initial concentration of PAND, the enzyme that controls (1c). Thus, from the prevalence of ω in the solution (7), we see that (1c) affects the timing of the system when t = O(1), and the other reactions control the concentration levels of the metabolites. For later analysis, we additionally note that the parameter groupingk 5k6 /k −5 is associated with (1d), the reaction from β-alanine and pyruvate to l-alanine and malonic semialdehyde. We note there is an implicit assumption that ω = 1 in writing (7), but the apparent singularities are removable in this limit and, as such, they do not change the nature of the solutions. We see that these solutions are excellent approximations to the numerical solutions (dashed lines in Fig. 3).
Whilst the solutions (7) do satisfy the given initial conditions, several were not formally imposed in the solution derivation. This is because the system (4) is singular  (7), dash-dotted lines represent the early-time solutions given in "Appendix A", and dotted lines represent the late time solutions given in (11, B2), and (B4). We use parameter values α = 0.5, a 1 = 1, a 2 = 1, a 3 = 1, and a 4 = 1. We have split the metabolite concentrations into a the substrates that tend to a non-zero constant value for large time, b the substrates which tend to zero for large time, c the enzyme complexes, and d the product. We plot Although we could formally match the solution (7) with an early-time solution, it is not useful for our eventual goal of determining 3HP production. Instead, we show that there are further early-time regions by Taylor expanding the initial conditions as t → 0 + in "Appendix A". We see that these early-time solutions are excellent approximations to the numerical solutions (dash-dotted lines in Fig. 3).

Late time: t = O(1/ε)
The more interesting behaviour occurs for longer time. The solutions for S 4 and R 2 , given in (7d) and (7h), respectively, appear to be unbounded. As we start with a finite level of nonreplenishable pyruvate, this is unphysical and suggests that there are further dynamics at play. We can obtain the correct scaling by looking for a change in leading-order terms in the system (4), which occurs when t = O(1/ε). To investigate this, we introduce t = T /ε, where T = O(1), and we make the asymptotic scalings , and S 1 = ε 2 S 1 . The long-term version of the system (4) is and (5) becomes The leading-order version of (9) is the differential-algebraic system Now the system is controlled by the nonlinear dynamics of S 2 and R 1 , and the solutions to the leading-order system (10) are where Ω =k 7k8 /(k 2 (k −7 +k 8 )), which is proportional to the initial concentration of ALT, the enzyme controlling (1e). Thus, Ω is associated with the reversible reaction between l-alanine and pyruvate, and terms that are raised to the power of Ω arise due to this reaction. In contrast, the terms not raised to the power of Ω in the integrand of (11m) arise due to the conversion of pyruvate into β-alanine.
To match the long-term solutions (11) with the t = O(1) solutions (7), we would have to consider intermediate matching regions where exponentially decreasing terms balance algebraically growing terms. These regions occur between t = O(1) and t = O(1/ε), and thus generally involve a translation in time of some multiple of log(1/ε). As these regions are uninteresting-the decreasing and increasing terms pass by each other without interacting-we omit further discussion of these regions.
The long-time dynamics described by (11) are sensitive to the sign of α − 1, with the singularities as α → 1 being removable. Moreover, the steady states in (11) as T → ∞ depend on the sign of α − 1, and thus are sensitive to the initial ratio of pyruvate to GLU. The exponential decay to these steady states slows down as α → 1, and becomes algebraic decay in the limit.
There is further long-time behaviour that can occur in this system, though only for certain parameter regimes. We discuss this behaviour in "Appendix B.1".

Discussion
Of the reactions given in (1), (1c-f) are heterologous in Saccharomyces cerevisiae (that is, they are not native to the microorganism), and are the most appropriate targets for enhancement. Thus, identifying and enabling the correct enzymes within cells for these reactions is paramount. To this end, we discuss our results in the context of being able to vary parameters from reactions (1c-f).
The general goal of the metabolic system we have considered is to maximise the total 3HP produced, whilst minimizing the malonic semialdehyde (S 6 ) produced, as the latter is toxic. The total 3HP produced at leading order, P tot , can be obtained by taking the limit as t → ∞ in (7m), and is given by The integral (12b) has asymptotic behaviour I ∼ 1 as α → 0, I ∼ 1/α as α → ∞, I ∼ 1 as Ω → 0 for α < 1, I ∼ 1/α as Ω → 0 for α > 1, and I ∼ √ π/(2αΩ) as Ω → ∞, which agree well with the numerical solutions of (12b) (Fig. 4). To numerically evaluate (12b), we make the substitution u = e −s to switch to a finite domain and use an asymptotic approximation to evaluate the integral near u = 0, where the integrand has an integrable singularity.
The amount of S 6 present in the system is at its maximum when t = O(1), and we give an analytic expression for this in (7f), from which we see that max t>0 S 6 (t) = β 6 f (ω; u(ω)),  (13) is given by the grey curve. The dashed and dotted black curves are the small and large ω approximations, respectively. These approximations are defined in the text immediately below (13) where u(ω) > 0 satisfies The function f is monotonically increasing in ω, and bounded above (Fig. 5). For small ω, we find that f ∼ ωu 2 e −u /4 ≈ 0.127ω where u satisfies e −u = 1 − u + u 2 /4 (thus, u ≈ 2.56). For large ω, we find that f ∼ ue −u /2 ≈ 0.162 where u satisfies e −u = 1 − u/2 (thus, u ≈ 1.59). The numerical approximations we give are all to three significant figures. From the solutions for the total 3HP produced, (12a), and the maximum value of malonic semialdehyde, (13), we see that increasing ω will increase both metabolites, but there is a limiting effect. Increasing ω corresponds to increasing the reaction rate through (1c), from aspartate to β-alanine, and thus we can deduce that this reaction step is important to our goals, but there are diminishing returns.
There is a greater effect on the system from the pre-factor constants λ and β 6 = λ/k 9 for 3HP and malonic semialdehyde, respectively. λ is a measure of the reaction rate through (1d), from β-alanine to malonic semialdehyde and l-alanine, and we can deduce that increasing the flux through this reaction will result in higher levels of 3HP and malonic semialdehyde. Moreover, a larger value ofk 9 will result in a lower maximum value of malonic semialdehyde, without affecting the levels of 3HP at leading order. This physically corresponds to choosing or designing an enzyme to which malonic semialdehyde will bind very quickly in the reaction (1f) from malonic semialdehyde to 3HP. Moreover, we see that increasing λ, and hence increasing the reaction rate through (1d), increases the levels of both 3HP and malonic semialdehyde without diminishing returns. Thus, (1d) is an important reaction in the system with regards to our goal, and the extra malonic semialdehyde produced by higher reaction rates through this system could be balanced by increasingk 9 .
The reaction (1e), where l-alanine reacts with AKG to produce pyruvate and GLU, and vice versa, is important for the long-time production of 3HP. The reversibility of this reaction means that the initial pyruvate is stored as l-alanine, then is able to be converted back into pyruvate due to the excess AKG produced by the reaction (1b), from oxaloacetate and GLU to aspartate and AKG. This process provides more pyruvate for the important reaction (1d), allowing it to proceed for a longer time. Bypassing the long route to β-alanine from pyruvate is important for this system, as the pyruvate runs out whilst the levels of β-alanine tend to a finite value. Thus, diverting pyruvate from producing to reacting with β-alanine will result in a more efficient system and thus increase the total 3HP produced. This effect manifests in the second term in (12a), where we find that increasing β 5 , a measure of the backwards reaction in (1e), from pyruvate to l-alanine, increases the total 3HP produced for the reasons stated above. However, as the enzyme ALT used in this reaction also affects the value of Ω, care must be taken in interpreting this result. In particular, a large value of Ω will reduce the amount of 3HP produced (Fig. 4).
With regards to over-or under-expressing enzymes in the system, we can use Table 2 to determine that ω ∼ a 1 , λ ∼ a 2 , β 5 ∼ a 4 , β 6 ∼ a 2 /a 3 , and Ω ∼ a 4 , where a 1 , a 2 , a 3 , and a 4 relate to the initial concentrations of the enzymes PAND, BAPAT, HPDH, and ALT, respectively, in the system. Hence, using the arguments presented above, we may also deduce the following results for enzyme expression (a schematic of these results is given in Fig. 6). Over-expressing PAND will monotonically increase the levels of both 3HP and the maximum malonic semialdehyde present, but the effect of this over-expression has diminishing returns and is bounded above. Similarly, overexpressing BAPAT will also result in higher levels of both 3HP and the maximum malonic semialdehyde present, but now without diminishing returns, as both scale linearly with a 2 . Over-expressing HPDH decreases the maximum level of malonic semialdehyde present (scaling with 1/a 3 ), but has no leading-order effect on the levels on 3HP. Finally, over-expressing ALT will increase the amount of 3HP produced (scaling with √ a 4 for large a 4 ), without affecting the maximum (leading-order) level of malonic semialdehyde present. Thus, the effect of over-expressing ALT does have diminishing returns, but is not bounded above.  Fig. 6 A schematic to highlight the effect of over-expressing a given enzyme in the no replenishment case on a malonic semialdehyde and b 3HP. The underlying network and the arrows between the nodes are explained in Fig. 1. An enzyme that is boxed and red/green means that over-expressing this enzyme causes a/an decrease/increase in the metabolite of interest. Our goal is to reduce the levels of malonic semialdehyde whilst increasing the levels of 3HP, where possible. A dashed box denotes that the overexpression has diminishing returns with no upper bound, and a dotted box denotes that the over-expression has diminishing returns with an upper bound (colour figure online)

Continuously replenished pyruvate 4.1 Numerical results
We now investigate the system (4b-m), setting S 1 ≡ 1. Thus, the concentration of pyruvate is no longer governed by (4a). This models a system where pyruvate is being constantly replenished and maintained at a given value. Solving this system numerically, we see that there are two main timescales for 3HP production, where t = O(1) and t = O(10 1 ), respectively (Fig. 7). In the t = O(1) timescale, the levels of both malonic semialdehyde and 3HP are increasing, before the t = O(10 1 ) timescale, where the levels of malonic semialdehyde increases at a much slower rate, and the levels of 3HP tends to a constant production rate. There is a further timescale when t = O(10 2 ), and the levels of malonic semialdehyde slightly increase to a steady state in this timescale. As with the never-replenished pyruvate case, the levels of malonic semialdehyde and 3HP can vary significantly when different enzymes are over-expressed. We show a schematic for the metabolic network in Fig. 1. Increasing the initial amount of PAND has no discernible effect on the levels of both malonic semialdehyde and 3HP. In contrast, over-expressing BAPAT significantly increases the levels of both malonic semialdehyde and 3HP. We see that over-expressing HPDH does not appear to have any effect on the levels of 3HP and, in fact, significantly decreases the maximum amount of malonic semialdehyde in the system. Finally, we see that over-expressing ALT has a small positive effect on the levels of malonic semialdehyde overall (though a small negative effect over t = O(10 1 )), and a small negative effect on the levels of 3HP.
We proceed in the same manner as §3, using an asymptotic analysis to explore the system (4b-m), setting S 1 ≡ 1. This will allow us to determine how the system depends  Table 2 for all simulations, and initial dimensionless enzyme concentrations of 1 unless specified in the legend. The solid black line corresponds to the reference simulation where a i = 1 for i = 1, 2, 3, 4 on the dimensionless parameters, and thus to explore the experimental parameter space without resorting to computationally expensive parameter sweeps.

Asymptotic structure
As we did with §3, we now consider this system by exploiting the small parameter, ε, and using an asymptotic analysis. There are two main asymptotic regions of interest in time, where t = O(1) and t = O(1/ε 1/2 ), and we refer to these as medium and intermediate time, respectively (Fig. 7).
When t = O(1), the system is governed by the release of pyruvate through the main pathway to 3HP and towards the production of l-alanine. There is a significant increase in the levels of all metabolites apart from pyruvate and GLU, the only metabolites that are initially present. When t = O(1/ε 1/2 ), the initial rate of metabolite production is slowed and most metabolites approach their steady state. There is a further asymptotic region when t = O(1/ε), and the system responds to the overproduction of l-alanine due to the continuous replenishment of pyruvate. We consider this region in "Appendix B.2 ", as 3HP production is not affected by this redistribution. In the next section, we solve the leading-order versions of the system (4b-m), with S 1 ≡ 1 in these asymptotic regions.

Asymptotic solutions
The early-time behaviour of the system is the same as for the never-replenished case, and this is given in "Appendix A". The more interesting behaviour for malonic semialdehyde and 3HP production occurs over longer timescales, which we now discuss in turn.

Medium time: t = O(1)
The leading-order system (4b-m), with S 1 ≡ 1, as ε → 0 is given by the differentialalgebraic systeṁ on using (5). The solution to (14) is where the parameters are defined in (8). The solutions (15) show excellent agreement with the numerical results (dashed lines in Fig. 8). As with the case considered in §3, we did not have to impose all of the initial conditions to determine (15). That is, there is an early-time solution as t → 0 + . For early time, the diminishing pyruvate case considered in the previous section is equivalent to the constant pyruvate case we consider here. Thus, the early-time behaviour for this case is also given in "Appendix A" and shows excellent agreement with the numerical results (dashed-dotted lines for early time in Fig. 8).  for c, d). We use parameter values α = 0.5, a 1 = 1, a 2 = 1, a 3 = 1, and a 4 = 1. We have split the metabolite concentrations into a metabolites whose early-time behaviour is described by the t = O(1) behaviour, b metabolites whose late-time behaviour is described by the intermediate-time behaviour, c metabolites whose behaviour is distinct in each asymptotic region we have discussed, and d the product. We plot ( We can see that several terms in the system are promoted to leading order when t = O(ε −1/2 ). Using t = τ/ε 1/2 with the scalings (S 2 , S 3 , S 5 , C 1 ) = (Ŝ 2 ,Ŝ 3 ,Ŝ 5 ,Ĉ 1 )/ε 1/2 , (S 4 , S 6 , R 2 , C 2 , C 3 ) = (Ŝ 4 ,Ŝ 6 ,R 2 ,Ĉ 2 ,Ĉ 3 )/ε, and P = P/ε 3/2 , where all the new variables are O(1), the medium-time system (4) becomes and (5) becomes The leading-order version of (16) is given by the differential-algebraic system and is solved byŜ where the error function erf(z) is defined as These solutions show excellent agreement with the numerical results (dotted lines in Fig. 8).
Similarly to the analysis in §3, there is further long-time behaviour arising from higher-order terms becoming important as lower-order terms exponentially decrease. These can be obtained by proceeding to higher orders in the above analysis, but it is simpler to investigate the asymptotic balances in (16) using the solutions (18). As this does not affect the 3HP production at leading order, we discuss this in "Appendix B.2".

Discussion
We discuss our results in the same context as for the previous section. That is, we consider the effect of varying parameters from the heterologous reactions (1c-f), and with the goal of maximizing 3HP production whilst minimizing the maximum level of malonic semialdehyde, S 6 .
We see from the long-time solution (18l) that 3HP ends up being produced at a constant rate λ/k 2 , and hence the concentration of 3HP ends up linearly increasing with time. Additionally, from (18e), the concentration of malonic semialdehyde monotonically increases and tends to a constant value of β 6 /k 2 = λ/(k 2k9 ). Hence, although there are clear differences in the long-time behaviour of this system compared to the never-replenished pyruvate case, there is a similar reaction dependence between systems. Specifically, whilst increasing the reaction rate through (1d), from β-alanine to malonic semialdehyde and l-alanine, will result in greater levels of 3HP and malonic semialdehyde, the levels of malonic semialdehyde produced can be reduced by (b) Fig. 9 A schematic to highlight the effect of over-expressing a given enzyme in the continuous replenishment case on a malonic semialdehyde and b 3HP. The underlying network and the arrows between the nodes are explained in Fig. 1. An enzyme that is boxed and red/green means that over-expressing this enzyme causes a/an decrease/increase in the metabolite of interest. Our goal is to reduce the levels of malonic semialdehyde whilst increasing the levels of 3HP, where possible (colour figure online) increasingk 9 . That is, by selecting an enzyme to which malonic semialdehyde has a large binding affinity.
In contrast to the never-replenished pyruvate case, the continuously-replenished pyruvate case has no leading-order dependence on (1c), the reaction from aspartate to β-alanine, nor on (1e), the reaction where l-alanine reacts with AKG to produce pyruvate and GLU, and vice versa. Increasing the flux through the former and the reverse reaction of the latter had a positive effect on 3HP production, where the latter allowed l-alanine to initially act as a pyruvate store, before releasing pyruvate to react with β-alanine. Thus, for long-time, the rate of 3HP production and the levels of malonic semialdehyde are only dependent on the reaction (1d), compared to §3 where these quantities are dependent on (1c-e).
With regards to over-or under-expressing enzymes in the system, we can use Table 2 to determine that λ ∼ a 2 andk 9 ∼ a 3 , where a 2 and a 3 relate to the initial concentrations of the enzymes BAPAT and HPDH, respectively, in the system (a schematic of these results is given in Fig. 9). Hence, using the arguments presented above, we may also deduce that over-expressing BAPAT will result in higher levels of both 3HP and the maximum malonic semialdehyde present, both scaling linearly with a 2 . However, over-expressing HPDH decreases the maximum level of malonic semialdehyde present, which is inversely proportional to a 3 , but has no leading-order effect on the levels on 3HP. Importantly, we also find that, in contrast to the non-replenished-pyruvate case, there is no leading-order dependence on the initial concentrations of PAND or ALT. Thus, our model suggests that the over-expression of PAND or ALT will not result in significant differences in the maximum level of malonic semialdehyde or the 3HP production for continuous replenishment of pyruvate.

Conclusions
In this paper, we develop and solve a mathematical model for 3-hydroxypropionic acid (3HP) production from pyruvate via aspartate and β-alanine. We consider two limit-  Fig. 10 a The maximum level of malonic semialdehyde in the system, b the total 3HP produced, both in the no replenishment of pyruvate case. In both figures, the labels on the x-axis denote i Reference value (using the initial enzyme concentrations a 1 = a 2 = a 3 = a 4 = 1). (ii) Over-expressing PAND (a 1 = 10, a 2 = a 3 = a 4 = 1), (iii) Concurrently over-expressing BAPAT and HPDH (a 1 = a 4 = 1, a 2 = a 3 = 10), (iv) Over-expressing ALT (a 1 = a 2 = a 3 = 1, a 4 = 10). The units of the y-axis are mM ing cases, where the initial level of pyruvate is never and continuously replenished, respectively. In both cases, we make substantial asymptotic progress to gain physical insight into the system behaviour, successfully comparing these results with numerical solutions to check their accuracy. Our asymptotic model (including relatively straightforward explicit expressions) allows us to predict and quantify ways to increase the levels of 3HP produced without increasing the levels of malonic semialdehyde, a toxic compound produced as an intermediate in the pathway.
In terms of over-expressing enzymes, we find that for both cases the strongest positive effect comes from simultaneously over-expressing both BAPAT and HPDH. This significantly increase the levels of 3HP produced, but provides no significant increase in the maximum malonic semialdehyde present ((iii) in Figs. 10 and 11). Our results also show that over-expressing just one of BAPAT or HPDH will not have as strong an effect. We see a similar (though weaker) effect when over-expressing ALT for a limited supply of pyruvate, and the maximum level of malonic semialdehyde is slightly reduced ((iv) in Fig. 10). However, when pyruvate is continuously replenished, over-expressing ALT leads to a small increase in both the maximum level of malonic semialdehyde and the rate of 3HP production ((iv) in Fig. 11). Finally, over-expressing PAND only has a significant effect when there is a limited supply of pyruvate, where the over-expression leads to a slight increase in the maximum level of malonic semialdehyde present ((ii) in Fig. 10). This effect vanishes when the pyruvate is continuously replenished ((ii) in Fig. 11). We provide schematics outlining the effect of over-expressing a given enzyme on both malonic semialdehyde and 3HP in Figs. 6 and 9 for the never and continuous replenishment cases, respectively.
Another approach to optimizing metabolite pathways is to explore alternative enzymes for a given reaction. Our results can also be used to expedite this process, as we are able to determine how the system behaves as a function of the kinetic parameters of the enzymes, which will differ for alternative enzymes. Thus, we may interpret  Fig. 11 a The maximum level of malonic semialdehyde in the system, b the eventual rate of 3HP production, both in the continuous replenishment of pyruvate case. In b, we use P/t as the label for the y-axis as the levels of 3HP linearly increase with time (this continuous production is due to the continuous replenishment of pyruvate), and thus the appropriate measure here is the long-time production rate of 3HP. In both figures, the labels on the x-axis denote (i) Reference value (using the initial enzyme concentrations a 1 = a 2 = a 3 = a 4 = 1). (ii) Over-expressing PAND (a 1 = 10, a 2 = a 3 = a 4 = 1), (iii) Concurrently over-expressing BAPAT and HPDH (a 1 = a 4 = 1, a 2 = a 3 = 10), (iv) Over-expressing ALT (a 1 = a 2 = a 3 = 1, a 4 = 10). The units of the y-axis are mM our results to determine the optimal targets for enzyme replacement. We find that, in both cases, the kinetic parameters involving the enzyme BAPAT (k 5 ,k −5 , andk 6 , defined in Table 2) have the most significant effect on 3HP production, and our model suggests that focusing attention on maximizing the parameter groupingk 5k6 /k −5 will have the greatest effect on 3HP yield. Although increasingk 5k6 /k −5 also has the effect of increasing the levels of malonic semialdehyde, we further show that choosing an HPDH enzyme to which malonic semialdehyde will quickly bind can negate this issue (that is, increasingk 9 , defined in Table 2), and we show in Figs. 10 and 11 that this combined effect of increasing the 3HP production whilst maintaining the maximum level of malonic semialdehyde can also be achieved by over-expressing BAPAT and HPDH.
We additionally show that the kinetic parameters involving the enzyme ALT (k 7 , k −7 ,k 8 , andk −8 , defined in Table 2) have a leading-order effect on the 3HP levels when the initial level of pyruvate is never replenished. This is because the reversibility of the reaction effectively allows pyruvate to be stored and converted back when required. This storage effect is negligible in the continuous replenishment case as the pyruvate is plentiful (by definition) and thus there is no benefit to having additional storage of pyruvate. The important parameter groupings here arek 7k8 /(k −7 +k 8 ) andk −7k−8 /(k 7k8 ), where increasing both of these dimensionless groupings leads to more 3HP produced, with no effect on the maximum levels of malonic semialdehyde present at leading order. We find that the combined effect of these parameters means thatk −7 andk −8 should be increased, whereask 7 andk 8 should be decreased, and we note that there are diminishing returns on increasingk −7 . We show in Fig. 10 that this effect can be weakly obtained by over-expressing ALT.
The kinetic parameters of the remaining enzyme we consider, PAND, given byk 3 , k −3 , andk 4 defined in Table 2, are only significant at leading order in the neverreplenished pyruvate case. In this case, we find that increasing the value of the parameter groupingk 3k4 /k −3 increases both the amount of 3HP produced, and also the maximum levels of malonic semialdehyde present. However, the effect of increasing this parameter grouping is bounded above, and thus these diminishing returns suggest that this parameter grouping is unlikely to be particularly useful to exploit for the eventual goal of industrial production of 3HP.
We show that, although the full reaction dynamics are different between the neverreplenished and continuously-replenished pyruvate cases, there are general similarities between predictions of the important reactions in both cases. Namely, our model suggests that over-expressing BAPAT or using a version of BAPAT with different kinetic parameters will increase levels of both 3HP and malonic semialdehyde, and that over-expressing HPDH or using an alternate that can very quickly bind to malonic semialdehyde will decrease the levels of malonic semialdehyde. The main differences between the two cases are as follows. Firstly, the kinetic parameters involved in the PAND and ALT reactions have a leading-order effect in the never-replenished pyruvate case, but not in the continuously-replenished case. Secondly, the time taken to reach the 'long-time' stage of 3HP production is asymptotically larger in the continuouslyreplenished pyruvate case, an effect relating to the continuous production of 3HP in this case. Thirdly, the malonic semialdehyde reaches its maximum value at a finite time in the never-replenished case, but tends to its maximum (constant) value in the continuously-replenished case.
We have made significant use of asymptotic analysis in this paper. This method has allowed us to determine the key parameter groupings involved in the production of 3HP and malonic semialdehyde. Moreover, there are some interesting aspects of the asymptotic analysis in its own right. For example, we have leading-order exponentially decreasing terms that become subdominant to algebraically increasing terms. Resolving this issue without going to higher asymptotic orders requires matching over a timescale that involves logarithmic functions of the small parameter ε, as shown in "Appendix B.1". Another interesting facet of the asymptotic analysis we perform is that, in the never-replenished pyruvate case, there are effects that occur over two asymptotic timescales that contribute to the levels of 3HP at leading order, and the system behaviour over the longer timescale highlights how l-alanine acts as a store for pyruvate, yielding deeper physical insight into the system.
The two limiting cases of pyruvate replenishment that we have considered in this paper are chosen as modelling assumptions. In reality, the levels of pyruvate will fall somewhere between these two extremes, as some time-dependent level of pyruvate will be generated from glucose via glycolysis. Whilst our model does highlight the BAPAT enzyme as the most important to 3HP production in both of the extreme cases we consider, it may be interesting to investigate whether this is true for any given pyruvate production. This could be examined by imposing a certain time-dependent form for the pyruvate concentration (as we did by imposing that pyruvate was constant in the second case we considered) if it were known, or by including a known source of pyruvate in the governing equations and solving the full system. From (4), the governing equations for O(1) time, we can see that the leading-order metabolite concentrations can be written in terms of integrals of (time-dependent) imposed pyruvate concentrations or pyruvate sources, and these are given in "Appendix C". It may be instructive to analysis these general results further.
Additionally, we chose somewhat arbitrary initial conditions which modelled the instantaneous addition of pyruvate to a well-mixed solution of enzymes. In reality, there is no clean starting point to such a set of reactions and there are likely to be small initial levels of every metabolite. We have tested our results for small but non-zero initial values of each metabolite and found that the early-time solutions we derived in "Appendix A" were inaccurate, but that the system relaxed into the remaining solutions we derived in this paper (results not shown).
Care must be taken with applying these results to in vivo experiments. Some of the metabolite concentrations we have derived are very small and, whilst this is not an issue for reactions occurring within a well-mixed beaker in vitro, it may pose a problem if the situation being modelled is reactions occurring within cells. For one molecule within a cell, the resulting macroscale concentration within a bacteria cell would be roughly 10 −9 M ≈ ε 3 S 0 using the typical parameter values in this paper. Hence, care needs to be taken when interpreting our results for reactions occurring within bacterial cells. One option to accurately model a small number of molecules is is to perform stochastic simulations of the molecule numbers, assigning a probability of each reaction occurring. Additionally, for small numbers of molecules it may be that spatial effects are important. It would be useful to include these extra features in an extension of this work to check whether our conclusions still applied for in vivo experiments, but we note that this would significantly increase the computational expense of solving the model.
Finally, we note that this work highlights how mathematical models can be used to understand a complicated system, even one exhibiting nonlinear behaviour. From our model, we were able to highlight the strong positive effect of over-expressing two enzymes, BAPAT and HPDH, at the same time to increase 3HP production while maintaining the levels of malonic semialdehyde. Such theoretical results should allow significant reductions in the time taken to explore the experimental parameter space, and to aid in other ways the understanding of biological systems.  Fig. 12 a The shaded areas denote the regions in which there is switching of asymptotic orders in the long-time solution for limited initial pyruvate. In the region where Ω < 1/α − 1, solutions (B2) hold. In the region where Ω < α − 1, solutions (B4) hold. b Results for S 3 , taking k i = a j = 1 for all i, j, and α = 0.5, thus falling into the region where α < 1 and Ω < 1/α − 1. The solid grey curve shows the numerical result, the dashed black curve shows the O(1) term in (B2a), and the dotted black curve shows the full expression given in (B2a). The latter two asymptotic results show excellent agreement with the numerical result until t ≈ 10 3 , where asymptotic switching occurs, and the O(ε) term in (B2a) dominates to determine the leading-order corrections R 1 ∼ α(1 − α)ek 2 (α−1)T + ε β 4 β 5 Ωk 8 C 1 ∼ γ 1 (1 − α) 2 ek 2 (α−1)T + ε β 4 β 5 Ωk 8 as T → ∞. The O(ε) terms in (B2) arise because the O(1) term has a larger exponential decay rate than the O(ε) term in (9g), and thus these terms will eventually balance. When α > 1 and Ω < α − 1, we look into the region in (9) where to determine the leading-order corrections as T → ∞. The O(ε) terms in (B4) arise because the O(1) term has a larger exponential decay rate than the O(ε) term in (9b), and thus these terms will eventually balance. We see that these solutions are excellent approximations to the numerical solution. This further long-time behaviour occurs when (1e), the reaction between l-alanine and pyruvate, is slow. The physical interpretation of the solutions in this appendix is that this slow reaction allows a slow replenishment of pyruvate which, in turn, is converted through the metabolites in the pathway to β-alanine. This slow release allows for a more gradual depletion of these intermediates.