Abstract
Gene Regulatory Networks are powerful models for describing the mechanisms and dynamics inside a cell. These networks are generally large in dimension and seldom yield analytical formulations. It has been shown that studying the conditional expectations between dimensions (interactions or species) of a network can lead to drastic dimension reduction. These conditional expectations were classically obtained by solving equations of motion derived from the Chemical Master Equation. In this paper we deviate from this convention and take an algebraic approach instead. That is, we explore the consequences of conditional expectations being described by a polynomial function. There are two main results in this work. Firstly, if the conditional expectation can be described by a polynomial function, then the coefficients of this polynomial function can be reconstructed using the classical moments. Secondly, there are dimensions in Gene Regulatory Networks which inherently have conditional expectations with algebraic forms. We demonstrate through examples that the theory derived in this work can be used to develop new and effective numerical schemes for forward simulation and parameter inference. The algebraic line of investigation of conditional expectations has considerable scope to be applied to many different aspects of Gene Regulatory Networks; this paper serves as a preliminary commentary in this direction.
Notes
The terms species, dimensions, and vertices originate from different fields of study but refer to the same concept. Hence, we interchange between the terms to match the context.
In our context the results can be reformulated in terms of raw moments, factorial moments, or central moments. For this reason we use the term classical moments to encompass them all.
A Gaussian reconstruction in this context involves computing the Gaussian distribution over the discrete state space and then normalising to make the total mass one.
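As a minimal sketch of the Gaussian reconstruction described in the note above, the following evaluates a Gaussian with a given mean and variance on a discrete state space and rescales it to unit mass. The function name, the state range, and the numerical values are illustrative assumptions, not taken from the paper.

```python
import math

def gaussian_reconstruction(mean, variance, states):
    """Evaluate a Gaussian on discrete states, then normalise to total mass one."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    # The constant 1/sqrt(2*pi*variance) cancels in the normalisation below.
    weights = [math.exp(-((s - mean) ** 2) / (2.0 * variance)) for s in states]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical truncated state space and moments, chosen only for illustration.
states = list(range(0, 41))
probs = gaussian_reconstruction(mean=12.5, variance=9.0, states=states)
print(sum(probs))
```

Normalising after evaluation means the result is a proper probability vector even when the truncated state space cuts off part of the Gaussian tail.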
References
Anderson D (2007) A modified next reaction method for simulating chemical systems with time dependent propensities and delays. J Chem Phys 127(21):214107. https://doi.org/10.1063/1.2799998
Andreychenko A, Mikeev L, Wolf V (2015) Reconstruction of multimodal distributions for hybrid moment-based chemical kinetics. J Coupled Syst Multiscale Dyn 3(2):156–163. https://doi.org/10.1166/jcsmd.2015.1073
Andreychenko A, Bortolussi L, Grima R, Thomas P, Wolf V (2017) Distribution approximations for the chemical master equation: comparison of the method of moments and the system size expansion. In: Graw F, Matthäus F, Pahle J (eds) Modeling cellular systems. Contributions in mathematical and computational sciences. Springer, Cham, pp 39–66. https://doi.org/10.1007/978-3-319-45833-5_2
Ball K, Kurtz TG, Popovic L, Rempala G (2006) Asymptotic analysis of multiscale approximations to reaction networks. Ann Appl Probab 16(4):1925–1961
Banasiak J (2014) Positive semigroups with applications. PhD thesis, University of KwaZulu-Natal, Durban, South Africa
Barkai N, Leibler S (2000) Biological rhythms: circadian clocks limited by noise. Nature 403(6767):267–268. https://doi.org/10.1038/35002258
Blake WJ, Kærn M, Cantor CR, Collins JJ (2003) Noise in eukaryotic gene expression. Nature 422(6932):633–637. https://doi.org/10.1038/nature01546
Bokes P, King JR, Wood ATA, Loose M (2012) Exact and approximate distributions of protein and mRNA levels in the low-copy regime of gene expression. J Math Biol 64(5):829–854. https://doi.org/10.1007/s00285-011-0433-5
Burrage K, MacNamara S, Tian TH (2006) Accelerated leap methods for simulating discrete stochastic chemical kinetics. Posit Syst Proc 341:359–366. https://doi.org/10.1007/3-540-34774-7_46
Cao Z, Grima R (2018) Linear mapping approximation of gene regulatory networks with stochastic dynamics. Nat Commun. https://doi.org/10.1038/s41467-018-05822-0
Cardelli L, Kwiatkowska M, Laurenti L (2016) Stochastic analysis of chemical reaction networks using linear noise approximation. BioSystems 149:26–33. https://doi.org/10.1016/j.biosystems.2016.09.004
Choudhary K, Oehler S, Narang A (2014) Protein distributions from a stochastic model of the lac operon of E. coli with DNA looping: analytical solution and comparison with experiments. PLoS ONE. https://doi.org/10.1371/journal.pone.0102580
Engblom S (2006) Computing the moments of high dimensional solutions of the master equation. Appl Math Comput 180(2):498–515. https://doi.org/10.1016/j.amc.2005.12.032
Gardner TS, Cantor CR, Collins JJ (2000) Construction of a genetic toggle switch in Escherichia coli. Nature 403(6767):339–342. https://doi.org/10.1038/35002131
Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81(25):2340–2361. https://doi.org/10.1021/j100540a008
Goutsias J (2005) Quasiequilibrium approximation of fast reaction kinetics in stochastic biochemical systems. J Chem Phys 122(18):184102. https://doi.org/10.1063/1.1889434
Grima R, Schmidt DR, Newman TJ (2012) Steady-state fluctuations of a genetic feedback loop: an exact solution. J Chem Phys. https://doi.org/10.1063/1.4736721
Haseltine EL, Rawlings JB (2002) Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J Chem Phys 117(15):6959–6969. https://doi.org/10.1063/1.1505860
Hasenauer J, Wolf V, Kazeroonian A, Theis FJ (2013) Method of conditional moments (MCM) for the Chemical Master Equation. J Math Biol. https://doi.org/10.1007/s00285-013-0711-5
Hellander A, Lötstedt P (2007) Hybrid method for the chemical master equation. J Comput Phys 227(1):100–122. https://doi.org/10.1016/j.jcp.2007.07.020
Henzinger TA, Mikeev L, Mateescu M, Wolf V (2010) Hybrid numerical solution of the chemical master equation. In: Proceedings of the 8th international conference on computational methods in systems biology. ACM, Trento, pp 55–65. https://doi.org/10.1145/1839764.1839772
Higham DJ (2008) Modeling and simulating chemical reactions. SIAM Rev 50(2):347–368. https://doi.org/10.1137/060666457
Jahnke T (2011) On reduced models for the chemical master equation. Multiscale Model Simul 9(4):1646–1676. https://doi.org/10.1137/110821500
Jahnke T, Huisinga W (2007) Solving the chemical master equation for monomolecular reaction systems analytically. J Math Biol 54:1–26
Jahnke T, Kreim M (2012) Error bound for piecewise deterministic processes modeling stochastic reaction systems. SIAM Multiscale Model Simul 10(4):1119–1147. https://doi.org/10.1137/120871894
Jahnke T, Sunkara V (2014) Error bound for hybrid models of two-scaled stochastic reaction systems. In: Dahlke S, Dahmen W, Griebel M, Hackbusch W, Ritter K, Schneider R, Schwab C, Yserentant H (eds) Extraction of quantifiable information from complex systems: lecture notes in computational science and engineering, vol 102. Springer, Berlin, pp 303–319. https://doi.org/10.1007/978-3-319-08159-5_15
Karlebach G, Shamir R (2008) Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol 9(10):770–780. https://doi.org/10.1038/nrm2503
Khammash M, Munsky B (2006) The finite state projection algorithm for the solution of the chemical master equation. J Chem Phys 124(044104):1–12. https://doi.org/10.1063/1.2145882
Kurtz TG (1972) Relationship between stochastic and deterministic models for chemical reactions. J Chem Phys 57(7):2976–2978. https://doi.org/10.1063/1.1678692
MacArthur BD, Ma’ayan A, Lemischka IR (2009) Systems biology of stem cell fate and cellular reprogramming. Nat Rev Mol Cell Biol 10(10):672–681. https://doi.org/10.1038/nrm2766
MacNamara S, Bersani AM, Burrage K, Sidje RB (2008) Stochastic chemical kinetics and the total quasi-steady-state assumption: application to the stochastic simulation algorithm and chemical master equation. J Chem Phys 129(095105):1–13. https://doi.org/10.1063/1.2971036
Menz S, Latorre J, Schütte C, Huisinga W (2012) Hybrid stochastic-deterministic solution of the chemical master equation. Multiscale Model Simul 10(4):1232–1262. https://doi.org/10.1137/110825716
Nagel W, Steyer R (2017) Probability and conditional expectation. Wiley series in probability and statistics. Wiley, Oxford. https://doi.org/10.1002/9781119243496
Pájaro M, Alonso AA, Otero-Muras I, Vázquez C (2017) Stochastic modeling and numerical simulation of gene regulatory networks with protein bursting. J Theor Biol 421:51–70. https://doi.org/10.1016/j.jtbi.2017.03.017
Rao CV, Arkin AP (2003) Stochastic chemical kinetics and the quasi-steady-state assumption: application to the Gillespie algorithm. J Chem Phys 118(11):4999–5010. https://doi.org/10.1063/1.1545446
Ruess J (2015) Minimal moment equations for stochastic models of biochemical reaction networks with partially finite state space. J Chem Phys 143(24):244103. https://doi.org/10.1063/1.4937937
Seber GAF, Lee AJ (2003) Linear regression analysis. Wiley, Hoboken. https://doi.org/10.1002/9780471722199
Singh A, Hespanha JP (2005) Models for multi-specie chemical reactions using polynomial stochastic hybrid systems. In: IEEE conference on decision and control, pp 2969–2974. https://doi.org/10.1109/CDC.2005.1582616
Smadbeck P, Kaznessis YN (2012) Efficient moment matrix generation for arbitrary chemical networks. Chem Eng Sci 84:612–618. https://doi.org/10.1016/j.ces.2012.08.031
Smadbeck P, Kaznessis YN (2013) A closure scheme for chemical master equations. Proc Natl Acad Sci 110(35):14261–14265. https://doi.org/10.1073/pnas.1306481110
Srivastava R, You L, Summers J, Yin J (2002) Stochastic vs. deterministic modeling of intracellular viral kinetics. J Theor Biol 218(3):309–321. https://doi.org/10.1006/jtbi.2002.3078
Sunkara V (2013) Analysis and numerics of the chemical master equation. PhD thesis, Australian National University
Sunkara V (2017) PyME (Python solver for the chemical master equation). https://github.com/vikramsunkara/PyME. Accessed 1 Aug 2019
Sunkara V, Hegland M (2010) An optimal finite state projection method. Procedia Comput Sci 1(1):1579–1586. https://doi.org/10.1016/j.procs.2010.04.177
Thomas P, Popović N, Grima R (2014) Phenotypic switching in gene regulatory networks. Proc Natl Acad Sci 111(19):6994–6999. https://doi.org/10.1073/pnas.1400049111
Van Kampen NG (2007) Stochastic processes in physics and chemistry, 3rd edn. North Holland, Amsterdam
Vilar JMG, Kueh HY, Barkai N, Leibler S (2002) Mechanisms of noise-resistance in genetic oscillators. Proc Natl Acad Sci 99(9):5988–5992. https://doi.org/10.1073/pnas.092133899
Wilkinson DJ (2006) Stochastic modelling for systems biology. Mathematical and computational biology series. Chapman & Hall, CRC, Boca Raton
Funding
V. Sunkara was supported by the BMBF (Germany) project PrevOp-OVERLOAD, grant number 01EC1408H.
Appendices
Proofs
Proof of Lemma 3.1-3
We substitute the conditional expectation form into Eve’s law (Law of Total Variance) and then reduce.
Eve’s Law states that
Verbosely, the total variance of Y is the sum of the expectation of the conditional variances and the variance of the conditional expectation. We begin by reducing the covariance of the conditional expectations:
substituting the linear conditional expectation form and then expanding gives us
substituting the definition of a covariance gives
Substituting this term above into Eve’s law gives us that,
\(\square \)
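The decomposition used in the proof above can be checked numerically. The sketch below is an illustration under stated assumptions: the joint distribution is a hypothetical toy example constructed so that \({\mathbb {E}}[Y\,|\,X=x] = 2x+1,\) and X denotes the conditioning variable. It verifies Eve's law exactly with rational arithmetic, and also the consequence that, for a linear conditional expectation, the expected conditional variance equals \(\mathrm {Var}(Y) - \mathrm {Cov}(X,Y)^2/\mathrm {Var}(X).\)

```python
from fractions import Fraction as F

# Hypothetical toy joint distribution: Y | X = x is 2x or 2x + 2 with equal probability.
p_x = {0: F(1, 2), 1: F(1, 4), 2: F(1, 4)}
joint = {(x, y): px / 2 for x, px in p_x.items() for y in (2 * x, 2 * x + 2)}

def expect(f):
    """Expectation of f(x, y) under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - mean_x ** 2
var_y = expect(lambda x, y: y * y) - mean_y ** 2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

# Conditional mean and conditional variance of Y for every x.
cond_mean, cond_var = {}, {}
for x, px in p_x.items():
    m1 = sum(p * y for (xx, y), p in joint.items() if xx == x) / px
    m2 = sum(p * y * y for (xx, y), p in joint.items() if xx == x) / px
    cond_mean[x], cond_var[x] = m1, m2 - m1 ** 2

exp_cond_var = sum(p_x[x] * cond_var[x] for x in p_x)                       # E[Var(Y|X)]
var_cond_mean = sum(p_x[x] * cond_mean[x] ** 2 for x in p_x) - mean_y ** 2  # Var(E[Y|X])

print(var_y, exp_cond_var + var_cond_mean)
print(exp_cond_var, var_y - cov_xy ** 2 / var_x)
```

Using `Fraction` keeps every quantity exact, so the two sides of each identity can be compared with `==` rather than with a floating-point tolerance.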
Parameters of the three models
Proof that the simple mRNA translation model has a linear conditional expectation structure
The idea and outline for this proof were given by one of the anonymous reviewers of this paper. The author is grateful to the reviewer and the peer-review process for this contribution.
We prove that the simple mRNA translation model has a linear conditional expectation structure by using the notion of generating functions. We begin by first deriving the definition of the conditional expectation in terms of the generating function.
1.1 Conditional expectation in terms of the generating function
Let X and Y be two coupled random variables whose state spaces are the natural numbers including zero. The generating function of the joint distribution \({{\,\mathrm{\textit{p}}\,}}(X=\cdot ,Y=\cdot )\) is given by,
It is well known that taking the nth derivative of \(\phi \) with respect to t or s and evaluating at \(t=s=1\) gives the nth factorial moment of the random variable X or Y, respectively. We aim to similarly formulate the conditional expectation in terms of derivatives of the generating function.
For \(x\in \Omega _X,\) we define
Verbosely, the function \(g_x(s)\) is the xth derivative of \(\phi \) with respect to t, evaluated at \(t=0.\) We take the natural logarithm of \(g_x(s)\) to get,
Taking the derivative of the expression above with respect to s gives us,
Then evaluating the function at \(s=1\) gives us,
We have derived the definition of the conditional expectation as a function of the derivatives of the generating function. Naturally, if the generating function is known, one can evaluate the terms in (C.4) and determine the corresponding conditional expectation structure.
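The construction above can be illustrated on a small example. The sketch below assumes that \(\phi \) is the probability generating function \(\phi (t,s) = \sum _{x,y} p(x,y)\, t^x s^y,\) so that \(g_x(s) = x! \sum _y p(x,y)\, s^y;\) the joint distribution and the function names are hypothetical. It compares the derivative of \(\ln g_x(s)\) at \(s=1,\) computed by a central finite difference, with the conditional expectation computed directly from its definition.

```python
from fractions import Fraction as F
from math import factorial, log

# Hypothetical joint distribution p(x, y) on a small grid (sums to one).
joint = {
    (0, 0): F(1, 8), (0, 1): F(1, 8),
    (1, 0): F(1, 8), (1, 2): F(3, 8),
    (2, 1): F(1, 8), (2, 3): F(1, 8),
}

def g(x, s):
    """x-th t-derivative of the generating function at t = 0, as a function of s."""
    return factorial(x) * sum(p * s ** y for (xx, y), p in joint.items() if xx == x)

def cond_exp_from_pgf(x, h=1e-6):
    """d/ds ln g_x(s) at s = 1, approximated by a central finite difference."""
    return (log(g(x, 1.0 + h)) - log(g(x, 1.0 - h))) / (2.0 * h)

def cond_exp_direct(x):
    """E[Y | X = x] computed from the definition of conditional expectation."""
    num = sum(p * y for (xx, y), p in joint.items() if xx == x)
    den = sum(p for (xx, _), p in joint.items() if xx == x)
    return num / den

for x in (0, 1, 2):
    print(x, cond_exp_from_pgf(x), float(cond_exp_direct(x)))
```

The factor \(x!\) cancels once the logarithm is differentiated, which is why only the ratio \(g_x'(1)/g_x(1)\) matters.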
1.2 Linear conditional expectation form of the simple mRNA translation model
We prove that the simple mRNA translation model has a linear conditional expectation form by using the generating function given by Bokes et al. (2012) and substituting it into (C.4). We begin by establishing some notation in order to align with the work by Bokes et al.
Let M, N be the random variables corresponding with mRNA population and protein population, respectively. Let the reaction channels be given as follows:
We are investigating the stationary distribution, hence we omit the time component. It was shown by Bokes et al. that the stationary moments of the simple mRNA translation model are as follows:
Then the generating function of the stationary distribution is given by,
where
with \(K(\cdot ,\cdot ,\cdot )\) being Kummer's function and
To find the conditional expectation of the simple mRNA translation model, we will substitute its generating function (C.7), into (C.2) and reduce.
Taking the natural log gives us,
Then taking the derivative with respect to s gives us,
By the fundamental theorem of calculus we have that
and by the properties of the derivative of Kummer's function, \(\frac{d}{dc} K(a,b,f(c)) = \frac{a\,f'(c)}{b} K(a+1,b+1,f(c)),\) we have that
Substituting these terms into (C.9), then evaluating at \(s=1\) and applying the property that \(K(\cdot ,\cdot ,0) = 1\) gives us,
By the definition given in (C.4), we have that
After substituting in the term (C.8), the conditional expectation in terms of the reaction rates is given to be,
Hence, the conditional expectation of the simple mRNA translation model has a linear form. We now cross-validate the coefficients linking the terms above to the raw moments using Lemma 3.1.
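The two properties of Kummer's function used in the derivation above, the derivative identity and \(K(\cdot ,\cdot ,0) = 1,\) can be checked numerically. The sketch below is only an illustration: it evaluates Kummer's function through its power series, and the parameter values and the inner function \(f(c) = 0.7\,(c-1)\) are hypothetical choices, not those of the model.

```python
import math

def kummer(a, b, z, terms=200):
    """Kummer's function K(a, b, z) = sum_n (a)_n / (b)_n * z^n / n!, by its power series."""
    total, term = 1.0, 1.0
    for n in range(terms):
        term *= (a + n) / (b + n) * z / (n + 1)
        total += term
    return total

# Hypothetical inner function and parameters, chosen only for illustration.
def f(c):
    return 0.7 * (c - 1.0)

f_prime = 0.7
a, b, c, h = 1.3, 2.1, 0.4, 1e-5

# Left-hand side: d/dc K(a, b, f(c)) by a central finite difference.
lhs = (kummer(a, b, f(c + h)) - kummer(a, b, f(c - h))) / (2.0 * h)
# Right-hand side: (a f'(c) / b) K(a + 1, b + 1, f(c)).
rhs = a * f_prime / b * kummer(a + 1.0, b + 1.0, f(c))

print(lhs, rhs)
print(kummer(a, b, 0.0))
```

A truncated power series is adequate here because the arguments are small; for large arguments a dedicated special-function routine would be the safer choice.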
1.3 Cross-validation
Using (3.2) we know that linear conditional expectations of protein conditioned on mRNA should have the form:
Substituting in (C.5) and (C.6) for the moments gives us,
expanding the terms gives us,
The terms in (C.10) and (C.11) match.
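The cross-validation above can be mirrored in a few lines of exact arithmetic. The sketch below rests on several assumptions: the reaction channels are taken to be mRNA production (rate `k1`), mRNA degradation (`g1` per molecule), translation (`k2` per mRNA), and protein degradation (`g2` per molecule); the stationary moments are the standard ones for this two-stage model; and the rate names and values are hypothetical. The slope and intercept are obtained from the usual identities for a linear conditional expectation, namely slope \(= \mathrm {Cov}(M,N)/\mathrm {Var}(M)\) and intercept \(= {\mathbb {E}}[N] - \text {slope}\cdot {\mathbb {E}}[M].\)

```python
from fractions import Fraction as F

def linear_ce_coefficients(mean_x, mean_y, var_x, cov_xy):
    """Slope and intercept of a linear E[Y | X = x] reconstructed from moments."""
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical reaction rates (not the values used in the paper).
k1, g1, k2, g2 = F(4), F(1), F(10), F(1, 2)

# Standard stationary moments of the two-stage mRNA-protein model.
mean_m = k1 / g1                        # E[M]; M is Poisson, so Var(M) = E[M]
var_m = mean_m
mean_n = k1 * k2 / (g1 * g2)            # E[N]
cov_mn = k1 * k2 / (g1 * (g1 + g2))     # Cov(M, N)

slope, intercept = linear_ce_coefficients(mean_m, mean_n, var_m, cov_mn)

# Closed forms that follow algebraically from the moment expressions above.
slope_closed = k2 / (g1 + g2)
intercept_closed = k1 * k2 / (g2 * (g1 + g2))

print(slope, slope_closed)
print(intercept, intercept_closed)
```

Because the inputs are `Fraction` objects, agreement between the moment-based and closed-form coefficients is exact rather than approximate.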
Model 3: Conditional expectation through time
In this section we evaluate Model 3 at different time points to observe if the conditional expectation’s quadratic structure is present through time. Since there are no analytical solutions for the model known to date, we use an OFSP approximation as the reference solution and see how close this approximation’s conditional expectation is to the conditional expectation ansatz. The OFSP approximation was set to have a global \(\ell _1\) error of \(10^{-7}.\)
In Fig. 10a–c, the joint distribution is rendered in a contour plot, evaluated at time points \(T=0.15,\ 0.3,\) and \(1.2.\) Below the joint distributions, in Fig. 10d–e, the corresponding conditional expectation and the quadratic ACE ansatz are given. We see that the conditional expectation and the ansatz are fairly similar. There are some mismatches at the boundary, but this is to be expected since the OFSP does produce artefacts at the boundary due to its truncation criteria.
To further investigate the resolution at which the conditional expectations and the ACE ansatz differ, we study the differences between them through time using three different metrics: the \(\ell _{\infty }\) norm, to study the maximum error at a particular time point; the \(\ell _{2}\) norm, to study the difference over the entire state space; and lastly, the relative error in \(\ell _{2},\) to see how the error is changing with respect to the change in the conditional expectation. In Fig. 10g, we see that the \(\ell _{\infty }\) norm is of the order \(10^{-2}\) in the interval of interest and the error is increasing with time. Then in Fig. 10h, we notice that the \(\ell _2\) norm has a similar trend to the \(\ell _{\infty }\) norm. Interestingly, however, the total error over the state space in the \(\ell _2\) norm is only twice that of the \(\ell _{\infty }\) norm, implying that only a few states contribute most of the error. Lastly, in Fig. 10i, we study the relative error over time. We notice that this error falls to roughly \(10^{-4},\) implying that the error between the ACE ansatz and the conditional expectation is roughly ten thousand times smaller than the conditional expectation. This suggests that the model likely does exhibit a quadratic conditional expectation structure.
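The three metrics discussed above can be sketched as follows. The data are synthetic and purely illustrative: a hypothetical quadratic ansatz is perturbed at only four states, which reproduces the situation where the \(\ell _2\) error is about twice the \(\ell _{\infty }\) error. The relative error is taken here to be the \(\ell _2\) error divided by the \(\ell _2\) norm of the conditional expectation, which is an assumption about the definition.

```python
import math

def error_metrics(cond_exp, ansatz):
    """Return the l_inf error, the l_2 error, and the relative l_2 error."""
    diff = [c - a for c, a in zip(cond_exp, ansatz)]
    l_inf = max(abs(d) for d in diff)
    l_2 = math.sqrt(sum(d * d for d in diff))
    rel_l_2 = l_2 / math.sqrt(sum(c * c for c in cond_exp))
    return l_inf, l_2, rel_l_2

# Synthetic example: quadratic ansatz, with an error of 0.01 at the last four states only.
states = range(50)
ansatz = [0.02 * x * x + 1.5 * x + 3.0 for x in states]
cond_exp = [a + (0.01 if x >= 46 else 0.0) for x, a in zip(states, ansatz)]

l_inf, l_2, rel_l_2 = error_metrics(cond_exp, ansatz)
print(l_inf, l_2, rel_l_2)
```

If the error were spread evenly over all states, the \(\ell _2\) value would grow with the square root of the number of states, so a small ratio between the two norms indicates a localised mismatch.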
Simple gene switch derivations
1.1 Chemical master equation
1.2 Marginal distributions
We follow the same steps as in the generalised form (see Sect. 2.2). Deriving the CME for the marginal distribution of the gene and the proteins involves the following two steps:
- substituting \( {{\,\mathrm{\textit{p}}\,}}( G=\cdot ,M=\cdot ,A=\cdot ;t) = {{\,\mathrm{\textit{p}}\,}}(M=\cdot \,|\,G=\cdot , A=\cdot ;t)\, {{\,\mathrm{\textit{p}}\,}}( G=\cdot ,A=\cdot ;t),\)
- summing over all \(m \in \Omega _M\) and then collating all conditional probability terms.
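To indicate how these two steps produce a conditional expectation, consider a generic term of the CME whose propensity is linear in m. This is an illustrative case under that assumption, not a specific reaction of the model. Applying the substitution and then summing over m gives

\[
\sum _{m \in \Omega _M} m\, p(G=g, M=m, A=a; t)
= \Big ( \sum _{m \in \Omega _M} m\, p(M=m \,|\, G=g, A=a; t) \Big )\, p(G=g, A=a; t)
= {\mathbb {E}}[M \,|\, G=g, A=a; t]\; p(G=g, A=a; t).
\]

Terms whose propensity does not depend on m reduce to \(p(G=g, A=a; t)\) alone, since the conditional probabilities sum to one.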
1.2.1 Step 1
1.2.2 Step 2
Formal ACE-Ansatz approximation derivation
Before we begin the derivation, it is important to discuss Assumption 2.1-3. We state that the joint distribution needs to have non-zero probability over all of the state space through all time. We can easily violate this condition by starting the Kurtz process with an initial probability distribution that is non-zero over only a subset of the entire state space (e.g. a single state). However, the CME generator (2.4) has the feature that, regardless of the initial condition, all the states have non-zero probability after an infinitesimal time. Hence, numerically, if the process does start at a single state, we can evolve it forward by a small time step using OFSP, and then use this time point as the initial condition in the dimension reduction methods. In the case of the Simple Gene Switch example in Sect. 5.1.2, we used \(t=1\) as the starting point for all dimension reduction methods.
We use the following notational convention: the approximation of the probability measure \(p(G=g,A=a;t)\) is denoted by the function w(g, a, t); furthermore, the approximation for the expectation operator \({\mathbb {E}}[\bullet (t)]\) is denoted by the function \(\eta _{\bullet }(t).\) The formal derivations of Eqs. (5.1)–(5.8) are then given by Eqs. (F.1)–(F.12).
Two gene toggle switch derivations
We use the following notational convention: the approximation of the probability measure \(p(G_0=g,P=p;t)\) is denoted by the function w(g, p, t); furthermore, the approximation for the expectation operator \({\mathbb {E}}[\bullet (t)]\) is denoted by the function \(\eta _{\bullet }(t).\) As in the simple gene switch case, the approximation is started at \(t=0.35\) to satisfy Assumption 2.1-3. We introduce the equations of motion in the following order: marginal distributions, moments, higher order moment closures, and the linear ACE-Ansatz approximations.
1.1 Marginal distribution
1.2 Moments
We derive the equations of motion for the following eight moments: \({\mathbb {E}}[G_1(t)],\) \({\mathbb {E}}[M(t)],\) \({\mathbb {E}}[G_0\,G_1(t)],\) \({\mathbb {E}}[G_0\,M(t)],\) \({\mathbb {E}}[G_1\,P(t)],\) \({\mathbb {E}}[G_1\,M(t)],\) \({\mathbb {E}}[P\,M(t)],\) and \({\mathbb {E}}[M^2(t)].\)
Let \(\mu (t) := [ \eta _{G_1}(t), \eta _{M}(t), \eta _{G_0\,G_1}(t),\eta _{G_0\,M}(t),\eta _{G_1\, M}(t),\eta _{P\, M}(t),\eta _{M^2}(t) ],\) then the equation of motion for the approximation of the moments has the form:
where
and
1.3 Higher order moment closures
Let \(w(G_0^\mathbf{on },t) := \sum _p w(G_0^\mathbf{on },p,t)\), and \(w(p,t) := w(G_0^\mathbf{on },p,t) + w(G_0^\mathbf{off },p,t)\). We apply the following moment closures:
Similarly, we can use the marginal distribution, \(w(G_0^\mathbf{on },p,t),\) to generate the corresponding moments:
1.4 Linear ACE-Ansatz approximations
We approximate the conditional expectations with the linear ACE ansatz:
where the gradients are given by:
SIR system parameters
The initial population was set to \((S(0)=200, I(0)=4).\) The OFSP method was configured to have a global error of \(10^{-6},\) with compression performed every 10 steps, where each time step was of length 0.002. The distribution is the snapshot of the system at \(t=0.15.\) We also omit the recovered state since the total population is conserved, that is, \(S(t)+I(t) +R(t) = 204\) for all time (see Table 12).
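The conservation law that justifies omitting the recovered state can be illustrated with a short stochastic simulation. The sketch below uses Gillespie's direct method rather than OFSP, and the infection and recovery rates `beta` and `gamma` are hypothetical placeholders, not the values of Table 12. Only the initial populations and the end time \(t=0.15\) are taken from the text above, with \(R(0)=0\) following from the stated total of 204.

```python
import random

def simulate_sir(s, i, r, beta, gamma, t_end, seed=0):
    """One SIR trajectory by Gillespie's direct method; returns a list of (t, S, I, R)."""
    rng = random.Random(seed)
    t = 0.0
    path = [(t, s, i, r)]
    while True:
        a_infect = beta * s * i      # propensity of S + I -> 2I
        a_recover = gamma * i        # propensity of I -> R
        a_total = a_infect + a_recover
        if a_total == 0:
            break                    # no infected individuals remain
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        if rng.random() * a_total < a_infect:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, s, i, r))
    return path

# beta and gamma are hypothetical rates chosen only for illustration.
path = simulate_sir(s=200, i=4, r=0, beta=0.3, gamma=1.0, t_end=0.15)
print(len(path), path[-1])
```

Since every reaction moves exactly one individual between compartments, R can always be recovered as \(204 - S - I,\) which is why the state space can be reduced to two dimensions.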
Cite this article
Sunkara, V. Algebraic expressions of conditional expectations in gene regulatory networks. J. Math. Biol. 79, 1779–1829 (2019). https://doi.org/10.1007/s00285-019-01410-y
DOI: https://doi.org/10.1007/s00285-019-01410-y