Stochastic Rate Parameter Inference Using the Cross-Entropy Method
Abstract
We present a new, efficient algorithm for inferring, from time-series data or high-throughput data (e.g., flow cytometry), stochastic rate parameters for chemical reaction network models. Our algorithm combines the Gillespie stochastic simulation algorithm (including approximate variants such as tau-leaping) with the cross-entropy method. Also, it can work with incomplete datasets missing some model species, and with multiple datasets originating from experiment repetitions. We evaluate our algorithm on a number of challenging case studies, including bistable systems (Schlögl’s and toggle switch) and experimental data.
1 Introduction
In this paper we are concerned with the inference of biochemical reaction stochastic rate parameters from data. Reactions are discrete events that can occur randomly at any time with a rate dependent on the chemical kinetics [40]. It has recently become clear that stochasticity can produce dynamics profoundly different from those of the corresponding deterministic models. This is the case, e.g., in genetic systems where key species are present in small numbers or where key reactions occur at a low rate [23], resulting in transient, stochastic bursts of activity [4, 24]. The standard model for such systems is the Markov jump process popularised by Gillespie [13, 14]. Given a collection of reactions modelling a biological system and time-course data, the stochastic parameter inference problem is to find parameter values for which the Gillespie model’s temporal behaviour is most consistent with the data. This is a very difficult problem, much harder, both theoretically and computationally, than the corresponding problem for deterministic kinetics; see, e.g., [41, Sect. 1.3]. One simple reason is that stochastic models can behave very differently from the same initial conditions. (The related issue of parameter non-identifiability is outside the scope of this paper, but the interested reader can find more in, e.g., [37, 38] and references therein.) Additionally, experimental data is usually sparse and most often involves only a limited subset of a model’s species; and the system under study might exhibit multimodal behaviour. Also, data might not directly relate to a species: it might be measured in arbitrary units (e.g., fluorescence measurements), thus requiring the estimation of scaling factors, or it might be described by frequency distributions (e.g., high-throughput data such as flow cytometry). Stochastic parameter inference is thus a fundamental and challenging problem in systems biology, and it is crucial for obtaining validated and predictive models.

Our main contributions are as follows:

- we present a new, cross-entropy-based algorithm for the stochastic parameter inference problem that outperforms previous, state-of-the-art approaches;
- our algorithm can work with multiple, incomplete, and distribution datasets;
- we show that tau-leaping can be used within our technique;
- we provide a thorough evaluation of our algorithm on a number of challenging case studies, including bistable systems (the Schlögl model and toggle switch) and experimental data.
2 Background
3 Methods
In this section, we present our stochastic rate parameter inference with cross-entropy (SPICE) algorithm.
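SPICE drives its simulations with Gillespie's stochastic simulation algorithm (SSA). For reference, a minimal sketch of the SSA direct method in Python (not the authors' Julia implementation; `ssa` and its arguments are illustrative names):

```python
import numpy as np

def ssa(x0, rates, stoich, propensity, t_end, rng=None):
    """Gillespie direct method (sketch): x0 initial molecule counts,
    stoich[i] the state-change vector of reaction i, and
    propensity(x, rates) returning the vector of reaction propensities."""
    if rng is None:
        rng = np.random.default_rng(0)
    t, x = 0.0, np.array(x0, dtype=float)
    times, states = [t], [x.copy()]
    while t < t_end:
        a = propensity(x, rates)
        a0 = a.sum()
        if a0 <= 0:                          # no reaction can fire
            break
        t += rng.exponential(1.0 / a0)       # time to next reaction
        i = rng.choice(len(a), p=a / a0)     # which reaction fires
        x += stoich[i]
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)
```

For a simple decay reaction X -> 0 with propensity c*X, the simulated count is non-increasing and eventually absorbs at zero.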
 1.
Updating of \(\gamma _n\): Generate K sample trajectories using the SSA, \(\varvec{z}_1,\ldots ,\varvec{z}_K\), from the model \(f(\cdot ;\varvec{\theta }^{(n-1)})\), with \(\varvec{\theta }^{(n-1)}\) sampled from the lognormal distribution, and sort them in order of their performances \(J_{(1)} \le \cdots \le J_{(K)}\) (see Eqs. (7) and (6) for the actual definition of the performance, or score, function we adopt). For a fixed small \(\rho \), say \(\rho =10^{-2}\), let \(\hat{\gamma }_n\) be defined as the \(\rho \)th quantile of \(J(\varvec{z})\), i.e., \( \hat{\gamma }_n=J_{(\lceil \rho K\rceil )}\).
 2.
Updating of \(\varvec{\theta }_n\): Using the estimated level \(\hat{\gamma }_n\), use the same K sample trajectories \(\varvec{z}_1,\ldots ,\varvec{z}_K\) to derive \(\hat{\varvec{\theta }}_{n}\) and \(\hat{\varvec{\sigma }}^2_n\) from the solution of Eqs. (3) and (4). In case of numerical issues (or undersampling), our implementation switches to Eq. (5) for updating the variance.
The SPICE algorithm’s pseudocode is shown in Algorithm 1. This two-step approach provides a simple iterative scheme which converges asymptotically to the optimal density. A reasonable termination criterion is to stop if \(\hat{\gamma }_n \nleq \hat{\gamma }_{n-1} \nleq \cdots \) for a fixed number of iterations. In general, more samples are required as the mean and variance of the estimates approach their optima.
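The two update steps above can be sketched in a few lines. The following is a simplified Python illustration, not the authors' Julia code: the SSA and the score of Eqs. (7) and (6) are abstracted into a user-supplied `score` function (lower is better), and the smoothing of the updates is omitted.

```python
import numpy as np

def cross_entropy_step(mu, sigma2, score, K=1000, rho=0.01, rng=None):
    """One CE iteration: sample rate vectors from a lognormal proposal,
    score them, and refit the proposal to the elite (rho-quantile) samples."""
    if rng is None:
        rng = np.random.default_rng(0)
    # theta ~ LogNormal(mu, sigma2), elementwise over the parameter vector.
    log_theta = rng.normal(mu, np.sqrt(sigma2), size=(K, len(mu)))
    theta = np.exp(log_theta)
    J = np.array([score(t) for t in theta])            # lower is better
    # gamma_n is the rho-th quantile of the scores; elites satisfy J <= gamma_n.
    gamma = np.sort(J)[int(np.ceil(rho * K)) - 1]
    elite = log_theta[J <= gamma]
    # Mean/variance update from the elite samples (analogue of Eqs. (3)-(4)).
    return elite.mean(axis=0), elite.var(axis=0), gamma
```

Iterating this step drives \(\hat{\gamma }_n\) down and concentrates the proposal distribution around parameter values that score well.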
Adaptive Sampling. We adaptively update the number of samples \(K_n\) taken at each iteration. The reasoning is to ensure that the parameter estimates improve with statistical significance at each step. Thus, our method allows the algorithm to make faster evaluations early in the iterative process, and to concentrate simulation time on later iterations, where it becomes increasingly hard to distinguish significant improvements in the estimated parameters. We update our parameters based on a fixed number of elite samples, \(K_{{E}}\), satisfying \(J(\varvec{z})\le \gamma \). The performance of the ‘best’ elite sample is denoted \(J_n^*\), while the performance of the ‘worst’ elite sample (previously given by the \(\rho \)th quantile of \(J(\varvec{z})\)) is \(\hat{\gamma }_n\). The quantile parameter \(\rho \) is adaptively updated each iteration as \(\rho _n=K_{{E}}/K_n\), where \(K_{E}\) is typically taken to be 1–10% of the base number of samples \(K_0\). At each iteration, a check is made for improvement in either the best or the worst performing elite sample, i.e., if \( J^*_n < J^*_{n-1}\) or \( \hat{\gamma }_n < \hat{\gamma }_{n-1}\), then we can update our parameters and proceed to the next iteration. If no improvement in either value is found, the number of samples \(K_n\) in the current iteration is increased in increments, up to a maximum \(K_{\text {max}}\). If we hit the maximum number of samples \(K_{\text {max}}\) for c iterations (e.g., \(c=3\)), then this suggests no further significant improvement can be made given the restriction on the number of samples.
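The improvement check above amounts to a small decision rule, sketched here in Python (`adapt_sample_size` is a hypothetical helper name, and the increment size is an assumption, not a value from the paper):

```python
def adapt_sample_size(K, J_best, gamma, J_best_prev, gamma_prev,
                      K_max=20_000, step=1_000):
    """Keep K if either the best elite score J* or the worst elite score
    (gamma) improved this iteration; otherwise grow K toward K_max.
    Returns (new_K, improved)."""
    improved = (J_best < J_best_prev) or (gamma < gamma_prev)
    if improved:
        return K, True
    return min(K + step, K_max), False
```

A caller would additionally count how many consecutive iterations sit at \(K_{\text {max}}\) without improvement and terminate after c of them.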
Multiple Shooting and Particle Splitting. SPICE can optionally utilise these two techniques for trajectory simulation between time intervals. For multiple shooting we construct a sample trajectory comprised of T intervals matching the time stamps within the data \(\varvec{y}\). Originally [42], each segment from \(\varvec{x}_{t-1}\) to \(\varvec{x}_t\) was simulated using an ODE model with the initial conditions set to the previous time point of the dataset, i.e., \(\varvec{x}_{t-1} = \varvec{y}_{t-1}\). We instead treat the data as being mixture-normally distributed, and thus sample our initial conditions \(\varvec{x}_{t-1}\sim \mathcal {N}(\varvec{y}_{n,t-1},\varvec{\sigma }^2_{n,t-1})\), where the index n of the time series is first uniformly sampled. Using the SSA, each piecewise section of a trajectory belonging to sample k is then simulated with the same parameter vector \(\varvec{\theta }\). For particle splitting we adopt a multilevel splitting approach as in [8], and the objective function is calculated after the simulation of each segment from \(\varvec{x}_{t-1}\) to \(\varvec{x}_t\). The trajectories \(\varvec{z}_k\) satisfying \(J(\varvec{z}_k)\le \hat{\gamma }\) are then resampled with replacement \(K_n\) times before simulation continues (recall that \(K_n\) is the number of samples in the nth iteration). This process aims at discarding poorly performing trajectories in favour of those ‘closest’ to the data. This will in turn create an enriched sample, at the cost of introducing an aspect of bias propagation.
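The resampling step of the particle-splitting variant can be illustrated as follows (a Python sketch; the trajectory objects here stand in for full SSA paths, and `split_resample` is an illustrative name):

```python
import numpy as np

def split_resample(trajectories, scores, gamma, K, rng=None):
    """Multilevel-splitting step: discard trajectories scoring worse than
    gamma and resample the survivors with replacement back up to K particles."""
    if rng is None:
        rng = np.random.default_rng(0)
    keep = [i for i, s in enumerate(scores) if s <= gamma]
    idx = rng.choice(keep, size=K, replace=True)
    return [trajectories[i] for i in idx]
```

Only the 'closest' trajectories survive, so the particle population after each segment is enriched around the data, at the cost of the bias propagation noted above.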
Hyperparameters. SPICE allows for the inclusion of hyperparameters \(\varvec{\phi }\) (e.g., scaling constants and non-kinetic-rate parameters), which are sampled (logarithmically) alongside \(\varvec{\theta }\). These hyperparameters are updated at each iteration via the standard CE method.
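Logarithmic sampling over a search interval can be pictured as sampling uniformly in log space, so that every order of magnitude is covered equally (a sketch with a hypothetical helper name, not the authors' implementation):

```python
import numpy as np

def sample_log_uniform(lo, hi, size, rng=None):
    """Draw samples spread uniformly on a log scale over [lo, hi]."""
    if rng is None:
        rng = np.random.default_rng(0)
    return np.exp(rng.uniform(np.log(lo), np.log(hi), size))
```

For a range such as \([10^{-3}, 10]\), the median of such samples sits near the geometric centre (0.1) rather than near the arithmetic midpoint.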
Proposition 1
The CE solution for the optimal rate parameter over a tau-leaping trajectory is the same as that for a standard SSA trajectory.
Proof
4 Experiments
We utilise our SPICE algorithm on four commonly investigated systems: (i) the Lotka-Volterra predator-prey model, (ii) a Yeast Polarization model, (iii) the bistable Schlögl system, and (iv) the Genetic Toggle Switch. We present results for each system obtained using both the standard SSA and optimised tau-leaping (with an error-control parameter of \(\varepsilon =0.1\)) to drive our simulations.
For each run of the algorithm we set the sample parameters \(K_{E}=10\), \(K_{\text {min}}=1,000\), \(K_{\text {max}}=20,000\), and set an upper limit on the number of iterations to 250. The smoothing parameters \((\lambda , \beta , q)\) were set to (0.7, 0.8, 5), respectively. For our analysis, we define the mean relative error (MRE) between a parameter estimate \(\hat{\varvec{\theta }}\) and the truth \(\varvec{\theta }^*\) as \(\text {MRE}(\%_{\textsc {ERR}})=M^{-1}\sum _{j=1}^M |\hat{\theta }_j - \theta ^*_j| / \theta ^*_j \times 100 \). All our experiments were performed on an Intel Xeon 2.9GHz Linux system without using multiple cores—all reported CPU times are single-core. SPICE has been implemented in Julia and is open source (https://github.com/pzuliani/SPICE).
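The MRE definition above translates directly into code (a small Python helper mirroring the formula; `mean_relative_error` is an illustrative name):

```python
def mean_relative_error(theta_hat, theta_true):
    """MRE(%) = (1/M) * sum_j |theta_hat_j - theta*_j| / theta*_j * 100."""
    M = len(theta_true)
    return 100.0 / M * sum(abs(h - t) / t
                           for h, t in zip(theta_hat, theta_true))
```

For example, estimates that are each 10% off from the truth give an MRE of 10%.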
For models (i)–(iii), we use synthetic data where the true solution is known, and compare the results of SPICE against some commonly used parameter estimation techniques implemented in COPASI 4.16 [17]. Specifically, we check the performance of SPICE against the genetic algorithm (GA), evolution strategy (ES), evolutionary programming (EP), and particle swarm (PS) implementations. For the ES and EP algorithms we allow 250 generations with a population of 1,000 particles. For the GA, we run 500 generations with 2,000 particles. For the PS, we allow 1,000 iterations with 1,000 particles^{1}. For model (iv), the Genetic Toggle Switch, we show results for SPICE using real experimental data.
All statistics presented are based on 100 runs of each algorithm using fixed datasets. For each approach we also compared the performance of the standard SSA versus tau-leaping, alongside the multiple shooting and particle splitting approaches. However, for the models tested, neither multiple shooting nor particle splitting helped in reducing CPU times or improving the accuracy of the estimates.
In the previous Lotka-Volterra predator-prey example, SPICE was provided with the complete data for both species \(X_1,X_2\). However, we are also concerned with cases where the data is not fully observed, i.e., when we have latent species. To compare the effects of latent species on the quality of parameter estimates, we ran SPICE again (averaging across 100 runs), this time supplying information about species \(X_1\) alone. The results are presented in Table 1.
The relative errors for each stochastic rate parameter averaged across 100 runs using COPASI’s Evolutionary Programming (EP), Evolution Strategy (ES), Genetic Algorithm (GA), and Particle Swarm (PS) algorithms, and our SPICE algorithm are shown. The minimum, maximum, and average mean relative error (MRE) for all parameter estimates across all runs are also given alongside the averaged CPU time.
The bounds placed on the initial kinetic-parameter search space, based upon reported half-lives for the variants of GFP [2] and mCherry [31], were \(\theta _{1,3}\, \mathord {\in }\, [1\mathrm {e}{-3},1]\) and \(\theta _{2,4}\, \mathord {\in } \,[1,50]\). The respective bounds on the search space for the hyperparameters were \(\phi _{1,2,3,4}\, \mathord {\in }\, [1\mathrm {e}{-3},10]\) and \(\phi _{5,6}\, \mathord {\in }\, [50,500]\). To generate the parameter estimates, we used SPICE with tau-leaping (\(\varepsilon =0.1\), CPU time = 4,293 s). The estimated parameters and the resulting fit of the model against the data can be seen in Fig. 5.
5 Discussion
We can see from the presented results that our SPICE algorithm performs well on the models studied. For the Lotka-Volterra model the quality of the estimates is always good: there is no relative error larger than 2.1% in Table 1 for SPICE. The CPU times are reasonable in absolute terms (about 20 min, single core) and much smaller than those of the methods implemented in COPASI, which also produce larger errors. Also, having one unobserved species (\(X_2\)) in the data does not seem to impact the results much: Table 1 shows that the latent model does have a higher error than the fully observable model, but the error is always smaller than 10%, which is acceptable.
The Yeast Polarization model is a more difficult system: we can indeed see from Table 1 that a number of parameter estimates have large relative errors. These are the same ‘hard’ parameters estimated by MCEM\(^2\) [8], with similar errors. In CPU-time terms, however, our SPICE algorithm does much better than MCEM\(^2\): SPICE can return a quite good estimate (in line with MCEM\(^2\)’s) on average in about 18 min using the direct method, while MCEM\(^2\) would need about 30 days [8], a speedup of 2,400 times. Furthermore, for this model one could use tau-leaping instead of the direct method, gaining a 3x speedup in performance while giving up little accuracy (the Min., Av., and Max. MRE \(\%_\text {ERR}\) were 31.2, 41.5, and 56.3, respectively; Av. CPU time was 303 s).
The Schlögl system is another challenging case study, as clearly shown by the results in Table 1, which were obtained using tau-leaping (in fact, for the Schlögl model the average accuracy of SPICE increases with the use of tau-leaping). Our choice was motivated by the large CPU time of the direct method: the upper steady state for X in the model has a large molecule number (about 600), which negatively impacts the running time of the direct-method samples. The results in Table 1 show that there is no clear winner: the Evolutionary Programming method in COPASI has the smallest runtime, but twice the error achieved by SPICE, which has the best accuracy. As noted before, running the COPASI implementations with larger populations and more iterations did not significantly improve accuracy for the increased cost.
Lastly, the genetic Toggle Switch presents an interesting real-world case study with high-throughput data. The model now comprises four hyperparameters, each of which must be estimated alongside the four kinetic rate constants. In addition, the non-discrete (and noisy) data is no longer known to be generated from a convenient mathematical model; in other words, there is no guarantee that the model reflects the true underlying biochemical reaction network. Despite these challenges, our SPICE algorithm does a very good job (in little more than an hour of CPU time) of computing parameter estimates for which the model quite closely matches the experimental data: we see from Fig. 5 that the model simulations fall inside the data, with very few exceptions, and that the empirical and simulated distributions closely match.
Related Work. Techniques for stochastic rate parameter estimation fall into four categories. Early efforts included methods based on MLE: simulated maximum likelihood utilises Monte Carlo simulation and a genetic algorithm to maximise an approximated likelihood [34]. Efforts have been made to incorporate the Expectation-Maximisation (EM) algorithm with the SSA [18]. Stochastic gradient descent explores a Markov chain Monte Carlo sampler with a Metropolis-Hastings update step [39]. In [25] a hidden Markov model is used for the system state, which is then solved by (approximate) likelihood maximisation. Lastly, a recent work [8] has combined an ascent-based EM algorithm with a modified cross-entropy method. A second category of methodologies comprises Bayesian inference. In particular, approximate Bayesian computation (ABC) gains an advantage by being ‘likelihood free’, and recent advances in sequential Monte Carlo (SMC) samplers have further improved these methods [32, 35]. We note the similarities between ABC(SMC) approaches and SPICE: both methods can utilise ‘elite’ samples to produce better parameter estimates. A key difference is that ABC(SMC) uses accepted simulation parameters to construct a posterior distribution, while SPICE utilises complete trajectory information to compute optimal updates of an underlying parameter distribution. The Bayesian approach presented in [5] can handle partially observed systems, including notions of experimental error. Linear noise approximation techniques have been used alongside Bayesian analysis [19]. A very recent work [36] combines Bayesian analysis with statistical emulation in an attempt to reduce the cost due to the SSA simulations. A third class of methodologies centres around the numerical solution of the chemical master equation (CME), which is often intractable for all but the simplest of systems.
One approach is to use dynamic state-space truncation [3] or finite state projection methods [9], which truncate the CME state space by ignoring the smallest-probability states. Another variation is to use a method-of-moments approximation [10, 16] to construct ordinary differential equations (ODEs) describing the time evolution of the mean, variance, etc., of the underlying distribution. Other CME approximations are system-size expansion using van Kampen’s expansion [11], and solutions of the Fokker-Planck equation [22] using a form of linear noise approximation. Finally, another method [42] treats intervals between time measurements piecewise, and within each interval an ODE approximation is used for the objective function; this method has been recently extended using the linear noise approximation [43]. A recent work [1], tailored for high-throughput data, proposes a stochastic parameter inference approach based on the comparison of distributions.
6 Conclusions
In this paper we have introduced the SPICE algorithm for rate parameter inference in stochastic reaction networks. Our algorithm is based on the cross-entropy method and Gillespie’s algorithm, with a number of significant improvements. Key strengths of our algorithm are its ability to use multiple, possibly incomplete datasets (including distribution data), and its (theoretically justified) use of tau-leaping methods for model simulation. We have shown that SPICE works well in practice, in terms of both computational cost and estimate accuracy (which was often the best on the models tested), even on challenging case studies involving bistable systems and real high-throughput data. On a non-trivial case study, SPICE can be orders of magnitude faster than other approaches, while offering comparable accuracy in the estimates.
Acknowledgements
This work has been supported by a BBSRC DTP PhD studentship and the EPSRC Portabolomics project (EP/N031962/1).
References
1. Aguilera, L.U., Zimmer, C., Kummer, U.: A new efficient approach to fit stochastic models on the basis of high-throughput experimental data using a model of IRF7 gene expression as case study. BMC Syst. Biol. 11(1), 26 (2017)
2. Andersen, J.B., Sternberg, C., Poulsen, L.K., Bjørn, S.P., Givskov, M., Molin, S.: New unstable variants of green fluorescent protein for studies of transient gene expression in bacteria. Appl. Environ. Microbiol. 64(6), 2240–2246 (1998)
3. Andreychenko, A., Mikeev, L., Spieler, D., Wolf, V.: Approximate maximum likelihood estimation for stochastic chemical kinetics. EURASIP J. Bioinform. Syst. Biol. 2012(1), 9 (2012)
4. Blake, W.J., Kærn, M., Cantor, C.R., Collins, J.J.: Noise in eukaryotic gene expression. Nature 422(6932), 633–637 (2003)
5. Boys, R., Wilkinson, D., Kirkwood, T.: Bayesian inference for a discretely observed stochastic kinetic model. Stat. Comput. 18, 125–135 (2008)
6. Costa, A., Jones, O.D., Kroese, D.: Convergence properties of the cross-entropy method for discrete optimization. Oper. Res. Lett. 35(5), 573–580 (2007)
7. Daigle, B.J., Roh, M.K., Gillespie, D.T., Petzold, L.R.: Automated estimation of rare event probabilities in biochemical systems. J. Chem. Phys. 134(4), 044110 (2011)
8. Daigle, B.J., Roh, M.K., Petzold, L.R., Niemi, J.: Accelerated maximum likelihood parameter estimation for stochastic biochemical systems. BMC Bioinform. 13(1), 68 (2012)
9. Dandach, S.H., Khammash, M.: Analysis of stochastic strategies in bacterial competence: a master equation approach. PLoS Comput. Biol. 6(11), 1–11 (2010)
10. Engblom, S.: Computing the moments of high dimensional solutions of the master equation. Appl. Math. Comput. 180(2), 498–515 (2006)
11. Fröhlich, F., Thomas, P., Kazeroonian, A., Theis, F.J., Grima, R., Hasenauer, J.: Inference for stochastic chemical kinetics using moment equations and system size expansion. PLoS Comput. Biol. 12(7), 1–28 (2016)
12. Gardner, T.S., Cantor, C.R., Collins, J.J.: Construction of a genetic toggle switch in Escherichia coli. Nature 403(6767), 339–342 (2000)
13. Gillespie, D.T.: A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 22(4), 403–434 (1976)
14. Gillespie, D.T.: Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81(25), 2340–2361 (1977)
15. Gillespie, D.T.: Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 115(4), 1716–1733 (2001)
16. Hasenauer, J., Wolf, V., Kazeroonian, A., Theis, F.J.: Method of conditional moments (MCM) for the chemical master equation. J. Math. Biol. 69(3), 687–735 (2014)
17. Hoops, S., et al.: COPASI – a complex pathway simulator. Bioinformatics 22(24), 3067–3074 (2006)
18. Horváth, A., Martini, D.: Parameter estimation of kinetic rates in stochastic reaction networks by the EM method. In: BMEI, pp. 713–717. IEEE (2008)
19. Komorowski, M., Finkenstädt, B., Harper, C.V., Rand, D.A.: Bayesian inference of biochemical kinetic parameters using the linear noise approximation. BMC Bioinform. 10(1), 343 (2009)
20. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
21. Leon, M.: Computational design and characterisation of synthetic genetic switches. Ph.D. thesis, University College London, UK (2017). http://discovery.ucl.ac.uk/1546318/1/Leon_Miriam_thesis_final.pdf
22. Liao, S., Vejchodský, T., Erban, R.: Tensor methods for parameter estimation and bifurcation analysis of stochastic reaction networks. J. Roy. Soc. Interface 12(108), 20150233 (2015)
23. McAdams, H.H., Arkin, A.: Stochastic mechanisms in gene expression. PNAS 94(3), 814–819 (1997)
24. Pirone, J.R., Elston, T.C.: Fluctuations in transcription factor binding can explain the graded and binary responses observed in inducible gene expression. J. Theoret. Biol. 226(1), 111–112 (2004)
25. Reinker, S., Altman, R.M., Timmer, J.: Parameter estimation in stochastic biochemical reactions. IEE Proc. Syst. Biol. 153(4), 168–178 (2006)
26. Robert, C., Casella, G.: Monte Carlo Statistical Methods. Springer, Heidelberg (2004). https://doi.org/10.1007/978-1-4757-4145-2
27. Rubinstein, R.Y.: Optimization of computer simulation models with rare events. Eur. J. Oper. Res. 99(1), 89–112 (1997)
28. Rubinstein, R.Y.: The cross-entropy method for combinatorial and continuous optimization. Methodol. Comput. Appl. Prob. 1(2), 127–190 (1999)
29. Rubinstein, R.Y., Kroese, D.P.: The Cross-Entropy Method. Springer, Heidelberg (2004)
30. Schlögl, F.: Chemical reaction models for non-equilibrium phase transitions. Zeitschrift für Physik 253(2), 147–161 (1972)
31. Shaner, N.C., Campbell, R.E., Steinbach, P.A., Giepmans, B.N.G., Palmer, A.E., Tsien, R.Y.: Improved monomeric red, orange and yellow fluorescent proteins derived from Discosoma sp. red fluorescent protein. Nat. Biotechnol. 22, 1567–1572 (2004)
32. Sisson, S.A., Fan, Y., Tanaka, M.M.: Sequential Monte Carlo without likelihoods. PNAS 104(6), 1760–1765 (2007)
33. Sobol’, I.M.: On the distribution of points in a cube and the approximate evaluation of integrals. USSR Comput. Math. Math. Phys. 7(4), 86–112 (1967)
34. Tian, T., Xu, S., Gao, J., Burrage, K.: Simulated maximum likelihood method for estimating kinetic rates in gene expression. Bioinformatics 23(1), 84–91 (2007)
35. Toni, T., Welch, D., Strelkowa, N., Ipsen, A., Stumpf, M.P.H.: Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. Roy. Soc. Interface 6(31), 187–202 (2009)
36. Vernon, I., Liu, J., Goldstein, M., Rowe, J., Topping, J., Lindsey, K.: Bayesian uncertainty analysis for complex systems biology models: emulation, global parameter searches and evaluation of gene functions. BMC Syst. Biol. 12(1), 1 (2018)
37. Villaverde, A.F., Banga, J.R.: Reverse engineering and identification in systems biology: strategies, perspectives and challenges. J. Roy. Soc. Interface 11(91), 20130505 (2013)
38. Voit, E.O.: The best models of metabolism. Wiley Interdisc. Rev.: Syst. Biol. Med. 9(6), e1391 (2017)
39. Wang, Y., Christley, S., Mjolsness, E., Xie, X.: Parameter inference for discretely observed stochastic kinetic models using stochastic gradient descent. BMC Syst. Biol. 4(1), 99 (2010)
40. Wilkinson, D.J.: Stochastic modelling for quantitative description of heterogeneous biological systems. Nat. Rev. Genet. 10(2), 122–133 (2009)
41. Wilkinson, D.J.: Stochastic Modelling for Systems Biology. CRC Press, Boca Raton (2012)
42. Zimmer, C., Sahle, S.: Parameter estimation for stochastic models of biochemical reactions. J. Comput. Sci. Syst. Biol. 6(1), 11–21 (2012)
43. Zimmer, C., Sahle, S.: Deterministic inference for stochastic systems using multiple shooting and a linear noise approximation for the transition probabilities. IET Syst. Biol. 9, 181–192 (2015)
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.