Sensitivity Analysis for Multiscale Stochastic Reaction Networks Using Hybrid Approximations

Gupta, Ankit; Khammash, Mustafa

doi:10.1007/s11538-018-0521-4

Special Issue: Gillespie and His Algorithms
Published: 09 October 2018

Volume 81, pages 3121–3158, (2019)

Bulletin of Mathematical Biology

Abstract

We consider the problem of estimating parameter sensitivities for stochastic models of multiscale reaction networks. These sensitivity values are important for model analysis, and most existing methods for sensitivity estimation rely on simulations of the stochastic dynamics. This is problematic because such simulations become computationally infeasible for multiscale networks, in which reactions fire at several different timescales. However, it is often possible to exploit the multiscale property to derive a “model reduction” and approximate the dynamics as a piecewise deterministic Markov process (PDMP), which is a hybrid process consisting of both discrete and continuous components. The aim of this paper is to show that such PDMP approximations can be used to accurately and efficiently estimate the parameter sensitivity for the original multiscale stochastic model. We prove the convergence of the original sensitivity to the corresponding PDMP sensitivity, in the limit where the PDMP approximation becomes exact. Moreover, we establish a representation of the PDMP parameter sensitivity that separates the contributions of discrete and continuous components in the dynamics and allows one to efficiently estimate both contributions.

Notes

The generator of a Markov process is an operator specifying the infinitesimal rate of change of the distribution of the process (see Chapter 4 of Ethier and Kurtz (1986) for more details).

References

Anderson DF (2007) A modified next reaction method for simulating chemical systems with time dependent propensities and delays. J Chem Phys 127(21):214107
Anderson DF (2012) An efficient finite difference method for parameter sensitivities of continuous time Markov chains. SIAM J Numer Anal 50(5):2237–2258
Anderson DF, Kurtz TG (2011) Continuous time Markov chain models for chemical reaction networks. In: Koeppl H, Setti G, di Bernardo M, Densmore D (eds) Design and analysis of biomolecular circuits. Springer, Berlin
Arkin AP, Rao CV, Wolf DM (2002) Control, exploitation and tolerance of intracellular noise. Nature 420:231–237. https://doi.org/10.1038/nature01258
Ball K, Kurtz TG, Popovic L, Rempala G (2006) Asymptotic analysis of multiscale approximations to reaction networks. Ann Appl Probab 16(4):1925–1961
Cao Y, Petzold LR, Rathinam M, Gillespie DT (2004) The numerical stability of leaping methods for stochastic simulation of chemically reacting systems. J Chem Phys 121(24):12169–12178
Cao Y, Gillespie DT, Petzold LR (2005) The slow-scale stochastic simulation algorithm. J Chem Phys 122(1):1–18
Cao Y, Gillespie DT, Petzold LR (2006) Efficient step size selection for the tau-leaping simulation method. J Chem Phys 124(4):044109
Crudu A, Debussche A, Radulescu O (2009) Hybrid stochastic simplifications for multiscale gene networks. BMC Syst Biol 3(1):89
Darden T (1979) A pseudo-steady state approximation for stochastic chemical kinetics. Rocky Mt J Math 9(1):51–71
Davis MHA (1993) Markov models and optimization, vol 49. Monographs on statistics and applied probability. Chapman & Hall, London
Duncan A, Erban R, Zygalakis K (2016) Hybrid framework for the simulation of stochastic chemical kinetics. J Comput Phys 326:398–419
Elowitz MB, Levine AJ, Siggia ED, Swain PS (2002) Stochastic gene expression in a single cell. Science 297(5584):1183–1186. https://doi.org/10.1126/science.1070919
Ethier SN, Kurtz TG (1986) Markov processes: characterization and convergence. Wiley series in probability and mathematical statistics. Wiley, New York. ISBN 0-471-08186-8
Eymard R, Mercier S, Roussignol M (2011) Importance and sensitivity analysis in dynamic reliability. Methodol Comput Appl Probab 13(1):75–104
Feng X, Hooshangi S, Chen D, Li Weiss R, Rabitz H (2004) Optimizing genetic circuits by global sensitivity analysis. Biophys J 87(4):2195–2202
Fink M, Noble D (2009) Markov models for ion channels: versatility versus identifiability and speed. Philos Trans R Soc A Math Phys Eng Sci 367(1896):2161–2179
Ganguly A, Altintan D, Koeppl H (2015) Jump-diffusion approximation of stochastic reaction dynamics: error bounds and algorithms. Multiscale Model Simul 13(4):1390–1419
Gibson MA, Bruck J (2000) Efficient exact stochastic simulation of chemical systems with many species and many channels. J Phys Chem A 104(9):1876–1889
Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81(25):2340–2361
Gillespie DT (2001) Approximate accelerated stochastic simulation of chemically reacting systems. J Chem Phys 115(4):1716–1733
Goutsias J (2007) Classical versus stochastic kinetics modeling of biochemical reaction systems. Biophys J 92(7):2350–2365
Gunawan R, Cao Y, Doyle FJ (2005) Sensitivity analysis of discrete stochastic systems. Biophys J 88(4):2530–2540
Gupta A, Khammash M (2013) Unbiased estimation of parameter sensitivities for stochastic chemical reaction networks. SIAM J Sci Comput 35(6):2598–2620
Gupta A, Khammash M (2014) An efficient and unbiased method for sensitivity analysis of stochastic reaction networks. J R Soc Interface 11(101):20140979
Gupta A, Rathinam M, Khammash M (2017) Estimation of parameter sensitivities for stochastic reaction networks using tau-leap simulations. arXiv:1703.00947
Gupta A, Rathinam M, Khammash M (2018) Estimation of parameter sensitivities for stochastic reaction networks using tau-leap simulations. SIAM J Numer Anal 56(2):1134–1167
Hepp B, Gupta A, Khammash M (2015) Adaptive hybrid simulations for multiscale stochastic reaction networks. J Chem Phys 142(3):034118
Kang H-W, Kurtz TG (2013) Separation of time-scales and model reduction for stochastic reaction networks. Ann Appl Probab 23(2):529–583
Kurtz TG (1978) Strong approximation theorems for density dependent Markov chains. ISSN 0304-4149
McAdams HH, Arkin A (1999a) It’s a noisy business! Genetic regulation at the nanomolar scale. TIG 15(2):65–69 (ISSN 0168-9525)
McAdams HH, Arkin A (1999b) It’s a noisy business! Genetic regulation at the nanomolar scale. TIG 15(2):65–69 (ISSN 0168-9525)
Michaelis L, Menten ML (2007) Die Kinetik der Invertinwirkung. Universitätsbibliothek Johann Christian Senckenberg
Plyasunov S, Arkin AP (2007) Efficient stochastic sensitivity analysis of discrete event systems. J Comput Phys 221:724–738
Rathinam M, Petzold LR, Cao Y, Gillespie DT (2003) Stiffness in stochastic chemically reacting systems: the implicit tau-leaping method. J Chem Phys 119(24):12784–12794
Rathinam M, Sheppard PW, Khammash M (2010) Efficient computation of parameter sensitivities of discrete stochastic chemical reaction networks. J Chem Phys 132(3):034103
Rudnicki R, Tyran-Kamińska M (2017) Piecewise deterministic processes in biological models. Springer, Berlin
Sheppard PW, Rathinam M, Khammash M (2012) A pathwise derivative approach to the computation of parameter sensitivities in discrete stochastic chemical systems. J Chem Phys 136(3):034115
Stelling J, Gilles ED, Doyle FJ (2004) Robustness properties of circadian clock architectures. Proc Natl Acad Sci USA 101(36):13210–13215
Thattai M, van Oudenaarden A (2001) Intrinsic noise in gene regulatory networks. Proc Natl Acad Sci 98(15):8614–8619
Weinan E, Liu D, Vanden-Eijnden E (2005) Nested stochastic simulation algorithm for chemical kinetic systems with disparate rates. J Chem Phys 123(19):1–8
Weinan E, Liu D, Vanden-Eijnden E (2007) Nested stochastic simulation algorithms for chemical kinetic systems with multiple time scales. J Comput Phys 221(1):158–180 (ISSN 0021-9991)

Acknowledgements

Funding was provided by the European Research Council (Grant no. 743269).

Author information

Authors and Affiliations

Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058, Basel, Switzerland
Ankit Gupta & Mustafa Khammash

Authors

Ankit Gupta
Mustafa Khammash

Corresponding author

Correspondence to Mustafa Khammash.

Appendix

In this Appendix, we prove the main results of the paper, Theorems 2 and 3. Throughout this section, we use the same notation as in Sect. 3. In particular, $\varPsi _{t}(x,U,\theta )$ is defined by (15), and under Assumption 1 this function is differentiable with respect to $x$. We start this section with a simple proposition.

Proposition 1

Let the multiscale process $( Z^N_\theta (t) )_{t \ge 0}$ and the PDMP $( Z_\theta (t) )_{t \ge 0}$ be as in Sect. 3. Suppose that Assumption 1 holds and $Z^N_\theta \Rightarrow Z_\theta $ as $N \rightarrow \infty $. Then,

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{ \partial }{ \partial \theta } \mathbb {E}( f( Z^{N}_\theta ( T ) ) ) = {\bar{S}}_\theta (f,T) \end{aligned}$$

where

$$\begin{aligned} {\bar{S}}_\theta (f,T)&= \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^T \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla \varPsi _{T-t}(x_\theta (t),U_\theta (t),\theta ) , \zeta ^{(c)}_k \right\rangle {\text {d}}t \right] \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^T \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \varDelta _{k} \varPsi _{T-t}(x_\theta (t),U_\theta (t),\theta ) {\text {d}}t \right] . \end{aligned}$$

(24)

Proof

Let

$$\begin{aligned} S^N_\theta (f,T) = \frac{ \partial }{ \partial \theta } \mathbb {E}( f( Z^{N}_\theta ( T ) ) ) \end{aligned}$$

and analogous to $\varPsi _t$ (15) define the map $\varPsi ^N_t$ as

$$\begin{aligned} \varPsi ^N_t (x,U,\theta ) = \mathbb {E}( f( Z^N_\theta (t) ) ), \quad \text {for any} \quad t \ge 0, \end{aligned}$$

where $Z^N$ is the scaled process describing the multiscale reaction dynamics with (x, U) as its initial state. Due to Theorem 3.1 in (Gupta et al. 2017), we obtain

$$\begin{aligned}&S^N_\theta (f,T) \\&= \sum _{ k =1}^K \mathbb {E}\left( N^{\rho _k + r } \int _0^T \partial _\theta \lambda ^N_k( Z^N_\theta (t) ,\theta ) ( \varPsi ^N_{T-t}( Z^N_\theta (t)+ \zeta ^N_k,\theta ) - \varPsi ^N_{T-t}( Z^N_\theta (t) , \theta ) ) {\text {d}}t \right) , \end{aligned}$$

where $\zeta ^N_k:= \varLambda _N \zeta _k$, $\rho _k = \beta _k + \langle \nu _k, \alpha \rangle $ and r is the timescale of observation (10). We can write $Z^N_\theta (t) = (x^N_\theta (t), U^N_\theta (t) )$ where $x^N_\theta (t) \in \mathbb {R}^{S_c}$ denotes the states of species in ${\mathscr {S}}_c$ and $U^N_\theta (t) \in \mathbb {N}_0^{S_d}$ denotes the states of species in ${\mathscr {S}}_d$. Exploiting the analysis in Sect. 2.3, we can express $S^N_\theta (f,T) $ as

$$\begin{aligned} S^N_\theta (f,T) = S^{N,c}_\theta (f,T) + S^{N,d}_\theta (f,T) +o(1) \end{aligned}$$

where the o(1) term converges to 0 as $N \rightarrow \infty $,

$$\begin{aligned} S^{N,c}_\theta (f,T)&= \sum _{ k \in {\mathscr {R}}_c} \mathbb {E}\left( \int _0^T \partial _\theta \lambda _k( x^N_\theta (t), U^N_\theta (t) ,\theta ) N^{\rho _k + r } ( \varPsi ^N_{T-t}( x^N_\theta (t)\right. \\&\quad \left. +N^{-(\rho _k + r) } \zeta ^{(c)}_k , U^N_\theta (t) , \theta ) - \varPsi ^N_{T-t}( x^N_\theta (t), U^N_\theta (t) ,\theta ) ) {\text {d}}t \right) \end{aligned}$$

and

$$\begin{aligned} S^{N,d}_\theta (f,T)&= \sum _{ k \in {\mathscr {R}}_d}\mathbb {E}\left( \int _0^T \partial _\theta \lambda _k( x^N_\theta (t), U^N_\theta (t),\theta ) \right. \\&\qquad \left. ( \varPsi ^N_{T-t}( x^N_\theta (t), U^N_\theta (t)+ \zeta ^{(d)}_k,\theta ) - \varPsi ^N_{T-t}( x^N_\theta (t), U^N_\theta (t) , \theta ) ) {\text {d}}t \right) . \end{aligned}$$

We know that as $N \rightarrow \infty $, the process $(x^N_\theta , U^N_\theta )$ converges in distribution to the process $(x_\theta , U_\theta )$ in the Skorohod topology on $\mathbb {R}^{S_c} \times \mathbb {N}^{S_d}_0$. This ensures that for any (x, U) and $t \ge 0$, $\varPsi ^N_t(x,U,\theta ) \rightarrow \varPsi _t(x,U,\theta )$ as $N \rightarrow \infty $, and this convergence holds uniformly over compact sets, i.e.,

$$\begin{aligned} \lim _{N \rightarrow \infty } \sup _{ (x,U) \in C , t \in [0,T] } \left| \varPsi ^N_t(x,U,\theta ) - \varPsi _t(x,U,\theta ) \right| = 0 \end{aligned}$$

(25)

for any $T >0$ and any compact set $C \subset \mathbb {R}^{S_c} \times \mathbb {N}^{S_d}_0$. In fact, under Assumption 1 we also have

$$\begin{aligned} \lim _{N \rightarrow \infty } \sup _{ (x,U) \in C , t \in [0,T] } \left\| \nabla \varPsi ^N_t(x,U,\theta ) - \nabla \varPsi _t(x,U,\theta ) \right\| = 0. \end{aligned}$$

(26)

As $(x^N_\theta , U^N_\theta ) \Rightarrow (x_\theta , U_\theta )$, using (25), it is straightforward to conclude that

$$\begin{aligned}&\lim _{N \rightarrow \infty } S^{N,d}_\theta (f,T)\nonumber \\&= \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^T \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \varDelta _{k} \varPsi _{T-t}(x_\theta (t),U_\theta (t),\theta ) {\text {d}}t \right] . \end{aligned}$$

(27)

Noting that

$$\begin{aligned}&N^{\rho _k + r } \left[ \varPsi ^N_{T-t}\left( x^N_\theta (t)+ \frac{1}{ N^{\rho _k + r } }\zeta ^{(c)}_k , U^N_\theta (t) , \theta \right) - \varPsi ^N_{T-t}\left( x^N_\theta (t), U^N_\theta (t) ,\theta \right) \right] \\&\quad = \left\langle \nabla \varPsi ^N_{T-t}\left( x^N_\theta (t), U^N_\theta (t) ,\theta \right) , \zeta ^{(c)}_k \right\rangle +o(1), \end{aligned}$$

(26) allows us to obtain

$$\begin{aligned}&\lim _{N \rightarrow \infty } S^{N,c}_\theta (f,T) \\&\quad = {\sum }_{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^T \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla \varPsi _{T-t}(x_\theta (t),U_\theta (t),\theta ) , \zeta ^{(c)}_k \right\rangle {\text {d}}t \right] . \end{aligned}$$

This relation along with (27) proves the proposition. $\square $
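The step used for the continuous reactions, namely that the scaled difference $N^{\rho _k + r}[\varPsi (x + N^{-(\rho _k + r)} \zeta ) - \varPsi (x)]$ approaches the directional derivative $\langle \nabla \varPsi (x), \zeta \rangle $, can be illustrated numerically. The following is only a minimal sketch: the function `psi` is a hypothetical smooth stand-in for $\varPsi ^N_{T-t}(\cdot , U, \theta )$ and is not taken from the paper, and `scale` plays the role of $N^{\rho _k + r}$.

```python
import math

def psi(x):
    # Hypothetical smooth test function standing in for Psi; not from the paper.
    return math.sin(x[0]) + x[0] * x[1] ** 2

def grad_psi(x):
    # Analytic gradient of the test function above.
    return (math.cos(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1])

def scaled_difference(x, zeta, scale):
    # scale * (psi(x + zeta / scale) - psi(x)); scale plays the role of N^(rho_k + r).
    shifted = tuple(xi + zi / scale for xi, zi in zip(x, zeta))
    return scale * (psi(shifted) - psi(x))

def directional_derivative(x, zeta):
    # Inner product <grad psi(x), zeta>.
    return sum(g * z for g, z in zip(grad_psi(x), zeta))

x = (0.7, 1.3)
zeta = (1.0, -2.0)
exact = directional_derivative(x, zeta)
# Absolute error of the scaled difference for scale = 10, 100, 1000, 10000.
errors = [abs(scaled_difference(x, zeta, 10.0 ** p) - exact) for p in (1, 2, 3, 4)]
```

In this toy setting the error shrinks roughly in proportion to `1 / scale`, which is the behaviour hidden in the $o(1)$ term of the expansion above.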

In light of Proposition 1, to prove Theorem 2 it suffices to show that ${\bar{S}}_\theta (f,T) = \hat{S}_\theta (f,T) $, where $\hat{S}_\theta (f,T) $ is the sensitivity for the limiting PDMP $(Z_\theta (t))_{t \ge 0 }$ defined by

$$\begin{aligned} \hat{S}_\theta (f,T)&= \lim _{ h \rightarrow 0 } \frac{ \mathbb {E}( f(Z_{\theta +h} (T) ) ) - \mathbb {E}( f(Z_{\theta } (T) ) ) }{h} \nonumber \\&= \lim _{ h \rightarrow 0 } \frac{ \mathbb {E}( f(x_{\theta +h} (T), U_{\theta +h} (T) ) ) - \mathbb {E}( f(x_{\theta } (T), U_{\theta } (T) ) ) }{h}. \end{aligned}$$

(28)

The next proposition derives a formula for $\hat{S}_\theta (f,T) $ by coupling processes $Z_\theta = (x_\theta , U_\theta )$ and $Z_{\theta +h} = (x_{\theta +h}, U_{\theta +h} )$. This formula will be useful later in proving both Theorems 2 and 3.

Proposition 2

Let $y_\theta (t)$ be the solution of IVP (16) and let $D_\theta \lambda _k( x_\theta (t), U_\theta (t) ,\theta )$ be given by (17). Then, the PDMP sensitivity $\hat{S}_\theta (f,T) $ defined by (28) can be expressed as

$$\begin{aligned} \hat{S}_\theta (f,T)&= \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{T} \partial _{\theta } \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle {\text {d}}t \right] \\&\quad + \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{T} \left\langle \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle \right] , y_\theta (t) \right\rangle {\text {d}}t \right] \\&\quad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{T} \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla \left( \varDelta _k f(x_\theta (t),U_\theta (t) ) \right) , y_\theta (t) \right\rangle {\text {d}}t \right] \\&\quad + \sum _{k \in {\mathscr {R}}_d } \mathbb {E}\left[ \int _0^T \partial _{\theta } \lambda _k (x_\theta (t) , U_\theta (t) ,\theta ) \varDelta _k \varPsi _{T- t}( x_\theta (t), U_\theta (t), \theta ) {\text {d}}t \right] \\&\quad + \sum _{k \in {\mathscr {R}}_d } \mathbb {E}\left[ \int _0^T \left\langle \nabla \lambda _k (x_\theta (t) , U_\theta (t) ,\theta ), y_\theta (t) \right\rangle \varDelta _k \varPsi _{T- t}( x_\theta (t), U_\theta (t), \theta ) {\text {d}}t \right] . \end{aligned}$$

Proof

Analogous to the “split-coupling” introduced in (Anderson 2012), we couple the PDMPs $Z_\theta = (x_\theta , U_\theta )$ and $Z_{\theta +h} = (x_{\theta +h} , U_{\theta +h} )$ as follows

$$\begin{aligned} x_\theta (t)&= x_0 + \sum _{k \in {\mathscr {R}}_c} \left( \int _{0}^t \lambda _k( x_\theta (s), U_\theta (s), \theta ) {\text {d}}s \right) \zeta ^{(c)}_k \\ x_{\theta +h}(t)&= x_0 + \sum _{k \in {\mathscr {R}}_c} \left( \int _{0}^t \lambda _k( x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) {\text {d}}s \right) \zeta ^{(c)}_k \\ U_\theta (t)&= U_0 \\&\quad + \sum _{k \in {\mathscr {R}}_d} Y_k \left( \int _{0}^t \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \wedge \lambda _k( x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) {\text {d}}s \right) \zeta ^{(d)}_k \\&\quad + \sum _{k \in {\mathscr {R}}_d} Y^{(1)}_k \left( \int _{0}^t \lambda ^{(1)}_k( x_\theta (s), U_\theta (s) ,\theta , x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) {\text {d}}s \right) \zeta ^{(d)}_k \\ U_{\theta +h}(t)&= U_0\\&\quad + \sum _{k \in {\mathscr {R}}_d} Y_k \left( \int _{0}^t \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \wedge \lambda _k( x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) {\text {d}}s \right) \zeta ^{(d)}_k \\&\quad + \sum _{k \in {\mathscr {R}}_d} Y^{(2)}_k \left( \int _{0}^t \lambda ^{(2)}_k( x_\theta (s), U_\theta (s) ,\theta , x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) {\text {d}}s \right) \zeta ^{(d)}_k, \end{aligned}$$

where $a \wedge b$ denotes the minimum of a and b, $\{ Y_k , Y^{ (1) }_k , Y^{ (2)}_{k}\}$ is a collection of independent unit-rate Poisson processes, and

$$\begin{aligned}&\lambda ^{(1)}_k( x_\theta (s), U_\theta (s) ,\theta , x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) = \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \\&\quad - \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \wedge \lambda _k( x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) \quad \text {and} \\&\lambda ^{(2)}_k( x_\theta (s), U_\theta (s) ,\theta , x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) = \lambda _k( x_{\theta +h}(s), U_{\theta +h}(s) ,\theta +h ) \\&\quad - \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \wedge \lambda _k(x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ). \end{aligned}$$
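The decomposition of each discrete reaction rate into a common part and two residual parts can be sketched in a few lines. This is an illustrative helper only: the name `split_rates` and its arguments are hypothetical, and the two inputs stand for the values of $\lambda _k$ under $\theta $ and under $\theta +h$ at a fixed time.

```python
def split_rates(rate_theta, rate_theta_h):
    """Split two propensities into (common, only_theta, only_theta_h).

    common       : rate driving the shared Poisson process Y_k
    only_theta   : residual rate lambda^(1)_k, affecting U_theta alone
    only_theta_h : residual rate lambda^(2)_k, affecting U_{theta+h} alone
    """
    common = min(rate_theta, rate_theta_h)
    return common, rate_theta - common, rate_theta_h - common

# Example values (arbitrary): the perturbed rate is slightly larger.
common, only_theta, only_theta_h = split_rates(2.0, 2.5)
```

By construction the common and residual rates add up to the original propensities, and at most one of the two residual rates is nonzero at any time, so the two processes share as many jumps as possible.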

Define a stopping time as the first time that processes $U_\theta $ and $U_{ \theta +h}$ separate, i.e.,

$$\begin{aligned} \tau _h = \inf \{ t \ge 0: U_{\theta }(t) \ne U_{\theta +h}(t) \}. \end{aligned}$$

Observe that the generator for the PDMP $Z_\theta = (x_\theta , U_\theta )$ is

$$\begin{aligned} \mathbb {A}_\theta g(x,u) = \sum _{ k \in {\mathscr {R}}_c} \lambda _k(x,u,\theta ) \left\langle \nabla g(x,u) , \zeta ^{(c)}_k \right\rangle + \sum _{k \in {\mathscr {R}}_d} \lambda _k(x,u,\theta ) \varDelta _k g(x,u), \end{aligned}$$

where $g : \mathbb {R}^{S_c} \times \mathbb {N}^{S_d}_0 \rightarrow \mathbb {R}$ is any function that is differentiable in the first $S_c$ coordinates. Applying Dynkin’s formula, we obtain

$$\begin{aligned} \mathbb {E}\left( f(x_\theta (t) ,U_\theta (t) ) \right)&= f(x_0,U_0) + \mathbb {E}\left( \int _{0}^t \mathbb {A}_\theta f( x_\theta (s) , U_\theta (s) ) {\text {d}}s \right) \qquad \text {and} \\ \quad \mathbb {E}\left( f(x_{\theta +h}(t) ,U_{\theta +h}(t) ) \right)&= f(x_0,U_0) + \mathbb {E}\left( \int _{0}^t \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) {\text {d}}s \right) . \end{aligned}$$
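To make the generator $\mathbb {A}_\theta $ concrete, the following sketch evaluates it for a toy PDMP with one continuous species and one discrete species. Everything in it is an assumption made for illustration: the gene-switching model, the rate constants, the test function, and the reading of $\varDelta _k g(x,u)$ as the jump difference $g(x, u + \zeta ^{(d)}_k) - g(x,u)$, whose definition lies outside this section.

```python
def pdmp_generator(g, dg_dx, x, u, continuous, discrete):
    """Evaluate (A g)(x, u) for scalar x and integer u.

    continuous: list of (rate_function, zeta_c) pairs, contributing drift terms
    discrete:   list of (rate_function, zeta_d) pairs, contributing jump terms
    """
    drift = sum(rate(x, u) * dg_dx(x, u) * zeta for rate, zeta in continuous)
    jumps = sum(rate(x, u) * (g(x, u + zeta) - g(x, u)) for rate, zeta in discrete)
    return drift + jumps

# Hypothetical toy model: u in {0, 1} is a gene state, x a protein concentration.
theta, a, b = 5.0, 0.2, 0.5
continuous = [
    (lambda x, u: theta * u, +1.0),  # production while the gene is on
    (lambda x, u: x, -1.0),          # first-order degradation
]
discrete = [
    (lambda x, u: a * (1 - u), +1),  # gene switches on
    (lambda x, u: b * u, -1),        # gene switches off
]

g = lambda x, u: x ** 2 + 3 * u      # arbitrary test function
dg_dx = lambda x, u: 2 * x           # its derivative in the continuous coordinate

value = pdmp_generator(g, dg_dx, 2.0, 1, continuous, discrete)
```

At $(x,u) = (2,1)$ the drift part contributes $5 \cdot 4 - 2 \cdot 4 = 12$ and the jump part contributes $0.5 \cdot (4 - 7) = -1.5$, so `value` equals 10.5.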

The above coupling between processes $Z_\theta = (x_\theta , U_\theta )$ and $Z_{\theta +h} = (x_{\theta +h} , U_{\theta +h} )$ ensures that for $0 \le s \le \tau _h$ we have $U_{\theta +h}(s) = U_{\theta }(s)$ and $x_{\theta +h}(s) = x_\theta (s) + h y_\theta (s) +o(h)$. Noting that $\tau _h \rightarrow \infty $ a.s. as $h \rightarrow 0$, we obtain

$$\begin{aligned}&\lim _{h \rightarrow 0 } \frac{1}{h} \left[ \mathbb {E}\left( \int _{0}^{\tau _h\wedge t} \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) {\text {d}}s \right) - \mathbb {E}\left( \int _{0}^{\tau _h\wedge t} \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) {\text {d}}s \right) \right] \nonumber \\&\quad = \lim _{h \rightarrow 0 } \frac{1}{h} \left[ \mathbb {E}\left( \int _{0}^{\tau _h\wedge t} \left[ \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta }(s) ) - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right] {\text {d}}s \right) \right] \nonumber \\&\quad = \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{ t} \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla f(x_\theta (s),U_\theta (s) ) , \zeta ^{(c)}_k \right\rangle {\text {d}}s \right] \nonumber \\&\qquad + \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{t} \left\langle \nabla \left[ \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla f(x_\theta (s),U_\theta (s) ) , \zeta ^{(c)}_k \right\rangle \right] , y_\theta (s) \right\rangle {\text {d}}s \right] \nonumber \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{ t} \left\langle \nabla \left[ \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varDelta _k f(x_\theta (s),U_\theta (s) ) \right] , y_\theta (s) \right\rangle {\text {d}}s \right] \nonumber \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{ t} \partial _\theta \lambda _k(x_\theta (s) ,U_\theta (s) ,\theta ) \varDelta _k f(x_\theta (s),U_\theta (s)) {\text {d}}s \right] . \end{aligned}$$

(29)
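The first-order expansion of $x_{\theta +h}$ in terms of $y_\theta $ that underlies (29) can be checked on a toy example. The sketch below is hypothetical: it assumes that between jumps the continuous state follows $\dot{x} = \theta - x$ (production at rate $\theta $, degradation at rate $x$) and that $y_\theta $ solves the linearized equation $\dot{y} = \partial _\theta F + \partial _x F \, y = 1 - y$ with $y(0) = 0$. IVP (16) is not reproduced in this section, so this form is an assumption rather than the paper's definition.

```python
import math

def euler_state_and_sensitivity(theta, t_end, dt=1e-3):
    """Explicit Euler integration of x' = theta - x and y' = 1 - y from x(0) = y(0) = 0."""
    x, y = 0.0, 0.0
    for _ in range(round(t_end / dt)):
        dx = theta - x   # drift of the continuous state
        dy = 1.0 - y     # assumed sensitivity equation for y = dx/dtheta
        x += dt * dx
        y += dt * dy
    return x, y

theta, h, t_end = 2.0, 1e-4, 1.0
x_base, y_base = euler_state_and_sensitivity(theta, t_end)
x_pert, _ = euler_state_and_sensitivity(theta + h, t_end)

finite_difference = (x_pert - x_base) / h   # (x_{theta+h} - x_theta) / h
exact_sensitivity = 1.0 - math.exp(-t_end)  # closed form for this toy model
```

For this linear model the finite difference and the integrated sensitivity agree up to rounding, and both match the closed form up to the Euler discretization error.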

Let $\sigma _0 = 0$ and for each $i=1,2,\dots $ let $\sigma _i$ denote the ith jump time of the process

$$\begin{aligned} \sum _{k \in {\mathscr {R}}_d} Y_k \left( \int _{0}^t \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \wedge \lambda _k( x_{\theta +h}(s), U_{\theta +h}(s), \theta +h ) {\text {d}}s \right) \end{aligned}$$

which counts the common jump times among processes $U_\theta $ and $U_{\theta +h}$. Observe that

$$\begin{aligned}&\lim _{h \rightarrow 0 } \frac{1}{h} \left[ \mathbb {E}\left( \int _{\tau _h\wedge t}^t \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) {\text {d}}s \right) - \mathbb {E}\left( \int _{\tau _h\wedge t}^t \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) {\text {d}}s \right) \right] \\&\quad = \sum _{i=0}^\infty \lim _{h \rightarrow 0 } \frac{1}{h} \mathbb {E}\left[ \mathrm{1l}_{ \{ \sigma _i \wedge t \le \tau _h < \sigma _{i+1} \wedge t \} } \int _{\tau _h\wedge t}^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) \right. \right. \\&\left. \left. \qquad \qquad - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \right] . \end{aligned}$$

Recall the definition of $D_\theta \lambda _k( x_\theta (t), U_\theta (t) ,\theta )$ from (17). We shall soon prove that

$$\begin{aligned}&\lim _{h \rightarrow 0 } \frac{1}{h} \mathbb {E}\left[ \mathrm{1l}_{ \{ \sigma _i \wedge t \le \tau _h < \sigma _{i+1} \wedge t \} } \int _{\tau _h\wedge t}^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) )\right. \right. \nonumber \\&\left. \left. \qquad \qquad \qquad - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s\right] \nonumber \\&\quad = \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{ t \wedge \sigma _i}^{ t \wedge \sigma _{i +1} } D_\theta \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \left( \varDelta _k \varPsi _{t- s}( x_\theta (s), U_\theta (s), \theta ) \right. \right. \nonumber \\&\left. \left. \qquad \qquad \qquad - \varDelta _k f( x_\theta (s), U_\theta (s)) \right) {\text {d}}s \right] . \end{aligned}$$

(30)

Assuming this for now, we get

$$\begin{aligned}&\lim _{h \rightarrow 0 } \frac{1}{h} \left[ \mathbb {E}\left( \int _{\tau _h\wedge t}^t \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) {\text {d}}s \right) - \mathbb {E}\left( \int _{\tau _h\wedge t}^t \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) {\text {d}}s \right) \right] \\&\quad = \sum _{k \in {\mathscr {R}}_d} \sum _{i=0}^\infty \mathbb {E}\left[ \int _{t \wedge \sigma _i}^{ t \wedge \sigma _{i+1} } D_\theta \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \left( \varDelta _k \varPsi _{t- s}( x_\theta (s), U_\theta (s), \theta ) \right. \right. \\&\left. \left. \qquad \qquad \qquad - \varDelta _k f( x_\theta (s), U_\theta (s)) \right) {\text {d}}s \right] \\&\quad = \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{ t} \partial _\theta \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \varDelta _k \varPsi _{t- s}( x_\theta (s), U_\theta (s), \theta ) {\text {d}}s \right] \\&\qquad - \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{ t} \partial _\theta \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \varDelta _k f( x_\theta (s), U_\theta (s)) {\text {d}}s \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{ t} \left\langle \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) , y_\theta ( s) \right\rangle \varDelta _k \varPsi _{t- s}( x_\theta (s), U_\theta (s), \theta ) {\text {d}}s \right] \\&\qquad - \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{ t} \left\langle \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) , y_\theta ( s) \right\rangle \varDelta _k f( x_\theta (s), U_\theta (s)) {\text {d}}s \right] . \end{aligned}$$

Combining this formula with (29), we obtain

$$\begin{aligned}&\hat{S}_\theta (f,t) \\&\quad = \lim _{h \rightarrow 0} \frac{ \mathbb {E}\left( f(x_{\theta +h} (t), U_{\theta +h} (t) ) \right) - \mathbb {E}\left( f(x_{\theta } (t), U_{\theta } (t) ) \right) }{h} \\&\quad = \lim _{h \rightarrow 0} \frac{1}{h} \mathbb {E}\left[ \int _{0}^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \right] \\&\quad = \lim _{h \rightarrow 0} \frac{1}{h} \mathbb {E}\left[ \int _{0}^{t \wedge \tau _h } \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \right] \\&\qquad + \lim _{h \rightarrow 0} \frac{1}{h} \mathbb {E}\left[ \int _{t \wedge \tau _h }^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \right] \\&\quad = \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{t} \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla f(x_\theta (s),U_\theta (s) ) , \zeta ^{(c)}_k \right\rangle {\text {d}}s \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{t} \left\langle \nabla \left[ \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla f(x_\theta (s),U_\theta (s) ) , \zeta ^{(c)}_k \right\rangle \right] , y_\theta (s) \right\rangle {\text {d}}s \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{t} \left\langle \nabla \left[ \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varDelta _k f(x_\theta (s),U_\theta (s) ) \right] , y_\theta (s) \right\rangle {\text {d}}s \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{t} \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varDelta _k f(x_\theta (s) , U_\theta ( s) ) {\text {d}}s \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{ 0}^{ t } \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varDelta _k \varPsi _{t- s}( x_\theta (s), U_\theta (s), \theta ) {\text {d}}s \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{ 0}^{ t } \left\langle \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) , y_\theta ( s) \right\rangle \varDelta _k \varPsi _{t- s}( x_\theta (s), U_\theta (s), \theta ) {\text {d}}s \right] \\&\qquad - \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{ 0}^{ t } \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varDelta _k f(x_\theta (s) , U_\theta ( s) ) {\text {d}}s \right] \\&\qquad - \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{ 0}^{ t } \left\langle \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) , y_\theta ( s) \right\rangle \varDelta _k f(x_\theta (s) , U_\theta ( s) ) {\text {d}}s \right] . \end{aligned}$$

In the last expression, the fourth term cancels with the seventh term. Expanding the third term via the product rule $\nabla (\phi \psi ) = \phi \nabla \psi + \psi \nabla \phi $ produces two terms, one of which cancels with the last term, leaving the formula stated in the proposition. Therefore, the only remaining step in the proof is to show (30), which we do next.
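For reference, the product-rule step can be written out explicitly. With the arguments $(x_\theta (s), U_\theta (s), \theta )$ suppressed, for each $k \in {\mathscr {R}}_d$ we have

$$\begin{aligned} \left\langle \nabla \left[ \lambda _k \, \varDelta _k f \right] , y_\theta (s) \right\rangle = \lambda _k \left\langle \nabla \left( \varDelta _k f \right) , y_\theta (s) \right\rangle + \left\langle \nabla \lambda _k , y_\theta (s) \right\rangle \varDelta _k f. \end{aligned}$$

The second summand on the right is exactly the integrand of the final negative term in the last expression, so these two contributions cancel, while the first summand gives the term of Proposition 2 involving $\nabla ( \varDelta _k f )$. Likewise, the two terms with integrand $\partial _\theta \lambda _k \, \varDelta _k f$ appear with opposite signs and cancel.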

Assume that $x_\theta ( \sigma _i ) = x$, $x_{\theta +h}( \sigma _i ) = x(h) = x + o(1)$, $U_\theta (\sigma _i) =U_{\theta +h}(\sigma _i) =U$ and that the event $\{\tau _h > \sigma _i\}$ holds. Given this information ${\mathscr {F}}_i$, the random time $\delta _i = ( \tau _h -\sigma _i ) \wedge (\sigma _{i+1} -\sigma _i )$ has a distribution that satisfies

$$\begin{aligned} \mathbb {P}\left( \delta _i \le w \vert {\mathscr {F}}_i \right) = 1 - \exp \left( - \int _{0}^w \lambda _0( x_\theta (s + \sigma _i) , U ,\theta ) {\text {d}}s \right) + o(1), \quad \text {for } w \in [0, \infty ) \end{aligned}$$

(31)

where $\lambda _0(x,U,\theta ) = \sum _{k \in {\mathscr {R}}_d} \lambda _k(x,U,\theta )$. Given $\delta _i = w$, the probability that event $\{ (\sigma _{i+1} -\sigma _i ) > ( \tau _h -\sigma _i ) \}$ occurs (i.e., $\delta _i = \tau _h -\sigma _i$) and the perturbation reaction is $k \in {\mathscr {R}}_d$ is simply

$$\begin{aligned}&\frac{1}{ \lambda _0(x_\theta ( \sigma _i +w) , U,\theta ) }\left| D_\theta \lambda _k(x_\theta (\sigma _i+w) , U,\theta ) \right| h +o(h). \end{aligned}$$
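The leading-order probability above can be checked in a simplified setting. The sketch below is a toy illustration under stated assumptions: the rates do not depend on the continuous state, so $D_\theta \lambda _k$ reduces to $\partial _\theta \lambda _k$, and the function name `decoupling_probability` and the example rates are hypothetical.

```python
def decoupling_probability(rates_theta, rates_theta_h):
    """For each discrete reaction, return the probability that the next event
    of the coupled pair is a non-common (decoupling) jump of that reaction."""
    common = [min(a, b) for a, b in zip(rates_theta, rates_theta_h)]
    residual = [abs(a - b) for a, b in zip(rates_theta, rates_theta_h)]
    total = sum(common) + sum(residual)  # total rate of the next event
    return [r / total for r in residual]

def rates(theta):
    # Hypothetical propensities: reaction 1 depends on theta, reaction 2 does not.
    return [2.0 * theta, 3.0]

theta, h = 1.5, 1e-3
exact = decoupling_probability(rates(theta), rates(theta + h))

lambda_0 = sum(rates(theta))      # total discrete rate under theta
predicted = 2.0 * h / lambda_0    # |d lambda_1 / d theta| * h / lambda_0
```

Here `exact[0]` and `predicted` differ only at order $h^2$, consistent with the $o(h)$ remainder, and the reaction whose rate does not depend on $\theta $ never triggers a decoupling.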

If $D_\theta \lambda _k(x_\theta (\sigma _i+w) , U,\theta ) > 0$, then at time $\tau _h$ process $U_{ \theta +h}$ jumps by $\zeta ^{ (d) }_k$, and if $D_\theta \lambda _k(x_\theta (\sigma _i+w) , U,\theta ) <0$, process $U_{ \theta }$ jumps by $\zeta ^{ (d) }_k$. We will suppose that the first situation holds, but the other case can be handled similarly. Assuming $w < (t - \sigma _i)$, we have

$$\begin{aligned}&\lim _{h \rightarrow 0} \mathbb {E}\left( \int _{\tau _h\wedge t}^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \bigg \vert {\mathscr {F}}_i, \tau _h = \sigma _i + w ,k \right) \\&\quad = \varDelta _{k} \varPsi _{t - \sigma _i -w} ( x_\theta ( \sigma _i + w), U_\theta ( \sigma _i+w ), \theta ) - \varDelta _{k} f( x_\theta ( \sigma _i + w), U_\theta ( \sigma _i+w ) ) \\&\quad := G_k( x_\theta ( \sigma _i +w) , U_\theta ( \sigma _i +w ) , t - \sigma _i-w) \end{aligned}$$

and as $\delta _i$ has distribution (31), we obtain

$$\begin{aligned}&\lim _{h \rightarrow 0} \frac{1}{h} \mathbb {E}\left[ \mathrm{1l}_{ \{ \sigma _i \wedge t \le \tau _h < \sigma _{i+1} \wedge t \} } \int _{\tau _h\wedge t}^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \right] \nonumber \\&\quad = \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \mathrm{1l}_{ \{ \sigma _i \le t \} } \int _{0}^{t - \sigma _i} G_k( x_\theta ( \sigma _i +w) , U_\theta ( \sigma _i +w ) , t - \sigma _i-w) \right. \nonumber \\&\quad \left. D_\theta \lambda _k(x_\theta (\sigma _i+w) , U_\theta (\sigma _i+w),\theta ) \exp \left( -\int _{0}^w \lambda _0(x_\theta ( \sigma _i+s) , U_\theta (\sigma _i+s),\theta ) {\text {d}}s \right) dw \right] . \end{aligned}$$

(32)

Note that given $\sigma _i < t$ and ${\mathscr {F}}_i$, the random variable $\gamma _i = (t \wedge \sigma _{i +1} - t \wedge \sigma _{i })$ has probability density function given by

$$\begin{aligned} p(w) = \lambda _0(x_\theta ( \sigma _i+ w ) , U_\theta (\sigma _i+w),\theta ) \exp \left( -\int _{0}^w \lambda _0(x_\theta ( \sigma _i+ u ) , U_\theta (\sigma _i+u),\theta ) {\text {d}}u \right) , \end{aligned}$$

for $w \in [0, t -\sigma _i)$ and $ \mathbb {P}\left( \gamma _i \le w \vert {\mathscr {F}}_i \right) =1$ if $w \ge (t - \sigma _i)$. Letting

$$\begin{aligned}&G(s,t) = G_k( x_\theta ( s) , U_\theta ( s) , t - s) D_\theta \lambda _k(x_\theta (s) , U_\theta (s),\theta )\\&\quad \text {and} \quad P(w) = \int _{w}^\infty p(u){\text {d}}u = \exp \left( -\int _{0}^w \lambda _0(x_\theta ( \sigma _i+ u ) , U_\theta (\sigma _i+u),\theta ) {\text {d}}u \right) \end{aligned}$$

we have

$$\begin{aligned}&\mathbb {E}\left( \int _{t \wedge \sigma _i}^{ t \wedge \sigma _{i+1} } G(s,t) {\text {d}}s \bigg \vert {\mathscr {F}}_i , \sigma _i< t \right) = \mathbb {E}\left( \int _{0}^{ \gamma _i } G(s+\sigma _i,t) {\text {d}}s \bigg \vert {\mathscr {F}}_i , \sigma _i< t \right) \\&\quad = \mathbb {P}\left( \gamma _i \ge t - \sigma _i \bigg \vert {\mathscr {F}}_i , \sigma _i< t \right) \int _{0}^{t - \sigma _i} G(s+\sigma _i,t) {\text {d}}s \\&\qquad + \mathbb {E}\left( \mathrm{1l}_{ \{ 0 \le \gamma _i<(t - \sigma _i) \} } \int _{0}^{ \gamma _i } G(s+\sigma _i,t) {\text {d}}s \bigg \vert {\mathscr {F}}_i , \sigma _i < t \right) \\&\quad = P(t - \sigma _i) \int _{0}^{t - \sigma _i} G(s+\sigma _i,t) {\text {d}}s + \int _{0}^{t - \sigma _i} p(w) \left( \int _{0}^{ w } G(s+\sigma _i,t) {\text {d}}s \right) dw. \end{aligned}$$

Integration by parts gives

$$\begin{aligned} \int _{0}^{t - \sigma _i} p(w) \left( \int _{0}^{ w } G(s+\sigma _i,t) {\text {d}}s \right) dw&= -P(t - \sigma _i)\left( \int _{0}^{ t -\sigma _i } G(s+\sigma _i,t) {\text {d}}s \right) \\&\quad +\int _{0}^{t - \sigma _i} P(w) G(w+\sigma _i,t) dw \end{aligned}$$

which shows that

$$\begin{aligned} \int _{0}^{t - \sigma _i} P(w) G(w+\sigma _i,t) {\text {d}}w = \mathbb {E}\left( \int _{t \wedge \sigma _i}^{ t \wedge \sigma _{i+1} } G(s,t) {\text {d}}s \bigg \vert {\mathscr {F}}_i , \sigma _i < t \right) . \end{aligned}$$

Substituting this expression in (32) gives us

$$\begin{aligned}&\lim _{h \rightarrow 0} \frac{1}{h} \mathbb {E}\left[ \int _{\tau _h\wedge t}^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \right] \\&\quad = \sum _{i=0}^\infty \lim _{h \rightarrow 0} \frac{1}{h} \mathbb {E}\left[ \mathrm{1l}_{ \{ \sigma _i \wedge t \le \tau _h < \sigma _{i+1} \wedge t \} } \int _{\tau _h\wedge t}^t \left( \mathbb {A}_{\theta +h} f( x_{\theta +h}(s) , U_{\theta +h}(s) ) \right. \right. \\&\left. \left. \qquad \qquad \qquad - \mathbb {A}_{\theta } f( x_{\theta }(s) , U_{\theta }(s) ) \right) {\text {d}}s \right] \\&\quad = \sum _{k \in {\mathscr {R}}_d} \sum _{i=0}^\infty \mathbb {E}\left[ \int _{ t \wedge \sigma _i}^{ t \wedge \sigma _{i +1} } G_k( x_\theta ( s) , U_\theta ( s) , t - s) D_\theta \lambda _k(x_\theta (s) , U_\theta (s),\theta ) {\text {d}}s \right] . \end{aligned}$$

This proves (30) and completes the proof of this proposition. $\square $
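The integration-by-parts identity used in the proof above can be checked numerically. The Python sketch below is only an illustration under stated assumptions: `lam0` and `G` are hypothetical smooth functions chosen for the test (they stand in for $\lambda _0$ and $G(\cdot +\sigma _i,t)$ along one fixed trajectory), and `T` plays the role of $t - \sigma _i$. It compares $\int _0^T P(w) G(w) dw$ with $P(T) \int _0^T G + \int _0^T p(w) \int _0^w G$.

```python
import numpy as np

def cumtrapz0(values, grid):
    """Cumulative trapezoidal integral of `values` over `grid`, starting at 0."""
    steps = 0.5 * (values[1:] + values[:-1]) * np.diff(grid)
    return np.concatenate(([0.0], np.cumsum(steps)))

T = 2.0                                   # stands in for t - sigma_i
w = np.linspace(0.0, T, 20001)

lam0 = 1.0 + 0.5 * np.sin(3.0 * w)        # hypothetical total jump intensity
G = np.cos(w) + 0.3 * w                   # hypothetical integrand G(w + sigma_i, t)

P = np.exp(-cumtrapz0(lam0, w))           # survival function P(w)
p = lam0 * P                              # density p(w) on [0, T)
H = cumtrapz0(G, w)                       # H(w) = int_0^w G(s) ds

# Left side: int_0^T P(w) G(w) dw
lhs = cumtrapz0(P * G, w)[-1]
# Right side: atom at T plus density part, i.e. E[ int_0^gamma G ]
rhs = P[-1] * H[-1] + cumtrapz0(p * H, w)[-1]

print(lhs, rhs)
```

The two numbers agree up to quadrature error, which is what the integration-by-parts step asserts for any sufficiently regular choice of intensity and integrand.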

Define an $S_c \times S_c$ matrix by

$$\begin{aligned} M(x,U,\theta ) = \sum _{k \in {\mathscr {R}}_c } \zeta ^{ (c) }_{k} ( \nabla \lambda _k(x,U,\theta ) )^* \end{aligned}$$

for any $(x , U, \theta ) \in \mathbb {R}^{S_c} \times \mathbb {N}^{S_d}_0 \times \mathbb {R}$, where $v^*$ denotes the transpose of $v$. Let $\varPhi (x_0,U_0,t)$ be the solution of the linear matrix-valued equation

$$\begin{aligned} \frac{{\text {d}} }{{\text {d}}t} \varPhi (x_0,U_0,t) = M(x_\theta (t) , U_\theta (t) ,\theta ) \varPhi (x_0,U_0,t) \end{aligned}$$

(33)

with $\varPhi (x_0,U_0,0) = \mathbf{I}$, which is the $S_c \times S_c$ identity matrix. Here $(x_0, U_0)$ denotes the initial state of $(x_\theta (t) ,U_\theta (t) )$. It can be seen that $y_\theta (t)$, which is the solution of IVP (16), can be written as

$$\begin{aligned} y_\theta (t) = \sum _{k \in {\mathscr {R}}_c } \int _0^t \partial _\theta \lambda _k ( x_\theta (s) , U_\theta (s) , \theta ) \varPhi (x_\theta (s),U_\theta (s),t - s) \zeta _k^{(c)} {\text {d}}s. \end{aligned}$$

(34)

This shall be useful in proving the next proposition which considers the sensitivity of $\varPsi _t (x_\theta (t) , U_\theta (t) ,\theta )$ to the initial value of the continuous state $x_0$.
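Representation (34) can be sanity-checked on a toy example. The sketch below is an illustration, not the authors' code, and rests on several assumptions: there is a single continuous reaction with hypothetical propensity $\lambda (x,\theta ) = \theta x$ and $\zeta ^{(c)} = -1$, there are no discrete species, and the sensitivity IVP is taken to be $\dot{y} = (\partial _\theta \lambda + \nabla \lambda \, y) \zeta ^{(c)}$ with $y(0) = 0$ (the form implied by the generator used in the proof of Theorem 3). In this case $\varPhi (\cdot , \cdot , t-s) = e^{-\theta (t-s)}$.

```python
import numpy as np

theta, x0, t_end, n_steps = 0.7, 5.0, 3.0, 3000
h = t_end / n_steps

def rhs(state):
    """Joint dynamics: dx/dt = -theta*x, dy/dt = -(x + theta*y)."""
    x, y = state
    return np.array([-theta * x, -(x + theta * y)])

# Integrate (x, y) with classical RK4, starting from y(0) = 0.
state = np.array([x0, 0.0])
for _ in range(n_steps):
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * h * k1)
    k3 = rhs(state + 0.5 * h * k2)
    k4 = rhs(state + h * k3)
    state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
y_ode = state[1]

# Evaluate the right-hand side of (34) by trapezoidal quadrature.
s = np.linspace(0.0, t_end, n_steps + 1)
x_path = x0 * np.exp(-theta * s)            # closed-form x(s) for this model
phi = np.exp(-theta * (t_end - s))          # Phi(x(s), U(s), t - s)
integrand = x_path * phi * (-1.0)           # d_theta lambda = x, zeta = -1
y_formula = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(s))

y_exact = -t_end * x0 * np.exp(-theta * t_end)
print(y_ode, y_formula, y_exact)
```

All three values coincide up to numerical error for this model, since $y(t) = -t x_0 e^{-\theta t}$ solves the assumed sensitivity equation.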

Proposition 3

Let $\varPhi (x_0,U_0,t)$ be the matrix-valued function defined above. Then, we can express the gradient of $\varPsi _t(x_0,U_0,\theta ) $ w.r.t. $x_0$ as

$$\begin{aligned}&\nabla \varPsi _t(x_0,U_0,\theta ) = \nabla f(x_0,U_0) \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{t} \varPhi ^*(x_0, U_0,s) \nabla \left[ \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla f(x_\theta (s),U_\theta (s) ) , \zeta ^{(c)}_k \right\rangle \right] {\text {d}}s \right] \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{t} \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varPhi ^*(x_0, U_0,s) \nabla \left( \varDelta _k f(x_\theta (s),U_\theta (s) ) \right) {\text {d}}s \right] \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{ 0}^{ t } \varPhi ^*(x_0, U_0,s) \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) \varDelta _k \varPsi _{t-s}( x_\theta ( s) , U_\theta ( s ) ,\theta ) {\text {d}}s \right] . \end{aligned}$$

(35)

Proof

To prove this proposition, it suffices to show that for any vector $v \in \mathbb {R}^{S_c}$, the inner product of $v$ with the l.h.s. of (35) is the same as the inner product of $v$ with the r.h.s. of (35). Defining

$$\begin{aligned} y(t) = \varPhi (x_0, U_0,t) v \end{aligned}$$

our aim is to prove that

$$\begin{aligned}&\left\langle \nabla \varPsi _t(x_0,U_0,\theta ) , v \right\rangle = \left\langle \nabla f(x_0,U_0) , v \right\rangle \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^{t} \left\langle \nabla \left[ \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla f(x_\theta (s),U_\theta (s) ) , \zeta ^{(c)}_k \right\rangle \right] , y(s) \right\rangle {\text {d}}s \right] \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^{t} \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla \left( \varDelta _k f(x_\theta (s),U_\theta (s) ) \right) , y(s) \right\rangle {\text {d}}s \right] \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{ 0}^{ t } \left\langle \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) , y(s) \right\rangle \varDelta _k \varPsi _{t-s}( x_\theta ( s) , U_\theta ( s ) ,\theta ) {\text {d}}s \right] . \end{aligned}$$

(36)

Note that $y(t)$ solves the IVP

$$\begin{aligned} \frac{{\text {d}} y}{{\text {d}}t}&= \sum _{k \in {\mathscr {R}}_c} \ \left\langle \nabla \lambda _k(x_\theta (t) , U_\theta (t) ,\theta ) , y(t) \right\rangle \zeta ^{(c)}_k \nonumber \\ \text {and}&\qquad y(0) = v, \end{aligned}$$

(37)

which shows that $y(t)$ is the directional derivative of $x_\theta (t)$ [see (12)] w.r.t. the initial state $x_0$ in the direction $v$.
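The interpretation of (37) as a directional derivative can be illustrated with a finite-difference check. The following Python sketch is a toy example under assumptions that are not part of the paper: a single continuous reaction with hypothetical propensity $\lambda (x,\theta ) = \theta x^2$ and $\zeta ^{(c)} = -1$, and no discrete species, so that $\dot{x} = -\theta x^2$ and (37) reads $\dot{y} = -2 \theta x y$ with $y(0) = v$.

```python
import numpy as np

THETA = 0.4   # hypothetical parameter value

def rk4(rhs, state, t_end, n_steps):
    """Classical fourth-order Runge-Kutta integration on [0, t_end]."""
    h = t_end / n_steps
    for _ in range(n_steps):
        k1 = rhs(state)
        k2 = rhs(state + 0.5 * h * k1)
        k3 = rhs(state + 0.5 * h * k2)
        k4 = rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state

def joint_rhs(state):
    """dx/dt = -theta*x^2 together with the linearised equation for y."""
    x, y = state
    return np.array([-THETA * x**2, -2.0 * THETA * x * y])

def directional_derivative(x0, v, t_end, n_steps=2000):
    """y(t_end) from the linearised IVP with y(0) = v."""
    return rk4(joint_rhs, np.array([x0, v]), t_end, n_steps)[1]

def finite_difference(x0, v, t_end, eps=1e-5, n_steps=2000):
    """Central difference of x(t_end) w.r.t. the initial state, direction v."""
    x_plus = rk4(joint_rhs, np.array([x0 + eps * v, 0.0]), t_end, n_steps)[0]
    x_minus = rk4(joint_rhs, np.array([x0 - eps * v, 0.0]), t_end, n_steps)[0]
    return (x_plus - x_minus) / (2.0 * eps)

x0, v, t_end = 2.0, 1.5, 1.0
dd = directional_derivative(x0, v, t_end)
fd = finite_difference(x0, v, t_end)
exact = v / (1.0 + THETA * x0 * t_end) ** 2   # from x(t) = x0 / (1 + theta*x0*t)
print(dd, fd, exact)
```

For this model the closed-form solution gives $\partial x(t) / \partial x_0 = (1 + \theta x_0 t)^{-2}$, so the three printed values should agree closely.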

This proposition can be proved in the same way as Proposition 2, by coupling process $(x_\theta , U_\theta )$ with another process $(x_{\theta ,h} , U_{\theta ,h} )$ according to

$$\begin{aligned} x_\theta (t)&= x_0 + \sum _{k \in {\mathscr {R}}_c} \left( \int _{0}^t \lambda _k( x_\theta (s), U_\theta (s), \theta ) {\text {d}}s \right) \zeta ^{(c)}_k \\ x_{\theta ,h}(t)&= x_0 + h v + \sum _{k \in {\mathscr {R}}_c} \left( \int _{0}^t \lambda _k( x_{\theta , h}(s), U_{\theta , h}(s), \theta ) {\text {d}}s \right) \zeta ^{(c)}_k \\ U_\theta (t)&= U_0 + \sum _{k \in {\mathscr {R}}_d} Y_k \left( \int _{0}^t \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \wedge \lambda _k( x_{\theta , h}(s), U_{\theta ,h}(s), \theta ) {\text {d}}s \right) \zeta ^{(d)}_k \\&\quad + \sum _{k \in {\mathscr {R}}_d} Y^{(1)}_k \left( \int _{0}^t \lambda ^{(1)}_k( x_\theta (s), U_\theta (s) ,\theta , x_{\theta , h}(s), U_{\theta , h}(s), \theta ) {\text {d}}s \right) \zeta ^{(d)}_k \\ U_{\theta ,h}(t)&= U_0 + \sum _{k \in {\mathscr {R}}_d} Y_k \left( \int _{0}^t \lambda _k( x_\theta (s), U_\theta (s) ,\theta ) \wedge \lambda _k( x_{\theta , h}(s), U_{\theta , h}(s), \theta ) {\text {d}}s \right) \zeta ^{(d)}_k \\&\quad + \sum _{k \in {\mathscr {R}}_d} Y^{(2)}_k \left( \int _{0}^t \lambda ^{(2)}_k( x_\theta (s), U_\theta (s) ,\theta , x_{\theta , h}(s), U_{\theta , h}(s), \theta ) {\text {d}}s \right) \zeta ^{(d)}_k, \end{aligned}$$

where $\{ Y_k , Y^{ (1) }_k , Y^{ (2)}_{k}\}$ is a collection of independent unit-rate Poisson processes, and $\lambda ^{(1)}_k$, $\lambda ^{(2)}_k$ are as in the proof of Proposition 2. An important difference between this proposition and Proposition 2 is that the value of $\theta $ is the same in both coupled processes, so the only difference between them is due to the difference in the initial continuous state $x_0$. Consequently, the $ \partial _\theta \lambda _k$ terms in the statement of Proposition 2 disappear and we obtain (36). $\square $
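The coupling above shares a common Poisson process driven by the minimum of the two propensities and uses independent processes for the remainders. The Python sketch below illustrates this rate splitting. It assumes that the residual intensities have the usual split-coupling form, namely each propensity minus the common minimum; the actual definitions of $\lambda ^{(1)}_k$ and $\lambda ^{(2)}_k$ are those in the proof of Proposition 2 and should be checked there. The function `propensity` is a hypothetical example.

```python
import math

def split_coupling_rates(rate_a, rate_b):
    """Split two non-negative propensities into (common, residual_a, residual_b).

    The common part drives the shared Poisson process; the residuals drive
    the two independent processes (assumed split-coupling form).
    """
    common = min(rate_a, rate_b)
    return common, rate_a - common, rate_b - common

def propensity(x, u, theta):
    # Hypothetical mass-action-like propensity, used only for illustration.
    return theta * x * u

theta, u = 0.8, 3
x, v, h = 2.0, 1.0, 1e-3

rate_a = propensity(x, u, theta)            # unperturbed process
rate_b = propensity(x + h * v, u, theta)    # initial state shifted by h*v
common, res_a, res_b = split_coupling_rates(rate_a, rate_b)
print(common, res_a, res_b)
```

Each process keeps its own marginal jump rate (`common + res_a` and `common + res_b`), while the rate at which the two copies separate is `res_a + res_b`, which equals the absolute difference of the two propensities and is of order $h$ for smooth propensities.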

Proof

(Proof of Theorem 2) Define

$$\begin{aligned} L(t) = \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^t \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \nabla \varPsi _{t-s}(x_\theta (s),U_\theta (s),t-s) , \zeta ^{(c)}_k \right\rangle {\text {d}}s \right] . \end{aligned}$$

Due to Proposition 2, to prove Theorem 2 it suffices to prove that

$$\begin{aligned} L(T)&= \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^T \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t) ,U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle {\text {d}}t \right] \nonumber \\&\quad +\sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^T \left\langle \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \rangle \right] , y_\theta (t) \right\rangle {\text {d}}t \right] \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^T \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla \left( \varDelta _k f(x_\theta (t),U_\theta (t) ) \right) , y_\theta (t) \right\rangle {\text {d}}t \right] \nonumber \\&\quad +\sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^T \left\langle \nabla \lambda _k(x_\theta ( t) ,U_\theta (t),\theta ) ,y_\theta (t) \right\rangle \varDelta _k \varPsi _{T- t}( x_\theta (t), U_\theta (t), \theta ) {\text {d}}t \right] . \end{aligned}$$

(38)

Let $\{ {\mathscr {F}}_t \}$ be the filtration generated by process $(x_\theta , U_\theta )$. For any $t \ge 0$, let $\mathbb {E}_t (\cdot )$ denote the conditional expectation $\mathbb {E}( \cdot \vert {\mathscr {F}}_t )$. Proposition 3 allows us to write

$$\begin{aligned}&\nabla \varPsi _{t - s}(x_\theta (s) , U_\theta (s), t-s ) \\&\quad = \nabla f(x_\theta (s) , U_\theta (s) ) \\&\qquad + \sum _{k \in {\mathscr {R}}_c} \int _{s}^{t} \mathbb {E}_s \left[ \varPhi ^*(x_\theta (s), U_\theta (s),u-s) \right. \\&\left. \qquad \qquad \nabla \left[ \lambda _k (x_\theta (u) ,U_\theta (u) ,\theta ) \left\langle \nabla f(x_\theta (u),U_\theta (u) ) , \zeta ^{(c)}_k \right\rangle \right] \right] {\text {d}}u \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \int _{s}^{t} \mathbb {E}_s \left[ \lambda _k (x_\theta (u) ,U_\theta (u) ,\theta ) \varPhi ^*(x_\theta (s), U_\theta (s),u-s) \nabla \left( \varDelta _k f(x_\theta (u),U_\theta (u) ) \right) \right] {\text {d}}u \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \int _{ s}^{ t } \mathbb {E}_s \left[ \varPhi ^*(x_\theta (s), U_\theta (s), u -s) \right. \\&\quad \left. \qquad \qquad \nabla \lambda _k(x_\theta ( u) ,U_\theta (u),\theta ) \varDelta _k \varPsi _{t-u}( x_\theta ( u) , U_\theta ( u ) ,\theta ) \right] {\text {d}}u . \end{aligned}$$

This shows that

$$\begin{aligned}&\frac{{\text {d}}}{{\text {d}}t} \nabla \varPsi _{t - s}(x_\theta (s) , U_\theta (s), t-s )\\&\quad = \sum _{k \in {\mathscr {R}}_c} \mathbb {E}_s \left[ \varPhi ^*(x_\theta (s), U_\theta (s),t-s) \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle \right] \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}_s \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \varPhi ^*(x_\theta (s), U_\theta (s),t-s) \nabla \left( \varDelta _k f(x_\theta (t),U_\theta (t) ) \right) \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}_s \left[ \varPhi ^*(x_\theta (s), U_\theta (s), t -s) \nabla \lambda _k(x_\theta ( t) ,U_\theta (t),\theta ) \varDelta _k f( x_\theta ( t) , U_\theta (t ) ) \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \int _{ s}^{ t } \mathbb {E}_s \Bigg [ \varPhi ^*(x_\theta (s), U_\theta (s), u -s)\\&\qquad \qquad \nabla \lambda _k(x_\theta ( u) ,U_\theta (u),\theta ) \frac{{\text {d}}}{{\text {d}}t} \varDelta _k \varPsi _{t-u}( x_\theta ( u) , U_\theta ( u ) ,\theta ) \Bigg ] {\text {d}}u. \end{aligned}$$

The middle two terms can be combined using the product rule $\nabla (gh) = g \nabla h + h \nabla g$ to yield

$$\begin{aligned}&\frac{{\text {d}}}{{\text {d}}t} \nabla \varPsi _{t - s}(x_\theta (s) , U_\theta (s), t-s ) \\&\quad = \sum _{k \in {\mathscr {R}}_c} \mathbb {E}_s \left[ \varPhi ^*(x_\theta (s), U_\theta (s),t-s) \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle \right] \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}_s \left[ \varPhi ^*(x_\theta (s), U_\theta (s),t-s) \nabla \left( \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \varDelta _k f(x_\theta (t),U_\theta (t) ) \right) \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \int _{ s}^{ t } \mathbb {E}_s \left[ \varPhi ^*(x_\theta (s), U_\theta (s), u -s) \right. \\&\left. \qquad \qquad \nabla \lambda _k(x_\theta ( u) ,U_\theta (u),\theta ) \frac{{\text {d}}}{{\text {d}}t} \varDelta _k \varPsi _{t-u}( x_\theta ( u) , U_\theta ( u ) ,\theta ) \right] {\text {d}}u. \end{aligned}$$

Using this, we can compute the time derivative of L(t) as

$$\begin{aligned} \frac{{\text {d}} L(t) }{{\text {d}}t} = \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t) ,U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle \right] + A+ B+ C, \end{aligned}$$

(39)

where

$$\begin{aligned}&A := \sum _{k \in {\mathscr {R}}_c} \sum _{j \in {\mathscr {R}}_c} \int _{0}^{t} \mathbb {E}\left[ \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \varPhi ^*(x_\theta ( s) , U_\theta ( s) , t-s) \right. \right. \\&\left. \left. \qquad \qquad \nabla \left[ \lambda _j (x_\theta (t) ,U_\theta (t) ,\theta ) \langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_j \rangle \right] , \zeta ^{(c)}_k \right\rangle \right] {\text {d}}s, \\&B:= \sum _{k \in {\mathscr {R}}_c} \sum _{j \in {\mathscr {R}}_d} \int _{0}^{t} \mathbb {E}\left[ \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \right. \\&\left. \left\langle \varPhi ^*(x_\theta ( s) , U_\theta ( s) , t-s) \nabla \left[ \lambda _j (x_\theta (t) ,U_\theta (t) ,\theta ) \varDelta _j f(x_\theta (t),U_\theta (t) ) \right] , \zeta ^{(c)}_k \right\rangle \right] {\text {d}}s \\&\qquad \text {and} \\&C: = \sum _{k \in {\mathscr {R}}_c} \sum _{j \in {\mathscr {R}}_d} \int _{0}^{t} \mathbb {E}\left[ \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta )\left\langle \int _{ s}^{ t } \varPhi ^*(x_\theta (s), U_\theta (s), u -s) \right. \right. \\&\left. \left. \qquad \qquad \nabla \lambda _j(x_\theta ( u) ,U_\theta (u),\theta ) \frac{{\text {d}}}{{\text {d}}t} \varDelta _j \varPsi _{t-u}( x_\theta ( u) , U_\theta ( u ) ,\theta ) {\text {d}}u, \zeta ^{(c)}_k \right\rangle \right] {\text {d}}s. \end{aligned}$$

This definition of A, B and C ensures that

$$\begin{aligned}&A+ B+ C \\&\quad = \sum _{k \in {\mathscr {R}}_c} \int _{0}^{t} \mathbb {E}\left[ \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \left\langle \frac{{\text {d}}}{{\text {d}}t} \nabla \varPsi _{t - s}(x_\theta (s) , U_\theta (s), t-s ), \zeta ^{(c)}_k \right\rangle {\text {d}}s \right] . \end{aligned}$$

Recall that $y_\theta (t)$ can be expressed as (34). Therefore, we can write A as

$$\begin{aligned} A&= \sum _{j \in {\mathscr {R}}_c} \mathbb {E}\left[ \left\langle \nabla \left[ \lambda _j (x_\theta (t) ,U_\theta (t) ,\theta ) \langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_j \rangle \right] , \right. \right. \nonumber \\&\qquad \left. \left. \sum _{k \in {\mathscr {R}}_c} \int _{0}^{t} \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varPhi (x_\theta ( s) , U_\theta ( s) , t-s) \zeta ^{(c)}_k {\text {d}}s \right\rangle \right] \nonumber \\&= \sum _{j \in {\mathscr {R}}_c} \mathbb {E}\left[ \left\langle \nabla \left[ \lambda _j (x_\theta (t) ,U_\theta (t) ,\theta ) \langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_j \rangle \right] , y_\theta (t) \right\rangle \right] . \end{aligned}$$

(40)

Similarly, we can write B as

$$\begin{aligned} B = \sum _{j \in {\mathscr {R}}_d} \mathbb {E}\left[ \left\langle \nabla \left[ \lambda _j (x_\theta (t) ,U_\theta (t) ,\theta ) \varDelta _j f(x_\theta (t),U_\theta (t) ) \right] , y_\theta (t) \right\rangle \right] . \end{aligned}$$

(41)

Changing the order of integration, we can write C as

$$\begin{aligned}&C= \sum _{j \in {\mathscr {R}}_d} \int _{0}^{t} \mathbb {E}\left[ \left\langle \nabla \lambda _j(x_\theta ( u) ,U_\theta (u),\theta ) \frac{{\text {d}}}{{\text {d}}t} \varDelta _j \varPsi _{t-u}( x_\theta ( u) , U_\theta ( u) ,\theta ) , \right. \right. \\&\qquad \left. \left. \sum _{k \in {\mathscr {R}}_c} \int _{ 0}^{ u} \partial _\theta \lambda _k (x_\theta (s) ,U_\theta (s) ,\theta ) \varPhi (x_\theta (s), U_\theta (s), u -s)\zeta ^{(c)}_k {\text {d}}s \right\rangle {\text {d}}u \right] \\&\quad = \sum _{j \in {\mathscr {R}}_d} \int _{0}^{t} \mathbb {E}\left[ \left\langle \nabla \lambda _j(x_\theta ( u) ,U_\theta (u),\theta ) \frac{{\text {d}}}{{\text {d}}t} \varDelta _j \varPsi _{t-u}( x_\theta ( u) , U_\theta ( u ) ,\theta ) , y_\theta (u)\right\rangle {\text {d}}u \right] \\&\quad = \sum _{j \in {\mathscr {R}}_d} \frac{{\text {d}}}{{\text {d}}t} \int _{0}^{t} \mathbb {E}\left[ \left\langle \nabla \lambda _j(x_\theta ( u) ,U_\theta (u),\theta )\varDelta _j \varPsi _{t-u}( x_\theta ( u) , U_\theta ( u ) ,\theta ) , y_\theta (u)\right\rangle {\text {d}}u \right] \\&\qquad - \sum _{j \in {\mathscr {R}}_d} \mathbb {E}\left[ \left\langle \nabla \lambda _j(x_\theta ( t) ,U_\theta (t),\theta )\varDelta _j f ( x_\theta ( t) , U_\theta ( t ) ) , y_\theta (t)\right\rangle \right] . \end{aligned}$$

This relation along with (40), (41) and (39) implies that

$$\begin{aligned} \frac{{\text {d}} L(t) }{{\text {d}}t}&= \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t) ,U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle \right] \\&\quad +\sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \left\langle \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \rangle \right] , y_\theta (t) \right\rangle \right] \\&\quad + \sum _{k \in {\mathscr {R}}_d}\mathbb {E}\left[ \left\langle \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \varDelta _k f(x_\theta (t),U_\theta (t) ) \right] , y_\theta (t) \right\rangle \right] \\&\quad + \sum _{k \in {\mathscr {R}}_d} \frac{{\text {d}}}{{\text {d}}t} \int _{0}^t \mathbb {E}\left[ \left\langle \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) \varDelta _k \varPsi _{t-s}( x_\theta ( s) , U_\theta ( s ) ,\theta ) ,y_\theta (s) \right\rangle {\text {d}}s \right] \\&\quad - \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \left\langle \nabla \lambda _k(x_\theta ( t) ,U_\theta (t),\theta )\varDelta _k f ( x_\theta ( t) , U_\theta ( t ) ) , y_\theta (t)\right\rangle \right] . \end{aligned}$$

Applying the product rule to the third term produces two terms, one of which cancels with the last term to yield

$$\begin{aligned} \frac{{\text {d}} L(t) }{{\text {d}}t}&= \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t) ,U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle \right] \\&\quad +\sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \left\langle \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \rangle \right] , y_\theta (t) \right\rangle \right] \\&\quad + \sum _{k \in {\mathscr {R}}_d}\mathbb {E}\left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla \left( \varDelta _k f(x_\theta (t),U_\theta (t) ) \right) , y_\theta (t) \right\rangle \right] \\&\quad + \sum _{k \in {\mathscr {R}}_d} \frac{{\text {d}}}{{\text {d}}t} \int _{0}^t \mathbb {E}\left[ \left\langle \nabla \lambda _k(x_\theta ( s) ,U_\theta (s),\theta ) \varDelta _k \varPsi _{t-s}( x_\theta ( s) , U_\theta ( s ) ,\theta ) ,y_\theta (s) \right\rangle {\text {d}}s \right] . \end{aligned}$$

Since $L(0) = 0$, integrating this equation from $t = 0$ to $t =T$ proves (38), which completes the proof of Theorem 2. $\square $

Proof

(Proof of Theorem 3) Consider the Markov process $( x_\theta (t) , U_\theta (t) , y_\theta (t) )_{t \ge 0}$. The generator of this process is given by

$$\begin{aligned} \mathbb {H} F(x,u,y)&= \sum _{k \in {\mathscr {R}}_c} \lambda _k(x,u,\theta ) \left\langle \nabla F(x,u,y), \zeta ^{(c)}_k \right\rangle + \sum _{k \in {\mathscr {R}}_d} \lambda _k(x,u,\theta ) \varDelta _k F(x,u,y) \\&\quad + \sum _{k \in {\mathscr {R}}_c} \partial _\theta \lambda _k(x,u,\theta ) \left\langle \nabla _y F(x,u,y), \zeta ^{(c)}_k \right\rangle \\&\quad + \sum _{k \in {\mathscr {R}}_c} \left\langle \nabla \lambda _k(x,u,\theta ) , y \right\rangle \left\langle \nabla _y F(x,u,y), \zeta ^{(c)}_k \right\rangle \end{aligned}$$

for any real-valued function $F: \mathbb {R}^{S_c} \times \mathbb {N}^{S_d}_0 \times \mathbb {R}^{S_c} \rightarrow \mathbb {R}$. Here, $\nabla _y F$ denotes the gradient of function F w.r.t. the last $S_c$ coordinates. Setting

$$\begin{aligned} F(x,u,y) = \left\langle \nabla f(x,u), y \right\rangle \end{aligned}$$

we obtain

$$\begin{aligned} \mathbb {H} F(x,u,y)&= \sum _{k \in {\mathscr {R}}_c} \lambda _k(x,u,\theta ) \left\langle \varDelta f(x,u) y, \zeta ^{(c)}_k \right\rangle + \sum _{k \in {\mathscr {R}}_d} \lambda _k(x,u,\theta ) \varDelta _k \left\langle \nabla f(x,u), y \right\rangle \\&\quad + \sum _{k \in {\mathscr {R}}_c} \partial _\theta \lambda _k(x,u,\theta ) \left\langle \nabla f(x,u), \zeta ^{(c)}_k \right\rangle \\&\quad + \sum _{k \in {\mathscr {R}}_c} \left\langle \nabla \lambda _k(x,u,\theta ) , y \right\rangle \left\langle \nabla f(x,u), \zeta ^{(c)}_k \right\rangle \end{aligned}$$

where $\varDelta f$ denotes the Hessian matrix of $f$ w.r.t. the first $S_c$ coordinates. Note that the first and the fourth terms can be combined using the product rule as

$$\begin{aligned}&\sum _{k \in {\mathscr {R}}_c} \lambda _k(x,u,\theta ) \left\langle \varDelta f(x,u) y, \zeta ^{(c)}_k \right\rangle +\sum _{k \in {\mathscr {R}}_c} \left\langle \nabla \lambda _k(x,u,\theta ) , y \right\rangle \left\langle \nabla f(x,u), \zeta ^{(c)}_k \right\rangle \\&\quad = \sum _{k \in {\mathscr {R}}_c} \left\langle \nabla \left[ \lambda _k (x, u ,\theta ) \langle \nabla f(x,u ) , \zeta ^{(c)}_k \rangle \right] , y \right\rangle \end{aligned}$$

and hence we get

$$\begin{aligned} \mathbb {H} F(x,u,y)&= \sum _{k \in {\mathscr {R}}_c} \partial _\theta \lambda _k(x,u,\theta ) \left\langle \nabla f(x,u), \zeta ^{(c)}_k \right\rangle \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_c} \left\langle \nabla \left[ \lambda _k (x, u ,\theta ) \langle \nabla f(x,u ) , \zeta ^{(c)}_k \rangle \right] , y \right\rangle \nonumber \\&\quad + \sum _{k \in {\mathscr {R}}_d} \lambda _k(x,u,\theta ) \varDelta _k \left\langle \nabla f(x,u), y \right\rangle . \end{aligned}$$

(42)
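The product-rule step that combines the Hessian term with the $\nabla \lambda _k$ term can be verified numerically at a single point. The Python sketch below does this for one continuous reaction with $S_c = 2$. The functions `lam` and `f`, the vector `ZETA` and the evaluation points are hypothetical choices made only for this check, and the gradient of the product is approximated by central differences.

```python
import numpy as np

THETA = 0.9
ZETA = np.array([1.0, -1.0])      # hypothetical stoichiometric vector

def lam(x):
    """Hypothetical propensity lambda(x) = theta * x_0 * x_1."""
    return THETA * x[0] * x[1]

def grad_lam(x):
    return THETA * np.array([x[1], x[0]])

def grad_f(x):
    """Gradient of the hypothetical f(x) = x_0^2 * x_1 + sin(x_1)."""
    return np.array([2.0 * x[0] * x[1], x[0] ** 2 + np.cos(x[1])])

def hess_f(x):
    return np.array([[2.0 * x[1], 2.0 * x[0]],
                     [2.0 * x[0], -np.sin(x[1])]])

def product(x):
    """The scalar function lambda(x) * <grad f(x), zeta>."""
    return lam(x) * (grad_f(x) @ ZETA)

def numerical_grad(func, x, eps=1e-6):
    """Central-difference gradient of a scalar function."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (func(x + step) - func(x - step)) / (2.0 * eps)
    return grad

x = np.array([1.3, 0.7])
y = np.array([-0.4, 2.1])

# Sum of the first and fourth terms of H F for one reaction.
lhs = lam(x) * ((hess_f(x) @ y) @ ZETA) + (grad_lam(x) @ y) * (grad_f(x) @ ZETA)
# Combined form: < grad[ lambda * <grad f, zeta> ], y >.
rhs = numerical_grad(product, x) @ y
print(lhs, rhs)
```

The agreement relies on the symmetry of the Hessian, which gives $\langle \varDelta f \, y , \zeta \rangle = \langle \nabla \langle \nabla f , \zeta \rangle , y \rangle $.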

Since $y_\theta (0) = 0$ [see (34)], we have $F(x_\theta (0) , U_\theta (0) , y_\theta (0) ) = 0$, and Dynkin’s formula gives

$$\begin{aligned} \mathbb {E}\left( F(x_\theta (T) , U_\theta (T) , y_\theta (T) ) \right) = \mathbb {E}\left[ \int _0^T \mathbb {H} F(x_\theta (t) , U_\theta (t) , y_\theta (t)){\text {d}}t \right] \end{aligned}$$

and substituting (42) yields

$$\begin{aligned}&\mathbb {E}\left[ \left\langle \nabla f (x_\theta (T) ,U_\theta (T) ) , y_\theta (T) \right\rangle \right] \\&\quad = \sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^T \partial _\theta \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla f(x_\theta (t) ,U_\theta (t) ) , \zeta ^{(c)}_k \right\rangle {\text {d}}t \right] \\&\qquad +\sum _{k \in {\mathscr {R}}_c} \mathbb {E}\left[ \int _{0}^T \left\langle \nabla \left[ \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \langle \nabla f(x_\theta (t),U_\theta (t) ) , \zeta ^{(c)}_k \rangle \right] , y_\theta (t) \right\rangle {\text {d}}t \right] \\&\qquad + \sum _{k \in {\mathscr {R}}_d} \mathbb {E}\left[ \int _{0}^T \lambda _k (x_\theta (t) ,U_\theta (t) ,\theta ) \left\langle \nabla \left( \varDelta _k f(x_\theta (t),U_\theta (t) ) \right) , y_\theta (t) \right\rangle {\text {d}}t \right] . \end{aligned}$$

This relation along with Proposition 2 proves Theorem 3. $\square $
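The identity obtained from Dynkin's formula can be illustrated in the degenerate case where there are no discrete reactions, so that the process is deterministic and the expectations drop out. The Python sketch below is a toy check under assumptions not taken from the paper: one continuous reaction with hypothetical propensity $\lambda (x,\theta ) = \theta x$, $\zeta ^{(c)} = -1$, output function $f(x) = x^2$, and sensitivity equation $\dot{y} = -(x + \theta y)$ with $y(0) = 0$, whose closed-form solution is $y(t) = -t x(t)$.

```python
import numpy as np

theta, x0, T = 0.7, 2.0, 3.0
t = np.linspace(0.0, T, 20001)

x = x0 * np.exp(-theta * t)     # solves dx/dt = -theta * x
y = -t * x                      # solves dy/dt = -(x + theta * y), y(0) = 0

# H F along the path, from (42) with the discrete sum empty:
#   d_theta lambda * <grad f, zeta>        = x * (2x) * (-1)
#   < grad[lambda * <grad f, zeta>], y >   = d/dx[-2*theta*x^2] * y = -4*theta*x*y
hf = x * (2.0 * x) * (-1.0) + (-4.0 * theta * x) * y

# Trapezoidal quadrature of int_0^T H F dt.
integral = np.sum(0.5 * (hf[1:] + hf[:-1]) * np.diff(t))

# Left side: F at time T, i.e. <grad f(x(T)), y(T)> = 2 * x(T) * y(T).
f_end = 2.0 * x[-1] * y[-1]

exact = -2.0 * T * x0 ** 2 * np.exp(-2.0 * theta * T)
print(f_end, integral, exact)
```

In this purely continuous example the common value also equals the derivative of $f(x_\theta (T)) = x_0^2 e^{-2 \theta T}$ with respect to $\theta $, which is the sensitivity one expects when no discrete component is present.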

About this article

Cite this article

Gupta, A., Khammash, M. Sensitivity Analysis for Multiscale Stochastic Reaction Networks Using Hybrid Approximations. Bull Math Biol 81, 3121–3158 (2019). https://doi.org/10.1007/s11538-018-0521-4

Received: 14 January 2018
Accepted: 01 October 2018
Published: 09 October 2018
Issue Date: 15 August 2019
DOI: https://doi.org/10.1007/s11538-018-0521-4
