Langevin and Kalman Importance Sampling for Nonlinear Continuous-Discrete State-Space Models

Singer, Hermann

doi:10.1007/978-3-319-77219-6_16

Abstract

The likelihood function of a nonlinear continuous-discrete state-space model with state dependent diffusion function is computed by integrating out the latent variables with the help of Langevin sampling. The continuous-time paths are discretized on a time grid in order to obtain a finite-dimensional integration and densities w.r.t. Lebesgue measure. We use importance sampling, where the exact importance density is the conditional density of the latent states, given the measurements. This unknown density is either estimated from the sampler data or approximated by an estimated normal density. Then, new trajectories are drawn from this Gaussian measure. Alternatively, a Gaussian importance density is directly derived from an extended Kalman smoother with subsequent sampling of independent trajectories (extended Kalman sampling (EKS)). We compare the Monte Carlo results with numerical methods based on extended, unscented, and Gauss-Hermite Kalman filtering (EKF, UKF, GHF) and a grid-based solution of the Fokker-Planck equation between measurements. This comprises the repeated multiplication of transition matrices based on Euler transition kernels, finite differences, and discretized integral operators. The methods are illustrated for the geometrical Brownian motion and the Ginzburg-Landau model for phase transitions.

Notes

1.
Mother behaviors—look at infant, smile, vocalize, touch, or hold infant. Infant behaviors—look at mother, smile, vocalize, touch or hold mother, fuss/cry.
2.
In order to avoid misunderstandings, one must distinguish between (non)linearity in the continuous-time dynamical specification (differential equation) w.r.t. the state variables, and in the derived “exact discrete model” w.r.t. the parameters.
3.
W.r.t. the parameters.
4.
One has $\int \delta (x-x') \phi (x')dx' =\phi (x)$ and $\sum _{\rho '} \delta _{\rho \rho '}\phi _{\rho '}=\phi _{\rho }$.
5.
Otherwise, one can use the singular normal distribution (cf. Mardia et al. 1979, ch. 2.5.4, p. 41). In this case, the generalized inverse of $\varOmega_j$ is used and the determinant $|\cdot|$, which is zero, is replaced by the product of the positive eigenvalues. Singular covariance matrices occur, for example, in autoregressive models of higher order, when the state vector contains derivatives of a variable.
6.
In statistical mechanics, one assumes the equivalence of time averages and ensemble averages (cross sections of identical systems).
7.
In the case of a state-dependent diffusion matrix, $\eta_{j+1} = \eta_j + G(\eta_j, x_j, \psi)\,\delta W_j$ generates a more general martingale process. Expression (16.16) remains finite in a continuum limit (see Appendix 2).
8.
These are called irreducible diffusions. A transformation $z = h(y)$ leading to unit diffusion for $z$ must fulfill the system of differential equations $h_{\alpha,\beta}\, g_{\beta\gamma} = \delta_{\alpha\gamma}$, $\alpha, \beta = 1, \ldots, p$; $\gamma = 1, \ldots, r$. The inverse transformation $y = v(z)$ fulfills $v_{\alpha,\gamma}(z) = g_{\alpha\gamma}(v(z))$. Thus $v_{\alpha,\gamma\delta} = g_{\alpha\gamma,\epsilon}\, v_{\epsilon,\delta} = v_{\alpha,\delta\gamma} = g_{\alpha\delta,\epsilon}\, v_{\epsilon,\gamma}$. Inserting $v$, one obtains the commutativity condition $g_{\alpha\gamma,\epsilon}\, g_{\epsilon\delta} = g_{\alpha\delta,\epsilon}\, g_{\epsilon\gamma}$, which is necessary and sufficient for reducibility. See Kloeden and Platen (1992, ch. 10, p. 348), Aït-Sahalia (2008).

References

Aït-Sahalia, Y. (2002). Maximum likelihood estimation of discretely sampled diffusions: A closed-form approximation approach. Econometrica, 70(1), 223–262. https://doi.org/10.1111/1468-0262.00274
Aït-Sahalia, Y. (2008). Closed-form likelihood expansions for multivariate diffusions. Annals of Statistics, 36(2), 906–937. https://doi.org/10.1214/009053607000000622
Apte, A., Hairer, M., Stuart, A. M., & Voss, J. (2007). Sampling the posterior: An approach to non-Gaussian data assimilation. Physica D: Nonlinear Phenomena, 230(1–2), 50–64. https://doi.org/10.1016/j.physd.2006.06.009
Apte, A., Jones, C. K. R. T., Stuart, A. M., & Voss, J. (2008). Data assimilation: Mathematical and statistical perspectives. International Journal for Numerical Methods in Fluids, 56(8), 1033–1046. https://doi.org/10.1002/fld.1698
Arasaratnam, I., Haykin, S., & Hurd, T. (2010). Cubature Kalman filtering for continuous-discrete systems: Theory and simulations. IEEE Transactions on Signal Processing, 58, 4977–4993. https://doi.org/10.1109/TSP.2010.2056923
Arnold, L. (1974). Stochastic differential equations. New York: Wiley.
Åström, K. J. (1970). Introduction to stochastic control theory. Mineola, NY: Courier Corporation.
Bagchi, A. (2001). Onsager-Machlup function. In Encyclopedia of mathematics. Berlin: Springer. https://www.encyclopediaofmath.org/index.php/Onsager-Machlup_function
Ballreich, D. (2017). Stable and efficient cubature-based filtering in dynamical systems. Berlin: Springer International Publishing.
Bartlett, M. S. (1946). On the theoretical specification and sampling properties of autocorrelated time-series. Journal of the Royal Statistical Society (Supplement), 7, 27–41. https://doi.org/10.2307/2983611
Basawa, I. V., & Prakasa Rao, B. L. S. (1980). Statistical inference for stochastic processes. London: Academic Press.
Bergstrom, A. R. (1976a). Non-recursive models as discrete approximations to systems of stochastic differential equations. In A. R. Bergstrom (Ed.), Statistical inference in continuous time economic models (pp. 15–26). Amsterdam: North Holland.
Bergstrom, A. R. (Ed.). (1976b). Statistical inference in continuous time economic models. Amsterdam: North Holland.
Bergstrom, A. R. (1983). Gaussian estimation of structural parameters in higher order continuous time dynamic models. Econometrica: Journal of the Econometric Society, 51(1), 117–152. https://doi.org/10.2307/1912251
Bergstrom, A. R. (1988). The history of continuous-time econometric models. Econometric Theory, 4, 365–383. https://doi.org/10.1017/S0266466600013359
Beskos, A., Papaspiliopoulos, O., Roberts, G. O., & Fearnhead, P. (2006). Exact and efficient likelihood-based inference for discretely observed diffusion processes (with discussion). Journal of the Royal Statistical Society Series B, 68, 333–382. https://doi.org/10.1111/j.1467-9868.2006.00552.x
Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81, 637–654. https://doi.org/10.1086/260062
Chang, J., & Chen, S. X. (2011). On the approximate maximum likelihood estimation for diffusion processes. The Annals of Statistics, 39(6), 2820–2851. https://doi.org/10.1214/11-AOS922
Chow, S.-M., Ferrer, E., & Nesselroade, J. R. (2007). An unscented Kalman filter approach to the estimation of nonlinear dynamical systems models. Multivariate Behavioral Research, 42(2), 283–321. https://doi.org/10.1080/00273170701360423
Chow, S.-M., Lu, Z., Sherwood, A., & Zhu, H. (2016). Fitting nonlinear ordinary differential equation models with random effects and unknown initial conditions using the stochastic approximation expectation–maximization (SAEM) algorithm. Psychometrika, 81(1), 102–134. https://doi.org/10.1007/s11336-014-9431-z
Da Prato, G. (2004). Kolmogorov equations for stochastic PDEs. Basel: Birkhäuser. https://doi.org/10.1007/978-3-0348-7909-5
Da Prato, G., & Zabczyk, J. (1992). Stochastic equations in infinite dimensions. New York: Cambridge University Press. https://doi.org/10.1017/CBO9780511666223
Doreian, P., & Hummon, N. P. (1976). Modelling social processes. New York, Oxford, Amsterdam: Elsevier.
Durham, G. B., & Gallant, A. R. (2002). Numerical techniques for simulated maximum likelihood estimation of stochastic differential equations. Journal of Business and Economic Statistics, 20, 297–316. https://doi.org/10.1198/073500102288618397
Elerian, O., Chib, S., & Shephard, N. (2001). Likelihood inference for discretely observed nonlinear diffusions. Econometrica, 69(4), 959–993. https://doi.org/10.1111/1468-0262.00226
Feller, W. (1951). Two singular diffusion problems. Annals of Mathematics, 54, 173–182. https://doi.org/10.2307/1969318
Flury, T., & Shephard, N. (2011). Bayesian inference based only on simulated likelihood: particle filter analysis of dynamic economic models. Econometric Theory, 27(5), 933–956. https://doi.org/10.1017/S0266466610000599
Fuchs, C. (2013). Inference for diffusion processes: With applications in life sciences. Berlin: Springer. https://doi.org/10.1007/978-3-642-25969-2
Gard, T. C. (1988). Introduction to stochastic differential equations. New York: Dekker.
Girolami, M., & Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(2), 123–214. https://doi.org/10.1111/j.1467-9868.2010.00765.x
Hairer, M., Stuart, A. M., & Voss, J. (2007). Analysis of SPDEs arising in path sampling, part II: The nonlinear case. Annals of Applied Probability, 17(5), 1657–1706. https://doi.org/10.1214/07-AAP441
Hairer, M., Stuart, A. M., & Voss, J. (2009). Sampling conditioned diffusions. In J. Blath, P. Morters, & M. Scheutzow (Eds.), Trends in stochastic analysis (pp. 159–186). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139107020.009
Hairer, M., Stuart, A. M., & Voss, J. (2011). Signal processing problems on function space: Bayesian formulation, stochastic PDEs and effective MCMC methods. In D. Crisan & B. Rozovsky (Eds.), The Oxford handbook of nonlinear filtering (pp. 833–873). Oxford: Oxford University Press.
Hairer, M., Stuart, A. M., Voss, J., & Wiberg, P. (2005). Analysis of SPDEs arising in path sampling, part I: The Gaussian case. Communications in Mathematical Sciences, 3(4), 587–603. https://doi.org/10.4310/CMS.2005.v3.n4.a8
Haken, H. (1977). Synergetics. Berlin: Springer.
Haken, H., Kelso, J. A. S., & Bunz, H. (1985). A theoretical model of phase transitions in human hand movements. Biological Cybernetics, 51(5), 347–356. https://doi.org/10.1007/BF00336922
Hamerle, A., Nagl, W., & Singer, H. (1991). Problems with the estimation of stochastic differential equations using structural equations models. Journal of Mathematical Sociology, 16(3), 201–220. https://doi.org/10.1080/0022250X.1991.9990088
Harvey, A. C., & Stock, J. (1985). The estimation of higher order continuous time autoregressive models. Econometric Theory, 1, 97–112. https://doi.org/10.1017/S0266466600011026
Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1), 97–109. https://doi.org/10.1093/biomet/57.1.97
Herings, J. P. (1996). Static and dynamic aspects of general disequilibrium theory. Boston: Springer. https://doi.org/10.1007/978-1-4615-6251-1
Jazwinski, A. H. (1970). Stochastic processes and filtering theory. New York: Academic Press.
Jetschke, G. (1986). On the equivalence of different approaches to stochastic partial differential equations. Mathematische Nachrichten, 128, 315–329. https://doi.org/10.1002/mana.19861280127
Jetschke, G. (1991). Lattice approximation of a nonlinear stochastic partial differential equation with white noise. In International series of numerical mathematics (Vol. 102, pp. 107–126). Basel: Birkhäuser. https://doi.org/10.1007/978-3-0348-6413-8_8
Jones, R. H. (1984). Fitting multivariate models to unequally spaced data. In E. Parzen (Ed.), Time series analysis of irregularly observed data (pp. 158–188). New York: Springer. https://doi.org/10.1007/978-1-4684-9403-7_8
Jones, R. H., & Tryon, P. V. (1987). Continuous time series models for unequally spaced data applied to modeling atomic clocks. SIAM Journal of Scientific and Statistical Computing, 8, 71–81. https://doi.org/10.1137/0908007
Kac, M. (1980). Integration in function spaces and some of its applications. Pisa: Scuola normale superiore.
Kloeden, P. E., & Platen, E. (1992). Numerical solution of stochastic differential equations. Berlin: Springer. https://doi.org/10.1007/978-3-662-12616-5
Kloeden, P. E., & Platen, E. (1999). Numerical solution of stochastic differential equations. Berlin: Springer. (corrected third printing)
Langevin, P. (1908). Sur la théorie du mouvement brownien [On the theory of Brownian motion]. Comptes Rendus de l’Academie des Sciences (Paris), 146, 530–533.
Li, C. (2013). Maximum-likelihood estimation for diffusion processes via closed-form density expansions. The Annals of Statistics, 41(3), 1350–1380. https://doi.org/10.1214/13-AOS1118
Lighthill, M. J. (1958). Introduction to Fourier analysis and generalised functions. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139171427
Liptser, R. S., & Shiryayev, A. N. (2001). Statistics of random processes (Vols. I and II, 2nd ed.). New York: Springer.
Lorenz, E. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20, 130.
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society: Series B, 44(2), 226–233.
Malik, S., & Pitt, M. K. (2011). Particle filters for continuous likelihood evaluation and maximisation. Journal of Econometrics, 165(2), 190–209. https://doi.org/10.1016/j.jeconom.2011.07.006
Mardia, K. V., Kent, J. T., & Bibby, J. M. (1979). Multivariate analysis. London: Academic Press.
Molenaar, P., & Newell, K. M. (2003). Direct fit of a theoretical model of phase transition in oscillatory finger motions. British Journal of Mathematical and Statistical Psychology, 56(2), 199–214. https://doi.org/10.1348/000711003770480002
Moler, C., & Van Loan, C. (2003). Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Review, 45(1), 1–46. https://doi.org/10.1137/S00361445024180
Newell, K. M., & Molenaar, P. C. M. (2014). Applications of nonlinear dynamics to developmental process modeling. Hove: Psychology Press.
Okano, K., Schülke, L., & Zheng, B. (1993). Complex Langevin simulation. Progress of Theoretical Physics Supplement, 111, 313–346. https://doi.org/10.1143/PTPS.111.313
Onsager, L., & Machlup, S. (1953). Fluctuations and irreversible processes. Physical Review, 91(6), 1505–1515. https://doi.org/10.1103/PhysRev.91.1505
Oud, J. H. L., & Jansen, R. A. R. G. (2000). Continuous time state space modeling of panel data by means of SEM. Psychometrika, 65, 199–215. https://doi.org/10.1007/BF02294374
Oud, J. H. L., & Singer, H. (2008). Continuous time modeling of panel data: SEM versus filter techniques. Statistica Neerlandica, 62(1), 4–28.
Ozaki, T. (1985). Nonlinear time series and dynamical systems. In E. Hannan (Ed.), Handbook of statistics (pp. 25–83). Amsterdam: North Holland.
Parisi, G., & Wu, Y.-S. (1981). Perturbation theory without gauge fixing. Scientia Sinica, 24, 483.
Pedersen, A. R. (1995). A new approach to maximum likelihood estimation for stochastic differential equations based on discrete observations. Scandinavian Journal of Statistics, 22, 55–71.
Pitt, M. K. (2002). Smooth particle filters for likelihood evaluation and maximisation (Warwick economic research papers No. 651). University of Warwick. http://wrap.warwick.ac.uk/1536/
Reznikoff, M. G., & Vanden-Eijnden, E. (2005). Invariant measures of stochastic partial differential equations and conditioned diffusions. Comptes Rendus Mathematiques, 340, 305–308. https://doi.org/10.1016/j.crma.2004.12.025
Risken, H. (1989). The Fokker-Planck equation (2nd ed.). Berlin: Springer. https://doi.org/10.1007/978-3-642-61544-3
Roberts, G. O., & Stramer, O. (2001). On inference for partially observed nonlinear diffusion models using the Metropolis–Hastings algorithm. Biometrika, 88(3), 603–621. https://doi.org/10.1093/biomet/88.3.603
Roberts, G. O., & Stramer, O. (2002). Langevin diffusions and Metropolis-Hastings algorithms. Methodology and Computing in Applied Probability, 4(4), 337–357. https://doi.org/10.1023/A:1023562417138
Rümelin, W. (1982). Numerical treatment of stochastic differential equations. SIAM Journal of Numerical Analysis, 19(3), 604–613. https://doi.org/10.1137/0719041
Särkkä, S. (2013). Bayesian filtering and smoothing (Vol. 3). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139344203
Särkkä, S., Hartikainen, J., Mbalawata, I. S., & Haario, H. (2013). Posterior inference on parameters of stochastic differential equations via non-linear Gaussian filtering and adaptive MCMC. Statistics and Computing, 25(2), 427–437. https://doi.org/10.1007/s11222-013-9441-1
Schiesser, W. E. (1991). The numerical method of lines. San Diego: Academic Press.
Schuster, H. G., & Just, W. (2006). Deterministic chaos: An introduction. New York: Wiley.
Shoji, I., & Ozaki, T. (1997). Comparative study of estimation methods for continuous time stochastic processes. Journal of Time Series Analysis, 18(5), 485–506. https://doi.org/10.1111/1467-9892.00064
Shoji, I., & Ozaki, T. (1998a). Estimation for nonlinear stochastic differential equations by a local linearization method 1. Stochastic Analysis and Applications, 16(4), 733–752. https://doi.org/10.1080/07362999808809559
Shoji, I., & Ozaki, T. (1998b). A statistical method of estimation and simulation for systems of stochastic differential equations. Biometrika, 85(1), 240–243. https://doi.org/10.1093/biomet/85.1.240
Silverman, B. W. (1986). Density estimation for statistics and data analysis. London: Chapman and Hall. https://doi.org/10.1007/978-1-4899-3324-9
Singer, H. (1986). Depressivität und gelernte Hilflosigkeit als Stochastischer Prozeß [Depression and learned helplessness as stochastic process] (Unpublished master’s thesis). Universität Konstanz, Konstanz, Germany.
Singer, H. (1990). Parameterschätzung in zeitkontinuierlichen dynamischen Systemen [Parameter estimation in continuous time dynamical systems]. Konstanz: Hartung-Gorre-Verlag.
Singer, H. (1993). Continuous-time dynamical systems with sampled data, errors of measurement and unobserved components. Journal of Time Series Analysis, 14(5), 527–545. https://doi.org/10.1111/j.1467-9892.1993.tb00162.x
Singer, H. (1995). Analytical score function for irregularly sampled continuous time stochastic processes with control variables and missing values. Econometric Theory, 11, 721–735. https://doi.org/10.1017/S0266466600009701
Singer, H. (1998). Continuous panel models with time dependent parameters. Journal of Mathematical Sociology, 23, 77–98. https://doi.org/10.1080/0022250X.1998.9990214
Singer, H. (2002). Parameter estimation of nonlinear stochastic differential equations: Simulated maximum likelihood vs. extended Kalman filter and Itô-Taylor expansion. Journal of Computational and Graphical Statistics, 11(4), 972–995. https://doi.org/10.1198/106186002808
Singer, H. (2003). Simulated maximum likelihood in nonlinear continuous-discrete state space models: Importance sampling by approximate smoothing. Computational Statistics, 18(1), 79–106. https://doi.org/10.1007/s001800300133
Singer, H. (2005). Continuous-discrete unscented Kalman filtering (Diskussionsbeiträge Fachbereich Wirtschaftswissenschaft No. 384). FernUniversität in Hagen. http://www.fernunihagen.de/lsstatistik/publikationen/ukf2005.shtml
Singer, H. (2008a). Generalized Gauss-Hermite filtering. Advances in Statistical Analysis, 92(2), 179–195. https://doi.org/10.1007/s10182-008-0068-z
Singer, H. (2008b). Nonlinear continuous time modeling approaches in panel research. Statistica Neerlandica, 62(1), 29–57.
Singer, H. (2010). SEM modeling with singular moment matrices. Part I: ML-estimation of time series. Journal of Mathematical Sociology, 34(4), 301–320. https://doi.org/10.1080/0022250X.2010.509524
Singer, H. (2011). Continuous-discrete state-space modeling of panel data with nonlinear filter algorithms. Advances in Statistical Analysis, 95, 375–413. https://doi.org/10.1007/s10182-011-0172-3
Singer, H. (2012). SEM modeling with singular moment matrices. Part II: ML-estimation of sampled stochastic differential equations. Journal of Mathematical Sociology, 36(1), 22–43. https://doi.org/10.1080/0022250X.2010.532259
Singer, H. (2014). Importance sampling for Kolmogorov backward equations. Advances in Statistical Analysis, 98(4), 345–369. https://doi.org/10.1007/s10182-013-0223-z
Singer, H. (2016). Simulated maximum likelihood for continuous-discrete state space models using Langevin importance sampling (Diskussionsbeiträge Fakultät Wirtschaftswissenschaft No. 497). Paper presented at the 9th International Conference on Social Science Methodology (RC33), 11–16 September 2016, Leicester, UK.
Stramer, O., Bognar, M., & Schneider, P. (2010). Bayesian inference for discretely sampled Markov processes with closed-form likelihood expansions. Journal of Financial Econometrics, 8(4), 450–480. https://doi.org/10.1093/jjfinec/nbp027
Stratonovich, R. L. (1971). On the probability functional of diffusion processes. In Selected translations in mathematical statistics and probability, Vol. 10 (pp. 273–286).
Stratonovich, R. L. (1989). Some Markov methods in the theory of stochastic processes in nonlinear dynamic systems. In F. Moss & P. McClintock (Eds.), Noise in nonlinear dynamic systems (pp. 16–71). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511897818.004
Stuart, A. M., Voss, J., & Wiberg, P. (2004). Conditional path sampling of SDEs and the Langevin MCMC method. Communications in Mathematical Sciences, 2(4), 685–697. https://doi.org/10.4310/CMS.2004.v2.n4.a7
Thomas, E. A. C., & Martin, J. A. (1976). Analyses of parent-infant-interaction. Psychological Review, 83(2), 141–156. https://doi.org/10.1037/0033-295X.83.2.141
Van Kampen, N. G. (1981). Itô vs. Stratonovich. Journal of Statistical Physics, 24, 175–187. https://doi.org/10.1007/BF01007642
Wei, G. W., Zhang, D. S., Kouri, D. J., & Hoffman, D. K. (1997). Distributed approximating functional approach to the Fokker-Planck equation: Time propagation. Journal of Chemical Physics, 107(8), 3239–3246. https://doi.org/10.1063/1.474674
Weidlich, W., & Haag, G. (1983). Quantitative sociology. Berlin: Springer.
Wong, E., & Hajek, B. (1985). Stochastic processes in engineering systems. New York: Springer. https://doi.org/10.1007/978-1-4612-5060-9
Yoo, H. (2000). Semi-discretization of stochastic partial differential equations on R1 by a finite-difference method. Mathematics of Computation, 69(230), 653–666. https://doi.org/10.1090/S0025-5718-99-01150-3

Author information

Authors and Affiliations

Department of Economics, FernUniversität in Hagen, Hagen, Germany
Hermann Singer

Corresponding author

Correspondence to Hermann Singer.

Editor information

Editors and Affiliations

Marketing and Supply Chain Management, Nyenrode Business University, Breukelen, The Netherlands
Kees van Montfort
Behavioural Science Institute, University of Nijmegen, Nijmegen, The Netherlands
Johan H. L. Oud
Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany
Manuel C. Voelkle

Appendices

Appendix 1: Langevin Sampler: Analytic Drift Function

16.1.1 Notation

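In the following, the components of vectors and matrices are denoted by Greek letters, e.g., $f_\alpha$, $\alpha = 1, \ldots, p$, and partial derivatives by commas, i.e., $f_{\alpha,\beta} := \partial f_\alpha/\partial\eta_\beta = \partial_\beta f_\alpha = (f_\eta)_{\alpha\beta}$. The Jacobian matrix $\partial f/\partial\eta$ is written as $f_\eta$ and its $\beta$th column as $(f_\eta)_{\bullet\beta}$. Likewise, $\varOmega_{\alpha\bullet}$ denotes row $\alpha$ of the matrix $\varOmega_{\alpha\beta}$, and $\varOmega_{\bullet\bullet} = \varOmega$ for short.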

Latin indices denote time, e.g., $f_{j\alpha} = f_\alpha(\eta_j)$. Furthermore, a sum convention is used for the Greek indices (i.e., $f_\alpha g_\alpha = \sum_\alpha f_\alpha g_\alpha$). The difference operators $\delta = B^{-1} - 1$ and $\nabla = 1 - B$, with the backshift $B\eta_j = \eta_{j-1}$, are used frequently. One has $\delta\cdot\nabla = B^{-1} - 2 + B := \Delta$ for the central second difference.
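The operator identity $\delta\cdot\nabla = \Delta$ can be checked on a short sequence. The following is a minimal sketch in plain Python; the function names and the test path $\eta_j = j^2$ are illustrative choices, not part of the chapter.

```python
def forward_diff(eta, j):
    """(delta eta)_j = eta_{j+1} - eta_j, i.e. delta = B^{-1} - 1."""
    return eta[j + 1] - eta[j]


def backward_diff(eta, j):
    """(nabla eta)_j = eta_j - eta_{j-1}, i.e. nabla = 1 - B."""
    return eta[j] - eta[j - 1]


def second_diff(eta, j):
    """(Delta eta)_j = eta_{j+1} - 2 eta_j + eta_{j-1}."""
    return eta[j + 1] - 2.0 * eta[j] + eta[j - 1]


def delta_nabla(eta, j):
    """Apply nabla first, then delta: (nabla eta)_{j+1} - (nabla eta)_j."""
    return backward_diff(eta, j + 1) - backward_diff(eta, j)


eta = [float(j * j) for j in range(6)]   # illustrative path eta_j = j^2
interior = range(1, len(eta) - 1)        # indices where both neighbours exist

# Both lists agree; for eta_j = j^2 every central second difference equals 2.
print([second_diff(eta, j) for j in interior])
print([delta_nabla(eta, j) for j in interior])
```

The composition is only defined at interior indices, which is why the boundary points $j = 0$ and $j = J$ need separate treatment in the derivatives that follow.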

16.1.2 Functional Derivatives

The functional $\varPhi(y)$ may be expanded to first order by using the functional derivative $(\delta \varPhi /\delta y)(h) = \int (\delta \varPhi /\delta y(s))\, h(s)\,ds$. One has $\varPhi(y + h) - \varPhi(y) = (\delta\varPhi/\delta y)(h) + O(\|h\|^{2})$.

A discrete version is $\varPhi(\eta) = \varPhi(\eta_0, \ldots, \eta_J)$ and $\varPhi(\eta + h) - \varPhi(\eta) = \sum_j [\partial\varPhi(\eta)/\partial(\eta_j\delta t)]\, h_j\, \delta t + O(\|h\|^{2})$. As a special case, consider the functional $\varPhi(\eta) = \eta_j$. Since $\eta_{j}+h_{j}-\eta_{j}=\sum_{k} (\delta_{jk}/\delta t)\, h_{k}\, \delta t$, one has the continuous analogue $y(t)+h(t)-y(t)=\int \delta (t-s)h(s)\,ds$, thus $\delta y(t)/\delta y(s) = \delta(t - s)$.

16.1.2.1 State-Independent Diffusion Coefficient

First we assume a state-independent diffusion coefficient $\varOmega_j = \varOmega$, but later we set $\varOmega_j = \varOmega(\eta_j, x_j)$. This is important if the Lamperti transformation does not lead to constant coefficients in multivariate models (see Note 8). In components, the term (16.15) reads

$$\displaystyle \begin{aligned} \begin{array}{rcl} S_{0} &\displaystyle =&\displaystyle \frac{1}{2} \sum_{j=0}^{J-1} (\eta_{j+1; \beta}-\eta_{j\beta}) (\varOmega_{\beta \gamma} \delta t)^{-1} (\eta_{j+1; \gamma}- \eta_{j \gamma}). \end{array} \end{aligned} $$

Note that $(\varOmega_{\beta\gamma}\delta t)^{-1} \equiv [(\varOmega\delta t)^{-1}]_{\beta\gamma}$, and the semicolon in $\eta_{j+1;\beta}$ serves only to separate the indices; it is not a derivative. Differentiation w.r.t. the state $\eta_{j\alpha}$ yields ($j = 1, \ldots, J-1$)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial S_{0}/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle -\varOmega_{\alpha \gamma}^{-1} \delta t^{-2} (\eta_{j+1; \gamma}-2\eta_{j\gamma}+ \eta_{j-1; \gamma}) \end{array} \end{aligned} $$

(16.40)

In vector notation, we have $\partial S_0/\partial(\eta_j\delta t) = -\varOmega^{-1}\delta t^{-2}\Delta\eta_j$. On the boundaries $j = 0$ and $j = J$ we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} \partial S_{0}/\partial(\eta_{0 \alpha}\delta t) &\displaystyle =&\displaystyle -\varOmega_{\alpha \gamma}^{-1} \delta t^{-2} (\eta_{1\gamma}-\eta_{0\gamma}) \\ \partial S_{0}/\partial(\eta_{J \alpha}\delta t) &\displaystyle =&\displaystyle \varOmega_{\alpha \gamma}^{-1} \delta t^{-2} (\eta_{J\gamma}-\eta_{J-1;\gamma}) \end{array} \end{aligned} $$
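The interior and boundary derivatives of $S_0$ can be verified numerically. The sketch below assumes a scalar state ($p = 1$) with constant $\varOmega$ and compares the analytic expressions with central finite differences; the function names, the test path, and the parameter values are hypothetical.

```python
def s0(eta, omega, dt):
    """S_0 = 1/2 * sum_j (eta_{j+1} - eta_j)^2 / (omega * dt), scalar state."""
    return 0.5 * sum((eta[j + 1] - eta[j]) ** 2 / (omega * dt)
                     for j in range(len(eta) - 1))


def grad_s0(eta, omega, dt):
    """Analytic dS_0 / d(eta_j dt) for j = 0, ..., J."""
    J = len(eta) - 1
    grad = []
    for j in range(J + 1):
        if j == 0:                       # left boundary
            diff = -(eta[1] - eta[0])
        elif j == J:                     # right boundary
            diff = eta[J] - eta[J - 1]
        else:                            # interior: minus the second difference
            diff = -(eta[j + 1] - 2.0 * eta[j] + eta[j - 1])
        grad.append(diff / (omega * dt ** 2))
    return grad


def numeric_grad_s0(eta, omega, dt, eps=1e-6):
    """Central finite differences, scaled by 1/dt to match d/d(eta_j dt)."""
    grad = []
    for j in range(len(eta)):
        up, down = list(eta), list(eta)
        up[j] += eps
        down[j] -= eps
        grad.append((s0(up, omega, dt) - s0(down, omega, dt)) / (2.0 * eps * dt))
    return grad


eta = [0.0, 0.3, -0.1, 0.4, 0.2]         # hypothetical discretized path
analytic = grad_s0(eta, omega=0.5, dt=0.1)
numeric = numeric_grad_s0(eta, omega=0.5, dt=0.1)
# Largest deviation between the two gradients; it should be at rounding level.
print(max(abs(a - b) for a, b in zip(analytic, numeric)))
```

Because $S_0$ is quadratic in $\eta$, the central difference is exact up to rounding, so any visible discrepancy would point to a sign or index error in the analytic formula.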

Next, the derivatives of $\log \alpha (\eta )$ are needed. One gets

$$\displaystyle \begin{aligned} \begin{array}{rcl} \partial S_{1}/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle -\delta t^{-1}[f_{j\beta,\alpha}\varOmega_{\beta \gamma}^{-1}\delta \eta_{j\gamma} -\varOmega_{\alpha \gamma}^{-1}(f_{j\gamma}-f_{j-1;\gamma})] \end{array} \end{aligned} $$

or in vector form, using difference operators

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial S_{1}/\partial(\eta_{j\alpha}\delta t) &\displaystyle =&\displaystyle -\delta t^{-1}[f_{j\bullet,\alpha}^{\prime}\varOmega^{-1}\delta \eta_{j} -\varOmega^{-1}\delta f_{j-1}], \end{array} \end{aligned} $$

(16.41)

where $f_{j\bullet,\alpha}$ is column $\alpha$ of the Jacobian $f_\eta(\eta_j)$. The second term yields

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial S_{2}/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle \partial/\partial\eta_{j \alpha}\frac{1}{2}[f_{j\beta}\varOmega_{\beta \gamma}^{-1}f_{j\gamma}] = f_{j\beta,\alpha}\varOmega_{\beta \gamma}^{-1}f_{j\gamma}\\ &\displaystyle =&\displaystyle f_{j\bullet,\alpha}^{\prime}\varOmega^{-1}f_{j}. \end{array} \end{aligned} $$

(16.42)
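Expressions (16.41) and (16.42) can be checked in the same way. The sketch below assumes a scalar state and takes $S_1 = -\sum_j f_j \varOmega^{-1}\delta\eta_j$ and $S_2 = \frac{1}{2}\sum_j f_j \varOmega^{-1} f_j\,\delta t$, the forms implied by the derivatives above; the cubic drift and all parameter values are hypothetical.

```python
A, B = -1.0, 0.1                 # hypothetical drift coefficients


def f(x):
    """Illustrative cubic drift f(x) = -(A x + B x^3)."""
    return -(A * x + B * x ** 3)


def df(x):
    """Derivative of the drift."""
    return -(A + 3.0 * B * x ** 2)


def s1(eta, omega, dt):
    """Assumed form: S_1 = -sum_j f(eta_j) (eta_{j+1} - eta_j) / omega."""
    return -sum(f(eta[j]) * (eta[j + 1] - eta[j]) / omega
                for j in range(len(eta) - 1))


def s2(eta, omega, dt):
    """Assumed form: S_2 = 1/2 sum_j f(eta_j)^2 dt / omega."""
    return 0.5 * sum(f(eta[j]) ** 2 * dt / omega for j in range(len(eta) - 1))


def grad_s1(eta, omega, dt, j):
    """Scalar version of (16.41) at an interior index j."""
    d_eta = eta[j + 1] - eta[j]
    d_f = f(eta[j]) - f(eta[j - 1])
    return -(df(eta[j]) * d_eta / omega - d_f / omega) / dt


def grad_s2(eta, omega, dt, j):
    """Scalar version of (16.42) at an interior index j."""
    return df(eta[j]) * f(eta[j]) / omega


def numeric_grad(func, eta, omega, dt, j, eps=1e-6):
    """Central finite difference of func w.r.t. eta_j, scaled by 1/dt."""
    up, down = list(eta), list(eta)
    up[j] += eps
    down[j] -= eps
    return (func(up, omega, dt) - func(down, omega, dt)) / (2.0 * eps * dt)


eta = [0.2, 0.5, 0.1, -0.3, 0.4]  # hypothetical discretized path
for j in range(1, len(eta) - 1):
    print(j,
          grad_s1(eta, 0.5, 0.1, j) - numeric_grad(s1, eta, 0.5, 0.1, j),
          grad_s2(eta, 0.5, 0.1, j) - numeric_grad(s2, eta, 0.5, 0.1, j))
```

Only interior indices are compared here, since the boundary points pick up one-sided contributions that are not covered by (16.41).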

Finally, one has to determine the drift component corresponding to the measurements, which is contained in the conditional density p(z|η). Since it was assumed that the error of measurement is Gaussian (see 16.2), we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} p(z|\eta) &\displaystyle =&\displaystyle \prod_{i=0}^{T} p(z_{i}|\eta_{j_{i}}) = \prod_{i=0}^{T} \phi(z_{i};h_i,R_{i}), \end{array} \end{aligned} $$

where $\phi(y;\mu,\varSigma)$ is the multivariate Gaussian density, $h_i=h(\eta _{j_{i}},x_{j_{i}})$ is the output function and $R_{i}=R(x_{j_{i}})$ is the measurement error covariance matrix. Thus the derivative reads (matrix form in the second line)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial \log p(z|\eta)/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle \sum_{i=0}^{T}h_{i\gamma,\alpha}R_{i\beta\gamma}^{-1}(z_{i\beta}-h_{i\beta}) (\delta_{j j_{i}}/\delta t)\\ &\displaystyle =&\displaystyle \sum_{i=0}^{T}h_{i\bullet,\alpha}^{\prime}R_{i}^{-1}(z_{i}-h_{i}) (\delta_{j j_{i}}/\delta t) \end{array} \end{aligned} $$

(16.43)

The Kronecker symbol $\delta _{j j_{i}}$ only gives contributions at the measurement times $t_{i}=\tau _{j_{i}}$. Together we obtain for the drift of the Langevin equation (16.14)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \delta_{\eta}\log p(\eta|z) &\displaystyle =&\displaystyle \delta_{\eta}[\log p(z|\eta) + \log p(\eta)]\\ &\displaystyle =&\displaystyle \text{(16.43)} - [\text{(16.40)} +\text{(16.41)}+\text{(16.42)}] + \delta_{\eta}\log p(\eta_{0}). \end{array} \end{aligned} $$

(16.44)

Here, p(η ₀) is an arbitrary density for the initial latent state.
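The measurement term (16.43) can be illustrated with a minimal sketch for a scalar state. The output function h(η) = η², the values of z, R, and δt, and the names `measurement_drift` and `log_lik` are assumptions made only for this illustration; they do not come from the text. The sketch shows that the drift contribution vanishes except at the grid points j_i that carry a measurement, where it is scaled by 1/δt.

```python
import numpy as np

def h(x):
    """Hypothetical output function h(eta) = eta^2 (illustration only)."""
    return x**2

def h_x(x):
    """Derivative of the hypothetical output function."""
    return 2.0 * x

def measurement_drift(eta, z, meas_idx, R, dt):
    """Scalar version of (16.43): h' R^{-1} (z - h) / dt at measurement points.

    Assumes the entries of meas_idx are distinct grid indices j_i.
    """
    drift = np.zeros_like(eta)
    x = eta[meas_idx]
    drift[meas_idx] = h_x(x) * (z - h(x)) / R / dt
    return drift

def log_lik(eta, z, meas_idx, R):
    """log p(z | eta) for independent Gaussian measurement errors."""
    x = eta[meas_idx]
    return np.sum(-0.5 * np.log(2.0 * np.pi * R) - 0.5 * (z - h(x))**2 / R)

eta = np.linspace(0.5, 1.5, 11)      # latent path on the grid tau_0..tau_10
meas_idx = np.array([0, 5, 10])      # grid indices j_i of the measurements
z = np.array([0.3, 1.1, 2.0])        # made-up measurements
R, dt = 0.1, 0.1
drift = measurement_drift(eta, z, meas_idx, R, dt)
```

Comparing `drift` with a finite-difference gradient of `log_lik`, divided by δt, is a simple way to check the sign and the 1/δt scaling.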

16.1.2.2 State-Dependent Diffusion Coefficient

In the case of Ω _j = Ω(η _j, x _j) the expressions get more complicated. The derivative of S ₀ now reads

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial S_{0}/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle \delta t^{-2}[\varOmega_{j-1;\alpha \beta}^{-1} \delta\eta_{j-1; \beta} -\varOmega_{j\alpha \beta}^{-1} \delta\eta_{j \beta} \\ &\displaystyle &\displaystyle +\frac{1}{2} \delta\eta_{j \beta}\varOmega_{j\beta\gamma,\alpha }^{-1} \delta\eta_{j \gamma}], \end{array} \end{aligned} $$

(16.45)

Here, $\varOmega _{j\beta \gamma ,\alpha }^{-1}\equiv (\varOmega ^{-1})_{j\beta \gamma ,\alpha }$ denotes the derivative of the inverse matrix. A closer relation to expression (16.40) may be obtained by the Taylor expansion

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \varOmega_{j-1;\alpha \beta}^{-1} &\displaystyle =&\displaystyle \varOmega_{j\alpha \beta}^{-1} + \varOmega_{j\alpha \beta,\gamma}^{-1} (\eta_{j-1;\gamma}-\eta_{j\gamma})+O(\|\delta\eta_{j-1}\|{}^{2}) \end{array} \end{aligned} $$

(16.46)

leading to

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial S_{0}/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle -\varOmega_{j\alpha \beta}^{-1} \delta t^{-2} (\eta_{j+1; \beta}-2\eta_{j \beta}+ \eta_{j-1; \beta}) \\ &\displaystyle &\displaystyle - \varOmega_{j\alpha \beta,\gamma}^{-1} \delta t^{-2}\delta\eta_{j-1; \beta}\delta\eta_{j-1; \gamma} +O(\delta t^{-2}\|\delta\eta_{j-1}\|{}^{3})\\ &\displaystyle &\displaystyle + \frac{1}{2} \varOmega_{j \beta \gamma, \alpha}^{-1} \delta t^{-2}\delta\eta_{j\beta}\delta\eta_{j\gamma}. \end{array} \end{aligned} $$

(16.47)

In the state-dependent case, the derivative of the Jacobian term $\log Z^{-1}=-\frac {1}{2} \sum _{j} \log |2\pi \varOmega _{j}\delta t|$ is also needed. Since the derivative of a log determinant is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \partial \log|\varOmega| /\partial\varOmega_{\alpha\beta} &\displaystyle =&\displaystyle \varOmega_{\beta \alpha}^{-1}, \end{array} \end{aligned} $$

one obtains

$$\displaystyle \begin{aligned} \begin{array}{rcl} \partial \log Z^{-1}/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle -\frac{1}{2} \delta t^{-1}\varOmega_{j\beta\gamma}^{-1} \varOmega_{j\beta\gamma,\alpha} = -\frac{1}{2} \delta t^{-1} \mbox{tr}[\varOmega_{j}^{-1} \varOmega_{j,\alpha}], \end{array} \end{aligned} $$

with $\varOmega _{j,\alpha } = \varOmega _{j\bullet \bullet ,\alpha }$ for short. Differentiating the identity $\varOmega _{j} \varOmega _{j}^{-1} = I$ gives $\varOmega _{j,\alpha } = -\varOmega _{j} \varOmega ^{-1}_{j,\alpha } \varOmega _{j}$, and we find

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial \log Z^{-1}/\partial(\eta_{j \alpha}\delta t) = \frac{1}{2} \delta t^{-1} \mbox{tr}[\varOmega^{-1}_{j,\alpha} \varOmega_{j} ]. \end{array} \end{aligned} $$

(16.48)
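The step leading to (16.48) can be checked numerically. The following sketch assumes a hypothetical 2×2 diffusion matrix Ω(x) depending on a single scalar argument x (the function `omega` and the helper `ddx` are illustrative names, not part of the text) and uses central differences for all derivatives.

```python
import numpy as np

def omega(x):
    """Hypothetical symmetric positive definite diffusion matrix Omega(x)."""
    return np.array([[1.0 + x**2, 0.5 * x],
                     [0.5 * x, 2.0 + np.sin(x)]])

def ddx(fun, x, eps=1e-5):
    """Central finite difference of a scalar- or matrix-valued function."""
    return (fun(x + eps) - fun(x - eps)) / (2.0 * eps)

x0 = 0.7
Om = omega(x0)
Om_inv = np.linalg.inv(Om)
dOm = ddx(omega, x0)                                   # Omega_{,alpha}
dOm_inv = ddx(lambda x: np.linalg.inv(omega(x)), x0)   # (Omega^{-1})_{,alpha}
dlogdet = ddx(lambda x: np.log(np.linalg.det(omega(x))), x0)

lhs = np.trace(Om_inv @ dOm)     # tr[Omega^{-1} Omega_{,alpha}]
rhs = -np.trace(dOm_inv @ Om)    # -tr[(Omega^{-1})_{,alpha} Omega]
```

Up to discretization error, `lhs`, `rhs`, and `dlogdet` agree, which is the sign change used between the trace expression and (16.48).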

The contributions of S ₁ and S ₂ are now (see 16.16)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} &\displaystyle {}{}{}&\displaystyle {\partial S_{1}/\partial(\eta_{j \alpha}\delta t)=} \\ &\displaystyle &\displaystyle -\delta t^{-1}[f_{j\beta,\alpha}\varOmega_{j\beta \gamma}^{-1}\delta \eta_{j\gamma} -(\varOmega_{j\alpha \gamma}^{-1}f_{j\gamma} - \varOmega_{j-1;\alpha \gamma}^{-1}f_{j-1;\gamma}) +f_{j\beta}\varOmega_{j\beta\gamma,\alpha}^{-1}\delta\eta_{j\gamma}] \end{array} \end{aligned} $$

(16.49)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial S_{2}/\partial(\eta_{j \alpha}\delta t) &\displaystyle =&\displaystyle f_{j\beta,\alpha}\varOmega_{j\beta \gamma}^{-1}f_{j\gamma}+ \frac{1}{2} f_{j\beta}\varOmega_{j\beta \gamma,\alpha}^{-1}f_{j\gamma}. \end{array} \end{aligned} $$

(16.50)

It is interesting to compare the terms in (16.45, 16.49, 16.50) depending on the derivative $\varOmega _{j\beta \gamma ,\alpha }^{-1}$, which read in vector form

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \frac{1}{2} \delta t^{-2} \mbox{tr}[\varOmega_{j, \alpha}^{-1} \delta\eta_{j}\delta\eta_{j}^{\prime}] -\delta t^{-1} \mbox{tr}[\varOmega_{j,\alpha}^{-1}\delta\eta_{j}f_{j}^{\prime}] +\frac{1}{2} \mbox{tr}[\varOmega_{j,\alpha}^{-1}f_{j}f_{j}^{\prime}], \end{array} \end{aligned} $$

and the Jacobian derivative (16.48). The terms can be collected to yield

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \frac{1}{2} \delta t^{-2} \mbox{tr}\{\varOmega^{-1}_{j,\alpha} [\varOmega_{j}\delta t - (\delta\eta_{j}-f_{j}\delta t)(\delta\eta_{j}-f_{j}\delta t)^{\prime}]\}, \end{array} \end{aligned} $$

(16.51)

as may be directly seen from the Lagrangian (16.7).

In summary, the Langevin drift component (jα), j = 0, …, J; α = 1, …, p, is in vector-matrix form

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \delta_{\eta_{j\alpha}}\log p(\eta|z) &\displaystyle =&\displaystyle \delta_{\eta_{j\alpha}}[\log p(z|\eta) + \log p(\eta)] \\ &\displaystyle =&\displaystyle \sum_{i=0}^{T} h_{i\bullet,\alpha}^{\prime} R_{i}^{-1}(z_{i}-h_{i}) (\delta_{j j_{i}}/\delta t)\\ &\displaystyle &\displaystyle + \delta t^{-2}[ \varOmega_{j\alpha\bullet}^{-1} \delta\eta_{j} -\varOmega_{j-1;\alpha\bullet}^{-1} \delta\eta_{j-1}]\\ &\displaystyle &\displaystyle + \delta t^{-1}[f_{j\bullet,\alpha}^{\prime}\varOmega_{j}^{-1}\delta \eta_{j} -(\varOmega_{j\alpha\bullet}^{-1}f_{j} - \varOmega_{j-1;\alpha\bullet}^{-1}f_{j-1})]\\ &\displaystyle &\displaystyle - f_{j\bullet,\alpha}^{\prime}\varOmega_{j}^{-1}f_{j} \\ &\displaystyle &\displaystyle + \frac{1}{2} \delta t^{-2} \mbox{tr}\{\varOmega^{-1}_{j,\alpha} [\varOmega_{j}\delta t - (\delta\eta_{j}-f_{j}\delta t)(\delta\eta_{j}-f_{j}\delta t)^{\prime}]\}\\ &\displaystyle &\displaystyle + \delta_{\eta_{j\alpha}}\log p(\eta_{0}). \end{array} \end{aligned} $$

(16.52)

Here, h _i•,α is column α of the Jacobian $h_{\eta }(\eta _{j_{i}})$, $\varOmega _{j\alpha \bullet }^{-1}$ is row α of Ω(η _j)⁻¹, $\varOmega _{j,\alpha }^{-1}:=\varOmega _{j \bullet \bullet ,\alpha }^{-1}$, and f _j•,α denotes column α of the Jacobian f _η(η _j).
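As a plausibility check of (16.52), the following sketch evaluates the prior part of the drift (all terms except the measurement and initial-density terms) for a scalar state at interior grid points j = 1, …, J − 1 and compares it with a finite-difference gradient of the Euler log density. The geometric Brownian motion parametrization f = μη, Ω = σ²η², the parameter values, and all function names are assumptions chosen for this illustration; boundary points j = 0 and j = J are deliberately left out.

```python
import numpy as np

MU, SIGMA = 0.07, 0.2  # hypothetical GBM parameters

def f(x):              # drift f(eta) = mu * eta
    return MU * x

def f_x(x):            # Jacobian of f (scalar case)
    return MU * np.ones_like(x)

def omega(x):          # diffusion Omega(eta) = sigma^2 * eta^2
    return SIGMA**2 * x**2

def omega_inv_x(x):    # derivative of Omega^{-1} with respect to eta
    return -2.0 / (SIGMA**2 * x**3)

def log_prior(eta, dt):
    """Log of the Euler density p(eta_1, ..., eta_J | eta_0)."""
    r = np.diff(eta) - f(eta[:-1]) * dt
    om = omega(eta[:-1])
    return np.sum(-0.5 * np.log(2.0 * np.pi * om * dt) - 0.5 * r**2 / (om * dt))

def drift_interior(eta, dt):
    """Prior part of (16.52), scalar state, interior points j = 1..J-1."""
    j = np.arange(1, len(eta) - 1)
    d_j, d_jm1 = eta[j + 1] - eta[j], eta[j] - eta[j - 1]
    oi_j, oi_jm1 = 1.0 / omega(eta[j]), 1.0 / omega(eta[j - 1])
    f_j, f_jm1 = f(eta[j]), f(eta[j - 1])
    r = d_j - f_j * dt
    t0 = (oi_j * d_j - oi_jm1 * d_jm1) / dt**2
    t1 = (f_x(eta[j]) * oi_j * d_j - (oi_j * f_j - oi_jm1 * f_jm1)) / dt
    t2 = -f_x(eta[j]) * oi_j * f_j
    t3 = 0.5 * omega_inv_x(eta[j]) * (omega(eta[j]) * dt - r**2) / dt**2
    return t0 + t1 + t2 + t3

def numerical_drift(eta, dt, eps=1e-6):
    """Central differences of log_prior with respect to (eta_j * dt)."""
    out = np.zeros(len(eta) - 2)
    for k, j in enumerate(range(1, len(eta) - 1)):
        e = np.zeros_like(eta)
        e[j] = eps
        out[k] = (log_prior(eta + e, dt) - log_prior(eta - e, dt)) / (2.0 * eps * dt)
    return out

rng = np.random.default_rng(1)
dt = 0.1
eta = np.exp(np.cumsum(0.05 * rng.standard_normal(12)))  # positive test path
analytic = drift_interior(eta, dt)
numeric = numerical_drift(eta, dt)
```

The two arrays `analytic` and `numeric` should coincide up to finite-difference error. The last term `t3` is the scalar form of the trace expression (16.51).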

Appendix 2: Continuum Limit

The expressions in the main text were obtained by using an Euler discretization of the SDE (16.1), so in the limit δt → 0, one expects a convergence of η _j to the true state y(τ _j) (see Kloeden and Platen 1999, ch. 9). Likewise, the (J + 1)p-dimensional Langevin equation (16.14) for η _jα(u) will be an approximation of the stochastic partial differential equation (SPDE) for the random field Y _α(u, t) on the temporal grid τ _j = t ₀ + jδt.

A rigorous theory (assuming constant diffusion matrices) is presented in the work of Reznikoff and Vanden-Eijnden (2005); Hairer et al. (2005, 2007); Apte et al. (2007); Hairer et al. (2011). This section attempts to derive the terms that this literature obtains by functional derivatives directly from the discretization, especially in the case of state-dependent diffusions. Clearly, the finite-dimensional densities w.r.t. Lebesgue measure lose their meaning in the continuum limit, but the idea is to use large but finite J, so that the Euler densities p(η ₀, …, η _J) are good approximations of the unknown finite-dimensional densities p(y ₀, τ ₀;…;y _J, τ _J) of the process Y (t) (cf. Stratonovich 1971, 1989, Bagchi 2001 and the references cited therein).

16.1.1 Constant Diffusion Matrix

First we consider constant and (nonsingular) diffusion matrices Ω. The Lagrangian (16.15) attains the formal limit (Onsager-Machlup functional)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} S &\displaystyle =&\displaystyle \frac{1}{2} \int dy(t)^{\prime} (\varOmega dt)^{-1}dy(t) \end{array} \end{aligned} $$

(16.53)

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle - \int f(y)^{\prime} \varOmega^{-1} dy(t) + \frac{1}{2} \int f(y)^{\prime} \varOmega^{-1} f(y) dt. {} \vspace{-2pt}\end{array} \end{aligned} $$

(16.54)

If y(t) is a sample function of the diffusion process Y (t) in (16.1), the first term (16.53) does not exist, since the quadratic variation dy(t)dy(t)^′ = Ωdt is of order dt. Thus we have dy(t)^′(Ωdt)⁻¹dy(t) = tr[(Ωdt)⁻¹ dy(t)dy(t)^′] = tr[I _p] = p. Usually, (16.53) is written as the formal expression $\frac {1}{2} \int \dot {y}(t)^{\prime } \varOmega ^{-1} \dot {y}(t) dt$, which contains the (nonexistent) derivatives $\dot {y}(t)$. Moreover, partial integration yields

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} -\frac{1}{2} \int {y}(t)^{\prime}\varOmega^{-1}\ddot{y}(t) dt \vspace{-2pt}\end{array} \end{aligned} $$

(16.55)

so that C ⁻¹(t, s) = Ω ⁻¹(−∂ ²∕∂t ²)δ(t − s) is the kernel of the inverse covariance (precision) operator of Y (t) (for drift f = 0; i.e., a Wiener process). Indeed, since

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \partial^{2}/\partial t^{2} \min(t,s)=-\delta(t-s), \end{array} \end{aligned} $$

(16.56)

the covariance operator kernel C(t, s) is

$$\displaystyle \begin{aligned} \begin{array}{rcl} C(t,s) &\displaystyle =&\displaystyle \varOmega(-\partial^{2}/\partial t^{2})^{-1}\delta(t-s) = \varOmega \min(t,s). \end{array} \end{aligned} $$

Thus, $p(y)\propto \exp [-\frac {1}{2} \int {y}(t)^{\prime }\varOmega ^{-1}\ddot {y}(t) dt]$ is the formal density of a Gaussian process Y (t) ∼ N(0, C).
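A discrete analogue of this statement can be verified directly. In the sketch below (scalar Ω, grid size, and variable names are assumptions for illustration), the precision matrix of a driftless Euler path with η₀ = 0 is a scaled second-difference matrix, and its inverse reproduces Ω min(t, s) on the grid.

```python
import numpy as np

n, dt, Om = 50, 0.02, 0.5            # hypothetical grid size, step, diffusion
t = dt * np.arange(1, n + 1)         # grid points tau_1..tau_n (eta_0 = 0 fixed)

# Second-difference matrix: 2 on the diagonal, -1 next to it; the last
# diagonal entry is 1 because eta_n has only one neighbour.
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
T[-1, -1] = 1.0

precision = T / (Om * dt)            # Hessian of sum_j (d eta_j)^2 / (2 Om dt)
C = np.linalg.inv(precision)         # covariance matrix of the random walk
C_exact = Om * np.minimum.outer(t, t)
```

Here `C` and `C_exact` agree to rounding error, which is the finite-dimensional counterpart of inverting the operator −∂²/∂t².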

In contrast, the terms in (16.54) are well defined and yield the Radon-Nikodym derivative (cf. 16.17)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \alpha(y) &\displaystyle =&\displaystyle \exp\Big\{\int f(y)^{\prime} \varOmega^{-1} dy(t) - \frac{1}{2} \int f(y)^{\prime} \varOmega^{-1} f(y) dt\Big\}. \vspace{-2pt}\end{array} \end{aligned} $$

(16.57)

This expression can be obtained as the ratio of the finite-dimensional density functions p(y _J, τ _J, …, y ₁, τ ₁|y ₀, τ ₀) for drifts f and f = 0, respectively, in the limit δt → 0 (cf. Wong and Hajek 1985, ch. 6, p. 215 ff.). In this limit, the (unknown) exact densities can be replaced by the Euler densities (16.5). Now, the terms of the Langevin equation (16.14) will be given. We start with the measurement term (16.43), α = 1, …, p

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \delta \log p(z|y)/\delta y_{\alpha}(t) &\displaystyle =&\displaystyle \sum_{i=0}^{T} h_{i\bullet,\alpha}^{\prime} R_{i}^{-1}(z_{i}-h_{i}) \delta(t-t_{i}) \end{array} \end{aligned} $$

(16.58)

where the scaled Kronecker delta $(\delta _{j j_{i}}/\delta t)$ was replaced by the delta function (see Appendix 1). Clearly, in numerical implementations, a certain term of the delta sequence δ _n(t) must be used (cf. Lighthill 1958). Next, the term stemming from the driftless part (16.40) is

$$\displaystyle \begin{aligned} \begin{array}{rcl} -\delta S_{0}/\delta y_{\alpha}(t) &\displaystyle =&\displaystyle \varOmega^{-1}_{\alpha\bullet} \ddot{y}(t) = \varOmega^{-1}_{\alpha\bullet} y_{tt}(t), \vspace{2pt}\end{array} \end{aligned} $$

or Ω ⁻¹y _tt(t) in matrix form, which corresponds to (16.55). The contributions of S ₁ are (cf. 16.41)

$$\displaystyle \begin{aligned} \begin{array}{rcl} -\delta S_{1}/\delta y_{\alpha}(t) &\displaystyle =&\displaystyle f(y)_{\beta,\alpha}\varOmega_{\beta \gamma}^{-1}dy_{\gamma}(t)/dt -\varOmega_{\alpha \gamma}^{-1} df_{\gamma}(y)/dt. \vspace{2pt}\end{array} \end{aligned} $$

The first term is of Itô form. Transformation to Stratonovich calculus (Apte et al. 2007, sects. 4, 9) yields

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} h_{\alpha\beta} dy_{\beta} &\displaystyle =&\displaystyle h_{\alpha\beta} \circ dy_{\beta} - \frac{1}{2} {h}_{\alpha\beta,\gamma}\varOmega_{\beta\gamma}dt \end{array} \end{aligned} $$

(16.59)

$$\displaystyle \begin{aligned} \begin{array}{rcl} df_{\alpha} &\displaystyle =&\displaystyle f_{\alpha,\beta}dy_{\beta} + \frac{1}{2} f_{\alpha,\beta\gamma}\varOmega_{\beta\gamma}dt = f_{\alpha,\beta}\circ dy_{\beta}{} \vspace{2pt}\end{array} \end{aligned} $$

(16.60)
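The correction term in (16.59) can be made concrete with a small simulation. The sketch assumes a scalar driftless process y = G·W(t), the integrand h(y) = y (so that the derivative of h is 1), and a fixed random seed; all of these are choices made for illustration only. It compares the left-point (Itô) sum with the midpoint (Stratonovich) sum.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T_end, G = 100_000, 1.0, 0.8      # hypothetical step count, horizon, G
dt = T_end / n
Om = G**2                            # Omega = G G'

dy = G * np.sqrt(dt) * rng.standard_normal(n)
y = np.concatenate(([0.0], np.cumsum(dy)))

def h(x):
    """Hypothetical integrand h(y) = y with derivative h'(y) = 1."""
    return x

ito = np.sum(h(y[:-1]) * dy)                     # left-point sums
strat = np.sum(h(0.5 * (y[:-1] + y[1:])) * dy)   # midpoint sums
correction = -0.5 * 1.0 * Om * T_end             # -(1/2) h' Omega T
```

The difference `ito - strat` equals minus one half of the realized quadratic variation and is therefore close to `correction`, while `strat` telescopes to y(T)²/2 as ordinary calculus suggests.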

Thus, we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} -\delta S_{1}/\delta y_{\alpha}(t) &\displaystyle =&\displaystyle f(y)_{\beta,\alpha}\varOmega_{\beta \gamma}^{-1}\circ dy_{\gamma}(t)/dt -\frac{1}{2} f(y)_{\beta,\alpha\beta}\\ &\displaystyle &\displaystyle - \varOmega_{\alpha \gamma}^{-1} f(y)_{\gamma,\delta}\circ dy_{\delta}(t)/dt\\ &\displaystyle =&\displaystyle (f_{y}^{\prime}\varOmega^{-1}-\varOmega^{-1}f_{y})\circ y_{t}(t) -\frac{1}{2} \partial_{y}[\partial_{y}\cdot f(y)] \vspace{2pt}\end{array} \end{aligned} $$

where ∂ _y ⋅ f(y) = f _β,β = div(f). Finally we have (cf. 16.42)

$$\displaystyle \begin{aligned} \begin{array}{rcl} -\delta S_{2}/\delta y(t) &\displaystyle =&\displaystyle -f_{y}^{\prime}\varOmega^{-1}f \vspace{2pt}\end{array} \end{aligned} $$

and $\delta _{y(t)}\log p(y(t_{0}))=\partial _{y_{0}}\log p(y_{0})\delta (t-t_{0})$. Putting all together, one finds the Langevin drift functional (in matrix form)

$$\displaystyle \begin{aligned} \begin{array}{rcl} -\frac{\delta \varPhi(y|z)}{\delta y(t)} &\displaystyle :=&\displaystyle F(y|z) \\ &\displaystyle =&\displaystyle \sum_{i=0}^{T} h_{iy}^{\prime}(y) R_{i}^{-1}(z_{i}-h_{i}(y)) \delta(t-t_{i})\\ &\displaystyle &\displaystyle + \varOmega^{-1} y_{tt}+(f_{y}^{\prime}\varOmega^{-1}-\varOmega^{-1}f_{y})\circ y_{t}\\ &\displaystyle &\displaystyle - \frac{1}{2} \partial_{y}[\partial_{y}\cdot f(y)]-f_{y}^{\prime}\varOmega^{-1}f \\ &\displaystyle &\displaystyle + \partial_{y_{0}}\log p(y_{0})\delta(t-t_{0}) \end{array} \end{aligned} $$

and the SPDE (cf. Hairer et al. 2007)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} dY(u,t) &\displaystyle =&\displaystyle F(Y(u,t)|z) du +\sqrt{2}\; dW_{t}(u,t), \end{array} \end{aligned} $$

(16.61)

where W _t(u, t) = ∂ _tW(u, t) is a cylindrical Wiener process with $E[W_{t}(u,t)]=0$, $E[W_{t}(u,t)W_{s}(v,s)^{\prime }]=I_{p}\min (u,v)\delta (t-s)$, and W(u, t) is a Wiener field (Brownian sheet). See, e.g., Jetschke (1986); Da Prato and Zabczyk (1992, ch. 4.3.3). The cylindrical Wiener process may be viewed as the continuum limit of $W_{j}(u)/\sqrt {\delta t}$, $E[W_{j}(u)/\sqrt {\delta t} \; W^{\prime }_{k}(v)/\sqrt {\delta t}]=I_{p}\min (u,v)\delta t^{-1}\delta _{jk}$.

16.1.2 State-Dependent Diffusion Matrix

In this case, new terms appear. Starting with the first term in (16.47), one gets

$$\displaystyle \begin{aligned} \begin{array}{rcl} -\varOmega_{j\alpha \beta}^{-1} \delta t^{-2} (\eta_{j+1; \beta}-2\eta_{j \beta}+ \eta_{j-1; \beta}) &\displaystyle \rightarrow&\displaystyle -\varOmega(y(t))^{-1} \circ \ddot{y}(t). \end{array} \end{aligned} $$

The second term in (16.47) contains terms of the form h _j (η _j − η _j−1) which appear in a backward Itô integral. Here we attempt to write them in symmetrized (Stratonovich) form. It turns out that the Taylor expansion (16.46) must be carried to higher orders. Writing (for simplicity in scalar form)

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varOmega_{j-1}^{-1} \delta\eta_{j-1} -\varOmega_{j}^{-1} \delta\eta_{j} &\displaystyle :=&\displaystyle h_{j-1}\delta\eta_{j-1}-h_{j}\delta\eta_{j} \vspace{-2pt}\end{array} \end{aligned} $$

and expanding around η _j

$$\displaystyle \begin{aligned} \begin{array}{rcl} h_{j-1} &\displaystyle =&\displaystyle h_{j} + \sum_{k=1}^{\infty} \frac{1}{k!} h_{j,k} (\eta_{j-1}-\eta_{j})^{k} \vspace{-2pt}\end{array} \end{aligned} $$

one obtains

$$\displaystyle \begin{aligned} \begin{array}{rcl} h_{j-1}\delta\eta_{j-1}-h_{j}\delta\eta_{j} &\displaystyle =&\displaystyle h_{j}(\delta\eta_{j-1}-\delta\eta_{j}) + \sum_{k=1}^{\infty} \frac{(-1)^{k}}{k!} h_{j,k} \delta \eta_{j-1}^{k+1}. \vspace{-2pt}\end{array} \end{aligned} $$

(16.62)

To obtain a symmetric expression, h _j,k is expanded around $\eta _{j-1/2}:=\frac {1}{2}(\eta _{j-1}+\eta _{j})$. Noting that $\eta _{j}-\eta _{j-1/2}=\frac {1}{2} \delta \eta _{j-1}$, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} h_{j,k} &\displaystyle =&\displaystyle \sum_{l=0}^{\infty} \frac{(\frac{1}{2})^l}{l!} h_{j-1/2,k+l} \delta \eta_{j-1}^{l} \end{array} \end{aligned} $$

(16.63)

and together

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} h_{j}(\delta\eta_{j-1}-\delta\eta_{j})+ \sum_{k=1,l=0}^{\infty} \frac{(-1)^{k}(\frac{1}{2})^l}{k! \; l!} h_{j-1/2,k+l} \delta \eta_{j-1}^{k+l+1}. \end{array} \end{aligned} $$

(16.64)
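The expansions (16.62) and (16.64) can be checked numerically for a function whose derivatives are all known. The sketch below assumes h(η) = exp(η), so that every derivative h_{j,k} equals exp(η_j); the three grid values and the truncation order K are arbitrary illustrative choices.

```python
import numpy as np
from math import factorial

eta_jm1, eta_j, eta_jp1 = 0.3, 0.45, 0.5   # hypothetical neighbouring states
d_jm1 = eta_j - eta_jm1                    # delta eta_{j-1}
d_j = eta_jp1 - eta_j                      # delta eta_j
h = np.exp                                 # all derivatives of exp equal exp
K = 25                                     # truncation order of the series

lhs = h(eta_jm1) * d_jm1 - h(eta_j) * d_j

# (16.62): expansion of h_{j-1} around eta_j
rhs_62 = h(eta_j) * (d_jm1 - d_j) + sum(
    (-1)**k / factorial(k) * h(eta_j) * d_jm1**(k + 1) for k in range(1, K))

# (16.64): derivatives h_{j,k} re-expanded around the midpoint eta_{j-1/2}
eta_mid = 0.5 * (eta_jm1 + eta_j)
rhs_64 = h(eta_j) * (d_jm1 - d_j) + sum(
    (-1)**k * 0.5**l / (factorial(k) * factorial(l))
    * h(eta_mid) * d_jm1**(k + l + 1)
    for k in range(1, K) for l in range(K))
```

Both truncated series reproduce `lhs` to rounding error for this choice of h, which confirms the signs and factorials in the two expansions.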

Multiplying with δt ⁻² and collecting terms to order O(δt ²), one gets the continuum limit

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} -\varOmega^{-1} \circ \ddot{y} - \varOmega^{-1}_{\eta}\circ \dot{y}^{2} - \tfrac{1}{24} \varOmega^{-1}_{\eta\eta\eta} \varOmega^{2}. \end{array} \end{aligned} $$

(16.65)

The last term in (16.47) is absorbed in the expression (16.51).

The continuum limit of the first two terms in the derivative of S ₁ (see (16.49)) is

$$\displaystyle \begin{aligned} \begin{array}{rcl} - f(y)_{\beta,\alpha}\varOmega(y)_{\beta \gamma}^{-1}dy_{\gamma}(t)/dt + d[\varOmega(y)_{\alpha \gamma}^{-1} f_{\gamma}(y)]/dt. \end{array} \end{aligned} $$

Transforming to Stratonovich calculus (16.59–16.60) yields

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} -\{f(y)_{\beta,\alpha}\varOmega(y)_{\beta \gamma}^{-1} -[\varOmega(y)_{\alpha \beta}^{-1} f_{\beta}(y)]_{, \gamma}\} \circ dy_{\gamma}(t)/dt \\ +\frac{1}{2} [f(y)_{\beta,\alpha}\varOmega(y)_{\beta \gamma}^{-1}]_{,\delta}\varOmega_{\gamma\delta}. \end{array} \end{aligned} $$

(16.66)

Equation (16.50) yields

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \delta S_{2}/\delta y_\alpha(t) &\displaystyle =&\displaystyle f(y)_{\beta,\alpha}\varOmega(y)_{\beta \gamma}^{-1}f(y)_{\gamma}+ \frac{1}{2} f(y)_{\beta}\varOmega(y)_{\beta \gamma,\alpha}^{-1}f(y)_{\gamma}. \end{array} \end{aligned} $$

(16.67)

The last term to be discussed is (16.51). Formally,

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \frac{1}{2} \delta t^{-2} \mbox{tr}&\displaystyle &\displaystyle \{\varOmega^{-1}_{,\alpha} [\varOmega dt - (dy-f dt)(dy-f dt)^{\prime}]\} \\ &\displaystyle &\displaystyle =\frac{1}{2} \mbox{tr}\{\varOmega^{-1}_{,\alpha} [\varOmega \delta t^{-1} - (\dot{y}-f)(\dot{y}-f)^{\prime}]\} . \end{array} \end{aligned} $$

(16.68)

From the quadratic variation formula (dy − fdt)(dy − fdt)^′ = Ωdt, it seems that it can be dropped. But setting $\delta \eta _{j}-f_{j}\delta t = g_{j} z_{j}\sqrt {\delta t}$ (from the Euler scheme, see (16.3)), one gets

$$\displaystyle \begin{aligned} \begin{array}{rcl} X := \frac{1}{2} \delta t^{-1} \mbox{tr}\{\varOmega^{-1}_{j,\alpha} \varOmega_{j} \;(I- z_{j} z_{j}^{\prime})\} \end{array} \end{aligned} $$

In scalar form, one has $X := \frac {1}{2} \delta t^{-1} \varOmega ^{-1}_{j,\alpha } \varOmega _{j} \;(1- z_{j}^{2})$, where $z_{j}^{2}$ is $\chi _{1}^{2}$-distributed, conditionally on η _j. One has E[1 − z ²] = 0 and Var(1 − z ²) = 1 − 2 + 3 = 2, thus E[X] = 0 and $\mbox{Var}[X]=\frac {1}{2} \delta t^{-2}E[\varOmega ^{-2}_{j,\alpha }\varOmega _{j}^{2}]$.
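The two moments used here follow from E[z²] = 1 and E[z⁴] = 3 for a standard normal z. A short sketch with Gauss-Hermite quadrature (probabilists' weight, 10 nodes, which is exact for the polynomials involved) reproduces them; the values of δt and of c = Ω⁻¹_{j,α}Ω_j below are made up for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, weights = hermegauss(10)             # weight function exp(-x^2 / 2)
weights = weights / np.sqrt(2.0 * np.pi)    # normalize to the N(0, 1) density

g = 1.0 - nodes**2                          # the factor (1 - z^2)
mean_g = np.sum(weights * g)                # E[1 - z^2]
var_g = np.sum(weights * g**2) - mean_g**2  # Var(1 - z^2)

dt, c = 0.1, -1.5                           # hypothetical dt and Omega^{-1}_{,alpha} Omega
var_X = (0.5 * c / dt)**2 * var_g           # equals 0.5 * c^2 / dt^2
```

The conditional variance of X grows like δt⁻², so the term does not vanish in the limit δt → 0 although its conditional mean is zero.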

Therefore, the drift functional in the state-dependent case is

$$\displaystyle \begin{aligned} \begin{array}{rcl} -\frac{\delta \varPhi(y|z)}{\delta y(t)} &\displaystyle :=&\displaystyle F(y|z) \\ &\displaystyle =&\displaystyle (\text{16.58})-(\text{16.65})-(\text{16.66})-(\text{16.67})+(\text{16.68})\\ &\displaystyle &\displaystyle +\partial_{y_{0}}\log p(y_{0})\delta(t-t_{0}) \end{array} \end{aligned} $$

16.1.3 Discussion

The second-order time derivative (diffusion term w.r.t. t) Ω ⁻¹y _tt in the SPDE (16.61) results from the first term (16.53) in the Lagrangian, which corresponds to the driftless process (random walk process). Usually this term, which is infinite in the continuum limit, is not considered; it is removed by computing a density ratio (16.17), which leads to a well-defined Radon-Nikodym density (16.54). On the other hand, the term is necessary to obtain the correct SPDE. Starting from the Radon-Nikodym density (16.57) for the process dY (t) = fdt + GdW(t) at the outset, it is not quite clear how to construct the appropriate SPDE. Setting for simplicity f = 0 and dropping the initial condition and the measurement part, Eq. (16.61) reads

$$\displaystyle \begin{aligned} \begin{array}{rcl} dY(u,t) &\displaystyle =&\displaystyle \varOmega^{-1} \;Y_{tt}(u,t) du +\sqrt{2}\; dW_{t}(u,t). \end{array} \end{aligned} $$

This linear equation (Ornstein-Uhlenbeck process) can be solved using a stochastic convolution as ($A:=\varOmega ^{-1}\partial _t^2$)

$$\displaystyle \begin{aligned} \begin{array}{rcl} Y(u,t) &\displaystyle =&\displaystyle \exp(A u) Y(0,t) + \int_{0}^{u} \exp(A (u-s)) \sqrt{2}\; dW_{t}(s,t). \end{array} \end{aligned} $$

(cf. Da Prato 2004, ch. 2). It is a Gaussian process with mean $\mu (u) = \exp (A u) E[Y(0)]$ and variance $Q(u)=\exp (A u) \mbox{Var}(Y(0)) \exp (A^{*} u) + \int _{0}^{u} \exp (A s) 2 \exp (A^{*} s) ds$ where A ^∗ is the adjoint of A. Thus the stationary distribution (u →∞) is the Gaussian measure N(0, Q(∞)) with $Q(\infty )=-A^{-1}=-\varOmega \cdot [\partial _{t}^{2}]^{-1}$, since A = A ^∗. But this coincides with $C(t,s)=\varOmega \min (t,s)$, the covariance function of the scaled Wiener process G ⋅ W(t) (see (16.56); Ω = GG′). Thus, for large u, Y (u, t) generates trajectories of GW(t). More generally (f≠0), one obtains solutions of SDE (16.1). A related problem occurs in the state-dependent case Ω(y). Again, the term $\int dy'(\varOmega dt)^{-1} dy$ yields a second-order derivative in the SPDE, but after transforming to symmetrized Stratonovich form, also higher-order terms appear (16.64), (16.65).
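The convergence of Q(u) to Ω min(t, s) can be illustrated on a finite grid. The sketch below assumes a scalar Ω, the boundary condition Y(u, 0) = 0 with a free right end, and a discretized noise term √2 dW_j(u)/√δt, as suggested by the continuum-limit remark on the cylindrical Wiener process; the grid values and the function name `Q` are illustrative. The variance formula is evaluated in the eigenbasis of the symmetric matrix A.

```python
import numpy as np

n, dt, Om = 20, 0.05, 1.0            # hypothetical grid size, step, diffusion
t = dt * np.arange(1, n + 1)

# Discretized A = Omega^{-1} d^2/dt^2 (second differences, Y(u, 0) = 0).
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
T[-1, -1] = 1.0
A = -T / (Om * dt**2)
lam, V = np.linalg.eigh(A)           # A is symmetric, all eigenvalues < 0

def Q(u, Q0=None):
    """Q(u) = exp(Au) Q0 exp(Au) + int_0^u exp(As) (2/dt) exp(As) ds."""
    Q0 = np.zeros((n, n)) if Q0 is None else Q0
    E = V @ np.diag(np.exp(lam * u)) @ V.T
    integral = V @ np.diag((np.exp(2.0 * lam * u) - 1.0) / (lam * dt)) @ V.T
    return E @ Q0 @ E.T + integral

Q_inf = Q(50.0)                      # practically stationary
C_exact = Om * np.minimum.outer(t, t)
```

For large u, `Q_inf` matches `C_exact`, the covariance of the scaled Wiener process on the grid, independently of the initial variance `Q0`.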

Moreover, the differential of Ω ⁻¹ in the Lagrangian (16.53)–(16.54) imports a problematic term similar to (16.53) into the SPDE, namely, $\frac {1}{2} (\dot {y}-f)^{\prime }(\varOmega ^{-1})_{y} (\dot {y}-f)$, which can be combined with the derivative of the Jacobian (cf. 16.68). Formally, it is squared white noise where the differentials are in Itô form. A procedure similar to (16.63), i.e.,

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} h_{j,k} &\displaystyle =&\displaystyle \sum_{l=0}^{\infty} \frac{(-\frac{1}{2})^l}{l!} h_{j+1/2,k+l} \delta \eta_{j}^{l} \end{array} \end{aligned} $$

(16.69)

can be applied to obtain Stratonovich-type expressions. Because of the dubious nature of these expressions, only the quasi-continuous approach based on approximate finite-dimensional densities and Langevin equations is used in this paper.

Cite this chapter

Singer, H. (2018). Langevin and Kalman Importance Sampling for Nonlinear Continuous-Discrete State-Space Models. In: van Montfort, K., Oud, J.H.L., Voelkle, M.C. (eds) Continuous Time Modeling in the Behavioral and Related Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-77219-6_16

DOI: https://doi.org/10.1007/978-3-319-77219-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77218-9
Online ISBN: 978-3-319-77219-6
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
