Time-course window estimator for ordinary differential equations linear in the parameters

Abstract

In many applications, obtaining ordinary differential equation descriptions of dynamic processes is scientifically important. In both Bayesian and likelihood approaches for estimating the parameters of ordinary differential equations, the speed and convergence of the estimation procedure may depend crucially on the choice of initial values of the parameters. Extending previous work, we show in this paper how window smoothing yields a fast estimator for systems that are linear in the parameters. Under weak assumptions on the measurement error, we prove that the proposed estimator is \(\sqrt{n}\)-consistent. The estimator requires no initial guess for the parameters and is computationally fast; it can therefore serve as a good initial estimate for more efficient estimators. We illustrate the performance of the proposed estimator in simulation studies and on real data.


References

  • Bellman, R., Roth, S.: The use of splines with unknown end points in the identification of systems. J. Math. Anal. Appl. 34(1), 26–33 (1971)

  • Bernstein, D.S.: Matrix Mathematics: Theory, Facts, and Formulas. Princeton University Press, Princeton, NJ (2009)

  • Bickel, P.J., Ritov, Y.: Nonparametric estimators which can be “plugged-in”. Ann. Stat. 31(4), 1033–1053 (2003)

  • Bock, H.G.: Recent advances in parameter identification techniques for ODEs. In: Numerical Treatment of Inverse Problems in Differential and Integral Equations, vol. 2, pp. 95–121 (1983)

  • Bonhoeffer, S., May, R.M., Shaw, G.M., Nowak, M.A.: Virus dynamics and drug therapy. Proc. Natl. Acad. Sci. 94(13), 6971–6976 (1997)

  • Brunel, N.J.B.: Parameter estimation of ODE's via nonparametric estimators. Electron. J. Stat. 2, 1242–1267 (2008)

  • Campbell, D., Steele, R.J.: Smooth functional tempering for nonlinear differential equation models. Stat. Comput. 22(2), 429–443 (2012)

  • Dattner, I., Klaassen, C.A.J.: Estimation in systems of ordinary differential equations linear in the parameters. arXiv preprint arXiv:1305.4126 (2013)

  • Earn, D.J., Rohani, P., Bolker, B.M., Grenfell, B.T.: A simple model for complex dynamical transitions in epidemics. Science 287(5453), 667–670 (2000)

  • Edelstein-Keshet, L.: Mathematical Models in Biology, vol. 46. SIAM, Philadelphia, PA (2005)

  • Fang, Y., Wu, H., Zhu, L.X.: A two-stage estimation method for random coefficient differential equation models with application to longitudinal HIV dynamic data. Statistica Sinica 21(3), 1145 (2011)

  • Fine, P.E., Clarkson, J.A.: Measles in England and Wales-I: an analysis of factors underlying seasonal patterns. Int. J. Epidemiol. 11(1), 5–14 (1982)

  • Finkenstädt, B.F., Grenfell, B.T.: Time series modelling of childhood diseases: a dynamical systems approach. J. R. Stat. Soc. Ser. C (Appl. Stat.) 49(2), 187–205 (2000)

  • FitzHugh, R.: Impulses and physiological states in theoretical models of nerve membrane. Biophys. J. 1(6), 445–466 (1961)

  • González, J., Vujačić, I., Wit, E.: Inferring latent gene regulatory network kinetics. Stat. Appl. Genet. Mol. Biol. 12(1), 109–127 (2013)

  • González, J., Vujačić, I., Wit, E.: Reproducing kernel Hilbert space based estimation of systems of ordinary differential equations. Pattern Recognit. Lett. 45, 26–32 (2014)

  • Gugushvili, S., Klaassen, C.A.J.: \(\sqrt{n}\)-consistent parameter estimation for systems of ordinary differential equations: bypassing numerical integration via smoothing. Bernoulli 18, 1061–1098 (2012)

  • Gugushvili, S., Spreij, P.: Parametric inference for stochastic differential equations: a smooth and match approach. Lat. Am. J. Probab. Math. Stat. 9(2), 609–635 (2012)

  • Goldstein, L., Messer, K.: Optimal plug-in estimators for nonparametric functional estimation. Ann. Stat. 20, 1306–1328 (1992)

  • He, D., Ionides, E.L., King, A.A.: Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. J. R. Soc. Interface 7(43), 271–283 (2010)

  • Himmelblau, D.M., Jones, C.R., Bischoff, K.B.: Determination of rate constants for complex kinetics models. Ind. Eng. Chem. Fundam. 6(4), 539–543 (1967)

  • Hooker, G., Ellner, S.P., Roditi, L.D.V., Earn, D.J.D.: Parameterizing state-space models for infectious disease dynamics by generalized profiling: measles in Ontario. J. R. Soc. Interface 8(60), 961–974 (2011)

  • Hooker, G., Xiao, L., Ramsay, J.: CollocInfer: collocation inference for dynamic systems. R package version 0.1.7 (2012). http://CRAN.R-project.org/package=CollocInfer

  • Huppert, A., Barnea, O., Katriel, G., Yaari, R., Roll, U., Stone, L.: Modeling and statistical analysis of the spatio-temporal patterns of seasonal influenza in Israel. PLoS ONE 7(10), e45107 (2012)

  • Khanin, R., Vinciotti, V., Wit, E.C.: Reconstructing repressor protein levels from expression of gene targets in Escherichia coli. Proc. Natl. Acad. Sci. 103(49), 18592–18596 (2006)

  • Khanin, R., Vinciotti, V., Mersinias, V., Smith, C.P., Wit, E.C.: Statistical reconstruction of transcription factor activity using Michaelis-Menten kinetics. Biometrics 63(3), 816–823 (2007)

  • Liang, H., Wu, H.: Parameter estimation for differential equation models using a framework of measurement error in regression models. J. Am. Stat. Assoc. 103(484), 1570–1583 (2008)

  • Loader, C.: Local Regression and Likelihood. Springer, Berlin (1999)

  • Miao, H., Dykes, C., Demeter, L.M., Cavenaugh, J., Park, S.Y., Perelson, A.S., Wu, H.: Modeling and estimation of kinetic parameters and replicative fitness of HIV-1 from flow-cytometry-based growth competition experiments. Bull. Math. Biol. 70(6), 1749–1771 (2008)

  • Nagumo, J., Arimoto, S., Yoshizawa, S.: An active pulse transmission line simulating nerve axon. Proc. IRE 50(10), 2061–2070 (1962)

  • Olinky, R., Huppert, A., Stone, L.: Seasonal dynamics and thresholds governing recurrent epidemics. J. Math. Biol. 56(6), 827–839 (2008)

  • Qi, X., Zhao, H.: Asymptotic efficiency and finite-sample properties of the generalized profiling estimation of parameters in ordinary differential equations. Ann. Stat. 38(1), 435–481 (2010)

  • Ramsay, J.O., Hooker, G., Campbell, D., Cao, J.: Parameter estimation for differential equations: a generalized smoothing approach. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 69(5), 741–796 (2007)

  • Steinke, F., Schölkopf, B.: Kernels, regularization and differential equations. Pattern Recognit. 41(11), 3271–3286 (2008)

  • Stone, L., Olinky, R., Huppert, A.: Seasonal dynamics of recurrent epidemics. Nature 446(7135), 533–536 (2007)

  • Varah, J.: A spline least squares method for numerical parameter estimation in differential equations. SIAM J. Sci. Stat. Comput. 3(1), 28–46 (1982)

  • Wood, S.N., Lindgren, F.: APTS statistical computing. http://www2.warwick.ac.uk/fac/sci/statistics/apts/students/resources/apts-statcomp.pdf (2013). Accessed 27 Dec 2013

  • Xue, H., Miao, H., Wu, H.: Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error. Ann. Stat. 38(4), 2351–2387 (2010)

  • Xun, X., Cao, J., Mallick, B., Maity, A., Carroll, R.J.: Parameter estimation of partial differential equation models. J. Am. Stat. Assoc. (2013, to appear)

Download references

Acknowledgments

The authors are grateful to the reviewers and the Associate Editor for their useful comments, which enabled us to greatly improve the manuscript. Vujačić acknowledges financial support through the JoinEU-SEE project scholarship. Wit and González acknowledge financial support through the NWO SBC-EMA-435065 project grant. The authors would like to thank Amit Huppert (Gertner Institute for Epidemiology & Health Policy Research, Israel) for useful discussions concerning the real data example. We also thank Prof. H.G. Bock (Heidelberg University) and Dr. A. Abbruzzo (University of Palermo) for providing important references for this work.

Author information

Corresponding author

Correspondence to Ivan Vujačić.

Appendices

Appendix 1: Proof of Theorem 1

Denote the number of time points in subinterval \(S_i\) by \(I_i\); each subinterval has length \(T/I\). For \(t\in [0,T]\), \(S(t)\) denotes the subinterval containing \(t\) and \(|S(t)|\) the number of time points in it. We have that \(\sum _{i=1}^I I_i=n\) and \(\lfloor \sqrt{n}\rfloor \le I_i\le \lfloor \sqrt{n}\rfloor +2\), which implies

$$\begin{aligned} \sum _{i=1}^I\frac{1}{I_i}&=O(1),\end{aligned}$$
(14)
$$\begin{aligned} \sum _{i=1}^I \frac{1}{I_i^3}&=O\Big (\frac{1}{n}\Big ). \end{aligned}$$
(15)

With the following notation

$$\begin{aligned} \varepsilon (t_i)&= \left( \varepsilon _1(t_i),\ldots ,\varepsilon _d(t_i)\right) ^{\top },\\ \varepsilon (t_i)&= Y(t_i)-x(t_i), \end{aligned}$$

we have

$$\begin{aligned}&\text {E}\varepsilon _h(t_i)=0,\qquad h=1,\ldots ,d,\ i=1,\ldots ,n,\nonumber \\&\text {E}\widehat{x}_n(t)=\frac{1}{|S(t)|}\sum _{t_j\in S(t)}x(t_j),\ t\in S(t),\nonumber \\&\widehat{x}_n(t)-\text {E}\widehat{x}_n(t)=\frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon (t_j), \end{aligned}$$
(16)
$$\begin{aligned} \text {E}\bigg \{\frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon _h(t_j)\bigg \}&= 0,\quad h=1,\ldots ,d,\nonumber \\ \text {E}\left\{ \varepsilon _h(t_i)\varepsilon _h(t_j)\right\}&= 0,\quad i\ne j, \end{aligned}$$
(17)

and for any \(t\in [0,T]\) we have

$$\begin{aligned}&\text {var}\bigg \{\frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon _h(t_j)\bigg \}=\frac{1}{|S(t)|^2}\text {var}\bigg \{\sum _{t_j\in S(t)}\varepsilon _h(t_j)\bigg \}\\&\qquad =\frac{\sigma ^2_{\varepsilon }}{|S(t)|}. \end{aligned}$$

We use these results throughout the proof. The proof proceeds by verifying the following conditions of Theorem 2 in Dattner and Klaassen (2013).

  1.

    \(\Vert \text {E}\widehat{x}_n-x\Vert ^2_{\infty }=O(\frac{1}{n})\).

    For nonnegative numbers \(a_1,\ldots ,a_m\) it holds

    $$\begin{aligned} \Big (\sum _{k=1}^m a_k\Big )^2 \le m \sum _{k=1}^m a_k^2. \end{aligned}$$
    (18)

    This inequality is a consequence of the inequality between the arithmetic and the quadratic mean. Applying the triangle inequality and (18) yields

    $$\begin{aligned}&\Vert \text {E}\widehat{x}_n-x\Vert ^2_{\infty }\\&\quad ={\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}x(t_j)-x(t)\Vert ^2\\&\quad ={\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\{x(t_j)-x(t)\}\Vert ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\frac{1}{I_i^2}\left\{ \sum _{t_j\in S_i}\Vert \int \limits _{t}^{t_j}x'(s)\mathrm{{d}}s\Vert \right\} ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\frac{I}{I_i^2}\sum _{t_j\in S_i}\Vert \int \limits _{t}^{t_j}g(x(s))\theta \mathrm{{d}}s\Vert ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\frac{I}{I_i^2}\sum _{t_j\in S_i}\sup _{s\in S_i}\Vert g(x(s))\Vert ^2\Vert \theta \Vert ^2(t_j-t)^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}\frac{I}{I_i^2}\sum _{t_j\in S_i}\frac{T^2}{I^2}\sup _{s\in S_i}\Vert g(x(s))\Vert ^2\Vert \theta \Vert ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}\frac{I}{I_i^2}\frac{I_i}{I^2}\cdot O(1)=O\Big (\frac{1}{\sqrt{n}\sqrt{n}}\Big )=O\Big (\frac{1}{n}\Big ). \end{aligned}$$

    Here we used the boundedness of the parameter space \(\varTheta \). Also, since \(g(\cdot )\) is continuous and \(x(\cdot )\) is bounded, \(g(x(\cdot ))\) is bounded.

  2.

    \(\Vert \widehat{x}_n\Vert _{\infty }=O_p(1)\)

    $$\begin{aligned}&P(\Vert \widehat{x}_n-\text {E}\widehat{x}_n\Vert _{\infty }\ge M)\\&\quad =P({\mathop {\sup }\limits _{t\in [0,T]}}\Vert \widehat{x}_n(t)-\text {E}\widehat{x}_n(t)\Vert \ge M)\\&\quad =P\left( {\mathop {\sup }\limits _{t\in [0,T]}}\parallel \frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon (t_j)\parallel \ge M\right) \\&\quad =P\Big ({\mathop {\max }\limits _{1\le i\le I}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \ge M\Big )\\&\quad =1-P\Big ({\mathop {\max }\limits _{1\le i\le I}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \le M\Big )\\&\quad =1-\prod _{i=1}^I P\Big (\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \le M\Big )\\&\quad =1- \prod _{i=1}^I\Big \{1-P\Big (\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \ge M\Big )\Big \}\\&\quad \le 1-\prod _{i=1}^I\Big (1-\frac{d\sigma ^2_{\varepsilon }}{M^2I_i}\Big )\\&\quad \le 1- \Big (1-\sum _{i=1}^I\frac{d\sigma ^2_{\varepsilon }}{M^2}\frac{1}{I_i}\Big ) \le \frac{d\sigma ^2_{\varepsilon }}{M^2}\sum _{i=1}^I\frac{1}{I_i}. \end{aligned}$$

    The factorization of the maximum into a product uses the independence of the errors across disjoint subintervals. The first inequality above follows from Chebyshev’s inequality and the second from Lemma 1, which is given in Appendix 2. We conclude that \(\widehat{x}_n-\text {E}\widehat{x}_n\) is bounded in probability. Since we have

    $$\begin{aligned} \widehat{x}_n=\underbrace{\widehat{x}_n-\text {E}\widehat{x}_n}_{O_p(1)}+\underbrace{\text {E}\widehat{x}_n-x}_{O(\frac{1}{\sqrt{n}})}+\underbrace{x}_{O(1)}, \end{aligned}$$

    it follows that the sup norm of \(\widehat{x}_n\) is bounded in probability, i.e.

    $$\begin{aligned} \Vert \widehat{x}_n\Vert _{\infty }=O_p(1). \end{aligned}$$
  3.

    \(\text {E}(\int _0^T\Vert \widehat{x}_n(t)-\text {E}\widehat{x}_n(t)\Vert ^2\mathrm{{d}}t)=O(\frac{1}{\sqrt{n}} )\)

    $$\begin{aligned}&\text {E}\Big (\int \limits _0^T\Vert \widehat{x}_n(t)-\text {E}\widehat{x}_n(t)\Vert ^2\mathrm{{d}}t\Big )\\&\quad \mathop {=}\limits ^{(16)}\text {E}\Big (\sum _{i=1}^I\int \limits _{S_i}\Vert \frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon (t_j) \Vert ^2\mathrm{{d}}t\Big )\\&\quad =\text {E}\Big (\frac{1}{I}\sum _{i=1}^I\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j) \Vert ^2\Big )\\&\quad =\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\text {E}\Big [\sum _{h=1}^d\big \{\sum _{t_j\in S_i}\varepsilon _h(t_j)\big \}^2\Big ]\\&\quad =\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\sum _{h=1}^d\sum _{t_j\in S_i}\text {E}\varepsilon _h(t_j)^2\\&\quad \quad +\,\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\sum _{h=1}^d\sum _{t_j,t_k\in S_i: j\ne k}\text {E}\varepsilon _h(t_j)\varepsilon _h(t_k)\\&\quad \mathop {=}\limits ^{(17)}\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\sum _{h=1}^dI_i\sigma ^2_{\varepsilon }=\frac{1}{I}\sum _{i=1}^I\frac{d\sigma ^2_{\varepsilon }}{I_i}\\&\quad \mathop {=}\limits ^{(14)}O\Big (\frac{1}{I}\Big )=O\Big (\frac{1}{\sqrt{n}}\Big ).\\ \end{aligned}$$
  4.

    \(\int _0^T\text {var}\{\int _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{d}{s}\}\mathrm{{d}}t=O(\frac{1}{n})\).

    By using Lemma 2 (Appendix 2) we obtain

    $$\begin{aligned}&\int \limits _0^T\text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} \mathrm{{d}}t\\&\quad =\sum _{i=1}^I\int \limits _{S_i}\text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} \mathrm{{d}}t\\&\quad =\sum _{i=1}^I\int \limits _{S_i}\Big [\sum _{l=1}^I\Big \{\int \limits _{S_l\cap [0,t]} f(s)\mathrm{{d}}s\Big \}^2\frac{\sigma _{\epsilon }^2}{I_l}\Big ]\mathrm{{d}}t\\&\quad \le \Vert f\Vert _{\infty }^2\sum _{i=1}^I\int \limits _{S_i}\mathrm{{d}}t\sum _{l=1}^I \frac{\sigma _{\epsilon }^2}{I_l^3}\\&\quad =\Vert f\Vert _{\infty }^2\sum _{i=1}^I\frac{1}{I_i}\sum _{l=1}^I \frac{\sigma _{\epsilon }^2}{I_l^3}\\&\quad =O\Big (\frac{1}{n}\Big ), \end{aligned}$$

    where the last equality is due to (14) and (15).

  5.

    \(\text {var}\{\int _0^T f(t)\widehat{x}_{n,h}(t)\mathrm{{d}}t\}=O(\frac{1}{n})\).

    According to Lemma 2 (Appendix 2), for any \(t\in [0,T]\),

    $$\begin{aligned} \text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} =\sum _{i=1}^I \left\{ \int \limits _{S_i\cap [0,t]}f(s)\mathrm{{d}}s\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}, \end{aligned}$$

    whence it follows that

    $$\begin{aligned}&\text {var}\left\{ \int \limits _0^T f(t)\widehat{x}_{n,h}(t)\mathrm{{d}}t\right\} \\&\quad =\sum _{i=1}^I \left\{ \int \limits _{S_i\cap [0,T]}f(t)\mathrm{{d}}t\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}\\&\quad =\sum _{i=1}^I \left\{ \int \limits _{S_i}f(t)\mathrm{{d}}t\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}\\&\quad \le \sum _{i=1}^I \left( \Vert f\Vert _{\infty }\frac{1}{I}\right) ^2\frac{\sigma _{\epsilon }^2}{I_i}\\&\quad = \Vert f\Vert _{\infty }^2\frac{\sigma _{\epsilon }^2}{I^2}\sum _{i=1}^I\frac{1}{I_i}\\&\quad \mathop {=}\limits ^{(14)}O\Big (\frac{1}{I^2}\Big )=O\Big (\frac{1}{n}\Big ). \end{aligned}$$

Appendix 2: Auxiliary results

Lemma 1

For any \(0\le a_1,\ldots ,a_n\le 1\) the following inequality holds

$$\begin{aligned} \prod _{i=1}^n(1-a_{i})\ge 1-\sum _{i=1}^n a_{i}. \end{aligned}$$

Proof

By induction: the case \(n=1\) is trivial. If the inequality holds for \(n\), then multiplying both sides by \(1-a_{n+1}\ge 0\) gives

$$\begin{aligned} \prod _{i=1}^{n+1}(1-a_{i})\ge \Big (1-\sum _{i=1}^n a_{i}\Big )(1-a_{n+1})\ge 1-\sum _{i=1}^{n+1} a_{i}, \end{aligned}$$

where the last step uses \(a_{n+1}\sum _{i=1}^n a_i\ge 0\).

Lemma 2

For any \(t\in [0,T]\) the following equality holds

$$\begin{aligned} \mathrm{var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} =\sum _{i=1}^I \left\{ \int \limits _{S_i\cap [0,t]}f(s)\mathrm{{d}}s\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}. \end{aligned}$$

Proof

Let \(C_i\) denote \(\widehat{x}_{n,h}(S_i)\). Since \([0,T]=\cup _{i=1}^I S_i\), for any \(t\in [0,T]\) it holds that \([0,t]=\cup _{i=1}^I (S_i \cap [0,t])\). Now we have

$$\begin{aligned} \text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\}&= \text {var}\left\{ \sum _{i=1}^I \int \limits _{S_i\cap [0,t]}f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} \\&= \text {var}\left\{ \sum _{i=1}^I \int \limits _{S_i\cap [0,t]}f(s)C_i\mathrm{{d}}s\right\} \\&= \text {var}\left\{ \sum _{i=1}^I \int \limits _{S_i\cap [0,t]}f(s)\mathrm{{d}}s\cdot C_i\right\} \\&= \sum _{i=1}^I \left\{ \int \limits _{S_i\cap [0,t]}f(s)\mathrm{{d}}s\right\} ^2\text {var}(C_i)\\&\quad +\,2\sum _{1\le k<l\le I}\int \limits _{S_k\cap [0,t]}f(s)\mathrm{{d}}s\\&\quad \times \int \limits _{S_l\cap [0,t]}f(s)\mathrm{{d}}s\\&\quad \times \,\text {cov}(C_k,C_l),\\ \end{aligned}$$

where in the second equality we have used the fact that \(\widehat{x}_{n,h}\) is constant on \(S_i\) and equal to \(C_i\). We need \(\text {var}(C_i)\) and \(\text {cov}(C_k,C_l)\).

$$\begin{aligned}&\text {var}(C_i)=\text {var}\left\{ \frac{1}{I_i}\sum _{t_j\in S_i}Y_h(t_j)\right\} =\frac{1}{I_i^2}I_i\sigma _{\epsilon }^2=\frac{\sigma _{\epsilon }^2}{I_i},\\&\text {cov}(C_k,C_l)=\text {cov}\left( \frac{1}{I_k}\sum _{t_i\in S_k}Y_h(t_i),\frac{1}{I_l}\sum _{t_j\in S_l}Y_h(t_j)\right) \\&= \frac{1}{I_kI_l}\sum _{t_i\in S_k,t_j\in S_l}\text {cov}(Y_h(t_i),Y_h(t_j))=0. \end{aligned}$$

Plugging the variance and covariance terms into the original expression we obtain the formula.
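The variance formula of Lemma 2 can also be verified by simulation. Below is a minimal sketch (our own illustration; the choice \(f(s)=\cos s\) and all sizes and names are assumptions). Since \(\widehat{x}_{n,h}\) is piecewise constant, \(\int _0^t f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s=\sum _i C_i\int _{S_i\cap [0,t]}f(s)\mathrm{{d}}s\), and only the centered window means \(C_i-\text {E}C_i\), which have variance \(\sigma _{\epsilon }^2/I_i\), contribute to the variance.

```python
import numpy as np

# Sketch only: f(s) = cos(s); T, I, I_i = m and all names are assumptions.
rng = np.random.default_rng(1)
T, I, m, sigma, reps = 4.0, 8, 25, 0.3, 200_000
delta = T / I
t_eval = 2.7                                     # a fixed t in [0, T]

# exact integrals of f over S_i ∩ [0, t_eval] (antiderivative of cos is sin)
a = np.minimum(np.arange(I) * delta, t_eval)
b = np.minimum(np.arange(1, I + 1) * delta, t_eval)
F = np.sin(b) - np.sin(a)
predicted = (F**2).sum() * sigma**2 / m          # Lemma 2 with I_i = m

# Monte Carlo: centered window means have variance sigma^2 / m
C = rng.normal(0.0, sigma / np.sqrt(m), size=(reps, I))
print(predicted, (C @ F).var())                  # should agree closely
```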

Appendix 3: Calculation for the algorithmic complexity

The following results are taken from Wood and Lindgren (2013) and are used in the sequel.

Lemma 3

If \(A\) and \(B\) are matrices of dimension \(n\times m\), then the computational cost of addition and subtraction, i.e. \(A+B\) and \(A-B\), is \(O(mn)\) flops.

Lemma 4

If \(A\) is a matrix of dimension \(n\times m\) and \(B\) is a matrix of dimension \(m\times p\) then the computational cost of the matrix multiplication \(AB\) is \(O(nmp)\) flops.

Lemma 5

If a matrix \(A\) is of order \(n\) then the computational cost of the matrix inversion \(A^{-1}\) is \(O(n^3)\) flops.

The matrices that appear in the formulas for the estimators have the following dimensions:

  • \(I_d\) and \(\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\) are of dimension \(d\times d\).

  • \(\widehat{A}_n,\,\widehat{G}_n(S_i),\,\widehat{A}_n\widehat{B}_n^{-1}\) and \(g(\widehat{x}_n(t))\) are of dimension \(d\times p\).

  • \(\widehat{A}_n^{\top }\), \(\widehat{G}_n^{\top } (t_i)\) and \(g (\widehat{x}_n(t))^{\top }\) are of dimension \(p\times d\).

  • \(\widehat{B}_n\) is of dimension \(p\times p\).

  • \(\widehat{x}_n(t_i),\,Y(t_i)\) and \(\widehat{\xi }_n\) are of dimension \(d\times 1\).

  1.

    The cost for \(\widehat{x}_n(S_i)\) and \(\widehat{G}_n(a_i)\).

    We calculate \(\widehat{x}_n(S_i)\), for \(i=1,\ldots ,I\). The cost for each \(\widehat{x}_n(S_i)\) is \(O((I-1)d)\), so the total cost is \(O(I(I-1)d)=O(I^2d)=O(nd)\). The calculation of \(\widehat{G}_n(a_i)\) involves the calculation of \(g(\widehat{x}_n(S_i))\); since the construction of the matrix \(g\) depends on the form of the ODE, we assume that every entry of the matrix is obtained by at most a constant number of flops \(C\) that does not depend on \(n\), \(p\) and \(d\). Then the cost of obtaining the matrix \(g(\widehat{x}_n(S_i))\), for any \(i=1,\ldots ,I\), is \(O(Cdp)=O(dp)\) flops. The estimator \(\widehat{G}_n(t)\) can be calculated recursively by

    $$\begin{aligned} \widehat{G}_n(a_1)&= g(\widehat{x}_n(S_1))a_1,\nonumber \\ \widehat{G}_n(a_i)&= \widehat{G}_n(a_{i-1})+g(\widehat{x}_n(S_i))\varDelta ,\quad i=2,\ldots ,I. \end{aligned}$$
    (19)

    Since the cost of computing \(g(\widehat{x}_n(S_i))\) is \(O(dp)\) flops for each \(i\), it follows from (19) that each \(\widehat{G}_n(a_i)\) is obtained from \(\widehat{G}_n(a_{i-1})\) in \(O(dp)\) flops; summing over \(i=1,\ldots ,I\), the total cost is

    $$\begin{aligned} O(Idp)=O(\sqrt{n}dp) \end{aligned}$$

    flops.
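In code, recursion (19) is simply a running sum of the per-window matrices, one \(O(dp)\) update per window; a minimal sketch (array names are illustrative):

```python
import numpy as np

# Sketch of recursion (19): G[i] holds G_hat(a_{i+1}). Each update costs
# O(dp) flops, so all I steps cost O(I d p) = O(sqrt(n) d p) in total.
# `g_vals` (shape I x d x p) stacks the matrices g(x_hat(S_i)).
def G_hat_at_knots(g_vals: np.ndarray, delta: float) -> np.ndarray:
    I, d, p = g_vals.shape
    G = np.empty((I, d, p))
    G[0] = g_vals[0] * delta              # G_hat(a_1) = g(x_hat(S_1)) a_1, with a_1 = Delta
    for i in range(1, I):                 # G_hat(a_i) = G_hat(a_{i-1}) + g(x_hat(S_i)) Delta
        G[i] = G[i - 1] + g_vals[i] * delta
    return G
```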

  2.

    The cost for \(\widehat{A}_n\), \(\widehat{A}_n(i)\) and \(\widehat{B}_n\).

    Since the cost of computing \(g(\widehat{x}_n(S_i))\) and \(\widehat{G}_n(a_i)\) for each \(i\) is \(O(dp)\) flops, from (9) it follows that the cost for \(\widehat{A}_n(i)\) is \(O(dp)\) flops. The equality \(\widehat{A}_n=\sum _{i=1}^I\widehat{A}_n(i)\) implies that the total cost for computing \(\widehat{A}_n\) and \(\widehat{A}_n(i)\), \(i=1,\ldots ,I\), is \(O(Idp)\) flops. For the cost of \(\widehat{B}_n\) we analyze each term in (10).

  • \((\widehat{G}_n(T)^{\top },\widehat{A}_n)\mapsto \widehat{G}_n(T)^{\top }\widehat{A}_n\) costs \(O(p^2d)\) flops.

  • \(\left( g(\widehat{x}_{n}(S_i))^{\top }, t^2\widehat{G}_n(t) \Big |_{a_{i-1}}^{a_i}\right) \mapsto g(\widehat{x}_{n}(S_i))^{\top }t^2\widehat{G}_n(t) \Big |_{a_{i-1}}^{a_i}\) costs \(O(p^2d)\) flops.

  • \((R_i^{\top },g(\widehat{x}_{n}(S_i))(2i-1))\mapsto R_i^{\top }g(\widehat{x}_{n}(S_i))(2i-1)\) costs \(O(p^2d)\) flops.

  • \((g(\widehat{x}_{n}(S_i))^{\top },g(\widehat{x}_{n}(S_i))(3i^2-1))\mapsto g(\widehat{x}_{n}(S_i))^{\top }g(\widehat{x}_{n}(S_i))(3i^2-1)\) costs \(O(p^2d)\) flops.

Each term above has complexity \(O(p^2d)\), and summing over \(i\) we get that the cost for \(\widehat{B}_n\) is \(O(Ip^2d)\). Finally, the total cost for \(\widehat{A}_n\), \(\widehat{A}_n(i)\), \(i=1,\ldots ,I\), and \(\widehat{B}_n\) is \(O(Idp)+O(Ip^2d)=O(Ip^2d)\), which is equal to

$$\begin{aligned} O(\sqrt{n}p^2d) \end{aligned}$$

flops.

  3.

    The cost for \(\widehat{\xi }_n\).

    First consider the term \(\left( T I_d - \widehat{A}_n \widehat{B}_n^{-1} \widehat{A}_n^{\top }\right) ^{-1}\):

  • \(\widehat{B}_n\mapsto \widehat{B}_n^{-1}\) costs \(O(p^3)\) flops.

  • \((\widehat{A}_n,\widehat{B}_n^{-1})\mapsto \widehat{A}_n\widehat{B}_n^{-1}\) costs \(O(p^2d)\) flops.

  • \((\widehat{A}_n\widehat{B}_n^{-1},\widehat{A}_n^{\top })\mapsto \widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\) costs \(O(d^2p)\) flops.

  • \((I_d,\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top })\mapsto TI_d-\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\) costs \(O(d^2)\) flops.

  • \(T I_d-\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\mapsto (T I_d-\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top })^{-1}\) costs \(O(d^3)\) flops.

Thus, the computational cost of \(\left( T I_d - \widehat{A}_n \widehat{B}_n^{-1} \widehat{A}_n^{\top }\right) ^{-1}\) is \(O(p^3+p^2d+d^2p+d^2+d^3)\) which is equal to \(O(p^3+d^3)\) flops.

Next consider the term \(T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I-\widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\):

  • \(\widehat{x}_{n}(S_i)\mapsto T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I\) costs \(O(Id)\) flops.

  • \((\widehat{A}_n(i)^{\top },\widehat{x}_{n}(S_i))\mapsto \sum _{i=1}^I\widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) costs \(O(Ipd)\) flops.

  • \((\widehat{A}_n \widehat{B}_n^{-1},\sum _{i=1}^I\widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i))\mapsto \widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) costs \(O(dp)\) flops.

In total, the cost for \(T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I-\widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) is \(O(Idp)=O(\sqrt{n}dp)\) flops. Finally we have the following:

  • \(\left( T I_d - \widehat{A}_n \widehat{B}_n^{-1} \widehat{A}_n^{\top }\right) ^{-1}\) costs \(O(p^3+d^3)\) flops.

  • \(T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I-\widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) costs \(O(\sqrt{n}dp)\) flops.

  • Multiplication of two aforementioned terms costs \(O(d^2)\) flops.

In total, calculation of \(\widehat{\xi }_n \) costs \(O(p^3+d^3+\sqrt{n}dp+d^2)\) flops, which is equal to

$$\begin{aligned} O(p^3+d^3+\sqrt{n}dp) \end{aligned}$$

flops.

  4.

    The cost for \(\widehat{\theta }_n\).

    Consider the term \(\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i) -\widehat{A}_n^{\top }\widehat{\xi }_n\):

  • \((\widehat{A}_n(i)^{\top },\widehat{x}_{n}(S_i))\mapsto \sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i) \) costs \(O(Ipd)\) flops.

  • \((\widehat{A}_n^{\top },\widehat{\xi }_n)\mapsto \widehat{A}_n^{\top }\widehat{\xi }_n\) costs \(O(pd)\) flops.

  • The difference of the two aforementioned terms costs \(O(p)\) flops.

In total, the cost for calculating \(\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i) -\widehat{A}_n^{\top }\widehat{\xi }_n\) is \(O(Ipd)=O(\sqrt{n}pd)\) flops. The multiplication of \(\widehat{B}_n^{-1}\) with this term costs \(O(p^2)\) flops, so the total cost for \(\widehat{\theta }_n\) is

$$\begin{aligned} O(p^2+\sqrt{n}pd) \end{aligned}$$

flops.

Hence, the computational complexity of the procedure is \(O(\sqrt{n}dp+\sqrt{n}dp^2+p^3+d^3+\sqrt{n}dp+p^2+\sqrt{n}pd)\) flops, which simplifies to

$$\begin{aligned} O(p^3+d^3+\sqrt{n}dp^2) \end{aligned}$$

flops.
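Putting items 3 and 4 together, the final linear-algebra step can be sketched as follows. This is a hypothetical implementation, not the authors' code: the input names and shapes are our assumptions, with the matrices \(\widehat{A}_n(i)\) stacked into a three-dimensional array and the window means \(\widehat{x}_n(S_i)\) stacked row-wise.

```python
import numpy as np

# Sketch of the estimators as costed above; A_hat is d x p, B_hat is p x p,
# A_i stacks the I matrices A_hat(i) (shape I x d x p) and xbar stacks the
# window means x_hat(S_i) (shape I x d). Names are illustrative.
def xi_theta(A_hat, B_hat, A_i, xbar, T):
    d = xbar.shape[1]
    Binv = np.linalg.inv(B_hat)                       # O(p^3)
    AB = A_hat @ Binv                                 # O(p^2 d)
    s = np.einsum("idp,id->p", A_i, xbar)             # sum_i A_hat(i)^T x_hat(S_i): O(I p d)
    M = T * np.eye(d) - AB @ A_hat.T                  # O(d^2 p + d^2)
    rhs = T * xbar.mean(axis=0) - AB @ s              # O(I d + d p)
    xi = np.linalg.solve(M, rhs)                      # O(d^3)
    theta = Binv @ (s - A_hat.T @ xi)                 # O(p d + p^2)
    return xi, theta
```

Using `solve` for the \(d\times d\) system is numerically preferable to forming the inverse explicitly and has the same \(O(d^3)\) order.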

Appendix 4: Calculation of the integrals

Let us first recall that \(S_i=[a_{i-1},a_i]\), for \(i=1,\ldots ,I-1\), and \(S_I=[a_{I-1},T]\). The length of each subinterval \(S_i\) is \(\varDelta =T/I\) and the boundary points of the subintervals are \(a_i=i\varDelta \), for \(i=0,\ldots ,I\). To shorten the notation we let \(G_1(t)\) stand for an arbitrary entry of the matrix \(\widehat{G}_n(t)\). Similarly, let \(g_1(\widehat{x}_n(t))\) denote the entry of the matrix \(g(\widehat{x}_n(t))\) that corresponds to \(G_1(t)\), i.e. \(G_1(t)=\int _0^tg_1(\widehat{x}_n(s))\,\mathrm{d}s\). Let \(C_i=\widehat{x}_n(S_i)\), for \(i=1,\ldots ,I\). Using integration by parts yields

$$\begin{aligned} \int \limits _{S_i}G_1(t)\mathrm{{d}}t&= \int \limits _{a_{i-1}}^{a_{i}}G_1(t)\mathrm{{d}}t\nonumber \\&= t G_1(t)|_{a_{i-1}}^{a_{i}}-\int \limits _{a_{i-1}}^{a_{i}} t g_1(\widehat{x}_n(t))\mathrm{{d}}t\nonumber \\&= t G_1(t)|_{a_{i-1}}^{a_{i}}-g_1(C_i)(a_{i}^2-a^2_{i-1})/2. \end{aligned}$$
(20)

In the same manner, we obtain that for \(t\in S_i\) it holds

$$\begin{aligned} \int \limits _{0}^{t} G_1(s)\mathrm{{d}}s&= s G_1(s)|_{0}^{t}-\int \limits _{0}^{t} s g_1(\widehat{x}_n(s))\mathrm{{d}}s\nonumber \\&= t G_1(t)- \sum _{m=1}^{i-1}\int \limits _{a_{m-1}}^{a_m} s g_1(\widehat{x}_n(s))\mathrm{{d}}s\nonumber \\&\quad -\,\int \limits _{a_{i-1}}^{t} s g_1(\widehat{x}_n(s))\mathrm{{d}}s\nonumber \\&= t G_1(t)-\sum _{m=1}^{i-1}\frac{g_1(C_m)}{2}(a_m^2-a^2_{m-1})\nonumber \\&\quad -\,\frac{1}{2}g_1(C_i)(t^2-a^2_{i-1})\nonumber \\&= t G_1(t)-\frac{\varDelta ^2}{2}\sum _{m=1}^{i-1}(2m-1)g_1(C_m)\nonumber \\&\quad -\,\frac{1}{2}g_1(C_i)(t^2-a^2_{i-1}), \end{aligned}$$
(21)

where we have used that \(a_m^2-a^2_{m-1}=(2m-1)\varDelta ^2\). Recall the convention that \(\sum _{m=1}^{i-1}(2m-1)g_1(C_m)\) is equal to zero for \(i=1\). Setting \(t=T\) in (21) we obtain

$$\begin{aligned} \int \limits _0^T G_1(t)\mathrm{{d}}t&= TG_1(T)\nonumber \\&\quad -\,\frac{\varDelta ^2}{2}\sum _{m=1}^{I}(2m-1)g_1(C_m). \end{aligned}$$
(22)

Applying (20) and (22) to every entry of the matrices \(\int _{S_i}\widehat{G}_n(t)\mathrm{{d}}t\) and \(\int _0^T\widehat{G}_n(t)\mathrm{{d}}t\), respectively, we conclude that

$$\begin{aligned} \widehat{A}_n(i)&= \int \limits _{S_i}\widehat{G}_n(t)\mathrm{{d}}t\\&= t\widehat{G}_n(t)|_{a_{i-1}}^{a_i}-\frac{1}{2}(2i-1)\varDelta ^2 g(\widehat{x}_n(S_i)),\\ \widehat{A}_n&= \int \limits _0^T\widehat{G}_n(t)\mathrm{{d}}t\\&= T\widehat{G}_n(T)-\frac{\varDelta ^2}{2}\sum _{i=1}^{I}(2i-1)g(\widehat{x}_n(S_i)). \end{aligned}$$

To obtain \(\widehat{B}_n\), we need integrals of the form \(\int _0^T G_1(t)G_2(t)\mathrm{{d}}t\) where, to simplify notation, we have denoted by \(G_1(t)\) and \(G_2(t)\) entries of the matrix \(\widehat{G}_n(t)\). Similarly, \(g_1(\widehat{x}_n(t))\) and \(g_2(\widehat{x}_n(t))\) denote the corresponding entries of the matrix \(g(\widehat{x}_n(t))\), i.e. \(G_i(t)=\int _0^tg_i(\widehat{x}_n(s))\,\mathrm{d}s\), \(i=1,2\). Integration by parts yields

$$\begin{aligned} \int \limits _{0}^{T} G_1(t)G_2(t)\mathrm{{d}}t&= G_1(t)\int \limits _0^t G_2(s)\mathrm{{d}}s\bigg |_{0}^{T}\\&\quad -\,\int \limits _{0}^{T} \Big \{g_1(\widehat{x}_n(t))\int \limits _0^t G_2(s)\mathrm{{d}}s\Big \}\mathrm{{d}}t\\&= G_1(T)\int \limits _0^T G_2(t)\mathrm{{d}}t\\&\quad -\,\sum _{i=1}^Ig_1(C_i)\int \limits _{a_{i-1}}^{a_i} \Big \{\int \limits _0^t G_2(s)\mathrm{{d}}s\Big \}\mathrm{{d}}t. \end{aligned}$$

According to (21), for any \(t\in S_i\)

$$\begin{aligned} \int \limits _{0}^{t}G_2(s)\mathrm{{d}}{s}&=tG_2(t)-\frac{\varDelta ^2}{2}\sum _{m=1}^{i-1}(2m-1)g_2(C_m)\\&\qquad -\frac{g_2(C_i)}{2}(t^2-a^2_{i-1}) \end{aligned}$$

and hence

$$\begin{aligned}&\int \limits _{a_{i-1}}^{a_i} \Big \{\int \limits _0^t G_2(s)\mathrm{{d}}s\Big \}\mathrm{{d}}t\\&\quad = \int \limits _{a_{i-1}}^{a_i}tG_2(t)\mathrm{{d}}t-\frac{\varDelta ^3}{2}\sum _{m=1}^{i-1}(2m-1)g_2(C_m)\\&\qquad -\,\frac{g_2(C_i)}{2}\Big \{\frac{1}{3}(a_i^3-a_{i-1}^3)-a^2_{i-1}(a_i-a_{i-1})\Big \}. \end{aligned}$$

Again, using integration by parts, we obtain

$$\begin{aligned} \int \limits _{a_{i-1}}^{a_i}tG_2(t)\mathrm{{d}}t&= \frac{1}{2}t^2G_2(t)\Big |_{a_{i-1}}^{a_i}-\frac{1}{2}\int \limits _{a_{i-1}}^{a_i}t^2g_2(\widehat{x}_n(t))\mathrm{{d}}t \\&= \frac{1}{2}t^2G_2(t)\Big |_{a_{i-1}}^{a_i}-\frac{1}{2} g_2(C_i)\frac{1}{3}\left( a^3_i-a^3_{i-1}\right) . \end{aligned}$$

Since \(a_i=i\varDelta \), for \(i=0,\ldots ,I\), it follows that

$$\begin{aligned}&a^3_i-a^3_{i-1}=(3i^2-3i+1)\varDelta ^3, \\&\frac{1}{3}(a_i^3-a_{i-1}^3)-a^2_{i-1}(a_i-a_{i-1})=\varDelta ^3(3i-2)/3. \end{aligned}$$

Also, the following identity holds

$$\begin{aligned} \sum _{i=1}^Ig_1(C_i)\sum _{m=1}^{i-1}(2m-1)g_2(C_m)=\sum _{i=1}^{I-1}(2i-1)r_ig_2(C_i), \end{aligned}$$

where \(r_i=\sum _{m=i+1}^{I}g_1(C_m)\). By using previous identities, we finally obtain

$$\begin{aligned} \int \limits _{0}^{T} G_1(t)G_2(t)\mathrm{{d}}t&= G_1(T)\int \limits _0^T G_2(s)\mathrm{{d}}s\\&\quad -\frac{1}{2}\sum _{i=1}^Ig_1(C_i)t^2G_2(t)\Big |_{a_{i-1}}^{a_i}\\&\quad +\,\frac{\varDelta ^3}{6}\sum _{i=1}^I(3i^2-1)g_1(C_i) g_2(C_i)\\&\quad +\,\frac{\varDelta ^3}{2}\sum _{i=1}^{I-1}(2i-1)r_ig_2(C_i). \end{aligned}$$

Since the \((r,k)\)th entry of the matrix \(\int _{0}^{T} \widehat{G}_n(t)^{\top } \widehat{G}_n(t)\mathrm{{d}}t \) is the sum \(\sum _{l}\int _{0}^{T} \widehat{G}_n(t)[l,r]\,\widehat{G}_n(t)[l,k]\,\mathrm{{d}}t\) of integrals of the form above, we obtain that

$$\begin{aligned} \widehat{B}_n=\int \limits _{0}^{T} \widehat{G}_n(t)^{\top } \widehat{G}_n(t)\mathrm{{d}}t&= \widehat{G}_n(T)^{\top } \int \limits _0^T \widehat{G}_n(t) \mathrm{{d}}t\\&\quad -\,\frac{1}{2}\sum _{i=1}^Ig(\widehat{x}_n(S_i))^{\top }t^2\widehat{G}_n(t) \Big |_{a_{i-1}}^{a_i}\\&\quad +\,\frac{\varDelta ^3}{6}\sum _{i=1}^I (3i^2-1)g^{\top }(\widehat{x}_n(S_i))\\&\quad \times g(\widehat{x}_n(S_i))\\&\quad +\,\frac{\varDelta ^3}{2}\sum _{i=1}^{I-1}(2i-1)R_i^{\top }g(\widehat{x}_n(S_i)), \end{aligned}$$

where \(R_i=\sum _{m=i+1}^{I}g(\widehat{x}_n(S_m))\).
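These closed forms translate directly into a vectorized computation of \(\widehat{A}_n(i)\), \(\widehat{A}_n\) and \(\widehat{B}_n\). The sketch below is illustrative (all names are our own); `g_vals[i]` stacks \(g(\widehat{x}_n(S_{i+1}))\) (zero-based indexing), and the knot values \(\widehat{G}_n(a_i)\) are obtained by the cumulative sum that recursion (19) describes.

```python
import numpy as np

# Sketch of the Appendix 4 closed forms; g_vals has shape I x d x p.
def A_B_matrices(g_vals: np.ndarray, delta: float):
    I, d, p = g_vals.shape
    w = 2 * np.arange(1, I + 1) - 1                       # weights (2i - 1)
    G = np.cumsum(g_vals, axis=0) * delta                 # G_hat(a_i), as in (19)
    G0 = np.concatenate([np.zeros((1, d, p)), G[:-1]])    # G_hat(a_{i-1}); G_hat(0) = 0
    a = np.arange(1, I + 1) * delta                       # a_i
    a0 = np.arange(I) * delta                             # a_{i-1}

    # (20): A_hat(i) = t G_hat(t)|_{a_{i-1}}^{a_i} - (2i-1) Delta^2 g_i / 2
    A_i = (a[:, None, None] * G - a0[:, None, None] * G0
           - 0.5 * w[:, None, None] * delta**2 * g_vals)
    A = A_i.sum(axis=0)                                   # = T G_hat(T) - (Delta^2/2) sum (2i-1) g_i

    R = np.concatenate([g_vals[::-1].cumsum(axis=0)[::-1][1:],
                        np.zeros((1, d, p))])             # R_i = sum_{m > i} g_m, with R_I = 0
    tG = a[:, None, None]**2 * G - a0[:, None, None]**2 * G0  # t^2 G_hat(t)|_{a_{i-1}}^{a_i}
    B = (G[-1].T @ A                                      # G_hat(T)^T int_0^T G_hat dt = G_hat(T)^T A
         - 0.5 * np.einsum("idp,idq->pq", g_vals, tG)
         + delta**3 / 6 * np.einsum("i,idp,idq->pq",
                                    3 * np.arange(1, I + 1)**2 - 1, g_vals, g_vals)
         + delta**3 / 2 * np.einsum("i,idp,idq->pq", w, R, g_vals))
    return A_i, A, B
```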


Cite this article

Vujačić, I., Dattner, I., González, J. et al. Time-course window estimator for ordinary differential equations linear in the parameters. Stat Comput 25, 1057–1070 (2015). https://doi.org/10.1007/s11222-014-9486-9
