Adaptive efficient analysis for big data ergodic diffusion models

Galtchouk, Leonid I.; Pergamenshchikov, Serge M.

doi:10.1007/s11203-021-09241-9

Published: 27 March 2021

Statistical Inference for Stochastic Processes, Volume 25, pages 127–158 (2022)


Abstract

We consider drift estimation problems for high-dimensional ergodic diffusion processes in a nonparametric setting, based on observations at discrete fixed time moments, in the case when the diffusion coefficients are unknown. To this end, on the basis of sequential analysis methods, we develop model selection procedures for which we show non-asymptotic sharp oracle inequalities. Through the obtained inequalities we show that the constructed model selection procedures are asymptotically efficient in the adaptive setting, i.e. in the case when the model regularity is unknown. For the first time for such a problem, we find in explicit form the celebrated Pinsker constant, which provides the sharp lower bound for the minimax squared accuracy normalized with the optimal convergence rate. Then we show that the asymptotic quadratic risk of the model selection procedure asymptotically coincides with the obtained lower bound, i.e. the constructed procedure is efficient. Finally, on the basis of the constructed model selection procedures, in the framework of big data models we provide efficient estimation without using the parameter dimension or any sparsity conditions.

References

Bayisa FL, Zhou Z, Cronie O, Yu J (2019) Adaptive algorithm for sparse signal recovery. Digit Signal Proc 87:10–18
Comte F, Genon-Catalot V, Rozenholc Y (2009) Non-parametric estimation for a discretely observed integrated diffusion model. Stoch Process Appl 119:811–834
Dalalyan AS (2005) Sharp adaptive estimation of the drift function for ergodic diffusion. Ann Stat 33(6):2507–2528
Dalalyan AS, Kutoyants YuA (2002) Asymptotically efficient trend coefficient estimation for ergodic diffusion. Math Methods Stat 11(4):402–427
Fan J, Fan Y, Barut E (2014) Adaptive robust variable selection. Ann Stat 42(1):324–351
Florens-Zmirou D (1993) On estimating the diffusion coefficient from discrete observations. J Appl Probab 30(4):790–804
Fujimori K (2019) The Dantzig selector for a linear model of diffusion processes. Stat Infer Stoch Process 22:475–498
De Gregorio A, Iacus SM (2012) Adaptive LASSO-type estimation for multivariate diffusion processes. Econ Theory 28(4):838–860
Galtchouk LI (1978) Existence and uniqueness of a solution for stochastic equations with respect to semimartingales. Theory Probab Appl 23:751–763
Galtchouk L, Pergamenshchikov S (2001) Sequential nonparametric adaptive estimation of the drift coefficient in diffusion processes. Math Methods Stat 10(3):316–330
Galtchouk L, Pergamenshchikov S (2004) Nonparametric sequential estimation of the drift in diffusion via model selection. Math Methods Stat 13:25–49
Galtchouk L, Pergamenshchikov S (2005) Nonparametric sequential minimax estimation of the drift coefficient in diffusion processes. Sequ Anal 24(3):303–330
Galtchouk L, Pergamenshchikov S (2006) Asymptotically efficient sequential kernel estimates of the drift coefficient in ergodic diffusion processes. Stat Infer Stoch Process 9:1–16
Galtchouk L, Pergamenshchikov S (2007) Uniform concentration inequality for ergodic diffusion processes. Stoch Process Appl 117:830–839
Galtchouk L, Pergamenshchikov S (2009a) Sharp non-asymptotic oracle inequalities for nonparametric heteroscedastic regression models. J Nonparametr Stat 21:1–16
Galtchouk L, Pergamenshchikov S (2009b) Adaptive asymptotically efficient estimation in heteroscedastic nonparametric regression. J Korean Stat Soc 38(4):305–322
Galtchouk L, Pergamenshchikov S (2011) Adaptive sequential estimation for ergodic diffusion processes in quadratic metric. J Nonparametr Stat 23(2):255–285
Galtchouk L, Pergamenshchikov S (2013) Uniform concentration inequality for ergodic diffusion processes observed at discrete times. Stoch Process Appl 123(1):91–109
Galtchouk L, Pergamenshchikov S (2014) Geometric ergodicity for classes of homogeneous Markov chains. Stoch Process Appl 124(10):3362–3391
Galtchouk L, Pergamenshchikov S (2015) Efficient pointwise estimation based on discrete data in ergodic nonparametric diffusions. Bernoulli 21(4):2569–2594
Galtchouk L, Pergamenshchikov S (2019) Non asymptotic sharp oracle inequalities for high dimensional ergodic diffusion models. Preprint. https://hal.archives-ouvertes.fr/hal-02387034
Gihman II, Skorohod AV (1968) Stochastic differential equations. Naukova Dumka, Kiev
Gobet E, Hoffmann M, Reiss M (2004) Nonparametric estimation of scalar diffusions based on low frequency data. Ann Stat 32(5):2223–2253
Hoffmann M (1999) Adaptive estimation in diffusion processes. Stoch Process Appl 79:135–163
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Data mining, inference and prediction. Springer series in statistics. Springer, Berlin
Jacod J (2000) Non-parametric kernel estimation of the coefficient of a diffusion. Scand J Stat 27(1):83–96
Kabanov YM, Pergamenshchikov SM (2003) Two scale stochastic systems: asymptotic analysis and control. Applications of mathematics, stochastic modelling and applied probability, vol 49. Springer, Berlin
Karatzas I, Shreve SE (1998a) Brownian motion and stochastic calculus. Graduate texts in mathematics, 2nd edn. Springer Science+Business Media, New York
Karatzas I, Shreve SE (1998b) Methods of mathematical finance. Springer, New York
Kutoyants YuA (1977) Estimation of the signal parameter in a Gaussian noise. Probl Inf Transm 13(4):29–38
Kutoyants YuA (1984a) Parameter estimation for stochastic processes. Heldermann-Verlag, Berlin
Kutoyants YuA (1984b) On nonparametric estimation of trend coefficients in a diffusion process. Statistics and control of stochastic processes. Steklov seminar proceedings, Moscow, pp 230–250
Kutoyants YuA (2003) Statistical inferences for ergodic diffusion processes. Springer, Berlin
Konev VV, Pergamenshchikov SM (2015) Robust model selection for a semimartingale continuous time regression from discrete data. Stoch Process Appl 125:294–326
Lamberton D, Lapeyre B (1996) Introduction to stochastic calculus applied to finance. Chapman & Hall, London
Pinsker MS (1981) Optimal filtration of square integrable signals in Gaussian noise. Probl Inf Transm 17:120–133

Acknowledgements

The work of the last author was partially supported by the Russian Federal Professor program, Project No. 1.472.2016/1.4 (the Ministry of Science and Higher Education of the Russian Federation).

Author information

Authors and Affiliations

International Laboratory of Statistics of Stochastic Processes and Quantitative Finance, Tomsk State University, 36, Lenin str., Tomsk, Russia, 634041
Leonid I. Galtchouk
Laboratoire de Mathématiques Raphael Salem, Université de Rouen, Avenue de l’Université, BP 12, 76801, Saint Etienne du Rouvray Cedex, France
Serge M. Pergamenshchikov
International Laboratory of Statistics of Stochastic Processes and Quantitative Finance, Tomsk State University, 36 Lenina prosp., Tomsk, Russia, 634050
Serge M. Pergamenshchikov

Corresponding author

Correspondence to Leonid I. Galtchouk.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was supported by RSF, Grant No. 20-61-47043, National Russian Research Tomsk State University, Russia.

Appendix

1.1 Property of the penalty term

Proposition A.1

For any $0<\varepsilon <1/2$,

$$\begin{aligned} \mathbf{E}_{\vartheta }\,\mathbf{1}_{\mathbf{G}_{*}}\,P_n(\lambda _0)&\le \frac{1}{1-2\varepsilon }\, \mathbf{E}_{\vartheta } \text{ Err}_{n}(\lambda _0)\mathbf{1}_{\mathbf{G}_{*}} +\,\frac{{\check{\mathbf{x}}}\mathbf{g}^{*}_{T}}{\varepsilon (1-2\varepsilon )T}\\&\quad +\frac{2{\check{\mathbf{x}}}\sigma _{1,*}}{n\varepsilon } + \frac{2\Vert S\Vert _{n}\sqrt{{\check{\mathbf{x}}}\sigma _{1,*}}}{\sqrt{n}} \sqrt{\mathbf{P}_{\vartheta }\left( \mathbf{G}^c_*\right) } , \end{aligned}$$

where the term $\mathbf{g}^{*}_{T}$ is given in (3.6).

Proof

Note that on the set $\mathbf{G}_{*}$

$$\begin{aligned} \text{ Err}_{n}(\lambda )&=\sum ^n_{j=1}(\lambda (j) \widehat{\theta }_{j,n}-\theta _{j,n})^2 =\sum ^n_{j=1}\lambda ^2(j)\zeta ^2_{j,n}\\&\quad -2\sum ^n_{j=1}(1-\lambda (j))\lambda (j)\theta _{j,n}\zeta _{j,n} +\sum ^n_{j=1}(1-\lambda (j))^2\theta ^2_{j,n}. \end{aligned}$$
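The expansion above can be checked summand by summand. The following is only a sketch, assuming (as the notation suggests, although the defining display lies outside this appendix) that the estimate decomposes as $\widehat{\theta }_{j,n}=\theta _{j,n}+\zeta _{j,n}$:

$$\begin{aligned} \lambda (j)\widehat{\theta }_{j,n}-\theta _{j,n}&=\lambda (j)\zeta _{j,n}-(1-\lambda (j))\theta _{j,n},\\ (\lambda (j)\widehat{\theta }_{j,n}-\theta _{j,n})^2&=\lambda ^2(j)\zeta ^2_{j,n} -2(1-\lambda (j))\lambda (j)\theta _{j,n}\zeta _{j,n} +(1-\lambda (j))^2\theta ^2_{j,n}. \end{aligned}$$

Summing over $1\le j\le n$ recovers the three sums in the display.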

Taking into account here that

$$ \zeta ^2_{j,n}=g^2_{j,n}+\frac{{\check{\mathbf{x}}}}{n}\xi ^2_{j,n}+2\sqrt{\frac{{\check{\mathbf{x}}}}{n}}g_{j,n}\xi _{j,n}, $$

we obtain

$$ \text{ Err}_{n}(\lambda ) \ge \frac{{\check{\mathbf{x}}}}{n}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n}\,+\,2\sqrt{\frac{{\check{\mathbf{x}}}}{n}}I_{1}\,-\,2\sqrt{\frac{{\check{\mathbf{x}}}}{n}}I_{2}, $$

where $I_{1}=\sum ^n_{j=1}\lambda ^2(j)g_{j,n}\xi _{j,n}$ and $I_{2}=\sum ^n_{j=1}(1-\lambda (j))\lambda (j)\theta _{j,n}\xi _{j,n}$. Moreover, note that, for any $0<\varepsilon <1$,

$$ 2\sqrt{\frac{{\check{\mathbf{x}}}}{n}}I_{1} \le \frac{1}{\varepsilon }\Vert g\Vert ^2_n+\frac{\varepsilon {\check{\mathbf{x}}}}{n}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n}. $$

Therefore, for $\lambda =\lambda _0$,

$$ \text{ Err}_{n}(\lambda _0)\ge \frac{(1-\varepsilon ){\check{\mathbf{x}}}}{n}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n} -\frac{2\sqrt{{\check{\mathbf{x}}}}}{\sqrt{n}}I_{2}-\frac{1}{\varepsilon }\Vert g\Vert ^2_{n} $$

and

$$\begin{aligned} \mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\text{ Err}_{n}(\lambda _0) \ge \frac{(1-\varepsilon ){\check{\mathbf{x}}}}{n}\mathbf{E}_{\vartheta } \mathbf{1}_{\mathbf{G}_{*}}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n} - \frac{2\sqrt{{\check{\mathbf{x}}}}}{\sqrt{n}} \mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\,I_{2} -\frac{1}{\varepsilon }\mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\,\Vert g\Vert ^2_{n} . \end{aligned}$$
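The lower bounds above rest on two elementary steps that the proof leaves implicit. The following is a sketch under assumptions not stated in this appendix: that $\zeta _{j,n}=g_{j,n}+\sqrt{{\check{\mathbf{x}}}/n}\,\xi _{j,n}$ (consistent with the displayed expansion of $\zeta ^2_{j,n}$), that $0\le \lambda (j)\le 1$, and that $\Vert g\Vert ^2_n=\sum ^n_{j=1}g^2_{j,n}$. First, the terms of $\text{ Err}_{n}(\lambda )$ that do not involve $\xi _{j,n}$ form a complete square and can be dropped:

$$ \sum ^n_{j=1}\lambda ^2(j)g^2_{j,n} -2\sum ^n_{j=1}(1-\lambda (j))\lambda (j)\theta _{j,n}g_{j,n} +\sum ^n_{j=1}(1-\lambda (j))^2\theta ^2_{j,n} =\sum ^n_{j=1}\left( \lambda (j)g_{j,n}-(1-\lambda (j))\theta _{j,n}\right) ^2\ge 0. $$

Second, the inequality $2|ab|\le \varepsilon ^{-1}a^2+\varepsilon b^2$ with $a=\lambda (j)g_{j,n}$ and $b=\sqrt{{\check{\mathbf{x}}}/n}\,\lambda (j)\xi _{j,n}$ gives

$$ 2\sqrt{\frac{{\check{\mathbf{x}}}}{n}}\,|I_{1}| \le \frac{1}{\varepsilon }\sum ^n_{j=1}\lambda ^2(j)g^2_{j,n} +\frac{\varepsilon {\check{\mathbf{x}}}}{n}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n} \le \frac{1}{\varepsilon }\Vert g\Vert ^2_n +\frac{\varepsilon {\check{\mathbf{x}}}}{n}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n}, $$

so that

$$ \frac{{\check{\mathbf{x}}}}{n}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n}+2\sqrt{\frac{{\check{\mathbf{x}}}}{n}}\,I_{1} \ge \frac{(1-\varepsilon ){\check{\mathbf{x}}}}{n}\sum ^n_{j=1}\lambda ^2(j)\xi ^2_{j,n}-\frac{1}{\varepsilon }\Vert g\Vert ^2_n . $$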

Taking into account here the definition of $\mathbf{B}(\cdot )$ in (6.1) and that $\mathbf{E}_{\vartheta }I_{2}=0$, we can rewrite the last inequality as

$$\begin{aligned} \mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\text{ Err}_{n}(\lambda _0)&\ge (1-\varepsilon ) \mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\,P_{n}(\lambda ) +\frac{(1-\varepsilon )}{\sqrt{n}}\mathbf{E}_{\vartheta } \mathbf{1}_{\mathbf{G}_{*}}\mathbf{B}(\lambda ^{2}) \\&\quad + \frac{2\sqrt{{\check{\mathbf{x}}}}}{\sqrt{n}} \mathbf{E}_{\vartheta }\mathbf{1}_{(\mathbf{G}_{*})^{c}}\,I_{2} -\frac{1}{\varepsilon }\mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\,\Vert g\Vert ^2_{n} . \end{aligned}$$

Now Propositions 6.1–6.2 imply that

$$\begin{aligned} \mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\text{ Err}_{n}(\lambda _0)&\ge (1-2\varepsilon ) \mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\,P_{n}(\lambda ) -\frac{2{\check{\mathbf{x}}}\sigma _{1,*}}{n\varepsilon } \\&\quad - \frac{2\Vert S\Vert _{n}\sqrt{{\check{\mathbf{x}}}\sigma _{1,*}}}{\sqrt{n}} \sqrt{\mathbf{P}_{\vartheta }\left( \mathbf{G}^c_*\right) }\, -\frac{1}{\varepsilon }\mathbf{E}_{\vartheta }\mathbf{1}_{\mathbf{G}_{*}}\,\Vert g\Vert ^2_{n} . \end{aligned}$$

Hence Proposition A.1. $\square $

1.2 Asymptotic analysis tools

Proposition A.2

Assume that the conditions $\mathbf{A}_1)$–$\mathbf{A}_2)$ hold. Then, for any $\mathbf{x}_{0}<\mathbf{x}_{1}$ and $a>0$,

$$ \lim _{T\rightarrow \infty }\, T^{a} \max _{\mathbf{x}_{0}\le x\le \mathbf{x}_{1}} \sup _{\vartheta \in \Theta } \mathbf{P}_{\vartheta }(|{\widetilde{q}}_{T}(x)-\mathbf{q}_{\vartheta }(x)|> \upsilon _{T}) =0. $$

The proof is the same as for Lemma A.3 in Galtchouk and Pergamenshchikov (2015), so it is omitted.

Proof of Proposition 3.3

First, note that to show the limit (3.7) it suffices to check that, for any $a>0$,

$$\begin{aligned} \lim _{T\rightarrow \infty }\,{T^{1-a}}\, \sup _{\mathbf{x}_{0}\le z\le \mathbf{x}_{1}} \sup _{\vartheta \in \Theta }\, \left( \mathbf{E}_{\vartheta }\,\mathbf{g}^{2}_{1}(z)\mathbf{1}_{\Gamma (z)} +\mathbf{E}_{\vartheta }\,\mathbf{g}^{2}_{2}(z)\mathbf{1}_{\Gamma (z)} \right) =0. \end{aligned}$$

(A.1)

Indeed, using the definition of $\mathbf{g}_{1}(z)$ in (2.16) we represent it on the set $\Gamma (z)$ as $ \mathbf{g}_{1}(z) =\mathbf{g}_{1,1}(z)+\mathbf{g}_{1,2}(z)$, where

$$ \mathbf{g}_{1,1}(z)=\frac{1}{\delta H(z)} \,(1-\sqrt{\varkappa (z)})\sqrt{\varkappa (z)} \chi _{\tau (z)}(z,h)\, \int ^{t_{\tau (z)}}_{t_{\tau (z)-1}}\,S(y_{u})\,\mathrm {d}u $$

and

$$ \mathbf{g}_{1,2}(z)=\frac{1}{\delta H(z)} \sum ^{\tau (z)}_{j=N_{0}+1}\,{\widetilde{\varkappa }}_{j}(z)\, \chi _{j}(z,h)\, \int ^{t_{j}}_{t_{j-1}}\,S(y_{u})\,\mathrm {d}u -S(z). $$

To estimate the term $\mathbf{g}_{1,1}(z)$ note that

$$ \mathbf{g}^{2}_{1,1}(z)\le \frac{\Psi _{\tau (z)}(z)}{\delta H^{2}(z)}, \quad \Psi _{\tau (z)}(z)= \chi _{\tau (z)}(z,h)\, \int ^{t_{\tau (z)}}_{t_{\tau (z)-1}}\,S^{2}(y_{u})\,\mathrm {d}u. $$

Moreover, note that, for some constant $C>0$,

$$ \max _{N_{0}<j\le N}\, \max _{t_{j-1}\le u\le t_{j}} \sup _{\vartheta \in \Theta }\, \mathbf{E}_{\vartheta }\left( S^{2}(y_{u})\vert {{\mathcal {F}}}_{t_{j-1}}\right) \le C(1+y^{2}_{t_{j-1}}) . $$

From the definition (2.13) it follows that $\{\tau (z)=j\}\in {{\mathcal {F}}}_{t_{j-1}}$, i.e.

$$ \mathbf{E}_{\vartheta }\left( \Psi _{j}(z) \vert {{\mathcal {F}}}_{t_{j-1}} \right) \le \delta C(1+y^{2}_{t_{j-1}})\chi _{j}(z,h)\le \delta C. $$

Therefore, for some $C>0$,

$$ \mathbf{E}_{\vartheta }\left( \Psi _{\tau (z)}(z) \vert {{\mathcal {F}}}_{t_{N_{0}}} \right) = \sum ^{N}_{j=N_{0}+1} \mathbf{E}_{\vartheta }\left( \mathbf{1}_{\{\tau (z)=j\}} \mathbf{E}_{\vartheta }\left( \Psi _{j}(z) \vert {{\mathcal {F}}}_{t_{j-1}} \right) \vert {{\mathcal {F}}}_{t_{N_{0}}} \right) \le \delta C $$

and

$$ \mathbf{E}_{\vartheta }\left( \mathbf{g}^{2}_{1,1}(z) \vert {{\mathcal {F}}}_{t_{N_{0}}} \right) \le \frac{C}{H^{2}(z)}. $$
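The bound by $\Psi _{\tau (z)}(z)$ used above is a Cauchy–Schwarz estimate for the term built from the single interval $[t_{\tau (z)-1},t_{\tau (z)}]$. The following is a sketch under assumptions not stated in this appendix: that the observation step is $t_{j}-t_{j-1}=\delta $, that $0\le \varkappa (z)\le 1$, and that $\chi _{j}(z,h)$ takes values in $\{0,1\}$. Then

$$\begin{aligned} \left( \frac{1}{\delta H(z)}\,(1-\sqrt{\varkappa (z)})\sqrt{\varkappa (z)}\, \chi _{\tau (z)}(z,h)\int ^{t_{\tau (z)}}_{t_{\tau (z)-1}}S(y_{u})\,\mathrm {d}u\right) ^{2}&\le \frac{\chi _{\tau (z)}(z,h)}{\delta ^{2}H^{2}(z)} \left( \int ^{t_{\tau (z)}}_{t_{\tau (z)-1}}S(y_{u})\,\mathrm {d}u\right) ^{2}\\&\le \frac{\chi _{\tau (z)}(z,h)}{\delta ^{2}H^{2}(z)}\, \delta \int ^{t_{\tau (z)}}_{t_{\tau (z)-1}}S^{2}(y_{u})\,\mathrm {d}u =\frac{\Psi _{\tau (z)}(z)}{\delta H^{2}(z)}. \end{aligned}$$

Combined with $\mathbf{E}_{\vartheta }\left( \Psi _{\tau (z)}(z)\vert {{\mathcal {F}}}_{t_{N_{0}}}\right) \le \delta C$, and provided $H(z)$ is ${{\mathcal {F}}}_{t_{N_{0}}}$-measurable, this yields the conditional bound $C/H^{2}(z)$.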

Using the definition (2.12), the conditions $\mathbf{A}_{1}$)–$\mathbf{A}_{2}$), and Propositions 4.1–4.2 from Galtchouk and Pergamenshchikov (2015) we obtain the property (3.7). Hence Proposition 3.3. $\square $

Proof of Proposition 3.4

Note that

$$ \mathbf{E}_{\vartheta }\vert \widehat{\sigma }_{l}-\sigma ^{2}_{l} \vert \le \frac{1}{\upsilon _{T}\delta (N-N_{0})h}\,\mathbf{E}_{\vartheta }\vert \widehat{b}_{l}-b^{2}(z_{l})\vert . $$

Taking into account the definition of $N_{0}$ in (2.9) we obtain through Proposition 3.1 from Galtchouk and Pergamenshchikov (2019) the limit equality (3.11). Hence Proposition 3.4. $\square $

Now, we study the heteroscedastic property in the model (3.4). To this end we study asymptotic properties of the average variance $\mathbf{s}_{n}$ defined in (7.10).

Proposition A.3

Assume that the condition $\mathbf{A}_1)$ holds. Then

$$\begin{aligned} \lim _{T\rightarrow \infty }\, \sup _{\vartheta \in \Theta _{k,r}} \, \mathbf{E}_{\vartheta } \left| \mathbf{s}_{n} - \frac{\mathbf{J}_{\vartheta }}{{\check{\mathbf{x}}}} \right| \,=0 . \end{aligned}$$

(A.2)

Proof

Using the definition of $\sigma _{l}$ in (3.4) and taking into account the form of $h$ given in (2.13), we can represent the term $\mathbf{s}_{n}$ as

$$\begin{aligned} \mathbf{s}_{n} = \frac{1}{{\check{\mathbf{x}}}} \sum _{l=1}^{n} {\widetilde{b}}_{\vartheta }(z_{l}) (z_{l}-z_{l-1}) +\, R_{1}(\vartheta )\,+\,R_{2}(\vartheta ), \end{aligned}$$

(A.3)

where ${\widetilde{b}}_{\vartheta }(x)=b^2(x)/\mathbf{q}_{\vartheta }(x)$,

$$ R_{1}(\vartheta )= \frac{1}{{\check{\mathbf{x}}}} \left( \frac{n^{2}}{\delta \,(N-N_{0})} -1 \right) \sum _{l=1}^{n} \, {\widetilde{b}}_{\vartheta }(z_{l})\, (z_{l}-z_{l-1}) $$

and

$$ R_{2}(\vartheta )=\sum _{l=1}^{n}\frac{nb^2(z_l)}{\delta \,h\,(N-N_{0})} \left( \frac{1}{2{\widetilde{q}}_{T}(z_{l})-\upsilon _{T}}\, -\,\frac{1}{2\mathbf{q}_{\vartheta }(z_{l})}\right) (z_{l}-z_{l-1}). $$

First of all, note that the function ${\widetilde{b}}_{\vartheta }(\cdot )$ and its derivative are uniformly bounded, i.e. $ \sup _{\vartheta \in \Theta _{k,r}} \max _{\mathbf{x}_{0}\le z\le \mathbf{x}_{1}}\, \left( {\widetilde{b}}_{\vartheta }(z) + \vert {\widetilde{b}}^{\prime }_{\vartheta }(z)\vert \right) <\infty $. Therefore,

$$ \lim _{T\rightarrow \infty }\, \sup _{\vartheta \in \Theta _{k,r}} \left| \sum _{l=1}^{n} {\widetilde{b}}_{\vartheta }(z_{l}) (z_{l}-z_{l-1}) - \mathbf{J}_{\vartheta } \right| =0. $$
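The convergence of this Riemann sum can be quantified. The following is a sketch under assumptions not stated in this appendix: that the grid in (3.1) is uniform, $z_{l}=\mathbf{x}_{0}+l{\check{\mathbf{x}}}/n$ with ${\check{\mathbf{x}}}=\mathbf{x}_{1}-\mathbf{x}_{0}$, that $\mathbf{J}_{\vartheta }=\int ^{\mathbf{x}_{1}}_{\mathbf{x}_{0}}{\widetilde{b}}_{\vartheta }(x)\,\mathrm {d}x$, and that $n\rightarrow \infty $ as $T\rightarrow \infty $. Then

$$\begin{aligned} \left| \sum _{l=1}^{n}{\widetilde{b}}_{\vartheta }(z_{l})(z_{l}-z_{l-1})-\mathbf{J}_{\vartheta }\right|&\le \sum _{l=1}^{n}\int ^{z_{l}}_{z_{l-1}} \vert {\widetilde{b}}_{\vartheta }(z_{l})-{\widetilde{b}}_{\vartheta }(x)\vert \,\mathrm {d}x \le \max _{\mathbf{x}_{0}\le z\le \mathbf{x}_{1}}\vert {\widetilde{b}}^{\prime }_{\vartheta }(z)\vert \sum _{l=1}^{n}\int ^{z_{l}}_{z_{l-1}}(z_{l}-x)\,\mathrm {d}x\\&= \max _{\mathbf{x}_{0}\le z\le \mathbf{x}_{1}}\vert {\widetilde{b}}^{\prime }_{\vartheta }(z)\vert \, \frac{{\check{\mathbf{x}}}^{2}}{2n}. \end{aligned}$$

Since the derivative is bounded uniformly over $\Theta _{k,r}$, the right-hand side tends to zero uniformly in $\vartheta $.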

As to the second term in (A.3), note that, in view of the condition (3.2),

$$ \lim _{T\rightarrow \infty } \, \frac{n^{2}}{\delta \,(N-N_{0})} \,=1. $$

Therefore, $ \lim _{T\rightarrow \infty } \sup _{\vartheta \in \Theta _{k,r}} \, \vert R_{1}(\vartheta ) \vert =0$. Moreover, taking into account that $2{\widetilde{q}}_{T}(z_{l})-\upsilon _{T}>\upsilon ^{1/2}_{T}$, we obtain, for sufficiently large $T$, that for some $C>0$

$$ |R_{2}(\vartheta )| \le \,C\left( \sum _{l=1}^{n}\,\upsilon ^{-1/2}_{T}\left( |{\widetilde{q}}_{T}(z_{l})-\mathbf{q}_{\vartheta }(z_{l})|\right) (z_{l}-z_{l-1}) +\sqrt{\upsilon _{T}} \right) . $$

Note that, for any $1\le l\le n$ and for sufficiently large $T$,

$$ \upsilon ^{-1/2}_{T} \mathbf{E}_{\vartheta } \, |{\widetilde{q}}_{T}(z_{l})-\mathbf{q}_{\vartheta }(z_{l})| \le \,2\upsilon ^{-1}_{T} \mathbf{P}_{\vartheta }(|{\widetilde{q}}_{T}(z_{l})-\mathbf{q}_{\vartheta }(z_{l})|>\upsilon _{T}) + \sqrt{\upsilon _{T}} $$

and, therefore, Proposition A.2 implies $ \lim _{T\rightarrow \infty } \sup _{\vartheta \in \Theta _{k,r}} \,\vert R_{2}(\vartheta ) \vert =0$. $\square $
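The last bound in this proof is a truncation argument. As a sketch, write $X_{l}=|{\widetilde{q}}_{T}(z_{l})-\mathbf{q}_{\vartheta }(z_{l})|$ and assume (this is not stated in this appendix) that $X_{l}\le M_{T}$ for some deterministic bound $M_{T}$. Splitting the expectation on the event $\{X_{l}>\upsilon _{T}\}$ gives

$$ \mathbf{E}_{\vartheta }X_{l} =\mathbf{E}_{\vartheta }X_{l}\mathbf{1}_{\{X_{l}\le \upsilon _{T}\}} +\mathbf{E}_{\vartheta }X_{l}\mathbf{1}_{\{X_{l}>\upsilon _{T}\}} \le \upsilon _{T}+M_{T}\,\mathbf{P}_{\vartheta }(X_{l}>\upsilon _{T}), $$

so that

$$ \upsilon ^{-1/2}_{T}\mathbf{E}_{\vartheta }X_{l} \le \sqrt{\upsilon _{T}}+\upsilon ^{-1/2}_{T}M_{T}\,\mathbf{P}_{\vartheta }(X_{l}>\upsilon _{T}). $$

This agrees with the displayed inequality whenever $M_{T}\le 2\upsilon ^{-1/2}_{T}$. By Proposition A.2 the probability decays faster than any power of $T$, so the first term vanishes provided $\upsilon ^{-1}_{T}$ grows at most polynomially in $T$ (again an assumption here).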

Lemma A.4

Let $f$ be an absolutely continuous function $[\mathbf{x}_{0},\mathbf{x}_{1}] \rightarrow {{\mathbb {R}}}$ with $\Vert {\dot{f}}\Vert <\infty $ and let $g$ be a step-wise function $[\mathbf{x}_{0},\mathbf{x}_{1}] \rightarrow {{\mathbb {R}}}$ of the form $ g(z)=\sum _{j=1}^{n}\,c_{j}\,\chi _{(z_{j-1},z_{j}]}(z)$, where $c_{j}$ are some constants and the sequence $(z_{j})_{0\le j\le n}$ is given in (3.1). Then, for any $ {\widetilde{\varepsilon }}>0$, the function $\Delta =f-g$ satisfies the following inequalities

$$ \frac{1}{{\widetilde{\varepsilon }}}\frac{\Vert {\dot{f}}\Vert ^{2}}{n^{2}}{\check{\mathbf{x}}}^2 - \frac{\Vert \Delta \Vert ^{2}}{1+{\widetilde{\varepsilon }}} \le \Vert \Delta \Vert ^{2}_{n}\le (1+{\widetilde{\varepsilon }})\Vert \Delta \Vert ^{2} + \left( 1+\frac{1}{{\widetilde{\varepsilon }}}\right) \frac{\Vert {\dot{f}}\Vert ^{2}}{n^{2}}{\check{\mathbf{x}}}^2. $$

The proof is given in Lemma A.2 from Konev and Pergamenshchikov (2015).

1.3 Properties of the trigonometric basis

Lemma A.5

For any $1\le j\le n$ and any ${\widetilde{\varepsilon }}>0$, the discrete trigonometric Fourier coefficients $(\theta _{j,n})_{1\le j\le n}$ introduced in (4.3) for $S\in W_{k,r}$ are bounded as

$$\begin{aligned} \theta ^{2}_{j,n} \, \le \,(1+{\widetilde{\varepsilon }}) \,\theta ^{2}_{j} \, +(1+{\widetilde{\varepsilon }}^{-1})\, \frac{{\check{r}}_{k}}{n^{2k}} ,\quad {\check{r}}_{k}=\frac{2r(\pi ^2+1){\check{\mathbf{x}}}^{2k}}{\pi ^{2k}} , \end{aligned}$$

(A.4)

where the coefficients $\theta _{j}$ are defined in (5.5).

Proof

First we represent the function $S$ in ${{\mathcal {L}}}[\mathbf{x}_{0},\,\mathbf{x}_{1}]$ as

$$\begin{aligned} S(x)=\sum ^{n}_{l=1}\,\theta _{l}\,\phi _{l}(x) +\Delta _{n}(x) \quad \text{ and }\quad \Delta _{n}(x)=\sum _{l>n}\,\theta _{l}\,\phi _{l}(x) . \end{aligned}$$

(A.5)

Since $ \theta _{j,n}=(S,\phi _{j})_{n} =\theta _{j} + (\Delta _{n},\phi _{j})_{n}$, we get that, for any ${\widetilde{\varepsilon }}>0$,

$$ \theta ^{2}_{j,n} \le (1+{\widetilde{\varepsilon }}) \theta ^{2}_{j} + (1+{\widetilde{\varepsilon }}^{-1}) \Vert \Delta _{n}\Vert ^{2}_{n} . $$
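This step combines an elementary inequality with the Cauchy–Schwarz inequality. The following is a sketch, assuming that the basis functions are normalized in the empirical norm, $\Vert \phi _{j}\Vert _{n}=1$ for $1\le j\le n$ (the definition of $(\cdot ,\cdot )_{n}$ lies outside this appendix). For any real $a$, $b$ and ${\widetilde{\varepsilon }}>0$ one has $2ab\le {\widetilde{\varepsilon }}a^{2}+{\widetilde{\varepsilon }}^{-1}b^{2}$, hence

$$\begin{aligned} \theta ^{2}_{j,n}&=\left( \theta _{j}+(\Delta _{n},\phi _{j})_{n}\right) ^{2} \le (1+{\widetilde{\varepsilon }})\theta ^{2}_{j} +(1+{\widetilde{\varepsilon }}^{-1})(\Delta _{n},\phi _{j})^{2}_{n}\\&\le (1+{\widetilde{\varepsilon }})\theta ^{2}_{j} +(1+{\widetilde{\varepsilon }}^{-1})\Vert \Delta _{n}\Vert ^{2}_{n}\Vert \phi _{j}\Vert ^{2}_{n} =(1+{\widetilde{\varepsilon }})\theta ^{2}_{j} +(1+{\widetilde{\varepsilon }}^{-1})\Vert \Delta _{n}\Vert ^{2}_{n}. \end{aligned}$$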

Moreover, through Lemma A.4 and the definition (5.5) we deduce

$$ \Vert \Delta _{n}\Vert ^{2}_{n} \le 2\sum _{l>n}\,\theta ^{2}_{l} + 2 \frac{\Vert {\dot{\Delta }}_{n}\Vert ^{2}{\check{\mathbf{x}}}^{2}}{n^{2}} \le \frac{2 r }{a_{n+1}} + \frac{2\Vert {\dot{\Delta }}_{n}\Vert ^{2}{\check{\mathbf{x}}}^{2}}{n^{2}} . $$

Taking into account here that $2[l/2]\ge l-1$ for $l\ge 2$, we get

$$ \Vert \Delta _{n}\Vert ^{2}_{n}\le \frac{2r{\check{\mathbf{x}}}^{2k}}{\pi ^{2k}n^{2k}} + 2 \frac{\Vert {\dot{\Delta }}_{n}\Vert ^{2}{\check{\mathbf{x}}}^{2}}{n^{2}} . $$

Similarly, for any $n\ge 1$,

$$\begin{aligned} \Vert {\dot{\Delta }}_{n}\Vert ^{2}&= \frac{(2\pi )^{2}}{{\check{\mathbf{x}}}^{2}}\, \sum _{l>n}\,\theta ^{2}_{l}\,[l/2]^{2} = \frac{{\check{\mathbf{x}}}^{2(k-1)}}{\pi ^{2(k-1)}}\, \sum _{l>n}\,\frac{a_{l}\theta ^{2}_{l}}{(2[l/2])^{2(k-1)}} \nonumber \\&\le \frac{{\check{\mathbf{x}}}^{2(k-1)}}{\pi ^{2(k-1)}} \sum _{l>n}\,\frac{a_{l}\theta ^{2}_{l}}{(l-1)^{2(k-1)}} \le \frac{r{\check{\mathbf{x}}}^{2(k-1)}}{\pi ^{2(k-1)}n^{2(k-1)}} . \end{aligned}$$

(A.6)
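For completeness, the constant ${\check{r}}_{k}$ in (A.4) arises by substituting (A.6) into the preceding bound for $\Vert \Delta _{n}\Vert ^{2}_{n}$. The last step of (A.6) uses $l-1\ge n$ for $l>n$ together with $\sum _{l\ge 1}a_{l}\theta ^{2}_{l}\le r$, which is assumed here to be the defining property of $W_{k,r}$. The arithmetic is

$$\begin{aligned} \Vert \Delta _{n}\Vert ^{2}_{n}&\le \frac{2r{\check{\mathbf{x}}}^{2k}}{\pi ^{2k}n^{2k}} +\frac{2{\check{\mathbf{x}}}^{2}}{n^{2}}\cdot \frac{r{\check{\mathbf{x}}}^{2(k-1)}}{\pi ^{2(k-1)}n^{2(k-1)}} =\frac{2r{\check{\mathbf{x}}}^{2k}}{\pi ^{2k}n^{2k}} +\frac{2r\pi ^{2}{\check{\mathbf{x}}}^{2k}}{\pi ^{2k}n^{2k}}\\&=\frac{2r(\pi ^{2}+1){\check{\mathbf{x}}}^{2k}}{\pi ^{2k}n^{2k}} =\frac{{\check{r}}_{k}}{n^{2k}}. \end{aligned}$$

Inserting this into the bound $\theta ^{2}_{j,n}\le (1+{\widetilde{\varepsilon }})\theta ^{2}_{j}+(1+{\widetilde{\varepsilon }}^{-1})\Vert \Delta _{n}\Vert ^{2}_{n}$ gives (A.4).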

Hence Lemma A.5. $\square $

Lemma A.6

For any $n\ge 2$, $1\le m< n$ and $r>0$, the coefficients $(\theta _{j,n})_{1\le j\le n}$ of functions S from the class $W_{k,r}$ satisfy, for any ${\widetilde{\varepsilon }}>0$, the following inequality

$$\begin{aligned} \sum ^{n}_{j=m+1} \theta ^{2}_{j,n} \, \le \,(1+{\widetilde{\varepsilon }}) \,\sum _{j\ge m+1}\,\theta ^{2}_{j} \, +(1+{\widetilde{\varepsilon }}^{-1})\, \frac{{\check{r}}_{1}}{n^{2} m^{2(k-1)}} , \end{aligned}$$

(A.7)

where ${\check{r}}_{1}=r{\check{\mathbf{x}}}^{2k}/\pi ^{2(k-1)}$.

Proof

First we note that

$$\begin{aligned} \sum ^{n}_{j=m+1} \theta ^{2}_{j,n}&=\min _{x_{1},\ldots ,x_{m}} \,\Vert S-\sum ^{m}_{j=1}\,x_{j}\phi _{j} \Vert ^{2}_{n} \le \Vert \Delta _{m}\Vert ^{2}_{n}, \end{aligned}$$

where the function $\Delta _{m}(\cdot )$ is defined in (A.5). By applying Lemma A.4 with $f=\Delta _{m}$, $g=0$, and taking into account the inequality (A.6), we obtain the bound (A.7). Hence Lemma A.6. $\square $
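The computation behind (A.7) can be written out as follows. This is a sketch, assuming that $(\phi _{j})_{j\ge 1}$ is orthonormal in ${{\mathcal {L}}}[\mathbf{x}_{0},\mathbf{x}_{1}]$, so that $\Vert \Delta _{m}\Vert ^{2}=\sum _{j\ge m+1}\theta ^{2}_{j}$, and using (A.6) with $n$ replaced by $m$. The upper bound of Lemma A.4 with $f=\Delta _{m}$ and $g=0$ gives

$$\begin{aligned} \Vert \Delta _{m}\Vert ^{2}_{n}&\le (1+{\widetilde{\varepsilon }})\Vert \Delta _{m}\Vert ^{2} +\left( 1+\frac{1}{{\widetilde{\varepsilon }}}\right) \frac{\Vert {\dot{\Delta }}_{m}\Vert ^{2}}{n^{2}}{\check{\mathbf{x}}}^{2}\\&\le (1+{\widetilde{\varepsilon }})\sum _{j\ge m+1}\theta ^{2}_{j} +(1+{\widetilde{\varepsilon }}^{-1})\, \frac{{\check{\mathbf{x}}}^{2}}{n^{2}}\cdot \frac{r{\check{\mathbf{x}}}^{2(k-1)}}{\pi ^{2(k-1)}m^{2(k-1)}} =(1+{\widetilde{\varepsilon }})\sum _{j\ge m+1}\theta ^{2}_{j} +(1+{\widetilde{\varepsilon }}^{-1})\, \frac{{\check{r}}_{1}}{n^{2}m^{2(k-1)}}, \end{aligned}$$

with ${\check{r}}_{1}=r{\check{\mathbf{x}}}^{2k}/\pi ^{2(k-1)}$, as stated in the lemma.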

Lemma A.7

For any $k\ge 1$,

$$\begin{aligned} \sup _{n\ge \,2}n^{-k} \sup _{x\in [\mathbf{x}_{0},\mathbf{x}_{1}]}\,\left| \sum ^n_{l=2}\, l^{k}\overline{\phi }_{l}(x)\right| \,\le \,2^{k}, \end{aligned}$$

(A.8)

where $\overline{\phi }_{l}(x)={\check{\mathbf{x}}}\phi ^2_{l}(x)-1$.

The proof of this result is given in Lemma A.2 from Galtchouk and Pergamenshchikov (2009a).

About this article

Cite this article

Galtchouk, L.I., Pergamenshchikov, S.M. Adaptive efficient analysis for big data ergodic diffusion models. Stat Inference Stoch Process 25, 127–158 (2022). https://doi.org/10.1007/s11203-021-09241-9

Received: 25 May 2020
Accepted: 10 March 2021
Published: 27 March 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s11203-021-09241-9
