Exact adaptive pointwise drift estimation for multidimensional ergodic diffusions


The problem of pointwise adaptive estimation of the drift coefficient of a multivariate diffusion process is investigated. We propose an estimator which is sharp adaptive on scales of Sobolev smoothness classes. The analysis of the exact risk asymptotics makes it possible to identify the impact of the dimension and of other influencing quantities, such as the geometry of the diffusion coefficient, on the prototypical drift estimation problem for a large class of multidimensional diffusion processes. We further sketch generalizations of our results to arbitrary diffusions satisfying suitable Bernstein-type inequalities.



  1. Aït-Sahalia, Y.: Closed-form likelihood expansions for multivariate diffusions. Ann. Stat. 36(2), 906–937 (2008)
  2. Bakry, D., Cattiaux, P., Guillin, A.: Rate of convergence for ergodic continuous Markov processes: Lyapunov versus Poincaré. J. Funct. Anal. 254(3), 727–759 (2008)
  3. Butucea, C.: Exact adaptive pointwise estimation on Sobolev classes of densities. ESAIM: Probab. Stat. 5, 1–31 (2001)
  4. Cattiaux, P., Chafaï, D., Guillin, A.: Central limit theorems for additive functionals of ergodic Markov diffusions processes. ALEA, Lat. Am. J. Probab. Math. Stat. 9(2), 337–382 (2012)
  5. Dalalyan, A.S.: Sharp adaptive estimation of the drift function for ergodic diffusions. Ann. Stat. 33(6), 2507–2528 (2005)
  6. Dalalyan, A.S., Kutoyants, Y.A.: Asymptotically efficient trend coefficient estimation for ergodic diffusion. Math. Methods Stat. 11, 402–427 (2002)
  7. Dalalyan, A.S., Reiß, M.: Asymptotic statistical equivalence for ergodic diffusions: the multidimensional case. Probab. Theory Relat. Fields 137(1), 25–47 (2007)
  8. Guillin, A., Léonard, C., Wu, L., Yao, N.: Transportation-information inequalities for Markov processes. Probab. Theory Relat. Fields 144(3–4), 669–695 (2009)
  9. Itō, K., McKean, H.P.: Diffusion Processes and their Sample Paths. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen, vol. 125. Springer, Berlin (1965)
  10. Klemelä, J., Tsybakov, A.B.: Sharp adaptive estimation of linear functionals. Ann. Stat. 29(6), 1567–1600 (2001)
  11. Klemelä, J., Tsybakov, A.B.: Exact constants for pointwise adaptive estimation under the Riesz transform. Probab. Theory Relat. Fields 129, 441–467 (2004)
  12. Lepski, O.V.: One problem of adaptive estimation in Gaussian white noise. Theory Probab. Appl. 35, 459–470 (1990)
  13. Lezaud, P.: Chernoff and Berry–Esséen inequalities for Markov processes. ESAIM: Probab. Stat. 5, 183–201 (2001)
  14. Liptser, R.S., Shiryaev, A.N.: Statistics of Random Processes. General Theory. Applications of Mathematics: Stochastic Modelling and Applied Probability, vol. 1, 2nd edn. Springer, Berlin (2001)
  15. Metafune, G., Pallara, D., Rhandi, A.: Global properties of invariant measures. J. Funct. Anal. 223(2), 396–424 (2005)
  16. Parzen, E.: On estimation of a probability density function and mode. Ann. Math. Stat. 33(3), 1065–1076 (1962)
  17. Qian, Z., Zheng, W.: A representation formula for transition probability densities of diffusions and applications. Stoch. Process. Appl. 111(1), 57–76 (2004)
  18. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Grundlehren der mathematischen Wissenschaften, vol. 293, 3rd edn. Springer, Berlin (1999)
  19. Royer, G.: An initiation to logarithmic Sobolev inequalities. Collection SMF: Cours spécialisés. American Mathematical Society (2007)
  20. Spokoiny, V.G.: Adaptive drift estimation for nonparametric diffusion model. Ann. Stat. 28(3), 815–836 (2000)
  21. Strauch, C.: Sharp adaptive drift estimation and Donsker-type theorems for multidimensional ergodic diffusions. Ph.D. thesis, Universität Hamburg (2013)
  22. Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Grundlehren der mathematischen Wissenschaften, vol. 233. Springer, Berlin, New York (1979)
  23. Tsybakov, A.B.: Pointwise and \(\sup \)-norm sharp adaptive estimation of functions on the Sobolev classes. Ann. Stat. 26(6), 2420–2469 (1998)
  24. van de Geer, S.: Empirical Processes in M-Estimation. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge (2000)
  25. van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer Series in Statistics. Springer, New York (1996)


The author is grateful to her Ph.D. advisor, Angelika Rohde, and Enno Mammen for constant encouragement and constructive advice. She would also like to thank two anonymous referees for helpful comments which led to a substantial improvement of the paper. This work was partially supported by the DFG Priority Program SPP 1324 (Project: RO 3766/2-1).

Author information



Corresponding author

Correspondence to Claudia Strauch.


Appendix A: Preliminaries

We first give a result which allows us to deduce the exact asymptotics for pointwise estimation of the drift component \(b^j\) from exact results on estimating \(b^j\rho _{b}\), \(j\in \{1,\ldots ,d\}\). In particular, it allows us to identify \(D(\beta ,L;\rho _b,\sigma _0)\) as defined in (1.3) as the optimal normalizing factor for estimating the \(j\)-th component of \(b\in \varPi (c_1,c_2,\sigma _0\mathbf {Id})\). For the detailed derivation of the exact lower bound, we refer to Theorem 2.5.7 in Strauch [21].

Lemma 1

  1. (a)

    There exist two positive constants \(C_1,C_2\) (depending only on \(c_1\), \(c_2\) and \(\sigma )\) such that the invariant density \(\rho _{b}\) satisfies \(\rho _{b}(x) \le C_1\mathrm {e}^{-C_2\Vert x\Vert ^2}\), \(x \in \mathbb {R}^d\), for any \(b\in \varPi (c_1,c_2)\).

  2. (b)

    If \(b\in \widetilde{\varPi }(c_1,c_2)\) and if \(\rho _{b} \in {\mathcal {S}}(\beta +1,L')\), for some \(\beta >d/2\) and \(L' > 0\), then there exists an invariant density estimator \(\widehat{\rho }_T\) such that

    $$\begin{aligned} \mathbf {E}_b|\widehat{\rho }_T(x)-\rho _{b}(x)|^{2} \le K_1 T^{-\frac{\beta +1-(d/2)}{\beta +1}} \exp \left( -K_2\Vert x\Vert \right) \!, \qquad x \in \mathbb {R}^d, \end{aligned}\tag{7.1}$$

    where the constants \(K_1,K_2\) depend only on \(L',c_1,c_2\) and \(\sigma \).


Proof

  1. (a)

    The pointwise upper bound on \(\rho _b\) is an immediate consequence of the results of Metafune et al. [15] who study global regularity properties of invariant measures of divergence-form operators. Their results also hold in our specific framework since we restrict attention to the case of constant, uniformly elliptic diffusion part. Denote by \(\lambda _{\max }\) the largest eigenvalue of \(a\). Due to Corollary 2.5 in Metafune et al. [15], \(\mathrm{(P_1) }\) implies that \(\exp (\eta \Vert x\Vert ^2) \in L^1(\mu _b)\) for \(\eta < c_1\left( 2\lambda _{\max }\right) ^{-1}\). Since \(\left\| b(x)\right\| \le c_2(1+\Vert x\Vert )\lesssim \exp (\Vert x\Vert )\), Theorem 6.1 in Metafune et al. [15] applies and yields the assertion.

  2. (b)

    Let \(G=G_{\beta +1}:\mathbb {R}^d\rightarrow \mathbb {R}\) be the kernel with Fourier transform

    $$\begin{aligned} \phi _G(\lambda ) = \int _{\mathbb {R}^d} \mathrm {e}^{\mathrm{i }\lambda ^\top y} G(y)\mathrm {d}y = \frac{1}{1+\Vert \lambda \Vert ^{2(\beta +1)}}, \quad \lambda \in \mathbb {R}^d, \end{aligned}$$

    and define the invariant density estimator

    $$\begin{aligned} \widehat{\rho }_T(x)=\widehat{\rho }_{T,h}(x) := \frac{1}{T h^d} \int _0^T G\left( \frac{X_u-x}{h}\right) \mathrm {d}u, \quad x \in \mathbb {R}^d. \end{aligned}$$

    The bandwidth \(h = h_T \searrow 0\) is to be specified later. For bounding the stochastic error, note that, using (4.3),

    $$\begin{aligned} \mathbf {E}_b \big |\widehat{\rho }_T(x)-\mathbf {E}_b \widehat{\rho }_T(x)\big |^2&= \frac{1}{T^2h^{2d}}\ \mathrm{Var }_b\Bigg (\int _0^T G\left( \frac{X_u-x}{h}\right) \mathrm {d}u\Bigg ) \\&\le \frac{C}{T h^{2d}} \int _{\mathbb {R}^d} G^2\left( \frac{y-x}{h}\right) \rho _{b}(y)\mathrm {d}y. \end{aligned}$$

    Taking into account the regularity properties of \(G\), a multidimensional version of Theorem 1A in Parzen [16] yields

    $$\begin{aligned} \mathbf {E}_b \big |\widehat{\rho }_T(x)-\mathbf {E}_b \widehat{\rho }_T(x)\big |^2 \le \frac{C}{Th^d} \rho _{b}(x)\Vert G\Vert _{L^2(\mathbb {R}^d)}^2 (1+o_T(1)). \end{aligned}$$

    It remains to treat the bias term. Note that, using in particular Cauchy–Schwarz,

    $$\begin{aligned} \big |\mathbf {E}_b \widehat{\rho }_T(x)-\rho _{b}(x)\big |&= (2\pi )^{-d} \left| \int _{\mathbb {R}^d}\phi _{\rho _{b}}(\lambda )\left\{ \left( 1+\Vert h\lambda \Vert ^{2(\beta +1)}\right) ^{-1}-1\right\} \mathrm {e}^{-\mathrm{i }\lambda ^\top x}\mathrm {d}\lambda \right| \\&\le h^{\beta + 1}\left( (2\pi )^{-d}\int _{\mathbb {R}^d}\left| \phi _{\rho _{b}} (\lambda )\right| ^2\Vert \lambda \Vert ^{2(\beta +1)}\mathrm {d}\lambda \right) ^{1/2}\\&\quad \times \left( (2\pi )^{-d}\int _{\mathbb {R}^d}\frac{\left\| h\lambda \right\| ^{2(\beta +1)}}{\left( 1+\Vert h\lambda \Vert ^{2(\beta +1)}\right) ^2} \mathrm {d}\lambda \right) ^{1/2}\\&\le L' ~ (2\pi )^{-d/2}\left( \int _{\Vert y\Vert \le 1}\frac{\mathrm {d}y}{(1+\Vert y\Vert ^{2\beta })^2} + \int _{\Vert y\Vert >1} \frac{\mathrm {d}y}{\Vert y\Vert ^{2\beta }}\right) ^{1/2}\\&\quad \times h^{\beta +1-d/2} =: M h^{\beta +1-d/2}. \end{aligned}$$

    Specifying \(h = h_T \sim \left( \frac{C\rho _{b}(x)}{M^2T}\right) ^{\frac{1}{2(\beta +1)}}\) and using the upper bound on \(\rho _b(x)\) from part (a), we obtain (7.1). \(\square \)
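The time-average form of the estimator \(\widehat{\rho }_{T,h}\) can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the construction used in the proof: it takes \(d=1\), the Ornstein-Uhlenbeck process \(\mathrm {d}X_t = -X_t\mathrm {d}t + \mathrm {d}W_t\) (invariant law \(N(0,1/2)\)) as a test case, an Euler-Maruyama discretization, and a Gaussian kernel swapped in for \(G\). The function names `simulate_ou` and `kernel_density_estimate` and all numerical values are hypothetical choices made for illustration.

```python
import math
import random


def simulate_ou(T, dt, seed=0):
    """Euler-Maruyama path of dX_t = -X_t dt + dW_t, started from the invariant law."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, math.sqrt(0.5))  # invariant law is N(0, 1/2)
    path = []
    for _ in range(int(T / dt)):
        path.append(x)
        x += -x * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return path


def kernel_density_estimate(path, x, h):
    """Riemann-sum version of (1 / (T h)) * int_0^T K((X_u - x) / h) du for d = 1.

    K is the standard Gaussian kernel (an assumption; the proof uses the kernel G).
    """
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    total = sum(norm * math.exp(-0.5 * ((xu - x) / h) ** 2) for xu in path)
    return total / (len(path) * h)


path = simulate_ou(T=2000.0, dt=0.01, seed=0)
rho_hat = kernel_density_estimate(path, x=0.0, h=0.15)
rho_true = 1.0 / math.sqrt(math.pi)  # density of N(0, 1/2) at 0
print(rho_hat, rho_true)
```

The fixed bandwidth is chosen by hand here; the proof instead couples \(h_T\) to \(T\) and \(\rho _b(x)\) so that the squared bias and the stochastic error are of the same order.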

Denote by \(N_{[\,]}\left( \varepsilon ,\mathcal {F},L^2(\mu _b)\right) \) the \(\varepsilon \)-entropy with bracketing, that is, the smallest number of \(\varepsilon \)-brackets (in \(L^2(\mu _b)\)) which are required to cover \(\mathcal {F}\) (cf. van der Vaart and Wellner [25], Definition 2.1.6).

Lemma 2

  1. (a)

    Let \(b\in \varPi (c_1,c_2,\sigma )\), and suppose that \(X\) satisfies (PI). Fix \(j \in \{1,\ldots ,d\}\), and assume that there exists some positive constant \(B\) such that, for any bounded measurable \(f\in L^2(\mu _b)\),

    $$\begin{aligned} \max \bigg \{\sup _{x \in \mathrm{supp}(f)} |b^j(x)|,\sup _{x \in \mathrm{supp}(f^2)} |b^j(x)|^2\bigg \} \le B. \end{aligned}\tag{7.2}$$

    Then Assumption (BI) is satisfied.

  2. (b)

    Let \(\mathcal {F}\subset L^2(\mu _b)\) be some class of measurable functions \(f: \mathbb {R}^d \rightarrow \mathbb {R}\), and assume that, for some positive constants \(K\) and \(M\), it holds

    $$\begin{aligned} \sup _{f \in \mathcal {F}} \Vert f\Vert _\infty \le K, \quad \sup _{f \in \mathcal {F}}\Vert f\Vert _{L^2(\mu _b)} \le M. \end{aligned}$$

    Grant Assumptions (BI) and (SG). Then, for arbitrary \(T >0\) and any positive \(r\) satisfying, for some positive constants \(K_1\) and \(K_2\),

    $$\begin{aligned} \frac{K_1}{\sqrt{T}}\int _0^1 \max \left\{ \sqrt{\log N_{[\ ]}(\varepsilon ,\mathcal {F},L^2(\mu _b))},1\right\} \mathrm {d}\varepsilon ~ \le ~ r ~ \le ~ \frac{K_2M^2}{K}, \end{aligned}$$

    there exist some positive constants \(C_1\) and \(C_2\) such that


Proof

(a) Letting, for \(r,T>0\),

$$\begin{aligned} \mathsf {p}_T(r) := \mathbf {P}_b\Bigg (\bigg |\frac{1}{T}\int _0^T \left( f(X_u)b^j(X_u) - \int _{\mathbb {R}^d} f(y)b^j(y)\mathrm {d}\mu _b(y)\right) \mathrm {d}u\bigg | > r\Bigg ), \end{aligned}\tag{7.3}$$

Theorem 1.1 in Lezaud [13] implies that

$$\begin{aligned} \mathsf {p}_T(r)&\le 2 \exp \left( -\frac{Tr^2}{2\left( \varsigma _b^2(fb^j) + c_P\Vert fb^j\Vert _\infty r\right) }\right) . \end{aligned}\tag{7.4}$$

Using the spectral gap assumption, we get, for any \(T>0\), \(g \in L^2(\mu _b)\),

$$\begin{aligned} \frac{1}{T} \mathrm{Var }_{\mathbf {P}_b}\left( \int _0^T g(X_u)\mathrm {d}u\right)&\le 2 \int _0^T \big \langle P_tg,g\big \rangle _{\mu _b} \mathrm {d}t \le 2\Vert g\Vert _{L^2(\mu _b)}^2 \int _0^T \mathrm {e}^{-2t/c_P}\mathrm {d}t\\&\le c_P\Vert g\Vert _{L^2(\mu _b)}^2. \end{aligned}$$

Consequently, in view of (7.2),

$$\begin{aligned} \varsigma _b^2(fb^j) = \lim _{T \rightarrow \infty } \frac{1}{T} \mathrm{Var }_{\mathbf {P}_b}\left( \int _0^T (fb^j)(X_u)\mathrm {d}u\right) \le c_P\Vert fb^j\Vert _{L^2(\mu _b)}^2 \le c_P B \Vert f\Vert _{L^2(\mu _b)}^2 \end{aligned}$$

and \(\Vert fb^j\Vert _\infty \le B \Vert f\Vert _\infty \). Plugging these estimates into (7.4), we obtain the asserted inequality.

(b) Under the given assumptions, Bernstein’s inequality for continuous martingales can be used to show that there exists some constant \(\widetilde{C}_B\) such that, for any \(r,T>0\),

$$\begin{aligned} \mathsf {q}_T(r)&:= \mathbf {P}_b\Bigg (\bigg |\frac{1}{T}\int _0^T f(X_u)\mathrm {d}X^j_u - \int _{\mathbb {R}^d} f(y)b^j(y)\mathrm {d}\mu _b(y)\bigg | > r \Bigg ) \\&\le 2 \exp \left( -\frac{Tr^2}{2\widetilde{C}_B\big (\Vert f\Vert _ {L^2(\mu _b)}^2+\Vert f\Vert _\infty \big )}\right) . \end{aligned}\tag{7.5}$$

To see this, write \(\mathsf {q}_T(r) \le \mathsf {p}_T(r/2) + \mathsf {p}'_T(r/2)\), for \(\mathsf {p}_T(\cdot )\) introduced in (7.3) and

$$\begin{aligned} \mathsf {p}'_T(r)&:= \mathbf {P}_b\Bigg (\bigg |\frac{1}{T}\int _0^T f(X_u)\sum _{k=1}^d \sigma _{jk}\mathrm {d}W^k_u\bigg | > r\Bigg ), \quad r>0. \end{aligned}$$

Letting \(M_t(f) := \int _0^t f(X_u)\sum _{k=1}^d \sigma _{jk}~ \mathrm {d}W_u^k\), \(t \ge 0\), and denoting by \(\langle M\rangle _\cdot \) the quadratic variation of the martingale \(M\), Bernstein’s inequality for continuous martingales (see p. 154 in Revuz and Yor [18]) gives

$$\begin{aligned} \mathsf {p}'_T(r/2)&\le \mathbf {P}_b\Big (\big |M_T(f)\big | > Tr/2;\ \langle M(f)\rangle _T \le Tr\Vert f\Vert _\infty /2\Big )\\&\quad + \mathbf {P}_b\Big (\langle M(f)\rangle _T > Tr\Vert f\Vert _\infty /2\Big )\\&\le 2 \exp \left( -\frac{Tr}{4 \Vert f\Vert _\infty }\right) + \underbrace{\mathbf {P}_b\left( T^{-1}\int _0^T f^2(X_u) \mathrm {d}u > a_{jj}^{-1} r\Vert f\Vert _\infty /2\right) }_{=: \mathsf {p}''_T(r)}. \end{aligned}$$

Theorem 1.1 in Lezaud [13] then can be used to show that

$$\begin{aligned} \mathsf {p}''_T(r)&\le \exp \left( -\frac{Tr^2}{8 c_P a_{jj}\big ( a_{jj} \ \Vert f\Vert _{L^2(\mu _b)}^2 + \Vert f\Vert _\infty /2\big )}\right) . \end{aligned}$$

The inequality (7.5) now follows for \(\widetilde{C}_B := 4\max \big \{2c_P,2c_Pa_{jj}^2,c_Pa_{jj},1\big \}\). In view of (7.5), a uniform exponential inequality in the spirit of Theorem 5.11 in van de Geer [24] is available. Indeed, Theorem 5.11 in van de Geer [24] is a special case of the uniform inequality for martingales in Theorem 8.13 of van de Geer [24], and the proof of Theorem 8.13 continues to hold in the diffusion setting if the Bernstein inequality for martingales in Corollary 8.10 of van de Geer [24] is replaced with the Bernstein-type deviation inequality (BI). \(\square \)
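The martingale step in part (b) uses a bound of the form \(\mathbf {P}(|M_T|>x;\ \langle M\rangle _T\le y)\le 2\exp (-x^2/(2y))\). As a sanity check in the simplest case, and only as an illustration, take \(M=W\) a standard Brownian motion: then \(\langle W\rangle _T = T\), the constraint with \(y=T\) is always satisfied, and the left-hand side is the exact Gaussian tail \(\mathrm {erfc}(x/\sqrt{2T})\). The helper names below are hypothetical.

```python
import math


def gaussian_two_sided_tail(x, T):
    """Exact P(|W_T| > x) for a standard Brownian motion W."""
    return math.erfc(x / math.sqrt(2.0 * T))


def bernstein_bound(x, y):
    """Upper bound 2 * exp(-x^2 / (2 y)) on P(|M_T| > x, <M>_T <= y)."""
    return 2.0 * math.exp(-x * x / (2.0 * y))


T = 4.0
rows = [
    (x, gaussian_two_sided_tail(x, T), bernstein_bound(x, T))
    for x in (0.5, 1.0, 2.0, 4.0, 8.0)
]
for x, exact, bound in rows:
    print(f"x={x:4.1f}  exact={exact:.3e}  bound={bound:.3e}")
```

In every row the exact tail lies below the bound, which is the relation exploited in the estimate of \(\mathsf {p}'_T(r/2)\) with \(x = Tr/2\) and \(y = Tr\Vert f\Vert _\infty /2\).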

For the proof of the following lemma, we refer to the proofs of Proposition 1 in Klemelä and Tsybakov [10] and of Lemma 10 in Tsybakov [23].

Lemma 3

Let \(\beta >d/2\), and, for \(\mathbb {I}_\beta \), \(\widetilde{K}_\beta (\cdot )\) and \(\mathrm {b}= \mathrm {b}(\beta )\) defined according to (1.4), (4.7) and (4.8), respectively, let \(K_\beta ^*(x):= \mathbb {I}_\beta ^{-1} \mathrm {b}^{-\beta +d/2} \ \widetilde{K}_\beta (\mathrm {b}x)\).

  1. (a)

    (cf. Proposition 1 in Klemelä and Tsybakov [10]) It holds

    $$\begin{aligned} \widetilde{K}_\beta (0) = (2\pi )^{-d} \int _{\mathbb {R}^d} \left( 1+\Vert \lambda \Vert ^{2\beta }\right) ^{-1}\mathrm {d}\lambda = \frac{2\beta }{d}\ \mathbb {I}_\beta ^2, \end{aligned}$$

    and, for \(K_\beta (x) = \mathrm {b}^d \widetilde{K}_\beta (\mathrm {b}x)\), \(\big \Vert K_\beta \big \Vert _{L^2(\mathbb {R}^d)} = \mathbb {I}_\beta \left( \frac{2\beta - d}{d}\right) ^{\frac{\beta +d/2}{2\beta }} = \mathbb {I}_\beta \ \mathrm {b}^{\beta +d/2}\).

  2. (b)

    (cf. Lemma 10 in Tsybakov [23]) For fixed \(\delta \in (0,1)\), there exists some compactly supported modification \(\overline{K}_\beta \) of \(K_\beta ^*\) which enjoys the following properties,

    $$\begin{aligned} \big \Vert \overline{K}_\beta \big \Vert _{L^2(\mathbb {R}^d)}&\le 1-\delta /2, \end{aligned}\tag{K1}$$
    $$\begin{aligned} \eta _\beta (\overline{K}_\beta )&\le 1-\delta /2, \end{aligned}\tag{K2}$$
    $$\begin{aligned} (1-\delta /2) K_\beta ^*(0)&\le \overline{K}_\beta (0) \le K_\beta ^*(0). \end{aligned}\tag{K3}$$
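The value \(\widetilde{K}_\beta (0) = (2\pi )^{-d}\int _{\mathbb {R}^d}(1+\Vert \lambda \Vert ^{2\beta })^{-1}\mathrm {d}\lambda \) from part (a) can be checked numerically in dimension \(d=1\), where the classical integral \(\int _0^\infty (1+x^n)^{-1}\mathrm {d}x = \pi /(n\sin (\pi /n))\) gives \(\widetilde{K}_\beta (0) = 1/(2\beta \sin (\pi /(2\beta )))\). The sketch below is an illustration only: the restriction to \(d=1\), the substitution \(\lambda = t/(1-t)\), the midpoint rule, and the function names are choices made here, not part of the paper.

```python
import math


def k_tilde_at_zero_numeric(beta, n=20000):
    """(2 pi)^{-1} * int_R (1 + |lam|^(2 beta))^{-1} d lam for d = 1.

    Uses lam = t / (1 - t) to map [0, inf) onto [0, 1) and the midpoint rule.
    """
    p = 2.0 * beta
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += (1.0 - t) ** (p - 2.0) / ((1.0 - t) ** p + t ** p)
    half_line = total / n  # approximates int_0^inf d lam / (1 + lam^p)
    return 2.0 * half_line / (2.0 * math.pi)


def k_tilde_at_zero_closed(beta):
    """Closed form 1 / (2 beta sin(pi / (2 beta))), valid for d = 1 and beta > 1/2."""
    return 1.0 / (2.0 * beta * math.sin(math.pi / (2.0 * beta)))


for beta in (1.0, 1.5, 2.0, 3.0):
    print(beta, k_tilde_at_zero_numeric(beta), k_tilde_at_zero_closed(beta))
```

By the identity in part (a), the corresponding value of \(\mathbb {I}_\beta ^2\) for \(d=1\) is then \(\widetilde{K}_\beta (0)/(2\beta )\).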

Appendix B: Proofs

B.1: Lower bound

Proof (of Theorem 1)

Let \(\psi _{\beta ,L} = \psi ^j_{\beta ,L} := \psi _{T,\beta } C_j(\beta ,L;\rho _{b},\sigma )\), for \(\psi _{T,\beta }\) and \(C_j(\beta ,L;\rho _{b},\sigma )\) defined in (3.1) and (3.3), respectively. To lighten notation, the dependence on the coordinate \(j \in \{1,\ldots ,d\}\) will be mostly suppressed in the sequel.

(I) Construction of the hypotheses. Let \(L \in [L_*,L^*]\), fix some nondegenerate \(\mathbb {R}^{d\times d}\)-matrix \(\sigma \), let \(a:=\sigma \sigma ^\top \), and consider some positive density function

$$\begin{aligned} \rho \in C^\infty (\mathbb {R}^d)\cap {\mathcal {S}}\left( \beta _T+1,L'\right) \cap {\mathcal {S}}\left( \beta _*+1,L'\right) , \quad \text {where } L' := 2L\left( \sum _{k=1}^d a^2_{jk}\right) ^{-1/2}. \end{aligned}$$

Fix \(\delta _{0} \in (0,1/2)\), \(c_1,c_2>0\), and assume that \(\rho \) is such that the function

$$\begin{aligned} \rho _{T,0}(x) := \delta _{0}^{1/(\beta _{*}+3/2)}\ \rho \left( x\delta _{0}^{1/(\beta _{*}+3/2)}\right) , \quad x \in \mathbb {R}^d, \end{aligned}$$

satisfies \(\rho _{T,0}(x_0) \ge \rho _T^*\) and, for any \(x\) with \(\Vert x\Vert \) large enough,

$$\begin{aligned} \langle a\nabla \log \rho _{T,0}(x),\ x\rangle \le -2c_1\Vert x\Vert ^2, \end{aligned}$$

and for any \(x\in \mathbb {R}^d\),

$$\begin{aligned} \Big |\sum _{k=1}^da_{jk}\partial _k\rho _{T,0}(x)\Big |\le 2\quad \text { and }\quad \Vert \nabla \log \rho _{T,0}(x)\Vert \le 2 \Vert a\Vert _{S_2}^{-1}c_2. \end{aligned}$$

Consequently, \(a\nabla (\log \rho _{T,0})/2=:b_{T,0}\in \varPi (c_1,c_2,\sigma )\). In particular, the SDE \(\mathrm {d}X_t = b_{T,0}(X_t)\mathrm {d}t +\sigma ~ \mathrm {d}W_t\), \(t\ge 0\), admits a strong solution with Lebesgue continuous invariant measure and invariant density \(\rho _{T,0}\). Define further

$$\begin{aligned} g_{T,0}(x) := \frac{1}{2}\sum _{k=1}^d a_{jk} \partial _k \rho _{T,0}(x), \quad x \in \mathbb {R}^d. \end{aligned}$$

For \(\beta >d/2\), consider \(\widetilde{K}_\beta \) and \(\mathrm {b}= \mathrm {b}(\beta )\) as introduced in (4.7) and (4.8), respectively, and denote again \(K_\beta ^*(x)=\mathbb {I}_\beta ^{-1} \mathrm {b}^{-\beta + d/2} ~ \widetilde{K}_\beta (\mathrm {b}x)\). Lemma 3 implies that

$$\begin{aligned} K^*_\beta (0) = \mathbb {I}_\beta ^{-1} \mathrm {b}^{-\beta + d/2} ~ \widetilde{K}_\beta (0) = \frac{2\beta }{d}\left( \frac{d}{2\beta -d}\right) ^{\frac{\beta -d/2}{2\beta }}\ \mathbb {I}_\beta \end{aligned}$$

and

$$\begin{aligned} \Vert K_\beta ^*\Vert _{L^2(\mathbb {R}^d)} = \mathbb {I}_\beta ^{-1} \mathrm {b}^{-\beta + d/2} \left( \int _{\mathbb {R}^d}\widetilde{K}_\beta ^2(\mathrm {b}x)\mathrm {d}x\right) ^{1/2} = 1. \end{aligned}$$

Denote by \(\overline{K}_\beta \) the compactly supported modification of \(K_\beta ^*\) from Lemma 3 satisfying (K1), (K2), and (K3) for \(\delta = \delta _0\). Define the function \(g_{T,\beta _*}: \mathbb {R}^d \rightarrow \mathbb {R}\) such that, for any \(k \in \{1,\ldots ,d\}\),

$$\begin{aligned} \partial _k g_{T,\beta _*}(x) =2La_{jj}^{-1} h_{T,\beta _*}^{\beta _*-d/2} ~ \overline{K}_{\beta _*}\left( \frac{x-x_0}{h_{T,\beta _*}}\right) \ \delta _{kj}, \qquad x \in \mathbb {R}^d, \end{aligned}\tag{8.1}$$

where

$$\begin{aligned} h_{T,\beta _*}:= \left( \frac{d\rho _{T,0} (x_0) a_{jj} \log T}{\beta _*L^2T}\right) ^{1/(2\beta _*)}. \end{aligned}\tag{8.2}$$

Let

$$\begin{aligned} \rho _{T,1}(x) := \rho _{T,0} (x)\bigg (1-\int _{\mathbb {R}^d}g_{T,\beta _*}(y)\mathrm {d}y\bigg ) + g_{T,\beta _*}(x), \end{aligned}$$

and consider the hypothesis \(g_{T,1}\), defined as \(g_{T,1}:= \sum _{k=1}^d a_{jk}\partial _k\rho _{T,1}/2\). The function \(b_{T,1}:\mathbb {R}^d\rightarrow \mathbb {R}^d\) is taken as \(b_{T,1} := a\nabla (\log \rho _{T,1})/2\). Note that, for \(T\) large enough,

$$\begin{aligned} \frac{\rho _{T,0}(x_0)}{\rho _{T,1}(x_0)} \le \frac{\rho _{T,0} (x_0)}{\rho _{T,0} (x_0)\left( 1-\int _{\mathbb {R}^d}g_{T,\beta _*}(y)\mathrm {d}y\right) } \le 1+ \delta _{0}/2. \end{aligned}\tag{8.3}$$

Plugging in the respective definitions of the hypotheses, it can be shown that \(g_{T,0}\in {\mathcal {S}}(\beta _T,L)\), \(g_{T,1} \in {\mathcal {S}}(\beta _*,L)\) and \(b_{T,1}\in \varPi (c_1,c_2,\sigma )\). The above definitions of the hypotheses further imply that \(\rho _{T,0}\in {\mathcal {S}}(\beta _T+1,L'), \rho _{T,1}\in {\mathcal {S}}(\beta _*+1,L')\) and

$$\begin{aligned} 2b_{T,i}^j\rho _{T,i} = \sum _{k=1}^d a_{jk}\partial _k\big (\log \rho _{T,i}\big )\rho _{T,i} = \sum _{k=1}^d a_{jk}\partial _k \rho _{T,i} , \quad i \in \{0,1\}. \end{aligned}$$

Summing up, \(\rho _{T,0}\in \varSigma _T(\beta _T,L)\) and \(\rho _{T,1}\in \varSigma _T(\beta _*,L)\).

(II) A version of Theorem 6(i) in Tsybakov [23]. The central ingredient of the proof is a special case of Theorem 6(i) in Tsybakov [23]. It will be applied in the following situation: Denote by \(\mathbf {E}_i = \mathbf {E}_{b_{T,i}}\) expectation under the measure \(\mathbf {P}_i = \mathbf {P}_{b_{T,i}}\) associated with the hypothesis \(b=b_{T,i}\), \(i \in \{0,1\}\), and note that

$$\begin{aligned}&\inf _{\widehat{g}_T} \sup _{(\beta ,L)\in {\mathcal {B}}_T}\sup _{b\in \varPi (c_1,c_2,\sigma )} \sup _{\rho _b\in \varSigma _T(\beta ,L)}\psi _{\beta ,L}^{-2} ~ \mathbf {E}_b\big |\widehat{g}_T(x_0)- (b^j\rho _{b})(x_0)\big |^2 \\&\quad \ge \inf _{\widehat{g}_T}\max \Bigg \{\sup _{b\in \varPi (c_1,c_2,\sigma )}\sup _ {\rho _b\in \varSigma _T(\beta _T,L)}\psi _{\beta _T,L}^{-2}~ \mathbf {E}_b\big |\widehat{g}_T(x_0)- (b^j\rho _{b})(x_0)\big |^2, \\&\quad \qquad \qquad \qquad \qquad \qquad \sup _{b\in \varPi (c_1,c_2,\sigma )} \sup _{\rho _b\in \varSigma _T(\beta _*,L)}\psi _{\beta _*,L}^{-2}~ \mathbf {E}_b\big |\widehat{g}_T(x_0)\!-\! (b^j\rho _{b})(x_0)\big |^2\Bigg \} \\&\quad \ge \inf _{\widehat{g}_T}\max \left\{ \mathbf {E}_0 \left[ \psi _{\beta _T,L}^{-2}~ \big |\widehat{g}_T(x_0)\!-\!g_{T,0} (x_0)\big |^2\right] ,\, \mathbf {E}_1\left[ \psi _{\beta _*,L}^{-2}~ \big |\widehat{g}_T(x_0)\!-\! g_{T,1}(x_0)\big |^2\right] \right\} \\&\quad = \inf _{\widehat{T}_T}\max \left\{ \mathbf {E}_0\big |Q_T\widehat{T}_T\big |^2,\ \mathbf {E}_1\big |\widehat{T}_T-\theta _1\big |^2\right\} , \end{aligned}\tag{8.4}$$

where \(Q_T := \psi _{\beta _*,L}\psi _{\beta _T,L}^{-1}\), \(\widehat{T}_T:= \psi _{\beta _*,L}^{-1}\big (\widehat{g}_T(x_0)-g_{T,0}(x_0)\big )\), and

$$\begin{aligned} \theta _1:=\psi _{\beta _*,L}^{-1}\left( g_{T,1}(x_0)-g_{T,0}(x_0)\right) . \end{aligned}$$

The proof of the following lemma is completely along the lines of the proof of Theorem 6(i) in Tsybakov [23].

Lemma 4

(Theorem 6(i) in Tsybakov [23]) Consider \(Q_T\), \(\widehat{T}_T\) and \(\theta _1\) as introduced above, and assume that \(\theta _1 \in \mathbb {R}\) satisfies

$$\begin{aligned} \big |\theta _1\big | \ge 1-\delta _{0}. \end{aligned}\tag{A1}$$

If \(\mathbf {P}_0,\mathbf {P}_1\) are such that \(\mathbf {P}_0 \ll \mathbf {P}_1\) and, for \(\tau > 0\) and \(\alpha \in (0,1)\) fixed,

$$\begin{aligned} \mathbf {P}_1\left( \frac{\mathrm {d}\mathbf {P}_0}{\mathrm {d}\mathbf {P}_1}\ge \tau \right) \ge 1-\alpha , \end{aligned}\tag{A2}$$

then

$$\begin{aligned} \inf _{\widehat{T}_T}\max \left\{ \mathbf {E}_0\big |Q_T\widehat{T}_T\big |^2,\ \mathbf {E}_1\big |\widehat{T}_T-\theta _1\big |^2\right\} \ge \frac{(1-\alpha )\tau (1-2\delta _{0})^2\big (Q_T\delta _{0}\big )^2}{(1-2\delta _{0})^2+\tau \big (Q_T\delta _{0}\big )^2}, \end{aligned}$$

where the infimum is taken over all \(\widehat{T}_T= \psi _{\beta _*,L}^{-1}\big (\widehat{g}_T(x_0)-g_{T,0}(x_0)\big )\).

We proceed with verifying (A1) and (A2). Note first that

$$\begin{aligned} \Big |\frac{1}{2}\sum _{k=1}^d a_{jk}\partial _k g_{T,\beta _*}(x_0)\Big |&= L h_{T,\beta _*}^{\beta _*-d/2} \overline{K}_{\beta _*}(0)\mathop {\ge }\limits ^{(K3)} L h_{T,\beta _*}^{\beta _*-d/2} \left( 1-\delta _{0}/2\right) ~ K^*_{\beta _*}(0)\\&= \left( 1-\delta _{0}/2\right) L \left( \frac{d^2 a_{jj} \rho _{T,0} (x_0)\log T}{\beta _*(2\beta _*-d)L^2 T}\right) ^{\frac{\beta _*-d/2}{2\beta _*}} \frac{2\beta _*}{d} ~ \mathbb {I}_{\beta _*}\\&= \left( 1-\delta _{0}/2\right) ~ C_j(\beta _*, L; \rho _{T,0},\sigma ) \, \psi _{T,\beta _*}. \end{aligned}$$

Since, for \(T\) large enough, \(\psi _{\beta _*,L}^{-1}\int _{\mathbb {R}^d}g_{T,\beta _*}(y) \mathrm {d}y \le \delta _0/2\), this implies

$$\begin{aligned} \big |\theta _1\big |&\ge \psi _{\beta _*,L}^{-1}\ \bigg |\frac{1}{2}\sum _{k=1}^d a_{jk}\bigg (\partial _k g_{T,\beta _*}(x_0)- \partial _k\rho _{T,0} (x_0)\int _{\mathbb {R}^d}g_{T,\beta _*}(y) \mathrm {d}y\bigg )\bigg | \\&\ge 1-\delta _{0}. \end{aligned}\tag{8.5}$$

Denote by \(Y\) the solution of the SDE \(\mathrm {d}Y_t = b_{T,1}(Y_t)\mathrm {d}t+\sigma ~\mathrm {d}W_t\). In order to verify (A2), note that the specifications on pp. 296–297 in Liptser and Shiryaev [14] imply that the likelihood ratio under \(\mathbf {P}_1\) is given by

$$\begin{aligned} \frac{\mathrm {d}\mathbf {P}_0}{\mathrm {d}\mathbf {P}_1}(Y^T)&= \frac{\rho _{T,0}}{\rho _{T,1}}(Y_0)\exp \Bigg (-\frac{1}{2}\int _0^T (b_{T,0} -b_{T,1})^\top (Y_u)a^{-1}(b_{T,0} -b_{T,1})(Y_u)\mathrm {d}u \\&\qquad \qquad \qquad \qquad \qquad \qquad \quad \quad +\int _0^T \left( \sigma ^{-1}\left( b_{T,0} -b_{T,1}\right) \right) ^\top (Y_u)\mathrm {d}W_u\Bigg ). \end{aligned}\tag{8.6}$$

To proceed, set

$$\begin{aligned} M_t := \int _0^t \left( \sigma ^{-1}\left( b_{T,0} -b_{T,1}\right) \right) ^\top (Y_u)\mathrm {d}W_u,\quad t \ge 0, \end{aligned}$$

denote \(g := (b_{T,0} -b_{T,1})^\top a^{-1}(b_{T,0} -b_{T,1})\), and consider the following stationary sequence of random variables,

$$\begin{aligned} Z_k := \int _{(k-1)t}^{kt}g(Y_u)\mathrm {d}u, \quad k \ge 1. \end{aligned}$$

Since \(g \in L^1(\mathbf {P}_1)\), it follows from the ergodic theorem that, for any \(t>0\),


$$\begin{aligned} c:&= \mathbf {E}_1\left[ (b_{T,0} -b_{T,1})^\top (Y_0)a^{-1}(b_{T,0} -b_{T,1})(Y_0)\right] \\&= \mathbf {E}_1\left\| \sigma ^{-1}(b_{T,0} -b_{T,1})(Y_0)\right\| ^2. \end{aligned}\tag{8.7}$$

In particular, this implies by means of the martingale CLT that, for some standard Brownian motion \(W\),


Denoting by \([s]\) the integer part of \(s\) and considering an arbitrary sequence \(\gamma (s)\rightarrow _{s\rightarrow \infty }0\), it holds

Choosing \(t\equiv 1\) in (8.8) and passing to the continuous-time case, we obtain

It is verified by straightforward algebra that the definition of the hypotheses \(b_{T,0} \) and \(b_{T,1}\) entails that

$$\begin{aligned} 2\sigma ^{-1}\left( b_{T,0} -b_{T,1}\right)&= \sigma ^\top \nabla \left( \log \frac{\rho _{T,0} }{\rho _{T,1}}\right) = \frac{g_{T,\beta _*} \sigma ^\top \nabla \left( \log \rho _{T,0} \right) - \sigma ^\top \nabla g_{T,\beta _*}}{\rho _{T,1}}. \end{aligned}$$

The definition of \(g_{T,\beta _*}\) further implies that, using in particular (8.1),

$$\begin{aligned} \mathbf {E}_1\left[ \frac{\big \Vert \sigma ^\top \nabla g_{T,\beta _*}(Y_0)\big \Vert ^2}{4\rho _{T,1}^2(Y_0)}\right]&= \int _{\mathbb {R}^d}\frac{\big \Vert \sigma ^\top \nabla g_{T,\beta _*}(y)\big \Vert ^2}{4\rho _{T,1}(y)}\mathrm {d}y = a_{jj} \int _{\mathbb {R}^d}\frac{\big (\partial _j g_{T,\beta _*}(y)\big )^2}{4\rho _{T,1}(y)}\mathrm {d}y \\&= a_{jj}^{-1} L^2\ h_{T,\beta _*}^{2\beta _*-d} \int _{\mathbb {R}^d}\overline{K}_{\beta _*}^2\left( \frac{y-x_0}{h_{T,\beta _*}}\right) \frac{\mathrm {d}y}{\rho _{T,1}(y)}\\&= a_{jj}^{-1} L^2 \ h_{T,\beta _*}^{2\beta _*} \int _{\mathbb {R}^d}\overline{K}_{\beta _*}^2(y)\mathrm {d}y \ \frac{\big (1+o_T(1)\big )}{\rho _{T,1}(x_0)} \, \\&\le \left( 1-\delta _{0}/2\right) ^2 a_{jj}^{-1} L^2 h_{T,\beta _*}^{2\beta _*}\frac{\big (1+o_T(1)\big )}{\rho _{T,1}(x_0)}. \end{aligned}$$

The last inequality follows from (K1). Thus, plugging in the definition of \(h_{T,\beta _*}\) [see (8.2)] and using (8.3),

$$\begin{aligned} \mathbf {E}_1\left[ \frac{\big \Vert \sigma ^\top \nabla g_{T,\beta _*}(Y_0)\big \Vert ^2}{4\rho _{T,1}^2(Y_0)}\right]&\le \left( 1-\delta _{0}/2\right) ^2 L^2 \left( \frac{d \rho _{T,0} (x_0)a_{jj} \log T}{\beta _*L^2 T}\right) \ \frac{\big (1+o_T(1)\big )}{a_{jj} \rho _{T,1}(x_0)}\\&\le \left( 1-\delta _{0}^2/4\right) ^2 \frac{d\log T}{T\beta _*}\, \big (1+o_T(1)\big ). \end{aligned}$$

It can be shown by analogous arguments that the terms

$$\begin{aligned} \mathbf {E}_1\left[ \frac{g_{T,\beta _*}^2(Y_0)\Vert \sigma ^\top \nabla (\log \rho _{T,0} )(Y_0)\Vert ^2}{\rho _{T,1}^2(Y_0)}\right] \end{aligned}$$

and

$$\begin{aligned} \mathbf {E}_1\left[ \frac{g_{T,\beta _*}(Y_0) \left( \nabla (\log \rho _{T,0} )(Y_0)\right) ^\top a\nabla g_{T,\beta _*}(Y_0)}{\rho _{T,1}^2(Y_0)}\right] \end{aligned}$$

are asymptotically negligible. Thus, for \(c\) defined in (8.7) and whenever \(\delta _{0}\) is small and \(T\) is large enough,

$$\begin{aligned} c \le \left( 1-\delta _{0}^2/4\right) ^2 \frac{d}{\beta _*} \ \frac{\log T}{T}\ \big (1+o_T(1)\big ). \end{aligned}$$

Consequently, for \(\tau := \exp \left( -\frac{\left( 1-\delta _{0}^2/4\right) d\log T}{2\beta _*}\right) \), it holds a.s.

$$\begin{aligned} \frac{\log \tau -\log \rho _{T,0} (Y_0)+\log \rho _{T,1}(Y_0)+\frac{1}{2}\langle M\rangle _T}{\sqrt{Tc}} \le -\frac{\delta _{0}^2}{8} \sqrt{\frac{d\log T}{\beta _*}} + o_T(1) \rightarrow -\infty . \end{aligned}$$
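The constant \(\delta _0^2/8\) in the last display can be traced as follows. This is a sketch to leading order in \(T\), assuming that \(\langle M\rangle _T\) may be replaced by its ergodic limit \(Tc\), that the factor \(1+o_T(1)\) in the bound on \(c\) is ignored, and that the terms \(\log \rho _{T,0}(Y_0)\) and \(\log \rho _{T,1}(Y_0)\) are negligible after division by \(\sqrt{Tc}\). The abbreviations \(\kappa := 1-\delta _0^2/4\), \(u_T := d\log T/\beta _*\) and \(s := \sqrt{Tc}\) are introduced here for illustration only. With this notation, \(\log \tau = -\kappa u_T/2\) and \(s \le \kappa \sqrt{u_T}\). Since \(s \mapsto -\kappa u_T/(2s) + s/2\) is increasing on \((0,\infty )\), the largest admissible value of \(s\) gives

$$\begin{aligned} \frac{\log \tau + \frac{1}{2}Tc}{\sqrt{Tc}} = -\frac{\kappa u_T}{2s} + \frac{s}{2} \le -\frac{\sqrt{u_T}}{2} + \frac{\kappa \sqrt{u_T}}{2} = -\frac{1-\kappa }{2}\sqrt{u_T} = -\frac{\delta _0^2}{8}\sqrt{\frac{d\log T}{\beta _*}}. \end{aligned}$$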

The verification of (A2) is accomplished by means of a tightness argument. Consider some sequence of probability measures \((\mathbf {P}_n)_{n\ge 1}\) on some measurable space, converging weakly to some probability measure \(\mathbf {P}\). Tightness of \(\mathbf {P}_n\) implies that, for any sequence \(\gamma _n\rightarrow -\infty \),

$$\begin{aligned} \lim _{m\rightarrow \infty }\max \left\{ \mathbf {P}\left( (-\infty ,\gamma _m)\right) ,\ \sup _{n \in \mathbb {N}}\mathbf {P}_n\left( (-\infty ,\gamma _m)\right) \right\} = 0. \end{aligned}$$

Thus, \(\lim _{m\rightarrow \infty }\sup _{n\in \mathbb {N}}\mathbf {P}_n\left( (-\infty ,\gamma _m)\right) = 0\) and \(\lim _{m\rightarrow \infty }\inf _{n\in \mathbb {N}}\mathbf {P}_n\left( [\gamma _m,\infty )\right) =1\). In particular,

$$\begin{aligned} \lim _{m \rightarrow \infty }\mathbf {P}_m\left( (\gamma _m,\infty )\right) = 1. \end{aligned}$$

In the current framework, this last assertion implies that, for

$$\begin{aligned} \gamma _T := \frac{\log \tau -\log \rho _{T,0} (Y_0)+\log \rho _{T,1}(Y_0)+\frac{1}{2}\langle M\rangle _T}{\sqrt{Tc}} \rightarrow -\infty , \end{aligned}$$

one has [plugging in (8.6)]

$$\begin{aligned} \lim _{T\rightarrow \infty }\mathbf {P}_1\left( \frac{\mathrm {d}\mathbf {P}_0}{\mathrm {d}\mathbf {P}_1}\ge \tau \right)&= \lim _{T\rightarrow \infty }\mathbf {P}_1 \left( \frac{\rho _{T,0}}{\rho _{T,1}}(Y_0)\exp \Big (M_T-\frac{1}{2}\langle M\rangle _T\Big )\ge \tau \right) \\&= \lim _{T\rightarrow \infty }\mathbf {P}_1\left( \frac{M_T}{\sqrt{Tc}}\ge \gamma _T\right) =1. \end{aligned}$$

For large enough \(T\) and fixed \(\tau >0\) (where \(\delta _{0}\in (0,1)\) can be chosen arbitrarily small), we thus obtain

$$\begin{aligned} \mathbf {P}_1\left( \frac{\mathrm {d}\mathbf {P}_0}{\mathrm {d}\mathbf {P}_1}\ge \tau \right) \ge 1-\delta _{0}. \end{aligned}$$

(III) Completion of the proof. In view of (8.5) and (8.9), Lemma 4 gives

$$\begin{aligned}&\inf _{\widehat{g}_T}\sup _{(\beta ,L)\in {\mathcal {B}}_T} \sup _{b\in \varPi (c_1,c_2,\sigma )} \sup _{\rho _b\in \Sigma _T(\beta ,L)}\mathbf {E}_b\big | \widehat{g}_T(x_0)-(b^j\rho _{b})(x_0)\big |^2 \psi _{\beta ,L}^{-2} \\&\quad \mathop {\ge }\limits ^{(8.4)}\inf _{\widehat{T}_T}\max \left\{ \mathbf {E}_0\big |Q_T\widehat{T}_T\big |^2,\ \mathbf {E}_1\big |\widehat{T}_T-\theta _1\big |^2\right\} \\&\quad \; \ge \frac{(1-\delta _{0})\tau (1-2\delta _{0})^2\big (Q_T\delta _{0}\big )^2}{(1-2\delta _{0})^2+\tau \big (Q_T\delta _{0}\big )^2}. \end{aligned}$$

Since, for \(C := C_j(\beta _*,L;\rho _{T,0},\sigma )/ C_j(\beta _T,L;\rho _{T,0},\sigma )\),

$$\begin{aligned} Q_T = \frac{\psi _{\beta _*,L}}{\psi _{\beta _T,L}} = C\exp \left( -\frac{d}{4\beta _*\beta _T}\left( \beta _*-\beta _T\right) (\log T-\log \log T)\right) , \end{aligned}$$

we have

$$\begin{aligned} \tau Q_T^2 = C^2 \exp \left( \frac{d\left( \delta _{0}^2\beta _T/4- \beta _*\right) }{2\beta _*\beta _T} \log T\right) \times \exp \left( \frac{d\left( \beta _*-\beta _T\right) }{2\beta _*\beta _T}\log \log T\right) . \end{aligned}$$
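As a quick check of the last display (a verification sketch only, using nothing beyond the definition of \(\tau \) and the expression for \(Q_T\) given above), the coefficient of \(\log T\) in the exponent of \(\tau Q_T^2\) is

$$\begin{aligned} -\frac{\left( 1-\delta _{0}^2/4\right) d}{2\beta _*} - \frac{d\left( \beta _*-\beta _T\right) }{2\beta _*\beta _T} = \frac{d}{2\beta _*\beta _T}\left( -\beta _T+\frac{\delta _{0}^2}{4}\beta _T+\beta _T-\beta _*\right) = \frac{d\left( \delta _{0}^2\beta _T/4-\beta _*\right) }{2\beta _*\beta _T}. \end{aligned}$$

Assuming \(\beta _T\rightarrow \infty \) (as for the choice \(\beta _T = (\log \log T)^\delta \) appearing later in this appendix) and that \(C\) stays bounded away from zero (taken for granted in this sketch), the coefficient tends to \(d\delta _{0}^2/(8\beta _*)>0\), whereas the second factor equals \((\log T)^{d(\beta _*-\beta _T)/(2\beta _*\beta _T)}\) and thus only contributes a power of \(\log T\).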

As \(T \rightarrow \infty \), \(\tau Q_T^2 \rightarrow \infty \). Choosing \(\delta _{0} = 1/A\) with \(A\) large enough to ensure \(\delta _{0} < 1/2\), we obtain

$$\begin{aligned} \frac{(1-\delta _{0})\tau (1-2\delta _{0})^2\big (Q_T\delta _{0}\big )^2}{(1-2\delta _{0})^2+\tau \big (Q_T\delta _{0}\big )^2} = \frac{(1-\delta _{0})(1-2\delta _{0})^2\delta _{0}^2}{\frac{(1-2\delta _{0})^2}{\tau Q_T^2}+\delta _{0}^2} \rightarrow _{T\rightarrow \infty } (1-\delta _{0})(1-2\delta _{0})^2. \end{aligned}$$

Taking now \(A\rightarrow \infty \), the assertion follows. \(\square \)

B.2: Upper bound

Proof (of Theorem 2)

Let \(\beta \in \left[ \beta _*,\beta _T\right] \), \(L\in \left[ L_*,L^*\right] \), \(\beta ' \in (d/2,\beta ]\), \(c_1\in (0,\infty ]\), \(c_2>0\), \(\sigma \) some nondegenerate \(\mathbb {R}^{d\times d}\)-matrix, \(L'>0\), and fix \(j \in \{1,\ldots ,d\}\).

Denote by \(\gamma _{Ti}\), \(i \in \mathbb {N}\), functions of \(T\) such that \(\lim _{T\rightarrow \infty }\gamma _{Ti} = 0\). For \(\psi _{T,\beta }\) and \(C_j(\beta ,L;\rho _{b},\sigma )\) introduced in (3.1) and (3.3), respectively, recall that \(\psi _{\beta ,L} = \psi _{\beta ,L}^j = \psi _{T,\beta } C_j(\beta ,L;\rho _{b},\sigma )\). To lighten notation, the dependence on the coordinate \(j\) will again mostly be suppressed. Denote by \(\widetilde{T}(\beta )\) the effective noise level under adaptation, defined as

$$\begin{aligned} \widetilde{T}(\beta )=\widetilde{T}^j(\beta ):=\left( \frac{d \rho _{b}(x_0)a_{jj} \log T}{\beta T}\right) ^{1/2}. \end{aligned}$$

Consider the following deterministic counterparts of the bandwidth \(\widehat{h}_{T,\beta '}\) and the thresholding sequence \(\widehat{\eta }_{T,\beta '}\),

$$\begin{aligned} h_{T,\beta '}&:=\left( \frac{d\rho _{b}(x_0)a_{jj}\log T}{\beta 'T}\right) ^{1/2\beta '} = \widetilde{T}(\beta ')^{1/\beta '} \end{aligned}$$


$$\begin{aligned} \eta _{T,\beta '}:= \left( \frac{d\rho _{b}(x_0)a_{jj}\log T}{\beta 'T}\right) ^{\frac{\beta '-d/2}{2\beta '}}\ \big \Vert K_{\beta '}\big \Vert _{L^2(\mathbb {R}^d)} = h_{T,\beta '}^{\beta '-d/2}\ \big \Vert K_{\beta '}\big \Vert _{L^2(\mathbb {R}^d)}. \end{aligned}$$


$$\begin{aligned} \widetilde{\beta } = \widetilde{\beta }(\beta ,\beta ') := {\left\{ \begin{array}{ll} \beta '+\frac{d}{2}, &{} \text { if } \frac{d}{2}\le \beta ' \le \frac{\beta }{2}+\frac{d}{4},\\ \beta ,&{} \text { if } \frac{\beta }{2}+\frac{d}{4}<\beta '\le \beta . \end{array}\right. } \end{aligned}$$

Define \(\overline{\delta }_T := (\log T)^{-1}\), and introduce the random event

$$\begin{aligned} A_{T,\beta '}:= \bigg \{\Big |\big (\widehat{h}^j_{T,\beta '}/h_{T,\beta '}\big ) ^{\widetilde{\beta '}-d/2}-1\Big | \le \overline{\delta }_T\bigg \} \end{aligned}$$

and the associated deterministic set \({\mathcal {H}}_{T,\beta '} \!=\! {\mathcal {H}}^j_{T,\beta '} \!:=\! \bigg \{h : \Big |\big (h/h_{T,\beta '}\big )^{\widetilde{\beta '}-d/2}\!-\!1\Big |\!\le \! \overline{\delta }_T\bigg \}\). Note that there exists a positive constant \(c_0\) such that

$$\begin{aligned} \mathcal {H}_{T,\beta '} \subset \bigg \{h : \Big |\big ( h/h_{T,\beta '}\big )-1\Big |\le c_0\overline{\delta }_T\bigg \} =: H_{T,\beta '}. \end{aligned}$$

Denote the kernel estimator of \((b^j\rho _{b})(x_0)\) with deterministic bandwidth \(h \in {\mathcal {H}}_{T,\beta '}\) by

$$\begin{aligned} g_{T,\beta '}(x_0,h):= \frac{1}{Th^d}\int _0^T K_{\beta '}\left( \frac{X_u-x_0}{h}\right) \mathrm {d}X_u^j, \end{aligned}$$

and set \(g_{T,\beta '}(x_0) := g_{T,\beta '}\big (x_0,h_{T,\beta '}\big )\). Define

$$\begin{aligned} \mathrm{s }_T(\beta )&:= h_{T,\beta }^{-d/2}\sqrt{\frac{\rho _{b}(x_0)a_{jj}}{T}} \ \big \Vert K_\beta \big \Vert _{L^2(\mathbb {R}^d)}, \end{aligned}$$

and let \(d_T(\beta ):= \sqrt{(d\log T)/\beta }\) such that \(\mathrm{s }_T(\beta ) d_T(\beta ) = \eta _{T,\beta }\). For \(\beta ' \le \beta \), introduce the auxiliary sequence

$$\begin{aligned} \tau _T(\beta '):=\mathrm{s }_T(\beta ')\left( \sqrt{d^2_T(\beta ')-d^2_T(\beta )} + \left( \frac{\log T}{\beta _T}\right) ^{1/4}\right) . \end{aligned}$$
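The identity \(\mathrm{s }_T(\beta ) d_T(\beta ) = \eta _{T,\beta }\) stated before the definition of \(\tau _T(\beta ')\) can be verified directly from the definitions of \(\mathrm{s }_T\), \(d_T\), \(\widetilde{T}\) and \(h_{T,\beta }\); the short computation below is meant only as a sanity check:

$$\begin{aligned} \mathrm{s }_T(\beta )\, d_T(\beta )&= h_{T,\beta }^{-d/2}\sqrt{\frac{\rho _{b}(x_0)a_{jj}}{T}}\ \big \Vert K_\beta \big \Vert _{L^2(\mathbb {R}^d)}\sqrt{\frac{d\log T}{\beta }} = h_{T,\beta }^{-d/2}\ \widetilde{T}(\beta )\ \big \Vert K_\beta \big \Vert _{L^2(\mathbb {R}^d)}\\&= h_{T,\beta }^{-d/2}\, h_{T,\beta }^{\beta }\ \big \Vert K_\beta \big \Vert _{L^2(\mathbb {R}^d)} = \eta _{T,\beta }, \end{aligned}$$

where the third equality uses \(h_{T,\beta } = \widetilde{T}(\beta )^{1/\beta }\).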

Lemma 5

(Bound on the bias) Consider the estimator \(g_{T,\beta '}(x_0,h)\) defined in (8.11), and let

$$\begin{aligned} \mathrm {b}_{\beta ,\beta '} := \left( \frac{2\beta '-d}{d}\right) ^{\frac{d/2-\widetilde{\beta }}{2\beta '}} \left( (2\pi )^{-d}\int _{\mathbb {R}^d}\frac{\Vert \lambda \Vert ^{4\beta '-2\widetilde{\beta }}}{\left( 1+\Vert \lambda \Vert ^{2\beta '}\right) ^2}~ \mathrm {d}\lambda \right) ^{1/2}. \end{aligned}$$

For any \(h \in {\mathcal {H}}_{T,\beta '}\),

$$\begin{aligned} \sup _{b\in \widetilde{\varPi }(c_1,c_2,\sigma )} \sup _{\rho _b \in \Sigma _T(\beta ,L)}\Big |\mathbf {E}_b g_{T,\beta '}(x_0,h)-(b^j\rho _{b})(x_0)\Big | \le L h^{\widetilde{\beta }-d/2} \mathrm {b}_{\beta ,\beta '}. \end{aligned}$$


$$\begin{aligned} \sup _{d/2 < \beta '\le \beta < \infty } \mathrm {b}_{\beta ,\beta '}< \infty , \qquad \limsup _{\delta \rightarrow 0}\sup _{\beta ,\beta ' \in [\beta _*,\infty ):\ |\beta -\beta '| \le \delta }\frac{\mathrm {b}_{\beta ,\beta '}}{\mathrm {b}_{\beta ,\beta }}\le 1. \end{aligned}$$


The bound on the bias in (8.12) is proven by standard arguments and relies in particular on exploiting the scaling properties of the Fourier transform of \(K_{\beta ',h}\). The remaining assertions are Lemma 1(ii), (iii) in Klemelä and Tsybakov [11].\(\square \)

The principal importance of exponential bounds on the stochastic error of estimators considered in the adaptive procedure was already indicated in the introduction. The Bernstein-type deviation inequality (BI) and its implication (BI+), the basic uniform exponential inequality stated in Lemma 2, can be applied to derive more specific bounds on the stochastic error of the estimators \(g_{T,\beta }\) defined according to (8.11). The following function classes are defined analogously to Butucea [3],

$$\begin{aligned} \mathcal {K}_1&:= \left\{ K_{\beta ',h}(\cdot ) := h^{-d}K_{\beta '}\left( (\cdot -x_0)/h\right) \mid h \in H_{T,\beta '}\right\} ,\\ \mathcal {K}_2&:= \left\{ K_{\beta ',h} - K_{\beta ',h_{T,\beta '}}\mid h \in H_{T,\beta '}\right\} . \end{aligned}$$

For \(h \in H_{T,\beta }\), let

$$\begin{aligned} Z_{T,\beta }(h)&:= g_{T,\beta }(x_0,h)-\mathbf {E}_b g_{T,\beta }(x_0,h)\nonumber \\&= \frac{1}{T} \int _0^T K_{\beta ,h}(X_u)\mathrm {d}X^j_u - \int _{\mathbb {R}^d} K_{\beta ,h}(y)(b^j\rho _{b})(y)\mathrm {d}y. \end{aligned}$$

Lemma 6

Grant Assumptions (BI) and (SG+). For any \(\beta ' > d/2\), the stochastic error \(Z_{T,\beta '}(\cdot )\) defined according to (8.13) has the following properties:

  1. (a)

    For any \(u \in \big [\tau _T(\beta '), \ R_1 \mathrm{s }_T(\beta ')\sqrt{\log T}\big ]\), \(R_1>0\) an absolute constant, there exist some sufficiently small \(\gamma >0\), independent of \(\beta '\), and some universal constant \(c^{\prime }_1>0\) such that

    $$\begin{aligned} \mathbf {P}_b\Bigg (\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big | > u\Bigg ) \le c^{\prime }_1 \exp \left( -\frac{1}{2} \bigg (\frac{u(1-\gamma )}{\mathrm{s }_T(\beta ')}\bigg )^2\right) + o\big (T^{-1}\big ).\qquad \quad \end{aligned}$$
  2. (b)

    For any \(u\in \big [R_1 \mathrm{s }_T(\beta ')\sqrt{\log T}, \ R_2\big ]\), \(R_1, R_2 >0\) absolute constants, it holds, for some absolute constants \(c^{\prime }_2, c^{\prime }_3 >0\),

    $$\begin{aligned} \mathbf {P}_b\Bigg (\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |> u\Bigg ) \le c^{\prime }_2\exp \left( -c^{\prime }_3\bigg (\frac{u}{\mathrm{s }_T(\beta ')}\bigg )^2\right) . \end{aligned}$$
  3. (c)

    Assume that \(\beta ' < \beta \). Then, uniformly in \(\beta \in {\mathcal {B}}_T\),

    $$\begin{aligned}&\sup _{\mathop {\beta '<\beta }\limits ^{\beta '\in \mathcal {B},}} m \sup _{b\in \widetilde{\varPi }(c_1,c_2)} \sup _{\rho _b \in \varSigma _T(\beta ,L)}\psi _{\beta ,L}^{-2}\, \\&\quad \times \,\mathbf {E}_b\left[ \bigg (\sup _{h \in H_{T,\beta '}} \big |Z_{T,\beta '}(h)\big |\bigg )^2 \, {1\!\!1}\bigg \{\sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big | > \tau _T(\beta ')\bigg \}\right] \rightarrow 0,\\&\sup _{\mathop {\beta '<\beta }\limits ^{\beta '\in \mathcal {B},}} m\sup _{b\in \widetilde{\varPi }(c_1,c_2)} \sup _{\rho _b \in \varSigma _T(\beta ,L)}\psi _{\beta ,L}^{-2}\, \\&\quad \times \, \mathbf {E}_b\left[ \bigg (\sup _{h \in H_{T,\beta '}} \big |Z_{T,\beta '}(h)\big |\bigg )^2 \ {1\!\!1}\bigg \{\sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big | > \sqrt{\mathrm{s }_T(\beta ')\psi _{\beta ,L}}\bigg \}\right] \rightarrow 0. \end{aligned}$$


The assertions are analogous to the statements in Lemma 4.3, Lemma 4.5 and Theorem 4.6 in Butucea [3]. However, deriving the inequality (8.14) with the specific factor \(1/2\) in the exponent in the current diffusion framework requires going into greater detail. Throughout the proof, \(D_1, D_2, \ldots \) denote positive constants. For fixed \(\delta ' \in (0,1)\) and arbitrary \(u,T>0\), write

$$\begin{aligned} \mathbf {P}_b\Big (\big |Z_{T,\beta '}(h_{T,\beta '})\big |>u\Big ) \le \mathsf {t}_1 + \mathsf {t}_2, \end{aligned}$$


$$\begin{aligned} \mathsf {t}_1 \!:=\! \mathbf {P}_b\Bigg (\bigg |\frac{1}{T}\int _0^T \bigg (K_{\beta ',h_{T,\beta '}}(X_u)b^j(X_u)\!-\! \int _{\mathbb {R}^d}K_{\beta ',h_{T,\beta '}}(y)(b^j\rho _{b})(y)\mathrm {d}y\bigg )\mathrm {d}u\bigg |\!>\! \delta ' u\Bigg ), \end{aligned}$$

and, denoting \(M_t(K) := \int _0^t K(X_u)\sum _{r=1}^d \sigma _{jr}\mathrm {d}W_u^r\), \(t>0\),

$$\begin{aligned} \mathsf {t}_2 := \mathbf {P}_b\Bigg (\bigg |\frac{1}{T}M_T\Big (K_{\beta ',h_{T,\beta '}}\Big ) \bigg |>(1-\delta ')u\Bigg ). \end{aligned}$$

For any \(u \le R_1 \mathrm{s }_T(\beta ')\sqrt{\log T}\), we have

$$\begin{aligned} u \big \Vert K_{\beta ,h_{T,\beta '}}\big \Vert _\infty \!\le \! uh_{T,\beta '}^{-d}K_{\max } \!\le \! R_1 \mathrm{s }_T(\beta ')h_{T,\beta '}^{-d} \sqrt{\log T}\ K_{\max }\le D_1 \mathrm{s }_T^2(\beta ') T h_{T,\beta '}^{-d/2+\beta '}, \end{aligned}$$

such that, since \(\beta '>d/2\),

$$\begin{aligned} \frac{u \Vert K_{\beta ,h}\Vert _\infty }{\mathrm{s }_T^2(\beta ') T} = o_T(1). \end{aligned}$$
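The order of this ratio can be made explicit. As a sketch based only on the preceding chain of inequalities, dividing its right-hand side by \(\mathrm{s }_T^2(\beta ')T\) gives, for \(u \le R_1 \mathrm{s }_T(\beta ')\sqrt{\log T}\),

$$\begin{aligned} \frac{u \big \Vert K_{\beta ,h_{T,\beta '}}\big \Vert _\infty }{\mathrm{s }_T^2(\beta ') T} \le D_1\, h_{T,\beta '}^{\beta '-d/2}, \end{aligned}$$

and the right-hand side vanishes as \(T\rightarrow \infty \) because \(h_{T,\beta '}\rightarrow 0\) and the exponent \(\beta '-d/2\) is positive (for uniformity in \(\beta '\), this sketch assumes that \(\beta '\) stays bounded away from \(d/2\)).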

Furthermore, the enhanced spectral gap assumption (SG+) gives

$$\begin{aligned} \varsigma _b\big (K_{\beta ,{h_{T,\beta '}}}\big ) \le D_2\times {\left\{ \begin{array}{ll}1, &{}d=1,\\ \max \left\{ 1,(\log (h_{T,\beta '}^{-4}))^2\right\} , &{}d=2,\\ h_{T,\beta '}^{2-d}, &{}d\ge 3. \end{array}\right. } \end{aligned}$$

Thus, for \(T\) sufficiently large, \(C_B\Big (\varsigma _b\big (K_{\beta ,{h_{T,\beta '}}}\big )+ \delta ' u\ \big \Vert K_{\beta ,h_{T,\beta '}}\big \Vert _\infty \Big ) \le \mathrm{s }_T^2(\beta ') T.\) The Bernstein-type deviation inequality (BI) therefore implies that

$$\begin{aligned} \mathsf {t}_1&\le 2\exp \left( -\frac{\delta '^2u^2}{2\mathrm{s }_T^2(\beta ')}\right) . \end{aligned}$$

For bounding \(\mathsf {t}_2\) from above, we first use Bernstein’s inequality for continuous martingales which gives, for any \(h>0\),

$$\begin{aligned} \mathbf {P}_b\Big (\big |M_T\big (K_{\beta ',h}\big )\big | \!>\! T(1-\delta ')u; \ \big \langle M\big (K_{\beta ',h}\big )\big \rangle _T \!\le \! T^2 \mathrm{s }_T^2(\beta ')\Big ) \!\le \! 2\exp \left( -\frac{(1\!-\!\delta ')^2u^2}{2\mathrm{s }_T^2(\beta ')}\right) . \end{aligned}$$
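The inequality above is an instance of the standard Bernstein-type bound for a continuous local martingale \(M\) with \(M_0=0\), namely \(\mathbf {P}\big (|M_T|\ge x,\ \langle M\rangle _T\le y\big )\le 2\exp \big (-x^2/(2y)\big )\) for \(x,y>0\). As a sketch of the substitution (the symbols \(x\) and \(y\) are introduced here only for illustration), taking \(x = T(1-\delta ')u\) and \(y = T^2\mathrm{s }_T^2(\beta ')\) yields the exponent

$$\begin{aligned} \frac{x^2}{2y} = \frac{T^2(1-\delta ')^2u^2}{2T^2\mathrm{s }_T^2(\beta ')} = \frac{(1-\delta ')^2u^2}{2\mathrm{s }_T^2(\beta ')}. \end{aligned}$$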

By means of (BI) and using again that \(\beta '>d/2\), it can be shown that

$$\begin{aligned} \mathbf {P}_b\Big (\big \langle M\big (K_{\beta ',h_{T,\beta '}}\big )\big \rangle _T> T^2\mathrm{s }_T^2(\beta ')\Big )&= \mathbf {P}_b\bigg (\frac{1}{T}\int _0^T K_{\beta ',h_{T,\beta '}}^2(X_u)\mathrm {d}u > a_{jj}^{-1}h_{T,\beta '}^{-d}\bigg )\nonumber \\&= o\big (T^{-1}\big ). \end{aligned}$$

Adding the upper bounds (8.15), (8.16) and (8.17), we obtain, for some small \(\gamma >0\),

$$\begin{aligned} \mathbf {P}_b\Big (\big |Z_{T,\beta '}(h_{T,\beta '})\big |>u\Big ) \le 2\exp \left( -\frac{u^2(1-\gamma )}{2\mathrm{s }_T^2(\beta ')}\right) + o\big (T^{-1}\big ). \end{aligned}$$

Consider the sequence \(\delta _{T1} := \beta _T\overline{\delta }_T\sqrt{\log T} = (\log \log T)^\delta (\log T)^{-1/2} \rightarrow 0\), and note that, for any \(u>0\),

$$\begin{aligned} \mathbf {P}_b\bigg (\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big | > u\bigg )&\le \mathbf {P}_b\bigg (\sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)-Z_{T,\beta '}(h_{T,\beta '})\big | > u\delta _{T1}\bigg )\\&\quad + \mathbf {P}_b\bigg (\big |Z_{T,\beta '}(h_{T,\beta '})\big | > u(1-\delta _{T1})\bigg ). \end{aligned}$$

Since \(u(1-\delta _{T1}) \le u \le R_1 \mathrm{s }_T(\beta ')\sqrt{\log T}\), (8.18) gives an upper bound on the latter summand. For \(T\) large enough, it further holds

$$\begin{aligned} \left[ \tau _T(\beta ') \delta _{T1}, \ R_1\delta _{T1}\mathrm{s }_T(\beta ') \sqrt{\log T}\right] \subset \left[ \frac{\beta _T \overline{\delta }_T\sqrt{\log T}}{\sqrt{T}h_{T,\beta '}^{d/2}},\ \beta _T \overline{\delta }_T\right] . \end{aligned}$$

Taking into account that

$$\begin{aligned} \sup _{h \in H_{T,\beta '}}\big \Vert K_{\beta ',h} - K_{\beta ',h_{T,\beta '}}\big \Vert _{L^2(\mu _b)}&\le O(1) \sup _{h \in H_{T,\beta '}} h_{T,\beta '}^{-d/2} \left| 1-\big (h/h_{T,\beta '}\big )^{2\beta '}\right| \\&\le O(1) h_{T,\beta '}^{-d/2}\ \beta _T \overline{\delta }_T \end{aligned}$$

and since

$$\begin{aligned} \int _0^1\max \left\{ \sqrt{\log N_{[\ ]}\big (\varepsilon ,\mathcal {K}_2,L^2(\mu _b)\big )}, 1\right\} \mathrm {d}\varepsilon \le D_4\beta _T \overline{\delta }_T\sqrt{\log T}h_{T,\beta '}^{-d/2}, \end{aligned}$$

the uniform exponential inequality (BI+) implies that

$$\begin{aligned} \mathbf {P}_b\Bigg (\sup _{h\in H_{T,\beta '}}\Big |Z_{T,\beta '}(h)-Z_{T,\beta '}(h_{T,\beta '})\Big | > u\delta _{T1}\Bigg ) \le C_1 \exp \left( -\frac{D_5 Th_{T,\beta '}^d(u\delta _{T1})^2}{(\beta _T\overline{\delta }_T)^2}\right) . \end{aligned}$$

Summing the upper bounds due to (8.18) and (8.19), we obtain (8.14).

The inequality stated in \(\mathrm{(b) }\) follows as an application of (BI+) with \(K := h_{T,\beta '}^{-d} K_{\max }\) and

$$\begin{aligned} M^2:= h_{T,\beta '}^{-d} \big \Vert K_{\beta '}\big \Vert _{L^2(\mathbb {R}^d)}^2 \rho _{b}(x_0). \end{aligned}$$

Finally, part \(\mathrm{(c) }\) is proven similarly to Theorem 4.6 in Butucea [3] by noting that there exists some positive constant \(R\) such that \(\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big | \le R h_{T,\beta '}^{-d}\). A suitable decomposition of

$$\begin{aligned} \mathbf {E}_b\Bigg [\bigg (\sup _{h \in H_{T,\beta '}} \big |Z_{T,\beta '}(h)\big |\bigg )^2 \, {1\!\!1}\bigg \{\sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |>\tau _T(\beta ')\bigg \}\Bigg ] \end{aligned}$$

and uniform exponential bounds on the corresponding integrands as they follow from parts \(\mathrm{(a) }\) and \(\mathrm{(b) }\) of this lemma then yield the assertions. \(\square \)

The next lemma contains a decomposition of the normalizing factor \(\psi _{\beta ,L}\) and some relations which are needed later in the proof. For \(h_{T,\beta }\) defined according to (8.10), denote

$$\begin{aligned} \mathrm {b}_{T,\beta '} := L\mathrm {b}_{\beta ,\beta '}h_{T,\beta '}^{\widetilde{\beta }-d/2}. \end{aligned}$$

Lemma 7

Let \(\beta \in [d/2,\infty )\), \(L\in \left[ L_*,L^*\right] \), and denote \(\nu = (\beta ,L)\). It then holds

$$\begin{aligned} \psi _\nu = L^{d/(2\beta )} \, \Big (\eta _{T,\beta } + h_{T,\beta }^{\beta -d/2} \mathrm {b}_{\beta ,\beta }\Big ). \end{aligned}$$

Furthermore, there exist positive constants \(D_1, \ldots , D_5\), depending only on \(\beta _*, L_*, L^*, d\) and \(\sigma \), such that

$$\begin{aligned} D_1 \le \psi _\nu /\eta _{T,\beta } \le D_2, \end{aligned}$$

and, for \(\beta ' \in [d/2,\infty )\), \(\beta ' < \beta \),

$$\begin{aligned} \frac{D_3}{\beta _T}\left( \frac{\log T}{T}\right) ^{\kappa (\beta ')-\kappa (\beta )} \le \frac{\psi _{\beta ',L}}{\psi _\nu }\le D_4 T^{\kappa (\beta )-\kappa (\beta ')} \end{aligned}$$


$$\begin{aligned} \frac{\mathrm {b}_{T,\beta '}^2+\tau _T^2(\beta ')}{\psi _{\nu }^2}&\le D_5 \log T \ T^{2\kappa (\beta )- 2\kappa (\beta ')}. \end{aligned}$$


The proof of the decomposition is comparable to the derivation of relation (68) on p. 461 in Klemelä and Tsybakov [11]; for details, see the proof of Lemma 2.6.6 in Strauch [21]. Assertions (8.22) and (8.23) follow analogously to the proof of the relations (44)–(46) in Lemma 4 in Klemelä and Tsybakov [11] (pp. 453–454). For the proof of (8.24), we refer to the proof of Lemma 3.5 in Butucea [3]. \(\square \)

Main part of the proof of the upper bound. Define \(\beta ^-=\beta ^-(\beta )\) by

$$\begin{aligned} \beta ^- := \beta - \frac{\beta _T^+}{\log T}, \end{aligned}$$

where \(\beta _T^+ := (\log \log T)^{\delta '}\), for some \(\delta ' \in (\delta ,1)\). We follow the standard approach and decompose the risk successively. Assume that \(b\in \widetilde{\varPi }(c_1,c_2,\sigma )\), let \(\nu = (\beta ,L)\), and set

$$\begin{aligned} {\mathcal {R}}_{T,\nu }^+ = {\mathcal {R}}_{T,\nu }^+(j)&:= \sup _{\rho _b\in \varSigma _T(\beta ,L)}\psi _{\nu }^{-2} ~ \mathbf {E}_b\left[ \big |\widetilde{g}_T^j(x_0)-(b^j\rho _{b})(x_0)\big |^2\ {1\!\!1}\big \{\widehat{\beta }^j_T \ge \beta ^-\big \}\right] ,\\ {\mathcal {R}}_{T,\nu }^- = {\mathcal {R}}_{T,\nu }^-(j)&:= \sup _{\rho _b\in \varSigma _T(\beta ,L)}\psi _{\nu }^{-2} ~ \mathbf {E}_b\left[ \big |\widetilde{g}_T^j(x_0)-(b^j\rho _{b})(x_0)\big |^2 \, {1\!\!1}\big \{\widehat{\beta }^j_T < \beta ^-\big \}\right] . \end{aligned}$$

For ease of notation, we usually suppress the dependence of the risk on the coordinate \(j\).

(I) We first consider the case \(\widehat{\beta }_T^j \ge \beta ^-\), and we show that

$$\begin{aligned} \limsup _{T\rightarrow \infty }\sup _{\nu \in \mathcal {B}_T}{\mathcal {R}}_{T,\nu }^+\le 1. \end{aligned}$$

Define \(\overline{\beta }= \overline{\beta }(\beta )\) via the equation \(\left( \frac{\log T}{L^2 T}\right) ^{1/(2\beta )} = \left( \frac{\log T}{T}\right) ^{1/(2\overline{\beta })}\). Let \(\beta ^+ \in \mathcal {G}\) be the largest grid point \(\le \overline{\beta }\), and assume that \(T\) is large enough to ensure \(\beta ^- < \beta ^+\). Denote \(\mathcal {G}_1 = \mathcal {G}_1(\beta ) := \left\{ \beta ' \in \mathcal {G}\mid \beta ^-\le \beta ' \le \beta ^+\right\} \), \(\mathcal {G}_2 = \mathcal {G}_2(\beta ):=\left\{ \beta '\in \mathcal {G}\mid \beta ^+ < \beta ' \le \beta _T\right\} \), and rewrite

$$\begin{aligned} {\mathcal {R}}_{T,\nu }^+ = \sup _{\rho _b\in \varSigma _T(\beta ,L)} \psi _{\nu }^{-2} ~ \mathbf {E}_b\left[ \big |\widetilde{g}_T^j(x_0)-(b^j\rho _{b})(x_0)\big |^2\ {1\!\!1}\big \{\widehat{\beta }_T^j \in \mathcal {G}_1 \cup \mathcal {G}_2\big \}\right] . \end{aligned}$$

Let \(\beta ' \in \mathcal {G}_1 = \mathcal {G}_1(\beta )\) and \(\rho _b\in \varSigma _T(\beta ,L)\), and assume that \(T\) is so large that \(\widetilde{\beta }(\beta ,\beta ')=\beta \). Using Lemma 5, the facts that \(\mathcal {H}_{T,\beta '}\subset H_{T,\beta '}\), that \(\beta '\le \overline{\beta }\) and the definition of \(\overline{\beta }\), it can be shown that

$$\begin{aligned} \sup _{h \in \mathcal {H}_{T,\beta '}} \big |\mathbf {E}_b g_{T,\beta '}(x_0,h)-(b^j\rho _{b})(x_0)\big | \le \Lambda (\beta ,\beta ')L^{d/(2\beta )}h_{T,\beta }^{\beta -d/2}\mathrm {b}_{\beta ,\beta '}\ \big (1+\overline{\delta }_T\big ), \end{aligned}$$


$$\begin{aligned} \Lambda (\beta ,\beta ') := \left( d\rho _{b}(x_0)a_{jj}\right) ^ {\frac{\beta -d/2}{2\beta '}-\frac{\beta -d/2}{2\beta }} (\beta ')^{-\frac{\beta -d/2}{2\beta '}}\beta ^{\frac{\beta -d/2}{2\beta }}. \end{aligned}$$

The following arguments are along the lines of the proof of the upper bound in Klemelä and Tsybakov [11] (see pp. 461–463). For any \(\beta ' \in \mathcal {G}_1\), there exists some positive constant \(C\) such that

$$\begin{aligned} \left| \beta -\beta '\right| \le C\beta _T^+(\log T)^{-1}. \end{aligned}$$

Since \(\Lambda (\beta ,\beta ')\) is uniformly continuous in \(\beta ,\beta ' \in [\beta _*,\infty )\), this implies that \(\Lambda (\beta ,\beta ') \le 1 +\gamma _{T1}\). Furthermore, for any \(\beta ' \in \mathcal {G}_1\), \(\beta \in \left[ \beta _*,\beta _T\right] \), it holds \(\mathrm {b}_{\beta ,\beta '}\le \mathrm {b}_{\beta ,\beta }\ (1+\gamma _{T2})\). Consequently, for any \(\beta ' \in \mathcal {G}_1\), \(\rho _b\in \Sigma _T(\beta ,L)\),

$$\begin{aligned} \sup _{h \in \mathcal {H}_{T,\beta '}} \big |\mathbf {E}_b g_{T,\beta '}(x_0,h)-(b^j\rho _{b})(x_0)\big |\le L^{d/(2\beta )} h_{T,\beta }^{\beta -d/2}\mathrm {b}_{\beta ,\beta }\ (1+\gamma _{T3}). \end{aligned}$$

Similar arguments (also see the derivation of line (54) on p. 1591 in Klemelä and Tsybakov [10]) yield

$$\begin{aligned} \eta _{T,\beta ^+}&\le \left( \frac{d\rho _{b}(x_0)a_{jj}}{\beta ^+}\right) ^ {\frac{\beta ^+-d/2}{2\beta ^+}}\left( \frac{\log T}{T}\right) ^{ \frac{\overline{\beta }-d/2}{2\overline{\beta }}} \Vert K_{\beta ^+}\Vert _{L^2(\mathbb {R}^d)}\ (1+\gamma _{T4})\nonumber \\&\le L^{d/(2\beta )} \eta _{T,\beta }\ (1+\gamma _{T5}). \end{aligned}$$

Recall the definition of the stochastic error \(Z_{T,\beta }(\cdot )\). Whenever \(\widehat{\beta }_T^j = \beta ' \in \mathcal {G}_1\) and the event \(A_{T,\beta '}\) holds, the above arguments imply that

$$\begin{aligned} \big |\widetilde{g}^j_T(x_0)-(b^j\rho _{b})(x_0)\big |&= \big |\widehat{g}_{T,\beta '} (x_0)-(b^j\rho _{b})(x_0)\big |\nonumber \\&\le \sup _{h\in \mathcal {H}_{T,\beta '}}\big |g^j_{T,\beta '}(x_0,h) - (b^j\rho _{b})(x_0)\big | \nonumber \\&\le \sup _{h \in \mathcal {H}_{T,\beta '}} \big |Z_{T,\beta '}(h)\big | \!+\! L^{d/(2\beta )}h_{T,\beta }^{\beta -d/2}\mathrm {b}_{\beta ,\beta }\ (1+\gamma _{T3})\qquad \qquad \end{aligned}$$
$$\begin{aligned}&\le \sup _{h \in H_{T,\beta '}} \big |Z_{T,\beta '}(h)\big | +\psi _\nu (1+\gamma _{T3}).\qquad \qquad \end{aligned}$$

The last line holds true since \(\mathcal {H}_{T,\beta '} \subset H_{T,\beta '}\) and in view of the decomposition of the normalizing factor \(\psi _\nu \) according to (8.21). If \(\widehat{\beta }_T^j \ge \beta ^+\), the definition of the estimator \(\widehat{\beta }_T^j\) according to (4.10) implies that \(\big | \widehat{g}_{T,\widehat{\beta }_T^j}^j(x_0) - \widehat{g}_{T,\beta ^+}^j(x_0)\big | \le \widehat{\eta }_{T,\beta ^+}\).

Therefore, if \(\widehat{\beta }_T^j = \beta ' \in \mathcal {G}_2\), it holds on \(A_{T,\beta ^+}\),

$$\begin{aligned}&\big |\widetilde{g}_T^j(x_0)-(b^j\rho _{b})(x_0)\big |\\&\quad \le \ \, \Big (\big |\widehat{g}^j_{T,\beta '}(x_0)-\widehat{g}^j_{T,\beta ^+}(x_0)\big | + \big |\widehat{g}^j_{T,\beta ^+}(x_0)-(b^j\rho _{b})(x_0)\big |\Big )\\&\quad \le \widehat{\eta }_{T,\beta ^+} + \big |\widehat{g}^j_{T,\beta ^+}(x_0)-(b^j\rho _{b})(x_0)\big |\\&\quad \le \sup _{h \in \mathcal {H}_{T,\beta ^+}} \bigg \{\eta _{T,\beta ^+}\left( h/ h_{T,\beta }\right) ^{\beta -d/2} + \big |g_{T,\beta ^+}(x_0,h) - \mathbf {E}_b g_{T,\beta ^+}(x_0,h)\big | \\&\qquad + \big |\mathbf {E}_b g_{T,\beta ^+}(x_0,h)-(b^j\rho _{b})(x_0)\big |\bigg \}\\&\quad \mathop {\le }\limits ^{(8.28)} \eta _{T,\beta ^+} \, \big (1+\overline{\delta }_T\big ) + \sup _{h \in H_{T,\beta ^+}}\big |Z_{T,\beta ^+}(h)\big | + L^{d/(2\beta )}h_{T,\beta }^{\beta -d/2}\mathrm {b}_{\beta ,\beta } \, (1+\gamma _{T3})\\&\quad \mathop {\le }\limits ^{(8.27)} \sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta ^+}(h)\big | \!+\! \Big (L^{d/(2\beta )} \eta _{T,\beta } \ (1+\gamma _{T 6}) \!+\! L^{d/(2\beta )}h_{T,\beta }^{\beta -d/2} \mathrm {b}_{\beta ,\beta } (1\!+\!\gamma _{T3})\Big ). \end{aligned}$$

In view of the decomposition (8.21), this last line implies that

$$\begin{aligned} \big |\widetilde{g}^j_T(x_0)-(b^j\rho _{b})(x_0)~\big |{1\!\!1}\big \{\widehat{\beta }_T^j&= \beta ' \in \mathcal {G}_2\big \} \, {1\!\!1}\big \{A_{T,\beta ^+}\big \} \nonumber \\&\quad \quad \ \ \le \sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta ^+}(h)\big |+ \psi _\nu \ (1+\gamma _{T 7}).\quad \quad \quad \quad \end{aligned}$$

Thus, using (8.29) and (8.30),

$$\begin{aligned}&\psi _\nu ^{-2}~ \mathbf {E}_b\left[ \big |\widetilde{g}_T^j(x_0) - (b^j\rho _{b})(x_0)\big |^2 \, {1\!\!1}\big \{\widehat{\beta }_T^j\in \mathcal {G}_1\cup \mathcal {G}_2\big \}\right] \\&\quad \le ~ \sum _{\beta '\in \mathcal {G}_1} \mathbf {E}_b\left[ \Big (1+\gamma _{T3} + \psi _\nu ^{-1}~ \sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |\Big )^2\ {1\!\!1}\big \{\widehat{\beta }_T^j = \beta '\big \} \ {1\!\!1}\big \{A_{T,\beta '}\big \}\right] \\&\qquad + \sum _{\beta ' \in \mathcal {G}_1} \mathbf {E}_b\left[ \psi _\nu ^{-2}\ \big |\widetilde{g}^j_T(x_0)-(b^j\rho _{b})(x_0)\big |^2 \ {1\!\!1}\big \{\widehat{\beta }_T^j = \beta '\big \}\ {1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}\right] \\&\qquad + \sum _{\beta '\in \mathcal {G}_2}\mathbf {E}_b\left[ \Big (1+\gamma _{T 7} + \psi _\nu ^{-1}~ \sup _{h \in H_{T,\beta ^+}}\, \big |Z_{T,\beta ^+}(h)\big |\Big )^2 \, {1\!\!1}\big \{\widehat{\beta }_T^j = \beta '\big \} \ {1\!\!1}\big \{A_{T,\beta ^+}\big \}\right] \\&\qquad + \sum _{\beta ' \in \mathcal {G}_2}\mathbf {E}_b\left[ \psi _\nu ^{-2}\big |\widetilde{g}^j_T(x_0)-(b^j\rho _{b})(x_0)\big |^2 \ {1\!\!1}\big \{\widehat{\beta }_T^j = \beta '\big \} \, {1\!\!1}\big \{A_{T,\beta ^+}^{\mathrm{c }}\big \}\right] \\&\quad =: \sum _{\beta '\in \mathcal {G}_1}\left( \mathsf {p}_1(\beta ') + \mathsf {p}_2(\beta ')\right) + \sum _{\beta ' \in \mathcal {G}_2}\left( \mathsf {p}_3(\beta ')+\mathsf {p}_4(\beta ')\right) , \text {say}. \end{aligned}$$

The terms \(\mathsf {p}_1(\cdot ), \ldots , \mathsf {p}_4(\cdot )\) are now considered separately. Note first that, for any \(\beta ' \in \mathcal {G}_1\),

$$\begin{aligned} \mathsf {p}_1(\beta ')&\le \mathbf {E}_b\left[ \Big (1 + \gamma _{T3} + \psi _\nu ^{-1}\sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |\Big )^2\ {1\!\!1}\bigg \{\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |> \sqrt{\mathrm{s }_T(\beta ')\psi _\nu }\bigg \}\right] \nonumber \\&+ \left( 1+\gamma _{T3}+ \sqrt{\mathrm{s }_T(\beta ')\psi _\nu ^{-1}}\right) ^2\mathbf {P}_b\left( \widehat{\beta }_T^j = \beta '\right) . \end{aligned}$$
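The bound on \(\mathsf {p}_1(\beta ')\) comes from splitting the expectation according to whether the supremum of the stochastic error exceeds \(\sqrt{\mathrm{s }_T(\beta ')\psi _\nu }\). As a sketch, write \(S := \sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |\) (a shorthand introduced here only for illustration). On the event \(\big \{S\le \sqrt{\mathrm{s }_T(\beta ')\psi _\nu }\big \}\) one has \(\psi _\nu ^{-1}S \le \sqrt{\mathrm{s }_T(\beta ')\psi _\nu ^{-1}}\), so that

$$\begin{aligned} \big (1+\gamma _{T3}+\psi _\nu ^{-1}S\big )^2\ {1\!\!1}\big \{\widehat{\beta }_T^j=\beta '\big \}&\le \big (1+\gamma _{T3}+\psi _\nu ^{-1}S\big )^2\ {1\!\!1}\Big \{S>\sqrt{\mathrm{s }_T(\beta ')\psi _\nu }\Big \}\\&\quad + \Big (1+\gamma _{T3}+\sqrt{\mathrm{s }_T(\beta ')\psi _\nu ^{-1}}\Big )^2\ {1\!\!1}\big \{\widehat{\beta }_T^j=\beta '\big \}. \end{aligned}$$

Taking expectations and dropping the indicator of \(A_{T,\beta '}\) gives the stated inequality.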

Since (8.26) holds for any \(\beta ' \in \mathcal {G}_1\), it can be shown by means of (8.22) and (8.23) that

$$\begin{aligned} \frac{\mathrm{s }_T(\beta ')}{\psi _\nu }&\le \frac{D_1^{-1}D_4 \, \exp \big (C\beta _T^+\big )\, \mathrm{s }_T(\beta ')}{\eta _{T,\beta '}} \le \frac{D_1^{-1}D_4 \, \exp \big (C\beta _T^+\big )\ }{d_T(\beta _T)}\\&\le D_1^{-1}D_4 \, \exp \big (C\beta _T^+\big )\sqrt{\frac{\beta _T}{\log T}} =: A_T. \end{aligned}$$
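Two steps of this chain can be spelled out. The second inequality uses \(\mathrm{s }_T(\beta ')/\eta _{T,\beta '} = 1/d_T(\beta ')\) together with \(\beta '\le \beta _T\). Moreover, \(A_T\rightarrow 0\); as a sketch, assuming \(\beta _T = (\log \log T)^{\delta }\) and recalling \(\beta _T^+ = (\log \log T)^{\delta '}\) with \(\delta '<1\),

$$\begin{aligned} \log A_T = \log \big (D_1^{-1}D_4\big ) + C(\log \log T)^{\delta '} + \frac{\delta }{2}\log \log \log T - \frac{1}{2}\log \log T \rightarrow -\infty , \end{aligned}$$

since \((\log \log T)^{\delta '} = o(\log \log T)\). More precisely, \(A_T\) decays like \((\log T)^{-1/2+o(1)}\).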

By the inequality \((a+b)^2\le 2a^2+2b^2\), the first summand in (8.31) is bounded from above by the sum of the terms

$$\begin{aligned} 2 \, (1+\gamma _{T3})^2&\mathbf {P}_b\Bigg (\sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |> \sqrt{\mathrm{s }_T(\beta ')\psi _\nu }\Bigg )\\&\quad \qquad \qquad \mathop {\le }\limits ^{(8.14)} 2c^{\prime }_1(1+\gamma _{T3})^2 \exp \left( -\frac{\psi _\nu (1-\gamma _T)^2}{2\mathrm{s }_T(\beta ')}\right) \end{aligned}$$


$$\begin{aligned} 2\psi _\nu ^{-2} \mathbf {E}_b\Bigg [\bigg (\sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |\bigg )^2 \ {1\!\!1}\bigg \{\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big | > \sqrt{\mathrm{s }_T(\beta ')\psi _\nu }\bigg \}\Bigg ]. \end{aligned}$$

Part (c) of Lemma 6 entails that the latter term tends to zero, uniformly in \(\beta ' \in \mathcal {G}_1\). Therefore,

$$\begin{aligned} \mathsf {p}_1(\beta ') \le \big (1+\gamma _{T3}+ \sqrt{A_T}\big )^2 ~ \mathbf {P}_b\left( \widehat{\beta }_T^j = \beta '\right) + O(1) \exp \left( -\frac{(1-\gamma _T)^2}{2 A_T}\right) + o_T(1). \end{aligned}$$

Recall that the cardinality \(m\) of the grid \(\mathcal {G}\) satisfies

$$\begin{aligned} m \le k_1^{-1}\beta _T (\log T)^{\delta _1} = k_1^{-1} (\log T)^{\delta _1} (\log \log T)^\delta . \end{aligned}$$

By construction, \(\beta ^+ \in \mathcal {G}_1\), such that \(\mathsf {p}_3(\beta ')\) is upper-bounded analogously. Consequently,

$$\begin{aligned} \sum _{\beta '\in \mathcal {G}_1}\mathsf {p}_1(\beta ') + \sum _{\beta '\in \mathcal {G}_2}\mathsf {p}_3(\beta ')&\le \Big (1+ \max \big \{\gamma _{T3},\gamma _{T 7}\big \} + \sqrt{A_T}\Big )^2 \ \mathbf {P}_b\Big (\widehat{\beta }_T^j\in \mathcal {G}_1 \cup \mathcal {G}_2\Big )\\&\qquad \qquad \qquad \quad + O(1) m \exp \left( -\frac{(1-\gamma _T)^2}{2 A_T}\right) + o_T(1)\\&\le 1 + o_T(1). \end{aligned}$$

For \(\mathsf {p}_2(\cdot )\) and any \(\beta ' \in \mathcal {G}_1\), there exists some universal constant \(c_0\) such that

$$\begin{aligned} \mathsf {p}_2(\beta ')&\le \psi _\nu ^{-2}~ \mathbf {E}_b\left[ \big |\widehat{g}^j_{T,\beta '}(x_0)-(b^j\rho _{b})(x_0)\big |^2 \, {1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}\right] \le c_0 \left( \mathbf {P}_b\left( A_{T,\beta '}^{\mathrm{c }}\right) \right) ^{1/2}. \end{aligned}$$
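The second inequality is of Cauchy-Schwarz type. As a sketch, with the shorthand \(X := \psi _\nu ^{-1}\big |\widehat{g}^j_{T,\beta '}(x_0)-(b^j\rho _{b})(x_0)\big |\) (introduced here only for illustration),

$$\begin{aligned} \mathbf {E}_b\Big [X^2\ {1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}\Big ] \le \big (\mathbf {E}_b X^4\big )^{1/2}\Big (\mathbf {P}_b\big (A_{T,\beta '}^{\mathrm{c }}\big )\Big )^{1/2}, \end{aligned}$$

so the display holds as soon as \(\big (\mathbf {E}_b X^4\big )^{1/2}\le c_0\). This fourth-moment bound is an assumption of the sketch and is not verified in this passage.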

The regularity conditions on the bandwidth and the kernel used for defining \(\widehat{\rho }_T(x_0)\) ensure that, for any \(\beta ' \in (d/2,\beta ]\) and some sufficiently small constant \(\alpha >0\) fixed,

$$\begin{aligned} \mathbf {P}_b\left( A_{T,\beta '}^{\mathrm{c }}\right) \le 2\exp \left( -\frac{T\left( (1-\alpha )h_T^d\overline{\delta }_T\rho _T^*\right) ^2}{2\Vert Q\Vert _\infty ^2}\right) = o_T(1). \end{aligned}$$

This implies that \(\mathsf {p}_2(\beta ')\) is exponentially small for any \(\beta ' \in \mathcal {G}_1\), so that \(\sum _{\beta ' \in \mathcal {G}_1}\mathsf {p}_2(\beta ') \rightarrow 0.\) Analogously, \(\sum _{\beta ' \in \mathcal {G}_2}\mathsf {p}_4(\beta ') \rightarrow 0\), which completes the verification of (8.25).

(II) We now prove that

$$\begin{aligned} \lim _{T\rightarrow \infty }\sup _{\nu \in {\mathcal {B}}_T}{\mathcal {R}}_{T,\nu }^-= 0. \end{aligned}$$

Let \(\beta ' \in \mathcal {G}_T\), and assume that the event \(A_{T,\beta '}\) holds. In view of the definition of the stochastic error \(Z_{T,\beta '}\) in (8.13) and taking into account Lemma 5, it holds, whenever \(\rho _b\in \varSigma _T(\beta ,L)\),

$$\begin{aligned} \big |\widehat{g}^j_{T,\beta '}(x_0)-(b^j\rho _{b})(x_0)\big |&= \big |\widehat{g}^j_{T,\beta '}\big (x_0,\widehat{h}_{T,\beta '}\big ) - (b^j\rho _{b})(x_0)\big |\\&\le \sup _{h \in \mathcal {H}_{T,\beta '}}\left\{ \big |Z_{T,\beta '}(h)\big | + Lh^{\widetilde{\beta }-d/2}\mathrm {b}_{\beta ,\beta '}\right\} . \end{aligned}$$

Then, using the definition of \(\mathrm {b}_{T,\beta '}\) in (8.20),

$$\begin{aligned}&\sum _{\beta ' \in \mathcal {G},\beta '<\beta ^-}\sup _{\rho _b\in \Sigma _T(\beta ,L)}\psi _{\nu }^{-2}\ \mathbf {E}_b\left[ \big |\widetilde{g}_T^j(x_0)-(b^j\rho _{b})(x_0)\big |^2 \ {1\!\!1}\big \{\widehat{\beta }_T^j=\beta '\big \} \ {1\!\!1}\big \{A_{T,\beta '}\big \}\right] \\&\quad \le ~ \sum _{\beta ' \in \mathcal {G},\beta '<\beta ^-}\sup _{\rho _b\in \Sigma _T(\beta ,L)}\psi _\nu ^{-2}\ \mathbf {E}_b\Bigg [\Big (\mathrm {b}_{T,\beta '}(1+\overline{\delta }_T)+ \sup _{h\in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |\Big )^2 \ {1\!\!1}\big \{\widehat{\beta }_T^j=\beta '\big \}\Bigg ]\\&\quad \le ~ 2(1+\overline{\delta }_T)^2 \sum _{\beta ' \in \mathcal {G},\beta '<\beta ^-}\sup _{\rho _b\in \Sigma _T(\beta ,L)} \psi _\nu ^{-2}\mathrm {b}_{T,\beta '}^2\ \mathbf {P}_b\Big (\widehat{\beta }_T^j=\beta '\Big )\\&\qquad + 2\, \sum _{\beta ' \in \mathcal {G},\beta '<\beta ^-}\sup _{\rho _b\in \Sigma _T(\beta ,L)}\psi _\nu ^{-2} \ \mathbf {E}_b\bigg [\Big (\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |\Big )^2 \ {1\!\!1}\big \{\widehat{\beta }_T^j=\beta '\big \}\bigg ]\\&\quad \le ~ 2\ \sum _{\beta ' \in \mathcal {G},\beta '<\beta ^-}\sup _{\rho _b\in \Sigma _T(\beta ,L)}\psi _\nu ^{-2}\left( \mathrm {b}_{T,\beta '}^2 \big (1+\overline{\delta }_T\big )^2 +\tau _T^2(\beta ')\right) \mathbf {P}_b\Big (\widehat{\beta }_T^j=\beta '\Big )\\&\qquad + 2\sum _{\beta ' \in \mathcal {G}, \beta '<\beta ^-}\sup _{\rho _b\in \Sigma _T(\beta ,L)}\psi _\nu ^{-2}\, \mathbf {E}_b\bigg [\Big (\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big |\Big )^2\, {1\!\!1}\Big \{\sup _{h \in H_{T,\beta '}}\big |Z_{T,\beta '}(h)\big | > \tau _T(\beta ')\Big \}\bigg ]\\&\quad =: g_1(\nu )+g_2(\nu ). \end{aligned}$$

The term \(g_1(\nu )\) is bounded from above by exploiting the fact that the probability that \(\widehat{\beta }_T^j\) substantially underestimates \(\beta \) is small whenever \(\rho _b\in \Sigma _T(\beta ,L)\). Recall that \(m\) is the cardinality of the grid \(\mathcal {G}\).

Lemma 8

(Probability of undershooting) Let \(\beta \in [\beta _*,\infty )\), \(\beta ' \in \mathcal {G}\), \(\beta ' < \beta ^-\), \(L\in \left[ L_*,L^*\right] \), and \(\nu = (\beta ,L)\). Then there exists a constant \(K\), depending only on \(\beta _*,L_*,L^*,d\) and \(\sigma \), such that, for any \(b\in \widetilde{\varPi }(c_1,c_2,\sigma )\),

$$\begin{aligned} \sup _{\rho _b\in \varSigma _T(\beta ,L)} \mathbf {P}_b\left( \widehat{\beta }_T^j = \beta '\right) \le Km\big (T^{-d/(2\beta ')} + o(T^{-1})\big ). \end{aligned}$$


The proof relies substantially on applications of Lemma 2. Since the basic arguments are similar to those used in the proof of Lemma 4.8 in Butucea [3] and the proof of Lemma 5 in Klemelä and Tsybakov [11], we omit the proof and refer to Strauch [21]. \(\square \)

Let \(\beta \in \left[ \beta _*,\beta _T\right] \), \(\beta ' \in \mathcal {G}\), \(L \in \left[ L_*,L^*\right] \), \(\nu = (\beta ,L)\). By means of Lemma 8 and using relation (8.24) in Lemma 7, we obtain

$$\begin{aligned} g_1(\nu )&= 2\ \sum _{\beta ' \in \mathcal {G},\beta '<\beta ^-}\sup _{\rho _b\in \varSigma _T(\beta ,L)}\psi _\nu ^{-2}\left( \mathrm {b}_{T,\beta '}^2 \big (1+\overline{\delta }_T\big )^2 +\tau _T^2(\beta ')\right) \mathbf {P}_b\left( \widehat{\beta }_T^j=\beta '\right) \\&\le 2D_5Km \sum _{\beta ' \in \mathcal {G},\beta ' < \beta ^-}\log T\ T^{-d/(2\beta )+d/(2\beta ')} \ \big (T^{-d/(2\beta ')} + o(T^{-1})\big ). \end{aligned}$$
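
As a worked check of the exponents in the last display, each summand can be expanded as follows. This is only a sketch: it assumes that every grid point satisfies \(\beta '>d/2\), which is suggested by the factor \(h^{\widetilde{\beta }-d/2}\) above but is an assumption of this remark, and that the \(o(T^{-1})\) term is uniform over the grid.

$$\begin{aligned} \log T\ T^{-d/(2\beta )+d/(2\beta ')}\big (T^{-d/(2\beta ')}+o(T^{-1})\big )&= \log T\ T^{-d/(2\beta )} + \log T\ T^{-d/(2\beta )}\, T^{d/(2\beta ')}\, o(T^{-1})\\&= \log T\ T^{-d/(2\beta )}\Big (1+o\big (T^{d/(2\beta ')-1}\big )\Big ). \end{aligned}$$

Under the stated assumption \(T^{d/(2\beta ')-1}\le 1\), so the bracket is \(1+o(1)\). Since the sum has at most \(m\) terms, \(g_1(\nu )\) is then at most of order \(m^2\log T\ T^{-d/(2\beta )}\), and its decay is governed by how fast \(m\) and \(\beta _T\) are allowed to grow with \(T\).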

In view of the upper bound on the cardinality \(m\) of the grid \(\mathcal {G}= \mathcal {G}_T\) in (8.32), it follows that \(\lim _{T\rightarrow \infty }\sup _{\nu \in \mathcal {B}_T}g_1(\nu ) = 0,\) and the first assertion in Lemma 6(c) immediately gives \(\lim _{T\rightarrow \infty }\sup _{\nu \in \mathcal {B}_T}g_2(\nu ) = 0\). Note finally that, for any \(\beta ' \in \mathcal {G}\),

$$\begin{aligned}&\sup _{\rho _b\in \varSigma _T(\beta ,L)} \psi _\nu ^{-2}~ \mathbf {E}_b\left[ \big |\widehat{g}^j_{T,\beta '}(x_0)-(b^j\rho _{b})(x_0)\big |^2 \ {1\!\!1}\big \{\widehat{\beta }_T^j=\beta '\big \} \ {1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}\right] \\&\quad \le \sup _{\rho _b\in \varSigma _T(\beta ,L)} \psi _\nu ^{-2}~ \sqrt{\mathbf {E}_b\left[ \big |\widehat{g}_{T,\beta '}^j (x_0)-(b^j\rho _{b})(x_0)\big |^4\right] \, \mathbf {P}_b\left( A_{T,\beta '}^{\mathrm{c }}\right) } \lesssim \sqrt{\mathbf {P}_b\left( A_{T,\beta '}^{\mathrm{c }}\right) }. \end{aligned}$$
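
The first inequality in the preceding display is an instance of the Cauchy-Schwarz inequality. As a sketch, with the shorthand \(X=\widehat{g}^j_{T,\beta '}(x_0)-(b^j\rho _{b})(x_0)\) and \(E=\big \{\widehat{\beta }_T^j=\beta '\big \}\) introduced here only for illustration:

$$\begin{aligned} \mathbf {E}_b\big [X^2\, {1\!\!1}\{E\}\, {1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}\big ]&\le \mathbf {E}_b\big [X^2\, {1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}\big ]\\&\le \sqrt{\mathbf {E}_b\big [X^4\big ]\ \mathbf {E}_b\big [{1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}^2\big ]} = \sqrt{\mathbf {E}_b\big [X^4\big ]\ \mathbf {P}_b\big (A_{T,\beta '}^{\mathrm{c }}\big )}. \end{aligned}$$

The first line simply drops the indicator of \(E\), which is bounded by one; the second uses that the square of an indicator is the indicator itself.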

Consequently, taking into account (8.33), we obtain, uniformly in \(\beta \) and \(\beta '\),

$$\begin{aligned}&\sum _{\beta ' \in \mathcal {G},\beta '<\beta ^-}\sup _{\rho _b\in \varSigma _T(\beta ,L)}\psi _{\nu }^{-2}\ \mathbf {E}_b\left[ \big |\widetilde{g}_T^j(x_0)-(b^j\rho _{b})(x_0)\big |^2 \ {1\!\!1}\big \{\widehat{\beta }_T^j=\beta '\big \} \ {1\!\!1}\big \{A_{T,\beta '}^{\mathrm{c }}\big \}\right] \\&\quad = o_T(1), \end{aligned}$$

thus completing the proof of (8.34). \(\square \)

Cite this article

Strauch, C. Exact adaptive pointwise drift estimation for multidimensional ergodic diffusions. Probab. Theory Relat. Fields 164, 361–400 (2016). https://doi.org/10.1007/s00440-014-0614-4

Keywords

  • Ergodic diffusion
  • Minimax drift estimation
  • Exact constants in nonparametric smoothing
  • Sharp adaptivity
  • Pointwise risk

Mathematics Subject Classification

  • 62M05
  • 62G07
  • 62G20