Abstract
Two adaptive nonparametric procedures are proposed to estimate the density of the random effects in a mixed-effect Ornstein–Uhlenbeck model. First, a kernel estimator is introduced, with a bandwidth chosen by the selection method recently developed by Goldenshluger and Lepski (Ann Stat 39:1608–1632, 2011). Then, we adapt an estimator from Comte et al. (Stoch Process Appl 7:2522–2551, 2013) to the framework of a small observation time interval. More precisely, we propose an estimator that uses deconvolution tools and depends on two tuning parameters to be chosen in a data-driven way. The selection of these two parameters is achieved through a two-dimensional penalized criterion. For both adaptive estimators, risk bounds are provided in terms of the integrated \(\mathbb {L}^2\)-error. The estimators are evaluated on simulations and show good results. Finally, these nonparametric estimators are applied to neuronal data and compared with previous parametric estimations.
Notes
We stress that this poor estimation is not due to the noise being Gaussian. Indeed, even though Fan (1991) proves that the rates are logarithmic in that case, the rates improve and can become polynomial when the density being estimated is of the same type as the noise (see Lacour 2006; Comte et al. 2006).
References
Birgé L, Massart P (1997) From model selection to adaptive estimation. Springer, New York
Birgé L, Massart P (1998) Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4:329–375
Bissantz N, Dümbgen L, Holzmann H, Munk A (2007) Nonparametric confidence bands in deconvolution density estimation. J R Stat Soc Series B (Stat Methodol) 69:483–506
Briane M, Pagès G (2006) Théorie de l’intégration. Vuibert, Paris
Butucea C, Tsybakov A (2007) Sharp optimality in density deconvolution with dominating bias II. Teor Veroyatnost i Primenen 52:336–349
Carroll R, Hall P (1988) Optimal rates of convergence for deconvolving a density. J Am Stat Assoc 83:1184–1186
Chagny G (2013) Warped bases for conditional density estimation. Math Methods Stat 22:253–282
Comte F, Genon-Catalot V, Rozenholc Y (2007) Penalized nonparametric mean square estimation of the coefficients of diffusion processes. Bernoulli 13:514–543
Comte F, Genon-Catalot V, Samson A (2013) Nonparametric estimation for stochastic differential equation with random effects. Stoch Process Appl 7:2522–2551
Comte F, Johannes J (2012) Adaptive functional linear regression. Ann Stat 40:2765–2797
Comte F, Rozenholc Y, Taupin M-L (2006) Penalized contrast estimator for adaptive density deconvolution. Can J Stat 34:431–452
Comte F, Samson A (2012) Nonparametric estimation of random-effects densities in linear mixed-effects model. J Nonparametr Stat 24:951–975
Davidian M, Giltinan D (1995) Nonlinear models for repeated measurement data. CRC Press
Delattre M, Genon-Catalot V, Samson A (2015) Estimation of population parameters in stochastic differential equations with random effects in the diffusion coefficient. ESAIM Probab Stat 19:671–688
Delattre M, Genon-Catalot V, Samson A (2016) Mixtures of stochastic differential equations with random effects: application to data clustering. J Stat Plan Inference 173:109–124
Delattre M, Lavielle M (2013) Coupling the SAEM algorithm and the extended Kalman filter for maximum likelihood estimation in mixed-effects diffusion models. Stat Interface 6:519–532
Diggle P, Heagerty P, Liang K, Zeger S (2002) Analysis of longitudinal data. Oxford University Press, Oxford
Dion C, Genon-Catalot V (2015) Bidimensional random effect estimation in mixed stochastic differential model. Stoch Inference Stoch Process 18(3):1–28
Donnet S, Foulley J, Samson A (2010) Bayesian analysis of growth curves using mixed models defined by stochastic differential equations. Biometrics 66:733–741
Donnet S, Samson A (2008) Parametric inference for mixed models defined by stochastic differential equations. ESAIM Probab Stat 12:196–218
Donnet S, Samson A (2013) A review on estimation of stochastic differential equations for pharmacokinetic–pharmacodynamic models. Adv Drug Deliv Rev 65:929–939
Donnet S, Samson A (2014) Using PMCMC in EM algorithm for stochastic mixed models: theoretical and practical issues. J Soc Fr Stat 155:49–72
Fan J (1991) On the optimal rates of convergence for nonparametric deconvolution problems. Ann Stat 19:1257–1272
Genon-Catalot V, Jacod J (1993) On the estimation of the diffusion coefficient for multi-dimensional diffusion processes. Ann Inst Henri Poincaré B Probab Stat 29:119–151
Genon-Catalot V, Larédo C (2016) Estimation for stochastic differential equations with mixed effects. Statistics. doi:10.1080/02331888.2016.1141910
Goldenshluger A, Lepski O (2011) Bandwidth selection in kernel density estimation: oracle inequalities and adaptive minimax optimality. Ann Stat 39:1608–1632
Hoffmann M (1999) Adaptive estimation in diffusion processes. Stoch Process Appl 79:135–163
Klein T, Rio E (2005) Concentration around the mean for maxima of empirical processes. Ann Probab 33:1060–1077
Kutoyants Y (2004) Statistical inference for ergodic diffusion processes. Springer, London
Lacour C (2006) Rates of convergence for nonparametric deconvolution. C R Math Acad Sci Paris 342:877–882
Lacour C, Massart P (2016) Minimal penalty for Goldenshluger–Lepski method. Stoch Process Appl. doi:10.1016/j.spa.2016.04.015
Lansky P, Sanda P, He J (2006) The parameters of the stochastic leaky integrate-and-fire neuronal model. J Comput Neurosci 21:211–223
Picchini U, De Gaetano A, Ditlevsen S (2010) Stochastic differential mixed-effects models. Scand J Stat 37:67–90
Picchini U, Ditlevsen S (2011) Practical estimation of high dimensional stochastic differential mixed-effects models. Comput Stat Data Anal 55:1426–1444
Picchini U, Ditlevsen S, De Gaetano A, Lansky P (2008) Parameters of the diffusion leaky integrate-and-fire neuronal model for a slowly fluctuating signal. Neural Comput 20:2696–2714
Pinheiro J, Bates D (2000) Mixed-effects models in S and S-PLUS. Springer, New York
Talagrand M (1996) New concentration inequalities in product spaces. Invent Math 126:505–563
Yu Y, Xiong Y, Chan Y, He J (2004) Corticofugal gating of auditory information in the thalamus: an in vivo intracellular recording study. J Neurosci 24:3060–3069
Acknowledgments
The author would like to thank Fabienne Comte and Adeline Samson for very useful discussions and advice.
Appendices
Appendix 1: Proofs
1.1 Proof of Theorem 3.1
Given \(h \in {\mathcal {H}}_{N,T}\), we denote:
Using the definition of A(h) and of \({\widehat{h}}\) we obtain
Thus,
hence, we only have to study the term \(\mathbb {E}[A(h)]\). We can decompose \(\Vert {\widehat{f}}_{h,h'}-{\widehat{f}}_{h'}\Vert ^2\) as follows:
thus
with:
According to Young’s inequality (see Theorem 9.1), we obtain
thus
Let us study the term \(D_2\). We denote \({\mathcal {B}}(1)=\{g\in \mathbb {L}^2(\mathbb {R}), \Vert g\Vert =1\}\). We define
then \(|\nu _{N,h}(g)|\le \Vert g\Vert \Vert {\widehat{f}}_{h}-\mathbb {E}[{\widehat{f}}_{h}]\Vert \) thus, the estimator \({\widehat{f}}_{h}\) satisfies:
We can also compute the scalar product which defines \(\nu _{N,h}\) and we obtain
with \(K_h^-(x):=K_h(-x)\). This finally leads to:
This bound and Eq. (19) lead us to apply Talagrand's inequality (Theorem 9.2). We have to compute three quantities: \(M\), \(H^2\) and \(v\).
First:
Secondly, the bound of Proposition 3.1 gives
Thirdly:
Let us investigate the two terms separately. Young’s inequality gives:
Then, one can write: \(K_{h'}(x-Z_{1,T})-K_{h'}(x-\phi _1)=(\phi _1-Z_{1,T}) \int _0^1 (K_{h'})'(x-\phi _1+u(\phi _1-Z_{1,T}))du\), thus
With \(\mathbb {E}[(\phi _1-Z_{1,T})^2]=\frac{\sigma ^2}{T^2}\mathbb {E}[W_1(T)^2] =\frac{\sigma ^2}{T}\), the assumption \(T^{-1}\le h^{5/2}\) leads to
Finally \(v=v_1+v_2=A_0/\sqrt{h'}\) with \(A_0=\Vert f\Vert \Vert K\Vert ^2_{4/3}+\Vert K'\Vert ^2\sigma ^2\).
If \(\kappa _1 \Vert K\Vert _1^2 \ge 40\) and \(1/(Nh) \le 1\), Talagrand's inequality (applied under the assumptions of Theorem 3.1) gives
The study of \(D_3\) can be carried out as for \(D_2\), using the same steps and tools. However, having \(K_h {\star } K_{h'}\) instead of \(K_{h'}\) adds a factor \(\Vert K\Vert _1\) in \(M\) and \(\Vert K\Vert ^2_1\) in \(H^2\) and \(v\).
Then, let us study the term \(D_4\). If \(\kappa _2 \ge 10/(3\Vert K\Vert _1^2)\), the bound (9) leads us to
thus \(D_4=0\). Finally, similarly, if \( \kappa _2\ge 10/3\), we obtain
Thus we finally obtain:
with \(c\) a constant depending on \(\Vert f\Vert ,\Vert K\Vert _1,\Vert K\Vert ,\Vert K\Vert _{4/3}\). We have thus shown that, for all \(h \in {\mathcal {H}}_{N,T}\):
where \(C_1= \max (7, 30 \Vert K\Vert _1^2+6)\) and \(C_2\) depends on \(\Vert f\Vert ,\Vert K\Vert _1,\Vert K\Vert ,\Vert K\Vert _{4/3}\). \(\square \)
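To make the selection rule analyzed in this proof concrete, the following is a minimal numerical sketch of the Goldenshluger–Lepski bandwidth choice for a kernel density estimator: \(A(h)\) compares doubly and singly smoothed estimators over the bandwidth grid, and \({\widehat{h}}\) minimizes \(A(h)+V(h)\). It is an illustration only: the Gaussian kernel, the grid-based \(\mathbb {L}^2\) norms, the function name `gl_bandwidth` and the uncalibrated penalty constant `kappa` are all assumptions, not the calibrated constants of Theorem 3.1.

```python
import numpy as np

rng = np.random.default_rng(0)

def gl_bandwidth(X, bandwidths, kappa=1.0):
    """Goldenshluger-Lepski bandwidth selection for a Gaussian-kernel
    density estimator (illustrative sketch, uncalibrated constant)."""
    grid = np.linspace(X.min() - 3.0, X.max() + 3.0, 512)
    dx = grid[1] - grid[0]
    N = X.size
    K = lambda u: np.exp(-u ** 2 / 2) / np.sqrt(2 * np.pi)   # Gaussian kernel
    norm_K1_sq = 1.0                                         # ||K||_1^2
    norm_K_sq = 1.0 / (2 * np.sqrt(np.pi))                   # ||K||^2

    # kernel estimators f_hat_h on the grid, one per bandwidth
    f_hat = {h: K((grid[:, None] - X[None, :]) / h).sum(axis=1) / (N * h)
             for h in bandwidths}
    # variance-type penalty V(h) proportional to ||K||_1^2 ||K||^2 / (N h)
    V = {h: kappa * norm_K1_sq * norm_K_sq / (N * h) for h in bandwidths}

    offsets = (np.arange(grid.size) - grid.size // 2) * dx
    def smooth(f, h):            # double smoothing: K_h * f_hat on the grid
        return np.convolve(f, K(offsets / h) / h, mode="same") * dx

    def l2_sq(f):                # squared L2 norm on the grid
        return (f ** 2).sum() * dx

    best_h, best_crit = None, np.inf
    for h in bandwidths:
        # A(h) = max over h' of [ ||f_hat_{h,h'} - f_hat_{h'}||^2 - V(h') ]_+
        A_h = max(max(l2_sq(smooth(f_hat[h], hp) - f_hat[hp]) - V[hp], 0.0)
                  for hp in bandwidths)
        crit = A_h + V[h]        # select h minimizing A(h) + V(h)
        if crit < best_crit:
            best_h, best_crit = h, crit
    return best_h

X = rng.normal(size=400)
h_star = gl_bandwidth(X, bandwidths=[0.1, 0.2, 0.4, 0.8])
print(h_star)
```

The returned bandwidth is then plugged into the ordinary kernel estimator; in practice the constant in \(V(h)\) must be calibrated, e.g. as in Lacour and Massart (2016).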
1.2 Proof of Proposition 4.1
The bias term is \(\Vert f-\mathbb {E}[{\widetilde{f}}_{m,s}]\Vert ^2\). Let us compute \(\mathbb {E}[{\widetilde{f}}_{m,s}]\). Since the \(Z_{j,\tau }\) are i.i.d. for fixed \(\tau \), and by the independence of \(\phi _1\) and \(W_1\), we obtain:
Therefore this gives \(\mathbb {E}[{\widetilde{f}}_{m,s}(x)]={f}_{m}(x)\), and \(\Vert f-\mathbb {E}[{\widetilde{f}}_{m,s}]\Vert ^2=\Vert f-{f}_{m}\Vert ^2=\frac{1}{2\pi } \int _{|u| \ge m}|f^*(u)|^2du\).
The variance term is:
\(\square \)
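As an illustration of the deconvolution estimator \({\widetilde{f}}_{m,s}\) studied in Proposition 4.1, the following sketch inverts the empirical characteristic function of the \(Z_j\), compensated by the Gaussian-noise factor \(e^{\sigma ^2s^2u^2/(2m^2)}\) appearing in the proofs, over the cutoff \([-m,m]\). The simulated data, the function name `f_tilde` and the grid sizes are assumptions made for the example.

```python
import numpy as np

def f_tilde(x, Z, m, s, sigma):
    """Cutoff deconvolution estimator: the empirical characteristic function
    of the Z_j is divided by the Gaussian-noise characteristic function
    exp(-sigma^2 s^2 u^2 / (2 m^2)) and inverted over |u| <= m.
    Illustrative sketch; Z_j stands for the estimated random effects."""
    u = np.linspace(-m, m, 501)
    du = u[1] - u[0]
    psi_hat = np.exp(1j * u[None, :] * Z[:, None]).mean(axis=0)  # empirical c.f.
    integrand = psi_hat * np.exp(sigma ** 2 * s ** 2 * u ** 2 / (2 * m ** 2))
    # inverse Fourier transform restricted to the cutoff [-m, m]
    return np.real((np.exp(-1j * np.outer(x, u)) * integrand).sum(axis=1)
                   * du / (2 * np.pi))

rng = np.random.default_rng(1)
sigma, m, s = 1.0, 4.0, 0.5
phi = rng.normal(1.0, 0.5, size=500)                 # random effects
Z = phi + rng.normal(0, sigma * s / m, size=500)     # noisy versions, sd sigma*s/m
est = f_tilde(np.array([1.0]), Z, m, s, sigma)
print(est)
```

With these values the noise is mild, so the estimate at the mode is close to the true density value \(1/(0.5\sqrt{2\pi })\approx 0.80\).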
1.3 Proof of Theorem 4.1
Let us study the term \(\Vert {\widetilde{f}}_{{\widetilde{m}},{\widetilde{s}}}-f\Vert ^2\). We decompose it into a sum of three terms; the definition (15) of \(({\widetilde{m}},{\widetilde{s}})\) then implies, for all \((m,s) \in {\mathcal {C}}\),
Now we study \({\varGamma }_{m,s}\). First:
Thus:
The last maximum can be made explicit. If \(m'\le m\), then \(\Vert f_{m'}-f_{m \wedge m'}\Vert ^2=\Vert f_{m'}-f_{m'}\Vert ^2=0\). Otherwise,
Finally:
We get the following bound for \(\varGamma _{m,s}\):
Then we gather Eqs. (25) and (26):
We first notice that our penalty function is increasing in s and m, thus we get the following bound for the last term:
Moreover, according to Proposition 4.1 and using the inequality \(\int _0^1 e^{\sigma ^2s^2v^2}dv \le e^{\sigma ^2s^2}\), we obtain, for all \( (m,s) \in {\mathcal {C}}\),
The announced result then follows from the lemma below.
Lemma 8.1
There exists a constant \(C'>0\) such that for \(\text {pen}(m,s)\) defined by \(\text {pen}(m,s)=\kappa \frac{m}{ N} e^{\sigma ^2s^2}\),
According to Lemma 8.1, to be proved next, we choose \(\text {pen}(m,s)=\kappa \frac{m}{ N} e^{\sigma ^2s^2}\); thus, there exist two constants \(C=145\) and \(C'>0\) such that
\(\square \)
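The two-dimensional selection step of Theorem 4.1 can be sketched as follows: \(\Vert {\widetilde{f}}_{m,s}\Vert ^2\) is computed via Parseval's identity and \(({\widetilde{m}},{\widetilde{s}})\) minimizes \(-\Vert {\widetilde{f}}_{m,s}\Vert ^2+\text {pen}(m,s)\) with \(\text {pen}(m,s)=\kappa m e^{\sigma ^2 s^2}/N\). In the actual procedure the \(Z_j\) depend on \((m,s)\) through \(\tau =m^2/s^2\); that dependence is deliberately ignored in this sketch, and the grids and data are assumptions.

```python
import numpy as np

def select_m_s(Z, m_grid, s_grid, sigma, kappa=24.0):
    """Data-driven choice of the cutoff m and the scale s by minimizing
    -||f~_{m,s}||^2 + pen(m,s), pen(m,s) = kappa * (m/N) * exp(sigma^2 s^2).
    Sketch only: the dependence of Z on (m, s) is ignored here."""
    N = len(Z)
    best, best_crit = None, np.inf
    for m in m_grid:
        u = np.linspace(-m, m, 401)
        du = u[1] - u[0]
        psi_hat = np.exp(1j * u[None, :] * Z[:, None]).mean(axis=0)
        for s in s_grid:
            ft = psi_hat * np.exp(sigma ** 2 * s ** 2 * u ** 2 / (2 * m ** 2))
            norm_sq = (np.abs(ft) ** 2).sum() * du / (2 * np.pi)  # Parseval
            crit = -norm_sq + kappa * m * np.exp(sigma ** 2 * s ** 2) / N
            if crit < best_crit:
                best, best_crit = (m, s), crit
    return best

rng = np.random.default_rng(2)
Z = rng.normal(1.0, 0.5, size=300)
m_t, s_t = select_m_s(Z, m_grid=[1.0, 2.0, 4.0], s_grid=[0.25, 0.5, 1.0], sigma=1.0)
print(m_t, s_t)
```

The default `kappa=24.0` echoes the theoretical requirement \(\kappa \ge 24\) from the proof of Lemma 8.1; in practice a simulation calibration of \(\kappa \) is usual.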
Proof of Lemma 8.1
For a fixed pair \((m,s)\in {\mathcal {C}}\), let us consider the subset \(S_{m}:= \{t \in \mathbb {L}^1(\mathbb {R})\cap \mathbb {L}^2(\mathbb {R}), \text {supp}(t^*)=[-m,m] \}\). For \(t \in S_{m}\),
with \(\varphi _t(x):=\frac{1}{2\pi } \int {\overline{t^*(u)}}e^{iux+\sigma ^2u^2s^2/(2m^2)}du\), then \(\nu _N(t)=\frac{1}{2\pi } \langle t^*,({\widetilde{f}}_{m,s}-f_{m})^*\rangle \). This leads to
We also have, by the Cauchy–Schwarz inequality,
thus
Then, by Proposition 4.1,
Using Fubini and Cauchy–Schwarz inequalities we obtain for all \((m,s)\in {\mathcal {C}}\):
Finally, using that \(m \le N\), \(s \le 2/\sigma \) and \(\sum _{s \in {\mathcal {S}}} s=(4/\sigma )(1-(1/2)^{P+1})< 4/\sigma \), Talagrand's inequality with \(\alpha =1/2\), provided \(4H^2 \le \text {pen}(m,s)/6\), implies
because with the definition of \({\mathcal {M}}\), \( \sum _{m \in {\mathcal {M}}} \sqrt{m} e^{-C_2 \frac{\sqrt{m}}{\Vert f\Vert }}\le a_1 \sum _{k \in \mathbb {N}} k^{1/4} e^{-a_2 k^{1/4}} < +\infty \), and \(\sum _{m \in {\mathcal {M}}} e^{-C_4 m^{1/2}}\le \sum _{k\in \mathbb {N}} e^{-a_3 k^{1/4}} <+\infty \), with \(a_1, a_2, a_3\) three positive constants. Notice that \(C'>0\) depends on \(\sigma , \Vert f\Vert \), \(\varDelta \).
We choose \(\text {pen}(m,s)=\kappa m e^{\sigma ^2 s^2} /N\) with \(\kappa \ge 24\). \(\square \)
Appendix 2
1.1 Young's inequality
This inequality can be found in Briane and Pagès (2006) for example.
Theorem 9.1
Let p, q, r be real numbers in \([1, +\infty ]\) such that
\[
\frac{1}{p}+\frac{1}{q}=1+\frac{1}{r},
\]
and let f be a function belonging to \(\mathbb {L}^p(\mathbb {R})\) and g a function belonging to \(\mathbb {L}^q(\mathbb {R})\). Then \(f\star g\) belongs to \(\mathbb {L}^r(\mathbb {R})\) and
\[
\Vert f\star g\Vert _r \le \Vert f\Vert _p\,\Vert g\Vert _q.
\]
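A quick numerical sanity check of Young's inequality, with the exponents \(p=2\), \(q=4/3\), \(r=4\) that appear in the proofs; the two test functions are arbitrary choices made for the example.

```python
import numpy as np

# Numerical check of Young's convolution inequality:
# if 1/p + 1/q = 1 + 1/r, then ||f * g||_r <= ||f||_p ||g||_q.
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]
f = np.exp(-x ** 2 / 2)        # a Gaussian bump
g = np.exp(-np.abs(x))         # a Laplace bump

def Lnorm(h, p):
    """L^p norm approximated by a Riemann sum on the grid."""
    return ((np.abs(h) ** p).sum() * dx) ** (1.0 / p)

p, q = 2.0, 4.0 / 3.0
r = 1.0 / (1.0 / p + 1.0 / q - 1.0)          # here r = 4
conv = np.convolve(f, g, mode="same") * dx   # discretized convolution f * g
lhs, rhs = Lnorm(conv, r), Lnorm(f, p) * Lnorm(g, q)
print(lhs <= rhs)
```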
1.2 Talagrand’s inequality
The following result is a consequence of the Talagrand concentration inequality (Talagrand 1996), given in Birgé and Massart (1997).
Theorem 9.2
Consider \(N \in \mathbb {N}^*\), \({\mathcal {F}}\) an at most countable class of measurable functions, and \((X_i)_{i\in \{1,\ldots ,N\}}\) a family of independent real random variables. One defines, for all \(f\in {\mathcal {F}}\),
\[
\nu _N(f) = \frac{1}{N}\sum _{i=1}^{N}\big (f(X_i)-\mathbb {E}[f(X_i)]\big ).
\]
Supposing there are three positive constants M, H and v such that \(\underset{f\in {\mathcal {F}}}{\sup } \Vert f\Vert _{\infty } \le M\),
\(\mathbb {E}[\underset{f\in {\mathcal {F}}}{\sup } |\nu _Nf| ] \le H\), and \(\underset{f\in {\mathcal {F}}}{\sup } ({1}/{N})\sum _{i=1}^{N} \mathrm {Var}(f(X_i)) \le v\), then for all \(\alpha >0\),
with \(C(\alpha )=(\sqrt{1+\alpha }-1) \wedge 1\), and \(a=\frac{1}{6}\).
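For completeness, a form of the resulting bound commonly used in this literature (see, e.g., Klein and Rio 2005) is reproduced below. This is an indicative sketch only: the numerical constants vary between statements and should be checked against the precise version invoked in the proofs.

```latex
\mathbb{E}\Big[\sup_{f\in\mathcal F}|\nu_N f|^2-2(1+2\alpha)H^2\Big]_+
\le \frac{4}{a}\left(\frac{v}{N}\,e^{-a\alpha N H^2/v}
+\frac{49\,M^2}{a\,C^2(\alpha)\,N^2}\,
e^{-\sqrt{2a}\,C(\alpha)\sqrt{\alpha}\,N H/(7M)}\right).
```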
1.3 Discretization
Indeed, if we assume that the observation times are \(t_k=k\delta , k=1,\ldots ,N\), with \(0<\delta <1\), we must study the error induced by the discretization of the \(Z_{j,\tau }\). Then, for any \(0<m^2/s^2 \le T\) we use:
to approximate \(Z_{j,m^2/s^2}\) given by (2). The corresponding estimator of f is
We investigate the error:
where the second term of the right hand side is bounded by Proposition 4.1. Then, Plancherel–Parseval’s Theorem implies:
and
thus we study the last term. For all \((m,s) \in {\mathcal {C}}\), \(m^2/s^2 \le T\),
then by Cauchy–Schwarz’s inequality we obtain
Hölder's inequality yields
Let us study \(\mathbb {E}[(X_j(s)-X_j((k-1)\delta ))^2]\), for \((k-1)\delta \le s \le k\delta \):
and Cauchy–Schwarz’s inequality gives
Finally, after simplification and using for all \(x\in \mathbb {R}^+\), \([ x] \le x\),
and we can deal with the term \(\mathbb {E}[\left( X_j(m^2/s^2)-X_j(\delta [ m^2/(s^2\delta )]) \right) ^2]\) using formula (30) and \(m^2/s^2-\delta [ m^2/(s^2\delta )] \le \delta \). Thus:
Besides, for model (1), Eq. (17) implies \(\mathbb {E}[X_j(s)^2]\le 3x_j^2+3\alpha ^2\mathbb {E}[\phi _j^2]+3\sigma ^2\), and \(0<\delta <1\) implies
with C a positive constant which does not depend on \(\delta \) or \(m^2/s^2\). Finally,
But \(s \le 2/\sigma \) and \(m=\sqrt{k\varDelta }/\sigma \), with \(k\in \mathbb {N}^*\) and \(0<\varDelta <1\), thus we obtain
Proposition 9.1
Under (A), assuming \(\mathbb {E}[\phi _j^2]<+\infty \), the estimator \({\widehat{{\widetilde{f}}}}_{m,s}\) given by (29) satisfies
Finally, if \(\varDelta \) is fixed and \(\delta \) is small, the error is acceptable. For example, if \(\delta =\varDelta \), the error is of order \(\sqrt{\delta }\).
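The discretization error discussed above can be illustrated numerically: a fine Euler path of an Ornstein–Uhlenbeck-type dynamic (an assumed specimen model, not exactly model (1)) is time-averaged at full resolution and compared with the Riemann sum built from observations on the coarse grid \(t_k=k\delta \). All parameter values and the function name `ou_path` are choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

def ou_path(phi, sigma, T, dt, x0=0.0):
    """Euler scheme for an assumed OU-type dynamic dX = (phi - X) dt + sigma dW."""
    n = int(T / dt)
    X = np.empty(n + 1)
    X[0] = x0
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    for k in range(n):
        X[k + 1] = X[k] + (phi - X[k]) * dt + sigma * dW[k]
    return X

T, dt_fine, delta = 5.0, 1e-3, 0.1
X = ou_path(phi=1.0, sigma=0.5, T=T, dt=dt_fine)
# time average (1/T) int_0^T X(s) ds at fine resolution ...
Z = X[:-1].mean()
# ... versus its Riemann approximation from observations at times k*delta
step = int(delta / dt_fine)
Z_hat = X[step::step].mean()
print(abs(Z - Z_hat))   # discretization error, small for small delta
```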
For the corresponding study of the kernel estimator, we refer to Comte et al. (2013).
Dion, C. Nonparametric estimation in a mixed-effect Ornstein–Uhlenbeck model. Metrika 79, 919–951 (2016). https://doi.org/10.1007/s00184-016-0583-y
Keywords
- Stochastic differential equations
- Ornstein–Uhlenbeck process
- Mixed-effect model
- Nonparametric estimation
- Deconvolution method
- Kernel estimator
- Neuronal data