Data-Driven Bandwidth Selection for Recursive Kernel Density Estimators Under Double Truncation

Published in Sankhya B.


Abstract

In this paper we propose a data-driven bandwidth selection procedure for recursive kernel density estimators under double truncation. We show that, using the selected bandwidth and a special stepsize, the proposed recursive estimators outperform the nonrecursive one in terms of estimation error in many situations. We corroborate these theoretical results through a simulation study, and then apply the proposed estimators to data on the luminosity of quasars in astronomy.

References

  • Bilker, W.B. and Wang, M.C. (1996). A semiparametric extension of the Mann-Whitney test for randomly truncated data. Biometrics 52, 10–20.


  • Bojanic, R. and Seneta, E. (1973). A unified theory of regularly varying sequences. Math. Z. 134, 91–106.


  • Efron, B. and Petrosian, V. (1999). Nonparametric methods for doubly truncated data. J. Amer. Statist. Assoc. 94, 824–834.


  • Galambos, J. and Seneta, E. (1973). Regularly varying sequences. Proc. Amer. Math. Soc. 41, 110–116.


  • Klein, J.P. and Moeschberger, M.L. (2003). Survival Analysis: Techniques for Censored and Truncated Data. Springer, New York.


  • Mokkadem, A. and Pelletier, M. (2007). A companion for the Kiefer-Wolfowitz-Blum stochastic approximation algorithm. Ann. Statist. 35, 1749–1772.


  • Mokkadem, A., Pelletier, M. and Slaoui, Y. (2009). The stochastic approximation method for the estimation of a multivariate probability density. J. Statist. Plann. Inference 139, 2459–2478.


  • Moreira, C. and de Uña-Àlvarez, J. (2010). A semiparametric estimator of survival for doubly truncated data. Statist. Med. 29, 3147–3159.


  • Moreira, C. and de Uña-Àlvarez, J. (2012). Kernel density estimation with doubly truncated data. Electron. J. Stat. 6, 501–521.


  • Moreira, C., de Uña-Àlvarez, J. and Crujeiras, R. (2010). DTDA: an R package to analyze randomly truncated data. J. Stat. Softw. 37, 1–20.


  • Parzen, E. (1962). On estimation of a probability density and mode. Ann. Math. Statist. 33, 1065–1076.


  • Révész, P. (1973). Robbins-Monro procedure in a Hilbert space and its application in the theory of learning processes I. Studia Sci. Math. Hung. 8, 391–398.


  • Révész, P. (1977). How to apply the method of stochastic approximation in the non-parametric estimation of a regression function. Math. Operationsforsch. Statist., Ser. Statistics. 8, 119–126.


  • Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. Ann. Math. Statist. 27, 832–837.


  • Shen, P.S. (2010). Nonparametric analysis of doubly truncated data. Ann. Inst. Statist. Math. 62, 835–853.


  • Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.


  • Slaoui, Y. (2013). Large and moderate principles for recursive kernel density estimators defined by stochastic approximation method. Serdica Math. J. 39, 53–82.


  • Slaoui, Y. (2014a). Bandwidth selection for recursive kernel density estimators defined by stochastic approximation method. J. Probab. Stat. 2014, 739640.

  • Slaoui, Y. (2014b). The stochastic approximation method for the estimation of a distribution function. Math. Methods Statist. 23, 306–325.

  • Slaoui, Y. (2015). Plug-In Bandwidth selector for recursive kernel regression estimators defined by stochastic approximation method. Stat. Neerl. 69, 483–509.


  • Slaoui, Y. (2016a). Optimal bandwidth selection for semi-recursive kernel regression estimators. Stat. Interface. 9, 375–388.

  • Slaoui, Y. (2016b). On the choice of smoothing parameters for semi-recursive nonparametric hazard estimators. J. Stat. Theory Pract. 10, 656–672.

  • Tsybakov, A.B. (1990). Recurrent estimation of the mode of a multidimensional distribution. Probl. Inf. Transm. 8, 119–126.


Acknowledgments


We are grateful to the referee and an Editor for their helpful comments, which have led to a substantially improved version of the paper.

Author information



Corresponding author

Correspondence to Yousri Slaoui.


Appendix A: Proofs

First, we introduce the following asymptotically equivalent version of the proposed recursive estimator (2.3),

$$\begin{array}{@{}rcl@{}} f_{n}\left( x\right)=\left( 1-\gamma_{n}\right)f_{n-1}\left( x\right)+\gamma_{n}\alpha h_{n}^{-1}\frac{K\left( h_{n}^{-1}\left[x-X_{n}\right]\right)}{G\left( X_{n}\right)}, \end{array} $$

and the asymptotically equivalent version of the non-recursive estimator (2.4),

$$\begin{array}{@{}rcl@{}} \widetilde{f}_{n}\left( x\right)=\frac{\alpha}{nh_{n}}\sum\limits_{k = 1}^{n}\frac{K\left( \frac{x-X_{k}}{h_{n}}\right)}{G_{n}\left( X_{k}\right)}. \end{array} $$
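As an illustration, the two estimators above can be sketched in a few lines. The following Python sketch is ours (the paper's own implementation, in R, is given in Appendix B); it assumes a Gaussian kernel and, for simplicity, treats \(\alpha\) and the biasing function G as known rather than estimated.

```python
import numpy as np

def K(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def recursive_kde(x, X, h, gamma, G, alpha, f0=0.0):
    """Recursive estimator: f_n(x) = (1 - gamma_n) f_{n-1}(x)
    + gamma_n * alpha * K((x - X_n)/h_n) / (h_n * G(X_n))."""
    f = f0
    for Xn, hn, gn in zip(X, h, gamma):
        f = (1.0 - gn) * f + gn * alpha * K((x - Xn) / hn) / (hn * G(Xn))
    return f

def nonrecursive_kde(x, X, hn, G, alpha):
    """Nonrecursive estimator: (alpha / (n h_n)) * sum_k K((x - X_k)/h_n) / G(X_k)."""
    X = np.asarray(X)
    n = len(X)
    return alpha * np.sum(K((x - X) / hn) / G(X)) / (n * hn)
```

A convenient sanity check: with the stepsize \(\gamma_{n}=n^{-1}\) and a constant bandwidth, the recursion reproduces the nonrecursive average exactly.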

Remark 1.

The consistency results of \(\frac {\alpha _{n}}{G_{n}\left (.\right )}\) can be obtained from Shen (2010) and Moreira and de Uña-Àlvarez (2010).

Throughout this section we use the following notation:

$$\begin{array}{@{}rcl@{}} {\Pi}_{n}&=&\prod\limits_{j = 1}^{n}\left( 1-\gamma_{j}\right),\\ Z_{n}\left( x\right)&=&h_{n}^{-1}\alpha \frac{K\left( \frac{x-X_{n}}{h_{n}}\right)}{G\left( X_{n}\right)}. \end{array} $$

Let us first state the following technical lemma.

Lemma 1.

Let \(\left (v_{n}\right )\in \mathcal {GS}\left (v^{*}\right )\), \(\left (\eta _{n}\right )\in \mathcal {GS}\left (-\eta \right )\), and \(m>0\) such that \(m-v^{*}\xi >0\), where \(\xi \) is defined in Eq. 3.2. We have

$$\begin{array}{@{}rcl@{}} \lim_{n \to +\infty}v_{n}{{\Pi}_{n}^{m}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-m}\gamma_{k}v_{k}^{-1} =\left( m-v^{*}\xi\right)^{-1}. \end{array} $$

Moreover, for any positive sequence \(\left (\beta _{n}\right )\) such that \(\lim _{n \to +\infty }\beta _{n}= 0\), and all \(C \in \mathbb {R}\),

$$\begin{array}{@{}rcl@{}} \lim_{n \to +\infty}v_{n}{{\Pi}_{n}^{m}}\left[\sum\limits_{k = 1}^{n} {\Pi}_{k}^{-m} \eta_{k}v_{k}^{-1}\beta_{k}+C\right]= 0. \end{array} $$

Lemma 1 is applied repeatedly throughout the proofs; it is its application that requires Assumption \((A2)(iii)\) on the limit of \((n\gamma _{n})\) as n goes to infinity.
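For intuition, the first limit in Lemma 1 can be checked numerically. The sketch below is ours and assumes that \(\xi\) denotes \(\lim_{n\to\infty}(n\gamma_{n})^{-1}\), as in the companion stochastic-approximation literature (Mokkadem et al. 2009); it takes \(\gamma_{n}=\gamma_{0}/n\) (so \(\xi = 1/\gamma_{0}\)), \(v_{n}=n^{1/2}\) (so \(v^{*}=1/2\)), and \(m = 2\).

```python
import numpy as np

# Numerical illustration of the first limit in Lemma 1, assuming
# xi = lim (n * gamma_n)^{-1}, with gamma_n = gamma0/n and v_n = n^{1/2}.
n = 200_000
gamma0 = 0.9
j = np.arange(1, n + 1)
gamma = gamma0 / j                       # stepsize, in GS(-1)
v = np.sqrt(j)                           # v_n, in GS(1/2)
log_Pi = np.cumsum(np.log1p(-gamma))     # log Pi_n = sum_j log(1 - gamma_j)

m = 2
# v_n Pi_n^m sum_k Pi_k^{-m} gamma_k / v_k, computed in log space for stability
terms = np.exp(m * (log_Pi[-1] - log_Pi)) * gamma / v
value = v[-1] * np.sum(terms)

xi = 1.0 / gamma0                        # assumed definition of xi
target = 1.0 / (m - 0.5 * xi)            # (m - v* xi)^{-1}
print(value, target)
```

With these choices the finite-n value is already close to the limit \((m-v^{*}\xi)^{-1}\approx 0.6923\).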

Our proofs are organized as follows: Proposition 1 is proved in Section A.1, Proposition 2 in Section A.2, and Theorem 1 in Section A.3.

A.1 Proof of Proposition 1

In view of Eqs. 5.1 and 5.3, we have

$$\begin{array}{@{}rcl@{}} f_{n}\left( x\right) - f\left( x\right)\!\!\!&=&\!\!\!\left( 1-\gamma_{n}\right)\left( f_{n-1}\left( x\right)-f\left( x\right)\right)+\gamma_{n}\left( Z_{n}\left( x\right)-f\left( x\right)\right)\\ \!\!\!&=&\!\!\!\sum\limits_{k = 1}^{n-1}\left[\prod\limits_{j=k + 1}^{n}\left( 1-\gamma_{j}\right)\right]\gamma_{k}\left( Z_{k}\left( x\right) - f\left( x\right)\right)+\gamma_{n}\left( Z_{n}\left( x\right)-f\left( x\right)\right) \\ &&\!\!\!+\left[\prod\limits_{j = 1}^{n}\left( 1-\gamma_{j}\right)\right]\left( f_{0}\left( x\right)-f\left( x\right)\right)\\ \!\!\!&=&\!\!\!{\Pi}_{n}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-1}\gamma_{k}\left( Z_{k}\left( x\right)-f\left( x\right)\right)+{\Pi}_{n}\left( f_{0}\left( x\right)-f\left( x\right)\right). \end{array} $$
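The unrolled form above can be verified directly: iterating the recursion with arbitrary stepsizes and arbitrary values standing in for \(Z_{k}(x)\) reproduces \({\Pi}_{n}\sum_{k}{\Pi}_{k}^{-1}\gamma_{k}Z_{k}(x)+{\Pi}_{n}f_{0}(x)\); the \((Z_{k}-f)\) form is the same identity, because the weights \({\Pi}_{n}{\Pi}_{k}^{-1}\gamma_{k}\) together with \({\Pi}_{n}\) sum to one. A short Python check, with hypothetical numerical inputs:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
gamma = 0.8 / np.arange(2, n + 2)        # arbitrary stepsizes in (0, 1)
Z = rng.normal(size=n)                   # stand-ins for Z_k(x)
f0 = 0.7                                 # arbitrary initial value

# Direct iteration of f_k = (1 - gamma_k) f_{k-1} + gamma_k Z_k
f = f0
for gk, zk in zip(gamma, Z):
    f = (1.0 - gk) * f + gk * zk

# Closed form from the unrolled recursion:
# f_n = Pi_n sum_k Pi_k^{-1} gamma_k Z_k + Pi_n f_0
Pi = np.cumprod(1.0 - gamma)
closed = Pi[-1] * np.sum(gamma * Z / Pi) + Pi[-1] * f0
print(f, closed)
```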

It follows that

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left( f_{n}\left( x\right)\right)-f\left( x\right) \!\!\!&=&\!\!\!{\Pi}_{n}{\sum}_{k = 1}^{n}{\Pi}_{k}^{-1}\gamma_{k}\left( \mathbb{E}\left( Z_{k}\left( x\right)\right) - f\left( x\right)\right)+{\Pi}_{n}\left( f_{0}\left( x\right)-f\left( x\right)\right). \end{array} $$

Moreover, for simplicity, we let \(H\left (x\right )=\frac {f_{X}\left (x\right )}{G\left (x\right )}\), and in view of Eq. ??, we have

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left[Z_{k}^{p}\left( x\right)\right] \!\!\!&=&\!\!\!h_{k}^{-p}\alpha^{p}\mathbb{E}\left[\frac{K^{p}\left( \frac{x-X_{k}}{h_{k}}\right)}{G^{p}\left( X_{k}\right)}\right]\\ \!\!\!&=&\!\!\!h_{k}^{-p}\alpha^{p}{\int}_{\mathbb{R}}K^{p}\left( \frac{x-y}{h_{k}}\right)\frac{f_{X}\left( y\right)}{G^{p}\left( y\right)}dy\\ \!\!\!&=&\!\!\!h_{k}^{-p + 1}\alpha^{p}{\int}_{\mathbb{R}}K^{p}\left( z\right)\frac{f_{X}\left( x-zh_{k}\right)}{G^{p}\left( x-zh_{k}\right)}dz\\ \!\!\!&=&\!\!\!h_{k}^{-p + 1}\alpha^{p}{\int}_{\mathbb{R}}K^{p}\left( z\right)\frac{H\left( x-zh_{k}\right)}{G^{p-1}\left( x-zh_{k}\right)}dz\\ \!\!\!&=&\!\!\!h_{k}^{-p + 1}\alpha^{p-1}{\int}_{\mathbb{R}}K^{p}\left( z\right)\frac{f\left( x-zh_{k}\right)}{G^{p-1}\left( x-zh_{k}\right)}dz. \end{array} $$

Then, it follows from Eq. 5.4, for \(p = 1\), that

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left[Z_{k}\left( x\right)\right]-f\left( x\right)\!\!\!&=&\!\!\!{\int}_{\mathbb{R}}K\left( z\right)\left[f\left( x-zh_{k}\right)-f\left( x\right)\right]dz\\ \!\!\!&=&\!\!\!\frac{{h_{k}^{2}}}{2}f^{\left( 2\right)}\left( x\right)\mu_{2}\left( K\right)+\eta_{k}\left( x\right), \end{array} $$


where

$$\begin{array}{@{}rcl@{}} \eta_{k}\left( x\right)={\int}_{\mathbb{R}}K\left( z\right)\left[f\left( x-zh_{k}\right)-f\left( x\right)-\frac{1}{2}z^{2}{h_{k}^{2}}f^{\left( 2\right)}\left( x\right)\right]dz, \end{array} $$

and, since f is bounded and continuous at x, we have \(\lim _{k\to \infty }\eta _{k}\left (x\right )= 0\). In the case \(a\leq \gamma /5\), we have \(\lim _{n\to \infty }\left (n\gamma _{n}\right )>2a\); the application of Lemma 1 then gives

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left[f_{n}\left( x\right)\right]-f\left( x\right)\!\!\!&=&\!\!\!\frac{1}{2}f^{\left( 2\right)}\left( x\right){\int}_{\mathbb{R}}z^{2}K\left( z\right)dz{\Pi}_{n}{\sum}_{k = 1}^{n}{\Pi}_{k}^{-1}\gamma_{k}{h_{k}^{2}}\left[1+o\left( 1\right)\right]\\ &&+{\Pi}_{n}\left( f_{0}\left( x\right)-f\left( x\right)\right)\\ \!\!\!&=&\!\!\!\frac{1}{2\left( 1-2a\xi\right)}f^{\left( 2\right)}\left( x\right)\mu_{2}\left( K\right)\left[h_{n}^{2}+o\left( 1\right)\right], \end{array} $$

and Eq. 3.3 follows. In the case \(a>\gamma /5\), we have \({h_{n}^{2}}=o\left (\sqrt {\gamma _{n}h_{n}^{-1}}\right )\), and \(\lim _{n\to \infty }\left (n\gamma _{n}\right )>\left (\gamma -a\right )/2\); then Lemma 1 ensures that

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left[f_{n}\left( x\right)\right]-f\left( x\right)\!\!\!&=&\!\!\!{\Pi}_{n}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-1}\gamma_{k}o\left( \sqrt{\gamma_{k}h_{k}^{-1}}\right)+O\left( {\Pi}_{n}\right)\\ \!\!\!&=&\!\!\!o\left( \sqrt{\gamma_{n}h_{n}^{-1}}\right), \end{array} $$

which gives (3.4). Further, we have

$$\begin{array}{@{}rcl@{}} Var\left[f_{n}\left( x\right)\right]\!\!\!&=&\!\!\!{{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}Var\left[Z_{k}\left( x\right)\right]\\ \!\!\!&=&\!\!\!{{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}\left( \mathbb{E}\left( {Z_{k}^{2}}\left( x\right)\right)-\left( \mathbb{E}\left( Z_{k}\left( x\right)\right)\right)^{2}\right). \end{array} $$

Moreover, in view of Eq. 5.4, for \(p = 2\), we have

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left( {Z_{k}^{2}}\left( x\right)\right)\!\!\!&=&\!\!\!h_{k}^{-1}\alpha{\int}_{\mathbb{R}}\frac{f\left( x-zh_{k}\right)}{G\left( x-zh_{k}\right)}K^{2}\left( z\right)dz\\ \!\!\!&=&\!\!\!h_{k}^{-1}\alpha \frac{f\left( x\right)}{G\left( x\right)}{\int}_{\mathbb{R}}K^{2}\left( z\right)dz+\nu_{k}\left( x\right), \end{array} $$


where

$$\begin{array}{@{}rcl@{}} \nu_{k}\left( x\right)=h_{k}^{-1}\alpha{\int}_{\mathbb{R}}K^{2}\left( z\right)\left[\frac{f\left( x-zh_{k}\right)}{G\left( x-zh_{k}\right)}-\frac{f\left( x\right)}{G\left( x\right)}\right]dz. \end{array} $$

Moreover, it follows from Eq. 5.5 that

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left[Z_{k}\left( x\right)\right]&=&f\left( x\right)+\widetilde{\nu}_{k}\left( x\right), \end{array} $$


where

$$\begin{array}{@{}rcl@{}} \widetilde{\nu}_{k}\left( x\right)={\int}_{\mathbb{R}}K\left( z\right)\left[f\left( x-zh_{k}\right)-f\left( x\right)\right]dz. \end{array} $$

Then, it follows from Eqs. 5.6, 5.7 and 5.8 that

$$\begin{array}{@{}rcl@{}} Var\left[f_{n}\left( x\right)\right]&=&\alpha \frac{f\left( x\right)}{G\left( x\right)}R\left( K\right){{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}h_{k}^{-1}+{{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}\nu_{k}\left( x\right)\\ &&-f^{2}\left( x\right){{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}-2f\left( x\right){{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}\widetilde{\nu}_{k}\left( x\right)\\ &&-{{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}\widetilde{\nu}_{k}^{2}\left( x\right). \end{array} $$

Since f and \(fG^{-1}\) are bounded and continuous, we have \(\lim _{k\to \infty }\nu _{k}\left (x\right )= 0\) and \(\lim _{k\to \infty }\widetilde {\nu }_{k}\left (x\right )= 0\). In the case \(a\geq \gamma /5\), we have \(\lim _{n\to \infty }\left (n\gamma _{n}\right )>\left (\gamma -a\right )/2\), and the application of Lemma 1 gives

$$\begin{array}{@{}rcl@{}} Var\left[f_{n}\left( x\right)\right]&=&\frac{\gamma_{n}}{h_{n}}\left( 2-\left( \gamma-a\right)\xi\right)^{-1}\alpha \frac{f\left( x\right)}{G\left( x\right)}R\left( K\right) +o\left( \frac{\gamma_{n}}{h_{n}}\right), \end{array} $$

which proves (3.5). Now, in the case \(a<\gamma /5\), we have \(\gamma _{n}h_{n}^{-1}=o\left ({h_{n}^{4}}\right )\), and \(\lim _{n\to \infty }\left (n\gamma _{n}\right )>2a\), then the application of Lemma 1 gives

$$\begin{array}{@{}rcl@{}} Var\left[f_{n}\left( x\right)\right]&=&{{\Pi}_{n}^{2}}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}\gamma_{k}o\left( {h_{k}^{4}}\right)\\ &=&o\left( {h_{n}^{4}}\right), \end{array} $$

which proves (3.6).

A.2 Proof of Proposition 2

Proposition 2 is proved by following similar steps to those in the proof of Proposition 2 of Mokkadem et al. (2009).

A.3 Proof of Theorem 1

We first establish that, if \(a\geq \gamma /5\), then

$$\begin{array}{@{}rcl@{}} \sqrt{\gamma_{n}^{-1} h_{n}}\left( f_{n}\left( x\right) - \mathbb{E}\left[f_{n}\left( x\right)\right]\right)\stackrel{\mathcal{D}}{\rightarrow}\mathcal{N}\left( 0,\left( 2 - \left( \gamma - a\right)\xi\right)^{-1}\alpha \frac{f\left( x\right)}{G\left( x\right)}R\left( K\right)\right). \end{array} $$

In the case when \(a>\gamma /5\), Part 1 of Theorem 1 follows from the combination of Eqs. 3.4 and 5.9. In the case when \(a=\gamma /5\), Parts 1 and 2 of Theorem 1 follow from the combination of Eqs. 3.3 and 5.9. In the case \(a<\gamma /5\), Eq. 3.6 implies that

$$\begin{array}{@{}rcl@{}} h_{n}^{-2}\left( f_{n}\left( x\right)-\mathbb{E}\left( f_{n}\left( x\right)\right)\right)\stackrel{\mathbb{P}}{\rightarrow}0, \end{array} $$

and the application of Eq. 3.3 gives Part 2 of Theorem 1.

We now prove (5.9). In view of Eq. 2.3, we have

$$\begin{array}{@{}rcl@{}} f_{n}\left( x\right)-\mathbb{E}\left[f_{n}\left( x\right)\right]={\Pi}_{n}\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-1}\gamma_{k}\left( Z_{k}\left( x\right)-\mathbb{E}\left[Z_{k}\left( x\right)\right]\right). \end{array} $$


Set

$$\begin{array}{@{}rcl@{}} Y_{k}\left( x\right)&=&{\Pi}_{k}^{-1}\gamma_{k}\left( Z_{k}\left( x\right)-\mathbb{E}\left[Z_{k}\left( x\right)\right]\right). \end{array} $$

The application of Lemma 1 ensures that

$$\begin{array}{@{}rcl@{}} {v_{n}^{2}}&=&{\sum}_{k = 1}^{n}Var\left( Y_{k}\left( x\right)\right)\\ &=&\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}Var\left( Z_{k}\left( x\right)\right)\\ &=&\sum\limits_{k = 1}^{n}{\Pi}_{k}^{-2}{\gamma_{k}^{2}}h_{k}^{-1}\left[\alpha \frac{f\left( x\right)}{G\left( x\right)}R\left( K\right)+o\left( 1\right)\right]\\ &=&{\Pi}_{n}^{-2}\gamma_{n}h_{n}^{-1}\left[\left( 2-\left( \gamma-a\right)\xi\right)^{-1}\alpha \frac{f\left( x\right)}{G\left( x\right)}R\left( K\right)+o\left( 1\right)\right]. \end{array} $$

On the other hand, we have, for all \(p>0\),

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left[\left|Z_{k}\left( x\right)\right|^{2+p}\right] &=& O\left( \frac{1}{h_{k}^{1+p}}\right), \end{array} $$

and, since \(\lim _{n\to \infty }\left (n\gamma _{n}\right )>\left (\gamma -a\right )/2\), there exists \(p>0\) such that \(\lim _{n\to \infty }\) \(\left (n\gamma _{n}\right )>\frac {1+p}{2+p}\left (\gamma -a\right )\). Applying Lemma 1, we get

$$\begin{array}{@{}rcl@{}} {\sum}_{k = 1}^{n}\mathbb{E}\left[\left|Y_{k}\left( x\right)\right|^{2+p}\right]&=&O\left( \sum\limits_{k = 1}^{n} {\Pi}_{k}^{-2-p}\gamma_{k}^{2+p}\mathbb{E}\left[\left|Z_{k}\left( x\right)\right|^{2+p}\right]\right)\\ &=&O\left( \sum\limits_{k = 1}^{n} \frac{{\Pi}_{k}^{-2-p}\gamma_{k}^{2+p}}{h_{k}^{1+p}}\right)\\ &=&O\left( \frac{\gamma_{n}^{1+p}}{{\Pi}_{n}^{2+p}h_{n}^{1+p}}\right), \end{array} $$

and we thus obtain

$$\begin{array}{@{}rcl@{}} \frac{1}{v_{n}^{2+p}}\sum\limits_{k = 1}^{n}\mathbb{E}\left[\left|Y_{k}\left( x\right)\right|^{2+p}\right]& = & O\left( {\left[\gamma_{n}h_{n}^{-1}\right]}^{p/2}\right)=o\left( 1\right). \end{array} $$

The convergence in Eq. 5.9 then follows from the application of Lyapunov's Theorem.

Appendix B: R Source Code

Here we give the source code of the proposed method for the first model: \(U^{*}\sim \mathcal {U}\left (-1,0\right )\), \(V^{*}\sim \mathcal {U}\left (0,1\right )\) and \(X^{*}\sim \mathcal {N}\left (0,1\right )\).

n=100;       # sample size
Np=250;      # number of discretization points
niter=50;    # number of iterations

# initialization of parameters
FNR=matrix(0,niter,Np);
FR=matrix(0,niter,Np);
A1=rep(0,niter); A2=rep(0,niter); A=rep(0,niter);
Y=matrix(0,n,Np);

# generation of discretization points
D=seq(-4,4,8/(Np-1));

# Gaussian kernel
KG<-function(x){1/sqrt(2*pi)*exp(-x^2/2)}
# second derivative of the Gaussian kernel
Kd<-function(x){(x^2-1)*KG(x)}
# computing R(K)
K2G<-function(x){1/(2*pi)*exp(-x^2)}
IR=integrate(K2G, lower=-Inf, upper=Inf)$value;
# computing mu2(K)
Kx<-function(x){1/sqrt(2*pi)*x^2*exp(-x^2/2)}
mu=integrate(Kx, lower=-Inf, upper=Inf)$value;

# start of iterations
for (iter in 1:niter) {
  print(iter)

  # simulation
  Xt=rnorm(n); Ut=runif(n,-1,0); Vt=runif(n,0,1);
  XX=(Xt>=Ut)*Xt*(Xt<=Vt);
  X1=(Xt>=Ut)*Xt; X2=Xt*(Xt<=Vt);
  alpha1=mean(X1!=0); alpha2=mean(X2!=0); alpha=mean(XX!=0);
  X=Xt; U=Ut; V=Vt;
  for (i in 1:n){
    while (X[i]<U[i]|X[i]>V[i]){
      U[i]<-runif(1,-1,0)
      X[i]<-rnorm(1)
      V[i]<-runif(1,0,1)}}

  # estimation of alpha
  Gn=ecdf(X);
  G=function(x) {1-x^2}
  alpha=mean(G(XX));

  # estimation of I1 and I2
  Q1=quantile(X,0.25)
  Q3=quantile(X,0.75)
  d=(Q3-Q1)/1.349;
  c=min(sd(X),d);

  # estimation of I1 for the non-recursive estimator (see equation (18))
  # pilot bandwidth for estimating I1
  h=c*n^(-2/5);
  M1=matrix(0,ncol=n,nrow=n);
  for (i in 1:n){
    for (j in 1:n){
      M1[i,j]=KG((X[i]-X[j])/h)*(G(X[i])*G(X[j]))^(-1);}}
  I1tilde=(sum(M1)-sum(diag(M1)))/(n*(n-1)*h);

  # estimation of I2 for the non-recursive estimator (see equation (19))
  # pilot bandwidth for estimating I2
  h=c*n^(-3/14);
  M2=array(dim=c(n,n,n));
  for (i in 1:n){
    for (j in 1:n){
      for (k in 1:n){
        M2[i,j,k]=Kd((X[i]-X[j])/h)*Kd((X[i]-X[k])/h)*(G(X[j])*G(X[k]))^(-1);}}}
  L=rep(0,n);
  for (i in 1:n) { L[i]=sum(diag(M2[i,,]));}
  I2tilde=(sum(M2)-sum(L))/(n^3*h^6);

  # estimation of I1 for the recursive estimator (see equation (12))
  # pilot stepsize for estimating I1
  gam=1.36/c(2:n);
  Gl=1-gam;
  Pn=prod(1-gam);
  ng=length(Gl);
  L1=rep(0,ng);
  for (k in 1:ng) { L1[k]=prod(Gl[1:k]);}
  P1=L1^(-1);
  # pilot bandwidth for estimating I1
  hk=c*c(1:n)^(-2/5);
  N1=matrix(0,ncol=n,nrow=n);
  for (i in 1:ng){
    for (j in 1:ng){
      N1[i,j]=(P1[j]*gam[j]/hk[j]*KG((X[i]-X[j])/hk[j]))*(G(X[i])*G(X[j]))^(-1);}}
  I1hat=Pn*n^(-1)*(sum(N1)-sum(diag(N1)));

  # estimation of I2 for the recursive estimator (see equation (13))
  # pilot stepsize for estimating I2
  gam=1.48/c(2:n);
  Gl=1-gam;
  Pn=prod(1-gam);
  ng=length(Gl);
  L1=rep(0,ng);
  for (k in 1:ng) { L1[k]=prod(Gl[1:k]);}
  P1=L1^(-1);
  # pilot bandwidth for estimating I2
  hk=c*c(1:n)^(-3/14);
  N2=array(dim=c(ng,ng,ng));
  for (i in 1:ng){
    for (j in 1:ng){
      for (k in 1:ng){
        N2[i,j,k]=P1[j]*gam[j]*P1[k]*gam[k]*hk[j]^(-3)*hk[k]^(-3)*
          Kd((X[i]-X[j])/hk[j])*Kd((X[i]-X[k])/hk[k])*(G(X[j])*G(X[k]))^(-1);}}}
  Ln=rep(0,ng);
  for (i in 1:ng) { Ln[i]=sum(diag(N2[i,,]));}
  I2hat=Pn^2*n^(-1)*(sum(N2)-sum(Ln));

  # optimal bandwidth for the non-recursive estimator (see equation (20))
  hN=(I1tilde/I2tilde)^(1/5)*(IR/(mu^2))^(1/5)*alpha^(1/5)*n^(-1/5);
  # optimal bandwidth for the recursive estimator (see equation (15))
  hR=(3/10)^(1/5)*(I1hat/I2hat)^(1/5)*(IR/(mu^2))^(1/5)*alpha^(1/5)*c(2:n)^(-1/5);

  # non-recursive estimator
  I1=matrix(0,Np,n);
  for (i in 1:Np){
    for (j in 1:n) {
      I1[i,j]=sum(hN^(-1)*KG(hN^(-1)*(D[i]-X[j]))*G(X[j])^(-1))}}
  FNR[iter,]=alpha*rowSums(I1)/n;

  # recursive estimator with stepsize gamma_n = n^(-1)
  I2=matrix(0,Np,(n-1));
  for (i in 1:Np){
    for (j in 1:(n-1)) {
      I2[i,j]=sum(hR[j]^(-1)*KG(hR[j]^(-1)*(D[i]-X[j]))*G(X[j])^(-1))}}
  FR[iter,]=alpha*rowSums(I2)/n;
} # end of iterations

# output the results
FNR1=FNR[!rowSums(!is.finite(FNR)),]
FR1=FR[!rowSums(!is.finite(FR)),]
FNR1=colMeans(FNR1);  # nonrecursive
FR1=colMeans(FR1);    # recursive

# quantitative comparison between the two estimators (3) and (4)
MD=dnorm(D);          # density of the standard normal distribution
NR1=c(mean(abs(FNR1-MD)),mean((FNR1-MD)^2),mean(abs(FNR1/MD-1)))
NR1=round(NR1,4);
print("Non recursive")
print(NR1)
Rec1=c(mean(abs(FR1-MD)),mean((FR1-MD)^2),mean(abs(FR1/MD-1)))
Rec1=round(Rec1,4);
print("Recursive")
print(Rec1)

# qualitative comparison between the two estimators (3) and (4)
plot(D,MD,xlab="",ylab="",ylim=c(0,0.65))
Data1=data.frame(D,FNR1)
lines(Data1,type="l",lty=1,lwd=2)
Data2=data.frame(D,FR1)
lines(Data2,type="l",lty=2,lwd=2)
legend("topleft",legend=c("Nonrecursive","Recursive"),lty=c(1,2),lwd=2)




Cite this article

Slaoui, Y. Data-Driven Bandwidth Selection for Recursive Kernel Density Estimators Under Double Truncation. Sankhya B 80, 341–368 (2018).
