Kernel estimation for a superpopulation probability density function under informative selection

METRON

Abstract

Kernel density estimation of the probability density function (pdf) of a response variable is considered under informative selection from a finite population. The informative selection implies that the conditional pdf of a response, given that it was selected for observation, is not the same as the inferential target, which is the unconditional pdf of the response in the superpopulation. Instead, the pdf of the observations (sample pdf) is a weighted version of the superpopulation pdf of interest. Properties of the standard kernel density estimator are described under an asymptotic framework that covers a wide range of informative selection mechanisms. The theory allows for the possibility that the selection mechanism has a parametric structure. A variety of adjustments (parametric or nonparametric) to account for the informative selection are proposed, and investigated via simulation.
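The gap between the sample pdf and the superpopulation pdf can be seen in a small simulation. The sketch below is illustrative only and is not the authors' code: the logistic selection probability `inclusion_prob`, the Poisson (independent) sampling scheme, the N(0, 1) superpopulation, and the inverse-probability weights with known selection probabilities are all assumptions made here, and the weighted estimator is just one simple adjustment, not necessarily one of those studied in the paper.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def inclusion_prob(y):
    """Hypothetical informative selection: larger y is more likely to be observed."""
    return 1.0 / (1.0 + math.exp(-y))


def kde(y0, sample, h, weights=None):
    """Weighted kernel density estimate at y0 (equal weights by default)."""
    if weights is None:
        weights = [1.0] * len(sample)
    total = sum(weights)
    smoothed = sum(w * gaussian_kernel((y - y0) / h) for y, w in zip(sample, weights))
    return smoothed / (h * total)


def simulate(pop_size=20000, h=0.3, y0=1.0, seed=0):
    rng = random.Random(seed)
    # Finite population drawn i.i.d. from the superpopulation pdf f = N(0, 1).
    population = [rng.gauss(0.0, 1.0) for _ in range(pop_size)]
    # Poisson sampling: each unit is selected independently given its own y.
    sample = [y for y in population if rng.random() < inclusion_prob(y)]
    naive = kde(y0, sample, h)  # targets the sample pdf, not f
    adjusted = kde(y0, sample, h, [1.0 / inclusion_prob(y) for y in sample])
    truth = gaussian_kernel(y0)  # f(y0) for the N(0, 1) superpopulation
    return naive, adjusted, truth


naive, adjusted, truth = simulate()
print(f"naive={naive:.3f} adjusted={adjusted:.3f} f(y0)={truth:.3f}")
```

Under these assumptions the unweighted estimate tracks the weighted pdf of the observations, while the reweighted estimate is close to the superpopulation value.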

References

  1. Beaumont, J.F.: A new approach to weighting and inference in sample surveys. Biometrika 95(3), 539–553 (2008)

  2. Bellhouse, D.R., Stafford, J.E.: Density estimation from complex surveys. Stat. Sin. 9, 407–424 (1999)

  3. Bonnéry, D., Breidt, F.J., Coquet, F.: Uniform convergence of the empirical cumulative distribution function under informative selection from a finite population. Bernoulli 18(4), 1361–1385 (2012). doi:10.3150/11-BEJ369. Accessed 2 Aug 2017

  4. Bonnéry, D., Breidt, F.J., Coquet, F.: BonneryBreidtCoquetMetronKDE (2017). https://github.com/DanielBonnery/BonneryBreidtCoquetMetronKDE

  5. Bonnéry, D., Breidt, F.J., Coquet, F.: Asymptotics for the maximum sample likelihood estimator under informative selection from a finite population. Bernoulli p. I (in press). doi:10.3150/16-BEJ809

  6. Breidt, F.J., Opsomer, J.D.: Nonparametric and semiparametric estimation in complex surveys. In: Pfeffermann, D., Rao, C. (eds.) Handbook of Statistics. Sample Surveys: Inference and Analysis, vol. 29B, pp. 103–119. North Holland, Amsterdam (2009)

  7. Buskirk, T.D., Lohr, S.L.: Asymptotic properties of kernel density estimation with complex survey data. J. Stat. Plan. Inference 128(1), 165–190 (2005)

  8. Duong, T.: ks: Kernel Smoothing (2017). R package version 1.10.6. http://CRAN.R-project.org/package=ks. Accessed 2 Aug 2017

  9. Fuller, W.A.: Sampling Statistics. Wiley, Hoboken (2009)

  10. Korn, E.L., Graubard, B.I.: Analysis of Health Surveys. Wiley, New York (1999)

  11. Nadaraya, E.A.: On estimating regression. Theory Prob. Appl. 9, 141–142 (1964)

  12. Pfeffermann, D., Sverchkov, M.: Inference under informative sampling. In: Pfeffermann, D., Rao, C.R. (eds.) Sample Surveys: Inference and Analysis, Handbook of Statistics, vol. 29B, pp. 455–487. Elsevier/North-Holland, Amsterdam (2009). doi:10.1016/S0169-7161(09)00239-9

  13. Tsybakov, A.B.: Introduction to Nonparametric Estimation. Springer Series in Statistics. Springer, New York (2009)

  14. Watson, G.S.: Smooth regression analysis. Sankhyā Ser. A 26, 359–372 (1964)

Author information

Corresponding author

Correspondence to F. Jay Breidt.

A Technical appendix

A.1 Definitions

Definition 1

Let T be an interval of \(\mathbb {R}\) and \(\beta \in \left( 0,+\infty \right) \). The Hölder class on T, denoted \(\Sigma (\beta ,T)\), is the set of functions \(g:T\rightarrow \mathbb {R}\) that are l times differentiable, with \(l=\lfloor \beta \rfloor \) the largest integer less than or equal to \(\beta \), and for which there exists \(L\in \left( 0,+\infty \right) \) such that \(\forall (x,x')\in T^2\), \(\left| g^{(l)}(x)-g^{(l)}(x')\right| \le L\left| x-x'\right| ^{\beta -l}\). Let

$$\begin{aligned} {\mathcal {P}(\beta )=\left\{ g\in \Sigma (\beta ,\mathbb {R})\mid g\ge 0,\int gd\lambda =1\right\} }. \end{aligned}$$

We assume exchangeability as in Section 2.1 of [3] and adopt the following definition from that paper:

Definition 2

Given \(\gamma \), let \(k,\ell \in U_\gamma \) with \(k\ne \ell \). Let

$$\begin{aligned} \mu _\gamma (y)= & {} \mathrm {E}\left[ I_{\gamma k}\mid Y_{k}=y\right] \\ v_\gamma (y)= & {} {\text {Var}}\left[ I_{\gamma k}\mid Y_{k}=y\right] \\ \mu '_\gamma (y_1,y_2)= & {} \mathrm {E}\left[ I_{\gamma k}\mid Y_{k}=y_1,Y_{\ell }=y_2\right] \\ c_\gamma (y_1,y_2)= & {} \text{ Cov }\left( I_{\gamma k},I_{\gamma \ell }\mid Y_{k}=y_1,Y_{\ell }=y_2\right) . \end{aligned}$$

(These definitions do not depend on the choice of \(k, \ell \) under the exchangeability assumption).

A.2 Assumptions

  • D0 There exist \(M:\mathbb {R}\rightarrow \mathbb {R}^+\) and \(\mu _\infty :\mathbb {R}\rightarrow \mathbb {R}^+\), both \(\lambda \)-measurable, such that

    $$\begin{aligned}&\left\{ \begin{array}{l} \forall \gamma \in \mathbb {N}, \mu _\gamma<M \\ \int Mf\ d\lambda <\infty \end{array}\right. \\&\left\{ \begin{array}{l} \mu _\gamma \rightarrow \mu _\infty \text { pointwise as}~\gamma \rightarrow \infty \\ \int \mu _\infty f\,d\lambda >0. \end{array}\right. \end{aligned}$$
  • D1

    $$\begin{aligned} {\text {Var}}\left[ n_\gamma \right] =o_\gamma (N_\gamma ^2). \end{aligned}$$
  • D2

    $$\begin{aligned} {\left\{ \begin{array}{ll}{} \sup _{y\in \mathbb {R}} \left\{ (\mu _\gamma f)(y)\right\} =O_\gamma (1),\\ \sup _{y\in \mathbb {R}} \left\{ (\mu _\infty f)(y)\right\} <+\infty ,\\ \mu _\gamma -\mu _\infty \text { converges uniformly on a neighborhood of }y_0\text { to }0,\\ \mu _\infty \text { is continuous at }y_0,\\ f \text { is continuous at }y_0. \end{array}\right. } \end{aligned}$$
  • D3 There exists V a neighborhood of \(y_0\) such that:

    $$\begin{aligned} {\left\{ \begin{array}{ll}{} \int K^2(u) \, d u<+\infty , \\ \exists v_\infty \text { a measurable real function such that }v_\gamma -v_\infty \text { converges uniformly to }0\text { on }V, \\ v_\infty \text { is continuous at }y_0, \\ \sup _{y\in \mathbb {R}} \left\{ v_\gamma (y)f(y)\right\} =O_\gamma (1), \\ \sup _{y\in \mathbb {R}} \left\{ v_\infty (y)f(y)\right\} <+\infty ,\\ \sup _{(y_1,y_2)\in \mathbb {R}^2} \left\{ \left( d_\gamma (y_1,y_2)+c_\gamma (y_1,y_2)\right) \right\} =O_\gamma (1),\\ c_\gamma +d_\gamma \text { converges uniformly to }0\text { on }V\times V. \end{array}\right. } \end{aligned}$$
  • D4

    $$\begin{aligned} {\left\{ \begin{array}{ll}{} (\rho _\gamma \times f)\in \mathscr {C}^2,\\ (\rho _\gamma \times f)^{(2)}-(\rho _\infty \times f)^{(2)}\text { converges uniformly to }0\text { on a neighborhood of }y_0,\\ \sup _{u\in \mathbb {R}}\left\{ |(\rho _\gamma \times f)^{(2)}(u)|\right\} =O_\gamma (1),\\ \int u^2K(u) \, d u<+\infty ,\\ \int u^2K(u) \, d u\ne 0, \end{array}\right. } \end{aligned}$$

where \(\mathscr {C}^2\) is the space of twice continuously differentiable functions.

A.3 Preliminary results

Result 1

\(\forall y_0\in \mathbb {R}\),

$$\begin{aligned} {\text {Var}}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right]\le & {} \frac{\sup _{y\in \mathbb {R}}\{(v_\gamma \times f)(y)\}+\sup _{y\in \mathbb {R}}\{(\mu _\gamma \times f)(y)\}\int K^2d\lambda }{N_\gamma h_\gamma \left( \int \mu _\gamma f d\lambda \right) ^2}\\&+\,\frac{1}{\left( \int \mu _\gamma f d\lambda \right) ^2}\sup _{(y_1,y_2)\in \mathbb {R}^2}\{c_\gamma (y_1,y_2)f(y_1)f(y_2)\}\\&+\,\frac{1}{\left( \int \mu _\gamma f d\lambda \right) ^2}\sup _{(y_1,y_2)\in \mathbb {R}^2} \left\{ \left( \mu '_\gamma (y_2,y_1)-\mu _\gamma (y_1)\mu _\gamma (y_2)\right) f(y_1)f(y_2)\right\} . \end{aligned}$$

Proof of Result 1

$$\begin{aligned}&{{\text {Var}}\left[ \frac{n_\gamma }{N_\gamma \int \mu _\gamma f\,d\lambda }p_\gamma (y_0)\right] }\\&\quad = {\text {Var}}\left[ \frac{1}{h_\gamma N_\gamma \int \mu _\gamma f\,d\lambda } \sum _{k=1}^{N_\gamma }K\left( \frac{y_k-y_0}{h_\gamma }\right) I_{\gamma k}\right] \\&\quad = \frac{1}{N_\gamma \left( h_\gamma \int \mu _\gamma f\,d\lambda \right) ^2}{\text {Var}}\left[ K\left( \frac{y_1-y_0}{h_\gamma }\right) I_{\gamma 1}\right] \\&\qquad +\,\frac{N_\gamma -1}{N_\gamma \left( h_\gamma \int \mu _\gamma f\,d\lambda \right) ^2} {\text {Cov}}\left[ K\left( \frac{y_1-y_0}{h_\gamma }\right) I_{\gamma 1},K\left( \frac{y_2-y_0}{h_\gamma }\right) I_{\gamma 2}\right] \\&\quad = \frac{1}{N_\gamma \left( h_\gamma \int \mu _\gamma f\,d\lambda \right) ^2} \left( \int K^2\left( \frac{y-y_0}{h_\gamma }\right) v_\gamma (y)f(y)\,d\lambda (y)\right. \\&\qquad +\,\int K^2\left( \frac{y-y_0}{h_\gamma }\right) \mu ^2_\gamma (y)f(y)\,d\lambda (y) \\&\qquad -\,\left. \left( \int K\left( \frac{y-y_0}{h_\gamma }\right) \mu _\gamma (y)f(y)\,d\lambda (y)\right) ^2 \right) \\&\qquad +\,\frac{N_\gamma -1}{N_\gamma \left( h_\gamma \int \mu _\gamma f\,d\lambda \right) ^2} \left( \int K\left( \frac{y_1-y_0}{h_\gamma }\right) K\left( \frac{y_2-y_0}{h_\gamma }\right) c_\gamma (y_1,y_2)f(y_1)f(y_2)\,d\lambda (y_1)\,d\lambda (y_2)\right. \\&\qquad +\,\int K\left( \frac{y_1-y_0}{h_\gamma }\right) K\left( \frac{y_2-y_0}{h_\gamma }\right) \left( \mu '_\gamma (y_1,y_2)\mu '_\gamma (y_2,y_1)-\mu _\gamma (y_1)\mu _\gamma (y_2)\right) \\&\qquad \left. \times f(y_1)f(y_2)\,d\lambda (y_1)\,d\lambda (y_2)\right) . \end{aligned}$$

\(\square \)

A kernel is of order l if \(\int u^lK(u)\,du\ne 0\) and \(\int u^jK(u)\,du=0\) for \(j=1,\ldots ,l-1\). In this paper, our kernels are typically symmetric pdfs with mean zero and finite, non-zero second moment, so the order is \(l=2\). The following result is stated more generally, and in particular allows for kernels that need not be pdfs.

Result 2

Assume that K is a kernel of order l such that \(\int \left| u\right| ^{\beta }\left| K(u)\right| du <+\infty \). If \(\rho _\gamma f\in \mathcal {P}(\beta )\), then

$$\begin{aligned} \forall y_0\in \mathbb {R},\ \forall \gamma \in \mathbb {N},\ \left| \mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] -\rho _\gamma (y_0)f(y_0)\right| \le C_2h_\gamma ^\beta , \end{aligned}$$

with \(C_2=\frac{L}{l!}\int |u|^\beta |K(u)|\, du\).

Proof of Result 2

$$\begin{aligned}&{\mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] -\rho _\gamma (y_0)f(y_0)}\\&\quad =\mathrm {E}\left[ \frac{n_\gamma }{N_\gamma \int \mu _\gamma f d\lambda }p_\gamma (y_0)\right] -\rho _\gamma (y_0)f(y_0)\\&\quad =\frac{1}{h_\gamma }\int K\left( \frac{y-y_0}{h_\gamma }\right) (\rho _\gamma \times f)(y)\, d y-(\rho _\gamma \times f)(y_0)\\&\quad =\int K(u)\left( (\rho _\gamma \times f)(y_0+uh_\gamma )-(\rho _\gamma \times f)(y_0)\right) \, d u. \end{aligned}$$

In addition, by Taylor's theorem, there exists \(\tau _\gamma :\mathbb {R}\rightarrow [0,1]\) such that:

$$\begin{aligned} (\rho _\gamma \times f)(y_0+uh_\gamma ) =\sum _{j=0}^{l-1}\frac{(\rho _\gamma \times f)^{(j)}(y_0)(uh_\gamma )^j}{j!} +\frac{(\rho _\gamma \times f)^{(l)}(y_0+\tau _\gamma (u)uh_\gamma )(uh_\gamma )^l}{l!}. \end{aligned}$$

As K is of order \(l=\lfloor \beta \rfloor \),

$$\begin{aligned}&{\mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] -\rho _\gamma (y_0)f(y_0)}\\&\quad =\int K(u)\frac{(u h_\gamma )^l}{l!}\left( (\rho _\gamma \times f)^{(l)}(y_0+\tau _\gamma (u)uh_\gamma ) -(\rho _\gamma \times f)^{(l)}(y_0)\right) \, d u, \end{aligned}$$

and

$$\begin{aligned}&{\left| \mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] -\rho _\gamma (y_0)f(y_0)\right| }\\&\quad \le \int |K(u)|\frac{|uh_\gamma |^l}{l!}\left| (\rho _\gamma \times f)^{(l)}(y_0+\tau _\gamma (u)uh_\gamma ) -(\rho _\gamma \times f)^{(l)}(y_0)\right| \, d u\\&\quad \le L\int |K(u)|\frac{|uh_\gamma |^l}{l!}\left| \tau _\gamma (u)uh_\gamma \right| ^{\beta -l}\, d u\\&\quad \le C_2h_\gamma ^\beta . \end{aligned}$$

\(\square \)
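The kernel-order condition used in Result 2 can be checked numerically. The following sketch approximates the moments \(\int u^jK(u)\,du\) with a midpoint rule and reports the smallest \(j\ge 1\) with a non-zero moment; the function names, the integration range and the tolerance are choices made here for illustration and do not come from the paper.

```python
import math


def gaussian(u):
    """Standard normal pdf."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def epanechnikov(u):
    """Epanechnikov pdf, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def moment(kernel, j, lo=-10.0, hi=10.0, n=20000):
    """Midpoint-rule approximation of the integral of u**j * kernel(u)."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step
        total += (u ** j) * kernel(u)
    return step * total


def kernel_order(kernel, max_order=6, tol=1e-6):
    """Smallest j >= 1 whose j-th moment is numerically non-zero, else None."""
    for j in range(1, max_order + 1):
        if abs(moment(kernel, j)) > tol:
            return j
    return None


for name, kernel in (("gaussian", gaussian), ("epanechnikov", epanechnikov)):
    print(name, "order:", kernel_order(kernel), "second moment:", moment(kernel, 2))
```

Both kernels are symmetric pdfs with a finite, non-zero second moment, so the check is expected to report order 2 for each, in line with the remark preceding Result 2.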

Result 3

If \(\rho _\gamma f\in \mathcal {P}(\beta )\) and K satisfies the assumptions of Result 2, then

$$\begin{aligned}&{\sup _{y_0\in \mathbb {R}} \mathrm {E}\left[ \left( \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)-(\rho _\gamma \times f)(y_0)\right) ^2\right] }\\&\quad \le \frac{\sup _{y\in \mathbb {R}}\{(v_\gamma \times f)(y)\}+\sup _{y\in \mathbb {R}}\{(\mu _\gamma \times f)(y)\}\int K^2\, d \lambda }{N_\gamma h_\gamma \left( \int \mu _\gamma f \, d \lambda \right) ^2}\\&\qquad +\frac{1}{\left( \int \mu _\gamma f \, d \lambda \right) ^2}\sup _{(y_1,y_2)\in \mathbb {R}^2}\{c_\gamma (y_1, y_2)f(y_1)f(y_2)\}\\&\qquad +\frac{1}{\left( \int \mu _\gamma f \, d \lambda \right) ^2} \sup _{(y_1,y_2)\in \mathbb {R}^2}\left\{ \left( \mu '_\gamma (y_2, y_1)-\mu _\gamma (y_1)\mu _\gamma (y_2)\right) f(y_1)f(y_2)\right\} \\&\qquad +C_2^2h_\gamma ^{2\beta }. \end{aligned}$$

Proof of Result 3

Let \(y_0\in \mathbb {R}\), \(\gamma \in \mathbb {N}\). Then by Results 1 and 2,

$$\begin{aligned}&{\mathrm {E}\left[ \left( \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)-(\rho _\gamma \times f)(y_0)\right) ^2\right] }\\&\quad \le {\text {Var}}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] +\left| \mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] -\rho _\gamma (y_0)f(y_0)\right| ^2\\&\quad \le \frac{\sup _{y\in \mathbb {R}}\{(v_\gamma \times f)(y)\}+\sup _{y\in \mathbb {R}}\{(\mu _\gamma \times f)(y)\}\int K^2\, d \lambda }{N_\gamma h_\gamma \left( \int \mu _\gamma f \, d \lambda \right) ^2}\\&\qquad +\frac{1}{\left( \int \mu _\gamma f \, d \lambda \right) ^2}\sup _{(y_1,y_2)\in \mathbb {R}^2}\{c_\gamma (y_1, y_2)f(y_1)f(y_2)\}\\&\qquad +\frac{1}{\left( \int \mu _\gamma f \, d \lambda \right) ^2} \sup _{(y_1,y_2)\in \mathbb {R}^2}\left\{ \left( \mu '_\gamma (y_2, y_1)-\mu _\gamma (y_1)\mu _\gamma (y_2)\right) f(y_1)f(y_2)\right\} \\&\qquad +C_2^2h_\gamma ^{2\beta }. \end{aligned}$$

\(\square \)

A.4 Proofs of Section 2 results

Proof of Theorem 1:

Bochner’s lemma under informative sampling. We calculate:

$$\begin{aligned}&{\left| \frac{1}{h_\gamma } \int Q\left( \frac{y-y_0}{h_\gamma } \right) r_\gamma (y)g(y)\, d \lambda (y)-g(y_0)r(y_0)\int Q(u) \, d \lambda (u) \right| }\\&\quad =\left| \int \left( (g\times r_\gamma )(y_0+uh_\gamma )-(g\times r)(y_0)\right) Q(u) \, d \lambda (u)\right| \\&\quad \le \sup _{|u|\le h_\gamma ^{-1/2}}\left\{ \left| (g\times r_\gamma )(y_0+uh_\gamma )-(g\times r)(y_0)\right| \right\} \int \left| Q\right| (u) \, d \lambda (u)\\&\quad \quad + \int _{|u|> h_\gamma ^{-1/2}} \left| (g\times r_\gamma )(y_0+uh_\gamma )-(g\times r)(y_0)\right| \left| Q\right| (u) \, d \lambda (u)\\&\quad \le \sup _{|v|\le h_\gamma ^{1/2}}\left\{ \left| (g\times r_\gamma )(y_0+v)-(g\times r)(y_0)\right| \right\} \int \left| Q\right| (u) \, d \lambda (u)\\&\quad \quad + \left( \sup _{u\in \mathbb {R}}\left\{ (g\times r_\gamma )(u)\right\} +\sup _{u\in \mathbb {R}}\left\{ (g\times r)(u)\right\} \right) \int _{|u|> h_\gamma ^{-1/2}}\left| Q\right| (u) \, d \lambda (u) \end{aligned}$$
$$\begin{aligned}&\le \left( \sup _{|v|\le h_\gamma ^{1/2}}\left\{ \left| (g\times r_\gamma )(y_0+v)-(g\times r)(y_0+v)\right| \right\} \right. \\&\quad \quad \left. +\sup _{|v|\le h_\gamma ^{1/2}}\left\{ \left| (g\times r)(y_0+v)-(g\times r)(y_0)\right| \right\} \right) \int \left| Q\right| (u) \, d \lambda (u)\\&\quad \quad + \left( \sup _{u\in \mathbb {R}}\left\{ (g\times r_\gamma )(u)\right\} +\sup _{u\in \mathbb {R}}\left\{ (g\times r)(u)\right\} \right) \int _{|u|> h_\gamma ^{-1/2}}\left| Q\right| (u) \, d \lambda (u) \end{aligned}$$
$$\begin{aligned}&\quad \le \left( \sup _{|v|\le h_\gamma ^{1/2}}\left\{ \left| (r_\gamma -r)(y_0+v)\right| \right\} \sup _{|v|\le h_\gamma ^{1/2}}\left\{ \left| g(y_0+v)\right| \right\} \right. \\&\quad \quad \left. +\sup _{|v|\le h_\gamma ^{1/2}}\left\{ \left| (g\times r)(y_0+v)-(g\times r)(y_0)\right| \right\} \right) \int \left| Q\right| (u) \, d \lambda (u)\\&\quad \quad + \left( \sup _{u\in \mathbb {R}}\left\{ (g\times r_\gamma )(u)\right\} +\sup _{u\in \mathbb {R}}\left\{ (g\times r)(u)\right\} \right) \int _{|u|> h_\gamma ^{-1/2}}\left| Q\right| (u) \, d \lambda (u). \end{aligned}$$

When \(h_\gamma \rightarrow 0\) every summand tends to 0. \(\square \)
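The convergence established above can be illustrated numerically. The sketch below evaluates \(\frac{1}{h}\int Q\left( \frac{y-y_0}{h}\right) r(y)g(y)\,dy\) for a decreasing sequence of bandwidths and compares it with \(g(y_0)r(y_0)\int Q\). The specific choices (g the standard normal pdf, r a logistic function that does not depend on \(\gamma \), Q the squared Gaussian kernel, and a midpoint-rule integrator) are assumptions made here for illustration only.

```python
import math


def phi(y):
    """Standard normal pdf, playing the role of g."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def logistic(y):
    """Hypothetical limit function r, bounded and continuous."""
    return 1.0 / (1.0 + math.exp(-y))


def integrate(func, lo=-10.0, hi=10.0, n=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))


def smoothed(Q, g, r, y0, h):
    """(1/h) * integral of Q((y - y0)/h) r(y) g(y) dy, via u = (y - y0)/h."""
    return integrate(lambda u: Q(u) * r(y0 + u * h) * g(y0 + u * h))


def bochner_gap(h, y0=1.0):
    """Absolute distance between the smoothed integral and its claimed limit."""
    Q = lambda u: phi(u) ** 2
    limit = phi(y0) * logistic(y0) * integrate(Q)
    return abs(smoothed(Q, phi, logistic, y0, h) - limit)


gaps = [bochner_gap(h) for h in (0.4, 0.2, 0.1, 0.05)]
print(gaps)
```

In this smooth example the gap shrinks as the bandwidth decreases, which is the behaviour the theorem guarantees in the limit.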

Proof of Theorem 2

We calculate:

$$\begin{aligned}&{\left| \frac{1}{h_\gamma ^2}\int Q\left( \frac{y_1-y_0}{h_\gamma }\right) Q\left( \frac{y_2-y_0}{h_\gamma }\right) r_\gamma (y_1,y_2)g(y_1)g(y_2) \, d y_1 \, d y_2 \right| }\\&\quad =\left| \int Q\left( u_1\right) Q\left( u_2\right) r_\gamma (y_0+u_1h_\gamma ,y_0+u_2h_\gamma )g(y_0+u_1h_\gamma )g(y_0+u_2h_\gamma ) \, d u_1 \, d u_2 \right| \\&\quad \le \sup _{\max (|u_1|,|u_2|)\le h_\gamma ^{-1/2}} \left\{ \left| g(y_0+u_1h_\gamma )g(y_0+u_2h_\gamma ) r_\gamma (y_0+u_1h_\gamma ,y_0+u_2h_\gamma )\right| \right\} \left( \int \left| Q\right| (u) \, d u\right) ^2\\&\qquad +2\sup _{(u_1,u_2)\in \mathbb {R}^2}\left\{ \left| g(u_1)g(u_2) r_\gamma (u_1,u_2)\right| \right\} \int \left| Q\right| (u) \, d u\int _{|u|> h_\gamma ^{-1/2}}\left| Q\right| (u) \, d u\\&\quad \le \sup _{\max (|v_1|,|v_2|)\le h_\gamma ^{1/2}} \left\{ \left| g(y_0+v_1)g(y_0+v_2) r_\gamma (y_0+v_1,y_0+v_2)\right| \right\} \left( \int \left| Q\right| (u) \, d u\right) ^2\\&\qquad +2\sup _{(u_1,u_2)\in \mathbb {R}^2}\left\{ \left| g(u_1)g(u_2) r_\gamma (u_1,u_2)\right| \right\} \int \left| Q\right| (u) \, d u\int _{|u|> h_\gamma ^{-1/2}}\left| Q\right| (u) \, d u. \end{aligned}$$

When \(h_\gamma \rightarrow 0\) every summand tends to 0. \(\square \)

Proof of Proposition 1

By D0 and dominated convergence, \(\lim _{\gamma \rightarrow \infty } \int \mu _\gamma f\,d\lambda =\int \mu _\infty f\,d\lambda >0\), so there exists \(\Gamma \in \mathbb {N}\) such that \(\mathrm {E}\left[ n_\gamma \right] >0\) for all \(\gamma \ge \Gamma \). Then,

$$\begin{aligned} \mathrm {E}\left[ \frac{n_\gamma }{N_\gamma \int \mu _\gamma f} p_\gamma (y_0)\right]= & {} \left( \frac{1}{\int \mu _\gamma f}+o_\gamma (1)\right) \frac{1}{h_\gamma } \int K\left( \frac{y-y_0}{h_\gamma }\right) \mu _\gamma (y)f(y) \, d \lambda (y). \end{aligned}$$

We apply Theorem 1 with \(g=f\), \(r_\gamma =\mu _\gamma \), \(r=\mu _\infty \), and obtain that:

$$\begin{aligned} \lim _{\gamma \rightarrow \infty } \mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] =\rho _\infty (y_0)f(y_0). \end{aligned}$$

Besides,

$$\begin{aligned} p_\gamma (y_0)-\frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)= \left( \frac{\mathrm {E}\left[ n_\gamma \right] }{n_\gamma +\mathbb {1}_{\{0\}}(n_\gamma )}-1\right) \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0), \end{aligned}$$

and with D1, \(\mathrm {E}\left[ \left( \frac{\mathrm {E}\left[ n_\gamma \right] }{n_\gamma +\mathbb {1}_{\{0\}}(n_\gamma )}-1\right) ^{2}\right] =o_\gamma (1)\). \(\square \)

Proof of Proposition 2

We calculate:

$$\begin{aligned}&{{\text {Var}}\left[ \frac{1}{h_\gamma N_\gamma \int \mu _\gamma f \, d \lambda } \sum _{k=1}^{N_\gamma }K\left( \frac{y_k-y_0}{h_\gamma }\right) I_{\gamma k}\right] }\\&\quad =\frac{1}{N_\gamma \left( \int \mu _\gamma f\, d \lambda \right) ^2} \left( \frac{1}{h_\gamma ^2}\int K^2\left( \frac{y-y_0}{h_\gamma }\right) v_\gamma (y)f(y)\, d \lambda (y)\right. \\&\qquad +\frac{1}{h_\gamma ^2}\int K^2\left( \frac{y-y_0}{h_\gamma }\right) \mu ^2_\gamma (y)f(y)\, d \lambda (y) \\&\qquad -\left. \left( \frac{1}{h_\gamma } \int K\left( \frac{y-y_0}{h_\gamma }\right) \mu _\gamma (y)f(y)\, d \lambda (y)\right) ^2 \right) \\&\qquad +\frac{N_\gamma -1}{N_\gamma \left( h_\gamma \int \mu _\gamma f\, d \lambda \right) ^2} \left( \int K\left( \frac{y_1-y_0}{h_\gamma }\right) K\left( \frac{y_2-y_0}{h_\gamma }\right) \right. \\&\qquad \times \left. (c_\gamma +d_\gamma )(y_1,y_2)f(y_1)f(y_2)\, d \lambda (y_1) \, d \lambda (y_2)\right) . \end{aligned}$$

By application of Theorem 1 with \(Q=K^2\), \(g=f\), \(r_\gamma =v_\gamma \), and \(r=v_\infty \), we obtain that:

$$\begin{aligned} \frac{1}{h_\gamma }\int K^2\left( \frac{y-y_0}{h_\gamma } \right) v_\gamma (y)f(y)\, d \lambda (y)=v_\infty (y_0)f(y_0)\int K^2(u)\, d u+o_\gamma (1). \end{aligned}$$

The same argument with \(Q=K^2\), \(r_\gamma =\mu _\gamma ^2\), and \(r=\mu _\infty ^2\) handles the second integral, since \(0\le \mu _\gamma \le 1\) implies \(|\mu _\gamma ^2-\mu _\infty ^2|\le 2|\mu _\gamma -\mu _\infty |\).

By application of Theorem 1 with \(Q=K\), \(g=f\), \(r_\gamma =\mu _\gamma \), and \(r=\mu _\infty \), we obtain that:

$$\begin{aligned} \frac{1}{h_\gamma }\int K\left( \frac{y-y_0}{h_\gamma } \right) \mu _\gamma (y)f(y)\, d \lambda (y)=\mu _\infty (y_0)f(y_0)+o_\gamma (1). \end{aligned}$$

By application of Theorem 2 with \(Q=K\), \(g=f\), and \(r_\gamma =(c_\gamma +d_\gamma )\), we obtain that:

$$\begin{aligned} \frac{1}{h_\gamma ^2}\int K\left( \frac{y_1-y_0}{h_\gamma }\right) K\left( \frac{y_2-y_0}{h_\gamma }\right) (c_\gamma +d_\gamma )(y_1,y_2)f(y_1)f(y_2)\, d \lambda (y_1)\, d \lambda (y_2)=o_\gamma (1). \end{aligned}$$

So

$$\begin{aligned}&{\text {Var}}\left[ \frac{1}{h_\gamma N_\gamma \int \mu _\gamma f \, d \lambda } \sum _{k=1}^{N_\gamma }K\left( \frac{y_k-y_0}{h_\gamma }\right) I_{\gamma k}\right] \\&\quad = \frac{1}{N_\gamma h_\gamma \left( \int \mu _\gamma f\, d \lambda \right) ^2}\left( v_\infty (y_0)f(y_0)\int K^2(u)\, d u+o_\gamma (1)\right) \\&\qquad +\frac{1}{N_\gamma h_\gamma }\left( \rho _\infty ^2(y_0)f(y_0)\int K^2(u)\, d u\right) (1+o_\gamma (1))\\&\qquad -\frac{1}{N_\gamma }\left( \rho _\infty ^2(y_0)f^2(y_0)+o_\gamma (1)\right) +o_\gamma (1). \end{aligned}$$

Thus

$$\begin{aligned} {\text {Var}}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] \sim _\gamma \frac{f(y_0)}{N_\gamma h_\gamma }\left( \left( \frac{v_\infty (y_0)}{\left( \int \mu _\infty f\, d \lambda \right) ^2}+\rho _\infty ^2(y_0)\right) \int K^2(u)\, d u\right) \end{aligned}$$

and by D1, \(\mathrm {E}\left[ \left( \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }-1\right) ^2\right] =o_\gamma (1),\) which together with the preceding implies the result. \(\square \)
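As a consistency check on the limiting variance (a remark added here, not part of the original proof): since \(I_{\gamma k}\) is an indicator, \(v_\gamma =\mu _\gamma (1-\mu _\gamma )\), and under D2 and D3 the same identity holds for the limits at \(y_0\). Assuming \(\rho _\infty =\mu _\infty /\int \mu _\infty f\,d\lambda \), as the proof of Proposition 1 suggests, the constant in the variance simplifies:

$$\begin{aligned} \frac{v_\infty (y_0)}{\left( \int \mu _\infty f\, d \lambda \right) ^2}+\rho _\infty ^2(y_0) = \frac{\mu _\infty (y_0)\left( 1-\mu _\infty (y_0)\right) +\mu _\infty ^2(y_0)}{\left( \int \mu _\infty f\, d \lambda \right) ^2} = \frac{\rho _\infty (y_0)}{\int \mu _\infty f\, d \lambda }. \end{aligned}$$

Because \(\mathrm {E}\left[ n_\gamma \right] =N_\gamma \int \mu _\gamma f\,d\lambda \), the leading term is then approximately \((\rho _\infty \times f)(y_0)\int K^2(u)\,du/(\mathrm {E}\left[ n_\gamma \right] h_\gamma )\), which is the familiar variance of a kernel density estimator built from \(\mathrm {E}\left[ n_\gamma \right] \) independent draws from the sample pdf \(\rho _\infty f\).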

Proof of Proposition 3

We use the calculations of Result 2 with \(l=2\), and the fact that K is a kernel of order \(l=2\): there exists \(\tau _\gamma :\mathbb {R}\rightarrow [0,1]\) such that:

$$\begin{aligned} \mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)\right] -\rho _\gamma (y_0)f(y_0)= & {} \int K(u)\frac{(uh_\gamma )^2}{2}(\rho _\gamma \times f)^{(2)}(y_0+\tau _\gamma (u)uh_\gamma )\, d u\\= & {} \frac{h_\gamma ^2}{2}(\rho _\gamma \times f)^{(2)}(y_0)\int u^2K(u)\, d u+\frac{h_\gamma ^2}{2}\Delta _\gamma , \end{aligned}$$

with

$$\begin{aligned} \Delta _\gamma= & {} \int u^2K(u)\left( (\rho _\gamma \times f)^{(2)}(y_0+\tau _\gamma (u)uh_\gamma )-(\rho _\gamma \times f)^{(2)}(y_0)\right) \, d u,\\ |\Delta _\gamma |\le & {} \sup _{|u|\le h_\gamma ^{-1/2}}\left\{ \left| (\rho _\gamma \times f)^{(2)}(y_0+\tau _\gamma (u)uh_\gamma )-(\rho _\gamma \times f)^{(2)}(y_0)\right| \right\} \int u^2K(u)\, d u\\&+2\sup _{u\in \mathbb {R}}\left\{ |(\rho _\gamma \times f)^{(2)}(u)|\right\} \int _{|u|>h_\gamma ^{-1/2}} u^2K(u)\, d u\\\le & {} \sup _{|v|\le h_\gamma ^{1/2}}\left\{ \left| (\rho _\gamma \times f)^{(2)}(y_0+v)-(\rho _\gamma \times f)^{(2)}(y_0)\right| \right\} \int u^2K(u)\, d u\\&+2\sup _{u\in \mathbb {R}}\left\{ |(\rho _\gamma \times f)^{(2)}(u)|\right\} \int _{|u|>h_\gamma ^{-1/2}} u^2K(u)\, d u\\= & {} o_\gamma (1). \end{aligned}$$

Then

$$\begin{aligned} \mathrm {E}\left[ \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }p_\gamma (y_0)-\rho _\infty (y_0)f(y_0)\right] = \frac{h_\gamma ^2}{2}\left( (\rho _\gamma \times f)^{(2)}(y_0)\int u^2K(u)\, d u+o_\gamma (1)\right) . \end{aligned}$$

By D1, \(\mathrm {E}\left[ \left( \frac{n_\gamma }{\mathrm {E}\left[ n_\gamma \right] }-1\right) ^2\right] =o_\gamma (1)\), and then:

$$\begin{aligned} \mathrm {E}\left[ p_\gamma (y_0)-\rho _\infty (y_0)f(y_0)\right] = \frac{h_\gamma ^2}{2}\left( (\rho _\gamma \times f)^{(2)}(y_0)\int u^2K(u)\, d u+o_\gamma (1)\right) . \end{aligned}$$

\(\square \)
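The leading bias term \(\frac{h^2}{2}(\rho \times f)^{(2)}(y_0)\int u^2K(u)\,du\) can be compared with an exact bias in a case where everything is available in closed form. The sketch below assumes, purely for illustration, that the sample pdf \(\rho f\) is the standard normal density and that K is the Gaussian kernel, so that the smoothed density is the N(0, 1 + h²) density and \(\int u^2K(u)\,du=1\); none of these choices come from the paper.

```python
import math


def phi(y):
    """Standard normal pdf, standing in for the sample pdf rho * f."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def exact_bias(y0, h):
    """Exact smoothing bias: N(0, 1) convolved with N(0, h^2) is N(0, 1 + h^2)."""
    s2 = 1.0 + h * h
    smoothed = math.exp(-0.5 * y0 * y0 / s2) / math.sqrt(2.0 * math.pi * s2)
    return smoothed - phi(y0)


def leading_term(y0, h):
    """(h^2 / 2) * phi''(y0) * 1, using phi''(y) = (y^2 - 1) * phi(y)."""
    return 0.5 * h * h * (y0 * y0 - 1.0) * phi(y0)


for y0 in (0.0, 2.0):
    for h in (0.2, 0.1, 0.05):
        print(y0, h, exact_bias(y0, h), leading_term(y0, h))
```

The two columns agree more closely as h decreases, and the sign of the bias follows the curvature of the density: negative at the mode, positive in the tail.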

Cite this article

Bonnéry, D., Breidt, F.J. & Coquet, F. Kernel estimation for a superpopulation probability density function under informative selection. METRON 75, 301–318 (2017). https://doi.org/10.1007/s40300-017-0127-x
