
Kernel density estimation from complex surveys in the presence of complete auxiliary information


Abstract

Auxiliary information is widely used in survey sampling to enhance the precision of estimators of finite population parameters, such as the finite population mean, percentiles, and distribution function. In the context of complex surveys, we show how auxiliary information can be used effectively in kernel estimation of the superpopulation density function of a given study variable. We propose two classes of “model-assisted” kernel density estimators that make efficient use of auxiliary information. For one class we assume that the functional relationship between the study variable Y and the auxiliary variable X is known, while for the other class the relationship is assumed unknown and is estimated using kernel smoothing techniques. Under the first class, we show that if the functional relationship can be written as a simple linear regression model with constant error variance, the mean of the proposed density estimator will be identical to the well-known regression estimator of the finite population mean. If we drop the intercept from the linear model and allow the error variance to be proportional to the auxiliary variable, the mean of the proposed density estimator matches the ratio estimator of the finite population mean. The properties of the new density estimators are studied under a combined design-model-based inference framework, which accounts for the underlying superpopulation model as well as the randomization distribution induced by the sampling design. Moreover, the asymptotic normality of each estimator is derived under both design-based and combined inference frameworks when the sampling design is simple random sampling without replacement. For the practical implementation of these estimators, we discuss how data-driven bandwidth estimators can be obtained. The finite sample properties of the proposed estimators are addressed via simulations and an example that mimics a real survey. These simulations show that the new estimators perform very well compared to standard kernel estimators which do not utilize the auxiliary information.
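As a baseline for the comparisons described above, the standard design-weighted kernel density estimator uses only the sampled responses and their survey weights. The following minimal R sketch (all names are illustrative, not from the paper) computes a Hájek-type weighted KDE with a Gaussian kernel; R's own `stats::density` offers similar functionality through its `weights` argument:

```r
# Baseline design-weighted KDE: no auxiliary information is used.
# y: sampled study variable; d: design weights 1/pi_i; ygrid: evaluation grid.
weighted_kde <- function(y, d, ygrid, h) {
  sapply(ygrid, function(y0) sum(d * dnorm((y0 - y) / h)) / (sum(d) * h))
}

# Roughly equivalent built-in alternative ('weights' must sum to one):
# fhat <- density(y, bw = h, kernel = "gaussian", weights = d / sum(d))
```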


References

  • Ahmad IA (2002) Kernel estimation in a continuous randomized response model. In: Handbook of applied econometrics and statistical inference, vol 165, pp 97–114

  • Bellhouse DR, Stafford JE (1999) Density estimation from complex surveys. Stat Sin 9:407–424

  • Billingsley P (1995) Probability and measure. Wiley, New York

  • Bleuer SR, Kratina IS (2005) On the two-phase framework for joint model and design-based inference. Ann Stat 33:2789–2810

  • Bonnéry D, Breidt FJ, Coquet F (2017) Kernel estimation for a superpopulation probability density function under informative selection. Metron 75:301–318

  • Breidt FJ, Claeskens G, Opsomer JD (2005) Model-assisted estimation for complex surveys using penalised splines. Biometrika 92:831–846

  • Breidt FJ, Opsomer JD (2000) Local polynomial regression estimators in survey sampling. Ann Stat 28:1026–1053

  • Breunig RV (2001) Density estimation for clustered data. Econom Rev 20:353–367

  • Breunig RV (2008) Nonparametric density estimation for stratified samples. Stat Probab Lett 78:2194–2200

  • Buskirk TD, Lohr SL (2005) Asymptotic properties of kernel density estimation with complex survey data. J Stat Plan Inference 128:165–190

  • Dorfman AH, Hall P (1993) Estimators of the finite population distribution function using nonparametric regression. Ann Stat 21:1452–1475

  • Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman & Hall, New York

  • Fuller WA (2009) Sampling statistics. Wiley, Hoboken

  • Glad IK, Hjort NL, Ushakov NG (2003) Correction of density estimators that are not densities. Scand J Stat 30:415–427

  • Hájek J (1960) Limiting distributions in simple random sampling from a finite population. Publ Math Inst Hung Acad Sci Ser A 5:361–374

  • Hansen BE (2008) Uniform convergence rates for kernel estimation with dependent data. Econom Theory 24:726–748

  • Harms T, Duchesne P (2010) On kernel nonparametric regression designed for complex survey data. Metrika 72:111–138

  • Hartley HO, Sielken RL (1975) A “superpopulation viewpoint” for finite population sampling. Biometrics 31:411–422

  • Hayfield T, Racine JS (2008) Nonparametric econometrics: the np package. J Stat Softw 27:1–32

  • Howell KB (2001) Principles of Fourier analysis. Chapman & Hall/CRC Press, New York

  • Isaki CT, Fuller WA (1982) Survey design under the regression superpopulation model. J Am Stat Assoc 77:89–96

  • Johnson AA, Breidt FJ, Opsomer JD (2008) Estimating distribution functions from survey data using nonparametric regression. J Stat Theory Pract 2:419–431

  • Korn EL, Graubard BI (1999) Analysis of health surveys. Wiley, New York

  • Krewski D, Rao JNK (1981) Inference from stratified samples: properties of the linearization, jackknife and balanced repeated replication methods. Ann Stat 9:1010–1019

  • Kulik R (2011) Nonparametric conditional variance and error density estimation in regression models with dependent errors and predictors. Electron J Stat 5:856–898

  • Li Q, Racine JS (2007) Nonparametric econometrics: theory and practice. Princeton University Press, Princeton

  • Nadaraya EA (1964) On estimating regression. Theory Probab Appl 9:141–142

  • Parzen E (1962) On estimation of a probability density function and mode. Ann Math Stat 33:1065–1076

  • Pfeffermann D (1993) The role of sampling weights when modeling survey data. Int Stat Rev 61:317–337

  • Pons O (2011) Functional estimation for density, regression models and processes. World Scientific, Singapore

  • R Core Team (2015) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna

  • Randles RH (1982) On the asymptotic normality of statistics with estimated parameters. Ann Stat 10:462–474

  • Rao JNK, Kovar JG, Mantel HJ (1990) On estimating distribution functions and quantiles from survey data using auxiliary information. Biometrika 77:365–375

  • Robinson PM, Särndal CE (1983) Asymptotic properties of the generalized regression estimator in probability sampling. Sankhya Ser B 45:240–248

  • Rosenblatt M (1956) Remarks on some nonparametric estimates of a density function. Ann Math Stat 27:832–837

  • Särndal CE, Swensson B, Wretman J (1992) Model assisted survey sampling. Springer, New York

  • Scott DW (2004) Multivariate density estimation and visualization. Papers, Humboldt-Universität zu Berlin, Center for Applied Statistics and Economics 16:1–23

  • Scott DW (2015) Multivariate density estimation: theory, practice, and visualization. Wiley, New York

  • Sen PK (1988) Asymptotics in finite population sampling. In: Handbook of statistics, vol 6, pp 291–331

  • Sheather SJ, Jones MC (1991) A reliable data-based bandwidth selection method for kernel density estimation. J R Stat Soc Ser B 53:683–690

  • Terrell GR, Scott DW (1980) On improving convergence rates for nonnegative kernel density estimators. Ann Stat 8:1160–1163

  • Thompson ME (1997) Theory of sample surveys. Chapman and Hall, London

  • Wand MP, Jones MC (1995) Kernel smoothing. Chapman and Hall, London

  • Watson GS (1964) Smooth regression analysis. Sankhya Ser A 26:359–372

  • Yao Q, Tong H (1994) Quantifying the influence of initial values on nonlinear prediction. J R Stat Soc Ser B 56:701–725


Acknowledgements

The authors are grateful to the Editor and two anonymous referees for their insightful comments and suggestions which helped to improve this paper.

Author information


Corresponding author

Correspondence to Sayed A. Mostafa.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 180 KB)

Appendix: Proofs for Section 2

In this appendix, we provide the proofs of the results presented in Sect. 2. Proof sketches for the results of Sect. 3 can be found in the Web-based supplementary material.

Lemma A1

Suppose that Assumption A2 holds. Then, the estimator \(\hat{f}_{\text {par}}\), defined in (2.4), is bounded and square integrable, and satisfies the inequality in (2.6).

Proof

First, note that the estimator \(\hat{f}_{\text {par}}\) can be written as follows:

$$\begin{aligned} \hat{f}_{\text {par}}(y)= & {} \frac{1}{Nh}\sum _{i \in s} d_iK\left( \frac{y-y_i}{h}\right) -\frac{1}{Nh}\sum _{i \in {s}} d_iK\left( \frac{y-\hat{y}_i}{h}\right) +\frac{1}{Nh}\sum _{i \in U}K\left( \frac{y-\hat{y}_i}{h}\right) . \end{aligned}$$
(A.1)

By Assumption A2, the kernel K is bounded and square integrable. Each term on the right-hand side of (A.1) is a finite linear combination of the kernel evaluated at different points, so \(\hat{f}_{\text {par}}\) is bounded; it is also square integrable, since linear combinations of square integrable functions remain square integrable (see Howell 2001, p. 400). Finally, since \(\hat{f}_{\text {par}}\) integrates to one but may take negative values, it satisfies the inequality in (2.6). \(\square \)
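To make the decomposition in (A.1) concrete, here is a minimal R sketch of \(\hat{f}_{\text {par}}\) under the through-the-origin working model with fitted values \(\hat{y}_i=\hat{\beta }x_i\); the Gaussian kernel and the design-weighted slope estimate are illustrative assumptions, and all names are ours, not the paper's:

```r
# Model-assisted KDE following the three-term decomposition in (A.1).
# ys, xs: sampled study/auxiliary values; d: design weights 1/pi_i;
# xU: auxiliary values for the whole finite population U (complete info).
fhat_par <- function(ys, xs, d, xU, ygrid, h) {
  N <- length(xU)
  beta_hat <- sum(d * xs * ys) / sum(d * xs^2)  # design-weighted slope
  yhat_s <- beta_hat * xs                       # fitted values, sampled units
  yhat_U <- beta_hat * xU                       # fitted values, all of U
  sapply(ygrid, function(y0) {
    (sum(d * dnorm((y0 - ys) / h)) -
       sum(d * dnorm((y0 - yhat_s) / h)) +
       sum(dnorm((y0 - yhat_U) / h))) / (N * h)
  })
}
```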

Lemma A2

Under Assumptions A1(i), A2 and A3, the estimator \(\hat{f}_{\mathrm{par}}\) has the same limiting design-based distribution as \(\tilde{f}_{\mathrm{par}}\).

Proof

We use similar notation to that in Randles (1982). Let \(\gamma \) be a mathematical variable, and denote \(\hat{f}_{\text {par}}(y)\) as \(T_N(\hat{\beta })\) and \(\tilde{f}_{\text {par}}(y)\) as \(T_N(\beta _{\text {U}})\). Recall the definition of the sample membership indicators: \(I_i=1\) if \(i\in s\) and \(I_i=0\) otherwise, for all \(i\in U\). Therefore, \(I_i\) is a Bernoulli random variable with mean \(\mathrm {E}_{\mathcal {D}}(I_i)=\pi _i\), the sample inclusion probability of unit i. Note that

$$\begin{aligned} \mathrm {E}_{\mathcal {D}}\left[ T_N(\gamma )\right]= & {} \frac{1}{N}\mathrm {E}_{\mathcal {D}}\left[ \sum _{i \in {s}}d_i\left\{ K_{h}(y-y_i)-K_{h}(y-\gamma x_i)\right\} +\sum _{i \in U}K_{h}(y-\gamma x_i)\right] \\= & {} \frac{1}{N}\mathrm {E}_{\mathcal {D}}\left[ \sum _{i \in {U}} I_i d_i\left\{ K_{h}(y-y_i)-K_{h}(y-\gamma x_i)\right\} +\sum _{i \in U}K_{h}(y-\gamma x_i)\right] \\= & {} \frac{1}{N}\sum _{i \in {U}}K_{h}(y-y_i)=f_{\text {U}}(y;h). \end{aligned}$$

Thus, the limiting mean function is

$$\begin{aligned} \mu (\gamma )=\underset{N \rightarrow \infty }{\lim }\mathrm {E}_{\mathcal {D}}\left[ T_N(\gamma )\right] =f(y), \end{aligned}$$
(A.2)

where the second equality in (A.2) follows from the consistency of \(f_{\text {U}}(y;h)\) as an estimator of f(y) (e.g., Parzen 1962). Since \(\mu (\gamma )\) does not depend on \(\gamma \), it clearly has a zero differential at \(\gamma =\beta _{\text {U}}\). Now, it follows from Randles (1982, p. 463) that \(T_N(\hat{\beta })\) and \(T_N(\beta _{\text {U}})\) have the same limiting distribution in the design space, and the proof is complete. \(\square \)

Throughout the rest of this Appendix and the supplementary material, we use the notation \(A_N\simeq B_N\) if \(\lim _{N\rightarrow \infty }A_N/B_N=1\).

Proof of Theorem 2.1

Using Lemma A2 together with Assumption A5, we have [by the Corollary on p. 338 of Billingsley (1995)]

$$\begin{aligned} \mathrm {E}_{\mathcal {D}}\left[ \hat{f}_{\text {par}}(y;h)\right]\simeq & {} \mathrm {E}_{\mathcal {D}}\left[ \tilde{f}_{\text {par}}(y;h)\right] \nonumber \\= & {} \frac{1}{N}\mathrm {E}_{\mathcal {D}}\left[ \sum _{i \in {s}}d_i\left\{ K_{h}(y-y_i)-K_{h}(y-\beta _{\text {U}}x_i)\right\} +\sum _{i \in U}K_{h}(y-\beta _{\text {U}}x_i)\right] \nonumber \\= & {} \frac{1}{N}\sum _{i \in {U}}K_{h}(y-y_i)=f_{\text {U}}(y; h). \end{aligned}$$
(A.3)

From (A.3), \(\hat{f}_{\text {par}}(y; h)\) is asymptotically design-unbiased for \(f_{\text {U}}(y;h)\). It remains to show that the design-variance of \(\hat{f}_{\text {par}}(y; h)\) approaches zero in the limit. For this, note that again using Lemma A2 together with Assumption A5, we have

$$\begin{aligned}&\mathrm {Var}_{\mathcal {D}} \left[ \hat{f}_{\text {par}}(y;h)\right] \simeq \mathrm {Var}_{\mathcal {D}}\left[ \tilde{f}_{\text {par}}(y;h)\right] \nonumber \\&\quad =\frac{1}{N^2}\mathrm {Var}_{\mathcal {D}}\left[ \sum \limits _{i \in {s}} d_i\left\{ K_{h}(y-y_i)-K_{h}(y-\beta _{\text {U}}x_i)\right\} +\sum _{i \in U}K_{h}(y-\beta _{\text {U}}x_i)\right] \nonumber \\&\quad =\frac{1}{N^2}\underset{i<j \in U}{\sum \sum }(\pi _i\pi _j-\pi _{ij})\left\{ \frac{K_{h}(y-y_i)-K_{h}(y-\beta _{\text {U}}x_i)}{\pi _i}\right. \nonumber \\&\qquad \left. -\frac{K_{h}(y-y_j)-K_{h}(y-\beta _{\text {U}}x_j)}{\pi _j}\right\} ^2\nonumber \\&\quad \le \frac{1}{N^2}\underset{i<j \in U}{\sum \sum }(\pi _i\pi _j-\pi _{ij})\left\{ \frac{K_{h}(y-y_i)}{\pi _i}+\frac{K_{h}(y-\beta _{\text {U}}x_j)}{\pi _j}\right\} ^2\nonumber \\&\quad \le \frac{1}{N^2}\underset{i<j\in U}{\max }|\pi _i\pi _j-\pi _{ij}|\underset{i<j \in U}{\sum \sum }\left[ \left\{ \frac{K_{h}(y-y_i)}{\pi _i}\right\} ^2+\left\{ \frac{K_{h}(y-\beta _{\text {U}}x_i)}{\pi _j}\right\} ^2\right. \nonumber \\&\qquad \left. +\,2\left\{ \frac{K_{h}(y-y_i)}{\pi _i}\right\} \left\{ \frac{K_{h}(y-\beta _{\text {U}}x_i)}{\pi _j}\right\} \phantom {\int ^{\int ^{\int }}}\right] \nonumber \\&\quad \le \frac{1}{N^2}\underset{i<j\in U}{\max }|\pi _i\pi _j-\pi _{ij}|\underset{i<j \in U}{\sum \sum }\left[ \left\{ h^{-1}K^*/\lambda \right\} ^2+\left\{ h^{-1}K^*/\lambda \right\} ^2+2\left\{ h^{-1}K^*/\lambda \right\} ^2\right] \nonumber \\&\quad =n\underset{i<j\in U}{\max }|\pi _i\pi _j-\pi _{ij}|\frac{1}{nh^2}\frac{(N-1)}{N}\frac{4(K^*)^2}{\lambda ^2}\longrightarrow 0, \end{aligned}$$
(A.4)

where \(K^*\) is such that \(|K(u)|\le K^*<\infty \) for all u, by the boundedness of K. To reach (A.4), we used the fact that the second summation in the first equality above is a finite population quantity and is, hence, fixed with respect to the randomization distribution. The variance expression in (A.4) follows because \(\sum _{i \in {s}}d_i\left\{ K_{h}(y-y_i)-K_{h}(y-\beta _{\text {U}}x_i)\right\} \) is the Horvitz-Thompson estimator of the finite population total \(\sum _{i \in {U}}\left\{ K_{h}(y-y_i)-K_{h}(y-\beta _{\text {U}}x_i)\right\} \). The zero limit follows from Assumptions A3 and A4. \(\square \)
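The conclusion of Theorem 2.1, asymptotic design-unbiasedness of \(\hat{f}_{\text {par}}(y;h)\) for \(f_{\text {U}}(y;h)\) with vanishing design variance, can be spot-checked by simulation. A hedged R sketch under SRSWOR, reusing `fhat_par()` from above (the population model and the sizes are arbitrary illustrative choices, not the paper's simulation design):

```r
# Monte Carlo check of design-unbiasedness under SRS without replacement.
set.seed(1)
N <- 5000; n <- 500; h <- 0.3; y0 <- 2
xU <- runif(N, 1, 3)
yU <- 1.5 * xU + rnorm(N, sd = sqrt(xU))  # ratio-type superpopulation draw
fU <- mean(dnorm((y0 - yU) / h)) / h      # census target f_U(y0; h)
reps <- replicate(2000, {
  s <- sample(N, n)                       # one SRSWOR sample
  fhat_par(yU[s], xU[s], rep(N / n, n), xU, y0, h)
})
c(design_bias = mean(reps) - fU, design_var = var(reps))
```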

Proof of Theorem 2.2

Starting with the bias statement, note that

$$\begin{aligned} \mathrm {E}_C\left[ \hat{f}_{\text {par}}(y;h)\right]= & {} \mathrm {E}_{\xi }\left\{ \mathrm {E}_{\mathcal {D}}\left[ \hat{f}_{\text {par}}(y;h)|\mathbf {w}_{\text {U}}, \mathbf {x}_{\text {U}}, \mathbf {y}_{\text {U}}\right] \right\} . \end{aligned}$$

From the proof of Theorem 2.1, we have

$$\begin{aligned} \mathrm {E}_{\mathcal {D}}\left[ \hat{f}_{\text {par}}(y;h)\right]\simeq & {} \mathrm {E}_{\mathcal {D}}\left[ \tilde{f}_{\text {par}}(y;h)\right] =f_{\text {U}}(y; h). \end{aligned}$$
(A.5)

Therefore,

$$\begin{aligned} \mathrm {E}_C\left[ \hat{f}_{\text {par}}(y;h)\right]\simeq & {} \frac{1}{N} \sum _{i \in U}\mathrm {E}_{\xi }\left[ K_{h}(y-Y_i)\right] \nonumber \\= & {} f(y)+\frac{1}{2}h^2c_Kf''(y)+o(h^2), \end{aligned}$$
(A.6)

where (A.6) is a standard result in KDE (cf. Wand and Jones 1995, p. 20).
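For readers tracing the step, the expansion behind (A.6) is the usual change of variables \(z=(y-u)/h\) followed by a Taylor expansion of f, using \(\int K(z)\,dz=1\), \(\int zK(z)\,dz=0\) and \(c_K=\int z^2K(z)\,dz\):

$$\begin{aligned} \mathrm {E}_{\xi }\left[ K_{h}(y-Y_i)\right] =\int _{\mathbb {R}}K(z)f(y-hz)\,dz =f(y)+\frac{1}{2}h^2c_Kf''(y)+o(h^2). \end{aligned}$$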

Next, consider the variance of \(\hat{f}_{\text {par}}(y)\) and note that

$$\begin{aligned} \mathrm {Var}_C\left[ \hat{f}_{\text {par}}(y)\right]= & {} \mathrm {E}_{\xi }\left\{ \mathrm {Var}_{\mathcal {D}}\left[ \hat{f}_{\text {par}}(y)|\mathbf {w}_{\text {U}}, \mathbf {x}_{\text {U}}, \mathbf {y}_{\text {U}}\right] \right\} +\mathrm {Var}_{\xi }\left\{ \mathrm {E}_{\mathcal {D}}\left[ \hat{f}_{\text {par}}(y)|\mathbf {w}_{\text {U}}, \mathbf {x}_{\text {U}}, \mathbf {y}_{\text {U}}\right] \right\} \nonumber \\&~{=:}~&J_1+J_2. \end{aligned}$$
(A.7)

Using (A.5), we have

$$\begin{aligned} J_2= & {} \mathrm {Var}_{\xi }\left[ \frac{1}{N} \sum _{i \in U}K_{h}(y-Y_i)\right] =(Nh)^{-1}d_Kf(y)+o\{(Nh)^{-1}\}, \end{aligned}$$
(A.8)

where (A.8) is a standard result in KDE (cf. Wand and Jones 1995, p. 21).

On the other hand, observe that we can rewrite (A.4) as follows:

$$\begin{aligned} \mathrm {Var}_{\mathcal {D}}\left[ \hat{f}_{\text {par}}(y;h)\right]\simeq & {} \frac{1}{N^2}\underset{i, j \in U}{\sum \sum }\left[ \frac{(\pi _{ij}-\pi _i\pi _j)}{\pi _i\pi _j} \left\{ K_{h}(y-Y_i)-K_{h}(y-\beta _{\text {U}}X_i)\right\} \right. \nonumber \\&\left. \times \left\{ K_{h}(y-Y_j)-K_{h}(y-\beta _{\text {U}}X_j)\right\} \right] . \end{aligned}$$
(A.9)

Taking \(\Delta _{ij}=(\pi _{ij}-\pi _i\pi _j)/\pi _i\pi _j\), \(\Delta _{i}=(1-\pi _i)/\pi _i\) and \(\hat{Y}_{\text {U}i}=\beta _{\text {U}} X_i\), (A.9) can be rewritten as follows:

$$\begin{aligned} \mathrm {Var}_{\mathcal {D}}\left[ \hat{f}_{\text {par}}(y;h)\right]\simeq & {} \frac{1}{N^2}\sum _{i \in U}\Delta _{i}\left\{ K_{h}(y-Y_i)-K_{h}(y-\hat{Y}_{\text {U}i})\right\} ^2 +\frac{1}{N^2}\nonumber \\&\times \underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij}\left\{ K_{h}(y-Y_i)-K_{h} (y-\hat{Y}_{\text {U}i})\right\} \nonumber \\&\times \left\{ K_{h}(y-Y_j)-K_{h}(y-\hat{Y}_{\text {U}j})\right\} \nonumber \\= & {} \frac{1}{N^2}\sum _{i \in U}\Delta _{i}V^2_i+\frac{1}{N^2}\underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij}V_iV_j, \end{aligned}$$
(A.10)

where \(V_l=\left\{ K_{h}(y-Y_l)-K_{h}(y-\hat{Y}_{Ul})\right\} \). Then we have

$$\begin{aligned} J_1\simeq & {} \frac{1}{N^2}\sum _{i \in U}\Delta _{i}\mathrm {E}_{\xi }(V^2_i)+\frac{1}{N^2}\underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij}\mathrm {E}_{\xi }(V_iV_j)~{=:}~J_{11}+J_{12}. \end{aligned}$$
(A.11)

We work out each term in (A.11) separately. First, note that

$$\begin{aligned} V^2_i=\left\{ K^2_h(Y_i-y)-2K_{h}(Y_i-y)K_{h}(\hat{Y}_{\text {U}i}-y)+K^2_{h}(\hat{Y}_{\text {U}i}-y)\right\} . \end{aligned}$$

By the conditions on the derivatives of K, we can form the following Taylor expansions

$$\begin{aligned} K_{h}(\hat{Y}_{\text {U}i}-y)= & {} \frac{1}{h}K\left( \frac{Y_i-y}{h}\right) +\frac{(\hat{Y}_{\text {U}i}-Y_i)}{h^2}K'\left( \frac{Y_i-y}{h}\right) \nonumber \\&+ \frac{(\hat{Y}_{\text {U}i}-Y_i)^2}{2h^3}K{''}\left( \frac{Y_i-y}{h}\right) +\frac{(\hat{Y}_{\text {U}i}-Y_i)^3}{6h^4}K^{(3)} \left( \frac{\varrho -y}{h}\right) ,\qquad \qquad \end{aligned}$$
(A.12)
$$\begin{aligned} K^2_{h}(\hat{Y}_{\text {U}i}-y)= & {} \frac{1}{h^2}K^2\left( \frac{Y_i-y}{h}\right) +2\frac{(\hat{Y}_{\text {U}i} -Y_i)}{h^3}K\left( \frac{Y_i-y}{h}\right) K'\left( \frac{Y_i-y}{h}\right) \nonumber \\&+\frac{(\hat{Y}_{\text {U}i}-Y_i)^2}{h^4}\left[ K'\left( \frac{Y_i-y}{h} \right) \right] ^2+\frac{(\hat{Y}_{\text {U}i}-Y_i)^2}{h^4}\nonumber \\&\times K \left( \frac{Y_i-y}{h}\right) K{''}\left( \frac{Y_i-y}{h}\right) +\frac{(\hat{Y}_{\text {U}i}-Y_i)^3}{6h^5}K^{2(3)}\left( \frac{\varrho -y}{h}\right) , \end{aligned}$$
(A.13)

for some \(\varrho \) between \(Y_i\) and \(\hat{Y}_{\text {U}i}\), where \(K^{(3)}\) and \(K^{2(3)}\) denote the third derivatives of K and \(K^2\), respectively. Using (A.12) and (A.13), we can write

$$\begin{aligned} J_{11}= & {} \frac{1}{N^2}\sum _{i \in U}\Delta _{i}\mathrm {E}_{\xi }\left\{ K^2_h(Y_i-y)-2K_{h}(Y_i-y)K_{h} (\hat{Y}_{\text {U}i}-y)+K^2_{h}(\hat{Y}_{\text {U}i}-y)\right\} \nonumber \\= & {} \frac{1}{N^2}\sum _{i \in U}\Delta _{i}\mathrm {E}_{\xi } \left\{ \frac{1}{h^4}\hat{Y}^2_{\text {U}i}\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2-\frac{2}{h^4}\hat{Y}_{\text {U}i}Y_i\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2\right. \nonumber \\&\left. +\frac{1}{h^4}Y^2_i\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2\right\} +o\left( \frac{1}{Nh^3}\right) \nonumber \\&~{=:}~&\frac{1}{N^2}\sum _{i \in U}\Delta _{i}\mathrm {E}_{\xi } \left\{ W_1-2W_2+W_3\right\} +o\left( \frac{1}{Nh^3}\right) , \end{aligned}$$
(A.14)

where the term \(o\left( \{Nh^3\}^{-1}\right) \) is obtained by noting that each of \(h^{-1}K^{(3)}\left( \frac{\varrho -y}{h}\right) \) and \(h^{-2}K^{2(3)}\left( \frac{\varrho -y}{h}\right) \) is \(O_p(1)\) (it is not hard to verify that each of \(\mathrm {E}_{\xi }\big |h^{-1}K^{(3)}\left( \frac{Y-y}{h}\right) \big |\) and \(\mathrm {E}_{\xi }\big |h^{-2}K^{2(3)}\left( \frac{Y-y}{h}\right) \big |\) is O(1)), and by the assumption about the finite population moments of the errors.

We now evaluate the model expectation of each term on the right-hand side of (A.14).

$$\begin{aligned} \frac{1}{N}\mathrm {E}_{\xi }(W_3)= & {} \frac{1}{Nh^4}\int _{\mathbb {R}}y^2_i \left[ K'\left( \frac{y_i-y}{h}\right) \right] ^2f(y_i)dy_i\nonumber \\= & {} \frac{1}{Nh^3}\int _{\mathbb {R}}(y+hz)^2\left[ K'\left( z\right) \right] ^2 f(y+hz)dz\nonumber \\= & {} \frac{1}{Nh^3}\int _{\mathbb {R}}\left[ K'\left( z\right) \right] ^2(y^2+2yhz +h^2z^2)\left[ f(y)+hzf'(y)+\cdots \right] dz\nonumber \\= & {} \frac{1}{Nh^3}y^2f(y)\int _{\mathbb {R}}\left[ K'\left( z\right) \right] ^2dz +\frac{1}{Nh^2}\left[ y^2f'(y)+2yf(y)\right] \int _{\mathbb {R}}z \left[ K'\left( z\right) \right] ^2dz\nonumber \\&+\frac{1}{Nh}\left[ f(y)+2yf'(y)\right] \int _{\mathbb {R}}z^2 \left[ K' \left( z\right) \right] ^2dz+o\left( \frac{1}{Nh}\right) \nonumber \\= & {} \frac{1}{Nh^3}y^2f(y)d_{K'}+\frac{1}{Nh}\left[ f(y)+2yf'(y)\right] c^{*}_{K'}+o\left( \frac{1}{Nh}\right) . \end{aligned}$$
(A.15)

where (A.15) follows from the fact that \(\int _{\mathbb {R}}z \{K'\left( z\right) \}^2dz=0\).

Consider \(\beta _{\text {U}}=[\sum _{i \in U}x_iy_i][\sum _{i \in U}x^2_i]^{-1}\) as defined under Eq. (2.9). It can be easily shown that under the assumptions of the theorem [cf. Fuller (2009, p. 107)]

$$\begin{aligned} \beta _{\text {U}}-\beta =O_p\left( N^{-1/2}\right) . \end{aligned}$$
(A.16)

Therefore, we can write

$$\begin{aligned} \frac{1}{N}\mathrm {E}_{\xi }(W_2)= & {} \frac{1}{Nh^4}\mathrm {E}_{\xi } \left\{ \beta _{\text {U}}X_iY_i\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2\right\} \nonumber \\= & {} \frac{1}{Nh^4}\mathrm {E}_{\xi }\left\{ \left( \beta +O_p(N^{-1/2}) \right) X_iY_i\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2\right\} \nonumber \\= & {} \frac{1}{Nh^4}\mathrm {E}_{\xi }\left\{ \beta X_iY_i\left[ K' \left( \frac{Y_i-y}{h}\right) \right] ^2\right\} +O\left( \frac{1}{N^{\frac{3}{2}}h^4} \right) . \end{aligned}$$
(A.17)

But

$$\begin{aligned}&\frac{1}{Nh^4}\mathrm {E}_{\xi }\left\{ X_iY_i\left[ K'\left( \frac{Y_i-y}{h} \right) \right] ^2\right\} \nonumber \\&\quad =\frac{1}{Nh^4}\iint _{\mathbb {R}}x_iy_i\left[ K'\left( \frac{y_i-y}{h} \right) \right] ^2t(x_i|y_i)f(y_i)dx_idy_i\nonumber \\&\quad =\frac{1}{Nh^3} \iint _{\mathbb {R}}w(y+hz)\left[ K'\left( z\right) \right] ^2t(w|y+hz)f(y+hz)dwdz\nonumber \\&\quad =\frac{1}{Nh^3}yf(y)d_{K'}\int _{\mathbb {R}}wt(w|y)dw\nonumber \\&\qquad +\frac{1}{Nh}c^{*}_{K'}\int _{\mathbb {R}} w\left[ yf'(y)t_2'(w|y) +f'(y)t(w|y)+f(y)t'_2(w|y)\right] dw+o\left( \frac{1}{Nh}\right) ,\nonumber \\ \end{aligned}$$
(A.18)

where \(t'_2(u|v)=\partial t(u|v)/\partial v\). Substituting (A.18) into (A.17), we get

$$\begin{aligned} \frac{1}{N}\mathrm {E}_{\xi }(W_2)= & {} \frac{1}{Nh^3}\beta yf(y)d_{K'} \int _{\mathbb {R}}xt(x|y)dx\nonumber \\&+\frac{1}{Nh}\beta c^{*}_{K'}\int _{\mathbb {R}}x\left[ yf'(y)t_2'(x|y)+f'(y) t(x|y)+f(y)t'_2(x|y)\right] dx\nonumber \\&+O\left( \frac{1}{N^{\frac{3}{2}}h^4}\right) +o\left( \frac{1}{Nh}\right) . \end{aligned}$$
(A.19)

We now consider \(W_1\). Observe that using (A.16), we have

$$\begin{aligned} \frac{1}{N}\mathrm {E}_{\xi }(W_1)= & {} \frac{1}{Nh^4}\mathrm {E}_{\xi } \left\{ \beta ^2_{\text {U}}X^2_i\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2\right\} \nonumber \\= & {} \frac{1}{Nh^4}\mathrm {E}_{\xi }\left\{ \left\{ \beta +O_p\left( N^{-1/2}\right) \right\} ^2X^2_i\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2\right\} \nonumber \\= & {} \frac{1}{Nh^4}\mathrm {E}_{\xi }\left\{ \beta ^2 X^2_i\left[ K' \left( \frac{Y_i-y}{h}\right) \right] ^2\right\} +O\left( \frac{1}{N^{\frac{3}{2}}h^4}+\frac{1}{N^2h^4}\right) . \end{aligned}$$
(A.20)

But

$$\begin{aligned} \frac{1}{Nh^4}\mathrm {E}_{\xi }\left\{ X^2_i\left[ K'\left( \frac{Y_i-y}{h}\right) \right] ^2\right\}= & {} \frac{1}{Nh^3}f(y)d_{K'}\int _{\mathbb {R}}w^2t(w|y)dw\nonumber \\&+\frac{1}{Nh}c^{*}_{K'}f'(y)\int _{\mathbb {R}}w^2t_2'(w|y)dw\nonumber \\&+o\left( \{Nh\}^{-1}\right) . \end{aligned}$$
(A.21)

Substituting (A.21) in (A.20) gives

$$\begin{aligned} \frac{1}{N}\mathrm {E}_{\xi }(W_1)= & {} \frac{1}{Nh^3}\beta ^2 d_{K'}f(y)\int _{\mathbb {R}}w^2t(w|y)dw+\frac{1}{Nh}\beta ^2 c^{*}_{K'}f'(y)\int _{\mathbb {R}}w^2t_2'(w|y)dw\nonumber \\&+O\left( \frac{1}{N^{\frac{3}{2}}h^4}+\frac{1}{N^2h^4}\right) +o\left( \frac{1}{Nh}\right) . \end{aligned}$$
(A.22)

Now, we use (A.15), (A.19) and (A.22) in (A.14) to get

$$\begin{aligned} J_{11}= & {} \left( \frac{1}{N}\sum _{i \in U}\Delta _{i}\right) \left\{ \frac{1}{Nh^3}\left[ \beta ^2 f(y)\int _{\mathbb {R}}x^2t(x|y)dx-2\beta yf(y)\right. \right. \nonumber \\&\quad \left. \int _{\mathbb {R}}xt(x|y)dx+y^2f(y)\right] d_{K'}\nonumber \\&\left. +\frac{1}{Nh}\left[ \beta ^2 f'(y)\int _{\mathbb {R}}x^2t_2'(x|y)dx-2\beta \int _{\mathbb {R}}x[yf'(y)t_2'(x|y)\right. \right. \nonumber \\&\qquad +f'(y)t(x|y)+f(y)t'_2(x|y)]dx\nonumber \\&\left. \left. \qquad +f(y)+2yf'(y) \right] c^{*}_{K'}+O\left( \frac{1}{N^{\frac{3}{2}}h^4}+\frac{1}{N^2h^4}\right) +o\left( \frac{1}{Nh}\right) \right\} +o\left( \frac{1}{Nh^3}\right) .\nonumber \\ \end{aligned}$$
(A.23)

If we keep only terms of order \(O\left( \{Nh^3\}^{-1}\right) \), (A.23) reduces to

$$\begin{aligned} J_{11}= & {} \left( \frac{1}{N}\sum _{i \in U}\Delta _{i}\right) \frac{1}{Nh^3}\left[ \beta ^2f(y)\int _{\mathbb {R}}x^2t(x|y)dx-2\beta yf(y) \int _{\mathbb {R}}xt(x|y)dx+y^2f(y)\right] d_{K'}\nonumber \\&+\,o\left( \{Nh^3\}^{-1}\right) . \end{aligned}$$
(A.24)

Next, using a Taylor expansion for \(K_h(\hat{Y}_{\text {U}l}-y)\) (see (A.12)), we can write

$$\begin{aligned} J_{12}= & {} \frac{1}{N^2}\underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij}\mathrm {E}_{\xi }\left\{ K_{h}(Y_i-y)K_{h}(Y_j-y) -2K_{h}(Y_i-y)K_{h}(\hat{Y}_{\text {U}j}-y)\right. \nonumber \\&\left. +K_{h}(\hat{Y}_{\text {U}i}-y)K_{h} (\hat{Y}_{\text {U}j}-y)\right\} \nonumber \\= & {} \frac{1}{N^2}\underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij}\mathrm {E}_{\xi } \left\{ \frac{1}{h^2}K\left( \frac{Y_i-y}{h}\right) K\left( \frac{Y_j-y}{h}\right) -\frac{2}{h^2}K\left( \frac{Y_i-y}{h}\right) K\left( \frac{Y_j-y}{h}\right) \right. \nonumber \\&\left. -\frac{2}{h^3}(\hat{Y}_{\text {U}j}-Y_j)K \left( \frac{Y_i-y}{h}\right) K'\left( \frac{Y_j-y}{h}\right) +\frac{1}{h^2}K \left( \frac{Y_i-y}{h}\right) \right. \nonumber \\&\left. \times K\left( \frac{Y_j-y}{h}\right) +\frac{2}{h^3}(\hat{Y}_{\text {U}j}-Y_j)K\left( \frac{Y_i-y}{h}\right) K' \left( \frac{Y_j-y}{h}\right) \right. \nonumber \\&\left. +\frac{1}{h^4}(\hat{Y}_{\text {U}i}-Y_i) (\hat{Y}_{\text {U}j}-Y_j)K'\left( \frac{Y_i-y}{h}\right) K'\left( \frac{Y_j-y}{h} \right) \right\} +o\left( \frac{1}{nh^2}\right) \nonumber \\= & {} \frac{1}{N^2}\underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij} \mathrm {E}_{\xi }\left\{ \frac{1}{h^4}\left( \hat{Y}_{\text {U}i} \hat{Y}_{\text {U}j}-2Y_i\hat{Y}_{\text {U}j}+Y_iY_j\right) K'\left( \frac{Y_i-y}{h} \right) K'\left( \frac{Y_j-y}{h}\right) \right\} +o\left( \frac{1}{nh^2}\right) \nonumber \\= & {} \frac{1}{N^2}\underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij} \mathrm {E}_{\xi }\left\{ \frac{1}{h^4}\hat{Y}_{\text {U}i}\hat{Y}_{\text {U}j} K'\left( \frac{Y_i-y}{h}\right) K'\left( \frac{Y_j-y}{h}\right) -\frac{2}{h^4}Y_i \hat{Y}_{\text {U}j}K'\left( \frac{Y_i-y}{h}\right) \right. \nonumber \\&\left. \times K'\left( \frac{Y_j-y}{h}\right) +\frac{1}{h^4} Y_iY_jK'\left( \frac{Y_i-y}{h}\right) K'\left( \frac{Y_j-y}{h}\right) \right\} +o \left( \frac{1}{nh^2}\right) \nonumber \\=: & {} \frac{1}{N^2}\underset{i,j\in U, i\not = j}{\sum \sum }\Delta _{ij} \mathrm {E}_{\xi }\left\{ H_1-2H_2+H_3\right\} +o\left( \frac{1}{nh^2}\right) . \end{aligned}$$
(A.25)

Starting with \(H_3\), note that

$$\begin{aligned} \frac{1}{n}\mathrm {E}_{\xi }(H_3)= & {} \frac{1}{nh^4}\mathrm {E}_{\xi }\left[ Y_iY_jK'\left( \frac{Y_i-y}{h}\right) K'\left( \frac{Y_j-y}{h}\right) \right] \nonumber \\&\overset{iid}{=}&\frac{1}{nh^4}\left\{ \mathrm {E}_{\xi }\left[ Y_1K'\left( \frac{Y_1-y}{h}\right) \right] \right\} ^2\nonumber \\= & {} \frac{1}{nh^4}\left\{ \int _{\mathbb {R}}y_1K'\left( \frac{y_1-y}{h}\right) f(y_1)dy_1\right\} ^2\nonumber \\= & {} \frac{1}{nh^4}\left\{ h\int _{\mathbb {R}}(y+hz)K'(z)f(y+hz)dz\right\} ^2\nonumber \\= & {} \frac{1}{nh^2}\left\{ yf(y)\int _{\mathbb {R}}K'(z)dz+h[f(y)+yf'(y)]\int _{\mathbb {R}}zK'(z)dz+o(h)\right\} ^2\nonumber \\= & {} \frac{1}{nh^2}\left\{ h[f(y)+yf'(y)]\int _{\mathbb {R}}zK'(z)dz+o(h)\right\} ^2=O\left( \frac{1}{n}\right) , \end{aligned}$$
(A.26)

where the second-to-last equality in (A.26) uses the fact that \(\int _{\mathbb {R}}K'(z)dz=0\), which holds by the assumptions on K.

Next, consider \(H_1\) and note that

$$\begin{aligned} \frac{1}{n}\mathrm {E}_{\xi }(H_1)= & {} \frac{1}{nh^4}\mathrm {E}_{\xi }\left[ \beta ^2_{\text {U}}X_iX_jK'\left( \frac{Y_i-y}{h}\right) K'\left( \frac{Y_j-y}{h}\right) \right] \nonumber \\&\overset{iid}{=}&\frac{1}{nh^4}\left\{ \mathrm {E}_{\xi }\left[ \{\beta +O_p(N^{-1/2})\}^2X_1K'\left( \frac{Y_1-y}{h}\right) \right] \right\} ^2\nonumber \\= & {} \frac{\beta ^2}{nh^4}\left\{ \int \int _{\mathbb {R}}x_1K'\left( \frac{y_1-y}{h}\right) t(x_1,y_1)dx_1dy_1\right\} ^2+O(\{nNh^2\}^{-1})\nonumber \\= & {} \frac{\beta ^2}{nh^4}\left\{ h\int \int _{\mathbb {R}}wK'(z)t(w,y+hz)dwdz\right\} ^2+O(\{nNh^2\}^{-1})\nonumber \\= & {} \frac{\beta ^2}{nh^2}\left\{ \int \int _{\mathbb {R}}wK'(z)\left\{ f(y)+hzf'(y)+o(h)\right\} \right. \nonumber \\&\qquad \left. \left\{ t(w|y)+hzt'_2(w|y)+o(h)\right\} dwdz\right\} ^2\nonumber \\&+O(\{nNh^2\}^{-1})\nonumber \\= & {} \frac{\beta ^2}{nh^2}\left\{ f(y)\left( \int _{\mathbb {R}}K'(z)dz\right) \left( \int _{\mathbb {R}}m(w)t(w|y)dw\right) +h\left( \int _{\mathbb {R}}zK'(z)dz\right) \right. \nonumber \\&\times \left. \left( f(y)\int _{\mathbb {R}}m(w)t'_2(w|y)dw+f'(y)\int _{\mathbb {R}}m(w)t(w|y)dw\right) +o(h)\right\} ^2\nonumber \\&+O\left( \frac{1}{nNh^2}\right) \nonumber \\= & {} O(n^{-1}). \end{aligned}$$
(A.27)

Similar calculations show that the model expectation of \(H_2\) is O(1). Further, note that Assumption A4 implies

$$\begin{aligned} \frac{n}{N^2}\sum _{i \in U}\Delta _{i}= & {} \frac{n}{N^2}\sum _{i \in U}\frac{(1-\pi _i)}{\pi _i}\le \frac{n}{N^2}\sum _{i \in U}\frac{(1-\lambda )}{\lambda }=\frac{n}{N}\frac{(1-\lambda )}{\lambda }=O(1) \end{aligned}$$

and

$$\begin{aligned} \frac{n}{N^2}\underset{i,j \in U, \ i\not = j}{\sum \sum }\Delta _{ij}= & {} \frac{n}{N^2}\underset{i,j \in U, \ i\not = j}{\sum \sum }\frac{(\pi _{ij}-\pi _i\pi _j)}{\pi _i\pi _j}\\\le & {} \frac{1}{N^2}\underset{i,j \in U, \ i\not = j}{\sum \sum }\frac{n \ \underset{i\not =j}{\max }|\pi _{ij}-\pi _i\pi _j|}{\lambda ^2}=O(1). \end{aligned}$$

Consequently,

$$\begin{aligned} J_{12}= & {} O(n^{-1}). \end{aligned}$$
(A.28)

Finally, using (A.24) and (A.28) in (A.11), we get

$$\begin{aligned} J_1= & {} \left( \frac{n}{N^2}\sum _{i \in U}\Delta _{i}\right) \frac{1}{nh^3}\left[ \beta ^2 f(y)\int _{\mathbb {R}}x^2t(x|y)dx-2\beta yf(y)\int _{\mathbb {R}}xt(x|y)dx+y^2f(y)\right] d_{K'}\nonumber \\&+\,o\left( \{Nh^3\}^{-1}\right) . \end{aligned}$$
(A.29)

Now, use (A.8) and (A.29) in (A.7) to get

$$\begin{aligned} \mathrm {Var}_C\left[ \hat{f}_{\text {par}}(y)\right]= & {} \left( \frac{n}{N^2}\sum _{i \in U}\Delta _{i}\right) \frac{1}{nh^3}\left[ \beta ^2 f(y)\int _{\mathbb {R}}x^2t(x|y)dx-2\beta yf(y)\right. \nonumber \\&\quad \left. \int _{\mathbb {R}}xt(x|y)dx+y^2f(y)\right] d_{K'} +o\left( \{Nh^3\}^{-1}\right) . \end{aligned}$$
(A.30)

Integrating (A.30) over y gives the following integrated variance of \(\hat{f}_{\text {par}}(\cdot )\):

$$\begin{aligned} {\mathrm {IVar}}_C\left[ \hat{f}_{\text {par}}(\cdot )\right]= & {} \left( \frac{n}{N^2}\sum _{i \in U}\Delta _{i}\right) \frac{1}{nh^3}\left[ \beta ^2\iint _{\mathbb {R}}x^2t(x|y)f(y) dxdy\right. \nonumber \\&\left. -2\beta \iint _{\mathbb {R}}yxt(x|y)f(y)dxdy+\int _{\mathbb {R}}y^2f(y)dy\right] d_{K'}+o \left( \frac{1}{Nh^3}\right) \nonumber \\= & {} \left( \frac{n}{N^2}\sum _{i \in U}\Delta _{i}\right) \frac{1}{nh^3} \left[ \beta ^2\mu _{_{X^2}}- 2\beta \mu _{_{XY}}+\mu _{_{Y^2}}\right] d_{K'}+o\left( \frac{1}{Nh^3}\right) .\nonumber \\ \end{aligned}$$
(A.31)

Integrating the squared bias [see (2.10)] gives the second component of the MISE of \(\hat{f}_{\text {par}}(\cdot )\):

$$\begin{aligned} {\mathrm {ISB}}_C\left[ \hat{f}_{\text {par}}(\cdot )\right]= & {} \frac{1}{4}h^4c_K \int _{\mathbb {R}}\left\{ f''(y)\right\} ^2dy+o(h^4)\nonumber \\= & {} \frac{1}{4}h^4c_Kd_{f''}+o(h^4). \end{aligned}$$
(A.32)

Adding (A.31) to (A.32) completes the proof. \(\square \)
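A remark worth recording (a routine calculation added here for completeness, assuming the leading terms displayed in (A.31) and (A.32) dominate): writing \(\mathrm {MISE}_C\simeq V(nh^3)^{-1}+\frac{1}{4}h^4c_Kd_{f''}\) with \(V=\left( nN^{-2}\sum _{i \in U}\Delta _{i}\right) \left[ \beta ^2\mu _{_{X^2}}-2\beta \mu _{_{XY}}+\mu _{_{Y^2}}\right] d_{K'}\), minimization over h gives

$$\begin{aligned} \frac{\partial }{\partial h}\left\{ \frac{V}{nh^3}+\frac{1}{4}h^4c_Kd_{f''}\right\} =0 \quad \Longrightarrow \quad h_{\text {opt}}=\left( \frac{3V}{c_Kd_{f''}}\right) ^{1/7}n^{-1/7}, \end{aligned}$$

so the combined-inference MISE of \(\hat{f}_{\text {par}}\) is of order \(n^{-4/7}\); replacing V and \(d_{f''}\) by estimates in \(h_{\text {opt}}\) yields data-driven bandwidths of the kind discussed in the paper.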

Proof of Lemma 2.1

First, rewrite the pseudo density estimator in (2.9) as follows

$$\begin{aligned} \tilde{f}_{\text {par}}(y;h)= & {} \frac{1}{N}\left[ \sum _{i\in s}d_i(u_i(h)-\tilde{v}_i(h))+\sum _{i\in U}\tilde{v}_i(h)\right] , \end{aligned}$$

where \(u_i(h)=K_h(y-y_i)\) and \(\tilde{v}_i(h)~{:=}~K_h(y-\beta _{\text {U}}x_i)\). Under SRS, \(d_i=N/n\) for all \(i\in U\). Hence,

$$\begin{aligned} \tilde{f}_{\text {par}}(y;h)= & {} \frac{1}{n}\sum _{i\in s}(u_i(h)-\tilde{v}_i(h))+\frac{1}{N}\sum _{i\in U}\tilde{v}_i(h)=\frac{1}{n}\sum _{i\in s}w_i(h), \end{aligned}$$
(A.33)

where \(w_i(h)~{:=}~u_i(h)-\tilde{v}_i(h)+(1/N)\sum _{j\in U}\tilde{v}_j(h)\). Therefore, the design-variance of \(\tilde{f}_{\text {par}}\) is

$$\begin{aligned} \Gamma _{\mathcal {D}}=\left( 1-\frac{n}{N}\right) \frac{\left[ \sum _{i\in U}(u_i(h)-\tilde{v}_i(h))^2-N^{-1}\{\sum _{i\in U}(u_i(h)-\tilde{v}_i(h))\}^2\right] }{n(N-1)}.\qquad \quad \end{aligned}$$
(A.34)

Since \(\tilde{f}_{\text {par}}\) is a sample mean as shown in (A.33), to conclude that

$$\begin{aligned} \frac{\tilde{f}_{\text {par}}(y;h)-f_{\text {U}}(y;h)}{\Gamma ^{1/2}_{\mathcal {D}}} \overset{\mathcal {L}_{\mathcal {D}}}{\longrightarrow }\mathcal {N}(0,1), \end{aligned}$$
(A.35)

we need to verify the following Lyapunov condition (e.g., Thompson 1997, p. 59):

$$\begin{aligned} \left( 1-\frac{n}{N}\right) \sum _{i\in s}\mathrm {E}_{\mathcal {D}}\big |w_i(h) -\mathrm {E}_{\mathcal {D}}[w_i(h)]\big |^{2+\eta }=o\left( \left[ n^2\left( \frac{N-1}{N}\right) \Gamma _{\mathcal {D}}\right] ^{(2+\eta )/2}\right) .\nonumber \\ \end{aligned}$$
(A.36)

Note that,

$$\begin{aligned} \mathrm {E}_{\mathcal {D}}[w_i(h)]= & {} \frac{1}{N}\sum _{i \in U}w_i(h)\\= & {} \frac{1}{N}\sum _{i \in U}\left[ u_i(h)-\tilde{v}_i(h)+\frac{1}{N}\sum _{j\in U}\tilde{v}_j(h)\right] \\= & {} \frac{1}{N}\sum _{i\in U}u_i(h)=\frac{1}{N}\sum _{i\in U}K_h(y-y_i)=f_{\text {U}}(y;h). \end{aligned}$$

Therefore, by the boundedness of the kernel (\(|K(u)|\le K^*<\infty \) for all u), we have

$$\begin{aligned} \left| w_i(h)-\mathrm {E}_{\mathcal {D}}[w_i(h)]\right|= & {} \left| u_i(h)-\tilde{v}_i(h) +\frac{1}{N}\sum _{j\in U}\tilde{v}_j(h)-\frac{1}{N}\sum _{j\in U}u_j(h)\right| \\= & {} \left| (u_i(h)-\tilde{v}_i(h))-\frac{1}{N}\sum _{j\in U}(u_j(h)-\tilde{v}_j(h)) \right| \\\le & {} \left| u_i(h)-\tilde{v}_i(h)\right| +\frac{1}{N}\sum _{j\in U}\left| u_j(h)-\tilde{v}_j(h)\right| \le \frac{2K^*}{h}. \end{aligned}$$

Consequently,

$$\begin{aligned}&\sum _{i\in s}\mathrm {E}_{\mathcal {D}}\big |w_i(h)-\mathrm {E}_{\mathcal {D}}[w_i(h)]\big |^3\\&\quad \le \sum _{i\in s}\mathrm {E}_{\mathcal {D}}\big |w_i(h)-\mathrm {E}_{\mathcal {D}}[w_i(h)]\big |^2 \underset{i\in s}{\max }\big |w_i(h)-\mathrm {E}_{\mathcal {D}}[w_i(h)]\big |\\&\quad \le \frac{2K^*}{h}\sum _{i\in s}\mathrm {E}_{\mathcal {D}}\left[ w_i(h)-\mathrm {E}_{\mathcal {D}} \{w_i(h)\}\right] ^2\\&\quad =\frac{2K^*}{h}\sum _{i\in s}\mathrm {E}_{\mathcal {D}}\left[ (u_i(h)-\tilde{v}_i(h)) -\frac{1}{N}\sum _{j\in U}(u_j(h)-\tilde{v}_j(h))\right] ^2\\&\quad =\frac{2K^*}{h}\frac{n}{N}\sum _{i\in U}\left[ (u_i(h)-\tilde{v}_i(h)) -\frac{1}{N}\sum _{j\in U}(u_j(h)-\tilde{v}_j(h))\right] ^2\\&\quad =\frac{2K^*}{h}n^2\left( \frac{N-1}{N}\right) \left( 1-\frac{n}{N}\right) ^{-1} \Gamma _{\mathcal {D}}. \end{aligned}$$

Thus,

$$\begin{aligned} \left( 1-\frac{n}{N}\right) \sum _{i\in s}\mathrm {E}_{\mathcal {D}}\big |w_i(h) -\mathrm {E}_{\mathcal {D}}[w_i(h)]\big |^3\le & {} \frac{2K^*}{h}n^2 \left( \frac{N-1}{N}\right) \Gamma _{\mathcal {D}}, \end{aligned}$$
(A.37)

and

$$\begin{aligned}&\frac{\left( 1-\frac{n}{N}\right) \sum _{i\in s}\mathrm {E}_{\mathcal {D}}\big |w_i(h) -\mathrm {E}_{\mathcal {D}}[w_i(h)]\big |^3}{\left[ n^2\left( \frac{N-1}{N}\right) \Gamma _{\mathcal {D}}\right] ^{3/2}}\nonumber \\&\quad \le \frac{2K^*}{h}n^2\left( \frac{N-1}{N}\right) \Gamma _{\mathcal {D}} \left[ n^2\left( \frac{N-1}{N}\right) \Gamma _{\mathcal {D}}\right] ^{-3/2}\nonumber \\&\quad =\frac{2K^*}{nh}\left( \frac{N-1}{N}\right) ^{-1/2} \Gamma ^{-1/2}_{\mathcal {D}}\nonumber \\&\quad =\frac{2K^*}{\sqrt{n}h}\left( 1-\frac{n}{N}\right) ^{-1/2} \left[ \frac{\sum _{i\in U}(u_i(h)-\tilde{v}_i(h))^2-N^{-1}\{\sum _{i\in U} (u_i(h)-\tilde{v}_i(h))\}^2}{N}\right] ^{-1/2}\nonumber \\&\quad \rightarrow 0, \end{aligned}$$
(A.38)

because \(\sqrt{n}h\rightarrow \infty \) by assumption. Therefore, Lyapunov’s condition (A.36) holds with \(\eta =1\) and the asymptotic result in (A.35) is proven. Since by Lemma A2, \(\hat{f}_{\text {par}}(y;h)\) and \(\tilde{f}_{\text {par}}(y;h)\) have the same limiting distribution in the design space, it follows from (A.35) and Assumption A5 that

$$\begin{aligned} \frac{\hat{f}_{\text {par}}(y;h)-f_{\text {U}}(y;h)}{\Gamma ^{1/2}_{\mathcal {D}}} \overset{\mathcal {L}_{\mathcal {D}}}{\longrightarrow }\mathcal {N}(0,1). \end{aligned}$$
(A.39)

To complete the proof of the lemma, it remains to show that \(\hat{\Gamma }_{\mathcal {D}}\) is a design-consistent estimator for \(\Gamma _{\mathcal {D}}\), or equivalently,

$$\begin{aligned} \big |\hat{\Gamma }_{\mathcal {D}}-\Gamma _{\mathcal {D}}\big | \overset{\mathbb {P}_{\mathcal {D}}}{\longrightarrow }0, \ \text {as} \ n \ \text {increases}. \end{aligned}$$

Consider using

$$\begin{aligned} \tilde{\sigma }^2_{\mathcal {D}}=\frac{1}{n-1}\sum _{i\in s}\left[ (u^*_i(h)-\tilde{v}^*_i(h))-\frac{1}{n}\sum _{j\in s}(u^*_j(h)-\tilde{v}^*_j(h))\right] ^2 \end{aligned}$$
(A.40)

as an estimator for

$$\begin{aligned} \sigma ^2_{\mathcal {D}}=\frac{1}{N}\sum _{i\in U}\left[ (u^*_i(h)-\tilde{v}^*_i(h))-\frac{1}{N}\sum _{j\in U}(u^*_j(h)-\tilde{v}^*_j(h))\right] ^2, \end{aligned}$$
(A.41)

where \(u^*_i(h)~{:=}~hu_i(h)=K(\{y-y_i\}/h)\) and \(\tilde{v}^*_i(h)~{:=}~h\tilde{v}_i(h)=K(\{y-\beta _{\text {U}} x_i\}/h)\). Note that by the boundedness of K,

$$\begin{aligned} 0<\sigma ^2_{\mathcal {D}}\le \frac{1}{N}\sum _{i\in U}\left[ \big |u^*_i(h)-\tilde{v}^*_i(h)\big |+\frac{1}{N}\sum _{j\in U}\big |u^*_j(h)-\tilde{v}^*_j(h)\big |\right] ^2\le 4(K^*)^2<\infty . \end{aligned}$$
(A.42)

Similarly,

$$\begin{aligned}&\underset{i\in U}{\max }\left[ (u^*_i(h)-\tilde{v}^*_i(h))-\frac{1}{N}\sum _{j\in U}(u^*_j(h)-\tilde{v}^*_j(h))\right] ^2/N\sigma ^2_{\mathcal {D}}\nonumber \\&\quad \le \frac{4(K^*)^2}{N\sigma ^2_{\mathcal {D}}}\rightarrow 0 \ \text {as} \ N\rightarrow \infty . \end{aligned}$$
(A.43)

The bounds in (A.42) and (A.43) imply that conditions (2.10) and (2.12) on page 294 of Sen (1988) are satisfied. Thus, using the result in (2.19) of Sen (1988), we have

$$\begin{aligned} \big |\tilde{\sigma }^2_{\mathcal {D}}-\sigma ^2_{\mathcal {D}}\big | \overset{\mathbb {P}_{\mathcal {D}}}{\longrightarrow }0, \quad \text {as} \ n \ \text {increases}. \end{aligned}$$
(A.44)

Consequently, taking \(\tilde{\Gamma }_{\mathcal {D}}=(1-n/N)\tilde{\sigma }^2_{\mathcal {D}}/nh^2\), we have

$$\begin{aligned} \big |\tilde{\Gamma }_{\mathcal {D}}-\Gamma _{\mathcal {D}}\big |= & {} \frac{1}{nh^2} \left( 1-\frac{n}{N}\right) \Big |\tilde{\sigma }^2_{\mathcal {D}}-\frac{N}{(N-1)}\sigma ^2_{\mathcal {D}}\Big |\overset{\mathbb {P}_{\mathcal {D}}}{\longrightarrow }0, \ \text {as} \ n \ \text {increases}.\qquad \qquad \end{aligned}$$
(A.45)

Define

$$\begin{aligned} \hat{\sigma }^2_{\mathcal {D}}=\frac{1}{n-1}\sum _{i\in s}\left[ (u^*_i(h)-v^*_i(h))-\frac{1}{n}\sum _{j\in s}(u^*_j(h)-v^*_j(h))\right] ^2, \end{aligned}$$

where \(v^*_i(h)~{:=}~hv_i(h)=K(\{y-\hat{\beta }x_i\}/h)\), and observe that the only difference between \(\hat{\sigma }^2_{\mathcal {D}}\) and \(\tilde{\sigma }^2_{\mathcal {D}}\) is that \(\hat{\sigma }^2_{\mathcal {D}}\) depends on \(\hat{\beta }\) through \(v^*_i(h)\) while \(\tilde{\sigma }^2_{\mathcal {D}}\) depends on \(\beta _{\text {U}}\) through \(\tilde{v}^*_i(h)\). Since, in the design space, \(\hat{\beta }\) consistently estimates \(\beta _{\text {U}}\), as we discussed in Sect. 2.1, the results of Randles (1982) imply that \(\hat{\sigma }^2_{\mathcal {D}}\) and \(\tilde{\sigma }^2_{\mathcal {D}}\) share the same limiting properties. Additionally, it is readily seen that \(\hat{\Gamma }_{\mathcal {D}}=(1-n/N)\hat{\sigma }^2_{\mathcal {D}}/nh^2\). Therefore, (A.45) implies that

$$\begin{aligned} \big |\hat{\Gamma }_{\mathcal {D}}-\Gamma _{\mathcal {D}}\big |= & {} \frac{1}{nh^2} \left( 1-\frac{n}{N}\right) \Big |\hat{\sigma }^2_{\mathcal {D}}-\frac{N}{(N-1)} \sigma ^2_{\mathcal {D}}\Big |\overset{\mathbb {P}_{\mathcal {D}}}{\longrightarrow }0, \ \text {as} \ n \ \text {increases}.\nonumber \\ \end{aligned}$$
(A.46)

The proof of the lemma is complete upon using the results in (A.39) and (A.46) in conjunction with Slutsky’s theorem. \(\square \)
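Operationally, Lemma 2.1 licenses a plug-in normal interval for \(f_{\text {U}}(y;h)\) under SRS. A minimal R sketch of \(\hat{\Gamma }_{\mathcal {D}}=(1-n/N)\hat{\sigma }^2_{\mathcal {D}}/nh^2\) (Gaussian kernel assumed; all names are illustrative):

```r
# Plug-in design-variance estimate from Lemma 2.1 (SRS case).
# ys, xs: sampled values; xU: complete auxiliary vector; y0: evaluation point.
gamma_hat <- function(ys, xs, xU, y0, h) {
  n <- length(ys); N <- length(xU)
  beta_hat <- sum(xs * ys) / sum(xs^2)          # under SRS the weights cancel
  ustar <- dnorm((y0 - ys) / h)                 # u*_i(h) = K({y - y_i}/h)
  vstar <- dnorm((y0 - beta_hat * xs) / h)      # v*_i(h) = K({y - beta.hat x_i}/h)
  (1 - n / N) * var(ustar - vstar) / (n * h^2)  # var() uses the 1/(n-1) divisor
}

# Approximate 95% interval for f_U(y0; h), reusing fhat_par() from above:
# fhat_par(ys, xs, rep(N / n, n), xU, y0, h) +
#   c(-1, 1) * qnorm(0.975) * sqrt(gamma_hat(ys, xs, xU, y0, h))
```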

Proof of Theorem 2.3

First, note that

$$\begin{aligned} \sqrt{nh^3}[\hat{f}_{\text {par}}(y;h)-\mathrm {E}_{\xi }\{f_{\text {U}}(y;h)\}]= & {} \sqrt{nh^3}[\hat{f}_{\text {par}}(y;h)-f_{\text {U}}(y;h)]\nonumber \\&+\sqrt{nh^3}[f_{\text {U}}(y;h)-\mathrm {E}_{\xi }\{f_{\text {U}}(y;h)\}].\nonumber \\ \end{aligned}$$
(A.47)

Now, we have

$$\begin{aligned} \sqrt{nh^3}[f_{\text {U}}(y;h)-\mathrm {E}_{\xi }\{f_{\text {U}}(y;h)\}]= & {} h\sqrt{\frac{n}{N}}\sqrt{Nh}[f_{\text {U}}(y;h)-\mathrm {E}_{\xi } \{f_{\text {U}}(y;h)\}]\overset{\mathbb {P}_{\xi }}{\longrightarrow }0,\nonumber \\ \end{aligned}$$
(A.48)

where (A.48) follows from the assumptions \(h\rightarrow 0\) and \(n/N\rightarrow \pi \), together with the fact that \(\sqrt{Nh}[f_{\text {U}}(y;h)-\mathrm {E}_{\xi }\{f_{\text {U}}(y;h)\}]\) converges to \(\mathcal {N}(0,f(y)d_K)\) in the model space [note that \(f_{\text {U}}\) is the standard KDE based on an iid sample consisting of the entire finite population; see Parzen (1962, p. 1069)]. Moreover, from (A.39), \([\hat{f}_{\text {par}}(y;h)-f_{\text {U}}(y;h)]\Gamma ^{-1/2}_{\mathcal {D}}\) converges to the \(\mathcal {N}(0,1)\) distribution in the design space, where \(\Gamma _{\mathcal {D}}\) is defined in (A.34). Using (A.34) and (A.41), we write \(\Gamma _{\mathcal {D}}=(1-n/N)[N/(N-1)]\sigma ^2_{\mathcal {D}}/nh^2\). Using results from the proof of Theorem 2.2, see Eqs. (A.24) and (A.28), it can be shown that

$$\begin{aligned} \mathrm {E}_{\xi }\left[ \frac{1}{nh^2}\sigma ^2_{\mathcal {D}}\right]= & {} \frac{1}{nh^3}\left[ \beta ^2\mu _{_{X^2|y}}-2\beta \mu _{_{X|y}}y+y^2\right] f(y)d_{K'}+o\left( \frac{1}{Nh^3}\right) . \end{aligned}$$

Thus,

$$\begin{aligned} \mathrm {E}_{\xi }(\Gamma _{\mathcal {D}})= & {} \left( 1-\frac{n}{N}\right) \frac{N}{(N-1)}\frac{1}{nh^3}\left[ \beta ^2\mu _{_{X^2|y}}-2\beta \mu _{_{X|y}}y+y^2\right] f(y)d_{K'}+o\left( \frac{1}{Nh^3}\right) \\= & {} \frac{N}{(N-1)}\mathrm {Var}_C\{\hat{f}_{\text {par}}(y;h)\}. \end{aligned}$$

Applying Theorem 5.1 of Bleuer and Kratina (2005), followed by an application of Slutsky’s theorem, completes the proof. \(\square \)


Cite this article

Mostafa, S.A., Ahmad, I.A. Kernel density estimation from complex surveys in the presence of complete auxiliary information. Metrika 82, 295–338 (2019). https://doi.org/10.1007/s00184-018-0703-y
