Nonlinear wavelet density estimation with data missing at random when covariates are present

Metrika

Abstract

In this paper, we construct the nonlinear wavelet estimator of a density with data missing at random when covariables are present, and provide an asymptotic expression for the mean integrated squared error (MISE) of the estimator. Unlike for kernel estimators, the MISE expression of the wavelet-based estimator still holds when the density function is piecewise smooth. Also, the asymptotic normality of the estimator is established.

References

  • Cai JJ, Liang HY (2011) Nonlinear wavelet density estimation with truncated and dependent observations. Int J Wavelets Multiresolut Inf Process 9(4):587–609

  • Donoho DL, Johnstone IM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81:425–455

  • Donoho DL, Johnstone IM (1995) Adapting to unknown smoothness via wavelet shrinkage. J Am Stat Assoc 90:1200–1224

  • Donoho DL, Johnstone IM, Kerkyacharian G, Picard D (1995) Wavelet shrinkage-asymptopia? (with discussion). J R Stat Soc B 57:301–369

  • Donoho DL, Johnstone IM, Kerkyacharian G, Picard D (1996) Density estimation by wavelet thresholding. Ann Stat 24:508–539

  • Hall P, Patil P (1995) Formulae for mean integrated squared error of non-linear wavelet-based density estimators. Ann Stat 23:905–928

  • Härdle W, Kerkyacharian G, Picard D, Tsybakov A (1998) Wavelets, approximation and statistical applications. Lecture Notes in Statistics, vol 129. Springer, New York

  • Li L (2003) Non-linear wavelet-based density estimators under random censorship. J Stat Plan Inference 117(1):35–58

  • Liang HY, de Uña-Álvarez J (2011) Wavelet estimation of conditional density with truncated, censored and dependent data. J Multivar Anal 102:448–467

  • Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, New York

  • Petrov VV (1995) Limit theorems of probability theory. Oxford University Press Inc., New York

  • Serfling RJ (1980) Approximation theorems of mathematical statistics. Wiley, New York

  • Sinha SK (2012) Robust analysis of longitudinal data with nonignorable missing responses. Metrika 75(7):913–938

  • Wang Q (2008) Probability density estimation with data missing at random when covariables are present. J Stat Plan Inference 138:568–587

  • Wang Q, Linton O, Härdle W (2004) Semiparametric regression analysis with missing response at random. J Am Stat Assoc 99:334–345

  • Wang Q, Rao J (2002) Empirical likelihood-based inference under imputation for missing response data. Ann Stat 30:896–924

  • Wang D, Chen S (2009) Empirical likelihood for estimating equation with missing values. Ann Stat 37:490–517

  • Wang Q, Qin Y (2010) Empirical likelihood confidence bands for distribution functions with missing responses. J Stat Plan Inference 140:2778–2789

  • Xu W, Guo X (2013) Nonparametric checks for varying coefficient models with missing response at random. Metrika 76(4):459–482

  • Xu W, Zhu L (2013) Testing the adequacy of varying coefficient models with missing responses at random. Metrika 76(1):53–69

  • Yu W, Niu C, Xu W (2014) An empirical likelihood inference for the coefficient difference of a two-sample linear model with missing response data. Metrika 77(5):675–693

  • Zou YY, Liang HY (2014) Global \(L_2\) error of wavelet density estimator with truncated and strong mixing observations. Int J Wavelets Multiresolut Inf Process 12(3):1450033, 1–12

Acknowledgments

The authors greatly appreciate the constructive comments and suggestions of the Editor and referee. This research was supported by the National Natural Science Foundation of China (11271286, 11426153) and the Specialized Research Fund for the Doctor Program of Higher Education of China (20120072110007).

Author information

Corresponding author

Correspondence to Han-Ying Liang.

Appendix

In this appendix, we collect and prove the lemmas used in Sect. 4.

Lemma 5.1

(Serfling 1980, Lemma A, page 95) Let \(\{Z_i\}\) be independent random variables satisfying \(P(|Z_i|\le M)=1, i\ge 1.\) Then, for any \(t>0,\)

$$\begin{aligned} P\Big (\Big |\sum _{i=1}^n(Z_i-EZ_i)\Big |>t\Big )\le 2\exp \Bigg \{-\frac{t^2}{2\left( \sum _{j=1}^n\mathrm{Var}Z_j+Mt/3\right) }\Bigg \}. \end{aligned}$$
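As a numerical sanity check of Lemma 5.1 (not part of the original argument), the sketch below compares the right-hand side of the inequality with a Monte Carlo tail frequency. The choice \(Z_i\sim \) Uniform\((-1,1)\) (so \(M=1\) and \(\mathrm{Var}Z_i=1/3\)), the values of \(n\) and \(t\), and all function names are illustrative assumptions.

```python
import math
import random

def bernstein_bound(t, var_sum, M):
    """Right-hand side of Lemma 5.1: 2 * exp(-t^2 / (2 * (var_sum + M * t / 3)))."""
    return 2.0 * math.exp(-t * t / (2.0 * (var_sum + M * t / 3.0)))

def empirical_tail(n, t, reps, rng):
    """Monte Carlo estimate of P(|sum_i (Z_i - E Z_i)| > t) for Z_i ~ Uniform(-1, 1)."""
    hits = 0
    for _ in range(reps):
        s = sum(rng.uniform(-1.0, 1.0) for _ in range(n))  # E Z_i = 0, so no centering needed
        if abs(s) > t:
            hits += 1
    return hits / reps

rng = random.Random(0)   # fixed seed for reproducibility
n, M = 200, 1.0          # |Z_i| <= M = 1 almost surely
var_sum = n / 3.0        # sum of variances; Var of Uniform(-1, 1) is 1/3
t = 20.0
bound = bernstein_bound(t, var_sum, M)
freq = empirical_tail(n, t, reps=2000, rng=rng)
print(f"empirical tail = {freq:.4f}, Bernstein bound = {bound:.4f}")
```

The bound is not tight for moderate \(t\); what matters in the proofs below is only its exponential decay in \(t^2\).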

Lemma 5.2

(Petrov 1995, Theorem 2.9, page 59) Let \(W_1,\ldots ,W_n\) be independent random variables with \(EW_k=0\) \((k=1,2,\ldots ,n)\), and let \(p\ge 2\). Then

$$\begin{aligned} E\Big (\Big |\sum _{i=1}^nW_i\Big |^p\Big )\le C(p)\Big \{\sum ^n_{k=1}E(|W_k|^p)+\Big (\sum ^n_{k=1}EW_k^2\Big )^{p/2}\Big \}, \end{aligned}$$

where \(C(p)\) is a positive constant depending only on \(p\).

Lemma 5.3

Set \(V_n(\mathbf {X}_i)=\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1(\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}})\) and \(V(\mathbf {X}_i)=g(\mathbf {X}_i)\Delta (\mathbf {X}_i).\) Suppose (A1)–(A3) hold. Then \(\max _{1\le i\le n}|V_n(\mathbf {X}_i)-V(\mathbf {X}_i)|=O(\gamma _{1n})\) a.s., where \(\gamma _{1n}=\big (\frac{\log n}{nh_{1n}^d}\big )^{1/2}+h_{1n}^r\rightarrow 0.\)

Proof

Let \(E_{\mathbf {X}_i}\) and \(P_{\mathbf {X}_i}\) denote the conditional expectation and conditional probability given \(\mathbf {X}_i\). Then

$$\begin{aligned} V_n(\mathbf {X}_i)-V(\mathbf {X}_i) =&\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )-g(\mathbf {X}_i)\Delta (\mathbf {X}_i)\\ =&\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big ) -E_{\mathbf {X}_i}\Big [\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\Big ]\\&+E_{\mathbf {X}_i}\Big [\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\Big ]-g(\mathbf {X}_i)\Delta (\mathbf {X}_i). \end{aligned}$$

Let \(\tilde{\gamma }_{1n}=(\log n/(nh_{1n}^d))^{1/2}\). For some large \(C_2>0\) we have

$$\begin{aligned}&P\Bigg (\max _{1\le i\le n}\Big |\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg ) -E_{\mathbf {X}_i}\Bigg [\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )\Bigg ]\Big |>C_2\tilde{\gamma }_{1n}\Bigg )\nonumber \\&\le \sum _{i=1}^nE\Bigg \{P_{\mathbf {X}_i}\Bigg (\Big |\sum _{j=1}^n\Big \{\delta _jK_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )-E_{\mathbf {X}_i}\Bigg [\delta _j K_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )\Bigg ]\Bigg \}\Big | >C_2nh_{1n}^d\tilde{\gamma }_{1n}\Bigg )\Bigg \}. \end{aligned}$$
(5.1)

Applying Lemma 5.1, it follows that

$$\begin{aligned}&P_{\mathbf {X}_i}\Bigg (\Big |\sum _{j=1}^n\Bigg \{\delta _jK_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )-E_{\mathbf {X}_i}\Bigg [\delta _j K_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )\Bigg ]\Bigg \}\Big |>C_2nh_{1n}^d\tilde{\gamma }_{1n}\Bigg )\\&\quad \le 2\exp \Bigg \{-\frac{(C_2nh_{1n}^d\tilde{\gamma }_{1n})^2/2}{\sum _{j=1}^n E_{\mathbf {X}_i}\{\delta _jK_1(\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}})\}^2+MC_2nh_{1n}^d\tilde{\gamma }_{1n}/3}\Bigg \}\\&\quad =2\exp \Bigg \{-\frac{C_2^2nh_{1n}^d\tilde{\gamma }_{1n}^2/2}{\Delta (\mathbf {X}_i)g(\mathbf {X}_i)\int K_1^2(\mathbf {t})d\mathbf {t}+O(h_{1n})+MC_2\tilde{\gamma }_{1n}/3}\Bigg \}\\&\quad \le 2\exp \bigg \{-CC_2\log n\bigg \}\le Cn^{-3}, \end{aligned}$$

which bounds the probability in (5.1) by \(Cn^{-2}\). Since this bound is summable in \(n\), the Borel–Cantelli lemma yields

$$\begin{aligned} \max _{1\le i\le n}\Big |\frac{1}{nh_{1n}^d}\sum _{j=1}^n\Bigg \{\delta _jK_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )-E_{\mathbf {X}_i}\Bigg [\delta _jK_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )\Bigg ]\Bigg \}\Big | =O(\tilde{\gamma }_{1n})~~a.s. \end{aligned}$$

By applying a Taylor expansion, we find

$$\begin{aligned}&\max _{1\le i\le n}\Big |E_{\mathbf {X}_i}\Bigg [\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )\Bigg ]-g(\mathbf {X}_i)\Delta (\mathbf {X}_i)\Big |\\&\quad =\max _{1\le i\le n}\Big |\frac{1}{nh_{1n}^d}\sum _{j=1}^nE\Bigg \{K_1\Bigg (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Bigg )\Delta (\mathbf {X}_j)\Big |\mathbf {X}_i\Bigg \}-g(\mathbf {X}_i)\Delta (\mathbf {X}_i)\Big |\\&\quad =\max _{1\le i\le n}\Big |\int K_1(\mathbf {t})[\Delta (\mathbf {X}_i-h_{1n}\mathbf {t})g(\mathbf {X}_i-h_{1n}\mathbf {t})-g(\mathbf {X}_i)\Delta (\mathbf {X}_i)]d\mathbf {t}\Big |\\&\quad =O(h_{1n}^r)~~a.s. \end{aligned}$$

From the proof above, it follows that \(\max _{1\le i\le n}|V_n(\mathbf {X}_i)-V(\mathbf {X}_i)|=O(\gamma _{1n})~a.s.\) \(\square \)
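The statement of Lemma 5.3 can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the authors' setting: a scalar covariate (\(d=1\)) with \(g\) the Uniform\((0,1)\) density, a hypothetical selection probability \(\Delta (x)=0.5+0.4x\), the Epanechnikov kernel as \(K_1\) (so \(r=2\)), and evaluation restricted to interior points because the sketch does not correct for boundary bias.

```python
import math
import random

def epanechnikov(u):
    """Second-order kernel K_1(u) = 0.75 * (1 - u^2) on [-1, 1] (assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def delta_true(x):
    """Hypothetical selection probability Delta(x) = P(delta = 1 | X = x)."""
    return 0.5 + 0.4 * x

def v_n(x, xs, ds, h):
    """V_n(x) = (n h)^{-1} sum_j delta_j K_1((x - X_j) / h), with d = 1."""
    return sum(d * epanechnikov((x - xj) / h) for xj, d in zip(xs, ds)) / (len(xs) * h)

rng = random.Random(1)   # fixed seed for reproducibility
n, h = 1500, 0.15
xs = [rng.random() for _ in range(n)]                         # X ~ Uniform(0, 1), so g = 1
ds = [1 if rng.random() < delta_true(x) else 0 for x in xs]   # missingness indicators

# Interior points only: near 0 and 1 the kernel window is truncated.
interior = [x for x in xs if h <= x <= 1.0 - h]
# V(x) = g(x) * Delta(x) = Delta(x) because g = 1 on (0, 1).
max_dev = max(abs(v_n(x, xs, ds, h) - delta_true(x)) for x in interior)
gamma_1n = math.sqrt(math.log(n) / (n * h)) + h ** 2          # rate of Lemma 5.3 with r = 2
print(f"max deviation = {max_dev:.3f}, gamma_1n = {gamma_1n:.3f}")
```

The printed deviation should be of the same order as \(\gamma _{1n}\); the lemma asserts only the order, so the constant is not informative.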

Lemma 5.4

Suppose (A1)–(A3) hold. Then \( \max _{1\le i\le n}|\Delta _n(\mathbf {X}_i)-\Delta (\mathbf {X}_i)|=O(\gamma _{2n})~a.s., \) where \(\gamma _{2n}=(\log n/(nh_{2n}^d))^{1/2}+h_{2n}^r\rightarrow 0.\)

Proof

The result follows by the same argument as in the proof of Lemma 5.3. \(\square \)

Lemma 5.5

Let \(\widehat{b}_l\) and \(\widehat{b}_{kl}\) be defined in Section 2, and let \(\gamma _{1n}\) and \(\gamma _{2n}\) be given in Lemmas 5.3 and 5.4. Under assumptions (A1) and (A2), we have

$$\begin{aligned} \widehat{b}_l=&\frac{1}{n}\sum _{i=1}^n\frac{\delta _i\phi _l(Y_i)}{\Delta (\mathbf {X}_i)} +\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\\&+O(\gamma _{2n})\frac{1}{n}\sum _{i=1}^n\frac{\delta _i|\phi _l(Y_i)|}{\Delta (\mathbf {X}_i)}\\&+O(\gamma _{1n})\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{|\Delta (\mathbf {X}_i)-\delta _i|}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\\&+O(\gamma _{2n})\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\\&+O(\gamma _{1n})O(\gamma _{2n})\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\\ :=&\tilde{b}_l+B_{1l}+B_{2l}+B_{3l}+B_{4l}+B_{5l} ~~~ a.s.,\\ \widehat{b}_{kl}=&\frac{1}{n}\sum _{i=1}^n\frac{\delta _i\psi _{kl}(Y_i)}{\Delta (\mathbf {X}_i)} +\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\psi _{kl}(Y_j)\\&+O(\gamma _{2n})\frac{1}{n}\sum _{i=1}^n\frac{\delta _i|\psi _{kl}(Y_i)|}{\Delta (\mathbf {X}_i)}\\&+O(\gamma _{1n})\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{|\Delta (\mathbf {X}_i)-\delta _i|}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\psi _{kl}(Y_j)\Big |\\&+O(\gamma _{2n})\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\psi _{kl}(Y_j)\Big |\\&+O(\gamma _{1n})O(\gamma _{2n})\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\psi _{kl}(Y_j)\Big |\\ :=&\, \tilde{b}_{kl}+B_{1kl}+B_{2kl}+B_{3kl}+B_{4kl}+B_{5kl} ~~~ a.s. \end{aligned}$$

and \(E\tilde{b}_l=b_l\), \(E\tilde{b}_{kl}=b_{kl}.\)

Proof

We verify only the expression for \(\widehat{b}_l\); the proof for \(\widehat{b}_{kl}\) is similar. Write

$$\begin{aligned} F_{n,W}(y) \!=\!\frac{1}{n}\sum _{i=1}^n\bigg \{\frac{\delta _i}{\Delta _n(\mathbf {X}_i)}I(Y_i\le y) \!+\!\bigg (1\!-\!\frac{\delta _i}{\Delta _n(\mathbf {X}_i)}\bigg )F_n(y|\mathbf {X}_i)\bigg \} :=\,F_{n1}(y)\!+\!F_{n2}(y). \end{aligned}$$

Then \( \widehat{b}_l=\int \phi _l(y)dF_{n,W}(y)=\int \phi _l(y)dF_{n1}(y)+\int \phi _l(y)dF_{n2}(y):=L_1+L_2. \) It is easy to see that \(L_1=\frac{1}{n}\sum _{i=1}^n\frac{\delta _i\phi _l(Y_i)}{\Delta _n(\mathbf {X}_i)}.\)

From \(F_{n2}(y)=\frac{1}{n}\sum _{i=1}^n\big (1-\frac{\delta _i}{\Delta _n(\mathbf {X}_i)}\big )\frac{\sum _{j=1}^n\delta _jI(Y_j\le y)K_1(\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}})}{\sum _{k=1}^n\delta _kK_1(\frac{\mathbf {X}_i-\mathbf {X}_k}{h_{1n}})}\), we have

$$\begin{aligned} L_2=\frac{1}{n}\sum _{i=1}^n\Big (1-\frac{\delta _i}{\Delta _n(\mathbf {X}_i)}\Big ) \frac{\sum _{j=1}^n\delta _jK_1\big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\big )\phi _l(Y_j)}{\sum ^n_{k=1}\delta _kK_1\big (\frac{\mathbf {X}_i-\mathbf {X}_k}{h_{1n}}\big )}. \end{aligned}$$
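The two pieces \(L_1\) and \(L_2\) can be computed directly from data. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper: a scalar covariate, the Epanechnikov kernel for both smoothing steps, a Nadaraya–Watson form for \(\Delta _n\) (the paper defines \(\Delta _n\) in Section 2, which is not reproduced here), the Haar scaling function \(\phi _l(y)=p^{1/2}I(l\le py<l+1)\), and a hypothetical data-generating model in which \(E\phi _l(Y)=0.75\) for \(p=4\), \(l=1\).

```python
import random

def epanechnikov(u):
    """Kernel used for K_1 and for the assumed Delta_n smoother: 0.75 * (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def haar_phi_l(y, p, l):
    """Haar scaling function p^{1/2} * 1{l <= p*y < l + 1} (illustrative wavelet choice)."""
    return p ** 0.5 if l <= p * y < l + 1 else 0.0

def b_hat_l(xs, ys, ds, p, l, h1, h2):
    """L_1 + L_2 for scalar covariates; ys[j] is read only when ds[j] == 1."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        k1 = [epanechnikov((xs[i] - xj) / h1) for xj in xs]
        k2 = [epanechnikov((xs[i] - xj) / h2) for xj in xs]
        # Assumed Nadaraya-Watson form of Delta_n; sum(k2) > 0 because k2[i] = K(0) > 0.
        delta_n = sum(k for k, d in zip(k2, ds) if d) / sum(k2)
        # Ratio inside L_2: kernel-weighted average of phi_l(Y_j) over complete cases.
        den = sum(k for k, d in zip(k1, ds) if d)
        num = sum(k * haar_phi_l(y, p, l) for k, d, y in zip(k1, ds, ys) if d)
        cond_mean = num / den if den > 0.0 else 0.0
        if ds[i]:
            w = 1.0 / delta_n  # delta_n > 0 here: observation i is itself a complete case
            total += w * haar_phi_l(ys[i], p, l) + (1.0 - w) * cond_mean
        else:
            total += cond_mean  # delta_i = 0, so the weight (1 - delta_i / Delta_n) equals 1
    return total / n

rng = random.Random(3)   # fixed seed for reproducibility
n = 800
xs = [rng.random() for _ in range(n)]
ds = [1 if rng.random() < 0.2 + 0.7 * x else 0 for x in xs]             # hypothetical Delta(x)
ys = [0.5 * (x + rng.random()) if d else None for x, d in zip(xs, ds)]  # Y missing when delta = 0
estimate = b_hat_l(xs, ys, ds, p=4, l=1, h1=0.2, h2=0.2)
print(f"b_hat_l = {estimate:.3f} (target 0.75 under this model)")
```

The double loop costs \(O(n^2)\) kernel evaluations, which is acceptable for a sketch; the bandwidths \(h_{1n}=h_{2n}=0.2\) are arbitrary and not tuned.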

Let \(V_n(\mathbf {x})\) and \(V(\mathbf {x})\) be defined as in Lemma 5.3. Using Lemmas 5.3 and 5.4 we obtain

$$\begin{aligned} \widehat{b}_l =&\frac{1}{n}\sum _{i=1}^n\frac{\delta _i\phi _l(Y_i)}{\Delta _n(\mathbf {X}_i)} +\frac{1}{n}\sum _{i=1}^n\Big (1-\frac{\delta _i}{\Delta _n(\mathbf {X}_i)}\Big ) \frac{\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\big )\phi _l(Y_j)}{\frac{1}{nh_{1n}^d}\sum _{j=1}^n\delta _jK_1\big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\big )}\\ =&\frac{1}{n}\sum _{i=1}^n\frac{\delta _i\phi _l(Y_i)}{\Delta (\mathbf {X}_i)} +\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\\&+\frac{1}{n}\sum _{i=1}^n\frac{\delta _i\phi _l(Y_i)}{\Delta (\mathbf {X}_i)} \frac{\Delta (\mathbf {X}_i)-\Delta _n(\mathbf {X}_i)}{\Delta _n(\mathbf {X}_i)}\\&+\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\frac{V(\mathbf {X}_i)-V_n(\mathbf {X}_i)}{V_n(\mathbf {X}_i)} \delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\\&+\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\frac{\Delta _n(\mathbf {X}_i)-\Delta (\mathbf {X}_i)}{\Delta _n(\mathbf {X}_i)} \delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\\&+\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\frac{\Delta _n(\mathbf {X}_i)-\Delta (\mathbf {X}_i)}{\Delta _n(\mathbf {X}_i)}\frac{V(\mathbf {X}_i)-V_n(\mathbf {X}_i)}{V_n(\mathbf {X}_i)} \delta _j\\&\quad K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j) :=\tilde{b}_l+B_{1l}+B_{2l}^{'}+B_{3l}^{'}+B_{4l}^{'}+B_{5l}^{'}, \end{aligned}$$

where

$$\begin{aligned} B_{2l}^{'} \le&\,\frac{\max _{1\le i\le n}|\Delta _n(\mathbf {X}_i)-\Delta (\mathbf {X}_i)|}{\inf _{\mathbf {x}\in D}\Delta (\mathbf {x}) -\max _{1\le i\le n}|\Delta _n(\mathbf {X}_i)-\Delta (\mathbf {X}_i)|}\cdot \frac{1}{n}\sum _{i=1}^n\frac{\delta _i|\phi _l(Y_i)|}{\Delta (\mathbf {X}_i)}\\ =&\frac{O(\gamma _{2n})}{n}\sum _{i=1}^n\frac{\delta _i|\phi _l(Y_i)|}{\Delta (\mathbf {X}_i)}:=B_{2l};\\ B_{3l}^{'} \le&\,\frac{\max _{1\le i\le n}|V_n(\mathbf {X}_i)-V(\mathbf {X}_i)|}{\inf _{\mathbf {x}\in D}V(\mathbf {x})-\max _{1\le i\le n}|V_n(\mathbf {X}_i)-V(\mathbf {X}_i)|}\\&\cdot \frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{|\Delta (\mathbf {X}_i)-\delta _i|}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\\ =&\frac{O(\gamma _{1n})}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{|\Delta (\mathbf {X}_i)-\delta _i|}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |:=B_{3l}. \end{aligned}$$

Similarly \(B_{4l}^{'} =\frac{O(\gamma _{2n})}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\big |K_1\big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\big )\phi _l(Y_j)\big |:=B_{4l}\) and

$$\begin{aligned} B_{5l}^{'} =&\,O(\gamma _{1n})O(\gamma _{2n})\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n\frac{\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |:=B_{5l}. \end{aligned}$$
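For completeness, the decomposition of \(\widehat{b}_l\) into \(\tilde{b}_l+B_{1l}+B_{2l}^{'}+\cdots +B_{5l}^{'}\) can be traced to three elementary identities. They are written here with the argument \(\mathbf {X}_i\) suppressed and are a worked restatement of the algebra, not an additional result:

$$\begin{aligned} \frac{1}{\Delta _n}=&\,\frac{1}{\Delta }+\frac{1}{\Delta }\cdot \frac{\Delta -\Delta _n}{\Delta _n},\qquad \frac{1}{V_n}=\frac{1}{V}+\frac{1}{V}\cdot \frac{V-V_n}{V_n},\\ 1-\frac{\delta _i}{\Delta _n}=&\,\frac{\Delta -\delta _i}{\Delta }+\frac{\delta _i}{\Delta }\cdot \frac{\Delta _n-\Delta }{\Delta _n}. \end{aligned}$$

The first identity applied to \(L_1\) gives \(\tilde{b}_l+B_{2l}^{'}\). Multiplying the second identity by the third gives four products, which are, in order, the weights appearing in \(B_{1l}\), \(B_{3l}^{'}\), \(B_{4l}^{'}\) and \(B_{5l}^{'}\). The bounds on the primed terms then use that \(\Delta _n(\mathbf {X}_i)\ge \inf _{\mathbf {x}\in D}\Delta (\mathbf {x})-\max _{1\le i\le n}|\Delta _n(\mathbf {X}_i)-\Delta (\mathbf {X}_i)|\), and similarly for \(V_n\), with both lower bounds positive for large \(n\) by Lemmas 5.3 and 5.4 whenever the infima are positive.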

In addition, under the MAR assumption, we have

$$\begin{aligned} E\tilde{b}_l =E\Bigg [\frac{1}{n}\sum _{i=1}^n\frac{\delta _i\phi _l(Y_i)}{\Delta (\mathbf {X}_i)}\Bigg ] =E\Bigg \{\frac{\phi _l(Y)}{\Delta (\mathbf {X})}E[\delta |\mathbf {X},Y]\Bigg \}=E\phi _l(Y)=b_l. \end{aligned}$$

\(\square \)
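The identity \(E\tilde{b}_l=b_l\) says that weighting complete cases by \(1/\Delta (\mathbf {X}_i)\) removes the selection bias under MAR. The simulation below is a minimal sketch of this point using the true \(\Delta \) (so it computes \(\tilde{b}_l\), not \(\widehat{b}_l\)). The data-generating model, the selection probability \(\Delta (x)=0.2+0.7x\) and the Haar scaling function are hypothetical choices made so that \(b_l\) has a closed form.

```python
import random

def haar_phi_l(y, p, l):
    """Haar scaling function p^{1/2} * 1{l <= p*y < l + 1} (illustrative wavelet choice)."""
    return p ** 0.5 if l <= p * y < l + 1 else 0.0

def delta_true(x):
    """Hypothetical selection probability Delta(x) = P(delta = 1 | X = x)."""
    return 0.2 + 0.7 * x

rng = random.Random(2)   # fixed seed for reproducibility
n, p, l = 20000, 4, 1
ipw_sum, cc_sum, n_obs = 0.0, 0.0, 0
for _ in range(n):
    x = rng.random()
    y = 0.5 * (x + rng.random())              # Y depends on X; triangular density on [0, 1]
    observed = rng.random() < delta_true(x)   # MAR: missingness depends on X only
    if observed:
        ipw_sum += haar_phi_l(y, p, l) / delta_true(x)
        cc_sum += haar_phi_l(y, p, l)
        n_obs += 1

b_tilde = ipw_sum / n      # (1/n) sum_i delta_i phi_l(Y_i) / Delta(X_i)
b_naive = cc_sum / n_obs   # complete-case average, ignores the missingness mechanism
b_true = 0.75              # E phi_l(Y) = 2 * P(0.25 <= Y < 0.5) = 2 * 0.375
print(f"b_tilde = {b_tilde:.3f}, complete-case = {b_naive:.3f}, b_l = {b_true}")
```

Under this model the complete-case average is biased downward, because small values of \(Y\) go with small values of \(X\), which are observed less often; the weighted average is not.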

Lemma 5.6

Under the assumptions of Theorem 3.1, we have \(EB_{1l}^2=O(n^{-1}p^{-1})\) and \(EB_{1kl}^2=O(n^{-1}p_k^{-1})\), where \(B_{1l}\) and \(B_{1kl}\) are defined as in Lemma 5.5.

Proof

We evaluate only \(EB_{1l}^2\); the evaluation of \(EB_{1kl}^2\) is similar. From Lemma 5.5 we write

$$\begin{aligned} B_{1l}=&\frac{1}{n}\sum _{i=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\frac{1}{nh_{1n}^d}\sum _{j=1}^n \Big \{\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\\&-E\Big [\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\mathbf {X}_i\Big ]\Big \}\\&+\frac{1}{n}\sum _{i=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\frac{1}{nh_{1n}^d}\sum _{j=1}^n E\Big [\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\mathbf {X}_i\Big ]\\ :=&\,T_1+T_2. \end{aligned}$$

Let \(A_i=\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\) and \(B_{ij}=\{\delta _jK_1(\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}})\phi _l(Y_j) -E[\delta _jK_1(\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}})\phi _l(Y_j)|\mathbf {X}_i]\}\). Then using independence of \((\mathbf {X}_i, Y_i, \delta _i)\) for \(i=1,\ldots ,n\), it is easy to see that

$$\begin{aligned} ET_1^2&=\frac{1}{n^4h_{1n}^{2d}}E\Big (\sum _{i=1}^n\sum _{j=1}^nA_iB_{ij}\Big )^2\\&=\frac{1}{n^4h_{1n}^{2d}}\Big (\sum _{i=1}^nE(A_i^2B_{ii}^2)+\sum _{i=1}^n\sum _{j=1}^nE(A_i^2B_{ij}^2)\Big ):=T_{11}+T_{12}. \end{aligned}$$

Under conditions (A1)–(A3), we have

$$\begin{aligned} T_{11}&=\frac{1}{n^4h_{1n}^{2d}}\sum _{i=1}^n E\Big \{\Big (\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\Big )^2[\delta _{i}K_1(0)\phi _l(Y_{i}) -E(\delta _{i}K_1(0)\phi _l(Y_{i})|\mathbf {X}_i)]^2\Big \}\\&\le Cn^{-3}h_{1n}^{-2d}. \end{aligned}$$

Note that

$$\begin{aligned} T_{12} =&\frac{1}{n^4h_{1n}^{2d}}\sum _{i=1}^n\sum _{j=1}^nE\Big \{\Big (\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)}\Big )^2 \Big [\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\\&{\quad }-E\Big (\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\mathbf {X}_i\Big )\Big ]^2\Big \}\\ \le&\frac{1}{n^4h_{1n}^{2d}}\sum _{i=1}^n\sum _{j=1}^nE\Big \{\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j) -E\Big [\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\mathbf {X}_i\Big ]\Big \}^2\\ \le&\,\frac{2}{n^4h_{1n}^{2d}}\sum _{i=1}^n\sum _{j=1}^n\Big \{E\Big [\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big ]^2\\&{\quad } +E\Big [E\Big (\delta _jK_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\mathbf {X}_i\Big )\Big ]^2\Big \}\\ :=&\,T_{121}+T_{122}. \end{aligned}$$

Since \(\phi \) has compact support, under conditions (A2), (A3) and (A5) we have

$$\begin{aligned} T_{121} =&\,\frac{2}{n^4h_{1n}^{2d}}\sum _{i=1}^n\sum _{j=1}^nE\Big \{E\Big [\delta _jK_1^2\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l^2(Y_j)\Big |\mathbf {X}_i,\mathbf {X}_j,Y_j\Big ]\Big \}\\ =&\,\frac{2}{n^4h_{1n}^{2d}}\sum _{i=1}^n\sum _{j=1}^nE\Big \{K_1^2\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l^2(Y_j)\Delta (\mathbf {X}_j)\Big \}\\ =&\,\frac{2}{n^4h_{1n}^{2d}}\sum _{i=1}^n\sum _{j=1}^nE\Big \{E\Big [K_1^2\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l^2(Y_j)\Delta (\mathbf {X}_j)\Big |\mathbf {X}_i\Big ]\Big \}\\ =&\,\frac{2}{n^3h_{1n}^{2d}}\sum _{i=1}^nE\Big \{\int \int K_1^2\Big (\frac{\mathbf {X}_i-\mathbf {x}}{h_{1n}}\Big )p\phi ^2(py-l)\Delta (\mathbf {x})\xi (\mathbf {x},y)d\mathbf {x}dy\Big \}\\ =&\,\frac{2}{n^2h_{1n}^d}E\Big \{\int \int K_1^2(\mathbf {t})\phi ^2(u)\Delta (\mathbf {X}_i-h_{1n}\mathbf {t})\xi \Big (\mathbf {X}_i-h_{1n}\mathbf {t},\frac{u+l}{p}\Big )d\mathbf {t}du\Big \}\\ \le&\, Cn^{-2}h_{1n}^{-d}. \end{aligned}$$

Similarly \(T_{122}\le Cn^{-2}p^{-1}.\)
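The step from the fourth to the fifth line in the display for \(T_{121}\) is a change of variables. Assuming, as the factor \(p\phi ^2(py-l)\) in that display indicates, that \(\phi _l(y)=p^{1/2}\phi (py-l)\), substitute \(\mathbf {t}=(\mathbf {X}_i-\mathbf {x})/h_{1n}\) and \(u=py-l\):

$$\begin{aligned} d\mathbf {x}=&\,h_{1n}^d\,d\mathbf {t},\qquad dy=p^{-1}du,\qquad p\phi ^2(py-l)\,d\mathbf {x}\,dy=h_{1n}^d\,\phi ^2(u)\,d\mathbf {t}\,du,\\ \frac{2}{n^3h_{1n}^{2d}}\cdot n\cdot h_{1n}^d=&\,\frac{2}{n^2h_{1n}^d}. \end{aligned}$$

The factor \(n\) comes from the \(n\) identically distributed summands over \(i\), and the factors \(p\) and \(p^{-1}\) cancel, which is why no power of \(p\) appears in the bound for \(T_{121}\).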

For \(T_2\) we have

$$\begin{aligned} T_2=&\,\frac{1}{n}\sum _{i=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} h_{1n}^{-d}E\Big \{E\Big [\delta K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}}{h_{1n}}\Big )\phi _l(Y)\Big |\mathbf {X}_i,\mathbf {X},Y\Big ]\Big |\mathbf {X}_i\Big \}\\ =&\,\frac{1}{n}\sum _{i=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} h_{1n}^{-d}E\Big \{K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}}{h_{1n}}\Big )\phi _l(Y)\Delta (\mathbf {X})\Big |\mathbf {X}_i\Big \}\\ =&\,\frac{1}{np^{1/2}}\sum _{i=1}^n\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \!\int \int K_1(\mathbf {t})\phi (u)\Delta (\mathbf {X}_i\!-\!h_{1n}\mathbf {t})\xi \Big (\mathbf {X}_i-h_{1n}\mathbf {t},\frac{u\!+\!l}{p}\Big )d\mathbf {t}du. \end{aligned}$$

It is easy to see that under the MAR assumption

$$\begin{aligned} E\Bigg \{\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \int \int K_1(\mathbf {t})\phi (u)\Delta (\mathbf {X}_i-h_{1n}\mathbf {t})\xi \Big (\mathbf {X}_i-h_{1n}\mathbf {t},\frac{u+l}{p}\Big )d\mathbf {t}du\Bigg \}=0. \end{aligned}$$

Note that \(\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \int \int K_1(\mathbf {t})\phi (u)\Delta (\mathbf {X}_i-h_{1n}\mathbf {t})\xi (\mathbf {X}_i-h_{1n}\mathbf {t},\frac{u+l}{p})d\mathbf {t}du\) \((1\le i\le n)\) are independent and bounded, so from conditions (A2), (A3) and (A5), we have

$$\begin{aligned} ET_2^2&=\frac{1}{n^2p}\sum _{i=1}^nE\Big \{\frac{\Delta (\mathbf {X}_i)-\delta _i}{\Delta (\mathbf {X}_i)V(\mathbf {X}_i)} \int \int K_1(\mathbf {t})\phi (u)\Delta (\mathbf {X}_i-h_{1n}\mathbf {t})\\&\qquad \xi \Big (\mathbf {X}_i-h_{1n}\mathbf {t},\frac{u+l}{p}\Big )d\mathbf {t}du\Big \}^2 \le Cn^{-1}p^{-1}. \end{aligned}$$

Since \(nh_{1n}^dp^{-1}\rightarrow \infty ,\) combining the bounds above gives \(EB_{1l}^2=O(n^{-1}p^{-1})\). \(\square \)
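To make the final step explicit, each bound obtained for the pieces of \(T_1\) can be compared with the target rate \(n^{-1}p^{-1}\). The following worked comparison uses only the bounds derived in the proof, together with \(nh_{1n}^d\rightarrow \infty \), which is implied by \(\gamma _{1n}\rightarrow 0\) in Lemma 5.3:

$$\begin{aligned} \frac{T_{11}}{n^{-1}p^{-1}}\le&\,C\,\frac{p}{nh_{1n}^d}\cdot \frac{1}{nh_{1n}^d}\rightarrow 0,\qquad \frac{T_{121}}{n^{-1}p^{-1}}\le C\,\frac{p}{nh_{1n}^d}\rightarrow 0,\qquad \frac{T_{122}}{n^{-1}p^{-1}}\le \frac{C}{n}\rightarrow 0,\\ EB_{1l}^2\le&\,2ET_1^2+2ET_2^2=o(n^{-1}p^{-1})+O(n^{-1}p^{-1})=O(n^{-1}p^{-1}). \end{aligned}$$

Thus \(T_2\), the term in which the inner sum has been replaced by its conditional expectation, determines the rate.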

Lemma 5.7

Under the assumptions of Theorem 3.1, we have

$$\begin{aligned} E\left\{ \frac{1}{n^2h_{1n}^d}\sum _{i=1}^n \sum _{j=1}^n\delta _j\left| K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\right| \right\} ^2=O(p^{-1}) \end{aligned}$$

and \(E\Big \{\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n \sum _{j=1}^n\delta _j\big |K_1\big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\big )\psi _{kl}(Y_j)\big |\Big \}^2=O(p_k^{-1}).\)

Proof

Write

$$\begin{aligned}&\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n \sum _{j=1}^n\delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\\ =&\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n \Bigg \{\delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big | -E\Big [\delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\Big |\mathbf {X}_i\Big ]\Bigg \}\\&+\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^nE\Big [\delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\Big |\mathbf {X}_i\Big ]\\ :=&P_1+P_2. \end{aligned}$$

Following the same arguments as for \(T_1\) in the proof of Lemma 5.6, we have \(EP_1^2=O(n^{-2}h_{1n}^{-d}).\)

As for \(P_2\), under conditions (A1)–(A4) we have

$$\begin{aligned} P_2=&\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n E\Big [\delta _j\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\Big |\mathbf {X}_i\Big ]\\ =&\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n E\Big \{\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |E[\delta _j|\mathbf {X}_i,\mathbf {X}_j,Y_j]\Big |\mathbf {X}_i\Big \}\\ =&\frac{1}{n^2h_{1n}^d}\sum _{i=1}^n\sum _{j=1}^n E\Big \{\Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {X}_j}{h_{1n}}\Big )\phi _l(Y_j)\Big |\Delta (\mathbf {X}_j)\Big |\mathbf {X}_i\Big \}\\ =&\frac{1}{nh_{1n}^d}\sum _{i=1}^n \int \int \Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {x}}{h_{1n}}\Big )\phi _l(y)\Big |\Delta (\mathbf {x})\xi (\mathbf {x},y)d\mathbf {x}dy. \end{aligned}$$

Therefore, we can get

$$\begin{aligned} EP_2^2=&E\Big \{\frac{1}{nh_{1n}^d}\sum _{i=1}^n\int \int \Big |K_1\Big (\frac{\mathbf {X}_i-\mathbf {x}}{h_{1n}}\Big )\phi _l(y)\Big |\Delta (\mathbf {x})\xi (\mathbf {x},y)d\mathbf {x}dy\Big \}^2\\ =&E\Big \{\frac{1}{n}\sum _{i=1}^np^{-1/2}\int \int |K_1(\mathbf {t})\phi (u)|\Delta (\mathbf {X}_i-h_{1n}\mathbf {t})\xi \Big (\mathbf {X}_i-h_{1n}\mathbf {t},\frac{u+l}{p}\Big )d\mathbf {t}du\Big \}^2\\ \le&\,\frac{n}{n^2}\sum _{i=1}^np^{-1}E\Big \{\int \int |K_1(\mathbf {t})\phi (u)|\Delta (\mathbf {X}_i-h_{1n}\mathbf {t})\xi \Big (\mathbf {X}_i-h_{1n}\mathbf {t},\frac{u+l}{p}\Big )d\mathbf {t}du\Big \}^2\\ =&\,p^{-1}\int \Big \{\int \int |K_1(\mathbf {t})\phi (u)|\Delta (\mathbf {x}-h_{1n}\mathbf {t})\xi \Big (\mathbf {x}-h_{1n}\mathbf {t},\frac{u+l}{p}\Big )d\mathbf {t}du\Big \}^2g(\mathbf {x})d\mathbf {x}\\ =&\,O(p^{-1}). \end{aligned}$$

Since \(nh_{1n}^dp^{-1}\rightarrow \infty \), we get the result. \(\square \)

Cite this article

Zou, YY., Liang, HY. & Zhang, JJ. Nonlinear wavelet density estimation with data missing at random when covariates are present. Metrika 78, 967–995 (2015). https://doi.org/10.1007/s00184-015-0536-x
