Dimension reduction for kernel-assisted M-estimators with missing response at random

Annals of the Institute of Statistical Mathematics

Abstract

To obtain M-estimators of a response variable when the response is missing at random, we can construct three bias-corrected nonparametric estimating equations based on the inverse probability weighting, mean imputation, and augmented inverse probability weighting approaches. However, when the covariates are of moderate or high dimension, estimation efficiency suffers from the curse of dimensionality. To address this issue, we propose a two-stage estimation procedure that combines dimension-reduced kernel estimators with the bias-corrected estimating equations. We show that the resulting three kernel-assisted estimating equations yield asymptotically equivalent M-estimators with desirable asymptotic properties. The finite-sample performance of the proposed estimators of the response mean, distribution function, and quantiles is studied through simulation, and an application to an HIV CD4 data set is also presented.

References

  • Andrews, D. W. (1995). Nonparametric kernel estimation for semiparametric models. Econometric Theory, 11, 560–586.

  • Chen, X., Wan, A. T., Zhou, Y. (2015). Efficient quantile regression analysis with missing observations. Journal of the American Statistical Association, 110, 723–741.

  • Cheng, P. E. (1994). Nonparametric estimation of mean functionals with data missing at random. Journal of the American Statistical Association, 89, 81–87.

  • Cook, R. D. (1994). On the interpretation of regression plots. Journal of the American Statistical Association, 89, 177–189.

  • Cook, R. D., Weisberg, S. (1991). Discussion of “Sliced inverse regression for dimension reduction”. Journal of the American Statistical Association, 86, 28–33.

  • Deng, J., Wang, Q. (2017). Dimension reduction estimation for probability density with data missing at random when covariables are present. Journal of Statistical Planning and Inference, 181, 11–29.

  • Ding, X., Wang, Q. (2011). Fusion-refinement procedure for dimension reduction with missing response at random. Journal of the American Statistical Association, 106, 1193–1207.

  • Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundacker, H., Schooley, R. T., Haubrich, R. H., et al. (1996). A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. The New England Journal of Medicine, 335, 1081–1089.

  • Hu, Z., Follmann, D. A., Wang, N. (2014). Estimation of mean response via effective balancing score. Biometrika, 101, 613–624.

  • Huber, P. J. (1981). Robust statistics. New York: Wiley.

  • Ibrahim, J. G., Chen, M. H., Lipsitz, S. R., Herring, A. H. (2005). Missing-data methods for generalized linear models: A comparative review. Journal of the American Statistical Association, 100, 332–346.

  • Kim, J. K., Shao, J. (2013). Statistical methods for handling incomplete data. London: Chapman and Hall/CRC.

  • Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86, 316–327.

  • Li, Y., Wang, Q., Zhu, L., Ding, X. (2017). Mean response estimation with missing response in the presence of high-dimensional covariates. Communications in Statistics-Theory and Methods, 46, 628–643.

  • Ma, Y., Zhu, L. (2012). A semiparametric approach to dimension reduction. Journal of the American Statistical Association, 107, 168–179.

  • Ma, Y., Zhu, L. (2013). A review on dimension reduction. International Statistical Review, 81, 134–150.

  • Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237–249.

  • Qin, J., Lawless, J. (1994). Empirical likelihood and general estimating equations. The Annals of Statistics, 22, 300–325.

  • Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581–592.

  • Serfling, R. J. (1981). Approximation theorems of mathematical statistics. New York: Wiley.

  • Shao, J., Wang, L. (2016). Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika, 103, 175–187.

  • Wang, D., Chen, S. X. (2009). Empirical likelihood for estimating equations with missing values. The Annals of Statistics, 37, 490–517.

  • Wang, L., Rotnitzky, A., Lin, X. (2010). Nonparametric regression with missing outcomes using weighted kernel estimating equations. Journal of the American Statistical Association, 105, 1135–1146.

  • Wang, Q. (2007). M-estimators based on inverse probability weighted estimating equations with response missing at random. Communications in Statistics-Theory and Methods, 36, 1091–1103.

  • Wooldridge, J. M. (2007). Inverse probability weighted estimation for general missing data problems. Journal of Econometrics, 141, 1281–1301.

  • Xia, Y., Tong, H., Li, W. K., Zhu, L. X. (2002). An adaptive estimation of dimension reduction space. Journal of the Royal Statistical Society: Series B, 64, 363–410.

  • Xue, L. (2009). Empirical likelihood confidence intervals for response mean with data missing at random. Scandinavian Journal of Statistics, 36, 671–685.

  • Zhang, B. (1995). M-estimation and quantile estimation in the presence of auxiliary information. Journal of Statistical Planning and Inference, 44, 77–94.

  • Zhu, L. P., Zhu, L. X., Ferre, L., Wang, T. (2010). Sufficient dimension reduction through discretization-expectation estimation. Biometrika, 97, 295–304.

Acknowledgements

We are grateful to the Editor, the Associate Editor and one anonymous referee for their insightful comments and suggestions on this article, which have led to significant improvements. This work was supported by the National Natural Science Foundation of China (11501208) and Fundamental Research Funds for the Central Universities.

Author information

Corresponding author

Correspondence to Lei Wang.

Appendix

  1. (C1)

    The true value \(\theta _0\) is the unique root of \(n^{-1}\sum _{i=1}^ng_l(Y_i,S_i,\delta _i,\theta )=0\), \(n^{-1}\sum _{i=1}^ng_l(Y_i,S_i,\delta _i,\theta )\) is differentiable at \(\theta =\theta _0\) for \(l=1, 2, 3\) with \(\sum _{i=1}^n{\partial g_l(Y_i,S_i,\delta _i,\theta _0)}/{\partial \theta } \ne 0\).

  2. (C2)

    The function \(\varphi (Y,\theta )\) is monotone and continuous in \(\theta \), \(E|\varphi (Y,\theta )|< \infty \), \(\partial \varphi (Y,\theta )/\partial \theta \) is continuous at \(\theta =\theta _0\); \(E|\partial \varphi (Y,\theta _0)/\partial \theta |< \infty \), \( E\{\varphi ^2(Y,\theta )|S\} < \infty \).

  3. (C3)

    The kernel \(K(\cdot )\) is bounded and has compact support, and is of order \(m \ge 2\), i.e., \(\int K(s_1,...,s_{d})ds_1 \cdots ds_{d}=1\), \(\int s_j^tK(s_1,...,s_{d})ds_1\cdots ds_{d}=0\), and \(\int s_j^mK(s_1,...,s_{d})ds_1\cdots ds_{d}\ne 0\) for any \(j =1,...,d\) and \(t=1,..., m-1\).

  4. (C4)

    The function \(\pi (S)\) and the S-density function f(S) have continuous and bounded partial derivatives with respect to S up to order m, and \(\pi (S)\) is bounded away from 0 and 1.

  5. (C5)

    The function \(m_{\varphi }(S, \theta )\) is twice continuously differentiable in a neighborhood of S and has bounded partial derivatives up to order m.

  6. (C6)

    As \(n\rightarrow \infty \), \(nh^{2d}\rightarrow \infty \), \(nh^{d}/\log n\rightarrow \infty \), \(nh^{2m}\rightarrow 0\), and the estimator \(\hat{B}\) obtained by SDR is a root-n consistent estimator of B.
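As a concrete, purely illustrative companion to conditions (C3)–(C6), the following sketch implements a second-order Gaussian kernel and the kernel estimators \(\hat{\pi }(S)\) and \(\hat{m}_{\varphi }(S,\theta )\) on the reduced covariate \(S=BX\) with \(d=1\). The data-generating design, function names, and bandwidth choice are our own assumptions, not taken from the paper.

```python
# Illustrative (hypothetical) kernel machinery for the reduced covariate S = BX.
import numpy as np

rng = np.random.default_rng(0)

def kernel(u):
    # Standard Gaussian kernel: integrates to 1, first moment 0 (order m = 2).
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def pi_hat(s0, S, delta, h):
    # Kernel estimate of the selection probability pi(s0) = P(delta = 1 | S = s0).
    w = kernel((S - s0) / h)
    return np.sum(w * delta) / np.sum(w)

def m_hat(s0, S, Y, delta, h, phi):
    # Complete-case kernel regression of phi(Y) on S at s0,
    # i.e. an estimate of m_phi(s0) = E{phi(Y) | S = s0, delta = 1}.
    w = delta * kernel((S - s0) / h)
    return np.sum(w * phi(Y)) / np.sum(w)

# Toy data: X is 3-dimensional, reduced index S = B^T X with a unit vector B.
n = 2000
X = rng.normal(size=(n, 3))
B = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
S = X @ B
Y = S + rng.normal(scale=0.5, size=n)
delta = rng.binomial(1, 0.7, size=n)       # MAR with constant pi(S) = 0.7

h = n ** (-1 / 5)                           # usual bandwidth rate for d = 1
print(pi_hat(0.0, S, delta, h))             # close to 0.7
print(m_hat(0.0, S, Y, delta, h, lambda y: y))  # close to E(Y | S = 0) = 0
```

Because only \(\hat{S}=\hat{B}X\) enters the kernels, the smoothing is one-dimensional regardless of the dimension of X, which is the point of the two-stage procedure.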

Proof of Theorem 1

For \(g_2(Y_i, \hat{S}_i,\delta _i,\theta )\), note that

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_2(Y_i,\hat{S}_i,\delta _i,\theta ) =&\frac{1}{n}\sum \limits _{i=1}^n\{\delta _i\varphi (Y_i,\theta ) +(1-\delta _i){m}_{\varphi }(S_i, \theta )\}\\&+\frac{1}{n}\sum \limits _{i=1}^n(1-\delta _i)\{{\hat{m}}_{\varphi }(\hat{S}_i,\theta )-m_{\varphi }(S_i, \theta )\}, \end{aligned}$$

where \(S_i=BX_i\) and \(\hat{S}_i=\hat{B}X_i\). Define \(G(S)=f(S)\pi (S)\) and

$$\begin{aligned} \hat{G}_n(S)=\dfrac{1}{n}\sum \limits _{j=1}^{n}\delta _j{K}_h(S_j-S). \end{aligned}$$

Let \(\varDelta _n(\hat{S}_i,S_i)=\hat{G}_n(\hat{S}_i)-G(S_i)\). Then,

$$\begin{aligned} \frac{1}{n}\sum \limits _{i=1}^n(1-\delta _i)\{{\hat{m}}_{\varphi }(\hat{S}_i,\theta )-m_{\varphi }(S_i, \theta )\}=A_{n1}+A_{n2}-A_{n3}, \end{aligned}$$

where

$$\begin{aligned} \begin{array}{llll} A_{n1}=\dfrac{1}{n^2}\sum \limits _{i=1}^n\sum \limits _{j=1}^n (1-\delta _i){K}_h(\hat{S}_j-\hat{S}_i)\dfrac{\delta _j\{\varphi (Y_j,\theta )-m_{\varphi }(S_j, \theta )\}}{ G(S_i)},\\ A_{n2}=\dfrac{1}{n^2}\sum \limits _{i=1}^n\sum \limits _{j=1}^n(1-\delta _i) {K}_h(\hat{S}_j-\hat{S}_i)\dfrac{\delta _j\{m_{\varphi }(S_j, \theta )-m_{\varphi }(S_i, \theta )\}}{G(S_i)},\\ A_{n3}=\dfrac{1}{n}\sum \limits _{i=1}^n(1-\delta _i) \{{\hat{m}}_{\varphi }(\hat{S}_i,\theta )-m_{\varphi }(S_i, \theta )\} \dfrac{\varDelta _n(\hat{S}_i,S_i)}{G(S_i)}. \end{array} \end{aligned}$$

Using the fact \(\delta \varphi (Y, \theta )\perp X|B X\), we can show that

$$\begin{aligned} E[\delta _j\{\varphi (Y_j, \theta )-m_{\varphi }(S_j, \theta )\}|X_j]=0. \end{aligned}$$

As in Wang and Chen (2009), we can prove that

$$\begin{aligned} A_{n1}=\dfrac{1}{n}\sum \limits _{i=1}^n\delta _i\{\pi ^{-1}(S_i)-1\} \{\varphi (Y_i, \theta )-m_{\varphi }(S_i, \theta )\}+o_p(n^{-1/2}), \end{aligned}$$

and \(A_{n2}=o_p(n^{-1/2})\). Using the arguments in Andrews (1995) and the root-n consistency \(\Vert \hat{B}-B\Vert =O_p(n^{-1/2})\), we obtain

$$\begin{aligned} \sup _i\big |\hat{m}_{\varphi }(\hat{S}_i, \theta )-m_{\varphi }(S_i, \theta )\big |=o_p(n^{-1/4}), \end{aligned}$$

such that \(A_{n3}=o_p(n^{-1/2})\). Thus, we have

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_2(Y_i,\hat{S}_i,\delta _i,\theta )=&\frac{1}{n}\sum _{i=1}^n\{\delta _i\varphi (Y_i,\theta ) +(1-\delta _i){m}_{\varphi }({S}_i, \theta )\}\\&+\dfrac{1}{n}\sum \limits _{i=1}^n\delta _i\{\pi ^{-1}(S_i)-1\} \{\varphi (Y_i, \theta )-m_{\varphi }(S_i, \theta )\}\\&+o_p(n^{-1/2}). \end{aligned}$$

This leads to

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_2(Y_i,\hat{S}_i,\delta _i,\theta ) \rightarrow E\varphi (Y,\theta ). \end{aligned}$$

Furthermore, we have

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _{i=1}^n\{g_2(Y_i,\hat{S}_i,\delta _i,\theta )-E\varphi (Y,\theta )\} \rightarrow N(0,V(\theta )^2). \end{aligned}$$

For \(g_1(Y_i, \hat{S}_i,\delta _i,\theta )\), we have

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_1(Y_i,\hat{S}_i,\delta _i,\theta )&=\frac{1}{n}\sum \limits _{i=1}^n\Big [\frac{\delta _i\varphi (Y_i,\theta )}{\pi (S_i)} +\frac{\delta _i\varphi (Y_i,\theta )\{\pi (S_i)- \hat{\pi }(\hat{S}_i)\}}{\pi ^2(S_i)}\Big ]\\&\quad +\frac{1}{n}\sum \limits _{i=1}^n\frac{\delta _i\varphi (Y_i,\theta )\{\pi (S_i)- \hat{\pi }(\hat{S}_i)\}^2}{\pi ^2(S_i)\hat{\pi }(\hat{S}_i)}. \end{aligned}$$

Using arguments similar to those in Wang (2007), we can prove that

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_1(Y_i,\hat{S}_i,\delta _i,\theta ) =\frac{1}{n}\sum \limits _{i=1}^n\Big [\frac{\delta _i\varphi (Y_i,\theta )}{\pi (S_i)}+\big \{1-\frac{\delta _i}{\pi (S_i)}\big \}{m}_{\varphi }(S_i, \theta )\Big ], \end{aligned}$$

which leads to

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_1(Y_i,\hat{S}_i,\delta _i,\theta ) \rightarrow E\varphi (Y,\theta ), \end{aligned}$$

and

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _{i=1}^n\{g_1(Y_i,\hat{S}_i,\delta _i,\theta )-E\varphi (Y,\theta )\} \rightarrow N(0,V(\theta )^2). \end{aligned}$$

For \(g_3(Y_i, \hat{S}_i,\delta _i,\theta )\), it can be seen that

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_3(Y_i,\hat{S}_i,\delta _i,\theta )&=\frac{1}{n}\sum \limits _{i=1}^n\Big [\frac{\delta _i\varphi (Y_i,\theta )}{\pi (S_i)}+\big \{1-\frac{\delta _i}{\pi (S_i)}\big \}{m}_{\varphi }(S_i, \theta )\Big ]\\&\quad +\frac{1}{n}\sum \limits _{i=1}^n\Big [\Big \{\frac{\delta _i}{\hat{\pi }(\hat{S}_i)}-\frac{\delta _i}{{\pi }({S}_i)}\Big \}\Big \{\varphi (Y_i,\theta )-{m}_{\varphi }(S_i, \theta )\Big \}\Big ]\\&\quad +\frac{1}{n}\sum \limits _{i=1}^n\Big [\Big \{1-\frac{\delta _i}{\hat{\pi }(\hat{S}_i)}\Big \}\Big \{\hat{m}_{\varphi }(\hat{S}_i, \theta )-{m}_{\varphi }(S_i, \theta )\Big \}\Big ]. \end{aligned}$$

Using arguments similar to those for \(g_1(Y_i, \hat{S}_i,\delta _i,\theta )\) and \(g_2(Y_i, \hat{S}_i,\delta _i,\theta )\), it can be proved that the last two terms on the right-hand side of the above equation are \(o_p(n^{-1/2})\). The proof is completed. \(\square \)
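To make the three estimating equations compared above tangible, here is a hedged numerical sketch (our own toy construction, not the authors' code) of \(g_1\) (inverse probability weighting), \(g_2\) (mean imputation) and \(g_3\) (augmented IPW) for the response mean, i.e. \(\varphi (Y,\theta )=Y-\theta \), where each root is available in closed form:

```python
# Hypothetical illustration of the three kernel-assisted estimating equations
# for the response mean on a toy MAR data set; all design choices are ours.
import numpy as np

rng = np.random.default_rng(1)

def kern(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

n = 2000
X = rng.normal(size=(n, 3))
B = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
S = X @ B                                      # reduced covariate S = BX
Y = 2.0 + S + rng.normal(scale=0.5, size=n)    # target: E(Y) = 2
pi_true = 1 / (1 + np.exp(-(0.5 + S)))         # MAR: missingness depends on S only
delta = rng.binomial(1, pi_true)

h = n ** (-1 / 5)
W = kern((S[:, None] - S[None, :]) / h)        # W[i, j] = K{(S_j - S_i)/h}
pi_hat = np.clip(W @ delta / W.sum(axis=1), 0.02, 0.99)   # \hat{pi}(S_i)
m_hat = (W @ (delta * Y)) / (W @ delta)        # \hat{m}_Y(S_i) from complete cases

# Roots of (1/n) sum_i g_l(Y_i, S_i, delta_i, theta) = 0 with phi = Y - theta:
theta1 = np.sum(delta * Y / pi_hat) / np.sum(delta / pi_hat)         # g_1, IPW
theta2 = np.mean(delta * Y + (1 - delta) * m_hat)                    # g_2, imputation
theta3 = np.mean(delta * Y / pi_hat + (1 - delta / pi_hat) * m_hat)  # g_3, AIPW
print(theta1, theta2, theta3)   # all three close to E(Y) = 2
print(np.mean(Y[delta == 1]))   # naive complete-case mean is biased upward
```

The near-agreement of the three roots on simulated data mirrors the asymptotic equivalence established in Theorem 1; the complete-case mean, by contrast, is inconsistent here because missingness depends on S.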

Proof of Theorem 2

By Taylor expansion, there exists \({\theta }_l^*\) between \(\hat{\theta }_l\) and \({\theta }_0\), \(l=1,2,3,\) such that

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^ng_l(Y_i,\hat{S}_i,\delta _i,\hat{\theta }_l) =\frac{1}{n}\sum _{i=1}^ng_l(Y_i,\hat{S}_i,\delta _i,{\theta }_0)+\frac{1}{n}\sum _{i=1}^n \frac{\partial g_l(Y_i,\hat{S}_i,\delta _i,{\theta }_l^*)}{\partial \theta } (\hat{\theta }_l-{\theta }_0). \end{aligned}$$

Since \(\sum _{i=1}^ng_l(Y_i,\hat{S}_i,\delta _i,\hat{\theta }_l)=0\), we have

$$\begin{aligned} \sqrt{n}(\hat{\theta }_l-{\theta }_0)=-\sqrt{n}\Big \{\frac{1}{n}\sum _{i=1}^n \frac{\partial g_l(Y_i,\hat{S}_i,\delta _i,{\theta }_l^*)}{\partial \theta }\Big \}^{-1}\frac{1}{n}\sum _{i=1}^ng_l(Y_i,\hat{S}_i,\delta _i,{\theta }_0). \end{aligned}$$

Similar to Theorem 1, as \(n \rightarrow \infty \), it can be proved that

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n \frac{\partial g_l(Y_i,\hat{S}_i,\delta _i,{\theta }_l^*)}{\partial \theta } \rightarrow E\Big \{\frac{\partial \varphi (Y,\theta _0)}{\partial \theta }\Big \}. \end{aligned}$$

The proof is completed. \(\square \)
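The Taylor-expansion argument above is, in effect, the justification for a Newton iteration on the estimating equation. The following sketch (an illustration under our own assumptions: Huber score, MCAR missingness, imputation-type equation \(g_2\)) solves \(n^{-1}\sum _i g_2 = 0\) by repeating the step \(\theta \leftarrow \theta - \{n^{-1}\sum _i \partial g_2/\partial \theta \}^{-1}\, n^{-1}\sum _i g_2\):

```python
# Hypothetical Newton iteration implied by the Taylor expansion in Theorem 2.
import numpy as np

rng = np.random.default_rng(2)
c = 1.345                                    # usual Huber tuning constant

def psi(u):
    return np.clip(u, -c, c)                 # Huber psi function

def dpsi(u):
    return (np.abs(u) <= c).astype(float)    # its a.e. derivative

n = 5000
Y = rng.normal(loc=1.0, size=n)              # symmetric about theta_0 = 1
delta = rng.binomial(1, 0.8, size=n)         # MCAR for simplicity, pi = 0.8
# With pi free of S, m_phi(theta) reduces to E{psi(Y - theta) | delta = 1}:
m_phi = lambda th: np.mean(psi(Y[delta == 1] - th))

def g_bar(th):      # (1/n) sum_i g_2(Y_i, delta_i, theta)
    return np.mean(delta * psi(Y - th) + (1 - delta) * m_phi(th))

def dg_bar(th):     # (1/n) sum_i dg_2/dtheta
    return -np.mean(delta * dpsi(Y - th)
                    + (1 - delta) * np.mean(dpsi(Y[delta == 1] - th)))

theta = 0.0                                   # crude starting value
for _ in range(20):                           # Newton steps from the expansion
    theta -= g_bar(theta) / dg_bar(theta)
print(theta)                                  # close to theta_0 = 1
```

The slope term in the loop is the empirical counterpart of \(E\{\partial \varphi (Y,\theta _0)/\partial \theta \}\) appearing in the limit above, which is why Theorem 2 requires it to be nonzero.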

Lemma 1

Assume that \(P(\delta = 1|X)>0\) and \(P(Y=0|X)=0\). For any given y, it can be verified that

$$\begin{aligned} \mathcal {S}_{\delta I(Y \le y)|X} \subseteq \mathcal {S}_{\delta Y|X}, \end{aligned}$$

where \(\mathcal {S}\) denotes the central subspace (Cook 1994).

Proof of Lemma 1

Suppose that B is a basis of \(\mathcal {S}_{\delta Y|X}\), such that \(\delta Y \perp X | BX.\) Then, we have \(\mathrm{Pr}(\delta Y=0|X)=\mathrm{Pr}(\delta Y=0|BX)\) and \(\mathrm{Pr}(\delta Y \le y |X)=\mathrm{Pr}(\delta Y \le y |BX)\). Note that \(\mathrm{Pr}(\delta Y \le y |X)=\mathrm{Pr}(\delta =1, Y \le y |X)+I(y \ge 0)\mathrm{Pr}(\delta =0|X)\) and \(\mathrm{Pr}(\delta Y =0 |X)=\mathrm{Pr}(\delta =1, Y=0 |X)+\mathrm{Pr}(\delta =0|X)=\mathrm{Pr}(\delta =0 |X).\) We have

$$\begin{aligned} \mathrm{Pr}(\delta I(Y \le y)=1|X)&=\mathrm{Pr}(\delta =1, Y \le y |X)\\&=\mathrm{Pr}(\delta Y \le y |X)-I(y \ge 0)\mathrm{Pr}(\delta =0|X)\\&=\mathrm{Pr}(\delta Y \le y |X)-I(y \ge 0)\mathrm{Pr}(\delta Y=0|X)\\&=\mathrm{Pr}(\delta Y \le y |BX)-I(y \ge 0)\mathrm{Pr}(\delta Y=0|BX)\\&=\mathrm{Pr}(\delta I(Y \le y)=1|BX).\\ \mathrm{Pr}(\delta I(Y \le y)=0|X)&=\mathrm{Pr}(\delta =0|X)+\mathrm{Pr}(\delta =1, Y> y |X)\\&=\mathrm{Pr}(\delta Y=0|X)+\mathrm{Pr}(\delta Y> y |X)-I(y< 0)\mathrm{Pr}(\delta =0|X)\\&=\mathrm{Pr}(\delta Y> y |X)+I(y \ge 0)\mathrm{Pr}(\delta Y=0|X)\\&=\mathrm{Pr}(\delta Y > y |BX)+I(y \ge 0)\mathrm{Pr}(\delta Y=0|BX)\\&=\mathrm{Pr}(\delta I(Y \le y)=0|BX). \end{aligned}$$

\(\square \)

About this article

Cite this article

Wang, L. Dimension reduction for kernel-assisted M-estimators with missing response at random. Ann Inst Stat Math 71, 889–910 (2019). https://doi.org/10.1007/s10463-018-0664-y
