Abstract
To obtain M-estimators of a response variable when the data are missing at random, one can construct three bias-corrected nonparametric estimating equations based on inverse probability weighting, mean imputation, and augmented inverse probability weighting. However, when the dimension of the covariates is not low, estimation efficiency suffers from the curse of dimensionality. To address this issue, we propose a two-stage estimation procedure that uses dimension-reduced kernel estimators in conjunction with the bias-corrected estimating equations. We show that the resulting three kernel-assisted estimating equations yield asymptotically equivalent M-estimators that achieve the desired properties. The finite-sample performance of the proposed estimators of the response mean, distribution function, and quantiles is studied through simulation, and an application to an HIV CD4 data set is also presented.
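For concreteness, the three bias-corrected approaches can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it takes the reduced covariate \(S = BX\) as given (in practice \(B\) is replaced by a sufficient-dimension-reduction estimate \(\hat{B}\)), uses Nadaraya–Watson smoothing with a Gaussian product kernel, and targets the response mean, i.e., \(\varphi(Y,\theta)=Y-\theta\); the function names and the bandwidth are our own choices.

```python
import numpy as np

def nw_weights(S, h):
    """Gaussian product-kernel weights K((S_i - S_j)/h) on the reduced covariates."""
    diff = S[:, None, :] - S[None, :, :]              # (n, n, q) pairwise differences
    return np.exp(-0.5 * np.sum((diff / h) ** 2, axis=2))

def mean_estimators(Y, S, delta, h=0.3):
    """IPW, mean-imputation, and AIPW estimates of E(Y) with Y missing at random.

    Y     : responses (entries with delta == 0 are never used),
    S     : reduced covariates S = B X, shape (n, q),
    delta : missingness indicator (1 = observed).
    All smoothing is done on the low-dimensional S, which is the point of the
    two-stage procedure: kernel estimation on S avoids the curse of
    dimensionality in the full covariate X.
    """
    K = nw_weights(S, h)
    dY = np.where(delta > 0, Y, 0.0)                  # mask missing responses
    pi_hat = (K @ delta) / K.sum(axis=1)              # estimate of P(delta = 1 | S)
    m_hat = (K @ dY) / np.maximum(K @ delta, 1e-12)   # estimate of E(Y | S, delta = 1)
    ipw = np.mean(dY / pi_hat)                        # inverse probability weighting
    imp = np.mean(dY + (1 - delta) * m_hat)           # mean imputation
    aipw = np.mean(dY / pi_hat - (delta - pi_hat) / pi_hat * m_hat)  # augmented IPW
    return ipw, imp, aipw
```

For a general monotone \(\varphi(Y,\theta)\), solving the estimating equation \(n^{-1}\sum_{i=1}^n g_l = 0\) would replace these closed-form means by a univariate root search in \(\theta\).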
References
Andrews, D. W. (1995). Nonparametric kernel estimation for semiparametric models. Econometric Theory, 11, 560–586.
Chen, X., Wan, A. T., Zhou, Y. (2015). Efficient quantile regression analysis with missing observations. Journal of the American Statistical Association, 110, 723–741.
Cheng, P. E. (1994). Nonparametric estimation of mean functionals with data missing at random. Journal of the American Statistical Association, 89, 81–87.
Cook, R. D. (1994). On the interpretation of regression plots. Journal of the American Statistical Association, 89, 177–189.
Cook, R. D., Weisberg, S. (1991). Discussion of “Sliced inverse regression for dimension reduction”. Journal of the American Statistical Association, 86, 28–33.
Deng, J., Wang, Q. (2017). Dimension reduction estimation for probability density with data missing at random when covariables are present. Journal of Statistical Planning and Inference, 181, 11–29.
Ding, X., Wang, Q. (2011). Fusion-refinement procedure for dimension reduction with missing response at random. Journal of the American Statistical Association, 106, 1193–1207.
Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundacker, H., Schooley, R. T., Haubrich, R. H., et al. (1996). A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. The New England Journal of Medicine, 335, 1081–1089.
Hu, Z., Follmann, D. A., Wang, N. (2014). Estimation of mean response via effective balancing score. Biometrika, 101, 613–624.
Huber, P. J. (1981). Robust statistics. New York: Wiley.
Ibrahim, J. G., Chen, M. H., Lipsitz, S. R., Herring, A. H. (2005). Missing-data methods for generalized linear models: A comparative review. Journal of the American Statistical Association, 100, 332–346.
Kim, J. K., Shao, J. (2013). Statistical methods for handling incomplete data. London: Chapman and Hall/CRC.
Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86, 316–327.
Li, Y., Wang, Q., Zhu, L., Ding, X. (2017). Mean response estimation with missing response in the presence of high-dimensional covariates. Communications in Statistics-Theory and Methods, 46, 628–643.
Ma, Y., Zhu, L. (2012). A semiparametric approach to dimension reduction. Journal of the American Statistical Association, 107, 168–179.
Ma, Y., Zhu, L. (2013). A review on dimension reduction. International Statistical Review, 81, 134–150.
Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237–249.
Qin, J., Lawless, J. (1994). Empirical likelihood and general estimating equations. The Annals of Statistics, 22, 300–325.
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581–592.
Serfling, R. J. (1980). Approximation theorems of mathematical statistics. New York: Wiley.
Shao, J., Wang, L. (2016). Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika, 103, 175–187.
Wang, D., Chen, S. X. (2009). Empirical likelihood for estimating equations with missing values. The Annals of Statistics, 37, 490–517.
Wang, L., Rotnitzky, A., Lin, X. (2010). Nonparametric regression with missing outcomes using weighted kernel estimating equations. Journal of the American Statistical Association, 105, 1135–1146.
Wang, Q. (2007). M-estimators based on inverse probability weighted estimating equations with response missing at random. Communications in Statistics-Theory and Methods, 36, 1091–1103.
Wooldridge, J. M. (2007). Inverse probability weighted estimation for general missing data problems. Journal of Econometrics, 141, 1281–1301.
Xia, Y., Tong, H., Li, W. K., Zhu, L. X. (2002). An adaptive estimation of dimension reduction space. Journal of the Royal Statistical Society: Series B, 64, 363–410.
Xue, L. (2009). Empirical likelihood confidence intervals for response mean with data missing at random. Scandinavian Journal of Statistics, 36, 671–685.
Zhang, B. (1995). M-estimation and quantile estimation in the presence of auxiliary information. Journal of Statistical Planning and Inference, 44, 77–94.
Zhu, L. P., Zhu, L. X., Ferre, L., Wang, T. (2010). Sufficient dimension reduction through discretization-expectation estimation. Biometrika, 97, 295–304.
Acknowledgements
We are grateful to the Editor, the Associate Editor and one anonymous referee for their insightful comments and suggestions on this article, which have led to significant improvements. This work was supported by the National Natural Science Foundation of China (11501208) and the Fundamental Research Funds for the Central Universities.
Appendix
- (C1) The true value \(\theta _0\) is the unique root of \(n^{-1}\sum _{i=1}^ng_l(Y_i,S_i,\delta _i,\theta )=0\), and \(n^{-1}\sum _{i=1}^ng_l(Y_i,S_i,\delta _i,\theta )\) is differentiable at \(\theta =\theta _0\) for \(l=1, 2, 3\) with \(\sum _{i=1}^n{\partial g_l(Y_i,S_i,\delta _i,\theta _0)}/{\partial \theta } \ne 0\).
- (C2) The function \(\varphi (Y,\theta )\) is monotone and continuous in \(\theta \) with \(E|\varphi (Y,\theta )|< \infty \); \(\partial \varphi (Y,\theta )/\partial \theta \) is continuous at \(\theta =\theta _0\) with \(E|\partial \varphi (Y,\theta _0)/\partial \theta |< \infty \); and \( E\{\varphi ^2(Y,\theta )|S\} < \infty \).
- (C3) The kernel \(K(\cdot )\) is bounded, has compact support, and is of order \(m \ge 2\), i.e., \(\int K(s_1,...,s_{d})ds_1 \cdots ds_{d}=1\), \(\int s_j^tK(s_1,...,s_{d})ds_1\cdots ds_{d}=0\), and \(\int s_j^mK(s_1,...,s_{d})ds_1\cdots ds_{d}\ne 0\) for any \(j =1,...,d\) and \(t=1,..., m-1\).
- (C4) The function \(\pi (S)\) and the S-density function f(S) have continuous and bounded partial derivatives with respect to S up to order m, and \(\pi (S)\) is bounded away from 0 and 1.
- (C5) The function \(m_{\varphi }(S, \theta )\) is twice continuously differentiable in a neighborhood of S and has bounded partial derivatives up to order m.
- (C6) As \(n\rightarrow \infty \), \(nh^{2d}\rightarrow \infty \), \(nh^{d}/\log n\rightarrow \infty \), \(nh^{2m}\rightarrow 0\), and the estimator \(\hat{B}\) obtained by SDR is a root-n consistent estimator of B.
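The moment conditions in (C3) can be checked numerically. The sketch below verifies them for a product Gaussian kernel with \(d=2\) and \(m=2\); note the Gaussian does not have the compact support that (C3) formally requires, so this only illustrates the moment conditions (a product Epanechnikov kernel would satisfy (C3) exactly).

```python
import numpy as np

# Numerical check of the order-m moment conditions in (C3) for a product
# Gaussian kernel, d = 2, m = 2.  Grid-based Riemann sums stand in for the
# integrals; truncation at |s_j| = 8 is numerically negligible.
grid = np.linspace(-8.0, 8.0, 401)
ds = grid[1] - grid[0]
S1, S2 = np.meshgrid(grid, grid)
K = np.exp(-0.5 * (S1 ** 2 + S2 ** 2)) / (2.0 * np.pi)

total = K.sum() * ds ** 2               # int K(s_1, s_2) ds        -> 1
first = (S1 * K).sum() * ds ** 2        # int s_j^t K ds            -> 0, t = 1, ..., m-1
second = (S1 ** 2 * K).sum() * ds ** 2  # int s_j^m K ds            -> nonzero for m = 2
```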
Proof of Theorem 1
For \(g_2(Y_i, \hat{S}_i,\delta _i,\theta )\), note that
where \(S_i=BX_i\) and \(\hat{S}_i=\hat{B}X_i\). Define \(G(S)=f(S)\pi (S)\) and
Let \(\varDelta _n(\hat{S}_i,S_i)=\hat{G}_n(\hat{S}_i)-G(S_i)\). Then,
where
Using the fact \(\delta \varphi (Y, \theta )\perp X|B X\), we can show that
As in Wang and Chen (2009), we can prove that
and \(A_{n2}=o_p(n^{-1/2})\). Using the arguments in Andrews (1995) together with \(\Vert \hat{B}-B\Vert =O_p(n^{-1/2})\), we obtain
such that \(A_{n3}=o_p(n^{-1/2})\). Thus, we have
This leads to
Furthermore, we have
For \(g_1(Y_i, \hat{S}_i,\delta _i,\theta )\), we have
Using arguments similar to those in Wang (2007), we can prove that
which leads to
and
For \(g_3(Y_i, \hat{S}_i,\delta _i,\theta )\), it can be seen that
Using arguments similar to those for \(g_1(Y_i, \hat{S}_i,\delta _i,\theta )\) and \(g_2(Y_i, \hat{S}_i,\delta _i,\theta )\), it can be proved that the last two terms on the right-hand side of the above equation are \(o_p(1)\). The proof is completed. \(\square \)
Proof of Theorem 2
By Taylor expansion, there exists \({\theta }_l^*\) between \(\hat{\theta }_l\) and \(\hat{\theta }_0\), \(l=1,2,3,\) such that
Since \(\sum _{i=1}^ng_l(Y_i,\hat{S}_i,\delta _i,\hat{\theta }_l)=0\), we have
As in the proof of Theorem 1, it can be proved that, as \(n \rightarrow \infty \),
The proof is completed. \(\square \)
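The displayed expansion in this proof can be sketched as follows. This is a standard reconstruction written with \(\theta_0\) as the expansion point, not necessarily the paper's exact display:

```latex
0=\frac{1}{n}\sum_{i=1}^{n}g_l(Y_i,\hat{S}_i,\delta_i,\hat{\theta}_l)
 =\frac{1}{n}\sum_{i=1}^{n}g_l(Y_i,\hat{S}_i,\delta_i,\theta_0)
 +\Bigg\{\frac{1}{n}\sum_{i=1}^{n}
   \frac{\partial g_l(Y_i,\hat{S}_i,\delta_i,\theta_l^{*})}{\partial\theta}\Bigg\}
   (\hat{\theta}_l-\theta_0),
```

so that \(\sqrt{n}(\hat{\theta}_l-\theta_0)\) equals \(-\{n^{-1}\sum_i \partial g_l(Y_i,\hat{S}_i,\delta_i,\theta_l^{*})/\partial\theta\}^{-1} n^{-1/2}\sum_i g_l(Y_i,\hat{S}_i,\delta_i,\theta_0)\), and the asymptotic distribution follows from Theorem 1.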
Lemma 1
Assume that \(P(\delta = 1|X)>0\) and \(P(Y=0|X)=0\). For any given y, it can be verified that
where \(\mathcal {S}\) denotes the central subspace (Cook 1994).
Proof of Lemma 1
Suppose that B is a basis of \(\mathcal {S}_{\delta Y|X}\), so that \(\delta Y \perp X | BX.\) Then we have \(\mathrm{Pr}(\delta Y=0|X)=\mathrm{Pr}(\delta Y=0|BX)\) and \(\mathrm{Pr}(\delta Y \le y |X)=\mathrm{Pr}(\delta Y \le y |BX)\). Note that \(\mathrm{Pr}(\delta Y \le y |X)=\mathrm{Pr}(\delta =1, Y \le y |X)+I(y \ge 0)\mathrm{Pr}(\delta =0|X)\) and, since \(P(Y=0|X)=0\), \(\mathrm{Pr}(\delta Y =0 |X)=\mathrm{Pr}(\delta =1, Y=0 |X)+\mathrm{Pr}(\delta =0|X)=\mathrm{Pr}(\delta =0 |X).\) We have
\(\square \)
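Combining the identities stated in the proof, the concluding display can be sketched as (a reconstruction using only the relations above):

```latex
\Pr(\delta=1,\,Y\le y\mid X)
 =\Pr(\delta Y\le y\mid X)-I(y\ge 0)\Pr(\delta Y=0\mid X)
 =\Pr(\delta Y\le y\mid BX)-I(y\ge 0)\Pr(\delta Y=0\mid BX),
```

which depends on X only through BX, so working with \(\mathcal {S}_{\delta Y|X}\) suffices for the joint missingness-response structure.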
Cite this article
Wang, L. Dimension reduction for kernel-assisted M-estimators with missing response at random. Ann Inst Stat Math 71, 889–910 (2019). https://doi.org/10.1007/s10463-018-0664-y