An improvement on the efficiency of complete-case-analysis with nonignorable missing covariate data

Abstract

This paper develops a weighted composite quantile regression method for linear models in which some covariates are missing not at random but the missingness is conditionally independent of the response variable. It is known that complete case analysis (CCA) is valid under these missingness assumptions. By fully utilizing the information from incomplete data, empirical likelihood-based weights are obtained and used to conduct the weighted composite quantile regression. Theoretical results show that the proposed estimator is more efficient than the CCA estimator if the probability of missingness given the fully observed variables is correctly specified. Moreover, the proposed algorithm is computationally simple and easy to implement. The methodology is illustrated on simulated data and a real data set.
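
As a rough illustration of the weighted composite quantile regression criterion described above, the following Python sketch evaluates a weighted check-loss objective over several quantile levels. The function names (`check_loss`, `weighted_cqr_objective`) and the toy data are hypothetical and are not taken from the paper; the weights `p` stand in for the empirical likelihood weights, and uniform weights `1/n` are assumed here to correspond to the complete-case criterion.

```python
def check_loss(r, tau):
    """Quantile check loss: rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def weighted_cqr_objective(beta, b, y, w, delta, p, taus):
    """Sum over quantile levels k and complete cases i of
    n * p_i * delta_i * rho_{tau_k}(y_i - b_k - w_i' beta)."""
    n = len(y)
    total = 0.0
    for b_k, tau_k in zip(b, taus):
        for y_i, w_i, d_i, p_i in zip(y, w, delta, p):
            if d_i == 0:  # incomplete case: contributes nothing to the loss
                continue
            fitted = sum(w_ij * beta_j for w_ij, beta_j in zip(w_i, beta))
            total += n * p_i * check_loss(y_i - b_k - fitted, tau_k)
    return total


# Toy data (hypothetical): one covariate, four subjects, third case incomplete.
y = [1.0, 2.0, 3.0, 4.0]
w = [[0.0], [1.0], [2.0], [3.0]]
delta = [1, 1, 0, 1]
p = [0.25, 0.25, 0.25, 0.25]  # uniform weights, i.e. n * p_i = 1
taus = [0.25, 0.5, 0.75]

# Slope 1 with common intercept 1 fits the complete cases exactly.
loss_true = weighted_cqr_objective([1.0], [1.0, 1.0, 1.0], y, w, delta, p, taus)
# Slope 0 leaves positive residuals and hence a larger loss.
loss_off = weighted_cqr_objective([0.0], [1.0, 1.0, 1.0], y, w, delta, p, taus)
```

In practice the objective would be minimized over `beta` and `b` jointly, for example by linear programming; the sketch only evaluates it.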

References

  1. Bartlett JW, Carpenter JR, Tilling K, Vansteelandt S (2014) Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 15(4):719–730

  2. Bloznelis D, Claeskens G, Zhou J (2019) Composite versus model-averaged quantile regression. J Stat Plan Inference 200:32–46

  3. Bradic J, Fan JQ, Wang WW (2011) Penalized composite quasi-likelihood for ultrahigh dimensional variable selection. J R Stat Soc Ser B 73(3):325–349

  4. Jiang XJ, Jiang JC, Song XY (2012) Oracle model selection for nonlinear models based on weighted composite quantile regression. Stat Sin 22(4):1479–1506

  5. Koenker R (2005) Quantile regression. Cambridge University Press, New York

  6. Little RJ, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, Hoboken

  7. Little RJ, Zhang N (2011) Subsample ignorable likelihood for regression analysis with missing data. J R Stat Soc Ser C 60(4):591–605

  8. Liu TQ, Yuan XH (2016) Weighted quantile regression with missing covariates using empirical likelihood. Statistics 50(1):89–113

  9. Molanes Lopez EM, Van Keilegom I, Veraverbeke N (2009) Empirical likelihood for non-smooth criterion functions. Scand J Stat 36(3):413–432

  10. Ning ZJ, Tang LJ (2014) Estimation and test procedures for composite quantile regression with covariates missing at random. Stat Probab Lett 95:15–25

  11. Owen AB (1990) Empirical likelihood ratio confidence regions. Ann Stat 18(1):90–120

  12. Pollard D (1991) Asymptotics for least absolute deviation regression estimators. Econ Theory 7(2):186–199

  13. Qin J, Lawless J (1994) Empirical likelihood and general estimating equations. Ann Stat 22(1):300–325

  14. Sherwood B (2016) Variable selection for additive partial linear quantile regression with missing covariates. J Multivar Anal 152:206–223

  15. Sun J, Ma YY (2017) Empirical likelihood weighted composite quantile regression with partially missing covariates. J Nonparametr Stat 29(1):137–150

  16. Sun J, Sun QH (2015) An improved and efficient estimation method for varying-coefficient model with missing covariates. Stat Probab Lett 107:296–303

  17. Sun J, Gai YJ, Lin L (2013) Weighted local linear composite quantile estimation for the case of general error distributions. J Stat Plan Inference 143:1049–1063

  18. Tang NS, Wang XJ (2019) Robust estimation of generalized estimating equations with finite mixture correlation matrices and missing covariates at random for longitudinal data. J Multivar Anal 173:640–655

  19. Tang LJ, Zhou ZG (2015) Weighted local linear CQR for varying-coefficient models with missing covariates. TEST 24(3):583–604

  20. Tang LJ, Zheng SC, Zhou ZG (2018) Estimation and inference of combining quantile and least-square regressions with missing data. J Korean Stat Soc 47:77–89

  21. Yoshida T (2017) Two stage smoothing in additive models with missing covariates. Stat Pap 60(6):1803–1826

  22. Zhao ZB, Xiao ZJ (2014) Efficient regressions via optimally combining quantile information. Econ Theory 30(6):1272–1314

  23. Zou H, Yuan M (2008) Composite quantile regression and the oracle model selection theory. Ann Stat 36(3):1108–1126

Acknowledgements

This research was supported by the Natural Science Foundation of Shandong Province, China (ZR2017QA011). The author would like to thank the anonymous reviewers for their valuable comments and suggestions.

Author information

Correspondence to Jing Sun.

Appendix

The following regularity conditions are required for the asymptotic analysis. Conditions (C1), (C2) and (C6)–(C8) are similar to those in Sun and Ma (2017). Condition (C3) is a necessary assumption about the missing data mechanism in this paper. Conditions (C4) and (C5) guarantee that the asymptotic covariance matrices of the CCA-based and ACC-based CQR estimators are both positive definite.

(C1) \(\varvec{w}\) has a bounded support.

(C2) The density function \(f(\cdot )\) of \(\varepsilon \) is bounded away from zero and has a continuous and uniformly bounded derivative.

(C3) \(\delta \) is independent of y given \(\varvec{w}\).

(C4) \(\varvec{D}\) and \(\varvec{S_\phi }\) are positive definite.

(C5) \({\varvec{S}}_{\varvec{B}}\), \(\varvec{P}_2\) and \(\varvec{S_\phi }-\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T\) are positive definite.

(C6) For all \((y_i,\varvec{z}_i)\), \(\pi (y_i,\varvec{z}_i,\varvec{\gamma })\) admits all third partial derivatives \(\frac{\partial ^3\pi (y_i,\varvec{z}_i,\varvec{\gamma })}{\partial \gamma _k\partial \gamma _l\partial \gamma _m}\) for all \(\varvec{\gamma }\) in a neighborhood of the true value \(\varvec{\gamma }^*\), \(\Vert \frac{\partial ^3\pi (y_i,\varvec{z}_i,\varvec{\gamma })}{\partial \gamma _k\partial \gamma _l\partial \gamma _m}\Vert \) is bounded by an integrable function for all \(\varvec{\gamma }\) in this neighborhood, and \(\Vert \frac{\partial \pi (y_i,\varvec{z}_i,\varvec{\gamma })}{\partial \varvec{\gamma }^T}\Vert \) is bounded by an integrable function for all \(\varvec{\gamma }\) in this neighborhood.

(C7) \(\pi (y,\varvec{z},\varvec{\gamma }^*)\) is bounded away from zero, i.e. \(\underset{y,z}{\inf }\,\pi (y,\varvec{z},\varvec{\gamma }^*)\ge c_0\) for some \(c_0 > 0\).

(C8) \(\Vert \varvec{\xi }(y,\varvec{z},\varvec{\zeta })\Vert ^2\) is bounded by an integrable function for all \(\varvec{\zeta }\) in a neighbourhood of \(\varvec{\zeta }^*\) and \(\varvec{\xi }(y,\varvec{z},\varvec{\zeta })\) is continuous at each \(\varvec{\zeta }\) with probability one in this neighbourhood, where \(\varvec{\zeta }=(\varvec{\alpha }^T,\varvec{\beta }^T,\varvec{b}^T)^T\) and \(\varvec{\zeta }^*=({\varvec{\alpha }^*}^T,{\varvec{\beta }^*}^T,{\varvec{b}^*}^T)^T\). For some \(c > 0\),

$$\begin{aligned} \underset{\Vert \varvec{\zeta }-\varvec{\zeta }^*\Vert \le cn^{-1/2}\,}{\sup }\bigg \Vert n^{-1/2}\sum _{i=1}^n(\delta _i-\pi (y_i,\varvec{z}_i,\varvec{\gamma }^*))(\varvec{\xi }(y_i,\varvec{z}_i,\varvec{\zeta })-\varvec{\xi }(y_i,\varvec{z}_i,\varvec{\zeta }^*))\bigg \Vert =o_p(1). \end{aligned}$$

For notational convenience, for \(i=1,\ldots ,n\) and \(k=1,\ldots ,q\), write

$$\begin{aligned}&\varvec{\kappa }=\ (\varvec{\alpha }^T,\varvec{\beta }^T,{\varvec{b}}^T,\varvec{\gamma }^T)^T,\quad \varvec{\kappa }^*=({\varvec{\alpha }^*}^T,{\varvec{\beta }^*}^T,{\varvec{b}^*}^T,{\varvec{\gamma }^*}^T)^T,\quad \\&\qquad \hat{\varvec{\kappa }}=(\hat{\varvec{\alpha }}^T,\hat{\varvec{\beta }}_{\scriptscriptstyle cc}^T,\hat{\varvec{b}}_{\scriptscriptstyle cc}^T,\hat{\varvec{\gamma }}^T)^T,\\&\zeta _k(\varvec{t}_i,\varvec{\beta },\varvec{b})=\ I(y_i-\varvec{w}_i^T\varvec{\beta }\le b_k)-\tau _k, \quad \zeta _{i,k}=\zeta _k(\varvec{t}_i,\varvec{\beta }^*,\varvec{b}^*),\\&\eta (\varvec{t}_i,\varvec{\beta },\varvec{b})=\sum _{k=1}^{q}\zeta _k(\varvec{t}_i,\varvec{\beta },\varvec{b}),\quad \eta _i=\eta (\varvec{t}_i,\varvec{\beta }^*,\varvec{b}^*),\\&\varvec{\xi }_i=\ \varvec{\xi }(y_i,\varvec{z}_i,\varvec{\alpha }^*,\varvec{\beta }^*,\varvec{b}^*),\quad \varvec{h}_i=\varvec{h}(\varvec{t}_i,\varvec{\kappa }^*),\quad \varvec{g}_i=\varvec{g}(\varvec{t}_i,\varvec{\kappa }^*),\\&\varvec{S}_{\varvec{g}}=\ E(\varvec{g}_i\varvec{g}_i^T), \quad \varvec{F}_{\varvec{g}}=E(\varvec{\phi }_i\varvec{g}_i^T), \quad {\varvec{\Lambda }}({\varvec{\lambda }},\varvec{\kappa })=\frac{1}{n}\sum _{i=1}^n\frac{\varvec{g}(\varvec{t}_i,\varvec{\kappa })}{1+{\varvec{\lambda }}^T\varvec{g}(\varvec{t}_i,\varvec{\kappa })},\\&\varvec{G_\gamma }=E(\partial {\varvec{g}_i}/\partial {\varvec{\gamma }^T}), \quad \varvec{F_\gamma }=E(\partial {\varvec{\phi }_i}/\partial {\varvec{\gamma }^T}). \end{aligned}$$

Lemma 1

(i) \(det(\varvec{S}_{\varvec{g}})= det(\varvec{P}_2)det(\varvec{S}_B)\), where \(det(\varvec{A})\) denotes the determinant of some matrix \(\varvec{A}\).

(ii) \(\varvec{G_\gamma }=-E(\varvec{g}_i\varvec{U}_{B_i}^T)\), \(\varvec{F_\gamma } =-E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\). Thus \(\varvec{F_\gamma }=\varvec{F_g}\varvec{S_g}^{-1}\varvec{G_\gamma }\).

(iii) \(\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1}\varvec{F}_{\varvec{g}}^T=\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T+E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\varvec{S}_{B}^{-1}E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\), \(\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1}\varvec{G}_{\varvec{\gamma }}=-E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\), \(\varvec{S}_{\varvec{g}}^{-1}\varvec{G}_{\varvec{\gamma }}= \begin{pmatrix}\varvec{0}^T, -\varvec{I}^T \end{pmatrix}^T.\)

Proof of Lemma 1

The results follow from direct calculation; the details are omitted. \(\square \)

Proof of Theorem 1

The proof of Theorem 1 is similar to, and much simpler than, that of Theorem 2; the details are omitted. \(\square \)

Proof of Theorem 2

For the fixed estimator \(\hat{\varvec{\kappa }}\), the Lagrange multiplier \(\hat{\varvec{\lambda }}\) satisfies the constraint equation \({\varvec{\Lambda }}(\hat{\varvec{\lambda }},\hat{\varvec{\kappa }})=0.\) According to Lemma A.1 in Sun and Ma (2017),

$$\begin{aligned} \hat{\varvec{\lambda }}={\varvec{\lambda }}(\hat{\varvec{\kappa }})=\Big (\frac{1}{n}\sum _{i=1}^n\varvec{g}(\varvec{t}_i,\hat{\varvec{\kappa }})\varvec{g}(\varvec{t}_i,\hat{\varvec{\kappa }})^T\Big )^{-1}\frac{1}{n}\sum _{i=1}^n\varvec{g}(\varvec{t}_i,\hat{\varvec{\kappa }})+o_p(n^{-\frac{1}{2}}). \end{aligned}$$
(4.1)

By Taylor expansion,

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n\varvec{g}(\varvec{t}_i,\hat{\varvec{\kappa }})= & {} \frac{1}{n}\sum _{i=1}^n\varvec{g}(\varvec{t}_i,{\varvec{\kappa }}^*) +\frac{1}{n}\sum _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{\alpha }^T}}(\hat{\varvec{\alpha }}-\varvec{\alpha }^*)\\&+\frac{1}{n}\sum _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{\beta }^T}}(\hat{\varvec{\beta }}_{\scriptscriptstyle cc}-\varvec{\beta }^*)\\&+\,\frac{1}{n}\sum _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{b}^T}}(\hat{\varvec{b}}_{\scriptscriptstyle cc}-\varvec{b}^*) +\frac{1}{n}\sum _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{\gamma }^T}}(\hat{\varvec{\gamma }}-\varvec{\gamma }^*), \end{aligned}$$

where \(\tilde{\varvec{\kappa }}=(\tilde{\varvec{\alpha }}^T,\tilde{\varvec{\beta }}^T,\tilde{\varvec{b}}^T,\tilde{\varvec{\gamma }}^T)^T\in \{\varvec{\kappa }:\Vert \varvec{\kappa }-\varvec{\kappa }^*\Vert \le cn^{-1/2}\}\) with a positive constant c. According to the law of large numbers,

$$\begin{aligned} \frac{1}{n}\sum \limits _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{\alpha }^T}}&{\mathop {\rightarrow }\limits ^{P}}\varvec{0},\quad \frac{1}{n}\sum \limits _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{\beta }^T}} {\mathop {\rightarrow }\limits ^{P}}\varvec{0},\quad \frac{1}{n}\sum \limits _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{b}^T}} {\mathop {\rightarrow }\limits ^{P}}\varvec{0},\\ \frac{1}{n}\sum \limits _{i=1}^n\frac{\partial {\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})}}{\partial {\varvec{\gamma }^T}}&{\mathop {\rightarrow }\limits ^{P}} \varvec{G}_{\varvec{\gamma }},\quad \frac{1}{n}\sum \limits _{i=1}^n\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})\varvec{g}(\varvec{t}_i,\tilde{\varvec{\kappa }})^T {\mathop {\rightarrow }\limits ^{P}}\varvec{S}_{\varvec{g}}. \end{aligned}$$

By combining these with (4.1) and

$$\begin{aligned} \hat{\varvec{\gamma }}-\varvec{\gamma }^*=\frac{1}{n}\varvec{S}_B^{-1}\sum _{i=1}^n\varvec{U}_{B_i}+o_p(n^{-\frac{1}{2}}), \end{aligned}$$

it holds that

$$\begin{aligned} \hat{\varvec{\lambda }}=\frac{1}{n}\varvec{S}_{\varvec{g}}^{-1}\sum _{i=1}^n(\varvec{g}_i+\varvec{G}_{\varvec{\gamma }}\varvec{S}_B^{-1}\varvec{U}_{B_i})+o_p(n^{-\frac{1}{2}}). \end{aligned}$$
(4.2)

Let \(\varvec{u}=\sqrt{n}(\varvec{\beta }-\varvec{\beta }^*)\), \(\hat{\varvec{u}}=\sqrt{n}(\hat{\varvec{\beta }}_{acc}-\varvec{\beta }^*)\), \(v_k=\sqrt{n}(b_k-b_k^*)\) and \(\hat{v}_k=\sqrt{n}(\hat{b}_{k,acc}-b_k^*)\), \(k=1,\ldots ,q\). Let \(\varvec{\theta }=(\varvec{u}^T,v_1,\ldots ,v_q)^T\) and \(\hat{\varvec{\theta }}=(\hat{\varvec{u}}^T,\hat{v}_1,\ldots ,\hat{v}_q)^T\), then \(\hat{\varvec{\theta }}\) is the minimizer of

$$\begin{aligned} \Xi _n(\varvec{\theta })&= \sum _{k=1}^q\sum _{i=1}^nn\hat{p}_i\delta _i\big \{\rho _{\tau _k}(\varepsilon _i-b_k^*-n^{-\frac{1}{2}}(v_k+\varvec{w}_i^T\varvec{u}))-\rho _{\tau _k}(\varepsilon _i-b_k^*)\big \} \nonumber \\&= \sum _{k=1}^qv_kz_{n,k}+\varvec{\Pi }_n^T\varvec{u}+\sum _{k=1}^qB_{n,k}, \end{aligned}$$
(4.3)

where

$$\begin{aligned} z_{n,k}&=\frac{1}{\sqrt{n}}\sum _{i=1}^nn\hat{p}_i\delta _i\zeta _{i,k},\quad \varvec{\Pi }_n=\frac{1}{\sqrt{n}}\sum \limits _{i=1}^nn\hat{p}_i\delta _i\varvec{w}_i\eta _i,\\ B_{n,k}&=\sum _{i=1}^nn\hat{p}_i\delta _i\int _{0}^{\frac{v_k+\varvec{w}_i^T\varvec{u}}{\sqrt{n}}}\big (I(\varepsilon _i\le b_k^*+s)-I(\varepsilon _i\le b_k^*)\big )\hbox {d}s. \end{aligned}$$
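
The second equality in (4.3) is not spelled out in the text. It appears to rest on the standard decomposition of a check-loss difference, commonly known as Knight's identity, \(\rho _\tau (r-s)-\rho _\tau (r)=s\{I(r\le 0)-\tau \}+\int _0^s\{I(r\le t)-I(r\le 0)\}\hbox {d}t\), applied with \(r=\varepsilon _i-b_k^*\) and \(s=n^{-1/2}(v_k+\varvec{w}_i^T\varvec{u})\). The Python sketch below is only a numerical sanity check of that identity on a grid; the helper names are hypothetical.

```python
def check_loss(r, tau):
    """Quantile check loss: rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def knight_remainder(r, s):
    """Closed form of the integral over t from 0 to s of 1{r <= t} - 1{r <= 0}."""
    if s >= 0:
        return s - r if 0 < r <= s else 0.0
    return r - s if s < r <= 0 else 0.0


def knight_gap(r, s, tau):
    """Absolute difference between the two sides of the identity."""
    lhs = check_loss(r - s, tau) - check_loss(r, tau)
    rhs = s * ((1.0 if r <= 0 else 0.0) - tau) + knight_remainder(r, s)
    return abs(lhs - rhs)


grid = [x / 4.0 for x in range(-8, 9)]  # -2.0, -1.75, ..., 2.0
max_gap = max(
    knight_gap(r, s, tau)
    for r in grid
    for s in grid
    for tau in (0.1, 0.5, 0.9)
)
```

The linear term of the identity produces \(z_{n,k}\) and \(\varvec{\Pi }_n\) after summing over i and k, and the integral term is the summand of \(B_{n,k}\).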

Owing to (2.4),

$$\begin{aligned} n\hat{p}_i=1-\hat{\varvec{\lambda }}^T\varvec{g}_i+o_p(1). \end{aligned}$$
(4.4)
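
To make the weights in (4.4) concrete, the Python sketch below treats a scalar constraint function, solves \({\varvec{\Lambda }}({\varvec{\lambda }},\varvec{\kappa })=0\) by bisection, and forms \(\hat{p}_i=1/\{n(1+\lambda g_i)\}\). This is the usual empirical likelihood form of the weights (Owen 1990; Qin and Lawless 1994) and is assumed here to agree with (2.4), which is not reproduced in this appendix. The data and the function names are hypothetical.

```python
def el_lambda(g):
    """Solve (1/n) * sum_i g_i / (1 + lam * g_i) = 0 for a scalar lam by bisection.

    Requires min(g) < 0 < max(g), so that zero lies inside the range of g.
    """
    if not (min(g) < 0.0 < max(g)):
        raise ValueError("zero must lie strictly between min(g) and max(g)")

    def score(lam):
        return sum(g_i / (1.0 + lam * g_i) for g_i in g) / len(g)

    # Every 1 + lam * g_i is positive only for lam in (-1/max(g), -1/min(g)).
    lo, hi = -1.0 / max(g), -1.0 / min(g)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:  # score is strictly decreasing in lam
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def el_weights(g):
    """Return the multiplier and the weights p_i = 1 / (n * (1 + lam * g_i))."""
    lam = el_lambda(g)
    n = len(g)
    return lam, [1.0 / (n * (1.0 + lam * g_i)) for g_i in g]


g = [-0.9, -0.4, 0.1, 0.3, 0.5, 0.7]  # hypothetical values of g(t_i, kappa-hat)
lam, p = el_weights(g)
n_times_p = [len(g) * p_i for p_i in p]
linearised = [1.0 - lam * g_i for g_i in g]  # first-order form used in (4.4)
```

The resulting weights are positive, sum to one and satisfy \(\sum _i\hat{p}_ig_i=0\); comparing `n_times_p` with `linearised` shows how the first-order form in (4.4) approximates the exact weights when the multiplier is small.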

Then by substituting (4.2) and (4.4) into \(\varvec{\Pi }_n\), it holds that

$$\begin{aligned} \varvec{\Pi }_n&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\delta _i\varvec{w}_i\eta _i-\Big (\frac{1}{n}\sum _{i=1}^n\delta _i\varvec{w}_i\eta _i\varvec{g}_i^T\Big )\sqrt{n}\hat{\varvec{\lambda }}+o_p(1)\nonumber \\&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\varvec{\phi }_i-\Big (\frac{1}{n}\sum _{i=1}^n\varvec{\phi }_i\varvec{g}_i^T\Big )\sqrt{n}\hat{\varvec{\lambda }}+o_p(1)\nonumber \\&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\varvec{\phi }_i-\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1}\frac{1}{\sqrt{n}}\sum _{i=1}^n(\varvec{g}_i+\varvec{G}_{\varvec{\gamma }}\varvec{S}_B^{-1}\varvec{U}_{B_i})+o_p(1). \end{aligned}$$
(4.5)

Based on some calculations and Lemma 1 (ii),

$$\begin{aligned} \varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1}= & {} \begin{pmatrix} \varvec{P}_1\varvec{P}_2^{-1}&E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\varvec{S}_B^{-1}-\varvec{P}_1\varvec{P}_2^{-1}E(\varvec{h}_i\varvec{U}_{B_i}^T)\varvec{S}_B^{-1} \end{pmatrix},\\ \sum _{i=1}^n(\varvec{g}_i+\varvec{G}_{\varvec{\gamma }}\varvec{S}_B^{-1}\varvec{U}_{B_i})= & {} \begin{pmatrix} \sum \limits _{i=1}^n\varvec{h}_i \\ \sum \limits _{i=1}^n\varvec{U}_{B_i} \end{pmatrix}+ \begin{pmatrix} -E(\varvec{h}_i\varvec{U}_{B_i}^T) \\ -E(\varvec{U}_{B_i}\varvec{U}_{B_i}^T) \end{pmatrix} \varvec{S}_B^{-1}\sum \limits _{i=1}^n\varvec{U}_{B_i}\\= & {} \begin{pmatrix} \sum \limits _{i=1}^n\big (\varvec{h}_i-E(\varvec{h}_i\varvec{U}_{B_i}^T)\varvec{S}_B^{-1}\varvec{U}_{B_i} \big )\\ \varvec{0} \end{pmatrix}. \end{aligned}$$

Substituting these into (4.5) leads to

$$\begin{aligned} \varvec{\Pi }_n=\frac{1}{\sqrt{n}}\sum _{i=1}^n\big (\varvec{\phi }_i-\varvec{P}_1\varvec{P}_2^{-1}\big (\varvec{h}_i-E(\varvec{h}_i\varvec{U}_{B_i}^T)\varvec{S}_B^{-1}\varvec{U}_{B_i} \big )\big )+o_p(1). \end{aligned}$$

By some calculations, it holds that

$$\begin{aligned} Var\big (\varvec{\phi }_i-\varvec{P}_1\varvec{P}_2^{-1}\big (\varvec{h}_i-E(\varvec{h}_i\varvec{U}_{B_i}^T)\varvec{S}_B^{-1}\varvec{U}_{B_i} \big )\big )= \varvec{S}_{\varvec{\phi }}-\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T, \end{aligned}$$

and \(\varvec{\Pi }_n{\mathop {\rightarrow }\limits ^{d}}\varvec{\Pi }_0\), where \(\varvec{\Pi }_0\) is a random vector following \(N(\varvec{0},\varvec{S_\phi }-\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T)\).

Similarly, for \(k=1,\ldots ,q\),

$$\begin{aligned} z_{n,k}= & {} \frac{1}{\sqrt{n}}\sum _{i=1}^n\delta _i\zeta _{i,k}-\Big (\frac{1}{n}\sum _{i=1}^n\delta _i\zeta _{i,k}\varvec{g}_i^T\Big )\sqrt{n}\hat{\varvec{\lambda }}+o_p(1)\\= & {} \frac{1}{\sqrt{n}}\sum _{i=1}^n\big (\delta _i\zeta _{i,k}-E(\delta _i\zeta _{i,k}\varvec{g}_i^T)\varvec{S}_{\varvec{g}}^{-1}(\varvec{g}_i+\varvec{G}_{\varvec{\gamma }}\varvec{S}_B^{-1}\varvec{U}_{B_i})\big )+o_p(1), \end{aligned}$$

and \(z_{n,k}{\mathop {\rightarrow }\limits ^{d}}z_{k}\), where \(z_{k}\) follows some one-dimensional normal distribution.

For \(i=1,\ldots ,n\) and \(k=1,\ldots ,q\), write \(A_{i,k}=\int _{0}^{(v_k+\varvec{w}_i^T\varvec{u})/\sqrt{n}}\big (I(\varepsilon _i\le b_k^*+s)-I(\varepsilon _i\le b_k^*)\big )\hbox {d}s\). Then

$$\begin{aligned} B_{n,k}&=\sum \limits _{i=1}^nn\hat{p}_i\delta _iA_{i,k}=\sum \limits _{i=1}^n\delta _iA_{i,k}-\sum \limits _{i=1}^n\delta _iA_{i,k}\varvec{g}_i^T\hat{\varvec{\lambda }}+o_p(1). \end{aligned}$$

Based on some derivations,

$$\begin{aligned} E\Big (\sum _{i=1}^n\delta _iA_{i,k}\Big )&=\sum \limits _{i=1}^nE\Big (E(\delta _i|\varvec{w}_i) \int _{0}^{\frac{v_k+\varvec{w}_i^T\varvec{u}}{\sqrt{n}}} \big (F(b_k^*+s)-F(b_k^*)\big )\hbox {d}s\Big )\\&=\frac{1}{2}f(b_k^*)v_k^2E(\delta _i)+\frac{1}{2}f(b_k^*)\varvec{u}^TE(\delta _i\varvec{w}_i\varvec{w}_i^T)\varvec{u}+o_p(1),\\ Var\Big (\sum \limits _{i=1}^n\delta _iA_{i,k}\Big )&= \sum \limits _{i=1}^nVar(\delta _iA_{i,k})=\sum \limits _{i=1}^nE\big [\big (\delta _iA_{i,k}-E(\delta _iA_{i,k})\big )^2\big ]\\&\le 4E\Big (\sum \limits _{i=1}^n\delta _iA_{i,k}\Big )\frac{|v_k|+\max _{1\le i\le n}|\varvec{w}_i^T\varvec{u}|}{\sqrt{n}}=o_p(1),\\ \Big |\sum _{i=1}^n\delta _iA_{i,k}\varvec{g}_i^T\hat{\varvec{\lambda }}\Big |&\le \underset{1\le i\le n}{\max }\left\| \varvec{g}_i\right\| \Vert \hat{\varvec{\lambda }}\Vert \Big |\sum _{i=1}^n\delta _iA_{i,k}\Big |=o_p(n^\frac{1}{2})O_p(n^{-\frac{1}{2}})O_p(1)=o_p(1). \end{aligned}$$

Therefore

$$\begin{aligned} B_{n,k}=\frac{1}{2}f(b_k^*)v_k^2E(\delta _i)+\frac{1}{2}f(b_k^*)\varvec{u}^TE(\delta _i\varvec{w}_i\varvec{w}_i^T)\varvec{u}+o_p(1). \end{aligned}$$

By substituting the expression of \(B_{n,k}\) into (4.3),

$$\begin{aligned} \Xi _n(\varvec{\theta })= \sum _{k=1}^qv_kz_{n,k}+\varvec{\Pi }_n^T\varvec{u}+\frac{1}{2}\sum _{k=1}^qf(b_k^*)v_k^2E(\delta _i)+\frac{1}{2}\varvec{u}^T\varvec{D}\varvec{u}+o_p(1). \end{aligned}$$

According to the convexity lemma in Pollard (1991), it follows that

$$\begin{aligned} \hat{\varvec{u}}=-\varvec{D}^{-1}\varvec{\Pi }_n+o_p(1). \end{aligned}$$
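
For completeness, a short worked step behind this display: ignoring the \(o_p(1)\) term, the limiting objective is a convex quadratic that separates in \(\varvec{u}\) and in each \(v_k\), so its minimizer follows from the first-order conditions. This is a sketch of the algebra only, using the positive definiteness of \(\varvec{D}\) from (C4) and \(f(b_k^*)>0\) from (C2), and assuming \(E(\delta _i)>0\); the convexity lemma is what transfers the result to the minimizer of \(\Xi _n\).

$$\begin{aligned} \frac{\partial }{\partial \varvec{u}}\Big (\varvec{\Pi }_n^T\varvec{u}+\frac{1}{2}\varvec{u}^T\varvec{D}\varvec{u}\Big )&=\varvec{\Pi }_n+\varvec{D}\varvec{u}=\varvec{0}\ \Longrightarrow \ \varvec{u}=-\varvec{D}^{-1}\varvec{\Pi }_n,\\ \frac{\partial }{\partial v_k}\Big (v_kz_{n,k}+\frac{1}{2}f(b_k^*)v_k^2E(\delta _i)\Big )&=z_{n,k}+f(b_k^*)E(\delta _i)v_k=0\ \Longrightarrow \ v_k=-\frac{z_{n,k}}{f(b_k^*)E(\delta _i)}. \end{aligned}$$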

Finally, combining this with \(\varvec{\Pi }_n{\mathop {\rightarrow }\limits ^{d}}\varvec{\Pi }_0\) leads to

$$\begin{aligned} \sqrt{n}(\hat{\varvec{\beta }}_{\scriptscriptstyle acc}-\varvec{\beta }^*){\mathop {\rightarrow }\limits ^{d}}N\big (\varvec{0},\varvec{D}^{-1}(\varvec{S}_{\varvec{\phi }}-\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T)\varvec{D}^{-1}\big ). \end{aligned}$$

\(\square \)

Proof of Theorem 3

Write \(\varvec{M}_i=\varvec{M}(\varvec{t}_i,\varvec{\beta }^*,\varvec{\gamma }^*)\). According to the proof of Lemma 6 in Molanes Lopez et al. (2009), it holds that

$$\begin{aligned} \sqrt{n}\begin{pmatrix}\hat{\varvec{\beta }}_{\scriptscriptstyle el}-\varvec{\beta }^*\\ \hat{\varvec{\gamma }}_{\scriptscriptstyle el}-\varvec{\gamma }^*\end{pmatrix} {\mathop {\rightarrow }\limits ^{d}}N(\varvec{0},(\varvec{V}_1^T\varvec{V}_2^{-1}\varvec{V}_1)^{-1}), \end{aligned}$$

where

$$\begin{aligned} \varvec{V}_1=-\begin{pmatrix}\varvec{D} &{} \varvec{0} \\ \varvec{0} &{} \varvec{G}_{\varvec{\gamma }}\end{pmatrix}, \varvec{V}_2=\begin{pmatrix}\varvec{S}_{\varvec{\phi }} &{} \varvec{F}_{\varvec{g}} \\ \varvec{F}_{\varvec{g}}^T &{} \varvec{S}_{\varvec{g}}\end{pmatrix}{\mathop {=}\limits ^{\vartriangle }}\begin{pmatrix}\varvec{B}_{11} &{} \varvec{B}_{12} \\ \varvec{B}_{12}^T &{} \varvec{B}_{22}\end{pmatrix}. \end{aligned}$$

Define \(\varvec{B}_{11.2}=\varvec{B}_{11}-\varvec{B}_{12}\varvec{B}_{22}^{-1}\varvec{B}_{12}^T=\varvec{S}_{\varvec{\phi }}-\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1}\varvec{F}_{\varvec{g}}^T\). According to Lemma 1 (iii),

$$\begin{aligned} \varvec{B}_{11.2}&= \ \varvec{S}_{\varvec{\phi }}-\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T-E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\varvec{S}_{B}^{-1}E(\varvec{U}_{B_i}\varvec{\phi }_i^T),\\ \varvec{V}_1^T\varvec{V}_2^{-1}\varvec{V}_1&=\ \begin{pmatrix}\varvec{D} &{} \varvec{0} \\ \varvec{0} &{} \varvec{G}_{\varvec{\gamma }}^T\end{pmatrix} \begin{pmatrix} \varvec{B}_{11.2}^{-1} &{} -\varvec{B}_{11.2}^{-1}\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1} \\ -\varvec{S}_{\varvec{g}}^{-1}\varvec{F}_{\varvec{g}}^T\varvec{B}_{11.2}^{-1} &{} \varvec{S}_{\varvec{g}}^{-1}+\varvec{S}_{\varvec{g}}^{-1}\varvec{F}_{\varvec{g}}^T\varvec{B}_{11.2}^{-1}\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1} \end{pmatrix} \begin{pmatrix}\varvec{D} &{} \varvec{0} \\ \varvec{0} &{} \varvec{G}_{\varvec{\gamma }}\end{pmatrix}\\&=\ \begin{pmatrix} \varvec{D}\varvec{B}_{11.2}^{-1}\varvec{D} &{} -\varvec{D}\varvec{B}_{11.2}^{-1}\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1}\varvec{G}_{\varvec{\gamma }} \\ -\varvec{G}_{\varvec{\gamma }}^T\varvec{S}_{\varvec{g}}^{-1}\varvec{F}_{\varvec{g}}^T\varvec{B}_{11.2}^{-1}\varvec{D} &{} \varvec{G}_{\varvec{\gamma }}^T(\varvec{S}_{\varvec{g}}^{-1}+\varvec{S}_{\varvec{g}}^{-1}\varvec{F}_{\varvec{g}}^T\varvec{B}_{11.2}^{-1}\varvec{F}_{\varvec{g}}\varvec{S}_{\varvec{g}}^{-1})\varvec{G}_{\varvec{\gamma }} \end{pmatrix}\\&=\ \begin{pmatrix} \varvec{D}\varvec{B}_{11.2}^{-1}\varvec{D} &{} \varvec{D}\varvec{B}_{11.2}^{-1}E(\varvec{\phi }_i\varvec{U}_{B_i}^T) \\ E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\varvec{B}_{11.2}^{-1}\varvec{D} &{} \varvec{S}_B+E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\varvec{B}_{11.2}^{-1}E(\varvec{\phi }_i\varvec{U}_{B_i}^T) \end{pmatrix}\\&{\mathop {=}\limits ^{\vartriangle }} \ \begin{pmatrix} \varvec{H}_{11} &{} \varvec{H}_{12} \\ \varvec{H}_{12}^T &{} \varvec{H}_{22} \end{pmatrix}. \end{aligned}$$

Then it is easy to derive that

$$\begin{aligned} (\varvec{V}_1^T\varvec{V}_2^{-1}\varvec{V}_1)^{-1} =\begin{pmatrix} \varvec{H}_{11}^{-1}+\varvec{H}_{11}^{-1}\varvec{H}_{12}\varvec{H}_{22.1}^{-1}\varvec{H}_{12}^T\varvec{H}_{11}^{-1} &{} -\varvec{H}_{11}^{-1}\varvec{H}_{12}\varvec{H}_{22.1}^{-1} \\ -\varvec{H}_{22.1}^{-1}\varvec{H}_{12}^T\varvec{H}_{11}^{-1} &{} \varvec{H}_{22.1}^{-1} \end{pmatrix}, \end{aligned}$$

where \(\varvec{H}_{22.1}=\varvec{H}_{22}-\varvec{H}_{12}^T\varvec{H}_{11}^{-1}\varvec{H}_{12}=\varvec{S}_B\). Note that

$$\begin{aligned} -\varvec{H}_{11}^{-1}\varvec{H}_{12}\varvec{H}_{22.1}^{-1}&=-\varvec{D}^{-1}E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\varvec{S}_B^{-1},\\ \varvec{H}_{11}^{-1}+\varvec{H}_{11}^{-1}\varvec{H}_{12}\varvec{H}_{22.1}^{-1}\varvec{H}_{12}^T\varvec{H}_{11}^{-1}&=\varvec{D}^{-1}\big (\varvec{B}_{11.2}+E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\varvec{S}_B^{-1}E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\big )\varvec{D}^{-1}\\&=\varvec{D}^{-1}(\varvec{S}_{\varvec{\phi }}-\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T)\varvec{D}^{-1}. \end{aligned}$$
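
The identity \(\varvec{H}_{22.1}=\varvec{S}_B\) used above can be verified directly, assuming \(\varvec{D}\) and \(\varvec{B}_{11.2}\) are symmetric and invertible, so that \(\varvec{H}_{11}^{-1}=\varvec{D}^{-1}\varvec{B}_{11.2}\varvec{D}^{-1}\):

$$\begin{aligned} \varvec{H}_{12}^T\varvec{H}_{11}^{-1}\varvec{H}_{12}&=E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\varvec{B}_{11.2}^{-1}\varvec{D}\,\varvec{D}^{-1}\varvec{B}_{11.2}\varvec{D}^{-1}\,\varvec{D}\varvec{B}_{11.2}^{-1}E(\varvec{\phi }_i\varvec{U}_{B_i}^T)\\&=E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\varvec{B}_{11.2}^{-1}E(\varvec{\phi }_i\varvec{U}_{B_i}^T),\\ \varvec{H}_{22.1}&=\varvec{S}_B+E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\varvec{B}_{11.2}^{-1}E(\varvec{\phi }_i\varvec{U}_{B_i}^T)-E(\varvec{U}_{B_i}\varvec{\phi }_i^T)\varvec{B}_{11.2}^{-1}E(\varvec{\phi }_i\varvec{U}_{B_i}^T)=\varvec{S}_B. \end{aligned}$$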

Therefore

$$\begin{aligned} \sqrt{n}(\hat{\varvec{\beta }}_{\scriptscriptstyle el}-\varvec{\beta }^*){\mathop {\rightarrow }\limits ^{d}}N\big (\varvec{0},\varvec{D}^{-1}(\varvec{S}_{\varvec{\phi }}-\varvec{P}_1\varvec{P}_2^{-1}\varvec{P}_1^T)\varvec{D}^{-1}\big ). \end{aligned}$$

\(\square \)

Cite this article

Sun, J. An improvement on the efficiency of complete-case-analysis with nonignorable missing covariate data. Comput Stat (2020). https://doi.org/10.1007/s00180-020-00964-6

Keywords

  • Missing covariates
  • Missing not at random
  • Conditionally independent
  • Empirical likelihood
  • Composite quantile regression