Penalized weighted composite quantile estimators with missing covariates

Yang, Hu; Liu, Huilan

doi:10.1007/s00362-014-0642-2


Regular Article
Published: 04 November 2014

Volume 57, pages 69–88, (2016)
Statistical Papers


Abstract

In this paper, we propose penalized weighted composite quantile regression estimation for the linear model when the covariates are missing at random. Under some mild conditions, the asymptotic normality, oracle property and Horvitz–Thompson property of the proposed estimators are established. Simulation results and a real data analysis are provided to examine the performance of our methods.


References

Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Guo J, Tang M, Tian M, Zhu K (2013) Variable selection in high-dimensional partially linear additive models for composite quantile regression. Comput Stat Data Anal 65:56–67
Horvitz DG, Thompson DJ (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47:663–685
Jiang R, Qian WM, Zhou ZZ (2012) Variable selection and coefficient estimation via composite quantile regression with randomly censored data. Stat Probab Lett 82:308–317
Kim MO (2007) Quantile regression with varying coefficients. Ann Stat 35:92–108
Knight K (1998) Limiting distributions for L1 regression estimators under general conditions. Ann Stat 26:755–770
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Koenker R, Bassett GS (1978) Regression quantiles. Econometrica 46:33–50
Liang H, Wang S, Robins JM, Carroll RJ (2004) Estimation in partially linear models with missing covariates. J Am Stat Assoc 99:357–367
Liang H (2008) Generalized partially linear models with missing covariates. J Multivar Anal 99:880–895
Tang LJ, Zhou ZG, Wu CC (2012) Weighted composite quantile estimation and variable selection method for censored regression model. Stat Probab Lett 82:653–663
Tang QG (2014) Robust estimation for spatial semiparametric varying coefficient partially linear regression. Stat Pap http://dx.doi.org/10.1007/s00362-014-0629-z
Tibshirani R (1996) Regression shrinkage and selection via the LASSO. J R Stat Soc B 58:267–288
Sherwood B, Wang L, Zhou XH (2013) Weighted quantile regression for analyzing health care cost data with missing covariates. Stat Med 32:4967–4979
Wang CY, Chen HY (2001) Augmented inverse probability weighted estimator for Cox missing covariate regression. Biometrics 57:414–419
Wang H, Li R, Tsai CL (2007) Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika 94:553–568
Wong H, Guo S, Chen M, Ip WC (2009) On locally weighted estimation and hypothesis testing of varying-coefficient models with missing covariates. J Stat Plann Infer 139:2933–2951
Xu DK, Zhang ZZ, Wu LC (2014) Variable selection in high-dimensional double generalized linear models. Stat Pap 55:327–347
Xue L (2013) Estimation and empirical likelihood for single-index models with missing data in the covariates. Comput Stat Data Anal 68:82–97
Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101:1418–1429
Zou H, Yuan M (2008) Composite quantile regression and the oracle model selection theory. Ann Stat 36:1108–1126

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 11171361), Ph.D. Programs Foundation of Ministry of Education of China (Grant No. 20110191110033) and Fundamental Research Funds for the Central Universities (Grant No. CDJXS12101102).

Author information

Authors and Affiliations

College of Mathematics and Statistics, Chongqing University, Chongqing, 401331, People’s Republic of China
Hu Yang & Huilan Liu


Corresponding author

Correspondence to Huilan Liu.

Appendix: Proof of theorems

Proof of Theorem 2.1

Let $\sqrt{n}(\hat{\beta }^{WCQR_{\pi }}-\beta _{0})=\mu $ and $\sqrt{n}(\hat{b}^{WCQR_{\pi }}_{k} -b_{k})=\nu _{k}$, and $\theta =(\mu ,\nu )$. So, $\theta $ is the minimizer of the following criterion:

$$\begin{aligned} L_{n}(\pi ,\theta )=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] . \end{aligned}$$

By Knight (1998), for any $x\ne 0$, we have $\rho _{\tau }(x-y)-\rho _{\tau }(x)=y[I(x<0)-\tau ]+\int _{0}^y[I(x\le t)-I(x\le 0)]dt.$ Thus, we can rewrite $L_{n}(\pi ,\theta )$ as
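The identity quoted from Knight (1998) can be checked numerically. The sketch below is only an illustration: the function names and the sampling ranges are assumptions, and the integral is evaluated in closed form by splitting on the sign of $y$.

```python
import random


def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def knight_integral(x, y):
    """Closed form of the integral of I(x <= t) - I(x <= 0) over t from 0 to y."""
    if y >= 0:
        # The integrand is I(0 < x <= t), which is 1 only for t in [x, y].
        return max(y - x, 0.0) if x > 0 else 0.0
    # For y < 0 the oriented integral equals the length of {t in (y, 0): t < x <= 0}.
    return max(x - y, 0.0) if x <= 0 else 0.0


def knight_rhs(x, y, tau):
    """Right-hand side: y * [I(x < 0) - tau] plus the integral term."""
    return y * ((1.0 if x < 0 else 0.0) - tau) + knight_integral(x, y)


def max_identity_gap(n_draws=10_000, seed=0):
    """Largest absolute gap between both sides over random (x, y, tau)."""
    rng = random.Random(seed)
    gap = 0.0
    for _ in range(n_draws):
        x, y = rng.uniform(-3, 3), rng.uniform(-3, 3)
        tau = rng.uniform(0.05, 0.95)
        lhs = rho(x - y, tau) - rho(x, tau)
        gap = max(gap, abs(lhs - knight_rhs(x, y, tau)))
    return gap
```

In the proof the identity is applied with $x=\varepsilon _{i}-b_{k}$ and $y=(\nu _{k}+X_{i}^{T}\mu )/\sqrt{n}$, which yields the linear term and the integral term in the decomposition of $L_{n}(\pi ,\theta )$.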

$$\begin{aligned} L_{n}(\pi ,\theta )&=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}} [I(\varepsilon _{i}< b_{k})-\tau _{k}]\\ {}&\quad +\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\\&=\sum _{k=1}^Kz_{n,k}\nu _{k}+W_{n,K}\mu +\sum _{k=1}^KB_{n,k}, \end{aligned}$$

where

$$\begin{aligned} z_{n,k}&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}[I(\varepsilon _{i}< b_{k})-\tau _{k}],\\ W_{n,K}&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}X_{i}^{T}\sum _{k=1}^K[I(\varepsilon _{i}< b_{k})-\tau _{k}],\\ B_{n,k}&=\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt. \end{aligned}$$

By the Cramér–Wald device and the central limit theorem, $z_{n,k}$ and $W_{n,K}$ converge in distribution to $z_{k}$ and $W_{1}$, respectively, where $z_{k}$ is a normal random variable with mean $0$ and $W_{1}$ is a $p$-dimensional normal random vector with mean $\mathbf 0 $ and variance–covariance matrix

$$\begin{aligned} \Sigma _{1}=E\left( \frac{1}{\pi (Y)}XX^{T}\left[ \sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})\right] ^2\right) . \end{aligned}$$

(5.1)

where $\pi (Y)=Pr(V=1|Y,X)=Pr(V=1|Y)$. Therefore

$$\begin{aligned} \sum _{k=1}^Kz_{n,k}\nu _{k}+W_{n,K}\mu \rightarrow _{d}\sum _{k=1}^Kz_{k}\nu _{k}+W_{1}\mu . \end{aligned}$$
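The limits $z_{k}$ and $W_{1}$ are centered because the weights $V_{i}/\pi _{i}$ undo the selection induced by missingness. A small Monte Carlo sketch of this point follows. Everything specific in it is an assumption made for illustration: standard normal errors, a single uniform covariate, and a hypothetical selection probability $\pi (Y)=0.2+0.7/(1+e^{-Y})$. It compares the weighted average of $I(\varepsilon _{i}<b_{k})-\tau _{k}$ with the unweighted complete-case average.

```python
import math
import random
from statistics import NormalDist


def simulate_scores(n=50_000, tau=0.5, beta=1.0, seed=1):
    """Return (IPW mean, complete-case mean) of I(eps < b) - tau."""
    rng = random.Random(seed)
    b = NormalDist().inv_cdf(tau)  # tau-quantile of the assumed N(0, 1) error
    ipw_sum, cc_sum, n_obs = 0.0, 0.0, 0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        y = beta * x + eps
        # Hypothetical selection probability depending on Y only, bounded below by 0.2.
        pi = 0.2 + 0.7 / (1.0 + math.exp(-y))
        v = 1 if rng.random() < pi else 0
        score = (1.0 if eps < b else 0.0) - tau
        ipw_sum += v / pi * score  # inverse-probability-weighted contribution
        cc_sum += v * score        # complete-case contribution, no reweighting
        n_obs += v
    return ipw_sum / n, cc_sum / n_obs


ipw_mean, cc_mean = simulate_scores()
```

Under these assumptions the weighted mean should be close to zero, while the complete-case mean is pulled away from zero because observations with large $Y$ are retained more often.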

Let $ B_{ni,k}=\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt$. For any $\eta >0$, we have

$$\begin{aligned}{}[B_{ni,k}]^{2}=\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right\} +\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}<\eta \right) \right\} \end{aligned}$$

On the one hand, we have

$$\begin{aligned}&nE\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right\} \\&\quad \le nE\left\{ \frac{V_{i}}{\pi _{i}^{2}}\left[ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}2dt\right] ^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right\} \\&\quad \le \frac{4}{M^2}E\left[ |\nu _{k}+X_{i}^{T}\mu |^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}\ge \eta \right) \right] \rightarrow 0, \quad as \quad n\rightarrow \infty . \end{aligned}$$

On the other hand, we have

$$\begin{aligned}&nE\left\{ [B_{ni,k}]^{2}I\left( \frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}<\eta \right) \right\} \\&\quad \le nE\left\{ \frac{2V_{i}}{\pi _{i}^2}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}dt\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\right. \\&\quad \quad \left. \times I(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad \le \frac{2n\eta }{M^2}E\left\{ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dtI(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad =\frac{2n\eta }{M^2}E_{X}\left\{ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[F(b_{k}+t|X)-F(b_{k}|X)]dtI(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad =\frac{2n\eta }{M^2}E_{X}\left\{ \int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}tf(b_{k}|X)dtI(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\right\} \\&\quad \le \frac{D\eta }{M^2}E|\nu _{k}+X_{i}^{T}\mu |^2I(\nu _{k}+X_{i}^{T}\mu <\sqrt{n}\eta )\le \frac{D\eta }{M^2}E|\nu _{k}+X_{i}^{T}\mu |^2\rightarrow 0,\quad as \quad \eta \rightarrow 0. \end{aligned}$$

Both bounds use the fact that $E|\nu _{k}+X_{i}^{T}\mu |^2$ is bounded. Letting $ n\rightarrow \infty $ and then $\eta \rightarrow 0$, it follows that

$$\begin{aligned}&Var(B_{n,k})=\sum _{i=1}^nVar(B_{ni,k})\le nE(B_{ni,k})^2\rightarrow 0, \end{aligned}$$

Furthermore,

$$\begin{aligned} E[B_{n,k}]&=E\left[ \sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[I(\varepsilon _{i}\le b_{k}+t)-I(\varepsilon _{i}\le b_{k})]dt\right] \\&=E_{X}\left[ \sum _{i=1}^n\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}[F(b_{k}+t|X)-F(b_{k}|X)]dt\right] \\&=E_{X}\left[ \sum _{i=1}^n\int _{0}^{\frac{\nu _{k}+X_{i}^{T}\mu }{\sqrt{n}}}tf(b_{k}|X)dt\right] +o_{p}(1)\\&=\frac{1}{2}E_{X}(f(b_{k}|X))\nu _{k}^2+\frac{1}{2}\mu ^{T}E_{X}[f(b_{k}|X)XX^{T}]\mu +o_{p}(1). \end{aligned}$$

Therefore, we get

$$\begin{aligned}&B_{n,k}\!=\!E(B_{n,k})+o_{p}(1)=\frac{1}{2}E_{X}(f(b_{k}|X))\nu _{k}^2+\frac{1}{2}\mu ^{T}E_{X}[f(b_{k}|X)XX^{T}]\mu +o_{p}(1). \end{aligned}$$

Letting $g_{k}=E_{X}(f(b_{k}|X))$ and $C=\sum _{k=1}^KE_{X}[f(b_{k}|X)XX^{T}]$, we have

$$\begin{aligned}&L_{n}(\pi ,\theta )=\frac{1}{2}\mu ^{T}C\mu +W_{n,K}\mu +\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k}^2+\sum _{k=1}^Kz_{n,k}\nu _{k}+o_{p}(1), \end{aligned}$$

and

$$\begin{aligned}&L_{n}(\pi ,\theta )\rightarrow _{d}\frac{1}{2}\mu ^{T}C\mu +W_{1}\mu +\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k}^2+\sum _{k=1}^Kz_{k}\nu _{k}. \end{aligned}$$

Since $L_{n}(\pi ,\theta )$ is a convex function, following Knight (1998) and Koenker (2005), we have

$$\begin{aligned}&\mu \rightarrow _{d}N(0,C^{-1}\Sigma _{1}C^{-1}) \end{aligned}$$

where $\Sigma _{1}$ is defined in (5.1). $\square $

Proof of Theorem 2.2

Let $\sqrt{n}(\hat{\beta }^{AWCQR_{\pi }}-\beta _{0})=\mu ^{*}$, $\sqrt{n}(\hat{b}^{AWCQR_{\pi }}_{k} -b_{k})=\nu _{k}^{*}$, and $\theta ^{*}=(\mu ^{*},\nu ^{*})$. $\theta ^{*}$ is the minimizer of the following criterion:

$$\begin{aligned} \mathbf L _{n}(\pi ,\theta ^{*})&=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k}^{*}+X_{i}^{T}\mu ^{*}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] \\&\ \quad +\sum _{j=1}^p\lambda _{n}\frac{(|\beta _{j}+\frac{\mu _{j}^{*}}{\sqrt{n}}|-|\beta _{j}|)}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}\\&=L_{n}(\pi ,\theta ^{*})+\sum _{j=1}^p\lambda _{n}\frac{(|\beta _{j}+\frac{\mu _{j}^{*}}{\sqrt{n}}|-|\beta _{j}|)}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}. \end{aligned}$$

Similar to the proof of Theorem 4.1 in Zou and Yuan (2008), the second term above can be expressed as

$$\begin{aligned}&\frac{\lambda _{n}}{\sqrt{n}|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}\sqrt{n}\Big [\Big |\beta _{j}+\frac{\mu _{j}^{*}}{\sqrt{n}}\Big |-|\beta _{j}|\Big ]\rightarrow _{p}\ \left\{ \begin{array}{ll} 0, &amp; \text{if } \beta _{j}\ne 0,\\ 0, &amp; \text{if } \beta _{j}=0 \text{ and } \mu _{j}^{*}= 0,\\ \infty , &amp; \text{if } \beta _{j}=0 \text{ and } \mu _{j}^{*}\ne 0. \end{array} \right. \end{aligned}$$
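The three regimes of the penalty term can be illustrated with a deterministic toy calculation. The rate $\lambda _{n}=n^{1/4}$ and the stand-in initial estimate $\hat{\beta }_{j}=\beta _{j}+c/\sqrt{n}$ below are assumptions chosen only for illustration; the conditions actually imposed on $\lambda _{n}$ are those of Theorem 2.2, which are not restated in this appendix.

```python
import math


def penalty_term(n, beta_j, mu_j, c=1.0):
    """lambda_n * (|beta_j + mu_j / sqrt(n)| - |beta_j|) / |beta_hat_j|**2."""
    lam = n ** 0.25                       # hypothetical tuning-parameter rate
    beta_hat = beta_j + c / math.sqrt(n)  # stand-in for a root-n-consistent initial estimate
    return lam * (abs(beta_j + mu_j / math.sqrt(n)) - abs(beta_j)) / beta_hat ** 2


sizes = [10 ** 2, 10 ** 4, 10 ** 6]
# Nonzero coefficient: the term shrinks toward 0 as n grows.
nonzero = [penalty_term(n, beta_j=1.0, mu_j=1.0) for n in sizes]
# Zero coefficient with nonzero mu_j: the term grows without bound.
zero = [penalty_term(n, beta_j=0.0, mu_j=1.0) for n in sizes]
```

With these choices the term behaves like $n^{-1/4}$ when $\beta _{j}\ne 0$ and like $n^{3/4}$ when $\beta _{j}=0$ and $\mu _{j}^{*}\ne 0$, matching the first and third cases of the display.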

Let $\mu ^{*}=(\mu _{1}^{*T},\mu _{2}^{*T})^T$, where $\mu _{1}^{*}$ contains the first $q$ elements of $\mu ^{*}$. Using the same arguments as in Knight (1998) and Koenker (2005), we have $\mu _{2}^{*}\rightarrow _{p}0$ and $\mu _{1}^{*}\rightarrow _{d}N(0,[C^{-1}\Sigma _{1}C^{-1}]_{\Lambda \Lambda }).$ Thus, asymptotic normality is proven.

Next, we prove the consistency part. Let $\hat{\Lambda }_{n}=\{j:\hat{\beta }^{AWCQR_{\pi }}_{j}\ne 0\}$ and $\Lambda =\{j:\beta _{j}\ne 0\}$. For all $j\in \Lambda $, the asymptotic normality implies $P(j\in \hat{\Lambda }_{n})\rightarrow 1.$ It therefore suffices to show that, for all $j\notin \Lambda $, $P(j\in \hat{\Lambda }_{n})\rightarrow 0.$ Note that

$$\begin{aligned}&(\hat{b}^{AWCQR_{\pi }}_{1},\ldots ,\hat{b}^{AWCQR_{\pi }}_{K},\hat{\beta }^{AWCQR_{\pi }})\\&\quad =\mathop {\mathrm {argmin}}_{\begin{array}{c} b_1,\ldots ,b_K,\beta \end{array}} \sum _{k=1}^K \sum _{i=1}^n \frac{V_{i}}{\pi _{i}}\rho _{\tau _{k}}(Y_{i}-X_{i}^{T}\beta -b_{k}) +\sum _{j=1}^p\lambda _{n}\frac{\mid \beta _{j}\mid }{\mid \hat{\beta }^{WCQR_{\pi }}_{j}\mid ^2}. \end{aligned}$$
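To make the criterion in the display concrete, the following minimal sketch evaluates it for given parameter values. It is not the authors' implementation: the function and argument names are hypothetical, the data are plain Python lists, and no minimization is attempted.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def penalized_wcqr_objective(beta, b, X, Y, V, pi, taus, lam, beta_init):
    """Weighted composite quantile loss plus the adaptive penalty.

    beta      : slope vector (length p)
    b         : one intercept b_k per quantile level in taus
    X, Y      : covariate rows and responses; rows of X with V_i = 0 are never read
    V, pi     : observation indicators and selection probabilities
    lam       : tuning parameter lambda_n
    beta_init : unpenalized initial estimate used in the penalty weights
                (assumed to have no exact zeros)
    """
    fit = 0.0
    for x_i, y_i, v_i, pi_i in zip(X, Y, V, pi):
        if not v_i:
            continue  # covariates missing: the weight V_i / pi_i is zero
        linear = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
        for tau_k, b_k in zip(taus, b):
            fit += check_loss(y_i - linear - b_k, tau_k) / pi_i
    penalty = sum(lam * abs(b_j) / abs(b0_j) ** 2
                  for b_j, b0_j in zip(beta, beta_init))
    return fit + penalty


# One observed case and one case with a missing covariate row.
value = penalized_wcqr_objective(
    beta=[1.0], b=[0.0], X=[[1.0], [None]], Y=[2.0, 5.0],
    V=[1, 0], pi=[0.5, 0.8], taus=[0.5], lam=4.0, beta_init=[2.0],
)
```

In the toy call the observed case contributes $\rho _{0.5}(1)/0.5=1$ and the penalty contributes $4\cdot 1/2^{2}=1$, so the objective equals 2.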

If $j\in \hat{\Lambda }_{n}$, then we must have

$$\begin{aligned}&\sum _{k=1}^K \sum _{i=1}^n \frac{V_{i}}{\pi _{i}}\rho _{\tau _{k}}\left( Y_{i}-X_{ij}\hat{\beta }^{AWCQR_{\pi }}_{j}-\hat{b}^{AWCQR_{\pi }}_{k}\right) +\lambda _{n}\frac{\mid \hat{\beta }^{AWCQR_{\pi }}_{j}\mid }{\mid \hat{\beta }^{WCQR_{\pi }}_{j}\mid ^2}\\&\quad \le \sum _{k=1}^K \sum _{i=1}^n \frac{V_{i}}{\pi _{i}}\rho _{\tau _{k}}\left( Y_{i}-\hat{b}^{AWCQR_{\pi }}_{k}\right) . \end{aligned}$$

Using the fact that $\left| \frac{\rho _{\tau }(x_{1})-\rho _{\tau }(x_{2})}{x_{1}-x_{2}}\right| \le \max (\tau ,1-\tau )<1$, we have $\frac{\lambda _{n}}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}<\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}|X_{ij}|=K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}|X_{ij}|.$ So $P(j\in \hat{\Lambda }_{n})\le P\big (\frac{\lambda _{n}}{|\hat{\beta }^{WCQR_{\pi }}_{j}|^2}<K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}|X_{ij}|\big )\rightarrow 0.$ This completes the proof of Theorem 2.2. $\square $

Proof of Theorem 2.3

Let $\sqrt{n}(\hat{\beta }^{WCQR_{\hat{\pi }}}-\beta _{0})=\mu ^{**}$, $\sqrt{n}(\hat{b}^{WCQR_{\hat{\pi }}}_{k} -b_{k})=\nu _{k} ^{**}$, and $\theta ^{**}=(\mu ^{**},\nu ^{**})$. $\theta ^{**}$ is the minimizer of the following criterion:

$$\begin{aligned} L_{n}(\widehat{\pi },\theta ^{**})=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\widehat{\pi _{i}}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] . \end{aligned}$$

Let $Q_{n}(\pi ,\theta ^{**})=\sum _{k=1}^K\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) \right] $. Then

$$\begin{aligned} L_{n}(\widehat{\pi },\theta ^{**})&=[Q_{n}(\pi ,\theta ^{**})-Q_{n}(\pi ,0)]+[Q_{n}(\widehat{\pi },\theta ^{**})-Q_{n}(\pi ,\theta ^{**})]\\&\quad \ -[Q_{n}(\widehat{\pi },0)-Q_{n}(\pi ,0)]\\&=I_{1}+I_{2}-I_{3}, \end{aligned}$$

where

$$\begin{aligned}&I_{1}=L_{n}(\pi ,\theta ^{**}),\\&I_{2}=\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) \left( \frac{1}{\widehat{\pi _{i}}}-\frac{1}{\pi _{i}}\right) \right] ,\\&I_{3}=\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\left( \frac{1}{\widehat{\pi _{i}}}-\frac{1}{\pi _{i}}\right) \right] . \end{aligned}$$

Using the fact that $\max _{1\le i \le n}|\hat{\pi _{i}}-\pi _{i}|=O_{p}(h^q+\sqrt{\log n/(nh)})=O_{p}(d_{n})$, we can see that

$$\begin{aligned} I_{2}\!-\!I_{3}&=\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \!\rho _{\tau _{k}}\left( \!\varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\!\right] \left[ \left( \frac{1}{\widehat{\pi _{i}}}-\frac{1}{\pi _{i}}\right) \!\right] \\&=-\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \!\rho _{\tau _{k}}\left( \!\varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) \!-\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\!\right] \left[ \!\frac{(\hat{\pi _{i}}-\pi _{i})}{\pi _{i}^2}\!\right] \\&\quad \ +O_{p}(\sqrt{n}d_{n}^2). \end{aligned}$$

Recalling the definition of $\widehat{\pi }(\cdot )$, we have

$$\begin{aligned} \widehat{\pi }(y)-\pi (y)&=\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-y)}{\sum _{j=1}^nL_{h}(Y_{j}-y)}+\frac{\sum _{j=1}^n(\pi _{j}-\pi (y))L_{h}(Y_{j}-y)}{\sum _{j=1}^nL_{h}(Y_{j}-y)}\\&=\frac{1}{nhf_{Y}(y)}\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-y)+O_{p}(h^q)+o_{p}(n^{-1/2}). \end{aligned}$$
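The first line of the display corresponds to a kernel-weighted average of the indicators, $\widehat{\pi }(y)=\sum _{j}V_{j}L_{h}(Y_{j}-y)/\sum _{j}L_{h}(Y_{j}-y)$. A minimal sketch of such an estimator is given below. The Gaussian kernel, the bandwidth and the simulated selection probability are assumptions made for illustration; the kernel order $q$ required by the theory is not enforced here.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as an assumed kernel L."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def pi_hat(y, Y, V, h):
    """Kernel estimate sum_j V_j L((Y_j - y)/h) / sum_j L((Y_j - y)/h).

    Any constant scaling of L_h cancels in the ratio.
    """
    weights = [gaussian_kernel((y_j - y) / h) for y_j in Y]
    return sum(w * v for w, v in zip(weights, V)) / sum(weights)


def simulate(n=5000, seed=2):
    """Draw Y ~ N(0, 1) and V ~ Bernoulli(pi(Y)) with a hypothetical pi."""
    rng = random.Random(seed)
    Y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    pi = [0.2 + 0.7 / (1.0 + math.exp(-y)) for y in Y]
    V = [1 if rng.random() < p else 0 for p in pi]
    return Y, V


Y, V = simulate()
estimate_at_zero = pi_hat(0.0, Y, V, h=0.3)  # the assumed true value is pi(0) = 0.55
```

Only $Y_{j}$ and $V_{j}$ enter the estimator, so it can be computed even though some covariates are missing.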

Therefore,

$$\begin{aligned} I_{2}\!-\!I_{3}&= -\sum _{k=1}^K\sum _{i=1}^nV_{i}\Bigg [\rho _{\tau _{k}}\Big (\varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\Big )\nonumber \\&-\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\Bigg ]\Bigg [\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}+O_{p}(h^q)\nonumber \\&+o_{p}\left( n^{-\frac{1}{2}}\right) \Bigg ]+O_{p}(\sqrt{n}d_{n}^2)\nonumber \\&= -\sum _{k=1}^K\sum _{i=1}^nV_{i}\left[ \rho _{\tau _{k}}\left( \varepsilon _{i}-b_{k}-\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}\right) -\rho _{\tau _{k}}(\varepsilon _{i}-b_{k})\right] \nonumber \\&\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}+o_{p}(1)\nonumber \\&= -\sum _{k=1}^K\sum _{i=1}^nV_{i}\frac{\nu _{k} ^{**}+X_{i}^{T}\mu ^{**}}{\sqrt{n}}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\nonumber \\&\frac{\sum _{j=1}^n(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}+o_{p}(1)\nonumber \\&= -\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^nV_{i}\frac{X_{i}^{T}\mu ^{**}}{\sqrt{n}}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}\nonumber \\&-\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^nV_{i}\frac{\nu _{k} ^{**}}{\sqrt{n}}[I(\varepsilon _{i}\!<\!b_{k})\!-\!\tau _{k}]\frac{(V_{j}\!-\!\pi _{j})L_{h}(Y_{j}\!-\!Y_{i})}{nhf_{Y}(Y_{i})\pi _{i}^2}\!+\!o_{p}(1).\qquad \quad \ \end{aligned}$$

(5.2)

Furthermore,

$$\begin{aligned}&\sum _{k=1}^K\sum _{j=1}^n\sum _{i=1}^nV_{i}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T}\nonumber \\&\quad \ =\sum _{i=1}^n\sum _{j=1}^n(V_{i}-\pi _{i})\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T}\nonumber \\&\ \qquad +\sum _{i=1}^n\sum _{j=1}^n\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}}X_{i}^{T}. \end{aligned}$$

(5.3)

The first term of (5.3) can be rewritten as

$$\begin{aligned}&\sum _{i=1}^n\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{i}-\pi _{i})^2L(0)}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T} +\sum _{i\ne j}(V_{i}-\pi _{i})\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\\&\ \ \qquad \frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}X_{i}^{T}\\&\quad =O_{p}\left( \frac{1}{\sqrt{n}h}\right) +O_{p}\left( \frac{1}{\sqrt{nh}}\right) =o_{p}(1). \end{aligned}$$

The second term of (5.3) is

$$\begin{aligned}&\sum _{i=1}^n\sum _{j=1}^n\sum _{k=1}^K[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}}X_{i}^{T}\\&\quad \ =\frac{1}{\sqrt{n}}\sum _{j=1}^n\frac{(V_{j}-\pi _{j})}{\pi _{j}}E\left[ X_{j}^{T}\sum _{k=1}^K(I(\varepsilon _{j}<b_{k})-\tau _{k})|Y_{j}\right] +o_{p}(1). \end{aligned}$$

In addition, the second term of (5.2) can be rewritten as

$$\begin{aligned}&\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^nV_{i}[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}\nu _{k} ^{**}\\&\quad \ =\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^n(V_{i}-\pi _{i})[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}^2}\nu _{k} ^{**}\\&\quad \ \quad +\sum _{k=1}^K\sum _{i=1}^n\sum _{j=1}^n[I(\varepsilon _{i}<b_{k})-\tau _{k}]\frac{(V_{j}-\pi _{j})L_{h}(Y_{j}-Y_{i})}{n^{\frac{3}{2}}hf_{Y}(Y_{i})\pi _{i}}\nu _{k} ^{**}\\&\quad \ =\sum _{k=1}^K\frac{1}{\sqrt{n}}\sum _{j=1}^n\frac{(V_{j}-\pi _{j})}{\pi _{j}}E[(I(\varepsilon _{j}<b_{k})-\tau _{k})|Y_{j}]\nu _{k} ^{**}+o_{p}(1). \end{aligned}$$

So, we have

$$\begin{aligned} L_{n}(\widehat{\pi },\theta ^{**})&=\frac{1}{2}\mu ^{**T}\left\{ \sum _{k=1}^KE_{X}[f(b_{k}|X)XX^{T}]\right\} \mu ^{**}+\Psi _{nk}\mu ^{**}+\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k} ^{**2}\\&\quad \ +\sum _{k=1}^K\Phi _{nk}\nu _{k} ^{**}+o_{p}(1), \end{aligned}$$

where

$$\begin{aligned} \Psi _{nk}&=\left\{ \frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}X_{i}^{T}\sum _{k=1}^K[I(\varepsilon _{i}\le b_{k})-\tau _{k}]\right. \\&\qquad \ \left. -\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{(V_{i}-\pi _{i})}{\pi _{i}}E\left[ X_{i}^{T}\sum _{k=1}^K(I(\varepsilon _{i}<b_{k})-\tau _{k})|Y_{i}\right] \right\} ,\\ \Phi _{nk}&=\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{V_{i}}{\pi _{i}}[I(\varepsilon _{i}\le b_{k})-\tau _{k}]-\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{(V_{i}-\pi _{i})}{\pi _{i}}E[(I(\varepsilon _{i}<b_{k})-\tau _{k})|Y_{i}]. \end{aligned}$$

By the central limit theorem, $\Phi _{nk}$ converges in distribution to $z_{k}^{\prime }$, where $z_{k}^{\prime }$ is a normal random variable with mean $0$, and $\Psi _{nk}$ converges in distribution to $W_{2}$, where $W_{2}$ is a normal random vector with mean $\mathbf 0 $ and covariance matrix $\Sigma _{2}.$

Here

$$\begin{aligned} \Sigma _{2}&=\mathrm {cov}\left( \frac{V_{i}}{\pi _{i}}X_{i}^{T}\sum _{k=1}^K[I(\varepsilon _{i}\le b_{k})-\tau _{k}]\right. \\&\quad \ \left. -\frac{(V_{i}-\pi _{i})}{\pi _{i}}E\left[ X_{i}^{T}\sum _{k=1}^K(I(\varepsilon _{i}<b_{k})-\tau _{k})|Y_{i}\right] \right) \\&=E\left( \frac{1}{\pi (Y)}XX^{T}\left[ \sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})\right] ^2\right) \\&\quad \ -E\left( \frac{1-\pi (Y)}{\pi (Y)}E\left[ X\sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})|Y\right] ^{\otimes 2}\right) . \end{aligned}$$

(5.4)
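A short worked consequence of (5.1) and (5.4), added here as an illustration rather than taken from the original proof: writing $\psi =\sum _{k=1}^K(I(\varepsilon <b_{k})-\tau _{k})$ (a shorthand introduced only for this remark), for any fixed vector $a\in \mathbb {R}^{p}$ we get

$$\begin{aligned} a^{T}(\Sigma _{1}-\Sigma _{2})a=E\left( \frac{1-\pi (Y)}{\pi (Y)}\left\{ a^{T}E[X\psi |Y]\right\} ^{2}\right) \ge 0, \end{aligned}$$

because $0<\pi (Y)\le 1$. Hence $\Sigma _{1}-\Sigma _{2}$ is positive semidefinite and, provided $C$ is invertible, $C^{-1}\Sigma _{1}C^{-1}-C^{-1}\Sigma _{2}C^{-1}$ is positive semidefinite as well.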

Hence, we have

$$\begin{aligned}&L_{n}(\widehat{\pi },\theta ^{**})\rightarrow _{d}\frac{1}{2}\mu ^{**T}C\mu ^{**}+W_{2}^{T}\mu ^{**}+\frac{1}{2}\sum _{k=1}^Kg_{k}\nu _{k} ^{**2}+\sum _{k=1}^Kz^{\prime }_{k}\nu _{k} ^{**}. \end{aligned}$$

Arguing as in Knight (1998) and Koenker (2005), we have

$$\begin{aligned} \mu ^{**}\rightarrow _{d}N(0,C^{-1}\Sigma _{2}C^{-1}), \end{aligned}$$

where $\Sigma _{2}$ is defined in (5.4). This proves the theorem. $\square $

Proof of Theorem 2.4

Since the proof is similar to that of Theorem 2.2, we omit it here. $\square $


About this article

Cite this article

Yang, H., Liu, H. Penalized weighted composite quantile estimators with missing covariates. Stat Papers 57, 69–88 (2016). https://doi.org/10.1007/s00362-014-0642-2


Received: 16 June 2014
Revised: 09 October 2014
Published: 04 November 2014
Issue Date: March 2016
DOI: https://doi.org/10.1007/s00362-014-0642-2
