Abstract
This paper focuses on the problem of estimation and variable selection for quantile regression (QR) of partially linear model (PLM) where the response is subject to random left truncation. We propose a three-stage estimation procedure for parametric and nonparametric parts based on the weights which are random quantities and determined by the product-limit estimates of the distribution function of truncated variable. The estimators obtained in the second and third stages are more efficient than the initial estimators in the first stage. Furthermore, we propose a variable selection procedure for the QR of PLM by combining the estimation method with the smoothly clipped absolute deviation penalty to get sparse estimation of the regression parameter. The oracle properties of the variable selection approach are established. Simulation studies are conducted to examine the performance of our estimators and variable selection method.
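The random weights mentioned above are built from the Lynden-Bell (1971) product-limit estimator of the truncation distribution (see also Woodroofe 1985). As a minimal illustrative sketch, not the authors' code (the function name and interface are ours), the estimator \(G_n\) for left-truncated pairs \((T_i,Y_i)\) observed when \(T_i\le Y_i\) can be computed as:

```python
import numpy as np

def lynden_bell_G(T, Y, t):
    """Lynden-Bell product-limit estimate of G, the d.f. of the
    left-truncation variable, evaluated at the points in t.
    Observed pairs (T_i, Y_i) satisfy T_i <= Y_i."""
    T, Y = np.asarray(T, float), np.asarray(Y, float)
    n = len(T)

    # C_n(s) = (1/n) * #{i : T_i <= s <= Y_i}, the "at risk" proportion
    def C(s):
        return np.mean((T <= s) & (s <= Y))

    t = np.atleast_1d(np.asarray(t, float))
    out = np.empty(len(t))
    for j, tj in enumerate(t):
        # G_n(t) = prod over {i : T_i > t} of (1 - 1/(n * C_n(T_i)))
        factors = np.array([1.0 - 1.0 / (n * C(Ti)) for Ti in T[T > tj]])
        out[j] = np.prod(factors)  # empty product = 1
    return out
```

In the weighted QR objective of the paper, each observation would then receive the weight \(1/G_n(Y_i)\), following Zhou (2011).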
References
Engle R, Granger C, Rice J, Weiss A (1986) Nonparametric estimates of the relation between weather and electricity sales. J Am Stat Assoc 81:310–320
Fan JQ, Li RZ (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Fuller WA (1987) Measurement error models. Wiley, New York
Geyer CJ (1994) On the asymptotics of constrained M-estimation. Ann Stat 22:1993–2010
He SY, Yang GL (1998) Estimation of the truncation probability in the random truncation model. Ann Stat 26:1011–1027
He SY, Yang GL (2003) Estimation of regression parameters with left truncated data. J Stat Plan Inference 117:99–122
Honda T (2004) Quantile regression in varying coefficient models. J Stat Plan Inference 121:113–125
Jiang R, Qian WM, Zhou ZG (2012) Variable selection and coefficient estimation via composite quantile regression with randomly censored data. Stat Prob Lett 82:308–317
Kai B, Li RZ, Zou H (2010) Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. J R Stat Soc B 72:49–69
Kai B, Li RZ, Zou H (2011) New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann Stat 39:305–332
Kim MO (2007) Quantile regression with varying coefficients. Ann Stat 35:92–108
Knight K (1998) Limiting distributions for \(l_1\) regression estimators under general conditions. Ann Stat 26:755–770
Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46:33–50
Lemdani M, Ould-Saïd E, Poulin P (2009) Asymptotic properties of a conditional quantile estimator with randomly truncated data. J Multivar Anal 100:546–559
Liang HY, Baek JI (2016) Asymptotic normality of conditional density estimation with left-truncated and dependent data. Stat Pap 57:1–20
Liang HY, Liu AA (2013) Kernel estimation of conditional density with truncated, censored and dependent data. J Multivar Anal 120:40–58
Lv YH, Zhang RQ, Zhao WH, Liu JC (2014) Quantile regression and variable selection for the single-index model. J Appl Stat 41:1565–1577
Lv YH, Zhang RQ, Zhao WH, Liu JC (2015) Quantile regression and variable selection of partial linear single-index model. Ann Inst Stat Math 67:375–409
Lynden-Bell D (1971) A method of allowing for known observational selection in small samples applied to 3CR quasars. Mon Not R Astron Soc 155:95–118
Mack YP, Silverman BW (1982) Weak and strong uniform consistency of kernel regression estimators. Prob Theory Relat Fields 61:405–415
Neocleous T, Portnoy S (2009) Partially linear censored quantile regression. Lifetime Data Anal 15:357–378
Ould-Saïd E, Lemdani M (2006) Asymptotic properties of a nonparametric regression function estimator with randomly truncated data. Ann Inst Stat Math 58:357–378
Stute W, Wang JL (2008) The central limit theorem under random truncation. Bernoulli 14:604–622
Wang JF, Liang HY, Fan GL (2013) Local polynomial quasi-likelihood regression with truncated and dependent data. Statistics 47:744–761
Woodroofe M (1985) Estimating a distribution function with truncated data. Ann Stat 13:163–177
Wu YC, Liu YF (2009) Variable selection in quantile regression. Stat Sin 19:801–817
Yu K, Jones MC (1998) Local linear quantile regression. J Am Stat Assoc 93:228–237
Yu K, Lu YZ, Stander J (2003) Quantile regression: applications and current research areas. Statistician 52:331–350
Zhou WH (2011) A weighted quantile regression for randomly truncated data. Comput Stat Data Anal 55:554–566
Acknowledgements
The authors thank the referees for their careful reading of the manuscript and for their constructive comments and suggestions. This research was supported by the National Natural Science Foundation of China (11371321, 11401006), the Project of Humanities and Social Science Foundation of Ministry of Education (15YJC910006), the National Statistical Science Research Program of China (2015LY55, 2016LY80), Anhui Provincial Higher Education Promotion Program Natural Science General Project (TSKJ2015B22) and Zhejiang Provincial Key Research Base for Humanities and Social Science Research (Statistics 1020XJ3316004G).
Appendix
Before we present the proofs of the theorems, we first state some regularity conditions. They are also assumed in Zhou (2011) and Kai et al. (2011). Let \(\delta _n=\Big (\frac{\log (1/h)}{nh}\Big )^{1/2}\).
(C1) The kernel function \(K(\cdot )\) is a symmetric continuous density with compact support, satisfies a first-order Lipschitz condition, and \(\int ^{\infty }_{-\infty }u^2K(u)du<\infty \), \(\int ^{\infty }_{-\infty }u^jK^2(u)du<\infty \), \(j=0,1,2\).

(C2) F and G are continuous and \(a_G\le a_F\).

(C3) The random variable W has bounded support \(\mathcal {W}\) and its density function \(f_W(\cdot )\) is positive and has a second derivative.

(C4) \(F_{\varepsilon }(0|X,W)=\tau \) for all (X, W), the conditional density \(f_\varepsilon (\cdot |X,W)\) has a continuous and uniformly bounded derivative, and \(f_\varepsilon (\cdot |X,W)\) is bounded away from zero.

(C5) The matrices \(C_2(w)\) and A are non-singular for all \(w\in \mathcal {W}\).

(C6) The function \(g(\cdot )\) has a continuous and bounded second derivative.
Lemma A.1
Let \((X_1,Y_1), \ldots , (X_n,Y_n)\) be independent and identically distributed random vectors. Assume that \(E|Y|^s<\infty \) and \(\sup _x\int |y|^sf(x,y)dy<\infty \), where f denotes the joint density of (X, Y). Let K be a bounded positive function with bounded support, satisfying a Lipschitz condition. Then
provided that \(n^{2\varepsilon -1}h\rightarrow \infty \) for some \(\varepsilon <1-s^{-1}\).
Lemma A.1 follows from the result by Mack and Silverman (1982).
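For the reader's convenience, the display omitted from the statement of Lemma A.1 is, in the standard Mack and Silverman (1982) form (a reconstruction using the \(\delta _n\) defined above, not checked against the published statement):

\[\sup _{x}\bigg |\frac{1}{nh}\sum _{i=1}^{n}\Big \{K\Big (\frac{x-X_i}{h}\Big )Y_i-E\Big [K\Big (\frac{x-X_i}{h}\Big )Y_i\Big ]\Big \}\bigg |=O(\delta _n)\quad a.s.,\]

with the supremum taken over the (compact) support of the design density.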
Lemma A.2
(Lv et al. 2015) Suppose \(A_n(s)\) is convex and can be represented as \(\frac{1}{2}s^TVs+U_n^Ts+C_n+r_n(s)\), where V is symmetric and positive definite, \(U_n\) is stochastically bounded, \(C_n\) is arbitrary and \(r_n(s)\) goes to zero in probability for each s. Then \(\alpha _n\), the argmin of \(A_n\), is only \(o_p(1)\) away from \(\beta _n=-V^{-1}U_n\), the argmin of \(\frac{1}{2}s^TVs+U_n^Ts+C_n\). If also \(U_n \mathop {\rightarrow }\limits ^{\mathcal {D}}U\), then \(\alpha _n\mathop {\rightarrow }\limits ^{\mathcal {D}}-V^{-1}U\).
Proof of Theorem 2.1
For given w, \((\tilde{g}_\tau (w), \tilde{g}'_\tau (w), \tilde{\beta })\) minimizes
Denote
\(\tilde{r}_i(w)=-g_\tau (W_i)+g_\tau (w)+g'_\tau (w)(W_i-w)\) and \(K_i(w)=K\big (\frac{W_i-w}{h}\big )\); then \(\tilde{\xi }\) is the minimizer of
Following the identity of Knight (1998),

\[\rho _\tau (u-v)-\rho _\tau (u)=-v\psi _\tau (u)+\int _{0}^{v}\big \{I(u\le s)-I(u\le 0)\big \}ds,\]

where \(\psi _\tau (u)=\tau -I(u\le 0)\), we obtain
In the following, we prove \(E[Q_{2n}(\xi )]=\frac{1}{2}\xi ^T\frac{f_W(w)}{\theta }C_1(w)\xi \).
Denote \(\tilde{Q}_{2n}(\xi )=\sum _{i=1}^n\frac{K_i(w)}{G(Y_i)} \int _{0}^{N_i^T\xi /\sqrt{nh}}\big \{I(\varepsilon _i\le s+\tilde{r}_i(w))-I(\varepsilon _i\le \tilde{r}_i(w))\big \}ds\), and let \(\Delta \) and \(\tilde{r}_i(\mathbf{w})\) denote \(N_i^T\xi /\sqrt{nh}\) and \(\tilde{r}_i(w)\) with \(X_i, W_i\) replaced by \(x,\mathbf{w}\), respectively. Since \(\tilde{Q}_{2n}(\xi )\) is a sum of i.i.d. random variables of kernel form, according to Lemma A.1, we have
The expectation of \(\tilde{Q}_{2n}(\xi )\):
Similarly, we can obtain \(\text {Var}[{\tilde{Q}_{2n}(\xi )}]=o(1).\) Then \(\tilde{Q}_{2n}(\xi )=\frac{1}{2}\xi ^T\frac{f_W(w)}{\theta }C_1(w) \xi +O_p(\delta _n)\), where \(C_1(w)=\mathbb {E}[f_{\varepsilon } (0|X,W)(1,(W-w) /h,X^T)^T(1, (W-w)/h,X^T)|W=w]\). Further, according to Lemma 5.2 in Liang and Baek (2016), we have
and by some calculations, we have
Thus,
According to Lemma A.2, the minimizer of \(Q_{n}(\xi )\) can be expressed as
Therefore,
where \(C_2(w)=\mathbb {E}\{f_{\varepsilon }(0|X,W)(1,X^T)^T (1,X^T)|W=w\}\), \(Q_{1n,1}=\frac{1}{\sqrt{nh}}\sum _{i=1}^n \frac{K_i(w)}{G_n(Y_i)}\psi _\tau (\varepsilon _i-\tilde{r}_i)(1,X_i^T)^T\). In the following, consider \(Q_{1n,1}\). Denote \(Q^{*}_{1n,1}=\frac{1}{\sqrt{nh}}\sum _{i=1}^n\frac{K_i(w)}{G(Y_i)}(1,X_i^T)^T\psi _\tau (\varepsilon _i)\),
and Var\((Q^{*}_{1n,1})\rightarrow \frac{\tau (1-\tau )f_W(w)\nu _0}{\theta }D_2(w)\), where \(D_2(w)=\mathbb {E}[\frac{1}{G(Y)}(1,X^T)^T(1,X^T)|W=w]\). By the Cramér–Wold device and the central limit theorem, we have
Define \(\tilde{Q}_{1n,1}=\frac{1}{\sqrt{nh}}\sum _{i=1}^n\frac{K_i(w)}{G(Y_i)}(1,X_i^T)^T\psi _\tau (\varepsilon _i-\tilde{r}_i)\); then we have
Thus
By Slutsky’s theorem, conditioning on X, W, we have
Note that
Similar to the proof of (A.2), we have \(Q_{1n,1}-\tilde{Q}_{1n,1}=o_p(1)\). Thus,
Next we calculate the mean of \(\tilde{Q}_{1n,1}\).
Combining (A.4), (A.5), (A.6), (A.7) and (A.8), the proof of Theorem 2.1 is completed.
Proof of Theorem 2.2
Given \(\tilde{g}_\tau (W_i)\), then
Denote \( r_i=\tilde{g}_\tau (W_i)-g_\tau (W_i)\), \(\gamma =\sqrt{n} (\beta -\beta _\tau )\) and \(\hat{\gamma }=\sqrt{n}(\hat{\beta }-\beta _\tau )\). Then \(\hat{\gamma }\) is the minimizer of
Using the identity by Knight (1998),
(A.9) can be written as
Consider \(V_{2n}(\gamma )\) firstly. Denote
The conditional expectation of \(V^*_{2n}(\gamma )\):
By some calculations, we have \(\text {Var}[V^*_{2n}(\gamma )]=o(1).\) Thus,
Denote \(\tilde{R}_n(\gamma )=V^*_{2n}(\gamma )-E^*(V_{2n}(\gamma )|X,W)\); it is easy to see that \(\tilde{R}_n(\gamma )=o_p(1)\). Note that \(V_{2n}(\gamma ) =V^*_{2n}(\gamma )+o_p(1)\); then we have
Next, consider \(V_{1n}\). Define \(V^*_{1n}=\frac{1}{\sqrt{n}} \sum _{i=1}^n \frac{1}{G(Y_i)}X_i\psi _\tau (\varepsilon _i)\). According to (A.1), we have \(V_{1n}=V^*_{1n}+o_p(1)\). Hence,
By (A.3),
where \(\delta (X_i,W_i)=\mathbb {E}\big [f_{\varepsilon }(0| X,W)X(1,\mathbf 0 ^T)\big ]C^{-1}_2(W_i)(1,X_i^T)^T\). Thus,
Observe that \(A_n=EA_n+o_p(1)=\frac{1}{\theta } \mathbb {E} \big \{f_{\varepsilon }(0|X,W)XX^T\big \}+o_p(1):=\frac{1}{ \theta }A+o_p(1)\), hence
According to Lemma A.2, we have
Further, according to the Cramér–Wold device and the central limit theorem, we have
where \(B=\mathbb {E}\{X-\delta (X,W)\}^{\otimes 2}.\) Combining (A.11) with (A.12), we accomplish the proof of Theorem 2.2.
Proof of Theorem 2.3
The asymptotic normality of \(\hat{g}_\tau (w)\) can be obtained by following the ideas in the proof of Theorem 2.1; we omit the details.
Proof of Theorem 2.4
Denote \(\zeta =\sqrt{n}(\beta -\beta _\tau )\), \(\hat{\zeta }=\sqrt{n}(\hat{\beta }^\lambda -\beta _\tau )\), \(\hat{\zeta }_1=\sqrt{n}(\hat{\beta }_1^\lambda -\beta _{1\tau })\) and \(r_i=\hat{g}_\tau (W_i)-g_\tau (W_i)\); then \(\hat{\beta }^\lambda \) is the minimizer of the following penalized objective function:
Minimizing (A.13) is equivalent to minimizing
The second term above can be expressed as
Therefore, we obtain \(\hat{\beta }_2^{\lambda }\mathop {\rightarrow }\limits ^{\mathcal {P}}0\).
Let \(B_{n,11}\) be the upper-left \(s\times s\) submatrix of \(B_n\). Since \(\hat{\zeta }\) is the minimizer of \(L_n(\zeta )\) and \(L_n(\zeta )\) can be written asymptotically as
Since \(L_n(\zeta )\) is a convex function of \(\zeta \) and \(L(\zeta _1)\) has a unique minimizer, the epi-convergence results of Geyer (1994) imply that
which establishes the asymptotic normality part.
To prove the consistency of the model selection, we only need to show that \(\hat{\beta }_2^{\lambda }=0\) with probability tending to 1. This is equivalent to proving that if \(\beta _{\tau j}=0\), then \(P(\hat{\beta }_j^{ \lambda }\ne 0)\rightarrow 0\). Recall the fact that \(|\frac{\rho _{ \tau } (t_2)-\rho _{\tau }(t_1)}{t_2-t_1}|\le \mathrm {max}(\tau ,1-\tau )<1\); if \(\hat{\beta }_j^{\lambda }\ne 0,\) then we have \(\sqrt{n}p^{\prime }_\lambda ( |\beta ^{(0)}_{j}|)<n^{-1}\sum ^n_{i=1}\frac{1}{G_n(Y_i)}|X_{ij}|\). Therefore, \(P(\hat{\beta }_j^{\lambda }\ne 0)\le P\big (\sqrt{n} p^{\prime }_\lambda (|\beta ^{(0)}_{j}|)<n^{-1}\sum ^n_{i=1}\frac{1}{G_n(Y_i)} |X_{ij}|\big )\), which together with \(\sqrt{n}p^{\prime }_\lambda (|\beta ^{(0)}_{j}|) \rightarrow \infty \) yields that \(P(\hat{\beta }_j^{\lambda }\ne 0)\rightarrow 0\).
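The argument above hinges on the derivative \(p'_\lambda (\cdot )\) of the SCAD penalty of Fan and Li (2001). As a small numerical sketch (not the authors' code; the function name is ours, and \(a=3.7\) is the value Fan and Li suggest), it can be evaluated as:

```python
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """Derivative p'_lambda(t) of the SCAD penalty (Fan and Li 2001)
    for t >= 0: equals lam on [0, lam], decays linearly to 0 on
    (lam, a*lam], and is 0 beyond a*lam."""
    t = np.asarray(t, float)
    return lam * ((t <= lam)
                  + np.maximum(a * lam - t, 0.0) / ((a - 1.0) * lam)
                  * (t > lam))
```

The flat-then-vanishing derivative is what yields the oracle property: small coefficients are penalized at a constant rate (and shrunk to zero), while large coefficients incur no penalty at all.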
Cite this article
Xu, HX., Chen, ZL., Wang, JF. et al. Quantile regression and variable selection for partially linear model with randomly truncated data. Stat Papers 60, 1137–1160 (2019). https://doi.org/10.1007/s00362-016-0867-3