Quantile regression and variable selection for partially linear model with randomly truncated data


Abstract

This paper focuses on the problem of estimation and variable selection for quantile regression (QR) of a partially linear model (PLM) in which the response is subject to random left truncation. We propose a three-stage estimation procedure for the parametric and nonparametric parts based on random weights determined by the product-limit estimate of the distribution function of the truncation variable. The estimators obtained in the second and third stages are more efficient than the initial estimators of the first stage. Furthermore, we propose a variable selection procedure for the QR of the PLM that combines the estimation method with the smoothly clipped absolute deviation (SCAD) penalty to obtain sparse estimates of the regression parameter. The oracle properties of the variable selection approach are established. Simulation studies are conducted to examine the performance of our estimators and the variable selection method.
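For readers who wish to experiment with the weighting scheme, the following is a minimal sketch (not the authors' code) of the Lynden-Bell (1971) product-limit estimator \(G_n\) of the truncation distribution \(G\), in the standard form used for left-truncated data (cf. Woodroofe 1985); the resulting weights \(1/G_n(Y_i)\) are the random quantities entering the three-stage procedure. All variable names are illustrative.

```python
import numpy as np

def lynden_bell_G(t_obs, y_obs, eval_pts):
    """Lynden-Bell product-limit estimate of the truncation df G, in the
    standard form G_n(y) = prod_{i: T_i > y} [1 - 1/(n C_n(T_i))], where
    C_n(s) = n^{-1} #{j : T_j <= s <= Y_j}.  Ties are handled naively."""
    t = np.asarray(t_obs, float)
    y = np.asarray(y_obs, float)
    n = len(t)
    # C_n evaluated at each T_i; each value is >= 1/n because T_i <= Y_i
    C = np.array([np.mean((t <= ti) & (ti <= y)) for ti in t])
    factors = 1.0 - 1.0 / (n * C)
    order = np.argsort(t)
    t_sorted, f_sorted = t[order], factors[order]
    # suffix[k] = prod of f_sorted[k:], so G_n(y) = suffix[#{i: T_i <= y}]
    suffix = np.concatenate((np.cumprod(f_sorted[::-1])[::-1], [1.0]))
    idx = np.searchsorted(t_sorted, np.asarray(eval_pts, float), side="right")
    return suffix[idx]

# illustrative use: simulate left truncation (only pairs with T <= Y observed);
# in small samples G_n can vanish in the left tail, and such points are trimmed
rng = np.random.default_rng(0)
T0, Y0 = rng.uniform(-3, 1, 8000), rng.normal(size=8000)
keep = T0 <= Y0
T, Y = T0[keep], Y0[keep]
weights = 1.0 / lynden_bell_G(T, Y, Y)   # random weights 1/G_n(Y_i)
```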


References

  • Engle R, Granger C, Rice J, Weiss A (1986) Nonparametric estimates of the relation between weather and electricity sales. J Am Stat Assoc 81:310–320

  • Fan JQ, Li RZ (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360

  • Fuller WA (1987) Measurement error models. Wiley, New York

  • Geyer CJ (1994) On the asymptotics of constrained M-estimation. Ann Stat 22:1993–2010

  • He SY, Yang GL (1998) Estimation of the truncation probability in the random truncation model. Ann Stat 26:1011–1027

  • He SY, Yang GL (2003) Estimation of regression parameters with left truncated data. J Stat Plan Inference 117:99–122

  • Honda T (2004) Quantile regression in varying coefficient models. J Stat Plan Inference 121:113–125

  • Jiang R, Qian WM, Zhou ZG (2012) Variable selection and coefficient estimation via composite quantile regression with randomly censored data. Stat Prob Lett 82:308–317

  • Kai B, Li RZ, Zou H (2010) Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. J R Stat Soc B 72:49–69

  • Kai B, Li RZ, Zou H (2011) New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann Stat 39:305–332

  • Kim MO (2007) Quantile regression with varying coefficients. Ann Stat 35:92–108

  • Knight K (1998) Limiting distributions for \(l_1\) regression estimators under general conditions. Ann Stat 26:755–770

  • Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46:33–50

  • Lemdani M, Ould-Saïd E, Poulin P (2009) Asymptotic properties of a conditional quantile estimator with randomly truncated data. J Multivar Anal 100:546–559

  • Liang HY, Baek JI (2016) Asymptotic normality of conditional density estimation with left-truncated and dependent data. Stat Pap 57:1–20

  • Liang HY, Liu AA (2013) Kernel estimation of conditional density with truncated, censored and dependent data. J Multivar Anal 120:40–58

  • Lv YH, Zhang RQ, Zhao WH, Liu JC (2014) Quantile regression and variable selection for the single-index model. J Appl Stat 41:1565–1577

  • Lv YH, Zhang RQ, Zhao WH, Liu JC (2015) Quantile regression and variable selection of partial linear single-index model. Ann Inst Stat Math 67:375–409

  • Lynden-Bell D (1971) A method of allowing for known observational selection in small samples applied to 3CR quasars. Mon Not R Astron Soc 155:95–118

  • Mack YP, Silverman BW (1982) Weak and strong uniform consistency of kernel regression estimators. Prob Theory Relat Fields 61:405–415

  • Neocleous T, Portnoy S (2009) Partially linear censored quantile regression. Lifetime Data Anal 15:357–378

  • Ould-Saïd E, Lemdani M (2006) Asymptotic properties of a nonparametric regression function estimator with randomly truncated data. Ann Inst Stat Math 58:357–378

  • Stute W, Wang JL (2008) The central limit theorem under random truncation. Bernoulli 14:604–622

  • Wang JF, Liang HY, Fan GL (2013) Local polynomial quasi-likelihood regression with truncated and dependent data. Statistics 47:744–761

  • Woodroofe M (1985) Estimating a distribution function with truncated data. Ann Stat 13:163–177

  • Wu YC, Liu YF (2009) Variable selection in quantile regression. Stat Sin 19:801–817

  • Yu K, Jones MC (1998) Local linear quantile regression. J Am Stat Assoc 93:228–237

  • Yu K, Lu YZ, Stander J (2003) Quantile regression: applications and current research areas. Statistician 52:331–350

  • Zhou WH (2011) A weighted quantile regression for randomly truncated data. Comput Stat Data Anal 55:554–566


Acknowledgements

The authors thank the referees for their careful reading of the manuscript and for their constructive comments and suggestions. This research was supported by the National Natural Science Foundation of China (11371321, 11401006), the Project of Humanities and Social Science Foundation of Ministry of Education (15YJC910006), the National Statistical Science Research Program of China (2015LY55, 2016LY80), Anhui Provincial Higher Education Promotion Program Natural Science General Project (TSKJ2015B22) and Zhejiang Provincial Key Research Base for Humanities and Social Science Research (Statistics 1020XJ3316004G).

Author information

Correspondence to Zhen-Long Chen.

Appendix

Before we present the proofs of the theorems, we first state some regularity conditions. They are also assumed in Zhou (2011) and Kai et al. (2011). Let \(\delta _n=\Big (\frac{\log (1/h)}{nh}\Big )^{1/2}\).

  1. (C1)

    The kernel function \(K(\cdot )\) is a symmetric continuous density function with compact support, satisfies a first-order Lipschitz condition, and \(\int ^{\infty }_{-\infty }u^2K(u)du<\infty \), \(\int ^{\infty }_{-\infty }u^jK^2(u)du<\infty ,j=0,1,2.\)

  2. (C2)

    F and G are continuous and \(a_G\le a_F\).

  3. (C3)

    The random variable W has bounded support \(\mathcal {W}\) and its density function \(f_W(\cdot )\) is positive and has a second derivative.

  4. (C4)

    \(F_{\varepsilon }(0|X,W)=\tau \) for all \((X, W)\), with a continuous and uniformly bounded derivative, and \(f_\varepsilon (\cdot |X,W)\) is bounded away from zero.

  5. (C5)

    The matrices \(C_2(w) ~\text {and} ~A\) are non-singular for all \(w\in \mathcal {W}\).

  6. (C6)

    The function \(g(\cdot )\) has a continuous and bounded second derivative.

Lemma A.1

Let \((X_1,Y_1), \ldots , (X_n,Y_n)\) be independent and identically distributed random vectors. Assume that \(E|Y|^s<\infty \) and \(\sup _x\int |y|^sf(x,y)dy<\infty \), where f denotes the joint density of \((X, Y)\). Let K be a bounded positive function with a bounded support, satisfying a Lipschitz condition. Then

$$\begin{aligned} \sup _{x} \Bigg |\frac{1}{n}\sum ^n_{i=1}\big [K_h(X_i-x)Y_i-E(K_h (X_i-x)Y_i)\big ]\Bigg |=O_p\Bigg (\frac{\log ^{1/2}(1/h)}{\sqrt{nh}}\Bigg ), \end{aligned}$$

provided that \(n^{2\varepsilon -1}h\rightarrow \infty \) for some \(\varepsilon <1-s^{-1}\).

Lemma A.1 follows from the result by Mack and Silverman (1982).
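To illustrate the rate \(\delta _n=(\log (1/h)/(nh))^{1/2}\) in Lemma A.1, the following small simulation (illustrative only; it takes \(Y_i\equiv 1\) and \(X\sim U(0,1)\), so that \(E K_h(X-x)=1\) for interior \(x\)) tracks the sup-deviation of the kernel average as \(n\) grows:

```python
import numpy as np

rng = np.random.default_rng(1)

def epanechnikov(u):
    return 0.75 * np.maximum(1.0 - u**2, 0.0)

for n in (500, 2000, 8000):
    h = n ** (-0.2)                       # a standard bandwidth order
    X = rng.uniform(0.0, 1.0, n)
    grid = np.linspace(h, 1.0 - h, 200)   # interior points, where E K_h(X-x) = 1
    # (1/n) sum_i K_h(X_i - x) with K_h(.) = K(./h)/h, evaluated on the grid
    est = epanechnikov((X[None, :] - grid[:, None]) / h).mean(axis=1) / h
    sup_dev = np.max(np.abs(est - 1.0))
    delta_n = np.sqrt(np.log(1.0 / h) / (n * h))
    print(f"n={n:5d}  sup-deviation={sup_dev:.4f}  delta_n={delta_n:.4f}")
```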

Lemma A.2

(Lv et al. 2015) Suppose \(A_n(s)\) is convex and can be represented as \(\frac{1}{2}s^TVs+U_n^Ts+C_n+r_n(s)\), where V is symmetric and positive definite, \(U_n\) is stochastically bounded, \(C_n\) is arbitrary and \(r_n(s)\) goes to zero in probability for each s. Then \(\alpha _n\), the argmin of \(A_n\), is only \(o_p(1)\) away from \(\beta _n=-V^{-1}U_n\), the argmin of \(\frac{1}{2}s^TVs+U_n^Ts+C_n\). If also \(U_n \mathop {\rightarrow }\limits ^{\mathcal {D}}U\), then \(\alpha _n\mathop {\rightarrow }\limits ^{\mathcal {D}}-V^{-1}U\).
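The content of Lemma A.2 is that the argmin of a convex objective admitting a quadratic-plus-remainder expansion is asymptotically the argmin of its quadratic part. A toy numerical illustration, with made-up \(V\), \(U_n\) and a convex remainder \(r_n(s)\) that vanishes pointwise:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
V = np.array([[2.0, 0.3], [0.3, 1.0]])   # symmetric positive definite
U = rng.normal(size=2)                    # plays the role of U_n

for eps in (1e-1, 1e-2, 1e-3):            # remainder r_n(s) = eps*||s||_1 -> 0
    A_n = lambda s: 0.5 * s @ V @ s + U @ s + eps * np.sum(np.abs(s))
    alpha = minimize(A_n, x0=np.zeros(2), method="Nelder-Mead").x
    beta = -np.linalg.solve(V, U)         # argmin of the quadratic part
    print(f"eps={eps:.0e}  |alpha - beta| = {np.linalg.norm(alpha - beta):.4f}")
```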

Proof of Theorem 2.1

For given w, \((\tilde{g}_\tau (w), \tilde{g}'_\tau (w), \tilde{\beta }_\tau )\) minimizes

$$\begin{aligned} \sum _{i=1}^n\frac{1}{G_n(Y_i)}\rho _\tau \big [Y_i-X_i^T \beta -a-b(W_i-w)\big ]K_h(W_i-w). \end{aligned}$$

Denote

$$\begin{aligned} \tilde{\xi }=\sqrt{nh}\left( \begin{array}{ccc} \tilde{g}_\tau (w)-g_\tau (w)\\ h(\tilde{g}'_\tau (w)-g'_\tau (w))\\ \tilde{\beta }_\tau -\beta _\tau \end{array} \right) , \xi =\sqrt{nh}\left( \begin{array}{ccc} a-g_\tau (w)\\ h(b-g'_\tau (w))\\ \beta -\beta _\tau \end{array} \right) , N_i= \left( \begin{array}{c} 1\\ \frac{W_i-w}{h}\\ X_i \end{array} \right) , \end{aligned}$$

\(\tilde{r}_i(w)=-g_\tau (W_i)+g_\tau (w)+g'_\tau (w)(W_i-w)\) and \(K_i(w)=K\big (\frac{W_i-w}{h}\big )\). Then \(\tilde{\xi }\) is the minimizer of

$$\begin{aligned} Q_n(\xi )=\sum _{i=1}^n\frac{K_i(w)}{G_n(Y_i)}\left[ \rho _\tau \left( \varepsilon _i-\tilde{r}_i(w)-N_i^T\xi /\sqrt{nh}\right) -\rho _\tau (\varepsilon _i-\tilde{r}_i(w))\right] . \end{aligned}$$

Following the identity by Knight (1998),

$$\begin{aligned} \rho _\tau (u-v)-\rho _\tau (u)=-v\psi _\tau (u)+\int _0^v\big [I(u\le s)-I(u\le 0)\big ]ds, \end{aligned}$$

where \(\psi _\tau (u)=\tau -I(u\le 0)\), then we obtain

$$\begin{aligned} Q_n(\xi )&=\sum _{i=1}^n\frac{K_i(w)}{G_n(Y_i)}\bigg [ -\frac{N_i^T\xi }{\sqrt{nh}}\psi _\tau (\varepsilon _i-\tilde{r}_i(w))\\&\quad +\,\int _{0}^{N_i^T\xi /\sqrt{nh}}\big [I(\varepsilon _i-\tilde{r}_i(w)\le s)-I(\varepsilon _i-\tilde{r}_i(w)\le 0)\big ]ds\bigg ]\\&=-\frac{1}{\sqrt{nh}}\sum _{i=1}^n\frac{K_i(w)}{G_n(Y_i)}N_i^T \xi \psi _\tau (\varepsilon _i-\tilde{r}_i(w))\\&\quad ~~+\sum _{i=1}^n\frac{K_i(w)}{G_n(Y_i)}\int _{0}^{N_i^T\xi / \sqrt{nh}}\big [I(\varepsilon _i-\tilde{r}_i(w)\le s)-I( \varepsilon _i-\tilde{r}_i(w)\le 0)\big ]ds\\&:=-Q_{1n}^T\xi +Q_{2n}(\xi ). \end{aligned}$$
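The Knight (1998) identity used in this decomposition can be verified numerically; the sketch below compares both sides for random \((u,v,\tau )\), approximating the integral by a left-endpoint Riemann sum (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def rho(u, tau):   # check function rho_tau(u) = u * (tau - I(u < 0))
    return u * (tau - (u < 0))

def psi(u, tau):   # psi_tau(u) = tau - I(u <= 0)
    return tau - (u <= 0)

N = 200000
for _ in range(5):
    u, v, tau = rng.normal(), rng.normal(), rng.uniform(0.1, 0.9)
    s = np.linspace(0.0, v, N + 1)[:-1]          # left endpoints, signed step v/N
    integral = np.sum((u <= s).astype(float) - float(u <= 0)) * (v / N)
    lhs = rho(u - v, tau) - rho(u, tau)
    rhs = -v * psi(u, tau) + integral
    print(f"lhs={lhs:+.5f}  rhs={rhs:+.5f}")   # agree up to the grid resolution
```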

In the following, we show that \(Q_{2n}(\xi )=\frac{1}{2}\xi ^T\frac{f_W(w)}{\theta }C_1(w)\xi +o_p(1)\).

Denote \(\tilde{Q}_{2n}(\xi )=\sum _{i=1}^n\frac{K_i(w)}{G(Y_i)} \int _{0}^{N_i^T\xi /\sqrt{nh}}\big \{I(\varepsilon _i\le s+\tilde{r}_i(w))-I(\varepsilon _i\le \tilde{r}_i(w))\big \}ds\), and let \(\Delta \) and \(\tilde{r}_i(\mathbf{w})\) denote \(N_i^T\xi /\sqrt{nh}\) and \(\tilde{r}_i(w)\), respectively, with \(X_i, W_i\) replaced by \(x,\mathbf{w}\). Since \(\tilde{Q}_{2n}(\xi )\) is a sum of i.i.d. random variables of kernel form, Lemma A.1 yields

$$\begin{aligned} \tilde{Q}_{2n}(\xi )=E[\tilde{Q}_{2n}(\xi )]+O_p(\delta _n). \end{aligned}$$

The expectation of \(\tilde{Q}_{2n}(\xi )\):

$$\begin{aligned} E(\tilde{Q}_{2n}(\xi ))&=\sum _{i=1}^nE\bigg [\frac{K_i(w)}{G(Y_i)} \int _{0}^{N_i^T\xi /\sqrt{nh}}\big \{I(\varepsilon _i\le s+\tilde{r}_i(w))-I(\varepsilon _i\le \tilde{r}_i(w))\big \}ds\bigg ]\\&=\sum _{i=1}^n\int \int \int \frac{1}{G(y)}K(\frac{\mathbf{w}-w}{h}) \int _{0}^{\Delta }\big [I(y\le x^T\beta +g(\mathbf{w})+s+\tilde{r}_i(\mathbf{w}))\\&~~~~-I(y\le x^T\beta +g(\mathbf{w})+\tilde{r}_i(\mathbf{w}))\big ] dsf^{*} (x,y,\mathbf{w})dxdyd\mathbf{w}\\&=\frac{1}{\theta }\sum _{i=1}^n\mathbb {E}\bigg \{K_i(w) \int _{0}^{N_i^T\xi /\sqrt{nh}}\big \{I(\varepsilon _i\le s+\tilde{r}_i(w))-I(\varepsilon _i\le \tilde{r}_i(w))\big \} ds\bigg \}\\&=\frac{1}{\theta }\sum _{i=1}^n\mathbb {E}\bigg \{K_i(w)\mathbb {E} \big \{\int _{0}^{N_i^T\xi /\sqrt{nh}}\big [\big \{I(\varepsilon _i \le s+\tilde{r}_i(w))\\&\qquad -I(\varepsilon _i\le \tilde{r}_i(w))\big \}|X,W\big ]ds\big \}\bigg \}\\&=\frac{1}{\theta }\sum _{i=1}^n\mathbb {E}\bigg \{K_i(w) \int _{0}^{N_i^T\xi /\sqrt{nh}}\big [F_{\varepsilon }(s+\tilde{r}_i(w))-F_{\varepsilon }(\tilde{r}_i(w))\big ]ds\bigg \}\\&=\frac{1}{\theta }\sum _{i=1}^n\mathbb {E}\bigg \{K_i(w) \int _{0}^{N_i^T\xi /\sqrt{nh}}\big [sf_{\varepsilon }(\tilde{r}_i(w)|X,W)+o(1)\big ]ds\bigg \}\\&=\frac{1}{2\theta }\xi ^T\mathbb {E}\bigg \{\frac{1}{nh} \sum _{i=1}^nK_i(w)f_{\varepsilon }(\tilde{r}_i(w)|X,W)N_iN_i^T\bigg \}\xi +O_p(\delta _n). \end{aligned}$$

Similarly, we can obtain \(\text {Var}[{\tilde{Q}_{2n}(\xi )}]=o(1).\) Then \(\tilde{Q}_{2n}(\xi )=\frac{1}{2}\xi ^T\frac{f_W(w)}{\theta }C_1(w) \xi +O_p(\delta _n)\), where \(C_1(w)=\mathbb {E}[f_{\varepsilon } (0|X,W)(1,(W-w) /h,X^T)^T(1, (W-w)/h,X^T)|W=w]\). Further, according to Lemma 5.2 in Liang and Baek (2016), we have

$$\begin{aligned} \sup _y|G_n(y)-G(y)|=O_p(n^{-1/2}), \end{aligned}$$
(A.1)

and by some calculations, we have

$$\begin{aligned} | Q_{2n}(\xi )- \tilde{Q}_{2n}(\xi )|=O_p(h^{\frac{1}{2}})=o_p(1). \end{aligned}$$
(A.2)

Thus,

$$\begin{aligned} Q_{n}(\xi )&=-Q_{1n}^T\xi +E[Q_{2n}(\xi )]+O_p(\delta _n)\\&=-Q_{1n}^T\xi +\frac{1}{2}\xi ^T\frac{f_W(w)}{\theta }C_1(w) \xi +O_p(\delta _n). \end{aligned}$$

According to Lemma A.2, the minimizer of \(Q_{n}(\xi )\) can be expressed as

$$\begin{aligned} \tilde{\xi }=\theta f^{-1}_W(w)C^{-1}_1(w)Q_{1n}+o_p(1). \end{aligned}$$
(A.3)

Therefore,

$$\begin{aligned} \sqrt{nh}\Big ( \begin{array}{ccc} \tilde{g}_\tau (w)-g_\tau (w)\\ \tilde{\beta }_\tau -\beta _\tau \end{array} \Big )=\theta f^{-1}_W(w)C^{-1}_2(w)Q_{1n,1}+o_p(1), \end{aligned}$$
(A.4)

where \(C_2(w)=\mathbb {E}\{f_{\varepsilon }(0|X,W)(1,X^T)^T (1,X^T)|W=w\}\), \(Q_{1n,1}=\frac{1}{\sqrt{nh}}\sum _{i=1}^n \frac{K_i(w)}{G_n(Y_i)}\psi _\tau (\varepsilon _i-\tilde{r}_i)(1,X_i^T)^T\). In the following, consider \(Q_{1n,1}\). Denote \(Q^{*}_{1n,1}=\frac{1}{\sqrt{nh}}\sum _{i=1}^n\frac{K_i(w)}{G(Y_i)}(1,X_i^T)^T\psi _\tau (\varepsilon _i)\),

$$\begin{aligned} E(Q^{*}_{1n,1})&=\frac{1}{\sqrt{nh}}\sum _{i=1}^nE\big \{ \frac{K_i(w)}{G(Y_i)}(1,X_i^T)^T\psi _\tau (\varepsilon _i)\big \}\\&=\frac{1}{\theta \sqrt{nh}}\sum _{i=1}^n\mathbb {E}\big [K_i(w) (1,X_i^T)^T\mathbb {E}\{(\tau -I(\varepsilon _i\le 0))|X,W\}\big ]\\&=\frac{1}{\theta \sqrt{nh}}\sum _{i=1}^n\mathbb {E}\big [K_i (w)(1,X_i^T)^T\big (\tau -F_{\varepsilon }(0|X,W)\big )\big ]=0, \end{aligned}$$

and \(\text {Var}(Q^{*}_{1n,1})\rightarrow \frac{\tau (1-\tau )f_W(w)\nu _0}{\theta }D_2(w)\), where \(D_2(w)=\mathbb {E}[\frac{1}{G(Y)}(1,X^T)^T(1,X^T)|W=w]\). By the Cramér–Wold device and the central limit theorem, we have

$$\begin{aligned} Q^{*}_{1n,1}\mathop {\rightarrow }\limits ^{\mathcal {D}}N\left( 0,\frac{\tau (1-\tau )f_W(w)\nu _0}{\theta }D_2(w) \right) . \end{aligned}$$

Define \(\tilde{Q}_{1n,1}=\frac{1}{\sqrt{nh}}\sum _{i=1}^n\frac{K_i(w)}{G(Y_i)}(1,X_i^T)^T\psi _\tau (\varepsilon _i-\tilde{r}_i)\). Then we have

$$\begin{aligned}&\text {Var}(\tilde{Q}_{1n,1}-Q^{*}_{1n,1})\\&\quad \le \frac{C}{\theta G(a_F)nh}\sum _{i=1}^nK^2_i(w)(1,X_i^T)^T(1,X_i^T)\max \{F_{\varepsilon } (|\tilde{r}_i|)-F_{\varepsilon }(0)\}=o_p(1). \end{aligned}$$

Thus

$$\begin{aligned} \text {Var}(\tilde{Q}_{1n,1}-Q^{*}_{1n,1})=o(1). \end{aligned}$$

By Slutsky’s theorem, conditioning on \((X, W)\), we have

$$\begin{aligned} \tilde{Q}_{1n,1}-E(\tilde{Q}_{1n,1})\mathop {\rightarrow }\limits ^{\mathcal {D}}N(0,\frac{\tau (1-\tau )f_W(w)\nu _0}{\theta }D_2(w) ). \end{aligned}$$
(A.5)

Note that

$$\begin{aligned} Q_{1n,1}=Q_{1n,1}-\tilde{Q}_{1n,1}+(\tilde{Q}_{1n,1}-E\tilde{Q}_{1n,1})+E\tilde{Q}_{1n,1}. \end{aligned}$$
(A.6)

As in the proof of (A.2), we have \(Q_{1n,1}-\tilde{Q}_{1n,1}=o_p(1)\). Thus,

$$\begin{aligned} Q_{1n,1}-E\tilde{Q}_{1n,1}=(\tilde{Q}_{1n,1}-E\tilde{Q}_{1n,1})+o_p(1). \end{aligned}$$
(A.7)

Next we calculate the mean of \(\tilde{Q}_{1n,1}\).

$$\begin{aligned} \frac{1}{\sqrt{nh}}E(\tilde{Q}_{1n,1})&=\frac{1}{nh} \sum _{i=1}^nE\bigg \{\frac{K_i(w)}{G(Y_i)}\psi _\tau (\varepsilon _i- \tilde{r}_i(w))(1,X_i^T)^T\bigg \}\nonumber \\&=-\frac{1}{h\theta }\mathbb {E}\big [K_i(w)\mathbb {E}\{(I(\varepsilon _i -\tilde{r}_i(w)\le 0)-\tau )|X,W\}(1,X_i^T)^T\big ]\nonumber \\&=-\frac{1}{h\theta }\mathbb {E}\big [K_i(w)\{F_{\varepsilon }(\tilde{r}_i(w)|X,W)-F_{\varepsilon }(0|X,W)\}(1,X_i^T)^T\big ]\nonumber \\&=-\frac{1}{h\theta }\mathbb {E}\big \{K_i(w)\tilde{r}_i(w)f_{\varepsilon }(0|X,W)\big (1+o(1)\big )(1,X_i^T)^T\big \}\nonumber \\&=\frac{\mu _2h^2}{2\theta }f_W(w)C_2(w)\bigg ( \begin{array}{ccc} {g''}_{\!\tau }(w)\\ 0 \end{array} \bigg )+o_p(h^2). \end{aligned}$$
(A.8)

Combining (A.4), (A.5), (A.6), (A.7) and (A.8), the proof of Theorem 2.1 is completed.
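For concreteness, a minimal sketch of the first-stage local linear weighted QR studied in Theorem 2.1 (not the authors' implementation; it assumes data arrays and the product-limit values \(G_n(Y_i)\) are available, and uses a derivative-free optimizer only for transparency — in practice the problem would be recast as a linear program):

```python
import numpy as np
from scipy.optimize import minimize

def rho(u, tau):
    return u * (tau - (u < 0))

def local_qr(w, Y, X, W, Gn, tau=0.5, h=0.3):
    """First-stage fit at the point w: minimize over (a, b, beta) the loss
    sum_i Gn(Y_i)^{-1} rho_tau(Y_i - X_i'beta - a - b(W_i - w)) K_h(W_i - w),
    returning (g_tilde(w), g_tilde'(w), beta_tilde)."""
    K = 0.75 * np.maximum(1.0 - ((W - w) / h) ** 2, 0.0)   # Epanechnikov kernel
    wts = K / (Gn * h)
    def obj(theta):
        a, b, beta = theta[0], theta[1], theta[2:]
        return np.sum(wts * rho(Y - X @ beta - a - b * (W - w), tau))
    theta0 = np.zeros(2 + X.shape[1])
    fit = minimize(obj, theta0, method="Nelder-Mead",
                   options={"maxiter": 20000, "fatol": 1e-10})
    return fit.x
```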

Proof of Theorem 2.2

Given \(\tilde{g}_\tau (W_i)\),

$$\begin{aligned} \hat{\beta }_\tau =\underset{\beta }{\arg \min }\sum _{i=1}^n\frac{1}{G_n(Y_i)}\rho _\tau \big (Y_i-\tilde{g}_\tau (W_i)-X_i^T\beta \big ). \end{aligned}$$

Denote \( r_i=\tilde{g}_\tau (W_i)-g_\tau (W_i)\), \(\gamma =\sqrt{n} (\beta -\beta _\tau )\) and \(\hat{\gamma }=\sqrt{n}(\hat{\beta }-\beta _\tau )\). Then \(\hat{\gamma }\) is the minimizer of

$$\begin{aligned} V_n(\gamma )=\sum _{i=1}^n\frac{1}{G_n(Y_i)}\big [\rho _\tau (\varepsilon _i- r_i-X_i^T\gamma /\sqrt{n})-\rho _\tau (\varepsilon _i- r_i)\big ]. \end{aligned}$$
(A.9)

Using the identity by Knight (1998),

$$\begin{aligned} \rho _\tau (u-v)-\rho _\tau (u)=-v\psi _\tau (u)+\int _0^v\big [I(u\le s)-I(u\le 0)\big ]ds, \end{aligned}$$

(A.9) can be written as

$$\begin{aligned} V_n(\gamma )&=-\bigg [\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G_n(Y_i)}X_i\psi _\tau (\varepsilon _i-r_i)\bigg ]^T\gamma \\&\quad +\sum _{i=1}^n\frac{1}{G_n(Y_i)}\int _{0}^{X_i^T\gamma /\sqrt{n}}\big [I(\varepsilon _i-r_i\le s)-I(\varepsilon _i-r_i\le 0)\big ]ds\\&:=-V_{1n}^T\gamma +V_{2n}(\gamma ). \end{aligned}$$
(A.10)

Consider \(V_{2n}(\gamma )\) first. Denote

$$\begin{aligned} V^*_{2n}(\gamma )=\sum _{i=1}^n\frac{1}{G(Y_i)}\int _{r_i}^{ r_i+X_i^T\gamma /\sqrt{n}}\big [I(\varepsilon _i\le s)-I(\varepsilon _i\le 0)\big ]ds. \end{aligned}$$

The conditional expectation of \(V^*_{2n}(\gamma )\) is

$$\begin{aligned} E^*(V_{2n}(\gamma )|X,W)&=\sum _{i=1}^n\frac{1}{G(Y_i)}\int _{r_i}^{r_i+X_i^T\gamma /\sqrt{n}}\big [F_{\varepsilon }(s|X_i,W_i)-F_{\varepsilon }(0|X_i,W_i)\big ]ds\\&=\sum _{i=1}^n\frac{1}{G(Y_i)}\int _{r_i}^{r_i+X_i^T\gamma /\sqrt{n}}\big [sf_{\varepsilon }(0|X_i,W_i)+o(s)\big ]ds\\&=\frac{1}{2}\gamma ^TA_n\gamma +\bigg [\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)}f_{\varepsilon }(0|X,W)r_iX_i\bigg ]^T\gamma +o_p(1), \end{aligned}$$

where \(A_n=\frac{1}{n}\sum _{i=1}^n\frac{1}{G(Y_i)}f_{\varepsilon }(0|X_i,W_i)X_iX_i^T\).

By some calculations, we have \(\text {Var}[V^*_{2n}(\gamma )]=o(1).\) Thus,

$$\begin{aligned} V^*_{2n}(\gamma )=E^*(V_{2n}(\gamma )|X,W)+o_p(1). \end{aligned}$$

Denote \(\tilde{R}_n(\gamma )=V^*_{2n}(\gamma )-E^*(V_{2n}(\gamma )|X,W)\); it is easy to see that \(\tilde{R}_n(\gamma )=o_p(1)\). Note that \(V_{2n}(\gamma ) =V^*_{2n}(\gamma )+o_p(1)\); then we have

$$\begin{aligned} V_{2n}(\gamma )=\frac{1}{2}\gamma ^TA_n\gamma +\bigg [\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)}f_{\varepsilon }(0|X,W)r_iX_i\bigg ]^T\gamma +o_p(1). \end{aligned}$$

Next, consider \(V_{1n}\). Define \(V^*_{1n}=\frac{1}{\sqrt{n}} \sum _{i=1}^n \frac{1}{G(Y_i)}X_i\psi _\tau (\varepsilon _i)\). According to (A.1), we have \(V_{1n}=V^*_{1n}+o_p(1)\). Hence,

$$\begin{aligned} V_n(\gamma )&=-\bigg [\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)}X_i\psi _\tau (\varepsilon _i)\bigg ]^T\gamma +\frac{1}{2} \gamma ^TA_n\gamma \\&\quad +\bigg [\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)} f_{\varepsilon }(0|X,W)r_iX_i\bigg ]^T\gamma +o_p(1). \end{aligned}$$

By (A.3),

$$\begin{aligned}&\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)}f_{\varepsilon } (0|X,W)r_iX_i\\&\quad =\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)}\frac{f_{ \varepsilon }(0|X,W)}{f_W(w)}\theta X_i\Big ( \begin{array}{ccc} 1\\ \mathbf 0 \end{array} \Big )^T C^{-1}_2(W_i)\\&\qquad \times \left\{ \frac{1}{nh}\sum _{j=1}^n \frac{K_j}{G_n(Y_j)}\Big ( \begin{array}{ccc} 1\\ X_j \end{array}\Big )\psi _\tau (\varepsilon _j-\tilde{r}_j)\right\} +O_p(h^{\frac{3}{2}}+\log ^{\frac{1}{2}}(1/h)/\sqrt{nh^2})\\&\quad = \frac{1}{\sqrt{n}}\sum _{j=1}^n\frac{1}{G_n(Y_j)}\psi _\tau ( \varepsilon _j)\delta (X_j,W_j)+O_p(n^{\frac{1}{2}}h^2+\log ^{ \frac{1}{2}}(1/h)/\sqrt{nh^2})\\&\quad =\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)}\psi _\tau ( \varepsilon _i)\delta (X_i,W_i)+o_p(1), \end{aligned}$$

where \(\delta (X_i,W_i)=\mathbb {E}\big [f_{\varepsilon }(0| X,W)X(1,\mathbf 0 ^T)\big ]C^{-1}_2(W_i)(1,X_i^T)^T\). Thus,

$$\begin{aligned} V_n(\gamma )&=\frac{1}{2}\gamma ^TA_n\gamma -\bigg [\frac{1}{\sqrt{n}}\sum _{i=1}^n\frac{1}{G(Y_i)}\psi _\tau (\varepsilon _i) \{X_i-\delta (X_i,W_i)\}\bigg ]^T\gamma +o_p(1)\\&{:=}\frac{1}{2}\gamma ^TA_n\gamma -B^T_n\gamma +o_p(1). \end{aligned}$$

Observe that \(A_n=EA_n+o_p(1)=\frac{1}{\theta } \mathbb {E} \big \{f_{\varepsilon }(0|X,W)XX^T\big \}+o_p(1):=\frac{1}{ \theta }A+o_p(1)\), hence

$$\begin{aligned} V_n(\gamma )=\frac{1}{2\theta }\gamma ^TA\gamma -B^T_n\gamma +o_p(1). \end{aligned}$$

According to Lemma A.2, we have

$$\begin{aligned} \hat{\gamma }=\theta A^{-1}B_n+o_p(1). \end{aligned}$$
(A.11)

Further, according to the Cramér–Wold device and the central limit theorem, we have

$$\begin{aligned} B_n\mathop {\rightarrow }\limits ^{\mathcal {D}}N\left( 0,\frac{\tau (1-\tau )}{\theta } B\right) , \end{aligned}$$
(A.12)

where \(B=\mathbb {E}\{X-\delta (X,W)\}^{\otimes 2}.\) Combining (A.11) with (A.12) completes the proof of Theorem 2.2.
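Analogously, a sketch of the second-stage estimator \(\hat{\beta }_\tau \) analyzed above (illustrative only; \(\tilde{g}_\tau (W_i)\) is assumed available from the first stage). Since \(\rho _\tau (cu)=c\,\rho _\tau (u)\) for \(c>0\), the weighted problem is equivalent to an ordinary quantile regression on rescaled data, so any standard QR routine could be substituted for the simple optimizer used here:

```python
import numpy as np
from scipy.optimize import minimize

def stage2_beta(Y, X, g_tilde_W, Gn, tau=0.5):
    """Second-stage estimator: argmin over beta of
    sum_i Gn(Y_i)^{-1} rho_tau(Y_i - g_tilde(W_i) - X_i'beta).
    Because rho_tau(cu) = c * rho_tau(u) for c > 0, this equals an
    *unweighted* quantile regression of w_i*(Y_i - g_tilde(W_i)) on
    w_i*X_i with w_i = 1/Gn(Y_i)."""
    w = 1.0 / Gn
    Z = w * (Y - g_tilde_W)          # rescaled pseudo-response
    Xw = X * w[:, None]              # rescaled design matrix
    def check_loss(beta):
        res = Z - Xw @ beta
        return np.sum(res * (tau - (res < 0)))
    return minimize(check_loss, np.zeros(X.shape[1]), method="Nelder-Mead").x
```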

Proof of Theorem 2.3

The asymptotic normality of \(\hat{g}_\tau (w)\) can be obtained by following the ideas in the proof of Theorem 2.1; we omit the details here.

Proof of Theorem 2.4

Denote \(\zeta =\sqrt{n}(\beta -\beta _\tau )\), \(\hat{\zeta }=\sqrt{n}(\hat{\beta }^\lambda -\beta _\tau )\), \(\hat{\zeta }_1=\sqrt{n}(\hat{\beta }_1^\lambda -\beta _{1\tau })\) and \(r_i=\hat{g}_\tau (W_i)-g_\tau (W_i)\). Then \(\hat{\beta }^\lambda \) is the minimizer of the following penalized objective function:

$$\begin{aligned} \sum ^n_{i=1}\frac{1}{G_n(Y_i)}\rho _\tau \{Y_i-X_i^T\beta -\hat{g}_\tau (W_i)\}+n\sum ^q_{j=1}p^{\prime }_\lambda (|\beta ^{(0)}_{j}|) \text {sgn}(\beta ^{(0)}_j)(\beta _{j}-\beta _{\tau j}). \end{aligned}$$
(A.13)

Minimizing (A.13) is equivalent to minimizing

$$\begin{aligned} L_n(\zeta )&=\sum ^n_{i=1}\frac{1}{G_n(Y_i)}\big \{\rho _\tau ( \varepsilon _i-r_i-X_i^T\zeta /\sqrt{n})-\rho _\tau (\varepsilon _i -r_i)\big \} \\&\quad +n\sum ^q_{j=1}p^{'}_\lambda (|\beta ^{(0)}_{j}|) \text {sgn}(\beta ^{(0)}_j)(\beta _{j}-\beta _{\tau j}). \end{aligned}$$

The second term above can be expressed as

$$\begin{aligned} n\sum ^q_{j=1}p'_\lambda (|\beta ^{(0)}_{j}|)\text {sgn}(\beta ^{(0)}_j)(\beta _{j}-\beta _{\tau j})\mathop {\rightarrow }\limits ^{\mathcal {P}}\bigg \{ \begin{array}{ll} 0,&{}\quad \text {if}~~ \beta _2=\beta _{2\tau },\\ \infty , &{}\quad \text {otherwise}. \end{array} \end{aligned}$$

Therefore, we obtain \(\hat{\beta }_2^{\lambda }\mathop {\rightarrow }\limits ^{\mathcal {P}}0\).

Denote by \(B_{n,11}\) the upper-left \(s\times s\) submatrix of \(B_n\). Note that \(\hat{\zeta }\) is the minimizer of \(L_n(\zeta )\), and \(L_n(\zeta )\) can be written asymptotically as

$$\begin{aligned} L_n\big ((\zeta _1^T, \mathbf 0 ^T)^T\big )&=\frac{1}{2}(\zeta _1^T, \mathbf 0 ^T)\frac{A}{\theta }(\zeta _1^T,\mathbf 0 ^T)^T-B_n^T( \zeta _1^T, \mathbf 0 ^T)^T\\&\quad +n\sum ^q_{j=1}p^{\prime }_\lambda (|\beta ^{(0)}_{j}|)\text {sgn}(\beta ^{(0)}_j)(\beta _{j}-\beta _{\tau j})+o_p(1)\\&\rightarrow L(\zeta _1)=\frac{1}{2}\zeta _1^T\frac{\Sigma _1}{\theta } \zeta _1-B_{n,11}^T\zeta _1. \end{aligned}$$

Since \(L_n(\zeta )\) is a convex function of \(\zeta \) and \(L(\zeta _1)\) has a unique minimizer, the epiconvergence results of Geyer (1994) imply that

$$\begin{aligned} \arg \min L_n(\zeta )=\sqrt{n}(\hat{\beta }^\lambda -\beta _\tau )\mathop {\rightarrow }\limits ^{ \mathcal {D}}\arg \min L(\zeta _1), \end{aligned}$$

which establishes the asymptotic normality part.

To prove the model selection consistency, we only need to show that \(\hat{\beta }_2^{\lambda }=0\) with probability tending to 1, which is equivalent to proving that \(P(\hat{\beta }_j^{\lambda }\ne 0)\rightarrow 0\) whenever \(\beta _{\tau j}=0\). Recall that \(|\frac{\rho _{\tau } (t_2)-\rho _{\tau }(t_1)}{t_2-t_1}|\le \max (\tau ,1-\tau )<1\). If \(\hat{\beta }_j^{\lambda }\ne 0\), then \(\sqrt{n}p^{\prime }_\lambda ( |\beta ^{(0)}_{j}|)<n^{-1}\sum ^n_{i=1}\frac{1}{G_n(Y_i)}|X_{ij}|\). Therefore, \(P(\hat{\beta }_j^{\lambda }\ne 0)\le P\big (\sqrt{n} p^{\prime }_\lambda (|\beta ^{(0)}_{j}|)<n^{-1}\sum ^n_{i=1}\frac{1}{G_n(Y_i)} |X_{ij}|\big )\), which together with \(\sqrt{n}p^{\prime }_\lambda (|\beta ^{(0)}_{j}|) \rightarrow \infty \) yields \(P(\hat{\beta }_j^{\lambda }\ne 0)\rightarrow 0\).
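For reference, the SCAD penalty derivative driving the argument above has, in the form of Fan and Li (2001), \(p'_\lambda (t)=\lambda \{I(t\le \lambda )+\frac{(a\lambda -t)_+}{(a-1)\lambda }I(t>\lambda )\}\) with \(a=3.7\) customary. A sketch of the penalized objective, using the common local linear approximation \(n\sum _j p'_\lambda (|\beta ^{(0)}_j|)|\beta _j|\) in place of the proof device in (A.13) (illustrative; \(\beta ^{(0)}\) is an initial consistent estimate such as \(\hat{\beta }_\tau \)):

```python
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """SCAD derivative p'_lambda(t) for t >= 0, in the Fan and Li (2001) form:
    p'(t) = lam * { I(t <= lam) + (a*lam - t)_+ / ((a-1)*lam) * I(t > lam) }."""
    t = np.asarray(t, float)
    return lam * ((t <= lam) + np.maximum(a * lam - t, 0.0)
                  / ((a - 1.0) * lam) * (t > lam))

def penalized_objective(beta, beta0, Y, X, g_hat_W, Gn, tau, lam):
    """Weighted check loss plus a local linear approximation of the SCAD
    term in (A.13); beta0 is the initial consistent estimate, e.g. the
    unpenalized second-stage fit."""
    res = Y - X @ beta - g_hat_W
    check = np.sum((1.0 / Gn) * res * (tau - (res < 0)))
    return check + len(Y) * np.sum(scad_deriv(np.abs(beta0), lam) * np.abs(beta))
```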

Cite this article

Xu, HX., Chen, ZL., Wang, JF. et al. Quantile regression and variable selection for partially linear model with randomly truncated data. Stat Papers 60, 1137–1160 (2019). https://doi.org/10.1007/s00362-016-0867-3