Abstract
In this paper, a new variable selection procedure based on weighted composite quantile regression is proposed for varying coefficient models with a diverging number of parameters. The proposed method combines basis function approximation with the group SCAD penalty, and it achieves both robustness and efficiency. Furthermore, the theoretical properties of the procedure, including consistency in variable selection and the oracle property in estimation, are established under suitable assumptions. Finally, the finite-sample behavior of the estimator is evaluated through simulation studies. In addition, some interesting extensions are made to separate constant coefficients from varying coefficients.
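As a concrete illustration of the objective described above, the following sketch assembles a weighted composite quantile loss over several quantile levels with a group-SCAD penalty on the spline coefficient groups. This is a minimal sketch, not the paper's implementation: all names (`wcqr_objective`, `Pi`, `groups`, and so on) are illustrative, and the SCAD form follows the standard parametrization of Fan and Li (2001).

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty p_lambda(t), standard parametrization (Fan and Li 2001)."""
    t = np.abs(t)
    return np.where(
        t <= lam, lam * t,
        np.where(t <= a * lam,
                 (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                 lam**2 * (a + 1) / 2))

def wcqr_objective(gamma, c, Pi, y, taus, weights, lam, groups):
    """Penalized weighted composite quantile loss (illustrative names).

    gamma   : stacked spline coefficients
    c       : intercepts c_{tau_k}, one per quantile level
    Pi      : n x d spline design matrix
    taus    : quantile levels tau_1, ..., tau_q
    weights : composite weights omega_1, ..., omega_q
    groups  : list of index arrays, one per coefficient function
    """
    fit = Pi @ gamma
    loss = sum(w * check_loss(y - fit - ck, tau).sum()
               for tau, w, ck in zip(taus, weights, c))
    # group penalty on the Euclidean norm of each coefficient block
    pen = len(y) * sum(scad_penalty(np.linalg.norm(gamma[g]), lam)
                       for g in groups)
    return loss + pen
```

The flat region of SCAD (constant for large arguments) is what allows large coefficient groups to escape shrinkage, which underlies the oracle property discussed below.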
References
Ahmad I, Leelahanon S, Li Q (2005) Efficient estimation of a semi-parametric partially linear varying coefficient model. Ann Stat 33:258–283
Antoniadis A, Gijbels I, Lambert-Lacroix S (2014) Penalized estimation in additive varying coefficient models using grouped regularization. Stat Pap 55:727–750
Bradic J, Fan J, Wang W (2011) Penalized composite quasi-likelihood for ultrahigh dimensional variable selection. J R Stat Soc Ser B 73(3):325–349
Chen J, Chen Z (2008) Extended Bayesian information criteria for model selection with large model space. Biometrika 95:759–771
de Boor C (2001) A practical guide to splines. Springer, New York
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Fan J, Li R (2006) Statistical challenges with high dimensionality: feature selection in knowledge discovery, vol III. In: Proceedings of the Madrid international congress of mathematicians, pp 595–622
Fan J, Zhang W (1999) Statistical estimation in varying coefficient models. Ann Stat 27:1491–1518
Fan J, Zhang W (2000) Simultaneous confidence bands and hypotheses testing in varying-coefficient models. Scand J Stat 27:715–731
Fan J, Zhang W (2008) Statistical methods with varying coefficient models. Stat Interface 1:179–195
Guo J, Tang M, Tian M, Zhu K (2013) Variable selection in high-dimensional partially linear additive models for composite quantile regression. Comput Stat Data Anal 65:56–67
Guo J, Tian M, Zhu K (2012) New efficient and robust estimation in varying-coefficient models with heteroscedasticity. Stat Sin 22:1075–1101
Hastie T, Tibshirani R (1993) Varying-coefficient models. J R Stat Soc Ser B 55:757–796
Hu T, Xia Y (2012) Adaptive semi-varying coefficient model selection. Stat Sin 22:575–599
Hunter D, Lange K (2000) Quantile regression via an MM algorithm. J Comput Graph Stat 9:60–77
Hunter D, Li R (2005) Variable selection using MM algorithms. Ann Stat 33:1617–1642
Jiang J, Zhao Q, Hui YV (2001) Robust modelling of ARCH models. J Forecast 20:111–133
Jiang X, Jiang J, Song X (2012) Oracle model selection for nonlinear models based on weighted composite quantile regression. Stat Sin 22:1479–1506
Kai B, Li R, Zou H (2010) Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. J R Stat Soc Ser B 72:49–69
Kai B, Li R, Zou H (2011) New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann Stat 39:305–332
Kim M (2007) Quantile regression with varying coefficients. Ann Stat 35:92–108
Knight K (1998) Limiting distributions for L1 regression estimators under general conditions. Ann Stat 26:755–770
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Li G, Xue L, Lian H (2011) Semi-varying coefficient models with a diverging number of components. J Multivar Anal 102:1166–1174
Noh H, Park B (2010) Sparse varying coefficient models for longitudinal data. Stat Sin 20:1183–1202
Noh H, Chung K, Van Keilegom I (2012) Variable selection of varying coefficient models in quantile regression. Electron J Stat 6:1220–1238
Silverman BW (1986) Density estimation. Chapman and Hall, London
Tang Q, Cheng L (2012) Componentwise B-spline estimation for varying coefficient models with longitudinal data. Stat Pap 53:629–652
Tang Y, Wang HJ, Zhu Z (2013) Variable selection in quantile varying coefficient models with longitudinal data. Comput Stat Data Anal 57:435–449
Tang Y, Wang HJ, Zhu Z, Song X (2012) A unified variable selection approach for varying coefficient models. Stat Sin 22:601–628
Wang L, Li H, Huang J (2008) Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. J Am Stat Assoc 103:1556–1569
Wang H, Xia Y (2009) Shrinkage estimation of the varying coefficient model. J Am Stat Assoc 104:747–757
Wei F, Huang J, Li H (2011) Variable selection and estimation in high-dimensional varying coefficient models. Stat Sin 21:1515–1540
Xue L, Qu A (2012) Variable selection in high-dimensional varying-coefficient models with global optimality. J Mach Learn Res 13:1973–1998
Yang H, Guo C, Lv J (2014) Variable selection for generalized varying coefficient models with longitudinal data. Stat Pap (accepted). doi:10.1007/s00362-014-0647-x
Zhao P, Xue L (2010) Variable selection for semiparametric varying coefficient partially linear errors-in-variables models. J Multivar Anal 101:1872–1883
Zhao W, Zhang R, Lv Y, Zhao J (2013) Variable selection of the quantile varying coefficient regression models. J Korean Stat Soc 42:343–358
Zou H, Yuan M (2008) Composite quantile regression and the oracle model selection theory. Ann Stat 36:1108–1126
Acknowledgments
The authors are very grateful to the editor, associate editor, and two anonymous referees for their detailed comments on the earlier version of the manuscript, which led to a much improved paper. This work is supported by the National Natural Science Foundation of China (Grant No. 11171361) and the Chongqing University Postgraduates’ Innovation Project.
Appendix
Let C denote a generic constant that might assume different values at different places.
Lemma 1
Suppose \({\varvec{\pi }^{(j)}}{(u)^T}\varvec{\gamma }_j^0\) is the best approximating spline function for \({\beta _j}(u)\) and \({\varvec{\gamma } ^0} = {(\varvec{\gamma } _1^{0T},\ldots ,\varvec{\gamma } _{{p_n}}^{0T})^T}\). Under conditions (C1)–(C5), there exist constants \(C_1\) and \(C_2\) such that
(a) \(\mathop {\sup }\limits _{u \in [0,1]} \left| {{\beta _j}(u) - {\varvec{\pi }^{(j)}} {{(u)}^T}\varvec{\gamma } _j^0} \right| \le C_1 K_n^{ - r}\),

(b) \(\mathop {\sup }\limits _{(u,{\varvec{X}}) \in [0,1] \times {R^{{p_n}}}} \left| {{{\varvec{X}}^T}\varvec{\beta } (u) - \varvec{\Pi }^T{\varvec{\gamma } ^0}} \right| \le C_2 K_n^{ - r}\sqrt{{p_n}}\).
Let \({R_{ni}} = \varvec{\Pi } _{i}^T{\varvec{\gamma } ^0} - \mathbf{X }_i^T\varvec{\beta } ({U_i}).\) By (b) of Lemma 1, it is easy to see that \(\mathop {\max }\limits _{i} \left| {{R_{ni}}} \right| \le C_2 K_n^{ - r}\sqrt{{p_n}}\).
Proof of Lemma 1
Since \({\varvec{\pi }^{(j)}}{(u)^T}\varvec{\gamma }_j^0\) is the best approximating spline function for \({\beta _j}(u)\), the result on page 149 of de Boor (2001) gives, for \({\beta _j}(u)\) satisfying condition (C1), \(\mathop {\sup }\limits _{u \in [0,1]} \left| {{\beta _j}(u) - {\varvec{\pi }^{(j)}} {{(u)}^T}\varvec{\gamma } _j^0} \right| \le C_1 K_n^{ - r}\). This proves Lemma 1 (a). Now we show Lemma 1 (b). Let
So \(\varvec{\Pi }= {{\varvec{B}(u)}^T}\varvec{X}\). Using condition (C3), we have
where \({\lambda ^{\max }} = \mathop {\sup }\limits _{u \in [0,1]} {\lambda _{\max }}\left\{ {E(\varvec{X}{\varvec{X}^T}\left| {U = u} \right. )} \right\} \) and \({\lambda _{\max }}(\varvec{A})\) denotes the maximum eigenvalue of a positive definite matrix \(\varvec{A}\).
Thus, \(\mathop {\sup }\limits _{(u,{\varvec{X}}) \in [0,1] \times {R^{{p_n}}}} \left| {{{\varvec{X}}^T}\varvec{\beta } (u) - \varvec{\Pi }^T{\varvec{\gamma } ^0}} \right| \le {C_2}\sqrt{{p_n}} K_n^{ - r}\), where \({C_2} = \sqrt{{\lambda ^{\max }}C_1^2} \). This completes the proof. \(\square \)
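The \(K_n^{-r}\) approximation rate in Lemma 1 (a) can be observed numerically. The sketch below, with a smooth test coefficient function of my own choosing (not from the paper), fits least-squares cubic B-splines with an increasing number of interior knots via SciPy and records the sup-norm error, which shrinks as the knot count grows.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

def spline_sup_error(f, n_interior_knots, degree=3, grid=2000):
    """Sup-norm error of the least-squares spline fit to f on [0, 1]."""
    u = np.linspace(0, 1, grid)
    # interior knots, plus boundary knots of multiplicity degree + 1
    t = np.linspace(0, 1, n_interior_knots + 2)[1:-1]
    knots = np.r_[[0.0] * (degree + 1), t, [1.0] * (degree + 1)]
    spl = make_lsq_spline(u, f(u), knots, k=degree)
    return np.max(np.abs(f(u) - spl(u)))

beta = lambda u: np.sin(2 * np.pi * u)  # a smooth coefficient function
errs = [spline_sup_error(beta, K) for K in (4, 8, 16)]
# the sup-norm errors decrease as the number of knots K_n grows
```

Doubling the number of knots should cut the sup-norm error by roughly \(2^{-4}\) for a cubic spline and a function this smooth, consistent with the \(K_n^{-r}\) bound.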
Proof of Theorem 1
Let \({\alpha _n} = \sqrt{{p_n}} \left( {{n^{ - r/(2r+1)}} + {a_n}} \right) ,{\varvec{u}_n} = \alpha _n^{ - 1}\left( {\varvec{\hat{\gamma }} - {\varvec{\gamma }^0}} \right) \) with \(\varvec{u}_{nj}=\alpha _n^{ - 1}\left( {\varvec{\hat{\gamma }}_j - {\varvec{\gamma }_j^0}} \right) \), \({v_k} {=} \alpha _n^{ - 1}\left( {{\hat{c}_{\tau _k}} - c_{\tau _k}} \right) \) and \(\mathscr {F}_n{=}\left\{ {\left( {{\varvec{u}_n},\varvec{v}} \right) : {{\left\| {{{\left( {\varvec{u}_n^T,\mathbf{v ^T}} \right) }^T}} \right\| }_2} = C} \right\} \), where C is a large enough constant and \(\varvec{c} = {({c_{{\tau _1}}},\ldots ,{c_{{\tau _q}}})^T}, \varvec{v} = {({v_1},\ldots ,{v_q})^T}\). Our aim is to show that for any given \(\eta >0\), there is a large constant C such that, for large n we have
This implies that, with probability tending to one, there is a local minimum \(\varvec{\hat{\gamma }} \) in the ball \(\left\{ {\left( {{\varvec{\gamma } ^0} + {\alpha _n}{\varvec{u}_n},\varvec{c} + {\alpha _n}{\varvec{v}}} \right) {{: }}{{\left\| {{{\left( {\varvec{u}_n^T,{\varvec{v}^T}} \right) }^T}} \right\| }_2} \le C} \right\} \) such that \({\left\| {\varvec{\hat{\gamma }} - {\varvec{\gamma } ^0}} \right\| _2} = {O_p}({\alpha _n})\). Let \({D_n}({\varvec{u}_n},\varvec{v}) = {{PL_n}({\varvec{\gamma } ^0} + {\alpha _n}{\varvec{u}_n},\varvec{c} + {\alpha _n}{\varvec{v}},\varvec{\omega }) - {PL_n}({\varvec{\gamma } ^0},\varvec{c},\varvec{\omega })}\) and \({S_n}({\varvec{u}_n},\varvec{v})= {{L_n}({\varvec{\gamma } ^0} + {\alpha _n}{\varvec{u}_n},\varvec{c} + {\alpha _n}{\varvec{v}},\varvec{\omega }) - {L_n}({\varvec{\gamma }^0},\varvec{c},\varvec{\omega })}\). Then
where \({P_{{\lambda _n}}}({\varvec{u}_n})=n\sum \limits _{j = 1}^{{p_n}} {\left[ {{p_{{\lambda _n}}}\left( {{\left\| {\varvec{\gamma }_j^0 + {\alpha _n}{\varvec{u}_{nj}}} \right\| }_2}\right) - {p_{{\lambda _n}}}\left( {{\left\| {\varvec{\gamma }_j^0} \right\| }_2}\right) } \right] } \).
By the identity (Knight 1998) \(\rho _\tau (u - v) - \rho _\tau (u) = v\left\{ I(u < 0) - \tau \right\} + \int _0^v \left\{ I(u \le s) - I(u \le 0) \right\} \mathrm{d}s\),
we have
Then we can rewrite \({S_n}({\varvec{u}_n},\varvec{v})\) as
where
Note that \(\Vert {\varvec{\Gamma }_n}\Vert =O(1)\) and that \(\varepsilon _i\) is independent of \(\varvec{X}_i\) and \(U_i\); it follows that \(E(\varvec{Z}_n^T{\varvec{u}_n}) =0\) and \(E\{ {(\varvec{Z}_n^T{\varvec{u}_n})^2}\} = \varvec{u}_n^TE({\varvec{Z}_n}\varvec{Z}_n^T){\varvec{u}_n} =O(\left\| {{\varvec{u}_n}} \right\| _2^2)\). Hence, \(\varvec{Z}_n^T{\varvec{u}_n} = {O_p}(\left\| {{\varvec{u}_n}} \right\| _2)\). This, combined with (18), leads to
Applying the Markov inequality and condition (C3), for constant M, we have
So \(\mathop {\max }\limits _i \left( {{\alpha _n}\left| {\varvec{\Pi }_i^T{\varvec{u}_n}} \right| } \right) = o_p\left( 1 \right) \).
Thus, it is easy to show that \(\mathop {\max }\limits _i \left( {{\alpha _n}\left| {\varvec{\Pi }_i^T{\varvec{u}_n} + {v_k}} \right| } \right) = o_p\left( 1 \right) \). By condition (C4) and Lebesgue's dominated convergence theorem, we have
Here we use the fact that \(\mathop {\max }\limits _i \left( {{\alpha _n}\left| {\varvec{\Pi }_i^T{\varvec{u}_n} + {v_k}} \right| } \right) = o_p\left( 1 \right) \) in the third step.
Moreover,
Hence
This, combined with (19), yields
By condition (C6), we have
It follows from (19)–(23) that \({D_n}({\varvec{u}_n},\varvec{v})\) in (17) is dominated by the positive quadratic term \( \frac{1}{2}n\alpha _n^2\sum \limits _{k = 1}^q {{\omega _k} f({c_{{\tau _k}}}) {(v_k^2 + \varvec{u}_n^T {\varvec{\Gamma }_n}{\varvec{u}_n} + 2{v_k} {\varvec{\mu }_n^T}{\varvec{u}_n})}} \) as long as \({\left\| {{\varvec{u}_n}} \right\| _2}\) and \({\left\| {{\varvec{v}}} \right\| _2}\) are large enough. This proves (16). By Lemma 1, we have
This completes the proof. \(\square \)
Proof of Theorem 2 (a)
We use proof by contradiction. Suppose that there exists a \({s} + 1 \le {j_0} \le {p_n}\) such that the probability of \({\hat{\beta }_{{j_0}}}(u)\) being a zero function does not converge to one. Then, there exists \(\eta > 0\) such that, for infinitely many n, \(P({\varvec{\hat{\gamma }}_{{j_0}}} \ne 0) = P({\hat{\beta }_{{j_0}}} (u)\ne 0) \ge \eta .\) Let \({\varvec{\hat{\gamma }} ^*}\) be the vector obtained from \(\varvec{\hat{\gamma }} \) with \({\varvec{\hat{\gamma }} _{{j_0}}}\) being replaced by \(\varvec{0}\). It will be shown that there exists a \(\delta > 0\) such that \({PL_n}\left( {\varvec{\hat{\gamma }} ,\varvec{\hat{c}}; \varvec{\omega } } \right) -{PL_n}\left( {\varvec{\hat{\gamma }}^* ,\varvec{\hat{c}}; \varvec{\omega } } \right) > 0\) with probability at least \(\delta \) for infinitely many n, which contradicts the fact that \({PL_n}\left( {\varvec{\hat{\gamma }} ,\varvec{\hat{c}}; \varvec{\omega } } \right) -{PL_n}\left( {\varvec{\hat{\gamma }}^* ,\varvec{\hat{c}}; \varvec{\omega } } \right) \le 0 \).
By Theorem 1, we have \( {\left\| {{{\varvec{\hat{\gamma }} }_j} -\varvec{ \gamma } _j^0} \right\| _2} = {O_p}({n^{{{ - r} / {(2r + 1)}}}})\). Since \(\varvec{\gamma } _j^0 = \varvec{0}\) for \(j=s+1,\ldots ,p_n+1\), we have \({\left\| {{{\varvec{\hat{\gamma }} }_j}} \right\| _2} = {O_p}({n^{{{ - r} / {(2r + 1)}}}})\) for \(j=s+1,\ldots ,p_n+1\). So \({\left\| {{{\varvec{\hat{\gamma }}}_{j_0}}} \right\| _2} = {O_p}({n^{{{ - r} / {(2r + 1)}}}})\). With probability tending to one, \({{{\left\| {{{\varvec{\hat{\gamma }}}_{{j_0}}}} \right\| }_2} \le {\lambda _n}}\), since \({n^{r / {(2r + 1)}}} {\lambda _n}/ {\sqrt{{p_n}}} \rightarrow \infty \). By the definition of \({p_{\lambda _n} }(.)\), we have \(P\left\{ {{p_{{\lambda _n}}}({{\left\| {{{\varvec{\hat{\gamma }} }_{{j_0}}}} \right\| }_2}) = {\lambda _n}{{\left\| {{{\varvec{\hat{\gamma }}}_{{j_0}}}} \right\| }_2}} \right\} \rightarrow 1.\)
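For reference, the SCAD penalty \(p_{\lambda }(\cdot )\) appealed to here is, in the standard parametrization of Fan and Li (2001) with \(a > 2\) (commonly \(a = 3.7\)):

```latex
p_\lambda(t) =
\begin{cases}
\lambda t, & 0 \le t \le \lambda,\\[2pt]
\dfrac{2a\lambda t - t^2 - \lambda^2}{2(a-1)}, & \lambda < t \le a\lambda,\\[2pt]
\dfrac{\lambda^2 (a+1)}{2}, & t > a\lambda.
\end{cases}
```

In particular \(p_\lambda (t) = \lambda t\) exactly on \([0,\lambda ]\), which is the fact used above once \({\left\| {{{\varvec{\hat{\gamma }}}_{{j_0}}}} \right\| _2} \le {\lambda _n}\) holds with probability tending to one.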
Since \(\left( {{\rho _\tau }(u) - {\rho _\tau }(v)} \right) \ge (\tau - I(v < 0))(u - v)\) for any \(u,v \in R\), we have
where \({r_{ni}} = {R_{ni}} + \varvec{\Pi } _{i}^T({\varvec{\hat{\gamma }} ^*} - {\varvec{\gamma } ^0}).\)
Let \(\mathbf{T _n} = {\sum \limits _{i = 1}^n {(I({\varepsilon _i} < 0) - I({\varepsilon _i} < {r_{ni}} + \hat{c}_{{\tau _k}}))\varvec{\Pi } _i^{({j_0})}} } \). From conditions (C3) and (C4), we obtain that for any \(L > 0\) and \(\Delta ={n^{{{ - r} / {(2r + 1)}}}}\sqrt{{p_n}} \),
Thus
By simple calculation, we obtain
Since \({{{n^{{r / {(2r + 1)}}}}{\lambda _n}} / {\sqrt{{p_n}} }} \rightarrow \infty \), \(n{\lambda _n}\) is of higher order than \(O(n\sqrt{{p_n}} {n^{{{ - r} / {(2r + 1)}}}})\). Combining this with (25) and (26), we conclude that (24) is dominated by \(n{\lambda _n}{\left\| {{{\varvec{\hat{\gamma }}}_{{j_0}}}} \right\| _2}\), which contradicts \({PL_n}\left( {\varvec{\hat{\gamma }} ,\varvec{\hat{c}}; \varvec{\omega } } \right) -{PL_n}\left( {\varvec{\hat{\gamma }}^* ,\varvec{\hat{c}}; \varvec{\omega } } \right) \le 0 \). \(\square \)
Proof of Theorem 2 (b)
Let \({\varvec{u}_n} = \alpha _n^{ - 1}(\varvec{\gamma }- {\varvec{\gamma }^0})\). Partition the vectors \({\varvec{u}_n} = {(\varvec{u}_{na}^T,\varvec{u}_{nb}^T)^T}\) and \({\varvec{\Pi }_i} = {(\varvec{\Pi }_{ia}^T,\varvec{\Pi }_{ib}^T)^T}\) in the same way as \(\varvec{\gamma }= {(\varvec{\gamma }_a^T,\varvec{\gamma }_b^T)^T}\). By (17) and \(P_{\lambda _n}(0)=0\), we can write
where \({P_{{\lambda _n}}}\left( {\varvec{u}_{na}}\right) =n\sum \limits _{j = 1}^{{s}} {\left[ {{p_{{\lambda _n}}}\left( {{\left\| {\varvec{\gamma }_j^0 + {\alpha _n}{\varvec{u}_{nj}}} \right\| }_2}\right) - {p_{{\lambda _n}}}\left( {{\left\| {\varvec{\gamma }_j^0} \right\| }_2}\right) } \right] } \). By taking Taylor’s expansion for \({P_{{\lambda _n}}}\left( {\varvec{u}_{na}}\right) \) at \(\varvec{u}_{na}=0\), we obtain that
Then the minimizer \((\varvec{\hat{u}}_{na}^T,\varvec{\hat{v}}^T)^T\) of \({D_n}(({\varvec{u}_{na}^T, \varvec{0}^T})^T,\varvec{v})\) satisfies the score equations
where \({\psi _\tau }(u) ={{\rho '}_{{\tau }}}(u)= \tau - I(u < 0)\). Then we can write
where \({\varvec{H}_n} = {n^{ - {1 / 2}}}\sum \limits _{i = 1}^n {{\varvec{\Pi }_{ia}}\sum \limits _{k = 1}^q {{\omega _k}} [I({\varepsilon _i} < {R_{ni}} + {c_{{\tau _k}}}) - {\tau _k}],} \)
Taking a Taylor expansion of \({F({R_{ni}} + {c_{{\tau _k}}} + {\alpha _n}(\varvec{\Pi }_{ia}^T{\varvec{\hat{u}}_{na}} + {{\hat{v}}_k}))}\) at \({{R_{ni}} + {c_{{\tau _k}}}}\) gives
By direct calculation of the mean and variance, we can show, as in Jiang et al. (2001), \(B_{n22}^{(k)} = {o_p}({\alpha _n})\). This combined with (28) and (30) leads to
Similarly, (29) can be simplified as
where \( \zeta _{n,k} = {n^{{{ - 1} / 2}}}{\omega _k}\sum \limits _{i = 1}^n {[I({\varepsilon _i} < {c_{{\tau _k}}}+R_{ni}) - {\tau _k}]} \). Solving (31) and (32), we obtain that
Let \(\varvec{H}_n^* = {n^{ - {1 / 2}}}\sum \limits _{i = 1}^n {{\varvec{\Pi }_{ia}}\sum \limits _{k = 1}^q {{\omega _k}} [I({\varepsilon _i} < {c_{{\tau _k}}}) - {\tau _k}]}\), \( \zeta _{n,k}^* = {n^{{{ - 1} / 2}}}{\omega _k}\sum \limits _{i = 1}^n [I({\varepsilon _i} < {c_{{\tau _k}}}) - {\tau _k}] \). Following Jiang et al. (2012), we have
Put \({\eta _{i,k}} = I({\varepsilon _i} < {R_{ni}} + {c_{{\tau _k}}}) - {\tau _k}\) and \(\eta _{i,k}^* = I({\varepsilon _i} < {c_{{\tau _k}}}) - {\tau _k}\). Moreover,
Thus, we have
By Slutsky’s theorem, conditioning on \(\mathscr {H}\), we have
Note that \({\varvec{u}_{na}} = \alpha _n^{ - 1}(\varvec{\gamma }_a - {\varvec{\gamma }_ a^0})\). It follows that
\(\square \)
Proof of Theorem 2 (c)
By the proof of Theorem 2 (a), we know immediately that \({\varvec{\hat{\gamma }} _{b,{\lambda _n}}} = 0\) with probability tending to one. Consequently, \({\varvec{\hat{\gamma }}_{a,{\lambda _n}}}\) must be the solution of the following normal equation
On the other hand, the oracle estimator must be the solution of the normal equation
So we have
Furthermore, the first term on the left-hand side of (36) can be written as
For \(\varvec{G}_1\) and \(\varvec{G}_2\), after some direct calculation, we have
where \({\varvec{\hat{S}}_{na}} = {n^{ - 1}}f{{(}}{c_{{\tau _k}}}{{)}}\sum \limits _{i = 1}^n {{\varvec{\Pi } _{ia}}\varvec{\Pi } _{ia}^T} .\) So
where \({\hat{\lambda }_{\min ,j }} = {\inf _u}{\lambda _{\min }}({\varvec{\hat{S}}_{na,j}}).\)
Thus, we have \(\mathop {\sup }\limits _{u \in [0,1]}| {{{{\hat{\beta }}}_{aj}}(u) - {{{\hat{\beta }} }_{ora,j}}(u)}| ^2 = {o_p}({n^{{{ - 2r} / {(2r + 1)}}}})\). This completes the proof. \(\square \)
Proof of Theorem 3
Suppose for some \(s_1 + 1 \le {j_0} \le s\), \({\varvec{\pi } ^{({j_0})T}}{\varvec{\hat{\gamma }} _{{j_0}}}\) does not represent a constant coefficient. Let \({\varvec{\hat{\gamma }} ^*}\) be the vector obtained from \(\varvec{\hat{\gamma }} \) with \({\varvec{\hat{\gamma }} _{{j_0}}}\) being replaced by its projection onto the subspace \(\left\{ {{\varvec{\gamma } _{{j_0}}}{{:}}{\varvec{\pi } ^{({j_0})T}}{\varvec{\gamma } _{{j_0}}}\mathrm{{ represents ~ a ~constant~ coefficient}}} \right\} \). For \(j=s_1+1,\ldots ,s\), \({{{\varvec{\hat{\gamma }} }_j}^T{{{\varvec{F}}}_j}{{\varvec{\hat{\gamma }} }_j} = 0}\). By definition of \({\varvec{\hat{\gamma }}}\) and \({\varvec{\hat{\gamma }}^* }\), we have
Since \({n^{r / {(2r + 1)}}} {\lambda _n} / {\sqrt{{p_n}} } \rightarrow \infty \), we have \({\sqrt{{{\varvec{\hat{\gamma }} }_{{j_0}}}^T{{{\varvec{F}}}_{{j_0}}}{{\varvec{\hat{\gamma }} }_{{j_0}}}} } ={O_p}( {{n^{{{ - r} / {(2r + 1)}}}}}) = o({\lambda _n})\), and \(n{p_{{\lambda _n}}}\left\{ {\sqrt{{{\varvec{\hat{\gamma }} }_{{j_0}}}^T{{{\varvec{F}}}_{{j_0}}}{{\varvec{\hat{\gamma }} }_{{j_0}}}} } \right\} =n\lambda _n{\sqrt{{{\varvec{\hat{\gamma }} }_{{j_0}}}^T{{{\varvec{F}}}_{{j_0}}}{{\varvec{\hat{\gamma }} }_{{j_0}}}} }\) with probability tending to 1, by the definition of the SCAD penalty function. By the proof of Theorem 2 (a), we have \({I} ={O_p}( {n\sqrt{{p_n}} {n^{{{ - r} / {(2r + 1)}}}}} )\left\| {\varvec{\hat{\gamma }} - {{\varvec{\hat{\gamma }} }^*}} \right\| \). Noting that \(\left\| {\varvec{\hat{\gamma }} - {{\varvec{\hat{\gamma }} }^*}} \right\| =\left\| {\varvec{\hat{\gamma }}_{j_0} - {{\varvec{\hat{\gamma }}_{j_0} }^*}} \right\| =O_p( {\sqrt{{{\varvec{\hat{\gamma }} }_{{j_0}}}^T{{{\varvec{F}}}_{{j_0}}}{{\varvec{\hat{\gamma }} }_{{j_0}}}} } ) \), we conclude that \(n{p_{{\lambda _n}}}\left\{ {\sqrt{{{\varvec{\hat{\gamma }} }_{{j_0}}}^T{{{\varvec{F}}}_{{j_0}}}{{\varvec{\hat{\gamma }} }_{{j_0}}}} } \right\} \) dominates the other term in (37), which contradicts \(PL(\varvec{\hat{\gamma }},\varvec{\hat{c}},\varvec{\omega }) - PL({\varvec{\hat{\gamma }} ^*},\varvec{\hat{c}},\varvec{\omega }) \le 0.\) \(\square \)
Proof of Theorem 4
By Theorems 1–3, we only need to consider a correctly specified PLVCM without regularization terms. The problem then reduces to the one studied in Theorem 4.1 of Zou and Yuan (2008), whose results apply directly and yield the asymptotic normality of the parametric component.
\(\square \)
Guo, C., Yang, H. & Lv, J. Robust variable selection in high-dimensional varying coefficient models based on weighted composite quantile regression. Stat Papers 58, 1009–1033 (2017). https://doi.org/10.1007/s00362-015-0736-5