Variable selection for high-dimensional varying coefficient partially linear models via nonconcave penalty

Hong, Zhaoping; Hu, Yuao; Lian, Heng

doi:10.1007/s00184-012-0422-8


Published: 16 December 2012

Metrika, Volume 76, pages 887–908 (2013)

Zhaoping Hong¹,
Yuao Hu¹ &
Heng Lian¹


Abstract

In this paper, we consider the problem of simultaneous variable selection and estimation for varying-coefficient partially linear models in a “small $n$, large $p$” setting, when the number of coefficients in the linear part diverges with sample size while the number of varying coefficients is fixed. A similar problem was considered in Lam and Fan (Ann Stat 36(5):2232–2260, 2008) based on kernel estimates for the nonparametric part, where variable selection was not investigated and $p$ was assumed to be smaller than $n$. Here we use polynomial splines to approximate the nonparametric coefficients, which is computationally more expedient, establish the convergence rates as well as the asymptotic normality of the linear coefficients, and further present the oracle property of the SCAD-penalized estimator, which works for $p$ almost as large as $\exp \{n^{1/2}\}$ under mild assumptions. Monte Carlo studies and a real data analysis are presented to demonstrate the finite sample behavior of the proposed estimator. Our theoretical and empirical investigations are actually carried out for the generalized varying-coefficient partially linear models, including both Gaussian data and binary data as special cases.


References

Cai Z, Fan J, Li R (2000) Efficient estimation and inferences for varying-coefficient models. J Am Stat Assoc 95(451):941–956
Chiang CT, Rice JA, Wu C (2001) Smoothing spline estimation for varying coefficient models with repeatedly measured dependent variables. J Am Stat Assoc 96(454):605–619
Chiaretti S, Li X, Gentleman R, Vitale A, Vignetti M, Mandelli F, Ritz J, Foa R (2004) Gene expression profile of adult T-cell acute lymphocytic leukemia identifies distinct subsets of patients with different response to therapy and survival. Blood 103(7):2771–2778
De Boor C (2001) A practical guide to splines, rev. edn. Springer, New York
Eubank RL, Huang C, Maldonado YM, Wang N, Wang S, Buchanan RJ (2004) Smoothing spline estimation in varying-coefficient models. J R Stat Soc Ser B Stat Methodol 66:653–667
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Fan J, Lv J (2011) Nonconcave penalized likelihood with NP-dimensionality. IEEE Trans Inf Theory 57:5467–5484
Fan J, Peng H (2004) Nonconcave penalized likelihood with a diverging number of parameters. Ann Stat 32(3):928–961
Fan J, Zhang W (1999) Statistical estimation in varying coefficient models. Ann Stat 27(5):1491–1518
Fan J, Zhang J (2000) Two-step estimation of functional linear models with applications to longitudinal data. J R Stat Soc Ser B Stat Methodol 62:303–322
Fan J, Feng Y, Song R (2011) Nonparametric independence screening in sparse ultra-high-dimensional additive models. J Am Stat Assoc 106:544–557
Frank I, Friedman J (1993) A statistical view of some chemometrics regression tools. Technometrics 35:109–135
Hastie T, Tibshirani R (1993) Varying-coefficient models. J R Stat Soc Ser B Methodol 55(4):757–796
Huang JZ, Wu C, Zhou L (2002) Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika 89(1):111–128
Huang JZ, Wu C, Zhou L (2004) Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Stat Sin 14(3):763–788
Huang J, Horowitz J, Ma S (2008) Asymptotic properties of bridge estimators in sparse high-dimensional regression models. Ann Stat 36(2):587–613
Huang J, Horowitz J, Wei F (2010) Variable selection in nonparametric additive models. Ann Stat 38(4):2282–2313
Kim Y, Choi H, Oh H (2008) Smoothly clipped absolute deviation on high dimensions. J Am Stat Assoc 103(484):1665–1673
Lam C, Fan J (2008) Profile-kernel likelihood inference with diverging number of parameters. Ann Stat 36(5):2232–2260
Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36(1):261–286
McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman and Hall, London, New York
Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B Methodol 58(1):267–288
van der Geer SA (2000) Applications of empirical process theory. Cambridge University Press, Cambridge
Wang H, Xia Y (2009) Shrinkage estimation of the varying coefficient model. J Am Stat Assoc 104(486):747–757
Wang L, Li H, Huang JZ (2008) Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. J Am Stat Assoc 103(484):1556–1569
Wang L, Wu Y, Li R (2012) Quantile regression for analyzing heterogeneity in ultra-high dimension. J Am Stat Assoc 107(497):214–222
Wang L, Liu X, Liang H, Carroll R (2011) Estimation and variable selection for generalized additive partially linear models. Ann Stat 39:1827–1851
Wei F, Huang J, Li H (2011) Variable selection in high-dimensional varying-coefficient models. Stat Sin 21:1515–1540
Xie H, Huang J (2009) SCAD-penalized regression in high-dimensional partially linear models. Ann Stat 37(2):673–696
Yuan M, Lin Y (2007) On the non-negative garrotte estimator. J R Stat Soc Ser B Stat Methodol 69:143–161
Zhang C (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38(2):894–942
Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101(476):1418–1429
Zou H, Li R (2008) One-step sparse estimates in nonconcave penalized likelihood models. Ann Stat 36(4):1509–1533


Acknowledgments

The authors sincerely thank the two referees for their insightful comments and suggestions, which have led to improvements on the original manuscript. The research of Heng Lian is supported by a Singapore MOE Tier 1 Grant.

Author information

Authors and Affiliations

Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, 637371, Singapore
Zhaoping Hong, Yuao Hu & Heng Lian


Corresponding author

Correspondence to Heng Lian.

Appendix

Proof of Theorem 1

Let $X_i^{(1)}=(X_{i1},\ldots ,X_{is})^T$ be the subvector of $X_i$ associated with nonzero coefficients, and correspondingly let $\beta _0^{(1)}=(\beta _{01},\ldots ,\beta _{0s})^T$. Since Theorem 1 only considers the oracle estimator, we will omit the superscript $(.)^{(1)}$ in the following. Let $a_0^*=(a_0^T,\beta _0^T)^T$, $\hat{a}^*=(\hat{a}^{oT},\hat{\beta }^{oT})^T$, and note that $U_i=(Z_i^T,X_i^T)^T$. Since $\hat{a}^*$ minimizes

$$\begin{aligned} \sum _iQ(g^{-1}(U_i^Ta^*),Y_i) \end{aligned}$$

with respect to $a^*$, $\hat{a}^*$ satisfies the first-order condition

$$\begin{aligned} \sum _i q_1(U_i^T\hat{a}^*,Y_i)U_i=0. \end{aligned}$$

(9)

Using Taylor expansion at $U_i^Ta_0^*$ for the left hand side of (9), we get

$$\begin{aligned} \sum _i \left[q_1\left(U_i^Ta_0^*,Y_i\right)U_i+q_2(z_i,Y_i)U_iU_i^T\left(\hat{a}^*-a_0^*\right)\right]=0, \end{aligned}$$

(10)

where $z_i$ lies between $U_i^Ta_0^*$ and $U_i^T\hat{a}^*$.

First, note that the eigenvalues of $\sum _i q_2(z_i,Y_i)U_iU_i^T$ are of order $n$. Furthermore, we will show that

$$\begin{aligned} \left\Vert\sum _i q_1(U_i^Ta_0^*,Y_i)U_i\right\Vert=O_p( \sqrt{n(K+s)}+nK^{-d}), \end{aligned}$$

(11)

and thus (10) implies $\Vert \hat{a}^*-a_0^*\Vert =O_P(\sqrt{(K+s)/n}+1/K^{d})$, which in turn immediately implies $\sum _j\Vert \hat{\alpha }_j-a_{0j}^TB\Vert +\Vert \hat{\beta }^o-\beta _0\Vert =O_P(\sqrt{(K+s)/n}+1/K^d)$.

Now what is left is to demonstrate (11). Using the notation

$$\begin{aligned} U=(U_1,\ldots ,U_n)^T, \end{aligned}$$

and

$$\begin{aligned} \mathbf q _1(Ua_0^*,Y)=\left(\begin{array}{c} q_1(U_1^Ta_0^*,Y_1)\\ \vdots \\ q_1(U_n^Ta_0^*,Y_n)\\ \end{array}\right), \end{aligned}$$

the norm $\Vert \sum _iq_1(U_i^Ta_0^*,Y_i)U_i\Vert $ can be written as $\Vert \mathbf q _1(Ua_0^*,Y)^TU\Vert $. Let $P_U=U(U^TU)^{-1}U^T$, which is a projection matrix satisfying $P_UU=U$. For an arbitrary $v\in R^{qK+s}$, the Cauchy-Schwarz inequality then gives $|\mathbf q _1(Ua_0^*,Y)^TUv|^2=|(P_U\mathbf q _1(Ua_0^*,Y))^TUv|^2\le \Vert P_U\mathbf q _1(Ua_0^*,Y)\Vert ^2\cdot \Vert Uv\Vert ^2$.

Obviously $\Vert Uv\Vert ^2=O_P(n\Vert v\Vert ^2)$. Besides, we have

$$\begin{aligned}&\Vert P_U\mathbf q _1(Ua_0^*,Y)\Vert ^2\\&\quad \le 2\Vert P_U\mathbf q _1(\mathbf m ,Y)\Vert ^2+2\Vert P_U(\mathbf q _1(Ua_0^*,Y)-\mathbf q _1(\mathbf m ,Y))\Vert ^2, \end{aligned}$$

where $\mathbf m =(m_1,\ldots ,m_n)^T$ with $m_i=W_i^T\alpha _0(T_i)+X_i^T\beta _0$. The first term is of order $O_P(tr(P_U))=O_P(K+s)$ since $\mathbf q _1(\mathbf m ,Y)$ has mean zero conditional on the predictors. By Taylor expansion and (C2), the second term is $O_P(n/K^{2d})$. Combining these two bounds with $\Vert Uv\Vert ^2=O_P(n\Vert v\Vert ^2)$ and taking the supremum over unit vectors $v$ yields (11).

$\square $
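The projection argument in this proof can be checked numerically. The following is a minimal Python sketch on simulated data, not the authors' code: the sizes `n` and `k` (with `k` standing in for $qK+s$), the Gaussian design, and all variable names are assumptions made for illustration. It verifies that $tr(P_U)$ equals the number of columns of $U$, that the Cauchy-Schwarz step holds, and that $E\Vert P_U\mathbf q \Vert ^2$ is close to $tr(P_U)$ for a mean-zero, unit-variance vector $\mathbf q $.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 7                      # k stands in for qK + s (hypothetical sizes)
U = rng.normal(size=(n, k))        # simulated design with full column rank

# Projection onto the column space of U: P_U = U (U^T U)^{-1} U^T
P_U = U @ np.linalg.solve(U.T @ U, U.T)
trace_P = np.trace(P_U)            # equals rank(U) = k up to rounding error

# Cauchy-Schwarz step: |q^T U v|^2 <= ||P_U q||^2 * ||U v||^2
q = rng.normal(size=n)
v = rng.normal(size=k)
lhs = float(q @ U @ v) ** 2
rhs = np.linalg.norm(P_U @ q) ** 2 * np.linalg.norm(U @ v) ** 2

# Monte Carlo estimate of E||P_U q||^2 for mean-zero, unit-variance q.
# P_U is symmetric, so each row of (draws @ P_U) is (P_U q)^T for one draw q.
draws = rng.normal(size=(5000, n))
mean_sq_norm = np.mean(np.sum((draws @ P_U) ** 2, axis=1))

print(f"tr(P_U) = {trace_P:.6f}, Monte Carlo E||P_U q||^2 = {mean_sq_norm:.3f}")
print(f"Cauchy-Schwarz step holds: {lhs <= rhs}")
```

With unit error variance the expected squared norm equals the trace exactly, which is why the first term in the bound grows like $K+s$ rather than like $n$.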

Proof of Theorem 2

As in the previous theorem, we still omit the superscript $(1)$ here. Let $\tilde{\mathcal{G }}$ be the subset of $\mathcal G $ where $h_j$’s are constrained to be polynomial splines. The functions $\Gamma _{j}\in \mathcal G $ can be approximated by $\hat{\Gamma }_{j}\in \tilde{\mathcal{G }}$ with $\Vert \hat{\Gamma }_{j}-\Gamma _{j}\Vert _\infty =O(K^{-d})$. Let $\hat{\Gamma }=(\hat{\Gamma }_1,\ldots ,\hat{\Gamma }_s)^T$. Consider the following functional

$$\begin{aligned} \sum _i Q(\hat{m}_i+(X_i-\hat{\Gamma }(W_i,T_i))^T\nu ,Y_i), \end{aligned}$$

where $\hat{m}_i=Z_i^T\hat{a}^o+X_i^T\hat{\beta }^o$ and $\nu =(\nu _1,\ldots ,\nu _s)^T$. Obviously, the above functional is minimized by $\nu =0$ which leads to the first-order condition

$$\begin{aligned} \sum _i q_1(\hat{m}_i,Y_i)(X_i-\hat{\Gamma }(W_i,T_i))=0. \end{aligned}$$

(12)

Since

$$\begin{aligned}&\sum _i q_1(\hat{m}_i,Y_i)(\hat{\Gamma }(W_i,T_i)-\Gamma (W_i,T_i))\\&\quad =\sum _i q_1(m_{i},Y_i)(\hat{\Gamma }(W_i,T_i)-\Gamma (W_i,T_i))\\&\qquad +\sum _i q_2(.,Y_i)(\hat{m}_i-m_{i})(\hat{\Gamma }(W_i,T_i)-\Gamma (W_i,T_i))\\&\quad =O_p\left(\sqrt{n/K^{2d}}\right)+O_p(n{v_n}K^{-d})\\&\quad =o_p(\sqrt{n}), \end{aligned}$$

where $q_2(.,Y_i)$ is evaluated at some point between $\hat{m}_i$ and $m_i$, we can replace $\hat{\Gamma }$ in (12) by $\Gamma $ to get

$$\begin{aligned} \sum _i q_1(\hat{m}_i,Y_i)(X_i-\Gamma (W_i,T_i))=o_p(\sqrt{n}). \end{aligned}$$

(13)

Now we have

$$\begin{aligned}&\sum _iq_1(\hat{m}_i,Y_i)(X_i-\Gamma (W_i,T_i))\\&\quad =\sum _i\Big [q_1(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))+q_2(m_{i},Y_i)(\hat{m}_i-m_{i})(X_i-\Gamma (W_i,T_i))\\&\qquad +\,q_2^{\prime }(.,Y_i)(\hat{m}_i-m_{i})^2(X_i-\Gamma (W_i,T_i))\Big ]\\&\quad =\sum _i\Big [q_1(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))+q_2(m_{i},Y_i)\left(Z_i^T\hat{a}^o-W_i^T\alpha _0(T_i)\right)(X_i-\Gamma (W_i,T_i))\\&\qquad +\,q_2(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))^{\otimes 2}(\hat{\beta }^o-\beta _0)+q_2^{\prime }(.,Y_i)(\hat{m}_i-m_{i})^2(X_i-\Gamma (W_i,T_i))\Big ]. \end{aligned}$$

Using Theorem 1, we have $\sum _iq_2(m_{i},Y_i)(Z_i^T\hat{a}^o-W_i^T\alpha _0(T_i))(X_i-\Gamma (W_i,T_i))=o_p(\sqrt{n})$ and $\sum _iq_2^{\prime }(.,Y_i)(\hat{m}_i-m_{i})^2(X_i-\Gamma (W_i,T_i))=o_p(\sqrt{n})$. Also, it is easy to see that

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _iq_1(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))\rightarrow N(0,\Xi ), \end{aligned}$$

by the central limit theorem, and that

$$\begin{aligned} \frac{1}{n}\sum _iq_2(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))^{\otimes 2}\rightarrow \Xi , \end{aligned}$$

and the asymptotic normality of $\hat{\beta }^o$ follows. $\square $
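As a worked step, the following sketch spells out how these pieces combine. It reads the two displayed limits literally and treats $s$ as fixed; both are assumptions made here for illustration, and the precise statement is the one given in Theorem 2 of the main text. Dividing the expansion of $\sum _iq_1(\hat{m}_i,Y_i)(X_i-\Gamma (W_i,T_i))$ by $\sqrt{n}$ and using (13) together with the two $o_p(\sqrt{n})$ bounds gives

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _iq_1(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))+\left[\frac{1}{n}\sum _iq_2(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))^{\otimes 2}\right]\sqrt{n}(\hat{\beta }^o-\beta _0)=o_p(1), \end{aligned}$$

so that, by Slutsky's theorem,

$$\begin{aligned} \sqrt{n}(\hat{\beta }^o-\beta _0)&=-\left[\frac{1}{n}\sum _iq_2(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))^{\otimes 2}\right]^{-1}\frac{1}{\sqrt{n}}\sum _iq_1(m_{i},Y_i)(X_i-\Gamma (W_i,T_i))+o_p(1)\\&\rightarrow N(0,\Xi ^{-1}\Xi \Xi ^{-1})=N(0,\Xi ^{-1}). \end{aligned}$$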

The proof of Theorem 3 is based on the following proposition, which is a direct extension of Theorem 1 in Fan and Lv (2011) to the case of quasi-likelihood (but specialized to the SCAD penalty). A similar second-order sufficiency was also used in Kim et al. (2008) in linear models (see the proof of their Theorem 1). Thus the proof of the following proposition is omitted.

Proposition 1

$(a^T,\beta ^T)\in R^{qK+p}$ is a local minimizer of the SCAD-penalized quasi-likelihood (3) if

$$\begin{aligned}&\sum _iq_1(Z_i^Ta+X_i^T\beta ,Y_i)Z_{ij}=0,\quad j=1,\ldots ,q,\end{aligned}$$

(14)

$$\begin{aligned}&\sum _iq_1(Z_i^Ta+X_i^T\beta ,Y_i)X_{ij}=0 \quad \text{and}\quad |\beta _j|\ge c\lambda \quad \text{for } j=1,\ldots ,s \ (c=3.7), \end{aligned}$$

(15)

$$\begin{aligned}&\left|\sum _iq_1(Z_i^Ta+X_i^T\beta ,Y_i)X_{ij}\right|\le n\lambda \quad \text{and}\quad |\beta _j|< \lambda \quad \text{for } j=s+1,\ldots ,p, \end{aligned}$$

(16)

(16)

where $Z_{ij}=(W_{ij}B_{1}(T_{i}),\ldots ,W_{ij}B_{K}(T_i))^T\in R^K$.
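To make conditions (14)–(16) concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not the authors' implementation. `scad_derivative` uses the standard form of the SCAD penalty derivative from Fan and Li (2001), with the constant written as `c = 3.7` to match the proposition. The helper `satisfies_proposition_1`, its arguments, and the toy numbers are all hypothetical. The derivative equals $\lambda $ on $[0,\lambda ]$ and vanishes for $t\ge c\lambda $, which is why the large coefficients in (15) obey unpenalized score equations while the zero coefficients in (16) only need a score bounded by $n\lambda $.

```python
import numpy as np

def scad_derivative(t, lam, c=3.7):
    """Derivative of the SCAD penalty at |t| (standard form, Fan and Li 2001)."""
    t = np.abs(np.asarray(t, dtype=float))
    flat_part = (t <= lam).astype(float)                      # slope lam near zero
    tapered = np.maximum(c * lam - t, 0.0) / ((c - 1.0) * lam) * (t > lam)
    return lam * (flat_part + tapered)                        # zero once t >= c*lam

def satisfies_proposition_1(score_Z, score_X, beta, s, n, lam, c=3.7, tol=1e-8):
    """Check (14)-(16) given precomputed score vectors (hypothetical helper).

    score_Z: sum_i q_1(.) Z_i stacked over j, length qK
    score_X: sum_i q_1(.) X_i, length p
    beta:    candidate linear coefficients, length p, first s treated as nonzero
    """
    score_Z, score_X, beta = (np.asarray(a, dtype=float) for a in (score_Z, score_X, beta))
    cond14 = np.all(np.abs(score_Z) <= tol)
    cond15 = np.all(np.abs(score_X[:s]) <= tol) and np.all(np.abs(beta[:s]) >= c * lam)
    cond16 = np.all(np.abs(score_X[s:]) <= n * lam) and np.all(np.abs(beta[s:]) < lam)
    return bool(cond14 and cond15 and cond16)

# Toy numbers (made up for illustration): n = 100, lambda = 0.5, s = 2, p = 4.
n, lam, s = 100, 0.5, 2
beta = [3.0, -2.5, 0.0, 0.0]
ok = satisfies_proposition_1(np.zeros(6), [0.0, 0.0, 20.0, -35.0], beta, s, n, lam)
too_large = satisfies_proposition_1(np.zeros(6), [0.0, 0.0, 20.0, -60.0], beta, s, n, lam)
print(ok, too_large)
```

In the second call the score of the last inactive covariate exceeds $n\lambda =50$ in absolute value, so (16) fails.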

Proof of Theorem 3

We will show that $(\hat{a},\hat{\beta })$ with $\hat{a}=\hat{a}^o$, $\hat{\beta }^{(1)}=\hat{\beta }^o$ and $\hat{\beta }^{(2)}=0$ satisfies (14)–(16). This will immediately imply all the results stated in Theorem 3.

Denote $\hat{a}^*=(\hat{a}^o, \hat{\beta }^{(1)})$ and $a_0^*=(a_0,\beta _0^{(1)})$. It trivially holds that $\sum _iq_1(Z_i^T\hat{a}^o+X_i^T\hat{\beta }^o,Y_i)Z_{ij}=0, j=1,\ldots ,q$ and $\sum _iq_1(Z_i^T\hat{a}^o+X_i^T\hat{\beta }^o,Y_i)X_{ij}=0, j=1,\ldots ,s$ by the definition of $\hat{a}^o,\hat{\beta }^o$. Furthermore, note that $|\hat{\beta }_j|\ge c\lambda $ for $j=1,\ldots ,s$ is implied by

$$\begin{aligned}&\min _{1\le j\le s}|\beta _{0j}|\gg \lambda ,\\&|\hat{\beta }_j-\beta _{0j}|\ll \lambda , \end{aligned}$$

and both conditions above are implied by (C7) as well as Theorem 1.

For $j=s+1,\ldots ,p$, $|\hat{\beta }_j|< \lambda $ is trivial since $\hat{\beta }_j=0$. Furthermore, we have

$$\begin{aligned}&\sum _iq_1\left(Z_i^T\hat{a}^o+X_i^T\hat{\beta }^o,Y_i\right)X_{ij}\\&\quad =\sum _i\Big [q_1\left(U_i^Ta_0^*,Y_i\right)X_{ij}+q_2(z_i,Y_i)U_i^T\left(\hat{a}^*-a_0^*\right)X_{ij}\Big ]\\&\quad =\sum _iq_1\left(U_i^Ta_0^*,Y_i\right)X_{ij}\\&\qquad -\sum _{i}q_2(z_i,Y_i)X_{ij}U_i^T\left[\left(\sum _{i^{\prime }}q_2(z_{i^{\prime }},Y_{i^{\prime }})U_{i^{\prime }}U_{i^{\prime }}^T\right)^{-1}\left(\sum _{i^{\prime }}q_1(U_{i^{\prime }}^Ta_0^*,Y_{i^{\prime }})U_{i^{\prime }}\right)\right], \end{aligned}$$

(17)

where in the last step above we used (10).

Denote $e=(1,\ldots ,1)^T$, $\delta _j=(X_{1j}q_1(U_1^Ta_0^*,Y_1),\ldots ,X_{nj}q_1(U_n^Ta_0^*,Y_n))^T$, and $P=(p_{ii^{\prime }})_{n\times n}$ with $p_{ii^{\prime }}=q_2(z_i,Y_i)U_i^T(\sum _{k}q_2(z_{k},Y_{k})U_{k}U_{k}^T)^{-1}U_{i^{\prime }}$. By Taylor expansion, we can write $\delta _j=\epsilon _j+\gamma _j$ with $\epsilon _j=(X_{1j}q_1(m_{1},Y_1),\ldots ,X_{nj}q_1(m_{n},Y_n))^T$ and $\gamma _j=(X_{1j}q_2(.,Y_1)(U_1^Ta_0^*-m_{1}),\ldots ,X_{nj}q_2(.,Y_n)(U_n^Ta_0^*-m_{n}))^T$, where $m_{i}=W_i^T\alpha _0(T_i)+X_i^T\beta _0$ as in the proof of Theorem 1 and $q_2(.,Y_i)$ is evaluated at some point between $U_i^Ta_0^*$ and $m_{i}$.

Using these notations, (17) can be written as $e^T(I-P)\delta _j=e^T(I-P)\epsilon _j+e^T(I-P)\gamma _j$. In Lemma 1 below we show

$$\begin{aligned} \max _{j\ge s+1}|e^T(I-P)\epsilon _j|=O_P\left(\sqrt{n}\log (p\vee n)\right) \end{aligned}$$

and

$$\begin{aligned} \max _{j\ge s+1}|e^T(I-P)\gamma _j|=O_P(nK^{-d}). \end{aligned}$$

Thus (C5) implies $\max _{j\ge s+1}|\sum _iq_1(U_i^T\hat{a}^*,Y_i)X_{ij}|=o_P(n\lambda )$, which completes the proof. $\square $

Lemma 1

Here we show that

$$\begin{aligned} \max _{j\ge s+1}|e^T(I-P)\epsilon _j|=O_P\left(\sqrt{n}\log (p\vee n)\right) \end{aligned}$$

(18)

and

$$\begin{aligned} \max _{j\ge s+1}|e^T(I-P)\gamma _j|=O_P(nK^{-d}). \end{aligned}$$

(19)

Proof of Lemma 1

First, it is easy to see that all the eigenvalues of the matrix $P$ are bounded by $1$ (in fact each eigenvalue is either 0 or 1), and thus $\Vert e^T(I-P)\Vert \le \sqrt{n}$. Write the vector $e^T(I-P)$ as $b=(b_1,\ldots ,b_n)^T$; then $e^T(I-P)\epsilon _j$ can be written as $\sum _ib_i\epsilon _{ij}$ with $\epsilon _{ij}=X_{ij}q_1(m_{i},Y_i)$. By assumption (C6), we have

$$\begin{aligned} E|b_i\epsilon _{ij}|^m\le \frac{m!}{2}(b_iJ)^{m-2}(b_iR)^2, \end{aligned}$$

and thus

$$\begin{aligned} \frac{1}{n}\sum _iE|b_i\epsilon _{ij}|^m&\le \frac{m!}{2n}\sum _i (b_iJ)^{m-2}(b_iR)^2\\&\le \frac{m!}{2}\left(\max _i|b_i|J\right)^{m-2}\left(\sum _ib_i^2/n\right)R^2\\&\le \frac{m!}{2}(\sqrt{n}J)^{m-2}R^2, \end{aligned}$$

using that $\Vert b\Vert ^2\le n$. Thus by Theorem 8.9 (Bernstein’s inequality) in van der Geer (2000), together with a simple union bound, we get

$$\begin{aligned}&P\left(\max _{j>s}|e^T(I-P)\epsilon _j|>c\right)\\&= P\left(\max _{j>s}|\sum _ib_i\epsilon _{ij}|>c\right)\\&\le 2p\exp \left\{ -\frac{c^2}{2\sqrt{n}Jc+2nR^2}\right\} ,\quad \forall c>0. \end{aligned}$$

Thus if $c=C\sqrt{n}\log (p\vee n)$ for sufficiently large $C>0$, the above probability converges to zero, showing the validity of (18).
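As a worked step (a sketch that takes the displayed tail bound as given and assumes $\log (p\vee n)\ge 1$), substituting $c=C\sqrt{n}\log (p\vee n)$ into the exponent gives

$$\begin{aligned} \frac{c^2}{2\sqrt{n}Jc+2nR^2}&=\frac{C^2n\log ^2(p\vee n)}{2CJn\log (p\vee n)+2nR^2}\\&=\frac{C^2\log ^2(p\vee n)}{2CJ\log (p\vee n)+2R^2}\\&\ge \frac{C^2}{2(CJ+R^2)}\log (p\vee n), \end{aligned}$$

and therefore

$$\begin{aligned} 2p\exp \left\{ -\frac{c^2}{2\sqrt{n}Jc+2nR^2}\right\} \le 2(p\vee n)^{1-C^2/(2(CJ+R^2))}. \end{aligned}$$

Since $C^2/(2(CJ+R^2))$ grows linearly in $C$, it exceeds $1$ once $C$ is large enough, and the right-hand side then tends to zero as $n\rightarrow \infty $.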

For the proof of (19), we only need to note that $|e^T(I-P)\gamma _j|\le \Vert b\Vert \cdot \Vert \gamma _j\Vert =O_P(\sqrt{n}\cdot \sqrt{n}K^{-d})$ by (C2).


About this article

Cite this article

Hong, Z., Hu, Y. & Lian, H. Variable selection for high-dimensional varying coefficient partially linear models via nonconcave penalty. Metrika 76, 887–908 (2013). https://doi.org/10.1007/s00184-012-0422-8


Received: 21 May 2012
Published: 16 December 2012
Issue Date: October 2013
DOI: https://doi.org/10.1007/s00184-012-0422-8
