Abstract
Varying coefficient models are important tools for exploring the hidden structure between a response variable and its predictors. However, variable selection and identification of the varying coefficients of such models are poorly understood. In this paper, we develop a novel method that overcomes these difficulties by combining local polynomial smoothing with the SCAD penalty. Under some regularity conditions, we show that the proposed procedure is consistent in separating the varying coefficients from the constant ones, and that the resulting estimator can be as efficient as the oracle. Simulation results confirm our theory. Finally, we apply the proposed method to the Boston housing data.
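For readers unfamiliar with the SCAD penalty mentioned above, the following minimal Python sketch implements the standard penalty of Fan and Li (2001) and its derivative; the function names and the use of NumPy are illustrative only and are not part of the paper.

import numpy as np

def scad_penalty(theta, lam, a=3.7):
    # SCAD penalty p_lambda(|theta|) of Fan and Li (2001); a = 3.7 is the value they recommend.
    t = np.abs(np.asarray(theta, dtype=float))
    return np.where(
        t <= lam,
        lam * t,
        np.where(
            t <= a * lam,
            (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1)),
            (a + 1) * lam ** 2 / 2,
        ),
    )

def scad_derivative(theta, lam, a=3.7):
    # p'_lambda(t) = lam * { I(t <= lam) + (a*lam - t)_+ / ((a-1)*lam) * I(t > lam) }, for t >= 0.
    t = np.abs(np.asarray(theta, dtype=float))
    return lam * ((t <= lam) + np.maximum(a * lam - t, 0.0) / ((a - 1) * lam) * (t > lam))

The penalty is linear near the origin (which produces exact zeros), concave in between, and constant beyond \(a\lambda \) (which leaves large coefficients essentially unpenalized).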
References
Breiman L (1995) Better subset selection using nonnegative garrote. Technometrics 37:373–384
Candes E, Tao T (2007) The Dantzig selector: statistical estimation when p is much larger than n. Ann Stat 35:2313–2351
Fan J-Q, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Fan J-Q, Huang T (2005) Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11:1031–1057
Fan J-Q, Zhang J-T (2000) Two-step estimation of functional linear models with applications to longitudinal data. J R Stat Soc Ser B 62:303–322
Fan J-Q, Zhang W-Y (2008) Statistical methods with varying coefficient models. Stat Interface 1:179–195
Hastie T-J, Tibshirani R-J (1993) Varying-coefficient models. J R Stat Soc B 55:757–796
Härdle W, Liang H, Gao J-T (2000) Partially linear models. Springer, Heidelberg
Noh H, Van Keilegom I (2012) Efficient model selection in semivarying coefficient models. Electron J Stat 6:2519–2534
Hu T, Xia Y-C (2012) Adaptive semi-varying coefficient model selection. Stat Sin 22:575–599
Leng C-L (2009) A simple approach for varying-coefficient model selection. J Stat Plan Inference 139:2138–2146
Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36:261–286
Mack Y-P, Silverman B-W (1982) Weak and strong uniform consistency of kernel regression estimates. Z Wahrsch verw Gebiete 61:405–415
Tang Y-L, Wang H, Zhu Z-Y, Song X-Y (2012) A unified variable selection approach for varying coefficient models. Stat Sin 22:601–628
Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K (2005) Sparsity and smoothness via the fused lasso. J R Stat Soc B 67:91–108
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58:267–288
Wang H-S, Xia Y-C (2009) Shrinkage estimation of the varying coefficient model. J Am Stat Assoc 104:747–757
Xia Y-C, Li W-K (1999) On the estimation and testing of functional coefficient linear models. Stat Sin 9:735–757
Zhang C (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38:894–942
Zhang W-Y, Lee S-Y, Song X-Y (2002) Local polynomial fitting in semivarying coefficient model. J Multivar Anal 82:166–188
Zhao P-X, Xue L-G (2009) Variable selection for semiparametric varying coefficient partially linear models. Stat Probab Lett 79:2148–2157
Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101:1416–1429
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc B 67:301–320
Acknowledgments
The authors would like to thank the Editor and the referee for their careful reading and for their comments, which greatly improved the paper. We also thank Bingyi Jing and Hansheng Wang for beneficial discussions, and Yanlin Tang for sending the R code for the procedures proposed in their papers.
Additional information
This work was supported by the Program for New Century Excellent Talents in University (NCET-12-0536).
Appendix: Assumptions and proofs
To study the asymptotic properties of the proposed method, we write \(H=(A,B)=(\beta (U_{1}), \ldots , \beta (U_{n}))^{T}\). The following standard regularity conditions are needed (Fan and Huang 2005).
(C1) For some \(s>2\), \(E|Y_{i}|^{2s}<\infty \) and \(E||\mathbf {X}_{i}||^{2s}<\infty \).
(C2) The density function of \(U_{i}, f(u)\), is continuous and positively bounded away from 0 on \([0,1]\).
(C3) The matrix \(\Omega (u)=E(\mathbf {X}_{i}\mathbf {X}_{i}^{T}|U_{i}=u)\) is nonsingular and has bounded second order derivatives on \([0,1]\). The function \(E(||\mathbf {X}_{i}||^{4}|U_{i}=u)\) is also bounded.
(C4) The second order derivatives of \(f(u)\) and \(\sigma ^{2}(u)=E(\varepsilon _{i}^{2}|U_{i}=u)\) are bounded.
(C5) \(K(u)\) is a symmetric density function with a compact support.
(C6) The second order derivatives of coefficients \(a_{j}(u),j=1,\ldots , p\), are continuous.
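The model itself is not restated in this appendix. For orientation, the following is a hedged reconstruction, consistent with the abstract and with the design vector \(D_{u_{t},i}\) used in the proofs below, rather than a verbatim quote of the paper:
\[
Y_{i}=\mathbf {X}_{i}^{T}a(U_{i})+\varepsilon _{i},\qquad a(u)=(a_{1}(u),\ldots ,a_{p}(u))^{T},
\]
where some of the coefficient functions \(a_{j}(\cdot )\) genuinely vary with \(u\) and the remaining ones are constant. A local linear expansion \(a_{j}(U_{i})\approx a_{j}(u)+a_{j}'(u)(U_{i}-u)\) then suggests reading \(\beta (u)=(\mathbf {a}(u)^{T},\mathbf {b}(u)^{T})^{T}\in \mathbf {R}^{2p}\), with \(\mathbf {b}(u)\) proportional to the derivative \(a'(u)\) (up to the sign convention implicit in \(D_{u_{t},i}\)); a zero column of \(A\) then corresponds to a coefficient that is identically zero, and a zero column of \(B\) corresponds to a constant coefficient.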
Note that (C2) guarantees that the maximal distance between two consecutive index values is \(O_{p}(\log n/n)\). For an arbitrary index value \(u\in [0,1]\), let \(u^{*}\) be its nearest neighbor among the observed index values, i.e., \(u^{*}=\arg \min _{\bar{u}\in \{U_{t}:1\le t\le n\}}|u-\bar{u}|\). Under the smoothness assumption (C6), we then also have \(||\beta (u)-\beta (u^{*})||=O_{p}(\log n/n)\), an order substantially smaller than the optimal nonparametric convergence rate \(n^{-2/5}\). Practically, this means that the observed index values are sufficiently dense on the support; thus, it suffices to approximate the entire coefficient curve \(\beta (u)\) by \(\{\beta (U_{t}):1\le t\le n\}\).
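The \(O_{p}(\log n/n)\) maximal spacing is easy to check numerically. The short Monte Carlo sketch below (illustrative only; it assumes uniformly distributed index values, which satisfy (C2)) compares the average maximal gap between consecutive sorted index values with \(\log n/n\).

import numpy as np

rng = np.random.default_rng(0)

def max_spacing(n, reps=200):
    # Average (over Monte Carlo replications) of the largest gap between
    # consecutive order statistics of n Uniform(0, 1) index values.
    gaps = []
    for _ in range(reps):
        u = np.sort(rng.uniform(size=n))
        gaps.append(np.max(np.diff(u)))
    return np.mean(gaps)

for n in (100, 1000, 10000):
    print(n, max_spacing(n), np.log(n) / n)

As \(n\) grows, the two quantities stay of the same order, in line with the claim above.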
Lemma 1
Suppose \((\xi _{i},U_{i}), i=1,\ldots ,n\), are i.i.d. random vectors, where the \(\xi _{i}\) are scalar random variables. Suppose further that \(E|\xi _{i}|^{s}<\infty \) and \(\sup _{u}\int |v|^{s}f(u,v)\,dv<\infty \), where \(f\) denotes the joint density of \((U_{i},\xi _{i})\). Let \(K\) be a bounded positive function with bounded support satisfying a Lipschitz condition. Then
provided \(n^{2\delta -1}h\rightarrow \infty \) for some \(\delta <1-s^{-1}\).
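The displayed bound of Lemma 1 is omitted above. In the standard form given by Mack and Silverman (1982) and Fan and Zhang (2000), and with the notation adapted to the present setting, it reads
\[
\sup _{u\in [0,1]}\left| \frac{1}{n}\sum _{i=1}^{n}\Big [K_{h}(U_{i}-u)\xi _{i}-E\big \{K_{h}(U_{i}-u)\xi _{i}\big \}\Big ]\right| =O_{p}\left( \left\{ \frac{\log (1/h)}{nh}\right\} ^{1/2}\right) .
\]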
The proof of this lemma can be found in Mack and Silverman (1982) or Fan and Zhang (2000).
Lemma 2
If (C1)–(C6) hold, and \(nh^{-1/2}a_{1n}\rightarrow 0\) and \(nh^{-1/2}b_{1n}\rightarrow 0\), then we must have
Proof
For an arbitrary matrix \(G=(g_{ij})\), write \(||G||^{2}=\sum g_{ij}^{2}\). We use \(S=(s_{ij})\in \mathbf {R}^{n\times 2p}\) to denote an arbitrary \(n\times 2p\) matrix with rows \(\mathbf {s}_{1}^{T},\ldots ,\mathbf {s}_{n}^{T}\) and columns \(\mathbf {v}_{1},\ldots ,\mathbf {v}_{2p}\). Let \(H_{0}=(\beta _{0}(U_{1}),\ldots ,\beta _{0}(U_{n}))^{T}\) with columns \(\mathbf {h}_{01},\ldots ,\mathbf {h}_{0,2p}\). By Fan and Li (2001), it suffices to show that for any \(\varepsilon >0\), we can find a constant \(C>0\) such that
By definition of \(Q_{\lambda }(H)\), we have
where \(D_{u_{t},i}=(\mathbf {X}_{i}^{T},(U_{t}-U_{i})\mathbf {X}_{i}^{T})^{T}\). By a simple algebraic calculation and the fact that \(||\mathbf {h}_{0j}||=0\) for any \(p_{1}<j\le p\) and any \(p+p_{0}<j\le 2p\), we have
where \(\hat{{\varSigma }}(U_{t})=n^{-1}\sum _{i=1}^{n}D_{u_{t},i}D_{u_{t},i}^{T}K_{h}(U_{t}-U_{i})\) and \(\hat{\mathbf {e}}_{t}=n^{-1/2}h^{1/2}\sum _{i=1}^{n}D_{u_{t},i}\left( D_{u_{t},i}^{T}[\beta (U_{t}) -\beta (U_{i})]+\varepsilon _{i}\right) K_{h}(U_{t}-U_{i})\). Let \(\hat{\lambda }_{t}^{min}\) be the smallest eigenvalue of \(\hat{{\varSigma }}(U_{t})\), let \(\hat{\lambda }_{min}=\min \{\hat{\lambda }_{t}^{min}: t=1,\ldots ,n\}\), and let \(\hat{\mathbf {e}}=(\hat{\mathbf {e}}_{1},\ldots ,\hat{\mathbf {e}}_{n})^{T}\in \mathbf {R}^{n\times 2p}\). Then we have
By the condition \(n^{-1}||S||^{2}=C^{2}\), we have
After some algebraic calculations, we have \(n^{-1}||\hat{\mathbf {e}}||=O_{p}(1)\). By Lemma 1 and (C3), \(\hat{\lambda }_{min}\rightarrow \lambda ^{min}_{0}\) in probability, where \(\lambda ^{min}_{0}=\inf _{u\in [0,1]}\lambda _{min}(f(u)\Omega (u))\) and \(\lambda _{min}(A)\) stands for the minimal eigenvalue of an arbitrary positive definite matrix \(A\). By (C2), (C3), and Lemma 1, we have \(\lambda ^{min}_{0}>0\). Consequently, the last term in (5) is dominated by the first two terms, because in the last term \(nh^{-1/2}(a_{1n}+b_{1n})\rightarrow 0\). Lastly, we note that the first term in (5) is quadratic in \(C\) while the second term is linear in \(C\). As long as \(C\) is sufficiently large, the right-hand side of (5) is guaranteed to be positive with probability arbitrarily close to 1. This proves (4), and the proof is complete. \(\square \)
Proof of Theorem 1
(1) We only need to prove that \(P(||\hat{\mathbf {b}}_{\lambda ,j}||=0)\rightarrow 1\) for \(j=p\); the proofs for \(p_{0}<j<p\) are similar. If the claim is not true (i.e., \(||\hat{\mathbf {b}}_{\lambda ,j}||\ne 0\)), then \(\hat{\mathbf {b}}_{\lambda ,j}\) must be the solution of the following normal equation
where \(\alpha _{1}\) is a n-dimensional vector with its \(t\)th component given by
and
By standard arguments of kernel smoothing, and applying Lemmas 1 and 2, we have \(||\alpha _{1}||=O_{p}(nh^{-1/2})\). On the other hand, under the conditions of the theorem, \(nh^{-1/2}||\alpha _{2}||\ge nh^{-1/2} b_{2n} \rightarrow \infty \). This implies that \(P(||\alpha _{1}||<||\alpha _{2}||)\rightarrow 1\). Consequently, with probability tending to one, the normal equation (6) cannot hold, which implies that \(\hat{\mathbf {b}}_{\lambda ,j}\) must be located at a point where the objective function \(Q_{\lambda }(H)\) is not differentiable. Since the only point where \(Q_{\lambda }(H)\) is not differentiable with respect to \(\mathbf {b}_{p}\) is the origin, we conclude that \(P(||\hat{\mathbf {b}}_{\lambda ,j}||=0)\rightarrow 1\).
Similarly, we can prove the second part of the theorem. This completes the proof. \(\square \)
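The sparsity argument above hinges on the singularity of the SCAD penalty at the origin. A univariate illustration may be helpful: for the penalized least-squares problem \(\min _{\theta }\{\frac{1}{2}(z-\theta )^{2}+p_{\lambda }(|\theta |)\}\), the SCAD solution of Fan and Li (2001) sets small inputs exactly to zero. The Python sketch below implements that standard thresholding rule componentwise; it is an illustration of the mechanism, not the estimation algorithm of this paper.

import numpy as np

def scad_threshold(z, lam, a=3.7):
    # Minimizer of 0.5*(z - theta)^2 + p_lambda(|theta|) for the SCAD penalty
    # (Fan and Li 2001); inputs with |z| <= lam are mapped exactly to zero.
    z = np.asarray(z, dtype=float)
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)          # used when |z| <= 2*lam
    mid = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)   # used when 2*lam < |z| <= a*lam
    return np.where(az <= 2 * lam, soft, np.where(az <= a * lam, mid, z))

print(scad_threshold(np.array([-3.0, -0.4, 0.2, 0.9, 2.5]), lam=0.5))

With \(\lambda =0.5\) the inputs \((-3,-0.4,0.2,0.9,2.5)\) are mapped to \((-3,0,0,0.4,2.5)\): small values are set exactly to zero, moderate values are shrunk, and large values are left unpenalized.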
Proof of Theorem 2
By Theorem 1, we know that \(\hat{\mathbf {a}}_{\lambda ,j}=0\) for \(p_{1}<j\le p\) and \(\hat{\mathbf {b}}_{\lambda ,j}=0\) for \(p_{0}<j\le p\), with probability tending to one. Consequently, \(\hat{\mathbf {a}}_{a,\lambda }(u)\) must be the solution of the following normal equation
where \(L=\left( p'_{\lambda _{11}}(||\hat{\mathbf {a}}_{1,\lambda }||)\frac{\hat{\mathbf {a}}_{1}(u)}{||\hat{\mathbf {a}}_{1,\lambda }||},\ldots ,p'_{\lambda _{1p_{0}}}(||\hat{\mathbf {a}}_{p_{0},\lambda }||)\frac{\hat{\mathbf {a}}_{p_{0}}(u)}{||\hat{\mathbf {a}}_{p_{0},\lambda }||}\right) ^{T}\). This implies that \(\hat{\mathbf {a}}_{a,\lambda }\) is of the form
where \({\varSigma }_{1}(u)=n^{-1}\sum _{i=1}^{n}\mathbf{{X}}_{ia}\mathbf{{X}}_{ia}^{T}K_{h}(U_{i}-u)\). Comparing with the oracle estimator, we know that
where \({\varSigma }_{2}(u)=n^{-1}\sum _{i=1}^{n}\mathbf{{X}}_{ia}\mathbf{{X}}_{ia}^{T}(U_{i}-u)K_{h}(U_{i}-u)\), \({\varSigma }_{3}(u)= n^{-1}\sum _{i=1}^{n}\mathbf{{X}}_{ib}\mathbf{{X}}_{ib}^{T}K_{h}(U_{i}-u)\), \(\lambda _{1,min}=\min \{\lambda _{min}({\varSigma }_{1}(u)): u\in [0,1]\}\), \(\lambda _{2,max}=\max \{\lambda _{max}({\varSigma }_{2}(u)): u\in [0,1]\}\), and \(\lambda _{3,max}=\max \{\lambda _{max}({\varSigma }_{3}(u)): u\in [0,1]\}\). For \(J_{1}\), applying Lemma 1, we have \(J_{1}\le C \sqrt{p_{0}}\, a_{1n}=o_{p}(n^{-2/5})\). By Lemma 2, we have \(J_{2}=o_{p}(n^{-2/5})\) and \(J_{3}=o_{p}(n^{-2/5})\). This completes the proof. \(\square \)
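For context, the benchmark rate \(n^{-2/5}\) used above is the usual optimal nonparametric rate for twice-differentiable coefficient functions; this is a standard kernel-smoothing calculation rather than a statement from the paper. With bandwidth \(h\asymp n^{-1/5}\), the bias and the standard deviation of a local linear fit balance:
\[
\text{bias}=O(h^{2})=O(n^{-2/5}),\qquad \text{standard deviation}=O\big ((nh)^{-1/2}\big )=O(n^{-2/5}),
\]
so terms that are \(o_{p}(n^{-2/5})\), such as \(J_{1}\), \(J_{2}\), and \(J_{3}\) above, are negligible relative to the oracle estimator's leading terms.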