Abstract
In this paper, we apply the penalized quadratic inference function to perform variable selection and estimation simultaneously for generalized varying coefficient models with longitudinal data. The proposed approach is based on basis function approximations and the group SCAD penalty, and it can incorporate information on the correlation structure within the same subject to achieve an efficient estimator. Furthermore, we establish the asymptotic theory of the proposed procedure under suitable conditions, including consistency in variable selection and the oracle property in estimation. Finally, Monte Carlo simulations and a real data analysis are conducted to examine the finite sample performance of the proposed procedure.
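The selection mechanism rests on the SCAD penalty of Fan and Li (2001), applied here to the group norms of the spline coefficients. The following is a minimal numerical sketch, assuming the standard SCAD form with the conventional choice \(a = 3.7\); it is an illustration, not the authors' implementation.

```python
import math

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lambda(theta) of Fan and Li (2001), applied to
    theta = |coefficient| (or a group norm, for the group SCAD)."""
    theta = abs(theta)
    if theta <= lam:
        return lam * theta
    if theta <= a * lam:
        return -(theta**2 - 2 * a * lam * theta + lam**2) / (2 * (a - 1))
    return (a + 1) * lam**2 / 2

def scad_derivative(theta, lam, a=3.7):
    """p'_lambda(theta): equals lam near zero (shrinks small groups to
    exactly zero), tapers linearly, and vanishes beyond a*lam, so large
    nonzero groups are left unpenalized."""
    theta = abs(theta)
    if theta <= lam:
        return lam
    return max(a * lam - theta, 0.0) / (a - 1)
```

The vanishing derivative beyond \(a\lambda_n\) is what the proof of Theorem 2 exploits: once \(\left\| \varvec{\hat{\gamma }}_k^* \right\| \ge a\lambda_n\), the penalty contributes no bias to the nonzero groups, which underlies the oracle property.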
References
Antoniadis A, Gijbels I, Lambert-Lacroix S (2014) Penalized estimation in additive varying coefficient models using grouped regularization. Stat Pap 55:727–750
Cho H, Qu A (2013) Model selection for correlated data with diverging number of parameters. Stat Sin 23:901–927
Dziak JJ (2006) Penalized quadratic inference functions for variable selection in longitudinal research. Ph.D. dissertation, Pennsylvania State University, PA
Dziak JJ, Li R, Qu A (2009) An overview on quadratic inference function approaches for longitudinal data. In: Fan J, Liu JS, Lin X (eds) Frontiers of statistics, vol 1: new developments in biostatistics and bioinformatics. World Scientific Publishing, Singapore, pp 49–72
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Fan J, Li R (2006) Statistical challenges with high dimensionality: feature selection in knowledge discovery. In: Proceedings of the Madrid International Congress of Mathematicians III. pp 595–622
Fu WJ (2003) Penalized estimating equations. Biometrics 59:126–132
Hastie T, Tibshirani R (1993) Varying-coefficient models. J R Stat Soc Ser B 55:757–796
Huang JZ, Wu CO, Zhou L (2002) Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika 89:111–128
Huang JZ, Wu CO, Zhou L (2004) Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Stat Sin 14:763–788
Lian H (2012) Variable selection for high-dimensional generalized varying-coefficient models. Stat Sin 22:1563–1588
Liang KY, Zeger SL (1986) Longitudinal data analysis using generalised linear models. Biometrika 73:12–22
McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman & Hall, London
Noh H, Chung K, Van Keilegom I (2012) Variable selection of varying coefficient models in quantile regression. Electron J Stat 6:1220–1238
Noh H, Park B (2010) Sparse varying coefficient models for longitudinal data. Stat Sin 20:1183–1202
Qu A, Li R (2006) Quadratic inference functions for varying-coefficient models with longitudinal data. Biometrics 62:379–391
Qu A, Lindsay BG, Li B (2000) Improving generalised estimating equations using quadratic inference functions. Biometrika 87:823–836
Tang Q, Cheng L (2012) Componentwise B-spline estimation for varying coefficient models with longitudinal data. Stat Pap 53:629–652
Tang Y, Wang H, Zhu Z (2013) Variable selection in quantile varying coefficient models with longitudinal data. Comput Stat Data Anal 57:435–449
Wang L (2011) GEE analysis of clustered binary data with diverging number of covariates. Ann Stat 39:389–417
Wang L, Li H, Huang J (2008) Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. J Am Stat Assoc 103:1556–1569
Wang L, Zhou J, Qu A (2012) Penalized generalized estimating equations for high-dimensional longitudinal data analysis. Biometrics 68:353–360
Xu D, Zhang Z, Wu L (2014) Variable selection in high-dimensional double generalized linear models. Stat Pap 55:327–347
Xue L, Qu A (2012) Variable selection in high-dimensional varying coefficient models with global optimality. J Mach Learn Res 13:1973–1998
Xue L, Qu A, Zhou J (2010) Consistent model selection for marginal generalized additive model for correlated data. J Am Stat Assoc 105:1518–1530
Acknowledgments
The authors are very grateful to the editor, associate editor, and two anonymous referees for their detailed comments on the earlier version of the manuscript, which led to a much improved paper. This work is supported by the National Natural Science Foundation of China (Grant No. 11171361) and Ph.D. Programs Foundation of Ministry of Education of China (Grant No. 20110191110033).
Appendix A
Lemma 1
Under conditions (C1)–(C4) and \({L_n} = {O_p}({n^{{1/{(2r + 1)}}}})\), there exist a spline coefficient vector \({\varvec{\gamma } ^0} = {(\varvec{\gamma } _1^{0T},\ldots ,\varvec{\gamma } _{{p}}^{0T})^T}\) and a positive constant \({C_1}\) such that \(\mathop {\sup }_{t \in [0,1]} \left| {{\beta _k}(t) - \varvec{\pi }^{(k)} {{(t)}^T}\varvec{\gamma } _k^0} \right| \le {C_1}{L _n^{-r}}.\) Let \({r_{nij}} = \varvec{\pi } _{ij}^{(k)T}\varvec{\gamma } _k^0 - {\beta _k}({t_{ij}})\); it is then easy to see that \({\max _{i,j}}\left| {{r_{nij}}} \right| \le {C_1}{L _n^{-r}} \).
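The \(L_n^{-r}\) approximation rate in Lemma 1 can be seen numerically. The following is a hypothetical sketch (not from the paper) using the crudest spline, a piecewise constant with \(L_n\) equal pieces, corresponding to \(r = 1\): doubling \(L_n\) roughly halves the sup-norm approximation error.

```python
import math

def sup_error(f, Ln, grid=400):
    """Sup-norm error of the best piecewise-constant approximation of f
    on [0, 1] with Ln equal pieces, estimated on a fine grid per piece."""
    err = 0.0
    for k in range(Ln):
        lo, width = k / Ln, 1.0 / Ln
        vals = [f(lo + width * j / grid) for j in range(grid + 1)]
        # the best constant on a piece is the midrange of f there
        c = (min(vals) + max(vals)) / 2
        err = max(err, max(abs(v - c) for v in vals))
    return err

f = lambda t: math.sin(2 * math.pi * t)  # a smooth coefficient function
e10, e20 = sup_error(f, 10), sup_error(f, 20)
# e20 is roughly e10 / 2, consistent with the O(L_n^{-r}) bound for r = 1
```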
Lemma 2
Under conditions (C1)–(C4) and \({L_n} = {O_p}({n^{{1/{(2r + 1)}}}})\), the eigenvalues of \(\frac{{{L_n}{\varvec{H}_n}}}{N}\) are uniformly bounded away from 0 and \(\infty \) in probability, where \({\varvec{H}_n} = ({\varvec{\pi } _{11}},\ldots ,{\varvec{\pi } _{n{J_n}}}){({\varvec{\pi } _{11}},\ldots ,{\varvec{\pi } _{n{J_n}}})^T}\).
Lemma 3
Under the conditions of Theorem 1, the eigenvalues of \({C}_n^0\) are bounded away from 0 and \(\infty \) when \(n\) is large enough. Furthermore, let \({\Theta _n}(C) = \left\{ \varvec{\gamma } :{{\left\| {(\varvec{\gamma } - {\varvec{\gamma }^0})} \right\| }} =\right. \left. {{C {L_n} }/{\sqrt{n} }} \right\} \) for some sufficiently large \(C\). Then, for any \(\varvec{\gamma } \in {\Theta _n}(C)\), \(\left\| {{G}_n^0(\varvec{\gamma } )} \right\| = {O_p}({{\sqrt{{L_n}} }/{\sqrt{n} }})\) and \({Q}_n^0(\varvec{\gamma } ) = {O_p}({L_n})\), where \({G}_n^0(\varvec{\gamma } )\), \({Q}_n^0(\varvec{\gamma } )\) and \({C}_n^0\) are evaluated at \(\varvec{\mu } = {\varvec{\mu } ^0}\).
Lemma 4
Under the conditions of Theorem 1, for some \(C\) sufficiently large, one has
Lemmas 1 and 2 are similar to results in Tang et al. (2013), and Lemmas 3 and 4 follow from Xue et al. (2010), since, when splines are used, varying-coefficient models are almost identical to additive models in their asymptotic theory.
Proof of Theorem 1
The result of Theorem 1 (a) follows from Qu and Li (2006), so we omit its proof. We now prove Theorem 1 (b). Note that
By Theorem 1 (a) and Lemma 2, we have
According to Lemma 1, we also have \({\left\| {\varvec{\pi } {(t)^T}{\varvec{\gamma } ^0} - {\varvec{\beta }}(t)} \right\| } = {O_p}\left( {L_n^{ - r}} \right) \). As a result, \({\left\| {\varvec{\tilde{\beta }} (t) - \varvec{\beta }(t)} \right\| } = {O_p}\left\{ {{n^{{{ - r}/{(2r + 1)}}}}} \right\} \). This completes the proof. \(\square \)
Proof of Theorem 2
First, we prove Theorem 2 (a). Let \({\varvec{\hat{\gamma }}^*} = \mathop {\arg \min }_{\varvec{\gamma } \!=\! {{(\varvec{\gamma } _1^T,\ldots ,\varvec{\gamma }_s^T,{\varvec{0}^T},\ldots ,{\varvec{0}^T})}^T}} {Q_n}(\varvec{\gamma } ),\) which yields the spline QIF estimator of the first \(s\) components when the remaining components are known to be zero. As a special case of Theorem 1, we have
We want to show that, for large \(n\) and any \(\varepsilon > 0\), there exists a constant \(C\) large enough such that
This implies that \({S_n}(\cdot )\) has a local minimum in the ball \(\left\{ {\varvec{\gamma } :{{\left\| {(\varvec{\gamma } - {\varvec{\hat{\gamma }}^*})} \right\| }}} \right. \) \({\left. { \le {{C {L_n}}/{\sqrt{n} }}} \right\} }\). Thus \({\left\| {({\varvec{\hat{\gamma }}} - {\varvec{\hat{\gamma }}^*})} \right\| } = {O_p}( {L_n}/{\sqrt{n}})\). Further, the triangle inequality gives \( {\left\| {{\varvec{\pi }^T}{\varvec{\hat{\gamma }}} - \varvec{\beta } } \right\| } \le {\left\| {{\varvec{\pi }^T}({\varvec{\hat{\gamma }}} - {\varvec{\hat{\gamma }}^*})} \right\| } + {\left\| {{\varvec{\pi }^T}({\varvec{\hat{\gamma }}^*} - {\varvec{\gamma } ^0})} \right\| } + {\left\| {{\varvec{\pi }^T}{\varvec{\gamma } ^0} - \varvec{\beta } } \right\| } ={O_p}\left\{ {{{({{{L_n}}/{n}})}^{{1/2}}}} \right\} \). To show (A.4), using \({p_{{\lambda _n}}}(0) = 0\) and \({p_{{\lambda _n}}}(\cdot ) \ge 0\), we have
From (A.3), it follows that \({\left\| {\varvec{\hat{\gamma }}^* - {\varvec{\gamma }^0}} \right\| } = {O_p}({{{L_n}}/{\sqrt{n} }})\). Since \(\varvec{\gamma }_k ^0=\mathbf 0 \) for \(k=s+1,\ldots ,p\), from (10) we have
Assume that \({\left\| {{\varvec{\gamma }} - {\varvec{\hat{\gamma }}^*}} \right\| } = {O_p}( {L_n}/{\sqrt{n}})\). From (A.6), it follows that \({\left\| {\varvec{\hat{\gamma }}_k^* } \right\| } \ge a{\lambda _n}\) for \(k = 1,\ldots ,s\) with probability tending to one, where \(a\) appears in the definition of \({p_{{\lambda _n}}}(\cdot )\). This means that, with probability tending to one, \({p'_{{\lambda _n}}}(\left\| {\varvec{\hat{\gamma }} _k^*} \right\| ) = 0\) for all \(k=1,\ldots ,s\). Since \({\left\| {{\varvec{\gamma }_k} } \right\| }-{\left\| { \varvec{\hat{\gamma }}_k^* } \right\| }\le {\left\| {{\varvec{\gamma }} - {\varvec{\hat{\gamma }}^*}} \right\| } = {O_p}( {L_n}/{\sqrt{n}})=o_p(1)\), it follows from the definition of \({p_{{\lambda _n}}}(\cdot )\) that
where \( {\varvec{\hat{\gamma }} _k^{**}}\) is a value between \( {\varvec{\hat{\gamma }} _k^{*}}\) and \( {\varvec{\gamma } _k}\). Furthermore,
with \({\nabla ^T}{Q_n}({\varvec{\hat{\gamma }}^*})\) and \({\nabla ^2}{Q_n}({\varvec{\hat{\gamma }} ^*})\) being the gradient vector and Hessian matrix of \({Q_n}\), respectively. Following Qu et al. (2000) and Lemmas 3 and 4, for any \(\varvec{\gamma }\) with \({\left\| {\varvec{\gamma } - {{\varvec{\hat{\gamma }} }^*}} \right\| } \le {{C {L_n}}/{\sqrt{n} }}\), let \(\varvec{u} = (\varvec{\gamma } - {\varvec{\hat{\gamma }} ^*})\) and set \(\left\| \varvec{ u }\right\| = C\); we have
where \(\nabla {G_n}({{\varvec{\hat{\gamma }} }^*})\) is the first-order derivative of \({G_n}\). From (A.5)–(A.10), by choosing \(\left\| \varvec{u} \right\| = C\) sufficiently large, (A.4) holds for \(n\) sufficiently large. Combined with (A.3), this gives \(\left\| {\varvec{\hat{\gamma }} - {\varvec{\gamma } ^0}} \right\| = {O_p}\left( {{{{L_n}}/{\sqrt{n} }}} \right) \). Furthermore, combining this with Lemmas 1 and 2, we have
Since \(\left\| {\varvec{ \hat{\gamma }} - {\varvec{\gamma } ^0}} \right\| ^2 = {O_p}({n^{ - 1}}L_n^2)\), we also have \(\left\| {{\varvec{\hat{\gamma }} _k} - \varvec{\gamma } _k^0} \right\| ^2 = {O_p}({n^{ - 1}}L_n^2)\). By Lemma 1 and the condition \({L_n} = {O}({n^{{1/{(2r + 1)}}}})\), we complete the proof of part (a) of Theorem 2.
Now we prove Theorem 2 (b). Suppose that there exists \({k_0}\) with \(s + 1 \le {k_0} \le {p}\) such that the probability of \({\hat{\beta }_{{k_0}}}(t)\) being the zero function does not converge to one. Then there exists \(\eta > 0\) such that, for infinitely many \(n\), \(P({\varvec{\hat{\gamma }} _{{k_0}}} \ne \mathbf 0 ) = P({\hat{\beta }_{{k_0}}}(t) \ne 0) \ge \eta .\) Let \({\varvec{\hat{\gamma }} ^*}\) be the vector obtained from \(\varvec{\hat{\gamma }} \) with \({\varvec{\hat{\gamma }}_{{k_0}}}\) replaced by \(\varvec{0}\). It will be shown that there exists a \(\delta > 0\) such that \({S_n}(\varvec{\hat{\gamma }} ) - {S_n}({\varvec{\hat{\gamma }} ^*}) > \delta \) with probability at least \(\eta \) for infinitely many \(n\), which contradicts the fact that \({S_n}(\varvec{\hat{\gamma }}) - {S_n}({\varvec{\hat{\gamma }}^*}) \le 0\).
where
and \(w\) is a value between 0 and \({\left\| {{{\varvec{\hat{\gamma }} }_{{k_0}}}} \right\| }\). Since \({{\sqrt{n} {\lambda _n}}/{\sqrt{{L_n}} }} \rightarrow \infty \), we have \(\mathrm{{pli}}{\mathrm{{m}}_{n \rightarrow \infty }}\frac{{{R_n}}}{{{\lambda _n}}} = 0\), whereas \(\mathop {\lim \inf }_{n \rightarrow \infty } \mathop {\lim \inf }\nolimits _{w \rightarrow {0^ + }} \frac{{{{p'}_{{\lambda _n}}}(w)}}{{{\lambda _n}}} = 1\), which contradicts \({S_n}(\varvec{\hat{\gamma }}) - {S_n}({\varvec{\hat{\gamma }}^*}) \le 0\). This completes the proof of part (b) of Theorem 2.
Finally, we prove Theorem 2 (c). By Theorem 2 (a) and (b), with probability tending to one, \(\varvec{\hat{\gamma }} = {(\varvec{\hat{\gamma }} _a^T,\mathbf{0 ^T})^T}\) is a local minimizer of \(S_n(\varvec{\gamma })\). Thus, by the definition of \(S_n(\varvec{\gamma })\),
From (10) we have \(\left\| {\varvec{\hat{\gamma }} _k}\right\| ={O_p}(L_n^{1/2})\) for \(k=1,\ldots ,s\). Hence \(\left\| {{{\varvec{\hat{\gamma }} }_k}} \right\| > a{\lambda _n}\) for \(k=1,\ldots ,s\) with probability tending to one, so the second part of the above equation is \(\varvec{0}\). Thus, \(\varvec{0} = \frac{{\partial Q_n(\varvec{\gamma } )}}{{\partial {\varvec{\gamma }_k}}}\left| {_{\varvec{\gamma } = {{(\varvec{\hat{\gamma }} _a^T,{\varvec{0}^T})}^T}}} \right. \), which implies
Applying Theorem 1 (a), we can easily obtain the result. \(\square \)
Yang, H., Guo, C. & Lv, J. Variable selection for generalized varying coefficient models with longitudinal data. Stat Papers 57, 115–132 (2016). https://doi.org/10.1007/s00362-014-0647-x