Statistical Papers, Volume 55, Issue 3, pp 727–750

Penalized estimation in additive varying coefficient models using grouped regularization

  • A. Antoniadis
  • I. Gijbels
  • S. Lambert-Lacroix
Regular Article


Additive varying coefficient models are a natural extension of multiple linear regression models, allowing the regression coefficients to be functions of other variables. These models are therefore flexible enough to capture more complex dependencies in the data. In this paper we consider the problem of automatically selecting the significant variables among a large set of variables when interest lies in a given response variable. In recent years several grouped regularization methods have been proposed, and in this paper we present them under one unified framework in the varying coefficient model context. For each of the grouped regularization methods discussed, we investigate the optimization problem to be solved, possible algorithms for solving it, and the variable selection and estimation consistency of the method. We investigate the finite-sample performance of these methods in a comparative study and illustrate them on real data examples.
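
As a rough illustration of the setting described above, the display below sketches one common way to write an additive varying coefficient model and a grouped Lasso criterion for it. The notation is an assumption made for this sketch and is not taken from the paper: \(Y\) is the response, \(X_j\) the covariates, \(U_j\) the variables on which the coefficient functions \(\beta_j\) depend, \(B_{jk}\) a generic spline basis with \(d_j\) functions, and \(\lambda\) a tuning parameter. The penalty shown is the standard grouped Lasso form, which is only one of the grouped regularization methods in this family.

```latex
% Additive varying coefficient model (notation assumed for illustration)
\[
  Y = \beta_0 + \sum_{j=1}^{p} \beta_j(U_j)\, X_j + \varepsilon .
\]
% Basis expansion of each coefficient function (B_{jk} is a hypothetical spline basis)
\[
  \beta_j(u) \approx \sum_{k=1}^{d_j} \gamma_{jk}\, B_{jk}(u),
  \qquad \gamma_j = (\gamma_{j1}, \dots, \gamma_{j d_j})^{\top} .
\]
% Grouped Lasso criterion: each coefficient vector gamma_j is penalized as one group
\[
  (\hat{\beta}_0, \hat{\gamma}) = \arg\min_{\beta_0,\, \gamma}\;
  \frac{1}{2} \sum_{i=1}^{n}
  \Bigl( Y_i - \beta_0 - \sum_{j=1}^{p} \sum_{k=1}^{d_j} \gamma_{jk}\, B_{jk}(U_{ij})\, X_{ij} \Bigr)^{2}
  + \lambda \sum_{j=1}^{p} \sqrt{d_j}\, \lVert \gamma_j \rVert_2 .
\]
```

Under this sketch, the variable \(X_j\) is dropped from the model exactly when the whole group \(\hat{\gamma}_j\) is estimated as zero, which is how grouping the basis coefficients turns regularization into variable selection.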


Grouped Lasso regularization; Multiple linear regression models; Variables selection; Varying coefficient models



The authors thank the editor and two reviewers for their detailed reading of the manuscript and their valuable comments and suggestions that led to a considerable improvement of the paper. Support from the IAP Research Network nr. P6/03 and P7/06 of the Federal Science Policy, Belgium, is acknowledged. The second author also gratefully acknowledges financial support by the projects GOA/07/04 and GOA/12/014 of the Research Fund KULeuven and the FWO-Project G.0328.08N of the Flemish Science Foundation.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Laboratoire Jean Kuntzmann, Department de Statistique, Université Joseph Fourier, Grenoble Cedex 9, France
  2. Department of Mathematics, Leuven Statistics Research Centre (LStat), Leuven, Belgium
  3. UJF-Grenoble 1/CNRS/UPMF/TIMC-IMAG, Grenoble, France
