Sankhya B, Volume 76, Issue 1, pp 82–102

Penalized regression combining the L1 norm and a correlation based penalty

Article

Abstract

We consider the problem of feature selection in a linear regression model with p covariates and n observations. We propose a new method that simultaneously selects variables and favors a grouping effect, in which strongly correlated predictors tend to enter or leave the model together. The method is based on penalized least squares with a penalty function that combines the L1 norm and a Correlation based Penalty (CP). We call it the L1CP method. Like the Lasso penalty, L1CP shrinks some coefficients to exactly zero; in addition, the CP term explicitly links the strength of penalization to the correlation among predictors. A detailed simulation study in low- and high-dimensional settings illustrates the advantages of our approach over several alternatives. Finally, we apply the methodology to two real data sets: the US Crime data and the GC-Retention PAC data. In terms of prediction accuracy and estimation error, our empirical study suggests that L1CP is better suited than the Elastic-Net to situations where p ≤ n (the number of variables is less than or equal to the sample size). If p ≫ n, our method remains competitive and also allows the selection of more than n variables.
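The abstract does not state the objective function explicitly, so the following is only a minimal sketch of what a combined L1 and correlation based penalty could look like. It assumes the pairwise CP form of Tutz and Ulbricht (2009), sum over i < j of (β_i − β_j)²/(1 − ρ_ij) + (β_i + β_j)²/(1 + ρ_ij), and an unscaled residual sum of squares. The function names `cp_penalty` and `l1cp_objective` and the tuning parameters `lam1` and `lam2` are hypothetical, and the paper's own parameterization may differ.

```python
import numpy as np


def cp_penalty(beta, R):
    """Correlation based penalty, assuming the Tutz and Ulbricht (2009) form.

    beta : coefficient vector of length p
    R    : p x p correlation matrix of the predictors (no |rho_ij| equal to 1)
    """
    beta = np.asarray(beta, dtype=float)
    p = beta.shape[0]
    total = 0.0
    for i in range(p - 1):
        for j in range(i + 1, p):
            rho = R[i, j]
            # Differences are penalized heavily when rho is close to +1,
            # sums are penalized heavily when rho is close to -1.
            total += (beta[i] - beta[j]) ** 2 / (1.0 - rho)
            total += (beta[i] + beta[j]) ** 2 / (1.0 + rho)
    return total


def l1cp_objective(X, y, beta, lam1, lam2):
    """Assumed L1CP criterion: RSS + lam1 * L1 norm + lam2 * CP penalty."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    R = np.corrcoef(X, rowvar=False)  # sample correlations between columns
    rss = np.sum((y - X @ beta) ** 2)
    return rss + lam1 * np.sum(np.abs(beta)) + lam2 * cp_penalty(beta, R)
```

Under this assumed form, two predictors with correlation 0.5 incur a CP penalty of 8/3 for coefficients (1, 1) but 8 for coefficients (1, −1), which is one way to see how the CP term encourages the grouping effect described above, while the L1 term remains responsible for setting coefficients exactly to zero.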

Keywords and phrases

Variable selection, regression, regularization, Elastic-Net, Lasso, correlation based penalty

AMS (2000) subject classification.

Primary 62J05; Secondary 62J07 

References

  1. Bondell, H.D. and Reich, B.J. (2008). Simultaneous regression shrinkage, variable selection and clustering of predictors with OSCAR. Biometrics, 64, 115–123.
  2. Chen, S., Donoho, D. and Saunders, M. (1998). Atomic decomposition by basis pursuit. SIAM J. Sci. Comput., 20, no. 1, 33–61.
  3. Daye, Z.J. and Jeng, X.J. (2009). Shrinkage and model selection with correlated variables via weighted fusion. Comput. Statist. Data Anal., 54, 1284–1298.
  4. Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist., 32, 407–499.
  5. El Anbari, M. and Mkhadri, A. (2008). Penalized regression with a combination of the L1 norm and the correlation based penalty. Rapports de Recherche de L’Institut National de Recherche en Informatique et Automatique, France, N° 6746.
  6. Golub, T., Slonim, D., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J., Coller, H., Loh, M., Downing, J. and Caligiuri, M. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286, 531–537.
  7. Hoerl, A. and Kennard, R. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12, 55–67.
  8. Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B, 58, 267–288.
  9. Tutz, G. and Ulbricht, J. (2009). Penalized regression with correlation based penalty. Stat. Comput., 19, 239–253.
  10. Varmuza, K. and Filzmoser, P. (2009). Introduction to Multivariate Statistical Analysis in Chemometrics. CRC Press.
  11. Witten, D.M. and Tibshirani, R. (2009). Covariance-regularized regression and classification for high-dimensional problems. J. R. Stat. Soc. Ser. B, 71, 615–636.
  12. Wu, S., Shen, X. and Geyer, C.J. (2009). Adaptive regularization using the entire solution surfaces. Biometrika, 96, 513–527.
  13. Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B, 68, 49–67.
  14. Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic-net. J. R. Stat. Soc. Ser. B, 67, 301–320.

Copyright information

© Indian Statistical Institute 2013

Authors and Affiliations

  1. Dept. de mathématiques, Bâtiment 425, Université Paris-Sud, Orsay, France
  2. Department of Mathematics, Faculty of Sciences Semlalia, Cadi Ayyad University, Marrakesh, Morocco
