
Model selection via adaptive shrinkage with t priors

  • Original Paper
  • Published in Computational Statistics

Abstract

We discuss a model selection procedure, the adaptive ridge selector, derived from a hierarchical Bayes argument, which results in a simple and efficient fitting algorithm. The hierarchical model utilized resembles an un-replicated variance components model and leads to weighting of the covariates. We discuss the intuition behind this type of estimator and investigate its behavior as a regularized least squares procedure. While related alternatives have recently been exploited to simultaneously fit and select variables/features in regression models (Tipping in J Mach Learn Res 1:211–244, 2001; Figueiredo in IEEE Trans Pattern Anal Mach Intell 25:1150–1159, 2003), the extension presented here shows considerable improvement in model selection accuracy in several important cases. We also compare this estimator's model selection performance to that offered by the lasso and adaptive lasso solution paths. Under randomized experimentation, we show that a fixed choice of tuning parameter yields model selection accuracy superior to the entire solution paths of the lasso and adaptive lasso when the underlying model is sparse. We also provide a robust version of the algorithm, suitable for cases where outliers may exist.
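The covariate weighting described in the abstract can be illustrated generically as an iteratively reweighted ridge regression in the spirit of Figueiredo (2003). The sketch below is an assumption for illustration only, not the authors' exact algorithm: the weight update `w_j = 1/(beta_j**2 + eps)`, the fixed tuning parameter `lam`, and the OLS initialization are all choices made here to show how per-coefficient penalties drive small coefficients toward zero while leaving strong ones nearly unshrunk.

```python
import numpy as np

def adaptive_ridge(X, y, lam=1.0, eps=1e-8, n_iter=50, tol=1e-6):
    """Generic iteratively reweighted ridge sketch (illustrative only).

    Each coefficient receives its own penalty weight, recomputed from the
    current estimate so that coefficients near zero are penalized ever
    more heavily, producing an adaptive-shrinkage / selection effect.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # OLS starting value
    for _ in range(n_iter):
        w = 1.0 / (beta**2 + eps)                 # small betas -> huge penalty
        A = X.T @ X + lam * np.diag(w)            # weighted ridge system
        beta_new = np.linalg.solve(A, X.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta
```

In a sparse setting the irrelevant coefficients collapse to (numerically) zero within a few iterations, while the nonzero coefficients stay close to their least squares values, which is the qualitative behavior the abstract attributes to the adaptive ridge selector.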


References

  • Angers JF, Berger JO (1991) Robust hierarchical Bayes estimation of exchangeable means. Can J Stat 19:39–56
  • Bae K, Mallick BK (2004) Gene selection using a two-level hierarchical Bayesian model. Bioinformatics 20(18):3423–3430
  • Breiman L (1995) Better subset regression using the nonnegative garrote. Technometrics 37(4):373–384
  • Breiman L, Friedman JH (1985) Estimating optimal transformations for multiple regression and correlation. J Am Stat Assoc 80(391):580–598
  • Brown P, Vannucci M, Fearn T (1998) Multivariate Bayesian variable selection and prediction. J R Stat Soc Series B 60(3):627–641
  • Candes E, Tao T (2007) The Dantzig selector: statistical estimation when p is much larger than n. Ann Stat 35(6):2313–2351
  • Casella G, Moreno E (2006) Objective Bayesian variable selection. J Am Stat Assoc 101(473):157–167
  • Chen MH, Shao QM, Ibrahim JG (2001) Monte Carlo methods in Bayesian computation. Springer, Berlin
  • Chipman H, George EI, McCulloch RE (2001) The practical implementation of Bayesian model selection. IMS Lecture Notes—Monograph Series, vol 38, pp 65–116
  • Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Stat 32(2):407–499
  • Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
  • Figueiredo MAT (2003) Adaptive sparseness for supervised learning. IEEE Trans Pattern Anal Mach Intell 25:1150–1159
  • George EI (2000) The variable selection problem. J Am Stat Assoc 95:1304–1308
  • Geweke J (1993) Bayesian treatment of the independent Student-t linear model. J Appl Econom 8(S):S19–S40
  • Griffin JE, Brown PJ (2007) Bayesian adaptive lassos with non-convex penalization. Technical Report 07-2v2, Centre for Research in Statistical Methodology, University of Warwick, UK
  • Harville DA (1977) Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc 72(358):320–338
  • Hoerl AE, Kennard RW (2000) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 42(1):80–86
  • Johnstone IM, Silverman BW (2005) Empirical Bayes selection of wavelet thresholds. Ann Stat 33:1700–1752
  • Kiiveri H (2003) A Bayesian approach to variable selection when the number of variables is very large. IMS Lecture Notes—Monograph Series 40:127–143
  • Lindley DV, Smith AFM (1972) Bayes estimates for the linear model. J R Stat Soc Series B 34(1):1–41
  • O’Hagan A (1976) On posterior joint and marginal modes. Biometrika 63(2):329–333
  • Park T, Casella G (2008) The Bayesian lasso. J Am Stat Assoc 103:681–686
  • Smith M, Kohn R (1996) Nonparametric regression using Bayesian variable selection. J Econom 75(2):317–343
  • Sorensen D, Gianola D (2002) Likelihood, Bayesian, and MCMC methods in quantitative genetics. Springer, New York
  • Sun L, Hsu JSJ, Guttman I, Leonard T (1996) Bayesian methods for variance component models. J Am Stat Assoc 91(434):743–752
  • ter Braak CJ (2005) Bayesian sigmoid shrinkage with improper variance priors and an application to wavelet denoising. Comput Stat Data Anal 51(2):1232–1242
  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Series B 58(1):267–288
  • Tipping ME (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1:211–244
  • Yuan M, Lin Y (2005) Efficient empirical Bayes variable selection and estimation in linear models. J Am Stat Assoc 100(472):1215–1225
  • Zhao P, Yu B (2006) On model selection consistency of lasso. J Mach Learn Res 7:2541–2563
  • Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101:1418–1429

Author information

Corresponding author

Correspondence to Artin Armagan.


Cite this article

Armagan, A., Zaretzki, R.L. Model selection via adaptive shrinkage with t priors. Comput Stat 25, 441–461 (2010). https://doi.org/10.1007/s00180-010-0186-4
