Computational Statistics, Volume 31, Issue 4, pp 1263–1286

Minimizing variable selection criteria by Markov chain Monte Carlo

  • Yen-Shiu Chin
  • Ting-Li Chen
Original Paper


Regression models with a large number of predictors arise in diverse fields of the social and natural sciences. For proper interpretation, we often would like to identify a smaller subset of the variables that carries the most information. With such a large number of candidate predictors, optimizing a variable selection criterion such as AIC, \(C_{P}\) or BIC over all possible subsets is computationally cumbersome in practice. In this paper, we present two efficient optimization algorithms based on the Markov chain Monte Carlo (MCMC) approach for searching for the globally optimal subset. Simulated examples as well as one real data set show that our proposed MCMC algorithms find better solutions than other popular search methods in terms of minimizing a given criterion.
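To make the general idea concrete, here is a minimal sketch of a Metropolis-style search over predictor subsets that tries to minimize BIC. It is not one of the two algorithms proposed in the paper, which the abstract does not specify. The function names (`rss`, `bic`, `mcmc_search`), the single-flip proposal, the fixed `temperature` and the form of BIC used (Gaussian linear model, constants dropped) are all assumptions made for illustration.

```python
import math
import random


def rss(X, y, subset):
    """Residual sum of squares of y regressed on an intercept plus the columns in `subset`.

    Solves the normal equations by Gaussian elimination; assumes the selected
    columns are not collinear (illustrative code, no rank checks).
    """
    cols = sorted(subset)
    n = len(y)
    rows = [[1.0] + [X[i][j] for j in cols] for i in range(n)]
    k = len(cols) + 1
    A = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    c = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for cc in range(col, k):
                A[r][cc] -= f * A[col][cc]
            c[r] -= f * c[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(A[r][cc] * beta[cc] for cc in range(r + 1, k))
        beta[r] = (c[r] - tail) / A[r][r]
    return sum(
        (y[i] - sum(b * v for b, v in zip(beta, rows[i]))) ** 2 for i in range(n)
    )


def bic(X, y, subset):
    """BIC up to an additive constant: n*log(RSS/n) + k*log(n), k = |subset| + 1."""
    n = len(y)
    return n * math.log(rss(X, y, subset) / n) + (len(subset) + 1) * math.log(n)


def mcmc_search(X, y, n_iter=2000, temperature=1.0, seed=0):
    """Metropolis walk on subsets targeting exp(-BIC / temperature).

    Each step proposes flipping one randomly chosen predictor in or out of
    the current subset. Returns the best subset visited and its BIC.
    """
    rng = random.Random(seed)
    p = len(X[0])
    current = frozenset()
    cur_val = bic(X, y, current)
    best, best_val = current, cur_val
    for _ in range(n_iter):
        proposal = current ^ {rng.randrange(p)}  # symmetric single-flip move
        prop_val = bic(X, y, proposal)
        # Always accept improvements; accept worse moves with Metropolis probability.
        if prop_val <= cur_val or rng.random() < math.exp(
            (cur_val - prop_val) / temperature
        ):
            current, cur_val = proposal, prop_val
        if cur_val < best_val:
            best, best_val = current, cur_val
    return best, best_val
```

Here `X` is a list of rows and `y` a list of responses, and a call such as `mcmc_search(X, y)` returns a `frozenset` of selected column indices together with its criterion value. Swapping `bic` for another criterion changes only the objective, not the search.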


Keywords

Variable selection, Markov chain Monte Carlo method, Optimization, Subset selection




Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
