Abstract
While Bayesian analogues of lasso regression have become popular, comparatively little has been said about formal treatments of model uncertainty in such settings. This paper describes methods that can be used to evaluate the posterior distribution over the space of all possible regression models for Bayesian lasso regression. Access to the model space posterior distribution is necessary if model-averaged inference (e.g., model-averaged prediction and calculation of posterior variable inclusion probabilities) is desired. The key element of all such inference is the ability to evaluate the marginal likelihood of the data under a given regression model, which has so far proved difficult for the Bayesian lasso. This paper describes how the marginal likelihood can be accurately computed when the number of predictors in the model is not too large, allowing for model space enumeration when the total number of possible predictors is modest. In cases where the total number of possible predictors is large, a simple Markov chain Monte Carlo approach for sampling the model space posterior is provided. This Gibbs sampling approach is similar in spirit to the stochastic search variable selection methods that have become one of the main tools for addressing Bayesian regression model uncertainty, and the adaptation of these methods to the Bayesian lasso is shown to be straightforward.
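To make the enumeration idea concrete, the following is a minimal Python sketch of how a model space posterior and posterior variable inclusion probabilities could be obtained once a log marginal likelihood is available for every model. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`model_posterior`, `inclusion_probabilities`, `toy_log_marginal`) are hypothetical, the prior over models is assumed uniform unless another is supplied, and `toy_log_marginal` returns made-up scores standing in for the Bayesian lasso marginal likelihood, whose actual computation is the subject of the paper.

```python
import itertools
import math


def model_posterior(log_marginal, p, log_prior=None):
    """Enumerate all 2**p models and return {model: posterior probability}.

    A model is a tuple of 0/1 inclusion indicators of length p.
    `log_marginal` maps a model to its log marginal likelihood.
    """
    if log_prior is None:
        # Assumption: uniform prior over the model space.
        log_prior = lambda gamma: 0.0
    models = list(itertools.product([0, 1], repeat=p))
    log_w = [log_marginal(g) + log_prior(g) for g in models]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return {g: wi / total for g, wi in zip(models, w)}


def inclusion_probabilities(posterior, p):
    """Sum posterior model probabilities over models containing each predictor."""
    return [sum(prob for g, prob in posterior.items() if g[j]) for j in range(p)]


def toy_log_marginal(gamma):
    # Placeholder scores for illustration only; this is NOT the
    # Bayesian lasso marginal likelihood.
    scores = (2.0, 0.0, -1.0)
    return sum(s for s, included in zip(scores, gamma) if included)


posterior = model_posterior(toy_log_marginal, p=3)
incl = inclusion_probabilities(posterior, p=3)
```

Enumeration visits 2**p models, so this sketch is only practical when the total number of candidate predictors is modest, which is the regime the abstract describes for enumeration. For larger problems the abstract points to sampling the model space by Markov chain Monte Carlo instead.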
Cite this article
Hans, C. Model uncertainty and variable selection in Bayesian lasso regression. Stat Comput 20, 221–229 (2010). https://doi.org/10.1007/s11222-009-9160-9
DOI: https://doi.org/10.1007/s11222-009-9160-9