Abstract
Modern statistical applications involving large data sets have focused attention on statistical methodologies that are computationally efficient and capable of screening large numbers of candidate models. Here we consider computationally efficient variational Bayes approaches to inference in high-dimensional heteroscedastic linear regression, where both the mean and the variance are described by linear functions of the predictors and the number of predictors can exceed the sample size. We derive a closed-form variational lower bound on the log marginal likelihood that is useful for model selection, and propose a novel fast greedy search algorithm on the model space. The algorithm uses one-step optimization updates to the variational lower bound in the current model to screen large numbers of candidate predictor variables for inclusion or exclusion in a computationally thrifty way. We show that the suggested model search strategy is related to widely used orthogonal matching pursuit algorithms for model search, but yields a framework within which these algorithms can potentially be extended to more complex models. The methodology is applied in simulations and in two real examples: prediction of food constituents using near-infrared (NIR) technology and prediction of disease progression in diabetes.
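The greedy search described above is related to orthogonal matching pursuit (OMP). As a point of reference only, and not the paper's variational method, here is a minimal sketch of plain OMP for linear regression: at each step it scores every candidate column by its correlation with the current residual, adds the best one, refits on the active set, and updates the residual. The function name `omp` and the fixed sparsity level `k` are illustrative choices, not notation from the paper.

```python
import numpy as np

def omp(X, y, k):
    """Greedy orthogonal matching pursuit: select k columns of X."""
    n, p = X.shape
    active = []
    residual = y.copy()
    beta = np.zeros(0)
    for _ in range(k):
        # Score each candidate column by correlation with the current residual.
        scores = np.abs(X.T @ residual)
        scores[active] = -np.inf  # never re-select an active column
        j = int(np.argmax(scores))
        active.append(j)
        # Refit by least squares on the active set, then update the residual.
        beta, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        residual = y - X[:, active] @ beta
    return active, beta
```

In the paper's framework, the per-candidate score is instead a one-step update to the variational lower bound, which is what allows the same greedy screening idea to carry over to the heteroscedastic model with both mean and variance regressions.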
Cite this article
Nott, D.J., Tran, M.-N., Leng, C.: Variational approximation for heteroscedastic linear models and matching pursuit algorithms. Stat. Comput. 22, 497–512 (2012). https://doi.org/10.1007/s11222-011-9243-2