
Variational approximation for heteroscedastic linear models and matching pursuit algorithms

Abstract

Modern statistical applications involving large data sets have focused attention on statistical methodologies that are both computationally efficient and able to screen large numbers of candidate models. Here we consider computationally efficient variational Bayes approaches to inference in high-dimensional heteroscedastic linear regression, where both the mean and the variance are described in terms of linear functions of the predictors and where the number of predictors can be larger than the sample size. We derive a closed-form variational lower bound on the log marginal likelihood that is useful for model selection, and we propose a novel fast greedy search algorithm over the model space that uses one-step optimization updates to the variational lower bound of the current model to screen large numbers of candidate predictors for inclusion or exclusion in a computationally thrifty way. We show that the suggested model search strategy is related to widely used orthogonal matching pursuit algorithms, but that it yields a framework for potentially extending those algorithms to more complex models. The methodology is applied in simulations and in two real examples involving prediction of food constituents using NIR technology and prediction of disease progression in diabetes.
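
For context on the matching-pursuit connection mentioned in the abstract, the sketch below shows a classical orthogonal matching pursuit (OMP) loop for selecting predictors in a linear mean model: at each step it adds the predictor most correlated with the current residual and then refits the active set by least squares. This is only the standard OMP baseline that the paper's greedy search is related to, not the authors' algorithm (which instead scores candidates by one-step updates to a variational lower bound in a heteroscedastic model); the function name, stopping rule, and default settings here are illustrative assumptions.

```python
import numpy as np

def omp_select(X, y, max_terms=10, tol=1e-8):
    """Classical orthogonal matching pursuit for a linear mean model.

    Greedily adds the column of X most correlated with the current
    residual, then refits the coefficients of the active set by least
    squares. Illustrative sketch only; not the variational greedy
    search proposed in the paper.
    """
    n, p = X.shape
    residual = y.copy()
    active = []                      # indices of selected predictors
    beta = np.zeros(p)
    for _ in range(min(max_terms, p)):
        # Score each unused predictor by |correlation with the residual|.
        scores = np.abs(X.T @ residual)
        scores[active] = -np.inf     # never re-select an active column
        j = int(np.argmax(scores))
        if scores[j] <= tol:
            break                    # nothing left worth adding
        active.append(j)
        # Refit by least squares on the active set (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        beta = np.zeros(p)
        beta[active] = coef
        residual = y - X[:, active] @ coef
    return active, beta
```

Typical usage would be on standardized columns of X, e.g. active, beta = omp_select(X, y, max_terms=5); the abstract's point is that a similar greedy screening step can instead be driven by the variational lower bound, which extends the idea beyond homoscedastic least-squares models.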

Author information

Correspondence to Minh-Ngoc Tran.

About this article

Cite this article

Nott, D.J., Tran, MN. & Leng, C. Variational approximation for heteroscedastic linear models and matching pursuit algorithms. Stat Comput 22, 497–512 (2012). https://doi.org/10.1007/s11222-011-9243-2

DOI: https://doi.org/10.1007/s11222-011-9243-2
