Abstract
Bayesian learning is widely used in applied data-modeling problems, but it is often carried out with approximation schemes because exact computation of the posterior distribution is intractable. In this study, we focus on two such approximation methods: variational Bayes and local variational approximation. We show that the variational Bayes approach for statistical models with latent variables can be viewed as a special case of local variational approximation, in which the log-sum-exp function is used to form a lower bound of the log-likelihood. The minimum variational free energy, the objective function of variational Bayes, is analyzed and related to the asymptotic theory of Bayesian learning. This analysis further implies a relationship between the generalization performance of the variational Bayes approach and the minimum variational free energy.
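To make the construction mentioned in the abstract concrete, the following is a sketch of the standard log-sum-exp lower bound underlying variational Bayes; the notation (data \(x^n\), latent variables \(z\), parameter \(\theta\) with prior \(\varphi(\theta)\), trial posterior \(q\)) is assumed here and need not match the paper's. For any probability vector \(q = (q_i)\),
\[
\log \sum_i e^{a_i} \;\ge\; \sum_i q_i a_i \;-\; \sum_i q_i \log q_i ,
\]
with equality when \(q_i \propto e^{a_i}\). Applying this bound, in its continuous form, to the marginal likelihood \(p(x^n) = \int \sum_z p(x^n, z \mid \theta)\,\varphi(\theta)\,d\theta\) with a factorized trial distribution \(q(z)\,q(\theta)\) yields the variational free energy
\[
\bar{F}(q) \;=\; \mathbb{E}_{q(z)\,q(\theta)}\!\left[ \log \frac{q(z)\,q(\theta)}{p(x^n, z \mid \theta)\,\varphi(\theta)} \right] \;\ge\; -\log p(x^n),
\]
an upper bound on the Bayesian free energy. Variational Bayes minimizes \(\bar{F}\) over factorized trial distributions, and the resulting minimum is the minimum variational free energy referred to above.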
Cite this article
Watanabe, K. An alternative view of variational Bayes and asymptotic approximations of free energy. Mach Learn 86, 273–293 (2012). https://doi.org/10.1007/s10994-011-5264-5