Abstract
Bayesian learning is widely used in applied data-modeling problems, but it is often carried out with approximation schemes because exact computation of the posterior distribution is intractable. In this study, we focus on two such approximation methods: variational Bayes and local variational approximation. We show that the variational Bayes approach for statistical models with latent variables can be viewed as a special case of local variational approximation, in which the log-sum-exp function is used to form a lower bound of the log-likelihood. The minimum variational free energy, the objective function of variational Bayes, is analyzed and related to the asymptotic theory of Bayesian learning. This analysis further implies a relationship between the generalization performance of the variational Bayes approach and the minimum variational free energy.
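To make the construction mentioned in the abstract concrete, the following is a sketch of the standard log-sum-exp lower bound underlying variational Bayes; the notation (data \(x^n\), latent variables \(z\), parameter \(\theta\) with prior \(\varphi(\theta)\), trial posterior \(q\)) is assumed here and need not match the paper's. For any probability vector \(q = (q_i)\),
\[
\log \sum_i e^{a_i} \;\ge\; \sum_i q_i a_i \;-\; \sum_i q_i \log q_i ,
\]
with equality when \(q_i \propto e^{a_i}\). Applying this bound, in its continuous form, to the marginal likelihood \(p(x^n) = \int \sum_z p(x^n, z \mid \theta)\,\varphi(\theta)\,d\theta\) with a factorized trial distribution \(q(z)\,q(\theta)\) yields the variational free energy
\[
\bar{F}(q) \;=\; \mathbb{E}_{q(z)\,q(\theta)}\!\left[ \log \frac{q(z)\,q(\theta)}{p(x^n, z \mid \theta)\,\varphi(\theta)} \right] \;\ge\; -\log p(x^n),
\]
an upper bound on the Bayesian free energy. Variational Bayes minimizes \(\bar{F}\) over factorized trial distributions, and the resulting minimum is the minimum variational free energy referred to above.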
Cite this article
Watanabe, K. An alternative view of variational Bayes and asymptotic approximations of free energy. Mach Learn 86, 273–293 (2012). https://doi.org/10.1007/s10994-011-5264-5