Statistics and Computing, Volume 10, Issue 1, pp 25–37

Bayesian parameter estimation via variational methods

  • Tommi S. Jaakkola
  • Michael I. Jordan

Abstract

We consider a logistic regression model with a Gaussian prior distribution over the parameters. We show that an accurate variational transformation can be used to obtain a closed-form approximation to the posterior distribution of the parameters, thereby yielding an approximate posterior predictive model. This approach is readily extended to binary graphical models with complete observations. For graphical models with incomplete observations we utilize an additional variational transformation and again obtain a closed-form approximation to the posterior. Finally, we show that the dual of the regression problem gives a latent variable density model, the variational formulation of which leads to exactly solvable EM updates.

Keywords: logistic regression, graphical models, belief networks, variational methods, Bayesian estimation, incomplete data
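The variational transformation referred to in the abstract is commonly realized through the bound σ(x) ≥ σ(ξ) exp{(x − ξ)/2 − λ(ξ)(x² − ξ²)} with λ(ξ) = tanh(ξ/2)/(4ξ), which makes the Gaussian posterior update closed-form. The following NumPy sketch of the resulting fixed-point iteration is illustrative only (not the authors' code; function and variable names are ours), assuming labels y ∈ {0, 1} and a Gaussian prior N(mu0, Sigma0):

```python
import numpy as np

def lam(xi):
    # lambda(xi) = tanh(xi / 2) / (4 * xi), with limiting value 1/8 as xi -> 0
    xi = np.asarray(xi, dtype=float)
    out = np.full_like(xi, 0.125)
    nz = np.abs(xi) > 1e-8
    out[nz] = np.tanh(xi[nz] / 2.0) / (4.0 * xi[nz])
    return out

def variational_bayes_logistic(X, y, mu0, Sigma0, n_iter=50):
    """Variational Gaussian approximation to the posterior over logistic
    regression weights, via the Jaakkola-Jordan bound.
    X: (n, d) design matrix; y: labels in {0, 1}.
    Returns the approximate posterior mean mu_N and covariance Sigma_N."""
    n, d = X.shape
    Sigma0_inv = np.linalg.inv(Sigma0)
    xi = np.ones(n)  # one variational parameter per observation
    for _ in range(n_iter):
        # Closed-form Gaussian posterior given the current xi
        Sigma_N_inv = Sigma0_inv + 2.0 * (X.T * lam(xi)) @ X
        Sigma_N = np.linalg.inv(Sigma_N_inv)
        mu_N = Sigma_N @ (Sigma0_inv @ mu0 + X.T @ (y - 0.5))
        # Re-estimate xi_i^2 = x_i^T E[w w^T] x_i under the new posterior
        E_wwT = Sigma_N + np.outer(mu_N, mu_N)
        xi = np.sqrt(np.einsum('ij,jk,ik->i', X, E_wwT, X))
    return mu_N, Sigma_N
```

Each pass alternates a closed-form Gaussian update with a re-estimation of the variational parameters, so the approximation tightens monotonically; no gradient steps or sampling are needed.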



Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Tommi S. Jaakkola (1)
  • Michael I. Jordan (2)
  1. Dept. of Elec. Eng. & Computer Science, Massachusetts Institute of Technology, Cambridge, USA
  2. Computer Science Division and Department of Statistics, University of California, Berkeley, USA
