Abstract
We introduce a learning algorithm for unsupervised neural networks based on ideas from statistical mechanics. The algorithm is derived from a mean field approximation for large, layered sigmoid belief networks. We show how to (approximately) infer the statistics of these networks without resorting to sampling. This is done by solving the mean field equations, which relate the statistics of each unit to those of its Markov blanket. Using these statistics as target values, the weights in the network are adapted by a local delta rule. We evaluate the strengths and weaknesses of these networks for problems in statistical pattern recognition.
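The two ingredients described above — a mean field fixed point for the hidden-unit statistics, followed by a local delta rule that uses those statistics as targets — can be sketched in code. The sketch below is an illustrative simplification, not the chapter's exact algorithm: it uses a naive mean field fixed point for a single hidden layer and omits the correction terms of the full mean field equations; all variable names (`W`, `b`, `c`, `mean_field_means`, `delta_rule_step`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes: n_h hidden units generate n_v visible units.
n_h, n_v = 4, 6
W = rng.normal(scale=0.1, size=(n_v, n_h))   # top-down generative weights
b = np.zeros(n_v)                            # visible biases
c = np.zeros(n_h)                            # hidden biases

def mean_field_means(v, n_iters=100, damping=0.5):
    """Naive mean field fixed point for the hidden-unit means mu, given a
    clamped visible vector v.  Iterates (with damping) the simplified form
        mu_j = sigmoid(c_j + sum_i W_ij * (v_i - p_i)),
    where p is the visible prediction under the current mu.  The full mean
    field equations of the chapter contain additional terms omitted here."""
    mu = np.full(n_h, 0.5)
    for _ in range(n_iters):
        p = sigmoid(W @ mu + b)              # visible means under current mu
        new_mu = sigmoid(c + W.T @ (v - p))  # bottom-up mean field signal
        mu = damping * mu + (1.0 - damping) * new_mu
    return mu

def delta_rule_step(v, mu, lr=0.1):
    """Local delta rule: nudge each visible unit's prediction toward the
    clamped data, using the mean field statistics mu as inputs."""
    global W, b
    p = sigmoid(W @ mu + b)
    err = v - p                              # local prediction error
    W += lr * np.outer(err, mu)
    b += lr * err

# One inference/learning cycle on a random binary pattern.
v = rng.integers(0, 2, size=n_v).astype(float)
mu = mean_field_means(v)
delta_rule_step(v, mu)
```

Note that no sampling occurs anywhere: inference is a deterministic fixed-point iteration, and the weight update touches only quantities local to each connection, consistent with the abstract's description.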
© 1998 Springer Science+Business Media Dordrecht
Saul, L., Jordan, M. (1998). A Mean Field Learning Algorithm for Unsupervised Neural Networks. In: Jordan, M.I. (eds) Learning in Graphical Models. NATO ASI Series, vol 89. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5014-9_20
Print ISBN: 978-94-010-6104-9
Online ISBN: 978-94-011-5014-9