Abstract
We present a method for mapping a given Bayesian network to a Boltzmann machine architecture, in the sense that the updating process of the resulting Boltzmann machine model provably converges to a state which can be mapped back to a maximum a posteriori (MAP) probability state in the probability distribution represented by the Bayesian network. The Boltzmann machine model can be implemented efficiently on massively parallel hardware, since the resulting structure can be divided into two separate clusters in which all the nodes of one cluster can be updated simultaneously. This means that the proposed mapping can be used to provide Bayesian network models with a massively parallel probabilistic reasoning module, capable of finding the MAP states in a computationally efficient manner. From the neural network point of view, the mapping from a Bayesian network to a Boltzmann machine can be seen as a method for automatically determining the structure and the connection weights of a Boltzmann machine by incorporating high-level, probabilistic information directly into the neural network architecture, without recourse to a time-consuming and unreliable learning process.
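The two-cluster property described above can be illustrated with a minimal sketch: in a bipartite Boltzmann machine (no connections within a cluster), all units of one cluster are conditionally independent given the other, so each cluster can be updated in a single parallel (vectorized) step, with a cooling schedule driving the network toward a low-energy state. The network size, the random weights, and the geometric cooling schedule below are all illustrative assumptions, not the paper's actual construction of weights from a Bayesian network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bipartite Boltzmann machine: W connects cluster A to cluster B;
# there are no intra-cluster connections, so each whole cluster can be
# updated simultaneously (one vectorized Gibbs step per cluster).
n_a, n_b = 4, 3
W = rng.normal(size=(n_a, n_b))   # illustrative random weights
bias_a = rng.normal(size=n_a)
bias_b = rng.normal(size=n_b)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def anneal(steps=500, t0=5.0, t1=0.05):
    """Annealed blockwise Gibbs sampling: update cluster A given B, then B given A."""
    a = (rng.random(n_a) < 0.5).astype(float)
    b = (rng.random(n_b) < 0.5).astype(float)
    for k in range(steps):
        # Geometric cooling from temperature t0 down to t1.
        t = t0 * (t1 / t0) ** (k / (steps - 1))
        a = (rng.random(n_a) < sigmoid((W @ b + bias_a) / t)).astype(float)
        b = (rng.random(n_b) < sigmoid((W.T @ a + bias_b) / t)).astype(float)
    return a, b

a, b = anneal()
print(a, b)
```

As the temperature approaches zero, each blockwise update becomes nearly deterministic, so the final binary state approximates a local energy minimum; in the paper's setting such a state is what gets mapped back to a MAP configuration of the Bayesian network.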
Cite this article
Myllymäki, P. Massively Parallel Probabilistic Reasoning with Boltzmann Machines. Applied Intelligence 11, 31–44 (1999). https://doi.org/10.1023/A:1008324530006