Applied Intelligence

Volume 7, Issue 4, pp. 287–303

Experiments of Fast Learning with High Order Boltzmann Machines

  • M. Graña
  • A. D'Anjou
  • F.X. Albizuri
  • M. Hernandez
  • F.J. Torrealdea
  • A. de la Hera
  • A.I. Gonzalez

Abstract

This work reports the results of applying High Order Boltzmann Machines without hidden units to construct classifiers for several problems that represent different learning paradigms. The Boltzmann Machine weight-updating algorithm remains unchanged even when some of the units take values in a discrete set or in a continuous interval. The absence of hidden units, together with the restriction to classification problems, allows the connection statistics to be estimated directly, without the computational cost of simulated annealing. In this setting, the learning process can be sped up by several orders of magnitude with no appreciable loss in the quality of the results.

Keywords: Neural Networks, Boltzmann Machines, High Order Networks, Classification Problems
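
The speedup described in the abstract comes from the form of the Boltzmann learning rule. For a connection α (a set of unit indices) with weight w_α, the update is Δw_α = η(⟨∏_{i∈α} s_i⟩_clamped − ⟨∏_{i∈α} s_i⟩_free), the difference between the clamped and free expectations of the connection product. With no hidden units, the clamped expectation is a plain data average; with only the class units left unclamped, the free expectation can be computed exactly by enumerating the output configurations. The following Python sketch illustrates that scheme under these assumptions; the function and variable names (train_hobm, connections) are illustrative, not the authors' code.

    # Minimal sketch (not the authors' code) of high-order Boltzmann Machine
    # learning without hidden units, with connection statistics estimated
    # directly instead of via simulated annealing.
    import itertools
    import numpy as np

    def train_hobm(X, Y, connections, lr=0.1, epochs=50):
        # X: (N, n_in) binary inputs; Y: (N, n_out) one-hot class labels.
        # connections: tuples of unit indices into the concatenated state
        # (inputs first, then outputs); each tuple is one high-order term.
        S = np.hstack([X, Y])
        n_out = Y.shape[1]
        w = np.zeros(len(connections))
        # Clamped phase: plain data averages of the connection products.
        clamped = np.array([S[:, list(c)].prod(axis=1).mean()
                            for c in connections])
        out_cfgs = np.array(list(itertools.product([0, 1], repeat=n_out)))
        for _ in range(epochs):
            free = np.zeros(len(connections))
            for x in X:
                # With the inputs clamped and no hidden units, the
                # distribution over output configurations is exact,
                # so no annealing is required.
                states = np.hstack([np.tile(x, (len(out_cfgs), 1)), out_cfgs])
                prods = np.stack([states[:, list(c)].prod(axis=1)
                                  for c in connections], axis=1)
                consensus = prods @ w          # consensus of each configuration
                p = np.exp(consensus - consensus.max())
                p /= p.sum()
                free += p @ prods              # expected products under the model
            w += lr * (clamped - free / len(X))  # standard Boltzmann update
        return w

Because only the few output units are ever unclamped, this exact enumeration replaces the simulated annealing runs that normally dominate Boltzmann Machine training, which is the source of the several-orders-of-magnitude speedup the abstract reports.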



Copyright information

© Kluwer Academic Publishers 1997

Authors and Affiliations

  • M. Graña (1)
  • A. D'Anjou (1)
  • F.X. Albizuri (1)
  • M. Hernandez (1)
  • F.J. Torrealdea (1)
  • A. de la Hera (1)
  • A.I. Gonzalez (1)

  1. Dept. CCIA, UPV/EHU, San Sebastián
