Learning Automata as a Basis for Multi Agent Reinforcement Learning

  • Ann Nowé
  • Katja Verbeeck
  • Maarten Peeters
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3898)


In this paper we summarize some important theoretical results from the domain of Learning Automata. We start with single-stage, single-agent learning schemes and gradually extend the setting to multi-stage, multi-agent systems. We argue that the theory of Learning Automata is an ideal basis on which to build multi-agent learning algorithms.







Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ann Nowé (1)
  • Katja Verbeeck (1)
  • Maarten Peeters (1)
  1. Computational Modeling Lab, Vrije Universiteit Brussel, Brussel, Belgium
