Towards Well-Defined Multi-agent Reinforcement Learning

  • Rinat Khoussainov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3192)


Multi-agent reinforcement learning (MARL) is an emerging area of research. However, it lacks two important elements: a coherent view of MARL and a well-defined problem objective. We demonstrate these points by introducing three phenomena (social norms, teaching, and bounded rationality) that are inadequately addressed by previous research. Based on the ideas of bounded rationality, we define a very broad class of MARL problems that are equivalent to learning in partially observable Markov decision processes (POMDPs). We show that this perspective on MARL not only accounts for the three missing phenomena but also provides a well-defined objective for a learner, since POMDPs have a well-defined notion of optimality. We illustrate the concept in an empirical study and discuss its implications for future research.
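The reduction the abstract describes can be illustrated with a toy example (the game, payoffs, and hyper-parameters below are illustrative choices, not the paper's own experiment): from a single learner's perspective, the other agents become part of the environment, and their internal memory becomes hidden environment state. Here the opponent plays Tit-for-Tat in an iterated prisoner's dilemma; a learner conditioning only on recent observations is, in general, learning in a POMDP. Against this particular one-step-memory opponent, the learner's own last move happens to be a sufficient statistic, so plain Q-learning over that observation suffices.

```python
# A minimal sketch (not the paper's implementation) of viewing MARL from
# one learner's side: the Tit-for-Tat opponent's memory is environment
# state, and the learner optimises against it with ordinary Q-learning
# over its observation (its own previous move).
import random

C, D = 0, 1  # cooperate, defect
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}  # learner's reward

def tit_for_tat(opponent_memory):
    # The opponent repeats the learner's previous move (cooperates first).
    return C if opponent_memory is None else opponent_memory

def run(steps=100_000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Q-values indexed by the learner's observation: its own last move.
    q = {obs: [0.0, 0.0] for obs in (None, C, D)}
    obs, total = None, 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            my = rng.randrange(2)                      # explore
        else:
            my = max((C, D), key=lambda a: q[obs][a])  # exploit
        opp = tit_for_tat(obs)   # opponent reacts to the learner's last move
        r = PAYOFF[(my, opp)]
        # Standard Q-learning update; the next observation is simply `my`.
        q[obs][my] += alpha * (r + gamma * max(q[my]) - q[obs][my])
        obs, total = my, total + r
    return q, total / steps
```

For these payoffs and a sufficiently high discount factor, sustained mutual cooperation dominates one-shot exploitation, so the greedy policy should settle on cooperating after its own cooperation. Against opponents with longer memories, the hidden state can no longer be recovered from a single observation, which is precisely where the POMDP view (and its well-defined notion of optimality) becomes essential.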


Keywords: Nash Equilibrium · Stochastic Game · Repeated Game · Bounded Rationality · Folk Theorem




Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Rinat Khoussainov
    1. Department of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland
