Bifurcation Analysis of Reinforcement Learning Agents in the Selten’s Horse Game

  • Alessandro Lazaric
  • Enrique Munoz de Cote
  • Fabio Dercole
  • Marcello Restelli
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4865)


Applying reinforcement learning algorithms to multiagent domains may produce complex, non-convergent dynamics. The replicator dynamics, commonly used in evolutionary game theory, have proved effective for modeling the learning dynamics in normal-form games. It is often also of interest to study how robust the learning dynamics are when either learning or structural parameters are perturbed. This amounts to unfolding the catalog of learning scenarios that arise across all possible parameter settings, which, unfortunately, cannot be obtained through “brute force” simulation of the replicator dynamics. What is required instead is the analysis of bifurcations, i.e., critical parameter combinations at which the learning behavior undergoes radical changes. In this work, we present a one-parameter bifurcation analysis of the Selten’s Horse game, in which the learning process exhibits a set of complex dynamical scenarios even for relatively small perturbations of the payoffs.
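To make the object of study concrete, the following is a minimal sketch of three-population replicator dynamics on a Selten's Horse-style game, integrated with a fixed-step Runge-Kutta scheme. It is not the authors' model or code. The payoff values are one commonly quoted textbook assignment and are an assumption here, as are the choice to perturb player 1's payoff at the (C, c) outcome by `eps` and all function names. The state `p[i]` is the probability that player `i` plays their first action (C, c, and L respectively).

```python
from itertools import product


def horse_payoffs(eps=0.0):
    """Map each pure profile (a1, a2, a3) to the payoff triple (u1, u2, u3).

    Actions: player 1 in {0: C, 1: D}, player 2 in {0: c, 1: d},
    player 3 in {0: L, 1: R}. Payoff numbers are illustrative
    assumptions; eps is a hypothetical perturbation parameter.
    """
    payoffs = {}
    for a1, a2, a3 in product((0, 1), repeat=3):
        if a1 == 1:      # D: player 3 moves, player 2's choice is irrelevant
            payoffs[a1, a2, a3] = (3.0, 3.0, 2.0) if a3 == 0 else (0.0, 0.0, 0.0)
        elif a2 == 0:    # C then c: the game ends before player 3 moves
            payoffs[a1, a2, a3] = (1.0 + eps, 1.0, 1.0)
        else:            # C then d: player 3 moves
            payoffs[a1, a2, a3] = (4.0, 4.0, 0.0) if a3 == 0 else (0.0, 0.0, 1.0)
    return payoffs


def action_payoff(payoffs, p, player, action):
    """Expected payoff to `player` for a pure `action` against the others' mixes."""
    total = 0.0
    for profile in product((0, 1), repeat=3):
        if profile[player] != action:
            continue
        prob = 1.0
        for j, a in enumerate(profile):
            if j != player:
                prob *= p[j] if a == 0 else 1.0 - p[j]
        total += prob * payoffs[profile][player]
    return total


def replicator_rhs(p, payoffs):
    """Two-action replicator field: dp_i/dt = p_i (1 - p_i) (u_i(0) - u_i(1))."""
    return [
        p[i] * (1.0 - p[i])
        * (action_payoff(payoffs, p, i, 0) - action_payoff(payoffs, p, i, 1))
        for i in range(3)
    ]


def simulate(p0, payoffs, dt=0.05, steps=1000):
    """Integrate the replicator field with classical RK4, clipping to [0, 1]."""
    p = list(p0)
    for _ in range(steps):
        k1 = replicator_rhs(p, payoffs)
        k2 = replicator_rhs([x + 0.5 * dt * k for x, k in zip(p, k1)], payoffs)
        k3 = replicator_rhs([x + 0.5 * dt * k for x, k in zip(p, k2)], payoffs)
        k4 = replicator_rhs([x + dt * k for x, k in zip(p, k3)], payoffs)
        p = [
            min(1.0, max(0.0, x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)))
            for x, a, b, c, d in zip(p, k1, k2, k3, k4)
        ]
    return p


if __name__ == "__main__":
    # Crude sweep over the hypothetical perturbation parameter.
    for eps in (-0.5, 0.0, 0.5):
        print(eps, simulate([0.5, 0.5, 0.5], horse_payoffs(eps)))
```

A sweep like the one in the `__main__` block is exactly the kind of "brute force" simulation that the abstract describes as insufficient: it samples a few parameter values and initial conditions, whereas bifurcation analysis locates the critical parameter values at which the set of attractors itself changes.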





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Alessandro Lazaric (1)
  • Enrique Munoz de Cote (1)
  • Fabio Dercole (1)
  • Marcello Restelli (1)
  1. Department of Electronics and Information, Politecnico di Milano, Milan, Italy
