A Note on Strategic Learning in Policy Space

  • Steven O. Kimbrough
  • Ming Lu
  • Ann Kuo
Part of the International Handbooks on Information Systems book series (INFOSYS)


We report on a series of computational experiments with artificial agents that learn in the context of games. Two kinds of learning are investigated: (1) a simple form of associative learning, called Q-learning, which occurs in state space, and (2) a simple form of learning, introduced here, which occurs in policy space. We compare the two methods on a number of repeated 2×2 games and conclude that learning in policy space is an effective and promising method for learning in games.
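The abstract describes the two learning methods only at a high level. As a rough, hypothetical illustration (not the authors' implementation), the Python sketch below pits a tabular Q-learner, whose state is the previous round's joint action, against a simple hill-climbing learner that searches directly over memory-one policies in an iterated Prisoner's Dilemma, one of the repeated 2×2 games the abstract refers to. The payoff matrix, learning parameters, block-evaluation scheme, and the names QLearner, PolicyLearner, and run are all assumptions chosen for illustration only.

```python
# Illustrative sketch only (assumptions, not the authors' code): a state-space
# Q-learner versus a hypothetical policy-space hill climber in a repeated 2x2 game.
import random

C, D = 0, 1
PAYOFF = {  # (row, col) action pair -> (row payoff, col payoff); assumed PD payoffs
    (C, C): (3, 3), (C, D): (0, 5),
    (D, C): (5, 0), (D, D): (1, 1),
}
STATES = [None, (C, C), (C, D), (D, C), (D, D)]  # None marks the opening round


class QLearner:
    """Tabular Q-learning; the state is the previous round's joint action."""
    def __init__(self, alpha=0.2, gamma=0.9, epsilon=0.1):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.Q = {s: [0.0, 0.0] for s in STATES}

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice((C, D))
        return max((C, D), key=lambda a: self.Q[state][a])

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self.Q[next_state])
        self.Q[state][action] += self.alpha * (target - self.Q[state][action])


class PolicyLearner:
    """Hill climbing over memory-one policies: a hypothetical stand-in for
    'learning in policy space'. A policy maps each previous outcome to an action."""
    def __init__(self):
        self.policy = {s: random.choice((C, D)) for s in STATES}
        self.best_avg = float("-inf")
        self.backup = None

    def act(self, state):
        return self.policy[state]

    def propose(self):
        # Mutate one entry of the current policy before the next block of play.
        self.backup = dict(self.policy)
        s = random.choice(STATES)
        self.policy[s] = 1 - self.policy[s]

    def evaluate(self, avg_reward):
        # Keep the mutation only if the block's average payoff did not get worse.
        if avg_reward >= self.best_avg:
            self.best_avg = avg_reward
        else:
            self.policy = self.backup


def run(rounds=20000, block=50):
    row, col = QLearner(), PolicyLearner()
    state, block_reward = None, 0.0
    for t in range(rounds):
        if t % block == 0 and t > 0:
            col.evaluate(block_reward / block)
            col.propose()
            block_reward = 0.0
        a_row, a_col = row.act(state), col.act(state)
        r_row, r_col = PAYOFF[(a_row, a_col)]
        next_state = (a_row, a_col)
        row.update(state, a_row, r_row, next_state)
        block_reward += r_col
        state = next_state
    return row, col


if __name__ == "__main__":
    row, col = run()
    print("Q-learner greedy policy:",
          {s: max((C, D), key=lambda a: row.Q[s][a]) for s in STATES})
    print("Policy-space learner   :", col.policy)
```

The contrast the sketch is meant to surface is the one named in the title: the Q-learner updates values attached to states (the previous round's outcome), while the policy learner evaluates and perturbs entire strategies, i.e., it moves through policy space directly.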


Keywords: Nash equilibrium, reinforcement learning, repeated game, policy space, previous round




Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Steven O. Kimbrough, University of Pennsylvania, Philadelphia, USA
  • Ming Lu, University of Pennsylvania, Philadelphia, USA
  • Ann Kuo, University of Pennsylvania, Philadelphia, USA
