Stochastic Games and Learning

Encyclopedia of Systems and Control

Abstract

Stochastic games were introduced by Lloyd Shapley in the early 1950s. A stochastic game is a dynamic game with probabilistic transitions played by one or more players. The game is played in a sequence of stages. At the beginning of each stage, the game is in a certain state. The players select actions, and each player receives a payoff that depends on the current state and the chosen actions. The game then moves to a new random state whose distribution depends on the previous state and the actions chosen by the players. The procedure is repeated at the new state, and play continues for a finite or infinite number of stages. The total payoff to a player is often taken to be the discounted sum of the stage payoffs or the limit inferior of the averages of the stage payoffs.
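The stage-by-stage play described above can be illustrated with a minimal sketch. The two-state, two-player game below, including the `PAYOFF` and `TRANSITION` tables and the uniform random policy, is entirely hypothetical and chosen only for illustration. The finite-horizon average is shown as a stand-in for the limit-inferior criterion, which is defined only for infinite play.

```python
import random

# Hypothetical game: 2 states, 2 actions per player (all numbers are made up).
# PAYOFF[s][a1][a2] -> (stage payoff to player 1, stage payoff to player 2)
PAYOFF = {
    0: [[(1.0, -1.0), (0.0, 0.0)], [(0.0, 0.0), (2.0, -2.0)]],
    1: [[(0.0, 1.0), (1.0, 0.0)], [(1.0, 0.0), (0.0, 1.0)]],
}
# TRANSITION[s][a1][a2] -> probability that the next state is 0 (otherwise 1)
TRANSITION = {
    0: [[0.9, 0.5], [0.5, 0.1]],
    1: [[0.2, 0.5], [0.5, 0.8]],
}


def play(num_stages, policy1, policy2, start_state=0, rng=None):
    """Simulate the game and return the list of stage payoff pairs."""
    rng = rng or random.Random()
    state = start_state
    history = []
    for _ in range(num_stages):
        # Both players choose actions given the current state.
        a1 = policy1(state, rng)
        a2 = policy2(state, rng)
        # Payoffs depend on the current state and the chosen actions.
        history.append(PAYOFF[state][a1][a2])
        # The next state is random; its law depends on state and actions.
        state = 0 if rng.random() < TRANSITION[state][a1][a2] else 1
    return history


def discounted_sum(stage_payoffs, beta):
    """Discounted total payoff: sum over t of beta**t * r_t."""
    return sum(beta ** t * r for t, r in enumerate(stage_payoffs))


def average_payoff(stage_payoffs):
    """Finite-horizon average of the stage payoffs."""
    return sum(stage_payoffs) / len(stage_payoffs)


def uniform_policy(state, rng):
    """Hypothetical policy: pick one of the two actions uniformly."""
    return rng.randrange(2)


if __name__ == "__main__":
    history = play(50, uniform_policy, uniform_policy, rng=random.Random(0))
    player1 = [pair[0] for pair in history]
    print(discounted_sum(player1, beta=0.9), average_payoff(player1))
```

With a discount factor `beta` below 1 the discounted sum stays bounded for bounded stage payoffs, which is one reason this criterion is commonly used for infinite-horizon play.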

A learning problem arises when the agent does not know the reward function or the state transition probabilities. An approach in which the agent learns its optimal policy directly, without knowing either the reward function or the state transition function, is called model-free reinforcement learning. Q-learning is an example of such an approach.
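As a rough illustration of the model-free idea, the sketch below shows the standard tabular Q-learning update for a single agent. The two-state environment in `toy_step`, the step size `alpha`, the discount `gamma`, and the exploration rate are assumptions made for this example only. Note that the learner never reads the reward or transition rules directly; it only observes sampled rewards and next states.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_b Q(s', b)."""
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily on Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    values = q[state]
    return values.index(max(values))


def toy_step(state, action, rng):
    """Hypothetical environment, hidden from the learner.

    Action 1 taken in state 1 pays 1; action 1 also makes state 1 more likely.
    """
    reward = 1.0 if (state == 1 and action == 1) else 0.0
    p_state_one = 0.8 if action == 1 else 0.2
    next_state = 1 if rng.random() < p_state_one else 0
    return reward, next_state


def train(num_steps=5000, epsilon=0.1, seed=0):
    """Run Q-learning on the toy environment and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    state = 0
    for _ in range(num_steps):
        action = epsilon_greedy(q, state, epsilon, rng)
        reward, next_state = toy_step(state, action, rng)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


if __name__ == "__main__":
    print(train())
```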

Q-learning has been extended to a noncooperative multi-agent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions and performs updates that assume Nash equilibrium behavior over the current Q-values. The main challenge is establishing convergence of the learning protocol.
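For orientation, the update used in Nash Q-learning (Hu and Wellman 2003) can be written as follows. The notation is an assumption of this sketch: $n$ players, $Q^i_t$ the Q-function of player $i$ at step $t$, $\alpha_t$ the learning rate, $\beta$ the discount factor, $r^i_t$ the observed stage payoff, and $(\pi^1(s'),\dots,\pi^n(s'))$ a Nash equilibrium of the stage game whose payoffs are the current Q-values at the next state $s'$.

```latex
Q^i_{t+1}(s, a^1, \dots, a^n)
  = (1 - \alpha_t)\, Q^i_t(s, a^1, \dots, a^n)
  + \alpha_t \left[ r^i_t + \beta\, \mathrm{Nash}Q^i_t(s') \right],
\qquad
\mathrm{Nash}Q^i_t(s') = \pi^1(s') \cdots \pi^n(s') \cdot Q^i_t(s').
```

With a single player the equilibrium value reduces to $\max_{a} Q_t(s', a)$, which recovers the ordinary Q-learning update.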

Bibliography

  • Aumann RJ (1987) Correlated equilibrium as an expression of Bayesian rationality. Econometrica 55:1–18. doi:10.2307/1911154

  • Bowling M, Veloso M (2001) Rational and convergent learning in stochastic games. In: Proceedings of the 17th international joint conference on artificial intelligence (IJCAI), Seattle, pp 1021–1026

  • Breton M (1991) Algorithms for stochastic games. In: Raghavan TES, Ferguson TS, Parthasarathy T, Vrieze OJ (eds) Stochastic games and related topics: in honor of Professor L. S. Shapley, vol 7. Springer Netherlands, Dordrecht, pp 45–57. doi:10.1007/978-94-011-3760-7_5

  • Brown GW (1951) Iterative solution of games by fictitious play. In: Koopmans TC (ed) Activity analysis of production and allocation. Wiley, New York, Chap. XXIV, pp 374–376

  • Buşoniu L, Babuška R, De Schutter B (2010) Multi-agent reinforcement learning: an overview. In: Srinivasan D, Jain LC (eds) Innovations in multi-agent systems and applications–1. Springer, Berlin, pp 183–221

  • Carlson D, Haurie A (1995) A turnpike theory for infinite horizon open-loop differential games with decoupled controls. In: Olsder GJ (ed) New trends in dynamic games and applications. Annals of the international society of dynamic games, vol 3. Birkhäuser, Boston, pp 353–376

  • Filar J, Vrieze K (1997) Competitive Markov decision processes. Springer, New York

  • Filar JA, Schultz TA, Thuijsman F, Vrieze OJ (1991) Nonlinear programming and stationary equilibria in stochastic games. Math Program 50(2, Ser A):227–237. doi:10.1007/BF01594936

  • Forges F (1986) An approach to communication equilibria. Econometrica 54:1375–1385. doi:10.2307/1914304

  • Fudenberg D, Levine DK (1998) The theory of learning in games, vol 2. MIT, Cambridge

  • Greenwald A, Hall K (2003) Correlated-Q learning. In: Proceedings of the 20th international conference on machine learning (ICML-03), Washington, DC, 21–24 Aug 2003, pp 242–249

  • Herings PJ-J, Peeters RJAP (2004) Stationary equilibria in stochastic games: structure, selection, and computation. J Econ Theory 118(1):32–60. doi:10.1016/j.jet.2003.10.001

  • Hu J, Wellman MP (1998) Multiagent reinforcement learning: theoretical framework and an algorithm. In: Proceedings of the 15th international conference on machine learning, New Brunswick, pp 242–250

  • Hu J, Wellman MP (2003) Nash Q-learning for general-sum stochastic games. J Mach Learn Res 4:1039–1069

  • Leslie DS, Collins EJ (2005) Individual Q-learning in normal form games. SIAM J Control Optim 44(2):495–514. doi:10.1137/S0363012903437976

  • Littman ML (1994) Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the 13th international conference on machine learning, New Brunswick, pp 157–163

  • Myerson RB (1978) Refinements of the Nash equilibrium concept. Int J Game Theory 7(2):73–80. doi:10.1007/BF01753236

  • Nowak AS (2008) Equilibrium in a dynamic game of capital accumulation with the overtaking criterion. Econ Lett 99(2):233–237. doi:10.1016/j.econlet.2007.05.033

  • Nowak AS, Szajowski K (1998) Nonzerosum stochastic games. In: Bardi M, Raghavan TES, Parthasarathy T (eds) Stochastic and differential games: theory and numerical methods. Annals of the international society of dynamic games, vol 4. Birkhäuser, Boston, pp 297–342. doi:10.1007/978-1-4612-1592-9_7

  • Ramsey F (1928) A mathematical theory of savings. Econ J 38:543–559

  • Robinson J (1951) An iterative method of solving a game. Ann Math 2(54):296–301. doi:10.2307/1969530

  • Rogers PD (1969) Nonzero-sum stochastic games, PhD thesis, University of California, Berkeley. ProQuest LLC, Ann Arbor

  • Rubinstein A (1979) Equilibrium in supergames with the overtaking criterion. J Econ Theory 21:1–9. doi:10.1016/0022-0531(79)90002-4

  • Shapley L (1953) Stochastic games. Proc Natl Acad Sci USA 39:1095–1100. doi:10.1073/pnas.39.10.1095

  • Shapley L (1964) Some topics in two-person games. Ann Math Stud 52:1–28

  • Shoham Y, Leyton-Brown K (2009) Multiagent systems: algorithmic, game-theoretic, and logical foundations. Cambridge University Press, Cambridge. doi:10.1017/CBO9780511811654

  • Sobel MJ (1971) Noncooperative stochastic games. Ann Math Stat 42:1930–1935. doi:10.1214/aoms/1177693059

  • Tijms H (2012) Stochastic games and dynamic programming. Asia Pac Math Newsl 2(3):6–10

  • Vohra R, Wellman M (eds) (2007) Foundations of multi-agent learning. Artif Intell 171:363–452

  • Weiß G, Sen S (eds) (1996) Adaption and learning in multi-agent systems. In: Proceedings of the IJCAI’95 workshop, Montréal, 21 Aug 1995, vol 1042. Springer, Berlin. doi:10.1007/3-540-60923-7

Corresponding author

Correspondence to Krzysztof Szajowski.

Copyright information

© 2014 Springer-Verlag London

About this entry

Cite this entry

Szajowski, K. (2014). Stochastic Games and Learning. In: Baillieul, J., Samad, T. (eds) Encyclopedia of Systems and Control. Springer, London. https://doi.org/10.1007/978-1-4471-5102-9_33-2

  • DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-2

  • Publisher Name: Springer, London

  • Online ISBN: 978-1-4471-5102-9

  • eBook Packages: Springer Reference Engineering; Reference Module Computer Science and Engineering

Chapter history

  1. Latest

    Stochastic Games and Learning
    Published: 26 September 2019
    DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-3

  2. Stochastic Games and Learning
    Published: 13 October 2014
    DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-2

  3. Original

    Stochastic Games and Learning
    Published: 12 February 2014
    DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-1