Stochastic Games and Learning

Szajowski, Krzysztof

doi:10.1007/978-1-4471-5102-9_33-2

Abstract

Stochastic games were introduced by Lloyd Shapley in the early 1950s. A stochastic game is a dynamic game with probabilistic transitions played by one or more players. Play proceeds in a sequence of stages. At the beginning of each stage, the game is in some state. The players select actions, and each player receives a payoff that depends on the current state and the chosen actions. The game then moves to a new random state whose distribution depends on the previous state and on the actions chosen by the players. The procedure is repeated at the new state, and play continues for a finite or infinite number of stages. The total payoff to a player is often taken to be the discounted sum of the stage payoffs or the limit inferior of the averages of the stage payoffs.
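The stage-by-stage play described above can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the two states, the payoff table `PAYOFF`, the transition table `P_TO_STATE_1`, the discount factor `beta`, and the helper names `play` and `uniform_policy` do not come from the entry. The finite-horizon average returned by `play` is only a stand-in for the limit inferior of the averages, which concerns infinite play.

```python
import random

ACTIONS = [0, 1]

# Hypothetical stage payoffs: PAYOFF[state][(a1, a2)] -> (payoff 1, payoff 2).
PAYOFF = {
    0: {(0, 0): (1.0, -1.0), (0, 1): (0.0, 0.0),
        (1, 0): (0.0, 0.0), (1, 1): (2.0, -2.0)},
    1: {(0, 0): (0.0, 1.0), (0, 1): (1.0, 0.0),
        (1, 0): (1.0, 0.0), (1, 1): (0.0, 1.0)},
}

# Hypothetical transition law: probability of moving to state 1,
# given the current state and the joint action.
P_TO_STATE_1 = {
    0: {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.9},
    1: {(0, 0): 0.8, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.8},
}


def play(policies, n_stages, beta, start=0, rng=None):
    """Play n_stages; return (discounted sums, stage averages) per player."""
    rng = rng or random.Random(0)
    state = start
    discounted = [0.0, 0.0]
    total = [0.0, 0.0]
    weight = 1.0  # beta ** stage_index
    for _ in range(n_stages):
        # Each player picks an action knowing only the current state.
        joint = tuple(policy(state, rng) for policy in policies)
        stage = PAYOFF[state][joint]
        for i in range(2):
            discounted[i] += weight * stage[i]
            total[i] += stage[i]
        weight *= beta
        # The next state is random; its law depends on state and joint action.
        state = 1 if rng.random() < P_TO_STATE_1[state][joint] else 0
    average = [t / n_stages for t in total]
    return discounted, average


def uniform_policy(state, rng):
    """Stationary policy that ignores the state and randomizes uniformly."""
    return rng.choice(ACTIONS)


if __name__ == "__main__":
    disc, avg = play([uniform_policy, uniform_policy], n_stages=1000, beta=0.9)
    print("discounted payoffs:", disc)
    print("average payoffs:   ", avg)
```

Stationary policies are used only to keep the sketch short; in general a strategy may depend on the whole history of play.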

A learning problem arises when an agent does not know the reward function or the state transition probabilities. An approach in which the agent learns its optimal policy directly, without knowing either the reward function or the state transition function, is called model-free reinforcement learning. Q-learning is an example of such an approach.
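As a minimal single-agent sketch of model-free learning, the following tabular Q-learning loop only ever calls an environment function `step` and never inspects the reward function or the transition probabilities. The environment `toy_step`, the epsilon-greedy exploration rule, and all parameter values are assumptions chosen for illustration.

```python
import random


def q_learning(step, n_states, n_actions, episodes=2000, horizon=20,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; `step(state, action, rng)` returns (next_state, reward)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)
        for _ in range(horizon):
            # Epsilon-greedy exploration over the current estimates.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            next_state, reward = step(state, action, rng)
            # Move Q(s, a) toward the sampled one-step target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def toy_step(state, action, rng):
    """Hypothetical environment: action 1 usually leads to state 1, which pays 1."""
    p_state_1 = 0.9 if action == 1 else 0.1
    next_state = 1 if rng.random() < p_state_1 else 0
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward


if __name__ == "__main__":
    for s, row in enumerate(q_learning(toy_step, n_states=2, n_actions=2)):
        print("state", s, "Q-values", row)
```

The learner recovers a preference for action 1 purely from sampled transitions and rewards, which is the sense in which the method is model-free.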

Q-learning has been extended to a noncooperative multi-agent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions and performs updates that assume Nash equilibrium behavior with respect to the current Q-values. The main challenge is establishing convergence of the learning protocol.
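The update over joint actions can be illustrated with a deliberately simplified two-player sketch. It is not the algorithm of the cited papers: it restricts attention to pure-strategy equilibria of the stage game defined by the current Q-values, picks the first one found, and raises an error when none exists, whereas a full treatment needs a mixed-strategy equilibrium solver and an equilibrium selection rule. The function names and the table layout `Q[state][a1][a2]` are assumptions.

```python
from itertools import product


def pure_nash_equilibria(q1, q2):
    """All pure-strategy Nash equilibria (a1, a2) of the bimatrix game (q1, q2)."""
    n1, n2 = len(q1), len(q1[0])
    equilibria = []
    for a1, a2 in product(range(n1), range(n2)):
        # a1 must be a best reply to a2, and a2 a best reply to a1.
        best_for_1 = all(q1[a1][a2] >= q1[b][a2] for b in range(n1))
        best_for_2 = all(q2[a1][a2] >= q2[a1][b] for b in range(n2))
        if best_for_1 and best_for_2:
            equilibria.append((a1, a2))
    return equilibria


def nash_q_update(Q1, Q2, state, joint_action, rewards, next_state,
                  alpha=0.1, gamma=0.9):
    """One simplified Nash-Q-style update of both Q-tables, in place."""
    equilibria = pure_nash_equilibria(Q1[next_state], Q2[next_state])
    if not equilibria:
        raise ValueError("no pure equilibrium; a mixed-strategy solver is needed")
    e1, e2 = equilibria[0]  # assumption: take the first equilibrium found
    nash_values = (Q1[next_state][e1][e2], Q2[next_state][e1][e2])
    a1, a2 = joint_action
    for Q, reward, value in zip((Q1, Q2), rewards, nash_values):
        # Same form as Q-learning, with the max replaced by an equilibrium value.
        Q[state][a1][a2] = ((1 - alpha) * Q[state][a1][a2]
                            + alpha * (reward + gamma * value))


if __name__ == "__main__":
    # Two states; state 1 holds prisoner's-dilemma-like Q-values.
    Q1 = [[[0.0, 0.0], [0.0, 0.0]], [[3.0, 0.0], [5.0, 1.0]]]
    Q2 = [[[0.0, 0.0], [0.0, 0.0]], [[3.0, 5.0], [0.0, 1.0]]]
    nash_q_update(Q1, Q2, state=0, joint_action=(0, 1),
                  rewards=(2.0, 4.0), next_state=1)
    print(Q1[0], Q2[0])
```

When the stage game has several equilibria, different selections lead to different updates, which is one reason convergence is delicate.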

Bibliography

Aumann RJ (1987) Correlated equilibrium as an expression of Bayesian rationality. Econometrica 55:1–18. doi:10.2307/1911154
Bowling M, Veloso M (2001) Rational and convergent learning in stochastic games. In: Proceedings of the 17th international joint conference on artificial intelligence (IJCAI), Seattle, pp 1021–1026
Breton M (1991) Algorithms for stochastic games. In: Raghavan TES, Ferguson TS, Parthasarathy T, Vrieze OJ (eds) Stochastic games and related topics: in honor of Professor L. S. Shapley, vol 7. Springer Netherlands, Dordrecht, pp 45–57. doi:10.1007/978-94-011-3760-7_5
Brown GW (1951) Iterative solution of games by fictitious play. In: Koopmans TC (ed) Activity analysis of production and allocation. Wiley, New York, Chap. XXIV, pp 374–376
Buşoniu L, Babuška R, Schutter BD (2010) Multi-agent reinforcement learning: an overview. In: Srinivasan D, Jain LC (eds) Innovations in multi-agent systems and application–1. Springer, Berlin, pp 183–221
Carlson D, Haurie A (1995) A turnpike theory for infinite horizon open-loop differential games with decoupled controls. In: Olsder GJ (ed) New trends in dynamic games and applications. Annals of the international society of dynamic games, vol 3. Birkhäuser, Boston, pp 353–376
Filar J, Vrieze K (1997) Competitive Markov decision processes. Springer, New York
Filar JA, Schultz TA, Thuijsman F, Vrieze OJ (1991) Nonlinear programming and stationary equilibria in stochastic games. Math Program 50(2, Ser A):227–237. doi:10.1007/BF01594936
Forges F (1986) An approach to communication equilibria. Econometrica 54:1375–1385. doi:10.2307/1914304
Fudenberg D, Levine DK (1998) The theory of learning in games, vol 2. MIT, Cambridge
Greenwald A, Hall K (2003) Correlated-Q learning. In: Proceedings of the 20th international conference on machine learning (ICML-03), Washington, DC, 21–24 Aug 2003, pp 242–249
Herings PJ-J, Peeters RJAP (2004) Stationary equilibria in stochastic games: structure, selection, and computation. J Econ Theory 118(1):32–60. doi:10.1016/j.jet.2003.10.001
Hu J, Wellman MP (1998) Multiagent reinforcement learning: theoretical framework and an algorithm. In: Proceedings of the 15th international conference on machine learning, New Brunswick, pp 242–250
Hu J, Wellman MP (2003) Nash Q-learning for general-sum stochastic games. J Mach Learn Res 4:1039–1069
Leslie DS, Collins EJ (2005) Individual Q-learning in normal form games. SIAM J Control Optim 44(2):495–514. doi:10.1137/S0363012903437976
Littman ML (1994) Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the 13th international conference on machine learning, New Brunswick, pp 157–163
Myerson RB (1978) Refinements of the Nash equilibrium concept. Int J Game Theory 7(2):73–80. doi:10.1007/BF01753236
Nowak AS (2008) Equilibrium in a dynamic game of capital accumulation with the overtaking criterion. Econ Lett 99(2):233–237. doi:10.1016/j.econlet.2007.05.033
Nowak AS, Szajowski K (1998) Nonzerosum stochastic games. In: Bardi M, Raghavan TES, Parthasarathy T (eds) Stochastic and differential games: theory and numerical methods. Annals of the international society of dynamic games, vol 4. Birkhäuser, Boston, pp 297–342. doi:10.1007/978-1-4612-1592-9_7
Ramsey F (1928) A mathematical theory of savings. Econ J 38:543–559
Robinson J (1951) An iterative method of solving a game. Ann Math 2(54):296–301. doi:10.2307/1969530
Rogers PD (1969) Nonzero-sum stochastic games, PhD thesis, University of California, Berkeley. ProQuest LLC, Ann Arbor
Rubinstein A (1979) Equilibrium in supergames with the overtaking criterion. J Econ Theory 21:1–9. doi:10.1016/0022-0531(79)90002-4
Shapley L (1953) Stochastic games. Proc Natl Acad Sci USA 39:1095–1100. doi:10.1073/pnas.39.10.1095
Shapley L (1964) Some topics in two-person games. Ann Math Stud 52:1–28
Shoham Y, Leyton-Brown K (2009) Multiagent systems: algorithmic, game-theoretic, and logical foundations. Cambridge University Press, Cambridge. doi:10.1017/CBO9780511811654
Sobel MJ (1971) Noncooperative stochastic games. Ann Math Stat 42:1930–1935. doi:10.1214/aoms/1177693059
Tijms H (2012) Stochastic games and dynamic programming. Asia Pac Math Newsl 2(3):6–10
Vohra R, Wellman M (eds) (2007) Foundations of multi-agent learning. Artif Intell 171:363–452
Weiß G, Sen S (eds) (1996) Adaption and learning in multi-agent Systems. In: Proceedings of the IJCAI’95 workshop, Montréal, 21 Aug 1995, vol 1042. Springer, Berlin. doi:10.1007/3-540-60923-7

Author information

Authors and Affiliations

Faculty of Fundamental Problems of Technology, Institute of Mathematics and Computer Science, Wroclaw University of Technology, Wroclaw, Poland
Krzysztof Szajowski

Corresponding author

Correspondence to Krzysztof Szajowski.

Editor information

Editors and Affiliations

Electrical and Computer Engineering, Boston University, Boston, Massachusetts, USA
John Baillieul
Automation and Control Solutions, Honeywell, Golden Valley, Minnesota, USA
Tariq Samad

About this entry

Cite this entry

Szajowski, K. (2014). Stochastic Games and Learning. In: Baillieul, J., Samad, T. (eds) Encyclopedia of Systems and Control. Springer, London. https://doi.org/10.1007/978-1-4471-5102-9_33-2

DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-2
Received: 16 May 2014
Accepted: 16 May 2014
Published: 13 October 2014
Publisher Name: Springer, London
Online ISBN: 978-1-4471-5102-9
eBook Packages: Springer Reference Engineering; Reference Module Computer Science and Engineering

Chapter history

Latest: Stochastic Games and Learning. Published: 26 September 2019. DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-3
Stochastic Games and Learning. Published: 13 October 2014. DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-2
Original: Stochastic Games and Learning. Published: 12 February 2014. DOI: https://doi.org/10.1007/978-1-4471-5102-9_33-1