Game-Theoretic Learning in Distributed Control

Marden, Jason R.; Shamma, Jeff S.

doi:10.1007/978-3-319-27335-8_9-1

Abstract

In distributed architecture control problems, there is a collection of interconnected decision-making components that seek to realize desirable collective behaviors through local interactions and by processing local information. Applications range from autonomous vehicles to energy to transportation. One approach to control of such distributed architectures is to view the components as players in a game. In this approach, two design considerations are the components’ incentives and the rules that dictate how components react to the decisions of other components. In game-theoretic language, the incentives are defined through utility functions, and the reaction rules are online learning dynamics. This chapter presents an overview of this approach, covering basic concepts in game theory, special game classes, measures of distributed efficiency, utility design, and online learning rules, all with the interpretation of using game theory as a prescriptive paradigm for distributed control design.

This work was supported by ONR Grant #N00014-17-1-2060 and NSF Grant #ECCS-1638214 and by funding from King Abdullah University of Science and Technology (KAUST).

Notes

1. Alternative agent control policies, where the policy of agent i also depends on previous actions of agent i or on auxiliary “side information,” could also be replicated by introducing an underlying state in the game-theoretic environment. State-based games, introduced in Marden (2012), are one framework that could accomplish this goal.
2. Recall that we are assuming a finite set of players, each with a finite set of actions.
3. Another common equilibrium set, termed correlated equilibrium, is similar to coarse correlated equilibrium; the difference is that it considers conditional deviations rather than the unconditional deviations considered in (8). A formal definition of correlated equilibrium can be found in Young (2004).
4. Commonly studied variants of exact potential games, e.g., ordinal or weighted potential games, also possess the finite improvement property.
5. Here, we use cost functions J_i(·) instead of utility functions U_i(·) in situations where the agents are minimizers instead of maximizers.
6. We write a_i(t) = B_i^m(h^m(t)) with the understanding that the action a_i(t) is chosen randomly according to the probability distribution specified by B_i^m(h^m(t)).
7. The actual definition of a finite better reply process considered in Young (2004) puts a further condition on the structure of B_i^m in the case where the memory is not saturated, i.e., the strategy assigns positive probability to any action with strictly positive regret. However, an identical proof holds for any B_i^m that satisfies the weaker conditions set forth in this chapter.
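
The finite improvement property mentioned in Note 4 can be illustrated with a minimal sketch. The game below is a hypothetical three-player, two-resource congestion game with made-up cost tables; it is not taken from the chapter. It uses cost functions J_i in the sense of Note 5 (agents are minimizers) and the standard Rosenthal potential for congestion games. All names (`COST`, `J`, `potential`, `better_reply_path`, `is_pure_nash`) are assumptions introduced for illustration.

```python
import itertools

# Hypothetical congestion game (illustrative numbers, not from the chapter).
PLAYERS = range(3)
ACTIONS = ("r1", "r2")
# COST[r][k - 1] is the cost each user of resource r pays when k players use it.
COST = {"r1": [1, 3, 6], "r2": [2, 4, 5]}


def load(profile, r):
    """Number of players choosing resource r in the action profile."""
    return sum(1 for a in profile if a == r)


def J(i, profile):
    """Cost of player i: the congestion cost of the resource it selected."""
    r = profile[i]
    return COST[r][load(profile, r) - 1]


def potential(profile):
    """Rosenthal potential: for each resource, sum costs for loads 1..load."""
    return sum(sum(COST[r][:load(profile, r)]) for r in ACTIONS)


def deviate(profile, i, a):
    """Profile in which player i switches to action a, others unchanged."""
    return profile[:i] + (a,) + profile[i + 1:]


def better_reply_path(start):
    """Apply one strictly improving unilateral deviation at a time until none exists."""
    path = [tuple(start)]
    while True:
        current = path[-1]
        improving = [
            deviate(current, i, a)
            for i in PLAYERS
            for a in ACTIONS
            if J(i, deviate(current, i, a)) < J(i, current)
        ]
        if not improving:
            return path
        path.append(improving[0])


def is_pure_nash(profile):
    """True if no player can strictly lower its cost by deviating alone."""
    return all(
        J(i, profile) <= J(i, deviate(profile, i, a))
        for i in PLAYERS
        for a in ACTIONS
    )


# Every better-reply path from every starting profile ends at a pure Nash equilibrium.
for start in itertools.product(ACTIONS, repeat=len(PLAYERS)):
    end = better_reply_path(start)[-1]
    print(start, "->", end, "pure Nash:", is_pure_nash(end))
```

Because each improving deviation lowers the potential by exactly the deviator's cost reduction, and there are finitely many profiles, no path can cycle. This sketch only covers the exact potential case; the ordinal and weighted variants named in Note 4 would need a different potential check.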

References

Alos-Ferrer C, Netzer N (2010) The logit-response dynamics. Games Econ Behav 68:413–427
Altman E, Bonneau N, Debbah M (2006) Correlated equilibrium in access control for wireless communications. In: 5th International Conference on Networking
Babichenko Y (2012) Completely uncoupled dynamics and Nash equilibria. Games Econ Behav 76:1–14
Blondel VD, Hendrickx JM, Olshevsky A, Tsitsiklis JN (2005a) Convergence in multiagent coordination, consensus, and flocking. In: IEEE Conference on Decision and Control, pp 2996–3000
Blondel VD, Hendrickx JM, Olshevsky A, Tsitsiklis JN (2005b) Convergence in multiagent coordination, consensus, and flocking. In: Proceedings of the Joint 44th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC’05), Seville
Blume L (1993) The statistical mechanics of strategic interaction. Games Econ Behav 5:387–424
Blume L (1997) Population games. In: Arthur B, Durlauf S, and Lane D (eds) The economy as an evolving complex system II. Addison-Wesley, Reading, pp 425–460
Borowski H, Marden JR, Frew EW (2013) Fast convergence in semi-anonymous potential games. In: Proceedings of the IEEE Conference on Decision and Control, pp 2418–2423
Borowski HP, Marden JR, Shamma JS (2014) Learning efficient correlated equilibria. In: Proceedings of the IEEE Conference on Decision and Control, pp 6836–6841
Cortes J, Martinez S, Karatas T, Bullo F (2002) Coverage control for mobile sensing networks. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA’02), Washington, DC, pp 1327–1332
Daskalakis C, Goldberg PW, Papadimitriou CH (2009) The complexity of computing a Nash equilibrium. SIAM J Comput 39(1):195–259
Fudenberg D, Levine DK (1995) Consistency and cautious fictitious play. J Econ Dyn Control 19:1065–1089
Fudenberg D, Levine DK (1998) The theory of learning in games. MIT Press, Cambridge
Fudenberg D, Tirole J (1991) Game theory. MIT Press, Cambridge
Gopalakrishnan R, Marden JR, Wierman A (2014) Potential games are necessary to ensure pure Nash equilibria in cost sharing games. Math Oper Res 39(4):1252–1296
Hart S (2005) Adaptive heuristics. Econometrica 73(5):1401–1430
Hart S, Mansour Y (2010) How long to equilibrium? The communication complexity of uncoupled equilibrium procedures. Games Econ Behav 69(1):107–126
Hart S, Mas-Colell A (2000) A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5):1127–1150
Hart S, Mas-Colell A (2003) Uncoupled dynamics do not lead to Nash equilibrium. Am Econ Rev 93(5):1830–1836
Jadbabaie A, Lin J, Morse AS (2003) Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Autom Control 48(6):988–1001
Kearns MJ, Littman ML, Singh SP (2001) Graphical models for game theory. In: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, pp 253–260
Lambert III TJ, Epelman MA, Smith RL (2005) A fictitious play approach to large-scale optimization. Oper Res 53(3):477–489
Marden JR (2012) State based potential games. Automatica 48:3075–3088
Marden JR (2015) Selecting efficient correlated equilibria through distributed learning. In: American Control Conference, pp 4048–4053
Marden JR, Shamma JS (2012) Revisiting log-linear learning: asynchrony, completeness and a payoff-based implementation. Games Econ Behav 75(2):788–808
Marden JR, Shamma JS (2015) Game theory and distributed control. In: Young HP, Zamir S (eds) Handbook of game theory with economic applications, vol 4. Elsevier Science, pp 861–899
Marden JR, Wierman A (2013) Distributed welfare games. Oper Res 61:155–168
Marden JR, Arslan G, Shamma JS (2009) Joint strategy fictitious play with inertia for potential games. IEEE Trans Autom Control 54:208–220
Martinez S, Cortes J, Bullo F (2007) Motion coordination with distributed information. Control Syst Mag 27(4):75–88
Monderer D, Shapley LS (1996) Fictitious play property for games with identical interests. J Econ Theory 68:258–265
Montanari A, Saberi A (2009) Convergence to equilibrium in local interaction games. In: 50th Annual IEEE Symposium on Foundations of Computer Science
Murphey RA (1999) Target-based weapon target assignment problems. In: Pardalos PM, Pitsoulis LS (eds) Nonlinear assignment problems: algorithms and applications. Kluwer Academic, Alexandra
Nisan N, Roughgarden T, Tardos E, Vazirani VV (2007) Algorithmic game theory. Cambridge University Press, New York
Olfati-Saber R (2006) Flocking for multi-agent dynamic systems: algorithms and theory. IEEE Trans Autom Control 51:401–420
Olfati-Saber R, Murray RM (2003) Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans Autom Control 49(9):1520–1533
Olfati-Saber R, Fax JA, Murray RM (2007) Consensus and cooperation in networked multi-agent systems. Proc IEEE 95(1):215–233
Roughgarden T (2005) Selfish routing and the price of anarchy. MIT Press, Cambridge
Roughgarden T (2015) Intrinsic robustness of the price of anarchy. J ACM 62(5):32:1–32:42
Shah D, Shin J (2010) Dynamics in congestion games. In: ACM SIGMETRICS, pp 107–118
Shamma JS (2014) Learning in games. In: Baillieul J, Samad T (eds) Encyclopedia of systems and control. Springer, London
Shoham Y, Powers R, Grenager T (2007) If multi-agent learning is the answer, what is the question? Artif Intell 171(7):365–377
Touri B, Nedic A (2011) On ergodicity, infinite flow, and consensus in random models. IEEE Trans Autom Control 56(7):1593–1605
Tsitsiklis JN (1987) Decentralized detection by a large number of sensors. Technical report. MIT, LIDS
Tsitsiklis JN, Bertsekas DP, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31(9):803–812
Vetta A (2002) Nash equilibria in competitive societies, with applications to facility location, traffic routing and auctions. In: Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, pp 416–425
Wang B, Han Z, Liu KJR (2009) Peer-to-peer file sharing game using correlated equilibrium. In: 43rd Annual Conference on Information Sciences and Systems, CISS 2009, pp 729–734
Wolpert D, Tumer K (1999) An overview of collective intelligence. In: Bradshaw JM (ed) Handbook of agent technology. AAAI Press/MIT Press, Cambridge, USA
Young HP (1998) Individual strategy and social structure. Princeton University Press, Princeton
Young HP (2004) Strategic learning and its limits. Oxford University Press, New York
Zhu M, Martínez S (2013) Distributed coverage games for energy-aware mobile sensor networks. SIAM J Control Optim 51(1):1–27

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of California, Santa Barbara, 5161 Harold Frank Hall, 93106, Santa Barbara, CA, USA
Jason R. Marden
Computer, Electrical and Mathematical Science and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), 23955–6900, Thuwal, Saudi Arabia
Jeff S. Shamma

Corresponding author

Correspondence to Jason R. Marden.

Editor information

Editors and Affiliations

Center for Advanced Study, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
Tamer Başar
GERAD, HEC Montréal, Montreal, Québec, Canada
Georges Zaccour

Section Editor information

Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1308 West Main, 61801, Urbana, IL, USA
Tamer Başar
Département de sciences de la décision, GERAD, HEC Montréal, Montreal, QC, Canada
Georges Zaccour

Cite this entry

Marden, J.R., Shamma, J.S. (2017). Game-Theoretic Learning in Distributed Control. In: Basar, T., Zaccour, G. (eds) Handbook of Dynamic Game Theory. Springer, Cham. https://doi.org/10.1007/978-3-319-27335-8_9-1

DOI: https://doi.org/10.1007/978-3-319-27335-8_9-1
Received: 14 June 2016
Accepted: 09 June 2017
Published: 12 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27335-8
Online ISBN: 978-3-319-27335-8
eBook Packages: Springer Reference Religion and Philosophy; Reference Module Humanities and Social Sciences; Reference Module Humanities