Graphical Games: Distributed Multiplayer Games on Graphs

  • Frank L. Lewis
  • Hongwei Zhang
  • Kristian Hengster-Movric
  • Abhijit Das
Chapter
Part of the Communications and Control Engineering book series (CCE)

Abstract

This chapter shows that distributed control protocols that both guarantee synchronization and are globally optimal for the multi-agent team always exist on any sufficiently connected communication graph, provided a different definition of optimality is used. To this end, we study the notion of Nash equilibrium for multiplayer games on graphs. This leads to a new kind of differential game: the graphical game. In a graphical game, each agent has its own dynamics as well as its own local performance index. The dynamics and local performance indices of each agent are distributed; they depend on the state of the agent, the control of the agent, and the controls of the agent’s neighbors. We show how to compute distributed control protocols that guarantee global Nash equilibrium for multi-agent teams on any graph that has a spanning tree.
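The graph condition named in the abstract, that the communication graph has a spanning tree, can be checked mechanically. The following is only a minimal sketch under stated assumptions, not the chapter's own method: it treats the graph as directed, reads an edge `(src, dst)` as information flowing from `src` to `dst`, and tests whether some root node reaches every other node by breadth-first search. The function names `reachable_from` and `has_spanning_tree` are hypothetical helpers introduced for illustration.

```python
from collections import deque


def reachable_from(root, adjacency):
    """Return the set of nodes reachable from root along directed edges (BFS)."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def has_spanning_tree(nodes, edges):
    """True if some node can reach all others, i.e. a directed spanning tree exists.

    Assumption (not from the chapter): an edge (src, dst) means that
    information flows from src to dst.
    """
    adjacency = {n: [] for n in nodes}
    for src, dst in edges:
        adjacency[src].append(dst)
    return any(len(reachable_from(r, adjacency)) == len(nodes) for r in nodes)


# A chain 1 -> 2 -> 3 -> 4 has a spanning tree rooted at node 1.
chain_ok = has_spanning_tree([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
# Two disconnected pairs have no common root.
split_ok = has_spanning_tree([1, 2, 3, 4], [(1, 2), (3, 4)])
```

Trying every node as a root costs on the order of N(N + E) operations for N nodes and E edges, which is acceptable for small agent teams; a strongly-connected-component condensation would scale better on large graphs.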

Keywords

Nash Equilibrium, Span Tree, Reinforcement Learning, Communication Graph, Policy Iteration
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  • Frank L. Lewis (1)
  • Hongwei Zhang (2)
  • Kristian Hengster-Movric (3)
  • Abhijit Das (4)

  1. UTA Research Institute, University of Texas at Arlington, Fort Worth, USA
  2. School of Electrical Engineering, Southwest Jiaotong University, Chengdu, People’s Republic of China
  3. UTA Research Institute, University of Texas at Arlington, Fort Worth, USA
  4. Advanced Systems Engineering, Danfoss Power Solutions (US) Company, Ames, USA
