Abstract
We consider general, stable nonlinear differential equations with N inputs and N outputs, where in the steady state, the output signals represent the payoff functions of a noncooperative game played by the steady-state values of the input signals. To achieve locally stable convergence to the resulting steady-state Nash equilibria, we introduce a non-model-based approach, where the players determine their actions based only on their own payoff values. This strategy is based on the extremum seeking approach, which has previously been developed for standard optimization problems and employs sinusoidal perturbations to estimate the gradient. Since non-quadratic payoffs create the possibility of multiple, isolated Nash equilibria, our convergence results are local. Specifically, the attainment of any particular Nash equilibrium is not assured for all initial conditions, but only for initial conditions in a set around that specific stable Nash equilibrium. For non-quadratic payoffs, the convergence to a Nash equilibrium is not perfect, but is biased in proportion to the perturbation amplitudes and the higher derivatives of the payoff functions. We quantify the size of these residual biases.
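To make the idea of payoff-only Nash seeking concrete, the following is a minimal sketch, not the chapter's exact scheme. It assumes a static two-player game (no plant dynamics), hypothetical quadratic payoffs chosen so that the Nash equilibrium (1.6, 1.2) can be computed by hand, forward-Euler integration, and illustrative values for the perturbation amplitude `a`, gain `k`, and frequencies `omegas`. Each player perturbs its action with a sinusoid and integrates the product of that sinusoid with its own measured payoff, which on average moves the action along the player's own payoff gradient.

```python
import math


def payoffs(u1, u2):
    """Hypothetical quadratic payoffs (an assumption for illustration).

    Best responses: u1 = 0.5*u2 + 1 and u2 = 2 - 0.5*u1,
    so the unique Nash equilibrium is (1.6, 1.2).
    """
    J1 = -(u1 - 0.5 * u2 - 1.0) ** 2
    J2 = -(u2 + 0.5 * u1 - 2.0) ** 2
    return J1, J2


def nash_seek(u_hat=(0.0, 0.0), a=0.2, k=5.0, omegas=(30.0, 40.0),
              dt=1e-3, t_end=60.0):
    """Sinusoidal-perturbation Nash seeking on a static map.

    Player i applies u_i = u_hat_i + a*sin(w_i*t) and updates
    d(u_hat_i)/dt = k * a*sin(w_i*t) * J_i, using only its own payoff.
    """
    u1_hat, u2_hat = u_hat
    w1, w2 = omegas
    steps = int(round(t_end / dt))
    for n in range(steps):
        t = n * dt
        m1 = a * math.sin(w1 * t)  # player 1 perturbation
        m2 = a * math.sin(w2 * t)  # player 2 perturbation
        J1, J2 = payoffs(u1_hat + m1, u2_hat + m2)
        # Forward-Euler step of the gradient-estimate integrators.
        u1_hat += dt * k * m1 * J1
        u2_hat += dt * k * m2 * J2
    return u1_hat, u2_hat


if __name__ == "__main__":
    print(nash_seek())
```

Neither player evaluates the other's payoff or any gradient formula; only the scalar payoff value is used. With non-quadratic payoffs, the abstract indicates that the estimates would instead settle with a bias that depends on the perturbation amplitudes, which this quadratic sketch does not exhibit.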
Acknowledgements
This research was made with Government support under and awarded by DoD, Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a, and by grants from National Science Foundation, DOE, and AFOSR.
Appendix
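The following integrals are computed to obtain (9.21), where we have assumed the frequencies satisfy $\omega_i \neq \omega_j$, $2\omega_i \neq \omega_j$, $3\omega_i \neq \omega_j$, $\omega_i \neq \omega_j + \omega_k$, $\omega_i \neq 2\omega_j + \omega_k$, $2\omega_i \neq \omega_j + \omega_k$, for distinct $i, j, k \in \{1, \ldots, N\}$ and defined $\gamma_i = \omega_i / \min_i\{\omega_i\}$: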
The conditions $3\omega_i \neq \omega_j$, $\omega_i \neq 2\omega_j + \omega_k$, and $2\omega_i \neq \omega_j + \omega_k$ arise due to the payoff functions being non-quadratic and are not required for quadratic payoff functions.
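The frequency conditions listed above can be checked mechanically before choosing perturbation signals. The sketch below is an illustration under stated assumptions: the function names `frequencies_ok` and `normalized_frequencies`, the `non_quadratic` flag, and the floating-point tolerance `rel_tol` are all hypothetical and not taken from the chapter. The checker tests every ordered pair and triple of distinct player indices, and skips the three extra conditions when the payoffs are quadratic.

```python
import math
from itertools import permutations


def frequencies_ok(omegas, non_quadratic=True, rel_tol=1e-9):
    """Return True if the perturbation frequencies satisfy the conditions.

    Always checked:       w_i != w_j, 2*w_i != w_j, w_i != w_j + w_k.
    Only if non_quadratic: 3*w_i != w_j, w_i != 2*w_j + w_k,
                           2*w_i != w_j + w_k.
    Indices i, j, k are distinct; equality is tested up to rel_tol.
    """
    def eq(x, y):
        return math.isclose(x, y, rel_tol=rel_tol)

    # permutations() works on positions, so indices are always distinct.
    for wi, wj in permutations(omegas, 2):
        if eq(wi, wj) or eq(2 * wi, wj):
            return False
        if non_quadratic and eq(3 * wi, wj):
            return False

    for wi, wj, wk in permutations(omegas, 3):
        if eq(wi, wj + wk):
            return False
        if non_quadratic and (eq(wi, 2 * wj + wk) or eq(2 * wi, wj + wk)):
            return False
    return True


def normalized_frequencies(omegas):
    """gamma_i = w_i / min(w), as defined in the text."""
    w_min = min(omegas)
    return [w / w_min for w in omegas]
```

For example, the pair (1, 3) violates $3\omega_i \neq \omega_j$, so it is rejected for non-quadratic payoffs but accepted when `non_quadratic=False`.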
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Frihauf, P., Krstic, M., Başar, T. (2013). Nash Equilibrium Seeking for Dynamic Systems with Non-quadratic Payoffs. In: Cardaliaguet, P., Cressman, R. (eds) Advances in Dynamic Games. Annals of the International Society of Dynamic Games, vol 12. Birkhäuser, Boston, MA. https://doi.org/10.1007/978-0-8176-8355-9_9
DOI: https://doi.org/10.1007/978-0-8176-8355-9_9
Publisher Name: Birkhäuser, Boston, MA
Print ISBN: 978-0-8176-8354-2
Online ISBN: 978-0-8176-8355-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)