Continuous Time Learning Algorithms in Optimization and Game Theory

Dynamic Games and Applications

Abstract

This work compares continuous-time learning algorithms used in optimization and game theory. The first three algorithms derive from no-regret dynamics and cover, in particular, “Replicator dynamics” and “Local projection dynamics”. We then study “Conditional gradient” versus “Global projection” dynamics, and finally “Frank-Wolfe” versus “Best reply” dynamics. Important similarities emerge when considering potential or dissipative games.

Author information

Corresponding author

Correspondence to Sylvain Sorin.

Additional information

I thank K. Avrachenkov, J. Bolte and J. Hofbauer for interesting discussions and nice comments. I acknowledge partial support from COST Action GAMENET.

This article is part of the topical collection “Multi-agent Dynamic Decision Making and Learning” edited by Konstantin Avrachenkov, Vivek S. Borkar and U. Jayakrishnan Nair.

Appendix

Assume that g is continuous and dissipative, and recall that X is convex and compact. Let us prove that \(S_{\mathrm{ext}} \) is non-empty. Define:

$$\begin{aligned} S_{\mathrm{ext}}^y = \{x \in X; \langle g (y) | x - y \rangle \ge 0 \} \end{aligned}$$

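so that \( S_{\mathrm{ext}} = \cap _{y \in X} S_{\mathrm{ext}}^y\). Each \(S_{\mathrm{ext}}^y\) is closed and convex, so by compactness (weak compactness in a Hilbert framework) it is enough to show that every finite sub-family has a non-empty intersection, which follows from: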

Claim

For any finite collection \(y_i \in X, i \in I\), there exists \( x \in co \{ y_i, i \in I\}\) such that:

$$\begin{aligned} \langle g (y_i) | x - y_i \rangle \ge 0, \qquad \forall i \in I. \end{aligned} \tag{47}$$

Consider the finite two-person zero-sum game defined by the following \(I \times I\) matrix A:

$$\begin{aligned} A_{ij}= \langle g (y_j) | y_i - y_j \rangle . \end{aligned}$$

Introduce \(B = \frac{1}{2}[ A +\, ^tA] \) and \(C = \frac{1}{2}[ A -\, ^tA] \).

The crucial point is that B has nonnegative coefficients, since g is dissipative:

$$\begin{aligned} B_{ij}= \frac{1}{2}\left[ \langle g (y_j) | y_i - y_j \rangle + \langle g (y_i) | y_j - y_i \rangle \right] = \frac{1}{2} \langle g (y_j) - g(y_i) | y_i - y_j \rangle \ge 0. \end{aligned}$$
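The decomposition of A into its symmetric part B and antisymmetric part C can be checked numerically. The following is a minimal sketch in plain Python, under assumptions that are not in the paper: the map `g` is a hypothetical dissipative example on the plane (a contraction plus a rotation), and the helper names `dot`, `sub` and `payoff_matrices` as well as the sample `points` are chosen only for illustration.

```python
def g(x):
    # Hypothetical dissipative map on R^2 (an assumption, not from the paper):
    # g(x) = -x + (skew part), so <g(x) - g(y) | x - y> = -|x - y|^2 <= 0.
    return (-x[0] - x[1], x[0] - x[1])


def dot(a, b):
    # Euclidean inner product <a | b>.
    return sum(ai * bi for ai, bi in zip(a, b))


def sub(a, b):
    # Componentwise difference a - b.
    return tuple(ai - bi for ai, bi in zip(a, b))


def payoff_matrices(points):
    # A_ij = <g(y_j) | y_i - y_j>, B = (A + tA)/2, C = (A - tA)/2.
    n = len(points)
    A = [[dot(g(points[j]), sub(points[i], points[j])) for j in range(n)]
         for i in range(n)]
    B = [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    C = [[0.5 * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]
    return A, B, C


# Illustrative sample points y_i (the corners of the unit square).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
A, B, C = payoff_matrices(points)

n = len(points)
# B has nonnegative coefficients because g is dissipative.
b_nonnegative = all(B[i][j] >= 0 for i in range(n) for j in range(n))
# C is antisymmetric, so the zero-sum game C has value 0.
c_antisymmetric = all(C[i][j] == -C[j][i] for i in range(n) for j in range(n))
print(b_nonnegative, c_antisymmetric)
```

For this particular choice of `g`, each entry of B equals half the squared distance between the two points, which makes the sign condition easy to see by hand.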

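The matrix C is antisymmetric, so the game C has value 0 and an optimal strategy \(u \in \Delta (I)\) satisfies \(uC_j \ge 0, \forall j \in I\). Since \(A = B + C\) and B has nonnegative coefficients, this gives \(uA_j \ge 0, \forall j \in I\). Letting \(x = \sum _i u_i y_i\), this is exactly (47).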
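As an illustrative check of the argument (a toy instance chosen here, not taken from the paper), let \(X = [0,1]\) and \(g(x) = -x\), which is dissipative since \(\langle g(x) - g(y) | x - y \rangle = -(x-y)^2 \le 0\). Take \(y_1 = 0\) and \(y_2 = 1\). Then:

$$\begin{aligned} A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad C = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \end{aligned}$$

The strategy \(u = (1,0)\) is optimal in C, since \(uC = (0, \tfrac{1}{2}) \ge 0\), and it gives \(uA = (0,1) \ge 0\). The corresponding point is \(x = y_1 = 0\), and indeed \(\langle g(y_1) | x - y_1 \rangle = 0\) and \(\langle g(y_2) | x - y_2 \rangle = 1\), so (47) holds in this instance.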

Cite this article

Sorin, S. Continuous Time Learning Algorithms in Optimization and Game Theory. Dyn Games Appl 13, 3–24 (2023). https://doi.org/10.1007/s13235-021-00423-x

  • DOI: https://doi.org/10.1007/s13235-021-00423-x
