Generalized Conditional Gradient and Learning in Potential Mean Field Games

Lavigne, Pierre; Pfeiffer, Laurent

doi:10.1007/s00245-023-10056-8

Published: 24 October 2023

Volume 88, article number 89, (2023)

Applied Mathematics & Optimization

Abstract

We investigate the resolution of second-order, potential, and monotone mean field games with the generalized conditional gradient algorithm, an extension of the Frank-Wolfe algorithm. We show that the method is equivalent to the fictitious play method. We establish rates of convergence for the optimality gap, the exploitability, and the distances of the variables to the unique solution of the mean field game, for various choices of stepsizes. In particular, we show that linear convergence can be achieved when the stepsizes are computed by linesearch.

References

1. Achdou, Y., Laurière, M.: Mean field games and applications: numerical aspects. Mean Field Games: Cetraro, Italy 2019, 249–307 (2020)
2. Benamou, J.-D., Carlier, G.: Augmented Lagrangian methods for transport optimization, mean field games and degenerate elliptic equations. J. Optim. Theory Appl. 167(1), 1–26 (2015)
3. Benamou, J.-D., Carlier, G., Di Marino, S., Nenna, L.: An entropy minimization approach to second-order variational mean-field games. Math. Models Methods Appl. Sci. 29(08), 1553–1583 (2019)
4. Benamou, J.-D., Carlier, G., Santambrogio, F.: Variational mean field games. In: Active Particles, Vol. 1, pp. 141–171. Springer, New York (2017)
5. Bonnans, J.F., Hadikhanloo, S., Pfeiffer, L.: Schauder estimates for a class of potential mean field games of controls. Appl. Math. Optim. 83, 1431–1464 (2021)
6. Bonnans, J.F., Lavigne, P., Pfeiffer, L.: Discrete potential mean field games: duality and numerical resolution. Math. Program. (2023)
7. Bredies, K., Lorenz, D.A., Maass, P.: A generalized conditional gradient method and its connection to an iterative shrinkage method. Comput. Optim. Appl. 42(2), 173–193 (2009)
8. Briani, A., Cardaliaguet, P.: Stable solutions in potential mean field game systems. Nonlinear Differ. Equ. Appl. 25(1), 1 (2018)
9. Briceño-Arias, L., Kalise, D., Kobeissi, Z., Laurière, M., González, A.M., Silva, F.: On the implementation of a primal-dual algorithm for second order time-dependent mean field games with local couplings. In: ESAIM: Proceedings and Surveys, Vol. 65, pp. 330–348 (2019)
10. Briceño-Arias, L., Kalise, D., Silva, F.: Proximal methods for stationary mean field games with local couplings. SIAM J. Control Optim. 56(2), 801–836 (2018)
11. Cacace, S., Camilli, F., Goffi, A.: A policy iteration method for mean field games. ESAIM 27, 85 (2021)
12. Camilli, F., Tang, Q.: Rates of convergence for the policy iteration method for mean field games systems. J. Math. Anal. Appl. 512(1), 126–138 (2022)
13. Cardaliaguet, P., Graber, P.J., Porretta, A., Tonon, D.: Second order mean field games with degenerate diffusion and local coupling. Nonlinear Differ. Equ. Appl. 22(5), 1287–1317 (2015)
14. Cardaliaguet, P., Hadikhanloo, S.: Learning in mean field games: the fictitious play. ESAIM 23(2), 569–591 (2017)
15. Cardaliaguet, P., Lehalle, C.-A.: Mean field game of controls and an application to trade crowding. Math. Financ. Econ. 12(3), 335–363 (2018)
16. Elie, R., Pérolat, J., Laurière, M., Geist, M., Pietquin, O.: Approximate fictitious play for mean field games. arXiv preprint arXiv:1907.02633 (2019)
17. Fleming, W.H., Soner, H.M.: Controlled Markov Processes and Viscosity Solutions, vol. 25. Springer, New York (2006)
18. Geist, M., Pérolat, J., Laurière, M., Elie, R., Perrin, S., Bachem, O., Munos, R., Pietquin, O.: Concave utility reinforcement learning: the mean-field game viewpoint. arXiv preprint arXiv:2106.03787 (2021)
19. Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Springer, New York (2015)
20. Graber, P.J., Mouzouni, C.: Variational mean field games for market competition. In: PDE Models for Multi-agent Phenomena, pp. 93–114. Springer, New York (2018)
21. Graber, P.J., Mullenix, A., Pfeiffer, L.: Weak solutions for potential mean field games of controls. Nonlinear Differ. Equ. Appl. 28(5), 1–34 (2021)
22. Hadikhanloo, S.: Learning in anonymous nonatomic games with applications to first-order mean field games. arXiv preprint arXiv:1704.00378 (2017)
23. Hadikhanloo, S., Silva, F.J.: Finite mean field games: fictitious play and convergence to a first order continuous mean field game. J. Math. Pures Appl. 132, 369–397 (2019)
24. Huang, M., Malhamé, R.P., Caines, P.E.: Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6(3), 221–252 (2006)
25. Jaggi, M.: Revisiting Frank-Wolfe: projection-free sparse convex optimization. In: International Conference on Machine Learning, pp. 427–435. PMLR (2013)
26. Kunisch, K., Walter, D.: On fast convergence rates for generalized conditional gradient methods with backtracking stepsize. Numerical Algebra, Control and Optimization (2022)
27. Lacoste-Julien, S.: Convergence rate of Frank-Wolfe for non-convex objectives. arXiv preprint arXiv:1607.00345 (2016)
28. Ladyzhenskaia, O.A., Solonnikov, V.A., Ural’tseva, N.N.: Linear and Quasi-linear Equations of Parabolic Type, vol. 23. American Mathematical Society, Providence (1988)
29. Lasry, J.-M., Lions, P.-L.: Mean field games. Jpn. J. Math. 2(1), 229–260 (2007)
30. Lions, J.-L.: Optimal Control of Systems Governed by Partial Differential Equations, vol. 170. Springer, New York (1971)
31. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (1999)
32. Perrin, S., Pérolat, J., Laurière, M., Geist, M., Elie, R., Pietquin, O.: Fictitious play for mean field games: Continuous time analysis and applications. arXiv preprint arXiv:2007.03458 (2020)
33. Santambrogio, F., Shim, W.: A Cucker-Smale inspired deterministic mean field game with velocity interactions. SIAM J. Control Optim. 59(6), 4155–4187 (2021)
34. Sorin, S.: Continuous time learning algorithms in optimization and game theory. In: Dynamic Games and Applications, pp. 1–22 (2022)
35. Wang, W., Han, J., Yang, Z., Wang, Z.: Global convergence of policy gradient for linear-quadratic mean-field control/game in continuous time. In: International Conference on Machine Learning, pp. 10772–10782. PMLR (2021)

Author information

Authors and Affiliations

Institut Louis Bachelier, Paris, France
Pierre Lavigne
Inria and Laboratoire des Signaux et Systèmes, CNRS (UMR 8506), CentraleSupélec, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
Laurent Pfeiffer

Corresponding author

Correspondence to Laurent Pfeiffer.

Ethics declarations

Financial or Non-financial interests

This work was supported by a public grant as part of the Investissement d’avenir project, reference ANR-11-LABX-0056-LMH, LabEx LMH. The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Appendix: Regularity of the auxiliary mappings

This appendix contains the proofs of the technical lemmas of Subsection 5.1.

A.1 Parabolic estimates

In this section we provide estimates for the following parabolic equation:

$$\begin{aligned} \begin{array}{rlr} \partial _t u - \sigma \Delta u + \langle b, \nabla u \rangle + c u \, = &{} h, \quad &{} (x,t) \in Q, \\ u(x,0) \,= &{} u_0(x), &{} x \in \mathbb {T}^d, \end{array} \end{aligned}$$

(31)

for different assumptions on $b :Q \rightarrow \mathbb {R}^d$, $c :Q \rightarrow \mathbb {R}$, $h :Q \rightarrow \mathbb {R}$, and $u_0 :\mathbb {T}^d \rightarrow \mathbb {R}$. The proofs of the following results can be found in the Appendix of [5]; they largely rely on [28]. We recall that q is a fixed parameter and $q>d+2$.

In the next theorem, we consider the Sobolev space $W^{2-2/p,p}(\mathbb {T}^d)$ with a fractional order of derivation, see [28, section II.2] for a definition.

Theorem 29

For all $R>0$, there exists $C>0$ such that for all $u_0 \in W^{2-2/q,q}(\mathbb {T}^d)$, for all $b \in L^q(Q;\mathbb {R}^d)$, for all $c \in L^q(Q)$, and for all $h \in L^q(Q)$ satisfying

$$\begin{aligned} \Vert u_0 \Vert _{W^{2-2/q,q}(\mathbb {T}^d)} + \Vert b \Vert _{L^q(Q;\mathbb {R}^d)} + \Vert c \Vert _{L^q(Q)} + \Vert h \Vert _{L^q(Q)} \le R, \end{aligned}$$

equation (31) has a unique solution u in $W^{2,1,q}(Q)$. Moreover, $\Vert u \Vert _{W^{2,1,q}(Q)} \le C$.

Theorem 30

There exists $C>0$ such that for all $u_0 \in W^{2-2/q,q}(\mathbb {T}^d)$ and for all $h \in L^q(Q)$, the unique solution u to (31) (with $b = 0$ and $c=0$) satisfies the following estimate:

$$\begin{aligned} \Vert u\Vert _{W^{2,1,q}(Q)} \le C \big ( \Vert u_0\Vert _{W^{2-2/q,q}(\mathbb {T}^d)} + \Vert h\Vert _{L^q(Q)} \big ). \end{aligned}$$

Theorem 31

For all $\beta \in (0,1)$, for all $R>0$, there exist $\alpha \in (0,1)$ and $C>0$ such that for all $u_0 \in \mathcal {C}^{2+ \beta }(\mathbb {T}^d)$, $b \in \mathcal {C}^{\beta ,\beta /2}(Q;\mathbb {R}^d)$, $c \in \mathcal {C}^{\beta ,\beta /2}(Q)$ and $h \in \mathcal {C}^{\beta ,\beta /2}(Q)$ satisfying $\Vert u_0 \Vert _{\mathcal {C}^{2+ \beta }(\mathbb {T}^d)} + \Vert b \Vert _{\mathcal {C}^{\beta ,\beta /2}(Q;\mathbb {R}^d)} + \Vert c \Vert _{\mathcal {C}^{\beta ,\beta /2}(Q)} + \Vert h \Vert _{\mathcal {C}^{\beta ,\beta /2}(Q)} \le R$, the solution to (31) lies in $\mathcal {C}^{2+\alpha ,1+\alpha /2}(Q)$ and satisfies $\Vert u \Vert _{\mathcal {C}^{2+\alpha ,1+\alpha /2}(Q)} \le C$.

A.2 Fokker-Planck equation

Proof of Lemma 13

Let us write the Fokker-Planck equation in the form of equation (31): $\partial _t m - \Delta m + (\nabla \cdot v) m + \langle v, \nabla m \rangle = 0$. The first part of the lemma follows from Theorem 29. The nonnegativity of $\varvec{M}[v]$ is proved in [5, Lemma 3]. $\square $
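
The identification with (31) can be made explicit. The following is a sketch under the assumption that the Fokker-Planck equation reads $\partial _t m - \Delta m + \nabla \cdot (v m) = 0$ with initial condition $m(\cdot ,0) = m_0$ (the symbol $m_0$ is our notation for the initial distribution, not taken from the text above). Expanding the divergence term gives

$$\begin{aligned} \nabla \cdot (v m) = (\nabla \cdot v) m + \langle v, \nabla m \rangle , \end{aligned}$$

so that the equation is of the form (31) with

$$\begin{aligned} \sigma = 1, \quad b = v, \quad c = \nabla \cdot v, \quad h = 0, \quad u_0 = m_0. \end{aligned}$$

In particular, Theorem 29 applies as soon as $m_0$, $v$, and $\nabla \cdot v$ are bounded in the norms appearing in its statement.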

Proof of Lemma 14

Set $w= v_2-v_1$ and $\mu = m_2-m_1$. Then $\mu $ is the solution to

$$\begin{aligned} \begin{array}{rlr} \partial _t \mu - \Delta \mu + \nabla \cdot ( v_1 \mu ) \, = &{} - \nabla \cdot (w m_2), \quad &{} (x,t) \in Q, \\ \mu (x,0) \, = &{} 0, &{} x \in \mathbb {T}^d. \end{array} \end{aligned}$$

Set $V=W^{1,2}(\mathbb {T}^d)$ and consider the Gelfand triple $(V,L^2(\mathbb {T}^d),V^*)$, where $V^*$ denotes the dual of V. Then $\mu $ is the solution of a parabolic equation of the form

$$\begin{aligned} \begin{array}{rlr} \partial _t m(t) + B(t) m(t) \, = &{} f(t), \quad &{} (x,t) \in Q, \\ m(x,0) \, = &{} 0, &{} x \in \mathbb {T}^d, \end{array} \end{aligned}$$

where $B(t) \in L(V, V^{*})$ and $f(t) \in V^{*}$. For any $m \in V$, we have

$$\begin{aligned} \langle B(t) m, m \rangle _{V}&= \int _{\mathbb {T}^d} \big ( - \Delta m + (\nabla \cdot v_1(t)) m + \langle v_1(t), \nabla m \rangle \big ) m \, \textrm{d}x \\&= \int _{\mathbb {T}^d} \big ( |\nabla m|^2 - \langle v_1(t), \nabla m \rangle m \big ) \textrm{d}x, \end{aligned}$$

where the second equality is obtained by integration by parts. Using the Cauchy-Schwarz inequality and $\Vert v_1 \Vert _{L^\infty (Q;\mathbb {R}^d)} \le R$, we obtain the following inequality:

$$\begin{aligned} \langle B(t) m, m \rangle _{V} \ge \Vert \nabla m \Vert _{L^2(\mathbb {T}^d;\mathbb {R}^d)}^2 - C \Vert \nabla m \Vert _{L^2(\mathbb {T}^d;\mathbb {R}^d)} \Vert m \Vert _{L^2(\mathbb {T}^d)} \end{aligned}$$

where the constant C is independent of t (but depends on R). A direct application of Young’s inequality yields the existence of C (depending on R) such that

$$\begin{aligned} \langle B(t) m, m \rangle _{V} \ge \frac{1}{2} \Vert m \Vert ^2_{V} - C\Vert m \Vert ^2_{L^2(\mathbb {T}^d)}. \end{aligned}$$
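
For completeness, the Young step can be spelled out as follows. This is a sketch in which $C_0$ denotes the constant of the inequality obtained from Cauchy-Schwarz, and V is assumed to carry the norm $\Vert m \Vert _V^2 = \Vert m \Vert _{L^2(\mathbb {T}^d)}^2 + \Vert \nabla m \Vert _{L^2(\mathbb {T}^d;\mathbb {R}^d)}^2$ (both the symbol $C_0$ and this choice of norm are our assumptions). Since $C_0 ab \le \frac{1}{2} a^2 + \frac{C_0^2}{2} b^2$ for all real numbers a and b, we have

$$\begin{aligned} \langle B(t) m, m \rangle _{V}&\ge \Vert \nabla m \Vert _{L^2(\mathbb {T}^d;\mathbb {R}^d)}^2 - C_0 \Vert \nabla m \Vert _{L^2(\mathbb {T}^d;\mathbb {R}^d)} \Vert m \Vert _{L^2(\mathbb {T}^d)} \\&\ge \frac{1}{2} \Vert \nabla m \Vert _{L^2(\mathbb {T}^d;\mathbb {R}^d)}^2 - \frac{C_0^2}{2} \Vert m \Vert _{L^2(\mathbb {T}^d)}^2 \\&= \frac{1}{2} \Vert m \Vert _{V}^2 - \frac{1 + C_0^2}{2} \Vert m \Vert _{L^2(\mathbb {T}^d)}^2, \end{aligned}$$

so that one admissible choice of the constant is $C = (1+C_0^2)/2$.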

Thus B(t) is semi-coercive, uniformly in time. With similar techniques, one can show that $\langle B(t)m,m' \rangle _V \le C \Vert m \Vert _V \Vert m' \Vert _V$, for a.e. $t \in (0,T)$ and for all m and $m'$ in V. We can apply [30, Chapter 3, Theorems 1.1 and 1.2], from which we derive

$$\begin{aligned} \Vert \mu \Vert _{L^\infty (0,T;L^2(\mathbb {T}^d))}&\le C \big ( \Vert \mu \Vert _{L^{2}(0,T;V)} + \Vert \partial _t \mu \Vert _{L^{2}(0,T;V^{*})} \big ) \\&\le C \Vert f \Vert _{L^2(0,T;V^*)} \le C \Vert \nabla \cdot (w m_2) \Vert _{L^{2}(0,T;V^{*})} \\&\le C \Vert w m_2 \Vert _{L^{2}(Q;\mathbb {R}^d)}. \end{aligned}$$

Finally, since $m_2$ is nonnegative and $\Vert m_2 \Vert _{L^\infty (Q)} \le R$, we have $\Vert w m_2 \Vert _{L^2(Q;\mathbb {R}^d)}^2 \le C \int _Q |w|^2 m_2 \, \textrm{d}x \, \textrm{d}t$. Combining the last two inequalities, we obtain the announced result. $\square $

A.3 HJB equation

Lemma 32

The Hamiltonian H is differentiable with respect to p and $H_p$ is differentiable with respect to x and p. Moreover, H, $H_p$, $H_{px}$, and $H_{pp}$ are locally Hölder-continuous.

Proof

See [5, Lemma 1]. $\square $

The analysis of the HJB equation relies on its connection with the value function of the optimal control problem introduced in (11). This connection first allows us to establish a uniform bound for $\varvec{u}[\gamma ,P]$.

Lemma 33

Let $R> 0$ and let $(\gamma ,P) \in \Xi _R$. There exists a constant $C(R) > 0$ such that $\Vert \varvec{u}[\gamma ,P] \Vert _{L^\infty (Q)} \le C(R)$ and such that $\varvec{u}[\gamma ,P]$ is C(R)-Lipschitz continuous with respect to x. Moreover, for any $(x,t) \in Q$,

$$\begin{aligned} \varvec{u}[\gamma ,P](x,t) = \inf _{\nu \in \mathbb {L}_{\mathbb {F}}^{2,C(R)}(t,T)} J[\gamma ,P](x,t,\nu ). \end{aligned}$$

(32)

In the above relation, $\mathbb {L}_{\mathbb {F}}^{2,C(R)}(t,T)$ denotes the set of stochastic processes $\nu \in \mathbb {L}_{\mathbb {F}}^{2}(t,T)$ such that $\mathbb {E} \big [ \int _t^T |\nu _s|^2 \textrm{d}s \big ] \le C(R)$.

Proof

We first derive a lower bound of L. By assumption (H4), L(x, t, 0) and $L_v(x,t,0)$ are bounded. It follows then from the strong convexity assumption (Assumption (H1)) that there exists a constant $C> 0$ such that

$$\begin{aligned} \frac{1}{C} |\nu |^2 - C \le L(x,t,\nu ), \quad \text { for all }(x,t,\nu ) \in Q \times \mathbb {R}^d. \end{aligned}$$

(33)
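
The derivation of (33) can be sketched as follows, under the assumption that (H1) means strong convexity of $L(x,t,\cdot )$ with some modulus $\alpha > 0$, and writing $C_0$ for a common bound on $|L(x,t,0)|$ and $|L_v(x,t,0)|$ (the symbols $\alpha $ and $C_0$ are our notation). Strong convexity and Young's inequality $C_0 |\nu | \le \frac{\alpha }{4} |\nu |^2 + \frac{C_0^2}{\alpha }$ give

$$\begin{aligned} L(x,t,\nu )&\ge L(x,t,0) + \langle L_v(x,t,0), \nu \rangle + \frac{\alpha }{2} |\nu |^2 \\&\ge - C_0 - C_0 |\nu | + \frac{\alpha }{2} |\nu |^2 \\&\ge \frac{\alpha }{4} |\nu |^2 - C_0 - \frac{C_0^2}{\alpha }. \end{aligned}$$

Inequality (33) then holds for any $C \ge \max \big ( 4/\alpha , \, C_0 + C_0^2/\alpha \big )$.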

Then, for any $(x,s) \in Q$ and for any $\nu \in \mathbb {R}^d$, we have the following estimates:

$$\begin{aligned} L(x,s,\nu ) + \langle A^\star [P](x,s) , \nu \rangle&\ge \frac{1}{C} |\nu |^2 - \Vert a\Vert _{L^\infty (Q;\mathbb {R}^{k \times d})} |P(s)| |\nu | - C \\&\ge \frac{1}{C} (|\nu |^2 - |P(s)|^2 - 1) \ge \frac{1}{C} (|\nu |^2 - 1). \end{aligned}$$

Now we show that $\varvec{u}[\gamma ,P]$ is bounded in $L^{\infty }(Q)$. For any $(x,t) \in Q$, using the above bound for the running cost L, the bound $\Vert \gamma \Vert _{L^{\infty }(Q)} \le R$, together with Assumption (H4) on the terminal cost g, we obtain that $\varvec{u}[\gamma ,P](x,t) \ge - C(R)$. In addition, using Assumption (H3) and the fact that $\Vert \gamma \Vert _{L^{\infty }(Q)} \le R$, we deduce that

$$\begin{aligned} \varvec{u}[\gamma ,P](x,t) \le J[\gamma ,P](x,t,0) \le C(R), \end{aligned}$$

from which we conclude that $\Vert \varvec{u}[\gamma ,P] \Vert _{L^\infty (Q)} \le C(R)$.

Finally we show equation (32). Let $t\in [0,T]$, let $\varepsilon \in (0,1)$ and let ${\tilde{\nu }} \in \mathbb {L}_{\mathbb {F}}^2(t,T)$ be an $\varepsilon $-optimal process. Since g is bounded (Assumption (H4)) and since $(\gamma ,P) \in \Xi _R$, we deduce from the above inequality that

$$\begin{aligned} \mathbb {E} \Big [ \int _t^T |{\tilde{\nu }}_s|^2 \textrm{d}s \Big ]&\le C \Big ( \, \inf _{\nu \in \mathbb {L}_{\mathbb {F}}^2(t,T)} J[\gamma ,P](x,t,\nu ) + \varepsilon + 1 \Big ) \\&\le C \big (\varvec{u}[\gamma ,P](x,t) + 2 \big ) \le C, \end{aligned}$$

where the constant C does not depend on t and $\varepsilon $. Thus any $\varepsilon $-optimal process lies in $\mathbb {L}_{\mathbb {F}}^{2,C}(t,T)$, which concludes the proof. $\square $

Proof of Lemma 15

Let $(\gamma _1,P_1)$ and $(\gamma _2,P_2)$ be in $\Xi _R$. Let $u_1= \varvec{u}[\gamma _1,P_1]$ and $u_2= \varvec{u}[\gamma _2,P_2]$. By Lemma 33, there exists $C>0$ such that

$$\begin{aligned} u_2(x,t)-u_1(x,t) = \inf _{\nu \in \mathbb {L}_{\mathbb {F}}^{2,C}(t,T)} J[\gamma _2,P_2](x,t,\nu ) - \inf _{\nu ' \in \mathbb {L}_{\mathbb {F}}^{2,C}(t,T)} J[\gamma _1,P_1](x,t,\nu '), \end{aligned}$$

for any $(x,t) \in Q$. We denote $(X^\nu _{s})_{s \in [t,T]}$ the solution to the stochastic differential equation $\textrm{d}X_s = \nu _s \textrm{d}s + \sqrt{2} \textrm{d}B_s$ with $X^\nu _t = x$, for any $\nu \in \mathbb {L}_{\mathbb {F}}^{2}(t,T)$. Then

$$\begin{aligned}&| u_2(x,t)- u_1(x,t)| \le \sup _{\nu \in \mathbb {L}_{\mathbb {F}}^{2,C}(t,T)} \big | J[\gamma _2,P_2](x,t,\nu ) - J[\gamma _1,P_1](x,t,\nu ) \big | \\&\quad \le \sup _{\nu \in \mathbb {L}_{\mathbb {F}}^{2,C}(t,T)} \mathbb {E} \Big [ \int _t^T |\langle A^\star [P_2 - P_1](X^\nu _s,s) , \nu _s \rangle | + |(\gamma _2- \gamma _1)(X^\nu _s,s)| \, \textrm{d}s \Big ]. \end{aligned}$$

For any $(x,s) \in Q$ and $\nu \in \mathbb {R}^d$, the Cauchy-Schwarz inequality yields

$$\begin{aligned} |\langle A^\star [P_2 - P_1](x,s) , \nu \rangle |&\le | a(x,s) ( P_2(s)- P_1(s) ) | \, |\nu | \\ {}&\le \Vert a\Vert _{L^\infty (Q;\mathbb {R}^{k \times d})} | P_2(s) - P_1(s)| \, |\nu |. \end{aligned}$$

Using again the Cauchy-Schwarz inequality and $\Vert a\Vert _{L^\infty (Q;\mathbb {R}^{k \times d})} \le C$, we conclude that

$$\begin{aligned} | u_2(x,t)- u_1(x,t)| \le C \Big (\Vert P_2- P_1 \Vert _{L^2(0,T;\mathbb {R}^k)} + \Vert \gamma _2- \gamma _1 \Vert _{L^\infty (Q)} \Big ), \end{aligned}$$

as was to be proved. $\square $
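
The last step of the proof above can be detailed as follows. This is a sketch in which $C_1$ denotes the constant C of Lemma 33 defining the set $\mathbb {L}_{\mathbb {F}}^{2,C}(t,T)$ (the symbol $C_1$ is our notation). Since $P_2 - P_1$ is deterministic, the Cauchy-Schwarz inequality on $\Omega \times (t,T)$ gives, for any $\nu \in \mathbb {L}_{\mathbb {F}}^{2,C_1}(t,T)$,

$$\begin{aligned} \mathbb {E} \Big [ \int _t^T | P_2(s) - P_1(s)| \, |\nu _s| \, \textrm{d}s \Big ]&\le \Big ( \int _t^T | P_2(s) - P_1(s)|^2 \, \textrm{d}s \Big )^{1/2} \Big ( \mathbb {E} \Big [ \int _t^T |\nu _s|^2 \, \textrm{d}s \Big ] \Big )^{1/2} \\&\le \sqrt{C_1} \, \Vert P_2 - P_1 \Vert _{L^2(0,T;\mathbb {R}^k)}, \end{aligned}$$

while the term involving $\gamma $ is bounded directly:

$$\begin{aligned} \mathbb {E} \Big [ \int _t^T |(\gamma _2- \gamma _1)(X^\nu _s,s)| \, \textrm{d}s \Big ] \le T \, \Vert \gamma _2- \gamma _1 \Vert _{L^\infty (Q)}. \end{aligned}$$

Both bounds are uniform in $\nu $, so they pass to the supremum.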

We prove Proposition 16 with a density argument. In a nutshell: we prove in Proposition 34 below that the result of Proposition 16 holds true when $\gamma $ and P are Hölder continuous. Then we pass to the limit, using Lemma 15.

Proposition 34

Let $R>0$ and let $\beta \in (0,1)$. For any $(\gamma ,P) \in \Xi _R\cap \big ( \mathcal {C}^\beta (Q) \times \mathcal {C}^\beta (0,T;\mathbb {R}^k) \big )$, the viscosity solution to the Hamilton-Jacobi-Bellman equation (9) is a classical solution. Moreover, there exists $\alpha \in (0,1)$ such that $\varvec{u}[\gamma ,P]$ lies in $\mathcal {C}^{2+\alpha ,1+\alpha /2}(Q)$ and there exists a constant C(R), depending only on R, such that $\Vert \varvec{u}[\gamma ,P] \Vert _{W^{2,1,q}(Q)} \le C(R)$.

The proof of Proposition 34 is given below and relies on a fixed point approach, which requires some preparatory work. We introduce the map $\mathcal {T} :W^{2,1,q}(Q) \times [0,1] \rightarrow W^{2,1,q}(Q)$ which associates to any $u \in W^{2,1,q}(Q)$ and $\tau \in [0,1]$ the classical solution ${\tilde{u}} = \mathcal {T}[u,\tau ]$ to the linear parabolic equation

$$\begin{aligned} \begin{array}{rlr} - \partial _t {\tilde{u}} - \Delta {\tilde{u}} + \tau \varvec{H}[\nabla u + A^\star P] \, = &{} \tau \gamma &{} (x,t) \in Q, \\ {\tilde{u}}(x,T) \, = &{} \tau g(x) &{} x\in \mathbb {T}^d. \end{array} \end{aligned}$$

For any $(u,\tau ) \in W^{2,1,q}(Q) \times [0,1]$, we have $\tau ( \gamma - \varvec{H}[\nabla u + A^\star P]) \in L^\infty (Q)$, by Lemma 32 and Lemma 1. It follows then from Theorem 29 that $\mathcal {T}[u,\tau ]$ lies in $W^{2,1,q}(Q)$, proving that $\mathcal {T}$ is well-defined.

Lemma 35

The mapping $\mathcal {T}$ is continuous and compact. In addition, for all $K >0$, there exist $\alpha \in (0,1)$ and $C>0$ depending on K, $\gamma $, and P such that $\Vert u\Vert _{W^{2,1,q}(Q)} \le K$ implies $\Vert \mathcal {T}[u,\tau ]\Vert _{\mathcal {C}^{2+\alpha ,1+ \alpha /2}(Q)} \le C$.

Proof

Step 1: Continuity of $\mathcal {T}$. Let $(u_k,\tau _k) \in W^{2,1,q}(Q) \times [0,1]$ be a sequence converging to $(u,\tau ) \in W^{2,1,q}(Q) \times [0,1]$. Then $\nabla u_k \rightarrow \nabla u$ in $L^\infty (Q;\mathbb {R}^d)$ by Lemma 1. Then $\tau _k (\gamma - \varvec{H}[\nabla u_k + A^\star P]) \rightarrow \tau (\gamma - \varvec{H}[\nabla u + A^\star P])$ in $L^\infty (Q)$ by continuity of the Hamiltonian (see Lemma 32). Finally $\mathcal {T}$ is continuous, by Theorem 30.

Step 2: Compactness of $\mathcal {T}$. Let $K>0$ and let $(u,\tau ) \in W^{2,1,q}(Q) \times [0,1]$ be such that $\Vert u\Vert _{W^{2,1,q}(Q)} \le K$. Combining Lemma 1 and Lemma 32, there exist $\alpha \in (0,1)$ and $C>0$ such that $ \Vert \gamma - \varvec{H}[\nabla u + A^\star P]\Vert _{\mathcal {C}^\alpha (Q)} \le C$. Then, applying Theorem 31, there exist $\alpha \in (0,1)$ and $C>0$ such that $\Vert \mathcal {T}[u,\tau ]\Vert _{\mathcal {C}^{2+\alpha ,1+ \alpha /2}(Q)} \le C$. By the Arzelà-Ascoli theorem, the centered ball of $\mathcal {C}^{2+\alpha ,1+ \alpha /2}(Q)$ of radius $C>0$ is a relatively compact subset of $W^{2,1,q}(Q)$. As a consequence, $\mathcal {T}$ is a compact mapping and the conclusion follows. $\square $

Theorem 36

(Leray-Schauder) Let X be a Banach space and let $T : X \times [0, 1] \rightarrow X$ be a continuous and compact mapping. Assume that $T (x,0) = 0$ for all $x\in X$ and assume there exists $C>0$ such that $\Vert x\Vert _X < C$ for all $(x,\tau ) \in X\times [0,1]$ such that $T (x,\tau ) = x$. Then, there exists $x \in X$ such that $T(x,1) = x$.

Proof

See [19, Theorem 11.6]. $\square $

Proof of Proposition 34

We prove that under the assumptions of the proposition, the HJB equation has a classical solution in $\mathcal {C}^{2+\alpha ,1+\alpha /2}(Q)$ (for some $\alpha \in (0,1)$), which is then necessarily the unique viscosity solution $\varvec{u}[\gamma ,P]$. To this purpose, we prove the existence of a solution to the fixed point equation $u= \mathcal {T}[u,1]$. By Lemma 35, the mapping $\mathcal {T}$ is continuous and compact. We have $\mathcal {T}[u,0] = 0$ for all $u \in W^{2,1,q}(Q)$. Now let $(u,\tau ) \in W^{2,1,q}(Q) \times [0,1]$ be such that $\mathcal {T}[u,\tau ] = u$. Again by Lemma 35, u lies in $\mathcal {C}^{2+\alpha ,1+\alpha /2}(Q)$; it is therefore a classical solution, and thus the viscosity solution, to the Hamilton-Jacobi-Bellman equation

$$\begin{aligned} \begin{array}{rlr} - \partial _t u - \Delta u + \tau \varvec{H}[\nabla u + A^\star P] \, = &{} \tau \gamma &{} (x,t) \in Q, \\ u(x,T) \, = &{} \tau g(x) &{} x\in \mathbb {T}^d, \end{array} \end{aligned}$$

and can be interpreted as the value function associated to the following stochastic control problem

$$\begin{aligned} \inf _{\nu \in \mathbb {L}_{\mathbb {F}}^2(0,T)} \tau \mathbb {E} \Big [\int _0^T L(X^\tau _s,s,\nu _s) + \langle A^\star [P](X^\tau _s,s) , \nu _s \rangle + \gamma (X^\tau _s,s) \textrm{d}s + g(X^\tau _T) \Big ], \end{aligned}$$

where $(X^\tau _{s})_{s \in [0,T]}$ is the solution to $\textrm{d}X_s = \tau \nu _s \textrm{d}s + \sqrt{2} \textrm{d}B_s$, $X_0 = Y$. Following [5, Proposition 1, Step 2], there exists a constant $C>0$, depending only on R, such that $\Vert u\Vert _{L^\infty (Q)} + \Vert \nabla u\Vert _{L^\infty (Q;\mathbb {R}^d)} \le C$. Then using Lemma 32 and recalling that $(\gamma ,P) \in \Xi _R$, we deduce that $\Vert \varvec{H}[\nabla u + A^\star P] - \gamma \Vert _{L^\infty (Q)} \le C$. It follows that u is the solution to a parabolic PDE with bounded coefficients and thus $\Vert u \Vert _{W^{2,1,q}(Q)} \le C$, by Theorem 29. Again, C only depends on R. Finally, by the Leray-Schauder theorem (Theorem 36), there exists a solution to $u= \mathcal {T}[u,1]$, which is necessarily $\varvec{u}[\gamma ,P]$. $\square $

Proof of Proposition 16

Take $(\gamma ,P) \in \Xi _R$ and fix $\beta \in (0,1)$. Let $(\gamma _n,P_n)$ be a sequence in ${\Xi }_{R+1} \cap \big ( \mathcal {C}^{\beta }(Q) \times \mathcal {C}^{\beta }(0,T;\mathbb {R}^k) \big )$ such that $\Vert \gamma _n - \gamma \Vert _{L^\infty (Q)} \longrightarrow 0$ and such that $\Vert P_n - P \Vert _{L^2(0,T;\mathbb {R}^k)} \longrightarrow 0$. We do not detail the construction of such a sequence; this can be done by convolution. Define $u^n= \varvec{u}[\gamma _n,P_n]$ and $u=\varvec{u}[\gamma ,P]$. By Lemma 15, $u^n \rightarrow u$ in the $L^\infty $-norm. Moreover, by Proposition 34,

$$\begin{aligned} \Vert u^n \Vert _{W^{2,1,q}(Q)} \le C(R), \quad \forall n \in \mathbb {N}. \end{aligned}$$

(34)

Thus, the three sequences $(\partial _t u^n)_{n \in \mathbb {N}}$, $(\Delta u^n)_{n \in \mathbb {N}}$, and $(\nabla u^n)_{n \in \mathbb {N}}$ are bounded in $L^q(Q)$. By the Banach-Alaoglu theorem, the three sequences have at least one accumulation point for the weak topology of $L^q(Q)$. These three accumulation points are necessarily (by definition of weak derivatives) equal to $\partial _t u$, $\Delta u$, and $\nabla u$, respectively. Since the $L^q$-norm is weakly lower semi-continuous, we deduce that $\Vert u \Vert _{W^{2,1,q}(Q)} \le C(R)$, where C(R) is as in (34). This concludes the proof. $\square $
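
The identification of the weak accumulation points used in the proof above can be sketched for the time derivative; the other two derivatives are handled in the same way. Here $\varphi $ denotes an arbitrary smooth test function with compact support in $\mathbb {T}^d \times (0,T)$, z denotes a weak accumulation point of $(\partial _t u^n)_{n \in \mathbb {N}}$ in $L^q(Q)$, and $(n_j)_{j \in \mathbb {N}}$ is a corresponding subsequence (the symbols $\varphi $, z, and $n_j$ are our notation). Using the uniform convergence of $u^n$ to u,

$$\begin{aligned} \int _Q z \, \varphi \, \textrm{d}x \, \textrm{d}t = \lim _{j \rightarrow \infty } \int _Q \partial _t u^{n_j} \, \varphi \, \textrm{d}x \, \textrm{d}t = - \lim _{j \rightarrow \infty } \int _Q u^{n_j} \, \partial _t \varphi \, \textrm{d}x \, \textrm{d}t = - \int _Q u \, \partial _t \varphi \, \textrm{d}x \, \textrm{d}t, \end{aligned}$$

which means that $z = \partial _t u$ in the weak sense. The final bound then reads

$$\begin{aligned} \Vert u \Vert _{W^{2,1,q}(Q)} \le \liminf _{n \rightarrow \infty } \Vert u^n \Vert _{W^{2,1,q}(Q)} \le C(R). \end{aligned}$$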

A.4 The other mappings

Proof of Lemma 17

Let $(\gamma ,P) \in \Xi _R$. Let $u= \varvec{u}[\gamma ,P]$. We already know from Proposition 16 that $\Vert u \Vert _{W^{2,1,q}(Q)} \le C(R)$. Then Lemma 1 implies that u and $\nabla u$ are continuous and that $\Vert u \Vert _{L^\infty (Q)} \le C(R)$ and $\Vert \nabla u \Vert _{L^\infty (Q;\mathbb {R}^d)} \le C(R)$. Let $v=\varvec{v}[\gamma ,P]= - \varvec{H}_p[ \nabla u + A^\star P]$. We have

$$\begin{aligned} D_x v = - \varvec{H}_{px}[\nabla u + A^\star P] - \varvec{H}_{pp}[\nabla u + A^\star P](D^2_{xx} u + D_x A^\star P). \end{aligned}$$

Using the regularity of u, the regularity properties of the Hamiltonian given in Lemma 20, and the regularity assumptions on a (Assumption (H4)), we deduce that $\Vert v \Vert _{L^\infty (Q;\mathbb {R}^d)} \le C(R)$ and that $\Vert D_x v \Vert _{L^q(Q;\mathbb {R}^{d \times d})} \le C(R)$. Moreover, v is continuous.

Next, let $m= \varvec{m}[\gamma ,P]= \varvec{M}[v]$. A direct application of Lemma 13 yields that $\Vert m \Vert _{W^{2,1,q}(Q)} \le C$. Finally, let $w= \varvec{w}[\gamma ,P]= mv$. Using again Lemma 1, we obtain that m is continuous and that $\Vert m \Vert _{L^\infty (Q)} \le C(R)$ and $\Vert \nabla m \Vert _{L^\infty (Q;\mathbb {R}^d)} \le C(R)$. Then $w \in \Theta $, with a norm bounded by some constant C(R). The lemma is proved. $\square $

Proof of Lemma 18

The two statements concerning $\varvec{\gamma }$ are directly deduced from Assumptions (H3) and (H5). Let $w \in \Theta $. Recalling the definition of the operator A (page 7), it is easy to see with Assumption (H4) that $Aw \in \mathcal {C}(0,T;\mathbb {R}^k)$. Assumptions (H3) and (H5) then ensure that $\varvec{P}[w]= \varvec{\phi } \big [ A[w] \big ]$ lies in $\mathcal {C}(0,T;\mathbb {R}^k)$ and that $\Vert \varvec{P}[w] \Vert _{L^\infty (0,T;\mathbb {R}^k)} \le C$. Let us next consider $w_1$ and $w_2$ in $\Theta $. We have

$$\begin{aligned} \Vert Aw_2 - Aw_1 \Vert _{L^\infty (0,T;\mathbb {R}^k)} \le {}&\Vert a \Vert _{L^\infty (Q;\mathbb {R}^{k \times d})} \Vert w_2- w_1 \Vert _{L^\infty (0,T;L^1(\mathbb {T}^d;\mathbb {R}^d))} \\ \le {}&C \Vert w_2 - w_1 \Vert _{L^2(Q;\mathbb {R}^d)}, \end{aligned}$$

by Assumption (H4). Using next the Lipschitz-continuity of $\phi $ (Assumption (H5)), we obtain that $\Vert \varvec{\phi }[Aw_2] - \varvec{\phi }[Aw_1] \Vert _{L^2(0,T;\mathbb {R}^k)} \le C \Vert w_2-w_1 \Vert _{L^2(Q;\mathbb {R}^d)}$, as was to be proved. $\square $

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lavigne, P., Pfeiffer, L. Generalized Conditional Gradient and Learning in Potential Mean Field Games. Appl Math Optim 88, 89 (2023). https://doi.org/10.1007/s00245-023-10056-8

Accepted: 22 August 2023
Published: 24 October 2023
DOI: https://doi.org/10.1007/s00245-023-10056-8
