Abstract
We consider two-player zero-sum differential games of fixed duration, where the running payoff and the dynamics are both linear in the controls of the players. Such games have a value, which is determined by the unique viscosity solution of a Hamilton–Jacobi-type partial differential equation. Approximation schemes for computing the viscosity solution of Hamilton–Jacobi-type partial differential equations have been proposed that are valid in a more general setting, and such schemes can of course be applied to the problem at hand. However, such approximation schemes have a heavy computational burden. We introduce a discretized and probabilistic version of the differential game, which is straightforward to solve by backward induction, and prove that the solution of the discrete game converges to the viscosity solution of the partial differential equation, as the discretization becomes finer. The method removes part of the computational burden of existing approximation schemes.
1 Introduction
We consider two-player zero-sum deterministic differential games defined on the time-interval [0, T] (with \(T>0\)). The state of the game is given by a vector \(s\in \mathbb {R}^N\). The state is driven by the controls of the players, who can choose and adapt their controls continuously during play. Player 1 chooses controls from a convex and compact set U, while player 2 chooses controls from a convex and compact set V. Both players know the state of the game at any time, and they can choose their controls depending on both time and state. A control strategy for player 1 is therefore defined as a function \({\textbf{u}} \in {\mathcal {U}}\), where \({\mathcal {U}}\) is the set of Borel measurable functions with domain \([0,T]\times \mathbb {R}^N\) and codomain U. Similarly, a control strategy for player 2 is defined as a function \({\textbf{v}}\in {\mathcal {V}}\), where \({\mathcal {V}}\) is the set of Borel measurable functions with domain \([0,T]\times \mathbb {R}^N\) and codomain V.
Given initial time \(t\in [0,T]\) and initial state \(x\in \mathbb {R}^N\) of the game, the control strategies \({\textbf{u}}\) and \({\textbf{v}}\) of the players determine the state by the differential equation
$$\begin{aligned} {\dot{s}}(r) = f\big (r,s(r),{\textbf{u}}(r,s(r)),{\textbf{v}}(r,s(r))\big ) \quad \text{ for } r\in (t,T], \qquad s(t) = x. \end{aligned}$$
(1)
Here \(f: [0,T]\times \mathbb {R}^N\times U\times V \rightarrow \mathbb {R}^N\) satisfies the following Lipschitz conditions: (a) There exists \(K_f > 0\), such that \(|f(t,x,u,v) - f(t,y,u,v)| \le K_f |x-y|\) for all \(t\in [0,T]\), \(x,y\in \mathbb {R}^N\), \(u\in U\), and \(v\in V\); (b) There exists \(L_f > 0\), such that \(|f(t,x,u,v) - f(t^\prime ,x,u,v)| \le L_f |t-t^\prime |\) for all \(t,t^\prime \in [0,T]\), \(x\in \mathbb {R}^N\), \(u\in U\), and \(v\in V\). Moreover, f is linear in the control variables u and v.
The payoff of the players consists of a running payoff and an additional payoff at termination:
$$\begin{aligned} J(t,x,{\textbf{u}},{\textbf{v}}) = \int _t^T \ell \big (r,s(r),{\textbf{u}}(r,s(r)),{\textbf{v}}(r,s(r))\big )\,\textrm{d}r + g(s(T)). \end{aligned}$$
(2)
Here \(\ell : [0,T]\times \mathbb {R}^N\times U\times V \rightarrow \mathbb {R}\) satisfies the following Lipschitz conditions: (a) There exists \(K_\ell > 0\), such that \(|\ell (t,x,u,v) - \ell (t,y,u,v)| \le K_\ell |x-y|\) for all \(t\in [0,T]\), \(x,y\in \mathbb {R}^N\), \(u\in U\), and \(v\in V\). (b) There exists \(L_\ell > 0\), such that \(|\ell (t,x,u,v) - \ell (t^\prime ,x,u,v)| \le L_\ell |t-t^\prime |\) for all \(t,t^\prime \in [0,T]\), \(x\in \mathbb {R}^N\), \(u\in U\), and \(v\in V\). Moreover, \(\ell \) is linear in the control variables u and v. Also, the function \(g: \mathbb {R}^N \rightarrow \mathbb {R}\) is Lipschitz continuous.
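To fix ideas, a minimal example of a game satisfying all of these assumptions (our illustration, not taken from the text) is

```latex
N = 1, \qquad U = V = [-1,1], \qquad
f(t,x,u,v) = -x + u - v, \qquad
\ell(t,x,u,v) = x\,u\,v, \qquad
g(x) = |x|,
```

with \(K_f = K_\ell = 1\) (and any \(L_f, L_\ell > 0\), since f and \(\ell \) do not depend on t); both f and \(\ell \) are linear in u for fixed v and linear in v for fixed u.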
Let us denote the class of games that fall under the given description by \({\mathcal {G}}\). A distinguishing feature of the games in \({\mathcal {G}}\) is the linearity in the control variables of both the dynamics and the running payoff, determined by the functions f and \(\ell \), respectively. All games in \({\mathcal {G}}\) have a value. In this paper, we propose an approximation scheme for finding the value of such games, which specifically exploits the linearity property. We compare the proposed scheme with approximation schemes that work in a more general setting, for games that may not have a value. These general approximation schemes converge to either the lower or upper value of a game, so either version of such a general scheme can be applied to find the value of a game in \({\mathcal {G}}\). In their implementation, the general approximation schemes solve a certain minimax or maximin problem repeatedly, for each point in a given grid. For the games in \({\mathcal {G}}\), we propose to replace the minimax or maximin problem by the computationally much easier problem of finding the value of a matrix game. Proof that the proposed scheme indeed converges to the value of a game in \({\mathcal {G}}\) is given in Appendix B.
We define
$$\begin{aligned} W^+(t,x) = \inf _{{\textbf{v}}\in {\mathcal {V}}}\, \sup _{{\textbf{u}}\in {\mathcal {U}}} J(t,x,{\textbf{u}},{\textbf{v}}) \end{aligned}$$
and
$$\begin{aligned} W^-(t,x) = \sup _{{\textbf{u}}\in {\mathcal {U}}}\, \inf _{{\textbf{v}}\in {\mathcal {V}}} J(t,x,{\textbf{u}},{\textbf{v}}). \end{aligned}$$
The quantities \(W^+(t,x)\) and \(W^-(t,x)\) are called the upper and lower value of the game (with initial time t and initial state x), respectively. If \(W^+(t,x) = W^-(t,x)\), then \(W(t,x) = W^+(t,x) = W^-(t,x)\) is called the value of the game. In the following, it will be convenient to treat the (arbitrarily chosen) initial time t and the initial state x as variables, and \(W, W^+\), and \(W^-\) as functions of t and x.
In order to find out whether the value of the game exists, one looks at its Hamiltonian \({\widetilde{H}}:[0,T]\times \mathbb {R}^N\times \mathbb {R}^N\times U\times V\rightarrow \mathbb {R}\), defined by
$$\begin{aligned} {\widetilde{H}}(t,x,\lambda ,u,v) = \ell (t,x,u,v) + \left<\lambda , f(t,x,u,v)\right>, \end{aligned}$$
(3)
where the notation \(\left<a,b\right>\) denotes the inner product of two vectors \(a,b\in \mathbb {R}^N\).
If we have
$$\begin{aligned} \min _{v\in V}\max _{u\in U} {\widetilde{H}}(t,x,\lambda ,u,v) = \max _{u\in U}\min _{v\in V} {\widetilde{H}}(t,x,\lambda ,u,v) \end{aligned}$$
(4)
for all \((t,x,\lambda ) \in [0,T]\times \mathbb {R}^N\times \mathbb {R}^N\), then the value of the differential game exists, for any given initial time \(t\in [0,T]\) and initial state \(x\in \mathbb {R}^N\). Condition (4) is known as the Isaacs condition (see e.g., Bardi and Capuzzo-Dolcetta [1]).
In our context, the Hamiltonian \({\widetilde{H}}\) is linear in the variables u and v, since we assumed that the functions \(\ell \) and f are linear in u and v. Moreover, we assumed that the sets U and V are convex and compact. These facts imply that Von Neumann’s minimax theorem (Von Neumann [8]) applies to equation (4) and that equality indeed holds. Thus, the differential game defined by (1) and (2) has a value for any given initial time \(t\in [0,T]\) and initial state \(x\in \mathbb {R}^N\). We will be concerned with the numerical approximation of the value.
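As a toy illustration of the role of this linearity (our example, not from the paper): for the bilinear kernel \(h(u,v) = uv\) on \(U = V = [-1,1]\),

```latex
\min_{v\in[-1,1]}\;\max_{u\in[-1,1]} uv
  \;=\; \min_{v\in[-1,1]} |v| \;=\; 0
  \;=\; \max_{u\in[-1,1]} \bigl(-|u|\bigr)
  \;=\; \max_{u\in[-1,1]}\;\min_{v\in[-1,1]} uv,
```

so min–max and max–min coincide, as von Neumann's theorem guarantees. Without such concavity–convexity the equality can fail: for \(h(u,v) = (u-v)^2\) on the same sets, \(\max _u \min _v h = 0\) while \(\min _v \max _u h = 1\).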
Let us define \(H: [0,T]\times \mathbb {R}^N\times \mathbb {R}^N \rightarrow \mathbb {R}\) as
$$\begin{aligned} H(t,x,\lambda ) = \min _{v\in V}\max _{u\in U} {\widetilde{H}}(t,x,\lambda ,u,v). \end{aligned}$$
(5)
(We may take the minimum and maximum instead of the infimum and supremum, because U and V are compact.) Now, the value W(t, x) of the differential game defined by (1) and (2) can be found as the solution of the following Hamilton–Jacobi-type partial differential equation (PDE):
$$\begin{aligned} \partial _t W + H(t,x,DW) = 0 \quad \text{ on } [0,T)\times \mathbb {R}^N, \qquad W(T,x) = g(x) \quad \text{ for } x\in \mathbb {R}^N. \end{aligned}$$
(6)
Here \(\partial _t W\) denotes the partial derivative of W with respect to the time variable and DW is the vector of partial derivatives with respect to the N state variables.
The PDE given by (6) often does not have a solution in the usual sense, where the solution is smooth everywhere. In such a situation, the notion of a viscosity solution, developed during the 1980s, is needed. Crandall et al. [3] introduced this notion for solutions of nonlinear first-order partial differential equations of the following Hamilton–Jacobi type:
$$\begin{aligned} \partial _t W + H(t,x,W,DW) = 0, \qquad W(0,\cdot ) \text{ prescribed}, \end{aligned}$$
(7)
where \(H: [0, T]\times \mathbb {R}^N\times \mathbb {R}\times \mathbb {R}^N \rightarrow \mathbb {R}\) is a continuous function. Crandall, Evans and Lions [3] proved uniqueness and stability results for equations of type (7). Existence was established by Crandall and Lions [4]. Finally, the convergence of general approximation schemes to the viscosity solution of (7) was proved by Souganidis [6].
2 Approximation Schemes
In this section, we discuss approximation schemes that converge to the viscosity solution of PDEs of the type given by (6). Note that (6) is a simplified version of (7), except for the fact that the boundary condition is at time \(t=T\) instead of \(t=0\) (which can be ‘repaired’ by a substitution of the time variable). Thus, the results in Souganidis [6] apply to the situation here. In a subsequent paper, Souganidis [7] applied the results in [6] to give a proof that (under certain conditions) zero-sum differential games have an upper and a lower value and that the value exists if the Isaacs condition is met. We will use concepts and results from Souganidis [7] to derive our result.
Consider a mapping \(F:[0,T]\times [0,T]\times C(\mathbb {R}^N) \rightarrow C(\mathbb {R}^N)\), where \(C(\mathbb {R}^N)\) denotes the set of all continuous functions on the domain \(\mathbb {R}^N\). To guide the intuition: The mapping F takes the time t, a time-step \(\rho \), and an approximation \(\phi \) of the viscosity solution of (6) at time \(t+\rho \) as its arguments, and gives \(F(t,\rho ,\phi )\) as approximation of the viscosity solution at time t. The mapping F is applied as follows:
For a partition \(P = \{0=t_0< t_1< \ldots < t_K = T\}\) of [0, T], define \(\phi ^P: [0,T]\times \mathbb {R}^N \rightarrow \mathbb {R}\) by
$$\begin{aligned} \phi ^P(T,\cdot ) = g, \qquad \phi ^P(t,\cdot ) = F\big (t,\, t_{k+1}-t,\, \phi ^P(t_{k+1},\cdot )\big ) \quad \text{ for } t\in [t_k,t_{k+1}), \; k = K-1,\ldots ,0. \end{aligned}$$
(8)
We call the mapping F an approximation scheme for (6) if \(\phi ^P\) converges to the viscosity solution of (6) as \(|P| = \max \nolimits _{1\le k \le K} (t_k - t_{k-1}) \rightarrow 0\). In Appendix A, we provide conditions under which mapping F is indeed an approximation scheme for (6).
Souganidis [7] applied the theory of general approximation schemes to a class of differential games that is larger than the class defined by (1) and (2) and that contains games without a value (see [7], Section 3). Souganidis provided two different approximation schemes, one converging to the upper value and one converging to the lower value of the game. Applied to the games we consider here, both schemes therefore converge to the value. They can be formulated as follows:
$$\begin{aligned} F^+(t,\rho ,\phi )(x) = \min _{v\in V}\max _{u\in U} \big [\rho \,\ell (t,x,u,v) + \phi \big (x + \rho f(t,x,u,v)\big )\big ] \end{aligned}$$
(9)
and
$$\begin{aligned} F^-(t,\rho ,\phi )(x) = \max _{u\in U}\min _{v\in V} \big [\rho \,\ell (t,x,u,v) + \phi \big (x + \rho f(t,x,u,v)\big )\big ]. \end{aligned}$$
(10)
In order to implement an approximation scheme, one must not only discretize time (with a partition P), but also restrict the calculations to a bounded and discretized subset of \(\mathbb {R}^N\) (the grid). Since the number of grid points grows rapidly with the dimension N, all approximation schemes suffer from a rapidly increasing computational burden as N increases. For more information on discretization schemes in specific game-theoretic examples, we refer the interested reader to Appendix A of [1], to Cardaliaguet et al. [2], and to Falcone and Stefani [5].
The schemes defined by (9) and (10) have an additional computational obstacle: For every point in the grid one must solve a subproblem of either the type \(\min \nolimits _{v\in V} \max \nolimits _{u\in U} \zeta (u,v)\) (scheme (9)) or of the type \(\max \nolimits _{u\in U} \min \nolimits _{v\in V} \zeta (u,v)\) (scheme (10)). The function \(\zeta \) lacks any special structure that might make this an easy task. Therefore, a fine discretization of the sets U and V seems necessary to obtain good approximations for each of these subproblems. To address this second computational issue, we will propose an alternative approximation scheme that exploits the linearity of the functions \(\ell \) and f in the control variables u and v.
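To illustrate the burden: absent structure in \(\zeta \), one can only discretize U and V and search exhaustively, at a cost of \(m\cdot n\) evaluations of \(\zeta \) for every grid point and every time step. A minimal sketch (the function `minimax_bruteforce` and the kernels below are our illustration, not from the paper):

```python
import numpy as np

def minimax_bruteforce(zeta, U_grid, V_grid):
    """Approximate min over v in V of max over u in U of zeta(u, v)
    by exhaustive search over finite discretizations of U and V."""
    payoff = np.array([[zeta(u, v) for v in V_grid] for u in U_grid])
    return payoff.max(axis=0).min()  # inner max over u, outer min over v

# illustrative subproblem on U = V = [-1, 1]
U_grid = np.linspace(-1.0, 1.0, 201)
V_grid = np.linspace(-1.0, 1.0, 201)
value = minimax_bruteforce(lambda u, v: u * v, U_grid, V_grid)
```

Even this small example takes 201 × 201 = 40 401 evaluations of \(\zeta \) for a single grid point at a single time step.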
For this purpose, we choose a finite set of controls on the boundary of U and V, say \(\{u_1, \ldots , u_m\} \subset \partial U\) and \(\{v_1, \ldots , v_n\} \subset \partial V\), such that the convex hull of \(\{u_1,\ldots ,u_m\}\) and the convex hull of \(\{v_1,\ldots ,v_n\}\) are good approximations of U and V, respectively. Let us denote the convex hull of a set X in a vector space by \(\text{ conv }(X)\). We will assume here that U and V are polytopes, to ensure that \(U = \text{ conv }(\{u_1, \ldots , u_m\})\) and \(V = \text{ conv }(\{v_1, \ldots , v_n\})\).
We define
and
Additionally, let us define, for \(t,\rho \in [0,T]\), \(x\in \mathbb {R}^N\) and \(\phi \in C(\mathbb {R}^N)\), \(\varPsi (t,\rho ,x,\phi )\) as the \(m\times n\)-matrix for which entry (i, j) equals
$$\begin{aligned} \varPsi _{ij}(t,\rho ,x,\phi ) = \rho \,\ell (t,x,u_i,v_j) + \phi \big (x + \rho f(t,x,u_i,v_j)\big ). \end{aligned}$$
(11)
For any matrix A, let us denote the value of the matrix game associated with A by \(\nu (A)\). We now define the scheme G by
$$\begin{aligned} G(t,\rho ,\phi )(x) = \nu \big (\varPsi (t,\rho ,x,\phi )\big ). \end{aligned}$$
(12)
The clear computational advantage of (12) is that the nonlinear ‘\(\min \max \)’ and ‘\(\max \min \)’ optimization problems in schemes (9) and (10), respectively, are replaced by the standard problem of finding the value of a matrix game. This can be done efficiently with linear programming techniques.
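To make the linear-programming step concrete, \(\nu (A)\) can be computed from the classical LP formulation: maximize w subject to \(\sum _i p_i A_{ij} \ge w\) for every column j, with p a probability vector over the rows. A minimal sketch in Python (the helper name `matrix_game_value` and the use of SciPy's `linprog` with the HiGHS backend are our choices, not from the paper):

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value nu(A) of the zero-sum matrix game A (row player maximizes),
    via the standard LP: maximize w s.t. sum_i p_i A[i, j] >= w for all j."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    # decision variables: p_1, ..., p_m (row player's mixed strategy) and w
    c = np.zeros(m + 1)
    c[-1] = -1.0                               # maximize w == minimize -w
    A_ub = np.hstack([-A.T, np.ones((n, 1))])  # w - sum_i p_i A[i, j] <= 0
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                          # probabilities sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0.0, 1.0)] * m + [(None, None)], method="highs")
    return res.x[-1]
```

For instance, matching pennies has value 0 and the one-entry game [[3]] has value 3; one such LP per grid point replaces an entire min–max search over U × V.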
In what follows we will explain how the application of G is equivalent to computing the value of a certain discrete and probabilistic game, related to the differential game defined by (1) and (2).
For a partition \(P = \{0=t_0< t_1< \ldots < t_K = T\}\) of [0, T] we define a 2-player zero-sum game that proceeds in stages numbered \(0,1,\ldots , K\), at times \(0=t_0, t_1, \ldots , t_K=T\), as follows: At each stage \(k< K\), player 1 must choose an element of \(\{u_1, \ldots , u_m\}\) and player 2 must choose an element of \(\{v_1, \ldots , v_n\}\). If player 1 chooses \(u_i\) and player 2 chooses \(v_j\) (at stage k in state \(s_k\)), then the stage payoff is given by
$$\begin{aligned} (t_{k+1} - t_k)\,\ell (t_k,s_k,u_i,v_j), \end{aligned}$$
and the next state is given by
$$\begin{aligned} s_{k+1} = s_k + (t_{k+1} - t_k)\, f(t_k,s_k,u_i,v_j). \end{aligned}$$
When stage K is reached, there is a terminal payoff \(g(s_K)\). Moreover, the game starts in state \(s_0 = x\).
The game described above consists of playing a sequence of classical matrix games. Such a game has a value, which can be determined by backwards induction as follows: For \(x\in \mathbb {R}^N\) and \(k\in \{0,\ldots ,K\}\), let us define the number \({\widetilde{W}}(k,x)\) as the value of the subgame starting at stage k in state \(x\in \mathbb {R}^N\). We then trivially have, for all \(x\in \mathbb {R}^N\),
$$\begin{aligned} {\widetilde{W}}(K,x) = g(x). \end{aligned}$$
To determine \({\widetilde{W}}(k,x)\) for \(x\in \mathbb {R}^N\) and \(k < K\), we first determine the expected payoff if player 1 chooses control \(u_i\) and player 2 chooses control \(v_j\). The game then advances to stage \(k+1\) and position \(x + (t_{k+1} - t_k) f(t_k,x,u_i,v_j)\), where the players can expect a payoff equal to \(\widetilde{W}\left( k+1, x + (t_{k+1} - t_k) f(t_k,x,u_i,v_j)\right) \) (assuming they play optimally from stage \(k+1\) to K). Thus, the total expected payoff at stage k and state x, associated with the control pair \((u_i,v_j)\), equals
$$\begin{aligned} (t_{k+1} - t_k)\,\ell (t_k,x,u_i,v_j) + {\widetilde{W}}\big (k+1,\, x + (t_{k+1} - t_k) f(t_k,x,u_i,v_j)\big ), \end{aligned}$$
which is the sum of the stage payoff \((t_{k+1} - t_k)\,\ell (t_k,x,u_i,v_j)\) and the subsequent payoff for the remaining stages. This is precisely the number \(\varPsi _{ij}(t,\rho ,x,\phi )\) defined by (11), where \(t = t_k\), \(\rho = t_{k+1} - t_k\), and \(\phi = {\widetilde{W}}(k+1,\cdot )\). Thus, in order to play optimally at stage k and state x, the players should play optimal mixed strategies for the matrix game associated with the matrix \(\varPsi \big (t_k,t_{k+1}-t_k,x,\widetilde{W}(k+1,\cdot )\big )\). It follows that
$$\begin{aligned} {\widetilde{W}}(k,x) = \nu \Big (\varPsi \big (t_k,\, t_{k+1}-t_k,\, x,\, {\widetilde{W}}(k+1,\cdot )\big )\Big ). \end{aligned}$$
We see that application of the mapping G at moments \(t_0, t_1, \ldots , t_K\), as indicated by (8), yields exactly the value of the discrete and probabilistic game we described in this section. The main result of this paper states that G is indeed an approximation scheme for the PDE defined by (6).
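The backward induction just described is easy to sketch in code. Below is a minimal one-dimensional illustration, entirely our construction (a state grid with linear interpolation of \({\widetilde{W}}(k+1,\cdot )\) between grid points, and an LP via SciPy's `linprog` for \(\nu \)). The toy game takes \(f = u - v\), \(\ell = 0\), \(g(x) = x\) with vertex sets \(\{-1,1\}\) for both players, whose value is \(W(t,x) = x\) by symmetry:

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    # value of the zero-sum matrix game A (row player maximizes), via LP
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    c = np.zeros(m + 1); c[-1] = -1.0          # maximize w == minimize -w
    A_ub = np.hstack([-A.T, np.ones((n, 1))])  # w - sum_i p_i A[i, j] <= 0
    A_eq = np.ones((1, m + 1)); A_eq[0, -1] = 0.0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0.0, 1.0)] * m + [(None, None)], method="highs")
    return res.x[-1]

def solve_discrete_game(ell, f, g, U_pts, V_pts, T, K, x_grid):
    """Backward induction for the discrete game: returns the stage-0 values
    on x_grid, interpolating the next-stage values linearly between nodes."""
    rho = T / K                        # uniform partition t_k = k * rho
    W = g(x_grid)                      # stage K: terminal payoff
    for k in range(K - 1, -1, -1):
        t, W_next = k * rho, W.copy()
        phi = lambda y: np.interp(y, x_grid, W_next)
        W = np.array([matrix_game_value(
                [[rho * ell(t, x, u, v) + phi(x + rho * f(t, x, u, v))
                  for v in V_pts] for u in U_pts])       # the matrix Psi
              for x in x_grid])
    return W

# toy game: f = u - v, ell = 0, g(x) = x, vertices {-1, 1} for both players
x_grid = np.linspace(-2.0, 2.0, 81)
W0 = solve_discrete_game(lambda t, x, u, v: 0.0, lambda t, x, u, v: u - v,
                         lambda x: x, [-1.0, 1.0], [-1.0, 1.0],
                         T=0.5, K=5, x_grid=x_grid)
```

For this symmetric game the scheme reproduces \(W(0,x) = x\) on the grid; each stage solves one small LP per grid point, in line with (12).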
Theorem 2.1
The mapping G is an approximation scheme for the PDE defined by (6).
A proof of Theorem 2.1 is given in Appendix B. The necessary background from Souganidis [7] is given in Appendix A.
3 Conclusions
Finding the value of two-player zero-sum differential games with a fixed duration typically involves approximation schemes for calculating the viscosity solution of corresponding Hamilton–Jacobi partial differential equations. Such schemes are computationally very expensive, partly due to the rather complex subproblems that need to be solved at each iteration.
Here, we considered two-player zero-sum differential games with a fixed duration, whose payoffs and dynamics are both linear in the players’ controls. For this special class of games, we proposed an alternative approximation scheme that replaces the difficult subproblem by the problem of solving a matrix game. This gives the alternative scheme a clear computational advantage over more generic schemes. We proved that the alternative scheme indeed converges to the value of the associated differential game, as the discretization becomes finer.
We then introduced a discretized and probabilistic game, as an approximate version of the differential game, for which the value can be determined in a straightforward manner, by backward induction. We observed that the backward induction scheme for the discrete game does in fact coincide with the earlier proposed alternative approximation scheme for calculating the viscosity solution of the differential game. This gives the alternative approximation scheme a clear interpretation.
References
Bardi, M., Capuzzo-Dolcetta, I.: Optimal Control and Viscosity Solutions of Hamilton–Jacobi–Bellman Equations. Birkhäuser, Boston (1997)
Cardaliaguet, P., Quincampoix, M., Saint-Pierre, P.: Set-valued numerical analysis for optimal control and differential games. In: Bardi, M., Parthasarathy, T., Raghavan, T.E.S. (eds.) Annals of the International Society of Dynamic Games, vol. 4, pp. 177–247. Birkhäuser, Boston (1999)
Crandall, M.G., Evans, L.C., Lions, P.L.: Some properties of viscosity solutions of Hamilton–Jacobi equations. Trans. Am. Math. Soc. 282(2), 487–502 (1984)
Crandall, M.G., Lions, P.L.: Viscosity solutions of Hamilton–Jacobi equations. Trans. Am. Math. Soc. 277(1), 1–42 (1983)
Falcone, M., Stefani, P.: Advances in parallel algorithms for the Isaacs equation. In: Nowak, A.S., Szajowski, K. (eds.) Annals of the International Society of Dynamic Games, vol. 7, pp. 515–544. Birkhäuser, Boston (2005)
Souganidis, P.E.: Approximation schemes for viscosity solutions of Hamilton–Jacobi equations. J. Differ. Equ. 59(1), 1–43 (1985)
Souganidis, P.E.: Max–min representations and product formulas for the viscosity solutions of Hamilton–Jacobi equations with applications to differential games. Nonlinear Anal. Theory Methods Appl. 9(3), 217–257 (1985)
von Neumann, J.: On the theory of games of strategy. In: Tucker, A.W., Luce, R.D. (eds.) Contributions to the Theory of Games, pp. 13–42. Princeton University Press, Princeton (1959)
Acknowledgements
This research was supported by European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 955708 and the Dutch National Foundation Projects OCENW.KLEIN.277 and VI.Vidi.213.139.
Communicated by Bruce A. Conway.
Appendices
Appendix A
In this appendix, we will state a theorem about the convergence of approximation schemes to the viscosity solution of the PDE given by (6). The theorem is a simplified version of Theorem 1.3(a) in Souganidis [7], adapted to equation (6). We will state it without proof.
Notation:
-
\( C^{0,1}_b(\mathbb {R}^N)\): the set of bounded Lipschitz functions on \(\mathbb {R}^N\).
-
For \(\phi \in C^{0,1}_b(\mathbb {R}^N)\): \(\Vert \phi \Vert = \sup \nolimits _{x\in \mathbb {R}^N} |\phi (x)|\).
-
For \(\phi \in C^{0,1}_b(\mathbb {R}^N)\): \(\Vert D\phi \Vert \) is the Lipschitz constant of \(\phi \).
-
\(C^2(\mathbb {R}^N)\): the space of twice continuously differentiable functions on \(\mathbb {R}^N\) with bounded derivatives.
-
For \(\phi \in C^2(\mathbb {R}^N)\): \(\Vert D^2 \phi \Vert = \sum _{i,j=1}^{N} \Vert \nicefrac {\partial ^2 \phi }{\partial x_i\partial x_j}\Vert \).
Properties concerning the function H in the formulation of (6):
-
(H1)
\(H: [0,T] \times \mathbb {R}^N\times \mathbb {R}^N \rightarrow \mathbb {R}\) is uniformly continuous on \([0,T] \times \mathbb {R}^N \times \mathbb {R}^N\).
-
(H2)
\(\sup \nolimits _{(t,x)\in [0,T]\times \mathbb {R}^N} H(t,x,0) < \infty \).
There are constants \(K, L, M > 0\) such that
-
(H3)
\(|H(t,x,\lambda ) - H(t,y,\lambda )| \le K (1+|\lambda |)|x-y|\), for \(t\in [0,T]\) and \(x,y,\lambda \in \mathbb {R}^N\).
-
(H4)
\(|H(t,x,\lambda ) - H(t^\prime ,x,\lambda )| \le L (1+|\lambda |)|t-t^\prime |\), for \(t,t^\prime \in [0,T]\) and \(x,\lambda \in \mathbb {R}^N\).
-
(H5)
\(|H(t,x,\lambda ) - H(t,x,\mu )| \le M |\lambda - \mu |\), for \(t\in [0,T]\), \(x\in \mathbb {R}^N\) and \(\lambda ,\mu \in \mathbb {R}^N\).
Conditions for the mapping \(F: [0,T]\times [0,T]\times C(\mathbb {R}^N) \rightarrow C(\mathbb {R}^N)\):
-
(F1)
\(F(t,0,\phi ) = \phi \).
-
(F2)
For every \(\phi \in C(\mathbb {R}^N)\), the mapping \((t, \rho ) \mapsto F(t, \rho , \phi )\) is continuous with respect to the norm \(\Vert \cdot \Vert \).
-
(F3)
\(F(t,\rho ,\phi +k) = F(t,\rho ,\phi ) + k\) for every \(k\in \mathbb {R}\). (For a real-valued function \(\alpha \) and a real number k, \(\alpha + k\) denotes the function defined by \((\alpha +k)(x) = \alpha (x) + k\) for all x in the domain of \(\alpha \).)
-
(F4)
For \(t,\rho \in [0,T]\) and \(\phi \in C^{0,1}_b(\mathbb {R}^N)\), \(\Vert F(t, \rho , \phi ) - \phi \Vert \le \rho \, C_1\), where \(C_1 \ge 0\) may depend on \(\Vert \phi \Vert \) and \(\Vert D\phi \Vert \).
-
(F5)
\(\Vert F(t, \rho , \phi ) - F(t, \rho , {\overline{\phi }})\Vert \le \Vert \phi -{\overline{\phi }}\Vert \) for \(\phi ,{\overline{\phi }}\in C^{0,1}_b(\mathbb {R}^N)\).
-
(F6)
There exists a constant \(C_2 > 0\) such that \(\Vert F(t,\rho ,\phi )\Vert \le \text{ e}^{\rho C_2} (\Vert \phi \Vert + \rho C_2)\).
-
(F7)
Let \(t,\rho \in [0,T]\) and \(\phi \in C^{0,1}_b(\mathbb {R}^N)\). There exist constants \(C_3 > 0\) and \(C_4 > 0\) such that
$$\begin{aligned} \Vert DF(t,\rho ,\phi )\Vert \le \textrm{e}^{\rho (C_3 + C_4)}(\Vert D\phi \Vert + \rho C_4). \end{aligned}$$
-
(F8)
For every \(\phi \in C^2(\mathbb {R}^N)\) there exists \(C_5> 0\) such that
$$\begin{aligned} \Big \Vert \frac{1}{\rho }\big (F(t,\rho ,\phi ) - \phi \big ) + H(t,\cdot ,D\phi )\Big \Vert \le C_5(1 + \Vert D\phi \Vert + \Vert D^2\phi \Vert )\rho , \end{aligned}$$
where \(C_5\) may depend on \(\Vert \phi \Vert \) and \(\Vert D\phi \Vert \).
We can now state the theorem:
Theorem A.1
[Souganidis] For \(H: [0, T] \times \mathbb {R}^N\times \mathbb {R}^N\rightarrow \mathbb {R}\) satisfying (H1), (H2), (H3), (H4), and (H5), and for \(g\in C^{0,1}_b(\mathbb {R}^N)\), let \(W: [0,T]\times \mathbb {R}^N \rightarrow \mathbb {R}\) be the viscosity solution of (6). Let \(F: [0,T]\times [0,T]\times C^{0,1}_b(\mathbb {R}^N) \rightarrow C^{0,1}_b(\mathbb {R}^N)\) be such that (F1), (F2), (F3), (F4), (F5), (F6), (F7) and (F8) hold. For a partition \(P = \{0=t_0< t_1< \ldots < t_K = T\}\) of [0, T], let \(W^P: [0,T]\times \mathbb {R}^N \rightarrow \mathbb {R}\) be defined by
$$\begin{aligned} W^P(T,\cdot ) = g, \qquad W^P(t,\cdot ) = F\big (t,\, t_{k+1}-t,\, W^P(t_{k+1},\cdot )\big ) \quad \text{ for } t\in [t_k,t_{k+1}), \; k = K-1,\ldots ,0. \end{aligned}$$
Then there exists a constant Q that may depend on \(\Vert g\Vert \) and \(\Vert Dg\Vert \), such that, for all \(t\in [0,T]\), \(\Vert W^P(t,\cdot ) - W(t,\cdot )\Vert \le Q |P|^{\nicefrac {1}{2}}\). Here, \(|P| = \max _{k\in \{1,\ldots ,K\}} (t_{k} - t_{k-1})\).
Appendix B
In this appendix, we assume throughout that \(U = \text{ conv }\{u_1,\ldots , u_m\}\) and \(V = \text{ conv }\{v_1,\ldots ,v_n\}\).
Recall that \(|f(t,x,u,v) - f(t,y,u,v)| \le K_f |x-y|\) and \(|\ell (t,x,u,v) - \ell (t,y,u,v)| \le K_\ell |x-y|\) for all \(t\in [0,T]\), \(x,y\in \mathbb {R}^N\), \(u\in U\), and \(v\in V\).
Proposition B.1
If the functions \(\ell \) and f are bounded, then the function H defined by
$$\begin{aligned} H(t,x,\lambda ) = \min _{v\in V}\max _{u\in U} \big [\ell (t,x,u,v) + \left<\lambda , f(t,x,u,v)\right>\big ] \end{aligned}$$
satisfies conditions (H1), (H2), (H3), (H4) and (H5).
Proof
Let \(B_\ell \) be an upper bound for \(|\ell (t,x,u,v)|\) and let \(B_f\) be an upper bound for |f(t, x, u, v)|.
Proof of (H2): We have
$$\begin{aligned} |H(t,x,0)| = \Big |\min _{v\in V}\max _{u\in U} \ell (t,x,u,v)\Big | \le B_\ell , \end{aligned}$$
since the function \(\ell \) is bounded.
Proof of (H3): Let \(t\in [0,T]\), \(x,y\in \mathbb {R}^N\), and \(\lambda \in \mathbb {R}^N\). Choose \(v^*\in V\) such that
$$\begin{aligned} H(t,y,\lambda ) = \max _{u\in U} {\widetilde{H}}(t,y,\lambda ,u,v^*). \end{aligned}$$
Then choose \(u^*\in U\) such that
$$\begin{aligned} \max _{u\in U} {\widetilde{H}}(t,x,\lambda ,u,v^*) = {\widetilde{H}}(t,x,\lambda ,u^*,v^*). \end{aligned}$$
Then we have
$$\begin{aligned} H(t,x,\lambda ) - H(t,y,\lambda )&\le {\widetilde{H}}(t,x,\lambda ,u^*,v^*) - {\widetilde{H}}(t,y,\lambda ,u^*,v^*) \\&\le K_\ell |x-y| + |\lambda |\, K_f |x-y| \le K(1+|\lambda |)|x-y|, \end{aligned}$$
where \(K = \max (K_\ell ,K_f)\). Similarly, one shows that \(H(t,y,\lambda ) - H(t,x,\lambda ) \le K (1 + |\lambda |) |x-y|\), which proves (H3).
Proof of (H4): Here we use that there exist \(L_f, L_\ell > 0\), such that \(|f(t,x,u,v) - f(t^\prime ,x,u,v)| \le L_f |t-t^\prime |\) and \(|\ell (t,x,u,v) - \ell (t^\prime ,x,u,v)| \le L_\ell |t-t^\prime |\) for all \(t,t^\prime \in [0,T]\), \(x,y\in \mathbb {R}^N\), \(u\in U\), and \(v\in V\). Similarly to the proof of (H3) we now show that \(|H(t,x,\lambda ) - H(t^\prime ,x,\lambda )| \le L (1 + |\lambda |) |t-t^\prime |\), where \(L = \max (L_\ell ,L_f)\).
Proof of (H5): Let \(t\in [0,T]\), \(x\in \mathbb {R}^N\), and \(\lambda ,\mu \in \mathbb {R}^N\). Choose \(v^*\in V\) such that
$$\begin{aligned} H(t,x,\mu ) = \max _{u\in U} {\widetilde{H}}(t,x,\mu ,u,v^*). \end{aligned}$$
Then choose \(u^*\in U\) such that
$$\begin{aligned} \max _{u\in U} {\widetilde{H}}(t,x,\lambda ,u,v^*) = {\widetilde{H}}(t,x,\lambda ,u^*,v^*). \end{aligned}$$
Then we have
$$\begin{aligned} H(t,x,\lambda ) - H(t,x,\mu ) \le {\widetilde{H}}(t,x,\lambda ,u^*,v^*) - {\widetilde{H}}(t,x,\mu ,u^*,v^*) = \left<\lambda - \mu , f(t,x,u^*,v^*)\right> \le B_f |\lambda - \mu |. \end{aligned}$$
Similarly, one shows that \(H(t,x,\mu ) - H(t,x,\lambda ) \le B_f |\lambda - \mu |\), which proves that (H5) holds with \(M = B_f\).
Proof of (H1): Let \(t,t^\prime \in [0,T]\), \(x,y\in \mathbb {R}^N\) and \(\lambda ,\mu \in \mathbb {R}^N\). Then, by (H3), (H4) and (H5),
$$\begin{aligned} |H(t,x,\lambda ) - H(t^\prime ,y,\mu )| \le K(1+|\lambda |)|x-y| + L(1+|\lambda |)|t-t^\prime | + B_f|\lambda - \mu |. \end{aligned}$$
Uniform continuity of H follows from this. \(\square \)
Recall that G is defined by
$$\begin{aligned} G(t,\rho ,\phi )(x) = \nu \big (\varPsi (t,\rho ,x,\phi )\big ), \end{aligned}$$
where elements of the matrix \(\varPsi (t,\rho ,x,\phi )\) are given as
$$\begin{aligned} \varPsi _{ij}(t,\rho ,x,\phi ) = \rho \,\ell (t,x,u_i,v_j) + \phi \big (x + \rho f(t,x,u_i,v_j)\big ). \end{aligned}$$
Proposition B.2
If \(\ell \) and f are bounded, then the mapping G satisfies conditions (F1), (F2), (F3), (F4), (F5), (F6), (F7) and (F8).
Proof
Let \(B_\ell \) be an upper bound for \(|\ell (t,x,u,v)|\) and let \(B_f\) be an upper bound for |f(t, x, u, v)|.
Proof of (F1): For \(x\in \mathbb {R}^N\), \(G(t,0,\phi )(x)\) is the value of the matrix game, associated with the \(m\times n\)-matrix with all its entries equal to \(\phi (x)\). Thus, for all \(x\in \mathbb {R}^N\), \(G(t,0,\phi )(x) = \phi (x)\).
Proof of (F2): We have, by definition, \(G(t,\rho ,\phi )(x) = \nu \big (\varPsi (t,\rho ,x,\phi )\big )\), where \(\varPsi _{ij}(t,\rho ,x,\phi ) = \rho \ell (t,x,u_i,v_j) + \phi (x + \rho f(t,x,u_i,v_j))\). Each entry is continuous in t, x and \(\rho \), since \(\ell \), f and \(\phi \) are continuous, and the value \(\nu \) of a matrix game is a nonexpansive (hence continuous) function of the entries of the matrix. It follows that the mapping \((t,\rho ) \mapsto G(t,\rho ,\phi )\) is continuous with respect to the norm \(\Vert \cdot \Vert \).
Proof of (F3): Let \(k\in \mathbb {R}\). We have, for all \(x\in \mathbb {R}^N\),
$$\begin{aligned} G(t,\rho ,\phi +k)(x) = \nu \big (\varPsi (t,\rho ,x,\phi ) + k E\big ) = \nu \big (\varPsi (t,\rho ,x,\phi )\big ) + k = G(t,\rho ,\phi )(x) + k. \end{aligned}$$
Here E denotes the \(m\times n\)-matrix with all entries equal to 1.
Proof of (F4): Let \(t,\rho \in [0,T]\) and \(\phi \in C^{0,1}_b(\mathbb {R}^N)\). Then
$$\begin{aligned} |G(t,\rho ,\phi )(x) - \phi (x)| = \big |\nu \big (\varPsi (t,\rho ,x,\phi )\big ) - \nu \big (\phi (x) E\big )\big | \le \max _{i,j} \big |\varPsi _{ij}(t,\rho ,x,\phi ) - \phi (x)\big |, \end{aligned}$$
where E denotes the \(m\times n\)-matrix for which all entries are equal to 1. Entry (i, j) of the matrix \(\varPsi (t,\rho ,x,\phi ) - \phi (x) E\) equals \(\rho \ell (t,x,u_i,v_j) + \phi (x + \rho f(t,x,u_i,v_j)) - \phi (x)\). We have
$$\begin{aligned} \big |\rho \ell (t,x,u_i,v_j) + \phi (x + \rho f(t,x,u_i,v_j)) - \phi (x)\big | \le \rho B_\ell + \rho B_f \Vert D\phi \Vert . \end{aligned}$$
Then \(\Vert G(t,\rho ,\phi ) - \phi \Vert \le \rho (B_\ell + B_f \Vert D\phi \Vert )\). We see that (F4) holds with \(C_1 = B_\ell + B_f \Vert D\phi \Vert \) (and that \(C_1\) only depends on \(\Vert D\phi \Vert \), not on \(\Vert \phi \Vert \)).
Proof of (F5): Let \(t,\rho \in [0,T]\), \(x\in \mathbb {R}^N\), and \(\phi ,{\overline{\phi }}\in C^{0,1}_b(\mathbb {R}^N)\). We have, for all \(i\in \{1,\ldots ,m\}\) and \(j\in \{1,\ldots ,n\}\):
$$\begin{aligned} \big |\varPsi _{ij}(t,\rho ,x,\phi ) - \varPsi _{ij}(t,\rho ,x,{\overline{\phi }})\big | = \big |\phi (x + \rho f(t,x,u_i,v_j)) - {\overline{\phi }}(x + \rho f(t,x,u_i,v_j))\big | \le \Vert \phi - {\overline{\phi }}\Vert . \end{aligned}$$
Thus, for all t, \(\rho \in [0,T]\) and \(x\in \mathbb {R}^N,\) corresponding entries of the two matrices \(\varPsi (t,\rho ,x,\phi )\) and \(\varPsi (t,\rho ,x,{\overline{\phi }})\) differ by at most \(\Vert \phi - {\overline{\phi }}\Vert \). This implies that the values of the corresponding matrix games differ by at most \(\Vert \phi - {\overline{\phi }}\Vert \). It follows that
$$\begin{aligned} \Vert G(t,\rho ,\phi ) - G(t,\rho ,{\overline{\phi }})\Vert \le \Vert \phi - {\overline{\phi }}\Vert . \end{aligned}$$
Proof of (F6): Let \(t,\rho \in [0,T]\) and \(\phi \in C^{0,1}_b(\mathbb {R}^N)\). We have:
$$\begin{aligned} |G(t,\rho ,\phi )(x)| = \big |\nu \big (\varPsi (t,\rho ,x,\phi )\big )\big | \le \max _{i,j}\, \big |\varPsi _{ij}(t,\rho ,x,\phi )\big | \le \rho B_\ell + \Vert \phi \Vert . \end{aligned}$$
We see that (F6) holds with \(C_2 = B_\ell \) (and that the exponential term \(\text{ e}^{\rho C_2}\) is not necessary here.)
Proof of (F7): Let \(t,\rho \in [0,T]\), \(x,y\in \mathbb {R}^N\), and \(\phi \in C^{0,1}_b(\mathbb {R}^N)\). We have for all \(i\in \{1,\ldots ,m\}\) and \(j\in \{1,\ldots ,n\}\):
$$\begin{aligned} \big |\varPsi _{ij}(t,\rho ,x,\phi ) - \varPsi _{ij}(t,\rho ,y,\phi )\big | \le \rho K_\ell |x-y| + \Vert D\phi \Vert \big (1 + \rho K_f\big )|x-y|. \end{aligned}$$
This implies that
$$\begin{aligned} |G(t,\rho ,\phi )(x) - G(t,\rho ,\phi )(y)| \le \big (\rho K_\ell + (1 + \rho K_f)\Vert D\phi \Vert \big )|x-y|. \end{aligned}$$
Hence,
$$\begin{aligned} \Vert DG(t,\rho ,\phi )\Vert \le \rho K_\ell + (1 + \rho K_f)\Vert D\phi \Vert , \end{aligned}$$
and it follows that
$$\begin{aligned} \Vert DG(t,\rho ,\phi )\Vert \le \textrm{e}^{\rho K_f}\big (\Vert D\phi \Vert + \rho K_\ell \big ). \end{aligned}$$
We see that (F7) holds with \(C_4 = K_\ell \) and \(C_3 = K_f - K_\ell \).
Proof of (F8): (In this proof, we use the notation \(x^T\) for transposition of vector x.)
Let t, \(\rho \in [0,T]\), \(x\in \mathbb {R}^N\), and \(\phi \in C^2_b(\mathbb {R}^N)\). For all \(i\in \{1,\ldots , m\}\) and \(j\in \{1,\ldots , n\}\):
where \(\omega _{ij}\in \mathbb {R}^N\) is a vector on the line segment from x to \(x + \rho f(t,x,u_i,v_j)\). Writing this in matrix form, we obtain
Here, \(\varOmega (t,x)\) is a matrix whose entries are all between \(-1\) and 1, and we defined \({\widetilde{H}}(t,x,\lambda )\) as the \(m\times n\) matrix for which entry (i, j) equals \(\widetilde{H}(t,x,\lambda ,u_i,v_j)\). Now, observe that \(H(t,x,\lambda ) = \nu \big ({\widetilde{H}}(t,x,\lambda )\big )\). Then
We see that (F8) holds with \(C_5 = \frac{1}{2}K^2_f\). \(\square \)
Proof of Theorem 2.1: If \(\ell \), f and g are bounded, we can directly apply Theorem A.1 to conclude the proof. If \(\ell \), f, or g is not bounded, then, for any \(R> 0\), we define
and
Then \(\ell _R\), \(f_R\) and \(g_R\) are bounded. Moreover, it is easy to see that \(g_R\) is Lipschitz continuous and that \(f_R\) and \(\ell _R\) satisfy respectively \(|f_R(t,x,u,v) - f_R(t,y,u,v)| \le K_f |x-y|\) and \(|\ell _R(t,x,u,v) - \ell _R(t,y,u,v)| \le K_\ell |x-y|\), for all \(t\in [0,T]\), \(x,y\in \mathbb {R}^N\), \(u\in U\) and \(v\in V\). Also, \(|f_R(t,x,u,v) - f_R(t^\prime ,x,u,v)| \le L_f |t-t^\prime |\) and \(|\ell _R(t,x,u,v) - \ell _R(t^\prime ,x,u,v)| \le L_\ell |t-t^\prime |\), for all \(t,t^\prime \in [0,T]\), \(x\in \mathbb {R}^N\), \(u\in U\) and \(v\in V\). Therefore, we can apply Theorem A.1 with respect to the functions \(\ell _R\), \(f_R\) and \(g_R\).
Let \(({\overline{t}},{\overline{x}}) \in [0,T] \times \mathbb {R}^N\). Now, we wish to choose R sufficiently large, such that \(W_R(\overline{t},{\overline{x}}) = W({\overline{t}},{\overline{x}})\) and such that for any partition P, we have \(W^P_R({\overline{t}},{\overline{x}}) = W^P({\overline{t}},{\overline{x}})\). Here \(W_R\) refers to the value of the differential game defined by (1) and (2), where \(\ell \), f and g are replaced by the truncated functions \(\ell _R\), \(f_R\) and \(g_R\). Similarly, \(W^P_R\) refers to the approximation of \(W_R\) that is obtained by applying G to the game with truncated functions. The choice \(R = \textrm{e}^{K_f T}(|{\overline{x}}| + M_f)\) will do, with \(M_f = \sup _{(t,u,v)\in [0,T]\times U\times V} |f(t,0,u,v)|\). \(\square \)
Kuipers, J., Schoenmakers, G. & Staňková, K. Approximating the Value of Zero-Sum Differential Games with Linear Payoffs and Dynamics. J Optim Theory Appl 198, 332–346 (2023). https://doi.org/10.1007/s10957-023-02236-x