A multi-step Lagrangian scheme for spatially inhomogeneous evolutionary games

A multi-step Lagrangian scheme at discrete times is proposed for the approximation of a nonlinear continuity equation arising as the mean-field limit of spatially inhomogeneous evolutionary games, describing the evolution of a system of spatially distributed agents with strategies, or labels, whose payoff depends also on the current position of the agents. The scheme is Lagrangian, as it traces the evolution of positions and labels along characteristics, and is a multi-step scheme, as it proceeds in two stages: First, the distribution of strategies or labels is updated according to a best-performance criterion, and then this is used by the agents to evolve their positions. A general convergence result is provided in the space of probability measures. In the special cases of replicator-type systems and reversible Markov chains, variants of the scheme, in which the explicit step in the evolution of the labels is replaced by an implicit one, are also considered, and convergence results are provided.


Introduction
The capability of changing strategy as an adaptive response to the modification of the surrounding environment, in order to maximize a certain payoff, is of paramount importance in decision-making processes. Replicator-type models [21] are a particular class of dynamical models that feature this adaptivity and are well suited for studying the evolution of strategies according to their success: Given a pool of strategies, the occurrence of each of them evolves according to its performance with respect to all the others; in this way, if a strategy gives a payoff which is higher than the average of all strategies, it is enhanced; otherwise, it is suppressed. In the basic replicator model, this criterion is the only one that determines the evolution of the occurrence of the strategies, which is in fact independent of all other factors, in particular of the position of the agents that play those strategies. In many cases this is a reasonable, and not even restrictive, assumption. For example, in a financial scenario, the set of (pure) strategies U contains the financial products available to an investor.
Any combination of them, that is, a portfolio, is called a mixed strategy: In a discrete setting such as this one, it corresponds to the fraction of the capital invested in each of the different financial products. Adapting the strategy means allocating resources differently according to the evolution of the market, and the location the investor is at when making this decision is unlikely to affect the reward of the portfolio. On the contrary, when the position influences the outcome, the system is more involved, as more feedback is available, and the adaptive optimization process relies on the mutual influence of position and strategy performance. We call such systems spatially inhomogeneous and make them the focus of this paper.

Overview of the problem and state of the art
The basic, spatially homogeneous, replicator equation of [21] can be enriched to include spatial dependence of the payoff function: The idea is that the same strategy adopted in two different places might originate different rewards, precisely depending on the environment. Therefore, in order to maximize the payoff, players can not only adapt their strategies, but also change their position, seeking the highest possible payoff. Spatially inhomogeneous evolutionary games, introduced in [5], provide a general mathematical framework for the evolution of a distribution of players with their (distributions of) strategies: A space-dependent replicator equation governs the evolution of the distribution λ ∈ P(U) of the strategies u ∈ U, while the evolution of the spatial variable x ∈ R^d is determined by λ.
In the subsequent contribution [31], this approach has been suitably extended as an abstract toolbox which is capable of rigorously describing the mean-field limit of a larger class of models which share the following features: -a multi-agent dynamics in which every agent is characterized by a label u ∈ U (accounting for different strategies or different populations to which each individual belongs); -exchange rates among the labels which are stochastic in nature and, therefore, are described by the evolution of a probability measure λ ∈ P(U ).
Several other models, besides the replicator dynamics mentioned above, are included in this class. The multi-label setting can be effectively used to describe situations in which the action of every individual is weighted differently according to the species it belongs to [3,4,16,17,19]. In the theory of mean-field games or in optimal control theory, labelling is used to distinguish informed agents in the evacuation of unknown environments, to highlight the influence of key investors in the stock market, or of strong leaders in opinion formation [11,13,18,39]. The addition of source and sink terms in the spirit of [34] and of label switching [38] can be successfully dealt with in this class of models. Relevant applications where label switching may occur come, for instance, from chemical reaction networks, where a particle may change its type.

The framework proposed in [31] couples a nonlinear transport dynamics for the positions x ∈ R^d of the agents with a Markov-type jump process for the labels λ ∈ P(U) (see Sect. 2). The mean-field limit of the model was proved to be a nonlinear continuity equation of the form (1), posed in the space of probability measures over the pairs (x, λ) ∈ R^d × P(U) and driven by a velocity field b_Ψ(x, λ) depending on the global state of the system Ψ ∈ P(R^d × P(U)). These equations are part of a general class which is of great interest in the mathematical community [6, Chapter 8] and can be studied with either a Lagrangian or a Eulerian approach. On the one hand, the nonlinear continuity equation expresses the Eulerian point of view, tracing the evolution of the global state Ψ. On the other hand, a notion of solution can also be provided by the Lagrangian point of view, tracing the characteristics, which are, in our case, solutions to an ODE in a suitably constructed Banach space.
Given an initial datum Ψ, a solution t → Ψ_t of the initial value problem for the nonlinear continuity equation is called a Eulerian solution, whereas a curve t → Ψ_t obtained via the push-forward of Ψ through the flow map associated with the ODE is called a Lagrangian solution. Since Lagrangian solutions are also Eulerian solutions, the equivalence of the two notions follows if one is able to prove that Eulerian solutions are also Lagrangian. For the model studied in [31], and also for other relevant ones [14], these two notions of solution are equivalent. This has been achieved by means of the superposition principle (see [36], and also [5, Theorem 5.3]). Furthermore, the Lagrangian formulation has been used to propose discretization schemes to solve the nonlinear PDE numerically [15,25,26,29,35]. Moreover, the Lagrangian point of view has been used in [5] to provide a heuristic derivation of the nonlinear continuity equation arising as the mean-field limit of the spatially inhomogeneous replicator dynamics. Let us briefly discuss this derivation. Denoting by h = T/N the time step, if an agent at time t = ih, for i ∈ {0, …, N−1}, is in the position x with mixed strategy λ, first they optimize the strategy distribution following a homogeneous replicator dynamics of the form (3). Here, T_{Ψ_t}(x, λ) is the payoff operator determining the enhancement or suppression of the strategies; it depends on the random state (x, λ) and also on the current distribution Ψ_t. In the setting of [5], the operator T is quadratic in λ. After updating the strategy portfolio, the agent updates its position x to x′, choosing u with probability λ′. The above two equations completely determine the conditional probability of having an agent in a state (x′, λ′) at time t + h, given the distribution Ψ_t.
Equivalently, the new distribution Ψ_{t+h} can be defined via duality. By a formal first-order Taylor expansion, in the formal limit h → 0 we obtain the weak formulation of the nonlinear continuity equation (1). A related heuristic derivation has been outlined also in [2, Remark 4.1], in the context of a leader-follower dynamics which also fits in the setting of [31]. In this case, the R^d-component of b_Ψ also depends on Ψ, whereas the λ-component acts linearly on λ, modelling a Markov chain on U. We point out that in this paper we neglect diffusive terms, as a general existence theory is still missing. Even in the literature on mean-field games, well-posedness results have been shown only for the so-called potential games [24]. For these, numerical solutions have been proposed, see, e.g., [1], which are based on the iterative solution of a backward-forward system. In our setting, a first step towards including diffusion has been made in [10], where an entropic regularization of the spatially inhomogeneous replicator dynamics (see also Sect. 4) has been considered.
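The two-stage heuristic above (first the replicator update of the mixed strategy, then the position update driven by the updated strategy distribution) can be sketched for a single agent with finitely many pure strategies. The payoff vector f and the strategy-dependent velocity v below are illustrative stand-ins, not the operators T_Ψ and b_Ψ of the paper:

```python
import numpy as np

def replicator_step(lam, f, h):
    """Explicit Euler step of the homogeneous replicator dynamics:
    lam' = lam + h * lam * (f - <f, lam>).  Total mass is preserved
    exactly, since sum(lam * (f - mean)) = mean - mean = 0."""
    return lam + h * lam * (f - np.dot(f, lam))

def position_step(x, lam, v, h):
    """Move with the lam-average of the strategy-dependent velocities."""
    return x + h * sum(lam[u] * v(x, u) for u in range(len(lam)))

# toy data: 3 pure strategies, positions in the plane (all values illustrative)
h = 0.05
x = np.array([0.0, 0.0])
lam = np.array([0.2, 0.3, 0.5])
f = np.array([1.0, 0.5, -0.2])                    # payoffs of the strategies
v = lambda x, u: (u + 1) * np.array([1.0, 0.0])   # velocity per strategy

lam_new = replicator_step(lam, f, h)      # first update the strategy ...
x_new = position_step(x, lam_new, v, h)   # ... then move using the new one
```

For small h the update keeps lam_new in the simplex and enhances the strategy with the highest payoff, mirroring the enhancement/suppression mechanism described above.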

Results of this paper
The main objective of this paper is to present a rigorous proof of the formal derivation described above, by means of a multi-step Lagrangian scheme. The scheme we propose is suitable for approximating all equations in the class considered in [31] (we refer to Sect. 2 for the precise details). The method is a Lagrangian one as it is based on the approximation of the ODE (2), and it is multi-step because the updates of x and λ do not happen simultaneously, but follow the heuristics described above. Indeed, first we make an incremental step in λ and then use the updated λ to make the incremental step in x.
Since the velocity field b depends explicitly on Ψ, at each incremental step the updates of x, λ, and of the distribution Ψ involve the following substeps, which are the rigorous formalization of the heuristics discussed above. To be precise:
- first, we update λ to λ′ in the spirit of (3) (see (8));
- then, we transport λ′ to the state of the system, obtaining an intermediate distribution Ψ̃ (see (11)). This amounts to assuming that all the agents know the optimal label distributions λ′ of the other agents;
- then, we update the positions x to x′ in the spirit of (4), where the velocity field depends on Ψ̃ (see (12)). Notice that, in our general framework, the velocity field depends on Ψ, and this makes the previous step necessary;
- finally, we update the global distribution to Ψ′, taking both x′ and λ′ into account (see (15)).
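At the level of a finite population (an empirical measure), the substeps above can be sketched as follows; the velocity field, which here depends on the population state only through its mean position, and all numerical values are illustrative choices, not the paper's operators:

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_strat, h = 50, 3, 0.02
X = rng.normal(size=(n_agents, 2))                  # agent positions
L = rng.dirichlet(np.ones(n_strat), size=n_agents)  # label distributions
f = np.array([1.0, 0.0, -1.0])                      # payoffs (illustrative)

def scheme_step(X, L):
    # (i) explicit replicator-type update of every label distribution
    mean_payoff = L @ f
    L_new = L + h * L * (f[None, :] - mean_payoff[:, None])
    # (ii) transport the new labels to the state of the system: the
    #      intermediate empirical measure is carried by (X, L_new)
    mean_pos = X.mean(axis=0)        # a summary of the intermediate state
    # (iii) update positions with a velocity depending on that state
    #       (drift toward the mean position, weighted by strategy 0)
    X_new = X + h * (mean_pos[None, :] - X) * L_new[:, [0]]
    # (iv) the updated global empirical measure is carried by (X_new, L_new)
    return X_new, L_new

X1, L1 = scheme_step(X, L)
```

The essential point reproduced here is the ordering: labels are updated first, the state of the system is refreshed, and only then are positions moved with the Ψ-dependent velocity.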
Our first main result is Theorem 1 in Sect. 3 on the convergence of the scheme presented above.
In Sects. 4 and 5, we turn our attention to the case of the inhomogeneous replicator dynamics considered in [5] and to the leader-follower-type dynamics of [31, Section 5.1], respectively. More generally, for the second case, we assume that T_Ψ(x, λ) is induced by a Markov chain on a finite space with an arbitrary number n of labels.
In the spatially homogeneous case, that is, when there is no x-dependence in the vector field b, in both situations the evolutions of the λ-components are gradient flows of suitable energies with respect to certain metric structures, and the solutions can be approximated via a minimizing movement scheme [6,22]. The spatially homogeneous replicator equation is a gradient flow with respect to the spherical Hellinger distance (36) of probability measures. (This could be obtained, for instance, for a proper choice of f in [23, formula (1.8)].) The spatially homogeneous Markov-type jump processes are gradient flows of an entropy-like energy with respect to a distance induced by the transition matrix [28,30].
We investigate the compliance of these structures with our algorithm. More precisely, we elaborate an implicit-explicit scheme where the explicit step (3) is replaced by a minimizing movement step suggested by the aforementioned gradient-flow structure (see (39) and (82), respectively). A relevant difficulty in the spatially inhomogeneous setting is that the energy and the dissipation distances that we consider may themselves depend on the state Ψ, which changes from step to step. This extension is far from trivial and requires a careful analysis of the related Euler conditions, which is partially inspired by [20, Section 4.2] for the case of the replicator dynamics. This is done in Propositions 3 and 5, respectively, where we show that the deviation from the explicit scheme is uniformly controlled by the vanishing time step.
The two main results of Sects. 4 and 5 are given by Theorems 2 and 3, proving the convergence of our multi-step Lagrangian scheme to the unique solution to (1). In particular, Theorem 2 is a global-in-time convergence result for the spatially inhomogeneous replicator dynamics, whereas Theorem 3 provides a short-time existence result for a well-prepared initial datum for spatially inhomogeneous Markov-type jump processes.
The paper is structured as follows: In Sect. 2, we introduce the structural assumptions on the systems that we consider. In Sect. 3, we describe the multi-step Lagrangian scheme, which we apply to the inhomogeneous replicator dynamics in Sect. 4 and to the inhomogeneous Markov-type jump processes in Sect. 5.

The mathematical setting
2.1. Basic notation.
Given a metric space (X, d_X), we denote by M(X) the space of signed Borel measures μ on X with finite total variation ‖μ‖_TV, and by M^+(X) and P(X) the convex subsets of nonnegative measures and probability measures, respectively. We say that μ ∈ P_c(X) if μ ∈ P(X) and the support spt μ of μ is a compact subset of X. Moreover, for K ⊆ X we will use the notation P(K) to indicate the set of measures μ ∈ P(X) such that spt μ ⊆ K.
As usual, if (Z, d_Z) is another metric space, for every μ ∈ M^+(X) and every μ-measurable map f : X → Z, we denote by f_#μ ∈ M^+(Z) the push-forward measure, defined by f_#μ(B) := μ(f^{-1}(B)) for every Borel set B ⊆ Z. The push-forward measure has the same total mass as μ, namely f_#μ(Z) = μ(X). For a Lipschitz function f : X → R, we denote by Lip(f) its Lipschitz constant. We denote by Lip(X) and Lip_b(X) the spaces of Lipschitz and bounded Lipschitz functions on X, respectively. Both are normed spaces with the norm ‖f‖_Lip := ‖f‖_∞ + Lip(f), where ‖·‖_∞ is the supremum norm. Furthermore, we use the notation Lip_1(X) for the set of functions f ∈ Lip_b(X) such that Lip(f) ≤ 1.
In a complete and separable metric space (X, d_X), we shall use the Kantorovich–Rubinstein distance W_1 in the class P(X), defined as
W_1(μ, ν) := sup { ∫_X f d(μ − ν) : f ∈ Lip_1(X) }.
Notice that W_1(μ, ν) is finite if μ and ν belong to the space P_1(X) of probability measures with finite first moment. If (E, ‖·‖_E) is a Banach space and μ ∈ M^+(E), we define the first moment m_1(μ) as
m_1(μ) := ∫_E ‖x‖_E dμ(x).
Notice that, for a probability measure μ, finiteness of the above integral is equivalent to μ ∈ P_1(E), whenever E is endowed with the distance induced by the norm ‖·‖_E.
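On the real line, the Kantorovich–Rubinstein distance admits the classical cumulative-distribution-function formula W_1(μ, ν) = ∫ |F_μ(s) − F_ν(s)| ds, which gives a quick way to evaluate W_1 between finitely supported measures; a minimal sketch (the point masses below are arbitrary):

```python
import numpy as np

def w1_1d(x_mu, w_mu, x_nu, w_nu):
    """W1 between the discrete measures sum_i w_mu[i] delta_{x_mu[i]} and
    sum_j w_nu[j] delta_{x_nu[j]} on the real line, via the CDF formula
    W1(mu, nu) = integral |F_mu(s) - F_nu(s)| ds."""
    grid = np.sort(np.concatenate([x_mu, x_nu]))
    cdf = lambda x, w, s: float(np.sum(w[x <= s]))
    # the CDF difference is constant between consecutive support points
    return sum(abs(cdf(x_mu, w_mu, a) - cdf(x_nu, w_nu, a)) * (b - a)
               for a, b in zip(grid[:-1], grid[1:]))

# W1 between two Dirac masses equals the distance of their supports
print(w1_1d(np.array([0.0]), np.array([1.0]),
            np.array([1.0]), np.array([1.0])))  # prints 1.0
```

In the paper W_1 is used on the infinite-dimensional space Y; the one-dimensional formula is only a convenient special case for intuition and testing.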
For a Banach space E, the notation C 1 b (E) will be used to denote the subspace of C b (E) of functions having bounded continuous Fréchet differential at each point. The notation ∇φ(·) will be used to denote the Fréchet differential. In the case of a function φ : [0, T ]×E → R, the symbol ∂ t will be used to denote partial differentiation with respect to t. The symbol ·, · will be used to denote duality products, with no further specification if the meaning is clear from the context.

Functional setting
We consider a set of pure strategies U , where U is a compact metric space, and we denote by Y := R d × P(U ) the state space of the system. Precisely, for every y = (x, λ) ∈ Y , the component x ∈ R d describes the location of an agent in space, whereas the component λ ∈ P(U ) describes the distribution of labels of the agent.
We notice that, by definition of ‖·‖_BL, we always have ‖μ‖_BL ≤ ‖μ‖_TV for every μ ∈ M(U).
In particular, ‖λ‖_BL ≤ 1 for every λ ∈ P(U). We endow Y with the norm ‖y‖_Y = ‖(x, λ)‖_Y := |x| + ‖λ‖_BL. For every R > 0, we denote by B_R the closed ball of radius R in R^d and by B^Y_R the ball of radius R in Y, namely B^Y_R = {y ∈ Y : ‖y‖_Y ≤ R}. We notice that B^Y_R is a compact set, as Y is locally compact by our assumptions on U.
As in [31], we consider, for every Ψ ∈ P_1(Y), a velocity field v_Ψ : Y → R^d satisfying:
(v_1) there exists L_v > 0 such that |v_Ψ(y_1) − v_Ψ(y_2)| ≤ L_v ‖y_1 − y_2‖_Y for every y_1, y_2 ∈ Y;
(v_2) for every R > 0 there exists L_{v,R} > 0 such that |v_{Ψ_1}(y) − v_{Ψ_2}(y)| ≤ L_{v,R} W_1(Ψ_1, Ψ_2) for every Ψ_1, Ψ_2 ∈ P(B^Y_R) and every y ∈ B^Y_R;
(v_3) there exists M_v > 0 such that |v_Ψ(y)| ≤ M_v (1 + ‖y‖_Y + m_1(Ψ)) for every y ∈ Y and every Ψ ∈ P_1(Y).
As for T, for every Ψ ∈ P_1(Y) we assume that the operator T_Ψ : Y → F(U) is such that:
(T_0) for every (y, Ψ) ∈ Y × P_1(Y), the constants belong to the kernel of T_Ψ(y), i.e., ⟨T_Ψ(y), 1⟩ = 0;
(T_1) there exists M_T > 0 such that ‖T_Ψ(y)‖_BL ≤ M_T for every y ∈ Y and every Ψ ∈ P_1(Y).
Finally, for every y ∈ Y and every Ψ ∈ P_1(Y) we set b_Ψ(y) := (v_Ψ(y), T_Ψ(y)), which is the velocity field driving the evolution (see (7)).

The multi-step Lagrangian scheme
Let Ψ ∈ P_c(Y) be a probability measure with compact support in Y. Given k ∈ N, we set τ_k := T/k and t^k_i := iτ_k for i = 0, …, k, and we construct an approximation of the solution of the initial value problem for the nonlinear continuity equation (7). Assume the state of the system Ψ^k_i at time t^k_i to be known. With this knowledge, we update the state of the system with the following procedure, consisting of two steps.
Step 1. We update the label λ_{(x,λ)}(t) ∈ P(U) of a player that at time t^k_i sits in x ∈ R^d with label λ ∈ P(U) by setting λ_{(x,λ)}(t^k_{i+1}) as in (8). At this stage, we assume that λ_{(x,λ)}(t^k_{i+1}) ∈ P(U), and we continue with the construction of the piecewise affine interpolant between λ_{(x,λ)}(t^k_i) and λ_{(x,λ)}(t^k_{i+1}), defined as the function λ^k_{(x,λ),i+1} in (9). In Lemma 1, we show that the assumption λ_{(x,λ)}(t^k_{i+1}) ∈ P(U) is actually satisfied for k large enough (and therefore τ_k small enough), independently of i = 0, …, k−1. Taking Lemma 1 for granted for the time being, we define the map Λ^k_{i+1} in (10) and transport it to the state of the system by defining the intermediate distribution in (11).

Step 2. In the second step, we update the positions of the players. Precisely, a player that at time t^k_i sits in the position x with label λ will now move following the velocity field given by (8). Hence, we set the new position as in (12). Also in this case, we can define the affine interpolant between the old and the new position as in (13). We notice that (13), in contrast to (9), is always well defined, since R^d is a convex space and the velocity field is an element of R^d.
and we set Ψ^k_{i+1} as in (15). For later use, we also define the interpolation curves (16) and (17). By an application of the Gronwall inequality, in the following lemma we give an estimate in terms of |x| and ‖λ‖_BL. As a consequence, we deduce that the above construction is well defined for τ_k sufficiently small and can be iterated over i = 0, …, k − 1, since the initial condition Ψ has compact support in Y. This indeed implies that each Ψ^k_i belongs to P_c(Y) ⊆ P_1(Y).
Proof. Along the proof of the lemma, we denote by λ^k(t, x_0, λ_0) and x^k(t, x_0, λ_0), for (x_0, λ_0) ∈ spt Ψ =: S, the curves obtained by iteratively solving the difference equations (8) and (12). As we have already noticed above, the curve x^k(t, x_0, λ_0) is well defined as long as λ^k(t, x_0, λ_0) and the measures Ψ^k_i are. Therefore, in order to prove the lemma it is enough to show that, for τ_k small enough, λ^k(t^k_{i+1}, x_0, λ_0) ∈ P(U) for every (x_0, λ_0) ∈ S. To simplify our estimates, we define the piecewise constant interpolation functions as in (18). For i = 0, the initial condition satisfies λ_0 ∈ P(U); hence, there is nothing to show.
for k large enough, independently of i and of the initial condition (x_0, λ_0). Since, recalling (8) and (9), we define the iterates accordingly, we are led to showing that the piecewise constant interpolation functions x^k(t, x_0, λ_0) and λ^k(t, x_0, λ_0) are bounded in R^d and F(U), respectively, uniformly with respect to (x_0, λ_0) ∈ S and t ∈ [0, t^k_{i+1}], and that the bound does not depend on i. Indeed, if this is the case, let R > 0 be such a bound, valid for every t and every (x_0, λ_0) ∈ S. In particular, (17) holds by construction. Assumption (T_1) implies that λ_R ∈ F(U) and satisfies the corresponding bound, so that λ_R can be extended in a unique way to a linear and continuous operator on C(U). The Riesz representation theorem yields that λ_R ∈ M^+(U). Moreover, by (T_0) we get λ_R(U) = 1, which implies λ_R ∈ P(U). By the convexity of P(U), the same holds along the interpolants. In order to conclude that the interpolation curves are bounded, let us now fix r > 0 such that S ⊆ B^Y_r. By taking the supremum over S in (19), we deduce a bound to which the Gronwall inequality applies. Applying the Gronwall inequality to (20), we infer the desired uniform bound. Setting R accordingly, the bound holds for every t ∈ [0, t^k_{i+1}] and every (x_0, λ_0) ∈ S. In particular, we notice that the above computations are independent of the choice of i, as long as we know that λ^k(t^k_j, x_0, λ_0) ∈ P(U) for every j = 0, …, i and every (x_0, λ_0) ∈ S. With this control at hand, we conclude, as explained above, that (8) and (12) are well posed.
Finally, the estimate (22) follows. In the next proposition, we show that the curve Ψ^k(·) solves the continuity equation (7) up to an error of order τ_k.

Proposition 1. Let Ψ^k_i be constructed via (15) starting from Ψ, and let Ψ^k be as in (16). Then, the following holds: There exists a positive constant C such that (23) is satisfied, where Ψ^k(t) and the intermediate interpolant are defined in (16) and (17), respectively.

Proof. In order to obtain (23) from (24), we have to estimate the quantities I_1 and I_2, where R has been determined in Lemma 1.
Let us start with I_1. By the triangle inequality, we have, where, in the second inequality, we have used the systems (9) and (13). By (v_3) and (T_3), we can continue the estimate. As for I_{1,2}, thanks to assumption (v_2) and to Lemma 1 we get a corresponding bound. Making use of (v_3) and (T_3) and recalling Lemma 1, we can continue. Combining (25)–(27), we get the estimate of I_1 for some positive constant C_1 independent of k, t, ϕ, and (x, λ) ∈ spt Ψ^k_i. Let us now estimate I_2. By Lemma 1 and by assumption (T_2), we get the first bound. Arguing as in (25)–(28), we deduce from (29) and from the hypotheses (v_1), (v_3), and (T_2) the estimate of I_2 for some positive constant C_2 independent of k, t, ϕ, and (x, λ) ∈ spt Ψ^k_i. We are now in a position to conclude the proof of the proposition. We rewrite (24) accordingly and conclude by noticing that, thanks to (28) and (30), the term ϑ^k(ϕ) above can be estimated by Cτ_k for a positive constant C independent of k, t, and ϕ.

Theorem 1.
Let Ψ ∈ P_c(Y) and let Ψ^k(·) be defined as in (15). Then, the sequence of curves Ψ^k : [0, T] → P_1(Y) converges to the unique solution Ψ ∈ C([0, T]; P_1(Y)) of (7) in W_1, uniformly with respect to t ∈ [0, T].

Proof. The existence and uniqueness of the solution to equation (7) follow from [31]. In view of Proposition 1, for every k ∈ N and every i ∈ {0, …, k−1}, the approximate equation holds. By integrating the previous equality over time, we deduce (31). In order to pass to the limit in (31), we have to determine a candidate limit for Ψ^k(t).
In Lemma 1, we have already shown that the supports of Ψ^k(t) are contained in a compact subset of Y. We now show the equicontinuity of the sequence Ψ^k with respect to W_1. Given s, t ∈ [0, T], we show that W_1(Ψ^k(s), Ψ^k(t)) ≤ L|s − t| for some L > 0 independent of k. By the triangle inequality, it is enough to show it for s, t ∈ [t^k_i, t^k_{i+1}]. Arguing as in the proof of Proposition 1, we obtain such a bound, where R has been defined in Lemma 1. Hence, the Ascoli–Arzelà theorem yields that there exists Ψ* ∈ C([0, T]; (P_1(Y), W_1)) such that, up to a subsequence, W_1(Ψ^k(t), Ψ*(t)) → 0 uniformly with respect to t ∈ [0, T]. In particular, Ψ^k(t), Ψ*(t) ∈ P(B^Y_R) for every k and every t. It remains to show that Ψ* is a solution to (7), from which we would deduce that Ψ* = Ψ and that the whole sequence Ψ^k converges to Ψ. The first line of (31) passes to the limit as k → ∞, since the test function φ belongs to C^1_b([0, T] × Y) and the convergence of Ψ^k in W_1 is uniform in time and implies narrow convergence. The last term on the right-hand side of (31) tends to 0, since it is of order τ_k. We conclude by estimating the remaining terms. By [31, Proposition 3.2], Lemma 1, and assumptions (v_2) and (T_2), the first term on the right-hand side of (33) can be estimated and shown to vanish. As for the second term, we first notice that, by [31, Proposition 3.2] and Lemma 1, the integrands are uniformly bounded; thus, by dominated convergence, also the second term on the right-hand side of (33) tends to zero as k → ∞. Eventually, passing to the limit k → ∞ in (31), we get the equality characterizing weak solutions of (7). This concludes the proof of the theorem.

An implicit-explicit scheme for the inhomogeneous replicator dynamics
We discuss in this section a different discrete-time approximation of the continuity equation (7) for the operator T_Ψ : Y → F(U) corresponding to the transition operators considered in [5] (see also [31, Section 5]) for the replicator equation (first introduced in [37]; see also [21,41]), namely the operator defined in (34) for every Ψ ∈ P_1(Y) and every y = (x, λ) ∈ Y. In (34) we consider a function J : (R^d × U)^2 → R such that (J_1) J is locally Lipschitz continuous with respect to all of its variables. For simplicity of notation, from now on we will use a shorthand for the payoff, so that (34) can be written in compact form. The following proposition holds.

We now introduce the spherical Hellinger distance HS between probability measures, defined for every λ_1, λ_2 ∈ P(U); here L^2([0, 1] × U; dρ) denotes the space of functions which are square-integrable with respect to the measure ρ. For later use, we also define the Hellinger distance H between nonnegative measures: For every μ_1, μ_2 ∈ M^+(U), we set H(μ_1, μ_2), where μ* ∈ M^+(U) is any measure such that μ_1, μ_2 ≪ μ*. We notice that HS can be expressed in terms of H through

HS(λ_1, λ_2) = arccos(1 − H^2(λ_1, λ_2)/2)  for every λ_1, λ_2 ∈ P(U),  (36)

and that the chain of inequalities (37) comparing H and HS holds. As recalled in the Introduction, the spatially homogeneous replicator equation is a gradient flow with respect to the spherical Hellinger distance. In the spatially inhomogeneous setting, the payoff functional has a bilinear dependence on Ψ and λ: For every Ψ ∈ P_1(Y) and every (x, λ) ∈ Y, we set the payoff functional accordingly (the factor 1/4 instead of 1/8 is due to the dependence on λ, which is now linear).

We modify the scheme in Sect. 3 by replacing the finite difference (8) with a minimizing movement. Namely, in the interval [t^k_i, t^k_{i+1}), let Ψ^k_i be given and define, for every (x, λ), the measure λ_{(x,λ),i+1} through the minimization (39). Notice that the measure λ_{(x,λ),i+1} ∈ P(U) is well defined, as P(U) is compact and the functional in (39) is strictly convex. Therefore, we can define λ^k, Λ^k_{i+1}, and Ψ̃^k_{i+1} exactly as in (9), (10), and (11), respectively.
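For measures on a finite set of strategies, the Hellinger and spherical Hellinger distances, and the identity (36) relating them, can be checked numerically (the two distributions below are arbitrary):

```python
import numpy as np

# Hellinger distance H and spherical Hellinger distance HS between
# probability vectors p, q (measures on a finite strategy set).
def hellinger(p, q):
    # H^2(p, q) = sum_u (sqrt(p_u) - sqrt(q_u))^2 = 2 - 2 sum_u sqrt(p_u q_u)
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def spherical_hellinger(p, q):
    # geodesic distance on the sphere of square roots of densities
    return np.arccos(np.clip(np.sum(np.sqrt(p * q)), -1.0, 1.0))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.1, 0.6, 0.3])
H = hellinger(p, q)
HS = spherical_hellinger(p, q)

# identity (36): HS = arccos(1 - H^2 / 2), since sum sqrt(pq) = 1 - H^2/2
assert abs(HS - np.arccos(1.0 - H ** 2 / 2.0)) < 1e-12
# the chord is shorter than the arc: H <= HS
assert H <= HS + 1e-12
```

The geometric picture: square roots of densities live on the unit sphere of L^2(μ*); H is the chord between two such points and HS the arc, which is why the arccos relation holds.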
The second step (12) in the space variable remains the same, so that x^k_{(x,λ),i+1}, X^k_{i+1}, and Ψ^k_{i+1} are as in (13), (14), and (15), respectively. We further refer to (15), (16), and (17) for the definition of the interpolation curves.
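A single minimizing-movement step in the spirit of (39) can be sketched for two strategies by brute-force search over the one-dimensional simplex; the energy below is an illustrative linear payoff, not the functional of the paper:

```python
import numpy as np

def spherical_hellinger(p, q):
    return np.arccos(np.clip(np.sum(np.sqrt(p * q)), -1.0, 1.0))

def minimizing_movement_step(lam, energy, tau, n_grid=2001):
    """One implicit step in the spirit of (39): minimize
        energy(mu) + HS^2(mu, lam) / (2 * tau)
    over the simplex {(s, 1 - s) : s in [0, 1]}, here by grid search.
    The current point lam is included among the candidates, so the
    objective can never increase along the iteration."""
    s = np.linspace(0.0, 1.0, n_grid)
    candidates = np.column_stack([s, 1.0 - s])
    candidates = np.vstack([candidates, lam])
    values = [energy(mu) + spherical_hellinger(mu, lam) ** 2 / (2.0 * tau)
              for mu in candidates]
    return candidates[int(np.argmin(values))]

# illustrative linear payoff energy (to be decreased along the flow)
energy = lambda mu: -float(np.dot(np.array([1.0, 0.2]), mu))
lam0 = np.array([0.5, 0.5])
lam1 = minimizing_movement_step(lam0, energy, tau=0.1)
# mass moves toward the strategy with the higher payoff
```

Since the minimization runs over P(U) itself, the iterate lies in the simplex for every k, in contrast with the explicit finite-difference step, which is only guaranteed to stay in P(U) for k large.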
The next lemma gives an estimate on the size of the support of Ψ k i+1 and Ψ k i+1 , showing that they belong to P c (Y ) ⊆ P 1 (Y ) for every k ∈ N and every i ∈ {0, . . . , k − 1}.
Proof. Let us define the piecewise constant interpolation functions λ^k, λ̄^k, and x^k as in (18), and let x^k and λ^k be the corresponding piecewise affine interpolations. Then, by (39) we have that λ^k(t, x_0, λ_0), λ̄^k(t, x_0, λ_0) ∈ P(U) for every t ∈ [0, T] and every (x_0, λ_0) ∈ spt Ψ. Following step by step the proofs of (19) and (22), we also deduce that there exists R > 0 such that (40) holds. We notice that, since the step (39) is defined through a minimization in P(U) and not through a finite difference, the estimate (40) holds for every k, and not only for k large. Moreover, (40) yields that Ψ^k(t), Ψ̃^k(t), Ψ̄^k(t) ∈ P(B^Y_R). In order to write the equivalent of Proposition 1, we have to determine an approximate Euler–Lagrange equation for the minimization problem (39). This is the content of the following proposition, written here for generic Ψ, x, and λ.
To compute the second term on the left-hand side of (47), we first notice that, since λ̃, λ, λ^ε_ϕ ≪ μ*, we can write the corresponding densities with respect to μ*. Using the algebraic equality, where, in the last equality, we have used the fact that λ̃, λ ∈ P(U).
In order to conclude with (43), we estimate δ_k(λ̃, λ) and the last term on the right-hand side of (50). In view of (36), (37), and (42), it is easy to check the first bound, for some positive constant c = c(R) > 0. Since ‖ϕ‖_Lip ≤ 1 and (45) holds, we can also control the last term. Combining (37), (42), and (47)–(52), we deduce (43) for some positive constant C = C(R). This concludes the proof of the proposition.
We are now in a position to state the equivalent of Proposition 1.

Proposition 4. There exists C > 0 such that for every
where θ^k(ϕ) denotes a remainder term. Proof. Along the proof, we denote by C a generic positive constant independent of i, k, t, and ϕ, that may vary from line to line. We follow step by step the proof of Proposition 1. For every test function ϕ, in order to deduce (53) from (54), we need to estimate the quantities I_1(x, λ) and I_2(x, λ) for (x, λ) ∈ spt Ψ^k_i ⊆ B^Y_R, where R has been determined in Lemma 2. Let us start with I_1. By the triangle inequality we have I_1(x, λ) ≤ I_{1,1}(x, λ) + I_{1,2}(x, λ). (55) By (v_3), Lemma 2, and Proposition 3, we can continue with the estimate of I_{1,1}. As for I_{1,2}, thanks to assumption (v_2) and to Lemma 2, and arguing as in (56), we infer the corresponding bound. Combining (55)–(57), we get the estimate of I_1. Let us now estimate I_2. By the triangle inequality, by Proposition 3, by (T_2), (v_3), and Lemma 2, and repeating the arguments of (57), we get, combining (59)–(61), the estimate of I_2. Equality (53) follows from (58) and (62) as in the proof of Proposition 1.
Proof. Since the operator T_Ψ defined in (34) satisfies the properties (T_0)–(T_3), we only have to check that the sequence Ψ^k is compact in C([0, T]; P_1(Y)). The rest of the proof works as for Theorem 1, using Proposition 4 instead of Proposition 1.
In view of Lemma 2, it is enough to show that Ψ^k is equi-Lipschitz with respect to W_1. Let us fix k ∈ N, i ∈ {0, …, k − 1}, and s ≤ t ∈ [t^k_i, t^k_{i+1}]. Then, by (v_2), Lemma 2, and Proposition 3, we get W_1(Ψ^k(s), Ψ^k(t)) ≤ C|s − t| for some positive constant C independent of k and t.

An implicit-explicit scheme for reversible Markov chains
In this section, we show how to adapt the scheme developed in Sect. 4 to a reversible Markov chain on n states. In particular, we will prove the convergence of such a scheme for short times.
In the notation of Sects. 3 and 4, the closure Λ̄_n can be identified with the set of probability measures P(U) for U := {e_h : h = 1, …, n}, e_h being the elements of the canonical basis of R^n. Keeping the notation of the previous sections, we set Y := R^d × Λ̄_n. Furthermore, we define Λ^δ_n := {λ ∈ Λ_n : λ_h ≥ δ} for every δ > 0. A Markov chain is characterized by a matrix Q ∈ M_n, whose element Q_{hℓ} ≥ 0, h ≠ ℓ, indicates the rate of moving from the state ℓ to the state h. In our setting, we consider a more general map Q : R^d × P_1(Y) → M_n satisfying properties (Q_1)–(Q_3); in particular, the entries Q_{hℓ}(x, Ψ), h, ℓ = 1, …, n, depend on x ∈ R^d and Ψ ∈ P_1(Y), and (Q_3) requires the existence of M_Q > 0 bounding Q(x, Ψ) uniformly for every x ∈ R^d and every Ψ ∈ P_1(Y). Remark 1. We remark that (Q_1) is always satisfied, for instance, when Q(x, Ψ) is a tridiagonal matrix for every x ∈ R^d and Ψ ∈ P_1(Y); see, e.g., [30, Section 5.1].
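Under the convention above (Q_{hℓ} is the rate of moving from state ℓ to state h, so that, for the label update λ′ = λ + h Qλ to preserve total mass, the columns of Q must sum to zero), a reversible tridiagonal rate matrix can be built explicitly, as in Remark 1; all numerical values below are illustrative:

```python
import numpy as np

# Q[h, l] >= 0 (h != l) is the rate of moving from state l to state h;
# the chain is reversible w.r.t. pi if  pi[l] * Q[h, l] == pi[h] * Q[l, h].
n = 4
pi = np.array([0.1, 0.2, 0.3, 0.4])          # target stationary law
up = np.array([1.0, 0.5, 2.0])               # rates l -> l+1 (illustrative)

Q = np.zeros((n, n))
for l in range(n - 1):
    Q[l + 1, l] = up[l]                      # upward jump
    Q[l, l + 1] = up[l] * pi[l] / pi[l + 1]  # downward jump, by reversibility
Q -= np.diag(Q.sum(axis=0))                  # make the columns sum to zero

# detailed balance, and stationarity of pi as a consequence
assert all(abs(pi[l] * Q[h, l] - pi[h] * Q[l, h]) < 1e-12
           for h in range(n) for l in range(n))
assert np.allclose(Q @ pi, 0.0)

# one explicit label update stays in the simplex for a small time step
h_step = 0.1
lam = np.full(n, 0.25)
lam_new = lam + h_step * Q @ lam
```

Stationarity of π follows directly from detailed balance: summing π_ℓ Q_{hℓ} = π_h Q_{ℓh} over ℓ gives (Qπ)_h = π_h · (column sum) = 0.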
In the next two lemmas, we collect some properties of E, K , G, and d (x,Ψ ) .
For every (x, λ) ∈ R^d × Λ_n and every Ψ ∈ P_1(Y), we have that K(x, λ, Ψ) is symmetric, positive semi-definite on R^n, and positive definite on R^n_0. Since K is continuous with respect to (x, λ, Ψ), we deduce that there exists a positive constant c_4 = c_4(δ, R) ≤ c_2 such that inequality (69) holds for every (x, λ) ∈ B^Y_{R,δ} and every Ψ ∈ P(B^Y_R). Since G is the inverse of K on R^n_0, (67) and (69) imply (66) and (68). The Lipschitz continuity (iv) of G(x, ·, Ψ) in Λ^δ_n follows from the regularity of K(x, ·, Ψ), from (i)–(iii), and from the identity on R^n_0. As for (v), we notice that for x ∈ B_R, Ψ ∈ P(B^Y_R), and λ ∈ Λ^δ_n, the ratio λ_h/σ_h(x, Ψ) is bounded from below and from above by δ and by 1/η, respectively. Since the function a → a ln a is locally Lipschitz continuous in (0, +∞), we have that there exists L = L(δ, R) > 0 such that (71) holds.
Finally, we note that the function a → a ln a belongs to W 1, p ([0, A]) for every p ∈ [1, +∞) and every A < +∞. In view of (i), for every x ∈ B R , every Ψ ∈ P(B Y R ), and every λ ∈ Λ n , the ratio λ h /σ h (x, Ψ ) is bounded above by 1/η. Hence, by Sobolev embedding in dimension one we infer that for every α ∈ (0, 1) there exists C = C(α, R) > 0 such that (72) holds.
Before stating the main properties of the distance d (x,Ψ ) , we define, for every x ∈ R d , every Ψ ∈ P 1 (Y ), and every λ, λ 1 , λ 2 ∈ Λ n , the norm which is well defined in view of (66) and (68).
(ii) there exist two positive constants m_2 = m_2(δ, R) and m_3 = m_3(δ, R) such that for every x ∈ B_R, every Ψ ∈ P(B^Y_R), and every λ_1, λ_2 ∈ Λ^δ_n; (iii) there exists a positive constant m_4 = m_4(δ, R) such that for every x ∈ B_R, every Ψ ∈ P(B^Y_R), and every λ_1, λ_2 ∈ Λ^δ_n satisfying the closeness condition, we have the corresponding bound. Remark 4. The constant m_1(R) can be assumed to be decreasing with respect to R, while m_2(δ, R), m_3(δ, R), and m_4(δ, R) can be assumed to be increasing with respect to R.
Repeating step by step the proofs of Lemmas 1 and 2, we obtain the following uniform estimate on $\Psi_k(t)$, $\overline{\Psi}_k(t)$, and $\widetilde{\Psi}_k(t)$.

Lemma 5.
Let $\Psi \in \mathcal{P}_c(Y)$. Then there exists an increasing continuous function $R : [0, +\infty) \to [0, +\infty)$ such that for every $T \in [0, +\infty)$, every $k \in \mathbb{N}$, and every $t \in [0, T]$, the measures $\Psi_k(t)$, $\overline{\Psi}_k(t)$, and $\widetilde{\Psi}_k(t)$ are supported in $B^Y_{R(T)}$.

Proof. The statement follows from the arguments of Lemmas 1 and 2. In particular, we gave there an explicit formula for $R(T)$ as a function of $T \in [0, +\infty)$, which turns out to be continuous and increasing.
Also in the current setting, we need to write an approximate Euler-Lagrange equation associated with (82). This is done in Proposition 5, whose proof requires the following lemma.
Lemma 6. Let $f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$ be a convex function, let $A \in M_N$ be a symmetric and positive definite matrix, and let $\|\cdot\|_A : \mathbb{R}^N \to [0, +\infty)$ be the norm associated with $A$, namely $\|\xi\|_A^2 := \langle A\xi, \xi\rangle$ for all $\xi \in \mathbb{R}^N$. For a fixed $\zeta \in \mathbb{R}^N$ and $c > 0$, assume that $\xi_0$ solves (83). Then $\xi_0$ also solves (84).

Proof. It is enough to observe that problem (84) can be equivalently rewritten as a convex minimization problem. Since $\xi_0$ solves (83), we have $-2cA(\xi_0 - \zeta) \in \partial f(\xi_0)$, which is exactly the Euler condition for this problem.
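The optimality condition in the proof of Lemma 6 can be sanity-checked numerically. A minimal sketch, assuming problem (83) has the proximal form $\min_\xi f(\xi) + c\,\|\xi - \zeta\|_A^2$ (which is what the Euler condition $-2cA(\xi_0 - \zeta) \in \partial f(\xi_0)$ suggests), with the smooth convex choice $f(\xi) = |\xi|^2$, for which the minimizer is available in closed form; the data $A$, $\zeta$, $c$ below are arbitrary test values:

```python
import numpy as np

rng = np.random.default_rng(0)
N, c = 4, 0.7

# Hypothetical data: a symmetric positive definite matrix A and a point zeta.
M = rng.standard_normal((N, N))
A = M @ M.T + N * np.eye(N)
zeta = rng.standard_normal(N)

# With f(xi) = |xi|^2, the first-order condition of
#   min_xi  f(xi) + c <A(xi - zeta), xi - zeta>
# reads 2 xi + 2 c A (xi - zeta) = 0, i.e. (I + c A) xi = c A zeta.
xi0 = np.linalg.solve(np.eye(N) + c * A, c * A @ zeta)

# Euler condition from the proof of Lemma 6:
#   -2 c A (xi0 - zeta) ∈ ∂f(xi0) = {2 xi0}   (f is differentiable here).
lhs = -2 * c * A @ (xi0 - zeta)
rhs = 2 * xi0
assert np.allclose(lhs, rhs)
```

For a nonsmooth convex $f$, the same condition holds with the gradient replaced by an element of the subdifferential $\partial f(\xi_0)$, as in the statement.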
We finally conclude with the main result of this section.
Theorem 3. Let $r > 0$, $\eta > \delta > 0$, and $\overline{\Psi} \in \mathcal{P}(B^Y_{r,\eta})$. Then there exists $T_f > 0$ such that the sequence of curves $\Psi_k : [0, T_f] \to \mathcal{P}_1(Y)$ converges to the unique solution $\Psi \in C([0, T_f]; \mathcal{P}_1(Y))$ of (7) in $W_1$, uniformly with respect to $t \in [0, T_f]$.

Proof. Let $T_f > 0$ be as in Lemma 7, so that the curves $\Psi_k$, $\overline{\Psi}_k$, and $\widetilde{\Psi}_k$ are well defined in the interval $[0, T_f]$ and (105) holds. Since the operator $\mathcal{T}_\Psi(x, \lambda) := Q(x, \Psi)\lambda$ satisfies properties (T0)-(T3), we only have to check that the sequence $\Psi_k$ is compact in $C([0, T_f]; \mathcal{P}_1(Y))$. The rest of the proof works as for Theorem 1, with the obvious modifications due to the fact that the remainder $\theta_k$ in Proposition 6 is now controlled by $\tau_k^{1/4}$ and not by $\tau_k$. In view of Lemma 7, it is enough to show that $\Psi_k$ is equi-Lipschitz with respect to $W_1$. Let us fix $k \in \mathbb{N}$, $i \in \mathbb{N}$ such that $i\tau_k \le T_f$, and $s \le t$ with $s, t \in [t^k_i, t^k_{i+1}]$, and let $R := R(T_f)$. Then, by $(v_2)$, Lemma 7, and Proposition 5, we get the desired estimate, where the constants $L_{E,\delta,R}$ and $m_1 = m_1(R)$ have been determined in Lemmas 3 and 4, respectively, and are independent of $k$, $i$, and $t$.

Concluding remarks
In this paper, we have proposed a multi-step Lagrangian scheme for spatially inhomogeneous evolutionary games. The scheme is fully explicit, traces the evolution of positions and labels along the characteristics, and consists of two approximation steps: in the first step, the agents update their beliefs on the strategy; in the second step, they update their position accordingly. Theorem 1 in Sect. 3 provides a general convergence result for the proposed scheme. Theorem 2 in Sect. 4 and Theorem 3 in Sect. 5 deal with the special cases of inhomogeneous replicator dynamics and reversible Markov chains, respectively. Unlike Theorem 1, they concern a variant of the scheme in which the explicit step for the update of the strategies is replaced by an implicit one, based on a minimizing movement justified by the gradient-flow structure of the problems at hand.
We notice that we presented the approximation steps in their natural order: agents tend to update their information on the environment before changing their position. Performing the updates in the reverse order would lead to the same convergence result. The partial update in (11) would have to be modified accordingly, namely by placing the identity map in the second component and using the updated lifted map $X^k_{i+1}$ in the first component (where now $x$ evolves with the "old" label/strategy distribution).
Using an adaptive time step, chosen on the basis of the rate of change of positions and labels/strategies, deserves further analysis. We foresee that the same convergence result given by Theorem 1 holds, provided that the choice of the time steps is such that
$$\lim_{k \to \infty}\ \max_{i \in \{0, \dots, k-1\}} \big(t^k_{i+1} - t^k_i\big) = 0,$$
where we notice that $t^k_{i+1} - t^k_i = \tau_k$ is constant in our work.
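The vanishing-mesh condition above can be illustrated with a small sketch (hypothetical code, not taken from the paper): local steps shrink where a surrogate rate of change is large, while a cap of $T/k$ on every step guarantees $\max_i (t^k_{i+1} - t^k_i) \to 0$ as $k \to \infty$. The function `rate` and all constants are illustrative.

```python
import numpy as np

def adaptive_partition(T, k, rate):
    """Hypothetical adaptive partition of [0, T]: the local step is scaled down
    by 1 / (1 + rate(t)) but never exceeds T / k, so the mesh size vanishes
    as k -> infinity regardless of the rate function."""
    t, times = 0.0, [0.0]
    while t < T:
        dt = min((T / k) / (1.0 + rate(t)), T - t)
        t += dt
        times.append(t)
    return np.array(times)

T = 1.0
rate = lambda t: 5.0 * np.exp(-3.0 * t)  # fast dynamics near t = 0, slow later

meshes = []
for k in (10, 100, 1000):
    times = adaptive_partition(T, k, rate)
    meshes.append(np.max(np.diff(times)))

# The mesh is capped by T / k, so max_i (t^k_{i+1} - t^k_i) -> 0 as k grows.
assert all(m <= T / k + 1e-12 for m, k in zip(meshes, (10, 100, 1000)))
assert meshes[0] > meshes[1] > meshes[2]
```

With a uniform step, as in the paper, the mesh is exactly $\tau_k = T/k$; the adaptive variant only refines it further where the dynamics are fast.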