1 Introduction

The capability of changing strategy as an adaptive response to modifications of the surrounding environment, in order to maximize a certain payoff, is of paramount importance in decision-making processes. Replicator-type models [21] are a particular class of dynamical models that feature this adaptivity and are well suited for studying the evolution of strategies according to their success: Given a pool of strategies, the occurrence of each of them evolves according to its performance with respect to all the others; in this way, if a strategy gives a payoff which is higher than the average of all strategies, it is enhanced; otherwise, it is suppressed. In the basic replicator model, this criterion is the only one that determines the evolution of the occurrence of the strategies, which is in fact independent of all other factors, in particular of the position of the agents that play those strategies. In many cases this is a reasonable, and not even restrictive, assumption. For example, in a financial scenario, the set of (pure) strategies U contains the financial products available to an investor. Any combination of them, that is, a portfolio, is called a mixed strategy: In a discrete setting such as this one, it corresponds to the fraction of the capital invested in each of the different financial products. Adapting the strategy means allocating resources differently according to the evolution of the market, and the investor's location when making this decision is unlikely to affect the reward of the portfolio. On the contrary, when the position influences the outcome, the system is more involved, as more feedback is available, and the adaptive optimization process relies on the mutual influence of position and strategy performance. We call such systems spatially inhomogeneous and make them the focus of this paper.

1.1 Overview of the problem and state of the art

The basic, spatially homogeneous, replicator equation of [21] can be enriched to include a spatial dependence of the payoff function: The idea is that the same strategy adopted in two different places might yield different rewards, precisely depending on the environment. Therefore, in order to maximize the payoff, players can not only adapt their strategies, but also change their position, seeking the highest possible payoff. Spatially inhomogeneous evolutionary games, introduced in [5], provide a general mathematical framework for the evolution of a distribution of players with their (distributions of) strategies: A space-dependent replicator equation governs the evolution of the distribution \(\lambda \in {\mathcal {P}}(U)\) of the strategies \(u\in U\), while the evolution of the spatial variable \(x\in {\mathbb {R}}^{d}\) is determined by \(\lambda \).

In the subsequent contribution [31], this approach was extended into an abstract toolbox capable of rigorously describing the mean-field limit of a larger class of models sharing the following features:

  • a multi-agent dynamics in which every agent is characterized by a label \(u\in U\) (accounting for different strategies or different populations to which each individual belongs);

  • exchange rates among the labels which are stochastic in nature and, therefore, are described by the evolution of a probability measure \(\lambda \in {\mathcal {P}}(U)\).

Several other models, besides the replicator dynamics mentioned above, are included in this class. The multi-label setting can be effectively used to describe situations in which the action of every individual is weighted differently according to the species it belongs to [3, 4, 16, 17, 19]. In the theory of mean-field games or in optimal control theory, labelling is used to distinguish informed agents in the evacuation of unknown environments, to highlight the influence of key investors in the stock market or of strong leaders in opinion formation [11, 13, 18, 39]. The addition of source and sink terms in the spirit of [34] and of label switching [38] can be successfully dealt with in this class of models. Relevant applications where label switching may occur come, for instance, from chemical reaction networks, where a particle may change its type as a result of the interaction with the others [27, 32, 33]; also in social dynamics, loss or gain of opinion leadership over time is a natural postulate, as it happens in [18, Section 3.b].

The framework proposed in [31] couples a nonlinear transport dynamics for the positions \(x\in {\mathbb {R}}^{d}\) of the agents with a Markov-type jump process for the labels \(\lambda \in {\mathcal {P}}(U)\) (see Sect. 2). The mean-field limit of the model was proved to be a nonlinear continuity equation of the form

$$\begin{aligned} \partial _t\varPsi _t+\mathrm {div}(b_{\varPsi _t}\varPsi _t)=0 \, \end{aligned}$$
(1)

in the space of probability measures over the pairs \((x,\lambda )\in {\mathbb {R}}^d\times {\mathcal {P}}(U)\), driven by a velocity field \(b_{\varPsi }(x,\lambda )\) depending on the global state of the system \(\varPsi \in {\mathcal {P}}({\mathbb {R}}^d\times {\mathcal {P}}(U))\). These equations belong to a general class which is of great interest in the mathematical community [6, Chapter 8] and can be studied with either a Lagrangian or a Eulerian approach. On the one hand, the nonlinear continuity equation expresses the Eulerian point of view, tracing the evolution of the global state \(\varPsi \). On the other hand, a notion of solution can also be provided by the Lagrangian point of view, tracing the characteristics, which are, in our case, solutions to an ODE in a suitably constructed Banach space.

Given an initial datum \({\widehat{\varPsi }}\), a solution \(t\mapsto \varPsi _t\) of the initial value problem for the nonlinear continuity equation is called a Eulerian solution, whereas a curve \(t\mapsto \varPsi _t\) obtained via the push-forward of \({\widehat{\varPsi }}\) through the flow map associated with the ODE

$$\begin{aligned} (\dot{x},{\dot{\lambda }})=b_{\varPsi _t}(x,\lambda ) \end{aligned}$$
(2)

is called a Lagrangian solution. Since Lagrangian solutions are also Eulerian solutions, the equivalence of the two notions follows if one is able to prove that Eulerian solutions are also Lagrangian. For the model studied in [31], and also for other relevant ones [14], these two notions of solution are indeed equivalent. This has been achieved by means of the superposition principle (see [36], and also [6, Theorem 8.2.1], [8, Theorem 7.1], and [5, Theorem 5.2]), which in turn yields the uniqueness of Eulerian solutions [5, Theorem 5.3]. Furthermore, the Lagrangian formulation has been used to propose discretization schemes to solve the nonlinear PDE numerically [15, 25, 26, 29, 35].

Moreover, the Lagrangian point of view has been used in [5] to provide a heuristic derivation of the nonlinear continuity equation arising as the mean-field limit of the spatially inhomogeneous replicator dynamics. Let us briefly discuss this derivation. Denoting by \(h=T/N\) the time step, if an agent at time \(t=ih\), for \(i\in \{0,\ldots ,N-1\}\), is in the position x with mixed strategy \(\lambda \), first they optimize the strategy distribution following a homogeneous replicator dynamics of the form

$$\begin{aligned} \lambda ':=\lambda +h{\mathcal {T}}_{\varPsi _{t}}(x,\lambda ) \,. \end{aligned}$$
(3)

Here, \({\mathcal {T}}_{\varPsi _t}(x,\lambda )\) is the payoff operator determining the enhancement or suppression of the strategies; it depends on the random state \((x,\lambda )\) and also on the current distribution \(\varPsi _t\). In the setting of [5], the operator \({\mathcal {T}}\) is quadratic in \(\lambda \). After updating the strategy portfolio, the agent updates its position x to

$$\begin{aligned} x':=x+h v(x,u), \end{aligned}$$
(4)

choosing u with probability \(\lambda '\). The above two equations completely determine the conditional probability of having an agent in a state \((x',\lambda ')\) at time \(t+h\) given the distribution \(\varPsi _t\). Equivalently, the new distribution \(\varPsi _{t+h}\) can be defined via duality by

$$\begin{aligned} \begin{aligned} \int _{{\mathbb {R}}^d\times {\mathcal {P}}(U)}&\phi (x',\lambda ')\,\mathrm {d}\varPsi _{t+h}(x',\lambda ')\\&=\int _{{\mathbb {R}}^d\times {\mathcal {P}}(U)} \!\! \bigg (\int _U \phi (x+hv(x,u),\lambda +h{\mathcal {T}}_{\varPsi _t}(x,\lambda ))\,\mathrm {d}\lambda '(u) \bigg )\mathrm {d}\varPsi _t(x,\lambda ) \end{aligned} \end{aligned}$$

where \(\phi :{\mathbb {R}}^d\times {\mathcal {P}}(U)\rightarrow {\mathbb {R}}\) is of class \(C^1\). By a formal first-order Taylor expansion, we have

$$\begin{aligned} \begin{aligned} \int _{{\mathbb {R}}^d\times {\mathcal {P}}(U)}&\phi (x',\lambda ')\,\mathrm {d}\varPsi _{t+h}(x',\lambda ') \\&= \int _{{\mathbb {R}}^d\times {\mathcal {P}}(U)} \big [\phi (x,\lambda )+h \nabla \phi (x,\lambda )\cdot b_{\varPsi _t}(x,\lambda )\big ]\,\mathrm {d}\varPsi _t(x,\lambda )+o(h), \end{aligned} \end{aligned}$$

where

$$\begin{aligned} b_{\varPsi _t}(x,\lambda )=\left( \begin{array}{c} \displaystyle \int _U v(x,u)\,\mathrm {d}\lambda (u) \\ {\mathcal {T}}_{\varPsi _t} (x,\lambda ) \end{array}\right) . \end{aligned}$$

In the formal limit for \(h\rightarrow 0\), we obtain the weak formulation of the nonlinear continuity equation (1). A related heuristic derivation has been outlined also in [2, Remark 4.1], in the context of a leader–follower dynamics which also fits in the setting of [31]. In this case, the \({\mathbb {R}}^d\)-component of \(b_\varPsi \) also depends on \(\varPsi \), whereas the \(\lambda \)-component acts linearly on \(\lambda \), modelling a Markov chain on U.
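For the reader's convenience, the heuristic updates (3) and (4) can also be tested numerically. The following minimal Python sketch is our own illustration: it treats a single agent with a finite strategy set, a hypothetical position-dependent payoff `payoff`, and hypothetical pure-strategy velocities `v`; the mean-field dependence of \({\mathcal {T}}_{\varPsi _t}\) on \(\varPsi _t\) is suppressed for brevity.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def replicator_field(lam, f):
    """Replicator-type payoff operator on a finite strategy set: strategies
    whose payoff f exceeds the lam-average are enhanced, the others suppressed."""
    return (f - f @ lam) * lam

def heuristic_step(x, lam, h, payoff, v):
    """One step of the heuristic derivation: label update (3), then position
    update (4) with a pure strategy u sampled from the updated mixture."""
    lam_new = lam + h * replicator_field(lam, payoff(x))
    u = rng.choice(lam_new.size, p=lam_new / lam_new.sum())  # u ~ lam'
    return x + h * v(x, u), lam_new

# hypothetical toy data: three strategies on the real line
payoff = lambda x: np.array([-x**2, x, 1.0])
v = lambda x, u: (-1.0, 0.0, 1.0)[u]
x, lam = 0.5, np.ones(3) / 3
for _ in range(200):
    x, lam = heuristic_step(x, lam, h=0.01, payoff=payoff, v=v)
```

Note that the replicator field sums to zero over the strategies, so the total mass of \(\lambda \) is preserved at each step, consistently with assumption \(({\mathcal {T}}_0)\) below.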

We point out that in this paper we neglect diffusive terms, as a general existence theory is still missing. Even in the mean-field-games literature, well-posedness results have been shown only for so-called potential games [24]. For these, numerical solutions have been proposed, see, e.g., [1], which are based on the iterative solution of a backward–forward system. In our setting, a first step towards including diffusion has been made in [10], where an entropic regularization of the spatially inhomogeneous replicator dynamics (see also Sect. 4) has been considered.

1.2 Results of this paper

The main objective of this paper is to present a rigorous proof of the formal derivation described above, by means of a multi-step Lagrangian scheme. The scheme we propose is suitable for approximating all equations in the class considered in [31] (we refer to Sect. 2 for the precise details). The method is a Lagrangian one as it is based on the approximation of the ODE (2), and it is multi-step because the updates of x and \(\lambda \) do not happen simultaneously, but follow the heuristics described above. Indeed, first we make an incremental step in \(\lambda \) and then use the updated \(\lambda '\) to make the incremental step in x.

Since the velocity field b depends explicitly on \(\varPsi \), at each incremental step the updates of x, \(\lambda \), and of the distribution \(\varPsi \) involve the following substeps, which are the rigorous formalization of the heuristics discussed above. To be precise,

  • first we update \(\lambda \) to \(\lambda '\) in the spirit of (3) (see (8));

  • then we transport \(\lambda '\) to the state of the system \({{\widetilde{\varPsi }}}\) (see (11)). This amounts to assuming that all the agents know the optimal label distribution \(\lambda '\) of the other agents;

  • then we update the positions x to \(x'\) in the spirit of (4) where the velocity field depends on \({{\widetilde{\varPsi }}}\) (see (12)). Notice that, in our general framework, the velocity field depends on \(\varPsi \) and this makes the previous step necessary;

  • finally, we update the global distribution to \(\varPsi '\), taking both \(x'\) and \(\lambda '\) into account (see (15)); a schematic implementation of these substeps is sketched right after this list.
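The following minimal Python sketch (our own illustration, not taken from [31]) makes the substeps concrete for an empirical measure of N agents with a finite label set of n elements; the callables `T_op` and `v_field` are hypothetical placeholders for the operators \({\mathcal {T}}\) and v, each receiving the evaluation points together with the ensemble carrying the relevant measure.

```python
import numpy as np

def multistep_update(X, L, tau, T_op, v_field):
    """One incremental step of the multi-step scheme on the empirical
    measure Psi = (1/N) sum_j delta_{(X[j], L[j])}.
    X: (N, d) positions; L: (N, n) mixed labels (rows on the simplex)."""
    # substep 1: explicit update of the labels; T_op sees the current ensemble
    L_new = L + tau * T_op(X, L, X, L)
    # substep 2: transport the updated labels to the state of the system:
    # the intermediate measure Psi~ is carried by the pairs (X, L_new)
    X_ens, L_ens = X, L_new
    # substep 3: update the positions with the velocity field driven by Psi~
    X_new = X + tau * v_field(X, L_new, X_ens, L_ens)
    # substep 4: the updated measure Psi' is carried by (X_new, L_new)
    return X_new, L_new
```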

Our first main result is Theorem 1 in Sect. 3 on the convergence of the scheme presented above.

In Sects. 4 and 5, we turn our attention to the case of the inhomogeneous replicator dynamics considered in [5] and to the leader–follower-type dynamics of [31, Section 5.1], respectively. More in general, for the second case, we assume that \({\mathcal {T}}_\varPsi (x,\lambda )\) is a Markov chain on a finite space of an arbitrary number n of labels.

In the spatially homogeneous case, that is, when there is no x-dependence in the vector field b, the evolution of the \(\lambda \)-component is, in both situations, a gradient flow of a suitable energy with respect to a certain metric structure, and the solution can be approximated via a minimizing movement scheme [6, 22]. The spatially homogeneous replicator equation is a gradient flow with respect to the spherical Hellinger distance (36) of probability measures. (This could be obtained, for instance, for a proper choice of f in [23, formula (1.8)].) The spatially homogeneous Markov-type jump processes are the gradient flow of an entropy-like energy penalized by a distance induced by the transition matrix [28, 30].

We investigate the compliance of these structures with our algorithm. More precisely, we elaborate an implicit–explicit scheme where the explicit step (3) is replaced by a minimizing movement step suggested by the aforementioned gradient flow structure (see (39) and (82), respectively). A relevant difficulty in the spatially inhomogeneous setting is that the energy and the dissipation distances that we consider may themselves depend on the state \(\varPsi \), which changes from step to step. This extension is far from trivial and requires a careful analysis of the related Euler conditions, which is partially inspired by [20, Section 4.2] for the case of the replicator dynamics. This is done in Propositions 3 and 5, respectively, where we show that the deviation from the explicit scheme is uniformly controlled by the vanishing time step.
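To fix ideas, a single minimizing movement step on the simplex can be sketched as follows. This is our own illustration only: the plain Hellinger distance stands in for the spherical Hellinger distance (36), and a hypothetical linear energy replaces the energy (39).

```python
import numpy as np
from scipy.optimize import minimize

def hellinger(p, q):
    """Hellinger distance between discrete probability vectors."""
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def minimizing_movement_step(lam, tau, energy):
    """One implicit step: minimize  energy(q) + hellinger(lam, q)^2 / (2 tau)
    over the probability simplex, parametrized by a softmax so that the
    iterate remains a probability vector."""
    def objective(theta):
        q = np.exp(theta - theta.max()); q /= q.sum()
        return energy(q) + hellinger(lam, q) ** 2 / (2 * tau)
    res = minimize(objective, x0=np.log(lam + 1e-12), method="Nelder-Mead")
    q = np.exp(res.x - res.x.max())
    return q / q.sum()

# hypothetical linear payoff energy E(q) = -<f, q>
f = np.array([0.1, 0.5, 0.2])
lam = minimizing_movement_step(np.ones(3) / 3, tau=0.1, energy=lambda q: -f @ q)
```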

The two main results of Sects. 4 and 5 are given by Theorems 2 and 3, proving the convergence of our multi-step Lagrangian scheme to the unique solution to (1). In particular, Theorem 2 is a global-in-time convergence result for the spatially inhomogeneous replicator dynamics, whereas Theorem 3 provides a short-time existence result for a well-prepared initial datum for spatially inhomogeneous Markov-type jump processes.

The paper is structured as follows: In Sect. 2, we introduce the structural assumptions on the systems that we consider. In Sect. 3, we describe the multi-step Lagrangian scheme, which we apply to the inhomogeneous replicator dynamics in Sect. 4 and to the inhomogeneous Markov-type jump processes in Sect. 5.

2 The mathematical setting

2.1 Basic notation

Given a metric space \((X,{\mathsf {d}}_X)\), we denote by \({\mathcal {M}}(X)\) the space of signed Borel measures \(\mu \) in X with finite total variation \(\Vert \mu \Vert _{\mathrm {TV}}\), by \({\mathcal {M}}_+(X)\) and \({\mathcal {P}}(X)\) the convex subsets of nonnegative measures and probability measures, respectively. We say that \(\mu \in {\mathcal {P}}_c(X)\) if \(\mu \in {\mathcal {P}}(X)\) and the support \(\mathrm {spt}\, \mu \) of \(\mu \) is a compact subset of X. Moreover, for \(K \subseteq X\) we will use the notation \({\mathcal {P}}(K)\) to indicate the set of measures \(\mu \in {\mathcal {P}}(X)\) such that \(\mathrm {spt}\, \mu \subseteq K\).

As usual, if \((Z, {\mathsf {d}}_{Z})\) is another metric space, for every \(\mu \in {\mathcal {M}}_+(X)\) and every \(\mu \)-measurable function \(f:X\rightarrow Z\), we define the push-forward measure \(f_\#\mu \in {\mathcal {M}}_+(Z)\) by \((f_\#\mu )(B):=\mu (f^{-1}(B))\) for any Borel set \(B\subset Z\). The push-forward measure has the same total mass as \(\mu \), namely \(\mu (X)=(f_\#\mu )(Z)\).
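For discrete measures, the push-forward simply relocates the atoms while keeping their weights, as the following short illustration (ours) shows:

```python
import numpy as np

# mu = sum_i w[i] * delta_{x[i]};  f_# mu = sum_i w[i] * delta_{f(x[i])}
x = np.array([0.0, 1.0, 2.0])      # atoms of mu
w = np.array([0.2, 0.3, 0.5])      # weights of mu (total mass 1)
f = lambda t: t ** 2               # a Borel map f : R -> R
atoms, weights = f(x), w           # f_# mu has the same weights at new atoms
assert weights.sum() == w.sum()    # mass conservation: mu(X) = (f_# mu)(Z)
```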

For a Lipschitz function \(f:X\rightarrow {\mathbb {R}}\), we define the Lipschitz constant

$$\begin{aligned} \mathrm {Lip}(f):=\sup _{x,y\in X \atop x\ne y}\frac{|f(x)-f(y)|}{{\mathsf {d}}_X(x,y)} \end{aligned}$$

We denote by \(\mathrm {Lip}(X)\) and \(\mathrm {Lip}_b(X)\) the spaces of Lipschitz and bounded Lipschitz functions on X, respectively. Both are normed spaces with the norm \(\Vert f\Vert _{\mathrm {Lip}} :=\Vert f\Vert _\infty + \mathrm {Lip}(f)\), where \(\Vert \cdot \Vert _\infty \) is the supremum norm. Furthermore, we use the notation \(\mathrm {Lip}_{1}(X)\) for the set of functions \(f \in \mathrm {Lip}_{b} (X)\) such that \(\mathrm {Lip}(f) \le 1\).

In a complete and separable metric space \((X,{\mathsf {d}}_X)\), we shall use the Kantorovich–Rubinstein distance \(W_1\) in the class \({\mathcal {P}}(X)\), defined as

$$\begin{aligned} W_1(\mu ,\nu ):=\sup \bigg \{\int _X\varphi \,\mathrm {d}\mu -\int _X\varphi \,\mathrm {d}\nu : \varphi \in \mathrm {Lip}_1(X) \bigg \} \,. \end{aligned}$$

Notice that \(W_1(\mu ,\nu )\) is finite if \(\mu \) and \(\nu \) belong to the space

$$\begin{aligned} {\mathcal {P}}_1(X):=\bigg \{\mu \in {\mathcal {P}}(X): \int _X {\mathsf {d}}_X(x,{\bar{x}})\,\mathrm {d}\mu (x)<+\infty \text { for some } {\bar{x}}\in X\bigg \} \end{aligned}$$

and that \(({\mathcal {P}}_1(X),W_1)\) is complete if \((X,{\mathsf {d}}_X)\) is complete.
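On the real line, \(W_1\) reduces to the \(L^1\) distance between cumulative distribution functions, which is what `scipy.stats.wasserstein_distance` implements; the following snippet (an illustration of ours for the special case \(X = {\mathbb {R}}\)) evaluates it for two discrete probability measures.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# W1 between mu = 0.5 delta_0 + 0.5 delta_1 and nu = 0.4 delta_{0.25} + 0.6 delta_{1.5}
mu_atoms, mu_w = np.array([0.0, 1.0]), np.array([0.5, 0.5])
nu_atoms, nu_w = np.array([0.25, 1.5]), np.array([0.4, 0.6])
print(wasserstein_distance(mu_atoms, nu_atoms, mu_w, nu_w))
```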

If \((E, \Vert \cdot \Vert _{E})\) is a Banach space and \(\mu \in {\mathcal {M}}_+(E)\), we define the first moment \(m_1(\mu )\) as

$$\begin{aligned} m_1(\mu ):=\int _{E} \Vert x \Vert _E\,\mathrm {d}\mu \,. \end{aligned}$$

Notice that, for a probability measure \(\mu \), finiteness of the above integral is equivalent to \(\mu \in {\mathcal {P}}_1(E)\), whenever E is endowed with the distance induced by the norm \(\Vert \cdot \Vert _{E}\).

For a Banach space E, the notation \(C^1_b(E)\) will be used to denote the subspace of \(C_b(E)\) of functions having bounded continuous Fréchet differential at each point. The notation \(\nabla \phi (\cdot )\) will be used to denote the Fréchet differential. In the case of a function \(\phi :[0,T]\times E \rightarrow {\mathbb {R}}\), the symbol \(\partial _t\) will be used to denote partial differentiation with respect to t. The symbol \(\langle \cdot , \cdot \rangle \) will be used to denote duality products, with no further specification if the meaning is clear from the context.

2.2 Functional setting

We consider a set of pure strategies U, where U is a compact metric space, and we denote by \(Y:={\mathbb {R}}^{d} \times {\mathcal {P}}(U)\) the state space of the system. Precisely, for every \(y= (x, \lambda ) \in Y\), the component \(x\in {\mathbb {R}}^{d}\) describes the location of an agent in space, whereas the component \(\lambda \in {\mathcal {P}}(U)\) describes the distribution of labels of the agent.

The correct functional space for the dynamics (see also [5, 31]) is the space \({\overline{Y}}:={\mathbb {R}}^{d} \times {\mathcal {F}}(U)\), where we have set (see, e.g., [7, 9] and [40, Chapter 3])

$$\begin{aligned} {\mathcal {F}}(U) :=\overline{ \mathrm {span}( {\mathcal {P}}(U) ) }^{\Vert \cdot \Vert _{\mathrm {BL}}} \subseteq (\mathrm {Lip}(U))'. \end{aligned}$$
(5)

The closure in (5) is taken with respect to the bounded Lipschitz norm \(\Vert \cdot \Vert _{\mathrm {BL}}\), defined as

$$\begin{aligned} \Vert \mu \Vert _{\mathrm {BL}}:=\sup \big \{\langle \mu ,\varphi \rangle : \varphi \in \mathrm {Lip}(U), \Vert \varphi \Vert _{\mathrm {Lip}}\le 1\big \} \qquad \text {for every }\mu \in (\mathrm {Lip}(U))'\,. \end{aligned}$$

We notice that, by definition of \(\Vert \cdot \Vert _{\mathrm {BL}}\), we always have

$$\begin{aligned} \Vert \mu \Vert _{\mathrm {BL}} \le \Vert \mu \Vert _{\mathrm {TV}} \qquad \text {for every }\mu \in {\mathcal {M}}(U)\,. \end{aligned}$$

In particular, \(\Vert \lambda \Vert _{\mathrm {BL}} \le 1\) for every \(\lambda \in {\mathcal {P}}(U)\).
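For a signed measure supported on finitely many points, \(\Vert \mu \Vert _{\mathrm {BL}}\) is the value of a linear program directly encoding the constraint \(\Vert \varphi \Vert _\infty + \mathrm {Lip}(\varphi ) \le 1\). The following sketch is ours (the helper `bl_norm` and the toy measure are purely illustrative) and also checks numerically that \(\Vert \mu \Vert _{\mathrm {BL}} \le \Vert \mu \Vert _{\mathrm {TV}}\).

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

def bl_norm(weights, points):
    """||mu||_BL of mu = sum_i weights[i] * delta_{points[i]} on a finite
    subset of R. Variables: the values phi_i = phi(points[i]), a sup-norm
    bound m, and a Lipschitz bound s, subject to m + s <= 1."""
    n = len(weights)
    rows, rhs = [], []
    for i, j in combinations(range(n), 2):        # |phi_i - phi_j| <= s * d_ij
        d = abs(points[i] - points[j])
        r1 = np.zeros(n + 2); r1[i], r1[j], r1[n + 1] = 1.0, -1.0, -d
        r2 = np.zeros(n + 2); r2[i], r2[j], r2[n + 1] = -1.0, 1.0, -d
        rows += [r1, r2]; rhs += [0.0, 0.0]
    for i in range(n):                            # |phi_i| <= m
        r1 = np.zeros(n + 2); r1[i], r1[n] = 1.0, -1.0
        r2 = np.zeros(n + 2); r2[i], r2[n] = -1.0, -1.0
        rows += [r1, r2]; rhs += [0.0, 0.0]
    r = np.zeros(n + 2); r[n], r[n + 1] = 1.0, 1.0   # m + s <= 1
    rows.append(r); rhs.append(1.0)
    c = np.concatenate([-weights, [0.0, 0.0]])       # maximize <mu, phi>
    res = linprog(c, A_ub=np.vstack(rows), b_ub=rhs,
                  bounds=[(None, None)] * n + [(0, None)] * 2)
    return -res.fun

mu_w, pts = np.array([1.0, -1.0]), np.array([0.0, 0.1])  # mu = delta_0 - delta_0.1
print(bl_norm(mu_w, pts))   # ~0.095 = 2d/(2+d) with d = 0.1, well below ||mu||_TV = 2
```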

We endow \({\overline{Y}}\) with the norm

$$\begin{aligned} \Vert y\Vert _{{{\overline{Y}}}}=\Vert (x,\lambda )\Vert _{{{\overline{Y}}}}:=|x |+\Vert \lambda \Vert _{\mathrm {BL}} \,. \end{aligned}$$

For every \(R>0\), we denote by \(\mathrm {B}_R\) the closed ball of radius R in \({\mathbb {R}}^{d}\) and by \(\mathrm {B}_R^Y\) the ball of radius R in Y, namely \(\mathrm {B}_R^Y=\{y\in Y:\Vert y\Vert _{{{\overline{Y}}}}\le R\}\). We notice that \(\mathrm {B}^{Y}_{R}\) is a compact set, as Y is locally compact by our assumptions on U.

As in [31], we consider, for every \(\varPsi \in {\mathcal {P}}_{1} (Y)\), the velocity field \(v_{\varPsi } :Y \rightarrow {\mathbb {R}}^{d}\) such that

(\(v_1\)):

for every \(R>0\), \(v_{\varPsi } \in \mathrm {Lip} (\mathrm {B}^{Y}_{R}; {\mathbb {R}}^{d})\) uniformly with respect to \(\varPsi \in {\mathcal {P}}( \mathrm {B}^{Y}_{R})\), i.e., there exists \(L_{v, R}>0\) such that

$$\begin{aligned} | v_{\varPsi } (y_{1}) - v_{\varPsi }(y_{2}) | \le L_{v, R} \Vert y_{1} - y_{2} \Vert _{{\overline{Y}}} \qquad \text {for every }y_{1}, y_{2} \in \mathrm {B}^{Y}_{R}; \end{aligned}$$
(\(v_2\)):

for every \(R>0\) there exists \(L_{v, R}>0\) such that for every \(\varPsi _{1}, \varPsi _{2} \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\) and every \(y \in \mathrm {B}^{Y}_{R}\)

$$\begin{aligned} | v_{\varPsi _{1}} (y) - v_{\varPsi _{2}} (y) | \le L_{v, R} W_{1} (\varPsi _{1}, \varPsi _{2})\,; \end{aligned}$$
(\(v_3\)):

there exists \(M_{v}>0\) such that for every \(y \in Y\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\)

$$\begin{aligned} | v_{\varPsi } (y) | \le M_{v} \big ( 1 + \Vert y \Vert _{{\overline{Y}}} + m_{1} ( \varPsi ) \big ) \,. \end{aligned}$$

As for \({\mathcal {T}}\), for every \(\varPsi \in {\mathcal {P}}_{1} (Y)\) we assume that the operator \({\mathcal {T}}_{\varPsi } :Y \rightarrow {\mathcal {F}}(U)\) is such that

(\({\mathcal {T}}_0\)):

for every \((y , \varPsi ) \in Y \times {\mathcal {P}}_{1}( Y )\), the constants belong to the kernel of \({\mathcal {T}}_{\varPsi }(y)\), i.e.,

$$\begin{aligned} \left\langle {\mathcal {T}}_{\varPsi }(y), 1 \right\rangle _{{\mathcal {F}}(U), \mathrm {Lip}(U)} = 0; \end{aligned}$$
(\({\mathcal {T}}_1\)):

there exists \(M_{{\mathcal {T}}}>0\) such that for every \(y \in Y\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\)

$$\begin{aligned} \Vert {\mathcal {T}}_{\varPsi }(y) \Vert _{\mathrm {BL}} \le M_{{\mathcal {T}}} \big ( 1 + \Vert y \Vert _{{\overline{Y}}} + m_{1} ( \varPsi ) \big ); \end{aligned}$$
(\({\mathcal {T}}_2\)):

for every \(R>0\), there exists \(L_{{\mathcal {T}}, R}>0\) such that for every \((y_{1}, \varPsi _{1}), (y_{2}, \varPsi _{2}) \in \mathrm {B}^{Y}_{R} \times {\mathcal {P}}(\mathrm {B}^{Y}_{R})\)

$$\begin{aligned} \Vert {\mathcal {T}}_{\varPsi _{1}} ( y_{1} ) - {\mathcal {T}}_{\varPsi _{2}} (y_{2} ) \Vert _{\mathrm {BL}} \le L_{{\mathcal {T}}, R} \big ( \Vert y_{1} - y_{2} \Vert _{{\overline{Y}}} + W_{1} (\varPsi _{1}, \varPsi _{2}) \big ); \end{aligned}$$
(\({\mathcal {T}}_3\)):

for every \(R>0\) there exists \(\delta _{R}>0\) such that for every \((y, \varPsi ) \in \mathrm {B}^{Y}_{R} \times {\mathcal {P}}_{1}(Y)\) we have

$$\begin{aligned} {\mathcal {T}}_{\varPsi } (y) + \delta _{R} \lambda \ge 0. \end{aligned}$$

Finally, for every \(y \in Y\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\) we set

$$\begin{aligned} b_{\varPsi }(y) :=\left( \begin{array}{c} \displaystyle v_{\varPsi }(y) \\ \displaystyle {\mathcal {T}}_{\varPsi } (y) \end{array}\right) , \end{aligned}$$
(6)

which is the velocity field driving the evolution (see (7)).

3 The multi-step Lagrangian scheme

Let \(\widehat{\varPsi } \in {\mathcal {P}}_{c}(Y)\) be a probability measure on Y with compact support in Y. Given \(T>0\), for every \(k\in {\mathbb {N}}\setminus \{0\}\) we set \(\tau _k :=T/k\) and, for \(i \in \{0, \ldots , k\}\), \(t^{k}_{i} :=i\tau _{k}\).

We now show how to construct a curve \(\varPsi ^{k} :[0,T] \rightarrow {\mathcal {P}}_{1}(Y)\), defined piecewise on each time interval \([t_{i}^{k},t_{i+1}^{k}]\), which approximates a solution \(\varPsi \in C([0,T]; {\mathcal {P}}_{1}(Y))\) of the initial value problem for the nonlinear continuity equation

$$\begin{aligned} \partial _{t} \varPsi _{t} + \mathrm {div}( b_{\varPsi _{t}} \varPsi _{t}) = 0, \qquad \varPsi _{0} = \widehat{\varPsi }. \end{aligned}$$
(7)

Let \(\varPsi ^{k}_{0} :=\widehat{\varPsi }\). In each interval \([t^{k}_{i}, t^{k}_{i+1})\), assume the measure \(\varPsi ^{k}_{i} \in {\mathcal {P}}_{1}(Y)\) to be known. With this knowledge, we update the state of the system with the following procedure, consisting of two steps.

Step 1. We update the label \(\lambda _{({\hat{x}},{\hat{\lambda }})}(t) \in {\mathcal {P}}(U)\) of a player that at time \(t^{k}_{i}\) sits in \({\hat{x}} \in {\mathbb {R}}^{d}\) with label \({\hat{\lambda }} \in {\mathcal {P}}(U)\) by setting

$$\begin{aligned} \lambda _{({\hat{x}}, {\hat{\lambda }})}(t^k_{i+1}):=\lambda _{({\hat{x}},{\hat{\lambda }})}(t^k_i)+\tau _k {\mathcal {T}}_{\varPsi _i^k}\big ({\hat{x}},\lambda _{({\hat{x}},{\hat{\lambda }})}(t^k_i)\big )\,. \end{aligned}$$
(8)

At this stage, we assume that \(\lambda _{(\hat{x}, \hat{\lambda })} (t^{k}_{i+1}) \in {\mathcal {P}}(U)\) and we continue with the construction of the piecewise affine interpolant between \(\lambda _{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i})\) and \(\lambda _{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i+1})\), defined as the function \(\lambda ^{k}_{({\hat{x}},{\hat{\lambda }}),i+1} :[t^{k}_{i}, t^{k}_{i+1}] \rightarrow {\mathcal {P}}(U)\) such that

$$\begin{aligned} \lambda ^{k}_{({\hat{x}},{\hat{\lambda }}),i+1}(t):=\frac{t-t_{i}^{k}}{\tau _k} \lambda _{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i+1})+ \bigg (1-\frac{t-t_{i}^{k}}{\tau _k}\bigg ) \lambda _{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i})\,. \end{aligned}$$
(9)

In Lemma 1, we show that the assumption \(\lambda _{(\hat{x}, \hat{\lambda })} (t^{k}_{i+1}) \in {\mathcal {P}}(U)\) is actually satisfied for k large enough (and therefore \(\tau _{k}\) small enough), independently of \(i=0, \ldots , k-1\). Taking Lemma 1 for granted for the time being, we define the map \(\varLambda ^{k}_{i+1} :[t^{k}_{i}, t^{k}_{i+1}] \times {\mathbb {R}}^{d} \times {\mathcal {P}}(U) \rightarrow {\mathcal {P}}(U)\) as

$$\begin{aligned} \!\! \varLambda ^{k}_{i+1} (t, {\hat{x}}, {\hat{\lambda }}) :=\lambda ^{k}_{({\hat{x}},{\hat{\lambda }}),i+1} (t) \qquad \text {for every }(t,{\hat{x}}, {\hat{\lambda }}) \in [t^{k}_{i}, t^{k}_{i+1} ] \times {\mathbb {R}}^{d} \times {\mathcal {P}}(U),\nonumber \\ \end{aligned}$$
(10)

and transport it to the state of the system by defining

$$\begin{aligned} {\widetilde{\varPsi }}^{k}_{i+1} :=( id ; \varLambda ^{k}_{i+1} (t^{k}_{i+1}, \cdot , \cdot ))_{\#} \varPsi ^{k}_{i} \in {\mathcal {P}}_{1}(Y)\,. \end{aligned}$$
(11)

Step 2. In the second step, we update the positions of the players. Precisely, a player that at time \(t^{k}_{i}\) sits in the position \({\hat{x}}\) with label \({\hat{\lambda }}\) will now move following the velocity field given by \(v_{{\widetilde{\varPsi }}^{k}_{i+1}}\big ( x_{({\hat{x}},{\hat{\lambda }})}(t_{i}^{k}), \lambda ^{k}_{({\hat{x}},{\hat{\lambda }}),i+1}(t^{k}_{i+1}) \big )\), which is determined by the updated label \(\lambda ^{k}_{({\hat{x}},{\hat{\lambda }}),i+1}(t^{k}_{i+1})\) just obtained in (8). Hence, we set

$$\begin{aligned} x_{({\hat{x}},{\hat{\lambda }})}(t_{i+1}^{k}) :=x_{({\hat{x}},{\hat{\lambda }})}(t_{i}^{k})+\tau _k v_{{\widetilde{\varPsi }}^{k}_{i+1}} \big ( x_{({\hat{x}},{\hat{\lambda }})}(t_{i}^{k}), \lambda ^{k}_{({\hat{x}},{\hat{\lambda }}),i+1}(t^{k}_{i+1}) \big ) \,. \end{aligned}$$
(12)

Also in this case, we can define the affine interpolant between \(x_{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i})\) and \(x_{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i+1})\), as a function \(x^{k}_{({\hat{x}},{\hat{\lambda }}),i+1} :[t^{k}_{i}, t^{k}_{i+1}] \rightarrow {\mathbb {R}}^{d}\), by

$$\begin{aligned} x^{k}_{({\hat{x}},{\hat{\lambda }}),i+1}(t):=\frac{t-t_{i}^{k}}{\tau _k} x_{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i+1})+ \bigg (1-\frac{t-t_{i}^{k}}{\tau _k}\bigg ) x_{({\hat{x}},{\hat{\lambda }})}(t^{k}_{i}). \end{aligned}$$
(13)

We notice that (13), in contrast to (9), is always well defined, since \({\mathbb {R}}^{d}\) is a convex space and the velocity field is an element of \({\mathbb {R}}^{d}\).

Eventually, we define the map \(X^{k}_{i+1} :[t^{k}_{i}, t^{k}_{i+1}] \times {\mathbb {R}}^{d} \times {\mathcal {P}}(U) \rightarrow {\mathbb {R}}^{d}\) as

$$\begin{aligned} \! X^{k}_{i+1} (t,{\hat{x}}, {\hat{\lambda }}) :=x^{k}_{({\hat{x}},{\hat{\lambda }}),i+1} (t) \quad \text {for every } (t, {\hat{x}}, {\hat{\lambda }}) \in [t^{k}_{i}, t^{k}_{i+1}] \times {\mathbb {R}}^{d} \times {\mathcal {P}}(U) \end{aligned}$$
(14)

and we set, for \(t\in [t_{i}^k,t_{i+1}^k]\),

$$\begin{aligned} \varPsi ^{k} (t) :=\Big ( X^{k}_{i+1} (t, \cdot , \cdot ) ; \varLambda ^{k}_{i+1} (t, \cdot , \cdot ) \Big )_{\#} \varPsi ^{k}_{i}\,, \qquad \varPsi ^{k}_{i+1} :=\varPsi ^{k} (t^{k}_{i+1})\,. \end{aligned}$$
(15)

For later use, we also define

$$\begin{aligned}&{\widetilde{\varPsi }}^{k}(t) :={\widetilde{\varPsi }}^{k}_{i+1} \qquad \text {for every }t\in (t^{k}_{i}, t^{k}_{i+1}]\,, \end{aligned}$$
(16)
$$\begin{aligned}&{\underline{\varPsi }}^{k}(t) :=\varPsi ^{k}_{i} \qquad \text {for every }t\in [t^{k}_{i}, t^{k}_{i+1})\,. \end{aligned}$$
(17)
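Before establishing the well-posedness of this construction, we illustrate it with a schematic Python driver. This is our own illustration: `T_op` and `v_field` are hypothetical callables with the same signatures as in the one-step sketch of Sect. 1.2, and the loop iterates that update over the whole time grid, adding the interpolants (9) and (13).

```python
import numpy as np

def run_scheme(X0, L0, T, k, T_op, v_field):
    """Iterate Steps 1 and 2 over the grid t_i = i * tau_k, tau_k = T / k,
    returning the particle arrays carrying Psi^k_i, i = 0, ..., k, of (15).
    Passing (X, L) to v_field after the label update encodes the
    intermediate measure Psi~ of (11)."""
    X, L, tau = X0.copy(), L0.copy(), T / k
    nodes = [(X.copy(), L.copy())]
    for _ in range(k):
        L = L + tau * T_op(X, L, X, L)       # Step 1, cf. (8)
        X = X + tau * v_field(X, L, X, L)    # Step 2, cf. (11)-(12)
        nodes.append((X.copy(), L.copy()))
    return nodes

def interpolated_state(nodes, T, t):
    """Evaluate the piecewise affine interpolants (9) and (13), i.e. the
    particle representation of Psi^k(t) in (15), for t in [0, T]."""
    k = len(nodes) - 1
    tau = T / k
    i = min(int(t / tau), k - 1)
    s = (t - i * tau) / tau
    (Xa, La), (Xb, Lb) = nodes[i], nodes[i + 1]
    return (1 - s) * Xa + s * Xb, (1 - s) * La + s * Lb
```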

By an application of the Gronwall inequality, in the following lemma we give an estimate of \(\Big | x^{k}_{({\hat{x}},{\hat{\lambda }}),i+1} (t)\Big |\) and \(\Big \Vert \lambda ^{k}_{({\hat{x}},{\hat{\lambda }}),i+1}( t)\Big \Vert _{\mathrm {BL}}\) in terms of \(|{\hat{x}} |\) and \(\Vert {\hat{\lambda }} \Vert _{\mathrm {BL}}\). As a consequence, we deduce that the above construction is well defined for \(\tau _k\) sufficiently small and can be iterated over \(i= 0, \ldots , k-1\), since the initial condition \(\widehat{\varPsi }\) has compact support in Y. This indeed implies that each \(\varPsi ^{k}_{i}\) belongs to \({\mathcal {P}}_{c}(Y) \subseteq {\mathcal {P}}_{1}(Y)\).

Lemma 1

Let \(\widehat{\varPsi } \in {\mathcal {P}}_{c}(Y)\). Then, for k large enough the curves \(\varPsi ^{k}(\cdot )\), \({\underline{\varPsi }}^{k}(\cdot )\), and \({\widetilde{\varPsi }}^{k}(\cdot )\) are well defined on [0, T] with values in \({\mathcal {P}}_{1}(Y)\). Furthermore, there exists \(R>0\) independent of k and t such that \(\varPsi ^{k}(t), {\underline{\varPsi }}^{k}(t), {\widetilde{\varPsi }}^{k}(t)\in {\mathcal {P}}( \mathrm {B}^{Y}_{R})\).

Proof

Along the proof of the lemma, we denote with \(\lambda ^{k}(t, x_{0}, \lambda _{0})\) and \(x^{k}(t, x_{0}, \lambda _{0})\), for \((x_{0}, \lambda _{0}) \in \mathrm {spt}\widehat{\varPsi } = : {\mathcal {S}}\), the curves obtained by iteratively solving the difference equations (8) and (12) in each interval \([t^{k}_{i}, t^{k}_{i+1}]\) starting from \((x_{0}, \lambda _{0})\) at time \(t_0=0\) and using, at each node \(t^{k}_{i}\), \(i= 1, \ldots , k-1\), \({\hat{\lambda }} = \lambda ^{k}(t^{k}_{i}, x_{0}, \lambda _{0})\) and \({\hat{x}} = x^{k}(t^{k}_{i}, x_{0}, \lambda _{0})\) as new initial conditions.

As we have already noticed above, the curve \(x^{k}(t, x_{0}, \lambda _{0})\) is well defined as long as \(\lambda ^{k}(t, x_{0}, \lambda _{0})\) and the measures \({\widetilde{\varPsi }}^{k}_{i}\) are. Therefore, in order to prove the lemma it is enough to show that, for \(\tau _{k}\) small enough, for every \((x_0, \lambda _0) \in \mathrm {spt}\, {\widehat{\varPsi }}\) the piecewise linear interpolant \(\lambda ^{k} (t, x_{0}, \lambda _{0})\) always belongs to \({\mathcal {P}}(U)\). This can be done recursively by arguing on each interval \([t^{k}_{i}, t^{k}_{i+1}]\)\(i=0, \ldots , k-1\).

To simplify our estimates, we define the piecewise constant interpolation functions

$$\begin{aligned} \begin{aligned}&{\underline{x}}^{k} (t, x_{0}, \lambda _{0}) :=x^{k} (t^{k}_{j}, x_{0}, \lambda _{0})\,, \;\; {\underline{\lambda }}^{k} (t, x_{0}, \lambda _{0}) :=\lambda ^{k}(t^{k}_{j}, x_{0}, \lambda _{0})&\text {for }t \in [t^{k}_{j}, t^{k}_{j+1})\,,\\&{\overline{\lambda }}^{k}(t, x_{0}, \lambda _{0}) :=\lambda ^{k}(t^{k}_{j+1}, x_{0}, \lambda _{0})&\text {for }t \in (t^{k}_{j}, t^{k}_{j+1}]\,. \end{aligned}\nonumber \\ \end{aligned}$$
(18)

For \(i=0\), the initial condition satisfies \(\lambda _{0} \in {\mathcal {P}}(U)\); hence, there is nothing to show. Assuming that \(\lambda ^{k} (t^{k}_{j}, x_{0}, \lambda _{0}) \in {\mathcal {P}}(U)\) for every \(j =0, \ldots , i\) and every \((x_0, \lambda _0) \in \mathrm {spt}\, \widehat{\varPsi }\), we show that \(\lambda ^{k} (t^{k}_{i+1}, x_{0}, \lambda _{0}) \in {\mathcal {P}}(U)\) for k large enough, independently of i and of the initial condition \((x_0, \lambda _0)\). Recalling (8) and (9), we define

$$\begin{aligned} \lambda ^{k}( t , x_0 , \lambda _{0}) :={\underline{\lambda }}^{k} (t, x_{0}, \lambda _{0}) + (t - t^{k}_{i}) {\mathcal {T}}_{{{\underline{\varPsi }}}^{k}( t ) }({\underline{x}}^{k}(t, x_{0}, \lambda _{0}), {\underline{\lambda }}^{k}(t , x_{0}, \lambda _{0})) \end{aligned}$$

for \(t \in [t^{k}_{i}, t^{k}_{i+1}]\). By assumptions \(({\mathcal {T}}_{0})\) and \(({\mathcal {T}}_{3})\), we are led to showing that the piecewise constant interpolation functions \({\underline{x}}^{k}( t, x_{0}, \lambda _{0})\) and \({\underline{\lambda }}^{k}( t, x_{0},\lambda _{0})\) are bounded in \({\mathbb {R}}^{d}\) and \({\mathcal {F}}(U)\), respectively, uniformly with respect to \((x_{0}, \lambda _{0}) \in {\mathcal {S}}\) and \(t \in [0,t^{k}_{i+1}]\), and that the bound does not depend on i. Indeed, if this is the case, let \(R' > 0\) be such that \(({\underline{x}}^{k}(t, x_{0}, \lambda _{0}) , {\underline{\lambda }}^{k} ( t, x_{0}, \lambda _{0})) \in \mathrm {B}^{Y}_{R'}\) for every \(t \in [0, t^{k}_{i+1}]\) and every \((x_{0}, \lambda _{0}) \in {\mathcal {S}}\). In particular, by the construction (17) of \({{\underline{\varPsi }}}^{k}(t)\), we have \({{\underline{\varPsi }}}^{k}(t) \in {\mathcal {P}}(\mathrm {B}^{Y}_{R'})\). By \(({\mathcal {T}}_3)\), there exists \(\delta _{R'}>0\), independent of k, i, and \((x_0, \lambda _0) \in {\mathcal {S}}\), such that for \(t \in [t^{k}_{i}, t^{k}_{i+1}]\)

$$\begin{aligned} \lambda _{R' } :=\frac{1}{\delta _{R'}} {\mathcal {T}}_{{\underline{\varPsi }}^{k}(t)} \big ( {\underline{x}}^{k}(t, x_{0}, \lambda _{0}), {\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0}) \big ) + {\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0}) \ge 0\,. \end{aligned}$$

In particular, assumption \(({\mathcal {T}}_1)\) implies that \(\lambda _{R'} \in {\mathcal {F}}(U)\) and satisfies

$$\begin{aligned} \big | \left\langle \lambda _{R'}, \eta \right\rangle _{{\mathcal {F}}(U), \mathrm {Lip}(U)} \big | \le \Vert \eta \Vert _{\infty } \Vert \lambda _{R'} \Vert _{\mathrm {BL}}\,, \end{aligned}$$

so that \(\lambda _{R'}\) can be extended in a unique way to a linear and continuous operator on C(U). The Riesz representation theorem yields that \(\lambda _{R'} \in {\mathcal {M}}_{+}(U)\). Moreover, by \(({\mathcal {T}}_0)\) we get

$$\begin{aligned} \left\langle \lambda _{R'}, 1 \right\rangle _{{\mathcal {F}}(U), \mathrm {Lip}(U)} = \big \langle {\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0}), 1 \big \rangle _{{\mathcal {F}}(U), \mathrm {Lip}(U)} = 1, \end{aligned}$$

which implies \(\lambda _{R'} \in {\mathcal {P}}(U)\). By the convexity of \({\mathcal {P}}(U)\) we deduce that whenever \(\tau _{k} \le 1/ \delta _{R'}\)

$$\begin{aligned} \lambda ^{k}(t, x_{0}, \lambda _{0}) = {{\underline{\lambda }}}^{k}(t, x_{0}, \lambda _{0}) + ( t - t^{k}_{i}) {\mathcal {T}}_{{\underline{\varPsi }}^{k}(t^{k}_{i})} ( {\underline{x}}^{k}( t, x_{0}, \lambda _{0}), {\underline{\lambda }}^{k}( t, x_{0}, \lambda _{0})) \in {\mathcal {P}}(U) \end{aligned}$$

for every \(t \in [t^{k}_{i}, t^{k}_{i+1}]\). Since the upper bound \(R'\) is independent of i and of \((x_{0}, \lambda _{0}) \in {\mathcal {S}}\), so is \(\delta _{R'}\). Hence, the trajectories \(x^{k}(\cdot , x_0, \lambda _0)\) and \(\lambda ^{k}(\cdot , x_0, \lambda _0)\) are well defined on [0, T] with values in \({\mathbb {R}}^{d}\) and \({\mathcal {P}}(U)\), respectively.

In order to conclude that the interpolation curves \(x^{k} (t, x_{0}, \lambda _{0})\) and \(\lambda ^{k}(t, x_{0}, \lambda _{0})\) are well defined, we have to estimate \(|{\underline{x}}^{k}(t, x_{0}, \lambda _{0})|\) and \(\Vert {\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0})\Vert _{\mathrm {BL}}\) for \((x_0, \lambda _{0}) \in {\mathcal {S}}\). Since we are assuming that \(\lambda ^{k}(t^{k}_{j}, x_{0}, \lambda _{0}) \in {\mathcal {P}}(U)\) for \(j \in \{0, \ldots , i\}\), we have that \(\Vert \lambda ^{k}(t^{k}_{j}, x_{0}, \lambda _{0})\Vert _{\mathrm {BL}} \le 1\), and the same holds for \(\Vert {\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0}) \Vert _{\mathrm {BL}}\). As for \({\underline{x}}^{k}(t, x_{0}, \lambda _{0})\), using (12) and \((v_3)\) we get

$$\begin{aligned} | {\underline{x}}^{k}(t, x_{0}, \lambda _{0})|&\le | x_{0} | + \int _{0}^{t^{k}_{i}} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big ( {\underline{x}}^{k}(\tau , x_{0}, \lambda _{0}), {\overline{\lambda }}^{k}( \tau , x_{0}, \lambda _{0}) \big ) \big | \, \mathrm {d}\tau \\&\le |x_{0}| + \int _{0}^{t} M_{v} \Big ( 3 + 2 \sup _{( \hat{x}, \hat{\lambda }) \in {\mathcal {S}}} \, | {\underline{x}}^{k}( \tau , \hat{x}, \hat{\lambda })| \Big ) \, \mathrm {d}\tau \,. \nonumber \end{aligned}$$
(19)

Let us now fix \(r>0\) such that \({\mathcal {S}} \subseteq \mathrm {B}^{Y}_{r}\) and let

$$\begin{aligned} f_{k}( t ) :=\sup _{(\hat{x}, \hat{\lambda } ) \in {\mathcal {S}}} \, | {\underline{x}}^{k}( t, \hat{x}, {\hat{\lambda }} ) | \,. \end{aligned}$$

By taking the supremum over \({\mathcal {S}}\) in (19), we deduce that

$$\begin{aligned} f_{k}(t) \le r + \int _{0}^{t} 3 M_{v} (1 + f_{k}(\tau ))\, \mathrm {d}\tau \,. \end{aligned}$$
(20)

Applying the Gronwall inequality to (20), we infer that

$$\begin{aligned} f_{k}(t) \le (r + 3 M_{v} T) e^{3 M_{v} T} \,. \end{aligned}$$
(21)

Setting \(R' :=1 + (r + 3 M_{v} T) e^{3 M_{v} T}\), we have proved that the piecewise constant interpolation function \(t \mapsto ({\underline{x}}^{k}(t, x_{0}, \lambda _{0}), {\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0}))\) belongs to \(\mathrm {B}^{Y}_{R'}\) for every \(t \in [t^{k}_{i}, t^{k}_{i+1})\) and every \((x_{0}, \lambda _{0}) \in {\mathcal {S}}\). In particular, we notice that the above computations are independent of the choice of i, as long as we know that \(\lambda ^{k}(t^{k}_{j}, x_{0}, \lambda _{0}) \in {\mathcal {P}}(U)\) for every \(j=0, \ldots , i\) and every \((x_{0}, \lambda _{0}) \in {\mathcal {S}}\). With this control at hand, we conclude, as explained above, that (8) and (12) are well posed.

Finally, we estimate \(x^{k}(t, x_{0}, \lambda _{0})\). For \((x_{0}, \lambda _{0}) \in \mathrm {spt}\widehat{\varPsi }\) and \(t \in [0,T]\), by \((v_{3})\) we have

$$\begin{aligned} \begin{aligned} | x^{k}(t, x_{0}, \lambda _{0}) |&\le | x_{0} | + \int _{0}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big ( {\underline{x}}^{k}(\tau , x_{0}, \lambda _{0} ), {\overline{\lambda }}^{k}(\tau , x_{0}, \lambda _{0}) \big ) \big | \, \mathrm {d}\tau \\&\le r + 2M_{v}( 1 + R' ) T\,. \end{aligned} \end{aligned}$$
(22)

Setting \(R :=\max \{R', r + 2M_{v} (1 + R')T + 1 \}\), we obtain that \(\varPsi ^{k}(t), {\widetilde{\varPsi }}^{k}(t), {\underline{\varPsi }}^{k} (t) \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\) for every \(t \in [0,T]\) and every \(k \in {\mathbb {N}}\) large enough. \(\square \)

In the next proposition, we show that the curve \(\varPsi ^{k}(\cdot )\) solves the continuity equation (7) up to an error of order \(\tau _{k}\).

Proposition 1

Let \(\widehat{\varPsi } \in {\mathcal {P}}_{c}(Y)\), let \(\varPsi ^{k} :[0,T] \rightarrow {\mathcal {P}}_{1}(Y)\) be the curve defined in (15) starting from \(\widehat{\varPsi }\), and let \({\widetilde{\varPsi }}^{k}\) be as in (16). Then, the following holds: There exists a positive constant C such that for every \(\varphi \in C_{b}^{1}( {\mathbb {R}}^{d} \times {\mathcal {F}}(U))\), every \(k \in {\mathbb {N}}\), every \(i \in \{0, \ldots , k-1\}\), and every \(t \in (t^{k}_{i}, t^{k}_{i+1})\),

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \int _{Y} \varphi (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t)(x, \lambda ) = \int _{Y} \nabla \varphi (x, \lambda ) \cdot b_{\varPsi ^{k}(t)} (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) + \vartheta _{k}(\varphi ),\nonumber \\ \end{aligned}$$
(23)

where \(|\vartheta _{k}(\varphi )| \le C \Vert \varphi \Vert _{C^{1}_{b}} \tau _{k}\).

Proof

Let us fix \(\varphi \in C_{b}^{1}({\mathbb {R}}^{d}\times {\mathcal {F}}(U))\) and \(t \in (t^{k}_{i}, t^{k}_{i+1})\). By definition of \(\varPsi ^{k}(t)\), we have that

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}&\displaystyle \int _{Y} \varphi (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t)(x, \lambda ) = \frac{\mathrm {d}}{\mathrm {d}t} \int _{Y} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ), \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \nonumber \\&= \displaystyle \int _{Y} \nabla _{x} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot v_{{\widetilde{\varPsi }}^{k} (t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \nonumber \\&\quad + \displaystyle \int _{Y} \nabla _{\lambda } \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot {\mathcal {T}}_{{\underline{\varPsi }}^{k}(t)} ( x, \lambda ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ), \end{aligned}$$
(24)

where \({\widetilde{\varPsi }}^{k}(t)\) and \({\underline{\varPsi }}^{k}(t)\) are defined in (16) and (17), respectively. In order to obtain (23) from (24), we have to estimate the following quantities:

$$\begin{aligned}&\displaystyle I_{1} (x, \lambda ) :=\Big | v_{{\widetilde{\varPsi }}^{k}(t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) - v_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t , x, \lambda ) \big ) \Big |\,, \\&\displaystyle I_{2} (x, \lambda ) :=\Big \Vert {\mathcal {T}}_{{\underline{\varPsi }}^{k}(t)} ( x, \lambda ) - {\mathcal {T}}_{\varPsi ^{k}(t)} (X^{k}_{i+1} (t, x, \lambda ) , \varLambda ^{k}_{i+1}(t, x, \lambda ) ) \Big \Vert _{\mathrm {BL}} \end{aligned}$$

for \((x, \lambda ) \in \mathrm {spt}\varPsi ^{k}_{i} \subseteq \mathrm {B}^{Y}_{R}\), where R has been determined in Lemma 1.

Let us start with \(I_{1}\). By the triangle inequality, we have

$$\begin{aligned} \begin{aligned} I_{1}&(x, \lambda ) \le \Big | v_{{\widetilde{\varPsi }}^{k}(t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) - v_{{\widetilde{\varPsi }}^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t, x, \lambda ) \big ) \Big | \\&\; + \Big | v_{{\widetilde{\varPsi }}^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t, x, \lambda ) \big ) - v_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t , x, \lambda ) \big ) \Big | \\&=: I_{1, 1}(x, \lambda ) + I_{1, 2} (x, \lambda ) \,. \end{aligned}\nonumber \\ \end{aligned}$$
(25)

Since \({\widetilde{\varPsi }}^{k} (t) \in {\mathcal {P}}( \mathrm {B}^{Y}_{R})\), hypothesis \((v_{1})\) implies that

$$\begin{aligned} \begin{aligned} I_{1,1}&(x, \lambda ) \le L_{v, R} \big ( | X^{k}_{i+1}(t, x, \lambda ) - x| + \Vert \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) - \varLambda ^{k}_{i+1} (t, x, \lambda ) \Vert _{\mathrm {BL}} \big ) \\&\le L_{v, R}\bigg ( \int _{t^{k}_{i}}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big (x, \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) \big ) \big | \, \mathrm {d}\tau + \int _{t}^{t^{k}_{i+1}} \big \Vert {\mathcal {T}}_{\varPsi ^{k}_{i}} ( x, \lambda ) \big \Vert _{\mathrm {BL}} \, \mathrm {d}\tau \bigg ) , \end{aligned} \end{aligned}$$

where, in the second inequality, we have used the systems (9) and (13). By \((v_3)\) and \(({\mathcal {T}}_{1})\), we can continue with

$$\begin{aligned} \begin{aligned} I_{1,1} (x, \lambda )&\le L_{v, R} \bigg ( M_{v} \int _{t^{k}_{i}}^{t} \big ( 1 + |x| + \Vert \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) \Vert _{\mathrm {BL}} + m_{1} ( {\widetilde{\varPsi }}^{k} (\tau )) \big ) \, \mathrm {d}\tau \\&\qquad + M_{{\mathcal {T}}} \int _{t}^{t^{k}_{i+1}} \big ( 1 + | x | + \Vert \lambda \Vert _{\mathrm {BL}} + m_{1} ( \varPsi ^{k}_{i} ) \big ) \, \mathrm {d}\tau \bigg ) \\&\le L_{v, R} (M_{v} + M_{{\mathcal {T}}} ) ( 1 + 2R) \tau _{k} \,. \end{aligned} \end{aligned}$$
(26)

As for \(I_{1,2}\), thanks to assumption \((v_2)\) and to Lemma 1 we get

$$\begin{aligned} I_{1,2}&(x, \lambda ) \le L_{v, R} W_{1} ( {\widetilde{\varPsi }}^{k}(t) , \varPsi ^{k}(t) ) \\&= L_{v, R} \, \sup _{\eta \in \mathrm {Lip}_{1} ( Y)} \bigg \{ \int _{Y} \eta (x' , \lambda ' ) \, \mathrm {d}({\widetilde{\varPsi }}^{k}(t) - \varPsi ^{k}(t) ) ( x' , \lambda ' ) \bigg \} \\&= L_{v, R} \, \sup _{\eta \in \mathrm {Lip}_{1} ( Y)} \bigg \{ \int _{Y} \big [\eta (x', \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' )) \\&\quad - \eta (X^{k}_{i+1} ( t, x' , \lambda ' ), \varLambda ^{k}_{i+1} (t, x' , \lambda ' ) )\big ] \, \mathrm {d}\varPsi ^{k}_{i} ( x' , \lambda ' ) \bigg \} \\&\le L_{v, R} \int _{Y} \big [ | x' - X^{k}_{i+1} ( t, x' , \lambda ' )| \\&\quad + \Vert \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) - \varLambda ^{k}_{i+1} (t, x' , \lambda ' ) \Vert _{\mathrm {BL}} \big ] \, \mathrm {d}\varPsi ^{k}_{i} ( x' , \lambda ' ) \\&\le L_{v, R} \int _{Y} \!\!\bigg ( \int _{t^{k}_{i}}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} ( x', \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) ) \big | \, \mathrm {d}\tau \\&\quad + \int _{t}^{t^{k}_{i+1}} \big \Vert {\mathcal {T}}_{\varPsi ^{k}_{i}} (x' , \lambda ' ) \big \Vert _{\mathrm {BL}} \, \mathrm {d}\tau \bigg ) \, \mathrm {d}\varPsi ^{k}_{i}( x' , \lambda ' ) \\&\le L_{v, R} \, \tau _{k} \int _{Y} \!\! \sup _{\tau \in [t^{k}_{i}, t^{k}_{i+1}]} \Big ( \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} ( x', \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) ) \big | + \big \Vert {\mathcal {T}}_{\varPsi ^{k}_{i}}( x' , \lambda ' ) \big \Vert _{\mathrm {BL}} \Big ) \, \mathrm {d}\varPsi ^{k}_{i}( x' , \lambda ' ) \,. \end{aligned}$$

Making use of \((v_3)\) and \(({\mathcal {T}}_1)\) and recalling Lemma 1, we can continue with

$$\begin{aligned} I_{1,2} (x, \lambda )\le & {} L_{v, R} (M_{v} + M_{{\mathcal {T}}}) \, \tau _{k}\int _{Y} \Big ( 1 + | x' | + \Vert \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x' , \lambda ' ) \Vert _{\mathrm {BL}} + \Vert \lambda ' \Vert _{\mathrm {BL}} \nonumber \\&+ m_{1} ({\widetilde{\varPsi }}^{k} (t) ) + m_{1}(\varPsi ^{k}_{i}) \Big ) \, \, \mathrm {d}\varPsi ^{k}_{i} ( x' , \lambda ' ) \nonumber \\\le & {} 3 L_{v, R}( M_{v} + M_{{\mathcal {T}}}) (1+R) \tau _{k}\,. \end{aligned}$$
(27)

Combining (25)–(27), we get

$$\begin{aligned} I_{1} (x, \lambda ) \le C_{1} \tau _{k} \end{aligned}$$
(28)

for some positive constant \(C_{1}\) independent of k, t, \(\varphi \), and \((x, \lambda ) \in \mathrm {spt}\varPsi ^{k}_{i}\).

Let us now estimate \(I_{2}\). By Lemma 1 and by assumption \(({\mathcal {T}}_2)\), we get

$$\begin{aligned} \begin{aligned} I_{2} (x, \lambda )&\le L_{{\mathcal {T}}, R} \big ( | x - X^{k}_{i+1} ( t, x, \lambda )| + \Vert \lambda - \varLambda ^{k}_{i+1} (t, x, \lambda ) \Vert _{\mathrm {BL}} \\&\quad + W_{1} ( {\underline{\varPsi }}^{k} (t) , \varPsi ^{k}(t))\big ) \,. \end{aligned} \end{aligned}$$
(29)

Arguing as in (25)–(28), we deduce from (29) and from the hypotheses \((v_{1})\), \((v_3)\), and \(({\mathcal {T}}_{2})\) that

$$\begin{aligned} I_{2}(x, \lambda ) \le C_{2}\tau _{k} \end{aligned}$$
(30)

for some positive constant \(C_{2}\) independent of k, t, \(\varphi \), and \((x, \lambda ) \in \mathrm {spt}\varPsi ^{k}_{i}\).

We are now in a position to conclude the proof of the proposition. We rewrite (24) as

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}&\int _{Y} \varphi (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t)(x, \lambda ) \\&= \int _{Y} \nabla _{x} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot v_{\varPsi ^{k} (t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&\quad + \int _{Y} \nabla _{\lambda } \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot {\mathcal {T}}_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1}( t, x, \lambda ), \varLambda ^{k}_{i+1} (t, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&\quad + \int _{Y} \nabla _{x} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ), \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot \big ( v_{{\widetilde{\varPsi }}^{k}(t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) - v_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t , x, \lambda ) \big ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} ( x, \lambda ) \\&\quad + \int _{Y} \nabla _{\lambda } \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot \big ( {\mathcal {T}}_{{\underline{\varPsi }}^{k}(t)} ( x, \lambda ) - {\mathcal {T}}_{\varPsi ^{k}(t)} \big (X^{k}_{i+1} (t, x, \lambda ) , \varLambda ^{k}_{i+1}(t, x, \lambda ) \big ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&= \int _{Y} \nabla \varphi (x, \lambda ) \cdot b_{\varPsi ^{k}(t)} (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) \\&\quad + \int _{Y} \nabla _{x} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot \big ( v_{{\widetilde{\varPsi }}^{k}(t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) - v_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t , x, \lambda ) \big ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} ( x, \lambda ) \\&\quad + \int _{Y} \nabla _{\lambda } \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot \big ( {\mathcal {T}}_{{\underline{\varPsi }}^{k}(t)} ( x, \lambda ) - {\mathcal {T}}_{\varPsi ^{k}(t)} \big (X^{k}_{i+1} (t, x, \lambda ) , \varLambda ^{k}_{i+1}(t, x, \lambda ) \big ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&=: \int _{Y} \nabla \varphi (x, \lambda ) \cdot b_{\varPsi ^{k}(t)} (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) + \vartheta _k(\varphi ) \,. \end{aligned}$$

We conclude by noticing that, thanks to (28) and (30), the term \(\vartheta _k(\varphi )\) above can be estimated by

$$\begin{aligned} {|\vartheta _k(\varphi )|} \le \Vert \varphi \Vert _{C^{1}_{b}} \int _{Y} ( I_{1} (x, \lambda ) + I_{2} (x, \lambda ) )\, \mathrm {d}\varPsi ^{k}_{i}(x, \lambda ) \le C \Vert \varphi \Vert _{C^{1}_{b}} \tau _{k}\,, \end{aligned}$$

for a positive constant C independent of k, t, and \(\varphi \). \(\square \)

Theorem 1

Let \(\widehat{\varPsi } \in {\mathcal {P}}_{c} (Y)\) and let \(\varPsi ^{k}( \cdot )\) be defined as in (15). Then,

$$\begin{aligned} {\lim _{k\rightarrow \infty }\, \sup _{t\in [0,T]} \, W_{1}(\varPsi ^{k}(t), \varPsi (t)) = 0,} \end{aligned}$$

where the curve \(\varPsi \in C([0,T]; ({\mathcal {P}}_{1}(Y), W_{1}))\) is the unique solution of (7) with initial condition \(\varPsi (0) = \widehat{\varPsi }\).

Proof

The existence and uniqueness of the solution to equation (7) follow from [31, Theorem 3.5], so that \(\varPsi \in C([0,T]; ( {\mathcal {P}}_{1}(Y), W_{1}))\) is well defined.

Let \(\phi \in C^{1}_{b} ([0,T] \times {\overline{Y}})\). In view of Proposition 1, for every \(k\in {\mathbb {N}}\), \(i\in \{0,\ldots ,k-1\}\), and every \(t \in (t^{k}_{i}, t^{k}_{i+1})\), we have

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \int _{Y} \phi (t, x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) =&\ \int _{Y} \partial _{t} \phi (t, x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) \\&+ \int _{Y} \nabla \phi (t, x, \lambda ) \cdot b_{\varPsi ^{k}(t)} (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) \\&+ \theta _{k} ( \phi (t, \cdot , \cdot )), \end{aligned} \end{aligned}$$

where \(|\theta _{k} ( \phi (t, \cdot , \cdot ))| \le C \tau _{k} \Vert \phi \Vert _{C^{1}_{b} ( [0,T]\times {\overline{Y}})}\) uniformly in [0, T]. By integrating the previous equality over time, we deduce that

$$\begin{aligned} \begin{aligned}&\int _{Y} \phi (t, x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) - \int _{Y} \phi (0, x, \lambda ) \, \mathrm {d}\widehat{\varPsi } (x, \lambda ) \\&\quad = \int _{0}^{t} \int _{Y} \partial _{t} \phi (\tau , x, \lambda ) \, \mathrm {d}\varPsi ^{k}(\tau ) (x, \lambda ) \, \mathrm {d}\tau \\&\qquad +\int _{0}^{t} \int _{Y} \nabla \phi (\tau , x, \lambda ) \cdot b_{\varPsi ^{k}(\tau )} (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(\tau ) (x, \lambda ) \, \mathrm {d}\tau \\&\qquad + \int _{0}^{t} \theta _{k} ( \phi (\tau , \cdot , \cdot ))\, \mathrm {d}\tau \,. \end{aligned} \end{aligned}$$
(31)

In order to pass to the limit in (31), we have to determine a candidate limit for \(\varPsi ^{k}(t)\). In Lemma 1, we have already shown that the supports of \(\varPsi ^{k}(t)\) are contained in a compact subset of \({\overline{Y}}\). We now show the equicontinuity of the sequence \(\varPsi ^{k}\) with respect to \(W_{1}\). Given \(s, t \in [0,T]\), we show that \(W_{1} ( \varPsi ^{k}(s), \varPsi ^{k}(t)) \le L | s - t|\) for some \(L>0\) independent of k. By the triangle inequality, it is enough to show it for \(s, t \in [t^{k}_{i}, t^{k}_{i+1}]\). Arguing as in the proof of Proposition 1, we obtain

$$\begin{aligned} W_{1}( \varPsi ^{k}(s) , \varPsi ^{k}(t)) \le 3 ( M_{v} + M_{{\mathcal {T}}}) (1+ R) |s-t|\,, \end{aligned}$$
(32)

where R has been defined in Lemma 1. Hence, the Ascoli–Arzelà theorem yields that there exists \({\overline{\varPsi }} \in C([0,T]; ( {\mathcal {P}}_{1} ( Y) , W_{1}))\) such that, up to a subsequence, \(W_{1} ( \varPsi ^{k}(t), {\overline{\varPsi }} (t)) \rightarrow 0 \) uniformly with respect to \(t \in [0,T]\). In particular, \({\overline{\varPsi }}(0) = \widehat{\varPsi }\) and \({\overline{\varPsi }}(t) \in {\mathcal {P}}( \mathrm {B}^{Y}_{R})\), since \(\mathrm {spt}\varPsi ^{k}(t) \subseteq \mathrm {B}^{Y}_{R}\) for every k and every t.

It remains to show that \({\overline{\varPsi }}\) is a solution to (7), from which we would deduce that \({\overline{\varPsi }} = \varPsi \) and that the whole sequence \(\varPsi ^{k}\) converges to \(\varPsi \). The first line of (31) passes to the limit as \(k\rightarrow \infty \), since the test function \(\phi \) belongs to \(C^{1}_{b}([0,T]\times {\overline{Y}})\) and the convergence of \(\varPsi ^{k}\) in \(W_{1}\) is uniform in time and implies narrow convergence. The last term on the right-hand side of (31) tends to 0, since

$$\begin{aligned} \int _{0}^{t} |\theta _{k} (\phi (\tau , \cdot , \cdot )) | \, \mathrm {d}\tau \le C T \tau _{k} \Vert \phi \Vert _{C^{1}_{b} ( [0,T]\times {\overline{Y}})}\,. \end{aligned}$$

We conclude by estimating

$$\begin{aligned} \begin{aligned} \bigg | \int _{0}^{t}&\!\! \int _{{\overline{Y}}} \!\! \nabla \phi (\tau , x, \lambda ) \cdot b_{ \varPsi ^{k} (\tau ) } (x, \lambda ) \mathrm {d}\varPsi ^{k} (\tau ) (x, \lambda ) \mathrm {d}\tau \\&\qquad - \! \int _{0}^{t} \!\! \int _{{\overline{Y}}}\!\! \nabla \phi ( \tau , x, \lambda ) \cdot b_{ {\overline{\varPsi }} (\tau ) } (x, \lambda ) \mathrm {d}{\overline{\varPsi }} (\tau ) (x, \lambda ) \mathrm {d}\tau \bigg | \\&\le \Vert \phi \Vert _{C^{1}_{b}} \int _{0}^{t} \int _{{{\overline{Y}}}} \Vert b_{\varPsi ^{k}(\tau )} ( x,\lambda ) - b_{{\overline{\varPsi }}(\tau )} ( x,\lambda ) \Vert _{{\overline{Y}}} \, \mathrm {d}\varPsi ^{k}(\tau ) (x, \lambda ) \, \mathrm {d}\tau \\&\qquad + \int _{0}^{t}\bigg | \int _{{\overline{Y}}} \nabla \phi ( \tau , x, \lambda ) \cdot b_{{\overline{\varPsi }}(\tau )} ( x, \lambda ) \, \mathrm {d}( \varPsi ^{k} ( \tau ) - {\overline{\varPsi }}(\tau )) ( x, \lambda ) \bigg | \, \mathrm {d}\tau \,. \end{aligned} \end{aligned}$$
(33)

By [31, Proposition 3.2], Lemma 1, and assumptions \((v_2)\) and \(({\mathcal {T}}_2)\), the first term on the right-hand side of (33) can be estimated by

$$\begin{aligned} \Vert \phi \Vert _{C^{1}_{b}} ( L_{v, R} + L_{{\mathcal {T}}, R}) \int _{0}^{t} W_{1} ( \varPsi ^{k}(\tau ), {\overline{\varPsi }}(\tau )) \, \mathrm {d}\tau \rightarrow 0 \qquad \end{aligned}$$

as \(k\rightarrow \infty \) uniformly with respect to \(t \in [0,T]\).

As for the second term, we first notice that, by [31, Proposition 3.2] and Lemma 1, the function \((x, \lambda ) \mapsto b_{{\overline{\varPsi }}(\tau )} ( x, \lambda )\) is continuous from \({\overline{Y}}\) to \({\overline{Y}}\) and is bounded on \(\mathrm {B}^{Y}_{R}\). Since \(\varPsi ^{k}(\tau )\) converges narrowly to \({\overline{\varPsi }}(\tau )\), for \(\tau \in [0,t]\) we have

$$\begin{aligned} \lim _{k\rightarrow \infty } \bigg | \int _{{\overline{Y}}} \nabla \phi ( \tau , x, \lambda ) \cdot b_{{\overline{\varPsi }}(\tau )} ( x, \lambda ) \, \mathrm {d}( \varPsi ^{k} ( \tau ) - {\overline{\varPsi }}(\tau )) ( x, \lambda ) \bigg | = 0\,. \end{aligned}$$

Furthermore, by \((v_3)\) and \(({\mathcal {T}}_1)\) we have the uniform bound

$$\begin{aligned} \bigg | \int _{{\overline{Y}}} \nabla&\phi ( \tau , x, \lambda ) \cdot b_{{\overline{\varPsi }}(\tau )} ( x, \lambda ) \, \mathrm {d}( \varPsi ^{k} ( \tau ) - {\overline{\varPsi }}(\tau )) ( x, \lambda ) \bigg | \\&\le 4 \Vert \phi \Vert _{C^{1}_{b}} (M_{v} + M_{{\mathcal {T}}}) ( 1 + R) \Vert \varPsi ^{k}(\tau ) - {\overline{\varPsi }}(\tau ) \Vert _{\mathrm {TV}} \\&\le 8 \Vert \phi \Vert _{C^{1}_{b}} (M_{v} + M_{{\mathcal {T}}}) ( 1 + R) \end{aligned}$$

for \(\tau \in [0,t]\). Thus, by dominated convergence also the second term on the right-hand side of (33) tends to zero as \(k\rightarrow \infty \).

Finally, we infer that, passing to the limit as \(k\rightarrow \infty \) in (31), we get the equality

$$\begin{aligned} \begin{aligned} \int _{Y} \phi (t, x, \lambda )&\, \mathrm {d}{\overline{\varPsi }}(t) (x, \lambda ) - \int _{Y} \phi (0, x, \lambda ) \, \mathrm {d}\widehat{\varPsi } (x, \lambda ) \\&= \int _{0}^{t} \int _{Y} \partial _{t} \phi (\tau , x, \lambda ) \, \mathrm {d}{\overline{\varPsi }} (\tau ) (x, \lambda ) \\&\quad +\int _{0}^{t} \int _{Y} \nabla \phi (\tau , x, \lambda ) \cdot b_{{\overline{\varPsi }}(\tau )} (x, \lambda ) \, \mathrm {d}{\overline{\varPsi }} ( \tau ) (x, \lambda ) \end{aligned} \end{aligned}$$

for every \(\phi \in C^{1}_{b} ([0,T]\times {\overline{Y}})\) and every \(t \in [0,T]\). This concludes the proof of the theorem. \(\square \)

4 An implicit–explicit scheme for the inhomogeneous replicator dynamics

We discuss in this section a different discrete-time approximation of the continuity equation (7) for the operator \({\mathcal {T}}_{\varPsi }:Y \rightarrow {\mathcal {F}}(U)\) corresponding to the transition operators considered in [5] (see also [31, Section 5]) for the replicator equation (first introduced in [37]; see also [21, 41]), namely

$$\begin{aligned} \begin{aligned} {\mathcal {T}}_{\varPsi } ( x, \lambda ) :=\biggl (&\int _{{\overline{Y}}} \int _{U} J(x, u, x', u') \, \mathrm {d}\lambda ' (u') \, \mathrm {d}\varPsi ( x', \lambda ' ) \\&- \int _{U} \int _{{\overline{Y}}} \int _{U} J(x, u, x', u') \, \mathrm {d}\lambda ' (u') \, \mathrm {d}\varPsi ( x', \lambda ' ) \, \mathrm {d}\lambda (u) \biggr ) \lambda \end{aligned} \end{aligned}$$
(34)

defined for every \(\varPsi \in {\mathcal {P}}_{1}(Y)\) and every \(y = (x, \lambda ) \in Y\). In (34) we consider a function \(J:({\mathbb {R}}^{d} \times U)^{2} \rightarrow {\mathbb {R}}\) such that

\((J_1)\):

J is locally Lipschitz continuous with respect to all of its variables;

\((J_2)\):

there exists \(M_{J}>0\) such that for every \((x, u, x', u') \in ({\mathbb {R}}^{d} \times U)^{2}\)

$$\begin{aligned} | J(x, u, x', u') | \le M_{J} ( 1 + | x| + | x' |) \,. \end{aligned}$$

For simplicity of notation, from now on we will write

$$\begin{aligned}&(J * \varPsi ) (x, u) :=\int _{{\overline{Y}}} \int _{U} J(x, u, x', u') \, \mathrm {d}\lambda ' (u') \, \mathrm {d}\varPsi ( x', \lambda ' )\,,\\&\left\langle J * \varPsi , \lambda \right\rangle (x) :=\int _{U} (J * \varPsi ) (x, u) \, \mathrm {d}\lambda (u)\,, \end{aligned}$$

so that (34) can be written as

$$\begin{aligned} {\mathcal {T}}_{\varPsi } (x, \lambda ) = \big ((J * \varPsi ) (x, \cdot ) - \left\langle J * \varPsi , \lambda \right\rangle (x) \big ) \lambda \,. \end{aligned}$$
(35)
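To fix ideas, when U is a finite set and \(\varPsi \) is an empirical measure \(\frac{1}{N}\sum _{j=1}^{N} \delta _{(x_{j}, \lambda _{j})}\), all the integrals in (34)–(35) reduce to finite sums. The following minimal Python sketch illustrates the computation; the payoff J below is a hypothetical toy choice, used only because it satisfies \((J_1)\)–\((J_2)\).

import numpy as np

# Hypothetical payoff on R^d x U, with U = {0, ..., n-1}; bounded and locally
# Lipschitz, hence (J1)-(J2) hold.
def J(x, u, xp, up):
    return np.tanh(x[0] - xp[0]) * (1.0 if u == up else -0.5)

def replicator_T(x, lam, X, Lam):
    # Operator (35) for the empirical measure Psi = (1/N) sum_j delta_{(X[j], Lam[j])}:
    # returns ((J*Psi)(x, .) - <J*Psi, lam>(x)) * lam, a vector with zero total mass.
    N, n = Lam.shape
    JPsi = np.array([np.mean([sum(J(x, u, X[j], up) * Lam[j][up] for up in range(n))
                              for j in range(N)]) for u in range(n)])
    return (JPsi - JPsi @ lam) * lam

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                 # N = 20 agents in R^2
Lam = rng.dirichlet(np.ones(3), size=20)     # mixed strategies over n = 3 labels
out = replicator_T(X[0], Lam[0], X, Lam)
assert abs(out.sum()) < 1e-12                # T_Psi(y) has zero total mass

The zero-mass output reflects the fact that \({\mathcal {T}}_{\varPsi }(x, \lambda )\) takes values in the tangent space to \({\mathcal {P}}(U)\) at \(\lambda \).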

The following proposition holds.

Proposition 2

[31, Proposition 5.8] Under the assumptions \((J_1)\)–\((J_2)\), the operator \({\mathcal {T}}_{\varPsi }\) defined in (34) satisfies the conditions \(({\mathcal {T}}_0)\)–\(({\mathcal {T}}_3)\).

We now introduce the spherical Hellinger distance between probability measures

$$\begin{aligned} \mathrm {HS}^{2} ( \lambda _{1}, \lambda _{2}) :=\inf \, \biggl \{ \frac{1}{4}&\int _{0}^{1} {\int _U} | w_{t}(u) |^{2} \, \mathrm {d}\rho _{t}(u)\, \mathrm {d}t : \, \rho \in C([0,1]; {\mathcal {P}}(U)), \\&{w\in L^2([0,1]\times U;\mathrm {d}\rho ),}\, \dot{\rho }_t = \bigg ( w_t - \int _{U} w_{t} \, \mathrm {d}\rho _{t} \bigg ) \rho _{t}, \\&\rho _{0} = \lambda _{1}, \, \rho _{1} = \lambda _{2}\biggl \}\,, \end{aligned}$$

defined for every \(\lambda _{1}, \lambda _{2} \in {\mathcal {P}}(U)\), where \(L^2([0,1]\times U;\mathrm {d}\rho )\) denotes the space of functions which are square-integrable with respect to the measure \(\rho \). For later use, we also define the Hellinger distance between nonnegative measures: For every \(\mu _{1}, \mu _{2} \in {\mathcal {M}}_{+}(U)\), we set

$$\begin{aligned} \mathrm {H}^{2} (\mu _{1}, \mu _{2})&:=\inf \, \biggl \{ \frac{1}{4} \! \int _{0}^{1} {\int _U} | w_{t}(u) |^{2} \, \mathrm {d}\rho _{t} (u)\, \mathrm {d}t : \rho \in C([0,1]; {\mathcal {M}}_{+}(U)), \, \\&\qquad \qquad {w\in L^2([0,1]\times U;\mathrm {d}\rho ),}\, \dot{\rho }_t = w_t \, \rho _{t},\, \rho _{0} = \mu _{1}, \, \rho _{1} = \mu _{2} \biggl \} \\&= \int _{U} \bigg [ \bigg ( \frac{\mathrm {d}\mu _{1}}{\mathrm {d}\mu ^*} \bigg )^{\frac{1}{2}} - \bigg ( \frac{\mathrm {d}\mu _{2}}{\mathrm {d}\mu ^*} \bigg )^{\frac{1}{2}} \bigg ]^{2} \, \mathrm {d}\mu ^* , \end{aligned}$$

where \(\mu ^{*} \in {\mathcal {M}}_{+}(U)\) is such that \(\mu _{1}, \, \mu _{2} \ll \mu ^{*}\). We notice that \(\mathrm {HS}^{2}\) can be expressed in terms of \(\mathrm {H}^{2}\) through

$$\begin{aligned} \mathrm {HS}^{2}(\lambda _{1}, \lambda _{2}) = \arccos \bigg ( 1 - \frac{\mathrm {H}^{2}(\lambda _{1}, \lambda _{2})^{2} }{2}\bigg ) \qquad \text {for every }\lambda _{1}, \lambda _{2} \in {\mathcal {P}}(U)\,, \end{aligned}$$
(36)

and that the following chain of inequalities holds:

$$\begin{aligned} \Vert \lambda _{1} - \lambda _{2} \Vert _{\mathrm {BL}} \le \Vert \lambda _{1} - \lambda _{2} \Vert _{\mathrm {TV}} \le 2 \,\mathrm {H}(\lambda _{1}, \lambda _{2}) \le 2 \,\mathrm {HS}( \lambda _{1}, \lambda _{2}) \qquad \text {for every }\lambda _{1}, \lambda _{2} \in {\mathcal {P}}(U)\,. \end{aligned}$$
(37)
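For discrete measures, both distances admit elementary closed forms, and the middle inequality in (37) can be checked numerically; a minimal sketch, taking the counting measure as dominating measure \(\mu ^{*}\):

import numpy as np

rng = np.random.default_rng(1)
lam1 = rng.dirichlet(np.ones(5))
lam2 = rng.dirichlet(np.ones(5))

H2 = np.sum((np.sqrt(lam1) - np.sqrt(lam2)) ** 2)   # squared Hellinger distance
TV = np.sum(np.abs(lam1 - lam2))                    # total variation norm

assert TV <= 2.0 * np.sqrt(H2) + 1e-12              # middle inequality in (37)

The inequality follows from the Cauchy–Schwarz inequality applied to \(|\lambda _{1,h} - \lambda _{2,h}| = |\sqrt{\lambda _{1,h}} - \sqrt{\lambda _{2,h}}|\, (\sqrt{\lambda _{1,h}} + \sqrt{\lambda _{2,h}})\).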

In the spatially homogeneous case, the replicator equation is a generalized minimizing movement [6] for the functional

$$\begin{aligned} {\mathcal {J}}_{\mathrm {hom}} (\lambda ) :=- \frac{1}{8} \int _{U}\int _{U} J(u, u') \, \mathrm {d}\lambda (u) \, \mathrm {d}\lambda (u') \end{aligned}$$

with respect to the spherical Hellinger distance. In the spatially inhomogeneous setting, the payoff functional has a bilinear dependence on \(\varPsi \) and \(\lambda \): For every \(\varPsi \in {\mathcal {P}}_{1}(Y)\) and every \((x, \lambda ) \in Y\), we set

$$\begin{aligned} {\mathcal {J}}_{\varPsi } (x, \lambda ) :=- \frac{1}{4} \left\langle (J * \varPsi ) , \lambda \right\rangle , \end{aligned}$$
(38)

(the factor \(\frac{1}{4}\) instead of \(\frac{1}{8}\) is due to the dependence on \(\lambda \), which is now linear). We modify the scheme in Sect. 3 by replacing the finite difference (8) with a minimizing movement. Namely, in the interval \([t^{k}_{i}, t^{k}_{i+1})\), let \(\varPsi ^{k}_{i} \in {\mathcal {P}}_{1}(Y)\) be given and define, for every \(( \hat{x}, \hat{\lambda }) \in Y\),

$$\begin{aligned} \lambda _{(\hat{x}, \hat{\lambda }), i+1} :=\mathrm {argmin}\, \bigg \{ {\mathcal {J}}_{\varPsi ^{k}_{i}} (\hat{x} , \lambda ) + \frac{1}{2\tau _{k}} \mathrm {HS}^{2}(\lambda , \hat{\lambda }) : \, \lambda \in {\mathcal {P}}(U) \bigg \}\,. \end{aligned}$$
(39)

Notice that the measure \(\lambda _{(\hat{x}, \hat{\lambda }), i+1} \in {\mathcal {P}}(U)\) is well defined, as \({\mathcal {P}}(U)\) is compact and the functional in (39) is strictly convex. Therefore, we can define \(\lambda ^{k}_{(\hat{x}, \hat{\lambda }), i+1}\), \(\varLambda ^{k}_{i+1}\), and \({\tilde{\varPsi }}^{k}_{i+1}\) exactly as in (9), (10), and (11), respectively. The second step (12) in the space variable remains the same, so that \(x^{k}_{(\hat{x}, \hat{\lambda }), i+1}\), \(X^{k}_{i+1}\), and \(\varPsi ^{k}_{i+1}\) are as in (13), (14), and (15), respectively. We further refer to (15), (16), and (17) for the definition of the interpolation curves \(\varPsi ^{k}\), \({\widetilde{\varPsi }}^{k}\), and \({\underline{\varPsi }}^{k}\).
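A minimal Python sketch of one time step of this modified scheme for a finite strategy set: the minimization (39) is performed over the simplex with scipy, \(\mathrm {HS}^{2}\) is evaluated through (36), and the data (payoff vector, velocity field) are hypothetical toy choices.

import numpy as np
from scipy.optimize import minimize

def HS2(lam, lam_hat):
    # Squared spherical Hellinger distance via (36), mu* = counting measure
    H2 = np.sum((np.sqrt(lam) - np.sqrt(lam_hat)) ** 2)
    return np.arccos(np.clip(1.0 - H2 ** 2 / 2.0, -1.0, 1.0))

def implicit_label_step(JPsi, lam_hat, tau):
    # Minimizing movement (39) with J_Psi(x, lam) = -(1/4) <JPsi, lam> as in (38);
    # JPsi is the precomputed payoff vector (J*Psi^k_i)(x_hat, .)
    obj = lambda lam: -0.25 * (JPsi @ lam) + HS2(lam, lam_hat) / (2.0 * tau)
    return minimize(obj, lam_hat, method="SLSQP",
                    bounds=[(1e-12, 1.0)] * len(lam_hat),
                    constraints={"type": "eq",
                                 "fun": lambda lam: lam.sum() - 1.0}).x

def explicit_space_step(x_hat, lam_new, v, tau):
    # Explicit Euler step (12) in the space variable
    return x_hat + tau * v(x_hat, lam_new)

JPsi = np.array([0.3, -0.1, 0.5])            # hypothetical payoff vector
lam_hat = np.array([0.2, 0.5, 0.3])
v = lambda x, lam: -x + lam[0]               # hypothetical toy velocity field
lam_new = implicit_label_step(JPsi, lam_hat, tau=0.1)
x_new = explicit_space_step(np.zeros(2), lam_new, v, tau=0.1)

The label update is implicit (a variational problem is solved at each step), while the space update stays explicit: this is the implicit–explicit structure referred to in the title of this section.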

The next lemma gives an estimate on the size of the support of \(\varPsi ^{k}_{i+1}\) and \({\widetilde{\varPsi }}^{k}_{i+1}\), showing that they belong to \({\mathcal {P}}_{c}(Y) \subseteq {\mathcal {P}}_{1}(Y)\) for every \(k\in {\mathbb {N}}\) and every \(i\in \{0,\ldots ,k-1\}\).

Lemma 2

Let \(\widehat{\varPsi } \in {\mathcal {P}}_{c}(Y)\). Then, there exists \(R>0\) such that, for every \(k \in {\mathbb {N}}\) and every \(t\in [0,T]\), \(\varPsi ^{k}(t), \, {\underline{\varPsi }}^{k}(t), \, {\widetilde{\varPsi }}^{k}(t) \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\).

Proof

Let us define the piecewise constant interpolation functions \({\underline{\lambda }}^{k}\), \({\overline{\lambda }}^{k}\), and \({\underline{x}}^{k}\) as in (18), and let \(x^{k}\) and \(\lambda ^{k}\) be the corresponding piecewise affine interpolations. Then, by (39) we have that \({\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0}), \, {\overline{\lambda }}^{k}(t, x_{0}, \lambda _{0}) \in {\mathcal {P}}(U)\) for every \(t \in [0,T]\) and every \((x_{0}, \lambda _{0}) \in \mathrm {spt}\widehat{\varPsi }\), so that

$$\begin{aligned} \Vert {\underline{\lambda }}^{k} (t, x_{0}, \lambda _{0}) \Vert _{\mathrm {BL}}, \Vert {\overline{\lambda }}^{k} (t, x_{0}, \lambda _{0}) \Vert _{\mathrm {BL}} \le 1\,. \end{aligned}$$

Following step by step the proof of (19) and (22), we also deduce that there exists \(R>0\) such that

$$\begin{aligned} | x^{k}(t, x_{0}, \lambda _{0}) | \le R \qquad \begin{array}{l} \text {for every }t \in [0,T], \hbox { every } (x_{0}, \lambda _{0}) \in \mathrm {spt}\, \widehat{\varPsi }, \\ \text {and every } k\in {\mathbb {N}}\,. \end{array} \end{aligned}$$
(40)

We notice that, since step (39) is defined through a minimization in \({\mathcal {P}}(U)\) rather than through a finite difference, the estimate (40) holds for every k, and not only for k large. Moreover, (40) yields that \(\varPsi ^{k}(t)\), \({\underline{\varPsi }}^{k}(t)\), \({\widetilde{\varPsi }}^{k}(t) \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\). \(\square \)

In order to write the equivalent of Proposition 1, we have to determine an approximate Euler–Lagrange equation for the minimization problem (39). This is the content of the following proposition, written here for generic \(\varPsi \), \(x\), and \(\lambda \).

Proposition 3

Let \(R>0\). Assume that \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), \((x, \lambda ) \in \mathrm {B}^{Y}_{R}\), and let \({\tilde{\lambda }} \in {\mathcal {P}}(U)\) be the solution to

$$\begin{aligned} \min \, \bigg \{{\mathcal {J}}_{\varPsi } (x , \rho ) + \frac{1}{2\tau _{k}} \mathrm {HS}^{2}(\rho , \lambda ) : \, \rho \in {\mathcal {P}}(U) \bigg \} \,. \end{aligned}$$
(41)

Then, there exists a constant \(C= C(R)>0\) such that

$$\begin{aligned}&\mathrm {HS}( {\tilde{\lambda }}, \lambda ) \le C \tau _{k}, \end{aligned}$$
(42)
$$\begin{aligned}&\bigg \Vert \frac{{\tilde{\lambda }} - \lambda }{ \tau _{k}} - {\mathcal {T}}_{\varPsi }(x, {\tilde{\lambda }}) \bigg \Vert _{\mathrm {BL}} \le C \tau _{k}( 1 + \tau _{k}) \,. \end{aligned}$$
(43)

Proof

Inequality (42) follows from the minimality of \({\tilde{\lambda }}\). Indeed, we have that

$$\begin{aligned} \frac{1}{2\tau _{k}} \mathrm {HS}^{2}({\tilde{\lambda }}, \lambda ) \le \big | {\mathcal {J}}_{\varPsi } (x , \lambda ) - {\mathcal {J}}_{\varPsi } (x , {\tilde{\lambda }}) \big | \,. \end{aligned}$$
(44)

By definition (38) of \({\mathcal {J}}_{\varPsi }\), by \((J_1)\), by the assumptions \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), \((x, \lambda ) \in \mathrm {B}^{Y}_{R}\), and by (37), we continue in (44) with

$$\begin{aligned} \frac{1}{2\tau _{k}} \mathrm {HS}^{2}({\tilde{\lambda }}, \lambda ) \le \frac{ M_{J} }{2} (1 + R) \Vert {\tilde{\lambda }} - \lambda \Vert _{\mathrm {TV}} \le M_J (1+R) \mathrm {HS}( {\tilde{\lambda }}, \lambda ) \,. \end{aligned}$$
(45)

From (45), we deduce (42).

In order to prove (43), we write explicitly the Euler–Lagrange equation of (41). Here, we follow the lines of [20, Section 4]. For every \(\varphi \in \mathrm {Lip}(U)\) with \(\Vert \varphi \Vert _{\mathrm {Lip}} \le 1\), we consider the auxiliary system

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _{\varepsilon } \lambda _{\varepsilon } = ( \varphi - \left\langle \varphi , \lambda _{\varepsilon } \right\rangle ) \lambda _{\varepsilon }\,,\\ \lambda _{0} = {\tilde{\lambda }}\,. \end{array}\right. \end{aligned}$$
(46)

In view of [12, Section I.3, Theorem 1.4, Corollary 1.1], the ODE system (46) admits a unique solution \(\lambda ^{\varphi }_{\varepsilon } \in {\mathcal {P}}(U)\) for \(\varepsilon > 0\). Moreover, if \({\tilde{\lambda }} \ll \mu \), it is easy to check that \(\lambda ^{\varphi }_{\varepsilon } \ll \mu \) for \(\varepsilon >0\). In the sequel, we fix \(\mu ^{*} \in {\mathcal {P}}(U)\) such that \(\lambda , {\tilde{\lambda }} \ll \mu ^*\).
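One can verify directly that the unique solution of (46) is the normalized exponential tilt \(\lambda ^{\varphi }_{\varepsilon } = e^{\varepsilon \varphi } {\tilde{\lambda }} / \langle e^{\varepsilon \varphi }, {\tilde{\lambda }} \rangle \), which makes both properties above (membership in \({\mathcal {P}}(U)\) and preservation of absolute continuity) evident. A quick finite-difference check of this formula in the discrete case:

import numpy as np

rng = np.random.default_rng(2)
lam_tilde = rng.dirichlet(np.ones(4))        # initial datum of (46)
phi = rng.uniform(-1.0, 1.0, size=4)         # test function

def lam_eps(eps):
    # Candidate solution: exponential tilt, normalized to remain a probability
    w = np.exp(eps * phi) * lam_tilde
    return w / w.sum()

eps, h = 0.3, 1e-6
lhs = (lam_eps(eps + h) - lam_eps(eps - h)) / (2.0 * h)   # d/d(eps) of lambda_eps
lam = lam_eps(eps)
rhs = (phi - phi @ lam) * lam                             # right-hand side of (46)
assert np.allclose(lhs, rhs, atol=1e-6)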

Given \(\lambda ^{\varphi }_{\varepsilon }\), the Euler–Lagrange equation of (41) reads

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} \!\!\! {\mathcal {J}}_{\varPsi } ( x, \lambda ^{\varphi }_{\varepsilon } ) + \frac{1}{2 \tau _{k}} \frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} \!\!\! \mathrm {HS}^{2} ( \lambda ^{\varphi }_{\varepsilon }, \lambda ) = 0 \,. \end{aligned}$$
(47)

We compute the two derivatives appearing in (47) separately. In view of (46), we have that

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} {\mathcal {J}}_{\varPsi } ( x, \lambda ^{\varphi }_{\varepsilon } )&= - \frac{1}{4}\frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} \langle (J * \varPsi ) , \lambda _{\varepsilon }^{\varphi } \rangle = - \frac{1}{4}\big \langle (J * \varPsi ) , \big ( \varphi - \big \langle \varphi , {\tilde{\lambda }} \big \rangle \big ) {\tilde{\lambda }} \big \rangle \\&= - \frac{1}{4} \big \langle \big ( (J * \varPsi ) - \big \langle (J * \varPsi ) , {\tilde{\lambda }} \big \rangle \big ) {\tilde{\lambda }} , \varphi \big \rangle = - \frac{1}{4} \big \langle {\mathcal {T}}_{\varPsi } ( x, {\tilde{\lambda }}) , \varphi \big \rangle \,, \end{aligned}\nonumber \\ \end{aligned}$$
(48)

where, in the last equality, we have used (35).

To compute the second term on the left-hand side of (47), we first notice that, since \({{\tilde{\lambda }}} , \lambda , \lambda ^{\varphi }_{\varepsilon } \ll \mu ^{*}\), we can write

$$\begin{aligned} \begin{aligned} \mathrm {HS}^{2}( \lambda ^{\varphi }_{\varepsilon }, \lambda ) =&\arccos \left( 1 - \frac{\mathrm {H}^{2}(\lambda ^{\varphi }_{\varepsilon }, \lambda )^2}{2} \right) , \\ \mathrm {H}^{2}(\lambda ^{\varphi }_{\varepsilon }, \lambda ) =&\int _{U} \left[ \bigg ( \frac{\mathrm {d}\lambda ^{\varphi }_{\varepsilon }}{\mathrm {d}\mu ^{*}} \bigg )^{\frac{1}{2}} - \bigg ( \frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \right] ^{2} \mathrm {d}\mu ^{*} . \end{aligned} \end{aligned}$$

Defining \(\delta _{k}({\tilde{\lambda }}, \lambda )\) by \(1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) = \frac{1}{\sqrt{ 1 - \frac{\mathrm {H}^{2} ( {\tilde{\lambda }}, \lambda )^{2}}{4}}}\), which, in view of (36), is exactly the factor produced by differentiating \(\mathrm {HS}^{2}\) with respect to \(\mathrm {H}^{2}\) (in particular, \(\delta _{k} \le 0\)), we have that

$$\begin{aligned} \begin{aligned}&\frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} \mathrm {HS}^{2} ( \lambda ^{\varphi }_{\varepsilon }, \lambda ) = \big ( 1 - \delta _{k}( {\tilde{\lambda }}, \lambda ) \big ) \frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} \!\!\! \mathrm {H}^{2} ( \lambda ^{\varphi }_{\varepsilon }, \lambda ) \\&= 2\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big ) \! \int _{U} \bigg [ \bigg ( \frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}} \bigg )^{\frac{1}{2}} - \bigg ( \frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \bigg ] \frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} \bigg ( \frac{\mathrm {d}\lambda ^{\varphi }_{\varepsilon }}{\mathrm {d}\mu ^{*}} \bigg )^{\frac{1}{2}} \, \mathrm {d}\mu ^{*} \\&=\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big ) \! \int _{U} \bigg [ \bigg ( \frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}} \bigg )^{\frac{1}{2}} - \bigg ( \frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \bigg ] \bigg ( \frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}} \bigg )^{-\frac{1}{2}} \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \, \mathrm {d}{\tilde{\lambda }} \\&=\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big ) \bigg \langle \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \mu ^{*} , \bigg [ \bigg ( \frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}} \bigg )^{\frac{1}{2}} - \bigg ( \frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \bigg ] \bigg ( \frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}} \bigg )^{\frac{1}{2}} \bigg \rangle \,. \end{aligned} \end{aligned}$$
(49)

Using the algebraic equality \(2 (a - b) a = a^{2} - b^{2} + (a - b )^{2}\), we continue in (49) with

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\varepsilon }\bigg |_{\varepsilon = 0} \mathrm {HS}^{2} ( \lambda ^{\varphi }_{\varepsilon }, \lambda )&= \frac{\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big )}{2} \bigg \langle \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \mu ^{*} , \bigg ( \frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}} - \frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}} \bigg ) \bigg \rangle \\&\quad + \frac{\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big )}{2} \bigg \langle \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \mu ^{*} , \bigg [ \bigg (\frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} - \bigg (\frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \bigg ]^{2} \bigg \rangle \\&= \frac{\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big )}{2} \bigg \langle {\tilde{\lambda }} - \lambda , \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \bigg \rangle \\&\quad + \frac{\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big )}{2} \bigg \langle \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \mu ^{*} , \bigg [ \bigg (\frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} - \bigg (\frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \bigg ]^{2} \bigg \rangle \\&= \frac{\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big )}{2} \langle {\tilde{\lambda }} - \lambda , \varphi \rangle \\&\quad + \frac{\big ( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) \big )}{2} \bigg \langle \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \mu ^{*} , \bigg [ \bigg (\frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} - \bigg (\frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \bigg ]^{2} \bigg \rangle , \end{aligned}\nonumber \\ \end{aligned}$$
(50)

where, in the last equality, we have used the fact that \({\tilde{\lambda }}, \lambda \in {\mathcal {P}}(U)\).

In order to conclude with (43), we estimate \(|\delta _{k} ({\tilde{\lambda }}, \lambda )|\) and the last term on the right-hand side of (50). In view of (36), (37), and (42), it is easy to check that

$$\begin{aligned} | \delta _{k} ( {\tilde{\lambda }}, \lambda ) | \le c \tau _{k}^{2} \end{aligned}$$
(51)

for some positive constant \(c = c(R) >0\). Since \(\Vert \varphi \Vert _{\mathrm {Lip}} \le 1\) and (45) holds, we have that

$$\begin{aligned} \begin{aligned}&\bigg | \frac{( 1 - \delta _{k} ( {\tilde{\lambda }}, \lambda ) )}{2} \bigg \langle \! \big ( \varphi - \langle \varphi , {\tilde{\lambda }} \rangle \big ) \mu ^{*} , \bigg [ \bigg (\frac{\mathrm {d}{{\tilde{\lambda }}}}{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \!\! - \bigg (\frac{\mathrm {d}\lambda }{\mathrm {d}\mu ^{*}}\bigg )^{\frac{1}{2}} \bigg ]^{2} \bigg \rangle \bigg | \\&\le \mathrm {H}^{2}( {\tilde{\lambda }}, \lambda ) \le 4 M_{J}^{2} ( 1 + R)^{2} \tau _{k}^{2} \,. \end{aligned} \end{aligned}$$
(52)

Combining (37), (42), and (47)–(52), we deduce that

$$\begin{aligned} \begin{aligned}&\bigg \Vert \frac{{\tilde{\lambda }} - \lambda }{\tau _{k}} - {\mathcal {T}}_{\varPsi }(x, {\tilde{\lambda }}) \bigg \Vert _{\mathrm {BL}} \! \le | \delta _{k} ( {\tilde{\lambda }}, \lambda ) | \bigg \Vert \frac{{\tilde{\lambda }} - \lambda }{ \tau _{k}} \bigg \Vert _{\mathrm {BL}} \! + 8 M_{J}^{2} ( 1 + R)^{2} \tau _{k} \le C \tau _{k} ( 1 + \tau _{k})\,, \end{aligned} \end{aligned}$$

for some positive constant \(C = C(R)\). This concludes the proof of the proposition. \(\square \)

We are now in a position to state the equivalent of Proposition 1.

Proposition 4

There exists \(C>0\) such that for every \(\varphi \in C_{b}^{1}( {\mathbb {R}}^{d} \times {\mathcal {F}}(U))\), every \(k \in {\mathbb {N}}\), every \(i \in \{0, \ldots , k-1\}\), and every \(t \in (t^{k}_{i}, t^{k}_{i+1})\),

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \int _{Y} \varphi (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t)(x, \lambda ) = \int _{Y} \nabla \varphi (x, \lambda ) \cdot b_{\varPsi ^{k}(t)} (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) + \vartheta _{k}(\varphi ),\nonumber \\ \end{aligned}$$
(53)

where \(|\vartheta _{k}(\varphi )| \le C \Vert \varphi \Vert _{C^{1}_{b}} \tau _{k}\).

Proof

Throughout the proof, we denote by C a generic positive constant independent of i, k, t, and \(\varphi \), which may vary from line to line.

We follow step by step the proof of Proposition 1. For every test function \(\varphi \in C_{b}^{1}({\mathbb {R}}^{d} \times {\mathcal {F}}(U))\) and every \(t \in (t^{k}_{i}, t^{k}_{i+1})\), by definition of \(\varPsi ^{k}(t)\) we have that

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}&\int _{Y} \varphi (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t)(x, \lambda ) = \frac{\mathrm {d}}{\mathrm {d}t} \int _{Y} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&= \int _{Y} \nabla _{x} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot v_{{\widetilde{\varPsi }}^{k} (t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&\quad + \int _{Y} \nabla _{\lambda } \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big )\cdot \dot{\varLambda }^{k}_{i+1}(t, x, \lambda ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&= \int _{Y} \nabla _{x} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot v_{{\widetilde{\varPsi }}^{k} (t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&\quad + \int _{Y} \nabla _{\lambda } \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big )\cdot \frac{\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) - \lambda \big ) }{\tau _{k}} \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda )\,. \end{aligned}\nonumber \\ \end{aligned}$$
(54)

In order to deduce (53) from (54), we need to estimate

$$\begin{aligned}&\displaystyle I_{1} (x, \lambda ) :=\Big | v_{{\widetilde{\varPsi }}^{k}(t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) - v_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t , x, \lambda ) \big ) \Big |\,, \\&\displaystyle I_{2} (x, \lambda ) :=\bigg \Vert \frac{\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) - \lambda \big ) }{\tau _{k}} - {\mathcal {T}}_{\varPsi ^{k}(t)} ( X^{k}_{i+1} (t, x, \lambda ) , \varLambda ^{k}_{i+1}(t, x, \lambda ) ) \bigg \Vert _{\mathrm {BL}} \end{aligned}$$

for \((x, \lambda ) \in \mathrm {spt}\, \varPsi ^{k}_{i} \subseteq \mathrm {B}^{Y}_{R}\), where R has been determined in Lemma 2.

Let us start with \(I_{1}\). By the triangle inequality, we have

$$\begin{aligned} \!\! I_{1} (x, \lambda )\le & {} \Big | v_{{\widetilde{\varPsi }}^{k}(t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) - v_{{\widetilde{\varPsi }}^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t, x, \lambda ) \big ) \Big | \nonumber \\&+ \Big | v_{{\widetilde{\varPsi }}^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t, x, \lambda ) \big ) \nonumber \\&- v_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t , x, \lambda ) \big ) \Big | \nonumber \\=: & {} I_{1, 1} (x, \lambda ) + I_{1, 2} (x, \lambda ). \end{aligned}$$
(55)

Since \({\widetilde{\varPsi }}^{k} (t) \in {\mathcal {P}}( \mathrm {B}^{Y}_{R})\), hypothesis \((v_{1})\) implies that

$$\begin{aligned} \begin{aligned} I_{1,1} (x, \lambda )&\le L_{v, R} \big ( | X^{k}_{i+1}(t, x, \lambda ) - x| + \Vert \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) - \varLambda ^{k}_{i+1} (t, x, \lambda ) \Vert _{\mathrm {BL}} \big ) \\&\le L_{v, R}\bigg ( \int _{t^{k}_{i}}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big (x, \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) \big ) \big | \, \mathrm {d}\tau \\&+ \int _{t}^{t^{k}_{i+1}} \bigg \Vert \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) - \lambda }{\tau _{k}} \bigg \Vert _{\mathrm {BL}} \, \mathrm {d}\tau \bigg ) \,. \end{aligned} \end{aligned}$$

By \((v_3)\), Lemma 2, and Proposition 3, we can continue with

$$\begin{aligned} I_{1,1} (x, \lambda )\le & {} L_{v, R} \bigg ( M_{v} \int _{t^{k}_{i}}^{t} \big ( 1 + |x| + \Vert \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) \Vert _{\mathrm {BL}} + m_{1} ( {\widetilde{\varPsi }}^{k} (\tau )) \big ) \, \mathrm {d}\tau \nonumber \\&+ \frac{2}{\tau _{k}} \int _{t}^{t^{k}_{i+1}} \mathrm {HS}\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ), \lambda \big ) \, \mathrm {d}\tau \bigg ) \le C \tau _{k} \,. \end{aligned}$$
(56)

As for \(I_{1,2}\), thanks to assumption \((v_2)\) and to Lemma 2 we get

$$\begin{aligned} I_{1,2}&(x, \lambda ) \le L_{v, R} W_{1} ( {\widetilde{\varPsi }}^{k}(t) , \varPsi ^{k}(t) ) \\&= L_{v, R} \, \sup _{\eta \in \mathrm {Lip}_{1} ( Y)} \bigg \{ \int _{Y} \eta (x' , \lambda ' ) \, \mathrm {d}({\widetilde{\varPsi }}^{k}(t) - \varPsi ^{k}(t) ) ( x' , \lambda ' ) \bigg \} \\&= L_{v, R} \, \sup _{\eta \in \mathrm {Lip}_{1} ( Y)} \bigg \{ \int _{Y} \big [ \eta (x, \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) ) \\&\quad - \eta (X^{k}_{i+1} ( t, x' , \lambda ' ), \varLambda ^{k}_{i+1} (t, x' , \lambda ' ) ) \big ] \, \mathrm {d}\varPsi ^{k}_{i} ( x' , \lambda ' ) \bigg \} \\&\le L_{v, R} \int _{Y} \Big ( | x - X^{k}_{i+1} ( t, x' , \lambda ' ) | \\&\quad + \Vert \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) ) - \varLambda ^{k}_{i+1} (t, x' , \lambda ' ) \Vert _{\mathrm {BL}} \Big ) \, \mathrm {d}\varPsi ^{k}_{i} ( x' , \lambda ' ) \\&\le L_{v, R} \int _{Y} \bigg ( \int _{t^{k}_{i}}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} ( x, \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) ) \big | \, \mathrm {d}\tau \\&\quad + \int _{t}^{t^{k}_{i+1}} \bigg \Vert \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) - \lambda ' }{\tau _{k}} \bigg \Vert _{\mathrm {BL}} \, \mathrm {d}\tau \bigg ) \, \mathrm {d}\varPsi ^{k}_{i}( x' , \lambda ' ) \\&\le L_{v, R} \, \tau _{k} \int _{Y} \bigg ( \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} ( x, \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) ) \big | \\&\quad + \bigg \Vert \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) - \lambda }{\tau _{k}} \bigg \Vert _{\mathrm {BL}} \bigg ) \, \mathrm {d}\varPsi ^{k}_{i}( x' , \lambda ' ) \,. \end{aligned}$$

Arguing as in (56), we infer that

$$\begin{aligned} I_{1,2} (x, \lambda ) \le C\, \tau _{k} \qquad \text {for every }(x, \lambda ) \in \mathrm {spt}\, \varPsi ^{k}_{i} \,. \end{aligned}$$
(57)

Combining (55)–(57), we get

$$\begin{aligned} I_{1} (x, \lambda ) \le C \, \tau _{k} \qquad \text {for every }(x, \lambda ) \in \mathrm {spt}\, \varPsi ^{k}_{i} \,. \end{aligned}$$
(58)

Let us now estimate \(I_{2}\). By the triangle inequality, we have

$$\begin{aligned} \begin{aligned} I_{2}&(x, \lambda ) \le \bigg \Vert \frac{\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) - \lambda \big ) }{\tau _{k}} - {\mathcal {T}}_{\varPsi ^{k}_{i}} ( x , \varLambda ^{k}_{i+1}(t^{k}_{i+1} , x, \lambda ) ) \bigg \Vert _{\mathrm {BL}} \\&\quad + \big \Vert {\mathcal {T}}_{\varPsi ^{k}(t) } \big ( X^{k}_{i+1} (t, x, \lambda ) , \varLambda ^{k}_{i+1}(t, x, \lambda ) \big ) - {\mathcal {T}}_{\varPsi ^{k}_{i}} \big ( x , \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) \big ) \big \Vert _{\mathrm {BL}} \\&=: I_{2, 1} (x, \lambda ) + I_{2, 2} (x, \lambda ) \,. \end{aligned} \end{aligned}$$
(59)

By Proposition 3, we have that

$$\begin{aligned} I_{2,1} (x, \lambda ) \le C \,\tau _{k} \qquad \text {for every } (x, \lambda ) \in \mathrm {spt}\, \varPsi ^{k}_{i} \,. \end{aligned}$$
(60)

By \(({\mathcal {T}}_{2})\)\((v_{3})\), Lemma 2, and Proposition 3, and repeating the arguments of (57), we get

$$\begin{aligned} \begin{aligned} \!\! I_{2,2}(x, \lambda )&\le L_{{\mathcal {T}}, R} \bigg ( \int _{t^{k}_{i}}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big ( x, \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) \big ) \big | \, \mathrm {d}\tau \\&\quad + \int _{t}^{t^{k}_{i+1}} \bigg \Vert \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) - \lambda }{\tau _{k}} \bigg \Vert _{\mathrm {BL}} \mathrm {d}\tau + W_{1} ( \varPsi ^{k}(t), \varPsi ^{k}_{i}) \bigg ) \\&\le L_{{\mathcal {T}}, R} \bigg ( \int _{t^{k}_{i}}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big ( x, \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) \big ) \big | \, \mathrm {d}\tau \\&\quad + \!\int _{t}^{t^{k}_{i+1}} \! \bigg \Vert \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) - \lambda }{\tau _{k}} \bigg \Vert _{\mathrm {BL}} \mathrm {d}\tau \\&\quad + \int _{Y} \int _{t^{k}_{i}}^{t} \bigg ( \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big ( x', \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x', \lambda ') \big )\big | \\&\quad + \bigg \Vert \frac{\varLambda ^{k}_{i+1} (t^{k}_{i+1}, x', \lambda ') - \lambda ' }{\tau _{k}} \bigg \Vert _{\mathrm {BL}} \bigg ) \mathrm {d}\tau \, \mathrm {d}\varPsi ^{k}_{i}(x', \lambda ') \bigg ) \\&\le 4 L_{{\mathcal {T}}, R}\, M_{v} (1 + R) \tau _{k} + 4\, \mathrm {HS}\big ( \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) , \lambda \big ) \le C\, \tau _{k}\,. \end{aligned} \end{aligned}$$
(61)

Combining (59)–(61), we obtain that

$$\begin{aligned} I_{2} (x, \lambda ) \le C \, \tau _{k} \qquad \text {for every }(x, \lambda ) \in \mathrm {spt}\, \varPsi ^{k}_{i} \,. \end{aligned}$$
(62)

Equality (53) follows from (58) and (62) as in the proof of Proposition 1. \(\square \)
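Before passing to the convergence result, we note how the single-step functions assemble into a full time-stepping loop. A sketch, assuming the hypothetical J, v, implicit_label_step, and explicit_space_step from the previous snippets are in scope:

import numpy as np

def run_scheme(X0, Lam0, T=1.0, k=20):
    # k steps of size tau = T/k for the empirical measure Psi^k_i = (1/N) sum_j delta_j
    tau = T / k
    X, Lam = X0.copy(), Lam0.copy()
    N, n = Lam.shape
    for i in range(k):
        # payoff vectors (J*Psi^k_i)(x_j, .) computed from the current ensemble
        JPsi = np.array([[np.mean([sum(J(X[j], u, X[m], up) * Lam[m][up]
                                       for up in range(n)) for m in range(N)])
                          for u in range(n)] for j in range(N)])
        # implicit update (39) of all labels, then explicit update (12) of positions
        Lam = np.array([implicit_label_step(JPsi[j], Lam[j], tau) for j in range(N)])
        X = np.array([explicit_space_step(X[j], Lam[j], v, tau) for j in range(N)])
    return X, Lam

Theorem 2 below states that, as \(k \rightarrow \infty \), the interpolated empirical measures produced by such a loop converge in \(W_{1}\) to the solution of (7).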

Finally, we prove the convergence of the sequence \(\varPsi ^{k}\) to the solution \(\varPsi \in C([0,T]; {\mathcal {P}}_{1}(Y))\) of the continuity equation (7).

Theorem 2

Let \(\widehat{\varPsi } \in {\mathcal {P}}_{c}(Y)\). Then, the sequence of curves \(\varPsi ^{k} :[0,T] \rightarrow {\mathcal {P}}_{1}(Y)\) converges to the unique solution \(\varPsi \in C([0,T]; {\mathcal {P}}_{1}(Y))\) of (7) in \(W_{1}\), uniformly with respect to \(t \in [0,T]\).

Proof

Since the operator \({\mathcal {T}}_{\varPsi }\) defined in (34) satisfies properties \(({\mathcal {T}}_{0})\)–\(({\mathcal {T}}_{3})\), we only have to check that the sequence \(\varPsi ^{k}\) is compact in \(C([0,T]; {\mathcal {P}}_{1}(Y))\). The rest of the proof works as for Theorem 1, using Proposition 4 instead of Proposition 1.

In view of Lemma 2, it is enough to show that \(\varPsi ^{k}\) is equi-Lipschitz with respect to \(W_{1}\). Let us fix \(k \in {\mathbb {N}}\), \(i \in \{0, \ldots , k-1\}\), and \(s \le t \in [t^{k}_{i}, t^{k}_{i+1}]\). Then,

$$\begin{aligned} \begin{aligned} W_{1}&( \varPsi ^{k}(t) , \varPsi ^{k}(s)) = \sup \, \left\{ \int _{Y} \eta (x, \lambda ) \, \mathrm {d}( \varPsi ^{k}(t) - \varPsi ^{k}(s))(x, \lambda ): \, \eta \in \mathrm {Lip}_{1}(Y) \right\} \\&\le \int _{Y} \Big ( \big | X^{k}_{i+1}(t, x, \lambda ) - X^{k}_{i+1} (s, x, \lambda ) \big | \\&\quad + \big \Vert \varLambda ^{k}_{i+1}(t, x, \lambda ) - \varLambda ^{k}_{i+1}(s, x, \lambda ) \big \Vert _{\mathrm {BL}} \Big ) \, \mathrm {d}\varPsi ^{k}_{i}(x, \lambda ) \\&\le \int _{Y} \bigg ( \int _{s}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big (x, \varLambda ^{k}_{i+1} (\tau , x, \lambda ) \big ) \big |\, \mathrm {d}\tau \\&\quad + \int _{s}^{t} \bigg \Vert \frac{\varLambda ^{k}_{i+1}( t^{k}_{i+1}, x, \lambda ) - \lambda }{\tau _{k}} \bigg \Vert _{\mathrm {BL}} \, \mathrm {d}\tau \bigg )\, \mathrm {d}\varPsi ^{k}_{i}(x, \lambda ) \,. \end{aligned} \end{aligned}$$

Therefore, by \((v_3)\), Lemma 2, and Proposition 3 we get

$$\begin{aligned} \begin{aligned} W_{1} ( \varPsi ^{k}(t) , \varPsi ^{k}(s))&\le 2M_{v} (1 + R) |t - s| \\&\quad + 2 |t - s| \int _{Y}\frac{\mathrm {HS}\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) , \lambda \big ) }{\tau _{k}} \, \mathrm {d}\varPsi ^{k}_{i}(x, \lambda ) \\&\le C | t - s| \end{aligned} \end{aligned}$$

for some positive constant C independent of k and t. \(\square \)

5 An implicit–explicit scheme for reversible Markov chains

In this section, we show how to adapt the scheme developed in Sect. 4 to a reversible Markov chain on n states. In particular, we prove the convergence of such a scheme for short times.

For fixed \(n \in {\mathbb {N}}\), we consider the set of strategies

$$\begin{aligned} \varLambda _n :=\bigg \{ \lambda = (\lambda _{1}, \ldots , \lambda _{n}) \in {\mathbb {R}}^{n} : \, \lambda _{h} >0, \ \sum _{h=1}^{n} \lambda _{h} = 1 \bigg \} \,. \end{aligned}$$

In the notation of Sects. 3 and 4, the closure \({\overline{\varLambda }}_{n}\) can be identified with the set of probability measures \({\mathcal {P}}(U)\) for \(U :=\{ e_{h}: \, h = 1, \ldots , n\}\), \(e_{h}\) being the elements of the canonical basis of \({\mathbb {R}}^n\). Keeping the notation of the previous sections, we set \(Y :={\mathbb {R}}^{d} \times {\overline{\varLambda }}_{n}\). Furthermore, we define

$$\begin{aligned}&\varLambda _{n}^{\delta } :=\{ \lambda \in \varLambda _{n} : \, \lambda _{h} \ge \delta \} \quad \text {for every }\delta>0\,, \qquad {\mathbb {R}}^{n}_{0} :=\bigg \{ \xi \in {\mathbb {R}}^{n}: \, \sum _{h = 1}^{n} \xi _{h} = 0 \bigg \}\,,\\&\mathrm {B}^{Y}_{R, \delta } :=\mathrm {B}^{Y}_{R} \cap \big ( {\mathbb {R}}^{d} \times \varLambda ^{\delta }_{n} \big ) \quad \text {for }\delta , \, R > 0\,. \end{aligned}$$

A Markov chain is characterized by a matrix \(\mathrm {Q}\in {\mathbb {M}}^{n}\), whose element \(\mathrm {Q}_{h \ell } \ge 0\), \(h\ne \ell \), indicates the rate of moving from the state \(\ell \) to the state \(h\). In our setting, we consider a more general map \({\mathcal {Q}} :{\mathbb {R}}^{d} \times {\mathcal {P}}_{1} (Y) \rightarrow {\mathbb {M}}^{n}\) satisfying the following properties:

\(({\mathcal {Q}}_{0})\):

for every \((x, \varPsi ) \in {\mathbb {R}}^{d} \times {\mathcal {P}}_{1}(Y)\) and every \(h, \ell = 1, \ldots , n\), \({\mathcal {Q}}_{h \ell }(x, \varPsi ) \ge 0\) for \(h \ne \ell \), and \({\mathcal {Q}}_{hh}(x, \varPsi ) = - \sum _{\ell \ne h} {\mathcal {Q}}_{\ell h} (x, \varPsi )\);

\(({\mathcal {Q}}_{1})\):

for every \((x, \varPsi ) \in {\mathbb {R}}^{d} \times {\mathcal {P}}_{1}(Y)\), \({\mathcal {Q}}( x, \varPsi )\) is reversible, that is, there exists a unique \(\sigma = \sigma (x, \varPsi ) \in \varLambda _{n}\) such that

$$\begin{aligned} {\mathcal {Q}}_{h \ell } ( x, \varPsi ) \sigma _{\ell } = {\mathcal {Q}}_{\ell h} (x, \varPsi ) \sigma _{h} \qquad \text {for every }h, \ell = 1, \ldots , n\,; \end{aligned}$$
\(({\mathcal {Q}}_{2})\):

\({\mathcal {Q}}\) is locally Lipschitz, that is, for every \(R>0\) there exists \(L_{{\mathcal {Q}}, R}>0\) such that for every \(x_{1}, x_{2} \in \mathrm {B}_{R}\) and every \(\varPsi _{1}, \varPsi _{2} \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\)

$$\begin{aligned} | {\mathcal {Q}}( x_{1}, \varPsi _{1}) - {\mathcal {Q}}(x_{2}, \varPsi _{2}) | \le L_{{\mathcal {Q}}, R} \big ( | x_{1} - x_{2} | + W_{1} (\varPsi _{1}, \varPsi _{2}) \big )\,; \end{aligned}$$
\(({\mathcal {Q}}_{3})\):

there exists \(M_{{\mathcal {Q}}} > 0\) such that for every \(x \in {\mathbb {R}}^{d}\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\)

$$\begin{aligned} | {\mathcal {Q}}( x, \varPsi ) | \le M_{{\mathcal {Q}}} \big ( 1 + |x| + m_{1} ( \varPsi ) \big ) \,. \end{aligned}$$

Remark 1

We remark that \(({\mathcal {Q}}_{1})\) is satisfied, for instance, when \({\mathcal {Q}}(x, \varPsi )\) is a tridiagonal matrix for every \(x \in {\mathbb {R}}^{d}\) and \(\varPsi \in {\mathcal {P}}_{1}(Y)\); see, e.g., [30, Section 5.1].

Remark 2

We notice that if for every \(y = (x, \lambda ) \in Y\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\) we set \({\mathcal {T}}_{\varPsi } (y):={\mathcal {Q}}( x, \varPsi ) \lambda \), then the operator \({\mathcal {T}}:Y \times {\mathcal {P}}_{1}(Y) \rightarrow {\mathbb {R}}^{n}_{0}\) satisfies properties \(({\mathcal {T}}_{0})\)–\(({\mathcal {T}}_{3})\) of Sect. 2.
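As an illustration of Remarks 1 and 2, the following Python sketch builds a hypothetical tridiagonal rate matrix satisfying \(({\mathcal {Q}}_{0})\), recovers \(\sigma \) from the detailed balance condition in \(({\mathcal {Q}}_{1})\) by the usual birth–death recursion, and checks that \({\mathcal {T}}_{\varPsi }(y) = {\mathcal {Q}}\lambda \) has zero total mass:

import numpy as np

n = 5
rng = np.random.default_rng(3)
up = rng.uniform(0.5, 1.5, size=n - 1)       # rates h -> h+1, i.e. Q[h+1, h]
down = rng.uniform(0.5, 1.5, size=n - 1)     # rates h+1 -> h, i.e. Q[h, h+1]

Q = np.zeros((n, n))
for h in range(n - 1):
    Q[h + 1, h], Q[h, h + 1] = up[h], down[h]
Q -= np.diag(Q.sum(axis=0))                  # (Q_0): Q_hh = -sum_{l != h} Q_lh

# detailed balance for a tridiagonal chain: sigma_{h+1}/sigma_h = up_h/down_h
sigma = np.cumprod(np.concatenate(([1.0], up / down)))
sigma /= sigma.sum()                         # normalize so that sigma lies in Lambda_n

assert np.allclose(Q @ sigma, 0.0)           # sigma is stationary for Q
lam = rng.dirichlet(np.ones(n))
assert abs((Q @ lam).sum()) < 1e-12          # T_Psi(y) = Q lam lies in R^n_0

Reversibility makes \(\sigma \) computable by a one-dimensional recursion; for a general chain, \(\sigma \) would instead be found as a normalized element of the kernel of \({\mathcal {Q}}\).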

Following [28, 30], for every \(y = (x,\lambda ) \in {\mathbb {R}}^{d} \times \varLambda _{n}\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\) we consider the entropy E and the Onsager matrix K

$$\begin{aligned}&E ( x, \lambda , \varPsi ) :=\sum _{h = 1}^{n} \lambda _{h} \ln \bigg ( \frac{\lambda _{h}}{\sigma _{h} (x, \varPsi )} \bigg )\,, \end{aligned}$$
(63)
$$\begin{aligned}&K(x, \lambda , \varPsi ) :=\sum _{\ell = 2}^{n} \sum _{h = 1}^{\ell - 1} {\mathcal {Q}}_{h \ell }(x, \varPsi ) \sigma _{\ell }(x, \varPsi ) \, \varPhi \bigg ( \frac{\lambda _{h}}{\sigma _{h}( x, \varPsi ) } , \frac{\lambda _{\ell }}{\sigma _{\ell }(x, \varPsi )} \bigg ) ( e_{h} - e_{\ell }) \otimes (e_{h} - e_{\ell }), \end{aligned}$$
(64)

where \(\varPhi :[0,+\infty ) \times [0,+\infty ) \rightarrow [0,+\infty )\) is defined as

$$\begin{aligned} \varPhi ( a, b ) :=\frac{ a - b}{\ln a - \ln b }\quad \text {for } a \ne b, \qquad \varPhi (a, a) = a\,, \end{aligned}$$

so that \(\varPhi \) is analytic. Clearly, \(E(x, \cdot , \varPsi )\) and \(K(x, \cdot , \varPsi )\) can be extended to \({\overline{\varLambda }}_{n}\) by continuity. Moreover, we notice that for every \((x, \lambda ) \in {\mathbb {R}}^{d} \times \varLambda _{n}\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\), the matrix \(K(x, \lambda , \varPsi )\) is symmetric and positive definite when acting on \({\mathbb {R}}^{n}_{0}\). We denote by \(G(x, \lambda , \varPsi )\) its inverse on \({\mathbb {R}}^{n}_{0}\). The matrix G is a Riemannian tensor on \({\mathbb {R}}^{n}_{0}\). For every \(x \in {\mathbb {R}}^{d}\) and \(\varPsi \in {\mathcal {P}}_{1}(Y)\), we define the Riemannian metric \({\mathsf {d}}_{(x, \varPsi )} :\varLambda _{n} \times \varLambda _{n} \rightarrow [0,+\infty )\) as

$$\begin{aligned} {\mathsf {d}}_{(x, \varPsi )} ( \lambda _{1}, \lambda _{2} ):= & {} \inf \, \bigg \{ \int _{0}^{1} \left\langle G(x, \rho (s), \varPsi ) \rho '(s) , \rho '(s) \right\rangle ^{\frac{1}{2}}\mathrm {d}s : \nonumber \\&\qquad \qquad \quad \rho \in C^{1} ( [0 , 1 ] ; \varLambda _{n} ) , \, \rho (0) = \lambda _{1} , \, \rho (1) = \lambda _{2} \bigg \}, \end{aligned}$$
(65)

for every \(\lambda _{1}, \lambda _{2} \in \varLambda _{n}\). The metric \({\mathsf {d}}_{(x, \varPsi )}\) can be extended to \({\overline{\varLambda }}_{n} \times {\overline{\varLambda }}_{n}\) in a continuous way.

In the next two lemmas, we collect some properties of E, K, G, and \({\mathsf {d}}_{(x,\varPsi )}\).

Lemma 3

Let \(\delta , R > 0\). Then, the following facts hold:

(i)

    there exists a positive constant \(\eta = \eta (R)\) such that \(\sigma _{h} (x, \varPsi ) \ge \eta \) for every \(x \in \mathrm {B}_{R}\), every \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), and every \(h = 1, \ldots , n\);

(ii)

    there exist two positive constants \(c_{1} = c_{1}( R)\) and \( c_{2} = c_{2}(R)\) such that for every \(x \in \mathrm {B}_{R}\), every \(\lambda \in \varLambda _{n}\), every \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), and every \(\mu \in {\mathbb {R}}^{n}_{0}\),

    $$\begin{aligned}&c_{1} | \mu | ^{2} \le \left\langle G(x, \lambda , \varPsi ) \mu , \mu \right\rangle , \end{aligned}$$
    (66)
    $$\begin{aligned}&\left\langle K(x, \lambda , \varPsi ) \mu , \mu \right\rangle \le c_{2} | \mu | ^{2}\, ; \end{aligned}$$
    (67)
(iii)

    there exist two positive constants \(c_{3} = c_{3}( \delta , R)\) and \(c_{4} = c_{4}(\delta , R)\) such that for every \((x, \lambda ) \in \mathrm {B}^{Y}_{R, \delta }\), every \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), and every \(\mu \in {\mathbb {R}}^{n}_{0}\),

    $$\begin{aligned}&\left\langle G(x, \lambda , \varPsi ) \mu , \mu \right\rangle \le c_{3} | \mu | ^{2}\,, \end{aligned}$$
    (68)
    $$\begin{aligned}&c_{4} | \mu | ^{2} \le \left\langle K(x, \lambda , \varPsi ) \mu , \mu \right\rangle \, ; \end{aligned}$$
    (69)
(iv)

    \(G(x, \cdot , \varPsi )\) is Lipschitz continuous in \(\varLambda ^{\delta }_{n}\), uniformly with respect to \(x \in \mathrm {B}_{R}\) and \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), that is, there exists \(L_{G, \delta , R}>0\) such that for every \(\lambda _{1}, \lambda _{2} \in \varLambda ^{\delta }_{n}\)

    $$\begin{aligned} | G(x, \lambda _{1}, \varPsi ) - G(x, \lambda _{2}, \varPsi ) | \le L_{G, \delta , R} | \lambda _{1} - \lambda _{2}|; \end{aligned}$$
    (70)
(v)

    \(E (x, \cdot , \varPsi )\) is Lipschitz continuous in \(\varLambda ^{\delta }_{n}\), uniformly with respect to \(x \in \mathrm {B}_{R}\) and \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), namely there exists \(L_{E, \delta , R}>0\) such that for every \(\lambda _{1}, \lambda _{2} \in \varLambda ^{\delta }_{n}\)

    $$\begin{aligned} | E(x, \lambda _{1}, \varPsi ) - E(x, \lambda _{2}, \varPsi ) | \le L_{E, \delta , R} | \lambda _{1} - \lambda _{2}|; \end{aligned}$$
    (71)
(vi)

    for every \(\alpha \in (0,1)\) the energy \(E(x, \cdot , \varPsi )\) is \(\alpha \)-Hölder continuous in \({\overline{\varLambda }}_{n}\), uniformly with respect to \(x \in \mathrm {B}_{R}\) and \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), that is, for every \(\alpha \in (0,1)\) there exists \(C_{E, \alpha , R}>0\) such that for every \(\lambda _{1}, \lambda _{2} \in {\overline{\varLambda }}_{n}\)

    $$\begin{aligned} | E(x, \lambda _{1}, \varPsi ) - E(x, \lambda _{2}, \varPsi ) | \le C_{E, \alpha , R} | \lambda _{1} - \lambda _{2}|^{\alpha } \,. \end{aligned}$$
    (72)

Remark 3

The constants \(c_{1}(R)\) and \(c_{4}(\delta , R)\) can be assumed to be decreasing with respect to R, while \(c_{2}( R)\)\(c_{3}(\delta , R)\)\(L_{G, \delta , R}\)\(L_{E, \delta , R}\), and \(C_{E, \alpha , R}\) can be assumed to be increasing with respect to R.

Proof

(Proof of Lemma 3) In view of \(({\mathcal {Q}}_{1})\) and \(({\mathcal {Q}}_{2})\), we have that the function \((x, \varPsi ) \mapsto \sigma (x, \varPsi )\) is continuous from \({\mathbb {R}}^{d} \times {\mathcal {P}}_{1}(Y)\) to \(\varLambda _{n}\). Hence, there exists \(\eta = \eta ( R ) > 0\) such that for every \(x \in \mathrm {B}_{R}\), every \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), and every \(h \in \{ 1, \ldots , n\}\), \( \sigma _{h} (x, \varPsi ) \ge \eta >0\), so that (i) holds.

From (i), (64), the regularity of \(\varPhi \), and \(({\mathcal {Q}}_{3})\), we further deduce that (67) holds for a suitable constant \(c_{2} = c_{2}( R)\).

For every \((x , \lambda ) \in {\mathbb {R}}^{d} \times \varLambda _{n}\) and every \(\varPsi \in {\mathcal {P}}_{1}(Y)\), we have that \(K( x, \lambda , \varPsi )\) is symmetric, positive semi-definite on \({\mathbb {R}}^{n}\), and positive definite on \({\mathbb {R}}^{n}_{0}\). Since K is continuous with respect to \((x, \lambda , \varPsi )\), we deduce that there exists a positive constant \(c_{4} = c_{4} (\delta , R) \le c_{2}\) such that inequality (69) holds for every \((x , \lambda ) \in \mathrm {B}^{Y}_{R, \delta }\) and every \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\). Since G is the inverse of K on \({\mathbb {R}}^{n}_{0}\), (67) and (69) imply (66) and (68) with \(c_{1}(R) :=c_{2}(R) ^{-1} \) and \(c_{3}(\delta , R) :=c_{4} (\delta , R)^{-1}\). This concludes the proof of (ii) and (iii).

The Lipschitz continuity (iv) of \(G(x, \cdot , \varPsi )\) in \(\varLambda ^{\delta }_{n}\) follows from the regularity of \(K(x, \cdot , \varPsi )\), from (i)–(iii), and from the identity, on \({\mathbb {R}}^{n}_{0}\),

$$\begin{aligned} G ( x , \lambda _{1}, \varPsi ) - G(x, \lambda _{2}, \varPsi ) = G ( x , \lambda _{1}, \varPsi ) \big ( K ( x , \lambda _{2}, \varPsi ) - K ( x , \lambda _{1}, \varPsi )\big ) G ( x , \lambda _{2}, \varPsi ). \end{aligned}$$

As for (v), we notice that for \(x \in \mathrm {B}_{R}\)\(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), and \(\lambda \in \varLambda ^{\delta }_{n}\), the ratio \(\lambda _{h}/\sigma _{h} (x, \varPsi )\) is bounded from below and from above by \( \delta \) and by \(1/\eta \), respectively. Since the function \(a \mapsto a \ln a\) is locally Lipschitz continuous in \((0,+\infty )\), we have that there exists \(L = L ( \delta , R)>0\) such that (71) holds.

Finally, we note that the function \(a \mapsto a \ln a\) belongs to \(W^{1, p}( [ 0, A])\) for every \(p \in [1, +\infty )\) and every \(A<+\infty \). In view of (i), for every \(x \in \mathrm {B}_{R}\), every \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), and every \(\lambda \in {\overline{\varLambda }}_{n}\), the ratio \(\lambda _{h}/\sigma _{h} (x, \varPsi )\) is bounded above by \(1/\eta \). Hence, by Sobolev embedding in dimension one we infer that for every \(\alpha \in (0, 1)\) there exists \(C = C ( \alpha , R) > 0\) such that (72) holds. \(\square \)

Before stating the main properties of the distance \({\mathsf {d}}_{(x,\varPsi )}\), we define, for every \(x \in {\mathbb {R}}^{d}\), every \(\varPsi \in {\mathcal {P}}_{1}(Y)\), and every \(\lambda , \lambda _{1}, \lambda _{2} \in \varLambda _{n}\), the norm

$$\begin{aligned} \Vert \lambda _{1} - \lambda _{2}\Vert _{G(x, \lambda , \varPsi )} :=\left\langle G(x, \lambda , \varPsi ) (\lambda _{1} - \lambda _{2}) , \lambda _{1} - \lambda _{2} \right\rangle ^{\frac{1}{2}}, \end{aligned}$$

which is well defined in view of (66) and (68).

Lemma 4

Let \(\delta , R>0\) and let \(c_{1}, c_{3} >0\) be the constants determined in (66) and (68). Then, the following facts hold:

(i)

    there exists a positive constant \(m_{1} = m_{1} (R)\) such that for every \(x \in \mathrm {B}_{R}\) and every \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\)

    $$\begin{aligned} m_{1} | \lambda _{1} - \lambda _{2} | \le {\mathsf {d}}_{(x, \varPsi )} ( \lambda _{1}, \lambda _{2}) \qquad \text {for every }\lambda _{1}, \lambda _{2} \in \varLambda _{n}\,; \end{aligned}$$
    (73)
(ii)

    there exist two positive constants \(m_{2} = m_{2} (\delta , R)\) and \(m_{3} = m_{3}(\delta , R)\) such that for every \(x \in \mathrm {B}_{R}\), every \(\varPsi \in {\mathcal {P}}(B^{Y}_{R})\), and every \(\lambda _{1}, \lambda _{2} \in \varLambda ^{\delta }_{n}\)

    $$\begin{aligned}&{\mathsf {d}}_{(x, \varPsi )} ( \lambda _{1} , \lambda _{2}) \le m_{2} | \lambda _{1} - \lambda _{2} |, \end{aligned}$$
    (74)
    $$\begin{aligned}&{\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2}) \le \Vert \lambda _{1} - \lambda _{2} \Vert _{G(x, \lambda _{1}, \varPsi )} + m_{3} | \lambda _{1} - \lambda _{2} |^{\frac{3}{2}}; \end{aligned}$$
    (75)
(iii)

    there exists a positive constant \(m_{4}= m_{4}(\delta , R)\) such that for every \(x \in \mathrm {B}_{R}\), every \(\varPsi \in {\mathcal {P}}(B^{Y}_{R})\), and every \(\lambda _{1}, \lambda _{2} \in \varLambda ^{\delta }_{n}\) satisfying

    $$\begin{aligned} \sqrt{\frac{c_{3}}{c_{1}}}\, | \lambda _{1} - \lambda _{2} | < \min \, \big \{ \mathrm {dist}(\lambda _{1}, \partial \varLambda ^{\delta }_{n}), \mathrm {dist}(\lambda _{2}, \partial \varLambda ^{\delta }_{n}) \big \} \end{aligned}$$
    (76)

    we have

    $$\begin{aligned} \Vert \lambda _{1} - \lambda _{2} \Vert _{G(x, \lambda _{1}, \varPsi )} \le {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2}) + m_{4} | \lambda _{1} - \lambda _{2} |^{\frac{3}{2}}. \end{aligned}$$
    (77)

Remark 4

The constant \(m_{1}(R)\) can be assumed to be decreasing with respect to R, while \(m_{2}(\delta , R)\), \(m_{3}(\delta , R)\), and \(m_{4}(\delta , R)\), can be assumed to be increasing with respect to R.

Proof

(Proof of Lemma 4) Point (i) is a consequence of (66). We now prove (ii). Given \(x \in \mathrm {B}_{R}\), \(\varPsi \in {\mathcal {P}}( \mathrm {B}^{Y}_{R})\), and \(\lambda _{1}, \lambda _{2} \in \varLambda ^{\delta }_{n}\), we have that the curve

$$\begin{aligned} \rho (s) :=(1- s) \lambda _{1} + s\lambda _{2} \qquad s \in [0,1] \end{aligned}$$

is a competitor for the infimum in the definition of \({\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2})\) in (65). Moreover, by convexity, \(\rho (s) \in \varLambda ^{\delta }_{n}\) for every \(s \in [0,1]\). Therefore, applying (iii) of Lemma 3 we get

$$\begin{aligned} {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2}) \le \int _{0}^{1} \left\langle G(x, \rho (s) , \varPsi ) (\lambda _{2} - \lambda _{1}), \lambda _{2} - \lambda _{1}\right\rangle ^{\frac{1}{2}} \mathrm {d}s \le \sqrt{c_{3}} | \lambda _{1} - \lambda _{2} |, \end{aligned}$$

which is (74) with \(m_{2} = \sqrt{c_{3}}\).

Combining instead (iii) and (iv) of Lemma 3, we can further estimate

$$\begin{aligned} {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2})\le & {} \int _{0}^{1} \left\langle G(x, \rho (s) , \varPsi ) (\lambda _{2} - \lambda _{1}), \lambda _{2} - \lambda _{1}\right\rangle ^{\frac{1}{2}} \mathrm {d}s \nonumber \\\le & {} \left\langle G(x, \lambda _{1} , \varPsi ) (\lambda _{2} - \lambda _{1}), \lambda _{2} - \lambda _{1}\right\rangle ^{\frac{1}{2}} \nonumber \\&+ \int _{0}^{1} \left| \left\langle \big ( G(x, \rho (s) , \varPsi ) - G(x, \lambda _{1}, \varPsi ) \big ) (\lambda _{2} - \lambda _{1}), \lambda _{2} - \lambda _{1} \right\rangle \right| ^{\frac{1}{2}} \mathrm {d}s \nonumber \\\le & {} \Vert \lambda _{1} - \lambda _{2} \Vert _{ G (x, \lambda _{1}, \varPsi ) } + \int _{0}^{1} \left( L_{ G, \delta , R } | \lambda _{1} - \rho (s) | \right) ^{\frac{1}{2}} | \lambda _{1} - \lambda _{2} | \, \mathrm {d}s \nonumber \\\le & {} \Vert \lambda _{1} - \lambda _{2} \Vert _{G(x, \lambda _{1}, \varPsi )} + \sqrt{L_{G, \delta , R}}\, |\lambda _{1} - \lambda _{2} | ^{\frac{3}{2}}, \end{aligned}$$
(78)

from which we conclude (75) with \(m_{3} = \sqrt{L_{G, \delta , R}}\).

Finally, let \(x\), \(\varPsi \), \(\lambda _{1}\), and \(\lambda _{2}\) be as in point (iii). For every \(\varepsilon >0\) let \(\rho _{\varepsilon } \in C^{1}([0,1]; \varLambda _{n})\) with \(\rho _{\varepsilon } (0) = \lambda _{1}\) and \(\rho _{\varepsilon } (1) = \lambda _{2}\) be such that

$$\begin{aligned} \int _{0}^{1} \left\langle G(x, \rho _{\varepsilon }(s) ,\varPsi ) \rho _{\varepsilon }'(s) , \rho _{\varepsilon }'(s) \right\rangle ^{\frac{1}{2}} \mathrm {d}s \le {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1} , \lambda _{2}) + \varepsilon \,. \end{aligned}$$
(79)

In view of (74) and of (66), we deduce from (79) that

$$\begin{aligned} \sqrt{c_{1}} \int _{0}^{1} | \rho _{\varepsilon }'(s) | \, \mathrm {d}s \le m_{2} | \lambda _{1} - \lambda _{2} | + \varepsilon = \sqrt{c_{3}} | \lambda _{1} - \lambda _{2} | + \varepsilon \,. \end{aligned}$$
(80)

Hence, (76) and (80) imply that

$$\begin{aligned} \int _{0}^{1} | \rho _{\varepsilon }'(s) | \, \mathrm {d}s < \min \, \big \{ \mathrm {dist}(\lambda _{1}, \partial \varLambda ^{\delta }_{n}), \mathrm {dist}(\lambda _{2}, \partial \varLambda ^{\delta }_{n}) \big \} + \frac{\varepsilon }{\sqrt{c_{1}}}\,. \end{aligned}$$

Therefore, for \(\varepsilon \) small enough we may assume that \(\rho _{\varepsilon }(s) \in \varLambda ^{\delta }_{n}\) for every \(s \in [0,1]\). For such \(\varepsilon \), we estimate

$$\begin{aligned} \begin{aligned} \Vert&\lambda _{1} - \lambda _{2} \Vert _{G(x, \lambda _{1}, \varPsi )} \le \int _{0}^{1} \left\langle G(x, \lambda _{1}, \varPsi ) \rho _{\varepsilon }'(s) , \rho '_{\varepsilon } (s) \right\rangle ^{\frac{1}{2}} \mathrm {d}s \\&\le \int _{0}^{1} \left\langle G(x, \rho _{\varepsilon }(s), \varPsi ) \rho _{\varepsilon }'(s) , \rho _{\varepsilon }' (s) \right\rangle ^{\frac{1}{2}} \mathrm {d}s \\&\qquad + \int _{0}^{1}\left| \left\langle \big ( G(x, \lambda _{1}, \varPsi ) - G( x, \rho _{\varepsilon }(s), \varPsi ) \big ) \rho _{\varepsilon }'(s) , \rho _{\varepsilon }' (s) \right\rangle \right| ^{\frac{1}{2}} \mathrm {d}s \\&\le {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2}) + \int _{0}^{1} \left| \left\langle \big ( G(x, \lambda _{1}, \varPsi ) - G( x, \rho _{\varepsilon }(s), \varPsi ) \big ) \rho _{\varepsilon }'(s) , \rho _{\varepsilon }' (s) \right\rangle \right| ^{\frac{1}{2}} \mathrm {d}s + \varepsilon \,. \end{aligned} \end{aligned}$$

Since \(\rho _{\varepsilon }(s) \in \varLambda ^{\delta }_{n}\) for every \(s \in [0,1]\), by (iv) of Lemma 3 and by (80), we have that

$$\begin{aligned} \begin{aligned} \Vert&\lambda _{1} - \lambda _{2} \Vert _{G(x, \lambda _{1}, \varPsi )} \\&\le {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2}) + \sqrt{L_{G, \delta , R}} \int _{0}^{1} | \lambda _{1} - \rho _{\varepsilon }(s) | ^{\frac{1}{2}} | \rho _{\varepsilon } ' (s) | \, \mathrm {d}s + \varepsilon \\&\le {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2}) + \sqrt{L_{G, \delta , R}} \bigg (\int _{0}^{1} | \rho _{\varepsilon } ' (s) | \, \mathrm {d}s \bigg )^{\frac{3}{2}} + \varepsilon \\&\le {\mathsf {d}}_{(x, \varPsi )} (\lambda _{1}, \lambda _{2}) + \sqrt{L_{G, \delta , R}}\, \bigg ( \frac{c_{3}}{c_{1}} \bigg )^{\frac{3}{4}} |\lambda _{1} - \lambda _{2}| ^{\frac{3}{2}}+ \varepsilon \bigg ( 1 + \sqrt{\frac{ L_{ G, \delta , R}}{c_{1}^{3/2}}} \bigg ). \end{aligned} \end{aligned}$$
(81)

Thus, we conclude (77) by passing to the limit in (81) as \(\varepsilon \rightarrow 0\). In particular, \(m_{4} = \sqrt{L_{G, \delta , R}} \big ( \frac{c_{3}}{c_{1}} \big )^{\frac{3}{4}}\). \(\square \)

We now rewrite the multi-step scheme presented in Sect. 4 in the language of Markov chains and show its short-time convergence to a solution of the continuity equation (7), where for \(\varPsi \in {\mathcal {P}}_{1}(Y)\) the field \(b_{\varPsi } :Y \rightarrow Y \) is now defined as

$$\begin{aligned} b_{\varPsi } ( x, \lambda ) :=\left( \begin{array}{cc} v_{\varPsi } (x, \lambda ) \\ {\mathcal {Q}}(x, \varPsi ) \lambda \end{array}\right) \end{aligned}$$

for a velocity field \(v_{\varPsi } :Y \rightarrow {\mathbb {R}}^{d}\) satisfying properties \((v_{1})\)–\((v_{3})\) of Sect. 2.

Let us fix a time step \(\tau _{k}>0\), \(k \in {\mathbb {N}}\), such that \(\tau _{k} \rightarrow 0 \) as \(k \rightarrow \infty \), and let \(t^{k}_{i} :=i \tau _{k}\) for \(i \in {\mathbb {N}}\). For \(i=0\) we set \(\varPsi ^{k}_{0} :=\widehat{\varPsi } \in {\mathcal {P}}_{1}(Y)\). For \(i\ge 0\), assume we are given \(\varPsi ^{k}_{i} \in {\mathcal {P}}_{1}(Y)\). Then, similarly to (39), the label of an agent sitting at position \(\hat{x} \in {\mathbb {R}}^{d}\) with label \(\hat{\lambda } \in {\overline{\varLambda }}_{n}\) is updated by solving the minimizing movement

$$\begin{aligned} \min \, \bigg \{ E ( \hat{x} , \lambda , \varPsi ^{k}_{i} ) + \frac{1}{2\tau _{k}} \, {\mathsf {d}}^{2}_{(\hat{x}, \varPsi ^{k}_{i})} ( \lambda , \hat{\lambda }) : \, \lambda \in {\overline{\varLambda }}_{n} \bigg \} \,. \end{aligned}$$
(82)

Since \({\overline{\varLambda }}_{n}\) is compact, (82) admits at least one solution \(\lambda _{(\hat{x}, \hat{\lambda }), i+1}\). Therefore, we can define \(\lambda ^{k}_{(\hat{x}, \hat{\lambda }), i+1}\), \(\varLambda ^{k}_{i+1}\), and \({\tilde{\varPsi }}^{k}_{i+1}\) exactly as in (9), (10), and (11), respectively. The step (12) in the space variable remains the same, and \(x^{k}_{(\hat{x}, \hat{\lambda }), i+1}\), \(X^{k}_{i+1}\), \(\varPsi ^{k}_{i+1}\) are as in (13), (14), and (15). Furthermore, we refer to (15), (16), and (17) for the definition of the interpolation curves \(\varPsi ^{k}\), \({\widetilde{\varPsi }}^{k}\), and \({\underline{\varPsi }}^{k}\).
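A Python sketch of one label update of this type: computing \({\mathsf {d}}_{(\hat{x}, \varPsi ^{k}_{i})}\) exactly would require solving the geodesic problem (65), so, in the spirit of the surrogate problem (88) below, the sketch replaces \({\mathsf {d}}^{2}\) by the frozen norm \(\Vert \cdot \Vert ^{2}_{G}\), which is accurate up to the \(|\lambda _{1} - \lambda _{2}|^{3/2}\) corrections quantified in (75) and (77). All data are hypothetical; G is computed as the pseudo-inverse of K.

import numpy as np
from scipy.optimize import minimize

def entropy_E(lam, sigma):
    return np.sum(lam * np.log(lam / sigma))  # entropy (63)

def mm_step(lam_hat, sigma, K, tau):
    # Minimizing movement (82) with d^2 replaced by the frozen norm ||.||_G^2
    G = np.linalg.pinv(K)                     # inverse of K on R^n_0
    obj = lambda lam: (entropy_E(lam, sigma)
                       + (lam - lam_hat) @ G @ (lam - lam_hat) / (2.0 * tau))
    return minimize(obj, lam_hat, method="SLSQP",
                    bounds=[(1e-10, 1.0)] * len(lam_hat),
                    constraints={"type": "eq",
                                 "fun": lambda lam: lam.sum() - 1.0}).x

# two-state example: rate a for 1 -> 0 (Q[0,1]) and rate b for 0 -> 1 (Q[1,0])
a, b = 1.0, 2.0
sigma = np.array([a, b]) / (a + b)            # detailed balance
lam = np.array([0.9, 0.1])
r = lam / sigma
Phi = (r[0] - r[1]) / (np.log(r[0]) - np.log(r[1]))
e = np.array([1.0, -1.0])
K = a * sigma[1] * Phi * np.outer(e, e)       # Onsager matrix (64) for n = 2

lam_new = mm_step(lam, sigma, K, tau=0.05)
assert entropy_E(lam_new, sigma) <= entropy_E(lam, sigma) + 1e-10

Each step decreases the entropy, since \(\hat{\lambda }\) itself is an admissible competitor in (82).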

Repeating step by step the proofs of Lemmas 1 and 2, we obtain the following uniform estimate on \(\varPsi ^{k}(t)\), \({\widetilde{\varPsi }}^{k}(t)\), and \({\underline{\varPsi }}^{k}(t)\).

Lemma 5

Let \(\widehat{\varPsi } \in {\mathcal {P}}_{c}(Y)\). Then there exists an increasing continuous function \(R:[0,+\infty ) \rightarrow [0,+\infty )\) such that for every \(T \in [0,+\infty )\), every \(k \in {\mathbb {N}}\), and every \(t\in [0,T]\), \(\varPsi ^{k}(t), {\underline{\varPsi }}^{k}(t), {\widetilde{\varPsi }}^{k}(t) \in {\mathcal {P}}(\mathrm {B}^{Y}_{R(T)})\).

Proof

The statement follows from the arguments of Lemmas 1 and 2. In particular, an explicit formula for R(T) as a function of \(T \in [0,+\infty )\) was given there, and it turns out to be continuous and increasing. \(\square \)

Also in the current setting, we need to write an approximate Euler–Lagrange equation associated with (82). This is done in Proposition 5, whose proof requires the following lemma.

Lemma 6

Let \(f:{\mathbb {R}}^{N}\rightarrow {\mathbb {R}}\cup \{+\infty \}\) be a convex function, let \(A\in {\mathbb {M}}^N\) be a symmetric and positive definite matrix, and let \(\Vert \cdot \Vert _A:{\mathbb {R}}^{N}\rightarrow [0,+\infty )\) be the norm associated with A, namely \(\Vert \xi \Vert _A^2:=\langle A\xi ,\xi \rangle \), for all \(\xi \in {\mathbb {R}}^{N}\). For a fixed \(\zeta \in {\mathbb {R}}^{N}\) and \(c>0\), assume that \(\xi _0\) solves

$$\begin{aligned} \min \big \{f(\xi )+c\Vert \xi -\zeta \Vert _A^2\big \}. \end{aligned}$$
(83)

Then \(\xi _0\) also solves

$$\begin{aligned} \min \big \{f(\xi )+c\Vert \xi -\zeta \Vert _A^2-c\Vert \xi -\xi _0 \Vert _A^2 \big \}. \end{aligned}$$
(84)

Proof

It is enough to observe that the problem (84) can be equivalently rewritten as

$$\begin{aligned} \min \big \{f(\xi )+2c \langle \xi , A(\xi _0-\zeta )\rangle \big \}; \end{aligned}$$

hence, it is a convex minimization problem. Since \(\xi _0\) solves (83), we have \(-2cA(\xi _0-\zeta )\in \partial f(\xi _0)\), which is exactly the Euler condition for the above problem.

\(\square \)
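Lemma 6 is also easy to test numerically: the minimizers of (83) and (84) coincide. Below is a minimal Python check with an arbitrary smooth convex f, a random symmetric positive definite A, and generic data; all choices are for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N = 3
B = rng.standard_normal((N, N))
A = B @ B.T + N * np.eye(N)                    # symmetric positive definite
zeta = rng.standard_normal(N)
c = 0.7
f = lambda xi: np.sum(xi**4) + np.sum(xi**2)   # any smooth convex f will do
nA2 = lambda xi: xi @ A @ xi                   # squared A-norm

obj83 = lambda xi: f(xi) + c * nA2(xi - zeta)        # problem (83)
xi0 = minimize(obj83, np.zeros(N)).x

obj84 = lambda xi: obj83(xi) - c * nA2(xi - xi0)     # problem (84)
xi1 = minimize(obj84, np.zeros(N)).x

print(np.allclose(xi0, xi1, atol=1e-4))              # expected: True
```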

Proposition 5

Let \(\delta , R > 0\) and let \(m_{1}(R)\), \(C_{E, \alpha , R}\), and \(L_{E, \delta , R}\) be the constants determined in Lemmas 3 and 4. Assume that \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\) and \(( x, \lambda ) \in \mathrm {B}^{Y}_{R, \delta }\), and let \({\tilde{\lambda }}\) be a solution to

$$\begin{aligned} \min \, \bigg \{E (x , \rho , \varPsi ) + \frac{1}{2 \tau _{k} } {\mathsf {d}}^{2}_{(x, \varPsi )} (\rho , \lambda ) : \, \rho \in {\overline{\varLambda }}_{n} \bigg \} \,. \end{aligned}$$
(85)

Then, the following facts hold:

  1. (i)

    for every \(\alpha \in (0, 1)\)

    $$\begin{aligned} | {{\tilde{\lambda }}} - \lambda | \le \bigg ( \frac{2 \, C_{E, \alpha , R}}{m_{1}^{2}}\bigg )^{1/( 2 - \alpha )}\, \tau _{k}^{1/( 2 - \alpha )}; \end{aligned}$$
    (86)
  2. (ii)

    if \({\tilde{\lambda }} \in \varLambda ^{\delta }_{n}\), then

    $$\begin{aligned} |{\tilde{\lambda }} - \lambda | \le \frac{2 \, L_{E, \delta , R}}{m_{1}^{2}} \, \tau _{k}\,; \end{aligned}$$
    (87)
  3. (iii)

    if \(\lambda , {\tilde{\lambda }} \in \varLambda ^{\delta }_{n}\) and \(\mu \) is the unique solution to

    $$\begin{aligned} \min \, \bigg \{ E(x, \rho , \varPsi ) + \frac{1}{2\tau _{k}} \Vert \rho - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2} : \, \rho \in {\overline{\varLambda }}_{n} \bigg \}, \end{aligned}$$
    (88)

    then, for every \(\alpha \in (0,1)\) we have

    $$\begin{aligned} | \mu - \lambda | \le \bigg ( \frac{2 \, C_{E, \alpha , R}}{m_{1}^{2}}\bigg )^{1/( 2 - \alpha )}\, \tau _{k}^{1/( 2 - \alpha )} \,. \end{aligned}$$
    (89)

    If, in addition, \(\mu \in \varLambda ^{\delta }_{n}\), then

    $$\begin{aligned} | \mu - \lambda | \le \frac{2\, L_{E, \delta , R}}{m_{1}^{2}} \, \tau _{k}\,. \end{aligned}$$
    (90)

    Finally, if \(\lambda , {\tilde{\lambda }} \in \varLambda ^{\delta }_{n}\) satisfy (76), there exists a positive constant \(C = C (\delta , R)\) such that

    $$\begin{aligned}&\bigg | \frac{{\tilde{\lambda }} - \lambda }{ \tau _{k}} - {\mathcal {Q}}(x, \varPsi ){\tilde{\lambda }} \bigg | \le C \tau _{k}^{1/4} \,. \end{aligned}$$
    (91)

Proof

By the minimality of \({\tilde{\lambda }}\), by (vi) of Lemma 3, and by (i) of Lemma 4 we have that for every \(\alpha \in (0, 1)\)

$$\begin{aligned} \frac{m_{1}^{2}}{2\tau _{k}} | {\tilde{\lambda }} - \lambda |^{2} \le \big | E(x, \lambda , \varPsi ) - E(x, {{\tilde{\lambda }}}, \varPsi ) \big | \le C_{E, \alpha , R} | {\tilde{\lambda }} - \lambda | ^{\alpha }, \end{aligned}$$
(92)

where \(m_{1} = m_{1}(R)\) and \(C_{E, \alpha , R}\) are defined in Lemmas 3 and 4, respectively. Dividing (92) by \(| {\tilde{\lambda }} - \lambda |^{\alpha }\) (the case \({\tilde{\lambda }} = \lambda \) being trivial) and raising to the power \(1/(2-\alpha )\), we deduce (86). In a similar way, we deduce (89), recalling that \(m_{1} = \sqrt{c_{1}}\), where \(c_{1}\) has been determined in (66).

If we further assume that \({{\tilde{\lambda }}} \in \varLambda ^{\delta }_{n}\), by minimality of \({{\tilde{\lambda }}}\), by (v) of Lemma 3, and by (i) of Lemma 4, we have that

$$\begin{aligned} \frac{m^{2}_{1}}{2\tau _{k}} | \lambda - {\tilde{\lambda }} |^{2} \le \frac{1}{2\tau _{k}} \, {\mathsf {d}}^{2}_{(x, \varPsi )} (\lambda , {\tilde{\lambda }} ) \le | E(x, \lambda , \varPsi ) - E(x, {\tilde{\lambda }}, \varPsi ) | \le L_{E, \delta , R} | \lambda - {\tilde{\lambda }} |\,. \end{aligned}$$

Hence, we deduce (87). Moreover, if \(\mu \in \varLambda ^{\delta }_{n}\), in the very same way we get (90).

In order to prove (91), we first estimate the Euclidean norm \(| \mu - {\tilde{\lambda }}|\). Denote by \(\chi _{{\overline{\varLambda }}_{n}}\) the characteristic function of the convex set \({\overline{\varLambda }}_{n}\) in the sense of convex analysis. Since \(E(x, \cdot , \varPsi )\) is convex in \({\overline{\varLambda }}_{n}\), we can apply Lemma 6 with \(f(\cdot )=E(x, \cdot , \varPsi )+\chi _{{\overline{\varLambda }}_{n}}(\cdot )\), \(\xi _0=\mu \), \(c=\frac{1}{2\tau _k}\), \(\zeta =\lambda \), and \(A=G(x, {\tilde{\lambda }}, \varPsi )\) obtaining

$$\begin{aligned} \begin{aligned} E(x, \mu , \varPsi )&+ \frac{1}{2\tau _{k}} \Vert \mu - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2} + \frac{1}{2\tau _{k}} \Vert \mu - {\tilde{\lambda }} \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2} \\&\le E(x, {\tilde{\lambda }}, \varPsi ) + \frac{1}{2\tau _{k}} \Vert {\tilde{\lambda }} - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi ) }^{2} \,. \end{aligned} \end{aligned}$$

Reordering the terms in the previous inequality and adding and subtracting on the right-hand side the terms \(\frac{1}{2\tau _{k}} {\mathsf {d}}^{2}_{(x, \varPsi )} ({\tilde{\lambda }}, \lambda )\) and \(\frac{1}{2\tau _{k}} {\mathsf {d}}^{2}_{(x, \varPsi )} ( \mu , \lambda )\), we obtain

$$\begin{aligned} \begin{aligned} \frac{1}{2\tau _{k}} \Vert \mu - {\tilde{\lambda }} \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2} \le&\ E(x, {\tilde{\lambda }}, \varPsi ) + \frac{1}{2\tau _{k}} {\mathsf {d}}^{2}_{(x, \varPsi )} ({\tilde{\lambda }}, \lambda ) \\&- E(x, \mu , \varPsi ) - \frac{1}{2\tau _{k}} {\mathsf {d}}^{2}_{(x, \varPsi )} ( \mu , \lambda ) \\&- \frac{1}{2\tau _{k}} \Vert \mu - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2}+ \frac{1}{2\tau _{k}} \Vert {\tilde{\lambda }} - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi ) }^{2} \\&+ \frac{1}{2\tau _{k}} {\mathsf {d}}^{2}_{(x, \varPsi )} ( \mu , \lambda ) - \frac{1}{2\tau _{k}} {\mathsf {d}}^{2}_{(x, \varPsi )} ({\tilde{\lambda }}, \lambda )\,. \end{aligned} \end{aligned}$$
(93)

By the minimality of \({\tilde{\lambda }}\), inequality (93) simplifies to

$$\begin{aligned} \Vert \mu - {\tilde{\lambda }} \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2} \le \Vert {\tilde{\lambda }} - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi ) }^{2} - \Vert \mu - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2} + {\mathsf {d}}^{2}_{(x, \varPsi )} ( \mu , \lambda ) - {\mathsf {d}}^{2}_{(x, \varPsi )} ({\tilde{\lambda }}, \lambda )\,. \end{aligned}$$
(94)

Since \(x \in \mathrm {B}_{R}\), \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), \(\lambda , {\tilde{\lambda }}, \mu \in \varLambda ^{\delta }_{n}\), and \(\lambda , {{\tilde{\lambda }}}\) satisfy (76), we deduce from (94), from (ii) of Lemma 3, and from (ii)–(iii) of Lemma 4 that

$$\begin{aligned} \begin{aligned} c_{1}^{2} | \mu - {{\tilde{\lambda }}} | ^{2} \le&\ \left( {\mathsf {d}}_{(x, \varPsi )} ({{\tilde{\lambda }}}, \lambda ) + m_{4} | {{\tilde{\lambda }}} - \lambda |^{\frac{3}{2}} \right) ^{2} \\&+ \left( \Vert \mu - \lambda \Vert _{G(x, {{\tilde{\lambda }}}, \varPsi )} + m_{3} | \mu - \lambda |^{\frac{3}{2}} \right) ^{2} \\&- \Vert \mu - \lambda \Vert _{G(x, {\tilde{\lambda }}, \varPsi )}^{2} - {\mathsf {d}}^{2}_{(x, \varPsi )} ({\tilde{\lambda }}, \lambda )\,. \end{aligned} \end{aligned}$$
(95)

Expanding the squares and using (iii) of Lemma 3 and (ii) of Lemma 4, we continue the estimate in (95) with

$$\begin{aligned} c_{1}^{2} | \mu - {{\tilde{\lambda }}} | ^{2} \le m_{4}^{2} | {\tilde{\lambda }} - \lambda | ^{3} + m_{3}^{2} | \mu - \lambda |^{3} + 2 \,m_{2} \, m_{4} | {\tilde{\lambda }} - \lambda |^{\frac{5}{2}} + 2\, \sqrt{c_{3}} \, m_{3} | \mu - \lambda |^{\frac{5}{2}}. \end{aligned}$$
(96)

Combining (96) with (87) and (90), we deduce

$$\begin{aligned} | \mu - {{\tilde{\lambda }}} | \le {\widetilde{C}} \tau _{k}^{5/4} \end{aligned}$$
(97)

for some positive constant \({\widetilde{C}} = {\widetilde{C}} (\delta , R)\) independent of k.

We are now in a position to conclude (91). The minimality of \(\mu \), indeed, implies that for every \(\xi \in {\mathbb {R}}^{n}_{0}\)

$$\begin{aligned} \left\langle D_{\lambda } E(x, \mu , \varPsi ) , \xi \right\rangle + \frac{1}{\tau _{k}} \big \langle G(x, {\tilde{\lambda }}, \varPsi ) (\mu - \lambda ) , \xi \big \rangle = 0 \,. \end{aligned}$$

By a simple algebraic manipulation, we rewrite the previous equality as

$$\begin{aligned} \begin{aligned} \left\langle D_{\lambda } E(x, \mu , \varPsi ) , \xi \right\rangle&+ \frac{1}{\tau _{k}} \big \langle G(x, \mu , \varPsi ) ({\tilde{\lambda }} - \lambda ) , \xi \big \rangle \\&= \frac{1}{\tau _{k}} \big \langle G(x, {{\tilde{\lambda }}} , \varPsi ) ( {\tilde{\lambda }} - \mu ) , \xi \big \rangle \\&\quad + \frac{1}{\tau _{k}} \big \langle \big (G(x, \mu , \varPsi ) - G(x, {\tilde{\lambda }}, \varPsi ) \big ) ( {\tilde{\lambda }} - \lambda ) , \xi \big \rangle . \end{aligned} \end{aligned}$$
(98)

Taking \(\xi = K^{\top }(x, \mu , \varPsi ) \omega \) for \(\omega \in {\mathbb {R}}^{n}\) in (98), we get that

$$\begin{aligned} \begin{aligned} K(x, \mu , \varPsi )\,D_{\lambda } E(x, \mu , \varPsi )&+ \frac{1}{\tau _k} ( {\tilde{\lambda }} - \lambda )= \frac{1}{\tau _{k}} K(x, \mu , \varPsi )\, G(x, {{\tilde{\lambda }}} , \varPsi ) ( {\tilde{\lambda }} - \mu ) \\&+ \frac{1}{\tau _{k}} K(x, \mu , \varPsi )\,\big (G(x, \mu , \varPsi ) - G(x, {\tilde{\lambda }}, \varPsi ) \big ) ( {\tilde{\lambda }} - \lambda ) . \end{aligned} \end{aligned}$$

Since \(K(x, \mu , \varPsi ) D_{\lambda } E(x, \mu , \varPsi ) = - {\mathcal {Q}}(x, \varPsi ) \mu \) (see [30, Theorem 3.1]), we actually have

$$\begin{aligned} \begin{aligned} - {\mathcal {Q}}(x, \varPsi ) {{\tilde{\lambda }}} +&\frac{1}{\tau _k} \big ( {\tilde{\lambda }} - \lambda \big ) = \frac{1}{\tau _{k}} K(x, \mu , \varPsi )\, G(x, {{\tilde{\lambda }}} , \varPsi ) \big ( {\tilde{\lambda }} - \mu \big ) \\&+ \frac{1}{\tau _{k}} K(x, \mu , \varPsi )\,\big (G(x, \mu , \varPsi ) - G(x, {\tilde{\lambda }}, \varPsi ) \big ) ( {\tilde{\lambda }} - \lambda ) \\&+ {\mathcal {Q}}(x, \varPsi ) ( {{\tilde{\lambda }}} - \mu ) \,. \end{aligned} \end{aligned}$$
(99)

Combining (ii)–(iv) of Lemma 3 with the inequalities (87), (90), and (97), with the identity (99), and with the assumptions \(x \in \mathrm {B}_{R}\), \(\varPsi \in {\mathcal {P}}(\mathrm {B}^{Y}_{R})\), and \( {\tilde{\lambda }}, \mu \in \varLambda ^{\delta }_{n}\), we get (91), and therefore, the proof is concluded. \(\square \)

Lemma 7

Let \(r > 0\), \(\eta> \delta >0\), and \(\widehat{\varPsi } \in {\mathcal {P}}( \mathrm {B}^{Y}_{r, \eta })\). Then, there exists \(T_{f} >0\) such that for every \(k \in {\mathbb {N}}\) large enough and every \(t < T_{f}\) the following hold:

  1. (i)

    \(\varPsi ^{k}(t), {\underline{\varPsi }}^{k}(t), {\widetilde{\varPsi }}^{k}(t) \in {\mathcal {P}}(\mathrm {B}^{Y}_{R(T_{f}), \delta })\), where \(R :[0,+\infty ) \rightarrow [0, +\infty )\) is the function determined in Lemma 5;

  2. (ii)

    if \(t \in [t^{k}_{i}, t^{k}_{i + 1})\) for some \(i \in {\mathbb {N}}\), for every \((x, \lambda ) \in \mathrm {spt}\, {\underline{\varPsi }}^{k} (t)\)

    $$\begin{aligned} \sqrt{\frac{c_{3}( \delta , R(T_{f}))}{c_{1}(R(T_{f}))}} \, | \varLambda ^{k}_{i+1} (t, x, \lambda ) - \lambda | < \min \, \left\{ \mathrm {dist}(\lambda , \partial \varLambda ^{\delta }_{n}) , \mathrm {dist}(\varLambda ^{k}_{i+1} (t, x, \lambda ), \partial \varLambda ^{\delta }_{n}) \right\} . \end{aligned}$$

Proof

Since \(\widehat{\varPsi } \in {\mathcal {P}}(\mathrm {B}_{r, \eta }^{Y})\), we deduce from Lemma 5 that for every \(T>0\), every \(k \in {\mathbb {N}}\), and every i such that \(i \tau _{k} \le T\) we have \(\varPsi ^{k}_{i}, {\widetilde{\varPsi }}^{k}_{i} \in {\mathcal {P}}(\mathrm {B}^{Y}_{R(T)})\). Hence, in order to prove the lemma we have to study the behaviour of the labels \(\lambda \in {\overline{\varLambda }}_{n}\) along the multi-step scheme.

Throughout the proof of the lemma, we denote by \(\lambda ^{k}(t, x_{0}, \lambda _{0})\) and \(x^{k}(t, x_{0}, \lambda _{0})\), for \((x_{0}, \lambda _{0}) \in \mathrm {spt}\, \widehat{\varPsi }\), the curves obtained by iteratively solving (82) and the difference equation (12) in each interval \([t^{k}_{i}, t^{k}_{i+1}]\), starting from \((x_{0}, \lambda _{0})\) at time \(t_0=0\) and using, at each node \(t^{k}_{i}\), \({\hat{\lambda }} = \lambda ^{k}(t^{k}_{i}, x_{0}, \lambda _{0})\) and \({\hat{x}} = x^{k}(t^{k}_{i}, x_{0}, \lambda _{0})\) as new initial conditions. As in (18), we define the piecewise constant interpolations \({\overline{x}}^{k} (t, x_{0}, \lambda _{0}), {\underline{x}}^{k} (t, x_{0}, \lambda _{0})\) and \({\overline{\lambda }}^{k} (t, x_{0}, \lambda _{0}), {\underline{\lambda }}^{k}(t, x_{0}, \lambda _{0})\).

The assumption \(\widehat{\varPsi } \in {\mathcal {P}}(\mathrm {B}^{Y}_{r, \eta })\) means that for every \((x_{0}, \lambda _{0}) \in \mathrm {spt}\widehat{\varPsi }\) we have \(\lambda _{0} \in \varLambda ^{\eta }_{n}\). Since the measures \(\varPsi ^{k}(t)\) and \({\widetilde{\varPsi }}^{k}(t)\) are supported on pairs of the form \((x^{k} (t, x_{0}, \lambda _{0}), \lambda ^{k} (t, x_{0}, \lambda _{0}))\) and \(({\underline{x}}^{k}(t, x_{0}, \lambda _{0}), {\overline{\lambda }}^{k}(t, x_{0}, \lambda _{0}))\), respectively, we are led to estimate the number of steps needed by (82) to exit \(\varLambda ^{\delta }_{n}\), knowing that the initial label \(\lambda _{0} \in \varLambda ^{\eta }_{n}\).

Let us fix \({\overline{\alpha }} \in (0,1)\). For \(k \in {\mathbb {N}}\) such that \(\tau _{k} \le 1\), we claim that properties (i) and (ii) hold with \(R = R( t^{k}_{i})\) for every \(t \in [0, t^{k}_{i}]\) as long as the following conditions are fulfilled:

$$\begin{aligned}&\sum _{j = 1}^{i-1} \frac{2 \, L(j-1, k)}{m_{1}^{2} (j-1, k)} \, \tau _{k} + \left( \frac{2 \, C(i-1, k) }{m_{1}^{2}(i-1, k)} \right) ^{1/(2 - {{\overline{\alpha }}})} \sqrt{\tau _{k}} < \eta - \delta , \end{aligned}$$
(100)
$$\begin{aligned}&\sum _{j = 1}^{i-1} \frac{2 \, L(j-1, k)}{m_{1}^{2} (j-1, k)} \, \tau _{k} + \frac{2 \, C(i-1, k)}{m_{1}^{2}(i-1, k)} \left( \frac{c_{3} (i-1, k)}{c_{1}(i-1, k) }\right) ^{1/2} \tau _{k} < \eta - \delta \,, \end{aligned}$$
(101)

where we have set \(L(j, k) :=L_{E, \delta , R(j\tau _{k})}\), \(C(j, k) :=C_{E, {{\overline{\alpha }}}, R(j\tau _{k})}\), \(m_{1}(j, k) :=m_{1}(R(j\tau _{k}))\), \(c_{1}(j, k) :=c_{1} (R(j\tau _{k}))\), and \(c_{3}(j, k) :=c_{3}(\delta , R(j\tau _{k}))\).

Taking the claim for granted, for every \(k \in {\mathbb {N}}\) let us denote by \(i_{k} \in {\mathbb {N}}\) the first index for which at least one of the two inequalities (100) or (101) is violated. For simplicity, let us assume that it is always (100) that is violated at \(i_{k}\). Hence,

$$\begin{aligned} \sum _{j =1}^{i_{k} - 1} \frac{2 \, L(j-1, k)}{m_{1}^{2} (j-1, k)} \, \tau _{k} \ge \eta - \delta - \left( \frac{2 \, C(i_{k} - 1, k) }{m_{1}^{2}(i_{k}-1, k)} \right) ^{1/(2 - {\overline{\alpha }})} \sqrt{\tau _{k}} \,. \end{aligned}$$

Since \(L_{E, \delta , R}\) is increasing with respect to R, \(m_{1}(R)\) is decreasing with respect to R, and R(t) determined in Lemma 5 is increasing with respect to t, we also have that

$$\begin{aligned} \frac{2 \, L(i_{k}-1, k)}{m_{1}^{2} (i_{k}-1, k)} \, (i_{k} - 1) \tau _{k} \ge \eta - \delta - \left( \frac{2 \, C(i_{k} - 1, k) }{m_{1}^{2}(i_{k}-1, k)} \right) ^{1/(2 - {\overline{\alpha }})} \sqrt{\tau _{k}}\,, \end{aligned}$$

from which we deduce that \(i_{k} \tau _{k}\) is bounded away from 0. Therefore, there exists \(T_{f}>0\) such that \(T_{f} < (i_{k} - 1) \tau _{k}\) for every k large enough. A similar estimate can be obtained if (101) is violated, and we conclude that there exists \(T_{f}>0\) such that (i) and (ii) hold.

Let us prove the claim. For fixed \(i \in {\mathbb {N}}\), assume that (100)–(101) hold and that \(\lambda ^{k}(t^{k}_{j}, x_{0}, \lambda _{0}) \in \varLambda ^{\delta }_{n}\) for \(0 \le j<i\). Then, by (ii) of Proposition 5 we have that for every \(1 \le j < i\)

$$\begin{aligned} | \lambda ^{k} (t^{k}_{j} , x_{0}, \lambda _{0}) - \lambda ^{k} (t^{k}_{j-1}, x_{0}, \lambda _{0}) | \le \frac{2 \, L(j-1, k)}{m_{1}^{2} (j-1, k)} \, \tau _{k} \,. \end{aligned}$$
(102)

By (i) of Proposition 5, we have, since \(\tau _{k} \le 1\),

$$\begin{aligned} | \lambda ^{k} (t^{k}_{i} , x_{0}, \lambda _{0}) - \lambda ^{k} (t^{k}_{i-1}, x_{0}, \lambda _{0}) | \le \left( \frac{2 \, C(i-1, k) }{m_{1}^{2}(i-1, k)} \right) ^{1/(2 - {\overline{\alpha }})} \sqrt{\tau _{k}} \,. \end{aligned}$$
(103)

Hence, by (100), (102), (103), and by the triangle inequality, we deduce that

$$\begin{aligned} \begin{aligned} | \lambda ^{k} (t^{k}_{i} , x_{0}, \lambda _{0}) - \lambda _{0} |&\le \sum _{j=1}^{i} | \lambda ^{k} (t^{k}_{j} , x_{0}, \lambda _{0}) - \lambda ^{k} (t^{k}_{j-1}, x_{0}, \lambda _{0}) | \\&\le \sum _{j = 1}^{i-1} \frac{2 \, L(j-1, k)}{m_{1}^{2} (j-1, k)} \, \tau _{k} + \left( \frac{2 \, C(i-1, k) }{m_{1}^{2}(i-1, k)} \right) ^{1/(2 - {\overline{\alpha }})} \sqrt{\tau _{k}} < \eta - \delta , \end{aligned} \end{aligned}$$

which implies that \(\lambda ^{k} (t^{k}_{i}, x_{0}, \lambda _{0}) \in \varLambda ^{\delta }_{n}\) as \(\lambda _{0} \in \varLambda ^{\eta }_{n}\). Since (100) is independent of the particular choice of the initial condition \((x_{0}, \lambda _{0}) \in \mathrm {spt}\, \widehat{\varPsi } \subseteq \mathrm {B}^{Y}_{r, \eta }\), we infer that \(\varPsi ^{k}(t), {\underline{\varPsi }}^{k} (t), {\widetilde{\varPsi }}^{k} (t) \in {\mathcal {P}}(\mathrm {B}^{Y}_{R(t^{k}_{i}), \delta })\) for every \(t \in [0, t^{k}_{i}]\).

Let us now denote by \(\mu ^{k}_{i} \in {\overline{\varLambda }}_{n}\) the solution to the minimum problem

$$\begin{aligned} \begin{aligned} \min _{\rho \in {\overline{\varLambda }}_{n}}\, \bigg \{&E \left( x^{k} (t^{k}_{i-1}, x_{0}, \lambda _{0}), \rho , \varPsi ^{k}_{i-1} \right) \\&+ \frac{1}{2 \tau _{k}} \Vert \rho - \lambda ^{k} (t^{k}_{i-1}, x_{0}, \lambda _{0}) \Vert ^{2}_{G(x^{k} (t^{k}_{i-1}, x_{0}, \lambda _{0}) , \lambda ^{k}(t^{k}_{i}, x_{0}, \lambda _{0}) , \varPsi ^{k}_{i-1})} \bigg \} \,. \end{aligned} \end{aligned}$$

Then, by (iii) of Proposition 5 we get that

$$\begin{aligned} | \mu ^{k}_{i} - \lambda ^{k} ( t^{k}_{i-1} , x_{0}, \lambda _{0}) | \le \left( \frac{2 \, C(i-1, k)}{m^{2}_{1}(i-1, k)} \right) ^{1/(2 - {\overline{\alpha }})} \sqrt{\tau _{k}} \,. \end{aligned}$$

Therefore, by the triangle inequality and by (100) we obtain

$$\begin{aligned} \begin{aligned} | \mu ^{k}_{i} - \lambda _{0} |&\le \sum _{j=1}^{i-1} | \lambda ^{k} (t^{k}_{j}, x_{0}, \lambda _{0} ) - \lambda ^{k}( t^{k}_{j-1}, x_{0}, \lambda _{0}) | + | \mu ^{k}_{i} - \lambda ^{k} (t^{k}_{i-1}, x_{0}, \lambda _{0}) | \\&\le \sum _{j = 1}^{i-1}\frac{2 \, L(j-1, k)}{m_{1}^{2} (j-1, k)} \, \tau _{k} + \left( \frac{2 \, C(i-1, k)}{m_{1}^{2} (i-1, k)} \right) ^{1/(2 - {\overline{\alpha }})} \sqrt{\tau _{k}} < \eta - \delta , \end{aligned} \end{aligned}$$

which yields \(\mu ^{k}_{i} \in \varLambda ^{\delta }_{n}\).

Since \(\lambda ^{k} (t^{k}_{j}, x_{0}, \lambda _{0} ) \in \varLambda ^{\delta }_{n}\) for \(0\le j \le i\), by (ii) of Proposition 5, by (101), and by (102), we have that

$$\begin{aligned} \begin{aligned} \sum _{j=1}^{i-1}&| \lambda ^{k} (t^{k}_{j}, x_{0}, \lambda _{0}) - \lambda ^{k} (t^{k}_{j-1}, x_{0}, \lambda _{0}) | \\&\quad + \left( \frac{c_{3} (i-1, k)}{c_{1}(i-1, k) }\right) ^{1/2} | \lambda ^{k} (t^{k}_{i}, x_{0}, \lambda _{0} ) - \lambda ^{k}( t^{k}_{i-1}, x_{0}, \lambda _{0}) | \\&\le \sum _{j = 1}^{i-1} \frac{2 \, L(j-1, k)}{m_{1}^{2} (j-1, k)} \, \tau _{k} + \frac{2 \, C(i-1, k)}{m_{1}^{2}(i-1, k)} \left( \frac{c_{3} (i-1, k)}{c_{1}(i-1, k) }\right) ^{1/2} \tau _{k} < \eta - \delta , \end{aligned} \end{aligned}$$

which in turn implies

$$\begin{aligned} \begin{aligned}&\left( \frac{c_{3} (i-1, k)}{c_{1}(i-1, k) }\right) ^{1/2} | \lambda ^{k} (t^{k}_{i}, x_{0}, \lambda _{0} ) - \lambda ^{k}( t^{k}_{i-1}, x_{0}, \lambda _{0}) | \\&< \min \, \Big \{ \mathrm {dist}\big ( \lambda ^{k} (t^{k}_{i}, x_{0}, \lambda _{0} ), \partial \varLambda ^{\delta }_{n} \big ) , \mathrm {dist}\big ( \lambda ^{k} (t^{k}_{i-1}, x_{0}, \lambda _{0} ), \partial \varLambda ^{\delta }_{n} \big ) \Big \} \,. \end{aligned} \end{aligned}$$
(104)

Since all the above estimates are independent of the particular choice of \((x_{0}, \lambda _{0}) \in \mathrm {spt}\, \widehat{\varPsi }\) and since, for \(t \in [t^{k}_{i-1}, t^{k}_{i})\), the measure \({\underline{\varPsi }}^{k}(t)\) has support

$$\begin{aligned} \mathrm {spt}\, {\underline{\varPsi }}^{k}(t) \subseteq \Big \{ \big (x^{k}(t^{k}_{i-1}, x_{0}, \lambda _{0}) , \lambda ^{k}( t^{k}_{i-1}, x_{0}, \lambda _{0}) \big ) : \, (x_{0}, \lambda _{0}) \in \mathrm {spt}\, \widehat{\varPsi } \Big \} \subseteq \mathrm {B}^{Y}_{R (t^{k}_{i-1} ) , \delta }\,, \end{aligned}$$

we deduce that (ii) holds. \(\square \)

We are now in a position to prove the short-time convergence of the multi-step Lagrangian scheme for reversible Markov chains. We start by establishing the analogues of Propositions 1 and 4.

Proposition 6

Let \(r>0\), \(\eta> \delta > 0\), \(\widehat{\varPsi } \in {\mathcal {P}}(\mathrm {B}^{Y}_{r, \eta })\), and let \(T_{f}>0\) be as in Lemma 7. Then, there exists \(C > 0\) such that for every \(\varphi \in C_{b}^{1}( {\mathbb {R}}^{d} \times {\overline{\varLambda }}_{n})\), every \(k \in {\mathbb {N}}\) large enough, every \(i \in {\mathbb {N}}\) such that \((i + 1) \tau _{k} <T_{f}\), and every \(t \in (t^{k}_{i}, t^{k}_{i+1})\),

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \int _{Y} \varphi (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t)(x, \lambda ) = \int _{Y} \nabla \varphi (x, \lambda ) \cdot b_{\varPsi ^{k}(t)} (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t) (x, \lambda ) + \vartheta _{k}(\varphi ), \end{aligned}$$
(105)

where \(|\vartheta _{k}(\varphi )| \le C \Vert \varphi \Vert _{C^{1}_{b}} \tau ^{1/4}_{k}\).

Proof

Throughout the proof, we denote by C a generic positive constant independent of i, k, t, and \(\varphi \), which may vary from line to line.

We follow step by step the proofs of Propositions 1 and 4. Let i and k be as in the statement of the proposition, and let us set \(R :=R(T_{f})\). For every test function \(\varphi \in C_{b}^{1}({\mathbb {R}}^{d} \times {\overline{\varLambda }}_{n})\) and every \(t \in (t^{k}_{i}, t^{k}_{i+1})\), by definition of \(\varPsi ^{k}(t)\) we have that

$$\begin{aligned} \begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}&\int _{Y} \varphi (x, \lambda ) \, \mathrm {d}\varPsi ^{k}(t)(x, \lambda ) = \frac{\mathrm {d}}{\mathrm {d}t} \int _{Y} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&= \int _{Y} \nabla _{x} \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot v_{{\widetilde{\varPsi }}^{k} (t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&\quad + \int _{Y} \nabla _{\lambda } \varphi \big ( X^{k}_{i+1}( t, x, \lambda ) , \varLambda ^{k}_{i+1} ( t, x, \lambda ) \big ) \cdot \frac{\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) - \lambda \big ) }{\tau _{k}} \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda )\,. \end{aligned} \end{aligned}$$
(106)

In order to deduce (105) from (106), we estimate

$$\begin{aligned}&\displaystyle I_{1} (x, \lambda ) :=\Big | v_{{\widetilde{\varPsi }}^{k}(t)} \big ( x , \varLambda ^{k}_{i+1} (t^{k}_{i+1}, x, \lambda ) \big ) - v_{\varPsi ^{k}(t)} \big ( X^{k}_{i+1} (t, x, \lambda ), \varLambda ^{k}_{i+1} (t , x, \lambda ) \big ) \Big |\,, \\&\displaystyle I_{2} (x, \lambda ) :=\bigg | \frac{\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) - \lambda \big ) }{\tau _{k}} - {\mathcal {Q}}\big ( X^{k}_{i+1} (t, x, \lambda ) , \varPsi ^{k}(t) \big ) \varLambda ^{k}_{i+1}(t, x, \lambda ) \bigg | \end{aligned}$$

for \((x, \lambda ) \in \mathrm {spt}\varPsi ^{k}_{i} \subseteq \mathrm {B}^{Y}_{R, \delta }\), the last inclusion being a consequence of Lemma 7.

Arguing as in the proof of (55)–(57) and using (87), we get that

$$\begin{aligned} \begin{aligned} I_{1}&\le L_{v, R} \int _{t^{k}_{i}}^{t^{k}_{i+1}} \bigg ( M_{v} \Big ( 1 + |x| + | \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) | + m_{1} ( {\widetilde{\varPsi }}^{k} (\tau )) \Big ) \\&\quad + \bigg | \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) - \lambda }{\tau _{k}} \bigg | \bigg ) \mathrm {d}\tau \\&\quad + L_{v, R} \, \tau _{k} \int _{Y} \bigg ( \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} ( x', \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x' , \lambda ' ) ) \big | \\&\quad + \bigg | \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x', \lambda ' ) - \lambda ' }{\tau _{k}} \bigg | \bigg ) \mathrm {d}\varPsi ^{k}_{i}( x' , \lambda ' ) \\&\le 4\,L_{v, R} \, M_{v} ( 1 + R) \tau _{k} + \frac{4 \, L_{E, \delta , R}}{m_{1}^{2}} \tau _{k} = C \, \tau _{k}\,. \end{aligned} \end{aligned}$$
(107)

Let us now estimate \(I_{2}\). By triangle inequality, we have

$$\begin{aligned} \begin{aligned} I_{2} \le&\ \bigg | \frac{\big ( \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) - \lambda \big ) }{\tau _{k}} - {\mathcal {Q}}\big ( x , \varPsi ^{k}_{i} \big ) \varLambda ^{k}_{i+1}(t^{k}_{i+1} , x, \lambda ) \bigg | \\&+ \big | {\mathcal {Q}}\big ( X^{k}_{i+1} (t, x, \lambda ) , \varPsi ^{k}(t) \big ) \varLambda ^{k}_{i+1}(t^{k}_{i+1}, x, \lambda ) - {\mathcal {Q}}\big ( x , \varPsi ^{k}_{i} \big ) \varLambda ^{k}_{i+1}(t, x, \lambda ) \big | \\&=: I_{2, 1} + I_{2, 2} \,. \end{aligned} \end{aligned}$$
(108)

By (iii) of Proposition 5 and by Lemma 7, we have that

$$\begin{aligned} I_{2,1} \le C \,\tau _{k}^{1/4} \,. \end{aligned}$$
(109)

By \(({\mathcal {Q}}_{2})\), \((v_{3})\), Lemmas 5 and 7, and (ii) of Proposition 5, we get

$$\begin{aligned} \begin{aligned} I_{2,2}&\le L_{{\mathcal {Q}}, R} \bigg ( \int _{t^{k}_{i}}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big ( x, \varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) \big ) \big | \, \mathrm {d}\tau \\&\quad + \int _{t}^{t^{k}_{i+1}} \bigg | \frac{\varLambda ^{k}_{i+1} ( t^{k}_{i+1}, x, \lambda ) - \lambda }{\tau _{k}} \bigg | \mathrm {d}\tau \bigg ) \\&\le 2 L_{{\mathcal {Q}}, R}\, M_{v} (1 + R)\, \tau _{k} + \frac{2 \, L_{E, \delta , R}}{m_{1}^{2}} \, \tau _{k} = C\, \tau _{k}\,. \end{aligned} \end{aligned}$$
(110)

Combining (109) and (110), we obtain that

$$\begin{aligned} I_{2} \le C \, \tau _{k}^{1/4}\,. \end{aligned}$$
(111)

Finally, equality (105) follows from (107) and (111) as in the proof of Propositions 1 and 4. \(\square \)

We conclude with the main result of this section.

Theorem 3

Let \(r>0\), \(\eta> \delta > 0\), and \(\widehat{\varPsi } \in {\mathcal {P}}(\mathrm {B}^{Y}_{r, \eta })\). Then, there exists \(T_{f}>0\) such that the sequence of curves \(\varPsi ^{k} :[0,T_{f}] \rightarrow {\mathcal {P}}_{1}(Y)\) converges to the unique solution \(\varPsi \in C([0,T_f]; {\mathcal {P}}_{1}(Y))\) of (7) in \(W_{1}\), uniformly with respect to \(t \in [0,T_f]\).

Proof

Let \(T_{f}>0\) be as in Lemma 7, so that the curves \(\varPsi ^{k}, {\widetilde{\varPsi }}^{k}\), and \({\underline{\varPsi }}^{k}\) are well defined in the interval \([0,T_{f}]\) and (105) holds. Since the operator \({\mathcal {T}}_{\varPsi }( x, \lambda ) :={\mathcal {Q}}(x, \varPsi ) \lambda \) satisfies properties \(({\mathcal {T}}_{0})\)–\(({\mathcal {T}}_{3})\), we only have to check that the sequence \(\varPsi ^{k}\) is compact in \(C([0,T_{f}]; {\mathcal {P}}_{1}(Y))\). The rest of the proof works as for Theorem 1, with the obvious modifications due to the fact that the remainder \(\vartheta _{k}\) in Proposition 6 is now controlled by \(\tau _{k}^{1/4}\) and not by \(\tau _{k}\).

In view of Lemma 7, it is enough to show that \(\varPsi ^{k}\) is equi-Lipschitz with respect to \(W_{1}\). Let us fix \(k \in {\mathbb {N}}\), \(i \in {\mathbb {N}}\) such that \(i \tau _{k} \le T_{f}\), and \(s \le t \in [t^{k}_{i}, t^{k}_{i+1}]\), and let \(R :=R(T_{f})\). Then,

$$\begin{aligned} \begin{aligned} W_{1}&( \varPsi ^{k}(t) , \varPsi ^{k}(s)) = \sup _{\eta \in \mathrm {Lip}_{1}(Y)} \bigg \{ \int _{Y} \eta (x, \lambda ) \, \mathrm {d}( \varPsi ^{k}(t) - \varPsi ^{k}(s))(x, \lambda ) \bigg \} \\&\le \int _{Y} \Big ( \big | X^{k}_{i+1}(t, x, \lambda ) - X^{k}_{i+1} (s, x, \lambda ) \big | \\&\quad + \big | \varLambda ^{k}_{i+1}(t, x, \lambda ) - \varLambda ^{k}_{i+1}(s, x, \lambda ) \big | \Big ) \, \mathrm {d}\varPsi ^{k}_{i} (x, \lambda ) \\&\le \int _{Y} \bigg ( \int _{s}^{t} \big | v_{{\widetilde{\varPsi }}^{k}(\tau )} \big (x, \varLambda ^{k}_{i+1} (\tau , x, \lambda ) \big ) \big |\, \mathrm {d}\tau \\&\quad + \int _{s}^{t} \bigg | \frac{\varLambda ^{k}_{i+1}( t^{k}_{i+1}, x, \lambda ) - \lambda }{\tau _{k}} \bigg | \, \mathrm {d}\tau \bigg )\, \mathrm {d}\varPsi ^{k}_{i}(x, \lambda ) \,. \end{aligned} \end{aligned}$$

Therefore, by \((v_2)\), Lemma 7, and Proposition 5 we get

$$\begin{aligned} W_{1} ( \varPsi ^{k}(t) , \varPsi ^{k}(s)) \le 2M_{v} (1 + R) |t - s| + \frac{2 \, L_{E, \delta , R}}{m_{1}^{2}} \, |t - s|, \end{aligned}$$

where the constants \(L_{E, \delta , R}\) and \(m_{1} = m_{1}(R)\) have been determined in Lemmas 3 and 4, respectively, and are independent of k, i, and t. \(\square \)

6 Concluding remarks

In this paper, we have proposed a multi-step Lagrangian scheme for spatially inhomogeneous evolutionary games. The scheme is fully explicit, traces the evolution of positions and labels along the characteristics, and consists of two approximation steps: in the first step, the agents update their beliefs on the strategy; in the second step, they update their position accordingly. Theorem 1 in Sect. 3 provides a general convergence result for the proposed scheme. Theorem 2 in Sect. 4 and Theorem 3 in Sect. 5 deal with the special cases of inhomogeneous replicator dynamics and reversible Markov chains, respectively. Unlike Theorem 1, they treat a variant of the scheme in which the explicit step for the update of the strategies is replaced by an implicit one, based on a minimizing movement justified by the gradient-flow structure of the problems at hand.

We notice that we presented the approximation steps in the natural order: Agents tend to update their information on the environment before changing their position. Performing the updates in the reverse order would lead to the same convergence result; see the sketch below. The partial update in (11) would have to be modified accordingly, namely by placing the \( id \) map in the second component and using the updated lifted map \(X_{i+1}^k\) in the first component (where now x evolves with the “old” label/strategy distribution).
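To fix ideas, here is a minimal particle-level sketch of the two-step loop in Python. The callables new_label and v are hypothetical stand-ins for the label update (82) (or its explicit counterpart) and the velocity field; swapping the two inner updates yields the reverse-order variant just discussed.

```python
import numpy as np

def evolve(X, L, v, new_label, tau, n_steps):
    """Two-step scheme: labels first, positions second.

    X: (N, d) array of positions; L: (N, n) array of labels/strategies.
    new_label(x, lam, X, L, tau) and v(x, lam, X, L) encode the model data
    (hypothetical signatures chosen for this sketch).
    """
    for _ in range(n_steps):
        # Step 1: every agent updates its label with the current positions.
        L = np.stack([new_label(X[i], L[i], X, L, tau) for i in range(len(X))])
        # Step 2: positions move with the freshly updated labels.
        X = X + tau * np.stack([v(X[i], L[i], X, L) for i in range(len(X))])
    return X, L
```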

Using an adaptive time step, chosen on the basis of the rate of change of positions and labels/strategies, deserves further analysis. We foresee that the same convergence result given by Theorem 1 holds, provided that the choice of the time step is such that

$$\begin{aligned} \lim _{k\rightarrow \infty } \,\max _{i\in \{0,\ldots ,k-1\}} \, (t^k_{i+1}-t^k_i)=0, \end{aligned}$$

where we notice that \(t^k_{i+1}-t^k_i=\tau _k\) is constant in our work.
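A hedged sketch of one such adaptive choice follows: the local step shrinks where positions and labels change quickly, while the factor 1/k enforces the vanishing-mesh condition displayed above. The callable rate is a hypothetical nonnegative estimator of the local rate of change and is not part of the analysis in this paper.

```python
def adaptive_grid(rate, T, k):
    """Nodes 0 = t_0 < t_1 < ... <= T with max_i (t_{i+1}^k - t_i^k) <= 1/k."""
    t, grid = 0.0, [0.0]
    while t < T:
        dt = 1.0 / (k * (1.0 + rate(t)))  # rate >= 0, hence dt <= 1/k -> 0 as k -> infinity
        t = min(t + dt, T)
        grid.append(t)
    return grid
```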