1 Introduction

Overview of the topic. After being introduced in statistical physics by Kac [22] and then by McKean [27] to describe the collisions between particles in a gas, the mean-field approximation has become a powerful tool to analyze the asymptotic behavior of systems of interacting agents in biology, sociology, and economics. We may mention, e.g., recent applications to the description of cell aggregation and motility [12, 23], coordinated animal motion [5], cooperative robots [13], influence of key investors in the stock market [8, Introduction], or the modeling of criminal activity [6, 14, 29].

The modeling of these systems is usually inspired by Newtonian laws of motion and is based on pairwise forces accounting for repulsion/attraction, alignment, and self-propulsion/friction in biological, social, or economic interactions. In this way, the evolution of N agents with time-dependent locations \(x^1_t,\dots ,x^N_t\) in \(\mathbb {R}^d\) is described by the ODE system

$$\begin{aligned} \dot{x}^i_t = \frac{1}{N}\sum \limits _{j=1}^N f(x^i_t,x^j_t) \quad \text {for }i=1,\dots ,N,\,\, t\in (0,T], \end{aligned}$$

where f is a prescribed interaction force between pairs of agents. The above first-order structure of multi-agent interactions appears, for instance, in some recent models of opinion formation [20], vehicular traffic flow [18], pedestrian motion [15], and synchronisation of chemical and biological oscillators in neuroscience [25].
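To make the first-order structure above concrete, the following minimal sketch integrates the system with an explicit Euler scheme in one space dimension. The linear attraction force, the step size, and the function names are illustrative assumptions of ours and are not taken from the cited models.

```python
# Explicit Euler sketch for dx^i/dt = (1/N) * sum_j f(x^i, x^j) in dimension d = 1.
# The force below (linear attraction) is an illustrative assumption.

def attraction(xi, xj):
    """Hypothetical pairwise force: agent i is pulled towards agent j."""
    return xj - xi

def euler_step(xs, f, dt):
    """Advance all N positions by one explicit Euler step of size dt."""
    n = len(xs)
    return [xi + dt * sum(f(xi, xj) for xj in xs) / n for xi in xs]

def simulate(xs, f, dt, steps):
    """Iterate the Euler step a fixed number of times."""
    for _ in range(steps):
        xs = euler_step(xs, f, dt)
    return xs

positions = simulate([0.0, 1.0, 4.0], attraction, dt=0.01, steps=1000)
mean_position = sum(positions) / len(positions)
spread = max(positions) - min(positions)
```

For this particular choice of f, each agent relaxes towards the barycenter, so the mean position is conserved while the spread contracts by the factor (1 - dt) at every step.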

Another context, where this approach has proved to be a useful one, is that of evolutionary games, where players are simultaneously willing to optimize their cost: this includes game theoretic models of evolution [21] or mean-field games ([11, 26]) in order to describe consensus problems. In this latter setting, the notion of spatially inhomogeneous evolutionary games has been recently proposed [3] (see also [2] for a related numerical scheme). There, the dynamics is not the outcome of an underlying non-local optimal control problem, but is determined by the agents’ local (in time and space) decisions, as in the well-known replicator dynamics [21].

We give an overview of the model in [3], which is relevant for the purpose of the paper. The position of an agent is described by \(x \in \mathbb {R}^d\), while U denotes the set of pure strategies. A pay-off function \(J:(\mathbb {R}^d\times U)^2\rightarrow \mathbb {R}\) is given, so that \(J(x,u,x',u')\) is the pay-off that a player in position x gets playing pure strategy u against a player in position \(x'\) with pure strategy \(u'\). However, agents are assumed to play different strategies according to a probability measure \(\sigma \in \mathcal {P}(U)\), which is referred to as a mixed strategy. Hence, the state variable is given by the pair \((x,\sigma )\) accounting for the position and the mixed strategy of an agent and

$$\begin{aligned} \int _U J(x,u,x',u')\,\textrm{d}\sigma '(u') \end{aligned}$$

is the pay-off that a player in position x gets playing strategy u against a player in position \(x'\) with mixed strategy \(\sigma '\). If we then consider N agents, whose states are denoted by \((x^i_t,\sigma ^i_t)\), \(i=1,\dots ,N\), the pay-off that the i-th player gets playing strategy u against all the other players at time t is

$$\begin{aligned} {\mathcal {J}}(x_t^i,u):=\frac{1}{N}\sum \limits _{j=1}^N \int _U J(x^i_t,u,x^j_t,u')\,\textrm{d}\sigma _t^j(u'). \end{aligned}$$

In order to maximize this pay-off, the i-th player has to compare it with the mean pay-off over all possible strategies according to their mixed strategy \(\sigma ^i_t\). This leads us to the system of ODEs

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{x}_t^i = v(x^i_t,\sigma ^i_t) \\ \dot{\sigma }_t^i = \left( {\mathcal {J}}(x_t^i,\cdot )-\displaystyle {\int _U} {\mathcal {J}}(x_t^i,v)\,\textrm{d}\sigma _t^i(v)\right) \sigma _t^i \end{array}\right. }\qquad \text {for }i=1,\dots ,N,\,\,t\in (0,T]. \end{aligned}$$
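The strategy equation of the system above can be sketched numerically as follows. We assume, purely for illustration, a finite set of pure strategies and a pay-off that does not depend on the positions, so that only the mixed strategies evolve; the pay-off matrix and all names are hypothetical.

```python
# Explicit Euler sketch of the replicator equation for N agents and a finite
# strategy set U = {0, ..., K-1}. Assumption (ours): the pay-off J does not
# depend on positions, so the position equation is dropped.

PAYOFF = [[1.0, 0.0],
          [0.0, 2.0]]  # hypothetical pay-off J(u, u')

def mean_field_payoff(sigmas):
    """Return [J(u)]_u with J(u) = (1/N) * sum_j sum_u' J(u, u') * sigma^j(u')."""
    n = len(sigmas)
    k = len(PAYOFF)
    return [sum(PAYOFF[u][w] * s[w] for s in sigmas for w in range(k)) / n
            for u in range(k)]

def replicator_step(sigmas, dt):
    """One Euler step of d(sigma^i)/dt = (J(.) - <J, sigma^i>) * sigma^i."""
    pay = mean_field_payoff(sigmas)
    new = []
    for s in sigmas:
        avg = sum(p * su for p, su in zip(pay, s))  # mean pay-off under sigma^i
        new.append([su + dt * (p - avg) * su for p, su in zip(pay, s)])
    return new

strategies = [[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]
for _ in range(200):
    strategies = replicator_step(strategies, dt=0.01)
```

Since the right-hand side of the strategy equation has zero total mass whenever the current strategy is a probability vector, each Euler step keeps the total mass equal to one up to rounding errors; positivity is preserved here because the step size is small compared with the size of the pay-offs.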

In the later contribution [28], the well-posedness theory as well as the mean-field approximation of the above system have been inserted in a more general framework which is suitable for a broader range of applications. In this setting, the velocity v of each agent also depends on the behavior of the other ones, and the replicator dynamics for the strategies is replaced by a more general vector field \(\mathcal {T}\), that is

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{x}_t^i = v_{\Lambda _t^N}(x^i_t,\sigma _t^i)\\ \dot{\sigma }_t^i = \mathcal {T}_{\Lambda _t^N}(x_t^i,\sigma _t^i) \end{array}\right. }\qquad \text {for }i=1,\dots ,N,\,\,t\in (0,T], \end{aligned}$$

where \(\Lambda _t^N = \frac{1}{N}\sum _{j=1}^N \delta _{(x_t^j,\sigma ^j_t)}\in \mathcal {P}(\mathbb {R}^d\times \mathcal {P}(U))\) is the empirical distribution of agents with strategies at time t. The interpretation, given in [28], of these types of systems has a wider scope than the one of game theory: the interacting agents are assumed to belong to a number of different species, or populations, and therefore, more generally, we deal with labels \(\ell ^i\) instead of (mixed) strategies \(\sigma ^i\). This point of view can be used to distinguish informed agents steering pedestrians, to highlight the influence of few key investors in the stock market, or to distinguish leaders from followers in opinion formation models. Throughout this work, we will adopt this perspective. Under a rather general set of assumptions on v and \(\mathcal {T}\) (which, in particular, encompass the case of the replicator dynamics), it has been shown in [28] that the empirical measures \(\Lambda _t^N\) associated with system (1.1) converge to a probability measure on the state space, which solves the continuity equation

$$\begin{aligned} \partial _t\Lambda _t + \text {div}(b_{\Lambda _t}\,\Lambda _t) = 0, \end{aligned}$$

where \(b_{\Lambda _t}\) is the vector field which drives the state in system (1.1).

In [7], a further research direction has been explored. There, the replicator equation is slightly modified by adding an entropy regularization \({\mathcal {H}}\), see (1.3) below. Besides providing a mean-field theory for such systems, the authors discuss the fast reaction limit scenario, modeling situations in which the strategy (or label) switching of particles in the system actually happens at a faster time scale than that of the agents’ dynamics. This leads us to the purpose of our paper.

Contribution of the Present Work. In the present paper, we complement the abstract framework of [28] by adding an entropy regularization and we analyze its effects on the dynamics from an abstract point of view. We fix a reference probability measure \(\eta \in \mathcal {P}(U)\) and we consider only diffuse probability densities \(\ell \) with respect to \(\eta \). We set

$$\begin{aligned} \mathcal {H}(\ell ) :=\ell \big [I(\ell )-\log (\ell )\big ], \end{aligned}$$

where \(I(\ell )\) is the negative entropy of the probability density \(\ell \), namely

$$\begin{aligned} I(\ell ):=\int _{U} \ell (u)\,\log (\ell (u)) \,\textrm{d}\eta (u). \end{aligned}$$

Then we analyze the system

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{x}_t^i = v_{\Lambda _t^N}(x^i_t,\ell _t^{\,i})\\[2mm] \dot{\ell }_t^{\,i} = \lambda \,[\mathcal {T}_{\Lambda _t^N}(x_t^i,\ell _t^{\,i})+\varepsilon \, {\mathcal {H}}(\ell _t^{\,i})] \end{array}\right. }\qquad i=1,\dots ,N,\,\,t\in (0,T], \end{aligned}$$

where \(\ell ^{\,i}_t\) denotes the label of the i-th agent, \(\varepsilon >0\) is a small parameter which modulates the intensity of the entropy functional, and \(\lambda \ge 1\) takes into account the possible time scale difference between the positions and labels dynamics. In the particular case where \(\mathcal {T}_\Lambda \) is the operator of the replicator dynamics, this is exactly the system considered in [7]. The motivation for this regularization has already been discussed in [7]: it serves to avoid degeneracy of the labels (see [7, Example 2.1] for a precise discussion) and allows for faster reactions to changes in the environment. We also refer to [16] for an earlier contribution on entropic regularizations in a game-theoretical setting.

From the mathematical point of view, the state space for the labels now becomes \(\mathcal {P}(U)\cap L^p(U,\eta )\) for some \(p>1\). As non-degeneracy is a desirable feature also for the wider setting considered in [28], our first goal is then to establish a well-posedness theory in a similar spirit for system (1.4). As in [28], a crucial point is giving a suitable set of assumptions on the dynamics which allows one to rely on the stability estimates for ODEs in convex subsets of Banach spaces developed in [9, Section I.3, Theorem 1.4, Corollary 1.1] and recalled in Theorem 2.1 below. In particular, a sufficient set of assumptions on the operator \(\mathcal {T}\) which complies with this setting is given at the beginning of Sect. 3, see (T1)–(T3). It slightly adapts and, to some extent, simplifies the assumptions of [28], since here we are only considering the case of diffuse measures, and comprises both the case of the replicator dynamics and some models of leader-follower interactions with label switching modeled by reversible Markov chains [2] (see Remark 3.1).

The well-posedness of the particle model is proved in Theorem 3.3 as a consequence of the estimates in Proposition 3.2. The convergence to a mean-field limit is discussed in the subsequent Sect. 4. In Sect. 5, instead, we focus on the special case of replicator-type models and revisit the results of [7] from an abstract and more general point of view, which may also account for further modeling possibilities.

More precisely, we assume that the operator \(\mathcal {T}\) takes the form

$$\begin{aligned} \mathcal {T}_{\Lambda } (x, \ell ) :=\left( \int _{U} \partial _\xi F_{\mu } ( x , \ell (u) , u ) \ell (u) \, \textrm{d}\eta (u) - \partial _\xi F_{\mu } ( x , \ell , \cdot ) \right) \ell , \end{aligned}$$

for \(x \in \mathbb {R}^{d}\) and \(\ell \in \mathcal {P}(U)\cap L^{p}(U, \eta )\), and where \(\mu \) is the marginal of \(\Lambda \) in \(\mathbb {R}^{d}\). In (1.5), \(\partial _\xi \) denotes the derivative of F with respect to its second variable.

As we discuss in Remark 5.1, for a proper choice of \(F_\mu \), the above setting encompasses the case of undisclosed replicator dynamics. By undisclosed it is meant that the players are not aware of their opponents’ strategies. This is exactly the case dealt with in [7]; see [7, Remark 2.9] for the difficulties connected to the fast reaction limit in the general case. We stress, however, that (1.5) has a more flexible structure than the case-study of the replicator dynamics. For instance, as we discuss again in Remark 5.1, it allows one to consider pay-offs depending also on how often a strategy is played, penalizing choices that become predictable by other players. From the mathematical point of view, examples of functions fulfilling our hypotheses (F1)–(F5) of Sect. 5 are discussed in Proposition 5.2.

For a system of the form (1.4) with \(\mathcal {T}\) given by (1.5), we perform the fast reaction limit \(\lambda \rightarrow +\infty \). This corresponds to the reasonable modeling assumption that the label dynamics takes place at a much faster rate than the spatial dynamics. In Theorem 5.12 we prove the convergence of system (1.4)–(1.5) to a Newton-like system of the form

$$\begin{aligned} \dot{x}_t^i = v_{\Lambda _t^N}(x_t^i,\ell _t^{*\,i}(x_t^1,\dots ,x_t^N)),\qquad \text {for }i=1,\dots ,N,\,\,t\in (0,T], \end{aligned}$$

where \(\ell _t^{*\,i}\) minimizes the functional

$$\begin{aligned} G_{\mu } (x,\ell ) :=\int _{U} \big ( F_{\mu }(x,\ell (u),u) + \varepsilon \ell (u) ( \log (\ell (u)) - 1) \big )\,\textrm{d}\eta (u), \qquad \text { for }\ell \in C_{\varepsilon } \end{aligned}$$

for fixed x and \(\mu \). We stress that, differently from [7], we do not need to explicitly compute the minimizer as it was done in the special case of the replicator dynamics. We remark that a crucial assumption for our proofs in Sect. 5 is convexity of the function F with respect to \(\ell \) and actually our proofs are guided by the heuristic intuition that, for fixed x and \(\mu \), the label equation in (1.4)–(1.5) is the formal gradient flow of (1.6) with respect to the spherical Hellinger distance of probability measures [24] (see also [2]). However, we provide explicit computations which do not resort to this gradient flow structure.
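The gradient-flow heuristic can be illustrated on a toy example. In the sketch below we assume a label set with two elements, a uniform reference measure, \(\lambda = 1\), and the choice \(F(x,\xi ,u) = -J(u)\,\xi \) with a position-independent pay-off J; these are our own simplifying assumptions, meant to mimic a replicator-type structure, and are not the setting of Sect. 5. Under these assumptions a Lagrange multiplier computation shows that the minimizer of the functional among probability densities is the Gibbs density proportional to \(\exp (J/\varepsilon )\), and the entropic label dynamics relaxes to it.

```python
import math

# Assumptions (ours, for illustration): U = {0, 1}, uniform eta,
# a position-independent pay-off J(u), lambda = 1, F(x, xi, u) = -J(u) * xi.
ETA = [0.5, 0.5]
J = [0.0, 1.0]
EPS = 0.5

def neg_entropy(ell):
    """I(ell) = sum_u ell(u) * log(ell(u)) * eta(u)."""
    return sum(l * math.log(l) * w for l, w in zip(ell, ETA))

def entropic_replicator_step(ell, dt):
    """Euler step of d(ell)/dt = (J - <J, ell>) * ell + eps * ell * (I(ell) - log(ell))."""
    avg = sum(j * l * w for j, l, w in zip(J, ell, ETA))
    ent = neg_entropy(ell)
    return [l + dt * l * (j - avg + EPS * (ent - math.log(l)))
            for j, l in zip(J, ell)]

def gibbs_density():
    """Density proportional to exp(J / eps), normalized with respect to eta."""
    weights = [math.exp(j / EPS) for j in J]
    z = sum(wt * w for wt, w in zip(weights, ETA))
    return [wt / z for wt in weights]

ell = [1.0, 1.0]  # uniform initial density
for _ in range(4000):
    ell = entropic_replicator_step(ell, dt=0.01)
target = gibbs_density()
```

In this toy case the long-time state of the label equation and the minimizer coincide up to the tolerance of the scheme, which is consistent with the heuristic described above; the sketch is not a substitute for the proofs, which do not rely on an explicit minimizer.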

Outlook. The present paper provides the well-posedness theory and the mean-field approximation for multi-population agent-based systems with an entropic regularization on the labels. We remark that such a regularization in the trajectories prevents concentration in the space of labels. An analogous role could be played by diffusive terms in the space of positions, whose effects we plan to address in future contributions. We also provide an abstract structure on the evolution of the labels to perform fast reaction limits, which in particular contains the special case of [7]. On the one hand, the assumption that one agent is not fully aware of the label distribution of the other ones (the so-called undisclosed setting we consider here) is realistic in many applications. On the other hand, it would be interesting to single out the right assumptions to overcome this restriction while performing the fast reaction limit, for instance allowing one to consider F depending on the whole \(\Lambda \), and not only on the marginal \(\mu \), in (1.5).

Overview of the Paper. In Sect. 2, we present our notation, recall some tools of functional analysis and measure theory, and outline the basic settings of the problem. In Sect. 3, we present the general assumptions and we study the entropic dynamical system (1.4), proving its well-posedness. In Sect. 4, we prove the mean-field limit of (1.4) to a continuity equation such as (1.2). In Sect. 5, we obtain the fast reaction limit of system (1.4), together with the explicit rate of convergence in terms of the parameter \(\lambda \).

2 Preliminaries

2.1 Basic Notation

If \(({\mathcal {X}},\textsf{d}_{\mathcal {X}})\) is a metric space we denote by \(\mathcal {P}({\mathcal {X}})\) the space of probability measures on \({\mathcal {X}}\). The notation \(\mathcal {P}_c({\mathcal {X}})\) will be used for probability measures on \({\mathcal {X}}\) having compact support. We denote by \(C_0({\mathcal {X}})\) the space of continuous functions vanishing at the boundary of \({\mathcal {X}}\), and by \(C_b({\mathcal {X}})\) the space of bounded continuous functions. Whenever \({\mathcal {X}}=\mathbb {R}^d\), \(d\ge 1\), it remains understood that it is endowed with the Euclidean norm (and induced distance), which shall be simply denoted by \(\vert \cdot \vert \). For a Lipschitz function \(f:{\mathcal {X}}\rightarrow \mathbb {R}\) we denote by

$$\begin{aligned} \textrm{Lip}(f):=\sup _{\begin{array}{c} x,\,y\, \in {\mathcal {X}} \\ x \ne y \end{array}}\dfrac{\vert f(x)-f(y)\vert }{\textsf{d}_{\mathcal {X}}(x,y)} \end{aligned}$$

its Lipschitz constant. The notations \(\textrm{Lip}({\mathcal {X}})\) and \(\textrm{Lip}_b ({\mathcal {X}})\) will be used for the spaces of Lipschitz and bounded Lipschitz functions on \({\mathcal {X}}\), respectively. Both are normed spaces with the norm \(\Vert f \Vert :=\Vert f \Vert _\infty + \textrm{Lip}(f)\), where \(\Vert \cdot \Vert _\infty \) is the supremum norm. In a complete and separable metric space \(({\mathcal {X}},\textsf{d}_{\mathcal {X}})\), we shall use the Kantorovich-Rubinstein distance \(\mathcal {W}_1\) in the class of \(\mathcal {P}({\mathcal {X}})\), defined as

$$\begin{aligned} \mathcal {W}_1(\mu ,\nu ):=\sup \left\{ \,\int _{{\mathcal {X}}}\varphi (x)\,\textrm{d}\mu (x)-\int _{{\mathcal {X}}}\varphi (x)\,\textrm{d}\nu (x)\,:\varphi \in \textrm{Lip}_b ({\mathcal {X}}),\,\textrm{Lip}(\varphi )\le 1 \right\} \end{aligned}$$

or, equivalently (thanks to the Kantorovich duality), as

$$\begin{aligned} \mathcal {W}_1(\mu ,\nu ):=\inf \left\{ \,\int _{{\mathcal {X}}\times {\mathcal {X}}}\textsf{d}_{\mathcal {X}}(x,y)\,\textrm{d}\Pi (x,y)\,:\Pi (A\times {\mathcal {X}})=\mu (A),\,\,\Pi ({\mathcal {X}}\times B)=\nu (B)\right\} , \end{aligned}$$

involving couplings \(\Pi \) of \(\mu \) and \(\nu \). It can be proved that the infimum is actually attained. Notice that \(\mathcal {W}_1(\mu ,\nu )\) is finite if \(\mu \) and \(\nu \) belong to the space

$$\begin{aligned} \mathcal {P}_1({\mathcal {X}}) \,:=\left\{ \mu \in \mathcal {P}({\mathcal {X}}):\int _{\mathcal {X}} \textsf{d}_{\mathcal {X}}(x,\overline{x})\,\textrm{d}\mu (x)<+\infty \text { for some } \overline{x}\in {\mathcal {X}} \right\} \end{aligned}$$

and that \((\mathcal {P}_1({\mathcal {X}}),\mathcal {W}_1)\) is complete if \(({\mathcal {X}},\textsf{d}_{\mathcal {X}})\) is complete. For a probability measure \(\mu \in \mathcal {P}({\mathcal {X}})\), if \({\mathcal {X}}\) is also a Banach space, we define the first moment \(m_1(\mathcal {\mu })\) as

$$\begin{aligned} m_1(\mu ):=\int _{\mathcal {X}} \Vert x \Vert _{{\mathcal {X}}}\,\textrm{d}\mu (x). \end{aligned}$$

Hence, the finiteness of the integral above is equivalent to \(\mu \in \mathcal {P}_1({\mathcal {X}})\) whenever the distance \(\textsf{d}_{\mathcal {X}}\) is induced by the norm \(\Vert \cdot \Vert _{{\mathcal {X}}}\).
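As a small illustration of the distance \(\mathcal {W}_1\) and of the first moment, consider two empirical measures on the real line with the same number of atoms. In this special case it is a standard fact that an optimal coupling matches the sorted atoms. The sketch below uses this fact; the function names and the sample data are ours.

```python
def wasserstein1_empirical(xs, ys):
    """W_1 between (1/n) sum_i delta_{x_i} and (1/n) sum_i delta_{y_i} on the real line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes the same number of atoms")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def first_moment(xs):
    """m_1 of the empirical measure (1/n) sum_i delta_{x_i}."""
    return sum(abs(x) for x in xs) / len(xs)

mu = [0.0, 1.0, 3.0]
nu = [2.0, 0.5, 5.0]
w1 = wasserstein1_empirical(mu, nu)

# Lower bound from the dual formulation with the 1-Lipschitz function phi(x) = x
# (truncated outside the supports, it becomes an admissible bounded test function).
dual_lower_bound = abs(sum(mu) / len(mu) - sum(nu) / len(nu))

# Upper bound from the triangle inequality through the Dirac mass at the origin.
moment_upper_bound = first_moment(mu) + first_moment(nu)
```

The two bounds show how the dual and the coupling formulations bracket the distance, and why finite first moments guarantee that \(\mathcal {W}_1\) is finite.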

Let \(\mu \in \mathcal {P}({\mathcal {X}})\) and let \(f:{\mathcal {X}}\rightarrow Z\) be a \(\mu \)-measurable function. The push-forward measure \(f_\# \mu \in \mathcal {P}(Z)\) is defined by \(f_\# \mu (B) = \mu (f^{-1}(B))\) for any Borel set \(B\subset Z\). Moreover, the change of variables formula

$$\begin{aligned} \int _Z g(z)\,\textrm{d}f_\#\mu (z) = \int _{\mathcal {X}} g(f(x))\,\textrm{d}\mu (x) \end{aligned}$$

holds whenever either one of the integrals is well defined.

For a Banach space E, the notation \(C^1_b(E)\) will be used to denote the subspace of \(C_b(E)\) consisting of functions having bounded continuous Fréchet differential at each point. The notation \(D\phi (\cdot )\) will be used to denote the Fréchet differential. In the case of a function \(\phi :[0,\,T]\times E \rightarrow \mathbb {R}\), the symbol \(\partial _t\) will be used to denote partial differentiation with respect to t, while D will only stand for the differentiation with respect to the variables in E.

2.2 Functional Setting

The space of labels \((U,\textsf{d})\) will be assumed to be a compact metric space. Consider the Borel \(\sigma \)-algebra \(\mathfrak {B}\) on U induced by the metric \(\textsf{d}\) and let us fix a probability measure \(\eta \in \mathcal {P}(U)\) which we can assume, without loss of generality, to have full support, i.e., \(\textrm{spt}(\eta )=U\). Notice that the measure space \((U,\mathfrak {B},\eta )\) is \(\sigma \)-finite and separable. For \(p\in [1,+\infty ]\), we consider the space \(L^p(U,\eta )\), which is a separable Banach space. Given r and R such that \(0\le r<1< R \le + \infty \), we introduce the set of probability densities with respect to \(\eta \), having lower bound r and upper bound R:

$$\begin{aligned} C_{r,R} :=\left\{ \ell \in L^p(U,\eta ): \int _{U} \ell (u)\,\textrm{d}\eta (u) = 1 \mathrm {\,\,and\,\,} r\le \ell \le R\,\, \eta \text {-}a.e. \right\} ; \end{aligned}$$

notice that \(C_{0,\infty }\) is the set of \(L^p\)-regular probability densities with respect to \(\eta \). Since \(\eta (U)=1\), the inclusion \(L^p(U,\eta )\subset L^1(U,\eta )\) holds for all \(p\in [1,+\infty ]\) and therefore the sets \(C_{r,R}\) are closed with respect to the \(L^p\)-norm. Thus, when equipped with the \(L^p\)-norm, the sets \(C_{r,R}\) are separable. Finally, notice that \(C_{r,R}\) are also convex and their interiors are empty.
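On a finite label set, the definition of \(C_{r,R}\) and its convexity can be checked directly. The following sketch assumes a label set with three elements and a non-uniform reference measure; the data, the tolerance, and the function name are illustrative choices of ours.

```python
def in_C(ell, eta, r, R, tol=1e-12):
    """Check that ell is a density w.r.t. eta (total mass 1) with r <= ell <= R."""
    mass = sum(l * w for l, w in zip(ell, eta))
    return abs(mass - 1.0) <= tol and all(r - tol <= l <= R + tol for l in ell)

eta = [0.25, 0.25, 0.5]          # reference probability measure on U = {0, 1, 2}
r, R = 0.25, 2.0
ell1 = [2.0, 1.0, 0.5]
ell2 = [0.4, 1.6, 1.0]

# A convex combination of two elements of C_{r,R} stays in C_{r,R}.
theta = 0.3
mix = [theta * a + (1.0 - theta) * b for a, b in zip(ell1, ell2)]

# A density concentrated on one label violates both pointwise bounds.
concentrated = [4.0, 0.0, 0.0]
```

The last example has total mass one but is excluded by the bounds r and R, which is the kind of degeneracy that the lower and upper bounds are meant to rule out.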

The state variable of our system is \(y:=(x,\ell )\in \mathbb {R}^d\times C_{0,\infty } =:Y\). The component \(x\in \mathbb {R}^d \) describes the location of an agent in space, whereas the component \(\ell \in C_{0,\infty }\) describes the distribution of labels of the agent. A probability distribution \(\Psi \in \mathcal {P}(Y)\) denotes a distribution of agents with labels. To outline the functional setting for the dynamics, we define \(\overline{Y} :=\mathbb {R}^d\times L^p(U,\eta )\) and the norm \(\Vert \cdot \Vert _{\overline{Y}}\) by

$$\begin{aligned} \Vert y \Vert _{\overline{Y}} = \Vert (x,\ell ) \Vert _{\overline{Y}} :=|x|+\Vert \ell \Vert _{L^p(U,\eta )}. \end{aligned}$$

Since \(Y\subset \overline{Y}\), we equip Y with the \(\Vert \cdot \Vert _{\overline{Y}}\) norm. For a given \(\varrho >0\), we denote by \(B_\varrho \) the closed ball of radius \(\varrho \) in \(\mathbb {R}^d\) and by \(B_\varrho ^{Y}\) the closed ball of radius \(\varrho \) in Y, namely, \(B_\varrho ^{Y} = \{y\in \ Y: \Vert y \Vert _{\overline{Y}}\le \varrho \}\). The Banach space structure of \(\overline{Y}\) allows us to define the first moment \(m_1(\Psi )\) for a probability measure \(\Psi \in \mathcal {P}(Y)\) as

$$\begin{aligned} m_1(\Psi ):=\int _{Y}\Vert y \Vert _{\overline{Y}}\,\textrm{d}\Psi (y), \end{aligned}$$

so that the space \(\mathcal {P}_1(Y)\) defined in (2.2) can be equivalently characterized as

$$\begin{aligned} \mathcal {P}_1(Y) = \{\Psi \in \mathcal {P}(Y) : m_1(\Psi )<+\infty \}. \end{aligned}$$

Whenever we fix r and R in (2.3), we set \(Y_{r,R}:=\mathbb {R}^{d}\times C_{r,R}\) and we modify the notation above accordingly.

We conclude this section by recalling the following existence result for ODEs on convex subsets of Banach spaces, which is stated in [28, Corollary 2.3] and [1, Theorem 1], generalizing the well-known results of [9, Section I.3, Theorem 1.4, Corollary 1.1].

Theorem 2.1

Let \((E,\Vert \cdot \Vert _E)\) be a Banach space, let C be a closed convex subset of E, and, for \(t\in [0,T]\), let \(A(t,\cdot ):C \rightarrow E\) be a family of operators satisfying the following properties:

  1. (i)

    for every \(\varrho >0\) there exists a constant \(L_\varrho > 0\) such that for every \(t\in [0,T]\) and \(c_1\), \(c_2\in C\cap \{e\in E: \Vert e \Vert _E\le \varrho \}\)

    $$\begin{aligned} \Vert A(t,c_1)-A(t,c_2) \Vert _E\le L_\varrho \Vert c_1-c_2 \Vert _E; \end{aligned}$$
  2. (ii)

    for every \(c\in C\) the map \(t\mapsto A(t,c)\) belongs to \(L^1([0,T];E)\);

  3. (iii)

    for every \(\varrho >0\) there exists \(\theta _\varrho > 0\) such that for every \(c \in C\cap \{e\in E: \Vert e \Vert _E\le \varrho \}\)

    $$\begin{aligned} c+ \theta _\varrho A(t,c) \in C; \end{aligned}$$
  4. (iv)

    there exists \(M>0\) such that for every \(c\in C\), there holds

    $$\begin{aligned} \Vert A(t,c) \Vert _E \le M(1+\Vert c \Vert _E). \end{aligned}$$

Then for every \(\overline{c}\in C\) there exists a unique curve \(c:[0,T]\rightarrow C\) of class \(C^1\) such that

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t} c_t= A(t,c_t)\quad \text {in }[0,T], \quad c_0 = \overline{c}. \end{aligned}$$

Moreover, if \(c^1,c^2\) are the solutions with initial data \(\overline{c}^1,\overline{c}^2\in C\cap \{e\in E:\Vert e \Vert _E\le \varrho \}\), respectively, there exists a constant \(L=L(M,\varrho ,T)>0\) such that

$$\begin{aligned} \Vert c^1_t-c^2_t \Vert _E \le e^{Lt}\,\Vert \overline{c}^1-\overline{c}^2 \Vert _E\qquad \text { for every } t\in [0,T]. \end{aligned}$$

3 Well-Posedness of the Entropic System

In this section, we study the well-posedness of the \(\varepsilon \)-regularized entropic system (1.4); for convenience, in this section, we fix \(\lambda =1\). We start by listing the assumptions on the velocity field \(y\mapsto v_\Psi (y)\) and on the transfer map \(y\mapsto \mathcal {R}_\Psi ^\varepsilon (y):=\mathcal {T}_\Psi (y)+\varepsilon \mathcal {H}(\ell )\). We assume that the velocity field \(v_\Psi :Y\rightarrow \mathbb {R}^d\) satisfies the following conditions:

  1. (v1)

    for every \(\varrho >0\), for every \(\Psi \in \mathcal {P}(B_\varrho ^{Y})\), \(v_\Psi \in \text {Lip}(B_\varrho ^Y;\mathbb {R}^d)\) uniformly with respect to \(\Psi \), namely there exists \(L_{v,\varrho }>0\) such that

    $$\begin{aligned} \vert v_\Psi (y^1)-v_\Psi (y^2)\vert \le L_{v,\varrho } \vert \vert y^1-y^2\vert \vert _{\overline{Y}}\,; \end{aligned}$$
  2. (v2)

    for every \(\varrho >0\), there exists \(L_{v,\varrho }>0\) such that for every \(y\in B_\varrho ^Y\) , and for every \(\Psi ^1\), \(\Psi ^2\in \mathcal {P}(B_\varrho ^Y)\)

    $$\begin{aligned} \vert v_{\Psi ^1}(y)-v_{\Psi ^2}(y)\vert \le L_{v,\varrho } \mathcal {W}_1(\Psi ^1,\Psi ^2); \end{aligned}$$
  3. (v3)

    there exists \(M_v>0\) such that for every \(y\in Y\), and for every \(\Psi \in \mathcal {P}_1(Y)\) there holds

    $$\begin{aligned} \vert v_\Psi (y)\vert \le M_v (1+\Vert y \Vert _{\overline{Y}}+m_1(\Psi )). \end{aligned}$$

We now describe the assumptions on \(\mathcal {T}\). For every \(\Psi \in \mathcal {P}_1(Y)\), let \(\mathcal {T}_\Psi :Y \rightarrow L^p(U,\eta )\) be an operator such that

  1. (T1)

    \(\mathcal {T}_\Psi (y)\) has zero mean for every \((y,\Psi )\in Y\times \mathcal {P}_1(Y) \) :

    $$\begin{aligned} \int _U \mathcal {T}_\Psi (y)(u)\,\textrm{d}\eta (u) = 0\,; \end{aligned}$$
  2. (T2)

    for every \(\varrho >0\) there exists \(L_{\mathcal {T},\varrho }>0\) such that for every \((y^1,\Psi ^1)\), \((y^2,\Psi ^2)\in B_\varrho ^Y\times \mathcal {P}(B_\varrho ^Y)\)

    $$\begin{aligned} \vert \vert \mathcal {T}_{\Psi ^1}(y^1)-\mathcal {T}_{\Psi ^2}(y^2)\vert \vert _{L^p(U,\eta )} \le L_{\mathcal {T},\varrho } \big (\vert \vert y^1-y^2\vert \vert _{\overline{Y}}+\mathcal {W}_1(\Psi ^1,\Psi ^2)\big ); \end{aligned}$$
  3. (T3)

    there exist a monotone increasing function \(\omega :[0,+\infty )\rightarrow [0,+\infty )\) , for which

    $$\begin{aligned} \limsup _{s\rightarrow 0^+} \frac{\omega (s)}{s} =:\underline{\omega }\in [0,+\infty ) \qquad \text {and}\qquad \limsup _{s\rightarrow \infty } \frac{\omega (s)}{s} =:\overline{\omega }\in [0,+\infty ), \end{aligned}$$

    and a constant \(C_\mathcal {T}>0\) such that for every \((y,\Psi )\in Y_{r,R}\times \mathcal {P}_1(Y)\) (for some \(0<r<1<R<+\infty \)),

    $$\begin{aligned} \mathcal {T}_\Psi (y)(u) \le C_\mathcal {T}\omega (R) \qquad \text {and}\qquad (\mathcal {T}_\Psi (y)(u))_- \le C_\mathcal {T}\omega (\ell (u)), \end{aligned}$$

    for \(\eta \)-almost every \(u\in U\).

Finally, the entropy functional \(\mathcal {H}:C_{0,\infty }\rightarrow L^0(U,\eta )\) that we consider is defined by

$$\begin{aligned} \mathcal {H}(\ell ) :=\ell \big [I(\ell )-\log (\ell )\big ], \end{aligned}$$

where \(I(\ell )\) is the negative entropy of the probability density \(\ell \), namely

$$\begin{aligned} I(\ell ):=\int _{U} \ell (u)\,\log (\ell (u)) \,\textrm{d}\eta (u). \end{aligned}$$

We notice that, for every \(r,R\in (0,+\infty )\) and every \(\ell \in C_{r,R}\), we have that \(\mathcal {H}(\ell )\in L^p(U,\eta )\) for every \(p\in [1,+\infty ]\).
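A direct computation shows that \(\mathcal {H}(\ell )\) has zero mean with respect to \(\eta \) whenever \(\ell \) is a probability density, since \(\int _U \mathcal {H}(\ell )\,\textrm{d}\eta = I(\ell )\int _U \ell \,\textrm{d}\eta - I(\ell ) = 0\). The sketch below checks this on a label set with two elements and a uniform reference measure; these choices, and the function names, are assumptions made only for illustration.

```python
import math

def negative_entropy(ell, eta):
    """I(ell) = sum_u ell(u) * log(ell(u)) * eta(u) on a finite label set."""
    return sum(l * math.log(l) * w for l, w in zip(ell, eta))

def entropy_term(ell, eta):
    """H(ell)(u) = ell(u) * (I(ell) - log(ell(u)))."""
    i_ell = negative_entropy(ell, eta)
    return [l * (i_ell - math.log(l)) for l in ell]

eta = [0.5, 0.5]
ell = [1.5, 0.5]                 # probability density with respect to eta
h = entropy_term(ell, eta)
h_mean = sum(v * w for v, w in zip(h, eta))
```

In this example \(\mathcal {H}(\ell )\) is negative on the over-represented label and positive on the under-represented one, so the entropic term pushes the density towards the uniform one while preserving the total mass.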

Remark 3.1

We remark that assumptions \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\) already appeared in [1, 2, 28] and in [3, 7] in a stronger form and are rather typical in the study of ODE systems. Conditions \(\mathrm {(T1)}\)–\(\mathrm {(T3)}\), instead, are slightly different from the usual hypotheses on the operator \(\mathcal {T}_{\Psi }\) introduced in [28, Section 3]. In particular, \(\mathrm {(T3)}\) involves a pointwise condition on \(\mathcal {T}_{\Psi } (y)\), which is crucial to show existence and uniqueness of solutions to the N-particle system (3.30) below. The role played by this assumption is that of guaranteeing a pointwise control on the strategy \(\ell (u)\), ensuring a bound from above and a bound from below away from 0. For more details, we refer to the proof of Proposition 3.2.

Here, we report two fundamental examples that fall into our theoretical framework. The first one is the replicator dynamics (see also [3, 7]). If \(\Psi \in \mathcal {P}( Y)\) stands for the distribution of players with mixed strategies \(\ell ' \in C_{0, \infty }\), the pay-off that a player in position x gets playing the strategy \(u\in U\) against all the other players writes

$$\begin{aligned} \mathcal {J}_{\Psi }(x,u) = \int _{Y}\int _U J(x,u,x',u')\,\ell '(u')\,\textrm{d}\eta (u') \, \textrm{d}\Psi (x',\ell ') \end{aligned}$$

and the corresponding operator \(\mathcal {T}\) is

$$\begin{aligned} \mathcal {T}_\Psi (x,\ell ) = \left( \mathcal {J}_{\Psi } (x,\cdot ) - \int _U \mathcal {J}_{\Psi } (x,u) \ell (u) \,\textrm{d}\eta (u) \right) \ell \,. \end{aligned}$$

In [28, Proposition 5.8] sufficient conditions on J are provided, that imply conditions \(\mathrm {(T1)}\) and \(\mathrm {(T2)}\). If J is bounded in \(\mathbb {R}^{d} \times U \times \mathbb {R}^{d} \times U\), then \(\mathcal {T}\) also satisfies \(\mathrm {(T3)}\).

The second example stems from population dynamics and models leader-follower interactions (see [28, Sections 4 and 5]). We assume that \(U= \{ 1, \ldots , H\}\) for some \(H \in \mathbb {N}\) denotes the set of possible labels within a population. Given a distribution \(\Psi \in \mathcal {P}(Y)\) of agents with labels \(\ell \in L^{p} (U, \eta )\), for \(h \ne k \in U\) we denote by \(\alpha _{hk} (x, \Psi ) \ge 0\) the rate of change from label h to label k and set

$$\begin{aligned} \alpha _{hh}(x, \Psi ) :=\sum _{k \ne h} \alpha _{hk} (x, \Psi ). \end{aligned}$$

Since \(\eta \) is supported on the whole of U, we may identify \(\ell \in L^{p} (U, \eta )\) with the vector \((\ell _{1}, \ldots , \ell _{H})\). Hence, the operator \(\mathcal {T}_{\Psi }\) is defined by

$$\begin{aligned} (\mathcal {T}_{\Psi } (y))_{h} :=(\mathcal {Q}^{*} (x, \Psi ) \ell )_{h} = -\alpha _{hh} (x, \Psi ) \ell _{h} + \sum _{k \ne h} \alpha _{kh} (x, \Psi ) \ell _{k}, \end{aligned}$$

where the matrix \(\mathcal {Q}(x, \Psi )\) writes as

$$\begin{aligned} \mathcal {Q}(x, \Psi ) :=\left( \begin{array}{cccc} -\alpha _{11} (x, \Psi ) &{} \alpha _{12} (x, \Psi ) &{} \cdots &{} \alpha _{1H} (x, \Psi ) \\ \alpha _{21} (x, \Psi ) &{} -\alpha _{22} (x, \Psi ) &{} \cdots &{} \alpha _{2H} (x, \Psi ) \\ \vdots &{} \vdots &{} \ddots &{}\vdots \\ \alpha _{H1} (x, \Psi ) &{} \alpha _{H2} (x, \Psi ) &{} \cdots &{} -\alpha _{HH} (x, \Psi ) \end{array}\right) . \end{aligned}$$

Suitable assumptions on \(\alpha _{kh}\) that ensure \(\mathrm {(T1)}\) and \(\mathrm {(T2)}\) are given in [28, Proposition 5.1]. Once again, if \(\alpha _{kh}\) are bounded, we have \(\mathrm {(T3)}\) as well thanks to the precise structure (3.2): in particular, the positivity of \(\alpha _{kh}\) for every \(k\ne h\) is crucial to estimate \(\big (\mathcal {T}_{\Psi }(y)(u)\big )_-\) in terms of the sole \(\ell (u)\).
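The structure of the second example can be checked on a small instance. In the sketch below we take three labels, constant rates, and a uniform reference measure, and we build the generator so that each diagonal entry equals minus the total rate of leaving the corresponding label, which makes every row of \(\mathcal {Q}\) sum to zero. The numerical rates and the function names are hypothetical.

```python
def generator(rates):
    """Build Q from off-diagonal rates[h][k] (rate h -> k); diagonal = -(row sum)."""
    size = len(rates)
    q = [[rates[h][k] if h != k else 0.0 for k in range(size)] for h in range(size)]
    for h in range(size):
        q[h][h] = -sum(q[h][k] for k in range(size) if k != h)
    return q

def transfer(q, ell):
    """(Q^* ell)_h = sum_k Q[k][h] * ell[k], the action of the transposed generator."""
    size = len(q)
    return [sum(q[k][h] * ell[k] for k in range(size)) for h in range(size)]

rates = [[0.0, 2.0, 1.0],
         [0.5, 0.0, 0.5],
         [3.0, 0.0, 0.0]]        # hypothetical constant switching rates
ell = [1.5, 0.9, 0.6]            # density w.r.t. uniform eta = 1/3 (mean value 1)
q = generator(rates)
t = transfer(q, ell)
```

The components of t sum to zero, in agreement with the zero-mean condition (T1) for a uniform reference measure. Moreover, since the off-diagonal rates are nonnegative, each component is bounded from below by the diagonal term alone, which is the mechanism behind the estimate of the negative part in (T3) in terms of \(\ell (u)\) only.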

Proposition 3.2

Assume that \(v_\Psi :Y \rightarrow \mathbb {R}^d\) satisfies \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\) and \(\mathcal {T}_\Psi :Y \rightarrow L^p(U,\eta )\) satisfies \(\mathrm {(T1)}\)–\(\mathrm {(T3)}\). Then, for every \(\varepsilon >0\) there exist \(r_\varepsilon \in (0,1)\) and \(R_\varepsilon \in (1,+\infty )\) such that, setting \(Y_\varepsilon :=Y_{r_\varepsilon ,R_\varepsilon }\), for every \(\Psi \in \mathcal {P}_1(Y_\varepsilon )\) the vector field \(b^\varepsilon _\Psi :Y_\varepsilon \rightarrow \overline{Y}\) defined as

$$\begin{aligned} b^\varepsilon _\Psi (y):=\begin{pmatrix} v_\Psi (y)\\ \mathcal {R}^\varepsilon _\Psi (y) \end{pmatrix}, \qquad \text { for every }y \in Y_\varepsilon , \end{aligned}$$

satisfies the following properties:

  1. (1)

    for every \(\varrho >0\), there exists \(L_{\varepsilon , \varrho }>0\) such that for every \(\Psi \in \mathcal {P}(B_\varrho ^{Y_\varepsilon })\), and for every \(y^1,y^2\in B_\varrho ^{Y_\varepsilon }\)

    $$\begin{aligned} \vert \vert b_\Psi ^\varepsilon (y^1)-b_\Psi ^\varepsilon (y^2)\vert \vert _{\overline{Y}}\le L_{\varepsilon ,\varrho } \vert \vert y^1 - y^2 \vert \vert _{\overline{Y}}\,; \end{aligned}$$
  2. (2)

    for every \(\varrho >0\), there exists \(L_\varrho >0\) such that for every \(\Psi ^1,\Psi ^2\in \mathcal {P}(B_\varrho ^{Y_\varepsilon })\), and for every \(y\in B_\varrho ^{Y_\varepsilon }\)

    $$\begin{aligned} \vert \vert b_{\Psi ^1}^\varepsilon (y) - b_{\Psi ^2}^\varepsilon (y)\vert \vert _{\overline{Y}}\le L_\varrho \mathcal {W}_1(\Psi ^1,\Psi ^2)\,; \end{aligned}$$
  3. (3)

    there exists \(M_\varepsilon >0\) such that for every \(y\in {Y_\varepsilon }\) and for every \(\Psi \in \mathcal {P}_1({Y_\varepsilon })\) there holds

    $$\begin{aligned} \vert \vert b_\Psi ^\varepsilon (y) \vert \vert _{\overline{Y}}\le M_\varepsilon \,(1+\Vert y \Vert _{\overline{Y}} + m_1(\Psi ))\,. \end{aligned}$$
  4. (4)

    there exists \(\theta _{\varepsilon } >0\) such that for every \(\varrho >0\) and for every \(y\in B_\varrho ^{Y_\varepsilon }\) and for every \(\Psi \in \mathcal {P}(B_\varrho ^{Y_\varepsilon })\)

    $$\begin{aligned} y+\theta _{\varepsilon } b^\varepsilon _\Psi (y)\in Y_\varepsilon \,. \end{aligned}$$


Proof

The proof is divided into three steps.

Step 1 (boundedness of \(\mathcal {H}\)). We start by proving that \({\mathcal {H}}(C_{r, R})\subset L^\infty (U,\eta )\) for every \(r, R \in (0, +\infty )\) with \(r< 1 < R\), which in turn implies that for every \(\varrho \in (0, +\infty )\), every \(\Psi \in \mathcal {P} (B_{\varrho } ^{Y_{r, R}})\), and every \(y \in Y_{r, R}\), \(\mathcal {R}^\varepsilon _{\Psi } (y)\) is well defined in \(L^{p}(U, \eta )\).

For every \(u\in U\) we may write \( \ell (u) = r \zeta (u) + R (1-\zeta (u))\), with \(0\le \zeta (u) \le 1\). Thus, using the convexity of the function \(t\mapsto t\log (t)\) in \((0, +\infty )\) we get

$$\begin{aligned} I(\ell ) \le r \log (r ) \int _{U} \zeta (u)\,\textrm{d}\eta (u)+ R \log (R)\,\int _{U}(1-\zeta (u))\,\textrm{d}\eta (u)\,. \end{aligned}$$

Since \(\ell \) is a probability density it is straightforward to check that

$$\begin{aligned} \int _U \zeta (u) \, \textrm{d}\eta (u) = \frac{R - 1}{R - r}\,. \end{aligned}$$
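For completeness, the identity above can be checked in one line. The computation below is only a sketch, using the representation \(\ell = r\zeta + R(1-\zeta )\) together with \(\int _U \ell \,\textrm{d}\eta = 1\) and \(\eta (U) = 1\):

$$\begin{aligned} 1 = \int _U \ell \,\textrm{d}\eta = r\int _U \zeta \,\textrm{d}\eta + R\Big (1 - \int _U \zeta \,\textrm{d}\eta \Big ) \quad \Longrightarrow \quad \int _U \zeta \,\textrm{d}\eta = \frac{R-1}{R-r}\,. \end{aligned}$$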


Inserting the value of \(\int _U \zeta \,\textrm{d}\eta \) into the convexity estimate, we obtain

$$\begin{aligned} I(\ell ) \le \frac{R - 1}{R - r}\,r \log (r) + \left( 1 - \frac{R - 1}{R - r}\right) R \log (R). \end{aligned}$$

To simplify the notation, we define

$$\begin{aligned} \alpha _{r, R} :=\frac{(R - 1) r}{R - r}\in (0,1)\,, \end{aligned}$$

so that inequality (3.8) reads

$$\begin{aligned} I(\ell ) \le \alpha _{r, R} \log (r) + (1 - \alpha _{r, R}) \log (R) =:k_{r, R}\,. \end{aligned}$$
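The rewriting above relies only on an elementary identity for the two coefficients. The display below spells it out and evaluates it for the sample values \(r = 1/2\), \(R = 2\), which are an illustrative choice and do not come from the text:

$$\begin{aligned} \begin{aligned}&\frac{R-1}{R-r}\,r = \alpha _{r,R}\,, \qquad \Big (1 - \frac{R-1}{R-r}\Big ) R = \frac{(1-r)R}{R-r} = 1 - \alpha _{r,R}\,; \\&r = \tfrac{1}{2},\ R = 2: \qquad \alpha _{r,R} = \tfrac{1}{3}\,, \qquad k_{r,R} = \tfrac{1}{3}\log \big (\tfrac{1}{2}\big ) + \tfrac{2}{3}\log (2) = \tfrac{1}{3}\log (2) \approx 0.231\,. \end{aligned} \end{aligned}$$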

Moreover, by Jensen’s inequality we have that

$$\begin{aligned} I(\ell ) \ge \int _{U} \ell (u)\,\textrm{d}\eta (u)\,\log \left( \int _{U} \ell (u) \, \textrm{d}\eta (u)\right) = 0. \end{aligned}$$

Since \(\ell \in C_{r, R}\) and (3.10) and (3.11) hold, we deduce that

$$\begin{aligned} -R \log (R) \le {\mathcal {H}}(\ell ) \le R \,k_{r, R} + \frac{1}{e}\,, \end{aligned}$$

so that \({\mathcal {H}}(\ell )\in L^\infty (U,\eta )\).
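The two-sided bound can be read off term by term. The following is a sketch, assuming (as in the computation for property (4) below) that \({\mathcal {H}}(\ell )(u) = \ell (u)\big (I(\ell ) - \log (\ell (u))\big )\), and using \(r \le \ell (u) \le R\), \(0 \le I(\ell ) \le k_{r,R}\), and \(\max _{t>0}(-t\log (t)) = 1/e\):

$$\begin{aligned} \begin{aligned} {\mathcal {H}}(\ell )(u)&= I(\ell )\,\ell (u) - \ell (u)\log (\ell (u)) \le k_{r,R}\,R + \frac{1}{e}\,, \\ {\mathcal {H}}(\ell )(u)&\ge 0 - \max _{t\in [r,R]} t\log (t) = -R\log (R)\,. \end{aligned} \end{aligned}$$

The last equality holds because \(t\log (t)\le 0\) on \((0,1]\) and \(t\mapsto t\log (t)\) is increasing on \([1,R]\).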

Since \({\mathcal {H}}(\ell )\) has zero mean and \((\textrm{T}1)\) holds true, we have that

$$\begin{aligned} \int _{U} \mathcal {R}_{\Psi }^{\varepsilon }(y)(u) \, \textrm{d}\eta (u) = 0\,. \end{aligned}$$

Step 2 (Lipschitz continuity of \(\mathcal {H}\)). We now show that \({\mathcal {H}}\) is Lipschitz continuous on \(C_{r, R}\) with Lipschitz constant \(L_{r, R}\) depending on r and R. Since \(t\mapsto t\log (t)\) is Lipschitz continuous on \([r, R]\) whenever \(r > 0\) (we let \(L_{r, R}'\) be its Lipschitz constant), we may estimate for every \(\ell _{1}, \ell _{2} \in C_{r, R}\) and every \(u \in U\)

$$\begin{aligned} \begin{aligned} \vert {\mathcal {H}}(\ell _1)(u) - {\mathcal {H}}(\ell _2)(u) \vert&\le | I ( \ell _1) \ell _1(u) - I(\ell _2)\ell _2(u) | + |\ell _1(u)\log (\ell _1(u)) - \ell _2(u)\log (\ell _2(u)) | \\&\le | I(\ell _1) - I(\ell _2) | \, |\ell _1(u)| + | I(\ell _2)| \, |\ell _1(u) - \ell _2(u)| + L_{r, R}' |\ell _1(u) - \ell _2(u)| \\&\le R | I(\ell _1) - I(\ell _2) | + k_{r, R} |\ell _1(u) - \ell _2(u) | + L_{r, R}' |\ell _1(u) - \ell _2(u) | \\&\le R \int _U | \ell _1(u) \log (\ell _1(u)) - \ell _2(u) \log (\ell _2(u)) | \, \textrm{d}\eta (u) + (k_{r, R} + L_{r, R}' )|\ell _1(u) - \ell _2(u)| \\&\le R L_{r, R}' \int _U | \ell _1(u) - \ell _2(u) |\, \textrm{d}\eta (u) +( k_{r,R} + L_{r, R}' ) |\ell _1(u) - \ell _2(u)|\,. \end{aligned} \end{aligned}$$

Thus, there holds

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {H}}(\ell _1)-{\mathcal {H}}(\ell _2) \Vert _{L^{p}(U, \eta )}&\le R L_{r, R}' \Vert \ell _1 - \ell _2 \Vert _{L^1(U,\eta )} + (k_{r, R} + L_{r, R}' )\,\Vert \ell _1-\ell _2 \Vert _{L^{p}(U, \eta )} \\&\le R L_{r, R}' \Vert \ell _1 - \ell _2 \Vert _{L^{p}(U, \eta )} + (k_{r, R} + L_{r, R}' ) \Vert \ell _1 - \ell _2 \Vert _{L^{p}(U, \eta ) } \\&= ((R+1) L_{r, R}' + k_{r, R} ) \Vert \ell _1 - \ell _2 \Vert _{L^{p}(U, \eta )} \\&=: L_{r, R} \Vert \ell _1 - \ell _2 \Vert _{L^{p}(U, \eta )}\,, \end{aligned} \end{aligned}$$

where we have used that \(\eta \in \mathcal {P}(U)\).

Step 3 (proof of properties (1)–(4)). For \(\varepsilon >0\), we fix \(r_\varepsilon \in (0, 1)\) such that

$$\begin{aligned} \varepsilon \,\log \left( \dfrac{3}{4\,r_\varepsilon }\right) \ge C_\mathcal {T}\,\dfrac{\omega (\frac{4}{3}r_\varepsilon )}{r_\varepsilon } \,. \end{aligned}$$

Notice that, thanks to \((\textrm{T}3)\), such \(r_{\varepsilon }\) exists as

$$\begin{aligned} \limsup _{r \rightarrow 0^+} \,\varepsilon \log \left( \dfrac{3}{4 r} \right) = +\infty \qquad \text {and}\qquad \limsup \limits _{r \rightarrow 0^+} \, C_\mathcal {T}\, \dfrac{\omega (\frac{4}{3}r)}{r} = \frac{4}{3}\,C_\mathcal {T}\,\underline{\omega } \,. \end{aligned}$$

We now fix \(R_\varepsilon \in (1, +\infty )\) such that

$$\begin{aligned} \alpha _{r_{\varepsilon }, R_{\varepsilon }} \log \left( \frac{R_\varepsilon }{r_\varepsilon }\right) \ge \dfrac{2\,C_\mathcal {T}\,\omega (R_\varepsilon )}{\varepsilon \,R_\varepsilon }. \end{aligned}$$

Again, notice that there exists at least one \(R_\varepsilon >1\) satisfying (3.16) since, by \((\textrm{T}3)\) and by definition of \(\alpha _{r, R}\) in (3.9), it holds

$$\begin{aligned} \limsup \limits _{R \rightarrow +\infty }\,\alpha _{r_{\varepsilon }, R} \log \left( \frac{R}{r_\varepsilon }\right) = +\infty \qquad \text {and}\qquad \limsup \limits _{R \rightarrow +\infty } \, \dfrac{2\,C_\mathcal {T}\,\omega (R)}{\varepsilon \,R} = \dfrac{2\,C_\mathcal {T}\,\overline{\omega }}{\varepsilon }\,. \end{aligned}$$

For \(r_{\varepsilon }\) and \(R_{\varepsilon }\) given above, we now prove properties (1)–(4). For simplicity, we set from now on \( C_{\varepsilon } :=C_{r_{\varepsilon }, R_{\varepsilon }}\), \(Y_{\varepsilon } :=Y_{r_{\varepsilon }, R_{\varepsilon }}\), \(\alpha _{\varepsilon } :=\alpha _{r_{\varepsilon }, R_{\varepsilon }}\), and \(k_{\varepsilon } :=k_{r_{\varepsilon }, R_{\varepsilon }}\).

Property (1). Let \(\varrho >0\), \(\Psi \in \mathcal {P}(B_\varrho ^{Y_{\varepsilon }})\), and \(y^{1}, y^{2} \in B^{Y_{\varepsilon }}_{\varrho }\). By \((\textrm{T}2)\), the operator \(\mathcal {T}_{\Psi }\) is Lipschitz continuous on \(B_{\varrho }^{Y_{\varepsilon }}\) with Lipschitz constant \(L_{\mathcal {T}, \varrho }>0\), while by \((\textrm{v}1)\), \(v_{\Psi }\) is Lipschitz continuous on \(B_{\varrho }^{Y_{\varepsilon }}\) with Lipschitz constant \(L_{v, \varrho }>0\). In view of the Lipschitz continuity of \({\mathcal {H}}\) (cf. (3.14)), setting, for instance, \(L_{\varepsilon , \varrho }:= L_{v, \varrho } + \max \{\varepsilon L_{r_{\varepsilon }, R_{\varepsilon }}, L_{\mathcal {T}, \varrho }\}\), we deduce (3.4).

Property (2). It is straightforward from \((\textrm{v}2)\) and \((\textrm{T}2)\), since the entropy regularization \({\mathcal {H}}\) does not depend on \(\Psi \in \mathcal {P}(B^{Y_{\varepsilon }}_{\varrho })\).

Property (3). In view of \((\textrm{v}3)\), it is enough to prove that there exists \(M_{\varepsilon }>0\) such that for every \(y \in Y_{\varepsilon }\) and every \(\Psi \in \mathcal {P}_{1}(Y_{\varepsilon })\)

$$\begin{aligned} \Vert \mathcal {R}^{\varepsilon }_{\Psi } (y)\Vert _{L^{p}(U, \eta )} \le M_{\varepsilon } \big ( 1 + \Vert y \Vert _{\overline{Y}} + m_{1} (\Psi ) \big )\,. \end{aligned}$$

By \(({\textrm{T}}3)\), we have that \(| \mathcal {T}_{\Psi } (y) (u) | \le C_{\mathcal {T}} \omega (R_{\varepsilon })\). Recalling (3.12) and setting

$$\begin{aligned} M_{\varepsilon }:= C_{\mathcal {T}} \omega (R_{\varepsilon }) + \varepsilon \, \max \Big \{ R_{\varepsilon } \log R_{\varepsilon }, \, R_{\varepsilon } \,k_{\varepsilon } + \frac{1}{e}\Big \}, \end{aligned}$$

we infer (3.17) and therefore (3.6).

Property (4). Let \(\varrho >0\). Since \(Y_{\varepsilon } = \mathbb {R}^{d} \times C_{\varepsilon }\), we only have to find \(\theta _\varepsilon \) such that for every \(\Psi \in \mathcal {P}(B^{Y_{\varepsilon }}_{\varrho })\) and every \(y = (x, \ell ) \in B^{Y_{\varepsilon }}_{\varrho }\),

$$\begin{aligned} \ell + \theta _{\varepsilon } \mathcal {R}_{\Psi }^{\varepsilon } (x, \ell ) \in C_{\varepsilon }\,. \end{aligned}$$

In view of (3.13), we already know that for any \(\theta _{\varepsilon } > 0\)

$$\begin{aligned} \int _{U} \big ( \ell (u) + \theta _{\varepsilon } \mathcal {R}_{\Psi }^{\varepsilon } (x, \ell ) (u) \big ) \, \textrm{d}\eta (u) = 1\,. \end{aligned}$$

Hence, we have to show that upper and lower bounds of \(C_{\varepsilon }\) are preserved for a suitable choice of \(\theta _{\varepsilon }\) independent of \(y \in B^{Y_{\varepsilon }}_{\varrho }\) and of \(\Psi \in \mathcal {P}(B_{\varrho }^{Y_{\varepsilon }})\). The precise \(\theta _{\varepsilon }\) will be specified along the proof.

Let \(y \in B^{Y_{\varepsilon }}_{\varrho }\) and \(\Psi \in \mathcal {P}(B_\varrho ^{Y_\varepsilon })\). We start by imposing that for \(\eta \)-a.e. \(u\in U\)

$$\begin{aligned} \ell (u) + \theta _\varepsilon \mathcal {R}^\varepsilon _{\Psi }(y)(u) \le R_\varepsilon \,. \end{aligned}$$

Using \((\textrm{T}3)\) and (3.10) we get that

$$\begin{aligned} \begin{aligned} \ell (u)&+ \theta _\varepsilon \,\left[ \mathcal {T}_{\Psi }(y)(u) + \varepsilon \,{\mathcal {H}}(\ell )(u)\right] \le \ell (u) + \theta _\varepsilon \,\left[ C_\mathcal {T}\,\omega (R_{\varepsilon } ) + \varepsilon {\mathcal {H}} (\ell )(u) \right] \\&= \ell (u) + \theta _\varepsilon \left[ C_\mathcal {T}\,\omega (R_\varepsilon ) + \varepsilon \ell (u) \left( I(\ell ) - \log (\ell (u)) \right) \right] \\&\le \ell (u) + \theta _\varepsilon \left[ C_\mathcal {T}\,\omega (R_\varepsilon ) + \varepsilon \,\ell (u)\,(\alpha _\varepsilon \,\log (r_\varepsilon )\right. \\&\left. +(1-\alpha _\varepsilon )\log (R_\varepsilon )-\log (\ell (u)))\right] . \end{aligned} \end{aligned}$$

Because of (3.16) we have that

$$\begin{aligned} \begin{aligned}&\lim _{t\nearrow R_\varepsilon } [C_\mathcal {T}\omega (R_\varepsilon )+\varepsilon t (\alpha _\varepsilon \log (r_\varepsilon )+(1-\alpha _\varepsilon )\log (R_\varepsilon )-\log t)] \\&=\, C_\mathcal {T}\omega (R_\varepsilon ) - \varepsilon \alpha _\varepsilon R_\varepsilon \log \bigg (\frac{R_\varepsilon }{r_\varepsilon }\bigg ) \le -C_\mathcal {T}\omega (R_\varepsilon ) <0. \end{aligned} \end{aligned}$$

Inequalities (3.21) and (3.22) imply that there exists \(R_\varepsilon ' < R_\varepsilon \) such that

$$\begin{aligned} \ell (u) + \theta _\varepsilon \,\left[ \mathcal {T}_{\Psi }(y)(u) + \varepsilon \,{\mathcal {H}}(\ell )(u)\right] \le R_{\varepsilon } \qquad \text { whenever }\ell (u)\in [R_\varepsilon ',R_\varepsilon ]. \end{aligned}$$

If \(\ell (u)\le R_\varepsilon '\), by \((\textrm{T}3)\) and by (3.12) we estimate

$$\begin{aligned} \begin{aligned} \ell (u) + \theta _\varepsilon \mathcal {R}^\varepsilon _{\Psi }(y)(u)&\le R_\varepsilon ' + \theta _\varepsilon \,\left[ C_\mathcal {T}\,\omega (R_\varepsilon ) +\varepsilon R_\varepsilon k_\varepsilon + \frac{\varepsilon }{e}\right] \,. \end{aligned} \end{aligned}$$

It follows from (3.24) that there exists \(\theta _\varepsilon ^1 \in (0, +\infty )\) such that for every \(\theta _{\varepsilon } \in (0, \theta _{\varepsilon }^{1}]\)

$$\begin{aligned} \ell (u) + \theta _\varepsilon \,\left[ \mathcal {T}_{\Psi }(y)(u) + \varepsilon \,{\mathcal {H}}(\ell )(u)\right] \le R_{\varepsilon } \qquad \text { whenever }\ell (u) \in (r_{\varepsilon }, R'_{\varepsilon }]. \end{aligned}$$

Combining (3.23) and (3.25) we deduce the upper bound (3.20) for \(\theta _{\varepsilon } \in (0, \theta _{\varepsilon }^{1}]\).
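One admissible value of \(\theta _\varepsilon ^1\) can be made explicit. The display below is a sketch of a possible choice, not necessarily the one intended in the proof; it assumes \(\omega \ge 0\) and uses \(k_\varepsilon \ge 0\), which follows from (3.10) and (3.11). Requiring the right-hand side of the estimate for \(\ell (u)\le R_\varepsilon '\) to stay below \(R_\varepsilon \) gives

$$\begin{aligned} R_\varepsilon ' + \theta _\varepsilon \Big [ C_\mathcal {T}\,\omega (R_\varepsilon ) + \varepsilon R_\varepsilon k_\varepsilon + \frac{\varepsilon }{e}\Big ] \le R_\varepsilon \quad \Longleftrightarrow \quad \theta _\varepsilon \le \theta _\varepsilon ^1 :=\frac{R_\varepsilon - R_\varepsilon '}{C_\mathcal {T}\,\omega (R_\varepsilon ) + \varepsilon R_\varepsilon k_\varepsilon + \frac{\varepsilon }{e}}\,. \end{aligned}$$

The denominator is strictly positive because of the term \(\varepsilon /e\), so \(\theta _\varepsilon ^1 \in (0,+\infty )\).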

We now show that, for a suitable choice of \(\theta _{\varepsilon } \in (0, \theta ^{1}_\varepsilon ]\), we can as well guarantee

$$\begin{aligned} \ell (u) + \theta _\varepsilon \mathcal {R}_{\Psi }^\varepsilon (y)(u)\ge r_\varepsilon \qquad {\eta -\text {a.e.}~u\in U }\,. \end{aligned}$$

In fact, using \((\textrm{T}3)\) and (3.11), we obtain

$$\begin{aligned} \ell (u) + \theta _\varepsilon \left[ \mathcal {T}_{\Psi }(y)(u) + \varepsilon {\mathcal {H}}(\ell )(u)\right] \ge \ell (u) + \theta _\varepsilon \,\left[ -C_\mathcal {T}\,\omega (\ell (u))-\varepsilon \,\ell (u)\log (\ell (u))\right] . \end{aligned}$$

If \(\ell (u) \in \big ( \frac{4}{3}\,r_\varepsilon , R_{\varepsilon } \big ]\), by monotonicity of \(\omega \) we continue in the previous inequality with

$$\begin{aligned} \begin{aligned} \ell (u) + \theta _\varepsilon \left[ \mathcal {T}_{\Psi }(y)(u) + \varepsilon {\mathcal {H}}(\ell )(u)\right]&\ge \frac{4}{3}\,r_\varepsilon + \theta _\varepsilon \,\left[ -C_\mathcal {T}\,\omega (R_\varepsilon ) - \varepsilon R_\varepsilon \log (R_\varepsilon )\right] \,. \end{aligned} \end{aligned}$$

From inequality (3.27) we infer the existence of \(\theta _\varepsilon ^2 \in (0, \theta _{\varepsilon }^{1} ]\) (depending only on \(r_{\varepsilon }\) and \(R_{\varepsilon }\)) such that for every \(\theta _{\varepsilon } \in (0, \theta _{\varepsilon }^{2} ]\) it holds

$$\begin{aligned} \ell (u) + \theta _\varepsilon \left[ \mathcal {T}_{\Psi }(y)(u) + \varepsilon {\mathcal {H}}(\ell )(u)\right] \ge r_{\varepsilon } \qquad \text { whenever }\ell (u) \in \Big (\frac{4}{3}r_{\varepsilon }, R_{\varepsilon } \Big ]. \end{aligned}$$

If \(\ell (u) \in \big [r_{\varepsilon }, \frac{4}{3}\,r_\varepsilon \big ]\), instead, by \((\textrm{T}3)\) and by the choice of \(r_{\varepsilon }\) in (3.15), we estimate

$$\begin{aligned} \begin{aligned} \ell (u) + \theta _\varepsilon \left[ \mathcal {T}_{\Psi }(y)(u) + \varepsilon {\mathcal {H}}(\ell )(u)\right]&\ge \ell (u) + \theta _\varepsilon \ell (u)\left[ -C_\mathcal {T}\,\frac{\omega (\ell (u))}{\ell (u)} - \varepsilon \log (\ell (u))\right] \\&\ge \ell (u) + \theta _\varepsilon \ell (u)\left[ - C_\mathcal {T}\frac{\omega \left( \frac{4}{3}\,r_\varepsilon \right) }{r_\varepsilon } + \varepsilon \log \left( \frac{3}{4\,r_\varepsilon }\right) \right] \\&\ge \ell (u) \ge r_{\varepsilon }\,, \end{aligned} \end{aligned}$$

which concludes the proof of (3.26) for \(\theta _{\varepsilon } \in (0, \theta _{\varepsilon }^{2}]\).
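As for the upper bound, a possible explicit value of \(\theta _\varepsilon ^2\) can be extracted from the case \(\ell (u) \in \big (\frac{4}{3}r_\varepsilon , R_\varepsilon \big ]\). This is a sketch under the assumption \(\omega \ge 0\), and other choices are equally valid. Since \(R_\varepsilon >1\), the quantity \(C_\mathcal {T}\,\omega (R_\varepsilon ) + \varepsilon R_\varepsilon \log (R_\varepsilon )\) is strictly positive, and

$$\begin{aligned} \frac{4}{3}\,r_\varepsilon - \theta _\varepsilon \big [ C_\mathcal {T}\,\omega (R_\varepsilon ) + \varepsilon R_\varepsilon \log (R_\varepsilon )\big ] \ge r_\varepsilon \quad \Longleftrightarrow \quad \theta _\varepsilon \le \frac{r_\varepsilon }{3\,\big ( C_\mathcal {T}\,\omega (R_\varepsilon ) + \varepsilon R_\varepsilon \log (R_\varepsilon )\big )}\,, \end{aligned}$$

so that one may take \(\theta _\varepsilon ^2\) as the minimum between \(\theta _\varepsilon ^1\) and the right-hand side above. In the remaining case \(\ell (u)\in \big [r_\varepsilon , \frac{4}{3}r_\varepsilon \big ]\) no further restriction on \(\theta _\varepsilon \) is needed, as the bracket is nonnegative by (3.15).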

Combining (3.19), (3.20), and (3.26), we conclude that for every \(\theta _{\varepsilon } \in (0, \theta _{\varepsilon }^{2}]\), for every \(\Psi \in \mathcal {P}(B^{Y_{\varepsilon }}_{\varrho })\), and every \(y = (x, \ell ) \in B^{Y_{\varepsilon }}_{\varrho }\), (3.18) holds. Notice, in particular, that \(\theta _\varepsilon \) is independent of \(\varrho \). \(\square \)

From now on, whenever a choice of \(r_{\varepsilon }\) and \(R_{\varepsilon }\) is made according to Proposition 3.2, the corresponding space \(Y_{r_{\varepsilon },R_{\varepsilon }}\) will be denoted by \(Y_{\varepsilon }\). Moreover, for any \(N\in \mathbb {N}\), we will denote by \(Y_{\varepsilon }^N:=(Y_{\varepsilon })^N\) the Cartesian product of N copies of \(Y_{\varepsilon }\). Finally, we will consistently use the notation \(b_{\Psi }^{\varepsilon }\) for the velocity field introduced in (3.3).

As a consequence of Theorem 2.1 and Proposition 3.2, we obtain the following theorem.

Theorem 3.3

Let \(v_\Psi :Y \rightarrow \mathbb {R}^d\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\) and let \(\mathcal {T}_\Psi :Y \rightarrow L^p(U,\eta )\) satisfy \(\mathrm {(T1)}\)–\(\mathrm {(T3)}\); let \(\varepsilon >0\) and let \(r_{\varepsilon },R_{\varepsilon }\) be as in Proposition 3.2. Then for any choice of initial conditions \({\bar{\varvec{y}}} = ({\bar{y}}^{1}, \ldots , {\bar{y}}^{N}) \in Y_\varepsilon ^{N}\), the system

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{y}^{i}_t = b_{\Lambda _t^{N}}^\varepsilon (y^{i}_t), \\ y^{i}_0=\bar{y}^{i}, \end{array}\right. } \qquad \text { for }i=1,\dots ,N, t\in [0,T], \end{aligned}$$

where \(\Lambda _t^{N}:=\frac{1}{N}\sum _{i=1}^N \delta _{y^{i}_{t}}\) is the empirical measure associated with the system, has a unique solution \(\varvec{y}:[0,T]\rightarrow Y_{\varepsilon }^{N}\). Moreover, we have that

$$\begin{aligned} \sup _{\begin{array}{c} i=1,\ldots ,N\\ t\in [0,T] \end{array}}\Vert y_t^i \Vert _{\overline{Y}} \le \Big (\sup _{i=1,\ldots ,N} \Vert \bar{y}^i \Vert _{\overline{Y}} + M_\varepsilon T\Big ) e^{2M_\varepsilon T}. \end{aligned}$$


Proof

We let \(\varvec{y}:=(y^{1},\dots , y^{N})\in Y_\varepsilon ^N\subset \overline{Y}^N\), whose norm we define as

$$\begin{aligned} \Vert \varvec{y} \Vert _{\overline{Y}^N}:=\frac{1}{N}\sum _{i=1}^N \vert \vert y^{i}\vert \vert _{\overline{Y}}, \end{aligned}$$

and we consider the associated empirical measure \(\Lambda ^{N} :=\frac{1}{N}\sum _{i=1}^N \delta _{y^{i}}\), which belongs to \(\mathcal {P}(B_R^{Y_\varepsilon })\) whenever \(\varvec{y}\in (B_R^{Y_\varepsilon })^N\). Consider the map \(\varvec{b}^{\varepsilon ,N} :Y_\varepsilon ^N \rightarrow \overline{Y}^N\) whose components are defined through \(b_i^{\varepsilon ,N}(\varvec{y}) :=b^\varepsilon _{\Lambda ^N}(y^{i})\). Then the Cauchy problem (3.30) can be written as

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{\varvec{y}}_t = \varvec{b}^{\varepsilon ,N}(\varvec{y}_t),\\ \varvec{y}_0 =\bar{\varvec{y}}. \end{array}\right. } \end{aligned}$$

In order to apply Theorem 2.1 to the system above, we first notice that assumption (ii) is automatically satisfied since the system is autonomous. To see that the other assumptions are satisfied too, we fix a ball \(B_R^{Y_\varepsilon ^N}\) and notice that \(B_R^{Y_\varepsilon ^N}\subset \big (B_{NR}^{Y_\varepsilon }\big )^N\). Applying (3.7) with \(\Psi = \Lambda ^{N}\) to each component \(y^{i}\) of \(\varvec{y}\), we get that assumption (iii) of Theorem 2.1 is satisfied with \(\varrho = RN\). We now show that assumption (i) holds. Fix \(\varvec{y}_1,\varvec{y}_2\in B_R^{Y_\varepsilon ^N}\) and let \(\Lambda ^{N}_1\) and \(\Lambda ^{N}_2\) be the associated empirical measures. Recalling (2.1), we notice that

$$\begin{aligned} \mathcal {W}_1(\Lambda _1^{N},\Lambda _2^{N}) \le \frac{1}{N} \sum \limits _{i=1}^N \Vert y^{i}_1-y^{i}_2\Vert _{\overline{Y}} = \Vert \varvec{y}_1-\varvec{y}_2 \Vert _{\overline{Y}^N}. \end{aligned}$$
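The inequality above can be seen by testing the Wasserstein distance with one specific transport plan. The display below is a sketch that assumes the standard formulation of \(\mathcal {W}_1\) as an infimum over the set \(\Gamma (\Lambda _1^N,\Lambda _2^N)\) of couplings (a notation introduced here for illustration, which may differ from the form used in (2.1)):

$$\begin{aligned} \pi :=\frac{1}{N}\sum _{i=1}^N \delta _{(y^{i}_1,\,y^{i}_2)} \in \Gamma (\Lambda _1^{N},\Lambda _2^{N})\,, \qquad \mathcal {W}_1(\Lambda _1^{N},\Lambda _2^{N}) \le \int _{\overline{Y}\times \overline{Y}} \Vert y - y' \Vert _{\overline{Y}}\,\textrm{d}\pi (y,y') = \frac{1}{N}\sum _{i=1}^N \Vert y^{i}_1 - y^{i}_2 \Vert _{\overline{Y}}\,. \end{aligned}$$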

Therefore, by triangle inequality, (3.4), and (3.5), we obtain the estimate

$$\begin{aligned} \begin{aligned} \Vert \varvec{b}^{\varepsilon ,N}(\varvec{y}_1)- \varvec{b}^{\varepsilon , N}(\varvec{y}_2)\Vert _{\overline{Y}^N}&= \frac{1}{N}\sum _{i=1}^N \Vert b^\varepsilon _{\Lambda ^N_1}(y^{i}_1)-b^\varepsilon _{\Lambda ^N_2}(y^{i}_2) \Vert _{\overline{Y}} \\&\le L_{NR}\,\mathcal {W}_1(\Lambda _1^{N},\Lambda _2^{N})+\frac{L_{\varepsilon ,NR}}{N}\sum _{i=1}^N\Vert y_1^{i}-y^{i}_2 \Vert _{\overline{Y}} \\&\le (L_{NR}+L_{\varepsilon ,NR}) \Vert \varvec{y}_1-\varvec{y}_2 \Vert _{\overline{Y}^N}\,. \end{aligned} \end{aligned}$$

To see that also assumption (iv) of Theorem 2.1 holds, we apply (3.6), upon noticing that \(m_1(\Lambda ^{N})=\Vert \varvec{y} \Vert _{\overline{Y}^N}\),

$$\begin{aligned} \Vert \varvec{b}^{\varepsilon ,N}(\varvec{y}) \Vert _{\overline{Y}^N} = \frac{1}{N}\sum _{i=1}^N \Vert b^\varepsilon _{\Lambda ^{N}}(y^{i}) \Vert _{\overline{Y}} \le \frac{M_\varepsilon }{N}\sum _{i=1}^N(1+\Vert y^{i} \Vert _{\overline{Y}}+m_1(\Lambda ^{N})) = M_\varepsilon \,(1+2\Vert \varvec{y}\Vert _{\overline{Y}^N}). \end{aligned}$$

Existence and uniqueness of the solution to system (3.30) follow now from Theorem 2.1.

Finally, because of (3.6), for every \(t\in [0,T]\) we have that

$$\begin{aligned} \begin{aligned} \Vert y_t^i \Vert _{\overline{Y}}&\le \Vert \bar{y}^i \Vert _{\overline{Y}}+\int _0^t \Vert b^\varepsilon _{\Lambda _s^{N}}(y^{i}_s) \Vert _{\overline{Y}}\,\textrm{d}s \le \Vert \bar{y}^i \Vert _{\overline{Y}} + \int _0^t \big [M_\varepsilon (1+\Vert y^{i}_s \Vert _{\overline{Y}}+m_1(\Lambda _s^{N}))\big ]\,\textrm{d}s \\&\le \Vert \bar{y}^i \Vert _{\overline{Y}} + \int _0^t \big [M_\varepsilon (1+\Vert y^{i}_s \Vert _{\overline{Y}} + \Vert \varvec{y}_s \Vert _{\overline{Y}^N})\big ]\,\textrm{d}s \\&\le \sup _{j=1,\ldots ,N} \Vert \bar{y}^j \Vert _{\overline{Y}} +\int _0^t \Big [M_\varepsilon \Big (1+2\sup _{j=1,\ldots ,N}\Vert y_s^j \Vert _{\overline{Y}}\Big ) \Big ]\,\textrm{d}s . \end{aligned} \end{aligned}$$

Taking the supremum over \(i=1,\ldots ,N\) in the left-hand side and applying Grönwall’s Lemma, we conclude that

$$\begin{aligned} \sup _{\begin{array}{c} i=1,\ldots ,N\\ t\in [0,T] \end{array}}\Vert y_t^i \Vert _{\overline{Y}} \le \Big (\sup _{i=1,\ldots ,N} \Vert \bar{y}^i \Vert _{\overline{Y}} + M_\varepsilon T\Big ) e^{2M_\varepsilon T}, \end{aligned}$$

which is (3.31). \(\square \)
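The Grönwall step at the end of the proof can be spelled out as follows. This is a sketch in which the shorthands \(g(t)\) and \(a\) are introduced only for illustration, and the time integral is taken up to the current time \(t\). Setting \(g(t) :=\sup _{i=1,\ldots ,N} \Vert y^i_t \Vert _{\overline{Y}}\) and \(a :=\sup _{i=1,\ldots ,N} \Vert \bar{y}^i \Vert _{\overline{Y}}\), the integral form of Grönwall's inequality with the nondecreasing forcing term \(a + M_\varepsilon t\) yields

$$\begin{aligned} g(t) \le a + M_\varepsilon t + 2M_\varepsilon \int _0^t g(s)\,\textrm{d}s \quad \Longrightarrow \quad g(t) \le (a + M_\varepsilon t)\,e^{2M_\varepsilon t} \le (a + M_\varepsilon T)\,e^{2M_\varepsilon T} \qquad \text {for every }t\in [0,T]\,. \end{aligned}$$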

We state here a second existence and uniqueness result, which will be useful in the next section.

Proposition 3.4

Let \(v_\Psi :Y \rightarrow \mathbb {R}^d\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\) and let \(\mathcal {T}_\Psi :Y \rightarrow L^p(U,\eta )\) satisfy \(\mathrm {(T1)}\)–\(\mathrm {(T3)}\); let \(\varepsilon >0\) and let \(r_{\varepsilon },R_{\varepsilon }\) be as in Proposition 3.2. Let \(\Lambda \in C^0([0,T]; (\mathcal {P}_1(Y_\varepsilon ), \mathcal {W}_1))\) and assume that there exists \(\varrho >0\) such that \(\Lambda _{t}\in \mathcal {P}(B_{\varrho }^{Y_\varepsilon })\) for all \(t\in [0,T]\). Then, for every \(\bar{y}\in Y_\varepsilon \) the Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{ll} \dot{y}_{t} = b^{\varepsilon }_{\Lambda _{t}} (y_{t})\,,\\ y_{0} = \bar{y} \end{array} \right. \end{aligned}$$

has a unique solution.


Proof

The result follows by a direct application of Theorem 2.1 and Proposition 3.2, as this time the field \(b_{\Lambda _{t}}^{\varepsilon }\) is fixed. \(\square \)

In view of the previous result, the following definition is justified.

Definition 3.5

Let \(\varepsilon >0\), let \(r_{\varepsilon },R_{\varepsilon }\) be as in Proposition 3.2, let \(\varrho >0\), and let \(\Lambda \in C([0, T]; (\mathcal {P}_{1} (Y_{\varepsilon }); \mathcal {W}_{1}))\) be such that \(\Lambda _{t} \in \mathcal {P}( B^{Y_{\varepsilon }}_{\varrho })\) for every \(t \in [0, T]\). We define the transition map \(\varvec{\textrm{Y}}_{\Lambda }(t,s,\bar{y})\) associated with the ODE (3.32) as

$$\begin{aligned} \varvec{\textrm{Y}}_{\Lambda }(t, s, \bar{y}) := y_t \,, \end{aligned}$$

where \(t\mapsto y_t \) is the unique solution to (3.32) with the initial condition replaced by \(y_s = \bar{y}\).

4 Mean-Field Limit

In this section we aim at passing to the mean-field limit as \(N\rightarrow \infty \) in system (3.30). Throughout the section, we fix \(\varepsilon >0\), \(r_{\varepsilon } \in (0, 1)\), and \(R_{\varepsilon } \in (1, +\infty )\) as in Theorem 3.3. As is customary in the study of mean-field limits of particle systems, we look at the limit of the empirical measure \(\Lambda _t^{N} = \frac{1}{N} \sum _{i=1}^{N} \delta _{y^{i}_{t}}\) associated with a solution \(\varvec{y}:[0, T] \rightarrow Y_{\varepsilon }^{N}\) of system (3.30). In Theorem 4.2 we will show that, under suitable assumptions on the initial conditions, the sequence of curves \(t \mapsto \Lambda ^{N}_{t}\) converges to a curve \(\Lambda \in C( [0, T]; (\mathcal {P}_{1}(Y_{\varepsilon }); \mathcal {W}_{1}))\) that solves the continuity equation

$$\begin{aligned} \partial _t \Lambda _t + \textrm{div}(b^\varepsilon _{\Lambda _t}\,\Lambda _t) = 0\,. \end{aligned}$$

We start by recalling the definition of Eulerian solution to (4.1).

Definition 4.1

Let \(\bar{\Lambda } \in \mathcal {P}_1 (Y_\varepsilon )\). We say that \(\Lambda \in C^0([0,T];(\mathcal {P}_1(Y_\varepsilon ),\mathcal {W}_1))\) is an Eulerian solution to equation (4.1) with initial datum \(\bar{\Lambda }\) if \(\Lambda _{0} = \bar{\Lambda }\) and for every \(\phi \in C^1_b([0,T]\times \overline{Y})\) and every \(t\in [0,T]\) it holds

$$\begin{aligned} \int _{Y_\varepsilon }\phi (t,y)\,\textrm{d}\Lambda _t (y) - \int _{Y_\varepsilon }\phi (0,y)\,\textrm{d}\Lambda _0 (y) = \int _0^t \int _{Y_\varepsilon }\big (\partial _t\phi (s,y) + D\phi (s,y)\cdot b^\varepsilon _{\Lambda _s}(y)\big )\,\textrm{d}\Lambda _s (y)\,\textrm{d}s, \end{aligned}$$

where \(D\phi (s,y)\) is the Fréchet differential of \(\phi \) in the y-variable.

The main result of this section is an existence and uniqueness result for Eulerian solutions to (4.1), together with their characterization as the mean-field limit of the particle system (3.30).

Theorem 4.2

Let \(\varrho >0\) and \(\bar{\Lambda }\in \mathcal {P}(B_{\varrho }^{Y_\varepsilon })\) be a given initial datum. Then, the following facts hold:

  1. (1)

    there exists a unique Eulerian solution \(\Lambda \in C([0, T]; (\mathcal {P}_{1} (Y_{\varepsilon }); \mathcal {W}_{1}))\) to (4.1) with initial datum \(\bar{\Lambda }\);

  2. (2)

    if \(\bar{\varvec{y}}_{N}:= (\bar{y}^{1}_{N}, \ldots , \bar{y}^{N}_{N}) \in Y^{N}_{\varepsilon }\) satisfies \(\Vert \bar{y}^{i}_{N}\Vert _{\overline{Y}} \le \varrho \) for every \(i = 1, \ldots , N\) and every \(N \in \mathbb {N}\) and \(\bar{\Lambda }^{N}:= \frac{1}{N}\sum _{i=1}^N\delta _{\bar{y}^{i}_{N}} \in \mathcal {P}(B_\varrho ^{Y_\varepsilon })\) is such that

    $$\begin{aligned} \lim _{N\rightarrow \infty } \mathcal {W}_1(\bar{\Lambda } , \bar{\Lambda }^{N}) = 0\,, \end{aligned}$$

    then the corresponding sequence of empirical measures \(\Lambda _{t}^{N}\) associated with the system (3.30) with initial data \(\bar{y}^{i}_{N}\) fulfills

    $$\begin{aligned} \lim _{N\rightarrow \infty } \mathcal {W}_1 (\Lambda _t,\Lambda _t^{N}) = 0 \qquad \text {uniformly with respect to}\,\,t\in [0,T]. \end{aligned}$$

Before proving existence of an Eulerian solution, we briefly discuss its uniqueness. This result is a consequence of the following superposition principle (see [28, Theorem 3.11] and [3, Theorem 5.2]).

Theorem 4.3

(Superposition principle) Let \((E,\Vert \cdot \Vert _{E})\) be a separable Banach space, let \(b:(0,T)\times E \rightarrow E\) be a Borel vector field, and let \(\mu \in C ([0, T]; \mathcal {P}(E)) \) be such that

$$\begin{aligned} \int _0^T \int _E \Vert b_t \Vert _E\,\textrm{d}\mu _t\,\textrm{d}t < +\infty \,. \end{aligned}$$

If \(\mu \) is a solution to the continuity equation

$$\begin{aligned} \partial _{t} \mu _t + \textrm{div}(b_t\,\mu _t) = 0 \end{aligned}$$

in duality with cylindrical functions \(\phi \in C^1_b(E)\), then there exists \(\varvec{\eta }\in \mathcal {P}(C([0,T];E))\) concentrated on absolutely continuous solutions to the Cauchy problems

$$\begin{aligned} \left\{ \begin{array}{ll} \dot{\gamma } = b_t(\gamma ),\\ \gamma _{0} \in \textrm{spt} \mu _{0} \end{array} \right. \end{aligned}$$

and with \((\textrm{ev}_t)_\#\varvec{\eta }=\mu _t\) for all \(t\in [0,T]\), where \(\textrm{ev}_{t} :C([0, T]; E) \rightarrow E\) is the evaluation map at time t, defined as \(\textrm{ev}_{t} (\gamma ):= \gamma (t)\) for every \(\gamma \in C([0, T]; E)\).

The following uniqueness result holds.

Theorem 4.4

Let \(\bar{\Lambda } \in \mathcal {P}_{1} (Y_{\varepsilon })\) and assume that \(\Lambda \in C([0, T]; ( \mathcal {P}(Y_{\varepsilon }); \mathcal {W}_{1}))\) is a solution to (4.1) with initial condition \(\Lambda _{0} = \bar{\Lambda }\). Then, \(\Lambda \) is the unique solution to (4.1) with the same initial value.


Proof

Uniqueness of \(\Lambda \) follows from Theorems 4.3 and 3.3. Indeed, by continuity of \(t \mapsto \Lambda _{t}\), the following maximum is finite:

$$\begin{aligned} M:= \max _{t \in [0, T]} \, m_{1}(\Lambda _{t}) <+\infty . \end{aligned}$$

Hence, setting \(b_{t} := b^{\varepsilon }_{\Lambda _{t}}\), we have by (3.6) that

$$\begin{aligned} \begin{aligned} \int _{0}^{T} \int _{Y} \Vert b_{t} (y) \Vert _{L^{p}(U, \eta )} \, \textrm{d}\Lambda _{t} (y) \, \textrm{d}t&\le \int _{0}^{T} \int _{Y} M_{\varepsilon } (1 + \Vert y \Vert _{\overline{Y}} + M) \, \textrm{d}\Lambda _{t} (y) \, \textrm{d}t \\&\le T M_{\varepsilon } (1 + 2 M) < +\infty \,, \end{aligned} \end{aligned}$$

which is precisely (4.3). Since \(L^{p} (U, \eta )\) is a separable Banach space, we may apply Theorem 4.3 and deduce that there exists \(\varvec{\eta } \in \mathcal {P} (C([0, T]; \overline{Y}))\) concentrated on solutions to the Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{ll} \dot{y}_{t} = b^{\varepsilon }_{\Lambda _{t}} (y_{t})\,,\\ y_{0} \in {\textrm{spt}} (\bar{\Lambda })\,, \end{array} \right. \end{aligned}$$

and such that \(\Lambda _{t} = ({\textrm{ev}}_{t})_{\#} \varvec{\eta }\) for \(t \in [0, T]\). As \(\bar{\Lambda } \in \mathcal {P}_{1} (Y_{\varepsilon })\), Theorem 3.3 implies that for any initial condition \(y_{0} \in {\textrm{spt}} (\bar{\Lambda })\) system (4.4) admits a unique solution. This yields the uniqueness of \(\Lambda \). \(\square \)

In order to prove existence of an Eulerian solution \(\Lambda \) to (4.1), we need to pass through the notion of Lagrangian solution, which we recall below (see also [10, Definition 3.3]).

Definition 4.5

Let \(\bar{\Lambda } \in \mathcal {P}_{1}(Y_\varepsilon )\) be a given initial datum. We say that \(\Lambda \in C^0( [0,T]; (\mathcal {P}_{1} (Y_\varepsilon ); \mathcal {W}_1))\) is a Lagrangian solution to (4.1) with initial datum \(\bar{\Lambda }\) if it satisfies

$$\begin{aligned} \Lambda _t = \varvec{\textrm{Y}}_{\Lambda }(t,0,\cdot )_\#\bar{\Lambda } \qquad \text { for every }t \in [0, T], \end{aligned}$$

where \(\varvec{\textrm{Y}}_{\Lambda }(t,s,\bar{y})\) are the transition maps associated with the ODE (3.32).

Remark 4.6

Recalling the definition of push-forward measure, it can be directly proven that Lagrangian solutions are also Eulerian solutions.
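A sketch of the computation behind this remark is the following, assuming enough regularity to differentiate along the flow and to exchange derivative and integral (for instance, \(\phi \in C^1_b([0,T]\times \overline{Y})\) together with the growth bound of Proposition 3.2). Writing \(y_t :=\varvec{\textrm{Y}}_{\Lambda }(t,0,\bar{y})\) for \(\bar{y}\in \textrm{spt}\,\bar{\Lambda }\), the chain rule and the definition of push-forward give

$$\begin{aligned} \begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\,\phi (t,y_t)&= \partial _t\phi (t,y_t) + D\phi (t,y_t)\cdot b^\varepsilon _{\Lambda _t}(y_t)\,, \\ \int _{Y_\varepsilon }\phi (t,y)\,\textrm{d}\Lambda _t(y) - \int _{Y_\varepsilon }\phi (0,y)\,\textrm{d}\bar{\Lambda }(y)&= \int _{Y_\varepsilon }\big (\phi (t,y_t) - \phi (0,\bar{y})\big )\,\textrm{d}\bar{\Lambda }(\bar{y}) \\&= \int _0^t\int _{Y_\varepsilon }\big (\partial _t\phi (s,y) + D\phi (s,y)\cdot b^\varepsilon _{\Lambda _s}(y)\big )\,\textrm{d}\Lambda _s(y)\,\textrm{d}s\,, \end{aligned} \end{aligned}$$

which is the identity required in Definition 4.1.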

We first need the following lemma.

Lemma 4.7

Let \(v_\Psi :Y \rightarrow \mathbb {R}^d\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\) and let \(\mathcal {T}_\Psi :Y \rightarrow L^{p}(U, \eta )\) satisfy \(\mathrm {(T1)}\)–\(\mathrm {(T3)}\). Let \(\delta >0\), let \(\bar{\Lambda } \in \mathcal {P}(B_\delta ^{Y_\varepsilon })\), and assume that \(\Lambda \in C^0([0,T];(\mathcal {P}_1(Y_\varepsilon ),\mathcal {W}_1))\) is a Lagrangian solution to (4.1) with initial datum \(\bar{\Lambda }\). Then, there exists \(\varrho \in (0, +\infty )\) only depending on \(\varepsilon \), \(\delta \), and T such that

$$\begin{aligned} \Lambda _{t}\in \mathcal {P}(B_{\varrho }^{Y_\varepsilon })\qquad \text { for every }t\in [0,T]. \end{aligned}$$


Proof

It suffices to show that there exists \(\varrho \in (0, +\infty )\) such that

$$\begin{aligned} \max \limits _{y\in B_{\delta }^{Y_\varepsilon }}\Vert \varvec{\textrm{Y}}_{\Lambda }(t,0,y) \Vert _{\overline{Y}} \le \varrho \qquad \text { for every }t\in [0,T]. \end{aligned}$$

We first observe that by definition of Lagrangian solutions and the fact that \(\bar{\Lambda } \in \mathcal {P}(B_{\delta }^{Y_\varepsilon })\), we immediately have

$$\begin{aligned} m_1(\Lambda _t)\le \max _{y\in B_{\delta }^{Y_\varepsilon }}\Vert \varvec{\textrm{Y}}_{\Lambda } (t,0,y) \Vert _{\overline{Y}} \qquad \text { for every }t\in [0,T]. \end{aligned}$$

Arguing as in Theorem 3.3, by definition of the transition map, by (3.6), and by (4.7), for every \(y \in B^{Y_{\varepsilon }}_{\delta }\) we have that

$$\begin{aligned} \begin{aligned} \Vert \varvec{\textrm{Y}}_{\Lambda }(t,0,y) \Vert _{\overline{Y}}&\le \delta + M_\varepsilon \int _0^t(1+\Vert \varvec{\textrm{Y}}_{\Lambda }(s,0,y) \Vert _{\overline{Y}}+m_1(\Lambda _s))\,\textrm{d}s \\&\le \delta + M_\varepsilon \int _0^t \Big ( 1 + 2 \max _{y \in B^{Y_{\varepsilon }}_{\delta }} \Vert \varvec{\textrm{Y}}_{\Lambda }(s,0,y) \Vert _{\overline{Y}} \Big )\,\textrm{d}s\,. \end{aligned} \end{aligned}$$

By Grönwall's inequality we deduce that (4.6) holds true with \(\varrho =(\delta + M_{\varepsilon } T ) e^{2M_{\varepsilon }T}\). \(\square \)
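The Grönwall step can be checked numerically on the scalar comparison problem. The sketch below is only an illustration under stated assumptions: it replaces the maximal norm by a scalar function u solving the equality case u' = M(1 + 2u), u(0) = δ, and the values chosen for δ, M (standing in for \(M_\varepsilon \)) and T are hypothetical.

```python
import math

def worst_case_radius(delta, M, T, steps=100_000):
    """Explicit Euler for u' = M * (1 + 2u), u(0) = delta, the equality case
    of the integral inequality; returns (u_euler(T), u_exact(T), rho)."""
    h = T / steps
    u = delta
    for _ in range(steps):
        u += h * M * (1.0 + 2.0 * u)
    # Closed-form solution of the linear comparison ODE.
    u_exact = (delta + 0.5) * math.exp(2.0 * M * T) - 0.5
    # Radius stated at the end of the proof of Lemma 4.7.
    rho = (delta + M * T) * math.exp(2.0 * M * T)
    return u, u_exact, rho

# Hypothetical values for delta, M_eps and T, chosen only for illustration.
u_T, u_exact_T, rho = worst_case_radius(delta=1.0, M=0.7, T=2.0)
print(u_T <= u_exact_T <= rho)
```

The radius \(\varrho \) is not sharp: the exact solution of the comparison problem stays strictly below it, which is all the lemma requires.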

We are now in a position to prove Theorem 4.2.

Proof of Theorem 4.2

The structure of the proof follows step by step that of [28, Theorem 3.5] (see also [3, Theorem 4.1]). We report it here briefly for the reader's convenience, underlining the use of different function spaces. In particular, we notice that closed and bounded subsets of \(L^{p}(U, \eta )\) are not compact, which does not allow us to apply the Ascoli–Arzelà theorem in combination with Theorem 3.3 to obtain a mean-field limit result.

The proof goes through a finite-dimensional approximation and involves three steps.

Step 1: Stability of Lagrangian solutions. Let us fix \(\delta > 0\) and \(\bar{\Lambda }^{1}, \bar{\Lambda }^{2} \in \mathcal {P}(B_{\delta }^{Y_\varepsilon })\). Let us assume that \(\Lambda ^{1}, \Lambda ^{2} \in C([0, T]; (\mathcal {P}_{1}(Y_{\varepsilon }), \mathcal {W}_{1} ) )\) are two Lagrangian solutions to (4.1) with initial data \(\bar{\Lambda }^{1}\) and \(\bar{\Lambda }^{2}\), respectively. In particular, by Lemma 4.7 we have that there exists \(\varrho \) (only depending on \(\delta \) and \(\varepsilon \)) such that \(\Lambda ^{1}_{t}, \Lambda ^{2}_{t} \in \mathcal {P}(B^{Y_{\varepsilon }}_{\varrho })\) for every \(t \in [0, T]\). We claim that

$$\begin{aligned} \mathcal {W}_1(\Lambda _t^{1},\Lambda _t^{2}) \le e^{L_{\varepsilon ,\varrho } t + L_{\varrho } t e^{L_{\varepsilon ,\varrho }T} } \,\mathcal {W}_1(\bar{\Lambda }^{1},\bar{\Lambda }^{2})\qquad \text { for every }t\in [0,T]. \end{aligned}$$

To prove (4.8), we fix \(\bar{y}^{1}, \bar{y}^{2}\in B_{\delta }^{Y_\varepsilon }\) and first observe that by Lemma 4.7

$$\begin{aligned} \max _{t \in [0, T]} \, \vert \vert \varvec{\textrm{Y}}_{\Lambda ^{i}}(t,0,\bar{y}^{\,i})\vert \vert _{\overline{Y}} \le \varrho \qquad \text { for } i=1,2. \end{aligned}$$

For simplicity, let us set \(y^{i}_{t}:= \varvec{Y}_{\Lambda ^{i}} (t, 0, \bar{y}^{i})\). By (3.4) and (3.5) of Proposition 3.2 and by (4.9), we get that for every \(t \in [0, T]\)

$$\begin{aligned} \begin{aligned} \Vert y^1_t - y^2_t\Vert _{\overline{Y}}&\le \Vert \bar{y}^{1} - \bar{y}^{2} \Vert _{\overline{Y}} + \int _0^t\left( \Vert b^\varepsilon _{\Lambda _s^{1}}(y^1_s) - b^\varepsilon _{\Lambda _s^{1}}(y^2_s) \Vert _{\overline{Y}} + \Vert b^\varepsilon _{\Lambda _s^{1}}(y^2_s) - b^\varepsilon _{\Lambda _s^{2}}(y^2_s)\Vert _{\overline{Y}}\right) \textrm{d}s \\&\le \Vert \bar{y}^{1} - \bar{y}^{2}\Vert _{\overline{Y}} + L_{\varrho } \int _0^t \mathcal {W}_1(\Lambda _s^{1} , \Lambda _s^{2})\, \textrm{d}s + \int _0^t L_{\varepsilon ,\varrho }\,\Vert y^1_s - y^2_s \Vert _{\overline{Y}}\,\textrm{d}s\,. \end{aligned} \end{aligned}$$

Applying Grönwall’s lemma, we infer from (4.10) that for every \(t \in [0, T]\)

$$\begin{aligned} \Vert y^1_t - y^2_t \Vert _{\overline{Y}} \le \left( \Vert \bar{y}^{1} - \bar{y}^{2} \Vert _{\overline{Y}} + L_{\varrho } \int _0^t \mathcal {W}_1(\Lambda _s^{1} ,\Lambda _s^{2}) \, \textrm{d}s \right) \,e^{L_{\varepsilon , \varrho } t} \,. \end{aligned}$$

Let \(\Pi \in \mathcal {P} (\overline{Y} \times \overline{Y})\) be an optimal plan between \(\bar{\Lambda }^{1}\) and \(\bar{\Lambda }^{2}\). By the definition of Lagrangian solutions, \((\varvec{\textrm{Y}}_{\Lambda ^{1}}(t,0,\cdot ),\varvec{\textrm{Y}}_{\Lambda ^{2}}(t,0,\cdot ))_\#\Pi \) is a transport plan between \(\Lambda _t^{1}\) and \(\Lambda _t^{2}\). Therefore, using (4.11) we may estimate

$$\begin{aligned} \begin{aligned} \mathcal {W}_1(\Lambda ^{1}_t,\Lambda ^{2}_t)&\le \int _{Y_\varepsilon \times Y_\varepsilon } \Vert \varvec{\textrm{Y}}_{\Lambda ^{1}}(t,0,y^1) -\varvec{\textrm{Y}}_{\Lambda ^{2}}(t,0,y^2) \Vert _{\overline{Y}}\,\textrm{d}\Pi (y^1,y^2) \\&\le e^{L_{\varepsilon , \varrho } t } \int _{\overline{Y} \times \overline{Y} } \Vert y^1 - y^2 \Vert _{\overline{Y}} \, \textrm{d}\Pi (y^1,y^2) + L_{\varrho } e^{L_{\varepsilon ,\varrho } t } \int _0^t \mathcal {W}_1(\Lambda _s^{1} , \Lambda _s^{2} ) \, \textrm{d}s \\&= e^{L_{\varepsilon ,\varrho } t } \mathcal {W}_1(\bar{\Lambda }^{1},\bar{\Lambda }^{2}) + L_{\varrho } e^{L_{\varepsilon ,\varrho }t}\,\int _0^t \mathcal {W}_1(\Lambda _s^{1},\Lambda _s^{2}) \,\textrm{d}s\,. \end{aligned} \end{aligned}$$

Applying Grönwall's lemma again, we deduce (4.8).
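The stability estimate (4.8) can be illustrated on a toy problem. The following sketch does not use the label dynamics of (4.1): as an assumption made purely for illustration, it takes the first-order system from the Introduction on the real line with the hypothetical interaction f(x, y) = y − x, whose drift is 1-Lipschitz both in the position and in the measure. In one dimension the \(\mathcal {W}_1\) distance between two empirical measures with the same number of atoms is the mean absolute difference of the sorted samples.

```python
import math
import random

def w1_sorted(xs, ys):
    """W1 between two empirical measures on R with the same number of atoms."""
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def euler_flow(xs, T, steps):
    """Explicit Euler for x_i' = (1/N) sum_j (x_j - x_i) = mean(x) - x_i."""
    h = T / steps
    xs = list(xs)
    for _ in range(steps):
        m = sum(xs) / len(xs)
        xs = [x + h * (m - x) for x in xs]
    return xs

random.seed(0)
N, T = 200, 1.0
x0 = [random.gauss(0.0, 1.0) for _ in range(N)]
y0 = [x + 0.1 * random.random() for x in x0]    # perturbed initial datum
d0 = w1_sorted(x0, y0)
dT = w1_sorted(euler_flow(x0, T, 1000), euler_flow(y0, T, 1000))

# Right-hand side of (4.8) with both Lipschitz constants equal to 1 (toy values).
L_x, L_mu = 1.0, 1.0
bound = math.exp(L_x * T + L_mu * T * math.exp(L_x * T)) * d0
print(dT <= bound)
```

For this contractive toy interaction the bound is far from sharp; the point is only that the distance at time T is controlled by the distance of the initial data, uniformly in the number of particles.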

Step 2: Existence and approximation of Lagrangian solutions. We fix a sequence of atomic measures \(\bar{\Lambda }^{N } \in \mathcal {P}(B_{\delta }^{Y_{\varepsilon }} )\) such that

$$\begin{aligned} \lim _{N\rightarrow \infty }\mathcal {W}_1(\bar{\Lambda }^{N},\bar{\Lambda } ) = 0 \,. \end{aligned}$$

Such a sequence can be constructed as follows: let \(\bar{y}^{i}(z)\in Y_\varepsilon \) be independent and identically distributed with law \(\bar{\Lambda }\), so that the random measures \(\bar{\Lambda }^{N} :=\frac{1}{N}\sum _{i=1}^N \delta _{\bar{y}^i(z)}\) almost surely converge in \(\mathcal {P}_1(Y_\varepsilon )\) to \(\bar{\Lambda }\). Then, choose a realization z such that this convergence takes place. By Theorem 3.3, there exists a unique solution to system (3.30) with initial condition \(\bar{\varvec{y}} = (\bar{y}^{1}, \ldots , \bar{y}^{N})\); let \(\Lambda ^{N}_t\) be the associated empirical measures. As \(\Lambda _t^{N}\) are also Lagrangian solutions to (4.1) with initial condition \(\bar{\Lambda }^{N}\), (4.8) provides a constant \(C:=C(\varepsilon , \delta , T)\) such that for every \(t\in [0,T]\) and every \(N, M \in \mathbb {N}\)

$$\begin{aligned} \mathcal {W}_1(\Lambda _t^{N},\Lambda _t^{M}) \le C \mathcal {W}_1(\bar{\Lambda }^{N} , \bar{\Lambda }^{M} )\,. \end{aligned}$$

Thus, \(\Lambda ^{N}\in C([0,T];(\mathcal {P}_1(B_{\varrho }^{Y_\varepsilon }),\mathcal {W}_1))\) is a Cauchy sequence, and there exists \(\Lambda \in C([0,T];(\mathcal {P}_1(B_{\varrho }^{Y_\varepsilon }),\mathcal {W}_1))\) such that \(\Lambda ^{N}_{t}\) converges to \(\Lambda _{t}\) with respect to the Wasserstein distance \(\mathcal {W}_{1}\), uniformly in \(t \in [0, T]\). Moreover, arguing as in the proof of (4.6), we may find \(\bar{\varrho } \ge \varrho \) such that \(\varvec{Y}_{\Lambda } (t, 0, \bar{y}) \in B^{Y_{\varepsilon }}_{\bar{\varrho }}\) for every \(t \in [0, T]\) and every \(\bar{y} \in B^{Y_{\varepsilon }}_{\delta }\). In view of (3.4) and (3.5) we obtain that

$$\begin{aligned} \Vert \varvec{\textrm{Y}}_{\Lambda } ( t , 0 , \bar{y} ) - \varvec{\textrm{Y}}_{\Lambda ^{N}}(t,0,\bar{y} )\Vert _{\overline{Y}} \le L_{\bar{\varrho }}\,e^{L_{\varepsilon ,\bar{\varrho } }t} \int _0^t \mathcal {W}_1(\Lambda _s ,\Lambda _{s}^{N}) \,\textrm{d}s\,. \end{aligned}$$

Step 3: Uniqueness and conclusion. Uniqueness of Lagrangian solutions, given the initial datum, follows now from (4.8). Uniqueness of Eulerian solutions is stated in Theorem 4.4. \(\square \)

5 Fast Reaction Limit for Undisclosed Replicator-Type Dynamics

The aim of this section is to address the case in which the dynamics for the labels runs at a much faster time scale than the dynamics for the agents’ positions. In this case, introducing the fast time scale \(\tau = \lambda \,t\), with \(\lambda \gg 1\), system (3.30) takes the form

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{x}_t^{i} = v_{\Lambda _t^N}(x_t^{i},\ell ^{i}_t), \\ \dot{\ell }_t^{i} = \lambda [\mathcal {T}_{\Lambda _t^{N}}(x^{i}_t,\ell ^{i}_t) + \varepsilon \,{\mathcal {H}}(\ell _t^{i})], \end{array}\right. } \qquad \text {for }i=1,\dots ,N,\,\,t\in [0,T]. \end{aligned}$$

Note that, for \(\varepsilon >0\) and \(0< r_{\varepsilon }< 1< R_{\varepsilon } < +\infty \) as in Proposition 3.2, the well-posedness of (5.1) is still guaranteed by Theorem 3.3 (see Proposition 5.3). We focus on the behavior of system (5.1) as \(\lambda \rightarrow +\infty \), thus we are interested in the case of instantaneous adjustment of the strategies.

From now on, for \(\Psi \in \mathcal {P}_{1} (Y_{\varepsilon })\) we denote \(\nu :=\pi _{\#} \Psi \), where \(\pi :Y_{\varepsilon } \rightarrow \mathbb {R}^{d}\) is the canonical projection over \(\mathbb {R}^{d}\). If \(\Lambda ^{N}, \Lambda \) are curves with values in \(\mathcal {P}_{1}(Y_{\varepsilon })\), the symbols \(\mu ^{N}\) and \(\mu \) will instead indicate the curves of measures \(\mu ^{N}_{t}, \mu _{t}\), obtained as push-forward of \(\Lambda ^{N}_{t}\) and \(\Lambda _{t}\) for \(t \in [0, T]\) through \(\pi \).

We assume that the strategy dynamics is of replicator type, i.e., we suppose that in the second equation in (5.1) the operator \(\mathcal {T}_{\Psi }\) takes the form

$$\begin{aligned} \mathcal {T}_{\Psi } (x, \ell ) :=\left( \int _{U} \partial _\xi F_{\nu } ( x, \ell (u), u ) \ell (u) \, \textrm{d}\eta (u) - \partial _\xi F_{\nu } ( x, \ell , \cdot ) \right) \ell \qquad \text { for }x \in \mathbb {R}^{d} \text { and }\ell \in L^{p}(U, \eta ), \end{aligned}$$

for a map \(F:\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow [-\infty ,+\infty ]\) satisfying the following properties:


\(\mathrm {(F1)}\):

for every \(\varrho >0\), every \(\nu \in \mathcal {P}(B_\varrho )\), every \(x\in B_\varrho \), and every \(\ell \in C_\varepsilon \), the map \(u \mapsto F_{\nu } ( x, \ell (u), u )\) is \(\eta \)–integrable;


\(\mathrm {(F2)}\):

for every \(\varrho >0\), every \(\nu \in \mathcal {P}(B_\varrho )\), every \(x\in B_\varrho \), and every \(u\in U\), the map \(g_{(\nu , x, u)} :(0, +\infty ) \rightarrow \mathbb {R}\) defined as \(g_{(\nu , x, u)}(\xi ) :=F_{\nu } ( x, \xi , u )\) is convex, is differentiable, and its derivative \(g'_{(\nu , x, u)}\) is Lipschitz continuous in \((0, +\infty )\), uniformly with respect to \((\nu , x, u) \in \mathcal {P}(B_\varrho ) \times B_{\varrho }\times U\);


\(\mathrm {(F3)}\):

there exists \(C_{F}>0\) such that for every \(\varrho >0\), every \(\nu \in \mathcal {P}(B_{\varrho })\), every \(x \in B_{\varrho }\), every \(\xi \in (0, +\infty )\), and every \(u \in U\)

$$\begin{aligned} |\partial _{\xi } F_{\nu } (x, \xi , u) | \le C_{F}; \end{aligned}$$

\(\mathrm {(F4)}\):

for every \(\varrho >0\), the maps \((\nu , x) \mapsto F_{\nu } (x, \xi , u)\) and \((\nu , x) \mapsto \partial _{\xi } F_{\nu } (x, \xi , u)\) are Lipschitz continuous in \(\mathcal {P}_{1}(B_{\varrho }) \times B_{\varrho }\) uniformly with respect to \(u \in U\) and \(\xi \in (0, +\infty )\). Namely, there exists \(\Gamma _{\varrho }>0\) such that for every \(\xi \in (0,+\infty )\), every \(x_{1}, x_{2} \in B_{\varrho }\), every \(\nu _{1}, \nu _{2} \in \mathcal {P}(B_{\varrho })\), and every \(u \in U\)

$$\begin{aligned} | F_{\nu _{1}}(x_1,\xi ,u) - F_{\nu _{2}}(x_2,\xi ,u) |&\le \Gamma _{\varrho } \big ( | x_1 - x_2 | + \mathcal {W}_1(\nu _{1} ,\nu _{2} ) \big )\,,\\ | \partial _\xi F_{\nu _{1}} (x_1, \xi , u ) - \partial _\xi F_{\nu _2} ( x_2 , \xi , u ) |&\le \Gamma _{\varrho } \big ( | x_1 - x_2 | + \mathcal {W}_1(\nu _1,\nu _2) \big )\,; \end{aligned}$$

\(\mathrm {(F5)}\):

for every \(\varrho >0\), every \(\nu \in \mathcal {P}(B_{\varrho })\), every \(\xi \in (0, +\infty )\), and every \(u \in U\), the map \(F_{\nu } (\cdot , \xi , u)\) is differentiable in \(\mathbb {R}^{d}\).

Remark 5.1

The analysis of the fast reaction limit in the undisclosed setting has been recently performed in [7] for the replicator dynamics (see also Remark 3.1), where the authors considered a pay-off function J independent of the strategy \(u'\) played by other players. Hence, the functional \(\mathcal {J}_{\Psi }\) in (3.1) takes the form

$$\begin{aligned} \begin{aligned} \mathcal {J}_{\Psi } (x,u) = \int _{Y} J(x,u,x')\,\textrm{d}\Psi (x',\ell ') = \int _{\mathbb {R}^d} J(x,u,x')\,\textrm{d}\nu (x') =:\mathcal {J}_{\nu }(x,u)\,, \end{aligned} \end{aligned}$$

which would correspond (see (5.2)) to the operator

$$\begin{aligned} \mathcal {T}_\Psi (x,\ell ) = \left( \mathcal {J}_{\nu } (x,\cdot ) - \int _U \mathcal {J}_{\nu } (x,u) \ell (u) \,\textrm{d}\eta (u) \right) \ell \,, \end{aligned}$$

and to \(F_{\nu } (x, \xi , u) :=-\mathcal {J}_{\nu }(x,u) \xi \) for every \((\nu , x, \xi , u) \in \mathcal {P}_{1}(\mathbb {R}^{d}) \times \mathbb {R}^{d} \times (0, +\infty ) \times U\). Furthermore, in [7] a precise choice for the velocity field \(v_{\Psi }\) is made, which is independent of the state variable \(\Psi \).

The theoretical framework described in \(\mathrm {(F1)}\)–\(\mathrm {(F5)}\) is more flexible than the one in [7]. Besides the freedom in the choice of \(v_{\Psi }\), we may for instance model more involved situations, where the pay-off of a certain strategy depends as well on how often such a strategy has been played. Such behavior may be captured by a pay-off function \(\widetilde{J} :\mathbb {R}^{d} \times U \times \mathbb {R}^{d} \times (0, +\infty ) \rightarrow \mathbb {R}\) of the form

$$\begin{aligned} \widetilde{J} (x, u, x', \xi ) :=J(x, u, x') - J_{1} (\xi ), \end{aligned}$$

where \(J_{1} :[0, +\infty ) \rightarrow \mathbb {R}\) is monotone increasing, concave, and differentiable with bounded and Lipschitz derivative. In particular, the monotonicity assumption on \(J_{1}\) is meant to penalize strategies that are played too often, and may be therefore expected by other players. Monotonicity of \(J_{1}\) and the regularity of its derivative comply with conditions \(\mathrm {(F1)}\)–\(\mathrm {(F5)}\).
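A finite-strategy sketch may help to read the replicator operator of this remark. It assumes, for illustration only, that U consists of finitely many strategies \(u_k\), that \(\eta \) is given by weights \(w_k\), and that labels are densities with \(\int _U \ell \,\textrm{d}\eta = 1\); the pay-off values are hypothetical. With these conventions the operator redistributes mass among strategies without creating or destroying it.

```python
def replicator_operator(payoff, ell, weights):
    """Discrete analogue of T(x, l) = (J - int_U J l d_eta) l at a fixed position x.
    payoff[k] = J_nu(x, u_k), ell[k] = l(u_k), weights[k] = eta({u_k})."""
    mean_payoff = sum(w * j * l for w, j, l in zip(weights, payoff, ell))
    return [(j - mean_payoff) * l for j, l in zip(payoff, ell)]

weights = [0.25, 0.25, 0.25, 0.25]   # eta: uniform on four strategies (assumption)
payoff = [1.0, 0.5, -0.2, 0.0]       # hypothetical values of J_nu(x, u_k)
ell = [1.6, 1.2, 0.8, 0.4]           # density w.r.t. eta, integrates to 1

T_ell = replicator_operator(payoff, ell, weights)
# Total mass is preserved: the eta-integral of T(x, l) vanishes.
drift_of_mass = sum(w * t for w, t in zip(weights, T_ell))
print(T_ell, drift_of_mass)
```

Strategies whose pay-off exceeds the current average gain weight, the others lose it, which is the usual reading of the replicator equation.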

The following proposition provides a set of conditions under which assumptions \((\textrm{F1})\)–\((\textrm{F5})\) are satisfied for integral functionals.

Proposition 5.2

Let \(f:\mathbb {R}^d\times ( 0, +\infty ) \times U \times \mathbb {R}^d \rightarrow (-\infty ,+\infty ]\) satisfy the following properties:

\(\mathrm {(f1)}\):

for every \(\varrho >0\), every \(\nu \in \mathcal {P}(B_{\varrho })\), every \(x\in B_{\varrho }\), and every \(\ell \in L^{p}(U, \eta )\) the map

$$\begin{aligned} u\mapsto \int _{\mathbb {R}^d} f(x,\ell (u),u,x')\,\textrm{d}\nu (x') \end{aligned}$$

is \(\eta \)–integrable;

\(\mathrm {(f2)}\):

for every \(\varrho >0\), every \(x, x' \in B_\varrho \), and every \(u\in U\), the map \(\xi \mapsto f(x, \xi , u, x')\) is convex in \((0, +\infty )\), is differentiable with derivative \(\partial _{\xi } f(x, \xi , u, x')\) Lipschitz continuous in \((0, +\infty )\), uniformly with respect to \((x, u, x') \in B_{\varrho } \times U \times B_{\varrho }\);


\(\mathrm {(f3)}\):

there exists \(C_{f}>0\) such that for every \(\varrho >0\), every \(x, x' \in B_{\varrho }\), every \(\xi \in (0, +\infty )\), and every \(u \in U\)

$$\begin{aligned} |\partial _{\xi } f (x, \xi , u, x') | \le C_{f}; \end{aligned}$$

\(\mathrm {(f4)}\):

for every \(\varrho >0\), every \(x, x' \in B_{\varrho }\), every \( \xi \in (0, +\infty )\), and every \(u\in U\) the function \(x' \mapsto f(x,\xi ,u,x')\) belongs to \(\textrm{Lip}_b(\mathbb {R}^d)\) and the map \(x \mapsto f (x, \xi , u, x' )\) belongs to \(\textrm{Lip}(\mathbb {R}^d)\), with Lipschitz constants depending only on \(\varrho \);

\(\mathrm {(f5)}\):

for every \(\varrho >0\), every \(x, x' \in B_{\varrho }\), every \( \xi \in (0, +\infty )\), and every \(u\in U\), the function \(x' \mapsto \partial _\xi f(x,\xi ,u, x')\) belongs to \(\textrm{Lip}_b(\mathbb {R}^d)\), and the map \(x \mapsto \partial _\xi f ( x, \xi , u,x' )\) belongs to \(\textrm{Lip}(\mathbb {R}^d)\), with Lipschitz constants depending only on \(\varrho \);

\(\mathrm {(f6)}\):

for every \(\varrho >0\), every \(\xi \in (0, +\infty )\), every \(u \in U\), and every \(x' \in B_{\varrho }\), the map \(f(\cdot , \xi , u, x')\) is differentiable in \(\mathbb {R}^{d}\).

Then, the functional \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) defined as

$$\begin{aligned} F_\nu (x,\xi ,u) :=\int _{\mathbb {R}^d} f ( x , \xi , u , x' )\,\textrm{d}\nu (x') \end{aligned}$$

fulfills conditions \(\mathrm {(F1)}\)–\(\mathrm {(F5)}\).


Proof

Condition \(\mathrm {(F1)}\) coincides with \(\mathrm {(f1)}\). Property \(\mathrm {(F2)}\) follows from \(\mathrm {(f2)}\), which in particular implies that

$$\begin{aligned} \partial _{\xi } F_{\nu } (x, \xi , u) = \int _{\mathbb {R}^{d} } \partial _{\xi } f (x, \xi , u, x') \, \textrm{d}\nu (x'). \end{aligned}$$

Thus, we deduce \((\textrm{F3})\) and \((\textrm{F4})\) from \(\mathrm {(f3)}\)–\(\mathrm {(f5)}\). Finally, from \(\mathrm {(f5)}\) and \(\mathrm {(f6)}\) we deduce that for every \(\varrho >0\), every \(\xi \in (0, +\infty )\), every \(u \in U\), and every \(\nu \in \mathcal {P}(B_{\varrho })\) we have

$$\begin{aligned} \partial _{x} F_{\nu } (x, \xi , u) = \int _{\mathbb {R}^{d}} \partial _{x} f(x, \xi , u, x') \, \textrm{d}\nu (x'). \end{aligned}$$

\(\square \)

For \(\lambda \in (0, +\infty )\), we now briefly discuss the well-posedness of (5.1) for the operator \(\mathcal {T}_{\Psi }\) as in (5.2).

Proposition 5.3

Let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\). Then, the operator \(\mathcal {T}_{\Psi }\) defined in (5.2) for every \(\Psi \in \mathcal {P}_{1}(Y)\) satisfies conditions \(\mathrm {(T1)}\)–\(\mathrm {(T3)}\).


Proof

By definition (5.2), \(\mathcal {T}_{\Psi }\) clearly satisfies \((\textrm{T1})\). Property \((\textrm{T2})\) is a consequence of \((\textrm{F2})\) and of \((\textrm{F4})\), while \((\textrm{T3})\) follows from \((\textrm{F3})\), as for \(y = (x, \ell ) \in Y\) and \(u \in U\) we can simply estimate

$$\begin{aligned} | \mathcal {T}_{\Psi } (y) (u)| \le 2 C_{F} | \ell (u)|. \end{aligned}$$

Thus, \((\textrm{T3})\) is satisfied with \(\omega ( \xi ) :=| \xi |\) for \(\xi \in [0, +\infty )\). \(\square \)

Corollary 5.4

Let \(v_{\Psi }\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\), let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\), and let \(\mathcal {T}_{\Psi }\) be as in (5.2). Moreover, for \(\varepsilon >0\) let \(0< r_{\varepsilon }< 1< R_{\varepsilon } < +\infty \) be given by Proposition 3.2. Then, the following facts hold:


for every \(\lambda \in (0, +\infty )\) and every \(N \in \mathbb {N}\), system (5.1) admits a unique solution for every initial condition \(\bar{y} :=(\bar{y}^{1}, \ldots , \bar{y}^{N}) \in Y_{\varepsilon }^{N}\);


for every \(\lambda , \delta \in (0, +\infty )\) and every \(\bar{\Lambda } \in \mathcal {P}(B^{Y_{\varepsilon }}_{\delta })\), there exists a unique (Lagrangian / Eulerian) solution to the continuity equation

$$\begin{aligned} \partial _{t} \Lambda _{t} + \textrm{div}(b^{\varepsilon , \lambda }_{\Lambda _{t}} \Lambda _{t}) = 0 \qquad \text { with } \Lambda _{0}= \bar{\Lambda }, \end{aligned}$$

where we have set

$$\begin{aligned} b^{\varepsilon , \lambda }_{\Lambda _{t}} (y) :=\left( \begin{array}{cc} v_{\Lambda _{t}} (y) \\ \lambda ( \mathcal {T}_{\Lambda _{t}} (y) + \varepsilon {\mathcal {H}} (\ell )) \end{array}\right) ; \end{aligned}$$

for every \(\lambda , \delta \in (0, +\infty )\) and every \(\bar{\Lambda }, \bar{\Lambda }_{n} \in \mathcal {P}(B^{Y_{\varepsilon }}_{\delta })\) such that \(\mathcal {W}_{1}( \bar{\Lambda }_{n}, \bar{\Lambda }) \rightarrow 0\) as \(n \rightarrow \infty \), the corresponding solutions \(\Lambda , \Lambda _{n} \in C([0, T]; (\mathcal {P}_{1}(Y_{\varepsilon }), \mathcal {W}_{1}))\) to (5.3) with initial conditions \(\bar{\Lambda }\) and \(\bar{\Lambda }_{n}\), respectively, satisfy

$$\begin{aligned} \lim _{n \rightarrow \infty } \, \mathcal {W}_{1} (\Lambda _{n, t}, \Lambda _{t}) = 0 \qquad \text { uniformly in }t \in [0, T]. \end{aligned}$$


Proof

All the items are a consequence of Proposition 5.3 and of Theorem 2.1, and can be obtained arguing as in Proposition 3.2 and Theorems 3.3 and 4.2, taking care of the fact that all the involved constants (\(L_{\varrho }\), \(L_{\varepsilon , \varrho }\), \(M_{\varepsilon }\), and \(\theta _{\varepsilon }\)) may depend on \(\lambda \). \(\square \)

As we did in Sect. 3, from now on we fix \(\varepsilon >0\) and \(0< r_{\varepsilon }< 1< R_{\varepsilon } < +\infty \) as in Proposition 3.2 (or, equivalently, as in Proposition 5.3). We recall that we set \(C_{\varepsilon } :=C_{r_{\varepsilon }, R_{\varepsilon }}\) and \(Y_{\varepsilon } :=Y_{r_{\varepsilon }, R_{\varepsilon }}\).

Our goal is to prove the convergence, as \(\lambda \rightarrow +\infty \), of system (5.1) to a suitable system of agents with labels, where such labels are defined as minima of some particular functionals. In Proposition 5.7 we introduce the prototype for these functionals and present some of its properties. Before stating Proposition 5.7, we recall the definition of Fréchet differentiability on \(C_{\varepsilon }\) (see, e.g., [3, Appendix A.1]).

Definition 5.5

(Fréchet differentiability) Let us set \(E_{C_{\varepsilon }} :=\mathbb {R} (C_{\varepsilon } - C_{\varepsilon })\). A functional \(\mathcal {F}:C_{\varepsilon } \rightarrow \mathbb {R}\) is said to be Fréchet differentiable at \(\ell \in C_{\varepsilon }\) if there exists \(L \in \mathcal {L}(E_{C_{\varepsilon }}; \mathbb {R})\) such that

$$\begin{aligned} \lim _{\begin{array}{c} \tilde{\ell } \,\xrightarrow {L^p}\,\ell \\ \tilde{\ell } \in C_{\varepsilon } \end{array}} \frac{\vert \mathcal {F}(\tilde{\ell })-\mathcal {F}(\ell )- L [\tilde{\ell }-\ell ] \vert }{ \Vert \tilde{\ell }-\ell \Vert _{L^p(U,\eta )}}=0\,. \end{aligned}$$

Remark 5.6

Notice that the linear operator L in Definition 5.5 is not uniquely determined on \(E_{C_{\varepsilon }}\), while it is unique on the cone \(E_{\ell } :=\mathbb {R}_{+} ( C_{\varepsilon } - \ell )\). For this reason, we will always use the notation \(D \mathcal {F} (\ell )\) to denote the operator L.

Proposition 5.7

Let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\). For every \(\varrho > 0\), every \(\nu \in \mathcal {P}(B_{\varrho } )\), and every \(x\in B_{\varrho }\), let \(G_{\nu } (x, \cdot ) :C_\varepsilon \rightarrow \mathbb {R}\) be defined by

$$\begin{aligned} G_{\nu } (x,\ell ) :=\int _{U} \big ( F_{\nu }(x,\ell (u),u) + \varepsilon \ell (u) ( \log (\ell (u)) - 1) \big )\,\textrm{d}\eta (u) \qquad \text { for }\ell \in C_{\varepsilon }. \end{aligned}$$

Then, \(G_{\nu } (x, \cdot )\) is Fréchet differentiable if \(p \ge 1\), strongly convex if \(1\le p \le 2\) and uniformly convex if \(2<p<+\infty \). Moreover, there exists \(D_{\varrho }>0\) such that for every \(\ell _{1}, \ell _{2}\in C_\varepsilon \) and every \((x_{1}, \nu _{1}), (x_{2}, \nu _{2}) \in B_{\varrho }\times \mathcal {P}(B_{\varrho })\)

$$\begin{aligned} |G_{\nu _{1}} (x_{1}, \ell _{1} ) - G_{\nu _{2}} (x_{2}, \ell _{2}) | \le D_{\varrho } \big ( | x_{1} - x_{2}| + \Vert \ell _{1} - \ell _{2} \Vert _{L^{p}(U, \eta )} + \mathcal {W}_{1} (\nu _{1}, \nu _{2}) \big )\,. \end{aligned}$$


Proof

For \((x, \nu ) \in B_{\varrho } \times \mathcal {P} (B_{\varrho })\), the functional \(G_{\nu } (x,\cdot )\) is well-defined thanks to \((\textrm{F1})\). Furthermore, as a consequence of \((\textrm{F2})\), \(G_{\nu } (x,\cdot )\) is Fréchet differentiable at every \(\ell _1\in C_\varepsilon \), with differential given, for \(\ell _{2} \in C_{\varepsilon }\), by

$$\begin{aligned} \begin{aligned}&DG_\nu (x,\ell _1)[\ell _{2} - \ell _{1}] = \int _{U} \big ( \partial _\xi F_{\nu } ( x , \ell _1(u) , u ) (\ell _2(u) - \ell _{1}(u)) \\&\quad + \varepsilon (\ell _{2} (u) - \ell _{1}(u)) \log (\ell _{1}(u)) \big ) \,\textrm{d}\eta (u) \,. \end{aligned} \end{aligned}$$

Indeed, by \((\textrm{F2})\) we can simply estimate

$$\begin{aligned} \begin{aligned}&| G_\nu (x,\ell _2) - G_\nu (x,\ell _1) - DG_\nu (x,\ell _1) [\ell _2 - \ell _1] | \\&\quad = \bigg | \int _U \big ( F_{\nu } ( x , \ell _2(u) ,u ) - F_{\nu } ( x , \ell _1(u) , u ) - \partial _\xi F_{\nu } ( x , \ell _1(u) , u ) ( \ell _2 (u) - \ell _1 (u) ) \\&\qquad + \varepsilon \ell _{2}(u) ( \log (\ell _{2}(u)) - 1) - \varepsilon \ell _{1}(u) ( \log (\ell _{1}(u)) - 1) \\&\qquad - \varepsilon (\ell _{2} (u) - \ell _{1}(u)) \log (\ell _{1}(u)) \big ) \, \textrm{d}\eta (u) \bigg | \\&\quad \le o(1) \int _U | \ell _{1}(u) - \ell _{2}(u)| \, \textrm{d}\eta (u) \le o(1) \Vert \ell _1 - \ell _2 \Vert _{L^p(U,\eta )}\,. \end{aligned} \end{aligned}$$

By the local strong convexity of \(t \mapsto t \log t\) in \((0, +\infty )\), there exists \(\beta _{\varepsilon }>0\) such that for every \(\xi _{1}, \xi _{2} \in [r_{\varepsilon }, R_{\varepsilon }]\) and every \(t \in [0, 1]\)

$$\begin{aligned} (t \xi _{1} + (1-t ) \xi _{2}) \log ( t \xi _{1} + (1-t ) \xi _{2})&\le t \xi _{1} \log \xi _{1} + (1-t) \xi _{2} \log \xi _{2} \\&\quad - \frac{\beta _{\varepsilon } }{2} t (1-t) | \xi _{1} - \xi _{2}|^{2}. \end{aligned}$$

By convexity of \(F_{\nu } (x, \cdot , u)\) we deduce that for every \(t \in [0, 1]\) and every \(\ell _{1}, \ell _{2} \in C_{\varepsilon }\)

$$\begin{aligned} \begin{aligned}&G_\nu (x, t \ell _1 + (1-t) \ell _2 ) \le t G_{\nu } (x, \ell _{1}) + (1-t) G_{\nu } (x, \ell _{2}) \\&\quad - \frac{\beta _{\varepsilon }}{2} t (1-t) \Vert \ell _{1} - \ell _{2} \Vert ^{2}_{L^{2} (U, \eta )}\,. \end{aligned} \end{aligned}$$

If \(p \in [1, 2]\), inequality (5.6) implies the strong convexity of \(G_{\nu } (x, \cdot )\) in \(C_{\varepsilon }\) by Hölder's inequality. If \(p \in (2, +\infty )\), instead, we infer the uniform convexity of \(G_{\nu } (x, \cdot )\) by combining (5.6) with

$$\begin{aligned} \Vert \ell _1 - \ell _2\Vert _{L^p(U,\eta )}^p \le (R_\varepsilon -r_\varepsilon )^{p-2} \Vert \ell _1 - \ell _2\Vert _{L^2(U,\eta )}^2\,. \end{aligned}$$

Finally, the Lipschitz continuity (5.5) is a direct consequence of properties \(\mathrm {(F3)}\) and \(\mathrm {(F4)}\), and of the local Lipschitz continuity of \(t \mapsto t \log t\) in \((0, +\infty )\). \(\square \)
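Inequality (5.6) can be tested numerically in a finite-strategy setting. The sketch below rests on several assumptions made only for illustration: U has six strategies with uniform weights, F is the linear model \(F_{\nu }(x, \xi , u) = -\mathcal {J}_{\nu }(x,u)\,\xi \) of Remark 5.1 with hypothetical pay-off values, the bounds r and R stand in for \(r_{\varepsilon }\) and \(R_{\varepsilon }\), and the modulus is taken as \(\beta = \varepsilon / R\), the lower bound of the second derivative of \(\xi \mapsto \varepsilon \xi (\log \xi - 1)\) on \([r, R]\).

```python
import math
import random

def G(ell, payoff, weights, eps):
    """Discrete G_nu(x, l) = sum_k w_k * (-J_k l_k + eps * l_k * (log l_k - 1))."""
    return sum(w * (-j * l + eps * l * (math.log(l) - 1.0))
               for w, j, l in zip(weights, payoff, ell))

def convexity_gap(l1, l2, t, payoff, weights, eps, R):
    """Right-hand side minus left-hand side of (5.6); nonnegative if (5.6) holds."""
    beta = eps / R   # eps / xi >= eps / R for xi in [r, R]
    mix = [t * a + (1.0 - t) * b for a, b in zip(l1, l2)]
    l2_sq = sum(w * (a - b) ** 2 for w, a, b in zip(weights, l1, l2))
    rhs = (t * G(l1, payoff, weights, eps) + (1.0 - t) * G(l2, payoff, weights, eps)
           - 0.5 * beta * t * (1.0 - t) * l2_sq)
    return rhs - G(mix, payoff, weights, eps)

rng = random.Random(1)
n, eps, r, R = 6, 0.5, 0.2, 3.0            # hypothetical parameters
weights = [1.0 / n] * n
payoff = [rng.uniform(-1.0, 1.0) for _ in range(n)]
gaps = []
for _ in range(200):
    # The inequality holds pointwise, so the samples need not integrate to 1.
    l1 = [rng.uniform(r, R) for _ in range(n)]
    l2 = [rng.uniform(r, R) for _ in range(n)]
    gaps.append(convexity_gap(l1, l2, rng.random(), payoff, weights, eps, R))
min_gap = min(gaps)
print(min_gap >= -1e-12)
```

In this linear model the term involving F is affine in the label, so the whole convexity modulus comes from the entropic term, in agreement with the role of \(\varepsilon \) in the proof.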

As a consequence of Proposition 5.7 we have the following corollary.

Corollary 5.8

Let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\) and let G be defined as in (5.4). Then, for every \(\varrho >0\), every \(\nu \in \mathcal {P}(B_{\varrho })\), every \(x\in B_{\varrho }\), and every \(1\le p<+\infty \), there exists a unique solution \(\ell _{x, \nu }\) to the minimum problem

$$\begin{aligned} \min _{\ell \in C_\varepsilon } \,G_{\nu } (x , \ell )\,. \end{aligned}$$

Moreover, there exist \(\beta _{\varepsilon } >0\) and \(A_{\varepsilon , \varrho } >0\) such that for every \(x, x_{1}, x_{2} \in B_{\varrho }\), every \(\nu , \nu _{1}, \nu _{2} \in \mathcal {P}(B_{\varrho } )\), and every \(\ell \in C_\varepsilon \)

$$\begin{aligned}&G_{\nu } (x,\ell ) - G_\nu (x,\ell _{x, \nu }) \ge \beta _{\varepsilon } \Vert \ell - \ell _{x, \nu } \Vert ^2_{L^2(U,\eta )}\,, \end{aligned}$$
$$\begin{aligned}&| G_{\nu _{1}}(x_{1},\ell _{x_{1}, \nu _{1}}) - G_{\nu _{2}}(x_{2} , \ell _{x_{2}, \nu _{2}}) | \le D_{ \varrho } \big ( | x_{1} - x_{2}| + \mathcal {W}_{1} (\nu _{1}, \nu _{2})\big )\,, \end{aligned}$$
$$\begin{aligned}&\Vert \ell _{x_{1}, \nu _{1}} - \ell _{x_{2}, \nu _{2}}\Vert _{L^{p}(U, \eta )} \le A_{\varepsilon , \varrho } \big ( | x_{1} - x_{2}| + \mathcal {W}_{1} (\nu _{1}, \nu _{2})\big ) \qquad \text { if }p \in [1, 2]\,, \end{aligned}$$
$$\begin{aligned}&\Vert \ell _{x_{1}, \nu _{1}} - \ell _{x_{2}, \nu _{2}}\Vert _{L^{p}(U, \eta )} \le A_{\varepsilon , \varrho } \big ( | x_{1} - x_{2}| + \mathcal {W}_{1} (\nu _{1}, \nu _{2})\big )^{\frac{1}{p-1}} \qquad \text { if }p \in (2, +\infty )\,, \end{aligned}$$

where \(D_{\varrho }>0\) is the Lipschitz constant introduced in Proposition 5.7.


Proof

Existence and uniqueness of the solution to the minimum problem are a direct consequence of the strong and uniform convexity of \(G_\nu ( x,\cdot )\) and of the convexity of \(C_\varepsilon \). Then, by the minimality of \(\ell _{x, \nu }\) and by the local strong convexity of \(t \mapsto t \log t\), there exists \(\beta _{\varepsilon }>0\) such that for every \(\ell \in C_{\varepsilon }\)

$$\begin{aligned} \begin{aligned}&G_\nu (x, \ell ) - G_\nu (x,\ell _{x, \nu }) \ge \underbrace{DG_\nu (x, \ell _{x, \nu })[ \ell - \ell _{x, \nu }]}_{\ge 0} \\&\quad + \beta _{\varepsilon } \Vert \ell - \ell _{x, \nu } \Vert ^2_{L^2(U,\eta )} \ge \beta _{\varepsilon }\Vert \ell - \ell _{x, \nu }\Vert ^2_{L^2(U,\eta )}\,, \end{aligned} \end{aligned}$$

which proves (5.9).

Let us now fix \(x_1, x_2\in B_{\varrho }\), \(\nu _{1}, \nu _{2} \in \mathcal {P}_1(B_{\varrho })\), and let \(\ell _i \in C_{\varepsilon }\) be the solutions to

$$\begin{aligned} \min _{\ell \in C_\varepsilon } G_{\nu _{i} }(x_i, \ell ) \qquad \text { for }i = 1, 2. \end{aligned}$$

Without loss of generality, we may assume that \(G_{\nu _{2}}(x_2,\ell _2) \ge G_{\nu _{1} }(x_1,\ell _1)\). Using the minimality of \(\ell _2\) and applying Proposition 5.7 we get that

$$\begin{aligned} \begin{aligned} | G_{\nu _{2}} (x_2,\ell _2) - G_{\nu _{1}}(x_1,\ell _1) |&= G_{\nu _{2}}(x_2,\ell _2 ) - G_{\nu _{2}}(x_2,\ell _1) + G_{\nu _{2}} (x_2,\ell _1) - G_{\nu _{1}}(x_1,\ell _1) \\&\le G_{\nu _{2}}(x_2,\ell _1) - G_{\nu _{1}}(x_1,\ell _1) \\&\le D_\varrho ( |x_2-x_1| + \mathcal {W}_1(\nu _{1} , \nu _{2}))\,, \end{aligned} \end{aligned}$$

which yields (5.10).

Since \(G_\nu (x,\cdot )\) is strongly convex in \(C_{\varepsilon }\) for \(p=2\), we have that there exists \(\gamma _{\varepsilon }>0\) such that

$$\begin{aligned} \big (DG_{\nu _{2}}(x_2,\ell _{2}) - DG_{\nu _{2}}(x_2,\ell _1) \big ) [\ell _2-\ell _1] \ge \gamma _{\varepsilon } \Vert \ell _2 - \ell _1 \Vert ^2_{L^2(U,\eta )}\,. \end{aligned}$$

By minimality, we have that

$$\begin{aligned} DG_{\nu _{2}}(x_2,\ell _2) [\ell _2 - \ell _1] \le 0 \le DG_{\nu _{1}}(x_1,\ell _1)[\ell _2 - \ell _1]\,. \end{aligned}$$

Therefore, property \(\mathrm {(F4)}\) yields

$$\begin{aligned} \gamma _{\varepsilon } \Vert \ell _2 - \ell _1 \Vert ^2_{L^2(U,\eta )}&\le \big ( DG_{\nu _{1}}(x_1,\ell _1) - DG_{\nu _{2}} (x_2,\ell _1)\big ) [\ell _2 - \ell _1] \\&= \int _U \big ( \partial _\xi F_{\nu _{1}} ( x_1 , \ell _1(u) , u) - \partial _\xi F_{\nu _{2}} ( x_2 , \ell _1 (u) , u ) \big ) (\ell _2(u) - \ell _1 (u)) \,\textrm{d}\eta (u) \\&\le \Gamma _\varrho \big ( | x_2 - x_1 | + \mathcal {W}_1( \nu _{1}, \nu _{2}) \big ) \int _U |\ell _2 (u) - \ell _1(u) | \, \textrm{d}\eta (u) \\&\le \Gamma _\varrho \big ( | x_2 - x_1 | + \mathcal {W}_1(\nu _{1},\nu _{2}) \big ) \Vert \ell _2 - \ell _1\Vert _{L^2(U,\eta )}\,. \end{aligned}$$

If \(p \in [1, 2]\), (5.13) and Hölder's inequality imply the Lipschitz continuity of \((x, \nu ) \mapsto \ell _{x, \nu }\) in \(B_{\varrho } \times \mathcal {P}(B_{\varrho })\). If \(p \in (2, +\infty )\), arguing as in (5.7) and using once again Hölder's inequality we deduce from (5.13) that

$$\begin{aligned} \Vert \ell _2 - \ell _1\Vert ^{p}_{L^p(U,\eta )} \le \Big (\frac{\Gamma _\varrho }{\gamma _{\varepsilon }}\Big )^{2} \,(R_\varepsilon - r_\varepsilon )^{p-2} \big ( | x_2 - x_1 | + \mathcal {W}_1(\nu _{1} , \nu _{2}) \big )^{2}\,. \end{aligned}$$


Setting

$$\begin{aligned} A_{\varepsilon , \varrho } :=\max \, \bigg \{ \frac{\Gamma _{\varrho }}{\gamma _{\varepsilon }}, \bigg ( \frac{\Gamma _{\varrho }}{\gamma _{\varepsilon }} \bigg )^{\frac{2}{p}} (R_{\varepsilon } - r_{\varepsilon })^{\frac{p-2}{p}} \bigg \} \end{aligned}$$

we get (5.11) and (5.12). \(\square \)
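In the linear model of Remark 5.1 the minimiser can be written down explicitly in a simplified setting, which gives a concrete picture of \(\ell _{x, \nu }\). The sketch below assumes a finite strategy set with weights \(w_k\), labels that are densities with unit \(\eta \)-integral, and, as a loud simplification, it ignores the bounds \(r_{\varepsilon } \le \ell \le R_{\varepsilon }\) built into \(C_{\varepsilon }\). Under these assumptions the first-order condition \(-J + \varepsilon \log \ell = \text {const}\) yields a Gibbs density, and the code checks its minimality against random competitors; all numerical values are hypothetical.

```python
import math
import random

def G(ell, payoff, weights, eps):
    """Discrete G_nu(x, l) = sum_k w_k * (-J_k l_k + eps * l_k * (log l_k - 1))."""
    return sum(w * (-j * l + eps * l * (math.log(l) - 1.0))
               for w, j, l in zip(weights, payoff, ell))

def gibbs_density(payoff, weights, eps):
    """Candidate minimiser l_k = exp(J_k / eps) / sum_j w_j exp(J_j / eps)."""
    z = sum(w * math.exp(j / eps) for w, j in zip(weights, payoff))
    return [math.exp(j / eps) / z for j in payoff]

def random_density(weights, rng):
    """Random positive density with unit integral with respect to the weights."""
    raw = [rng.uniform(0.1, 2.0) for _ in weights]
    mass = sum(w * v for w, v in zip(weights, raw))
    return [v / mass for v in raw]

rng = random.Random(2)
weights = [0.25] * 4
payoff = [1.0, 0.5, -0.2, 0.0]   # hypothetical values of J_nu(x, u_k)
eps = 0.5
ell_star = gibbs_density(payoff, weights, eps)
g_star = G(ell_star, payoff, weights, eps)
excess = [G(random_density(weights, rng), payoff, weights, eps) - g_star
          for _ in range(500)]
print(min(excess) >= -1e-12)
```

When the box constraints of \(C_{\varepsilon }\) are active the minimiser need not have this closed form, so the Gibbs density should be read only as the unconstrained counterpart of \(\ell _{x, \nu }\).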

As an intermediate step towards the main result of this section we have the following lemma, where we estimate the behavior, as \(\lambda \rightarrow +\infty \), of the labels \(\ell ^{i}_{t}\) in system (5.1). For later use, we introduce here the map \(\Delta :\mathbb {R}^{d} \times \mathcal {P}_{1}(\mathbb {R}^{d}) \rightarrow C_{\varepsilon }\) defined as

$$\begin{aligned} \Delta (x, \nu ) :=\textrm{argmin}_{\ell \in C_{\varepsilon }} \, G_{\nu } (x, \ell ). \end{aligned}$$

In particular, by Proposition 5.7 the map \(\Delta \) is Lipschitz continuous on \(B_{\varrho } \times \mathcal {P}(B_{\varrho })\) for every \(\varrho >0\).
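The relaxation of the labels towards \(\Delta (x, \nu )\) for large \(\lambda \) can be visualised in a frozen, finite-strategy toy model. Everything in the following sketch is an assumption made for illustration: the position x and the measure \(\nu \) are kept fixed, F is the linear model of Remark 5.1 with hypothetical pay-offs, the bounds defining \(C_{\varepsilon }\) are ignored, and the entropic term \(\varepsilon \mathcal {H}\), whose precise form is fixed earlier in the paper and not restated here, is assumed to combine with \(\mathcal {T}\) so that the label equation reads \(\dot{\ell } = \lambda \big ( \int _U g\, \ell \,\textrm{d}\eta - g \big ) \ell \) with \(g = \partial _\xi F + \varepsilon \log \ell \).

```python
import math

def gibbs_density(payoff, weights, eps):
    """Unconstrained minimiser of the discrete G in the linear model (assumption)."""
    z = sum(w * math.exp(j / eps) for w, j in zip(weights, payoff))
    return [math.exp(j / eps) / z for j in payoff]

def relax_labels(lam, t_end, payoff, weights, eps, ell0, dtau=1e-3):
    """Explicit Euler for l' = lam * (mean(g) - g) * l, g = -J + eps * log(l),
    with step dtau in the fast time tau = lam * t."""
    ell = list(ell0)
    steps = int(round(lam * t_end / dtau))
    for _ in range(steps):
        g = [-j + eps * math.log(l) for j, l in zip(payoff, ell)]
        g_mean = sum(w * gk * l for w, gk, l in zip(weights, g, ell))
        ell = [l + dtau * (g_mean - gk) * l for gk, l in zip(g, ell)]
    return ell

def l2_dist(a, b, weights):
    """Discrete L^2(U, eta) distance between two labels."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

weights = [0.25] * 4
payoff = [1.0, 0.5, -0.2, 0.0]   # hypothetical values of J_nu(x, u_k)
eps, t_end = 0.5, 0.2
target = gibbs_density(payoff, weights, eps)
ell0 = [1.0, 1.0, 1.0, 1.0]      # uniform initial label
dist = {lam: l2_dist(relax_labels(lam, t_end, payoff, weights, eps, ell0),
                     target, weights)
        for lam in (5.0, 50.0)}
print(dist)
```

In this frozen model the distance from the target at a fixed time t decreases as \(\lambda \) grows, which is the qualitative behaviour quantified in the next lemma; the coupled system, where x and \(\nu \) move, is not covered by the sketch.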

Lemma 5.9

Let \(v_{\Psi }\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\), let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\), let the operator \(\mathcal {T}_{\Psi }\) be defined as in (5.2), and let G be as in (5.4). For \( \lambda \in (0, +\infty )\), \(N \in \mathbb {N}\), \(\delta >0\), and \(\bar{\varvec{y}} = (\bar{y}^{1}, \ldots , \bar{y}^{N}) \in (B_{\delta }^{Y_{\varepsilon }})^{N}\), let \(\{y^{i}_{\lambda }\}_{i=1}^{N}\) denote the solutions to the Cauchy problem (5.1) with initial conditions \(\bar{y}^{i}\) and corresponding empirical measure \(\Lambda ^{N}_{\lambda , t} :=\frac{1}{N} \sum _{i=1}^{N} \delta _{y^{i}_{\lambda , t}}\), let \(\bar{\Lambda }^{N} :=\frac{1}{N} \sum _{i=1}^{N} \delta _{\bar{y}^{i}}\), and let \(\mu ^{N}_{\lambda , t} :=\pi _{\#} \Lambda ^{N}_{\lambda , t}\) and \(\bar{\mu }^{N} :=\pi _{\#} \bar{\Lambda }^{N}\). Then, the following facts hold:


(i) there exists \(\varrho >0\) (depending only on \(\delta \) and \(\varepsilon \)) such that \(\Lambda ^{N}_{\lambda , t} \in \mathcal {P}(B^{Y_{\varepsilon }}_{\varrho })\) for every \(t \in [0, T]\);


(ii) there exist two positive constants \(\omega _{\varepsilon , \delta }\) and \(\gamma _{\varepsilon }\) (independent of \(\lambda \)) such that for every \(p \in [1, 2]\) and every \(t\in (0,T]\)

$$\begin{aligned} \Vert \ell _{\lambda , t}^{i } - \Delta ( x^{i}_{\lambda , t}, \mu ^{N}_{\lambda , t}) \Vert _{L^{p}(U, \eta )} \le \omega _{\varepsilon , \delta } \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) \,, \end{aligned}$$

while for \(p \in (2, +\infty )\) it holds

$$\begin{aligned} \frac{ \Vert \ell _{\lambda , t}^{i } - \Delta ( x^{i}_{\lambda , t}, \mu ^{N}_{\lambda , t}) \Vert ^{p}_{L^{p}(U, \eta )}}{(R_{\varepsilon } - r_{\varepsilon })^{p-2}} \le 2 \omega _{\varepsilon , \delta }^{2} \bigg ( \frac{1}{\lambda } + e^{-2 \lambda \gamma _{\varepsilon } T} \bigg )\,. \end{aligned}$$


The proof consists of two steps. In the first step, we obtain some useful estimates and properties of system (5.1), which we then use in the second step to prove (5.14). Along the proof, we drop the index \(\lambda \), as we always argue for a fixed parameter \(\lambda \in (0, +\infty )\).

Step 1. We first show that the players' locations \(x^{i}_{t}\) are bounded in \(\mathbb {R}^{d}\) independently of \(\lambda \), N, and t. Indeed, using \(\mathrm {(v3)}\) and recalling that \(m_1(\Lambda _{t}^{N})\le \max _{i=1, \ldots , N} \Vert y^{i}_{t} \Vert _{\overline{Y}}\) and that \( \ell ^{i}_{t} \in C_{\varepsilon }\), we have that for every \(i=1, \ldots , N\)

$$\begin{aligned} | x_{t}^{i} |&\le | \bar{x}^{i} | + \int _0^T | v_{\Lambda _{s}^{N}} ( x_{s}^{i} , \ell _{s}^{i})| \, \textrm{d}s \le | \bar{x}^{i} | + \int _0^T M_v ( 1 + | x_s^{i} | + \Vert \ell _s^{i} \Vert _{L^{p}(U, \eta )}\nonumber \\&\quad + m_1(\Lambda _s^{N}) ) \, \textrm{d}s \nonumber \\&\le | \bar{x}^{i} | + M_v (1 + R_\varepsilon ) T + \int _0^T M_v ( | x_s^{i} | + \max _{j=1, \ldots , N} \Vert y_s^{j} \Vert _{\overline{Y}}) \, \textrm{d}s \nonumber \\&\le \delta + M_v (1 + 2R_\varepsilon ) T + \int _0^T 2M_v \max _{j=1, \ldots , N} | x_s^{j} | \,\textrm{d}s\,. \end{aligned}$$

Taking the maximum over \(i=1, \ldots , N\) on the left-hand side of (5.16), by Grönwall inequality we get

$$\begin{aligned} \max _{i=1, \ldots , N} |x_t^{i} |\le \Big ( \delta + M_v (1 + 2R_\varepsilon ) T \Big ) e^{2M_v T} =:R_{\delta , \varepsilon }\,. \end{aligned}$$

As a consequence of (5.17), setting \(\varrho :=R_{\delta , \varepsilon } + R_{\varepsilon }\) we have that \((x^{i}_{t}, \Lambda ^{N}_{t}) \in B_{\varrho } \times \mathcal {P}(B^{Y_{\varepsilon }}_{\varrho })\) for every N, every i, and every \(t \in [0, T]\). In particular, this proves (i). Moreover, by \(\mathrm {(v3)}\) and (5.17) the map \(t\mapsto x_t^{i}\) is Lipschitz continuous for every \(i=1, \ldots , N\), with Lipschitz constant only depending on \(\varrho \) and on \(M_{v}\). Indeed, for every \(t_{1} < t_{2} \in [0, T]\) and every i we have that

$$\begin{aligned} \begin{aligned} | x_{t_2}^{i } - x_{t_1}^{i} |&\le \int _{t_1}^{t_2} | v_{\Lambda _s^{N}}(x_s^{i} , \ell _s^{i})| \,\textrm{d}s \le M_v (1 + 2\varrho ) | t_2 - t_1 | =:A_{\varrho } | t_2 - t_1 |\,. \end{aligned} \end{aligned}$$

Therefore, also the map \(t\mapsto \mu ^{N}_{t}\) is Lipschitz continuous, with Lipschitz constant \(A_{\varrho }\). Up to a re-definition of \(A_{\varrho }\), by \(\mathrm {(F3})\) and the properties of \({\mathcal {H}}\), we may as well assume that \(\ell ^{i}_{t}\) is Lipschitz continuous in [0, T], with Lipschitz constant \(A_{\varrho }\).

Step 2. We now proceed with the proof of (5.14). Using the convexity of \(G_{\mu _{t}^{N}}(x_t^{i},\cdot )\) and the fact that \(\ell ^{i}_{t}, \Delta ( x^{i}_{t}, \mu ^{N}_{t}) \in C_{\varepsilon }\), we have that

$$\begin{aligned} \begin{aligned}&G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} , \Delta ( x_t^{i}, \mu ^{N}_{t}) ) \le DG_{\mu _{t}^{N}} (x_t^{i} , \ell _t^{i}) [ \ell _t^{i} - \Delta ( x^{i}_{t}, \mu ^{N}_{t}) ]\\&\quad = \int _U \big ( \partial _\xi F_{\mu ^{N}_{t} } (x_t^{i} , \ell _t^{i} (u) , u ) + \varepsilon \log ( \ell ^{i}_{t} (u) ) \big ) (\ell _t^{i} (u) - \Delta ( x^{i}_{t}, \mu ^{N}_{t}) (u) ) \, \textrm{d}\eta (u)\\&\quad = \int _U \bigg ( \partial _\xi F_{\mu ^{N}_{t} } ( x_t^{i} , \ell _t^{i} (u) , u ) - \int _U \partial _\xi F_{\mu _t^{N} } ( x_t^{i} , \ell _t^{i} ( u' ) , u' ) \ell _t^{i} (u') \, \textrm{d}\eta (u') \bigg )\\&\quad (\ell _t^{i} (u) - \Delta ( x^{i}_{t}, \mu ^{N}_{t}) (u) ) \, \textrm{d}\eta (u) \\&\qquad + \int _{U} \varepsilon \big ( \log ( \ell ^{i}_{t} (u) ) - I(\ell ^{i}_{t} ) \big ) (\ell _t^{i} (u) -\Delta ( x^{i}_{t}, \mu ^{N}_{t}) (u) ) \, \textrm{d}\eta (u) \\&\quad \le \bigg \Vert \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} , \cdot ) + \varepsilon \log ( \ell ^{i}_{t} ) \\&\qquad - \int _U \partial _{\xi } F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} (u') , u' ) \ell _t^{i}(u') \, \textrm{d}\eta (u') - \varepsilon I(\ell ^{i}_{t}) \bigg \Vert _{L^2(U,\eta )}\\&\qquad \times \Vert \ell _t^{i} -\Delta ( x^{i}_{t}, \mu ^{N}_{t}) \Vert _{L^2(U,\eta )}\,. \end{aligned} \end{aligned}$$

The above chain of inequalities, together with (5.9), leads us to

$$\begin{aligned} \begin{aligned}&\beta _{\varepsilon } \big ( G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t}) ) \big ) \\&\quad \le \bigg \Vert \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} , \cdot ) + \varepsilon \log ( \ell ^{i}_{t} ) - \int _U \partial _{\xi } F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} (u') , u' ) \ell _t^{i}(u') \, \textrm{d}\eta (u') - \varepsilon I(\ell ^{i}_{t}) \bigg \Vert _{L^2(U,\eta )}^2\,. \end{aligned} \end{aligned}$$

By Proposition 5.7, by (5.18), and by the bound \(y^{i}_{t} =(x^{i}_{t}, \ell ^{i}_{t}) \in B_{\varrho }^{Y_{\varepsilon }}\) for \(i = 1, \ldots , N\), for every \(t < s \in (0, T)\) we may estimate

$$\begin{aligned}&\big ( G_{\mu _{s}^{N}} ( x_{s}^{i} , \ell _{s}^{i} ) - G_{\mu _{s}^{N}} ( x_{s}^{i} , \Delta ( x^{i}_{s}, \mu ^{N}_{s}) ) \big ) - \big ( G_{\mu _{t}^{N}} ( x_{t}^{i} , \ell _{t}^{i} ) - G_{\mu _{t}^{N}} ( x_{t}^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big ) \nonumber \\&\quad = \big ( G_{\mu _{s}^{N}} ( x_{s}^{i} , \ell _{s}^{i} )- G_{\mu _{t}^{N}} ( x_{s}^{i} , \ell _{s}^{i} ) \big ) + \big ( G_{\mu _{t}^{N}} ( x_{s}^{i} , \ell _{s}^{i} ) - G_{\mu _{t}^{N}} ( x_{t}^{i} , \ell _{t}^{i} )\big ) \nonumber \\&\qquad - \big ( G_{\mu _{s}^{N}} ( x_{s}^{i} , \Delta ( x^{i}_{s}, \mu ^{N}_{s}) ) - G_{\mu _{t}^{N}} ( x_{t}^{i} ,\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big ) \nonumber \\&\quad \le D_{\varrho } \mathcal {W}_{1} (\mu ^{N}_{t} , \mu ^{N}_{s}) + \big ( G_{\mu _{t}^{N}} ( x_{s}^{i} , \ell _{s}^{i} ) - G_{\mu _{t}^{N}} ( x_{t}^{i} , \ell _{t}^{i} )\big ) \nonumber \\&\qquad - \big ( G_{\mu _{s}^{N}} ( x_{s}^{i} , \Delta ( x^{i}_{s}, \mu ^{N}_{s}) ) - G_{\mu _{t}^{N}} ( x_{t}^{i} ,\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big ) \nonumber \\&\quad \le D_{\varrho } A_{\varrho } ( s - t) + \big ( G_{\mu _{t}^{N}} ( x_{s}^{i} , \ell _{s}^{i} ) - G_{\mu _{t}^{N}} ( x_{t}^{i} , \ell _{t}^{i} )\big ) \nonumber \\&\qquad - \big ( G_{\mu _{s}^{N}} ( x_{s}^{i} , \Delta ( x^{i}_{s}, \mu ^{N}_{s}) ) - G_{\mu _{t}^{N}} ( x_{t}^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big ) \,. \end{aligned}$$

Since also the map \(t \mapsto G_{\mu ^{N}_{t}} ( x_t^{i}, \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i}, \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) )\) is Lipschitz continuous (see Proposition 5.7 and Corollary 5.8), and thus differentiable a.e. in [0, T], dividing (5.20) by \(s-t\) and passing to the limit as \(s \searrow t\) we get by chain rule that for a.e. \(t \in [0, T]\)

$$\begin{aligned}&\frac{\textrm{d}}{\textrm{d}t} \big ( G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big ) \nonumber \\&\quad \le D_{\varrho }A_{\varrho } + \underbrace{ \int _{U} \big (\partial _\xi F_{\mu _{t} ^{N}}(x_t^{i} , \ell _t^{i} (u) , u ) + \varepsilon \log (\ell ^{i}_{t}(u)) \big ) \dot{\ell }_t^{i} (u) \, \textrm{d}\eta (u)}_{\text {I}} \nonumber \\&\qquad + \underbrace{\partial _x G_{\mu _{t}^{N}}( x_t^{i} , \ell _t^{i} )\cdot \dot{x}_t^{i}}_{\text {II}} \underbrace{-\frac{\textrm{d}}{\textrm{d}t}G_{\mu _{t}^{N}} (x_t^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) )}_{\text {III}} \end{aligned}$$

We now show that the terms \({\text {II}}\) and \({\text {III}}\) are well-defined and uniformly bounded with respect to \(\lambda \in (0, +\infty )\). Let us start with \(\textrm{II}\). By \(\mathrm {(F4)}\), \(\mathrm {(F5)}\), and \(\mathrm {(v3)}\), and by the fact that \((x^{i}_{t}, \Lambda ^{N}_{t}) \in B_{\varrho } \times \mathcal {P}(B^{Y_{\varepsilon }}_{\varrho })\), we get that

$$\begin{aligned} {\text {II}}&= \partial _x G_{\mu _{t}^{N}}(x_t^{i} , \ell _t^{i} ) \cdot \dot{x}_t^{i} = \int _U \partial _x F_{\mu _{t}^{N}}(x_t^{i} , \ell _t^{i} (u), u ) \cdot v_{\Lambda _t^{N}} ( x_t^{i} , \ell _t^{i} ) \,\textrm{d}\eta (u) \nonumber \\&\le \int _U \big | \partial _x F_{\mu _{t}^{N}} (x_t^{i} , \ell _t^{i} (u) , u ) \big | \, \big | v_{\Lambda _t^{N}} ( x_t^{i} , \ell _t^{i} ) \big | \,\textrm{d}\eta (u) \nonumber \\&\le \int _{U} \Gamma _{\varrho } M_v( 1 + \Vert y_t^{i} \Vert _{\overline{Y}} + m_1(\Lambda _t^{N}) ) \, \textrm{d}\eta (u) \nonumber \\&\le \Gamma _{\varrho } M_v ( 1 + 2\varrho ) = \Gamma _{\varrho } A_{\varrho } \,. \end{aligned}$$

As for \(\textrm{III}\), by (5.10) of Corollary 5.8 and by (5.18), we have that for a.e. \(t \in [0, T]\)

$$\begin{aligned} {\text {III}} \le \left| \frac{\textrm{d}}{\textrm{d}t}G_{\mu _{t}^{N}}(x_t^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \right| \le 2 D_{\varrho }A_{\varrho }\,. \end{aligned}$$

We now estimate \({\text {I}}\) from (5.21). Using (5.1), (5.2), and (5.19), and recalling that \(\ell ^{i}_{t} \in C_{\varepsilon }\), we obtain that

$$\begin{aligned} \begin{aligned} {\text {I}}&= \int _U \big ( \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} (u), u ) + \varepsilon \log (\ell ^{i}_{t} (u)) \big ) \dot{\ell }_t^{i} (u) \, \textrm{d}\eta (u) \\&= \int _U \bigg ( \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} (u), u ) - \!\! \int _U \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} ( u' ) , u' ) \ell _t^{i}(u') \, \textrm{d}\eta ( u' ) \\&\quad + \varepsilon \big ( \log (\ell ^{i}_{t} (u)) - I(\ell ^{i}_{t}) \big ) \bigg )\,\dot{\ell }_t^{i}(u) \textrm{d}\eta (u) \\&= -\lambda \int _U \bigg ( \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} (u) , u) - \int _U \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i}(u') , u') \ell _t^{i} (u') \, \textrm{d}\eta (u') \\&\quad + \varepsilon \big ( \log (\ell ^{i}_{t} (u) ) - I(\ell ^{i}_{t}) \big ) \bigg )^2 \ell _t^{i}(u)\,\textrm{d}\eta (u) \\&\le -\lambda r_\varepsilon \int _U \bigg ( \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i} (u), u ) - \int _U \partial _\xi F_{\mu _t^{N}} ( x_t^{i} , \ell _t^{i}(u'),u') \ell _t^{i}(u') \, \textrm{d}\eta (u') \\&\quad + \varepsilon \big ( \log (\ell ^{i}_{t} (u)) - I(\ell ^{i}_{t}) \big ) \bigg )^2 \, \textrm{d}\eta (u) \le - \lambda r_\varepsilon \beta _{\varepsilon } \big ( G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} ,\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big )\,. \end{aligned} \end{aligned}$$

Combining (5.21)–(5.23) and setting \(K_{\varrho } :=(\Gamma _{\varrho } + 3D_{\varrho })A_{\varrho }\), we deduce that for a.e. \(t \in [0, T]\)

$$\begin{aligned} \begin{aligned}&\frac{\textrm{d}}{\textrm{d}t} \big (G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i}) - G_{\mu _{t}^{N}} ( x_t^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big ) \le \\&\quad - \lambda r_\varepsilon \beta _{\varepsilon } \big ( G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i}) - G_{\mu _{t}^{N}} ( x_t^{i} ,\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) \big ) + K_{\varrho } \,, \end{aligned} \end{aligned}$$

or equivalently

$$\begin{aligned} \begin{aligned}&\frac{\textrm{d}}{\textrm{d}t } \bigg ( G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) - \frac{K_{\varrho } }{\lambda \,\beta _{\varepsilon } r_\varepsilon } \bigg ) \\&\quad \le - \lambda r_\varepsilon \beta _{\varepsilon } \bigg ( G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} ,\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) - \frac{K_{\varrho } }{\lambda \,\beta _{\varepsilon } r_\varepsilon } \bigg ) . \end{aligned} \end{aligned}$$
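The equivalence of the last two inequalities is elementary. As a quick check, write \(\varphi (t)\) for the difference \(G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} , \Delta ( x^{i}_{t}, \mu ^{N}_{t} ) )\) and \(c :=\lambda r_{\varepsilon } \beta _{\varepsilon }\) (both symbols are shorthand introduced only for this remark). Since \(K_{\varrho }/c\) does not depend on t, for a.e. \(t \in [0, T]\) we have

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t} \Big ( \varphi (t) - \frac{K_{\varrho }}{c} \Big ) = \dot{\varphi }(t) \le - c \, \varphi (t) + K_{\varrho } = - c \, \Big ( \varphi (t) - \frac{K_{\varrho }}{c} \Big ), \end{aligned}$$

so that the shifted quantity \(\varphi - K_{\varrho }/c\) satisfies a homogeneous linear differential inequality, which is the form to which Grönwall's lemma applies.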

Therefore, by Grönwall’s lemma we deduce that for every \(t \in [0, T]\)

$$\begin{aligned} \begin{aligned}&G_{\mu _{t}^{N}} ( x_t^{i} , \ell _t^{i} ) - G_{\mu _{t}^{N}} ( x_t^{i} ,\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) ) - \frac{K_{\varrho }}{\lambda \,\beta _{\varepsilon } r_\varepsilon } \\&\quad \le \bigg ( G_{\bar{\mu }^{N}} ( \bar{x}^{i} , \bar{\ell }^{i} ) - G_{\bar{\mu }^{N} } ( \bar{x}^{i} , \Delta (\bar{x}^{i}, \bar{\mu }^{N} ) ) - \frac{K_{\varrho }}{\lambda \beta _{\varepsilon } r_{\varepsilon }} \bigg ) e^{-\lambda r_\varepsilon \beta _{\varepsilon }T} . \end{aligned} \end{aligned}$$

Using (5.9), (5.10), and the fact that \(\bar{y}^{i} \in B_{\delta }^{Y_{\varepsilon }}\) and \(\bar{\mu }^{N} \in \mathcal {P}(B_{\delta })\), we further obtain

$$\begin{aligned} \begin{aligned}&\beta _{\varepsilon } \Vert \ell _t^{i} -\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) \Vert _{L^2(U,\eta )}^2\\&\quad \le \frac{K_{\varrho } }{ \lambda \beta _{\varepsilon } r_\varepsilon } + \bigg (G_{\bar{\mu }^{N}} (\bar{x}^{i} , \bar{\ell }^{i} ) - G_{\bar{\mu }^{N}} ( \bar{x}^{i} , \Delta ( \bar{x}^{i}, \bar{\mu }^{N} ) ) - \frac{K_{\varrho }}{\lambda \beta _{\varepsilon } r_\varepsilon } \bigg ) e^{-\lambda r_\varepsilon \beta _{\varepsilon } T} \\&\quad \le \frac{K_{\varrho } }{ \lambda \beta _{\varepsilon } r_\varepsilon } + \bigg ( D_{\delta } \Vert \bar{\ell }^{i} - \Delta (\bar{x}^{i}, \bar{\mu }^{N})\Vert _{L^{2}(U, \eta )} - \frac{K_{\varrho }}{\lambda \beta _{\varepsilon } r_\varepsilon } \bigg ) e^{-\lambda r_\varepsilon \beta _{\varepsilon } T} \\&\quad \le \frac{K_{\varrho } }{ \lambda \beta _{\varepsilon } r_\varepsilon } + 2 D_{\delta } R_{\varepsilon } e^{-\lambda r_\varepsilon \beta _{\varepsilon } T} \,. \end{aligned} \end{aligned}$$

Recalling that \(\varrho \) only depends on \(\delta \) and \(\varepsilon \), setting

$$\begin{aligned} \omega _{\varepsilon , \delta } :=\max \, \bigg \{ \sqrt{\frac{K_{\varrho } }{ \beta _{\varepsilon }^{2} r_\varepsilon }}; \sqrt{\frac{2D_{\delta } R_{\varepsilon }}{\beta _{\varepsilon }}} \bigg \}, \qquad \gamma _{\varepsilon } :=\frac{r_{\varepsilon } \beta _{\varepsilon }}{2}, \end{aligned}$$

we infer (5.14) for \(p = 2\), and thus for every \(p \in [1, 2]\) by Hölder inequality, with \(\omega _{\varepsilon , \delta }^{p}:=\omega _{\varepsilon , \delta }\) and \(\gamma _{\varepsilon }^{p} :=\gamma _{\varepsilon }\). For \(p \in (2, +\infty )\) we recall that

$$\begin{aligned} \Vert \ell ^{i}_{t} - \Delta (x^{i}_{t}, \mu ^{N}_{t} ) \Vert ^{p}_{L^{p}(U, \eta )} \le (R_{\varepsilon } - r_{\varepsilon })^{p-2} \Vert \ell _t^{i} - \Delta ( x^{i}_{t}, \mu ^{N}_{t}) \Vert _{L^2(U,\eta )}^2, \end{aligned}$$

which implies (5.15). \(\square \)
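For the reader's convenience, we sketch how the constants \(\omega _{\varepsilon , \delta }\) and \(\gamma _{\varepsilon }\) arise from the chain of inequalities preceding their definition. The computation uses only the elementary inequality \(\sqrt{a + b} \le \sqrt{a} + \sqrt{b}\) for \(a, b \ge 0\) and the identity \(r_{\varepsilon } \beta _{\varepsilon } = 2 \gamma _{\varepsilon }\). Dividing by \(\beta _{\varepsilon }\) and taking square roots, we get

$$\begin{aligned} \Vert \ell _t^{i} -\Delta ( x^{i}_{t}, \mu ^{N}_{t} ) \Vert _{L^2(U,\eta )}&\le \sqrt{ \frac{K_{\varrho } }{ \lambda \beta _{\varepsilon }^{2} r_\varepsilon } + \frac{2 D_{\delta } R_{\varepsilon }}{\beta _{\varepsilon }} \, e^{- 2 \lambda \gamma _{\varepsilon } T} } \\&\le \sqrt{\frac{K_{\varrho } }{ \beta _{\varepsilon }^{2} r_\varepsilon }} \, \frac{1}{\sqrt{\lambda }} + \sqrt{\frac{2D_{\delta } R_{\varepsilon }}{\beta _{\varepsilon }}} \, e^{- \lambda \gamma _{\varepsilon } T} \le \omega _{\varepsilon , \delta } \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) \,. \end{aligned}$$

Squaring this bound and using \((a + b)^{2} \le 2 (a^{2} + b^{2})\) accounts for the factor \(2 \omega _{\varepsilon , \delta }^{2}\) in the estimate for \(p \in (2, +\infty )\).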

To simplify the notation, we define \(w_{\nu }(x) :=v_{(id, \Delta )_{\#} \nu } (x, \Delta (x, \nu ))\) for \(x \in \mathbb {R}^{d}\) and \(\nu \in \mathcal {P}_{1} (\mathbb {R}^{d})\). We now discuss the convergence of solutions to (5.1) to solutions to the fast reaction system

$$\begin{aligned} \left\{ \begin{array}{lll} \dot{{x}}_t^{i} = w_{\mu ^{N}_{t}} ({x}_t^{i} )\,,\\ {x}_0^{i}=\bar{x}^{i}\,, \end{array} \right. \qquad \text {for }i=1,\dots ,N,\,\,t\in (0,T]\,, \end{aligned}$$

where we have set \(\mu ^{N}_{t} :=\frac{1}{N} \sum _{i=1}^{N} \delta _{x^{i}_{t}}\). We start with the basic properties of \(w_{\mu }\) and the well-posedness of (5.24).

Lemma 5.10

The following facts hold:


(i) for every \(\varrho >0\) there exists \(\Xi _{\varrho }>0\) such that for every \(\nu _{1}, \nu _{2} \in \mathcal {P}( B_{\varrho })\) and every \(x_{1}, x_{2} \in B_{\varrho }\)

$$\begin{aligned} | w_{\nu _{1}} (x_{1}) - w_{\nu _{2}} (x_{2}) |&\le \Xi _{\varrho } \big ( |x_{1}- x_{2}| + \mathcal {W}_{1}(\nu _{1}, \nu _{2})\big )\,; \end{aligned}$$

(ii) there exists \(M_{w}>0\) such that the velocity field \(w_{\nu }\) satisfies, for every \(\nu \in \mathcal {P}_{1}(\mathbb {R}^{d})\) and every \(x \in \mathbb {R}^{d}\),

$$\begin{aligned} | w_{\nu } (x) | \le M_{w} \big ( 1 + | x| + m_{1}(\nu )\big )\,. \end{aligned}$$


Item (i) follows from \(\mathrm {(v1)}\) and \(\mathrm {(v2)}\) and Corollary 5.8. Using that \(\Delta (x, \nu ) \in C_{\varepsilon }\), we have that \(m_{1} ( ( id, \Delta )_{\#} \nu ) \le (1 + R_{\varepsilon }) m_{1}(\nu )\). Thus, we deduce (ii). \(\square \)
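To make the last step explicit, one admissible choice of the constant is \(M_{w} :=M_{v} (1 + R_{\varepsilon })\). The computation below is only a sketch under two assumptions read off from Step 1 of the proof of Lemma 5.9: that \(\mathrm {(v3)}\) takes the form \(| v_{\Psi } (x, \ell ) | \le M_{v} ( 1 + |x| + \Vert \ell \Vert _{L^{p}(U, \eta )} + m_{1}(\Psi ))\), and that \(\Vert \ell \Vert _{L^{p}(U, \eta )} \le R_{\varepsilon }\) for every \(\ell \in C_{\varepsilon }\). Taking the moment bound stated above as given, we have

$$\begin{aligned} | w_{\nu } (x) |&= \big | v_{(id, \Delta )_{\#} \nu } (x, \Delta (x, \nu )) \big | \le M_{v} \big ( 1 + |x| + \Vert \Delta (x, \nu ) \Vert _{L^{p}(U, \eta )} + m_{1} ( (id, \Delta )_{\#} \nu ) \big ) \\&\le M_{v} \big ( 1 + |x| + R_{\varepsilon } + (1 + R_{\varepsilon }) m_{1}(\nu ) \big ) \le M_{v} (1 + R_{\varepsilon }) \big ( 1 + |x| + m_{1}(\nu ) \big ) \,. \end{aligned}$$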

Proposition 5.11

Let \(v_{\Psi }\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\) and let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\). Then, for every \(\bar{\varvec{x}} = (\bar{x}^{1}, \ldots , \bar{x}^{N}) \in ( \mathbb {R}^{d})^{N}\) there exists a unique solution \(\varvec{x}_{t} = (x^{1}_{t}, \ldots , x^{N}_{t})\) of the Cauchy problem (5.24). Moreover, if \(\delta >0\) and \(\bar{\varvec{x}} \in (B_{\delta })^{N}\), there exists \(\varrho >0\), only depending on \(\delta \), such that \(\varvec{x}_{t} \in (B_{\varrho })^{N}\) for every \(t \in [0, T]\).


It is enough to notice that, by Lemma 5.10, the velocity field \(w_{\mu ^{N}} (x^{i})\) with \(\mu ^{N} = \frac{1}{N} \sum _{j=1}^{N} \delta _{x^{j}}\) is locally Lipschitz and sublinear in \((\mathbb {R}^{d})^{N}\) for every \(i = 1, \ldots , N\). Hence, system (5.24) admits a unique solution by standard ODE theory (see, e.g., [19]). The boundedness of solutions can be obtained by Grönwall's inequality as in Theorem 3.3. \(\square \)

The following convergence result holds for the N-particle system.

Theorem 5.12

Let \(v_{\Psi }\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\), let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\), let the operator \(\mathcal {T}_{\Psi }\) be defined as in (5.2), and let G be as in (5.4). For \(N \in \mathbb {N}\), \(\lambda \in (0, +\infty )\), and \(\delta \in (0, +\infty )\), let \(\varvec{\bar{y}} = (\bar{y}^{1}, \ldots , \bar{y}^{N}) \in (B^{Y_\varepsilon }_{\delta })^N\) and, for \(i = 1, \ldots , N\), let \(t \mapsto \varvec{y}_{\lambda , t} = (y^{1}_{\lambda , t}, \ldots , y^{N}_{\lambda , t})\) be the solution of the Cauchy problem (5.1) with initial datum \(\bar{\varvec{y}}\) and associated empirical measure \(\Lambda ^{N}_{\lambda , t} = \frac{1}{N} \sum _{i=1}^{N} \delta _{y^{i}_{\lambda , t}}\). Moreover, let \(t \mapsto {\varvec{x}}_{t} = ({x}^{1}_{t}, \ldots , {x}^{N}_{t})\) be the solution to (5.24) with initial conditions \(\bar{\varvec{x}} = (\bar{x}^{1}, \ldots , \bar{x}^{N}) \in ( B_{\delta })^{N}\) and let

$$\begin{aligned} \varvec{y}_{t} :=\big ( (x^{1}_{t}, \Delta (x^{1}_{t}, \mu ^{N}_{t}) ), \ldots , (x^{N}_{t}, \Delta (x^{N}_{t}, \mu ^{N}_{t}) ) \big ). \end{aligned}$$

Then, there exists \(\chi _{\varepsilon , \delta }>0\) such that for every \(t\in (0,T]\)

$$\begin{aligned} \Vert \varvec{y}_{\lambda ,t} - \varvec{{y}}_t \Vert _{\overline{Y}^N}&\le \chi _{\varepsilon , \delta } \Big ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T}\Big ) \qquad \text { if }p \in [1, 2] \,, \end{aligned}$$
$$\begin{aligned} \Vert \varvec{y}_{\lambda ,t} - \varvec{{y}}_t \Vert _{\overline{Y}^N}&\le \chi _{\varepsilon , \delta } \Big ( \Big (\frac{1}{\lambda }\Big )^{\frac{1}{p}} + e^{-\frac{2 \lambda \gamma _{\varepsilon } T}{p}} \Big ) \qquad \text { if }p \in (2, +\infty )\,, \end{aligned}$$

where \(\gamma _{\varepsilon }>0\) is the constant introduced in Lemma 5.9.


In what follows, we use also the notation \(\varvec{\ell }\) for a vector in \((L^{p} (U, \eta ))^{N}\) and we endow \((L^{p} (U, \eta ))^{N}\) with the norm

$$\begin{aligned} \Vert \varvec{\ell }\Vert _{(L^{p} (U, \eta ))^{N}} :=\frac{1}{N} \sum _{i=1}^{N} \Vert \ell ^{i}\Vert _{L^{p} (U, \eta )}. \end{aligned}$$

Moreover, we set \(\varvec{\ell }_{\lambda , t} :=( \ell _{\lambda , t}^{1}, \ldots , \ell _{\lambda , t}^{N})\), \(\varvec{\ell }_{t} :=( \Delta (x^{1}_{t}, \mu ^{N}_{t}), \ldots , \Delta (x^{N}_{t}, \mu ^{N}_{t}) )\), \(\mu _{\lambda , t}^{N} :=\pi _{\#} \Lambda ^{N}_{\lambda , t}\), and \(\bar{\mu }^{N} :=\frac{1}{N} \sum _{i=1}^{N} \delta _{\bar{x}^{i}}\).

We provide a complete proof for \(p \in [1, 2]\) and we highlight later on the main differences in the case \(p \in (2, +\infty )\). By (i) of Lemma 5.9 and by Proposition 5.11, there exists \(\varrho >0\) such that \(y^{i}_{\lambda , t}, y^{i}_{t} \in B_{\varrho }^{Y_{\varepsilon }}\) for \(i =1, \ldots , N\) and \(t \in [0, T]\). Hence, by triangle inequality and by (5.11) of Corollary 5.8, we have that

$$\begin{aligned} \begin{aligned}&\Vert \varvec{\ell }_{\lambda , t} - \varvec{\ell }_t \Vert _{(L^{p} (U, \eta ))^{N}} \\&\quad \le \frac{1}{N}\sum _{i=1}^N \Vert {\ell }_{\lambda , t}^{i} - \Delta ( x^{i}_{\lambda , t}, \mu ^{N}_{\lambda , t}) \Vert _{L^{p}(U, \eta )} + \frac{1}{N}\sum _{i=1}^N \Vert \Delta ( x^{i}_{\lambda , t}, \mu ^{N}_{\lambda , t}) -\Delta ( x^{i}_{t}, \mu ^{N}_{t}) \Vert _{L^{p}(U, \eta )} \\&\quad \le \frac{1}{N}\sum _{i=1}^N \Vert {\ell }_{\lambda , t}^{i} - \Delta ( x^{i}_{\lambda , t}, \mu ^{N}_{\lambda , t}) \Vert _{L^{p}(U, \eta )} + A_{\varepsilon , \varrho } ( \Vert \varvec{x}_{\lambda ,t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N} + \mathcal {W}_1( \mu _{\lambda , t}^{N} , \mu _t^{N}) ) \\&\quad \le \frac{1}{N}\sum _{i=1}^N \Vert {\ell }_{\lambda , t}^{i} - \Delta ( x^{i}_{\lambda , t}, \mu ^{N}_{\lambda , t}) \Vert _{L^{p}(U, \eta )} + 2 A_{\varepsilon , \varrho } \Vert \varvec{x}_{\lambda , t} -\varvec{x}_t \Vert _{(\mathbb {R}^d)^N} \,. \end{aligned} \end{aligned}$$
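The last inequality relies on a standard bound for the Wasserstein distance between two empirical measures with the same number of atoms. As a sketch, assume that \((\mathbb {R}^{d})^{N}\) is endowed with the averaged norm \(\Vert \varvec{x} \Vert _{(\mathbb {R}^d)^N} = \frac{1}{N} \sum _{i=1}^{N} |x^{i}|\), in analogy with the norm on \((L^{p} (U, \eta ))^{N}\) introduced above. The measure \(\frac{1}{N} \sum _{i=1}^{N} \delta _{(x^{i}_{\lambda , t}, x^{i}_{t})}\) is a transport plan between \(\mu ^{N}_{\lambda , t}\) and \(\mu ^{N}_{t}\), and therefore

$$\begin{aligned} \mathcal {W}_{1} ( \mu ^{N}_{\lambda , t}, \mu ^{N}_{t} ) \le \frac{1}{N} \sum _{i=1}^{N} | x^{i}_{\lambda , t} - x^{i}_{t} | = \Vert \varvec{x}_{\lambda , t} - \varvec{x}_{t} \Vert _{(\mathbb {R}^d)^N} \,. \end{aligned}$$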

Thanks to (ii) of Lemma 5.9, we may continue in (5.27) with

$$\begin{aligned} \Vert \varvec{\ell }_{\lambda , t} - \varvec{\ell }_{t} \Vert _{(L^{p}(U, \eta ))^{N}} \le \omega _{\varepsilon , \delta } \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) + 2 A_{\varepsilon , \varrho } \Vert \varvec{x}_{\lambda , t} -\varvec{x}_t \Vert _{(\mathbb {R}^d)^N} \,. \end{aligned}$$

Combining \(\mathrm {(v1)}\), \(\mathrm {(v2)}\), and inequality (5.28), we further estimate

$$\begin{aligned} \begin{aligned} \frac{\textrm{d}}{\textrm{d}t} \Vert \varvec{x}_{\lambda , t} - \varvec{x}_{t} \Vert _{(\mathbb {R}^d)^N}&\le \frac{1}{N} \sum _{i=1}^N | \dot{x}_{\lambda , t}^{i} - \dot{x}_t^{i} | = \frac{1}{N} \sum _{i=1}^N | v_{{\Lambda }_{\lambda , t}^{N}} ( {x}_{\lambda , t}^{i}, \ell _{\lambda , t}^{i}) - w_{\mu ^{N}_{t}} ( {x}_t^{i} ) |\\&= \frac{1}{N} \sum _{i=1}^N \big | v_{{\Lambda }_{\lambda , t}^{N}} ( {x}_{\lambda , t}^{i}, \ell _{\lambda , t}^{i}) - v_{(id, \Delta )_{\#}\mu ^{N}_{t}} ( {x}_t^{i} , \Delta (x^{i}_{t}, \mu ^{N}_{t}) ) \big |\\&\le \frac{2L_{\varrho }}{N} \sum _{i=1}^N \Big ( | x_{\lambda , t}^{i} - x_t^{i} | + \Vert \ell _{\lambda , t}^{i} - \Delta (x^{i}_{t}, \mu ^{N}_{t})\Vert _{L^{p}(U, \eta )} \Big )\\&\le 2 L_{\varrho } (1 + 2 A_{\varepsilon , \varrho } ) \Vert \varvec{x}_{\lambda , t} - \varvec{x}_{t} \Vert _{(\mathbb {R}^d)^N} + 2L_{\varrho } \omega _{\varepsilon , \delta } \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg )\,. \end{aligned} \end{aligned}$$

Equivalently, we can write

$$\begin{aligned} \begin{aligned}&\frac{\textrm{d}}{\textrm{d}t} \bigg ( \Vert \varvec{x}_{\lambda , t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N} \ + \frac{ \omega _{\varepsilon , \delta }}{1 + 2 A_{\varepsilon , \varrho }} \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) \bigg ) \\&\quad \le 2 L_{\varrho } (1 + 2 A_{\varepsilon , \varrho } ) \bigg ( \Vert \varvec{x}_{\lambda , t} - \varvec{x}_{t} \Vert _{(\mathbb {R}^d)^N} + \frac{ \omega _{\varepsilon , \delta }}{1 + 2 A_{\varepsilon , \varrho }} \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) \bigg ) \,. \end{aligned} \end{aligned}$$

Therefore, by applying Grönwall’s Lemma, for \(\tau >0\) and \(t \in [\tau , T]\) we obtain

$$\begin{aligned} \begin{aligned} \Vert \varvec{x}_{\lambda , t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N}&\le \bigg ( \Vert \varvec{x}_{\lambda , \tau } - \varvec{x}_{\tau } \Vert _{(\mathbb {R}^d)^N} + \frac{ \omega _{\varepsilon , \delta }}{1 + 2 A_{\varepsilon , \varrho }} \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg )\bigg )\\&\quad e^{2L_{\varrho } ( 1 + 2 A_{\varepsilon , \varrho }) (t - \tau ) } \,. \end{aligned} \end{aligned}$$

Recalling that \( \varvec{x}_{\lambda , t}, \varvec{x}_{ t} \in (B_{\varrho })^{N}\) for every \(t \in [0, T]\), from \((\textrm{v3})\) and (ii) of Lemma 5.10, we infer that

$$\begin{aligned} \Vert \varvec{x}_{\lambda , \tau } - \varvec{x}_{\tau } \Vert _{(\mathbb {R}^d)^N} \le ( M_{v} + M_{w}) (1 + 2\varrho + 2 R_{\varepsilon }) \tau \,. \end{aligned}$$

Thus, we deduce from (5.29) that

$$\begin{aligned} \Vert \varvec{x}_{\lambda , t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N}&\le \bigg ( ( M_{v} + M_{w}) (1 + 2\varrho + 2 R_{\varepsilon }) \tau + \frac{ \omega _{\varepsilon , \delta }}{1 + 2 A_{\varepsilon , \varrho }} \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg )\bigg ) \\&\quad \times e^{2L_{\varrho } ( 1 + 2 A_{\varepsilon , \varrho }) T }, \end{aligned}$$

which, together with (5.28), yields (5.25) for \(p \in [1, 2]\) by taking \(\tau = \frac{1}{\sqrt{\lambda }}\).
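The choice \(\tau = \frac{1}{\sqrt{\lambda }}\) makes the initial-layer term of the same order as the decay rate in (5.28). A sketch of the resulting bookkeeping for \(t \in [\tau , T]\) is given below, where \(C_{\varrho } :=( M_{v} + M_{w}) (1 + 2\varrho + 2 R_{\varepsilon })\) and \(E_{\varrho } :=e^{2L_{\varrho } ( 1 + 2 A_{\varepsilon , \varrho }) T }\) are shorthand introduced only here, and the norm on \(\overline{Y}^{N}\) is assumed to be the sum of the norms on \((\mathbb {R}^{d})^{N}\) and \((L^{p}(U, \eta ))^{N}\):

$$\begin{aligned} \Vert \varvec{x}_{\lambda , t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N}&\le E_{\varrho } \bigg ( C_{\varrho } + \frac{ \omega _{\varepsilon , \delta }}{1 + 2 A_{\varepsilon , \varrho }} \bigg ) \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) \,, \\ \Vert \varvec{y}_{\lambda ,t} - \varvec{y}_t \Vert _{\overline{Y}^N}&\le (1 + 2 A_{\varepsilon , \varrho }) \Vert \varvec{x}_{\lambda , t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N} + \omega _{\varepsilon , \delta } \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) \\&\le \Big ( (1 + 2 A_{\varepsilon , \varrho }) E_{\varrho } C_{\varrho } + (1 + E_{\varrho }) \omega _{\varepsilon , \delta } \Big ) \bigg ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T} \bigg ) \,. \end{aligned}$$

Under these assumptions, one admissible value of the constant is \(\chi _{\varepsilon , \delta } = (1 + 2 A_{\varepsilon , \varrho }) E_{\varrho } C_{\varrho } + (1 + E_{\varrho }) \omega _{\varepsilon , \delta }\). For \(t \in (0, \tau )\) the direct bound \(\Vert \varvec{x}_{\lambda , t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N} \le C_{\varrho } \tau \) can be used in place of the first line, and it leads to the same constant since \(E_{\varrho } \ge 1\).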

If \(p \in (2, +\infty )\), we replace (5.28) with

$$\begin{aligned} \Vert \varvec{\ell }_{\lambda , t} - \varvec{\ell }_{t} \Vert _{(L^{p}(U, \eta ))^{N}}&\le 2 (R_{\varepsilon } - r_{\varepsilon })^{\frac{p-2}{p}} \omega _{\varepsilon , \delta }^{\frac{2}{p}} \bigg ( \frac{1}{\lambda } + e^{-2 \lambda \gamma _{\varepsilon } T} \bigg )^{\frac{1}{p}}\nonumber \\&\quad + 2 A_{\varepsilon , \varrho } \Vert \varvec{x}_{\lambda , t} -\varvec{x}_t \Vert _{(\mathbb {R}^d)^N}. \end{aligned}$$

Following step by step the argument for (5.29), we get for \(t \in [\tau , T]\)

$$\begin{aligned} \begin{aligned}&\Vert \varvec{x}_{\lambda , t} - \varvec{x}_t \Vert _{(\mathbb {R}^d)^N} \le \bigg ( \Vert \varvec{x}_{\lambda , \tau } - \varvec{x}_{\tau } \Vert _{(\mathbb {R}^d)^N} \\&\quad + \frac{ 2 (R_{\varepsilon } - r_{\varepsilon })^{\frac{p-2}{p}} \omega _{\varepsilon , \delta }^{\frac{2}{p}}}{1 + 2A_{\varepsilon , \varrho }} \bigg ( \frac{1}{\lambda } + e^{-2 \lambda \gamma _{\varepsilon } T} \bigg )^{\frac{1}{p}} \bigg ) e^{2L_{\varrho } ( 1 + 2 A_{\varepsilon , \varrho }) (t - \tau ) } \,. \end{aligned} \end{aligned}$$

Then, (5.26) follows from (5.30) as in the case \(p \in [1, 2]\), taking \(\tau = \big ( \frac{1}{\lambda }\big )^{\frac{1}{p}}\) and possibly redefining the constant \(\chi _{\varepsilon , \delta }\). \(\square \)

We introduce the fast reaction continuity equation

$$\begin{aligned} \partial _{t} \mu _{t} + \textrm{div}(w_{\mu _{t}} \mu _{t}) = 0, \qquad \mu _{0} = \bar{\mu }, \end{aligned}$$

for \(\bar{\mu } \in \mathcal {P}_{1}(\mathbb {R}^{d})\) and \(\mu \in C([0, T]; ( \mathcal {P}_{1}(\mathbb {R}^{d}), \mathcal {W}_{1}))\). For the notion of Eulerian and Lagrangian solutions to (5.32) we refer to Definitions 4.1 and 4.5, with the obvious modifications (see also [4]). In the next proposition, we briefly discuss existence and uniqueness of solutions to (5.32).

Proposition 5.13

Let \(v_{\Psi }\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\) and let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\). Then, for every \(\bar{\mu } \in \mathcal {P}_{c} (\mathbb {R}^{d})\) there exists a unique Eulerian (and Lagrangian) solution to (5.32) with initial condition \(\bar{\mu }\). Moreover, for every \(\delta >0\) and every \(\bar{\mu }, \bar{\mu }^{n} \in \mathcal {P} (B_{\delta })\) such that \(\mathcal {W}_{1}(\bar{\mu }^{n}, \bar{\mu }) \rightarrow 0\) as \(n \rightarrow \infty \) we have that the corresponding solutions \(\mu , \mu ^{n} \in C([0, T]; (\mathcal {P}_{1} (\mathbb {R}^{d}), \mathcal {W}_{1}))\) with initial conditions \(\bar{\mu }\) and \(\bar{\mu }^{n}\), respectively, satisfy

$$\begin{aligned} \lim _{n \rightarrow \infty } \, \mathcal {W}_{1} (\mu _{t}^{n}, \mu _{t} ) = 0 \qquad \text { uniformly for }t \in [0, T]. \end{aligned}$$


The thesis can be obtained by combining Lemma 5.10 with the arguments used in Theorems 4.2 and 4.4. \(\square \)

Remark 5.14

As a consequence of Proposition 5.13, we have that for every \(\delta >0\) and every \(\bar{\mu } \in \mathcal {P}(B_{\delta })\), there exists \(\varrho >0\) (only depending on \(\delta \) and \(\varepsilon \)) such that the solution \(\mu \in C([0, T]; (\mathcal {P}_{1}(\mathbb {R}^{d}), \mathcal {W}_{1}))\) of (5.32) with initial condition \(\bar{\mu }\) satisfies \(\textrm{spt} (\mu _{t}) \subseteq B_{\varrho }\) for every \(t \in [0, T]\). This can be proven, for instance, by taking a sequence of empirical measures \(\bar{\mu }^{N} \in \mathcal {P}(B_{\delta })\) such that \(\mathcal {W}_{1}(\bar{\mu }^{N}, \bar{\mu }) \rightarrow 0\) and applying Propositions 5.11 and 5.13.

We are finally ready to discuss the convergence of the solutions to the continuity equations in the fast reaction limit.

Theorem 5.15

Let \(v_{\Psi }\) satisfy \(\mathrm {(v1)}\)–\(\mathrm {(v3)}\), let \(F :\mathcal {P}_1(\mathbb {R}^d)\times \mathbb {R}^d \times ( 0, +\infty ) \times U \rightarrow (-\infty ,+\infty ]\) satisfy \((\textrm{F1})\)–\((\textrm{F5})\), let \(\mathcal {T}_{\Psi }\) be defined as in (5.2), let \(\delta >0\), and let \(\bar{\Lambda } \in \mathcal {P}(B_{\delta }^{Y_{\varepsilon }})\) and \(\bar{\mu } = \pi _{\#} \bar{\Lambda } \in \mathcal {P}(B_{\delta })\). For every \(\lambda >0\), let \(\Lambda _{\lambda } \in C([0, T]; (\mathcal {P}_{1} (Y_{\varepsilon }), \mathcal {W}_{1}))\) be the solution to (5.3) with initial condition \(\bar{\Lambda }\) and let \(\mu \in C([0, T]; (\mathcal {P}_{1} ( \mathbb {R}^{d} ), \mathcal {W}_{1}))\) be the solution to (5.32) with initial condition \(\bar{\mu }\). Then, for every \(t \in [0, T]\) we have that

$$\begin{aligned}&\mathcal {W}_{1} (\Lambda _{\lambda , t}, (id, \Delta )_{\#} \mu _{t}) \le \chi _{\varepsilon , \delta } \Big ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T}\Big ) \qquad \text { if }p \in [1, 2], \end{aligned}$$
$$\begin{aligned}&\mathcal {W}_{1} (\Lambda _{\lambda , t}, (id, \Delta )_{\#} \mu _{t}) \le \chi _{\varepsilon , \delta } \Big ( \Big (\frac{1}{\lambda }\Big )^{\frac{1}{p}} + e^{-\frac{2 \lambda \gamma _{\varepsilon } T}{p}} \Big ) \qquad \text { if }p \in (2, +\infty ), \end{aligned}$$

where \(\gamma _{\varepsilon }\) and \(\chi _{\varepsilon , \delta }\) are the constants introduced in Lemma 5.9 and Theorem 5.12, respectively.


We proceed by finite-particle approximation. Fix \(\lambda \in (0, +\infty )\) and a sequence \(\bar{\varvec{y}}_{N} :=(\bar{y}^{1}_{N}, \ldots , \bar{y}^{N}_{N}) \in (B_{\delta }^{Y_{\varepsilon }})^{N}\), let \(\bar{\Lambda }^{N} \in \mathcal {P}(B^{Y_{\varepsilon }}_{\delta })\) denote the associated empirical measure, and assume that \(\mathcal {W}_{1} (\bar{\Lambda }^{N}, \bar{\Lambda } ) \rightarrow 0\). Denote by \(\varvec{y}_{\lambda ,N, t} \in Y_{\varepsilon }^{N}\) the solution to (5.1) with initial condition \(\bar{\varvec{y}}_{N}\) and by \(\Lambda _{\lambda , t}^{N}\) the corresponding empirical measure. Similarly, let \(\varvec{x}_{N, t} \in (\mathbb {R}^{d})^{N}\) be the solution to (5.24) with initial condition \(\bar{\varvec{x}}_{N} = (x^{1}_{N}, \ldots , x^{N}_{N}) \in (B_{\delta })^{N}\), and let \(\mu ^{N}_{t}\) be the corresponding empirical measure.

By the triangle inequality, for every \(N \in \mathbb {N}\) and every \(t \in [0, T]\) we have that

$$\begin{aligned} \begin{aligned} \mathcal {W}_{1} (\Lambda _{\lambda , t}, (id, \Delta )_{\#} \mu _{t})&\le \ \mathcal {W}_{1} (\Lambda _{\lambda , t}, \Lambda ^{N}_{\lambda , t}) + \mathcal {W}_{1} (\Lambda ^{N}_{\lambda , t}, (id, \Delta )_{\#} \mu ^{N}_{t}) \\&+ \mathcal {W}_{1} ( (id, \Delta )_{\#} \mu ^{N}_{t}, (id, \Delta )_{\#} \mu _{t}) \,. \end{aligned} \end{aligned}$$

By Corollary 5.4 we have that

$$\begin{aligned} \lim _{N \rightarrow \infty } \, \mathcal {W}_{1} (\Lambda _{\lambda , t}, \Lambda ^{N}_{\lambda , t}) = 0 \qquad \text { uniformly in }[0, T]. \end{aligned}$$

Thanks to Remark 5.14, there exists \(\varrho >0\) such that \(\mu ^{N}_{t}, \mu _{t} \in \mathcal {P}(B_{\varrho })\) for every \(t \in [0, T]\) and every \(N \in \mathbb {N}\). Hence, by Proposition 5.13 and by (5.11) of Proposition 5.7 we have that

$$\begin{aligned} \lim _{N \rightarrow \infty } \, \mathcal {W}_{1} ( (id, \Delta )_{\#} \mu ^{N}_{t}, (id, \Delta )_{\#} \mu _{t}) = 0 \qquad \text { uniformly in }[0, T]. \end{aligned}$$

Applying Theorem 5.12 to \(\varvec{y}_{\lambda , N, t}\) and to

$$\begin{aligned} \varvec{y}_{N, t} :=\big ( (x^{1}_{N, t}, \Delta (x^{1}_{N, t}, \mu ^{N}_{t})), \ldots , (x^{N}_{N, t}, \Delta (x^{N}_{N, t}, \mu ^{N}_{t})) \big ), \end{aligned}$$

we get that

$$\begin{aligned}&\mathcal {W}_{1} (\Lambda ^{N}_{\lambda , t}, (id, \Delta )_{\#} \mu ^{N}_{t}) \le \chi _{\varepsilon , \delta } \Big ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T}\Big ) \qquad \text { if }p \in [1, 2], \end{aligned}$$
$$\begin{aligned}&\mathcal {W}_{1} (\Lambda ^{N}_{\lambda , t}, (id, \Delta )_{\#} \mu ^{N}_{t}) \le \chi _{\varepsilon , \delta } \Big ( \Big (\frac{1}{\lambda }\Big )^{\frac{1}{p}} + e^{-\frac{2 \lambda \gamma _{\varepsilon } T}{p}} \Big ) \qquad \text { if }p \in (2, +\infty ). \end{aligned}$$
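
For the reader's convenience, the passage to the limit can be spelled out as follows; this is only a sketch, written for the case \(p \in [1, 2]\), and the case \(p \in (2, +\infty )\) is handled in the same way with the corresponding right-hand side. Since the bound on \(\mathcal {W}_{1} (\Lambda ^{N}_{\lambda , t}, (id, \Delta )_{\#} \mu ^{N}_{t})\) does not depend on \(N\), taking the limit superior as \(N \rightarrow \infty \) in the triangle inequality and using the two uniform limits gives

$$\begin{aligned} \begin{aligned} \mathcal {W}_{1} (\Lambda _{\lambda , t}, (id, \Delta )_{\#} \mu _{t})&\le \limsup _{N \rightarrow \infty } \Big [ \mathcal {W}_{1} (\Lambda _{\lambda , t}, \Lambda ^{N}_{\lambda , t}) + \mathcal {W}_{1} (\Lambda ^{N}_{\lambda , t}, (id, \Delta )_{\#} \mu ^{N}_{t}) + \mathcal {W}_{1} ( (id, \Delta )_{\#} \mu ^{N}_{t}, (id, \Delta )_{\#} \mu _{t}) \Big ] \\&\le 0 + \chi _{\varepsilon , \delta } \Big ( \frac{1}{\sqrt{\lambda }} + e^{-\lambda \gamma _{\varepsilon } T}\Big ) + 0 \,. \end{aligned} \end{aligned}$$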

Combining (5.36)–(5.40) we infer (5.34) and (5.35). \(\square \)
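
To get a feeling for the rates in Theorem 5.15, the right-hand sides of the two estimates can be evaluated numerically. The following Python snippet is a minimal sketch and not part of the original analysis: the function name `fast_reaction_bound` is hypothetical, and the default values of `chi`, `gamma`, and `T` are arbitrary placeholders for \(\chi _{\varepsilon , \delta }\), \(\gamma _{\varepsilon }\), and \(T\), whose actual values depend on the data of the problem.

```python
import math


def fast_reaction_bound(lam, p, chi=1.0, gamma=1.0, T=1.0):
    """Evaluate the right-hand side of the estimate in Theorem 5.15.

    lam   : fast-reaction parameter lambda > 0
    p     : exponent p >= 1 selecting which of the two estimates applies
    chi   : placeholder for the constant chi_{eps,delta} (assumed value)
    gamma : placeholder for the constant gamma_eps (assumed value)
    T     : time horizon
    """
    if lam <= 0:
        raise ValueError("lambda must be positive")
    if p < 1:
        raise ValueError("p must be at least 1")
    if p <= 2:
        # Case p in [1, 2]: algebraic rate lambda^(-1/2).
        return chi * (1.0 / math.sqrt(lam) + math.exp(-lam * gamma * T))
    # Case p in (2, +infinity): algebraic rate lambda^(-1/p).
    return chi * (lam ** (-1.0 / p) + math.exp(-2.0 * lam * gamma * T / p))


if __name__ == "__main__":
    # Print the bound for increasing lambda and two sample exponents.
    for lam in (1.0, 1e2, 1e4):
        print(lam, fast_reaction_bound(lam, p=2), fast_reaction_bound(lam, p=4))
```

For large \(\lambda \) the exponential term is negligible, so the bound is governed by the algebraic term, which decays more slowly as \(p\) grows beyond 2.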