1 Introduction

Fundamental in Boltzmann’s deduction of the equation bearing his name is the “stosszahlansatz”, or chaos assumption. Vaguely expressed this assumption means that when two particles collide, they are statistically uncorrelated just before the collision. After the collision they are not, of course, because, for example, the knowledge of the position and velocity of one particle that just collided gives some information on the position of the collision partner. And while the correlations created by collisions decrease with time, they never vanish in a system of finitely many particles, and hence the Boltzmann assumption could only be true in the limit of infinitely many particles.

A mathematical framework for studying this limit is the so-called BBGKY hierarchy (see Grad [17], Cercignani [9] and the book [11] by Cercignani et al.), which consists of a family of Liouville equations, each describing the evolution of an \(N\)-particle system (deterministic, in this case), and whose solutions are densities in the space of \(N\)-particle configurations in phase space. The BBGKY hierarchy describes in a systematic way the evolution of marginal distributions. Formally, and under appropriate assumptions, most notably the chaos assumption, the one-particle marginal of solutions to the \(N\)-particle Liouville equation converges to solutions of the Boltzmann equation.

It would take until 1974 before a mathematically rigorous proof of this statement was given by Lanford [27]. While this is a remarkable result, it only proves that the Boltzmann equation is a limit of the \(N\)-particle systems for a fraction of the mean free time between collisions, and this is essentially where the problem stands today (see however [22] for a large time limit in a near to the vacuum framework; it is worth emphasizing that in such a framework no more collisions occur than in Lanford’s framework).

In order to avoid some of the difficulties related to the deterministic evolution of a real particle system, Kac [23] invented a Markov process for a particular \(N\)-particle system, and gave a mathematically rigorous definition of propagation of chaos. He then proved that this holds for his Markov system, and thus obtained a mathematically rigorous derivation of a simplified (spatially homogeneous) Boltzmann equation in this case, usually called the Kac equation.

Kac’s work provides the framework of this paper, and we will now describe our main results. We let \(E\) be the state space of one particle (usually \({\mathbb {R}}^d\), but metric, separable and locally compact is fine). A sequence of probability measures \({(f^N)}_{N=0}^\infty \), where each \(f^N\in {\mathcal {P}}(E^N)\) is symmetric in the sense that it is invariant under permutation of the coordinates, is said to be \(f\)-chaotic for some probability measure \(f\in {\mathcal {P}}(E)\) if for each \(k\ge 1\) and all continuous bounded functions \(\phi _j\in C_b(E), j=1, \ldots ,k\),

$$\begin{aligned} \lim _{N\rightarrow \infty } \int _{E^N} \prod _{j=1}^{k} \phi _j(z_j) \, f^N(\mathrm{d}z_1,\dots ,\mathrm{d}z_N)&= \prod _{j=1}^{k} \int _{E} \phi _j(z) \, f(\mathrm{d}z). \end{aligned}$$
(1.1)
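As a numerical illustration of definition (1.1), the following minimal sketch uses a classical example that is not discussed in the text above: the uniform law on the sphere of radius \(\sqrt{N}\) in \({\mathbb {R}}^N\), which is known to be chaotic with respect to the standard Gaussian law although it is not a product measure. The function names, the choice \(\phi _1=\phi _2=\cos \), \(k=2\), and the sample sizes are assumptions made only for this illustration.

```python
import math
import random

def sample_sphere(n, rng):
    """Draw one point uniformly on the sphere of radius sqrt(n) in R^n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(n) / math.sqrt(sum(x * x for x in g))
    return [scale * x for x in g]

def chaos_defect(n, n_samples, rng):
    """Monte Carlo estimate of |E[cos(z1) cos(z2)] - (E_gauss[cos])^2|,
    i.e. the gap between the two sides of (1.1) for k = 2 at finite N."""
    acc = 0.0
    for _ in range(n_samples):
        z = sample_sphere(n, rng)
        acc += math.cos(z[0]) * math.cos(z[1])
    gaussian_mean = math.exp(-0.5)  # integral of cos against the N(0, 1) law
    return abs(acc / n_samples - gaussian_mean ** 2)

rng = random.Random(0)
for n in (2, 10, 50):
    # The defect should shrink as N grows, up to Monte Carlo noise.
    print(n, chaos_defect(n, 5000, rng))
```

The printed defect is only a statistical estimate for one pair of test functions; it indicates, but of course does not prove, the convergence required in (1.1).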

We next consider a family of time dependent probability measures \({(f_t^N)}_{N=0}^\infty \), being the distributions of the states of Markov processes in \(E^N\). The Markov process is said to propagate chaos if given an initial family of \(N\)-particle distributions \({(f_{in}^N)}_{N =0}^\infty \), that is \(f_{in}\)-chaotic, there is a time dependent distribution \(f_t\) such that \({(f_t^N)}_{N=1}^{\infty }\) is \(f_t\)-chaotic. In this paper we are interested in specific equations that govern the evolution of \(f_t\); but it is important to bear in mind that they are in general nonlinear, and it may be difficult to prove well-posedness in function spaces relevant for proving the propagation of chaos.

The main results of this paper are abstract. We consider:

  • the family of \(N\)-particle systems represented by Markov processes \(({\mathcal {Z}}^N_t)_{t\ge 0}\) in some product space \(E^N\), with \({\mathcal {Z}}^N_t=({\mathcal {Z}}_{1,t},\ldots ,{\mathcal {Z}}_{N,t})\), and the corresponding probability distributions \({(f_t^N)}_{N=1}^{\infty }\), solving Kolmogorov’s backward equations (see Footnote 1), and where the distributions \(f_t^N\) belong to a suitable subspace of \({\mathcal {P}}(E^N)\),

  • a (nonlinear) equation defined on a subspace of \({\mathcal {P}}(E)\), which is the formal limit of the equations governing one-particle marginals of \(f^N_t\):

    $$\begin{aligned} \frac{\partial }{\partial t} f_t = Q(f_t), \quad f_{\mathrm{in}} \in {\mathcal {P}}(E). \end{aligned}$$
    (1.2)

Then we:

  • provide conditions on the processes and related function spaces that guarantee that \({(f_t^N)}_{N=1}^{\infty }\) is \(f_t\)-chaotic for \(t \ge 0\),

  • give explicit estimates of the rate of convergence in (1.1): more precisely, for any \(T>0, \ell \in {\mathbb {N}}^*\) and \(\phi _j\in {\mathcal {F}}\subset C_b(E)\) (\(j=1,\ldots ,\ell \)), there is a constant \(\epsilon (N)\) converging to zero as \(N\rightarrow \infty \) such that

    $$\begin{aligned} \sup _{t \in [0,T]} \left| \,\int _{E^N} \prod _{j=1}^{\ell } \phi _j(z_j) \, f^N_t(\mathrm{d}z_1, \dots ,\mathrm{d}z_N)- \prod _{j=1}^{\ell } \int _{E}\phi _j(z) \, f_t(\mathrm{d}z)\right| \le \epsilon (N),\quad \quad \end{aligned}$$
    (1.3)

    which holds for \(N\ge 2\ell \) and a suitably chosen space \({\mathcal {F}}\); if \({\mathcal {F}}\) is dense in \(C_b(E)\), this implies in particular the propagation of chaos.

To this end, our starting point is a technique that goes back at least to Grünbaum [21], which consists in representing an \(N\)-particle configuration \({\mathcal {Z}}^N_t\) as a sum of Dirac measures,

$$\begin{aligned} {\mathcal {Z}}^N_t=({\mathcal {Z}}_{1,t},\dots ,{\mathcal {Z}}_{N,t}) \quad \longleftrightarrow \quad \mu ^N_{{\mathcal {Z}}^N_t}=\frac{1}{N}\sum _{j=1}^N \delta _{{\mathcal {Z}}_{j,t}} \in {\mathcal {P}}(E) \end{aligned}$$

and proving that, in a weak sense, \(\mu ^N_{{\mathcal {Z}}^N_t}\) converges to \(f_t\). In fact, because \({\mathcal {Z}}^N_t\) is random, \(\mu ^N_{{\mathcal {Z}}^N_t}\) is a random measure in \({\mathcal {P}}(E)\), which has a probability distribution \(\Psi ^N_t\in {\mathcal {P}}({\mathcal {P}}(E))\). Proving the propagation of chaos is here equivalent to proving that \(\Psi ^N_t\rightarrow \delta _{f_t}\) in \({\mathcal {P}}({\mathcal {P}}(E))\) when \(N\rightarrow \infty \). The error \(\epsilon (N)\) is dominated by, on the one hand, how well the initial measure \(f_{\mathrm{in}}\) can be approximated by a sum of \(N\) Dirac measures, and, on the other hand, estimates comparing the equations for \(f^N_t\) and \(f_t\). These estimates depend on rather technical assumptions, and although the abstract main theorem is stated in Sect. 2, the assumptions are stated in full detail only in Sect. 4.

With the main theorem of this paper in hand, proving the propagation of chaos for a particular \(N\)-particle system is reduced to proving:

  (i) a purely functional estimate on the dual generator \(G^N\) of the \(N\)-particle dynamics which establishes and quantifies that, at first order, \(G^N\) is linked to the mean field limit generator \(Q\) (consistency estimate);

  (ii) some fine stability estimates on the flow of the mean field limit equation involving the differential of the semigroup with respect to the initial data (stability estimates).

Point (i) of our method is largely inspired by the “duality viewpoint” of Grünbaum’s paper [21], where he considered the propagation of chaos issue for the Boltzmann equation associated to the hard-spheres (unbounded) kernel. As he acknowledged himself, the proof in [21] was incomplete due to the lack of suitable stability estimates, i.e. precisely point (ii) of our method.

It is worth emphasizing that after we had finished writing our paper, we were told about the recent book [26] by Kolokoltsov and his series of papers on nonlinear Markov processes and kinetic equations. These interesting works focus on fluctuation estimates of LLN and CLT types in the general framework of nonlinear Markov processes, and in some sense they generalise Grünbaum’s duality viewpoint to several other kinetic models (although Kolokoltsov seems not to be aware of that earlier work). However, we were not able to extract from these works a full proof in the cases when the generator is an unbounded operator and weak distances have to be used. While the comparison of generators for the many-particle and the limit semigroup present in both [21] and [26] is reminiscent of our work, we believe that the main novelty of the present paper is to achieve, for the first time, both the fine stability estimates in point (ii) and the consistency estimate in point (i) in appropriate spaces (with weak topologies), in such a way that they may be combined and lead to the already mentioned propagation of chaos result with quantitative estimates.

We illustrate the method by proving the propagation of chaos for three different well known examples:

  (a) We first consider the Boltzmann equation for Maxwellian molecules with angular cutoff. For such a bounded kernel case the result is well-known since the pioneering works of Kac [23, 24] and McKean [29] (who prove the propagation of chaos without any rate) and from the works by Graham and Méléard [18–20, 30] (where the authors establish the propagation of chaos with optimal rate \({\mathcal {O}}(1/N)\)). In these papers, the cornerstone of the proof is a combinatorial argument applied to the equation on the law (Wild sum expansion) or to the stochastic flow (stochastic tree). These approaches are restricted to a constant (or at least bounded) collision rate.

  (b) The second example is the McKean–Vlasov model. For such a model again, propagation of chaos is well-known and has been extensively studied. One of the most popular and efficient approaches to deal with this model is the so-called “coupling method” introduced in the 1970s, which yields the optimal convergence rate \({\mathcal {O}}(1/\sqrt{N})\) (note that the difference between the two optimal rates in (a) and (b) comes from the fact that they are not measured with the same distance). We refer to the lecture notes [30, 36] as well as to the references therein for a detailed discussion of that method. We also refer to [5] and the references therein for recent developments on the subject.

  (c) The third example is a mixed collision–diffusion equation which arises from granular gas modeling. For such a model, it seems that both the “combinatorics method” and the “coupling method” fail, while our present method is robust enough to apply and yield quantitative chaos estimates. Let us also emphasize that the BBGKY method and the nonlinear martingale method (see again [30, 36] or [1, 31]) may also apply, but would give a propagation of chaos without any rate since they are based on compactness arguments.

Let us emphasize that it is not difficult to write a uniform in time version of Theorem 2.1: in short, if the assumptions (A1) to (A5) are satisfied with \(T=+\infty \), then the conclusion of the main abstract Theorem 2.1 holds with \(T=+\infty \) and the proof is unchanged. But such an abstract theorem does not readily apply to the examples (a), (b) and (c) discussed above. More precisely, it is indeed possible to prove quantitative uniform in time propagation of chaos by our method (for the elastic Boltzmann model for instance), but the price to pay is a significant modification to the set of assumptions (A1) to (A5). This issue is addressed in our companion paper [32], where the abstract method is developed in a more general framework in order to (1) apply it to Boltzmann collision models associated to unbounded collision rates, (2) develop a theory of uniform in time propagation of chaos estimates. We shall consider the question of uniform in time chaoticity estimates for the McKean–Vlasov equation in future works.

These three examples illustrate the generality of the method that we study: the same abstract framework can be used to prove propagation of chaos for \(N\)-particle systems that have not yet been analysed as well as for models that have been studied before but with conceptually different methods. But we chose to emphasize generality over optimality of the result. By optimizing the method of proof for a specific problem one can certainly obtain sharper results, for example in terms of the rate of convergence as a function of \(N\) or in the choice of topologies for which convergence can be proven. We did not pursue this goal in this paper.

Also, in applying the abstract theorem to a concrete model, one is faced with the challenge of finding functional spaces that satisfy the conditions of our abstract convergence theorem and are adapted to the model. In many cases, like for the three examples presented here, existing theory for the \(N\)-particle systems and for the limiting equations may give a strong hint on what choices to make, but in other cases this could present serious difficulties. Another guiding principle is the consistency estimates between the generators of the \(N\)-particle system and the limit equation which constrain the norms or metrics that can be used and hint at the losses on the norms or metrics at the basis of the scale of spaces used in the stability estimates.

Despite its technical cost, the original approach proposed by Grünbaum, even if originally incomplete, seems to us intuitively very attractive, and with Theorems 2.1, 5.1, 6.2 and 7.1 we make this approach into a mathematically rigorous theory.

The plan of the paper is as follows. In Sect. 2, we present the method in an abstract framework, by first setting up a functional framework that is appropriate for comparing the \(N\)-particle dynamics with the limiting dynamics, and we establish the abstract quantitative propagation of chaos (Theorem 2.1). The main steps of the proof are given as well, but these rely on some technical assumptions and lemmas which are postponed to Sect. 4. The functional framework is developed with the necessary details in Sect. 3, where we also develop a differential calculus for functions on \({\mathcal {P}}(E)\), as needed for studying the nonlinear semigroup. In Sect. 5, we apply the method to the Boltzmann equation associated to the Maxwell molecules collision kernel with Grad’s cut-off. In Sect. 6, we apply the method to the McKean–Vlasov equation, and finally, in Sect. 7, it is applied to some mixed jump and diffusion equations motivated by granular gases.

2 Propagation of chaos for abstract \(N\)-particle systems

In this section we introduce the mathematical notation used in the paper, make precise statements of the results, and describe the main steps of the proof, leaving the details of the proofs to the next sections.

2.1 The \(N\)-particle system

The phase space of the \(N\)-particle system is \(E^N/{\mathfrak {S}}_N\) (see Footnote 2). Here \(E\) is assumed to be a locally compact, separable metric space, and \({\mathfrak {S}}_N\) denotes the symmetric group of order \(N\). This means that we identify all points in the \(N\)-particle phase space that can be obtained by permutation of the particles, so that if \(Z=(z_1,\ldots ,z_N) \in E^N/{\mathfrak {S}}_N\) we have \((z_1,\ldots ,z_N)\sim (z_{\sigma _1},\ldots ,z_{\sigma _N})\), where \((\sigma _1,\ldots ,\sigma _N)\) is any permutation of \(\{1, \ldots , N\}\). The evolution in phase space may be a stochastic Markov process or the solution to a Hamiltonian system of equations. In both cases, we denote by \({({\mathcal {Z}}^N_t)}_{t\ge 0}\) the flow of the process.

Figure 1 illustrates the relation between the different objects that we consider. The \(N\)-particle system is represented in the upper left corner of Fig. 1. The different mathematical objects in this diagram are explained in the following subsections.

Fig. 1. A summary of spaces and their relations. Semigroups are in most cases given together with their generators, as in \(\,S^N_t \big |A\)

2.2 Master equations, Liouville’s equations and their duals

Let \({\mathcal {P}}_{\mathrm{sym}}(E^N)\) denote the probability measures on \(E^N\) that are invariant under permutation of the indices in \(Z=(z_1,\dots ,z_N)\in E^N\). The flow \({\mathcal {Z}}^N_t\) induces a semigroup of operators \(S^N_t\) on \({\mathcal {P}}_{\mathrm{sym}}(E^N)\) defined through the formula

$$\begin{aligned}&\forall \, f^N_{\mathrm{in}} \in {\mathcal {P}}_{\mathrm{sym}}(E^N), \ \varphi \in C_b(E^N),\nonumber \\&\quad \left\langle S^N_t (f^N_{\mathrm{in}}), \varphi \right\rangle = {\mathbb {E}}\left( \varphi \left( {\mathcal {Z}}^N_t\right) \right) := \int _{E^N} {\mathbb {E}}_{Z_{\mathrm{in}}} \left( \varphi \left( {\mathcal {Z}}^N_t\right) \right) \, f^N_{\mathrm{in}} (\mathrm{d}Z_{\mathrm{in}}), \end{aligned}$$
(2.1)

where the bracket denotes the duality bracket between \({\mathcal {P}}(E^N)\) and \(C_b(E^N)\):

$$\begin{aligned} \langle f, \phi \rangle&= \int _{E^N} \phi (Z) \, f(\mathrm{d}Z), \end{aligned}$$

and \({\mathbb {E}}_{Z_{\mathrm{in}}}\) denotes the conditional expectation with respect to the initial condition \({\mathcal {Z}}^N_{\mathrm{in}} = Z_{\mathrm{in}}\). This semigroup is the solution to Kolmogorov’s forward equation in the case where \(({\mathcal {Z}}^N_t)_{t\ge 0}\) is a random process (this equation is often called the master equation), and of the Liouville equation in the Hamiltonian case. We always assume that \(S^N_t\) preserves the symmetry under permutation, and therefore restricts to an evolution semigroup on \({\mathcal {P}}_{\mathrm{sym}}(E^N)\). There is a dual semigroup of \(S^N_t\), that acts on \(C_b(E^N)\), the set of bounded continuous functions on \(E^N\). We write this semigroup \(T^N_t\), and denote its generator by \(G^N\). The two semigroups are related by

$$\begin{aligned} \left\langle f^N, T^N_t\phi \right\rangle = \left\langle S^N_t f^N, \phi \right\rangle . \end{aligned}$$
(2.2)
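The duality relation (2.2) can be checked by hand in a toy setting. The sketch below is a discrete-time, finite-state analogue, which is an assumption made only for illustration: the state space has three points, one time step is given by a hypothetical transition matrix `P`, the forward semigroup acts on probability vectors and the dual semigroup acts on observables.

```python
def push_forward(f, P):
    """Forward action S: law at step n -> law at step n + 1 (row vector times P)."""
    n = len(f)
    return [sum(f[i] * P[i][j] for i in range(n)) for j in range(n)]

def pull_back(phi, P):
    """Dual action T: (T phi)(i) = E_i[phi(next state)] (P times column vector)."""
    n = len(phi)
    return [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]

def bracket(f, phi):
    """Duality bracket <f, phi> between a law and an observable."""
    return sum(fi * pi for fi, pi in zip(f, phi))

# Hypothetical three-state transition matrix (rows sum to one).
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
f_in = [1.0, 0.0, 0.0]
phi = [0.0, 1.0, 4.0]

f, psi = f_in, phi
for _ in range(5):
    f = push_forward(f, P)      # S applied five times to the initial law
    psi = pull_back(psi, P)     # T applied five times to the observable

# Both numbers are the expectation of phi after five steps.
print(bracket(f, phi), bracket(f_in, psi))
```

Evolving the law and testing against \(\phi \), or keeping the initial law and evolving \(\phi \), gives the same expectation, which is the content of (2.2) in this simplified setting.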

Markov processes such that (1) \(T^N_t\) is a contraction in \(C_0(E^N)\), the set of continuous functions that vanish at infinity, and (2) \(t\mapsto T^N_t\phi \) is continuous for any \(\phi \in C_0(E^N)\), are known as Feller processes.

Hence the upper part of the diagram represents a (random) process, and its distributions. The lower part of the diagram essentially shows the same thing as induced by the map \(\mu ^N_Z\), as we shall now see.

2.3 The limiting dynamics

The components \((z_1,\ldots ,z_N)\) of \(Z \in E^N / {\mathfrak {S}}_N\) represent the positions (in a generalized sense, i.e. in the phase space \(E\)) of the \(N\) particles. These \(N\) particles can also be uniquely (see Footnote 3) represented as an empirical measure, that is a sum of Dirac measures:

$$\begin{aligned} Z=(z_1,\dots ,z_N) \ \mapsto \ \mu ^N_Z = \frac{1}{N}\sum _{j=1}^N \delta _{z_j}. \end{aligned}$$
(2.3)

The resulting measure is normalized so as to give a probability measure, which is obviously independent of any permutation of the indices. The set of such empirical measures is denoted by \({\mathcal {P}}_N(E)\). Probability measures on \(E\) are denoted by \({\mathcal {P}}(E)\), so that \({\mathcal {P}}_N(E) \subset {\mathcal {P}}(E)\).
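A minimal sketch of the map (2.3), assuming \(E={\mathbb {R}}\) and representing the empirical measure only through its action on test functions; the helper name `empirical_pairing` and the sample configuration are hypothetical.

```python
import random

def empirical_pairing(Z, phi):
    """<mu^N_Z, phi> = (1/N) * sum_j phi(z_j) for a configuration Z = (z_1, ..., z_N)."""
    return sum(phi(z) for z in Z) / len(Z)

Z = [0.3, -1.2, 2.0, 0.3]          # a configuration with a repeated position
phi = lambda z: z * z              # a test function (bounded on the finite support)

shuffled = Z[:]
random.Random(0).shuffle(shuffled) # relabel the particles

# The pairing does not depend on the labelling, and the total mass is one.
print(empirical_pairing(Z, phi), empirical_pairing(shuffled, phi))
print(empirical_pairing(Z, lambda z: 1.0))
```

The invariance under relabelling is what allows the empirical measure to represent a point of \(E^N/{\mathfrak {S}}_N\).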

When the number of particles goes to infinity, we may have \(\mu ^N_Z \rightarrow f\in {\mathcal {P}}(E)\), where now \(f\) is a distribution of particles in \(E\). We call a “limiting equation for the \(N\)-particle systems” the (usually nonlinear) equation of the form (1.2) that is satisfied by the probability distribution \(f_t\) obtained as the limit of \(\mu ^N_Z\), and we write its solution in the form of a nonlinear semigroup \(S^{N\!L}_t\):

$$\begin{aligned} f_t = S^{N\!L}_t (f_{\mathrm{in}}) \end{aligned}$$

is the solution of

$$\begin{aligned} \partial _t f_t = Q(f_t), \qquad f_0 = f_{\mathrm{in}}. \end{aligned}$$

The main result of this paper can be seen as a perturbation result: given a solution to the limiting equation, we consider a sequence of measures \(\mu _{{\mathcal {Z}}_{\mathrm{in}}}^N\in {\mathcal {P}}_N(E)\) such that

$$\begin{aligned} \mu _{{\mathcal {Z}}_{\mathrm{in}}}^N \rightarrow f_{\mathrm{in}} \end{aligned}$$

and prove that for all \(t\) in some interval \(0\le t\le T\),

$$\begin{aligned} \mu _{{\mathcal {Z}}_t}^N \rightarrow S^{N\!L}_t(f_{\mathrm{in}}). \end{aligned}$$

The convergence is established in the weak topology for the law of the random empirical measures, as will be explained next.
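The convergence of \(\mu ^N_{{\mathcal {Z}}_t}\) to the limiting flow can be observed numerically on a simple model. The sketch below assumes the standard one-dimensional Kac walk, which is not written out in the text above: pairs of velocities are chosen uniformly at random at total rate \(N/2\) and rotated by a uniform angle. With this normalization, and for initial data with unit second moment and fourth moment \(m_4(0)\), a direct moment computation on the limiting Kac equation gives \(m_4(t) = 3 + (m_4(0)-3)\,e^{-t/4}\). All function names and parameter values are hypothetical.

```python
import math
import random

def kac_walk_fourth_moment(n_particles, t_final, rng):
    """Simulate the Kac walk up to time t_final and return <mu^N_{Z_t}, v^4>."""
    # Initial velocities +-1: second moment 1, fourth moment 1.
    v = [rng.choice((-1.0, 1.0)) for _ in range(n_particles)]
    pair_rate = n_particles / 2.0          # each particle collides at rate 1
    t = rng.expovariate(pair_rate)
    while t < t_final:
        i, j = rng.sample(range(n_particles), 2)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        # Rotation of the pair: preserves v_i^2 + v_j^2 (energy).
        v[i], v[j] = c * v[i] + s * v[j], -s * v[i] + c * v[j]
        t += rng.expovariate(pair_rate)
    return sum(x ** 4 for x in v) / n_particles

def limit_fourth_moment(t, m4_initial=1.0):
    """Fourth moment of the solution of the limiting equation (unit energy)."""
    return 3.0 + (m4_initial - 3.0) * math.exp(-t / 4.0)

empirical = kac_walk_fourth_moment(20000, 4.0, random.Random(0))
print(empirical, limit_fourth_moment(4.0))
```

The two printed values should be close, with a discrepancy that combines Monte Carlo fluctuations and the finite-\(N\) error. The fourth moment is not a bounded observable, so this is only a heuristic check and not an instance of the estimates proved in this paper.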

2.4 \(N\)-particle dynamics of random measures and weak solutions of the limiting equation

A random point \({\mathcal {Z}}\in E^N\) with law \(f^N \in {\mathcal {P}}_{\mathrm{sym}}(E^N)\) can be identified with a random measure, denoted by \(\mu ^N_{\mathcal {Z}}\in {\mathcal {P}}_N(E)\), whose law is induced from \(f^N\). We denote this law \(\pi ^N_P f^N \in {\mathcal {P}}({\mathcal {P}}(E))\). Note that since \(E\) is a separable metric space, then so is \({\mathcal {P}}(E)\) by Prokhorov’s Theorem, and we may define the space \({\mathcal {P}}({\mathcal {P}}(E))\) of probability measures on \({\mathcal {P}}(E)\), as well as the set of continuous bounded functions, denoted \(C_b({\mathcal {P}}(E))\), which is the dual of \({\mathcal {P}}({\mathcal {P}}(E))\). In Sect. 3 we will discuss how the choice of topology on \({\mathcal {P}}(E)\) influences \(C_b({\mathcal {P}}(E))\).

The nonlinear dynamics given by the semigroup \(S^{N\!L}_t\) is deterministic and defines a semigroup of operators on \(C_b({\mathcal {P}}(E))\), in a way that is reminiscent of the Eqs. (2.1)–(2.2) but now using the duality structure between \(C_b({\mathcal {P}}(E))\) and \({\mathcal {P}}({\mathcal {P}}(E))\). We define, for any \(f_{\mathrm{in}}\in {\mathcal {P}}(E)\) and \(\Phi \in C_b({\mathcal {P}}(E))\),

$$\begin{aligned} T^{\infty }_t \Phi (f_{\mathrm{in}} ) = \Phi \left( S^{N\!L}_t (f_{\mathrm{in}})\right) . \end{aligned}$$

The semigroup \(T^{\infty }_t \) is called the pullback semigroup of \(S^{N\!L}_t\). As we shall see, it plays a similar role for the deterministic limiting flow \(S^{N\!L}_t\) associated with the Eq. (1.2) (lower half of the diagram), as the one the semigroup \(T^{N}_t\) plays for the flow \({\mathcal {Z}}^N_t\) at the level of the \(N\)-particle system (upper half of the diagram): in both cases these are the dual statistical flows. Note that for making sense of the pullback semigroup \(T^\infty _t\) on \(C_b({\mathcal {P}}(E))\), one needs the map \(f_{\mathrm{in}} \mapsto S^{N\!L}_t(f_{\mathrm{in}})\) to be continuous, and this will be a major issue when all arguments are made precise.
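To make the pullback semigroup concrete, here is a minimal sketch on a hypothetical two-point state space \(E=\{0,1\}\), where a probability measure is described by the single number \(p=f(\{1\})\). The toy nonlinear equation \(\dot{p} = p(1-p)\) is an assumption chosen only because its flow is explicit; it is not one of the models treated in this paper.

```python
import math

def nonlinear_flow(t, p0):
    """S^NL_t for the toy equation dp/dt = p (1 - p), with p = f({1})."""
    e = math.exp(t)
    return p0 * e / (1.0 - p0 + p0 * e)

def pullback(t, Phi):
    """T^infty_t Phi = Phi o S^NL_t, acting on functions of the measure."""
    return lambda p0: Phi(nonlinear_flow(t, p0))

Phi = lambda p: p * p     # an observable of the measure (a degree-two polynomial)
Psi = lambda p: 1.0 - p   # another observable

p0 = 0.2
# Semigroup property: T_{0.7} T_{0.5} Phi = T_{1.2} Phi.
print(pullback(0.7, pullback(0.5, Phi))(p0), pullback(1.2, Phi)(p0))

# Linearity in the observable, although the flow itself is nonlinear in p0.
combo = lambda p: 2.0 * Phi(p) + 3.0 * Psi(p)
print(pullback(0.5, combo)(p0),
      2.0 * pullback(0.5, Phi)(p0) + 3.0 * pullback(0.5, Psi)(p0))
```

The second check illustrates why working at the level of observables is convenient: the pullback semigroup is linear even when the underlying equation is not.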

To connect the \(N\)-particle dynamics with the limiting dynamics, we also need mappings between \(C_b(E^N)\) and \(C_b({\mathcal {P}}(E))\). On the one hand

$$\begin{aligned} \pi ^N: \left\{ \begin{array}{l} \displaystyle C_b({\mathcal {P}}(E)) \rightarrow \ C_b(E^N)\\ \Phi \mapsto \ \phi \end{array}\right. \end{aligned}$$

is dual to the map \({\mathcal {P}}_{\mathrm{sym}}(E^N)\ni f^N \mapsto \pi ^N_P f^N \in {\mathcal {P}}({\mathcal {P}}(E))\), and is defined by

$$\begin{aligned} \forall \, Z \in E^N,\quad \phi (Z) = \left( \pi ^N\Phi \right) (Z) = \Phi \left( \mu ^N_Z\right) , \end{aligned}$$

where the empirical measure \(\mu ^N_Z\) is defined through (2.3). On the other hand

$$\begin{aligned} R^N: \left\{ \begin{array}{l} C_b(E^N) \rightarrow \ C_b({\mathcal {P}}(E))\\ \phi \mapsto \ \Phi \end{array}\right. \end{aligned}$$
(2.4)

is defined in the following way: for each \(\phi \in C_b(E^N), R^N[\phi ]\) is evaluated at the point \(f \in {\mathcal {P}}(E)\) as

$$\begin{aligned} f \mapsto R^N[\phi ](f)= R^N_{\phi }(f) = \int _{E^N} \phi (z_1,\ldots ,z_N) \, f(\mathrm{d}z_1) \, \dots \, f(\mathrm{d}z_N), \end{aligned}$$
(2.5)

which can be interpreted, as we shall see later, as a real valued polynomial taking probability measures as arguments.
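The two maps can be sketched for finitely supported measures, where the integral in (2.5) reduces to a finite sum. The representation of a measure by a list of atoms and a list of weights, and the names `R` and `pi_N`, are assumptions made for this illustration.

```python
from itertools import product

def R(phi, ell, atoms, weights):
    """R^ell[phi](f) for the discrete measure f = sum_i weights[i] * delta_{atoms[i]}."""
    total = 0.0
    for idx in product(range(len(atoms)), repeat=ell):
        w = 1.0
        for i in idx:
            w *= weights[i]               # product measure f(dz_1) ... f(dz_ell)
        total += w * phi(*(atoms[i] for i in idx))
    return total

def pi_N(Phi):
    """(pi^N Phi)(Z) = Phi(mu^N_Z): a function of measures becomes a function of Z."""
    def phi(*Z):
        return Phi(list(Z), [1.0 / len(Z)] * len(Z))
    return phi

atoms, weights = [0.0, 1.0, 3.0], [0.5, 0.25, 0.25]
phi = lambda z1, z2: z1 * (z2 ** 2)       # tensor product phi_1 (x) phi_2

# For a tensor product, R^2[phi](f) factorizes as <f, phi_1> * <f, phi_2>.
mean_1 = sum(w * a for a, w in zip(atoms, weights))
mean_2 = sum(w * a * a for a, w in zip(atoms, weights))
print(R(phi, 2, atoms, weights), mean_1 * mean_2)

# Composition pi^N R^2[phi], evaluated at the configuration Z = (1, 2).
Phi = lambda a, w: R(phi, 2, a, w)
print(pi_N(Phi)(1.0, 2.0))
```

The factorization in the first check shows why \(R^\ell [\varphi ]\) behaves like a monomial of degree \(\ell \) in the measure \(f\).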

2.5 The abstract theorem

We are now ready to give a more precise version of the main abstract theorem. The exact statement involves rather technical definitions, and its validity depends on five assumptions, (A1) to (A5), which are stated precisely in Sect. 4. The first assumption is the requirement of symmetry under permutations that has already been stated. The remaining conditions are (1) estimates on the regularity and stability of the nonlinear semigroup, and (2) consistency estimates that quantify that the \(N\)-particle systems and the limiting semigroup are compatible.

Theorem 2.1

(Fluctuation estimate) Consider a process \(({\mathcal {Z}}^N_t)_{t\ge 0}\) in \(E^N/ \mathfrak {S}_N\), and the related semigroups \(S^N_t\) and \(T^N_t\) as defined above. Let \(f_{in}\in {\mathcal {P}}(E)\), and consider a hierarchy of \(N\)-particle solutions \(f^N_t = S^N_t (f_{in}^{\otimes N})\), and a solution \(f_t=S^{N\!L}_t (f_{in})\) of the limit equation. We assume that (A1) to (A5) hold.

Then there is an absolute constant \(C>0\) and, for any \(T \in (0,\infty )\), there are constants \(C_{T}, \tilde{C}_{T} >0\) (depending on \(T)\) such that for any \(N, \ell \in {\mathbb {N}}^*\), with \(N \ge 2 \ell \), and for any

$$\begin{aligned} \varphi = \varphi _1 \otimes \dots \otimes \, \varphi _\ell \in {\mathcal {F}}^{\otimes \ell }, \quad \varphi _j \in {\mathcal {F}}, \ \Vert \varphi _j \Vert _{\mathcal {F}}\le 1, \end{aligned}$$

we have

$$\begin{aligned}&\sup _{[0,T)}\left| \left\langle \left( S^N_t(f_{\mathrm{in}}^{\otimes N}) - \left( S^{N\! L}_t(f_{\mathrm{in}})\right) ^{\otimes N}\right) , \varphi \otimes \mathbf{1}^{N-\ell }\right\rangle \right| \nonumber \\&\quad \le C \, \frac{\ell ^2}{N} + C_{T} \, \ell ^2 \, \varepsilon (N) + \tilde{C}_{T} \, \ell \, \Omega _N^{{\mathcal {G}}_3} (f_{\mathrm{in}}), \end{aligned}$$
(2.6)

with

$$\begin{aligned} \Omega _N^{{\mathcal {G}}_3} (f_{\mathrm{in}}) := \int _{E^N} \hbox {dist}_{{\mathcal {G}}_3} \left( \mu ^N_Z, f_{\mathrm{in}}\right) \, f^{\otimes N}_{\mathrm{in}} (\mathrm{d}Z). \end{aligned}$$
(2.7)

The space \({\mathcal {F}}\subset C_b(E)\) and the distance \(\textit{dist}_{{\mathcal {G}}_3}\) are defined later in Sect. 4.

We have used the notation \(\varphi = \varphi _1 \otimes \dots \otimes \, \varphi _\ell \) to denote

$$\begin{aligned} \varphi (z_1,\ldots ,z_\ell ) = \varphi _1(z_1) \varphi _2(z_2) \dots \varphi _\ell (z_\ell ), \end{aligned}$$

and \(\varphi \otimes \mathbf{1}^{N-\ell }\) to denote

$$\begin{aligned} (\varphi \otimes \mathbf{1}^{N-\ell })(z_1,\ldots ,z_N) = \varphi _1(z_1) \varphi _2(z_2) \dots \varphi _\ell (z_\ell ). \end{aligned}$$

We first note that the left hand side of (2.6) is the same as the left hand side of (1.3), the only difference being that the initial data to the \(N\)-particle system are assumed to factorize: \(f^N_{\mathrm{in}}= f_{\mathrm{in}}^{\otimes N}\), where \(f_{\mathrm{in}}\) is also the initial data to the limiting nonlinear equation. This is a stronger hypothesis than merely requiring the initial data to be chaotic (where the initial data to the \(N\)-particle system may factorize only in the limit of infinitely many particles). This restriction simplifies the proof, but it can easily be relaxed at the cost of some additional error terms.

The restriction to test functions of the form \(\varphi _1 \otimes \dots \otimes \, \varphi _\ell \otimes 1^{\otimes (N-\ell )}\), i.e. functions depending only on the first \(\ell \) variables, corresponds to analysing \(\ell \)-particle marginals. Hence the theorem implies the propagation of chaos as soon as \({\mathcal {F}}\) is dense in \(C_b(E)\) in the topology of uniform convergence on compact sets. This condition is satisfied in all examples given below.
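To give a feeling for the size of the term \(\Omega _N^{{\mathcal {G}}_3}(f_{\mathrm{in}})\) in (2.7), the sketch below estimates an analogous quantity by Monte Carlo. Since \(\textit{dist}_{{\mathcal {G}}_3}\) is only defined in Sect. 4, we replace it here, purely as an assumption, by the Wasserstein distance \(W_1\) on \(E=[0,1]\), and we take \(f_{\mathrm{in}}\) to be the uniform law. In one dimension \(W_1\) equals the integral of the absolute difference of the distribution functions, which is computed exactly. The function names are hypothetical.

```python
import random

def w1_to_uniform(sample):
    """Exact W1 distance between the empirical measure of `sample` and U[0, 1],
    computed as the integral of |F_N(x) - x| over [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    knots = [0.0] + xs + [1.0]
    total = 0.0
    for k in range(n + 1):
        a, b, c = knots[k], knots[k + 1], k / n   # F_N equals c on [a, b)
        if c <= a:
            total += ((b - c) ** 2 - (a - c) ** 2) / 2.0
        elif c >= b:
            total += ((c - a) ** 2 - (c - b) ** 2) / 2.0
        else:
            total += ((c - a) ** 2 + (b - c) ** 2) / 2.0
    return total

def omega(n, n_trials, rng):
    """Monte Carlo estimate of the average of W1(mu^N_Z, f_in) over Z ~ f_in^{(x) N}."""
    return sum(w1_to_uniform([rng.random() for _ in range(n)])
               for _ in range(n_trials)) / n_trials

rng = random.Random(0)
for n in (25, 100, 400):
    # Multiplying N by 4 should roughly halve the estimate (rate N^{-1/2}).
    print(n, omega(n, 200, rng))
```

With this substitute distance the decay is of order \(N^{-1/2}\); the rate obtained for the distance actually used in the theorem depends on the choices made in Sect. 4 and may differ.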

2.6 Main steps of the proof

The proof begins by splitting the quantity we want to estimate,

$$\begin{aligned} \left\langle \left( S^N_t(f_{\mathrm{in}}^{\otimes N}) - \left( S^{N\! L}_t(f_{\mathrm{in}})\right) ^{\otimes N}\right) , \varphi \otimes \mathbf{1}^{N-\ell }\right\rangle , \end{aligned}$$

in three parts, each one corresponding to one of the error terms in the right hand side of the Eq. (2.6):

$$\begin{aligned}&\left| \left\langle S^N_t(f_{\mathrm{in}}^{\otimes N}) - \left( S^{N\!L}_t (f_{\mathrm{in}})\right) ^{\otimes N}, \varphi \otimes \mathbf{1}^{N-\ell }\right\rangle \right| \nonumber \\&\quad \le \left| \left\langle S^N_t(f_{\mathrm{in}}^{\otimes N}), \varphi \otimes \mathbf{1}^{N-\ell }\right\rangle - \left\langle S^N_t(f_{\mathrm{in}}^{\otimes N}), R^\ell [\varphi ] \circ \mu ^N_Z \right\rangle \right| \nonumber \\&\qquad + \left| \left\langle f_{\mathrm{in}}^{\otimes N}, T^N_t (R^\ell [\varphi ] \circ \mu ^N_Z)\right\rangle - \left\langle f_{\mathrm{in}}^{\otimes N}, (T_t^\infty R^\ell [\varphi ]) \circ \mu ^N_Z\right\rangle \right| \nonumber \\&\qquad + \left| \left\langle f_{\mathrm{in}}^{\otimes N}, (T_t^\infty R^\ell [\varphi ]) \circ \mu ^N_Z\right\rangle - \left\langle (S^{N\!L}_t (f_{\mathrm{in}}))^{\otimes \ell } , \varphi \right\rangle \right| =: {\mathcal {T}}_1 + {\mathcal {T}}_2 + {\mathcal {T}}_3. \end{aligned}$$
(2.8)

Then each of these terms is estimated separately:

  1. (1)

    The first term, \({\mathcal {T}}_1\), is bounded by \(C\ell ^2/N\), as proven in Lemma 4.2. From the definitions of \(R^{\ell }\) and \(\mu ^N_Z\), it follows that \(R^{\ell }[\varphi ](\mu ^N_Z)\) is a sum of terms of the form \(\varphi (z_{j_1},\ldots ,z_{j_{\ell }})\), where each index \(j_1,\ldots ,j_{\ell }\) is taken from the set \(\{1,\ldots ,N\}\). The error is then due to the fraction of terms for which two or more of the indices \(j_1,\ldots ,j_{\ell }\) are the same. Hence the estimate of \({\mathcal {T}}_1\) is of purely combinatorial nature, and only depends on the symmetry under permutation, i.e. the assumption (A1).

  2. (2)

    The estimate of the second term, \({\mathcal {T}}_2\), relies on the convergence of the \(N\)-particle semigroups \(T^N_t\) to the limiting semigroup \(T^{\infty }_t\). This is where the \(N\)-particle dynamics and the limiting dynamics are compared. The estimate can be found in Lemma 4.3, which depends on the consistency assumption (A3) on the generators and the stability assumption (A4) on the limiting dynamics. While the generator \(G^N\) of \(T^N_t\) can be defined in a straightforward manner, defining and estimating the generator \(G^{\infty }\) of \(T^{\infty }_t\) requires a more detailed analysis. It indeed involves derivatives of functions acting on \({\mathcal {P}}(E)\), and the required differential structure depends on the topology and metric structure chosen on \({\mathcal {P}}(E)\). The generator \(G^{\infty }\) is characterized in Lemma 4.1. The assumption (A4) is proved by establishing refined stability estimates on the limiting semigroup, showing its differentiability with respect to the initial data in a metric compatible with the previous steps.

  3. (3)

    For the last term \({\mathcal {T}}_3\), we note that

    $$\begin{aligned} T^{\infty }_t R^{\ell }[\varphi ](\mu ^N_Z)&= R^{\ell }[\varphi ] \left( S^{N\!L}_t(\mu ^N_Z)\right) \,=\, \left\langle (S^{N\!L}_t(\mu ^N_Z))^{\otimes \ell },\varphi \right\rangle , \end{aligned}$$

    which means that the nonlinear limiting equation is solved taking a sum of Dirac masses as initial data, and an \(\ell \)-fold product of the solution is integrated against \(\varphi \). The resulting function of \(Z=(z_1,\ldots ,z_N)\) is then integrated against \(f_{\mathrm{in}}^{\otimes N}\), which amounts to taking an average over all initial data such that the positions of the \(N\) particles are independently taken at random from the law \(f_{\mathrm{in}}\). When \(N\) is large, the random empirical measures \(\mu ^N_{\mathcal {Z}}\) are close to \(f_{\mathrm{in}}\), i.e. \(\pi ^N_P(f_{\mathrm{in}}^{\otimes N}) \rightharpoonup \delta _{f_{\mathrm{in}}}\) in \({\mathcal {P}}({\mathcal {P}}(E))\). This implies that the term \({\mathcal {T}}_3\) vanishes in the limit when \(N\) goes to infinity. However, the rate of this convergence depends sensitively on the regularity of the test function \(\varphi \) and on the continuity properties of \(S^{N\!L}_t\). In all cases considered here, the error \({\mathcal {T}}_3\) dominates the other error terms, and effectively determines the rate of convergence in the propagation of chaos. The precise result, together with the required assumptions, is given in Lemma 4.5.
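The combinatorial origin of the bound on \({\mathcal {T}}_1\) in step (1) can be made explicit with a short computation. The sketch below, whose function names are hypothetical, computes the fraction of index tuples \((j_1,\ldots ,j_\ell )\in \{1,\ldots ,N\}^\ell \) having at least one repeated index, checks the closed formula against a brute-force enumeration, and compares it with the elementary union bound \(\ell (\ell -1)/(2N)\le \ell ^2/N\). The constant in Lemma 4.2 is not reproduced here and may differ.

```python
from itertools import product

def repeated_fraction(n, ell):
    """Fraction of tuples (j_1, ..., j_ell) in {1..n}^ell with a repeated index,
    via 1 - n (n - 1) ... (n - ell + 1) / n^ell."""
    distinct = 1.0
    for k in range(ell):
        distinct *= (n - k) / n
    return 1.0 - distinct

def repeated_fraction_bruteforce(n, ell):
    """Same fraction, by enumerating all n^ell tuples (small n only)."""
    tuples = list(product(range(n), repeat=ell))
    repeated = sum(1 for t in tuples if len(set(t)) < ell)
    return repeated / len(tuples)

print(repeated_fraction(6, 3), repeated_fraction_bruteforce(6, 3))

ell = 3
for n in (10, 100, 1000):
    # The fraction decays like 1/N and stays below ell (ell - 1) / (2 N).
    print(n, repeated_fraction(n, ell), ell * (ell - 1) / (2.0 * n))
```

For test functions bounded by one, the terms with repeated indices contribute at most this fraction, which explains the \({\mathcal {O}}(\ell ^2/N)\) size of the first error term.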

3 Metrics on \({\mathcal {P}}(E)\) and differentiability of functions on \({\mathcal {P}}(E)\)

This section contains the technical details concerning the space of probability measures on \({\mathcal {P}}(E)\) and its dual, that is the space of continuous functions acting on \({\mathcal {P}}(E)\).

3.1 The metric issue

\({\mathcal {P}}(E)\) is our fundamental “state space”, where we compare the marginals of the \(N\)-particle density \(f^N_t\) and the chaotic infinite-particle dynamics \(f_t \) through their observables, i.e. the evolution of continuous bounded functions on \(E^N\) and \({\mathcal {P}}(E)\) respectively under the dual dynamics \(T^N_t\) and \(T^\infty _t\).

There are two canonical choices of topology on the space of probabilities, which determine two different sets \(C_b({\mathcal {P}}(E))\).

On the one hand, for a given locally compact and separable metric space \({\mathcal {E}}\), the space \(M^1({\mathcal {E}})\) of finite Borel measures on \({\mathcal {E}}\) is a Banach space when endowed with the total variation norm:

$$\begin{aligned} \forall \, f \in M^1({\mathcal {E}}), \quad \Vert f \Vert _{TV}&:= f^+({\mathcal {E}}) + f^-({\mathcal {E}}) \\&= \sup _{\phi \in C_b({\mathcal {E}}), \, \Vert \phi \Vert _\infty \le 1} \langle f,\phi \rangle = \sup _{\phi \in C_0({\mathcal {E}}), \, \Vert \phi \Vert _\infty \le 1} \langle f,\phi \rangle , \end{aligned}$$

where \(f = f^+ - f^-\) stands for the Hahn–Jordan decomposition and the equality between the last two terms comes from the fact that \({\mathcal {E}}\) is locally compact and separable.

We recall that \(f_k \xrightarrow []{TV} f\) (strong topology) when \((f_k)\) and \(f\) belong to \(M^1({\mathcal {E}})\) and \(\Vert f_k - f \Vert _{TV} \rightarrow 0\) as \(k\rightarrow \infty \), and that \(f_k \rightharpoonup f\) (weak topology) if

$$\begin{aligned} \forall \, \varphi \in C_b({\mathcal {E}}) \qquad \langle f, \varphi \rangle = \lim _{k\rightarrow \infty } \langle f_k , \varphi \rangle \,. \end{aligned}$$

The associated topology is denoted by \(\sigma (M^1({\mathcal {E}}),C_b({\mathcal {E}}))\). However, the weak convergence can be associated with different, non-equivalent metrics, and the choice of metric plays an important role as soon as one wants to perform differential calculus on \({\mathcal {P}}({\mathcal {E}})\).

In the sequel, we will denote by \(C_b({\mathcal {P}}(E),w)\) the space of continuous and bounded functions on \({\mathcal {P}}(E)\) endowed with the weak topology, and \(C_b({\mathcal {P}}(E),TV)\) the space of continuous and bounded functions on \({\mathcal {P}}(E)\) endowed with the total variation norm. It is clear that \(C_b({\mathcal {P}}(E),w) \subset C_b({\mathcal {P}}(E),TV)\) since \(f_k \xrightarrow []{TV} f\) implies \(f_k \rightharpoonup f\).

However, the supremum norm \(\Vert \Phi \Vert _{L^\infty ({\mathcal {P}}(E))}\) does not depend on the choice of topology on \({\mathcal {P}}(E)\), and endows the two previous sets with a Banach space topology. The transformations \(\pi ^N\) and \(R^N\) satisfy:

$$\begin{aligned} \left\| \pi ^N \Phi \right\| _{L^\infty (E^N)} \le \Vert \Phi \Vert _{L^\infty ({\mathcal {P}}(E))} \ \text{ and } \ \Vert R^N[\phi ]\Vert _{L^\infty ({\mathcal {P}}(E))} \le \Vert \phi \Vert _{L^\infty (E^N)}.\quad \end{aligned}$$
(3.1)

The transformation \(\pi ^N\) is well defined from \(C_b({\mathcal {P}}(E),w)\) to \(C_b(E^N)\), but it does not map \(C_b({\mathcal {P}}(E),TV)\) into \(C_b(E^N)\).

Conversely, the transformation \(R^N\) is well defined from \(C_b(E^N)\) to \(C_b({\mathcal {P}}(E),w)\), and therefore also from \(C_b(E^N)\) to \(C_b({\mathcal {P}}(E),TV)\): for any \(\phi \in C_b(E^N)\) and for any sequence \(f_k\) such that the weak convergence \(f_k \rightharpoonup f\) holds, we have \(f_k^{\otimes N} \rightharpoonup f^{\otimes N}\), and then \(R^N[\phi ](f_k) \rightarrow R^N[\phi ](f)\).

The different metric structures associated with the weak topology are not seen at the level of \(C_b({\mathcal {P}}(E),w)\). However any norm (or semi-norm) “more regular” than the uniform norm on \(C_b({\mathcal {P}}(E))\) (in the sense of controlling some modulus of continuity or some differential) strongly depends on this choice, as is illustrated by the abstract Lipschitz spaces defined below.

Definition 3.1

Let \(m_{\mathcal {G}}: E \rightarrow {\mathbb {R}}_+\) be given. Then we define the following weighted subspace of probability measures

$$\begin{aligned} {\mathcal {P}}_{{\mathcal {G}}}(E) := \{ f \in {\mathcal {P}}(E); \,\, \langle f, m_{\mathcal {G}}\rangle < \infty \}, \end{aligned}$$

together with a corresponding space of “increments”,

$$\begin{aligned} \mathcal {I} {\mathcal {P}}_{\mathcal {G}}(E) := \left\{ f_1 - f_2 \ ; \ f_1, \, f_2 \in {\mathcal {P}}_{{\mathcal {G}}}(E)\right\} . \end{aligned}$$

If moreover there is a vector space \({\mathcal {G}}\) (with norm denoted by \(\Vert \cdot \Vert _{\mathcal {G}}\)) which contains \(\mathcal {I} {\mathcal {P}}_{\mathcal {G}}(E)\), we then define the following distance on \({\mathcal {P}}_{{\mathcal {G}}}(E)\)

$$\begin{aligned} \forall \, f_1, \, f_2 \in {\mathcal {P}}_{{\mathcal {G}}}(E), \quad \textit{dist}_{\mathcal {G}}(f_1,f_2) := \Vert f_1 - f_2\Vert _{\mathcal {G}}. \end{aligned}$$

Remark 3.2

Note carefully that the space of increments \(\mathcal {I} {\mathcal {P}}_{\mathcal {G}}(E)\) is not a vector space in general.

We can now define a precise notion of equivalence of metrics:

Definition 3.3

We say that \({\mathcal {P}}_{\mathcal {G}}(E)\) has a bounded diameter if there exists \(K_{\mathcal {G}}>0\) such that

$$\begin{aligned} \forall \, f \in {\mathcal {P}}_{\mathcal {G}}(E), \quad \textit{dist}_{\mathcal {G}}(f,g) \le K_{\mathcal {G}}\end{aligned}$$

for some given fixed \(g \in {\mathcal {P}}_{\mathcal {G}}(E)\).

Two metrics \(d_1\) and \(d_2\) on \({\mathcal {P}}_{\mathcal {G}}(E)\) are said to be Hölder uniformly equivalent on bounded sets if there exists \(\kappa \in (0,\infty )\) and, for any \(a \in (0,\infty )\), there exists \(C_a \in (0,\infty )\) such that

$$\begin{aligned} \forall \, f_1, \, f_2 \in {\mathcal {B}}P_{{\mathcal {G}},a}, \quad \frac{1}{C_a} \, [d_2(f_1,f_2)]^\kappa \le d_1(f_1, f_2) \le C_a \, [d_2(f_1,f_2)]^\kappa \end{aligned}$$

where

$$\begin{aligned} {\mathcal {B}}P_{{\mathcal {G}},a} := \left\{ f \in {\mathcal {P}}_{\mathcal {G}}(E)\!\! \ ; \ \langle f, m_{\mathcal {G}}\rangle \le a\right\} . \end{aligned}$$

Finally, we say that two normed spaces \({\mathcal {G}}_0\) and \({\mathcal {G}}_1\) are Hölder uniformly equivalent (on bounded sets) if this is the case for the corresponding metrics.

We also define the vector space \(UC({\mathcal {P}}_{\mathcal {G}}(E);{\mathbb {R}})\) of uniformly continuous and bounded functions \(\Psi : {\mathcal {P}}_{\mathcal {G}}(E) \rightarrow {\mathbb {R}}\), where continuity is understood with respect to the metric topology on \({\mathcal {P}}_{\mathcal {G}}(E)\) defined by \(\text{ dist }_{{\mathcal {G}}}\) above. Observe that this is a Banach space when endowed with the supremum norm.

Example 3.4

With the choice \(m_{\mathcal {G}}:= 1, \Vert \cdot \Vert _{\mathcal {G}}:= \Vert \cdot \Vert _{TV}\) we obtain \({\mathcal {P}}_{\mathcal {G}}(E) = ({\mathcal {P}}(E),TV)\), that is, \({\mathcal {P}}(E)\) endowed with the total variation norm.

3.2 Examples of distances on measures when \(E = {\mathbb {R}}^d\)

There are many ways to define distances on \({\mathcal {P}}(E)\) which are topologically equivalent to the weak topology of measures, see for instance [6, 34].

We list below some well-known distances on \({\mathcal {P}}({\mathbb {R}}^d)\) or on its subsets

$$\begin{aligned} {\mathcal {P}}_q({\mathbb {R}}^d) := \{f \in {\mathcal {P}}({\mathbb {R}}^d); \,\, M_q(f) < \infty \}, \quad q \ge 0, \end{aligned}$$

where the moment \(M_q(f)\) of order \(q\) of a probability measure is defined as

$$\begin{aligned} M_q(f) := \left\langle f \,,\, \langle v \rangle ^q\right\rangle , \quad \langle v \rangle ^2 = 1 + |v|^2. \end{aligned}$$

These distances are all Hölder uniformly equivalent to the weak topology \(\sigma ({\mathcal {P}}(E), C_b(E))\) on the bounded subsets

$$\begin{aligned} {\mathcal {B}}P_{q,a}(E) := \left\{ f \in {\mathcal {P}}_q({\mathbb {R}}^d), \,\, M_q(f) \le a\right\} \end{aligned}$$

for any \(a \in (0,\infty )\) and for \(q\) large enough. For more information we refer to [12].

Example 3.5

(Dual-Hölder, or Zolotarev’s, distances) Denote by \(\textit{dist}_E\) a distance on \(E\) and fix \(z_0 \in E\) (e.g. \(z_0=0\) when \(E = {\mathbb {R}}^d\) in the sequel). Denote by \(Lip_0(E)\) the set of Lipschitz functions on \(E\) vanishing at the point \(z_0\), endowed with the norm

$$\begin{aligned}{}[\varphi ]_{Lip} = [\varphi ]_1 := \sup _{z,\tilde{z} \in E, \ z \not = \tilde{z}}{|\varphi (z) - \varphi (\tilde{z})| \over \textit{dist}_E(z,\tilde{z})}. \end{aligned}$$

We then define the dual norm: take \(m_{\mathcal {G}}:= 1\) and endow \({\mathcal {P}}_{\mathcal {G}}(E)\) with

$$\begin{aligned} \forall \, f,g \in {\mathcal {P}}_{\mathcal {G}}(E), \quad [g-f]^*_1 := \sup _{\varphi \in Lip_0(E)} {\langle g - f, \varphi \rangle \over [\varphi ]_1}. \end{aligned}$$
(3.2)

Example 3.6

(Monge–Kantorovich–Wasserstein distances) For \(q \in [1,\infty )\), define

$$\begin{aligned} {\mathcal {P}}_{\mathcal {G}}(E) = {\mathcal {P}}_q(E):= \left\{ f \in {\mathcal {P}}(E); \,\, \langle f, m_{\mathcal {G}}\rangle :=\left\langle f, \textit{dist}_E(\cdot ,z_0)^q\right\rangle < \infty \right\} \end{aligned}$$

and the Monge–Kantorovich–Wasserstein (MKW) distance \(W_q\) by

$$\begin{aligned} \forall \, f,g \in {\mathcal {P}}_q(E), \quad W^q_q(f,g) := \inf _{p \in \Pi (f,g)} \int _{E\times E} \textit{dist}_E(z,\tilde{z})^q \, p(\mathrm{d}z,\mathrm{d}\tilde{z}), \end{aligned}$$
(3.3)

where \(\Pi (f,g)\) denotes the set of probability measures \(p \in {\mathcal {P}}(E \times E)\) with marginals \(f\) and \(g\) (\(p(A\times E) = f (A), p(E\times A) = g (A)\) for any Borel set \(A \subset E\)). Note that for \(Z,\tilde{Z} \in E^N\) and any \(q \in [1,\infty )\), one has

$$\begin{aligned} W_q\left( \mu ^N_Z,\mu ^N_{\tilde{Z}}\right) = d_{\ell ^q(E^N/{\mathfrak {S}}_N)} (Z, \tilde{Z}) := \min _{\sigma \in {\mathfrak {S}}_N} \left( {1 \over N} \sum _{i=1}^N \textit{dist}_E(z_i,\tilde{z}_{\sigma (i)})^q\right) ^{1/q},\nonumber \\ \end{aligned}$$
(3.4)

and that

$$\begin{aligned} \forall \, f, \, g \in {\mathcal {P}}_1(E), \quad W_1 (f,g) = [f-g]^*_1 = \sup _{\phi \in Lip_0(E), \, [\phi ]_1 \le 1}\, \left\langle f-g, \phi \right\rangle \end{aligned}$$
(3.5)

as well as

$$\begin{aligned} \forall \, q \in [1,\infty ), \,\,\, \forall \, f, \, g \in {\mathcal {P}}_q({\mathbb {R}}^d), \quad W_1(f,g) \le W_q(f,g). \end{aligned}$$
(3.6)

We refer to [40] and the references therein for more details on the Monge–Kantorovich–Wasserstein distances and for a proof of these claims.
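The identity (3.4) and the inequality (3.6) can be checked by hand on small point clouds. The following Python sketch is only an illustration under simplifying assumptions that are not part of the text: it takes \(E = {\mathbb {R}}\) with \(\textit{dist}_E(z,\tilde{z}) = |z - \tilde{z}|\), evaluates the right-hand side of (3.4) by brute force over all permutations, and compares it with the sorted (monotone) matching, which is known to be optimal on the real line for \(q \ge 1\). The function names and the sample points are hypothetical.

```python
from itertools import permutations


def wasserstein_empirical(Z, Z_tilde, q=1):
    """Right-hand side of (3.4) on E = R: minimum over permutations (brute force)."""
    N = len(Z)
    assert len(Z_tilde) == N
    best = min(
        sum(abs(Z[i] - Z_tilde[sigma[i]]) ** q for i in range(N)) / N
        for sigma in permutations(range(N))
    )
    return best ** (1.0 / q)


def wasserstein_sorted(Z, Z_tilde, q=1):
    """Same quantity through the monotone matching (one-dimensional case only)."""
    N = len(Z)
    cost = sum(abs(a - b) ** q for a, b in zip(sorted(Z), sorted(Z_tilde))) / N
    return cost ** (1.0 / q)


# Hypothetical sample configurations with N = 4 particles.
Z = [0.0, 1.0, 3.0, 7.0]
Z_tilde = [6.5, 0.5, 2.0, 2.5]

w1 = wasserstein_empirical(Z, Z_tilde, q=1)
w2 = wasserstein_empirical(Z, Z_tilde, q=2)
print("W_1 =", w1, " W_2 =", w2)
```

The brute-force search costs \(N!\) evaluations and is only meant for very small \(N\); it is used here because it follows the formula (3.4) literally.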

Example 3.7

(Fourier-based norms) For \(E={\mathbb {R}}^d, m_{{\mathcal {G}}} := 1\), let

$$\begin{aligned} \forall \, f \in \mathcal {I} {\mathcal {P}}_{\mathcal {G}}(E), \quad \Vert f \Vert _{{\mathcal {G}}} = |f|_s := \sup _{\xi \in {\mathbb {R}}^d} \frac{|\hat{f}(\xi )|}{ \langle \xi \rangle ^s}, \quad s > 0. \end{aligned}$$

We denote by \({\mathcal {H}}^{-s}\) (which includes \({\mathcal {I}}{\mathcal {P}}_{\mathcal {G}}(E)\) for \(s\) large enough) the Banach space associated to the norm \(|\cdot |_{s}\). Such norms first appeared in connection with kinetic theory in [16].

Example 3.8

(Negative Sobolev norms) For \(E={\mathbb {R}}^d, m_{{\mathcal {G}}} := 1\), let

$$\begin{aligned} \forall \, f \in \mathcal {I} {\mathcal {P}}_{\mathcal {G}}(E), \quad \Vert f\Vert _{{\mathcal {G}}} = \Vert f\Vert _{H^{-s} ({\mathbb {R}}^d)} := \left\| \frac{\hat{f}(\xi )}{\langle \xi \rangle ^s}\right\| _{L^2({\mathbb {R}}^d)}, \quad s >0. \end{aligned}$$

We denote by \(H^{-s}\) (which includes \({\mathcal {I}}{\mathcal {P}}_{\mathcal {G}}(E)\) for \(s\) large enough) the Hilbert space associated to the norm \(\Vert \cdot \Vert _{H^{-s}}\).

For \(E={\mathbb {R}}^d, m_{{\mathcal {G}}} := 1\), and some integers \(k,\ell \ge 0\), we also define

$$\begin{aligned} \forall \, f \in \mathcal {I} {\mathcal {P}}_{\mathcal {G}}(E), \quad \Vert f \Vert _{{\mathcal {G}}} = \Vert f \Vert _{ H^{-k}_{-\ell } ({\mathbb {R}}^d)} := \sup _{{\Vert \varphi \Vert }_{H^k_\ell }=1} \langle f,\varphi \rangle , \end{aligned}$$

where

$$\begin{aligned} \left\| \varphi \right\| _{H^k_{\ell }}^2 := \sum _{|\alpha |\le k} \int _{{\mathbb {R}}^d} |\partial ^\alpha \varphi (z)|^2 \, \langle z \rangle ^{2\ell } \, \mathrm{d}z. \end{aligned}$$

We denote by \(H^{-k}_{-\ell }({\mathbb {R}}^d)\) (which includes \({\mathcal {I}}{\mathcal {P}}_{\mathcal {G}}(E)\) for \(k\) large enough) the Hilbert space associated to the norm \(\Vert \cdot \Vert _{H^{-k}_{-\ell }({\mathbb {R}}^d)}\).

3.3 Differential calculus for functions of probability measures

We start with a definition of Lipschitz regularity, for which a mere metric structure is sufficient.

Definition 3.9

For metric spaces \(\tilde{{\mathcal {G}}}_1\) and \(\tilde{{\mathcal {G}}}_2\) we denote by \(C^{0,1}(\tilde{{\mathcal {G}}}_1,\tilde{{\mathcal {G}}}_2)\) the space of functions from \(\tilde{{\mathcal {G}}}_1\) to \(\tilde{{\mathcal {G}}}_2\) with Lipschitz regularity, i.e. the set of functions \(\Psi : \tilde{{\mathcal {G}}}_1 \rightarrow \tilde{{\mathcal {G}}}_2\) such that there exists a constant \(C >0\) so that

$$\begin{aligned} \forall \, f,g \in \tilde{{\mathcal {G}}}_1, \quad \textit{dist}_{\tilde{{\mathcal {G}}}_2} \left( \Psi (g), \Psi (f)\right) \le C \, \textit{dist}_{\tilde{{\mathcal {G}}}_1} (g,f). \end{aligned}$$
(3.7)

We then define the semi-norm \([\cdot ]_ {C^{0,1}(\tilde{{\mathcal {G}}}_1, \tilde{{\mathcal {G}}}_2)}\) on \(C^{0,1}(\tilde{{\mathcal {G}}}_1, \tilde{{\mathcal {G}}}_2)\) as the infimum of the constants \(C > 0\) such that (3.7) holds.

The next step consists in defining a higher order differential calculus; this is where the assumption that metrics are inherited from a normed vector space structure plays a role.

Definition 3.10

Let \({\mathcal {G}}_1\) and \({\mathcal {G}}_2\) be normed spaces, and let \(\tilde{{\mathcal {G}}}_1\) and \(\tilde{{\mathcal {G}}}_2\) be two metric spaces such that \(\tilde{{\mathcal {G}}}_i - \tilde{{\mathcal {G}}}_i \subset {\mathcal {G}}_i\). For \(k \in {\mathbb {N}}\), we define \(C^{k,1}(\tilde{{\mathcal {G}}}_1; \tilde{{\mathcal {G}}}_2)\) to be the set of bounded continuous functions \(\Psi : \tilde{{\mathcal {G}}}_1 \rightarrow \tilde{{\mathcal {G}}}_2\) such that there exists \(D^j \Psi : \tilde{{\mathcal {G}}}_1 \rightarrow {\mathcal {B}}^j({\mathcal {G}}_1,{\mathcal {G}}_2)\) continuous and bounded, where \({\mathcal {B}}^j({\mathcal {G}}_1,{\mathcal {G}}_2)\) is the space of bounded \(j\)-multilinear applications from \({\mathcal {G}}_1\) to \({\mathcal {G}}_2\) (endowed with its canonical norm) for \(j = 1, \dots , k\), and some constants \(C_j >0, j=0, \dots , k\), so that for any \(j=0, \dots , k\)

$$\begin{aligned} \forall \, f,g \in \tilde{{\mathcal {G}}}_1, \quad \left\| \Psi (g) - \sum _{i=0}^j \left\langle D^i \Psi (f), (g-f)^{\otimes i} \right\rangle \right\| _{{\mathcal {G}}_2} \le C_j \, \Vert g-f\Vert _{{\mathcal {G}}_1}^{j+1}\quad \quad \end{aligned}$$
(3.8)

(with the convention \(D^0 \Psi = \Psi \)).

We also define the following seminorms on \(C^{k,1}(\tilde{{\mathcal {G}}}_1, \tilde{{\mathcal {G}}}_2)\)

$$\begin{aligned}{}[\Psi ]_{j,0} := \sup _{f \in \tilde{{\mathcal {G}}}_1} \left\| D^j \Psi (f)\right\| _{{\mathcal {B}}^j({\mathcal {G}}_1, {\mathcal {G}}_2)}, \quad j=1, \dots , k, \end{aligned}$$

with

$$\begin{aligned} \left\| L\right\| _{{\mathcal {B}}^j({\mathcal {G}}_1,{\mathcal {G}}_2)} := \sup _{h_i, \, \Vert h_i \Vert _{{\mathcal {G}}_1} \le 1, \, 1 \le i \le j} \left\| L\left( h_1,\dots , h_j\right) \right\| _{{\mathcal {G}}_2}, \end{aligned}$$

and

$$\begin{aligned}{}[\Psi ]_{j,1} := \sup _{f,g \in \tilde{{\mathcal {G}}}_1} \frac{\Big \Vert \Psi (g) - \sum _{i=0}^j \langle D^i \Psi (f), (g-f)^{\otimes i} \rangle \Big \Vert _{{\mathcal {G}}_2}}{\Vert g-f\Vert _{{\mathcal {G}}_1}^{j+1}}. \end{aligned}$$

Finally we combine these semi-norms into the norm

$$\begin{aligned} \Vert \Psi \Vert _{C^{k,1}(\tilde{{\mathcal {G}}}_1,\tilde{{\mathcal {G}}}_2)} = \sum _{j=1}^k \, [\Psi ]_{j,0} + [\Psi ]_{k,1}. \end{aligned}$$

Remark 3.11

Observe that for any \(j \ge 1\) the LHS of (3.8) makes sense: indeed

$$\begin{aligned}&\Psi (g) - \sum _{i=0}^j \left\langle D^i \Psi (f), (g-f)^{\otimes i}\right\rangle \\&\quad = \Big [\Psi (g) - \Psi (f) \Big ] - \sum _{i=1}^j \left\langle D^i \Psi (f), (g-f)^{\otimes i}\right\rangle \in {\mathcal {G}}_2 \end{aligned}$$

since \(\Psi (g) - \Psi (f) \in \tilde{{\mathcal {G}}}_2 - \tilde{{\mathcal {G}}}_2 \subset {\mathcal {G}}_2\) and \(D^i \Psi (f) \in {\mathcal {B}}^i({\mathcal {G}}_1,{\mathcal {G}}_2)\). Note also that our definition is very close to the usual Fréchet definition of differentiability in Banach spaces for the function \(h \mapsto \Psi (f+h)\) with \(h = g-f \in {\mathcal {G}}_1\), except that the domain and range are restricted to subsets that have no vector space structure and are not open within \({\mathcal {G}}_1\) and \({\mathcal {G}}_2\). We also only consider Lipschitz differentiability.

The following lemma confirms that this differential calculus is well-behaved for composition, which seems to be a minimal requirement for further applications.

Lemma 3.12

Consider \({\mathcal {U}}\in C^{k,1} (\tilde{{\mathcal {G}}}_1, \tilde{{\mathcal {G}}}_2)\) and \({\mathcal {V}}\in C^{k,1} (\tilde{{\mathcal {G}}}_2, \tilde{{\mathcal {G}}}_3)\). Then the composition \(\Psi := {\mathcal {V}}\circ {\mathcal {U}}\) belongs to \(C^{k,1} (\tilde{{\mathcal {G}}}_1, \tilde{{\mathcal {G}}}_3)\). Moreover the following chain rule holds at first order \(k=1\)

$$\begin{aligned} \forall \, f \in \tilde{{\mathcal {G}}}_1, \quad D \Psi [f] = D {\mathcal {V}}[{\mathcal {U}}(f)] \circ D {\mathcal {U}}[f], \end{aligned}$$
(3.9)

with the estimates

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle {[\Psi ]}_{0,1} \le {[{\mathcal {V}}]}_{0,1} \, {[{\mathcal {U}}]}_{0,1},\\ {[\Psi ]}_{1,0} \le {[{\mathcal {V}}]}_{1,0} \, {[{\mathcal {U}}]}_{1,0},\\ {[\Psi ]}_{1,1} \le {[{\mathcal {V}}]}_{1,0} \, {[{\mathcal {U}}]}_{1,1} + {[{\mathcal {V}}]}_{1,1} \, {[{\mathcal {U}}]}_{0,1}^2. \end{array}\right. \end{aligned}$$

At second order \(k=2\) one also has the chain rule

$$\begin{aligned} \forall \, f \in \tilde{{\mathcal {G}}}_1, \quad D^2 \Psi [f] = D^2 {\mathcal {V}}[{\mathcal {U}}(f)] \circ (D {\mathcal {U}}[f] \otimes D {\mathcal {U}}[f])\! +\! D {\mathcal {V}}[{\mathcal {U}}(f)] \circ D^2 {\mathcal {U}}[f].\nonumber \\ \end{aligned}$$
(3.10)

Proof of Lemma 3.12

The proof is straightforward: one writes and composes the expansions of \({\mathcal {U}}\) and \({\mathcal {V}}\) provided by Definition 3.10. \(\square \)
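For the reader's convenience, here is a sketch of how the first-order remainder estimate of Lemma 3.12 can be obtained; it only uses Definition 3.10 with \(k=1\), and the shorthand \(u := {\mathcal {U}}(f)\), \(\tilde{u} := {\mathcal {U}}(g)\) is introduced here for illustration. Splitting the remainder of \(\Psi = {\mathcal {V}}\circ {\mathcal {U}}\) into the remainder of \({\mathcal {V}}\) and the image under \(D{\mathcal {V}}[u]\) of the remainder of \({\mathcal {U}}\), one gets

$$\begin{aligned}&\Psi (g) - \Psi (f) - D{\mathcal {V}}[u] \circ D{\mathcal {U}}[f] (g-f) \\&\quad = \Big [ {\mathcal {V}}(\tilde{u}) - {\mathcal {V}}(u) - D{\mathcal {V}}[u](\tilde{u} - u) \Big ] + D{\mathcal {V}}[u] \Big ( \tilde{u} - u - D{\mathcal {U}}[f](g-f) \Big ), \end{aligned}$$

so that

$$\begin{aligned} \left\| \Psi (g) - \Psi (f) - D{\mathcal {V}}[u] \circ D{\mathcal {U}}[f] (g-f) \right\| _{{\mathcal {G}}_3}&\le {[{\mathcal {V}}]}_{1,1} \, \Vert \tilde{u} - u \Vert _{{\mathcal {G}}_2}^2 + {[{\mathcal {V}}]}_{1,0} \, {[{\mathcal {U}}]}_{1,1} \, \Vert g-f\Vert _{{\mathcal {G}}_1}^2 \\&\le \left( {[{\mathcal {V}}]}_{1,1} \, {[{\mathcal {U}}]}_{0,1}^2 + {[{\mathcal {V}}]}_{1,0} \, {[{\mathcal {U}}]}_{1,1} \right) \Vert g-f\Vert _{{\mathcal {G}}_1}^2, \end{aligned}$$

where the last step uses \(\Vert \tilde{u} - u \Vert _{{\mathcal {G}}_2} \le {[{\mathcal {U}}]}_{0,1} \, \Vert g-f\Vert _{{\mathcal {G}}_1}\). This is consistent with (3.9) and with the third estimate stated in the lemma.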

3.4 The subalgebra of polynomials in \(C_b({\mathcal {P}}(E))\)

In Sect. 2 we defined a map \(R^{\ell }: C(E^{\ell })\rightarrow C_b({\mathcal {P}}(E))\), which may be used to define a subalgebra of polynomials in \( C_b({\mathcal {P}}(E))\). We first define the monomials:

Definition 3.13

A monomial in \(C_b({\mathcal {P}}(E))\) of degree \(\ell \) is a function \(R^\ell [\varphi ]\) with \(\varphi = \varphi _1 \otimes \cdots \otimes \varphi _\ell , \varphi _i \in C_b(E)\) and \(\ell \in {\mathbb {N}}\). Explicitly

$$\begin{aligned} R^{\ell }[\varphi ](f)&= \int _{E^\ell } \varphi (z_1,\dots ,z_\ell ) \, \mathrm{d}f^{\otimes \ell }(z_1,\dots ,z_\ell )\\&= \prod _{j=1}^{\ell } \int _E \varphi _j(z) \, \mathrm{d}f(z)\,, \end{aligned}$$

which is well-defined for all \(f\in {\mathcal {P}}(E)\).

The product of two monomials is defined in a natural way by \(R^{\ell _1}[\phi ]R^{\ell _2}[\psi ] = R^{\ell _1+\ell _2}[\phi \otimes \psi ]\), and the polynomial functions are linear combinations of monomials. These form a subalgebra of \(C_b({\mathcal {P}}(E))\) that contains the constants and separates points in \({\mathcal {P}}(E)\), and hence the Stone-Weierstrass Theorem implies that this subalgebra is dense in \(C_b({\mathcal {P}}(E))\), where the meaning of “dense” depends on the topology chosen on \({\mathcal {P}}(E)\).
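To make the definition of monomials concrete, the following Python sketch evaluates \(R^\ell [\varphi ]\) on an empirical measure \(\mu ^N_Z\) in the two ways given in Definition 3.13: as a product of one-particle averages, and as an average of the tensor product over all \(N^\ell \) index tuples. It is an illustration only; the choice \(E = {\mathbb {R}}\), the test functions and the sample points are assumptions made for the example, and the function names are hypothetical.

```python
import math
from itertools import product


def monomial_on_empirical(phis, Z):
    """R^l[phi_1 x ... x phi_l](mu^N_Z) as a product of one-particle averages."""
    N = len(Z)
    return math.prod(sum(phi(z) for z in Z) / N for phi in phis)


def monomial_by_tuples(phis, Z):
    """Same value, as the average of the tensor product over all N^l index tuples."""
    N, ell = len(Z), len(phis)
    total = 0.0
    for idx in product(range(N), repeat=ell):
        total += math.prod(phi(Z[i]) for phi, i in zip(phis, idx))
    return total / N**ell


# Hypothetical bounded continuous test functions on E = R and sample points.
phis = [math.cos, math.sin, lambda z: 1.0 / (1.0 + z * z)]
Z = [-1.0, 0.2, 0.9, 2.5]

value_product = monomial_on_empirical(phis, Z)
value_tuples = monomial_by_tuples(phis, Z)
print(value_product, value_tuples)
```

The same helper also illustrates the product rule \(R^{\ell _1}[\phi ]R^{\ell _2}[\psi ] = R^{\ell _1+\ell _2}[\phi \otimes \psi ]\): multiplying the monomial built from the first function by the one built from the last two gives the degree-three monomial.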

While polynomials on \({\mathbb {R}}\) are always differentiable, the smoothness of polynomials on \({\mathcal {P}}(E)\) depends on the metric structure. We first need some preliminary definitions.

Definition 3.14

  • Duality of type 1: We say that a pair \(({\mathcal {F}},{\mathcal {G}})\) of normed vector spaces such that \({\mathcal {F}}\subset C_b(E)\) and \({\mathcal {P}}(E)-{\mathcal {P}}(E) \subset {\mathcal {G}}\) satisfies a duality inequality if

    $$\begin{aligned} \forall \, f,g \in {\mathcal {P}}(E), \,\, \forall \, \varphi \in {\mathcal {F}}, \quad |\langle (f-g), \varphi \rangle |\le C \, \Vert f-g \Vert _{\mathcal {G}}\, \Vert \varphi \Vert _{\mathcal {F}}.\quad \quad \end{aligned}$$
    (3.11)
  • Duality of type 2: More generally we say that a pair \(({\mathcal {F}},{\mathcal {P}}_{\mathcal {G}}(E))\) of a normed vector space \({\mathcal {F}}\subset C_b(E)\) endowed with the norm \(\Vert \cdot \Vert _{\mathcal {F}}\) and a probability space \({\mathcal {P}}_{\mathcal {G}}(E) \subset {\mathcal {P}}(E)\) endowed with a metric \(d_{\mathcal {G}}\) satisfies a duality inequality if

    $$\begin{aligned} \forall \, f,g \in {\mathcal {P}}_{\mathcal {G}}(E), \,\, \forall \, \varphi \in {\mathcal {F}}, \quad |\langle g-f, \varphi \rangle |\le C \, \textit{dist}_{\mathcal {G}}(f,g)\, \Vert \varphi \Vert _{\mathcal {F}}.\quad \quad \end{aligned}$$
    (3.12)

Lemma 3.15

If \(\varphi \in {\mathcal {F}}^\ell \) and the pair \(({\mathcal {F}},{\mathcal {G}})\) satisfies a duality of type 1, the polynomial function \(R^\ell [\varphi ]\) is of class \(C^{k,1}({\mathcal {P}}_{\mathcal {G}}(E),{\mathbb {R}})\) for any \(k\ge 0\). In the more general case where the pair \(({\mathcal {F}},{\mathcal {P}}_{\mathcal {G}}(E))\) satisfies a duality of type 2, the polynomial function \(R^\ell [\varphi ]\) is at least of class \(C^{0,1}({\mathcal {P}}_{\mathcal {G}}(E),{\mathbb {R}})\).

Proof

It is clearly enough to prove the lemma for monomials, and the proof then mainly follows from the multilinearity of \(R\). In the case of duality of type 2, the conclusion follows from

$$\begin{aligned} R^\ell [\varphi ](f_2) - R^\ell [\varphi ](f_1)&= \sum _{i=1}^\ell \left( \,\prod _{1 \le k < i} \langle \varphi _k, f_2 \rangle \right) \langle \varphi _i, f_2 - f_1 \rangle \left( \,\prod _{i < k \le \ell } \langle \varphi _k, f_1 \rangle \right) . \end{aligned}$$

In the case of duality of type 1, we define

$$\begin{aligned} {\mathcal {G}}\rightarrow {\mathbb {R}}, \quad h \mapsto DR^\ell [\varphi ] (f) (h) := \sum _{i=1}^\ell \left( \prod _{j \not = i}\langle \varphi _j, f \rangle \right) \langle \varphi _i, h \rangle , \end{aligned}$$

and we write

$$\begin{aligned}&R^\ell [\varphi ](f_2) - R^\ell [\varphi ](f_1) - DR^\ell [\varphi ] (f_1) (f_2 - f_1)\\&\quad = \sum _{1\le j < i \le \ell } \left( \,\prod _{1 \le k < j} \langle \varphi _k, f_2 \rangle \right) \langle \varphi _j, f_2 - f_1 \rangle \left( \,\prod _{j < k <i} \langle \varphi _k, f_1 \rangle \right) \\&\qquad \times \langle \varphi _i, f_2 - f_1 \rangle \left( \,\prod _{i < k \le \ell } \langle \varphi _k, f_1 \rangle \right) . \end{aligned}$$

We deduce then

$$\begin{aligned}&\left| R^\ell [\varphi ](f_2) - R^\ell [\varphi ](f_1)\right| \le \Vert \varphi \Vert _{1,{\mathcal {F}}\otimes (L^\infty )^{\ell -1}} \, \Vert f_2 - f_1\Vert _{\mathcal {G}},\\&\left| DR^\ell [\varphi ](f_1)(h)\right| \le \Vert \varphi \Vert _{1,{\mathcal {F}}\otimes (L^\infty )^{\ell -1}} \, \Vert h\Vert _{\mathcal {G}}, \\&\left| R^\ell [\varphi ](f_2) \!-\! R^\ell [\varphi ](f_1) - DR^\ell [\varphi ](f_1)(f_2 - f_1)\right| \le \Vert \varphi \Vert _{1,{\mathcal {F}}^2 \otimes (L^\infty )^{\ell -2}} \, \Vert f_2 - f_1\Vert ^2_{\mathcal {G}}, \end{aligned}$$

where

$$\begin{aligned} \Vert \varphi \Vert _{1,{\mathcal {F}}^k \otimes (L^\infty )^{\ell -k}}&:= \sum _{\{i_1, \dots , i_k \}\subset \{1,\ldots ,\ell \}} \Vert \varphi _{i_1} \Vert _{{\mathcal {F}}} \, \cdots \Vert \varphi _{i_k} \Vert _{{\mathcal {F}}} \, \prod _{j \notin \{i_1,\dots ,i_k\}} \Vert \varphi _j \Vert _{L^\infty (E)} \\&\le \left\{ \begin{array}{lcl} \ell \, \Vert \varphi \Vert _{\infty ,{\mathcal {F}}\otimes (L^\infty )^{\ell -1}} &{} \quad \hbox {for} \quad &{} k=1, \\ \displaystyle \frac{\ell (\ell -1)}{2} \, \Vert \varphi \Vert _{\infty ,{\mathcal {F}}^2 \otimes (L^\infty )^{\ell -2}}&{} \quad \hbox {for} \quad &{} k=2, \end{array}\right. \end{aligned}$$

and we have defined

$$\begin{aligned} \Vert \varphi \Vert _{\infty ,{\mathcal {F}}^k \otimes (L^\infty )^{\ell -k}}&:= \max _{\{i_1, \dots , i_k \}\subset \{1,\ldots ,\ell \}} \Vert \varphi _{i_1}\Vert _{{\mathcal {F}}} \, \cdots \Vert \varphi _{i_k}\Vert _{{\mathcal {F}}}\! \prod _{j \notin \{i_1,\dots ,i_k\}}\Vert \varphi _j\Vert _{L^\infty (E)} \\&\le \Vert \varphi \Vert _{{\mathcal {F}}^{\otimes \ell }}, \end{aligned}$$

since \(\Vert \cdot \Vert _{L^\infty (E)} \le \Vert \cdot \Vert _{\mathcal {F}}\). This proves that \(R^\ell [\varphi ] \in C^{1,1}({\mathcal {P}}_{\mathcal {G}}(E),{\mathbb {R}})\). The cases \(k \ge 2\) are proved similarly.\(\square \)
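The first- and second-order estimates obtained in this proof can be tested numerically in a simple setting. The Python sketch below is an illustration under assumptions that are not in the text: \(E\) is a finite set with three points, probability measures are probability vectors, \({\mathcal {G}}\) is the space of signed measures with the total variation norm, and \({\mathcal {F}}\) is the space of functions on \(E\) with the supremum norm, so that the duality inequality (3.11) holds with \(C=1\). With this choice every subset of indices contributes the same product of supremum norms to \(\Vert \varphi \Vert _{1,{\mathcal {F}}^k \otimes (L^\infty )^{\ell -k}}\). All names and numerical values are hypothetical.

```python
import math


def pairing(phi, f):
    """Duality bracket <phi, f> on a finite state space."""
    return sum(p * w for p, w in zip(phi, f))


def monomial(phis, f):
    """R^l[phi](f) = prod_j <phi_j, f>."""
    return math.prod(pairing(phi, f) for phi in phis)


def d_monomial(phis, f, h):
    """DR^l[phi](f)(h) = sum_i (prod_{j != i} <phi_j, f>) <phi_i, h>."""
    ell = len(phis)
    return sum(
        math.prod(pairing(phis[j], f) for j in range(ell) if j != i)
        * pairing(phis[i], h)
        for i in range(ell)
    )


def tv_norm(h):
    return sum(abs(x) for x in h)


def sup_norm(phi):
    return max(abs(x) for x in phi)


def weight(phis, k):
    """||phi||_{1, F^k x (L^inf)^(l-k)} when F carries the supremum norm."""
    return math.comb(len(phis), k) * math.prod(sup_norm(phi) for phi in phis)


# Hypothetical data: three test functions and two probability vectors on 3 points.
phis = [[1.0, -1.0, 0.5], [0.2, 0.9, -0.4], [-1.0, 0.3, 0.7]]
f1 = [0.5, 0.3, 0.2]
f2 = [0.1, 0.6, 0.3]
h = [b - a for a, b in zip(f1, f2)]

increment = monomial(phis, f2) - monomial(phis, f1)
remainder = increment - d_monomial(phis, f1, h)
bound_first_order = weight(phis, 1) * tv_norm(h)
bound_second_order = weight(phis, 2) * tv_norm(h) ** 2
print(abs(increment), "<=", bound_first_order)
print(abs(remainder), "<=", bound_second_order)
```

The two printed comparisons correspond to the first and third inequalities of the proof; the bounds are far from sharp on this example, which is expected since they only use the supremum norms of the \(\varphi _j\).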

4 Assumptions and technical lemmas

In this section we collect the lemmas used in the proof of Theorem 2.1, and the technical assumptions that are needed.

The first assumption is simply the statement that the \(N\)-particle dynamics is well defined and invariant under permutation of the particles.

[Figure a: statement of the first assumption; image not reproduced.]

4.1 The generator of the pullback semigroup

While the definition of \(T^N_t\) and its generator is rather standard, more care is needed to define the pullback of the nonlinear semigroup and the corresponding generator. The second assumption that we need to impose on the system addresses this point, and relates to our definition of a differential calculus for functions on \({\mathcal {P}}(E)\).

[Figure b: statement of assumption (A2); image not reproduced.]

This assumption is sufficient for defining the generator of \(T^{\infty }_t\):

Lemma 4.1

Under assumption (A2) the pullback semigroup \(T^\infty _t\) is a contraction semigroup on the Banach space \(UC({\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\) and its generator \(G^\infty \) is an unbounded linear operator on \(UC({\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\) with domain \(\hbox {Dom}(G^\infty )\) containing \(C^{1,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E); {\mathbb {R}})\). It is defined by

$$\begin{aligned} \forall \, \Phi \in C^{1,1}( {\mathcal {P}}_{{\mathcal {G}}_1}(E); {\mathbb {R}}), \ \forall \, f \in {\mathcal {P}}_{{\mathcal {G}}_1}(E), \quad \left( G^\infty \Phi \right) (f) = \left\langle D\Phi [f], Q(f)\right\rangle \!.\quad \quad \end{aligned}$$
(4.1)

Proof

The proof is split into several steps.

Step 1. We claim that for any \(f_{\mathrm{in}} \in {\mathcal {P}}_{{\mathcal {G}}_1}(E)\) and \(\tau > 0\) the map

$$\begin{aligned} {\mathcal {S}}(f_{\mathrm{in}}) : [0,\tau ) \rightarrow {\mathcal {P}}_{{\mathcal {G}}_1}(E), \quad t \mapsto S^{N \! L}_t(f_{\mathrm{in}}) \end{aligned}$$

is right-differentiable at \(t=0^+\) with \({\mathcal {S}}(f_{\mathrm{in}})'(0) = Q(f_{\mathrm{in}})\).

Denote \(f_t := S^{N\!L}_t f_{\mathrm{in}}\). First, since \(Q(f_t)\) is bounded in \({\mathcal {G}}_1\) uniformly in \(t \in [0,\tau ]\) by (A2)-(ii), we have, uniformly in \(f_{\mathrm{in}}\in {\mathcal {P}}_{{\mathcal {G}}_1}(E)\),

$$\begin{aligned} \Vert f_t - f_{\mathrm{in}}\Vert _{{\mathcal {G}}_1}= \left\| \int _0^t Q(f_s) \, \mathrm{d}s \right\| _{{\mathcal {G}}_1} \le K \, t, \end{aligned}$$
(4.2)

and then using (A2)-(ii) and the inequality (4.2) we obtain

$$\begin{aligned} \Vert f_t - f_{\mathrm{in}} - t \, Q(f_{\mathrm{in}})\Vert _{{\mathcal {G}}_1}&= \left\| \int _0^t \left( Q(f_s) - Q(f_{\mathrm{in}})\right) \, \mathrm{d}s\right\| _{{\mathcal {G}}_1} \\&\le L \, \int _0^t \left\| f_s - f_{\mathrm{in}}\right\| _{{\mathcal {G}}_1}^\delta \, \mathrm{d}s\\&\le L \, \int _0^t (K \, s)^\delta \, \mathrm{d}s = L \, K^\delta \, {t^{1+\delta } \over 1+\delta }, \end{aligned}$$

which implies the claim.

Step 2. We claim that \((T^\infty _t)\) is a \(C_0\)-semigroup of linear and bounded (in fact contraction) operators on \(UC({\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\). Indeed, first for any \(\Phi \in UC( {\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\) and denoting by \(\omega _\Phi \) the modulus of continuity of \(\Phi \), we have

$$\begin{aligned} \left| (T^\infty _t \Phi )(g) - (T^\infty _t \Phi )(f)\right|&= \left| \Phi (S^{N\!L}_t(g)) - \Phi (S^{N\!L}_t(f))\right| \\&\le \omega _\Phi \left( \text{ dist }_{{\mathcal {G}}_1} \left( S^{N\!L}_t(g),S^{N\!L}_t(f)\right) \right) \\&\le \omega _\Phi \left( C_\tau \, \text{ dist }_{{\mathcal {G}}_1} (f,g)\right) \end{aligned}$$

so that \(T^\infty _t \Phi \in UC({\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\) for any \(t \ge 0\). Next, we have

$$\begin{aligned} \Vert T^\infty _t\Vert&= \sup _{\Vert \Phi \Vert \le 1}\Vert T^\infty _t \Phi \Vert = \sup _{\Vert \Phi \Vert \le 1} \sup _{f \in {\mathcal {P}}_{{\mathcal {G}}_1}(E)} \left| \Phi (S^{N\!L}_t(f))\right| \le 1,\\ \Vert \Phi \Vert&= \sup _{h \!\in \! {\mathcal {P}}_{{\mathcal {G}}_1}(E)} |\Phi (h)|. \end{aligned}$$

Finally, from (4.2), for any \(\Phi \in UC({\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\), we have

$$\begin{aligned} \left\| T^\infty _t \Phi - \Phi \right\| = \sup _{f \in {\mathcal {P}}_{{\mathcal {G}}_1}(E)} \left| \Phi (S^{N\!L}_t(f)) - \Phi (f)\right| \le \omega _\Phi (K \, t) \rightarrow 0 \quad \text{ as } t \rightarrow 0^+. \end{aligned}$$

As a consequence, the Hille–Yosida theorem (see for instance [33, Theorem 3.1]) implies that \((T^\infty _t)\) is associated with a closed generator \(G^\infty \) with dense domain \(\hbox {dom}(G^\infty ) \subset UC({\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\).

Step 3. A candidate for this generator is defined as follows. Let \(\tilde{G}^\infty \) be defined by

$$\begin{aligned} \forall \, \Phi \in C^{1,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E); {\mathbb {R}}), \ \forall \, f \in {\mathcal {P}}_{{\mathcal {G}}_1}(E), \quad (\tilde{G}^\infty \Phi ) (f) := \left\langle D\Phi [f], Q(f)\right\rangle \!. \end{aligned}$$

The RHS is well defined since \(D\Phi (f) \in {\mathcal {B}}({\mathcal {G}}_1,{\mathbb {R}}) = {\mathcal {G}}'_1\) and \(Q(f) \in {\mathcal {G}}_1\) by assumption. Moreover, since both \(f \mapsto D\Phi [f]\) and \(f \mapsto Q(f)\) are uniformly continuous, so is the map \(f \mapsto (\tilde{G}^\infty \Phi ) (f)\). This yields \(\tilde{G}^\infty \Phi \in UC({\mathcal {P}}_{{\mathcal {G}}_1}(E); {\mathbb {R}})\).

Step 4. Finally, by composition,

$$\begin{aligned} \forall \, f \in {\mathcal {P}}_{{\mathcal {G}}_1}(E), \quad t \mapsto T^\infty _t \Phi (f) = \Phi \circ S^{N\!L}_t (f) \end{aligned}$$

is right-differentiable at \(t=0^+\) and

$$\begin{aligned} {\mathrm{d} \over \mathrm{d}t} (T^\infty _t \Phi ) (f)|_{t=0}&:= {\mathrm{d} \over \mathrm{d}t} (\Phi \circ {\mathcal {S}}(f)(t))|_{t=0} \\&= \left\langle D\Phi ({\mathcal {S}}(f)(0)), {\mathrm{d} \over \mathrm{d}t} {\mathcal {S}}(f)(0)\right\rangle \\&= \left\langle D\Phi [f], Q(f)\right\rangle = \left( \tilde{G}^\infty \Phi \right) (f), \end{aligned}$$

which implies \(\Phi \in \hbox {Dom}(G^\infty )\) and (4.1).\(\square \)
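As an illustration of formula (4.1), consider a monomial \(\Phi = R^\ell [\varphi ]\) with \(\varphi = \varphi _1 \otimes \cdots \otimes \varphi _\ell \). The following is only a sketch: it assumes that \(\varphi _i \in {\mathcal {F}}\), that the pair \(({\mathcal {F}},{\mathcal {G}}_1)\) satisfies a duality of type 1 (so that \(R^\ell [\varphi ] \in C^{1,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E);{\mathbb {R}})\) by Lemma 3.15), and that the bracket \(\langle \varphi _i, Q(f) \rangle \) is well defined. Inserting the expression of \(DR^\ell [\varphi ]\) from the proof of Lemma 3.15 into (4.1) then gives

$$\begin{aligned} \left( G^\infty R^\ell [\varphi ]\right) (f) = \sum _{i=1}^\ell \left( \,\prod _{j \not = i} \langle \varphi _j, f \rangle \right) \langle \varphi _i, Q(f) \rangle , \end{aligned}$$

that is, \(G^\infty \) acts on monomials by a Leibniz-type rule in which one factor at a time is tested against \(Q(f)\).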

4.2 Estimates in the proof of Theorem 2.1

The proof of Theorem 2.1 relies on the following three lemmas, together with their assumptions.

4.2.1 Estimate of \({\mathcal {T}}_1\)

The first error term is estimated by the following combinatorial argument.

Lemma 4.2

Let \(S^N_t, \mu ^N_Z\) and \(R^{\ell }[\varphi ]\) be as defined in Sect. 2 (see Fig. 1), and let \(\varphi \in C_b(E^{\ell })\). For any \(t\ge 0\) and any \(N \ge 2 \ell \)

$$\begin{aligned} {\mathcal {T}}_1 := \left| \left\langle S^N_t(f_{in}^{\otimes N}), \varphi \otimes 1^{\otimes N-\ell }\right\rangle - \left\langle S^N_t(f_{in}^{\otimes N}), R^\ell [\varphi ] \circ \mu ^N_Z\right\rangle \right| \le \frac{2 \, \ell ^2 \, \Vert \varphi \Vert _{L^\infty (E^\ell )}}{N}.\nonumber \\ \end{aligned}$$
(4.3)

Proof

Since \(S^N_t(f_{in}^{\otimes N})\) is a symmetric probability measure, estimate (4.3) is a direct consequence of the following estimate: For any \(\varphi \in C_b(E^\ell )\) and any \( \, N \ge 2 \ell \) we have

$$\begin{aligned} \left| \left( \varphi \otimes \mathbf{1}^{\otimes N-\ell }\right) _{\mathrm{sym}} - \pi _N R^\ell [\varphi ]\right| \le \frac{2 \, \ell ^2 \, \Vert \varphi \Vert _{L^\infty (E^\ell )}}{N}. \end{aligned}$$
(4.4)

Here the symmetrized version of a function \(\phi \in C_b(E^N)\) is defined as

$$\begin{aligned} \phi _{\mathrm{sym}} = \frac{1}{|{\mathfrak {S}}_N|} \, \sum _{\sigma \in {\mathfrak {S}}_N} \phi _\sigma . \end{aligned}$$
(4.5)

As a consequence for any symmetric measure \(f^N \in {\mathcal {P}}_{\mathrm{sym}}(E^N)\) we have

$$\begin{aligned} \left| \langle f^N, R^\ell [\varphi ](\mu ^N_Z) \rangle - \langle f^N, \varphi \rangle \right| \le {2 \, \ell ^2 \, \Vert \varphi \Vert _{L^\infty (E^\ell )} \over N}. \end{aligned}$$
(4.6)

To establish the inequality (4.4), we let \(\ell \le N/2\) and introduce

$$\begin{aligned} A_{N,\ell } := \left\{ (i_1, \dots , i_\ell ) \in \{1,\dots ,N\}^\ell \, : \ \forall \, k \not = k', \ i_k \not = i_{k'} \ \right\} \quad \text{ and } \quad B_{N,\ell } := A_{N,\ell }^c. \end{aligned}$$

Since there are \(N (N-1) \dots (N-\ell +1)\) ways of choosing \(\ell \) distinct indices among \(\{1,\dots ,N\}\) we get

$$\begin{aligned} {\left| B_{N,\ell }\right| \over N^\ell }&= 1 - \left( 1 - {1 \over N}\right) \, \dots \, \left( 1 - {\ell -1 \over N}\right) = 1 - \exp \left( \sum _{i = 0}^{\ell -1}\ln \left( 1 - \frac{i}{N}\right) \right) \\&\le 1 - \exp \left( - 2 \sum _{i = 0}^{\ell -1} \frac{i}{N}\right) \le {\ell ^2 \over N}, \end{aligned}$$

where we have used

$$\begin{aligned} \forall \, x \in [0,1/2], \quad \ln (1 - x) \ge - 2 \, x \quad \text{ and } \quad \forall \, x \in {\mathbb {R}}, \quad e^{-x} \ge 1 - x. \end{aligned}$$

Then we compute

$$\begin{aligned} R^\ell [\varphi ](\mu ^N_Z)&= {1 \over N^\ell } \sum _{i_1, \dots , i_\ell = 1}^N \varphi (z_{i_1}, \dots , z_{i_\ell }) \\&= {1 \over N^\ell } \sum _{(i_1, \dots , i_\ell ) \in A_{N,\ell }} \varphi (z_{i_1}, \dots , z_{i_\ell }) + {1 \over N^\ell } \sum _{(i_1, \dots , i_\ell ) \in B_{N,\ell }} \varphi (z_{i_1}, \dots , z_{i_\ell })\\&= {1 \over N^\ell } \, {1 \over (N-\ell )!} \sum _{\sigma \in {\mathfrak {S}}_N} \varphi (z_{\sigma (1)}, \dots , z_{\sigma (\ell )}) + {\mathcal {O}} \left( {\ell ^2 \over N} \, \Vert \varphi \Vert _{L^\infty }\right) \\&= {1 \over N!} \sum _{\sigma \in {\mathfrak {S}}_N} \varphi (z_{\sigma (1)}, \dots , z_{\sigma (\ell )}) + {\mathcal {O}} \left( {2 \, \ell ^2 \over N} \, \Vert \varphi \Vert _{L^\infty }\right) \end{aligned}$$

and the proof of (4.4) is complete. Next for any symmetric \(f^N \in {\mathcal {P}}_{\mathrm{sym}}(E^N)\) we have

$$\begin{aligned} \left\langle f^N, \varphi \right\rangle = \left\langle f^N, \left( \varphi \otimes \mathbf{1}^{\otimes N-\ell }\right) _{\mathrm{sym}}\right\rangle , \end{aligned}$$

and (4.6) follows from (4.4).\(\square \)
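
The counting step behind (4.4) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and it only verifies that the fraction \(|B_{N,\ell }|/N^\ell \) of index tuples with a repeated index stays below \(\ell ^2/N\) when \(N \ge 2\ell \).

```python
from itertools import product
from math import prod


def repeated_index_fraction(N: int, ell: int) -> float:
    """Closed form for |B_{N,ell}| / N**ell: 1 - prod_{i<ell} (1 - i/N)."""
    return 1.0 - prod(1 - i / N for i in range(ell))


def brute_force_fraction(N: int, ell: int) -> float:
    """Same quantity by direct enumeration of all N**ell index tuples."""
    repeated = sum(
        1 for idx in product(range(N), repeat=ell) if len(set(idx)) < ell
    )
    return repeated / N**ell


# Compare the closed form with the bound ell**2 / N for a few (N, ell) pairs.
for N, ell in [(10, 2), (100, 5), (1000, 20)]:
    print(N, ell, repeated_index_fraction(N, ell), ell**2 / N)
```

The enumeration is only feasible for small \(N\) and \(\ell \); it is included to confirm that the closed form counts the same set \(B_{N,\ell }\).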

4.2.2 Estimate of \({\mathcal {T}}_2\)

The second error term is estimated thanks to a consistency result for the generators of the \(N\) particle system and the limiting dynamics, and stability estimates on the limiting dynamics. To proceed we need to introduce the two corresponding assumptions.

figure c
figure d

Lemma 4.3

Suppose that the assumptions (A1) to (A4) are satisfied, and let \({\mathcal {T}}_2\) be as defined in Eq. (2.8). Then

$$\begin{aligned} \quad {\mathcal {T}}_ 2&:= \left| \left\langle f_{\mathrm{in}}^{\otimes N}, T^N_t (R^\ell [\varphi ] \circ \mu ^N_Z)\right\rangle - \left\langle f_{\mathrm{in}}^{\otimes N}, \left( (T_t^\infty R^\ell [\varphi ]) \circ \mu ^N_Z\right) \right\rangle \right| \\ \nonumber&\le C(k,\ell ) \, C_{T} \, {\varepsilon }(N) \, \Vert \varphi \Vert _{{\mathcal {F}}_1^k \otimes (L^\infty )^{\ell -k}} \end{aligned}$$
(4.9)

for an explicitly given constant \(C(k,\ell )\) depending only on \(k\) and \(\ell \).

Proof

We start from the following identity

$$\begin{aligned} T^N_t \pi ^{N} - \pi ^{{\!N}} T^\infty _t&= - \int _0^t \frac{\partial }{\partial s} \left( T^N_{t-s} \, \pi ^{{\!N}} \, T^\infty _s\right) \, \mathrm{d}s \\&= \int _0^t T^N_{t-s} \, \left[ G^N \pi ^{{\!N}} - \pi ^{{\!N}} G^\infty \right] \, T^\infty _s \, \mathrm{d}s, \end{aligned}$$

which we evaluate on \(\Phi =R^{\ell }[\varphi ]\in C_b({\mathcal {P}}(E))\). From assumption (A3) we have for any \(t \in [0,T]\)

$$\begin{aligned}&\left| \left\langle f_{\mathrm{in}}^{\otimes N}, T^N_t \pi ^{{\!N}} R^\ell [\varphi ] - \pi ^{{\!N}} T^\infty _t R^\ell [\varphi ]\right\rangle \right| \nonumber \\&\qquad = \left| \int _0^t \left\langle S^N_{t-s}\left( f_{\mathrm{in}}^{\otimes N}\right) , \left[ G^N \pi ^{{\!N}} - \pi ^{{\!N}} G^\infty \right] \, (T^\infty _s R^\ell [\varphi ])\right\rangle \, \mathrm{d}s\right| \nonumber \\&\qquad \le \int _0^T \left\| \left[ G^N \pi ^{{\!N}} - \pi ^{{\!N}} G^\infty \right] \, (T^\infty _s R^\ell [\varphi ])\right\| _{L^\infty (E^N)} \, \mathrm{d}s \nonumber \\&\qquad \le {\varepsilon }(N) \, \int _0^T \left\| T^\infty _s R^\ell [\varphi ]\right\| _{C^{k,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E))} \, \mathrm{d}s. \end{aligned}$$
(4.10)

Since \(T^\infty _t(R^\ell [\varphi ]) = R^\ell [\varphi ] \circ S^{N\!L}_t\) with \(S^{N\!L}_t \in C^{k,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathcal {P}}_{{\mathcal {G}}_2}(E))\) thanks to assumption (A4) and \(R^\ell [\varphi ] \in C^{k,1}({\mathcal {P}}_{{\mathcal {G}}_2}(E),{\mathbb {R}})\) because \(\varphi \in {\mathcal {F}}_2^{\otimes \ell }\) (see Sect. 3.4), we obtain with the help of Lemma 3.12 that \(T^\infty _t(R^\ell [\varphi ]) \in C^{k,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E))\) with uniform bound. We hence conclude that

$$\begin{aligned} \int _0^T \left\| T^\infty _s(R^\ell [\varphi ])\right\| _{C^{k,1} ({\mathcal {P}}_{{\mathcal {G}}_1}(E))} \, \mathrm{d}s \le C(k,\ell ) \, C_{T} \, \left\| R^\ell [\varphi ]\right\| _{C^{k, 1}(P_{{\mathcal {G}}_2})}, \end{aligned}$$
(4.11)

where \(C(k,\ell ) \le \ell ^2\) since \(k=1\) or \(k=2\).

Going back to the computation (4.10), and plugging in (4.11), we deduce (4.9).\(\square \)

4.2.3 Estimate of \({\mathcal {T}}_3\)

The error term \({\mathcal {T}}_3\) from Eq. (2.8) depends on estimating how well the initial data for the nonlinear equation \(f_{\mathrm{in}}\) can be approximated by an empirical measure, and how well this error is then propagated along the semigroup. To this purpose we need to make a stability assumption for the limiting semigroup.

figure e

Remark 4.4

Observe that when \({\mathcal {P}}_{{\mathcal {G}}_1}(E) = {\mathcal {P}}_{{\mathcal {G}}_2}(E) = {\mathcal {P}}_{{\mathcal {G}}_3}(E)\) with the same weights and distances, the assumption (A5) is included in (A4). However it is crucial to have the flexibility to play with different metric structures in these assumptions.

Lemma 4.5

Assume that the limiting semigroup \(S^{N\!L}_t\) satisfies assumption (A5) for some probability space \({\mathcal {P}}_{{\mathcal {G}}_3}(E)\). Let \({\mathcal {F}}_3\) satisfy a duality inequality with \({\mathcal {P}}_{{\mathcal {G}}_3}(E)\) as defined in Definition 3.14. Let \(\varphi \in C_b(E^{\ell })\).

Then for any \(f_{\mathrm{in}}\in {\mathcal {P}}(E), t>0\) and \(N\ge 2\ell \) we have

$$\begin{aligned} {\mathcal {T}}_3&:= \left| \left\langle f_{\mathrm{in}}^{\otimes N}, \left( T_t^\infty R^\ell [\varphi ]\right) \circ \mu ^N_Z\right\rangle - \left\langle \left( S^{N\!L}_t (f_{\mathrm{in}})\right) ^{\otimes \ell },\varphi \right\rangle \right| \nonumber \\&\le \ell \, \tilde{C}_{T} \, \Omega _N^{{\mathcal {G}}_3} (f_{\mathrm{in}}) \, \Vert \varphi \Vert _{{\mathcal {F}}_3 \otimes (L^\infty )^{\ell -1}}, \end{aligned}$$
(4.13)

where \(\Omega _N^{{\mathcal {G}}_3} (f_{in})\) is defined in (2.7) and \(\Vert \varphi \Vert _{{\mathcal {F}}_3 \otimes (L^\infty )^{\ell -1}}\) is defined in the proof of Lemma 3.15.

Proof

We split \({\mathcal {T}}_3\) in two terms, the first one being

$$\begin{aligned} {\mathcal {T}}_{3,1}&:= \left\langle f_{\mathrm{in}}^{\otimes N}, \left( T_t^\infty R^\ell [\varphi ]\right) \circ \mu ^N_Z\right\rangle \\&= \int _{E^N} R^\ell [\varphi ] \left( S^{N\!L}_t \left( \mu ^N_Z\right) \right) \, f_{\mathrm{in}}(\mathrm{d}z_1) \, \dots \, f_{\mathrm{in}}(\mathrm{d}z_N) \\&= \int _{E^N} \left( \prod _{i=1}^\ell a_i (Z)\right) \, f_{\mathrm{in}}(\mathrm{d}z_1) \, \dots \, f_{\mathrm{in}}(\mathrm{d}z_N), \end{aligned}$$

with

$$\begin{aligned} \forall \, i=1, \dots , \ell , \quad a_i = a_i(Z) := \int _{E} \varphi _i(w) \, S^{N\!L}_t(\mu ^N_Z)(\mathrm{d}w). \end{aligned}$$

Similarly, we write for the second term

$$\begin{aligned} {\mathcal {T}}_{3,2}&= \left\langle \left( S^{N\!L}_t (f_{\mathrm{in}})\right) ^{\otimes \ell }, \varphi \right\rangle = \int _{E^N} \left( \prod _{i=1}^\ell b_i\right) \, f_{\mathrm{in}}(\mathrm{d}z_1) \, \dots \, f_{\mathrm{in}}(\mathrm{d}z_N), \end{aligned}$$

with

$$\begin{aligned} \forall \, i=1, \dots , \ell , \quad b_i := \int _{E} \varphi _i(w) \, S^{N\!L}_t(f_{\mathrm{in}})(\mathrm{d}w). \end{aligned}$$

Using the identity

$$\begin{aligned} \prod _{i=1}^\ell a_i - \prod _{i=1}^\ell b_i = \sum _{i=1}^\ell a_1 \dots a_{i-1} \, (a_i - b_i) \, b_{i+1} \dots b_\ell , \end{aligned}$$

we get

$$\begin{aligned} {\mathcal {T}}_3 \le \sum _{i=1}^\ell \left( ~\prod _{j \ne i} \Vert \varphi _j \Vert _{L^\infty (E)}\right) \, \int _{E^N} \left| a_i(Z) - b_i\right| \, f_{\mathrm{in}}(\mathrm{d}z_1) \, \dots \, f_{\mathrm{in}}(\mathrm{d}z_N).\quad \end{aligned}$$
(4.14)

Then by using the duality bracket together with assumption (A5) we have

$$\begin{aligned} \left| a_i(Z) - b_i\right|&= \left| \int _{E} \varphi _i(w) \, \left( S^{N\!L}_t(f_{\mathrm{in}})(\mathrm{d}w) - S^{N\!L}_t(\mu ^N_Z)(\mathrm{d}w)\right) \right| \nonumber \\&\le \Vert \varphi _i \Vert _{{\mathcal {F}}_3} \, \text{ dist }_{{\mathcal {G}}_3} \left( S^{N\!L}_t(f_{\mathrm{in}}), S^{N\!L}_t(\mu ^N_Z)\right) \nonumber \\&\le \tilde{C}_{T} \, \Vert \varphi _i\Vert _{{\mathcal {F}}_3} \, \text{ dist }_{{\mathcal {G}}_3}\left( f_{\mathrm{in}}, \mu ^N_Z\right) . \end{aligned}$$
(4.15)

Therefore combining (4.14) and (4.15) (for any \(1 \le i \le \ell \)), we conclude that (4.13) holds.\(\square \)
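
The telescoping identity used in the proof above is elementary, and a quick numerical check can make the bookkeeping concrete. The sketch below is illustrative only: the helper name `telescoping_sum` and the random test values are assumptions, not objects from the paper. It also checks the consequence used for (4.14), namely that when all \(|a_i|, |b_i| \le 1\) the difference of products is bounded by \(\sum _i |a_i - b_i|\).

```python
import random
from math import prod


def telescoping_sum(a, b):
    """Sum over i of a_1...a_{i-1} (a_i - b_i) b_{i+1}...b_l (0-based lists)."""
    return sum(
        prod(a[:i]) * (a[i] - b[i]) * prod(b[i + 1:]) for i in range(len(a))
    )


rng = random.Random(0)
a = [rng.uniform(-1.0, 1.0) for _ in range(5)]
b = [rng.uniform(-1.0, 1.0) for _ in range(5)]

lhs = prod(a) - prod(b)          # left-hand side of the identity
rhs = telescoping_sum(a, b)      # right-hand side of the identity
bound = sum(abs(x - y) for x, y in zip(a, b))

print(abs(lhs - rhs), abs(lhs), bound)
```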

In order to use Lemma 4.5 we need an estimate on the term \(\Omega ^{{\mathcal {G}}_3}_N(f_{\mathrm{in}})\). This information is provided by the following quantitative version of the law of large numbers for empirical measures taken from [34]. We refer to [32] for a more detailed discussion of this issue.

Lemma 4.6

For any \(f_{\mathrm{in}} \in {\mathcal {P}}_{d+5}({\mathbb {R}}^d)\) and any \(N \ge 2\) there exists a constant \(C\) which only depends on \(d\) and \(M_{d+5}(f_{\mathrm{in}})\) so that

$$\begin{aligned} \Omega ^{W^2_2}_N(f_{\mathrm{in}}) = \int _{{\mathbb {R}}^{dN}} W_2(\mu ^N_Z, f_{\mathrm{in}})^2 \, f^{\otimes N}_{\mathrm{in}}(\mathrm{d}Z) \le C \, N^{ - \frac{2}{d+4}}. \end{aligned}$$

4.3 A remark on assumption (A4)

In this section we briefly explain how our key estimate (A4) can be obtained in the case of a nonlinear operator \(Q\) which splits into a linear part and a bilinear part:

$$\begin{aligned} \forall f\in {\mathcal {P}}(E), \quad Q(f) = Q_1(f) + Q_2(f,f) \end{aligned}$$
(4.16)

with \(Q_1\) linear and \(Q_2\) bilinear symmetric.

For two initial data \(f_{\mathrm{in}}\) and \(g_{\mathrm{in}}\) in a space \(P_{\mathcal {G}}(E)\) of probability measures, and some initial data \(h_{\mathrm{in}} \in {\mathcal {G}}\) we introduce the following evolution equations,

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t g = Q(g) = Q_1(g) + Q_2(g,g), \quad g_{|t=0} = g_{\mathrm{in}},\\ \partial _t f = Q(f) = Q_1(f) + Q_2(f,f) , \quad f_{|t=0} = f_{\mathrm{in}}, \\ \partial _t h = DQ[f] (h) = Q_1(h) + 2 \, Q_2(f,h), \quad h_{|t=0} = h_{\mathrm{in}} \end{array}\right. \end{aligned}$$

In the third equation the solution \(h_t\) depends linearly on \(h_{\mathrm{in}}\) (but also nonlinearly on \(f_{\mathrm{in}}\)): it is formally the first-order variation of the semigroup, i.e. \(D S^{N\!L}_t [f_{\mathrm{in}}](h_{\mathrm{in}}) = h_t\).

We now want to write a second-order variation of the semigroup. To this purpose we consider \(h_t\) and \(\tilde{h}_t\) two solutions to the third equation above and write

$$\begin{aligned} \partial _t r = DQ[f](r) \!+\! {1 \over 2} D^2Q(f) (h,\tilde{h}) =Q_1(r) + 2 \, Q_2(f,r) \!+\! Q_2 (h,\tilde{h}), \quad \! r_{|t=0} = 0. \end{aligned}$$

In this fourth equation \(r_t\) depends bilinearly on \(h_t\) and \(\tilde{h}_t\) which are two solutions to the third equation, and therefore bilinearly on \(h_{\mathrm{in}}, \tilde{h}_{\mathrm{in}}\) (it also depends nonlinearly on \(f_{\mathrm{in}}\) again): it is formally the second-order derivative of the semigroup \(D^2 S^{N\!L}_t [f_{\mathrm{in}}](h_{\mathrm{in}}, \tilde{h}_{\mathrm{in}}) = r_t\). Observe that the initial data \(r_{\mathrm{in}}\) are always zero for this second variation problem since the map \((f_t)_{t \ge 0} \mapsto f_{\mathrm{in}}\) is linear.

Consider now \(h_{\mathrm{in}} = \tilde{h}_{\mathrm{in}} = g_{\mathrm{in}} - f_{\mathrm{in}}\) (which implies \(h_t = \tilde{h}_t\)). Let us define \(\mathsf {s} := f+g, \mathsf {d} := g - f, \omega := g - f - h, \psi := g - f - h - r\), for which we get the following evolution equations

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t \mathsf {d} = Q_1(\mathsf {d}) + Q_2(\mathsf {s},\mathsf {d}), \quad \mathsf {d}_{|t=0} = \mathsf {d}_{\mathrm{in}} = g_{\mathrm{in}} - f_{\mathrm{in}},\\ \partial _t \omega = Q_1(\omega ) + Q_2(\mathsf {s},\omega ) + Q_2 (h,\mathsf {d}), \quad \omega _{|t=0} = 0,\\ \partial _t \psi = Q_1(\psi ) + Q_2(\mathsf {s},\psi ) + Q_2 (h,\omega ) + Q_2 (r,\mathsf {d}), \quad \psi _{|t=0} = 0. \end{array}\right. \end{aligned}$$
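
As a sanity check of the second equation in this system, here is a short worked derivation. It uses only the linearity of \(Q_1\), the bilinearity and symmetry of \(Q_2\), the evolution equations for \(\mathsf {d}\) and \(h\), and the relations \(\omega = \mathsf {d} - h\) and \(\mathsf {s} - 2f = \mathsf {d}\):

$$\begin{aligned} \partial _t \omega&= \partial _t \mathsf {d} - \partial _t h = Q_1(\mathsf {d}) + Q_2(\mathsf {s},\mathsf {d}) - Q_1(h) - 2 \, Q_2(f,h) \\&= Q_1(\omega ) + Q_2(\mathsf {s},\omega ) + Q_2(\mathsf {s},h) - 2 \, Q_2(f,h) \\&= Q_1(\omega ) + Q_2(\mathsf {s},\omega ) + Q_2(h,\mathsf {d}). \end{aligned}$$

The equation for \(\psi = \omega - r\) follows in the same way, by subtracting the equation for \(r\) (with \(\tilde{h} = h\)) and using \(\mathsf {d} - h = \omega \).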

Now we can translate the regularity estimates on \(S^{N\!L}_t\) in terms of estimates on these solutions on some given time interval \([0,T]\):

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \sup _{t\in [0,T]} \Vert \mathsf {d}_t\Vert _{\mathcal {G}_2} \le C_T\Vert \mathsf {d}_{\mathrm{in}}\Vert _{\mathcal {G}_1} \implies S^{N\!L}_t\in C^{0,1}(P_{\mathcal {G}_1}(E), P_{\mathcal {G}_2}(E))\\ \left\{ \begin{array}{l} S^{N\!L}_t\in C^{0,1}(P_{\mathcal {G}_1}(E), P_{\mathcal {G}_2}(E))\\ \sup \nolimits _{t\in [0,T]} \Vert h_t\Vert _{{\mathcal {G}}_2} \le C_T \Vert h_{\mathrm{in}}\Vert _{\mathcal {G}_1}\\ \sup \nolimits _{t\in [0,T]} \Vert \omega _t\Vert _{{\mathcal {G}}_2} \le C_T \Vert \mathsf {d}_{\mathrm{in}}\Vert _{\mathcal {G}_1}^2 \end{array}\right\} \implies S^{N\!L}_t\in C^{1,1}(P_{\mathcal {G}_1}(E), P_{{\mathcal {G}}_2}(E))\\ \left\{ \begin{array}{l} S^{N\!L}_t\in C^{1,1}(P_{\mathcal {G}_1}(E), P_{{\mathcal {G}}_2}(E))\\ \sup \nolimits _{t\in [0,T]} \Vert r_t\Vert _{{\mathcal {G}}_2} \le C_T \Vert h_{\mathrm{in}}\Vert _{\mathcal {G}_1} \Vert \tilde{h}_{\mathrm{in}}\Vert _{\mathcal {G}_1}\\ \sup \nolimits _{t\in [0,T]} \Vert \psi _t\Vert _{\mathcal {G}_2} \le C_T \Vert \mathsf {d}_{\mathrm{in}}\Vert _{\mathcal {G}_1}^3 \end{array}\right\} \implies S^{N\!L}_t\in C^{2,1}(P_{\mathcal {G}_1}(E), P_{\mathcal {G}_2}(E)). \end{array}\right. \end{aligned}$$

Such estimates are typically obtained by energy estimates for the equations satisfied by \(\mathsf {d}, r, \omega \) and \(\psi \), for a well chosen “cascade” of norms connecting \(\Vert \cdot \Vert _{{\mathcal {G}}_1}\) to \(\Vert \cdot \Vert _{{\mathcal {G}}_2}\) (see later in the applications).

5 Maxwell molecule collisions with cut-off

5.1 The model

In this section we assume that \(E = {\mathbb {R}}^d, d \ge 2\), and we consider an \(N\)-particle system undergoing space homogeneous random Boltzmann collisions according to an integrable collision kernel \(b \in L^1([-1,1])\) depending only on the deviation angle. This model is usually called Maxwellian molecules with Grad’s angular cut-off, as introduced in [23, 24, 29]. We make the normalization hypothesis

$$\begin{aligned} \Vert b \Vert _{L^1} = \int _{{\mathbb {S}}^{d-1}} b(\sigma _1) \, \mathrm{d} \sigma = 1. \end{aligned}$$

Let us now describe the stochastic process. Since the phase space \(E^N\) corresponds to the velocities of the particles, we shall denote \(Z=V\) in this section. Given a pre-collisional system of \(N\) particle velocities \(V = (v_1, \dots , v_N) \in E^N = ({\mathbb {R}}^d)^N\), the stochastic process runs as follows:

  1. (i)

    for any \(i'\ne j'\), we draw for the pair of particles \((v_{i'},v_{j'})\) a random collision time \(T_{i',j'}\) according to an exponential law of parameter \(1\), and then choose the collision time \(T_1\) and the colliding pair \((v_i,v_j)\) (which is a.s. well-defined) in such a way that

    $$\begin{aligned} T_1 = T_{i,j} := \min _{1 \le i' \ne j' \le N} T_{i',j'}; \end{aligned}$$
  2. (ii)

    we then choose \(\sigma \in {\mathbb {S}}^{d-1}\) at random according to the law \(b(\cos \theta _{ij})\) where we define the angular deviation \(\theta _{ij}\) by \(\cos \theta _{ij} = \sigma \cdot (v_j-v_i)/|v_j-v_i|\);

  3. (iii)

    the new state after collision at time \(T_1\) becomes

    $$\begin{aligned} V^* = V^*_{ij} = R_{ij,\sigma }V = (v_1, \dots , v^*_i, \dots , v^*_j, \dots , v_N), \end{aligned}$$

    where the rotation \(R_{ij,\sigma }\) on the \((i,j)\) pair with vector \(\sigma \) is defined by

    $$\begin{aligned} \quad \quad v^*_i = {w_{ij} \over 2} + {u^*_{ij} \over 2}, \quad v^*_j= {w_{ij} \over 2} - {u^*_{ij} \over 2}, \end{aligned}$$
    (5.1)

    with

    $$\begin{aligned} w_{ij} = v_i+v_j, \quad u^*_{ij} = |u_{ij}| \, \sigma , \quad u_{ij} = v_i-v_j. \end{aligned}$$
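
Steps (i) to (iii) can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions, not the construction used in the paper: the function names are hypothetical, the collision time is not sampled, the colliding pair is drawn uniformly (which is the law of the minimizer of i.i.d. exponential clocks), and \(\sigma \) is drawn uniformly on the sphere, which corresponds to the special case of a constant kernel \(b\). The update itself follows (5.1).

```python
import math
import random


def random_unit_vector(d, rng):
    """Uniform direction on S^{d-1}, obtained by normalizing a Gaussian vector."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in g))
        if norm > 0.0:
            return [x / norm for x in g]


def collide(V, i, j, sigma):
    """Return V* = R_{ij,sigma} V: only the velocities i and j change, as in (5.1)."""
    vi, vj = V[i], V[j]
    w = [a + b for a, b in zip(vi, vj)]                          # w_ij = v_i + v_j
    u_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(vi, vj)))  # |u_ij|
    V_star = [list(v) for v in V]
    V_star[i] = [0.5 * wk + 0.5 * u_norm * sk for wk, sk in zip(w, sigma)]
    V_star[j] = [0.5 * wk - 0.5 * u_norm * sk for wk, sk in zip(w, sigma)]
    return V_star


def momentum(V):
    """Componentwise sum of all velocities."""
    return [sum(v[k] for v in V) for k in range(len(V[0]))]


def energy(V):
    """|V|^2, the sum of the squared norms of all velocities."""
    return sum(x * x for v in V for x in v)


rng = random.Random(0)
N, d = 5, 3
V = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(N)]
i, j = rng.sample(range(N), 2)        # colliding pair, chosen uniformly
sigma = random_unit_vector(d, rng)    # assumption: constant kernel b
V_star = collide(V, i, j, sigma)

print(momentum(V), momentum(V_star))
print(energy(V), energy(V_star))
```

Momentum and energy agree before and after the collision up to rounding, since \(v^*_i + v^*_j = w_{ij}\) and \(|v^*_i|^2 + |v^*_j|^2 = (|w_{ij}|^2 + |u_{ij}|^2)/2\). A general kernel \(b\) would require a different sampler for \(\sigma \), for instance by rejection.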

Scaling the time by a factor \(1/N\) and repeating the above construction lead to the definition of a Markov process \(({\mathcal {V}}^N_t)\) on \(({\mathbb {R}}^d)^N\). It is associated to a Feller semigroup \((T^N_t)\) with generator \(G^N\). Moreover the master equation on the law \(f^N_t\) is given in dual form by

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} \langle f^N_t,\varphi \rangle = \langle f^N_t, G^N \varphi \rangle \end{aligned}$$
(5.2)

with

$$\begin{aligned} (G^N\varphi ) (V) = {1 \over N} \sum _{1\le i < j\le N} \int _{\mathbb {S}^{d-1}} b(\cos \theta _{ij}) \, \left[ \varphi ^*_{ij} - \varphi \right] \, \mathrm{d}\sigma \end{aligned}$$
(5.3)

where \(\varphi ^*_{ij}= \varphi (V^*_{ij})\) and \(\varphi = \varphi (V) \in C_b({\mathbb {R}}^{Nd})\). Finally, the flow \(f^N_{\mathrm{in}} \mapsto f^N_t\) defines a semigroup \(S^N_t\) for the \(N\)-particle distributions which is nothing but the dual semigroup of \(T^N_t\).

Note that the collision process is invariant under permutation of the velocities, and satisfies the microscopic conservations of momentum and energy at any collision time

$$\begin{aligned} \forall \, \alpha = 1, \dots , d, \quad \sum _{k} v^*_{k\alpha } = \sum _{k} v_{k\alpha }, \quad |V^*|^2 = |V|^2 := \sum _{k=1}^N |v_k|^2. \end{aligned}$$

We write \(V = (v_i)_{1 \le i \le N} = (v_1, \dots , v_N) \in E^N\) and \(v = (v_\alpha )_{1\le \alpha \le d} \in {\mathbb {R}}^d\), so that \( V = (v_{i\alpha }) \in {\mathbb {R}}^{Nd}\) with \(v_{i\alpha } \in {\mathbb {R}}\).

As a consequence, for any symmetric initial law \(f_{\mathrm{in}}^{\otimes N} \in {\mathcal {P}}_{\mathrm{sym}}({\mathbb {R}}^{Nd})\) the law \(f_t^N\) remains a symmetric probability measure and conserves momentum and energy

$$\begin{aligned} \left\{ \begin{array}{l} \forall \, \alpha = 1, \dots , d, \quad \int _{{\mathbb {R}}^{dN}} \left( ~\sum _{k=1}^N v_{k\alpha }\right) \, f_t^N (\mathrm{d}v) = \int _{{\mathbb {R}}^{dN}} \left( \sum _{k=1}^N v_{k\alpha }\right) \, f_{\mathrm{in}}^{\otimes N} (\mathrm{d}v), \\ \forall \, \theta : {\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+, \quad \int _{{\mathbb {R}}^{dN}}\theta (|V|^2) \, f_t^N (\mathrm{d}v) = \int _{{\mathbb {R}}^{dN}} \theta (|V|^2) \, f_{\mathrm{in}}^{\otimes N} (\mathrm{d}v). \end{array}\right. \end{aligned}$$

The formal limit of this \(N\)-particle system is the nonlinear homogeneous Boltzmann equation on \({\mathcal {P}}(\mathbb {R}^d)\) defined by

$$\begin{aligned} \frac{\partial }{\partial t} f_t = Q(f_t,f_t) \end{aligned}$$
(5.4)

where the quadratic Boltzmann collision operator \(Q\) is defined by

$$\begin{aligned} \langle Q(f,f), \varphi \rangle := \int _{{\mathbb {R}}^{2d}\times {\mathbb {S}}^{d-1}} b(\theta ) \, (\varphi (w^*_2) - \varphi (w_2)) \, \mathrm{d}\sigma \, f(\mathrm{d}w_1) \, f(\mathrm{d}w_2) \end{aligned}$$
(5.5)

for \(\varphi \in C_b({\mathbb {R}}^d)\) and \(f \in {\mathcal {P}}({\mathbb {R}}^d)\), with

$$\begin{aligned} w_1^* = {w_1+ w_2 \over 2} + {|w_2 - w_1|\over 2}\, \sigma , \qquad w_2^* = {w_1+ w_2 \over 2} - {|w_2 - w_1|\over 2}\, \sigma \end{aligned}$$
(5.6)

and \(\cos \theta = \sigma \cdot (w_2-w_1)/|w_2-w_1|\). Equation (5.4)–(5.5) is the space homogeneous Boltzmann equation for elastic collisions associated to the Maxwell molecules cross section with Grad’s cutoff. We refer to the textbooks [10] and [39] and the numerous references therein for both the physical background and the mathematical theory of the Boltzmann equation. This equation generates a nonlinear semigroup \(S^{N\!L}_t\) on \({\mathcal {P}}({\mathbb {R}}^d)\) defined by \(S^{N\!L}_t f_{\mathrm{in}} := f_t\) for any \(f_{\mathrm{in}} \in {\mathcal {P}}({\mathbb {R}}^d)\), which satisfies conservation of momentum and energy:

$$\begin{aligned} \forall \, t \ge 0, \quad \int _{{\mathbb {R}}^{d}} v \, f_t (\mathrm{d}v) = \int _{{\mathbb {R}}^{d}}v \, f_{\mathrm{in}} (\mathrm{d}v), \quad \int _{{\mathbb {R}}^{d}} |v|^2 \, f_t (\mathrm{d}v) = \int _{{\mathbb {R}}^{d}}|v|^2 \, f_{\mathrm{in}} (\mathrm{d}v). \end{aligned}$$

5.2 Statement of the result

On the one hand, it is well known that for the collision kernel that we have chosen the \(N\)-particle Markov process \(({\mathcal {V}}^N_t)\) described above is well defined for any initial velocity \({\mathcal {V}}^N_0\), and in particular, for any given initial law \(f^N_{\mathrm{in}} \in {\mathcal {P}}_{\mathrm{sym}}(({\mathbb {R}}^d)^N)\) there exists a unique solution \(f^N_t \in {\mathcal {P}}_{\mathrm{sym}}(({\mathbb {R}}^d)^N)\) to Eqs. (5.2)–(5.3) so that the \(N\)-particle semigroup \(S^N_t\) is well defined, see [24, 25, 30, 37]. On the other hand, it is also well known that for any \(f_{\mathrm{in}} \in {\mathcal {P}}_q({\mathbb {R}}^d), q \ge 0\), the nonlinear Boltzmann equation (5.4)–(5.5) has a unique solution \(f_t \in {\mathcal {P}}_q({\mathbb {R}}^d)\). This solution conserves momentum and energy as soon as \(q \ge 2\), see for instance [14, 37–39].

Our mean field limit result then states as follows.

Theorem 5.1

(The Boltzmann equation for Maxwell molecules with Grad’s cut-off) Consider an initial distribution \(f_{in} \in {\mathcal {P}}_q({\mathbb {R}}^d), q \ge 2\), the hierarchy of \(N\)-particle distributions \(f^N_t = S^N_t(f_{in}^{\otimes N})\) following (5.2), and the solution \(f_t = S^{N\!L}_t(f_{in})\) following (5.4).

Then there is a constant \(C>0\) and, for any \(T>0\), there are constants \(C_{T}, \tilde{C}_T >0\) such that for any

$$\begin{aligned} \varphi = \varphi _1 \otimes \dots \otimes \, \varphi _\ell \in {\mathcal {F}}^{\otimes \ell }, \quad {\mathcal {F}}:= C_b({\mathbb {R}}^d) \cap \hbox {Lip}({\mathbb {R}}^d), \quad \Vert \varphi _j\Vert _{\mathcal {F}}\le 1, \end{aligned}$$

we have for \(N \ge 2 \ell \):

$$\begin{aligned} \sup _{[0,T]}\left| \left\langle \left( S^N_t(f_{\mathrm{in}}^N) - \left( S^{N\!L}_t (f_{\mathrm{in}})\right) ^{\otimes N}\right) , \varphi \right\rangle \right| \le C \, \frac{\ell ^2}{N} + C_{T} \, {\ell ^2 \over N}+ \tilde{C}_{T} \, \ell \, \Omega ^{W_2}_N (f_{\mathrm{in}})\nonumber \\ \end{aligned}$$
(5.7)

where \(\Omega ^{W_2}_N\) was defined in (2.7) and \(W_2\) is the quadratic MKW distance defined in (3.3).

As a consequence of (5.7) and Lemma 4.6, this implies propagation of chaos with rate \({\varepsilon }(N) \le C(\ell ,T,f_{\mathrm{in}}) \, N^{-{1 \over d+4}}\) for any initial data \(f_{\mathrm{in}} \in P_{d+5}({\mathbb {R}}^d)\), where \( C(\ell ,T,f_{\mathrm{in}})\) is an explicitly computable constant.
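
The exponent in this rate can be traced in one line. The following sketch assumes that \(\Omega ^{W_2}_N(f_{\mathrm{in}})\) in (2.7) denotes the \(f_{\mathrm{in}}^{\otimes N}\)-average of \(W_2(\mu ^N_Z, f_{\mathrm{in}})\), as the proof of Lemma 4.5 suggests. By the Cauchy–Schwarz inequality and Lemma 4.6,

$$\begin{aligned} \Omega ^{W_2}_N(f_{\mathrm{in}}) = \int _{{\mathbb {R}}^{dN}} W_2(\mu ^N_Z, f_{\mathrm{in}}) \, f^{\otimes N}_{\mathrm{in}}(\mathrm{d}Z) \le \left( \int _{{\mathbb {R}}^{dN}} W_2(\mu ^N_Z, f_{\mathrm{in}})^2 \, f^{\otimes N}_{\mathrm{in}}(\mathrm{d}Z)\right) ^{1/2} \le C^{1/2} \, N^{-\frac{1}{d+4}}. \end{aligned}$$

Since \(1/(d+4) < 1\), this term dominates the two terms of order \(\ell ^2/N\) in (5.7) for large \(N\).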

For the Boltzmann equation with bounded kernel, propagation of chaos has been established by McKean in [29], where he adapted the method introduced by Kac in [24] based on the Wild sum representation of the solutions to the Boltzmann equation. Grünbaum in [21] gave an alternative proof based on the same “duality viewpoint” as developed in the present paper. Sznitman in [35] also gave a proof of propagation of chaos based on a nonlinear martingale approach. In all these works, propagation of chaos is proved but without any rate of convergence (as the number of particles goes to infinity). Graham and Méléard in [18, 19, 30] were then able to prove propagation of chaos with the sharp rate \(C(\ell ,T)/N\). Their proof is based on the construction of a stochastic tree associated to the process \({\mathcal {V}}^N_t\) which is specific to the Boltzmann equation with bounded kernel. More recently Kolokoltsov in [26] proved a fluctuation estimate for similar processes using a “duality viewpoint” like the one developed by Grünbaum and used also in our work. His fluctuation estimate is similar to our Lemma 4.9 (and of the same order), but pays less attention to the remaining terms of our estimate. Fournier and Godinho [15] prove the propagation of chaos for a one-dimensional caricature of the Boltzmann equation using a coupling method in the spirit of [36, 37]. Their chaoticity estimate is of the same rate as ours.

5.3 Proof of Theorem 5.1

The assumptions (A1)–(A2)–(A3)–(A4)–(A5) needed to apply Theorem 2.1 will be verified step by step. In this proof we fix \({\mathcal {F}}_1={\mathcal {F}}_2=C_0({\mathbb {R}}^d)\) and \({\mathcal {F}}_3 = \text{ Lip }({\mathbb {R}}^d)\) and define \({\mathcal {P}}_{{\mathcal {G}}_1}(E) = {\mathcal {P}}_{{\mathcal {G}}_2}(E) := {\mathcal {P}}({\mathbb {R}}^d)\) endowed with the total variation norm \(\Vert \cdot \Vert _{TV}, {\mathcal {P}}_{{\mathcal {G}}_3}(E) := {\mathcal {P}}_2({\mathbb {R}}^d)\) endowed with the quadratic MKW distance \(W_2\). Notice that \(({\mathcal {G}}_1,{\mathcal {F}}_1)\) and \(({\mathcal {G}}_2,{\mathcal {F}}_2)\) satisfy a duality inequality of type 1, and \((P_{{\mathcal {G}}_3}, {\mathcal {F}}_3)\) satisfy a duality of type 2 (see Definition 3.13).

Proof of (A1) The symmetry assumption is satisfied because of the well-known properties of the Boltzmann-Kac \(N\)-particle system, and we refer to the previous works [18, 19, 21, 29, 30] for details.

Proof of (A3) We claim that there exists \(C_1\in {\mathbb {R}}_+\) such that for all \(\Phi \in C^{1,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathbb {R}})\)

$$\begin{aligned} \left\| G^N (\Phi \circ \mu ^N_V ) - \left\langle Q(\mu ^N_V,\mu ^N_V), D\Phi [\mu ^N_V]\right\rangle \right\| _{L^\infty (E^N)} \le {C_1 \over N}\Vert \Phi \Vert _{C^{1,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathbb {R}})},\quad \quad \end{aligned}$$
(5.8)

which is nothing but (A3) with \(k=\eta =1\) and \({\varepsilon }(N)=C_1 \, N^{-1}\).

Take \(\Phi \in C^{1,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathbb {R}})\), set \(\phi = D\Phi [\mu ^N_V]\) and compute

$$\begin{aligned} G^N (\Phi \circ \mu ^N_V )&= \frac{1}{N}\sum _{1\le i<j\le N} \int _{\mathbb {S}^{d-1}} b(\theta _{ij}) \left[ \Phi (\mu ^N_{V^*_{ij}}) - \Phi ( \mu ^N_V)\right] \, \mathrm{d}\sigma \\&= \frac{1}{N}\sum _{1\le i<j\le N} \int _{\mathbb {S}^{d-1}} b(\theta _{ij}) \, \langle \mu ^N_{V^*_{ij}} -\mu ^N_V, \phi \rangle \, \mathrm{d}\sigma \quad (= I_1(V)) \\&+ \frac{1}{N}\sum _{1\le i<j\le N} \int _{\mathbb {S}^{d-1}} {\mathcal {O}} \left( \Vert \Phi \Vert _{C^{1,1}} \, \left\| \mu ^N_{V^*_{ij}} -\mu ^N_V\right\| _{TV}^{2}\right) \, \mathrm{d}\sigma \quad (= I_2(V)). \end{aligned}$$

On the one hand, we have

$$\begin{aligned} I_1&= {1 \over 2N^2}\sum _{i,j= 1}^N \int _{\mathbb {S}^{d-1}} b(\theta _{ij}) \,\left[ \phi (v^*_i) + \phi (v^*_j) - \phi (v_i) - \phi (v_j)\right] \, \mathrm{d}\sigma \\&= {1 \over 2} \int _{{\mathbb {R}}^d} \int _{{\mathbb {R}}^d} \int _{\mathbb {S}^{d-1}} b(\theta ) \, \left[ \phi (v^*) + \phi (w^*) - \phi (v) - \phi (w)\right] \, \mu ^N_V (\mathrm{d}v) \, \mu ^N_V (\mathrm{d}w) \, \mathrm{d}\sigma \\&= \left\langle Q(\mu ^N_V,\mu ^N_V), \phi \right\rangle . \end{aligned}$$

On the other hand, we have

$$\begin{aligned} I_2 (V)&= {1 \over 2N}\sum _{i,j= 1}^N \int _{\mathbb {S}^{d-1}} {\mathcal {O}} \left( \Vert \Phi \Vert _{C^{1,1}} \, \left( {4 \over N}\right) ^{2}\right) \, \mathrm{d}\sigma \\&\le 8 \, \, {\Vert \Phi \Vert _{C^{1,1}} \over N} \, \sum _{i,j = 1}^N {1 \over N^2} \le 8 \, \Vert b \Vert {\Vert \Phi \Vert _{C^{1,1}} \over N}. \end{aligned}$$

Collecting these two terms we have proved that (5.8) holds.

Proof of (A4) Here we prove that for any \(f, \, g \in P({\mathbb {R}}^d)\) and for any \(T>0\)

$$\begin{aligned} \sup _{t \in [0,T]} \Big \Vert S^{N\!L}_t(g) \!-\! S^{N\!L}_t(f) \!-\! {\mathcal {LS}}_t^{\infty }[f] (g - f) \Big \Vert _{TV} \le e^{4 \, \Vert \gamma \Vert _\infty \, T} \, \Vert g - f\Vert _{TV}^2,\quad \end{aligned}$$
(5.9)

where \({\mathcal {LS}}^{\infty }_t[f]\) is the linearization of \(S^{N\!L}_t\) at \(f\). As a consequence, this implies that (A4) holds with \(k=\eta =1\) and the previous definitions of \({\mathcal {P}}_{{\mathcal {G}}_1}(E)\) and \({\mathcal {P}}_{{\mathcal {G}}_2}(E)\). We denote by \(f_t, g_t, h_t\) the solutions to the following equations:

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \partial _t f_t = Q(f_t,f_t), \quad f_{|t=0} = f_{\mathrm{in}}, \\ \displaystyle \partial _t g_t = Q(g_t,g_t), \quad g_{|t=0} = g_{\mathrm{in}}, \\ \displaystyle \partial _t h_t = 2 \tilde{Q}(f_t,h_t) := Q(f_t,h_t) + Q(h_t,f_t), \quad h_{|t=0} = h_{\mathrm{in}}, \end{array}\right. \end{aligned}$$

where \(\tilde{Q}\) denotes the symmetrized form of the bilinear collision operator. The third equation corresponds to the first-order variation of the semigroup: \({\mathcal {LS}}^{\infty }_t[f](h_{\mathrm{in}}) =h_t\), where \(h_t\) is the solution to this equation.

Standard Gronwall arguments show the existence and uniqueness of such solutions, which moreover satisfy, uniformly on \([0,T]\)

$$\begin{aligned} \Vert h_t\Vert _{TV} \le e^{2 \, T} \,\Vert h_{\mathrm{in}}\Vert _{TV}, \quad \Vert g_t -f_t\Vert _{TV} \le e^{2 \, T} \, \Vert g_{\mathrm{in}} - f_{\mathrm{in}}\Vert _{TV}. \end{aligned}$$

Next, writing \(r_t := g_t - f_t - h_t\), we find that this expression satisfies the equation

$$\begin{aligned} \partial _t r_t = \tilde{Q} ( f_t+g_t, r_t) + \tilde{Q}(g_t - f_t,h_t), \qquad r_{\mathrm{in}} = 0. \end{aligned}$$

Introducing \(y_t := \Vert r_t\Vert _{TV}\), we have

$$\begin{aligned} y'_t&\le {1 \over 2} \, \Vert \tilde{Q} ( f_t+g_t, r_t)\Vert _{TV} + \Vert \tilde{Q}(g_t - f_t,h_t)\Vert _{TV} \\&\le \Vert \gamma \Vert _\infty \, \Vert f_t+g_t\Vert _{TV} \, \Vert r_t\Vert _{TV} + C \, \Vert g_t - f_t\Vert _{TV} \, \Vert h_t \Vert _{TV} \\&\le C \, y_t + C \, e^{4t}\, \Vert g_{\mathrm{in}} - f_{\mathrm{in}} \Vert _{TV}^2, \end{aligned}$$

from which we deduce

$$\begin{aligned} \forall \, t \in [0,T], \quad y_t \le e^{4T}\, \Vert g_{\mathrm{in}} - f_{\mathrm{in}} \Vert _{TV}^2. \end{aligned}$$

This concludes the proof of (5.9).

Proof of (A2) Assumption (A2)-(i) is clearly a consequence of (A4). For (A2)-(ii) we write

$$\begin{aligned} \Vert Q(f,f) - Q(g,g) \Vert _{TV}&= \sup _{\Vert \varphi \Vert _{L^\infty }\le 1}\int _{E}(Q(f,f) - Q(g,g)) \, \varphi \, \mathrm{d}v \\&= \sup _{\Vert \varphi \Vert _{L^\infty }\le 1}\int _{E \times E} (f \, f_* - g \, g_*) \int _{{\mathbb {S}}^{d-1}} b \, (\varphi ' - \varphi ) \, \mathrm{d}\sigma \, \mathrm{d}v \, \mathrm{d}v_* \\&\le 4 \, \Vert b \Vert _{L^1} \, \Vert f - g \Vert _{TV}, \end{aligned}$$

so that the function \(f \mapsto Q(f,f)\) is Lipschitz from \({\mathcal {P}}_{{\mathcal {G}}_1}(E)\) to \(M^1({\mathbb {R}}^d)\).

Proof of (A5) It is known since the seminal work of Tanaka [37] that the nonlinear Boltzmann flow associated to Maxwellian molecules is a contraction for the quadratic MKW distance \(W_2\): for any \(f_{\mathrm{in}}, \, g_{\mathrm{in}} \in {\mathcal {P}}_1({\mathbb {R}}^d)\) the solutions \(f_t, \,g_t\) to the Boltzmann Eq. (5.4) satisfy

$$\begin{aligned} \sup _{[0,T]} W_2(f_t,g_t) \le W_2(f_{\mathrm{in}},g_{\mathrm{in}}). \end{aligned}$$

That immediately implies (A5) in the space \({\mathcal {P}}_{{\mathcal {G}}_3}(E)\) defined above. \(\square \)

6 Vlasov and McKean–Vlasov equations

6.1 The model

In this section we assume that \(E = {\mathbb {R}}^m\) (where \(m=d\) or \(m=2d\) with \(d\) the physical space dimension, see later) and we consider an \(N\)-particle system which undergoes McKean–Vlasov type stochastic dynamics, i.e. a deterministic drift force field combined with diffusion. We refer to the lecture notes [30, 36] and the references therein for more details on the model and, among many references, we highlight the recent paper [5] for recent results and references (using the so-called “coupling” method). The method we shall present here does not rely on any of these references. The results in this section are mostly not new but, as far as we know, they compare with the latest results on the mean field limit for this equation. Indeed we shall make strong smoothness assumptions on the coefficients of the evolution equation in order to avoid technical difficulties, and our goal is to advocate for our new method and show its power and its ability to deal with very different models.

We assume that the \(N\) particles \({\mathcal {Z}}^N_t = ({\mathcal {Z}}_{1,t}, \dots , {\mathcal {Z}}_{N,t})\) satisfy the stochastic differential equation

$$\begin{aligned} \mathrm{d} {\mathcal {Z}}_{i,t} = \sigma _i({\mathcal {Z}}_{i,t}) \, \mathrm{d} \fancyscript{B}_{i,t} + \fancyscript{T} {\mathcal {Z}}_{i,t} \, \mathrm{d} t + F^{N}_i ({\mathcal {Z}}^N_t) \, \mathrm{d}t \qquad 1 \le i \le N, \end{aligned}$$
(6.1)

where the \(\sigma (z_i)\) are the diffusion \(m \times m\)-matrices, the \(\fancyscript{B}_{i,t}\) are independent standard Wiener processes valued in \({\mathbb {R}}^m\), \(\fancyscript{T}\) is an \(m\times m\)-matrix and the \(F^{N}_i : {\mathbb {R}}^{mN} \rightarrow {\mathbb {R}}^m\) are the force fields acting on each particle. Because of indistinguishability we assume

$$\begin{aligned} F^{N}_i (Z) := F^N\left( z_i, \mu ^{N-1}_{\hat{Z}^N_i}\right) \end{aligned}$$

with \(\hat{Z}^N_i:= (z_1, \dots ,z_{i-1},z_{i+1}, \dots , z_N)\) and \(F^N: {\mathbb {R}}^m \times {\mathcal {P}}({\mathbb {R}}^m) \rightarrow {\mathbb {R}}^m\). (Note that here and below the Latin letters “\(i, j\), ...” label the particles, whereas the Greek letters “\(\alpha , \beta \), ...” label the coordinates).

We assume that \(F^N\) is uniformly bounded and Lipschitz in both variables (when endowing \({\mathcal {P}}({\mathbb {R}}^m)\) with a distance inherited from a negative Sobolev norm). More precisely, we assume that for any \(k > m/2\) there exists \(C_{F,k} > 0\) such that for any \(z, \tilde{z} \in {\mathbb {R}}^m, f, \tilde{f} \in {\mathcal {P}}({\mathbb {R}}^m)\)

$$\begin{aligned} \forall \, N \in {\mathbb {N}}^*, \quad \left| F^N(z,f) - F^N(\tilde{z}, \tilde{f})\right| \le C_{F,k} \, \Big [|z - \tilde{z}| + \Vert f - \tilde{f}\Vert _{H^{-k}}\Big ]. \end{aligned}$$
(6.2)

For the limit to exist, it is also natural to assume that there exists a function \(F : {\mathbb {R}}^m \times {\mathcal {P}}({\mathbb {R}}^m) \rightarrow {\mathbb {R}}^m\) such that \(F^N\rightarrow F\), in the sense that there is a constant \(C_{F,lim} >0\) such that

$$\begin{aligned} \forall \, N \in {\mathbb {N}}^*, \ \forall \, z \in {\mathbb {R}}^m, \ \forall \, f \in {\mathcal {P}}({\mathbb {R}}^m), \quad \left| F^N(z,f) - F(z,f)\right| \le \frac{C_{F,lim}}{N}.\quad \quad \end{aligned}$$
(6.3)

A simple example which satisfies these assumptions is

$$\begin{aligned} F^N_i \left( Z,\mu ^{N-1}_{\hat{Z}_i}\right) = \frac{N}{N-1} \, F^N\left( z_i, \hat{Z}_i\right) , \quad F^N\left( z_i, \hat{Z}_i\right) := {1 \over N}\, \sum _{j \not = i} \fancyscript{U} (z_i - z_j)\nonumber \\ \end{aligned}$$
(6.4)

for a smooth vector field \(\fancyscript{U}: {\mathbb {R}}^m \rightarrow {\mathbb {R}}^m\), so that \(F(z,f) = (\fancyscript{U} * f)(z)\).
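As a purely illustrative numerical sketch of the example (6.4), which plays no role in the argument, the following Python code evaluates the pairwise force on each particle and compares the normalization \(1/(N-1)\) (convolution of \(\fancyscript{U}\) with the empirical measure of the other \(N-1\) particles) with the normalization \(1/N\). The function names and the particular kernel are assumptions made for the illustration only.

```python
import numpy as np

def pairwise_force(Z, U, normalization):
    """Force on each particle: (1/normalization) * sum_{j != i} U(z_i - z_j).

    Z has shape (N, m); U maps an array of shape (..., m) to the same shape.
    """
    N = Z.shape[0]
    diffs = Z[:, None, :] - Z[None, :, :]      # diffs[i, j] = z_i - z_j
    values = U(diffs)                          # shape (N, N, m)
    off_diagonal = ~np.eye(N, dtype=bool)      # mask removing the terms j = i
    values = values * off_diagonal[:, :, None]
    return values.sum(axis=1) / normalization

def smooth_kernel(z):
    """Hypothetical smooth bounded kernel U(z) = -z * exp(-|z|^2)."""
    return -z * np.exp(-np.sum(z**2, axis=-1, keepdims=True))
```

For a bounded kernel the two normalizations differ by at most \(\Vert \fancyscript{U}\Vert _\infty /N\) on each particle, which is the type of \({\mathcal {O}}(1/N)\) discrepancy allowed by (6.3).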

Under the smoothness assumptions (6.2) on the \(N\)-particle force fields, for any \(N \ge 1\) there exists a Markov process \(({\mathcal {Z}}^N_t)_{t \ge 0}\) which solves the system of stochastic differential equations (6.1), see [30, 36].
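To fix ideas, the particle system (6.1) can be simulated with an explicit Euler–Maruyama scheme. This discretization is our choice for illustration and is not used anywhere in the paper; the sketch further assumes a constant scalar noise amplitude, a pairwise force with normalization \(1/N\), and hypothetical function and parameter names.

```python
import numpy as np

def euler_maruyama(Z0, sigma, T_mat, U, dt, n_steps, rng):
    """Explicit Euler-Maruyama scheme for
    dZ_i = sigma dB_i + (T_mat Z_i) dt + (1/N) sum_{j != i} U(Z_i - Z_j) dt.

    Z0: (N, m) initial positions; sigma: constant scalar noise amplitude
    (a simplifying assumption); T_mat: (m, m) matrix; U: kernel on (..., m).
    """
    Z = np.array(Z0, dtype=float)              # copy, Z0 is left untouched
    N, m = Z.shape
    mask = ~np.eye(N, dtype=bool)              # removes the terms j = i
    for _ in range(n_steps):
        diffs = Z[:, None, :] - Z[None, :, :]  # diffs[i, j] = z_i - z_j
        force = (U(diffs) * mask[:, :, None]).sum(axis=1) / N
        noise = rng.normal(size=(N, m)) * np.sqrt(dt)
        Z = Z + sigma * noise + (Z @ T_mat.T + force) * dt
    return Z
```

With `sigma = 0` and an odd kernel, the pairwise forces cancel in the sum over particles, so the scheme conserves the empirical mean; this gives a quick sanity check of an implementation.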

The time-dependent law \(f^N_t\) of the process \({\mathcal {Z}}^N_t\) satisfies the following linear master equation corresponding to (6.1), given in dual form by

$$\begin{aligned} \forall \, \varphi \in {\mathcal {D}}({\mathbb {R}}^m), \quad \partial _t \left\langle f^N_t,\varphi \right\rangle = \left\langle f^N_t,G^N \, \varphi \right\rangle \end{aligned}$$
(6.5)

where \(G^N\) is defined by

$$\begin{aligned} \forall \, Z \in {\mathbb {R}}^{mN}, \quad (G^N \varphi )(Z)&= \sum _{i=1}^N A(z_i) : \nabla ^2_i \varphi + \sum _{i=1}^N (\fancyscript{T}z_i) \cdot \nabla _i \varphi \nonumber \\&+ \sum _{i=1}^N F^N\left( z_i, \mu ^{N-1}_{\hat{Z}_i}\right) \cdot \nabla _i \varphi . \end{aligned}$$

The nonnegative diffusion matrix \(A\), the gradient \(\nabla _i\) and the Hessian matrix \(\nabla ^2_i\) associated to the variable \(z_i = (z_{i,1}, \dots , z_{i,m}) \in {\mathbb {R}}^m\) corresponding to the \(i\)-th particle are given by

$$\begin{aligned} A = {1 \over 2} \, \sigma \, \sigma ^* = \left( A_{\alpha ,\beta }\right) _{1 \le \alpha ,\beta \le m}, \quad A_{\alpha ,\beta } = {1 \over 2} \, \sum _{\gamma =1}^m \sigma _{\alpha ,\gamma } \, \sigma _{\beta ,\gamma }, \end{aligned}$$

and

$$\begin{aligned} \nabla _i \varphi = \left( \partial _{z_{i,\alpha }} \varphi \right) _{1 \le \alpha \le m}, \quad \nabla ^2_i \varphi = \left( \partial ^2_{z_{i, \alpha } z_{i,\beta }} \varphi \right) _{1 \le \alpha ,\beta \le m}. \end{aligned}$$
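As a small illustration of the definition \(A = {1 \over 2} \sigma \sigma ^*\), the following sketch builds \(A\) from a real matrix \(\sigma \) and computes its smallest eigenvalue, which is the quantity that a lower bound of the type \(A \ge \kappa \, \hbox {Id}\) controls. The function names are hypothetical and \(\sigma \) is taken constant for simplicity.

```python
import numpy as np

def diffusion_matrix(sigma):
    """A = (1/2) sigma sigma^T for a real (m, m) matrix sigma."""
    sigma = np.asarray(sigma, dtype=float)
    return 0.5 * sigma @ sigma.T

def ellipticity_constant(A):
    """Smallest eigenvalue of the symmetric matrix A."""
    return float(np.linalg.eigvalsh(A).min())
```

By construction \(A\) is symmetric and nonnegative, so the smallest eigenvalue is nonnegative; it is positive exactly when \(\sigma \) is invertible.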

We also introduce the nonlinear mean field McKean–Vlasov equation on \({\mathcal {P}}({\mathbb {R}}^m)\):

$$\begin{aligned} {\partial \over \partial t} f = Q(f_t), \quad f_{|t=0} = f_{\mathrm{in}} \quad \hbox {in}\quad {\mathcal {P}}({\mathbb {R}}^m), \end{aligned}$$
(6.6)

with

$$\begin{aligned} Q(f) = \sum _{\alpha ,\beta =1}^m \partial ^2_{\alpha ,\beta } \left( A_{\alpha ,\beta }\, f\right) - \sum _{\alpha =1}^m \partial _\alpha [(\fancyscript{T}z)_\alpha f] - \sum _{\alpha =1}^m \partial _\alpha \left( F_\alpha (z,f) \, f\right) . \end{aligned}$$

There is a large literature on this class of nonlinear partial differential equations. See for example [8] and [28], where more details and more references can be found.

In the sequel, we make the following strong structure, smoothness and boundedness assumptions on the coefficients:

$$\begin{aligned} (A \equiv 0, \ \kappa :=0) \,\, \quad \hbox {or}\quad \,\, (A \ge \kappa \, \hbox {Id}, \quad \kappa >0, \quad A \in W^{k,\infty }({\mathbb {R}}^m)), \end{aligned}$$
(6.7)

as well as

$$\begin{aligned} \forall \, z \in {\mathbb {R}}^m, \ \forall \, f \in {\mathcal {P}}({\mathbb {R}}^m), \quad F (z,f) = \int _{{\mathbb {R}}^m} \fancyscript{U} (z-\tilde{z}) \, f(\mathrm{d} \tilde{z}) \end{aligned}$$
(6.8)

where \(\fancyscript{U} \in H^{2k}_6({\mathbb {R}}^m)\) for some \(k \in {\mathbb {N}}, k > m/2 + 3\).

When the diffusion matrix \(A\) is zero, \(m=2d\) and \(z=(x,v) \in {\mathbb {R}}^{2d}\), our assumptions cover the case of the mean-field Vlasov equation. Indeed the classical Vlasov equation reads

$$\begin{aligned} \partial _t f + v \cdot \nabla _x f + (\nabla _x \psi *\rho [f]) \cdot \nabla _v f = 0, \quad f=f(t,x,v), \quad x, v \in {\mathbb {R}}^d, \end{aligned}$$

with

$$\begin{aligned} \rho [f](t,x) = \int _{{\mathbb {R}}^d} f(t,x,v) \, \mathrm{d}v, \end{aligned}$$

and it falls into our structural assumptions with \(z = (x,v) \in {\mathbb {R}}^d \times {\mathbb {R}}^d\), \(\fancyscript{U} (z) = \fancyscript{U} (x)= (0,\nabla _x \psi (x))\), where \(\nabla _x \psi \) belongs to \(H^{d+6+0}\), and

$$\begin{aligned} \fancyscript{T} = \left( \begin{matrix} 0_{xx} &{}\quad \text{ Id }_{xv} \\ 0_{vx} &{}\quad 0_{vv} \end{matrix}\right) . \end{aligned}$$

Then \(F=(F_x,F_v)\) defined by (6.8) is given by

$$\begin{aligned} F_x(x,v) = 0, \quad F_v(x,v) = \nabla _x \psi *\rho [f] \end{aligned}$$

for the limiting system, and, with \(X \in ({\mathbb {R}}^d)^N\) and \(V \in ({\mathbb {R}}^d)^N\), we have for the \(N\)-particle system \(\fancyscript{T}\) defined as above and \(F^N=(F^N_X,F^N_V)\) given by

$$\begin{aligned} F^N_X = 0, \quad (F^N_V)_i = \frac{1}{N} \sum _{j \not = i}^N \nabla _x \psi (X_i - X_j), \ i=1, \dots , N. \end{aligned}$$

Observe in particular that, due to the smoothness assumption on \(\psi \), our assumptions do not allow for the Coulomb or Newton interactions in this Vlasov setting.
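To make the identification explicit, here is a short formal check (for smooth \(f\), with the notation above) that (6.6) with \(A \equiv 0\) and these choices of \(\fancyscript{T}\) and \(\fancyscript{U}\) reduces to the Vlasov equation. Since \(\fancyscript{T}z = (v,0)\) and \(F_v = \nabla _x \psi *\rho [f]\) does not depend on \(v\), we have

$$\begin{aligned} - \sum _{\alpha =1}^{2d} \partial _\alpha [(\fancyscript{T}z)_\alpha f]&= - \nabla _x \cdot (v \, f) = - v \cdot \nabla _x f, \\ - \sum _{\alpha =1}^{2d} \partial _\alpha \left( F_\alpha (z,f) \, f\right)&= - \nabla _v \cdot \left( (\nabla _x \psi *\rho [f]) \, f\right) = - (\nabla _x \psi *\rho [f]) \cdot \nabla _v f, \end{aligned}$$

so that \(\partial _t f = Q(f)\) is exactly \(\partial _t f + v \cdot \nabla _x f + (\nabla _x \psi *\rho [f]) \cdot \nabla _v f = 0\).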

6.2 Statement of the result

Our main result in this section is a quantitative propagation of chaos result for the class of equations described above. We state two separate results, for the McKean–Vlasov case (possibly non-zero diffusion matrix) and for the Vlasov case (zero diffusion matrix) respectively.

Theorem 6.1

(The McKean–Vlasov equation) Consider an initial distribution \(f_{\mathrm{in}} \in {\mathcal {P}}_q({\mathbb {R}}^m), q \ge 2\), the hierarchy of \(N\)-particle distributions \(f^N_t = S^N_t(f_{\mathrm{in}}^{\otimes N})\) following (6.5) and the nonlinear evolution \(f_t = S^{N\!L}_t(f_{\mathrm{in}})\) following (6.6). Assume that (6.7) and (6.8) hold.

Then there is \(k \in {\mathbb {N}}\) and a constant \(C>0\) and, for any \(T>0\), there are constants \(C_{T}, \tilde{C}_T >0\) such that for any

$$\begin{aligned} \varphi = \varphi _1 \otimes \dots \otimes \, \varphi _\ell \in {\mathcal {F}}^{\otimes \ell }, \quad {\mathcal {F}}:= H^k_6({\mathbb {R}}^m) \cap \hbox {Lip}({\mathbb {R}}^m), \quad \Vert \varphi _j\Vert _{\mathcal {F}}\le 1, \end{aligned}$$

we have for \(N \ge 2 \ell \):

$$\begin{aligned} \sup _{[0,T]}\left| \left\langle \left( S^N_t(f_{\mathrm{in}}^{\otimes N}) - \left( S^{N\!L}_t (f_{\mathrm{in}})\right) ^{\otimes N}\right) , \varphi \right\rangle \right| \le C \, {\ell ^2 \over N} + C_{T} \, {\ell ^2 \over N} + \tilde{C}_{T} \, \ell \, \Omega ^{W_2}_N (f_{\mathrm{in}}). \end{aligned}$$
(6.9)

As a consequence, (6.9) and Lemma 4.6 imply the propagation of chaos with rate \({\varepsilon }(N) \le C(\ell ,T,f_{\mathrm{in}}) \, N^{-{1 \over m+4}}\) for any initial data \(f_{\mathrm{in}} \in {\mathcal {P}}_{m+5}({\mathbb {R}}^m)\).

Now we consider the case of the Vlasov equation. As will be clear from the proof, when \(A=0\) and \(\fancyscript{U}(0)=0\), the error \({\varepsilon }(N)\) in assumption (A3) vanishes. This leads to the following improved result.

Theorem 6.2

(The Vlasov equation) Suppose, in addition to the assumptions for Theorem 6.1, that \(A \equiv 0\) and \(\fancyscript{U} (0) = 0\). Then there is a constant \(C >0\) and, for any \(T>0\), a constant \(\tilde{C}_T>0\) such that for any \(\varphi \in \hbox {Lip}({\mathbb {R}}^{\ell m})\) and any \(N \ge \ell \):

$$\begin{aligned} \sup _{[0,T]}\left| \left\langle \left( S^N_t(f_{\mathrm{in}}^{\otimes N}) - \left( S^{N\!L}_t (f_{\mathrm{in}}) \right) ^{\otimes N}\right) , \varphi \right\rangle \right| \le C \, \Vert \nabla \varphi \Vert _{L^\infty ({\mathbb {R}}^{\ell m})} \, {\ell \over N} + \tilde{C}_{T} \, \Omega _N^{W_1} (f_{\mathrm{in}})\nonumber \\ \end{aligned}$$
(6.10)

(observe the replacement of \(W_2\) by \(W_1\) in the last term) which in turn implies

$$\begin{aligned} \sup _{[0,T]} {1 \over N}W_1 \left( S^N_t(f_{\mathrm{in}}^{\otimes N}), \left( S^{N\!L}_t (f_{\mathrm{in}})\right) ^{\otimes N}\right) \le {C \over N} + \frac{\tilde{C}_{T} \, \Omega _N^{W_1} (f_{\mathrm{in}})}{N}. \end{aligned}$$

Remark 6.3

Note that the coupling method introduced in [36] leads to a rate of chaoticity of order \({\mathcal {O}}(1/\sqrt{N})\) for the normalized Wasserstein distance \(W_2\) between the law of \({\mathcal {Z}}^N_t\) and the tensor product \(f_t^{\otimes N}\). This is better than our estimate, which is limited by the estimate in Lemma 4.6. However, the coupling method is usually limited to the quadratic interaction given by (6.4).

6.3 Proof of Theorem 6.1

As in the proof of Theorem 5.1 we prove that Theorem 6.1 is a consequence of Theorem 2.1, by verifying that assumptions (A1)–(A2)–(A3)–(A4)–(A5) hold. However, in the present model we cannot, as in Sect. 5, use the total variation norm for the key consistency estimate (A3) and differential stability estimate (A4). The reason is that \(G^N\pi ^N\Phi \) involves derivatives of \(Z\mapsto \Phi (\mu _Z^N)\), and hence of \(Z \mapsto \mu ^N_Z\), which is not differentiable from \({\mathbb {R}}^{mN}\) to \({\mathcal {P}}({\mathbb {R}}^m)\) when \({\mathcal {P}}({\mathbb {R}}^m)\) is endowed with the total variation norm. We therefore make the following choice of functional spaces: \(E := {\mathbb {R}}^m\) with

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l@{\quad }l} {\mathcal {G}}_1 := H^{-s_1}_{-2} ({\mathbb {R}}^m), &{} {\mathcal {F}}_1 = H^{s_1}_2({\mathbb {R}}^m), &{} s_1 > {m \over 2} + 2 \\ {\mathcal {G}}_2 := H^{-s_2}_{-6} ({\mathbb {R}}^m), &{} {\mathcal {F}}_2 = H^{s_2}_6({\mathbb {R}}^m), &{} s_2 := s_1 + 2, \end{array}\right. \end{aligned}$$

and the weight \(m_{{\mathcal {G}}_1}(z) = m_{{\mathcal {G}}_2}(z) = 1\), and \({\mathcal {F}}_3 = \text{ Lip }({\mathbb {R}}^m)\) and \({\mathcal {P}}_{{\mathcal {G}}_3}(E) := {\mathcal {P}}_2({\mathbb {R}}^m)\) endowed with the quadratic MKW distance \(W_2\).

Proof of assumption (A1) The symmetry assumption is a consequence of the fact that first (6.5) is well posed for any \(f^N_{\mathrm{in}} \in {\mathcal {P}}_{\mathrm{sym}}({\mathbb {R}}^{mN})\) so that \(f^N_t\) is a probability measure for any \(t \ge 0\), and second the generator \(G^N\) commutes with the permutations.

Proof of assumption (A3) We claim that for any \(s_1 >m/2+2\) there exists a constant \(C_{s_1}\) such that for all \(\Phi \in C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathbb {R}})\)

$$\begin{aligned} \left\| G^N (\Phi \circ \mu ^N_Z) - \left\langle Q(\mu ^N_Z), D\Phi [\mu ^N_Z]\right\rangle \right\| _{L^\infty (E^N)} \le {C_{s_1} \over N}\Vert \Phi \Vert _{C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathbb {R}})},\quad \quad \end{aligned}$$
(6.11)

which is nothing but (A3) with \(k=2, \eta =1\) and \({\varepsilon }(N) = C_{s_1} \, N^{-1}\).

Proof of (6.11)

First, the map

$$\begin{aligned} {\mathbb {R}}^{m N} \rightarrow H^{-s_1}({\mathbb {R}}^m), \quad Z \mapsto \mu ^N_Z \end{aligned}$$

is \(C^2\) with

$$\begin{aligned} \partial _{z_{i,\alpha }} \mu ^N_Z = {1 \over N} \, \partial _{{\alpha }} \delta _{z_i}, \qquad \partial ^2_{z_{i,\alpha }, z_{i,\beta }} \mu ^N_Z = {1 \over N} \,\partial ^2_{\alpha \beta } \delta _{z_i}. \end{aligned}$$

Take \(\Phi \in C_b^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E))\). Then the map

$$\begin{aligned} {\mathbb {R}}^{mN} \rightarrow {\mathbb {R}},\quad Z \mapsto \Phi (\mu ^N_Z) \end{aligned}$$

is \(C_b^2\). Indeed, denoting \(\phi = \phi _Z (\cdot ) = D\Phi \!\left[ \mu ^N_Z\right] \in (H^{-s_1}_{-2}({\mathbb {R}}^m))' = H^{s_1}_2({\mathbb {R}}^m)\), we can write:

$$\begin{aligned} \partial _{z_{i,\alpha }} \Phi \left( \mu ^N_Z\right)&= \left\langle D\Phi \!\left[ \mu ^N_Z\right] , {1 \over N} \, \partial _\alpha \delta _{z_i}\right\rangle = {1 \over N}\, \partial _{\alpha } \phi _Z (z_i)\\ \partial ^2_{z_{i,\alpha }, z_{i,\beta }} \Phi \left( \mu ^N_Z\right)&= \left\langle D\Phi \!\left[ \mu ^N_Z\right] , {1 \over N} \, \partial ^2_{z_{i,\alpha }, z_{i,\beta }} \delta _{z_i}\right\rangle \\&+ D^2\Phi \!\left[ \mu ^N_Z\right] \left( {1 \over N} \, \partial _{z_{i,\alpha }} \delta _{z_i}, {1 \over N} \, \partial _{z_{i,\beta }} \delta _{z_i}\right) \\&= {1 \over N} \, \partial ^2_{\alpha ,\beta } \phi _Z (z_i) + {1 \over N^2} \, D^2\Phi \!\left[ \mu ^N_Z\right] \left( \partial _{z_{i,\alpha }} \delta _{z_i}, \partial _{z_{i,\beta }} \delta _{z_i}\right) \end{aligned}$$

and both \(\partial _{z_{i,\alpha }} \delta _{z_i}\) and \(\partial ^2_{z_{i,\alpha }, z_{i,\beta }} \delta _{z_i}\) belong to \(H^{-s_1}_{-2}({\mathbb {R}}^m)\) thanks to the condition \(s_1 > m/2+2\).
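The two formulas above can be checked numerically on a simple example. The sketch below is an illustration under stated assumptions, not part of the proof: it takes \(m=1\) and the test functional \(\Phi (\mu ) = \langle \mu , \sin \rangle ^2\), for which \(\phi _Z = 2 \, \langle \mu ^N_Z, \sin \rangle \, \sin \) and \(D^2\Phi [\mu ](a,b) = 2 \, \langle a, \sin \rangle \, \langle b, \sin \rangle \), and compares the formulas with centered finite differences. All function names are hypothetical.

```python
import numpy as np

def Phi(Z):
    """Phi(mu^N_Z) = <mu^N_Z, sin>^2 for a configuration Z of shape (N,)."""
    return np.mean(np.sin(Z)) ** 2

def first_derivative(Z, i):
    """(1/N) * d/dz phi_Z(z_i), with phi_Z = 2 <mu, sin> sin."""
    N = Z.size
    M = np.mean(np.sin(Z))
    return (1.0 / N) * 2.0 * M * np.cos(Z[i])

def second_derivative(Z, i):
    """(1/N) phi_Z''(z_i) + (1/N^2) D^2 Phi applied to the derivative of delta_{z_i}, twice."""
    N = Z.size
    M = np.mean(np.sin(Z))
    linear_part = (1.0 / N) * 2.0 * M * (-np.sin(Z[i]))
    quadratic_part = (1.0 / N**2) * 2.0 * np.cos(Z[i]) ** 2
    return linear_part + quadratic_part

def finite_difference(Z, i, eps=1e-4):
    """Centered finite differences of Z -> Phi(mu^N_Z) in the variable z_i."""
    e = np.zeros_like(Z)
    e[i] = eps
    d1 = (Phi(Z + e) - Phi(Z - e)) / (2 * eps)
    d2 = (Phi(Z + e) - 2 * Phi(Z) + Phi(Z - e)) / eps**2
    return d1, d2
```

The term of order \(1/N^2\) is the one that produces the consistency error of order \(1/N\) after summation over the \(N\) particles.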

As a consequence, we compute

$$\begin{aligned} \left( G^N \pi ^N \Phi \right) (Z)&= G^N \, \Phi (\mu ^N_Z) \\&= \sum _{i=1}^N A(z_i) : \nabla ^2_i \left( \Phi (\mu ^N_Z)\right) \\&+ \sum _{i=1}^N (\fancyscript{T}z_i) \cdot \nabla _i \left( \Phi (\mu ^N_Z)\right) \!+\! \sum _{i=1}^N F^N\left( z_i, \mu ^{N-1}_{\hat{Z}_i}\right) \cdot \nabla _i \left( \Phi (\mu ^N_Z)\right) \\&=: I_1(Z) + I_2(Z) \end{aligned}$$

with

$$\begin{aligned} I_1(Z)&:= \frac{1}{N} \, \sum _{i=1}^N \sum _{\alpha ,\beta =1}^m A_{\alpha ,\beta }(z_i) \, \partial ^2_{\alpha ,\beta } \phi _Z(z_i)\\&+ \frac{1}{N} \, \sum _{i=1}^N \sum _{\alpha =1}^m (\fancyscript{T} z_i)_{\alpha } \, \partial _\alpha \phi _Z(z_i) + \frac{1}{N} \, \sum _{i=1}^N \sum _{\alpha =1}^m F_\alpha \left( z_i, \mu ^N_Z\right) \, \partial _{\alpha } \phi _Z (z_i) \end{aligned}$$

and

$$\begin{aligned} I_2(Z)&:= \frac{1}{N^2} \, \sum _{i=1}^N \sum _{\alpha ,\beta =1}^m A_{\alpha ,\beta }(z_i) \, D^2\Phi \!\left[ \mu ^N_Z\right] \left( \partial _{z_{i,\alpha }} \delta _{z_i}, \partial _{z_{i,\beta }} \delta _{z_i}\right) \\&+ \frac{1}{N} \, \sum _{i=1}^N \sum _{\alpha =1}^m \left[ F^N_\alpha \left( z_i, \mu ^{N-1}_{\hat{Z}_i}\right) - F_\alpha \left( z_i, \mu ^{N}_Z\right) \right] \,D\Phi \!\left[ \mu ^N_Z\right] \left( \partial _{z_{i,\alpha }} \delta _{z_i}\right) . \end{aligned}$$

On the one hand, using that

$$\begin{aligned} \left\| \mu ^{N-1}_{\hat{Z}_i} - \mu ^{N}_Z\right\| _{H^{-s_1}_{-2} ({\mathbb {R}}^m)} \le \frac{2}{N} \, \sup _{z_i} \left\| \delta _{z_i}\right\| _{H^{-s_1}({\mathbb {R}}^m)} \le \frac{C}{N} \end{aligned}$$

as well as (6.2) and (6.3), we deduce that

$$\begin{aligned} |I_2(Z)|&\le N \, \frac{m^2}{N^2} \, \Vert A \Vert _\infty \, \Vert D^2 \Phi \Vert _\infty \, \Vert \partial _1 \delta \Vert ^2_{H^{-s_1}_{-2}({\mathbb {R}}^m)} \\&+ N \, m \, \left( \frac{C_{F}}{N}\right) \, {1 \over N} \, \Vert D \Phi \Vert _\infty \, \Vert \partial _1 \delta \Vert _{H^{-s_1}_{-2}({\mathbb {R}}^m)} \le \frac{C_\Phi }{N}. \end{aligned}$$

On the other hand, we recognize

$$\begin{aligned} I_1 (Z)&= \left\langle \mu ^{N}_Z \, , \, \sum _{\alpha ,\beta =1}^m A_{\alpha ,\beta } \, \partial ^2_{\alpha ,\beta } \phi _Z\right\rangle \\&+ \left\langle \mu ^{N}_Z \, , \, \sum _{\alpha =1}^m (\fancyscript{T} \cdot )_{\alpha } \, \partial _{\alpha } \phi _Z\right\rangle + \left\langle \mu ^{N}_Z \, , \, \sum _{\alpha =1}^m F_\alpha \left( \cdot , \mu ^{N}_Z\right) \, \partial _{\alpha } \phi _Z\right\rangle \\&= \left\langle Q(\mu ^{N}_Z), \phi _Z\right\rangle = \left\langle Q(\mu ^{N}_Z), D\Phi (\mu ^{N}_Z)\right\rangle = \left( \pi ^N G^\infty \Phi \right) (Z), \end{aligned}$$

thanks to the calculation of the limit dual generator made in Sect. 4.1. \(\square \)
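The \({\mathcal {O}}(1/N)\) size of the consistency error can be observed on a toy case. The following sketch is an illustration only, under the simplifying assumptions \(m=1\), constant diffusion coefficient \(A = a\), \(\fancyscript{T} = 0\), \(F = 0\) and \(\Phi (\mu ) = \langle \mu , \sin \rangle ^2\); the function names are hypothetical. In this case the remainder \(I_2\) equals \((2a/N) \, \langle \mu ^N_Z, \cos ^2 \rangle \).

```python
import numpy as np

def generator_on_Phi(Z, a):
    """G^N applied to Z -> Phi(mu^N_Z) for Phi(mu) = <mu, sin>^2,
    with m = 1, constant diffusion coefficient A = a, T = 0 and F = 0."""
    N = Z.size
    M = np.mean(np.sin(Z))
    # second derivative in z_i: (1/N) phi_Z''(z_i) + (2/N^2) cos(z_i)^2
    second = (2.0 * M * (-np.sin(Z))) / N + 2.0 * np.cos(Z) ** 2 / N**2
    return a * second.sum()

def limit_generator_on_Phi(Z, a):
    """<Q(mu^N_Z), DPhi[mu^N_Z]> = <mu^N_Z, a phi_Z''>, phi_Z = 2 <mu, sin> sin."""
    M = np.mean(np.sin(Z))
    return a * np.mean(2.0 * M * (-np.sin(Z)))
```

The difference of the two functions is nonnegative and bounded by \(2a/N\) for every configuration \(Z\), in agreement with (6.11).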

Proof of assumption (A4) We need here to perform a second-order expansion of the limit semigroup.

We consider

  • for any two given initial data \(f_{\mathrm{in}}, g_{\mathrm{in}} \in {\mathcal {P}}({\mathbb {R}}^m)\) the corresponding solutions \(f_t\) and \(g_t\) to the nonlinear McKean–Vlasov (or Vlasov) Eq. (6.6),

  • for any given initial data \(h_{\mathrm{in}} \in {\mathcal {P}}({\mathbb {R}}^m)\) the solution \(h_t\) to the following equation, which is the linearization around \(f_t\):

    $$\begin{aligned} \partial _t h = \nabla ^2 : (A \, h) - \nabla \cdot ((\fancyscript{T}z) \, h) - \nabla \cdot \left[ h \, (\fancyscript{U}*f) + f \, (\fancyscript{U}*h)\right] , \quad h_{t=0} = h_{\mathrm{in}}, \end{aligned}$$
    (6.12)
  • \(r_t\) the solution to the following second variation equation around \(f_t\)

    $$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \partial _t r = \nabla ^2 : (A \, r) - \nabla \cdot ((\fancyscript{T}z) \, r) - \nabla \cdot \left[ r \, (\fancyscript{U} *f) + f \, (\fancyscript{U}*r)\right] \\ \displaystyle \qquad \quad - \frac{1}{2} \nabla \cdot \left[ \tilde{h} \, (\fancyscript{U} *h)\right] - \frac{1}{2} \nabla \cdot \left[ h \, (\fancyscript{U} * \tilde{h})\right] ,\\ \displaystyle r_{|t=0} = r_{\mathrm{in}} = 0 \end{array}\right. \end{aligned}$$
    (6.13)

    for two solutions \(h, \tilde{h}\) of the first variation equation.

Then we shall prove the following a priori estimates.

Lemma 6.4

For any \(s_1 \in {\mathbb {N}}, s_1> m/2 +1, \ell \in \{1,2,3\}\) and for any \(T > 0\), there exists \(C_T\) such that

$$\begin{aligned}&\sup _{[0,T]} \Vert g_t - f_t \Vert _{H^{-s_1}_{-\ell }({\mathbb {R}}^m)} \le C_T \, \Vert g_{\mathrm{in}} - f_{\mathrm{in}}\Vert _{H^{-s_1}_{-\ell }({\mathbb {R}}^m)},\end{aligned}$$
(6.14)
$$\begin{aligned}&\sup _{[0,T]} \Vert h_t\Vert _{H^{-s_1}_{-\ell }({\mathbb {R}}^m)} \le C_T \, \Vert h_{\mathrm{in}} \Vert _{H^{-s_1}_{-\ell }({\mathbb {R}}^m)},\end{aligned}$$
(6.15)
$$\begin{aligned}&\sup _{[0,T]} \Vert r_t \Vert _{H^{-(s_1+1)}_{-4}({\mathbb {R}}^m)} \le C_T \, \Vert h_{\mathrm{in}} \Vert _{H^{-s_1}_{-2}({\mathbb {R}}^m)} \, \Vert \tilde{h}_{\mathrm{in}} \Vert _{H^{-s_1}_{-2}({\mathbb {R}}^m)}, \end{aligned}$$
(6.16)

and when \(\tilde{h}_{\mathrm{in}} = h_{\mathrm{in}} = g_{\mathrm{in}} - f_{\mathrm{in}}\) we have

$$\begin{aligned}&\sup _{[0,T]} \Vert g_t - f_t - h_t\Vert _{H^{-(s_1+1)}_{-4}({\mathbb {R}}^m)} \le C_T \, \Vert g_{\mathrm{in}} - f_{\mathrm{in}}\Vert ^2_{H^{-s_1}_{-2}({\mathbb {R}}^m)}, \end{aligned}$$
(6.17)
$$\begin{aligned}&\sup _{[0,T]} \Vert g_t - f_t - h_t - r_t \Vert _{H^{-(s_1+2)}_{-6} ({\mathbb {R}}^m)} \le C_T \, \Vert g_{\mathrm{in}} - f_{\mathrm{in}} \Vert ^3_{H^{-s_1}_{-2}({\mathbb {R}}^m)}. \end{aligned}$$
(6.18)

This shows that the nonlinear semigroup \(S^{N\!L}_t\) associated to the nonlinear McKean–Vlasov equation (6.6) is \(C_b^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathcal {P}}_{{\mathcal {G}}_2}(E))\).

Proof of Lemma 6.4

The proof is carried out in several steps.

Step 1. We shall repeatedly consider the equation

$$\begin{aligned} \partial _t \zeta _t = \nabla ^2 : (A \, \zeta _t) - \nabla \cdot ((\fancyscript{T}z) \zeta _t) - \nabla \cdot ( u_1 \, \zeta _t + u_2 \, (\fancyscript{U} *\zeta _t)) \end{aligned}$$
(6.19)

with given initial data \(\zeta _{\mathrm{in}}\) and with an \({\mathbb {R}}^m\)-valued function \(u_1\) and an \({\mathbb {R}}\)-valued measure \(u_2\) to be specified [chosen in order to “match” equations (6.6), (6.12) and (6.13)]. We claim that for any \(k \in {\mathbb {N}}, k > m/2 + 1, \ell \in \{1,2,3\}\) and any \(T >0\),

$$\begin{aligned} \forall \, t \in [0,T], \quad \left\| \zeta _t\right\| _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \le \left\| \zeta _{\mathrm{in}}\right\| _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \, e^{C_k(\fancyscript{U},u_1,u_2) \, T} \end{aligned}$$
(6.20)

with

$$\begin{aligned} C_k(\fancyscript{U}, u_1,u_2) := C(k) \, \sup _{t \in [0,T]} \Bigg [\left\| u_1\right\| _{W^{k,\infty }({\mathbb {R}}^m)} + \Vert \fancyscript{U} \Vert _{H^{k}_\ell ({\mathbb {R}}^m)} \, \Vert u_2 \Vert _{TV({\mathbb {R}}^m)}\Bigg ]. \end{aligned}$$

We argue by duality and we consider a smooth solution \(\theta \) to the following linear equation (which is the dual equation of (6.19))

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t \theta = L^*_1 \theta + L^*_2 \theta ,\\ L^*_1 \theta := A : \nabla ^2 \theta + (\fancyscript{T}z) \cdot \nabla \theta , \\ L^*_2 \theta := u_1 \cdot \nabla \theta + \check{\fancyscript{U}} *\left( u_2 \, \nabla \theta \right) , \end{array}\right. \end{aligned}$$
(6.21)

with \(\check{\fancyscript{U}}(x) := \fancyscript{U} (-x)\).

For a given multi-index \(\nu \in {\mathbb {N}}^m\) with \(|\nu | = k' \le k\), we compute

$$\begin{aligned}&{1 \over 2} \, {\mathrm{d} \over \mathrm{d}t} \int _{{\mathbb {R}}^m} \left| \partial ^{\nu } \theta \right| ^{2} \, \langle z \rangle ^{2\ell } \, \mathrm{d}z \\&\quad = \int _{{\mathbb {R}}^m} (\partial ^{\nu } L^*_1 \theta ) \, \partial ^{\nu } \theta \, \langle z \rangle ^{2\ell } \, \mathrm{d}z + \int _{{\mathbb {R}}^m} (\partial ^{\nu } L^*_2 \theta ) \, \partial ^{\nu } \theta \, \langle z \rangle ^{2\ell } \, \mathrm{d}z =: {\mathcal {L}}_1 + {\mathcal {L}}_2. \end{aligned}$$

By integrations by parts, we get

$$\begin{aligned} {\mathcal {L}}_1 \le - \kappa \, \int _{{\mathbb {R}}^m} \left| \nabla \partial ^\nu \theta \right| ^2 \, \langle z \rangle ^{2\ell } \, \mathrm{d}z + C \, \left( \Vert \fancyscript{T}\Vert _\infty + \Vert A \Vert _{W^{k'+2,\infty }({\mathbb {R}}^m)}\right) \, \left\| \theta \right\| ^2_{H^{k'}_\ell ({\mathbb {R}}^m)} \end{aligned}$$

and (using Sobolev embedding inequalities for the last term)

$$\begin{aligned} {\mathcal {L}}_2 \le C \left( \Vert u_1\Vert _{W^{k',\infty }({\mathbb {R}}^m)} + \Vert u_2 \Vert _{TV({\mathbb {R}}^m)} \, \Vert \fancyscript{U}\Vert _{H^{k'}_\ell ({\mathbb {R}}^m)}\right) \Vert \theta _t \Vert ^2_{H_\ell ^{k}({\mathbb {R}}^m)} \end{aligned}$$

which shows that

$$\begin{aligned} \forall \, t \in [0,T], \quad \left\| \theta _t \right\| _{H^k_\ell ({\mathbb {R}}^m)} \le \left\| \theta _{\mathrm{in}}\right\| _{H^k_\ell ({\mathbb {R}}^m)} \, e^{C_k(\fancyscript{U},u_1,u_2) \, T}. \end{aligned}$$
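For completeness, here is a sketch of how this bound follows from the two previous inequalities; the constant \(C\) below is generic and the bookkeeping is ours. Discarding the nonpositive term in the bound on \({\mathcal {L}}_1\) and summing over all multi-indices \(|\nu | \le k\), we obtain, for some constant \(C\) controlled by the quantities appearing in the bounds on \({\mathcal {L}}_1\) and \({\mathcal {L}}_2\),

$$\begin{aligned} {\mathrm{d} \over \mathrm{d}t} \left\| \theta _t\right\| ^2_{H^k_\ell ({\mathbb {R}}^m)} \le 2 \, C \, \left\| \theta _t\right\| ^2_{H^k_\ell ({\mathbb {R}}^m)}, \quad \hbox {hence}\quad \left\| \theta _t\right\| _{H^k_\ell ({\mathbb {R}}^m)} \le e^{C \, t} \, \left\| \theta _{\mathrm{in}}\right\| _{H^k_\ell ({\mathbb {R}}^m)} \le e^{C \, T} \, \left\| \theta _{\mathrm{in}}\right\| _{H^k_\ell ({\mathbb {R}}^m)} \end{aligned}$$

for any \(t \in [0,T]\), by Gronwall's lemma.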

Denoting by \(U_t\) the linear semigroup associated to (6.19), the associated dual semigroup \(U^*_t\) is generated by (6.21). As a consequence, for any \(\theta _{\mathrm{in}} \in H^k({\mathbb {R}}^m)\), we have

$$\begin{aligned} \langle \zeta _t,\theta _{\mathrm{in}} \rangle&= \left\langle \zeta _{\mathrm{in}}, U^*_t \theta _{\mathrm{in}} \right\rangle \le \Vert \zeta _{\mathrm{in}} \Vert _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \, \Vert U^*_t \theta _{\mathrm{in}} \Vert _{H^{k}_\ell ({\mathbb {R}}^m)}\\&\le e^{C_k(\fancyscript{U},u_1,u_2) \, T} \, \Vert \zeta _{\mathrm{in}} \Vert _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \, \Vert \theta _{\mathrm{in}} \Vert _{H^{k}_\ell ({\mathbb {R}}^m)}, \end{aligned}$$

which concludes the proof of the claim (6.20).

Step 2. Proof of (6.14). The equation satisfied by the difference \(\mathsf {d}_t = g_t - f_t\) is

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t \mathsf {d}_t = \nabla ^2 : (A \, \mathsf {d}_t) - \nabla \cdot ((\fancyscript{T}z) \mathsf {d}_t) - \nabla \cdot \left( \mathsf {d}_t \, (\fancyscript{U} *f) + g \, (\fancyscript{U} *\mathsf {d}_t)\right) , \\ \mathsf {d}_{|t=0} = \mathsf {d}_{\mathrm{in}} = g_{\mathrm{in}} - f_{\mathrm{in}}, \end{array}\right. \quad \quad \end{aligned}$$
(6.22)

which fits in the form (6.19) with \(u_1 := \fancyscript{U} *f\) and \(u_2 = g\). Now, since

$$\begin{aligned} \left\| \nabla ^k (\fancyscript{U} *f)\right\| _{L^\infty ({\mathbb {R}}^m)} = \left\| (\nabla ^k \fancyscript{U}) *f\right\| _{L^\infty ({\mathbb {R}}^m)} \le \left\| \nabla ^k \fancyscript{U}\right\| _{L^\infty ({\mathbb {R}}^m)} \end{aligned}$$

we conclude that

$$\begin{aligned} C_k(\fancyscript{U}, \fancyscript{U} *f ,g) \le C \, \Vert \fancyscript{U} \Vert _{H^k_3 \cap W^{k,\infty } ({\mathbb {R}}^m)} \end{aligned}$$

and that (6.14) holds. Proceeding in the same way for the function \(h\) we end up with

$$\begin{aligned} \sup _{[0,T]} \Vert h_t\Vert _{H^{-s_1}_{-\ell }({\mathbb {R}}^m)} \le C_T \, \Vert h_{\mathrm{in}}\Vert _{H^{-s_1}_{-\ell }({\mathbb {R}}^m)}, \end{aligned}$$
(6.23)

for any \(s_1 \in {\mathbb {N}}, s_1 > m/2 + 1\).
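For the reader's convenience, we record the only nonlinear manipulation behind (6.22), namely the following splitting of the difference of the quadratic terms (the remaining terms of \(Q\) are linear in \(f\) and are subtracted directly):

$$\begin{aligned} g_t \, (\fancyscript{U} *g_t) - f_t \, (\fancyscript{U} *f_t) = (g_t - f_t) \, (\fancyscript{U} *f_t) + g_t \, (\fancyscript{U} *(g_t - f_t)) = \mathsf {d}_t \, (\fancyscript{U} *f_t) + g_t \, (\fancyscript{U} *\mathsf {d}_t). \end{aligned}$$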

Step 3. Inequalities for products and convolutions in Sobolev spaces. We define the weighted \(L^\infty \)-based Sobolev spaces as usual:

$$\begin{aligned} \Vert f\Vert _{L^\infty _{-\ell }({\mathbb {R}}^m)}&:= \left\| \langle \cdot \rangle ^{-\ell } f\right\| _{L^\infty ({\mathbb {R}}^m)},\\ \Vert f \Vert _{W^{k,\infty }_{-\ell }({\mathbb {R}}^m)}&:= \sum _{0 \le k' \le k}\left\| \langle \cdot \rangle ^{-\ell } \partial ^{k'} f\right\| _{L^\infty ({\mathbb {R}}^m)}. \end{aligned}$$

We have the three following inequalities on functions \(S, \psi \) in the appropriate spaces:

$$\begin{aligned} \Vert S*\fancyscript{U} \Vert _{L^\infty _{-\ell } ({\mathbb {R}}^m)} \le \Vert \fancyscript{U}\Vert _{H^k_\ell ({\mathbb {R}}^m)} \, \Vert S\Vert _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \end{aligned}$$

and more generally

$$\begin{aligned} \Vert S *\fancyscript{U}\Vert _{W^{k,\infty }_{-\ell } ({\mathbb {R}}^m)} \le \Vert \fancyscript{U}\Vert _{H^{2k}_\ell ({\mathbb {R}}^m)} \, \Vert S \Vert _{H^{-k}_{-\ell } ({\mathbb {R}}^m)} \end{aligned}$$

and finally

$$\begin{aligned} \Vert S \, \psi \Vert _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \le C_{k,\ell } \, \Vert S\Vert _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \, \Vert \psi \Vert _{W^{k,\infty }({\mathbb {R}}^m)} \end{aligned}$$

for any \(k,\ell \in {\mathbb {N}}\), and some constant \(C_{k,\ell } >0\). The proofs are elementary and we omit them for the sake of conciseness.
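As an indication of how such proofs go, here is a sketch of the first inequality. It is our reconstruction, written under the assumption that \(\Vert \fancyscript{U}\Vert ^2_{H^k_\ell } = \sum _{|\nu | \le k} \Vert \langle \cdot \rangle ^\ell \, \partial ^\nu \fancyscript{U} \Vert ^2_{L^2}\) and that \(H^{-k}_{-\ell }\) is the corresponding dual space, and it yields the inequality up to the multiplicative constant \(2^{\ell /2}\). By duality, and then by the elementary inequality \(\langle y \rangle \le \sqrt{2} \, \langle z \rangle \, \langle z - y \rangle \),

$$\begin{aligned} |(S *\fancyscript{U})(z)| = \left| \left\langle S, \fancyscript{U}(z - \cdot )\right\rangle \right| \le \Vert S \Vert _{H^{-k}_{-\ell }({\mathbb {R}}^m)} \, \Vert \fancyscript{U}(z - \cdot ) \Vert _{H^{k}_{\ell }({\mathbb {R}}^m)}, \qquad \Vert \fancyscript{U}(z - \cdot ) \Vert _{H^{k}_{\ell }({\mathbb {R}}^m)} \le 2^{\ell /2} \, \langle z \rangle ^{\ell } \, \Vert \fancyscript{U} \Vert _{H^{k}_{\ell }({\mathbb {R}}^m)}. \end{aligned}$$

Dividing by \(\langle z \rangle ^\ell \) and taking the supremum over \(z \in {\mathbb {R}}^m\) gives the claimed bound on \(\Vert S *\fancyscript{U} \Vert _{L^\infty _{-\ell }({\mathbb {R}}^m)}\).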

Step 4. Proof of (6.17). Let \(\omega _t := g_t - f_t - h_t = \mathsf {d}_t - h_t\), which satisfies the equation

$$\begin{aligned} \partial _t \omega = L \, \omega + \Sigma , \quad \omega _{|t=0} = \omega _{\mathrm{in}} = 0, \end{aligned}$$
(6.24)

with

$$\begin{aligned} \left\{ \begin{array}{l} L \, \omega := \nabla ^2 : (A \, \omega ) - \nabla \cdot ((\fancyscript{T} z) \, \omega ) - \nabla \cdot \left( \omega \, (\fancyscript{U}*f) + f \, (\fancyscript{U}*\omega )\right) ,\\ \Sigma _t := - \nabla \cdot \left( \mathsf {d}_t \, (\fancyscript{U} *\mathsf {d}_t)\right) . \end{array}\right. \end{aligned}$$

Denoting by \(\Theta _{s,t}w\) the unique solution of the linear, non-autonomous equation

$$\begin{aligned} \partial _t w_t = L w_t,\qquad w_s=w, \end{aligned}$$

the Duhamel formula for Eq. (6.24) yields

$$\begin{aligned} \omega _t = \int _0^t \Theta _{s,t} \, \Sigma _s \, \mathrm{d}s. \end{aligned}$$

Therefore we obtain, using (6.20) and the estimates established in Step 3, that for any \(t \in [0,T]\)

$$\begin{aligned} \Vert \omega _t \Vert _{H^{-k}_{-4}({\mathbb {R}}^m)}&\le C_T \, \int _0^t \left\| \nabla \left( \mathsf {d}_s \, (\fancyscript{U} *\mathsf {d}_s) \right) \right\| _{H^{-k}_{-4}({\mathbb {R}}^m)} \, \mathrm{d}s\\&\le C_{T,k} \, \int _0^t \Vert \nabla \mathsf {d}_s\Vert _{H^{-k}_{-2}({\mathbb {R}}^m)} \, \Vert \fancyscript{U} *\mathsf {d}_s \Vert _{W^{k,\infty }_{-2}({\mathbb {R}}^m)} \, \mathrm{d}s \\&+ C_{T,k} \, \int _0^t \Vert \mathsf {d}_s \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)} \Vert \fancyscript{U} *(\nabla \mathsf {d}_s) \Vert _{W^{k,\infty }_{-2}({\mathbb {R}}^m)} \, \mathrm{d}s\\&\le C_{T,k} \, \Vert \fancyscript{U} \Vert _{H_2^{2k}({\mathbb {R}}^m)} \, \int _0^t \left\| \mathsf {d}_s\right\| _{H_{-2}^{-(k-1)}({\mathbb {R}}^m)} \, \left\| \mathsf {d}_s\right\| _{H^{-k}_{-2}({\mathbb {R}}^m)} \, \mathrm{d}s, \end{aligned}$$

which together with (6.14) for the control of the norms of \(\mathsf {d}_t\) implies (6.17).

Step 5. Proof of (6.16) and (6.18). The second variation \(r\) satisfies the equation

$$\begin{aligned} \partial _t r = L \, r + R_t, \quad r_{|t=0} = 0, \end{aligned}$$

with \(L\) as above and

$$\begin{aligned} R_t := - \frac{1}{2} \nabla \cdot \left( \tilde{h}_t \, (\fancyscript{U} *h_t)\right) - \frac{1}{2} \nabla \cdot \left( h_t \, (\fancyscript{U} *\tilde{h}_t)\right) . \end{aligned}$$

We proceed as in Step 4, taking advantage of the bound (6.23), and we obtain

$$\begin{aligned} \sup _{[0,T]}\Vert r_t\Vert _{H^{-k}_{-4}({\mathbb {R}}^m)} \le C_T \, \Vert h_{\mathrm{in}} \Vert _{H^{-(k-1)}_{-2}({\mathbb {R}}^m)} \, \Vert \tilde{h}_{\mathrm{in}} \Vert _{H^{-(k-1)}_{-2}({\mathbb {R}}^m)}, \end{aligned}$$
(6.25)

which is nothing but (6.16).

Finally we introduce \(\psi _t := g_t - f_t - h_t -r_t = \mathsf {d}_t - h_t - r_t = \omega _t - r_t\), with the initial data \(\tilde{h}_{\mathrm{in}} = h_{\mathrm{in}} = g_{\mathrm{in}} - f_{\mathrm{in}}\). It satisfies the equation

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t \psi = L \, \psi + \Psi _t , \quad \psi _{|t=0} = 0,\\ \Psi _t := - \nabla \cdot \left( \omega _t \, (\fancyscript{U} *\mathsf {d}_t) + h_t \, (\fancyscript{U} *\omega _t)\right) \end{array}\right. \end{aligned}$$
(6.26)

Therefore, we deduce

$$\begin{aligned} \forall \, t&\in [0,T], \quad \Vert \psi _t\Vert _{H^{-k}_{-6}({\mathbb {R}}^m)} \le \left\| \int _0^t \Theta _{s,t} \, \Psi _s \, \mathrm{d}s\right\| _{H^{-k}_{-6} ({\mathbb {R}}^m)} \\&\le C_T \, \int _0^t \left( \Vert h_s\Vert _{H^{-(k-1)}_{-2}({\mathbb {R}}^m)} + \Vert \mathsf {d}_s \Vert _{H^{-(k-1)}_{-2}({\mathbb {R}}^m)}\right) \, \Vert \omega _s \Vert _{H^{-(k-1)}_{-4} ({\mathbb {R}}^m)} \, \mathrm{d}s, \end{aligned}$$

which together with (6.14)–(6.15) and (6.17) implies (6.18). \(\square \)

Proof of (A2) The first property (A2)-(i) is a consequence of (6.14) in Lemma 6.4. For the second property (A2)-(ii), we claim that

$$\begin{aligned} \left\{ \begin{array}{l} \Vert Q(f_1) \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)} \le C_{\fancyscript{U},1}\\ \Vert Q(f_2) \Vert _{H^{-2}_{-2}({\mathbb {R}}^m)} \le C_{\fancyscript{U},1} \\ \Vert Q(f_2) - Q(f_1) \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)} \le C_{\fancyscript{U},2} \, \Vert f_2 - f_1 \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)}^{1/5}. \end{array}\right. \end{aligned}$$
(6.27)

We write

$$\begin{aligned} Q(f) = Q_1(f) + Q_2(f) \end{aligned}$$

with

$$\begin{aligned} Q_1(f) = \nabla ^2 :(A \, f) - \nabla \cdot ((\fancyscript{T}z) f), \quad Q_2(f) = - \nabla \cdot ((\fancyscript{U}*f) \, f). \end{aligned}$$

The linear term \(Q_1\) satisfies the first part of (6.27) (first equation) by direct inspection combined with the use of Sobolev embeddings. In order to see that \(Q_1\) also satisfies the second part of (6.27) (on the difference), we write

$$\begin{aligned} \Vert Q_1(f_2) - Q_1(f_1) \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)}&\le \Vert A \Vert _{W^{k-2,\infty }({\mathbb {R}}^m)} \, \Vert f_2 - f_1 \Vert _{H^{-(k-2)}_{-2}({\mathbb {R}}^m)}\\&+ \Vert \fancyscript{T} \Vert _{\infty } \, \Vert f_2 - f_1 \Vert _{H^{-(k-1)}_{-1}({\mathbb {R}}^m)}, \end{aligned}$$

and we conclude by using interpolation and Sobolev embeddings (noticing that \(k - 2 - 1/2 > m/2\) allows for the Sobolev embedding in the first term).

Concerning the quadratic term \(Q_2\), using estimates proved in Step 3 of the proof of (A4), on the one hand we get

$$\begin{aligned} \Vert Q_2(f) \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)}&\le \Vert Q_2(f) \Vert _{H^{-k}({\mathbb {R}}^m)} \le \Vert \fancyscript{U} *f \Vert _{W^{k-1,\infty }({\mathbb {R}}^m)} \, \Vert f \Vert _{H^{-(k-1)}({\mathbb {R}}^m)}\\&\le C_k \, \Vert \fancyscript{U}\Vert _{H^{2(k-1)}({\mathbb {R}}^m)} \, \Vert f \Vert _{H^{-(k-1)}({\mathbb {R}}^m)}^2 \le C_k \, \Vert \fancyscript{U} \Vert _{H^{2(k-1)}({\mathbb {R}}^m)} \end{aligned}$$

where we have used \({\mathcal {P}}({\mathbb {R}}^m) \subset H^{-(k-1)/2}({\mathbb {R}}^m)\) with continuous embedding. On the other hand, we have with \(\mathsf {d} := f_2 - f_1\)

$$\begin{aligned} \Vert Q_2(f_2) - Q_2(f_1) \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)}&\le \Vert (\fancyscript{U} *\mathsf {d}) \, \langle \cdot \rangle ^{-2} \Vert _{W^{k-1,\infty }({\mathbb {R}}^m)} \, \Vert f_2 \Vert _{H^{-(k-1)}({\mathbb {R}}^m)} \\&+ \Vert \fancyscript{U} *f_1 \Vert _{W^{k-1,\infty }({\mathbb {R}}^m)} \, \Vert \mathsf {d} \Vert _{H^{-(k-1)}_{-2}({\mathbb {R}}^m)}. \end{aligned}$$

In order to estimate the first term in the above inequality, we remark that

$$\begin{aligned} \langle z \rangle ^{-2} \, |(\partial ^\alpha \fancyscript{U} *\mathsf {d})(z)|&\le \langle z \rangle ^{-2} \, \Vert \partial ^\alpha \fancyscript{U} (z-\cdot ) \, \langle \cdot \rangle ^2 \Vert _{H^{k}({\mathbb {R}}^m)} \, \Vert \mathsf {d}\, \langle \cdot \rangle ^{-2} \Vert _{H^{-k}({\mathbb {R}}^m)}\\&\le C \, \Vert \partial ^\alpha \fancyscript{U} \Vert _{H^{k}_2({\mathbb {R}}^m)} \, \Vert \mathsf {d} \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)} \end{aligned}$$

uniformly for any \(z \in {\mathbb {R}}^m\). All together, we have for the quadratic term

$$\begin{aligned} \Vert Q_2(f_2) - Q_2(f_1) \Vert _{H^{-k}_{-2}({\mathbb {R}}^m)} \le C'_{\fancyscript{U},2} \, \Vert f_2-f_1 \Vert _{H^{-(k-1)}_{-2}({\mathbb {R}}^m)}, \end{aligned}$$

and we conclude the proof of (A2)-(ii) by using interpolation and Sobolev embeddings again.

Proof of (A5) We use the following well-known estimate (see [36]): for any \(q \ge 1\), any \(f_{\mathrm{in}}, \, g_{\mathrm{in}} \in {\mathcal {P}}_q({\mathbb {R}}^d)\) and any \(T > 0\), there exists \(C_T\) such that

$$\begin{aligned} \sup _{t \in [0,T]}W_q(S^{N\!L}_t (f_{\mathrm{in}}), S^{N\!L}_t (g_{\mathrm{in}})) \le C_T \, W_q (f_{\mathrm{in}},g_{\mathrm{in}}), \end{aligned}$$

that we use with \(q=2\). Alternatively, estimate (6.14) precisely says that assumption (A5) holds in \({\mathcal {P}}_{{\mathcal {G}}_1}(E)\).

7 Inelastic collisions with thermal bath

7.1 The model

In this section we assume that \(E= {\mathbb {R}}^d, d \ge 1\), and we are interested in the following Boltzmann equation for diffusively excited granular media on the distribution \(f(t,v) \ge 0, v \in {\mathbb {R}}^d\) of particles:

$$\begin{aligned} {\partial f_t\over \partial t} = Q(f_t), \quad f_{|t=0} = f_{\mathrm{in}} \quad \hbox {in}\quad {\mathcal {P}}({\mathbb {R}}^d), \end{aligned}$$
(7.1)

with

$$\begin{aligned} Q(f) = Q_\alpha (f,f) + \nu \, \Delta \, f \end{aligned}$$

for some \(\nu >0\), and where the quadratic Boltzmann collision kernel \(Q_\alpha \) is defined by the following dual formulation

$$\begin{aligned} \langle Q_\alpha (f,f), \varphi \rangle := \int _{{\mathbb {R}}^{2d} \times \mathbb {S}^{d-1}} b(\cos \theta ) \, \left( \varphi (w^*_2) - \varphi (w_2)\right) \, \mathrm{d}\sigma \, f(\mathrm{d}w_1) \, f(\mathrm{d}w_2) \end{aligned}$$
(7.2)

for any \(\varphi \in C_0({\mathbb {R}}^d), f \in {\mathcal {P}}({\mathbb {R}}^d)\), and with \(\cos \theta = \sigma \cdot (w_2-w_1)/|w_2-w_1|\) and similarly as in Eqs. (5.1) and (5.6)

$$\begin{aligned} w^*_2 = {w_1+ w_2 \over 2}+ {u^*\over 2}, \end{aligned}$$

but with

$$\begin{aligned} u^* = \left( {1- \alpha \over 2}\right) \, (w_2-w_1) + \left( {1+\alpha \over 2}\right) \, |w_2 - w_1| \, \sigma , \end{aligned}$$

for some \(\alpha \in (0,1)\). (Note that the case \(\alpha = 1\) and \(\nu =0\) would correspond to the elastic Boltzmann kernel considered in Sect. 5.) This corresponds to a situation where particles lose energy when they collide. We refer to [3, 7] for a physical motivation for these equations. The mathematical theory is treated in, e.g., [24], where, for example, it is proven that this equation generates a nonlinear semigroup \(S^{N\!L}_t f_{\mathrm{in}} := f_t\) for any \(f_{\mathrm{in}} \in {\mathcal {P}}_q({\mathbb {R}}^d), q \ge 2\). Notice that, unlike for the classical Boltzmann equation, the kinetic energy is not conserved. For the sake of simplicity we make the normalization assumptions

$$\begin{aligned} \Vert b \Vert _{L^1(\mathbb {S}^{d-1})} = \int _{\mathbb {S}^{d-1}} b(\sigma _1) \, \mathrm{d}\sigma = 1, \qquad \nu = 1. \end{aligned}$$

One of these quantities, say the first, can be set to one just by a rescaling of time but the two cannot be changed independently. However, the result would be the same for any value of \(\nu \). Note that due to the normalization \(\Vert b \Vert _{L^1({\mathbb {S}}^{d-1})} = 1\) and the fact that \(f \in {\mathcal {P}}({\mathbb {R}}^d)\), the bilinear operator \(Q_\alpha \) splits into a quadratic part and a linear part

$$\begin{aligned} Q(f) = Q^+_\alpha (f,f) - f + \Delta f, \end{aligned}$$

where \(Q^+_\alpha \) is defined through the positive part of the expression (7.2).
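As a brief check of this splitting (a sketch, assuming as usual that \(\int _{\mathbb {S}^{d-1}} b(\cos \theta ) \, \mathrm{d}\sigma \) does not depend on the direction of \(w_2-w_1\), by rotational invariance), the part of (7.2) carrying \(\varphi (w_2)\) reduces, for any \(\varphi \in C_0({\mathbb {R}}^d)\), to

$$\begin{aligned} \int _{{\mathbb {R}}^{2d}} \left( \int _{\mathbb {S}^{d-1}} b(\cos \theta ) \, \mathrm{d}\sigma \right) \varphi (w_2) \, f(\mathrm{d}w_1) \, f(\mathrm{d}w_2) = \left( \int _{{\mathbb {R}}^d} f(\mathrm{d}w_1)\right) \int _{{\mathbb {R}}^d} \varphi (w_2) \, f(\mathrm{d}w_2) = \langle f, \varphi \rangle , \end{aligned}$$

where the first factor equals one because of the normalization of \(b\) and the second because \(f\) is a probability measure. Hence \(\langle Q_\alpha (f,f), \varphi \rangle = \langle Q^+_\alpha (f,f), \varphi \rangle - \langle f, \varphi \rangle \), which is the stated decomposition once the diffusion term is added.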

We now want to introduce an \(N\)-particle system associated to the above Boltzmann equation for diffusively excited granular media by mimicking Kac’s construction. We consider the velocity process \(({\mathcal {V}}^N_t)\) with values in \({\mathbb {R}}^{dN}\), of mixed jump and diffusion nature, defined through the stochastic differential equation

$$\begin{aligned} {\mathcal {V}}^N_{t} = {\mathcal {V}}^N_{0}+\int _0^t \int _{{\mathbb {S}}^{d-1}} \sum _{i,j=1}^N \Gamma _{i,j,\sigma } ({\mathcal {V}}^N_{s^-}) \, \mathbf{1}_{z < b (\sigma \cdot \hat{u}_{i,j}({\mathcal {V}}^N_{s^-}))} \, {\mathcal {N}}^{N}(\mathrm{d}s,\mathrm{d}\sigma ,i,j,\mathrm{d}z) + \sqrt{2} \, \fancyscript{B}^N_t. \end{aligned}$$

Here \(\fancyscript{B}^N_t\) is an \({\mathbb {R}}^{dN}\)-valued standard Brownian motion, \({\mathcal {N}}^{N}(\mathrm{d}s,\mathrm{d}\sigma ,i,j,\mathrm{d}z)\) is a Poisson measure on \([0,\infty ) \times {\mathbb {S}}^{d-1} \times \{1, \dots , N \}^2 \times {\mathbb {R}}_+\) with intensity

$$\begin{aligned} \mathrm{d}s \, \mathrm{d}\sigma \, {1 \over N} \sum _{i',j'=1}^N \mathbf{1}_{i' \not = j'} \delta _{(i',j')}(i,j) \, \mathrm{d}z \end{aligned}$$

independent of \(\fancyscript{B}^N_t\), and the two functions \(\Gamma _{i,j,\sigma } : {\mathbb {R}}^{dN} \rightarrow {\mathbb {R}}^{dN}\) and \(\hat{u}_{i,j} : {\mathbb {R}}^{dN} \rightarrow {\mathbb {S}}^{d-1}\) are a.e. defined through the following expressions: for any \(V = (v_1, \dots , v_N) \in {\mathbb {R}}^{dN}\) we set

$$\begin{aligned} \hat{u}_{i,j} (V) := \frac{u_{ij}}{|u_{ij}|}, \quad u_{ij} := v_i-v_j \end{aligned}$$

and

$$\begin{aligned} \Gamma _{i,j,\sigma } (V) := V^*_{ij} - V, \end{aligned}$$

where

$$\begin{aligned} V^*_{ij} = (v_1, \dots , v_{i-1}, v^*_i, v_{i+1}, \dots , v_{j-1}, v^*_j, v_{j+1}, \dots , v_N) \end{aligned}$$

and, as in Eq. (5.1),

$$\begin{aligned} v^*_i = {w_{ij} \over 2} + {u^*_{ij} \over 2}, \quad v^*_j= {w_{ij} \over 2} - {u^*_{ij} \over 2}, \end{aligned}$$
(7.3)

but here with

$$\begin{aligned} w_{ij} = v_i+v_j, \quad u^*_{ij} = \left( {1- \alpha \over 2}\right) \, u_{ij} + \left( {1+\alpha \over 2}\right) \, |u_{ij}| \, \sigma . \end{aligned}$$
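As a sanity check on this collision rule (a short computation added as a reading aid, using only (7.3) and the above definitions of \(w_{ij}\) and \(u^*_{ij}\)), the total momentum of the pair is preserved, \(v^*_i + v^*_j = w_{ij} = v_i + v_j\), while the relative velocity satisfies

$$\begin{aligned} |u^*_{ij}|^2 = |u_{ij}|^2 \left( \frac{1+\alpha ^2}{2} + \frac{1-\alpha ^2}{2} \, \sigma \cdot \hat{u}_{i,j}\right) . \end{aligned}$$

Since \(|v_i|^2 + |v_j|^2 = (|w_{ij}|^2 + |u_{ij}|^2)/2\) and likewise for the post-collisional velocities, the kinetic energy of the pair changes by

$$\begin{aligned} |v^*_i|^2 + |v^*_j|^2 - |v_i|^2 - |v_j|^2 = \frac{|u^*_{ij}|^2 - |u_{ij}|^2}{2} = - \frac{1-\alpha ^2}{4} \, |u_{ij}|^2 \, \left( 1 - \sigma \cdot \hat{u}_{i,j}\right) \le 0, \end{aligned}$$

with equality for all \(\sigma \) only in the elastic case \(\alpha = 1\). This makes explicit in which sense each collision dissipates energy when \(\alpha \in (0,1)\).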

The associated forward Kolmogorov equation on the probability law \(f^N_t\) of \(({\mathcal {V}}^N_t)\) in \({\mathbb {R}}^{dN}\) reads

$$\begin{aligned} \partial _t \langle f^N_t,\varphi \rangle = \langle f^N_t, G^N \varphi \rangle \end{aligned}$$
(7.4)

with generator \(G^N = G^N_1 + G^N_2\), where \(G^N_1\) is associated to an inelastic Boltzmann collision process whose collision kernel only depends on the deviation angle as in Sect. 5

$$\begin{aligned} (G^N_1\varphi ) (V) = {1 \over N} \, \sum _{i,j= 1}^N \int _{\mathbb {S}^{d-1}} b(\cos \theta _{ij}) \, \left[ \varphi (V^*_{ij}) - \varphi (V)\right] \, \mathrm{d}\sigma , \end{aligned}$$
(7.5)

with \(\cos \theta _{ij} = \sigma \cdot (v_i-v_j)/|v_i-v_j| = \sigma \cdot \hat{u}_{i,j}(V)\) and \(V^*_{ij}\) defined in (7.3), and \(G^N_2\) is the generator associated to the Brownian motion

$$\begin{aligned} (G^N_2\varphi ) (V) = \sum _{i=1}^N \Delta _{i} \varphi , \end{aligned}$$
(7.6)

where \(v_i:=(v_{i,1}, \dots , v_{i,d})\) and \(\Delta _i\) denotes the Laplacian in \({\mathbb {R}}^d\) associated to the \(i\)-th particle:

$$\begin{aligned} \Delta _i := \sum _{\alpha =1}^d \partial ^2_{v_{i, \alpha },v_{i,\alpha }}. \end{aligned}$$

It is classical to prove that \({\mathcal {V}}^N_t\) is a Feller process and we refer to the textbooks [13, 33] where the theory is set up with full details (one can also refer to [15, 30, 36] where similar processes are considered).

7.2 Statement of the result

The main result in this section is a quantitative estimate of propagation of chaos for the mixed collision and diffusion model introduced above.

Theorem 7.1

Consider an initial distribution \(f_{\mathrm{in}} \in {\mathcal {P}}_q({\mathbb {R}}^d), q \ge 2\), the hierarchy of \(N\)-particle distributions \(f^N_t = S^N_t(f_{\mathrm{in}}^{\otimes N})\) following the evolution (7.4), and the nonlinear semigroup \(f_t = S^{N\!L}_t(f_{\mathrm{in}})\) following the evolution (7.1).

Then there is a constant \(C>0\) and, for any \(T>0\), there are constants \(C_{T}, \tilde{C}_T \in (0,\infty )\) depending only on \(T \in (0,\infty )\) such that for any

$$\begin{aligned} \varphi = \varphi _1 \otimes \dots \otimes \, \varphi _\ell \in {\mathcal {F}}^{\otimes \ell }, \quad {\mathcal {F}}:= W^{9,1}({\mathbb {R}}^d) \cap W^{1,\infty } ({\mathbb {R}}^d), \quad \Vert \varphi _j \Vert _{\mathcal {F}}\le 1, \end{aligned}$$

we have for \(N \ge 2 \ell \):

$$\begin{aligned} \sup _{[0,T]}\left| \left\langle \left( S^N_t(f_{\mathrm{in}}^{\otimes N}) - \left( S^{N\!L}_t (f_{\mathrm{in}})\right) ^{\otimes N}\right) , \varphi \right\rangle \right| \le C \, {\ell ^2 \over N} + C_{T} \, {\ell ^2 \over N}+ \tilde{C}_{T} \, \ell \, \Omega ^{W_2}_N (f_{\mathrm{in}}). \end{aligned}$$
(7.7)

As a consequence of (7.7) and Lemma 4.6, this shows the quantitative propagation of chaos with rate \({\varepsilon }(N) \le C(\ell ,T,f_{\mathrm{in}}) \, N^{-{1 \over d+4}}\) for any initial data \(f_{\mathrm{in}} \in {\mathcal {P}}_{d+5}({\mathbb {R}}^d)\).

We are not aware of any result on propagation of chaos in this setting. A conceivable alternative approach would be to use the general nonlinear martingale approach, but that would most likely not provide any quantitative rate of propagation of chaos. The technique developed recently in [15] for the elastic Kac equation without cut-off is yet another alternative that could be tried on this model, but we have not made any attempt in this direction and, should it work, it is not clear what kind of convergence rate one could hope to achieve.

7.3 Proof of Theorem 7.1

We shall prove that Theorem 7.1 is a consequence of Theorem 2.1 by proving that the assumptions (A1)–(A2)–(A3)–(A4)–(A5) hold. We consider the phase space \(E={\mathbb {R}}^d\) and the following choice of functional spaces

$$\begin{aligned} \left\{ \begin{array}{l} {\mathcal {G}}_1 := {\mathcal {H}}^{-s_1}({\mathbb {R}}^d), \,\, s_1 := 3, \\ {\mathcal {G}}_2 := {\mathcal {H}}^{-s_2}({\mathbb {R}}^d), \,\, s_2 := 3 s_1=9, \\ {\mathcal {F}}_1 = W^{s_1,1}({\mathbb {R}}^d), \\ {\mathcal {F}}_2 = W^{s_2,1}({\mathbb {R}}^d), \end{array}\right. \end{aligned}$$

where the Fourier-based spaces \({\mathcal {H}}^{-s}({\mathbb {R}}^d)\) and the norms \(|\cdot |_s\) are defined in Example 3.7, together with the corresponding spaces \({\mathcal {P}}_{{\mathcal {G}}_1}(E)\) and \({\mathcal {P}}_{{\mathcal {G}}_2}(E)\) (without weight). Finally, we define \({\mathcal {F}}_3 = \text{ Lip }({\mathbb {R}}^d)\) and \({\mathcal {P}}_{{\mathcal {G}}_3}(E) := {\mathcal {P}}_2({\mathbb {R}}^d)\) endowed with the quadratic MKW distance \(W_2\).

Proof of (A1) The well-posedness of Eqs. (7.4)–(7.5) is a variation on the well-posedness result for Eq. (7.1) as obtained in [2, 4]. We also refer to [13, 15, 30, 36] for a proof of the fact that \({\mathcal {V}}^N_t\) is a Feller process.

Proof of (A2) First we prove (A2)-(i), and more precisely we prove that

$$\begin{aligned} S^{N\!L}_t \in C^{0,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathcal {P}}_{{\mathcal {G}}_1}(E)), \end{aligned}$$

which is a consequence of the following result:

Lemma 7.2

For any \(f_{\mathrm{in}}, g_{\mathrm{in}} \in {\mathcal {P}}({\mathbb {R}}^d)\) and any final time \(T >0\), the associated solutions \(f_t\) and \(g_t\) to the diffusive inelastic Boltzmann equation (7.1) satisfy for any \(s\ge 0\)

$$\begin{aligned} \sup _{t \in [0,T]} \left| f_t-g_t\right| _s \le e^{2T}\, \left| f_{\mathrm{in}}-g_{\mathrm{in}}\right| _s. \end{aligned}$$
(7.8)

Proof of Lemma 7.2

We recall Bobylev’s identity for Maxwellian inelastic collision kernel (see for instance [2])

$$\begin{aligned} {\mathcal {F}}\left( Q^+_\alpha (f,g)\right) (\xi ) = \hat{Q}^+_\alpha (F,G) (\xi ) =: {1 \over 2} \int _{{\mathbb {S}}^{d-1}} b\left( \sigma \cdot \hat{\xi }\right) \, [F^+ \, G^- + F^- \, G^+ ]\, \mathrm{d}\sigma , \end{aligned}$$

with \(F = \hat{f}, G = \hat{g}, F^\pm = F(\xi ^\pm ), G^\pm = G(\xi ^\pm )\) and

$$\begin{aligned} \xi ^+ = {3-\alpha \over 4} \, \xi + {1+\alpha \over 4} \, |\xi | \, \sigma , \quad \xi ^- = {1+\alpha \over 4} (\xi - |\xi | \, \sigma ). \end{aligned}$$

Denoting \(D = \hat{g} - \hat{f}, S = \hat{g} + \hat{f}\), the following equation holds

$$\begin{aligned} \partial _t D = \int _{{\mathbb {S}}^{d-1}} b \left( \sigma \cdot \hat{\xi }\right) \, \left[ \frac{D^+ \, S^-}{2} + \frac{D^- \, S^+}{2}\right] \, \mathrm{d}\sigma - D - |\xi |^2 \, D. \end{aligned}$$
(7.9)
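For the reader's convenience, we spell out the elementary identity behind (7.9); this is only a reading aid and uses nothing beyond the definitions \(D = G - F\) and \(S = G + F\). Expanding the products gives

$$\begin{aligned} \frac{D^+ \, S^- + D^- \, S^+}{2} = \frac{(G^+ - F^+)(G^- + F^-) + (G^- - F^-)(G^+ + F^+)}{2} = G^+ \, G^- - F^+ \, F^-, \end{aligned}$$

which is exactly the integrand of \(\hat{Q}^+_\alpha (G,G) - \hat{Q}^+_\alpha (F,F)\) in Bobylev's identity. The remaining terms \(- D\) and \(- |\xi |^2 \, D\) are the Fourier transforms of the linear loss term \(-(g-f)\) and of the diffusion term \(\Delta (g-f)\), respectively.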

Using that \(\Vert S\Vert _\infty \le 2\) and then \(|\xi ^\pm |\le |\xi |\), we deduce in distributional sense

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} {|D| \over \langle \xi \rangle ^s}&\le \left( \sup _{\xi \in {\mathbb {R}}^d} {|D| \over \langle \xi \rangle ^s}\right) \, \left( \sup _{\xi \in {\mathbb {R}}^d} \int _{{\mathbb {S}}^{d-1}} b(\sigma \cdot \hat{\xi }) \, \left\{ {\langle \xi ^+ \rangle ^s \over \langle \xi \rangle ^s} + {\langle \xi ^- \rangle ^s \over \langle \xi \rangle ^s}\right\} \, \mathrm{d}\sigma \right) \\&\le 2 \, \sup _{\xi \in {\mathbb {R}}^d} {|D| \over \langle \xi \rangle ^s}, \end{aligned}$$

from which we conclude that (7.8) holds. \(\square \)
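The two ingredients of this proof can be checked directly (a sketch added for completeness). From the definitions of \(\xi ^\pm \) and \(|\sigma | = 1\), \(\alpha \in (0,1)\),

$$\begin{aligned} |\xi ^+|&\le \frac{3-\alpha }{4} \, |\xi | + \frac{1+\alpha }{4} \, |\xi | = |\xi |, \\ |\xi ^-|^2&= \left( \frac{1+\alpha }{4}\right) ^2 \, 2 \, |\xi |^2 \, \left( 1 - \sigma \cdot \hat{\xi }\right) \le \left( \frac{1+\alpha }{2}\right) ^2 |\xi |^2 \le |\xi |^2, \end{aligned}$$

so that both ratios \(\langle \xi ^\pm \rangle ^s / \langle \xi \rangle ^s\) are bounded by one and the integral in the differential inequality is at most \(2 \, \Vert b \Vert _{L^1} = 2\). Writing \(u(t) := |f_t - g_t|_s\), the differential inequality then yields

$$\begin{aligned} u(t) \le u(0) + 2 \int _0^t u(\tau ) \, \mathrm{d}\tau \quad \Longrightarrow \quad u(t) \le e^{2t} \, u(0) \le e^{2T} \, u(0), \qquad t \in [0,T], \end{aligned}$$

by the Gronwall lemma, which is the constant appearing in (7.8).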

Next we prove (A2)-(ii), as a consequence of the following result:

Lemma 7.3

For any \(f,g \in {\mathcal {P}}({\mathbb {R}}^d)\) and \(s \ge 0\), we have

$$\begin{aligned} \left| Q_\alpha (f,f)\right| _s \le 2 \end{aligned}$$
(7.10)

and

$$\begin{aligned} \left| Q_\alpha (f+g,f-g)\right| _s \le 3 \, |f-g|_s. \end{aligned}$$
(7.11)

Moreover for any \(s > 2\) there exists \(\delta \in (0,1)\) such that

$$\begin{aligned} \left| \Delta f - \Delta g\right| _s \le 2 \, |f-g|^\delta _s. \end{aligned}$$
(7.12)

Proof of Lemma 7.3

We first prove the second inequality (7.11). We write in Fourier:

$$\begin{aligned}&\mathcal {F}\left( Q_\alpha (f+g,f-g)\right) = \hat{Q}_\alpha (D,S) \\&\quad = \frac{1}{2} \, \int _{\mathbb {S}^{d-1}} b(\sigma \cdot \hat{\xi }) \, \left( S(\xi ^+) \, D(\xi ^-) + S(\xi ^-) \, D(\xi ^+) - 2 \, D(\xi )\right) \, \mathrm{d}\sigma \end{aligned}$$

where \(\hat{Q}_\alpha \) is the Fourier transform of the symmetric version of the collision operator \(Q_{\alpha }\), which yields

$$\begin{aligned} \frac{\left| \hat{Q}_\alpha (D,S)\right| }{\langle \xi \rangle ^s} \le {\mathcal {T}}_1 + {\mathcal {T}}_2 + {\mathcal {T}}_3, \end{aligned}$$

with

$$\begin{aligned} {\mathcal {T}}_1&:= \left| \frac{1}{2 \, {\langle \xi \rangle ^s}} \, \int _{\mathbb {S}^{d-1}} b(\sigma \cdot \hat{\xi }) \, S(\xi ^+) \, D(\xi ^-) \, \mathrm{d}\sigma \right| \\&\le \int _{\mathbb {S}^{d-1}} b(\sigma \cdot \hat{\xi }) \, {\left| S(\xi ^+)\right| \over 2}\, \frac{\left| D(\xi ^-)\right| }{\langle \xi ^- \rangle ^s} \, \frac{\langle \xi ^- \rangle ^s}{\langle \xi \rangle ^s} \, \mathrm{d}\sigma \le | D |_s. \end{aligned}$$

Similar estimates hold for the two other terms \({\mathcal {T}}_2\) and \({\mathcal {T}}_3\). The proof of the first inequality (7.10) is similar (and simpler): we use the Fourier representation of \(Q_\alpha (f,f)\) and the bound \(\Vert \hat{f}\Vert _{L^\infty } \le 1\). We finally prove the last inequality. We compute

$$\begin{aligned} \left| \Delta f - \Delta g\right| _s = \sup _{\xi \in {\mathbb {R}}^d} |\xi |^2 {|F - G|\over \langle \xi \rangle ^s} \le \sup _{\xi \in {\mathbb {R}}^d} \left( |F - G|^{1-\delta } \, \left( {|F - G|\over \langle \xi \rangle ^s}\right) ^{\delta }\right) \end{aligned}$$

with \(\delta := (s-2)/s\). \(\square \)
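The interpolation step can be made fully explicit (a short verification, assuming the usual convention \(\langle \xi \rangle ^2 = 1 + |\xi |^2\)). For \(s > 2\) and \(\delta = (s-2)/s \in (0,1)\) we have \(s \, \delta = s - 2\), hence

$$\begin{aligned} \frac{|\xi |^2}{\langle \xi \rangle ^s} \le \langle \xi \rangle ^{-(s-2)} = \langle \xi \rangle ^{-s \delta }, \qquad |\xi |^2 \, \frac{|F-G|}{\langle \xi \rangle ^s} \le |F-G|^{1-\delta } \left( \frac{|F-G|}{\langle \xi \rangle ^s}\right) ^{\delta } \le 2^{1-\delta } \, |f-g|_s^{\delta } \le 2 \, |f-g|_s^{\delta }, \end{aligned}$$

where we used \(|F - G| \le \Vert \hat{f} \Vert _{L^\infty } + \Vert \hat{g} \Vert _{L^\infty } \le 2\) for probability measures. Taking the supremum in \(\xi \) gives (7.12).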

Proof of (A3) We claim that for any \(s_1 \ge 3\) there exists \(C_1\in {\mathbb {R}}_+\) such that for all \(\Phi \in C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathbb {R}})\)

$$\begin{aligned} \left\| G^N (\Phi \circ \mu ^N_Z ) \!-\! \left\langle Q(\mu ^N_Z,\mu ^N_Z), D\Phi [\mu ^N_Z]\right\rangle \right\| _{L^\infty (E^N)} \!\le \! {C_1 \over N} \Vert \Phi \Vert _{C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathbb {R}})},\quad \quad \quad \end{aligned}$$
(7.13)

which is (A3) with \(k=2, \eta =1\) and \({\varepsilon }(N) = C_{1} \, N^{-1}\).

We begin with a technical lemma which shows that the norm \(|\cdot |_s\) is well-adapted for obtaining differentiability of the empirical measures. It is worth emphasizing that the choice of \(s_1 = 3\) (in fact we only need \(s_1 > 2\) by modifying slightly the arguments) comes from the requirement that the function \(V \mapsto \Phi (\mu ^N_Z)\) be \(C^2\).

Lemma 7.4

The map \({\mathbb {R}}^{dN}\rightarrow {\mathcal {P}}_{{\mathcal {G}}_1}(E), V \mapsto \mu ^N_Z\) is \(C^{2,1}\) and

$$\begin{aligned} \partial _{(v_i)_\alpha } (\mu ^N_Z) = \frac{1}{N} \, \partial _{\alpha } \delta _{v_i}, \quad \partial ^2_{(v_i)_\alpha ,(v_i)_\beta } (\mu ^N_Z) = N^{-1} \, \partial ^2_{\alpha \beta } \delta _{v_i}. \end{aligned}$$

Proof of Lemma 7.4

For \(v,w \in {\mathbb {R}}^d\), we have

$$\begin{aligned} |\delta _v - \delta _w|_s&= \sup _{\xi \in {\mathbb {R}}^d} {\left| e^{- i \, v \cdot \xi } - e^{- i \, w\cdot \xi }\right| \over \langle \xi \rangle ^s} \le |v-w| \, \sup _{\xi \in {\mathbb {R}}^d} {\left\| \nabla _v e^{- i \, v \cdot \xi }\right\| _{L^\infty ({\mathbb {R}}^d_v)} \over \langle \xi \rangle ^s}\\&\le |v-w| \, \sup _{\xi \in {\mathbb {R}}^d} {|\xi | \over \langle \xi \rangle ^s} \le |v-w| \end{aligned}$$

which shows that \(v \mapsto \delta _v\) is \(C^{0,1}\). For the sake of simplicity we present the proof of differentiability when \(d=1\), the case \(d>1\) being similar. For \(v \in {\mathbb {R}}\) and \(h \in {\mathbb {R}}^*\), we have

$$\begin{aligned} \left| \delta _{v+h} - \delta _v - h \, \delta '_v\right| _s = \sup _{\xi \in {\mathbb {R}}} {\left| (e^{-i \, \xi \, h} - 1 + i \, \xi \, h) \, e^{-i \, v \, \xi }\right| \over \langle \xi \rangle ^s} \le \sup _{\xi \in {\mathbb {R}}} {\left| \xi \, h\right| ^2 \over \langle \xi \rangle ^s} \le |h|^2, \end{aligned}$$

from which we deduce that \(v \mapsto \delta _v\) is \(C^{1,1}\). Similarly we can go to second order:

$$\begin{aligned}&\left| \delta _{v+h} - \delta _v - h \, \delta '_v - \frac{h^2}{2} \, \delta ''_v\right| _s \\&\quad = \sup _{\xi \in {\mathbb {R}}} { \left| \left( e^{-i \, \xi \, h} - 1 + i \, \xi \, h + \frac{\xi ^2 \, h^2}{2}\right) \, e^{-i \, v \, \xi } \right| \over \langle \xi \rangle ^s} \le \sup _{\xi \in {\mathbb {R}}} { \left| \xi \, h\right| ^3 \over \langle \xi \rangle ^s} \le |h|^3, \end{aligned}$$

and we easily conclude that \(v \mapsto \delta _v\) is \(C^{2,1}\). When the dimension \(d\) is greater than \(1\), one can perform the same argument for the partial derivatives of the Dirac mass. \(\square \)
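The bounds used in this proof all follow from the standard Taylor remainder estimate for the complex exponential, recalled here as a reading aid: for any real \(x\) and any integer \(n \ge 1\),

$$\begin{aligned} \left| e^{-i x} - \sum _{k=0}^{n-1} \frac{(-i x)^k}{k!}\right| \le \frac{|x|^n}{n!}, \qquad \sup _{\xi \in {\mathbb {R}}} \frac{|\xi |^n}{\langle \xi \rangle ^s} \le 1 \quad \text{whenever } s \ge n. \end{aligned}$$

Applied with \(x = \xi \, h\), the case \(n = 1\) gives the Lipschitz bound, \(n = 2\) the first-order expansion and \(n = 3\) the second-order expansion. In this form the argument requires \(s \ge 3\) for the last step, which is consistent with the choice \(s_1 = 3\).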

We come back to the proof of (7.13). Take \(\Phi \in C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E), {\mathbb {R}})\) and compute separately the contributions of \(G^N_i, i=1, 2\). Proceeding as in the proof of (A3) in Theorem 5.1 we have

$$\begin{aligned} G^N_1 \left( \Phi \circ \mu ^N_Z\right) = \left\langle Q_\alpha \left( \mu ^N_Z,\mu ^N_Z\right) , D \Phi (\mu ^N_Z)\right\rangle + I_2(V) \end{aligned}$$

with

$$\begin{aligned} |I_2(V)|&\le {1 \over 2N} \, \sum _{i,j= 1}^N \int _{\mathbb {S}^{d-1}} b(\cos (\theta _{ij})) \Vert \Phi \Vert _{C^{2,1}} \, \left| \mu ^N_{V^*_{ij}} -\mu ^N_Z\right| _{s_1}^{2} \, \mathrm{d}\sigma \\&\le {8 \over N} \, \Vert \Phi \Vert _{C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E), {\mathbb {R}})}, \end{aligned}$$

since for any \(i \not = j\)

$$\begin{aligned} \left| \mu ^N_{V^*_{ij}} -\mu ^N_Z\right| _{s_1}&= {1 \over N} \, \left| \delta _{v^*_i} + \delta _{v^*_j} - \delta _{v_i} - \delta _{v_j}\right| _{s_1}\\&\le {1 \over N} \, \left( \left| \delta _{v^*_i}\right| _{s_1} + \left| \delta _{v^*_j}\right| _{s_1} + \left| \delta _{v_i}\right| _{s_1} + \left| \delta _{v_j} \right| _{s_1}\right) = {4 \over N}. \end{aligned}$$
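The constants in these two estimates can be traced as follows (a brief arithmetic check). For any \(v \in {\mathbb {R}}^d\), the Fourier transform of \(\delta _v\) has modulus one, so that

$$\begin{aligned} |\delta _v|_{s_1} = \sup _{\xi \in {\mathbb {R}}^d} \frac{|e^{-i \, v \cdot \xi }|}{\langle \xi \rangle ^{s_1}} = 1, \end{aligned}$$

the supremum being attained at \(\xi = 0\). Since the sum over \(i,j\) contains at most \(N^2\) terms and \(\Vert b \Vert _{L^1(\mathbb {S}^{d-1})} = 1\), we get

$$\begin{aligned} |I_2(V)| \le \frac{1}{2N} \cdot N^2 \cdot \left( \frac{4}{N}\right) ^2 \Vert \Phi \Vert _{C^{2,1}} = \frac{8}{N} \, \Vert \Phi \Vert _{C^{2,1}}. \end{aligned}$$

The diagonal terms \(i = j\) do not contribute, since \(u_{ii} = 0\) gives \(V^*_{ii} = V\).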

On the other hand, as in the proof of assumption (A3) in Sect. 6, the map \({\mathbb {R}}^{dN} \rightarrow {\mathbb {R}}, V \mapsto \Phi (\mu ^N_Z)\) is \(C^{2,1}\) thanks to Lemma 7.4 and denoting \(\phi _Z = D\Phi \!\left[ \mu ^N_Z\right] \in ({\mathcal {H}}^{s_1}({\mathbb {R}}^d))' \), we compute

$$\begin{aligned} G^N_2 (\Phi (\mu ^N_Z))&= \sum _{i=1}^N \Delta _i \Phi (\mu ^N_Z)\\&= \sum _{i=1}^N \left\{ \frac{1}{N} \, (\Delta \phi _Z) (v_i) + \frac{1}{N^2} \sum _{\alpha =1}^d D^2\Phi \!\left[ \mu ^N_Z\right] \left( \partial _{\alpha } \delta _{v_i}, \partial _{\alpha } \delta _{v_i}\right) \right\} \\&= \langle \Delta \mu ^N_Z, \phi _Z \rangle + {\mathcal {O}}\left( \frac{\Vert \Phi \Vert _{C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E), {\mathbb {R}})}}{N}\right) . \end{aligned}$$
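To see where the \({\mathcal {O}}(1/N)\) remainder comes from, here is a sketch under the assumption that the operator norm of \(D^2\Phi \) on \({\mathcal {G}}_1 \times {\mathcal {G}}_1\) is controlled by \(\Vert \Phi \Vert _{C^{2,1}}\) (as the definition of that norm suggests). The first term is exactly \(\langle \mu ^N_Z, \Delta \phi _Z \rangle = \langle \Delta \mu ^N_Z, \phi _Z \rangle \). For the second, the Fourier transform of \(\partial _\alpha \delta _v\) has modulus \(|\xi _\alpha |\), so

$$\begin{aligned} |\partial _\alpha \delta _{v}|_{s_1} = \sup _{\xi \in {\mathbb {R}}^d} \frac{|\xi _\alpha |}{\langle \xi \rangle ^{s_1}} \le 1, \qquad \left| \frac{1}{N^2} \sum _{i=1}^N \sum _{\alpha =1}^d D^2\Phi \!\left[ \mu ^N_Z\right] \left( \partial _{\alpha } \delta _{v_i}, \partial _{\alpha } \delta _{v_i}\right) \right| \le \frac{d}{N} \, \Vert \Phi \Vert _{C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E), {\mathbb {R}})}. \end{aligned}$$

The gain of one power of \(N\) comes from the fact that only the \(N\) diagonal second derivatives appear in \(\sum _i \Delta _i\), each carrying a factor \(N^{-2}\).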

We conclude the proof by combining the previous estimates. \(\square \)

Proof of (A4) For \(f_{\mathrm{in}}, g_{\mathrm{in}} \in {\mathcal {P}}({\mathbb {R}}^d)\), we define the associated solutions \(f_t\) and \(g_t\) to the nonlinear Boltzmann equation; we define \(h_t := {\mathcal {L}}^{N\! L}_t[f_{\mathrm{in}}](g_{\mathrm{in}}-f_{\mathrm{in}})\) the solution of the linearized Boltzmann equation around \(f_t\); and we define \(r_t\) the solution to the “second variation” equation around \(f_t\). More precisely, we define

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t f_t = Q_\alpha (f_t,f_t) + \Delta \, f_t, \quad f_{|t=0} = f_{\mathrm{in}} \\ \partial _t g_t = Q_\alpha (g_t,g_t) + \Delta \, g_t, \quad g_{|t=0} = g_{\mathrm{in}} \\ \partial _t h_t = Q_\alpha (f_t,h_t) + Q_\alpha (h_t,f_t) + \Delta \, h_t, \quad h_{|t=0} = h_{\mathrm{in}},\\ \partial _t r_t = Q_\alpha (f_t,r_t) + Q_\alpha (r_t,f_t) + \Delta \, r_t + \frac{1}{2} Q_\alpha (h_t,\tilde{h}_t) + \frac{1}{2} Q_\alpha (\tilde{h}_t, h_t), \quad r_{|t=0} = 0 \end{array}\right. \end{aligned}$$

where in the last equation (second-order variation) \(h_t\) and \(\tilde{h}_t\) are two solutions to the third equation (first-order variation).

We then define when \(h_{\mathrm{in}} = \tilde{h}_{\mathrm{in}} = g_{\mathrm{in}} - f_{\mathrm{in}}\) the following error terms

$$\begin{aligned} \left\{ \begin{array}{l} \mathsf {d}_t := g_t - f_t \\ \omega _t := g_t - f_t - h_t = S^{N \! L}_t(g_{\mathrm{in}}) - S^{N \! L}_t(f_{\mathrm{in}})- {\mathcal {L}}^{N \! L}_t[f_{\mathrm{in}}] (g_{\mathrm{in}} - f_{\mathrm{in}}) \\ \psi _t := g_t - f_t - h_t - r_t. \end{array}\right. \end{aligned}$$

Lemma 7.5

Fix \(s \ge 0\) and \(T \in (0,\infty )\). There exists \(C_T\) such that for any \(f_{\mathrm{in}}, g_{\mathrm{in}} \in {\mathcal {P}}({\mathbb {R}}^d)\), the following estimates hold

$$\begin{aligned}&\forall \, t \in [0,T], \quad \left| h_t\right| _s \le C_T \, \left| h_{\mathrm{in}}\right| _s\!\!,\end{aligned}$$
(7.14)
$$\begin{aligned}&\forall \, t \in [0,T], \quad \left| r_t\right| _{2s} \le C_T \, \left| h_{\mathrm{in}}\right| _s \, |\tilde{h}_{\mathrm{in}}|_s, \end{aligned}$$
(7.15)

and when \(h_{\mathrm{in}} = \tilde{h}_{\mathrm{in}} = g_{\mathrm{in}} - f_{\mathrm{in}}\) we have furthermore

$$\begin{aligned}&\forall \, t \in [0,T], \quad \left| \omega _t\right| _{2s} \le C_T \, \left| f_{\mathrm{in}} - g_{\mathrm{in}}\right| ^2_s,\end{aligned}$$
(7.16)
$$\begin{aligned}&\forall \, t \in [0,T], \quad \left| \psi _t\right| _{3s} \le C_T \, \left| f_{\mathrm{in}} - g_{\mathrm{in}}\right| ^3_s. \end{aligned}$$
(7.17)

This proves that \(S^{N\!L}_t \in C^{2,1}({\mathcal {P}}_{{\mathcal {G}}_1}(E),{\mathcal {P}}_{{\mathcal {G}}_2}(E))\).

Proof of Lemma 7.5

We skip the proof of (7.14) since it is similar to the proof of (7.8). We then deal with each term successively. We work in the Fourier variable and introduce the notation \(F = \hat{f}, D = \hat{\mathsf {d}}, H = \hat{h}, \tilde{H} = \hat{\tilde{h}}, \Omega = \hat{\omega }, R = \hat{r}\) and \(\Psi = \hat{\psi }\).

Step 1. The evolution equation satisfied by \(\Omega \) is

$$\begin{aligned} \partial _t \Omega = \hat{Q}_\alpha (\Omega ,F) + \hat{Q}_\alpha (F,\Omega ) - |\xi |^2 \, \Omega + \hat{Q}_\alpha (D,D). \end{aligned}$$
(7.18)

We deduce in distributional sense

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} {|\Omega (\xi )| \over \langle \xi \rangle ^{2s}} \le {\mathcal {T}}_1 + {\mathcal {T}}_2, \end{aligned}$$

where

$$\begin{aligned} {\mathcal {T}}_1&:= \sup _{\xi \in {\mathbb {R}}^d} \int _{\mathbb {S}^{d-1}} {b\left( \sigma \cdot \hat{\xi }\right) \over \langle \xi \rangle ^{2s}} \, \Bigg (\left| \frac{\Omega (\xi ^+) \, F (\xi ^-)}{2}\right| + \left| \frac{\Omega (\xi ^-) \, F (\xi ^+)}{2}\right| \\&- F(\xi ) \Omega (0) - F(0) \Omega (\xi ) \Bigg ) \, \mathrm{d}\sigma \\&\le \sup _{\xi \in {\mathbb {R}}^d} \int _{\mathbb {S}^{d-1}} b\left( \sigma \cdot \hat{\xi }\right) \, \Bigg ({\left| \Omega (\xi ^+)\right| \over \langle \xi ^+\rangle ^{2s}} \, {\langle \xi ^+\rangle ^{2s} \over \langle \xi \rangle ^{2s}} + {\left| \Omega (\xi ^-)\right| \over \langle \xi ^-\rangle ^{2s}} \, {\langle \xi ^-\rangle ^{2s} \over \langle \xi \rangle ^{2s}} \\&+ \frac{|\Omega (\xi )|}{\langle \xi \rangle ^{2s}} + |\Omega (0)| \, \frac{|F(\xi )|}{\langle \xi \rangle ^{2s}} \Bigg ) \, \mathrm{d}\sigma \\&\le C \, \sup _{\xi \in {\mathbb {R}}^d} {\left| \Omega (\xi )\right| \over \langle \xi \rangle ^{2s}} + |\Omega (0)| \, \sup _{\xi \in {\mathbb {R}}^d} {\left| F (\xi )\right| \over \langle \xi \rangle ^{2s}}, \end{aligned}$$

for some constant \(C>0\), and

$$\begin{aligned} {\mathcal {T}}_2&:= {1\over 2} \, \sup _{\xi \in {\mathbb {R}}^d} \int _{\mathbb {S}^{d-1}} {b\left( \sigma \cdot \hat{\xi }\right) \over \langle \xi \rangle ^{2s}} \, \left| D (\xi ^+) \, D (\xi ^-) + D (\xi ^-) \, D (\xi ^+)\right| \, \mathrm{d}\sigma \\&\le {1 \over 2} \, \sup _{\xi \in {\mathbb {R}}^d} \int _{\mathbb {S}^{d-1}} b\left( \sigma \cdot \hat{\xi }\right) \, \left( {|D (\xi ^+)| \over \langle \xi ^+\rangle ^s} \, {|D (\xi ^-)|\over \langle \xi ^-\rangle ^s} + {|D (\xi ^-)| \over \langle \xi ^-\rangle ^s} \, {|D (\xi ^+)| \over \langle \xi ^+\rangle ^s}\right) \, \mathrm{d}\sigma \\&\le \left| \mathsf {d}_t\right| _s^2 \le C_T \, \left| f_{\mathrm{in}} - g_{\mathrm{in}} \right| _s^2, \end{aligned}$$

using the estimate (7.8). We then conclude by the Gronwall lemma.
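For completeness, here is how the Gronwall argument closes (a sketch, using that \(\Omega _t(0) = 0\) by conservation of mass, so that the term involving \(|\Omega (0)|\) in the bound on \({\mathcal {T}}_1\) vanishes, and that \(\omega _0 = g_{\mathrm{in}} - f_{\mathrm{in}} - h_{\mathrm{in}} = 0\)). Setting \(u(t) := |\omega _t|_{2s}\), the bounds on \({\mathcal {T}}_1\) and \({\mathcal {T}}_2\) give

$$\begin{aligned} u(t) \le \int _0^t \left( C \, u(\tau ) + C_T \, \left| f_{\mathrm{in}} - g_{\mathrm{in}}\right| _s^2\right) \mathrm{d}\tau \quad \Longrightarrow \quad u(t) \le C_T \, \frac{e^{C t} - 1}{C} \, \left| f_{\mathrm{in}} - g_{\mathrm{in}}\right| _s^2, \qquad t \in [0,T], \end{aligned}$$

which is (7.16) after renaming the constant.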

Step 2. The evolution equation satisfied by \(R\) is

$$\begin{aligned} \partial _t R&= \hat{Q}_\alpha (F,R) + \hat{Q}_\alpha (R,F) - |\xi |^2 \, R\nonumber \\&+ \frac{1}{2} \hat{Q}_\alpha (H,\tilde{H}) + \frac{1}{2} \hat{Q}_\alpha (\tilde{H}, H), \quad R_{|t=0} = 0. \end{aligned}$$
(7.19)

Equation (7.19) being similar to Eq. (7.18), with the same computations as in Step 1 we deduce that (7.15) holds.

Step 3. Choosing now \(h_{\mathrm{in}} = \tilde{h}_{\mathrm{in}} = \mathsf {d}_{\mathrm{in}}\), the equation satisfied by \(\Psi \) is

$$\begin{aligned} \partial _t \Psi = \hat{Q}_\alpha (F,\Psi ) + \hat{Q}_\alpha (\Psi ,F) - |\xi |^2 \, \Psi + \hat{Q}_\alpha (\Omega ,H) + \hat{Q}_\alpha (D,\Omega ), \quad \Psi _{|t=0} = 0. \end{aligned}$$
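The particular combination of \(\Omega \), \(H\) and \(D\) in the source terms can be traced to the bilinearity of \(Q_\alpha \) (a short algebraic remark added as a reading aid). Since \(\psi _t = \omega _t - r_t\) and \(\mathsf {d}_t = h_t + \omega _t\), the quadratic contributions of the equations for \(\omega \) and \(r\) (with \(h = \tilde{h}\)) combine through

$$\begin{aligned} Q_\alpha (\mathsf {d},\mathsf {d}) - Q_\alpha (h,h) = Q_\alpha (\mathsf {d},\mathsf {d}-h) + Q_\alpha (\mathsf {d}-h,h) = Q_\alpha (\mathsf {d},\omega ) + Q_\alpha (\omega ,h), \end{aligned}$$

so that each source term contains at least one factor \(\omega \), which is quadratic in \(f_{\mathrm{in}} - g_{\mathrm{in}}\) by (7.16), times a factor that is linear in \(f_{\mathrm{in}} - g_{\mathrm{in}}\). This is what produces the cubic bound (7.17).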

Observe that by conservation of mass \(\Psi _t(0)=0\) for all times, and with these choices of initial data \(\hat{Q}_\alpha (\Omega ,H) = \hat{Q}_\alpha ^+(\Omega ,H)\) and \(\hat{Q}_\alpha (D,\Omega ) = \hat{Q}_\alpha ^+(D,\Omega )\). Then we perform similar computations as in Step 1, and we deduce in distributional sense

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} {|\Psi (\xi )| \over \langle \xi \rangle ^{3s}} \le {\mathcal {T}}_1 + {\mathcal {T}}_2 + {\mathcal {T}}_3, \end{aligned}$$

where

$$\begin{aligned} {\mathcal {T}}_1&:= \sup _{\xi \in {\mathbb {R}}^d} {|\hat{Q}_\alpha (F,\Psi ) + \hat{Q}_\alpha (\Psi ,F)|\over \langle \xi \rangle ^{3s}} \le C \, \sup _{\xi \in {\mathbb {R}}^d} {\left| \Psi (\xi )\right| \over \langle \xi \rangle ^{3s}},\\ {\mathcal {T}}_2&:= \sup _{\xi \in {\mathbb {R}}^d} {|\hat{Q}_\alpha ^+(\Omega ,H) |\over \langle \xi \rangle ^{3s}} \le 2 \, \left( ~\sup _{\xi \in {\mathbb {R}}^d} {\left| \Omega (\xi )\right| \over \langle \xi \rangle ^{2s}}\right) \, \left( ~\sup _{\xi \in {\mathbb {R}}^d} {\left| H(\xi )\right| \over \langle \xi \rangle ^{s}}\right) ,\\ {\mathcal {T}}_3&:= \sup _{\xi \in {\mathbb {R}}^d} {| \hat{Q}^+_\alpha (D,\Omega ) |\over \langle \xi \rangle ^{3s}} \le 2 \, \left( ~\sup _{\xi \in {\mathbb {R}}^d} {\left| D (\xi )\right| \over \langle \xi \rangle ^{s}}\right) \, \left( ~\sup _{\xi \in {\mathbb {R}}^d} {\left| \Omega (\xi )\right| \over \langle \xi \rangle ^{2s}}\right) . \end{aligned}$$
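The splitting of the weight behind the bounds on \({\mathcal {T}}_2\) and \({\mathcal {T}}_3\) can be written out as follows (a sketch, relying only on \(|\xi ^\pm | \le |\xi |\)). For each \(\sigma \in \mathbb {S}^{d-1}\),

$$\begin{aligned} \frac{|\Omega (\xi ^\pm )| \, |H(\xi ^\mp )|}{\langle \xi \rangle ^{3s}} = \frac{|\Omega (\xi ^\pm )|}{\langle \xi ^\pm \rangle ^{2s}} \, \frac{|H(\xi ^\mp )|}{\langle \xi ^\mp \rangle ^{s}} \, \frac{\langle \xi ^\pm \rangle ^{2s} \, \langle \xi ^\mp \rangle ^{s}}{\langle \xi \rangle ^{3s}} \le |\omega |_{2s} \, |h|_s, \end{aligned}$$

and integrating against \(b(\sigma \cdot \hat{\xi }) \, \mathrm{d}\sigma \), which has unit mass, gives the stated bound (with room to spare in the constant). The same computation with \((D,\Omega )\) in place of \((\Omega ,H)\) handles \({\mathcal {T}}_3\). The weights thus add up as \(2s + s = 3s\), which is the reason for the choice \(s_2 = 3 \, s_1\) in the definition of \({\mathcal {G}}_2\).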

Finally we conclude the proof of (7.17) using the already established estimates (7.8), (7.14), (7.16), and the Gronwall lemma.\(\square \)

Proof of (A5) We use the following result proved in [4] (see also [2] for a similar result)

$$\begin{aligned} \sup _{t \ge 0}W_2( S^{N\!L}_t f_{\mathrm{in}}, S^{N\!L}_t g_{\mathrm{in}}) \le W_2 (f_{\mathrm{in}},g_{\mathrm{in}}), \end{aligned}$$

which concludes the proof.