Random matching in 2D with exponent 2 for Gaussian densities

We solve the Random Euclidean Matching problem with exponent 2 for the Gaussian distribution on the plane. Previous works by Ledoux and Talagrand determined the leading behavior of the average cost up to a multiplicative constant. We determine the constant explicitly, showing that the average cost is proportional to (log N)^2, where N is the number of points. Our approach relies on a geometric decomposition that allows an explicit computation of the constant. Our results illustrate the potential for exact solutions of random matching problems for many distributions defined on unbounded domains in the plane.


Introduction
Here we consider the Random Euclidean Matching problem with exponent 2 for some unbounded distributions in the plane. The case of unbounded distributions is important both from a mathematical and from an applied point of view.
In particular we consider the case of the Gaussian and the case of the Maxwellian, and we give some hints on the case of other unbounded, exponentially decaying densities.
Random Euclidean Matching is a combinatorial optimization problem in which N red points and N blue points, extracted independently from the same probability distribution, are paired, each red point with a single blue point and vice versa, in order to minimize the sum of the distances (each distance raised to a certain power) between the points. The problem is equivalent to calculating the Wasserstein distance between the two empirical measures associated with the points or, which is the same, to finding the optimal transport between the two empirical measures.
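As an illustrative aside (not part of the paper's arguments), for very small N the optimal matching can be found by brute force over all permutations; all function names below are ours:

```python
import itertools
import random

def matching_cost(xs, ys, perm):
    # Sum of squared Euclidean distances for the pairing i -> perm[i].
    return sum((xs[i][0] - ys[j][0]) ** 2 + (xs[i][1] - ys[j][1]) ** 2
               for i, j in enumerate(perm))

def optimal_matching(xs, ys):
    # Exhaustive search over all N! permutations; feasible only for tiny N.
    n = len(xs)
    return min(
        (matching_cost(xs, ys, perm), perm)
        for perm in itertools.permutations(range(n))
    )

random.seed(0)
N = 5
red = [(random.random(), random.random()) for _ in range(N)]
blue = [(random.random(), random.random()) for _ in range(N)]
cost, perm = optimal_matching(red, blue)
# Up to the factor N, the optimal cost is the squared 2-Wasserstein
# distance between the two empirical measures.
w2_squared = cost / N
```

For realistic N one would use a polynomial-time assignment algorithm instead of enumeration; the sketch above only illustrates the combinatorial definition.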
Apart from the great mathematical interest in the subject, in recent years applications of Matching and Optimal Transport in statistical learning and, more generally, in applied mathematics have enormously increased. In particular we refer to [17], [13], [19] and references therein for applications of Optimal Transport to statistical learning, computer graphics, image processing, shape analysis, pattern recognition, particle systems and more. For general reviews on Optimal Transport we refer to [29] and, with a more applied aim, to [25], while for a general review of results, tools and methods in probability, with applications to the matching problem, we refer the reader to [28].
Coming to the mathematical problem, recently there has been a lot of activity on this topic because of some sharp progress, starting from a conjecture by Caracciolo et al., which shows how the problem can be rephrased in terms of a PDE.
Let μ be a probability distribution defined on Λ ⊂ R^2. Let us consider two sets X_N = {X_i}_{i=1}^N and Y_N = {Y_i}_{i=1}^N of N points independently sampled from the distribution μ. The Euclidean Matching problem with exponent 2 consists in finding the matching i → π_i, i.e. the permutation π of {1, . . ., N} which minimizes the sum of the squares of the distances between X_i and Y_{π_i}, that is (1.1). The cost defined above can be seen, up to a constant factor N, as the square of the 2-Wasserstein distance between two probability measures. In fact, the p-Wasserstein distance W_p(μ, ν), with exponent p ≥ 1, between two probability measures μ and ν, is defined by W_p^p(μ, ν) = inf ∫ |x − y|^p dJ_{μ,ν}(x, y), where the infimum is taken over all the joint probability distributions dJ_{μ,ν}(x, y) with marginals with respect to dx and dy given by μ and ν, respectively. Defining the empirical measures μ_N := (1/N) Σ_{i=1}^N δ_{X_i} and ν_N := (1/N) Σ_{i=1}^N δ_{Y_i}, one has C_N(X_N, Y_N) = N W_2^2(μ_N, ν_N) (see for instance [12]). In the sequel we will shorten C_N := C_N(X_N, Y_N).
The first general result on Random Euclidean Matching was obtained by combinatorial arguments in [1]. In particular, in the case of dimension 2 and exponent 2, assuming that X_i and Y_i are independently sampled with uniform density on the unit square Q, they prove that E_σ[W_2^2(μ_N, ν_N)] behaves like log N / N, where by E_σ we have denoted the expected value with respect to the uniform distribution dσ(x) = dx of the points {X_i} and {Y_i}.
In the challenging paper [15], Caracciolo et al. conjecture that E_σ[C_N] ∼ (1/(2π)) log N, where we say that f ∼ g if lim_{N→+∞} f(N)/g(N) = 1. In terms of W_2^2 the conjecture is equivalent to E_σ[W_2^2(μ_N, ν_N)] ∼ log N / (2πN). Furthermore, in [15] it is conjectured that asymptotically the expected value of W_2^2(μ_N, σ), between the empirical measure μ_N and the uniform probability measure σ on Q, is given by log N / (4πN). The above conjectures were proved by Ambrosio et al. [5]. In [2] more precise estimates are given and it is proved that the result can be extended to the case where the particles are sampled from the uniform measure on a two-dimensional compact Riemannian manifold. In [3] it is shown that the optimal transport map for W_2(μ_N, σ) can be approximated as conjectured in [15].
We notice that, by simple scaling arguments, if we consider squares or manifolds Λ of measure |Λ|, the cost has to be multiplied by |Λ|.
Then, in [8] it has been conjectured that, if the points are sampled from a smooth and strictly positive density ρ in a regular set Λ, then the result is the same: i.e. the leading term of the expected value of the cost is (|Λ|/(2π)) log N. The conjecture is based on a linearization of the Monge–Ampère equation close to a non-uniform density, and a proof of the estimate from above is given when Λ is a square. This result has been proved by Ambrosio et al. [4]. In particular they generalize the result to Hölder continuous positive densities in bounded regular sets and in Riemannian manifolds.
Summarizing, if the density σ = ρ dx is supported in a bounded regular set Λ, where ρ is Hölder continuous, and if there exist constants a and b such that 0 < a < ρ < b, then

(1.5)  E_σ[C_N] ∼ (|Λ|/(2π)) log N.

In [9], in the case of constant densities, the correction to the leading behavior has been studied. In particular it is conjectured that the correction is given in terms of the regularized trace of the inverse of the Laplace operator in the set.

Main results
Interestingly, (1.5) implies that the limiting average cost is not continuous in the space of densities, even in the L^∞ norm.
Indeed, if we consider a sequence of smooth strictly positive densities ρ_k on the disk of radius 2, converging, as k → ∞, to ρ = (1/π) 1_{|x|<1}, that is the uniform density on the disk of radius 1, we get that for any k: E_σ[C_N] ∼ 2 log N, while for the limiting density ρ we get E_σ[C_N] ∼ (1/2) log N. It is therefore natural to ask if it is possible to define sequences of densities, positive on all the disk of radius 2, that converge to the density ρ = (1/π) 1_{|x|<1}, and such that the limiting average cost, divided by log N, attains intermediate values. The answer is yes.
For instance, we can consider, in the disk of radius 2, the sequence of N-dependent "multiscaling" densities ρ_N, with 0 < α < 1, constant on the unit disk and on the annulus 1 ≤ |x| < 2 and normalized so that, on average, there are N − N^α points in the disk and N^α points in the annulus.
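A minimal sketch of how one might sample from such a two-region density (the normalization below is our own illustrative choice; the paper's exact formula for ρ_N is not reproduced here):

```python
import math
import random

def sample_multiscaling(N, alpha, rng):
    """Sample one point from a two-region density on the disk of radius 2:
    mass 1 - N**(alpha - 1) uniformly on the unit disk, mass N**(alpha - 1)
    uniformly on the annulus 1 <= |x| < 2 (illustrative normalization)."""
    p_annulus = N ** (alpha - 1)
    u = rng.random()
    if rng.random() < p_annulus:
        r = math.sqrt(1 + 3 * u)   # area-uniform radius on the annulus
    else:
        r = math.sqrt(u)           # area-uniform radius on the unit disk
    theta = 2 * math.pi * rng.random()
    return (r * math.cos(theta), r * math.sin(theta))

rng = random.Random(0)
N, alpha = 10_000, 0.5
pts = [sample_multiscaling(N, alpha, rng) for _ in range(20_000)]
frac_inner = sum(1 for (x, y) in pts if x * x + y * y < 1) / len(pts)
# In expectation, a fraction 1 - N**(alpha - 1) = 0.99 of the points
# falls in the unit disk, so on average N - N**alpha of the N points do.
```
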

Figure 1: Multiscaling density.
In this case, if dσ(x) = ρ_N(x) dx, the average of the cost is given by the formula we prove in Section 3.

Here we consider three problems. The first is a generalization of the example seen above to any finite number of circular annuli. As we shall see, this can be considered as a toy model for the Gaussian case.
Under suitable monotonicity conditions we will prove, see Theorem 3.1, the asymptotic behavior of the average cost.

The second case is that of the Gaussian distribution, that is dμ(x) = ρ(x) dx with ρ the Gaussian density. In this case Talagrand proved [27] that the average cost, for large N, is proportional to (log N)^2. An estimate from above proportional to (log N)^2 was previously proved by Ledoux in [22]; see also [23], where an estimate from below is proved using PDE techniques as in [5].
In this case we prove, see Theorem 4.1, the exact asymptotics of the average cost.

The third case is when the density is the Maxwellian: dμ(x) = ρ(x) dx, where, interpreting x_2 as a velocity, ρ is simply the Maxwellian distribution for a gas in the box (the segment) [0, 1].
In this case we prove, see Theorem 5.1, the asymptotics of the average limit cost. As we shall see, the problem of the Gaussian and the problem of the Maxwellian can be considered as limits of suitable sequences of multiscaling densities; in particular, in both cases we obtain the same leading behavior. Dealing with other radially symmetric exponentially decaying densities, that is with densities proportional to e^{−|x|^α}, Talagrand showed that the leading behavior is proportional to (log N)^{1+2/α}. We think that it would be possible to modify the proofs given here to deal with these cases and to obtain also there the exact leading behavior (i.e. determining also the multiplicative constant); more precisely, by (1.7) we would get the corresponding constant explicitly. Note that for α = 2 we would find the constant 1/4 instead of 1/2, because here we are considering the Gaussian e^{−|x|^2} and not e^{−|x|^2/2}. To prove this is beyond the scope of the present paper, and the technique would be slightly different, because here we make use of the fact that the Gaussian is a product measure.
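The following back-of-the-envelope computation (ours, not a proof, ignoring normalization constants and lower-order terms) is consistent with the remark that α = 2 gives the constant 1/4: for a density proportional to e^{−|x|^α}, the region where Nρ > 1 has radius R ≈ (log N)^{1/α}, and integrating the heuristic local cost density (1/2π) log(Nρ(x)) over it gives

```latex
\int_{|x|\le R} \frac{\log\!\bigl(N\rho(x)\bigr)}{2\pi}\,dx
\approx \frac{1}{2\pi}\int_0^{R}\bigl(\log N - r^{\alpha}\bigr)\,2\pi r\,dr
= \frac{R^{2}\log N}{2}-\frac{R^{\alpha+2}}{\alpha+2}
= \frac{\alpha}{2(\alpha+2)}\,(\log N)^{1+2/\alpha},
\qquad R=(\log N)^{1/\alpha}.
```

Indeed, for α = 2 the prefactor α/(2(α+2)) equals 1/4.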
As for probability distributions that decay as a power of the distance, for instance 1/(1 + |x|^α), we do not dare to make conjectures. In that case the slow decay of the distribution does not allow us to apply techniques similar to those used in this work.
The structure of the paper is the following. In Section 2 we give some general results on the Wasserstein distance that we use in the sequel.
In Section 3 we consider the case of multiscaling densities, in Section 4 the case of the Gaussian density and in Section 5 the case of the Maxwellian density. Now we briefly review what is known, to our knowledge, on Random Euclidean Matching in dimension d ≠ 2, with particular attention to the case of the constant distribution on the unit cube and of the Gaussian distribution.
In dimension 1 the Random Euclidean Matching problem is almost completely characterized, for any p ≥ 1. This is due to the fact that the best matching between two sets of points on a line is monotone; see for instance [16], [14], and [11], where a general discussion of the one-dimensional case, also for non-constant densities, is given. In particular, for a segment of length 1 and for p = 2: E[C_N] → 1/3 as N → ∞. For the normal distribution in dimension 1, in [10] it is proved that E[C_N] ∼ log log N, while in [11] estimates from below and from above proportional to log log N were given.
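The monotone structure of the one-dimensional optimizer for p = 2 is easy to check numerically; a minimal stdlib sketch (all names ours), verifying against brute force on a small instance:

```python
import itertools
import random

def monotone_cost(xs, ys):
    # On the line with p = 2 the optimal matching is monotone:
    # sort both samples and pair them in order.
    xs, ys = sorted(xs), sorted(ys)
    return sum((x - y) ** 2 for x, y in zip(xs, ys))

def brute_force_cost(xs, ys):
    # Exhaustive minimum over all pairings, for comparison.
    n = len(xs)
    return min(sum((xs[i] - ys[p[i]]) ** 2 for i in range(n))
               for p in itertools.permutations(range(n)))

random.seed(1)
xs = [random.random() for _ in range(6)]
ys = [random.random() for _ in range(6)]
assert abs(monotone_cost(xs, ys) - brute_force_cost(xs, ys)) < 1e-12
```
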
In dimension d ≥ 3, for the constant density in a cube, it has been proved that E[C_N] behaves as N^{1−p/d}, for any p ≥ 1 (see [26], [18], [22]). In [21] the existence of the limit of E[C_N]/N^{1−p/d} has been proved for any p ≥ 1. In dimension d ≥ 3, the case of unbounded densities, and in particular the Gaussian case, has been widely studied; see [27], [18], [6], [24]. In particular, in [6] it has been proved that E[C_N] behaves as N^{1−p/d} for any 0 < p < d/2, and an explicit expression for the constant multiplying N^{1−p/d} is conjectured, while in [24] it has been proved that E[C_N] behaves as N^{1−p/d} for any 1 ≤ p < d. General results on Random Euclidean Matching, including the case p > d and the case of unbounded densities, are given in [20].

Useful results and notations
In this Section, we recall some preliminary results that we will need later.
The following Lemma links the cost of the semidiscrete problem to the cost of the bipartite one.
Lemma 2.1 ([5], Proposition 4.8). Let ρ be any probability density on R^2 and let X_1, . . ., X_N and Y_1, . . ., Y_N be independent random variables in R^2 with common distribution ρ; then the expected bipartite cost is controlled by the expected semidiscrete one. We will also use the following property for the upper bounds of the leading terms, which is a consequence of the Benamou–Brenier formula. The result we will use is the following.
We will also use a result by Talagrand, relating the Wasserstein distance to the relative entropy.
If μ is another density on R^d, then the corresponding transport–entropy inequality holds. Then, when proving the convergence, while for the upper bound we will use the canonical Wasserstein distance, for the lower bound we will use, as in [4], a distance between non-negative measures introduced in [19], namely the boundary version W^b_2. In the following Theorem we denote by W_{2,*} either the canonical W_2 or its boundary version W^b_2, and we collect some known results that we will need later. Let Ω ⊆ R^2 be a bounded connected domain with Lipschitz boundary and let ρ be a Hölder continuous probability density on Ω, uniformly strictly positive and bounded from above. Given iid random variables {X_i}_{i=1}^N and {Y_i}_{i=1}^M with common distribution ρ, we have the asymptotics (2.2) and (2.3) of Theorem 2.4. Finally, we specify that hereafter we will denote the expected value conditioned on a random variable X by E[ · | X]. In the sequel, we denote with the same symbol a probability measure absolutely continuous with respect to the Lebesgue measure and its density.

A piecewise multiscaling density
In this Section, we examine the transportation cost of a random matching problem when X_1, . . ., X_N and Y_1, . . ., Y_N are independent random variables in the disk C = {|x| ≤ S} with common distribution ρ^L_N, piecewise constant on the annuli C_l, where we have chosen the exponents α_l strictly positive and decreasing in the index l, with α_0 := 1. This density is piecewise constant on the annuli C_l; it depends on the number of particles we are considering, and it allows one to have (in expectation) N^{α_l} particles in the annulus C_l. Here we prove the following theorem.
Theorem 3.1. If X_1, . . ., X_N and Y_1, . . ., Y_N are iid random variables with common distribution ρ^L_N, the average cost has the asymptotic behavior stated above. Let us notice that, even though ρ^L_N is supported on the whole disk C, the asymptotic cost of the problem (except for a factor 2π or 4π) is multiplied by Σ_{l=0}^{L−1} α_l |C_l|, and therefore the cost is strictly smaller than the cost of the problem with particles distributed with a density bounded from below by a positive constant, as proved in ... . This happens because ρ^L_N is not bounded from below: except on the disk C_0, it vanishes everywhere for large N.
We can also notice that ρ^L_N concentrates, for large N, on the measure μ_0, while the total cost is strictly larger than the cost of the problem when the particles are distributed with the measure μ_0.
Finally, let us notice that the second statement of Theorem 3.1 admits an equivalent formulation. First we prove the following Lemma, similar to propositions proved in [8] and [4]. It allows one to compute the total cost as the sum of the costs of the problems on the annuli. The difficulty in estimating the Wasserstein distance between two measures that are not bounded from below is that, when we use the Benamou–Brenier formula, we find a divergent term due to a vanishing denominator. In the annulus C_l this term is balanced by the numerator, which involves the fluctuations of the particles in C_l and in ∪_{l=0}^{L−1} C_l, whose order is the same thanks to the choice of the exponents α_l.
Lemma 3.1. There exists a constant c > 0 such that, if X_1, . . ., X_N and Y_1, . . ., Y_N are independent random variables in C with common distribution ρ^L_N, and N_l and M_l are respectively the numbers of points X_i and Y_i in C_l, i.e. N_l := Σ_{i=1}^N 1(X_i ∈ C_l), then the bounds (3.2), (3.3) and (3.4) hold.

Proof. Since the proofs are very similar, we only focus on (3.2) and then we explain how to obtain (3.3) and (3.4).
Let f_N be the weak solution of the associated equation; then by Theorem 2.2 we get the corresponding upper bound. As explained before, we are now going to prove that, when we take the expectation, the divergent term due to the vanishing density is balanced by the small fluctuations of the particles.
We can find f_N depending only on |x|, and we observe that the factor 2πs is exactly what we need to write the integral in polar coordinates. We obtain a chain of inequalities where, in the last one, we have used that if l = 0 the first summand disappears, since P_l = N and ρ^L_N vanishes there almost everywhere, and therefore the function 2πr in the denominator is multiplied by r^4, thus it is integrable. Moreover, thanks to the choice of {α_l}_{l=0}^{L−1}, we have a further bound with a constant c depending on L. Finally, from (3.5), (3.6) and (3.7) we get (3.2). Then, the proof of (3.3) is exactly the same as that of (3.2).
To obtain (3.4) it is sufficient to observe the corresponding decomposition and then to use (3.2) and (3.3).
Now we can prove Theorem 3.1. Thanks to Lemma 2.1, it is sufficient to prove the upper bound for semidiscrete matching, in Proposition 3.1, and the lower bound for bipartite matching, in Proposition 3.2.
The structure of the proofs is the same as that of Theorem 1 in [8] and Theorems 1.1 and 1.2 in [4]. First, we use the fact that the total transportation cost on the disk C is estimated by the sum of the costs on the annuli C_l. This is possible thanks to Lemma 3.1. Then we use the fact that the problem on the annulus C_l has been solved in [4] (it is a particular case of Theorems 1.1 and 1.2), because the probability density ρ^L_N is piecewise constant on the annuli C_l (and, thus, piecewise bounded from below). Therefore, if N_l is the number of particles in C_l, each annulus contributes to the total cost with a term approximated by |C_l| log N_l, except for a factor 4π or 2π in semidiscrete and bipartite matching respectively. The total cost is a convex combination of all these terms, so the main contribution (avoiding the factors 4π and 2π) turns out to be Σ_{l=0}^{L−1} α_l |C_l| log N. Hence, thanks to the triangle inequality and the convexity of the quadratic Wasserstein distance, for β > 0 we get (3.8), and, since β > 0 is arbitrary, combining (3.8) with (3.2) of Lemma 3.1 we get a lim sup bound (3.9). Let now A_l be the good event for the annulus C_l. We can compute the expected value in (3.9) separately on the sets A_l^c and A_l. On A_l^c we have the bound (3.10). Using (3.10) and (3.11) we get (3.12). Therefore we can restrict ourselves to the expected value on A_l: we use the properties of the conditioned expected value and (2.2) of Theorem 2.4, which applies since min_{l=0,...,L−1} α_l > 0 and hence each N_l diverges with N.
Finally, we use the concavity of the function log x to bound the convex combination by the logarithm of the average. Thus, using (3.12), (3.13) and (3.14) we obtain the upper bound, and combining this with (3.9) we obtain the thesis.
Proposition 3.2. Let X_1, . . ., X_N and Y_1, . . ., Y_N be independent random variables in C with common distribution ρ^L_N. Then the lim inf bound holds.

Proof. (Sketch) Since the proof is quite similar to the previous one, we only explain the differences. First, we can restrict to a special set where θ = 1/√(log N); its complement has small probability. Then, if N_l and M_l are respectively the numbers of particles X_i and Y_i in C_l, we rename the relevant quantities accordingly. Using the superadditivity of (W^b_2)^2, we obtain a lower bound whose main contribution is given by the term in (3.15), and, as in the previous Theorem, it can be estimated using (2.3) of Theorem 2.4.
To prove that the term in (3.16) is negligible, it is sufficient to recall Lemmas 6.2 and 3.1.
Now we have proved Theorem 3.1. Let us observe that we can choose the exponents α_l and the annuli C_l in an interesting way, so that the sum Σ_l α_l |C_l| is a Riemann sum for the function f(x) = πS^2 x. Therefore, if we decrease max_{l=0,...,L−1} {α_l − α_{l+1}}, in the limit we obtain the corresponding integral. In particular, the case S = √(2 log N) introduces us to Section 4; indeed the Gaussian density has the following property: if α_0 = 1 > α_1 > ⋯ > α_L = 0, the annulus A_l delimited by the radii √(2(1 − α_l) log N) and √(2(1 − α_{l+1}) log N) contains, on average, N^{α_l} − N^{α_{l+1}} ∼ N^{α_l} particles; therefore it contributes to the total cost with a term of the same form as before, and by summing over l = 0, . . ., L − 1 and decreasing max_{l=0,...,L−1} {α_l − α_{l+1}} we get the Gaussian asymptotics. Notwithstanding this, the case of the Gaussian density has some further difficulties. The first one is that what we have proved in the case of the multiscaling density depends on the number of annuli we are considering, while we are currently not able to approximate the Gaussian density with a piecewise multiscaling density on a finite number of annuli. Otherwise, if we consider a countable set of annuli, we need a uniform bound for the cost of the problem on an annulus. Instead, when dividing the disk of radius √(2 log N) into squares, using a rescaling argument, we only need a bound for the cost of the problem on a square (and we already have it from [5]). Moreover, the main property we use here is not that the Gaussian density is radial, but rather that it is a product measure, i.e. a function of x_1 times a function of x_2.
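For the standard planar Gaussian (density e^{−|x|^2/2}/(2π)) the radii delimiting these annuli can be checked by a one-line computation, since P(|X| ≥ r) = e^{−r^2/2}; the identities below are exact up to floating-point error (function name is ours):

```python
import math

def expected_points_beyond(N, r):
    # For the standard Gaussian in the plane, P(|X| >= r) = exp(-r**2 / 2),
    # so the expected number of points beyond radius r is N * exp(-r**2 / 2).
    return N * math.exp(-r * r / 2)

N = 10_000
for alpha in (0.25, 0.5, 0.75):
    r_alpha = math.sqrt(2 * (1 - alpha) * math.log(N))
    # Exactly N**alpha points (in expectation) lie beyond r_alpha, so the
    # annulus between r_alpha and r_{alpha'} holds N**alpha - N**alpha'.
    assert abs(expected_points_beyond(N, r_alpha) - N ** alpha) < 1e-6 * N ** alpha
```
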

The Gaussian density
This Section concerns the problem of X_1, . . ., X_N and Y_1, . . ., Y_N independent random variables in R^2 distributed according to the Gaussian measure ρ, that is ρ(x) = e^{−|x|^2/2}/(2π). In Subsection 4.2 we prove the following theorem.
First, we underline that, even though the Gaussian density has unbounded support, the number of particles in a cube of side dx is of order N e^{−|x|^2/2} dx, and we can notice that N e^{−|x|^2/2} is strictly smaller than 1 when |x| > √(2 log N). Therefore, using the results in [8] and [4], we can expect the cost for semidiscrete and bipartite matching to come from the region {|x| ≤ √(2 log N)}, except for a factor 1/(4π) or 1/(2π) respectively. To achieve these results, we apply a cut-off and we substitute ρ with a density, which we will again call ρ_N, whose support is contained in {|x| ≤ √(2 log N)}. To define ρ_N, we proceed in the following way. We cannot arrive exactly at √(2 log N), otherwise there would be too few particles close to the boundary of {|x| ≤ √(2 log N)}; therefore we define a slightly smaller radius r_N and we construct a collection of squares that covers {|x| ≤ r_N}. In this construction, J is a set of intervals in direction x_1, while K is a set of intervals in direction x_2. Now we define a set of squares Q^j_k that covers {|x| ≤ r_N}, as follows. First, we denote by k_min and k_max and by j_min and j_max the extreme values of the indices. Before going on, we can notice here that, thanks to the choice of the squares, if N^j_k is the (random) number of points in the square Q^j_k when the distribution of the particles is Gaussian (after having applied the cut-off, the expectation of this number can only increase), the expected number of points in each square is under control. The squares are organized in horizontal rectangles R_k, with projections J_k on the axis x_1. Finally, we define the set E_N covered by the squares. Hereafter, if X̃_1, . . ., X̃_N and Ỹ_1, . . ., Ỹ_N are independent and identically distributed with measure ρ_N, we define N^j_k and M^j_k as the numbers of points X̃_i and Ỹ_i in the square Q^j_k, respectively. Finally, where not specified, we denote by Σ_{j,k} the sum over k = k_min, . . ., k_max and over the corresponding indices j. In Subsection 4.1 we prove some bounds that we will need for the proof of Theorem 4.1 in Subsection 4.2.
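A quick simulation (illustrative only, assuming the standard Gaussian with independent N(0,1) coordinates) of the fact that essentially no points survive beyond √(2 log N), so the cut-off is harmless:

```python
import math
import random

rng = random.Random(0)
N = 10_000
r_N = math.sqrt(2 * math.log(N))   # cutoff radius: N * exp(-r_N**2 / 2) = 1

outside = 0
for _ in range(N):
    x, y = rng.gauss(0, 1), rng.gauss(0, 1)
    if x * x + y * y > r_N * r_N:
        outside += 1
# On average only about one of the N Gaussian points falls beyond r_N,
# since N * P(|X| >= r_N) = N * exp(-r_N**2 / 2) = 1 exactly.
```
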

Preliminary estimates
This first result proves that we can substitute N independent random variables with common distribution ρ with N independent random variables with common distribution ρ_N.

Lemma 4.1. Let ρ and ρ_N be defined as before, X_1, . . ., X_N independent random variables in R^2 with common distribution ρ, and T : R^2 → R^2 the optimal map that transports ρ onto ρ_N. Then (4.1) and (4.2) hold.

Proof. For (4.1) we use again Theorem 2.3, while as for (4.2), if T is the optimal map that transports ρ onto ρ_N, the thesis follows thanks to (4.1).
The following Proposition allows us to compute the total cost of the problem as the sum of the costs of the problems on the squares Q^j_k. We have to bound the expectation of the distance between the Gaussian measure and the same Gaussian measure modified on each square Q^j_k by a suitable factor; the two measures we are considering are therefore ρ and its modification on the squares.
The reason why these measures should be similar is that N^j_k is very close to its expectation. To prove it, we proceed in two steps and use the triangle inequality between the two measures involved and a third measure, namely the one modified only on the horizontal rectangles R_k, where N_k is the number of points X_i in the rectangle R_k.
As for the distance between the measure modified on the squares and the one modified on the rectangles, first we use the convexity of the Wasserstein distance to restrict the problem to the rectangles. Then we argue as in Lemma 3.1: when using the Benamou–Brenier formula there is a vanishing density in the denominator, and this causes a divergent term. But this divergent term is completely balanced by the fluctuations of the particles, which are small where the particles are few. Then we use again Talagrand's inequality to bound the distance between the measure modified on the rectangles and ρ.
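The concentration of the counts around their means, which drives these estimates, is the usual binomial one; a small stdlib illustration (all names ours):

```python
import math
import random

# The number of the N points falling in a fixed region of probability p
# is Binomial(N, p): its fluctuation around the mean N*p has standard
# deviation sqrt(N*p*(1-p)), i.e. relative size ~ 1/sqrt(N*p).
rng = random.Random(2)
N, p, trials = 10_000, 0.3, 200
counts = [sum(1 for _ in range(N) if rng.random() < p) for _ in range(trials)]
mean = sum(counts) / trials
rel_spread = math.sqrt(sum((c - mean) ** 2 for c in counts) / trials) / (N * p)
# rel_spread is on the order of sqrt((1 - p) / (N * p)) ~ 0.015 here.
```
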
Proposition 4.1. There exists a constant c > 0 such that, if X̃_1, . . ., X̃_N and Ỹ_1, . . ., Ỹ_N are independent random variables with common distribution ρ_N, then (4.3), (4.4) and (4.5) hold.

Proof. We only prove (4.3) and (4.5); indeed (4.4) is analogous to (4.3). We start by proving (4.3); therefore we define N_k and P^j_k so that N_k is the number of particles X̃_i in the whole rectangle R_k, while P^j_k is the number of particles X̃_i in R_k, not in the whole rectangle, but rather only up to a_j.
As for (4.4), the only difference in the proof is that, where we have summands involving N^j_k, we will find the same terms involving M^j_k. We proceed in two steps. First, we focus on the distance between the density modified on all the squares Q^j_k and the one modified only on the rectangles R_k; then, we study the distance between the measure modified on the rectangles R_k and the Gaussian measure itself.
Using first the triangle inequality and then the convexity of the quadratic Wasserstein distance, we get the bound (4.6)–(4.7). As for the term in (4.6), we observe that we are considering product measures in the rectangle R_k whose marginals coincide in the direction x_2; therefore we just have a one-dimensional problem, and thanks to Lemma 6.1 we get a one-dimensional bound. To control it we argue as in Lemma 3.1: we take f to be the weak solution of the corresponding one-dimensional problem. Then, thanks to Theorem 2.2, we have (4.9), where in the last inequality we have used that, thanks to the choice of the points a_j, we have |a_{j+1}^2 − a_j^2| ≤ c. We are going to estimate these two terms using the fact that, where the density is small, there are also small fluctuations of the particles, so that there will be a balance between the fluctuations and the divergent terms. In particular, our aim is to show that all the terms in the last two sums perfectly balance, and only (a_{j+1} − a_j) remains.
Thus, to bound (4.9) and (4.10), we observe that we can condition on the number N_k. Before going on, we observe here that, when proving (4.4), to bound (4.9) and (4.10) at this point we should condition both on N_k and on M_k (and not only on N_k). In this way, instead of 1/N_k in (4.11) and (4.12), up to a constant we would obtain analogous terms, and these terms can be estimated (in A_θ, and up to multiplicative constants) in the same way. Combining (4.8) with (4.9) and (4.10), and using the concentration of N_k for the second part, the claim is proved. Applying (4.14) to (4.13) and using the properties of the conditioned expected value, we obtain the desired estimate. Now we have bounded the expectation of (4.6).
To estimate (4.7) we argue in the following way: thanks to Theorem 2.3 and using the value of ρ(E_N), we obtain the bound (4.15)–(4.16). Before concluding the proof, we can observe that for (4.4) at this point we would obtain an analogous term, but with a restriction to the set A_θ: we can bound the expectation computed on A_θ with the expectation computed everywhere, because this term is everywhere positive (indeed the function f(x) = x^2 − x is convex and we are considering a convex combination of the summands). Finally, combining (4.6) with (4.15) and (4.7) with (4.16), we obtain (4.3).
To prove (4.5), we observe the corresponding decomposition, and this implies the thesis thanks to (4.3) and (4.4).
With the following Lemma we prove that, thanks to the choice of the squares Q^j_k, we can transport the density restricted to a square into the uniform measure on it. This implies that the problem on the square Q^j_k is (approximately) a random matching problem in the square with the uniform measure, and thus solved in [5].
If Z_1, . . ., Z_{N^j_k} and W_1, . . ., W_{M^j_k} are independent random variables in Q^j_k with common distribution given by the normalized restriction to Q^j_k, then (4.17) and (4.18) hold.

Proof. To prove the statement, arguing as in [8], we can reduce to finding a suitable map. Let S_j : (a_j, a_{j+1}) → (a_j, a_{j+1}) be defined so that the map and its inverse switch the measures we are considering and fix the boundary of Q^j_k; a direct computation then concludes the proof.
The following Lemma allows us to restrict to a good event in the bound from below for bipartite matching. It only uses the Chernoff bound, as in [4].
Finally, this last Proposition collects all the contributions to the total cost given by each square. It makes rigorous the idea explained at the beginning of this Section.

Proposition 4.2. Let X_1, . . ., X_N be independent random variables with common distribution ρ_N. Then the stated bounds hold.

Proof. First, we focus on the estimate from above; therefore we choose 0 = α_0 < α_1 < ⋯ < α_L = 1 and we define the corresponding decomposition. Now we recognize in the right-hand side of this inequality a Riemann sum for the function f, and, since our choice of {α_l}_{l=0}^L was arbitrary in [0, 1], we obtain the lim sup bound. As for the estimate from below, we use the concavity of the function log x, applied to a suitable function.

Convergence Theorems
In this Subsection we prove Theorem 4.1. Thanks to Lemma 2.1, it is sufficient to prove the upper bound for semidiscrete matching, in Theorem 4.2, and the lower bound for bipartite matching, in Theorem 4.3. The structure of the proof is the same as that of Theorem 1 in [8] and Theorems 1.1 and 1.2 in [4].
Both in Theorem 4.2 and in Theorem 4.3, the first step consists in substituting the N independent random variables with common distribution ρ with N independent random variables with common distribution ρ_N. This is possible thanks to Lemma 4.1.
Then we have to bound the distance between two measures, one of which is ρ (in the semidiscrete case, the first one) or the empirical measure of N independent random variables with distribution ρ_N (in the bipartite case, the second one), while the other is the same measure as the first, multiplied in each square Q^j_k by a suitable factor. This factor is expected to be very close to one, and this is the reason why the two measures involved are similar. This is possible thanks to Proposition 4.1. At this point, we are allowed to compute the total cost of the problem as the sum of the costs on the squares. When proving the upper bound we use a subadditivity argument, while for the lower bound we use, as in [4], the distance introduced in [19], that is (2.1). This distance is superadditive.
Then, since we need that in each square Q^j_k there is an increasing (with N) number of particles, in Theorem 4.2 we consider separately two events: the event in which the number of particles in the square is close to its expected value, and its complement, and we show that the contribution of the second event is negligible. In Theorem 4.3 we simply restrict to a good event (that is A_θ) and, thanks to Lemma 4.3, we are sure that its probability is close to 1.
Having made these assumptions, thanks to Lemma 4.2 we can approximate the probability measure on the square Q^j_k with the uniform measure on the square itself. Therefore, using the results obtained in [4] and [5], except for a factor 4π or 2π the cost with the uniform measure (on the square Q^j_k) is bounded from above and below by a term close to the uniform-square cost. Finally, the total cost is a convex combination of all these contributions, from which the main term in the estimate is identified.

Proof. As explained before, first we substitute X_1, . . ., X_N with N independent random variables distributed with the probability measure whose density is ρ_N. If γ > 0 and if T : R^2 → R^2 is the map that transports ρ onto ρ_N, we denote X̃_i := T(X_i). Using the triangle inequality we get (4.19). The first term in the sum in (4.19) gives the main contribution, while we have a bound for the second and the third ones thanks to (4.2) of Lemma 4.1 and (4.3) of Proposition 4.1. So we only focus on the first term.
To estimate it, first we exclude the events with few particles in any square Q^j_k; therefore we define the events A^j_k and we observe that the contributions on the events (A^j_k)^c are negligible. Then we use (4.17) of Lemma 4.2: if Z_i := S(X̃_i), where S is the map that transports the measure on Q^j_k into the uniform measure on the square, we conclude the estimate.

Proof. (Sketch) Except for some steps, the proof is very similar to the previous one; therefore we only underline the differences. Once the substitution of {X_i, Y_i}_{i=1}^N with {X̃_i, Ỹ_i}_{i=1}^N is made, we restrict to the set A_θ, where θ = 1/(log N)^ξ and 0 < ξ < (α − 1)/2 < 1/2, and then we rename the relevant quantities and proceed as above.

To make this rigorous in the Maxwellian case, we substitute again ρ with a density (that we will again call ρ_N) whose support is compact but increases with N. To define ρ_N, first we define r_N, slightly smaller than the natural cutoff, with a correction of order (log N)^{−α}, 1 < α < 2, and then r̄_N := (⌊m_{⌊r_N⌋} r_N⌋ + 1)/m_{⌊r_N⌋}. We define r̄_N in this way because, when multiplying by m_{⌊r_N⌋}, we obtain an integer number, and therefore the set can easily be covered by an integer number of squares. Using this partition, the arguments are analogous to the Gaussian case, and in this way we can prove Theorem 5.1.

Appendix
Lemma 6.1. Let μ and λ be two probability measures on R, absolutely continuous with respect to the Lebesgue measure, and let ν be any probability measure on R. Then the stated bound holds.

Proof. If S : R → R is the optimal map that transports μ onto λ, i.e. the monotone rearrangement, the claim follows.

Figure 4.1: The set of squares Q^j_k where the cut-off is applied.

Figure 4.2: The set of rectangles where the cut-off is applied, except for zero-measure sets.

Figure 4.3: A graphical representation of the proof. N^j_k is the number of particles in Q^j_k (the first square, in blue), while P^j_k is the number of particles in the red rectangle. Once N_k, that is the number of particles in R_k, is fixed, the fluctuations of the particles in the red rectangle are exactly the fluctuations of the particles in the blue one.

Figure 5.1: On the left: the squares {Q^j_k}_{j,k}. On the right: the horizontal rectangles {R_k}_k.