Multi-marginal Entropy-Transport with repulsive cost

In this paper we study theoretical properties of the entropy-transport functional with repulsive cost functions. We provide sufficient conditions for the existence of a minimizer in a class of metric spaces and prove the $\Gamma$-convergence of the entropy-transport functional to a multi-marginal optimal transport problem with a repulsive cost. We also prove the entropy-regularized version of the Kantorovich duality.


Introduction
We consider the following multi-marginal entropy-transport problem
$$I_\varepsilon[\rho] = \inf_{\gamma \in \Pi^{sym}_N(\rho)} \big\{ C_0[\gamma] + \varepsilon E[\gamma] \big\}, \tag{1.1}$$
where $C_0[\gamma] = \int_{X^N} c \, d\gamma$ is the transportation cost related to a cost function $c$, $E[\gamma]$ is the entropy, and $\varepsilon \ge 0$ is a parameter; see Section 2 for details. We consider the setting where $(X, d, m)$ is a Polish measure space and $\rho m \in \mathcal{P}^{ac}(X)$ is a probability measure that is absolutely continuous with respect to the reference measure $m$. An element $\gamma \in \Pi^{sym}_N(\rho)$ is called a symmetric coupling (or transport plan), that is, a symmetric probability measure on $X^N$ having all marginals equal to $\rho m$.
We are interested in repulsive cost functions of the form $c(x_1, \dots, x_N) = \sum_{1 \le i < j \le N} f(d(x_i, x_j))$, where we assume $f : ]0, \infty[ \to \mathbb{R}$ to be a continuous and decreasing function such that $f(d(x_i, x_j)) \to +\infty$ as $d(x_i, x_j) \to 0$. Among the examples of such cost functions we have the Coulomb cost $f(z) = 1/|z|$, the Riesz cost $f(z) = 1/|z|^s$ with $n \ge s \ge \max\{n - 2, 0\}$ (in $\mathbb{R}^n$), and the logarithmic cost $f(z) = -\log(|z|)$. We observe that when $\varepsilon = 0$, this entropy-transport problem reduces to the classical multi-marginal optimal transport problem with repulsive costs [5,7,8,12]. The motivation of this paper comes from both theory and numerics. For repulsive cost functions, the entropy term in (1.1) plays the role of a regularizer for computing numerically a solution $\gamma$ of the multi-marginal optimal transport problem $I_0[\rho]$, see [2]. Numerical experiments suggest that when the regularization parameter $\varepsilon$ goes to $0$, the minimizer $\gamma_\varepsilon$ converges to a minimizer of $I_0[\rho]$ having minimal entropy among the minimizers of $I_0[\rho]$.
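To make the cost profiles above concrete, the following minimal sketch evaluates a pairwise repulsive cost for points in Euclidean space. It assumes the pairwise-sum form $c(x_1, \dots, x_N) = \sum_{i<j} f(|x_i - x_j|)$; the function names `coulomb`, `riesz`, `log_cost` and `repulsive_cost` are ours and purely illustrative.

```python
import itertools
import math


def coulomb(z):
    """Coulomb profile f(z) = 1/z, extended by f(0) = +inf."""
    return math.inf if z == 0 else 1.0 / z


def riesz(z, s=1.0):
    """Riesz profile f(z) = 1/z**s, extended by f(0) = +inf."""
    return math.inf if z == 0 else 1.0 / z**s


def log_cost(z):
    """Logarithmic profile f(z) = -log(z), extended by f(0) = +inf."""
    return math.inf if z == 0 else -math.log(z)


def repulsive_cost(points, f=coulomb):
    """Sum of f(|x_i - x_j|) over all pairs i < j.

    `points` is a sequence of N points in R^n, each given as a tuple.
    """
    return sum(f(math.dist(x, y)) for x, y in itertools.combinations(points, 2))


# Three points on the real line: pairwise distances are 1, 3 and 2.
c_three = repulsive_cost([(0.0,), (1.0,), (3.0,)])
# Two coinciding points: the cost blows up on the diagonal.
c_diag = repulsive_cost([(0.0, 0.0), (0.0, 0.0)])
```

All three profiles are decreasing in the distance and become infinite on the diagonal, which is the feature that distinguishes the repulsive setting from the attractive (distance squared) one.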
From a theoretical viewpoint, this type of a functional has direct relevance in Density Functional Theory. By choosing carefully the parameter ε, the functional (1.1) provides a lower bound for the Hohenberg-Kohn functional in Density Functional Theory [23,26]. This is an immediate consequence of the Log-Sobolev Inequality.
The entropy-transport problem has appeared previously in the literature in the attractive case, in particular when $c(x_1, x_2) = d(x_1, x_2)^2$. Below we briefly mention some of the connections of the entropy-transport problem with other fields and point out its relevance in the Coulomb case.
Brief comments on some applications of the entropy-transport
Optimal Transport and Sinkhorn algorithm: The entropy-transport problem (1.1) was introduced by M. Cuturi [9] in order to compute numerically the optimal transport plan for the squared-distance cost in the 2-marginals case via the Sinkhorn algorithm. Due to its reasonable computational cost, it has been applied to a wide range of problems in various research areas, including Information Theory, Computer Graphics, Statistical Inference, Machine Learning, and Mean-Field Games. The entropic regularization method was also considered in the (attractive) multi-marginal case in the so-called barycenter problem introduced by M. Agueh and G. Carlier [1] (see also [6]) and in numerical methods for the time discretization of Brenier's relaxed formulation of the incompressible Euler equation [3]. For a thorough presentation of the computational aspects we refer to M. Cuturi and G. Peyré's book [24].
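For the reader's convenience, here is a minimal sketch of the Sinkhorn iteration in the 2-marginals case, written in plain NumPy rather than with any particular library. The grid, the value of the regularization parameter and the number of iterations are arbitrary choices of ours; as a test case we use the Coulomb cost on the real line with a discretized Gaussian marginal of standard deviation 5, in the spirit of the numerical example discussed later in this introduction.

```python
import numpy as np


def sinkhorn(cost, mu, nu, eps, n_iter=2000):
    """Entropic optimal transport between discrete marginals mu and nu.

    Returns the plan gamma = diag(a) K diag(b) with K = exp(-cost / eps).
    Entries with infinite cost get K = 0, so the plan never charges them.
    """
    K = np.exp(-cost / eps)
    a = np.ones_like(mu)
    b = np.ones_like(nu)
    for _ in range(n_iter):
        a = mu / (K @ b)      # match the first marginal
        b = nu / (K.T @ a)    # match the second marginal
    return a[:, None] * K * b[None, :]


# Discretized Gaussian marginal (zero mean, sigma = 5) on a uniform grid.
x = np.linspace(-15.0, 15.0, 61)
rho = np.exp(-x**2 / (2 * 5.0**2))
rho /= rho.sum()

# Coulomb cost 1/|x - y|, set to +inf on the diagonal.
dist = np.abs(x[:, None] - x[None, :])
cost = np.full_like(dist, np.inf)
np.divide(1.0, dist, out=cost, where=dist > 0)

gamma = sinkhorn(cost, rho, rho, eps=0.5)
```

Because the kernel vanishes on the diagonal, the computed plan puts no mass there. Decreasing `eps` concentrates the plan further, at the price of slower convergence and possible underflow in `K`; a log-domain implementation would be the usual remedy.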
Second-order Calculus on RCD spaces: N. Gigli and L. Tamanini [16] studied the entropic-transport problem on a class of metric spaces with (Riemannian) Ricci curvature bounded from below (2-marginals case, $c(x_1, x_2) = d(x_1, x_2)^2$). The entropic regularization procedure was crucial for establishing a second-order differential structure in that setting.
Schrödinger Problem: In 1926, E. Schrödinger introduced the (linear) Schrödinger equations describing the non-relativistic evolution of a single particle in an electric field with potential energy and also established an equivalence between such equations and a system of diffusion equations [25]. Roughly speaking, the variational problem (see (1.1) with $X = C([0, 1], \mathbb{R}^d)$ and $N = 2$) arises in Schrödinger's manuscript while studying the limit $k \to \infty$ ($N = 2$) of the empirical measures associated to the evolution of $k$ i.i.d. Brownian motions. We refer the reader to C. Léonard's survey [20] for technical details and historical notes.
Lower bound on the Hohenberg-Kohn functional in Density Functional Theory: This is the particular case where the entropy-transport problem with Coulomb cost comes into play. It has been shown in [23,26] that the functional (1.1) provides a lower bound for computing the ground state energy of the Hohenberg-Kohn functional [5,7,8,12,21]. Below we give a brief description of the result. Notice that in this context X = R d and m is the Lebesgue measure on R d .
Assume that $\gamma \in \Pi_N(\rho)$ is such that $\sqrt{\gamma} \in H^1(\mathbb{R}^{dN})$. This is the case, for example, when is a ground-state wave function solving the $N$-electron Schrödinger Equation (see [7,8,12,26] for details). Then, we can define the Hohenberg-Kohn functional by
Now, as a consequence of the logarithmic Sobolev inequality for the Lebesgue measure [17], the following result holds: if $\rho \mathcal{L}^d \in \mathcal{P}(\mathbb{R}^d)$ and $\sqrt{\gamma} \in H^1(\mathbb{R}^{dN})$ then
Example of optimal entropy couplings. Let us present some computational examples of minimizers of $I_\varepsilon[\rho]$ illustrating the role of the parameter $\varepsilon$. Before this, we recall a result on the characterization of minimizers in the one-dimensional case [10]. In particular, according to it the minimizer of $I_0[\rho]$ is concentrated on finitely many graphs and is thus singular with respect to the product reference measure.
Theorem 1.1 ([10]). Let $\mu \in \mathcal{P}(\mathbb{R})$ be an absolutely continuous probability measure and $f : \mathbb{R} \to \mathbb{R}$ a strictly convex, bounded from below and non-increasing function. Then there exists a unique optimal symmetric plan $\gamma \in \Gamma^{sym}(\mu)$ that solves
Moreover, this plan is induced by an optimal cyclical map $T$, that is, $\gamma^{sym} = (\gamma_T)^S$, where $\gamma_T = (\mathrm{id}, T, T^{(2)}, \dots, T^{(N-1)})_\# \mu$. An explicit optimal cyclical map is
Here $F_\mu(x) = \mu(-\infty, x]$ is the distribution function of $\mu$, and $F_\mu^{-1}$ is its lower semicontinuous left inverse.
One-dimensional Entropic-Transport with Coulomb cost and a Gaussian measure. Let $\rho$ be the normal distribution on the real line with zero mean and standard deviation $\sigma = 5$. We compute numerically the solution of the entropic-transport problem with Coulomb cost on the real line using the Sinkhorn algorithm [9]. Notice that by Theorem 1.1, we know that the minimizer of $I_0[\rho]$ is concentrated on a graph. See Figure 1 for an illustration of the computational results. Our code is based on the Python implementation available in the POT library [14].
Organization of the paper. In Section 2 we introduce the setting and study sufficient conditions for the existence of minimizers for the entropy-transport problem (1.1). Section 3 is devoted to the proof of the Γ-convergence of the entropic-transport functional $C_\varepsilon[\gamma]$ to the multi-marginal optimal transport functional with repulsive costs $C_0[\gamma]$. In Section 4, we study the Kantorovich duality for the entropic-transport problem.
Strategy of the main proof and some technical remarks. The main result of this paper is Theorem 3.1, in which we prove the Γ-convergence of the entropic-regularized functional $C_\varepsilon[\gamma]$ to $C_0[\gamma]$. The technical difficulty in dealing with the Γ-convergence comes from the fact that, while the entropic part $E[\gamma]$ favors minimizers that are as spread out as possible with respect to $m$, a minimizer of the cost $C_0[\gamma]$ can be very singular and have infinite entropy.
We divide the proof into two parts. Part (I), the lim inf inequality, follows essentially from the lower semicontinuity of the costs $C_0[\gamma]$ and $C_\varepsilon[\gamma]$, which is obtained from the assumption $\rho \log \rho \in L^1_m(X)$ on the marginal measure $\rho m$, giving a lower bound on the entropy. Part (II), the lim sup inequality, is more involved. We construct a block approximation $\gamma_n$ for a coupling $\gamma$ with $C_0[\gamma] < +\infty$. The construction is done in several steps, since we need a competitor $\gamma_n$ such that $E[\gamma_n] < \infty$ and $\gamma_n \in \Pi^{sym}_N(\rho)$. Both the main idea and the rigorous construction are given in Section 3.2.
Furthermore, we point out that our construction can deal with the case when the space $X$ is a domain in $\mathbb{R}^d$, answering a question raised in [3]. There the Γ-convergence was proven using convolutions, an approach that does not seem easy to implement for domains or for general metric spaces.
Related works: A proof of the Γ-convergence of (1.1) to the Monge-Kantorovich problem for c(x, y) = d(x, y) p first appeared in [19,22] via probabilistic methods. In [6], G. Carlier, V. Duval, G. Peyré and B. Schmitzer provided an alternative and more analytical proof carrying out a similar block approximation procedure for the two-marginal squared distance cost in the Euclidean space and the Wasserstein Barycenter.

The entropy-regularized repulsive costs
Let (X, d) be a Polish space and m be a reference measure on X. We denote by P(X) the set of Borel probability measures on X, and P ac (X) the set of Borel probability measures on X that are absolutely continuous with respect to m. We denote by m N the product measure m ⊗ m ⊗ · · · ⊗ m. This is the reference measure we use on the product space X N . On X N we use the sup-metric, which we denote by d N .
The class of cost functions $c : X^N \to \mathbb{R} \cup \{+\infty\}$ of our interest is given by functions of the form
$$c(x_1, \dots, x_N) = \sum_{1 \le i < j \le N} f(d(x_i, x_j)).$$
Above and from now on, we denote by $(x_1, \dots, x_N)$ points in $X^N$, so $x_i \in X$ for each $i$. We denote by
$$\Pi_N(\rho) = \big\{ \gamma \in \mathcal{P}(X^N) : (pr_i)_\# \gamma = \rho m \text{ for all } i \in \{1, \dots, N\} \big\}$$
the set of couplings or transport plans, where $pr_i$ is the projection $pr_i(x_1, \dots, x_N) = x_i$. For the definition of the set of symmetric couplings $\Pi^{sym}_N(\rho)$, see Definition 2.7. We define the functional $C_0[\gamma]$ to be the cost related to the coupling $\gamma$,
$$C_0[\gamma] = \int_{X^N} c \, d\gamma.$$
Given $\varepsilon \ge 0$, we denote by $C_\varepsilon[\gamma]$ the entropy-regularized cost
$$C_\varepsilon[\gamma] = C_0[\gamma] + \varepsilon E[\gamma], \qquad E[\gamma] = \begin{cases} \int_{X^N} \rho_\gamma \log \rho_\gamma \, dm^N & \text{if } \gamma \ll m^N, \\ +\infty & \text{otherwise.} \end{cases} \tag{2.2}$$
The notation $\rho_\gamma$ stands for the Radon-Nikodym derivative of $\gamma$ with respect to the reference measure $m^N$, and $\gamma \ll m^N$ means that $\gamma$ is absolutely continuous with respect to the reference measure $m^N$. Let $\rho m \in \mathcal{P}^{ac}(X)$. In this paper we are interested in the following infimum
In order to guarantee the lower semicontinuity of $C_\varepsilon$, we will assume $\rho \log \rho \in L^1_m(X)$. This will take care of the entropy part $E[\cdot]$ of the cost. In order to establish the lower semicontinuity of the functional $C_0[\cdot]$, we assume that the measure $\rho$ satisfies the following two conditions:
Above we have, by an abuse of notation, denoted the measure $\rho m$ by only the density $\rho$; we will use the same abbreviation in the rest of the paper if there is no risk of confusion. Condition (B) is analogous to requiring, in the case of the quadratic cost, that the marginal measures have finite second moments. Condition (A) guarantees that the cost is finite. If we endow the spaces $\mathcal{P}(X^N)$ and $\mathcal{P}(X)$ with the $w^*$-topology then, by Prokhorov's theorem, any subset of $\mathcal{P}(X)$ (or $\mathcal{P}(X^N)$) is tight if and only if it is relatively compact.
Remark 2.1 (Entropy-transport seen as a Kullback-Leibler divergence). If $\mu$ and $\nu$ are measures on a set $X$, the Kullback-Leibler divergence of $\mu$ with respect to $\nu$ is defined as
Now, if both measures $\mu$ and $\nu$ are absolutely continuous with respect to some reference measure $R$ of the space $X$ with densities $\rho_\mu$ and $\rho_\nu$, respectively, we can write:
Considering the entropy-regularized MOT problem, we see that the cost functional $C_\varepsilon[\gamma]$ can be alternatively written as the Kullback-Leibler divergence between $\gamma$ and a kernel $\kappa$ defined below
For the most part, in this paper we have chosen to consider as a reference measure the measure $m^N$. However, as the following lemma shows, we could also assume the reference measure to be $(\rho m)^{\otimes N}$, since the minimizers of the entropy-regularized MOT problem (2.3) do not depend on the choice of the reference measure, at least if there exists a minimizer with finite cost. To state the lemma, let us introduce the notation of relative entropy: for each reference measure $R$ of a Polish space $Y$, and for each $\gamma \in \mathcal{P}(Y)$, we denote by $E(\gamma|R)$ the relative entropy of $\gamma$ with respect to $R$, defined as
Now we may consider two, a priori different, entropy-regularized MOT problems: the one introduced in (2.3)
(2.4)
and the problem with the reference measure chosen to be $(\rho m)^{\otimes N}$
The following Lemma 2.2 is used only to go from the compact to the general case in the duality Theorem 4.2. The proof in [11] can be directly applied here to prove Lemma 2.2.
Lemma 2.2. Let $(X, d, m)$ be a Polish measure space, $\rho m \in \mathcal{P}(X)$ a measure satisfying (A) and (B), and $c$ a cost function satisfying (F1) and (F2). Then for all $\varepsilon > 0$ we have
Moreover, whenever at least one side of the equality above is finite, the problems (2.4) and (2.5) have the same minimizers.
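The two reformulations discussed in Remark 2.1 and Lemma 2.2 can be checked numerically on a finite space. The sketch below rests on assumptions of ours: the relative entropy is taken to be $E(\gamma|R) = \int \log(d\gamma/dR) \, d\gamma$, the kernel is taken to be $\kappa = e^{-c/\varepsilon} m^N$, we restrict to $N = 2$, and a bounded stand-in cost replaces the repulsive one so that all quantities are finite.

```python
import numpy as np


def rel_entropy(gamma, ref):
    """E(gamma | ref) = sum gamma * log(gamma / ref) over the support of gamma."""
    pos = gamma > 0
    return float(np.sum(gamma[pos] * np.log(gamma[pos] / ref[pos])))


rng = np.random.default_rng(0)
n, eps = 6, 0.3

# Reference measure m (not normalized) and density rho with sum(rho * m) = 1.
m = rng.random(n) + 0.5
rho = rng.random(n) + 0.1
rho /= np.sum(rho * m)
marg = rho * m  # the marginal measure rho m

# A coupling with both marginals equal to rho m:
# mixture of the product plan and a plan supported on the diagonal.
gamma = 0.5 * np.outer(marg, marg) + 0.5 * np.diag(marg)

# Bounded stand-in cost (assumption) so that every term below is finite.
x = np.arange(n, dtype=float)
cost = 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))

mN = np.outer(m, m)          # reference measure m x m
rhoN = np.outer(marg, marg)  # reference measure (rho m) x (rho m)

# (i) The regularized cost as eps times a relative entropy against kappa.
c_eps = float(np.sum(cost * gamma)) + eps * rel_entropy(gamma, mN)
kl_form = eps * rel_entropy(gamma, np.exp(-cost / eps) * mN)

# (ii) Changing the reference measure shifts the entropy by a constant
#      that depends only on rho: N * int rho log(rho) dm with N = 2.
shift = 2 * float(np.sum(marg * np.log(rho)))
gap = rel_entropy(gamma, mN) - rel_entropy(gamma, rhoN)
```

Since the shift in (ii) is the same for every coupling with marginals $\rho m$, the two problems differ by an additive constant, which is consistent with the two problems having the same minimizers.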
2.1. Some properties of the entropy functional. Let us start by noting that the minimum of the entropy is attained by the product measure and that its value is not −∞.
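Before turning to the precise statement, the minimality of the product measure can be observed on a finite space. The following sketch is only an illustration under assumptions of ours: $N = 2$, the reference measure $m$ is the counting measure, and the competitors are mixtures of the product plan with the diagonal plan, which all share the marginal $\rho$.

```python
import numpy as np


def entropy(gamma):
    """E[gamma] = sum gamma * log(gamma), with the convention 0 log 0 = 0."""
    pos = gamma > 0
    return float(np.sum(gamma[pos] * np.log(gamma[pos])))


rng = np.random.default_rng(1)
rho = rng.random(5)
rho /= rho.sum()

# Product plan and three other couplings with the same marginals.
product = np.outer(rho, rho)
couplings = [(1 - t) * product + t * np.diag(rho) for t in (0.25, 0.5, 1.0)]

e_product = entropy(product)
e_lower = 2 * float(np.sum(rho * np.log(rho)))  # N * sum rho log rho, N = 2
e_others = [entropy(g) for g in couplings]
```

Here `e_product` coincides with `e_lower`, which is finite, and every entry of `e_others` is strictly larger.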
Proposition 2.3. Let $(X, d, m)$ be a Polish metric measure space, and let $\rho m \in \mathcal{P}^{ac}(X)$ with $\rho \log \rho \in L^1_m(X)$. Then
Proof. As we will see, the minimality is an immediate consequence of Jensen's inequality. Let $\gamma \in \Pi(\rho)$. Then
Using Proposition 2.3 we immediately get the lower semicontinuity of the entropy functional by representing the entropy as relative entropy against the probability measure $\otimes_{i=1}^N (\rho m)$. See for instance [27, Lemma 4.1] for the lower semicontinuity of the entropy when the reference measure is finite.
Corollary 2.4. The entropy functional $E$ is lower semicontinuous in the set $\Pi^{sym}_N(\rho)$.
Now we are ready to prove the existence of minimizers for the entropy-regularized MOT problem (Proposition 2.5).
Proof. We notice that the set $\Pi^{sym}_N(\rho)$ is compact in the $w^*$-topology [18]. The functional $E$ is lower semicontinuous by Corollary 2.4, and in our setting the lower semicontinuity of $C_0$ is proven as a part of the proof of [15, Proposition 3.1]. Since for each $\varepsilon \ge 0$ the functional $C_\varepsilon$ is convex, we conclude that it has a minimizer in the set $\Pi^{sym}_N(\rho)$.
2.2. Some properties of the coupling cost $C_0[\gamma]$. Notice that in this section $\Pi_N(\rho)$ denotes the set of couplings in $X^N$ (not necessarily symmetric). Moreover, we need to assume extra hypotheses on the probability measure $\rho$ in order to guarantee that $C_0[\gamma]$ is bounded from below for $\gamma \in \Pi_N(\rho)$ (e.g. when $f(z) = -\log(|z|)$).
The next theorem from [4] (see also [15,Theorem 3.2]) states that for measures ρ satisfying the assumptions (A) and (B) there exists α > 0 for which the support of any optimal plan is concentrated away from the set Let us fix 0 < β < 1 such that Then, we have for all Next we observe that one can restrict the problem min γ∈Π(ρ) C 0 [γ] to the class of symmetric couplings in X N having all the marginals equal to ρ.
for all permutations $\sigma$ of the $N$ symbols $(x_1, \dots, x_N)$. We denote by $\mathcal{P}^{sym}(X^N)$ the set of symmetric probability measures on $X^N$, and set $\Pi^{sym}_N(\rho) := \Pi_N(\rho) \cap \mathcal{P}^{sym}(X^N)$. Let us also introduce the notation for symmetrized measures. If $\gamma$ is a Borel measure on $X^N$, we denote by $\gamma^S$ the symmetrized measure
$$\gamma^S = \frac{1}{N!} \sum_{\sigma \in S_N} \sigma_\# \gamma,$$
where $S_N$ is the set of permutations of the $\{1, \dots, N\}$ coordinates $(x_1, \dots, x_N)$. The following result follows immediately.
Proposition 2.8. Under the hypotheses of Proposition 2.6, we have that

The Γ-convergence of Entropic-regularized cost
Now let us turn to the Γ-convergence. From now on, $(\tau_n)_{n \in \mathbb{N}}$ is any sequence of positive real numbers decreasing to zero. Let us introduce the following functionals: for each $n \in \mathbb{N}$
The goal of this section is to prove that the sequence $(C_n)_{n \in \mathbb{N}}$ Γ-converges to $C$ in the space $\mathcal{P}^{sym}(X^N)$.
The proof of Theorem 3.1 is divided into two parts. The proof of the first part, the liminf-inequality (I), is short and is established in the next subsection. The remainder of this section is then divided into subsections in which the second part, the limsup-inequality (II), is proven.
3.1. Proof of condition (I). We fix a sequence $(\gamma_n)_{n \in \mathbb{N}}$ that converges to $\gamma$. If $\gamma \notin \Pi_N(\rho)$, then since the set $\Pi_N(\rho)$ is compact, for large indices we also have $\gamma_n \notin \Pi_N(\rho)$, so both sides of inequality (I) are $+\infty$, and we are done. Hence we may assume that $\gamma$ and the $\gamma_n$ are elements of the set $\Pi_N(\rho)$. In that case, the claim (I) follows from the lower semicontinuity of $\gamma \mapsto \int c \, d\gamma$ and from the entropy lower bound shown in Proposition 2.3.

3.2. Constructing an approximation of the coupling γ. First of all, we need to construct an approximation of $\gamma$ only in the case where $C_0[\gamma] < \infty$: if this is not the case then any sequence $(\gamma_n)$ converging to $\gamma$ can be used to prove Condition (II). The idea of the construction is to redefine a large part of $\gamma$ to be a product measure on finitely many Borel sets with small diameter. In order not to increase the cost by too much, the Borel sets we are using have to be far away from the diagonal compared to the diameter of the sets. We call the part of the measure defined in this way the core part of the approximation. For the rest of the measure, we take another finite combination of product measures. However, this time the sets do not need to have small (or even bounded) diameter, but just small measure. This part will be called the remainder part of the approximation.
We start the construction by taking out a small part of $\gamma$ that will later be used to deal with the remainder part of the approximation. For this we take a sequence of radii defined as $r_n = 1/n$. Since $C[\gamma] < \infty$, there exists a point $x = (x_1, \dots, x_N) \in \operatorname{spt}(\gamma)$ with
Moreover, since $\gamma \in \Pi_N(\rho)$ and $\rho$ satisfies (A), we have $\gamma(\{(y_1, \dots, y_N) \in X^N \mid y_i = x_j \text{ for all } i, j\})$
Thus, using again $C[\gamma] < \infty$, there exists another point $x' = (x_{N+1}, \dots, x_{2N}) \in \operatorname{spt}(\gamma)$, so that
From now on, we consider $x, x'$ fixed. Therefore, for $n \in \mathbb{N}$ sufficiently large we have
Let us now define
Observe that $\gamma_{B_n}$ and $\gamma_{B'_n}$ are symmetric probability measures. Since the marginals of a symmetric measure are the same, we may denote by $\rho_{B_n}$ the marginal of $\gamma_{B_n}$ and similarly by $\rho_{B'_n}$ the marginal of $\gamma_{B'_n}$. Let us further denote $\tilde{B}_n := \operatorname{spt} \gamma_{B_n}$, $\tilde{B}'_n := \operatorname{spt} \gamma_{B'_n}$ and
We then define a measure
The idea behind the measure $\gamma_{0,n}$ is that we have chopped off a small part of the measure around the points $x$ and $x'$ (symmetrically) for later use. Since we are working with a singular cost, we still need to take out a small neighbourhood of the diagonals before approximating by product measures. We do this now. We fix a compact $K_n \subset X$ such that
and take a small enough $\delta_n \in (0, r_n)$ so that
$$\gamma_{0,n}(D_{\delta_n}) < \frac{\varepsilon_n}{2}. \tag{3.4}$$
Using $K_n$ and $\delta_n$ we then define
$$\gamma_{1,n} := \gamma_{0,n}|_{K_n^N \setminus D_{\delta_n}}. \tag{3.5}$$
The measure $\gamma_{1,n}$ is now the core part of the measure that we approximate. We denote by $\rho_{1,n}$ the marginals of the symmetric measure $\gamma_{1,n}$. Let us then approximate the measure $\gamma_{1,n}$. We take $\lambda_n \in (0, \delta_n/n)$ so that
$$|f(r) - f(s)| < \varepsilon_n \quad \text{for all } r, s \in [\delta_n/2, 2 \operatorname{diam}(K_n)] \text{ with } |r - s| \le 2\lambda_n. \tag{3.6}$$
Such $\lambda_n$ exists by the uniform continuity of $f$ on the compact set $[\delta_n/2, 2 \operatorname{diam}(K_n)]$. Since the set $K_n$ is compact, we may fix a finite Borel partition $\{B^i_n\}_{i=1}^{M_n}$ of the set $\operatorname{spt}(\rho_{1,n})$ such that $\operatorname{diam}(B^i_n) < \lambda_n$ and $0 < \rho_{1,n}(B^i_n) < \varepsilon_n$ for all $i \in \{1, \dots, M_n\}$. We are now ready to define the core part approximants $\gamma^a_{1,n}$ as
Now let us handle the main part of the remainder of the measure, namely the measure $\gamma_{2,n} := \gamma_{0,n}|_{D_{\delta_n} \cup (X^N \setminus K_n^N)}$. Because $\gamma_{0,n}$ and the set to which we restrict it are symmetric, so is $\gamma_{2,n}$. We may thus denote its marginals by $\rho_{2,n}$.
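The block approximation of the core part can be illustrated on a finite grid in the 2-marginals case. The sketch below is a discrete analogue under assumptions of ours and not necessarily the exact formula used for $\gamma^a_{1,n}$: on each product block $B_i \times B_j$ the plan is replaced by the product of the normalized restrictions of its marginals, keeping the mass of the block unchanged. The names `block_approximation` and `labels` are hypothetical.

```python
import numpy as np


def block_approximation(gamma, labels):
    """Replace gamma on each block B_i x B_j by a product measure.

    gamma  : (n, n) array, a coupling on a finite grid.
    labels : (n,) integer array, labels[x] is the index of the block containing x.
    The block masses gamma(B_i x B_j) and both marginals are preserved.
    """
    rho1 = gamma.sum(axis=1)
    rho2 = gamma.sum(axis=0)
    n_blocks = int(labels.max()) + 1
    P = np.eye(n_blocks)[labels]          # P[x, i] = 1 if x lies in B_i
    block_mass = P.T @ gamma @ P          # gamma(B_i x B_j)
    m1 = (P.T @ rho1)[labels]             # rho1(B_i) for the block of each x
    m2 = (P.T @ rho2)[labels]
    # Normalized restrictions of the marginals (zero on blocks without mass).
    w1 = np.divide(rho1, m1, out=np.zeros_like(rho1), where=m1 > 0)
    w2 = np.divide(rho2, m2, out=np.zeros_like(rho2), where=m2 > 0)
    return w1[:, None] * block_mass[np.ix_(labels, labels)] * w2[None, :]


# A singular plan concentrated on the graph of T(x) = x + 4 (mod 8),
# which stays away from the diagonal, with uniform marginals.
n = 8
gamma = np.zeros((n, n))
for i in range(n):
    gamma[i, (i + 4) % n] = 1.0 / n

labels = np.arange(n) // 2                # four blocks of two points each
approx = block_approximation(gamma, labels)
```

In this toy example the plan `gamma` charges 8 grid points, while `approx` spreads the same block masses over 16 points with the same marginals; in the continuous setting this spreading is what makes the entropy of the approximation finite.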
In order to determine which part of the remaining marginal measure should be coupled where, we define a partition {A i,n } N i=1 of the space X by setting, for all i ∈ {1, . . . , N − 1} A i,n := {y ∈ X | d(x i , y) ≤ rn 2 } , and Condition (3.1) guarantees that the sets A i,n are pairwise disjoint. Now we approximate γ 2,n by the measure where for all i the measure η n,i is the product ρ Bn (B(x k , r n /10)) .
By the definition of the sets $A_{i,n}$, for every $(y_1, \dots, y_N) \in \operatorname{spt}(\gamma^a_{2,n})$ we have for each $i \neq j$
where we have assumed (which we can do without loss of generality) that $y_j \in B(x_j, r_n/10)$. What we have done using the measure $\gamma^a_{2,n}$ is that we have coupled the marginals of the measure $\gamma_{2,n}$ with some suitable parts of the marginals of the reserved measure that was taken out around the point $x$. In this way we have used the marginals of this reserved part unevenly. To handle the rest of the reserved part of the measure around the point $x$, we now use the reserved measure around the point $x'$. So, we need to redefine the coupling for the part of the marginal given by
$$\rho_{3,n} := (pr_1)_\# \Big( \frac{\varepsilon_n}{\gamma(B_n)} \gamma|_{B_n} \Big) + \rho_{2,n} - (pr_1)_\# \gamma^a_{2,n}.$$
We define it as where each φ n,i is defined as ρ Bn | B(x k ,rn/10) ρ Bn (B(x k , r n /10)) .
What remains is the part of the measure around $x'$ that was not used for $\gamma^a_{3,n}$. Since $\gamma^a_{3,n}$ used the marginals from this part of the reserved measure evenly, we may simply couple the rest by a measure with $b$ being the correct scaling constant. Similarly as for the previous remainder part, we have that for every $(y_1, \dots, y_N) \in \operatorname{spt}(\gamma^a_{4,n})$ and each $i \neq j$ the inequality (3.9) holds. Now we are ready to define the full approximation as $\gamma_n = \gamma^a_{1,n} + \gamma^a_{2,n} + \gamma^a_{3,n} + \gamma^a_{4,n}$. By construction $\gamma_n \in \Pi^{sym}(\rho)$.
3.3. Narrow convergence of the approximations. Let us now prove that the sequence (γ n ) n narrowly converges to γ. We could argue this by using the Wasserstein distance. However, let us do it here directly using the definition of narrow convergence.
Proof. Let $\varphi \in C_b(X^N)$ and $\varepsilon > 0$. We need an index $N_0 \in \mathbb{N}$ such that (3.10) Let us denote $M := \sup_{x \in X^N} |\varphi(x)|$; we may assume that $M > 0$. Since $\rho$ is inner regular, we can fix a compact set $K \subset X$ such that The function $\varphi$, when restricted to $K^N$, is uniformly continuous. Hence there exists $\delta > 0$ so that
$$|\varphi(x) - \varphi(y)| < \frac{\varepsilon}{12} \quad \text{for all } x, y \in K^N \text{ for which } d_N(x, y) < \delta. \tag{3.11}$$
Now let $N_0 \in \mathbb{N}$ be so large that $\sqrt{N} \lambda_n < \delta$ and $6M\varepsilon_n < \frac{\varepsilon}{6}$ for all $n \ge N_0$.
Let us show that this choice of $N_0$ fulfills Condition (3.10). First we note that for all $n \ge N_0$ we have where in the last inequality we have used the following facts: $\gamma(X^N) - \gamma_{1,n}(X^N) < 3\varepsilon_n$ for all $n$, and for the remainder part of the measure $\gamma_n$ we have $(\gamma^a_{2,n} + \gamma^a_{3,n} + \gamma^a_{4,n})(X^N) < 3\varepsilon_n$ for all $n \in \mathbb{N}$. It remains to show that for all $n \ge N_0$ we have We first estimate the integrals in the set $K^N$. Let us fix, for each $(k_1, \dots, k_N) \in M_n^N$ for which the set $(B^{k_1}_n \times \cdots \times B^{k_N}_n) \cap K^N$ is nonempty, an element $z_{k_1, \dots, k_N} \in (B^{k_1}_n \times \cdots \times B^{k_N}_n) \cap K^N$. Now we have, for a fixed $(k_1, \dots, k_N)$, denoting for simplicity where in a) we have used Condition (3.11), and in b) the fact that the measures $\gamma$ and $\gamma^a$ coincide on 'cubes' $Q$. Summing the estimate above over all cubes (3.14) In inequality a) we have used the fact that $\rho(X \setminus K) < \frac{\varepsilon}{12MN}$ and, since the marginals of $\gamma_{1,n}$ and $\gamma^a_{1,n}$ are restrictions of $\rho$, we can bound both $\gamma_{1,n}(X^N \setminus K^N)$ and $\gamma^a_{1,n}(X^N \setminus K^N)$ by $\frac{\varepsilon}{12M}$. For the same reason, we have Combining estimates (3.14) and (3.15) gives proving Condition (3.13).

3.4. Convergence of the cost functional. In order to prove the Γ-limsup inequality (II), we need the cost $C_0[\cdot]$ to converge along the approximating sequence $\gamma_n$. We prove this in the following lemma.
Proof. Let us first consider the remainder part. Recall that for all n ∈ N we have Thus, using the lower bounds (3.8) and (3.9) for distances in the support of the remainder part, and the definition (3.2) of ε n , we get as n → ∞. By continuity of the integral, we get as n → ∞.
Let us now estimate the core part of the approximation. By the construction (3.7) of γ a 1,n and the choice (3.6) of λ n , we have Combining the above estimate with (3.16), (3.17) and (3.18) we get as n → ∞.

3.5. Finiteness of the entropy for the approximations. Next we show that the entropy is finite for the approximating sequence. Notice that, in order to prove (II), we do not need a better estimate on the entropy.
Proof. In order to see the finiteness of the entropy, it suffices to notice that each $\gamma_n$ is a sum of finitely many measures $(\gamma_{n,k})_{k=1}^{N_n}$, each of which is of the form $\gamma_{n,k} = \rho^k_1 m \otimes \cdots \otimes \rho^k_N m$ with $\rho^k_i \ll \rho$ and $\frac{d\rho^k_i}{d\rho} \le 1$. Indeed, by Proposition 2.3, the entropy is always bounded from below, and so we can make a crude estimate:
3.6. Proof of condition (II). We are now ready to prove the Γ-lim sup inequality (II). By Lemma 3.2 we already know that $(\gamma_n)_n$ converges to $\gamma$. However, $C_n[\gamma_n]$ need not converge to $C[\gamma]$. This can be solved by slowing down the convergence of $(\gamma_n)_n$: we repeat each measure sufficiently (but finitely) many times before moving to the next one. We define $k(n)$ for every $n \in \mathbb{N}$ as
$$k(n) = \min\Big\{ n, \max\Big\{ 1, \sup\big\{ k \in \mathbb{N} \,\big|\, \sqrt{\tau_n}\, E[\gamma_j] < 1 \text{ for all } j \le k \big\} \Big\} \Big\}.$$
By definition, $1 \le k(n) \le n$. Moreover, since for every $j \in \mathbb{N}$ we have $E[\gamma_j] < \infty$ by Lemma 3.4 and $\tau_n \to 0$ by definition, we have that $k(n) \to \infty$ as $n \to \infty$. Thus, defining $\tilde{\gamma}_n = \gamma_{k(n)}$, for large enough $n \in \mathbb{N}$ we have Recalling that by Lemma 3.3 we have $C_0[\gamma_{k(n)}] \to C_0[\gamma]$, we conclude the proof.
In Proposition 2.5 the existence of a minimizer for the entropy-regularized cost was established. Now that we know that measures $\gamma$ for which $C_0(\gamma) < \infty$ can be approximated by measures with not only finite costs but also finite entropy, we can say more:
Corollary 3.5. Let $(X, d, m)$ be a Polish metric measure space. Assume that $\rho m \in \mathcal{P}^{ac}(X)$ satisfies $\rho \log \rho \in L^1_m(X)$ and Conditions (A) and (B). Assume that $c : X^N \to \mathbb{R} \cup \{+\infty\}$ satisfies Conditions (F1) and (F2). Then, for each $\varepsilon > 0$, there exists a unique minimizer $\gamma \in \Pi^{sym}_N(\rho)$ for the entropic-regularized cost $C_\varepsilon[\gamma]$, and this minimizer has a finite cost.
Proof. Our marginal measure satisfies Conditions (A) and (B), so there exists a measure $\gamma \in \Pi^{sym}_N(\rho)$ that minimizes $C_0$ with $C_0[\gamma] < \infty$. It must be noted that this measure can have infinite entropy. However, because of the approximation result presented in the proof of Condition (II) above, we get the existence of a measure $\gamma' \in \Pi^{sym}_N(\rho)$ such that $C_\varepsilon[\gamma'] < \infty$. The uniqueness claim now follows, since the functional $\gamma \mapsto C_\varepsilon[\gamma]$ is strictly convex for $\varepsilon > 0$.
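The index selection $k(n)$ used in the proof of condition (II) can be sketched as follows. The entropies $E[\gamma_j]$ are supplied as a plain list, and the sample values below (entropies growing like $j + 0.5$ and $\tau_n = 1/n$) are made up for illustration only; the function name `k_index` is ours.

```python
import math


def k_index(n, tau_n, entropies):
    """k(n) = min{n, max{1, sup{k : sqrt(tau_n) * E[gamma_j] < 1 for all j <= k}}}.

    entropies[j - 1] holds E[gamma_j] for j = 1, ..., len(entropies),
    and len(entropies) >= n is assumed. Since the result is capped by n,
    it is enough to scan the indices j <= n.
    """
    k = 0
    for j in range(1, n + 1):
        if math.sqrt(tau_n) * entropies[j - 1] < 1:
            k = j
        else:
            break
    return min(n, max(1, k))


# Hypothetical entropies E[gamma_j] = j + 0.5 and regularization tau_n = 1/n.
entropies = [j + 0.5 for j in range(1, 101)]
k_small = k_index(4, 1.0 / 4, entropies)
k_large = k_index(100, 1.0 / 100, entropies)
```

Whenever the defining condition holds at $j = k(n)$, we get $\tau_n E[\gamma_{k(n)}] < \sqrt{\tau_n}$, so the entropy term vanishes in the limit while $k(n)$ still tends to infinity.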

Entropic-Kantorovich Duality for Coulomb-type costs
We start by recalling the classical Fenchel-Rockafellar Theorem, and we refer to I. Ekeland and R. Témam's book [13, Theorem 4.2] for a more complete presentation and references.
where A * : Y * → X * denotes the adjoint operator of A.
Next we prove the Entropic-Kantorovich duality for the problem (2.3).
Proof. First let us assume that $X$ is a compact space. We denote by X = $(C_b(X))^N$ and Y = $C_b(X^N)$, where $C_b(X)$ is the space of continuous and bounded functions on $X$, and similarly for $X^N$. By the Riesz representation theorem, the dual of the space Y is the space $\mathcal{M}(X^N)$ of signed regular Borel measures on $X^N$. Thus we may define the Legendre-Fenchel transform $G^*$ of a functional $G$ : Y $\to \mathbb{R} \cup \{+\infty\}$ by
We apply the Fenchel-Rockafellar Theorem 4.1 to the functionals and to the operator
Now $F$ and $G$ are proper and convex functionals and $A$ is a linear and continuous operator. So we may apply Fenchel-Rockafellar duality to get
This gives (since for every set $S$ we have $\inf(S) = -\sup(-S)$ and $\sup S = -\inf(-S)$)
It remains to show that the above expression has exactly the form of our duality claim. The claim that the right-hand sides correspond to each other follows immediately from our choices of X, $F$, and $G$. So it remains to show that (4.1)
To prove it, let $\gamma \in \mathcal{M}(X^N)$. Now we have
Let us then compute $G^*[\gamma]$: If $\gamma$ is not absolutely continuous with respect to $m^N$ we have $G^*[\gamma] = +\infty$. If $\gamma \ll m^N$, then the supremum (that appears in the definition of $G^*[\gamma]$) is realized at $\psi = \varepsilon \log \rho_\gamma + c$; this holds also if the function $\rho_\gamma$ is not continuous, since it can be approximated by a sequence of continuous functions. Thus we get for $\gamma \ll m^N$
Hence, if $\gamma \in \Pi_N(\rho)$, we have
This concludes the duality proof when $X$ is a compact space.
The noncompact case: Due to Lemma 2.2, it suffices to prove the claim in the case where the reference measure is ρm instead of m; the finiteness of the measure ρm now gives access to inner regularity and to the approximability by compact sets. We will for simplicity denote ρ := ρm.
We may assume that $\sup_{u \in C_b(X)} D_\rho(u) > -\infty$; indeed, since we can test with the function $u \equiv 0$, this always holds for cost functions that are bounded from below. Let us make, in the notation of the primal functional, the dependence on the reference measure explicit by the notation $\gamma \mapsto C_\varepsilon(\gamma|\mu)$ when the reference measure on the space $X$ is $\mu$. Thus the original notation $\gamma \mapsto C_\varepsilon[\gamma]$ corresponds to $\gamma \mapsto C_\varepsilon(\gamma|m)$.
Since the measures ρ and γ are inner regular, there exists a sequence (K n ) n∈N of compact subsets of X such that ρ(K n ) → ρ(X) and γ(K N n ) → γ(X N ) . The claim follows by letting n → ∞.

Remark 4.3. Notice that we always have
We show that the equality actually holds by simply choosing a potential u(x) = (u 1 (x) + u 2 (x) + · · · + u N (x))/N .