# Infinite-body optimal transport with Coulomb cost

DOI: 10.1007/s00526-014-0803-0

- Cite this article as:
- Cotar, C., Friesecke, G. & Pass, B. Calc. Var. (2015) 54: 717. doi:10.1007/s00526-014-0803-0


## Abstract

We introduce and analyze symmetric infinite-body optimal transport (OT) problems with cost function of pair potential form. We show that for a natural class of such costs, the optimizer is given by the independent product measure all of whose factors are given by the one-body marginal. This is in striking contrast to standard finite-body OT problems, in which the optimizers are typically highly correlated, as well as to infinite-body OT problems with Gangbo–Swiech cost. Moreover, by adapting a construction from the study of exchangeable processes in probability theory, we prove that the corresponding \(N\)-body OT problem is well approximated by the infinite-body problem. To our class belongs the Coulomb cost which arises in many-electron quantum mechanics. The optimal cost of the Coulombic N-body OT problem as a function of the one-body marginal density is known in the physics and quantum chemistry literature under the name *SCE functional*, and arises naturally as the semiclassical limit of the celebrated Hohenberg-Kohn functional. Our results imply that in the inhomogeneous high-density limit (i.e. \(N\rightarrow \infty \) with arbitrary fixed inhomogeneity profile \(\rho {/}N\)), the SCE functional converges to the mean field functional. We also present reformulations of the infinite-body and N-body OT problems as two-body OT problems with representability constraints.

### Keywords

N-representability, Density functional theory, Hohenberg-Kohn functional, N-body optimal transport, Infinite-body optimal transport, Coulomb cost, Exchange-correlation functional, de Finetti’s Theorem, Finite exchangeability, N-extendability

### Mathematics Subject classification

49S05 65K99 81V55 82B05 82C70 92E99 35Q40

## 1 Introduction

### 1.1 Semi-classical electron-electron interaction functional and connection to optimal transport

This work is motivated by, and contributes to, the longstanding quest in physics, chemistry and mathematics to design and justify approximations to the energy functional of many-electron quantum mechanics in terms of the one-body density.

*SCE functional*, where the acronym SCE stands for strictly correlated electrons; the fact that e.g. for \(N=2\), minimizers concentrate on lower-dimensional sets of form \(x_2=T(x_1)\) (see (1.5) below) has the physical interpretation that given the position of the first electron, the position of the second electron is strictly determined. The connection of the functional (1.1) with many-electron quantum mechanics which motivated this work is explained at the end of this Introduction.

We remark that dropping the symmetry requirement on \(\gamma _N\) would not alter the minimum value in (1.1), since the functional \(C_N\) takes the same value on a nonsymmetric measure as on its symmetrization.
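The symmetrization remark can be checked numerically on a small discrete example. The sketch below is only an illustration under stated assumptions: the cost is taken to be a sum of symmetric pair terms \(\sum _{i<j}c(x_i,x_j)\), the sites, the bounded pair cost \(e^{-|x-y|}\) and the helper names (`pair_cost_tensor`, `symmetrize`, `total_cost`) are hypothetical choices, and the random test measure is not constrained to have prescribed marginals.

```python
import itertools
import numpy as np

def pair_cost_tensor(points, c, n):
    """Tabulate c_N(x_1..x_n) = sum_{i<j} c(x_i, x_j) on a finite set of sites."""
    m = len(points)
    cost = np.zeros((m,) * n)
    for idx in itertools.product(range(m), repeat=n):
        cost[idx] = sum(
            c(points[idx[i]], points[idx[j]])
            for i in range(n)
            for j in range(i + 1, n)
        )
    return cost

def symmetrize(gamma):
    """Average a discrete N-body measure over all permutations of its arguments."""
    perms = list(itertools.permutations(range(gamma.ndim)))
    return sum(np.transpose(gamma, p) for p in perms) / len(perms)

def total_cost(gamma, cost):
    """Integrate the tabulated cost against the discrete measure."""
    return float(np.sum(gamma * cost))

# Hypothetical data: 4 sites on the line, a random nonsymmetric 3-body measure.
rng = np.random.default_rng(0)
points = np.array([0.0, 0.7, 1.5, 2.0])
gamma = rng.random((4, 4, 4))
gamma /= gamma.sum()

cost = pair_cost_tensor(points, lambda x, y: np.exp(-abs(x - y)), 3)
gamma_sym = symmetrize(gamma)
print(total_cost(gamma, cost), total_cost(gamma_sym, cost))
```

The two printed values agree up to rounding because the tabulated cost is itself invariant under permutations of its arguments, which is the only property the remark uses.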

*body mass transportation functional* or *an optimal transport problem with* \(N\) *marginals*, and the problem (1.1) of minimizing it an \(N\)-*body optimal transport problem*. The functional \(V_{ee}^{SCE,N}\) can be interpreted as the *minimum cost of an optimal transport problem as a functional of the marginal measure*. In the case \(N=2\), one is dealing with a standard (two-body or two-marginal) optimal transport problem of form

*cost function* and \({\mathcal {P}}(\mathbb {R}^{2d})\) is the space of probability measures on \(\mathbb {R}^{2d}\).

### 1.2 Previous results

It was not realized until recently [10, 17] that the minimization problem in (1.1) has the form of an optimal transport problem and can, especially in the case \(N=2\), be fruitfully analyzed via methods from OT theory.

Much less is known about \(N\)-body OT problems with \(N\ge 3\). Here the OT literature has focused on special cost functions [10, 13, 15, 16, 18, 27, 32, 35, 38, 42, 49, 50, 51, 54, 59, 60] and the structure of solutions is highly dependent on the cost function. For certain costs, solutions concentrate on graphs over the first marginal, as in the two-body case, while for others the solutions can concentrate on high dimensional submanifolds of the product space. In particular, despite its importance in electronic structure theory, very little is known regarding the structure of the solutions of the \(N\)-body OT problem with Coulomb cost (1.1). Let us note, however, that the study of Monge–Kantorovich problems with symmetry constraints has been initiated in [34] and continued in [15, 16, 29, 33, 35], the last two papers dealing with the Coulomb cost.

### 1.3 Main results

Here we focus on problem (1.1) in the *regime of large* \(N\), i.e. the “opposite” regime of the hitherto best understood case \(N=2\). We present two main results. The first introduces and analyzes the associated *infinite-body OT problem*. Remarkably, for a natural class of costs which includes the Coulomb cost, the infinite-body problem is uniquely minimized by the independent product measure all of whose factors are given by the one-body marginal. See Theorem 1.2 below for the precise statement. This stands in surprising contrast to the pair of recent papers [Pass12a] and [Pass12b]. There, costs of Gangbo–Swiech type are analyzed and it is shown that the optimizer is a Monge type solution; that is, any two of the variables are completely dependent rather than completely independent. Our second main result says that the corresponding \(N\)-body OT problem is well approximated by the infinite-body problem; in particular we show that the optimal cost per particle pair of the \(N\)-body problem converges to that of the infinite-body problem as \(N\) gets large. See Theorem 1.3 for the precise statement.

### 1.4 Connection with many-electron quantum mechanics and the Hohenberg–Kohn functional

### 1.5 Precise statement of main results

With a view to the application to density functional theory, we will work in the following setting even though some of our main results could be stated and proved for more general spaces, such as Polish spaces for Theorem 1.3.

**Definition 1.1**

(Infinite-dimensional Borel \(\sigma \)-algebra) Let \((\Omega _i^d, \mathcal{{F}}_i^d):=(\mathbb {R}^d, \mathcal{B}(\mathbb {R}^d))\), where \(i=1,2,\ldots , N,\ldots ,\) and \(d\ge 1\). The underlying \(\sigma \)-field is the Borel \(\sigma \)-field. We define the *infinite-dimensional cartesian product* \(\Omega _\infty ^d\) by \(\Omega _\infty ^d:=\times _{i=1}^\infty \Omega _i^d,\) and the *infinite-dimensional Borel* \(\sigma \)*-algebra* \(\mathcal{B}_\infty ^d\) as the Borel \(\sigma \)-algebra generated by the open subsets of \(\Omega _\infty ^d\) of the form \(\prod _{i=1}^\infty A_i\), \(A_i\in \mathcal{B}(\mathbb {R}^d)\), where \(A_i=\Omega _i^d\) for all but a finite number of \(i\).

For an abstract measurable space \((\mathcal{S}, \mathcal{B}(\mathcal{S}))\), we define similarly \((\mathcal{S}^\infty , \mathcal{B}^\infty (\mathcal{S}))\) as the cartesian product of \((\mathcal{S}, \mathcal{B}(\mathcal{S}))\).

To simplify the notation, we will write \((\mathbb {R}^d)^\infty \) instead of \(\Omega _\infty ^d\). Throughout the paper, if \(\mu \in {\mathcal {P}}(\mathbb {R}^d)\) has a Lebesgue-integrable density, the latter is also denoted by \(\mu \).

*cost function* \(c_N:\underbrace{\mathbb {R}^d\times \cdots \times \mathbb {R}^d}_{N\text{ times}} \rightarrow {\mathbb {R}}_+\cup \{\infty \}\) be defined by

*infinite-body (or infinite-marginal) optimal transport problem*.

*Coulomb cost* \(c(x,y)=\frac{1}{|x-y|}\). In this case, the functional (1.13) becomes

**Theorem 1.2**

- (a)
Let \(c \, : \, \mathbb {R}^{2d}\rightarrow \mathbb {R}_+\cup \{\infty \}\) in (1.10) be of the form \(c(x,y) =\ell (x-y)\), where \(\ell (z)=\ell (-z)\) for all \(z\in \mathbb {R}^d\) (i.e. \(c\) is symmetric), and either

  - (i)
  \(\ell \in L^1(\mathbb {R}^d)\cap C_b(\mathbb {R}^d)\), \(\hat{\ell }\ge 0\) or

  - (ii)
  \(d=3\), \(\ell (z)=1/|z|\) (Coulomb cost).

  Let \(\mu \in \mathcal{P}(\mathbb {R}^d)\) be a measure such that

  $$\begin{aligned} \int _{\mathbb {R}^{2d}} c(x,y) \, \mu (dx)\mu (dy)<\infty . \end{aligned}$$(1.15)

  Then the infinite-dimensional product measure on \((\mathbb {R}^d)^\infty \)

  $$\begin{aligned} \gamma _0=\mu ^{\otimes \infty }=\mu \otimes \mu \otimes \cdots \end{aligned}$$(1.16)

  is a minimizer of the infinite-body optimal transport problem (1.12), and the optimal cost is the mean field functional, i.e.

  $$\begin{aligned} {F}^{OT}_{\infty }[\mu ]=\int _{\mathbb {R}^{2d}} c(x,y) \, \mu (dx)\mu (dy). \end{aligned}$$(1.17)

- (b)
If in addition \(\hat{\ell }(z)\) is strictly bigger than zero for all \(z\), then the independent measure (1.16) is the unique minimizer of the problem (1.12).

Note that in case (ii), i.e. the Coulomb cost in dimension \(d=3\), the strict positivity condition \(\hat{\ell }>0\) holds, because \(\hat{\ell }(k)=4\pi /|k|^2\). Moreover, by simple estimates (see e.g. eq. (5.21) in the proof of Theorem 5.6 in [17]) the finiteness condition in (a) holds for all \(\mu \in L^1(\mathbb {R}^3)\cap L^3(\mathbb {R}^3)\); the latter is the natural \(L^p\) type space into which the domain of the Hohenberg-Kohn functional embeds. As a consequence, the above results are valid for all densities of physical interest in DFT. However, the Coulomb cost is neither continuous nor does it belong to \(L^1\), so it is not covered by (i). The obvious task of weakening the regularity assumptions in (i) so as to naturally include the Coulomb cost does not seem to be straightforward and lies beyond the scope of this article.
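To make the mean field functional with Coulomb cost concrete, here is a minimal Monte Carlo sketch for one specific density. The choice of \(\mu \) (the normalized uniform density on the unit ball of \(\mathbb {R}^3\), which lies in \(L^1\cap L^3\)), the sample size and the function names are assumptions made for illustration. For this \(\mu \), classical electrostatics gives \(\int \int |x-y|^{-1}\mu (dx)\mu (dy)=6/5\), i.e. twice the self-energy \(3/5\) of a uniformly charged unit ball with unit total charge.

```python
import numpy as np

def sample_unit_ball(n, rng):
    """Draw n points uniformly from the unit ball in R^3."""
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = rng.random(n) ** (1.0 / 3.0)  # radial law with density 3 r^2 on [0, 1]
    return directions * radii[:, None]

def mean_field_coulomb_mc(n, seed=0):
    """Monte Carlo estimate of the double integral of 1/|x-y| against mu x mu."""
    rng = np.random.default_rng(seed)
    x = sample_unit_ball(n, rng)
    y = sample_unit_ball(n, rng)  # independent copy, i.e. a sample of mu x mu
    return float(np.mean(1.0 / np.linalg.norm(x - y, axis=1)))

estimate = mean_field_coulomb_mc(200_000)
print(estimate)  # should be close to the closed-form value 6/5
```

The estimator has finite variance in dimension three, because \(|x-y|^{-2}\) is still integrable against \(\mu \otimes \mu \), so the estimate settles near the closed-form value despite the singularity of the cost.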

Our result stands in surprising contrast to the recent results in [52, 53] by one of us. For a class of costs including the many-body quadratic cost \(\sum _{i \ne j}|x_i-x_j|^2\) studied by Gangbo and Swiech [32], the optimizer of the infinite-body OT problem is demonstrated to be a Monge type solution; that is, any two of the variables are completely dependent, rather than completely independent as is the case for our class of costs. This dichotomy exposes a fascinating sensitivity to the cost function in infinite-body optimal transport problems. This difference is not present in two-marginal problems, where fairly weak conditions on the cost which include both the quadratic and the Coulomb cost suffice to ensure Monge type solutions. A milder version of the dichotomy does however arise in the multi-body context, where for certain costs the solution can concentrate on high dimensional submanifolds of the product space [13, 50]. It does not seem to be until one gets to the infinite marginal setting, however, that complete independence of the variables becomes optimal for certain costs. The difference between the costs in our paper and those in [52, 53] can be expressed succinctly as positivity of the Fourier transform of \(\ell \). Note that the latter is equivalent to the fact that \(c(x,y)=\ell (x-y)\) is a *positive kernel*, i.e. the associated integral operator \(K\varphi (x):=\int _{\mathbb {R}^d}c(x,y)\varphi (y)\, dy\) satisfies \(\langle \varphi , K\varphi \rangle \ge 0\) for all \(\varphi \in C_0^\infty (\mathbb {R}^d)\). See Example 2.13(ii) in Sect. 2.2 for a simple explicit example of a cost function which satisfies all the assumptions in Theorem 1.2 except positivity of the Fourier transform and for which the conclusion of the theorem fails.

The basic idea for the proof of Theorem 1.2 is to represent the competing infinite-dimensional probability measures in (1.13) via de Finetti’s theorem, and identify the functional \(C_\infty \) introduced in (1.12), with the help of Fourier transform calculus and elementary probability theory, as a sum of the mean field functional and a certain variance term minimized by completely independent measures.

**Theorem 1.3**

Note that here not just costs leading to independence as in Theorem 1.2 but also costs leading to strong correlations as considered in [52, 53] are included.

The proof of Theorem 1.3 is based on a construction from advanced probability theory [22] which does not appear to be easily accessible to non-probabilists, and which contains the important insight that any \(N\)-body measure \(\gamma _N\in {\mathcal {P}}^N_{sym}(\mathbb {R}^{d})\) can be approximated by the \(N\)-body marginal \(\tilde{\gamma }_N\) of an infinite probability measure \(\gamma \in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) (\(\tilde{\gamma }_N\) is infinitely representable in the terminology developed below). This allows us to approximate the \(N\)-body OT problem (1.18) as arising in density functional theory by the corresponding infinite-body OT problem (1.12). Interestingly, the focus of probabilists was precisely the other way around: the objects of primary interest were the infinite probability measures in the space \({\mathcal {P}}_{sym}^\infty \), or in fact the underlying infinite sequences of random variables. The latter serve as useful alternatives to iid (independent and identically distributed) sequences which allow one to model repeated sampling experiments containing correlations; approximation by finite sequences of random variables was then of interest for purposes of numerical sampling.
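A toy computation conveys the flavour of this approximation. It is not the construction of [22]; it is only a hedged illustration with hypothetical function names. Place \(N\) particles on \(N\) distinct sites in uniformly random order (sampling without replacement): the resulting pair law is \(N\)-representable. Sampling with replacement instead gives the product of uniform measures, which is infinitely representable. The sketch computes the total variation distance between the two pair laws exactly.

```python
from fractions import Fraction

def pair_law_without_replacement(n):
    """Two-body marginal of a uniformly random ordering of n distinct sites."""
    return {
        (i, j): (Fraction(1, n * (n - 1)) if i != j else Fraction(0))
        for i in range(n)
        for j in range(n)
    }

def pair_law_with_replacement(n):
    """Two-body marginal of iid uniform draws from n sites (a product measure)."""
    return {(i, j): Fraction(1, n * n) for i in range(n) for j in range(n)}

def total_variation(p, q):
    """Total variation distance, i.e. half the l1 distance of the two pair laws."""
    return sum(abs(p[k] - q[k]) for k in p) / 2

def tv_gap(n):
    """Distance between the without-replacement and with-replacement pair laws."""
    return total_variation(
        pair_law_without_replacement(n), pair_law_with_replacement(n)
    )

for n in (2, 4, 10, 100):
    print(n, tv_gap(n))
```

In this example the gap equals exactly \(1/N\): the diagonal carries mass \(1/N\) under the product law and none under the permutation law, and the off-diagonal masses differ by the same total amount.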

*inhomogeneous high-density limit* in which the inhomogeneity is not a small perturbation, but stays proportional to the overall density. More precisely, one fixes an arbitrary density \(\mu \) of integral \(1\), considers the \(N\)-body system with proportional inhomogeneity, i.e. with one-body density given by \(\rho =N\mu \), and studies the asymptotics of the SCE energy as \(N\) gets large. Note that the SCE energy corresponds, up to normalization factors, to the optimal cost functional (1.19) with Coulomb cost \(c(x,y)=1/|x-y|\) in dimension \(d=3\):

**Corollary 1.4**

We remark that both numerator and denominator are of order \(N^2\) as \(N\rightarrow \infty \), i.e. they are proportional to the number of particle pairs in the system. A very interesting question raised by our work is to determine asymptotic corrections to the mean field energy in Eq. (1.20). For non-singular costs, we expect the next-order correction to occur at the thermodynamic order \(O(N)\). Unfortunately, understanding these corrections lies beyond the scope of the methods developed here.

*Remark 1.5*

The arguments developed in the present paper apply to a larger class of interaction energies (see Definition 1.1, Theorem 1.2), and—perhaps more importantly—are based on a general and transparent probabilistic inequality (namely the comparison estimate in Proposition 3.2 below between infinitely representable and finitely representable measures which goes back to Diaconis and Freedman). By contrast the Lieb–Oxford inequality was derived using highly nontrivial ad hoc estimates and currently lacks a probabilistic interpretation and analogues for non-Coulombic problems. But—unlike the Lieb–Oxford inequality—our probabilistic arguments fail to give a quantitative error bound for the associated optimal cost functionals for singular costs like the Coulomb cost, yielding such bounds only in the case of bounded costs (see Eq. (3.6)).

### 1.6 Plan of paper

The rest of the paper is organized as follows. In Sect. 2 we recall the notion of \(N\)-representability of pair measures, which was developed in the present OT context in our recent paper [27] and is equivalent to the concept of \(N\)-extendability of pairs of random variables in probability theory, and prove Theorem 1.2. Section 3 is devoted to the proof of Theorem 1.3.

## 2 Solution to the infinite-body OT problem

The proof of Theorem 1.2 will require two key Lemmas. The first one (Lemma 2.4) reduces the infinite-body OT problem (1.12) to a \(2\)-body OT problem with an infinite representability constraint. The second (Lemma 2.8) gives an explicit description of the measures satisfying this infinite representability constraint (de Finetti’s Theorem, stated in Proposition 2.7 below).

In Sect. 2.1 we recall the notion of \(N\)-representability of a pair density, generalize it to infinitely many particles, prove Lemmas 2.4 and 2.8, and also establish existence of at least one solution to (1.12) (Proposition 2.9). In Sect. 2.2 we establish Theorem 1.2, via Fourier transform calculus applied to the de Finetti representation of infinitely representable measures.

### 2.1 Reduction to a \(2\)-body OT problem with infinite representability constraint

We now reformulate the infinite-body mass transportation problem (1.12) as a standard (two-body) mass transportation problem subject to an infinite representability constraint. This reformulation is possible due to the fact that the cost in (1.10) is a sum of symmetric pair terms. We begin by recalling the definition of \(N\)-representability, introduced in the present context in our recent paper [27] (see Definition III.1).

**Definition 2.1**

*-representable* if there exists a symmetric probability measure \(\gamma _N\in {\mathcal {P}}_{sym}^N(\mathbb {R}^{d})\) such that for all Borel sets \(A_i,A_j\subseteq \mathbb {R}^d\) and all \(1\le i<j\le N\), we have

\(N\)-representability is a highly nontrivial restriction. The following basic example is taken from [27].

*Example*

Let \(A\), \(B\in \mathbb {R}^d\), \(A\ne B\). The totally anticorrelated probability measure \(\mu _2 = \frac{1}{2}(\delta _A\otimes \delta _B + \delta _B\otimes \delta _A)\) is not 3-representable. (Here \(\delta _A\) denotes the Dirac measure centred at \(A\).)

Intuitively, this is because we cannot allocate 3 particles to 2 sites without doubly occupying one of the sites. Mathematically, to prove this suppose that \(\gamma \) was any probability measure on \((\mathbb {R}^d)^3\) with two-body marginal \(\mu _2\). Then \(\gamma \) must have one-body marginal supported on \(\{A,B\}\), and hence must be a convex combination of the measures \(\delta _X\otimes \delta _Y\otimes \delta _Z\) with \(X,Y,Z\in \{A,B\}\). But the two-point marginal of each of the latter measures contains a positive multiple of either \(\delta _A\otimes \delta _A\) or \(\delta _B\otimes \delta _B\), whence the two-point marginal of \(\gamma \) cannot equal \(\mu _2\). For further discussion and more general examples we refer to [27].
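The pigeonhole argument can be checked by brute force. The following sketch (the function name `min_coincidence_fraction` is hypothetical) enumerates all placements of the particles on the sites and records the smallest fraction of particle pairs that share a site. For a symmetric measure this fraction, averaged over configurations, is the mass that the two-body marginal puts on the diagonal, so a strictly positive minimum rules out the totally anticorrelated measure \(\mu _2\).

```python
from fractions import Fraction
from itertools import combinations, product

def min_coincidence_fraction(n_particles, n_sites):
    """Smallest fraction of particle pairs sharing a site, over all placements."""
    pairs = list(combinations(range(n_particles), 2))
    best = None
    for config in product(range(n_sites), repeat=n_particles):
        equal = sum(1 for i, j in pairs if config[i] == config[j])
        frac = Fraction(equal, len(pairs))
        if best is None or frac < best:
            best = frac
    return best

# Two particles on two sites can avoid each other; three particles cannot.
print(min_coincidence_fraction(2, 2))
print(min_coincidence_fraction(3, 2))
```

With three particles on two sites, every configuration has at least one coinciding pair out of three, so any symmetric three-body measure on \(\{A,B\}^3\) has a two-body marginal with diagonal mass at least \(1/3\).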

The notion of \(N\)-representability in Definition 2.1 is well known in the probability theory literature, under the names *N-extendability* or *finite exchangeability*, and is usually stated and analyzed in the language of sequences \(X_1,\ldots ,X_N\) of \(N\) random variables. The formulation in Definition 2.1 is mathematically equivalent and corresponds to considering instead the law of the random vector \((X_1,\ldots ,X_N)\). Numerous attempts have been made to characterize \(N\)-extendability for \(N\ge 3\) for various types of marginals (see, for example, [2] for an in-depth overview of \(N\)-extendability results in probability), but a direct characterization remains elusive.

Let us now generalize Definition 2.1 to infinite particle systems.

**Definition 2.2**

*infinitely representable* if there exists a symmetric probability measure \(\gamma _\infty \in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) such that for all Borel sets \(A_i,A_j\subseteq \mathbb {R}^d\) and all \(1\le i<j\le N\), we have

Note that a symmetric probability measure \(\gamma _\infty \in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) is called an *exchangeable* measure in the probabilistic literature. It is easy to see (see, for example, [2] or Lemma III.2 in [27]) that

**Lemma 2.3**

Let \(N\ge M \ge 2\). If \(\mu _2\in {\mathcal {P}}_{sym}^2(\mathbb {R}^{d})\) is \(N\)-representable, then it is also \(M\)-representable.

That is to say, \(N\)-representability becomes a more and more stringent condition as \(N\) increases. Note that the case \(N=\infty \) will be studied later in Lemma 3.3.

We will next reformulate the minimization problem (1.12) in terms of infinite representability. The result is a straightforward extension to infinite particle systems of Theorem III.3 in [27] for the \(N\)-body problem.

**Lemma 2.4**

*Proof*

This is clear from the observation that \({C}_{\infty }[\gamma ] = \int _{\Omega ^2}c\, d\mu _2\) for any \(\gamma \in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) with \(\gamma \rightarrow \mu _2\) and from the definition of infinite representability. \(\square \)

To prove our next result, we will use the de Finetti–Hewitt–Savage Theorem for infinitely representable measures as stated and proved in [22]. See [22] Theorems 14 and 20 (for exchangeable measures in \({\mathcal {P}}_{sym}^\infty (\mathcal{S})\), where \(\mathcal{S}\) is a compact Hausdorff space) and the third paragraph on page 751 (for the more general case of standard spaces \(\mathcal{S}\) as defined in Definition 2.5 below). Note that de Finetti’s Theorem can be found in the literature—under various assumptions on \(\mathcal{S}\)—in the different but equivalent language of exchangeable sequences of random variables on \(\mathcal{S}\), starting with the seminal paper of [20] (for exchangeable Bernoulli random variables). The statement of the theorem requires some more notation and definitions.

**Definition 2.5**

Two probability spaces \((\Upsilon _1, \mathcal{G}_1, {\mathbb {P}}_1)\) and \((\Upsilon _2, \mathcal{G}_2, {\mathbb {P}}_2)\) are *isomorphic* if there exists a bijective map \(T \, : \Upsilon _1\rightarrow \Upsilon _2\) such that \(T\) and \(T^{-1}\) map measurable sets to measurable sets and are measure preserving.

Two probability spaces \((\Upsilon _1, \mathcal{G}_1, {\mathbb {P}}_1)\) and \((\Upsilon _2, \mathcal{G}_2, {\mathbb {P}}_2)\) are *isomorphic mod*\(0\) if there exist null sets \(A_1\subset \Upsilon _1, A_2\subset \Upsilon _2\) such that the probability spaces \(\Upsilon _1{\setminus }A_1, \Upsilon _2{\setminus }A_2\) (endowed with the natural sigma-fields and probability measures) are isomorphic.

A probability space is called a *standard space* if it is isomorphic mod \(0\) to an interval with Lebesgue measure, a finite or countable sum of atoms (i.e. measures whose support consists of a single point), or a disjoint union of both.

For further discussion of the notion of standard space and examples see [40].

Let \((\mathcal{S},\mathcal{B}(\mathcal{S}))\) be an abstract measurable space.

*Remark 2.6*

The main point for our purposes is that \((\mathcal{S}, \mathcal{B}(\mathcal{S}), \mu )\) is a standard space provided \(\mathcal{S}\) is Polish (i.e., a complete separable metric space), \(\mathcal{B}(\mathcal{S})\) is the Borel \(\sigma \)-field, and \(\mu \) is any Borel probability measure on \((\mathcal{S}, \mathcal{B}(\mathcal{S}))\). In particular, \(\mathbb {R}^d\) and \((\mathbb {R}^d)^\infty \) endowed with any Borel probability measure are standard spaces. This follows e.g. by combining Theorem 2.4.1 in [40], which establishes that any Polish space endowed with a regular probability measure is standard, and the general measure-theoretic fact (see e.g. [5]) that any Borel probability measure on a Polish space is regular.

We endow \(\mathcal{P}(\mathcal{S})\), the set of all probability measures on \((\mathcal{S}, \mathcal{B}(\mathcal{S}))\), with the smallest \(\sigma \)-algebra \(\mathcal{{B}}^*(\mathcal{S})\) which makes the functions \(\mathbb {P}\rightarrow \mathbb {P}(A), \mathbb {P}\in \mathcal{P}(\mathcal{S}),\) measurable for all \(A\in \mathcal{B}(\mathcal{S})\). We note here that in the weak star topology of \(\mathcal{P}(\mathcal{S})\) (in which the convergence is called weak convergence of measures), the map \(\mathbb {P}\rightarrow \mathbb {P}(A), A\in \mathcal{B}(\mathcal{S}),\) is continuous, and therefore \(\mathcal{{B}}^*(\mathcal{S})\) is by definition the Baire \(\sigma \)-field in \(\mathcal{P}(\mathcal{S})\). If \(\mathcal{S}\) is a metric space, the Baire \(\sigma \)-field \(\mathcal{{B}}^*(\mathcal{S})\) coincides with the Borel \(\sigma \)-field on \(\mathcal{P}(\mathcal{S})\). (For more on Baire and Borel \(\sigma \)-fields, see for example Chapter 6 in [6] or [22], and for more on the weak star topology of \(\mathcal{P}(\mathcal{S})\) see Chapter 8 in [6].)

We are now ready to state de Finetti’s Theorem. Translated into the present language, de Finetti’s Theorem says the following:

**Proposition 2.7**

In words: one can view an (infinite-dimensional) exchangeable probability measure \(\gamma _\infty \) on \(S^{\otimes \infty }\) as an integral of infinite product probability measures on \(S^{\otimes \infty }\) against a probability measure defined on the space of all probability measures on \(S\). An equivalent statement of De Finetti’s theorem is that the *extremal points* of the convex set of exchangeable probability measures on an infinite product space \(S^{\otimes \infty }\) are the infinite-dimensional product measures \(Q^{\otimes \infty }\) on \(S^{\otimes \infty }\). De Finetti’s theorem asserts, moreover, that this convex set is a *simplex*, i.e. any of its points \(\gamma _\infty \) is the barycentre of a unique probability measure \(\nu \) concentrated on the extremal points \(Q^{\otimes \infty }\).

**Theorem 2.8**

*Proof*

Note that the one and two body marginals of \(\gamma _{\infty }\) in (2.5) are given by \(\mu = \int _{{\mathcal {P}}(\mathbb {R}^d)} Q \, d\nu (Q)\) and \(\mu _2 = \int _{{\mathcal {P}}(\mathbb {R}^d)}Q\otimes Q \, d\nu (Q)\), respectively. Then, by de Finetti’s Theorem, \(\mu _2\) is infinitely representable if and only if \(\mu _2 = \int _{{\mathcal {P}}(\mathbb {R}^d)}Q\otimes Q \, d\nu (Q)\) for some \(\nu \in \mathcal{P}(\mathcal{P}(\mathbb {R}^d))\). The result follows from Lemma 2.4. \(\square \)
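The marginal formulas used in this proof are easy to evaluate when \(\nu \) is a finite mixture on a finite state space. The sketch below is a discrete illustration only: the weights, the factors \(Q_k\) and the function name `de_finetti_marginals` are hypothetical choices, with \(\nu =\sum _k w_k\delta _{Q_k}\) standing in for a general \(\nu \in \mathcal{P}(\mathcal{P}(\mathbb {R}^d))\).

```python
import numpy as np

def de_finetti_marginals(weights, factors):
    """One- and two-body marginals of the mixture sum_k w_k Q_k^{(x) infinity}.

    weights: mixture weights w_k (nonnegative, summing to one)
    factors: rows are the probability vectors Q_k on a common finite state space
    """
    weights = np.asarray(weights, dtype=float)
    factors = np.asarray(factors, dtype=float)
    mu = weights @ factors                                       # mu = sum_k w_k Q_k
    mu2 = np.einsum("k,ki,kj->ij", weights, factors, factors)    # sum_k w_k Q_k (x) Q_k
    return mu, mu2

# Hypothetical two-component mixture on a three-point state space.
weights = [0.3, 0.7]
factors = [[0.5, 0.5, 0.0], [0.1, 0.2, 0.7]]
mu, mu2 = de_finetti_marginals(weights, factors)

# Deviation of the pair measure from the independent coupling mu (x) mu.
correlation = mu2 - np.outer(mu, mu)
print(mu)
print(np.linalg.eigvalsh(correlation))
```

The matrix `correlation` equals \(\sum _k w_k (Q_k-\mu )\otimes (Q_k-\mu )\), so it is positive semidefinite; it vanishes exactly when the mixture is trivial, i.e. when \(\mu _2=\mu \otimes \mu \).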

We end this subsection with a general result of existence of at least one solution to (1.12) and to (2.4). This result will be used in the proof of Theorem 1.3.

**Theorem 2.9**

For all \(N\in \mathbb {N}, N\ge 2\), let \(c_N\, : \, (\mathbb {R}^d)^N \rightarrow {\mathbb {R}}_+\cup \{\infty \}\) be defined as in (1.10), with \(c\) Borel-measurable, symmetric, and lower semi-continuous. Then there exists at least one solution \(\gamma ^{opt}\) to (1.12) and at least one solution \(\mu _2^{opt}\) to the minimization problem in (2.4).

*Proof*

To prove the existence of a solution \(\gamma \in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d),\gamma \mapsto \mu \), to (1.12), we will adapt to our infinite-body optimal transportation problem the standard proof of existence of solutions to two-body OT problems as given e.g. in [65], Theorem 4.1. Since there are some subtle differences to the proof in [65], we will outline below the basic steps.

- (a)
Lower semicontinuity of the cost functional \(\gamma \mapsto C_\infty [\gamma ]\) on \({\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) with respect to weak convergence. This follows by a standard argument after rewriting \(C_\infty [\gamma ] = \int _{\mathbb {R}^{2d}}c(x_1,x_2)d\mu _2(x_1,x_2)\) and by noting that the class of infinite-dimensional symmetric probability measures in \({\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) is closed under weak convergence in the sense that if \(\{P_k\in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\}_{k\ge 1}\) converges weakly to a probability measure \(\mathbb {P}\in {{\mathcal {P}}}((\mathbb {R}^d)^\infty )\), then \(\mathbb {P}\in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) (for a proof of this statement, see e.g. page 54 in [2]).

- (b)
Tightness in \({\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) of the set of all \(\gamma \in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^{d})\) such that \(\gamma \mapsto \mu \) for some fixed \(\mu \in {\mathcal {P}}(\mathbb {R}^{d})\). This is proved similarly to Lemma 4.3 from [65]. More precisely, let \(\gamma \in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^{d})\) such that \(\gamma \mapsto \mu \) and \(\mu \in {\mathcal {P}}(\mathbb {R}^{d})\). Since \(\mathbb {R}^d\) is a Polish space, \(\mu \) is tight in \({\mathcal {P}}(\mathbb {R}^d)\). Then for any \(\epsilon >0\) and for any \(i\in \mathbb {N}, i\ge 1\), there exists a compact set \(K^i_\epsilon \subset \mathbb {R}^d\), independent of the choice of \(\gamma \), such that \(\mu (\mathbb {R}^d\setminus K^i_\epsilon ) \le \frac{\epsilon }{2^i}\). Take \(K_\epsilon :=\prod _{i\ge 1}K^i_\epsilon \), which is compact by Tychonoff’s theorem. Then we have
$$\begin{aligned} \gamma (K_\epsilon ^c)\le \gamma (\cup _{i\ge 1}(\underbrace{\mathbb {R}^d\times \cdots \times \mathbb {R}^d}_{i-1\text{ times}}\times (K_\epsilon ^i)^c\times \mathbb {R}^d\times \cdots ))\le \sum _{i\ge 1}\mu ((K_\epsilon ^i)^c)\le \sum _{i\ge 1}\frac{\epsilon }{2^i}=\epsilon . \end{aligned}$$
Tightness now follows since this bound is independent of \(\gamma \).

One now trivially also obtains a solution to the variational problem in (2.4); namely, the two-point marginal \(\mu _2^{opt}\) of \(\gamma ^{opt}\) is a solution. \(\square \)

### 2.2 Proof of Theorem 1.2

In this subsection, we determine explicitly the optimal transport functional \({F}^{OT}_\infty \) introduced in Eq. (1.12), for a large class of cost functions. As an offshoot, we obtain an interesting probabilistic interpretation of the infinite-body optimal transport functional \(C_\infty \) introduced in (1.12).

*Proof of Theorem 1.2*

*formally*, using the rules of Fourier transform calculus even though \(\ell \) and \(Q\) are not smooth rapidly decaying functions. The calculation will be justified rigorously in Lemma 2.10 below. Using, in order of appearance, Fubini’s theorem, the definition of the convolution, Plancherel’s formula, the Fourier calculus rule \(\widehat{f*g}=\hat{f} \, \hat{g}\), and again Fubini’s theorem gives, abbreviating \(c_d:=(2\pi )^{-d}\),

The only steps in the derivation of (2.8), (2.9), (2.10) which were nonrigorous due to lack of regularity of \(\ell \) and \(Q\) were the use of Plancherel’s formula and of the Fourier calculus rule \(\widehat{\ell *Q}=\hat{\ell }\hat{Q}\). Conventional assumptions would be \(\ell *Q\) and \(Q\in L^2(\mathbb {R}^d)\) for the former, and \(\ell \) and \(Q\in L^1(\mathbb {R}^d)\) for the latter. As none of these four assumptions are actually met here, we will need the following generalization of these facts. Though this will surely not be surprising to experts, in the interest of completeness and for lack of a suitable reference we include a proof in the “Appendix”.

**Lemma 2.10**

Now by the assumption \(\hat{\ell }(z) \ge 0\), the two variance terms on the right hand side of (2.10) are nonnegative. Because the right hand side of (2.10) vanishes when \(\nu =\delta _\mu \), i.e. \(\mu _2=\mu \otimes \mu \), we conclude that \(\mu _2=\mu \otimes \mu \) is a minimizer of the problem in (2.6), and hence (by Theorem 2.8) that \(\gamma =\mu ^{\otimes \infty }=\mu \otimes \mu \otimes \cdots \) is a minimizer of (1.12). This establishes Theorem 1.2 (a).
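The structure of this argument, a mean field term plus a nonnegative fluctuation term, can be seen in a finite-dimensional analogue. The sketch below works in real space on a finite set of points rather than in Fourier space, so it is not the identity (2.10) itself; the Gaussian cost, the grid, the mixture and the function names are assumptions chosen for illustration. Positivity of \(\hat{\ell }\) is replaced by its discrete counterpart, positive semidefiniteness of the matrix \(L_{ij}=\ell (x_i-x_j)\).

```python
import numpy as np

def gaussian_cost_matrix(points, sigma=1.0):
    """Matrix L_ij = exp(-|x_i - x_j|^2 / (2 sigma^2)), a positive semidefinite kernel."""
    diff = points[:, None] - points[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))

def split_cost(weights, factors, cost_matrix):
    """Return (pair cost of mu_2, mean field cost of mu, fluctuation term)."""
    weights = np.asarray(weights, dtype=float)
    factors = np.asarray(factors, dtype=float)
    mu = weights @ factors
    # Cost of mu_2 = sum_k w_k Q_k (x) Q_k
    total = sum(w * (q @ cost_matrix @ q) for w, q in zip(weights, factors))
    # Cost of the independent coupling mu (x) mu
    mean_field = mu @ cost_matrix @ mu
    # Quadratic form of the deviations Q_k - mu, averaged over the mixture
    fluctuation = sum(
        w * ((q - mu) @ cost_matrix @ (q - mu)) for w, q in zip(weights, factors)
    )
    return float(total), float(mean_field), float(fluctuation)

points = np.array([0.0, 0.5, 1.0, 2.0])
weights = [0.4, 0.6]
factors = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.2, 0.3, 0.4]]
total, mean_field, fluctuation = split_cost(
    weights, factors, gaussian_cost_matrix(points)
)
print(total, mean_field, fluctuation)
```

Since the fluctuation term is a nonnegative quadratic form that vanishes when every \(Q_k\) equals \(\mu \), the pair cost is never below the mean field cost in this discrete model, mirroring the conclusion of part (a).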

Before proceeding with the proof of (b), let us note a corollary of the above arguments. By combining (2.4) and (2.10), we obtain:

**Corollary 2.11**

*Remark 2.12*

(a) The proof of Theorem 1.2 relies heavily on the positivity of the Fourier transform of \(\ell \), and indeed the conclusion can fail dramatically in the absence of this condition, as shown by the following example.

*Example 2.13*

- (i)
the quadratic cost
$$\begin{aligned} \ell (z)=|z|^2, \end{aligned}$$
in which case (1.12) corresponds to the infinite marginal limit of the problem studied by Gangbo and Swiech in [32], in the special case of equal marginals; physically, one has replaced the repulsive Coulomb interactions by attractive harmonic oscillator-type interactions;
- (ii)
the smoothly truncated quadratic cost
$$\begin{aligned} \ell (z) = e^{-|z|^2/2\sigma ^2} - e^{-\sigma ^2|z|^2/2}, \quad \sigma >1, \end{aligned}$$
which behaves like \(|z|^2\) near \(z=0\) (so that (1.13) behaves like the quadratic OT problem (i) for marginals supported near \(0\)). Note that (ii) satisfies all assumptions of Theorem 1.2 except positivity of the Fourier transform \(\hat{\ell }\) (note that \(\hat{\ell }(k) = (\sqrt{2\pi }\sigma )^d e^{-\sigma ^2|k|^2/2} - (\sqrt{2\pi }/\sigma )^d e^{-|k|^2/2\sigma ^2}\)).
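For the truncated cost (ii), both the sign change of \(\hat{\ell }\) and the failure of the independent measure to be optimal can be seen in a short computation. The sketch is an illustration under assumptions of our own choosing: dimension \(d=1\), \(\sigma =2\), and the two-point marginal \(\mu =\frac{1}{2}(\delta _0+\delta _1)\); the function names are hypothetical. The competitor is the fully correlated pair measure \(\frac{1}{2}(\delta _0\otimes \delta _0+\delta _1\otimes \delta _1)\), which is infinitely representable as the two-body marginal of \(\frac{1}{2}(\delta _0^{\otimes \infty }+\delta _1^{\otimes \infty })\).

```python
import math

def ell(z, sigma):
    """Smoothly truncated quadratic cost of Example 2.13(ii); z is a real number."""
    return math.exp(-z * z / (2 * sigma**2)) - math.exp(-(sigma**2) * z * z / 2)

def ell_hat(k, sigma, d=1):
    """Fourier transform quoted in the example, as a function of |k|."""
    first = (math.sqrt(2 * math.pi) * sigma) ** d * math.exp(-(sigma**2) * k * k / 2)
    second = (math.sqrt(2 * math.pi) / sigma) ** d * math.exp(-k * k / (2 * sigma**2))
    return first - second

def pair_cost(pair_measure, sigma):
    """Integrate ell(x - y) against a discrete pair measure {(x, y): mass}."""
    return sum(p * ell(x - y, sigma) for (x, y), p in pair_measure.items())

sigma = 2.0
# Independent coupling mu (x) mu and fully correlated coupling, same marginal mu.
independent = {(x, y): 0.25 for x in (0.0, 1.0) for y in (0.0, 1.0)}
diagonal = {(0.0, 0.0): 0.5, (1.0, 1.0): 0.5}

print(ell_hat(0.0, sigma), ell_hat(3.0, sigma))          # positive, then negative
print(pair_cost(independent, sigma), pair_cost(diagonal, sigma))
```

Because \(\ell (0)=0\) and \(\ell (z)>0\) for \(z\ne 0\) when \(\sigma >1\), the correlated coupling has zero cost while the independent one has strictly positive cost, so the product measure is not a minimizer for this \(\mu \).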

- (i) the Gaussian cost \( \ell (z) = e^{-|z|^2/2\sigma ^2},\sigma \ne 0\), with Fourier transform \(\hat{\ell }(k) = (\sqrt{2\pi }\sigma )^d e^{-\sigma ^2|k|^2/2}\), and the exponential cost \(\ell (z)=e^{-a|z|},z\in \mathbb {R}^d,a>0\), with Fourier transform \(\hat{\ell }(k)=(2\pi )^dc(d)a/(a^2+k^2)^{\frac{d+1}{2}}, k\in \mathbb {R}^d\), where \(c(d)=\Gamma ((d+1)/2)/\pi ^{\frac{d+1}{2}}\), with \(\Gamma \) being the gamma function. Both satisfy the assumptions of Theorem 1.2(a).

- (ii) Let \(\lambda ,\beta >0, \beta >d/2,\) and let \(l_{\lambda , \beta }(z)=\left( |z|^2+\lambda ^2\right) ^{-\beta }, z\in \mathbb {R}^d\) (\(l_{\lambda ,\beta }\) are the *inverse multiquadric functions*, widely used in statistics and in machine learning). By Theorem 6.13 from [66], \(l_{\lambda ,\beta }\) has Fourier transform $$\begin{aligned} {\hat{l}}_{\lambda ,\beta }(k)=c(\lambda ,\beta ,d)|k|^{\beta -d/2} K_{\beta -d/2}(|k|),\quad k\in \mathbb {R}^d, \end{aligned}$$ where \(c(\lambda ,\beta ,d)>0\) depends only on \(\lambda ,\beta \) and \(d\), and where \(K_{\beta -d/2}\ge 0\) is the modified Bessel function of second kind of order \(\beta -d/2\). Moreover, \(l_{\lambda ,\beta } \in L^1(\mathbb {R}^d)\cap C_b(\mathbb {R}^d)\) so Theorem 1.2(a) applies.

- (iii) A natural extension of the Coulomb cost example from Theorem 1.2(b) is the so-called *screened Coulomb potential* (also known in physics as the Yukawa potential). Set, for each \(\epsilon >0\), \(c_{\epsilon }(x,y) =\frac{e^{-\epsilon |x-y|}}{|x-y|}, x,y\in \mathbb {R}^3\). Since \(c_{\epsilon }(x,y)\le c(x,y)=\frac{1}{|x-y|}\) for all \(\epsilon \ge 0\) and all \(x,y\in \mathbb {R}^3\), and since (1.15) holds for the Coulomb cost \(c\), we have for all \(\mu \in L^1(\mathbb {R}^3)\cap L^3(\mathbb {R}^3)\) $$\begin{aligned} \int _{\mathbb {R}^{6}} \frac{e^{-\epsilon |x-y|}}{|x-y|} \, \mu (x)\mu (y)dx dy<\infty ,~~\text{ for } \text{ all }~\epsilon >0. \end{aligned}$$ (2.15) We note that \(\ell _{\epsilon }(x): =\frac{e^{-\epsilon |x|}}{|x|}, x\in \mathbb {R}^3,\) is in \(L^1(\mathbb {R}^3)\), is continuous on \((0,\infty )\), and has Fourier transform \(\hat{\ell }_\epsilon (k)=\frac{4\pi }{|k|^2+\epsilon ^2}>0, k\in \mathbb {R}^3\) (see, for example, [43] for a proof). Even though \(\ell _\epsilon \) is not bounded for any \(\epsilon >0\), the result can be proven for this cost by a straightforward adaptation of our argument for the Coulomb cost.

- (iv) Various constraints ensuring that the Fourier transform of a function is positive have been derived for example in [36].
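The closed-form transforms in this list can be checked against direct numerical integration. The sketch below does this for the exponential cost from item (i) in dimension \(d=1\), assuming the unnormalized convention \(\hat{\ell }(k)=\int e^{-ik\cdot z}\ell (z)\,dz\), which is the one consistent with the Gaussian formula quoted above. The function names, the integration window and the grid size are illustrative assumptions.

```python
import math
import numpy as np

def exp_cost_ft_closed_form(k, a, d=1):
    """Closed-form transform of exp(-a|z|) as quoted in item (i)."""
    c_d = math.gamma((d + 1) / 2) / math.pi ** ((d + 1) / 2)
    return (2 * math.pi) ** d * c_d * a / (a**2 + k**2) ** ((d + 1) / 2)

def exp_cost_ft_numeric(k, a, half_width=40.0, n=80001):
    """Riemann-sum approximation of the d = 1 transform.

    The integrand is even in z, so only the cosine part contributes.
    half_width and n are ad hoc accuracy parameters.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    return float(np.sum(np.exp(-a * np.abs(x)) * np.cos(k * x)) * dx)

for k in (0.0, 1.0, 2.0):
    print(k, exp_cost_ft_closed_form(k, a=1.0), exp_cost_ft_numeric(k, a=1.0))
```

In \(d=1\) the closed form reduces to \(2a/(a^2+k^2)\), which is strictly positive for every \(k\), in line with the positivity requirement of Theorem 1.2(a).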

*signed* measure has been established in [44]. Such a representation would allow us to derive (2.6), but, due to the lack of sign information, does not allow us to conclude that the independent measure is optimal in the finite-\(N\) case. Indeed, in the special case of marginals supported on two points it follows from the analysis in [27] that the independent measure is not minimizing for *any* \(N\). For more general densities and cost functions, it follows from Proposition 3.6 below that the independent measure is not minimizing for any \(N\).

(d) As a corollary of our analysis, we recover the following interesting result from [39]: if \((X_n)_{n\ge 1}\) is an infinite sequence of exchangeable random variables in \(\mathbb {R}^d\) such that \((X_n)_{n\ge 1}\) are pairwise independent (i.e., the joint distribution of any \((X_i,X_j)\) is a product of the distributions of \(X_i\) and \(X_j\)), then they are mutually independent.

A *finite (respectively infinite) exchangeable sequence of random variables* is a finite (respectively infinite) sequence \(X_1, X_2, X_3, \ldots \) of random variables such that for any finite permutation \(\tau \) of the indices \(1, 2, 3, \ldots \) (the permutation acts on only finitely many indices, with the rest fixed), the joint probability distribution of the permuted sequence \(X_{\tau (1)}, X_{\tau (2)}, X_{\tau (3)}, \ldots \) is the same as that of the original sequence.

Note that for \(N<\infty \), pairwise independence does not imply mutual independence. One of the first counter-examples for \(N<\infty \) was provided in [4]; for further counter-examples see e.g. [21].
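A standard textbook illustration of this phenomenon for \(N=3\) (not necessarily the example given in [4] or [21]) takes two independent fair bits and their XOR. The sketch below enumerates the joint law exactly and checks that the triple is exchangeable and pairwise independent but not mutually independent; all names in it are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

# Joint law of (X1, X2, X3): X1, X2 independent fair bits, X3 = X1 XOR X2.
joint = {(a, b, a ^ b): Fraction(1, 4) for a, b in product((0, 1), repeat=2)}

def prob(event):
    """Probability of the set of outcomes on which `event` is true."""
    return sum(p for outcome, p in joint.items() if event(outcome))

def marginal(i, v):
    """P(X_{i+1} = v)."""
    return prob(lambda o: o[i] == v)

# Exchangeability: permuting coordinates leaves the joint law unchanged.
exchangeable = all(
    joint.get(tuple(o[s] for s in sigma), 0) == p
    for o, p in joint.items()
    for sigma in permutations(range(3))
)

# Pairwise independence: every two-coordinate law factorizes.
pairwise_independent = all(
    prob(lambda o: o[i] == u and o[j] == v) == marginal(i, u) * marginal(j, v)
    for i, j in ((0, 1), (0, 2), (1, 2))
    for u in (0, 1)
    for v in (0, 1)
)

# Mutual independence: the full three-coordinate law factorizes.
mutually_independent = all(
    joint.get(o, 0) == marginal(0, o[0]) * marginal(1, o[1]) * marginal(2, o[2])
    for o in product((0, 1), repeat=3)
)

print(exchangeable, pairwise_independent, mutually_independent)
```

The triple cannot be extended to an infinite exchangeable sequence with the same pair law and the same triple law, since by the result recalled in (d) such a sequence would have to be mutually independent.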

(e) We note that weakening even slightly the assumption of exchangeability of the measure may destroy uniqueness of the minimizer of (1.12). To prove this, we apply for example the results from [41] or from [8]. Therein, various examples are constructed of infinite stationary sequences \((X_n)_{n\ge 1}\) of random variables in \(\mathbb {R}^d\) such that \((X_n)_{n\ge 1}\) are pairwise independent, with mean \(0\) and finite second moments, but which do not satisfy the central limit theorem. This implies that in these particular cases \((X_n)_{n\ge 1}\) are not mutually independent.

## 3 Connection between the N-body OT problem and the infinite-body OT problem

We first note the following existence result for (3.1):

**Proposition 3.1**

Let \(c_N\, : (\mathbb {R}^d)^N \rightarrow {\mathbb {R}}_+\cup \{\infty \}\) be defined as in (1.10), with \(c\) lower semi-continuous. Then there exists at least one solution \(\gamma _N\) to (1.18), and at least one solution \(\mu _{2,N}\in {\mathcal {P}}_{sym}^2(\mathbb {R}^{d})\) to the minimization problem in (3.1).

*Proof*

The proof follows from a standard compactness argument, similar to those found in [65], combined with the fact that a nonsymmetric measure \(\gamma \) on \(\mathbb {R}^{Nd}\) may be symmetrized without changing the total cost \(C_N[\gamma ]\), due to the linearity of the functional and the constraints, and the symmetry of \(c\). \(\square \)

To establish (1.20), we will use the following result which allows us to approximate \(N\)-representable measures by infinitely representable ones. The result is actually a translation of Theorem \(13\) in [22] from the language of random variables into that of probability measures and explains why de Finetti’s Theorem holds exactly for \(N=\infty \) but only approximately for \(N<\infty \). For purposes of simplicity and completeness, unlike [22] we limit ourselves to Euclidean spaces, and include a proof.

**Proposition 3.2**

*Proof*

We will use this result directly to easily establish Theorem 1.3 part (i). For part (ii), we will need the following intermediate Lemma.

**Lemma 3.3**

- (a) Let \(\mu \in {\mathcal {P}}(\mathbb {R}^d)\) and let \(\left( \mu _{2,N}\right) _{N\ge 2}\) be a sequence of symmetric probability measures on \(\mathbb {R}^{2d}\) such that \(\mu _{2,N}\mapsto \mu \) and \(\mu _{2,N}\) is \(N\)-representable for all \(N.\) If \(\mu _{2,N}\) converges weakly to some symmetric probability measure \(\mu _2\) on \(\mathbb {R}^{2d}\), then \(\mu _2\) is infinitely representable.

- (b) A symmetric probability measure \(\mu _2\) on \(\mathbb {R}^{2d}\) is infinitely representable if and only if it is \(N\)-representable for all \(N\ge 2\).

*Proof*

First we deal with part (a). Proposition 3.2 yields a sequence of infinitely representable measures \(({\mathbb {P}}_{2,N}\in {\mathcal {P}}_{sym}^2(\mathbb {R}^d))_{N\ge 2}\) converging weakly to \(\mu _2\). By definition, for each \({\mathbb {P}}_{2,N}\) there exists \(\gamma ^N\in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) such that \(\gamma ^N\mapsto {\mathbb {P}}_{2,N}\). By the same reasoning as in Theorem 2.9(b), the measures \(\gamma ^N\in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d), N\ge 2,\) all lie in a tight set of \((\mathbb {R}^d)^\infty \), so by Prokhorov’s theorem we can extract a further subsequence, still denoted by \((\gamma ^N)_{N\in \mathbb {N}}\) for simplicity, which converges weakly to some \(\gamma ^{\lim }\in {\mathcal {P}}((\mathbb {R}^d)^\infty )\), \(\gamma ^{\lim }\mapsto \mu _2\). We recall now that the class \({\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\) of infinite-dimensional symmetric probability measures is closed under weak convergence, therefore \(\gamma ^{\lim }\in {\mathcal {P}}_{sym}^\infty (\mathbb {R}^d)\). It follows that \(\mu _2\) is infinitely representable.

Next we prove (b). It is clear that an infinitely representable measure is \(N\)-representable for all \(N\ge 2\). On the other hand, if \(\mu _2\) is \(N\)-representable for all \(N\ge 2\), the result follows from assertion (a) by taking \(\mu _{2,N}\equiv \mu _2\) for all \(N\ge 2\). \(\square \)

*Proof of Theorem 1.3*

We will prove next assertion (ii). Let \(\mu _{2,N}\in {\mathcal {P}}_{sym}^2(\mathbb {R}^d), N\ge 2,\) where \(\mu _{2,N}\) is \(N\)-representable, solve (3.1). By the tightness of the set of symmetric measures on \(\mathbb {R}^{2d}\) with common marginal \(\mu \) and by Prokhorov’s theorem, we can, after passing to a subsequence, assume that \(\mu _{2,N}\) converges weakly to some measure \(\mu _2\in {\mathcal {P}}_{sym}^2(\mathbb {R}^d)\) whose marginal is also \(\mu \). By Lemma 3.3, it immediately follows that \(\mu _2\) is infinitely representable.

*Remark 3.4*

- (a) We note here that the proof in fact yields that any convergent subsequence of optimal \(\mu _{2,N}\) in the \(N\)-body problem converges to a solution to the infinite body problem. Whenever the minimizer \(\mu _{2,\infty }\) in the infinite body problem is unique (for example, under the conditions in Theorem 1.2 part (ii)), this implies that the \(\mu _{2,N}\) converge to \(\mu _{2,\infty }\). For bounded costs, the proof also yields a bound on the rate of convergence of \(\frac{||c||_\infty }{N}\).

- (b) Theorem \(13\) from [22] proves the following: Let \(\gamma _N\in {\mathcal {P}}_{sym}^N(\mathbb {R}^{d})\). Then there exists a measure \(\nu \) on the set of probability measures on \(\mathcal{P}(\mathbb {R}^d)\), such that $$\begin{aligned} ||\gamma _k-{\mathbb {P}}_{k,\nu }||\le \frac{k(k-1)}{N}\quad \text{ for } \text{ all }~1\le k\le N. \end{aligned}$$ (3.8) For some particular cases of marginals \(\gamma _1\) the bounds in (3.8) have been improved in [23].
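To get a feeling for the size of the bound (3.8), one can work out the case \(k=2\) for the symmetrization of \(\delta _{x_1}\otimes \cdots \otimes \delta _{x_N}\) with \(N\) distinct atoms. The sketch below rests on two assumptions that are not taken from the text: the norm in (3.8) is read as the \(L^1\) (total variation) norm without a factor \(1/2\), and \(\nu \) is chosen as the point mass at the empirical measure \(\mu _N=\frac{1}{N}\sum _i\delta _{x_i}\), so that \({\mathbb {P}}_{2,\nu }=\mu _N\otimes \mu _N\). The function name `two_body_gap` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def two_body_gap(N):
    """L1 distance between the two-body marginal of the symmetrized N-atom
    measure and the product of its one-body marginal with itself.

    Atoms are labelled 0..N-1 and assumed distinct.
    """
    off_diag = Fraction(1, N * (N - 1))  # gamma_2: uniform over ordered pairs i != j
    indep = Fraction(1, N * N)           # mu_N (x) mu_N: uniform over all ordered pairs
    total = Fraction(0)
    for i, j in product(range(N), repeat=2):
        g2 = off_diag if i != j else Fraction(0)
        total += abs(g2 - indep)
    return total

for N in (2, 5, 10, 50):
    print(N, two_body_gap(N), Fraction(2, N))
```

Under this reading the distance equals \(2/N\), which is exactly \(k(k-1)/N\) for \(k=2\): half of the discrepancy comes from the diagonal pairs, which \(\gamma _2\) never charges, and half from the slightly heavier off-diagonal weights.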

**Corollary 3.5**

Physically, the factor \(1-1/N\) is a self-interaction correction, and the right hand side of (3.11) is a self-interaction corrected mean field energy. Thus the approximation via density representability of infinite order remembers that there are only \({N\atopwithdelims ()2}\) interaction terms, not \(N^2/2\).
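The counting behind the factor \(1-1/N\) can be spelled out in one line; this is an elementary identity added for orientation, not a restatement of (3.11):

$$\begin{aligned} {N\atopwithdelims ()2} = \frac{N(N-1)}{2} = \frac{N^2}{2}\left( 1-\frac{1}{N}\right) . \end{aligned}$$

For instance, for \(N=2\) there is a single interacting pair, whereas \(N^2/2=2\) would count two, and the correction factor is \(1/2\).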

*Proof*

Finally we note that, in contrast to the \(N=\infty \) case, minimizers of the \(N\)-body optimal transport problem are typically not given by the mean field measure for any \(N<\infty \).

**Proposition 3.6**

Let \(c_N:(\mathbb {R}^d)^N \rightarrow {\mathbb {R}}_+\cup \{\infty \}\) be defined as in (1.10). Assume that there is some point \(x=(x_1,x_2,\ldots ,x_N) \in \mathbb {R}^{Nd}\) such that \(c_N\) is \(C^2\) near \(x\), \(D^2_{x_ix_j}c(x) \ne 0\) for some \(i \ne j\), and the measure \(\mu \) has positive density near each \(x_i \in \mathbb {R}^d\). Then the product measure \(\mu \otimes \mu \) on \(\mathbb {R}^{2d}\) is not optimal for the \(2\)-body optimal transport problem with \(N\)-representability constraint (3.1), for any \(N < \infty \).

Note that for the Coulomb cost, the conditions on the cost hold for any \(x=(x_1,x_2,\ldots ,x_N)\) away from the diagonal; that is, for any \(x\) such that \(x_i \ne x_j\) for all \(i \ne j\).
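This can be checked by direct differentiation. The computation below is a sketch assuming that \(c_N\) is a sum of pair terms \(c(x_i,x_j)=1/|x_i-x_j|\), up to whatever normalization (1.10) uses, so that only the \((i,j)\) term contributes to the mixed second derivative; the shorthand \(r=x_i-x_j\) is introduced here for convenience.

$$\begin{aligned} D^2_{x_ix_j}\frac{1}{|x_i-x_j|} = \frac{1}{|r|^{3}}\left( I-3\,\frac{r\otimes r}{|r|^{2}}\right) ,\quad r=x_i-x_j\ne 0. \end{aligned}$$

This matrix has eigenvalue \(-2/|r|^{3}\) in the direction of \(r\) and eigenvalue \(1/|r|^{3}\) on the orthogonal complement, so it is nonzero whenever \(x_i\ne x_j\).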

*Proof*

## 4 Conclusions

Mean field approximations that reduce complicated many-body interactions to interactions of each particle with a collective mean field are ubiquitous in many areas of physics such as quantum mechanics, statistical mechanics, electromagnetism, and continuum mechanics, as well as in other fields such as mathematical biology, probability theory, or game theory.

Motivated by questions in many-electron quantum mechanics, we have presented a novel and quite general mathematical picture of how mean field approximations are rigorously related to underlying many-body interactions. Namely, for interactions with positive Fourier transform they emerge as the unique solution to a naturally associated infinite-body optimal transport problem.

## Acknowledgments

This work was begun when all three authors attended the 2012 trimester program at the Hausdorff Research Institute for Mathematics in Bonn on ‘Mathematical challenges of materials science and condensed matter physics’. We wish to thank the program organizers Sergio Conti, Richard James, Stephan Luckhaus, Stefan Müller, Manfred Salmhofer, and Benjamin Schlein for their hospitality. We are also indebted to Paola Gori-Giorgi for pointing out to us a very interesting alternative proof of Corollary 1.3 which is implicit in Ref. [57] (see Remark 1.4). B.P. acknowledges the support of a University of Alberta start-up grant and Natural Sciences and Engineering Research Council of Canada Discovery Grant number 412779-2012.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.