1 Introduction

Density functional theory (DFT) is a powerful tool used in quantum physics and chemistry to model quantum electrons in atoms, molecules and solids [8, 27, 30, 65, 67, 73]. However, DFT is based on a rather general mathematical scheme and it can be applied to many other situations. This work is devoted to the rigorous study of classical DFT, which is used for finite or infinite systems of interacting classical particles.

Classical DFT is widely employed in materials science, biophysics, chemical engineering and civil engineering [109]. It has a much lower computational cost than the more precise molecular dynamics simulations, which are limited to small systems and short times [38, 56, 89]. Classical DFT is typically used at interfaces between liquid–gas, liquid–liquid (in fluid mixtures), crystal–liquid and crystal–gas phases at bulk coexistence. The density is then non-constant in space and varies in the interfacial region between the two phases.

The physical theory of inhomogeneous fluids goes essentially back to the 1960s [19, 20, 58, 72, 100]. Functional methods and their applications to the theory of the structure of bulk fluids were described in [75, 99]. The realization that methods developed in the quantum case by Hohenberg–Kohn–Sham [46, 55] could be transferred to classical fluids arose in the middle of the 1970s, in particular in the works of Ebner–Saam [28, 29, 95] and Yang et al. [110]. Several authors then developed approximate free-energy functionals to calculate the density profile and surface tension of the liquid–gas interface. The square-gradient approximation could later be derived rather systematically, following the important works of Hohenberg–Kohn–Sham on the gradient expansion of the uniform (quantum) electron gas. Deriving efficient functionals for the solid–liquid transition was harder and took longer [45, 85, 86]. Well-known references on classical DFT are the two reviews by Evans [31, 32]. Other important physical references on the subject include [1, 2, 33, 43, 44, 69, 84, 97].

Rigorous works on classical DFT are rather scarce. Most of the mathematical works are about proving that one can find an external potential V whose interacting equilibrium Gibbs measure has any desired given density \(\rho \). This is called the inverse or dual problem and justifies the use of density functional methods. In quantum DFT, V is called the Kohn–Sham potential and its existence is unclear in most situations. However, in the classical case, V is usually well defined.

The grand-canonical 1D hard-core gas was solved exactly in a celebrated work by Percus [76], who provided an exact expression of the external potential V. This was used and extended in later works [77, 78, 79, 106]. In two famous works [10, 11], Chayes, Chayes and Lieb proved in a quite general setting (in particular any space dimension d) the existence and uniqueness of the dual potential V at any positive temperature \(T>0\). At \(T=0\), the canonical model can be reformulated as a multi-marginal optimal transport problem [17, 18, 23, 74, 96], where V is usually called the Kantorovich potential. Its existence and properties are known in many cases [5, 22, 49] but uniqueness usually does not hold. The grand-canonical case was studied in the recent article [24]. Most of these works are based on compactness arguments and do not furnish any quantitative information on the shape of the potential V in terms of the given density \(\rho \). In the recent paper [47], a novel Banach inversion theorem was used to provide an explicit formula for V in terms of \(\rho \) in the form of a convergent series, under the assumption that \(\rho \) is small in \(L^\infty (\mathbb {R}^d)\). This is the equivalent of the virial expansion for uniform systems.

In this work and the companion paper [48] we do not discuss the dual potential V and instead focus on more quantitative properties of the model depending on the shape of the density \(\rho \). The case of the three dimensional Coulomb interaction \(w(x)=|x|^{-1}\) or, more generally, long-range Riesz interactions \(|x|^{-s}\) with \(s<d\), has been the object of several recent works [61, 65]. Here we always assume that the interaction potential w decays fast enough at infinity and do not discuss more complicated long-range potentials such as Coulomb.

Our main goal in this paper is to show universal local bounds on the free energy \(F_T[\rho ]\) at given density \(\rho \in L^1(\mathbb {R}^d)\). By local we mean that we only use terms in the form

$$\begin{aligned} \int _{\mathbb {R}^d}\rho (x)^p\,\textrm{d}x,\qquad \int _{\mathbb {R}^d}\rho (x)^q\log \rho (x)\,\textrm{d}x. \end{aligned}$$

The admissible values of p and q will depend on the temperature T as well as on the singularity of the interaction potential w at the origin, that is, how strong the particles repel each other when they get close. Such universal bounds are important in DFT. They can help to find the natural form of approximate functionals to be used for practical computations. In addition, these bounds will be useful in our next work [48] where we study the local density approximation.

Deriving simple lower bounds is usually easy, under reasonable stability assumptions on the interaction potential w. Obtaining upper bounds can be much more difficult. They require constructing a good trial state, but the constraint that the density is given and must be exactly reproduced can generate important mathematical complications.

The simplest trial state is obtained by taking i.i.d. particles, that is, a factorized N-particle probability \((\rho /N)^{\otimes N}\) where \(N=\int _{\mathbb {R}^d}\rho \in \mathbb {N}\). Doing so provides an upper bound on the free energy in terms of mean-field theory, often called in this context the Kirkwood-Monroe functional [54]:

$$\begin{aligned} \frac{1}{2}\iint _{\mathbb {R}^d\times \mathbb {R}^d}w(x-y)\rho (x)\rho (y)\,\textrm{d}x\,\textrm{d}y+T\int _{\mathbb {R}^d}\rho (x)\log \rho (x)\,\textrm{d}x. \end{aligned}$$
(1)

This only makes sense when the pair interaction potential w is locally integrable. If w is globally integrable, one can use Young’s inequality and estimate the first integral by the local energy \((\int _{\mathbb {R}^d}w_+/2)\int _{\mathbb {R}^d}\rho (x)^2\,\textrm{d}x\), where \(w_+{:}{=}\max (w,0)\) denotes the positive part. The simplest models of classical DFT use (1) as a basis.
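As a purely illustrative numerical sketch of (1) and of the Young-type bound, one can discretize both terms on a uniform grid in dimension \(d=1\). The Gaussian pair potential, the Gaussian density of mass 3 and the helper names `mean_field_terms` and `young_bound` are assumptions made for this example only and do not come from the text.

```python
import numpy as np

def mean_field_terms(rho, x, w, T):
    """Riemann-sum version of the two terms in (1) on a uniform 1D grid."""
    dx = x[1] - x[0]
    W = w(x[:, None] - x[None, :])                 # matrix w(x_i - x_j)
    interaction = 0.5 * dx * dx * (rho @ W @ rho)
    log_rho = np.log(np.where(rho > 0, rho, 1.0))  # convention 0 log 0 = 0
    entropy_term = T * dx * np.sum(rho * log_rho)
    return interaction, entropy_term

def young_bound(rho, x, w):
    """Discrete analogue of (int w_+ / 2) * int rho^2."""
    dx = x[1] - x[0]
    n = len(x)
    shifts = dx * np.arange(-(n - 1), n)           # all grid differences x_i - x_j
    int_w_plus = dx * np.sum(np.maximum(w(shifts), 0.0))
    return 0.5 * int_w_plus * dx * np.sum(rho**2)

# Hypothetical test case: Gaussian potential, Gaussian density with N = 3.
x = np.linspace(-10.0, 10.0, 801)
w = lambda r: np.exp(-r**2)
N = 3
rho = N * np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)

interaction, entropy_term = mean_field_terms(rho, x, w, T=1.0)
bound = young_bound(rho, x, w)
print(interaction, entropy_term, bound)
```

For this choice the interaction term can be computed by hand and equals \(9/(2\sqrt{5})\), while the local bound equals \(9/4\), so the gap between the two is visible but moderate.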

In classical statistical mechanics, it is often convenient to consider potentials w diverging fast enough at the origin, which helps to stabilize the system [25, 26, 39, 83, 87, 94]. This divergence implies that the particles can never get too close to each other, and this requires that the trial state contains rather strong correlations. A factorized state is not appropriate and (1) is infinite. The simplest singular interaction is of course the hard-core potential \(w(x)=(+\infty ){\mathbb {1} }(|x|< r_0)\), which is simply infinite over a ball and vanishes outside.

In this paper we provide two different constructions of a correlated trial state, which give reasonable upper bounds on the classical free energy at given density, for singular interaction potentials at the origin. Our first method uses some ideas from harmonic analysis in the form of a Besicovitch-type covering lemma [21]. We cover space with cubes whose size is adapted to the local value of the density, and put essentially one particle per cube, with the constraint that the cubes are far enough from each other. This method works very well in the grand-canonical setting where the number of particles is allowed to fluctuate. In order to handle the canonical ensemble, a different construction is needed. We instead use techniques from optimal transport theory developed in [14], which give a rather good bound at zero temperature, \(T=0\). For \(T>0\) we couple this to the Besicovitch-type covering lemma and obtain an upper bound which is not as good as the grand-canonical one.

In [48] we will study the behavior of \(F_T[\rho ]\) in some particular regimes and the upper universal bounds derived here will be useful. Namely we will consider the thermodynamic limit where \(\rho \) is essentially constant over a large domain as well as the local density approximation when \(\rho \) varies slowly over big regions. Such regimes have been recently considered for the three dimensional Coulomb potential \(w(x)=|x|^{-1}\) in [63, 66], for more general Riesz potentials in [15, 16] and for a special class of positive-type interactions in [71]. The methods used in these works all rely on the assumption that the potential is positive-definite, and new ideas are necessary in the general (short-range) case.

2 Main Results

2.1 Free Energies at Given Density

This subsection is mainly devoted to precisely introducing models and notation used in the paper. Our main results are stated in the next subsections.

2.1.1 The Interaction Potential w

For convenience, we work in \(\mathbb {R}^d\) with a general dimension \(d\geqslant 1\). The physical cases of interest are of course \(d\in \{1,2,3\}\) but the proofs are the same for all d, except sometimes for \(d=1\). We consider systems of indistinguishable classical particles interacting through a short-range pair potential w. Throughout the paper, we work with an interaction satisfying the following properties.

Assumption 1

(on the short-range potential w) Let \( w:\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \} \) be an even lower semi-continuous function satisfying the following properties, for some constant \(\kappa >0\):

  1. w is stable, that is,

    $$\begin{aligned} \sum \limits _{1 \leqslant j < k \leqslant N} w ( x_j - x_k ) \geqslant - \kappa N, \end{aligned}$$
    (2)

    for all \( N \in \mathbb {N}\) and \( x_1, \dotsc , x_N \in \mathbb {R}^d \);

  2. w is upper regular, that is, there exist \( r_0 \geqslant 0 \), \( 0 \leqslant \alpha \leqslant \infty \) and \(s>d\) such that

    $$\begin{aligned} w(x) \leqslant \kappa \left( {\mathbb {1} }(|x|<r_0)\left( \frac{r_0}{|x|}\right) ^\alpha + \frac{1}{1 + |x|^s}\right) . \end{aligned}$$
    (3)

The lower semi-continuity of w will be used later to ensure that the energy is lower semi-continuous as a function of the one-particle density (see Remark 3 below). In statistical mechanics, the stability condition (2) is used to ensure the existence of the thermodynamic limit [94]. On the other hand, upper bounds of the form (3) are sometimes used to get more information on the equilibrium states [93]. At infinity, we assume that our potential w is bounded above by \(|x|^{-s}\), which is integrable since \(s>d\). It could of course decay faster. On the other hand, the parameter \( \alpha \) determines the allowed repulsive strength of the interaction at the origin. If \( \alpha = 0 \), then w is everywhere bounded from above, and if \( 0< \alpha < d \), then w has at most an integrable singularity at the origin. In particular, the positive part \( w_+\) is integrable over the whole of \(\mathbb {R}^d\) (since we are interested in upper bounds, the negative part \(w_-\) will not play a role in this paper). In the case where \( \alpha \geqslant d \), w can have a non-integrable singularity at the origin. If \( \alpha = \infty \), then w can have a hard-core. Our convention is that \((r_0/|x|)^\alpha =(+\infty ){\mathbb {1} }(|x|<r_0)\) for \(\alpha =+\infty \). When \(\alpha <\infty \) we can always assume that \(r_0=1\), possibly after increasing \(\kappa \).

Most short range potentials of physical interest are covered by Assumption 1, including for instance the simple hard-core and the Lennard–Jones potential \(w(x)=a|x|^{-12}-b|x|^{-6}\).
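The upper regularity condition (3) can be checked numerically for the Lennard–Jones potential on a grid of radii. The sketch below assumes \(a=b=1\), \(r_0=1\), \(\alpha =12\), \(s=6\) (which exceeds d for \(d\leqslant 5\)) and \(\kappa =2a\); these values and the function names are illustrative choices. The stability condition (2) is not tested here.

```python
import numpy as np

def upper_envelope(r, kappa, r0, alpha, s):
    """Right side of (3) as a function of r = |x| > 0, for finite alpha."""
    core = np.where(r < r0, (r0 / r) ** alpha, 0.0)
    return kappa * (core + 1.0 / (1.0 + r**s))

def lennard_jones(r, a, b):
    """Lennard-Jones potential a r^{-12} - b r^{-6} as a function of r = |x|."""
    return a * r**-12.0 - b * r**-6.0

a, b = 1.0, 1.0
r = np.geomspace(1e-2, 1e2, 20001)   # radii from 0.01 to 100
env = upper_envelope(r, kappa=2.0 * a, r0=1.0, alpha=12.0, s=6.0)
holds = bool(np.all(lennard_jones(r, a, b) <= env))
print(holds)
```

The choice \(\kappa =2a\) works because \(a|x|^{-12}\leqslant 2a/(1+|x|^6)\) for \(|x|\geqslant 1\), while the term \(-b|x|^{-6}\) only helps in an upper bound.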

2.1.2 The Canonical Free Energy

In this subsection we define the canonical free energy \(F_T[\rho ]\) at given density \(\rho \).

Suppose that we have N particles in \( \mathbb {R}^d \), distributed according to some Borel probability measure \( \mathbb {P}\) on \( \mathbb {R}^{dN} \). Since the particles are indistinguishable, we demand that the measure \( \mathbb {P}\) is symmetric, that is,

$$\begin{aligned} \mathbb {P}(A_{\sigma ( 1 )}\times \cdots \times A_{\sigma ( N )}) = \mathbb {P}(A_1\times \cdots \times A_N), \end{aligned}$$

for any permutation \( \sigma \) of \(\{1,\dotsc ,N\}\), and any Borel sets \(A_1,...,A_N\subset \mathbb {R}^d\). The one-body density of such a symmetric probability \( \mathbb {P}\) equals N times the first marginal of \(\mathbb {P}\), that is,

$$\begin{aligned} \rho _{\mathbb {P}} = N\int _{\mathbb {R}^{d ( N-1 )}} \, \textrm{d}\mathbb {P}( \cdot ,x_2, \dotsc , x_N ), \end{aligned}$$

where the integration is over \( x_2,\dotsc ,x_N \). Equivalently, \(\rho _\mathbb {P}(A)=N\mathbb {P}(A\times (\mathbb {R}^d)^{N-1})\) for every Borel set A. Note the normalization convention \( \rho _{\mathbb {P}}(\mathbb {R}^d) = N \). For a non-symmetric probability \(\mathbb {P}\) we define \(\rho _\mathbb {P}\) as the sum of the N marginals.

Notice that any positive measure \(\rho \) on \(\mathbb {R}^d\) with \(\rho (\mathbb {R}^d)=N\in \mathbb {N}\) arises from at least one N-particle probability measure \(\mathbb {P}\). One can take for instance \(\mathbb {P}=(\rho /N)^{\otimes N}\) for independent and identically distributed particles.
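These definitions can be illustrated on a finite set of sites, where an N-particle probability is an N-dimensional array and integrals become sums. This discrete setting and the helper names below are assumptions made for the example. The sketch computes the one-body density as the sum of the N marginals, builds the i.i.d. state, and checks that symmetrizing a non-symmetric state does not change its density.

```python
import itertools
import numpy as np

def one_body_density(P):
    """Sum of the N marginals of an N-particle probability array P."""
    N = P.ndim
    rho = np.zeros(P.shape[0])
    for j in range(N):
        other_axes = tuple(k for k in range(N) if k != j)
        rho += P.sum(axis=other_axes)
    return rho

def symmetrize(P):
    """Average of P over all permutations of the particle labels."""
    perms = list(itertools.permutations(range(P.ndim)))
    return sum(np.transpose(P, p) for p in perms) / len(perms)

def product_state(rho):
    """The i.i.d. state, N-fold tensor power of rho/N, for integer mass N."""
    N = int(round(rho.sum()))
    P = np.ones(())
    for _ in range(N):
        P = np.multiply.outer(P, rho / N)
    return P

rho = np.array([0.5, 1.0, 1.5])          # hypothetical density of mass N = 3
P = product_state(rho)
print(one_body_density(P))               # reproduces rho

rng = np.random.default_rng(0)
Q = rng.random((3, 3, 3))
Q /= Q.sum()                             # a generic non-symmetric 3-particle state
print(one_body_density(Q), one_body_density(symmetrize(Q)))
```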

The pairwise average interaction energy of the particles is given by

$$\begin{aligned} \mathcal {U}_{N} ( \mathbb {P} ) = \int _{\mathbb {R}^{dN}} \sum \limits _{1 \leqslant j < k \leqslant N} w ( x_j - x_k ) \, \textrm{d}\mathbb {P}( x_1, \dotsc , x_N ). \end{aligned}$$

It could in principle be equal to \(+\infty \), but it always satisfies \(\mathcal {U}_{N} ( \mathbb {P} )\geqslant -\kappa N\) due to the stability condition on w in Assumption 1. When considering systems at positive temperature \(T>0\), it is necessary to also take the entropy of the system into account. It is given by

$$\begin{aligned} \mathcal {S}_N ( \mathbb {P} ) {:}{=} - \int _{\mathbb {R}^{dN}} \mathbb {P}( x ) \log \big (N! \, \mathbb {P}( x )\big ) \, \textrm{d}x. \end{aligned}$$
(4)

If \( \mathbb {P}\) is not absolutely continuous with respect to the Lebesgue measure on \( \mathbb {R}^{dN} \), we use the convention that \(\mathcal {S}_N ( \mathbb {P} )=-\infty \). The factor N! appears because the particles are indistinguishable. In fact, we should think that \(N!\,\mathbb {P}\) defines a probability measure over \((\mathbb {R}^d)^N/\mathfrak {S}_N\) where \(\mathfrak {S}_N\) is the permutation group. We need to make sure that \(\mathcal {S}_N(\mathbb {P})<+\infty \), which follows if we assume for instance that \(\rho _\mathbb {P}\) is absolutely continuous with \(\int _{\mathbb {R}^d}\rho _\mathbb {P}|\log \rho _\mathbb {P}|<\infty \). This is due to the well-known inequality (see, e.g., [24, Lemma 6.1])

$$\begin{aligned} \mathcal {S}_N ( \mathbb {P} )\leqslant -\int _{\mathbb {R}^d}\rho _\mathbb {P}(x)\log \rho _\mathbb {P}(x)\,\textrm{d}x+N. \end{aligned}$$
(5)

The latter follows immediately from writing the relative entropy of \(\mathbb {P}\) with respect to \((\rho _\mathbb {P}/N)^{\otimes N}\), which is non-negative, and using \((N/e)^N\leqslant N!\).
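For the convenience of the reader, the computation can be sketched as follows, assuming that \(\mathbb {P}\) is symmetric and absolutely continuous, that \(\rho =\rho _\mathbb {P}\), and that all the integrals below are finite. Each of the N marginals of \(\mathbb {P}\) equals \(\rho /N\), so that the non-negativity of the relative entropy reads

$$\begin{aligned} 0\leqslant \int _{\mathbb {R}^{dN}}\mathbb {P}\log \frac{\mathbb {P}}{(\rho /N)^{\otimes N}}&=\int _{\mathbb {R}^{dN}}\mathbb {P}\log \mathbb {P}-\sum _{j=1}^N\int _{\mathbb {R}^{dN}}\mathbb {P}(x)\log \rho (x_j)\,\textrm{d}x+N\log N\\&=\int _{\mathbb {R}^{dN}}\mathbb {P}\log \mathbb {P}-\int _{\mathbb {R}^d}\rho \log \rho +N\log N. \end{aligned}$$

Inserting this in (4) and using \(\log N!\geqslant N\log N-N\), which is the inequality \((N/e)^N\leqslant N!\), we find

$$\begin{aligned} \mathcal {S}_N ( \mathbb {P} )=-\int _{\mathbb {R}^{dN}}\mathbb {P}\log \mathbb {P}-\log N!\leqslant -\int _{\mathbb {R}^d}\rho \log \rho +N\log N-\log N!\leqslant -\int _{\mathbb {R}^d}\rho \log \rho +N. \end{aligned}$$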

The total free energy of the system in the state \( \mathbb {P}\) at temperature \( T \geqslant 0 \) equals

$$\begin{aligned} \mathcal {F}_T ( \mathbb {P} ) {:}{=}{} \mathcal {U}_{N} ( \mathbb {P} ) - T \mathcal {S}_N ( \mathbb {P} ) = \int _{\mathbb {R}^{dN}} \sum \limits _{j < k} w ( x_j - x_k ) \, \textrm{d}\mathbb {P}( x ) + T \int _{\mathbb {R}^{dN}} \mathbb {P}\log ( N! \, \mathbb {P} ). \end{aligned}$$
(6)

It can be equal to \(+\infty \) but never to \(-\infty \) due to the stability of w and thanks to the inequality (5) if \(T>0\) and \(\int _{\mathbb {R}^d}\rho _\mathbb {P}|\log \rho _\mathbb {P}|<\infty \).

Throughout the paper, we will only consider systems with a given one-body density \(\rho \), which is absolutely continuous with respect to the Lebesgue measure. At \(T>0\) we also assume that \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \). This allows us to consider the minimal energy of N-particle classical systems with density \( \rho \), given by

$$\begin{aligned} \boxed { F_T [\rho ] {:}{=} \inf _{\rho _{\mathbb {P}} = \rho } \mathcal {F}_T ( \mathbb {P} ) }, \end{aligned}$$
(7)

where the infimum is taken over N-particle states \( \mathbb {P}\) on \( \mathbb {R}^{dN} \) with one-particle density \( \rho _{\mathbb {P}} \) equal to \( \rho \). At \(T=0\), the entropy term disappears and we obtain

$$\begin{aligned} F_0 [\rho ] {:}{=} \inf _{\rho _{\mathbb {P}} = \rho }\int \sum \limits _{1\leqslant j < k\leqslant N} w ( x_j - x_k ) \, \textrm{d}\mathbb {P}( x ). \end{aligned}$$
(8)

This is a multi-marginal optimal transport problem with symmetric cost \(\sum _{j<k}w(x_j-x_k)\) and with all the marginals of \(\mathbb {P}\) equal to \(\rho /N\) [17, 18, 23, 74, 96]. From the stability assumption on w and (5), we have

$$\begin{aligned} F_T [\rho ]\geqslant -(\kappa +T)N+T\int _{\mathbb {R}^d}\rho (x)\log \rho (x)\,\textrm{d}x. \end{aligned}$$
(9)

One of our goals will be to find simple conditions ensuring that \(F_T[\rho ]<\infty \). Before we turn to this question, we first introduce the grand-canonical problem.

Remark 2

(Symmetry) In the definition (7) we can freely remove the constraint that \(\mathbb {P}\) is symmetric. Since the interaction is a symmetric function and the entropy \(\mathcal {S}_N\) is concave, the minimum is the same as for symmetric \(\mathbb {P}\)’s. Recall that for a non-symmetric \(\mathbb {P}\), \(\rho _\mathbb {P}\) is by definition the sum of the N marginals.

Remark 3

(Lower Semi-continuity) The function \(\rho \mapsto F_T[\rho ]\) is lower semi-continuous for the strong topology. That is, we have

$$\begin{aligned} F_T[\rho ]\leqslant \liminf _{n\rightarrow \infty } F_T[\rho _n]\quad \text {if } \int |\rho _n-\rho |\rightarrow 0 \text { and } T\int \rho _n|\log \rho _n|\leqslant C. \end{aligned}$$
(10)

At \(T>0\) this is valid under the sole condition that w is measurable (since the limiting probability \(\mathbb {P}\) is necessarily absolutely continuous) but at \(T=0\), this uses the lower semi-continuity of w. The details of the argument are provided later in the proof of Theorem 29, for the convenience of the reader.

Remark 4

(Convexity and duality) Using the concavity of the entropy \(\mathcal {S}_N\), one can verify that \(\rho \mapsto F_T[\rho ]\) is convex. This can be used to derive the dual formulation of \(F_T[\rho ]\) in terms of external potentials

$$\begin{aligned} F_T[\rho ]={}&T\int _{\mathbb {R}^d}\rho \log \rho +\sup _{{\widetilde{V}}}\bigg \{-\int _{\mathbb {R}^d}\rho (x) {\widetilde{V}}(x)\,\textrm{d}x \nonumber \\&-T\log \int _{\mathbb {R}^{dN}}\exp \left( {-\frac{1}{T}\sum _{1\leqslant j<k\leqslant N}w(x_j-x_k)-\frac{1}{T}\sum _{j=1}^N{\widetilde{V}}(x_j) }\right) \textrm{d}\rho ^{\otimes N} \bigg \}, \end{aligned}$$
(11)

see [11]. Our notation \({\widetilde{V}}\) is because the final physical dual potential is, rather, \(V{:}{=}{\widetilde{V}}-T\log \rho \). The existence of a maximizer \({\widetilde{V}}\) realizing the above supremum is proved in [11]. It is the unique potential (up to an additive constant) so that the corresponding Gibbs state has density \(\rho \), that is,

$$\begin{aligned} \rho _\mathbb {P}=\rho ,\quad \mathbb {P}=\frac{1}{Z}\exp \bigg (-\frac{1}{T}\sum _{1\leqslant j<k\leqslant N}w(x_j-x_k)-\frac{1}{T}\sum _{j=1}^N{\widetilde{V}}(x_j)\bigg )\rho ^{\otimes N}, \end{aligned}$$

with Z a normalization constant. At \(T=0\), we have the similar formula

$$\begin{aligned} F_0[\rho ]=\sup _{V}\bigg \{E_N[V]-\int _{\mathbb {R}^d}\rho (x) V(x)\,\textrm{d}x\bigg \}, \end{aligned}$$

where

$$\begin{aligned} E_N[V]=\inf _{x_1,...,x_N\in \mathbb {R}^d}\left\{ \sum _{1\leqslant j<k\leqslant N}w(x_j-x_k)+\sum _{j=1}^NV(x_j)\right\} , \end{aligned}$$

is the ground state energy in the potential V [49]. Although there usually exist dual potentials at \(T=0\), those are often not unique.

2.1.3 The Grand-Canonical Free Energy

In the grand-canonical picture, where the exact particle number of the system is not fixed, a state \( \mathbb {P}\) is a family of symmetric n-particle positive measures \( \mathbb {P}_n \) on \((\mathbb {R}^d)^n\), so that

$$\begin{aligned} \sum _{n \geqslant 0} \mathbb {P}_n\big ((\mathbb {R}^d)^n\big )=1. \end{aligned}$$

Here \(\mathbb {P}_0\) is just a number, interpreted as the probability that there is no particle at all in the system. After replacing \(\mathbb {P}_n\) by \(\mathbb {P}_n/\mathbb {P}_n(\mathbb {R}^{dn})\), we can equivalently think that \(\mathbb {P}\) is a convex combination of canonical states. The entropy of \( \mathbb {P}\) is defined by

$$\begin{aligned} \mathcal {S}( \mathbb {P} ) {:}{=} \sum \limits _{n \geqslant 0} \mathcal {S}_n ( \mathbb {P}_n ) = -\mathbb {P}_0\log (\mathbb {P}_0)- \sum \limits _{n \geqslant 1} \int _{\mathbb {R}^{dn}} \mathbb {P}_n \log ( n! \, \mathbb {P}_n ), \end{aligned}$$
(12)

and the single particle density of the state \( \mathbb {P}\) is

$$\begin{aligned} \rho _{\mathbb {P}} = \sum \limits _{n \geqslant 1} \rho _{\mathbb {P}_n}=\sum _{n\geqslant 1}n\int _{(\mathbb {R}^d)^{n-1}}\textrm{d}\mathbb {P}_n(\cdot ,x_2,\dotsc ,x_n). \end{aligned}$$

The grand-canonical free energy of the state \( \mathbb {P}\) at temperature \( T \geqslant 0 \) is

$$\begin{aligned} \mathcal {G}_T ( \mathbb {P} ) {:}{=} \mathcal {U}( \mathbb {P} ) - T \mathcal {S}( \mathbb {P} ), \end{aligned}$$
(13)

where \( \mathcal {U}( \mathbb {P} ) \) denotes the interaction energy in the state \( \mathbb {P}\),

$$\begin{aligned} \mathcal {U}( \mathbb {P} ) {:}{=} \sum \limits _{n \geqslant 2} \mathcal {U}_{n} ( \mathbb {P}_n ) = \sum \limits _{n \geqslant 2} \int _{\mathbb {R}^{dn}} \sum \limits _{j<k}^n w ( x_j - x_k ) \, \textrm{d}\mathbb {P}_n ( x_1,...,x_n ). \end{aligned}$$
(14)

From the stability of w we have

$$\begin{aligned} \mathcal {U}_{n} ( \mathbb {P}_n )\geqslant -\kappa n\,\mathbb {P}_n(\mathbb {R}^{dn}), \end{aligned}$$

so that, after summing over n,

$$\begin{aligned} \mathcal {U}( \mathbb {P} )\geqslant -\kappa \int _{\mathbb {R}^d}\rho _\mathbb {P}(x)\,\textrm{d}x. \end{aligned}$$

By [24, Lemma 6.1] we have the universal entropy bound

$$\begin{aligned} \mathcal {S}( \mathbb {P} )\leqslant -\int _{\mathbb {R}^d}\rho _\mathbb {P}\big (\log \rho _\mathbb {P}-1\big ). \end{aligned}$$
(15)

This is because the entropy at fixed density \( \rho \) is maximized by the grand-canonical Poisson state

$$\begin{aligned} \mathbb {Q}{:}{=} \left( \frac{e^{- \int _{\mathbb {R}^d} \rho }}{n!} \rho ^{\otimes n}\right) _{n \geqslant 0}, \end{aligned}$$
(16)

whose entropy is the right side of (15).
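The claim that the Poisson state (16) has density \(\rho \) and entropy \(-\int \rho (\log \rho -1)\) can be tested by brute force on a finite set of sites. The sketch below assumes a discrete analogue in which integrals are sums over two sites, and it truncates the sum over n at a hypothetical cutoff `n_max`; the function name is also an illustrative choice.

```python
import itertools
import math
import numpy as np

def poisson_state_check(rho, n_max):
    """Entropy, density and total mass of the Poisson state on finitely many sites.

    Integrals are replaced by sums over sites (counting measure) and the
    sum over the particle number n is truncated at n_max.
    """
    sites = range(len(rho))
    M = float(np.sum(rho))
    entropy = 0.0
    density = np.zeros(len(rho))
    total = 0.0
    for n in range(n_max + 1):
        for config in itertools.product(sites, repeat=n):
            # weight Q_n(x_1, ..., x_n) = e^{-M} / n! * rho(x_1) ... rho(x_n)
            q = math.exp(-M) / math.factorial(n) * math.prod(rho[i] for i in config)
            entropy -= q * math.log(math.factorial(n) * q)
            total += q
            for i in config:
                density[i] += q      # sum of the n marginals of Q_n
    return entropy, density, total

rho = np.array([0.7, 0.5])           # hypothetical density on two sites
entropy, density, total = poisson_state_check(rho, n_max=14)
expected = float(np.sum(rho * (1.0 - np.log(rho))))
print(entropy, expected, density, total)
```

With total mass 1.2 the Poisson weights beyond \(n=14\) are negligible, so the truncated sums agree with the exact values up to a very small error.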

When keeping the one-particle density \( \rho = \rho _{\mathbb {P}} \in L^1 ( \mathbb {R}^d ) \) fixed, we denote the minimal grand-canonical free energy by

$$\begin{aligned} \boxed { G_T [\rho ] {:}{=} \inf _{\rho _{\mathbb {P}} = \rho } \mathcal {G}_T ( \mathbb {P} ). } \end{aligned}$$
(17)

Using (15), we obtain

$$\begin{aligned} G_T [\rho ] \geqslant - \left( {\kappa + T}\right) \int _{\mathbb {R}^d} \rho + T \int _{\mathbb {R}^d} \rho \log \rho , \end{aligned}$$
(18)

where \( \kappa \) is the stability constant of w in Assumption 1.

Remark 5

(Comparing \(F_T\) and \(G_T\)) Since a canonical trial state is automatically also admissible for the grand-canonical minimisation problem (17), we have the bound

$$\begin{aligned} G_T [\rho ] \leqslant F_T [\rho ], \end{aligned}$$

for any density \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \) with integer mass. Hence, any universal lower energy bound for the grand-canonical ensemble is also a lower bound for the canonical ensemble. A natural question to ask is under which condition we have \(F_T[\rho ] =G_T[\rho ]\) for a density \(\rho \) of integer mass. In general this is a difficult problem. See [24] for results and comments in this direction at \(T=0\).

If \(\int _{\mathbb {R}^d}\rho =N+t\) with \(t\in (0,1)\) and \(N\in \mathbb {N}\), we can write \(\rho =(1-t)\frac{N}{N+t}\rho +t\frac{N+1}{N+t}\rho \) and obtain after using the concavity of the entropy

$$\begin{aligned} G_T [\rho ]\leqslant (1-t)\,F_T \left[ \frac{N}{N+t}\rho \right] +t\,F_T \left[ \frac{N+1}{N+t}\rho \right] . \end{aligned}$$
(19)

This can be used to deduce an upper bound on \(G_T[\rho ]\), once an upper bound has been established in the canonical case. We will see, however, that it is usually much easier to directly prove upper bounds on \(G_T[\rho ]\) than on \(F_T[\rho ]\).

Remark 6

(Weak lower semi-continuity) The functional \(\rho \mapsto G_T[\rho ]\) is weakly lower semi-continuous and, in fact, a kind of lower semi-continuous envelope of \(F_T[\rho ]\) (see [24, 65]). At \(T=0\) this uses the lower semi-continuity of w.

Remark 7

(Duality II) Like in the canonical case, we have the dual formulation

$$\begin{aligned} G_T[\rho ]&=T\int _{\mathbb {R}^d}\rho \log \rho +\sup _{{\widetilde{V}}}\bigg \{-\int _{\mathbb {R}^d}\rho (x) {\widetilde{V}}(x)\,\textrm{d}x\nonumber \\&-T\log \bigg [\sum _{n\geqslant 0}\int _{\mathbb {R}^{dn}}\exp \bigg (-\frac{1}{T}\sum _{1\leqslant j<k\leqslant n}w(x_j-x_k)-\frac{1}{T}\sum _{j=1}^n{\widetilde{V}}(x_j)\bigg )\textrm{d}\rho ^{\otimes n}\bigg ] \bigg \}, \end{aligned}$$
(20)

see [10, 11] and the more recent work [24, Sects. 4, 6].

2.2 Representability

Next we turn to the problem of representability. Namely, we are asking what kind of densities \(\rho \) can arise from N-particle probabilities with finite free energy. This depends on the shape of the interaction potential w. We only address this question for \(\rho \in L^1(\mathbb {R}^d)\) and do not look at general measures. The main result is that all densities are representable at zero temperature in the non-hard-core case (\(\alpha <\infty \)). At positive temperature, it is sufficient to assume in addition that \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \).

Theorem 8

(Representability in the canonical case) Let \(\rho \in L^1(\mathbb {R}^d)\) with \(N{:}{=}\int _{\mathbb {R}^d}\rho (x) \, \textrm{d}x \in \mathbb {N}\). There exists a symmetric probability measure \(\mathbb {P}\) on \((\mathbb {R}^d)^N\) of density \(\rho \) so that \(|x_j-x_k|\geqslant \delta \) for all \(j\ne k\), \(\mathbb {P}\)-almost everywhere, for some \(\delta >0\).

If w satisfies Assumption 1 without hard-core (\(\alpha <\infty \)), we obtain \(F_0[\rho ]<\infty \). If furthermore \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \), then \(\mathbb {P}\) can be assumed to have finite entropy and \(F_T[\rho ]<\infty \) for any \(T>0\).

The theorem follows from results in optimal transport theory and we quickly outline the proof here for the convenience of the reader. In this paper we will prove much more. We will in fact need some of these tools and more details will thus be provided later in the paper.

Proof

If \(\int _{\mathbb {R}^d}\rho =1\), we must take \(\mathbb {P}=\rho \) and end up with \(F_T[\rho ]=T\int \rho \log \rho \). In the rest of the proof we assume that \(\int _{\mathbb {R}^d}\rho \geqslant 2\).

For \(\rho \in L^1(\mathbb {R}^d)\), the existence of \(\mathbb {P}\) is proved in [14, Theorem 4.3]. The number \(\delta \) must be so that \(\int _{B(x,\delta )}\rho <1\) for any \(x\in \mathbb {R}^d\), where \(B(x,R)\) denotes the ball centered at x and of radius R. Such a \(\delta >0\) always exists when \(\rho \in L^1(\mathbb {R}^d)\). See Sect. 5.1 below for more details on the results from [14].
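To get a feeling for how \(\delta \) depends on \(\rho \), the condition \(\int _{B(x,\delta )}\rho <1\) can be explored on a uniform grid in dimension \(d=1\). The sketch below is a grid approximation under illustrative assumptions: balls are windows of \(2k\) consecutive cells, only radii of the form \(k\,\textrm{d}x\) are tested, and the function names are hypothetical.

```python
import numpy as np

def max_ball_mass(rho, dx, k):
    """Largest mass of rho over 2k consecutive cells (a 1D ball of radius k*dx)."""
    c = np.concatenate(([0.0], np.cumsum(rho) * dx))   # cumulative mass
    if 2 * k >= len(rho):
        return c[-1]
    return np.max(c[2 * k:] - c[:-2 * k])

def separation_radius(rho, dx):
    """Largest grid radius k*dx whose balls all carry mass < 1 (0.0 if none)."""
    k = 0
    while 2 * (k + 1) < len(rho) and max_ball_mass(rho, dx, k + 1) < 1.0:
        k += 1
    return k * dx

# Hypothetical example: uniform density of mass N = 3 on an interval of length 10.
dx = 0.01
rho = np.full(1000, 0.3)
delta = separation_radius(rho, dx)
print(delta)
```

For a uniform density of mass N on an interval of length L the exact threshold is \(L/(2N)\), here \(10/6\), and the grid value is the largest multiple of \(\textrm{d}x\) below it.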

Next we prove that \(\mathcal {F}_0( \mathbb {P} )<\infty \). Since \(\alpha <\infty \) (no hard-core), we can assume \(r_0=1\). We then have \(w(x)\leqslant C_\delta |x|^{-s}\) for all \(|x|\geqslant \delta \), with the constant \(C_\delta =\kappa (1+\delta ^{s-\alpha })\), due to Assumption 1. Hence, on the support of \(\mathbb {P}\) we have

$$\begin{aligned} \sum _{1\leqslant j<k\leqslant N}w(x_j-x_k)=\frac{1}{2}\sum _{j=1}^N\sum _{k\ne j}w(x_j-x_k) \leqslant \frac{C_\delta }{2}N\max _{\begin{array}{c} |y_j|\geqslant \delta \\ |y_j-y_k|\geqslant \delta \end{array}}\sum _{j=1}^{N-1}\frac{1}{|y_j|^s}. \end{aligned}$$

The maximum is bounded by \(C\delta ^{-s}\) independently of N due to [61, Lemma 9]. Integrating with respect to \(\mathbb {P}\) we have proved that \(\mathcal {F}_0( \mathbb {P} )\leqslant C_\delta \delta ^{-s}N\). This bound is not very explicit but it only depends on \(\delta \) and N. Of course, \(\delta \) itself depends on \(\rho \) in a rather indirect way.
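The scaling \(\delta ^{-s}\) and the independence of N can be illustrated on one particular \(\delta \)-separated configuration, namely the points of the cubic lattice \(\delta \mathbb {Z}^d\) other than the origin. This configuration is an assumption made for illustration and is not claimed to be the maximizer; the function name and the cutoff K are hypothetical. Since \(s>d\), the partial sums stabilize as more points are added.

```python
import numpy as np

def lattice_sum(delta, s, d, K):
    """Sum of |y|^{-s} over y = delta*k, with k a nonzero integer vector, max|k_i| <= K."""
    axes = [np.arange(-K, K + 1)] * d
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, d)
    norms = delta * np.linalg.norm(grid, axis=1)
    norms = norms[norms > 0]                  # remove the origin
    return float(np.sum(norms ** (-s)))

s10 = lattice_sum(1.0, 6.0, 3, 10)
s20 = lattice_sum(1.0, 6.0, 3, 20)
print(s10, s20)                               # nearly equal: the sum converges for s > d
print(lattice_sum(0.5, 6.0, 3, 10) / s10)     # ratio 2**6 = 64, i.e. the delta^{-s} scaling
```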

The probability measure \(\mathbb {P}\) obtained by the optimal transport method of [14] is probably a singular measure, hence with an infinite entropy. In [9], it is explained how to regularize any given \(\mathbb {P}\) using a method called the Block approximation. This method works well for a compactly supported density, for which it easily implies \(F_T[\rho ]<\infty \). We quickly describe the method here and refer to Sect. 5.3 below for details. In short, we split the space into small cubes \(\{{\mathcal {C}}_j\}\) of size proportional to \(\delta \) and introduce the trial probability measure

$$\begin{aligned} {\widetilde{\mathbb {P}}}=\sum _{j_1,...,j_N}\mathbb {P}({\mathcal {C}}_{j_1}\times \cdots \times {\mathcal {C}}_{j_N}) \frac{\rho {\mathbb {1} }_{{\mathcal {C}}_{j_1}}\otimes \cdots \otimes \rho {\mathbb {1} }_{{\mathcal {C}}_{j_N}}}{\int _{{\mathcal {C}}_{j_1}}\rho \cdots \int _{{\mathcal {C}}_{j_N}}\rho }. \end{aligned}$$

That is, we take a convex combination of independent particles over small cubes with probability \(\mathbb {P}({\mathcal {C}}_{j_1}\times \cdots \times {\mathcal {C}}_{j_N})\). Choosing the cubes small enough, we can ensure that \(|x_j-x_k|\geqslant \delta /2\) on the support of \({\widetilde{\mathbb {P}}}\) and \(\int _{{\mathcal {C}}_j}\rho <1\). A computation gives \(\rho _{{\widetilde{\mathbb {P}}}}=\rho _{\mathbb {P}}=\rho \). The entropy can be estimated by

$$\begin{aligned} \int _{\mathbb {R}^{dN}}{\widetilde{\mathbb {P}}}\log (N! \, {\widetilde{\mathbb {P}}})\leqslant \int _{\mathbb {R}^d} \rho \log \rho -\sum \limits _{j} \left( {\int _{{\mathcal {C}}_j} \rho }\right) \log \left( {\int _{{\mathcal {C}}_j} \rho }\right) , \end{aligned}$$

(see Lemma 26 below). Estimating the last sum is not an easy task for a general density. For a compactly supported density we can simply bound it by 1/e times the number of cubes intersecting the support of \(\rho \). Since the energy of \({\widetilde{\mathbb {P}}}\) is finite by the previous argument, we deduce that \(F_T[\rho ]<\infty \) for any \(\rho \) of compact support.
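The construction of \({\widetilde{\mathbb {P}}}\) can be mimicked in a discrete toy model with \(N=2\) particles on a one-dimensional grid, where groups of consecutive grid points play the role of the cubes \({\mathcal {C}}_j\). This toy setting, the random input state and the function name `block_approximation` are assumptions made for illustration. The sketch checks that the density is preserved and that the particles stay separated, at a smaller distance than for the original state.

```python
import numpy as np

def block_approximation(P, block):
    """Discrete two-particle analogue of the block approximation.

    P is a symmetric (n, n) array of non-negative weights summing to 1, and
    consecutive groups of `block` grid points play the role of the cubes.
    The grid size n must be a multiple of `block`.
    """
    n = P.shape[0]
    m = n // block
    rho = 2.0 * P.sum(axis=1)                         # one-body density, N = 2
    cube_mass = rho.reshape(m, block).sum(axis=1)     # mass of rho in each cube
    cube_prob = P.reshape(m, block, m, block).sum(axis=(1, 3))  # P(C_j x C_k)
    safe_mass = np.where(cube_mass > 0, cube_mass, 1.0)
    local = rho / np.repeat(safe_mass, block)         # rho restricted to a cube, normalized
    weights = np.repeat(np.repeat(cube_prob, block, axis=0), block, axis=1)
    return weights * np.outer(local, local)

# Hypothetical input: a random symmetric state with |i - j| >= 6 on its support.
n, block = 12, 3
rng = np.random.default_rng(1)
A = rng.random((n, n))
A = A + A.T
i, j = np.indices((n, n))
A[np.abs(i - j) < 6] = 0.0
P = A / A.sum()

P_tilde = block_approximation(P, block)
rho = 2.0 * P.sum(axis=1)
print(np.allclose(2.0 * P_tilde.sum(axis=1), rho))    # same one-body density
print(np.all(P_tilde[np.abs(i - j) < 4] == 0.0))      # separation reduced from 6 to 4
```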

It thus remains to explain how to prove that \(F_T[\rho ]\) is finite for a density \(\rho \) of unbounded support. The idea is of course to truncate it. We choose two radii \(R_1<R_2\) so that

$$\begin{aligned} \int _{\mathbb {R}^d\setminus B_{R_2}}\rho =\int _{B_{R_2}\setminus B_{R_1}}\rho =\frac{1}{2}, \end{aligned}$$

(using here \(\int \rho \geqslant 2\)) and we define for brevity \(\rho _1{:}{=}\rho {\mathbb {1} }_{B_{R_1}}\), \(\rho _2{:}{=}\rho {\mathbb {1} }_{B_{R_2}{\setminus } B_{R_1}}\) and \(\rho _3{:}{=}\rho {\mathbb {1} }_{\mathbb {R}^d{\setminus } B_{R_2}}\). We can write

$$\begin{aligned} \rho =\frac{\rho _1+2\rho _2}{2}+\frac{\rho _1+2\rho _3}{2}, \end{aligned}$$

where \(\int _{\mathbb {R}^d}(\rho _1+2\rho _2)=\int _{\mathbb {R}^d}(\rho _1+2\rho _3)=N\). From the convexity of \(F_T\) we obtain

$$\begin{aligned} F_T[\rho ]\leqslant \frac{1}{2}F_T[\rho _1+2\rho _2]+\frac{1}{2}F_T[\rho _1+2\rho _3]. \end{aligned}$$
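The splitting of \(\rho \) into \(\rho _1\), \(\rho _2\) and \(\rho _3\) can be reproduced on a grid in dimension \(d=1\). The sketch below assumes a Gaussian density of mass 3 and picks the radii as grid values, so the two outer masses equal 1/2 only up to the grid resolution; the function name `split_density` is hypothetical.

```python
import numpy as np

def split_density(rho, x):
    """Split rho into the parts inside B_{R1}, between R1 and R2, and outside B_{R2}."""
    dx = x[1] - x[0]
    N = round(float(np.sum(rho) * dx))
    radius = np.abs(x)
    order = np.argsort(radius)
    inner_mass = np.cumsum(rho[order]) * dx       # mass of the cells closest to the origin
    R1 = radius[order][np.searchsorted(inner_mass, N - 1.0)]
    R2 = radius[order][np.searchsorted(inner_mass, N - 0.5)]
    rho1 = np.where(radius <= R1, rho, 0.0)
    rho2 = np.where((radius > R1) & (radius <= R2), rho, 0.0)
    rho3 = np.where(radius > R2, rho, 0.0)
    return rho1, rho2, rho3

x = np.linspace(-12.0, 12.0, 2401)
dx = x[1] - x[0]
rho = 3.0 * np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)   # hypothetical density, N = 3

rho1, rho2, rho3 = split_density(rho, x)
first = rho1 + 2.0 * rho2        # compactly supported part
second = rho1 + 2.0 * rho3       # part with a doubled tail
print(np.sum(first) * dx, np.sum(second) * dx)           # both close to N = 3
print(np.allclose(0.5 * first + 0.5 * second, rho))      # the convex combination is rho
```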

The first density \(\rho _1+2\rho _2\) has compact support hence has a finite energy, as explained above. For the second density \(\rho _1+2\rho _3\) we use an uncorrelated trial state in the form \(\mathbb {P}_1\otimes _s (2\rho _3)\) where \(\mathbb {P}_1\) is also constructed as before, but with \(\rho \) replaced by \(\rho _1\) which has mass \(N-1\). Here \(\otimes _s\) means the symmetric tensor product. A calculation shows that

$$\begin{aligned} F_T[\rho _1+2\rho _3]\leqslant {}&\mathcal {F}_T\big (\mathbb {P}_1\otimes _s(2\rho _3)\big )\\ ={}&\mathcal {F}_T(\mathbb {P}_1)+2\iint _{\mathbb {R}^{2d}}w(x-y)\rho _1(x)\rho _3(y)\, \textrm{d}x \, \textrm{d}y \\&+2T\int \rho _3\log (2\rho _3) \\ \leqslant {}&\mathcal {F}_T(\mathbb {P}_1)+(N-1)\sup _{|x|\geqslant R_2-R_1}|w(x)|+2T\int _{\mathbb {R}^d\setminus B_{R_2}} \rho \log (2\rho ). \end{aligned}$$

Thus the finiteness for densities of compact support implies the same for all densities. In fact, after optimizing over \(\mathbb {P}_1\) we have proved the bound

$$\begin{aligned} F_T[\rho ]\leqslant {}&\frac{F_T[\rho _1+2\rho _2]+F_T[\rho _1]}{2}\\&+\frac{N-1}{2}\sup _{|x|\geqslant R_2-R_1}|w(x)|+T\int _{\mathbb {R}^d\setminus B_{R_2}} \rho \log (2\rho ). \end{aligned}$$

This concludes the proof of Theorem 8. \(\square \)

We have not considered here the hard-core potential, to which we will come back later in Sect. 2.4. Representability is much more delicate in this case. From the inequality (19), we immediately obtain the following.

Corollary 9

(Representability in the grand-canonical case) Let \(\rho \in L^1(\mathbb {R}^d)\). Then we have \(G_0[\rho ]<\infty \) if w has no hard-core (\(\alpha <\infty \)). If furthermore \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \), then \(G_T[\rho ]<\infty \) for all \(T>0\).

2.3 Local Upper Bounds

Recall that we already have rather simple lower bounds in (9) and (18). The proof of Theorem 8 furnishes an upper bound on \(F_T[\rho ]\) but it depends on the smallest distance \(\delta \) between the particles in the system, which is itself a highly nonlinear and nonlocal function of \(\rho \). For non compactly-supported densities, the proof also involves the two radii \(R_1,R_2\) which depend on \(\rho \) as well.

Our goal here is to provide simple local upper bounds involving only integrals of the given density \(\rho \). We start in the next subsection by recalling the simple case \(\alpha <d\), where w is integrable at the origin and we can just choose i.i.d. particles. The case \(\alpha \geqslant d\) is much more complicated since particles cannot be allowed to get too close.

2.3.1 Upper Bound in the Weakly Repulsive Case \(\alpha <d\)

In the case where \( w_+ \) is integrable at the origin, it is easy to provide a simple upper bound.

Theorem 10

(Weakly repulsive case \(\alpha <d\)) Let w satisfy Assumption 1 with \( \alpha < d \). Let \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \cap L^2 ( \mathbb {R}^d ) \) with integer mass \( \int \rho \in \mathbb {N}\). Let also \(T\geqslant 0\) and assume that \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \) if \(T>0\). Then we have

(21)

In the grand-canonical case we have the exact same bound on \(G_T[\rho ]\), this time without any constraint on \(\int _{\mathbb {R}^d} \rho \) and with \(\rho \log \rho \) replaced by \(\rho (\log \rho -1)\) in the last integral.

As we have mentioned in the introduction, the functional appearing on the right side of the first line of (21) is the so-called Kirkwood–Monroe free energy [54], which is the simplest approximation of \(F_T[\rho ]\). It only makes sense for a locally integrable potential w. In addition to being an exact upper bound, the Kirkwood–Monroe free energy also provides the exact behavior of \(F_T[\rho ]\) in some regimes. This was studied in many works, including for instance [4, 36, 37, 41, 57] for the infinite gas at high density and [3, 6, 7, 50, 51, 53, 70, 90, 98] for trapped systems in the mean-field limit.

Proof

We denote \( N = \int _{\mathbb {R}^d} \rho \) and simply take as a trial state the pure tensor product \( \mathbb {P}{:}{=} ( \rho /N )^{\otimes N} \). The interaction energy satisfies

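$$\begin{aligned} \mathcal {U}_N ( \mathbb {P} ) = \frac{N(N-1)}{2}\iint _{\mathbb {R}^d\times \mathbb {R}^d} w ( x -y ) \frac{\rho (x)}{N}\frac{\rho (y)}{N}\,\textrm{d}x\,\textrm{d}y = \frac{1}{2}\left( 1-\frac{1}{N}\right) \iint _{\mathbb {R}^d\times \mathbb {R}^d} w ( x -y ) \rho (x)\rho (y)\,\textrm{d}x\,\textrm{d}y. \end{aligned}$$
(22)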

From the stability condition on w, we know that for any \(\eta \geqslant 0\) with \(\int \eta =1\),

$$\begin{aligned} \mathcal {U}_{K} ( \eta ^{\otimes K} )=\frac{K(K-1)}{2}\iint _{\mathbb {R}^d\times \mathbb {R}^d} w ( x -y ) \eta (x)\eta (y)\,\textrm{d}x\,\textrm{d}y\geqslant -\kappa K. \end{aligned}$$

Letting \(K\rightarrow \infty \), we find

$$\begin{aligned} \iint _{\mathbb {R}^d\times \mathbb {R}^d} w ( x -y ) \eta (x)\eta (y)\,\textrm{d}x\,\textrm{d}y\geqslant 0,\qquad \forall \eta \geqslant 0. \end{aligned}$$

This is how the stability is expressed in mean-field theory [62]. Since the double integral in (22) is non-negative, we can remove the 1/N for an upper bound. The entropy can itself be estimated by

$$\begin{aligned} -\mathcal {S}_N ( \mathbb {P} ) ={}&\int _{\mathbb {R}^{dN}} \left( {\frac{\rho }{N} }\right) ^{\otimes N} \log \left( {N! \left( {\frac{\rho }{N} }\right) ^{\otimes N}}\right) \\ ={}&\log \left( {\frac{N!}{N^N}}\right) + \int _{\mathbb {R}^d} \rho \log \rho \leqslant {} \int _{\mathbb {R}^d} \rho \log \rho , \end{aligned}$$

showing that (21) holds. In the grand-canonical case we use instead the Poisson state in (16) and exactly obtain the mean-field energy on the right side of (21) with \(\rho \log \rho \) replaced by \(\rho (\log \rho -1)\) in the last integral. \(\square \)
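The constant discarded in the entropy estimate of the proof is \(\log (N!/N^N)\). As a minimal numerical sketch (the function name `entropy_correction` is ours, not from the text), one can check that this constant is non-positive and compare it with the Stirling approximation \(-N+\tfrac{1}{2}\log (2\pi N)\), which shows that the product state loses about \(N\) in the entropy.

```python
import math


def entropy_correction(n: int) -> float:
    """Return log(N!/N^N), the constant dropped in the entropy estimate."""
    return math.lgamma(n + 1) - n * math.log(n)


def stirling(n: int) -> float:
    """Leading Stirling approximation of log(N!/N^N)."""
    return -n + 0.5 * math.log(2 * math.pi * n)


for n in (1, 2, 10, 100, 1000):
    print(f"N={n:5d}  log(N!/N^N)={entropy_correction(n):12.4f}  "
          f"Stirling={stirling(n):12.4f}")
```

The correction lies between \(-N\) and 0, so dropping it costs at most \(TN\) in the free energy.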

2.3.2 Upper Bounds in the Strongly Repulsive Case \(\alpha \geqslant d\)

When \(\alpha \geqslant d\) the right side of (21) is infinite due to the non-integrability of w at the origin. We cannot use a simple uncorrelated probability \(\mathbb {P}\) as a trial state and it is necessary to correlate the particles in such a way that they never get too close to each other. The difficulty is to do this at fixed density, with a reasonable energy cost. Also, we expect the typical distance between the particles to depend on the local value of \(\rho \). If we imagine that there are \(\rho (x)\) particles per unit volume in a neighborhood of a point x, then the distance should essentially be proportional to \(\rho (x)^{-1/d}\). We thus expect a bound in terms of \(\rho (x)^{1+\alpha /d}\) for large densities. We can only fully solve this question in the grand-canonical case. In the canonical case we can only treat \(T=0\) in full. The following is our first main result.

Theorem 11

(Strongly repulsive case \(\alpha \geqslant d\)) Suppose that the interaction w satisfies Assumption 1 with \( d \leqslant \alpha < \infty \). Let \(T\geqslant 0\) and assume that for \( T > 0 \), we have \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \).

\(\bullet \) In the grand-canonical ensemble, we have for any \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \),

$$\begin{aligned} G_T [\rho ]&\leqslant C\kappa \int _{\mathbb {R}^d}\rho ^2+CT \int _{\mathbb {R}^d} \rho + T \int _{\mathbb {R}^d} \rho \log \rho \nonumber \\&\qquad +{\left\{ \begin{array}{ll} \displaystyle C\kappa r_0^\alpha \int _{\mathbb {R}^d} \rho ^{1+\frac{\alpha }{d}}&{}\text {for } \alpha >d,\\ \displaystyle C\kappa r_0^d \left( \int _{\mathbb {R}^d} \rho ^2+\int _{\mathbb {R}^d}\rho ^2\big (\log r_0^d\rho \big )_+\right) &{}\text {for } \alpha =d. \end{array}\right. } \end{aligned}$$
(23)

Here the constant C only depends on the dimension d and the powers \(\alpha ,s\) from Assumption 1.

\(\bullet \) In the canonical ensemble we have the same estimate on \(F_T[\rho ]\) for all \(T\geqslant 0\) in dimension \(d=1\) and on \(F_0[\rho ]\) at \(T=0\) for \(d\geqslant 2\), provided of course that \(\rho \) has an integer mass.

In the proof we provide an explicit value for the constant C in (23) but we do not display it here since it is by no means optimal and depends on the cases. The parameters \(\kappa \) and \(r_0\) can be used to track the origin of the different terms in our bound (23). The integrable part of the potential gives the \(\rho ^2\) term as it did in Theorem 10. The terms involving \(r_0^\alpha \) on the second line are solely due to the divergence of w at the origin. It is important that we get here the expected and optimal \(\rho ^{1+\alpha /d}\) due to the singularity. Finally, we have an additional term involving \(T\rho \) which is an error in the entropy due to our construction. We otherwise get the optimal \(T\rho \log \rho \).

In dimension \(d=1\), the proof of Theorem 11 is relatively easy, both in the canonical and grand-canonical cases. It is detailed for convenience in Sect. 3. The idea is to split the density \(\rho \) into successive intervals of mass 1/2 and then write \(\rho =(2\rho _\text {odd}+2\rho _\text {even})/2\) where \(\rho _\text {odd}\) is the density restricted to the odd intervals and \(\rho _\text {even}\) to the even ones. We then take a trial state of the form \((\mathbb {P}_\text {odd}+\mathbb {P}_\text {even})/2\), where \(\mathbb {P}_\text {odd}\) corresponds to placing exactly one particle per odd interval at density \(2\rho \) and \(\mathbb {P}_\text {even}\) is defined similarly. This way we have inserted some distance between the particles. It depends on the form of \(\rho \) in the opposite set of intervals. The interaction between the particles can then be easily controlled in terms of \(\rho ^{1+\alpha /d}\), as we explain in Sect. 3.

In higher dimensions, there seems to be no general way of splitting \(\mathbb {R}^d\) into disjoint sets containing a fixed mass of \(\rho \), so that each set has finitely many neighbors at a given distance (except perhaps for very special densities [40]). We can however carry out an argument similar to the 1D case if we allow a covering with intersections. The Besicovitch covering lemma [21] allows us to work with cubes \(Q_j\) intersecting with finitely many other cubes, such that \(\int _{Q_j}\rho \) is any given number. We can also distribute the \(Q_j\) into a finite (universal) number of subcollections so that the cubes in each family are disjoint and not too close to each other. For each collection of disjoint cubes we then use a simple tensor product similar to the 1D case. The interaction is estimated using that the length of the cubes is related to \(\int _{Q_j}\rho ^{1+\alpha /d}\), leading to a bound involving only \(\int _{\mathbb {R}^d}\rho ^{1+\alpha /d}\). This proof was inspired by the presentation in the recent book [35] of a proof of the Lieb–Thirring and Cwikel–Lieb–Rozenblum inequalities from [91, 92, 108], thus in a completely different context. The difficulty here is that we have no information on the number of particles in each subcollection, due to the overlaps. This is the reason why the proof works well in the grand-canonical setting, but not in the canonical case. The details are given in Sect. 4.

To prove the result in the canonical case at \(T=0\) for \(d\geqslant 2\), we use a completely different method based on optimal transport tools from [14]. As we will explain in Sect. 5, the latter work can be used to construct a trial state \(\mathbb {P}\) with \(\rho _\mathbb {P}=\rho \) so that the distance between any two given particles on the support can be related to some average local value of the density around the particles. This is how we can obtain the bound (23) at \(T=0\) in the canonical case.

The next natural step is to smear this trial measure \(\mathbb {P}\) and use it at \(T>0\) but we could unfortunately not give an optimal bound on the entropy of the smearing. Our bound relies on the local radius \( R ( x ) \) of a density \( \rho \), which is thoroughly studied in Sect. 5.1 and is defined as follows. Let \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \) with \( \int _{\mathbb {R}^d} \rho ( y ) \, \textrm{d}y > 1 \). For each \( x \in \mathbb {R}^d \), we define the local radius R(x) to be the largest number satisfying

$$\begin{aligned} \int _{B ( x, R ( x ) )} \rho ( y ) \, \textrm{d}y = 1. \end{aligned}$$
(24)

This number is always bounded below for a given \(\rho \in L^1(\mathbb {R}^d)\) but behaves like |x| at infinity. If \(\rho \) has compact support, then R(x) is bounded on the support of \(\rho \).
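To make the definition (24) concrete, here is a minimal one-dimensional sketch, assuming the density is given through a continuous cumulative distribution function. The function `local_radius` and the example density \(\rho =3\,{\mathbb {1} }_{[0,1]}\) are illustrative choices of ours, not taken from the text. Since the mass of the ball \(B(x,R)\) is non-decreasing in R, bisection on the condition "mass at most 1" converges to the largest admissible radius.

```python
def local_radius(cdf, x, total_mass, iterations=200):
    """Largest R with cdf(x + R) - cdf(x - R) = 1 (1D version of (24)).

    Assumes cdf is continuous, non-decreasing and tends to total_mass > 1.
    """
    if total_mass <= 1.0:
        raise ValueError("the local radius requires total mass > 1")

    def ball_mass(r):
        return cdf(x + r) - cdf(x - r)

    hi = 1.0
    while ball_mass(hi) <= 1.0:  # enlarge until the ball holds more than mass 1
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):  # invariant: ball_mass(lo) <= 1 < ball_mass(hi)
        mid = 0.5 * (lo + hi)
        if ball_mass(mid) <= 1.0:
            lo = mid
        else:
            hi = mid
    return lo


def uniform_cdf(t):
    """cdf of the hypothetical density rho = 3 on [0, 1] (total mass 3)."""
    return 3.0 * min(max(t, 0.0), 1.0)


for point in (0.5, 0.0, 2.0, 100.0):
    print(point, local_radius(uniform_cdf, point, total_mass=3.0))
```

For this example the radius stays bounded on the support and grows linearly far away from it, in line with the two properties stated above.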

Theorem 12

(Strongly repulsive case \(\alpha \geqslant d\) II) Suppose that the interaction w satisfies Assumption 1 with \( 2\leqslant d \leqslant \alpha < \infty \). Let \(T>0\) and \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d )\) of integer mass with \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \). Then we have

$$\begin{aligned} F_T [\rho ]&\leqslant C(\kappa +T) \int _{\mathbb {R}^d}\rho ^2+CT \int _{\mathbb {R}^d} \rho + T \int _{\mathbb {R}^d} \rho \log \rho +T \int _{\mathbb {R}^d} \rho \log R^d\nonumber \\&\qquad +{\left\{ \begin{array}{ll} \displaystyle C\kappa r_0^\alpha \int _{\mathbb {R}^d} \rho ^{1+\frac{\alpha }{d}}&{}\text {for } \alpha >d,\\ \displaystyle C\kappa r_0^d \left( \int _{\mathbb {R}^d} \rho ^2+\int _{\mathbb {R}^d}\rho ^2\big (\log r_0^d\rho \big )_+\right) &{}\text {for } \alpha =d, \end{array}\right. } \end{aligned}$$
(25)

where the constant C only depends on the dimension d and the powers \(\alpha ,s\) from Assumption 1.

The main difference compared to (23) is the additional term \(T\int \rho \log R^d\), which we conjecture should not be present. It only affects the bound in places where R is large on the support of \(\rho \), that is, where one cannot find a sufficient amount of mass at a finite distance from x. Another small difference is the additional term \(CT\int \rho ^2\) due to our way of estimating the entropy. The proof is detailed in Sect. 5.4 below.

The upper bounds in Theorems 11 and 12 will be very useful for our next work [48] where we study \(F_T[\rho ]\) and \(G_T[\rho ]\) for extended systems. The sub-optimal upper bound (25) in the canonical case will be sufficient in this context.

Remark 13

(Lower bounds) Even when w really behaves like \(|x|^{-\alpha }\) at the origin (for instance satisfies \(w(x)\geqslant c|x|^{-\alpha }\) for some \(c>0\)), a lower bound in the form (23) cannot hold in general. This is because the density can be large in regions where there is only one particle at a time, which does not create any divergence in the interaction. As an example, consider N points \(X_1,...,X_N\in \mathbb {R}^d\) and place around each point one particle in the state \(\chi _r{:}{=}|B_r|^{-1}{\mathbb {1} }_{B_r}\), with r small enough. The corresponding state is the (symmetrization of the) tensor product \(\mathbb {P}_r=\bigotimes _{j=1}^N\chi _r(\cdot -X_j)\). Assuming that w is continuous, its interaction energy behaves as

$$\begin{aligned} \lim _{r\rightarrow 0}\mathcal {U}_{N}(\mathbb {P}_r)=\sum _{1\leqslant j<k\leqslant N}w(X_j-X_k), \end{aligned}$$

hence stays finite, whereas the entropy equals

$$\begin{aligned} \mathcal {S}_{N}(\mathbb {P}_r)=-N\int \chi _r\log \chi _r=N\log (|B_1|r^d)\underset{r\rightarrow 0}{\longrightarrow }-\infty . \end{aligned}$$

On the other hand, the right side of (23) diverges much faster, like \(Nr^{-\alpha }\). This proves that a lower bound of the form (23) cannot hold for all possible densities.

Nevertheless, it is expected that the term \(\int \rho ^{1+\alpha /d}\) should appear when there are many particles in a small domain and is thus optimal in such situations. For instance, assuming \(w\geqslant c|x|^{-\alpha }\) for \(|x|\leqslant r_0\) and taking \(\rho =N|B_{r_0/2}|^{-1}{\mathbb {1} }_{B_{r_0/2}}\) (N particles at uniform density in the small ball), we see that

$$\begin{aligned} F_T[\rho ]\geqslant \min _{x_1,...,x_N\in B_{r_0/2}}\left( \sum _{1\leqslant j<k\leqslant N}\frac{c}{|x_j-x_k|^\alpha }\right) +TN\log (N/|B_{r_0/2}|)-TN. \end{aligned}$$

The first minimum is known to behave like \(N^{1+\alpha /d}r_0^{-\alpha }\) in the limit \(N\rightarrow \infty \) [61, Lemma 1], which is exactly proportional to \(\int \rho ^{1+\alpha /d}\). Thus in this case, the lower bound holds and the power \(1+\alpha /d\) is optimal.

2.4 The Hard-Core Case

We conclude this section with a discussion of the hard-core case, which is notoriously more difficult [11, Sect. 9]. We start with the question of representability of a given density and then turn to some upper bounds on the free energy.

2.4.1 Representability

Let \(r_0>0\) be a positive number and consider the hard-core potential \(w_{r_0}(x)=(+\infty ){\mathbb {1} }(|x|<r_0)\). Then we have for any N-particle probability measure \(\mathbb {P}\)

$$\begin{aligned} \mathcal {U}_{N}(\mathbb {P})={\left\{ \begin{array}{ll} 0&{}\text {if } |x_j-x_k|\geqslant r_0\ \forall j\ne k,\ \mathbb {P}\text {–almost surely,}\\ +\infty &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

The set of \(\mathbb {P}\)’s such that \(\mathcal {U}_{N}(\mathbb {P})=0\) is convex and its extreme points are the symmetric tensor products of Dirac deltas located at distance \(\geqslant r_0\) from each other. It follows that the convex set of \(w_{r_0}\)–representable densities is the convex hull of the densities in the form

$$\begin{aligned} \rho =\sum _{j=1}^N \delta _{x_j},\qquad \min _{j\ne k}|x_j-x_k|\geqslant r_0. \end{aligned}$$
(26)

There is a similar result in the grand-canonical case. In spite of this simple characterization, it seems very hard, in general, to determine whether a given density belongs to this convex set or not.

In dimension \(d=1\), the problem can be solved exactly. Any extreme point (26) satisfies

$$\begin{aligned} \rho \big ([x,x+r_0)\big )\leqslant 1,\qquad \forall x\in \mathbb {R}, \end{aligned}$$
(27)

since there is always at most one Dirac delta in any interval of length \(r_0\). This property persists on the whole convex hull of \(w_{r_0}\)–representable densities. Conversely, any positive measure \(\rho \) with \(\rho (\mathbb {R})=N\) satisfying (27) can be written as a convex combination of Dirac deltas at distance \(\geqslant r_0\). To see this, assume for simplicity \(\rho \in L^1(\mathbb {R})\) and define as in [13] the non-decreasing function \(t\mapsto x(t)\) on (0, N) so that

$$\begin{aligned} \int _{-\infty }^{x(t)}\rho (s)\,\textrm{d}s=t,\qquad \forall t\in (0,N). \end{aligned}$$

To avoid any ambiguity when the support of \(\rho \) is not connected, we can choose x(t) to be the largest possible real number satisfying the above condition. The function \(t\mapsto x(t)\) is differentiable, except possibly on a countable set, with \(x'(t)=\rho (x(t))^{-1}\). When \(\rho >0\) almost everywhere, we have \(\lim _{t\rightarrow 0^+}x(t)=-\infty \) and \(\lim _{t\rightarrow N^-}x(t)=+\infty \). From the definition of x(t) we have

$$\begin{aligned} \rho =\int _0^N\delta _{x(t)}\,\textrm{d}t. \end{aligned}$$
(28)

Indeed, if we integrate the right side against some continuous function f we find \(\int _0^N f(x(t))\,\textrm{d}t=\int _\mathbb {R}f(s)\rho (s)\,\textrm{d}s\) after changing variable \(s=x(t)\). Now we can also rewrite (28) as

$$\begin{aligned} \rho =\int _0^1\sum _{k=0}^{N-1}\delta _{x(t+k)}\,\textrm{d}t. \end{aligned}$$
(29)

By definition of x(t) we have

$$\begin{aligned} \int _{x(t+k)}^{x(t+k+1)}\rho (s)\,\textrm{d}s=1,\qquad \forall k=0,...,N-2,\quad \forall t\in (0,1), \end{aligned}$$

and therefore \(|x(t+k+1)-x(t+k)|\geqslant r_0\) when the condition (27) is satisfied. Hence (29) is the sought-after convex combination of deltas located at distance \(\geqslant r_0\). The corresponding N-particle probability is

$$\begin{aligned} \mathbb {P}=\Pi _s\int _0^1\delta _{x(t)}\otimes \delta _{x(t+1)}\otimes \cdots \otimes \delta _{x(t+N-1)}\,\textrm{d}t \end{aligned}$$
(30)

where

$$\begin{aligned} \Pi _s ( f_1 \otimes \cdots \otimes f_N ) = \frac{1}{N!} \sum \limits _{\sigma \in \mathfrak {S}_N} f_{\sigma ( 1 )} \otimes \cdots \otimes f_{\sigma ( N )}, \end{aligned}$$
(31)

is the symmetrization operator. At positive temperature, the previous state can be regularized using the block approximation described in the proof of Theorem 8, provided that \(\int _\mathbb {R}\rho |\log \rho |<\infty \) and (27) holds with a strict inequality.
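The construction (28)–(30) can be checked numerically on a simple example. The sketch below assumes a hypothetical piecewise-constant density (\(\rho =2\) on [0, 1) and \(\rho =1\) on [1, 2], so \(N=3\)) and the radius \(r_0=1/2\), for which (27) holds with equality on part of the support. All names (`interp`, `x_of_t`, `configurations`) are ours. It verifies that the points \(x(t),x(t+1),x(t+2)\) stay at distance at least \(r_0\) and that averaging over t reproduces the first moment \(\int s\rho (s)\,\textrm{d}s=5/2\).

```python
from bisect import bisect_right

# Hypothetical example density (not from the text): rho = 2 on [0, 1) and
# rho = 1 on [1, 2], so N = 3. Its cdf is piecewise linear through these knots.
KNOTS = [0.0, 1.0, 2.0]
CDF = [0.0, 2.0, 3.0]
N = 3
R0 = 0.5  # hard-core radius; every window [x, x + R0) then has mass <= 1


def interp(t, xs, ys):
    """Piecewise-linear interpolation, clamped outside [xs[0], xs[-1]]."""
    if t <= xs[0]:
        return ys[0]
    if t >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, t) - 1
    w = (t - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + w * (ys[i + 1] - ys[i])


def cdf(x):
    return interp(x, KNOTS, CDF)


def x_of_t(t):
    """Quantile function t -> x(t), the inverse of the cdf on (0, N)."""
    return interp(t, CDF, KNOTS)


def max_window_mass(r0, samples=2000):
    """Largest mass of rho in a window [x, x + r0), scanned on a uniform grid."""
    a, b = KNOTS[0] - r0, KNOTS[-1]
    step = (b - a) / samples
    return max(cdf(a + i * step + r0) - cdf(a + i * step)
               for i in range(samples + 1))


def configurations(m=1000):
    """Configurations (x(t), ..., x(t+N-1)) at the midpoints t of m cells of (0, 1)."""
    return [[x_of_t((i + 0.5) / m + k) for k in range(N)] for i in range(m)]


configs = configurations()
min_gap = min(c[k + 1] - c[k] for c in configs for k in range(N - 1))
first_moment = sum(sum(c) for c in configs) / len(configs)
print(max_window_mass(R0), min_gap, first_moment)
```

The grid scan in `max_window_mass` only samples condition (27); it is a sanity check rather than a proof that the condition holds for every x.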

In dimensions \(d\geqslant 2\), the situation is much less clear. The condition (27) can be re-expressed in the form

$$\begin{aligned} \boxed {R_\rho {:}{=}\min _{x\in \mathbb {R}^d}R(x)\geqslant \frac{r_0}{2}}, \end{aligned}$$
(32)

where R(x) is the radius previously defined in (24). This can also be written in the form

$$\begin{aligned} \int _{B(x,r_0/2)}\rho \leqslant 1,\qquad \forall x\in \mathbb {R}^d. \end{aligned}$$

This is definitely a necessary condition for a density to be \(w_{r_0}\)–representable, in dimension \(d\geqslant 1\). Otherwise we would be able to find an \(x\in \mathbb {R}^d\) and an \(R<r_0/2\) such that \(\int _{B(x,R)}\rho >1\). But then the probability that there are at least two particles in the ball \(B(x,R)\) cannot vanish for any \(\mathbb {P}\) of density \(\rho \) and those are at distance \(<r_0\). This was already mentioned in [11, Sect. 9].

For \(d\geqslant 2\) the condition (32) is definitely not sufficient for a density to be representable. A counter example arises naturally within the sphere packing problem. Recall that the d-dimensional sphere packing density

$$\begin{aligned} \rho _c(d){:}{=}\lim _{\ell \rightarrow \infty }\frac{\max \{ N\,:\ \exists x_1,...,x_N\in \Omega _\ell ,\ |x_j-x_k|\geqslant 1\ \forall j\ne k\}}{|\Omega _\ell |}, \end{aligned}$$
(33)

gives the maximal number of points per unit volume one can put while ensuring that they are at distance \(\geqslant 1\) from each other. Here \(\Omega \) is any fixed smooth domain and \(\Omega _\ell =\ell \Omega \). The packing density equals \(\rho _c(1)=1\) in dimension \(d=1\) and is otherwise only known in dimensions \( d\in \{ 2,3,8,24\} \), for which it is given by some special lattices [12, 107]. The sphere packing fraction is defined by

$$\begin{aligned} v_c(d){:}{=}\rho _c (d)|B_{1/2}|=2^{-d}\rho _c (d)|B_{1}|, \end{aligned}$$

and represents the fraction of the volume occupied by the balls. This is simply \(v_c(1)=1\) in dimension \(d=1\) but is strictly less than 1 for \(d\geqslant 2\). Some volume has to be left unoccupied due to the impossibility to fill space with disjoint balls of fixed radius. It has been shown that \(v_c(d)\) tends to 0 exponentially fast in the limit \(d\rightarrow \infty \) but its exact behavior is still unknown [105]. Let us now consider a constant density \(\rho (x)=\rho _0{\mathbb {1} }_{\Omega _\ell }(x)\) over a large domain \(\Omega _\ell =\ell \Omega \) (for instance a ball). Then we have \(R(x)=(\rho _0|B_1|)^{-1/d}\) well inside \(\Omega _\ell \), whereas \(R(x)\geqslant (\rho _0|B_1|)^{-1/d}\) close to the boundary. This shows that for this density

$$\begin{aligned} R_\rho =\min _{x\in \mathbb {R}^d}R(x)=(\rho _0|B_1|)^{-\frac{1}{d}}=\frac{r_0}{2} \left( { \frac{r_0^{-d}\rho _c(d)}{\rho _0v_c(d)} }\right) ^{\frac{1}{d}}. \end{aligned}$$

In particular, the condition (32) is satisfied whenever \(\rho _0\leqslant r_0^{-d}\rho _c(d)/v_c(d)\). On the other hand, it is clear from the packing problem (rescaled by \(r_0\)) that when \(\rho _0>r_0^{-d}\rho _c(d)\) the density cannot be representable for \(\ell \) large enough. Otherwise we would be able to place \(N=\rho _0|\Omega _\ell |>r_0^{-d}\rho _c(d)|\Omega _\ell |\) points in \(\Omega _\ell \) at distance \(r_0\), which contradicts the definition of \(\rho _c(d)\). In conclusion, we have found that, in dimensions \(d\geqslant 2\), constant densities \(\rho _0{\mathbb {1} }_{\Omega _\ell }\) with

$$\begin{aligned} r_0^{-d}\rho _c(d)<\rho _0\leqslant \frac{r_0^{-d}\rho _c(d)}{v_c(d)}, \end{aligned}$$

satisfy (32) but cannot be \(w_{r_0}\)–representable for \(\ell \gg 1\).
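To get a feeling for the size of this window, the following sketch evaluates its endpoints in dimensions 2 and 3, using the classical values of the packing fraction, \(v_c(2)=\pi /(2\sqrt{3})\) (hexagonal lattice) and \(v_c(3)=\pi /(3\sqrt{2})\) (face-centered cubic lattice). The function names are ours and the code only restates the formulas above.

```python
import math


def unit_ball_volume(d: int) -> float:
    """Volume |B_1| of the unit ball in dimension d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


# Known packing fractions v_c(d): hexagonal lattice (d = 2), FCC lattice (d = 3).
PACKING_FRACTION = {
    2: math.pi / (2 * math.sqrt(3)),
    3: math.pi / (3 * math.sqrt(2)),
}


def packing_density(d: int) -> float:
    """rho_c(d) = v_c(d) / |B_{1/2}| = 2^d v_c(d) / |B_1|."""
    return PACKING_FRACTION[d] * 2 ** d / unit_ball_volume(d)


def nonrepresentable_window(d: int, r0: float):
    """Endpoints of the range of rho_0 satisfying (32) without being representable."""
    lower = packing_density(d) / r0 ** d
    upper = lower / PACKING_FRACTION[d]
    return lower, upper


def min_local_radius(d: int, rho0: float) -> float:
    """R_rho = (rho_0 |B_1|)^(-1/d) for a constant density on a large domain."""
    return (rho0 * unit_ball_volume(d)) ** (-1.0 / d)


for d in (2, 3):
    lower, upper = nonrepresentable_window(d, r0=1.0)
    print(d, lower, upper, min_local_radius(d, upper))
```

At the upper endpoint the radius \(R_\rho \) equals \(r_0/2\) exactly, which is the borderline case of (32).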

As a side remark, we mention that there are representable densities satisfying (32), with \(R_\rho \) as close as we want to \(r_0/2\). We can just take the sum of two Dirac deltas placed at distance \(R\geqslant r_0\) or a smooth approximation of it. This proves that there cannot exist a simple necessary and sufficient condition of hard core representability involving \(R_\rho \) only, in dimensions \(d\geqslant 2\). This is in stark contrast with the one-dimensional case.

There exists, however, a simple sufficient condition in a form that was conjectured in [11, p. 116]. In [14, Theorem 4.1] (see also Theorem 21 below), it is proved that any density satisfying

$$\begin{aligned} \boxed {R_\rho \geqslant r_0}, \end{aligned}$$

is \(w_{r_0}\)–representable. The same holds when \(T>0\) if one puts a strict inequality. It would be interesting to know if such a result is valid for \(R_\rho \geqslant c_d r_0\) with \(c_d<1\), depending on the dimension.

The conclusion of our discussion is that there seems to exist no simple characterization of hard core representability in dimensions \(d\geqslant 2\), involving averages of \(\rho \) over balls. There are necessary or sufficient conditions but they do not match.

2.4.2 Upper Bounds

Next we discuss upper bounds in the hard core case. Even if we do not completely understand when a density is hard-core representable, the energy is very easy to bound when it is the case. Let us assume that w satisfies Assumption 1 with \(\alpha =+\infty \) and that \(\rho \in L^1(\mathbb {R}^d)\) is w–representable. For simplicity we also assume that \(w=+\infty \) on \(B_{r_0}\). Then, for any optimizer \(\mathbb {P}\), we have \(|x_j-x_k|\geqslant r_0\) for \(j\ne k\), \(\mathbb {P}\)–almost surely. This implies

$$\begin{aligned} F_0[\rho ]= \mathcal {U}_N(\mathbb {P})\leqslant \int _{(\mathbb {R}^d)^N}\sum _{1\leqslant j<k\leqslant N} \frac{\kappa {\mathbb {1} }(|x_j-x_k|\geqslant r_0)}{|x_j-x_k|^s}\,\textrm{d}\mathbb {P}\leqslant C\kappa Nr_0^{-s}, \end{aligned}$$
(34)

by [61, Lemma 9]. The constant C only depends on s and d. Upper bounds are easy once we know that the particles cannot get too close.

Constructing trial states with a good entropy is more difficult. Our proofs of Theorems 11 and 12 work in the hard-core case, but they require additional conditions, of the form

$$\begin{aligned} R_\rho >r_0\qquad \text {or}\qquad \int _{B(x,r_0/2)}\rho \leqslant \varepsilon , \end{aligned}$$

for a sufficiently small \(\varepsilon \). We do not state the corresponding results here and rather refer the reader to Remarks 16, 20, and 28 below. In the rest of this section we quickly discuss the grand-canonical 1D case which has been studied in a famous paper of Percus [76] and the situation where \(\rho \) is bounded uniformly.

2.4.3 The 1D Grand-Canonical Percus Formula

The grand-canonical inverse problem was completely solved by Percus in dimension \(d=1\) in [76] (see also [88]). Under the optimal assumption that \(R_\rho >r_0/2\), he proved that the grand-canonical Gibbs state with external potential

$$\begin{aligned} V(x)=-\log \rho (x)+\log \left( 1-\int _{x-r_0}^x\rho \right) -\int _x^{x+r_0} \frac{\rho (s)}{1-\int _{s-r_0}^s\rho }\,\textrm{d}s \end{aligned}$$

and hard-core \(w_{r_0}\) has the density \(\rho \). Since the potential \({\widetilde{V}}=V+T\log \rho \) solves the supremum in the dual formula (20), we obtain

$$\begin{aligned} \boxed {G_T[\rho ]= T\int _\mathbb {R}\rho (x)\big (\log \rho (x)-1\big )\,\textrm{d}x-T\int _\mathbb {R}\rho (x)\log \left( 1-\int _{x-r_0}^x\rho \right) \textrm{d}x}, \end{aligned}$$
(35)

for the hard core potential \(w_{r_0}\). This explicit expression shows us that, in one dimension, the nonlocality is solely due to the second logarithmic term, which involves the local average \(\int _{x-r_0}^x\rho \) over a window of length \(r_0\). This is further discussed in [88].
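As an illustration, the right side of (35) can be evaluated by simple quadrature. The sketch below uses a midpoint rule and a hypothetical uniform density \(\rho _0=1/2\) on [0, 4] with \(r_0=1\) and \(T=1\); the function `percus_functional` and the convention of returning \(+\infty \) when a window of length \(r_0\) carries mass at least 1 are assumptions of ours. Setting \(r_0=0\) recovers the ideal-gas term \(T\int \rho (\log \rho -1)\).

```python
import math


def percus_functional(rho, cdf, a, b, r0, T, n=20000):
    """Midpoint-rule evaluation of the right side of (35) for rho supported in [a, b].

    rho is the density and cdf its cumulative distribution function.
    Returns +inf if some window (x - r0, x) carries mass >= 1 (our convention).
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        r = rho(x)
        if r <= 0.0:
            continue  # rho log rho vanishes where the density does
        window = cdf(x) - cdf(x - r0)
        if window >= 1.0:
            return math.inf
        total += r * (math.log(r) - 1.0 - math.log(1.0 - window)) * h
    return T * total


# Hypothetical example: uniform density RHO0 on [0, LENGTH].
RHO0, LENGTH = 0.5, 4.0


def rho(x):
    return RHO0 if 0.0 <= x <= LENGTH else 0.0


def cdf(x):
    return RHO0 * min(max(x, 0.0), LENGTH)


hard_rods = percus_functional(rho, cdf, 0.0, LENGTH, r0=1.0, T=1.0)
ideal_gas = percus_functional(rho, cdf, 0.0, LENGTH, r0=0.0, T=1.0)
print(hard_rods, ideal_gas)
```

The difference between the two values is the nonlocal excess free energy generated by the second logarithmic term in (35), which is non-negative.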

For a general potential w satisfying Assumption 1, we only obtain an upper bound and need to add \(C\kappa r_0^{-s}\int _\mathbb {R}\rho \) by (34). We can estimate the logarithm by assuming, for instance, that \(\int _{x-r_0}^x\rho \leqslant 1-\varepsilon \) for all \(x\in \mathbb {R}\).

To our knowledge the canonical problem was never solved in the manner of Percus. It would be interesting to derive an upper bound on \(F_T[\rho ]\) of the same form as the right side of (35).

2.4.4 Bound for Densities Uniformly Bounded by the Packing Density

In dimensions \(d\geqslant 2\) we have no simple criterion of representability, as we have seen. One simpler situation is when \(\rho \) is everywhere bounded above by the sphere packing density, which we have defined in (33). Then we can prove it is representable and furnish an explicit upper bound on its grand-canonical free energy.

Theorem 14

(Hard-core case with packing density bound) Assume that w satisfies Assumption 1 with \(\alpha =+\infty \). Let \(\rho _c(d)\) be the sphere packing density in (33) and \(v_c(d)=2^{-d}\rho _c(d)|B_1|\) be the volume fraction. Let \(\rho \in L^1 ( \mathbb {R}^d,\mathbb {R}_+ )\) be such that

$$\begin{aligned} \rho (x)\leqslant (1-\varepsilon )^dr_0^{-d}\rho _c(d), \end{aligned}$$

for some \(\varepsilon \in (0,1)\). We also assume that \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \) if \( T > 0 \). Then

$$\begin{aligned} G_T[\rho ]\leqslant \frac{C\kappa }{r_0^s}\int _{{\mathbb {R}}^d}\rho +T\int _{\mathbb {R}^d}\rho \log \rho +T\log \left( \frac{2^d}{\varepsilon ^d v_c(d)}\right) \int _{\mathbb {R}^d}\rho , \end{aligned}$$

with a constant C depending only on the dimension d and the power s from Assumption 1.

The idea of the proof is to first construct a trial state for a constant density \(\rho _0\approx (1-\varepsilon )^dr_0^{-d}\rho _c(d)\) by using a periodic sphere packing with a large period, uniformly averaged over translations (often called a “floating crystal” [64]). We then “geometrically localize” [60] this state to make it have density \(\rho \). The proof is detailed later in Sect. 6.

3 Proof of Theorem 11 in Dimension \(d=1\)

We start with the one-dimensional canonical case, for which the argument is relatively easy. We detail the proof for the convenience of the reader and because this will pave the way for the more complicated covering methods in higher dimensions. We only consider here the canonical case. The grand-canonical bound (23) follows using (19), but in the next section we will provide a direct proof in the grand-canonical case which also works in dimension \(d=1\).

Theorem 15

(\( d = 1 \)) Suppose the interaction w satisfies Assumption 1 with \( 1 \leqslant \alpha < \infty \). Let \( T \geqslant 0 \) and assume that \( \int _{\mathbb {R}} \rho |\log \rho | < \infty \) for \( T > 0 \). Then for any density \( 0 \leqslant \rho \in L^1 ( \mathbb {R} ) \) with \( \int _{\mathbb {R}} \rho \in \mathbb {N}\), we have

$$\begin{aligned} F_T [ \rho ] \leqslant {}&\frac{4 \kappa s}{s-1} \int _{\mathbb {R}} \rho ^2 + \log ( 2 ) T \int _{\mathbb {R}} \rho + T \int _{\mathbb {R}} \rho \log \rho \nonumber \\&+ {\left\{ \begin{array}{ll} \displaystyle \frac{2^{3+2\alpha }}{\alpha -1} \kappa r_0^\alpha \int _{\mathbb {R}} \rho ^{1+\alpha } &{}\text {for } \alpha >1,\\ \displaystyle 2^5 \kappa r_0 \left( {2 \log ( 2 ) \int _{\mathbb {R}} \rho ^2+\int _{\mathbb {R}}\rho ^2 \left( {\log r_0 \rho }\right) _+ }\right) &{}\text {for } \alpha =1. \end{array}\right. } \end{aligned}$$
(36)

Proof

Denoting \( N = \int _{\mathbb {R}} \rho \), we can split the real numbers \( \mathbb {R}\) into two families \( ( L_j )_{j=1}^N \), \( ( L_j^{*} )_{j=1}^N \) of disjoint intervals in such a way that the mass of \( \rho \) in each of these intervals is exactly \( \int _{L_j} \rho = 1/2 \), and such that each \( L_j \) has neighboring intervals only among the \( L_j^{*} \), and vice versa (see Fig. 1).

Fig. 1 Sketch of intervals

This allows us to write

$$\begin{aligned} \rho = \frac{1}{2} \left( { \sum \limits _j 2 \rho \mathbb {1}_{L_j} + \sum \limits _j 2 \rho \mathbb {1}_{L_j^{*}}}\right) , \end{aligned}$$

a convex combination of two measures with mass equal to N. As trial states for each of these, we take the symmetric tensor products

$$\begin{aligned} \mathbb {Q}= \Pi _s \left( {\bigotimes \limits _j ( 2 \rho \mathbb {1}_{L_j} ) }\right) , \qquad \mathbb {Q}^{*} = \Pi _s \left( {\bigotimes \limits _j ( 2 \rho \mathbb {1}_{L_j^{*}} ) }\right) , \end{aligned}$$

where \( \Pi _s \) denotes the symmetrization operator in (31). Then the state

$$\begin{aligned} \mathbb {P}{:}{=} \frac{1}{2} ( \mathbb {Q}+ \mathbb {Q}^{*} ), \end{aligned}$$

has one-body density equal to \( \rho _{\mathbb {P}} = \rho \). Using that the intervals \( ( L_j ) \) are all disjoint, we have for instance

$$\begin{aligned} - \mathcal {S}_N ( \mathbb {Q} ) ={}&\int _{\mathbb {R}^N} \frac{1}{N!} \sum \limits _{\sigma \in S_N} \bigotimes _j ( 2 \rho \mathbb {1}_{L_{\sigma ( j )}} ) \log \left( {\bigotimes _j ( 2 \rho \mathbb {1}_{L_{\sigma ( j )}} )}\right) \\ ={}&\sum \limits _{j=1}^N \int _{\mathbb {R}} 2 \rho \mathbb {1}_{L_j} \log ( 2 \rho \mathbb {1}_{L_j} ) ={} \int _{\bigcup _j L_j} 2 \rho \log ( 2 \rho ), \end{aligned}$$

and similarly for \( \mathbb {Q}^{*} \). By concavity of the entropy, we conclude that

$$\begin{aligned} - \mathcal {S}_N ( \mathbb {P} ) \leqslant {} - \frac{1}{2} \mathcal {S}_N ( \mathbb {Q} ) - \frac{1}{2} \mathcal {S}_N ( \mathbb {Q}^{*} ) ={} \log 2 \int _{\mathbb {R}} \rho + \int _{\mathbb {R}} \rho \log \rho . \end{aligned}$$

To estimate the interaction energy in the state \( \mathbb {P}\), it suffices to provide an estimate for both \( \mathbb {Q}\) and \( \mathbb {Q}^{*} \). We write here the argument only for \( \mathbb {Q}\), since the argument for \( \mathbb {Q}^{*} \) is exactly the same. By Assumption 1 and the construction of \( \mathbb {Q}\), we immediately have

$$\begin{aligned} \mathcal {U}_N ( \mathbb {Q} ) ={}&\iint _{\mathbb {R}^2} w ( x-y ) \rho _{\mathbb {Q}}^{( 2 )} ( x,y ) \, \textrm{d}x \, \textrm{d}y \\ \leqslant {}&4 \kappa \sum \limits _{i< j} \iint _{\mathbb {R}^2} \left( {\frac{r_0^{\alpha } \mathbb {1} ( |x - y| < r_0 )}{|x - y|^{\alpha }} + w_2 ( x - y ) }\right) \nonumber \\&\qquad \qquad \qquad \qquad \times \rho ( x ) \mathbb {1}_{L_i} ( x ) \rho ( y ) \mathbb {1}_{L_j} ( y ) \, \textrm{d}x \, \textrm{d}y, \end{aligned}$$

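where \( w_2 ( x ) = ( 1 + |x|^s )^{-1} \) and \(\rho _{\mathbb {Q}}^{( 2 )}\) is the two-particle correlation function. For the contribution from the tail of the interaction, we have by Young’s inequality

$$\begin{aligned} 4 \kappa \sum \limits _{i< j} \iint _{\mathbb {R}^2} w_2 ( x - y ) \rho ( x ) \mathbb {1}_{L_i} ( x ) \rho ( y ) \mathbb {1}_{L_j} ( y ) \, \textrm{d}x \, \textrm{d}y \leqslant {}&2 \kappa \iint _{\mathbb {R}^2} w_2 ( x - y ) \rho ( x ) \rho ( y ) \, \textrm{d}x \, \textrm{d}y \\ \leqslant {}&2 \kappa \left\| w_2 \right\| _{L^1} \int _{\mathbb {R}} \rho ^2 \leqslant \frac{4 \kappa s}{s-1} \int _{\mathbb {R}} \rho ^2. \end{aligned}$$

From the core of w we get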

$$\begin{aligned} 4 \sum \limits _{i< j} \iint _{\mathbb {R}^2} \frac{\mathbb {1} ( |x - y|< r_0 )}{|x - y|^{\alpha }} \rho ( x ) \mathbb {1}_{L_i} ( x ) \rho ( y ) \mathbb {1}_{L_j} ( y ) \, \textrm{d}x \, \textrm{d}y \leqslant \sum \limits _{i< j} \frac{\mathbb {1}_{\textrm{d}( L_i,L_j ) < r_0}}{\textrm{d}( L_i,L_j )^{\alpha }}. \end{aligned}$$

The idea now is to use the intervals \( ( L_j^{*} ) \) to estimate the sum above. For each i we denote by \( \eta _i \) the minimal length of neighboring intervals,

$$\begin{aligned} \eta _i {:}{=} \min \left\{ \ell _j^{*} \, : \, L_j^{*} \text { is a neighbor of } L_i \right\} , \end{aligned}$$

where \( \ell _j^{*} {:}{=} |L_j^{*}| \) is the interval length, and we re-order the collection \( ( L_i ) \) such that \( \eta _1 \leqslant \cdots \leqslant \eta _N \). Fixing the index i, we now clearly have for \( j > i \),

$$\begin{aligned} \textrm{d}( L_i, L_j ) \geqslant \eta _j \geqslant \eta _i, \end{aligned}$$

in particular, \( \eta _i \) is smaller than the side length of any interval neighboring \( L_j \). Pick \( x_j \in \overline{L_i} \) and \( y_j \in \overline{L_j} \) such that \( \textrm{d}( L_i, L_j ) = |x_j-y_j| \), and let \( L_k^{*} \) be the neighboring interval of \( L_j \) facing \( y_j \), that is, \( \textrm{d}( y_j, L_k^{*} ) = 0 \). Defining

$$\begin{aligned} \widetilde{L}_j {:}{=} ( y_j-\eta _i/2, y_j+\eta _i /2 ) \cap L_k^{*}, \end{aligned}$$

we have \( |\widetilde{L}_j| = \eta _i / 2 \) and \( \eta _i /2 \leqslant |x_j-y| \leqslant |x_j-y_j| \) for all \( y \in \widetilde{L}_j \), so we can estimate

$$\begin{aligned} \frac{\mathbb {1}_{\textrm{d}( L_i,L_j )< r_0}}{\textrm{d}( L_i,L_j )^{\alpha }} \leqslant {} \frac{2}{\eta _i} \int _{\widetilde{L}_j} \frac{\mathbb {1} ( |x_j - y|< r_0 )}{|x_j - y|^{\alpha }} \, \textrm{d}y = \frac{2}{\eta _i} \int _{\widetilde{L}_j-x_j} \frac{\mathbb {1} ( |y| < r_0 )}{|y|^{\alpha }} \, \textrm{d}y. \end{aligned}$$

Now summing over j gives

$$\begin{aligned} \sum \limits _{j = i+1}^N \frac{\mathbb {1}_{\textrm{d}( L_i,L_j )< r_0}}{\textrm{d}( L_i,L_j )^{\alpha }} \leqslant {}&\frac{2}{\eta _i} \int _{\mathbb {R}} \frac{\mathbb {1} ( \eta _i /2 \leqslant |y| < r_0 )}{|y|^{\alpha }} \, \textrm{d}y ={} \frac{4}{\eta _i} \left( { \int _{\eta _i / 2}^{r_0} \frac{1}{|y|^{\alpha }} \, \textrm{d}y }\right) _+ \nonumber \\ \leqslant {}&{\left\{ \begin{array}{ll} \displaystyle \frac{2^{1+\alpha }}{\alpha -1} \frac{1}{\eta _i^{\alpha }} &{}\text {for } \alpha >1,\\ \displaystyle \frac{4}{\eta _i} \left( {\log \left( {\frac{2 r_0}{\eta _i}}\right) }\right) _+ &{}\text {for } \alpha =1. \end{array}\right. } \end{aligned}$$
(37)
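
The last inequality in (37) is an elementary integral. A short sketch, assuming \( \eta _i < 2 r_0 \) (otherwise the positive part vanishes) and enlarging the upper limit of integration to \( +\infty \) when \( \alpha > 1 \):

$$\begin{aligned} \frac{4}{\eta _i} \int _{\eta _i/2}^{r_0} \frac{\textrm{d}y}{y^{\alpha }} \leqslant {}&\frac{4}{\eta _i} \int _{\eta _i/2}^{\infty } \frac{\textrm{d}y}{y^{\alpha }} = \frac{4}{\eta _i} \, \frac{( \eta _i / 2 )^{1-\alpha }}{\alpha -1} = \frac{2^{1+\alpha }}{\alpha -1} \frac{1}{\eta _i^{\alpha }} \qquad \text {for } \alpha > 1, \\ \frac{4}{\eta _i} \int _{\eta _i/2}^{r_0} \frac{\textrm{d}y}{y} = {}&\frac{4}{\eta _i} \log \left( \frac{2 r_0}{\eta _i} \right) \qquad \text {for } \alpha = 1. \end{aligned}$$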

By Hölder’s inequality, we have by construction of the intervals \( L_j^{*} \)

$$\begin{aligned} \frac{1}{( \ell _j^{*} )^{\alpha }} \leqslant 2^{1+\alpha } \int _{L_j^{*}} \rho ^{1+\alpha }, \end{aligned}$$

for any \( \alpha > 1 \), so in this case we conclude that

$$\begin{aligned} \sum \limits _{i< j} \frac{\mathbb {1}_{\textrm{d}( L_i,L_j ) < r_0}}{\textrm{d}( L_i,L_j )^{\alpha }} \leqslant {} \sum \limits _{i=1}^N \frac{2^{1+\alpha }}{\alpha -1} \frac{1}{\eta _i^{\alpha }} \leqslant {} \frac{2^{2+\alpha }}{\alpha -1} \sum \limits _{i=1}^N \frac{1}{( \ell _i^{*} )^{\alpha }} \leqslant {} \frac{2^{3+2\alpha }}{\alpha -1} \sum \limits _{i=1}^N \int _{L_i^{*}} \rho ^{1+\alpha }. \end{aligned}$$

The same bound holds for the interaction energy of \( \mathbb {Q}^{*} \), but with the intervals \( L_i^{*} \) replaced by \( L_i \) at the end. This finishes the proof of the \( \alpha > 1 \) case in (36).
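
The two middle inequalities in the last display can be unpacked as follows. This is only a sketch, with \( \ell _j^{*} = |L_j^{*}| \) and \( \int _{L_j^{*}} \rho = 1/2 \) as in the construction. Each \( L_j^{*} \) is a neighbor of at most two of the intervals \( L_i \), so each length \( \ell _j^{*} \) is attained as a minimum \( \eta _i \) at most twice, while Hölder's inequality with exponents \( 1+\alpha \) and \( ( 1+\alpha ) / \alpha \) controls \( 1/\ell _j^{*} \):

$$\begin{aligned} \sum \limits _{i=1}^N \frac{1}{\eta _i^{\alpha }} \leqslant {}&2 \sum \limits _{j=1}^N \frac{1}{( \ell _j^{*} )^{\alpha }}, \\ \frac{1}{2} = \int _{L_j^{*}} \rho \leqslant {}&( \ell _j^{*} )^{\frac{\alpha }{1+\alpha }} \left( \int _{L_j^{*}} \rho ^{1+\alpha } \right) ^{\frac{1}{1+\alpha }} \quad \Longrightarrow \quad \frac{1}{( \ell _j^{*} )^{\alpha }} \leqslant 2^{1+\alpha } \int _{L_j^{*}} \rho ^{1+\alpha }. \end{aligned}$$

The implication follows by raising both sides to the power \( 1+\alpha \).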

To finish the \( \alpha = 1 \) case, we note that applying Jensen’s inequality on the function \( t \mapsto t^2 ( \log ( 2\lambda t ) )_+ \) for \( \lambda > 0 \) yields

$$\begin{aligned} \frac{1}{\ell _j^{*}} \left( {\log \left( { \frac{\lambda }{\ell _j^{*}} }\right) }\right) _+ ={} 4 \ell _j^{*} \left( {\frac{1}{\ell _j^{*}} \int _{L_j^{*}} \rho }\right) ^2 \left( { \log \left( { \frac{2 \lambda }{\ell _j^{*}} \int _{L_j^{*}} \rho }\right) }\right) _+ \leqslant {} 4 \int _{L_j^{*}} \rho ^2 ( \log ( 2 \lambda \rho ) )_+. \end{aligned}$$

Hence, continuing from (37), we get

$$\begin{aligned} \sum \limits _{i< j} \frac{\mathbb {1}_{\textrm{d}( L_i,L_j ) < r_0}}{\textrm{d}( L_i,L_j )^{\alpha }} \leqslant {} \sum \limits _{i=1}^N \frac{4}{\eta _i} \left( {\log \left( {\frac{2 r_0}{\eta _i}}\right) }\right) _+ \leqslant {}&8 \sum \limits _{i=1}^N \frac{1}{\ell _i^{*}} \left( {\log \left( {\frac{2 r_0}{\ell _i^{*}}}\right) }\right) _+ \\ \leqslant {}&2^5 \sum \limits _{i=1}^N \int _{L_i^{*}} \rho ^2 ( \log ( 4 r_0 \rho ) )_+. \end{aligned}$$
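
To compare with the form of the bound stated in (36), one can split the logarithm. A short sketch, using only \( ( a + b )_+ \leqslant a + b_+ \) for \( a \geqslant 0 \):

$$\begin{aligned} ( \log ( 4 r_0 \rho ) )_+ = {}&\left( 2 \log 2 + \log ( r_0 \rho ) \right) _+ \leqslant 2 \log 2 + ( \log r_0 \rho )_+, \\ 2^5 \sum \limits _{i=1}^N \int _{L_i^{*}} \rho ^2 ( \log ( 4 r_0 \rho ) )_+ \leqslant {}&2^5 \left( 2 \log ( 2 ) \int _{\mathbb {R}} \rho ^2 + \int _{\mathbb {R}} \rho ^2 ( \log r_0 \rho )_+ \right) . \end{aligned}$$

After multiplication by \( \kappa r_0 \), the right side is the \( \alpha = 1 \) term in (36).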

Since the corresponding bound also holds for \( \mathbb {Q}^{*} \), this concludes the proof. \(\square \)

Remark 16

(Hard-core case) In the case where w has a hard-core with range \( r_0 > 0 \), it follows from the proof above that

$$\begin{aligned} F_T [ \rho ] \leqslant \left( {\frac{4 \kappa s}{( s-1 ) r_0} +\log ( 2 ) T}\right) \int _{\mathbb {R}} \rho + T \int _{\mathbb {R}} \rho \log \rho , \end{aligned}$$
(38)

for any density \( \rho \in L^1 ( \mathbb {R} ) \) satisfying the (sub-optimal) condition \(\int _{x}^{x+r_0} \rho \leqslant \frac{1}{2}\) for all \( x \in \mathbb {R}\).

4 Proof of Theorem 11 in the Grand-Canonical Case

In the course of our proof we need to cover the support of our density using disjoint cubes separated by a distance depending on the local value of the density, in order to have a reasonable control of the interaction. We obtain such a covering by a variant of the Besicovitch lemma [21], which we first describe in this subsection. It is different from the standard formulation.

For simplicity we work with a compactly supported density \(\rho \) with \(\int _{\mathbb {R}^d}\rho >1\). For every \(x\in \mathbb {R}^d\), we define \(\ell (x)\) to be the largest number such that

$$\begin{aligned} \int _{x+\ell (x){\mathcal {C}}}\rho (y)\,\textrm{d}y=\frac{1}{3^d(4^d+1)}, \end{aligned}$$
(39)

where \({\mathcal {C}}=(-1/2,1/2)^d\) is the unit cube centered at the origin. It is convenient to work with cubes instead of balls. It is important that the chosen value of the integral in (39) is universal and only depends on the space dimension d. This value is motivated by the estimates which will follow; at this point it could be any fixed number \(<1\). The number \(\ell (x)\) always exists since the full integral is larger than 1. The function \(x\mapsto \ell (x)\) is upper semi-continuous. To simplify our notation we denote by \({\mathcal {C}}(x){:}{=}x+\ell (x){\mathcal {C}}\) the cube centered at x of side length \(\ell (x)\). By Hölder’s inequality we get

$$\begin{aligned} \frac{1}{3^d(4^d+1)}=\int _{{\mathcal {C}}(x)}\rho \leqslant \ell (x)^{\frac{\alpha d}{\alpha +d}} \left( { \int _{{\mathcal {C}}(x)}\rho ^{1+\frac{\alpha }{d}} }\right) ^{\frac{d}{d+\alpha }}, \end{aligned}$$

and thus obtain the estimate

$$\begin{aligned} \frac{1}{\ell (x)^\alpha }\leqslant 3^{\alpha +d}(4^d+1)^{1+\frac{\alpha }{d}}\int _{{\mathcal {C}}(x)}\rho ^{1+\frac{\alpha }{d}},\qquad \forall x\in \mathbb {R}^d, \end{aligned}$$
(40)

on the local length \(\ell (x)\). The standard Besicovitch covering lemma (as stated for instance in [21, 35]) implies for compactly supported densities that there exists a set of points \(x_j'^{(k)}\) with \(1\leqslant k\leqslant K'\leqslant 4^d+1\) and \(1\leqslant j\leqslant J_k\) such that

  • the cubes \(\big ({\mathcal {C}}(x_j'^{(k)})\big )_{\begin{array}{c} 1\leqslant k\leqslant K'\\ 1\leqslant j\leqslant J_k \end{array}}\) cover the support of \(\rho \) and each \(x\in \mathbb {R}^d\) is in at most \(2^d\) such cubes,

  • for every k, the cubes \(\big ({\mathcal {C}}(x'^{(k)}_j)\big )_{1\leqslant j\leqslant J_k}\) are all disjoint.

We need to obtain different families which satisfy additional properties, namely we require the cubes to have a safety distance to all the larger cubes within the same family, this distance being comparable to the side length of the cube in question. The precise statement is the following.

Lemma 17

(Besicovitch with minimal distance) Let \(\rho \) be a compactly supported density with \(\int _{\mathbb {R}^d}\rho >1\). Then there exists a set of points \(x_j^{(k)}\) with \(1\leqslant k\leqslant K\leqslant 3^d(4^d+1)\) and \(1\leqslant j\leqslant J_k<\infty \) such that

  • the cubes \(\big ({\mathcal {C}}(x_j^{(k)})\big )_{\begin{array}{c} 1\leqslant k\leqslant K\\ 1\leqslant j\leqslant J_k \end{array}}\) cover the support of \(\rho \) and each \(x\in \mathbb {R}^d\) is in at most \(2^d\) such cubes,

  • for every k, the cubes \(\big ({\mathcal {C}}(x^{(k)}_j)\big )_{1\leqslant j\leqslant J_k}\) in the kth collection satisfy

    $$\begin{aligned} \textrm{d}\left( {\mathcal {C}}(x^{(k)}_j),{\mathcal {C}}(x^{(k)}_\ell )\right) \geqslant \frac{1}{2}\min \Big \{\ell (x^{(k)}_j),\ell (x^{(k)}_\ell )\Big \}. \end{aligned}$$

Proof

We start the proof by applying the standard Besicovitch covering lemma recalled above. We obtain \(K'\leqslant 4^d+1\) collections of disjoint cubes. To impose the minimal distance we separate each family into \(3^d\) subfamilies, which yields \(K\leqslant 3^d(4^d+1)\) collections in total. Specifically, we use that the maximal number of disjoint cubes of side length \(\geqslant \ell \) intersecting a cube of side length \(2\ell \) is at most \(3^d\). Thus if we look at a given cube of side length \(\ell \), at most \(3^d-1\) other bigger cubes can be at distance \(\leqslant \ell /2\). By induction we can thus always distribute all our cubes into \(3^d\) subfamilies, while ensuring the distance property for all the bigger cubes. \(\square \)

Using Lemma 17 we obtain the following partition of unity

$$\begin{aligned} {\mathbb {1} }_{{{\,\textrm{supp}\,}}\rho }=\sum _{k=1}^{K}\sum _{j=1}^{J_k}\frac{{\mathbb {1} }_{{\mathcal {C}}(x^{(k)}_j) \cap {{\,\textrm{supp}\,}}\rho }}{\eta }, \qquad {\mathbb {1} }_{{{\,\textrm{supp}\,}}\rho } \leqslant \eta {:}{=}\sum _{k=1}^{K}\sum _{j=1}^{J_k}{\mathbb {1} }_{{\mathcal {C}}(x^{(k)}_j)}\leqslant 2^d, \end{aligned}$$
(41)

which we are going to use to construct our trial state for the upper bound on \(G_T[\rho ]\). We split the proof into several steps. We start with the case \(\alpha >d\) and treat the special case \(\alpha =d\) at the very end.

Step 1: Less than one particle. If \(\int _{\mathbb {R}^d}\rho \leqslant 1\), we consider the probability \(\mathbb {P}=(\mathbb {P}_n)\) given by

$$\begin{aligned} \mathbb {P}_0=1-\int _{\mathbb {R}^d} \rho ,\qquad \mathbb {P}_1=\rho ,\qquad \mathbb {P}_n=0\text { for } n\geqslant 2, \end{aligned}$$
(42)

which has density \(\rho \) and no interaction energy. Its free energy is thus just equal to the entropy term

$$\begin{aligned} -T \mathcal {S}( \mathbb {P} ) = T\left( 1-\int _{\mathbb {R}^d}\rho \right) \log \left( 1-\int _{\mathbb {R}^d}\rho \right) +T\int _{\mathbb {R}^d}\rho \log \rho . \end{aligned}$$

The first term is nonpositive and thus we obtain the desired inequality

$$\begin{aligned} G_T[\rho ]\leqslant T\int _{\mathbb {R}^d}\rho \log \rho \qquad \text {for}\quad \int _{\mathbb {R}^d}\rho \leqslant 1. \end{aligned}$$
(43)

Step 2: Compactly supported densities (\(\alpha >d\)). Next we consider the case of a compactly supported density \(\rho \) with \(\int _{\mathbb {R}^d}\rho >1\). Using the partition (41) we write

$$\begin{aligned} \rho =\frac{1}{K}\sum _{k=1}^{K} \left( { \sum _{j}\rho _j^{(k)}}\right) ,\qquad \rho _j^{(k)}{:}{=}\frac{K\rho {\mathbb {1} }_{Q_j^{(k)}}}{\eta }, \end{aligned}$$

where we abbreviated \(Q_j^{(k)}={\mathcal {C}}(x^{(k)}_j)\) for simplicity. This is a (uniform) convex combination of the K densities \(\rho ^{(k)}=\sum _{j}\rho _j^{(k)}\). For fixed k, the functions \(\rho _j^{(k)}\) and \(\rho _{j'}^{(k)}\) with \(j\ne j'\) have disjoint supports, at a distance greater than or equal to \(\min \{\ell (x_j^{(k)}),\ell (x_{j'}^{(k)})\}/2\). In addition, we have

$$\begin{aligned} \int \rho _j^{(k)}=K\int _{Q_j^{(k)}} \frac{\rho }{\eta }\leqslant 3^d(4^d+1)\int _{Q_j^{(k)}}\rho \leqslant 1. \end{aligned}$$

This is the reason for our choice of the constant in (39). Our trial state is given by

$$\begin{aligned} \mathbb {P}{:}{=}\frac{1}{K}\sum _{k=1}^K\mathbb {P}^{(k)}, \end{aligned}$$

where

$$\begin{aligned} \mathbb {P}^{(k)}=\bigotimes _{j=1}^{J_k}\left( \left( 1-\int _{\mathbb {R}^d}\rho _j^{(k)}\right) \oplus \rho _j^{(k)}\oplus 0\oplus \ldots \right) , \end{aligned}$$

is the symmetrized tensor product of the states in (42), which has density \(\rho ^{(k)}\). Using the concavity of the entropy, our upper bound is, thus,

$$\begin{aligned} G_T[\rho ]&\leqslant \frac{1}{K}\sum _k \mathcal {G}_T(\mathbb {P}^{(k)})\\&\leqslant \frac{1}{K}\sum _{k=1}^{K}\bigg (\sum _{1\leqslant i<j\leqslant J_k}\iint _{\mathbb {R}^d\times \mathbb {R}^d} \rho _i^{(k)}(x)\rho _j^{(k)}(y)w(x-y)\,\textrm{d}x\,\textrm{d}y\\&\qquad +T\sum _{j=1}^{J_k}\int _{Q_j^{(k)}}\rho _j^{(k)}\log \rho _j^{(k)}\bigg ). \end{aligned}$$

We have

$$\begin{aligned} \frac{1}{K}\sum _{k=1}^{K}\sum _{j=1}^{J_k}\int _{Q_j^{(k)}}\rho _j^{(k)}\log \rho _j^{(k)}&=\frac{1}{K}\sum _{k=1}^{K}\sum _{j=1}^{J_k}\int _{Q_j^{(k)}}\rho _j^{(k)}\log \frac{K\rho }{\eta }\\&=\int _{\mathbb {R}^d}\rho \log \frac{K\rho }{\eta }\leqslant \int _{\mathbb {R}^d}\rho \log \rho +3d\int _{\mathbb {R}^d}\rho , \end{aligned}$$

since \(K\leqslant 15^d\leqslant e^{3d}\) and \(\eta \geqslant 1\). Thus we obtain

$$\begin{aligned} G_T[\rho ]&\leqslant \frac{1}{K}\sum _{k=1}^{K}\sum _{1\leqslant i<j\leqslant J_k} \iint _{\mathbb {R}^d\times \mathbb {R}^d}\rho _i^{(k)}(x)\rho _j^{(k)}(y)w(x-y)\,\textrm{d}x\,\textrm{d}y\\&\quad + T\int _{\mathbb {R}^d}\rho \log \rho +3Td\int _{\mathbb {R}^d}\rho . \end{aligned}$$
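
The constant \(3d\) in the entropy term can be checked directly. A brief sketch, using only \(K\leqslant 3^d(4^d+1)\), the elementary bound \(4^d+1\leqslant 5^d\) for \(d\geqslant 1\), and \(\eta \geqslant 1\) on the support of \(\rho \):

$$\begin{aligned} \log \frac{K}{\eta }\leqslant \log K\leqslant \log \big (3^d(4^d+1)\big )\leqslant \log \big (3^d\,5^d\big )=d\log 15\leqslant 3d, \end{aligned}$$

where the last step uses \(\log 15\approx 2.71\).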

Our next task is to estimate the interaction, for every fixed k. By Assumption 1 we have \(w\leqslant w_1+w_2\) with \(w_1(x)=\kappa (r_0/|x|)^{\alpha }{\mathbb {1} }(|x|<r_0)\) and \(w_2(x)=\kappa (1+|x|^s)^{-1}\). We first estimate the term involving the integrable potential \(w_2\) using Young’s inequality as

$$\begin{aligned}&\sum _{1\leqslant i<j\leqslant J_k}\iint _{\mathbb {R}^d\times \mathbb {R}^d}\rho _i^{(k)}(x)\rho _j^{(k)}(y)w_2(x-y)\,\textrm{d}x\,\textrm{d}y\\&\qquad \leqslant \frac{1}{2}\iint _{\mathbb {R}^d\times \mathbb {R}^d}\rho ^{(k)}(x)\rho ^{(k)}(y)w_2(x-y)\,\textrm{d}x\,\textrm{d}y\\&\qquad \leqslant \frac{ \left\| w_2 \right\| _{L^1}}{2}\int _{\mathbb {R}^d}(\rho ^{(k)})^2= \frac{ \left\| w_2 \right\| _{L^1}}{2} K^2\int _{\cup _i Q_i^{(k)}}\frac{\rho ^2}{\eta ^2}. \end{aligned}$$

After summing over k this gives

$$\begin{aligned} \frac{1}{K}\sum _{k=1}^K\sum _{1\leqslant i<j\leqslant J_k}\iint _{\mathbb {R}^d\times \mathbb {R}^d}\rho _i^{(k)}(x)\rho _j^{(k)}(y)w_2(x-y)\,\textrm{d}x\,\textrm{d}y\leqslant \frac{ \left\| w_2 \right\| _{L^1}}{2} K\int _{\mathbb {R}^d}\frac{\rho ^2}{\eta }. \end{aligned}$$

Using for instance

$$\begin{aligned} \int _{\mathbb {R}^d}w_2=\kappa |\mathbb {S}^{d-1}|\int _0^\infty \frac{r^{d-1}}{1+r^s}\,\textrm{d}r\leqslant \kappa |\mathbb {S}^{d-1}|\frac{s}{d ( s-d )}, \end{aligned}$$

and recalling that \(\eta \geqslant 1\) and \(K\leqslant 3^d(4^d+1)\), we obtain

$$\begin{aligned} \frac{1}{K} \sum _{k=1}^{K}\sum _{1\leqslant i<j\leqslant J_k}\iint _{\mathbb {R}^d\times \mathbb {R}^d}\rho _i^{(k)}(x)\rho _j^{(k)}(y)w_2(x-y)\,\textrm{d}x\,\textrm{d}y\\ \leqslant \kappa \frac{s|\mathbb {S}^{d-1}|}{2d(s-d)}3^{d}(4^d+1)\int _{\mathbb {R}^d}\rho ^2. \end{aligned}$$
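
The bound on \(\int _{\mathbb {R}^d}w_2\) used above follows by splitting the radial integral at \(r=1\). A sketch, assuming \(s>d\) so that the tail is integrable:

$$\begin{aligned} \int _0^\infty \frac{r^{d-1}}{1+r^s}\,\textrm{d}r\leqslant \int _0^1 r^{d-1}\,\textrm{d}r+\int _1^\infty r^{d-1-s}\,\textrm{d}r=\frac{1}{d}+\frac{1}{s-d}=\frac{s}{d(s-d)}. \end{aligned}$$

In dimension \(d=1\), where \(|\mathbb {S}^{0}|=2\), the same computation gives \(\int _{\mathbb {R}}(1+|x|^s)^{-1}\,\textrm{d}x\leqslant 2s/(s-1)\), which is the origin of the factor \(s/(s-1)\) in (36).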

Next we consider the more complicated term involving the singular part \(w_1=\kappa {\mathbb {1} }(|x|<r_0) (r_0/|x|)^{\alpha }\). To simplify our notation, we remove the superscript (k) and thus consider the collection \((\rho _j)_{j=1}^J\) of functions supported in the disjoint cubes \(Q_j\) with the safety distance. For every \(i\ne j\), using \(\int _{\mathbb {R}^d}\rho _j\leqslant 1\), we can estimate

$$\begin{aligned} \iint \rho _i(x)\rho _j(y)w_1(x-y)\,\textrm{d}x\,\textrm{d}y\leqslant \frac{\kappa r_0^\alpha }{\textrm{d}(Q_i,Q_j)^\alpha }. \end{aligned}$$

Recall that when \(|Q_i|\leqslant |Q_j|\), the distance \(\textrm{d}(Q_i,Q_j)\) is at least equal to \(\ell _i/2\), where \(\ell _i\) denotes the side length of \(Q_i\). We can order our J cubes so that the volume is increasing: \(|Q_1|\leqslant |Q_2|\leqslant \cdots \leqslant |Q_J|\). We need to estimate

$$\begin{aligned} \sum _{i=1}^{J-1}\sum _{j=i+1}^J\frac{1}{\textrm{d}(Q_i,Q_j)^\alpha }=\sum _{i=1}^{J-1}\frac{1}{\ell _i^\alpha }\sum _{j=i+1}^J\frac{1}{\textrm{d}({\mathcal {C}},Q'_{i,j})^\alpha }, \end{aligned}$$

where \({\mathcal {C}}=(-1/2,1/2)^d\) and for every i, we have denoted by \(Q'_{i,j}\) the cube centered at \((x_j-x_i)/\ell _i\), of volume \(|Q_j|/|Q_i|\geqslant 1\). To estimate the sum in j, we use the following lemma, which is based on the integrability at infinity of \(|x|^{-\alpha }\) and is similar to [61, Lemma 9].

Lemma 18

Let \({\mathcal {C}}=(-1/2,1/2)^d\) be the unit cube and consider any collection of non-intersecting cubes \(Q_j\) with the property that \(|Q_j|\geqslant 1\) and \(\textrm{d}({\mathcal {C}},Q_j)\geqslant \frac{1}{2}\). Then we have

$$\begin{aligned} \sum _j \frac{1}{\textrm{d}({\mathcal {C}},Q_j)^\alpha }\leqslant \frac{3^\alpha 2^{7d}d^2}{|\mathbb {S}^{d-1}|(\alpha -d)}. \end{aligned}$$
(44)

The constant on the right of (44) is not at all optimal and is only displayed for concreteness.

Proof of Lemma 18

Let \(X_j\in {\mathcal {C}}\) and \(Y_j\in Q_j\) be so that \(\textrm{d}({\mathcal {C}},Q_j)=|X_j-Y_j|\geqslant \frac{1}{2}\). For any \(x\in B(X_j,1/8)\) and \(y\in B(Y_j,1/8)\) we have

$$\begin{aligned} \frac{|X_j-Y_j|}{2}\leqslant |X_j-Y_j|-\frac{1}{4}\leqslant |x-y|\leqslant |X_j-Y_j|+\frac{1}{4}\leqslant \frac{3}{2}|X_j-Y_j|. \end{aligned}$$
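
The chain of inequalities above is the triangle inequality combined with \(|X_j-Y_j|\geqslant \frac{1}{2}\). A sketch of the lower bound (the upper bound is obtained in the same way):

$$\begin{aligned} |x-y|\geqslant |X_j-Y_j|-|x-X_j|-|y-Y_j|\geqslant |X_j-Y_j|-\frac{1}{8}-\frac{1}{8}\geqslant |X_j-Y_j|-\frac{|X_j-Y_j|}{2}, \end{aligned}$$

where the last step uses \(\frac{1}{4}\leqslant \frac{1}{2}|X_j-Y_j|\).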

Integrating over \(x\in {\mathcal {C}}\cap B(X_j,1/8)\) and \(y\in Q_j\cap B(Y_j,1/8)\) we obtain

$$\begin{aligned} \frac{1}{\textrm{d}({\mathcal {C}},Q_j)^\alpha }&=\frac{1}{|X_j-Y_j|^\alpha }\\&\leqslant \frac{(3/2)^\alpha }{|{\mathcal {C}}\cap B(X_j,1/8)|\;| Q_j\cap B(Y_j,1/8)|}\int _{\mathcal {C}}\int _{Q_j}\frac{\textrm{d}x\,\textrm{d}y}{|x-y|^\alpha }. \end{aligned}$$

The volume of the intersection of a ball of radius 1/8 with a cube of volume \(\geqslant 1\) to which its center belongs is bounded away from 0. It is in fact minimal when the center is located at a corner of the cube. Applying this to \(X_j\) in \({\mathcal {C}}\) and to \(Y_j\) in \(Q_j\) yields

$$\begin{aligned} |{\mathcal {C}}\cap B(X_j,1/8)|\geqslant \frac{|\mathbb {S}^{d-1}|}{2^{4d}d},\qquad |Q_j\cap B(Y_j,1/8)|\geqslant \frac{|\mathbb {S}^{d-1}|}{2^{4d}d}. \end{aligned}$$
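
These lower bounds can be recovered from the volume of a ball of radius 1/8. A sketch, assuming the worst case in which the center sits at a corner of a cube of side length at least 1, so that exactly a fraction \(2^{-d}\) of the ball lies inside the cube:

$$\begin{aligned} |B(0,1/8)|=|\mathbb {S}^{d-1}|\int _0^{1/8}r^{d-1}\,\textrm{d}r=\frac{|\mathbb {S}^{d-1}|}{d\,2^{3d}},\qquad 2^{-d}\,|B(0,1/8)|=\frac{|\mathbb {S}^{d-1}|}{2^{4d}d}. \end{aligned}$$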

Thus, we obtain

$$\begin{aligned} \frac{1}{\textrm{d}({\mathcal {C}},Q_j)^\alpha }\leqslant \frac{3^\alpha 2^{8d-\alpha }d^2}{|\mathbb {S}^{d-1}|^2}\int _{\mathcal {C}}\int _{Q_j}\frac{\textrm{d}x\,\textrm{d}y}{|x-y|^\alpha }. \end{aligned}$$

Summing over j using that the cubes are disjoint we obtain

$$\begin{aligned} \sum _j \frac{1}{\textrm{d}({\mathcal {C}},Q_j)^\alpha }\leqslant \frac{3^\alpha 2^{8d-\alpha }d^2}{|\mathbb {S}^{d-1}|^2}\int _{\mathcal {C}}\int _{\mathbb {R}^d}\frac{{\mathbb {1} }(|x-y|\geqslant \frac{1}{2}) \,\textrm{d}x\,\textrm{d}y}{|x-y|^\alpha }= \frac{3^\alpha 2^{7d}d^2}{|\mathbb {S}^{d-1}|(\alpha -d)}, \end{aligned}$$

as was claimed. \(\square \)
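
The final equality in the proof is a radial integral. A sketch, using \(|{\mathcal {C}}|=1\), \(\alpha >d\), and polar coordinates centered at x:

$$\begin{aligned} \int _{\mathcal {C}}\int _{\mathbb {R}^d}\frac{{\mathbb {1} }(|x-y|\geqslant \frac{1}{2})}{|x-y|^\alpha }\,\textrm{d}y\,\textrm{d}x&=|\mathbb {S}^{d-1}|\int _{1/2}^\infty r^{d-1-\alpha }\,\textrm{d}r=\frac{|\mathbb {S}^{d-1}|\,2^{\alpha -d}}{\alpha -d},\\ \frac{3^\alpha 2^{8d-\alpha }d^2}{|\mathbb {S}^{d-1}|^2}\cdot \frac{|\mathbb {S}^{d-1}|\,2^{\alpha -d}}{\alpha -d}&=\frac{3^\alpha 2^{7d}d^2}{|\mathbb {S}^{d-1}|(\alpha -d)}. \end{aligned}$$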

From the estimates (40) and (44), we deduce that

$$\begin{aligned}&\sum _{1\leqslant i<j\leqslant J_k}\iint _{|x-y|\leqslant r_0}\rho _i(x)\rho _j(y)w_1(x-y)\,\textrm{d}x\,\textrm{d}y\\&\quad \leqslant \kappa r_0^\alpha \frac{d^23^{d+2\alpha } 2^{7d}(4^d+1)^{1+\frac{\alpha }{d}}}{|\mathbb {S}^{d-1} |(\alpha -d)}\int _{\cup _iQ_i}\rho ^{1+\frac{\alpha }{d}}. \end{aligned}$$
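
The constant in this bound is the product of those in (40) and (44). A sketch of the bookkeeping, with \(\ell _i\) the side length of \(Q_i\) and the cubes ordered by increasing volume as above:

$$\begin{aligned} \sum _{i<j}\frac{\kappa r_0^\alpha }{\textrm{d}(Q_i,Q_j)^\alpha }&\leqslant \kappa r_0^\alpha \,\frac{3^\alpha 2^{7d}d^2}{|\mathbb {S}^{d-1}|(\alpha -d)}\sum _i\frac{1}{\ell _i^\alpha }\\&\leqslant \kappa r_0^\alpha \,\frac{3^\alpha 2^{7d}d^2}{|\mathbb {S}^{d-1}|(\alpha -d)}\;3^{\alpha +d}(4^d+1)^{1+\frac{\alpha }{d}}\sum _i\int _{Q_i}\rho ^{1+\frac{\alpha }{d}}. \end{aligned}$$

Here \(3^\alpha \cdot 3^{\alpha +d}=3^{d+2\alpha }\), and since the cubes \(Q_i\) are disjoint, the last sum equals \(\int _{\cup _iQ_i}\rho ^{1+\frac{\alpha }{d}}\).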

Using that \(\int _{\cup _iQ_i}\rho ^{1+\frac{\alpha }{d}} \leqslant \int _{\mathbb {R}^d} \rho ^{1+ \frac{\alpha }{d}}\) and averaging over k, we obtain our final estimate

$$\begin{aligned} G_T[\rho ]&\leqslant \kappa \frac{s|\mathbb {S}^{d-1}|}{2d(s-d)}3^{d}(4^d+1)\int _{\mathbb {R}^d}\rho ^2 +\kappa r_0^\alpha \frac{d^23^{d+2\alpha } 2^{7d}(4^d+1)^{1+\frac{\alpha }{d}}}{|\mathbb {S}^{d-1} |(\alpha -d)}\int _{\mathbb {R}^d}\rho ^{1+\frac{\alpha }{d}}\nonumber \\&\quad + T\int _{\mathbb {R}^d}\rho \log \rho +3Td\int _{\mathbb {R}^d}\rho . \end{aligned}$$
(45)

This is our final upper bound, with non-optimal constants only displayed for concreteness.

Step 3: General densities (\(\alpha >d\)). In order to be able to use the Besicovitch lemma, we restricted ourselves to compactly supported densities. We prove here that the exact same estimate holds for general densities. Let \(\rho \in (L^1\cap L^{1+\alpha /d})(\mathbb {R}^d,\mathbb {R}_+)\), \(\varepsilon \in (0,1)\) and write

$$\begin{aligned} \rho =(1-\varepsilon )\frac{\rho {\mathbb {1} }_{{\mathcal {C}}_L}}{1-\varepsilon }+\varepsilon \frac{\rho {\mathbb {1} }_{\mathbb {R}^d\setminus {\mathcal {C}}_L}}{\varepsilon }, \end{aligned}$$

with \(\mathcal {C}_L=(-L/2,L/2)^d\). Using the concavity of the entropy, we obtain

$$\begin{aligned} G_T[\rho ]\leqslant (1-\varepsilon )G_T\left[ \frac{\rho {\mathbb {1} }_{{\mathcal {C}}_L}}{1-\varepsilon }\right] +\varepsilon \, G_T\left[ \frac{\rho {\mathbb {1} }_{\mathbb {R}^d\setminus {\mathcal {C}}_L}}{\varepsilon }\right] . \end{aligned}$$
(46)

We choose L so large that

$$\begin{aligned} \int _{\mathbb {R}^d\setminus {\mathcal {C}}_L}\rho \leqslant \varepsilon , \end{aligned}$$

which allows us to use (43) for the second term on the right of (46). For the first term we just use Step 2. We find

$$\begin{aligned} G_T[\rho ]\leqslant \frac{C\kappa r_0^\alpha }{(1-\varepsilon )^{\frac{\alpha }{d}}} \int _{{\mathcal {C}}_L}\rho ^{1+\frac{\alpha }{d}}+\frac{C\kappa }{1-\varepsilon } \int _{{\mathcal {C}}_L}\rho ^{2}+CT\int _{{\mathcal {C}}_L}\rho +T\int _{\mathbb {R}^d}\rho \log \rho \\ +T\log \varepsilon ^{-1}\int _{\mathbb {R}^d\setminus {\mathcal {C}}_L}\rho +T\log (1-\varepsilon )^{-1}\int _{{\mathcal {C}}_L}\rho . \end{aligned}$$

By passing first to the limit \(L\rightarrow \infty \) and then \(\varepsilon \rightarrow 0\), we conclude that \(\rho \) satisfies the same estimate (45) as for compactly supported densities.
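
The passage to the limit can be made explicit. A sketch, writing \(m_L{:}{=}\int _{\mathbb {R}^d\setminus {\mathcal {C}}_L}\rho \) (a notation introduced only for this remark). The powers of \(1-\varepsilon \) come from the scaling of the density, and the two logarithmic error terms vanish in the stated order of limits:

$$\begin{aligned} (1-\varepsilon )\int _{{\mathcal {C}}_L}\left( \frac{\rho }{1-\varepsilon }\right) ^{1+\frac{\alpha }{d}}&=\frac{1}{(1-\varepsilon )^{\frac{\alpha }{d}}}\int _{{\mathcal {C}}_L}\rho ^{1+\frac{\alpha }{d}},\\ \lim _{L\rightarrow \infty }\log (\varepsilon ^{-1})\,m_L&=0\quad \text {for fixed } \varepsilon ,\\ \lim _{\varepsilon \rightarrow 0}\log \big ((1-\varepsilon )^{-1}\big )\int _{\mathbb {R}^d}\rho&=0. \end{aligned}$$

Since \((1-\varepsilon )^{-\alpha /d}\) and \((1-\varepsilon )^{-1}\) both tend to 1 as \(\varepsilon \rightarrow 0\), the constants of Step 2 are recovered in the limit.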

Step 4: Case \(\alpha =d\). The case when the core of the interaction behaves as \(w_1(x)=\kappa r_0^d |x|^{-d} {\mathbb {1} } ( |x|\leqslant r_0 )\) is similar to the previous situation, with some small changes. The function is not integrable around the origin, which requires a safety distance between the particles in our trial state. However, without a cutoff this interaction is also non-integrable at infinity, so we need to use that the core of our interaction is supported in the ball of radius \(r_0\). The following alternative to Lemma 18 is going to be useful.

Lemma 19

Let \({\mathcal {C}}_0=(-\ell _0/2,\ell _0/2)^d\) and consider any collection of non-intersecting cubes \(Q_j\) with the property that \(|Q_j|\geqslant \ell _0^d\) and \(\textrm{d}({\mathcal {C}}_0,Q_j)\geqslant \frac{\ell _0}{2}\). Then we have

$$\begin{aligned} \sum _j \frac{{\mathbb {1} }_{\textrm{d}({\mathcal {C}}_0,Q_j)\leqslant r_0}}{\textrm{d}({\mathcal {C}}_0,Q_j)^d}\leqslant C\ell _0^{-d}\left( \log \left( \frac{2r_0}{\ell _0}\right) \right) _+. \end{aligned}$$
(47)

Proof

We assume \(\ell _0\leqslant 2r_0\) otherwise there is nothing to prove. Let \(X_j\in {\mathcal {C}}_0\) and \(Y_j\in Q_j\) be such that \(\textrm{d}({\mathcal {C}}_0,Q_j)=|X_j-Y_j|\geqslant \frac{\ell _0}{2}\). For any \(x\in B(X_j,\ell _{0}/8)\) and \(y\in B(Y_j,\ell _{0}/8)\) we have

$$\begin{aligned} \frac{|X_j-Y_j|}{2}\leqslant |X_j-Y_j|-\frac{\ell _0}{4}\leqslant |x-y|\leqslant |X_j-Y_j|+\frac{\ell _0}{4}\leqslant \frac{3}{2}|X_j-Y_j|. \end{aligned}$$

Integrating over \(x\in {\mathcal {C}}_0\cap B(X_j,\ell _{0}/8)\) and \(y\in Q_j\cap B(Y_j,\ell _{0}/8)\) we obtain

$$\begin{aligned} \frac{{\mathbb {1} }_{\textrm{d}({\mathcal {C}}_0,Q_j)\leqslant r_0}}{\textrm{d}({\mathcal {C}}_0,Q_j)^d}&={} \frac{{\mathbb {1} }_{\textrm{d}({\mathcal {C}}_0,Q_j)\leqslant r_0}}{|X_j-Y_j|^d}\\&\leqslant {} \frac{(3/2)^d}{|{\mathcal {C}}_0\cap B(X_j,\ell _0/8)|\;|Q_j\cap B(Y_j,\ell _0/8)|}\int _{{\mathcal {C}}_0}\int _{Q_j}\frac{{\mathbb {1} }(|x-y|\leqslant r_0)}{|x-y|^d}\,\textrm{d}x\,\textrm{d}y. \end{aligned}$$

Summing over all cubes we get

$$\begin{aligned} \sum _{j}\frac{{\mathbb {1} }_{\textrm{d}({\mathcal {C}}_0,Q_j)\leqslant r_0}}{\textrm{d}({\mathcal {C}}_0,Q_j)^d} \leqslant \frac{2^{7d}3^dd}{|{\mathbb {S}}^{d-1}|\ell _0^{d}}\int _{\frac{\ell _0}{2}}^{r_0}r^{-1} \,\textrm{d}r =\frac{2^{7d}3^dd}{|{\mathbb {S}}^{d-1}|}\frac{\log (2r_0/\ell _0)}{\ell _0^{d}}. \end{aligned}$$

\(\square \)

Next we explain how to relate the right side of (47) with the density \(\rho \). Recall from (39) that

$$\begin{aligned} \int _{\mathcal {C}(x)}\rho (y)\,\textrm{d}y=\frac{1}{3^d(4^d+1)}, \end{aligned}$$
(48)

where \(\mathcal {C}(x)\) is the cube of side length \(\ell (x)\) centered at x. By Jensen’s inequality, we have for every convex function F

$$\begin{aligned} \ell (x)^dF\left( \frac{1}{\ell (x)^d3^d(4^d+1)}\right) = \ell ( x )^d F \left( {\frac{1}{\ell ( x )^d} \int _{\mathcal {C} ( x )} \rho }\right) \leqslant \int _{\mathcal {C}(x)}F\big (\rho (y)\big )\,\textrm{d}y. \end{aligned}$$
(49)

Applying this to

$$\begin{aligned} F(t)=t^2\Big (\log \big (6^d(4^d+1)r_0^dt\big )\Big )_+, \end{aligned}$$

we obtain

$$\begin{aligned}&\ell (x)^{-d}\left( \log \left( \frac{2r_0}{\ell (x)}\right) \right) _+\nonumber \\&\qquad \leqslant \frac{3^{2d}(4^d+1)^2}{d}\int _{\mathcal {C}(x)}\rho (y)^2\Big (\log \big (6^d(4^d+1)r_0^d\rho (y)\big )\Big )_+\,\textrm{d}y\nonumber \\&\qquad \leqslant \frac{3^{2d}(4^d+1)^2}{d}\int _{\mathcal {C}(x)}\rho (y)^2\Big (4d+\big (\log r_0^d\rho (y)\big )_+\Big )\,\textrm{d}y. \end{aligned}$$
(50)
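
The first inequality in (50) is (49) evaluated at the specific function F. A sketch of the arithmetic, writing \(t_x{:}{=}\big (\ell (x)^d3^d(4^d+1)\big )^{-1}\) (a notation introduced only for this remark):

$$\begin{aligned} 6^d(4^d+1)r_0^d\,t_x&=\left( \frac{2r_0}{\ell (x)}\right) ^d,\\ \ell (x)^dF(t_x)&=\frac{d}{3^{2d}(4^d+1)^2}\,\ell (x)^{-d}\left( \log \left( \frac{2r_0}{\ell (x)}\right) \right) _+. \end{aligned}$$

The second inequality in (50) uses \(\log \big (6^d(4^d+1)\big )\leqslant d\log 6+d\log 5=d\log 30\leqslant 4d\), together with \((a+b)_+\leqslant a+b_+\) for \(a\geqslant 0\).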

The rest of the proof is similar to the case \(\alpha >d\), using (50) and Lemma 19. We omit the details. This concludes the proof of Theorem 11.\(\square \)

Remark 20

(Hard-core case) The previous proof can be used in the hard core case \(\alpha =\infty \), under the (sub-optimal) condition that \(\int _{x+r_0{\mathcal {C}}}\rho <\frac{1}{3^d(4^d+1)}\) for all x, where \({\mathcal {C}}=(-1/2,1/2)^d\). The interaction can be bounded by \(\kappa C N\) as we have seen in (34), leading to the bound

$$\begin{aligned} G_T [\rho ]&\leqslant C\kappa \int _{\mathbb {R}^d}\rho +CT \int _{\mathbb {R}^d} \rho + T \int _{\mathbb {R}^d} \rho \log \rho . \end{aligned}$$

5 Proofs in the Canonical Case

5.1 The Local Radius R(x) in Optimal Transport

Here we explain how to construct canonical trial states using a result from optimal transport, in order to obtain bounds at zero temperature for a singular interaction (\(d\leqslant \alpha \leqslant \infty \)).

Consider any density \( \rho \) with \( \int \rho > 1 \), and recall the local radius R(x) from (24). Note that \( R ( x ) \) can never be zero because \( \rho \) as a measure does not have any point mass. The function R is connected to the Hardy–Littlewood maximal function \( M_\rho \), defined by

$$\begin{aligned} M_\rho ( x ) {:}{=} \sup _{r > 0} \frac{1}{|B_r|} \int _{B ( x,r )} \rho ( y ) \, \textrm{d}y, \end{aligned}$$
(51)

where \( |B_r| \) denotes the volume of a ball in \( \mathbb {R}^d \) of radius r. By definition of R it is clear that

$$\begin{aligned} \frac{1}{|B_1| R ( x )^d} = \frac{1}{|B_1| R ( x )^d} \int _{B ( x, R ( x ) )} \rho ( y ) \, \textrm{d}y \leqslant M_\rho ( x ), \end{aligned}$$

so that we have the pointwise bound

$$\begin{aligned} \frac{1}{R ( x )} \leqslant ( |B_1| M_\rho ( x ) )^{\frac{1}{d}}. \end{aligned}$$
(52)

Furthermore, using Hölder’s inequality gives for any \( p > 0 \),

$$\begin{aligned} 1 = \int _{B ( x, R ( x ) )} \rho \leqslant |B ( x, R ( x ) )|^{\frac{p}{p+d}} \left( {\int _{B ( x,R ( x ) )} \rho ^{1+\frac{p}{d}}}\right) ^{\frac{d}{p+d}}, \end{aligned}$$

implying for \( \rho \in L_{\textrm{loc}}^{1+\frac{p}{d}} ( \mathbb {R}^d ) \) the bound

$$\begin{aligned} \frac{1}{R ( x )^p} \leqslant |B_1|^{\frac{p}{d}} \int _{B ( x,R ( x ) )} \rho ^{1+\frac{p}{d}}. \end{aligned}$$
(53)
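
The step from Hölder's inequality to (53) is a matter of exponents. A sketch, using \( |B ( x, R ( x ) )| = |B_1| R ( x )^d \) and raising both sides to the power \( ( p+d ) / d \):

$$\begin{aligned} 1 \leqslant \left( |B_1| R ( x )^d \right) ^{\frac{p}{p+d}} \left( \int _{B ( x,R ( x ) )} \rho ^{1+\frac{p}{d}} \right) ^{\frac{d}{p+d}} \quad \Longrightarrow \quad 1 \leqslant |B_1|^{\frac{p}{d}} R ( x )^p \int _{B ( x,R ( x ) )} \rho ^{1+\frac{p}{d}}. \end{aligned}$$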

It is also apparent that R is 1-Lipschitz-continuous, see e.g. [14, Theorem 4.1]. One might also remark that R always stays away from zero, i.e.

$$\begin{aligned} R_{\rho } {:}{=} \min _{x \in \mathbb {R}^d} R ( x ) > 0. \end{aligned}$$
(54)

This is an immediate consequence of the facts that R is continuous and that, necessarily, \( \lim _{|x| \rightarrow \infty } R ( x ) = \infty \), because \( \rho \in L^1 ( \mathbb {R}^d ) \).

To obtain an upper bound on the canonical energy at a fixed density \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \), it is convenient to have existence of states \( \mathbb {P}\) in which the distance between the particles is bounded from below in terms of the function R from (24). The following is a consequence of a result from [14].

Theorem 21

(Optimal transport state) Let \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \) with \( N = \int _{\mathbb {R}^d} \rho \in \mathbb {N}\). There exists an N-particle state \( \mathbb {P}\) with density \( \rho _{\mathbb {P}} = \rho \) such that

$$\begin{aligned} |x_i - x_j| \geqslant \max \Big ( R_{\rho }, \tfrac{R(x_i) + R(x_j)}{3} \Big )\quad \text {for } 1\leqslant i\ne j\leqslant N. \end{aligned}$$
(55)

\(\mathbb {P}\)-almost everywhere, where R is the function defined by (24), and \( R_{\rho } \) is its minimum in (54).

Proof

The proof is a simple application of [14, Theorem 4.3]. For any \( 0< \eta < 1 \) (we will choose \( \eta = 1/3 \) in a moment) and any \( x \in \mathbb {R}^d \), define a set

Then we have for any \( 0< t < 1 -\eta \), using the Lipschitz continuity of R,

We wish to choose t and \( \eta \) such that the measure of the right-hand side is equal to one (with respect to the measure \( \rho \)). First, requiring the two balls to have the same radius leads to the choice \( t = \frac{1-\eta }{2} \). Next, we choose \( \eta \) such that \( \frac{\eta }{t} = \frac{\eta }{1-t-\eta } = \frac{2 \eta }{1-\eta } = 1 \), which implies \( \eta = 1/3 \).

Now, defining an open and symmetric set \( D \subseteq \mathbb {R}^d \times \mathbb {R}^d \) by

then satisfies

$$\begin{aligned} B ( x ) = B ( x, R_{\rho } ) \cup \widetilde{B} ( x ) \subseteq B ( x, R ( x ) ). \end{aligned}$$

Thus, by definition of R, we have \( \rho ( B ( x ) ) \leqslant \rho ( B ( x, R( x ) ) ) = 1 \), and since D is open and symmetric, [14, Theorem 4.3] asserts the existence of a \( \mathbb {P}\) with the claimed properties. Specifically, one can take \( \mathbb {P}\) to be the optimizer for the multi-marginal optimal transport problem associated to the cost

where A denotes the set containing the \((x_1,...,x_N)\) satisfying (55). \(\square \)
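In one dimension the separation property (55) can be verified directly, which gives a useful sanity check. The following Python sketch is only an illustration and is not the construction of [14]. It assumes that \( R ( x ) \) is the radius for which \( \rho ( B ( x, R ( x ) ) ) = 1 \), takes the hypothetical example density \( \rho ( x ) = 2 N x \) on \( [0,1] \) with \( N = 3 \), and places the particles at the quantile points \( x_k = F^{-1} ( F ( x_1 ) + k - 1 ) \), where F is the cumulative distribution function of \( \rho \). Two such points always enclose mass at least one, so both local radii are bounded by their distance, and (55) follows. All function names are ours.

```python
import numpy as np

N = 3  # total mass of rho = number of particles


def cdf(t):
    """F(t) = integral of rho over (-inf, t] for rho(x) = 2*N*x on [0, 1]."""
    return N * np.clip(t, 0.0, 1.0) ** 2


def inv_cdf(m):
    """Inverse of cdf on [0, N]."""
    return np.sqrt(np.asarray(m, dtype=float) / N)


def local_radius(x, iters=60):
    """Smallest r such that (x - r, x + r) has rho-mass one, found by bisection."""
    lo, hi = 0.0, 2.0  # r = 2 captures the whole mass N >= 1 for any x in [0, 1]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(x + mid) - cdf(x - mid) >= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


def quantile_configuration(u):
    """Points x_k = F^{-1}(F(u) + k - 1), k = 1..N; neighbours enclose mass one."""
    return inv_cdf(cdf(u) + np.arange(N))


def satisfies_separation(points, R_rho):
    """Check |x_i - x_j| >= max(R_rho, (R(x_i) + R(x_j)) / 3) for all pairs."""
    R = [local_radius(x) for x in points]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(points[i] - points[j]) < max(R_rho, (R[i] + R[j]) / 3):
                return False
    return True


# Grid approximation of R_rho = min_x R(x) over the support [0, 1].
R_rho = min(local_radius(x) for x in np.linspace(0.0, 1.0, 501))

# x_1 = u ranges over the first quantile cell, where F(u) < 1.
all_ok = all(
    satisfies_separation(quantile_configuration(u), R_rho)
    for u in np.linspace(0.01, 0.57, 57)
)
print(R_rho, all_ok)
```

In higher dimensions no such explicit ordering is available, which is why the general statement relies on the multi-marginal optimal transport result of [14].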

5.2 Proof of Theorem 11 in the Canonical Case at \(T=0\)

The existence of the state from Theorem 21 allows us to prove the last part of Theorem 11 about the canonical free energy at zero temperature. For convenience we state a proposition valid for any state \(\mathbb {P}\) for which the particles satisfy an inequality similar to (55). Along with Proposition 23 below (which covers the case \( \alpha = d \)), this immediately implies Theorem 11 in the canonical case.

Proposition 22

(Zero temperature energy bound, \( d< \alpha < \infty \)) Let w satisfy Assumption 1 with \( d< \alpha < \infty \). Let \(\mathbb {P}\) be any N-particle probability measure with one-body density \( \rho :=\rho _{\mathbb {P}} \) satisfying

$$\begin{aligned} |x_i - x_j| \geqslant \eta \big ( R ( x_i ) + R ( x_j )\big ) \qquad \text { for } 1\leqslant i \ne j\leqslant N, \end{aligned}$$
(56)

\(\mathbb {P}\)-almost everywhere, for some \( 0 < \eta \leqslant 1 \). Then the interaction energy in the state \( \mathbb {P}\) is bounded by

$$\begin{aligned} F_0 [ \rho ] \leqslant {} \mathcal {U}_{N} ( \mathbb {P} ) \leqslant {} \frac{C\kappa r_0^\alpha }{\eta ^{\alpha }} \int _{\mathbb {R}^d} \rho ( x )^{1+\frac{\alpha }{d}} \, \textrm{d}x + \frac{C\kappa }{\eta ^d} \int _{\mathbb {R}^d} \rho ( x )^2 \, \textrm{d}x \end{aligned}$$
(57)

with C a constant depending only on \(d,\alpha ,s\).

Proof of Proposition 22

In this proof we do not keep track of the exact value of the constants, since the argument relies on the non-explicit constant in the Hardy–Littlewood maximal inequality. Hence C denotes here a generic constant depending only on \(d,\alpha ,s\). By the assumptions on w, we have

$$\begin{aligned} \mathcal {U}_{N} ( \mathbb {P} ) \leqslant {} \kappa \int _{\mathbb {R}^{dN}} \sum _{1\leqslant i<j\leqslant N} \left( \frac{r_0^\alpha {\mathbb {1} }(|x_i-x_j|\leqslant r_0)}{|x_i - x_j|^{\alpha }} + \frac{1}{1+ |x_i-x_j|^s}\right) \textrm{d}\mathbb {P}( x ). \end{aligned}$$
(58)

Let \(x= ( x_1, \dotsc , x_N) \) be in the support of \(\mathbb {P}\). After permutation we can assume that \( R ( x_1 ) \leqslant R ( x_2 ) \leqslant \cdots \leqslant R ( x_N ) \). We fix the index i and consider the points \( x_i - x_j \) in \( \mathbb {R}^d \) for \( j = i+1, \dotsc , N \). Because of (56), these points are all at a distance at least \( \eta ( R( x_i ) + R ( x_j ) ) \) from the origin, and

$$\begin{aligned} |( x_i - x_j ) - ( x_i - x_k )| = |x_j - x_k| \geqslant \eta ( R ( x_j ) + R ( x_k ) ). \end{aligned}$$

Hence we can place \( N - i \) disjoint balls in \( \mathbb {R}^d \) with radii \( \eta R ( x_j ) \), centered at the points \( x_i - x_j \), respectively. Inside each of these balls, we place a smaller ball of radius \( \frac{\eta }{2} R ( x_j ) \), centered at

$$\begin{aligned} z_j = \left( {1- \frac{\eta R ( x_j ) }{2 |x_i-x_j|}}\right) ( x_i - x_j ). \end{aligned}$$

Then \( x_i - x_j \) is the point on the boundary of \( B ( z_j, \frac{\eta }{2} R ( x_j ) ) \) which is the farthest from the origin (see Fig. 2), so that

$$\begin{aligned} \frac{1}{|x_i - x_j|^{\alpha }} = \min _{y \in B ( z_j, \frac{\eta }{2} R ( x_j ) ) } \frac{1}{|y|^{\alpha }}. \end{aligned}$$
Fig. 2. Sketch of the construction

Note that the distance from \( B ( z_j, \frac{\eta }{2} R ( x_j ) ) \) to the origin is bounded from below by

$$\begin{aligned} \textrm{d}\left( 0, B ( z_j, \frac{\eta }{2} R ( x_j ) ) \right) \geqslant |z_j| - \frac{\eta }{2} R ( x_j ) = |x_i-x_j| - \eta R ( x_j ) \geqslant \eta R ( x_i ). \end{aligned}$$

Using this, along with the fact that all the balls are disjoint, we get the pointwise bound

$$\begin{aligned} \sum \limits _{j=i+1}^N \frac{1}{|x_i - x_j|^{\alpha }} \leqslant {}&\sum \limits _{j=i+1}^N \frac{1}{|B ( z_j,\frac{\eta }{2} R ( x_j ) )|} \int _{B ( z_j, \frac{\eta }{2} R ( x_j ) )} \frac{1}{|y|^{\alpha }} \, \textrm{d}y \nonumber \\ \leqslant {}&\frac{1}{|B ( 0,\frac{\eta }{2} R ( x_i ) )|} \int _{B ( 0, \eta R ( x_i ) )^c} \frac{1}{|y|^{\alpha }} \, \textrm{d}y \\ ={}&\frac{2^d}{|B_1|} \frac{1}{( \eta R ( x_i ) )^{\alpha }} \int _{B ( 0, 1 )^c} \frac{1}{|y|^{\alpha }} \, \textrm{d}y, \nonumber \end{aligned}$$
(59)

for \( \mathbb {P}\)-a.e. \( x \in \mathbb {R}^{dN} \). We conclude that the contribution to the energy from the core of w can be bounded by

$$\begin{aligned} \int _{\mathbb {R}^{dN}} \sum \limits _{i=1}^N \sum \limits _{j = i +1}^N \frac{1}{|x_i - x_j|^{\alpha }} \, \textrm{d}\mathbb {P}( x ) \leqslant {}&\frac{C}{\eta ^{\alpha }} \int _{\mathbb {R}^{dN}} \sum \limits _{i = 1}^N \frac{1}{R ( x_i )^{\alpha }} \, \textrm{d}\mathbb {P}( x ) \nonumber \\ ={}&\frac{C}{\eta ^{\alpha }} \int _{\mathbb {R}^d} \frac{\rho ( x )}{R ( x )^{\alpha }} \, \textrm{d}x. \end{aligned}$$
(60)

Similarly, we get for the contribution from the tail of w,

$$\begin{aligned} \sum \limits _{j = i + 1}^N \frac{1}{1 + |x_i - x_j|^s} \leqslant {}&\sum \limits _{j = i + 1}^N \frac{1}{ |B ( z_j, \frac{\eta }{2} R ( x_j ) )| } \int _{B ( z_j, \frac{\eta }{2} R ( x_j ) )} \frac{1}{1 + |y|^s} \, \textrm{d}y \\ \leqslant {}&\frac{2^d}{|B_1|} \frac{1}{( \eta R ( x_i ) )^d} \int _{\mathbb {R}^d} \frac{1}{1 + |y|^s} \, \textrm{d}y, \end{aligned}$$

so

$$\begin{aligned} \int _{\mathbb {R}^{dN}} \sum \limits _{i=1}^N \sum \limits _{j = i +1}^N \frac{1}{1+|x_i - x_j|^s} \, \textrm{d}\mathbb {P}( x ) \leqslant {} \frac{C}{\eta ^d} \int _{\mathbb {R}^d} \frac{\rho ( x )}{R ( x )^d} \, \textrm{d}x. \end{aligned}$$
(61)

Finally, recalling from (52) that \( R ( x ) \) is bounded from below in terms of the maximal function of \( \rho \), we apply the Hölder and Hardy–Littlewood maximal inequalities to obtain for any power \( p > 0 \),

$$\begin{aligned} \int _{\mathbb {R}^d} \frac{\rho ( x )}{R ( x )^p} \, \textrm{d}x \leqslant {}&C \int _{\mathbb {R}^d} \rho ( x ) ( M_\rho ) ( x )^{\frac{p}{d}} \, \textrm{d}x \\ \leqslant {}&C \left( { \int _{\mathbb {R}^d} \rho ( x )^{1+ \frac{p}{d}} \, \textrm{d}x }\right) ^{\frac{d}{d+p}} \left( {\int _{\mathbb {R}^d} ( M_\rho ) ( x )^{1+\frac{p}{d}} \, \textrm{d}x}\right) ^{\frac{p}{d+p}} \\ \leqslant {}&C \int _{\mathbb {R}^d} \rho ( x )^{1+\frac{p}{d}} \, \textrm{d}x. \end{aligned}$$

Using this on (60) and (61), and combining with (58), we obtain the claimed bound (57). \(\square \)
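The pointwise estimate (59), summed over i, can be tested numerically on a simple configuration. The sketch below is an illustration under assumptions of ours: dimension \( d = 1 \), the hypothetical density \( \rho = {\mathbb {1} }_{[0,N]} \) (for which \( R ( x ) = \max ( 1/2, 1 - x, 1 - ( N - x ) ) \) if R is the radius of the interval of mass one), and the lattice configuration \( x_k = u + k - 1 \), which satisfies (56) with \( \eta = 1/2 \). In one dimension the constant in (59) is \( \frac{2^d}{|B_1|} \int _{|y| \geqslant 1} |y|^{-\alpha } \, \textrm{d}y = \frac{2}{\alpha - 1} \).

```python
import numpy as np

N, alpha, eta = 5, 2.0, 0.5  # d = 1, exponent alpha > d, separation constant in (56)


def local_radius(x):
    """R(x) for rho = indicator of [0, N]: radius of the interval of mass one."""
    x = np.asarray(x, dtype=float)
    return np.maximum(0.5, np.maximum(1.0 - x, 1.0 - (N - x)))


def check_core_bound(u):
    """Compare sum_{i<j} |x_i - x_j|^-alpha with the summed right side of (59)."""
    x = u + np.arange(N)                      # lattice points u, u+1, ..., u+N-1
    R = local_radius(x)
    i, j = np.triu_indices(N, k=1)            # all pairs i < j
    dist = np.abs(x[i] - x[j])
    separated = bool(np.all(dist >= eta * (R[i] + R[j])))   # condition (56)
    core = float(np.sum(dist ** (-alpha)))
    const = 2.0 / (alpha - 1.0)               # (2^d/|B_1|) * int_{|y|>=1} |y|^-alpha
    bound = float(const / eta ** alpha * np.sum(R ** (-alpha)))
    return separated, core, bound


for u in (0.1, 0.5, 0.9):
    print(u, check_core_bound(u))
```

The bound is far from sharp for this configuration, as expected, since (59) only uses the disjointness of the balls and not their actual arrangement.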

Proposition 23

(Special case \( \alpha = d \)) Let w be an interaction satisfying Assumption 1 with \( \alpha = d \), and \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \) a density with \( \int \rho = N \). Then, for any N-particle probability measure \( \mathbb {P}\) with one-body density \( \rho _{\mathbb {P}} = \rho \) satisfying (56) for some \( 0 < \eta \leqslant 1 \), the interaction energy is bounded by

$$\begin{aligned} F_0 [ \rho ] \leqslant {}&\int _{\mathbb {R}^{dN}} \sum \limits _{1 \leqslant i < j \leqslant N} w ( x_i-x_j ) \, \textrm{d}\mathbb {P}( x_1, \dotsc , x_N ) \nonumber \\ \leqslant {}&\frac{\kappa r_0^d C}{\eta ^{2d}} \int _{\mathbb {R}^d} \rho ^2 \left( {\log \left( {\frac{c r_0^d}{\eta ^{2d}} \rho }\right) }\right) _+ + \frac{\kappa C}{\eta ^{2d}} \int _{\mathbb {R}^d} \frac{1}{1 + |y|^s} \, \textrm{d}y \int _{\mathbb {R}^d} \rho ^2, \end{aligned}$$
(62)

where the constants c and C depend only on the dimension d.

Proof

The proof goes along the same lines as the proof of Proposition 22. However, complications arise because \( 1/|x|^d \) is not integrable at infinity, so we need to take into account the finite range \( r_0 \) of the core of w. Incidentally, this also forces us to avoid using the Hardy–Littlewood maximal inequality later in the proof. Following the proof of Proposition 22 up to (59) and noting that \( B ( z_j, \frac{\eta }{2} R ( x_j ) ) \subseteq B ( 0, |x_i-x_j| ) \), we have

$$\begin{aligned} \sum \limits _{j=i+1}^N \frac{\mathbb {1} ( |x_i - x_j| \leqslant r_0 )}{|x_i - x_j|^d} \leqslant {}&\sum \limits _{j=i+1}^N \frac{\mathbb {1} ( |x_i - x_j| \leqslant r_0 )}{|B ( z_j,\frac{\eta }{2} R ( x_j ) )|} \int _{B ( z_j, \frac{\eta }{2} R ( x_j ) )} \frac{1}{|y|^d} \, \textrm{d}y \\ \leqslant {}&\frac{1}{|B ( 0,\frac{\eta }{2} R ( x_i ) )|} \int _{\eta R ( x_i ) \leqslant |y| \leqslant r_0} \frac{1}{|y|^d} \, \textrm{d}y \\ ={}&\frac{|\mathbb {S}^{d-1}|}{|B ( 0,\frac{\eta }{2} R ( x_i ) )|} \left( { \int _{\eta R ( x_i )}^{r_0} \frac{1}{r} \, \textrm{d}r }\right) _+ \\ ={}&\frac{2^d}{\eta ^d R ( x_i )^d} \left( { \log \left( { \frac{r_0^d}{\eta ^d R ( x_i )^d} }\right) }\right) _+. \end{aligned}$$

This leads to the bound

$$\begin{aligned}&\int _{\mathbb {R}^{dN}} \sum \limits _{i=1}^N \sum \limits _{j = i +1}^N w ( x_i - x_j ) \, \textrm{d}\mathbb {P}( x ) \nonumber \\&\quad \leqslant {} \kappa r_0^d \frac{2^d}{\eta ^d} \int _{\mathbb {R}^{dN}} \sum \limits _{i=1}^N \frac{1}{R ( x_i )^d} \left( {\log \left( { \frac{r_0^d}{\eta ^d R ( x_i )^d} }\right) }\right) _+ \, \textrm{d}\mathbb {P}( x ) \nonumber \\&\qquad + \kappa \frac{2^d}{|B_1| \eta ^d} \int _{\mathbb {R}^d} \frac{1}{1 + |y|^s} \, \textrm{d}y \int _{\mathbb {R}^{dN}} \sum \limits _{i=1}^N \frac{1}{ R ( x_i )^d} \, \textrm{d}\mathbb {P}( x ), \end{aligned}$$
(63)

where, in this case, we cannot use the Hardy–Littlewood maximal inequality on the first term. However, this can be circumvented using the fact that \( |x_i - x_j| \geqslant \eta ( R ( x_i ) + R ( x_j ) ) \) on the support of \( \mathbb {P}\), which is the content of Lemma 24 below. Using the lemma, we conclude

$$\begin{aligned} F_0 [ \rho ] \leqslant {}&\frac{\kappa r_0^d C}{\eta ^{2d}} \int _{\mathbb {R}^d} \rho ^2 \left( { \log \left( {\frac{2^d |B_1| r_0^d}{\eta ^{2d}} \rho }\right) }\right) _+ + \frac{\kappa C}{\eta ^{2d}} \int _{\mathbb {R}^d} \frac{1}{1 + |y|^s} \, \textrm{d}y \int _{\mathbb {R}^d} \rho ^2, \end{aligned}$$

where the constant C depends only on the dimension d. \(\square \)

Lemma 24

Let \( 0 \leqslant \rho \in L^1 ( \mathbb {R}^d ) \) be any density with \( \int \rho > 1 \), and take any configuration of points \( x_1, \dotsc , x_M \in \mathbb {R}^d \) satisfying \( |x_i - x_j| \geqslant \eta ( R ( x_i ) + R ( x_j ) ) \) for \( i \ne j \), for some \( 0 < \eta \leqslant 1 \). Then we have the bounds

$$\begin{aligned} \sum \limits _{i=1}^M \frac{1}{R ( x_i )^p} \leqslant \frac{C_{d,p}}{\eta ^p} \int _{\mathbb {R}^d} \rho ^{1+\frac{p}{d}} \end{aligned}$$
(64)

for any \( p > 0 \), and for any \( \lambda > 0 \),

$$\begin{aligned} \sum \limits _{i=1}^M \frac{1}{R ( x_i )^d} \left( {\log \left( {\frac{\lambda }{R ( x_i )^d}}\right) }\right) _+ \leqslant \frac{C_d}{\eta ^d} \int _{\mathbb {R}^d} \rho ^2 \left( {\log \left( { \frac{2^d \lambda }{\eta ^d} |B_1| \rho }\right) }\right) _+. \end{aligned}$$
(65)

Proof

We consider any configuration \( x_1, \dotsc , x_M \) as in the statement, and seek to provide a bound on the sum \( \sum _{i=1}^M \frac{1}{R ( x_i )^p} \). We order the points such that \( R ( x_1 ) \leqslant \cdots \leqslant R ( x_M ) \), and assume first for simplicity that all the balls \( B ( x_j, R ( x_j ) ) \) intersect the smallest ball \( B ( x_1, R ( x_1 ) ) \). The main idea of the following argument is to split the space \( \mathbb {R}^d \) into shells of exponentially increasing width, centered around \( x_1 \), and to argue that the number of points among \( x_2, \dotsc , x_M \) that can lie in each shell is universally bounded. To elaborate, take any \( \tau > 1 \) and consider for \( m \in \mathbb {N}_0 \) the spherical shell of points \( y \in \mathbb {R}^d \) satisfying

$$\begin{aligned} \tau ^m \eta R ( x_1 ) \leqslant |x_1 - y| < \tau ^{m+1} \eta R ( x_1 ). \end{aligned}$$
(66)

Note that if \( x_j \) lies in this shell, then by Lipschitz continuity of R,

$$\begin{aligned} \frac{2\eta }{1+\eta } R ( x_j ) \leqslant |x_1-x_j| < \tau ^{m+1} \eta R ( x_1 ), \end{aligned}$$

immediately implying that

$$\begin{aligned} R ( x_j ) < \frac{1+\eta }{2} \tau ^{m+1} R ( x_1 ). \end{aligned}$$
(67)

This means that the ball \( B ( x_j, \eta R ( x_j ) ) \) is contained in

$$\begin{aligned} B ( x_j, \eta R ( x_j ) ) \subseteq B ( x_1, |x_1 - x_j| + \eta R ( x_j ) ) \subseteq B \left( {x_1, \frac{3+\eta }{2} \tau ^{m+1} \eta R ( x_1 )}\right) . \end{aligned}$$

Furthermore, by the assumption that \( B ( x_1, R ( x_1 ) ) \cap B ( x_j, R ( x_j ) ) \ne \emptyset \), we have that

$$\begin{aligned} \frac{\tau ^m}{2} \eta R ( x_1 ) \leqslant \frac{1}{2} |x_1 - x_j| < \frac{1}{2} ( R ( x_1 ) + R ( x_j ) ) \leqslant R ( x_j ). \end{aligned}$$
(68)

Since the balls \( B ( x_j, \eta R ( x_j ) ) \) are all disjoint, we conclude that the number of \( x_j \)’s that can lie in the m’th shell around \( x_1 \) is bounded by the ratio of the volumes

Note also that no \( x_j \) can lie in the ball enclosed by the first shell (corresponding to \( |x_1- x_j| < \eta R ( x_1 ) \)), because we always have \( |x_1 - x_j| \geqslant \eta ( R ( x_1 ) + R ( x_j ) ) \) by assumption. Now, for any power \( p > 0 \), this allows us to bound, using (53),

$$\begin{aligned} \sum \limits _{j = 1}^M \frac{1}{R ( x_j )^p} \leqslant {}&4^d \tau ^d \sum \limits _{m=0}^{\infty } \frac{2^p}{( \tau ^m \eta R ( x_1 ) )^p} \leqslant {} \frac{ 2^{p+2d} \tau ^{p+d} }{ \eta ^p ( \tau ^p - 1 ) } \frac{1}{R ( x_1 )^p} \nonumber \\ \leqslant {}&\frac{ 2^{p+2d} \tau ^{p+d} }{ \eta ^p ( \tau ^p - 1 ) } |B_1|^{\frac{p}{d}} \int _{B( x_1, R ( x_1 ) )} \rho ( y )^{1+\frac{p}{d}} \, \textrm{d}y. \end{aligned}$$
(69)
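The second step in (69) is just the summation of a geometric series, \( \sum _{m \geqslant 0} \tau ^{-mp} = \tau ^p / ( \tau ^p - 1 ) \). As a quick arithmetic check, the following Python sketch compares a long partial sum with the closed form; the function names and the sample parameter values are ours and carry no special meaning.

```python
def shell_sum_partial(p, d, tau, eta, R1, terms=200):
    """Partial sum of 4^d tau^d * sum_m 2^p / (tau^m eta R1)^p over m < terms."""
    return 4 ** d * tau ** d * sum(
        2 ** p / (tau ** m * eta * R1) ** p for m in range(terms)
    )


def shell_sum_closed(p, d, tau, eta, R1):
    """Closed form 2^(p+2d) tau^(p+d) / (eta^p (tau^p - 1) R1^p) from (69)."""
    return 2 ** (p + 2 * d) * tau ** (p + d) / (eta ** p * (tau ** p - 1)) / R1 ** p


partial = shell_sum_partial(p=1.5, d=3, tau=2.0, eta=1 / 3, R1=0.7)
closed = shell_sum_closed(p=1.5, d=3, tau=2.0, eta=1 / 3, R1=0.7)
print(partial, closed)
```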

To bound the sum involving the logarithm, we note first that for any \( \lambda > 0 \), applying Jensen’s inequality to the function \( t \mapsto t^2 ( \log \lambda t )_+ \) yields

$$\begin{aligned} \frac{1}{R ( x )^d} \left( {\log \left( { \frac{\lambda }{R ( x )^d}}\right) }\right) _+ ={}&\frac{|B_1|}{|B ( x )|} \left( {\int _{B ( x )} \rho }\right) ^2 \left( {\log \left( {\frac{\lambda |B_1|}{|B ( x )|} \int _{B ( x )} \rho }\right) }\right) _+ \nonumber \\ \leqslant {}&|B_1| \int _{B ( x )} \rho ^2 ( \log ( \lambda |B_1| \rho ) )_+. \end{aligned}$$
(70)

Using this, we obtain by again summing over all the shells,

$$\begin{aligned} \sum \limits _{i=1}^M \frac{1}{R ( x_i )^d} \left( {\log \left( {\frac{\lambda }{R ( x_i )^d}}\right) }\right) _+ \leqslant {}&\sum \limits _{m=0}^{\infty } \frac{4^d \tau ^d 2^d}{( \tau ^m \eta R ( x_1 ) )^d} \left( {\log \left( {\frac{2^d \lambda }{( \tau ^m \eta R ( x_1 ) )^d}}\right) }\right) _+ \\ \leqslant {}&\frac{2^{3d} \tau ^d}{\eta ^d} \sum \limits _{m=0}^{\infty } \frac{1}{\tau ^{dm} R ( x_1 )^d} \left( {\log \left( {\frac{2^d \lambda }{\eta ^d R ( x_1 )^d}}\right) }\right) _+ \\ \leqslant {}&\frac{2^{3d} \tau ^{2d} |B_1|}{\eta ^d ( \tau ^d -1 )} \int _{B ( x_1, R ( x_1 ) )} \rho ^2 \left( { \log \left( { \frac{2^d \lambda }{\eta ^d} |B_1| \rho }\right) }\right) _+. \end{aligned}$$

Finally, we generalize to the case where not all the balls \( B ( x_j, R ( x_j ) ) \) intersect the smallest ball \( B ( x_1, R ( x_1 ) ) \). We split the configuration \( ( x_j )_{1 \leqslant j \leqslant M} \) into clusters \( \left( {x_j^{( k )}}\right) _{1 \leqslant j \leqslant n_k} \) with \( 1 \leqslant k \leqslant K \), such that:

  • For any k, \( R ( x_1^{( k )} ) \leqslant \cdots \leqslant R ( x_{n_k}^{( k )} ) \).

  • \( B ( x_1^{( k )}, R ( x_1^{( k )} ) ) \cap B ( x_j^{( k )}, R ( x_j^{( k )} ) ) \ne \emptyset \) for any j and k.

  • The balls \( B ( x_1^{( k )}, R ( x_1^{( k )} ) ) \) are all pairwise disjoint for \( k = 1, \dotsc , K \).

Then, using (69) on each cluster, we get for instance

$$\begin{aligned} \sum \limits _{j = 1}^M \frac{1}{R ( x_j )^p} \leqslant {}&\frac{ 2^{p+2d} \tau ^{p+d} }{ \eta ^p ( \tau ^p - 1 ) } |B_1|^{\frac{p}{d}} \sum \limits _{k=1}^K \int _{B ( x_1^{( k )}, R ( x_1^{( k )} ) )} \rho ( y )^{1+\frac{p}{d}} \, \textrm{d}y \\ \leqslant {}&\frac{ 2^{p+2d} \tau ^{p+d} }{ \eta ^p ( \tau ^p - 1 ) } |B_1|^{\frac{p}{d}} \int _{\mathbb {R}^d} \rho ( y )^{1+\frac{p}{d}} \, \textrm{d}y, \end{aligned}$$

which concludes the proof of (64). The bound (65) follows in the same way. \(\square \)

Remark 25

(Hard-core at zero temperature) As we have mentioned in Sect. 2.4, in the hard core case \(\alpha =+\infty \), we know from (34) that for any representable density \(\rho \), we have

$$\begin{aligned} F_0 [\rho ] \leqslant \frac{\kappa C}{r_0^s} \int _{\mathbb {R}^d} \rho ( x ) \, \textrm{d}x, \end{aligned}$$
(71)

where the constant C depends only on d and s. The problem is to determine when \(\rho \) is representable. By Theorem 21, this is the case for instance when \(R_\rho =\min _{x} R(x)\geqslant r_0\).

5.3 The Block Approximation

While the state from Theorem 21 is useful for obtaining energy bounds at zero temperature, it might be singular with respect to the Lebesgue measure on \( \mathbb {R}^{dN} \), leaving it unsuitable to use for the positive temperature case, because the entropy in this case will be infinite. Here we describe a simple way of regularizing states, while keeping the one-body density fixed, which is a slight generalization to any partition of unity of the construction in [9]. Essentially, it works by cutting \( \mathbb {R}^d \) into “blocks” and then locally replacing the state by a pure tensor product.

Let \( \sum \chi _j = \mathbb {1}_{\mathbb {R}^d} \) be any partition of unity, and \( \mathbb {P}\) any N-particle state with density \( \rho \). The corresponding block approximation is defined by

$$\begin{aligned} {\widetilde{\mathbb {P}}}:= \sum \limits _{i_1,...,i_N} \mathbb {P}( \chi _{i_1} \otimes \cdots \otimes \chi _{i_N} ) \frac{( \rho \chi _{i_1} ) \otimes \cdots \otimes ( \rho \chi _{i_N} ) }{\prod _{k=1}^N\int _{\mathbb {R}^d} \rho \chi _{i_k}}, \end{aligned}$$
(72)

where we denote

$$\begin{aligned} \mathbb {P}( \chi _{i_1} \otimes \cdots \otimes \chi _{i_N} ):= \int _{\mathbb {R}^{dN}} \chi _{i_1} \otimes \cdots \otimes \chi _{i_N} \, \textrm{d}\mathbb {P}. \end{aligned}$$

That is, \( \widetilde{\mathbb {P}} \) is a convex combination of tensor products of the normalized densities \( \frac{\rho \chi _i}{\int \rho \chi _i} \). One can easily show that \( \widetilde{\mathbb {P}} \) has one-body density \( \rho _{\widetilde{\mathbb {P}}} = \rho \). Furthermore, it is clear that \( \widetilde{\mathbb {P}} \) is a symmetric measure whenever \( \mathbb {P}\) is, so we can also write

$$\begin{aligned} \widetilde{\mathbb {P}} = \sum \limits _{i_1,...,i_N} \mathbb {P}( \chi _{i_1} \otimes \cdots \otimes \chi _{i_N} ) \Pi _s \left( { \frac{\rho \chi _{i_1}}{\int \rho \chi _{i_1}} \otimes \cdots \otimes \frac{\rho \chi _{i_N}}{\int \rho \chi _{i_N}} }\right) , \end{aligned}$$

where \( \Pi _s \) denotes the symmetrization operator in (31). In [9] the chosen partition of unity is just a tiling made of cubes, but in fact any partition works. Applying Jensen’s inequality yields the following.

Lemma 26

(Entropy of the block approximation) Suppose that the state \( \mathbb {P}\) and the partition of unity \( ( \chi _j ) \) are such that \( \chi _{i_1}, \dotsc , \chi _{i_N} \) all have disjoint supports whenever \( \mathbb {P}( \chi _{i_1} \otimes \cdots \otimes \chi _{i_N} ) \ne 0 \). Then we have

$$\begin{aligned}&\int _{\mathbb {R}^{dN}} \widetilde{\mathbb {P}} \log ( N! \, {\widetilde{\mathbb {P}}} ) \leqslant {} \int _{\mathbb {R}^d} \rho \log \rho + \int _{\mathbb {R}^d} \rho \sum \limits _i \chi _i \log \chi _i\nonumber \\&\quad -\sum \limits _{i} \left( {\int _{\mathbb {R}^d} \rho \chi _{i}}\right) \log \left( {\int _{\mathbb {R}^d} \rho \chi _{i}}\right) . \end{aligned}$$
(73)

Remark 27

Since \( ( \chi _i ) \) is a partition of unity, the term above involving \( \chi _i \log \chi _i \) can always be estimated from above by zero. On the other hand, it is not clear that the sum in the last term above is even finite for an arbitrary partition \( ( \chi _i ) \). However, it turns out to behave nicely in many situations. For instance, if \(\int \rho \chi _j\leqslant 1\) for all j, we can estimate it by 1/e times the number of terms in the partition of unity, which is typically finite when \(\rho \) has compact support.

Proof

The entropy of the block approximation can be estimated using Jensen’s inequality by

$$\begin{aligned}&\int \widetilde{\mathbb {P}} \log ( N! \, {\widetilde{\mathbb {P}}} ) \\&\ \leqslant \sum \limits _{i_1,...,i_N} \mathbb {P}( \chi _{i_1}\otimes \cdots \otimes \chi _{i_N} ) \int \Pi _s \left( \bigotimes _k \frac{\rho \chi _{i_k}}{\int \rho \chi _{i_k}} \right) \log \left( N! \, \Pi _s \bigotimes _k \frac{\rho \chi _{i_k}}{\int \rho \chi _{i_k}} \right) \\&\ = \sum \limits _{i_1,...,i_N} \mathbb {P}( \chi _{i_1}\otimes \cdots \otimes \chi _{i_N} ) \int \bigotimes _k \frac{\rho \chi _{i_k}}{\int \rho \chi _{i_k}} \log \left( \sum _{\sigma \in \mathfrak {S}_N}\bigotimes _k \frac{\rho \chi _{i_{\sigma (k)}}}{\int \rho \chi _{i_{\sigma (k)}}} \right) . \end{aligned}$$

We have here used the symmetry of \(\mathbb {P}\) to remove the first \(\Pi _s\). It is important that the N! has disappeared in the logarithm. For any non-zero term, the supports of the \( \chi _{i_k} \) are all disjoint, hence only the case \(\sigma =\text {Id}\) remains in the sum. Using that

$$\begin{aligned} \int \bigotimes _k \frac{\rho \chi _{i_k}}{\int \rho \chi _{i_k}} \log \left( \bigotimes _k \frac{\rho \chi _{i_k}}{\int \rho \chi _{i_k}} \right) = \sum \limits _{k=1}^N \int \frac{\rho \chi _{i_k}}{\int \rho \chi _{i_k}} \log \frac{ \rho \chi _{i_k}}{\int \rho \chi _{i_k}}, \end{aligned}$$

and plugging this into the previous expression, we conclude that (73) holds. \(\square \)
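The block approximation (72) and the entropy bound (73) have an exact discrete analogue, which makes them easy to test. The following Python sketch is an illustration under assumptions of ours: two particles (\( N = 2 \)) on a finite set of six sites, with the counting measure replacing the Lebesgue measure, and a tiling by three blocks given by integer labels. The state is chosen at random among symmetric states charging only pairs of sites in different blocks, so that the hypothesis of Lemma 26 holds. The function names are hypothetical.

```python
import numpy as np


def xlogx(t):
    """Elementwise t*log(t) with the convention 0*log(0) = 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = t[pos] * np.log(t[pos])
    return out


def block_approximation(P, labels):
    """Discrete two-particle analogue of (72) for a tiling given by block labels."""
    rho = P.sum(axis=0) + P.sum(axis=1)       # one-body density, total mass N = 2
    chi = np.array([labels == b for b in np.unique(labels)], dtype=float)
    m = chi @ rho                             # local masses m_i = sum(rho * chi_i)
    weights = chi @ P @ chi.T                 # P(chi_a (x) chi_b)
    P_tilde = np.zeros_like(P)
    for a, b in zip(*np.nonzero(weights)):
        P_tilde += weights[a, b] * np.outer(rho * chi[a], rho * chi[b]) / (m[a] * m[b])
    return rho, m, P_tilde


labels = np.array([0, 0, 1, 1, 2, 2])
rng = np.random.default_rng(0)
# Symmetric state supported on pairs of sites lying in different blocks.
Q = rng.random((6, 6)) * (labels[:, None] != labels[None, :])
P = (Q + Q.T) / (2 * Q.sum())

rho, m, P_tilde = block_approximation(P, labels)
rho_tilde = P_tilde.sum(axis=0) + P_tilde.sum(axis=1)
entropy_lhs = 0.5 * xlogx(2 * P_tilde).sum()      # sum of P~ log(2! P~)
entropy_rhs = xlogx(rho).sum() - xlogx(m).sum()   # chi log chi vanishes for a tiling
print(np.allclose(rho_tilde, rho), entropy_lhs, entropy_rhs)
```

For a tiling the functions \( \chi _i \) only take the values 0 and 1, so the middle term of (73) vanishes and only the two terms computed in `entropy_rhs` remain.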

5.4 Proof of Theorem 12 in the Canonical Case at \(T>0\)

We assume first that the density \( \rho \) is compactly supported, and then remove this assumption at the end. Applying the Besicovitch covering lemma [21, 35] to the cover \( ( B ( x, \varepsilon R ( x ) ) )_{x \in {{\,\textrm{supp}\,}}\rho } \) of the support of \( \rho \), with \( \varepsilon > 0 \), gives the existence of a (finite) set of points \( ( y_j ) \subseteq {{\,\textrm{supp}\,}}\rho \) satisfying that \( ( B_j ):= ( B ( y_j, \varepsilon R ( y_j ) ) ) \) covers the support of \( \rho \), and the multiplicity of the cover is universally bounded, i.e.,

$$\begin{aligned} 1 \leqslant \varphi ( x ):= \sum \limits _j {\mathbb {1} }_{B_j} ( x ) \leqslant C_d, \qquad x \in {{\,\textrm{supp}\,}}\rho , \end{aligned}$$

where the constant \( C_d \) depends only on the dimension d, and thus not on \( \varepsilon \) or \( \rho \). This gives us a partition of unity \( ( \chi _j ) \) defined by \( \chi _j:= \frac{{\mathbb {1} }_{B_j}}{\varphi } \). One way of constructing the Besicovitch cover is to inductively maximize \( \varepsilon R ( y_j ) \) over the remaining volume \( y_j \in {{\,\textrm{supp}\,}}\rho {\setminus } \bigcup _{k=1}^{j-1} B_k \), supposing that \( y_1, \dotsc , y_{j-1} \) have already been chosen. This construction implies the bound on the distances

$$\begin{aligned} |y_j - y_k| \geqslant \max ( \varepsilon R ( y_j ), \varepsilon R ( y_k ) ) \geqslant \frac{\varepsilon }{2} ( R ( y_j ) + R ( y_k ) ), \end{aligned}$$
(74)

for all \( j \ne k \).
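The inductive construction of the cover and the resulting distance bound (74) can be mimicked on a finite sample of the support. The Python sketch below is a one-dimensional illustration under assumptions of ours: the hypothetical density \( \rho = {\mathbb {1} }_{[0,N]} \) with \( N = 4 \), for which \( R ( x ) = \max ( 1/2, 1 - x, 1 - ( N - x ) ) \) if R is the radius of the interval of mass one, a uniform grid in place of the support, and \( \varepsilon = 0.1 \). It is a greedy stand-in for the covering lemma, not a proof of it.

```python
import numpy as np

N, eps = 4, 0.1


def local_radius(x):
    """R(x) for rho = indicator of [0, N]: radius of the interval of mass one."""
    x = np.asarray(x, dtype=float)
    return np.maximum(0.5, np.maximum(1.0 - x, 1.0 - (N - x)))


def greedy_cover(samples, eps):
    """Pick centers by repeatedly maximizing eps*R over samples not yet covered."""
    R = local_radius(samples)
    uncovered = np.ones(len(samples), dtype=bool)
    centers = []
    while uncovered.any():
        idx = np.flatnonzero(uncovered)
        j = idx[np.argmax(R[idx])]            # largest radius among uncovered points
        centers.append(j)
        # Remove everything in the open ball B(y_j, eps * R(y_j)).
        uncovered &= np.abs(samples - samples[j]) >= eps * R[j]
    return np.array(centers)


samples = np.linspace(0.0, N, 801)
centers = greedy_cover(samples, eps)
y, Ry = samples[centers], local_radius(samples[centers])

gap = np.abs(y[:, None] - y[None, :])                     # |y_j - y_k|
lower = eps * np.maximum(Ry[:, None], Ry[None, :])        # right side of (74)
off_diag = ~np.eye(len(y), dtype=bool)
multiplicity = (np.abs(samples[:, None] - y[None, :]) < eps * Ry[None, :]).sum(axis=1)
print(len(y), bool(np.all(gap[off_diag] >= lower[off_diag])), multiplicity.max())
```

The check of (74) works for the same reason as in the text: a center chosen later was still uncovered when the earlier one was picked, and the earlier one had the larger radius.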

We now take the optimal transport state \( \mathbb {P}\) obtained from Theorem 21, and denote by \( m_j:= \int \rho \chi _j = \int _{B_j} \frac{\rho }{\varphi } \) the local mass of \( \rho \) with respect to the partition of unity \( ( \chi _j ) \). As a trial state for the free energy, we take the block approximation (72) of \( \mathbb {P}\) using the \( \chi _j \), i.e.,

$$\begin{aligned} \mathbb {P}_{\varepsilon }:={} \sum \limits _{j_1, \dotsc , j_N} \mathbb {P} ( \chi _{j_1} \otimes \cdots \otimes \chi _{j_N} ) \left( {\frac{\rho \chi _{j_1}}{m_{j_1}} }\right) \otimes \cdots \otimes \left( {\frac{\rho \chi _{j_N}}{m_{j_N}} }\right) . \end{aligned}$$

We show that the support of \( \mathbb {P}_{\varepsilon } \) satisfies the condition (56) for some \( \eta \). For any point \( ( x_1, \dotsc , x_N ) \in {{\,\textrm{supp}\,}}\mathbb {P}_{\varepsilon } \), there must be a term in the sum above such that \( \mathbb {P}( \chi _{j_1} \otimes \cdots \otimes \chi _{j_N} ) \ne 0 \), and \( x_k \in B_{j_k} = B( y_{j_k}, \varepsilon R ( y_{j_k} ) ) \) for all k. In particular, since the support of \( \mathbb {P}\) satisfies (55), there exist \( z_1, \dotsc , z_N \) with \( z_k \in B_{j_k} \) and \( |z_k - z_{\ell }| \geqslant \frac{1}{3} ( R ( z_k ) + R ( z_{\ell } ) ) \) for any \( k \ne \ell \). By the Lipschitz continuity of R, \( x_k \in B_{j_k} \) implies that \( R ( y_{j_k} ) \leqslant \frac{1}{1-\varepsilon } R ( x_k ) \), so

$$\begin{aligned} |x_k - z_k| \leqslant 2 \varepsilon R ( y_{j_k} ) \leqslant \frac{2\varepsilon }{1-\varepsilon } R ( x_k ). \end{aligned}$$

Finally, this gives us the bound

$$\begin{aligned} |x_k - x_{\ell }| \geqslant {}&|z_k - z_{\ell }| - |x_k - z_k| - |x_{\ell } - z_{\ell }| \nonumber \\ \geqslant {}&\frac{1}{3} ( R ( z_k ) + R ( z_{\ell } ) ) - |x_k - z_k| - |x_{\ell } - z_{\ell }| \nonumber \\ \geqslant {}&\frac{1}{3} ( R ( x_k ) + R ( x_{\ell } ) ) - \frac{4}{3} ( |x_k - z_k| + |x_{\ell } - z_{\ell }| ) \nonumber \\ \geqslant {}&\frac{1}{3} \left( {1 - \frac{8 \varepsilon }{1-\varepsilon }}\right) ( R ( x_k ) + R ( x_{\ell } ) ). \end{aligned}$$
(75)

This argument also shows that if \( \mathbb {P}( \chi _{j_1} \otimes \cdots \otimes \chi _{j_N} ) \ne 0 \), then the sets \( B_{j_k} \) are disjoint for \( k = 1, \dotsc , N \), provided that \( \varepsilon < \frac{1}{9} \).

Now, since \( \mathbb {P}_{\varepsilon } \) satisfies (75), it follows from Proposition 22 that the interaction energy (in case \( \alpha > d \)) is bounded by

$$\begin{aligned} \mathcal {U}_{N} ( \mathbb {P}_{\varepsilon } ) \leqslant {} C\kappa r_0^\alpha \int _{\mathbb {R}^d} \rho ^{1+\frac{\alpha }{d}} + C\kappa \int _{\mathbb {R}^d} \rho ^2, \end{aligned}$$

and similarly for \( \alpha = d \), using Proposition 23. Thus, to show (25), it only remains to provide a bound on the entropy of the state \( \mathbb {P}_{\varepsilon } \). First, applying Lemma 26 immediately gives

$$\begin{aligned} \int _{\mathbb {R}^{dN}} \mathbb {P}_{\varepsilon } \log ( N! \, \mathbb {P}_{\varepsilon } ) \leqslant {} \int _{\mathbb {R}^d} \rho \log \rho - \sum \limits _{j} m_j \log m_j. \end{aligned}$$

Then, for any numbers \( s \geqslant 0 \) and \( t > 0 \), we can use the elementary bound

$$\begin{aligned} - s \log ( t s ) \leqslant \frac{1}{et}, \end{aligned}$$

to conclude that

$$\begin{aligned} - \sum \limits _{j} m_j \log m_j ={}&\sum \limits _{j} \Big ( m_j \log ( R( y_j )^d ) - m_j \log ( R ( y_j )^d m_j ) \Big ) \\ \leqslant {}&\sum \limits _{j} \left( d \int \rho ( x ) \chi _{j} ( x ) \log ( ( 1+\varepsilon ) R ( x ) ) \, \textrm{d}x + \frac{1}{e R ( y_j )^d} \right) \\ \leqslant {}&d \log ( 1+\varepsilon ) \int _{\mathbb {R}^d} \rho + d \int _{\mathbb {R}^d} \rho \log R + \frac{C}{\varepsilon ^d} \int _{\mathbb {R}^d} \rho ^2, \end{aligned}$$

where the last inequality uses (74) and Lemma 24. This proves Theorem 12 for compactly supported densities. \(\square \)
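The elementary bound \( - s \log ( t s ) \leqslant \frac{1}{et} \) used in the entropy estimate above follows by maximizing over s: the derivative \( - \log ( t s ) - 1 \) vanishes at \( s = 1/(et) \), where the function takes the value \( 1/(et) \). A minimal numerical confirmation in Python, on a grid and for a few sample values of t chosen by us, reads as follows.

```python
import numpy as np


def neg_s_log_ts(s, t):
    """The function -s*log(t*s), extended by 0 at s = 0."""
    s = np.asarray(s, dtype=float)
    out = np.zeros_like(s)
    pos = s > 0
    out[pos] = -s[pos] * np.log(t * s[pos])
    return out


s_grid = np.linspace(0.0, 5.0, 100001)
# Largest value on the grid minus the claimed maximum 1/(e t); should be <= 0.
worst_gap = {
    t: float(np.max(neg_s_log_ts(s_grid, t)) - 1.0 / (np.e * t))
    for t in (0.1, 1.0, 7.5)
}
print(worst_gap)
```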

Remark 28

(Hard-core case) In the hard core case \(\alpha =+\infty \), the above proof provides the bound

$$\begin{aligned} F_T [\rho ]&\leqslant C\frac{\kappa }{r_0^d}\int _{\mathbb {R}^d}\rho +CT \int _{\mathbb {R}^d} \rho + T \int _{\mathbb {R}^d} \rho \log \rho +\frac{CTr_0^d}{(R_\rho -r_0)^d}\int _{\mathbb {R}^d}\rho ^2\nonumber \\&+T \int _{\mathbb {R}^d} \rho \log R^d, \end{aligned}$$
(76)

under the assumption that \(R_\rho =\min _x R(x)>r_0\), where C only depends on d and s. The main difference is the estimate on the distance between the particles in (75). We need to keep the maximum and use

$$\begin{aligned} |x_k - x_{\ell }| \geqslant {}&\max \left\{ R_\rho ,\frac{1}{3} ( R ( x_k ) + R ( x_{\ell } ) )\right\} - \frac{8 \varepsilon }{3(1-\varepsilon )} ( R ( x_k ) + R ( x_{\ell } ) )\\ \geqslant {}&\left( 1-\frac{8\varepsilon }{1-\varepsilon }\right) R_\rho . \end{aligned}$$

Taking \(\varepsilon =\min (R_\rho /r_0-1,1)/100\) provides (76).

5.5 Removal of the Compactness Condition

To finish this section we describe how to extend a result holding for compactly supported densities to general integrable ones, using this time a compactness argument.

Theorem 29

Assume that w satisfies Assumption 1. If we have for some \(1\leqslant p\leqslant q<\infty \) with \(q\geqslant 2\) and some constants \(C_j\geqslant 0\)

$$\begin{aligned} F_T[\rho ]&\leqslant C_0\int \rho +C_1\int \rho ^{p}+C_2\int \rho ^q +T\int \rho \log \rho \nonumber \\&+C_3\int \rho ^2 (\log \rho )_++C_4\int \rho \log R, \end{aligned}$$
(77)

for all \(\rho \in L^1\cap L^q\) of compact support, then the same holds with the same constants for all \(\rho \in L^1\cap L^q\). If \(T>0\) we assume in both cases that \(\int _{\mathbb {R}^d}\rho |\log \rho |<\infty \).

Proof

Let us first assume \(C_4=0\) for simplicity. Our proof uses that the energy \(\rho \mapsto F_T[\rho ]\) is lower semi-continuous for the strong topology of \(L^1\), as previously mentioned in Remark 3, that is,

$$\begin{aligned}&F_T[\rho ]\leqslant \liminf _{n\rightarrow \infty }F_T[\rho _n]\nonumber \\&\quad \text {if } \rho _n\rightarrow \rho \text { strongly in } L^1(\mathbb {R}^d) \text { with } \int \rho _n^{q}+T\int \rho _n|\log \rho _n|\leqslant C. \end{aligned}$$
(78)

The theorem then follows by taking

$$\begin{aligned} \rho _n:=\frac{N}{\int _{B_n}\rho }\;\rho {\mathbb {1} }_{B_n} \end{aligned}$$

to be the truncation of \(\rho \) to the ball \(B_n\) of radius n, renormalized to have mass \(N=\int \rho \). Note that \(\rho _n\leqslant (1+o(1))\rho \). The sequence \(\rho _n\) clearly satisfies the convergence properties of (78) and therefore the lower semi-continuity provides

$$\begin{aligned} F_T[\rho ]&\leqslant {} \! \liminf _{n\rightarrow \infty } F_T[\rho _n] \\&\leqslant {}\liminf _{n\rightarrow \infty } \Bigg \{C_0N+ C_1 \left( { \frac{N}{\int _{B_n}\rho } }\right) ^p \int _{B_n}\rho ^p+C_2 \left( { \frac{N}{\int _{B_n}\rho }}\right) ^q\int _{B_n}\rho ^q \\&\quad +T\frac{N}{\int _{B_n}\rho }\int _{B_n}\rho \log \rho +T\frac{N}{\int _{B_n}\rho } \log \left( {\frac{N}{\int _{B_n}\rho }}\right) \int _{B_n}\rho \\&\quad +C_3 \left( {\frac{N}{\int _{B_n}\rho } }\right) ^2\int _{B'_n}\rho ^2\log \rho +C_3 \left( {\frac{N}{\int _{B_n}\rho }}\right) ^2 \log \left( {\frac{N}{\int _{B_n}\rho } }\right) \int _{B'_n}\rho ^2\Bigg \}\\&={} C_0N+C_1\int \rho ^{p}+C_2\int \rho ^q +T\int \rho \log \rho +C_3\int \rho ^2(\log \rho )_+, \end{aligned}$$

where \(B'_n:=B_n\cap \big \{\rho \geqslant N^{-1}\int _{B_n}\rho \big \}\).
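To see the truncation at work on an explicit example, one may take the hypothetical one-dimensional density \( \rho ( x ) = \frac{N}{\pi ( 1 + x^2 )} \), which is our choice and not taken from the text. Then \( \int _{B_n} \rho = \frac{2N}{\pi } \arctan n \), and a direct computation gives \( \Vert \rho _n - \rho \Vert _{L^1} = 2 ( N - \int _{B_n} \rho ) \), since the renormalization adds inside \( B_n \) exactly the mass that was cut off outside. The Python sketch below evaluates these closed forms.

```python
import math

N = 3  # total mass of the example density rho(x) = N / (pi * (1 + x^2))


def mass_in_ball(n):
    """Integral of rho over B_n = (-n, n)."""
    return N * (2.0 / math.pi) * math.atan(n)


def normalization(n):
    """Factor N / int_{B_n} rho multiplying rho * 1_{B_n} in rho_n."""
    return N / mass_in_ball(n)


def l1_distance(n):
    """||rho_n - rho||_1 = (c_n - 1) M_n + (N - M_n) = 2 (N - M_n)."""
    return 2.0 * (N - mass_in_ball(n))


for n in (1, 10, 100, 1000):
    print(n, normalization(n), l1_distance(n))
```

The normalization factor decreases to one, which is the statement \( \rho _n \leqslant ( 1 + o ( 1 ) ) \rho \), and the \( L^1 \) distance tends to zero, as required in (78).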

When \(C_4>0\) the proof is similar. We need to use that \((1+|x|)/C\leqslant R(x),R_n(x)\leqslant C(1+|x|)\) for some \(C>0\) (depending on \(\rho \)), where \(R_n(x)\) is the local radius of the truncated density \(\rho _n\), which converges locally to R. The uniform bounds on R and \(R_n\) imply that we must work under the assumption that \(\int \rho (\log |x|)_+\) is finite (otherwise there is nothing to show). The limit follows from dominated convergence.

For the convenience of the reader, we conclude by quickly recalling the proof of the lower semi-continuity (78). We consider an arbitrary sequence \(\rho _n\) converging to \(\rho \) strongly in \(L^1\) and satisfying the bounds in (78). It is known that there exists an optimal \(\mathbb {P}_n\) for \(F_T[\rho _n]\) (but we could as well use a quasi-minimizer). From the upper bound we have \(F_T[\rho _n]\leqslant C\) for some constant C and therefore

$$\begin{aligned} C\geqslant {} F_T[\rho _n]&={} \mathcal {F}_T(\mathbb {P}_n)\\&={} \int _{(\mathbb {R}^d)^N} \sum _{1\leqslant j<k\leqslant N}w(x_j-x_k) \, \mathbb {P}_n+T\int \mathbb {P}_n\log (N! \, \mathbb {P}_n)\\&={} \int _{(\mathbb {R}^d)^N} \left( {\sum _{1\leqslant j<k\leqslant N}w(x_j-x_k)+\kappa N}\right) \mathbb {P}_n+T\int \mathbb {P}_n\log \left( \frac{\mathbb {P}_n}{(\rho _n/N)^{\otimes N}}\right) \\&\quad -\kappa N+T\int \rho _n\log \rho _n+T\log \frac{N!}{N^N}. \end{aligned}$$

The first term is non-negative from the stability property of w and the second is a relative entropy, hence is also non-negative. We have thus proved that

$$\begin{aligned} T\int \mathbb {P}_n\log \mathbb {P}_n\leqslant C(\rho ,N,T), \end{aligned}$$

where the constant can depend on \(\rho ,N,T\) but not on n. On the other hand, we know that the sequence \((\mathbb {P}_n)\) is tight, that is,

$$\begin{aligned} \int _{\max |x_j|\geqslant R}\,\textrm{d}\mathbb {P}_n\leqslant \int \sum _{j=1}^N{\mathbb {1} }(|x_j|\geqslant R)\,\textrm{d}\mathbb {P}_n=\int _{|x|\geqslant R}\rho _n, \end{aligned}$$

where the right side is small for \(R\) large, uniformly in \(n\), due to the strong convergence in \(L^1\). After extraction of a subsequence, this implies that there exists a probability measure \(\mathbb {P}\) such that \(\int F\,\textrm{d}\mathbb {P}_n\rightarrow \int F\,\textrm{d}\mathbb {P}\) for every \(F\in C^0_b\). Taking \(F(x_1,...,x_N)=\sum _{j=1}^Nf(x_j)\) with \(f\in C^0_b\), we find that \(\int f\rho _{\mathbb {P}_n}\rightarrow \int f\rho _{\mathbb {P}}\), that is, \(\rho _{\mathbb {P}}=\rho \). In addition, we have (by convexity)

$$\begin{aligned} T\int \mathbb {P}\log \mathbb {P}\leqslant T\liminf _{n\rightarrow \infty }\int \mathbb {P}_n\log \mathbb {P}_n\leqslant C. \end{aligned}$$
(79)

Hence \(\mathbb {P}\) is admissible for \(F_T[\rho ]\), and absolutely continuous with respect to the Lebesgue measure if \(T>0\). We thus have

$$\begin{aligned} \liminf _{n\rightarrow \infty }\int F\,\textrm{d}\mathbb {P}_n\geqslant \int F\,\textrm{d}\mathbb {P}, \end{aligned}$$

for every measurable function \(F\geqslant 0\) if \(T>0\) (using the absolute continuity of \(\mathbb {P}\)) and for every lower semi-continuous function \(F\geqslant 0\) if \(T=0\). This applies to our interaction \(w\) by Assumption 1 and we therefore obtain, as desired,

$$\begin{aligned}&\liminf _{n\rightarrow \infty }\int _{(\mathbb {R}^d)^N} \left( { \sum _{1\leqslant j<k\leqslant N}w(x_j-x_k)+\kappa N }\right) \textrm{d}\mathbb {P}_n \\&\quad \geqslant {} \int _{(\mathbb {R}^d)^N} \left( { \sum _{1\leqslant j<k\leqslant N}w(x_j-x_k)+\kappa N }\right) \textrm{d}\mathbb {P}. \end{aligned}$$

Together with the entropy bound (79) when \(T>0\), this proves that

$$\begin{aligned} \liminf _{n\rightarrow \infty }\left( F_T[\rho _n]+\kappa N\right) \geqslant F_T[\rho ]+\kappa N \end{aligned}$$

which is the claimed lower semi-continuity (78). \(\square \)

6 Proof of Theorem 14 in the Hard-Core Case

In this section we prove Theorem 14 concerning densities which are uniformly bounded in terms of the tight packing density \(\rho _c(d)\). We start by constructing a trial state with constant density by averaging a periodic tight packing over translations. Such a uniform average of a periodic lattice is often called a “floating crystal” [64, 65] in Physics and Chemistry. Finally, we estimate the entropic cost of “geometrically localizing” [60] this state to enforce the desired density.

Step 1: Constant density. We have assumed \(\rho \leqslant (1-\varepsilon )^dr_0^{-d}\rho _c(d)\). Let \(\eta >0\) be a fixed small number which will later be chosen in terms of \(\varepsilon \). From the definition of \(\rho _c(d)\) we can find a large cube \(C_\ell =(-\ell /2,\ell /2)^d\) and \(n=(1+2\eta )^{-d}r_0^{-d}\rho _c(d)\ell ^d\in \mathbb {N}\) points \(x_1^0,...,x_n^0\in C_\ell \) satisfying \(|x^0_j-x^0_k|\geqslant r_0(1+\eta )\) for all \(j\ne k\). We can also assume that no point is at a distance less than \(r_0\) to the boundary of \(C_\ell \). We are using here that the tight packing density for \(r_0(1+\eta )\) is \((1+\eta )^{-d}r_0^{-d}\rho _c(d)>(1+2\eta )^{-d}r_0^{-d}\rho _c(d)\) and that the limit (33) is the same for cubes and for balls.

Now, we replace each point \(x_j^0\) by a smeared measure

$$\begin{aligned} \chi _j^0(x)=\frac{2^d}{(r_0\eta )^d}\chi \left( { 2\frac{x-x_j^0}{r_0\eta } }\right) , \end{aligned}$$

where \(\chi =|B_1|^{-1}{\mathbb {1} }_{B_1}\). The smearing radius \(\eta r_0/2\) has been chosen so that the supports of the \(\chi ^0_j\) remain at distance at least \(r_0\).
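This choice of radius can be checked in one line. The following is only a sketch, using the separation \(|x^0_j-x^0_k|\geqslant r_0(1+\eta )\) and the fact that \(\chi _j^0\) is supported in the ball of radius \(\eta r_0/2\) centered at \(x_j^0\). For \(x\) in the support of \(\chi _j^0\) and \(y\) in the support of \(\chi _k^0\) with \(j\ne k\), the triangle inequality gives

$$\begin{aligned} |x-y|\geqslant |x_j^0-x_k^0|-|x-x_j^0|-|y-x_k^0|\geqslant r_0(1+\eta )-\frac{\eta r_0}{2}-\frac{\eta r_0}{2}=r_0. \end{aligned}$$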

Finally, we consider \((2K+1)^d\) copies of our system (\(K\in \mathbb {N}\)), repeated in a periodic fashion so as to form a very large cube \(C_L=(-L/2,L/2)^d\) of side length \(L=(2K+1)\ell \). In other words, we define the \(N:=(2K+1)^dn\) points \(x_j^k:=x_j^0+k\ell \) with \(k\in \{-K,...,K\}^d\). The smeared measures \(\chi _j^k\) are defined similarly. The state

$$\begin{aligned} \mathbb {P}=\Pi _s\bigotimes _{\begin{array}{c} j\in \{1,...,n\}\\ k\in \{-K,...,K\}^d \end{array}} \chi _j^k, \end{aligned}$$

has the density \(\rho _{\mathbb {P}}=\sum _{j,k}\chi _j^k\) and the finite entropy

$$\begin{aligned} \int _{(\mathbb {R}^d)^N}\mathbb {P}\log (N!\,\mathbb {P})=\sum _{j,k}\int _{\mathbb {R}^d} \chi _j^k\log \chi _j^k=N\log \left( \frac{2^d}{|B_1|r_0^d\eta ^d}\right) , \end{aligned}$$

(recall \(\Pi _s\) is the symmetrization operator in (31)). Next, we average the configuration in the big cube over translations \(\tau \in C_\ell \) and define the trial state

$$\begin{aligned} {\widetilde{\mathbb {P}}}=\frac{1}{\ell ^d}\int _{C_\ell }\mathbb {P}(\cdot +\tau )\,\textrm{d}\tau , \end{aligned}$$

which has the density

$$\begin{aligned} {\widetilde{\rho }}=\frac{1}{\ell ^d}\sum _{j}\chi _j^0*{\mathbb {1} }_{C_L}. \end{aligned}$$

The latter is constant, equal to \(n/\ell ^d=(1+2\eta )^{-d}r_0^{-d}\rho _c(d)\) well inside the large cube. Note that, by concavity, the entropy of \({\widetilde{\mathbb {P}}}\) can be estimated by that of \(\mathbb {P}\).
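The value of \({\widetilde{\rho }}\) well inside the large cube can be verified from the convolution formula above. This is a sketch under the assumption \(\eta \leqslant 2\) (so that the smearing radius \(\eta r_0/2\) does not exceed \(r_0\) and each \(\chi _j^0\) is supported in \(C_\ell \)), and with the notation \(|x|_\infty :=\max _i|x_i|\) introduced here for convenience. If \(|x|_\infty \leqslant K\ell \) and \(y\in C_\ell \), then \(|x-y|_\infty <K\ell +\ell /2=L/2\), so that \(x-y\in C_L\). Hence, for such \(x\),

$$\begin{aligned} \big (\chi _j^0*{\mathbb {1} }_{C_L}\big )(x)=\int _{C_\ell }\chi _j^0(y)\,{\mathbb {1} }_{C_L}(x-y)\,\textrm{d}y=\int _{\mathbb {R}^d}\chi _j^0=1,\qquad {\widetilde{\rho }}(x)=\frac{1}{\ell ^d}\sum _{j=1}^n 1=\frac{n}{\ell ^d}. \end{aligned}$$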

Step 2: Geometric localization. We assume for the rest of the proof that \(\rho \) has compact support and we choose K large enough so that \({\widetilde{\rho }}\) is constant on the support of \(\rho \). Our estimates will not depend on K. One can then deduce the bound for general densities by adapting the proof of Theorem 29, or by passing to the limit \(K\rightarrow \infty \) in the formulas (80)–(81) of the trial state.

We pick \(\eta \) so that \((1-\varepsilon )^d=(1+2\eta )^{-d}\), that is,

$$\begin{aligned} \eta =\frac{\varepsilon }{2(1-\varepsilon )}. \end{aligned}$$

Then we have \(\rho \leqslant {\widetilde{\rho }}\) a.e. This enables us to consider the localization function

$$\begin{aligned} \theta :=\frac{\rho }{{\widetilde{\rho }}}=\frac{\rho }{(1+2\eta )^{-d}r_0^{-d}\rho _c(d)}\leqslant 1, \end{aligned}$$

and the \(\theta \)-localized state \({\widetilde{\mathbb {P}}}_{|\theta }\), which has the desired density \(\theta \rho _{{\widetilde{\mathbb {P}}}}=\rho \).
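The choice of \(\eta \) made above can be checked directly. This short verification is not part of the original argument and uses only the definition of \(\eta \) and the assumption \(\rho \leqslant (1-\varepsilon )^dr_0^{-d}\rho _c(d)\):

$$\begin{aligned} 1+2\eta =1+\frac{\varepsilon }{1-\varepsilon }=\frac{1}{1-\varepsilon },\qquad \text {hence}\qquad (1+2\eta )^{-d}r_0^{-d}\rho _c(d)=(1-\varepsilon )^dr_0^{-d}\rho _c(d)\geqslant \rho . \end{aligned}$$

In particular \(0\leqslant \theta \leqslant 1\) on the support of \(\rho \), where \({\widetilde{\rho }}\) is constant.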

We recall that the \(\theta \)-localization \(\mathbb {Q}_{|\theta }\) of a state \(\mathbb {Q}\) (with \(0\leqslant \theta \leqslant 1\)) is the unique state which has the correlation functions \(\rho ^{(k)}=\rho ^{(k)}_{\mathbb {Q}}\theta ^{\otimes k}\) for all k, see [34, 42, 60]. In our case we only need the definition for a tensor product since we have by linearity

$$\begin{aligned} {\widetilde{\mathbb {P}}}_{|\theta }=\frac{1}{\ell ^d}\int _{C_\ell }\mathbb {P}(\cdot +\tau )_{|\theta }\,\textrm{d}\tau . \end{aligned}$$
(80)

For a symmetric tensor product \(\mathbb {Q}=\Pi _s(q_1\otimes \cdots \otimes q_N)\) with probabilities \(q_j\) of disjoint support, the \(\theta \)-localized state can be expressed as

$$\begin{aligned} \mathbb {Q}_{|\theta }&=\bigoplus _{n=0}^N\left( {\begin{array}{c}N\\ n\end{array}}\right) \frac{1}{N!}\sum _{\sigma \in \mathfrak {S}_N} (\theta q_{\sigma (1)})\otimes \cdots \otimes (\theta q_{\sigma (n)})\times \nonumber \\&\quad \times \left( 1-\int \theta q_{\sigma (n+1)}\right) \cdots \left( 1-\int \theta q_{\sigma (N)}\right) . \end{aligned}$$
(81)

We will need the following.

Lemma 30

(Entropy of localization of tensor products) Let \(\mathbb {Q}=\Pi _s(q_1\otimes \cdots \otimes q_N)\) be a symmetric tensor product, with \(q_1,...,q_N\) probability measures of disjoint supports. For any \(0\leqslant \theta \leqslant 1\), we have

$$\begin{aligned} \mathcal {S}(\mathbb {Q}_{|\theta })=-\sum _j \int _{\mathbb {R}^d} (\theta q_{j})\log (\theta q_{j})-\sum _j \left( 1-\int _{\mathbb {R}^d}\theta q_{j}\right) \log \left( 1-\int _{\mathbb {R}^d}\theta q_{j}\right) . \end{aligned}$$
(82)

In particular, we deduce

$$\begin{aligned} -\mathcal {S}(\mathbb {Q}_{|\theta })\leqslant \sum _j \int _{\mathbb {R}^d} (\theta q_{j})\log (\theta q_{j}). \end{aligned}$$
(83)

As a side remark, we also note that (82) provides

$$\begin{aligned}&\mathcal {S}(\mathbb {Q}_{|\theta })+\mathcal {S}(\mathbb {Q}_{|1-\theta })\\&\quad =\mathcal {S}(\mathbb {Q})-\sum _j \left[ \left( 1-\int \theta q_{j}\right) \log \left( 1-\int \theta q_{j}\right) +\left( \int \theta q_{j}\right) \log \left( \int \theta q_{j}\right) \right] \\&\qquad -\int \rho _{\mathbb {Q}}\Big (\theta \log \theta +(1-\theta )\log (1-\theta )\Big ). \end{aligned}$$

The additional terms are non-negative and therefore we recover the subadditivity of the entropy \(\mathcal {S}(\mathbb {Q})\leqslant \mathcal {S}(\mathbb {Q}_{|\theta })+\mathcal {S}(\mathbb {Q}_{|1-\theta })\) [42, Appendix A].
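As a sanity check of (82), one may look at the simplest case \(N=1\), where \(\mathbb {Q}=q\) is a single probability density. The notation \(a:=\int \theta q\) is introduced here only for this illustration, and we assume the convention that the entropy of a state with \(n\)-particle components \(\mathbb {Q}_n\) is \(-\sum _n\int \mathbb {Q}_n\log (n!\,\mathbb {Q}_n)\). Formula (81) then reduces to zero particles with probability \(1-a\) and one particle distributed according to \(\theta q\), so that

$$\begin{aligned} \mathbb {Q}_{|\theta }=(1-a)\oplus \theta q,\qquad \mathcal {S}(\mathbb {Q}_{|\theta })=-\int _{\mathbb {R}^d}(\theta q)\log (\theta q)-(1-a)\log (1-a), \end{aligned}$$

which is (82) with a single term. Since \(t\log t\leqslant 0\) for \(t\in [0,1]\), the last term is non-negative, which is the mechanism behind (83).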

Proof

Each tensor product \((\theta q_{\sigma (1)})\otimes \cdots \otimes (\theta q_{\sigma (n)})\) appears exactly \((N-n)!\) times with the same weight in (81). We can thus write it in the better form

$$\begin{aligned} \mathbb {Q}_{|\theta }=\bigoplus _{n=0}^N\frac{1}{n!}\sum _{j_1\ne \cdots \ne j_n}(\theta q_{j_1})\otimes \cdots \otimes (\theta q_{j_n})\prod _{k\notin \{j_1,...,j_n\}}\left( 1-\int \theta q_{k}\right) , \end{aligned}$$

where now the terms all have disjoint supports. We obtain that the entropy equals

$$\begin{aligned} \mathcal {S}(\mathbb {Q}_{|\theta })&=-\sum _{n=0}^N\frac{1}{n!}\int _{\mathbb {R}^{dn}}\sum _{j_1\ne \cdots \ne j_n}(\theta q_{j_1})\otimes \cdots \otimes (\theta q_{j_n})\prod _{k\notin \{j_1,...,j_n\}}\left( 1-\int \theta q_{k}\right) \times \\&\qquad \times \log \left[ { (\theta q_{j_1})\otimes \cdots \otimes (\theta q_{j_n})\prod _{k\notin \{j_1,...,j_n\}}\left( 1-\int \theta q_{k}\right) }\right] . \end{aligned}$$

Note that the n! in the logarithm simplifies with the 1/n!. Expanding the logarithm and collecting the terms we obtain the claimed formula. \(\square \)
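The collection of terms at the end of the proof above can be made explicit as follows. This is a sketch with the shorthand \(a_k:=\int \theta q_k\), which is not used in the original text. Each subset \(J\subset \{1,...,N\}\) with \(n\) elements corresponds to \(n!\) ordered tuples giving the same integral, so the sum over tuples divided by \(n!\) becomes a sum over subsets:

$$\begin{aligned} \mathcal {S}(\mathbb {Q}_{|\theta })=-\sum _{J\subset \{1,...,N\}}\prod _{m\notin J}(1-a_m)\left[ \sum _{j\in J}\Big (\prod _{i\in J\setminus \{j\}}a_i\Big )\int _{\mathbb {R}^d}(\theta q_j)\log (\theta q_j)+\Big (\prod _{i\in J}a_i\Big )\sum _{k\notin J}\log (1-a_k)\right] . \end{aligned}$$

For a fixed \(j\), respectively a fixed \(k\), the total weights are

$$\begin{aligned} \sum _{J\ni j}\prod _{i\in J\setminus \{j\}}a_i\prod _{m\notin J}(1-a_m)&=\prod _{i\ne j}\big (a_i+(1-a_i)\big )=1,\\ \sum _{J\not \ni k}\prod _{i\in J}a_i\prod _{m\notin J}(1-a_m)&=(1-a_k)\prod _{i\ne k}\big (a_i+(1-a_i)\big )=1-a_k, \end{aligned}$$

which gives the two sums in (82).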

In our case, we deduce by concavity (after changing \(\tau \) into \(-\tau \) in (80), which leaves the cube \(C_\ell \) invariant) that

$$\begin{aligned} -\mathcal {S}({\widetilde{\mathbb {P}}}_{|\theta })&\leqslant -\frac{1}{\ell ^d}\int _{C_\ell }\mathcal {S}\big (\mathbb {P}(\cdot -\tau )_{|\theta }\big )\,\textrm{d}\tau \\&\leqslant \frac{1}{\ell ^d}\sum _{j,k}\int _{C_\ell }\int _{\mathbb {R}^d}\theta (x)\chi _j^k(x-\tau )\log \big (\theta (x)\chi _j^k(x-\tau ) \big )\,\textrm{d}\tau \,\textrm{d}x\\&=\frac{1}{\ell ^d}\sum _{j,k}\int _{C_\ell }\int _{\mathbb {R}^d}\theta (x)\chi _j^k(x-\tau )\log \frac{\rho (x)\chi _j^k(x-\tau )}{(1+2\eta )^{-d}r_0^{-d}\rho _c(d)}\,\textrm{d}\tau \,\textrm{d}x. \end{aligned}$$

We estimate \(\chi _j^k\) in the logarithm by its supremum \(\Vert \chi _j^k\Vert _\infty =\frac{2^d}{(r_0\eta )^d|B_1|}\) and use that

$$\begin{aligned} \frac{\theta (x)}{\ell ^d}\sum _{j,k}\int _{C_\ell }\chi _j^k(x-\tau )\,\textrm{d}\tau =\theta (x){\widetilde{\rho }}(x)=\rho (x). \end{aligned}$$

We obtain

$$\begin{aligned} -\mathcal {S}({\widetilde{\mathbb {P}}}_{|\theta })&\leqslant \log \left( \frac{(1+2\eta )^{d}}{\eta ^dv_c(d)}\right) \int \rho +\int \rho \log \rho \\&=\log \left( \frac{2^d}{\varepsilon ^dv_c(d)}\right) \int \rho +\int \rho \log \rho . \end{aligned}$$
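The constants in this bound can be traced as follows. This is a sketch which assumes that \(v_c(d)\) denotes the packing fraction \(2^{-d}|B_1|\rho _c(d)\); this is the reading under which the displayed bound agrees with the supremum of \(\chi _j^k\) computed above. Bounding \(\chi _j^k\) by its supremum inside the logarithm and dividing by the constant value of \({\widetilde{\rho }}\) gives

$$\begin{aligned} \log \left( \frac{2^d}{(r_0\eta )^d|B_1|}\cdot \frac{(1+2\eta )^{d}r_0^{d}}{\rho _c(d)}\right) =\log \left( \frac{(1+2\eta )^{d}}{\eta ^d\,2^{-d}|B_1|\rho _c(d)}\right) ,\qquad \frac{1+2\eta }{\eta }=\frac{1}{1-\varepsilon }\cdot \frac{2(1-\varepsilon )}{\varepsilon }=\frac{2}{\varepsilon }, \end{aligned}$$

where the second identity uses \(\eta =\varepsilon /(2(1-\varepsilon ))\).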

On the other hand, the energy bound (34) applies since we still have \(|x_j-x_k|\geqslant r_0\) on the support of the localized state \({\widetilde{\mathbb {P}}}_{|\theta }\). This concludes the proof of Theorem 14. \(\square \)