1 Introduction

In this paper we are concerned with the minimization of the functional

$$\begin{aligned} J[\rho ]=\iint \log \frac{1}{|x-y|}d\rho (x)d\rho (y)+d^2(\rho , \rho _0) \end{aligned}$$
(1.1)

among all probability measures \(\rho \) with finite second moment. Here \(d^2(\rho , \rho _0)=\inf _{\gamma }\frac{1}{2} \iint |x-y|^2d\gamma (x,y)\) is the square of the Wasserstein distance between \(\rho \) and a given probability measure \(\rho _0\), and \(\gamma \) is a joint probability measure with marginals \(\pi _{x\#}\gamma =\rho \), \(\pi _{y\#}\gamma =\rho _0\). The support of \(\rho \) is a priori unknown (or free), and our main goal is to analyze the regularity of the free boundary, i.e. the boundary of the set where \(\rho \not =0\).

An analogous problem arises in higher dimensions if we replace the logarithmic kernel by \(K(x-y)={|x-y|^{2-n}}, n\ge 3\). The methods we employ do not depend on the dimension. We focus on the logarithmic kernel since the potential \(U^\rho =-\rho *\log |x|\) may change sign and the log-interaction phenomenon has a number of important applications [29, 32] (in Sect. 2 we also give a connection with random matrices).

An interesting feature of the variational problem for \(J[\rho ]\) is that it leads to an obstacle problem involving the potential of the optimal transport of \(\rho \) to \(\rho _0\). Let \(U^\rho \) be the logarithmic potential (or the Newtonian potential if \(n\ge 3\)) of the probability measure \(\rho \) and let \(\psi \) be the potential of the transport map; then formally we have

$$\begin{aligned} U^\rho =\psi \quad \text {in}\ \{\rho >0\}\ \text{ and }\ U^\rho \ge \psi \ \text{ elsewhere }. \end{aligned}$$
(1.2)

Since \(\Delta U^\rho =-2\pi \rho \), it follows that

$$\begin{aligned} \Delta U^\rho =\Delta \psi \quad \text{ in }\ \{\rho >0\}, \quad \Delta U^\rho =0\quad \text{ in }\ \{\rho =0\}. \end{aligned}$$
(1.3)

Thus combining (1.2) and (1.3) we have the obstacle problem

$$\begin{aligned} \left\{ \begin{array}{ccc} \Delta U^\rho = \Delta \psi \chi _{\{\rho >0\}}&{}\quad \text{ in }\ \mathbb R^2, \\ \rho (U^\rho -\psi )=0 &{}\quad \text{ in }\ \mathbb R^2. \end{array} \right. \end{aligned}$$
(1.4)

In this formulation the position of the obstacle is a priori unknown, as opposed to the classical case [7]. Note that \(\psi \) is a semiconvex function; hence by Aleksandrov’s theorem \(D^2 \psi \) exists a.e. Consequently, the first equation in (1.4) is satisfied in the a.e. sense provided that \(\rho \) is absolutely continuous with respect to the Lebesgue measure.

The partial mass transport and Monge–Ampère obstacle problems were developed in the seminal work of Caffarelli and McCann [6]; see also [16, 19] and the references given there.

Several papers introduced variational problems for measures. In [26] McCann formulated a variational principle for the energy

$$\begin{aligned} E[\rho ]=\int A(\rho )+\frac{1}{2}\iint d\rho (x)K(x-y)d\rho (y), \end{aligned}$$

which allowed him to prove existence and uniqueness for a family of attracting gas models and to generalize the Brunn–Minkowski inequality from sets to measures.

Another interesting energy

$$\begin{aligned} F[\rho ]=\iint \log \frac{1}{|x-y|}d\rho (x)d\rho (y)+\int |x|^2d\rho , \end{aligned}$$

appears in large deviation laws and log-gas interactions [29, 32]. This problem is also related to the classical obstacle problem [3]. Thanks to the quadratic potential, every measure minimizing \(F[\cdot ]\) is confined to some ball. Furthermore, one can prove transport inequalities and bounds for the Wasserstein distance in terms of \(F[\rho ]\) [25].

There is a vast literature on interaction energies for probability measures governed by the Wasserstein metric [8, 10, 11, 20]. In particular, [13] contains an \(L^\infty \) estimate for the equilibrium measure and [14] a connection to obstacle problems.

The energy \(J[\rho ]\) appears in a number of physical models, for example in aggregation models where the movement of particles is driven by the global potential \(U^\rho \). The corresponding gradient flow is \(\rho _t={\text {div}}(\rho \nabla U^\rho )\) ([30], page 307). Another application appears in the thermalization of granular media [12].
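For intuition, the gradient flow \(\rho _t={\text {div}}(\rho \nabla U^\rho )\) can be approximated by a particle system in which each particle moves with velocity \(-\nabla U^\rho \). The following minimal sketch (ours, not taken from [30] or [12]) uses the empirical measure of N points and the two dimensional logarithmic kernel, for which \(\nabla \log \frac{1}{|x|}=-x/|x|^2\):

```python
import random

def grad_U(points, i):
    # gradient of U^rho at particle i for the empirical measure
    # (1/N) sum_j delta_{x_j}, using grad log(1/|x|) = -x/|x|^2
    xi, yi = points[i]
    gx = gy = 0.0
    n = len(points)
    for j, (xj, yj) in enumerate(points):
        if j == i:
            continue
        dx, dy = xi - xj, yi - yj
        r2 = dx * dx + dy * dy
        gx -= dx / (r2 * n)
        gy -= dy / (r2 * n)
    return gx, gy

def euler_step(points, dt):
    # explicit Euler step of dx_i/dt = -grad U^rho(x_i); the continuity
    # equation rho_t = div(rho grad U^rho) moves mass with velocity -grad U^rho
    vels = [grad_U(points, i) for i in range(len(points))]
    return [(x - dt * gx, y - dt * gy)
            for (x, y), (gx, gy) in zip(points, vels)]

def mean_sq_spread(points):
    # mean squared pairwise distance of the particle cloud
    n = len(points)
    return sum((a - c) ** 2 + (b - d) ** 2
               for a, b in points for c, d in points) / n ** 2

random.seed(0)
pts = [(random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1)) for _ in range(30)]
s0 = mean_sq_spread(pts)
for _ in range(20):
    pts = euler_step(pts, 1e-4)
s1 = mean_sq_spread(pts)  # the logarithmic repulsion spreads the cloud
```

Since the logarithmic kernel is repulsive, the discretized flow increases the mean squared pairwise distance, consistent with the aggregation interpretation above.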

In [31] Savin considered the optimal transport of probability measures in the periodic setting for the energy \( \int |\nabla \rho |^2+d^2(\rho , \rho _0), \rho \in H^1([0,1]^n) \). The resulting obstacle problem takes the form

$$\begin{aligned} \left\{ \begin{array}{ll} -\Delta \rho =\psi \ {} &{}\text{ in }\ \{\rho >0\}, \ \\ -\Delta \rho \ge \psi \ {} &{}\text{ elsewhere }, \end{array} \right. \end{aligned}$$
(1.5)

where \(\psi \) is the transport potential of \(\rho \rightarrow \rho _0\) with given initial periodic probability measure \(\rho _0\) with \(H^1\) density.

The aim of this paper is to study the free boundary of the obstacle problem (1.4) for the minimizers \(\rho \) of \(J[\rho ]\).

In [4], the authors consider a gradient flow of the interaction energy

$$\begin{aligned} E_0[\rho ]=\frac{1}{2}\iint d\rho (x)K(|x-y|)d\rho (y), \end{aligned}$$

and study the convergence of radially symmetric solutions under conditions imposed on K: radially symmetric solutions converge exponentially fast in some transport distance toward a spherical shell stationary state.

In [5], the authors study the minimizers of \(E_0\) in the \(d_\infty \) topology. Their main result is a Hausdorff dimension estimate for the support of the minimizer.

In [9], the minimization problem for \(E_0\) under the constraint \(d_\infty (\mu , \mu _0)<\epsilon \) is studied. As a result the authors obtain an obstacle-type problem for the transport potential. They also prove that \(\text{ supp }\rho \) has finite perimeter and that the transport potential is \(C^{1,1}\) under some conditions on K.

In our paper the energy is different: it contains the additional Wasserstein term. Moreover, the set of admissible measures is not constrained to a neighborhood of a given initial measure \(\rho _0\) in the \(d_\infty \) topology. Note also that the papers above do not study the singular set of the free boundary, whereas we estimate the dimension of the singular set of \(\text{ supp }\rho _0\cap \text{ supp }\rho \). This estimate is quite different from the one in the classical obstacle problem.

1.1 Main results

The energy \(J[\rho ]\) has a nonlocal character due to the presence of the logarithmic kernel. However, thanks to the Wasserstein distance, \(\rho \) is forced to have compact support provided that \(\mathrm{{supp}}\rho _0\) is compact. Observe that if \(\rho \) has atoms then \(J[\rho ]=\infty \), since the logarithmic term is unbounded; see also the discussion on page 133 of [2] for the pure optimal transport case, i.e. the energy without the logarithmic term.

Theorem A

If \(\rho _0\) has compact support then there is a probability measure \(\rho \) minimizing J such that \(\mathrm{{supp}}\rho \) is compact. Moreover, \(\rho \) cannot have atoms, and hence there is a measure preserving transport map \(y=T(x)\) such that \(\rho _0\) is the push forward of \(\rho \) under T.

The second part of the theorem follows from the standard theory of optimal transport [1, 2]. The chief difficulty in proving the first part is to show that there is a minimizing sequence of probability measures with uniformly bounded supports. In order to establish this we use Carleson’s estimate from below for the nonlocal term and a localization argument for the Fourier transforms of these measures. For other applications of Fourier transforms see [23] and references therein.

Next we analyze the character of equilibrium measures and show that \(\rho \in L^\infty \). To see this we compute and exploit the first variation of J. The weak form of the Euler-Lagrange equation implies that \(\hat{\rho }\), the Fourier transform of \(\rho \), is in \(L^2\).

Theorem B

Let \(\rho , \rho _0\) be as in Theorem A. Then \(\widehat{\rho }\in L^2(\mathbb R^2)\) and \(d\rho =f dx\) on \(\mathrm{{supp}}\rho \) where \(f\in L^{\infty }(\mathbb R^2)\). In particular, the transport map \(y=T(x)\) (as in Theorem A) is given by

$$\begin{aligned} y=x+2\nabla U^\rho , \end{aligned}$$

where \(U^\rho =\rho *K\) is the potential of \(\rho \) and \(\nabla U^\rho \) is log-Lipschitz continuous.

The log-Lipschitz continuity of \(\nabla U^\rho \) follows from Judovič’s theorem [21]. In fact, from the Calderón–Zygmund estimates it follows that \(D^2U^\rho \in L^p_{loc}\) for every \(p>1\). The local mass balance condition for the optimal transport leads to a nonlocal Monge–Ampère equation

$$\begin{aligned} \det (Id+2D^2 U^\rho )=\frac{\rho (x)}{\rho _0(x+2\nabla U^\rho )}. \end{aligned}$$
(1.6)

(1.6) implies that \(\mathrm{{supp}}\rho \subset \mathrm{{supp}}\rho _0\). If we linearize (1.6) using a time discretization scheme, the resulting equation is \(\rho _t={\text {div}}(\rho \nabla U^\rho )\).
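The following formal computation (our sketch, with \(\rho _0\) playing the role of the previous state in a minimizing movement scheme) indicates where the linearized equation comes from. Testing the push forward along \(T(x)=x+2\nabla U^\rho \) with \(\varphi \in C_0^\infty \) gives

$$\begin{aligned} \int \varphi \, d\rho _0-\int \varphi \, d\rho =\int \left( \varphi (x+2\nabla U^\rho )-\varphi (x)\right) d\rho \approx 2\int \nabla \varphi \cdot \nabla U^\rho \, d\rho =-2\int \varphi \,{\text {div}}(\rho \nabla U^\rho ), \end{aligned}$$

so \(\rho -\rho _0\approx 2\,{\text {div}}(\rho \nabla U^\rho )\); dividing by the time step and absorbing the factor 2 into the time scale formally yields \(\rho _t={\text {div}}(\rho \nabla U^\rho )\).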

The analysis of the structure of the singular set in obstacle problems is a central problem of the regularity theory. Let \(\hbox {MD}(\mathrm{{supp}}\rho \cap B_r(x))\) be the infimum of distances between pairs of parallel planes such that \(\mathrm{{supp}}\rho \cap B_r(x)\) is contained in the strip determined by them [7]. Let

$$\begin{aligned} \omega ({R})=\sup _{r\le R}\sup _{x\in \mathrm{{supp}}\rho } \frac{\hbox {MD}(\mathrm{{supp}}\rho \cap B_r(x))}{r}. \end{aligned}$$
(1.7)

Observe that if \(n=2\) then (1.6) is equivalent to \(2\pi \rho _0[4\det D^2U^\rho +2 \Delta U^\rho +1]=-\Delta U^\rho \). From here we can deduce the equation

$$\begin{aligned} \det \left[ 2D^2U^\rho +Id\left( 1+\frac{1}{4\pi \rho _0}\right) \right] =\left( 1+\frac{1}{4\pi \rho _0}\right) ^2-1>0. \end{aligned}$$
(1.8)

Consequently, the standard regularity theory for the Monge–Ampère equation (see [33]) implies that we can get higher regularity for \(\rho \) if \(\rho _0\) is sufficiently smooth.
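For the reader’s convenience we sketch the algebra behind (1.8). Write \(a=1+\frac{1}{4\pi \rho _0}\), so that \(\frac{1}{2\pi \rho _0}=2(a-1)\). For a \(2\times 2\) matrix M one has \(\det (2M+a\, Id)=4\det M+2a\,\mathrm{tr}\, M+a^2\); applying this with \(M=D^2U^\rho \) and rewriting the identity preceding (1.8) as

$$\begin{aligned} 4\det D^2U^\rho +2\Delta U^\rho +1=-\frac{\Delta U^\rho }{2\pi \rho _0}=-2(a-1)\Delta U^\rho , \quad \text{ i.e. }\quad 4\det D^2U^\rho =-2a\Delta U^\rho -1, \end{aligned}$$

we get \(\det (2D^2U^\rho +a\, Id)=-2a\Delta U^\rho -1+2a\Delta U^\rho +a^2=a^2-1\), which is (1.8).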

Theorem C

Let \(\omega ({R})\) be the modulus of continuity of the slab height (see (1.7)) and let \(B_i=B_{r_i}(x_i)\) be a collection of disjoint balls included in \(B_R\) with \(x_i\in S\), where S is the singular set. Then for every \(\beta >n-1\) we have

$$\begin{aligned} \sum r_i^\beta \le C\frac{R^\beta }{\omega ^{n-1}(R)}\frac{1}{1-\omega ^{\beta -(n-1)}(R)}. \end{aligned}$$

Furthermore, if \(\omega ({R})= R^\sigma \), then there is \(\sigma '=\sigma '(n, \sigma )\) such that the singular set \(S\subset M_0\cup \bigcup _{i=1}^\infty M_i\) where \(\mathcal {H}^{n-1-\sigma '}(M_0)=0\) and \(M_i\) is contained in some \(C^1\) hypersurface such that the measure theoretic normal exists at each \(x\in S\cap M_i, i\ge 1\).

The paper is organized as follows. In Sect. 2 we recall some facts on the Wasserstein distance and the Fourier transform of measures. One of the key facts that we use is that the logarithmic term can be written as a weighted \(L^2\) norm of the Fourier transform of \(\rho .\)

Section 3 contains the proof of Theorem A. The chief difficulty in the proof is to control the supports of the sequence of minimizing measures.

Section 4 contains some basic discussion of cyclic monotonicity and maximal Kantorovich potential. Then we derive the Euler-Lagrange equation. From here we infer that \(\rho \) has \(L^\infty \) density with respect to the Lebesgue measure. Theorem B follows from Theorem 4.4 and Corollary 4.6.

In Sect. 5 we study the regularity of the free boundary and prove Theorem C.

The last two sections contain some final remarks and possible applications. First, in Sect. 6 we discuss the relation of \(J[\rho ]\) to large deviation laws for random matrices with interaction and provide a simple model with energy J. Finally, Sect. 7 is devoted to the nonlocal Monge–Ampère equation and its linearization \(\partial _t \rho ={\text {div}}(\rho \nabla U^\rho )\).

1.2 Notation

We will denote by \(\mathcal M(\mathbb R^n)\) the set of probability measures on \(\mathbb R^n\), and let \(\mu _{\# f}\) be the push forward of \(\mu \in \mathcal M(\mathbb R^{n})\) under a mapping f. \(d(\mu , \rho )\) denotes the 2-Wasserstein distance of \(\mu , \rho \in \mathcal M(\mathbb R^{n})\), \(B_r(x_0)\) the open ball of radius r centered at \(x_0\), and K the kernels

$$\begin{aligned} K(x-y)= \left\{ \begin{array}{ccc} \log \frac{1}{|x-y|} \quad &{}\text{ if }\ n=2,\\ |x-y|^{2-n} &{}\text{ if }\ n\ge 3. \end{array} \right. \end{aligned}$$
(1.9)

\(U^\rho =\rho *K\) is the potential of \(\rho \in \mathcal M(\mathbb R^{n})\), \(\mathcal {H}^n\) the n dimensional Hausdorff measure, \(1_E\) the characteristic function of \(E\subset \mathbb R^n\). The restriction of \(\mu \in \mathcal M(\mathbb R^{n})\) to some \(E\subset \mathbb R^{n}\) will be denoted by \(1_E\mu \), and \(\widehat{\mu }(\xi ) =\int e^{-2\pi i \langle x, \xi \rangle }d\mu (x)\) is the Fourier transform of \(\mu \in \mathcal M(\mathbb R^{n})\).

2 Set-up

Let \(f:\mathbb R^n\rightarrow \mathbb R^n\) be a map; for a Borel set \(E\subset \mathbb R^n\) the push forward is defined by \(\mu _{\# f}(E)=\mu (f^{-1}(E))\). For every joint probability measure \(\gamma \in \mathcal M(\mathbb R^n\times \mathbb R^n)\) we define the projections \(\pi _x:(x, y)\rightarrow x\), \(\pi _y:(x, y)\rightarrow y\).

We require \(\gamma \) to have prescribed marginals \(\rho \), \(\rho _0\in \mathcal M(\mathbb R^{n})\), i.e.

$$\begin{aligned} \gamma _{\#\pi _x}=\rho (x), \quad \gamma _{\#\pi _y}=\rho _0(y). \end{aligned}$$

For probability measures \(\rho , \rho _0\in \mathcal M(\mathbb R^n)\) we define their Wasserstein distance as follows

$$\begin{aligned} d(\rho , \rho _0)=\left( \inf _\gamma \frac{1}{2}\iint |x-y|^2d\gamma (x, y)\right) ^{\frac{1}{2}}, \end{aligned}$$
(2.1)

where \(\gamma \)’s are transport plans such that \(\gamma _{\#\pi _x}=\rho \), \(\gamma _{\#\pi _y}=\rho _0\). We recall the following properties of the Wasserstein distance:

  1)

    d is a distance,

  2)

    \(d^2\) is convex, i.e.

    $$\begin{aligned} d^2(tu+(1-t)v, w)\le td^2(u, w)+(1-t)d^2(v, w), \quad t\in [0, 1], u, v\in \mathcal M(\mathbb R^{n}), \end{aligned}$$
  3)

    if \(u_k\rightarrow u, v_k\rightarrow v\) in \(L^1_{loc}\) as \(k\rightarrow \infty \) then

    $$\begin{aligned} \lim _{k\rightarrow \infty }d(u_k, v_k)=d(u, v), \end{aligned}$$
  4)

    if \(u_k\rightarrow u, v_k\rightarrow v\) weakly, i.e. \(\int u_k\phi \rightarrow \int u\phi , \int v_k\phi \rightarrow \int v\phi \) for every \(\phi \in C_0\), then

    $$\begin{aligned} d(u, v)\le \liminf _{k\rightarrow \infty }d(u_k, v_k). \end{aligned}$$

See [34] for more details.
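As a simple illustration of (2.1) (ours, not from the text): in one dimension, for equal-weight empirical measures and quadratic cost, the monotone (sorted) coupling is an optimal transport plan, so the distance can be evaluated directly; note the \(\frac{1}{2}\) normalization in (2.1).

```python
import math

def d2(xs, ys):
    # squared Wasserstein distance (2.1) between the equal-weight
    # empirical measures on the atoms xs and ys; in 1-D with quadratic
    # cost the monotone (sorted) coupling is optimal
    assert len(xs) == len(ys)
    return 0.5 * sum((x - y) ** 2
                     for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

u, v, w = [0.0, 2.0], [1.0, 3.0], [1.0, 1.0]
print(d2(u, v))  # 0.5
```

On such examples the distance axioms of property 1), e.g. symmetry and the triangle inequality for \(d=\sqrt{d2}\), can be checked numerically.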

We also need the following definition of Wasserstein class:

Definition 2.1

Let \((\Omega , |\cdot |)\) be a Polish space (i.e. complete separable metric space equipped with its Borel \(\sigma \)-algebra). The Wasserstein space of order 2 is defined as

$$\begin{aligned} P_2(\Omega )=\left\{ \mu \in \mathcal M: \int _{\Omega }|x_0-x|^2\mu (dx)<\infty \right\} , \end{aligned}$$

where \(x_0\in \Omega \) is arbitrary. This space does not depend on the choice of \(x_0\). Thus d defines a finite distance on \(P_2\).

Remark 2.2

If \(\Omega \) is compact then so is \(P_2\). If \(\Omega \) is only locally compact then \(P_2(\Omega )\) is not locally compact, see [34]. This introduces several difficulties in the proof of the existence of a minimizer.

Remark 2.3

Recall that the Fourier transformation of the truncated kernel \(K_{r_0}=1_{B_{r_0}}K, n=2\) can be computed explicitly

$$\begin{aligned} \widehat{ K_{r_0}}=\frac{c_1}{4\pi |\xi |^2}(1-\mathcal B(2\pi r_0|\xi |)), \end{aligned}$$
(2.2)

where \(c_1>0\) is a universal constant, \(\mathcal B\) is the Bessel function of the first kind such that \(\mathcal B(0)=1, \mathcal B'(0)=0\) and \(\lim _{t\rightarrow + \infty }\mathcal B(t)=0\) [15].

If \(\mu \in \mathcal M(\mathbb R^2)\) has compact support then from the weak Parseval identity we have that

$$\begin{aligned} \iint K(x-y)\mu (x)\mu (y)=\int |\hat{\mu }|^2\widehat{K}\ge 0, \end{aligned}$$
(2.3)

where \(K(x-y)=\log \frac{1}{|x-y|}\) and \(\widehat{\mu }, \widehat{K}\) are the Fourier transforms of \(\mu , K\) respectively, see [22] for the proof. This observation shows that the energy J is nonnegative for compactly supported \(\mu \in \mathcal M(\mathbb R^{2})\).
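As an illustration of the sign of the logarithmic energy (ours, not from the text): for \(\mu \) uniform on \(B_R\) a standard computation gives \(I[\mu ]=\iint \log \frac{1}{|x-y|}d\mu d\mu =\log \frac{1}{R}+\frac{1}{4}\), which is positive for \(R\le 1\). A Monte Carlo sketch:

```python
import math, random

def sample_disk(R):
    # uniform point in the disk B_R via rejection sampling
    while True:
        x, y = random.uniform(-R, R), random.uniform(-R, R)
        if x * x + y * y <= R * R:
            return x, y

def log_energy_mc(R, n_pairs=100_000, seed=1):
    # Monte Carlo estimate of I[mu] = iint log(1/|x-y|) dmu(x) dmu(y)
    # for mu the uniform probability measure on B_R
    random.seed(seed)
    total = 0.0
    for _ in range(n_pairs):
        x1, y1 = sample_disk(R)
        x2, y2 = sample_disk(R)
        total += -0.5 * math.log((x1 - x2) ** 2 + (y1 - y2) ** 2)
    return total / n_pairs

est = log_energy_mc(0.5)  # true value for R = 1/2: log 2 + 1/4
```

For \(R=\frac{1}{2}\) the estimate should be close to \(\log 2+\frac{1}{4}\approx 0.943>0\).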

We say that \(\mu \in \mathcal M(\mathbb R^{n})\) has finite energy if \(I[\mu ]<\infty \), where \(I[\rho ]=\iint K(x-y)d\rho (x)d\rho (y)\). Then \(\mathcal M(\mathbb R^{n})\) equipped with the bilinear form \(\mathcal I[\rho , \mu ]=\iint K(x-y)d\rho (x)d\mu (y)\) has a Hilbert space structure ([24], page 82), and

$$\begin{aligned} \Vert \mu \Vert =\sqrt{\mathcal I[\mu , \mu ]} \end{aligned}$$

is a norm. It is remarkable that the standard mollifications \(\mu _k\) of \(\mu \) converge to \(\mu \) strongly, i.e. \(\lim _{k \rightarrow \infty }\Vert \mu -\mu _k\Vert =0\), see [24] Lemma 1.\(2'\) page 83.

3 Existence of minimizers

Proposition 3.1

Let \(\mu _0\in \mathcal M(\mathbb R^{2})\) and \(\mathrm{{supp}}\mu _0\subset B_{R_0}\) for some \(R_0>0\). Let \(\mu \in P_2(\mathbb R^2)\) and J be given by (1.1), then

  (i)

    \(J[\mu ]>-\infty \) provided that \(J[\mu ]<+\infty \),

  (ii)

    there is \(\varepsilon >0\) depending on \(R_0\) and \(\mu \) such that \(J[\mu _{\varepsilon }]<J[\mu ]\) provided that \(\mathrm{{supp}}\mu \not \subset {B_\varepsilon }, \mathrm{{supp}}\mu \cap B_\varepsilon \not =\emptyset \), where \(\mu _{\varepsilon }=1_{B_\varepsilon }\mu /\mu (B_{\varepsilon })\) is the normalized restriction of \(\mu \) to \(B_{\varepsilon }\),

  (iii)

    if \(0\le J[\mu _k]\le C\) for some sequence \(\{\mu _k\}\subset P_2(\mathbb R^2)\) and \(\varepsilon _k\) are the corresponding numbers from (ii) then there is \(\varepsilon _0>0\) such that \(\varepsilon _k\le \varepsilon _0\) uniformly in k, where \(\varepsilon _0\) depends only on C and \(R_0\).

Proof

We split the proof into three steps:

Step 1: Second moment estimate:

Let \(\varepsilon >0\) be fixed. By Theorem 1 of [28] there is a transference plan \(\gamma \in \mathcal M(\mathbb R^2\times B_{R_0}) \) with marginals \(\mu , \mu _0\) such that \(d^2(\mu , \mu _0)=\frac{1}{2}\iint |x-y|^2\gamma .\) Set \(\gamma _\varepsilon =\frac{1}{\mu (B_\varepsilon )}\,1_{B_\varepsilon \times B_{R_0}}\gamma \); then

$$\begin{aligned} \iint \gamma _\varepsilon = \frac{1}{\mu (B_\varepsilon )}\iint \limits _{B_{\varepsilon }\times B_{R_0}} \gamma = \frac{1}{\mu (B_\varepsilon )}\int _{B_{\varepsilon }} \mu (x)=1. \end{aligned}$$

Moreover, the projections of \(\gamma _\varepsilon \) are \(\mu _\varepsilon \) and \(\mu _0\). Hence

$$\begin{aligned} d^2(\mu , \mu _0)= & {} \frac{1}{2}\iint |x-y|^2\gamma \\= & {} \frac{1}{2}\iint \limits _{B_\varepsilon \times B_{R_0}}|x-y|^2\gamma + \frac{1}{2}\iint \limits _{B_\varepsilon ^c\times B_{R_0}}|x-y|^2\gamma \\= & {} \mu (B_\varepsilon )\frac{1}{2}\iint |x-y|^2\gamma _\varepsilon + \frac{1}{2}\iint \limits _{B_\varepsilon ^c \times B_{R_0}}|x-y|^2\gamma .\\ \end{aligned}$$

Since \(\gamma _\varepsilon \) has marginals \(\mu _\varepsilon , \mu _0\) then \(\frac{1}{2}\iint |x-y|^2\gamma _\varepsilon \ge d^2(\mu _\varepsilon , \mu _0)\). Consequently, this in combination with the last inequality yields

$$\begin{aligned} d^2(\mu , \mu _0)\ge & {} \mu (B_\varepsilon )d^2(\mu _\varepsilon , \mu _0)+\frac{1}{2} \iint \limits _{B_\varepsilon ^c \times B_{R_0}}|x|^2\left( 1-\frac{|y|}{|x|}\right) ^2\gamma (x, y)dxdy\nonumber \\\ge & {} \mu (B_\varepsilon )d^2(\mu _\varepsilon , \mu _0)+\frac{1}{2} \iint \limits _{B_\varepsilon ^c\times B_{R_0}}\left[ |x|^2\left( 1-\frac{R_0}{\varepsilon }\right) ^2\gamma (x, y)dy\right] dx\nonumber \\= & {} \mu (B_\varepsilon )d^2(\mu _\varepsilon , \mu _0)+2c_0 \int _{B_\varepsilon ^c}|x|^2\mu , \end{aligned}$$
(3.1)

where we denote

$$\begin{aligned} c_0:=\frac{1}{4} \left( 1-\frac{R_0}{\varepsilon }\right) ^2 \end{aligned}$$
(3.2)

provided that \(\varepsilon >{R_0}\). From Young’s inequality we have that

$$\begin{aligned} 2d^2(\mu , \mu _0)= & {} \int |x|^2d\mu -2\iint x\cdot yd\gamma +\int |y|^2d\mu _0\\\ge & {} \frac{1}{2}\int |x|^2d\mu - \int |y|^2d\mu _0, \\ \end{aligned}$$

hence it gives

$$\begin{aligned} \int |x|^2d\mu \le 4d^2(\mu , \mu _0)+2\int |y|^2d\mu _0\le 4(d^2(\mu , \mu _0) +R_0^2). \end{aligned}$$
(3.3)

Step 2: A bound for the logarithmic term:

Now we want to estimate the logarithmic term from below, using the method from Chapter 1.1 of [29]. To do so we set \(Q(x)=c_0|x|^2, w(x)=e^{-c_0|x|^2}\) and introduce the logarithmic energy with quadratic potential

$$\begin{aligned} I_w[\mu ]= & {} \iint \log \frac{1}{|x-y|}d\mu (x)d\mu (y)+2\int Qd\mu \nonumber \\= & {} \iint \log \frac{1}{|x-y|w(x)w(y)}d\mu (x)d\mu (y). \end{aligned}$$
(3.4)

It is convenient to introduce the notation \(K_w(x, y)=\log \frac{1}{|x-y|w(x)w(y)}\); with this we have

$$\begin{aligned} I_w[\mu ]= & {} \iint \limits _{B_\varepsilon \times B_\varepsilon }K_w(x, y)d\mu (x)d\mu (y)+2\iint \limits _{B_\varepsilon \times B_\varepsilon ^c}K_w(x,y)d\mu (x)d\mu (y)\\{} & {} +\iint \limits _{B_\varepsilon ^c\times B_\varepsilon ^c}K_w(x, y)d\mu (x)d\mu (y). \end{aligned}$$

Observe that

$$\begin{aligned} e^ {K_w(x, y)}= \frac{e^{c_0(|x|^2+|y|^2)}}{|x-y|} \ge \frac{e^{c_0(|x|^2+|y|^2)}}{|x|+|y|} \ge \frac{1}{2}\left( \frac{e^{2c_0(|x|^2+|y|^2)}}{|x|^2+|y|^2}\right) ^{\frac{1}{2}} \end{aligned}$$

because \(\frac{1}{2}(|x|+|y|)\le \sqrt{|x|^2+|y|^2}\le |x|+|y|\). Therefore for every large constant \(T_0>0\) there is \(\varepsilon \) such that if \(\max \{|x|, |y|\}\ge \varepsilon \) then \(K_w(x, y)\ge T_0\). This yields the following estimate for \(I_w\)

$$\begin{aligned} I_w[\mu ]\ge & {} (\mu (B_\varepsilon ))^2\iint \limits _{B_\varepsilon \times B_\varepsilon }K_w(x, y)d\mu _\varepsilon d\mu _\varepsilon + 2T_0\iint \limits _{B_\varepsilon \times B_\varepsilon ^c}d\mu (x)d\mu (y) + T_0 \iint \limits _{B_\varepsilon ^c\times B_\varepsilon ^c}d\mu (x)d\mu (y)\\= & {} (\mu (B_\varepsilon ))^2\iint \limits _{B_\varepsilon \times B_\varepsilon }K_w(x, y)d \mu _\varepsilon d\mu _\varepsilon + 2T_0\mu (B_\varepsilon )(1-\mu (B_\varepsilon ))+T_0(1-\mu (B_\varepsilon ))^2\\= & {} (\mu (B_\varepsilon ))^2\iint \limits _{B_\varepsilon \times B_\varepsilon }K_w(x, y)d\mu _\varepsilon d\mu _\varepsilon + T_0(1-(\mu (B_\varepsilon ))^2). \end{aligned}$$

Thus after some simplification we get

$$\begin{aligned} I_w[\mu ]{} & {} \ge (\mu (B_\varepsilon ))^2 I_w(\mu _\varepsilon )+T_0(1-(\mu (B_\varepsilon ))^2)\nonumber \\{} & {} {\mathop {=}\limits ^{(3.4)}} (\mu (B_\varepsilon ))^2 \left[ \iint \log \frac{1}{|x-y|}d\mu _\varepsilon d\mu _\varepsilon + 2\int Q\mu _\varepsilon \right] +T_0(1-(\mu (B_\varepsilon ))^2).\nonumber \\ \end{aligned}$$
(3.5)

Step 3: Energy comparison in \(B_\varepsilon \):

Combining (3.5) with (3.1) we get

$$\begin{aligned} J[\mu ]{} & {} = \iint \log \frac{1}{|x-y|}d\mu (x)d\mu (y)+d^2(\mu , \mu _0)\\{} & {} {\mathop {\ge }\limits ^{(3.1)}} \iint \log \frac{1}{|x-y|}d\mu (x)d\mu (y) + \mu (B_\varepsilon )d^2(\mu _\varepsilon , \mu _0)+2c_0\int _{\mathbb R^2\setminus B_\varepsilon }|x|^2\mu \\{} & {} = I_w(\mu )-2c_0\int _{B_\varepsilon }|x|^2d\mu + \mu (B_\varepsilon )d^2(\mu _\varepsilon , \mu _0)\\{} & {} {\mathop {\ge }\limits ^{(3.5)}} (\mu (B_\varepsilon ))^2 \left[ \iint \log \frac{1}{|x-y|}d\mu _\varepsilon d\mu _\varepsilon + 2\int Q\mu _\varepsilon \right] +T_0(1-(\mu (B_\varepsilon ))^2)\\{} & {} \qquad -2c_0\int _{B_\varepsilon }|x|^2d\mu + \mu (B_\varepsilon )d^2(\mu _\varepsilon , \mu _0)\\{} & {} \ge (\mu (B_\varepsilon ))^2J[\mu _\varepsilon ] + 2c_0(\mu (B_\varepsilon ))^2\int |x|^2\mu _\varepsilon +T_0(1-(\mu (B_\varepsilon ))^2) -2c_0\int _{B_\varepsilon }|x|^2d\mu . \\ \end{aligned}$$

The last three terms on the last line can be further estimated from below as follows

$$\begin{aligned} J[\mu ]-(\mu (B_\varepsilon ))^2J[\mu _\varepsilon ]\ge & {} T_0(1-(\mu (B_\varepsilon ))^2) + 2c_0\mu (B_\varepsilon )\int _{B_\varepsilon } |x|^2d\mu -2c_0\int _{B_\varepsilon }|x|^2d\mu \nonumber \\= & {} T_0(1-(\mu (B_\varepsilon ))^2) - 2c_0(1-\mu (B_\varepsilon ))\int _{B_\varepsilon } |x|^2d\mu \nonumber \\= & {} (1-\mu (B_\varepsilon ))\left[ T_0(1+\mu (B_\varepsilon ))-2c_0\int _{B_\varepsilon } |x|^2d\mu \right] \nonumber \\\ge & {} (1-\mu (B_\varepsilon ))\left[ T_0 -2c_0\int _{B_\varepsilon } |x|^2d\mu \right] . \end{aligned}$$
(3.6)

In particular from here and (2.3) we see that \(J[\mu ]>-\infty \) and hence (i) follows. Now if we choose

$$\begin{aligned} T_0>1+J[\mu ] +8c_0(d^2(\mu , \mu _0)+R_0^2) \end{aligned}$$
(3.7)

then from (3.6) it follows that

$$\begin{aligned} J[\mu ]-(\mu (B_\varepsilon ))^2J[\mu _\varepsilon ]{} & {} > (J[\mu ]+1)(1-(\mu (B_\varepsilon ))^2) \\{} & {} \qquad + \ (1-\mu (B_\varepsilon ))\left[ 8c_0(d^2(\mu , \mu _0)+R_0^2)(1+\mu (B_\varepsilon ))- 2c_0\int _{B_\varepsilon } |x|^2d\mu \right] \\{} & {} {\mathop {\ge }\limits ^{(3.3)}} (J[\mu ]+1)(1-(\mu (B_\varepsilon ))^2). \end{aligned}$$

This implies \((\mu (B_\varepsilon ))^2(J[\mu ]-J[\mu _\varepsilon ])>1-(\mu (B_\varepsilon ))^2\ge 0\), so \(J[\mu _\varepsilon ]<J[\mu ]\); hence it is enough to take the minimization over \(\mathcal M (B_{\varepsilon })\), which proves (ii).

It remains to check (iii). First we estimate

$$\begin{aligned} 1+J[\mu _{k}] +8c_0(d^2(\mu _{k}, \mu _0)+R_0^2){} & {} \le 1+C+8c_0(C+R_0^2)\\{} & {} {\mathop {\le }\limits ^{(3.2)}} 1+C+7(C+R_0^2):=\hat{C}. \end{aligned}$$

From (3.7) it follows that \(T_0\) can be chosen to be the same, say \(T_0>\hat{C}\), for every \(\mu _k\) satisfying \(0\le J[\mu _k]\le C\), and the proof is complete. \(\square \)

Now we are ready to finish the proof of Theorem A.

Theorem 3.2

Let \(\rho _0\in \mathcal M(\mathbb R^{2})\) be such that \(\mathrm{{supp}}\rho _0\subset B_{R_0}\) for some \(R_0>0\). Then there exists a minimizer \(\rho \in \mathcal M(\mathbb R^{2})\) of J. Moreover, the support of \(\rho \) is bounded.

Proof

First note that if we take the uniform measure \(\mu \) on some ball B having positive distance from \(B_{R_0}\), then \(J[\mu ]<+\infty \). Hence by Proposition 3.1 (i) we have that \(J[\mu ]>-\infty \). Thus if \(\mu _k\in P_2(\mathbb R^2)\) is a minimizing sequence, then without loss of generality we can assume that \(J[\mu _k]\le C\) for some \(C>0\) uniformly in k. Moreover, from Proposition 3.1 (ii) it follows that there are positive numbers \(\varepsilon _k>0\) such that for the restricted measures \(\mu _{k, \varepsilon _k}\) we have

$$\begin{aligned} J[\mu _{k, \varepsilon _k}]<J[\mu _k]\le C. \end{aligned}$$
(3.8)

On the other hand, it follows from (2.3) that \(J[\mu _{k, \varepsilon _k}]\ge 0\) because \(\mathrm{{supp}}\mu _{k, \varepsilon _k}\) is compact. Thus \(0\le J[\mu _{k, \varepsilon _k}]\le C\) uniformly in k, and moreover \(J[\mu _{k, \varepsilon _k}]\rightarrow \inf _{\rho \in P_2(\mathbb R^2)} J[\rho ]\) thanks to (3.8). Consequently, applying Proposition 3.1 (iii), we can use the weak compactness of \(\{\mu _{k, \varepsilon _k}\}\) in \(\mathcal M(B_{\varepsilon _0})\) to extract a subsequence, still denoted \(\mu _{k, \varepsilon _k}\), converging weakly to some \(\rho \in \mathcal M(B_{\varepsilon _0})\). The logarithmic term is lower semicontinuous ([29], or [24] page 78); hence from the lower semicontinuity of d (see property 4) in Sect. 2) it follows that

$$\begin{aligned} J[\rho ]\le \liminf _{k\rightarrow \infty }J[\mu _{k, \varepsilon _k}] \end{aligned}$$

and the desired result follows. \(\square \)

4 Euler-Lagrange equation

Definition 4.1

We say that a set \(S\subset \mathbb R^n\times \mathbb R^n\) is cyclically monotone if

$$\begin{aligned} \sum ^{m}_{k=1}\left| x_{k}-y_{k}\right| ^{2}\le \sum ^{m}_{k={1}}\left| x_{k+1}-y_{k}\right| ^{2} \end{aligned}$$
(4.1)

holds whenever \(m\ge 2\) and \((x_i, y_i)\in S, 1\le i\le m\), with \(x_{m+1}=x_1\). The set \(x_1, x_2, \dots , x_m\) is called a cycle.

Cancelling the square terms in (4.1) (the sums \(\sum |x_k|^2\) and \(\sum |x_{k+1}|^2\) coincide because \(x_{m+1}=x_1\)) we get

$$\begin{aligned} \sum ^{m}_{k=1}y_{k}x_{k}\ge \sum ^{m}_{k=1}y_{k}x_{k+1 }. \end{aligned}$$
(4.2)

Let \(\gamma \) be a transference plan with marginals \(\rho , \rho _0\). It is well known that the support of \(\gamma \) is cyclically monotone, see [1] Theorem 2.2.

Let \(S\subset \mathbb {R} ^{n}\times \mathbb {R} ^{n}\) be cyclically monotone. Set \(c\left( x,y\right) =\dfrac{1}{2}\left| x-y\right| ^{2}\) and introduce the function

$$\begin{aligned} \psi \left( x\right) =\sup _{(x_{i},y_i)\in S}\left\{ c\left( x_{0},y_{0}\right) -c\left( x_{1},y_{0}\right) +c\left( x_{1},y_{1}\right) -c\left( x_{2},y_{1}\right) +\ldots +c\left( x_{k},y_{k}\right) -c\left( x,y_{k}\right) \right\} ,\nonumber \\ \end{aligned}$$
(4.3)

where the supremum is taken over all cycles of finite length. It is easy to check that \(\psi \) defined in (4.3) satisfies \( \psi \left( x\right) \le 0 \) and the normalization condition \( \psi \left( x_{0}\right) =0. \)

If \(\gamma (x, y)\) is a transference plan then its support is contained in the c-superdifferential of the c-concave function \(\psi \) constructed above; \(\psi \) is called the maximal Kantorovich potential. Moreover, we have that if \((x', y')\in \mathrm{{supp}}\gamma \) then for every \(x\in \mathbb R^n\)

$$\begin{aligned} \psi \left( x\right) +\dfrac{1}{2}\left| x-y'\right| ^{2}\ge \psi \left( x'\right) +\dfrac{1}{2}\left| x'-y'\right| ^{2}. \end{aligned}$$
(4.4)

See Theorem 2.3 of [1] for the proof.

Remark 4.2

Recall that by Corollary 2.2 of [1], if (CC) graphs are \(\rho \)-negligible then the transference plan \(\gamma \) is unique and the transport map is \(T=\nabla v\) for some convex potential v.

We want to show that in (4.4) we can take \(\psi =2U^\rho \) and that \(\rho \) is absolutely continuous with respect to the Lebesgue measure.

Lemma 4.3

Let \(\rho \) be a minimizer. Then \(U^\rho \rho \) is a signed Radon measure.

Proof

Let \(\xi \in C_0^\infty (B)\) be a cut-off function of some ball B. Let \(\{\rho _k\}_{k=1}^\infty \) be a sequence of mollifications of \(\rho \). Recall that by Remark 2.3 we have \(I[\rho _k]<\infty \), and \(\Vert \rho -\rho _k\Vert ^2=I[\rho -\rho _k]\rightarrow 0\) as \(k\rightarrow \infty \). Thus

$$\begin{aligned} \int _B U^\rho \xi \rho _k= & {} -\frac{1}{2\pi }\int _B U^\rho \xi \Delta U^{\rho _k}\nonumber \\= & {} \frac{1}{2\pi }\int \nabla (U^\rho \xi )\nabla U^{\rho _k}\le \frac{1}{2\pi }\Vert \nabla (U^\rho \xi )\Vert _{L^2}\Vert \nabla U^{\rho _k}\Vert _{L^2}. \end{aligned}$$
(4.5)

Note that, by [24] (Lemma 1.\(2'\), page 83),

$$\begin{aligned} \Vert \nabla U^{\rho _k}\Vert _{L^2}^2{} & {} = 4\pi ^2\int |\zeta |^2|\widehat{K}|^2|\widehat{\rho }_k|^2d\zeta \\{} & {} {\mathop {\le }\limits ^{(2.2)}} 4\pi ^2c_1\int \widehat{K} |\widehat{\rho }_k|^2 = 4\pi ^2c_1 I[\rho _k]\\{} & {} \le 8\pi ^2c_1I[\rho ]+8\pi ^2c_1 I[\rho -\rho _k] \rightarrow 8\pi ^2c_1 I[\rho ] \end{aligned}$$

as \(k\rightarrow \infty \). Since \(U^\rho \in H^1\) (see [22]) is superharmonic (hence bounded from below in B, say by \(C_B\)), from Fatou’s lemma we get that

$$\begin{aligned} C_B \int \xi d\rho \le \int U^\rho \xi d\rho \le \liminf _{k\rightarrow \infty }\int _B U^\rho \xi d\rho _k\le C\Vert U^\rho \Vert _{H^1(B)}(I[\rho ])^{\frac{1}{2}}, \end{aligned}$$
(4.6)

where C depends only on the dimension. \(\square \)

Theorem 4.4

Let \(\rho \) be a minimizer. Suppose the infimum in \(d(\rho , \rho _0)\) is realized by a transference plan \(\gamma \) and \((x^*, y^*)\in \mathrm{{supp}}\gamma \). Then \(\rho \) has \(L^\infty \) density with respect to the Lebesgue measure, and for every \(x_0\) we have

$$\begin{aligned} \frac{1}{2}|x_0-y^*|^2-\frac{1}{2}|x^*-y^*|^2+2U^\rho (x_0)-2U^\rho (x^*)\ge 0. \end{aligned}$$
(4.7)

Moreover, \(\nabla U^\rho \) is log-Lipschitz continuous.

Proof

Let \(\xi (x)\) be a cut-off function on \(B_\varepsilon (x^*)\). Introduce

$$\begin{aligned} d\gamma ^*_\varepsilon (x, y)=\xi (x)1_{B_\varepsilon (x^*)\times B_\varepsilon (y^*)}d\gamma (x, y). \end{aligned}$$

Note that \(\gamma ^*_\varepsilon (x, y)\) is not a probability measure. Let \(\gamma _\varepsilon (x, y)=\tau _\#\gamma ^*_\varepsilon (x, y)\), where \(\tau : (x, y)\rightarrow (x-x^*+x_0, y)\) is the translation operator in x so that \(\tau (x^*, y)=(x_0, y)\), see Fig. 1. Letting

$$\begin{aligned} \varphi ^*(x):= & {} \pi _{x\#}\gamma ^*_\varepsilon (x,y),\nonumber \\ \varphi _0(x):= & {} \pi _{x\#}\gamma _\varepsilon (x, y), \end{aligned}$$
(4.8)

be the marginals in x, we can see that the marginal in y is unchanged:

$$\begin{aligned} \iint _{\mathbb R\times \mathbb R} \eta (y) d\gamma \left( x,y\right) -t\iint _{\mathbb R\times \mathbb R} \eta (y)d\gamma ^{*}_{\varepsilon }\left( x,y\right) +t\iint _{\mathbb R\times \mathbb R} \eta (y)d\gamma _{\varepsilon }\left( x,y\right) \\= \iint _{\mathbb R\times \mathbb R} \eta (y)d\gamma \left( x,y\right) = \int _\mathbb R\eta (y)d\rho _{0}\left( y\right) \end{aligned}$$

because \(\tau \) is measure preserving. Here \(\eta \ge 0\) is a continuous function with compact support. For the other marginal, we have

$$\begin{aligned} \iint _{\mathbb R\times \mathbb R} \eta (x) d\gamma \left( x,y\right) -t\iint _{\mathbb R\times \mathbb R} \eta (x)d\gamma ^{*}_{\varepsilon }\left( x,y\right) +t\iint _{\mathbb R\times \mathbb R} \eta (x)d\gamma _{\varepsilon }\left( x,y\right) \\ =\int _{\mathbb R} \eta (x)d\rho \left( x\right) -t\int _{\mathbb R} \eta (x)d\varphi ^{*}(x) +t\int _{\mathbb R} \eta (x)d\varphi _{0}\left( x\right) . \end{aligned}$$

Observe that by (4.8) and the definition of \(\gamma _\varepsilon ^*\) we have

$$\begin{aligned} \int _{\mathbb R} \eta (x)d\rho -t\int _{\mathbb R} \eta (x)d\varphi ^*= & {} \iint _{\mathbb R\times \mathbb R}\eta (x)\left[ d\gamma \left( x,y\right) -td\gamma ^{*}_{\varepsilon }\left( x,y\right) \right] \\= & {} \iint _{\mathbb R\times \mathbb R}\eta (x)\left[ 1-t\xi (x)1_{B_\varepsilon (x^*)\times B_\varepsilon (y^*)}\right] d\gamma (x, y)\\\ge & {} 0 \end{aligned}$$

provided that t is small enough.

Consequently we can use \(\rho -t\varphi ^*+t\varphi _0\) as a competitor against \(\rho \) and get from the convexity of \(d^2\) (see Sect. 2) the following estimate

$$\begin{aligned} \begin{aligned} d^{2}\left( \rho _{0},\rho -t\varphi ^{*}+t\varphi _{0}\right)&\le \dfrac{1}{2}\iint \left| x-y\right| ^{2}d\left( \gamma {-t}\gamma ^{*}_{\varepsilon }+t\gamma _{\varepsilon }\right) \\&= d^{2}\left( \rho _{0},\rho \right) +\dfrac{t}{2}\iint \left| x-y\right| ^{2}d(\gamma _{\varepsilon }-\gamma ^{*}_{\varepsilon }). \end{aligned} \end{aligned}$$

For the nonlocal term we have

$$\begin{aligned} \begin{aligned} \iint K\left( x-y\right) d\left( \rho +t\left( \varphi _{0}-\varphi ^*\right) \right) d\left( \rho +t\left( \varphi _{0}-\varphi ^*\right) \right) =&\iint K(x-y)d\rho (x) d\rho (y)\\&+2t\int U^{\rho }\left( x\right) d\left( \varphi _{0}-\varphi ^*\right) +O\left( t^{2}\right) . \end{aligned} \end{aligned}$$
Fig. 1 The geometric construction of the joint measures \(\gamma _\varepsilon \) and \(\gamma _\varepsilon ^*\) via restriction and translation

Then the energy comparison yields

$$\begin{aligned} \dfrac{t}{2}\iint \left| x-y\right| ^{2}d(\gamma _{\varepsilon }-\gamma ^{*}_{\varepsilon }) +2t\int U^{\rho }\left( x\right) d\left( \varphi _{0}-\varphi ^*\right) +O\left( t^{2}\right) \ge 0. \end{aligned}$$

Dividing by \(t>0\) and letting \(t\rightarrow 0\) we get that

$$\begin{aligned} \dfrac{1}{2}\iint \left| x-y\right| ^{2}d(\gamma _{\varepsilon }-\gamma ^{*}_{\varepsilon }) +2\int U^{\rho }\left( x\right) d\left( \varphi _{0}-\varphi ^*\right) \ge 0. \end{aligned}$$
(4.9)

Since \(\gamma _\varepsilon \) is the push forward of \(\gamma _\varepsilon ^*\) under translation \(x\rightarrow x-x^*+x_0\) then we have from (4.9)

$$\begin{aligned} \dfrac{1}{2}\iint \left[ \left| x-x^*+x_0-y\right| ^{2}-|x-y|^2\right] d\gamma ^{*}_{\varepsilon } +2\int \left[ U^{\rho }\left( x-x^*+x_0\right) -U^\rho (x) \right] d\varphi ^*\ge 0.\nonumber \\ \end{aligned}$$
(4.10)

Taking \(x^*-x_0=\pm he_j\), where \(e_j\) is the unit vector of the jth coordinate axis, \(h>0\), and adding the resulting inequalities (4.10) we get

$$\begin{aligned} \dfrac{1}{2}\iint \left[ \left| x+he_j-y\right| ^{2}+|x-he_j-y|^2-2|x-y|^2\right] d\gamma ^{*}_{\varepsilon }\nonumber \\ +2\int \left[ U^{\rho }\left( x+he_j\right) +U^{\rho }\left( x-he_j\right) -2U^\rho (x) \right] d\varphi ^*\ge 0. \end{aligned}$$
(4.11)

But \(\left| x+he_j-y\right| ^{2}+|x-he_j-y|^2-2|x-y|^2=2h^2\), hence (4.11) is equivalent to

$$\begin{aligned} -\int \frac{U^{\rho }\left( x+he_j\right) +U^{\rho }\left( x-he_j\right) -2U^\rho (x) }{h^2}\xi d \rho \le \frac{1}{2}\int \xi d\rho . \end{aligned}$$
(4.12)

Note that by Lemma 4.3 the left hand side of (4.12) is well defined.

Claim 4.5

\(\rho \) has \(L^2\) density.

Proof

Let \(\delta _hu=\delta (x, h, u)=\frac{1}{h^2}\sum _j(u(x+he_j)+u(x-he_j)-2u(x))\) be the discrete Laplacian. Applying (4.12) with a sequence of cut-offs \(\xi _k\uparrow 1\) on \(B_{\varepsilon _0}\) and using the dominated convergence theorem (\(U^\rho \rho \) is a signed Radon measure by Lemma 4.3 and (4.6), and \(\rho \) has compact support), it follows that

$$\begin{aligned} -\int \delta _h(U^\rho ) d\rho \le \frac{1}{2}\int d\rho =\frac{1}{2}. \end{aligned}$$

Since \(\mathrm{{supp}}\rho \) is compact we can assume that K vanishes outside of \(B_{r_0}\) and consider the truncated kernel \(K_{r_0}=1_{B_{r_0}}K\). From the weak Parseval identity we get that

$$\begin{aligned} \frac{1}{2}\ge & {} -\int \widehat{ \delta _h U^\rho }\widehat{\overline{\rho }} = -\int \frac{1}{h^2}\sum _j\left[ e^{-2\pi i h\xi _j}+e^{2\pi ih\xi _j}-2\right] \widehat{U^\rho }\widehat{\overline{\rho }}\\= & {} -\int \frac{1}{h^2}\sum _j\left[ e^{-2\pi i h\xi _j}+e^{2\pi i h\xi _j}-2\right] \widehat{K}_{r_0} |\widehat{ \rho }|^2\\= & {} \frac{1}{h^2}\int \widehat{K}_{r_0} |\widehat{ \rho }|^2\sum _j2(1-\cos 2\pi h\xi _j) = 4\int \sum _j\frac{\sin ^2(\pi \xi _j h)}{h^2}\widehat{K}_{r_0} |\widehat{ \rho }|^2. \end{aligned}$$

Letting \(h\rightarrow 0\) and applying Fatou’s lemma we get

$$\begin{aligned} \frac{1}{2} \ge 4\pi ^2\int |\xi |^2\widehat{K}_{r_0} |\widehat{ \rho }|^2 {\mathop {=}\limits ^{(2.2)}} 4\pi ^2c_1\int (1-\mathcal B(2\pi r_0|\xi |)) |\widehat{ \rho }|^2. \end{aligned}$$

Since the left hand side of the previous inequality does not depend on \(r_0\) we can let \(r_0\rightarrow \infty \) and applying Fatou’s lemma again we see that

$$\begin{aligned} 4\pi ^2 c_1\int |\widehat{ \rho }|^2 \le \frac{1}{2}\int d\rho . \end{aligned}$$

Since the Fourier transform is an isometry on \(L^2\), \(\tilde{\rho }\), the inverse Fourier transform of \(\widehat{\rho }\), exists and \(\tilde{\rho }\in L^2\). But then \(\widehat{(\rho -\tilde{\rho })}=0\), and it follows that \(\rho \) has \(L^2\) density. The proof of the claim is complete. \(\square \)
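The two elementary facts driving the Fourier computation above, namely the identity \(e^{-2\pi ih\xi _j}+e^{2\pi ih\xi _j}-2=-4\sin ^2(\pi h\xi _j)\) and the limit \(4\sin ^2(\pi h\xi )/h^2\rightarrow 4\pi ^2\xi ^2\) as \(h\rightarrow 0\), can be checked numerically. The following is an illustrative sketch (not part of the proof); the sample frequency and step sizes are arbitrary choices:

```python
import math, cmath

def symbol(h, xi):
    # Fourier multiplier of the 1-d second difference (u(x+h) + u(x-h) - 2u(x)) / h^2
    return (cmath.exp(-2j * math.pi * h * xi) + cmath.exp(2j * math.pi * h * xi) - 2).real / h**2

xi = 1.7
for h in [1e-1, 1e-2, 1e-3]:
    # the exact trigonometric identity: symbol = -4 sin^2(pi h xi) / h^2
    assert abs(symbol(h, xi) + 4 * math.sin(math.pi * h * xi)**2 / h**2) < 1e-6
# as h -> 0 the symbol tends to -4 pi^2 xi^2, the symbol of d^2/dx^2
assert abs(symbol(1e-4, xi) + 4 * math.pi**2 * xi**2) < 1e-2
```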

Returning to the localized inequality (4.12) with \((x^*, y^*)\in \mathrm{{supp}}\gamma \) we get

$$\begin{aligned} -\int \delta _h(U^\rho ) \xi \rho dx\le \int \xi \rho dx. \end{aligned}$$
(4.13)

Using the weak convergence of second order finite differences in \(L^2\) we finally obtain from (4.13) and \(\delta _hU^\rho \rightarrow \Delta U^\rho =-2\pi \rho \)

$$\begin{aligned} 2\pi \int _{B_\varepsilon (x^*)}\rho ^2\xi dx\le \int _{B_\varepsilon (x^*)}\rho \xi dx\le \left( \int _{B_\varepsilon (x^*)}\rho ^2\xi dx\right) ^{\frac{1}{2}}\left( \int _{B_\varepsilon (x^*)}\xi dx\right) ^{\frac{1}{2}}. \end{aligned}$$

Consequently, the upper Lebesgue density of the measure \(\rho \) is bounded by a universal constant and hence \(d\rho =fdx\) for some \(f\in L^\infty (\mathbb R^n)\) [18]. Therefore, by Judovič’s theorem [21], \(\nabla U^\rho \) is log-Lipschitz continuous. Moreover, by construction

$$\begin{aligned} \int \varphi _0(x)=\int \varphi ^*(x)=\iint \gamma _\varepsilon =\iint \gamma _\varepsilon ^*. \end{aligned}$$

Hence from (4.9) and the mean value theorem we get that

$$\begin{aligned} \frac{1}{2}|x_0-y^*|^2-\frac{1}{2}|x^*-y^*|^2+2U^\rho (x_0)-2U^\rho (x^*)\ge 0. \end{aligned}$$

Thus \(2U^\rho (x_0)+\frac{1}{2}|x_0-y^*|^2\ge 2U^\rho (x^*)+ \frac{1}{2}|x^*-y^*|^2\). \(\square \)

Corollary 4.6

Let \(\rho \) be a minimizer of J, then \(U^\rho =\psi \) on \(\mathrm{{supp}}\rho \). Furthermore, \(\mathrm{{supp}}\rho \) has nonempty interior.

Proof

In view of (4.4) and (4.7), \(U^\rho \) and \(\psi \) have the same c-subdifferential on \(\mathrm{{supp}}\rho \); it follows that \(U^\rho =\psi \), and at a free boundary point where \(x^*=y^*\) we have \(\nabla U^\rho (x^*)=0\). The last claim follows from the log-Lipschitz continuity of \(\nabla U^\rho \). \(\square \)

5 Regularity of free boundary

Let \(x^*\in \mathrm{{supp}}\rho \), then from (4.7) we have for every x

$$\begin{aligned} U^\rho (x^*)\le U^\rho (x)+\frac{1}{4}\left[ |x-x^*|^2-|x^*-y^*|^2\right] . \end{aligned}$$

Therefore \(U^\rho (x^*)\le U^\rho (x)\) if \(x\in B_{|x^*-y^*|}(x^*):=B\) and \(x^*\not =y^*\). Consequently \(U^\rho \) has a local minimum in \(\overline{B}\) at \(x^*\in \partial B\), and since \(U^\rho \) is superharmonic in \(\mathbb R^2\) it follows from Hopf’s lemma, applied to a ball with diameter \(\overline{x^*y^*}\), that the normal derivative \(\partial _\nu U^\rho (x^*)<0\), where \(\nu =\frac{x^*-y^*}{|x^*-y^*|}\). Hence at the remaining free boundary points we must have \(x^*=y^*\) and therefore \(\nabla \psi (x^*)=0\).

Definition 5.1

Let T be the transport map. We say that \(x\in \mathrm{{supp}}\rho \cap \mathrm{{supp}}\rho _0\) is a singular free boundary point if \(x=T(x), \nabla U^\rho (x)=0\) and

$$\begin{aligned} \limsup _{t\downarrow 0}\frac{1}{|B_t|}\int _{B_t(x)}\rho =0. \end{aligned}$$

The set of singular points is denoted by S.

Lemma 5.2

Let 0 be a singular free boundary point and \(\rho _0\ge s>0\) on \(\mathrm{{supp}}\rho _0\). Then for every small \(\varepsilon >0\) there is \(R^*>0\) such that the set of singular points in \(B_R, R<R^*\) can be trapped between two parallel planes at distance \(\frac{\sqrt{8n+1}}{(sc_n)^{\frac{1}{2n}}} \varepsilon ^{\frac{1}{2n}}R\) where \(c_n=|B_1|\).

Proof

Let \(\mathcal K\) be the convex hull of the singular set in \(B_R\). Then there is \(x_0\in B_R\) and an ellipsoid E (John’s ellipsoid, [17, page 139]) so that

$$\begin{aligned} x_0+\frac{1}{n} E\subset \mathcal K\subset x_0+E. \end{aligned}$$

Let r be the smallest axis of E. By the mass balance condition

$$\begin{aligned} \int _{B_a} \rho (x)dx=\int _{T(B_a)}\rho _0(y)dy. \end{aligned}$$
(5.1)

By assumption 0 is a singular point, so we have \(\limsup \limits _{t\downarrow 0}\frac{1}{|B_t|}\int _{B_t} \rho (x)dx=0\). Thus for every \(\varepsilon >0\) small there is \(a_0\) such that

$$\begin{aligned} \int _{B_a}\rho (x)dx\le \varepsilon a^n\quad \text{ whenever }\ a<a_0. \end{aligned}$$
(5.2)

Since by assumption \(\rho _0\ge s>0\), we have

$$\begin{aligned} \int _{T(B_a)} \rho _0 \ge \int _{B_{\frac{r}{2n}}\left( x_{0}\right) }\rho _{0}\left( y\right) dy > sr^{n}c_n \end{aligned}$$
(5.3)

while \( \int _{B_a}\rho \left( x\right) dx < \varepsilon a^{n}. \)

Consequently, combining (5.1)–(5.3) we get \( \varepsilon a^{n} > s r^{n}c_n\) or

$$\begin{aligned} a > \left[ \dfrac{s c_n }{\varepsilon }\right] ^{\frac{1}{n}}r. \end{aligned}$$
(5.4)

It follows that (for small R and \(\varepsilon \)) there is a point \(A\in B_{\frac{r}{2n}} \left( x_0\right) \cap \{\rho _0>0 \}\) and \(B\in \{\rho >0\}\) so that \(|OB| \sim a\) and \(T^{-1}\left( A\right) =B\).

Fig. 2 The construction used in the proof of Lemma 5.2. The red ball is \(B_{\frac{r}{2n}}(x_0)\), and \(T^{-1}(A)=B\)

Let \(x_s\) be a singular point. Notice that \(x_{s}=T\left( x_{s}\right) \), i.e. the singular free boundary points are fixed points. From the monotonicity (4.2)

$$\begin{aligned} \left( x_{s}-A\right) \left( T^{-1}\left( x_{s}\right) -T^{-1}\left( A\right) \right) \ge 0 \end{aligned}$$

or \(\left( x_{s}-A\right) \left( x_{s}-B\right) \ge 0\). Let \(m=\dfrac{A+B}{2}\) be the midpoint of the segment AB, then

$$\begin{aligned} \left( x_{s}-A\right) \left( x_{s}-B\right)= & {} \left( x_{s}-B+B-A\right) \left( x_{s}-B\right) \\= & {} \left| x_{s}-B\right| ^{2} +\left( B-A\right) \left( x_{s}-B\right) \\= & {} \left| x_{s}-B\right| ^{2} -\left( A-B\right) \left( x_{s}-B\right) \\= & {} \left| x_{s}-B\right| ^2-2\dfrac{A-B}{2}\left( x_{s}-B\right) +\left| \dfrac{A-B}{2}\right| ^{2}-\left| \dfrac{A-B}{2}\right| ^{2}\\= & {} |x_s-m|^2 -\left| \dfrac{A-B}{2}\right| ^{2} \ge 0 \end{aligned}$$

because

$$\begin{aligned} \left| x_{s}-B-\dfrac{A-B}{2}\right| ^{2}=\left| x_{s}-m\right| ^{2}. \end{aligned}$$

Hence we arrive at

$$\begin{aligned} \left| x_{s}-m\right| ^{2}\ge \left| \dfrac{A-B}{2}\right| ^{2}. \end{aligned}$$
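The chain of equalities above reduces to the elementary identity \((x_{s}-A)\cdot (x_{s}-B)=|x_{s}-m|^{2}-\left| \frac{A-B}{2}\right| ^{2}\) with \(m=\frac{A+B}{2}\). A quick numerical verification with sample points in the plane (illustrative only, not part of the proof):

```python
# verify (x - A).(x - B) = |x - m|^2 - |(A - B)/2|^2 for the midpoint m = (A + B)/2
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

A, B, x = [1.0, -2.0], [3.0, 0.5], [-0.7, 4.2]   # arbitrary sample points
m = [(a + b) / 2 for a, b in zip(A, B)]
lhs = dot(sub(x, A), sub(x, B))
rhs = dot(sub(x, m), sub(x, m)) - dot(sub(A, B), sub(A, B)) / 4
assert abs(lhs - rhs) < 1e-12
```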

From simple geometric considerations we have that (see Figure 2)

$$\begin{aligned} \left| AP\right| =\left| AB\right| -\left| CB\right| \cos \alpha =\left| AB\right| -\left| AB\right| \cos ^{2}\alpha =\left| AB\right| \sin ^{2}\alpha . \end{aligned}$$

Note that \(\sin \alpha =\dfrac{\left| AC\right| }{\left| AB\right| }\le \dfrac{2R}{\left| AB\right| }\), hence it follows that

$$\begin{aligned} \left| AP\right| \le \left| AB\right| \dfrac{4R^{2}}{\left| AB\right| ^{2}}=\dfrac{4R^{2}}{\left| AB\right| }. \end{aligned}$$

Therefore \(S\cap B_R\) is on one side of the hyperplane containing the intersection \(B_R\) and the ball with diameter AB, see Fig. 2. Hence

$$\begin{aligned} \dfrac{r}{2n}\le \dfrac{4R^{2}}{\left| AB\right| } \end{aligned}$$

or, in view of (5.4), we get \(4R^{2}\ge \dfrac{r}{2n}\left[ r\left( \dfrac{sc_n}{\varepsilon }\right) ^{1/n}-R\right] \). From here

$$\begin{aligned} \dfrac{r^{2}}{2n}\left( \dfrac{sc_n}{\varepsilon }\right) ^{1/n}\le R^{2}(4+\frac{1}{2n}) \end{aligned}$$

implying \(r \le \frac{\sqrt{8n+1}}{(sc_n)^{\frac{1}{2n}}} \varepsilon ^{\frac{1}{2n}}R\) and the proof is complete. \(\square \)
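The last algebraic step can be double checked: solving \(\frac{r^{2}}{2n}\left( \frac{sc_n}{\varepsilon }\right) ^{1/n}\le R^{2}\left( 4+\frac{1}{2n}\right) \) for r reproduces the constant \(\frac{\sqrt{8n+1}}{(sc_n)^{\frac{1}{2n}}}\) in the statement of the lemma. A numerical sketch with sample parameters (illustrative values only):

```python
import math

n, s, eps, R = 2, 0.3, 1e-3, 0.1            # sample parameters
c_n = math.pi                                # |B_1| in dimension n = 2

# largest r allowed by (r^2 / 2n) * (s c_n / eps)^(1/n) <= R^2 * (4 + 1/(2n))
r_max = math.sqrt(2 * n * (4 + 1 / (2 * n)) * R**2 * (eps / (s * c_n))**(1 / n))

# the closed form stated in Lemma 5.2, using 2n(4 + 1/(2n)) = 8n + 1
r_stated = math.sqrt(8 * n + 1) / (s * c_n)**(1 / (2 * n)) * eps**(1 / (2 * n)) * R

assert abs(r_max - r_stated) < 1e-12
```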

Lemma 5.3

Let \(\omega (R)\) be the height of the slab containing \(S\cap B_R\) (see (1.7)) and let \(B_i=B_{r_i}(x_i)\) be a collection of disjoint balls included in \(B_R\) with \(x_i\in S\). Then for every \(\beta >n-1\) we have

$$\begin{aligned} \sum r_i^\beta \le C\frac{R^\beta }{\omega ^{n-1}(R)}\frac{1}{1-\omega ^{\beta -(n-1)}(R)}. \end{aligned}$$

Proof

Rotate the coordinate system such that \(x_n\) points in the direction of the normal of the parallel planes which are \(\omega ({R})\) apart and contain \(S\cap B_R\). Let \(\mathcal F_0\) be the collection of the balls satisfying \(R\omega (R)<r_i\le R\). If \(B_i\in \mathcal F_0\) then \(\text{ diam }\left( B_i\cap \{x_n=0\}\right) \ge \frac{1}{2}R\omega ({R})\). Therefore there are at most

$$\begin{aligned} \frac{R^{n-1}}{\left( \frac{1}{2} R \omega ({R})\right) ^{n-1}}=\frac{2^{n-1}}{\left( \omega ({R})\right) ^{n-1}} \end{aligned}$$

such balls. Thus we have

$$\begin{aligned} \sum _{B_i\in \mathcal F_0}r^\beta _i\le \frac{2^{n-1}}{\left( \omega ({R})\right) ^{n-1}}R^\beta \end{aligned}$$

and \(\{ B_i\}\setminus \mathcal F_0\) can be covered by balls \(\widehat{B}_{4R\omega ({R})}(y_j)\) such that \(y_j\in \{x_n=0\}\cap B_R\) and \(1\le j\le \frac{1}{\left( \omega ({R})\right) ^{n-1}}\). For each j we have \(S\cap \widehat{B}_{4R\omega ({R})}(y_j)\) is contained in the slab of width

$$\begin{aligned} R\omega ({R})\left( \omega ({R\omega ({R})})\right) \le R\left( \omega ({R})\right) ^2. \end{aligned}$$

Hence let \(\mathcal F_1\) be the collection of the balls \(B_i\) contained in \(\cup _j \widehat{B}_{4R\omega ({R})}(y_j)\) and satisfying \(R\left( \omega ({R})\right) ^2<r_i\le R\omega ({R}).\) Then every ball \(B_i\) in \(\mathcal F_1\) intersects \(\{x_n=0\}\) in a set with \(\text{ diam }(B_i\cap \{x_n=0\})\ge \frac{1}{2} R\left( \omega ({R})\right) ^2\), and the number of such balls \(B_i\) is at most

$$\begin{aligned} \frac{\left( R\omega ({R})\right) ^{n-1}}{\left( R\left( \omega ({R})\right) ^2\right) ^{n-1}}=\frac{1}{\left( \omega ({R})\right) ^{n-1}}. \end{aligned}$$

Consequently

$$\begin{aligned} \sum _{B_i\in \mathcal F_1}r_i^\beta =\frac{1}{\left( \omega ({R})\right) ^{n-1}}\sum _{B_i\in \widehat{B}_{R\omega ({R})}(y_1)}\left( R\omega ({R})\right) ^\beta \le \frac{R^\beta }{\left( \omega ({R})\right) ^{2(n-1)}}. \end{aligned}$$

Again, as above we can choose at most \(\frac{1}{\left( \omega ({R})\right) ^{n-1}}\) balls \(\widehat{B}_{R\left( \omega ({R})\right) ^2}(y_l), l\le \frac{1}{\left( \omega ({R})\right) ^{n-1}}\), that cover \(\{B_i\}\setminus (\mathcal F_0\cup \mathcal F_1)\). We define \(\mathcal F_m\) inductively such that \(R\left( \omega ({R})\right) ^{m}<r_i\le R\left( \omega ({R})\right) ^{m-1}\) for \(B_i\in \mathcal F_m\); then repeating the argument above we have that

$$\begin{aligned} \sum _{B_i\in \mathcal F_m}r_i^\beta \le \left( \frac{1}{\left( \omega ({R})\right) ^{n-1}}\right) ^{m+1}\left( R\left( \omega ({R})\right) ^m\right) ^{\beta }. \end{aligned}$$

Therefore

$$\begin{aligned} \sum _i r_i^\beta\le & {} \sum _{m=0}^\infty \sum _{B_i\in \mathcal F_m} r_i^\beta \le \sum _{m=0}^\infty \left( \frac{1}{\left( \omega ({R})\right) ^{n-1}}\right) ^{m+1}\left( R\left( \omega ({R})\right) ^m\right) ^{\beta }\\= & {} \frac{R^{\beta }}{\left( \omega ({R})\right) ^{n-1}}\sum _{m=0}^\infty \left( \left( \omega ({R})\right) ^{\beta -(n-1)}\right) ^m\\= & {} \frac{R^{\beta }}{\left( \omega ({R})\right) ^{n-1}}\frac{1}{1-\left( \omega ({R})\right) ^{\beta -(n-1)}}. \end{aligned}$$

\(\square \)
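The final summation is a geometric series, convergent because \(\omega (R)<1\) and \(\beta >n-1\). A direct numerical check of \(\sum _{m\ge 0}\left( \omega ^{n-1}\right) ^{-(m+1)}(R\omega ^m)^\beta =\frac{R^\beta }{\omega ^{n-1}}\frac{1}{1-\omega ^{\beta -(n-1)}}\) with sample values (illustrative only):

```python
# check the geometric series used to sum the contributions of the families F_m
n, beta, R, omega = 3, 2.5, 1.0, 0.4        # sample values with beta > n - 1, omega < 1

partial = sum((1 / omega**(n - 1))**(m + 1) * (R * omega**m)**beta for m in range(200))
closed = R**beta / omega**(n - 1) / (1 - omega**(beta - (n - 1)))
assert abs(partial - closed) < 1e-10
```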

Now we can finish the proof of Theorem C.

Theorem 5.4

Suppose \(\omega ({R})= R^\sigma \), then there is \(\sigma '>0\) depending only on \(n, \sigma \) such that \(S\subset M_0\cup \bigcup _{i=1}^\infty M_i\) where \(\mathcal {H}^{n-1-\sigma '}(M_0)=0\) and \(M_i\) is a \(C^1\) hypersurface such that the measure theoretic normal exists at each \(x\in S\cap M_i, i\ge 1\).

Proof

Let \(x\in S\) be such that there exists a unique normal in the measure theoretic sense, see Definition 5.6 [18]. Notice that at a point x where such a normal exists the set has an approximate tangent plane. Therefore the projections of \(B_R(x)\cap S\) onto two dimensional planes have diameter at least 2R. Thus we let \(M_0\) be the subset of S such that for \(x\in M_0\) there is a sequence \(R_k\rightarrow 0\) such that the projection of \(B_{R_k}(x)\cap S\) onto some two dimensional plane is of order \(R_k^{1+\sigma }\).

Now let \(B_{r_i}(x_i)\) be a Besicovitch type covering of \(B_R\cap M_0\). Let us cover \(B_{r_i}(x_i)\cap M_0\) with balls of radius \(r_i^{1+\frac{\sigma }{2}}\); then there are at most

$$\begin{aligned} \frac{r_i^{n-2}}{r^{(n-2)(1+\frac{\sigma }{2})}_i}=\frac{1}{r^{\frac{\sigma }{2}(n-2)}_i} \end{aligned}$$

such balls. Hence for \(\alpha >0\) we have

$$\begin{aligned} \sum _i r_i^\alpha \le \sum _i \frac{1}{r^{\frac{\sigma }{2}(n-2)}_i}r_i^{\alpha (1+\frac{\sigma }{2})}=\sum _i r_i^{\alpha (1+\frac{\sigma }{2})-\frac{\sigma }{2}(n-2)}. \end{aligned}$$

Now we choose \(\delta =\frac{\sigma }{4}\), set \(\beta :=n-1+\delta \), and determine \(\alpha \) from

$$\begin{aligned} \alpha (1+\frac{\sigma }{2})-\frac{\sigma }{2}(n-2)=n-1+\delta =\beta . \end{aligned}$$

We want to show that for this choice of \(\beta \) we get \(\alpha =n-1-\sigma '\) for some \(\sigma '>0\) depending on n and \(\sigma \). Indeed, we have

$$\begin{aligned} \alpha:= & {} \frac{(n-1)+\delta +\frac{\sigma }{2}(n-2)}{1+\frac{\sigma }{2}}= \frac{(n-1)+\frac{\sigma }{4}+\frac{\sigma }{2}(n-2)}{1+\frac{\sigma }{2}}\\= & {} \left( (n-1)+\frac{\sigma }{4}+\frac{\sigma }{2}(n-2)\right) \left( 1-\frac{\sigma }{2}+o(\sigma )\right) \\= & {} n-1+\frac{\sigma }{4}(1+2(n-2)-2(n-1))+o(\sigma )\\= & {} n-1-\frac{\sigma }{4}+o(\sigma )\ge n-1-\sigma '. \end{aligned}$$

\(\square \)
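The expansion of \(\alpha \) above can be checked numerically: with \(\delta =\frac{\sigma }{4}\) the error term \(o(\sigma )\) in \(\alpha =n-1-\frac{\sigma }{4}+o(\sigma )\) is in fact of order \(\sigma ^2\). A sketch with a sample dimension (illustrative only):

```python
def alpha(n, sigma):
    # the exponent solving alpha * (1 + sigma/2) - (sigma/2)(n - 2) = n - 1 + sigma/4
    return ((n - 1) + sigma / 4 + (sigma / 2) * (n - 2)) / (1 + sigma / 2)

n = 3
for sigma in [1e-2, 1e-3, 1e-4]:
    # alpha = n - 1 - sigma/4 + o(sigma); the deviation is of order sigma^2
    assert abs(alpha(n, sigma) - (n - 1 - sigma / 4)) < sigma**2
```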

6 Random matrices: an example

In this section we discuss a problem related to random matrices which leads to the obstacle problem (1.4). Let H be a Hermitian matrix, i.e. \(H_{ij}=\bar{H}_{ji}\) (or \(H^\dagger =H\) for short), where \(\bar{H}_{ij}\) are the complex conjugates of the entries of the \(N\times N\) matrix H. One of the well known random matrix ensembles is the Gaussian ensemble, for which the probability distribution is given by the formula

$$\begin{aligned} P(H\in E)=\int _E e^{-\kappa \textrm{Trace}( H^2)}dH, \end{aligned}$$

where \(\kappa >0\) and

$$\begin{aligned} \hbox {Trace} H^2=\sum _{ij}|H_{ij}|^2 \end{aligned}$$

is the trace of the squared matrix [27]. The dispersion is the same for every H in the ensemble.

The corresponding statistical sum (partition function) is

$$\begin{aligned} Z_{N}=\int e^{-\kappa \textrm{Trace}( H^{2})}dH. \end{aligned}$$

\(Z_N\) can be rewritten in an equivalent form

$$\begin{aligned} Z_N=C_N\int \prod _{i<k} (x_i-x_k)^2 \prod _{i} e^{-\kappa x_i^2} dx_i=\int e^{-W}dx_1\dots dx_N, \end{aligned}$$

where

$$\begin{aligned} W=-\sum _{i\not =j}\log |x_i-x_j|+Ng\sum _{i}x_i^2 \end{aligned}$$

where we have set \(\kappa =Ng\) for convenience. If we assume that the particles (in equilibrium) have density \(\rho \), then approximating the sums by Riemann integrals we get that

$$\begin{aligned}W\sim -N^2\int \int \log |x-y|\rho (x)\rho (y)dxdy+ {N^2}g\int \rho (x)|x|^2.\end{aligned}$$

As \(N\rightarrow \infty \) the main contribution comes from the minimum of the functional

$$\begin{aligned} F[\rho ]=-\int \int \log |x-y|\rho (x)\rho (y)dxdy+g\int \rho (x)|x|^2 \end{aligned}$$

with respect to the constraint \(\int _\mathbb R\rho =1\).

If in W the quadratic term is replaced by \({-\frac{1}{2}|x_i-y_i|^2\gamma (x_i, y_i)}, g\sim N, H_0=\text{ diag }(y_1, \dots , y_N)\), then we get the model corresponding to the energy J.

Remark 6.1

Let \(n=1\), then the first variation of \(F[\rho ]\) gives

$$\begin{aligned} -2\int _{\mathbb R}\log |x-y|\rho (y)dy+{x^2}g=\lambda , \end{aligned}$$

where \(\lambda \) is the Lagrange multiplier of the constraint \(\int _\mathbb R\rho =1\). Differentiating in x we get

$$\begin{aligned} \hbox {P.V.}\int _{-\infty }^{+\infty } \frac{\rho (y)dy}{x-y}=xg. \end{aligned}$$

The solution of this equation (given in terms of Hilbert’s transform) has the form

$$\begin{aligned}\rho (x)=\left\{ \begin{array}{ll} \frac{g}{\pi }\sqrt{\frac{2}{g}-x^2} &{}\text{ if }\ |x|<\sqrt{\frac{2}{g}},\\ 0 &{}\text{ if }\ |x|>\sqrt{\frac{2}{g}}, \end{array} \right. \end{aligned}$$

and this is Wigner’s famous semicircle law [35], see also [32].
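As a sanity check, the semicircle density with prefactor \(\frac{g}{\pi }\) — the normalization for which \(\int _\mathbb R\rho =1\) — can be integrated numerically. A sketch with a sample value of g (the grid size is an arbitrary choice):

```python
import math

g = 0.5                        # sample coupling constant
R = math.sqrt(2.0 / g)         # edge of the support

def rho(x):
    # Wigner semicircle density rho(x) = (g/pi) sqrt(2/g - x^2) on |x| < R, 0 outside
    return (g / math.pi) * math.sqrt(max(2.0 / g - x * x, 0.0))

# midpoint rule over the support: the total mass should be 1
N = 100000
h = 2 * R / N
mass = sum(rho(-R + (k + 0.5) * h) for k in range(N)) * h
assert abs(mass - 1.0) < 1e-4
```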

For the problem with \(d^2\) we have \(2U^\rho +\frac{1}{2}|x-T(x)|^2=\lambda \), where \(T:x\rightarrow y\) is the transport map. Since by Theorem B, \(x-T(x)=-2\frac{dU^\rho }{dx}\), it follows that \(U^\rho +\left| \frac{d}{dx} U^\rho \right| ^2=\lambda /2\). Hence \(U^\rho \le \lambda /2\) on \(\mathrm{{supp}}\rho \) and

$$\begin{aligned} \pm \frac{d}{dx} U^\rho =\sqrt{\lambda /2-U^\rho } \end{aligned}$$

or equivalently \( \pm 2 \sqrt{\lambda /2-U^\rho }=x+C, \) where C is an arbitrary constant. Thus after normalization we get that

$$\begin{aligned} 2U^\rho =\lambda -\frac{x^2}{2} \quad \text{ on }\ \mathrm{{supp}}\rho . \end{aligned}$$

7 The nonlocal Monge–Ampère equation

In this section we use a finite step approximation to obtain a solution, at least formally, to the equation

$$\begin{aligned} \partial _t \rho =\nabla \rho \nabla U^\rho +\Delta U^\rho \rho ={\text {div}}(\rho \nabla U^\rho ). \end{aligned}$$
(7.1)

From Corollary 4.6 we have

$$\begin{aligned} y(x)=x+2\nabla U^\rho (x). \end{aligned}$$

Consequently, the prescribed Jacobian equation is

$$\begin{aligned} \det (Id+2D^2 U^\rho )=\frac{\rho (x)}{\rho _0(x+2\nabla U^\rho )}. \end{aligned}$$

Note that this is a nonlocal Monge–Ampère equation. By standard \(W^{2, p}\) estimates for the potential \(U^\rho \) it follows that \(\mathrm{{supp}}\rho _0\setminus \mathrm{{supp}}\rho \) has vanishing Lebesgue measure.

Let \(h>0\) be small and consider the perturbed energy

$$\begin{aligned} \frac{h}{2}\iint K(x-y)d\rho d\rho +d^2(\rho , \rho _0). \end{aligned}$$

Linearizing the corresponding prescribed Jacobian equation

$$\begin{aligned} \det (Id+hD^2 U^\rho )=\frac{\rho (x)}{\rho _0(x+h\nabla U^\rho )} \end{aligned}$$

we get

$$\begin{aligned} \rho (x)= & {} \left[ 1+h\Delta U^\rho +O(h^2)\right] \rho _0(x+h\nabla U^\rho )\\= & {} \left[ 1+h\Delta U^\rho +O(h^2)\right] \left( \rho _0(x)+h\nabla \rho _0(x)\nabla U^\rho +O(h^2)\right) . \end{aligned}$$

Consequently

$$\begin{aligned} \rho (x)-\rho _0(x)=h\nabla \rho _0(x)\nabla U^\rho (x)+h\Delta U^\rho (x) \rho _0(x)+O(h^2) \end{aligned}$$

or after iterations \(\rho _0, \rho _1, \rho _2, \dots \) with step \(\frac{h}{2}\) we get

$$\begin{aligned} \rho _k(x)-\rho _{k-1}(x)=h\nabla \rho _{k-1}(x)\nabla U^\rho (x)+h\Delta U^\rho (x) \rho _{k-1}(x)+O(h^2). \end{aligned}$$

Therefore, sending \(h\rightarrow 0\) we obtain the equation

$$\begin{aligned} \partial _t \rho =\nabla \rho \nabla U^\rho +\Delta U^\rho \rho ={\text {div}}(\rho \nabla U^\rho ). \end{aligned}$$
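The finite step iteration above can be sketched numerically. The following is a minimal one dimensional explicit Euler discretization of \(\partial _t \rho ={\text {div}}(\rho \nabla U^\rho )\); the grid, the Gaussian initial datum, the regularization of the logarithmic kernel on the diagonal, and the step sizes are all illustrative choices, not prescribed by the paper:

```python
import math

N, L = 200, 4.0
dx = 2 * L / N
xs = [-L + (i + 0.5) * dx for i in range(N)]
rho = [math.exp(-x * x) / math.sqrt(math.pi) for x in xs]   # total mass ~ 1

def potential(rho):
    # U[i] ~ -sum_j log|x_i - x_j| rho_j dx, with the diagonal cell regularized
    U = []
    for i in range(N):
        s = 0.0
        for j in range(N):
            r = abs(xs[i] - xs[j])
            s -= math.log(max(r, dx / 2)) * rho[j] * dx
        U.append(s)
    return U

dt = 1e-3
mass0 = sum(rho) * dx
for _ in range(10):                                         # a few explicit Euler steps
    U = potential(rho)
    # flux[k] = rho * dU/dx at the interior node i = k + 1 (central differences)
    flux = [rho[i] * (U[i + 1] - U[i - 1]) / (2 * dx) for i in range(1, N - 1)]
    new = rho[:]
    for i in range(2, N - 2):
        # d_t rho = d/dx (rho dU/dx); the divergence form keeps the mass conserved
        new[i] = rho[i] + dt * (flux[i] - flux[i - 2]) / (2 * dx)
    rho = new

assert abs(sum(rho) * dx - mass0) < 1e-2
```

Only the divergence form update matters here; a production scheme would use upwinding, a faster potential evaluation, and an adaptive step.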