Abstract
We analyze under which conditions equilibration between two competing effects, repulsion modeled by nonlinear diffusion and attraction modeled by nonlocal interaction, occurs. This balance leads to continuous compactly supported radially decreasing equilibrium configurations for all masses. All stationary states with suitable regularity are shown to be radially symmetric by means of continuous Steiner symmetrization techniques. Calculus of variations tools allow us to show the existence of global minimizers among these equilibria. Finally, in the particular case of Newtonian interaction in two dimensions they lead to uniqueness of equilibria for any given mass up to translation and to the convergence of solutions of the associated nonlinear aggregation-diffusion equations towards this unique equilibrium profile up to translations as \(t\rightarrow \infty \).
1 Introduction
The evolution of interacting particles and their equilibrium configurations has attracted the attention of many applied mathematicians and mathematical analysts for years. The continuum description of interacting particle systems usually leads to analyzing the behavior of a mass density \(\rho (t,x)\) of individuals at a certain location \(x\in {\mathbb {R}}^d\) and time \(t\ge 0\). Most of the derived models result in nonlinear aggregation-diffusion partial differential equations through different asymptotic or mean-field limits [14, 29, 75]. The different effects reflect that equilibria are obtained by competing behaviors: the repulsion between individuals/particles is modeled through nonlinear diffusion terms while their attraction is integrated via nonlocal forces. This attractive nonlocal interaction takes into account that the presence of particles/individuals at a certain location \(y\in {\mathbb {R}}^d\) produces a force on particles/individuals located at \(x\in {\mathbb {R}}^d\) proportional to \(-\nabla W(x-y)\), where the given interaction potential \(W:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) is assumed to be radially symmetric and increasing, consistent with attractive forces. The evolution of the mass density of particles/individuals is given by the nonlinear aggregation-diffusion equation of the form:
$$\begin{aligned} \partial _t \rho = \Delta \rho ^m + \nabla \cdot \left( \rho \nabla (W*\rho )\right) , \qquad x\in {\mathbb {R}}^d,\ t\ge 0, \end{aligned}\tag{1.1}$$
with initial data \(\rho _0 \in L^1_+({\mathbb {R}}^d)\cap L^m({\mathbb {R}}^d)\). We will work with degenerate diffusions, \(m>1\), that appear naturally in modelling repulsion with very concentrated repelling nonlocal forces [14, 75], but also with linear and fast diffusion ranges \(0<m\le 1\), which are also classical in applications [59, 77]. These models are ubiquitous in mathematical biology where they have been used as macroscopic descriptions for collective behavior or swarming of animal species, see [15, 20, 69,70,71, 84] for instance, or more classically in chemotaxis-type models, see [11, 13, 26, 53, 54, 59, 77] and the references therein.
On the other hand, this family of PDEs is a particular example of nonlinear gradient flows in the sense of optimal transport between mass densities, see [2, 33, 34]. The main implication for us is that there is a natural Lyapunov functional for the evolution of (1.1) defined on the set of centered mass densities \(\rho \in L^1_+({\mathbb {R}}^d)\cap L^m({\mathbb {R}}^d)\) given by
$$\begin{aligned} \mathcal {E}[\rho ] = \frac{1}{m-1}\int _{{\mathbb {R}}^d} \rho ^m \, dx + \frac{1}{2}\int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \rho (x) W(x-y) \rho (y)\, dx\, dy, \end{aligned}\tag{1.2}$$
being the last integral defined in the improper sense, and if \(m=1\) we replace the first integral of \(\mathcal {E} [\rho ]\) by \( \int _{{\mathbb {R}}^d} \rho \log \rho dx\). Therefore, if the balance between repulsion and attraction occurs, these two effects should determine stationary states for (1.1) including the stable solutions possibly given by local (global) minimizers of the free energy functional (1.2).
Many properties and results have been obtained in the particular case of the Newtonian attractive potential due to its applications in mathematical modeling of chemotaxis [59, 77] and gravitational collapse models [78]. In the classical 2D Keller–Segel model with linear diffusion, it is known that equilibria can only happen in the critical mass case [10] while self-similar solutions are the long time asymptotics for subcritical mass cases [13, 22]. For supercritical masses, all solutions blow up in finite time [54]. It was shown in [23, 63] that degenerate diffusion with \(m>1\) is able to regularize the 2D classical Keller–Segel problem, where solutions exist globally in time regardless of their mass, and each solution remains uniformly bounded in time. For the Newtonian attractive interaction in dimension \(d\ge 3\), the authors in [9] show that the value of the degeneracy of the diffusion that allows the mass to be the critical quantity for the dichotomy between global existence and finite time blow-up is given by \(m=2-2/d\). In fact, based on scaling arguments it is easy to argue that for \(m>2-2/d\), the diffusion term dominates when the density becomes large, leading to global existence of solutions for all masses. This result was shown in [80] together with the global uniform bound of solutions for all times.
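The scaling heuristic can be made explicit in a short computation, given here only as a sketch. It assumes that the free energy consists of the entropy term \(\frac{1}{m-1}\int \rho ^m\,dx\) plus the Newtonian interaction energy with kernel \(W(x)=-c_d|x|^{2-d}\), \(c_d>0\), \(d\ge 3\), and it uses the mass-preserving dilation \(\rho _\lambda (x)=\lambda ^d\rho (\lambda x)\), a notation introduced just for this illustration:

$$\begin{aligned} \int _{{\mathbb {R}}^d} \rho _\lambda ^m \, dx&= \lambda ^{d(m-1)} \int _{{\mathbb {R}}^d} \rho ^m \, dx, \\ \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \rho _\lambda (x) W(x-y) \rho _\lambda (y)\, dx\, dy&= \lambda ^{d-2} \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \rho (x) W(x-y) \rho (y)\, dx\, dy. \end{aligned}$$

As \(\lambda \rightarrow \infty \) (concentration), the repulsive term grows faster than the attractive one exactly when \(d(m-1)>d-2\), that is, \(m>2-2/d\), and the two terms scale identically at \(m=2-2/d\).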
However, in all cases where the diffusion dominates over the aggregation, the long time asymptotics of solutions to (1.1) have not been clarified, as pointed out in [8]. Are there stationary solutions for all masses when the diffusion term dominates? And if so, are they unique up to translations? Do they determine the long time asymptotics for (1.1)? Only partial answers to these questions are present in the literature, which we summarize below.
To show the existence of stationary solutions to (1.1), a natural idea is to look for the global minimizer of its associated free energy functional (1.2). For the 3D case with Newtonian interaction potential and \(m>4/3\), Lions’ concentration-compactness principle [67] gives the existence of a global minimizer of (1.2) for any given mass. The argument can be extended to kernels that are no more singular than Newtonian potential in \({\mathbb {R}}^d\) at the origin, and have slow decay at infinity. The existence result is further generalized by [5] to a broader class of kernels, which can have faster decay at infinity. In all the above cases, the global minimizer of (1.2) corresponds to a stationary solution to (1.1) in the sense of distributions. In addition, the global minimizer must be radially decreasing due to Riesz’s rearrangement theorem.
Regarding the uniqueness of stationary solutions to (1.1), most of the available results are for Newtonian interaction. For the 3D Newtonian potential with \(m>4/3\), for any given mass, the authors in [65] prove uniqueness of stationary solutions to (1.1) among radial functions, and their method can be generalized to the Newtonian potential in \({\mathbb {R}}^d\) with \(m>2-2/d\). For the 3D case with \(m>4/3\), [79] show that all compactly supported stationary solutions must be radial up to a translation, hence obtaining uniqueness of stationary solutions among compactly supported functions. The proof is based on moving plane techniques, where the compact support of the stationary solution seems crucial, and it also relies on the fact that the Newtonian potential in 3D converges to zero at infinity. Similar results are obtained in [28] for 2D Newtonian potential with \(m>1\) using an adapted moving plane technique. Again, the uniqueness result is based on showing radial symmetry of compactly supported stationary solutions. Finally, we mention that uniqueness of stationary states has been proved for general attracting kernels in one dimension in the case \(m=2\), see [21]. To the best of our knowledge, even for Newtonian potential, we are not aware of any results showing that all stationary solutions are radial (up to a translation).
Previous results show the limitations of the present theory: although the existence of stationary states for all masses is obtained for quite general potentials, their uniqueness, crucial for identifying the long time asymptotics, is only known in very particular cases of diffusive dominated problems. The available uniqueness results are not very satisfactory due to the compactly supported restriction on the uniqueness class imposed by the moving plane techniques. And thus, large time asymptotics results are not at all available due to the lack of mass confinement results of any kind uniformly in time together with the difficulty of identifying the long time limits of sequences of solutions due to the restriction on the uniqueness class for stationary solutions.
If one wants to show that the long time asymptotics are uniquely determined by the initial mass and center of mass, a clear strategy used in many other nonlinear diffusion problems, see [87] and the references therein, is the following: one first needs to prove that all stationary solutions are radial up to a translation in a non-restrictive class of stationary solutions, then one has to show uniqueness of stationary solutions among radial solutions, and finally this uniqueness will allow one to identify the limits of time-diverging sequences of solutions, if compactness of these sequences is shown in a suitable functional framework. Let us point out that comparison arguments used in standard porous medium equations are out of the question here due to the lack of a maximum principle caused by the presence of the nonlocal term.
In this work, we will give the first full result of long time asymptotics for a diffusion dominated problem using the previous strategy without smallness assumptions of any kind. More precisely, we will prove that all solutions to the 2D Keller–Segel equation with \(m>1\) converge to the global minimizer of its free energy. The first step will be to show radial symmetry of stationary solutions to (1.1) under quite general assumptions on W and the class of stationary solutions. Let us point out that standard rearrangement techniques fail in trying to show radial symmetry of general stationary states to (1.1) and they are only useful for showing radial symmetry of global minimizers, see [28]. Comparison arguments for radial solutions allow one to prove uniqueness of radial stationary solutions in particular cases [61, 65]. However, to our knowledge, there is no general result in the literature about radial symmetry of stationary solutions to nonlocal aggregation-diffusion equations.
Our first main result is that all stationary solutions of (1.1), with no restriction on \(m>0\), are radially decreasing up to translation by a fully novel application of continuous Steiner symmetrization techniques for the problem (1.1). Continuous Steiner symmetrization has been used in calculus of variations [18] for replacing rearrangement inequalities [16, 64, 72], but its application to nonlinear nonlocal aggregation-diffusion PDEs is completely new. Most of the results present in the literature using continuous Steiner symmetrization deal with functionals of first order, i.e. functionals involving a power of the modulus of the gradient of the unknown, see [19, Corollary 7.3] for an application to p-Laplacian stationary equations, and in [58, Section II] and [18, 57], while in our case the functional (1.2) is purely of zeroth order. The decay of the attractive Newtonian potential interaction term in \(d\ge 3\) follows from [18, Corollary 2] and [72], which is the only result related to our strategy.
We will construct a curve of measures starting from a stationary state \(\rho \) using continuous Steiner symmetrization such that the functional (1.2) decays strictly at first order along that curve unless the base point \(\rho \) is radially symmetric, see Proposition 2.15. However, the functional (1.2) has at most a quadratic variation when \(\rho \) is a stationary state as the first term in the Taylor expansion cancels. This leads to a contradiction unless the stationary state is radially symmetric. The construction of this curve needs a non-classical technique of slowing-down the velocities of the level sets for the continuous Steiner symmetrization in order to cope with the possible compact support of stationary states in the degenerate case \(m>1\), see Proposition 2.8. This first main result is the content of Sect. 2 in which we specify the assumptions on the interaction potential and the notion of stationary solutions in details. We point out that the variational structure of (1.1) is crucial to show the radially decreasing property of stationary solutions.
The result of radial symmetry for general stationary solutions to (1.1) is quite striking in comparison to other gradient flow models in collective behavior based on the competition of attractive and repulsive effects via nonlocal interaction potentials. Actually, there is numerical and analytical evidence in [4, 7, 62] that there should be stationary solutions of these fully nonlocal interaction models which are not radially symmetric despite the radial symmetry of the interaction potential. Our first main result shows that this symmetry breaking does not happen whenever nonlinear diffusion is chosen to model very strong localized repulsion forces, see [84]. Symmetry breaking in nonlinear diffusion equations without interactions has also received a lot of attention lately related to the Caffarelli–Kohn–Nirenberg inequalities, see [45, 46]. Another consequence of our radial symmetry results is the lack of non-radial local minimizers, and even non-radial critical points, of the free energy functional (1.2), which is not at all obvious.
We also generalize our radial symmetry result when (1.1) has an additional term \(\nabla \cdot (\rho \nabla V)\) on the right-hand side, where V is a confining potential (see Sect. 2.5 for precise conditions on V), in the sense that it plays the role of preventing particles from drifting away in the presence of the diffusion. It is known that with the extra term, the corresponding energy functional has an additional term \(\int V(x) \rho (x)\, dx\). The particular case of quadratic confinement \(V(x)=\tfrac{|x|^2}{2}\) is important since it leads to the free energy functional associated to (1.1) with homogeneous kernels in self-similar variables [24, 25, 36], thus characterizing the self-similar profiles for those problems.
Finally, let us remark that our radial symmetry result applies to stationary states of (1.1) for any \(m>0\) regardless of being in the diffusion dominated case or not. As soon as stationary states of (1.1) exist under suitable assumptions on the interaction potential W, and the confining potential V if present, they must be radially symmetric up to a translation. This fact makes our result applicable to the fair-competition cases [10,11,12] and the aggregation-dominated cases, see [39, 40, 68] with degenerate, linear or fast diffusion. Section 2.4 is finally devoted to the most restrictive case of \(\lambda \)-convex potentials and the Newtonian potential with \(m\ge 1-\tfrac{1}{d}\). In these cases, we can directly make use of the key first-order decay result of the interaction energy along continuous Steiner symmetrization curves in Proposition 2.15, bypassing the technical result in Proposition 2.8, in order to give a shortcut to the proof of our main Theorem 2.2 based on gradient flow techniques.
We next study more properties of particular radially decreasing stationary solutions. We make use of the variational structure to show the existence of global minimizers to (1.2) under very general hypotheses on the interaction potential W and \(m>1\). In Sect. 3, we show that these global minimizers are in fact radially decreasing continuous functions, compactly supported if \(m>1\). These results fully generalize the results in [28, 79]. Putting together Sects. 2 and 3, the uniqueness and full characterization of the stationary states is reduced to uniqueness among the class of radial solutions. This result is known in the case of Newtonian attraction kernels [65].
Finally, we make use of the uniqueness among translations for any given mass of stationary solutions to (1.1) to obtain the second main result of this work, namely to answer the open problem of the long time asymptotics to (1.1) with Newtonian interaction in 2D and \(m>1\). This is accomplished in Sect. 4 by a compactness argument for which one has to extract the corresponding uniform in time bounds and a careful treatment of the nonlinear terms and dissipation while taking the limit \(t\rightarrow \infty \). We do not know how to obtain a similar result for Newtonian interaction in \(d\ge 3\) due to the lack of uniform in time mass confinement bounds in this case. We essentially cannot show that mass does not escape to infinity while taking the limit \(t\rightarrow \infty \). However, the compactness and characterization of stationary solutions is still valid in that case.
The present work opens new perspectives to show radial symmetry for stationary solutions to nonlocal aggregation-diffusion problems. While the hypotheses of our result to ensure existence of global radially symmetric minimizers of (1.2), and in turn of stationary solutions to (1.1), are quite general, we do not know yet whether there is uniqueness among radially symmetric stationary solutions (with a fixed mass) for general non-Newtonian kernels. We do not even have available uniqueness results of radial minimizers beyond Newtonian kernels. Understanding whether radially symmetric local minimizers that are not global can exist for functionals of the form (1.2) with radial interaction potential is thus a challenging question. Concerning the long-time asymptotics of (1.1), the lack of a novel approach to find confinement of mass beyond the usual virial techniques and comparison arguments in radial coordinates hinders the advance in their understanding even for Newtonian kernels with \(d\ge 3\). Last but not least, our results open a window to obtain rates of convergence towards the unique equilibrium up to translation for the Newtonian kernel in 2D. The lack of general convexity of this variational problem could be compensated by recent results in a restricted class of functions, see [32]. However, the problem is quite challenging due to the presence of free boundaries in the evolution of compactly supported solutions to (1.1), which rules out direct linearization techniques as in the linear diffusion case [22].
2 Radial symmetry of stationary states with degenerate diffusion
Throughout this section, we assume that \(m>0\), and \(W\) satisfies the following four assumptions:
(K1) \(W\) is attracting, i.e., \(W(x) \in C^1({\mathbb {R}}^d \setminus \{0\})\) is radially symmetric
$$\begin{aligned} W(x)=\omega (|x|)=\omega (r) \end{aligned}$$
and \(\omega '(r)>0\) for all \(r>0\) with \(\omega (1)=0\).
(K2) \(W\) is no more singular than the Newtonian kernel in \({\mathbb {R}}^d\) at the origin, i.e., there exists some \(C_w>0\) such that \(\omega '(r) \le C_w r^{1-d}\) for \(r\le 1\).
(K3) There exists some \(C_w>0\) such that \(\omega '(r) \le C_w\) for all \(r>1\).
(K4) Either \(\omega (r)\) is bounded for \(r\ge 1\) or there exists \(C_w>0\) such that for all \(a,b\ge 0\):
$$\begin{aligned} \omega _+(a+b)\le C_w (1+\omega (1+a)+\omega (1+b))\,. \end{aligned}$$
As usual, \(\omega _\pm \) denotes the positive and negative part of \(\omega \) such that \(\omega =\omega _+-\omega _-\). In particular, if \(W=-{\mathcal N}\), modulo the addition of a constant factor, is the attractive Newtonian potential, where \({\mathcal N}\) is the fundamental solution of \(-\Delta \) operator in \({\mathbb {R}}^d\), then \(W\) satisfies all the assumptions. Since the Eq. (1.1) does not change by adding a constant to the potential W, we will consider that the potential W is defined modulo additive constants from now on.
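As a quick check of the claim for the Newtonian potential, the following sketch verifies (K1)–(K4) for \(d\ge 2\). Here \(c_d>0\) denotes a dimensional normalization constant (a symbol used only in this illustration), and the additive constant is fixed so that \(\omega (1)=0\):

$$\begin{aligned} d\ge 3:\quad&\omega (r)=c_d\left( 1-r^{2-d}\right) ,\quad \omega '(r)=c_d(d-2)\,r^{1-d}>0,\quad 0\le \omega (r)\le c_d \ \text {for } r\ge 1,\\ d=2:\quad&\omega (r)=\frac{1}{2\pi }\log r,\quad \omega '(r)=\frac{1}{2\pi r}>0,\quad \log _+(a+b)\le \log (1+a)+\log (1+b). \end{aligned}$$

In both cases \(\omega '(r)\) is a constant multiple of \(r^{1-d}\), which gives (K2) for \(r\le 1\) and (K3) for \(r>1\). For \(d\ge 3\) the first alternative in (K4) holds, while for \(d=2\) the second alternative follows from \(a+b\le (1+a)(1+b)\).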
We denote by \(L^{1}_{+}({\mathbb {R}}^{d})\) the set of all non-negative functions in \(L^{1}({\mathbb {R}}^{d})\). Let us start by defining precisely stationary states to the aggregation Eq. (1.1) with a potential satisfying (K1)–(K4).
Definition 2.1
Given \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) we call it a stationary state for the evolution problem (1.1) if \(\rho _s^{m}\in H^1_{loc} ({\mathbb {R}}^d)\), \(\nabla \psi _s:=\nabla W *\rho _s\in L^1_{loc} ({\mathbb {R}}^d)\), and it satisfies
$$\begin{aligned} \nabla \rho _s^m = -\rho _s \nabla \psi _s \end{aligned}\tag{2.1}$$
in the sense of distributions in \({\mathbb {R}}^d\).
Let us first note that \(\nabla \psi _s\) is globally bounded under the assumptions (K1)–(K3). To see this, a direct decomposition in near- and far-field sets yields
$$\begin{aligned} |\nabla \psi _s(x)| \le \int _{\mathcal {A}} |\nabla W(x-y)|\, \rho _s(y)\, dy + \int _{\mathcal {B}} |\nabla W(x-y)|\, \rho _s(y)\, dy \le C_w \Vert \rho _s\Vert _{L^\infty } \int _{|z|\le 1} |z|^{1-d}\, dz + C_w \Vert \rho _s\Vert _{L^1}, \end{aligned}\tag{2.2}$$
where we split the integrand into the sets \(\mathcal {A} := \{ y : |x - y| \le 1 \}\) and \(\mathcal {B} := {\mathbb {R}}^d \setminus \mathcal {A}\), and apply the assumptions (K1)–(K3).
Under the additional assumptions (K4) and \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\), we will show that the potential function \(\psi _s(x) = W*\rho _s(x)\) is also locally bounded. First, note that (K1)–(K3) ensures that \(|\omega (r)| \le {\tilde{C}}_w \phi (r)\) for all \(r\le 1\) with some \({\tilde{C}}_w>0\), where
$$\begin{aligned} \phi (r) := {\left\{ \begin{array}{ll} r^{2-d} &{} \text {if } d\ge 3,\\ -\log r &{} \text {if } d=2,\\ 1 &{} \text {if } d=1. \end{array}\right. } \end{aligned}\tag{2.3}$$
Hence we can again perform a decomposition in near- and far-field sets and obtain
$$\begin{aligned} |\psi _s(x)|&\le \int _{\mathcal {A}} |W(x-y)|\, \rho _s(y)\, dy + \int _{\mathcal {B}} \omega _+(|x-y|)\, \rho _s(y)\, dy \\&\le {\tilde{C}}_w \Vert \rho _s\Vert _{L^\infty } \int _{|z|\le 1} \phi (|z|)\, dz + C_w \left( 1+\omega (1+|x|)\right) \Vert \rho _s\Vert _{L^1} + C_w \Vert \omega (1+|\cdot |)\rho _s\Vert _{L^1}. \end{aligned}\tag{2.4}$$
Our main goal in this section is the following theorem.
Theorem 2.2
Assume that W satisfies (K1)–(K4) and \(m>0\). Let \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) with \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\) be a non-negative stationary state of (1.1) in the sense of Definition 2.1. Then \(\rho _s\) must be radially decreasing up to a translation, i.e. there exists some \(x_0\in {\mathbb {R}}^d\), such that \(\rho _s(\cdot - x_0)\) is radially symmetric, and \(\rho _s(|x-x_0|)\) is non-increasing in \(|x-x_0|\).
Before going into the details of the proof, we briefly outline the strategy here. Assume there is a stationary state \(\rho _s\) which is not radially decreasing under any translation. To obtain a contradiction, we consider the free energy functional \(\mathcal {E}[\rho ]\) associated with (1.1),
$$\begin{aligned} \mathcal {E}[\rho ] = \frac{1}{m-1}\int _{{\mathbb {R}}^d} \rho ^m \, dx + \frac{1}{2}\int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \rho (x) W(x-y) \rho (y)\, dx\, dy =: \mathcal {S}[\rho ] + \mathcal {I}[\rho ], \end{aligned}$$
where \(\mathcal {S}[\rho ]\) is replaced by \(\int \rho \log \rho \,dx\) if \(m=1\). We first observe that \(\mathcal {I}[\rho _s]\) is finite since the potential function \(\psi _s=W*\rho _s \in {\mathcal W}^{1,\infty }_{loc}({\mathbb {R}}^d)\) satisfies (2.4) with \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\). Since \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\), \(\mathcal {S}[\rho _s]\) is finite for all \(m>1\), but may be \(-\infty \) if \(m\in (0,1]\).
Below we discuss the strategy for \(m>1\) first, and point out the modification for \(m\in (0,1]\) in the next paragraph. Using the assumption that \(\rho _s\) is not radially decreasing under any translation, we will apply the continuous Steiner symmetrization to perturb around \(\rho _s\) and construct a continuous family of densities \(\mu (\tau , \cdot )\) with \(\mu (0,\cdot )=\rho _s\), such that \(\mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s] < -c\tau \) for some \(c>0\) and any small \(\tau >0\). On the other hand, using that \(\rho _s\) is a stationary state, we will show that \(|\mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s]| \le C\tau ^2\) for some \(C>0\) and any small \(\tau >0\). Combining these two inequalities together gives us a contradiction for sufficiently small \(\tau >0\).
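Spelled out, the two bounds are incompatible as soon as \(\tau \) is smaller than \(c/C\); the short computation below only combines the inequalities stated above:

$$\begin{aligned} c\tau< \mathcal {E}[\rho _s] - \mathcal {E}[\mu (\tau )] \le \left| \mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s]\right| \le C\tau ^2 \quad \Longrightarrow \quad \tau > \frac{c}{C}, \end{aligned}$$

which fails for every \(0<\tau <c/C\).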
For \(m\in (0,1)\), even if \(\mathcal {S}[\rho _s]\) might be \(-\infty \) by itself, the difference \(\mathcal {S}[\mu (\tau )] - \mathcal {S}[\rho _s] \) can be still well-defined in the following sense, if we regularize the function \(\frac{1}{m-1}\rho ^{m}\) by \(\frac{1}{m-1}\rho (\rho +\epsilon )^{m-1}\) and take the limit \(\epsilon \rightarrow 0\):
$$\begin{aligned} \mathcal {S}[\mu (\tau )] - \mathcal {S}[\rho _s] := \lim _{\epsilon \rightarrow 0} \frac{1}{m-1} \int _{{\mathbb {R}}^d} \Big ( \mu (\tau ,x) (\mu (\tau ,x)+\epsilon )^{m-1} - \rho _s(x) (\rho _s(x)+\epsilon )^{m-1} \Big )\, dx, \end{aligned}$$
and if \(m=1\) the integrand is replaced by \(\mu (\tau ,\cdot ) \log (\mu (\tau ,\cdot )+\epsilon ) - \rho _s \log (\rho _s +\epsilon )\). Note that as long as \(\mu (\tau )\) has the same distribution as \(\rho _s\), the above definition gives \(\mathcal {S}[\mu (\tau )] - \mathcal {S}[\rho _s] =0\). With such modification, we will show that the difference \(\mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s]\) is well-defined and satisfies the same two inequalities as the \(m>1\) case, so we again have a contradiction for small \(\tau >0\).
If the kernel W has certain convexity properties and \(m\ge 1-\tfrac{1}{d}\), then it is known that (1.1) has a rigorous Wasserstein gradient flow structure. In this case, once we obtain the crucial estimate: \(\mathcal {E}[\mu (\tau )] -\mathcal {E}[\rho _s] <-c\tau \), there is a shortcut that directly leads to the radial symmetry result, which we will discuss in Sect. 2.4.
Let us first characterize the set of possible stationary states of (1.1) in the sense of Definition 2.1 and their regularity. Parts of these arguments are reminiscent of those in [28, 79] for the case of attractive Newtonian potentials.
Lemma 2.3
Let \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) with \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\) be a non-negative stationary state of (1.1) for some \(m>0\) in the sense of Definition 2.1. Then \(\rho _s \in \mathcal {C}({\mathbb {R}}^d)\), and there exists some \(C = C(\Vert \rho _s\Vert _{L^1}, \Vert \rho _s\Vert _{L^\infty }, C_w, d)>0\), such that
$$\begin{aligned} \left| \frac{m}{m-1} \nabla \left( \rho _s^{m-1}\right) \right| \le C \quad \text {in } \mathrm {supp}\,\rho _s \quad \text {if } m\ne 1, \end{aligned}$$
and
$$\begin{aligned} \left| \nabla \log \rho _s \right| \le C \quad \text {in } \mathrm {supp}\,\rho _s \quad \text {if } m= 1. \end{aligned}$$
In addition, if \(m \in (0,1]\), then \(\mathrm {supp}\,\rho _s = {\mathbb {R}}^d\).
Proof
We have already checked that under these assumptions on W and \(\rho _s\), the potential function \(\psi _s\in {\mathcal W}^{1,\infty }_{loc}({\mathbb {R}}^d)\) due to (2.2)–(2.4). Since \(\rho _s^{m}\in H^1_{loc} ({\mathbb {R}}^d)\), then \(\rho _s^m\) is a weak \(H^1_{loc}({\mathbb {R}}^{d})\) solution of
$$\begin{aligned} \Delta \rho _s^m = -\nabla \cdot (\rho _s \nabla \psi _s) \quad \text {in } {\mathbb {R}}^d, \end{aligned}\tag{2.9}$$
with right hand side belonging to \({\mathcal W}^{-1,p}_{loc} ({\mathbb {R}}^d)\) for all \(1\le p \le \infty \). As a consequence, \(\rho _s^m\) is in fact a weak solution in \({\mathcal W}_{loc}^{1,p}({\mathbb {R}}^d)\) for all \(1<p<\infty \) of (2.9) by classical elliptic regularity results. Sobolev embedding shows that \(\rho _s^m\) belongs to some Hölder space \(\mathcal {C}_{loc}^{0,\alpha }({\mathbb {R}}^d)\), and thus \(\rho _s\in \mathcal {C}_{loc}^{0,\beta }({\mathbb {R}}^d)\) with \(\beta := \min \{\alpha /m, 1\}\). Let us define the set \(\Omega =\{x\in {\mathbb {R}}^d : \rho _s(x)>0\}\). Since \(\rho _s\in \mathcal {C}({\mathbb {R}}^d)\), then \(\Omega \) is an open set and it consists of a countable number of open possibly unbounded connected components. Let us take any bounded smooth connected open subset \(\Theta \) such that \(\overline{ \Theta } \subset \Omega \), and start with the case \(m\ne 1\). Since \(\rho _s\in \mathcal {C}({\mathbb {R}}^d)\), then \(\rho _s\) is bounded away from zero in \(\Theta \) and thus due to the assumptions on \(\rho _s\), we have that \(\frac{m}{m-1}\nabla \rho _s^{m-1} = \frac{1}{\rho _s} \nabla \rho _{s}^{m}\) holds in the distributional sense in \(\Theta \). We conclude that wherever \(\rho _s\) is positive, (2.1) can be interpreted as
$$\begin{aligned} \nabla \left( \frac{m}{m-1}\rho _s^{m-1} + \psi _s \right) = 0 \end{aligned}\tag{2.10}$$
in the sense of distributions in \(\Omega \). Hence, the function \(G(x)=\frac{m}{m-1}\rho _s^{m-1}(x) +\psi _s (x)\) is constant in each connected component of \(\Omega \). From here, we deduce that any stationary state of (1.1) in the sense of Definition 2.1 is given by
$$\begin{aligned} \rho _s(x) = \left( \frac{m-1}{m} \left( G - \psi _s\right) (x) \right) _+^{\frac{1}{m-1}}, \end{aligned}\tag{2.11}$$
where G(x) is a constant in each connected component of the support of \(\rho _s\), and its value may differ in different connected components. Due to \(\psi _s\in {\mathcal W}^{1,\infty }_{loc}({\mathbb {R}}^d)\), we deduce that \(\rho _s \in \mathcal {C}_{loc}^{0,1/(m-1)}({\mathbb {R}}^d)\) if \(m\ge 2\) and \(\rho _s \in \mathcal {C}_{loc}^{0,1}({\mathbb {R}}^d)\) for \(m \in (0,1)\cup (1,2)\). Putting together (2.11) and (2.2), we conclude the desired estimate.
In addition, from (2.11) we have that \(\Omega = {\mathbb {R}}^d\) if \(m \in (0,1)\): if not, let \(\Omega _0\) be any connected component of \(\Omega \), and take \(x_0 \in \partial \Omega _0\). As we take a sequence of points \(x_n \rightarrow x_0\) with \(x_n \in \Omega _0\), we have that \(\rho _s(x_n)^{m-1}\rightarrow \infty \), whereas the sequence \(G(x_n) - \psi _s(x_n)\) is bounded [since \(\psi _s\) is locally bounded due to (2.4)], a contradiction.
If \(m=1\), the above argument still goes through except that we replace (2.10) by
$$\begin{aligned} \nabla \left( \log \rho _s + \psi _s \right) = 0 \end{aligned}$$
in the sense of distributions in \(\Omega \). As a result, the function \(G(x)=\log \rho _s(x) +\psi _s (x)\) is constant in each connected component of \(\Omega \). The same argument as in the \(m\in (0,1)\) case then yields that \(\rho _s \in \mathcal {C}_{loc}^{0,1}({\mathbb {R}}^d)\) and \(\Omega = {\mathbb {R}}^d\), leading to the estimate \(|\nabla \log \rho _s| \le C\) in \({\mathbb {R}}^d\). \(\square \)
2.1 Some preliminaries about rearrangements
Now we briefly recall some standard notions and basic properties of decreasing rearrangements for non-negative functions that will be used later. For a deeper treatment of these topics, we refer the reader to the books [6, 51, 56, 60, 64] or the papers [73, 81,82,83]. We denote by \(|E|_{d}\) the Lebesgue measure of a measurable set E in \({\mathbb {R}}^{d}\). Moreover, the set \(E^{\#}\) is defined as the ball centered at the origin such that \(|E^{\#}|_{d}=|E|_{d}\).
A non-negative measurable function f defined on \({\mathbb {R}}^{d}\) is called radially symmetric if there is a non-negative function \(\widetilde{f}\) on \([0,\infty )\) such that \(f(x)=\widetilde{f}(|x|)\) for all \(x\in {\mathbb {R}}^{d}\). If f is radially symmetric, we will often write \(f(x)=f(r)\) for \(r=|x|\ge 0\) by a slight abuse of notation. We say that f is rearranged if it is radial and \(\widetilde{f}\) is a non-negative right-continuous, non-increasing function of \(r>0\). A similar definition can be applied for real functions defined on a ball \(B_{R}(0)=\left\{ x\in {\mathbb {R}}^d:|x|<R\right\} \).
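We define the distribution function of \(f\in L^{1}_{+}({\mathbb {R}}^{d})\) by
$$\begin{aligned} \zeta _{f}(\tau ) = \left| \left\{ x\in {\mathbb {R}}^{d}: f(x)>\tau \right\} \right| _{d}, \qquad \tau >0. \end{aligned}$$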
Then the function \(f^{*}:[0,+\infty )\rightarrow [0,+\infty ]\) defined by
$$\begin{aligned} f^{*}(s) = \sup \left\{ \tau >0 : \zeta _{f}(\tau )>s \right\} , \qquad s\in [0,+\infty ), \end{aligned}$$
will be called the Hardy–Littlewood one-dimensional decreasing rearrangement of f. By this definition, one could interpret \(f^{*}\) as the generalized right-inverse function of \(\zeta _{f}(\tau )\).
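For a function sampled on a uniform grid, the distribution function and the decreasing rearrangement reduce to counting and sorting. The following is a minimal numerical sketch, assuming a piecewise-constant approximation of f on cells of equal size; the helper names and the two-bump test profile are hypothetical choices made for this illustration only.

```python
import numpy as np

def distribution_function(f, tau, cell):
    """zeta_f(tau): measure of the super-level set {f > tau} on a uniform grid."""
    return cell * np.count_nonzero(f > tau)

def decreasing_rearrangement(f):
    """Grid samples of f*: the values of f sorted in non-increasing order."""
    return np.sort(f)[::-1]

# Hypothetical test profile: two bumps, neither symmetric nor monotone.
x = np.linspace(-3.0, 3.0, 601)
cell = x[1] - x[0]
f = np.exp(-(x - 1.0) ** 2) + 0.5 * np.exp(-4.0 * (x + 1.5) ** 2)
f_star = decreasing_rearrangement(f)

# f and f* share the same distribution function at every level tested.
levels = [0.1, 0.5, 0.9]
same_distribution = all(
    distribution_function(f, t, cell) == distribution_function(f_star, t, cell)
    for t in levels
)
```

Sorting only permutes the sampled values, so every super-level set keeps its measure; this is the discrete counterpart of \(f^{*}\) being a generalized inverse of \(\zeta _{f}\).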
Making use of the definition of \(f^{*}\), we can define a special radially symmetric decreasing function \(f^{\#}\), which we will call the Schwarz spherical decreasing rearrangement of f by means of the formula
$$\begin{aligned} f^{\#}(x) = f^{*}\left( \omega _d |x|^{d}\right) , \qquad x\in {\mathbb {R}}^{d}, \end{aligned}$$
where \(\omega _d\) is the volume of the unit ball in \({\mathbb {R}}^d\). It is clear that if the set \(\Omega _{f}=\left\{ x\in {\mathbb {R}}^{d}:f(x)>0\right\} \) of f has finite measure, then \(f^{\#}\) is supported in the ball \(\Omega _{f}^{\#}\).
One can show that \(f^{*}\) (and so \(f^{\#}\)) is equidistributed with f (i.e. they have the same distribution function). Thus if \(f\in L^{p}({\mathbb {R}}^d)\), a simple use of Cavalieri’s principle (see e.g. [60, 82]) leads to the invariance property of the \(L^{p}\) norms:
$$\begin{aligned} \Vert f\Vert _{L^{p}({\mathbb {R}}^{d})} = \Vert f^{*}\Vert _{L^{p}(0,\infty )} = \Vert f^{\#}\Vert _{L^{p}({\mathbb {R}}^{d})} \qquad \text {for all } p\in [1,\infty ]. \end{aligned}$$
In particular, using the layer-cake representation formula (see e.g. [64]) one could easily infer that
$$\begin{aligned} f^{\#}(x) = \int _{0}^{\infty } \chi _{\left\{ f>\tau \right\} ^{\#}}(x)\, d\tau . \end{aligned}$$
Among the many interesting properties of rearrangements, it is worth mentioning the Hardy–Littlewood inequality (see [6, 51, 60] for the proof): for any couple of non-negative measurable functions \(f,\,g\) on \({\mathbb {R}}^{d}\), we have
$$\begin{aligned} \int _{{\mathbb {R}}^{d}} f(x)\, g(x)\, dx \le \int _{{\mathbb {R}}^{d}} f^{\#}(x)\, g^{\#}(x)\, dx. \end{aligned}\tag{2.14}$$
Since in Sect. 4 we will use estimates of the solutions of Keller–Segel problems in terms of their integrals, let us now recall the concept of comparison of mass concentration, taken from [85], that is remarkably useful.
Definition 2.4
Let \(f,g\in L^{1}_{loc}({\mathbb {R}}^{d})\) be two non-negative, radially symmetric functions on \({\mathbb {R}}^{d}\). We say that f is less concentrated than g, and we write \(f\prec g\) if for all \(R>0\) we get
$$\begin{aligned} \int _{B_{R}(0)} f(x)\, dx \le \int _{B_{R}(0)} g(x)\, dx. \end{aligned}$$
The partial order relationship \(\prec \) is called comparison of mass concentrations. Of course, this definition can be suitably adapted if f, g are radially symmetric and locally integrable functions on a ball \(B_{R}\). The comparison of mass concentrations enjoys a nice equivalent formulation if f and g are rearranged; for the proof we refer to [1, 41, 86]:
Lemma 2.5
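Let \(f,g\in L^{1}_{+}({\mathbb {R}}^{d})\) be two non-negative rearranged functions. Then \(f\prec g\) if and only if for every convex nondecreasing function \(\Phi :[0,\infty )\rightarrow [0,\infty )\) with \(\Phi (0)=0\) we have
$$\begin{aligned} \int _{{\mathbb {R}}^{d}}\Phi (f(x))\,dx\le \int _{{\mathbb {R}}^{d}}\Phi (g(x))\,dx. \end{aligned}$$
From this Lemma, it easily follows that if \(f\prec g\) and \(f,g\in L^{p}({\mathbb {R}}^{d})\) are rearranged and non-negative, then
$$\begin{aligned} \Vert f\Vert _{L^{p}({\mathbb {R}}^{d})}\le \Vert g\Vert _{L^{p}({\mathbb {R}}^{d})}. \end{aligned}$$
Let us also observe that if \(f,g\in L^{1}_{+}({\mathbb {R}}^{d})\) are non-negative and rearranged, then \(f\prec g\) if and only if for all \(s\ge 0\) we have
$$\begin{aligned} \int _{0}^{s}f^{*}(\sigma )\,d\sigma \le \int _{0}^{s}g^{*}(\sigma )\,d\sigma . \end{aligned}$$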
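The comparison of mass concentrations and the convex-function criterion of Lemma 2.5 have a familiar finite-dimensional counterpart, which the following minimal sketch illustrates. It is an assumption of this example, not of the paper, that functions are replaced by lists of non-negative cell values, so that the integral of \(f^{*}\) over \((0,s)\) becomes a partial sum of the sorted values. The names `less_concentrated` and `total` are hypothetical.

```python
from itertools import accumulate


def less_concentrated(f, g):
    """Discrete analogue of 'f is less concentrated than g'.

    Compares the partial sums of the decreasingly sorted values,
    mirroring the comparison of the integrals of f^* and g^* over (0, s).
    """
    f_star = sorted(f, reverse=True)
    g_star = sorted(g, reverse=True)
    return all(a <= b for a, b in zip(accumulate(f_star), accumulate(g_star)))


def total(phi, values):
    """Discrete analogue of the integral of phi(f)."""
    return sum(phi(v) for v in values)


# Same total mass, but g puts more of it into a single cell (illustrative data).
f = [2, 2, 2, 0]
g = [4, 1, 1, 0]

# A few convex nondecreasing functions on [0, infinity) that vanish at 0.
convex_tests = [lambda t: t * t, lambda t: t ** 3, lambda t: max(t - 1, 0)]
checks = [total(phi, f) <= total(phi, g) for phi in convex_tests]
```

Here `less_concentrated(f, g)` holds while `less_concentrated(g, f)` does not, and each sampled convex function gives a smaller total for `f` than for `g`, as the discrete version of the lemma predicts.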
If \(f\in L^1_+({\mathbb {R}}^d)\), we denote by \(M_2[f]\) the second moment of f, i.e.
$$\begin{aligned} M_{2}[f]:=\int _{{\mathbb {R}}^{d}}f(x)\,|x|^{2}\,dx. \end{aligned}$$
In this regard, another interesting property which will turn out to be useful is the following
Lemma 2.6
Let \(f, g \in L^1_+({\mathbb {R}}^d)\) with \(\Vert f\Vert _{L^1({\mathbb {R}}^d)} = \Vert g\Vert _{L^1({\mathbb {R}}^d)}\). If additionally g is rearranged and \(f^\# \prec g\), then \(M_2[f] \ge M_2[g]\).
Proof
Let us consider the sequence of bounded radially increasing functions \(\left\{ \varphi _{n}\right\} \), where \(\varphi _{n}(x)=\min \left\{ |x|^{2},n\right\} \) is the truncation of the function \(|x|^{2}\) at the level n and define the function
Then \(h_{n}\) is non-negative, bounded and rearranged. Thus using the Hardy–Littlewood inequality (2.14) and [1, Corollary 2.1] we find
Then passing to the limit as \(n\rightarrow \infty \) we find the desired result. \(\square \)
Remark 2.7
Lemma 2.6 can be easily generalized when \(|x|^{2}\) is replaced by any non-negative radially increasing potential \(V=V(r)\), \(r=|x|\), such that
2.2 Continuous Steiner symmetrization
Although classical decreasing rearrangement techniques are very useful for studying properties of the minimizers and of solutions of the evolution problem (1.1) in the next sections, we do not know how to use them to show that stationary states are radially symmetric. For an introduction to continuous Steiner symmetrization and its properties, see [16, 18, 64]. In this subsection, we will use continuous Steiner symmetrization to prove the following proposition.
Proposition 2.8
Let \(\mu _0 \in \mathcal {C}({\mathbb {R}}^d) \cap L^1_+({\mathbb {R}}^d)\), and assume it is not radially decreasing after any translation.
Moreover, if \(m\in (0,1)\cup (1,\infty )\), assume that \(|\frac{m}{m-1}\nabla \mu _0^{m-1}| \le C_0\) in \(\mathrm {supp}\,\mu _0\) for some \(C_0\); and if \(m=1\) assume that \(|\nabla \log \mu _0| \le C_0\) in \(\mathrm {supp}\,\mu _0\) for some \(C_0\). In addition, if \(m\in (0,1]\), assume that \(\mathrm {supp}\,\mu _0 = {\mathbb {R}}^d\).
Then there exist some \(\delta _0>0, c_0>0, C_1>0\) (depending on m, \(\mu _0\) and \(W\)) and a function \(\mu \in C([0,\delta _0]\times {\mathbb {R}}^d)\) with \(\mu (0,\cdot ) = \mu _0\), such that \(\mu \) satisfies the following for a short time \(\tau \in [0,\delta _0]\), where \(\mathcal {E}\) is as given in (2.5):
2.2.1 Definitions and basic properties of Steiner symmetrization
Let us first introduce the concept of Steiner symmetrization for a measurable set \(E\subset {\mathbb {R}}^{d}\) . If \(d=1\), the Steiner symmetrization of E is the symmetric interval \(S(E)=\left\{ x\in {\mathbb {R}}:|x|<|E|_{1}/2\right\} \). Now we want to define the Steiner symmetrization of E with respect to a direction in \({\mathbb {R}}^{d}\) for \(d\ge 2\). The direction we symmetrize corresponds to the unit vector \(e_{1}=(1,0,\ldots ,0)\), although the definition can be modified accordingly when considering any other direction in \({\mathbb {R}}^{d}\).
Let us label a point \(x\in {\mathbb {R}}^{d}\) by \((x_{1},x^{\prime })\), where \(x^{\prime }=(x_{2},\ldots ,x_{d})\in {\mathbb {R}}^{d-1}\) and \(x_{1}\in {\mathbb {R}}\). Given any measurable subset E of \({\mathbb {R}}^{d}\) we define, for all \(x^{\prime }\in {\mathbb {R}}^{d-1}\), the section of E with respect to the direction \(x_{1}\) as the set
Then we define the Steiner symmetrization of E with respect to the direction \(x_{1}\) as the set S(E) which is symmetric about the hyperplane \(\left\{ x_{1}=0\right\} \) and is defined by
In particular we have that \(|E|_{d}=|S(E)|_{d}\).
Now, consider a non-negative function \(\mu _{0}\in L^{1}({\mathbb {R}}^{d})\), for \(d\ge 2\). For all \(x^{\prime }\in {\mathbb {R}}^{d-1}\), let us consider the distribution function of \(\mu _{0}(\cdot ,x^{\prime })\), i.e. the function
where
Then we can give the following definition:
Definition 2.9
We define the Steiner symmetrization (or Steiner rearrangement) of \(\mu _{0}\) in the direction \(x_{1}\) as the function \(S \mu _{0}=S \mu _{0}(x_{1},x^{\prime })\) such that \(S \mu _{0}(\cdot ,x^{\prime })\) is exactly the Schwarz rearrangement of \(\mu _{0}(\cdot ,x^{\prime })\), i.e. (see (2.12))
As a consequence, the Steiner symmetrization \(S\mu _{0}(x_{1},x^{\prime })\) is a function being symmetric about the hyperplane \(\left\{ x_{1}=0\right\} \) and for each \(h>0\) the level set
is equivalent to the Steiner symmetrization
which implies that \(S\mu _{0}\) and \(\mu _{0}\) are equidistributed, yielding the invariance of the \(L^{p}\) norms when passing from \(\mu _{0}\) to \(S\mu _{0}\), that is for all \(p\in [1,\infty ]\) we have
Moreover, by the layer-cake representation formula, we have
Now, we introduce a continuous version of this Steiner procedure via an interpolation between a set or a function and their Steiner symmetrizations that we will use in our symmetry arguments for steady states.
Definition 2.10
For an open set \(U\subset {\mathbb {R}}\), we define its continuous Steiner symmetrization \(M^\tau (U)\) for any \(\tau \ge 0\) as below. In the following we abbreviate an open interval \((c-r, c+r)\) by I(c, r), and we denote by \(\mathrm {sgn}\,c\) the sign of c (which is 1 for positive c, \(-1\) for negative c, and 0 if \(c=0\)).
(1) If \(U = I(c,r)\), then
$$\begin{aligned} M^\tau (U):= {\left\{ \begin{array}{ll}I(c-\tau \,\mathrm {sgn}\,c, r) &{} \text { for }0\le \tau < |c|,\\ I(0,r) &{}\text { for }\tau \ge |c|. \end{array}\right. } \end{aligned}$$
(2) If \(U = \cup _{i=1}^N I(c_i,r_i)\) (where all \(I(c_i, r_i)\) are disjoint), then \(M^\tau (U) := \cup _{i=1}^N M^\tau (I(c_i, r_i))\) for \(0\le \tau <\tau _1\), where \(\tau _1\) is the first time two intervals \(M^\tau (I(c_i, r_i))\) share a common endpoint. Once this happens, we merge them into one open interval, and repeat this process starting from \(\tau =\tau _1\).
(3) If \(U = \cup _{i=1}^\infty I(c_i, r_i)\) (where all \(I(c_i, r_i)\) are disjoint), let \(U_N = \cup _{i=1}^N I(c_i, r_i)\) for each \(N>0\), and define \(M^\tau (U) := \cup _{N=1}^\infty M^\tau (U_N)\).
See Fig. 1 for illustrations of \(M^\tau (U)\) in the cases (1) and (2). Also, we point out that case (3) can be seen as a limit of case (2), since for each \(N_1<N_2\) one can easily check that \(M^\tau (U_{N_1}) \subset M^\tau (U_{N_2})\) for all \(\tau \ge 0\). Moreover, according to [18], the definition of \(M^{\tau }(U)\) can be extended to any measurable set U of \({\mathbb {R}}\), since
where \(O_{n}\supset O_{n+1}\), \(n=1,2,\ldots \), are open sets and N is a nullset.
In the next lemma we state four simple facts about \(M^\tau \). They can be easily checked for cases (1) and (2) (hence they also hold for case (3) by taking the limit), and we omit the proof.
Lemma 2.11
Given any open set \(U\subset {\mathbb {R}}\), let \(M^\tau (U)\) be as defined in Definition 2.10. Then
(a) \(M^{0}(U)=U\), \(M^{\infty }(U)=S(U)\).
(b) \(|M^\tau (U)| = |U|\) for all \(\tau \ge 0\).
(c) If \(U_1 \subset U_2\), we have \(M^\tau (U_1) \subset M^\tau (U_2)\) for all \(\tau \ge 0\).
(d) \(M^\tau \) has the semigroup property: \(M^{\tau +s}U = M^\tau (M^s(U))\) for any \(\tau ,s\ge 0\) and open set U.
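Cases (1) and (2) of Definition 2.10 describe an event-driven procedure, and a small simulation can make the four properties above concrete. The following is a minimal sketch under stated assumptions, not the authors' implementation: U is a finite disjoint union of intervals given as pairs (c, r), exact rational arithmetic is used so that merge times are detected without rounding, and the function names `continuous_steiner`, `merge_touching` and `measure` are hypothetical.

```python
from fractions import Fraction


def sgn(c):
    """Sign of c: 1, -1 or 0."""
    return (c > 0) - (c < 0)


def merge_touching(intervals):
    """Sort intervals I(c, r) and merge neighbours that share an endpoint."""
    merged = []
    for c, r in sorted(intervals):
        if merged and merged[-1][0] + merged[-1][1] >= c - r:
            left = merged[-1][0] - merged[-1][1]
            right = max(merged[-1][0] + merged[-1][1], c + r)
            merged[-1] = ((left + right) / 2, (right - left) / 2)
        else:
            merged.append((c, r))
    return merged


def continuous_steiner(intervals, tau):
    """Sketch of M^tau(U) for U a finite disjoint union of intervals I(c, r)."""
    current = merge_touching([(Fraction(c), Fraction(r)) for c, r in intervals])
    remaining = Fraction(tau)
    while remaining > 0:
        # Candidate event times: a centre reaches the origin ...
        events = [abs(c) for c, _ in current if c != 0]
        # ... or two neighbouring intervals touch.
        for (c1, r1), (c2, r2) in zip(current, current[1:]):
            closing_speed = sgn(c2) - sgn(c1)  # each centre moves with velocity -sgn(c)
            if closing_speed > 0:
                gap = (c2 - r2) - (c1 + r1)
                events.append(gap / closing_speed)
        step = min([remaining] + events)
        current = merge_touching([(c - sgn(c) * step, r) for c, r in current])
        remaining -= step
    return current


def measure(intervals):
    """Total length of a disjoint union of intervals I(c, r)."""
    return sum(2 * r for _, r in intervals)


# Example: U = I(-2, 1) united with I(3, 1), so |U| = 4 and S(U) = I(0, 2).
U = [(3, 1), (-2, 1)]
at_one = continuous_steiner(U, 1)
at_two = continuous_steiner(U, 2)
```

In this example the two intervals meet at \(\tau = 3/2\), merge into I(1/2, 2), and reach I(0, 2) at \(\tau = 2\), after which nothing changes. The measure stays equal to 4 throughout, and running the flow for time 1 and then for time 1/2 gives the same set as running it for time 3/2, in line with (b) and (d).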
Once we have the continuous Steiner symmetrization for a one-dimensional set, we can define the continuous Steiner symmetrization (in a certain direction) for a non-negative function in \({\mathbb {R}}^d\).
Definition 2.12
Given \(\mu _0 \in L^1_+({\mathbb {R}}^d)\), we define its continuous Steiner symmetrization \(S^\tau \mu _0\) (in direction \(e_1 = (1,0,\cdots ,0)\)) as follows. For any \(x_1 \in {\mathbb {R}}, x'\in {\mathbb {R}}^{d-1}, h>0\), let
where \(U_{x'}^h\) is defined in (2.19).
For an illustration of \(S^\tau \mu _0\) for \(\mu _0\in L^1({\mathbb {R}})\), see Fig. 2.
Using the above definition, Lemma 2.11 and the representation (2.20) one immediately has
Furthermore, it is easy to check that \(S^\tau \mu _0 = \mu _0\) for all \(\tau \) if and only if \(\mu _0\) is symmetric decreasing about the hyperplane \(H=\{x_1=0\}\). Below is the definition for a function being symmetric decreasing about a hyperplane:
Definition 2.13
Let \(\mu _0 \in L^1_+({\mathbb {R}}^d)\). For a hyperplane \(H \subset {\mathbb {R}}^d\) (with normal vector e), we say \(\mu _0\) is symmetric decreasing about H if for any \(x\in H\), the function \(f(\tau ):=\mu _0(x+\tau e)\) is rearranged, i.e. if \(f=f^{\#}\).
Next we state some basic properties of \(S^\tau \) without proof, see [18, 56, 58] for instance.
Lemma 2.14
The continuous Steiner symmetrization \(S^\tau \mu _0\) in Definition 2.12 has the following properties:
(a) For any \(h>0\), \(|\{S^\tau \mu _0> h\}| = |\{\mu _0>h\}|\). As a result, \(\Vert S^\tau \mu _0\Vert _{L^p({\mathbb {R}}^d)} = \Vert \mu _0\Vert _{L^p({\mathbb {R}}^d)}\) for all \(1\le p\le +\infty \).
(b) \(S^\tau \) has the semigroup property, that is, \(S^{\tau +s}\mu _0 = S^\tau (S^s\mu _0)\) for any \(\tau ,s\ge 0\) and non-negative \(\mu _0 \in L^1({\mathbb {R}}^d)\).
Lemma 2.14 immediately implies that \(\mathcal {S}[S^\tau \mu _0]\) is constant in \(\tau \), where \(\mathcal {S}[\cdot ]\) is as given in (2.5).
2.2.2 Interaction energy under Steiner symmetrization
In this subsection, we will investigate \(\mathcal {I}[S^\tau \mu _0]\). It has been shown in [18, Corollary 2] and [64, Theorem 3.7] that \(\mathcal {I}[S^\tau \mu _0]\) is non-increasing in \(\tau \). Indeed, in the case that \(\mu _0\) is a characteristic function \(\chi _{\Omega _0}\), it is shown in [72] that \(\mathcal {I}[S^\tau \mu _0]\) is strictly decreasing for \(\tau \) small enough if \(\Omega _0\) is not a ball. However, in order to obtain (2.16) for a strictly positive \(c_0\), some refined estimates are needed, and we will prove the following:
Proposition 2.15
Let \(\mu _0 \in \mathcal {C}({\mathbb {R}}^d) \cap L^1_+({\mathbb {R}}^d)\). Assume the hyperplane \(H=\{x_1=0\}\) splits the mass of \(\mu _0\) into half and half, and \(\mu _0\) is not symmetric decreasing about H. Let \(\mathcal {I}[\cdot ]\) be given in (2.5), where \(W\) satisfies the assumptions (K1)–(K3). Then \(\mathcal {I}[S^\tau \mu _0]\) is non-increasing in \(\tau \), and there exists some \(\delta _0>0\) (depending on \(\mu _0\)) and \(c_0>0\) (depending on \(\mu _0\) and \(W\)), such that
The building blocks to prove Proposition 2.15 are a couple of lemmas estimating how the interaction energy between two one-dimensional densities \(\mu _1, \mu _2\) changes under continuous Steiner symmetrization for each of them. That is, we will investigate how
changes in \(\tau \) for a given one-dimensional kernel \({\mathcal K}\) to be determined. We start with the basic case where \(\mu _1, \mu _2\) are both characteristic functions of some open interval.
Lemma 2.16
Assume \({\mathcal K}(x) \in \mathcal {C}^1({\mathbb {R}})\) is an even function with \({\mathcal K}'(x)<0\) for all \(x>0\). For \(i=1,2\), let \(\mu _i := \chi _{I(c_i,r_i)}\) respectively, where I(c, r) is as given in Definition 2.10. Then the following holds for the function \(I(\tau ) := I_{\mathcal K}[\mu _1, \mu _2](\tau )\) introduced in (2.21):
(a) \(\frac{d^+}{d \tau } I(0) \ge 0\). (Here \(\frac{d^+}{d\tau }\) stands for the right derivative.)
(b) If in addition \(\mathrm {sgn}\,c_1 \ne \mathrm {sgn}\,c_2\), then
$$\begin{aligned} \frac{d^+}{d \tau } I(0) \ge c_w \min \{r_1, r_2\} |c_2-c_1| > 0, \end{aligned}$$(2.22)
where \(c_w\) is the minimum of \(|{\mathcal K}'(r)|\) for \(r\in [\frac{|c_2-c_1|}{2}, r_1+r_2 + |c_2-c_1|]\).
Proof
By definition of \(S^\tau \), we have \(S^\tau \mu _i = \chi _{M^\tau (I(c_i, r_i))}\) for \(i=1,2\) and all \(\tau \ge 0\). If \(\mathrm {sgn}\,c_1 = \mathrm {sgn}\,c_2\), the two intervals \(M^\tau (I(c_i, r_i))\) are moving in the same direction for small enough \(\tau \), during which their interaction energy \(I(\tau )\) remains constant, implying \(\frac{d}{d \tau } I(0)=0\). Hence it suffices to focus on \(\mathrm {sgn}\,c_1 \ne \mathrm {sgn}\,c_2\) and prove (2.22).
Without loss of generality, we assume that \(c_2>c_1\), so that \(\mathrm {sgn}\,c_2- \mathrm {sgn}\,c_1\) is either 2 or 1. The definition of \(M^\tau \) gives
Taking its right derivative in \(\tau \) yields
Let us deal with the case \(r_1\le r_2\) first. In this case we rewrite \(\frac{d^+}{d\tau } I(0)\) as
where Q is the rectangle \([-r_1,r_1]\times [-r_2+(c_2-c_1), r_2+(c_2-c_1)]\), as illustrated in Fig. 3. Let \( Q^- = Q \cap \{x-y>0\}\), and \(Q^+ = Q \cap \{x-y<0\}\). The assumptions on \({\mathcal K}\) imply \({\mathcal K}'(x-y)<0\) in \(Q^-\), and \({\mathcal K}'(x-y)>0\) in \(Q^+\).
Fig. 3 Illustration of the sets \(Q, Q^-, {\tilde{Q}}^+\) and D in the proof of Lemma 2.16
Let \({\tilde{Q}}^+ := Q^+ \cap \{y\le r_2\}\), and \(D:= [-r_1, r_1]\times [r_2 + \frac{c_2-c_1}{2}, r_2+(c_2-c_1)]\). (\({\tilde{Q}}^+\) and D are the yellow set and green set in Fig. 3 respectively). By definition, \({\tilde{Q}}^+\) and D are disjoint subsets of \(Q^+\), so
We claim that \(\int _{Q^-} {\mathcal K}'(x-y)dxdy + \int _{{\tilde{Q}}^+} {\mathcal K}'(x-y)dxdy \ge 0\). To see this, note that \( Q^- \cup {\tilde{Q}}^+ \) forms a rectangle, whose center has a zero x-coordinate and a positive y-coordinate. Hence for any \(h>0\), the line segment \({\tilde{Q}}^+ \cap \{x-y = -h\}\) is at least as long as \(Q^-\cap \{x-y = h\}\), which gives the claim since \({\mathcal K}'\) is odd.
Therefore, (2.24) becomes
Note that D is a rectangle with area \(r_1(c_2-c_1)\), and for any \((x,y)\in D\), we have (recall that \(r_{2}\ge r_{1}\))
This finally gives
Similarly, if \(r_1>r_2\), then \(\frac{d^+}{d\tau } I(0)\) can be written as (2.23) with Q defined as \([-r_1+(c_2-c_1),r_1+(c_2-c_1)]\times [-r_2, r_2]\) instead, and the above inequality would hold with the roles of \(r_1\) and \(r_2\) interchanged. Combining these two cases, we have
where \(c_w\) is the minimum of \(|{\mathcal K}'(r)|\) for \(r\in [\frac{|c_2-c_1|}{2}, r_1+r_2 + |c_2-c_1|]\). \(\square \)
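The dichotomy in the proof (constant energy when the centers have the same sign, increasing energy when the signs differ) can be checked by hand for one explicit kernel. The following minimal sketch assumes the specific choice \({\mathcal K}(x) = -x^2\), which is even, \(\mathcal {C}^1\) and satisfies \({\mathcal K}'(x)<0\) for \(x>0\); this kernel and the function names are illustrative assumptions, not taken from the paper. For this kernel the double integral over \(I(c_1,r_1)\times I(c_2,r_2)\) equals \(-4r_1r_2\left[ (c_1-c_2)^2 + (r_1^2+r_2^2)/3\right] \), which the code evaluates exactly along the flow.

```python
from fractions import Fraction


def sgn(c):
    """Sign of c: 1, -1 or 0."""
    return (c > 0) - (c < 0)


def moved_centre(c, tau):
    """Centre of M^tau(I(c, r)): moves towards 0 at unit speed, then stays at 0."""
    return sgn(c) * max(abs(c) - tau, 0)


def interaction(c1, r1, c2, r2, tau):
    """I(tau) for mu_i = indicator of I(c_i, r_i) and the kernel K(x) = -x^2.

    Uses the closed form of the double integral of -(x - y)^2 over the
    product of the two moved intervals.
    """
    c1, r1, c2, r2, tau = (Fraction(v) for v in (c1, r1, c2, r2, tau))
    d = moved_centre(c1, tau) - moved_centre(c2, tau)
    return -4 * r1 * r2 * (d * d + (r1 * r1 + r2 * r2) / 3)


# Opposite signs: centres -1 and 2, both radii 1 (illustrative data).
c1, r1, c2, r2 = -1, 1, 2, 1
h = Fraction(1, 1000)
quotient = (interaction(c1, r1, c2, r2, h) - interaction(c1, r1, c2, r2, 0)) / h

# Right-hand side of (2.22) for this kernel: |K'(r)| = 2r is increasing,
# so its minimum over the interval is attained at r = |c2 - c1| / 2.
c_w = c2 - c1
bound = c_w * min(r1, r2) * (c2 - c1)

# Same sign: centres 1 and 3 move together, so the energy does not change.
same_sign_constant = interaction(1, 1, 3, 1, 0) == interaction(1, 1, 3, 1, Fraction(1, 2))
```

For this particular configuration the difference quotient equals \(48 - 16h\), so the right derivative at 0 is 48, while the right-hand side of (2.22) evaluates to 9. This is a single instance worked out for one kernel, not a verification of the lemma in general.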
The next lemma generalizes the above result to open sets with finite measures.
Lemma 2.17
Assume \({\mathcal K}(x) \in \mathcal {C}^1({\mathbb {R}})\) is an even function with \({\mathcal K}'(r)<0\) for all \(r>0\). For open sets \(U_1, U_2 \subset {\mathbb {R}}\) with finite measure, let \(\mu _i := \chi _{U_i}\) for \(i=1,2\), and let \(I(\tau ) := I_{\mathcal K}[\mu _1,\mu _2](\tau )\) be as defined in (2.21). Then
(a) \(\frac{d}{d \tau }I(\tau )\ge 0\) for all \(\tau \ge 0\);
(b) In addition, assume that there exist some \(a\in (0,1)\) and \(R>\max \{|U_1|, |U_2|\}\) such that \(|U_1 \cap (\frac{|U_1|}{2}, R)|>a\), and \(|U_2 \cap (-R, -\frac{|U_2|}{2})|>a\). Then for all \(\tau \in [0,a/4]\), we have
$$\begin{aligned} \frac{d^+}{d \tau } I(\tau ) \ge \frac{1}{128} c_w a^3 > 0, \end{aligned}$$(2.25)
where \(c_w\) is the minimum of \(|{\mathcal K}'(r)|\) for \(r\in [\frac{a}{4}, 4R]\).
Proof
It suffices to focus on the case when \(U_1, U_2\) both consist of a finite disjoint union of open intervals, and for the general case we can take the limit. Recall that \(S^\tau \mu _i = \chi _{M^\tau (U_i)}\) for \(i=1,2\) and all \(\tau \ge 0\).
To show (a), due to the semigroup property of \(S^\tau \) in Lemma 2.14, all we need to show is \(\frac{d^+}{d \tau }I(0)\ge 0\). By writing \(U_1, U_2\) each as a union of disjoint open intervals and expressing \(I(\tau )\) as a sum of the pairwise interaction energies, (a) immediately follows from Lemma 2.16(a).
We will prove (b) next. First, we claim that
To see this, note that \(A_1(0)>\frac{3a}{4}\) due to the assumption \(|U_1 \cap (\frac{|U_1|}{2}, R)|>a\). Since each interval in \(M^\tau (U_1)\) moves with speed either 0 or \(\pm 1\) at each \(\tau \), we know \(A_1'(\tau )\ge -2\) for all \(\tau \), yielding the claim. (Similarly, \(A_2(\tau ) := |M^\tau (U_2) \cap (-R, -\frac{|U_2|}{2}-\frac{a}{4})|>\frac{a}{4}\) for all \(\tau \in [0,\frac{a}{4}]\).)
Now we pick any \(\tau _0 \in [0,\frac{a}{4}]\), and we aim to prove (2.25) at this particular time \(\tau _0\). At \(\tau =\tau _0\), write \(M^{\tau _0}(U_1):= \cup _{k=1}^{N_1} I(c_k^1, r_k^1)\), where all intervals \(I(c_k^1, r_k^1)\) are disjoint, and none of them share common endpoints – if they do, we merge them into one interval.
Note that for every \(x\in M^{\tau _0}(U_1) \cap (\frac{|U_1|}{2}+\frac{a}{4}, \,R)\), x must belong to some \(I(c_k^1, r_k^1)\) with \(a/4\le c_k^1 \le R+|U_1|/2\). Otherwise, the length of \(I(c_k^1, r_k^1)\) would exceed \(|U_1|\), contradicting Lemma 2.11(b). We then define
Combining the above discussion with (2.26), we have \(\sum _{k\in \mathscr {I}_1}|I(c_k^1, r_k^1)| \ge a/4\), i.e.
Likewise, let \(M^{\tau _0}(U_2) := \cup _{k=1}^{N_2} I(c_k^2, r_k^2)\), and denote by \(\mathscr {I}_2\) the set of indices k such that \(-R-|U_2|/2\le c_k^2 \le -\frac{a}{4}\), and similarly we have \(\sum _{k\in \mathscr {I}_2} r_k^2 \ge a/8\).
The semigroup property of \(M^\tau \) in Lemma 2.11 gives that for all \(s>0\),
Since none of the intervals \(I(c_k^1, r_k^1)\) share common endpoints, we have
A similar result holds for \(M^{\tau _0+s}(U_2)\), hence we obtain for sufficiently small \(s> 0\):
Applying Lemma 2.16(a) to the above identity yields
Next we will obtain a lower bound for \(T_{kl}\). By definition of \(\mathscr {I}_1\) and \( \mathscr {I}_2\), for each \(k\in \mathscr {I}_1\) and \(l\in \mathscr {I}_2\) we have that \(c_k^1 \ge \frac{a}{4}\) and \(c_l^2 \le -\frac{a}{4}\), hence \(|c_l^2 - c_k^1| \ge \frac{a}{2}\). Thus Lemma 2.16(b) yields
where \(c_w = \min _{r\in [\frac{a}{4}, 4R]} |{\mathcal K}'(r)|\) (here we used that for \(k\in \mathscr {I}_1, l\in \mathscr {I}_2\), we have \(r_k^1+r_l^2 + |c_l^2-c_k^1| \le |U_1|/2+|U_2|/2 + (R+|U_1|/2) + (R+|U_2|/2) \le 4R\), due to the assumption \(R>\max \{|U_1|, |U_2|\}\).)
Plugging the above inequality into (2.28) and using \(\min \{u,v\} \ge \min \{u,1\} \min \{v,1\}\) for \(u,v>0\), we have
here we applied (2.27) in the second-to-last inequality, and used the assumption \(a\in (0, 1)\) for the last inequality. Since \(\tau _0 \in [0,a/4]\) is arbitrary, we can conclude. \(\square \)
Now we are ready to prove Proposition 2.15.
Proof of Proposition 2.15
Since \(\mu _0 \in \mathcal {C}({\mathbb {R}}^d) \cap L^1_+({\mathbb {R}}^d)\) is not symmetric decreasing about \(H = \{x_1 = 0\}\), we know that there exists some \(x' \in {\mathbb {R}}^{d-1}\) and \(h>0\), such that \(U_{x'}^h := \{x_1\in {\mathbb {R}}: \mu _0(x_1, x')>h\}\) has finite measure, and its difference from \((-|U_{x'}^h|/2, |U_{x'}^h|/2)\) has nonzero measure.
For \(R>0, a>0\), define
Our discussion above yields that at least one of \(B_1^{R,a}\) and \(B_2^{R,a}\) is nonempty when R is sufficiently large and \(a>0\) sufficiently small (hence at least one of them must have nonzero measure by continuity of \(\mu _0\)). Indeed, using the fact that H splits the mass of \(\mu _0\) into half and half, we can choose R sufficiently large and \(a>0\) sufficiently small (both of them depend on \(\mu _0\) only), such that both \(B_1^{R,a}\) and \(B_2^{R,a}\) have nonzero measure in \({\mathbb {R}}^{d-1}\times (0,+\infty )\).
Now, let us define a one-dimensional kernel \(K_l(r) := -\tfrac{1}{2}W(\sqrt{r^2+l^2})\). Note that for any \(l>0\), the kernel \(K_l \in \mathcal {C}^1({\mathbb {R}})\) is even in r, and \(K_l'(r)<0\) for all \(r>0\). By definition of \(S^\tau \), we can rewrite \(\mathcal {I}[S^\tau \mu _0]\) as
Thus using the notation in (2.21), \(\mathcal {I}[S^\tau \mu _0] \) can be rewritten as
and taking its right derivative [and applying Lemma 2.17(a)] yields
By definition of \(B_1^{R,a}\) and \(B_2^{R,a}\), for any \((x',h_1)\in B_1^{R,a}\) and \((y',h_2)\in B_2^{R,a}\), we can apply Lemma 2.17(b) to obtain
where \(c_w\) is the minimum of \(|K_{|x'-y'|}'(r)|\) in \([a/4, 4R]\). By definition of \(K_l(r)\), we have
Using \(|x'|\le R\) and \(|y'|\le R\) (due to definition of \(B_1, B_2\)), we have \( \frac{r}{\sqrt{r^2+|x'-y'|^2}} \ge \frac{a}{20R}\) for all \(r\in [a/4,4R]\), hence \(c_w \ge \frac{a}{40R} \min _{r\in [\frac{a}{4}, 4R]}W'(r)\).
Plugging (2.30) (with the above \(c_w\)) into (2.29) finally yields
hence we can conclude the desired estimate. \(\square \)
2.2.3 Proof of Proposition 2.8
In the statement of Proposition 2.8, we assume that \(\mu _0\) is not radially decreasing up to any translation. Since Steiner symmetrization only deals with symmetrizing in one direction, we will use the following simple lemma linking radial symmetry with being symmetric decreasing about hyperplanes. Although the result is standard (see [48, Lemma 1.8]), for the sake of completeness we include here the details of the proof.
Lemma 2.18
Let \(\mu _0 \in \mathcal {C}({\mathbb {R}}^d)\). Suppose for every unit vector e, there exists a hyperplane \(H \subset {\mathbb {R}}^d\) with normal vector e, such that \(\mu _0\) is symmetric decreasing about H. Then \(\mu _0\) must be radially decreasing up to a translation.
Proof
For \(i=1,\dots ,d\), let \(e_i\) be the unit vector with i-th coordinate 1 and all the other coordinates 0. By assumption, for each i, there exists some hyperplane \(H_i\) with normal vector \(e_i\), such that \(\mu _0\) is symmetric decreasing about \(H_i\). We then represent each \(H_i\) as \(\{(x_1,\dots ,x_d): x_i = a_i\}\) for some \(a_i\in {\mathbb {R}}\), and then define \(a\in {\mathbb {R}}^d\) as \(a:=(a_1,\dots , a_d)\). Our goal is to prove that \(\mu _0(\cdot -a)\) is radially decreasing.
We first claim that \(\mu _0(x) = \mu _0(2a-x)\) for all \(x\in {\mathbb {R}}^d\). For any hyperplane \(H\subset {\mathbb {R}}^d\), let \(T_H: {\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) be the reflection about the hyperplane H. Since \(\mu _0\) is symmetric with respect to \(H_1, \dots , H_d\), we have \(\mu _0(x) = \mu _0(T_{H_i}x)\) for \(x\in {\mathbb {R}}^d\) and all \(i=1,\dots ,d\), thus \(\mu _0(x) = \mu _0(T_{H_1}\dots T_{H_d} x) = \mu _0(2a-x)\).
The claim implies that every hyperplane H passing through a must split the mass of \(\mu _0\) into half and half. Denote the normal vector of H by e. By assumption, \(\mu _0\) is symmetric decreasing about some hyperplane \(H'\) with normal vector e. The definition of symmetric decreasing implies that \(H'\) is the only hyperplane with normal vector e that splits the mass into half and half, hence \(H'\) must coincide with H. Thus \(\mu _0\) is symmetric decreasing about every hyperplane passing through a, hence we can conclude. \(\square \)
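The identity \(T_{H_1}\dots T_{H_d}x = 2a-x\) used in the first claim is elementary: each reflection \(T_{H_i}\) only changes the i-th coordinate, sending \(x_i\) to \(2a_i-x_i\). A minimal sketch, with hypothetical helper names `reflect` and `reflect_all` and arbitrary sample data, checks it coordinate by coordinate.

```python
def reflect(x, i, a_i):
    """Reflect the point x about the hyperplane {x_i = a_i}."""
    y = list(x)
    y[i] = 2 * a_i - y[i]
    return tuple(y)


def reflect_all(x, a):
    """Apply the reflections about {x_i = a_i} for every coordinate i in turn."""
    for i, a_i in enumerate(a):
        x = reflect(x, i, a_i)
    return tuple(x)


# Sample data in dimension 3 (illustrative only).
a = (1, -2, 3)
x = (4, 0, -5)
point_reflection = tuple(2 * a_i - x_i for a_i, x_i in zip(a, x))
composed = reflect_all(x, a)
```

Since the coordinate reflections commute, the order in which they are applied does not matter, and `composed` coincides with the point reflection of `x` through `a`.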
Proof of Proposition 2.8
Since \(\mu _0\) is not radially decreasing up to any translation, by Lemma 2.18, there exists some unit vector e, such that \(\mu _0\) is not symmetric decreasing about any hyperplane with normal vector e. In particular, there is a hyperplane H with normal vector e that splits the mass of \(\mu _0\) into half and half, and \(\mu _0\) is not symmetric decreasing about H. We set \(e=(1,0,\dots ,0)\) and \(H = \{x_1=0\}\) throughout the proof without loss of generality. For the rest of the proof, we will discuss two different cases \(m\in (0,1]\) and \(m>1\), and construct \(\mu (\tau , \cdot )\) in different ways.
Case 1: \(m \in (0,1]\). In this case, we simply set \(\mu (\tau ,\cdot ) = S^\tau \mu _0\). By Proposition 2.15, \(\mathcal {I}[S^\tau \mu _0]\) is decreasing at least linearly for a short time. Since continuous Steiner symmetrization preserves the distribution function, even if \(\mathcal {S}[\mu _0] = -\infty \) by itself, we still have the difference \(\mathcal {S}[\mu (\tau )] - \mathcal {S}[\mu _0] \equiv 0\) in the sense of (2.6). Thus (2.16) holds for all sufficiently small \(\tau >0\). In addition, (2.18) is automatically satisfied since we assumed that \(\mathrm {supp}\,\mu _0 = {\mathbb {R}}^d\) for \(m\in (0,1]\), and recall that \(S^\tau \) is mass-preserving by definition.
It then suffices to prove (2.17) for all sufficiently small \(\tau >0\). Let us discuss the case \(m=1\) first. By assumption, \(|\nabla \log \mu _0| \le C_0\). For any \(y\in {\mathbb {R}}^d\) and \(\tau >0\) we claim that
To see this, let us fix any \(y = (y_1, y')\in {\mathbb {R}}^d\). Since \(\log \mu _0(\cdot , y')\) is Lipschitz with constant \(C_0\), for any \(\tau >0\), the following two inequalities hold:
and
Since the level sets of \(\mu _0\) are moving with velocity at most 1 (and note that any level set of \(\mu _0\) is also a level set of \(\log \mu _0\)), we obtain (2.31). It implies
We then have \(|\mu (\tau ,y) - \mu _0(y)| \le 2C_0\mu _0(y) \tau \) for all \(\tau \in (0, \frac{\log 2}{C_0})\) and all \(y\in {\mathbb {R}}^d\).
Now we move on to \(m\in (0,1)\), where we aim to show that \(|\mu (\tau ,y) - \mu _0(y)| \le C_1 \mu _0^{2-m}(y)\tau \) for some \(C_1\) for all sufficiently small \(\tau >0\). Using the assumption \(|\nabla \frac{m}{1-m} \mu _0^{m-1}| \le C_0\), the same argument to obtain (2.31) then gives the following for all \(y\in {\mathbb {R}}^d, \tau >0\):
Note that \(\mu _0^{m-1}(y)\ge \Vert \mu _0\Vert _\infty ^{m-1}\), since \(\mu _0 \in L^\infty \) and \(m\in (0,1)\). Let us set \(\delta _0 = \frac{m}{2(1-m)C_0}\Vert \mu _0\Vert _\infty ^{m-1}\). For any \(\tau \in (0,\delta _0)\), the left hand side of the above inequality is strictly positive, thus we have
and note that our choice of \(\delta _0\) ensures that
for all \(\tau \in (0,\delta _0)\). Let \(f(a) := \left( \mu _0^{m-1}(y) +a\right) ^{\frac{1}{m-1}} - \mu _0(y)\), which is a convex and decreasing function in a with \(f(0)=0\). Using this function f, the above inequality (2.32) can be rewritten as
Since f is convex and decreasing, for all \(|a| \le \frac{C_0(1-m)}{m}\delta _0=\frac{1}{2}\Vert \mu _{0}\Vert _{\infty }^{m-1}\) we have
and this leads to
with \(C_1 := \frac{2^{\frac{m-2}{m-1}}}{m}C_{0}\), which gives (2.17).
Case 2: \(m>1\). Note that if we set \(\mu (\tau ,\cdot ) = S^\tau \mu _0\), then it directly satisfies (2.16) for a short time, since \(\mathcal {I}[S^\tau \mu _0]\) is decreasing at least linearly for a short time by Proposition 2.15, and \(\mathcal {S}[S^\tau \mu _0]\) is constant in \(\tau \). However, \(S^\tau \mu _0\) does not satisfy (2.17) and (2.18). To solve this problem, we will modify \(S^\tau \mu _0\) into \(\tilde{S}^\tau \mu _0\), where we make the set \(U_{x'}^h := \{x_1\in {\mathbb {R}}: \mu _0(x_1, x')>h\}\) travel at speed v(h) rather than at constant speed 1, with v(h) given by
for some sufficiently small constant \(h_0>0\) to be determined later. More precisely, we define \(\mu (\tau ,\cdot ) = {\tilde{S}}^\tau \mu _0\) as
with v(h) as in (2.33). For an illustration on the difference between \(S^\tau \mu _0\) and \({\tilde{S}}^\tau \mu _0\), see the left figure of Fig. 4.
Fig. 4 Left: A sketch of \(\mu _0\) (grey), \(S^\tau \mu _0\) (blue) and \({\tilde{S}}^\tau \mu _0\) (red dashed) for a small \(\tau >0\). Right: In the construction of \({\tilde{S}}^\tau \), due to a reduced speed at lower values, a higher value level set may travel over a lower value level set. The figure illustrates this phenomenon for a large \(\tau >0\)
Note that \({\tilde{S}}^\tau \mu _0\) and \(S^\tau \mu _0\) do not necessarily have the same distribution function. Due to the reduced speed v(h) for \(h\in (0,h_0)\) in the construction of \(\tilde{S}^\tau \), a higher block may travel over a lower block, as illustrated in the right figure of Fig. 4. When this happens, the part that is hanging outside would “drop down” as we integrate in h in (2.34), thus changing the distribution function of \({\tilde{S}}^\tau \mu _0\). However, this cannot happen when \(\tau \ll 1\): indeed, using the regularity assumption \(|\nabla \mu _0^{m-1}|\le C_0\) and the particular v(h) in (2.33), one can show that the level sets remain ordered for small enough \(\tau \). We will not pursue this direction, since later we will show in (2.38) that \(\mathcal {S}[{\tilde{S}}^\tau \mu _0] \le \mathcal {S}[\mu _0]\) for all \(\tau >0\), which is sufficient for us.
Our goal is to show that such \(\mu (\tau ,\cdot )\) satisfies (2.16), (2.17) and (2.18) for small enough \(\tau \). Let us first prove that for any \(h_0>0\), \(\mu (\tau ,\cdot )\) satisfies (2.17) and (2.18) for \(\tau \in [0,\delta _1]\), where \(\delta _1 = \delta _1(m,h_0,C_0)>0\). To show (2.18), note that the assumption \(|\nabla (\mu _0^{m-1})| \le C_0\) directly leads to the following: for any \(x,y\in {\mathbb {R}}^d\) with \(\mu _0(x)\ge h>0\) and \(\mu _0(y)=0\), we have that \(|x-y| \ge h^{m-1}/C_0\). This implies that for any connected component \(D_i \subset \mathrm {supp}\,\mu _0\),
Now define \(D_{i,x'}\) as the one-dimensional set \(\{x_1\in {\mathbb {R}}: (x_1, x') \in D_i\}\). The inequality (2.35) yields
and note that for any \(h>0\), we have \(h^{m-1}/(C_0 v(h))\ge h_0^{m-1}/C_0\) by definition of v(h). Using the above equation, the definition of \({\tilde{S}}^\tau \) and the fact that \(M^{v(h)\tau }\) is measure-preserving, we have that (2.18) holds for all \(\tau \le h_0^{m-1}/C_0\).
Next we prove (2.17). Let us fix any \(y = (y_1, y')\in {\mathbb {R}}^d\), and denote \(h=\mu _0(y)\). Using \(|\nabla \mu _0^{m-1}| \le C_0\), we have that for any \(\lambda >1\),
So we have \(y_1 \not \in M^{v(\lambda h)\tau } \left( U_{y'}^{\lambda h}\right) \) for all \(\tau \le \frac{(\lambda ^{m-1}-1)h^{m-1}}{C_0 v(\lambda h)}\), which is uniformly bounded below by \( \frac{(\lambda ^{m-1}-1)h_0^{m-1}}{C_0 \lambda ^{m-1}}\) due to the fact that \(v(\lambda h) \le (\lambda h/h_0)^{m-1}\) for all h. By definition of \({\tilde{S}}^\tau \) and the fact that \(\mu _0(y)=h\), the following holds for all \(\lambda > 1\):
Note that there exists \(c_m^1>0\) only depending on m, such that \(\lambda ^{m-1}-1 \ge c_m^1(\lambda - 1)\) for all \(1<\lambda <2\). Hence for all \(1<\lambda <2\) we have
and this directly implies
Similarly, for any \(0<\eta < 1\) we have \( \mathrm {dist}(y_1, (U_{y'}^{\eta h})^c)\ge \frac{(1-\eta ^{m-1}) h^{m-1}}{C_0}, \) and an identical argument as above gives us
Now we let \(c_m^2>0\) be such that \(1-\eta ^{m-1} \ge c_m^2(1-\eta )\) for all \(\frac{1}{2}<\eta <1\). Hence we have \({\tilde{S}}^\tau [\mu _0](y) - \mu _0(y) \ge -(1-\eta )\mu _0(y)\) for \(\tau = \dfrac{c_m^2 h_0^{m-1}}{C_0}(1-\eta )\), which implies
Combining (2.36) and (2.37), we have that for any \(h_0>0\), (2.17) holds for some \(C_1\) for all \(\tau \in [0,\delta _1]\), where both \(C_1>0\) and \(\delta _1>0\) depend on \(C_0, h_0\) and m.
Finally, we will show that (2.16) holds for \(\mu (\tau ) = {\tilde{S}}^\tau [\mu _0]\) if we choose \(h_0>0\) to be sufficiently small. First, we point out that \(\mathcal {S}[{\tilde{S}}^\tau \mu _0]\) is not preserved for all \(\tau \). This is because when different level sets are moving at different speeds v(h), we no longer have that \(M^{v(h_1)\tau }(U_{x'}^{h_1}) \subset M^{v(h_2)\tau }(U_{x'}^{h_2}) \) for all \(h_1>h_2\). Nevertheless, we claim it is still true that
To see this, note that the definition of \({\tilde{S}}^\tau \) and the fact that \(M^{v(h)\tau }\) is measure preserving give us
regardless of the definition of v(h). This implies that \(\int f({\tilde{S}}^\tau \mu _0(x))dx \le \int f(\mu _0(x))dx\) for any convex increasing function f, yielding (2.38).
Due to (2.38) and the fact that \(\mathcal {E}[\cdot ] = \mathcal {S}[\cdot ]+\mathcal {I}[\cdot ]\), in order to prove (2.16), it suffices to show
Recall that Proposition 2.15 gives that \(\mathcal {I}[S^\tau \mu _0] \le \mathcal {I}[\mu _0] - c\tau \) for \(\tau \in [0,\delta ]\) with some \(c>0\) and \(\delta >0\). As a result, to show (2.39), all we need is to prove that if \(h_0>0\) is sufficiently small, then
To show (2.40), we first split \(S^\tau \mu _0\) as the sum of two integrals in \(h\in [h_0,\infty )\) and \(h\in [0,h_0)\):
We then split \({\tilde{S}}^\tau \mu _0\) similarly, and since \(v(h)=1\) for all \(h>h_0\) we obtain
For any \(\tau \ge 0\), we have \(\Vert f_1(\tau ,\cdot )\Vert _{L^\infty ({\mathbb {R}}^d)} \le \Vert \mu _0\Vert _{L^\infty ({\mathbb {R}}^d)}\), while \(\Vert f_2(\tau ,\cdot )\Vert _{L^\infty ({\mathbb {R}}^d)}\) and \(\Vert \tilde{f}_2(\tau ,\cdot )\Vert _{L^\infty ({\mathbb {R}}^d)}\) are both bounded by \(h_0\). As for the \(L^1\) norm, we have that \(\Vert f_1(\tau ,\cdot )\Vert _{L^1({\mathbb {R}}^d)} \le \Vert \mu _0\Vert _{L^1({\mathbb {R}}^d)}\), and
where \(m_{\mu _0}(h_0)\) approaches 0 as \(h_0\searrow 0\).
Also, since \(v(h)\le 1\), we know that for each \(\tau \ge 0\), there is a transport map \(\mathcal {T}(\tau ,\cdot ):{\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) with \(\sup _{x\in {\mathbb {R}}^d} |\mathcal {T}(\tau ,x)-x|\le 2 \tau \), such that \(\mathcal {T}(\tau ,\cdot )\# f_2(\tau ,\cdot )=\tilde{f}_2(\tau ,\cdot )\) (that is, \(\int {\tilde{f}}_2(\tau ,x) \varphi (x)dx = \int f_2(\tau ,x)\varphi (\mathcal {T}(\tau ,x))dx\) for any measurable function \(\varphi \)). Indeed, since the level sets of \(f_2\) are traveling at speed 1 and the level sets of \({\tilde{f}}_2\) are traveling with speed v(h), for each \(\tau \) we can find a transport plan between them with maximal displacement at most \(2\tau \) on its support. Let us remark that since these densities are both in \(L^\infty \), there is some optimal transport map \(\tilde{\mathcal {T}}\) for the \(\infty \)-Wasserstein distance such that \(|\tilde{\mathcal {T}}(\tau ,x)-x|\le 2\tau \). Although existence of an optimal map is known [38], we just need a transport map with this property below.
Using the decompositions (2.41), (2.42) and the definition of \(\mathcal {I}[\cdot ]\), we obtain, omitting the \(\tau \) dependence on the right hand side,
and we will bound \(A_1(\tau )\) and \(A_2(\tau )\) below. For \(A_1(\tau )\), denote \(\Phi (\tau ,\cdot ) := W*f_1(\tau ,\cdot )\). Using the \(L^\infty \), \(L^1\) bounds on \(f_1\) and the assumptions (K2),(K3), we proceed in the same way as in (2.4) to obtain that \(\Vert \nabla \Phi \Vert _{L^\infty ({\mathbb {R}}^d)} \le C = C(\Vert \mu _0\Vert _{L^\infty ({\mathbb {R}}^d)}, \Vert \mu _0\Vert _{L^1({\mathbb {R}}^d)}, C_w, d)\).
Using that \(\mathcal {T}(\tau ,\cdot )\# f_2(\tau ,\cdot ) = \tilde{f}_2(\tau ,\cdot )\), we can rewrite \(A_1(\tau )\) as
where the coefficient of \(\tau \) can be made arbitrarily small by choosing \(h_0\) sufficiently small. To control \(A_2(\tau )\), we first use the identity \(\int f(W*g)dx = \int g(W*f)dx\) to bound it by
and both terms can be controlled in the same way as \(A_1(\tau )\), since both \(\Phi _2 := W*f_2\) and \({{\tilde{\Phi }}}_2 := W*{\tilde{f}}_2\) satisfy the same estimate as \(\Phi \). Combining the estimates for \(A_1(\tau )\) and \(A_2(\tau )\), we can choose \(h_0>0\) sufficiently small, depending on \(\mu _0\) and \(W\), such that Eq. (2.40) would hold for all \(\tau \), which finishes the proof. \(\square \)
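The mechanism by which the transport map enters the bounds on \(A_1(\tau )\) and \(A_2(\tau )\) can be sketched for a generic Lipschitz function \(\Phi \). This is only an illustration of the key step, not a reproduction of the displays of the proof. Since \(\mathcal {T}(\tau ,\cdot )\# f_2(\tau ,\cdot ) = \tilde{f}_2(\tau ,\cdot )\) and \(|\mathcal {T}(\tau ,x)-x|\le 2\tau \), omitting the \(\tau \) dependence,
$$\begin{aligned} \left| \int ({\tilde{f}}_2 - f_2)\,\Phi \,dx \right| = \left| \int f_2(x)\big (\Phi (\mathcal {T}(x)) - \Phi (x)\big )dx \right| \le 2\tau \,\Vert \nabla \Phi \Vert _{L^\infty ({\mathbb {R}}^d)}\,\Vert f_2\Vert _{L^1({\mathbb {R}}^d)}. \end{aligned}$$
The factor \(\Vert f_2\Vert _{L^1({\mathbb {R}}^d)}\) is the quantity that becomes small as \(h_0\searrow 0\), which is why the coefficient of \(\tau \) can be made arbitrarily small.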
2.3 Proof of Theorem 2.2
Proof
Towards a contradiction, assume there is a stationary state \(\rho _s\) that is not radially decreasing. Due to Lemma 2.3, we have that \(\rho _s \in \mathcal {C}({\mathbb {R}}^d) \cap L^1_+({\mathbb {R}}^d)\), and \(|\frac{m}{m-1}\nabla \rho _s^{m-1}| \le C_0\) in \(\mathrm {supp}\,\rho _s\) for some \(C_0>0\) (and if \(m=1\), it becomes \(|\nabla \log \rho _s|\le C_0\)). In addition, if \(m\in (0,1]\), the same lemma also gives \(\mathrm {supp}\,\rho _s = {\mathbb {R}}^d\). This enables us to apply Proposition 2.8 to \(\rho _s\), hence there exists a continuous family of \(\mu (\tau ,\cdot )\) with \(\mu (0,\cdot ) = \rho _s\) and constants \(C_1>0, c_0>0,\delta _0>0\), such that the following holds for all \(\tau \in [0,\delta _0]\):
Next we will use (2.44) and (2.45) to directly estimate \(\mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s]\), and our goal is to show that there exists some \(C_2>0\), such that
We then directly obtain a contradiction between (2.43) and (2.46) for sufficiently small \(\tau >0\).
Let \(g(\tau ,x) := \mu (\tau ,x)-\rho _s(x)\). Due to (2.44), we have \(|g(\tau ,x)| \le C_1 \rho _s(x)^{\max \{1,2-m\}} \tau \) for all \(x\in {\mathbb {R}}^d\) and \(\tau \in [0,\delta _0]\). From now on, we set \(\delta _0\) to be the minimum of its previous value and \((2C_1(1+\Vert \rho _s\Vert _\infty ))^{-1}\). Such \(\delta _0\) ensures that \(\mathrm {supp}\,g(\tau ,\cdot ) \subset \mathrm {supp}\,\rho _s\) and \(|g(\tau ,x)/\rho _s(x)|\le \frac{1}{2}\) for all \(\tau \in [0,\delta _0]\).
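The claim \(|g(\tau ,x)/\rho _s(x)|\le \frac{1}{2}\) can be checked directly from the bound on g and the new choice of \(\delta _0\). As a short sketch, for \(x\in \mathrm {supp}\,\rho _s\) and \(\tau \le \delta _0 \le (2C_1(1+\Vert \rho _s\Vert _\infty ))^{-1}\), and using \(s^{1-m}\le 1+s\) for \(s\ge 0\) when \(m\in (0,1)\),
$$\begin{aligned} \frac{|g(\tau ,x)|}{\rho _s(x)} \le C_1\,\rho _s(x)^{\max \{0,1-m\}}\,\tau \le C_1\,(1+\Vert \rho _s\Vert _\infty )\,\tau \le \frac{1}{2}. \end{aligned}$$
The support inclusion follows since the bound on g forces \(g(\tau ,x)=0\) wherever \(\rho _s(x)=0\).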
Since the energy \(\mathcal {E}\) takes different formulas for \(m\ne 1\) and \(m=1\), we will treat these two cases differently. Let us start with the case \(m\in (0,1)\cup (1,+\infty )\). Using the notation \(g(\tau ,x)\), we have the following: (where in the integrand we omit the x dependence, due to space limitations)
Recall that for all \(|a|<1/2\), we have the elementary inequality
Since for all \(x\in \mathrm {supp}\,\rho _s\) and \(\tau \in [0,\delta _0]\) we have \(|g(\tau ,x)/\rho _s(x)|\le \frac{1}{2}\), we can replace a by \(g(\tau ,x)/\rho _s(x)\) in the above inequality, then multiply both sides by \(\frac{1}{|m-1|}\rho _s^m\) to obtain the following (with \(C_2(m)=C(m)/|m-1|\)):
Applying this to (2.47), we have the following for all \(\tau \le \min \{\delta _0, C_1/2\}\):
Since \(\rho _s\) is a steady state solution, from (2.11) we have \(\frac{m}{m-1}\rho _s^{m-1} + W*\rho _s = C_i\) in each connected component \(D_i\subset \mathrm {supp}\,\rho _s\), hence \(I_1 \equiv 0\) for all \(\tau \in [0,\delta _0]\) due to (2.45) and the definition of \(g(\tau ,\cdot )\).
For \(I_2\) and \(I_3\), since \(|g(\tau ,x)| \le C_1 \rho _s(x)^{\max \{1,2-m\}} \tau \) for \(\tau \in [0,\delta _0]\), for \(m>1\) it becomes \(|g(\tau ,x)| \le C_1 \rho _s(x) \tau \), thus we directly have
for some \(A>0\) depending on \(\Vert \rho _s\Vert _{1}, \Vert \rho _s\Vert _{\infty }, m\) and d (where we use (2.4) and \(\rho _s \omega (1+|x|)\in L^1\) to control \(I_2\)). For \(m\in (0,1)\), the bound of g implies \(|g(\tau ,x)| \le C_1 \Vert \rho _s\Vert _{\infty }^{1-m} \rho _s(x) \tau \). Plugging this into \(I_2\) gives the same bound as above (with a different A). And for \(I_3\), plugging in \(|g(\tau ,x)| \le C_1 \rho _s(x)^{2-m} \tau \) gives
where in the last inequality we used that \(2-m>1\) and \(\rho _s \in L^1 \cap L^\infty \). Putting them together finally gives \(\big |\mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s]\big | \le 2A\tau ^2\) for all \(\tau \le \delta _0\), finishing the proof for \(m\in (0,1)\cup (1,+\infty )\).
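For the reader's convenience, the elementary inequality behind the quadratic term can be sketched as a second-order Taylor estimate. The explicit constant below is one admissible choice and is an assumption, not necessarily the constant of the original display. For \(|a|\le \frac{1}{2}\), \(m>0\) and \(m\ne 1\), Taylor's formula gives some \(\theta \in (0,1)\) with
$$\begin{aligned} \big |(1+a)^m - 1 - ma\big | = \frac{m|m-1|}{2}(1+\theta a)^{m-2}a^2 \le C(m)\,a^2,\qquad C(m):=\frac{m|m-1|}{2}\max \Big \{2^{2-m},\big (\tfrac{3}{2}\big )^{m-2}\Big \}. \end{aligned}$$
Setting \(a=g/\rho _s\) and multiplying by \(\frac{1}{|m-1|}\rho _s^m\) yields, with \(C_2(m)=C(m)/|m-1|\),
$$\begin{aligned} \left| \frac{1}{m-1}\big ((\rho _s+g)^m-\rho _s^m\big ) - \frac{m}{m-1}\rho _s^{m-1}g\right| \le C_2(m)\,\rho _s^{m-2}g^2, \end{aligned}$$
and the bound \(|g|\le C_1\rho _s^{\max \{1,2-m\}}\tau \) then gives
$$\begin{aligned} \int \rho _s^{m-2}g^2\,dx \le C_1^2\tau ^2\int \rho _s^{\max \{m,2-m\}}dx <\infty , \end{aligned}$$
since the exponent is larger than 1 and \(\rho _s\in L^1\cap L^\infty \).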
Next we move on to the case \(m=1\). Using the notation \(g(\tau ,x)\), the difference \(\mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s]\) can be rewritten as follows: (where we again omit the x dependence in the integrand)
Again, we have \(J_1 = 0\) since \(\int g(\tau )dx = 0\), and \(\log \rho _s + W*\rho _s = C\) in \({\mathbb {R}}^d\). \(J_3\) is the same term as \(I_2\), thus again can be controlled by \(A\tau ^2\). Finally it remains to control \(J_2\). Let us break \(J_2\) into
For \(J_{22}\), using the inequality \(\log (1+a)<a\) for all \(a>0\), we have
where we use (2.44) in the second inequality. To control \(J_{21}\), due to the elementary inequality
for some universal constant C, setting \(a = \frac{g(\tau )}{\rho _s}\) and applying it to \(J_{21}\) gives
where the last inequality is obtained in the same way as (2.48). Combining these estimates above gives \(|\mathcal {E}[\mu (\tau )] - \mathcal {E}[\rho _s]| \le A\tau ^2\) for some \(A>0\) depending on \(\Vert \rho _s\Vert _{1}, \Vert \rho _s\Vert _{\infty }\) and d, which completes the proof. \(\square \)
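A compact way to see why the entropy difference is quadratic in \(\tau \) when \(m=1\) is sketched below. It is a variant of the argument and need not coincide with the splitting \(J_{21}+J_{22}\) used above. Writing \(a=g/\rho _s\) with \(|a|\le \frac{1}{2}\), we have the identity
$$\begin{aligned} \int (\rho _s+g)\log (\rho _s+g)\,dx - \int \rho _s\log \rho _s\,dx = \int g\log \rho _s\,dx + \int \rho _s\,(1+a)\log (1+a)\,dx. \end{aligned}$$
The function \(f(a)=(1+a)\log (1+a)-a\) satisfies \(f(0)=f'(0)=0\) and \(f''(a)=\frac{1}{1+a}\le 2\) on \([-\frac{1}{2},\frac{1}{2}]\), hence \(|f(a)|\le a^2\). Since \(\int g\,dx=0\) and \(|g|\le C_1\rho _s\tau \),
$$\begin{aligned} \left| \int \rho _s\,(1+a)\log (1+a)\,dx\right| = \left| \int \rho _s f(a)\,dx\right| \le \int \frac{g^2}{\rho _s}\,dx \le C_1^2\,\Vert \rho _s\Vert _1\,\tau ^2. \end{aligned}$$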
2.4 A shortcut for equations with a gradient flow structure
In this subsection, we would like to discuss a shortcut for proving Theorem 2.2, once the first order decay under continuous Steiner symmetrization in Proposition 2.15 has been established, if Eq. (1.1) has a rigorous gradient flow structure. Over the past two decades, it was discovered that many evolution PDEs have a Wasserstein gradient flow structure including the heat equation, porous medium equation, and the aggregation-diffusion Eq. (1.1) if the kernel W has certain convexity properties, see [2, 34, 42, 55, 76]. More precisely, for (1.1), if W is known to be \(\lambda \)-convex, then given any \(\rho _0 \in \mathcal {P}_2({\mathbb {R}}^d)\) (space of non-negative probability measures with finite second moment) with \(\mathcal {E}[\rho _0]<\infty \), there exists a unique gradient flow \(\rho (t)\) of the free energy functional \(\mathcal {E}\) in the space \(\mathcal {P}_2({\mathbb {R}}^d)\) endowed with the 2-Wasserstein distance. In addition, the gradient flow coincides with the unique weak solution if the velocity field has the necessary integrability conditions.
The \(\lambda \)-convexity of the potential W does not hold in the generality of our assumptions (K1)–(K4). However, the \(\lambda \)-convexity assumption on W has been recently relaxed in the following works for the particular, but important, case of the attractive Newtonian kernel. Craig [42] has shown that the gradient flow is well-posed if the energy \(\mathcal {E}\) is \(\xi \)-convex, where \(\xi \) is a modulus of convexity. Carrillo and Santambrogio [35] have recently shown that for (1.1) with attractive Newtonian potential, for any \(\rho _0\) in \(L^\infty ({\mathbb {R}}^d) \cap \mathcal {P}_2({\mathbb {R}}^d)\), there is a local-in-time gradient flow solution. The authors show that there are local in time \(L^\infty \) bounds at the discrete variational level allowing for local in time well defined gradient flow solutions. Furthermore, this gradient flow solution is unique among a large class of weak solutions due to the earlier results [32]. There, it was also shown that the free energy functional \(\mathcal {E}\) is \(\xi \)-convex for \(m\ge 1-\tfrac{1}{d}\) in the set of bounded densities \(L^\infty ({\mathbb {R}}^d) \cap \mathcal {P}_2({\mathbb {R}}^d)\) with a given fixed bound allowing the use of the recent theory of \(\xi \)-convex gradient flows in [42]. Summarizing, the recent results for the Newtonian attractive kernel [32, 35, 42] allow for a rigorous gradient flow structure of the Newtonian attractive kernel case for \(m\ge 1-\tfrac{1}{d}\) with initial data in \(L^\infty ({\mathbb {R}}^d) \cap \mathcal {P}_2({\mathbb {R}}^d)\).
In short, we now know two particular classes of potentials, more restrictive than the assumptions (K1)–(K4) and including the Newtonian kernel case, for which a rigorous gradient flow theory has been developed for (1.1). Next we will show that under a rigorous gradient flow structure, once we use continuous Steiner symmetrization to obtain Proposition 2.15, it almost directly leads to radial symmetry via the following shortcut. In particular, Proposition 2.8 is not needed. Below is the statement and proof of the new proposition that we include for the sake of completeness. Note that it is weaker than Theorem 2.2, since the Wasserstein gradient flow theory requires solutions to have a finite second moment, and furthermore for the existence of the gradient flow solutions we need to assume \(m\ge 1-\tfrac{1}{d}\). We will discuss this difference in Remark 2.20.
Proposition 2.19
Assume that W is such that (1.1) has a local-in-time unique gradient flow solution. Let \(\rho _s \in L^\infty ({\mathbb {R}}^d) \cap \mathcal {P}_2({\mathbb {R}}^d) \) be a stationary solution of (1.1) with \( \mathcal {E}[\rho _s]\) being finite. Then \(\rho _s\) must be radially decreasing after a translation.
Proof
Towards a contradiction, assume there is a stationary state \(\rho _s\) that is not radially decreasing after any translation. As before, Lemma 2.3 yields that \(\rho _s \in \mathcal {C}({\mathbb {R}}^d) \cap L^1_+({\mathbb {R}}^d)\). Applying Lemma 2.18 to \(\rho _s\) allows us to find a hyperplane H that splits the mass of \(\rho _s\) into half and half, but \(\rho _s\) is not symmetric decreasing about H. Without loss of generality assume \(H = \{x_1=0\}\). Applying Proposition 2.15 to \(\rho _s\) and using the fact that the \(L^m\) norm is conserved under the continuous Steiner symmetrization \(S^\tau \), we directly have that
where \(c_0, \delta _0\) are strictly positive constants that depend on \(\rho _s\). In addition, since the continuous Steiner symmetrization \(S^\tau \) gives an explicit transport plan from \(\rho _s\) to \(S^\tau \rho _s\), where each layer is shifted by no more than distance \(\tau \), we have \(W_\infty (\rho _s, S^\tau \rho _s) \le \tau \), thus
Using (2.49) and (2.50), the metric slope \(|\partial \mathcal {E}|(\rho _s)\) as defined in [2, Definition 1.2.4] satisfies
On the other hand, the local-in-time gradient flow solution \(\rho (t)\) with initial datum \(\rho _s\) satisfies an Evolution Variational Inequality (EVI) (see [42, Definition 2.10] when W is the Newtonian kernel). Arguing as in [3, Proposition 3.6], see also [32], we then have that the following energy dissipation inequality is satisfied for all \(t\ge 0\)
both for \(\lambda \)-convex potentials (in which case (2.51) actually holds with equality) and for the Newtonian attractive potential. This is a consequence of the map \(t\rightarrow |\partial \mathcal {E}|(\rho (t))\) being decreasing and lower semicontinuous, see for instance [2, Theorem 2.4.15] in the \(\lambda \)-convex case and [42, Theorem 3.12] in the Newtonian kernel case. Since \(\rho (t)\equiv \rho _s\) is a gradient flow solution, plugging it into (2.51) yields that the left hand side is 0, whereas the right hand side is less than \(-\frac{1}{2}c_0^2 t\) which is negative for all \(t>0\), a contradiction. \(\square \)
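The chain of inequalities in this proof can be summarized in the following sketch. It assumes that the energy dissipation inequality has the usual form \(\mathcal {E}[\rho (t)]-\mathcal {E}[\rho (0)]\le -\frac{1}{2}\int _0^t|\partial \mathcal {E}|^2(\rho (s))\,ds\); the precise statement should be taken from the cited references. Since \(\mathcal {S}\) is preserved by \(S^\tau \), Proposition 2.15 gives \(\mathcal {E}[S^\tau \rho _s]-\mathcal {E}[\rho _s]\le -c_0\tau \), while \(W_2(\rho _s,S^\tau \rho _s)\le W_\infty (\rho _s,S^\tau \rho _s)\le \tau \) for probability measures. Hence
$$\begin{aligned} |\partial \mathcal {E}|(\rho _s) \ge \limsup _{\tau \searrow 0}\frac{\mathcal {E}[\rho _s]-\mathcal {E}[S^\tau \rho _s]}{W_2(\rho _s,S^\tau \rho _s)} \ge \limsup _{\tau \searrow 0}\frac{c_0\tau }{\tau } = c_0, \end{aligned}$$
and inserting \(\rho (t)\equiv \rho _s\) into the dissipation inequality gives
$$\begin{aligned} 0 = \mathcal {E}[\rho (t)]-\mathcal {E}[\rho _s] \le -\frac{1}{2}\int _0^t|\partial \mathcal {E}|^2(\rho _s)\,ds \le -\frac{1}{2}c_0^2\,t <0 \quad \text{ for } t>0. \end{aligned}$$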
Remark 2.20
The assumption that \(\rho _s\) is a probability measure does not create any actual restriction. If \(\rho _s\) is a stationary solution of (1.1) with mass \(M_0 \ne 1\), we can simply apply Proposition 2.19 to \({{\tilde{\rho }}}_s := \frac{\rho _s}{M_0}\), which has mass 1 and is a stationary solution of (1.1) with some positive coefficients multiplying the two terms on the right hand side. However, the assumption that \(\rho _s\) has finite second moment (which is part of the definition of \(\mathcal {P}_2({\mathbb {R}}^d)\)) makes it more restrictive than Theorem 2.2, which only requires \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\). Moreover, the assumption of the existence of a local-in-time unique gradient flow solution implies the more restrictive condition on the nonlinear diffusion \(m\ge 1-\tfrac{1}{d}\) in order to be proved with the available literature [3, 42].
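The rescaling in this remark can be made explicit as follows. This is a sketch assuming that (1.1) reads \(\partial _t\rho = \Delta \rho ^m + \nabla \cdot (\rho \nabla (W*\rho ))\), so that stationary states satisfy \(\nabla \rho _s^m = -\rho _s\nabla (W*\rho _s)\). Substituting \(\rho _s = M_0{{\tilde{\rho }}}_s\) and dividing by \(M_0\) gives
$$\begin{aligned} M_0^{m-1}\,\nabla {{\tilde{\rho }}}_s^{\,m} = -M_0\,{{\tilde{\rho }}}_s\nabla (W*{{\tilde{\rho }}}_s), \end{aligned}$$
that is, \({{\tilde{\rho }}}_s\) is a stationary solution of \(\partial _t\rho = M_0^{m-1}\Delta \rho ^m + M_0\nabla \cdot (\rho \nabla (W*\rho ))\), whose two coefficients are positive.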
At the end of this subsection, let us point out that for our main application in this work, where \(W = -\mathcal {N}\) is the attractive Newtonian kernel modulo translation and \(m>1\), we could have used this shortcut to show that all stationary solutions \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) with finite second moment must be radially decreasing. However, the longer approach (via Proposition 2.8 and Theorem 2.2) is of broader interest for two reasons. One is that, as discussed in Remark 2.20, Theorem 2.2 proves radial symmetry in a more general class of stationary solutions and more general nonlinear diffusions. Another reason is that the longer approach does not rely on any convexity assumption on W, thus it works even if the equation does not have a rigorous gradient flow structure. Moreover, some of the authors have also recently shown that this longer proof can be generalized to kernels that are more singular than Newtonian [31] for which a rigorous gradient flow theory is missing.
2.5 Including a potential term
In this subsection, we consider the aggregation-diffusion equation with an extra drift term given by a potential V(x):
where we assume that \(m>0\), \(V(x)\in \mathcal {C}^1({\mathbb {R}}^d)\) is radially symmetric, and \(V'(r)>0\) for all \(r>0\).
For this equation, its stationary solution is defined in the same way as Definition 2.1, with (2.1) replaced by \(\nabla \rho _s^{m} = -\rho _s\nabla (\psi _s + V)\). We point out that Lemma 2.3 still holds, except that the right hand sides of (2.7) and (2.8) are now replaced by an x-dependent bound \(C + |\nabla V(x)|\). From its proof, we know that if \(\rho _s\) is a stationary solution, then
where \(C_i\) may take different values in different components. As before, if \(m=1\) then \(\frac{m}{m-1}\rho _s^{m-1}\) is replaced by \(\log \rho _s\); and if \(0<m\le 1\) we again have that \(\mathrm {supp}\,\rho _s = {\mathbb {R}}^d\).
Due to the extra potential term, the energy functional \(\mathcal {E}[\rho ]\) is now given by \(\mathcal {S}[\rho ] + \mathcal {I}[\rho ] + \mathcal {V}[\rho ]\), with the extra potential energy \(\mathcal {V}[\rho ] := \int \rho V dx\). We start with a simple observation that the potential energy is non-increasing under continuous Steiner symmetrization, a consequence of properties of continuous Steiner symmetrization in [18].
Lemma 2.21
Let \(V \in \mathcal {C}({\mathbb {R}}^d)\) be radially symmetric and non-decreasing in |x|. Let \(\mu \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) be such that \(\int \mu V dx < \infty \). Then \(\int S^\tau [\mu ] V dx\) is non-increasing in \(\tau \) for all \(\tau \ge 0\).
Proof
For any \(n\in {\mathbb {N}}_+\), let \(\varphi _n(x) := \max \{0, V(n)-V(x)\}\). (Here we define \(V(n):=V(x)|_{|x|=n}\) by a slight abuse of notation.) Note that \(\mathrm {supp}\,\varphi _n \subset B(0,n)\), and that \(\varphi _n\) is non-increasing in |x|. By the Hardy–Littlewood inequality for continuous Steiner symmetrization [18, Lemma 4], we have
Note that \(-\varphi _n = \min \{V(n), V(x)\} - V(n).\) Since \(\int S^\tau [\mu ] dx = \int \mu dx\), (2.53) is equivalent to
Sending \(n\rightarrow \infty \), the above inequality becomes \(\int S^\tau [\mu ] V dx\le \int \mu V dx\) for all \(\tau \ge 0\). The semigroup property of \(S^\tau \) then gives us the desired result. \(\square \)
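The steps of this proof can be sketched as the following chain. The first inequality is what the Hardy–Littlewood inequality presumably provides in (2.53), and the displays are a reconstruction from the surrounding text rather than the original formulas:
$$\begin{aligned} \int S^\tau [\mu ]\,\varphi _n\,dx \ge \int \mu \,\varphi _n\,dx \quad \Longleftrightarrow \quad \int S^\tau [\mu ]\,\min \{V(n),V\}\,dx \le \int \mu \,\min \{V(n),V\}\,dx, \end{aligned}$$
where the equivalence uses \(-\varphi _n = \min \{V(n),V\}-V(n)\) and the conservation of mass. Since \(\min \{V(n),V\}\) increases to V as \(n\rightarrow \infty \) and is bounded below by V(0), monotone convergence allows passing to the limit on both sides.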
The above lemma gives that \(\frac{d^+}{d\tau } \int S^\tau [\mu ] V dx \le 0\), but it turns out that we have to improve it into a strict inequality if \(\mu \) is not symmetric decreasing about \(H = \{x_1=0\}\), which we prove below.
Lemma 2.22
Let \(V \in \mathcal {C}({\mathbb {R}}^d)\) be radially symmetric and strictly increasing in |x|. Assume \(\mu \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) is such that \(\int \mu V dx < \infty \), and \(\mu \) is not symmetric decreasing about \(H = \{x_1=0\}\). Then \(\frac{d^+}{d\tau } \int S^\tau [\mu ] V dx \big |_{\tau =0}< 0.\) As a consequence, for such \(\mu \), there is a constant \(c_{0}>0\) (depending on \(\mu \) and V) such that for small \(\tau >0,\)
Proof
Recall that for each \(x'\in {\mathbb {R}}^{d-1}\), \(h\in {\mathbb {R}}^+\), the set \(U_{x'}^h\) is an at most countable union of subintervals. Without loss of generality we assume the subintervals do not share a common endpoint; if they do, we add a point to merge them into one interval. Each subinterval can be written in the form \(I(c,r)=(c-r,c+r)\). Since \(\mu \) is not symmetric decreasing about H, some of these subintervals must have their center not at 0 for some \(x', h\). This motivates us to define the set \(B_\delta \subset {\mathbb {R}}^{d-1} \times {\mathbb {R}}^+\) for \(0< \delta \ll 1\):
The assumption on \(\mu \) implies that \(|B_\delta | > 0\) for sufficiently small \(\delta >0\).
By Definition 2.12, \(\int S^\tau [\mu ] V dx\) can be written as
Now let us investigate the innermost integral. For any open set \(U \subset {\mathbb {R}}\), let us define
With this notation, the innermost integral in (2.54) becomes \(\Phi (\tau ; U_{x'}^h,x')\).
To estimate \(\frac{d^+}{d\tau } \Phi (\tau ; U_{x'}^h,x')|_{\tau =0}\), let us start with an easier estimate \(\frac{d^+}{d\tau } \Phi (\tau ; U,x')|_{\tau =0}\) when U is a single interval I(c, r). If \(c=0\), clearly \(\frac{d^+}{d\tau } \Phi (\tau ; U,x')\big |_{\tau =0} = 0\). If \(c\ne 0\) (WLOG assume \(c<0\)), then \(M^\tau (U) = I(c+\tau , r)\) for sufficiently small \(\tau >0\), thus
where we use \(|c+r|<|c-r|\) in the last inequality, which follows from \(c<0\), and actually we have \(|c-r|-|c+r|\ge \min \{2|c|, 2r\}\). And if \(c,r, x'\) satisfy \(|c|, r\in [\delta , \delta ^{-1}]\) and \(|x'| \le \delta ^{-1}\), we have the quantitative estimate
where \(C_\delta \) is given by
where we denote \(V(x)=V(|x|)\) by a slight abuse of notation. The strict positivity of \(C_\delta \) follows from the fact that V(r) is strictly increasing in r for \(r\ge 0\), as well as the compactness of the set \(\{ |a_1|-|a_2|\ge 2\delta , |a_1|,|a_2|\le 2\delta ^{-1}, |b|\le \delta ^{-1}\}\).
The above argument immediately leads to the crude estimate
as we take the sum of the estimate \(\frac{d^+}{d\tau } \Phi (\tau ; U,x')|_{\tau =0}\le 0\) over all the subintervals \(U \subset U_{x'}^h\). In addition, if \(|x'|\le \delta ^{-1}\) and \(U_{x'}^h\) has a subinterval I(c, r) with \(|c|, r\in [\delta , \delta ^{-1}]\), we have the quantitative estimate \(\frac{d^+}{d\tau } \Phi (\tau ; U_{x'}^h,x')|_{\tau =0} \le -C_\delta <0\). By definition of \(B_\delta \) at the beginning of this proof, we have
thus
finishing the proof.\(\square \)
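The single-interval computation at the heart of this proof can be illustrated numerically. The sketch below is not part of the argument: it assumes that \(M^\tau \) moves the center of \(I(c,r)\) toward 0 at unit speed and stops there, and it uses the hypothetical potential \(V(s)=s^2\). All function names are illustrative. It compares a finite-difference derivative of \(\Phi (\tau ; I(c,r),x')\) at \(\tau =0\) with the endpoint difference \(V(c+r,x')-V(c-r,x')\) for \(c<0\).

```python
import math

def shifted_center(c, tau):
    """Center of M^tau(I(c, r)): moves toward 0 at unit speed, then stays at 0 (assumed definition)."""
    if abs(c) <= tau:
        return 0.0
    return c - math.copysign(tau, c)

def interval_potential(V, c, r, xprime_norm, n=2000):
    """Midpoint-rule approximation of the integral of V(|(x_1, x')|) over x_1 in I(c, r)."""
    h = 2.0 * r / n
    total = 0.0
    for k in range(n):
        x1 = c - r + (k + 0.5) * h
        total += V(math.hypot(x1, xprime_norm))
    return total * h

def phi(V, tau, c, r, xprime_norm):
    """Phi(tau; I(c, r), x') for a single interval."""
    return interval_potential(V, shifted_center(c, tau), r, xprime_norm)

def V_quadratic(s):
    """Hypothetical radial potential V(s) = s^2, strictly increasing for s >= 0."""
    return s * s

# Interval centered at c < 0, so it is shifted to the right by the symmetrization.
c, r, xprime_norm, tau = -1.0, 0.5, 0.3, 1e-3
finite_difference = (phi(V_quadratic, tau, c, r, xprime_norm)
                     - phi(V_quadratic, 0.0, c, r, xprime_norm)) / tau
endpoint_formula = (V_quadratic(math.hypot(c + r, xprime_norm))
                    - V_quadratic(math.hypot(c - r, xprime_norm)))
print(finite_difference, endpoint_formula)
```

Both quantities are negative for this choice of parameters, in line with the strict decrease of the potential energy for intervals not centered at the origin.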
Our goal of this subsection is to show that the radial symmetry result in Theorem 2.2 can be generalized to (2.52) for certain classes of potential V. We will work with one of the following two classes of V:
(V1) \(0<V'(r)\le C\) for some C for all \(r>0\).
(V2) \(V'(r)>0\) for all \(r>0\), and \(V'(r)\rightarrow +\infty \) as \(r\rightarrow +\infty \).
In the following theorem we prove radial symmetry of stationary solutions under assumption (V1) for all \(m>0\), and under assumption (V2) for \(m>1\). We expect that when \(m\in (0,1]\), it should be possible to refine some estimates in the proof and obtain symmetry for a wider class than (V1). We will not pursue this direction for simplicity of presentation, and we leave further generalizations to interested readers.
Theorem 2.23
Assume that W satisfies (K1)–(K4) and \(m>0\). Let \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) satisfy \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\) and \( \rho _sV \in L^1({\mathbb {R}}^d)\). Assume that \(\rho _s\) is a non-negative stationary state of (2.52) in the sense of Definition 2.1, with (2.1) replaced by \(\nabla \rho _s^{m} = -\rho _s\nabla (\psi _s + V)\). Then if V satisfies (V1), or if V satisfies (V2) in addition to \(m>1\), then \(\rho _s\) is radially decreasing about the origin.
Proof
Note that Lemma 2.3 still holds with a potential V, except that the right hand sides of (2.7) and (2.8) are now replaced by an x-dependent bound \(C + |\nabla V(x)|\), which is uniformly bounded in x under (V1). Under the assumptions (V2) and \(m>1\), we will prove in Lemma 2.24 that \(\rho _s\) must be compactly supported. Thus in both cases, the right hand sides of (2.7) and (2.8) are still uniformly bounded in x in \(\mathrm {supp}\,\rho _s\).
The rest of the proof follows a similar approach as Theorem 2.2 and Proposition 2.8, with \(\mathcal {E}\) including an extra potential energy \(\mathcal {V}[\rho ] := \int \rho V dx\). However, some crucial modifications in the proof of Proposition 2.8 are needed, which we highlight below.
First, note that with a potential V, we will prove radial symmetry about the origin, rather than up to a translation. For this reason, we take an arbitrary hyperplane H passing through the origin, and aim to prove that \(\rho _s\) is symmetric decreasing about H. (WLOG we let \(H = \{x_1=0\}\).) Since H does not split the mass of \(\rho _s\) into half-and-half, it is possible that for all \(x' \in {\mathbb {R}}^{d-1}\) and \(h>0\), every line segment in \(U_{x'}^h\) has its center lying on one side of H. Therefore, the estimate in Proposition 2.15 might fail for \(\rho _s\), and all we have is the crude estimate
Despite this weaker estimate in the interaction energy, we will show that all three estimates of Proposition 2.8 still hold, if we define \(\mu (\tau ,\cdot )\) in the same way as in its proof. Clearly, (2.17) and (2.18) remain true since \(\mu (\tau ,\cdot )\) is defined the same as before. We claim that (2.16) still holds, but for a different reason than before: the coefficient \(c_0>0\) used to come from the contribution of the interaction energy via Proposition 2.15, but now it comes from the potential energy. To see this, consider the following two cases.
Case 1: \(m\in (0,1]\). Combining (2.55), Lemma 2.22 with \(\mathcal {S}[\mu (\tau )] - \mathcal {S}[\rho _s] \equiv 0\) (where the difference is defined in the sense of (2.6)), we again have (2.16) for some \(c_0>0\) for all sufficiently small \(\tau >0\).
Case 2: \(m>1\). In this case, recall that \(\mu (\tau ,\cdot ) = \tilde{S}^\tau [\mu (0,\cdot )]\), where \(\mu (0,\cdot )=\rho _s\) and \(\tilde{S}^\tau \) is the continuous Steiner symmetrization which “slows down” at height \(h \in (0,h_0)\). From the proof of Lemma 2.22, we know that if \(B_\delta \) has a positive measure, then \(B_\delta \cap \{(x',h): h>h_0\}\) also has a positive measure for all sufficiently small \(h_0>0\), thus Lemma 2.22 still holds for \(\mu (\tau ) = {\tilde{S}}^\tau [\mu _0]\) if \(h_0\) is sufficiently small, leading to
In addition, for sufficiently small \(h_0\) we still have (2.40) (where we fix c to be the constant from the above equation), and combining it with (2.55) gives
and adding them together with (2.38) gives (2.16).
Once we obtain Proposition 2.8, the rest of the proof follows closely the proof of Theorem 2.2, except the following minor changes. With an extra potential energy in \(\mathcal {E}\), the right hand side of (2.47) has an additional term \(\int g(\tau ) V dx\). As a result, \(I_1\) has a different definition
which is still 0, since the equation for stationary solution now becomes
The \(m=1\) case is done with a similar modification, where \(J_1\) is now \(\int g(\tau ) \left( \log \rho _s + W*\rho _s + V\right) dx\), and again we have \(J_1=0\) since \(\rho _s\) is stationary. Finally, we obtain the same contradiction as in the proof of Theorem 2.2 if \(\rho _s\) is not symmetric decreasing about H. And since H is an arbitrary hyperplane through the origin, we have that \(\rho _s\) is radially decreasing about the origin. \(\square \)
Finally we state and prove the lemma used in the proof of Theorem 2.23, which shows all stationary solutions must be compactly supported if \(m>1\) and V satisfies (V2).
Lemma 2.24
Assume that \(m>1\), W satisfies (K1)–(K4), and V satisfies (V2). Let \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) satisfy \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\). Assume that \(\rho _s\) is a non-negative stationary state of (2.52) in the sense of Definition 2.1, with (2.1) replaced by \(\nabla \rho _s^{m} = -\rho _s\nabla (\psi _s + V)\). Then \(\rho _s\) is compactly supported.
Proof
With a potential term, we have that
where \(C_i\) takes different values in different connected components of \(\mathrm {supp}\,\rho _s\). By a similar computation as (2.4) (with \(W\) replaced by \(\min \{W,0\}\)), we have \(\rho _s*W\ge -C(\Vert \rho _s\Vert _1, \Vert \rho _s\Vert _\infty , W)\). Thus the first two terms of (2.56) are uniformly bounded below. As a result, every connected component D of \(\mathrm {supp}\,\rho _s\) must be bounded: if not, the left hand side would be unbounded in D due to \(\lim _{|x|\rightarrow \infty } V(|x|)=\infty \), contradicting (2.56).
Note that every connected component being bounded does not imply that \(\mathrm {supp}\,\rho _s\) is bounded: there may be a countable number of connected components going to infinity. We claim that there is some \(R(\Vert \rho _s\Vert _1, \Vert \rho _s\Vert _\infty , W, V)>0\), such that every connected component D must satisfy that \(D \cap B(0,R) \not = \emptyset \). As we will see later, this will help us control the outermost point of D.
If \(0\in D\), then clearly \(D \cap B(0,R) \not = \emptyset \). If \(0\not \in D\), we find some unit vector \(\nu \in {\mathbb {R}}^d\), such that the ray starting at the origin with direction \(\nu \) has a non-empty intersection with D. Let \( t_0 = \inf \{t>0: t\nu \in D\}, \) and let \(x_0 = t_0 \nu \). We take a sequence of points \((t_n)_{n=1}^\infty \) such that \(t_n\searrow t_0\) and \(t_n \nu \in D\), and denote \(x_n = t_n \nu \). Since \(x_n \in D\) and \(x_0 \in \partial D\), the left hand side of (2.56) takes the same constant value \(C_i\) at \(x_0\) and all \(x_n\). As a result, for all \(n\ge 1\) we have
Note that the first term is non-negative since \(\rho _s(x_0)=0\) (which follows from \(x_0\in \partial D\) and \(\rho _s\in \mathcal {C}({\mathbb {R}}^d)\)). The second term converges to \(\nabla (\rho _s*W)\cdot \nu \), whose absolute value is bounded by \(C(\Vert \rho _s\Vert _1, \Vert \rho _s\Vert _\infty , W)\) by (2.2). The third term converges to \(\nabla V(x_0) \cdot \nu = V'(t_0)\). Putting the three estimates together gives that
thus assumption (V2) gives that \(t_0 \le R(\Vert \rho _s\Vert _1, \Vert \rho _s\Vert _\infty , W, V)\), finishing the proof of the claim.
Finally, we will show that \(D \cap B(0,R) \not = \emptyset \) implies the outermost point of D cannot get too far. Take any \(x_1 \in D\cap B(0,R)\), and let \(x_2\) be the outermost point of D. Taking the difference of (2.56) at \(x_2\) and \(x_1\) gives
Due to (2.4), we bound the right hand side by \(C(\Vert \rho _s\Vert _1,\Vert \rho _s\Vert _\infty , \Vert \omega (1+|x|)\rho _s\Vert _1, W) + \omega (1+|x_2|) \Vert \rho _s\Vert _1\). Note that the left hand side grows superlinearly in \(|x_2|\) due to (V2), whereas \(\omega (1+|x_2|)\) grows at most linearly in \(|x_2|\) by assumption (K3) on \(W\). This leads to
which completes the proof. \(\square \)
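The difference-quotient step in the proof of the claim can be sketched as follows. The displays are a reconstruction from the surrounding text and assume that the stationary relation reads \(\frac{m}{m-1}\rho _s^{m-1} + W*\rho _s + V = C_i\) in each connected component. Subtracting its values at \(x_n\) and \(x_0\) and dividing by \(t_n-t_0>0\) gives
$$\begin{aligned} 0 = \frac{m}{m-1}\,\frac{\rho _s^{m-1}(x_n)-\rho _s^{m-1}(x_0)}{t_n-t_0} + \frac{(W*\rho _s)(x_n)-(W*\rho _s)(x_0)}{t_n-t_0} + \frac{V(x_n)-V(x_0)}{t_n-t_0}. \end{aligned}$$
The first term is non-negative because \(\rho _s(x_0)=0\), so letting \(n\rightarrow \infty \) yields
$$\begin{aligned} V'(t_0) \le -\nabla (\rho _s*W)(x_0)\cdot \nu \le C(\Vert \rho _s\Vert _1,\Vert \rho _s\Vert _\infty ,W), \end{aligned}$$
and (V2) then forces \(t_0\) to be bounded.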
3 Existence of global minimizers
In Sect. 2, we showed that if \(\rho _s \in L^1_+({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) is a stationary state of (1.1) in the sense of Definition 2.1 and it satisfies \(\omega (1+|x|)\rho _s \in L^1({\mathbb {R}}^d)\), then it must be radially decreasing up to a translation. This section is concerned with the existence of such stationary solutions. Namely, under (K1)–(K4) and one of the extra assumptions (K5) or (K6) below, we will show that for any given mass, there indeed exists a stationary solution satisfying the above conditions. We will generalize the arguments of [28] to show that there exists a radially decreasing global minimizer \(\rho \) of the functional (2.5) given by
over the class of admissible densities
and with the potential satisfying at least (K1)–(K4). Note that the condition on the zero center of mass has to be understood in the improper integral sense, i.e.
since we do not assume that the first moment is bounded in the class \(\mathcal {Y}_{M}\). We emphasize that from now on we will work in the diffusion-dominated regime with degenerate diffusion, namely when
In order to avoid loss of mass at infinity, we need to assume some growth condition at infinity. In this section, we will obtain the existence of global minimizers under two different conditions related to the works [5, 28, 67], and show that such global minimizers are indeed \(L^1\) and \(L^\infty \) stationary solutions. Namely, we assume further that the potential W satisfies at infinity either the property
(K5) \(\displaystyle \lim _{r\rightarrow +\infty }\omega _{+}(r)=+\infty \),
or
(K6) \(\displaystyle \lim _{r\rightarrow +\infty }\omega _{+}(r)=\ell \in [0,+\infty )\) where the non-negative potential \(\mathcal {K}:=\ell -W\) is such that, in the case \(m>2\), \(\mathcal {K}\in L^{\hat{p}}({\mathbb {R}}^{d}\setminus B_{1}(0))\), for some \(1\le \hat{p}<\infty \), while for the case \(2-(2/d)<m\le 2\) we will require that \(\mathcal {K}\in L^{p,\infty }({\mathbb {R}}^{d}\setminus B_{1}(0))\), for some \(1\le p<\infty \). Moreover, there exists an \(\alpha \in (0,d)\) for which \(m>1+\alpha /d\) and
$$\begin{aligned} \mathcal {K} (\tau x)\ge \tau ^{-\alpha } \mathcal {K}(x),\quad \forall \tau \ge 1,\, \text{ for } \text{ a.e. } x\in {\mathbb {R}}^{d}. \end{aligned}$$(3.2)
Here, we denote by \(L^{p,\infty }({\mathbb {R}}^{d})\) the weak-\(L^p\) or Marcinkiewicz space of index \(1\le p<\infty \). In particular, the attractive Newtonian potential (which is the fundamental solution of the \(-\Delta \) operator in \({\mathbb {R}}^d\)) is covered by these assumptions: for \(d=1,2\) it satisfies (K5), whereas for \(d\ge 3\) it satisfies (K6) with \(\alpha = d-2\).
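As a quick consistency check of the last claim, here is a sketch under the assumption (a normalization not fixed in the text above) that for \(d\ge 3\) the attractive Newtonian potential is written as \(W(x)=\ell -c_d|x|^{2-d}\) with constants \(\ell ,c_d>0\), so that \(\mathcal {K}=\ell -W=c_d|x|^{2-d}\):
$$\begin{aligned} \mathcal {K}(\tau x)=c_d\,\tau ^{2-d}|x|^{2-d}=\tau ^{-(d-2)}\mathcal {K}(x)\quad \forall \tau \ge 1,\qquad \left. 1+\frac{\alpha }{d}\right| _{\alpha =d-2}=2-\frac{2}{d}. \end{aligned}$$
Hence (3.2) holds with equality for \(\alpha =d-2\in (0,d)\), and the requirement \(m>1+\alpha /d\) reduces to \(m>2-\tfrac{2}{d}\). Under the same assumption, \(|x|^{2-d}\) belongs to \(L^{p,\infty }({\mathbb {R}}^{d}\setminus B_{1}(0))\) for \(p=d/(d-2)\) and to \(L^{\hat{p}}({\mathbb {R}}^{d}\setminus B_{1}(0))\) for every \(\hat{p}>d/(d-2)\), since \(\hat{p}(d-2)>d\) makes the integral converge at infinity.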
Notice that the subadditivity-type condition (K4) allows us to claim that \({\mathcal E}[\rho ]\) is finite over the class \(\mathcal {Y}_{M}\): indeed, if we split W into its positive part \(W_{+}\) and negative part \(W_{-}\), as done in the bound of \(\psi _s\) in Sect. 2, the integral with kernel \(W_{-}\) is finite by the HLS inequality, see (3.3) below, while by (K4) we infer
3.1 Minimization of the Free Energy functional
The existence of minimizers of the functional \({\mathcal E}\) can be proven with different arguments according to the choice between condition (K5) or (K6): indeed, (K5) produces a quantitative version of the mass confinement effect while (K6) does it in a nonconstructive way. Because of this difference, we first briefly discuss the case when condition (K6) is employed, as it can be proven by a simple application of Lions’ concentration-compactness principle [67] and its variant in [5].
Theorem 3.1
Assume that conditions (3.1), (K1)–(K4) and (K6) hold. Then for any positive mass M, there exists a global minimizer \(\rho _{0}\), which is radially symmetric and decreasing, of the free energy functional \({\mathcal E}\) in \(\mathcal {Y}_{M}\). Moreover, all global minimizers are radially symmetric and decreasing.
Proof
We write \({\mathcal E}[\rho ]={\widetilde{{\mathcal E}}}[\rho ]+\frac{\ell }{2}M^{2}\), where
the kernel \(\mathcal {K}\) being non-negative and radially decreasing; furthermore condition (K3) implies \(\mathcal {K}\in L^{p,\infty }(B_{1}(0))\), where \(p=d/(d-2)\). Then we are in a position to apply [5, Theorem 1] for \(m>2\) and [67, Corollary II.1] for \(2-(2/d)<m\le 2\) to get the existence of a radially decreasing minimizer \(\rho _{0}\in \mathcal {Y}_{M}\) of \(\widetilde{{\mathcal E}}\) (and hence of \({{\mathcal E}}\)). Moreover, since \(\mathcal {K}\) is strictly radially decreasing, all global minimizers are radially decreasing. \(\square \)
Under condition (K5) the concentration-compactness principle is not applicable, but a direct control of the mass confinement phenomenon is possible. We first prove the following lemma, which provides a reversed Riesz inequality and allows us to reduce the minimization of \({\mathcal E}\) to the set of all radially decreasing densities in \(\mathcal {Y}_{M}\).
Lemma 3.2
Assume that conditions (K1)–(K5) hold and take a density \(\rho \) such that
Then the following inequality holds:
and the equality occurs if and only if \(\rho \) is a translate of \(\rho ^{\#}\).
Proof
The proof proceeds exactly as in [27, Lemma 2], up to replacing the function k(r) defined there by the function
where \(r_{0}>0\) is fixed. \(\square \)
Theorem 3.3
Assume that (3.1) and (K1)–(K5) hold. Then the conclusions of Theorem 3.1 remain true.
Proof
We follow the main lines of [28, Theorem 2.1]. By Lemma 3.2 we can restrict ourselves to consider only radially decreasing densities \(\rho \). In order to show that \(\mathcal {I}[\rho ]\) is bounded from below, we first argue in the case \(d\ge 3\). Thanks to conditions (K1)–(K2) we have
Now we observe that by (3.1) we have
and \(\frac{d-2}{d}+\frac{d+2}{2d}+\frac{d+2}{2d}=2\), then by the classical HLS and \(L^{p}\) interpolation inequalities, we find
where \(\alpha =\frac{1}{m-1}\left( m\frac{d+2}{2d}-1\right) \). Then by (3.3) we find that
where we notice that \(m>2(1-\alpha )\) if and only if \(m>2-\frac{2}{d}\), that is (3.1). Then by (3.4) we can find a constant \(C_{1}>0\) and a sufficiently large constant \(C_{2}\) such that
Concerning the case \(d=2\), we observe that conditions (K1)–(K2) yield
and we can use the classical log-HLS inequality and the arguments of [28] to conclude.
Concerning the mass confinement, (K5) and the same arguments as in [28], see also Lemma 4.17, allow us to show
Finally, we should check that the interaction potential W is lower semicontinuous as shown in [28, p. 8]. Indeed, the only technical point to verify in this more general setting relates to the control of the truncated interaction potential \(\mathsf {A}^{\varepsilon }\) for \(d\ge 3\). Notice that, due to (2.3), we can estimate
Now recall that the Newtonian potential
is well defined for a.e. \(x\in {\mathbb {R}}^{d}\) and is in \(L^{1}_{loc}({\mathbb {R}}^{d})\), see [47, Theorem 2.21], then for a.e. \(x\in {\mathbb {R}}^{d}\) we have \(\chi _{B_{\varepsilon }(0)}\,\mathcal {N}*\rho \rightarrow 0\) as \(\varepsilon \rightarrow 0\). Moreover, by the HLS inequality we have
with
Then Lebesgue’s dominated convergence theorem allows us to conclude that \(\mathsf {A}^{\varepsilon }[\rho ]\rightarrow 0\) as \(\varepsilon \rightarrow 0\). This convergence is uniform along a minimizing sequence \(\rho _{n}\).
Now, all ingredients are there to argue as in [28] showing that \({\mathcal E}\) achieves its infimum in the class of all radially decreasing densities in \(\mathcal {Y}_{M}\). \(\square \)
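The exponent bookkeeping in the proof above can be checked by hand. The following sketch assumes that the interpolation step is the standard one between \(L^{1}\) and \(L^{m}\) at the exponent \(\tfrac{2d}{d+2}\), which is consistent with the stated value of \(\alpha \):
$$\begin{aligned} \frac{d+2}{2d}=\alpha +\frac{1-\alpha }{m}\ \Longleftrightarrow \ \alpha =\frac{1}{m-1}\left( m\frac{d+2}{2d}-1\right) ,\qquad 2(1-\alpha )=\frac{m(d-2)}{d(m-1)}. \end{aligned}$$
Therefore, for \(m>1\) and \(d\ge 3\), the inequality \(m>2(1-\alpha )\) is equivalent to \(d(m-1)>d-2\), that is, to \(m>2-\tfrac{2}{d}\). This is the condition under which the \(L^{m}\) term of the energy dominates the interaction term bounded via HLS.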
Remark 3.4
According to Theorem 2.2, the radial symmetry of the global minimizers of \({\mathcal E}\), which are particular critical points of \({\mathcal E}\), is not a surprise. Nevertheless, as pointed out in the proofs of Theorems 3.1 and 3.3, this property can be much more easily achieved by rearrangement inequalities.
A useful result, which will be used in the next arguments, concerns the behavior at infinity of the so-called W-potential, namely the function
Following the blueprint of [37, Lemma 1.1], we have the following result.
Lemma 3.5
Assume that (K1)–(K5) hold, and let
Then
Proof
As in Chae-Tarantello [37], we first set
so that our aim will be to show that \(\sigma (x)\rightarrow 0\) as \(|x|\rightarrow \infty \). Assume that \(|x|>2\). We then write
where \(\sigma _{i}\), \(i=1,2,3\), are defined by breaking the integral on the right hand side of (3.5) into:
respectively, where \(R>2\) is a fixed constant. Recall that (K2) implies \(|\omega (r)|\le C \phi (r)\) for \(r\le 1\), with \(\phi \) given in (2.3). Thus, we have
where we used \(f\in L^{\infty }({\mathbb {R}}^{d}\setminus B_1(0))\) and \(|x|>2\) in the last inequality. This means that \(\sigma _{1}(x)\rightarrow 0\) as \(|x|\rightarrow \infty \). Moreover, we notice that
Since by property (K3) we can estimate in the region \(D_{2}\)
such that
which implies that also \(\sigma _{2}(x)\rightarrow 0\) as \(|x|\rightarrow +\infty \). As for \(\sigma _3\), for x such that \(|x|>R\), using (K4)–(K5) we write
as \(|x|\rightarrow +\infty \), for any fixed \(R>1\). Hence letting \(R\rightarrow +\infty \) we get \(\sigma _{3}(x)\rightarrow 0\). \(\square \)
In case of assumption (K6), we prove the following Lemma.
Lemma 3.6
Assume (3.1), (K1)–(K4) and (K6) hold, and let \(\mathcal {K} := \ell -W\) be as defined in (K6). Then the following holds for any radially decreasing \( f\in L^1_+({\mathbb {R}}^{d})\):
and
where \(c:= 2^{-\alpha } \int _{B_1(0)} f(y)dy>0\), with \(\alpha >0\) as given in (K6).
Proof
Since both f and \(\mathcal {K}\) are radially symmetric, we define \({\bar{f}}\), \(\bar{\mathcal {K}}: [0,+\infty )\rightarrow {\mathbb {R}}\) such that \({\bar{f}}(|x|) = f(x)\), \(\bar{\mathcal {K}}(|x|)=\mathcal {K}(x)\). Note that \(\lim _{r\rightarrow \infty } {\bar{f}}(r) = \lim _{r\rightarrow \infty } \bar{\mathcal {K}}(r) = 0\) due to (K1), (K6) and the assumption on f. To prove (3.6), we break \(\int _{{\mathbb {R}}^{d}}\mathcal {K}(x-y)f(y)dy\) into the following three parts with \(|x|>1\) and control them respectively by:
and
Since all the three parts tend to 0 as \(|x|\rightarrow \infty \), we obtain (3.6). To show (3.7), we use \(\mathcal {K}, f \ge 0\) to estimate
where we apply (K6) to obtain the third inequality, and in the last inequality we define \(c:= 2^{-\alpha } \int _{B_1(0)} f(y)dy>0\). \(\square \)
Using similar arguments as in [28], we are able to derive the following result, which indeed gives a natural form of the Euler–Lagrange equation associated to the functional \({\mathcal E}\):
Theorem 3.7
Assume that (3.1), (K1)–(K4) and either (K5) or (K6) hold. Let \(\rho _0\in \mathcal {Y}_{M}\) be a global minimizer of the free energy functional \({\mathcal E}\). Then for some positive constant \(\mathsf {D}[\rho _{0}]\), we have that \(\rho _0\) satisfies
and
where
As a consequence, any global minimizer of \({\mathcal E}\) verifies
We now turn to show compactness of support and boundedness of the minimizers.
Lemma 3.8
Assume that (3.1), (K1)–(K4) and either (K5) or (K6) hold and let \(\rho _0\in \mathcal {Y}_{M}\) be a global minimizer of the free energy functional \({\mathcal E}\). Then \(\rho _0\) is compactly supported.
Proof
By Theorems 3.1 and 3.3, \(\rho _0\) is radially decreasing under either set of assumptions. In addition, under the assumption (K5), Lemma 3.5 gives that
hence combining this with (K5) gives us \((W*\rho _0)(x) \rightarrow +\infty \) as \(|x|\rightarrow \infty \). It implies that the right hand side of (3.9) must have compact support, hence \(\rho _0\) must have compact support too.
Under the assumption (K6), towards a contradiction, suppose \(\rho _0\) does not have compact support. Then \(\rho _0\) must be strictly positive in \({\mathbb {R}}^d\) since it is radially decreasing. We can then write (3.8) as
for some \(C\in {\mathbb {R}}\), where \(\mathcal {K} := \ell -W\) is as given in (K6). Indeed, C must be equal to 0, since both \(\rho _0(x)\) and \((\mathcal {K}*\rho _0)(x)\) tend to 0 as \(|x|\rightarrow \infty \), where we used (3.6) on the latter convergence. Thus
where we applied (3.7) to obtain the last inequality, with \(c:= 2^{-\alpha } \int _{B_1(0)} \rho _0(y)dy>0\). Due to the assumptions (3.2) and \(\alpha < d(m-1)\) in (K6), we have \(\int _{|x|>1} \mathcal {K}(x)^{1/(m-1)}dx = +\infty \). Combining this with (3.10) leads to \(\rho _0 \not \in L^1({\mathbb {R}}^d)\), a contradiction. \(\square \)
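The divergence of the integral used at the end of the proof follows from the scaling condition alone. As a sketch, assume \(\bar{\mathcal {K}}(1)>0\) for the radial profile \(\bar{\mathcal {K}}\) of \(\mathcal {K}\) (natural here, since \(\mathcal {K}\) is non-negative and strictly radially decreasing) and that (3.2) can be applied on the unit sphere. Taking \(\tau =|x|\ge 1\) in (3.2) gives \(\mathcal {K}(x)\ge |x|^{-\alpha }\bar{\mathcal {K}}(1)\), and in polar coordinates
$$\begin{aligned} \int _{|x|>1}\mathcal {K}(x)^{\frac{1}{m-1}}dx\ \ge \ \bar{\mathcal {K}}(1)^{\frac{1}{m-1}}\,|{\mathbb {S}}^{d-1}|\int _{1}^{\infty }r^{\,d-1-\frac{\alpha }{m-1}}\,dr=+\infty , \end{aligned}$$
because \(\alpha <d(m-1)\) is the same as \(d-1-\tfrac{\alpha }{m-1}>-1\).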
Lemma 3.9
Assume that (3.1), (K1)–(K4) and either (K5) or (K6) hold and let \(\rho _0\in \mathcal {Y}_{M}\) be a global minimizer of the free energy functional \({\mathcal E}\). Then \(\rho _0 \in L^\infty ({\mathbb {R}}^d)\).
Proof
By Theorems 3.1, 3.3 and Lemma 3.8, \(\rho _0\) is radially decreasing and has compact support say inside the ball \(B_R(0)\). Let us first concentrate on the proof under assumption (K5). For notational simplicity in this proof, we will denote by \(\Vert \rho _0\Vert _m\) the \(L^m({\mathbb {R}}^d)\)-norm of \(\rho _0\).
We will show that \(\rho _0 \in L^\infty ({\mathbb {R}}^d)\) by different arguments in several cases:
Case A: \(d\le 2\). Since \(\rho _0\) is supported in \(B_R(0)\), we can find constants \(C_w^1\) and \(C_w^2\) such that \(W \ge -C_w^1 \mathcal {N} - C_w^2\) in \(B_{2R}(0)\). Hence for any \(r<R\), we have
thus recalling (2.4)
Then by Eq. (3.9) it will be enough to show that the Newtonian potential \(\rho _0*\mathcal {N}\) is bounded in \(B_R(0)\) for \(d=1,2\). In \(d=1\), this is trivial. In \(d=2\), it follows from [50, Lemma 9.9] that \(\rho _0*\mathcal {N}\in \mathcal {W}^{2,m}(B_R(0))\), and then Morrey’s Theorem (see for instance [17, Corollary 9.15]) yields \(\rho _0*\mathcal {N}\in L^{\infty }(B_R(0))\).
Case B: \(d\ge 3\) and \(m> d/2\). In this case we get \(W^{-}\le C_{w}\,\mathcal {N}\) in the whole \({\mathbb {R}}^{d}\) for some constant \(C_w\), so we have for \(r>0\)
Then, using Sobolev’s embedding theorem again (see [17, Corollary 9.15]), we find for \(m>d/2\) that \((\rho _0*W^{-})(r)\in L^{\infty }_{loc}({\mathbb {R}}^{d})\), hence \(\rho _0 \in L^\infty ({\mathbb {R}}^d)\) by (3.9) again.
Case C: \(d\ge 3\) and \(2-\tfrac{2}{d}<m\le d/2\). We aim to prove that \(\rho _{0}(0)\) is finite, which is sufficient for the boundedness of \(\rho _0\) since \(\rho _0\) is radially decreasing. This is done by an inductive argument. To begin with, observe that since \(\rho _0\) is radially decreasing we have that \(\rho _0(r)^m |B(0,r)| \le \Vert \rho _0\Vert _m^m < \infty ,\) which leads to the basis step of our induction
We set our first exponent \({\tilde{p}} =-d/m\). For the induction step, we claim that if \(\rho _0(r) \le C_1( 1+r^{p})\) with \(-d<p<0\), then it leads to the refined estimate
where \(C_2\) depends on \(d,m, \rho _0, W\) and \(C_1\).
Indeed, taking into account (K2) and (K5), the compact support of \(\rho _0\) together with the fact that \(\mathcal {N}> 0\) for \(d\ge 3\), we deduce that \(W \ge - C_{w,d} \,\mathcal {N}\) for some constant depending on W and d. As a result, we have, for \(r\in (0,1)\),
We can easily bound \((\rho _0 * \mathcal {N})(1)\) by some \(C(d,\Vert \rho _0\Vert _m)\). To control \( \int _r^1 \partial _r (\rho _0*\mathcal {N})(s) ds\), recall that
where M(s) is the mass of \(\rho _0\) in B(0, s). By our induction assumption, we have
Combining this with (3.13), we have
so we get, for \(p\ne -2\),
Plugging it into the right hand side of (3.12) yields
and using this inequality in the Euler–Lagrange Eq. (3.9) leads to (3.11). Moreover, in the case \(p=-2\), we have instead the inequality
Now we are ready to apply the induction starting at \({\tilde{p}}=-d/m\) to show \(\rho _0(0)<\infty \). We will show that after a finite number of iterations our induction arrives to
for some \(a>0\), which then implies that \(\rho _0(0) < \infty \). Let \(g(p) := \frac{p+2}{m-1}\), which is a linear function of p with positive slope, and let us denote \(g^{(n)}(p)=: \underbrace{(g\circ g\dots \circ g)}_{n \text { iterations}}(p)\).
Subcase C.1: \(m=d/2\). In this case, we have \(\widetilde{p}=-2\) and by (3.11) we obtain
hence applying the first inequality in (3.11) for \(p=-1\) gives us (3.14) with \(a=1/(m-1)\).
Then it remains to consider the case \(m<d/2\). Notice that \(-d<\widetilde{p}<-2\). By (3.11) we get, for all \(r\in (0,1)\),
Then we must consider three cases. We point out that in all the cases we need to discuss the possibility of \(g^{(n)}(p) = -2\) for some n: if this happens, the logarithmic case occurs again and the result follows in a final iteration step as in Subcase C.1.
Subcase C.2: \(m=2\) and \(m<d/2\). In this case, we have \(g(p) = p+2\), hence \( g^{(n)}(p)=p+2n, \) then
Therefore we have \(g^{(n)}(\widetilde{p}) > 0\) for some finite n, whence iterating (3.15) n times we find \(\rho _0(0)<\infty \).
Subcase C.3: \(m>2\) and \(m<d/2\). In this case, \(p=2/(m-2)\) is the only fixed point for the linear function \(g(p)\). For all \(p<\tfrac{2}{m-2}\) we have \(g(p)>p\) which implies \(g^{(n)}(p)>p\) for all \(n\in {\mathbb {N}}\). Notice that
so the point \(p=2/(m-2)\) is attracting in the sense that
Since \(\frac{2}{m-2}>0\), it again implies that \(g^{(n)}(p) > 0\) for some finite n. Then choosing \(p=\widetilde{p}\), we have \(g^{(n)}(\widetilde{p}) > 0\) for some n, then (3.15) implies \(\rho _0(0)<\infty \) again.
Subcase C.4: \(m<\min (2,d/2)\). In this case, the only fixed point \(\frac{2}{m-2}\) is unstable, and we have \(g(p)>p\) for any \(p>\frac{2}{m-2}\), then by (3.16)
Notice that \(\widetilde{p} >\frac{2}{m-2}\), since this condition reads \(m>2d/(d+2)\), a direct consequence of (3.1). Hence we again obtain \(g^{(n)}(\widetilde{p}) > 0\) for some finite n, which finishes the last case.
Let us finally turn back to the proof if we assume (K6) instead of (K5). Notice first that the proof of the Case C can also be done as soon as the potential W satisfies the bound \(W \ge - C_{w,d} (1+ \,\mathcal {N})\) for some \(C_{w,d}>0\). This is trivially true regardless of the dimension if the potential satisfies (K6) instead of (K5). \(\square \)
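The iteration in Case C can be illustrated numerically. The following is a minimal sketch and not part of the original argument: the function name `bootstrap_exponents` and its interface are hypothetical. It only tracks the exponents \(p\) in the bounds \(\rho _0(r)\le C(1+r^{p})\), starting from \(\tilde{p}=-d/m\), applying \(g(p)=(p+2)/(m-1)\), and replacing \(p=-2\) by \(-1\) as in Subcase C.1. Exact rational arithmetic is used so that the logarithmic case \(p=-2\) is detected reliably.

```python
from fractions import Fraction


def bootstrap_exponents(d, m, max_steps=1000):
    """Return the exponents p_0 = -d/m, p_1, ... visited by the induction of Case C.

    Illustrative helper (hypothetical name): d and m should be ints, Fractions
    or decimal strings such as "1.9", so that the arithmetic stays exact.
    """
    d, m = Fraction(d), Fraction(m)
    if d < 3 or not (2 - 2 / d < m <= d / 2):
        raise ValueError("Case C requires d >= 3 and 2 - 2/d < m <= d/2")
    p = -d / m  # first exponent, from the basis step of the induction
    exponents = [p]
    while p < 0:
        if len(exponents) > max_steps:
            raise RuntimeError("no non-negative exponent reached")
        if p == -2:
            # logarithmic case: |log r|^(1/(m-1)) is bounded by C / r near 0,
            # so the next admissible exponent is -1 (as in Subcase C.1)
            p = Fraction(-1)
        else:
            p = (p + 2) / (m - 1)  # refined exponent g(p)
        exponents.append(p)
    return exponents
```

For instance, `bootstrap_exponents(8, 2)` passes through the logarithmic case once, while `bootstrap_exponents(10, "1.9")` falls under Subcase C.4. In both cases the returned list ends at the first non-negative exponent, which corresponds to \(\rho _0\) being bounded near the origin.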
Finally, it is interesting to derive some regularity properties of a minimizer \(\rho _{0}\), as in [28]. Since W may not be the classical Newtonian kernel, we are led to prove suitable regularity for the W-potential \(\psi _{\rho _{0}}(x)\), which can be transferred to \(\rho _{0}\) via equation (3.8) in the support of \(\rho _{0}\). Note that (3.9) ensures that \(\rho _{0}\) satisfies equation (2.1) in the sense of distributions: indeed, as shown in (2.2)–(2.4), we find that \(\psi _{\rho _{0}}\in {\mathcal W}^{1,\infty }_{loc}({\mathbb {R}}^{d})\), thus we can take gradients on both sides of the Euler–Lagrange condition (3.9), multiply by \(\rho _{0}\) and, writing \(\rho _{0} \nabla \rho _{0}^{m-1}=\tfrac{m-1}{m}\nabla \rho _{0}^{m}\), reach (2.1). Now, using the regularity arguments of the proof of Lemma 2.3 again, together with the compact support property, we finally have \(\rho _{0}\in \mathcal {C}^{0,\alpha }({\mathbb {R}}^{d})\) with \(\alpha =1/(m-1)\).
We can summarize all the results in this section in the following theorem.
Theorem 3.10
In the diffusion dominated regime (3.1), assume that conditions (K1)–(K4) and either (K5) or (K6) hold. Then for any positive mass M, there exists a global minimizer \(\rho _{0}\) of the free energy functional \({\mathcal E}\) (2.5) defined in \(\mathcal {Y}_{M}\), which is radially symmetric, decreasing, compactly supported, Hölder continuous, and a stationary solution of (1.1) in the sense of Definition 2.1.
Putting together the previous theorem with the uniqueness of radial stationary solutions for the attractive Newtonian potential proved in [28, 61], we obtain the following result.
Corollary 3.11
In the particular case of the attractive Newtonian potential \(W(x)=-\mathcal {N}(x)\) modulo the addition of a constant factor, the global minimizer obtained in Theorem 3.10 is unique among all stationary solutions in the sense of Definition 2.1.
3.2 Some remarks about the minimization of energies with a potential term
The aim of this subsection is to generalize the results of Sect. 3.1 to free energy functionals involving a potential energy, namely
defined over the same admissible set \(\mathcal {Y}_{M}\), for some \(\mathcal {C}^{1}\) non-negative radially increasing potential \(V=V(r)\), where \(r=|x|\), such that
In this framework, the functional \({\mathcal E}\) might be infinite on some densities \(\rho \). The presence of the confinement potential V then allows us to prove the following generalization of Theorems 3.1 and 3.3, where no asymptotic behavior at infinity is needed for the radial profile \(\omega (r)\) of the kernel W:
Theorem 3.12
Assume that (3.1) and (K1)–(K4) hold. Then the conclusions of Theorems 3.1 and 3.3 remain true.
Proof
We first observe that by Remark 2.7 and Lemma 3.2 we can restrict to radially decreasing densities. Moreover, following the lines of the proof of Theorem 3.3 we find that \({\mathcal E}\) is bounded from below and
This inequality easily implies the mass confinement of any minimizing sequence \(\left\{ \rho _{n}\right\} \), that is for some constant \(C>0\)
for some large \(R>0\). In particular, we have that the sequence \(\left\{ \rho _{n}\right\} \) is tight, and by Prokhorov’s Theorem (see [3, Theorem 5.1.3]) we obtain that (up to a subsequence) \(\left\{ \rho _{n}\right\} \) converges to a certain density \(\rho \in L^{1}_{+}({\mathbb {R}}^{d})\cap L^{m}({\mathbb {R}}^{d})\), \(\Vert \rho \Vert _{L^{1}({\mathbb {R}}^{d})}=M\), with respect to the narrow topology. Then [3, Lemma 5.1.7] ensures the lower semicontinuity of the potential energies of \(\left\{ \rho _{n}\right\} \), that is
This implies that the infimum of \({\mathcal E}\) is achieved over a radially decreasing density \(\rho \in \mathcal {Y}_{M}\). In order to check that all the global minimizers are radially decreasing, we pick any minimizer \(\rho \in \mathcal {Y}_{M}\) and use Remark 2.7 and Lemma 3.2 in order to see that
thus
then the equality case in Lemma 3.2 yields the conclusion. \(\square \)
We have the following generalization of Theorem 3.7:
Theorem 3.13
Assume that (3.1), (K1)–(K4) hold. Let \(\rho _0\in \mathcal {Y}_{M}\) be a global minimizer of the free energy functional \({\mathcal E}\). Then for some positive constant \(\mathsf {D}[\rho _{0}]\), we have that \(\rho _0\) satisfies
and
As a consequence, any global minimizer of \({\mathcal E}\) verifies
The compactly supported property of the minimizers then follows from (3.17) and Lemmas 3.5–3.6. Moreover, it is straightforward to check that Lemma 3.9 continues to hold, as well as Theorem 3.10.
4 Long-time asymptotics
We now consider the particular case of (1.1) given by the Keller–Segel model in two dimensions with nonlinear diffusion as
where \(m>1\) and the logarithmic interaction kernel is defined as
This system is also referred to as the parabolic-elliptic Keller–Segel system with nonlinear diffusion, since the attracting potential \(c={\mathcal N}*\rho \) solves the Poisson equation \(-\Delta c=\rho \). It corresponds exactly to the range of diffusion dominated cases as discussed in [23] since solutions do not show blow-up and are globally bounded. We will show based on the uniqueness part in Sect. 2 that not only the solutions to (4.1) exist globally and are uniformly bounded in time in \(L^\infty \), but also the solutions achieve stabilization in time towards the unique stationary state for any given initial mass.
The main tool for analyzing stationary states and the existence of solutions to the evolutionary problem is again the following free energy functional
A simple differentiation formally shows that \({\mathcal E}\) is decaying in time along the evolution corresponding to (4.1), namely
which gives rise to the following (free) energy–energy dissipation inequality for weak solutions
for non-negative initial data \(\rho _0(x) \in L^1((1+{\log }(1+|x|^2))dx)\cap L^m({\mathbb {R}}^2)\). The entropy dissipation is given by
where here and in the following we use the notation
We shall note that h corresponds to \(\frac{\delta {\mathcal E}}{\delta \rho }\) and that in particular the evolutionary Eq. (4.1) can be stated as \(\partial _t \rho = \nabla \cdot (\rho \nabla h[\rho ])\). Thus, this equation bears the structure of being a gradient flow of the free energy functional in the sense of probability measures, see [2, 9, 11, 33] and the references therein.
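The formal computation behind these statements can be sketched as follows. It assumes that \(h[\rho ]=\frac{m}{m-1}\rho ^{m-1}-{\mathcal N}*\rho \) (the form suggested by the test function used later in the proof of Proposition 4.3), that (4.1) reads \(\partial _t\rho =\Delta \rho ^{m}-\nabla \cdot (\rho \nabla {\mathcal N}*\rho )\), and that \(\rho \) is smooth and decays fast enough for the integrations by parts to be justified:
$$\begin{aligned} \rho \nabla h[\rho ]&=\nabla \rho ^{m}-\rho \nabla ({\mathcal N}*\rho ),\\ \frac{d}{dt}{\mathcal E}[\rho ]&=\int _{{\mathbb {R}}^2}h[\rho ]\,\partial _t\rho \,dx=\int _{{\mathbb {R}}^2}h[\rho ]\,\nabla \cdot (\rho \nabla h[\rho ])\,dx=-\int _{{\mathbb {R}}^2}\rho \,|\nabla h[\rho ]|^{2}\,dx\le 0. \end{aligned}$$
The first line uses \(\rho \nabla \rho ^{m-1}=\tfrac{m-1}{m}\nabla \rho ^{m}\), and the second uses the symmetry of \({\mathcal N}\) when differentiating the interaction term. For weak solutions only the integrated inequality form is available, as stated above.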
We first prove the global well-posedness of weak solutions satisfying the energy inequality (4.3) in the next subsection as well as global uniform in time estimates for the solutions. In the second subsection, we use the uniform in time estimates together with the uniqueness of the stationary states proved in Sect. 2 to derive the main result of this section regarding long time asymptotics for (4.1).
4.1 Global well-posedness of the Cauchy problem
In this section we analyze the existence and uniqueness of a bounded global weak solution for initial data in \(L^1_{log}({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2)\), where here and in the following we denote
Assuming to have a sufficiently regular solution with the gradient of the chemotactic potential being uniformly bounded, Kowalczyk [63] derived a priori bounds in \(L^\infty \) with respect to space and time for the Keller–Segel model with nonlinear diffusion on bounded domains. These a priori estimates have been improved and extended to the whole space by Calvez and Carrillo in [23]. We shall demonstrate here how these a priori estimates of [23] can be made rigorous when starting from an appropriately regularized equation leading to the following theorem.
Theorem 4.1
(Properties of weak solutions) For any non-negative initial data \(\rho _0\in L^1_{log}({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2)\), there exists a unique global weak solution \(\rho \) to (4.1), which satisfies the energy inequality (4.3) with the energy being bounded from above and below in the sense that
for some (negative) constant \({\mathcal E}_*\). In particular \(\rho \) is uniformly bounded in space and time
where C depends only on the initial data. Moreover the log-moment grows at most linearly in time
where again C depends only on the initial data.
We shall also state the existence result for radial initial data that was obtained in [65] and [61] for higher dimensions and the Newtonian potential. Similar methods can be applied in the case \(d=2\) considered here:
Theorem 4.2
(Properties of radial solutions) Let \(\rho _0 \in L^1_{log}({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2)\) be non-negative and radially symmetric.
(a) Then the corresponding unique weak solution of (4.1) remains radially symmetric for all \(t>0\).
(b) If \(\rho _0\) is compactly supported, then the solution remains compactly supported for all \(t>0\).
(c) If \(\rho _0\) is moreover monotonically decreasing, then the solution remains radially decreasing for all \(t>0\).
In the remainder of this section we carry out the proof of the existence of a bounded global weak solution to (4.1) as stated in Theorem 4.1. We therefore introduce the following regularization of (4.1)
where \(m>1\) and the regularized logarithmic interaction potential is defined as
Moreover we have for the derivatives
satisfying
The regularization in (4.4) was used by Bian and Liu [8], who studied the Keller–Segel equation with nonlinear diffusion and the Newtonian potential for \(d\ge 3\); it has been modified here accordingly for the logarithmic interaction kernel in \(d=2\). The additional linear diffusion term in (4.4) removes the degeneracy and the regularized logarithmic potential \({\mathcal N}_\varepsilon \) possesses a uniformly bounded gradient, such that the local well-posedness of (4.4) is a standard result for any \(\varepsilon >0\). We shall note that a slightly different regularization for such nonlinear diffusion Keller–Segel type of equations has been introduced by Sugiyama in [80], which also yields the existence and uniqueness of a global weak solution. The advantage of the regularization in (4.4) resembling the one in [8] is the fact that the regularized problem satisfies a free energy inequality, that in the limit gives exactly (4.3), whereas in [80] the dissipation term could only be retained with a factor of 3/4.
We point out that in the case \(d=2\) the available a priori estimates differ from those in higher space dimensions, leading to a different proof, compared to [8], of the global well-posedness of the Cauchy problem for (4.4) and of the limit \(\varepsilon \rightarrow 0\).
4.1.1 Global well posedness of the regularized Cauchy problem
To derive a priori estimates for the regularized problem (4.4) we use the iterative method used by Kowalczyk [63] based on employing test functions that are powers of \(\rho _{\varepsilon ,k}=(\rho _{\varepsilon }-k)_+\) for some \(k>0\). When testing (4.4) against \(p\rho _{\varepsilon ,k}^{p-1}\) for any \(p\ge 2\), we obtain:
where for estimating the integrals involving convolution terms we used the inequality
see e.g. Lieb and Loss [64]. Closing the estimate (4.6) would yield an estimate for \(\rho _{\varepsilon ,k}\) in \(L^\infty (0,T;L^p({\mathbb {R}}^2))\) and thus also for \(\rho _\varepsilon \in L^\infty (0,T;L^p({\mathbb {R}}^2))\), since
Kowalczyk proceeded from (4.5) with the assumption corresponding to \(\Vert \nabla {\mathcal N}_\varepsilon *\rho _{\varepsilon }\Vert _{L^\infty }\le C\). Observe that it would be sufficient to prove \(\rho _\varepsilon \in L^\infty (0,T;L^p({\mathbb {R}}^2))\) for some \(p>2\) implying \(\Delta {\mathcal N}_\varepsilon *\rho _\varepsilon \in L^\infty (0,T;L^p({\mathbb {R}}^2))\) and hence the uniform boundedness of the gradient term by Sobolev embedding. Calvez and Carrillo [23] circumvent this assumption and derive the bound by using an equi-integrability property in the inequality (4.6). Hence, in order to be able to follow the ideas of [23] for the regularized problem, we need to derive the corresponding energy inequality for the latter.
Proposition 4.3
For any finite time \(T>0\) the solution \(\rho _\varepsilon \) to the Cauchy problem (4.4) supplemented with initial data \(\rho _{0} \in L^1_{log}({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2)\) satisfies the energy inequality
for a positive constant \(C=C(M,\Vert \rho _{0}\Vert _{\infty })\) and \(0\le t\le T\), where \({{\mathcal {E}}}_\varepsilon \) is an approximation of the free energy functional in (4.2):
and \({\mathcal D}_\varepsilon \) the corresponding dissipation
In particular, we obtain equi-integrability
Remark 4.4
Note that due to the \(\varepsilon \Delta \rho _\varepsilon \) regularization term in (4.4), its associated energy functional actually includes an extra term \(\varepsilon \int \rho _\varepsilon \log \rho _\varepsilon \) compared to \(\mathcal {E}_{\varepsilon }\). But in this proposition we choose to obtain an energy inequality for \(\mathcal {E}_\varepsilon \) (rather than the actual associated energy functional), since the absence of the extra term \(\varepsilon \int \rho _{\varepsilon } \log \rho _{\varepsilon }\) will make it easier for us to obtain a priori estimates independent of \(\varepsilon \) later.
Proof
Testing (4.4) with \(\frac{m}{m-1}\rho _\varepsilon ^{m-1}-{\mathcal N}_\varepsilon *\rho _\varepsilon \) we obtain
where we have used (4.7) and the fact that \(\Vert J_\varepsilon \Vert _{L^1({\mathbb {R}}^2)}=1\). Hence we need to derive an a priori bound for \(\rho _\varepsilon \) in \(L^2({\mathbb {R}}^2)\). We use the estimate (4.6) for \(p=2\) and bound \(\int _{{\mathbb {R}}^2}\rho _{\varepsilon ,k}^3 dx\) using the Gagliardo–Nirenberg inequality (see for instance [49, 74]) as follows:
Then by (4.6) and interpolation of the \(L^2\)-integral, we have
Hence, choosing k large enough, recalling \(m>1\) and estimate (4.8), we can conclude by integrating in time that
for some constant \(C=C(M,\Vert \rho _{0}\Vert _{L^\infty ({\mathbb {R}}^2)})\), which implies the stated energy inequality.
In order to obtain a priori bounds and in particular the equi-integrability property, we need to bound the energy functional also from below. The difference to the corresponding energy functional for the original model (4.1) lies only in the regularized interaction kernel. Since clearly for all \(x\in {\mathbb {R}}^2\) we have \({\log }(|x|^2+\varepsilon ^2)\ge 2{\log }|x| \), we obtain
Following [23] we can estimate further using the logarithmic Hardy–Littlewood–Sobolev inequality
where C(M) is a constant depending on the mass M and
Now it is easy to verify there is a constant \(\kappa =\kappa (m,M)>1\) for which
such that
implying in particular
We therefore find from (4.9), (4.10) and (4.11) that
with \(C =C(m,\Vert \rho _0\Vert _{L^1({\mathbb {R}}^2)}, \Vert \rho _0\Vert _{L^\infty ({\mathbb {R}}^2)})\) being a constant independent of t. Since \(\Theta ^{+}\) is superlinear at infinity, we obtain the equi-integrability as in Theorem 5.3 in [23]. \(\square \)
The equi-integrability from Proposition 4.3 allows us to close the estimate (4.6) analogously to Lemma 3.1 of [23], leading to a bound for \(\rho _{\varepsilon }\) in \(L^\infty (0,T;L^p({\mathbb {R}}^2))\). Moreover, using Moser’s iterative methods of Lemma 3.2 in [23] we finally get a bound for \(\rho _{\varepsilon }\) in \(L^\infty (0,T;L^\infty ({\mathbb {R}}^2))\). In order to avoid mass loss at infinity, typically the boundedness of the second moment of the solution is employed. Here, however, we demonstrate that the bound of the log-moment provides sufficient compactness, having the advantage of fewer restrictions on the initial data. We therefore denote for the regularization
The following lemma is now obtained following the ideas of [23]:
Lemma 4.5
The solution \(\rho _\varepsilon \) to (4.4) for a non-negative initial data \(\rho _{0}\in L^1_{log}({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2)\) satisfies for any \(T>0\):
where the constant C depends on the initial data.
Proof
Computing formally the evolution of the log-moment in (4.4) in a similar fashion to [26], we find for the test function \(\phi (x)={\log } (1+|x|^2)\) after integrating by parts
Computing the derivatives of \(\phi \) we see
We thus obtain
Integrating in time and making use of the energy-energy dissipation inequality (4.9) and the uniform bound on \({\mathcal E}_\varepsilon \) from below in (4.11) gives
The argument can easily be made rigorous by using compactly supported approximations of \(\phi \) on \({\mathbb {R}}^2\) as test functions, see e.g. also [13]. The proof is concluded by referring to Lemma 3.2 in [23] for the proof of uniform boundedness of \(\rho _\varepsilon \). \(\square \)
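The properties of the test function used in this proof can be checked by a direct computation. The following is a sketch of that computation in two dimensions; the bounds at the end are the only facts about \(\phi \) that a log-moment estimate of this kind needs, and their exact form in the original display may differ.

```latex
% Derivatives of the log-moment weight phi(x) = log(1+|x|^2) on R^2.
\[
\begin{aligned}
\nabla\phi(x) &= \frac{2x}{1+|x|^2},
 &\qquad |\nabla\phi(x)| &= \frac{2|x|}{1+|x|^2}\le 1,\\
\Delta\phi(x) &= \frac{4}{1+|x|^2}-\frac{4|x|^2}{(1+|x|^2)^2}
              = \frac{4}{(1+|x|^2)^2},
 &\qquad 0<\Delta\phi(x)&\le 4 .
\end{aligned}
\]
% The gradient bound uses 2|x| <= 1+|x|^2; the Laplacian uses div x = 2 in R^2.
```

Both derivatives are bounded on \({\mathbb {R}}^2\), which is why the log-moment can be controlled using only the mass, the uniform bound on \(\rho _\varepsilon \) and the energy dissipation.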
Remark 4.6
(i) The fact that the uniform bound of \(\rho _\varepsilon \) grows linearly with time originates from the term of order \(\varepsilon \) in the energy inequality for the regularized equation. Hence the bound on the energy, and therefore the modulus of equi-continuity for the regularized problem, depend on time. However, for the limiting equation (4.1) this term vanishes and the energy is decaying for all times, which allows us to deduce uniform boundedness of the solution to (4.1) globally in time and space, see also [23, Lemma 5.7].
(ii) The log-moment of \(\rho _\varepsilon \) grows at most linearly in time. The same statement is true for the limiting function. Hence it is only possible to guarantee confinement of mass for finite times. This property, which yields compactness, will be used in the following to pass to the limit in the regularized problem. Since the bound grows with time, it cannot be employed for the long-time behavior, and different methods will be required.
4.1.2 The limit \(\varepsilon \rightarrow 0\)
In order to deduce the global well-posedness of the Cauchy problem for (4.1) it remains to carry out the limit \(\varepsilon \rightarrow 0\). Knowing that the solution remains uniformly bounded and having the bounds from the energy inequality, we obtain weak convergence properties of the solution. In order to pass to the limit with the nonlinearities and in the entropy inequality, strong convergence results will be required. The following lemma summarizes the uniform bounds we obtain from Proposition 4.3 and Lemma 4.5:
Lemma 4.7
Let \(\rho _\varepsilon \) be the solution as in Proposition 4.3, then we obtain the following uniform in \(\varepsilon \) bounds
where C depends on \(m, q, \rho _0\) and T.
Proof
The uniform bounds of the \(L_{log}^1({\mathbb {R}}^2)\)- and \(L^\infty ({\mathbb {R}}^2)\)-norms follow from the conservation of mass and Lemma 4.5. The convolution term
can be estimated as follows:
The bound of \(\sqrt{\rho _\varepsilon } \nabla {\mathcal N}_\varepsilon *\rho _\varepsilon \) in \(L^2((0,T)\times {\mathbb {R}}^2)\) follows now easily by using the conservation of mass.
The basic \(L^2\)-estimate corresponding to (4.5) for \(p=2\) and \(k=0\) implies after integration in time
Using the above a priori estimates we can further bound employing the inequality in (4.7)
Since \(m>1\), the conservation of mass and the uniform boundedness of \(\rho _\varepsilon \) give \(\rho _\varepsilon ^{m-1/2}\) in \(L^2((0,T)\times {\mathbb {R}}^2)\). For the gradient we now use the bound on the entropy dissipation (4.9)
The bound for \(\nabla \rho ^q\) follows easily by rewriting
and using the uniform boundedness of \(\rho _\varepsilon \).
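The rewriting referred to here is an application of the chain rule. As a sketch, for any exponent \(q\ge m-\frac{1}{2}\) and a bounded non-negative \(\rho \) (formal computation, the exact display in the original may be arranged differently):

```latex
% Chain rule: rho^q = (rho^{m-1/2})^{q/(m-1/2)}.
\[
\nabla\rho^{q}
 = \frac{q}{m-\frac12}\,\rho^{\,q-m+\frac12}\,\nabla\rho^{\,m-\frac12},
\qquad\text{so}\qquad
\|\nabla\rho^{q}\|_{L^2((0,T)\times{\mathbb R}^2)}
 \le \frac{q}{m-\frac12}\,
 \|\rho\|_{L^\infty((0,T)\times{\mathbb R}^2)}^{\,q-m+\frac12}\,
 \|\nabla\rho^{\,m-\frac12}\|_{L^2((0,T)\times{\mathbb R}^2)} .
\]
% The exponent q-m+1/2 is non-negative exactly when q >= m-1/2,
% which is where the L^infty bound on rho can be used.
```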
It thus now remains to derive the estimate for the time derivative. Using the previous estimates we have for any test function \(\phi \in L^2(0,T;H^1({\mathbb {R}}^2))\),
\(\square \)
We now use these bounds to derive weak convergence properties. The Dubinskii Lemma (see Lemma 4.23 in the “Appendix”) can be applied to obtain the strong convergence locally in space, which can be extended to global strong convergence using the boundedness of the log-moment.
Lemma 4.8
Let \(\rho _\varepsilon \) be the solution as in Proposition 4.3. Then, up to a subsequence,




Proof
Since \(\{\rho _\varepsilon \}_\varepsilon \) are uniformly bounded in \(L^q((0,T)\times {\mathbb {R}}^2)\) for any \(1\le q\le \infty \), we obtain from the reflexivity of the Lebesgue spaces for \(1<q<\infty \), up to a subsequence, the weak convergence
Moreover due to the uniform bounds from Lemma 4.7
for any \(r\ge m-\frac{1}{2}\), we can apply the Dubinskii Lemma stated in the “Appendix” to derive
The boundedness of the log-moment N(t) allows us to extend the strong convergence to the whole space, since for any \(1\le q<\infty \) we have
as \(R\rightarrow \infty \). Due to the weak lower semi-continuity of the \(L^q\)-norm we can now conclude with (4.18) that also
Hence we can extend the strong convergence locally in space to strong convergence in \({\mathbb {R}}^2\):
Additionally the strong convergence in \(L^1((0,T)\times {\mathbb {R}}^2)\) can be deduced using the bound from the energy as stated in Lemma 4.22 in the “Appendix”. Interpolation now yields (4.14).
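The passage from local to global strong convergence rests on a tail estimate that follows from the log-moment by a Chebyshev-type argument. A sketch of this estimate, written for a generic non-negative bounded density \(\rho \) with finite log-moment:

```latex
% Tail control from the log-moment: on {|x|>R} one has
% log(1+|x|^2) >= log(1+R^2) > 0.
\[
\int_{|x|>R}\rho^{q}\,dx
 \le \|\rho\|_{L^\infty}^{\,q-1}\int_{|x|>R}\rho\,dx
 \le \frac{\|\rho\|_{L^\infty}^{\,q-1}}{\log(1+R^2)}
     \int_{{\mathbb R}^2}\log(1+|x|^2)\,\rho\,dx
 \;\longrightarrow\;0
 \quad\text{as } R\to\infty .
\]
% Integrating over (0,T) keeps the bound uniform as long as the
% log-moment and the L^infty norm are bounded on (0,T).
```

Since the right-hand side depends on \(\rho \) only through its \(L^\infty \)-norm and its log-moment, the estimate is uniform along the sequence \(\rho _\varepsilon \) on any finite time interval.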
The weak convergence of \( \rho _\varepsilon ^{m-1/2}\) in \(L^2(0,T;H^1({\mathbb {R}}^2))\) holds due to its uniform boundedness given by inequality (4.13) and the reflexivity of the latter space, where the limit is identified arguing by the density of spaces. Due to the uniform boundedness of \(\rho _\varepsilon \) this assertion can be extended to any finite power bigger than \(m-1/2\).
Since moreover \(\sqrt{\rho _\varepsilon }\) is uniformly bounded in \(L^2((0,T)\times {\mathbb {R}}^2)\) we have the weak convergence towards \(\sqrt{\rho }\) in \(L^2((0,T)\times {\mathbb {R}}^2)\), where again the limit is identified by using the a.e. convergence of \(\rho _\varepsilon \) from the strong convergence above. To see (4.16) we rewrite
The first integral vanishes and the second one converges to 0 due to the weak convergence of \(\sqrt{\rho _\varepsilon }\rightharpoonup \sqrt{\rho }\) in \(L^2((0,T)\times {\mathbb {R}}^2)\).
Finally the convergence in (4.17) is a direct consequence of the bound \(\sqrt{\varepsilon }\Vert \nabla \rho _\varepsilon \Vert _{L^2((0,T)\times {\mathbb {R}}^2)}\le C\) in Lemma 4.7. \(\square \)
These convergence results from Lemma 4.8 are sufficient to obtain the weak convergence of the nonlinearities \(\sqrt{\rho _\varepsilon }\nabla h_\varepsilon [\rho _\varepsilon ]\) and \(\rho _\varepsilon \nabla h_\varepsilon [\rho _\varepsilon ]\) in \(L^2((0,T)\times {\mathbb {R}}^2)\), which allow us to pass to the limit in the weak formulation and to deduce the weak lower semicontinuity of the entropy dissipation term:
Lemma 4.9
Let \(\rho _\varepsilon \) and \(\rho \) be as in Lemma 4.8. Then
Proof
Due to (4.15) and (4.16) it remains to verify
Due to Lemma 4.7, we have the weak convergence of \(\sqrt{\rho _\varepsilon }\,\nabla {\mathcal N}_\varepsilon *\rho _\varepsilon \) in \(L^2((0,T)\times {\mathbb {R}}^2;{\mathbb {R}}^{2})\). In order to identify the limit we consider for a \(\phi \in L^2((0,T)\times {\mathbb {R}}^2;{\mathbb {R}}^{2})\):
The first term converges to zero using (4.16), since by (4.12) it is bounded by
For the second term we first use the Cauchy–Schwarz inequality
To see that this convolution term vanishes we bound further
uniformly in x, t, where we substituted \(s = |x-y|/\varepsilon \). For the remaining term in (4.21) we proceed changing the order of integration, where we again skip the dependence of \(\rho _\varepsilon \) and \(\phi \) on t in the following:
To prove that this integral vanishes in the limit, due to (4.16) it suffices to show that
We shall therefore split the integral into two parts and consider first
It remains to bound the integral for \(|x-y|>1\):
\(\square \)
Proof of Theorem 4.1
The convergence property of the nonlinearity in (4.20) and the weak convergence of the time derivative due to Lemma 4.7 allow us to pass to the limit in the weak formulation of the Cauchy problem for (4.1), where the linear diffusion term vanishes due to (4.17). The uniqueness of the solution follows from Theorem 1.3 and Corollary 6.1 of [32]; we do not go into further detail here.
It thus remains to pass to the limit in the energy inequality. Since the energy dissipation is weakly lower semicontinuous due to (4.19), we get
In order to obtain the energy inequality (4.3) in the limit \(\varepsilon \rightarrow 0\) it thus remains to show \({{\mathcal {E}}}_\varepsilon [\rho _\varepsilon ](t) \rightarrow {\mathcal E}[\rho ](t)\,\) for \(t\in [0,T]\). Lemma 4.22 and the uniform bounds on \(\rho _\varepsilon \) in Lemma 4.7 directly imply the strong convergence of \(\rho _\varepsilon \) in \(L^\infty (0,T;L^m({\mathbb {R}}^2))\). It is therefore left to prove the convergence for the convolution term and we rewrite
We split the domain of integration and first analyze the case \(|x-y|\ge 1\). In this domain, we get
and thus it converges to zero as \(\varepsilon \rightarrow 0\). Using the Cauchy–Schwarz inequality, we obtain moreover
We now turn to the integration domain \(|x-y|<1\), where by dominated convergence
This proves the convergence of the entropy, which together with the weak lower semicontinuity of the entropy-dissipation leads to the desired energy-energy dissipation inequality (4.3) for the limiting solution \(\rho \). \(\square \)
4.2 Long-time behavior of solutions
Our main result of Sect. 2 together with the uniqueness argument for radial stationary solutions to (4.1) of [61] and the characterization of global minimizers in [28] and Corollary 3.11 leads to the following result:
Theorem 4.10
There exists a unique stationary state \(\rho _M\) of (4.1) with mass M and zero center of mass in the sense of Definition 2.1 with the property \(\rho _M\in L^1_{log}({\mathbb {R}}^2)\). Moreover, \(\rho _M\) is compactly supported, bounded, radially symmetric and non-increasing. Furthermore, the unique stationary state is characterized as the unique global minimizer of the free energy functional (4.2) with mass M.
As a consequence, all stationary states of (4.1) in the sense of Definition 2.1 with mass M are given by translations of the given profile \(\rho _M\):
Remark 4.11
As in [61, Corollary 2.3] we have the following result comparing the support and height for stationary states with different masses based on a scaling argument: Let \(\rho _1\) be the radial solution with unit mass. Then the radial solution with mass M is of the form
For two stationary states \(\rho _{M_1}\) and \(\rho _{M_2}\) with masses \(M_1>M_2\) the following relations hold:
(a) If \(m>2\), then \(\rho _{M_1}\) has a bigger support and a bigger height than \(\rho _{M_2}\).
(b) If \(m=2\), then all stationary states have the same support.
(c) If \(1<m<2\), then \(\rho _{M_1}\) has smaller support and bigger height than \(\rho _{M_2}\).
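The three cases can be read off from the scaling exponents. The following sketch assumes that (4.1) has the Newtonian kernel \({\mathcal N}(x)=\frac{1}{2\pi }\log |x|\) and that a stationary state satisfies \(\frac{m}{m-1}\rho ^{m-1}+{\mathcal N}*\rho =\text {const}\) on its support; the resulting exponents agree with the support radius \(R(M)=C_0M^{\frac{m-2}{2(m-1)}}\) used in the proof of Theorem 4.15.

```latex
% Scaling ansatz rho_M(x) = a rho_1(x/b), with rho_1 of unit mass.
\[
\begin{aligned}
\text{mass:}\quad & \int \rho_M\,dx = a\,b^2 = M,\\
\text{balance:}\quad & \tfrac{m}{m-1}\rho_M^{m-1}(x)+({\mathcal N}*\rho_M)(x)
  = a^{m-1}\,\tfrac{m}{m-1}\rho_1^{m-1}\!\big(\tfrac{x}{b}\big)
   + a\,b^2\Big[({\mathcal N}*\rho_1)\big(\tfrac{x}{b}\big)+\tfrac{\log b}{2\pi}\Big],
\end{aligned}
\]
% The right-hand side is constant on the support iff a^{m-1} = a b^2 = M, hence
\[
a = M^{\frac{1}{m-1}},\qquad b = M^{\frac{m-2}{2(m-1)}},\qquad
\rho_M(x)=M^{\frac{1}{m-1}}\,\rho_1\!\big(M^{-\frac{m-2}{2(m-1)}}x\big).
\]
% Height: rho_M(0) = M^{1/(m-1)} rho_1(0), increasing in M for every m>1.
% Support radius: R(M) = b R(1), increasing in M if m>2,
% independent of M if m=2, decreasing in M if 1<m<2.
```

The height therefore always grows with the mass, while the sign of \(m-2\) alone decides whether the support expands or shrinks.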
We will study now the long time asymptotics for the global weak solutions \(\rho \) of (4.1) that according to the entropy inequality in Theorem 4.1 satisfy
Since the entropy is bounded from below, this implies for the entropy dissipation
Let us therefore now consider the sequence
for which we obtain
Thus \( {\mathcal D}[\rho _k]\rightarrow 0\) in \(L^1(0,T)\), or equivalently
The proof of convergence towards the steady state will be based on weak lower semicontinuity of the entropy dissipation. Assume that \(\rho _k\rightharpoonup \overline{\rho } \) in \(L^\infty (0,T;L^1({\mathbb {R}}^2)\cap L^m({\mathbb {R}}^2))\), then we have to derive
Since the \(L^2\)-norm is weakly lower semicontinuous, it therefore remains to show similarly as in Lemma 4.9
From there it can be deduced that \(\overline{\rho }\) is the stationary state \(\rho _M\) with \(M=\Vert \rho _0\Vert _{L^1({\mathbb {R}}^2)}\) by the uniqueness theorem 4.10, if we can guarantee that no mass gets lost in the limit.
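The vanishing of the dissipation along time shifts is the statement that the tail of a convergent integral tends to zero. As a sketch, assuming the shifts are \(\rho _k(t,x)=\rho (t+t_k,x)\) with \(t_k\rightarrow \infty \), and writing the entropy inequality in the integrated form \({\mathcal E}[\rho (t)]+\int _0^t{\mathcal D}[\rho (s)]\,ds\le {\mathcal E}[\rho _0]\):

```latex
% The dissipation is integrable in time because the entropy is bounded below.
\[
\int_0^\infty {\mathcal D}[\rho(s)]\,ds
 \le {\mathcal E}[\rho_0]-\inf_{t\ge 0}{\mathcal E}[\rho(t)] < \infty ,
\]
% Hence the integral over any window of fixed length T vanishes as the
% window is shifted to infinity:
\[
\int_0^T {\mathcal D}[\rho_k(t)]\,dt
 = \int_{t_k}^{t_k+T} {\mathcal D}[\rho(s)]\,ds
 \le \int_{t_k}^{\infty} {\mathcal D}[\rho(s)]\,ds
 \;\longrightarrow\;0
 \quad\text{as } k\to\infty .
\]
```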
The main difficulty for passing to the limit in the long-time behavior lies in obtaining sufficient compactness avoiding the loss of mass at infinity. Even though the mass of \(\rho (t,\cdot )\) is conserved for all time, if a positive amount of mass escapes to infinity, then a subsequence of \(\rho (t,\cdot )\) may weakly converge to a stationary solution with mass strictly less than M. To rule out this scenario, we need to show that the family \(\{\rho (t,\cdot )\}_{t>0}\) is tight, which can be done by obtaining uniform-in-time bounds for certain moments of \(\rho (t,\cdot )\). So far we only have a time-dependent bound on the logarithmic moment in Theorem 4.1, which is not enough. Moreover, even if we know that \(\{\rho (t,\cdot )\}_{t>0}\) is tight, in order to choose the right limiting profile among all stationary states in \(\mathcal {S}\), we need to show the conservation of some symmetry. In fact, it is easy to check that the center of mass should formally be preserved by the evolution due to the antisymmetry of the gradient of the Newtonian potential. But to justify this rigorously, we need to work with moments of order higher than one, so that the center of mass is well defined.
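The formal conservation of the center of mass can be sketched in two lines. The computation below assumes that (4.1) reads \(\partial _t\rho =\Delta \rho ^m+\nabla \cdot (\rho \nabla {\mathcal N}*\rho )\) with \({\mathcal N}(x)=\frac{1}{2\pi }\log |x|\), in line with the general form given in the Introduction, and that \(\rho \) decays fast enough for the integrations by parts to be valid.

```latex
% Formal evolution of the i-th component of the first moment.
\[
\frac{d}{dt}\int_{{\mathbb R}^2} x_i\,\rho\,dx
 = \int_{{\mathbb R}^2}\Delta(x_i)\,\rho^m\,dx
   -\int_{{\mathbb R}^2}\rho\,\partial_{x_i}({\mathcal N}*\rho)\,dx
 = -\iint \rho(x)\rho(y)\,\partial_{x_i}{\mathcal N}(x-y)\,dx\,dy .
\]
% Delta(x_i) = 0 kills the diffusion term. For the interaction term,
% swapping x and y and using grad N(-z) = -grad N(z) shows that the
% double integral equals its own negative:
\[
\iint \rho(x)\rho(y)\,\nabla{\mathcal N}(x-y)\,dx\,dy
 = -\iint \rho(x)\rho(y)\,\nabla{\mathcal N}(x-y)\,dx\,dy = 0 .
\]
```

The identity is only formal because \(x_i\) is unbounded; this is the point at which a bound on a moment of order higher than one is needed.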
Below we state the main theorem in this section, where a key argument is to establish a uniform-in-time bound on the second moment of \(\rho (t,\cdot )\), if \(\rho _0\) has a finite second moment.
Theorem 4.12
Let \(\rho \) be the weak solution to (4.1) given in Theorem 4.1 with non-negative initial data \(\rho _0\in L^1((1+|x|^2)dx)\cap L^\infty ({\mathbb {R}}^2)\). Then, as \(t\rightarrow \infty \), \(\rho (t,\cdot )\) converges to the unique stationary state with the same mass and center of mass as the initial data, i.e., to
with \(M=\Vert \rho _{0}\Vert _{L^{1}({\mathbb {R}}^{2})}\), ensured by Theorem 4.10. More precisely, we have
Our aim is to show that the second moment of solutions to (4.1) is uniformly bounded in time for all \(t\ge 0\). This in turn shows easily that the first moment is preserved in time for all \(t\ge 0\), as we will prove below. Recall that by (2.15) we denote by \(M_2[f]\) the second moment of \(f\in L^1_+({\mathbb {R}}^d)\). We first derive rigorously the evolution of the second moment in time:
starting from the regularized system (4.4). Computing the second moment of the regularized problem, we obtain
The strong convergence in (4.14) allows us to pass to the limit \(\varepsilon \rightarrow 0\) in the first integral of (4.23), and for the remainder term we moreover have, due to the conservation of mass and the uniform boundedness of \(\rho _\varepsilon \),
The argument can easily be made rigorous by using compactly supported approximations of \(|x|^2\) on \({\mathbb {R}}^2\) as test functions, see e.g. also [13]. We finally obtain (4.22) by integrating in time.
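For orientation, the formal version of this computation for the limiting equation is short. The sketch below assumes that (4.1) reads \(\partial _t\rho =\Delta \rho ^m+\nabla \cdot (\rho \nabla {\mathcal N}*\rho )\) with \({\mathcal N}(x)=\frac{1}{2\pi }\log |x|\) and that \(M_2[\rho ]=\int |x|^2\rho \,dx\); it is a formal calculation, not the rigorous derivation through the regularized problem.

```latex
% Formal time derivative of the second moment in R^2.
\[
\frac{d}{dt}M_2[\rho(t)]
 = \int_{{\mathbb R}^2}\Delta(|x|^2)\,\rho^m\,dx
   -2\int_{{\mathbb R}^2}\rho\,x\cdot\nabla({\mathcal N}*\rho)\,dx
 = 4\int_{{\mathbb R}^2}\rho^m\,dx
   -\frac{1}{\pi}\iint \rho(x)\rho(y)\,\frac{x\cdot(x-y)}{|x-y|^2}\,dx\,dy .
\]
% Symmetrizing in (x,y) replaces x.(x-y) by |x-y|^2/2, so the double
% integral equals M^2/2 and
\[
\frac{d}{dt}M_2[\rho(t)] = 4\int_{{\mathbb R}^2}\rho^m\,dx-\frac{M^2}{2\pi}.
\]
```

In this form the interaction contributes only through the conserved mass, so two solutions with the same mass can be compared through \(\int \rho ^m\,dx\) alone.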
Now, we want to compare general solutions to (4.1) with its radial solutions. In order to do this we will make use of the concept of mass concentration, which has been recalled in Sect. 2.4, and used for instance in [44, 61] for classical applications to Keller–Segel type models.
Following exactly the same proof as in [61], the following two results hold for the solutions of (4.1). The first result says that for two radial solutions, if one is initially “more concentrated” than the other one, then this property is preserved for all time. The second result compares a general (possibly non-radial) solution \(\rho (t,\cdot )\) with another solution \(\mu (t,\cdot )\) with initial data \(\rho ^\#(0,\cdot )\), i.e., the decreasing rearrangement of the initial data for \(\rho (t,\cdot )\), and it says that the symmetric rearrangement of \(\rho (t,\cdot )\) is always “less concentrated” than the radial solution \(\mu (t,\cdot )\). This result generalizes the results from [44] to nonlinear diffusion with totally different proofs. We also refer the interested reader to the survey [86] for a general exposition of the mass concentration comparison results for local nonlinear parabolic equations and to the recent developments obtained in [88, 89] in the context of nonlinear parabolic equations with fractional diffusion.
Proposition 4.13
Let \(m>1\) and f, g be two radially symmetric solutions to (4.1) with \(f(0,\cdot ) \prec g(0,\cdot )\). Then we have \(f(t,\cdot ) \prec g(t,\cdot )\) for all \(t>0\).
Proposition 4.14
Let \(m>1\) and \(\rho \) be a solution to (4.1), and let \(\mu \) be a solution to (4.1) with initial condition \(\mu (0,\cdot ) = \rho ^\#(0,\cdot ).\) Then we have that \(\mu (t,\cdot )\) remains radially symmetric for all \(t\ge 0\), and in addition we have
Now we are ready to bound the second moment of solutions in the two-dimensional case: we will show that if \(\rho (t,\cdot )\) is a solution to (4.1) with \(M_2[\rho _0]\) finite, then \(M_2[\rho (t)]\) must be uniformly bounded for all time.
Theorem 4.15
Let \(\rho _0 \in L^1((1+|x|^2)dx) \cap L^\infty ({\mathbb {R}}^2)\). Let \(\rho (t,\cdot )\) be the solution to (4.1) with initial data \(\rho _0\). Then we have that
Proof
Recalling that \(\rho _M\) is the unique radially symmetric stationary solution with the same mass as \(\rho _0\) and zero center of mass, we let \(\rho _{M,\lambda } := \lambda ^2 \rho _M(\lambda x)\) with some parameter \(\lambda >1\). Since \(\rho _0\in L^1({\mathbb {R}}^2) \cap L^\infty ({\mathbb {R}}^2)\), we can choose a sufficiently large \(\lambda \) such that \(\rho _0^\# \prec \rho _{M,\lambda }\). Note that \(\lambda >1\) also directly yields that \(\rho _M \prec \rho _{M,\lambda }\).
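Both claims about the dilation \(\rho _{M,\lambda }\) can be verified by a change of variables. The sketch below assumes that the mass concentration relation takes the usual form, \(f\prec g\) if and only if \(\int _{B_r}f^\#\,dx\le \int _{B_r}g^\#\,dx\) for all \(r>0\), and writes \(R\) for the radius of the support of \(\rho _M\); the explicit threshold for \(\lambda \) is one admissible choice and is not claimed to be the one intended in the original argument.

```latex
% Dilation preserves mass and concentrates it:
\[
\int_{{\mathbb R}^2}\rho_{M,\lambda}\,dx = M,
\qquad
\int_{B_r}\rho_{M,\lambda}\,dx=\int_{B_{\lambda r}}\rho_M\,dx
 \ge \int_{B_r}\rho_M\,dx \quad (\lambda>1),
\]
% which gives rho_M \prec rho_{M,lambda}. For the initial data, use
% that rho_M is radially non-increasing and supported in B_R:
\[
\int_{B_r}\rho_0^\#\,dx \le \min\{M,\ \pi r^2\|\rho_0\|_{L^\infty}\},
\qquad
\int_{B_{\lambda r}}\rho_M\,dx \ge
 \begin{cases}
  M\,\dfrac{\lambda^2 r^2}{R^2}, & \lambda r\le R,\\[2mm]
  M, & \lambda r\ge R,
 \end{cases}
\]
% so rho_0^# \prec rho_{M,lambda} holds as soon as
\[
\lambda^2 \ge \max\Big\{1,\ \frac{\pi R^2\,\|\rho_0\|_{L^\infty}}{M}\Big\}.
\]
```

The lower bound for \(\lambda r\le R\) uses that the average of a radially non-increasing function over a smaller centered ball is at least its average over \(B_R\).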
Let \(\mu (t,\cdot )\) be the solution to (4.1) with initial data \(\rho _{M,\lambda }\). Combining Proposition 4.13 and Proposition 4.14, we have that
It then follows from (2.13) and Lemma 2.5 that
Now using the computation of the time derivative of \(M_2[\rho (t)]\) in (4.22), where \(\rho (t,\cdot )\) is a solution to (4.1), we get
Since \(\mu (t,\cdot )\) is also a solution to (4.1), (4.25) also holds when \(\rho \) is replaced by \(\mu \). Combining this fact with (4.24), we thus have
Finally, it suffices to show \(M_2[\mu (t)]\) is uniformly bounded for all time. Since \(\rho _M\) is a stationary solution and we have \(\rho _M \prec \rho _{M,\lambda }\), it follows from Proposition 4.13 that \(\rho _M \prec \mu (t,\cdot )\) for all \(t\ge 0\), hence we have \(M_2[\rho _M] \ge M_2[\mu (t)]\) due to Lemma 2.6. Plugging this into (4.26) yields
where \(M_2[\rho _M]\) is a constant only depending on the mass \(M:=\Vert \rho _0\Vert _{L^1({\mathbb {R}}^2)}\), which can be computed as follows: using Remark 4.11, we know the support of \(\rho _M\) is given by the ball centered at 0 of radius \(R(M) = C_0 M^{\frac{m-2}{2(m-1)}}\) (where \(C_0\) is the radius of the support for the stationary solution with unit mass), hence \(M_2[\rho _M] \le M R(M)^2 \le C_0^2 M^{\frac{2m-3}{m-1}}\). \(\square \)
Remark 4.16
The last result showing uniform-in-time bounds for the second moment for \(m>1\) finite is also interesting in comparison to the results in [42, 43], where the limit \(m\rightarrow \infty \) of the gradient flow is analysed. In the “\(m=\infty \)” case, the second moment of any solution is actually decreasing in time, leading to the result that all solutions converge towards the global minimizer with some explicit rate. As mentioned in the introduction, a result of this sort for any potential other than the attractive logarithmic potential is lacking.
As already mentioned above, a key ingredient in the proof of Theorem 4.12 is the confinement of mass, which is now obtained as follows:
Lemma 4.17
Let \(\rho \) be a global weak solution as in Theorem 4.1 with mass M with initial data \(\rho _0 \in L^1((1+|x|^2)dx)\cap L^\infty ({\mathbb {R}}^2)\) and consider as above the sequence \(\{\rho _k\}_{k\in {\mathbb {N}}}=\{\rho (\cdot +t_k,\cdot )\}_{k\in {\mathbb {N}}}\) in \((0,T)\times {\mathbb {R}}^2\). Then there exists a \(\overline{\rho } \in L^1((0,T)\times {\mathbb {R}}^2)\cap L^m((0,T)\times {\mathbb {R}}^2)\) and a subsequence, that we denote with the same index without loss of generality, such that:
as \(k\rightarrow \infty \).
Proof
Since the entropy is uniformly bounded from below, the entropy inequality (4.3) gives \(\rho _k\in L^\infty ((0,T);L^m({\mathbb {R}}^2))\). Using Theorem 4.15, we deduce that
Since \(\{\rho _k\}_{k\in {\mathbb {N}}}\) are also uniformly bounded in \(L^\infty (0,T;L^m({\mathbb {R}}^2))\) we obtain equi-integrability and can therefore apply the Dunford–Pettis theorem (see Theorem 4.21 in the “Appendix”) to obtain the weak convergence in \(L^1((0,T)\times {\mathbb {R}}^2)\cap L^m((0,T)\times {\mathbb {R}}^2)\). \(\square \)
In order to obtain weak lower semicontinuity of the entropy dissipation term, we need additional convergence results. These are derived from the following uniform bounds:
Lemma 4.18
Let \(\rho \) be a global weak solution as in Theorem 4.1 with mass M and consider as above the sequence \(\{\rho _k\}_{k\in {\mathbb {N}}}=\{\rho (\cdot +t_k,\cdot )\}_{k\in {\mathbb {N}}}\) in \((0,T)\times {\mathbb {R}}^2\). Then
Proof
The bounds are obtained from the energy-energy dissipation inequality (4.3) in an analogous way to the ones given in Lemma 4.7 with the only difference concerning the replacement of \({\mathcal N}_\varepsilon \) by \({\mathcal N}\), which however makes no difference in the estimate (4.12). \(\square \)
Using these estimates the following convergence properties can be derived in an analogous way to the proof of Lemma 4.8.
Lemma 4.19
Let the assumptions of Lemma 4.17 hold. Then, up to subsequences that we denote with the same index,
These convergence results from Lemma 4.19 and Lemma 4.17 are sufficient to obtain the weak convergence of the nonlinearities \(\sqrt{\rho _k}\nabla h[\rho _k]\) and \(\rho _k\nabla h[\rho _k]\) in \(L^2((0,T)\times {\mathbb {R}}^2)\), which allows us to deduce the weak lower semicontinuity of the entropy dissipation term and to pass to the limit in the weak formulation of (4.1) in the same way as in the proof of Lemma 4.9.
Lemma 4.20
Let \(\rho _k\) and \(\overline{\rho }\) be as in Lemma 4.19. Then
This enables us to close the proof of convergence towards the set of stationary states.
Proof of Theorem 4.12
Let us first notice that \(\overline{\rho }\in L^\infty ((0,T)\times {\mathbb {R}}^2)\) due to the first convergence in Lemma 4.19 and the uniform in time bound on the weak solutions in Theorem 4.1. The bound from below of the entropy, obtained as in Proposition 4.3, implies that \( {\mathcal D}[\rho _k]\rightarrow 0\) in \(L^1(0,T)\), and as a consequence of the weak lower semicontinuity of the \(L^2((0,T)\times {\mathbb {R}}^2)\)-norm
Thus \(\overline{\rho }\) solves
Moreover, due to the convergence properties in Lemmas 4.19 and 4.20 the limiting density \(\overline{\rho }\) is a weak distributional solution to (4.1) with test functions in \(L^2(0,T;H^1({\mathbb {R}}^2))\). Due to (4.28), we get that \(\overline{\rho } \nabla h[\overline{\rho }]=0\) a.e. in \((0,T)\times {\mathbb {R}}^2\) and thus \(\partial _t \overline{\rho }=0\) in \(L^2(0,T;H^{-1}({\mathbb {R}}^2))\). This yields that \(\overline{\rho }(t,x) \equiv \overline{\rho }(x)\) does not depend on time.
Due to the convergence properties in Lemma 4.19, the uniform bound on the second moment (4.27) together with Lemma 4.22 in the “Appendix”, we can deduce that \(\overline{\rho }\in L^1((1+|x|^2)dx)\) and that \(\rho _k \rightarrow \overline{\rho }\) in \(L^\infty (0,T;L^1({\mathbb {R}}^2))\). In particular, \(\overline{\rho }\) has mass M.
Putting together all the properties of \(\overline{\rho }\) just proved together with the fact that \(\nabla \overline{\rho }^m \in L^2({\mathbb {R}}^2)\) due to Lemma 4.19, we infer that \(\overline{\rho }\) corresponds to a steady state of Eq. (4.1) in the sense of Definition 2.1. The uniqueness up to translation of stationary states in Theorem 4.10 shows that \(\overline{\rho }\) is a translation of \(\rho _M\), and thus \(\overline{\rho }\in \mathcal {S}\). In fact, we have shown that the limit of all convergent sequences \(\{\rho _k\}_{k\in {\mathbb {N}}}\) must be a translation of \(\rho _M\). This in turn shows that the set of accumulation points of any time diverging sequence belongs to \(\mathcal {S}\).
Finally, in order to identify the limit uniquely, we take advantage of the translational invariance. We first remark that the center of mass of the initial data is preserved for all time due to the antisymmetry of \(\nabla \mathcal {N}\). Due to Theorem 4.15, all time diverging sequences have uniformly bounded second moments, thus since \(\overline{\rho }\) is an accumulation point of a sequence \(\rho _{k}\), by Lemma 4.22 we have
Hence all accumulation points of the sequences have the same center of mass as the initial data. Then, all possible limits reduce to the translation of \(\rho _M\) to the initial center of mass as desired. \(\square \)
References
Alvino, A., Trombetti, G., Lions, P.L.: On optimization problems with prescribed rearrangements. Nonlinear Anal. 13(2), 185–220 (1989)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient flows in metric spaces and in the space of probability measures, Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, (2005)
Ambrosio, L., Gigli, N.: A user’s guide to optimal transport. In: Modelling and optimisation of flows on networks, volume 2062 of Lecture Notes in Math, pp. 1–155. Springer, Heidelberg (2013)
Balagué, D., Carrillo, J.A., Laurent, T., Raoul, G.: Dimensionality of local minimizers of the interaction energy. Arch. Rat. Mech. Anal. 209, 1055–1088 (2013)
Bedrossian, J.: Global minimizers for free energies of subcritical aggregation equations with degenerate diffusion. Appl. Math. Lett. 24(11), 1927–1932 (2011)
Bennett, C., Sharpley, R.: Interpolation of operators, vol. 129 of Pure and Applied Mathematics. Academic Press Inc., Boston, MA (1988)
Bertozzi, A.L., von Brecht, J.H., Sun, H., Kolokolnikov, T., Uminsky, D.: Ring patterns and their bifurcations in a nonlocal model of biological swarms. Commun. Math. Sci. 13, 955–985 (2015)
Bian, S., Liu, J.G.: Dynamic and steady states for multi-dimensional Keller-Segel model with diffusion exponent \(m\,>\,0\). Commun. Math. Phys. 323(3), 1017–1070 (2013)
Blanchet, A., Calvez, V., Carrillo, J.A.: Convergence of the mass-transport steepest descent scheme for the subcritical Patlak–Keller–Segel model. SIAM J. Numer. Anal. 46, 691–721 (2008)
Blanchet, A., Carlen, E.A., Carrillo, J.A.: Functional inequalities, thick tails and asymptotics for the critical mass Patlak–Keller–Segel model. J. Funct. Anal. 262, 2142–2230 (2012)
Blanchet, A., Carrillo, J.A., Laurençot, P.: Critical mass for a Patlak–Keller–Segel model with degenerate diffusion in higher dimensions. Calc. Var. Partial Differ. Equ. 35, 133–168 (2009)
Blanchet, A., Carrillo, J.A., Masmoudi, N.: Infinite time aggregation for the critical Patlak–Keller–Segel model in \(\mathbb{R}^2\). Commun. Pure Appl. Math. 61, 1449–1481 (2008)
Blanchet, A., Dolbeault, J., Perthame, B.: Two-dimensional Keller–Segel model: optimal critical mass and qualitative properties of the solutions. Electron. J. Differ. Equ. 44, 1–32 (2006)
Bodnar, M., Velázquez, J.J.L.: Friction dominated dynamics of interacting particles locally close to a crystallographic lattice. Math. Methods Appl. Sci. 36, 1206–1228 (2013)
Boi, S., Capasso, V., Morale, D.: Modeling the aggregative behavior of ants of the species polyergus rufescens. Nonlinear Anal. Real World Appl. 1, 163–176 (2000)
Brascamp, H., Lieb, E.H., Luttinger, J.M.: A general rearrangement inequality for multiple integrals. J. Funct. Anal. 17, 227–237 (1974)
Brezis, H.: Functional Analysis. Sobolev Spaces and Partial Differential Equations. Springer, New York (2010)
Brock, F.: Continuous Steiner symmetrization. Math. Nachrichten. 172, 25–48 (1995)
Brock, F.: Continuous rearrangement and symmetry of solutions of elliptic problems. Proc. Indian Acad. Sci. Math. Sci. 110, 157–204 (2000)
Burger, M., Capasso, V., Morale, D.: On an aggregation model with long and short range interactions. Nonlinear Anal. Real World Appl. 8, 939–958 (2007)
Burger, M., DiFrancesco, M., Franek, M.: Stationary states of quadratic diffusion equations with long-range attraction. Commun. Math. Sci. 11, 709–738 (2013)
Campos, J.F., Dolbeault, J.: Asymptotic estimates for the parabolic-elliptic Keller–Segel model in the plane. Commun. Partial Differ. Equ. 39, 806–841 (2014)
Calvez, V., Carrillo, J.A.: Volume effects in the Keller-Segel model: energy estimates preventing blow-up. J. Math. Pures Appl. 86(2), 155–175 (2006)
Calvez, V., Carrillo, J.A., Hoffmann, F.: Equilibria of homogeneous functionals in the fair-competition regime. Nonlinear Anal. TMA 159, 85–128 (2017)
Calvez, V., Carrillo, J.A., Hoffmann, F.: The geometry of diffusing and self-attracting particles in a one-dimensional fair-competition regime. Lecture Notes in Mathematics, vol. 2186. CIME Foundation Subseries, Springer (2018)
Calvez, V., Corrias, L.: The parabolic-parabolic Keller-Segel model in \({\mathbb{R}}^2\). Commun. Math. Sci. 6(2), 417–447 (2008)
Carlen, E., Loss, M.: Competing symmetries, the logarithmic HLS inequality and Onofri’s inequality on \(S^n\). Geom. Funct. Anal. 2, 90–104 (1992)
Carrillo, J.A., Castorina, D., Volzone, B.: Ground states for diffusion dominated free energies with logarithmic interaction. SIAM J. Math. Anal. 47(1), 1–25 (2015)
Carrillo, J. A., Choi, Y.-P., Hauray, M.: The derivation of swarming models: mean-field limit and Wasserstein distances. Collective dynamics from bacteria to crowds, pp. 1–46, CISM Courses and Lectures, 553, Springer, Vienna, (2014)
Carrillo, J. A., Hittmeir, S., Jüngel, A.: Cross diffusion and nonlinear diffusion preventing blow up in the Keller-Segel model. Math. Models Methods Appl. Sci. 22(12), 1250041 (2012)
Carrillo, J.A., Hoffmann, F., Mainini, E., Volzone, B.: Ground states in the diffusion-dominated regime. Calc. Var. Partial Differ. Equ. 57, 57–127 (2018)
Carrillo, J.A., Lisini, S., Mainini, E.: Uniqueness for Keller-Segel-type chemotaxis models. Discrete Contin. Dyn. Syst. 34(4), 1319–1338 (2014)
Carrillo, J.A., McCann, R.J., Villani, C.: Kinetic equilibration rates for granular media and related equations: entropy dissipation and mass transportation estimates. Rev. Matemática Iberoamericana 19, 1–48 (2003)
Carrillo, J.A., McCann, R.J., Villani, C.: Contractions in the \(2\)-Wasserstein length space and thermalization of granular media. Arch. Ration. Mech. Anal. 179, 217–263 (2006)
Carrillo, J.A., Santambrogio, F.: \(L^\infty \) estimates for the JKO scheme in parabolic-elliptic Keller-Segel systems. Quart. Appl. Math. 76, 515–530 (2018)
Carrillo, J.A., Toscani, G.: Asymptotic \(L^1\)-decay of solutions of the porous medium equation to self-similarity. Indiana Univ. Math. J. 49, 113–141 (2000)
Chae, D., Tarantello, G.: On planar selfdual electroweak vortices. Ann. Inst. H. Poincaré Anal. Non Linéaire 21, 187–207 (2004)
Champion, T., Pascale, L.D., Juutinen, P.: The \(\infty \)-Wasserstein distance: local solutions and existence of optimal transport maps. SIAM J. Math. Anal. 40, 1–20 (2008)
Chen, L., Liu, J.-G., Wang, J.: Multidimensional degenerate Keller-Segel system with critical diffusion exponent \(2n/(n+2)\). SIAM J. Math. Anal. 44, 1077–1102 (2012)
Chen, L., Wang, J.: Exact criterion for global existence and blow up to a degenerate Keller-Segel system. Doc. Math. 19, 103–120 (2014)
Chong, K.M.: Some extensions of a theorem of Hardy, Littlewood and Pólya and their applications. Canad. J. Math. 26, 1321–1340 (1974)
Craig, K.: Nonconvex gradient flow in the Wasserstein metric and applications to constrained nonlocal interactions. Proc. Lond. Math. Soc. 114, 60–102 (2017)
Craig, K., Kim, I., Yao, Y.: Congested aggregation via Newtonian interaction. Arch. Ration. Mech. Anal. 227, 1–67 (2018)
Diaz, J.I., Nagai, T., Rakotoson, J.-M.: Symmetrization techniques on unbounded domains: application to a chemotaxis system on \({ R}^N\). J. Differ. Equ. 145, 156–183 (1998)
Dolbeault, J., Esteban, M., Loss, M.: Rigidity versus symmetry breaking via nonlinear flows on cylinders and Euclidean spaces. Invent. Math. 206, 397–440 (2016)
Dolbeault, J., Esteban, M., Loss, M.: Symmetry and symmetry breaking: rigidity and flows in elliptic PDEs. arXiv:1711.11291 (2017)
Folland, G.B.: Introduction to partial differential equations, 2nd edn. Princeton University Press, Princeton (1995)
Fraenkel, L.E.: An introduction to maximum principles and symmetry in elliptic problems. Cambridge Tracts in Mathematics, 128. Cambridge University Press, Cambridge (2000)
Gagliardo, E.: Ulteriori proprietà di alcune classi di funzioni in più variabili. Ricerche Mat. 8, 24–51 (1959)
Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Classics in Mathematics. Springer-Verlag, Berlin (2001). Reprint of the 1998 edition
Hardy, G.H., Littlewood, J.E., Pólya, G.: Some simple inequalities satisfied by convex functions. Messenger Math. 58, 145–152 (1929). “Inequalities”, Cambridge University Press, 1952, 2d edn
Hittmeir, S., Jüngel, A.: Cross diffusion preventing blow-up in the two-dimensional Keller-Segel model. SIAM J. Math. Anal. 43(2), 997–1022 (2011)
Horstmann, D.: From 1970 until present: the Keller-Segel model in chemotaxis and its consequences. I. Jahresber. Deutsch. Math.-Verein. 105, 103–165 (2003)
Jäger, W., Luckhaus, S.: On explosions of solutions to a system of partial differential equations modelling chemotaxis. Trans. Amer. Math. Soc. 329, 819–824 (1992)
Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker–Planck equation. SIAM J. Math. Anal. 29, 1–17 (1998)
Kawohl, B.: Rearrangements and Convexity of Level Sets in PDE. Lecture Notes in Mathematics, vol. 1150. Springer, Berlin (1985)
Kawohl, B.: Continuous symmetrization and related problems. Differential equations (Xanthi, 1987), pp. 353–360, Lecture Notes in Pure and Appl. Math. 118, Dekker, New York (1989)
Kawohl, B.: Symmetrization—or how to prove symmetry of solutions to a PDE. Partial differential equations (Praha, 1998), 214–229, Chapman & Hall/CRC Res. Notes Math., 406, Chapman & Hall/CRC, Boca Raton, FL (2000)
Keller, E.F., Segel, L.A.: Initiation of slime mold aggregation viewed as an instability. J. Theor. Biol. 26, 399–415 (1970)
Kesavan, S.: Symmetrization and applications. Series in Analysis, 3. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ (2006)
Kim, I., Yao, Y.: The Patlak–Keller–Segel model and its variations: properties of solutions via maximum principle. SIAM J. Math. Anal. 44, 568–602 (2012)
Kolokolnikov, T., Sun, H., Uminsky, D., Bertozzi, A.: Stability of ring patterns arising from 2d particle interactions. Phys. Rev. E 84, 015203 (2011)
Kowalczyk, R.: Preventing blow-up in a chemotaxis model. J. Math. Anal. Appl. 305(2), 566–588 (2005)
Lieb, E.H., Loss, M.: Analysis. Graduate Studies in Mathematics, 14. American Mathematical Society, Providence, RI (1997)
Lieb, E.H., Yau, H.T.: The Chandrasekhar theory of stellar collapse as the limit of quantum mechanics. Commun. Math. Phys. 112(1), 147–174 (1987)
Lions, J.-L.: Quelques méthodes de résolution des problèmes aux limites non linéaires. (French) Dunod; Gauthier-Villars, Paris (1969)
Lions, P.-L.: The concentration-compactness principle in the calculus of variations. The locally compact case. I. Ann. Inst. H. Poincaré Anal. Non Linéaire 1, 109–145 (1984)
Luckhaus, S., Sugiyama, Y.: Asymptotic profile with the optimal convergence rate for a parabolic equation of chemotaxis in super-critical cases. Indiana Univ. Math. J. 56, 1279–1297 (2007)
Mogilner, A., Edelstein-Keshet, L.: A non-local model for a swarm. J. Math. Biol. 38, 534–570 (1999)
Mogilner, A., Edelstein-Keshet, L., Bent, L., Spiros, A.: Mutual interactions, potentials, and individual distance in a social aggregation. J. Math. Biol. 47, 353–389 (2003)
Morale, D., Capasso, V., Oelschläger, K.: An interacting particle system modelling aggregation behavior: from individuals to populations. J. Math. Biol. 50, 49–66 (2005)
Morgan, F.: A round ball uniquely minimizes gravitational potential energy. Proc. Amer. Math. Soc. 133, 2733–2735 (2005)
Mossino, J., Rakotoson, J.-M.: Isoperimetric inequalities in parabolic equations. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 13, 51–73 (1986)
Nirenberg, L.: On elliptic partial differential equations. Ann. Scuola Norm. Sup. Pisa 13, 115–162 (1959)
Oelschläger, K.: Large systems of interacting particles and the porous medium equation. J. Differ. Equ. 88, 294–346 (1990)
Otto, F.: The geometry of dissipative evolution equations: the porous medium equation. Commun. Part. Differ. Equ. 26, 101–174 (2001)
Patlak, C.S.: Random walk with persistence and external bias. Bull. Math. Biophys. 15, 311–338 (1953)
Sire, C., Chavanis, P.-H.: Critical dynamics of self-gravitating Langevin particles and bacterial populations. Phys. Rev. E 78, 061111 (2008)
Ströhmer, G.: Stationary states and moving planes. In: Parabolic and Navier-Stokes equations. Part 2, Banach Center Publ. 81, 501–513 (2008)
Sugiyama, Y.: The global existence and asymptotic behavior of solutions to degenerate quasi-linear parabolic systems of chemotaxis. Differ. Int. Equ. 20, 133–180 (2007)
Talenti, G.: Elliptic equations and rearrangements. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 3, 697–718 (1976)
Talenti, G.: Inequalities in rearrangement invariant function spaces. Nonlinear analysis, function spaces and applications, Vol. 5 (Prague, 1994). 177–230, Prometheus, Prague (1994)
Talenti, G.: Linear elliptic p.d.e.’s: level sets, rearrangements and a priori estimates of solutions. Boll. Un. Mat. Ital. B 4, 917–949 (1985)
Topaz, C.M., Bertozzi, A.L., Lewis, M.A.: A nonlocal continuum model for biological aggregation. Bull. Math. Biol. 68, 1601–1623 (2006)
Vázquez, J.L.: Symétrisation pour \(u_t=\Delta \varphi (u)\) et applications. C. R. Acad. Sc. Paris 295, 71–74 (1982)
Vázquez, J.L.: Symmetrization and mass comparison for degenerate nonlinear parabolic and related elliptic equations. Adv. Nonlinear Stud. 5, 87–131 (2005)
Vázquez, J.L.: The Porous Medium Equation. Mathematical theory. Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, Oxford (2007)
Vázquez, J.L., Volzone, B.: Symmetrization for linear and nonlinear fractional parabolic equations of porous medium type. J. Math. Pures Appl. 101, 553–582 (2014)
Vázquez, J.L., Volzone, B.: Optimal estimates for fractional fast diffusion equations. J. Math. Pures Appl. 103, 535–556 (2015)
Acknowledgements
JAC was partially supported by the Royal Society through a Wolfson Research Merit Award and by the EPSRC Grants EP/K008404/1 and EP/P031587/1. SH acknowledges support by the Austrian Science Fund via the Hertha-Firnberg Project T-764, and previous funding by the Austrian Academy of Sciences ÖAW via the New Frontiers project NST-0001. BV is partially supported by the INDAM-GNAMPA Project 2015 “Proprietà qualitative di soluzioni di equazioni ellittiche e paraboliche”, by GNAMPA of the Italian INdAM (National Institute of High Mathematics), and by the “Programma triennale della Ricerca dell’Università degli Studi di Napoli “Parthenope” - Sostegno alla ricerca individuale 2015–2017” (Italy). YY was partially supported by the NSF Grants DMS-1565480 and DMS-1715418. YY wants to thank Almut Burchard for a helpful discussion.
Appendix
Theorem 4.21
(Dunford–Pettis Theorem) Let \((X,\Sigma , \mu )\) be a probability space and \({\mathcal F}\) be a bounded subset of \(L^1(\mu )\). Then \({\mathcal F}\) is equi-integrable if and only if \({\mathcal F}\) is a relatively compact subset in \(L^1(\mu )\) with the weak topology.
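To illustrate what equi-integrability rules out, the following standard example may help. It is not taken from the paper: the choice of \(X=[0,1]\) with Lebesgue measure \(\mu \) and the names \(f_n\), \(A_n\) are assumptions made purely for illustration. It shows a bounded family in \(L^1(\mu )\) that concentrates its mass and therefore admits no weakly convergent subsequence.

```latex
% Illustrative example (assumption: X = [0,1] with Lebesgue measure \mu).
\[
  f_n := n\,\mathbf{1}_{[0,1/n]}, \qquad
  \|f_n\|_{L^1(\mu)} = 1 \quad \text{for all } n \ge 1 .
\]
% Not equi-integrable: the sets A_n = [0,1/n] have vanishing measure
% but carry the whole mass of f_n.
\[
  \mu(A_n) = \tfrac{1}{n} \to 0, \qquad \int_{A_n} f_n \, d\mu = 1 .
\]
% No weak L^1 limit along any subsequence: testing against
% g = \mathbf{1}_{[a,1]} with a > 0 forces a weak limit f to vanish a.e.,
% while testing against g = 1 would force \int f \, d\mu = 1.
\[
  \int_a^1 f_n \, d\mu = 0 \ \text{ for } n > 1/a
  \quad \Longrightarrow \quad f = 0 \ \text{a.e.},
  \qquad \text{but} \qquad
  \int_0^1 f_n \, d\mu = 1 \not\to 0 = \int_0^1 f \, d\mu .
\]
```

By contrast, any family dominated by a single integrable function is equi-integrable, which is the kind of situation in which the theorem yields weak compactness.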
Lemma 4.22
Let \((f_\varepsilon )\) be a sequence of non-negative functions uniformly bounded in the space \(L^\infty (0,T;L^1_{log}({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2))\) with \(\Vert f_\varepsilon \Vert _{L^1({\mathbb {R}}^2)}=\Vert f\Vert _{L^1({\mathbb {R}}^2)}=M\). Assume moreover that \(f_\varepsilon \rightarrow f\) a.e. in \({\mathbb {R}}^2\times (0,T)\). Then, \(f\in L^\infty (0,T;L^1_{log}({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2))\) and \(f_\varepsilon \rightarrow f\) strongly in \(L^\infty (0,T;L^1({\mathbb {R}}^2))\).
The same result holds by replacing the logarithmic moment by the second moment, i.e., by replacing \(L^1_{log}({\mathbb {R}}^2)\) by \(L^1((1+|x|^2)dx)\) everywhere.
Proof
A similar argument was used in the proof of Proposition 2.1 in [52]. First observe that, by Fatou's lemma, for any \(m>1\)
Now let \(L>1\) and \(g_\varepsilon =\min \{f_\varepsilon ,L\}\). Then \(g_\varepsilon \rightarrow g=\min \{f,L\}\) a.e. Moreover, let \(R>1\). Then, by dominated convergence, the following holds for sufficiently small \(\varepsilon >0\) on the ball \(B_R(0)\):
and we obtain
Using additionally the confinement of mass from the bound on the log-moment, we obtain
Since \(L>1\) is arbitrary and \(m>1\), this shows that \(f_\varepsilon \rightarrow f\) strongly in \(L^\infty (0,T;L^1({\mathbb {R}}^2))\). The proof in case we replace \(L^1_{log}({\mathbb {R}}^2)\) by \(L^1((1+|x|^2)dx)\) is done analogously. \(\square \)
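The displayed estimates of this proof are not reproduced above. As a hedged sketch only, and not the authors' own formulas, an argument of this type can be organized as a splitting into a truncation error, a local term and a tail. The sketch assumes a fixed time \(t\) at which \(f_\varepsilon (t,\cdot )\rightarrow f(t,\cdot )\) a.e., and assumes that the weight defining \(L^1_{log}({\mathbb {R}}^2)\) is \(\log (1+|x|^2)\); how uniformity in time is obtained is not reconstructed here.

```latex
% Sketch at a fixed time t, with g_\varepsilon = \min\{f_\varepsilon, L\}
% and g = \min\{f, L\}.
% Assumption: the L^1_{log} weight is \log(1+|x|^2).
\begin{align*}
  \|f_\varepsilon - f\|_{L^1(\mathbb{R}^2)}
  \le{}& \|f_\varepsilon - g_\varepsilon\|_{L^1(\mathbb{R}^2)}
       + \|g_\varepsilon - g\|_{L^1(B_R(0))} \\
       &+ \|g_\varepsilon - g\|_{L^1(\mathbb{R}^2 \setminus B_R(0))}
       + \|g - f\|_{L^1(\mathbb{R}^2)} .
\end{align*}
% Truncation error: on \{f_\varepsilon > L\} one has
% f_\varepsilon \le L^{1-m} f_\varepsilon^m because m > 1.
\[
  \|f_\varepsilon - g_\varepsilon\|_{L^1(\mathbb{R}^2)}
  = \int_{\{f_\varepsilon > L\}} (f_\varepsilon - L)\,dx
  \le L^{1-m} \int_{\mathbb{R}^2} f_\varepsilon^m \,dx
  \le L^{1-m}\, \|f_\varepsilon\|_{L^\infty(\mathbb{R}^2)}^{m-1}\, M .
\]
% Tail: 0 \le g_\varepsilon \le f_\varepsilon and 0 \le g \le f, then
% Chebyshev's inequality with the logarithmic weight.
\[
  \|g_\varepsilon - g\|_{L^1(\mathbb{R}^2 \setminus B_R(0))}
  \le \int_{\{|x| > R\}} (f_\varepsilon + f)\,dx
  \le \frac{1}{\log(1+R^2)}
      \int_{\mathbb{R}^2} (f_\varepsilon + f)\,\log(1+|x|^2)\,dx .
\]
```

The term \(\Vert g-f\Vert _{L^1({\mathbb {R}}^2)}\) is bounded in the same way as the truncation error, using the Fatou bound on \(\int f^m\,dx\). The local term on \(B_R(0)\) vanishes as \(\varepsilon \rightarrow 0\) by dominated convergence, since \(0\le g_\varepsilon \le L\) there. With the second moment in place of the logarithmic moment, the tail factor \(1/\log (1+R^2)\) would be replaced by \(1/R^2\).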
For the proof of the following Dubinskii Lemma we refer to [30] or Theorem 12.1 in [66]:
Lemma 4.23
Let \(\Omega \subset {\mathbb {R}}^2\) be bounded with \(\partial \Omega \in C^{0,1}\) and let \(\{f_\varepsilon \}\), \(0<\varepsilon <1\), satisfy
for some \(p\ge 1\), \(q\ge 1\) and \(s\ge 0\). Then \(\{f_\varepsilon \}\) is relatively compact in \(L^{pl}(0,T;L^r(\Omega ))\) for any \(r<\infty \) and \(l<q\).
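The displayed hypothesis of the lemma is missing above and is not reconstructed here. The exponent \(pl\) in the conclusion can still be motivated by a short computation. As an assumption made only for illustration, suppose that \(f_\varepsilon \ge 0\) and that the power \(f_\varepsilon ^p\) is bounded in \(L^q(0,T;H^1(\Omega ))\) uniformly in \(\varepsilon \), a type of bound that is common in Dubinskii-type lemmas.

```latex
% Assumption (illustrative, not quoted from the lemma):
% f_\varepsilon \ge 0 and \|f_\varepsilon^p\|_{L^q(0,T;H^1(\Omega))} \le C
% uniformly in \varepsilon.
% In two dimensions, H^1(\Omega) embeds into L^r(\Omega) for every
% r < \infty on a bounded Lipschitz domain, with embedding constant C_r.
\[
  \|f_\varepsilon\|_{L^{pq}(0,T;L^{pr}(\Omega))}^{p}
  = \Big( \int_0^T \Big( \int_\Omega f_\varepsilon^{pr}\,dx \Big)^{q/r} dt \Big)^{1/q}
  = \|f_\varepsilon^{p}\|_{L^{q}(0,T;L^{r}(\Omega))}
  \le C_r\, \|f_\varepsilon^{p}\|_{L^{q}(0,T;H^{1}(\Omega))} .
\]
```

Under this assumption \(\{f_\varepsilon \}\) is bounded in \(L^{pq}(0,T;L^{pr}(\Omega ))\) for every \(r<\infty \), so compactness in \(L^{pl}(0,T;L^r(\Omega ))\) with \(l<q\) sits just below the borderline time exponent \(pq\).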
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Carrillo, J.A., Hittmeir, S., Volzone, B. et al. Nonlinear aggregation-diffusion equations: radial symmetry and long time asymptotics. Invent. math. 218, 889–977 (2019). https://doi.org/10.1007/s00222-019-00898-x
DOI: https://doi.org/10.1007/s00222-019-00898-x