Beginner’s guide to Aggregation-Diffusion Equations

The aim of this survey is to serve as an introduction to the different techniques available in the broad field of Aggregation-Diffusion Equations. We aim to provide historical context, key literature, and main ideas in the field. We start by discussing the modelling and famous particular cases: Heat equation, Fokker-Plank, Porous medium, Keller-Segel, Chapman-Rubinstein-Schatzman, Newtonian vortex, Caffarelli-V´azquez, McKean-Vlasov, Kuramoto


Introduction
The aim of this survey is to serve as a general introduction to the many models and techniques used to study the Aggregation-Diffusion family of equations ∂ρ ∂t = div ρ∇(U ′ (ρ) + V + W * ρ) . (ADE) This survey does not intend to provide an encyclopedic presentation including all possible references and results in maximal generality (this would be impossible), but to give newcomers a general overview of the subject.

Modelling
Let us discuss first the modelling, and then we devote Section 3 to several modelling problems which belong to this family.On the one hand, diffusion is usually modelled as a conservation law.

Continuity equations
Let ρ be a density and ω ⊂ R d any control volume, if F is the out-going flux Hence we arrive at the continuity equation For the transport of heat, we use A generalisation of this problem which is used to the flow a gas through porous media.We derive the model as mass balance by writing the flux in terms of a velocity v, i.e., F = −ρv; we use Darcy's law to relate the velocity with the pressure p as v = − k µ ∇p; and lastly we use a general state equation p = ϕ(ρ), where for perfect gases ϕ(ρ) = p 0 ρ γ .Eventually, we recover that F = −∇Φ(ρ) for some non-decreasing Φ : R → R, and this yields the so-called Porous Medium Equation It is common to select Φ(ρ) = ρ m for m > 0. The most complete reference for this equation is [Váz06b].
Notice ∆Φ(ρ) = div(Φ ′ (ρ)∇ρ) so we take (2. 3) The classical modelling also allows for a transport velocity field v(t, x) to be included.For example, if a pollutant is being transported by water, then v(t, x) is the velocity of the water.The flux of mass being moved is recovered from the density and the velocity as ρv.A general flux then looks like F = −∇Φ(ρ) + ρv.
Remark 2.1.We do not discuss the case of sign-changing solutions.In that setting u m must be replaced by |u| m−1 u.

Particle systems
We can also understand transport from the particle perspective.Consider a known velocity field v(t, x).Consider N particles with positions X i (t) of equal masses 1/N moving in the velocity field Notice that the evolution of the particles is decoupled.To recover a PDE we consider the so-called empirical distribution defined as where δ x is the Dirac delta at a point x.First, notice that e., φ(0, x) = 0 and for T large enough φ(t, x) = 0 for t ≥ T ) we integrate in time to recover Integrating by parts, we observe that this is the distributional formulation of the continuity equation Aggregation and Confinement can be modelled by considering interacting particles where the velocity field is aware of other particles The first term models the interactions between particles, with attract/repel each other as a function of their relative position, and the second term models a field attracting/repelling every particle.Noticing that we deduce from the previous reasoning that the empirical measure µ N is a distributional solution of the Aggregation-Confinement Equation ∂ t µ = div(µ∇(W * µ + V )). (2.5) Linear diffusion can be added to the particle system by introducing noise and working with Stochastic ODEs and a mean-field approximation (see Section 3.8).
As we will see below, special attention has been paid to the case V = 0 which is simply called Aggregation Equation ∂ρ ∂t = div(ρ∇W * ρ).

(AE)
As a toy model we will also discuss the confinement problem ∂ρ ∂t = div(ρ∇V ). (CE) Joining the many-particle approximation for aggregation with the Porous Medium diffusion we recover (ADE).If we look for solutions defined in the whole space, then it is natural to consider only the case of finite mass.Hence, we assume ρ ∈ L 1 (R d ) and, in some sense, ρ t (x) → 0 as |x| → ∞. (2.6) 2.3 Boundary conditions in bounded domains of R d The problems above are often posed in the whole space R d .However, sometimes it is convenient to restrict them to bound domains, specially for the numerical analysis.Here and below we will denote by Ω an open and bounded set of R d with a smooth boundary.Although it is possible to study these kinds of problems in less smooth domain (Lipschitz boundary, interior point conditions, etc. . . ) this introduces additional difficulties we will not discuss.Since with this kind of equation we intend to model systems that conserve mass, there are two natural approaches.

No-flux conditions
We can stablish that no mass will cross the boundary of Ω, ∂Ω.Going back to (2.1), a natural way is to establish no-flux conditions F • n = 0 on ∂Ω.With our choice of flux this means where again ρ 0 is known, and we have to be a little careful with the notion of convolution when ρ is only defined in Ω.For this, we consider the extended notion W * ρ ≡ W * ρ, where ρ = ρ in Ω, 0 elsewhere.

Periodic boundary conditions
Another approach is to stablish that any mass that crosses the boundary will "come back" on the opposite side.This happens, for example, if all data are periodic.Some papers have studied periodic solutions defined over the unit cube [0, 1] d (or, equivalently, the torus T d ).In d = 1 this is equivalent to studying solutions defined over the unit circle S 1 .This is the case of the Kuramoto model described below.An example of the case d = 2 can be found in [CGPS20].

Free energy
We conclude this section by introducing a functional which will be very useful below.We will show how (ADE) is related to the free energy F(ρ) = ˆΩ U (ρ) + ˆΩ V ρ + 1 2 ¨Ω×Ω W (x − y)ρ(x)ρ(y) dx dy, (FE) whereas (ADE * ) is related to the more general free energy (2.9)This is called first variation of the free energy, and we will explain its importance below in Section 4.2.4.Hence, (ADE * ) can be re-written as ∂ρ ∂t = div ρ∇ δF δρ (2.10) for F given by (FE).
Occasionally, it will be convenient for us to use the free energies of each of the terms.Hence, we define (2.11)

A comment on mass conservation
Our evolution problem need, of course, an initial state ρ 0 .Throughout this survey we will usually restrict ourselves non-negative unit mass initial data, i.e., ˆRd ρ 0 (x) = 1.(U 0 ) All the equations presented above are of "conservation type".This means we expect preservation of mass, i.e., that provided unit initial mass (U 0 ) then there will be non-negative unit initial mass for all times, i.e., ˆRd ρ t = 1, ∀t > 0. (U) 3 Famous particular cases

Heat equation and Fokker-Plank
As explained above, the heat equation (HE) is the most classical example in our family of equations.It corresponds to (ADE) with the choices U = ρ log ρ and V = W = 0.In R d It is well known that (HE) admits the Gaussian solution All solutions (with suitable initial data) are precisely of the form and it is not difficult to check that A good survey on the matter can be found in [Váz17].
The fundamental solution K t is of the so-called self-similar type In this expression, F is usually called self-similar profile and σ(t), A(t) the scaling parameters.Due to mass conservation the natural choice is A(t) = σ(t) −d .Plugging (SS) into (HE) and setting the total mass ´Kt = 1 yields the Gaussian.Since (HE) is linear, M K t is a solution of mass M .Usually, the self-similar solution has σ(t) increasing, and it only makes sense to consider σ(t) ≥ 0. Hence, we make the choice σ(0) = 0 that corresponds to K t = δ 0 , a Dirac delta at 0. This type of solutions with ρ 0 = δ 0 are called source-type solutions.
This nice convolutional representation is not available in more general settings.There are several other ways to study the asymptotic behaviour.One of the key tricks is the study of the self-similar change of variable ρ(x, t) = u(y, τ which leads to the so-called linear Fokker-Planck equation This equation corresponds to 2 and W = 0. Notice that stationary states can be obtained as solutions of the equation for the flux log u + |x| 2 2 = −h.This yields the Gaussian profile The constant A is the unique value such that ´ρt = M .In original variable, we recover M K t .

Porous medium equation
As presented in the introduction, the most common example of (PME Φ ) is the Porous Medium Equation The range m ∈ (0, 1) is sometimes called Fast Diffusion Equation.The (PME) corresponds, for m ̸ = 1, to Notice that as m → 1 we have ρ m−1 /(m − 1) tends to log ρ and hence U m tends to U 1 .Notice that, due to the homogeneity of the equation, after rescaling space and time, we can preserve the equation (without introducing constants) and can work with (U 0 ).
Similarly to the heat equation, a suitable change of variable (detailed below in Section 5.1.1)converts (PME) into This corresponds to U = U m and V = |x| 2 2 and W = 0. (PME-FP) admits a stationary solution given by the so-called Barenblatt profiles where C ≥ 0. For m > 1 they have compact support, and for m ∈ (0, 1) they have power-like tails.To check whether finite-mass states are available we compute ˆRd Hence, finite mass steady states are only when m > d−2 d .This number is critical, as well will see below.Undoing the change of variables we arrive at the self-similar solution This solution is usually denoted ZKB after Zel'dovich and Kompaneets, and Barenblatt (see [Váz06b] and the references therein).Usually this is shortened to Barenblatt.When m > d−2 d there is a unique constant C so that (U).It can be explicitly computed (see [Váz06b,Section 17.5]).
As we let m ↘ d−2 d we get C ↗ ∞, so for m = d−2 d we expect B t = δ 0 .In fact, Brezis and Friedman [BF83] showed that if m ≤ d−2 d and ρ ) are a sequence of initial data converging to δ 0 , then the corresponding solutions converge to ρ (j) (t, x) → δ 0 (x) ⊗ 1(t).Therefore, very fast diffusion cannot diffuse a δ 0 .This is because the diffusion coefficient is mρ m−1 .For m ≪ 1 this means that small values of ρ are indeed diffused fast (hence the name of "fast diffusion").However, in regions where ρ is large the diffusion is slow.
It is also worth pointing to the equivalent formulation in the so-called pressure variable u = U ′ (ρ).We recover the Hamilton-Jacobi type equation where we recall the relation (2.3).Notice that Φ ′ (ρ) can be written as a non-linear function of u.This formulation is convenient when m > 1 since u = m m−1 ρ m−1 .This presentation is much better for the study of viscosity solutions (see [CV99;BV05]).
First results of asymptotic behaviour for m > 1 were obtained by regularity, self-similarity, and comparison arguments in the late 70s and early 80s (see [FK80] and the references therein).Self-similar analysis and relative entropy arguments were later developed to improve the theoretical understanding (see Sections 5.1.1 and 5.2.3).For a detailed explanation of the theory see [Váz06a;Váz06b].Remark 3.1.(HE) and (PME) are "diffusive".This can be seen in different directions.For example, at a point (t 0 , x 0 ) of local maximum of ρ we get ∆ρ m ≤ 0 and hence ∂ρ ∂t ≤ 0, i.e., the maximum values decays.This is strongly related to the maximum principle.The converse happens to the minimum if it exists (notice that non-negative solutions in R d will tend to 0 as |x| → ∞).
Remark 3.2.Let us recall some "numerological values".The value m = 1 − 2 d is the critical value for finitemass Barenblatt.It is also be the critical value below which there is finite-time extinction (see Section 4.2.1).The (3.2) has finite λ-moment, i.e., ˆRd In Section 4.3.4we will present a third relevant value m = 1 − 1 d coming from the Wasserstein gradient-flow analysis.
Remark 3.3.We will not discuss here the limit m → 0. In this direction there are two possibilities.On the one hand, some authors have discussed the rescaled limit u m m → log u, and we arrive at the logarithmic diffusion equation ∂ t u = ∆ log(u).We point to [Váz06a, Section 8.2] for a discussing on this topic and [Váz06a, Section 8.4] for historical notes.On the other hand, some authors choose to look at the limit |u| m−1 u → sign(u) leads to the so-called Sign Fast Diffusion Equation ∂ t u = ∆ sign(u).Some results in this direction are available [BF12].

Transport problem with known potential
To understand the transport terms in the (ADE) family, we will discuss the toy problem where V is known and does not depend on t.Imagine that ρ 0 ≥ 0 is radially symmetric function.Computing the balance of mass of a ball B r we get If ∇V • x ≤ 0 all balls B r are "loosing mass", i.e., then mass if always flowing "away" from 0 and towards infinity.Hence, we can think of these kinds of potentials as "diffusive" (but not as strongly as in the sense of Remark 3.1).We will show below by explicitly solving these problems that potentials such that ∇V • x ≥ 0 will attract mass towards 0, asymptotically (or even instantaneously) creating a singularity.

Keller-Segel model
The Keller-Segel [KS70] (or Patlak-Keller-Segel model [Pat53]) model for describes the motion of cells by chemotactical attraction by means of the coupled system (KSE) This model was first studied for d = 2.Some authors replace the second equation by a more general evolutionary problem ε ∂Vt ∂t − ∆V t + αV t = ρ t .We will discuss (KSE), which can be thought as the limit ε, α → 0. This simplification was first introduced by [Nan73] to model the case where the diffusion of the chemical is much faster than its production.
So far, this model is not in the (ADE) family.However, the can analysis the equation for V t to recover some information.In R d we can use the Fourier transform F for the elliptic problem to recover and hence we can solve and recover In fact, we have the closed expression (3.4) Notice that −∆W N = δ 0 .The so-called Newtonian potential corresponds to N = −W N .Eventually, we can write this problem as (ADE) where U = ρ log ρ, V = 0 and W = −W N .
We can think about the convolution term W * ρ t as an aggregation potential.The justification is the following.If ρ t is non-increasing, then V t = W * ρ t is also non-increasing, and hence ∇V t • x ≤ 0. Going back to Section 3.3, this is what we called aggregation potentials.Unfortunately, we do not have in general the sign ∇V t • x ≤ 0 for all t, x > 0. In fact, (KSE) does not "prefer" x = 0 from other points.
Changing the sign of the aggregation to ∂ρ ∂t = ∆ρ + div(ρ∇V t ) makes the problem more regularising.In [Bil92] the author proves global well-posedness of that problem for d = 3 in a bounded domain with no-flux condition was shown to have.
Sometimes it is convenient to express the problem in a single equation using operator notation: When written in this form, the authors usually allow for any initial mass ∥ρ 0 ∥ L 1 .We can rescale to impose (U 0 ).If we let χ = ∥ρ 0 ∥ L 1 and u(t, x) = ρ(t, x)/χ we arrive at the equation In bounded domains some authors introduce Neumann boundary conditions ∂u ∂n = ∂V ∂n = 0.The authors of [JL92] extend the problem to −∆V = βu − µ which a few authors later called Jäger-Luckhaus problem (e.g., [Sen05]).This is equivalent to replacing (−∆) −1 by (−∆ + id) −1 .Some authors later introduced the Keller-Segel model with Porous Medium Diffusion.One of the model arguments is that m > 1 would deal with "over-crowding" (i.e., the formation of Dirac deltas).We arrive at the problem ∂ρ ∂t = ∆ρ m − χ div(ρ∇(−∆) −1 ρ).
(KSE m,χ ) This model has the advantage of not having finite time blow-up (see [CC06]).Existence and uniqueness of solutions for m > 1 and a family of W covering W N can be found in [BRB11].Some authors replace (−∆) −1 by (−∆ + γid) −1 , see, e.g., [SK06; LS07; Sug07a; Sug07b].As we will discuss below, (KSE m,χ ) admits global solutions in the sub-critical range m > 2 − 2 d .However, for m < 2 − 2 d there is finite-time blow-up.In the critical case blow-up depends on χ.Curiously enough, the first case studied was d = 2 and m = 1, which is critical.A good explanation of this dichotomy is the free-energy minimisation argument which we will provide in Section 6.3.

Chapman-Rubinstein-Schatzman
In the context of superconductivity [CRS96] introduced a mean-field model for "the motion of rectilinear vortices in the mixed state of a type-II semiconductor".They argue this is the limit where the Ginzburg-Landau parameter κ tends to +∞ and the number of vortices becomes large one recovers the problem (CRSE) The model was first introduced in R 2 .Then, the authors also make a particularisation to a bounded domain Ω where they introduce Dirichlet conditions for V t (x) = h 0 on ∂Ω, and flux conditions setting the number of vortices crossing the boundary |ρ|∇v • n known.
In fact, when ρ ≥ 0 it becomes of our class (ADE) (or (ADE * ) in bounded domains).On R d we can solve as before to recover As long as ρ ≥ 0, this model corresponds to (ADE) where U, V = 0.If we work in a bounded domain Ω the homogeneous Dirichlet condition we recover (ADE * ) where U, V = 0 and K is the Green kernel of the equation.Notice that K is not a compactly supported in Ω × Ω.
There is extensive literature on this problem.First, there were different works applying PDE theory.First, [CRS96] proved existence of the problem posed in bounded domains.Then in [HS98] the authors work with non-negative ρ 0 and compactly supported, using generalised characteristics.We also point to [SS99] where the authors perform a vanishing viscosity analysis in bounded domains.
Later, some authors applied Optimal Transport methods, in particular the 2-Wasserstein gradient-flow approach.In [AS08] the authors worked in with non-negative probability measures in a bounded domain, i.e., P 2 (Ω).Later, the work was extended in [AMS11] to a bounded domain and signed measures.They extend the Wasserstein distance to signed measures and in fact they minimise a localised energy in the space of signed unit measures in R d to recover a problem of the form This sparked broader interest in the family of aggregation equations (AE).

Newtonian vortex problem
In [LZ99] the authors justify that a different limit from Gizburg-Landau equations leads to the problem without zero-order term Again, this can be set in R d or in a bounded domain (with suitable boundary conditions).
This problem also attracted several authors, who brought different approaches.[LZ99] the authors construct solutions by characteristics with the so-called "vortex-blob" method.In [MZ05] the authors prove existence of renormalised solutions.The authors of [BLL12] study positive and negative solutions.This is equivalent to studying non-negative solutions of the problems The W = −W N can also be seen of the problem W = W N but taken backwards in time.In fact, the authors prove that the + sign leads to asymptotic dispersion whereas − leads to blow-up.

The Caffarelli-Vázquez problem
In [CV11a], Caffarelli and Vázquez introduced a non-local porous medium-type equation given by where (−∆) s denotes the fractional Laplacian.In R d this operator is to the operational fractional power of the Laplacian, which can be rigorously defined as the operator with Fourier symbol |ξ| 2s .This problem is often called Caffarelli-Vázquez problem and is often written in the compact form This is a "diffusive" operator.To solve the fractional Laplace equation we take W = F −1 [|ξ| −2s ], and we recover the so-called Riesz potential The properties convolution properties of the Riesz potential where already well known in Harmonic Analysis.
There has been a lot of analysis for this equation: asymptotic analysis [CV11b], sign-changing case and selfsimilar solutions [BIK15], regularity theory [CSV13; CV15], and in [SV14] that the limit as s → 1 is (NVE).
In [LMS18] the authors construct a solution through the Wasserstein gradient-flow theory described in Section 4.3.

McKean-Vlasov and Kuramoto problems
The McKean-Vlasov process is given by the stochastic differential equation where a t , b t may depend on X t and its law, and B t is a Wiener process, i.e., we introduce Brownian motion.
We can extend this idea to a system of N particles driven by coupled McKean-Vlasov equations.If we consider that a t is constant, say a t = 2β −1 , and b t is given by the interaction we can arrive at dX Through arguments of mean-field theory (or propagation of chaos see, e.g., [JW17]), one arrives asymptotically as N → ∞ to (ADE) with linear diffusion U = βρ log ρ (see, e.g., [CGPS20]).This is the reason the problem is sometimes called McKean-Vlasov problem (see, e.g., [CGPS20]).However, this denotation is not universal (see, e.g., [CGYZ23]).
Y. Kuramoto [Kur75] introduced a model for synchronisation of chemical and biological oscillators, and it has been well-received by the community in neuroscience.It is a particular case of McKean-Vlasov problem in d = 1 where X (i) t models the phase of a coupled oscillator.Therefore, we take X (i) t ∈ [0, 2π) or, equivalently, periodic boundary condition.The particular choices are V (x) = −ω i the natural frequency of the system, and W (x) = sin(x).The resulting (ADE) is known as the Kuramoto model (KE) The case β −1 = 0 is also interesting.An interesting analysis of this problem without diffusion (i.e., β −1 = 0) can be found in [CCHKK14].

The neural networks of machine learning
A neural network is no more than a specific type of parametric function accepting an input vector x ∈ R d and returning y ∈ R. If the output is higher dimensional, then we can create a neural network for each output.In this context "training" corresponds to optimising the parameters to fit a set of prescribed data (supervised learning) or to achieve some objective (unsupervised learning).Neural networks are form by connecting so-called perceptron.A perceptron is a parametric function of the form where σ is called the activation function and A one-layer neural network is the linear combination of the output of N perceptrons, and can be expressed where The aim is, for N fixed, to find the best parameter to approximate a target function f , in the sense that The quadratic error can be replaced by more general functions.
In [FF22] the authors give a nice presentation of the following continuous formulation, i.e., limit as N → ∞.When can let ξ i := (w i , θ 1,i , θ 2,i ), Ξ(ξ, x) := wσ(θ 1,i x + θ 2,i ), and define the empirical distribution For convenience, define Ω as the set of admissible ξ.Then, the value of the neural network can be computed as We can rewrite the (3.5) to a minimisation problem over the set of empirical distributions given by (3.6).The limit as N → ∞ is now clear, we pass to the minimisation problem over P(Ω).To smoothen the problem the authors allow for a penalisation V (of type |ξ| 2 /2), which can be understood as trying to prevent overfitting.We land on the problem min Expanding the square, this is precisely (FE * ) for U = 0 and To find local minimums of this energy, it is natural over P(Ω), it is natural to consider the associated 2-Wasserstein gradient flow.As will show below, this leads to (ADE * ).
In [CCFF22] the authors generalise this result to problems with more than one layer.

Granular flow equation
Another popular example of the (ADE) family is the so-called granular flow equation.This phenomenom is usually modelled in phase space (t, x, v) where v is the velocity.In that model ρ(t, x, v) represents the phase-space distribution.We arrive at (ADE) if the ρ(0, x, v) does not depend on x.In this setting, U models the random interactions of the granules with their environment (a fluid or heat bath), V models friction, and W models inelastic collisions between granules with different velocities.For more details see [BCCP98; CMV06; Vil06].

The power-type family of Aggregation-Diffusion Equations
Many authors have devoted their attention to the power-type family of non-linearities given by U = U m , This family covers most of the relevant examples above.We introduce the associated free energies 4 Well-posedness frameworks After setting a model like the ones described in Section 3, and before any other analysis can take place, the mathematician immediately aims to solve the question of existence and uniqueness.For this, one needs a suitable framework in which to define what is a "solution".
The very first approach to a PDE is to consider the existence of so-called classical solutions.These are solutions that are smooth enough so that all the derivatives in the equation can be taken, and the equation is satisfied is all points in the domain.
The hope for classical solutions quickly vanishes for many singular problems (e.g., (PME) when m > 1).To deal with this difficulty, many authors in 20th century dedicated their efforts to the notion of weak solutions, with their derivatives taken in distributional sense.So the equation could simply be satisfied in almost all points, and additional reasonable conditions.We will make some comments on this in Section 4.1.
We devote Section 4.2 to present some good and bad expected properties in different setting.In particular, we will also make a brief stop in Section 4.1.6to discuss stationary solutions (i.e., solutions no evolving in time).When not even weak solutions may exist (like in the transport equation), the community turned towards Optimal Transport techniques, which we will discuss in Section 4.3.

The PDE approach
First, we discuss the more classical approach to understand "solutions" of (ADE).

Local PDEs
Non-linear diffusion problems of the general type were widely studied in the 20-th century.Through the work of many authors, different types of solutions have been studied: classical, weak, entropy, viscosity, . . .
Let us start by discussing the nicer problem given by non-degenerate parabolic problems.When Φ, E are smooth and we assume Φ is uniformly elliptic, in the sense that there exist constants such that existence, uniqueness, and maximum principle hold from the classical theory.The literature is extensive: in R d this issue was solved at the beginning of the twentieth century (see [LSU68]).This book already contains several results for bounded domains (see [LSU68, Chapter V.6 and V.7]).However, the development of regular solutions would leave room for later improvement.A good reference for Dirichlet boundary conditions is [DiB93].There is a series of works by [Ama90] for Neumann/Robin boundary conditions, although they provide very general ellipticity conditions which written for systems and are not easy to apply to specific case.There are later references which are more concise, e.g., [Yin05].
The theory of weak solution is based on the notion of weak (or distributional) derivative, and hence solutions are found in Sobolev spaces.This is typical of parabolic problems where Φ ̸ = 0.There are more general theory that allow for well-posedness even when Φ = 0: entropy solutions (see [Car99;KR03] and the references therein) and viscosity solutions (e.g., [CIL92] and the references therein) The theory of weak and energy solutions for (PME) is well presented in the book [Váz06b].Notice that for m > 1 we have Φ ′ (0) = 0, and we say that the diffusion is degenerate, and for m < 1 we have Φ ′ (0) = ∞, and we say the diffusion is singular.The first order problems U = 0 (or, equivalently, Φ = 0) are purely of the first order and hence may yield discontinuous solutions.
Remark 4.1.Occasionally, Hölder spaces are used in addition to Sobolev.For this, one can use the method of intrinsic scaling (see [DiB93;Urb08] and the references therein) Remark 4.2.Purely-diffusive problems like (PME) admit semigroup-type solutions.In the linear theory this is called Hille-Yosida theorem (see, e.g., [Bre10,Chapter 7]) and in the non-linear setting Crandall-Liggett theorem (see [CL71]).The semigroup is constructed as the limit τ → 0 of solutions of the implicit Euler scheme u 0 = ρ and u k+1 + τ (−∆u m k+1 ) = u k .To be precise, we take t k = kτ and for t ∈ [t k , t k+1 ) then we define Boundary conditions of homogeneous Dirichlet (i.e., u = 0) and Neumann (i.e., ∇u • n = 0) are fairly well understood in terms of existence and uniqueness (see, e.g., [LSU68]).In linear problems, even so-called Robin type boundary conditions ∇u • n + αu = 0.The operator Au = −∆u − div(u∇V ) with no-flux condition is monotone (i.e., ⟨Au, u⟩ ≥ 0) if ∥∇V ∥ L ∞ is small enough, so well-posedness follows from the Hille-Yosida theory.Non-linear diffusion problems are more difficult.Works for (KSE) and related problem usually take advantage of an auxiliary boundary condition for V t .
When there is no diffusion (i.e., U, W = 0) and V = −|x| 2 leads to mass trying to exit any ball B r .If work on a ball B R and we set no-flux condition ρ∇V • n = 0 on ∂B R , the result will be the formation of Dirac deltas on ∂B R .This can be properly understood by the optimal transport theory below.It points to a subtle trade-of between diffusion and aggregation.
Existence and uniqueness for (ADE * ) is only understood with (BC * ) if we assume (2.7) (see [CFG] for U = U m with m ∈ (0, 1)).The general case when (2.7) does not hold seems to be open.We provide a tricky example with no-diffusion in Remark 4.14.

Weak vs very weak solution
, multiply the equation and integrate one to recover Functions ρ that satisfy this equation are so-called weak solutions of the problem.Here, U (ρ) must have a gradient in some sense.In R d we can integrate by parts again in the diffusion term to recover These are the so-called very weak solutions to the problem.
If we work in a bounded domain Ω and consider the no-flux condition we can recover weak solutions.If we want to recover the no-flux condition from this weak formulation, we need to take φ not vanishing in the boundary, for example C ∞ c ((0, T ); C ∞ (Ω)).To integrate by parts again we would need that ∂ n Φ(ρ) = 0 on the boundary.This is only possible if we assume that V and K do not interfere with the boundary (2.7).

Solutions of (ADE * ) by approximation
We will not make any detailed analysis of Sobolev spaces let us make some comments that the equation would have if the solutions where smooth.We define Φ ′ (s) = sU ′′ (s) and Φ(0) = 0. To give a hint on the procedure: 1.For uniformly elliptic Φ ε , e.g., ) and E t (x) known, existence is done through the classical theory above.
2. Letting ε → 0 yields solutions of the degenerate/singular diffusion term.This is the approach of [Váz06b].
3. Take V κ , K κ smooth approximations of V and K. Then replace E t (x) by ρ∇(V + Kρ) by a fixed-point argument.
4. Pass to the limit in κ.
The order of steps 2 and 3 can be suitable exchanged depending on convenience.They depend on delicate a priori estimates.For two particular examples for global-in-time existence and uniqueness theory using this approach we point to [CHVY19] for m > 1, [CGYZ23] for m = 1, and [CFG] for m ∈ (0, 1) (the arguments in [Váz06b] make m > 1 simpler).For (AE) we point the reader to [LST] for a vanishing viscosity argument.

Push-forward solutions
When the velocity field v is known, the transport can be written as If v is smooth problems, then can be solved by characteristics.This problem can also be recovered trying to find solutions as generalised characteristics ρ t (X t (y)) = A t ρ 0 (y).This leads the set of ODEs the field of characteristics is the unique solution of this problem.We do not concern ourselves with A t for now.
A more elegant approach is to look for solutions of the Lagrangian formulation of the problem.ˆXt(A) For the details we recommend [ABS21, Lecture 16].This taps into a well-understood theory coming from the Optimal Transport world (we will come back to this in Section 4.3).Given a measure µ ∈ P(Ω) and a measurable map T : Ω → Ω we can define the push-forward of µ by T , which we denote T ♯ µ ∈ P(Ω), as In particular, we can rewrite as (4.4) There are many properties of ρ t that be derived directly from this structure and the analysis of the ODEs (4.3).

Solutions by characteristics when U = 0
If U, W = 0, then we can simply solve the de-coupled PDEs and set ρ t = (X t ) # ρ 0 .When V is of class C 2 , then a unique solution holds by the Picard-Lindelöf theorem.Furthermore, coming back to Section 3.3 if we have ∇V (x) so all the mass is moving towards 0. Since X t = 0 is a solution, when V ∈ C 2 mass will concentrate in infinite time.A posteriori, for (ADE) when U = 0 one can always define and arrive at a transport problem.
The problem (AE) is more involved.The specific case (CRSE) was done [HS98], where the author arrive at closed-form formulas for the characteristics.The author of [Lau07] proves global existence of solutions when W is good enough by a fixed-point argument with suitable a priori estimates in Sobolev spaces.This result later applied to construct local-in-time solutions by approximation [BCL09a; BB10; BLR11], where a key difficulty is to then decide if there is global existence or finite-time blow-up.Later works use characteristics to study blow-up behaviours (see, e.g., [BGL12]).

Stationary solutions
There is a particular type of solutions which will be of interest for the asymptotics.A stationary solution is a solution that does not evolve in time.We recall the equivalent writing (2.10).If ρ is a stationary solution, using φ = δF δρ [ ρ] as a test function yields Hence, we arrive at the equation for stationary states This means that in open connected sets where ρ > 0 then loc (where Φ ′ (s) = sU ′ (s) and Φ(0) = 0) and that ∇V, ∇W * ρ ∈ L 1 loc .This regularity is not always achievable.
Remark 4.4.When they are available, these kinds of solutions will typically be recovered from a minimisation problem.We will discuss particular examples in Section 6.3.
Remark 4.5.Notice that solving this problem when U = U m and V = V 2 the Barenblatt profile (3.2) (i.e., (ZKB) in re-scaled variables) is a solution of (ADE-S).

A priori estimates for smooth solutions
In this section we make some formal computations to understand properties of the solution of (ADE * ).We will mostly assume that ρ is smooth.However, due to weak lower semi-continuity of norms, the a priori estimates still usually hold after approximation.

Conservation of mass
Let us consider (ADE * ), and take a smooth set ω ⊂ R d .We can formally compute the balance of mass Hence, if we have the boundary conditions (BC) we have conservation of mass.In R d we will need that the decay as |x| → ∞ is fast enough.This is not always the case, as shown by the next result (see [BC81]).
Proposition 4.6.Let ρ be a solution of then there is finite-time extinction.
Proof.First let u 0 ∈ L q .Then, we compute for sufficiently good solutions that d dt using Sobolev's inequality where 2 * = 1 2 − 1 d .The equation d dt X = −CX α where α < 1 has finite-time extinction in the sense X T = 0. Thus, we have the finite-time extinction ˆRd ρ q → 0, t ↗ T.
Let ρ 0 ∈ L 1 (R d ).Take ε > 0 and an approximating sequence ρ This effect is known as "loss of mass through infinity", and is typical to the very fast diffusion.In some settings, this can be offset by the aggregation term.To prevent this loss of mass it suffices that ρ t is a tight family (a notion we discuss in the next section).

Mass escaping through infinity: tightness
In the previous section we have discussed the problem of mass escaping through infinity "as time passes".This phenomenon is also related to a difficult technical problem when working on P(R d ).We recall some classical results for working with probability measures.
First, let ρ n be a sequence of functions in P(R d ) ∩ L 1 (R d ).If they have an a.e.limit we know, at most, that ˆRd ρ ≤ lim inf But the equality needs to hold.Furthermore, let us take a sequence in µ n ∈ P(R d ).Since they are uniformly bounded measures, and ) by the Banach-Alaoglu theorem we have that the sequence µ n is weak-⋆ pre-compact, and take µ an accumulation point.It needs not lie in P(R d ).
To ensure that the limit is a measure we need to add some information.Prokhorov's theorem (see, e.g., [AGS05, Theorem 5.1.3])states that a family of measures is pre-compact in P(R d ) if and only it is tight.Definition 4.7.A family of probability measures (µ a ) a∈A is tight if, for every ε > 0 there exists This means, informally speaking, that the tails of µ n hold "uniformly small" mass.
Remark 4.8.To show that a family of probability measures (µ a ) a∈A is tight it suffices to show that it is uniformly integrable against a suitable function.Following [AGS05, Remark 5.1.5],it suffices to find φ : R d → [0, ∞] such that its sublevel sets (i.e., {x ∈ R d : φ(x) ≤ c} for c ∈ R) are compact, and we have Very typical examples is φ(x) = |x| p with p ≥ 1.The quantity ´Rd |x| p dµ is called the p-th order moment of µ.
It is clear that (4.6) can follow for minimisers (FE) if V is good enough.We point out the following estimate that allows to use W for the same purpose.

An L p estimate
We can proceed similarly to the approach in (4.5) in more general settings.If we drop the term coming from the diffusion, then we get Integrating by parts one final time leads to d dt , we assume good decay of ρ p ∇(V + Kρ) and ρ p ∇ δF δρ then the boundary term vanish and we can write an inequality for equation of the form However, when we work in Ω with (BC * ) and we take ω = Ω, then the last term vanishes.However, we cannot relate the boundary term ρ p ∇(V + Kρ) • n(x) to the L p norm of ρ.In some cases it can be compensated by the negative diffusion term, but not in general.This is one of the reasons we assume (2.7) on the boundary.

Free-energy dissipation
Definition 4.10 (First variation).Let X be a normed space, F : X → R and fix ρ 0 ∈ X.We say F it is Gateaux differentiable at X if for all φ ∈ X there exists The function dF is usually called first variation of F. Some authors extend this definition and require just that the limit exists for φ in a dense subset of X. Functionals that admit first variation are called Gateaux differentiable.
When X ⊂ L 1 (Ω), the first variation can sometimes be represented by a function If this is the case, we denote δF δρ [ρ 0 ] := f 0 .It is easy to check formally that, when F is given by (FE * ) and K(x, y) = K(y, x), then (2.9).In many of the examples the first variation can be rigorously computed.
If K(x, y) = K(y, x) then the solutions of (ADE * ) admits (FE * ) as Lyapunov functional.This functional is usually called free energy.For smooth solutions of (ADE * ) (with sufficiently good decay as |x| → ∞ in unbounded domains) we can compute When Ω = R d (with sufficiently good decay of ρ∇ δF δρ [ρ t ]) or we have the no-flux condition (BC * ), we recover the decay of the free energy, i.e., along solutions of (ADE * ) we have decay of free-energy dissipation Remark 4.11.Often the free energy dissipation allows us to prove tightness.One approach is to recall Remark 4.8 and notice that free energy dissipation includes a term ´Ω V ρ.The term ˜W (x − y)ρ(x)ρ(y) can also be useful due to Lemma 4.9.

Negative Sobolev spaces
Given a set Ω, the negative Sobolev spaces can be defined by distributions with finite norm Using the free-energy dissipation In the smooth setting, we can integrate in time Since L 1 is compactly embedded in W −1,1 , when F is bounded below, the sequence )) due to Ascoli-Arzelá theorem.Let ρ ∈ W −1,1 be the limit of a sub-sequence.Since F[ρ t ] is bounded below and non-increasing, there is a limit F. Then So µ does not depend on t.In the cases where one can prove stronger convergences, it can be shown that it is a stationary solution, i.e., div ρ∇ δF δρ [ ρ] = 0. (4.10) 4.3 Optimal-transport approach: Wasserstein spaces and the JKO scheme As we already mentioned in Section 4.1.4,there is a clear connection between some of our equations and Optimal Transport problems.Otto realised in [Ott96] (see also [Ott01]) that the (PME) in R d corresponds, in fact, to the gradient flow of a F with respect to the 2-Wasserstein distance.This idea led to the so-called "Otto calculus" that was later perfected.
The following pages are meant as an extremely brief introduction.For the reader interested in deepening their understanding of these spaces and their connection to PDEs we suggest reading the lecture notes [ABS21] first, and then the very detailed books [AGS05;Vil09].A very nice presentation with an emphasis on the examples can be found in [San15].

The Wasserstein distances
The Wasserstein metrics were a tool already used by people studying optimal transport between probability measures.We give a brief definition.Let X be a metric space with a Borel algebra, and recall the definition of push-forward given in Section 4.1.4.If we are given µ, ν ∈ P(X), a transport map is a measurable function T such that ν = T ♯ µ.We can formally think about the p-Wasserstein distance between probability distributions are defined for p ≥ 1 as inf This is the so-called Monge optimisation problem.The infimum is not always achieved, and sometimes no valid T exists.Kantorovich improved this idea by introducing transport plans.A transport plan is a probability distribution in P(X × X) that have µ, ν, i.e., This set is denoted by Π(µ, ν), and is never empty because of the measure µ ⊗ ν ∈ Π(µ, ν), the unique measure such that The Wasserstein distances are defined as (4.12) These distances are sometimes called also Kantorovich-Rubinstein distance.The distance between two measures in P(X) can be infinite, unless the finite p-th order moment is finite.Hence, we define the p-Wasserstein space The pair (P p (X), W p ) is a metric space.If X is complete, then so this is space is also complete and it is equivalent to the narrow convergence (see [AGS05, Proposition 7.1.5]).This is why we pick X = Ω.
As it usually happens in these families of spaces, there are three highlighted cases: p = 1, 2, ∞.The case p = 2 will appear in the next section, and it is the one directly related to (ADE).The advantage of p = 1 is the so-called Kantorovich-Rubinstein duality A very interesting property of the Wasserstein distance is that it admits a dynamic characterisation through the so-called Benamou-Brenier formula which is stated easiest in R d W 2 (µ 0 , µ 1 ) = inf (µ,v)∈A(µ0,µ1) ˆ1 0 ˆRd |v t | 2 dµ t dt, (4.14) where is set space of admissible paths between µ 0 and µ 1 .

Otto's calculus
The notion of calculus in these spaces is a little tricky, since P p (R d ) are not vector spaces.However, they are subsets of the linear space of measures.In fact, it can be though of as a manifold: we can define tangent through curves.In particular, ρ 0 ∈ P 2 (R d ) and take φ ∈ C ∞ c (R d ) a test function.Consider the curve of probabilities ρ t = (id + t∇φ) ♯ ρ 0 .
It is not too difficult to see [ABS21, Lecture 16] that ρ t is a distributional solution of the continuity equation where v t = ∇φ • (id + t∇φ) −1 .Then formally the tangent space T ρ0 P 2 (R d ) is made of This allows to formally construct the gradient of functions.Take the free energy of the diffusion term U.In distributional sense the Wasserstein gradient is In general Hence, (ADE) in R d is formally written can be formally written as the 2-Wasserstein gradient flow in the sense that of the free energy given by (FE).
Remark 4.12.If we work in a bounded domain, then we must study the space P 2 (Ω) and understand how the no-flux condition (BC) appears naturally in this context.One approach is to consider φ ∈ C ∞ c (Ω) (meaning their support is bounded and does not reach the boundary).In order for id + t∇φ : Ω → Ω when t ∈ (−ε, ε) we must specify ρ∇φ • n = 0.This is related to how the no-flux appears.We will continue this discussion below in Remark 4.13.

Rigorous gradient flow structure. The JKO scheme
The idea of writing gradient flows in terms of minimising movements is due to De Giorgi (see [De 93]).To draw a quick analogy think that we can to minimise a function F : R d → R. A continuous ODE going to local minimima is the R d gradient flow dx dt = −∇F (x) The implicit Euler scheme leads to the implicit gradient-descent method We have avoided making explicit the dependence of x n with respect to τ .We can rewrite this the minimisation of the so-called proximal function If F is smooth this minimisation problem has a unique solution for τ small.These are the so-called minimising movements.Then one can do a piece-wise constant interpolation x (τ ) (t) := x n when t ∈ [τ n, τ (n + 1)), or even linear interpolation.
The notion of minimising movement can be generalised to metric spaces and, in particular, we can construct in 2-Wasserstein space Jordan, Kinderlehrer, and Otto proved in the seminal paper [JKO98] that this procedure works for (HE-FP).
The book [AGS05] is devoted to proving how these minimising movements lead to (ADE) in R d for fairly general F. A good notion of solution for the limit of minimising movements is that of curves of maximal slope.Often, it is even possible to get back a suitable solution of a PDE.
However, the general setting is tricky as presented by the following example.
Remark 4.14.Take Ω = (−1, 1) (i.e., d = 1), U = W = 0 and V (x) = ax.Then we recover the transport equation ∂ρ ∂t = a ∂ρ ∂x .The solutions of this equation are always of the form ρ(t, x) = ρ 0 (x + at).The boundary condition aρ = 0 can be forced on one side, but never the other.From the perspective below one can always consider a Dirac delta on the right-hand side.This negates somehow aρ = 0 on that side, but it is only possible solution.In fact, it is recovered from the vanishing viscosity formulation.
Notice that this examples does not satisfy (2.7).If we replaced V by a different smooth potential such that (2.7), then the characteristics will not reach the boundary in finite time.Hence, we will expect a smooth solution.
Remark 4.15.Otto's original paper [Ott96] already covered p-Wassersten distances with p ̸ = 2.In particular, it showed that the doubly nonlinear equation has a p-Wasserstein gradient-flow structure.Recently, a new article has appeared justifying the JKO scheme for these problems, see [CS24a].

Convexity and McCann's condition
In the same way that (4.16) works better if F : R d → R is convex, there is a suitable notion of convexity that works for (4.17).The correct extension is convexity along geodesics also called displacement convexity, i.e., if ρ t is a geodesic from ρ 0 to ρ Before we present this result, it is interesting to discuss the structure of geodesics in Wasserstein space (see, e.g., [ABS21]).As the simplest case, assume that µ, ν are absolutely continuous probability distributions and compactly supported.Then, Monge's problem (4.11) for p = 2 admits an optimal transport map T = ∇φ such that ν = T ♯ µ (see, e.g., [ABS21, Lecture 5]).Furthermore, the unique geodesic between µ and ν is given by ρ t = ((1 − t)id + tT ) # µ.
In [McC97] the author proves that if V, W are strictly convex, This can also be written as Remark 4.16.The case m > d−1 d and V (x) = |x| 2 2 the functional is 0-convex and bounded below.But recalling Remark 3.2 for m < d 2+d the (3.2) is not in P 2 , we cannot hope for asymptotic convergence in 2-Wasserstein sense.
Besides "regular" convexity, a powerful tool in the arsenal is the notion of λ-convexity.A function 2 is convex.Analogously as above we introduce the notion of displacement (or geodesically) λ-convex.In this setting for the gradient flow starting from two points we have W 2 (µ t , µ t ) ≤ e −λt W 2 (µ 0 , µ 0 ).
This implies uniqueness of the gradient flow and, if F is bounded below, of its minimisers.
A similar analysis of a free energy in S 1 is done in [CS09].The sense in which the equation is satisfied is delicate.For U = V = 0 and W "pointy", we send the reader to [CDFLS11].

Global existence vs finite-time blow-up
As explained above, local-in-time existence is usually achieved through "standard" PDE methods or the use of gradient flow arguments, specially if starting from a nice initial datum ρ 0 .Whether these solutions behave "nicely" for large t is a more difficult problem.
Transport for given potential Going back to the simplest example (CE) we can look at the examples x and ρ t = (X t ) ♯ ρ 0 with corresponding characteristic field is obtained by solving with X t (0) = y.We conclude that For α < 2 the characteristics arrive at 0 at finite time and hence we get a Dirac delta at 0 at a certain finite time for any ρ 0 ̸ ≡ 0. This is not problematic in the distributional sense in this case.However, this behaviour happens also for other problems.
Keller-Segel problem.Some authors noticed by direct techniques that the Keller-Segel model has finitetime blow-up for initial data with large enough mass.[JL92] showed for (KSE) in a bounded domain with no-flux the existence of a critical mass M * such that, if ∥ρ 0 ∥ L 1 > M * , then ρ(T * ) contains a Dirac delta.Later this result was obtained also in R d in [HV96] using radial arguments, where the authors characterise the critical mass is M * = 8π.This was later proved in more generality in [DP04].Conversely, for ∥ρ 0 ∥ L 1 < M * global existence for (KSE) is known.In d = 2 it was shown in [BDP06].
A nice proof of the formation of a Dirac comes through the analysis of the second-order moment.For (KSE) in d = 2 it was first noticed by [DP04] that, working with solutions of initial mass ´Rd Hence, if M > 8π necessarily the second-order moment becomes negative in finite time.This is incompatible with our non-negative solutions.There is complete concentration to a Dirac delta.This was later extended by [BCL09b] for (KSE) when d > 2 where the authors prove that When M > M * , there exist ∥ρ 0 ∥ L 1 = M such that F[ρ 0 ] < 0. We will perform the analysis of the free energy for this problem in Remark 6.15.We will discuss below in Section 5.1.3the critical case M = 8π and some interesting ad-hoc constructions which are available in the literature.

Asymptotics
The aim of this section is to highlight some techniques coming from the PDE community that allow us to understand the behaviour as t → ∞ (and sometimes |x| → ∞) of "typical" solutions of the problem (namely those with "good" initial values).

Self-similarity
Some of the equations we are analysing have linear operators homogeneous non-linearities, and it is therefore interesting to exploit this structure.One of the main approaches comes from considering the mass-preserving change of variable u(τ, y) = σ(τ ) d ρ t(τ ), σ(τ )y . (5.1)

Self-similar scaling for (PME)
Let ρ be the solution of (PME).Applying the chain rule to (5.1) Due to the chain rule Eventually, we simplify the equation for u to The only sensible choice is dσ dτ = σ and d Forward self-similar solutions Notice that adding the condition σ(0) = 1 we get σ(τ ) = e τ .We also have that d t dτ = e τ /κ .We have t = t(τ ) so we set t(0) = 0. Eventually Notice that κ > 0 if and only if m > m c = d−2 d .The Barenblatt solution (ZKB) corresponds to setting u as stationary.In fact, for m > m c we can write the family of self-similar solutions Here we have chosen the presentation in [CT00; BBDGV09].
Backwards self-similar solutions When κ < 0 (i.e., m > m c = d−2 d ), we can build another type of solution.In [Váz06a, Section 5.2], Vázquez introduces the pseudo-Barenblatt solutions for the sub-critical regime 0 < m < m c which can be written .
Here, T denotes the extinction time.They are suitable to deal with finite-time extinction [BBDGV09].
5.1.2Self-similar analysis in more general contexts In the previous computation we can trade ∆ x ρ m for any non-linear operator L with the scaling property Self-similarity analysis for (CVE) was performed in [CV11b, Theorem 3.1].The result passes by an obstacle problem.For (NVE) the self-similar analysis is done in [SV14, Section 6.2].For the fractional heat equation see [GI08].
Self-similar solutions are very particular of problem where all terms have the same scaling, and it is not reasonable to expect solutions of this nature to exist in general.Consider the problem When L is the sum of two terms with different scaling, it is not reasonable to expect self-similar structure.For (ADE) with U = U m , V = 0, W = W k this is only possible in the so-called fair-competition regime The self-similar analysis in this setting was performed in [CCH17].

The Keller-Segel problem: bubbles
For d = 2 a self-similar solution is not available, although [HV96] give some hints on the shape by matching asymptotics.Later, there were more advanced studies on stability [Vel02; Vel04a; Vel04b].In this direction, see also [RS14;CGMN22].
In [HMV98] the authors prove there are no self-similar solution of (KSE χ ) when d = 2.However, for d = 3 they show that there exists a sequence of radial self-similar solutions ρ It was later conjectured in [Pin17] and proved in [DPDMW] that for d = 2 one can construct "approximate bubbles".In particular, an initial datum ρ ⋆ 0 with initial mass 8π so that the solutions of Keller-Segel problem (KSE) such that, for

Relative entropy
If we are able to construct a global solution B t of which we know some properties, we would like to see whether this is the "generic behaviour".Assume we are in a case where ∥ρ t ∥ L 1 is preserved.We would say that B t is attractive if For L p norms in general (which typically change over time), we would to see if the relative error tends to zero (5.2) These are usually called intermediate asymptotics, as opposed to the long-time asymptotic limit.
Naïve computations do not yield good results.Sometimes it is possible to do this computation directly via comparison arguments.But, in general, this is not possible.A useful tool is are the so-called relative entropies.We start by a simple example, (HE).

L 2 relative entropy for (HE)
An interesting alternative way of writing (HE-FP) as where G is the Gaussian profile (3.1).Let us take w = u G we rewrite this problem as This is the famous Ornstein-Uhlenbeck problem, which is also the dual (HE-FP).We can write the freeenergy dissipation formula We can now take advantage of the Gaussian Poincaré inequality, see [Che81], Hence, assuming that ´Rd wG = ´Rd u = 1, we recover that This approach can be generalised in many directions.We point the reader to [AMTU01] and the references therein.

L 1 Relative entropy argument for (HE)
For convenience, we will denote the solution of (HE-FP) by v. Notice that, with our choice of re-scaling v 0 = ρ 0 .The relative entropy is defined where I is the so-called Fisher information The Gaussian logarithmic Sobolev inequality (see [Gro75; Tos99; DT15]) This inequality ensures the relationship E ≤ 1 2 I, and hence we recover an ordinary differential inequation that yields E(v t ) ≤ E(ρ 0 )e −2t .Lastly, we will take advantage of the Czsizar-Kullback inequality taking Eventually, we deduce that if ρ is a solution (HE) Remark 5.1.It is also worth point out that (5.3) is equivalent to the Euclidean log-Sobolev inequality, which in scale invariant form (see [Wei78;DT15]) given by Notice that is usually applied to f = √ ρ, and hence the condition is simply that ρ ∈ L 1 (R d ) ∩ P(R d ).

L 1 relative entropy for (PME)
In [CT00; PD02] the authors extend the relative-entropy study to m > 1.Now we denote by v the solution of (PME-FP).Suppose there exists v of suitable mass and define the relative entropy and the Fisher information The suitable relative entropy for m > 1 is H(u||û) = H(u) − H(û).This was later extended to more general families than (PME-FP) in [CJMTU01].This relative-entropy arguments rely heavy on Gagliardo-Nirenberg-Sobolev inequalities.In fact, there is a deep connection between the smoothing properties of (PME) and these types of inequalities (see, e.g., [PD02]).In [BBDGV09; BDGV10] the authors develop the correct relative entropies and Fisher information functions to cover the cases m < 1 with the correct re-scaling of the Barenblatt/pseudo-Barenblatt explained in Section 5.1.1.
Remark 5.2.This kind of relative entropy arguments for diffusive problems (which do not have a natural stationary state) rely on the self-similar solution.It is possible to prove intermediate asymptotics without homogeneity or self-similarity techniques.For diffusive problems of the form ∂ρ ∂t = ∆Φ(ρ), a very nice analysis using Wasserstein spaces can be found in [CFT06].

Relative entropy argument for (ADE)
The entropy arguments for (PME) are stable enough that allow some families of perturbations into the range (ADE).For W = 0 and confining potentials see [CJMTU01].The reader may find information on the strictly displacement-convex free-energy functionals [CMV03;CMV06].For (KSE m,χ ) with m ∈ [1, 2 − 2 d ] (and, in fact, a larger family of W radially decreasing), intermediate asymptotics in the sense (5.2) are obtained in [Bed11b].In [CGYZ23] the authors take advantage of relative entropy arguments to prove that in the case U (ρ) = ρ log ρ, V = 0 and W smooth and bounded (5.2) with p = 1 and B t = K t (i.e., the heat kernel).See [CCS12] for previous results in this direction.

Study of the mass variable of radial solutions
One of the significant limitations to prove the above results rigorously is the lack of regularity of ρ.Some regularity can be regained by passing to the so-called mass variable. (5.5) Then, if ρ is radially, we recover for a weight κ This is a Hamilton-Jacobi type equation, and is well suited for the theory of viscosity solutions.In [CGV22a] the authors discuss the problem when W = 0.They show that one can take advantage of stability properties of viscosity solutions to characterise the steady state.Since κ(0) = 0 and κ(v) > 0 for v > 0, this equation can be used to show that for smooth V and W and radially symmetric ρ, singularities can only form at x = 0.They also show that a Dirac may form, but only in the limit t → ∞.
Many authors have taken advantage of this kind of idea in their arguments.For example, we point to [HMV98; Sen05; GMS11; KY12].
There is also an interesting connection between the mass variable and the Wasserstein distance for ρ.The simplest case states that if ρ 1 , ρ 2 ∈ P(R), and M i are their primitives, then A similar formula holds if ρ i ∈ P(R d ), and are radially symmetric.
The inverse of the mass function Furthermore, if we consider the generalised inverse and ρ 1 , ρ 2 ∈ P(R), and M i are their primitives, then . Interestingly, if ρ(t, x) solves a conservation problem, we can also deduce and equation for u(t, s) = M −1 (t, s).This equation is usually degenerate, but the coefficients do not depend on s (see, e.g., [CHR20]).

On the attractiveness of attractors
Relative entropy arguments allow us to show that the steady state (or energy minimiser) in L 1 is an attractor for a large range of equations: from (PME-FP) to all the examples discussed in Section 5.2.4.
A more difficult question is to understand the case where the energy minimiser may contain a singular part, e.g., and ∥ ρ∥ L 1 < 1.We will see an example in the aggregation dominated range (see Corollary 6.13 below).It is difficult to say whether a measure of this kind can be called a steady state, since it cannot be plugged into even the distributional formulation.If the energy is λ-displacemente convex, there is no doubt that the minimiser is a global attractor.
In [CGV22a] the authors prove that (ADE * ) m ∈ (0, 1), V smooth, and K = 0, the solution converges to the free energy minimiser, even when this contains a singular part.In [CFG] the authors use the mass to prove that for (ADE * ) m ∈ (0, 1), in large class of K.
In the recent preprint [Bia], covering (ADE) when k = 2 − 2s ∈ (2 − d, 0) and m = 2d d+2s (part of the fair-competition regime described in Section 6.3.3)shows that there is an explicit steady state (given in (6.3) below), and solutions with ∥ρ 0 ∥ L m > ∥ ρ∥ L m have ∥ρ t ∥ L m blow-up in finite time (and hence even we cannot study them in the distributional sense).

Minimisation of the free energy
As mentioned in Section 4.3.4,displacement λ-convexity with λ > 0 guarantees exponential contraction with rate W 2 (µ t , µ t ) ≤ e −λt W 2 (µ 0 , µ 0 ) In particular, if µ is a global minimiser of F (which is unique by convexity), then the JKO schemes is stationary and so is the gradient flow.This means that Therefore, it is a global attractor with exponential rate.First, we will try to characterise the local minimisers using variational arguments.This will lead us to Euler-Lagrange conditions, and generally corresponds to 2-Wasserstein local minimisers.On each connect component of the support of a steady state, the Euler-Lagrange condition comes also from the free-energy dissipation (4.9), and are simply δF δρ = C.
However, (ADE) can have steady states that do not satisfy the Euler-Lagrange conditions, specially when the support has several components.A famous example that is known to the community is the case U = U m with m > 1, W = 0, and a potential V with two wells, e.g., provided the support of these two terms are disjoint.These are not 2-Wasserstein local minimisers, but they are, actually, ∞-Wasserstein local minimisers.Furthermore, these steady states attract some initial data.
The study of ∞-Wasserstein local minimisation became quite popular, and we cite several references below.

Local minimisers: Euler-Lagrange conditions
The aim of this section is show that if then we expect it to satisfy the following: there exists h ∈ R such that We recall that our main interest is the free energy F given by (FE * ) for which the first variation δF δρ is given by (2.9).

First variation over the space of absolutely continuous probability measures
We follow the argument of [BCL09b; BCLR13a; CCP15; CCV15; CCH17; CHMV18; CHVY19; CGV22a] where the reader may find rigorous proofs in different ranges.Our aim is to take variations of the forms where, if we pick test functions φ suitably, we still have ρ ε ∈ L 1 ∩ P so Since this also holds for −ψ the equality holds above, and hence Hence we recover (EL 1 ).This gives us complete information on the boundary of the support.But it could jump uncontrollably from 0 to the other profile.
Variations outside the support of ρ Now we take the variations ˆRd ψ(y) dy.
If ψ ≥ 0 and ε < ∥ψ∥ −1 L 1 then ρε ≥ 0. Notice that we cannot hope for ψ < 0 outside the support of ρ.With a similar argument as before we get Since this holds true for all ψ is the conditions above, we have proven (EL 2 ).
Remark 6.1.For (ADE) we have that 6.1.2Solving the Euler-Lagrange equation when U ̸ = 0 When U ̸ ≡ 0 then U ′ is non-decreasing, and we get from we also know that ρ ≥ 0, and (6.2) equation holds with equality when ρ > 0. If the right-hand of (6.2) is positive, so is ρ and equality holds.When the right-hand side of (6.2) is non-positive, then ρ = 0. Therefore, we have that In particular, observe that when U = U m , V = |x| 2 2 and W = 0 (i.e., (PME-FP)) this is the famous Barenblatt solution (ZKB).More generally, the same approach holds true for the case W = 0.In this case, (EL 3 ) is a one-parameter family ρh .Furthermore, since U ′ is non-decreasing we have that ρh is point-wise non-decreasing with h.Hence, in most cases h can be recovered from the fixed value ´Rd ρ h .Remark 6.2.For the fast diffusion case U ′ (0) = ∞.Hence, if |h| < ∞ and W * ρ ∈ L ∞ loc then ρ > 0.
Notice that when ρ > 0 then the solutions of (EL 3 ) are precisely stationary solutions in the sense of (ADE-S).When the support of ρ minimisers are still formally stationary, however the regularity on the free boundary (the boundary of the support) is more challenging.
The cases U = 0 are also interesting.Take, for example (CVE) and re-scale to a Fokker-Planck problem.Then |x| 2 2 + (−∆) −s ρ ≥ h.Let v := (−∆) −s ρ.Hence, (−∆) s v = ρ ≥ 0. We also get, due to (EL 1 ), that when ρ > 0 then equality holds in the last equation.Hence, we can write This is the well-known fractional obstacle problem.In this particular application it is discussed in [CV11b].Regularity of solutions was proved in [Sil07].
The particular case k ∈ (2 − d, 0) and m = 2d 2d+k has the correct scaling (sometimes called conformal ).The explicit self-similar solution can be found using the result in [CLO06] and the references therein (see also [Bia]) to be ρ . (6.3) Remark 6.3.This also shows that in R d , when U ′ is strictly increasing, U ′ (0) = −∞, and V, W is bounded, there can exist no stationary states.We compute the lower bound This means that, ρt / ∈ L 1 (R d ) is not for any h ∈ R. over the space of even functions.This was later extended in [CG21] to (ADE) (again in the torus but with U = U m and V = 0) by taking the suitable extension of the energy.
Remark 6.5.When U = 0 the equation V + W * ρ = h has been studied using different approaches, but we include these references in Section 6.3.2 below.

Extension of the free energy to measure space
Our free energy F is defined for ρ ∈ P 2 (R d )∩L 1 (R d ).This set is not dense in P 2 (R d ) with the total variation metric since for ∥ρ − δ 0 ∥ T V = 2.However, it is dense in the narrow topology.We can therefore try to extend F. The natural weak lower-semicontinuous extension is given by Clearly, this is not so easy to compute directly.
Let us look first at the diffusion term in the homogeneous range U m .As an extreme example, we have to make sense of δ m 0 .An easy approach is to take Therefore, the extension has to be The rigorous construction of U in general setting can be found in [DT86].In fact where µ = µ ac + µ sing the absolutely continuous and singular parts of the measure.If V, W are well-behaved (e.g., bounded by C(1 + |x| 2 )) then For example, if U is sublinear and V, W are well-behaved The first variation in this setting is a little more involved, but it follows the same general scheme.
However, if W is singular at 0 the landscape can be richer, for example as mentioned in Remark 6.15 for (KSE χ ), the correct behaviour extension is more difficult to construct.In fact, we point the reader to [DYY22] where the authors construct example of W with infinitely many radially decreasing steady states.

General comments
For example, it is easy to show that when m > 1, and V, W are λ-convex for λ > 0 and bounded below, then the free-energy functional admits a unique minimiser with the properties above using the direct method of calculus of variations.First, the functional is bounded below since U m ≥ 0 for m > 1.We take a minimising sequence ρ n , and we use Prokhorov's theorem to prove it has a narrow limit μ.Using weak lower semi-continuity we show that the limit is indeed a minimiser.
The situation in bounded domain Ω is easier, since all sequences of measures are tight (i.e., mass cannot escape to infinity) and the Lebesgue spaces are embedded in each other.We focus therefore on the more complicated case R d .
To find energy minimisers in R d , one can try to construct minimising profiles by scaling.Given a profile ρ 1 , we can rescale it like ρ λ (x) = λ d ρ 1 (λx).
It might happen that full diffusion (i.e., λ → 0) is energy beneficial.Indeed, we have the scaling Notice that, when m > 1 then d(m − 1) and U m (ρ 1 ) > 0, whereas if m < 1 they are both negative.Furthermore, we observe that, when m > 1, then the energy is bounded below; and for m ∈ (0, 1] it is not. It might also happen that full concentration (i.e., λ → ∞) is best for the energy, for example for U, W = 0 and V = |x| 2 .Then we expect the minimiser to be a δ 0 .

The case U = 0
There is a long literature looking for minimisers when U = 0.In this setting, some authors have studied the W p local minimisers with p ∈ (1, ∞].In [BCLR13a] the authors realised that, when V = 0, the minimisers can be supported over sets of different dimensions, depending on the attractive-repulsive nature of W .The stability of these minimisers was later studied in [BCLR13b].In [BCY14] the authors discuss the compact support of minimisers under general settings.For general conditions on existence of global minimisers see [SST15].The regularity of compactly supported ∞-Wasserstein when V = 0 was studied in [CDM16].Recently, analysis of the Fourier transform has been applied to determine further structure of minimisers [CS22; CS].
There is the particularly interesting case of V = 0 and the power-type attractive-repulsive potential In this direction, there have a number of works.The authors of [CCH14] show existence of minimisers coming as a limit of empirical measures (recall (2.4)) when γ > α.In dimension 1 one can construct solutions of W * ρ = E by inverting Fredholm operators (see [CH17]).Nevertheless, the measures are of special relevance.In [KKLS21] the authors show in d = 1 if α ≥ 2 and γ is large enough, then µ 1 2 is unique W p minimiser for p ∈ [1, ∞).The W ∞ minimisation when d = 1 is richer: 2 is a W ∞ -strict local minimiser.When d > 1 the suitable extension of ρ m is a "Dirac delta" supported on the surface of a ball, see [DLMa; DLMb].The explicit shape of the W ∞ -minimiser when γ ∈ (2, 3) and α = 2 is given in [Fra22;FM].There is also significant interest in the case γ = 2, α ∈ (−2, 0), see [CS24b] and the references therein.
There are several related problems with different choices of V, W which have been studied over the last few years due to the modelling, we point the reader to [CMMR+20; CMMR+21; MMRSV23], and the references therein.

The power cases
The family of cases U m (ρ) = ρ m m−1 , V = 0, and W = χW k has been studied in extensive detail.We analyse the scaling of the second term of the energy to recover The scaling of the energy is given by The sign inside the parenthesis is crucial.As pointed out in [CCH17; CHMV18] the two terms are in balance when d(m − 1) = k, and this leads to the critical value called the fair-competition regime.For m > m c we have the so-called diffusion-dominated regime and for m ∈ (0, m c ) the aggregation-dominated regime.Notice that k > 0 implies m c < 1 (fast-diffusion) whereas k < 0 implies m c > 1 (slow-diffusion).
Remark 6.6.The case scaling argument can be repeated for F = U m + χV λ , and we recover also the critical value m c = d−λ d .The critical value of (PME-FP) corresponds precisely to λ = 2. Remark 6.7.Notice that (KSE m,χ ) corresponds precisely to λ = 2 − d, so we recover m c = 2d−2 d .The case U = U m with m ≥ 2, V = 0 and W radially decreasing was studied in [Bed11a], where the main tool is radially decreasing rearrangement of ρ (which will be presented in Section 6.4).where we recall the definition of m c given in (6.5).A proof can be found in [CCH17, Theorem 3.1] The logarithmic version is that if log The fair-competition regime was discussed in [CCH17], the diffusion-dominated regime in [CHMV18], and the aggregation-dominated regime in [CDP19].
Since there is fair competition, the rescaling done for (PME) suitably rescales the equation.There is, as usual, a new term div(xρ) coming from the time derivative.This new equation is itself the 2-Wasserstein gradient flow of the rescaled free energy where we recall the definition (3.7).The main results are as follows Theorem 6.9 (Fair-competition case [CCH17]).Let m = m c and k ∈ (−d, 0).Then 1. Stationary states satisfy F(ρ) = 0.
2. We have that Due to the previous item, we define χ c = 1/C HLS (k, d).
4. If χ < χ c , then there exists a global minimiser for F resc , but not for F.
5. If χ > χ c , then F and F resc are not bounded below.
Theorem 6.10 (Diffusion-dominated regime [CHMV18]).Let m > m c , d ≥ 1, χ > 0, and k ∈ (−d, 0).Then there exists a global minimiser ρ and it is radially symmetric and compactly supported.The work [CHVY19] also contains contributions towards existence of minimisers in the diffusion-dominated range.However, the key contribution is the radial symmetry as described in Section 6.4.More or less in parallel, work was done on the aggregation-dominated regime.
Then, for any ε > 0 the energy εU 1 + W does not admit any W p -local minimisers.Furthermore, if W is Lipschitz continuous, then it does not admit any critical points.

(Fast diffusion) If m < d d+k then the free energy εU
has a global minimiser.
The authors also provide a condition for sharp existence of global minimisers.In [CDDFH19] the authors introduce a reversed HLS inequality to deal with the aggregation-dominated regime.Theorem 6.12 (Reversed HLS [CDDFH19]).
Part of the results in [CDDFH19] appeared first as a preprint [CD18] written in terms of (ADE) instead of functional inequalities.We present the results as stated there as a corollary of the published paper.Corollary 6.13.Let m ∈ ( d d+k , 1) and k > 0. Then 1. there exists a unique µ that minimises the free energy U m ( µ) + χW( µ), and they coincide with minimisers of (6.8).
2. It is of the form (5.6) for some ρ ∈ L 1 .
3. For m > d−1 d and k ≥ 1 the functional is geodesically convex in W 2 , and hence, the minimiser is unique.
The authors also present the suitable Euler-Lagrange equation for µ.The computation is more involved that (6.1.1)that follows the same philosophy.
The case λ = 4 was studied in full detail in [CDFL20].Here the authors characterise precisely the dimensions d and index m for has a Dirac delta, i.e., it is (5.6) with ∥ ρ∥ L 1 < 1. Remark 6.14.A particular amount of work has been devoted to the case V = 0 whereas V ̸ = 0 raised much lower interest.The extension of the results is more or less direct, see [CGV22a;CFG].Remark 6.15.Keller-Segel corresponds precisely to the logarithmic cases d = 2, m = 1, and W = −1 2π W 0 which yield (assuming ∥ρ∥ L 1 = 1) This leads to the critical value of χ * = 8π or, equivalently recalling the change of variable in Section 3.4, critical mass M * = 8π.

Radial symmetry
There are several approaches to prove that, when V, W are radially symmetric then the minimisers must be radially symmetric.One approach is to use the associated stationary problem, and Alexandrov reflexions.A different approach is to prove that we can find radially symmetric minimising sequences.For this, we can take advantage of the rearrangement theory (see [Tal16] and the references therein).First, let us "re-arrange" a set.If A ⊂ R d we define the rearrangement of A as the ball of radius 0 We define the radially decreasing re-arrangement of a non-negative function f as the function f ⋆ such that {x : f ⋆ (x) ≥ t} = {x : f (x) ≥ t} ⋆ .
We can write f in terms of its level sets via the layer-cake representation By reverting these inequalities, if V, W are non-negative and radially increasing we have that Hence, the expected behaviour is that if ρ is a minimiser, then so is its radially decreasing rearrangement.Furthermore, any minimising sequence ρ k can be replaced by a radially-decreasing minimising sequence.However, rigorously proving this intuition is rather technical.A fairly general theorem can be seen in [CHVY19], where the authors take advantage of the continuous Steiner rearrangement.
7 Numerical methods

Finite Volumes
Finite volume schemes are a family of schemes for conservation equations which incorporate the idea of transport of mass.They are easiest to introduce d = 1.Take a spatial mesh x i = i∆x we aim to approximate numerically the averages The conservation equation (2.2) yields d dt Discretising in time we arrive at the general form The no-flux condition is approximated by F −N − 1 2 = F N + 1 2 = 0, so i = −K, • • • , K. For transport problems in which F = ρv, the flux is taken up-wind to preserve positivity of ρ n i , namely where u + = max{u, 0}, u − = min{u, 0} so that v = v + + v − .For (ADE) the velocity field v i+ 1 2 is an approximation of the velocity field v i+ 1 2 ≈ −∇(U ′ (ρ) + V + W * ρ)(x i+ 1 2 ).In [BCH20] the authors introduce a scheme with energy decay for (ADE).When W = 0, if U is concave then so is E(x, ρ) = U (ρ) + V (x)ρ and hence we can proof a version of (4.8) with an inequality Due to the convexity tricked used above, it is natural to take the implicit method Then we have the energy decay When W is introduced similar estimates are possible, although the bilinear terms need delicate handling.In [BCH20] the authors prove that a successful scheme is W (x i − x j ) ρ n+1 j + ρ n j 2 (7.2a) with the corresponding boundary no-flux conditions.This scheme has energy decay for any W even.The authors of [BCH20] also prove that under certain assumptions on W the decay of the energy holds for other discretisations without the midpoint (ρ n+1 j + ρ n j )/2.And a proof of convergence through compactness for the cases m > 2 was later introduced in [BCMS20] convergence.This scheme can be adapted to (ADE * ) by (7.2a') A very similar scheme for (AE) is studied in [DLV20], where the authors prove convergence in Wasserstein sense even for pointy potentials.This scheme is later extended for linear diffusion in [LST].
A higher order method for the case U, V = 0 and W Lipschitz was constructed in [CFS21].The authors prove convergence in 1-Wasserstein norm.
Remark 7.1.Some finite-volume methods have been studied actually correspond to the gradient of a discrete energy in a discrete version of the Wasserstein metric, see [RM17].

Methods based on discretising the Wasserstein distance
There are several possibilities of numerical computing the Wasserstein gradient flow.One could combine a method numerically computing the Wasserstein distance (see, e.g., [PC20]) with a numerical optimisation in the JKO scheme.A very interesting paper in a similar direction is [CCWW22], where the authors combine of the JKO scheme with the Benamou-Brenier formula (recall (4.14) above) to construct a discrete gradient-flow method, which they call primal-dual method.
A different approach is to use a modified energy for which the gradient flow is easier to compute.This leads to the particle/blob methods we will discuss in the next section.

Particle/Blob methods
When U = 0 and V, W are Lipschitz, (ADE) admits exact solutions given by empirical distributions Hence, the PDE can be reduced to a finite set of ODEs, that can be solved by any method for ODEs.If we are given ρ 0 ∈ L 1 , then for any K we approximate it by an empirical measure µ (K) 0 .
When U ̸ = 0 this is no longer the case, because the PDE is "regularising".Furthermore, the free of empirical measures is not defined for U = U m , m ≥ 1.However, there are two approaches to approximate the free energy so that the W 2 -gradient flow admits again these kinds of solutions.

The particle method
A first approach was introduced in [CPSW16] and perfected in [CHPW17].Fix an initial mass Fix these weights of the particles and K the number of particles.Take the set of empirical distributions of K elements and fixed weights The aim is to approximate the free energy F so that w i w j W (x i − x j ), (7.3) for suitably selected radii.Then the scheme is simply to apply a suitable adaptation of the JKO scheme (4.17) with time-step τ , that is

The blob method
The blob approach aims to modify the energy so that solutions by particles are admissible.To preserve the gradient-flow structure we modify the free energy by changing the term ´U (ρ).A popular idea introduced by [CCP19] is where F (s) = U (s) s The papers [DFR19; FT22] explore the case U = 0.The case U ̸ = 0 is, so far, only understood numerically.A finite-volume numerical scheme was constructed for this case [BCH].Their numerical results suggest that steady-states will be of the form ρ = min{1, (U ′ ) −1 (h − V − W * ρ)} + for suitable h such that the total mass is preserved.However, there is currently no rigorous proof that this is the correct choice.
Figure 1: Sets used of the proof of Lemma 4.9 function.It is usually called displacement convexity and was introduced by McCann in [McC97] 1 .
and a n → 0. This construction was later generalised for d ≥ 3 by several authors [Sen05; GMS11; SW19].Blow-up on the range U = U m for m ≥ 1, V = 0 and W = W k with k > 2 − d see [YB13].

Remark 6. 4 (
Bifurcation analysis).Several authors have paid attention to the structure of local minimisers in terms of bifurcations.These analyses can be done for (KE) in terms of Fourier coefficients.In[CGPS20] the authors study bifurcation of (McKVE) (with periodic boundary conditions, or equivalently in T d , with W = κ W , and V = 0) by applying Crandall-Rabinowitz theory.They see local minimisers as solutions of the equation (ρ, κ) → ρ − e −βκ W * ρ ´e−βκ( W * ρ)(y) dy , Remark 6.8 (Hardy-Littlewood-Sobolev).To show that the free energy functionals are bounded below, we recall the Hardy-Littlewood-Sobolev inequality k ∈ (−d, 0) and ρ ∈ P(R d ) ¨Rd ×R d |x − y| λ ρ(x)ρ(y) dx dy ≤ C HLS (k, d) ˆRd ρ(x) mc dx, (6.6) , where 1 A is the indicator function of A, and we have introduced the super-level set notationL + t (f ) := {x : f (x) ≥ t}.⋆ (x) dt.It is not difficult to check that if f, g, h are non-negative functions vanishing at infinity we have the conservation of all L p norms, which can be more generally written as ˆRd U (f ⋆ (x)) dx = ⋆ (x)g ⋆ (y)h ⋆ (x − y) dx dy ≥ ¨Rd ×R d f (x)g(y)h(x − y) dx dy.