Phase Transitions for Nonlinear Nonlocal Aggregation-Diffusion Equations

We are interested in studying the stationary solutions and phase transitions of aggregation equations with degenerate diffusion of porous medium-type, with exponent 1 < m < ∞. We first prove the existence of possibly infinitely many bifurcations from the spatially homogeneous steady state. We then focus our attention on the associated free energy, proving existence of minimisers and even uniqueness for sufficiently weak interactions. In the absence of uniqueness, we show that the system exhibits phase transitions: we classify values of m and interaction potentials W for which these phase transitions are continuous or discontinuous. Finally, we comment on the limit m → ∞ and the influence that the presence of a phase transition has on this limit.


Introduction
In this work, we deal with the properties of the set of stationary states and long-time asymptotics for a general class of nonlinear aggregation-diffusion equations of the form (1.1), where W is the attractive-repulsive interaction potential. Here Ω denotes the d-dimensional torus T^d having side length L > 0, with P(Ω) being the set of Borel probability measures on Ω, and L^m(Ω) the set of m-power integrable functions on Ω. Notice that for m = 1 we recover the linear diffusion case, which is related to certain nonlocal Fokker-Planck equations, also referred to as McKean-Vlasov equations in the probability community. These equations also share the feature of being gradient flows of free energy functionals of the form (1.2) for ρ ∈ L^m(Ω) ∩ P(Ω), as discussed extensively in the literature [JKO98,Ott01,Vil03,CMV03,AGS08]. We refer to [CCY19] for a recent survey of this active field of research.
Note that although we have included the free energy for m = 1 in (1.2), we will mostly be dealing with the case m > 1 in this article. We will only discuss the case m = 1 as a limiting case of the energies F^m_β as m → 1. The case m = 1 is treated in more detail in [CGPS20].
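For readers who wish to experiment numerically, the following is a minimal sketch of how the free energy for m > 1 might be evaluated on a uniform periodic grid in d = 1. It assumes the form F^m_β(ρ) = β^{-1}/(m − 1) ∫ ρ^m dx + (1/2) ∬ W(x − y)ρ(x)ρ(y) dx dy, up to additive constants; this form is an assumption and is not quoted from the displayed formula (1.2). The function name `free_energy`, the grid size, and the potential W(x) = −cos(2πx/L) are illustrative choices.

```python
import numpy as np

def free_energy(rho, W_vals, dx, beta, m):
    """Discrete free energy on a uniform periodic 1D grid (assumed form, m > 1).

    rho    : density samples rho(x_i), normalised so that sum(rho) * dx == 1
    W_vals : samples W(x_i) of the even, periodic interaction potential
    """
    if m <= 1:
        raise ValueError("this sketch only covers m > 1")
    n = rho.size
    # Entropy part: beta^{-1} / (m - 1) * int rho^m dx (rectangle rule).
    entropy = np.sum(rho**m) * dx / (beta * (m - 1.0))
    # Periodic convolution (W * rho)(x_i) = dx * sum_j W(x_i - x_j) rho(x_j).
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    conv = dx * (W_vals[idx] @ rho)
    # Interaction part: 1/2 * int (W * rho) rho dx.
    interaction = 0.5 * np.sum(conv * rho) * dx
    return entropy + interaction

L, n = 1.0, 128
dx = L / n
x = np.arange(n) * dx
W_vals = -np.cos(2.0 * np.pi * x / L)   # illustrative mean-zero, even potential
rho_flat = np.full(n, 1.0 / L)          # flat state: constant density 1 / L
F_flat = free_energy(rho_flat, W_vals, dx, beta=2.0, m=3.0)
```

Under these assumptions the interaction term vanishes at the flat state because W has mean zero, so `F_flat` reduces to the entropy contribution β^{-1} L^{1−m}/(m − 1).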
Aggregation-diffusion equations such as (1.1) naturally appear in mathematical biology [BCM07,VS15,CMS+19,BDZ17,BCD+18] and mathematical physics contexts [Oel90,Phi07,FP08,BV13] as the typical mean-field limits of interacting particle systems with interaction potential W_N = β_1^{-1} ϕ_N + W, where ϕ_N(x) = N^ξ ϕ(N^{ξ/d} x), for all x ∈ R^d. Here, ϕ is a typical localized repulsive potential, for instance a Gaussian, and 0 < ξ < 1. Notice that due to the choice of ξ, the shape of the potential gets squeezed to a Dirac delta at 0 slower than the typical relative particle distance N^{-1/d}. Also, β_2^{-1} ≥ 0 is the strength of the independent Brownian motions driving each particle. We refer to [Oel90,Phi07,BV13] for the case of quadratic diffusion m = 2 with β_1 = β, ν = 0, and to [FP08] for related particle approximations for different exponents m. The McKean-Vlasov equation m = 1 is obtained for the particular case β_1 = +∞ and β_2 = β, being the inverse temperature of the system for the linear case, and its derivation is classical for regular interaction potentials W, see for instance [Szn91].
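The scaling ϕ_N(x) = N^ξ ϕ(N^{ξ/d} x) can be illustrated numerically. The sketch below, in d = 1 with a standard Gaussian for ϕ (both are illustrative assumptions, as are the function names and parameter values), checks that ϕ_N keeps unit mass while its width shrinks like N^{-ξ/d}, which for 0 < ξ < 1 is slower than the typical inter-particle distance N^{-1/d}.

```python
import numpy as np

def phi(x):
    """Standard Gaussian on R, used here as an illustrative repulsive kernel."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def phi_N(x, N, xi, d=1):
    """Rescaled kernel phi_N(x) = N^xi * phi(N^(xi/d) * x)."""
    return N**xi * phi(N**(xi / d) * x)

N, xi, d = 10_000, 0.5, 1
x = np.linspace(-1.0, 1.0, 200_001)
dx = x[1] - x[0]
mass = np.sum(phi_N(x, N, xi, d)) * dx   # Riemann sum of phi_N over [-1, 1]
kernel_width = N**(-xi / d)              # length scale on which phi_N concentrates
particle_spacing = N**(-1.0 / d)         # typical relative particle distance
print(mass, kernel_width, particle_spacing)
```

With these values the kernel width is two orders of magnitude larger than the particle spacing, so each particle still interacts with many neighbours through ϕ_N.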
Analysing the set of stationary states of the aggregation-diffusion equation (1.1) and their properties depending on β, the relative strength of repulsion by local nonlinear diffusion and attraction-repulsion by nonlocal interactions, is a very challenging problem. As with the linear case, the flat state is always a stationary solution of the system. The problem lies in constructing nontrivial stationary solutions and minimisers. In the linear diffusion case m = 1, we refer to [CP10,CGPS20] where quite a complete picture of the appearance of bifurcations and of continuous and discontinuous phase transitions is presented, under suitable assumptions on the interaction potential W. Bifurcations of stationary solutions depending on a parameter are usually referred to in the physics literature as phase transitions [Daw83]. In this work we make a distinction between the two, referring to the existence of nontrivial stationary solutions as bifurcations and the existence of nontrivial minimisers of F^m_β as phase transitions. Particular instances of phase transitions related to aggregation-diffusion equations with linear diffusion have been recently studied for the case of the Vicsek-Fokker-Planck equation on the sphere [DFL15,FL12] and the approximated homogeneous Cucker-Smale approximations in the whole space [Tug14,BCnCD16,ASBCD19]. We also refer to [Sch85] where the problem was studied on a bounded domain for the Newtonian interaction, and to [Tam84] where the problem was studied on the whole space with a confining potential.
However, there are no general results in the literature for the nonlinear diffusion case (1.1), m > 1, except for the particular case of m = 2, d = 1, with W given by the fundamental solution of the Laplacian with no flux boundary conditions (the Newtonian interaction) recently studied in [CCW+20]. Despite the simplicity of the setting in [CCW+20], this example revealed how complicated phase transitions for nonlinear diffusion cases could be. The authors showed that infinitely many discontinuous phase transitions occur for that particular problem. Let us mention that the closest result in the periodic setting is [CKY13], where the authors showed that no phase transitions occur for small values of β, when the flat state is asymptotically stable, for m ∈ (1, 2]. Our main goal is thus to develop a theory for the stationary solutions and phase transitions of (1.1) for general interactions W ∈ C^2(Ω) and nonlinear diffusion in the periodic setting, something that has not been previously studied in the literature. This paper can be thought of as an extension of the results in [CGPS20] to the setting of nonlinear diffusion. Considering this, we need to define appropriately the notion of phase transition for the case m ∈ (1, ∞), as done in [CP10] for the linear case m = 1.
Note that, unlike in the linear setting, the L^1(Ω) topology is not the natural topology to define phase transitions. It seems that for m > 1 the correct topology to work in is L^∞(Ω) (cf. Definition 5.10 and Remark 5.17 below). For our results we will often require compactness of minimisers in this topology. One possible way of obtaining this compactness is via control of the Hölder norms of the stationary solutions of (1.1). In Sect. 3 we briefly comment on the existence of solutions to (1.1) before proceeding to the proof of Hölder regularity. Since this is a key element of the subsequent results and the proof of Hölder regularity for such equations is not in the literature, we include the proof in full detail in Sect. 3. It relies on the so-called method of intrinsic scaling introduced by DiBenedetto for the porous medium equation (cf. [DiB79]), which is a version of the De Giorgi-Nash-Moser iteration adapted to the setting of degenerate parabolic equations. We make modifications to the method to deal with the presence of the nonlocal drift term ∇ · (ρ∇W ⋆ ρ). We remark here that the proof of this result is completely independent of the rest of the paper. In a first reading, readers more interested in the properties of stationary solutions and phase transitions might choose to skip the proof and continue to Sect. 4. As a consequence of the proof of Hölder regularity, we also obtain uniform-in-time equicontinuity of the solutions away from the initial datum in Corollary 3.4.
After the proof of the Hölder regularity we proceed to Sect. 4, where we discuss the local bifurcations of stationary solutions from the flat state ρ_∞. In Theorem 4.4, we provide conditions on the interaction potential W and on the parameter β = β*, such that (ρ_∞, β*) is a bifurcation point using the Crandall-Rabinowitz theorem (cf. Theorem B.1). In fact for certain choices of W one can show that there exist infinitely many such bifurcation points. We then move on to Sect. 5, where we prove the existence and regularity of minimisers of F^m_β. We also show that, for β small enough, the flat state is the unique minimiser of the energy for m ∈ (1, ∞], thus extending the result of [CKY13]. In Theorem 5.8, we use the uniform equicontinuity in time obtained in Corollary 3.4 to prove that solutions of (1.1) converge to ρ_∞ in L^∞(Ω) whenever it is the unique stationary solution. We show that, as in the linear case, the notion of H-stability (cf. Definition 2.1) provides a sharp criterion for the existence or non-existence of phase transitions. We then proceed, in Lemmas 5.15 and 5.16 and Proposition 5.18, to provide sufficient conditions for the existence of continuous or discontinuous phase transitions, where the proofs rely critically on the Hölder regularity obtained in Sect. 3. We also provide general conditions on W for the existence of discontinuous phase transitions. We conclude the section by showing that for m ∈ [2, 3] all non-H-stable potentials W are associated with discontinuous phase transitions of F^m_β, while for m = 4 we can construct a large class of W that lead to continuous phase transitions of F^m_β. We summarise our results below:
(1) The proof of Hölder regularity of the weak solutions of (1.1) can be found in Theorem 3.3 and the preceding lemmas of Sect. 3.
(2) The result on the existence of local bifurcations of the stationary solutions is contained in Theorem 4.4.
(3) The results on phase transitions are spread out throughout Sect. 5. The result on the long-time behaviour of solutions before or in the absence of a phase transition can be found in Theorem 5.8. The main result on the existence of discontinuous transition points is Theorem 5.19 while the explicit conditions for a continuous transition point can be found in Theorem 5.24.
(4) In Sect. 6, we treat the mesa limit m → ∞. The Γ-convergence of the sequence of energies F^m_β to some limiting free energy F^∞ as m → ∞ can be found in Theorem 6.1. We then provide a characterisation of the minimisers of the limiting variational problem in terms of the size of the domain and the potential W in Theorem 6.2.
In Sect. 7, we display the results of some numerical experiments which we hope will shed further light on the theoretical results, while also providing us with some conjectures about the behaviour of the system in settings not covered by the theory.

Preliminaries and Notation
As mentioned earlier, we denote by P(Ω) the space of all Borel probability measures on Ω, with ρ the generic element, which we will often associate with its density ρ(x) ∈ L^1(Ω), if it exists. We use the standard notation L^p(Ω) and H^s(Ω) for the Lebesgue and periodic L^2-Sobolev spaces, respectively. We denote by C^k(Ω) and C^∞(Ω) the spaces of k-times (k ∈ N) continuously differentiable and smooth functions, respectively.
Given any function f ∈ L^2(Ω) we define its Fourier transform as and N_k is defined as Using this we have the following representation of the convolution of two functions W, f ∈ L^2(Ω), where W is even along every coordinate, where Sym_k(Λ) = Sym(Λ)/H_k. Sym(Λ) represents the symmetric group of the product of two-point spaces, Λ = {1, −1}^d, which acts on Z^d by pointwise multiplication, i.e.
We need to quotient out H_k as there might be some repetition of terms in the sum. Another expression that we will use extensively in the sequel is the Fourier expansion of the following bilinear form (2.2). The following notion will play an important role in the subsequent analysis.
If this does not hold, we denote this by W ∈ H c s . The above condition is equivalent to the following inequality holding true for all η ∈ L 2 ( ) : where 1 < m < ∞, β > 0, and W ∈ C 2 ( ) is even along every co-ordinate and has mean zero. It is not immediately clear what the correct notion of solution for the above PDE is, as it need not possess classical solutions. We introduce the appropriate notion of solution in the following definition.
The proof of this result is classical and we will not include it. It relies on regularisation techniques which remove the degeneracy in the problem. The meat of the matter is proving estimates uniform in the regularisation parameter. We refer to [BCL09,BS10] for proofs of this result with W ∈ C^2(Ω). We turn our attention to the regularity of solutions of (3.1). The proof is based on the method of intrinsic scaling introduced by DiBenedetto for the porous medium equation [DiB79,Urb08]. It is also similar in spirit to the proof in [KZ18] where regularity was proved for a degenerate diffusion equation posed on R^d with a potentially singular drift term. We also direct the readers to [HZ19] where Hölder regularity was proven for drift-diffusion equations with sharp conditions on the drift term using a different strategy of proof. Since we will mainly be concerned with stationary solutions we assume for the time being that there exists some universal constant M > 0 such that ‖ρ‖_{L^∞(Ω_T)} ≤ M, where Ω_T is the parabolic domain Ω_T := Ω × [0, T] and Ω_∞ := Ω × [0, ∞). We first state the result regarding Hölder regularity.
Theorem 3.3. Let ρ be a weak solution of (3.1) with initial datum ρ_0 ∈ L^∞(Ω) ∩ P(Ω), such that ‖ρ‖_{L^∞(Ω_T)} ≤ M < ∞. Then ρ is Hölder continuous with exponent a ∈ (0, 1) dependent on the data, m, d, W, and β. Moreover, the Hölder exponent a depends continuously on β for β > 0.
We also have the following consequence of the above result:
Corollary 3.4. Let ρ be a weak solution of (3.1) with initial datum ρ_0 ∈ L^∞(Ω) ∩ P(Ω), such that ‖ρ‖_{L^∞(Ω_∞)} ≤ M < ∞. Then, for some C > 0, it holds that for all x, y ∈ T^d and 0 < C < t_1 < t_2 < ∞. Note that the constants C_h and a are independent of x, y and t_1, t_2.
We remind the reader that the above results are used to obtain the desired regularity and compactness of minimisers in Lemma 5.4 and the equicontinuity in time of solutions for the long-time behaviour result in Theorem 5.8, although they are of independent interest by themselves. The proof of Theorem 3.3 and Corollary 3.4 can be found in Sect. 8.

Characterisation of Stationary Solutions and Bifurcations
Now that we have characterised the notion of solution for (3.1) we study the associated stationary problem (4.1), with the notion of solution identical to the one defined in Theorem 3.1. One can immediately see that ρ_∞ (cf. (1.3)) is a solution to (4.1) for all β > 0. As mentioned earlier, (3.1) and (4.1) are intimately associated to the free energy functional F^m_β, which is set to +∞ whenever the quantities defining it are not finite. We will often use the shorthand notation S^m_β(ρ) for the entropies and E(ρ) := (1/2) ∬_{Ω×Ω} W(x − y)ρ(x)ρ(y) dx dy for the interaction energy. We will also drop the superscript m and just use F_β(ρ) whenever m = 1.
Another object that will play an important role in the analysis below is the following self-consistency equation for some constant C > 0. We discuss how the above equation, solutions of (4.1), and F m β (ρ) are related to each other for the case m > 1 in the following proposition (the case m = 1 is discussed in [CGPS20] and the proofs are essentially identical).
(3) For every connected component A of its support ρ satisfies the self-consistency equation, i.e.
with C(A, ρ) given by Remark 4.2. We have used the notation for 1 < m < ∞, even though this is not a norm for 1 < m < 2.
Remark 4.3. Note that if a stationary solution ρ is fully supported then the constant where we have used the fact that W has mean zero. We can now formally pass to the limit m → 1 to obtain The solutions of the above equation are studied in detail in [CGPS20]. Now that we have various equivalent characterisations of stationary solutions of (3.1), we proceed to state and prove the main result of this section regarding the existence of bifurcations from the uniform state ρ_∞ (cf. (1.3)). Before doing this however we need to introduce some relevant notions. We denote by H^n_0(Ω) the homogeneous H^n(Ω) space and by H^n_{0,s}(Ω) the closed subspace of H^n_0(Ω) consisting of functions which are even along every coordinate (pointwise a.e.). Note that the {e_k}_{k∈N^d, k≠0} form an orthogonal basis for H^n_{0,s}(Ω). We then introduce the following map F : H^n_{0,s}(Ω) × R_+ → H^n_{0,s}(Ω) for n > d/2 which is given by Note that if F(η, β) = 0 then the pair (ρ_∞ + η, β) satisfies (4.2) on all of Ω. If one can show that (ρ_∞ + η)(x) ≥ 0, ∀x ∈ Ω, then we have found a bona fide stationary solution of (3.1) by the equivalence established in Proposition 4.1. Thus, we would like to study the bifurcations of the map F from its trivial branch (0, β). To this end we compute its Fréchet derivatives around 0 as follows: for some e_1, e_2, e_3 ∈ H^n_{0,s}(Ω). We then have the following result:
Theorem 4.4 (Existence of bifurcations). Consider the map F : H^n_{0,s}(Ω) × R_+ → H^n_{0,s}(Ω) for n > d/2 as defined in (4.3) with its trivial branch (0, β). Assume there exists k* ∈ N^d, k* ≢ 0 such that the following two conditions are satisfied Then, (0, β*) is a bifurcation point of (0, β) with i.e. there exists a neighbourhood N of (0, β*) and a curve (η(s), β(s)) ∈ N, s ∈ (−δ, δ), δ > 0 such that F(η(s), β(s)) = 0. The branch η(s) has the form where ‖r‖_{H^n_{0,s}(Ω)} = o(s) as s → 0. Additionally, we have that β′(0) = 0 and
Proof.
The proof of this theorem relies on the Crandall-Rabinowitz theorem (cf. Theorem B.1). Note that F ∈ C^2(H^n_{0,s}(Ω) × R_+; H^n_{0,s}(Ω)). Thus, we need to show that: (a) D_η F(0, β*) : H^n_{0,s}(Ω) → H^n_{0,s}(Ω) is Fredholm with index zero and has a one-dimensional kernel and (b) for any e ∈ ker(D_η F(0, β*)), e ≠ 0, it holds that D²_{ηβ} F(0, β*)(e) ∉ Im(D_η F(0, β*)). For (a) we first note that D_η F(0, β*) is a compact perturbation of the identity as the operator e ↦ W ⋆ e is compact on H^n_{0,s}. It follows then that it is a Fredholm operator. Note that the functions {e_k} diagonalise D_η F(0, β*). Note that if the conditions (1) and (2) in the statement of the theorem are satisfied it follows, using the expression for β*, that D_η F(0, β*)(e_k) = 0 if and only if k = k*. Thus, we have that ker(D_η F(0, β*)) = span(e_{k*}). This completes the verification of the condition (1) in Theorem B.1.
For condition (2) in Theorem B.1, we note again by the diagonalisation of D_η F(0, β*) that Im(D_η F(0, β*)) = {span(e_{k*})}^⊥. Thus, we have that We can now compute the derivatives of the branch. Using the identity [Kie12, I.6.3], it follows that where the last equality follows by using the expression for e²_{k*} from Proposition 5.23 and orthogonality of the basis {e_k}_{k∈N^d}. Here ⟨·, ·⟩ denotes the dual pairing in H^n_{0,s}. Thus, we have that β′(0) = 0. Finally we can compute β″(0) by using [Kie12, I.6.11] to obtain This completes the proof of the theorem.
Remark 4.5. Since H^n_{0,s}(Ω) is continuously embedded in C^0(Ω), it follows that the solutions ρ_∞ + η(s) on the branch found in Theorem 4.4 are in fact strictly positive for s sufficiently small and are thus stationary solutions by the result of Proposition 4.1. Any interaction potential W(x) such that infinitely many k satisfy the conditions of Theorem 4.4 will have infinitely many bifurcation points (0, β_k) from the trivial branch. A typical example would be a potential for which the map k ↦ Ŵ(k) is strictly negative and injective.
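To make Remark 4.5 concrete, the following sketch lists candidate bifurcation points in d = 1. It rests on an assumption that is not quoted from the displayed formulas: linearising the self-consistency equation, taken here in the form β^{-1} (m/(m − 1)) ρ^{m−1} + W ⋆ ρ = C, about ρ_∞ = 1/L gives β^{-1} m ρ_∞^{m−2} η + W ⋆ η = const, so that a mode cos(2πkx/L) with convolution eigenvalue λ_k = ∫_Ω W(z) cos(2πkz/L) dz < 0 yields the candidate value β_k = −m ρ_∞^{m−2}/λ_k. The potential, the truncation `k_max`, the tolerance, and all function names are hypothetical, and the simplicity condition of Theorem 4.4 still has to be checked separately.

```python
import numpy as np

def conv_eigenvalues(W, L, k_max, n=4096):
    """lambda_k = int_0^L W(z) cos(2 pi k z / L) dz for k = 1..k_max (rectangle rule)."""
    z = np.arange(n) * L / n
    dz = L / n
    ks = np.arange(1, k_max + 1)
    lam = np.array([np.sum(W(z) * np.cos(2.0 * np.pi * k * z / L)) * dz for k in ks])
    return ks, lam

def candidate_bifurcation_points(W, L, m, k_max):
    """Return {k: beta_k} for modes with lambda_k < 0 (assumed linearisation)."""
    rho_inf = 1.0 / L
    ks, lam = conv_eigenvalues(W, L, k_max)
    return {int(k): -m * rho_inf**(m - 2.0) / lam_k
            for k, lam_k in zip(ks, lam) if lam_k < -1e-12}

# Illustrative potential with two negative cosine modes.
L = 1.0
W = lambda z: -np.cos(2.0 * np.pi * z / L) - 0.25 * np.cos(4.0 * np.pi * z / L)
betas = candidate_bifurcation_points(W, L, m=2.0, k_max=5)
```

For this potential only the modes k = 1 and k = 2 have negative eigenvalues, so only two candidate points appear; a potential with infinitely many distinct negative modes would produce an infinite list as `k_max` grows.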
Remark 4.6. Note that β″(0) > 0 for all m ∈ (1, 2) ∪ (3, ∞). This means that the branch turns to the right, i.e. it is supercritical. On the other hand if m ∈ (2, 3), then β″(0) < 0. This means that the branch turns to the left, i.e. it is subcritical. If m ∈ {2, 3} we have that β″(0) = 0. The relation of this phenomenon to the minimisers of the free energy will be discussed in Proposition 5.22.

Minimisers of the Free Energy and Phase Transitions
The nontrivial stationary solutions found as a result of the bifurcation analysis in the previous section need not correspond to minimisers of the free energy, F^m_β(ρ). Indeed, we do not know yet if minimisers even exist. We start first by proving the existence of minimisers of F^m_β. We then show that for β sufficiently small F^m_β has a unique minimiser, namely ρ_∞ (cf. (1.3)).
The natural question to ask then is if this scenario changes for larger values of β. We provide a rigorous definition by which this change can be characterised via the notion of a transition point and define two possible kinds of transition points, continuous and discontinuous. We then provide necessary and sufficient conditions on W for the existence of a transition point and sufficient conditions for the existence of continuous and discontinuous transition points.
We start with a technical lemma that provides us with some useful a priori bounds on the minimisers of F^m_β.
Proof. We start by noting that the following bounds hold We divide our analysis into two cases. For B > 0 and ρ ∈ P( ) let We then have the following bounds on the entropy.
It follows then that we have the following bound on the free energy.
If we define a constant B 1 as follows such that for B > B 1 , 1/| | has a lower value of the free energy than ρ. Case 2: (ρ, B) s.t. ε B < 1 2 We write ρ = ρ B + ρ r , where ρ B := ρ · χ B B and ρ r := ρ − ρ B . We then have the following bound on the entropy.
We can assume without loss of generality that F m β (ρ) < F m β (ρ ∞ ), otherwise the proof is complete. It follows then that By expanding E(ρ), the following estimate can be obtained where we have used the fact that ε B < 1/2. Defineρ r : One can control the second term in the brackets as follows for any δ < 1. Setting δ = 1 2 , we obtain Similarly, for the interaction energy we can compute the difference as follows Using the fact that ε B < 1/2 we can obtain Now, we can define a second constant as follows such that for B > B 2 ,ρ r has a lower value of the free energy than ρ. We now set our constant as follows B β,m := max(B 1 (β, m), 2B 2 (β, m)), and setρ to either be (1/| |) orρ r . The constant 2 in front of B 2 (β, m) follows from the fact thatρ r has been normalised.
The expression for the constant B_{β,m} is explicit, as a result of which we can even obtain some uniform control in m.
We now proceed to the existence result for minimisers of F^m_β.
Theorem 5.3 (Existence of minimisers). Fix β > 0 and m > 1, then F m β : P( ) → (−∞, +∞] has a minimiser ρ * ∈ P( ) ∩ L ∞ ( ). Additionally we have that Proof. We note first that, from (5.1) and (5.2), F m β is bounded below on P( ). Let {ρ n } n∈N be a minimising sequence. Note that by Lemma 5.1 we can pick this sequence such that ρ n L ∞ ( ) ≤ B β,m . By the Banach-Alaoglu theorem we have a subsequence {ρ n k } k∈N and measure ρ * ∈ L ∞ ( ) such that Furthermore, we can find another subsequence (which we do not relabel), such that Note that ρ * is nonnegative a.e. and also has mass one. Thus, is convex and lower semicontinuous in the L 2 ( ) topology. It follows from fairly classical results (cf. [Bre11, Theorem 3.7]) that F m β is also weakly lower semicontinuous. This concludes the proof of existence of minimisers. The bound simply follows from the fact that norms are lower semicontinuous under weak- * convergence.
Proof. The proof of the first statement follows simply by applying Proposition 4.1 and Theorem 3.3 with M = B β,m . For the second statement, letĪ be the closure of I . Then applying (8.17) for some x, y ∈ T d , we have that where a = a(β), C h = C h (β). Setting a = maxĪ a(β) and B to be as in Corollary 5.2, we have that where C h is some new constant depending on B , m, d, and W . Thus, the family {ρ β } β∈I is equicontinuous. It is clearly equibounded from Corollary 5.2. Applying the Arzelà-Ascoli theorem, the result follows. Now that we have shown existence and regularity of minimisers we show that for β small or W ∈ H s minimisers of F m β are unique and given by ρ ∞ . To show this we start with the following lemma which shows positivity of stationary solutions for β sufficiently small.

Lemma 5.5. There exists a δ > 0 depending on m and W, such that for all
Proof. Note that if ρ ∈ P( ) ∩ L m ( ) is stationary, then, by Proposition 4.1, it satisfies on each connected component A of its support with C(A, ρ) given by Thus, we have that ρ ∈ L ∞ ( ). Using a mollification argument and (4.2), one can then obtain the following bound By Theorem 3.3, it follows that ρ is a-Hölder continuous. Note further that we have that Thus, we can choose β to be small enough, dependent on m and W , and apply the bound to argue that Thus, the result follows.
We can now use the positivity estimate of Lemma 5.5 to prove that for β sufficiently small stationary solutions of (3.1) (and thus minimisers of F^m_β) are unique. This improves the result of [CKY13], in which uniqueness is proved only for 1 < m ≤ 2.
Proof. Assume ρ ∈ P(Ω) ∩ L^m(Ω) is a stationary solution of (3.1). Then, we can apply the same argument as in the proof of Lemma 5.5 to obtain It follows that Let us now assume that β < δ, where δ is the constant from the statement of Lemma 5.5. Furthermore, if 1 < m < 2 the constant C(Ω, ρ) in Proposition 4.1 can be controlled as follows where in the last step we have applied Jensen's inequality. Thus, we have for all x ∈ Ω. Thus, for 1 < m < 2, we can apply the above bound to (5.3) to obtain If β is sufficiently small, we have that ‖∇ρ‖_{L^∞(Ω)} = 0. Thus, ρ = ρ_∞ for β sufficiently small. Similarly for 2 ≤ m < ∞, we can apply the bound from Lemma 5.5 to obtain Applying a similar argument as before, we have that, for β ≪ 1, ρ = ρ_∞. Thus, for β ≪ 1, ρ_∞ is the unique stationary solution of (3.1) and, by Proposition 4.1, the unique minimiser of F^m_β.
We also have the following result on uniqueness of minimisers when W ∈ H s .
Proof. We first consider the case in which W ∈ H s . We write the linear interpolant as Differentiating with respect to t twice we obtain that For W ∈ H s the above expression is strictly positive. Thus, F m β (ρ t ) is a convex function, from which it follows that F m β must have unique minimisers. We further argue that the minimiser must be ρ ∞ . Indeed, we have for any where the first inequality follows from Jensen's inequality and the second one from the fact that W ∈ H s and Definition 2.1.
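The role of H-stability in this argument can be explored numerically. The sketch below works in d = 1 and assumes that W ∈ H_s means that all cosine coefficients of W are nonnegative, which is equivalent to the bilinear form ∬ W(x − y)η(x)η(y) dx dy being nonnegative for every η. This reading of Definition 2.1, the tolerance `tol`, the unnormalised coefficients, and the helper names are assumptions made for illustration.

```python
import numpy as np

def cosine_coefficients(W_vals, L):
    """Unnormalised cosine coefficients int_0^L W(x) cos(2 pi k x / L) dx, k = 0..n//2."""
    n = W_vals.size
    # The real part of the DFT of a real signal is its discrete cosine sum.
    return np.fft.rfft(W_vals).real * (L / n)

def is_H_stable(W_vals, L, tol=1e-10):
    """True if every coefficient with k >= 1 is >= -tol (k = 0 vanishes for mean-zero W)."""
    coeffs = cosine_coefficients(W_vals, L)
    return bool(np.all(coeffs[1:] >= -tol))

L, n = 1.0, 256
x = np.arange(n) * L / n
attractive = -np.cos(2.0 * np.pi * x / L)   # first cosine mode is negative
repulsive = np.cos(2.0 * np.pi * x / L)     # first cosine mode is positive
print(is_H_stable(attractive, L), is_H_stable(repulsive, L))
```

Under the stated assumption, a potential for which the check returns True falls in the convex regime treated in the proof, whereas a negative mode leaves room for a transition point at larger β.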
We know now from Lemma 5.6 that, for β ≪ 1, ρ_∞ is the unique minimiser of F^m_β and stationary solution of (3.1). We now present the following result on the long-time behaviour of (3.1) in this regime:
Theorem 5.8 (Long-time behaviour). Let ρ be a weak solution of (3.1) with initial datum ρ_0 ∈ L^∞(Ω) ∩ P(Ω). Assume that β and W are such that ρ_∞ is the unique stationary solution of (3.1) (and, therefore, the unique minimiser of F^m_β). Then, it holds that
Proof. We choose as a test function in the weak formulation φ = pρ^{p−1}, for some p > 1. Note that we can justify this choice by mollifying φ and then passing to the limit. We then obtain from (3.2) the following expression Plugging in the value of φ on the right hand side and integrating by parts, we obtain Applying the Lebesgue differentiation theorem, we obtain that for a.e. t it holds that Note that we can control the second term on the right hand side of the above expression as follows where we have used the fact that 1 < p < (m + p − 1)d/(d − 2) and the constant θ ∈ (0, 1) is given by
We now apply the Sobolev inequality on the torus, to obtain Note that the constant C_d in the above estimate depends only on dimension and is independent of p > 1. We set q_1 := (m + p − 1)/(p(1 − θ)) and q_2 := q_1/(q_1 − 1). Note that from the definition of θ we have Thus, we have that We can thus apply Young's inequality with q_1, q_2 to obtain where C_{p,m,β} > 0 is given by Multiplying through by ‖W‖_{L^∞(Ω)}(p − 1), we can apply the estimate in (5.5) to (5.4) to obtain Applying Grönwall's inequality, we obtain that It follows that It follows then that we can find a constant M dependent on ‖ρ_0‖_{L^∞(Ω)}, d, β, and m but independent of t and p such that for all t ∈ [0, ∞). Passing to the limit as p → ∞, it follows that for all t ∈ [0, ∞). We can now apply Theorem 3.3 to argue that the solution ρ is Hölder continuous with some exponent a ∈ (0, 1). Furthermore, we can apply Corollary 3.4, to argue that for all x, y ∈ T^d and 0 for some E ∈ R. We make Z_E into a complete metric space by equipping it with the d_2(·, ·) Wasserstein distance. The fact that it is complete follows from the fact that F^m_β is lower semicontinuous with respect to convergence in d_2(·, ·). Note that the family of mappings {S_t}_{t≥0} forms a metric dynamical system in the sense of [CH98, Definition 9.1.1]. This follows from the fact that (cf. [AGS08, Theorem 11.2.8]) the evolution defines a gradient flow ρ ∈ C([0, ∞); . We now define the ω-limit set associated to the initial datum Since the metric space Z_{E_0} is compact, it follows that the set ⋃_{t≥0} S_t(ρ_0) is relatively compact in Z_{E_0}. Applying [CH98, Theorem 9.1.8], we have that ω(ρ_0) ≠ ∅ and where ρ(·, t) is the unique solution of (3.1) with initial datum ρ_0 ∈ P(Ω) ∩ L^∞(Ω). We now need to show that ω(ρ_0) is contained in the set of stationary solutions of (3.1).
Assume ρ* ∈ ω(ρ_0), then there exists a time-diverging sequence t_n → ∞ such that Since the solution ρ(·, t) is a gradient flow of the free energy F^m_β with respect to the d_2(·, ·) distance on P(Ω), it follows that the following energy-dissipation equality holds true for all t ∈ [0, ∞) (cf. [AGS08, Theorem 11.1.3]), where |∂F^m_β| is the metric slope of F^m_β and is given by Bounding the energy from below and then passing to the limit as t → ∞ in (5.8), we obtain We now consider the time-diverging sequence t_n → ∞ and the sequence of curves where in the last step we have used (5.9). It follows that |∂F^m_β|(μ(·, t)) = 0 for a.e. t. Thus, since μ is continuous, we can find a sequence of times m ∈ N, Applying Proposition 4.1, it follows that ρ* ∈ Z_{E_0} ⊂ P(Ω) ∩ L^m(Ω) is necessarily a stationary solution of (3.1). Since ρ_∞ is the unique stationary solution, it follows that However, from (5.6) and (5.7), we know that, for any time-diverging sequence t_n → ∞, {ρ(·, t_n)}_{n∈N} has a convergent subsequence in L^∞(Ω), whose limit must be ρ_∞ by (5.10).
Since the limit is unique, it follows that
Remark 5.9. We remark that the technique used in the proof of Theorem 5.8 can be adapted to study the asymptotic properties of general gradient flows in the space of probability measures. These ideas have been expanded upon in [CGW20].
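The convergence to the flat state described in Theorem 5.8 can be observed in a simple simulation. The following is a rough sketch only. It assumes that (3.1) has the form ∂_t ρ = β^{-1} Δρ^m + ∇ · (ρ∇W ⋆ ρ) on the one-dimensional torus, and it uses an explicit conservative finite-volume scheme chosen purely for brevity. The scheme, the step sizes, the potential W(x) = −cos(2πx), the value β = 0.1, and the initial datum are all illustrative choices and are not taken from the paper.

```python
import numpy as np

def step(rho, W_mat, dx, dt, beta, m):
    """One explicit conservative step for rho_t = (1/beta) (rho^m)_xx + (rho (W*rho)_x)_x."""
    V = dx * (W_mat @ rho)                          # V = W * rho (periodic convolution)
    rho_r, V_r = np.roll(rho, -1), np.roll(V, -1)   # values at the right neighbour i+1
    # Flux J at the interface i+1/2, where rho_t = -J_x and
    # J = -(1/beta) (rho^m)_x - rho V_x, with rho averaged across the interface.
    J = -(rho_r**m - rho**m) / (beta * dx) - 0.5 * (rho + rho_r) * (V_r - V) / dx
    return rho - dt * (J - np.roll(J, 1)) / dx      # subtract (J_{i+1/2} - J_{i-1/2}) / dx

L, n, beta, m = 1.0, 64, 0.1, 2.0
dx = L / n
x = np.arange(n) * dx
idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
W_mat = (-np.cos(2.0 * np.pi * x / L))[idx]         # matrix of W(x_i - x_j)
rho = 1.0 + 0.5 * np.cos(2.0 * np.pi * x / L)       # initial probability density
dev0 = np.max(np.abs(rho - 1.0 / L))                # initial sup-distance to the flat state
dt = 2e-6                                           # below dx^2 / (2 * max diffusivity)
for _ in range(5000):                               # integrate up to t = 0.01
    rho = step(rho, W_mat, dx, dt, beta, m)
dev = np.max(np.abs(rho - 1.0 / L))                 # final sup-distance to the flat state
mass = np.sum(rho) * dx
```

Because the update is written in flux form, the discrete mass is conserved up to rounding, and for this small value of β the sup-norm distance `dev` to the flat state shrinks markedly relative to `dev0`.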
From Theorem 5.7, it is also immediately clear that W ∈ H_s^c is a necessary condition for the existence of a nontrivial minimiser at higher values of the parameter β. Indeed, Theorem 5.7 tells us that if W ∈ H_s then minimisers of F^m_β are unique and are given by ρ_∞. Before we discuss this any further, we introduce a notion of transition point that allows us to capture a change in the set of minimisers. We further classify transition points into discontinuous and continuous transition points.

Definition 5.11 (Continuous and discontinuous transition points). A transition point
such that β_c is a transition point of F^m_β. Thus, W ∈ H_s^c is a necessary and sufficient condition for the existence of a transition point.
if it is defined uniquely. If not we pick any k that realises the minimum of the above expression. We now consider an expansion of the energy F m β (ρ ε ) around ρ ∞ which we will use repeatedly throughout the rest of this section. We Taylor expand around ρ ∞ to obtain where the function f (x) ∈ (ρ ∞ , ρ ε (x)). For ε > 0 small enough, the highest order term can be controlled as follows For β > β m , the second order term in the above expression has a negative sign. Thus, for ε > 0 sufficiently small we have that F m β (ρ ε ) < F m β (ρ ∞ ). Since, by Theorem 5.3, minimisers of F m β exist for all β > 0, it follows that for all β > β m there exist nontrivial minimisers of the free energy. Thus, there exists some β c ≤ β m which is a transition point of the free energy F m β (ρ).
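The mechanism in the argument above, namely that a small perturbation of ρ_∞ along an unstable Fourier mode lowers the free energy once β is large enough, can be checked numerically. The sketch below assumes that the free energy has the form β^{-1}/(m − 1) ∫ ρ^m dx + (1/2) ∬ W(x − y)ρ(x)ρ(y) dx dy and takes m = 2 and W(x) = −cos(2πx) on the unit torus. Under these assumptions the second-order coefficient β^{-1} m ρ_∞^{m−2} + λ_1, with λ_1 = ∫ W(z) cos(2πz) dz = −1/2, changes sign at β = 4. The form of the energy, the threshold value derived from it, and all names are illustrative assumptions.

```python
import numpy as np

def free_energy(rho, W_mat, dx, beta, m):
    """Assumed discrete free energy: entropy term plus interaction term."""
    entropy = np.sum(rho**m) * dx / (beta * (m - 1.0))
    interaction = 0.5 * dx * dx * (rho @ W_mat @ rho)
    return entropy + interaction

n, m, eps = 128, 2.0, 1e-2
dx = 1.0 / n
x = np.arange(n) * dx
idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
W_mat = (-np.cos(2.0 * np.pi * x))[idx]              # W(x_i - x_j) on the unit torus
rho_flat = np.ones(n)                                # flat state equals 1 when L = 1
rho_eps = rho_flat + eps * np.cos(2.0 * np.pi * x)   # perturbation along the k = 1 mode

def energy_gap(beta):
    """F(rho_eps) - F(rho_flat); a negative value means the flat state is not a minimiser."""
    return (free_energy(rho_eps, W_mat, dx, beta, m)
            - free_energy(rho_flat, W_mat, dx, beta, m))

gap_below = energy_gap(3.0)   # beta below the assumed threshold 4
gap_above = energy_gap(5.0)   # beta above the assumed threshold 4
```

For m = 2 the energy is quadratic in ρ, so the second-order expansion is exact in this example and the sign of the gap flips exactly at the assumed threshold.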
Remark 5.13. We note here that the β m defined in the statement of Proposition 5.12 corresponds exactly to the point of critical stability of the uniform state ρ ∞ , i.e. if the stationary problem is linearised about ρ ∞ , then β m corresponds to the value of the parameter at which the first eigenvalue of the linearised operator crosses the imaginary axis.
Before attempting to provide conditions for the existence of continuous and discontinuous transition points we define the function F m : Lemma 5.14. For all β > 0, the function F m is continuous. Assume further that there exists β > 0 and P( ) Proof. We note that for 0 < β ≤ β c (where β c is possibly +∞) we have that F m (β) = F m β (ρ ∞ ) which is clearly a continuous function of β. Let β 2 > β 1 > β c (if β c < ∞, else we are done) and let ρ β 1 be the minimiser of F m β 1 . Note however due to the structure of the free energy we have that To obtain continuity of F m , note that the steps of the above equation would still hold with β 1 and β 2 exchanged. Using that ρ β 1 and ρ β 2 are uniformly bounded by Theorem 5.3, one has the desired continuity. Assume now that F m β (ρ β ) = F m β (ρ ∞ ) and let β > β . We then have that We will now try and refine our descriptions of discontinuous and continuous transition points in analogy with the results in [CP10,CGPS20]. Proof. We know already from Proposition 5.12 that β c ≤ β m . Let us assume that β c < β m . We know from Definition 5.11 that ρ ∞ is the unique minimiser of F m β c . Additionally for any sequence of minimisers {ρ β } β>β c we know that lim sup Consider such a sequence and set η β = ρ β −ρ ∞ . For β > β c , we expand the free energy about ρ ∞ as follows where f (x) ∈ (ρ ∞ , ρ β (x)) and can be bounded by Note that due to the fact that . Since β c < β m , the term in the brackets is positive close to β c we obtain a contradiction as ρ β is a nontrivial minimiser of F m β . Thus, we must have that β c = β m .
From Definition 5.11, we see that some β c > 0 is a discontinuous transition point if it violates either (or both) of the conditions (1) and (2). In the following lemma, we will show that if (2) is violated then (1) is as well.
Lemma 5.16. Assume β c > 0 is a discontinuous transition point of the energy F m β and that for some family of minimisers {ρ β } β>β c it holds that Then there exists P( ) ∋ ρ β c ≠ ρ ∞ such that: Consider a sequence of points {β n } n∈N with β n > β c and β n → β c as n → ∞. We know that the set of minimisers {ρ β n } n∈N is compact in C 0 ( ) ∩ P( ) from Lemma 5.4. Thus, there exists a subsequence ρ β n ∈ {ρ β } β>β c (which we do not relabel) and a limit ρ β c ∈ P( ) ∩ C 0 ( ) such that From the statement of the lemma we know that ρ β c ≠ ρ ∞ . All that remains is to show that ρ β c is a minimiser of F m β c . We first note that lim n→∞ F β n (ρ β n ) = F β c (ρ β c ). This follows from the fact that the interaction energy E is continuous on C 0 ( ) ∩ P( ) for W ∈ C 2 ( ) and the entropy S m β is essentially an L m -norm and is thus also controlled by the C 0 ( ) topology. Finally we use the result of Lemma 5.14 to note that which completes the proof of (1). The proof of (2) follows immediately from the fact that ρ ∞ is the unique minimiser of S m β (ρ) on P( ) (which is a consequence of Jensen's inequality).
Remark 5.17. The above lemma tells us that we have not lost much by defining discontinuous transition points with respect to the L ∞ ( ) norm since the transition points obtained are discontinuous with respect to the L p ( ) norm as well for all p ∈ [1, ∞]. Indeed, if we consider the sequence {ρ β n } n∈N constructed in the proof of Lemma 5.16, it follows that where ρ β c is the limiting object obtained in the proof of Lemma 5.16. Thus, lim sup In the following proposition we outline the strategy we will use to provide sufficient conditions for the existence of continuous and discontinuous transition points. Proof. For the proof of Proposition 5.18(a) we note that β c already satisfies condition (1) of Definition 5.11. All we need to show is that it satisfies condition (2). Assume β c < β m , then by the very definition of a transition point we would have a contradiction since ρ ∞ is the unique minimiser of F m β at β = β m . It follows then that β c = β m . Assume now that condition (2) of Definition 5.11 is violated, i.e. there exists a family of minimisers {ρ β } β>β m such that By Lemma 5.16 it follows that there exists P( ) ∋ ρ β m ≠ ρ ∞ which minimises F m β m . This is a contradiction.
For Proposition 5.18(b), we note that since ρ ∞ is not a minimiser at β = β m , by Definition 5.10 and Proposition 5.12 it follows that β c < β m . Thus, by Lemma 5.15, β c is a discontinuous transition point.
The next theorem provides conditions on the Fourier modes of W (x) for the existence of discontinuous transition points. It can be thought of as the analogue for the case of nonlinear diffusion.
Theorem 5.19. Assume W ∈ H c s and m ≠ 2. Define, for some δ > 0, the set K δ as follows We define δ * to be the smallest value, if it exists, of δ for which the following condition is satisfied: We remark that two of the modes in the above expression can be repeated. For example, we could have k a = 2, k b = 1, k c = 1. Then if δ * is sufficiently small, F m β exhibits a discontinuous transition point at some β c < β m .
Proof. We know already from Proposition 5.12 that the system possesses a transition point β c . We are going to use Proposition 5.18(b) and construct a competitor ρ ∈ P( ) which has a lower value of the free energy than ρ ∞ at β = β m . Define the function for some ε > 0, sufficiently small. We denote by |K δ * | the cardinality of K δ * , which is necessarily finite as W ∈ L 2 ( ). Expanding the free energy about ρ ∞ we obtain where the function f (x) ∈ (ρ ∞ , ρ ε (x)). We use the definition of β m and control the highest order term in the same manner as Proposition 5.12 to simplify the expansion as follows: One can now check that under condition (A1), it holds that where the constant a is independent of δ * . Indeed, the cube of the sum of n numbers a i , i = 1, . . . , n consists of only three types of terms, namely: a 3 i , a 2 i a j and a i a j a k . Setting the a i = w s(i) , with s(i) ∈ K δ * , one can check that the first type of term will always integrate to zero. The sum of the other two will take nonzero and in fact positive values if and only if condition (A1) is satisfied. This follows from the fact that, for positive integers ℓ, m, n, $\int_{-\pi}^{\pi} \cos(\ell x)\cos(mx)\cos(nx)\,dx = \frac{\pi}{2}\left(\delta_{\ell+m,n} + \delta_{m+n,\ell} + \delta_{n+\ell,m}\right)$.
Also the term γ (m) 3 m(m − 2) is always negative. Thus, for δ * sufficiently small, considering the fact that |K δ * | ≥ 2 and is nonincreasing as δ * decreases, ρ ε has smaller free energy and ρ ∞ is not a minimiser at β = β m .
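The orthogonality identity for products of three cosines used in the proof above can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the frequencies are called l, p, q in the code to avoid a clash with the diffusion exponent m. A uniform Riemann sum is used because it is exact (up to rounding) for trigonometric polynomials whose degree is below the number of sample points.

```python
import math


def triple_cosine_integral(l, p, q, samples=4096):
    """Riemann sum of cos(l x) cos(p x) cos(q x) over [-pi, pi].

    For integer frequencies with l + p + q < samples the uniform rule is
    exact up to floating-point rounding, since the integrand is periodic.
    """
    h = 2.0 * math.pi / samples
    total = 0.0
    for i in range(samples):
        x = -math.pi + i * h
        total += math.cos(l * x) * math.cos(p * x) * math.cos(q * x)
    return total * h


def predicted_value(l, p, q):
    """Closed form (pi/2) * (delta_{l+p,q} + delta_{p+q,l} + delta_{q+l,p})."""
    hits = int(l + p == q) + int(p + q == l) + int(q + l == p)
    return 0.5 * math.pi * hits


if __name__ == "__main__":
    # (2, 1, 1) is the repeated-mode example mentioned in Theorem 5.19.
    for modes in [(2, 1, 1), (1, 2, 3), (1, 1, 1), (3, 5, 7)]:
        print(modes, triple_cosine_integral(*modes), predicted_value(*modes))
```

Only triples in which one frequency is the sum of the other two give a nonzero integral, which is the resonance that condition (A1) encodes.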
Remark 5.20. The case m = 2 is special, as transition points for any W ∈ H c s are necessarily discontinuous. This case will be treated in detail in Proposition 5.22.
The following lemma shows that discontinuous transitions are stable in m.
It would be sufficient for the purposes of this proof to show that such a nontrivial minimiser exists for F m β m for m close enough to m . Choosing ρ * to be the competitor state, we have a m − 1), a ≥ 0 as m → m , it follows, using the fact that ρ * ∈ C 0 ( ), that we can choose m close enough to m so that the above term is strictly positive. We then have that for m ∈ (m − ε, m + ε) for some ε > 0 small enough, ρ ∞ is not a minimiser of the free energy F m β m (ρ). By Proposition 5.18(b), it follows that F m β possesses a discontinuous transition point at some β m c < β m . The case m = 1 can be treated similarly.
In the following proposition, we single out some special values of m at which one always finds a discontinuous transition point for W ∈ H c s .

Proposition 5.22. Assume W ∈ H c s such that β c is a transition point of F m β . Then if m ∈ [2, 3], β c is a discontinuous transition point. Specifically for the case m = 2 we have that
(1) β 2 c = β 2 (2) There exists a one parameter family of minimisers Proof. We will try again to show that we have a competitor at β m . We start with the case 2 < m < 3. Consider the competitor ρ ε = ρ ∞ + εe k for ε > 0 and small and k := arg min k∈N d \{0}Ŵ (k)/ (k) if it is uniquely defined or any one such k if it is not. Expanding the energy up to fifth order and noting that second order terms vanish we obtain where the function f (x) ∈ (ρ ∞ , ρ ε (x)). We again bound the highest order term as in Proposition 5.12 and use the fact that $\int e_k^3 \, dx = 0$ for any k ∈ N d \ {0} to obtain Since m(m − 2)(m − 3) is negative for m ∈ (2, 3), for ε > 0 sufficiently small, we have shown that ρ ∞ is no longer the minimiser of F m β m . The result follows by Proposition 5.18(b): we have a discontinuous transition point at some β c < β m .
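The sign argument in the proof above rests on the Taylor coefficients of the entropy density about ρ ∞ . The sketch below is illustrative only and assumes that the entropy density is s ↦ s^m/(m − 1), up to the factor β^{-1}; this matches the factor m(m − 2)(m − 3) appearing in the proof, but the precise normalisation is an assumption. The function name is hypothetical.

```python
import math


def entropy_taylor_coefficient(m, rho_inf, order):
    """Coefficient of eta**order in the expansion of (rho_inf + eta)**m / (m - 1).

    Assumes m > 1 and rho_inf > 0. For order >= 2 the factor (m - 1) cancels,
    so order 3 carries m(m - 2) and order 4 carries m(m - 2)(m - 3).
    """
    falling = 1.0
    for j in range(order):
        falling *= (m - j)  # m (m - 1) ... (m - order + 1)
    return falling * rho_inf ** (m - order) / ((m - 1) * math.factorial(order))


if __name__ == "__main__":
    rho_inf = 1.0
    for m in (1.5, 2.0, 2.5, 3.0, 4.0):
        c3 = entropy_taylor_coefficient(m, rho_inf, 3)
        c4 = entropy_taylor_coefficient(m, rho_inf, 4)
        print(f"m = {m}: third-order {c3:+.4f}, fourth-order {c4:+.4f}")
```

Since the cubic term integrates to zero for a single mode e k , the fourth-order coefficient decides the sign of the energy difference at β = β m ; it is negative exactly for m ∈ (2, 3), vanishes at m = 2 and m = 3, and is positive for m > 3.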
We conclude the section by discussing the existence of continuous transition points. We show that for m = 4 one can construct a large class of potentials for which the transition point β c is continuous. We start with the following proposition.
Proposition 5.23. Let k ∈ N d be such that k ≢ 0 and let k i ∈ N, i = 1, . . . , d be such that Then we have: Note that P 2 (k) ∩ P 3 (k) = ∅. Similarly, we have that where the constants c σ 1 ,σ 2 j , c σ 1 ,σ 2 ,σ 3 , C 0 , C σ k ∈ R depend only on d, k, and ρ ∞ but are independent of the coefficients a σ (k) ∈ R.
We now proceed to the result concerning continuous transition points for m = 4.
is uniquely defined. Furthermore, we assume thatŴ (k) ≥ 0 for all k = k and that where the sets P 2 , P 3 and the constants c σ 1 ,σ 2 j , c σ 1 ,σ 2 ,σ 3 are as defined in Proposition 5.23.
Then β c = β 4 is a continuous transition point. Note that the constant (k) for k ∈ N d is as defined in (2.1).
Proof. We will rely on Proposition 5.18(a) for the proof of this result. We need to show that, at β = β 4 , ρ ∞ is the unique minimiser of F 4 β . Let ρ ∈ P( ) ∩ L ∞ ( ) be any measure different from ρ ∞ . Then it is sufficient to show that F 4 β 4 (ρ) > F 4 β 4 (ρ ∞ ) (it is sufficient to check bounded densities from the result of Lemma 5.1). We now define η := ρ − ρ ∞ and note that η has the following properties We can compute the free energy of ρ as follows where we have used (2.2). Simplifying further, by using the definition of β 4 and the fact that η has mean zero, we obtain (5.12) We define η 2 := η − f η,k where f η,k = σ ∈Sym k ( )η (σ (k ))e σ (k ) and deal with the two terms I 1 and I 2 separately. We then have where we have used the fact that We now use the fact that η has mean zero from (5.11) and Proposition 5.23 to obtain (5.13) For the second term we obtain [ f 4 η,k + 4 f 3 η,k η 2 + 6 f 2 η,k η 2 2 + 4 f η,k η 3 2 + η 4 2 ] dx.
(5.15) Putting (5.12), (5.13), (5.14), and (5.15) together, we obtain Note now that Thus, it follows that where in the last step we have simply used the fact thatŴ (k) ≥ 0 for all k = k . We now note that where we have used the fact that |Sym k ( )| = (k) 2 . Additionally, we have that where in the last step we applied Jensen's inequality and used the fact that the integrand has unit L 2 ( ) norm. For any k ∈ N d , we define the following quantity and note that Finally, we can rewrite the inequality in (5.16) as Assume that |η| k = 0. Then (A2) and (A3) along with the expression for β 4 , (5.17), and the fact that |η(k)| ≤ N k , imply that the discriminants of the quadratic expressions in (5.18) are all negative, i.e. Similarly, On the other hand if |η| k = 0, the proof follows by noting that any contribution from the interaction energy is positive and that ρ ∞ is the unique minimiser of S β,4 (ρ). The fact that β c = β 4 is a consequence of Lemma 5.15.
Remark 5.25. Note that although the assumptions in Theorem 5.24 seem complicated, all they really require is that all Fourier coefficients of W , except the dominant negative mode Ŵ (k ), are nonnegative and that finitely many of them are "positive enough" compared to Ŵ (k ). Consider d = 1, with W (x) = w 1 e 1 (x) + w 2 e 2 (x) + w 3 e 3 (x) with w 1 < 0 and w 2 , w 3 > 0. If, for some explicitly computable positive constants c 2 , c 3 > 0, w 2 > c 2 |w 1 | and w 3 > c 3 |w 1 |, the conditions of Theorem 5.24 are satisfied and the transition point β c = β 4 is continuous. In this setting, P 2 (1) = {e 2 } and P 3 (1) = {e 3 }.

The Mesa Limit m → ∞
A natural question to ask is how the sequence of free energies F m β : P( ) → (−∞, +∞] behaves in the limit as m → ∞. We conjecture the following limit free energy, F ∞ : This is analogous to the so-called mesa limit of the porous medium equation considered by Caffarelli and Friedman [CF87]. It is also studied in [CKY18,CT20] for Newtonian interactions and [KPW19] for general drift-diffusion equations. We rederive the result in our setting.
(1) Recovery sequence: For each ρ ∈ P( ) ∩ L ∞ ( ) we choose ρ m = ρ as the recovery sequence. The interaction energy term remains unchanged as it is independent of m, while (m − 1) −1 converges to 0 as m → ∞. Assume first that ρ L ∞ ( ) > 1. It follows that there exists some ε > 0 and a set A of positive measure such that ρ| A > 1 + ε. Thus, we have and thus F m β (ρ) → ∞ for all ρ L ∞ ( ) > 1. Now, let us assume that ρ L ∞ ( ) ≤ 1. This gives us and thus completes the construction of the recovery sequence.
(2) Γ-lim inf: Assume that there exists {ρ m } m≥1 such that ρ m ⇀ ρ in L ∞ -weak- * . For W ∈ C 2 ( ), the interaction energy is continuous and so we can disregard its behaviour. We start with the case in which ρ L ∞ ( ) ≤ 1. In this case the entropic term, S m β (ρ m ), can be controlled from below by 0 and thus the Γ-lim inf holds trivially. The other case left to treat is when ρ L ∞ ( ) > 1. This implies again that there exists some ε > 0 and a set of positive measure A such that ρ| A > 1 + ε. It follows from the weak- * convergence that for some fixed positive constant δ > 0 independent of m. We define the sets A + m := {x ∈ A : ρ m > (1 + ε)} and A − m := A \ A + m . There also exists N ∈ N such that for m ≥ N , A ρ m dx ≥ (1 + ε)|A| + δ/2. Thus, for m ≥ N we have that This gives us the estimate we need on the entropic term since Passing to the limit as m → ∞, the result follows.
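The dichotomy driving both steps of the argument above can be illustrated on piecewise constant densities. This is a minimal sketch under the assumption that the entropic term has the form (β(m − 1))^{-1} ∫ ρ^m dx; the function name, the domain of measure 2, and the two sample densities are hypothetical choices made only for illustration.

```python
def entropy(m, pieces, beta=1.0):
    """Entropic term (beta (m - 1))^{-1} * integral of rho^m for a piecewise
    constant density given as (value, measure of the set) pairs."""
    return sum(value ** m * measure for value, measure in pieces) / (beta * (m - 1))


# Two probability densities on a domain of total measure 2.
bounded = [(0.8, 0.5), (0.4, 1.5)]       # sup norm 0.8 <= 1, mass 0.4 + 0.6 = 1
peaked = [(1.2, 0.5), (4.0 / 15.0, 1.5)]  # sup norm 1.2 > 1, mass 0.6 + 0.4 = 1

if __name__ == "__main__":
    for m in (2, 10, 50, 200):
        print(m, entropy(m, bounded), entropy(m, peaked))
```

For the density bounded by 1 the entropic term decays to zero as m grows, while for the density exceeding 1 on a set of positive measure it blows up, consistent with a limit functional that is finite only on densities with sup norm at most 1.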
We would now like to understand how the presence of phase transitions for finite m affects the minimisers of F ∞ . This is discussed in the next result.
Theorem 6.2 (Minimisers of the mesa problem). Let F ∞ : P( ) → (−∞, +∞] be as defined in (6.1). Then Proof. The proof of Theorem 6.2(a) follows from the fact that if | | < 1, then for any ρ ∈ P( ) ∩ L 1 ( ) there exists a set A of positive measure such that ρ(x) > 1 for all x ∈ A. Indeed, if this were not the case we would have that which would be a contradiction. Thus, we have that ρ L ∞ ( ) > 1 for all ρ ∈ P( ) ∩ L 1 ( ) and so F ∞ ≡ ∞. The proof of Theorem 6.2(b) is similar. If ρ ≠ ρ ∞ , we can again find a set of positive measure A such that ρ(x) > 1 for all x ∈ A. We then repeat the same argument as in the previous case.
Assume now that | | > 1 and W ∈ H s , W ≢ 0 (if W is identically zero then clearly F ∞ ≡ 0). Since W is mean-zero we have that On the other hand if P( ) ∩ L ∞ ( ) ∋ ρ ≠ ρ ∞ , we know from Definition 2.1, that Finally consider the case W ∈ H c s . Let β > 0 be fixed and note that, since | | > 1, β m → 0 as m → ∞. Clearly for m large enough a nontrivial minimiser ρ m ∈ P( ) exists for β > 0 from the result of Proposition 5.12. Consider the measure ρ ε = ρ ∞ + εe k where k is as defined previously. We then have the following bound where the function f (x) ∈ (ρ ∞ , ρ ε (x)). Note that | f | ≤ (ρ ∞ + εN k ). Thus, we have the bound Additionally note that if ε is small enough and ρ ∞ < 1, the last term tends to 0 as m → ∞. Also since W ∈ H c s , the second term in the above expression is negative for m large enough as mρ m ∞ → 0 as m → ∞. It follows from this that, for m large enough, the following estimate holds where C 1 , C 2 > 0 are independent of m. It thus follows from Theorem 6.1, (6.2), and the definition of Γ-convergence that where ρ ∈ P( ) is the minimiser of F ∞ . Thus, ρ ≠ ρ ∞ and the result follows.

Numerical Experiments
The numerical experiments in this section are meant to shed light on the qualitative features of the global bifurcation diagram of the system, while also serving as a source of possible conjectures that can be studied in future work. They were performed using a modified version of the numerical scheme in [CCH15]. Fig. 1 shows the branches of stationary solutions obtained in the long-time limit for m ≥ 2 and W = − cos(2π x/L). The black dot denotes the point of linear stability β m while the red dot denotes the value of β at which the support of the stationary solution is a strict subset of T. Note that the diagram does not necessarily reflect the actual bifurcation diagram of the system as it is obtained from the long-time dynamics and thus will only see stable solutions. We already know that this choice of W satisfies the conditions of Theorem 4.4 and so there will be a bifurcation at β m (the black points in Fig. 1). One would expect this branch to turn to the right for m ∈ (2, 3) (cf. Remark 4.6) and then turn back. We conjecture that the red points are all saddle-node bifurcations and correspond to discontinuous phase transitions for m ≥ 2 due to Lemma 5.15 and the fact that they lie ahead of the corresponding β m . In Fig. 2, we plot the stationary solutions observed in the long-time limit for m large and β > β c . Since the stationary solutions are potentially minimisers of F m β and the minimisers converge to the minimisers of F ∞ as m → ∞ (cf. Theorem 6.1), the plots in Fig. 2 provide us with some information about the structure of the minimisers of the mesa problem. They appear to converge to the indicator function of some fixed set. A natural next question is what happens to the continuity of phase transitions in the limit as m → ∞.
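To give a concrete feel for how long-time dynamics of this kind can be computed, the following is a minimal first-order upwind finite-volume sketch on the one-dimensional torus. It is not the modified scheme of [CCH15] used for the figures. It assumes that the evolution takes the gradient-flow form ∂ t ρ = ∂ x (ρ ∂ x ξ) with ξ = m/(β(m − 1)) ρ^{m−1} + W ⋆ ρ, and that the free energy is (β(m − 1))^{-1} ∫ ρ^m dx + ½ ∫∫ W ρ ρ; all function names, the grid size, the time step and the initial datum are hypothetical choices.

```python
import math


def convolve(w, rho, dx):
    """Periodic discrete convolution (W * rho)_i = dx * sum_j W(x_i - x_j) rho_j."""
    n = len(rho)
    return [dx * sum(w[(i - j) % n] * rho[j] for j in range(n)) for i in range(n)]


def free_energy(rho, w, dx, m, beta):
    """Discrete free energy: entropy term plus one half of the interaction term."""
    conv = convolve(w, rho, dx)
    return dx * sum(r ** m / (beta * (m - 1)) + 0.5 * r * c for r, c in zip(rho, conv))


def step(rho, w, dx, dt, m, beta):
    """One explicit Euler step of the upwind flux-form scheme (periodic)."""
    n = len(rho)
    conv = convolve(w, rho, dx)
    xi = [m / (m - 1) * r ** (m - 1) / beta + c for r, c in zip(rho, conv)]
    flux = []
    for i in range(n):
        u = -(xi[(i + 1) % n] - xi[i]) / dx  # velocity at interface i + 1/2
        flux.append(max(u, 0.0) * rho[i] + min(u, 0.0) * rho[(i + 1) % n])
    # flux[i - 1] wraps around for i = 0, which enforces periodicity
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


def simulate(m=2.0, beta=1.0, L=1.0, n=32, dt=1e-4, steps=200):
    """Evolve a perturbed uniform state; return density, mass, initial and final energy."""
    dx = L / n
    centres = [(i + 0.5) * dx for i in range(n)]
    w = [-math.cos(2 * math.pi * k * dx / L) for k in range(n)]  # W = -cos(2 pi x / L)
    rho = [(1.0 + 0.1 * math.cos(2 * math.pi * xc / L)) / L for xc in centres]
    e0 = free_energy(rho, w, dx, m, beta)
    for _ in range(steps):
        rho = step(rho, w, dx, dt, m, beta)
    return rho, dx * sum(rho), e0, free_energy(rho, w, dx, m, beta)


if __name__ == "__main__":
    rho, mass, e0, e1 = simulate()
    print("mass:", mass, "energy before/after:", e0, e1)
```

The flux form conserves mass up to rounding, and for the small time step chosen here the discrete energy decreases along the run. Being explicit, the sketch needs a time step of order dx² and only finds dynamically stable states, which is the same limitation noted above for the computed bifurcation diagram.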

Proof of Hölder Regularity
We divide the proof into two parts. In Sect. 8.1, we derive some a priori estimates that will be useful in the proof of regularity. In Sect. 8.2, we perform the so-called reduction of oscillation scheme and complete the proof of Theorem 3.3. As mentioned earlier, readers interested only in bifurcations and phase transitions can skip directly to Sect. 4. Before turning to the proof of Theorem 3.3, we introduce some notation. Since Eq. (3.1) is invariant under translations of the coordinate axes, we define the parabolic cylinder centred at (0, 0) and note that we can move it to any point by adding (x 0 , t 0 ). We also use K R as a shorthand for [−R, R] d . We denote the parabolic boundary by We use the following shorthand notation: Additionally, we consider the cut-off functions ζ such that Through the rest of this section we will also use f (x, t) to denote W ⋆ ρ(x, t). Note that The reader should note that the proof of regularity holds for any f ∈ C 2 ( ) for which one can prove bounds of the kind shown above. We note before starting the proof that all estimates in the proof have constants that depend continuously on β > 0. Thus, the Hölder exponent a and semi-norm |ρ| a also depend continuously on β > 0.

A priori estimates.
There are two a priori estimates that play a key role in the proof of Hölder regularity: a Caccioppoli-type energy estimate and a logarithmic estimate. The proof of the energy estimate is essentially the same as [Urb08, Proposition 2.4] and we state it without proof.
Lemma 8.1 (Energy estimates). Pick k, ∈ R + and some cut-off function ζ , such that ζ = 0 on ∂ p Q(τ, R). Then it holds for any weak solution of (3.1) that Similarly we have, We note that Urbano [Urb08, Proposition 2.4] proves the above energy estimate for the p-Laplace equation, ∂ t ρ − Δ p ρ = 0. The proof in our setting follows the same technique. We test the weak formulation in Definition 8.3 against φ = ((ρ ± ) h − k) ± ζ 2 , for some cut-off function ζ supported in Q(τ, R) and integrate by parts. Applying similar bounds as in [Urb08, Proposition 2.4] and then passing to the limit as h → 0, we obtain the desired energy estimate. We also refer the reader to [Rod16, Proposition 2.7] where the proof of the energy estimate is carried out for the porous medium equation, ∂ t ρ − Δρ m = 0, which is closer in structure to (3.1). We now move on to the logarithmic estimate. The proof of this needs to be adapted from the classical estimate in the presence of the drift term ∇ · (∇ fρ). Before stating and proving it, we introduce the following function where s is a bounded, measurable function on Q(τ, R) and The function has certain useful properties, namely, We also need to define the Steklov average for any ρ ∈ L 1 ( ×[0, T ]) for any 0 < h < T as follows The Steklov average has certain nice properties which we state without proving. Using this we have the following alternative notion of a weak solution of Definition 8.3. A weak solution of (3.1) is a bounded measurable function and ρ(x, 0) = ρ 0 . Proposition 8.4 [Urb08]. The notions of weak solution introduced in Theorem 3.1 and Definition 8.3 are equivalent.
Lemma 8.5 (Logarithmic estimates). Let ρ be a nonnegative weak solution of (3.1) and ζ be a time-independent cut-off function, then it holds that for any −τ ≤ t ≤ 0.
Proof. We start by testing (8.2) against ((ψ ± ) 2 ) (ρ h )ζ 2 and integrating by parts to obtain Consider the first term on the LHS and integrating from −τ to t t −τ ×{s} Passing to the limit as h → 0 we obtain that Now consider the second term on the LHS of (8.3) (after passing to the limit as h → 0) where the last expression follows from Young's inequality. Finally we consider the last term on the LHS of (8.3) (after passing to the limit as Putting it all together we obtain Taking into account the support of ζ , one obtains the result of the lemma.

Proof of Theorem 3.3.
We now get to the meat of the regularity argument, i.e. the reduction of oscillation. We assume again that ρ is a nonnegative weak solution of (3.1). We pick a cylinder Q(4R 2−ε , 2R) that lies inside T (shifted to (0, 0)) for 0 < R < 1. Then we can define We then define the rescaled cylinder which holds true if For a fixed ε > 0, α ∈ (0, 1) if the above inequality does not hold true for any R that can be made arbitrarily small, it follows that ω is comparable to the radius of the cylinder and thus we have Hölder continuity already. The proof of this statement is by contradiction. Let ω R := ess osc Q(4R 2−ε ,2R) ρ. Then for any point (x, t) ∈ T we set R := d T d (x, 0) + |t| 1/2 , the parabolic distance to the origin. Thus, we have We will specify the value of α later. We thus have by this inclusion that ess osc Q(ω 1−m R 2 ,R) ρ ≤ ω.
We will also assume throughout the remainder of this proof that μ − < ω/4, as otherwise the equation is uniformly parabolic in Q(4R 2−ε , 2R). Before we proceed we pick some ν 0 ∈ (0, 1) and divide our analysis into two cases. Case 1 We now treat the two cases independently.

Reduction of oscillation in case 1
In the first case, we start by proving the following result.
(2) = μ − + ω/4 < k n which implies that . On the other hand if ρ − = , we have that ρ ≤ < k n we have that We now proceed to bound individual terms on the RHS of (8.1). For the first term we have: For the second term: For the third term: For the final two terms we have: where in the last step we have used the fact that R ε ω 1−m < α < 1 and that R < L.
Putting the bounds for the LHS and RHS of (8.1) together we obtain Let t̄ = ω m−1 t and define the following rescaled functions In these new variables the inequality simplifies to Furthermore we have , where in the last step we have used the embedding into the parabolic space V 2 (cf. Lemma A.4). Thus, we have where we have used the fact that ζ̄ n = 1 on Q(R 2 n+1 , R n+1 ) and have used (8.7). Thus, we have where we use the fact that |Q(R 2 n , R n )| = R d+2 n+1 ≤ R d+2 and R n /R n+1 ≤ 2. Setting we have the recursive inequality with the constant C independent of ω, R, n and dependent only on d, m, β, f . Setting $\nu_0 = C^{-(d+2)/2} 4^{-(d+2)^2/2}$, we see that X 0 ≤ ν 0 is equivalent to (8.5) being satisfied with constant ν 0 , since k 0 = ω/2. Thus, for this choice, X n → 0 by the geometric convergence lemma (cf. Lemma A.2). It follows then, after changing variables, that ρ − > μ − + ω/4 a.e. in Q(ω 1−m (R/2) 2 , R/2). The result follows by noting that ρ − > μ − + ω/4 = implies that ρ − = ρ. Corollary 8.7 (Reduction of oscillation in case 1). Assume that (8.5) holds with constant ν 0 as specified in the proof of Lemma 8.6. Then there exists a σ 1 ∈ (0, 1), independent of ω, R, such that Proof. We have by the result of the previous lemma that Thus, we have that Thus, the result holds with σ 1 = 3/4.
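The role of the smallness condition X 0 ≤ ν 0 can be illustrated on the standard form of the fast geometric convergence lemma. Since the precise statement of Lemma A.2 and the recursive inequality are not reproduced here, the sketch below assumes the usual form X n+1 ≤ C b^n X n^{1+α}, for which X 0 ≤ C^{-1/α} b^{-1/α²} forces X n ≤ X 0 b^{-n/α}. The constants C = 10, b = 4 and α = 2/3 (which corresponds to α = 2/(d + 2) with d = 1) are hypothetical.

```python
def smallness_threshold(C, b, alpha):
    """Largest X_0 for which the standard lemma guarantees X_n -> 0."""
    return C ** (-1.0 / alpha) * b ** (-1.0 / alpha ** 2)


def iterate_worst_case(x0, C, b, alpha, steps):
    """Iterate the recursion with equality, X_{n+1} = C * b**n * X_n**(1 + alpha)."""
    xs = [x0]
    for n in range(steps):
        xs.append(C * b ** n * xs[-1] ** (1.0 + alpha))
    return xs


if __name__ == "__main__":
    C, b, alpha = 10.0, 4.0, 2.0 / 3.0
    x0 = 0.9 * smallness_threshold(C, b, alpha)
    for n, x in enumerate(iterate_worst_case(x0, C, b, alpha, 12)):
        # The lemma predicts x <= x0 * b**(-n / alpha) at every step.
        print(n, x, x0 * b ** (-n / alpha))
```

Even in the worst case, where the recursion holds with equality, the growing factor b^n is beaten by the superlinear power as soon as X 0 lies below the threshold, so the iterates decay at least geometrically.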

Reduction of oscillation in case 2
We now assume that (8.6) holds but with the constant ν 0 fixed from the previous argument. We argue now that if (8.6) is satisfied then there exists some t 0 , We prove this by contradiction. Assume this is not the case then which contradicts (8.6). We now proceed to prove the following lemma.
Lemma 8.8 Assume that (8.6) holds. Then there exists a q ∈ N, depending only on the data, such that for all t ∈ [− ν 0 2 ω 1−m R 2 , 0] and α in (8.4) chosen to be small, depending only on ν 0 , m, d, β, W , M but independent of R and ω.
Proof. The proof of this lemma relies on Lemma 8.5 with the function ψ + (u) on the cylinder Q(−t 0 , R). We choose where the constant n > 1 will be chosen later. It is fine to apply it to this function as we can assume that otherwise the proof of the lemma would be complete with q = 2. Indeed, we would have for all t ∈ [t 0 , 0]: Before we write down the inequality, we need to further understand the properties of the function ψ + (ρ) defined on the cylinder Q(−t 0 , R). Note first that Putting it all together we can get rid of the negative term in (8.8) and take the ess sup to obtain: We proceed to bound each of the terms individually. For the first term on the RHS of (8.9) we obtain: For the second term we use the fact that ρ ≤ 5ω/4 to obtain: For the third term we use the fact that 5ω/4 ≥ ρ ≥ ω/2 on the supports of ψ + (ρ) and (ψ + ) (ρ) to obtain: Similarly for the final term we obtain For the LHS of (8.6), consider the set It is clear that ζ = 1 on S t and, since −ρ + k + ω/2 n+1 < 0, the function Thus, we have ess sup t∈[t 0 ,0] Putting all the terms back together we obtain and bounding ω 2 by M 2 , Finally, we obtain the estimate we need where one should note that R ≤ L and the term R ε ω 1−m can be controlled by α through (8.4). Note that for the term in the first set of brackets we can choose dδ ≤ 3ν 2 0 /16 and n large enough such that because (1 − ν 0 /2)(1 + ν 0 ) = 1 + ν 0 (1 − ν 0 )/2 > 1 for ν 0 ∈ (0, 1). Now that n and δ have been fixed we note that the constant α in (8.4) can be made small enough (independent of ω and R) so that the terms in the other two brackets are less than 3ν 2 0 /16. This gives us The proof follows by setting q = n + 1 and noting that [t 0 , 0] ⊃ [− ν 0 2 ω 1−m R 2 , 0]. We now proceed to prove that ρ is strictly less than its supremum in a smaller parabolic cylinder.
Lemma 8.9 Assume that (8.6) holds. Then there exists some s 0 ∈ N large enough, independent of ω, such that Proof. The proof is similar to that of Lemma 8.6 and relies on the energy estimates in Lemma 8.1. We start by considering the sequence such that R 0 = R and R n → R/2 as n → ∞. We then construct a sequence of nested shrinking cylinders Q(ν 0 2 −1 ω 1−m R 2 n , R n ) along with cut-off functions ζ n satisfying 0 ≤ ζ n ≤ 1, We now apply the energy estimate of Lemma 8.1 in Q(ν 0 2 −1 ω 1−m R 2 n , R n ) with = μ + − ω/2 s 0 , and k n = μ + − ω/(2 s 0 ) − ω/(2 n+s 0 ) for the function (ρ + − k n ) + . We will bound the terms on the LHS and RHS separately. Considering first the terms on the LHS we have where we have used the fact that when |∇(ρ + − k) + ζ n | is nonzero, ρ + ≥ k n ≥ ω/2. For the RHS we first note the following facts: (1) 0 ≤ μ − ≤ ω/4 which implies that ρ ≤ 5ω/4, and ρ + ≤ 5ω/4 .
Finally we can state the reduction of oscillation result in case 2.
Remark 8.12. We note that the proof of Corollary 3.4 follows from the fact that the constant C h is uniform in time as long as we are far enough from the initial data ρ 0 , i.e. if 0 < C < t 1 < t 2 < ∞ for some constant C > 0.

Then there exists a constant C = C(d) such that
where R = diam( ).
We then have the following embedding [DiB93, page 9]: Lemma A.4. Let ρ ∈ V 2 ( T ). Then there exists a constant C d depending only on d such that

Appendix B. Bifurcation theory
We state here the Crandall-Rabinowitz theorem (cf. [Nir01,Kie12]) for bifurcations with a one-dimensional kernel. (1) D x (0, κ * )F is a Fredholm operator with index zero and has a one-dimensional kernel.