Weighted Ultrafast Diffusion Equations: From Well-Posedness to Long-Time Behaviour

In this paper we devote our attention to a class of weighted ultrafast diffusion equations arising from the problem of quantisation for probability measures. These equations have a natural gradient flow structure in the space of probability measures endowed with the quadratic Wasserstein distance. Exploiting this structure, in particular through the so-called JKO scheme, we introduce a notion of weak solutions, prove existence, uniqueness, BV and H1 estimates, L1 weighted contractivity, Harnack inequalities, and exponential convergence to a steady state.


Introduction
In this work we investigate the well-posedness and the long-time behaviour of solutions u = u(t, x) of the nonlinear diffusion equation where α ∈ R and is a d-dimensional domain; we give specific hypotheses on later. This class of equations may exhibit a whole spectrum of different behaviours as α varies. We are interested in the case α < 0, i.e., when (0.1) takes the name of ultrafast diffusion equation. This class of equations has completely different properties from those found in the case α 1, which corresponds to the porous medium and heat framework. Porous medium equations model slow diffusion phenomena and have been extensively studied in the last years; we refer the reader to the monographs by J. L. Vázquez for a comprehensive theory [52,53]. The case 0 α < 1 is commonly referred to as fast diffusion equation; in particular, α = 0 gives the logarithmic diffusion equation [22].
When α < 1, the existence and uniqueness of weak solutions of the Cauchy problem, as well as the asymptotic behaviour and the main qualitative properties, are well understood when α lies in the so-called good parameter range max {0, α c } < α < 1, where α c := (d − 2)/d is a critical exponent; see for instance [34]. The theory on diffusion equations is less developed in the subcritical range α < α c , even under the condition α > 0, since the classical questions about existence, uniqueness and regularity become more challenging. A typical difficulty emerging in the subcritical case concerns the possible lack of positivity due to extinction in finite time: while in the good parameter range α > α c the mass is conserved, if we consider the case of the Cauchy problem in the whole space R d with d 3 and 0 < α < α c , Bénilan and Crandall [3] proved the extinction in finite time of solutions of the fast diffusion equation (0.1) when the initial datum is in some suitable L p space. In fact, solutions become identically zero in finite time for all 0 < α < 1 if considering the Cauchy problem in a bounded domain with Dirichlet boundary data; see [6,7] for more detail. Still, in R d , the critical case α = α c is very challenging as well since it turns out that, for d 3, solutions exhibit two space regions in which they have different long-time behaviours; see [32]. When is a bounded domain, the situation is even more involved and we refer to the monograph [52].
Most of the literature does not treat the very singular range α < 0, since the diffusivity u α−1 becomes extremely singular at u = 0. In particular, in [50] Vázquez showed that if one considers the Cauchy problem in the whole space or in a bounded domain with zero Dirichlet boundary conditions, then solutions starting from L 1 initial data become instantaneously identically zero, namely u(t) ≡ 0 for all t > 0. To circumvent this phenomenon, some authors have considered initial data that are not integrable and "not too small" at infinity; see [29,50] among the older references, then [8,21,51], and the books [22,52] for a more exhaustive discussion on the problem. It is interesting to notice that (0.1) with α < 0 arises naturally in certain physical applications. For example, superdiffusivities of this type have been proposed in [23] as a model for long-range Van der Waals interactions in thin films spreading on solid surfaces. This equation also appears in the study of cellular automata and interacting particle systems with self-organised criticality; see [18] for example. Other physical applications are mentioned in [4].
Besides the motivations above, our interest in ultrafast diffusion equations stems from the problem of quantisation for probability measures. This problem can be stated as follows: given an integer N , find an atomic measure with N atoms that best approximates a given probability density ρ on ⊂ R d in the sense of Wasserstein distances of any order p 1. As explained in [33], this is in fact equivalent to the following minimisation problem: where d(x, N ) stands for the distance between the point x ∈ and the set N , which is the support of the optimal measure. Note that, in this new formulation, the only unknowns are the locations of the points of the support N . A classical and important question concerns the asymptotics of a minimiser N when the cardinality N goes to infinity. In order to take such a limit, one defines the probability measure μ N := 1 N x∈ N δ x . Then, it is known (see [30,33,44]) that as N → ∞ the measures μ N weakly-* converge to a minimiser of the energy functional defined for densities f on . This convergence has also been investigated and justified from a -convergence viewpoint in [11] (a more general proof having been established in a similar case in [45]; see also [35] and [38] on how geometry can affect the optimal location problem).
In [16], the authors introduced a new approach to the quantisation problem based on gradient flows; their idea was to study the evolution of the points of N when they follow the steepest descent curves of the functional (0.2) (which is nothing but a continuous-time version of the well-known Lloyd's algorithm for the optimal quantisation; see [40], or [12,42] for more recent accounts and related topics), and to compare it to the gradient flow of a continuous functional. This analysis was performed in detail in the one-dimensional case in [16], and in the two-dimensional case in [17] when ρ ≡ 1. There, the authors study the Lagrangian evolution of the particles in the support of N under the gradient flow of (0.2) and prove quantitative convergence estimates to a continuous gradient flow. As observed in [36], at least when d = 1, this continuous Lagrangian evolution of particles corresponds, in Eulerian variables, to the gradient flow of F ρ in the 2-Wasserstein sense.
Because of this, understanding the 2-Wasserstein gradient flow of F ρ is a natural problem. Before computing the equation associated to this gradient flow, we first make a short comment about the boundary conditions: as explained in [16], because of the preservation of the mass in the quantisation problem, a very natural boundary condition is the no-flux (that is, Neumann) one; of course, the easiest case is actually just to suppose that the domain is a torus, which is the same as supposing that f is periodic. In the sequel we shall focus on these two boundary conditions: periodic, and no-flux on bounded domains.
To compute the 2-Wasserstein gradient flow of F ρ we note that, setting r := p/d, the first variation density of F ρ at a density f is given by Hence, by Otto's calculus (see for instance [36,47]) and by the theory of gradient flows in Wasserstein space (see [1,49]), the gradient flow of the functional F ρ in the 2-Wasserstein metric is given by This is an ultrafast diffusion equation weighted by the density ρ. Indeed when ρ ≡ 1 it corresponds (after a change of variable and up to a multiplicative constant; see (1.3)) to (0.1) with exponent α = −r < 0; this is the so-called ultrafast diffusion regime for which, as already explained, solutions starting from L 1 initial data vanish instantaneously when set on the whole space or with zero Dirichlet boundary conditions. However, the natural framework where we study this equation includes mass preservation, and thus, as mentioned above, we shall consider only periodic or Neumann boundary conditions on a bounded domain (assumed to be convex for technical reasons; see later).
As the reader will notice as this paper progress, we perform an essentially complete analysis of this weighted ultrafast diffusion equation by combining two approaches: on the one hand, we exploit as much as we can the time-discretisation given by the so-called Jordan-Kinderlehrer-Otto scheme (see [37]) and obtain many estimates using recent tools in optimal transportation; on the other hand, we obtain further results at the level of the continuous-time PDE. Each time we choose which approach to favour depending on the easiest one to adopt.

Main Results and Plan of the Paper
In this section we introduce the notation and assumptions, we state our main results, and we give an overview of the paper.
Let r be a positive real number. Let ⊂ R d be a d-dimensional domain: either the d-dimensional torus T d , or a bounded convex domain. Let ρ be a Borel probability density on , which we write either ρ ∈ P( ) with ρ dx or, abusively, ρ ∈ P( ) ∩ L 1 ( ).
We write M( ) the set of finite nonnegative Borel measures on , so that P( ) = {ρ ∈ M( ) : ρ( ) = 1}. Let us give the definition of the 2-Wasserstein distance. For any two μ, ν ∈ M( ) with same total mass, we define the 2-Wasserstein distance W 2 (μ, ν) between μ and ν by where (μ, ν) is the set of all transport plans between μ and ν, that is, the subset of M( ) × M( ) consisting of measures with μ as first marginal and ν as second marginal; see [48,54] for an exhaustive account on Wasserstein metrics. We want to investigate the properties of the following weighted ultrafast diffusion equation discussed in the introduction: where the unknown is f : is the outward unit normal vector to ∂ at x. Notice that when is the torus, there is no boundary condition and we can consider f to be a periodic function. When is a bounded convex domain, the boundary condition above should be intended in a weak sense, which means (see Definition 1.1 below) that test functions will not be compactly supported in space.
This equation (including the boundary conditions) can be seen as the gradient flow in W 2 of the functional see later for a precise definition of this functional on arbitrary measures. Let us consider a change of variable that was first introduced in [16], and that will be very useful to prove several of our estimates: for all (t, x) ∈ [0, ∞) × , With this change of variable, equation (1.1) becomes In order to state our results, we need first to introduce the class of solutions on which we can prove existence and uniqueness. Note in particular that the assumption in Definition 1.1 that initial data belong to L r +3 ( ) will be used to show that weak solutions exist; see the proof of Lemma 3.4.
and the following bounds hold: In this case we say that u := f /m is a weak solution of (1.3).
The Sobolev regularity conditions in the definition are crucial. First, it is important to observe that the equation has no distributional meaning if one does not assume any Sobolev regularity on the solution. Indeed, by looking at (1.3), one sees that the existence of weak derivatives for u −r is needed to define the divergence of m∇(u −r ). Then, the reader will see that these precise H 1 assumptions play a crucial role both in the proof of uniqueness in Theorem 1.2 (to make sure that we can justify the computations) and in the proof of instantaneous regularisation (or boundedness) of solutions in Theorem 1.3 (to be able to use the Moser iteration inspired by [46]; see also [8,22]). Before stating our main results, let us give a standing assumption on the weight m which will always hold in the paper.

Assumption. (Sobolev assumption on m)
The weight m is such that log m ∈ W 1, p ( ) for an exponent p > d. In this way log m is continuous and bounded, which means that m is continuous and bounded from above and below by positive constants. When needed, we will call λ a positive constant such that λ < m < λ −1 .
Although this assumption always implicitly holds, we will recall it in some results to emphasise its importance. Note that, because of this hypothesis on m (and thus on ρ), the domain of the functional F ρ is exactly the subset of densities f such that 1/ f ∈ L r ( ). Throughout the paper, it will be sometimes necessary to assume extra properties on log m (in particular, semiconcavity), and to assume bounds on f 0 . Since these properties and bounds are not always the same in every result we state later, we prefer not to detail them here but rather provide them whenever we need.
Then there exists a unique weak solution of (1.1) starting from f 0 .
With this theorem in hand, we can address the regularity properties of solutions and their long-time behaviour. In [36] the author showed that when ρ is a positive smooth function and one considers smooth solutions of (1.1) starting from initial data bounded away from zero and infinity, then as time goes to infinity the solution converges exponentially fast to the stationary state , so that f converges to a stationary state with the same mass as f 0 . 2 We recover in this paper the same convergence as in [36] without the initial boundedness condition, since we prove instantaneous upper and lower bounds (usually called Harnack inequalities) beforehand. Although upper bounds are rather classical in these settings, lower bounds are nonstandard and actually false in many situations [50]. In our case it is crucial that we work with periodic or no-flux boundary conditions. As we shall see, this result is crucial for the long time behaviour because, once the solution is bounded and bounded away from zero, the singular/degenerate character of the equation is not predominant and the solutions behave like standard parabolic equations.

Theorem 1.3. (Harnack inequalities)
Suppose that f is a weak solution of (1.1) starting from some density f 0 . Assume the following integrability properties on f 0 : Then, for any t > 0 there exists a constant C t > 0 (nonincreasing in t) such that 1 C t f (s) C t for all s t. 2 In the literature related to the quantisation problem, f 0 is a probability measure [16,36]; on the other hand, in the literature about fast-diffusion equations the mass is arbitrary, and often nonpreserved. For the sake of generality we admit here arbitrary masses and arbitrary initial data in L 1 ( ).
We can then apply the above theorem, together with H 1 and BV estimates that will be proven later, in order to obtain the following: Theorem 1.4. (Long-time behaviour) Let f be a weak solution of (1.1) starting from some f 0 satisfying the same assumptions as in Theorem 1.3. Then, there exist constants C, c > 0 such that, for all t 0, one has Moreover, if one adds the assumption that there exists ∈ R such that D 2 (log m) Id, then there exists another constant C > 0 so that one also has We now briefly explain the ideas behind the proofs of the above results. To prove the existence of weak solutions we use the so-called Jordan-Kinderlehrer-Otto (JKO) scheme. This method, first introduced in [37] and further developed in several other papers (see for instance [26] for a related setting), provides a very natural way to discretise Wasserstein gradient flows in time. More precisely, given a time step τ > 0, one fixes f (τ ) 0 := f 0 , and, for each k ∈ N, one defines f (τ ) k+1 as the minimiser of the functional .
In this way one constructs a discrete gradient flow defined at all times t = kτ with k 0, and to obtain a solution to (1.1) one needs to find a limit as τ → 0. In our case we face at least two main challenges: first, the JKO scheme is naturally set in the class of measures, and we would need to prove that minimisers of the above functional exist in the space of functions, or densities (a priori, the minimiser may have a singular part); second, we need to prove enough estimates on the discrete solutions to ensure that, in the limit, we obtain a weak solution according to Definition 1.1. To circumvent these difficulties, we first prove that if the initial datum f 0 is bounded between two multiples of m, then the same bound is true for f (τ ) k for all k ∈ N. In this way we guarantee that f (τ ) k is a function (and not only a measure). Also, still assuming that f 0 is bounded between two multiples of m, we exploit the so-called "flow-interchange technique" (see [41]) and the so-called "five-gradients inequality" (see [24]) to find H 1 and BV a priori estimates on our discrete solutions. In this way we can prove the existence of a weak solution of (1.1) whenever 0 < c 0 m f 0 C 0 m for some c 0 , C 0 > 0. Finally, the general existence theorem follows by approximation.
To prove uniqueness the idea is to consider two weak solutions f and g, and show that Note that we are actually unable to prove directly this property for all solutions: we can prove it only when one of the two solutions is uniformly bounded away from zero; see Proposition 3.6. Then, by an approximation argument, we are able to conclude the desired uniqueness; see Theorem 3.7. Finally, we prove the instantaneous positivity and boundedness from above (i.e., instantaneous regularisation) of weak solutions using a Moser iteration, and then we conclude the L 2 exponential convergence relying on the argument in [36]. We are also able to provide BV exponential convergence to the steady state, using the arguments deriving from the discrete BV estimate.
In Section 2 we discretise the problem in time and show existence of minimisers for the JKO scheme, together with a discrete maximum principle; we also give BV and H 1 estimates for the minimisers. In Section 3 we prove Theorem 1.2; as a corollary, we also get a continuous maximum principle. Then, in Section 4 we show Theorem 1.3 and in Section 5 we use our tools to prove exponential convergence, that is, Theorem 1.4. Note that, whenever relevant, we rewrite our results in remarks for the nonweighted case ρ ≡ 1, which is the prototype equation and helps understand the essential aspects of both the problem and the results.

Time Discretisation of the Problem
Let us fix in this section the time step τ > 0. In order to study the JKO scheme, we first define our functional F ρ on the space of measures. To this aim, for all μ ∈ M( ) that we can decompose as Note that, if we set U (s) = s −r for all s ∈ (0, ∞) we can also define the functional where m is as in (1.2). Of course, we have F ρ [μ] = G[μ]. More generally, and for future use, for a given exponent q < 0, and still using the decomposition and we also give a similar, but different definition, for q > 1: Also note that, when the reference weight m is not fixed (for instance, we will once use a sequence of weights m n ), we can also write G (q;m) instead of G (q) to stress the dependence on the weight.
The functional G previously defined is just an example of functional G (q) , for q = −r . We observe that all functionals G (q) are lower semicontinuous for the weak-* convergence of nonnegative measures because they have the form μ → U (dμ/ dx) dx + U ∞ μ s , where U ∞ := lim s→+∞ U (s)/s (and here we have U ∞ ∈ {0, +∞}); see for instance [48,Proposition 7.7]. (Note that if U ∞ = +∞ and μ s = 0, then we conventionally set U ∞ μ s = 0.) As explained in the introduction, one can construct a discrete gradient flow as an iterative sequence of minimisation problems of the form This means that, for a given ν ∈ M( ) with mass M, we want to solve Note that, as a consequence of the definition of the scheme, the mass M of our discrete solutions (and therefore also of their continuous limits) is preserved.

Well-Posedness of the Discrete Scheme
Proof. The functional G is lower semicontinuous for the weak-* convergence of measures, and so is μ → W 2 2 (μ, ν), since W 2 exactly metrises (on compact sets) this convergence. Moreover, the set M( ) is compact for this convergence, which proves the existence of a minimiser. Uniqueness comes from the strict convexity of the problem. Indeed, G is a convex functional and so is μ → W 2 2 (μ, ν). In addition, the latter is also strictly convex if ν dx; see [48,Proposition 7.19].
Theorem 2.1 does not exclude the possibility of a minimiser, say μ * , which is not absolutely continuous; its singular part μ s * does not enter into play in the computation of G[μ * ] but is not forbidden as soon as the absolutely continuous part of μ * is positive almost everywhere. In order to study the minimisers for Problem (2.1) we approximate U with a superlinear cost function to ease the computations. Define, for all ε > 0, and, for all μ ∈ M( ), We use the following result, which is essentially a statement on -convergence; see for instance [20]: admits a unique solution μ ε for every ε > 0. This solution is absolutely continuous for every ε > 0, and weakly-* converges to the unique solution μ * of (2.1) as ε → 0.
Proof. Given ε > 0, the existence and uniqueness of μ ε can be done as in the proof of Theorem 2.1. (Uniqueness is actually easier since the functional G ε is strictly convex, so that we do not need the strict convexity of the Wasserstein part and we do not need ν ε dx.) The fact that μ ε is absolutely continuous is straightforward, since otherwise G ε [μ ε ] = +∞. Up to subsequences, we can always suppose μ ε * μ * as ε → 0 for some μ * ∈ M M ( ); indeed, μ * ( ) = ν( ) since the weak-* convergence preserves in this case the total mass [5]. We now just need to prove that μ * solves (2.1). Given an arbitrary measureμ with an L 2 density, we can write Passing to the liminf as ε → 0, using the semicontinuity of G, the continuity of W 2 with respect to the weak-* convergence and the fact that we have , which is true for everyμ ∈ L 2 (since the extra term in the definition of G ε is a finite term multiplied by ε), we get This shows that μ * is a minimiser in (2.1) if we restrict to L 2 competitors. To complete the proof, it is enough to prove that the infimum in (2.1) does not change if we restrict it to L 2 , or, in fact, even to bounded, densities. To do so, take an arbitrary μ = f dx + μ s and define, for all p > 0, where f ∧ p stands for the minimum between f and p, f p * μ s as p → ∞ is an arbitrary L ∞ approximation of μ s , and c p = | | −1 ( f − f ∧ p) dx is a constant density with the same mass as the difference between f and its truncation f ∧ p.
We can see that, as p → ∞, we have μ p * μ; hence lim p→∞ W 2 2 (μ p , ν) = W 2 2 (μ, ν), and we also have lim As is sometimes done already, in the sequel we will often identify absolutely continuous measures with their densities.

Lemma 2.3. Given g ∈ M( ) ∩ L 1 ( ) and a convex lower semicontinuous func-
where ϕ is the unique (up to additive constants) Kantorovich potential from f * to g.
For the readers' convenience, we recall that the definition and role of Kantorovich potentials. First, we recall the duality result introduced by Kantorovich, which reads, in the case of the quadratic cost c(x, y) When (ϕ, ψ) is an optimal pair in the above supremum, then we say that ϕ is a Kantorovich potential from μ to ν. The Kantorovich potential is always a Lipschitz (when is compact) and semiconcave function, and is unique up to additive constants as soon as μ has strictly positive density almost everywhere. Moreover, it is connected to the optimal transport map T via T (x) = x − ∇ϕ(x), for almost every x ∈ , and it also plays the role of first variation of the functional μ → 1 2 W 2 2 (μ, ν). By inverting the roles of the two measures and using the uniqueness of the optimal map, it is easy to obtain ∇ψ = −∇ϕ • T . We refer the reader to [48, Sections 1.2, 1.3 and 7.2.2] for these facts and more details.
Proof. From V (0) = +∞ and from the finiteness of V ( f * /m) m we deduce that f > 0 almost everywhere. Hence, we can use [48,Proposition 7.20] to deduce that V ( f * /m) + ϕ/τ is equal to a constant almost everywhere on the support of f , i.e., the domain .
Proof. We prove the same estimates for the minimisation problem (2.2), where (ν ε ) ε is a smooth and strictly positive approximation of ν also satisfying the bound ν ε C 0 m or the bound ν ε c 0 m for all ε > 0. Then, by Lemma 2.2, as the constants c 0 and C 0 do not depend on ε, the same estimates hold true for the minimiser of (2.1). Also, we prove the result under the assumption that m be Lipschitz continuous; then, a simple approximation argument gives the result for any m.
Let ε > 0 and let μ ε be the unique minimiser of Problem (2.2). Let ϕ be the Kantorovich potential from μ ε to ν ε . By Lemma 2.2, μ ε is absolutely continuous (μ ε = f ε dx), and, by Lemma 2.3 applied with V = U ε , we obtain that Since ϕ is at least Lipschitz continuous (see for instance [48]), this implies that U ε ( f ε /m) is as well Lipschitz continuous. Using the explicit expression for U ε and U ε ε > 0, we get that f ε /m is Lipschitz continuous, and so the same is true of f ε (by the assumption that m is Lipschitz continuous). Moreover, f ε is bounded from below by a positive constant since U ε ( f ε /m) is bounded and Since the target measure ν ε is supposed to be smooth and strictly positive, we face an optimal transport problem between two Lipschitz densities which are bounded below and either periodic (if = T d ) or supported on a convex domain. In the former case we can apply the regularity result in [19] and in the latter case we can apply Caffarelli's regularity theory (see [13][14][15], [25,Theorem 3.3] and [31,Theorem 4.23 and Remark 4.25]) to get that ϕ ∈ C 2,β ( ) for some β < 1, under the extra assumption that be uniformly convex and smooth, so that we have regularity of T up to the boundary. Note that we get rid of the extra assumption on at the end of the proof. Moreover, the optimal map T = id − ∇ϕ is a diffeomorphism and sends ∂ into ∂ , which is only pertinent in the case where is a convex bounded domain.
-The case where is the torus. Letx be a point of maximum for f ε /m. Since U ε is monotonically increasing, thenx is also a point of maximum for U ε ( f ε /m). This implies thatx is a point of minimum for ϕ/τ . Therefore, because = T d , Let us recall that the optimal transport map T : → from f ε to ν ε is given by Since by assumption ν ε C 0 m and we know D 2 ϕ(x) 0, we get This proves the first part of the statement (i.e., the absolute continuity and the upper bound) for the case of the torus, and the second part (i.e., the lower bound) is analogous (choosing a minimum point for f ε /m instead of a maximum point).
-The case where is a uniformly convex, smooth and bounded domain. The difficulties arise whenx ∈ ∂ . To perform the same analysis as above we need either to exclude this case or to guarantee that anyway ∇ϕ(x) = 0.
Step 1: upper bound. Ifx is a minimum point for ϕ andx ∈ ∂ , then ∇ϕ(x) is orthogonal to the boundary, and ∇ϕ(x) · n(x) 0, where we recall that n(x) denotes the outward unit normal vector to ∂ atx. Yet, the strict convexity of and the condition T (x) =x − ∇ϕ(x) ∈¯ impose ∇ϕ(x) · n(x) > 0, which is a contradiction. The upper bound is thus easily handled.
Step 2: lower bound. For the lower bound the situation is trickier, as the above contradiction does not work. Yet, we can use the fact that T is a homeomorphism and that, forx ∈ ∂ a maximum point for ϕ, T (x) must be a point of ∂ which, by monotonicity of T , must satisfy n( Coming back to the point y = T (x), we can say that y ∈ ∂ , n(y) · n(x) 0, but also y =x − ∇ϕ(x). Moreover, sincex is a maximum point for ϕ, we have Hence in this case we cannot excludex ∈ ∂ , but we can guarantee that ∇ϕ(x) = 0. This, together with the regularity of ϕ, allows to apply the second-order condition on ϕ as in the torus case and conclude f ε c 0 m.
-The case where is a convex bounded domain. We now get rid of the extra assumption on . In this case the regularity theory of Caffarelli does not extend to the boundary, but the estimates can just be obtained by approximation, replacing with a sequence of domains satisfying Caffarelli's assumptions, and then passing to the limit. The bounds being independent on the smoothness of the boundary and of its uniform convexity, the result stays true in general.

A Priori BV Estimate
We give here an a priori BV estimate which we obtain also using the discrete maximum principle given in Lemma 2.4. For convenience, we define a BV norm weighted by m: if u ∈ BV ( ), we set where the right-hand side stands for the integral of the continuous function m with respect to the scalar measure |∇u|, which is the total variation of the vector measure ∇u. Since we are always supposing that m is bounded from above and below, this norm is bounded from above and below by constant multiples of the standard BV norm.
Lemma 2.5. Given ν ∈ M( ) with ν = g dx for some g ∈ L 1 ( ), let μ * be the unique minimiser of (2.1). Assume that c 0 m ν C 0 m for some C 0 , c 0 > 0, so that by Lemma 2.4 we have μ * = f * dx with c 0 m f * C 0 m, and that g/m ∈ BV ( ) and D 2 (log m) Id for some ∈ R. Then there exists a constant C 1 > 0, depending on c 0 , C 0 and on the sign of , such that, if Proof. Again, let us assume that m is Lipschitz continuous; then a simple approximation argument gives the result for any m. We start, as usual, by writing the optimality conditions of (2.1): where ϕ is the Kantorovich potential from f * to g. Note that the optimality conditions themselves imply that f * is a Lipschitz function, which allows us to differentiate it almost everywhere. We now write, for any vector v ∈ R d \ {0}, v := v/|v|, and set, by convention, 0 = 0. Since it follows, since U is convex, and hence U 0, that we have Hence, By [24, Lemma 3.1] applied with H = |·| (note that one should first write the inequality below for H ε = ε 2 + | · | 2 , ε > 0, instead of H and then pass to the limit ε → 0, which explains the choice of the convention for 0; also, one should first approximate g in W 1,1 ( ) and then pass to the limit at the very end of the proof, since otherwise the integral here below is not well-defined), we get where ψ is the Kantorovich potential from g to f , and ∇ψ := ∇ψ/|∇ψ|. Also, since the optimal map from g to f is S = id − ∇ψ, Since ∇ϕ • S = − ∇ψ, this gives Hence, Note that, since ∇ψ • T = −∇ϕ (T being the optimal map from f * to g), The last term can be rewritten using Also using the fact that the function s → sU (s) is nonincreasing, we can go on with the estimates: if > 0, then we have if < 0, then we obtain In all cases, (2.3) yields where C 1 > 0 depends on c 0 , C 0 and on the sign of . This means, provided which is the desired result.

A Priori H 1 Estimates
These estimates will be needed to prove later that the solution of the JKO gives a weak solution. They are obtained by the flow-interchange technique [41] after proving the following result: Id for some > 0 and C > 0, then the functional , then this same functional is geodesically convex without any L ∞ restriction.
Proof. First, we note that the set of densities satisfying a given L ∞ upper bound is geodesically convex in the Wassterstein space. The proof follows the same scheme as the usual one when no spatial inhomogeneity m is present; see [43] and, for instance, [48,Chapter 7].
Given two densities f 0 and f 1 , we know that the density f α of the Wasserstein geodesic connecting f 0 and f 1 is given, for all α ∈ [0, 1] and x ∈ , by and T is the optimal transport map from f 0 to f 1 . With the change of variable where, for all (α, x) ∈ [0, 1] × , a(α, y) = log(det(DT α (y))), b(α, y) = log(m(T α (y))).
We now differentiate twice in α, and use that (exp(h(α) In the case = 0 we stop here and we obtain geodesic convexity of G (q) . Otherwise, we go on by rewriting the exponential and we have dy.
which is exactly the claim.
Lemma 2.8. Given ν ∈ M( ) with ν = g dx for some g ∈ L 1 ( ), let μ * be the unique minimiser of (2.1). Assume that ν C 0 m, so that by Lemma 2.4 we have for a constant c(r, q) > 0.
Proof. The proof is based on the so-called flow-interchange procedure, first introduced in [41]; however, we will follow the technique described in [39]. First, we write the optimality conditions for the minimisers of (2.1): we have, almost everywhere in , We then multiply this equality by f ∇(V ( f * /m)), for a convex function V , and integrate. This provides Note that the right-hand side corresponds to the derivative, computed at time The claim is then proved by using Lemma 2.7: indeed, setting 1] h (α), which yields the result. The following lemma is a classical fact of the JKO scheme, and luckily does not use geodesic convexity (indeed, negative power functionals are rarely geodesically convex). We state it for completeness, but the reader can see that it is possible to obtain the desired estimates of this paper without using it. Lemma 2.10. Given ν ∈ M( ) with ν = g dx for some g ∈ L 1 ( ), let μ * be the unique minimiser of (2.1). Assume that ν C 0 m, so that by Lemma 2.4 we have Proof. The computations come again from the optimality conditions τ ∇(U ( f * m )) = −∇ϕ. We square and integrate with respect to f , thus obtaining We then compute, almost everywhere in , and obtain the equality in the claim. In order to compare the Wasserstein distance to the functional F ρ , it is enough to use the optimality of f * , i.e., which gives the inequality in the claim.

Preliminary on the Notion of Weak Solution
In order to clarify notation, let us • define the homogeneous Sobolev spacė We want now to prove some preliminary lemmas that will be useful throughout. For the first one, we recall our standing assumption on m as it really is crucial here. Proof. Recalling (1.3) we get: Let T > 0. By definition, For the first term, since by the definition of weak solution u −r ∈ L 2 ([0, T ], H 1 ( )), by Hölder's inequality we get Hence, For the second term, note that, when d > 2, Sobolev's inequality yields ϕ ∈ L 2 ([0, T ], L 2d/(d−2) ( )) and therefore, thanks to the assumption that log m ∈ W 1, p ( ) with p > d and to Hölder's inequality, When d 2, instead of ϕ ∈ L 2 ([0,T ],L 2d/(d−2) ( )), Sobolev's inequality provides ϕ ∈ L 2 ([0,T ],L q ( )) for every q < ∞ and we can still conclude thanks to the assumption log m ∈ W 1, p ( ) with the strict inequality p > d.
In the sequel, we will also need, several times, the following stability result for weak solutions:

Lemma 3.3.
Suppose that ( f n ) n is a sequence of solutions of (1.1) associated with a sequence of weights (m n ) n . Suppose that, for each n, f n is bounded both from above and below by positive constants which are not necessarily uniform in n, and suppose that the masses M n := f n (preserved in time) tend to a value M > 0 as n → ∞. Suppose that (log m n ) n is bounded in W 1, p ( ) (for some p > d), that log m n → log m uniformly as n → ∞ for some m with log m ∈ W 1, p ( ), that ( f n (0)) n is bounded in L r +3 ( ), that ( f n (0) −1 ) n is bounded in L r ( ), and that f n (0) * f 0 as n → ∞ for some f 0 ∈ L r +3 ( ). Then the curves (t → f n (t)) n are equicontinuous as curves valued in W 2 ( ) and, up to a subsequence, f n (t) * f (t) as n → ∞ for every t 0, where f is a weak solution of (1.1) starting from f 0 and associated with the weight m.
Proof. For each n, set u n = f n /m; we can use the fact that u α n belongs to L 2 ([0, ∞),Ḣ 1 ( )) for every α ∈ R (the definition of weak solution guaranteeing this fact for α = 1, and the upper and lower bounds on u n allowing us to use in fact any α ∈ R) together with ∂ t u n ∈ L 2 ([0, ∞), H −1 ( )) (by the Sobolev behaviour of log m n ; see Lemma 3.1) in order to compute, for q = r + 1, (Note that we detail a similar computation later in (4.3).) Let us first obtain a uniform bound on the L 2 loc ([0, ∞), H 1 ( )) norm of u n and u −r n . Using q = r + 3 in (3.1) one obtains the bound on the norm of u n in L 2 ([0, ∞),Ḣ 1 ( )) in terms of G (r +3;m n ) [ f n (0)], which is bounded by assumption. In order to transform this bound into a bound in L 2 loc ([0, ∞), H 1 ( )) we use the fact that the average of u n is bounded since u n > 0 and u n m n = f n = M n . Using the Poincaré-Wirtinger inequality we also bound the L 2 norm, and we get the desired bound.
Using q = −r in (3.1) one obtains the bound on the norm of u , which is also bounded. Notice that we also obtain boundedness of the L r norm of f n (t) −1 . Then, by Lemma 3.2 we deduce that u −r n is bounded in L 2 ([0, ∞),Ḣ 1 ( )). The bound on f n (t) −r also provides a bound on the average of u −r n and, again, using the Poincaré-Wirtinger inequality the bound becomes a bound in L 2 loc ([0, ∞), H 1 ( )). Each f n represents a continuous curve valued in the compact space W 2 ( ). In order to prove that these curves are equicontinous we recall the following fact from optimal transport theory (see for instance [48,Chapter 5]): whenever a curve of positive measures (μ(t)) t with fixed mass on satisfies ∂ t μ + div x (μv) = 0 (with no-flux boundary conditions), then we have where |μ (t)| is the metric derivative (see [1]) of μ. It is an important fact that bounds on the L 2 norm of the metric derivative imply Hölder continuity, from standard Sobolev injections. In our case, for μ = f n , using (1.1), the vector field v is given by ∇(u −(r +1) n ) and estimating its L 2 norm exactly amounts to the estimate of u We can therefore extract a uniformly converging subsequence (uniformly for the W 2 metric). In particular, up to a subsequence, we have a weak limit f n (t) * f (t) for every t. Moreover, the L 2 ([0, ∞), H 1 ( )) bound on u n , together with the L 2 ([0, ∞), H −1 ( )) bound on ∂ t u n (which comes from Lemma 3.1 and the uniform Sobolev bound on log m n ), allow us to apply the Aubin-Lions lemma (see [2]) and obtain strong compactness on u n . This means that we can assume that we have strong and almost-everywhere convergence f n (t) → f (t) as well as u n (t) → u(t) and u −r n (t) → u −r (t). Because of our bounds, we also have weak L 2 convergence of ∇(u −r n ) to ∇(u −r ). Together with the strong L 2 convergence m n → m, this allows the equation satisfied by f n to pass to the limit, which gives the existence of a weak solution satisfying the desired bounds and with initial datum f 0 and weight m.

Existence
We first prove the existence part of the theorem. The key technical tool is the use of the JKO scheme, as developed in Section 2: given f 0 with f 0 = M and a time-step τ > 0, one can define a recursive sequence via and we define f (τ ) k as the density of μ (τ ) k , for every k ∈ N. From the estimates in Section 2 (in particular Lemmas 2.4, 2.5 and 2.8) we know the following facts: C 0 m for some c 0 , C 0 , then the same inequality stays true for f for a positive constant C 1 (depending on c 0 , C 0 and on the sign of ); this means in particular that BV norms do not grow "too" fast during the evolution, provided we assume L ∞ bounds on f and semiconcavity of log m; • possibly assuming a priori L ∞ bounds on f (τ ) k for every k and semiconcavity of log m, the H 1 norm of quantities of the form f (τ ) k /m p (for p = (q −r −1)/2, q > 1, and p = −(r + 1/2)) can be estimated by terms which are the addends of a telescopic sum in k (allowing us to sum them and obtain integral estimates in time).
We first prove a more restrictive existence result. Indeed, it assumes extra semiconcavity of log m and boundedness and BV regularity of the initial datum.

Lemma 3.4. Assume D 2 (log m)
Id for some ∈ R. Then, for there exists a distributional solution of (1.1), starting from f 0 , obtained as the limit of the JKO scheme. Also, this solution satisfies and is therefore a weak solution according to Definition 1.1.
Proof. The proof follows the scheme described in [48,Chapter 8] to prove the convergence of the JKO iterations. Following such a scheme (see also [27], where this general procedure is presented), one has a sequence ( f k ) k of densities obtained by iteratively solving the minimisation problem (3.2). For simplicity of notation, we will often omit in this proof the dependence on τ of all our objects, until we need to let τ → 0. For all k, one also defines a vector field v k , given by v k = (id − T )/τ = ∇ϕ k /τ , where T is the optimal transport map from f k to f k−1 and ϕ k is the corresponding Kantorovich potential. As previously notes, the optimality conditions on f k allow us to check that we have v k = −∇ (U ( f k /m)).
Since this is a nonlinear relation, it cannot be directly deduced by the weak convergence. However, we have f (τ ) Also, applying Lemma 2.10 to g = f (τ ) , and summing over k, we get This yields uniform H 1 bounds. In particular, we choose q = r + 3 in (3.3) and we obtain a uniform L 2 bound, in time and space, on ∇( f (τ ) /m). Using the lower and upper bounds on the ratio f (τ ) /m this also translates into a similar bound on [ f (τ ) (t)/m] −r and allows to pass to the limit. Therefore, under the assumptions of this lemma, we have the existence of a weak solution with intial datum f 0 . Moreover this solution satisfies where the first L 2 norm is only bounded in terms of G (r +3) [ f 0 ] (the dependence on the constants c 0 , C 0 and disappears in the limit τ → 0). On the other hand, the second bound depends on c 0 (since the Lipschitz constant of the function s → s −r depends on the lower bounds; yet, it is possible to obtain uniform bounds using Lemma 2.10).
We now relax the extra assumptions of Lemma 3.4 and get the existence part of Theorem 1.2. In fact, let us restate it in a slightly more precise way: and is therefore a weak solution according to Definition 1.1. Morever, if f 0 c 0 m for some c 0 > 0, then this solution also satisfies f (t) c 0 m for every t 0.
Proof. We proceed by approximation, considering a sequence of initial data ( f n,0 ) n and a sequence of weights (m n ) n . If we suppose that the sequences ( f n,0 ) n and (m n ) n satisfy the assumptions of Lemma 3.4, then we have a sequence of solutions ( f n ) n . The approximation is chosen so that for all n the Sobolev norm and the upper and lower bounds of m n are preserved, as well as the bounds on G (r +3) [ f n,0 ] and F ρ [ f n,0 ]. We then apply Lemma 3.3, since the functions f n satisfy all the required assumptions. The last part of the statement (i.e. preservation of the lower bounds of f /m) is a direct consequence of the approximation, provided f n,0 is chosen so that f n,0 c 0 m for all n.

Weighted L 1 Contractivity and Uniqueness
We now prove the uniqueness part Theorem 1.2 by showing a contractivity result on weak solutions. As the reader will see from the proof, this actually follows by a weighted contractivity result on the equation satisfied by f /m. Although weighted contractivity estimates have already appeared in the literature (see for instance [9,10,52]), these are completely new in the ultrafast regime setting, and we expect both the result and the method of the proof to be useful in other circumstances. Proposition 3.6. (L 1 contractivity) Let f and g be two nonnegative weak solutions of (1.1), and suppose that there exists a constant c 0 > 0 such that g(t) c 0 m > 0 for all t ∈ [0, ∞). Then, which in particular implies L 1 contractivity: Proof. As often in this paper, it is convenient to use the notation u = f /m and v = g/m.
For every ε > 0, let us consider ψ ε : R → R to be a smooth approximation of the positive part defined as follows: for all s ∈ R, We fix t 0. Using Lemma 3.1 we deduce that ∂ t (u −v) ∈ L 2 loc ([0, ∞), H −1 ( )), which allows us to justify the next computation: Integrating by parts, we have Adding and subtracting u(t, x) −(r +1) ∇ x v(t, x) in the square brackets above we get The first term I 1 is nonpositive because ψ ε 0. By definition of ψ ε we have Note that, for some C r > 0, Thus, for a constant C > 0 (which in the rest of the proof may change value across any line), We now claim that the integrand in the right-hand side above belongs to L 1 ([0, ∞)× ). To prove this, recall that, by assumption, we have v c 0 . Since in the domain of integration in I 2 we have |u(t) − v(t)| ε, for ε small enough (for instance ε c 0 /4) we also have that u(t) c 0 /2. Therefore, on the domain of integration in I 2 , Also note that, on the domain of integration (dropping the (t, x) dependences to simplify the following computation), where the last inequality follows from (3.6). By the definition of weak solu- We now integrate in time the differential inequality (3.5): given T > 0, we have and, by dominated convergence (thanks to (3.7)), in the limit ε → 0 we obtain the following: Since on the region Recalling that u = f /m and v = g/m, this proves the first part of the statement. Repeating the proof, this time with ψ ε a smooth approximation of the negative part, we obtain Proof. Suppose that f is a weak solution starting from f 0 , and write u = f /m. Set u 0 = f 0 /m, consider u ε 0 = max{u 0 , ε}, and let v ε be the solution constructed in Theorem 3.5 starting from u ε 0 for all ε > 0. Note that we have v ε ε > 0. By Proposition 3.6 we deduce that for some time-independent constant C > 0. Letting ε → 0, we get where v is an arbitrary limit of (v ε ) ε . Hence u(t) must coincide with v(t) and therefore, since u (resp. f ) can be any weak solution starting from u 0 (resp. f 0 ), we obtain uniqueness.
We conclude this section by showing some easy and useful corollaries. First, we give a continuous maximum principle which is a corollary of Proposition 3.6. Remark 3.9 below also shows that, in fact, this continuous principle can be seen as a corollary of the discrete maximum principle (Lemma 2.4) and Theorem 3.7. Proof. Assume for instance that f 0 C 0 m (the case f 0 c 0 m being analogous). Then we apply Proposition 3.6 to f (t) and g(t) = C 0 m to deduce that Thus, for all t 0 we get f (t) C 0 m, as desired.
Remark 3.9. The above corollary can be also proved by noticing that, thanks to the discrete maximum principle, it holds for all solutions obtained as limit of the JKO scheme. Since by uniqueness all solutions can be obtained in this way, the result follows.
Finally, we give a useful remark, which is now straightforward, about the continuous dependence of the unique weak solution in terms of the initial data. Proof. This is a consequence of Lemma 3.3 and of the uniqueness result of Theorem 3.7. Note that there is no need to extract a subsequence because of the uniqueness.

Harnack Inequalities: Proof of Theorem 1.3
In [36] the author proves exponential convergence to equilibrium for initial data that are bounded away from zero and infinity (although the result there is stated only for d = 1, the proof works without modification in any dimension). Exponential convergence results will be the object of Section 5 and will be based on the preliminary proof of the fact that, instantaneously, solutions become bounded from above and below. The following proposition, along with the continuous maximum principle stated in Corollary 3. Assume that there exists q > σ max 1, d 2 such that f 0 , f −1 0 ∈ L q ( ). Then there are constants C 1 , C −1 > 0, independent of T , such that

2)
where α = A ∞ B ∞ and β = B ∞ are given by Here, for all i ∈ N ∪ {0}, Proof. It is convenient to prove the regularisation result in terms of u := f /m and then, at the end, rewrite the result in terms of f . We also write u 0 := u(0) = f 0 /m. We proceed as follows: we first obtain two inequalities, one for the gradient of u and the other for u (see (4.4) and (4.5)); we then show the result (4.1) for f using a Moser iteration; we finally prove (4.2) for f −1 in a similar way, using in fact the result for f . All our estimates have to be considered as a priori estimates: in order to perform the computations, in particular the derivations of certain integrals in time, we need Sobolev bounds on powers of u; therefore, we can start by supposing that our initial datum is bounded from above and below, which implies, thanks to Corollary 3.8, that the same bounds propagate to every time t and the regularity u ∈ L 2 loc ([0, ∞), H 1 ( )) implies the same regularity for all powers of u. Then, we note that the estimates we obtain do not depend on the bounds on the initial datum, and we therefore use Corollary 3.10 to deduce by approximation the same estimates for general initial data. We will not make this procedure explicit in this proof.
Step 1: inequalities for u and its gradient. Let ν ∈ {−1, 1}. We differentiate the weighted L q norm of u ν (in a similar computation as that to obtain (3.1)), and then use the equation for u and an integration by parts to obtain, for all t ∈ [0, T ], If we define C ν (q) > 0 such that and we notice that where η > 0 thanks to the condition q > σ , we obtain Set t 0 := T . Integrating the previous expression between t 0 and t 0 + T and dividing by T we have Thus, there existst ∈ (t 0 , t 0 + T ) such that using λ m 1/λ; see the Sobolev standing assumption on m. Note that since the weighted L q norm of u ν is nonincreasing in time (see (4.3)), we know u(t 0 ) ν ∈ L q ( ). Furthermore, following the same computation as in (4.3) replacing q by 2η, the weighted L 2η norm of u ν is as well nonincreasing, and therefore, because again λ m 1/λ: Since 2η < q, by Hölder's inequality, we obtain which finally gives Step 2: proof of (4.1) (ν = 1). Since u(t 0 ) ∈ L q ( ), (4.4) and (4.5) imply that u(t) η ∈ H 1 ( ). Thus, by Sobolev's inequality, when d 3 we obtain where C S > 0 is the Sobolev constant (depending on ) and 2 * := 2d/(d − 2). When d ∈ {1, 2} the same inequality holds by replacing 2 * with any number larger than two. 3 Because this does not change any argument given in the rest of the proof, for simplicity and without loss of generality we assume d 3. Using (4.4) and (4.5) with ν = 1 and λ m 1/λ, Recalling that m 1/λ, we get (4.6) and so | | q−1 (λM) q u(t 0 , x) q dx 1. Therefore, since η/q < 1/2, Hence, all in all, Using that T 1 then gives We now want to initialise a Moser interative scheme. To this end, let us define η 0 = η, q 0 = q, q 1 = 2 * η 0 , and t 1 =t. With this notation, (4.7) yields or, equivalently, Observe that q 1 > q 0 thanks to the assumption that q 0 > σ d 2 . We can now repeat the argument above starting from t 1 in place of t 0 , q 1 in place of q 0 , T /2 in place of T , and we find a time t 2 ∈ (t 1 , t 1 + T /2) such that Iterating k ∈ N times, we find , q k grows exponentially fast to infinity as k → ∞. By equation (4.8), for all k ∈ N we have that whereq k := q k −σ and where, by convention, k−1 j=i+1 q j q j = 1 if k i +1. Letting k → ∞, by the exponential growth of (q k ) k we find a time t ∞ ∈ (t 0 , 2T ) such that are finite constants since there exist constants c, C > 0 such that, for all i ∈ N, q i c θ i and q ī q i 1 + Cθ −i . Recalling the computation in (4.3), we know that the L q 0 and L ∞ norms of u are nonincreasing in time, and we thus conclude that that is, with the notation given in the statement of the theorem Since f (3T ) u(3T )/λ and u 0 f 0 /λ, we recover (4.1) and Step 3: proof of (4.2) (ν = −1). We proceed very similarly as in the case ν = 1 (Step 2), although in fact we use the result for ν = 1 as follows. By (4.9) and the arbitrariness of T , we know that for every t ∈ [t, 3T ], sincet T . Hence, with ν = −1, 1,∞ . Therefore, coming back to (4.4) with ν = −1 yieldŝ (4.10) Because u −1 0 ∈ L q ( ), (4.10) and (4.5) imply that u(t) −η ∈ H 1 ( ). Then, proceeding analogously as in Step 2, we get to In fact, in order to get to (4.11), the only main difference with respect to Step 1 is the treatment of (4.6): we use Jensen's inequality to yield To initialise the iterative scheme we use the same notation as in the case ν = 1; we get We then follow the same strategy as in the case ν = 1. After k ∈ N iterations we obtain Letting k → ∞, because of the exponential growth of (q k ) k we find t ∞ ∈ (0, 3T ) so that where A ∞ and B ∞ are as previously and is a finite constant. By (4.3) we deduce that the L q 0 and L ∞ norms of u −1 is nonincreasing, and we thus conclude that that is, with the notation given in the statement of the theorem, Since it holds that u(3T ) f (3T )/λ, f 0 u 0 /λ and u 0 f 0 /λ, we finally recover (4.2) and C −1 = C β −1,∞ /λ 1+(1+2ασ )β , which ends the proof.

Remark 4.2.
The proof of Proposition 4.1 has been completely approached via the continuous-time study of the equation. In fact, it is also possible to obtain similar estimates also via the JKO iterations, using the flow-interchange technique (as for the H 1 estimates already presented in Section 2.4). Yet there are some drawbacks to the flow-interchange approach: it requires geodesic convexity of the functional, which means that it can only be used for positive powers (negative powers are only geodesically convex in dimension 1) and that it would be suitable to suppose that log m be concave; also, it does not allow to iterate infinitely many times, which finally provides estimates on the norms u L p(τ ) for an expression p(τ ) with lim τ →0 p(τ ) = +∞. We decided to avoid this computation, because of its limited interest.

Long-Time Behaviour: Proof of Theorem 1.4
Thanks to the regularisation result of Section 4, we can now prove the first long-time convergence statement of Theorem 1.4 (i.e., the L 2 convergence), which we restate below. Proof. Using k iterated times the bound in Lemma 2.5 (as it can be understood from the beginning of Section 3.2), we get, for τ > 0, where C 1 is as in Lemma 2.5. For the limit of the JKO scheme, this implies that for some C 2 > 0 we get, for all t 0, as soon as f 0 ∈ BV ( ) and c 0 m f 0 C 0 m (use Lemma 3.4). This can be translated into the desired bound f (t 1 ) BV ( ;m) e C 2 (t 1 −t 0 ) f (t 0 ) BV ( ;m) for any t 0 > 0 and t 1 > t 0 , as soon as f (t 0 ) is in BV ( ) and is bounded from below and above. (We need to restart a JKO scheme from f (t 0 ), which the uniqueness allows us to do.) Yet, the L 2 integrability of the H 1 norm of u = f /m implies that u(t) is in H 1 ( ), and hence in BV ( ), for almost every positive time t, and the instantaneous regularisation given by Theorem 1.3 provides the lower and upper bounds, which finally gives the desired result.
We finally show the second long-time convergence statement of Theorem 1.4 (i.e., the BV convergence), which we restate as follows: The L 2 exponential convergence result presented in Theorem 5.1 allows us to conclude.
Remark 5.5. Because of the L ∞ bounds from above and from below, the longtime L 2 convergence easily implies L p convergence for every p 1, and this convergence is still exponential. On the other hand, getting uniform convergence is a delicate matter: in dimension one it is trivial when BV convergence is guaranteed, in higher dimension it is not.