A cross-diffusion system obtained via (convex) relaxation in the JKO scheme

In this paper, we start from a very natural system of cross-diffusion equations, which can be seen as a gradient flow of a certain functional for the Wasserstein distance. Unfortunately, this cross-diffusion system is not well-posed, as a consequence of the fact that the underlying functional is not lower semi-continuous. We then consider the relaxation of the functional and prove the existence of a solution, in a suitable sense, for the gradient flow of the relaxed functional. This gradient flow also has a cross-diffusion structure, but the mixture of two different regimes, determined by the relaxation, makes its study non-trivial.


Introduction
The starting point of this paper is the following very natural system of PDEs, complemented with no-flux boundary conditions in a bounded domain Ω:
$$\partial_t\rho - \Delta\rho - \nabla\cdot(\rho\nabla\mu) = 0,\qquad \partial_t\mu - \Delta\mu - \nabla\cdot(\mu\nabla\rho) = 0. \tag{1.1}$$
This cross-diffusion system describes the motion of two populations, each subject to diffusion and trying to avoid the presence of the other, so that each density acts as a potential in the evolution equation satisfied by the other.
This model was studied in [4], where it is obtained as a continuum version of a discrete lattice model proposed in [1] to account for the territorial development of two competing populations. Similar models appear in mathematical biology to describe the evolution of interacting species under the influence of population pressure due to intra- and inter-specific interferences (see e.g. [10,16]). However, the models considered in these papers always enjoy some special structure ensuring a form of ''convexity'', which is crucially not satisfied by System (1.1), as we will explain later.
The existence of solutions for (1.1) is a very challenging problem. An easy computation shows that the system is parabolic where ρµ < 1 but has an anti-parabolic behavior where ρµ > 1. Existence of solutions for short time is proven by the third author and her collaborators in [4], under the assumption that the initial data satisfy ρ_0 µ_0 < 1.
A noticeable property of system (1.1) is the following: it can be seen as a gradient flow for a suitable functional in the Wasserstein space.
The usual notion of gradient flows applies to Hilbert spaces, but in the last two decades interest has grown in gradient flows in metric spaces, and in particular in the space of probability measures endowed with the Wasserstein distance (see [3] and [15]), after the seminal work by Jordan, Kinderlehrer and Otto [12], who found a gradient flow structure for the Fokker-Planck equation. Applying the same ideas not to a single PDE but to a system of PDEs, thus looking for gradient flows in the space of pairs of probability measures, is more recent (see [7,13]) and more delicate.
At a formal level, given a functional F defined on probability measures on a given domain Ω, its gradient flow corresponds to the evolution PDE
$$\partial_t\rho - \nabla\cdot\left(\rho\,\nabla\frac{\delta F}{\delta\rho}\right) = 0,$$
where δF/δρ is the first variation of the functional F (formally defined through the condition F(ρ + εδρ) = F(ρ) + ε ∫ (δF/δρ) dδρ + o(ε)). This equation is endowed with no-flux boundary conditions on ∂Ω. Analogously, given a functional F defined on pairs (ρ, µ) of probability measures on the domain Ω, the gradient flow of F in W_2(Ω) × W_2(Ω) would be given by the system
$$\partial_t\rho - \nabla\cdot\left(\rho\,\nabla\frac{\delta F}{\delta\rho}\right) = 0,\qquad \partial_t\mu - \nabla\cdot\left(\mu\,\nabla\frac{\delta F}{\delta\mu}\right) = 0,$$
with again no-flux conditions on the boundary. Of course, we formally define the two partial first variations via the condition F(ρ + εδρ, µ + εδµ) = F(ρ, µ) + ε ∫ (δF/δρ) dδρ + ε ∫ (δF/δµ) dδµ + o(ε). Here, it turns out that (1.1) is the gradient flow of the following functional:
$$F_0(\rho,\mu) := \int_\Omega f_0(\rho(x),\mu(x))\,dx,$$
where the function f_0 is given by f_0(a, b) = a log a + b log b + ab, and ρ(x), µ(x) stand for the densities of the two measures, which are supposed to be absolutely continuous and identified with their L¹ densities.
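The formal first-variation computation above can be sanity-checked numerically. The following sketch (a hypothetical uniform-grid discretization of ours, not taken from the paper) perturbs a discretized F_0 by a bump of mass ε at a single node and compares the finite difference with the first variation δF_0/δρ = log ρ + 1 + µ coming from f_0(a, b) = a log a + b log b + ab:

```python
import math

# Hypothetical 1D periodic-grid discretization of F_0(rho, mu) = ∫ rho log rho + mu log mu + rho*mu
n = 100
h = 1.0 / n
rho = [1.0 + 0.3 * math.sin(2 * math.pi * i * h) for i in range(n)]
mu = [1.0 + 0.2 * math.cos(2 * math.pi * i * h) for i in range(n)]

def F0(r, m):
    # Riemann-sum approximation of the integral functional
    return h * sum(r[i] * math.log(r[i]) + m[i] * math.log(m[i]) + r[i] * m[i] for i in range(n))

# Perturb rho by a Dirac-like bump of total mass eps concentrated at node i0
i0, eps = 7, 1e-6
rho_pert = rho[:]
rho_pert[i0] += eps / h
fd = (F0(rho_pert, mu) - F0(rho, mu)) / eps   # finite-difference first variation
exact = math.log(rho[i0]) + 1 + mu[i0]        # delta F_0 / delta rho = log(rho) + 1 + mu
```

The finite difference `fd` agrees with `exact` up to the O(ε) error of the linearization.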
The main problem in considering the function f_0 is that it is not globally convex. Indeed, denoting by D²f_0 the Hessian of f_0, we have
$$D^2 f_0(a,b) = \begin{pmatrix} 1/a & 1 \\ 1 & 1/b \end{pmatrix},$$
and this matrix is only positive-definite if ab < 1. The non-convexity of f_0 translates into the non-parabolic behavior of the system outside the region where ρµ < 1.
Yet, in terms of the functional F_0, the situation is even worse. Indeed, integral functionals with a non-convex integrand are not lower semi-continuous for the weak convergence of probability measures. This is a non-negligible issue when applying variational methods. Indeed, one of the main tools to study gradient flows in a metric setting is the so-called minimizing movement method (see [2,8]). Let us quickly explain this tool in the general case of a metric space; we specify it to the case of interest here just after.
For a given functional F acting on a metric space M endowed with a distance d, the minimizing movement scheme consists in building an approximation of the gradient flow (which is a curve x(t) in M) as follows: we fix a time step τ > 0 and look for a sequence (x^τ_k)_k such that
$$x^\tau_{k+1} \in \operatorname{argmin}_{x \in M}\left( F(x) + \frac{d(x, x^\tau_k)^2}{2\tau} \right).$$
In the case where the metric space is the space of probability measures on Ω and the distance d is the Wasserstein distance, this iterated minimization scheme is known under the name of Jordan-Kinderlehrer-Otto (JKO) scheme. It produces a sequence of probability measures which, when appropriately interpolated, approximates the gradient flow of F.
Here, we will consider the case of a gradient flow in M = W_2(Ω) × W_2(Ω) (the distance d being the natural product of the Wasserstein distances). The minimizing movement method then consists in solving the following family of minimization problems:
$$(\rho^\tau_{k+1}, \mu^\tau_{k+1}) \in \operatorname{argmin}_{(\rho,\mu)}\left( F(\rho,\mu) + \frac{W_2^2(\rho, \rho^\tau_k) + W_2^2(\mu, \mu^\tau_k)}{2\tau} \right). \tag{1.3}$$
However, if F is not lower semi-continuous for the weak convergence, which is the case for F = F_0 defined above, then, even though the W_2^2 terms are continuous, the existence of a minimizer in the above iterated minimization problem is not always guaranteed. From the variational point of view, it is necessary to replace the functional F_0 with its relaxation, or lower semi-continuous envelope.
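The iterated minimization above can be illustrated with a toy example of ours (not from the paper): M = R with the Euclidean distance and F(x) = x²/2, whose gradient flow is x(t) = x_0 e^{-t}. Each minimizing-movement step is solved by brute-force grid search:

```python
def mm_step(F, x_prev, tau, lo=-2.0, hi=2.0, n=20001):
    # One minimizing-movement step: x_{k+1} in argmin F(x) + d(x, x_k)^2 / (2 tau),
    # solved here by a crude grid search on [lo, hi].
    best_x, best_val = x_prev, float("inf")
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        val = F(x) + (x - x_prev) ** 2 / (2 * tau)
        if val < best_val:
            best_x, best_val = x, val
    return best_x

F = lambda x: 0.5 * x * x   # toy energy; its gradient flow is x' = -x
tau = 0.01
x = 1.0
for _ in range(100):        # run the scheme up to time t = k * tau = 1
    x = mm_step(F, x, tau)
# x should be close to exp(-1); the exact implicit Euler value is (1 + tau)^{-100}
```

The scheme is nothing but an implicit Euler step in disguise: for this quadratic F, the exact minimizer of each step is x_k/(1 + τ).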
We stress that the relation between the gradient flow of a functional and that of its lower semi-continuous envelope is not clear in terms of PDEs, but it is clearer at the level of the JKO scheme. Indeed, if we fix τ > 0 and a tolerance parameter δ > 0, when facing a non-lower semi-continuous functional F (whose lower semi-continuous envelope is denoted by F̄), we can define a δ-approximated JKO scheme in the following way: we denote by argmin^δ_x G(x) the set of δ-almost minimizers of G, i.e. the set {x : G(x) < inf G + δ}, which is always non-empty, and we pick any sequence x^{τ,δ}_k satisfying
$$x^{\tau,\delta}_{k+1} \in \operatorname{argmin}^\delta_x\left( F(x) + \frac{d(x, x^{\tau,\delta}_k)^2}{2\tau} \right).$$
It can be proven, as an application of Γ-convergence, that, for fixed τ and letting δ → 0, any such sequence converges to a sequence satisfying
$$x^\tau_{k+1} \in \operatorname{argmin}_x\left( \bar F(x) + \frac{d(x, x^\tau_k)^2}{2\tau} \right),$$
i.e. to the output of the JKO scheme for the lower semi-continuous relaxation F̄.
The present paper is then devoted to the study of the system of PDEs representing the gradient flow of the functional F obtained as the lower semi-continuous envelope of F_0, and in particular to an existence result. Let us detail the structure of the paper. As a first task, we need to compute this lower semi-continuous envelope. The general theory of local functionals on measures (see [9]) provides a clear answer: one just needs to replace the non-convex function f_0 with its convex envelope, which we will denote by f. We therefore compute the convex envelope f of f_0 and find that it has the following form: on a certain region B ⊂ R²_+ we have f = f_0, and on the complement A = R²_+ \ B the function f only depends on the sum of the two variables, i.e. there exists a one-variable function f̃ : [2, +∞) → R such that f(a, b) = f̃(a + b) for (a, b) ∈ A.

The fact that f partially agrees with a function of the sum only recalls some cross-diffusion problems already studied in other papers using Wasserstein techniques, such as [7,13]. Cross-diffusion with a functional depending only on the sum is not difficult to study, as one can first find an equation on the sum S = ρ + µ and then use the gradient of the solution S as a drift advecting both ρ and µ. The difficulty in most cross-diffusion problems involving the sum comes from the lower-order terms which differentiate the two species (reaction, or advection), since the above strategy cannot then be applied and in general it is not possible to obtain compactness results on the two densities ρ and µ separately, but only on their sum. However, when the functional involves the integral of a strictly convex function of the two densities ρ and µ, it is indeed possible to obtain separate compactness: here the challenge comes from combining the two regimes, one where f is strictly convex and one where it only depends on the sum, without extra terms differentiating the two species.
On the other hand, a remark is compulsory when looking at the precise definition of the function f: the region B where f = f_0 is strictly included in the set {ab < 1}. Indeed, one could think (and it would have been nice!) that the system we obtain is an extension of the one associated with f_0 (that is, (1.1)), in the sense that it would allow to extend the solutions even after they touch the dangerous curve where ρµ = 1. Unfortunately, this interpretation is not correct, since there are initial data satisfying ρµ < 1 but (ρ, µ) ∉ B, for which the two systems would have well-defined but different solutions, at least for short time.
As a result, this paper will only be concerned with the gradient flow of the new functional F (the lower semi-continuous envelope of F 0 , which is, in this case, its convexification as well), without discussing its relation with the original PDE which motivated the study.For this new gradient flow we will prove existence of solutions in a suitable sense.
The notion of solution we consider is inspired by the notion of EDI solution introduced in [2,3] in terms of the metric slope, but it is slightly different and more PDE-adapted. Given our functional F, our goal is to find a solution of the following system:
$$\begin{cases} \partial_t\rho_t + \nabla\cdot(\rho_t v_t) = 0,\\ \partial_t\mu_t + \nabla\cdot(\mu_t w_t) = 0,\\ v_t = -\nabla\dfrac{\delta F}{\delta\rho}(\rho_t,\mu_t),\\ w_t = -\nabla\dfrac{\delta F}{\delta\mu}(\rho_t,\mu_t), \end{cases} \tag{1.4}$$
where the two continuity equations are satisfied in a weak sense with no-flux boundary conditions on ∂Ω, and the functions δF/δρ and δF/δµ are differentiable in a suitable weak sense. In our particular case this means finding a pair (ρ_t, µ_t) such that the gradients of f_a(ρ_t, µ_t) and f_b(ρ_t, µ_t) exist in such a sense, and the above equations are satisfied.
Formally, if the first two conditions of (1.4) are satisfied, then the last two conditions are equivalent, on the interval [0, T], to the following inequality:
$$F(\rho_T,\mu_T) + \frac12\int_0^T\!\!\int_\Omega \left( |v_t|^2 + \Big|\nabla\frac{\delta F}{\delta\rho}\Big|^2 \right) d\rho_t\,dt + \frac12\int_0^T\!\!\int_\Omega \left( |w_t|^2 + \Big|\nabla\frac{\delta F}{\delta\mu}\Big|^2 \right) d\mu_t\,dt \le F(\rho_0,\mu_0). \tag{1.5}$$
Indeed, if we formally differentiate the function t ↦ F(ρ_t, µ_t) in time, we obtain
$$\frac{d}{dt}F(\rho_t,\mu_t) = \int_\Omega \nabla\frac{\delta F}{\delta\rho}\cdot v_t\,d\rho_t + \int_\Omega \nabla\frac{\delta F}{\delta\mu}\cdot w_t\,d\mu_t. \tag{1.6}$$
This means that (1.5) can be re-written as
$$\frac12\int_0^T\!\!\int_\Omega \Big| v_t + \nabla\frac{\delta F}{\delta\rho}\Big|^2 d\rho_t\,dt + \frac12\int_0^T\!\!\int_\Omega \Big| w_t + \nabla\frac{\delta F}{\delta\mu}\Big|^2 d\mu_t\,dt \le 0,$$
and the only way to satisfy this condition is to satisfy the a.e. equality of the last two lines of (1.4).
This trick to characterize the solutions comes from the Euclidean observation that x′(t) = −∇F(x(t)) is equivalent to
$$\frac{d}{dt}F(x(t)) \le -\frac12|x'(t)|^2 - \frac12|\nabla F(x(t))|^2,$$
and hence to the Energy Dissipation Inequality (EDI)
$$F(x(T)) + \int_0^T \left( \frac12|x'(t)|^2 + \frac12|\nabla F(x(t))|^2 \right) dt \le F(x(0)).$$
The fact that gradients and derivatives cannot be defined in metric spaces (a vector structure is needed), while their norms can be defined instead (using the so-called metric derivative and metric slope), leads to the definition of a notion of EDI gradient flow in metric spaces. This is what is done in [3] for general metric spaces (in particular for functionals F which are geodesically convex), and then particularized to the case of the Wasserstein space. However, this is not the strategy followed in our paper, even though what we do is strongly inspired by the metric approach of [3]. Our precise procedure is the following:

• We define a class H consisting of pairs (ρ, µ) for which ∇f_a(ρ, µ) and ∇f_b(ρ, µ) are well-defined. We do this by requiring that some functions of the densities (ρ, µ) belong to the Sobolev space H¹: in particular, we require η(ρ, µ) to be H¹ for any smooth function η supported in the set B where f has a strictly convex behavior, and we also require √(ρ + µ) to be H¹. The second requirement is not sharp, in the sense that other functions of the sum could be used as well. We chose this one for simplicity, since we can guarantee this condition on the solutions which we build via extra arguments.
• We define, on the class H, the slope functional Slope_F via
$$\mathrm{Slope}_F(\rho,\mu) := \int_\Omega |\nabla f_a(\rho,\mu)|^2\,\rho\,dx + \int_\Omega |\nabla f_b(\rho,\mu)|^2\,\mu\,dx. \tag{1.7}$$
Note that we do not pretend at all that this slope functional coincides with the metric slope which could be defined following the theory in the first part of the book [3].
• We say that a pair of curves (ρ, µ) is an EDI solution if (ρ_t, µ_t) ∈ H for a.e. t and we have
$$F(\rho_T,\mu_T) + \frac12\int_0^T\!\!\int_\Omega |v_t|^2\,d\rho_t\,dt + \frac12\int_0^T\!\!\int_\Omega |w_t|^2\,d\mu_t\,dt + \frac12\int_0^T \mathrm{Slope}_F(\rho_t,\mu_t)\,dt \le F(\rho_0,\mu_0) \tag{1.8}$$
for some velocity fields v, w solving the continuity equations ∂_tρ_t + ∇·(ρ_t v_t) = 0 and ∂_tµ_t + ∇·(µ_t w_t) = 0. In order to prove the existence of an EDI solution we rely on the JKO scheme (1.3) and build suitable interpolations in time of the sequence of solutions, thus obtaining an approximation of (1.8). More precisely, we obtain a discrete counterpart of (1.8) involving two different interpolations of the same discrete sequence. We then pass to the limit as τ → 0 (τ being the time step of the discretization in the JKO scheme), where the weak limits of the different interpolations coincide. We prove that Slope_F is a lower semi-continuous functional for the weak convergence, which, combined with more standard semi-continuity arguments, allows us to conclude.
Organization of the paper After this introduction, Section 2 is devoted to the computation of the convexification of f_0 and to some properties of the function f we obtain, introducing some relevant quantities. Section 3 is devoted to the precise definition of the slope Slope_F and to the proof of its lower semi-continuity. Section 4 introduces the notion of EDI solutions and proves their existence. In the proof, several interpolations of the sequence obtained via the JKO scheme are needed, including the De Giorgi variational interpolation. Some estimates on these solutions are required in order to prove that they belong to the class H and to obtain the desired inequalities. Sections 5 and 6 are not required to obtain the existence of EDI solutions, but are required if one wants to come back to System (1.4). Indeed, we stated that the two notions are formally equivalent thanks to an easy computation for the derivative in time of F(ρ_t, µ_t). However, this computation is only formal if we do not face smooth solutions. This explains the choice of the notion of EDI solutions: it is a definition which coincides with solving the equation in a classical sense if functions are smooth, but the equivalence is in general not granted. In Section 5, we then explain an approximation procedure (by convolution) to obtain the differentiation property (i.e. the validity of (1.6)) for non-smooth solutions of the continuity equation. Yet, the nonlinearity of the terms involved in f_a and f_b requires a certain bound on the H¹ norm of some functions of the regularized densities. This raises a very natural question: suppose that a function u is such that its positive part u_+ belongs to H¹, and let u_ε be its convolution with a certain mollifier η_ε; when is it true that the sequence (u_ε)_+ is bounded in H¹ (with, possibly, uniform bounds in terms of the kernel)? This is not trivial, and the answer could depend on the choice of the kernel. We provide in Section 6 a proof of this fact in dimension 1 for a specific choice of the kernel shape. Note that the very same result is false (even after changing the shape of the kernel) in higher dimensions (we thank Alexey Kroshnin for a refined counter-example in this direction). Nevertheless, this does not prevent the approximation or the differentiation from being true.

Convexification
In this section, we characterize the function f which is the convex envelope of the function f_0, defined by
$$f_0(a,b) := \begin{cases} a\log a + b\log b + ab & \text{if } a, b \ge 0,\\ +\infty & \text{otherwise.} \end{cases} \tag{2.9}$$
The functional F defined as the integral of f will then be the lower semi-continuous envelope of F_0 (which is itself defined in a similar way, replacing f with f_0) for the weak convergence of probability measures, thanks to standard relaxation results (see, for instance, [9]). As we underlined in the introduction, f_0 is not convex, since D²f_0(a, b) is not positive semi-definite unless ab ≤ 1.
First, we look at the shape of f_0 on the diagonal lines where a + b is constant. We denote s := a + b and, for a fixed s, we define a function g(a) := f_0(a, s − a), which gives the value of f_0 over the corresponding segment. Then we have
$$g''(a) = \frac1a + \frac1{s-a} - 2 = \frac{s}{a(s-a)} - 2.$$
We need to distinguish the cases where s ≤ 2 and where s > 2. In the first case, g is convex, since a(s − a) ≤ s²/4 implies g″(a) ≥ 4/s − 2 ≥ 0. In the second case, we can easily see that g has a double-well shape, with three critical points on (0, s) (see Figure 2). Due to the double-well shape of g and to the fact that g″ vanishes only twice, the two minimizers of g satisfy g″ > 0 (and not only g″ ≥ 0). Indeed, if this were not the case, then on the interval between the two minimizers g would be strictly concave and would have vanishing derivative at the endpoints, which is impossible.
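These claims about g are easy to verify numerically. The sketch below (an illustration of ours, using the explicit formulas g′(a) = log(a/(s−a)) + s − 2a and g″(a) = 1/a + 1/(s−a) − 2) locates the minimizer in (0, s/2) by bisection of g′; by the symmetry g(a) = g(s−a), the other minimizer is obtained by reflection about s/2:

```python
import math

def gp(a, s):
    # g'(a) for g(a) = a log a + (s - a) log(s - a) + a (s - a)
    return math.log(a) - math.log(s - a) + s - 2 * a

def gpp(a, s):
    # g''(a) = 1/a + 1/(s - a) - 2
    return 1 / a + 1 / (s - a) - 2

def alpha(s, tol=1e-12):
    # bisection for the zero of g' on (0, s/2): g' -> -inf near 0,
    # and g' > 0 just to the left of the local maximum at s/2 (for s > 2)
    lo, hi = 1e-12, s / 2 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gp(mid, s) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

s = 3.0
a = alpha(s)   # the minimizer alpha(s) < s/2
b = s - a      # the minimizer beta(s) > s/2, by symmetry of g around s/2
# expected: a < s/2 < b, g'' > 0 at both minimizers, and a*b < 1
```

The check g″ > 0 at both minimizers, together with a·b < 1, matches the discussion above and anticipates the properties of the product π(s) = α(s)β(s) used below.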
It will turn out that the convexification of f 0 on the line a+b = s will coincide with the 1-dimensional convexification of g.
In order to construct such a function f, we first define two auxiliary functions α, β and two sets A, B. Indeed, there exist two continuous functions α, β : [2, +∞) → (0, +∞) such that:
• for every s > 2 we have α(s) < s/2 < β(s) and α(s) + β(s) = s;
• for every s > 2 we have
$$g'(\alpha(s)) = g'(\beta(s)) = 0, \quad\text{i.e.}\quad \log\alpha(s) + \beta(s) = \log\beta(s) + \alpha(s). \tag{2.10}$$
This means that for every s > 2, the points α(s) and β(s) are the two minimizers of g, and these conditions are enough to determine α, β in a unique way. They can also be obtained using an Implicit Function Theorem, which also shows the smoothness of α, β on (2, +∞). On the other hand, we lose the smoothness at s = 2, where the IFT cannot be applied, but we have anyway continuity, due to the uniqueness of the minimizer (and α(2) = β(2) = 1).
We then define two sets A and B that partition R²_+: A is the closed, convex set above the curve {(α(s), β(s)) : s ≥ 2} ∪ {(β(s), α(s)) : s ≥ 2}, and B = R²_+ \ A. In Figure 2, the 1-dimensional convexification of f_0 along a diagonal segment (i.e. the convexification of g) and the convex envelope f are drawn in orange and green, respectively.
Then, the main goal of this section is to prove the following proposition.

Proposition 2.1. Define
$$f(a,b) := \begin{cases} f_0(a,b) & \text{if } (a,b) \in B,\\ \tilde f(a+b) & \text{if } (a,b) \in A, \end{cases}$$
where f_0(a, b) := a log a + b log b + ab and f̃(s) := f_0(α(s), β(s)). Then f̃ ∈ C¹([2, +∞)) and the function f is the convex envelope of f_0.
We give the proof of this proposition at the end of this section. Next, we give some technical results that will prove useful in the sequel. We define the ''product'' function P as follows:
$$P(a,b) := \begin{cases} ab & \text{if } (a,b) \in B,\\ \pi(a+b) & \text{if } (a,b) \in A, \end{cases} \tag{2.12}$$
where π(s) := α(s)β(s) ∈ C⁰([2, +∞)). We gather some properties of π in the following lemma:

Lemma 2.2. The function π ∈ C⁰([2, +∞)) satisfies the following properties:
i. s/π(s) > 2 for every s > 2;
ii. π is differentiable on (2, +∞), with π′(s) = −π(s)(s − 2)/(s − 2π(s)) ≤ 0;
iii. π(s) < π(2) = 1 for every s > 2;
iv. π is differentiable at s = 2, with π′(2) = −1/2.
Proof. Recall that α(s) and β(s) are the minimizers of g, and that they satisfy g″ > 0. Since we have g″(a) = 1/a + 1/(s−a) − 2 = s/(a(s−a)) − 2, taking a = α(s) and s − a = β(s) we obtain s/π(s) > 2, which proves i. Now we would like to compute π′(s) for s > 2 (note that α and β are differentiable because of the Implicit Function Theorem): we have
$$\pi'(s) = \alpha'(s)\beta(s) + \alpha(s)\beta'(s). \tag{2.13}$$
Let us now compute α′(s) and β′(s). We know from (2.10) that at the minima of g we have g′(α(s)) = g′(β(s)) = 0; differentiating these conditions with respect to s, we obtain
$$\alpha'(s) = -\frac{\alpha(s)(\beta(s)-1)}{s-2\pi(s)},\qquad \beta'(s) = -\frac{\beta(s)(\alpha(s)-1)}{s-2\pi(s)}.$$
Plugging these into (2.13) we obtain
$$\pi'(s) = -\frac{\pi(s)(\alpha(s)+\beta(s)-2)}{s-2\pi(s)} = -\frac{\pi(s)(s-2)}{s-2\pi(s)},$$
which proves ii.
Since π′(s) ≤ 0 for s > 2 (we use here s − 2π(s) > 0), and π ∈ C⁰([2, +∞)), we obtain that π attains its maximum value at s = 2, so that π(s) < π(2) = 1 for s > 2. This gives iii. Now, we want to prove that π is differentiable at s = 2 and compute its derivative. We will consider the liminf and the limsup of the incremental ratio
$$r(s) := \frac{\pi(s)-\pi(2)}{s-2} = \frac{\pi(s)-1}{s-2}$$
and bound them iteratively. We recall that we have
$$\pi'(s) = -\frac{\pi(s)(s-2)}{s-2\pi(s)}\quad\text{for } s > 2.$$
We first note that, writing π(s) = 1 + r(s)(s − 2), we have s − 2π(s) = (s − 2)(1 − 2r(s)). We then deduce
$$\pi'(s) = -\frac{\pi(s)}{1-2r(s)}.$$
We define two sequences (p_n)_{n∈N} and (q_n)_{n∈N} which are meant to satisfy
$$p_n \le \liminf_{s\to 2^+} r(s) \le \limsup_{s\to 2^+} r(s) \le q_n. \tag{2.14}$$
We take p_0 = −1 and q_0 = 0. Supposing that we have defined p_n and q_n, we then note that for any ε > 0 we have, for s in a neighborhood of 2⁺, that p_n − ε ≤ r(s) ≤ q_n + ε. This implies that we have, in the same neighborhood,
$$-\frac{\pi(s)}{1-2(q_n+\varepsilon)} \le \pi'(s) \le -\frac{\pi(s)}{1-2(p_n-\varepsilon)}.$$
Since π(s) → 1 as s → 2, we can define q_{n+1} via
$$q_{n+1} := -\frac{1}{1-2p_n}$$
and, analogously,
$$p_{n+1} := -\frac{1}{1-2q_n}.$$
In particular, we have p_1 = −1 and q_1 = −1/3. We can see that the new values p_{n+1} and q_{n+1} also satisfy (2.14). From the definition of p_n and q_n we obtain
$$p_{n+2} = -\frac{1}{1-2q_{n+1}} = -\frac{1-2p_n}{3-2p_n}.$$
By induction, we can see that the sequence p_{2n} is increasing and bounded above by −1/2. If we denote its limit by L, we have
$$L = -\frac{1-2L}{3-2L},$$
whose only solution with L ≤ −1/2 is L = −1/2. The same holds for p_{2n+1} = p_{2n}. Similarly, q_{2n} and q_{2n+1} are decreasing and bounded from below by −1/2, and they converge to −1/2 as well. This shows that r(s) → −1/2 as s → 2⁺, i.e. π is differentiable at s = 2 with π′(2) = −1/2.
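The computations in this proof can be cross-checked numerically. The sketch below (ours, not from the paper) computes α(s) by bisection of g′(a) = log(a/(s−a)) + s − 2a, compares a centered finite difference of π with the expression −π(s)(s−2)/(s−2π(s)) obtained by implicit differentiation, and checks that the incremental ratio (π(s)−1)/(s−2) is close to −1/2 near s = 2:

```python
import math

def gp(a, s):
    # derivative of g(a) = a log a + (s - a) log(s - a) + a (s - a)
    return math.log(a) - math.log(s - a) + s - 2 * a

def alpha(s, tol=1e-13):
    # bisection for the minimizer of g in (0, s/2)
    lo, hi = 1e-13, s / 2 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gp(mid, s) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pi_(s):
    a = alpha(s)
    return a * (s - a)   # pi(s) = alpha(s) * beta(s), with beta(s) = s - alpha(s)

s, h = 3.0, 1e-5
fd = (pi_(s + h) - pi_(s - h)) / (2 * h)        # centered finite difference of pi
closed = -pi_(s) * (s - 2) / (s - 2 * pi_(s))   # candidate closed form for pi'(s)
ratio = (pi_(2.001) - 1) / 0.001                # incremental ratio near s = 2
```

The finite difference matches the closed form, and the incremental ratio is close to the derivative value −1/2 at s = 2.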
Lemma 2.3. The function f̃(s) := f_0(α(s), β(s)) belongs to C¹([2, +∞)) and is convex.

Proof. We recall that we have f̃(s) = α(s) log α(s) + β(s) log β(s) + α(s)β(s). Then, using the optimality condition (2.10) and α′(s) + β′(s) = 1, we compute
$$\tilde f'(s) = \alpha'(s)\big(\log\alpha(s)+1+\beta(s)\big) + \beta'(s)\big(\log\beta(s)+1+\alpha(s)\big) = \log\alpha(s)+1+\beta(s).$$
This allows to see that f̃ ∈ C¹, since the expression for f̃′ is made of continuous functions (as we do have α(s) > 0). Moreover, we can go on differentiating and get
$$\tilde f''(s) = \frac{\alpha'(s)}{\alpha(s)} + \beta'(s) = \frac{1-\pi(s)}{s-2\pi(s)}.$$
Since we have π(s) ≤ 1 and s − 2π(s) ≥ 0, the second derivative of f̃ is non-negative, and f̃ is convex.
Remark 1. For future use, we denote by r_0 the number given by
$$r_0 := \inf_{s>2}\, s\,\tilde f''(s) = \inf_{s>2}\, \frac{s(1-\pi(s))}{s-2\pi(s)},$$
and we note that we have r_0 > 0, since the function in the infimum is strictly positive, tends to 1 as s → ∞, and tends to 1/2 as s → 2⁺ (as a consequence of π′(2) = −1/2).

Corollary 2.4. The function f : R²_+ → R is convex.

Proof. We notice that f_0 is C¹ in the interior of B and that f̃ is C¹ on [2, +∞). Moreover, we have the following formula for the gradient of f, using s = a + b:
$$\nabla f(a,b) = \begin{cases} \big(\log a + 1 + b,\ \log b + 1 + a\big) & \text{if } (a,b) \in B,\\ \big(\tilde f'(s),\ \tilde f'(s)\big) & \text{if } (a,b) \in A. \end{cases}$$
Since these two expressions agree on the common boundary of A and B, f is globally C¹ in (0, +∞)². This allows us to prove that f is convex by considering separately its restrictions to A and B. Indeed, convexity for C¹ functions is equivalent to the inequality (∇f(x) − ∇f(y)) · (x − y) ≥ 0 for every x, y. If the segment connecting x and y is completely contained either in A or in B, then the convexity of the two restrictions is enough to obtain the desired inequality. If not, we can decompose it into a finite number of segments (three at most) of the form [x_i, x_{i+1}], with x_0 = x and x_3 = y, each fully contained either in A or in B. We then write
$$(\nabla f(x) - \nabla f(y))\cdot(x-y) = \sum_i \big(\nabla f(x_i) - \nabla f(x_{i+1})\big)\cdot(x-y),$$
and the fact that x − y is a positive scalar multiple of each vector x_i − x_{i+1} shows that the convexity of each restriction is again enough for the desired result (note that we strongly use here f ∈ C¹, i.e. that the gradients of the two restrictions agree on the common boundary).
The convexity of f restricted to B comes from the positivity of the Hessian of f_0 there, and that of f restricted to A comes from the convexity of f̃ (composed with the linear map (a, b) ↦ a + b), and the result is proven. Now, with the help of the above results, we prove Proposition 2.1.
Proof of Proposition 2.1. The function f has been built so that on each segment {(a, b) : a + b = s} it coincides with the convexification of the restriction of f_0 to such a segment. So, the convexification of f_0 cannot be larger than f. On the other hand, f is a convex function smaller than f_0, so it is also smaller than the convexification, which proves the claim.
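As a sanity check on Proposition 2.1, midpoint convexity of the glued function f can be tested numerically. In the sketch below (ours; the membership test (a, b) ∈ A is implemented as a + b > 2 together with α(a+b) < a < β(a+b)), we verify f ≤ f_0 and f((x+y)/2) ≤ (f(x)+f(y))/2 on a grid of points:

```python
import math

def gp(a, s):
    return math.log(a) - math.log(s - a) + s - 2 * a

def alpha(s, tol=1e-12):
    # bisection for the minimizer of g in (0, s/2)
    lo, hi = 1e-12, s / 2 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gp(mid, s) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f0(a, b):
    def xlx(t):
        return 0.0 if t == 0 else t * math.log(t)
    return xlx(a) + xlx(b) + a * b

def f(a, b):
    # glued function: f0 on B, the one-variable value f0(alpha, beta) on A
    s = a + b
    if s > 2:
        al = alpha(s)
        if al < a < s - al:          # (a, b) in A: f depends on the sum only
            return f0(al, s - al)
    return f0(a, b)                  # (a, b) in B

pts = [(0.1 + 0.37 * i, 0.1 + 0.29 * j) for i in range(8) for j in range(8)]
ok_below = all(f(a, b) <= f0(a, b) + 1e-10 for a, b in pts)
ok_midpoint = all(
    f((a1 + a2) / 2, (b1 + b2) / 2) <= (f(a1, b1) + f(a2, b2)) / 2 + 1e-9
    for a1, b1 in pts for a2, b2 in pts
)
```

Both flags should hold up to the numerical tolerance of the bisection, reflecting f ≤ f_0 and the convexity of the envelope.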
We conclude this section with a remark which will be useful in the sequel (see Lemma 4.10 in Section 4).
Remark 2. If χ is a function which is compactly supported in the set B, then there exists a Lipschitz continuous function g : R² → R such that we have χ(a, b) = g(f_a(a, b), f_b(a, b)). This holds because the Jacobian of (f_a, f_b) is invertible inside B.

Lower semi-continuity of the slope
The goal of this section is to give a precise definition of the slope functional SlopeF (ρ, µ) (introduced in (1.7)) and prove that it is lower semi-continuous with respect to the weak topology on measures.
Notice that the formula (1.7) for Slope_F(ρ, µ) makes use of the gradients of ρ and µ, but this expression is not well-defined for an arbitrary couple of measures (ρ, µ). This leads us to consider the following space:

Definition 3.1. We define the class H as the set of all pairs of densities ρ, µ ∈ L¹(Ω) ∩ P(Ω) such that
i. for every η ∈ W^{1,∞}_c(B), we have η(ρ, µ) ∈ H¹(Ω);
ii. we also have √(ρ + µ) ∈ H¹(Ω).

The sets A, B above are those defined in Section 2 (and we keep this notation in the whole presentation), and by W^{1,∞}_c(B) we denote the set of Lipschitz functions whose support is compact inside B (it can touch the axes R × {0} and {0} × R, but not the separating curve between A and B).
For (ρ, µ) ∈ H, we do not necessarily have ρ, µ ∈ H¹(Ω). However, we can define, for couples (ρ, µ) ∈ H, a suitable notion of ''gradient'' for certain functions of (ρ, µ). The notion of gradient we want to define should satisfy at least some chain rule in order to be useful in the sequel; that is, we would like to have, for any (ρ, µ) ∈ H, a formula of the form
$$\nabla\big(\chi(\rho,\mu)\big) = \chi_a(\rho,\mu)\,\nabla\rho + \chi_b(\rho,\mu)\,\nabla\mu.$$
In particular, we need this to be true for some simple functions χ, such as affine functions composed with suitable positive parts so that we have supp(χ) ⊂ B, namely the triangle functions
$$T_{\alpha,\beta,c}(a,b) := (c - \alpha a - \beta b)_+. \tag{3.15}$$
Let us also mention that, with such a definition of the gradients, we have the following chain rule: for any χ ∈ W^{1,∞}_c(B) and for any (ρ, µ) ∈ H, we have
$$\nabla\big(\chi(\rho,\mu)\big) = \chi_a(\rho,\mu)\,\nabla\rho + \chi_b(\rho,\mu)\,\nabla\mu. \tag{3.16}$$
To prove this fact, let us start by considering the case where χ is compactly supported in supp T_{α,β,c}, for some α, β, c ∈ Q. As both functions are in H¹, their gradients coincide on the set where (ρ, µ) ∈ supp χ, and on this set, applying the chain rule on the right-hand side yields the desired equality. Outside of this set, the chain rule is direct and everything is zero. We then conclude by observing that any compact subset of B is contained in a finite union of supports of the form supp T_{α,β,c} with α, β, c ∈ Q, and we use a partition of unity to generalize the result to general χ ∈ W^{1,∞}_c(B). We note that, if ρ, µ ∈ H¹(Ω), then the gradient defined above coincides with the usual gradient.
To conclude these remarks on the gradients of (ρ, µ), we observe that our definition of the space H does not allow us to define the gradients of ρ, µ properly when (ρ, µ) lies in the set A. However, in this set (and this will be sufficient for our needs), the gradient of the sum ρ + µ is well defined, i.e., it is measurable. Indeed, we know that √(ρ + µ) ∈ H¹(Ω), so that ∇(ρ + µ) := 2√(ρ + µ)\,∇√(ρ + µ) is well defined.

It follows from the above discussion that, for any Lipschitz function h : R → R, the gradient of h(ρ + µ) is well defined as h′(ρ + µ)∇(ρ + µ).
We now define the slope functional on the space H.

Definition 3.3. Let (ρ, µ) ∈ H. Then we define the slope functional Slope_F as
$$\mathrm{Slope}_F(\rho,\mu) := \int_{\{(\rho,\mu)\in B\}} \left( \frac{|\nabla\rho + \rho\nabla\mu|^2}{\rho} + \frac{|\nabla\mu + \mu\nabla\rho|^2}{\mu} \right) dx + \int_{\{(\rho,\mu)\in A\}} S\,\tilde f''(S)^2\,|\nabla S|^2\,dx,$$
where S := ρ + µ.
Note that the above formula for the slope has been obtained by expanding the expression
$$\int_\Omega |\nabla f_a(\rho,\mu)|^2\,\rho\,dx + \int_\Omega |\nabla f_b(\rho,\mu)|^2\,\mu\,dx.$$
We are now in a position to state the main result that we prove in this section.
Theorem 3.4. Let (ρ_n, µ_n) ∈ H be such that ρ_n ⇀ ρ and µ_n ⇀ µ as n → +∞, with (ρ, µ) ∈ H, where the above convergence is weak in the sense of measures. Then we have
$$\mathrm{Slope}_F(\rho,\mu) \le \liminf_{n\to+\infty} \mathrm{Slope}_F(\rho_n,\mu_n).$$

Preliminary results
In this section, we provide some preliminary results that will be used in the proof of Theorem 3.4. We start by proving the following proposition:

Proposition 3.5. Assume that (ρ_n, µ_n) ∈ H converges weakly, as n → +∞, to (ρ, µ) ∈ P(Ω) × P(Ω), and that Slope_F(ρ_n, µ_n) is bounded independently of n. Then, for any Lipschitz continuous function χ(a, b) which is constant everywhere on A, we have χ(ρ_n, µ_n) → χ(ρ, µ) strongly in L²(Ω).

The proof of Proposition 3.5 relies on two lemmas.
Lemma 3.6. Assume that (ρ_n, µ_n) and χ satisfy the hypotheses of Proposition 3.5. Assume in addition that χ is constant outside a compact set contained in B. Then the sequence (χ(ρ_n, µ_n))_n is bounded in H¹(Ω). In particular, it has a subsequence that converges strongly in L² and weakly in H¹.
Proof. We start by defining, for all points x such that (ρ_n(x), µ_n(x)) ∈ B, the following vector fields:
$$X_n := \frac{\nabla\rho_n + \rho_n\nabla\mu_n}{\sqrt{\rho_n}},\qquad Y_n := \frac{\nabla\mu_n + \mu_n\nabla\rho_n}{\sqrt{\mu_n}}. \tag{3.17}$$
Note that X_n is only defined on {ρ_n > 0}, i.e. ρ_n-a.e., and Y_n on {µ_n > 0}, i.e. µ_n-a.e. Therefore, by the definition of Slope_F, for any function η compactly supported in B, we have that
$$\int_{\{(\rho_n,\mu_n)\in\operatorname{supp}\eta\}} \big( |X_n|^2 + |Y_n|^2 \big)\,dx$$
is bounded independently of n. Again, for points x such that (ρ_n(x), µ_n(x)) ∈ B, we can write
$$\nabla\rho_n = \frac{\sqrt{\rho_n}\,X_n - \rho_n\sqrt{\mu_n}\,Y_n}{1-\rho_n\mu_n},\qquad \nabla\mu_n = \frac{\sqrt{\mu_n}\,Y_n - \mu_n\sqrt{\rho_n}\,X_n}{1-\rho_n\mu_n}.$$
Since (ρ_n, µ_n) ∈ H, we have that χ(ρ_n, µ_n) ∈ H¹(Ω) (where χ is given as in the statement of Proposition 3.5). Up to subtracting a constant, we can assume that χ is compactly supported in B. Now, let us prove that the sequence ∇(χ(ρ_n, µ_n)) is bounded in L². Using the chain rule given by (3.16) for the composition of Lipschitz functions and functions in H, we obtain
$$\nabla\big(\chi(\rho_n,\mu_n)\big) = \frac{(\chi_a - \mu_n\chi_b)\sqrt{\rho_n}}{1-\rho_n\mu_n}\,X_n + \frac{(\chi_b - \rho_n\chi_a)\sqrt{\mu_n}}{1-\rho_n\mu_n}\,Y_n,$$
where χ_a and χ_b are evaluated at (ρ_n, µ_n). The coefficients in front of X_n and Y_n are bounded because the support of χ is far from the set A, so that the product ρ_nµ_n is bounded above by a constant strictly less than 1 on the set of points x such that (ρ_n(x), µ_n(x)) ∈ supp(χ). Hence the result follows.
Let us now improve the above lemma by showing that, for χ as above, the weak H¹ and strong L² limit of χ(ρ_n, µ_n) is χ(ρ, µ), i.e. that we can pass to the limit inside the function χ. To do so, we start by considering the specific case where χ is of the form T_{α,β,c}, defined in (3.15).
Lemma 3.7. Assume that (ρ_n, µ_n) satisfies the hypotheses of Proposition 3.5, and let (α, β, c) be such that supp T_{α,β,c} is compact and contained in B. Then T_{α,β,c}(ρ_n, µ_n) → T_{α,β,c}(ρ, µ), and the convergence is strong in L² and weak in H¹.
Proof. We assume that (α, β, c) are chosen as in the statement of the lemma, and we omit writing them as subscripts of T_{α,β,c} in the proof. By Lemma 3.6, there exists u ∈ H¹, u ≥ 0, such that T(ρ_n, µ_n) → u strongly in L² and weakly in H¹ (up to a subsequence). Using T(ρ_n, µ_n) ≥ c − αρ_n − βµ_n and the weak convergence of c − αρ_n − βµ_n to c − αρ − βµ, we find that c − αρ − βµ ≤ u. Taking the maximum with 0, we get
$$T(\rho,\mu) \le u. \tag{3.18}$$
This already proves the equality T(ρ, µ) = u on {u = 0}. Now, let δ > 0 be fixed and define the set ω := {u > δ} ⊂ Ω. Let ε > 0 be fixed as well. Using Egoroff's theorem, we can find E ⊂ ω such that |E| < ε and such that T(ρ_n, µ_n) converges uniformly to u on ω \ E. Taking n large enough, we have T(ρ_n, µ_n) > δ/2 > 0 on ω \ E, so that
$$T(\rho_n,\mu_n)\,\mathbb 1_{\omega\setminus E} = (c - \alpha\rho_n - \beta\mu_n)\,\mathbb 1_{\omega\setminus E}.$$
The term on the left-hand side converges to u 𝟙_{ω\E} strongly in L², and the term on the right-hand side converges to (c − αρ − βµ) 𝟙_{ω\E} weakly. Then,
$$u\,\mathbb 1_{\omega\setminus E} = (c - \alpha\rho - \beta\mu)\,\mathbb 1_{\omega\setminus E} \le T(\rho,\mu)\,\mathbb 1_{\omega\setminus E},$$
and this is actually an equality due to (3.18). Since the measure of E can be taken arbitrarily small, and up to taking δ → 0, we obtain that u = T(ρ, µ) a.e.
We can now turn to the proof of Proposition 3.5.
Proof of Proposition 3.5.If the function χ is of the form T α,β,c , then Lemma 3.7 tells us that the proposition is true.The strategy of the proof is to prove first that the proposition holds true for any function χ whose support is contained in a triangle, itself contained in B, then we show that it holds true for any function χ whose support does not touch A (but may not be contained in a single triangle), and finally we consider the general case.
Step 1. The case where the support of χ is contained in a triangle. Assume that the support of χ is compactly contained in the support of the triangle function T_{α+ε,β+ε,c−ε}, where (α, β, c) and ε > 0 are chosen so that such a support is contained in B. We now define T_1 := T_{α+ε,β,c} and T_2 := T_{α,β+ε,c}. We want to prove that χ(ρ, µ) can be expressed as a Lipschitz function of (T_1(ρ, µ), T_2(ρ, µ)). To do so, we observe that there exists an affine function L : R² → R² such that (a, b) = L(c − (α+ε)a − βb, c − αa − (β+ε)b) for every (a, b). We then define a function g via g(t_1, t_2) := χ(L(t_1, t_2)) when t_1, t_2 > 0, and g(t_1, t_2) := 0 when min(t_1, t_2) ≤ 0. It is clear that χ(ρ, µ) equals g(T_1(ρ, µ), T_2(ρ, µ)): if either T_1(ρ, µ) or T_2(ρ, µ) vanishes, then we have χ(ρ, µ) = 0, while in the other case we can express ρ and µ via the affine function L; of course, we only need to apply g to values which are in R²_+ (since T_1, T_2 ≥ 0) and in L^{−1}(R²_+) (since ρ, µ ≥ 0). We only need to prove that g is Lipschitz continuous, which is not evident from its definition. To do so, we first note that points (a, b) ∈ supp χ can only produce values t_1 := c − (α+ε)a − βb and t_2 := c − αa − (β+ε)b with t_1, t_2 > ε: indeed, for (a, b) ∈ supp χ we have c − ε − (α+ε)a − (β+ε)b ≥ 0, which gives εb + ε ≤ t_1, hence t_1 > ε. Analogously, we also have εa + ε < t_2, hence t_2 > ε. In particular, χ(L(t_1, t_2)) = 0 whenever 0 < min(t_1, t_2) < ε. This shows that we could define g either as 0 (on the set {min(t_1, t_2) < ε}) or as χ(L(t_1, t_2)) (on the set {t_1, t_2 > 0}), and both expressions are Lipschitz continuous and they agree on the open set which is the intersection of the two domains of definition. Once we know that χ(ρ, µ) can be written as g(T_1(ρ, µ), T_2(ρ, µ)), the claim follows.
Step 2. The case where the support of χ does not touch A. Assume that the support of χ is compactly contained in B. We use the fact, which is based on the convexity of A, that the domain B is a union of triangles of the form supp T, even if functions supported in B are not necessarily supported in one such triangle only. Hence, we can find a finite family of triangle functions T_{α_k,β_k,c_k} whose supports cover supp χ. Let (χ_k) be a family of functions compactly supported in supp T_{α_k,β_k,c_k}, such that χ = Σ_k χ_k. Then we can apply the first step to conclude.
Step 3. The general case. Up to subtracting a constant, we assume that χ = 0 on A. Let us start by assuming that χ is non-negative. We proceed by approximation: let ε > 0 be fixed, and define χ_ε := (χ − ε)_+, whose support does not touch A. By Step 2 above, we have that χ_ε(ρ_n, µ_n) → χ_ε(ρ, µ), and this convergence (up to a subsequence) holds strongly in L² and weakly in H¹. Then, since ‖χ − χ_ε‖_∞ ≤ ε, we obtain
$$\limsup_{n\to+\infty} \|\chi(\rho_n,\mu_n) - \chi(\rho,\mu)\|_{L^2} \le 2\varepsilon|\Omega|^{1/2} + \lim_{n\to+\infty} \|\chi_\varepsilon(\rho_n,\mu_n) - \chi_\varepsilon(\rho,\mu)\|_{L^2} = 2\varepsilon|\Omega|^{1/2}.$$
Taking the limit as ε → 0 yields the result.
To treat the case where χ changes its sign, we can apply the argument on the positive and negative parts of χ separately.
We conclude these preliminary results with the following lemma:

Lemma 3.8. Let (ρ_n, µ_n) ∈ H converge weakly, as n → +∞, to (ρ, µ) ∈ H, and be such that Slope_F(ρ_n, µ_n) is bounded independently of n. Then, for any χ compactly supported in B, we have
$$\chi(\rho_n,\mu_n)\,\nabla\rho_n \rightharpoonup \chi(\rho,\mu)\,\nabla\rho \quad\text{and}\quad \chi(\rho_n,\mu_n)\,\nabla\mu_n \rightharpoonup \chi(\rho,\mu)\,\nabla\mu,$$
where the convergence is weak in L².
Proof. As usual, by possibly decomposing χ into a finite sum, we can assume that the support of χ is included in a triangle. We choose two different triangle functions T_1 := T_{α+ε,β,c} and T_2 := T_{α,β,c} such that their supports include that of χ. We use the weak H¹ convergence T_2(ρ_n, µ_n) ⇀ T_2(ρ, µ) provided by Lemma 3.7, which gives in particular ∇(T_2(ρ_n, µ_n)) ⇀ ∇(T_2(ρ, µ)) weakly in L². This implies
$$\chi(\rho_n,\mu_n)\,\nabla\big(T_2(\rho_n,\mu_n)\big) \rightharpoonup \chi(\rho,\mu)\,\nabla\big(T_2(\rho,\mu)\big),$$
since we just need to multiply the above weakly converging sequence by χ(ρ_n, µ_n), which is dominated and pointwise converging a.e. to χ(ρ, µ). If we do the same for T_1 we also obtain
$$\chi(\rho_n,\mu_n)\,\nabla\big(T_1(\rho_n,\mu_n)\big) \rightharpoonup \chi(\rho,\mu)\,\nabla\big(T_1(\rho,\mu)\big).$$
On the support of χ, we have ∇(T_2(ρ_n, µ_n)) = −α∇ρ_n − β∇µ_n and ∇(T_1(ρ_n, µ_n)) = −(α+ε)∇ρ_n − β∇µ_n. Subtracting the two relations and dividing by ε, we obtain χ(ρ_n, µ_n)∇ρ_n ⇀ χ(ρ, µ)∇ρ. We can now also deduce that χ(ρ_n, µ_n)∇µ_n ⇀ χ(ρ, µ)∇µ, and the claim is proven.
We make use of these preliminary results to prove Theorem 3.4.

Lower semi-continuity in the region B
The goal of this section is to prove the following: Proposition 3.9. Let (ρ n , µ n ) ∈ H be such that (ρ n , µ n ) converges weakly, as n → +∞, to (ρ, µ) ∈ H and such that SlopeF (ρ n , µ n ) is bounded independently of n. Let χ be a Lipschitz function compactly supported in B. Then Proof. By the definitions of f and χ, we have The latter expression is convex in the terms (∇ρ n + ρ n ∇µ n )χ(ρ n , µ n ) and ρ n χ(ρ n , µ n ) (and the same for the terms involving µ n ). The standard lower semi-continuity results (see Chapter 4 in [11]) then prove the claim. We only have to note that Proposition 3.5 provides where the convergence is strong in L 2 (and the same also holds for µ n ), while Lemma 3.8 in turn provides where the convergence is weak in L 2 . We have used here several times that the functions (a, b) → aχ(a, b), bχ(a, b) are compactly supported in B and Lipschitz continuous. This concludes the proof.

Lower semi-continuity in the region A
The goal of this section is to prove the following: Let us define S := ρ + µ and consider the function P (ρ, µ) defined in (2.12). We recall that (see Lemma 2.2), for s > 2, we have Recall also that, if (ρ, µ) ∈ H, then the gradients of √ S and P (ρ, µ) are well defined. We can then state the following lemma: Lemma 3.11. Consider (ρ, µ) ∈ H. Then we have Proof. We consider the cases (ρ, µ) ∈ A and (ρ, µ) ∈ B separately. The inequality is actually an equality on the set A.
Proof. We start by giving some bounds on ∇S n . Recalling the definitions of X n , Y n from (3.17), we have (on the set {(ρ n , µ n ) ∈ B} of points x ∈ Ω where (ρ n (x), µ n (x)) ∈ B), Therefore, on {(ρ n , µ n ) ∈ B}, we obtain Since ρ n µ n ≤ 1 on the set B, we get For (ρ n , µ n ) ∈ A, we use f ′′ ≥ r 0 /s, where r 0 is as in Remark 1, and we obtain for a constant C independent of n. Therefore, taking is bounded independently of n. The function h is positive and vanishes only at s = 2. If we denote by H any anti-derivative of h, characterized by H ′ = h, we deduce that H is strictly increasing. Moreover, we obtain a uniform H 1 bound on H(S n ). This implies that, up to a subsequence, H(S n ) has an a.e. limit. Composing with H −1 , the same is true for S n . The pointwise limit of S n can only coincide with its weak limit, i.e. S. Now, to prove the convergence of P (ρ n , µ n ), it is sufficient to write for any extension of π to R + (π is originally defined only on [2, +∞), but we can take π = 1 on [0, 2]).
The function P (a, b) − π(a + b) is constant, equal to zero, on the set A. Hence, we can apply Proposition 3.5 to obtain the desired convergence. As for π(ρ n + µ n ), we just need to apply what we proved above for S n .
We are now in a position to prove Proposition 3.10.
Proof of Proposition 3.10. Consider a sequence as in the statement of the proposition. Then, we have Since √ S n + P n is bounded in H 1 , it converges weakly in H 1 up to extraction of a subsequence, and we know that its limit has to be √ S + P . Using the standard semi-continuity argument we obtain Therefore, Theorem 3.4, that is, the lower semi-continuity of the functional SlopeF , follows from the previous results as an easy consequence: Proof of Theorem 3.4. Let (ρ n , µ n ) be a sequence satisfying the hypotheses of Theorem 3.4. Let 0 ≤ χ ≤ 1 be a Lipschitz function compactly supported in B. Then Applying separately the results of Propositions 3.9 and 3.10, we obtain Since χ is arbitrary, we can select an increasing sequence of cut-off functions χ k converging to ½ B , and by monotone convergence we obtain and this concludes the proof.

Existence of solutions in the EDI sense
In this section, we define the JKO scheme for the functional F and three different interpolations (De Giorgi variational, piecewise geodesic, and piecewise constant) of this scheme. We prove that these interpolations converge to the same limit curve, and that this limit curve is a gradient flow for the functional F in a suitable sense (see Def. 4.2 below). More precisely, we show that the Energy Dissipation Inequality (4.19) holds by using the estimates obtained via the three interpolations defined below.
• The pairs (ρ t , µ t ) and (v t , w t ) satisfy the Energy Dissipation Inequality

The main goal of this section is to prove the following theorem: Theorem 4.3. For any initial datum (ρ 0 , µ 0 ) such that F (ρ 0 , µ 0 ) < +∞, there exists an EDI gradient flow for the functional F .

The JKO scheme and the interpolations
Definition 4.4 (JKO scheme for F ). For a fixed time step τ > 0 (of the form τ = T /N for some N ∈ N), we define the JKO scheme as a sequence of probability measures (ρ τ k , µ τ k ) k , with a given initial datum (ρ τ 0 , µ τ 0 ) = (ρ 0 , µ 0 ), such that for k ∈ {0, . . . , N − 1} we have This sequence of minimizers exists since the functional F is lower semi-continuous for the weak convergence, and so is the sum we minimize. Such a sum is also strictly convex, since the measures ρ τ k , µ τ k are necessarily absolutely continuous (which implies strict convexity of the Wasserstein terms, see Prop. 7.19 in [14]): the minimizer is then unique at every step.
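The minimization problem (4.20) itself is missing from the extracted text; for a JKO scheme on the product of two Wasserstein spaces it standardly reads as follows (a hedged reconstruction, consistent with the surrounding discussion of the two Wasserstein terms):

```latex
% Sketch of the elided one-step problem (cf. (4.20)):
\[
(\rho^\tau_{k+1},\mu^\tau_{k+1})
\;\in\;
\operatorname*{argmin}_{(\rho,\mu)}\;
F(\rho,\mu)
+ \frac{W_2^2(\rho,\rho^\tau_k) + W_2^2(\mu,\mu^\tau_k)}{2\tau}.
\]
```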
Notice that (4.20) implies that, at each iteration, we have An important consequence of (4.21) is the following inequality, obtained by summing over k and using the fact that F is bounded from below: Our strategy to prove that Inequality (4.19) holds is to improve (4.21) with the help of some interpolations of the sequence (ρ τ k , µ τ k ) k for the functional F . We therefore define these interpolations next. Definition 4.5 (De Giorgi variational interpolation). We define the De Giorgi variational interpolation (ρ τ t , μτ t ) of the sequence (ρ τ k , µ τ k ) k as follows: for any s ∈ (0, 1] and any k, take t = (k + s)τ such that

We notice that when s = 1, (4.23) is nothing but the JKO scheme (4.20). The main point (which is by now a classical idea in the study of gradient flows, see [3]) is that we can improve (4.21) by To obtain (4.24) we define a function g This function is decreasing and hence differentiable a.e. At differentiability points, we necessarily have On the other hand, for a monotone function the fundamental theorem of calculus gives an inequality, which reads here Combining the last two lines, we obtain (4.24).

Definition 4.6 (Piecewise constant interpolation). We define the piecewise constant interpolation as a pair of piecewise constant curves (ρ τ t , μτ t ) and a pair of velocities (v τ t , wτ t ) associated with these curves such that for every t ∈ (kτ, (k + 1)τ ] and k ∈ {0, . . . , N − 1}, N ∈ N, they satisfy where T ρ τ k+1 →ρ τ k and T µ τ k+1 →µ τ k are the optimal transport maps from ρ τ k+1 to ρ τ k and from µ τ k+1 to µ τ k respectively. We define the momentum variables by Ēτ (t) := ( Ēτ ρ (t), Ēτ µ (t)) = (ρ τ t vτ t , μτ t wτ t ).

Definition 4.7 (Piecewise geodesic interpolation). We define the piecewise geodesic interpolation as a pair of densities (ρ τ t , μτ t ) that interpolate the discrete values (ρ τ k , µ τ k ) k along Wasserstein geodesics: for each k we define We also define some velocity fields (ṽ τ t , wτ t ) as follows In this way we can check that we have and, for t ∈ ((k − 1)τ, kτ ), we also have We also define the momentum variables by Ẽτ (t) = ( Ẽτ

Now that we have defined the three interpolations above and improved (4.21) into (4.27), we would like to replace the terms involving the De Giorgi variational interpolation (ρ τ k+s , μτ k+s ) in (4.27) by the slope functional SlopeF (ρ τ k+s , μτ k+s ). To be able to do that, we first prove two technical lemmas (Lemmas 4.8 and 4.9 below) and then use them to prove that the pair of curves obtained via the De Giorgi variational interpolation belongs to the space H.
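The estimates (4.21) and its De Giorgi improvement (4.24) are elided in the extraction. In the classical setting recalled here (see [3]) they should take the following form, writing the De Giorgi interpolation at time t = (k + s)τ as (ρ̂, µ̂); this is a hedged reconstruction, not the paper's own display:

```latex
% One-step minimality (cf. (4.21)):
\[
F(\rho^\tau_{k+1},\mu^\tau_{k+1})
+ \frac{W_2^2(\rho^\tau_{k+1},\rho^\tau_k)+W_2^2(\mu^\tau_{k+1},\mu^\tau_k)}{2\tau}
\;\le\; F(\rho^\tau_k,\mu^\tau_k).
\]
% De Giorgi improvement (cf. (4.24)):
\[
\frac{W_2^2(\rho^\tau_{k+1},\rho^\tau_k)+W_2^2(\mu^\tau_{k+1},\mu^\tau_k)}{2\tau}
+ \int_0^1 \frac{W_2^2(\hat\rho^\tau_{k+s},\rho^\tau_k)
                 +W_2^2(\hat\mu^\tau_{k+s},\mu^\tau_k)}{2s^2\tau}\,ds
\;\le\; F(\rho^\tau_k,\mu^\tau_k)-F(\rho^\tau_{k+1},\mu^\tau_{k+1}).
\]
```

Summing the first inequality over k, and using that F is bounded from below, yields the total square distance estimate referred to as (4.22).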
Lemma 4.8. For all strictly positive densities ρ, µ ∈ C ∞ (Ω) we have where S := ρ + µ and r 0 > 0 is defined in Remark 1 in Section 2.
Proof. We consider the sets A and B separately. For (ρ, µ) ∈ A, we have For (ρ, µ) ∈ B, we have We want to show that the inequality is always satisfied. This means we want to have We now look at the matrix and prove that it is positive definite. Using r 0 ≤ 1 and ρ, µ ≤ S, we easily see that the terms on the diagonal are non-negative. We then compute the determinant and obtain For fixed sum S = ρ + µ, this last expression is minimal when the product ρµ is maximal. If S > 2 we can then bound it from below by (1 − r 0 )/π(S) − 1 + 2r 0 /S. This quantity is non-negative as a consequence of the definition of r 0 , since we have If S ≤ 2 we just use ρµ ≤ S 2 /4 ≤ S/2 and bound the same quantity from below by 2(1 − r 0 )/S − 1 + 2r 0 /S =

Lemma 4.9. Let us consider the auxiliary energy functional G(ρ, µ) = ´Ω g(ρ, µ), where g(a, b) := a log a + b log b (defined as equal to +∞ if the measures are not absolutely continuous). Suppose that ρ 0 , µ 0 are given absolutely continuous measures and let us call (ρ 1 , µ 1 ) the unique solution of Then, setting S 1 := ρ 1 + µ 1 , we have where the number r 0 > 0 is defined in Remark 1 in Section 2.
Proof. We prove this result via a flow-interchange inequality and a regularization argument. We fix ε > 0 and define F ε (ρ, µ) := F (ρ, µ) + εG(ρ, µ). For a given sequence of smooth measures ρ 0,ε , µ 0,ε ∈ C ∞ we consider the sequence of minimizers If we write the optimality conditions for the above minimization problem, we have ρ 1,ε , µ 1,ε > 0 (the argument is the same as in the proof of Lemma 8.6 in [14]) and where C, C ′ are some positive constants. Since the map involved is the gradient of a strictly convex function, it is a diffeomorphism, and we deduce that ρ 1,ε and µ 1,ε have the same regularity as the Kantorovich potentials, and are Lipschitz continuous. They are also bounded from below because of the logarithm in the optimality conditions, and Caffarelli's theory (see Section 4.2.2 in [17]) implies that the Kantorovich potentials are C 2,α (we have assumed that the domain is convex). Iterating these regularity arguments shows that ρ 1,ε and µ 1,ε are C ∞ functions.
We then use the geodesic convexity of the entropy to deduce that we have , where (ρ s , µ s ) is a pair of geodesic curves in W 2 (Ω) × W 2 (Ω) connecting the densities (ρ 1,ε , µ 1,ε ) to the densities (ρ 0,ε , µ 0,ε ) (note that, in a JKO scheme, this interpolation starts from the new points and goes back to the old ones). We then use the continuity equations ∂ s ρ s + ∇ • (ρ s v s ) = 0 and ∂ s µ s + ∇ • (µ s w s ) = 0, together with the fact that the initial velocity fields v 0 and w 0 can be obtained as minus the gradients of the corresponding Kantorovich potentials. Hence we have Using the optimality conditions we obtain Dropping the positive terms with the gradients of the logarithms and applying Lemma 4.8, we then obtain where S 1,ε = ρ 1,ε + µ 1,ε . It is then enough to let ε → 0, choosing an approximation ρ 0,ε , µ 0,ε such that G(ρ 0,ε , µ 0,ε ) → G(ρ 0 , µ 0 ). Note that the terms in G(ρ 1,ε , µ 1,ε ) and S 1,ε are lower semi-continuous for the weak convergence (and the sequence (ρ 1,ε , µ 1,ε ) weakly converges to (ρ 1 , µ 1 ) by Γ-convergence of the minimized functionals and uniqueness of the minimizer in the limit).
Lemma 4.10. Let (ρ τ t , μτ t ) be the pair of curves obtained by the De Giorgi variational interpolation. Then (ρ τ t , μτ t ) ∈ H for every t.
Proof. First, we note that the optimality conditions provide Lipschitz continuity, for fixed τ , of f a (ρ τ t , μτ t ) and f b (ρ τ t , μτ t ), and hence of χ(ρ τ t , μτ t ) (see Remark 2 in Section 2). Therefore Property i. of Def. 3.1 is satisfied.
For Property ii. of Def. 3.1, we apply Lemma 4.9 with sτ instead of τ , which guarantees the H 1 behavior of the square root of the sum.
Remark 3. Note that the very same argument implies that the curves obtained by the piecewise constant interpolation also belong to the space H.

Existence of solutions
The goal of this section is to prove Theorem 4.3. We start by proving the following lemma: Lemma 4.11. The pairs of curves (ρ τ t , μτ t ), (ρ τ t , μτ t ) and (ρ τ t , μτ t ) given by Definitions 4.5, 4.6 and 4.7 respectively converge, up to subsequences, as τ → 0, to the same limit curve (ρ t , µ t ) uniformly in the W 2 distance. Moreover, the vector-valued measure Ẽτ corresponding to the momentum variable of the piecewise geodesic interpolation also converges weakly-* in the sense of measures on [0, T ] × Ω to a limit vector measure E along the same subsequence.
Proof. Recall that the geodesic speed is constant on each interval (kτ, (k + 1)τ ); this implies and similarly for wτ t . Then we obtain Let us note that we have where the inequality is a consequence of (4.22). We first use this inequality to estimate the momentum variables, since we have Analogous estimates can be obtained for Ẽτ µ . This means that Ẽτ is bounded in L 1 ([0, T ] × Ω), and we obtain the weak-* compactness in the space of measures on space-time.
It is now classical in gradient flows, as a consequence of the estimate on the L 2 norm of the velocities (4.32), to obtain Hölder bounds on the geodesic interpolations. Indeed, the pair (ρ τ t , μτ t ) is uniformly 1/2-Hölder continuous, since by using the previous computations we can show that, for s < t, and an analogous estimate holds for μτ . Since the domain [0, T ] of the curves ρ, μ : [0, T ] → W 2 (Ω) is compact and so is the image W 2 (Ω), we can pass to the limit by using the Ascoli–Arzelà theorem. Therefore there exists a subsequence τ j → 0 such that Ẽτ j ρ → E ρ and Ẽτ j µ → E µ weakly-* as measures, and ρτ j t → ρ t and μτ j t → µ t uniformly in W 2 . (4.33) Moreover, the curves (ρ τ t , μτ t ) obtained from the piecewise constant interpolation converge uniformly to the same limit curve (ρ t , µ t ) as the ones obtained from the piecewise geodesic interpolation, since we have where C, C ′ are some positive constants. This is shown by using that (ρ This means that the pair (ρ τ t , μτ t ) also converges to the limit curve (ρ t , µ t ) uniformly on [0, T ]. Therefore we have shown that the three interpolations defined for (4.20) converge to the same limit curve (ρ t , µ t ).
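The 1/2-Hölder estimate invoked above is elided in the extraction; it follows from (4.32) by the Cauchy–Schwarz inequality along the interpolating curve, in the standard form sketched here (hedged reconstruction):

```latex
% For s < t, the metric-derivative bound plus Cauchy--Schwarz gives
\[
W_2(\tilde\rho^\tau_s,\tilde\rho^\tau_t)
\;\le\; \int_s^t \|\tilde v^\tau_r\|_{L^2(\tilde\rho^\tau_r)}\,dr
\;\le\; (t-s)^{1/2}\Bigl(\int_s^t \|\tilde v^\tau_r\|^2_{L^2(\tilde\rho^\tau_r)}\,dr\Bigr)^{1/2}
\;\le\; C\,(t-s)^{1/2},
\]
```

with the constant C coming from the uniform bound (4.22) on the total kinetic energy, and the same estimate for μ̃ τ with w̃ τ .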
By Lemma 4.10, we know that, for the curves obtained by the De Giorgi interpolation, we have (ρ τ t , μτ t ) ∈ H for all t. Then we have the following: Using the above equality, (4.27) rewrites as Moreover, since S = ρ + µ and ρ and µ have unit mass, we also have S ∈ L ∞ t L 1 x . Then, we have This proves the result for d = 1. The sharp value of the constant α is 4π, but we will just use α = 1. Taking c(t) = log h(t) and denoting u(t, x) = S t (x)/h(t) in (4.39), we obtain

Now, the very last inequality also relies on the fact that S has uniformly bounded entropy, i.e. ´Ω S t log S t dx ≤ C. This is a consequence of the bound F (ρ t , µ t ) ≤ F (ρ 0 , µ 0 ). Using F ≥ G we have a uniform bound on ´ρ log ρ + µ log µ and, by convexity, on ´(S/2) log(S/2), which in turn gives a bound on the entropy of S. This gives the result for d = 2 and finishes the proof.

Differentiation properties
The goal of this section is to prove a statement similar to the following one: let (ρ, µ) be a curve in L 2 H (see Def. 4.1) and let v and w be two velocity fields for ρ and µ, respectively, i.e. ∂ t ρ + ∇ • (ρv) = 0 and ∂ t µ + ∇ • (µw) = 0.
If g : R 2 + → R is a ''nice enough'' function, we have (5.41) In particular, we would like this to hold for g = f and for (ρ, µ) the solution that we found in Section 4.
The main idea behind the above computation is that (5.41) holds if the densities of ρ and µ are smooth, just by differentiating under the integral sign and integrating by parts. Hence, we will prove the result by regularization, relying on a suitable convolution kernel (in space only). We will suppose that Ω is either the torus or a regular cube. In the second case, after symmetrizing, the functions ρ and µ can be extended by periodicity, and it is exactly as if Ω were the torus.
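Definition (5.42) is not visible in the extracted text. The standard choice of regularization, which preserves the continuity equation and is the one for which results like Lemma 5.1 below are classical, is the following (assumed reconstruction, not the paper's own display):

```latex
% Mollified density and velocity (convolution in space only):
\[
\rho_\varepsilon := \eta_\varepsilon * \rho,
\qquad
v_\varepsilon := \frac{\eta_\varepsilon * (\rho v)}{\eta_\varepsilon * \rho},
\qquad\text{so that}\qquad
\partial_t \rho_\varepsilon + \nabla\cdot(\rho_\varepsilon v_\varepsilon) = 0,
\]
```

and similarly for µ ε , w ε .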
We first observe the following property.
Lemma 5.1. If ρ ∈ L 1 ([0, T ] × Ω) and ´T 0 ´Ω ρ|v| 2 < +∞, and ρ ε and v ε are defined as in (5.42), then we have Proof. It is well known (see Chapter 5 in [14]) that we have Moreover, it is clear that its pointwise limit is √ ρv on the set {ρ > 0}, as a consequence of the standard properties of the convolution and of ρ ε → ρ and ρ ε v ε → ρv. We then use Lemma 5.2 below to deduce the strong L 2 convergence.
Lemma 5.2. Suppose that a sequence u n ∈ L 2 (X; R d ) converges weakly in L 2 to a function v, and that u n (x) → u(x) for a.e. x ∈ A ⊂ X. Then u = v on A. Suppose moreover that a sequence u n satisfies lim sup n ´|u n | 2 ≤ ´|u| 2 and u n (x) → u(x) for a.e. x ∈ A with A = {u ≠ 0}; then u n → u in L 2 (X).
Proof. Let φ ∈ L ∞ (X) be a test function vanishing on A c . We have ´un • φ → ´v • φ because of the weak convergence. Yet, for an arbitrary R > 0, if we denote by π R the projection onto the closed ball of radius R, we also have ´πR

We say that a convolution kernel η satisfies Property H1 conv if the following holds: there exists a constant C such that, for every function u ∈ L 1 with u + ∈ H 1 ∩ L ∞ and every positive constant c > 0 we have Proposition 5.5. Suppose that there exists a convolution kernel η satisfying Property H1 conv . Then Equality (5.41) holds when g ∈ C 2 is compactly supported in B ⊂ R 2 + .
Proof. We will proceed as above by convolution, but we choose a convolution kernel η satisfying Property H1 conv .
The proof of this proposition closely follows the steps of the proof of Proposition 3.5. Let us take a ''triangle'' function T α,β,c , as defined in (3.15), whose support {(a, b) ∈ R 2 + : αa + βb ≤ c} is contained in the open set B. We can assume that there exists another triangle contained in B with coefficients α, β, c ′ and c ′ > c.
We set u = c ′ − (αρ + βµ) and we observe that u + ∈ L 2 t H 1 x because (ρ, µ) ∈ L 2 H. We also have u ≤ c ′ , which shows that u + also belongs to L ∞ . We then deduce that the functions (c If we fix a continuous function χ, supported in the support of T α,β,c , we have the pointwise convergence and this implies the weak L 2 convergence This is true for any continuous function χ supported in the triangle which is the support of T α,β,c . Yet, χ is also supported in different triangles, which we can obtain as supports of other functions of the form T α,β,c by slightly changing the values of α, β, c. Therefore, we deduce that, for each such continuous function χ, we have Following the argument of the proof of Lemma 3.8, i.e. summing up many such functions χ and using a partition of unity, we deduce that the same result finally holds for any continuous function χ supported in B (again, since the domain B is a union of such triangles). We now proceed in the usual way with the approximation by convolution, since we just need to observe that we have and then use the strong convergence of √ ρ ε v ε and the weak convergence of the other term, since g aa and g ab are continuous and compactly supported in B.
We also need to pass the terms ´g(ρ ε (0), µ ε (0)) and ´g(ρ ε (T ), µ ε (T )) to the limit, but this can be easily done by dominated convergence since g is bounded.
The next results extend the above proposition to the case of interest for us. For δ > 0, we set

Proposition 5.6. Suppose that there exists a convolution kernel η satisfying Property H1 conv . Then Equality (5.41) holds when g ∈ C 1,1 is compactly supported in the open set B ⊂ R 2 + , with g ∈ C 2 (B δ ) and g = 0 on B \ B δ .
Proof. The proof is obtained by approximation, using a cut-off function χ r with the following properties: We then apply Proposition 5.5 to gχ r , which is C 2 , and pass to the limit. The convergence of the boundary terms (in time) is obtained by pointwise and dominated convergence. For the convergence of the integral terms in space-time we need to dominate the second derivatives In the above sum, the first term is bounded. The second and third terms are also bounded since |∇χ r | ≤ Cr −1 but |∇g| ≤ r on {∇χ r ≠ 0}, which is a consequence of the C 1,1 behaviour of g. Similarly, for the last term we use

In order to conclude, we now take the function f and write it as f (a, b) = ḡ(a, b) + f (a + b), with ḡ ∈ C 1,1 and ḡ = 0 on A. It is not yet possible to apply Proposition 5.6 to g = ḡ, since ḡ is supported on B ∪ ∂A and not on B, but we will obtain this by approximation. Proposition 5.7. Suppose that there exists a convolution kernel η satisfying Property H1 conv . Then Equality (5.41) holds when g = ḡ, and hence for g = f , provided the curve (ρ, µ) is such that ´T 0 SlopeF (ρ, µ) dt is finite. We need to prove that all the terms computed in (ρ + δ, µ + δ) converge to the corresponding terms in (ρ, µ). To do so, we consider the difference between the terms with δ and those without. Since v ∈ L 2 (ρ), we just need to prove that the following terms converge to 0 in L 2 (ρ): ½ B δ , ∇S f ′′ (S + 2δ) − f ′′ (S) ½ B δ , ∇f a (ρ, µ)½ B\B δ (we only consider the terms in the first integral; the second integral in (5.43) is treated similarly, using w ∈ L 2 (µ)). The last term is easy to handle, since it converges pointwise to 0 and is dominated by |∇f a (ρ, µ)|, which is in L 2 (ρ) (owing to the assumption on the slope).
This proves that we can take the limit δ → 0 in the first part of the integral.For the second part, one has to do the same estimates in L 2 (µ), and the computations are the same.
An H 1 estimate on the positive part

The goal of this section is to prove that, in dimension d = 1, there indeed exists a convolution kernel satisfying the H1 conv property. In order to avoid boundary issues, the result will be proven on the 1-dimensional torus S 1 . As we already explained in the previous section, this implies a similar result on a segment, after one reflection.
Before going to dimension one, we would like to explain why this part of the paper unfortunately requires us to restrict ourselves to dimension one: the reason lies in the very different behavior of H 1 functions in terms of pointwise bounds. Indeed, in the one-dimensional case H 1 functions are continuous, so that the ''bad'' region where u < 0 (i.e. the region where we only have an L 1 control on the function) has to be far away from regions where u > c; this is not the case in higher dimensions. We note that it would be possible to obtain some estimates similar to the ones defining H1 conv in any dimension if we added the assumption that u ∈ L ∞ , but it is not possible to obtain such estimates if we stick to the L 1 assumption that we required, which, by density, is actually equivalent to working with measures.
Unfortunately, for the sake of the applications of H1 conv to Section 5, we cannot assume u ∈ L ∞ as this would be equivalent to ρ, µ ∈ L ∞ , and L ∞ bounds are unknown in our cross-diffusion system.
Our 1-dimensional result will be proven by considering the kernel for m > 2. We have η ∈ L 1 (R), ´η(x) dx = 1, and η has a finite first moment. We then define a kernel on the torus via x ↦ Σ k∈Z η(x − k).
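The explicit formula for η is lost in the extraction. A natural candidate matching the stated properties (polynomial decay of order m > 2, unit mass, finite first moment) is η(x) = ((m − 1)/2)(1 + |x|)^{−m}; this is an assumption for illustration, not the paper's verified formula. A quick numerical sanity check of the mass and first-moment properties:

```python
from scipy.integrate import quad

# Hypothetical kernel (assumed form, not taken from the paper):
#   eta(x) = c_m * (1 + |x|)^(-m),  with  c_m = (m - 1)/2,  for m > 2.
# Analytically: total mass = 1, and first moment = 1/(m - 2), finite iff m > 2.

m = 3.0
c_m = (m - 1.0) / 2.0

def eta(x):
    return c_m * (1.0 + abs(x)) ** (-m)

# Exploit the symmetry eta(-x) = eta(x): integrate over (0, +inf) and double.
mass = 2.0 * quad(lambda x: eta(x), 0.0, float("inf"))[0]
first_moment = 2.0 * quad(lambda x: x * eta(x), 0.0, float("inf"))[0]

print(mass)          # ~ 1.0
print(first_moment)  # ~ 1/(m - 2) = 1.0 for m = 3
```

For m ≤ 2 the first-moment integral diverges, which is consistent with the restriction m > 2 in the text.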
An easy computation shows that, for x ≠ 0, we have In particular, we also obtain All these properties can be translated in terms of the kernel η on S 1 , since we have uniform convergence of the series defining η, together with its first and second derivatives. Moreover which proves that η, which is a C 2 function except at 0 where η ′′ is a negative Dirac mass, also satisfies In order to approximate the identity on the torus, we define and η ε (x) := Σ k∈Z ηε (x − k). Observe that η ε has mass 1 on the torus and that it satisfies As a consequence of the properties of η ε we do have, for any L 1 function w ≥ 0, denoting w ε := η ε * w,

Proposition 6.1. There exists K > 0 (depending only on m) such that, for any u ∈ L 1 whose positive part u + is in H 1 ∩ L ∞ , and for every c > 0, we have

Proof. Let u be as in the statement of the proposition. For notational simplicity, let us denote by v = u + and w = u − the positive and negative parts of u respectively, so that u = v − w. We have (6.46) Note that the desired inequality is straightforward if c > ∥v∥ L ∞ (since in this case (u ε − c) + = 0), so we can assume ∥v∥ L ∞ ≥ c and estimate the term ´|v ′ | 2 with )

The critical points are characterized by (using b = s − a): log(a) + b = log(b) + a, a + b = s. For s ≤ 2 only a = b = s/2 is a critical point; otherwise the same point is a local maximizer and there are two global minimizers.

The two sets A and B are represented in Figure 1.

Figure 2: Functions g and f .

We consider the function φ(b) = b (log b − c), and notice that the Legendre transform φ * of φ is given by φ * (a) = e a+c−1 . Then we have the following inequality for every a, b, c (with b > 0): b(log b − c) + e a+c−1 ≥ ab.
An additional estimate. We conclude this section by showing an additional estimate on S, which is actually not needed for our analysis. Lemma 4.14. Assume that the problem we consider is in dimension 1 or 2, that is, Ω ⊂ R d with d = 1 or d = 2. Then, for T > 0, we have ´T 0 ∥S t ∥ 2 L 2 (Ω) dt < +∞.