Fine Properties of Geodesics and Geodesic λ-Convexity for the Hellinger–Kantorovich Distance

We study the fine regularity properties of optimal potentials for the dual formulation of the Hellinger–Kantorovich problem (HK), providing sufficient conditions for the solvability of the primal Monge formulation. We also establish new regularity properties for the solution of the Hamilton–Jacobi equation arising in the dual dynamic formulation of HK, which are sufficiently strong to construct a characteristic transport-growth flow driving the geodesic interpolation between two arbitrary positive measures. These results are applied to study relevant geometric properties of HK geodesics and to derive the convex behaviour of their Lebesgue density along the transport flow. Finally, exact conditions for functionals defined on the space of measures are derived that guarantee the geodesic λ-convexity with respect to the Hellinger–Kantorovich distance. Examples of geodesically convex functionals are provided.


Introduction
In [LMS16, LMS18] the Hellinger-Kantorovich distance was introduced to describe the interaction between optimal transport and optimal creation and destruction of mass in a convex domain of $\mathbb{R}^d$; it is also called the Wasserstein-Fisher-Rao distance in [KMV16, CP*15, CP*18] and the Kantorovich-Fisher-Rao distance in [GaM17]. Here we further investigate the structure of (minimal) geodesics, and we fully analyze the question of geodesic λ-convexity of integral functionals with respect to this distance.
The Hellinger-Kantorovich distance can be considered as a combination, more precisely the inf-convolution, of the Hellinger-Kakutani distance on the set of all measures (cf. e.g. [Sch97]) and the L 2 Kantorovich-Wasserstein distance, which is well-known from the theory of optimal transport, see e.g. [AGS08,Vil09]. Throughout this text, we denote by M(R d ) all nonnegative and finite Borel measures endowed with the weak topology induced by the canonical duality with the continuous functions C 0 (R d ) decaying at infinity. While the L 2 Kantorovich-Wasserstein distance W(µ 0 , µ 1 ) of measures µ 0 , µ 1 ∈ M(R d ) requires µ 0 and µ 1 to have the same mass to be finite, the Hellinger-Kakutani distance, which is defined via has the upper bound H(µ 0 , µ 1 ) ≤ µ 0 (R d )+µ 1 (R d ), with equality if µ 0 and µ 1 are mutually singular.
It is a remarkable fact, deeply investigated in [LMS18], that the H K distance has many interesting equivalent characterizations, which highlight its geometric and variational character. A first one arises from the dual dynamic counterpart of (1.1) in terms of subsolutions of a suitable Hamilton-Jacobi equation: (1.2) By expressing solutions of (1.2) in terms of a new formula of Hopf-Lax type, one can write a static duality representation associated with the convex cost function L 1 (z) := 1 2 log(1 + tan 2 (|z|)) which forces |z| < π/2. Notice that it is possible to write (1.3) in a symmetric form with respect to ϕ 0 , ϕ 1 just by changing the sign of ϕ 1 .
It is remarkable that (1.3) can be interpreted as the dual problem of the static Logarithmic Entropy Transport (LET) variational formulation of HK. By introducing the logarithmic entropy density, we obtain $\mathsf{HK}^2(\mu_0,\mu_1)$ as the minimum of an entropy-transport functional over all positive finite Borel measures $\eta$ in $\mathbb{R}^d\times\mathbb{R}^d$ whose marginals $(\pi_i)_\sharp\eta = \sigma_i\mu_i$ are absolutely continuous with respect to $\mu_i$. The subdifferential $DL_1(z) = \partial L_1(z) = \boldsymbol{\tan}(z) := \tan(|z|)\,\frac{z}{|z|}$ and its inverse $w\mapsto\boldsymbol{\arctan}(w)$ will play an important role. We continue to use bold function names for vector-valued functions constructed from real-valued ones as follows: for a map $f:\mathbb{R}\to\mathbb{R}$ with $f(0)=0$ we set $\boldsymbol{f}(x) := f(|x|)\,\frac{x}{|x|}$ for $x\neq 0$ and $\boldsymbol{f}(0) := 0$. (1.6)
A fourth crucial formula, which we will extensively study in the present paper, is related to the primal Monge formulation of Optimal Transport, and clarifies the two main components of HK arising from transport and growth or decay effects. Its main ingredient is the notion of transport-growth pair $(T,q)$ introduced in (1.7). The Monge formulation of HK then looks for the optimal pair $(T,q)$ among the ones transforming $\mu_0$ into $\mu_1$ by $(T,q)_\star\mu_0 = \mu_1$ which minimizes the conical cost $\mathcal{C}(T,q;\mu_0) := \int_{\mathbb{R}^d}\big(1+q^2(x)-2q(x)\cos_{\pi/2}(|T(x)-x|)\big)\,d\mu_0(x)$, (1.8) where $\cos_{\pi/2}(r) := \cos(\min\{r,\pi/2\})$. As for the usual Monge formulation of optimal transport, the existence of an optimal transport-growth pair $(T,q)$ minimizing (1.8) requires more restrictive properties on $\mu_0,\mu_1$, which we will carefully study. It is worth noticing that the integrand in (1.8) has a relevant geometric interpretation as the squared distance $\mathsf{d}^2_{\pi,\mathfrak{C}}$, where $\mathsf{d}_{\pi,\mathfrak{C}}$ is the distance on the cone space $\mathfrak{C}$ over $\mathbb{R}^d$ (cf. (2.5)) between the points $[x,1]$ and $[T(x),q(x)]$. This suggests that HK induces a distance in $\mathcal{M}(\mathbb{R}^d)$ which plays a role similar to that of the $L^2$ Kantorovich-Rubinstein-Wasserstein distance in $\mathcal{P}_2(\mathbb{R}^d)$.
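The conical cost (1.8) can be illustrated with a minimal numerical sketch. It assumes that $\mu_0$ is a finite sum of weighted Dirac masses on the real line ($d=1$), so that the integral reduces to a finite sum. The names `cos_trunc` and `conical_cost` and the sample data are hypothetical and are not taken from the paper.

```python
import math

def cos_trunc(r, a=math.pi / 2):
    """Truncated cosine cos_a(r) = cos(min(r, a))."""
    return math.cos(min(r, a))

def conical_cost(atoms, T, q):
    """Discrete version of (1.8) for mu_0 = sum of m * delta_x in d = 1.

    atoms: list of (x, m) pairs; T, q: callables on the real line.
    """
    return sum(
        m * (1.0 + q(x) ** 2 - 2.0 * q(x) * cos_trunc(abs(T(x) - x)))
        for x, m in atoms
    )

# Illustrative data (an assumption of this sketch).
mu0 = [(0.0, 1.0), (1.0, 2.0)]

# Identity pair (T(x) = x, q = 1): nothing moves or grows, so the cost vanishes.
cost_identity = conical_cost(mu0, lambda x: x, lambda x: 1.0)

# Displacement beyond pi/2: the truncated cosine vanishes (up to rounding) and
# the integrand reduces to 1 + q^2, independently of the distance travelled.
cost_far = conical_cost(mu0, lambda x: x + 3.0, lambda x: 0.5)
```

In the second case the cost no longer depends on the target position, which reflects the role of the truncation at $\pi/2$ in the cost.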
Inspired by the celebrated paper [McC97], we want to study the structure of such minimizers and to characterize integral functionals which are convex along such kind of interpolations.

Improved regularity of potentials and geodesics
In the first part of the paper we will exploit the equivalent formulations of H K in order to obtain new information on the regularity and on the fine structure of the solutions to (1.3), (1.2), and (1.8).
More precisely, we will initially prove in Section 3 that the optimal H K potential ϕ 0 is locally semi-convex outside a closed (d−1)-rectifiable set, so that when µ 0 ≪ L d and µ 1 is concentrated in a neighborhood of supp(µ 0 ) of radius π/2 the Monge formulation (1.8) has a unique solution.
On the contact set (Ξ t ) t∈[0,1] , we can combine the (delicate) first-and second-order superdifferentiability properties of ξ t arising from the inf-convolution structure of (1.13) with the corresponding sub-differentiability properties exhibited byξ t . Using tools from nonsmooth analysis, we are then able to give a rigorous meaning to the characteristic flow associated with (1.12), i.e. to the maps t → T (t, ·) = T s→t (·), t → q(t, ·) = q s→t (·) solving (we omit to write the explicit dependence on x when not needed) Ṫ (t) = ∇ξ t (T (t)), q(t) = 2ξ t (T (t))q(t), in (0, 1), T (s, x) = x, q(s, x) = 1. (1.17) Moreover, we will prove that T s→t is a family of bi-Lipschitz maps on the contact sets obeying a natural concatenation property. As can be expected, the maps T s→t , q s→t provide a precise representation of the geodesics via µ t = (T s→t , q s→t ) ⋆ µ s for all s, t ∈ (0, 1). In particular (T s→t , q s→t ) is an optimal transport-growth pair between µ s and µ t minimizing the cost of (1.8).
Using this valuable information, in Section 5 we obtain various relevant structural properties of geodesics in $(\mathcal{M}(\mathbb{R}^d),\mathsf{HK})$ such as non-branching, localization, and regularization effects. In particular, independently of the regularity of $\mu_0$ and $\mu_1$, we will show that for $s\in(0,1)$ the Monge problem between $\mu_s$ and $\mu_0$ or between $\mu_s$ and $\mu_1$ always admits a unique solution, a property which is well known in the Kantorovich-Wasserstein framework.
Surprisingly enough, despite the lack of global regularity, we will also establish precise formulae for the first and second derivative of the differential of $T_{s\to t}$ (and thus the second order differential of $\xi_t$) along the flow, which coincide with the equations that one obtains by formal differentiation using the joint information of the Hamilton-Jacobi equation (1.12) and (1.17), assuming sufficient regularity. For instance, differentiating in time the first equation of (1.17) and differentiating in space (1.12), one finds an expression for $\ddot{T}(t)$. For $q(t)$, $B(t) := DT_{s\to t}$, and its determinant $\delta(t) := \det B(t)$, similar, just more involved, calculations yield the crucial second order equations (1.18). In our case, even though we do not have enough regularity to justify the above formal computations, we can still derive them rigorously by a deeper analysis using the variational properties of the contact set. Even if our discussion is restricted to the Hellinger-Kantorovich case and uses the particular form of the Hopf-Lax semigroup (1.13) and its characteristics (1.9), we think that our argument applies to more general cases and may provide new interesting estimates also in the typical balanced case of Optimal Transport. Such regularity and the related second order estimates are sufficient to express the Lebesgue density $c_t$ of the measures $\mu_t$ and thus to obtain crucial information on its behavior along the flow. In particular, Corollary 5.5 shows that $c(t,\cdot)$ is given by (1.19), where the time-dependent transport-growth mappings $T_{s\to t}$, $q_{s\to t}$ are given in terms of $\xi$ via (1.17) and the analog of (1.9). Moreover, we will show that if $\mu_s\ll\mathcal{L}^d$ for some $s\in(0,1)$ then $\mu_t\ll\mathcal{L}^d$ for every $t\in(0,1)$; combining (1.18b), (1.18c), and (1.19a), we will also prove that $c_t$ is a convex function along the flow maps $T_{s\to t}$.

Geodesic λ-convexity of functionals
The second part of the paper is devoted to establishing necessary and sufficient conditions for geodesic λ-convexity of energy functionals $\mathcal{E}$ defined for a closed and convex domain $\Omega\subset\mathbb{R}^d$ with non-empty interior in the form (1.20), where $E'_\infty := \lim_{c\to\infty}E(c)/c\in\mathbb{R}\cup\{+\infty\}$ is the recession constant and $E(0)=0$ holds. In [LMS16, Prop. 19] it was shown that the total-mass functional $\mathcal{M}:\mu\mapsto\mu(\mathbb{R}^d)$ has the surprising property that it is exactly quadratic along HK geodesics $\gamma$. Thus, as a first observation we see that a density function $E$ generates a geodesically λ-convex functional $\mathcal{E}$ if and only if $E_0: c\mapsto E(c)-\lambda c$ generates a geodesically convex functional (i.e. geodesically 0-convex). Hence, subsequently we can restrict to $\lambda=0$.
To explain the necessary and sufficient conditions on E for E to be geodesically convex, we first look at the differentiable case, and we define the shorthand notation For the Kantorovich-Wasserstein distance W the necessary and sufficient conditions are the so-called McCann conditions [McC97]: is lower semi-continuous and convex and see also [AGS08,Prop. 9.3.9]. For the Hellinger-Kakutani distance we simply need the condition In the case of differentiable E, our main result yields the following necessary and sufficient conditions for geodesic convexity of E on (M(R d ), H K), see Proposition 6.1, where the matrix B(c) ∈ R 2×2 sym is given by .
We immediately see that the non-negativity of the diagonal element $B_{11}(c)$ gives the first McCann condition in (1.22), and $B_{22}(c)\ge 0$ gives (1.23). However, the condition $B(c)\ge 0$ is strictly stronger, since e.g. it implies that the additional condition $(d+2)\varepsilon_1(c)-2\varepsilon_0(c)\ge 0$ holds, see (6.2). This condition means that $c\mapsto c^{-2/(d+2)}E(c)$ has to be non-decreasing, which will be an important building block for the main geodesic convexity result. Indeed, our main result in Theorem 7.2 is formulated for general lower semi-continuous and convex functions $E:[0,\infty[\,\to\mathbb{R}\cup\{\infty\}$ without differentiability assumptions. The conditions on $E$ can be formulated most conveniently in terms of the auxiliary function $N_E$, see (1.25). The McCann conditions (1.22) are obtained by looking at $N_E(\cdot,\gamma)$ for fixed $\gamma$, while the Hellinger-Kakutani condition (1.23) follows by looking at $s\mapsto N_E(s\rho,s\gamma)$ for fixed $(\rho,\gamma)$. The proof of the sufficiency and necessity of condition (1.25) for geodesic convexity of $\mathcal{E}$ is based on the explicit representation (1.19) of the geodesic curves. By definition, we have $\alpha_s(t,x)\ge 0$, and Corollary 5.5 guarantees $\delta_s(t,x)>0$. Hence, we can introduce two functions $\rho$ and $\gamma$. For smooth $E$ we have smooth $N_E$ and may show convexity of $t\mapsto e(t,x)$ via its second derivative. By convexity of $N_E$, the term involving $D^2N_E$ is non-negative, so it remains to show (1.26). To establish this, we use first that the scaling property $N_E(s^{1+d/2}\rho,s\gamma)=s^2N_E(\rho,\gamma)$ for all $s>0$ (which follows from the definition of $N_E$ via $E$) and the convexity of $N_E$ imply an auxiliary inequality, see Proposition 6.2. Second, we rely on a nontrivial curvature estimate for $(\rho,\gamma)$, namely (1.28). Estimates (1.28) are provided in Proposition 5.7 and strongly rely on the explicit representation and the regularity properties of the geodesics developed in Sections 4 and 5.
As a consequence, we find that the power functionals $\mathcal{E}_m$ with $E_m(c)=c^m$ and $m>1$ are all geodesically convex, see Corollary 7.3. This result was already exploited in [DiC20, Thm. 2.14]. We can also study the discontinuous "Hele-Shaw case" $E(c)=-\lambda c$ for $c\in[0,1]$ and $E(c)=\infty$ for $c>1$. Moreover, in dimensions $d=1$ or $2$ the densities $E_q(c)=-c^q$ with $q\in[\frac{d}{d+2},\frac{1}{2}]$ also lead to geodesically convex functionals $\mathcal{E}_q$, see again Corollary 7.3. Two important differences with the balanced Kantorovich-Wasserstein case are worth noting. First, the Boltzmann logarithmic entropy functional corresponding to $E(c)=c\log c$ is not geodesically λ-convex for any value of λ, see Example 6.5. Second, if the space dimension $d$ is larger than or equal to 3, then there are no geodesically convex power functionals of the form $E(c)=-c^m$ with exponent $m<1$, see Example 6.4. Some of these statements follow easily by observing that $\mu_t=t^2\mu_1$ is the unique geodesic connecting $\mu_0=0$ and $\mu_1$.
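The last observation can be explored numerically. The sketch below assumes, only for illustration, that $\mu_1=c_1\mathcal{L}^d$ has constant density $c_1$ on a set of unit volume, so that along $\mu_t=t^2\mu_1$ the functional reduces to the scalar function $t\mapsto E(t^2c_1)$. The helpers `second_difference` and `along_ray` are hypothetical names. The check tests only the sign of a central second difference along this single geodesic, which is a necessary condition and not a proof of geodesic convexity.

```python
import math

def second_difference(f, t, h=1e-4):
    """Central second difference, approximating f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def along_ray(E, c1=1.0):
    """t -> E(t^2 * c1): energy density along mu_t = t^2 mu_1."""
    return lambda t: E(t * t * c1)

def E_half(c):
    return -c ** 0.5                 # q = 1/2: t -> -t is affine

def E_three_quarters(c):
    return -c ** 0.75                # q = 3/4: t -> -t^(3/2) is concave

def E_boltzmann(c):
    return c * math.log(c) if c > 0 else 0.0

d2_half = second_difference(along_ray(E_half), 0.5)
d2_three_quarters = second_difference(along_ray(E_three_quarters), 0.5)

# For c log c and c1 = 1 one has t -> 2 t^2 log t, whose second derivative
# 4 log t + 6 is unbounded from below as t -> 0.
d2_boltzmann = [
    second_difference(along_ray(E_boltzmann), t, h=t / 10)
    for t in (1e-2, 1e-4, 1e-6)
]
```

Under these assumptions the exponent $q=1/2$ is borderline along this geodesic and $q=3/4$ fails, which is compatible with the upper bound on $q$ stated above; the unbounded second derivative for $c\log c$ is compatible with the failure of λ-convexity for every λ.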

Applications and outlook
In [Fle21, LaM22], the JKO scheme (minimizing movement scheme) for a gradient system $(\mathcal{M}(\Omega),\mathsf{HK}_{\alpha,\beta},\mathcal{E})$ is considered, i.e., for $\tau>0$ one iteratively defines the discrete minimizers and considers the limit $\tau\downarrow 0$ (along subsequences) to obtain generalized minimizing movements (GMM) (cf. [AGS05]). Under suitable conditions, including the assumption $\mathcal{E}(\mu)=\int_\Omega\big(E(c)+cV\big)\,dx$ with $\mu=c\mathcal{L}^d$ and $E$ superlinear, it is shown in [Fle21, Thm. 3.4] that all GMM $\mu$ have the form $\mu(t)=c(t)\mathcal{L}^d$, and the density $c$ is a weak solution of the reaction-diffusion equation (1.29). In [Li06], the equation $u_t=0=\Delta u+au\log u+bu$ is studied, whose solutions are steady states for HK gradient flows for $\mathcal{E}(u)=\int_{\mathbb{R}^d}u\log u\,dx$. We also refer to [PQV14, DiC20], where equation (1.29) was studied for $E(c)=\frac{1}{m}c^m-\lambda c$ and $V\equiv 0$. The linear functional $\Phi(\mu)=\int_{\mathbb{R}^d}V(x)\,d\mu$ for a given potential $V\in C_0(\mathbb{R}^d)$ can easily be added, as its geodesic λ-convexity is characterized in [LMS16, Prop. 20]. Note that our main convexity result, proved here for the first time, plays an important role in the existence and uniqueness results of [DiC20], cf. Thm. 2.14 there.
In [LaM22] it is shown that the GMM for the gradient system $(\mathcal{M}(\Omega),\mathsf{HK}_{\alpha,\beta},\mathcal{E})$ are $\mathrm{EVI}_\lambda$ solutions in the sense of [MuS20]. Again the main ingredient is the geodesic λ-convexity of $\mathcal{E}$ in the form (1.20) contained in our main Theorem 7.2.

Main notation
$\mathcal{M}(X)$, $\mathcal{M}_2(X)$: finite positive Borel measures on $X$ (with finite quadratic moment)
$\mathcal{P}(X)$, $\mathcal{P}_2(X)$: Borel probability measures on $X$ (with finite quadratic moment)
$C_b(X)$: continuous and bounded real functions on $X$
$\cos_a(r)$: truncated function $\cos(\min\{a,r\})$, $a>0$ (typically $a=\pi/2$)
$W_X(\mu_1,\mu_2)$: Kantorovich-Wasserstein distance in $\mathcal{P}_2(X)$
$\boldsymbol{\sin}$, $\boldsymbol{\tan}$, $\boldsymbol{\arctan}$, ...: vector-valued version of the usual scalar functions, see (1.6)
$\mathsf{HK}(\mu_1,\mu_2)$: Hellinger-Kantorovich distance in $\mathcal{M}(X)$, see Section 2
$(\mathfrak{C},\mathsf{d}_{a,\mathfrak{C}})$, $\mathfrak{o}$: metric cone on $\mathbb{R}^d$ and its vertex, see Subsection 2.1.2
$W_{a,\mathfrak{C}}$: $L^2$-Kantorovich-Wasserstein distance on $\mathcal{P}_2(\mathfrak{C})$ induced by $\mathsf{d}_{a,\mathfrak{C}}$
$\mathsf{x}$, $\mathsf{r}$: coordinate maps on $\mathfrak{C}$, see Subsection 2.1.2
$\pi^0$, $\pi^1$: coordinate maps on a Cartesian product $X_0\times X_1$, $\pi^i(x_0,x_1)=x_i$
$\mathsf{h}$: homogeneous projection from $\mathcal{M}_2(\mathfrak{C})$ to $\mathcal{M}(\mathbb{R}^d)$, see (2.8)

In this section, we recall a few properties and equivalent characterizations of the Hellinger-Kantorovich distance from [LMS16, LMS18] that will turn out to be crucial in the following. First, we fix some notation that we will use extensively. Let $(X,\mathsf{d}_X)$ be a complete and separable metric space. In the present paper $X$ will typically be $\mathbb{R}^d$ with the Euclidean distance, a closed convex subset thereof, the cone space $\mathfrak{C}$ on $\mathbb{R}^d$ (see Subsection 2.1.2), product spaces of the latter two, etc. We will denote by $\mathcal{M}(X)$ the space of all nonnegative and finite Borel measures on $X$ endowed with the weak topology induced by the duality with the continuous and bounded functions of $C_b(X)$. The subset of measures with finite quadratic moment will be denoted by $\mathcal{M}_2(X)$. The spaces $\mathcal{P}(X)$ and $\mathcal{P}_2(X)$ are the corresponding subsets of probability measures.
If µ ∈ M(X) and T : X → Y is a Borel map with values in another metric space Y , then T ♯ µ denotes the push-forward measure on M(Y ), defined by T ♯ µ(B) := µ(T −1 (B)) for every Borel set B ⊂ Y . (2.1) We will often denote elements of X × X by (x 0 , x 1 ) and the canonical projections by π i : (x 0 , x 1 ) → x i , i = 0, 1. A coupling on X is a measure γ ∈ M(X×X) with marginals γ i := π i ♯ γ. Given two measures µ 0 , µ 1 ∈ M 2 (X) with equal mass µ 0 (X) = µ 1 (X), their (quadratic) Kantorovich-Wasserstein distance W X is defined by We refer to [AGS08] for a survey on the Kantorovich-Wasserstein distance and related topics.
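For a discrete measure the push-forward (2.1) amounts to summing the masses of all atoms that are mapped to the same point. The following minimal sketch assumes that a measure is stored as a dictionary from points to masses; this representation and the name `push_forward` are hypothetical choices made only for illustration.

```python
from collections import defaultdict

def push_forward(mu, T):
    """Push-forward T#mu of a discrete measure mu = {point: mass}."""
    nu = defaultdict(float)
    for x, m in mu.items():
        nu[T(x)] += m          # the mass sitting at x is moved to T(x)
    return dict(nu)

mu = {-1.0: 0.5, 1.0: 1.5, 2.0: 1.0}
nu = push_forward(mu, abs)     # T(x) = |x| merges the atoms at -1 and 1
```

The total mass is preserved, in line with the fact that the Kantorovich-Wasserstein distance compares measures of equal mass.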

Equivalent formulations of the Hellinger-Kantorovich distance
The Hellinger-Kantorovich distance was introduced in [LMS18, LMS16] and independently in [KMV16] and [CP*18, CP*15]. It is a generalization of the Kantorovich-Wasserstein distance to arbitrary non-negative and finite measures by taking creation and annihilation of mass into account. Indeed, the latter can be associated with a different notion of distance, namely the Hellinger-Kakutani distance, see [Hel09] and [Sch97]. Five different equivalent formulations of the Hellinger-Kantorovich distance are available: (i) the dynamical formulation, (ii) the cone space formulation, (iii) the optimal entropy-transport problem, (iv) the dual formulation in terms of Hellinger-Kantorovich potentials, and (v) the formulation using Hamilton-Jacobi equations. We will present and briefly discuss each of them below, as all are useful for our analysis of geodesic convexity.
In the following, we consider the Hellinger-Kantorovich distance for measures on the domain R d . However, it is easy to see that all arguments also work in the case of a closed and convex domain Ω ⊂ R d . In particular, the latter is a complete, geodesic space.

Dynamic approach
A first approach to the Hellinger-Kantorovich distance is related to the dynamic formulation, which naturally depends on two positive parameters α, β > 0: they control the relative strength of the Kantorovich-Wasserstein part and of the Hellinger-Kakutani part (see [LMS18, Section 8.5]).
Definition 2.1 (The dynamic formulation) For every µ 0 , µ 1 ∈ M(R d ) we set where the generalized continuity equation for the Borel vector and scalar fields Υ : (0, 1) × R d → R d and ξ : (0, 1) × R d → R reads Notice that (2.3) yields in particular that µ Υ and ξµ are (vector and scalar) measures with finite total mass, so that the canonical formulation of (gCE) in D ′ ((0, 1)×R d ) makes sense. For optimal solutions one has Υ(t, x) = ∇ξ(t, x) and the dual potential solves the generalized Hamilton-Jacobi equation in a suitable sense [LMS18, Theorem 8.20]. A simple rescaling technique shows that it is sufficient to restrict ourselves to a specific choice of the parameters α and β. In fact, it is easy to see that for every θ > 0 we have Moreover, if λ > 0 and we consider the spatial dilation H : x → λx in R d , we find Choosing λ := 4α/β, θ = 4/β, and setting H K := H K 1,4 we get Therefore, in order to keep simpler notation, in the remaining paper we will mainly consider the case α = 1 and β = 4.

Cone space formulation
There is a second characterization that connects HK with the classic Kantorovich-Wasserstein distance on the extended cone $\mathfrak{C} := (\mathbb{R}^d\times[0,\infty))/\sim$, where $\sim$ is the equivalence relation which identifies all the points $(x,0)$ with the vertex $\mathfrak{o}$ of $\mathfrak{C}$. More precisely, we write $(x_0,r_0)\sim(x_1,r_1)$ if and only if $x_0=x_1$ and $r_0=r_1$, or $r_0=r_1=0$, and introduce the notation $[x,r]$ to denote the equivalence class associated with $(x,r)$. The cone $\mathfrak{C}$ is a complete metric space endowed with the cone distances $\mathsf{d}^2_{a,\mathfrak{C}}([x_0,r_0],[x_1,r_1]) := r_0^2+r_1^2-2r_0r_1\cos_a(|x_0-x_1|)$. (2.5) Notice that the projection map $(x,r)\mapsto[x,r]$ is bijective from $\mathbb{R}^d\times(0,\infty)$ to $\mathfrak{C}^* := \mathfrak{C}\setminus\{\mathfrak{o}\}$; we will denote by $(\mathsf{x},\mathsf{r})$ its inverse, which we extend to $\mathfrak{o}$ by a fixed convention. The most natural choice of the parameter $a$ in (2.5) is $a:=\pi$: in this case the cone $(\mathfrak{C},\mathsf{d}_{\pi,\mathfrak{C}})$ is a geodesic space, i.e., given $z_i=[x_i,r_i]$, $i=0,1$, there exists a curve $z_t=[x_t,r_t]=\mathrm{geo}_t(z_0,z_1)$, $t\in[0,1]$, connecting $z_0$ to $z_1$ and satisfying $\mathsf{d}_{\pi,\mathfrak{C}}(z_s,z_t)=|t-s|\,\mathsf{d}_{\pi,\mathfrak{C}}(z_0,z_1)$ for all $0\le s,t\le 1$. (2.6) If one of the two points coincides with $\mathfrak{o}$, e.g. for $z_0=\mathfrak{o}$, it is immediate to check that $z_t=[x_1,tr_1]$. If $r_0,r_1>0$ and $|x_1-x_0|<\pi/2$ then the unique geodesic curve reads (recall the convention in (1.6)) $r_t := r_0\big((1+tu)^2+t^2|v|^2\big)^{1/2}$, $x_t := x_0+\boldsymbol{\arctan}\big(\frac{tv}{1+tu}\big)$, where $u := \frac{r_1}{r_0}\cos(|x_1-x_0|)-1$ and $v := \frac{r_1}{r_0}\boldsymbol{\sin}(x_1-x_0)$. (2.7) For example, if we perform the same construction starting from the one-dimensional set $\Omega=[0,L]\subset\mathbb{R}$ with $0<L\le\pi$, we can isometrically identify the cone space over $\Omega$ with the two-dimensional sector $\Sigma_\Omega=\{y=(r\cos x,r\sin x)\in\mathbb{R}^2 : r\ge 0,\ x\in[0,L]\}$ endowed with the Euclidean distance. For $L\in\,]\pi,2\pi[$ the identification with the sector still holds, but the sector $\Sigma_\Omega$ is no longer convex, and for $x_0,x_1\in\Omega$ with $|x_0-x_1|\ge\pi$ the cone distance corresponds to the geodesic distance on the sector $\Sigma_\Omega$, i.e. the length of the shortest path in $\Sigma_\Omega$ connecting two points.
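The explicit geodesic (2.7) and the constant-speed property (2.6) can be checked numerically in dimension $d=1$. The sketch below assumes the squared cone distance $r_0^2+r_1^2-2r_0r_1\cos_a(|x_0-x_1|)$, which is consistent with the integrand of (1.8); the function names and the sample points are hypothetical.

```python
import math

def cone_dist(z0, z1, a=math.pi):
    """Cone distance between z_i = (x_i, r_i) in d = 1 (assumed form of (2.5))."""
    (x0, r0), (x1, r1) = z0, z1
    c = math.cos(min(abs(x1 - x0), a))
    return math.sqrt(max(r0 * r0 + r1 * r1 - 2.0 * r0 * r1 * c, 0.0))

def cone_geodesic(z0, z1, t):
    """Point z_t from (2.7); requires r0, r1 > 0 and |x1 - x0| < pi/2."""
    (x0, r0), (x1, r1) = z0, z1
    u = (r1 / r0) * math.cos(abs(x1 - x0)) - 1.0
    v = (r1 / r0) * math.sin(x1 - x0)   # in d = 1 the bold sine is the scalar sine
    r_t = r0 * math.sqrt((1.0 + t * u) ** 2 + (t * v) ** 2)
    x_t = x0 + math.atan(t * v / (1.0 + t * u))
    return (x_t, r_t)

# Illustrative endpoints with |x1 - x0| = 1.1 < pi/2.
z0, z1 = (0.2, 1.0), (1.3, 2.5)
total = cone_dist(z0, z1)
s, t = 0.25, 0.8

# Deviation from the constant-speed identity (2.6); expected to be rounding error.
gap = cone_dist(cone_geodesic(z0, z1, s), cone_geodesic(z0, z1, t)) - abs(t - s) * total
end_point = cone_geodesic(z0, z1, 1.0)
```

In the sector picture described above, the curve is the straight segment between the two points of $\Sigma_\Omega$, written in polar coordinates.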
On the one hand, we can define a homogeneous projection h : r 2 λ(·, dr), (2.8) i.e. for every λ ∈ M 2 (C) and On the other hand, measures in M(R d ) can be "lifted" to measures in M 2 (C), e.g. by considering the measure µ ⊗ δ 1 for µ ∈ M(R d ). More generally, for every Borel map r : R d → ]0, ∞[ and constant m 0 ≥ 0, the measure λ = m 0 δ o + µ ⊗ 1 r(·) 2 δ r(·) gives hλ = µ. Now, the cone space formulation of the Hellinger-Kantorovich distance between two measures µ 0 , µ 1 ∈ M(R d ) is given as follows, see [LMS16,Sec. 3].
The cone space formulation is reminiscent of classical optimal transport problems. Here, however, the marginals λ i of the transport plan λ ∈ M(C × C) are not fixed, and it is part of the problem to find an optimal pair of measures λ i satisfying the constraints hλ i = µ i and having minimal Kantorovich-Wasserstein distance on the cone space. The cone space formulation in (2.9) reveals many interesting geometric properties of the Hellinger-Kantorovich distance, e.g. Hellinger-Kantorovich geodesics are directly connected to geodesic curves in the cone space C, see below. Moreover, it can be deduced that a sharp threshold exists, which distinguishes between transport of mass and pure growth (i.e. creation or destruction) of mass.
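The homogeneous projection and the lifting of measures to the cone described above can be made concrete for discrete measures. In this minimal sketch a measure on the cone is stored as a dictionary mapping pairs $(x,r)$ to masses; the representation and the names `homogeneous_projection` and `lift` are hypothetical.

```python
from collections import defaultdict

def homogeneous_projection(lam):
    """h(lam) for a discrete measure lam = {(x, r): mass} on the cone.

    The mass sitting at (x, r) contributes r^2 * mass to the point x,
    so atoms at the vertex (r = 0) do not contribute.
    """
    mu = defaultdict(float)
    for (x, r), m in lam.items():
        mu[x] += r * r * m
    return dict(mu)

def lift(mu, radius):
    """Lift mu = {x: mass} to the cone: mass / r(x)^2 is placed at (x, r(x))."""
    return {(x, radius(x)): m / radius(x) ** 2 for x, m in mu.items()}

mu = {0.0: 3.0, 1.0: 0.5}
lam = lift(mu, radius=lambda x: 2.0 + x)   # any positive map r can be used
back = homogeneous_projection(lam)
```

Different choices of the radius map give different lifts with the same projection, which illustrates why the marginals on the cone are part of the optimization in the cone space formulation.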
The explicit computation of the previous remark is in fact a particular case of a general result [LMS18, Lem. 7.9+7.19].
Theorem 2.5 (Effective π/2-threshold in the cone distance) Let µ 0 , µ 1 ∈ M(R d ), if λ ∈ M 2 (C×C) is an optimal plan for the cone-space formulation (2.9) then λ (C×C) \ {(o, o)} is still optimal and Moreover, setting for i = 0, 1 (2.12) (see Figure 2.1) with the related decomposition then we have that Note that (2.14a) shows that the decomposition in (2.13) is extremal with respect to the subadditivity property in Lemma 7.8 of [LMS18], and (2.14b) shows that the computation of H K 2 between µ ′′ 0 and µ ′′ 1 is trivial, so that no information is lost if one restricts the evaluation of H K 2 to µ ′ 0 = µ 0 S ′ 0 and µ ′ 1 = µ 1 S ′ 1 . Motivated by the above properties, we introduce the following definition of reduced pairs, which will play a crucial role in our analysis of geodesic curves.
By definition the sets $S_i=\operatorname{supp}(\mu_i)$ are closed and $S_i^{\pi/2}$ are open, so that $S_i''=S_i\setminus S_{1-i}^{\pi/2}$ is closed as well, but $S_i'=S_i\cap S_{1-i}^{\pi/2}$ may be neither closed nor open. In the strongly reduced case the condition $S_i\subset S_{1-i}^{\pi/2}$ means that, at least locally, the closed set $S_i$ has a positive distance to the boundary of the open set $S_{1-i}^{\pi/2}$. Notice that for every $(\mu_0,\mu_1)\in\mathcal{M}(\mathbb{R}^d)^2$ the corresponding pair $(\mu_0',\mu_1')$ defined according to (2.12)-(2.13) is reduced by construction. In fact, if $x\in S_0'$ then there exists $y\in\operatorname{supp}(\mu_1)$ with $|x-y|<\pi/2$: clearly $y\in S_1'$ so that $\operatorname{dist}(x,\operatorname{supp}(\mu_1'))\le\operatorname{dist}(x,S_1')<\pi/2$.

Figure 2.1: the $\pi/2$-neighborhoods of the supports $S_0$ and $S_1$ and the corresponding restrictions of the measures $\mu_i$.
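For measures with finite support on the real line the splitting into $S_i'$ and $S_i''$ can be computed directly. The sketch below assumes that the supports are given as finite lists of points; `split_support` and the sample supports are hypothetical.

```python
import math

def split_support(S_i, S_other, radius=math.pi / 2):
    """Split a finite support S_i into S'_i and S''_i.

    S'_i collects the points at distance < radius from S_other,
    S''_i collects the remaining points.
    """
    near = [x for x in S_i if any(abs(x - y) < radius for y in S_other)]
    far = [x for x in S_i if x not in near]
    return near, far

S0 = [0.0, 1.0, 5.0]
S1 = [1.2, 9.0]
S0_prime, S0_second = split_support(S0, S1)
S1_prime, S1_second = split_support(S1, S0)

# Reduced-pair check: every point of S'_0 is closer than pi/2 to S'_1.
reduced = all(min(abs(x - y) for y in S1_prime) < math.pi / 2 for x in S0_prime)
```

Here the atoms at 5.0 and 9.0 lie farther than $\pi/2$ from the other support, so they belong to $S_0''$ and $S_1''$ and take no part in the transport.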

Transport-growth maps
It is useful to express (2.11b) in an equivalent way, which extends the notion of transport maps to the unbalanced case. It relies on special families of plans in λ ∈ M 2 (C 2 ) with h i λ = µ i generated by transport-growth systems.
It acts on ν according to this rule: where the last identity involves the obvious generalization of the definition (2.8) of homogeneous projection h from M 2 (X × [0, ∞)) to M(X).
(2.16) Transport-growth maps provide useful upper bounds for the H K metric, playing a similar role of transport maps for the Kantorovich-Wasserstein distance. In fact, for every choice of maps In order to show (2.17) it is sufficient to check that the measure λ ∈ M 2 (C 2 ) defined by so that (2.17) follows from (2.11b) and the identity On the other hand, choosing Y = C×C and an optimal plan ν = λ ∈ M 2 (C×C) for (2.11b) and setting and therefore equality holds in (2.17). (2.20) Inspired by the so-called Monge formulation of Optimal Transport, it is natural to look for similar improvement of (2.20), when and (2.13)), find an optimal transport-growth pair (T , q) : among all the transport-growth maps satisfying (T , q) ⋆ µ 0 = µ 1 By (2.17) we have the bound When µ 0 ≪ L d and the support of µ 1 is contained in the closed neighborhood of radius π/2 of the support of µ 0 , the results of the next section (cf. Corollary 3.5), which are a consequence of the optimality conditions in Theorem 2.14, show that the minimum of Problem 2.9 is attained and realizes the equality in (2.22).

Entropy-transport problem
A third point of view, typical of optimal transport problems, characterizes the Hellinger-Kantorovich distance via the static Logarithmic Entropy Transport (LET) variational formulation.
We define the logarithmic entropy density F : L 1 (x) := 1 2 ℓ(|x|), ℓ(r) := − log(cos 2 (r)) = log 1+ tan 2 (r) for r < π/2, +∞ otherwise. (2.23) As usual, we set ET(η; µ 0 , µ 1 ) := +∞ if one of the marginals (π i ) ♯ η of η is not absolutely continuous with respect to µ i . With this definition, the equivalent formulation of the Hellinger-Kantorovich distance as entropy-transport problem reads as follows. (2.25) Moreover, recalling the decomposition (2.12)-(2.13), (1) the pairs (µ 0 , µ 1 ) and (µ ′ 0 , µ ′ 1 ) share the same optimal plans η An optimal transport plan η, which always exists, gives the effective transport of mass. Note, in particular, that the finiteness of ET only requires (π i ) ♯ η = η i ≪ µ i (which is considerably weaker than the usual transport constraint (π i ) ♯ η = µ i ) and the cost of a deviation of η i from µ i is given by the entropy functionals associated with F . Moreover, the cost function ℓ is finite in the case |x 0 −x 1 | < π/2, which highlights the sharp threshold between transport and pure creation/destruction. Notice that we could equivalently use the truncated function cos 2 π/2 (r) = cos 2 (min{r, π/2}) instead of cos 2 (r) in (2.23). As we have already seen, the function r → cos 2 π/2 (r) plays an important role in many formulae. In general, optimal entropy-transport plans η ∈ M(R d ×R d ) are not unique. However, due to the strict convexity of F , their marginals η i are unique so that the non-uniqueness of the plan η is solely a property of the optimal transport problem associated with the cost function ( Remark 2.11 Besides (2.26), the connection between the cone-space formation and the logarithmic entropy-transport problem is given by the homogeneous marginal perspective function, namely where r 2 i plays the role of the reverse densities 1/σ i and θ is a scaling parameter, see [LMS18, Sec. 5].
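The cost function in (2.23) is easy to evaluate numerically. The following sketch, restricted to $d=1$, compares the two expressions for $\ell$ and checks by a finite difference that the derivative of $L_1$ agrees with the tangent, in line with the identity $DL_1(z)=\boldsymbol{\tan}(z)$ from the introduction. The function names and the sample value of $r$ are hypothetical.

```python
import math

def ell(r):
    """ell(r) = -log(cos^2 r) for 0 <= r < pi/2, +infinity otherwise."""
    if r >= math.pi / 2:
        return math.inf
    return -math.log(math.cos(r) ** 2)

def L1(x):
    """L_1(x) = ell(|x|) / 2 in dimension d = 1."""
    return 0.5 * ell(abs(x))

r, h = 0.7, 1e-6
alt = math.log(1.0 + math.tan(r) ** 2)          # second expression in (2.23)
slope = (L1(r + h) - L1(r - h)) / (2.0 * h)     # central difference of L_1 at r
```

The value $+\infty$ for $r\ge\pi/2$ encodes the threshold beyond which no mass is transported.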
We highlight that the logarithmic entropy-transport formulation (2.25) can be easily generalized by considering convex and lower semi-continuous functions F 0 and F 1 and cost functions ℓ, see [LMS18, Part I].
Applying the previous Theorem 2.10 we can refine formula (2.18) by providing an optimal pair of transport-growth maps solving (2.20) in the restricted set Y = S 0 × S 1 ⊂ R d ×R d . Indeed, we can choose arbitrary pointsx i ∈ S i and (2.28)

Dual formulation with Hellinger-Kantorovich potentials
In analogy to the Kantorovich-Wasserstein distance, we can give a dual formulation in terms of Hellinger-Kantorovich potentials. We slightly modify the notation of [LMS18], in order to be more consistent with the approach by the Hamilton-Jacobi equations (and the related Hopf-Lax solutions) of Section 4 and to deal with rescaled distances. As we will study segments of constant-speed geodesics t → µ t of length τ = t−s for 0 ≤ s < t ≤ 1, it will be convenient to introduce a scaling parameter τ > 0 that in certain parts will be replaced by 1, namely if we consider a whole geodesic. With this parameter, we set and the corresponding It is clear that minimizers η of (2.30) are independent of the coefficient 1 2τ in front of H K and coincide with solutions to (2.25) if µ τ = µ 1 . The role of τ just affects the rescaling of the potentials ϕ and ξ we will introduce below.
We also introduce the Legendre transform of F τ and their inverseš , +∞] and ξ ∈ [−∞, 1 2τ ] respectively, with the obvious convention induced by (2.32). With Theorem 6.3 in [LMS18] (see also Section 4 therein), we have the equivalent characterization of H K via the dual formulation Note that the formulations in (2.34a) and (2.34b) are connected by the transformation ξ 1 = G τ (ϕ 1 ), ξ 0 =Ǧ τ (ϕ 0 ) and the last condition in (2.34b) is equivalent to It is not difficult to check that one can also consider Borel functions in (2.34a) and (2.34b), e.g. for all Borel functions ϕ i : If we allow extended valued Borel functions, the supremum in (2.34a) and (2.34b) are attained.
Theorem 2.12 (Existence of optimal dual pairs) For all µ 0 , µ 1 ∈ M(R d ) and τ > 0 there exists an optimal pair of Borel potentials ϕ 0 , ϕ 1 : R d → [−∞, +∞] which is admissible according to (2.36) and realizes equality in (2.37), namely Remark 2.13 Denoting by S i := supp(µ i ) the support of µ i for i = 0 and 1, we remark that it is always sufficient to find Borel potentials ϕ i : we obtain a pair still satisfying (2.36) and (2.38). This freedom will be useful in Theorem 2.14 below.
Moreover, notice that (2.34b) can be rewritten as where P τ ξ is defined in (1.13). In particular, the operator P τ is directly connected to the dynamical formulation in (2.3), and we will thoroughly study its properties in Section 4.

First order optimality for H K
From the above discussion, we have already seen that there is never any transport over distances larger than π/2. This transport bound will also be seen in the following optimality conditions for the marginal densities σ i defined in (2.24).
The following holds: In particular, the marginals η i are unique and the densities σ i are unique µ ′ i -a.e.
(2) If η is optimal and S i , S ′ i , S ′′ i and σ i are defined as above, the pairs of potentials defined by (2.43) (2.44) are optimal in the respective dual relaxed characterizations of Theorem 2.12 and satisfy
(3) Conversely, if η is optimal and (ϕ 0 , ϕ 1 ) (resp. (ξ 0 , ξ 1 )) is an optimal pair according to Theorem 2.12, then (2.45a) (resp. (2.45b)) holds η-a.e. and (2.46)

Regularity of static H K potentials ϕ 0 and ϕ 1
In this section, we will carefully study the regularity of a pair (ϕ 0 , ϕ 1 ) of optimal H K potentials arising in (2.43) of Theorem 2.14. We will improve the previous approximate differentiability result of [LMS18, Thm. 6.6(iii)] (see also [AGS08, Thm. 6.2.7]) by adapting the argument of [FiG11] and extending the classical result of [GaM96] to the H K setting. In fact, this section is largely independent of the specific H K setting but relies purely on the theory of L-transforms. As we are interested in the special case of continuous, extended-valued cost functions L = L τ = 1 τ L 1 : R d → [0, +∞] which attain the value +∞ outside a ball, we cannot rely on existing results and have to provide a careful analysis of this case (but see also [GaO07,McP09,JiS12,BeP13,BPP18] for different situations of discontinuous costs taking the value +∞).
We will use the notion of locally semi-concave and semi-convex functions; recall that a function ϕ : A function ϕ is locally semi-convex if −ϕ is locally semi-concave. Let us recall that locally semi-concave functions are locally Lipschitz and thus differentiable almost everywhere. We will denote by dom(∇ϕ) the domain of their differential. By Alexandrov's Theorem (see [AGS08, Thm. 5.5.4]), there exists for almost every x ∈ dom(∇ϕ) a symmetric matrix We will denote by dom(D 2 ϕ) the subset of density points in dom(∇ϕ) where (3.2a) and (3.2b) hold.
As the optimality of potential pairs (ϕ 0 , ϕ 1 ) is closely related to the theory of L-transforms, we give the basic definitions first and then derive the associated regularity properties under additional smoothness assumptions.
For simplicity, we restrict the analysis of the remaining text to continuous functions We define the forward L-transform ϕ L→ 0 of a l.s.c. function ϕ 0 and the backward L-transform ϕ L 1 of an u.s.c. function ϕ 1 via where the restriction of the infimum and supremum in (3.3) to the balls B R (x i ), corresponding to the shifted proper domain of L, is important to avoid the expression "∞−∞". It will turn out that ϕ L→ 0 is u.s.c. and ϕ L 1 is l.s.c. Of course, these transformations are related by and for arbitrary functions For later use, we consider the following elementary example.
Example 3.1 (Forward and backward L-transform) We consider the potentials For a 0 = −∞ and a 1 = +∞, we obtain the transforms For R > 0 and sets S ⊂ R d , we introduce the notation sphere condition of radius R at every point of its boundary (e.g. if S is convex) then ext R (S) coincides with R d \ S and bdry R (S) = ∂S.
In general, bdry R (S) is a subset of the boundary of S, consisting precisely of all points of ∂S that satisfy an exterior sphere condition of radius R with respect to S: (3.7) In fact, if x ∈ bdry R (S) then there exist sequences x n , y n such that x n → x, |x n −y n | < R and B R (y n ) ∩ S = ∅. Possibly extracting a subsequence, we can assume that y n → y, B R (y) ∩ S = ∅, and |x−y| ≤ R. Since x ∈ ∂S, it is not possible that |x−y| < R, so that the left-to-right implication of (3.7) holds. On the other hand, if x ∈ ∂S, |x−y| = R, and B R (y) ∩ S = ∅, it is immediate to check that x ∈ ∂(ext R (S)), see also Figure 3. In Theorem 3.3(2) we will use that for arbitrary sets S the boundary part bdry R (S) is countably (d−1)-rectifiable, see [Vil09, Th. 10.48(ii)], and hence has L d measure 0.
The following result shows how the properties of L provide regularity of the backward transform ϕ L . Of course, an analogous statement holds for the forward transform using (3.4). The important fact is that the upper bounds on the second derivatives of L generate semi-convexity of ϕ 0 (i.e. lower bounds on D 2 ϕ 0 ), see Assertions 5 and 6. As D 2 L(z) blows up at the boundary of B R (0), it is essential to use the fact that L(z k ) → +∞ for |z k | ↑ R.
(2) The set Q 0 satisfies an external sphere condition of radius R, namely so that the topological boundary of Q 0 is countably (d−1)-rectifiable.
(3) The "contact set" is closed. (3.14) (5) The restriction of ϕ 0 to the open set Ω 0 is locally semi-convex, and in particular locally Lipschitz and thus continuous.
and satisfies the following properties: Proof. We divide the proof in various steps, corresponding to each assertion.
, and lower semi-continuity is shown. The estimates in (3.10) are elementary, following from L(0) = 0 and L(z) ≥ 0, respectively. The relation in (3.11) follows from the fact that ϕ 0 ( where we used L ≥ 0 and the upper semicontinuity of ϕ 1 . However, using dom(L) = B R (0) we obtain
Assertion (3). The closedness of M ±∞ follows easily by the semi-continuities of ϕ i . For M fin we consider a sequence (x 0,n , by the semi-continuities. As the opposite inequality is always satisfied, we obtain the equality. We can also exclude that ϕ 0 (
Assertion (4). Let us first show that ϕ 0 is locally bounded from above in the interior of Q 0 , i.e. the open set Q 0 \ ∂Q 0 . In fact, if a sequence x n is converging to x̄ ∈ Q 0 \ ∂Q 0 with ϕ 0 (x n ) ↑ +∞, by arguing as before and using ϕ 0 ( which contradicts the fact that ϕ 0 (x) < +∞ in a neighborhood of x̄, because of |x−ȳ| ≤ R.
We now fix a compact subset K of the open set Ω 0 , a point x̄ ∈ K, and consider the section M 0→1 [x] of the contact set M fin . Let η > 0 be sufficiently small so that By the definition of ϕ 0 = ϕ L 1 , for every ε ∈ (0, 1] the sets are non-empty. We choose y ∈ M 1 (x) and set x ϑ := ϑx + (1−ϑ)y with ϑ = 1 −η/R, which implies |x ϑ −x| ≤ η, and hence x ϑ ∈ K η . Moreover, we have |x ϑ −y| ≤ R − η. Therefore, where ℓ(ϱ) := sup z∈Bϱ(0) L(z). Combining the last two estimates we additionally find Hence, all elements y ∈ M 1 (x) satisfy |x−y| ≤ θ and (3.20). We now consider a sequence y ε ∈ M ε (x) ⊂ M 1 (x); a standard compactness argument and the upper semi-continuity of ϕ 1 show that any limit point ȳ is an element of M 0→1 [x], which is therefore not empty. The compactness of M 0→1 [x] and (3.14) again follow by (3.21)
Assertion (5). Let us now fix x̄ 0 ∈ Ω 0 and δ > 0 such that K := B δ (x 0 ) ⊂ Ω 0 . The previous assertion yields θ < R and a ′ , a ′′ ∈ R such that |x ′ −x| ≤ θ and a ′ ≤ ϕ 1 (x ′ ) ≤ a ′′ whenever x ∈ K and x ′ ∈ M 0→1 [x]. By possibly reducing δ, we can also assume that 3δ + θ < R. For every x ∈ K, we now have by construction which is bounded and semi-convex in K because it is a supremum over a family of uniformly semi-convex functions, where we use
Assertion (6). This assertion follows in the standard way by using the extremality conditions in the contact set, see e.g. [AGS08, Thm. 6.2.4 and 6.2.7]. We give the main argument to show how the assumptions in (3.8) enter. By Alexandrov's theorem and Assertion (5) the set D ′′ 0 has full Lebesgue measure. To obtain the optimality conditions, we fix x 0 ∈ Q 0 ∩ D ′′ 0 and know from (3.22) that there exists x̄ 1 such that ϕ 0 (x 0 ) = ϕ 1 (x 1 ) − L(x 1 −x 0 ). However, for all x ∈ B δ (x 0 ) we have ϕ 0 (x) + L(x 1 −x) ≥ ϕ 1 (x 1 ) with equality for x = x 0 . Thus, we obtain the optimality conditions sym . This result gives the conditions (a) to (c), if we observe that x̄ 1 is unique. But this property follows from the first optimality condition by using (3.8c), which allows us to write x̄ 1 in terms of x 0 , i.e. x̄ 1 is uniquely determined by x 0 . Moreover, DT 0→1 (x 0 ) exists and satisfies D 2 ϕ 0 (
The previous result can now be applied to the solution of the LET problem in Theorem 2.10 using L = L 1 ; thus in this case R = π/2. Using the notations for i , and D ′′ i defined for an optimal pair (ϕ 0 , ϕ 1 ) as in Theorem 3.2. So far we constructed optimal pairs (ϕ 0 , ϕ 1 ) satisfying However, following [Vil09, Ch. 5], we will show that it is possible to restrict to "tight optimal pairs" satisfying ϕ 0 = ϕ L 1 1 and ϕ 1 = ϕ L 1 → 0 , which implies that ϕ 0 is l.s.c. and ϕ 1 is u.s.c. This possibility leads to the following refinement of the results in [LMS18, Thm. 6.6(iii)].
(1) There exists an optimal pair of potentials ϕ 0 , ϕ 1 : R d → [−∞, +∞] with ϕ 0 being l.s.c. and ϕ 1 u.s.c., solving the dual problem of Theorem 2.12 and where the sets O i and Q i are as in (3.9).
(2) If η is an optimal solution of the LET problem (2.25), the functions σ 0 := e 2ϕ 0 and σ 1 := e −2ϕ 1 provide lower semi-continuous representatives of the densities of the marginals η i = π i ♯ η with respect to µ i , i.e., η i = σ i µ i , and η is concentrated on the contact set M fin so that supp(η) ⊂ M (see Theorem 3.2). The marginals η i are concentrated on the open sets O i .
Conversely, if η satisfies supp( η) ⊂ M and η i = σ i µ i , then η is an optimal solution of the LET problem (2.25).
Assertion (3). We just consider the case of µ 0 , since the argument for µ ′ 0 is completely analogous and eventually uses the fact that Ω 0 = O 0 ∩ int(Q 0 ) and µ ′ 0 is also concentrated on O 0 by (3.25).
If µ 0 (bdry π/2 (S 0 )) = 0, we also obtain µ 0 (∂Q 0 ) = 0 via the following arguments: By (3.25) we have S 0 ⊂ Q 0 , which implies that a point x ∈ ∂S 0 ∩ bdry π/2 (Q 0 ) also lies in bdry π/2 (S 0 ). Using ∂Q 0 = bdry π/2 (Q 0 ) we obtain ∂S 0 ∩ ∂Q 0 ⊂ bdry π/2 (S 0 ) and find Thus, we have shown that µ 0 is concentrated on int(Q 0 ).
Assertion (4). If µ ′ 0 ≪ L d then µ ′ 0 is concentrated on Ω 0 by Assertion (3) and µ 0 (Ω 0 \ D ′ 0 ) = 0 by Theorem 3.2(6). By Assertion (2), we know that the first marginal η 0 of η is given by which is the graph of the map T 0→1 given by Theorem 3.2(6).
Assertion (5). Let us first recall that for i = 0, 1 the marginal η i of η and the measure µ ′ i are mutually absolutely continuous. Since µ ′ i ≪ L d we know by Theorem 3.2(6) and Assertion (3) that . We can apply Theorem 3.2(6), inverting the order of the pair (ϕ 0 , ϕ 1 ) and obtaining that for every x 1 ∈ D ′ 1 there is a unique element x 0 ∈ R d in the section M 1→0 (x 1 ), i.e. such that (x 0 , x 1 ) ∈ M fin . This result precisely shows that the restriction of
It is important to realize that the tightness condition (3.24) is strictly stronger than the optimality conditions (3.23). However, even for tight optimal pairs there is some freedom outside the supports of the measures µ 0 and µ 1 , as is seen in the following simple case.

We obtain ϕ
(3) With the notation of Theorem 3.2 we have O The following corollary shows that in the case of an absolutely continuous reduced pair (µ 0 , µ 1 ) the density of µ 1 can be written in terms of the optimal pair (σ 0 , σ 1 ), the transport map T , and the density of µ 0 , and vice versa.

Dynamic duality and regularity properties of the Hamilton-Jacobi equation
In the previous section, we studied the regularity properties of the optimal H K pairs (ϕ 0 , ϕ 1 ), which can be understood via the static formulations of H K since only the measures µ 0 and µ 1 are involved. Now, we consider the dual potentials ξ t (x) = ξ(t, x) along geodesics (µ t ) t∈[0,1] . At this stage, Section 4 is completely independent of Section 3. Only in Section 5 will we combine the two sets of results to derive the finer regularity properties of the geodesics µ t .
In [LMS18, Sect. 8.4], it is shown that the optimal dual potentials ξ in the dynamic formulation in (2.3) (but now for α = 1 and β = 4) are subsolutions to a suitable Hamilton-Jacobi equation, namely (4.1) Theorem 8.11 in [LMS18] shows that the maximal subsolutions of the generalized Hamilton-Jacobi equation (2.4) for t ∈ (0, τ ) are given by the following generalized Hopf-Lax formula where ξ 0 ∈ C 1 (R d ) is fixed and such that inf R d ξ 0 (·) > −1/(2τ), compare with (1.13). In the spirit of the previous section, it is possible to derive some semi-concavity properties of ξ t from this formula. However, these are not enough as we need more precise second order differentiability. To obtain the latter, we use the fact that a geodesic curve is not oriented, meaning that t → µ 1−t is still a geodesic, or in other words that t → ξ 1−t also has to solve a Hamilton-Jacobi equation. Thus, our strategy will be the following: For an optimal pair (ξ 0 , ξ̄ 1 ) in (2.34b), we construct a forward solution ξ t starting from ξ 0 and a backward solution ξ̄ t starting from ξ̄ 1 via ξ t = P t ξ 0 for t ∈ (0, 1] and ξ̄ t = R 1−t ξ̄ 1 := −P 1−t (−ξ̄ 1 ) for t ∈ [0, 1). (4.3) In Section 5, optimality will be used to guarantee that ξ t and ξ̄ t are essentially the same so that semi-concavity of ξ t and semi-convexity of ξ̄ t provide the desired smoothness.
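The interplay between forward and backward solutions can be explored numerically. The following Python sketch is purely illustrative and rests on assumptions that are not confirmed by the text above: since formula (4.2) is not reproduced here, the code assumes that the Hopf-Lax operator has the form (P_t ξ)(x) = inf_y (1/(2t)) (1 − cos²(min(|x−y|, π/2)) / (1 + 2t ξ(y))), and it replaces R^d by a uniform grid on an interval. The names hopf_lax, backward, and the sample potential xi0 are hypothetical. Under these assumptions it checks the one-sided relation R_t P_t ξ_0 ≤ ξ_0 on the grid.

```python
import math

def hopf_lax(xi, xs, t):
    """Discrete sketch of P_t xi on the grid xs (ASSUMED form of the operator):
    (P_t xi)(x) = min_y (1/(2t)) * (1 - cos^2(min(|x-y|, pi/2)) / (1 + 2 t xi(y)))."""
    out = []
    for x in xs:
        best = math.inf
        for y, v in zip(xs, xi):
            c = math.cos(min(abs(x - y), math.pi / 2))
            best = min(best, (1.0 - c * c / (1.0 + 2.0 * t * v)) / (2.0 * t))
        out.append(best)
    return out

def backward(eta, xs, t):
    """Backward operator R_t eta := -P_t(-eta)."""
    return [-v for v in hopf_lax([-w for w in eta], xs, t)]

# Hypothetical sample data: a grid on [-2, 2] and a smooth potential with
# inf xi0 >= -0.3 > -1/(2t) for t = 1.
xs = [-2.0 + 0.05 * k for k in range(81)]
xi0 = [0.3 * math.sin(3.0 * x) for x in xs]
t = 1.0

xi_t = hopf_lax(xi0, xs, t)      # forward solution at time t
back = backward(xi_t, xs, t)     # R_t P_t xi0
gap = max(b - a for a, b in zip(xi0, back))  # should be <= 0 up to rounding
```

On this grid the gap is nonpositive up to rounding, in line with the inequality R_t P_t ξ_0 ≤ ξ_0; the points where it vanishes play the role of a discrete contact set.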
We are now in the position to compare the forward solution ξ t and the backward solution ξ̄ t . The guiding idea is that in general we only have R t P t ξ 0 ≤ ξ 0 (cf. (4.38) below), but equality holds µ t -a.e. if (ξ 0 , P 1 ξ 0 ) is an optimal pair. In the following result, we still stay in the general case, comparing arbitrary forward solutions ξ t = P t ξ 0 and backward solutions ξ̄ t = R 1−t ξ̄ 1 , only assuming ξ 1 ≥ ξ̄ 1 . Along the contact set Ξ t where ξ t and ξ̄ t coincide, we can then derive differentiability and optimality properties of ξ t and ξ̄ t . (4.38) Then, the following assertions hold:
(1) For every t ∈ [0, 1] we have ξ t ≥ ξ̄ t and the contact set (4.39)
(2) For every t ∈ (0, 1) and x ∈ Ξ t there exists a unique p = g t (x) satisfying so that in particular ξ t and ξ̄ t are differentiable at x with gradient g t (x) (cf. Proposition 4.1(2) for Λ a ).
Assertion (3). The fact that Ξ ± t are independent of t and contained in Ξ t follows from (4.17) and (4.32). Moreover, (4.42) follows easily since ξ t takes its minimum at Ξ − t and its maximum at Ξ + t . Let us now fix t ∈ (0, 1), Minimizing with respect to x we find $\zeta(x_0) \le \zeta(x_1) - \langle p_0, x_1 - x_0\rangle - \frac{1}{4C}|p_1 - p_0|^2$. Inverting the role of x 0 and x 1 and summing up gives $\frac{1}{2C}|p_1 - p_0|^2 \le \langle p_1 - p_0, x_1 - x_0\rangle$ and therefore The boundedness of g t on Ξ t follows by the fact that ξ t is Lipschitz.
Assertion (4). Let us first consider the case s > t with τ := s−t and let y ∈ M s→t (x).

Remark 4.4 (Strongly reduced pairs)
It is worth noticing that if inf ξ 0 > −1/2 and sup ξ̄ 1 < 1/2, then the sets Ξ ± t in (4.41) are empty and many properties of ξ t , T s→t and q s→t become considerably simpler. This is the case, e.g., for the solution induced by a strongly reduced pair with compact support, see Theorem 3.6.
We close this subsection by giving a small example for ξ t and ξ̄ t and their contact set Ξ t derived from an optimal pair (ξ 0 , ξ̄ 1 ) for the transport between two Dirac measures.

Geodesic flow and characteristics
Finally, we study the differentiability of g s = ∇ξ s and T t→s on Ξ s . Let us denote by Ξ̃ t the subset of density points of the contact set Ξ t , which is closed by (4.39): Notice that Ξ̃ t is just the set of Lebesgue points of the characteristic function of Ξ t , so that L d (Ξ t \ Ξ̃ t ) = 0 [AFP00]. By [Buc92, Thm. 1], the family of sets (Ξ̃ t ) t∈(0,1) is invariant with respect to the action of the bi-Lipschitz maps T s→t , i.e., T s→t (Ξ̃ s ) = Ξ̃ t for every s, t ∈ (0, 1). Given a locally Lipschitz function F : Ξ t → R d and x ∈ Ξ̃ t , we say that F is differentiable at x if there exists a matrix A = DF (x) ∈ R d×d such that (4.53) Since x belongs to the set Ξ̃ t of density points of Ξ t , the matrix A is unique and every (locally) Lipschitz extension of F is differentiable at x with the same differential A (e.g. one can argue as in the proof of [AFP00, Thm. 2.14]). We call dom t (DF ) the set of differentiability points x ∈ Ξ̃ t of F . If F is locally Lipschitz in Ξ t , considering an arbitrary Lipschitz extension of F and applying Rademacher's theorem, we know that L d (Ξ t \ dom t (DF )) = 0. We will use the simple chain-rule property In the proof of the following lemma we will denote by ∂ξ s the Fréchet subdifferential of ξ s , which coincides with ∇ξ s whenever ξ s is differentiable, in particular at x ∈ Ξ s .
Lemma 4.6 Let s ∈ (0, 1) and let x ∈ Ξ s be a density point of Ξ s where g s = ∇ξ s is differentiable in the sense of (4.53) with p = g s (x) and A = D∇ξ s (x). Then Analogous results hold for ξ̄ s . We will denote D∇ξ s by D 2 ξ s .
Notice that the points y in the limits in (4.55b) and (4.55c) are not restricted to Ξ s .
Proof. We adapt some ideas of [BCP96,AlA99] to our setting, and we consider the case of ξ̄ s (to deal with a semi-convex function instead of a semi-concave one). We will assume x = 0 and will briefly write ξ̄ and Ξ for ξ̄ s and Ξ s , omitting the explicit dependence on the parameter s. For h > 0 we define the blowup set Ξ h := h −1 Ξ. Up to the addition of a quadratic term, it is also not restrictive to assume that ξ̄ is convex.
For h > 0 we set $\omega_h(y) := \frac{1}{h^2}\big(\bar\xi(hy) - \bar\xi(x) - h\langle p, y\rangle\big)$, so that ω h is a convex and nonnegative function. By (4.40) there exists a positive constant C such that 0 ≤ ω h (y) ≤ C|y| 2 for every y ∈ Ξ h .
For this it is sufficient to approximate the (rescaled) elements of the canonical basis ±e i , i = 1, · · · , d. If y ∈ B 2 (0) we then find coefficients α h,i ≥ 0, Σ i α h,i = 1 such that so that ω h is uniformly bounded in B 2 (0) and therefore is also uniformly Lipschitz in B 1 (0). Every infinitesimal sequence h n ↓ 0 has a subsequence m → h n(m) such that ω h n(m) is uniformly convergent to a nonnegative, convex Lipschitz function ω : B 1 (0) → R. We want to show that any limit point ω coincides with the quadratic function induced by the differential A, namely $\omega(y) = \omega_A(y) = \tfrac12\langle Ay, y\rangle$.
Let ω be the uniform limit of ω h along a subsequence h n ↓ 0. If y n ∈ Ξ hn ∩ B 1 (0) is converging to y ∈ B 1 (0) we know that any limit point of p n = ∇ω hn (y n ) belongs to ∂ω(y). On the other hand, $p_n = \frac{1}{h_n}\big(\nabla\bar\xi(h_n y_n) - p\big) = Ay_n + o(1)$ thanks to the differentiability assumption, so that Ay ∈ ∂ω(y). Since we can approximate every point of B 1 (0) we conclude that Ay ∈ ∂ω(y) for every y ∈ B 1 (0). On the other hand, ω is Lipschitz, so that it is differentiable a.e. in B 1 (0) with ∇ω(y) = Ay and therefore the distributional differential of ∇ω coincides with A. We conclude that A is symmetric and $\omega(y) = \tfrac12\langle Ay, y\rangle$. The fact that ω h uniformly converges to ω eventually yields (4.55b) and (4.55c).
We now use the second-order differentiability of ξ s to derive differentiability of T s→t by using the formula (4.43) with g s (x) = ∇ξ s (x). For s ∈ (0, 1) we define (4.57) As we already observed, since g s is Lipschitz on Ξ s , L d (Ξ s \ D s ) = 0 for every s ∈ (0, 1).
For t ∈ (0, 1) and τ = t−s we also have 1+2τ ξ s ≥ (1−t)/(1−s) > 0 so that is again Lipschitz on Ξ s . Thus, Lemma 4.6 can be applied and φ s,t is differentiable in the sense of (4.53) on D s . Finally, we exploit the explicit representation of T s→t via (4.43), namely for all x ∈ Ξ s we have (4.58) Now the chain rule (4.54) guarantees the differentiability of T s→t on the set D s :
Lemma 4.7 (Differentiability of T ) For all s, t ∈ (0, 1) the mapping T s→t is differentiable on D s , and we have (4.59a) T s→t (D s ) = D t and DT t→s (T s→t (x))DT s→t (x) = I for x ∈ D s . (4.59b) For every t 0 , t 1 ∈ (0, 1), t 2 ∈ [0, 1] we also have (4.59c)
Proof. Recalling τ = t−s, the explicit formula (4.59a) follows from differentiating $\nabla L_1\big(x - T_{s\to t}(x)\big) = -\frac{\tau}{1+2\tau\xi_s}\nabla\xi_s$. Since T −1 s→t = T t→s there exists a constant L such that If x ∈ D s and A = DT s→t (x), choosing ε > 0 we can find ϱ > 0 such that so that choosing ε < 1/(2L) and x ′ = x + v we get Using the fact that 0 is a density point of Ξ t − x we conclude that A is invertible with |A −1 | ≤ 2L. For every y ′ ∈ Ξ t with L|y ′ −y| < ϱ and x ′ = T t→s (y ′ ), we get |x ′ −x| < ϱ and (4.61) yields showing that y ∈ D t and A −1 = DT t→s (y). Hence, (4.59b) is established. Equation (4.59c) then follows by the concatenation property (4.45).
The explicit formula (4.59a) shows that DT s→t is the product of the positive matrix D 2 L 1 (z) −1 and a symmetric matrix, hence it is always real diagonalizable. The following result shows that the determinant and hence all eigenvalues stay positive for s, t ∈ (0, 1). In fact, we now derive differential equations with respect to t ∈ (0, 1) for the transport-growth pairs (T s→t (x), q s→t (x)) ∈ R d ×(0, +∞) as well as for DT s→t (x) ∈ R d×d and det DT s→t (x). Recall that t → (T s→t (x), q s→t (x)) is analytic for t ∈ (0, 1) by Theorem 4.3(5) and (6).
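The linear-algebra fact used here is that a product P S of a symmetric positive definite matrix P and a symmetric matrix S is similar to the symmetric matrix P^{1/2} S P^{1/2}, so its eigenvalues are real. The following Python sketch illustrates this in dimension 2 only, by checking that the discriminant of the characteristic polynomial of P S is nonnegative for random samples. All function names are hypothetical, and the test covers real eigenvalues, not diagonalizability itself.

```python
import random

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def discriminant(m):
    """Discriminant tr(m)^2 - 4 det(m) of the characteristic polynomial of a
    2x2 matrix; it is nonnegative exactly when both eigenvalues are real."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return tr * tr - 4.0 * det

def random_spd(rng):
    """Random symmetric positive definite matrix of the form B^T B + I."""
    b = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    bt = [[b[j][i] for j in range(2)] for i in range(2)]
    g = matmul2(bt, b)
    g[0][0] += 1.0
    g[1][1] += 1.0
    return g

def random_sym(rng):
    """Random symmetric (possibly indefinite) 2x2 matrix."""
    a, b, c = (rng.uniform(-2.0, 2.0) for _ in range(3))
    return [[a, b], [b, c]]

rng = random.Random(0)
min_disc = min(discriminant(matmul2(random_spd(rng), random_sym(rng)))
               for _ in range(1000))
```

In this experiment min_disc stays nonnegative up to rounding, although the products P S themselves are in general not symmetric.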
The following relations will be crucial to derive the curvature estimate needed for our main result on geodesic H K-convexity.
With the composition rule (4.45) we have T s→t+h (x) = T t→t+h (y) and compute This identity yields the first equation in (4.62a). For the second relation in (4.62a) we use The relations (4.62b) for q(t) = q s→t follow similarly, using the scalar product rule for q s→t in (4.45) and by taking the square root of (4.44), namely To show that B(t) satisfies (4.62c), we exploit the matrix product rule (4.59c) and expand DT t→t+h (y) in (4.59a) to obtain (4.63) For this note that y − T t→t+h (y) = O(|h|) so that $D^2L_1\big(y - T_{t\to t+h}(y)\big) = I + O(|h|^2)$ as L 1 is even. Thus, (4.62c) follows as in the previous two cases. For the determinant δ(t) we again have a scalar product rule, and it suffices to expand det(DT t→t+h (x)) at h = 0. For this we can use the classical expansion $\det(I+hA) = 1 + h\,\operatorname{tr}A + \tfrac12 h^2\big((\operatorname{tr}A)^2 - \operatorname{tr}(A^2)\big) + O(h^3)$, and obtain As before this shows (4.62d), and the theorem is proved.
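The classical second-order expansion of det(I + hA) used in this proof can be checked numerically. The Python sketch below does so for one arbitrarily chosen 3×3 matrix (the matrix A and all function names are illustrative assumptions). In dimension 3 the remainder equals exactly h³ det A, so halving h should reduce the error by a factor of 8.

```python
def det3(m):
    """Determinant of a 3x3 matrix via cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def trace3(m):
    """Trace of a 3x3 matrix."""
    return m[0][0] + m[1][1] + m[2][2]

def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expansion_error(a, h):
    """det(I + h a) minus its second-order expansion
    1 + h tr(a) + (h^2 / 2) * (tr(a)^2 - tr(a^2))."""
    shifted = [[(1.0 if i == j else 0.0) + h * a[i][j] for j in range(3)]
               for i in range(3)]
    tr_a = trace3(a)
    tr_a2 = trace3(matmul3(a, a))
    approx = 1.0 + h * tr_a + 0.5 * h * h * (tr_a * tr_a - tr_a2)
    return det3(shifted) - approx

# Arbitrary non-symmetric sample matrix (illustrative choice).
A = [[1.0, 2.0, 0.0],
     [-1.0, 0.5, 3.0],
     [2.0, 1.0, -1.5]]

ratio = expansion_error(A, 0.1) / expansion_error(A, 0.05)  # expected close to 8
```

The cubic decay of the error is what allows one to read off the first and second time derivatives of the determinant from the expansion.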
In this section, we have studied the forward solutions t → ξ t for t ∈ (0, 1) and their contact sets Ξ t with a corresponding backward solution ξ̄ t . We obtained differentiability properties in these sets or in the slightly smaller sets D t and derived transport relations for important quantities such as q s→t and δ s (t) = det DT s→t (x). In the following section, we still have to show that the contact sets Ξ t are sufficiently big, if we define ξ t = P t ξ 0 and ξ̄ t = R 1−t ξ 1 for an optimal pair (ξ 0 , ξ 1 ). This will be done in Theorem 5.1.
We first show the optimality of potentials ξ t and ξ̄ t obtained from the forward or backward Hamilton-Jacobi equation in Theorem 5.1. With this, we are able to show in Theorem 5.2 that for subintervals (s, t) ⊂ [0, 1] with τ = t−s < 1 the corresponding LET problem has a unique solution in Monge form, which implies that (M(R d ), H K) has the strong nonbranching property. Finally, in Theorem 5.4 and Corollary 5.5 we provide restrictions and splittings of geodesic curves needed for the main theorem in Section 7.

Geodesics and Hamilton-Jacobi equation
The next result clarifies the connection with the forward and backward Hopf-Lax flows ξ t and ξ̄ t studied in Theorem 4.3 and the importance of the contact set Ξ t defined in (4.39) (see also [LMS18, Thm. 8.20] and [Vil09, Chap. 7] for a similar result in the framework of Optimal Transport and displacement interpolation). We emphasize that despite the non-uniqueness of the geodesics (µ t ) t∈[0,1] (see [LMS16,Sec. 5.2]) in the following result, ξ t and ξ̄ t only depend on µ 0 and µ 1 and the optimal potentials ϕ 0 and ϕ 1 .
The result brings together the results of Sections 3 and 4 by starting with an optimal pair (ϕ 0 , ϕ 1 ) from Section 3 and considering the corresponding solutions ξ t and ξ̄ t of the forward and backward Hamilton-Jacobi equation starting with ξ 0 = Ǧ 1 (ϕ 0 ) and ξ 1 = G 1 (ϕ 1 ), respectively. First, we observe that "intermediate" pairs (ξ s , ξ t ) or (ξ s , ξ̄ t ) are optimal for connecting the intermediate points µ s and µ t on an arbitrary geodesic connecting µ 0 and µ 1 . Second, we observe that certain results obtained in Section 4 for s, t ∈ (0, 1) also hold in the limit points s, t ∈ {0, 1}. Finally, we show that the contact set Ξ t is large enough in the sense that it contains supp(µ t ) (see Example 4.5 for an instructive case with ϱ = π/2).
Assertion (2). Equation (5.1) for s = 0 yields ∫ (ξ t − ξ̄ t ) dµ t = 0 for all t ∈ (0, 1), so that ξ t ≥ ξ̄ t and the continuity of ξ t , ξ̄ t yield ξ t = ξ̄ t on S t = supp µ t . The cases t = 0 and 1 follow by the relations between ξ i and ϕ i and the fact that
Note that the inclusion S t = supp(µ t ) ⊂ Ξ t is in general strict. This can be seen for the case |z 1 −z 0 | = π/2 in Example 4.5, where Ξ t = [z 0 , z 1 ]; however, there exists a pure Hellinger geodesic with supp(µ t ) = {z 0 , z 1 } for t ∈ (0, 1).
We can now exploit all the regularity features of the maps T s→t and q s→t on the contact set Ξ t (cf. Theorem 4.3). A first important consequence is that, given an H K geodesic (µ t ) t∈[0,1] and s ∈ (0, 1), the H K problem between µ s and µ t for any t ∈ [0, 1] has only one solution, which can be expressed in Monge form (see [AGS08, Lem. 7.2.1] for the corresponding properties for the L 2 -Wasserstein distance in R d ).
Theorem 5.2 (Regularizing effect along geodesics) Under the assumptions of Theorem 5.1, if s ∈ (0, 1) and t ∈ [0, 1], then the transport-growth pair (T s→t , q s→t ) of Theorem 4.3 is the unique solution of the Monge formulation (2.21) of the Entropy-Transport problem between µ s and µ t . In particular, the optimal Entropy-Transport problem between µ s and µ 0 or between µ s and µ 1 has a unique solution, and this solution is in Monge form.
The above theorem allows us to deduce the fact that (M(R d ), H K) has a strong nonbranching property. It is shown in [LMS16,Sec. 5.2] that the set of geodesics connecting two Dirac measures δ y 0 and δ y 1 is very large if |y 1 −y 0 | = π/2: it is convex but does not lie in a finite-dimensional space. The following result shows that all these geodesics are mutually disjoint except for the two endpoints µ 0 and µ 1 .
The next result shows that from a given geodesic we may construct new geodesics by multiplying the measures µ t by a suitably transported function. This will be useful in the proof of the main Theorem 7.2.
We can then pass to the limits t 1 ↓ 0 and t 2 ↑ 1 as follows. Notice that the curve t → ν t , t ∈ (0, 1), is converging in (M(R d ), H K) to a limit ν 0 and ν 1 for t ↓ 0 and t ↑ 1, since (ν t ) is a geodesic. Moreover, for every ζ ∈ C b (R d ) we can pass to the limit t ↑ 1 in since lim t↑1 T s→t (x) = T s→1 (x) and lim t↑1 q s→t (x) = q s→1 (x) and q is uniformly bounded. A similar argument holds for the case t ↓ 0.
In order to check the identity concerning the density ϱ ′ t of ν t , we use (5.5) and find The case t ∈ [0, s] is analogous.
The next result provides the fundamental formula for the representation of densities along geodesics. Generalizing the celebrated formulas for the Kantorovich-Wasserstein geodesics, the densities are again obtained by transport along geodesics, but now with non-constant speed and an additional growth factor a s (t, x) = q 2 s→t (x) to account for the annihilation and creation of mass. Recall that D s = dom(D 2 ξ s ) ⊂ Ξ s has full Lebesgue measure in Ξ s , i.e. L d (Ξ s \ D s ) = 0.
Proof. Assertion (1). If (a) holds for s ∈ (0, 1), there exist a bi-Lipschitz map T s→t : Ξ s → Ξ t and bounded growth factors q s→t : Ξ s → [a, b] with 0 < a < b < ∞ such that µ t = (T s→t , q s→t ) ⋆ µ s . In particular, for every Borel set A we have If L d (A) = 0 then L d (T t→s (A)) = 0 because T t→s is Lipschitz. Hence, using µ s ≪ L d we find µ s (T t→s (A)) = 0, so that (5.9) gives µ t (A) = 0. With this we conclude µ t ≪ L d .
In the case of assumption (b), we argue as before but with µ 0 = c 0 L d for s = 0. Using the fact that q t→0 is locally bounded from below and that T t→0 is locally Lipschitz on
Proposition 5.7 (Curvature estimates for (ρ, γ)) Let (ρ s , γ s ) : (0, 1)×D s → [0, ∞[ 2 be defined as above along a geodesic. Then, we have for all t ∈ (0, 1) the relations (5.12)
Proof. As s ∈ (0, 1) and x ∈ D s are fixed, we will simply write ρ(t) instead of ρ s (t, x) and similarly for the other variables. Using the specific definition of ρ we obtain We can now use the formulas provided in (4.62a)-(4.62d) giving γ̇ = 2ξ t γ and γ̈ = |∇ξ t | 2 γ, where ξ t and its derivatives are evaluated at y = T s→t (x). Inserting this and (4.62d) for δ̇ and δ̈ into the above relation for δ̈/δ we observe significant cancellations and obtain Thus, the curvature estimates (5.12) follow.
The above curvature estimates will be crucial in Section 7 for deriving our main result on geodesic convexity. We remark that for d ≥ 2 they are even slightly better than the "sufficient curvature estimates" given in (7.3) because of 1 − 4/d ≤ 1 − 4/d 2 (with equality only for d = 1).
We finally derive a useful result concerning the convexity of the density t → c(t, x) along geodesics. This provides a direct proof of the fact, used in [DiC20], that the L ∞ -norm along geodesics is bounded by the L ∞ -norm of the two endpoints. Indeed, we show more, namely that the function t → c(t, T t (x)) is either trivially constant or strictly convex.
Theorem 5.8 (Convexity of densities along geodesics) (1) Under the assumption of Corollary 5.5, for every s ∈ (0, 1) and x ∈ D s ∪ Ξ ± the function c s (t) = c(t, T s→t (x)) given by (5.6a) or (5.7), respectively, is convex and positive in (0, 1); moreover, with a possible L d -negligible exception, it is either constant or strictly convex.
Proof. Assertion (1). Since x ∈ R d and s ∈ (0, 1) play no role, we drop them for notational simplicity. We simply calculate the second derivative of the function $t \mapsto c(t) = \gamma(t)^{d+2} c_s / \rho(t)^d$. If c s = c(s, x) = 0 then c(t, T s→t (x)) = 0 and the result is obviously true. Hence, we may assume c s > 0 and obtain after an explicit calculation (5.14) The quadratic form involving the first derivatives is positive definite, and for the terms involving the second derivatives we can use the curvature estimates in (5.12) to obtain Notice that t → γ(t) is the square root of the non-negative (and strictly positive in (0, 1)) quadratic polynomial α(·, x) given by (5.6b), so that γ ′′ ≥ 0 and we conclude that c̈(t) ≥ 0 as well due to c(t) > 0. Moreover, if x ∉ Ξ 0 s then |∇ξ s (x)| > 0, we have γ̈(t) > 0, and we deduce that c̈(t) > 0, obtaining the strict convexity of c.
If x ∈ Ξ 0 s where ∇ξ s (x) = 0, we can use the representation (5.7) for c up to an L d -negligible set.
Assertion (2). If µ 0 = c 0 L d ≪ L d , then δ s (0, x) > 0 for µ s -a.a. x ∈ D s thanks to the last statement of Corollary 3.5 (which is a direct consequence of Theorem 3.3(5)) and both δ s (0, x) and α s (0, x) coincide with their limit as t ↓ 0. A further application of Corollary 5.5(3) yields the result. The case t = 1 is completely analogous.
The above result easily provides the following statement on convexity of L ∞ norms along H K-geodesics. This generalizes to a corresponding result for the Kantorovich-Wasserstein geodesics (which might be known, but the authors were not able to identify a reference, see Remark 5.10 below).
Corollary 5.9 (Convexity of the L ∞ norm along geodesics) Let µ 0 , µ 1 ∈ M(R d ) be absolutely continuous with respect to L d with densities c i ∈ L ∞ (R d ) and let (µ t ) t∈[0,1] be a H K geodesic connecting µ 0 to µ 1 .
Proof. The result for (M(R d ), H K) follows directly from Theorem 5.8.
Remark 5.10 Let (µ W t ) t∈[0,1] be the Kantorovich-Wasserstein geodesic connecting two probability measures µ 0 , µ 1 ∈ P 2 (R d ) with µ i = c i L d and c 0 , c 1 ∈ L ∞ (R d ). Similar to the previous result, In fact, for (P 2 (R d ), W 2 ) we replace (5.6) by the simpler formula for the Kantorovich-Wasserstein transport see [AGS05, Prop. 9.3.9]. Using µ 0 = c 0 L d we can choose s = 0 and have T W 0→t (x) = x + t(∇ϕ(x)−x) for a convex Kantorovich potential. Since for every symmetric positive semidefinite matrix D the function $t \mapsto 1/\det\big((1-t)I + tD\big)$ is convex, the desired result follows with the same arguments as for Theorem 5.8.
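The convexity of t ↦ 1/det((1−t)I + tD) used in this remark can be illustrated numerically. Since a symmetric matrix D is orthogonally diagonalizable, det((1−t)I + tD) equals the product of the factors (1−t) + tλ_i over the eigenvalues λ_i of D. The following Python sketch (function names and sample eigenvalues are illustrative assumptions) evaluates discrete second differences on a uniform grid of [0, 1]; strictly positive eigenvalues are used so that the function stays finite at t = 1.

```python
def inv_det(t, eigenvalues):
    """Value of 1 / det((1 - t) I + t D) for a symmetric matrix D with the
    given eigenvalues, using det = product of ((1 - t) + t * lambda_i)."""
    prod = 1.0
    for lam in eigenvalues:
        prod *= (1.0 - t) + t * lam
    return 1.0 / prod

def min_second_difference(eigenvalues, n=200):
    """Smallest discrete second difference f(t-h) - 2 f(t) + f(t+h) on a
    uniform grid of [0, 1]; it is nonnegative for a convex function f."""
    h = 1.0 / n
    vals = [inv_det(k * h, eigenvalues) for k in range(n + 1)]
    return min(vals[k - 1] - 2.0 * vals[k] + vals[k + 1] for k in range(1, n))

# Illustrative spectra of symmetric positive definite matrices D.
spectra = [(0.2, 1.0, 3.5), (0.5, 0.5), (4.0,), (1.0, 1.0, 1.0)]
worst = min(min_second_difference(s) for s in spectra)
```

A nonnegative value of worst (up to rounding) is consistent with convexity; the underlying reason is that each −log((1−t) + tλ_i) is convex in t, so their sum is convex and its exponential is convex as well.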

Preliminary discussion of the convexity conditions
In this section, we discuss the equivalence of two formulations of the convexity conditions and give a few examples. The proof of sufficiency and necessity of these conditions is then given in the following Section 7.
Proof. Expressing $\partial_\rho N_E$ and $\partial_\gamma N_E$ via $\varepsilon_0$ and $\varepsilon_1$ and using $\delta = (\rho/\gamma)^d$, we obtain an expression that is positive because of (6.2). Thus, (B) is established, and the monotonicity of $s \mapsto N_E(s^{1-4/d^2}\rho, s\gamma)$ in (C) follows simply by differentiation.
The crucial monotonicity stated at the end of the above proposition implies that if $E$ attains a negative value, it cannot be differentiable at $c = 0$.
In the following examples we investigate which functions $E$ satisfy the above conditions. The following two results will be used in Corollary 7.3 to obtain geodesic convexity for functionals of the form $\mathcal{E}(c) = \int_\Omega a\,c^r\,\mathrm{d}x$. The third example shows that in the case of the Boltzmann entropy with $E(c) = c\log c$ the conditions do not hold and hence geodesic convexity fails.
Clearly, for $m \ge 1$ we have $\det B(c) \ge 0$ for all space dimensions $d \in \mathbb{N}$. Moreover, $\det B(c) < 0$ for $m \le 0$.
In summary, we obtain geodesic convexity if and only if m ≥ 1.
Example 6.4 (Density function $E(c) = -c^q$) We proceed as in the previous example. The Hellinger condition $B_{22}(c) \ge 0$ holds for $q \in [0, \tfrac12]$, while the McCann condition $B_{11}(c) \ge 0$ holds for $q \in [\tfrac{d-1}{d}, 1]$, which also implies the monotonicity $\varepsilon_1 \ge \varepsilon_0$. We further obtain the additional condition $q \ge d/(d+2)$ and summarize that $E(c) = -c^q$ leads to a geodesically convex functional if and only if $q \in \big[\max\{\tfrac{d-1}{d}, \tfrac{d}{d+2}\}, \tfrac12\big]$, which has solutions only for $d = 1$ and $d = 2$.
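The interval for $q$ in Example 6.4 can be tabulated with exact rational arithmetic to see why only $d = 1$ and $d = 2$ survive. This is a small illustrative sketch; the function name `admissible_q_interval` is hypothetical and merely encodes the three conditions stated in the example.

```python
from fractions import Fraction

def admissible_q_interval(d):
    """Return (lower, upper) bounds for q in dimension d, or None if empty.

    Encodes q <= 1/2 (Hellinger), q >= (d-1)/d (McCann) and q >= d/(d+2).
    """
    lower = max(Fraction(d - 1, d), Fraction(d, d + 2))
    upper = Fraction(1, 2)
    return (lower, upper) if lower <= upper else None

for d in range(1, 6):
    print(d, admissible_q_interval(d))
```

For $d = 1$ this gives $q \in [\tfrac13, \tfrac12]$, for $d = 2$ the single value $q = \tfrac12$, and for $d \ge 3$ the interval is empty because $\tfrac{d-1}{d} \ge \tfrac23 > \tfrac12$.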
Finally, we discuss a few examples where the density function $E$ is not smooth. Note that the conditions in (1.25) form a closed cone. Moreover, as for convex functions, the supremum of functions satisfying (1.25) satisfies (1.25) as well.
Example 6.6 (Nonsmooth $E$) In applications one is also interested in cases where $E$ is nonsmooth. For example, the case $E_\kappa(c) = \kappa c$ for $c \in [0, c_*]$ and $E_\kappa(c) = \infty$ for $c > c_*$ is considered in [DiC20]. Clearly, $E_0$ satisfies our assumptions (1.25), since $N_{E_0}$ only takes the values $0$ and $\infty$, and the value $0$ is taken on the convex set $\gamma^{d+2} \le c_*\rho^d$. Thus, $E_\kappa$ generates a functional $\mathcal{E}_\kappa = \mathcal{E}_0 + \kappa\mathscr{M}$ that is geodesically $2\kappa$-convex.
A second example is given by $E(c) = \max\{0, c^2 - c\}$. We first observe that $E_1(c) = c$ and $E_2(c) = c^2$ satisfy (1.25). Hence, $c \mapsto \max\{E_1(c), E_2(c)\} = E(c) + c$ satisfies (1.25) as well. Thus, we know that $E$ generates a functional $\mathcal{E}$ that is at least geodesically $(-2)$-convex. However, we may inspect the function $c \mapsto c^2 - c$ in the region $c \ge 1$ directly and find that $E$ itself satisfies (1.25).
In practical applications, in particular for evolutionary variational inequalities as treated in [LaM22], it is desirable to find the optimal $\lambda$ for the geodesic $\lambda$-convexity. So far, we have treated the case of geodesic $0$-convexity; we now return to the general case with its $\lambda$-dependent conditions. The monotonicity condition is clearly independent of $\lambda$. The first of these conditions still relies on the necessary McCann condition $B_{11}(c) \ge 0$. If this holds with strict inequality, the optimal $\lambda$ can be characterized explicitly.
Moreover, we find $\ell(c) \sim 2c^{-3/5}\det B^{(2/5)}/B^{(2/5)}_{11}$ for $c \approx 0$ and $\ell(c) \sim 2c\,\det B^{(2)}/B^{(2)}_{11}$ for $c \gg 1$. Thus, by compactness, $\lambda_{\mathrm{opt}} = \inf\{\ell(c) \mid c > 0\}$ is strictly positive.
Remark 6.8 (Geodesic convexity via the Otto calculus) Following the key ideas in [OtW05, DaS08], a formal calculus for reaction-diffusion systems was developed in [LiM13]. It uses the dynamical formulation in Subsection 2.1.1 and the associated Onsager operator $\mathbb{K}(c)\xi = -\alpha\,\mathrm{div}(c\nabla\xi) + \beta c\xi$ to characterize the geodesic $\lambda$-convexity of the functional $\mathcal{E}$ by calculating the quadratic form $M(c,\cdot)$ (the contravariant Hessian of $\mathcal{E}$). Then, one needs to show the estimate $M(c,\xi) \ge \lambda\langle\xi, \mathbb{K}(c)\xi\rangle$. Following the methods in [LiM13, Sect. 4], for $c \in C^0_{\mathrm{c}}(\Omega)$ and smooth $\xi$ we obtain an explicit expression for $M(c,\xi)$. Analyzing the condition $M(c,\xi) \ge \lambda\langle\xi, \mathbb{K}(c)\xi\rangle$, we find the conditions (6.5), which for $\lambda = 0$ give the same conditions as $B(c) \ge 0$, see Proposition 6.1. Note that the middle estimate in (6.5) follows from the first and the third estimates.

7 Proof of geodesic convexity of $\mathcal{E}$
In this section, we finally prove the necessity and sufficiency of the conditions for geodesic convexity of functionals $\mathcal{E}$ on $\mathcal{M}(\Omega)$ in (1.25), where we now allow for a general closed and convex domain $\Omega \subset \mathbb{R}^d$. In order to keep the arguments clear, we first restrict ourselves to absolutely continuous measures $\mu_0$ and $\mu_1$. Thus, by Corollary 5.5 the connecting geodesic curves are also absolutely continuous, and we can rewrite $\mathcal{E}$ along the latter as an integral over $\Omega$ of a function $e(t,x)$. The general case will then be treated by using an approximation argument.
Under the assumption that $E$ is twice differentiable in the interior of its domain, we show that for $\mu_0$-a.a. $x \in \Omega$ the function $t \mapsto e(t,x)$ is convex. Since $\alpha(\cdot,x)$ and $\delta(\cdot)$ are analytic functions on $[0,1]$, we can show convexity in this case by establishing $\ddot e(t,x) \ge 0$. For this, we can fix $x \in \Omega$, drop the dependence on $x$ for notational convenience, and set
$$e(t) = \delta(t)\,E\Big(\frac{c_*\alpha(t)}{\delta(t)}\Big) = N_E\big(\rho(t),\gamma(t)\big) \quad\text{with } \rho := (c_*\alpha)^{1/2}\delta^{1/d},\ \gamma := (c_*\alpha)^{1/2}, \tag{7.1}$$
and $N_E$ from (1.25a). Now, the classical chain rule implies the relation
$$\ddot e = \Big\langle \mathrm{D}^2 N_E(\rho,\gamma)\binom{\dot\rho}{\dot\gamma}, \binom{\dot\rho}{\dot\gamma}\Big\rangle + \partial_\rho N_E(\rho,\gamma)\,\ddot\rho + \partial_\gamma N_E(\rho,\gamma)\,\ddot\gamma. \tag{7.2}$$
The aim is to show $\ddot e(t) \ge 0$ for all $t \in [0,1]$.
By the convexity of $N_E$ it suffices to treat the last two terms. For this we exploit the curvature estimates (5.12) on $\ddot\gamma$ and $\ddot\rho$ as well as the monotonicities in (1.25c) and Proposition 6.2.
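The change of variables in (7.1) can be sanity-checked numerically. The sketch below assumes the form $N_E(\rho,\gamma) = (\rho/\gamma)^d\,E(\gamma^{d+2}/\rho^d)$, which is what (7.1) forces on the range of $(\rho,\gamma)$ since $\delta = (\rho/\gamma)^d$; the definition (1.25a) itself is not reproduced here. The sample values of $c_*\alpha$ and $\delta$ are hypothetical.

```python
import math

def n_E(E, rho, gamma, d):
    """Assumed form N_E(rho, gamma) = (rho/gamma)**d * E(gamma**(d+2) / rho**d)."""
    return (rho / gamma) ** d * E(gamma ** (d + 2) / rho ** d)

def e_direct(E, c_alpha, delta):
    """e = delta * E(c_* alpha / delta), where c_alpha stands for c_* alpha."""
    return delta * E(c_alpha / delta)

def rho_gamma(c_alpha, delta, d):
    """Change of variables: gamma = (c_* alpha)**(1/2), rho = gamma * delta**(1/d)."""
    gamma = math.sqrt(c_alpha)
    rho = gamma * delta ** (1.0 / d)
    return rho, gamma

E = lambda c: -math.sqrt(c)        # density E(c) = -sqrt(c)
d, c_alpha, delta = 2, 1.7, 0.4    # hypothetical sample values
rho, gamma = rho_gamma(c_alpha, delta, d)
lhs = e_direct(E, c_alpha, delta)
rhs = n_E(E, rho, gamma, d)
print(lhs, rhs)
```

Both expressions agree up to rounding, which reflects the identities $(\rho/\gamma)^d = \delta$ and $\gamma^{d+2}/\rho^d = c_*\alpha/\delta$.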

Usage of the curvature estimates
We first show that it is sufficient to use the curvature estimates (7.3). In particular, the equality condition for $d = 1$ is different from the inequality conditions for $d \ge 2$. This will be used to compensate for the missing monotonicity of $N_E$ in (1.25c) in the case $d = 1$.
Below we will see that the curvature estimates (7.3) are necessary to complete our proof. Note that they are implied by the curvature estimates derived in Proposition 5.7. In fact, both coincide for $d = 1$, while for $d \ge 2$ the former are strictly weaker than the latter because of $1 - 4/d < 1 - 4/d^2$.
Proof. As the first term (involving $\mathrm{D}^2 N_E$) on the right-hand side of (7.2) is non-negative, we only have to show that the last two terms have a non-negative sum. For this we rearrange the terms. The right-hand side of the rearranged expression is the sum of two products, both of which are non-negative. Indeed, the first product equals $0$ in the case $d = 1$ independently of the sign of $\partial_\rho N_E$, because the second factor is $0$. In the case $d \ge 2$ both factors are non-negative (using $\partial_\rho N_E \le 0$ and the second curvature estimate in (7.3)), so the first product is non-negative again.
In the second product both factors are non-negative by Proposition 6.2(B) and the first curvature estimate in (7.3). Thus, $\ddot e \ge 0$ in (7.2) is proved.

The main results on geodesic λ-convexity
We are now ready to establish our main result on the geodesic convexity of functionals $\mathcal{E}$ given in terms of a density $E$. Our general assumptions on $E$ are collected in (7.4a). We also want to include the case that $E$ is not necessarily superlinear, so we introduce the recession constant $E'_\infty := \lim_{c\to\infty} E(c)/c$. The case $E'_\infty = \infty$ is the superlinear case, where the functional $\mathcal{E}(\mu)$ is always $+\infty$ if $\mu$ has a singular part, i.e. $\mu^\perp \ne 0$ in the decomposition $\mu = c\mathcal{L}^d + \mu^\perp$ with $\mu^\perp \perp \mathcal{L}^d$.
We introduce a closed (convex) domain $\Omega \subset \mathbb{R}^d$, and we consider the set of measures $\mu$ with support contained in $\Omega$, which we identify with $\mathcal{M}(\Omega)$. In the case that the right derivative $E'_0 := \lim_{c\downarrow 0} \tfrac1c E(c)$ of $E$ at $0$ is not finite, we further have to impose that $\Omega$ has finite Lebesgue measure. Therefore, we will assume that $\Omega$ is a closed convex set with nonempty interior and that (7.4b) holds. The functionals $\mathcal{E}$ are then defined in terms of the Lebesgue decomposition $\mu = c\mathcal{L}^d + \mu^\perp$. It is well known that (7.4) guarantees that $\mathcal{E}$ is a weakly lower semi-continuous functional on $\mathcal{M}(\Omega)$. In particular, condition (7.4b) is necessary to guarantee that the negative part $x \mapsto \min\{E(c(x)), 0\}$ is integrable, because for $c \in L^1(\Omega)$ the function $x \mapsto -\sqrt{c(x)}$ may not lie in $L^1(\Omega)$. We refer to Example 7.4 for a case where (7.4b) can be avoided by using a confining potential.
We are now in the position to formulate our main result on the geodesic $\lambda$-convexity of integral functionals $\mathcal{E}$ on the Hellinger-Kantorovich space $(\mathcal{M}(\Omega), \mathsf{H\!\!K})$. The proof consists of four steps. First, we assume that $E$ is twice continuously differentiable in its domain. Restricting to geodesic curves connecting absolutely continuous measures, we can use the above theory for differentiable densities, giving $\ddot e \ge 0$. In Step 2, we generalize to possibly non-differentiable density functions $E$, but keep absolutely continuous measures. For smoothing a given $E$, we use that whenever $E$ satisfies the conditions (1.25) and (7.4), then $c \mapsto E(rc)$ does so for each $r \in [0,1]$. With a multiplicative convolution we construct a smooth $E_\delta$ to which Step 1 applies. Step 3 treats the case of pure growth. Finally, Step 4 handles the case where $\mu_0^\perp$ or $\mu_1^\perp$ are non-zero by a standard approximation argument of general measures using absolutely continuous measures.
Proof. Without loss of generality, we set $\lambda_* = 0$ throughout the proof and write $N_E = N_{\lambda_*,E}$ for short.
Step 1: The smooth and absolutely-continuous case. We first assume that $E$ is twice continuously differentiable in the interior $]0, c_E[$ of its domain and that the measures $\mu_0$ and $\mu_1$ are absolutely continuous with respect to $\mathcal{L}^d$, i.e. $\mu_j = c_j\mathcal{L}^d$ for $c_j \in L^1(\Omega)$.
In the strictly convex case, the values of $c(t,x)$ for $t \in {]0,1[}$ lie in the interior of the domain of $E$, where $E$ is twice differentiable. Hence, combining Propositions 7.1 and 5.7 shows that $t \mapsto e(t,x)$ is convex for a.a. $x \in \Omega$. Since integration over $\Omega$ maintains convexity, we conclude that $t \mapsto \mathcal{E}(\mu_t)$ is convex, too.
Step 2: The nonsmooth but absolutely-continuous case. We still assume $\mu_j = c_j\mathcal{L}^d$, but now consider an $E$ that is not necessarily twice differentiable, but still satisfies (7.4). We choose a function $\chi \in C^\infty_{\mathrm{c}}(\mathbb{R})$ satisfying $\chi(r) \ge 0$ and use it to construct, by multiplicative convolution, a smooth approximation $E_\delta$ of $E$ for which the estimate (7.6) holds. To see this, we first consider the largest interval $[0, c_1[$ on which $E$ is non-increasing. Then, on this interval, $0 = E(0) \ge E_\delta(c) \ge E(c)$, which implies (7.6) with $K = 1$. If $c_1 = \infty$, then we are done. If $c_1 < \infty$, then $E$ starts to increase and there exists $c_2 \in [c_1, \infty[$ with $E(c) \ge 0$ for $c \ge c_2$. Using the construction of $E_\delta$, we obtain for all $c \ge 2c_2 \ge c_2/(1-2\delta)$ the lower bound $E_\delta(c) \ge 0$. Using (6.3) we easily get $E_\delta(c) \le E(c)$. It remains to cover the case $c \in [c_1, 3c_2]$. If $c_1 = 0$, then $E(c) \ge 0$ for all $c$, which means $c_2 = 0$ as well, and (7.6) follows immediately from the above arguments. If $c_1 > 0$, a uniform continuity argument gives the estimate $|E_\delta(c) - E(c)| \le M$ for $c \in [c_1, 3c_2]$. Then, choosing $K = M/c_1$ provides (7.6).
Step 3: Pure growth. The curve $t \mapsto t^2\mu_1$ is the unique geodesic connecting $\mu_0 = 0$ and $\mu_1$. Using the Lebesgue decomposition $\mu_1 = c_1\mathcal{L}^d + \mu^\perp$, we see that $t \mapsto \mathcal{E}(t^2\mu_1)$ is convex on $[0,1]$, by Step 2 for the absolutely continuous part and by $E'_\infty \ge 0$ for the singular part. The nonnegativity of $E'_\infty = \lim_{c\to\infty} E(c)/c$ follows from (7.4a) and Proposition 6.2(A).
Step 4: The general case allowing for singular measures. Singular measures can only occur for $E$ with sublinear growth. Hence, we assume $E'_\infty \in \mathbb{R}$ from now on. In particular, $E$ is finite everywhere, and using $E(c) \le E'_\infty c$ we have $\mathcal{E}(\mu) \le E'_\infty\mu(\Omega)$. As in Corollary 5.6, we consider an arbitrary geodesic $(\mu_t)_{t\in[0,1]}$ connecting $\mu_0$ and $\mu_1$. For a fixed $s \in (0,1)$, we decompose $\mu_s$ as $\mu^a_s + \mu^\perp_s$. Then, $\mu_t = \tilde\mu_t + \hat\mu_t$ splits into two geodesics with disjoint supports and $\tilde\mu_s = \mu^a_s$ and $\hat\mu_s = \mu^\perp_s$, see Corollary 5.6. Moreover, we have $\hat\mu_t \perp \mathcal{L}^d$ and $\tilde\mu_t \ll \mathcal{L}^d$ for all $t \in (0,1)$. This implies the relation $\mathcal{E}(\mu_t) = \mathcal{E}(\tilde\mu_t) + E'_\infty\hat\mu_t(\Omega)$. Since $(\hat\mu_t)_{t\in[0,1]}$ is a geodesic, the total mass functional $\mathscr{M}(\mu) = \mu(\Omega)$ is convex (see (1.21)), and $E'_\infty \ge 0$, the last term $t \mapsto E'_\infty\hat\mu_t(\Omega)$ is convex. Hence, it is sufficient to check the convexity of $t \mapsto \mathcal{E}(\tilde\mu_t)$.
Since $\tilde\mu_t \ll \mathcal{L}^d$ for all $t \in (0,1)$, the function $t \mapsto \mathcal{E}(\tilde\mu_t)$ is convex in the open interval $(0,1)$ by Step 2. Hence, to show convexity on $[0,1]$ it is sufficient to check that $\limsup_{t\downarrow 0}\mathcal{E}(\tilde\mu_t) \le \mathcal{E}(\tilde\mu_0)$ and $\limsup_{t\uparrow 1}\mathcal{E}(\tilde\mu_t) \le \mathcal{E}(\tilde\mu_1)$, because $\mathsf{H\!\!K}$ convergence implies weak convergence and $\mathcal{E}$ is weakly l.s.c. Let us focus on the limit $t \downarrow 0$, as the limit $t \uparrow 1$ is completely analogous. The problem is that $\tilde\mu_t \ll \mathcal{L}^d$ for $t \in (0,1)$ only, but $\tilde\mu_0$ may have a singular part. Hence, we forget the decomposition $\mu_t = \tilde\mu_t + \hat\mu_t$ and use a different one. Before that, we restrict to the case $\mu_0(\Xi^+) = 0$, because on $\Xi^+$ we have pure growth and this case is covered by Step 3. Now, we exploit the Lebesgue decomposition $\mu_0 = \mu^a_0 + \mu^\perp_0$ at $t = 0$ and consider two disjoint Borel sets $A, B \subset \Omega \setminus \Xi^+$ such that $\mu^a_0 = \mu_0 \llcorner A$ and $\mu^\perp_0 = \mu_0 \llcorner B$. We define the corresponding disjoint sets $A_t := T^{-1}_{t\to 0}(A)$ and $B_t := T^{-1}_{t\to 0}(B)$ as well as the measures $\nu^A_t := \mu_t \llcorner A_t$ and $\nu^B_t := \mu_t \llcorner B_t$. By Theorem 5.4, we obtain two geodesics $\nu^A_t$, $\nu^B_t$ concentrated on disjoint sets, giving $\mathcal{E}(\mu_t) = \mathcal{E}(\nu^A_t) + \mathcal{E}(\nu^B_t)$. Since $\nu^A_t \ll \mathcal{L}^d$ for every $t \in [0,1)$, we deduce that $t \mapsto \mathcal{E}(\nu^A_t)$ is convex up to $0$ by Step 2. Concerning $\mathcal{E}(\nu^B_t)$, we use $\mathcal{E}(\mu) \le E'_\infty\mu(\Omega)$ to bound $\limsup_{t\downarrow 0}\mathcal{E}(\nu^B_t)$ from above by $E'_\infty\nu^B_0(\Omega)$, which equals $\mathcal{E}(\nu^B_0)$ because $\nu^B_0 \perp \mathcal{L}^d$. This finishes the proof of the main theorem.
The following result is a direct consequence of the main result by using the results of Examples 6.3 and 6.4, respectively. In particular, this establishes the result announced in [DiC20, Thm. 2.14].
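Example 7.4 We have seen above that the density $E(c) = -\sqrt{c}$ produces a geodesically convex functional in dimensions $d = 1$ and $2$, if $\mathcal{L}^d(\Omega) < \infty$. The restriction of finite volume for $\Omega$ can be dropped by using a confining potential $V$ as follows: Let $\mathcal{E}_{1/2,V}$ denote the functional that adds the potential energy $\mu \mapsto \int_{\mathbb{R}^d} V\,\mathrm{d}\mu$ to the functional generated by $E(c) = -\sqrt{c}$, where $V \in C(\mathbb{R}^d)$ satisfies for $m > d$ and $A \in \mathbb{R}$ the lower bound $V(x) \ge a_0|x|^m - A$ on $\mathbb{R}^d$. Then it is easy to see that $\mathcal{E}_{1/2,V}$ is well-defined and weakly lower semi-continuous. Moreover, in [LMS16, Prop. 20] it was shown for a continuous $V : \mathbb{R}^d \to \mathbb{R}$ with $\inf V > -\infty$ that the linear mapping $\mu \mapsto \int_{\mathbb{R}^d} V\,\mathrm{d}\mu$ is geodesically $\lambda_V$-convex on $(\mathcal{M}(\Omega), \mathsf{H\!\!K})$ if and only if the mapping $\tilde V : [x,r] \mapsto r^2 V(x)$ is geodesically $\lambda_V$-convex on the metric cone space $(\mathfrak{C}, \mathsf{d}_{\mathfrak{C}})$. For smooth $V$, this amounts to a pointwise estimate on $V$ and its derivatives. Thus, for $V$ satisfying both of the above assumptions, the functional $\mathcal{E}_{1/2,V}$ is geodesically $\lambda_V$-convex on $(\mathcal{M}(\mathbb{R}^d), \mathsf{H\!\!K})$ for $d \in \{1, 2\}$. For $d = 1$ we may choose $V(x) = \alpha + \beta|x|^2$ with $\beta > 0$ and obtain $\lambda_V = 2\alpha$.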
In particular, the second derivative is non-negative which means where ξ s and its derivatives are evaluated at x = x * = 0.