Generalized curvature for the optimal transport problem induced by a Tonelli Lagrangian

We propose a generalized curvature that is motivated by the optimal transport problem on $\mathbb{R}^d$ with cost induced by a Tonelli Lagrangian $L$. We show that non-negativity of the generalized curvature implies displacement convexity of the generalized entropy functional on the $L$-Wasserstein space along $C^2$ displacement interpolants.


Introduction
Given a Riemannian manifold $(M, g)$, one may consider the optimal transport problem with cost given by squared Riemannian distance. This induces the 2-Wasserstein distance $W_2$ on $\mathcal{P}_2(M)$, the space of probability measures on $M$ with finite second moments (i.e. probability measures $\mu$ such that $\int_M d(x, x_0)^2 \, d\mu(x) < \infty$ for every $x_0 \in M$). The metric space $(\mathcal{P}_2(M), W_2)$ is called the 2-Wasserstein space, and is known to be a geodesic space ([21] Chapter 7).
In [17], Otto proposed that $\mathcal{P}_2(M)$ admits a formal Riemannian structure and developed a formal calculus on $\mathcal{P}_2(M)$. This later became what is known as Otto calculus [21] and was made rigorous by Ambrosio-Gigli-Savaré [2]. In particular, Otto calculus allows one to compute displacement Hessians of functionals along geodesics in $\mathcal{P}_2(M)$. This is useful for characterizing a displacement convex functional (i.e. one that is convex along every geodesic) by the non-negativity of its displacement Hessian. In a seminal work by Otto and Villani [18], it was shown that the displacement convexity of the entropy functional is related to the Ricci curvature of $(M, g)$. Since then, the notion of displacement convexity has been useful in many other areas. For instance, it has inspired new heuristics and proofs of various functional inequalities [1], [8].
Further advances have been made towards understanding the relationships between the geometry of the underlying space and the induced geometry of $\mathcal{P}(M)$, the space of probability measures on $M$. In his Ph.D. thesis [19], Schachter studied the optimal transport problem on $\mathbb{R}^d$ with cost induced by a Tonelli Lagrangian. The case $d = 1$ was considered in [20], and this work was later used in [3] and [15].
In his work, Schachter developed an Eulerian calculus, extending the Otto calculus. Among the other contributions of his thesis, Schachter derived a canonical form for the displacement Hessians of functionals. Using Eulerian calculus, he found a new class of displacement convex functionals on $S^1$ [20], which includes those found by Carrillo and Slepčev in [7]. In the case when the cost is given by squared Riemannian distance, Schachter proved that his displacement Hessian agrees with Villani's displacement Hessian in [21], which is a quadratic form involving the Bakry-Emery tensor.
Summary of main results: In this manuscript, a generalized notion of curvature $K_x$ (Definition 5.6) is proposed for the manifold $M = \mathbb{R}^d$ equipped with a general Tonelli Lagrangian $L$, and is given by
$$K_x(\xi) := \operatorname{tr}\big(\nabla\xi(x)^2 + A(x, \xi(x))\nabla\xi(x) + B(x, \xi(x))\big)$$
for vector fields $\xi \in C^2(\mathbb{R}^d; \mathbb{R}^d)$. The maps $A$ and $B$ are defined in Lemma 5.1. We prove that this generalized curvature is independent of the choice of coordinates (Theorem 5.7). In the case where $\xi$ takes a special form (that naturally arises from the optimal transport problem), we provide an explicit formula for $K_x$ in Theorem 5.8. Lastly, we furnish an example of a Lagrangian cost with non-negative generalized curvature that is not given by squared Riemannian distance. This induces a geometry on the $L$-Wasserstein space where the generalized entropy functional (4.1) is displacement convex along suitable curves.
This paper is organized as follows: In the first four sections, we will review the optimal transport problem induced by a Tonelli Lagrangian, up to and including the notion of displacement convexity. The thesis of Schachter [19] provides a good overview of key definitions and results needed. Section 2 covers some basic notation. Section 3 reviews some ideas from [19]; chief among them is the relationship between the various formulations of the optimal transport problem. Section 4 discusses functionals along curves in Wasserstein space, including a computation of the displacement Hessian. Section 5 introduces the definition and various properties of the generalized curvature $K_x$. Lastly, Section 6 provides an example of a Lagrangian with everywhere non-negative generalized curvature.
This paper grew out of an undergraduate project I worked on at UCLA. I would like to thank Wilfrid Gangbo (who first introduced this subject to me) for his continued guidance and support. His expertise and generosity have been tremendously helpful. I also thank Tommaso Pacini for helpful feedback. Lastly, I would like to thank the reviewer for the suggestions to the manuscript.

Notation
We will take our underlying manifold to be $M = \mathbb{R}^d$ and identify its tangent bundle $T\mathbb{R}^d \cong \mathbb{R}^d \times \mathbb{R}^d$. Let $P_{ac} = P_{ac}(\mathbb{R}^d)$ denote the set of probability measures on $\mathbb{R}^d$ that are absolutely continuous with respect to the $d$-dimensional Lebesgue measure (denoted $\mathcal{L}^d$). An element of $P_{ac}$ will often be identified by its density $\rho$. Given $\rho \in P_{ac}$ and a measurable function $T : \mathbb{R}^d \to \mathbb{R}^d$, $T_\#\rho$ will denote the push-forward measure of $\rho$.

Definition 2.1 (Tonelli Lagrangian). A function
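(iii) $L$ has asymptotic superlinear growth in the variable $v$, in the sense that there exist a constant $c_0 \in \mathbb{R}$ and a function $\theta : \mathbb{R}^d \to [0, +\infty)$ with
$$\lim_{|v| \to \infty} \frac{\theta(v)}{|v|} = +\infty \quad \text{such that} \quad L(x, v) \ge c_0 + \theta(v) \quad \text{for all } x, v \in \mathbb{R}^d.$$
Throughout, $L \in C^k$ with $k \ge 3$ will be assumed to be a Tonelli Lagrangian and we will work with the underlying space $(\mathbb{R}^d, L)$. We denote the gradient with respect to the $x$ (position) and $v$ (velocity) variables by $\nabla_x L, \nabla_v L \in \mathbb{R}^d$ respectively. Similarly, the second-order derivatives will be denoted by $\nabla^2_{xx} L$, $\nabla^2_{xv} L$, $\nabla^2_{vx} L$, $\nabla^2_{vv} L \in \mathbb{R}^{d \times d}$. The time derivative of a function $f(t)$ will be denoted by $\dot f = \frac{df}{dt}$.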
Optimal transport problem induced by a Tonelli Lagrangian

Lagrangian optimal transport problem
The goal of this section is to establish the different formulations of the optimal transport problem with cost induced by a Tonelli Lagrangian $L$. In this first subsection, the Lagrangian optimal transport problem will be presented. We will also briefly recall the classical Monge-Kantorovich theory. Most of the material in the subsection can be found in [5], [11], [19] and [21]. In subsection 3.2 we will present an Eulerian perspective and its connections to viscosity solutions of the Hamilton-Jacobi equation.
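Definition 3.1 (Action functional). Let $T > 0$ and let $\gamma \in W^{1,1}([0, T]; \mathbb{R}^d)$ be a curve. The action of $L$ on $\gamma$ is
$$A_{L,T}(\gamma) := \int_0^T L(\gamma(t), \dot\gamma(t)) \, dt.$$
This induces a cost function $c_{L,T} : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ given by
$$c_{L,T}(x, y) := \inf\big\{ A_{L,T}(\gamma) : \gamma \in W^{1,1}([0, T]; \mathbb{R}^d),\ \gamma(0) = x,\ \gamma(T) = y \big\}.$$
A curve $\gamma$ with $\gamma(0) = x$, $\gamma(T) = y$ is called an action-minimizing curve from $x$ to $y$ if $A_{L,T}(\gamma) = c_{L,T}(x, y)$.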
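As a sanity check on Definition 3.1, the simplest Tonelli Lagrangian can be worked out by hand. The computation below is an illustrative sketch, not taken from [19]: it assumes the Euclidean kinetic energy $L(x,v) = \frac12 |v|^2$, takes the action $A_{L,T}$ to be the time integral of $L$ along the curve and the cost $c_{L,T}$ to be the infimum of the action over curves joining $x$ to $y$, and introduces the symbol $\gamma^*$ for the straight-line candidate.

```latex
\begin{align*}
L(x,v) &= \tfrac12 |v|^2, \qquad
  \gamma^*(t) = x + \tfrac{t}{T}\,(y - x), \quad t \in [0,T], \\
A_{L,T}(\gamma^*) &= \int_0^T \tfrac12 \Big| \tfrac{y-x}{T} \Big|^2 \, dt
  = \frac{|y-x|^2}{2T}. \\
\intertext{For any other $\gamma \in W^{1,1}([0,T];\mathbb{R}^d)$ with
  $\gamma(0)=x$ and $\gamma(T)=y$, the Cauchy--Schwarz inequality gives}
|y-x|^2 &= \Big| \int_0^T \dot\gamma(t)\,dt \Big|^2
  \le T \int_0^T |\dot\gamma(t)|^2 \, dt = 2T\, A_{L,T}(\gamma), \\
\text{so that}\quad c_{L,T}(x,y) &= \frac{|x-y|^2}{2T}.
\end{align*}
```

Under these assumptions the straight line is action-minimizing and the induced cost is the squared Euclidean distance up to the factor $1/(2T)$, which is the setting of the classical 2-Wasserstein distance.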
Theorem 3.2 ([11] Appendix B). For any $x, y \in \mathbb{R}^d$, there exists an action-minimizing curve $\gamma$ from $x$ to $y$ such that (i) $A_{L,T}(\gamma) = c_{L,T}(x, y)$
We refer the reader to [11] and [19] for further properties of the cost function $c_{L,T}$. In particular, it is locally Lipschitz and thus differentiable almost everywhere by Rademacher's theorem. Moreover, if either $\frac{\partial}{\partial x} c_{L,T}(x_0, y_0)$ or $\frac{\partial}{\partial y} c_{L,T}(x_0, y_0)$ exists at $(x_0, y_0)$, then the action-minimizing curve from $x_0$ to $y_0$ is unique. With the cost $c_{L,T}$, we may state the Monge problem and the Kantorovich problem.
A minimizer $\pi$ is called an optimal transport plan. The infimum in (3.5) is denoted $W_{c_{L,T}}(\rho_0, \rho_T)$ and it is called the Kantorovich cost from $\rho_0$ to $\rho_T$.
If $W_{c_{L,T}}(\rho_0, \rho_T)$ is finite, then the Monge problem with cost $c_{L,T}$ admits an optimizer $M$ (called the Monge map) that is unique $\rho_0$-almost everywhere [11]. Note that the Monge problem is only concerned with the initial and final states (i.e. $\rho_0$, $\rho_T$). To interpolate between $\rho_0$ and $\rho_T$ in a way that respects the cost $c_{L,T}$, we consider the Lagrangian formulation of the optimal transport problem induced by $L$.
Definition 3.6 (Lagrangian optimal transport problem). Let $\rho_0, \rho_T \in P_{ac}$. The Lagrangian optimal transport problem from $\rho_0$ to $\rho_T$ induced by the Tonelli Lagrangian $L$ is the minimization problem where the infimum is taken over all $\sigma$ :
In [19], it is shown that if $W_{c_{L,T}}(\rho_0, \rho_T)$ is finite, then the Lagrangian optimal transport problem admits an optimizer $\sigma$ such that $\sigma(\cdot, x)$ is an action-minimizing curve from $\sigma(0, x) = x$ to $\sigma(T, x)$ for every $x \in \mathbb{R}^d$. Moreover, the map $\sigma(T, \cdot)$ coincides with the Monge map $M$ and so is unique $\rho_0$-almost everywhere. With an optimizer $\sigma$, we can define the notion of displacement interpolation, which is the analogue of a geodesic in $P_{ac}$.
Definition 3.7 (Displacement interpolant). Let $\rho_0, \rho_T \in P_{ac}$ be such that the Kantorovich cost $W_{c_{L,T}}(\rho_0, \rho_T)$ is finite. Let $\sigma$ be an optimizer of the Lagrangian optimal transport problem. Then the displacement interpolant between $\rho_0$ and $\rho_T$ for the cost $c_{L,T}$ is the measure-valued map
$$[0, T] \ni t \mapsto \mu_t := \sigma(t, \cdot)_\# \rho_0.$$
Since $\mu_t$ is absolutely continuous with respect to $\mathcal{L}^d$ for every $t \in [0, T]$ ([11] Theorem 5.1), we will also identify $\mu_t$ with its density $\rho_t$. Subsequently, we will always denote a displacement interpolant by a function $\rho : [0, T] \times \mathbb{R}^d \to \mathbb{R}$ and use the notation $\rho_t = \rho(t, \cdot)$ whenever the intention is clear. Since the maps $\sigma(t, \cdot)$ are uniquely defined ($\rho_0$-almost everywhere) on the support of $\rho_0$, the displacement interpolant is well-defined. Moreover, the restriction $\sigma|_{[0,t] \times \mathbb{R}^d}$ for an intermediary time $t \in [0, T]$ optimizes the Lagrangian optimal transport problem from $\rho_0$ to $\rho_t$.
In order to discuss the Eulerian formulation of the optimal transport problem, we need to introduce the Kantorovich duality.We do so in accordance with the convention of [21].
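Theorem 3.8 (Kantorovich duality). The Kantorovich optimal transport problem from $\rho_0$ to $\rho_T$ for the cost $c_{L,T}$ has a dual formulation
$$W_{c_{L,T}}(\rho_0, \rho_T) = \sup\Big\{ \int_{\mathbb{R}^d} u_T \rho_T \, dy - \int_{\mathbb{R}^d} u_0 \rho_0 \, dx \ :\ u_T(y) - u_0(x) \le c_{L,T}(x, y) \text{ for all } x, y \in \mathbb{R}^d \Big\}.$$
Moreover, we may assume that
$$u_T(y) = \inf_{x \in \mathbb{R}^d} \{ u_0(x) + c_{L,T}(x, y) \}, \qquad u_0(x) = \sup_{y \in \mathbb{R}^d} \{ u_T(y) - c_{L,T}(x, y) \}.$$
If $(u_0, u_T)$ is an optimizer of the dual problem, then $u_0$ and $u_T$ are called Kantorovich potentials.
Remark 3.9. If the Monge optimal transport problem from $\rho_0$ to $\rho_T$ for the cost $c_{L,T}$ admits a minimizer $M$ (unique $\rho_0$-almost everywhere), then any optimal transport plan $\pi \in \Pi(\rho_0, \rho_T)$ is concentrated on the graph of $M$ [11]. Moreover, if $u_0$ and $u_T$ are Kantorovich potentials, then
$$u_T(y) - u_0(x) \le c_{L,T}(x, y)$$
for every $(x, y) \in \mathbb{R}^d \times \mathbb{R}^d$, and we have equality at $y = M(x)$ for $\rho_0$-almost every $x$ (see [21] Theorem 5.10).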

Eulerian formulation
The paper by Benamou and Brenier [4] is one of the earliest works establishing the Eulerian formulation and its connection to Hamilton-Jacobi equations. Subsequently, the relationships between the different formulations of the optimal transport problem were further studied (for instance, [5]).
In particular, the Eulerian view establishes the displacement interpolant as a solution to the continuity equation. First, we state some basic facts about the Hamiltonian.
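The Hamiltonian associated with the Tonelli Lagrangian $L$ is defined as the Legendre transform of $L$ with respect to the variable $v$, i.e.
$$H(x, p) = \sup_{v \in \mathbb{R}^d} \{ \langle p, v \rangle - L(x, v) \}.$$
Thus, the Hamiltonian $H$ satisfies the Fenchel-Young inequality
$$\langle p, v \rangle \le H(x, p) + L(x, v)$$
for all $x, v, p \in \mathbb{R}^d$, with equality if and only if $p = \nabla_v L(x, v)$.
Let $u_0 : \mathbb{R}^d \to [-\infty, +\infty]$ be a function and $T > 0$. We define the Lax-Oleinik evolution $u : [0, T] \times \mathbb{R}^d \to [-\infty, +\infty]$ of $u_0$ by
$$u(t, x) := \inf\Big\{ u_0(\gamma(0)) + \int_0^t L(\gamma(s), \dot\gamma(s)) \, ds \ :\ \gamma \in W^{1,1}([0, t]; \mathbb{R}^d),\ \gamma(t) = x \Big\}. \qquad (3.11)$$
It is known that if $u$ is finite, then it is a viscosity solution of the Hamilton-Jacobi equation
$$\partial_t u + H(x, \nabla u) = 0 \qquad (3.12)$$
(see [9] Section 7.2 and [10] Theorem 1.1).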
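To make the Legendre transform and the Lax-Oleinik evolution concrete, the following sketch works them out for a mechanical Lagrangian. It assumes $L(x,v) = \frac12 |v|^2 - U(x)$ with a potential $U$ for which $L$ is Tonelli, the Hamilton-Jacobi equation in the form $\partial_t u + H(x, \nabla u) = 0$, and the Lax-Oleinik evolution defined as the infimum of $u_0(\gamma(0))$ plus the action over curves ending at $x$ at time $t$; these conventions are assumptions made for the illustration.

```latex
\begin{align*}
H(x,p) &= \sup_{v \in \mathbb{R}^d}
  \Big\{ \langle p, v \rangle - \tfrac12 |v|^2 + U(x) \Big\}
  = \tfrac12 |p|^2 + U(x)
  \qquad (\text{supremum attained at } v = p), \\
\langle p, v \rangle &\le \tfrac12 |p|^2 + \tfrac12 |v|^2
  = H(x,p) + L(x,v),
  \qquad \text{with equality iff } p = v = \nabla_v L(x,v), \\
0 &= \partial_t u + \tfrac12 |\nabla u|^2 + U(x)
  \qquad (\text{Hamilton--Jacobi equation for this } H). \\
\intertext{When $U \equiv 0$, minimizing first over curves with fixed
  endpoints gives the cost $|x-y|^2/(2t)$, so the evolution reduces to
  the Hopf--Lax formula}
u(t,x) &= \inf_{y \in \mathbb{R}^d}
  \Big\{ u_0(y) + \frac{|x-y|^2}{2t} \Big\}, \qquad t > 0.
\end{align*}
```

The sign of $U$ flips between $L$ and $H$, and the equality case of Fenchel-Young identifies momentum with velocity for this particular Lagrangian.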
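A curve $\gamma : [t_0, t_1] \to \mathbb{R}^d$ is called $(u, L)$-calibrated if $u(t_0, \gamma(t_0))$, $u(t_1, \gamma(t_1))$ and $\int_{t_0}^{t_1} L(\gamma(t), \dot\gamma(t)) \, dt$ are all finite and
$$u(t_1, \gamma(t_1)) - u(t_0, \gamma(t_0)) = \int_{t_0}^{t_1} L(\gamma(t), \dot\gamma(t)) \, dt.$$
In the following proposition, we mention some properties of $u$ that are of interest to us. The proofs can be found in [6], [9] and [10].
Proposition 3.12. Let $u$ be defined as in (3.11). If $u$ is finite, then the following hold: (i) $u$ is continuous and locally semi-concave on $(0, T) \times \mathbb{R}^d$.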

Generalized entropy functional and displacement Hessian
Otto calculus and Schachter's Eulerian calculus both allow for explicit computations, assuming that all relevant quantities possess sufficient regularity. However, the regularity of a displacement interpolant $\rho$ depends on the Lagrangian $L$, the initial and final densities $(\rho_0, \rho_T)$, and the optimal trajectories $\sigma$ (or the velocity field $V$ in the Eulerian framework). In general, the Kantorovich potential $u_0$ arising from an optimal transport problem induced by a Tonelli Lagrangian $L$ is only known to be semiconcave, differentiable $\mathcal{L}^d$-almost everywhere, and its gradient $\nabla u_0$ is only locally bounded (see [13] and [14] Appendix C). This implies that the initial velocity $V(0, x) = (\nabla_p H)(x, \nabla u_0(x))$ is only locally bounded. As such, even if the initial density $\rho_0$ is smooth, its regularity may fail to propagate along the displacement interpolant.
For our purpose of computing displacement Hessians, we require displacement interpolants to be of class $C^2$. Fortunately, such displacement interpolants do exist and we can construct them if we impose two additional criteria on $L$.
(L1) There exists c0 ≥ 0 and θ : In addition, θ is such that for any M > 0 there exists for all m ∈ [0, M ] and all r ≥ 0.
(L2) For any $r > 0$, there exists $C_r > 0$ such that
Some common examples of Tonelli Lagrangians satisfying these criteria include the Riemannian kinetic energy
$$L(x, v) = \tfrac12 g_x(v, v),$$
where $g_x$ denotes the underlying Riemannian metric tensor, and Lagrangians that arise from mechanics,
$$L(x, v) = \tfrac12 |v|^2 - U(x),$$
for some appropriate potential $U : \mathbb{R}^d \to \mathbb{R}$.
Let H be the corresponding Hamiltonian.
where $V : [0, +\infty) \times \mathbb{R}^d \to \mathbb{R}^d$ is a time-dependent vector field defined by (Here, $\Phi_1$ and $\Phi_2$ are the $x$ and $v$ components of $\Phi$ respectively.) If we let
Proof. Since $L$ and $u_0$ are both bounded below, the infimum in (3.11) is bounded below, and so $u$ is finite. From [10], $u$ is a continuous viscosity solution of the Hamilton-Jacobi equation (3.12) and we know that for each $(t, x) \in (0, +\infty) \times \mathbb{R}^d$, there exists a unique $(u, L)$-calibrated curve $\gamma_x$ Moreover, $(\nabla u)(s, \gamma_x(s))$ exists for all $s \in [0, t]$ and is given by
$$(\nabla u)(s, \gamma_x(s)) = (\nabla_v L)(\gamma_x(s), \dot\gamma_x(s)).$$
Since each $\gamma_x$ is necessarily an action-minimizing curve from $\gamma_x(0)$ to $\gamma_x(t) = x$, it is the unique solution curve to the Euler-Lagrange system
) be a compactly supported density. Then for any $T > 0$, there exists a
Proof. Let $u_0, u, V, \sigma$ be defined as in Lemma 4.1 and fix $T > 0$. For $t \in [0, T]$, define We claim that $\sigma$ is an optimizer of the Lagrangian optimal transport problem from $\rho_0$ to $\rho_T = \rho(T, \cdot)$, which would imply that $\rho$ is indeed a displacement interpolant. Let $\varphi : [0, T] \times \mathbb{R}^d \to \mathbb{R}^d$ satisfy the four conditions in Definition 3.6. By Lemma 4.1, $t \mapsto \sigma(t, x)$ is a $(u, L)$-calibrated curve for each $x \in \mathbb{R}^d$. Thus, for every By the definition of pushforward measure, the left-hand side of the last equality is where the last equality is due to the assumption that $\varphi(T, \cdot)_\# \rho_0 = \rho_T$ and $\varphi(0, x) = x$. By the definition of $u$ (i.e. (3.11)), we have that
$$u(T, \varphi(T, x)) - u(0, \varphi(0, x)) \le \int_0^T L(\varphi(t, x), \dot\varphi(t, x)) \, dt.$$
Since $\varphi$ was arbitrary, $\sigma$ is indeed an optimizer of the Lagrangian optimal transport problem from $\rho_0$ to $\rho_T$.

Displacement Hessian
Let $F \in C^2((0, +\infty)) \cap C([0, +\infty))$ be a function satisfying we define the generalized entropy functional
$$\mathcal{F}(\rho) := \int_{\mathbb{R}^d} F(\rho(x)) \, dx. \qquad (4.1)$$
This is well-defined at least on which is finite.

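Definition 4.4 (Displacement convexity). The generalized entropy functional $\mathcal{F}$ is said to be convex along a displacement interpolant $\rho_t$, $t \in [0, T]$, if $\mathcal{F}(\rho_t)$ is finite and
$$\mathcal{F}(\rho_t) \le \Big(1 - \frac{t}{T}\Big) \mathcal{F}(\rho_0) + \frac{t}{T} \mathcal{F}(\rho_T)$$
for every $t \in [0, T]$. $\mathcal{F}$ is said to be displacement convex if it is convex along every displacement interpolant (on which $\mathcal{F}$ is real-valued).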
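Remark 4.5. When the displacement interpolant is a "straight line", McCann proved that $\mathcal{F}$ is convex along it provided that $r \mapsto r^d F(r^{-d})$ is convex and nonincreasing on $(0, +\infty)$ [16]. In this context, a "straight line" displacement interpolant refers to one of the form
$$\rho_t = \Big( \Big(1 - \frac{t}{T}\Big) \mathrm{id} + \frac{t}{T} M \Big)_\# \rho_0,$$
where $M$ is the Monge map between $\rho_0$ and $\rho_T$.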
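McCann's criterion is easy to test on familiar integrands. The check below is a sketch that takes the criterion in its standard form, namely that $r \mapsto r^d F(r^{-d})$ be convex and nonincreasing on $(0, +\infty)$; the two choices of $F$ (Boltzmann entropy and a power law with exponent $m > 1$) and the auxiliary symbol $\Lambda$ are assumptions introduced for illustration.

```latex
\begin{align*}
\Lambda(r) &:= r^d F(r^{-d}), \qquad r \in (0,+\infty). \\
\intertext{Boltzmann entropy, $F(r) = r \log r$:}
\Lambda(r) &= r^d \cdot r^{-d} \log\big(r^{-d}\big) = -d \log r, \qquad
\Lambda'(r) = -\frac{d}{r} < 0, \qquad
\Lambda''(r) = \frac{d}{r^2} > 0. \\
\intertext{Power law, $F(r) = \dfrac{r^m}{m-1}$ with $m > 1$, writing
  $a := d(m-1) > 0$:}
\Lambda(r) &= \frac{r^{-a}}{m-1}, \qquad
\Lambda'(r) = -\frac{a\, r^{-a-1}}{m-1} < 0, \qquad
\Lambda''(r) = \frac{a(a+1)\, r^{-a-2}}{m-1} > 0.
\end{align*}
```

In both cases $\Lambda$ is convex and nonincreasing, so these integrands satisfy the criterion along straight-line interpolants.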
Along a suitable displacement interpolant $\rho_t$, if the map $t \mapsto \mathcal{F}(\rho_t)$ is $C^2$, then the condition that $\frac{d^2}{dt^2} \mathcal{F}(\rho_t) \ge 0$ ensures convexity of $\mathcal{F}$ along $\rho_t$. The following displacement Hessian formula is a special case of Theorem 4.3.2 of [19].
an optimizer of the Lagrangian optimal transport problem from $\rho_0$ to $\rho_T$. Let $V : [0, T] \times \mathbb{R}^d \to \mathbb{R}^d$ be defined as in Remark 3.14 so that $\rho, V$ satisfy the continuity equation $\dot\rho = -\nabla \cdot (\rho V)$. Assume that $\sigma$ and $V$ are $C^2$ at least on the set Then $\frac{d^2}{dt^2} \mathcal{F}(\rho)$ exists for every $t \in [0, T]$ and is given by where $G : [0, +\infty) \to \mathbb{R}$ is defined by and
Remark 4.7. The requirement that $\rho_0$ is compactly supported serves to ensure that $\mathcal{F}$ is finite along $\rho$. In addition, the compactness of $\operatorname{supp}(\rho_0)$ and the continuity of $\sigma$ together ensure that the set $\{\sigma(t, x) : t \in [0, T],\ x \in \operatorname{supp}(\rho_0)\}$ is compact. Thus, is bounded, up to a set of zero $\mathcal{L}^d$-measure. This means that $\frac{d^2}{dt^2} \mathcal{F}(\rho)$ exists for every $t \in [0, T]$ and satisfies
Remark 4.8. By Remark 3.14, for every $t \in [0, T]$, $V(t, \cdot)$ is uniquely defined on $\operatorname{supp}(\rho_t)$ $\rho_t$-almost everywhere. Thus, (4.3) is well-defined.
Lemma 5.2. Define $U(t, x) = (\nabla V)(t, \sigma(t, x))$. (5.4)
Proof. First, we note that since $\dot\sigma(t, x) = V(t, \sigma(t, x))$, we get and so Using the matrix identity We want to show that the term $\operatorname{tr}((\nabla V)^2) - \nabla \cdot W$ appearing in the displacement Hessian formula (4.3) arises from equation (5.5). Taking the trace of (5.5), we have (5.6) On the other hand, direct computation yields Since $V(t, \sigma(t, x)) = \dot\sigma(t, x)$ and $\sigma(0, x) = x$, we may restate the above equation as $+ \operatorname{tr}\big( (\nabla V)(t, x)^2 + A(x, V(t, x)) (\nabla V)(t, x) + B(x, V(t, x)) \big) = 0$. (5.7) Using the identities we see that By the computation of the displacement Hessian from the previous section, this is precisely $\operatorname{tr}((\nabla V)^2) - \nabla \cdot W$.
At this point, (5.7) holds for all time-dependent $C^2$ vector fields whose integral curves satisfy the Euler-Lagrange equation (3.3). To show that (5.7) holds for an arbitrary fixed vector field, we first need to make sense of the term $\dot V$ by introducing Definition 5.4.
Proposition 5.3. Given any fixed vector field $V_0 \in C^2(\mathbb{R}^d; \mathbb{R}^d)$, we may extend it for a short time to a unique time-dependent vector field $V(t, x)$, $t \in [0, \epsilon)$, with the following properties: The integral curves of $V$ satisfy the Euler-Lagrange equation, i.e. (5.8)
Definition 5.4. Given a Tonelli Lagrangian $L$, we define the operation as in Proposition 5.3.
Proof. By Proposition 5.3, we may extend $\xi$ for a short time to a time-dependent vector field $V(t, x)$, with $V(0, \cdot) = \xi$, whose integral curves satisfy the Euler-Lagrange equation. Thus, (5.7) holds for $V$ and we have To show that $K_x$ is intrinsic, we will show that the operator is invariant under a change of coordinates. By Definition 5.4 and the definition of divergence, is also coordinate-free.
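The Riccati-type structure behind (5.7) can be seen by hand in the flat case. The following is a sketch under the assumption $L(x,v) = \frac12 |v|^2$, for which the Euler-Lagrange equation reduces to $\ddot\gamma = 0$. The symbol $V_0$ for the initial velocity field is introduced here for illustration, and the statement that no lower-order terms appear reflects only this direct computation; the maps $A$ and $B$ of Lemma 5.1 are not reproduced in this section, so their vanishing for this Lagrangian is an expectation rather than a quoted result.

```latex
\begin{align*}
\sigma(t,x) &= x + t\,V_0(x), \qquad
  \nabla\sigma(t,x) = I + t\,\nabla V_0(x)
  \quad (\text{invertible for small } t), \\
\partial_t \nabla\sigma
  &= \nabla\big( V(t,\sigma) \big)
   = (\nabla V)(t,\sigma)\,\nabla\sigma
  \ \Longrightarrow\
  U(t,x) := (\nabla V)(t,\sigma(t,x))
   = \nabla V_0(x)\,\big( I + t\,\nabla V_0(x) \big)^{-1}, \\
\dot U &= -\,\nabla V_0\,(I + t\,\nabla V_0)^{-1}\,
          \nabla V_0\,(I + t\,\nabla V_0)^{-1} = -\,U^2, \\
0 &= \frac{d}{dt}\operatorname{tr} U + \operatorname{tr}\big(U^2\big),
  \qquad
  \operatorname{tr}\big(U^2\big)\Big|_{t=0}
   = \operatorname{tr}\big( \nabla V_0(x)^2 \big).
\end{align*}
```

At $t = 0$ the quadratic term is $\operatorname{tr}(\nabla V_0(x)^2)$, which matches the leading term of $K_x(\xi)$ with $\xi = V_0$.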
Theorem 5.8 (Formula for K where all terms involving L are evaluated at (x, ξ(x)).
Proof.See Appendix.
In conclusion, the displacement Hessian formula in (4.3) can be written as (5.12)

Displacement convexity for a non-Riemannian Lagrangian cost
In this section, we provide an example of a Lagrangian cost that is not a squared Riemannian distance. We prove using a perturbation argument that the corresponding generalized curvature is non-negative and thus the generalized entropy functional is convex along $C^2$ displacement interpolants. Let $g(x)$ be a positive definite matrix for every $x \in \mathbb{R}^d$ so that $\tfrac12 \langle v, g(x) v \rangle$, $v \in \mathbb{R}^d$, defines a Riemannian metric. Let $g_{ij} = g_{ij}(x)$ denote the $ij$-th entry of $g(x)$ and $g^{ij} = g^{ij}(x)$ denote the $ij$-th entry of the inverse matrix $g(x)^{-1}$. Further assume that the $g_{ij}$ are bounded with bounded derivatives, and that the corresponding Bakry-Emery tensor (denoted $BE_g$) is bounded from below. That is, Define the Lagrangian and the perturbed Lagrangian where $\varphi : \mathbb{R}^d \to \mathbb{R}$ is a smooth perturbation (for instance, take $\varphi$ to be of Schwartz class). Using Theorem 5.8, the respective generalized curvatures are given by where $c_g > 0$ is a constant depending on $g$. Fix $\epsilon > 0$ such that $\epsilon \le \min\{$ Since $K_x(\xi) \ge c_g \|\nabla\xi\|^2 + k_g \|\xi\|^2$, we conclude that $\tilde{K}_x(\xi) \ge 0$.
In the computations below, time derivatives of ξ will be treated by extending ξ for a short time (in the sense of Proposition 5.3).

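Definition 3.4 (Monge problem). Let $\rho_0, \rho_T \in P_{ac}$. The Monge optimal transport problem from $\rho_0$ to $\rho_T$ for the cost $c_{L,T}$ is the minimization problem
$$\inf_M \Big\{ \int_{\mathbb{R}^d} c_{L,T}(x, M(x)) \rho_0(x) \, dx \ :\ M_\# \rho_0 = \rho_T,\ M \text{ Borel measurable} \Big\}. \qquad (3.4)$$
Definition 3.5 (Kantorovich problem). Let $\Pi(\rho_0, \rho_T)$ denote the set of all probability measures on $\mathbb{R}^d \times \mathbb{R}^d$ with marginals $\rho_0$ and $\rho_T$. Then the Kantorovich optimal transport problem from $\rho_0$ to $\rho_T$ for the cost $c_{L,T}$ is the minimization problem
$$\inf_\pi \Big\{ \int_{\mathbb{R}^d \times \mathbb{R}^d} c_{L,T}(x, y) \, d\pi(x, y) \ :\ \pi \in \Pi(\rho_0, \rho_T) \Big\}. \qquad (3.5)$$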