Abstract
We propose a generalized curvature that is motivated by the optimal transport problem on \({\mathbb {R}}^d\) with cost induced by a Tonelli Lagrangian L. We show that non-negativity of the generalized curvature implies displacement convexity of the generalized entropy functional on the L-Wasserstein space along \(C^2\) displacement interpolants.
1 Introduction
Given a Riemannian manifold (M, g), one may consider the optimal transport problem with cost given by squared Riemannian distance. This induces the 2-Wasserstein distance \(W_2\) on \({\mathcal {P}}_2(M)\), the space of probability measures on M with finite second moments (i.e. probability measures \(\mu \) such that
$$\begin{aligned} \int _M d(x,x_0)^2\,d\mu (x)<+\infty \end{aligned}$$for every \(x_0\in M\), where d denotes the Riemannian distance). The metric space \(({\mathcal {P}}_2(M),W_2)\) is called the 2-Wasserstein space, and is known to be a geodesic space ([21] Chapter 7).
In [17], Otto proposed that \({\mathcal {P}}_2 (M)\) admits a formal Riemannian structure and developed a formal calculus on \({\mathcal {P}}_2 (M)\). This later became what is known as Otto calculus [21] and was made rigorous by Ambrosio-Gigli-Savaré [2]. In particular, Otto calculus allows one to compute displacement Hessians of functionals along geodesics in \({\mathcal {P}}_2 (M)\). This is useful for characterizing a displacement convex functional (i.e. convex along every geodesic) by the non-negativity of its displacement Hessian. In a seminal work by Otto and Villani [18], it was shown that the displacement convexity of the entropy functional is related to the Ricci curvature of (M, g). Since then, the notion of displacement convexity has been useful in many other areas. For instance, it has inspired new heuristics and proofs of various functional inequalities [1, 8].
Further advances have been made towards understanding the relationships between the geometry of the underlying space and the induced geometry of \({\mathcal {P}}(M)\), the space of probability measures on M. In his Ph.D. thesis [19], Schachter studied the optimal transport problem on \({\mathbb {R}}^d\) with cost induced by a Tonelli Lagrangian. The case \(d=1\) was considered in [20], and this work was later used in [3] and [15].
In his work, Schachter developed an Eulerian calculus, extending the Otto calculus. Among the other contributions of his thesis, Schachter derived a canonical form for the displacement Hessians of functionals. Using Eulerian calculus, he found a new class of displacement convex functionals on \(S^1\) [20], which includes those found by Carrillo and Slepčev in [7]. In the case when the cost is given by squared Riemannian distance, Schachter proved that his displacement Hessian agrees with Villani’s displacement Hessian in [21], which is a quadratic form involving the Bakry–Emery tensor.
Summary of main results: In this manuscript, a generalized notion of curvature \({\mathcal {K}}_x\) (Definition 5.6) is proposed for the manifold \(M={\mathbb {R}}^d\) equipped with a general Tonelli Lagrangian L, and is given by
for vector fields \(\xi \in C^2({\mathbb {R}}^d;{\mathbb {R}}^d)\). The maps A and B are defined in Lemma 5.1. We prove that this generalized curvature is independent of the choice of coordinates (Theorem 5.7). In the case where \(\xi \) takes a special form (that naturally arises from the optimal transport problem), we provide an explicit formula for \({\mathcal {K}}_x\) in Theorem 5.8. Lastly, we furnish an example of a Lagrangian cost with non-negative generalized curvature that is not given by squared Riemannian distance. This induces a geometry on the L-Wasserstein space where the generalized entropy functional (4.1) is displacement convex along suitable curves.
This paper is organized as follows: In the first four sections, we will review the optimal transport problem induced by a Tonelli Lagrangian, up to and including the notion of displacement convexity. The thesis of Schachter [19] provides a good overview of key definitions and results needed. Section 2 covers some basic notation. Section 3 reviews some ideas from [19]; chief among them is the relationship between the various formulations of the optimal transport problem. Section 4 discusses functionals along curves in Wasserstein space, including a computation of the displacement Hessian. Section 5 introduces the definition and various properties of the generalized curvature \({\mathcal {K}}_x\). Lastly, Section 6 provides an example of a Lagrangian with everywhere non-negative generalized curvature.
2 Notation
We will take our underlying manifold to be \(M = {\mathbb {R}}^d\) and identify its tangent bundle \(T{\mathbb {R}}^d\cong {\mathbb {R}}^d \times {\mathbb {R}}^d\). Let \({\mathcal {P}}^{ac} = {\mathcal {P}}^{ac}({\mathbb {R}}^d)\) denote the set of probability measures on \({\mathbb {R}}^d\) that are absolutely continuous with respect to the \(d\)-dimensional Lebesgue measure (denoted \({\mathcal {L}}^d\)). An element of \({\mathcal {P}}^{ac}\) will often be identified with its density \(\rho \). Given \(\rho \in {\mathcal {P}}^{ac}\) and a measurable function \(T:{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\), \(T_{\#}\rho \) will denote the push-forward measure of \(\rho \) under T.
Definition 2.1
(Tonelli Lagrangian) A function \(L:{\mathbb {R}}^d \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}\) is called a Tonelli Lagrangian if it satisfies the following conditions:
-
(i)
\(L\in C^2 ({\mathbb {R}}^d \times {\mathbb {R}}^d)\).
-
(ii)
For every \(x\in {\mathbb {R}}^d\), the function \(L(x,\cdot ):{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) is strictly convex.
-
(iii)
L has asymptotic superlinear growth in the variable v, in the sense that there exist a constant \(c_0\in {\mathbb {R}}\) and a function \(\theta :{\mathbb {R}}^d \rightarrow [0,+\infty )\) with
$$\begin{aligned} \lim _{|v|\rightarrow +\infty }\frac{\theta (v)}{|v|}=+\infty \end{aligned}$$such that
$$\begin{aligned} L(x,v)\ge c_0 + \theta (v) \end{aligned}$$(2.1)for all \((x,v)\in {\mathbb {R}}^d \times {\mathbb {R}}^d\).
Throughout this manuscript, \(L \in C^k ({\mathbb {R}}^d \times {\mathbb {R}}^d)\), \(k\ge 3\) will be assumed to be a Tonelli Lagrangian and we will work with the underlying space \(({\mathbb {R}}^d, L)\). We denote the gradient with respect to the x (position) and v (velocity) variables by \(\nabla _x L, \nabla _v L\in {\mathbb {R}}^d\) respectively. Similarly, the second-order derivatives will be denoted by \(\nabla _{xx}^2\,L\), \(\nabla _{vv}^2\,L\), \(\nabla _{xv}^2\,L\), \(\nabla _{vx}^2L = \nabla _{xv}^2 L^{\top } \in {\mathbb {R}}^{d\times d}\). We will assume that the Hessian \(\nabla _{vv}^2 L(x,v)\) is positive-definite for every \((x,v)\in {\mathbb {R}}^d \times {\mathbb {R}}^d\). The time derivative of a function f(t) will be denoted by \({\dot{f}} = \frac{df}{dt}\).
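As a quick illustration of Definition 2.1 (an example added for orientation, not taken from the source), consider the quadratic Lagrangian \(L(x,v) = \frac{1}{2}|v|^2\). The three conditions can be checked directly. The choices \(c_0 = 0\) and \(\theta (v) = \frac{1}{2}|v|^2\) are assumptions made for this sketch; other choices would work equally well.
$$\begin{aligned} \text {(i)}\quad&L\in C^{\infty }({\mathbb {R}}^d \times {\mathbb {R}}^d)\subset C^2({\mathbb {R}}^d \times {\mathbb {R}}^d),\\ \text {(ii)}\quad&\nabla _{vv}^2 L(x,v) = I_d \text { is positive-definite, so } L(x,\cdot ) \text { is strictly convex},\\ \text {(iii)}\quad&\lim _{|v|\rightarrow +\infty }\frac{\theta (v)}{|v|} = \lim _{|v|\rightarrow +\infty }\frac{|v|}{2} = +\infty \quad \text {and}\quad L(x,v) = \tfrac{1}{2}|v|^2 \ge c_0 + \theta (v). \end{aligned}$$
This Lagrangian also satisfies the standing assumption that \(\nabla _{vv}^2 L(x,v)\) is positive-definite at every point.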
3 Optimal transport problem induced by a Tonelli Lagrangian
3.1 Lagrangian optimal transport problem
The goal of this section is to establish the different formulations of the optimal transport problem with cost induced by a Tonelli Lagrangian L. In this first subsection, the Lagrangian optimal transport problem will be presented. We will also briefly recall the classical Monge–Kantorovich theory. Most of the material in this subsection can be found in [5, 11, 19, 21]. In Section 3.2 we will present an Eulerian perspective and its connections to viscosity solutions of the Hamilton–Jacobi equation.
Definition 3.1
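(Action functional) Let \(T>0\) and \(\gamma \in W^{1,1}([0,T];{\mathbb {R}}^d)\) be a curve. The action of L on \(\gamma \) is
$$\begin{aligned} {\mathcal {A}}_{L,T}(\gamma ) = \int _0^T L(\gamma (t),{\dot{\gamma }}(t))\,dt. \end{aligned}$$
This induces a cost function \(c_{L,T}: {\mathbb {R}}^d \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}\) given by
$$\begin{aligned} c_{L,T}(x,y) = \inf \left\{ {\mathcal {A}}_{L,T}(\gamma )\;:\;\gamma \in W^{1,1}([0,T];{\mathbb {R}}^d),\ \gamma (0)=x,\ \gamma (T)=y\right\} . \end{aligned}$$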
A curve \(\gamma \) with \(\gamma (0) = x, \gamma (T) = y\) is called an action-minimizing curve from x to y if \({\mathcal {A}}_{L,T}(\gamma ) = c_{L,T}(x,y)\).
Theorem 3.2
([11] Appendix B) For any \(x,y\in {\mathbb {R}}^d\), there exists an action-minimizing curve \(\gamma \) from x to y such that
-
(i)
\({\mathcal {A}}_{L,T}(\gamma ) = c_{L,T}(x,y)\)
-
(ii)
\(\gamma \in C^k ([0,T];{\mathbb {R}}^d)\)
-
(iii)
\(\gamma \) satisfies the Euler–Lagrange equation
$$\begin{aligned} \frac{d}{dt}((\nabla _v L)(\gamma ,{\dot{\gamma }})) = (\nabla _x L)(\gamma ,{\dot{\gamma }}) \end{aligned}$$(3.3)
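To make the Euler–Lagrange equation (3.3) concrete, here is a sketch for the mechanical Lagrangian \(L(x,v) = \frac{1}{2}|v|^2 - U(x)\). The potential \(U\in C^k({\mathbb {R}}^d)\) is hypothetical and is assumed to be bounded above, so that L remains a Tonelli Lagrangian.
$$\begin{aligned} (\nabla _v L)(x,v)&= v,\qquad (\nabla _x L)(x,v) = -\nabla U(x),\\ \frac{d}{dt}((\nabla _v L)(\gamma ,{\dot{\gamma }}))&= (\nabla _x L)(\gamma ,{\dot{\gamma }}) \quad \Longleftrightarrow \quad \ddot{\gamma }(t) = -\nabla U(\gamma (t)). \end{aligned}$$
In the special case \(U\equiv 0\), the action-minimizing curve from x to y is the straight line \(\gamma (t) = x + \frac{t}{T}(y-x)\), and its action is \(\int _0^T \frac{1}{2}\left| \frac{y-x}{T}\right| ^2 dt = \frac{|y-x|^2}{2T}\).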
Definition 3.3
(Lagrangian flow) The Lagrangian flow \(\Phi :[0,+\infty )\times {\mathbb {R}}^d\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\times {\mathbb {R}}^d\) is defined by
We refer the reader to [11] and [19] for further properties of the cost function \(c_{L,T}\). In particular, it is locally Lipschitz and thus differentiable almost everywhere by Rademacher’s theorem. Moreover, if either \(\frac{\partial }{\partial x}c_{L,T}(x_0,y_0)\) or \(\frac{\partial }{\partial y}c_{L,T}(x_0,y_0)\) exists at \((x_0,y_0)\), then the action-minimizing curve from \(x_0\) to \(y_0\) is unique. With the cost \(c_{L,T}\), we may state the Monge problem and the Kantorovich problem.
Definition 3.4
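(Monge problem) Let \(\rho _0,\rho _T\in {\mathcal {P}}^{ac}\). The Monge optimal transport problem from \(\rho _0\) to \(\rho _T\) for the cost \(c_{L,T}\) is the minimization problem
$$\begin{aligned} \inf \left\{ \int _{{\mathbb {R}}^d} c_{L,T}(x,M(x))\,\rho _0(x)\,dx\;:\; M:{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d \text { Borel measurable},\ M_{\#}\rho _0 = \rho _T\right\} . \end{aligned}$$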
Definition 3.5
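(Kantorovich problem) Let \(\Pi (\rho _0,\rho _T)\) denote the set of all probability measures on \({\mathbb {R}}^d \times {\mathbb {R}}^d\) with marginals \(\rho _0\) and \(\rho _T\). Then the Kantorovich optimal transport problem from \(\rho _0\) to \(\rho _T\) for the cost \(c_{L,T}\) is the minimization problem
$$\begin{aligned} \inf \left\{ \int _{{\mathbb {R}}^d \times {\mathbb {R}}^d} c_{L,T}(x,y)\,d\pi (x,y)\;:\;\pi \in \Pi (\rho _0,\rho _T)\right\} . \end{aligned}$$(3.5)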
A minimizer \(\pi \) is called an optimal transport plan. The infimum in (3.5) is denoted \(W_{c_{L,T}}(\rho _0, \rho _T)\) and it is called the Kantorovich cost from \(\rho _0\) to \(\rho _T\).
If \(W_{c_{L,T}}(\rho _0, \rho _T)\) is finite, then the Monge problem with cost \(c_{L,T}\) admits an optimizer M (called the Monge map) that is unique \(\rho _0-\)almost everywhere [11]. Note that the Monge problem is only concerned with the initial and final states (i.e. \(\rho _0,\rho _T\)). To interpolate between \(\rho _0\) and \(\rho _T\) in a way that respects the cost \(c_{L,T}\), we consider the Lagrangian formulation of the optimal transport problem induced by L.
Definition 3.6
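(Lagrangian optimal transport problem) Let \(\rho _0,\rho _T\in {\mathcal {P}}^{ac}\). The Lagrangian optimal transport problem from \(\rho _0\) to \(\rho _T\) induced by the Tonelli Lagrangian L is the minimization problem
$$\begin{aligned} \inf _{\sigma }\int _{{\mathbb {R}}^d}\int _0^T L(\sigma (t,x),{\dot{\sigma }}(t,x))\,dt\,\rho _0(x)\,dx \end{aligned}$$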
where the infimum is taken over all \(\sigma :[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) such that
-
(i)
\(\sigma (\cdot ,x)\in W^{1,1}([0,T];{\mathbb {R}}^d)\) for every \(x\in {\mathbb {R}}^d\)
-
(ii)
\(\sigma (t,\cdot )\) is Borel measurable for every \(t\in [0,T]\)
-
(iii)
\(\sigma (0,x) = x\) for every \(x\in {\mathbb {R}}^d\)
-
(iv)
\(\sigma (T,\cdot )_{\#}\rho _0 = \rho _T\)
In [19], it is shown that if \(W_{c_{L,T}}(\rho _0, \rho _T)\) is finite, then the Lagrangian optimal transport problem admits an optimizer \(\sigma \) such that \(\sigma (\cdot ,x)\) is an action-minimizing curve from \(\sigma (0,x)=x\) to \(\sigma (T,x)\) for every \(x\in {\mathbb {R}}^d\). Moreover, the map \(\sigma (T,\cdot )\) coincides with the Monge map M and so is unique \(\rho _0-\)almost everywhere. With an optimizer \(\sigma \), we can define the notion of displacement interpolation, which is the analogue of a geodesic in \({\mathcal {P}}^{ac}\).
Definition 3.7
(Displacement interpolant) Let \(\rho _0,\rho _T\in {\mathcal {P}}^{ac}\) be such that the Kantorovich cost \(W_{c_{L,T}}(\rho _0, \rho _T)\) is finite. Let \(\sigma \) be an optimizer of the Lagrangian optimal transport problem. Then the displacement interpolant between \(\rho _0\) and \(\rho _T\) for the cost \(c_{L,T}\) is the measure-valued map
$$\begin{aligned} [0,T]\ni t\mapsto \mu _t = \sigma (t,\cdot )_{\#}\rho _0. \end{aligned}$$
Since \(\mu _t\) is absolutely continuous with respect to \({\mathcal {L}}^d\) for every \(t\in [0,T]\) ([11] Theorem 5.1), we will also identify \(\mu _t\) with its density \(\rho _t\). Subsequently, we will always denote a displacement interpolant by a function \(\rho : [0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}\) and use the notation \(\rho _t = \rho (t,\cdot )\) whenever the intention is clear. Since the maps \(\sigma (t,\cdot )\) are uniquely defined (\(\rho _0-\)almost everywhere) on the support of \(\rho _0\), the displacement interpolant is well-defined. Moreover, the map \(\sigma \big |_{[0,t]\times {\mathbb {R}}^d}\) for an intermediary time \(t\in [0,T]\) optimizes the Lagrangian optimal transport problem from \(\rho _0\) to \(\rho _t\), i.e.
In order to discuss the Eulerian formulation of the optimal transport problem, we need to introduce the Kantorovich duality. We do so in accordance with the convention of [21].
Theorem 3.8
(Kantorovich duality) The Kantorovich optimal transport problem from \(\rho _0\) to \(\rho _T\) for the cost \(c_{L,T}\) has a dual formulation
Moreover, we may assume that
If \((u_0, u_T)\) is an optimizer of the dual problem, then \(u_0\) and \(u_T\) are called Kantorovich potentials.
Remark 3.9
If the Monge optimal transport problem from \(\rho _0\) to \(\rho _T\) for the cost \(c_{L,T}\) admits a minimizer M (unique \(\rho _0-\)almost everywhere), then any optimal transport plan \(\pi \in \Pi (\rho _0,\rho _T)\) is concentrated on the graph of M [11]. Moreover, if \(u_0\) and \(u_T\) are Kantorovich potentials, then
for every \((x,y)\in {\mathbb {R}}^d \times {\mathbb {R}}^d\) and we have equality
for x \(\rho _0-\)almost everywhere (see [21] Theorem 5.10).
3.2 Eulerian formulation
The paper by Benamou and Brenier [4] is one of the earliest works establishing the Eulerian formulation and its connection to Hamilton–Jacobi equations. Subsequently, the relationships between the different formulations of the optimal transport problem were further studied (for instance, [5]).
In particular, the Eulerian view establishes the displacement interpolant as a solution to the continuity equation. First, we state some basic facts about the Hamiltonian.
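The Hamiltonian associated with the Tonelli Lagrangian L is defined as the Legendre transform of L with respect to the variable v, i.e.
$$\begin{aligned} H(x,p) = \sup _{v\in {\mathbb {R}}^d}\left\{ \langle p,v\rangle - L(x,v)\right\} . \end{aligned}$$
Thus, the Hamiltonian H satisfies the Fenchel–Young inequality
$$\begin{aligned} \langle p,v\rangle \le H(x,p) + L(x,v) \end{aligned}$$
for all \(x,v,p\in {\mathbb {R}}^d\), with equality if and only if
$$\begin{aligned} p = (\nabla _v L)(x,v). \end{aligned}$$
Moreover, \(H\in C^k ({\mathbb {R}}^d \times {\mathbb {R}}^d)\) and
$$\begin{aligned} (\nabla _p H)(x,(\nabla _v L)(x,v)) = v. \end{aligned}$$(3.10)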
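Let \(u_0:{\mathbb {R}}^d \rightarrow [-\infty ,+\infty ]\) be a function and \(T>0\). We define the Lax–Oleinik evolution \(u:[0,T]\times {\mathbb {R}}^d\rightarrow [-\infty ,+\infty ]\) of \(u_0\) by
$$\begin{aligned} u(t,x) = \inf \left\{ u_0(\gamma (0)) + \int _0^t L(\gamma (s),{\dot{\gamma }}(s))\,ds\;:\;\gamma \in W^{1,1}([0,t];{\mathbb {R}}^d),\ \gamma (t) = x\right\} \end{aligned}$$(3.11)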
so that \(u(0,x) = u_0(x)\).
Remark 3.10
Since L is bounded below, if there exists some \((t^*,x^*)\in (0,T]\times {\mathbb {R}}^d\) such that \(u(t^*,x^*)\) is finite, then u is finite on all of \([0,T]\times {\mathbb {R}}^d\).
It is known that if u is finite, then it is a viscosity solution of the Hamilton–Jacobi equation
$$\begin{aligned} \frac{\partial u}{\partial t} + H(x,\nabla u) = 0 \end{aligned}$$(3.12)
(see [9] Section 7.2 and [10] Theorem 1.1).
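As a sketch of how the Legendre transform and the Hamilton–Jacobi equation look in a concrete case, take again the illustrative Lagrangian \(L(x,v) = \frac{1}{2}|v|^2 - U(x)\), where the potential U is a hypothetical choice made only for this example. The supremum defining H is attained at \(v = p\), which gives
$$\begin{aligned} H(x,p)&= \sup _{v\in {\mathbb {R}}^d}\left\{ \langle p,v\rangle - \tfrac{1}{2}|v|^2 + U(x)\right\} = \tfrac{1}{2}|p|^2 + U(x),\\ (\nabla _p H)(x,p)&= p,\qquad (\nabla _v L)(x,v) = v,\\ \frac{\partial u}{\partial t} + H(x,\nabla u)&= \frac{\partial u}{\partial t} + \tfrac{1}{2}|\nabla u|^2 + U(x) = 0. \end{aligned}$$
In particular, \(\nabla _p H\) and \(\nabla _v L\) are inverse to each other in the second variable, which is the general relationship between a Tonelli Lagrangian and its Hamiltonian.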
Definition 3.11
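(Calibrated curve) Let \(f:[t_0,t_1]\times {\mathbb {R}}^d \rightarrow [-\infty ,+\infty ]\) be a function. A curve \(\gamma \in W^{1,1}([t_0,t_1];{\mathbb {R}}^d)\) is called a \((f,L)-\)calibrated curve if \(f(t_0,\gamma (t_0))\), \(f(t_1,\gamma (t_1))\) and \(\int _{t_0}^{t_1}L(\gamma (t),{\dot{\gamma }}(t))\;dt\) are all finite and
$$\begin{aligned} f(t_1,\gamma (t_1)) - f(t_0,\gamma (t_0)) = \int _{t_0}^{t_1}L(\gamma (t),{\dot{\gamma }}(t))\;dt. \end{aligned}$$(3.13)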
In the following proposition, we mention some properties of u that are of interest to us. The proofs can be found in [6, 9, 10].
Proposition 3.12
Let u be defined as in (3.11). If u is finite, then the following hold:
-
(i)
u is continuous and locally semi-concave on \((0,T)\times {\mathbb {R}}^d\).
-
(ii)
u is a viscosity solution of the Hamilton–Jacobi equation
$$\begin{aligned} \frac{\partial u}{\partial t} + H(x,\nabla u) = 0. \end{aligned}$$ -
(iii)
If \([a,b]\subset [0,T]\) and \(\gamma :[a,b]\rightarrow {\mathbb {R}}^d\) is a \((u,L)-\)calibrated curve, then u is differentiable at \((t,\gamma (t))\) for every \(t\in [a,b]\) and we have
$$\begin{aligned} \nabla u(t,\gamma (t)) = (\nabla _v L)(\gamma (t),{\dot{\gamma }}(t)). \end{aligned}$$(3.14) -
(iv)
If u is differentiable at \((t^*, x^*)\), then there is at most one \((u,L)-\)calibrated curve \(\gamma :[a,b]\rightarrow {\mathbb {R}}^d\) with \(t^* \in [a,b]\) and \(\gamma (t^*) = x^*\).
We now return to the optimal transport problem from \(\rho _0\in {\mathcal {P}}^{ac}\) to \(\rho _T\in {\mathcal {P}}^{ac}\) induced by L. Suppose that \(W_{c_{L,T}}(\rho _0,\rho _T)\) is finite and let \(u_0\in L^1(\rho _0)\) be a Kantorovich potential.
Proposition 3.13
Let \(\sigma : [0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be an optimizer of the Lagrangian optimal transport problem from \(\rho _0\) to \(\rho _T\) induced by L. Let \((u_0,u_T)\) be the corresponding Kantorovich potentials and \(u:[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}\) be the Lax–Oleinik evolution of \(u_0\). Then \((\nabla u)(t,\sigma (t,x))\) exists for all \(t\in [0,T]\) and x \(\rho _0-\)almost everywhere. In addition, \(\sigma \) satisfies the relation
$$\begin{aligned} {\dot{\sigma }}(t,x) = (\nabla _p H)(\sigma (t,x),(\nabla u)(t,\sigma (t,x))). \end{aligned}$$(3.15)
Proof
By Remark 3.10, u is finite since \(u(T,\cdot ) = u_T \in L^1(\rho _T)\). By Remark 3.9, the Kantorovich potentials \((u_0,u_T)\) satisfy
for x \(\rho _0-\)almost everywhere (recall that \(\sigma (T,\cdot )\) coincides with the Monge map). Thus, for \(\rho _0-\)almost every x, the curve \(t\mapsto \sigma (t,x)\) is a \((u,L)-\)calibrated curve and so \((\nabla u)(t,\sigma (t,x)) = (\nabla _v L)(\sigma (t,x),{\dot{\sigma }}(t,x))\) exists by Proposition 3.12. Using identity (3.10), we get
\(\square \)
Remark 3.14
Let \(V:[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be a time-dependent vector field that agrees with \((\nabla _p H)(x,(\nabla u)(t,x))\) on the set
for each \(t\in [0,T]\). Using the definition of the displacement interpolant \(\rho _t = \sigma (t,\cdot )_{\#}\rho _0\), and the fact that \((\nabla u)(t,\sigma (t,x))\) exists for all \(t\in [0,T]\) and \(\rho _0-\)almost every \(x\in {\mathbb {R}}^d\), we have that the set
is a set of zero \(\rho _t-\)measure. Thus, \(S_t\) has full \(\rho _t-\)measure and so \(V(t,x) = (\nabla _p H)(x,(\nabla u)(t,x))\) \(\rho _t-\)almost everywhere. By (3.15), \({\dot{\sigma }}(t,x) = V(t,\sigma (t,x))\) for all \(t\in [0,T]\) and \(\rho _0-\)almost every \(x\in {\mathbb {R}}^d\). This means that \(\rho _t\) and V solve the continuity equation
$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot (\rho V) = 0 \end{aligned}$$
in the sense of distributions ([19] Proposition 3.4.3).
4 Generalized entropy functional and displacement Hessian
Otto calculus and Schachter’s Eulerian calculus both allow for explicit computations, assuming that all relevant quantities possess sufficient regularity. However, the regularity of a displacement interpolant \(\rho \) depends on the Lagrangian L, the initial and final densities \((\rho _0, \rho _T)\), and the optimal trajectories \(\sigma \) (or the velocity field V in the Eulerian framework). In general, the Kantorovich potential \(u_0\) arising from an optimal transport problem induced by a Tonelli Lagrangian L is only known to be semiconcave, differentiable \({\mathcal {L}}^d -\)almost everywhere, and its gradient \(\nabla u_0\) is only locally bounded (see [13] and [14] Appendix C). This implies that the initial velocity \(V(0,x) = (\nabla _p H)(x,\nabla u_0 (x))\) is only locally bounded. As such, even if the initial density \(\rho _0\) is smooth, its regularity may fail to propagate along the displacement interpolant.
For our purpose of computing displacement Hessians, we require displacement interpolants to be of class \(C^2\). Fortunately, such displacement interpolants do exist and we can construct them if we impose two additional criteria on L.
4.1 \(C^2\) displacement interpolants
Let \(L\in C^{k+1} ({\mathbb {R}}^d \times {\mathbb {R}}^d)\), \(k\ge 3\) be a Tonelli Lagrangian satisfying two additional criteria (see [6] Chapters 6.3, 6.4).
-
(L1)
There exists \({\tilde{c}}_0\ge 0\) and \({\tilde{\theta }}:[0,\infty )\rightarrow [0,\infty )\) with
$$\begin{aligned} \lim _{r\rightarrow +\infty }\frac{{\tilde{\theta }}(r)}{r} = +\infty \end{aligned}$$such that
$$\begin{aligned} L(x,v)\ge {\tilde{\theta }}(|v|)-{\tilde{c}}_0. \end{aligned}$$In addition, \({\tilde{\theta }}\) is such that for any \(M>0\) there exists \(K_M >0\) with
$$\begin{aligned} {\tilde{\theta }}(r+m)\le K_M[1+{\tilde{\theta }}(r)] \end{aligned}$$for all \(m\in [0,M]\) and all \(r\ge 0\).
-
(L2)
For any \(r>0\), there exists \(C_r > 0\) such that
$$\begin{aligned} |(\nabla _x L)(x,v)| + |(\nabla _v L)(x,v)| < C_r {\tilde{\theta }}(|v|) \end{aligned}$$for all \(|x|\le r\), \(v\in {\mathbb {R}}^d\).
Some common examples of Tonelli Lagrangians satisfying these criteria include the Riemannian kinetic energy
$$\begin{aligned} L(x,v) = \tfrac{1}{2}g_x(v,v), \end{aligned}$$
where \(g_x\) denotes the underlying Riemannian metric tensor, and Lagrangians that arise from mechanics
$$\begin{aligned} L(x,v) = \tfrac{1}{2}|v|^2 - U(x) \end{aligned}$$
for some appropriate potential \(U:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\).
Let H be the corresponding Hamiltonian.
Lemma 4.1
Let \(u_0\in C^{k+1}({\mathbb {R}}^d)\) with \(u_0(x)\ge -{\tilde{c}}_0\) for all \(x\in {\mathbb {R}}^d\). Let \(u:[0,+\infty )\times {\mathbb {R}}^d \rightarrow [-\infty ,+\infty ]\) be the Lax–Oleinik evolution of \(u_0\), as defined in (3.11). For \(x\in {\mathbb {R}}^d\), consider the Lagrangian flow (introduced in Definition 3.3)
where \(V:[0,+\infty )\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) is a time-dependent vector field defined by
(Here, \(\Phi _1\) and \(\Phi _2\) are the x and v components of \(\Phi \) respectively.) If we let \(\sigma (t,x) = \Phi _1(t,x,V(0,x))\), then \({\dot{\sigma }}(t,x) = V(t,\sigma (t,x))\) for all \(t\in [0,+\infty )\), \(x\in {\mathbb {R}}^d\). Moreover, \(\sigma (t,\cdot ):{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) is a \(C^{k}-\)diffeomorphism for every \(t\in [0,+\infty )\).
Proof
Since L and \(u_0\) are both bounded below, we have
and so u is finite. From [10], u is a continuous viscosity solution of the Hamilton–Jacobi equation (3.12) and we know that for each \((t,x)\in (0,+\infty )\times {\mathbb {R}}^d\), there exists a unique \((u,L)-\)calibrated curve \(\gamma _x: [0,t]\rightarrow {\mathbb {R}}^d\) such that \(\gamma _x(t) = x\). Moreover, \((\nabla u)(s,\gamma _x (s))\) exists for all \(s\in [0,t]\) and is given by
Since each \(\gamma _x\) is necessarily an action-minimizing curve from \(\gamma _x (0)\) to \(\gamma _x (t) = x\), it is the unique solution curve to the Euler–Lagrange system
Therefore, \(\sigma (t,\cdot ):{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) is a bijection for all \(t\in [0,+\infty )\) and \({\dot{\sigma }}(t,x) = V(t,\sigma (t,x))\) for all \((t,x)\in [0,+\infty )\times {\mathbb {R}}^d\).
Lastly, since \(L\in C^{k+1} ({\mathbb {R}}^d \times {\mathbb {R}}^d)\) and \(u_0\in C^{k+1} ({\mathbb {R}}^d)\), we have that \(u\in C^{k+1} ([0,+\infty )\times {\mathbb {R}}^d)\) [6]. As \(\nabla _p H \in C^{k}({\mathbb {R}}^d \times {\mathbb {R}}^d; {\mathbb {R}}^d)\), we have \(V(t,\cdot )\in C^{k}({\mathbb {R}}^d; {\mathbb {R}}^d)\) and so \(\sigma (t,\cdot ):{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) is a \(C^{k}-\)diffeomorphism for every \(t\in [0,+\infty )\). \(\square \)
Proposition 4.2
Let \(\rho _0\in {\mathcal {P}}^{ac} \cap C_{c}^2 ({\mathbb {R}}^d)\) be a compactly supported density. Then for any \(T>0\), there exists a \(C^2\) displacement interpolant \(\rho : [0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}\) with \(\rho (0,\cdot ) = \rho _0\).
Proof
Let \(u_0, u, V, \sigma \) be defined as in Lemma 4.1 and fix \(T>0\). For \(t\in [0,T]\), define
We claim that \(\sigma \) is an optimizer of the Lagrangian optimal transport problem from \(\rho _0\) to \(\rho _T = \rho (T,\cdot )\), which would imply that \(\rho \) is indeed a displacement interpolant. Let \(\phi :[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) satisfy the four conditions in Definition 3.6. By Lemma 4.1, \(t\mapsto \sigma (t,x)\) is a \((u,L)-\)calibrated curve for each \(x\in {\mathbb {R}}^d\). Thus, for every \(x\in {\mathbb {R}}^d\),
By the definition of pushforward measure, the LHS of the last equality is
where the last equality is due to the assumption that \(\phi (T,\cdot )_{\#}\rho _0 = \rho _T\) and \(\phi (0,x) = x\). By the definition of u (i.e. (3.11)), we have that
for every \(x\in {\mathbb {R}}^d\). Thus,
Since \(\phi \) was arbitrary, \(\sigma \) is indeed an optimizer of the Lagrangian optimal transport problem from \(\rho _0\) to \(\rho _T\).
By Lemma 4.1, \(\sigma (t,\cdot ):{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) is a \(C^{k}-\)diffeomorphism for every \(t\in [0,T]\) and \(\sigma (\cdot ,x)\in C^{k+1}([0,T];{\mathbb {R}}^d)\) for every \(x\in {\mathbb {R}}^d\). Using the change-of-variables formula,
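where \(\text {det} \nabla \sigma (t,\cdot ) >0\): since \(\sigma (0,x) = x\), we have \(\text {det} \nabla \sigma (0,\cdot ) = 1\), and the determinant is continuous in t and never vanishes because each \(\sigma (t,\cdot )\) is a diffeomorphism. Since \(k\ge 3\), \(\rho \in C^2([0,T]\times {\mathbb {R}}^d)\). \(\square \)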
4.2 Displacement Hessian
Let \(F\in C^2 ((0,+\infty ))\cap C([0,+\infty ))\) be a function satisfying
-
(F1)
\(F(0) = 0\),
-
(F2)
\(s^2 F''(s)\ge sF'(s) - F(s) \ge 0,\) \(\quad \forall s\in [0,+\infty )\).
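If \(\rho _0 \in {\mathcal {P}}^{ac}\) is such that \(F(\rho _0)\in L^1 ({\mathbb {R}}^d)\), we define the generalized entropy functional
$$\begin{aligned} {\mathcal {F}}(\rho _0) = \int _{{\mathbb {R}}^d} F(\rho _0(x))\,dx. \end{aligned}$$(4.1)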
This is well-defined at least on \({\mathcal {P}}^{ac}\cap C_{c}^0 ({\mathbb {R}}^d)\) since \(F(0)= 0\) implies
which is finite.
Remark 4.3
If \(\rho _0\) is the density of a fluid and \({\mathcal {F}}(\rho _0)\) is the internal energy, then \(\rho _0 F'(\rho _0) - F(\rho _0)\) can be interpreted as a pressure [12, 21].
Definition 4.4
(Displacement convexity) The generalized entropy functional \({\mathcal {F}}\) is said to be convex along a displacement interpolant \(\rho _t\), \(t\in [0,T]\), if \({\mathcal {F}}(\rho _t)\) is finite and
for every \(t\in [0,T]\). \({\mathcal {F}}\) is said to be displacement convex if it is convex along every displacement interpolant (on which \({\mathcal {F}}\) is real-valued).
Remark 4.5
When the displacement interpolant is a “straight line”, McCann proved that \({\mathcal {F}}\) is displacement convex if \(s\mapsto s^d F(s^{-d})\) is convex and non-increasing on \((0,+\infty )\) [16]. In this context, a “straight line” displacement interpolant refers to one of the form
where M is the Monge map between \(\rho _0\) and \(\rho _T\).
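As a sanity check of the condition in Remark 4.5, here is a sketch for the entropy density \(F(s) = s\log s\), chosen for illustration. The name \(\lambda \) for the rescaled function is introduced only for this example.
$$\begin{aligned} \lambda (s)&:= s^d F(s^{-d}) = s^d\cdot s^{-d}\log (s^{-d}) = -d\log s,\\ \lambda '(s)&= -\frac{d}{s}< 0,\qquad \lambda ''(s) = \frac{d}{s^2} > 0\qquad \text {for all } s\in (0,+\infty ). \end{aligned}$$
So \(\lambda \) is convex and non-increasing on \((0,+\infty )\), and this F meets the stated condition in every dimension d.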
Along a suitable displacement interpolant \(\rho _t\), if the map \(t\mapsto {\mathcal {F}}(\rho _t)\) is \(C^2\), then the condition that \(\frac{d^2}{dt^2}{\mathcal {F}}(\rho _t)\ge 0\) ensures convexity of \({\mathcal {F}}\) along \(\rho _t\). The following displacement Hessian formula is a special case of Theorem 4.3.2 of [19].
Theorem 4.6
(Displacement Hessian formula) Let \(\rho \in C^2 ([0,T]\times {\mathbb {R}}^d)\) be a displacement interpolant, with \(\rho _0 = \rho (0,\cdot )\) compactly supported. Let \(\sigma :[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be an optimizer of the Lagrangian optimal transport problem from \(\rho _0\) to \(\rho _T\). Let \(V:[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be defined as in Remark 3.14 so that \(\rho ,V\) satisfy the continuity equation \({\dot{\rho }} = -\nabla \cdot (\rho V)\). Assume that \(\sigma \) and V are \(C^2\) at least on the set
Then \(\frac{d^2}{dt^2}{\mathcal {F}}(\rho )\) exists for every \(t\in [0,T]\) and is given by
where \(G:[0,+\infty )\rightarrow {\mathbb {R}}\) is defined by
$$\begin{aligned} G(s) = sF'(s) - F(s) \end{aligned}$$
and
Remark 4.7
The requirement that \(\rho _0\) is compactly supported serves to ensure that \({\mathcal {F}}\) is finite along \(\rho \). In addition, the compactness of supp\((\rho _0)\) and the continuity of \(\sigma \) together ensure that the set \(\{\sigma (t,x)\;:\; t\in [0,T]\;,\; x\in \text {supp}(\rho _0)\}\) is compact. Thus,
is bounded, up to a set of zero \({\mathcal {L}}^d -\)measure. This means that \(\frac{d^2}{dt^2}{\mathcal {F}}(\rho )\) exists for every \(t\in [0,T]\) and satisfies
Remark 4.8
By Remark 3.14, for every \(t\in [0,T]\), \(V(t,\cdot )\) is uniquely defined on supp\((\rho _t)\) \(\rho _t-\)almost everywhere. Thus, (4.3) is well-defined.
Proof
The displacement Hessian is
Integrating by parts, the above expression becomes
Using the continuity equation \({\dot{\rho }} = -\nabla \cdot (\rho V)\), the definitions of W and G, and integration by parts, this integral can be written as
A straightforward computation then yields the desired formula. \(\square \)
Remark 4.9
Recall that \(\rho G'(\rho ) - G(\rho ) = \rho ^2 F''(\rho ) - \rho F'(\rho ) + F(\rho )\ge 0\) and \(G(\rho ) = \rho F'(\rho ) - F(\rho ) \ge 0\) by assumption (F2). Thus, the condition that \({{\,\textrm{tr}\,}}((\nabla V)^2)-\nabla \cdot W \ge 0\) would ensure that \(\frac{d^2}{dt^2}{\mathcal {F}}(\rho )\ge 0\). In the case where the cost is given by squared Riemannian distance, the term \({{\,\textrm{tr}\,}}((\nabla V)^2)-\nabla \cdot W\) is a quadratic form involving the Bakry–Emery tensor [19, 21]. In the following section, we will generalize this quadratic form for an arbitrary Tonelli Lagrangian.
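The sign conditions used in Remark 4.9 follow from differentiating \(G(s) = sF'(s) - F(s)\). A short derivation is sketched here, followed by the illustrative case \(F(s) = s\log s\) (with \(F(0) = 0\) understood by continuity), which is an example added for concreteness.
$$\begin{aligned} G'(s)&= F'(s) + sF''(s) - F'(s) = sF''(s),\\ sG'(s) - G(s)&= s^2F''(s) - sF'(s) + F(s).\\ \text {For } F(s) = s\log s:\quad F'(s)&= \log s + 1,\qquad F''(s) = \frac{1}{s},\\ G(s)&= s(\log s + 1) - s\log s = s\ge 0,\\ sG'(s) - G(s)&= s - s = 0. \end{aligned}$$
In this example (F2) holds with equality in its first inequality, since \(s^2F''(s) = s = sF'(s) - F(s)\).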
5 Generalized curvature for Tonelli Lagrangians
The goal of this section is to define a generalized curvature for the space \(({\mathbb {R}}^d, L)\). In principle, this generalized curvature is similar to the Ricci curvature in the sense that it is related to the deformation of a shape flowing along action-minimizing curves. The generalized curvature, however, will not be a tensor because it will depend on both the tangent vector and its gradient. Throughout this section, we will assume that L is a \(C^3\) Tonelli Lagrangian.
Let \(T>0\) and \(\sigma :[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be such that
Let \(V:[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be a time-dependent vector field defined by \({\dot{\sigma }}(t,x) = V(t,\sigma (t,x))\) so that \(V(t,\cdot )\in C^2({\mathbb {R}}^d; {\mathbb {R}}^d)\) for every \(t\in [0,T]\). Following the method outlined in [21] Chapter 14, we first derive Lemma 5.1, which gives an ODE for the Jacobian matrix \(\nabla \sigma \).
Lemma 5.1
Define \(A,B:{\mathbb {R}}^d \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^{d\times d}\) by
where \(\gamma _{x,v}:[0,\epsilon ) \rightarrow {\mathbb {R}}^d\) is the unique curve satisfying the Euler–Lagrange equation with initial conditions \(\gamma _{x,v}(0) = x\), \({\dot{\gamma }}_{x,v}(0) = v\). Then the Jacobian \(\nabla \sigma \) satisfies a second-order matrix equation
Proof
Taking the spatial gradient of the Euler–Lagrange equation,
To conclude, we group by the terms \(\nabla \sigma , \nabla {\dot{\sigma }}, \nabla \ddot{\sigma }\) and multiply by \((\nabla _{vv}^2 L)(\sigma ,{\dot{\sigma }})^{-1}\). \(\square \)
Lemma 5.2
Define
Then
Proof
First, we note that since \({\dot{\sigma }}(t,x) = V(t,\sigma (t,x))\), we get
and so
Using the matrix identity \(\frac{d}{dt}M^{-1} = -M^{-1}{\dot{M}}M^{-1}\),
By Lemma 5.1, \(\nabla \ddot{\sigma } = -A(\sigma ,{\dot{\sigma }})(\nabla {\dot{\sigma }}) - B(\sigma ,{\dot{\sigma }})(\nabla \sigma )\) and so
\(\square \)
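The computation behind Lemma 5.2 can be sketched as follows. The symbol \({\mathcal {U}}\) is notation introduced only for this sketch: it stands for the matrix \(\nabla {\dot{\sigma }}(\nabla \sigma )^{-1}\), which equals \((\nabla V)(t,\sigma (t,x))\) by the chain rule wherever \(\nabla \sigma \) is invertible. The steps use only the identity \(\frac{d}{dt}M^{-1} = -M^{-1}{\dot{M}}M^{-1}\) and the relation \(\nabla \ddot{\sigma } = -A(\sigma ,{\dot{\sigma }})(\nabla {\dot{\sigma }}) - B(\sigma ,{\dot{\sigma }})(\nabla \sigma )\) from Lemma 5.1.
$$\begin{aligned} \dot{{\mathcal {U}}}&= \nabla \ddot{\sigma }(\nabla \sigma )^{-1} + \nabla {\dot{\sigma }}\,\frac{d}{dt}(\nabla \sigma )^{-1}\\&= \nabla \ddot{\sigma }(\nabla \sigma )^{-1} - \nabla {\dot{\sigma }}(\nabla \sigma )^{-1}\nabla {\dot{\sigma }}(\nabla \sigma )^{-1}\\&= -\big (A(\sigma ,{\dot{\sigma }})\nabla {\dot{\sigma }} + B(\sigma ,{\dot{\sigma }})\nabla \sigma \big )(\nabla \sigma )^{-1} - {\mathcal {U}}^2\\&= -A(\sigma ,{\dot{\sigma }})\,{\mathcal {U}} - B(\sigma ,{\dot{\sigma }}) - {\mathcal {U}}^2. \end{aligned}$$
For example, when \(L(x,v) = \frac{1}{2}|v|^2\) the Euler–Lagrange equation is \(\ddot{\sigma } = 0\), so \(\nabla \ddot{\sigma } = 0\). Assuming that A and B vanish for this Lagrangian, the relation reduces to \(\dot{{\mathcal {U}}} = -{\mathcal {U}}^2\). This can be verified directly: with \(\sigma (t,x) = x + tV_0(x)\) one gets \({\mathcal {U}} = \nabla V_0(I + t\nabla V_0)^{-1}\) and \(\dot{{\mathcal {U}}} = -\nabla V_0(I + t\nabla V_0)^{-1}\nabla V_0(I + t\nabla V_0)^{-1} = -{\mathcal {U}}^2\).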
We want to show that the term \({{\,\textrm{tr}\,}}((\nabla V)^2)-\nabla \cdot W\) appearing in the displacement Hessian formula (4.3) arises from Eq. (5.5). Taking the trace of (5.5), we have
On the other hand, direct computation yields
Since \(V(t,\sigma (t,x)) = {\dot{\sigma }}(t,x)\) and \(\sigma (0,x) = x\), we may restate the above equation as
Using the identities
and
we see that
By the computation of the displacement Hessian from the previous section, this is precisely \({{\,\textrm{tr}\,}}((\nabla V)^2)-\nabla \cdot W\).
At this point, (5.7) holds for all time-dependent \(C^2\) vector fields whose integral curves satisfy the Euler–Lagrange equation (3.3). To show that (5.7) holds for an arbitrary fixed vector field, we first need to make sense of the term \({\dot{V}}\) by introducing Definition 5.4.
Proposition 5.3
Given any fixed vector field \(V_0 \in C^2({\mathbb {R}}^d; {\mathbb {R}}^d)\), we may extend it for a short time to a unique time-dependent vector field V(t, x), \(t\in [0,\epsilon )\) with the following properties:
(i) \(V(0,\cdot ) = V_0\).
(ii) The integral curves of V satisfy the Euler–Lagrange equation, i.e.
$$\begin{aligned} {\dot{\sigma }}(t,x)&= V(t,\sigma (t,x))\\ \sigma (0,x)&= x\\ \frac{d}{dt}((\nabla _v L)(\sigma ,{\dot{\sigma }}))&= (\nabla _x L)(\sigma ,{\dot{\sigma }}) \end{aligned}$$
Proof
We recall Definition 3.3 and the existence of a Lagrangian flow \(\Phi = (\Phi _1,\Phi _2)\) satisfying \(\frac{d}{dt}((\nabla _v L)(\Phi )) = (\nabla _x L)(\Phi )\). Set \(\sigma (t,x) = \Phi _1(t,x,V_0(x))\). The maps \(\sigma (t,\cdot ):{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) are defined for all \(t\in [0,+\infty )\) and there exists \(\epsilon >0\) such that \(\sigma (t,\cdot )\) is invertible for \(t\in [0,\epsilon )\). Thus, for \(t\in [0,\epsilon )\), we may define the desired vector field by
$$\begin{aligned} V(t,y) := {\dot{\sigma }}\big (t,\sigma (t,\cdot )^{-1}(y)\big ). \end{aligned}$$
\(\square \)
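The construction in this proof can be sketched numerically in the simplest setting. The example below assumes the free Lagrangian \(L(x,v) = \frac{1}{2}v^2\) in one dimension and the initial field \(V_0 = \sin \); both choices, and all function names, are illustrative and not taken from the text. In this case the Euler–Lagrange equation is \(\ddot{\sigma } = 0\), so \(\sigma (t,x) = x + tV_0(x)\), which is strictly increasing in x for \(0\le t<1\), and the extended field is \(V(t,y) = V_0(\sigma (t,\cdot )^{-1}(y))\).

```python
import math

def V0(x):
    # Initial vector field (illustrative assumption); note |V0| <= 1 and |V0'| <= 1.
    return math.sin(x)

def sigma(t, x):
    # Straight-line trajectories: solutions of sigma'' = 0 with
    # sigma(0, x) = x and sigma'(0, x) = V0(x).
    return x + t * V0(x)

def sigma_inverse(t, y, iterations=100):
    # Invert x -> sigma(t, x) by bisection. Valid for 0 <= t < 1, where the
    # map is strictly increasing; the root lies in [y - t, y + t] since |V0| <= 1.
    lo, hi = y - t, y + t
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sigma(t, mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def V(t, y):
    # Extended field: the velocity of the trajectory passing through y at time t.
    # For the free Lagrangian that velocity is constant in time, i.e. V0 at the start point.
    return V0(sigma_inverse(t, y))

if __name__ == "__main__":
    t, x = 0.5, 0.3
    print(V(t, sigma(t, x)), V0(x))  # the two values should agree
```

Here \(\epsilon = 1\) plays the role of the short existence time: once \(t\,|V_0'|\) can reach 1, the map \(\sigma (t,\cdot )\) may stop being invertible and the bisection step no longer applies.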
Definition 5.4
Given a Tonelli Lagrangian L, we define the operation
as in Proposition 5.3.
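To illustrate this definition in the simplest case, the following worked computation assumes that \(\Gamma _L(V_0)\) denotes the initial time derivative \(\partial _t V(0,\cdot )\) of the extension V from Proposition 5.3 (this is our reading of the definition, suggested by the role of \({\dot{V}}\) above), and takes the free Lagrangian \(L(x,v) = \frac{1}{2}|v|^2\), which is our choice of example. The Euler–Lagrange equation is then \(\ddot{\sigma } = 0\), and differentiating \({\dot{\sigma }}(t,x) = V(t,\sigma (t,x))\) in time gives

$$\begin{aligned} 0 = \ddot{\sigma }(t,x)&= (\partial _t V)(t,\sigma (t,x)) + (\nabla V)(t,\sigma (t,x))\,{\dot{\sigma }}(t,x)\\&= \big (\partial _t V + (\nabla V)V\big )(t,\sigma (t,x)). \end{aligned}$$

Evaluating at \(t=0\), where \(\sigma (0,x) = x\) and \(V(0,\cdot ) = V_0\), yields \(\partial _t V(0,\cdot ) = -(\nabla V_0)V_0\). Under the stated assumption, this would give \(\Gamma _L(V_0) = -(\nabla V_0)V_0\) for the free Lagrangian.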
Remark 5.5
By the Euler–Lagrange equation (3.3), we can give an explicit formula for \(\Gamma _L(V_0)\). Suppose \(\sigma (t,x)\) and V(t, x) satisfy the two properties in Proposition 5.3, then
Since
by the Euler–Lagrange equation, we have
Definition 5.6
(Generalized curvature) Let \(\xi \in C^2({\mathbb {R}}^d; {\mathbb {R}}^d)\). For each \(x\in {\mathbb {R}}^d\), we define the generalized curvature \({\mathcal {K}}_x\) by
where \(A,B:{\mathbb {R}}^d \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^{d\times d}\) are defined as in Lemma 5.1.
Theorem 5.7
Let \(\xi \in C^2({\mathbb {R}}^d; {\mathbb {R}}^d)\). Then
In particular, the generalized curvature \({\mathcal {K}}_x\) is intrinsic, i.e. does not depend on the choice of coordinates.
Proof
By Proposition 5.3, we may extend \(\xi \) for a short time to a time-dependent vector field V(t, x), with \(V(0,\cdot ) = \xi \), whose integral curves satisfy the Euler–Lagrange equation. Thus, (5.7) holds for V and we have
To show that \({\mathcal {K}}_x\) is intrinsic, we will show that the operator
is invariant under a change of coordinates. By Definition 5.4 and the definition of divergence, \(-\nabla \cdot \big (\Gamma _L (\xi )\big )\) is coordinate-free. Next, observe that \(\langle \xi ,\nabla (\nabla \cdot \xi ) \rangle \) is the directional derivative of \(\nabla \cdot \xi \) (which is coordinate-free) with respect to \(\xi \). Thus,
is also coordinate-free. \(\square \)
In the case where \(\xi (x) = (\nabla _p H)(x,\nabla u(x)) \iff \nabla u(x) = (\nabla _v L)(x,\xi (x))\) for some potential \(u:{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) (cf. Proposition 3.12, Lemma 4.1), we can derive an explicit formula for \({\mathcal {K}}_x(\xi )\).
Theorem 5.8
(Formula for \({\mathcal {K}}_x(\xi )\)) Let \(\xi \in C^2({\mathbb {R}}^d; {\mathbb {R}}^d)\) be such that there exists \(u\in C^2({\mathbb {R}}^d)\), with
Then,
where all terms involving L are evaluated at \((x,\xi (x))\).
Proof
See Appendix. \(\square \)
In conclusion, the displacement Hessian formula (4.3) can be written as
6 Displacement convexity for a non-Riemannian Lagrangian cost
In this section, we provide an example of a Lagrangian cost that is not a squared Riemannian distance. We prove using a perturbation argument that the corresponding generalized curvature is non-negative and thus the generalized entropy functional is convex along \(C^2\) displacement interpolants.
Let g(x) be a positive definite matrix for every \(x\in {\mathbb {R}}^d\) so that \(\frac{1}{2}\langle v, g(x)v\rangle \), \(v\in {\mathbb {R}}^d\) defines a Riemannian metric. Let \(g_{ij} = g_{ij}(x)\) denote the ij-th entry of g(x) and \(g^{ij} = g^{ij}(x)\) denote the ij-th entry of the inverse matrix \(g(x)^{-1}\). Further assume that the \(g_{ij}\) are bounded with bounded derivatives, and that the corresponding Bakry–Emery tensor (denoted BE\(_g\)) is bounded from below. That is,
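Define the Lagrangian
$$\begin{aligned} L(x,v) = \frac{1}{2}\langle v, g(x)v\rangle \end{aligned}$$
and the perturbed Lagrangian
$$\begin{aligned} {\tilde{L}}(x,v) = L(x,v) + \varphi (v) \end{aligned}$$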
where \(\varphi :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) is a smooth perturbation (for instance, take \(\varphi \) to be of Schwartz class). Using Theorem 5.8, the respective generalized curvatures are given by
and
By Theorem A.3.1 of [19] and (5.7), \({\mathcal {K}}_x(\xi ) = ||g^{-1}\nabla \xi ^\top ||_{\text {HS}}^2 + \text {BE}_g(\xi )\), where \(||\cdot ||_{\text {HS}}\) denotes the Hilbert–Schmidt norm. Thus, we have a lower bound
$$\begin{aligned} {\mathcal {K}}_x(\xi ) \ge c_g||\nabla \xi ||^2 + k_g ||\xi ||^2 \end{aligned}$$
where \(c_g>0\) is a constant depending on g. Fix \(\epsilon >0\) such that \(\epsilon \le \min \{\frac{c_g}{10},\frac{k_g}{12}\}\). Our goal is to choose \(\varphi \) so that
1. \(|{\tilde{L}}^{ij} - L^{ij}| = |{\tilde{L}}^{ij} - g^{ij}|\) is sufficiently small, i.e. \(||\nabla ^2 \varphi ||\) is close to zero, and
2. \(|\frac{\partial ^3 \varphi }{\partial v_i \partial v_j \partial v_k}|\) is sufficiently small.
To this end, we choose \(\varphi \) such that
Since \({\mathcal {K}}_x(\xi ) \ge c_g||\nabla \xi ||^2 + k_g ||\xi ||^2\), we conclude that \(\tilde{{\mathcal {K}}}_x(\xi ) \ge 0\).
References
Agueh, M.: Sharp Gagliardo–Nirenberg inequalities and mass transport theory. J. Dyn. Differ. Equ. 18, 1069–1093 (2006)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows in Metric Spaces and in the Space of Probability Measures. Birkhäuser, New York (2005)
Bauer, M., Modin, K.: Semi-invariant Riemannian metrics in hydrodynamics. Calc. Var. Partial. Differ. Equ. 59(2), Paper No. 65, 25 pp. (2020)
Benamou, J.-D., Brenier, Y.: A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numer. Math. 84, 375–393 (2000)
Bernard, P., Buffoni, B.: Optimal mass transportation and Mather theory. J. Eur. Math. Soc. 9, 85–121 (2007)
Cannarsa, P., Sinestrari, C.: Semiconcave Functions, Hamilton–Jacobi Equations, and Optimal Control. Birkhäuser, New York (2004)
Carrillo, J.A., Slepčev, D.: Example of a displacement convex functional of first order. Calc. Var. Partial. Differ. Equ. 36, 547–564 (2008)
Cordero-Erausquin, D., Gangbo, W., Houdré, C.: Inequalities for generalized entropy and optimal transportation. Contemp. Math. 353, 73–94 (2004)
Fathi, A.: Weak KAM theorem in Lagrangian dynamics. Monograph 88 (2014)
Fathi, A.: Viscosity solutions of the Hamilton–Jacobi equation on a non-compact manifold (2020)
Fathi, A., Figalli, A.: Optimal transportation on non-compact manifolds. Israel J. Math. 175, 1–59 (2007)
Figalli, A., Gangbo, W., Yolcu, T.: A variational method for a class of parabolic PDEs. Annali della Scuola Normale Superiore di Pisa - Classe di Scienze Ser. 5, 10(1), 207–252 (2011)
Figalli, A., Gigli, N.: Local semiconvexity of Kantorovich potentials on non-compact manifolds. ESAIM: Control Optim. Calc. Var. 17(3), 648–653 (2011)
Gangbo, W., McCann, R.J.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)
Gomes, D.A., Seneci, T.: Displacement convexity for first-order mean-field games. Minimax Theory Appl. 3(2), 261–284 (2018)
McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128(1), 153–179 (1997)
Otto, F.: The geometry of dissipative evolution equations: the porous medium equation. Commun. Part. Differ. Equ. 26, 101–174 (2001)
Otto, F., Villani, C.: Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality. J. Funct. Anal. 173(2), 361–400 (2000)
Schachter, B.: An Eulerian Approach to Optimal Transport with Applications to the Otto Calculus. Ph.D. thesis (2017)
Schachter, B.: A new class of first order displacement convex functionals. SIAM J. Math. Anal. 50(2), 1779–1789 (2018)
Villani, C.: Optimal transport—old and new, Grundlehren Math. Wiss, vol. 338. Springer, Berlin (2008)
Acknowledgements
This paper grew out of an undergraduate project I worked on at UCLA. I would like to thank Wilfrid Gangbo (who first introduced this subject to me) for his continued guidance and support. His expertise and generosity have been tremendously helpful. I also thank Tommaso Pacini for helpful feedback. Lastly, I would like to thank the reviewer for the suggestions to the manuscript.
Additional information
Communicated by Xavier Ros-Oton.
Appendix
The generalized curvature \({\mathcal {K}}_x(\xi )\) is given by
In the computations below, time derivatives of \(\xi \) will be treated by extending \(\xi \) for a short time (in the sense of Proposition 5.3).
Lemma 7.1
Proof
Recall that
By assumption, there exists a potential u(x) satisfying
Since the Hessian
is symmetric, we have
Next,
\(\square \)
Lemma 7.2
Proof
Recall that
By a similar computation as the previous lemma, we have
\(\square \)
Putting together these two lemmas, we get the formula for \({\mathcal {K}}_x(\xi )\).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yang, Y. Generalized curvature for the optimal transport problem induced by a Tonelli Lagrangian. Calc. Var. 62, 206 (2023). https://doi.org/10.1007/s00526-023-02550-2
DOI: https://doi.org/10.1007/s00526-023-02550-2