Abstract
The Lott–Sturm–Villani Curvature-Dimension condition provides a synthetic notion for a metric-measure space to have Ricci-curvature bounded from below and dimension bounded from above. We prove that it is enough to verify this condition locally: an essentially non-branching metric-measure space \((X,\mathsf {d},{\mathfrak {m}})\) (so that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length-space and \({\mathfrak {m}}(X) < \infty \)) verifying the local Curvature-Dimension condition \({\mathsf {CD}}_{loc}(K,N)\) with parameters \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\), also verifies the global Curvature-Dimension condition \({\mathsf {CD}}(K,N)\). In other words, the Curvature-Dimension condition enjoys the globalization (or local-to-global) property, answering a question which had remained open since the beginning of the theory. For the proof, we establish an equivalence between \(L^1\)- and \(L^2\)-optimal-transport–based interpolation. The challenge is not merely a technical one, and several new conceptual ingredients which are of independent interest are developed: an explicit change-of-variables formula for densities of Wasserstein geodesics depending on a second-order temporal derivative of associated Kantorovich potentials; a surprising third-order theory for the latter Kantorovich potentials, which holds in complete generality on any proper geodesic space; and a certain rigidity property of the change-of-variables formula, allowing us to bootstrap the a-priori available regularity. As a consequence, numerous variants of the Curvature-Dimension condition proposed by various authors throughout the years are shown to, in fact, all be equivalent in the above setting, thereby unifying the theory.
1 Introduction
The Curvature-Dimension condition \({\mathsf {CD}}(K,N)\) was first introduced in the 1980’s by Bakry and Émery [15, 16] in the context of diffusion generators, having in mind primarily the setting of weighted Riemannian manifolds, namely smooth Riemannian manifolds endowed with a smooth density with respect to the Riemannian volume. The \({\mathsf {CD}}(K,N)\) condition serves as a generalization of the classical condition in the non-weighted Riemannian setting of having Ricci curvature bounded below by \(K \in {\mathbb {R}}\) and dimension bounded above by \(N \in [1,\infty ]\) (see e.g. [56, 60] for further possible extensions). Numerous consequences of this condition have been obtained over the past decades, extending results from the classical non-weighted setting and at times establishing new ones directly in the weighted one. These include diameter bounds, volume comparison theorems, heat-kernel and spectral estimates, Harnack inequalities, topological implications, Brunn–Minkowski-type inequalities, and isoperimetric, functional and concentration inequalities—see e.g. [17, 48, 77] and the references therein.
Being a differential and Hilbertian condition, it was for many years unclear how to extend the Bakry–Émery definition beyond the smooth Riemannian setting, as interest in (measured) Gromov-Hausdorff limits of Riemannian manifolds and other non-Hilbertian singular spaces steadily grew. In parallel, and apparently unrelatedly, the theory of Optimal-Transport was being developed in increasing generality following the influential work of Brenier [21] (see e.g. [2, 36, 53, 65, 75,76,77]). Given two probability measures \(\mu _0,\mu _1\) on a common geodesic space \((X,\mathsf {d})\) and a prescribed cost of transporting a single mass from point x to y, the Monge-Kantorovich idea is to optimally couple \(\mu _0\) and \(\mu _1\) by minimizing the total transportation cost, and as a byproduct obtain a Wasserstein geodesic \([0,1] \ni t \mapsto \mu _t\) connecting \(\mu _0\) and \(\mu _1\) in the space of probability measures \({\mathcal {P}}(X)\). This gives rise to the notion of displacement convexity of a given functional on \({\mathcal {P}}(X)\) along Wasserstein geodesics, introduced and studied by McCann [52]. Following the works of Cordero-Erausquin–McCann–Schmuckenschläger [33], Otto–Villani [62] and von Renesse–Sturm [70], it was realized that the \({\mathsf {CD}}(K,\infty )\) condition in the smooth setting may be equivalently formulated synthetically as a certain convexity property of an entropy functional along \(W_2\) Wasserstein geodesics (associated to \(L^2\)-Optimal-Transport, when the transport-cost is given by the squared-distance function).
This idea culminated in the seminal works of Lott–Villani [51] and Sturm [73, 74], where a synthetic definition of \({\mathsf {CD}}(K,N)\) was proposed on a general (complete, separable) metric space \((X,\mathsf {d})\) endowed with a (locally-finite Borel) reference measure \({\mathfrak {m}}\) (“metric-measure space”, or m.m.s.); it was moreover shown that the latter definition coincides with the Bakry–Émery one in the smooth Riemannian setting (and in particular in the classical non-weighted one), that it is stable under measured Gromov-Hausdorff convergence of m.m.s.’s, and that it implies various geometric and analytic inequalities relating metric and measure, in complete analogy with the smooth setting. It was subsequently also shown [58, 64] that Finsler manifolds and Alexandrov spaces satisfy the Curvature-Dimension condition. Thus emerged an overwhelmingly convincing notion of Ricci curvature lower bound K and dimension upper bound N for a general (geodesic) m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\), leading to a rich and fruitful theory exploring the geometry of m.m.s.’s by means of Optimal-Transport.
One of the most important and longstanding open problems in the Lott–Sturm–Villani theory (see [73, 74] and [77, pp. 888, 907]) is whether the Curvature-Dimension condition on a general geodesic m.m.s. (say, having full-support \(\text {supp}({\mathfrak {m}}) = X\)) enjoys the globalization (or local-to-global) property: if the \({\mathsf {CD}}(K,N)\) condition is known to hold on a neighborhood \(X_o\) of any given point \(o \in X\) (a property henceforth denoted by \({\mathsf {CD}}_{loc}(K,N)\)), does it also necessarily hold on the entire space? Clearly this is indeed the case in the smooth setting, as both curvature and dimension may be computed locally (by equivalence with the differential \({\mathsf {CD}}\) definition). However, for reasons which we will expand on shortly, this is not at all clear and in some cases is actually false on general m.m.s.’s. An affirmative answer to this question would immensely facilitate the verification of the \({\mathsf {CD}}\) condition, which at present requires testing all possible \(W_2\)-geodesics on X, instead of locally on each \(X_o\). The analogous question for sectional curvature on Alexandrov spaces (where the dimension N is absent) does indeed have an affirmative answer, as shown by Toponogov, and in full generality, by Perelman (see [22]).
Several partial answers to the local-to-global problem have already been obtained in the literature. A geodesic space \((X,\mathsf {d})\) is called non-branching if geodesics are forbidden to branch at an interior-point into two separate geodesics. On a non-branching geodesic m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) having full support, it was shown by Sturm in [73, Theorem 4.17] that the local-to-global property is satisfied when \(N = \infty \) (assuming that the space of probability measures with finite \({\mathfrak {m}}\)-relative entropy is geodesically convex; see also [77, Theorem 30.42] where the same globalization result was proved under a different condition involving the existence of a full-measure totally-convex subset of X of finite-dimensional points). Still for non-branching geodesic m.m.s.’s having full support, a positive answer was also obtained by Villani in [77, Theorem 30.37] for the case \(K=0\) and \(N \in [1,\infty )\).
We stress that in these results, the restriction to non-branching spaces is not merely a technical assumption: an example of a heavily-branching m.m.s. verifying \({\mathsf {CD}}_{loc}(0,4)\) which does not verify \({\mathsf {CD}}(K,N)\) for any fixed \(K \in {\mathbb {R}}\) and \(N\in [1,\infty ]\) was constructed by Rajala in [67]. Consequently, a natural assumption is to require that \((X,\mathsf {d})\) be non-branching, or more generally, to require that the \(L^2\)-Optimal-Transport on \((X,\mathsf {d},{\mathfrak {m}})\) be concentrated (i.e. up to a null-set) on a non-branching subset of geodesics, an assumption introduced by Rajala and Sturm in [68] under the name essentially non-branching (see Sect. 6 for precise definitions). For instance, it is known [68] that measured Gromov-Hausdorff limits of Riemannian manifolds satisfying \({\mathsf {CD}}(K,\infty )\), and more generally, \({\mathsf {RCD}}(K,\infty )\) spaces, always satisfy the essentially non-branching assumption (see Sect. 13).
In this work, we provide an affirmative answer to the globalization problem in the remaining range of parameters: for \(N \in (1,\infty )\) and \(K \in {\mathbb {R}}\), the \({\mathsf {CD}}(K,N)\) condition verifies the local-to-global property on an essentially non-branching geodesic m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) having finite total-measure and full support. The exclusion of the case \(N=1\) is to avoid unnecessary pathologies, and is not essential. Our assumption that \({\mathfrak {m}}\) has finite total-measure (or equivalently, by scaling, that it is a probability measure) is most probably technical, but we did not verify it can be removed so as to avoid overloading the paper even further. This result is new even under the additional assumption that the space is infinitesimally Hilbertian (see [40]); we will say that such spaces verify \({\mathsf {RCD}}(K,N)\), and in this case the assumption of being (globally) essentially non-branching is in fact superfluous.
To better explain the difference between the previously known cases when \(\frac{K}{N} = 0\) and the conceptual challenge which the newly treated case \(\frac{K}{N} \ne 0\) poses, as well as to sketch our solution and its main new ingredients, which we believe are of independent interest, we provide some additional details below and refer to Sect. 6 for precise definitions.
1.1 Disentangling volume-distortion coefficients
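Roughly speaking, the \({\mathsf {CD}}(K,N)\) condition prescribes a synthetic second-order bound on how an infinitesimal volume changes when it is moved along a \(W_2\)-geodesic: the volume distortion (or transport Jacobian) J along the geodesic should satisfy the following interpolation inequality for \(t_0 = 0\) and \(t_1 = 1\):
$$\begin{aligned} J^{\frac{1}{N}}(\alpha t_1 + (1-\alpha ) t_0) \ge \tau _{K,N}^{(\alpha )}(|t_1-t_0| \theta ) J^{\frac{1}{N}}(t_1) + \tau _{K,N}^{(1-\alpha )}(|t_1-t_0| \theta ) J^{\frac{1}{N}}(t_0) \;\;\; \forall \alpha \in [0,1], \end{aligned} \tag{1.1}$$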
where \(\tau _{K,N}^{(t)}(\theta )\) is an explicit coefficient depending on the curvature \(K \in {\mathbb {R}}\), dimension \(N \in [1,\infty ]\), the interpolating time parameter \(t \in [0,1]\) and the total length of the geodesic \(\theta \in [0,\infty )\) (with an appropriate interpretation of (1.1) when \(N=\infty \)). When \(N <\infty \), the latter coefficient is obtained by geometrically averaging two different volume distortion coefficients:
$$\begin{aligned} \tau _{K,N}^{(t)}(\theta ) := t^{\frac{1}{N}} \sigma _{K,N-1}^{(t)}(\theta )^{1 - \frac{1}{N}} , \end{aligned} \tag{1.2}$$
where the \(\sigma _{K,N-1}^{(t)}(\theta )\) term encodes an \((N-1)\)-dimensional evolution orthogonal to the transport and thus affected by the curvature, and the linear term t represents a one dimensional evolution tangential to the transport and thus independent of any curvature information. As with the Jacobi equation in the usual Riemannian setting, the function \([0,1] \ni t \mapsto \sigma (t) := \sigma _{K,N-1}^{(t)}(\theta )\) is explicitly obtained by solving the second-order differential equation:
$$\begin{aligned} \sigma ''(t) + \theta ^2 \frac{K}{N-1} \sigma (t) = 0 \;\; \text {on } t \in [0,1] , \;\;\; \sigma (0) = 0 , \;\; \sigma (1) = 1 . \end{aligned} \tag{1.3}$$
The common feature of the previously known cases \(\frac{K}{N} = 0\) for the local-to-global problem is the linear behaviour in time of the distortion coefficient: \(\tau _{K,N}^{(t)}(\theta ) = t\). A major obstacle with the remaining cases \(\frac{K}{N} \ne 0\) is that the function \([0,1] \ni t \mapsto \tau _{K,N}^{(t)}(\theta )\) does not satisfy a second-order differential characterization such as (1.3). If it did, it would be possible to express the interpolation inequality (1.1) on \([t_0,t_1] \subset [0,1]\) as a second-order differential inequality for \(J^{\frac{1}{N}}\) on \([t_0,t_1]\) (see Lemmas A.5 and A.6), and so if (1.1) were known to hold for all \(\left\{ [t_0^i,t_1^i]\right\} _{i=1\ldots k}\) so that \(\cup _{i=1}^k (t_0^i,t_1^i) = (0,1)\), it would follow that (1.1) also holds for \([t_0,t_1] = [0,1]\). However, a counterexample to the latter implication was constructed by Deng and Sturm in [34], thereby showing that:
![](http://media.springernature.com/lw426/springer-static/image/art%3A10.1007%2Fs00222-021-01040-6/MediaObjects/222_2021_1040_Equ4_HTML.png)
On the other hand, the above argument does work if we were to replace \(\tau \) by the slightly smaller \(\sigma \) coefficients. This motivated Bacher and Sturm in [14] to define for \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\) the slightly weaker “reduced” Curvature-Dimension condition, denoted by \({\mathsf {CD}}^{*}(K,N)\), where the distortion coefficients \(\tau _{K,N}^{(t)}(\theta )\) are indeed replaced by \(\sigma _{K,N}^{(t)}(\theta )\). Using the above gluing argument (after resolving numerous technicalities), the local-to-global property for \({\mathsf {CD}}^*(K,N)\) was established in [14] on non-branching spaces (see also the work of Erbar–Kuwada–Sturm [35, Corollary 3.13, Theorem 3.14 and Remark 3.26] for an extension to the essentially non-branching setting, cf. [29, 68]). Let us also mention here the work of Ambrosio–Mondino–Savaré [10], who independently of a similar result in [35], established the local-to-global property for \({\mathsf {RCD}}^*(K,N)\) proper spaces, \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\), without a-priori assuming any non-branching assumptions (but a-posteriori, such spaces must be essentially non-branching by [68]).
Without requiring any non-branching assumptions, the \({\mathsf {CD}}^*(K,N)\) condition was shown in [14] to imply the same geometric and analytic inequalities as the \({\mathsf {CD}}(K,N)\) condition, but with slightly worse constants (typically missing the sharp constant by a factor of \(\frac{N-1}{N}\)), suggesting that the latter is still the “right” notion of Curvature-Dimension. We conclude that the local-to-global challenge is to properly disentangle between the orthogonal and tangential components of the volume distortion J before attempting to individually integrate them as above. This also highlights the geometric nature of the globalization problem, and demonstrates that it is not merely a technical challenge.
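The distortion coefficients discussed in this subsection can be evaluated numerically. The following is a minimal sketch, assuming the standard closed-form solutions of the Jacobi-type equation (sine for \(K>0\), linear for \(K=0\), hyperbolic sine for \(K<0\)) and the geometric-average definition of \(\tau \); the function names `sigma` and `tau` are illustrative and do not come from the paper. It is meant for the regime before the first conjugate point, i.e. \(K \theta ^2 < (N-1)\pi ^2\) when \(K > 0\).

```python
import math

def sigma(K, N, t, theta):
    """sigma_{K,N}^{(t)}(theta): solution of s'' + theta^2 (K/N) s = 0
    with s(0) = 0 and s(1) = 1 (closed form, assumed standard)."""
    if theta == 0.0 or K == 0.0:
        return t
    if K > 0:
        a = theta * math.sqrt(K / N)
        if a >= math.pi:
            return math.inf  # usual convention past the conjugate point
        return math.sin(t * a) / math.sin(a)
    a = theta * math.sqrt(-K / N)
    return math.sinh(t * a) / math.sinh(a)

def tau(K, N, t, theta):
    """tau_{K,N}^{(t)}(theta): geometric average of the linear (tangential)
    term t and the (N-1)-dimensional (orthogonal) term sigma_{K,N-1}."""
    return t ** (1.0 / N) * sigma(K, N - 1, t, theta) ** (1.0 - 1.0 / N)

# Compare the two coefficients for a few sample parameters.
for K in (-1.0, 0.0, 1.0):
    print(K, tau(K, 3.0, 0.5, 1.0), sigma(K, 3.0, 0.5, 1.0))
```

For \(K = 0\) both coefficients reduce to the linear term t, while for \(K \ne 0\) the sketch lets one observe numerically that \(\sigma _{K,N}^{(t)}(\theta )\) does not exceed \(\tau _{K,N}^{(t)}(\theta )\), in line with the reduced condition being the weaker one.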
1.2 Comparing \(L^2\)- and \(L^1\)-Optimal-Transport and main result
There have been a couple of prior attempts to disentangle the volume distortion into its orthogonal and tangential components, by comparing between \(W_2\) and \(W_1\) Wasserstein geodesics (associated to \(L^2\)- and \(L^1\)-Optimal-Transport, respectively). In [30], this strategy was implicitly employed by Cavalletti and Sturm to show that \({\mathsf {CD}}_{loc}(K,N)\) implies the measure-contraction property \({\mathsf {MCP}}(K,N)\), which in a sense is a particular case of \({\mathsf {CD}}(K,N)\) when one end of the \(W_2\)-geodesic is a Dirac delta at a point \(o \in X\) (see [57, 74]). In that case, all of the transport-geodesics have o as a common end point, so by considering a disintegration of \({\mathfrak {m}}\) on the family of spheres centered at o, and restricting the \(W_2\)-geodesic to these spheres, the desired disentanglement was obtained. In the subsequent work [24], Cavalletti generalized this approach to a particular family of \(W_2\)-geodesics, having the property that for a.e. transport-geodesic \(\gamma \), its length \(\ell (\gamma )\) is a function of \(\varphi (\gamma _{0})\), where \(\varphi \) is a Kantorovich potential associated to the corresponding \(L^2\)-Optimal-Transport problem. Here the disintegration was with respect to the individual level sets of \(\varphi \), and again the restriction of the \(W_2\)-geodesic enjoying the latter property to these level sets (formally of co-dimension one) induced a \(W_1\)-geodesic, enabling disentanglement.
Another application of \(L^1\)-Optimal-Transport, seemingly unrelated to disentanglement of \(W_2\)-geodesics, appeared in the recent breakthrough work of Klartag [47] on localization in the smooth Riemannian setting. The localization paradigm, developed by Payne–Weinberger [63], Gromov–Milman [44] and Kannan–Lovász–Simonovits [46], is a powerful tool to reduce various analytic and geometric inequalities on the space \(({\mathbb {R}}^n,\mathsf {d},{\mathfrak {m}})\) to appropriate one-dimensional counterparts. The original approach by these authors was based on a bisection method, and thus inherently confined to \({\mathbb {R}}^n\). In [47], Klartag extended the localization paradigm to the weighted Riemannian setting, by disintegrating the reference measure \({\mathfrak {m}}\) on \(L^1\)-Optimal-Transport geodesics (or “rays”) associated to the inequality under study (cf. Feldman–McCann [38]), and proving that the resulting conditional one-dimensional measures inherit the Curvature-Dimension properties of the underlying manifold.
Klartag’s idea is quite robust, and permitted Cavalletti and Mondino in [27] to avoid the smooth techniques used in [47] and to extend the localization paradigm to the framework of essentially non-branching geodesic m.m.s.’s \((X,\mathsf {d},{\mathfrak {m}})\) of full-support verifying \({\mathsf {CD}}_{loc}(K,N)\), \(N \in (1,\infty )\). By a careful study of the structure of \(W_1\)-geodesics, Cavalletti and Mondino were able to transfer the Curvature-Dimension information encoded in the \(W_2\)-geodesics to the individual rays along which a given \(W_1\)-geodesic evolves, thereby proving that on such spaces,
![](http://media.springernature.com/lw448/springer-static/image/art%3A10.1007%2Fs00222-021-01040-6/MediaObjects/222_2021_1040_Equ5_HTML.png)
Note that the densities of one-dimensional \({\mathsf {CD}}(K,N)\) spaces are characterized via the \(\sigma \) (as opposed to \(\tau \)) volume-distortion coefficients (see the “Appendix”), so by applying the gluing argument described in the previous subsection, only local \({\mathsf {CD}}_{loc}(K,N)\) information was required in [27] to obtain global control over the entire one-dimensional transport ray.
This allowed Cavalletti and Mondino (see [27, 28]) to obtain a series of sharp geometric and analytic inequalities for \({\mathsf {CD}}_{loc}(K,N)\) spaces as above, in particular extending from the smooth Riemannian setting the sharp Lévy-Gromov [42] and Milman [55] isoperimetric inequalities, as well as the sharp Brunn-Minkowski inequality of Cordero-Erausquin–McCann–Schmuckenschläger [33] and Sturm [74], all in global form (see also Ohta [59]).
We would like to address at this point a certain general belief shared by some in the Optimal-Transport community, stating that the property \({\mathsf {BM}}(K,N)\) of satisfying the Brunn-Minkowski inequality (with sharp coefficients correctly depending on K, N), should be morally equivalent to the \({\mathsf {CD}}(K,N)\) condition. Rigorously establishing such an equivalence would immediately yield the local-to-global property of \({\mathsf {CD}}(K,N)\), by the Cavalletti–Mondino localization proof that \({\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {BM}}(K,N)\). However, we were unsuccessful in establishing the missing implication \({\mathsf {BM}}(K,N) \Rightarrow {\mathsf {CD}}(K,N)\), and in fact a careful attempt in this direction seems to lead back to the circle of ideas we were ultimately able to successfully develop in this work.
Instead of starting our investigation from \({\mathsf {BM}}(K,N)\), our strategy is to directly start from a suitable modification of the property (1.5), which we dub \({\mathsf {CD}}^1(K,N)\), in which (1.5) is required to hold for transport rays associated to (signed) distance functions from level sets of continuous functions. A stronger condition, in which (1.5) is required to hold for transport rays associated to all 1-Lipschitz functions, is denoted by \({\mathsf {CD}}^1_{Lip}(K,N)\) (see Sect. 8 for precise definitions). The main result of this work consists of showing that \({\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N)\), by means of transferring the one-dimensional \({\mathsf {CD}}(K,N)\) information encoded in a family of suitably constructed \(L^1\)-Optimal-Transport rays, onto a given \(W_2\)-geodesic, thereby obtaining the correct disentanglement between tangential and orthogonal distortions. This goes in exactly the opposite direction to the one studied by Cavalletti and Mondino in [27], and completes the cycle
$$\begin{aligned} {\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {CD}}^1_{Lip}(K,N) \Rightarrow {\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N) \Rightarrow {\mathsf {CD}}_{loc}(K,N) . \end{aligned}$$
To the best of our knowledge, this decisive feature of our work, namely deducing \({\mathsf {CD}}(K,N)\) for a given \(W_2\)-geodesic by considering the \({\mathsf {CD}}_{\text {loc}}(K,N)\) information encoded in a family (in accordance with (1.4)) of different associated \(W_2\)-geodesics (manifesting itself in the \({\mathsf {CD}}^1(K,N)\) information along a family of different \(L^1\)-Optimal-Transport rays), has not been previously explored.
Main Theorem 1.1
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. with \({\mathfrak {m}}(X) < \infty \), and let \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\). Then the following statements are equivalent:
(1) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}(K,N)\).
(2) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^*(K,N)\).
(3) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{Lip}(K,N)\).
(4) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1(K,N)\).
If in addition \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length-space, the above statements are equivalent to
(5) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}_{loc}(K,N)\).
To this list one can also add the entropic Curvature-Dimension condition \({\mathsf {CD}}^e(K,N)\) of Erbar–Kuwada–Sturm [35], which is known to be equivalent to \({\mathsf {CD}}^*(K,N)\) for essentially non-branching spaces. In other words, all synthetic definitions of Curvature-Dimension are equivalent for essentially non-branching m.m.s.’s, and in particular, the local-to-global property holds for such spaces (recall that this is known to be false on m.m.s.’s where branching is allowed by [67]). The equivalence with \({\mathsf {CD}}_{loc}(K,N)\) is clearly false without some global assumption ultimately ensuring that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a geodesic-space, see Remark 13.4.
As already mentioned, and being slightly imprecise (see Sect. 13 for precise statements), the implications \({\mathsf {CD}}(K,N) \Rightarrow {\mathsf {CD}}^*(K,N) \Rightarrow {\mathsf {CD}}_{loc}(K,N)\) follow from the work of Bacher and Sturm [14], and the implication \({\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {CD}}^1_{Lip}(K,N)\) follows by adapting to the present framework what was already proved by Cavalletti and Mondino in [27] (after taking care of the important maximality requirement of transport-rays, see Theorem 7.10). So almost all of our effort goes into proving that \({\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N)\). For a smooth weighted Riemannian manifold \((M,\mathsf {d},{\mathfrak {m}})\), it is an easy exercise to show the latter implication using the Bakry–Émery differential characterization of \({\mathsf {CD}}(K,N)\): simply use an appropriate umbilic hypersurface H passing through a given point \(p \in M\) and perpendicular to a given direction \(\xi \in T_p M\), and apply the \({\mathsf {CD}}^1(K,N)\) definition to the distance function from H. Of course, this provides no insight towards how to proceed in the m.m.s. setting, so it is natural to try and obtain an alternative synthetic proof, still in the smooth setting. While this is possible, it already poses a much greater challenge, which in some sense provided the required insight leading to the strategy we ultimately employ in this work.
1.3 Main new ingredients of proof
To achieve the right disentanglement, we are required to develop several new ingredients beyond the present state-of-the-art, which, being conceptual in nature, are in our opinion of independent interest.
(1) The first is a change-of-variables formula for the density of an \(L^2\)-Optimal-Transport geodesic in X (see Theorem 11.4), which depends on a second-order derivative of associated interpolating Kantorovich potentials.
Let \(\mathrm{Geo}(X)\) denote the collection of constant speed geodesics on X parametrized on the interval [0, 1], and let \(\mathrm{e}_t : \mathrm{Geo}(X) \ni \gamma \mapsto \gamma _t \in X\) denote the evaluation map at time \(t \in [0,1]\). Given two Borel probability measures \(\mu _0,\mu _1 \in {\mathcal {P}}(X)\) with finite second moments, any \(W_2\)-geodesic \([0,1] \ni t\mapsto \mu _t \in {\mathcal {P}}(X)\) can be lifted to an optimal dynamical plan \(\nu \in {\mathcal {P}}(\mathrm{Geo}(X))\), so that \((\mathrm{e}_t)_{\sharp } \nu = \mu _t\) for all \(t \in [0,1]\). Let \(\varphi \) denote a Kantorovich potential associated to the \(L^2\)-transport problem between \(\mu _0\) and \(\mu _1\). Given \(s,t\in (0,1)\), we introduce the time-propagated intermediate Kantorovich potential \(\Phi _s^t\) by pushing forward \(\varphi _s\) via \(\mathrm{e}_t \circ \mathrm{e}_s^{-1}\), where \(\left\{ \varphi _t\right\} _{t\in [0,1]}\) is the family of interpolating Kantorovich potentials obtained via the Hopf–Lax semi-group applied to \(\varphi \). While \(\mathrm{e}_t^{-1}\) may be multi-valued, Theorem 3.11 ensures that \(\Phi _s^t = \varphi _s \circ \mathrm{e}_s \circ \mathrm{e}_t^{-1}\) is well-defined on \(\mathrm{e}_t(G_\varphi )\), the set of t-mid-points of transport geodesics.
Theorem 11.4 states that if \((X,\mathsf {d},{\mathfrak {m}})\) is an essentially non-branching m.m.s. verifying \({\mathsf {CD}}^{1}(K,N)\) (\({\mathfrak {m}}(X) < \infty \) and \(N \in (1,\infty )\)), and if \(\mu _0,\mu _1 \ll {\mathfrak {m}}\), then for \(\nu \)-a.e. transport-geodesic \(\gamma \in \mathrm{Geo}(X)\) of positive length:
$$\begin{aligned} \frac{\rho _{s} (\gamma _{s})}{\rho _{t}(\gamma _{t})} = \frac{\ell ^{2}(\gamma )}{\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})} \cdot h^\gamma _s(t) \;\;\; \text {for a.e. }t,s \in (0,1), \end{aligned} \tag{1.6}$$
where \(\rho _t\) are appropriate versions of the densities \(d\mu _t / d{\mathfrak {m}}\), and for every \(s \in (0,1)\), \(h^\gamma _s\) is a \({\mathsf {CD}}(\ell (\gamma )^2 K ,N)\) density on [0, 1] so that \(h^\gamma _s(s) = 1\). In particular, for a.e. \(t,s\in (0,1)\), \(\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})\) exists and is positive. Here \(h^\gamma _s\) is obtained from the \({\mathsf {CD}}^1(K,N)\) condition applied to the transport-ray associated to the (signed) distance function from the level set \(\left\{ \varphi _s = \varphi _s(\gamma _s)\right\} \).
Theorem 11.4 constitutes the culmination of Part II of this work, which is mostly dedicated to introducing the \({\mathsf {CD}}^{1}(K,N)\) condition and rigorously establishing the change-of-variables formula (1.6). Note that we refrain from making any assumptions on (the challenging) spatial regularity of \(\Phi _{s}^{t}\) when \(t\ne s\), so we are precluded from invoking the coarea formula in our derivation. Our main tool for deriving (1.6) is a comparison between two disintegrations of appropriate measures, one encoding \(W_2\) information and another encoding \(W_1\) information—see Sect. 11 for a heuristic derivation.
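The lifting of a \(W_2\)-geodesic to an optimal dynamical plan \(\nu \) and the evaluation maps \(\mathrm{e}_t\) used in this first ingredient can be illustrated in the simplest possible setting. The sketch below is a toy example under stated assumptions, not the construction used in the paper: X is the real line, the measures are uniform empirical measures with the same number of atoms, and we use the standard fact that the monotone (sorted) coupling is optimal for the quadratic cost in one dimension. All function names and the sample data are hypothetical.

```python
import math

def w2_empirical(xs, ys):
    """W_2 distance between two uniform empirical measures on the line
    with equally many atoms (the monotone coupling is optimal in 1D)."""
    assert len(xs) == len(ys)
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

def dynamical_plan(xs, ys):
    """Discrete stand-in for nu: a list of geodesics (start, end),
    each carrying mass 1/n."""
    return list(zip(sorted(xs), sorted(ys)))

def evaluate(geodesic, t):
    """Evaluation map e_t on a constant-speed geodesic of the line."""
    x, y = geodesic
    return (1 - t) * x + t * y

def push_forward(plan, t):
    """Atoms of mu_t = (e_t)_# nu."""
    return [evaluate(g, t) for g in plan]

mu0 = [0.0, 1.0, 3.0]   # hypothetical sample data
mu1 = [2.0, 2.5, 7.0]
nu = dynamical_plan(mu0, mu1)
s, t = 0.25, 0.75
d01 = w2_empirical(mu0, mu1)
dst = w2_empirical(push_forward(nu, s), push_forward(nu, t))
print(d01, dst, abs(t - s) * d01)
```

In this toy case the curve \(t \mapsto \mu _t\) has constant speed in \(W_2\), which is what the last two printed values compare.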
(2) To obtain disentanglement of the “Jacobian” \(t \mapsto 1/\rho _t(\gamma _t)\) into its orthogonal and tangential components, we need to understand the first-order variation of the change-of-variables formula (1.6) at \(t=s\), i.e. the second-order variation of \(t \mapsto \Phi _s^t\) at \(t=s\), which amounts to a third-order variation of \(t \mapsto \varphi _t\). Our second main new ingredient in this work is a surprising third-order bound on the variation of \(t \mapsto \varphi _t\) along the Hopf–Lax semi-group (Theorem 5.5), which holds in complete generality on any proper geodesic space.
To this end, we develop in Part I of this work a first, second, and finally third order temporal theory of intermediate Kantorovich potentials in a purely metric setting \((X,\mathsf {d})\), without specifying any reference measure \({\mathfrak {m}}\) and without assuming any non-branching assumptions. This part, which may be read independently of the other components of this work, is presented first (in Sects. 2–5), since its results are constantly used throughout the rest of this work.
Our starting point here is the pioneering work by Ambrosio–Gigli–Savaré [5, 6, Section 3], who already investigated in a very general (extended) metric space setting the first and second order temporal behaviour of the Hopf-Lax semi-group \(Q_t\) applied to a general function \(f : X \rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \). However, the essential point we observe in our treatment is that when f is itself a Kantorovich potential \(\varphi \), characterized by the property that \(\varphi = Q_1(-\varphi ^c)\) and \(\varphi ^c = Q_1(-\varphi )\), much more may be said regarding the behaviour of \(t \mapsto \varphi _t := -Q_t(-\varphi )\), even in first and second order. This is due to the fact that if we reverse time and define \({\bar{\varphi }}_t := Q_{1-t}(-\varphi ^c)\), then we obtain two-sided control over \(\varphi _t\) on the set \(\left\{ \varphi _t = {\bar{\varphi }}_t\right\} \), which turns out to coincide with the set \(\mathrm{e}_t(G_\varphi )\). So for instance, two apparently novel observations which we constantly use throughout this work are that for all \(t \in (0,1)\), \(\ell _t^2/2 := \partial _t \varphi _t\) exists on \(\mathrm{e}_t(G_\varphi )\), and that transport geodesics having a given \(x \in X\) as their t-midpoint all have the same length \(\ell _t(x)\). In Sect. 3, we establish Lipschitz regularity properties of \(t \mapsto \ell ^2_t(x)\) for all \(x \in X\), as well as upper and lower derivative estimates, both pointwise and a.e., for appropriate times t. These are then transferred in Sect. 4 to corresponding estimates for the function \(\Phi _s^t\).
Part I culminates in Sect. 5, whose goal is to prove a quantitative version of the following (somewhat oversimplified) statement, which crucially provides second order information on \(\ell _t\), or equivalently, third order information on \(\varphi _t\), along \(\gamma _t\):
$$\begin{aligned} (0,1) \ni t \mapsto z(t) := \frac{1}{\ell (\gamma )^2} \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t) \;\;\; \text {satisfies} \;\;\; z'(t) \ge z(t)^2 . \end{aligned} \tag{1.7}$$
Equivalently, this amounts to the statement that:
$$\begin{aligned} (0,1) \ni r \mapsto L(r) := \exp \left( - \frac{1}{\ell (\gamma )^2} \int ^r_{r_0} \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t) dt\right) \text { is concave}, \end{aligned} \tag{1.8}$$
since (formally)
$$\begin{aligned} \frac{L''}{L} = (\log L) '' + ((\log L)')^2 = - z' + z^2 \le 0 . \end{aligned}$$
It turns out that L(t) precisely corresponds to the tangential component of \(1/\rho _t(\gamma _t)\), and its concavity ensures that it is synthetically controlled by the linear term appearing in the definition of \(\tau ^{(t)}_{K,N}(\theta )\) in (1.2). The novel observation that it is possible to extract in a general metric setting third order information from the Hopf-Lax semi-group, which formally solves the first-order Hamilton-Jacobi equation, is in our opinion one of the most surprising parts of this work. Even in the smooth Riemannian setting, we were not able to find a synthetic proof which is easier than the one in the general metric setting; a formal differential proof of (1.7) assuming both temporal and (more challenging) spatial higher-order regularity of \(\varphi _t\) is provided in Sect. 5.1, but the latter seems to wrongly suggest that it would not be possible to extend (1.7) beyond a Hilbertian setting. Our proof in the general metric setting (Theorem 5.2) is based on a careful comparison of second order expansions of \(\varepsilon \mapsto \varphi _{\tau +\varepsilon }(\gamma _\tau )\) at \(\tau =t,s\), and subtle differences between the usual second derivative and the second Peano derivative (see Sect. 2) come into play.
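The formal link between the Riccati-type inequality \(z' \ge z^2\) and the concavity of L can be checked numerically on a toy example. The sketch below is only an illustration under assumptions that are not taken from the paper: we pick the hypothetical choice \(z(t) = \tan t\), for which \(z' = 1 + z^2 \ge z^2\) on (0, 1), so that \(L(r) = \cos r / \cos r_0\) in closed form; the helper names are likewise illustrative.

```python
import math

def L_from_z(z, r0, r, steps=2000):
    """L(r) = exp(-int_{r0}^{r} z(t) dt), using the composite
    trapezoid rule (works for r < r0 as well, since h is signed)."""
    h = (r - r0) / steps
    integral = 0.5 * (z(r0) + z(r))
    integral += sum(z(r0 + k * h) for k in range(1, steps))
    return math.exp(-h * integral)

def second_difference(f, r, h=1e-2):
    """Central finite-difference approximation of f''(r)."""
    return (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)

z = math.tan          # hypothetical example with z' = 1 + z^2 >= z^2
r0 = 0.5

def L(r):
    return L_from_z(z, r0, r)

# Compare with the closed form cos(r)/cos(r0) and inspect the sign of L''.
for r in (0.2, 0.5, 0.8):
    print(r, L(r), math.cos(r) / math.cos(r0), second_difference(L, r))
```

The borderline case \(z' = z^2\), for instance \(z(t) = 1/(a - t)\) with \(a > 1\), would give an affine L, which matches the role of the linear term in (1.2).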
-
(3)
Our third main new ingredient, described in Part III, is a certain rigidity property of the change-of-variables formula (1.6), which allows us to bootstrap the a-priori available temporal regularity, and which in combination with the first and second ingredients, enables us to achieve disentanglement.
Indeed, the definition of \(\Phi _s^t\) may be naturally extended to an appropriate domain beyond \(\mathrm{e}_t(G_\varphi )\) as follows, allowing us to easily (formally) calculate its partial derivative:
$$\begin{aligned} \Phi _s^t = \varphi _t + (t-s) \frac{\ell _t^2}{2} , \;\;\; \partial _t \Phi _s^t = \ell _t^2 + (t-s) \partial _t\frac{\ell _t^2}{2} . \end{aligned}$$
Evaluating at \(x = \gamma _t\) and plugging this into the change-of-variables formula (1.6), it follows that for \(\nu \)-a.e. geodesic \(\gamma \):
$$\begin{aligned} \frac{\rho _s(\gamma _s)}{\rho _t(\gamma _t)} = \frac{h^\gamma _s(t)}{1 + (t-s) \frac{\partial _\tau |_{\tau =t}\ell _\tau ^2/2(\gamma _t)}{\ell ^2(\gamma )}} \;\;\; \text {for a.e. } t,s \in (0,1). \end{aligned}$$(1.9)
Thanks to the idea of considering together both initial-point s and end-point t, the latter formula takes on a very rigid structure: note that on the left-hand-side the s and t variables are separated, and the denominator on the right-hand-side depends linearly on s. Consequently, we can easily bootstrap the a-priori available regularity in s and t of all terms involved. It follows that \(\frac{1}{\ell ^2(\gamma )} \partial _\tau |_{\tau =t}\ell _\tau ^2/2(\gamma _t)\) must coincide for a.e. \(t \in (0,1)\) with a locally-Lipschitz function z(t), so that (1.7) applies. In addition, by redefining \(\left\{ h^\gamma _s\right\} \) for s in a null subset of (0, 1), we can guarantee that \((0,1) \ni s \mapsto h^\gamma _s(t)\) is locally Lipschitz (for any given \(t \in (0,1)\)), even though there is a-priori no relation between the different densities \(\left\{ h^\gamma _s\right\} _{s \in (0,1)}\).
At this point, if \(\rho _t(\gamma _t)\) and z(t) were known to be \(C^2\) smooth, and equality were to hold in (1.9) for all \(s,t \in (0,1)\), we could then define
$$\begin{aligned} Y(r) := \exp \left( \int _{r_0}^r \partial _t|_{t=s} \log h^\gamma _s(t) ds \right) , \end{aligned}$$(1.10)
and as \(\partial _t|_{t=s} \log (1 + (t-s) z(t)) = z(s)\), it would follow, recalling the definition (1.8) of L, that
$$\begin{aligned} \frac{\rho _{r_0}(\gamma _{r_0})}{\rho _r(\gamma _r)} = L(r) Y(r) \;\;\; \forall r \in (0,1) . \end{aligned}$$(1.11)
Using the fact that all \(\left\{ h^\gamma _s\right\} _{s \in (0,1)}\) are \({\mathsf {CD}}(\ell (\gamma )^2 K,N)\) densities to control \(\partial ^2_t|_{t=r} \log h^\gamma _r(t)\), and surprisingly, also the concavity of L (again!) to control the mixed partial derivatives \(\partial _s \partial _t|_{t=s=r} \log h^\gamma _s(t)\), a formal computation described in Sect. 12.2 then verifies that Y is a \({\mathsf {CD}}(\ell (\gamma )^2 K,N)\) density itself. A rigorous justification without all of the above unrealistic assumptions turns out to be extremely tedious, due to the difficulty in applying an approximation argument while preserving the rigidity of the equation; this is worked out in Sect. 12 and the “Appendix”.
After taking care of all these details, we finally obtain the desired disentanglement (1.11) of the Jacobian: L is concave and so controlled synthetically by a linear distortion coefficient, whereas Y is a \({\mathsf {CD}}(\ell (\gamma )^2 K,N)\) density and so (by definition) \(Y^{1/(N-1)}\) is controlled synthetically by the \(\sigma ^{(t)}_{K,N-1}(\ell (\gamma ))\) coefficient. A standard application of Hölder’s inequality then verifies that \(J^{1/N}(r) = \rho _r(\gamma _r)^{-1/N}\) is controlled by the \(\tau ^{(t)}_{K,N}(\ell (\gamma ))\) distortion coefficient, i.e. satisfies (1.1)—in fact for all \(t_0,t_1 \in [0,1]\)—thereby establishing \({\mathsf {CD}}(K,N)\), see Theorem 13.2.
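To make the Hölder step explicit, the following is a sketch under assumptions that are not spelled out in this section: the standard relation \(\tau ^{(t)}_{K,N}(\theta ) = t^{1/N} \sigma ^{(t)}_{K,N-1}(\theta )^{(N-1)/N}\) between the distortion coefficients, the shorthand \(r_t := (1-t) a + t b\) for \(a,b \in (0,1)\), and the abbreviations \(\sigma _t, \sigma _{1-t}\) (respectively \(\tau _t, \tau _{1-t}\)) for the corresponding coefficients evaluated at the length of the sub-segment of \(\gamma \) between times a and b. Concavity of L and the density property of Y read
$$\begin{aligned} L(r_t) \ge (1-t) L(a) + t L(b) , \;\;\; Y^{\frac{1}{N-1}}(r_t) \ge \sigma _{1-t} Y^{\frac{1}{N-1}}(a) + \sigma _t Y^{\frac{1}{N-1}}(b) . \end{aligned}$$
Hölder's inequality with exponents N and \(N/(N-1)\), in the form
$$\begin{aligned} (a_0 + a_1)^{\frac{1}{N}} (b_0 + b_1)^{\frac{N-1}{N}} \ge a_0^{\frac{1}{N}} b_0^{\frac{N-1}{N}} + a_1^{\frac{1}{N}} b_1^{\frac{N-1}{N}} \;\;\; \forall a_i , b_i \ge 0 , \end{aligned}$$
applied with \(a_0 = (1-t) L(a)\), \(a_1 = t L(b)\), \(b_0 = \sigma _{1-t} Y^{1/(N-1)}(a)\) and \(b_1 = \sigma _t Y^{1/(N-1)}(b)\), then gives, since J is a constant multiple of LY by (1.11),
$$\begin{aligned} J^{\frac{1}{N}}(r_t) \ge (1-t)^{\frac{1}{N}} \sigma _{1-t}^{\frac{N-1}{N}} J^{\frac{1}{N}}(a) + t^{\frac{1}{N}} \sigma _t^{\frac{N-1}{N}} J^{\frac{1}{N}}(b) = \tau _{1-t} J^{\frac{1}{N}}(a) + \tau _t J^{\frac{1}{N}}(b) . \end{aligned}$$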
The definition (1.10) of Y finally sheds light on the crucial role which the parameter \(s \in (0,1)\) plays in our strategy—its role is to vary between the different \(W_2\)-geodesics from which the \({\mathsf {CD}}_{loc}(K,N)\) information is extracted into the \({\mathsf {CD}}^1(K,N)\) information on the disintegration into transport-rays from the (signed) distance functions from level sets \(\left\{ \varphi _s = \varphi _s(\gamma _s)\right\} \), thereby coming full circle with the observation of (1.4).
Besides establishing the local-to-global property of \({\mathsf {CD}}(K,N)\) and the equivalence of its various variants (in our setting), we emphasize that as a by-product of our proof, we obtain a remarkable new self-improvement property of \({\mathsf {CD}}(K,N)\): the \(\tau _{K,N}\)-concavity (1.1) of the transport Jacobian \(J_t(\gamma _t)\) along all \(W_2\)-geodesics implies the (a-priori) stronger “L-Y” decomposition \(J_t(\gamma _t) = L_\gamma (t) Y_\gamma (t)\), where \(L_\gamma \) is concave and \(Y_\gamma \) is a \({\mathsf {CD}}(\ell (\gamma )^2 K, N)\) density on (0, 1). As already mentioned above, this self-improvement is false for a single \(W_2\)-geodesic. We believe that the stronger “L-Y” information will prove to be of further use in the study of \({\mathsf {CD}}(K,N)\) essentially non-branching spaces.
We refer to Sect. 13 for the final details and for additional immediate corollaries of the Main Theorem 1.1 pertaining to \({\mathsf {RCD}}(K,N)\) and strong \({\mathsf {CD}}(K,N)\) spaces. We also provide there several concluding remarks and suggestions for further investigation.
Part I Temporal theory of Optimal-Transport
2 Preliminaries
2.1 Geodesics
A metric space \((X,\mathsf {d})\) is called a length space if for all \(x,y \in X\), \(\mathsf {d}(x,y) = \inf \ell (\sigma )\), where the infimum is over all (continuous) curves \(\sigma : I \rightarrow X\) connecting x and y, and \(\ell (\sigma ) := \sup \sum _{i=1}^k \mathsf {d}(\sigma (t_{i-1}),\sigma (t_{i}))\) denotes the curve’s length, where the latter supremum is over all \(k \in {\mathbb {N}}\) and \(t_0 \le \cdots \le t_k\) in the interval \(I \subset {\mathbb {R}}\). A curve \(\gamma \) is called a geodesic if \(\ell (\gamma |_{[t_0,t_1]}) = \mathsf {d}(\gamma (t_0),\gamma (t_1))\) for all \([t_0,t_1] \subset I\). If \(\ell (\gamma ) = 0\) we will say that \(\gamma \) is a null geodesic. The metric space is called a geodesic space if for all \(x,y \in X\) there exists a geodesic in X connecting x and y. We denote by \(\mathrm{Geo}(X)\) the set of all closed directed constant-speed geodesics parametrized on the interval [0, 1]:
$$\begin{aligned} \mathrm{Geo}(X) := \left\{ \gamma : [0,1] \rightarrow X \;;\; \mathsf {d}(\gamma (s),\gamma (t)) = |s-t| \, \mathsf {d}(\gamma (0),\gamma (1)) \;\;\; \forall s,t \in [0,1] \right\} . \end{aligned}$$
We regard \(\mathrm{Geo}(X)\) as a subset of all Lipschitz maps \(\text {Lip}([0,1], X)\) endowed with the uniform topology. We will frequently use \(\gamma _t := \gamma (t)\).
The metric space is called proper if every closed ball (of finite radius) is compact. It follows from the metric version of the Hopf-Rinow Theorem (e.g. [22, Theorem 2.5.28]) that for complete length spaces, local compactness is equivalent to properness, and that complete proper length spaces are in fact geodesic.
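Given a subset \(D \subset X \times {\mathbb {R}}\), we denote its sections by
$$\begin{aligned} D(t) := \left\{ x \in X \;;\; (x,t) \in D \right\} , \;\;\; D(x) := \left\{ t \in {\mathbb {R}}\;;\; (x,t) \in D \right\} . \end{aligned}$$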
Given a subset \(G \subset \mathrm{Geo}(X)\), we denote by \(\mathring{G} := \left\{ \gamma |_{(0,1)} \;;\; \gamma \in G\right\} \) the corresponding open-ended geodesics on (0, 1). For a subset of (closed or open) geodesics \({\tilde{G}}\), we denote
$$\begin{aligned} D({\tilde{G}}) := \left\{ (x,t) \in X \times [0,1] \;;\; \exists \gamma \in {\tilde{G}} , \; t \in \mathrm{Dom}(\gamma ) , \; x = \gamma _t \right\} , \;\;\; {\tilde{G}}(x) := D({\tilde{G}})(x) . \end{aligned}$$
We denote by \(\mathrm{e}_t : \mathrm{Geo}(X) \ni \gamma \mapsto \gamma _t \in X\) the (continuous) evaluation map at \(t \in [0,1]\), and abbreviate given \(I \subset [0,1]\) as follows:
2.2 Derivatives
For a function \(g : A \rightarrow {\mathbb {R}}\) on a subset \(A \subset {\mathbb {R}}\), denote its upper and lower derivatives at a point \(t_0 \in A\) which is an accumulation point of A by
$$\begin{aligned} \frac{{\overline{d}}}{dt} g(t_0) := \limsup _{A \ni t \rightarrow t_0} \frac{g(t) - g(t_0)}{t-t_0} \ge \underline{\frac{d}{dt}} g(t_0) := \liminf _{A \ni t \rightarrow t_0} \frac{g(t) - g(t_0)}{t-t_0} . \end{aligned}$$
We will say that g is differentiable at \(t_0\) iff \(\frac{d}{dt} g(t_0) := \frac{{\overline{d}}}{dt} g(t_0) = \underline{\frac{d}{dt}} g(t_0) < \infty \). This is a slightly more general definition of differentiability than the traditional one which requires that \(t_0\) be an interior point of A.
Remark 2.1
Note that there are only a countable number of isolated points in A, so a.e. point in A is an accumulation point. In addition, it is clear that if \(t_0 \in B \subset A\) is an accumulation point of B and g is differentiable at \(t_0\), then \(g|_B\) is also differentiable at \(t_0\) with the same derivative. In particular, if g is a.e. differentiable on A then \(g|_B\) is also a.e. differentiable on B and the derivatives coincide.
Remark 2.2
Denote by \(A_1 \subset A\) the subset of density one points of A (which are in particular accumulation points of A). By Lebesgue’s Density Theorem \({\mathcal {L}}^1(A \setminus A_1) = 0\), where we denote by \({\mathcal {L}}^1\) the Lebesgue measure on \({\mathbb {R}}\) throughout this work. If \(g : A \rightarrow {\mathbb {R}}\) is locally Lipschitz, consider any locally Lipschitz extension \({\hat{g}} : {\mathbb {R}}\rightarrow {\mathbb {R}}\) of g. Then it is easy to check that for \(t_0 \in A_1\), g is differentiable in the above sense at \(t_0\) if and only if \({{\hat{g}}}\) is differentiable at \(t_0\) in the usual sense, in which case the derivatives coincide. In particular, as \({{\hat{g}}}\) is a.e. differentiable on \({\mathbb {R}}\), it follows that g is a.e. differentiable on \(A_1\) and hence on A, and it holds that \(\frac{d}{dt} g = \frac{d}{dt} {{\hat{g}}}\) a.e. on A.
Let \(f : I \rightarrow {\mathbb {R}}\) denote a convex function on an open interval \(I \subset {\mathbb {R}}\). It is well-known that the left and right derivatives \(f^{\prime ,-}\) and \(f^{\prime ,+}\) exist at every point in I and that f is locally Lipschitz there; in particular, f is differentiable at a given point iff the left and right derivatives coincide there. Denoting by \(D \subset I\) the differentiability points of f in I, it is also well-known that \(I \setminus D\) is at most countable. Consequently, any point in D is an accumulation point, and we may consider the differentiability in D of \(f' : D \rightarrow {\mathbb {R}}\) as defined above. We will require the following elementary one-dimensional version (probably due to Jessen) of the well-known Aleksandrov’s theorem about twice differentiability a.e. of convex functions on \({\mathbb {R}}^n\) (see [45, Theorem 5.2.1] or [20, Section 2.6], and [71, p. 31] for historical comments). Clearly, all of these results extend to locally semi-convex and semi-concave functions as well; recall that a function \(f : I \rightarrow {\mathbb {R}}\) is called semi-convex (semi-concave) if there exists \(C \in {\mathbb {R}}\) so that \(I \ni x \mapsto f(x) + C x^2\) is convex (concave).
Lemma 2.3
(Second Order Differentiability of Convex Function) Let \(f : I \rightarrow {\mathbb {R}}\) be a convex function on an open interval \(I \subset {\mathbb {R}}\), and let \(\tau _0 \in I\) and \(\Delta \in {\mathbb {R}}\). Then the following statements are equivalent:
-
(1)
f is differentiable at \(\tau _0\), and if \(D \subset I\) denotes the subset of differentiability points of f in I, then \(f' : D \rightarrow {\mathbb {R}}\) is differentiable at \(\tau _0\) with
$$\begin{aligned} (f')'(\tau _0) := \lim _{D \ni \tau \rightarrow \tau _0} \frac{f'(\tau ) - f'(\tau _0)}{\tau -\tau _0} = \Delta . \end{aligned}$$ -
(2)
The right derivative \(f^{\prime ,+} : I \rightarrow {\mathbb {R}}\) is differentiable at \(\tau _0\) with \((f^{\prime ,+})'(\tau _0)=\Delta \).
-
(3)
The left derivative \(f^{\prime ,-} : I \rightarrow {\mathbb {R}}\) is differentiable at \(\tau _0\) with \((f^{\prime ,-})'(\tau _0)=\Delta \).
-
(4)
f is differentiable at \(\tau _0\) and has the following second order expansion there:
$$\begin{aligned} f(\tau _0 + \varepsilon ) = f(\tau _0) + f'(\tau _0) \varepsilon + \Delta \frac{\varepsilon ^2}{2} + o(\varepsilon ^2)\quad \text { as }\varepsilon \rightarrow 0. \end{aligned}$$
In this case, f is said to have a second Peano derivative at \(\tau _0\).
We remark that even for a differentiable function f, while the implication \((1) \Rightarrow (4)\) follows by Taylor’s theorem (existence of the second derivative at a point implies existence of the second Peano derivative there), the converse implication is in general false (see e.g. [61] for a nice discussion). For a locally semi-convex or semi-concave function f, we will say that f is twice differentiable at \(\tau _0\) if any (all) of the above equivalent conditions hold for some \(\Delta \in {\mathbb {R}}\), and write \((\frac{d}{d\tau })^{2}|_{\tau = \tau _0} f(\tau ) = \Delta \).
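The failure of the converse implication can be checked numerically. The sketch below uses the standard textbook example \(f(x) = x^3 \sin (1/x)\) with \(f(0)=0\), which is an assumption of ours and is not taken from the cited discussion; the function names are illustrative. This f is not semi-convex near 0, so it does not contradict Lemma 2.3. The quotient defining the second Peano derivative at 0 tends to 0, while the difference quotient of \(f'\) keeps oscillating between values near \(-1\) and \(+1\).

```python
import math


def f(x: float) -> float:
    """f(x) = x^3 sin(1/x), extended by f(0) = 0."""
    return 0.0 if x == 0.0 else x**3 * math.sin(1.0 / x)


def f_prime(x: float) -> float:
    """Classical derivative of f; f'(0) = 0 because |f(x)| <= |x|^3."""
    if x == 0.0:
        return 0.0
    return 3.0 * x**2 * math.sin(1.0 / x) - x * math.cos(1.0 / x)


def peano_quotient(eps: float) -> float:
    """2 (f(eps) - f(0) - eps f'(0)) / eps^2; its limit is the second Peano derivative at 0."""
    return 2.0 * (f(eps) - f(0.0) - eps * f_prime(0.0)) / eps**2


def derivative_quotient(eps: float) -> float:
    """(f'(eps) - f'(0)) / eps; a limit here would be the classical second derivative at 0."""
    return (f_prime(eps) - f_prime(0.0)) / eps


if __name__ == "__main__":
    for k in (10, 100, 1000):
        a = 1.0 / (2 * k * math.pi)        # here cos(1/a) = 1
        b = 1.0 / ((2 * k + 1) * math.pi)  # here cos(1/b) = -1
        print(k, peano_quotient(a), derivative_quotient(a), derivative_quotient(b))
```

Since \(|{\text {peano\_quotient}}(\varepsilon )| \le 2|\varepsilon |\), the second Peano derivative at 0 exists and equals 0, whereas the two sampled sequences show that \((f')'(0)\) does not exist.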
Finally, we will require the following slightly more refined notation.
Definition
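Given an open interval \(I \subset {\mathbb {R}}\) and a function \(f : I \rightarrow {\mathbb {R}}\) which is differentiable at \(\tau _0 \in I\), we define its upper and lower second Peano derivatives at \(\tau _0\), denoted \({\overline{{\mathcal {P}}}}_2 f(\tau _0)\) and \({\underline{{\mathcal {P}}}}_2 f(\tau _0)\) respectively, by
$$\begin{aligned} {\overline{{\mathcal {P}}}}_2 f(\tau _0) := \limsup _{\varepsilon \rightarrow 0} \frac{h(\varepsilon )}{\varepsilon ^2} \ge {\underline{{\mathcal {P}}}}_2 f(\tau _0) := \liminf _{\varepsilon \rightarrow 0} \frac{h(\varepsilon )}{\varepsilon ^2} , \end{aligned}$$
where
$$\begin{aligned} h(\varepsilon ) := 2 \left( f(\tau _0 + \varepsilon ) - f(\tau _0) - \varepsilon f'(\tau _0) \right) . \end{aligned}$$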
Clearly f has a second Peano derivative at \(\tau _0\) iff \({\overline{{\mathcal {P}}}}_2 f(\tau _0) = {\underline{{\mathcal {P}}}}_2 f(\tau _0) < \infty \).
The following is a type of Stolz–Cesàro lemma:
Lemma 2.4
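Given an open interval \(I \subset {\mathbb {R}}\) and a locally absolutely continuous function \(f : I \rightarrow {\mathbb {R}}\) which is differentiable at \(\tau _0 \in I\), we have
$$\begin{aligned} \underline{\frac{d}{dt}} f'(\tau _0) \le {\underline{{\mathcal {P}}}}_2 f(\tau _0) \le {\overline{{\mathcal {P}}}}_2 f(\tau _0) \le \frac{{\overline{d}}}{dt} f'(\tau _0) . \end{aligned}$$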
Proof
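By local absolute continuity, f is differentiable a.e. in I and we have for small enough \(\left| \varepsilon \right| \):
$$\begin{aligned} f(\tau _0 + \varepsilon ) - f(\tau _0) - \varepsilon f'(\tau _0) = \int _0^\varepsilon \left( f'(\tau _0 + \delta ) - f'(\tau _0) \right) d\delta , \end{aligned}$$
and hence
$$\begin{aligned} \frac{2 \left( f(\tau _0 + \varepsilon ) - f(\tau _0) - \varepsilon f'(\tau _0) \right) }{\varepsilon ^2} = \frac{\int _0^\varepsilon \frac{f'(\tau _0 + \delta ) - f'(\tau _0)}{\delta } \, \delta \, d\delta }{\int _0^\varepsilon \delta \, d\delta } . \end{aligned}$$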
Taking appropriate subsequential limits as \(\varepsilon \rightarrow 0\), the asserted inequalities readily follow. \(\square \)
3 Temporal theory of intermediate-time Kantorovich potentials: first and second order
In the next sections, we will only consider the quadratic cost function \(c=\mathsf {d}^2/2\) on \(X \times X\).
Definition
(c-Concavity, Kantorovich Potential) The c-transform of a function \(\psi : X \rightarrow {\mathbb {R}}\cup \left\{ \pm \infty \right\} \) is defined as the following (upper semi-continuous) function:
$$\begin{aligned} \psi ^c(x) := \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2} - \psi (y) . \end{aligned}$$
A function \(\varphi : X \rightarrow {\mathbb {R}}\cup \left\{ \pm \infty \right\} \) is called c-concave if \(\varphi = \psi ^c\) for some \(\psi \) as above. It is well known [76, Exercise 2.35] that \(\varphi \) is c-concave iff \((\varphi ^c)^c = \varphi \). In the context of Optimal-Transport with respect to the quadratic cost c, a c-concave function \(\varphi : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) which is not identically equal to \(-\infty \) is also known as a Kantorovich potential, and this is how we will refer to such functions in this work. In that case, \(\varphi ^c : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) is also a Kantorovich potential, called the dual or conjugate potential.
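The criterion \((\varphi ^c)^c = \varphi \) can be tested on a toy example. The sketch below is an illustration under our own assumptions: X is a finite subset of the real line with the induced distance (so it is a metric space, but not a geodesic one), the cost is \(c = \mathsf {d}^2/2\), and the function names are hypothetical. It checks that a function of the form \(\psi ^c\) passes the criterion, and it lets one compare with an arbitrary \(\psi \).

```python
import random


def c_transform(psi, points):
    """psi^c(x) = min_y d(x, y)^2 / 2 - psi(y) on a finite subset of the real line."""
    return [min((x - y) ** 2 / 2.0 - p for y, p in zip(points, psi)) for x in points]


def is_c_concave(phi, points, tol=1e-12):
    """Check the criterion (phi^c)^c = phi up to a floating-point tolerance."""
    phi_cc = c_transform(c_transform(phi, points), points)
    return max(abs(a - b) for a, b in zip(phi_cc, phi)) <= tol


if __name__ == "__main__":
    rng = random.Random(0)
    points = [i / 20.0 for i in range(21)]
    psi = [rng.uniform(-1.0, 1.0) for _ in points]
    phi = c_transform(psi, points)
    # phi is a c-transform, hence c-concave by definition.
    print(is_c_concave(phi, points))
    # An arbitrary psi need not be c-concave; one always has (psi^c)^c >= psi.
    print(is_c_concave(psi, points))
```

The first check relies on the general identity \(((\psi ^c)^c)^c = \psi ^c\), which holds on any metric space.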
There is a natural way to interpolate between a Kantorovich potential and its dual by means of the Hopf-Lax semi-group, resulting in intermediate-time Kantorovich potentials \(\left\{ \varphi _t\right\} _{t \in (0,1)}\). The goal of the next three sections is to provide first, second and third order information on the time-behavior \(t \mapsto \varphi _t(x)\) at intermediate times \(t \in (0,1)\). In these sections, we only assume that \((X,\mathsf {d})\) is a proper geodesic metric space.
In this section, we focus on first and second order information. The main new result is Theorem 3.11.
3.1 Hopf-Lax semi-group
We begin with several well-known definitions which we slightly modify and specialize to our setting.
Definition
(Hopf-Lax Transform) Given \(f : X \rightarrow {\mathbb {R}}\cup \left\{ \pm \infty \right\} \) which is not identically \(+\infty \) and \(t > 0\), define the Hopf-Lax transform \(Q_t f : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) by
$$\begin{aligned} Q_t f(x) := \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2t} + f(y) . \end{aligned}$$(3.1)
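Clearly either \(Q_t f \equiv -\infty \) or \(Q_t f(x)\) is finite for all \(x \in X\) (as our metric \(\mathsf {d}\) is finite). Consequently, we denote:
$$\begin{aligned} t_*(f) := \sup \left\{ t > 0 \;;\; Q_t f \not \equiv -\infty \right\} , \end{aligned}$$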
setting \(t_*(f) = 0\) if the supremum is over an empty set. Finally, we set \(Q_0 f := f\).
It is not hard to check (see e.g. [49, Theorem 2.5 (i)]) that when \((X,\mathsf {d})\) is a length space (and in particular geodesic), the Hopf-Lax transform is in fact a semi-group on \([0,\infty )\):
$$\begin{aligned} Q_{s+t} f = Q_s \circ Q_t f \;\;\; \forall s,t \ge 0 . \end{aligned}$$
Remark 3.1
It is also possible to extend the definition of \(Q_t f\) to negative times \(t < 0\) by setting
$$\begin{aligned} Q_t f := -Q_{-t}(-f) , \;\;\; t < 0 . \end{aligned}$$
This is called the backwards Hopf-Lax semi-group on \((-\infty ,0]\). However, \(({\mathbb {R}},+) \ni t \mapsto (Q_t,\circ )\) is in general not an abelian group homomorphism, not even for \(t \in [0,1]\) when applied to a Kantorovich potential \(\varphi \) (characterized by \(Q_{-1} \circ Q_1(-\varphi ) = -\varphi \))—see Sect. 3.3. This will be a rather significant nuisance we will need to cope with in this work.
Clearly \((0,\infty ) \times X \ni (t,x) \mapsto Q_t f(x)\) is upper semi-continuous as the infimum of continuous functions in (t, x), and by definition \([0,\infty ) \ni t \mapsto Q_t f(x)\) is monotone non-increasing for each \(x \in X\). Consequently, \((0,\infty ) \ni t \mapsto Q_t f(x)\) must be continuous from the left.
It may also be shown (see [5, Lemma 3.1.2]) that \(X \times (0,t_*(f)) \ni (x,t) \mapsto Q_t f (x)\) is continuous (and in fact locally Lipschitz, see Theorem 3.4 below). Together with the left-continuity, we deduce that for every \(x \in X\), \((0,t_*(f)] \ni t \mapsto Q_t f(x)\) is continuous.
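The monotonicity in time and the semi-group property can be observed on a discretization. The sketch below is only an illustration under our own assumptions: the space is a uniform grid on [0, 1] with the Euclidean distance, the test function f and all names are hypothetical, and \(Q_t f(x)\) is taken to be the infimum over y of \(\mathsf {d}(x,y)^2/(2t) + f(y)\). Since a finite grid is not a length space, only the inequality \(Q_s (Q_t f) \ge Q_{s+t} f\) holds exactly there; the reverse inequality holds up to an error that shrinks with the grid spacing.

```python
import math


def hopf_lax(f, points, t):
    """Q_t f(x) = min_y |x - y|^2 / (2 t) + f(y), with x and y restricted to the grid."""
    return [min((x - y) ** 2 / (2.0 * t) + fy for y, fy in zip(points, f)) for x in points]


def max_gap(u, v):
    """Largest value of u - v over the grid."""
    return max(a - b for a, b in zip(u, v))


if __name__ == "__main__":
    n = 200
    points = [i / n for i in range(n + 1)]
    f = [math.sin(7.0 * x) + abs(x - 0.4) for x in points]
    s, t = 0.3, 0.2
    q_t = hopf_lax(f, points, t)
    q_st = hopf_lax(f, points, s + t)
    q_s_q_t = hopf_lax(q_t, points, s)
    # Monotonicity in time: Q_{s+t} f <= Q_t f <= f pointwise, so both gaps are <= 0.
    print(max_gap(q_st, q_t), max_gap(q_t, f))
    # Semi-group: Q_s(Q_t f) >= Q_{s+t} f always; the second gap is small on a fine grid.
    print(max_gap(q_st, q_s_q_t), max_gap(q_s_q_t, q_st))
```

On a grid of spacing h the second gap is at most \((1/s + 1/t) h^2 / 8\), obtained by rounding the optimal intermediate point to the nearest grid point.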
Note that by definition \(f^c = Q_1(-f)\), and that a Kantorovich pair of conjugate potentials \(\varphi ,\varphi ^c : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) are characterized by not being identically equal to \(-\infty \) and satisfying:
$$\begin{aligned} \varphi = Q_1(-\varphi ^c) , \;\;\; \varphi ^c = Q_1(-\varphi ) . \end{aligned}$$
In particular, \(t_*(\varphi ),t_*(\varphi ^c) \ge 1\), and we a-posteriori deduce that \(\varphi , \varphi ^c\) are both finite on the entire space X (we have used above the fact that the metric \(\mathsf {d}\) is finite, which differs from other more general treatments).
Definition
(Interpolating Intermediate-Time Kantorovich Potentials) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), the interpolating Kantorovich potential at time \(t \in [0,1]\), \(\varphi _t : X \rightarrow {\mathbb {R}}\), is defined for all \(t \in [0,1]\) by
$$\begin{aligned} \varphi _t(x) := -Q_t(-\varphi )(x) . \end{aligned}$$
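Note that \(\varphi _0 = \varphi \), \(\varphi _1 = -\varphi ^c\), and
$$\begin{aligned} -\varphi _t(x) = \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2t} - \varphi (y) \;\;\; \forall t \in (0,1] . \end{aligned}$$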
Applying the above mentioned general properties of the Hopf-Lax semi-group to \(\varphi _t\), it will be useful to record:
Lemma 3.2
-
(1)
\((x,t) \mapsto \varphi _t(x)\) is lower semi-continuous on \(X \times (0,1]\) and continuous on \(X \times (0,1)\).
-
(2)
For every \(x \in X\), \([0,1] \ni t \mapsto \varphi _t(x)\) is monotone non-decreasing and continuous on (0, 1].
Definition
(Kantorovich Geodesic) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), a geodesic \(\gamma \in \mathrm{Geo}(X)\) is called a \(\varphi \)-Kantorovich (or optimal) geodesic if
$$\begin{aligned} \varphi (\gamma _0) + \varphi ^c(\gamma _1) = \frac{\mathsf {d}(\gamma _0,\gamma _1)^2}{2} = \frac{\ell (\gamma )^2}{2} . \end{aligned}$$
We denote all \(\varphi \)-Kantorovich geodesics by \(G_\varphi \). Note that \(\gamma \in G_{\varphi }\) iff \(\gamma ^c \in G_{\varphi ^c}\), where \(\gamma ^c(t) := \gamma (1-t)\) is the time-reversed geodesic. By upper semi-continuity of \(\varphi \) and \(\varphi ^c\), it follows that \(G_\varphi \) is a closed subset of \(\mathrm{Geo}(X)\).
The following is not hard to check (see e.g. [24, Corollary 2.16]):
Lemma 3.3
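Let \(\gamma \) be a \(\varphi \)-Kantorovich geodesic. Then
$$\begin{aligned} \varphi _s(\gamma _s) - \varphi _r(\gamma _r) = \frac{r-s}{2} \ell (\gamma )^2 \;\;\; \forall s,r \in [0,1] . \end{aligned}$$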
3.2 Distance functions
The following important definition was given by Ambrosio–Gigli–Savaré [5, 6]:
Definition
(Distance functions \(D^{\pm }_f\)) Given \(f : X \rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \) which is not identically \(+\infty \), denote
$$\begin{aligned} D^+_f(x,t) := \sup \limsup _{n \rightarrow \infty } \mathsf {d}(x,y_n) \ge D^-_f(x,t) := \inf \liminf _{n \rightarrow \infty } \mathsf {d}(x,y_n) , \end{aligned}$$
where the supremum and infimum above run over the set of minimizing sequences \(\left\{ y_n\right\} \) in the definition of the Hopf-Lax transform (3.1). A simple diagonal argument shows that the (outer) supremum and infimum above are in fact attained.
The following properties were established in [5, 6, Chapter 3]:
Theorem 3.4
(Ambrosio–Gigli–Savaré) For any metric space \((X,\mathsf {d})\) (not necessarily proper, complete nor geodesic):
-
(1)
Both functions \(D^{\pm }_f(x,t)\) are locally finite on \(X \times (0,t_*(f))\), and \((x,t) \mapsto Q_t f(x)\) is locally Lipschitz there.
-
(2)
\((x,t) \mapsto D^{\pm }_f(x,t)\) is upper (\(D^{+}_f(x,t)\)) / lower (\(D^{-}_f(x,t)\)) semi-continuous on \(X \times (0,t_*(f))\).
-
(3)
For every \(x \in X\), both functions \((0,t_*(f)) \ni t \mapsto D^{\pm }_f(x,t)\) are monotone non-decreasing and coincide except where they have (at most countably many) jump discontinuities.
-
(4)
For every \(x \in X\), \(\partial _t^{\pm } Q_t f(x) = - \frac{(D^{\pm }_f(x,t))^2}{2 t^2}\) for all \(t \in (0,t_*(f))\), where \(\partial _t^{-}\) and \(\partial _t^+\) denote the left and right partial derivatives, respectively. In particular, the map \((0,t_*(f)) \ni t \mapsto Q_t f(x)\) is locally Lipschitz and locally semi-concave, and differentiable at \(t \in (0,t_*(f))\) iff \(D^+_f(x,t) = D^-_f(x,t)\).
It may be instructive to recall the proof of property (3) above, which is related to some ensuing properties, so for completeness, we present it below. For simplicity, we restrict to the case of interest for us, and first record:
Lemma 3.5
Given a proper metric space X, a lower semi-continuous \(f : X \rightarrow {\mathbb {R}}\), \(x \in X\) and \(t \in (0,t_*(f))\), there exist \(y^{\pm }_t \in X\) so that
$$\begin{aligned} Q_t f(x) = \frac{\mathsf {d}(x,y^{\pm }_t)^2}{2t} + f(y^{\pm }_t) \;\;\; \text {and} \;\;\; \mathsf {d}(x,y^{\pm }_t) = D^{\pm }_f(x,t) . \end{aligned}$$
Recall that \(-\varphi \) is indeed lower semi-continuous for any Kantorovich potential \(\varphi \).
Proof of Lemma 3.5
Let \(\{y_t^{\pm ,n}\}\) denote a minimizing sequence so that
By property (1) we know that \(D^{\pm }_f(x,t)< R < \infty \), and the properness implies that the closed geodesic ball \(B_R(x)\) is compact. Consequently \(\{y_t^{\pm ,n}\}\) has a converging subsequence to \(y^{\pm }_t\), and the lower semi-continuity of f implies that:
as asserted. \(\square \)
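Proof of (3) for proper X and lower semi-continuous f. The assertion will follow immediately after establishing
$$\begin{aligned} D^+_f(x,s) \le D^-_f(x,t) \;\;\; \forall \, 0< s< t < t_*(f) , \end{aligned}$$
since trivially \(D^-_f \le D^+_f\) and since a monotone function can only have a countable number of jump discontinuities. By Lemma 3.5, there exist \(y^+_s\) and \(y^-_t\) so that
$$\begin{aligned} Q_s f(x) = \frac{\mathsf {d}(x,y^+_s)^2}{2s} + f(y^+_s) , \;\;\; \mathsf {d}(x,y^+_s) = D^+_f(x,s) , \end{aligned}$$
and:
$$\begin{aligned} Q_t f(x) = \frac{\mathsf {d}(x,y^-_t)^2}{2t} + f(y^-_t) , \;\;\; \mathsf {d}(x,y^-_t) = D^-_f(x,t) . \end{aligned}$$
It follows that
$$\begin{aligned} \frac{\mathsf {d}(x,y^+_s)^2}{2s} + f(y^+_s)&\le \frac{\mathsf {d}(x,y^-_t)^2}{2s} + f(y^-_t) , \\ \frac{\mathsf {d}(x,y^-_t)^2}{2t} + f(y^-_t)&\le \frac{\mathsf {d}(x,y^+_s)^2}{2t} + f(y^+_s) . \end{aligned}$$
Summing these two inequalities and rearranging terms, one deduces
$$\begin{aligned} \left( \frac{1}{s} - \frac{1}{t}\right) D^+_f(x,s)^2 \le \left( \frac{1}{s} - \frac{1}{t}\right) D^-_f(x,t)^2 , \end{aligned}$$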
as required. \(\square \)
3.3 Intermediate-time duality and time-reversed potential
It is immediate to show by inspecting the definitions that we always have (e.g. [77, Theorem 7.34 (iii)] or [3, Proposition 2.17 (ii)]):
$$\begin{aligned} Q_{-s} \circ Q_s f \le f \;\;\; \forall s > 0 ; \end{aligned}$$
this is an inherent group-structure incompatibility of the Hopf-Lax forward and backward semi-groups. Note that for \(f = -\varphi \) where \(\varphi \) is a Kantorovich potential, we do have equality for \(s=1\), and in fact for all \(s \in [0,1]\). However, for \(f = Q_t(-\varphi )\), \(t \in (0,1)\) and \(s=1-t\), we can only assert an inequality above ([77, Theorem 7.36], [3, Corollary 2.23 (i)]):
$$\begin{aligned} -Q_{1-t}(-\varphi ^c) = Q_{-(1-t)} \circ Q_{1-t} \left( Q_t(-\varphi ) \right) \le Q_t(-\varphi ) , \end{aligned}$$(3.2)
and equality may not hold at every point of X (cf. [77, Remark 7.37]). Nevertheless, in our setting, the subset where equality is attained may be characterized as in the next proposition. We first introduce the following very convenient:
Definition
(Time-Reversed Interpolating Potential) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), define the time-reversed interpolating Kantorovich potential at time \(t \in [0,1]\), \({\bar{\varphi }}_t : X \rightarrow {\mathbb {R}}\), as
$$\begin{aligned} {\bar{\varphi }}_t := Q_{1-t}(-\varphi ^c) = -(\varphi ^c)_{1-t} . \end{aligned}$$
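Note that \({\bar{\varphi }}_0 = \varphi \), \({\bar{\varphi }}_1 = -\varphi ^c\), and
$$\begin{aligned} {\bar{\varphi }}_t(x) = \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2(1-t)} - \varphi ^c(y) \;\;\; \forall t \in [0,1) . \end{aligned}$$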
Proposition 3.6
-
(1)
\(\varphi _0 = {\bar{\varphi }}_0 = \varphi \) and \(\varphi _1 = {\bar{\varphi }}_1 = -\varphi ^c\).
-
(2)
For all \(t \in [0,1]\), \(\varphi _t \le {\bar{\varphi }}_t\).
-
(3)
For any \(t \in (0,1)\), \(\varphi _t(x) = {\bar{\varphi }}_t(x)\) if and only if \(x \in \mathrm{e}_t(G_\varphi )\). In other words:
$$\begin{aligned} D(\mathring{G}_\varphi ) = \left\{ (x,t) \in X \times (0,1) \; ; \; \varphi _t(x) = {\bar{\varphi }}_t(x) \right\} . \end{aligned}$$(3.3)
(1) is immediate by c-concavity, and (2) is a reformulation of (3.2), so the only assertion requiring proof is (3). The if direction is well-known (e.g. [77, Theorem 7.36], [3, Corollary 2.23 (ii)]), but the other direction appears to be new. It is based on the following simple lemma, which we will use again later on:
Lemma 3.7
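Assume that for some \(x,y,z \in X\) and \(t \in (0,1)\):
$$\begin{aligned} \frac{\mathsf {d}(x,y)^2}{2t} - \varphi (y) + \frac{\mathsf {d}(x,z)^2}{2(1-t)} - \varphi ^c(z) = 0 . \end{aligned}$$
Then x is a t-intermediate point between y and z:
$$\begin{aligned} \mathsf {d}(y,x) = t \, \mathsf {d}(y,z) , \;\;\; \mathsf {d}(x,z) = (1-t) \, \mathsf {d}(y,z) , \end{aligned}$$(3.4)
and there exists a \(\varphi \)-Kantorovich geodesic \(\gamma : [0,1] \rightarrow X\) with \(\gamma (0) = y\), \(\gamma (t) =x\) and \(\gamma (1) = z\).
Proof
Using that
$$\begin{aligned} \varphi (y) + \varphi ^c(z) \le \frac{\mathsf {d}(y,z)^2}{2} , \end{aligned}$$(3.5)
our assumption yields
$$\begin{aligned} \frac{\mathsf {d}(x,y)^2}{t} + \frac{\mathsf {d}(x,z)^2}{1-t} \le \mathsf {d}(y,z)^2 . \end{aligned}$$
On the other hand, the reverse inequality is always valid by the triangle and Cauchy–Schwarz inequalities:
$$\begin{aligned} \mathsf {d}(y,z)^2 \le \left( \mathsf {d}(y,x) + \mathsf {d}(x,z) \right) ^2 \le \frac{\mathsf {d}(x,y)^2}{t} + \frac{\mathsf {d}(x,z)^2}{1-t} . \end{aligned}$$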
It follows that we must have equality everywhere above, and (3.4) amounts to the equality case in the Cauchy–Schwarz inequality. Consequently, the concatenation \(\gamma : [0,1] \rightarrow X\) of any constant speed geodesic \(\gamma _1 : [0,t] \rightarrow X\) between y and x, with any constant speed geodesic \(\gamma _2 : [t,1] \rightarrow X\) between x and z, so that \(\gamma (0) = y\), \(\gamma (t) = x\) and \(\gamma (1) = z\), must be a constant speed geodesic itself (by the triangle inequality). Lastly, the equality in (3.5) implies that \(\gamma \in G_\varphi \), thereby concluding the proof. \(\square \)
Proof of Proposition 3.6 (3)
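We begin with the known direction. Let \(x = \gamma _t\) with \(\gamma \in G_\varphi \). Apply Lemma 3.3 to \(\gamma \) with \(s=0\) and \(r=t\):
$$\begin{aligned} \varphi (\gamma _0) - \varphi _t(x) = \frac{t}{2} \ell (\gamma )^2 , \end{aligned}$$
and to \(\gamma ^c \in G_{\varphi ^c}\) with \(s=1\) and \(r=1-t\):
$$\begin{aligned} -\varphi (\gamma _0) - (\varphi ^c)_{1-t}(x) = -\frac{t}{2} \ell (\gamma )^2 , \end{aligned}$$
where we used that \((\varphi ^c)_1 = -(\varphi ^c)^c = -\varphi \). Summing these two identities, we obtain:
$$\begin{aligned} \varphi _t(x) = -(\varphi ^c)_{1-t}(x) = {\bar{\varphi }}_t(x) , \end{aligned}$$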
as asserted.
For the other direction, assume that \(\varphi _t(x) = - (\varphi ^c)_{1-t}(x)\) for some \(x \in X\) and \(t \in (0,1)\). By Lemma 3.5 applied to the lower semi-continuous functions \(-\varphi \) and \(-\varphi ^c\), there exist \(y_t,z_t \in X\) so that
$$\begin{aligned} -\varphi _t(x) = \frac{\mathsf {d}(x,y_t)^2}{2t} - \varphi (y_t) , \;\;\; -(\varphi ^c)_{1-t}(x) = \frac{\mathsf {d}(x,z_t)^2}{2(1-t)} - \varphi ^c(z_t) . \end{aligned}$$
Summing the two equations, the assertion follows immediately from Lemma 3.7. \(\square \)
We also record the following immediate corollary of Lemma 3.2:
Corollary 3.8
-
(1)
\((x,t) \mapsto {\bar{\varphi }}_t(x)\) is upper semi-continuous on \(X \times [0,1)\) and continuous on \(X \times (0,1)\).
-
(2)
For every \(x \in X\), \([0,1] \ni t \mapsto {\bar{\varphi }}_t(x)\) is monotone non-decreasing and continuous on [0, 1).
Finally, in view of (3.3), we deduce for free:
Corollary 3.9
\(D(\mathring{G}_\varphi )\) is a closed subset of \(X \times (0,1)\).
Proof
Immediate from (3.3) by the continuity of \(\varphi _t(x)\) and \({\bar{\varphi }}_t(x)\) on \(X \times (0,1)\).
\(\square \)
3.4 Length functions \(\ell _t^{\pm }\) and \({\bar{\ell }}_t^{\pm }\)
Definition
(Length functions \(\ell _t^{\pm },{\bar{\ell }}_t^{\pm }\)) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), denote
$$\begin{aligned} \ell ^{\pm }_t(x) := \frac{D^{\pm }_{-\varphi }(x,t)}{t} , \;\;\; {\bar{\ell }}^{\pm }_t(x) := \frac{D^{\pm }_{-\varphi ^c}(x,1-t)}{1-t} , \;\;\; (x,t) \in X \times (0,1) . \end{aligned}$$
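To provide motivation for these definitions, let us mention that we will shortly see that if \(x = \gamma _t\) with \(\gamma \in G_\varphi \) and \(t \in (0,1)\), then
$$\begin{aligned} \ell ^{+}_t(x) = \ell ^{-}_t(x) = {\bar{\ell }}^{+}_t(x) = {\bar{\ell }}^{-}_t(x) = \ell (\gamma ) . \end{aligned}$$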
In particular, all \(\varphi \)-Kantorovich geodesics having x as their t-mid-point have the same length. These facts seem to not have been previously noted in the literature, and they will be crucially exploited in this work.
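As a sanity check, consider the simplest example, which is ours and not taken from the source: \(X = {\mathbb {R}}\) with the Euclidean distance and \(\varphi (x) = -a x\) for a fixed \(a \in {\mathbb {R}}\), corresponding to translation by a. Using only \(\varphi ^c = Q_1(-\varphi )\), \(\varphi _t = -Q_t(-\varphi )\) and the minimization of a quadratic in y, one finds
$$\begin{aligned} \varphi ^c(y) = \inf _{x \in {\mathbb {R}}} \frac{(x-y)^2}{2} + a x = a y - \frac{a^2}{2} , \;\;\; \varphi _t(x) = - \inf _{y \in {\mathbb {R}}} \left( \frac{(x-y)^2}{2t} + a y \right) = -a x + \frac{t a^2}{2} . \end{aligned}$$
Thus \(\varphi _0 = \varphi \), \(\varphi _1 = -\varphi ^c\), and \(\partial _t \varphi _t(x) = a^2/2\) for every x and t. The curve \(\gamma _r = x + (r-t) a\) satisfies \(\varphi (\gamma _0) + \varphi ^c(\gamma _1) = a^2/2 = \ell (\gamma )^2/2\), so every x is the t-mid-point of a geodesic of length \(|a|\) along which the potentials are in duality, in agreement with \(\ell _t(x) = |a|\) being independent of x and t.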
Definition
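For \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), introduce the following set:
$$\begin{aligned} D_{{\tilde{\ell }}} := \left\{ (x,t) \in X \times (0,1) \;;\; {\tilde{\ell }}^+_t(x) = {\tilde{\ell }}^-_t(x) \right\} , \end{aligned}$$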
and on it define \({\tilde{\ell }}_t(x)\) as the common value \({\tilde{\ell }}_t^+(x) = {\tilde{\ell }}_t^-(x)\).
Recalling that \(\varphi _t = -Q_t(-\varphi )\) and \({\bar{\varphi }}_t = Q_{1-t}(-\varphi ^c)\), we begin by translating Theorem 3.4 into the following corollary. We freely use standard properties of semi-convex (semi-concave) functions, like twice a.e. differentiability, non-negativity (non-positivity) of the singular part of the distributional second derivative (see e.g. Lemma A.11), etc.
Corollary 3.10
Let \(\varphi : X \rightarrow {\mathbb {R}}\) denote a Kantorovich potential. Then:
-
(1)
For \({\tilde{\ell }}= \ell ,{\bar{\ell }}\) and \({\tilde{\varphi }}=\varphi ,{\bar{\varphi }}\), \({\tilde{\ell }}^{\pm }_t(x)\) are locally finite on \(X \times (0,1)\), and \((x,t) \mapsto {\tilde{\varphi }}_t(x)\) is locally Lipschitz there.
-
(2)
For \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), \((x,t) \mapsto {\tilde{\ell }}^{\pm }_t(x)\) is upper (\({\tilde{\ell }}^{+}_t(x)\)) / lower (\({\tilde{\ell }}^{-}_t(x)\)) semi-continuous on \(X \times (0,1)\). In particular, the subset \(D_{{\tilde{\ell }}} \subset X \times (0,1)\) is Borel and \((x,t) \mapsto {\tilde{\ell }}_t(x)\) is continuous on \(D_{{\tilde{\ell }}}\).
-
(3)
For every \(x \in X\) we have
$$\begin{aligned} \partial _t^{\pm } \varphi _t(x) = \frac{\ell ^{\pm }_t(x)^2}{2} ~,~ \partial _t^{\pm } {\bar{\varphi }}_t(x) = \frac{{\bar{\ell }}^{\mp }_t(x)^2}{2}\;\;\; \forall t \in (0,1) . \end{aligned}$$
In particular, for \({\tilde{\ell }}= \ell ,{\bar{\ell }}\) and \({\tilde{\varphi }}=\varphi ,{\bar{\varphi }}\), respectively, the map \((0,1) \ni t \mapsto {\tilde{\varphi }}_t(x)\) is locally Lipschitz, and it is differentiable at \(t \in (0,1)\) iff \(t \in D_{{\tilde{\ell }}}(x)\), the set on which both maps \((0,1) \ni t \mapsto {\tilde{\ell }}^{\pm }_t(x)\) coincide. \(D_{{\tilde{\ell }}}(x)\) is precisely the set of continuity points of both maps, and thus coincides with (0, 1) with at most countably many exceptions. In particular
$$\begin{aligned} {\tilde{\varphi }}_{t_2}(x) - {\tilde{\varphi }}_{t_1}(x) = \int _{t_1}^{t_2} \frac{{\tilde{\ell }}^2_\tau (x)}{2} d\tau \;\;\; \forall t_1,t_2 \in (0,1) . \end{aligned}$$ -
(4)
For every \(x \in X\):
-
(a)
Both maps \((0,1) \ni t \mapsto t \ell ^{\pm }_t(x)\) are monotone non-decreasing. In particular, \(D_{\ell }(x) \ni t \mapsto \ell ^2_t(x)\) is differentiable a.e., the singular part of its distributional derivative is non-negative, \((0,1) \ni t\mapsto \varphi _t(x)\) is locally semi-convex, and
$$\begin{aligned} {\underline{\partial }}_t \frac{\ell _t^2(x)}{2}&\ge -\frac{1}{t} \ell _t^2(x) \;\;\; \forall t \in D_{\ell }(x) . \end{aligned}$$(3.6) -
(b)
Both maps \((0,1) \ni t \mapsto (1-t) {\bar{\ell }}^{\pm }_t(x)\) are monotone non-increasing. In particular, \(D_{{\bar{\ell }}}(x) \ni t \mapsto {\bar{\ell }}^2_t(x)\) is differentiable a.e., the singular part of its distributional derivative is non-positive, \((0,1) \ni t \mapsto {\bar{\varphi }}_t(x)\) is locally semi-concave, and
$$\begin{aligned} {\overline{\partial }}_t \frac{{\bar{\ell }}_t^2(x)}{2}&\le \frac{1}{1-t} {\bar{\ell }}_t^2(x) \;\;\; \forall t \in D_{{\bar{\ell }}}(x) . \end{aligned}$$(3.7)
Proof
The only point requiring verification is that monotonicity of \(t \mapsto t \ell _t(x)\) in (4a) and \(t \mapsto (1-t) {\bar{\ell }}_t\) in (4b) implies (3.6) and (3.7), respectively. For instance, using the continuity of \(t \mapsto \ell _t(x)\) on \(D_{\ell }(x)\), (3.6) is clearly equivalent to
Now, if \(\ell _t(x) = 0\) the monotonicity directly implies \({\underline{\partial }}_t \ell _t(x) \ge 0\) and establishes (3.8), whereas otherwise, (3.8) is equivalent by the chain-rule (and again the continuity of \(t \mapsto \ell _t(x)\) on \(D_{\ell }(x)\)) to
which in turn is a consequence of the aforementioned monotonicity. The proof of (3.7) follows identically. \(\square \)
We now arrive at the main new result of this section, which will be constantly and crucially used in this work:
Theorem 3.11
Let \(\varphi : X \rightarrow {\mathbb {R}}\) denote a Kantorovich potential.
-
(1)
For all \(x \in \mathrm{e}_t(G_\varphi )\) with \(t \in (0,1)\), we have
$$\begin{aligned} \ell ^{+}_t(x) = \ell ^{-}_t(x) = {\bar{\ell }}^{+}_t(x) = {\bar{\ell }}^{-}_t(x) = \ell (\gamma ) , \end{aligned}$$for any \(\gamma \in G_\varphi \) so that \(\gamma _t = x\). In other words
$$\begin{aligned} D(\mathring{G}_\varphi ) = \left\{ (x,t) \in X \times (0,1) \; ; \; x = \gamma _t \; , \; \gamma \in G_\varphi \right\} \subset D_\ell \cap D_{{\bar{\ell }}}, \end{aligned}$$and moreover \(\ell _t(x) = {\bar{\ell }}_t(x)\) there.
-
(2)
For all \(x \in X\), \(\mathring{G}_\varphi (x) \ni t \mapsto \ell _t(x)={\bar{\ell }}_t(x)\) is locally Lipschitz:
$$\begin{aligned}&\left| \sqrt{t (1-t)} \ell _{t}(x) - \sqrt{s (1-s)} \ell _{s}(x)\right| \nonumber \\&\quad \le \sqrt{\ell _{t}(x) \ell _{s}(x)} \left| \sqrt{t (1-s)} - \sqrt{s (1-t)} \right| \;\;\; \forall t,s \in \mathring{G}_\varphi (x).\qquad \end{aligned}$$(3.9)
-
(3)
For all \((x,t) \in D(\mathring{G}_\varphi ) \subset D_\ell \cap D_{{\bar{\ell }}}\) we have for both \(* = {\underline{{\mathcal {P}}}}_2 {\bar{\varphi }}_t(x) ,{\overline{{\mathcal {P}}}}_2 \varphi _t(x)\):
$$\begin{aligned} -\frac{1}{t} \ell _t^2(x) \le {\underline{\partial }}_t \frac{\ell _t^2(x)}{2} \le {\underline{{\mathcal {P}}}}_2 \varphi _t(x) \le * \le {\overline{{\mathcal {P}}}}_2 {\bar{\varphi }}_t(x) \le {\overline{\partial }}_t \frac{{\bar{\ell }}_t^2(x)}{2} \le \frac{1}{1-t} \ell _t^2(x) , \end{aligned}$$
where the Peano (partial) derivatives are with respect to the t variable.
-
(4)
For all \((x,t) \in D(\mathring{G}_\varphi ) \subset D_\ell \cap D_{{\bar{\ell }}}\) we have:
$$\begin{aligned} {\overline{\partial }}_t \frac{\ell _t^2(x)}{2}&\le {\overline{\partial }}_t \frac{{\bar{\ell }}_t^2(x)}{2} + \left( \frac{1}{1-t} + \frac{1}{t}\right) \ell _t^2(x) \le \left( \frac{2}{1-t} + \frac{1}{t}\right) \ell _t^2(x) \\ \underline{\partial _t} \frac{{\bar{\ell }}_t^2(x)}{2}&\ge \underline{\partial _t} \frac{\ell _t^2(x)}{2} -\left( \frac{1}{t} + \frac{1}{1-t}\right) \ell _t^2(x) \ge -\left( \frac{2}{t} + \frac{1}{1-t}\right) \ell _t^2(x) . \end{aligned}$$
In particular, for every \(x \in X\), we have:
with \(t \mapsto \frac{\ell ^2_t(x)}{2}\) and \(t \mapsto \frac{{\bar{\ell }}^2_t(x)}{2}\) continuous on \(D_\ell (x) \cap D_{{\bar{\ell }}}(x)\), differentiable a.e. there, and having locally bounded lower and upper derivatives on \(\mathring{G}_{\varphi }(x) \subset D_\ell (x) \cap D_{{\bar{\ell }}}(x)\) as in (3) and (4).
Proof
To see (1), let \((x,t) \in D(\mathring{G}_\varphi )\). Equivalently, by Proposition 3.6 (3), we know that \(\varphi _t(x) = {\bar{\varphi }}_t(x)\). In addition, Lemma 3.5 assures the existence of \(y^{\pm }\) and \(z^{\pm }\) in X so that
Equating both expressions and applying Lemma 3.7, we deduce that x is the t-midpoint of a geodesic connecting \(y^{\pm }\) and \(z^{\pm }\) (for all 4 possibilities), and that
so that all four possibilities above coincide. We remark in passing that, in a non-branching setting, this already implies that necessarily \(y^+ = y^-\) and \(z^+=z^-\), i.e. the uniqueness of a \(\varphi \)-Kantorovich geodesic with t-midpoint x.
Furthermore, if \(x = \gamma _t\) for some \(\gamma \in G_\varphi \), then by Lemma 3.3:
It follows by definition of \(D^{\pm }_{-\varphi }(x,t)\) that:
which together with (3.10) establishes that \(\ell (\gamma ) = \ell _t(x) = {\bar{\ell }}_t(x)\).
To see (2), let \(\gamma ^t, \gamma ^s \in G_{\varphi }\) be so that \(\gamma ^t_t = \gamma ^s_s = x\), for some \(t,s \in (0,1)\). Then
for \((p,q) = (t,s)\) and \((p,q) = (s,t)\). Summing these two inequalities, we obtain the well-known c-cyclic monotonicity of the set \(\left\{ (\gamma ^t_0,\gamma ^t_1),(\gamma ^s_0,\gamma ^s_1)\right\} \):
To evaluate the right-hand-side, we simply pass through x and employ the triangle inequality:
Plugging this above and rearranging terms, we obtain
Completing the square by subtracting \(2 \sqrt{t(1-t)s(1-s)} \ell (\gamma ^t) \ell (\gamma ^s)\) from both sides, and recalling that \(\ell (\gamma ^p) = \ell _p(x)\) for \(p=t,s\), we readily obtain (3.9). In particular, using \(t=s\), the above argument recovers the last assertion of (1) that \(\ell (\gamma )\) is the same for all \(\gamma \in G_\varphi \) so that \(\gamma _t = x\).
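To make the rearrangement explicit, here is a sketch of the computation. The shorthand \(a := \ell (\gamma ^t)\) and \(b := \ell (\gamma ^s)\) is introduced only for this purpose, and we assume the cost \(c = \mathsf {d}^2/2\) and constant-speed parametrization on [0, 1], so that \(\mathsf {d}(\gamma ^t_0,x) = t a\) and \(\mathsf {d}(x,\gamma ^s_1) = (1-s) b\). Cyclic monotonicity followed by the triangle inequality through x gives
$$\begin{aligned} a^2 + b^2 \le \mathsf {d}^2(\gamma ^t_0,\gamma ^s_1) + \mathsf {d}^2(\gamma ^s_0,\gamma ^t_1) \le \left( t a + (1-s) b\right) ^2 + \left( s b + (1-t) a\right) ^2 . \end{aligned}$$
Expanding and using \(1 - t^2 - (1-t)^2 = 2 t (1-t)\), this rearranges to
$$\begin{aligned} t(1-t) a^2 + s(1-s) b^2 \le \left( t(1-s) + s(1-t)\right) a b , \end{aligned}$$
and subtracting \(2 \sqrt{t(1-t)s(1-s)} \, a b\) from both sides yields
$$\begin{aligned} \left( \sqrt{t(1-t)} \, a - \sqrt{s(1-s)} \, b \right) ^2 \le a b \left( \sqrt{t(1-s)} - \sqrt{s(1-t)} \right) ^2 , \end{aligned}$$
whose square root is (3.9).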
To see (3), recall that given \(x \in X\), we know by Proposition 3.6 that \(\varphi _t(x) \le {\bar{\varphi }}_t(x)\) for all \(t \in (0,1)\) with equality iff \(t \in \mathring{G}_\varphi (x)\). Since \(\mathring{G}_\varphi (x) \subset D_\ell (x) \cap D_{{\bar{\ell }}}(x)\) by (1), we know that both maps \(t \mapsto {\tilde{\varphi }}_t(x)\) are differentiable at \(t_0 \in \mathring{G}_\varphi (x)\), and we see again that \(\frac{\ell ^2_{t_0}(x)}{2} = \partial _t \varphi _{t_0}(x) = \partial _t {\bar{\varphi }}_{t_0}(x) = \frac{{\bar{\ell }}^2_{t_0}(x)}{2}\), since the derivatives of a function and its majorant must coincide at a mutual point of differentiability where they touch. Moreover, defining \({\tilde{h}} = h,{\bar{h}}\) as
it follows that \(h \le {\bar{h}}\) (on \((-t_0,1-t_0)\)). Dividing by \(\varepsilon ^2\) and taking appropriate subsequential limits, we obtain
Combining these inequalities with those of Lemma 2.4, (3.6) and (3.7), the chain of inequalities in (3) readily follows.
To see (4), let \(t_0 \in \mathring{G}_\varphi (x)\). Consider the function \(f(t) := {\bar{\varphi }}_t(x) - \varphi _t(x)\) on (0, 1), which is locally semi-concave by Corollary 3.10. By Proposition 3.6, we know that \(f \ge 0\) with \(f(t_0) = 0\). The function f is differentiable on \(D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) and satisfies \(f'(t) = \frac{{\bar{\ell }}^2_{t}(x)}{2} - \frac{\ell ^2_{t}(x)}{2}\) there. In particular, this holds at \(t_0 \in \mathring{G}_\varphi (x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) by (1) and \(f'(t_0) = 0\). Note that by Corollary 3.10:
In particular, since both \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}_t(x)\) are continuous at \(t = t_0 \in D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\), for \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), it follows that
It follows that on the open interval \(I_\delta := (t_0-\delta ,t_0 + \delta ) \cap (0,1)\), \(f - C_\varepsilon \frac{t^2}{2}\) is concave with \(C_\varepsilon \) defined as the constant on the right-hand-side above. Applying Lemma 3.12 below to the translated function \(f(\cdot + t_0)\) on the interval \(I_\delta - t_0\), it follows that:
As \({\bar{\ell }}_{t_0}(x) = \ell _{t_0}(x)\) by (1), we obtain
The assertion of (4) now follows by taking appropriate subsequential limits as \(t \rightarrow t_0\) and using the fact that \(\varepsilon > 0\) was arbitrary. \(\square \)
Lemma 3.12
Given \(I \subset {\mathbb {R}}\) an open interval containing 0, let \(f : I \rightarrow {\mathbb {R}}\) denote a C-semi-concave function, so that \(I \ni t \mapsto f - C \frac{t^2}{2}\) is concave, \(C \ge 0\). Assume that \(f \ge 0\) on I, that f is differentiable at 0 and that \(f(0) = f'(0) = 0\). Then \(\underline{\partial _t}|_{t=0} f'(t) \ge -C\), and moreover, \(\frac{f'(t)}{t} \ge -C\) for all \(t \in D \cap I/2\), where \(D \subset I\) denotes the subset (of full measure) of differentiability points of f.
Note that the C-semi-concavity is equivalent to \({\overline{\partial }}_t|_{t=0} f'(t) \le C\), while the conclusion is from the opposite direction. It is not hard to verify that the asserted lower bound is in fact best possible.
Proof of Lemma 3.12
Set \(g = f'\) on D. The C-semi-concavity is equivalent to the statement that \(g(t) - C t\) is non-increasing on D, so that \(g(t_2) \le g(t_1) + C (t_2 - t_1)\) for all \(t_1,t_2 \in D\) with \(t_1 < t_2\). It follows that necessarily \(g(t) \ge -C t\) for all \(t \in D \cap I/2\) with \(t \ge 0\), since:
Repeating the same argument for \(t \mapsto f(-t)\), we see that \(-g(t) \ge C t\) for all \(t \in D \cap I/2\) with \(t \le 0\). This concludes the proof. \(\square \)
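One way to see the bound \(g(t) \ge -C t\) is the following sketch, which uses that a semi-concave function is locally Lipschitz and hence the integral of its a.e. derivative g; the integration variable \(\tau \) is introduced only for this computation. Fix \(t \in D \cap I/2\) with \(t > 0\), so that \(2t \in I\). Since \(0 \in D\) with \(g(0) = 0\), we have \(g(\tau ) \le C \tau \) for \(\tau \in D \cap (0,t]\) and \(g(\tau ) \le g(t) + C(\tau - t)\) for \(\tau \in D \cap [t,2t]\), whence
$$\begin{aligned} 0 \le f(2t) = \int _0^{t} g(\tau ) d\tau + \int _t^{2t} g(\tau ) d\tau \le C \frac{t^2}{2} + t g(t) + C \frac{t^2}{2} = t g(t) + C t^2 . \end{aligned}$$
Dividing by \(t > 0\) gives \(g(t) \ge -C t\); the restriction to \(I/2\) is what guarantees \(2t \in I\).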
In a sense, Theorem 3.11 (2) is the temporal analogue of the spatial 1/2-Hölder regularity proved by Villani in [77, Theorem 8.22]. Formally taking \(s \rightarrow t\) in (3.9), it is easy to check that one obtains (for both \({\tilde{\ell }} = \ell ,{\bar{\ell }}\)) stronger bounds than in Theorem 3.11 (3) and (4):
However, we do not know how to rigorously pass from (3.9) to (3.11) or vice versa (by differentiation or integration, respectively), since we cannot exclude the possibility that the (relatively closed in (0, 1)) set \(\mathring{G}_\varphi (x)\) has isolated points, nor that it is disconnected. Instead, we can obtain the following stronger version of (3.11) which only holds for a.e. \(t \in \mathring{G}_\varphi (x)\), but will prove to be very useful later on.
Corollary 3.13
For all \(x \in X\), for a.e. \(t \in \mathring{G}_\varphi (x)\), \(\partial _t \ell ^2_t(x)\) and \(\partial _t {\bar{\ell }}^2_t(x)\) exist, coincide, and satisfy:
Proof
By Corollary 3.10, for all \(x \in X\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), \(t \mapsto {\tilde{\ell }}^2_t(x)\) is differentiable a.e. on \(D_{{\tilde{\ell }}}(x)\). Consequently, the first and third equalities in (3.12) follow for a.e. \(t \in \mathring{G}_\varphi (x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) by Remark 2.1. The second equality follows since \(\ell _t(x) = {\bar{\ell }}_t(x)\) for \(t \in \mathring{G}_\varphi (x)\) by Theorem 3.11. The lower and upper bounds in (3.12) then follow from Theorem 3.11 (3) (or as in (3.11), by taking the limit as \(s \rightarrow t\) in Theorem 3.11 (2)). \(\square \)
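The formal limit \(s \rightarrow t\) in (3.9) can be sketched as follows, under the purely formal assumption that \(t \mapsto \ell _t(x)\) is differentiable; the function F is notation introduced only here. Set \(F(t) := \sqrt{t(1-t)} \ell _t(x)\). Dividing (3.9) by \(|t-s|\) and letting \(s \rightarrow t\), and noting that \(\partial _s|_{s=t} \left( \sqrt{t(1-s)} - \sqrt{s(1-t)}\right) = -\frac{1}{2\sqrt{t(1-t)}}\), one gets
$$\begin{aligned} \left| F'(t)\right| = \left| \frac{1-2t}{2 \sqrt{t(1-t)}} \ell _t(x) + \sqrt{t(1-t)} \, \partial _t \ell _t(x) \right| \le \frac{\ell _t(x)}{2 \sqrt{t(1-t)}} . \end{aligned}$$
Multiplying by \(2\sqrt{t(1-t)}\) gives \(\left| (1-2t) \ell _t(x) + 2t(1-t) \partial _t \ell _t(x)\right| \le \ell _t(x)\), that is \(-\frac{1}{t} \ell _t(x) \le \partial _t \ell _t(x) \le \frac{1}{1-t} \ell _t(x)\), and multiplying by \(\ell _t(x) \ge 0\):
$$\begin{aligned} -\frac{1}{t} \ell _t^2(x) \le \partial _t \frac{\ell _t^2(x)}{2} \le \frac{1}{1-t} \ell _t^2(x) . \end{aligned}$$
These are the same outer bounds as in the chain of Theorem 3.11 (3), now for a single derivative.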
3.5 Null-geodesics
Definition 3.14
(Null-Geodesics and Null-Geodesic Points) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), denote the subset of null \(\varphi \)-Kantorovich geodesics by
Its complement in \(G_\varphi \) will be denoted by \(G_\varphi ^+\). The subset of X of null \(\varphi \)-Kantorovich geodesic points is denoted by
Its complement in X will be denoted by \(X^+\).
The following provides a convenient equivalent characterization of \(X^0\) and \(X^+\):
Lemma 3.15
Given \(x \in X\), the following statements are equivalent:
-
(1)
\(x \in X^0\), i.e. \(\varphi (x) + \varphi ^c(x) = 0\).
-
(2)
\(\forall t \in (0,1)\), \(\varphi _t(x) = {\bar{\varphi }}_t(x) = \varphi (x) = -\varphi ^c(x)\).
-
(3)
\(\forall t \in (0,1)\), \(\varphi _t(x) = c\) and \({\bar{\varphi }}_t(x) = {\bar{c}}\) for some \(c,{\bar{c}} \in {\mathbb {R}}\).
-
(4)
\(D_{\ell }(x) = D_{{\bar{\ell }}}(x) = (0,1)\) and \(\;\forall t \in (0,1) \;\; \ell _t(x) = {\bar{\ell }}_t(x) = 0\).
-
(5)
\(\exists t_0 \in \mathring{G}_\varphi (x)\) so that \(\varphi _{t_0}(x) = \varphi (x)\) or \({\bar{\varphi }}_{t_0}(x) = \varphi (x)\) or \(\varphi _{t_0}(x) = -\varphi ^c(x)\) or \({\bar{\varphi }}_{t_0}(x) = -\varphi ^c(x)\).
-
(6)
\(\exists t_0 \in \mathring{G}_\varphi (x)\) so that \(\ell _{t_0}^{-}(x) = 0\) or \(\ell _{t_0}^{+}(x) = 0\) or \({\bar{\ell }}_{t_0}^{-}(x) = 0\) or \({\bar{\ell }}_{t_0}^+(x) = 0\).
In other words, we have the following dichotomy: all \(\varphi \)-Kantorovich geodesics having \(x \in X\) as some interior mid-point have either strictly positive length (iff \(x \in X^+\)) or zero length (iff \(x \in X^0\)).
Remark 3.16
In fact, we always have \(\varphi _t(x) = {\bar{\varphi }}_t(x)\) and \(\ell _t(x) = {\bar{\ell }}_t(x)\) for \(t \in \mathring{G}_\varphi (x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) by Theorem 3.11, so we may simply write “\(\varphi _{t_0}(x) = \varphi (x)\) or \(\varphi _{t_0}(x)= -\varphi ^c(x)\)” and “\(\ell _{t_0}(x) = {\bar{\ell }}_{t_0}(x) = 0\)” in statements (5) and (6), respectively. However, we chose to formulate these statements with the (a-priori) minimal requirements.
Proof of Lemma 3.15
\((1) \Rightarrow (2)\) is straightforward: for instance, (1) is by definition identical to \(\varphi _1(x) = \varphi _0(x)\) and (2) follows by the monotonicity of \([0,1] \ni t \mapsto {\tilde{\varphi }}_t(x)\) for both \({\tilde{\varphi }} = \varphi ,{\bar{\varphi }}\); alternatively, apply Lemma 3.3 to the null geodesic \(\gamma ^0 \equiv x\) with respect to both Kantorovich potentials \(\varphi \) and \(\varphi ^c\).
\((2) \,\!\Rightarrow \!\, (3)\) is trivial.
\((3)\,\!\!\Leftrightarrow \!\!\,(4)\) follows by using that \(D_{{\tilde{\ell }}}(x)\) is characterized as the subset of t-differentiability points of \({\tilde{\varphi }}_t(x)\) on (0, 1) with \(\partial _t {\tilde{\varphi }}_t(x) = {\tilde{\ell }}_t^2(x)/2\) there.
\((3)\,\!\Rightarrow \!\,(1)\): by the continuity of \(t \mapsto \varphi _t(x)\) from the left at \(t=1\) it follows that \(c = \varphi _1(x)\), and similarly the continuity of \(t \mapsto {\bar{\varphi }}_t(x)\) from the right at \(t=0\) yields that \({\bar{c}} = {\bar{\varphi }}_0(x) = \varphi (x)\). Since always \(\varphi \le {\bar{\varphi }}\), we deduce \(\varphi _1(x) = c \le {\bar{c}} = \varphi (x)\). On the other hand, we always have \(\varphi (x) \le \varphi _1(x)\) by monotonicity, so we conclude that \(\varphi (x) = \varphi _1(x)\), establishing statement (1). This concludes the proof of the equivalence \((1)\,\! \Leftrightarrow \!\,(2) \,\!\Leftrightarrow \!\,(3) \,\!\Leftrightarrow \!\, (4)\).
\((2) \,\!\Rightarrow \!\, (5)\) and \((4)\,\! \Rightarrow \!\,(6)\) are trivial.
\((5) \,\!\!\Rightarrow \!\!\,(6)\) is straightforward: for instance, if \({\tilde{\varphi }}_{t_0}(x) = {\tilde{\varphi }}_0(x) = \varphi (x)\) for some \(t_0 \in (0,1)\) and \({\tilde{\varphi }} \in \left\{ \varphi ,{\bar{\varphi }}\right\} \), then by monotonicity, \({\tilde{\varphi }}_t(x) = \varphi (x)\) for all \(t \in [0,t_0]\), and hence the left derivative at \(t=t_0\) satisfies \(\ell ^{-}_{t_0}(x) = \partial _t^{-}|_{t=t_0} \varphi _t(x) = 0\) if \({\tilde{\varphi }} = \varphi \) and \({\bar{\ell }}^{+}_{t_0}(x) = \partial _t^{-}|_{t=t_0} {\bar{\varphi }}_t(x) = 0\) if \({\tilde{\varphi }} = {\bar{\varphi }}\). If \({\tilde{\varphi }}_{t_0}(x) = {\tilde{\varphi }}_1(x) = -\varphi ^c(x)\), repeat the argument using the right derivative.
The only direction requiring second-order information on \(\varphi _t\) is \((6) \Rightarrow (3)\). By Corollary 3.10, \(t \mapsto t \ell _{t}^{\pm }(x)\) and \(t \mapsto (1-t) {\bar{\ell }}_t^{\pm }(x)\) are monotone non-decreasing and non-increasing on (0, 1), respectively. Since \(t_0 \in \mathring{G}_\varphi (x)\), in view of Remark 3.16, (6) is equivalent to \(\ell ^{\pm }_{t_0}(x) = {\bar{\ell }}^{\pm }_{t_0}(x) = 0\). The monotonicity implies that \(\ell _t^{\pm }(x) = 0\) for all \(t \in (0,t_0]\) and that \({\bar{\ell }}_t^{\pm }(x) = 0\) for all \(t \in [t_0,1)\). It follows that \(\varphi _t(x)\) is constant on \((0,t_0]\) and \({\bar{\varphi }}_t(x)\) is constant on \([t_0,1)\). As \(\varphi _{t_0}(x) = {\bar{\varphi }}_{t_0}(x)\), the monotonicity of \(t \mapsto {\tilde{\varphi }}_t(x)\) and the majoration \(\varphi _t \le {\bar{\varphi }}_t\) force both \(t \mapsto \varphi _t(x)\) and \(t \mapsto {\bar{\varphi }}_t(x)\) to be constant on (0, 1), establishing (3) (in fact with \(c = {\bar{c}}\)). \(\square \)
Corollary 3.17
If \(x \in X^+\) then \(\ell _t(x) > 0\) for all \(t \in [\inf \mathring{G}_\varphi (x), 1) \cap D_{\ell }(x)\) and \({\bar{\ell }}_t(x) > 0\) for all \(t \in (0,\sup \mathring{G}_\varphi (x)] \cap D_{{\bar{\ell }}}(x)\).
Proof
Immediate by Lemma 3.15 (6) and the monotonicity of \(D_{\ell }(x) \ni t \mapsto t \ell _t(x)\) and \(D_{{\bar{\ell }}}(x) \ni t \mapsto (1-t) {\bar{\ell }}_t(x)\), together with the fact that \(\mathring{G}_\varphi (x)\) is relatively closed in (0, 1) by Corollary 3.9. \(\square \)
Corollary 3.18
Given \(x \in X\), assume that \(\exists t_1,t_2 \in \mathring{G}_\varphi (x)\) with \(t_1 \ne t_2\). Then \(x \in X^0\) iff \(\varphi _{t_1}(x) = \varphi _{t_2}(x)\) (or equivalently, \({\bar{\varphi }}_{t_1}(x) = {\bar{\varphi }}_{t_2}(x)\)).
Proof
The “only if” direction follows immediately by Lemma 3.15, whereas the “if” direction follows by Corollary 3.17, after recalling that \(\varphi _{t_2}(x) - \varphi _{t_1}(x) = \int _{t_1}^{t_2} \frac{\ell _\tau ^2(x)}{2} d\tau \) by Corollary 3.10. As usual, the equivalent condition follows by Theorem 3.11. \(\square \)
4 Temporal theory of intermediate-time Kantorovich potentials: time-propagation
The goal of this section is to introduce and study the following function(s):
Definition
(Time-Propagated Intermediate Kantorovich Potentials) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\) and \(s,t \in (0,1)\), define the t-propagated s-Kantorovich potential \(\Phi _s^t\) on \(D_{\ell }(t)\), and its time-reversed version \({\bar{\Phi }}_s^t\) on \(D_{{\bar{\ell }}}(t)\), by:
Observe that for all \(s,t \in (0,1)\):
indeed, while \(\mathrm{e}_t^{-1} : \mathrm{e}_t(G_\varphi ) \rightarrow G_\varphi \) may be multi-valued, Theorem 3.11 implies that \(\ell (\gamma ) = \ell _t(x) = {\bar{\ell }}_t(x)\) for any \(\gamma \in G_\varphi \) with \(\gamma _t = x\), and consequently Lemma 3.3 yields that \(\varphi _s \circ \mathrm{e}_s\) is single-valued for all such \(\gamma \) and (also recalling Proposition 3.6):
Consequently, on \(\mathrm{e}_t(G_\varphi )\), \(\Phi _s^t={\bar{\Phi }}_s^t\) is identified as the push-forward of \(\varphi _s\) via \(\mathrm{e}_t \circ \mathrm{e}_s^{-1}\), i.e. its propagation along \(G_\varphi \) from time s to time t.
We will use the following short-hand notation. Given \(s \in [0,1]\) and \(a_s \in {\mathbb {R}}\), we denote:
suppressing the implicit dependence of \(G_{a_s}\) on s. The above argument about why \(\varphi _s \circ \mathrm{e}_s \circ \mathrm{e}_t^{-1}\) is well-defined can be rewritten as:
Corollary 4.1
(Inter Level-Set Propagation) For all \(s,t \in (0,1)\), \(a_s, b_s \in {\mathbb {R}}\), \(a_s \ne b_s\), we have:
Note that while typically disjoint sets remain disjoint under optimal-transport only under some additional non-branching assumptions, Corollary 4.1 holds true in general.
4.1 Monotonicity
Lemma 4.2
Let \(x = \gamma ^1_{t_1} = \gamma ^2_{t_2}\) with \(\gamma ^1,\gamma ^2 \in G_\varphi \) and \(0< t_1< t_2 < 1\). Then for any \(s \in (0,1)\):
Moreover, the left-hand-side is in fact strictly positive iff \(x \in X^+\).
Proof
We know by Lemma 3.3 and Theorem 3.11 that:
Recall that \(\varphi _{t_i}(x) = {\bar{\varphi }}_{t_i}(x)\) and \(\ell _{t_i}(x) = {\bar{\ell }}_{t_i}(x)\) by Proposition 3.6 and Theorem 3.11, as \(x = \gamma ^i_{t_i}\). Now set \({\bar{s}} := (s \vee t_1) \wedge t_2\). Since \({\bar{s}} \in \left\{ t_1,t_2,s\right\} \), it follows that:
By Corollary 3.10, we know for \({\tilde{\ell }} = \ell ,{\bar{\ell }}\) that \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}^2_t(x)\) is differentiable a.e., and that the singular part of its distributional derivative is non-negative for \({\tilde{\ell }} = \ell \) and non-positive for \({\tilde{\ell }} = {\bar{\ell }}\). Consequently, we may proceed as follows:
where we used that \(\tau -s \ge 0\) when \({\bar{s}} \le \tau < t_2\) and that \(\tau -s \le 0\) when \({\bar{s}} \ge \tau > t_1\). Using (3.6) and (3.7) to bound the above lower and upper derivatives on the sets (having full measure) \(D_{\ell }(x)\) and \(D_{{\bar{\ell }}}(x)\), respectively, we obtain:
Summarizing, we have obtained:
We now use the inequality \(\varphi _{{\bar{s}}}(x) \le {\bar{\varphi }}_{{\bar{s}}}(x)\) in the first line above when \(2 \frac{s}{t_2} - 1 \ge 0\), and in the second line when \(2 \frac{1-s}{1-t_1} - 1 \ge 0\), yielding
In particular, the first estimate applies whenever \(s \ge \frac{1}{2}\) and the second one whenever \(s \le \frac{1}{2}\). Using that \([0,1] \ni \tau \mapsto {\tilde{\varphi }}_\tau (x)\) is monotone non-decreasing, the asserted (4.1) is established in either case. Moreover, (4.1) implies that if \(\varphi _s(\gamma ^2_s) -\varphi _s(\gamma ^1_s) = 0\) then \(\varphi _{t_1}(x) = \varphi _{t_2}(x)\), and hence by Corollary 3.18 that \(x \in X^0\); and vice-versa, if \(x \in X^0\) then all geodesics having x as an interior point are null by Lemma 3.15, and hence \(\gamma ^1_s = \gamma ^2_s = x\) and \(\varphi _s(\gamma ^2_s) -\varphi _s(\gamma ^1_s) = 0\). \(\square \)
We can already deduce the following important consequence, complementing Corollary 4.1, which holds for any proper geodesic space \((X,\mathsf {d})\), independently of any additional assumptions like various forms of non-branching:
Corollary 4.3
(Intra Level-Set Propagation) For any \(s \in (0,1)\), \(a_s \in {\mathbb {R}}\), and \(t_1, t_2 \in (0,1)\) with \(t_1 \ne t_2\):
In other words, for each \(x \in \mathrm{e}_{(0,1)}(G_{a_s}) \cap X^+\), there exists a unique \(t \in (0,1)\) so that \(x \in \mathrm{e}_t(G_{a_s})\).
Proof
If \(x = \gamma ^1_{t_1} = \gamma ^2_{t_2} \in X^+\), \(0<t_1< t_2< 1\), then Lemma 4.2 yields \(\varphi _s(\gamma ^2_s) > \varphi _s(\gamma ^1_s)\), establishing the assertion. \(\square \)
4.2 Properties of \(\Phi _s^t\)
The following information will be crucially used when deriving the Change-Of-Variables formula in Sect. 11:
Proposition 4.4
For any \(s \in (0,1)\), the following properties of \(\Phi _s^t\) and \({\bar{\Phi }}_s^t\) hold:
-
(1)
The maps \((x,t) \mapsto \Phi _s^t(x)\) and \((x,t) \mapsto {\bar{\Phi }}_s^t(x)\) are continuous on \(D_{\ell }\) and \(D_{{\bar{\ell }}}\), respectively.
-
(2)
For each \(x \in X\), \({\tilde{\Phi }} = \Phi ,{\bar{\Phi }}\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), respectively, \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\Phi }}_s^t(x)\) is differentiable at t iff \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}^2_t(x)\) is differentiable at t or if \(t=s \in D_{{\tilde{\ell }}}(x)\), so in particular \(t \mapsto {\tilde{\Phi }}_s^t(x)\) is a.e. differentiable. At points t of differentiability:
$$\begin{aligned} \partial _t {\tilde{\Phi }}_s^t(x) = {\tilde{\ell }}_t^2(x) + (t-s) \frac{\partial _t {\tilde{\ell }}^2_t(x)}{2} . \end{aligned}$$(4.2)
In particular, if \(s \in D_{{\tilde{\ell }}}(x)\) then \(\exists \partial _t|_{t=s} {\tilde{\Phi }}_s^t(x) = {\tilde{\ell }}_s^2(x)\).
-
(3)
For each \(x \in X\), the map \(\mathring{G}_\varphi (x) \ni t \mapsto \Phi _s^t(x) = {\bar{\Phi }}_s^t(x)\) is locally Lipschitz and non-decreasing (if \(\# \mathring{G}_\varphi (x) \ge 2\), it is strictly increasing iff \(x \in X^+\)).
-
(4)
For all \(t \in (0,1)\):
$$\begin{aligned}&\quad {\left\{ \begin{array}{ll} {\underline{\partial }}_t \Phi _s^t(x) \ge \frac{s}{t} \ell _t^2(x) &{} \quad \,\,t \ge s \\ {\overline{\partial }}_t \Phi _s^t(x) \le \frac{s}{t} \ell _t^2(x) &{} \quad \,\,t \le s \end{array}\right. } \;\; \forall x \in D_{\ell }(t) ~;~ \\&\quad {\left\{ \begin{array}{ll} {\overline{\partial }}_t {\bar{\Phi }}_s^t(x) \le \frac{1-s}{1-t} {\bar{\ell }}_t^2(x) &{} t \ge s \\ {\underline{\partial }}_t {\bar{\Phi }}_s^t(x) \ge \frac{1-s}{1-t} {\bar{\ell }}_t^2(x) &{} t \le s \end{array}\right. } \;\; \forall x \in D_{{\bar{\ell }}}(t). \end{aligned}$$
-
(5)
For all \((x,t) \in D(\mathring{G}_\varphi )\):
$$\begin{aligned}&\min \left( \frac{s}{t},\frac{1-s}{1-t} + \frac{t-s}{t(1-t)}\right) \ell _t^2(x) \le {\underline{\partial }}_t \Phi _s^t(x) \\&\le {\overline{\partial }}_t \Phi _s^t(x) \le \max \left( \frac{s}{t},\frac{1-s}{1-t} + \frac{t-s}{t(1-t)}\right) \ell _t^2(x), \\&\min \left( \frac{1-s}{1-t} , \frac{s}{t} - \frac{t-s}{t(1-t)}\right) \ell _t^2(x) \le {\underline{\partial }}_t {\bar{\Phi }}_s^t(x) \\&\le {\overline{\partial }}_t {\bar{\Phi }}_s^t(x) \le \max \left( \frac{1-s}{1-t} , \frac{s}{t} - \frac{t-s}{t(1-t)}\right) \ell _t^2(x). \end{aligned}$$
Proof
Recall that
The first and second statements follow by Lemma 3.2 and Corollary 3.10. As \(t \mapsto {\tilde{\varphi }}_t(x)\) is differentiable on \(D_{{\tilde{\ell }}}(x)\) with derivative \(\frac{{\tilde{\ell }}_t^2(x)}{2}\), the points of differentiability of \(t \mapsto {\tilde{\Phi }}_s^t(x)\) must coincide with those of \(t \mapsto {\tilde{\ell }}_t^2(x)\) and (4.2) follows immediately, with the only possible exception being the point \(t=s\) if \(s \in D_{{\tilde{\ell }}}(x)\), where direct inspection and continuity of \(t \mapsto {\tilde{\ell }}_t^2(x)\) on \(D_{{\tilde{\ell }}}(x)\) verifies (4.2). The local Lipschitzness follows by Theorem 3.11 (2). The monotonicity follows by Lemma 4.2, since if \(\gamma ^t \in G_\varphi \) is such that \(\gamma ^t_t = x\), then \(\Phi _s^t(\gamma ^t_t) = {\bar{\Phi }}_s^t(\gamma ^t_t) = \varphi _s(\gamma ^t_s)\). The last two assertions follow as in the proof of Lemma 4.2, after noting that:
and similarly for \({\overline{\partial }}_t\). Indeed, the estimates (3.6) and (3.7) of Corollary 3.10 yield (4), which already yields half of the inequalities in (5) for all \((x,t) \in D_\ell \cap D_{\bar{\ell }}\). To get the other half, we must restrict to \(D(\mathring{G}_\varphi )\) and use the estimates of Theorem 3.11 (4), thereby concluding the proof. \(\square \)
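As a worked instance of the last two assertions, here is a sketch for \(\Phi _s^t\) in the case \(t \ge s\). It assumes, consistently with (4.2) and with \(\partial _t \varphi _t = \ell _t^2/2\), that \(\Phi _s^t = \varphi _t + (t-s) \frac{\ell _t^2}{2}\), so that \({\underline{\partial }}_t \Phi _s^t(x) = \ell _t^2(x) + (t-s) {\underline{\partial }}_t \frac{\ell _t^2(x)}{2}\) and likewise for \({\overline{\partial }}_t\). Inserting (3.6) gives the lower bound of (4):
$$\begin{aligned} {\underline{\partial }}_t \Phi _s^t(x) \ge \ell _t^2(x) - \frac{t-s}{t} \ell _t^2(x) = \frac{s}{t} \ell _t^2(x) , \end{aligned}$$
while on \(D(\mathring{G}_\varphi )\) the upper bound of Theorem 3.11 (4) gives the matching upper bound of (5):
$$\begin{aligned} {\overline{\partial }}_t \Phi _s^t(x) \le \left( 1 + (t-s) \left( \frac{2}{1-t} + \frac{1}{t}\right) \right) \ell _t^2(x) = \left( \frac{1-s}{1-t} + \frac{t-s}{t(1-t)}\right) \ell _t^2(x) , \end{aligned}$$
where we used \(1 + \frac{t-s}{1-t} = \frac{1-s}{1-t}\). For \(t \le s\) the factor \(t-s\) is non-positive, which exchanges the roles of the lower and upper derivatives; the bounds for \({\bar{\Phi }}_s^t\) follow in the same way from (3.7).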
As an immediate corollary of Proposition 4.4, Corollary 3.13 and Lemma 3.15, we obtain:
Corollary 4.5
For all \(x \in X\), for a.e. \(t \in \mathring{G}_\varphi (x)\), \(\partial _t \Phi ^t_s(x)\) and \(\partial _t {\bar{\Phi }}^t_s(x)\) exist, coincide, and satisfy:
In particular, if \(x \in X^+\) then \(\partial _t \Phi ^t_s(x) > 0\) for a.e. \(t \in \mathring{G}_\varphi (x)\).
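To illustrate how bounds of this type arise, the following sketch combines (4.2) with the outer inequalities of Theorem 3.11 (3); the precise form of the resulting estimate is our reading and should be taken as an assumption. At a point \(t \in \mathring{G}_\varphi (x)\) where \(\partial _t \ell _t^2(x)\) exists, Theorem 3.11 (3) yields \(-\frac{1}{t} \ell _t^2(x) \le \partial _t \frac{\ell _t^2(x)}{2} \le \frac{1}{1-t} \ell _t^2(x)\). Inserting this into (4.2), and treating the cases \(t \ge s\) and \(t \le s\) separately according to the sign of \(t-s\), one finds
$$\begin{aligned} \min \left( \frac{s}{t},\frac{1-s}{1-t}\right) \ell _t^2(x) \le \partial _t \Phi _s^t(x) \le \max \left( \frac{s}{t},\frac{1-s}{1-t}\right) \ell _t^2(x) . \end{aligned}$$
Since both \(\frac{s}{t}\) and \(\frac{1-s}{1-t}\) are strictly positive and \(\ell _t(x) > 0\) on \(\mathring{G}_\varphi (x)\) when \(x \in X^+\) (Lemma 3.15), the strict positivity of \(\partial _t \Phi _s^t(x)\) is visible from the lower bound.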
We will also require the following consequence of Proposition 4.4 and Theorem 3.11:
Lemma 4.6
For any \(x \in X\), \(s \in (0,1)\), and \({\tilde{\Phi }} = \Phi ,{\bar{\Phi }}\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), respectively:
Proof
By (4.2), the claim boils down to proving
Using Corollary 3.13, it follows that:
But the latter limit is clearly 0 (e.g. by Corollary 3.10 (1)). \(\square \)
5 Temporal theory of intermediate-time Kantorovich potentials: third order
Fix a non-null Kantorovich geodesic \(\gamma \in G_\varphi ^+\), and denote for short \(\ell := \ell (\gamma ) > 0\). Recall by the results of Sect. 3 that for all \(t \in (0,1)\), \(\ell _t(\gamma _t) = {\bar{\ell }}_t(\gamma _t) = \ell \) and that \(\partial _t \varphi _t(x) = \partial _t {\bar{\varphi }}_t(x) =\ell _t^2(x)/2\) for all \(x \in \mathrm{e}_t(G_\varphi )\). Also, recall that given \(x \in X\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), the function \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}_t(x)\) is only a.e. differentiable, and even on \(\mathring{G}_{\varphi }(x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\), we only have at the moment upper and lower bounds on \(\underline{\partial _t} {\tilde{\ell }}^2_t(x)/2\) and \({\overline{\partial }}_t {\tilde{\ell }}^2_t(x)/2\), i.e. second order information on \({\tilde{\varphi }}_t(x)\).
The goal of this section is to rigorously make sense of, and prove, the following formal statement, which provides second order information on \(\ell _t\), or equivalently, third order information on \(\varphi _t\), along \(\gamma _t\):
Equivalently, this amounts to the statement that the function:
is concave in \(r \in (0,1)\), since formally:
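A sketch of this formal equivalence, under labeled assumptions: write \(z(t) := \partial _\tau |_{\tau = t} \frac{\ell _\tau ^2}{2}(\gamma _t)\) and read (5.1) as the inequality \(z'(t) \ge \frac{z(t)^2}{\ell ^2}\), which is the form consistent with the Cauchy–Schwarz step of the formal argument in the next subsection. Take the function in question to be \(L(r) := \exp \left( -\frac{1}{\ell ^2} \int _{r_0}^r z(t) dt \right) \) for an arbitrary base point \(r_0 \in (0,1)\); the names z, L and the choice of \(r_0\) are assumptions made here for illustration. Then
$$\begin{aligned} L'(r) = -\frac{z(r)}{\ell ^2} L(r) , \qquad L''(r) = \left( \frac{z(r)^2}{\ell ^4} - \frac{z'(r)}{\ell ^2}\right) L(r) = -\frac{1}{\ell ^2} \left( z'(r) - \frac{z(r)^2}{\ell ^2}\right) L(r) , \end{aligned}$$
and since \(L > 0\), the condition \(L'' \le 0\) on (0, 1) holds if and only if \(z' \ge z^2/\ell ^2\) there.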
5.1 Formal argument
We start by providing a formal proof of (5.1) in an infinitesimally Hilbertian setting, which is rigorously justified on a Riemannian manifold if all involved functions are smooth (in time and space).
Recall that the Hopf-Lax semi-group solves the Hamilton-Jacobi equation (e.g. [6]):
We evaluate all subsequent functions at \(x = \gamma _t\). Since:
and since \(\gamma '(t) = -\nabla \varphi _t\) (see e.g. [6] or Lemma 10.3),
But taking two time derivatives in (5.2), we know that:
and so we conclude that:
It remains to apply Cauchy–Schwarz and deduce:
as asserted. Note that in a general setting, we can try and interpret z(t) as minus the directional derivative of \(\ell _t^2/2 = \partial _t \varphi _t\) in the direction of \(\gamma '(t)\) (by taking derivative of the identity \(\frac{\ell ^2_t}{2}(\gamma (t)) \equiv \frac{\ell ^2}{2}\)), and thus hope to justify the Cauchy–Schwarz inequality as the statement that the local Lipschitz constant of \(\partial _t \varphi _t\) is greater than any unit-directional derivative. However, a crucial point in the above argument of identifying \(z'(t)\) with \(\left| \nabla \partial _t \varphi _t\right| ^2\) was to use the linearity of \(\left\langle \cdot ,\cdot \right\rangle \) in both of its arguments, and so ultimately this formal proof is genuinely restricted to an infinitesimally Hilbertian setting.
The above discussion seems to suggest that there is no hope of proving (5.1) beyond the Hilbertian setting. Furthermore, it seems that the spatial regularity of \(\varphi _t\) and \(\partial _t \varphi _t = \frac{1}{2}\ell ^2_t\) should play an essential role in any rigorous justification. Remarkably, we will see that this is not the case on both counts, and that an appropriate interpretation of (5.1) holds true on a general proper geodesic space \((X,\mathsf {d})\).
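For orientation, the chain of formal identities of this subsection can be sketched as follows, on a smooth Riemannian manifold with all functions smooth. The sign convention \(\partial _t \varphi _t = \frac{1}{2} \left| \nabla \varphi _t\right| ^2\) for the Hamilton-Jacobi equation is an assumption, chosen to be consistent with \(\partial _t \varphi _t = \ell _t^2/2\) and \(\gamma '(t) = -\nabla \varphi _t\); all functions are evaluated at \(x = \gamma _t\), and \(z(t) := \partial _t^2 \varphi _t(\gamma _t)\). Differentiating the equation once and twice in time gives
$$\begin{aligned} z(t) = \left\langle \nabla \varphi _t , \nabla \partial _t \varphi _t \right\rangle , \qquad \partial _t^3 \varphi _t = \left\langle \nabla \partial _t^2 \varphi _t , \nabla \varphi _t \right\rangle + \left| \nabla \partial _t \varphi _t\right| ^2 , \end{aligned}$$
while the chain-rule along \(\gamma \) gives
$$\begin{aligned} z'(t) = \partial _t^3 \varphi _t + \left\langle \nabla \partial _t^2 \varphi _t , \gamma '(t) \right\rangle = \partial _t^3 \varphi _t - \left\langle \nabla \partial _t^2 \varphi _t , \nabla \varphi _t \right\rangle = \left| \nabla \partial _t \varphi _t\right| ^2 . \end{aligned}$$
Cauchy–Schwarz in the unit direction \(\nabla \varphi _t / \left| \nabla \varphi _t\right| \), together with \(\left| \nabla \varphi _t\right| (\gamma _t) = \ell \), then yields
$$\begin{aligned} z'(t) = \left| \nabla \partial _t \varphi _t\right| ^2 \ge \left\langle \nabla \partial _t \varphi _t , \frac{\nabla \varphi _t}{\left| \nabla \varphi _t\right| } \right\rangle ^2 = \frac{z(t)^2}{\ell ^2} . \end{aligned}$$
The bilinearity of \(\left\langle \cdot ,\cdot \right\rangle \) enters precisely in the second time-derivative of the Hamilton-Jacobi equation.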
5.2 Notation
Recall that by the results of Sect. 3, \(\tau \mapsto \varphi _\tau (x)\) and \(\tau \mapsto {\bar{\varphi }}_\tau (x)\) are locally semi-convex and semi-concave on (0, 1), respectively, and that \(\partial _t^{\pm } \varphi _t(x) = \ell ^{\pm }_t(x)^2/2\), \(\partial _t^{\pm } {\bar{\varphi }}_t(x) = {\bar{\ell }}^{\mp }_t(x)^2/2\) and \(\ell ^{\pm }_t(\gamma _t) = {\bar{\ell }}^{\pm }_t(\gamma _t) = \ell \) for all \(t \in (0,1)\). We respectively introduce \({\tilde{p}} = p,{\bar{p}}\) by defining at \(t \in (0,1)\):
where the penultimate equalities in each of the lines above follow from the continuity of \(D_{{\tilde{\ell }}}(\gamma _t) \ni \tau \mapsto {\tilde{\ell }}_\tau (\gamma _t)\) at \(\tau = t \in \mathring{G}_\varphi (\gamma _t) \subset D_{{\tilde{\ell }}}(\gamma _t)\), and the last ones by the monotonicity of \(\tau \mapsto \tau \ell ^{\pm }_\tau (\gamma _t)\) and \(\tau \mapsto (1-\tau ) {\bar{\ell }}^{\pm }_\tau (\gamma _t)\) and the density of \(D_{{\tilde{\ell }}}(\gamma _t)\) in (0, 1). Clearly \({\tilde{p}}_{-}(t) \le {\tilde{p}}_{+}(t)\), and \({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = {\tilde{p}} \in {\mathbb {R}}\) iff \(D_{{\tilde{\ell }}}(\gamma _t) \ni \tau \mapsto {\tilde{\ell }}_\tau ^2/2(\gamma _t)\) is differentiable at \(\tau = t\) with derivative \({\tilde{p}}\). In addition, for \({\tilde{q}} = q,{\bar{q}}\), set:
where the Peano (partial) derivatives are with respect to the t variable. It will be useful to recall that if we define \({\tilde{h}} = h , {\bar{h}}\) by:
then:
By definition, \({\tilde{q}}_-(t) = {\tilde{q}}_+(t) = {\tilde{q}} \in {\mathbb {R}}\) if and only if \(\tau \mapsto {\tilde{\varphi }}_{\tau }(\gamma _t)\) has second order Peano derivative at \(\tau = t\) equal to \({\tilde{q}}\), and hence by Lemma 2.3, iff \({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = {\tilde{q}}\), or equivalently, iff any of the other equivalent conditions for the second order differentiability of \((0,1) \ni \tau \mapsto {\tilde{\varphi }}_{\tau }(\gamma _t)\) at \(\tau =t\) are satisfied. Moreover, Lemma 2.4 implies:
but we will not require this here. We summarize the above discussion in:
Corollary 5.1
The following statements are equivalent for a given \(t \in (0,1)\):
-
(1)
\({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = {\tilde{p}} \in {\mathbb {R}}\), i.e. \(D_{{\tilde{\ell }}}(\gamma _t) \ni \tau \mapsto \frac{{\tilde{\ell }}^2_\tau }{2}(\gamma _t)\) is differentiable at \(\tau = t\) with derivative \({\tilde{p}}\).
-
(2)
\({\tilde{q}}_-(t) = {\tilde{q}}_+(t) = {\tilde{q}} \in {\mathbb {R}}\), i.e. \((0,1) \ni \tau \mapsto {\tilde{\varphi }}_\tau (\gamma _t)\) has a second Peano derivative at \(\tau = t\) equal to \({\tilde{q}}\).
In any of these cases \((0,1) \ni \tau \mapsto {\tilde{\varphi }}_\tau (\gamma _t)\) is twice differentiable at \(\tau = t\), and we have:
5.3 Main inequality
The following inequality and its consequences are the main results of this section.
Theorem 5.2
For all \(s < t\) and \(\varepsilon \) so that \(s,t,s+\varepsilon ,t+\varepsilon \in (0,1)\), we have (for both possibilities for ±):
and
Proof
By Lemma 3.5, there exists \(y^{\pm }_\varepsilon \in X\) so that
with \(\mathsf {d}(y^{\pm }_\varepsilon ,\gamma _s) = D^{\pm }_{-\varphi }(\gamma _s,s+\varepsilon ) = (s+\varepsilon ) \ell ^{\pm }_{s+\varepsilon }(\gamma _s) =: D^{\pm }_{s+\varepsilon }\). By definition, note that:
We abbreviate \(D_{r} := r \ell = \mathsf {d}(\gamma _r,\gamma _0)\), \(r=s,t\). The proof consists of subtracting the above two expressions and applying the triangle inequality:
Indeed, we obtain after subtraction, recalling the definition of h, and an application of Lemma 3.3:
Carefully rearranging terms, we obtain:
and the first claim follows.
The second claim follows by the duality between \(\varphi \) and \(\varphi ^c\). Indeed, exchange \(\varphi ,\gamma ,\varepsilon ,s,t\) with \(\varphi ^c,\gamma ^c,-\varepsilon ,1-t,1-s\), and recall that \({\bar{\varphi }}_t = -\varphi ^c_{1-t}\). A straightforward inspection of the definitions verifies:
and
and so the second claim follows from the first one. Alternatively, one may repeat the above argument by subtracting the following two expressions:
with \(\mathsf {d}(z^{\pm }_\varepsilon ,\gamma _t) = D^{\pm }_{-\varphi ^c}(\gamma _t,1-t-\varepsilon ) = (1-t-\varepsilon ) \ell ^{\pm }_{t+\varepsilon }(\gamma _t)\) and applying the triangle inequality \(\mathsf {d}(z_\varepsilon ,\gamma _s) \le \mathsf {d}(z_\varepsilon ,\gamma _t) + \mathsf {d}(\gamma _t,\gamma _s)\). \(\square \)
6.4 Consequences
As immediate corollaries of Theorem 5.2, we obtain after dividing both sides by \(\varepsilon ^2\) and taking appropriate subsequential limits as \(\varepsilon \rightarrow 0\):
Corollary 5.3
For both \({\tilde{q}} = q,{\bar{q}}\), the functions \(t \mapsto {\tilde{q}}_-(t)\) and \(t \mapsto {\tilde{q}}_+(t)\) are monotone non-decreasing on (0, 1).
Corollary 5.4
For all \(0< s< t < 1\) (and both possibilities for ±):
and
It will be convenient to use the above information in the following form:
Theorem 5.5
Assume that for a.e. \(t \in (0,1)\):
in any of the equivalent senses given by Corollary 5.1, and that moreover:
Furthermore, assume that the latter joint value coincides a.e. on (0, 1) with some continuous function \(z_c\):
Then (5.5) holds for all \(t \in (0,1)\), and we have:
Moreover, we have the following third order information on \(\varphi _t(x)\) at \(x=\gamma _t\):
In particular, for any point \(t \in (0,1)\) where \(z_c(t)\) is differentiable:
Proof
The assumptions imply by Corollary 5.1 that \({\tilde{q}}_-(t) = {\tilde{q}}_+(t) = z_c(t)\) for a.e. \(t \in (0,1)\). It follows that the same is true for every \(t \in (0,1)\) by monotonicity of \({\tilde{q}}_{\pm }\) and the assumption that \(z_c\) is continuous, yielding (5.8). Furthermore, Corollary 5.1 implies that \({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = z_c(t)\) for both \({\tilde{p}} = p,{\bar{p}}\) and for all \(t \in (0,1)\), and we obtain (5.9) by taking geometric mean of (5.3) and (5.4). The final assertion obviously follows by taking the limit in (5.9) as \(s \rightarrow t\). \(\square \)
We do not know whether all three assumptions (5.5), (5.6) and (5.7) hold for a.e. \(t \in (0,1)\) for a fixed Kantorovich geodesic \(\gamma \). However, we can guarantee the first two assumptions, at least for almost all Kantorovich geodesics, in the following sense:
Lemma 5.6
Let \(\nu \) denote any \(\sigma \)-finite Borel measure concentrated on \(G_\varphi \), so that for a.e. \(t \in (0,1)\), \(\mu _t := (\mathrm{e}_t)_{\sharp }(\nu ) \ll \mathfrak m\) for some \(\sigma \)-finite Borel measure \(\mathfrak m\) on X. Then for \(\nu \)-a.e. geodesic \(\gamma \), (5.5) and (5.6) hold for a.e. \(t \in (0,1)\).
Proof
Recall that \(D(\mathring{G}_\varphi )\) is closed in \(X \times (0,1)\) by Corollary 3.9. Denote the following Borel subsets:
By Corollary 3.13, we know that \({\mathcal {L}}^1(B(x)) = 0\) for all \(x \in X\). By Fubini
and so for a.e. \(t \in (0,1)\), \(\mathfrak m(B(t)) = 0\). Since \(\mu _t \ll \mathfrak m\) for a.e. \(t \in (0,1)\), it follows that for a.e. \(t \in (0,1)\), \(\nu (\mathrm{e}_t^{-1} B(t)) = \mu _t(B(t)) = 0\). In other words, for a.e. \(t \in (0,1)\), the Borel set \(\left\{ \gamma \in G_\varphi \; ; \; \gamma _t \in B(t) \right\} \) has zero \(\nu \)-measure. Applying Fubini again as before
we conclude that for \(\nu \)-a.e. \(\gamma \in G_\varphi \), the set \(\left\{ t \in (0,1) \; ; \; \gamma _t \in B(t) \right\} \) has zero Lebesgue measure, or equivalently, the set
has full Lebesgue measure. The asserted (5.5) and (5.6) now directly follow from an application of Corollary 5.1. \(\square \)
Finally, we obtain the following concise interpretation of the third-order information on \(\tau \mapsto \varphi _\tau \) along \(\gamma _t\), which will play a crucial role in this work:
Lemma 5.7
Assume that for some locally absolutely continuous function \(z_{ac}\) on (0, 1) we have
Then for any fixed \(r_0 \in (0,1)\), the function
is concave on (0, 1).
Proof
Since \(L \in C^1(0,1)\), concavity of L is equivalent to showing that the function:
is monotone non-decreasing. But as this function is locally absolutely continuous, this is equivalent to showing that \(W'(r) \ge 0\) for a.e. \(r \in (0,1)\). Note that the points of differentiability of W and \(z_{ac}\) coincide. At these points (of full Lebesgue measure), we indeed have
where the last inequality follows from Theorem 5.5. This concludes the proof. \(\square \)
We will subsequently show that under synthetic curvature conditions, the above assumption is indeed satisfied for \(\nu \)-a.e. geodesic \(\gamma \).
7 Part II Disintegration theory of Optimal-Transport
8 Preliminaries
So far we have worked without considering any reference measure over our metric space \((X,\mathsf {d})\). A triple \((X,\mathsf {d}, {\mathfrak {m}})\) is called a metric measure space, m.m.s. for short, if \((X, \mathsf {d})\) is a complete and separable metric space and \({\mathfrak {m}}\) is a non-negative Borel measure over X. In this work we will only be concerned with the case that \({\mathfrak {m}}\) is a probability measure, that is \({\mathfrak {m}}(X) =1\), and hence \({\mathfrak {m}}\) is automatically a Radon measure (i.e. inner-regular). We refer to [3, 5, 43, 76, 77] for background on metric measure spaces in general, and the theory of Optimal-Transport on such spaces in particular.
8.1 Geometry of Optimal-Transport on metric measure spaces
The space of all Borel probability measures over X will be denoted by \({\mathcal {P}}(X)\). It is naturally equipped with its weak topology, in duality with bounded continuous functions \(C_b(X)\) over X. The subspace of those measures having finite second moment will be denoted by \({\mathcal {P}}_{2}(X)\), and the subspace of \({\mathcal {P}}_{2}(X)\) of those measures absolutely continuous with respect to \({\mathfrak {m}}\) is denoted by \({\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\). The weak topology on \({\mathcal {P}}_{2}(X)\) is metrized by the \(L^{2}\)-Wasserstein distance \(W_{2}\), defined as follows for any \(\mu _0,\mu _1 \in {\mathcal {P}}(X)\):
$$\begin{aligned} W_2^2(\mu _0,\mu _1) := \inf _{\pi } \int _{X \times X} \mathsf {d}^2(x,y) \, \pi (dx,dy), \qquad (6.1) \end{aligned}$$
where the infimum is taken over all \(\pi \in {\mathcal {P}}(X \times X)\) having \(\mu _0\) and \(\mu _1\) as the first and the second marginals, respectively; such candidates \(\pi \) are called transference plans. It is known that the infimum in (6.1) is always attained for any \(\mu _0,\mu _1 \in {\mathcal {P}}(X)\), and the transference plans realizing this minimum are called optimal transference plans between \(\mu _0\) and \(\mu _1\). When \(W_2(\mu _0,\mu _1) < \infty \), it is known that given an optimal transference plan \(\pi \) between \(\mu _0\) and \(\mu _1\), there exists a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\) (see Sect. 3), which is associated to \(\pi \), meaning that
$$\begin{aligned} \varphi (x) + \varphi ^c(y) = \frac{\mathsf {d}^2(x,y)}{2} \quad \text {for } \pi \text {-a.e. } (x,y) \in X \times X. \qquad (6.2) \end{aligned}$$
In particular, when \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X)\), then necessarily \(W_2(\mu _0,\mu _1) < \infty \) and the above discussion applies. Moreover, in this case, it is known that for any Kantorovich potential \(\varphi \) associated to an optimal transference plan between \(\mu _0\) and \(\mu _1\), (6.2) in fact holds for all optimal transference plans \(\pi \) between \(\mu _0\) and \(\mu _1\). In addition, in this case a transference plan \(\pi \) is optimal iff it is supported on a \(\mathsf {d}^2\)-cyclically monotone set. A set \(\Lambda \subset X \times X\) is said to be c-cyclically monotone if for any finite set of points \(\{(x_{i},y_{i})\}_{i = 1,\ldots ,N} \subset \Lambda \) it holds
$$\begin{aligned} \sum _{i=1}^{N} c(x_{i},y_{i}) \le \sum _{i=1}^{N} c(x_{i},y_{i+1}), \end{aligned}$$
with the convention that \(y_{N+1} = y_{1}\).
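As a small illustration of cyclical monotonicity and of the Wasserstein distance, the following sketch works with finitely many atoms on the real line. It is not part of the original text: the function names are hypothetical, the cost is taken to be \(c = \mathsf {d}^2/2\), and the formula for \(W_2\) relies on the standard fact that the monotone (sorted) matching is optimal for the quadratic cost in dimension one. The brute-force check only tests selections of distinct pairs, which is assumed to suffice for a finite set.

```python
from itertools import permutations
import math


def cost(x, y):
    """Quadratic cost c(x, y) = d(x, y)^2 / 2 on the real line (assumed cost)."""
    return 0.5 * (x - y) ** 2


def is_cyclically_monotone(pairs, c=cost, tol=1e-12):
    """Brute-force check of c-cyclical monotonicity for a finite set of pairs.

    Every ordered selection (x_1, y_1), ..., (x_k, y_k) of distinct pairs is
    compared with its cyclic shift y_{i+1}, using y_{k+1} = y_1.
    """
    for k in range(2, len(pairs) + 1):
        for cycle in permutations(pairs, k):
            direct = sum(c(x, y) for x, y in cycle)
            shifted = sum(c(cycle[i][0], cycle[(i + 1) % k][1]) for i in range(k))
            if direct > shifted + tol:
                return False
    return True


def w2_sorted(xs, ys):
    """W_2 between two uniform empirical measures on R with equally many atoms,
    computed from the sorted matching (optimal in dimension one)."""
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


xs = [0.0, 1.0, 2.0]
ys = [0.5, 3.0, 1.5]
monotone_plan = list(zip(sorted(xs), sorted(ys)))   # support of the sorted matching
crossed_plan = [(0.0, 3.0), (1.0, 1.5), (2.0, 0.5)]  # an order-reversing matching
```

In this toy example the support of the sorted matching passes the check, while the order-reversing matching fails it, in line with the characterization of optimal plans recalled above.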
Since \((X,\mathsf {d})\) is a complete and separable metric space, so is \(({\mathcal {P}}_2(X), W_2)\). Under these assumptions, it is known that \((X,\mathsf {d})\) is geodesic if and only if \(({\mathcal {P}}_2(X), W_2)\) is geodesic. Recall that \(\mathrm{e}_{t}\) denotes the (continuous) evaluation map at \(t \in [0,1]\):
$$\begin{aligned} \mathrm{e}_{t} : \mathrm{Geo}(X) \ni \gamma \mapsto \gamma _t \in X. \end{aligned}$$
A measure \(\nu \in {\mathcal {P}}(\mathrm{Geo}(X))\) is called an optimal dynamical plan if \((\mathrm{e}_0,\mathrm{e}_1)_{\sharp } \nu \) is an optimal transference plan; it easily follows in that case that \([0,1] \ni t \mapsto (\mathrm{e}_t)_\sharp \nu \) is a geodesic in \(({\mathcal {P}}_2(X), W_2)\). It is known that any geodesic \((\mu _t)_{t \in [0,1]}\) in \(({\mathcal {P}}_2(X), W_2)\) can be lifted to an optimal dynamical plan \(\nu \) so that \((\mathrm{e}_t)_\sharp \, \nu = \mu _t\) for all \(t \in [0,1]\) (see for instance [3, Theorem 2.10]). We denote by \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\) the space of all optimal dynamical plans \(\nu \) so that \((\mathrm{e}_i)_\sharp \, \nu = \mu _i\), \(i=0,1\). Consequently, whenever \((X,\mathsf {d})\) is geodesic, the set \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\) is non-empty for all \(\mu _0,\mu _1\in {\mathcal {P}}_2(X)\), and for any Kantorovich potential \(\varphi \) associated to an optimal transference plan between \(\mu _0\) and \(\mu _1\), we have \(\nu (G_\varphi ) = 1\) for all \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\).
In order to consider restrictions of optimal dynamical plans, for any \(s,t \in [0,1]\) with \(s \le t\) we consider the restriction map
$$\begin{aligned} \text {restr}^{t}_{s} : C([0,1]; X) \ni \gamma \mapsto \gamma \circ f^{t}_{s} \in C([0,1]; X), \end{aligned}$$
where \(f^{t}_{s} : [0,1] \rightarrow [s,t]\) is defined by \(f^{t}_{s}(\tau ) = s + (t-s) \tau \). Throughout this work we will use the following facts: if \(\nu \in \mathrm {OptGeo}(\mu _{0}, \mu _{1})\) then the restriction \((\text {restr}^{t}_{s})_{\sharp }\nu \) is still an optimal dynamical plan, now between \(\mu _s\) and \(\mu _t\) where \(\mu _{r}:=(\mathrm{e}_{r})_{\sharp }\nu \). Moreover, any probability measure \(\nu ' \in {\mathcal {P}}(\mathrm{Geo}(X))\) with \(\text {supp}(\nu ') \subset \text {supp}(\nu ) ( \subset G_\varphi )\) is also an optimal dynamical plan, between \((\mathrm{e}_{0})_{\sharp } \nu '\) and \((\mathrm{e}_{1})_{\sharp } \nu '\).
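The action of the restriction map on a single geodesic can be sketched as follows, in the simplest setting of straight segments in Euclidean space. The helper names `segment` and `restr` are hypothetical and only serve this illustration; a geodesic is modelled as a Python function on \([0,1]\).

```python
def segment(x, y):
    """Constant-speed geodesic in R^n from x to y, parametrized on [0, 1]."""
    def gamma(t):
        return tuple((1 - t) * a + t * b for a, b in zip(x, y))
    return gamma


def restr(gamma, s, t):
    """Restrict gamma to [s, t] and reparametrize it on [0, 1]
    through f(tau) = s + (t - s) * tau."""
    if not 0 <= s <= t <= 1:
        raise ValueError("need 0 <= s <= t <= 1")
    return lambda tau: gamma(s + (t - s) * tau)


gamma = segment((0.0, 0.0), (4.0, 2.0))
piece = restr(gamma, 0.25, 0.75)  # geodesic from gamma(0.25) to gamma(0.75)
```

The restricted curve starts at \(\gamma _s\) and ends at \(\gamma _t\), which is the pointwise counterpart of the statement that the restricted plan connects \(\mu _s\) to \(\mu _t\).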
On several occasions we will use the following standard lemma (whose proof is a straightforward adaptation of e.g. [29, Lemma 4.4], relying on the Arzelà–Ascoli and Prokhorov theorems):
Lemma 6.1
Assume that \((X,\mathsf {d})\) is a Polish and proper space. Let \(\left\{ \mu _0^i\right\} ,\left\{ \mu _1^i\right\} \subset {\mathcal {P}}_2(X)\) denote two sequences of probability measures weakly converging to \(\mu ^\infty _0 ,\mu ^\infty _1 \in {\mathcal {P}}_2(X)\), respectively. Assume that \(\nu ^i \in \mathrm {OptGeo}(\mu _0^i,\mu _1^i)\). Then there exists a subsequence \(\left\{ \nu ^{i_j}\right\} \) weakly converging to \(\nu ^\infty \in \mathrm {OptGeo}(\mu _0^\infty ,\mu _1^\infty )\).
Definition
(Essentially Non-Branching) A subset \(G \subset \mathrm{Geo}(X)\) of geodesics is called non-branching if for any \(\gamma ^{1}, \gamma ^{2} \in G\) the following holds:
$$\begin{aligned} \exists \; {\bar{t}}\in (0,1) \text { such that } \gamma _{t}^{1} = \gamma _{t}^{2} \;\; \forall t \in [0, {\bar{t}}\,] \quad \Longrightarrow \quad \gamma ^{1}_{s} = \gamma ^{2}_{s} \;\; \forall s \in [0,1]. \end{aligned}$$
\((X,\mathsf {d})\) is called non-branching if \(\mathrm{Geo}(X)\) is non-branching. \((X,\mathsf {d}, {\mathfrak {m}})\) is called essentially non-branching [68] if for all \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\), any \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\) is concentrated on a Borel non-branching set \(G\subset \mathrm{Geo}(X)\).
Recall that a measure \(\nu \) on a measurable space \((\Omega ,{\mathcal {F}})\) is said to be concentrated on \(A \subset \Omega \) if \(\exists B \subset A\) with \(B \in {\mathcal {F}}\) so that \(\nu (\Omega \setminus B) = 0\).
8.2 Curvature-Dimension conditions
We now turn to describe various synthetic conditions encapsulating generalized Ricci curvature lower bounds coupled with generalized dimension upper bounds.
Definition 6.2
(\(\sigma _{K,{\mathcal {N}}}\)-coefficients) Given \(K \in {\mathbb {R}}\) and \({\mathcal {N}}\in (0,\infty ]\), define
$$\begin{aligned} D_{K,{\mathcal {N}}} := {\left\{ \begin{array}{ll} \frac{\pi }{\sqrt{K/{\mathcal {N}}}} &{} K > 0 \;,\; {\mathcal {N}}< \infty , \\ +\infty &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
In addition, given \(t \in [0,1]\) and \(0< \theta < D_{K,{\mathcal {N}}}\), define
$$\begin{aligned} \sigma ^{(t)}_{K,{\mathcal {N}}}(\theta ) := {\left\{ \begin{array}{ll} \frac{\sin (t \theta \sqrt{K/{\mathcal {N}}})}{\sin (\theta \sqrt{K/{\mathcal {N}}})} &{} K > 0 \;,\; {\mathcal {N}}< \infty , \\ t &{} K = 0 \text { or } {\mathcal {N}}= \infty , \\ \frac{\sinh (t \theta \sqrt{-K/{\mathcal {N}}})}{\sinh (\theta \sqrt{-K/{\mathcal {N}}})} &{} K < 0 \;,\; {\mathcal {N}}< \infty , \end{array}\right. } \end{aligned}$$
and set \(\sigma ^{(t)}_{K,{\mathcal {N}}}(0) = t\) and \(\sigma ^{(t)}_{K,{\mathcal {N}}}(\theta ) = +\infty \) for \(\theta \ge D_{K,{\mathcal {N}}}\).
Definition 6.3
(\(\tau _{K,N}\)-coefficients) Given \(K \in {\mathbb {R}}\) and \(N \in (1,\infty ]\), define
$$\begin{aligned} \tau _{K,N}^{(t)}(\theta ) := t^{\frac{1}{N}} \sigma _{K,N-1}^{(t)}(\theta )^{1 - \frac{1}{N}}. \end{aligned}$$
When \(N=1\), set \(\tau ^{(t)}_{K,1}(\theta ) = t\) if \(K \le 0\) and \(\tau ^{(t)}_{K,1}(\theta ) = +\infty \) if \(K > 0\).
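For readers who wish to experiment with the distortion coefficients numerically, here is a minimal sketch. It assumes the standard formulas from the literature: \(D_{K,{\mathcal {N}}} = \pi /\sqrt{K/{\mathcal {N}}}\) for \(K>0\) and \({\mathcal {N}}<\infty \), the sine and hyperbolic-sine quotients for \(\sigma \), and \(\tau ^{(t)}_{K,N} = t^{1/N} (\sigma ^{(t)}_{K,N-1})^{1-1/N}\). The function names are hypothetical, and only the case \(N > 1\) is covered for \(\tau \).

```python
import math


def diameter_bound(K, N):
    """D_{K,N}: pi / sqrt(K/N) if K > 0 and N is finite, +infinity otherwise."""
    if K > 0 and math.isfinite(N):
        return math.pi / math.sqrt(K / N)
    return math.inf


def sigma(t, theta, K, N):
    """sigma^{(t)}_{K,N}(theta) for t in [0, 1] and theta >= 0 (assumed formulas)."""
    if theta >= diameter_bound(K, N):
        return math.inf
    if theta == 0 or K == 0 or math.isinf(N):
        return t
    a = theta * math.sqrt(abs(K) / N)
    if K > 0:
        return math.sin(t * a) / math.sin(a)
    return math.sinh(t * a) / math.sinh(a)


def tau(t, theta, K, N):
    """tau^{(t)}_{K,N}(theta) = t^{1/N} * sigma^{(t)}_{K,N-1}(theta)^{1 - 1/N}, for N > 1."""
    if N <= 1:
        raise ValueError("this sketch only covers N > 1")
    s = sigma(t, theta, K, N - 1)
    if math.isinf(s):
        return math.inf  # theta beyond the diameter bound D_{K,N-1}
    return t ** (1.0 / N) * s ** (1.0 - 1.0 / N)
```

For \(K = 0\) both coefficients reduce to \(t\); for \(K > 0\) they exceed \(t\), and for \(K < 0\) they fall below it, reflecting the effect of curvature on volume distortion along geodesics.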
The synthetic Curvature-Dimension condition \({\mathsf {CD}}(K,N)\) has been defined on a general m.m.s. independently in several seminal works by Sturm and Lott–Villani: the case \(N=\infty \) and \(K \in {\mathbb {R}}\) was defined in [73] and [51], the case \(N \in [1,\infty )\) in [74] for \(K \in {\mathbb {R}}\) and in [51] for \(K=0\) (and subsequently for \(K \in {\mathbb {R}}\) in [50]). Our treatment in this work excludes the case \(N=\infty \) (for which the globalization result we are after is in any case known [73]). To exclude possible pathological behavior when \(N=1\), we will always assume, unless otherwise stated, that \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\).
We will use the following definition introduced in [74]. Recall that given \(N \in (1,\infty )\), the N-Rényi relative-entropy functional \({\mathcal {E}}_N : {\mathcal {P}}(X) \rightarrow [0,1]\) (since \({\mathfrak {m}}(X) = 1\)) is defined as
$$\begin{aligned} {\mathcal {E}}_N(\mu ) := \int _X \rho ^{1 - \frac{1}{N}} \, d{\mathfrak {m}}, \end{aligned}$$
where \(\mu = \rho {\mathfrak {m}}+ \mu ^{{\text {sing}}}\) is the Lebesgue decomposition of \(\mu \) with \(\mu ^{\text {sing}}\perp {\mathfrak {m}}\). It is known [74] that \({\mathcal {E}}_N\) is upper semi-continuous with respect to the weak topology on \({\mathcal {P}}(X)\).
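On a finite reference space the Rényi functional becomes a finite sum, which makes its range easy to inspect. The sketch below assumes the formula \({\mathcal {E}}_N(\mu ) = \int \rho ^{1-1/N} \, d{\mathfrak {m}}\) and a reference probability measure given by a list of weights; the function name `renyi_entropy` is hypothetical.

```python
def renyi_entropy(rho, m, N):
    """E_N(mu) = sum_i rho_i^{1 - 1/N} * m_i for mu = rho * m on a finite space.

    Only the absolutely continuous part of mu enters, as in the definition.
    """
    if N <= 1:
        raise ValueError("N must lie in (1, infinity)")
    return sum(r ** (1.0 - 1.0 / N) * w for r, w in zip(rho, m))


m = [0.25, 0.25, 0.25, 0.25]         # reference probability measure
uniform = [1.0, 1.0, 1.0, 1.0]       # density of mu = m
concentrated = [4.0, 0.0, 0.0, 0.0]  # density of the unit mass at the first atom
```

The uniform density attains the maximal value 1, while concentrating the mass lowers the functional, consistent with the stated range \([0,1]\) when \({\mathfrak {m}}(X) = 1\).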
Definition 6.4
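(\({\mathsf {CD}}(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {CD}}(K,N)\) if for all \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\), there exists \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) so that for all \(t\in [0,1]\), \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\), and for all \(N' \ge N\):
$$\begin{aligned} {\mathcal {E}}_{N'}(\mu _t) \ge \int _{X \times X} \left( \tau ^{(1-t)}_{K,N'}(\mathsf {d}(x_0,x_1)) \rho _0^{-1/N'}(x_0) + \tau ^{(t)}_{K,N'}(\mathsf {d}(x_0,x_1)) \rho _1^{-1/N'}(x_1) \right) \pi (dx_0,dx_1), \qquad (6.3) \end{aligned}$$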
for some optimal transference plan \(\pi \) between \(\mu _0 = \rho _0 \,{\mathfrak {m}}\) and \(\mu _1 = \rho _1\, {\mathfrak {m}}\).
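To get a feeling for the entropy inequality in Definition 6.4, one may test it in the flat one-dimensional case. The sketch below makes several simplifying assumptions that are not in the original text: the space is an interval with Lebesgue measure, \(K = 0\) (so that \(\tau ^{(t)}_{0,N}(\theta ) = t\)), both marginals are uniform on sub-intervals, and the interpolation is the monotone one, for which \(\mu _t\) is uniform on an interval of length \((1-t)\ell _0 + t \ell _1\). All function names are hypothetical.

```python
def entropy_of_uniform(length, N):
    """E_N of the uniform probability measure on an interval of the given length
    (Lebesgue reference measure): the density is 1/length, so E_N = length^(1/N)."""
    return length ** (1.0 / N)


def cd0_gap(len0, len1, t, N):
    """Left-hand side minus right-hand side of the CD(0, N) entropy inequality
    for uniform measures on intervals of lengths len0 and len1, at time t."""
    len_t = (1 - t) * len0 + t * len1  # length of the support of mu_t
    lhs = entropy_of_uniform(len_t, N)
    rhs = (1 - t) * entropy_of_uniform(len0, N) + t * entropy_of_uniform(len1, N)
    return lhs - rhs


gaps = [cd0_gap(0.1, 0.6, k / 10, N) for k in range(11) for N in (1.5, 2.0, 10.0)]
```

In this special case the inequality reduces to the concavity of \(x \mapsto x^{1/N}\), so every computed gap is non-negative.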
Remark 6.5
When \({\mathfrak {m}}(X) < \infty \) as in our setting, it is known [74, Proposition 1.6 (ii)] that \({\mathsf {CD}}(K,N)\) implies \({\mathsf {CD}}(K,\infty )\), and hence the requirement \(\mu _t \ll {\mathfrak {m}}\) for all intermediate times \(t \in (0,1)\) is in fact superfluous, as it must hold automatically by finiteness of the Shannon entropy (see [73, 74]).
The following is a local version of \({\mathsf {CD}}(K,N)\):
Definition 6.6
(\({\mathsf {CD}}_{loc}(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {CD}}_{loc}(K,N)\) if for any \(o \in \text {supp}({\mathfrak {m}})\), there exists a neighborhood \(X_o \subset X\) of o, so that for all \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) supported in \(X_o\), there exists \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) so that for all \(t\in [0,1]\), \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\), and for all \(N' \ge N\), (6.3) holds.
Note that \((\mathrm{e}_t)_{\sharp } \nu \) is not required to be supported in \(X_o\) for intermediate times \(t \in (0,1)\) in the latter definition.
The following pointwise density inequality is a known equivalent definition of \({\mathsf {CD}}(K,N)\) on essentially non-branching spaces (the equivalence follows by combining the results of [29] and [41], see the proof of Proposition 9.1):
Definition 6.7
(\({\mathsf {CD}}(K,N)\) for essentially non-branching spaces) An essentially non-branching m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}(K,N)\) if and only if for all \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\), there exists a unique \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\), \(\nu \) is induced by a map (i.e. \(\nu = S_{\sharp }(\mu _0)\) for some map \(S : X \rightarrow \mathrm{Geo}(X)\)), \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1]\), and writing \(\mu _t = \rho _t {\mathfrak {m}}\), we have for all \(t \in [0,1]\):
$$\begin{aligned} \rho _t^{-1/N}(\gamma _t) \ge \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _0,\gamma _1)) \rho _0^{-1/N}(\gamma _0) + \tau _{K,N}^{(t)}(\mathsf {d}(\gamma _0,\gamma _1)) \rho _1^{-1/N}(\gamma _1) \quad \text {for } \nu \text {-a.e. } \gamma \in \mathrm{Geo}(X). \end{aligned}$$
The Measure Contraction Property \({\mathsf {MCP}}(K,N)\) was introduced independently by Ohta in [57] and Sturm in [74]. The idea is to only require the \({\mathsf {CD}}(K,N)\) condition to hold when \(\mu _1\) degenerates to \(\delta _o\), a delta-measure at \(o \in \text {supp}({\mathfrak {m}})\). However, there are several possible implementations of this idea. We start with the following one, which is a variation of the one used in [29]:
Definition 6.8
(\({\mathsf {MCP}}_{\varepsilon }(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {MCP}}_{\varepsilon }(K,N)\) if for any \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) with bounded support, there exists \(\nu \in \mathrm {OptGeo}(\mu _0, \delta _{o} )\), such that for all \(t \in [0,1)\), if \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \) then \(\text {supp}(\mu _t) \subset \text {supp}({\mathfrak {m}})\), and:
$$\begin{aligned} {\mathcal {E}}_{N}(\mu _t) \ge \int _{X} \tau _{K,N}^{(1-t)}(\mathsf {d}(x,o)) \rho _0^{1 - \frac{1}{N}}(x) \, {\mathfrak {m}}(dx), \qquad (6.4) \end{aligned}$$
where \(\mu _0 = \rho _0 {\mathfrak {m}}\).
The variant proposed in [57] is as follows:
Definition 6.9
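(\({\mathsf {MCP}}(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {MCP}}(K,N)\) if for any \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) of the form \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\) for some Borel set \(A \subset X\) with \(0< {\mathfrak {m}}(A) < \infty \), there exists \(\nu \in \mathrm {OptGeo}(\mu _0, \delta _{o} )\) such that:
$$\begin{aligned} \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\ge (\mathrm{e}_{t})_{\sharp } \left( \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _{0},\gamma _{1}))^{N} \nu (d \gamma ) \right) \quad \forall t \in [0,1]. \qquad (6.5) \end{aligned}$$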
Remark 6.10
Note that in [57] it was assumed in addition that \(\text {supp}({\mathfrak {m}}) = X\) and that \((X,\mathsf {d})\) is a length-space, but (6.5) was only required to hold for \(A \subset B(o, D_{K,N-1})\) if \(K>0\); both our version and the one from [57] imply that the diameter of \(\text {supp}({\mathfrak {m}})\) is bounded above by \(D_{K,N-1}\) (this follows in our version since \(\tau _{K,N}(\theta ) = +\infty \) if \(\theta \ge D_{K,N-1}\), and by [57, Theorem 4.3] in the version from [57]), and also that \(\text {supp}({\mathfrak {m}})\) is a geodesic-space (see Lemma 6.12 below), and therefore both versions are ultimately equivalent.
When either the \({\mathsf {MCP}}(K,N)\) or \({\mathsf {MCP}}_{\varepsilon }(K,N)\) conditions hold for a given \(o \in \text {supp}({\mathfrak {m}})\), we will say that the space satisfies the corresponding condition with respect to o.
Remark 6.11
The \({\mathsf {CD}}(K,N)\), \({\mathsf {CD}}_{loc}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) and \({\mathsf {MCP}}(K,N)\) conditions all ensure that for all \(t \in [0,1]\), \(\text {supp}((\mathrm{e}_t)_\sharp \nu ) \subset \text {supp}({\mathfrak {m}})\) for the appropriate \(\nu \in \mathrm{OptGeo}(\mu _0,\mu _1)\) appearing in the corresponding definition. Consequently, for a fixed dense countable set of times \(t \in (0,1)\), \(\gamma _t \in \text {supp}({\mathfrak {m}})\) for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\); since \(\text {supp}({\mathfrak {m}})\) is closed, this in fact holds for all \(t \in [0,1]\), and hence \(\gamma \in \mathrm{Geo}(\text {supp}({\mathfrak {m}}))\) for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\), i.e. \(\text {supp}(\nu ) \subset \mathrm{Geo}(\text {supp}({\mathfrak {m}}))\). It follows that \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}(K,N)\), \({\mathsf {CD}}_{loc}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) or \({\mathsf {MCP}}(K,N)\) iff \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) does.
The following simple lemma will be useful for quickly establishing that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper and geodesic:
Lemma 6.12
Let \((X,\mathsf {d},{\mathfrak {m}})\) be a m.m.s. verifying \({\mathsf {CD}}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) or \({\mathsf {MCP}}(K,N)\). Then \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a Polish, proper and geodesic space. The same holds for \({\mathsf {CD}}_{loc}(K,N)\) if \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is assumed to be a length space.
Proof
As \(\text {supp}({\mathfrak {m}}) \subset X\) is closed, \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is Polish. It was shown in [57, Lemma 2.5, Theorem 5.1] for \({\mathsf {MCP}}(K,N)\) (and hence \({\mathsf {MCP}}_{\varepsilon }(K,N)\)) and in [74, Corollary 2.4] for \({\mathsf {CD}}(K,N)\) that these conditions imply a doubling condition, so that every closed bounded ball in \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is totally bounded. Together with completeness, this already implies that the latter space is proper. By Remark 6.11, \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies the same corresponding condition as \((X,\mathsf {d},{\mathfrak {m}})\). In particular, if \((X,\mathsf {d},{\mathfrak {m}})\) and hence \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) or \({\mathsf {MCP}}(K,N)\), then for any \(x,y \in \text {supp}({\mathfrak {m}})\), there is at least one geodesic in \(\text {supp}({\mathfrak {m}})\) from \(B(y,\varepsilon ) \cap \text {supp}({\mathfrak {m}})\) to x; together with properness and completeness, this already implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is geodesic. On the other hand, if \((X,\mathsf {d},{\mathfrak {m}})\) and hence \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}_{loc}(K,N)\), the above argument shows that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is complete and locally compact. Together with the assumption that the latter space is a length-space, the Hopf-Rinow theorem implies that it is proper and geodesic. \(\square \)
Lemma 6.13
The following chain of implications is known:
$$\begin{aligned} {\mathsf {CD}}(K,N) \; \Rightarrow \; {\mathsf {MCP}}_{\varepsilon }(K,N) \; \Rightarrow \; {\mathsf {MCP}}(K,N). \end{aligned}$$
Proof
By Remark 6.11, we may reduce to the case \(\text {supp}({\mathfrak {m}}) = X\). Fixing \(\mu _0 \ll {\mathfrak {m}}\) with bounded support and \(o \in X\), let \(\nu ^\varepsilon \) be an element of \(\mathrm {OptGeo}(\mu _0,\mu _1^\varepsilon )\) satisfying the \({\mathsf {CD}}(K,N)\) condition for \(\mu _1^\varepsilon = {\mathfrak {m}}(B(o,\varepsilon ))^{-1}{\mathfrak {m}}\llcorner _{B(o,\varepsilon )}\). By Lemma 6.1 (which applies since the space is proper by Lemma 6.12), \(\left\{ \nu ^\varepsilon \right\} \) has a subsequence converging to some \(\nu ^0 \in \mathrm {OptGeo}(\mu _0,\delta _o)\) as \(\varepsilon \rightarrow 0\). The upper semi-continuity of \({\mathcal {E}}_N\) and the continuity of the evaluation map \(\mathrm{e}_t\) ensure that \(\nu ^0\) satisfies the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) condition (6.4). The second implication follows by the arguments of [66, Section 5] (without any type of essential non-branching assumption). \(\square \)
Remark 6.14
We will show in Proposition 9.1 that for essentially non-branching spaces, \({\mathsf {MCP}}(K,N)\) conversely implies \({\mathsf {MCP}}_{\varepsilon }(K,N)\). We remark that for non-branching spaces, the implication \({\mathsf {CD}}(K,N) \Rightarrow {\mathsf {MCP}}(K,N)\) was first proved in [74].
Many additional useful results on the structure of \(W_{2}\)-geodesics can be obtained just from the \({\mathsf {MCP}}\) condition. The following has been shown in [29, Theorem 1.1 and Appendix] (when \(\text {supp}({\mathfrak {m}}) = X\); the formulation below is immediately obtained from Remark 6.11):
Theorem 6.15
([29]) Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. satisfying \({\mathsf {MCP}}(K,N)\). Given any pair \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _{0} \ll {\mathfrak {m}}\) and \(\text {supp}(\mu _{1}) \subset \text {supp}({\mathfrak {m}})\), the following holds:
- there exists a unique \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\) and hence a unique optimal transference plan between \(\mu _0\) and \(\mu _1\);
- there exists a map \(S : X \supset {\text {Dom}}\,(S) \rightarrow \mathrm{Geo}(X)\) such that \(\nu = S_{\sharp } \mu _{0}\);
- for any \(t \in [0,1)\) the measure \((\mathrm{e}_{t})_{\sharp } \nu \) is absolutely continuous with respect to \({\mathfrak {m}}\).
The following is a standard corollary of the fact that the optimal dynamical plan is induced by a map (see e.g. the comments after [41, Theorem 1.1]); as we could not find a reference, we sketch the proof for completeness.
Corollary 6.16
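With the same assumptions as in Theorem 6.15, the unique optimal dynamical plan \(\nu \) is concentrated on a (Borel) set \(G \subset \mathrm{Geo}(X)\), so that for all \(t \in [0,1)\), the evaluation map \(\mathrm{e}_t|_G : G \rightarrow X\) is injective. In particular, for any Borel subset \(H \subset G\):
$$\begin{aligned} (\mathrm{e}_t)_{\sharp }(\nu \llcorner _{H}) = \left( (\mathrm{e}_t)_{\sharp } \nu \right) \llcorner _{\mathrm{e}_t(H)} \quad \forall t \in [0,1). \end{aligned}$$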
Sketch of proof
First, we claim the existence of \(X_1 \subset X\) with \(\mu _0(X_1) = 1\), so that for all \(x \in X_1\), there exists a unique \(\gamma \in G_\varphi \) with \(\gamma _0 = x\). Otherwise, if \(A \subset X\) is a set of positive \(\mu _0\)-measure where this is violated, there are at least two distinct geodesics in \(G_\varphi \) emanating from every \(x \in A\). As these geodesics must be different at some rational time in (0, 1), it follows that there exists a rational \({{\bar{t}}} \in (0,1)\) and \(B \subset A\) still of positive \(\mu _0\)-measure so that the two geodesics emanating from x differ at time \({{\bar{t}}}\) for all \(x \in B\). Consider \({\bar{\mu }}_0 = \mu _0 \llcorner _{B} / \mu _0(B) \ll {\mathfrak {m}}\), and transport to time \({{\bar{t}}}\) half of its mass along one geodesic and the second half along the other one (see e.g. the proof of [29, Theorem 5.1]). The latter transference plan is optimal but is not induced by a map, yielding a contradiction.
Now denote \(G := S(X_1)\) (and hence \(\nu (G) = 1\)), so that the injectivity of \(\mathrm{e}_0|_G\) is already guaranteed. To see the injectivity of \(\mathrm{e}_t|_G\) for all \(t \in (0,1)\), suppose to the contrary the existence of distinct \(\gamma ^{1}, \gamma ^{2} \in G\) with \(\gamma ^{1}_{t} = \gamma ^{2}_{t}\). Denoting by \(\eta \) the gluing of \(\gamma ^{1}\) restricted to [0, t] with \(\gamma ^{2}\) restricted to [t, 1], it follows by \(\mathsf {d}^{2}\)-cyclic monotonicity (see e.g. the proof of [14, Lemma 2.6] or that of Lemma 3.7) that \(\eta \in G_{\varphi }\) with \(\eta _{0} = \gamma _{0}^{1}\) and \(\eta \ne \gamma ^{1}\). But this is in contradiction to the definition of \(X_1\), thereby concluding the proof. \(\square \)
8.3 Disintegration theorem
We include here a version of the Disintegration Theorem that we will use. We will follow [18, Appendix A], where a self-contained approach to the Disintegration Theorem in countably generated measure spaces (including a proof) can be found. An even more general version of the Disintegration Theorem can be found in [39, Section 452].
Recall that given a measure space \((X,{\mathscr {X}},{\mathfrak {m}})\), a set \(A \subset X\) is called \({\mathfrak {m}}\)-measurable if A belongs to the completion of the \(\sigma \)-algebra \({\mathscr {X}}\), generated by adding to it all subsets of null \({\mathfrak {m}}\)-sets; similarly, a function \(f : (X,{\mathscr {X}},{\mathfrak {m}}) \rightarrow {\mathbb {R}}\) is called \({\mathfrak {m}}\)-measurable if all of its sub-level sets are \({\mathfrak {m}}\)-measurable.
Definition 6.17
(Disintegration on sets) Let \((X,{\mathscr {X}},{\mathfrak {m}})\) denote a measure space. Given any family \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) of subsets of X, a disintegration of \({\mathfrak {m}}\) on \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) is a measure-space structure \((Q,{\mathscr {Q}},{\mathfrak {q}})\) and a map
$$\begin{aligned} Q \ni \alpha \longmapsto {\mathfrak {m}}_{\alpha } \in {\mathcal {P}}(X,{\mathscr {X}}) \end{aligned}$$
so that
(1) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \({\mathfrak {m}}_\alpha \) is concentrated on \(X_\alpha \);
(2) for all \(B \in {\mathscr {X}}\), the map \(\alpha \mapsto {\mathfrak {m}}_{\alpha }(B)\) is \({\mathfrak {q}}\)-measurable;
(3) for all \(B \in {\mathscr {X}}\), \({\mathfrak {m}}(B) = \int {\mathfrak {m}}_{\alpha }(B)\, {\mathfrak {q}}(d\alpha )\).
The measures \({\mathfrak {m}}_\alpha \) are referred to as conditional probabilities.
Given a measurable space \((X, {\mathscr {X}})\) and a function \({\mathfrak {Q}}: X \rightarrow Q\), with Q a general set, we endow Q with the push forward \(\sigma \)-algebra \({\mathscr {Q}}\) of \({\mathscr {X}}\):
$$\begin{aligned} C \in {\mathscr {Q}} \quad \Longleftrightarrow \quad {\mathfrak {Q}}^{-1}(C) \in {\mathscr {X}}, \end{aligned}$$
i.e. the biggest \(\sigma \)-algebra on Q such that \({\mathfrak {Q}}\) is measurable. Moreover, given a measure \({\mathfrak {m}}\) on \((X,{\mathscr {X}})\), define a measure \({\mathfrak {q}}\) on \((Q,{\mathscr {Q}})\) by pushing forward \({\mathfrak {m}}\) via \({\mathfrak {Q}}\), i.e. \({\mathfrak {q}}:= {\mathfrak {Q}}_\sharp \, {\mathfrak {m}}\).
Definition 6.18
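(Consistent and Strongly Consistent Disintegration) A disintegration of \({\mathfrak {m}}\) consistent with \({\mathfrak {Q}}: X \rightarrow Q\) is a map:
$$\begin{aligned} Q \ni \alpha \longmapsto {\mathfrak {m}}_{\alpha } \in {\mathcal {P}}(X,{\mathscr {X}}) \end{aligned}$$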
such that the following requirements hold:
(1) for all \(B \in {\mathscr {X}}\), the map \(\alpha \mapsto {\mathfrak {m}}_{\alpha }(B)\) is \({\mathfrak {q}}\)-measurable;
(2) for all \(B \in {\mathscr {X}}\) and \(C \in {\mathscr {Q}}\), the following consistency condition holds:
$$\begin{aligned} {\mathfrak {m}}\left( B \cap {\mathfrak {Q}}^{-1}(C) \right) = \int _{C} {\mathfrak {m}}_{\alpha }(B)\, {\mathfrak {q}}(d\alpha ). \end{aligned}$$
A disintegration of \({\mathfrak {m}}\) is called strongly consistent with respect to \({\mathfrak {Q}}\) if in addition:
(3) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \({\mathfrak {m}}_\alpha \) is concentrated on \({\mathfrak {Q}}^{-1}(\alpha )\).
The above general scheme fits with the following situation: given a measure space \((X,{\mathscr {X}},{\mathfrak {m}})\), suppose a partition of X is given into disjoint sets \(\{ X_{\alpha }\}_{\alpha \in Q}\) so that \(X = \cup _{\alpha \in Q} X_\alpha \). Here Q is the set of indices and \({\mathfrak {Q}}: X \rightarrow Q\) is the quotient map, i.e.
$$\begin{aligned} \alpha = {\mathfrak {Q}}(x) \quad \Longleftrightarrow \quad x \in X_{\alpha }. \end{aligned}$$
We endow Q with the quotient \(\sigma \)-algebra \({\mathscr {Q}}\) and the quotient measure \({\mathfrak {q}}\) as described above, obtaining the quotient measure space \((Q, {\mathscr {Q}}, {\mathfrak {q}})\). When a disintegration \(\alpha \mapsto {\mathfrak {m}}_\alpha \) of \({\mathfrak {m}}\) is (strongly) consistent with the quotient map \({\mathfrak {Q}}\), we will simply say that it is (strongly) consistent with the partition. Note that any disintegration \(\alpha \mapsto {\mathfrak {m}}_\alpha \) of \({\mathfrak {m}}\) on a partition \(\{ X_{\alpha }\}_{\alpha \in Q}\) (as in Definition 6.17) is automatically strongly consistent with the partition (as in Definition 6.18), and vice versa.
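In the finite setting all measurability issues disappear, and a disintegration consistent with a partition can be computed by hand. The following sketch illustrates the quotient measure \({\mathfrak {q}} = {\mathfrak {Q}}_\sharp \, {\mathfrak {m}}\) and the conditional probabilities \({\mathfrak {m}}_\alpha \) on a small grid; it is only a toy model, with hypothetical names (`disintegrate`, `measure`), and uses exact rational arithmetic so that the identities hold without rounding.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(m, quotient):
    """Disintegrate a finite measure m (dict: point -> mass) along the partition
    into level sets of `quotient` (function: point -> index alpha).

    Returns (q, cond): q is the push-forward of m under the quotient map and
    cond[alpha] is the conditional probability m_alpha, carried by the fibre
    over alpha. Fibres of zero q-mass are skipped.
    """
    q = defaultdict(Fraction)
    for x, w in m.items():
        q[quotient(x)] += w
    cond = {a: {} for a in q if q[a] > 0}
    for x, w in m.items():
        a = quotient(x)
        if q[a] > 0:
            cond[a][x] = w / q[a]
    return dict(q), cond


def measure(mu, B):
    """Mass that the finite measure mu assigns to the set B."""
    return sum(w for x, w in mu.items() if x in B)


# Uniform probability on a 2 x 3 grid, partitioned into the fibres {i} x {0, 1, 2}.
m = {(i, j): Fraction(1, 6) for i in range(2) for j in range(3)}
q, cond = disintegrate(m, quotient=lambda x: x[0])

B = {(0, 0), (0, 1), (1, 2)}
recovered = sum(measure(cond[a], B) * q[a] for a in cond)  # integral of m_alpha(B) dq
```

Here `recovered` reproduces the mass of `B` under `m`, and each conditional measure is a probability carried by its own fibre, which is the finite analogue of strong consistency with the partition.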
We now formulate the Disintegration Theorem (it is formulated for probability measures but clearly holds for any finite non-zero measure):
Theorem 6.19
(Theorem A.7, Proposition A.9 of [18]) Assume that \((X,{\mathscr {X}},{\mathfrak {m}})\) is a countably generated probability space and that \(\left\{ X_{\alpha }\right\} _{\alpha \in Q}\) is a partition of X.
Then the quotient probability space \((Q, {\mathscr {Q}},{\mathfrak {q}})\) is essentially countably generated and there exists an essentially unique disintegration \(\alpha \mapsto {\mathfrak {m}}_{\alpha }\) consistent with the partition.
If in addition \({\mathscr {X}}\) contains all singletons, then the disintegration is strongly consistent if and only if there exists an \({\mathfrak {m}}\)-section \(S_{{\mathfrak {m}}} \in {\mathscr {X}}\) of the partition such that the \(\sigma \)-algebra on \(S_{{\mathfrak {m}}}\) induced by the quotient-map contains the trace \(\sigma \)-algebra \({\mathscr {X}} \cap S_{{\mathfrak {m}}} := \left\{ A \cap S_{{\mathfrak {m}}} ; A \in {\mathscr {X}}\right\} \).
Let us expand on the statement of Theorem 6.19. Recall that a \(\sigma \)-algebra \({\mathcal {A}}\) is countably generated if there exists a countable family of sets so that \({\mathcal {A}}\) coincides with the smallest \(\sigma \)-algebra containing them. On the measure space \((Q, {\mathscr {Q}},{\mathfrak {q}})\), the \(\sigma \)-algebra \({\mathscr {Q}}\) is called essentially countably generated if there exists a countable family of sets \(Q_{n} \subset Q\) such that for any \(C \in {\mathscr {Q}}\) there exists \({{\hat{C}}} \in \hat{{\mathscr {Q}}}\), where \(\hat{{\mathscr {Q}}}\) is the \(\sigma \)-algebra generated by \(\{ Q_{n} \}_{n \in {\mathbb {N}}}\), such that \({\mathfrak {q}}(C\, \Delta \, {{\hat{C}}}) = 0\).
Essential uniqueness is understood above in the following sense: if \(\alpha \mapsto {\mathfrak {m}}^{1}_{\alpha }\) and \(\alpha \mapsto {\mathfrak {m}}^{2}_{\alpha }\) are two disintegrations consistent with the partition, then \({\mathfrak {m}}^{1}_{\alpha }={\mathfrak {m}}^{2}_{\alpha }\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).
Finally, a set \(S \subset X\) is a section for the partition \(X = \cup _{\alpha \in Q}X_{\alpha }\) if for any \(\alpha \in Q\), \(S \cap X_\alpha \) is a singleton \(\left\{ x_\alpha \right\} \). By the axiom of choice, a section S always exists, and we may identify Q with S via the map \(Q \ni \alpha \mapsto x_\alpha \in S\). A set \(S_{{\mathfrak {m}}}\) is an \({\mathfrak {m}}\)-section if there exists \(Y \in {\mathscr {X}}\) with \({\mathfrak {m}}(X \setminus Y) = 0\) such that the partition \(Y = \cup _{\alpha \in Q_{\mathfrak {m}}} (X_{\alpha } \cap Y)\) has section \(S_{{\mathfrak {m}}}\), where \(Q_{{\mathfrak {m}}} = \left\{ \alpha \in Q ; X_{\alpha } \cap Y \ne \emptyset \right\} \). As \({\mathfrak {q}}= {\mathfrak {Q}}_{\sharp } {\mathfrak {m}}\), clearly \({\mathfrak {q}}(Q \setminus Q_{{\mathfrak {m}}}) = 0\). As usual, we identify between \(Q_{{\mathfrak {m}}}\) and \(S_{{\mathfrak {m}}}\), so that now \(Q_{{\mathfrak {m}}}\) carries two measurable structures: \({\mathscr {Q}} \cap Q_{{\mathfrak {m}}}\) (the push-forward of \({\mathscr {X}} \cap Y\) via \({\mathfrak {Q}}\)), and also \({\mathscr {X}} \cap S_{{\mathfrak {m}}}\) via our identification. The last condition of Theorem 6.19 is that \({\mathscr {Q}} \cap Q_{{\mathfrak {m}}} \supset {\mathscr {X}} \cap S_{{\mathfrak {m}}}\), i.e. that the restricted quotient-map \({\mathfrak {Q}}|_{Y} : (Y,{\mathscr {X}} \cap Y) \rightarrow (S_{\mathfrak {m}}, {\mathscr {X}} \cap S_{{\mathfrak {m}}})\) is measurable, so that the full quotient-map \({\mathfrak {Q}}: (X,{\mathscr {X}}) \rightarrow (S , {\mathscr {X}} \cap S)\) is \({\mathfrak {m}}\)-measurable.
We will typically apply the Disintegration Theorem to \((E,{\mathcal {B}}(E),{\mathfrak {m}}\llcorner _{E})\), where \(E \subset X\) is an \({\mathfrak {m}}\)-measurable subset (with \({\mathfrak {m}}(E) > 0\)) of the m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\). As our metric space is separable, \({\mathcal {B}}(E)\) is countably generated, and so Theorem 6.19 applies. In particular, when \(Q \subset {\mathbb {R}}\), E is a closed subset of X, the partition elements \(X_\alpha \) are closed and the quotient-map \({\mathfrak {Q}}: E \rightarrow Q\) is known to be Borel (for instance, this is the case when \({\mathfrak {Q}}\) is continuous), [72, Theorem 5.4.3] guarantees the existence of a Borel section S for the partition so that \({\mathfrak {Q}}: E \rightarrow S\) is Borel measurable, thereby guaranteeing by Theorem 6.19 the existence of an essentially unique disintegration strongly consistent with \({\mathfrak {Q}}\).
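On a finite set every measurability issue disappears, and the content of the Disintegration Theorem reduces to elementary conditioning. The following minimal sketch illustrates this toy case only; the function names `disintegrate` and `reassemble`, and the six-point example, are hypothetical and not part of the text.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(m, quotient_map):
    """Disintegrate a finite measure m (dict: point -> mass) along the
    partition induced by quotient_map (point -> label alpha).

    Returns (q, cond): q is the push-forward of m under the quotient map,
    and cond[alpha] is a probability measure carried by the cell of alpha.
    """
    q = defaultdict(Fraction)
    for x, mass in m.items():
        q[quotient_map(x)] += mass
    cond = {}
    for alpha, total in q.items():
        if total == 0:
            continue  # cells of q-measure zero do not matter (essential uniqueness)
        cond[alpha] = {x: mass / total for x, mass in m.items()
                       if quotient_map(x) == alpha}
    return dict(q), cond


def reassemble(q, cond):
    """Recover m from m(A) = sum over alpha of m_alpha(A) q(alpha), on singletons."""
    m = defaultdict(Fraction)
    for alpha, m_alpha in cond.items():
        for x, p in m_alpha.items():
            m[x] += p * q[alpha]
    return dict(m)


# Hypothetical example: six points, partitioned by residue mod 3.
m = {x: Fraction(x + 1, 21) for x in range(6)}  # total mass 1
q, cond = disintegrate(m, lambda x: x % 3)
```

Here each conditional measure is concentrated on its own cell, which is the finite analogue of strong consistency with the partition.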
7 Theory of \(L^1\)-Optimal-Transport
In this section we recall various results from the theory of \(L^1\)-Optimal-Transport which are relevant to this work, and add some new information we will subsequently require. We refer to [2, 13, 19, 23, 37, 38, 47, 76] for more details.
7.1 Preliminaries
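To any 1-Lipschitz function \(u : X \rightarrow {\mathbb {R}}\) there is a naturally associated \(\mathsf {d}\)-cyclically monotone set:
$$\begin{aligned} \Gamma _{u} := \{ (x,y) \in X \times X : u(x) - u(y) = \mathsf {d}(x,y) \} . \end{aligned}$$
Its transpose is given by \(\Gamma ^{-1}_{u}= \{ (x,y) \in X \times X : (y,x) \in \Gamma _{u} \}\). We define the transport relation \(R_u\) and the transport set \({\mathcal {T}}_{u}\), as
$$\begin{aligned} R_{u} := \Gamma _{u} \cup \Gamma ^{-1}_{u} , \quad {\mathcal {T}}_{u} := P_{1}(R_{u} \setminus \{ x = y \}) , \end{aligned}$$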
where \(\{ x = y\}\) denotes the diagonal \(\{ (x,y) \in X^{2} : x=y \}\) and \(P_{i}\) the projection onto the i-th component. Recall that \(\Gamma _u(x) = \left\{ y \in X \; ;\; (x,y) \in \Gamma _u\right\} \) denotes the section of \(\Gamma _u\) through x in the first coordinate, and similarly for \(R_u(x)\) (through either coordinates by symmetry). Since u is 1-Lipschitz, \(\Gamma _{u}, \Gamma ^{-1}_{u}\) and \(R_{u}\) are closed sets, and so are \(\Gamma _u(x)\) and \(R_u(x)\). Consequently \({\mathcal {T}}_{u}\) is a projection of a Borel set and hence analytic; it follows that it is universally measurable, and in particular, \({\mathfrak {m}}\)-measurable [72].
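To make these objects concrete, the following sketch computes them on a finite metric space. It assumes the standard definitions \(\Gamma _u = \{ u(x) - u(y) = \mathsf {d}(x,y) \}\), \(R_u = \Gamma _u \cup \Gamma _u^{-1}\) and \({\mathcal {T}}_u = P_1(R_u \setminus \{ x = y\})\); the helper names and the four-point example are hypothetical illustrations, and a finite space is of course not a geodesic space.

```python
from itertools import product


def gamma(points, d, u):
    """Gamma_u: pairs (x, y) with u(x) - u(y) = d(x, y)."""
    return {(x, y) for x, y in product(points, repeat=2)
            if u(x) - u(y) == d(x, y)}


def transport_relation(points, d, u):
    """R_u: Gamma_u together with its transpose."""
    g = gamma(points, d, u)
    return g | {(y, x) for x, y in g}


def transport_set(points, d, u):
    """T_u: first projection of R_u with the diagonal removed."""
    return {x for x, y in transport_relation(points, d, u) if x != y}


# Hypothetical example: four points of the plane with the taxicab metric,
# and the 1-Lipschitz function u = first coordinate.
points = [(0, 0), (1, 0), (2, 0), (0, 5)]
d = lambda p, q: abs(p[0] - q[0]) + abs(p[1] - q[1])
u = lambda p: p[0]
```

In this example the three collinear points are pairwise related, while \((0,5)\) is related only to itself and therefore lies outside the transport set.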
The following is immediate to verify (see [2, Proposition 4.2]):
Lemma 7.1
Let \((\gamma _0,\gamma _1) \in \Gamma _{u}\) for some \(\gamma \in \mathrm{Geo}(X)\). Then \((\gamma _{s},\gamma _{t}) \in \Gamma _{u}\) for all \(0\le s \le t \le 1\).
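One way to verify the lemma, using only that \(u\) is 1-Lipschitz and that \(\gamma \) is a geodesic (and assuming the standard description \(\Gamma _u = \{ u(x) - u(y) = \mathsf {d}(x,y) \}\)), is the following chain for \(0 \le s \le t \le 1\):
$$\begin{aligned} \mathsf {d}(\gamma _0,\gamma _1)&= u(\gamma _0) - u(\gamma _1) = \big ( u(\gamma _0) - u(\gamma _s) \big ) + \big ( u(\gamma _s) - u(\gamma _t) \big ) + \big ( u(\gamma _t) - u(\gamma _1) \big ) \\&\le \mathsf {d}(\gamma _0,\gamma _s) + \mathsf {d}(\gamma _s,\gamma _t) + \mathsf {d}(\gamma _t,\gamma _1) = \mathsf {d}(\gamma _0,\gamma _1) . \end{aligned}$$
Equality must therefore hold in each of the three 1-Lipschitz estimates; in particular \(u(\gamma _s) - u(\gamma _t) = \mathsf {d}(\gamma _s,\gamma _t)\), that is, \((\gamma _s,\gamma _t) \in \Gamma _u\).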
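Also recall the following definitions, introduced in [23]:
$$\begin{aligned} A_{+}&:= \{ x \in {\mathcal {T}}_{u} : \exists z,w \in \Gamma _{u}(x) , \ (z,w) \notin R_{u} \} , \\ A_{-}&:= \{ x \in {\mathcal {T}}_{u} : \exists z,w \in \Gamma ^{-1}_{u}(x) , \ (z,w) \notin R_{u} \} . \end{aligned}$$
\(A_{\pm }\) are called the sets of forward and backward branching points, respectively. Note that both \(A_{\pm }\) are analytic sets; for instance:
$$\begin{aligned} A_{+} = P_{1} \big ( \{ (x,z,w) \in {\mathcal {T}}_{u} \times (R_{u})^{c} : (x,z), (x,w) \in \Gamma _{u} \} \big ) , \end{aligned}$$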
showing that \(A_{+}\) is a projection of an analytic set and therefore analytic. If \(x \in A_{+}\) and \((y,x) \in \Gamma _{u}\) necessarily also \(y \in A_{+}\) (as \(\Gamma _{u}(y) \supset \Gamma _{u}(x)\) by the triangle inequality); similarly, if \(x \in A_{-}\) and \((x,y) \in \Gamma _{u}\) then necessarily \(y \in A_{-}\).
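Consider the non-branched transport set
$$\begin{aligned} {\mathcal {T}}_{u}^{b} := {\mathcal {T}}_{u} \setminus (A_{+} \cup A_{-}) , \end{aligned}$$
which belongs to the sigma-algebra \(\sigma ({\mathcal {A}})\) generated by analytic sets and is therefore \({\mathfrak {m}}\)-measurable. Define the non-branched transport relation:
$$\begin{aligned} R_{u}^{b} := R_{u} \cap ({\mathcal {T}}_{u}^{b} \times {\mathcal {T}}_{u}^{b}) . \end{aligned}$$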
It was shown in [23] (cf. [19]) that \(R_u^b\) is an equivalence relation over \({\mathcal {T}}_{u}^{b}\) and that for any \(x \in {\mathcal {T}}_{u}^{b}\), \(R_{u}(x) \subset (X,\mathsf {d})\) is isometric to a closed interval in \(({\mathbb {R}},\left| \cdot \right| )\).
Remark 7.2
Note that even if \(x \in {\mathcal {T}}_{u}^{b}\), the transport ray \(R_{u}(x)\) need not be entirely contained in \({\mathcal {T}}_{u}^{b}\). However, we will soon prove that almost every transport ray (with respect to an appropriate measure) has interior part contained in \({\mathcal {T}}_{u}^{b}\).
It will be very useful to note that whenever the space \((X,\mathsf {d})\) is proper (for instance when \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\)), \({\mathcal {T}}_{u}\) and \(A_{\pm }\) are \(\sigma \)-compact sets: indeed writing \(R_{u} \setminus \{ x= y\} = \cup _{\varepsilon>0} R_{u} \setminus \{ \mathsf {d}(x,y) < \varepsilon \}\) it follows that \(R_{u} \setminus \{ x= y\}\) is \(\sigma \)-compact. Hence \( {\mathcal {T}}_{u}\) is \(\sigma \)-compact. Moreover:
since \((R_{u})^{c}\) is open and open sets are \(F_{\sigma }\) in metric spaces, it follows that \(\{ (x,z,w) \in {\mathcal {T}}_{u} \times (R_{u})^{c} :(x,z), (x,w) \in \Gamma _{u} \}\) is \(\sigma \)-compact and therefore \(A_{+}\) is \(\sigma \)-compact; the same applies to \(A_{-}\). Consequently, \({\mathcal {T}}_u^b\) and \(R_u^b\) are Borel.
Now, from the first part of the Disintegration Theorem 6.19 applied to \(({\mathcal {T}}_u^b , {\mathcal {B}}({\mathcal {T}}_u^b), {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}})\), we obtain an essentially unique disintegration of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}}\) consistent with the partition of \({\mathcal {T}}_{u}^{b}\) given by the equivalence classes \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\) of \(R_{u}^{b}\):
with corresponding quotient space \((Q, {\mathscr {Q}},{\mathfrak {q}})\) (\(Q \subset {\mathcal {T}}_u^b\) may be chosen to be any section of the above partition). The next step is to show that the disintegration is strongly consistent. By the Disintegration Theorem, this is equivalent to the existence of a \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}}\)-section \({{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) (which by a mild abuse of notation we will call \({\mathfrak {m}}\)-section), such that the quotient map associated to the partition is \({\mathfrak {m}}\)-measurable, where we endow \({{\bar{Q}}}\) with the trace \(\sigma \)-algebra. This has already been shown in [19, Proposition 4.4] in the framework of non-branching metric spaces; since its proof does not use any non-branching assumption, we can conclude that
where now \(Q \supset {{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) with \({{\bar{Q}}}\) an \({\mathfrak {m}}\)-section for the above partition (and hence \({\mathfrak {q}}\) is concentrated on \({{\bar{Q}}}\)). For a more constructive approach under the additional assumption of properness of the space, see also [25, Proposition 4.8].
A-priori the non-branched transport set \({\mathcal {T}}_u^b\) can be much smaller than \({\mathcal {T}}_u\). However, under fairly general assumptions one can prove that the sets \(A_{\pm }\) of forward and backward branching are both \({\mathfrak {m}}\)-negligible. In [23] this was shown for a m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) verifying \({\mathsf {RCD}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\). The proof only relies on the following two properties which hold for the latter spaces (see also [25]):
- \(\text {supp}({\mathfrak {m}}) = X\).
- Given \(\mu _{0}, \mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _{0}\ll {\mathfrak {m}}\), there exists a unique optimal transference plan for the \(W_{2}\)-distance and it is induced by an optimal-transport map.
By Theorem 6.15 these properties are also verified for an essentially non-branching m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) satisfying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}})= X\). We summarize the above discussion in:
Corollary 7.3
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. satisfying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\). Then for any 1-Lipschitz function \(u : X \rightarrow {\mathbb {R}}\), we have \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\). In particular, we obtain the following essentially unique disintegration \((Q,{\mathscr {Q}},{\mathfrak {q}})\) of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = {\mathfrak {m}}\llcorner _{{\mathcal {T}}^b_{u}}\) strongly consistent with the partition of \({\mathcal {T}}_{u}^{b}\) given by the equivalence classes \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\) of \(R_{u}^{b}\):
Here Q may be chosen to be a section of the above partition so that \(Q \supset {{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) with \({{\bar{Q}}}\) an \({\mathfrak {m}}\)-section with \({\mathfrak {m}}\)-measurable quotient map. In particular, \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\) and \({\mathfrak {q}}\) is concentrated on \({{\bar{Q}}}\).
Remark 7.4
By modifying the definitions of \(A_+,A_-\) to only reflect branching inside \(\text {supp}({\mathfrak {m}})\), it is possible to remove the assumption that \(\text {supp}({\mathfrak {m}}) = X\), but we refrain from this extraneous generality here.
Remark 7.5
If we consider \(u = \mathsf {d}(\cdot ,o)\), it is easy to check that the set \(A_{+}\) coincides with the cut locus \(C_{o}\), i.e. the set of those \(z \in X\) such that there exist at least two distinct geodesics starting at z and ending in o. Hence the previous corollary implies that for any \(o \in X\), the cut locus has \({\mathfrak {m}}\)-measure zero: \({\mathfrak {m}}(C_{o}) = 0\). This in particular implies that an essentially non-branching m.m.s. verifying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\) also supports a local (1, 1)-weak Poincaré inequality, see [69].
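The identification of \(A_{+}\) with the cut locus can be illustrated on a discrete toy model: the cycle graph on four vertices with its shortest-path metric, where the vertex antipodal to o is joined to o by two distinct shortest paths. The sketch below assumes the definitions \(A_{+} = \{ x \in {\mathcal {T}}_u : \exists z,w \in \Gamma _u(x), (z,w) \notin R_u \}\) and \(A_{-}\) analogously with \(\Gamma _u^{-1}\); the function name `branching_sets` is hypothetical, and a graph is only a discrete stand-in for a geodesic space.

```python
from itertools import product


def branching_sets(points, d, u):
    """Return (A_plus, A_minus, non-branched transport set) on a finite metric space."""
    gamma = {(x, y) for x, y in product(points, repeat=2)
             if u(x) - u(y) == d(x, y)}
    rel = gamma | {(y, x) for x, y in gamma}   # transport relation R_u
    transport = {x for x, y in rel if x != y}  # transport set T_u

    def branches(fibre):
        # True if two points of the fibre are not related by R_u.
        return any((z, w) not in rel for z, w in product(fibre, repeat=2))

    a_plus = {x for x in transport
              if branches([y for y in points if (x, y) in gamma])}
    a_minus = {x for x in transport
               if branches([y for y in points if (y, x) in gamma])}
    return a_plus, a_minus, transport - (a_plus | a_minus)


# Cycle graph on n = 4 vertices with the shortest-path metric, base point o = 0.
n = 4
points = range(n)
d = lambda i, j: min(abs(i - j), n - abs(i - j))
o = 0
u = lambda x: d(x, o)
a_plus, a_minus, non_branched = branching_sets(points, d, u)
```

In this model the forward branching set is exactly the antipodal vertex 2, the backward branching set is the base point 0, and the two remaining vertices form the non-branched transport set.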
7.2 Maximality of transport rays on non-branched transport-set
It is elementary to check that \(\Gamma _{u}\) induces a partial order relation on X:
Note that by definition:
Recall that for any \(x \in {\mathcal {T}}_{u}^{b}\), \((R_{u}(x),\mathsf {d})\) is isometric to a closed interval in \(({\mathbb {R}},\left| \cdot \right| )\). This isometry induces a total ordering on \(R_u(x)\) which must coincide with either \(\le _u\) or \(\ge _u\), implying that \((R_u(x), \le _u)\) is totally ordered.
Lemma 7.6
For any \(x \in {\mathcal {T}}_{u}^{b}\), \((R_u^b(x) = R_u(x) \cap {\mathcal {T}}_u^b,\mathsf {d})\) is isometric to an interval in \(({\mathbb {R}},|\cdot |)\).
Proof
Consider \(z,w \in R_{u}(x) \cap {\mathcal {T}}_{u}^{b}\); as \((R_u(x), \le _u)\) is totally ordered, assume without loss of generality that \(z \le _u w\). Given \(y \in R_u(x)\) with \(z \le _u y \le _u w\), we must prove that \(y \in {\mathcal {T}}_{u}^{b}\). Indeed, since \(w \ge _u y\) and \(w \notin A_{+}\), necessarily \(y \notin A_{+}\), and since \(z \le _u y\) and \(z \notin A_{-}\), necessarily \(y \notin A_{-}\). Hence \(y \in {\mathcal {T}}_{u}^{b}\) and the claim follows. \(\square \)
Recall that given a partially ordered set, a chain is a totally ordered subset. A chain is called maximal if it is maximal with respect to inclusion. We introduce the following:
Definition 7.7
(Transport Ray) A maximal chain R in \((X,\mathsf {d},\le _u)\) is called a transport ray if it is isometric to a closed interval I in \(({\mathbb {R}},\left| \cdot \right| )\) of positive (possibly infinite) length.
In other words, a transport ray R is the image of a closed non-null geodesic \(\gamma \) parametrized by arclength on I so that the function \(u \circ \gamma \) is affine with slope 1 on I, and so that R is maximal with respect to inclusion.
Lemma 7.8
Given \(x \in {\mathcal {T}}_u^b\), R is a transport ray passing through x if and only if \(R = R_u(x)\).
Proof
Recall that for any \(x \in {\mathcal {T}}_{u}^{b}\), \((R_u(x), \mathsf {d},\le _u)\) is order isometric to a closed interval in \(({\mathbb {R}},\left| \cdot \right| )\). As \(R_u(x)\) is by definition maximal in X with respect to inclusion, it follows that it must be a transport ray.
Conversely, note that for any transport ray R we always have \(R \subset \cap _{w \in R} R_u(w)\). Indeed, for any \(w,z \in R\), we have \(z \le _u w\) or \(z \ge _u w\), and hence by definition \((w,z) \in R_u\) so that \(z \in R_u(w)\). If \(x \in R \cap {\mathcal {T}}_u^b\), we already showed above that \(R_{u}(x)\) is a transport ray. Since \(R \subset R_u(x)\) and R is assumed to be maximal with respect to inclusion, it follows that necessarily \(R = R_u(x)\). \(\square \)
Corollary 7.9
If \(R_1\) and \(R_2\) are two transport rays which intersect in \({\mathcal {T}}_u^b\) then they must coincide.
In this subsection, we reconcile the crucial maximality property of \(R_u(\alpha )\), which we will require for the definition of \({\mathsf {CD}}^1\) in the next section, with the fact that the disintegration in (7.3) is with respect to (the possibly non-maximal) \(R_u^b(\alpha ) = R_u(\alpha ) \cap {\mathcal {T}}_u^b\). We will show that under \({\mathsf {MCP}}\), for \({\mathfrak {q}}\)-a.e. \(\alpha \), the only parts of \(R_{u}(\alpha )\) which are possibly not contained in \({\mathcal {T}}_{u}^{b}\) are its end points; this fact is the main new result of this section.
To rigorously state this new observation, we recall the classical definition of initial and final points, \({\mathfrak {a}}\) and \({\mathfrak {b}}\), respectively:
Note that
so \({\mathfrak {a}}\) is the difference of analytic sets and consequently belongs to \(\sigma ({\mathcal {A}})\); similarly for \({\mathfrak {b}}\). As in the previous subsection, whenever \((X,\mathsf {d})\) is proper, \({\mathfrak {a}}, {\mathfrak {b}}\) are in fact Borel sets.
Theorem 7.10
(Maximality of transport rays on non-branched transport-set) Let \((X,\mathsf {d}, {\mathfrak {m}})\) be an essentially non-branching m.m.s. verifying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\). Let \(u : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\) be any 1-Lipschitz function, with (7.3) the associated disintegration of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_u}\).
Then there exists \({{\hat{Q}}} \subset Q\) such that \({\mathfrak {q}}(Q \setminus {{\hat{Q}}}) = 0\) and for any \(\alpha \in {{\hat{Q}}}\) it holds:
In particular, for every \(\alpha \in {{\hat{Q}}}\):
(with the latter interpreted as the relative interior).
Proof
Step 1. Consider the \({\mathfrak {m}}\)-section \({{\bar{Q}}}\) from Corollary 7.3 so that \(Q \supset {{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_{u}^{b})\), \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\) and \({\mathfrak {q}}(Q \setminus {{\bar{Q}}}) = 0\). Consider the set
The claim will be proved once we show that \({\mathfrak {q}}(Q_{1})=0\). First, observe that
and therefore \(Q_{1} \subset {{\bar{Q}}}\) is analytic; since \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\), it follows that \(Q_1\) is \({\mathfrak {q}}\)-measurable. Now suppose by contradiction that \({\mathfrak {q}}(Q_{1})> 0\).
We can divide \(Q_{1}\) into two sets:
Since \(Q_{1} = Q_{1}^{+} \cup Q_{1}^{-}\), without any loss in generality let us assume \({\mathfrak {q}}(Q_{1}^{+}) > 0\), and for ease of notation assume further that \(Q_{1}^{+} = Q_{1}\).
Hence, for any \(\alpha \in Q_{1}\), there exists \(z \in \Gamma _{u}(\alpha )\) such that \(z \notin {\mathcal {T}}_{u}^{b}\) and \(z \notin {\mathfrak {b}}\); note that necessarily \(z \in A_{-}\). Recall that for all \(\alpha \in Q\), \(R_{u}(\alpha )\) and hence \(\Gamma _u(\alpha )\) are isometric via the map u to closed intervals, and hence \(\Gamma _u(\alpha ) \setminus (\{ \alpha \} \cup {\mathfrak {b}})\) is isometric to an open interval. Since \(\Gamma _u(\alpha ) \cap {\mathcal {T}}_{u}^{b}\) is isometric to an interval and contains \(\alpha \), it follows that for \(\alpha \in Q_1\), there exist distinct \(a_\alpha ,b_\alpha \in \Gamma _{u}(\alpha ) \setminus {\mathcal {T}}_{u}^{b}\) so that
is a non-empty open interval. Moreover, we may select \(a_{\alpha }\) and \(b_\alpha \) to be \({\mathfrak {q}}\)-measurable functions of \(Q_1\). To see this, consider the set \(\Sigma : = \{ (\alpha , x ,y) \in Q_{1} \times \Gamma _{u} :x \in A_{-} , \ (\alpha ,x) \in \Gamma _{u},\ \mathsf {d}(x,y) > 0\}\), and observe that it is analytic (being the intersection of analytic sets), and that \(P_1(\Sigma ) = Q_1\). By von Neumann’s selection Theorem (see [72, Theorem 5.5.2]), there exists a \(\sigma ({\mathcal {A}})\)-measurable selection of \(\Sigma \):
and so in particular these functions are \({\mathfrak {q}}\)-measurable. It follows that
are also \(\sigma ({\mathcal {A}})\)-measurable and hence \({\mathfrak {q}}\)-measurable. Possibly restricting \(Q_{1}\), by Lusin’s Theorem we can also assume that the above functions are continuous.
Step 2. By Fubini’s Theorem
Hence there exists \(c \in {\mathbb {R}}\) and \(Q_{1,c} \subset Q_{1}\) with \({\mathfrak {q}}(Q_{1,c}) > 0\), such that for any \(\alpha \in Q_{1,c}\) it holds \(c \in (u(b_{\alpha }), u(a_{\alpha }))\); in particular for any \(\alpha \in Q_{1,c}\) there exists a unique \(z_{\alpha } \in \Gamma _{u}(\alpha )\) such that \(u(z_{\alpha }) =c\). Furthermore, we can assume that \(Q_{1,c}\) is compact, and hence by continuity of \(u(a_\alpha )\) it follows that
Then define the following set:
Recall that \(R_u^b\) is Borel since \((X,\mathsf {d})\) is proper, and therefore \(\Lambda \) is Borel. Note by the aforementioned discussion that \(P_1(\Lambda ) = Q_{1,c}\). Also note that for \((\alpha ,x,z) \in \Lambda \), since \(R_{u}(\alpha )\) is isometric to a closed interval, necessarily \(z = z_{\alpha }\). Finally, we claim that \(P_{2,3} (\Lambda )\) is \(\mathsf {d}^{2}\)-cyclically monotone: for \((x_{1},z_{1}), (x_{2},z_{2}) \in P_{2,3} (\Lambda )\) observe that
Hence for \(\{(x_{i}, z_{i})\}_{i \le n } \subset P_{2,3} (\Lambda )\), setting \(z_{n+1} = z_{1}\),
and the monotonicity follows. We can then define a function T by imposing \({\text {graph}}(T) = P_{2,3}(\Lambda )\); note that \(P_{2,3}(\Lambda )\) is analytic and therefore T is Borel measurable (see [72, Theorem 4.5.2]).
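The cyclic-monotonicity estimate can be sketched as follows, assuming only that \(u\) is 1-Lipschitz and that every \((x,z) \in P_{2,3}(\Lambda )\) satisfies \((x,z) \in \Gamma _u\) and \(u(z) = c\), so that \(\mathsf {d}(x,z) = u(x) - c\) (this is our reading of the definition of \(\Lambda \)). For two pairs,
$$\begin{aligned} \mathsf {d}(x_1,z_1)^2 + \mathsf {d}(x_2,z_2)^2&= (u(x_1) - c)^2 + (u(x_2) - c)^2 \\&= (u(x_1) - u(z_2))^2 + (u(x_2) - u(z_1))^2 \le \mathsf {d}(x_1,z_2)^2 + \mathsf {d}(x_2,z_1)^2 , \end{aligned}$$
and in the same way, for n pairs with \(z_{n+1} = z_1\),
$$\begin{aligned} \sum _{i=1}^{n} \mathsf {d}(x_i,z_i)^2 = \sum _{i=1}^{n} (u(x_i) - u(z_{i+1}))^2 \le \sum _{i=1}^{n} \mathsf {d}(x_i,z_{i+1})^2 . \end{aligned}$$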
Step 3. Consider now the measure
and since \({\mathfrak {q}}(Q_{1,c})> 0\) it follows that \(\eta _{0}(X) > 0\); note that \(\eta _{0}\) is concentrated on \({\text {Dom}}\,(T) = \cup _{\alpha \in Q_{1,c}} R_{u}^b(\alpha )\). Hence there exists \(x \in X\) and \(r > 0\) such that \(\eta _{0}(B_{r}(x)) > 0\), and we redefine \(\eta _0\) to be the probability measure obtained by conditioning \(\eta _0\) to \(B_r(x)\). Clearly \(\eta _{0} \ll {\mathfrak {m}}\). Finally we define \(\eta _{1} : = T_{\sharp } \,\eta _{0}\). By Step 2 and Theorem 6.15, the map T is the unique Optimal-Transport map between \(\eta _{0}\) and \(\eta _{1}\) for the \(W_{2}\)-distance (as it is supported on a \(\mathsf {d}^2\)-cyclically monotone set). Consider moreover \(\nu \) the unique element of \(\mathrm {OptGeo}(\eta _{0},\eta _{1})\); then for \(\nu \)-a.e. \(\gamma \) it holds that
It follows in particular by Lemma 7.1 that \(\gamma _{s} \in \Gamma _{u}(\gamma _{0})\) for all \(s \in [0,1]\).
Recalling that \(u(a_{\alpha }) - c > \varepsilon \) for all \(\alpha \in Q_{1,c}\), that \(a_{\alpha } \le M\) by continuity on \(Q_{1,c}\), and that the support of \(\eta _0\) is bounded, it follows that there exists \({{\bar{t}}} \in (0,1)\) such that \(\nu \)-a.e. \(\gamma _{{{\bar{t}}}} \in {\mathcal {T}}_{u} \setminus {\mathcal {T}}_{u}^{b} \subset A_{+} \cup A_{-}\). Since \({\mathfrak {m}}(A_{+} \cup A_{-}) = 0\), necessarily \((\mathrm{e}_{{{\bar{t}}}})_{\sharp } \nu \perp {\mathfrak {m}}\), but this is in contradiction with the assertion of Theorem 6.15 that \((\mathrm{e}_{{{\bar{t}}}})_{\sharp } \nu \ll {\mathfrak {m}}\) since \(\eta _{0} \ll {\mathfrak {m}}\) and \({{\bar{t}}} < 1\). The claim follows. \(\square \)
8 The \({\mathsf {CD}}^{1}\) condition
In this section we introduce the \({\mathsf {CD}}^1(K,N)\) condition, which plays a cardinal role in this work. As a first step towards understanding this new condition, we show that it always implies \({\mathsf {MCP}}_{\varepsilon }(K,N)\) (and \({\mathsf {MCP}}(K,N)\)), without requiring any type of non-branching assumption. By analogy, we also introduce the \({\mathsf {MCP}}^1(K,N)\) condition, which may be of independent interest.
8.1 Definitions of \({\mathsf {CD}}^{1}\) and \({\mathsf {MCP}}^1\)
We first assume that \(\text {supp}({\mathfrak {m}}) = X\). Note that we do not assume that the transport rays \(\{X_{\alpha }\}_{\alpha \in Q}\) below are disjoint or have disjoint relative interiors, in an attempt to obtain a useful definition also for m.m.s.’s which may have significant branching. However, throughout most of this work, we will typically assume in addition that the space is essentially non-branching, in which case an equivalent definition will be presented in Proposition 8.13 below.
Definition 8.1
(\({\mathsf {CD}}^1_{u}(K,N)\) when \(\text {supp}({\mathfrak {m}}) = X\)) Let \((X,\mathsf {d},{\mathfrak {m}})\) denote a m.m.s. with \(\text {supp}({\mathfrak {m}}) = X\), let \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\), and let \(u : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\) denote a 1-Lipschitz function. \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {CD}}^{1}_{u}(K,N)\) condition if there exists a family \(\{X_{\alpha }\}_{\alpha \in Q} \subset X\), such that:
(1) There exists a disintegration of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}}\) on \(\{X_{\alpha }\}_{\alpha \in Q}\):
$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = \int _{Q} {\mathfrak {m}}_{\alpha } \, {\mathfrak {q}}(d\alpha ), \quad \text {with } \quad {\mathfrak {m}}_{\alpha }(X_{\alpha }) = 1, \text { for } {\mathfrak {q}}\text {-a.e. }\alpha \in Q . \end{aligned} \tag{8.1}$$
(2) For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \(X_\alpha \) is a transport ray for \(\Gamma _u\) (recall Definition 7.7).
(3) For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \({\mathfrak {m}}_\alpha \) is supported on \(X_\alpha \).
(4) For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), the m.m.s. \((X_{\alpha }, \mathsf {d},{\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {CD}}(K,N)\).
We take this opportunity to define an analogous variant of \({\mathsf {MCP}}\):
Definition 8.2
(\({\mathsf {MCP}}^1_{u}(K,N)\) when \(\text {supp}({\mathfrak {m}}) = X\)) Let \((X,\mathsf {d},{\mathfrak {m}})\) denote a m.m.s. with \(\text {supp}({\mathfrak {m}}) = X\), let \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\), let \(o \in X\) and denote the 1-Lipschitz function \(u := \mathsf {d}(\cdot ,o)\). \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {MCP}}^{1}_{u}(K,N)\) condition if there exists a family \(\{X_{\alpha }\}_{\alpha \in Q} \subset X\), such that conditions (1)–(3) above hold, together with:
(4’) For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), the m.m.s. \((X_{\alpha }, \mathsf {d},{\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\) with respect to \(o \in X_\alpha \).
Remark 8.3
Note that when \(u= \mathsf {d}(\cdot ,o)\) then necessarily \({\mathcal {T}}_{u} = X\) (if X is not a singleton). In addition \((x,o) \in \Gamma _{u}\) for any \(x \in X\), and hence by maximality of a transport ray, we must have \(o \in X_{\alpha }\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), and by condition (3) we deduce that \(o \in \text {supp}({\mathfrak {m}}_\alpha )\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\). As \({\mathsf {CD}}(K,N)\) implies \({\mathsf {MCP}}(K,N)\) (in the one-dimensional case this is a triviality), we obviously see that \({\mathsf {CD}}^1_u(K,N)\) implies \({\mathsf {MCP}}^1_u(K,N)\) for all \(u = \mathsf {d}(\cdot ,o)\).
We will focus on a particular class of 1-Lipschitz functions.
Definition
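(Signed Distance Function) Given a continuous function \(f : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\) so that \(\left\{ f = 0\right\} \ne \emptyset \), the function
$$\begin{aligned} d_f : X \rightarrow {\mathbb {R}}, \quad d_f(x) := \text {dist}(x, \{ f = 0 \}) \, \text {sgn}(f(x)) , \end{aligned}$$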
is called the signed distance function (from the zero-level set of f).
Lemma 8.4
\(d_f\) is 1-Lipschitz on \(\left\{ f \ge 0\right\} \) and \(\left\{ f \le 0\right\} \). If \((X,\mathsf {d})\) is a length space, then \(d_f\) is 1-Lipschitz on the entire X.
Proof
Given \(x,y \in X\) with \(f(x) f(y) \ge 0\), the assertion follows by the usual triangle inequality, valid for any metric space:
When \(f(x) f(y) < 0\), and given \(\varepsilon > 0\), let \(\gamma : [0,1] \rightarrow X\) denote a continuous path with \(\gamma _0 = x\), \(\gamma _1=y\) and \(\ell (\gamma ) \le \mathsf {d}(x,y) + \varepsilon \). By continuity, it follows that there exists \(t \in (0,1)\) so that \(f(\gamma _t) = 0\). It follows that
As \(\varepsilon > 0\) was arbitrary, the assertion is proved. \(\square \)
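The role of the length-space assumption in Lemma 8.4 can be checked numerically on finite subsets of the real line. The sketch below assumes the formula \(d_f(x) = \text {dist}(x,\{ f = 0\}) \, \text {sgn}(f(x))\); the helper names and both examples are hypothetical. On three isolated points, f changes sign without passing through a zero, and \(d_f\) fails to be 1-Lipschitz globally while remaining 1-Lipschitz on \(\{ f \ge 0\}\); on a grid where every sign change of f passes through a zero, the constant is 1.

```python
def signed_distance(points, d, f):
    """d_f(x) = dist(x, {f = 0}) * sgn(f(x)) on a finite metric space."""
    zero_set = [z for z in points if f(z) == 0]
    if not zero_set:
        raise ValueError("the zero-level set of f must be non-empty")
    sign = lambda t: (t > 0) - (t < 0)
    return {x: sign(f(x)) * min(d(x, z) for z in zero_set) for x in points}


def lipschitz_constant(points, d, g):
    """Smallest L with |g(x) - g(y)| <= L * d(x, y) for all distinct x, y."""
    return max(abs(g[x] - g[y]) / d(x, y)
               for x in points for y in points if x != y)


d = lambda x, y: abs(x - y)

# Not a length space: three isolated points, with f > 0 at 3 and f < 0 at 4.
gaps = [0, 3, 4]
f_gaps = {0: 0, 3: 1, 4: -1}
df_gaps = signed_distance(gaps, d, f_gaps.get)

# A grid in [-2, 2] with f(x) = x: the only sign change passes through 0.
grid = [k / 4 for k in range(-8, 9)]
df_grid = signed_distance(grid, d, lambda x: x)
```

In the first example the points 3 and 4 are at distance 1 but \(d_f\) takes the values 3 and -4 there, which is precisely the situation excluded by the intermediate-value step in the proof.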
Remark 8.5
To extend Remark 8.3 to more general signed distance functions, we will need to require that \((X,\mathsf {d})\) is proper, and in that case \({\mathcal {T}}_{d_{f}} \supset X \setminus \{f =0\}\). Indeed, given \(x \in X \setminus \{f =0 \}\), consider the distance minimizing \(z \in \{ f= 0 \}\) (by compactness of bounded sets). Then \((x,z) \in R_{d_{f}}\) and as \(x \ne z\) it follows that \(x \in {\mathcal {T}}_{d_{f}}\).
We now remove the restriction that \(\text {supp}({\mathfrak {m}}) = X\) and introduce the main new definitions of this work:
Definition 8.6
(\({\mathsf {CD}}^1_{Lip}(K,N)\), \({\mathsf {CD}}^1(K,N)\) and \({\mathsf {MCP}}^1(K,N)\)) Let \((X,\mathsf {d},{\mathfrak {m}})\) denote a m.m.s. and let \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\).
- \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {CD}}^{1}_{Lip}(K,N)\) condition if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{u}(K,N)\) for all 1-Lipschitz functions \(u : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\).
- \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {CD}}^{1}(K,N)\) condition if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{d_f}(K,N)\) for all continuous functions \(f : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\) so that \(\left\{ f=0\right\} \ne \emptyset \) and \(d_f : (\text {supp}({\mathfrak {m}}) , \mathsf {d}) \rightarrow {\mathbb {R}}\) is 1-Lipschitz.
- \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify \({\mathsf {MCP}}^{1}(K,N)\) if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}_u^1(K,N)\) for all functions \(u(x) = \mathsf {d}(x,o)\) with \(o \in \text {supp}({\mathfrak {m}})\).
Remark 8.7
Clearly \({\mathsf {CD}}^1_{Lip}(K,N) \Rightarrow {\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {MCP}}^1(K,N)\) in view of Remark 8.3. Note that we do not a-priori know that \(d_f\) is 1-Lipschitz, since we do not know that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length-space (see Lemma 8.4); nevertheless, we will shortly see that the \({\mathsf {CD}}^1(K,N)\) condition implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) must be a geodesic space, and hence the sentence “so that \(d_f\) is 1-Lipschitz” is in fact redundant.
Remark 8.8
By definition, the \({\mathsf {CD}}^1_{Lip}\), \({\mathsf {CD}}^1\) and \({\mathsf {MCP}}^1\) conditions hold for \((X,\mathsf {d},{\mathfrak {m}})\) iff they hold for \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\). It is also possible to introduce a definition of \({\mathsf {CD}}^1_{u}\) and \({\mathsf {MCP}}^1_u\) which applies to \((X,\mathsf {d},{\mathfrak {m}})\) directly, without passing through \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\)—this would involve requiring that the transport rays \(\left\{ X_\alpha \right\} \) are maximal inside \(\text {supp}({\mathfrak {m}})\), and in the case of \({\mathsf {CD}}^1_u\) would only apply to functions u which are 1-Lipschitz on \(\text {supp}({\mathfrak {m}})\) (these may be extended to the entire X by McShane’s theorem). Our choice to use a tautological approach is motivated by the analogous situation for the more classical \(W_2\) definitions of curvature-dimension (see Remark 6.11) and is purely for convenience, so as not to overload the definitions.
8.2 \({\mathsf {MCP}}^{1}\) implies \({\mathsf {MCP}}_\varepsilon \)
Proposition 8.9
Let \((X,\mathsf {d},{\mathfrak {m}})\) be a m.m.s. verifying \({\mathsf {MCP}}^{1}(K,N)\) with \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\) (in particular, this holds if it verifies \({\mathsf {CD}}^1_{Lip}(K,N)\) or \({\mathsf {CD}}^1(K,N)\)). Then it verifies \({\mathsf {MCP}}_{\varepsilon }(K,N)\).
Proof
We will show that \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {MCP}}_{\varepsilon }(K,N)\), and consequently so will \((X,\mathsf {d},{\mathfrak {m}})\). By Remark 8.8, we may therefore assume that \(\text {supp}({\mathfrak {m}}) = X\). Fix any \(o \in X\) and consider the 1-Lipschitz function \(u (x) : = \mathsf {d}(x,o)\). From \({\mathsf {MCP}}^{1}(K,N)\) and Remark 8.3 we deduce the existence of a disintegration of \({\mathfrak {m}}\) on \({\mathcal {T}}_u = X\) along a family of Borel sets \(\{ X_{\alpha }\}_{\alpha \in Q}\):
so that \(X_\alpha \) is a transport ray for \(\Gamma _u\), \({\mathfrak {m}}_\alpha \) is supported on \(X_\alpha \) and \((X_{\alpha }, \mathsf {d}, {\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\) with respect to \(o \in X_\alpha \), for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).
Now consider any \(\mu _0 \in {\mathcal {P}}(X)\) with \(\mu _0 \ll {\mathfrak {m}}\), so that \(\rho _0 := \frac{d\mu _0}{d{\mathfrak {m}}}\) has bounded support. By measurability of the disintegration, the function \(Q \ni \alpha \mapsto z_\alpha := \int \rho _0(x) {\mathfrak {m}}_{\alpha }(dx)\) is \({\mathfrak {q}}\)-measurable, and hence \({\bar{Q}} := \left\{ \alpha \in Q \; ; \; z_\alpha \in (0,\infty )\right\} \) is \({\mathfrak {q}}\)-measurable. Clearly \(\int _{{\bar{Q}}} z_\alpha {\mathfrak {q}}(d\alpha ) = \int _{Q} z_\alpha {\mathfrak {q}}(d\alpha ) = 1\) since \(z_\alpha < \infty \) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).
Define \(\mu _0^\alpha := \frac{1}{z_\alpha } \rho _0 {\mathfrak {m}}_\alpha \in {\mathcal {P}}(X_\alpha )\) for all \(\alpha \in {\bar{Q}}\). Since for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), the one-dimensional (non-branching) \((X_{\alpha },\mathsf {d})\) contains o, there exists a unique element \(\nu ^\alpha \) of \(\mathrm {OptGeo}(\mu _{0}^{\alpha }, \delta _{o}) \cap {\mathcal {P}}(\mathrm{Geo}(X_{\alpha }))\) where \(\mathrm{Geo}(X_{\alpha })\) denotes the space of geodesics in \(X_{\alpha }\). Define then
and observe that \((\mathrm{e}_{0})_{\sharp } \nu = \rho _0 {\mathfrak {m}}= \mu _0\) and \((\mathrm{e}_{1})_{\sharp } \nu = \delta _{o}\). To conclude that \(\nu \in \mathrm {OptGeo}(\mu _0, \delta _{o})\) we must show that \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu = : \mu _{t}\) is a \(W_{2}\)-geodesic. Indeed, for any \(0 \le s < t \le 1\), consider the transference plan \((\mathrm{e}_{s},\mathrm{e}_{t})_{\sharp } \nu \) between \(\mu _s\) and \(\mu _t\), yielding:
By the triangle inequality, it follows that \(t \mapsto \mu _{t}\) must indeed be a geodesic in \(({\mathcal {P}}_2(X),W_2)\). Note that this property is particular to transportation to a delta measure.
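The two steps above can be sketched as follows. The sketch assumes, as our reading of (8.3), that \(\nu \) is the mixture \(\int _{{\bar{Q}}} z_\alpha \nu ^\alpha {\mathfrak {q}}(d\alpha )\) of the ray-wise plans, and it uses only that \(\nu \)-a.e. curve is a constant-speed geodesic ending at \(o\), together with \(\rho _0 {\mathfrak {m}}_\alpha = 0\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q \setminus {\bar{Q}}\):
$$\begin{aligned} (\mathrm{e}_0)_{\sharp } \nu&= \int _{{\bar{Q}}} z_\alpha \mu _0^\alpha \, {\mathfrak {q}}(d\alpha ) = \int _{Q} \rho _0 {\mathfrak {m}}_\alpha \, {\mathfrak {q}}(d\alpha ) = \rho _0 {\mathfrak {m}} , \\ W_2^2(\mu _s,\mu _t)&\le \int \mathsf {d}^2(\gamma _s,\gamma _t) \, \nu (d\gamma ) = (t-s)^2 \int \mathsf {d}^2(\gamma _0,o) \, \nu (d\gamma ) = (t-s)^2 W_2^2(\mu _0,\delta _o) . \end{aligned}$$
The last equality holds because the only transference plan between \(\mu _0\) and \(\delta _o\) is the product \(\mu _0 \otimes \delta _o\). Combining with the triangle inequality,
$$\begin{aligned} W_2(\mu _0,\delta _o) \le W_2(\mu _0,\mu _s) + W_2(\mu _s,\mu _t) + W_2(\mu _t,\delta _o) \le \left( s + (t-s) + (1-t)\right) W_2(\mu _0,\delta _o) , \end{aligned}$$
so all inequalities are equalities and \(W_2(\mu _s,\mu _t) = (t-s) W_2(\mu _0,\delta _o)\).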
It remains to establish the \({\mathsf {MCP}}_{\varepsilon }\) inequality of Definition 6.8. Fix \(t \in (0,1)\), and recall that for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), the (one-dimensional, non-branching) \((X_{\alpha }, \mathsf {d}, {\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\) (and hence \({\mathsf {MCP}}_{\varepsilon }(K,N)\)), and as \(\mu _0^\alpha \ll {\mathfrak {m}}_\alpha \) and \(o \in \text {supp}({\mathfrak {m}}_\alpha )\), in particular \(\mu _t^\alpha := (\mathrm{e}_t)_{\sharp }(\nu ^\alpha ) \ll {\mathfrak {m}}_\alpha \). Applying \(\mathrm{e}_t\) to both sides of (8.3), it follows that \(\mu _t = (\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\). Writing \(\mu _t = \rho _t {\mathfrak {m}}\) and \(\mu _t^\alpha = \rho _t^\alpha {\mathfrak {m}}_\alpha \) for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), the \({\mathsf {MCP}}_{\varepsilon }\) condition implies that:
In addition, the application of \(\mathrm{e}_t\) to both sides of (8.3) yields the following disintegration:
Now consider the set \(Y = \left\{ \rho _t > 0\right\} \), and note that by (8.5):
Integrating (8.5) against \(\rho _t^{-\frac{1}{N}}\) on \(Y = \left\{ \rho _t > 0\right\} \), applying Hölder’s inequality on the interior integral for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), using (8.6), employing the one-dimensional \({\mathsf {MCP}}_{\varepsilon }\) inequality (8.4) and canceling \(z_\alpha \), and finally applying Hölder’s inequality again on the exterior integral, we obtain
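where the last inequality above follows since \(\rho _0 {\mathfrak {m}}_\alpha = 0\) for \(\alpha \in Q \setminus {\bar{Q}}\) and since the exponent on the second term is negative. Note that we applied Hölder’s inequality above in reverse form
$$\begin{aligned} \int \left| f\right| ^{\alpha } \left| g\right| ^{\beta } d\omega \ge \left( \int \left| f\right| d\omega \right) ^{\alpha } \left( \int \left| g\right| d\omega \right) ^{\beta } , \end{aligned}$$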
which is valid as soon as \(\alpha + \beta = 1\), \(\beta < 0\), regardless of whether or not \(\left| g\right| > 0\) \(\omega \)-a.e..
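For the reader's convenience, here is a hedged sketch of why the reverse form holds and how the exponents enter. The symbols \(f\), \(g\), \(E\), \(I\) are generic placeholders introduced here, and the sketch assumes for simplicity that \(\left| g\right| > 0\) \(\omega \)-a.e. and that all integrals are finite and positive. Since \(\alpha = 1 - \beta > 1\), the usual Hölder inequality with conjugate exponents \(\alpha \) and \(\frac{\alpha }{\alpha -1}\) gives
$$\begin{aligned} \int \left| f\right| d\omega = \int \left( \left| f\right| ^{\alpha } \left| g\right| ^{\beta }\right) ^{\frac{1}{\alpha }} \left| g\right| ^{-\frac{\beta }{\alpha }} d\omega \le \left( \int \left| f\right| ^{\alpha } \left| g\right| ^{\beta } d\omega \right) ^{\frac{1}{\alpha }} \left( \int \left| g\right| d\omega \right) ^{\frac{\alpha -1}{\alpha }} , \end{aligned}$$
where we used \(-\frac{\beta }{\alpha } \cdot \frac{\alpha }{\alpha -1} = 1\). Raising to the power \(\alpha \) and recalling \(\alpha - 1 = -\beta \) yields
$$\begin{aligned} \int \left| f\right| ^{\alpha } \left| g\right| ^{\beta } d\omega \ge \left( \int \left| f\right| d\omega \right) ^{\alpha } \left( \int \left| g\right| d\omega \right) ^{\beta } . \end{aligned}$$
In the argument above the natural choice appears to be \(\alpha = \frac{N}{N-1}\) and \(\beta = -\frac{1}{N-1}\). With this choice the normalization cancels, since \(z_\alpha \cdot z_\alpha ^{-(1-\frac{1}{N}) \frac{N}{N-1}} = 1\), and an inequality of the form \(E \ge I^{\frac{N}{N-1}} E^{-\frac{1}{N-1}}\) with \(E, I > 0\) rearranges to \(E^{\frac{N}{N-1}} \ge I^{\frac{N}{N-1}}\), that is \(E \ge I\).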
Rearranging terms above and raising to the power of \(\frac{N-1}{N}\), the desired inequality follows:
\(\square \)
Remark 8.10
Note that the above proof shows that, not only does it hold that \(\text {supp}(\mu _t) \subset \text {supp}({\mathfrak {m}})\) for all \(t\in [0,1)\), as required in the definition of \({\mathsf {MCP}}_{\varepsilon }(K,N)\), but in fact \(\mu _t \ll {\mathfrak {m}}\).
Remark 8.11
Recalling that \({\mathsf {MCP}}_{\varepsilon }(K,N)\) always implies \({\mathsf {MCP}}(K,N)\), we deduce that \({\mathsf {MCP}}^1(K,N)\) implies \({\mathsf {MCP}}(K,N)\). In fact, a direct proof of the latter implication is elementary. Indeed, let \(A \subset X\) be any Borel set with \(0< {\mathfrak {m}}(A) < \infty \), and denote \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\). Recall that for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), \(o \in X_\alpha \), \(\text {supp}({\mathfrak {m}}_\alpha ) = X_\alpha \) and \((X_{\alpha },\mathsf {d}, {\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\). Defining \(\nu \) as in (8.3) and continuing with the notation used there, it follows by uniqueness of \(\nu ^\alpha \) and the \({\mathsf {MCP}}\) condition with respect to the point \(o \in X_\alpha \), that for any Borel set \(B \subset X\):
for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\). Integrating over \({\bar{Q}}\) we obtain
and the claim follows.
As a consequence, we immediately obtain from Lemmas 6.12 and 8.4:
Corollary 8.12
Let \((X,\mathsf {d},{\mathfrak {m}})\) be a m.m.s. verifying \({\mathsf {CD}}^{1}(K,N)\) with \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\). Then \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a Polish, proper and geodesic space. In particular, for any continuous function \(f : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\) with \(\{ f = 0 \} \ne \emptyset \), the function \(d_{f} : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\) is 1-Lipschitz.
8.3 On essentially non-branching spaces
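The last assertion can be checked directly. The following sketch assumes that \(d_f\) denotes the signed distance function \(d_f(x) := \text {dist}(x,\{ f = 0 \}) \, \text {sgn}(f(x))\), and writes \(Z := \{ f = 0 \}\) (a symbol introduced here). If \(f(x) f(y) \ge 0\), then
$$\begin{aligned} \left| d_f(x) - d_f(y)\right| = \left| \text {dist}(x,Z) - \text {dist}(y,Z)\right| \le \mathsf {d}(x,y) . \end{aligned}$$
If \(f(x) f(y) < 0\), let \(\gamma \) be a geodesic in \(\text {supp}({\mathfrak {m}})\) from \(x\) to \(y\). By continuity of \(f \circ \gamma \) there is a point \(z = \gamma _r \in Z\), and hence
$$\begin{aligned} \left| d_f(x) - d_f(y)\right| = \text {dist}(x,Z) + \text {dist}(y,Z) \le \mathsf {d}(x,z) + \mathsf {d}(z,y) = \mathsf {d}(x,y) . \end{aligned}$$
This is the only place where the geodesic property is used.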
Having at our disposal \({\mathsf {MCP}}(K,N)\), we can now invoke the results of Sect. 7 concerning the theory of \(L^1\)-Optimal-Transport, and obtain the following important equivalent definitions of \({\mathsf {CD}}^1_{Lip}(K,N)\), \({\mathsf {CD}}^1(K,N)\) and \({\mathsf {MCP}}^1(K,N)\) assuming that \((X,\mathsf {d},{\mathfrak {m}})\) is essentially non-branching.
Proposition 8.13
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. with \(\text {supp}({\mathfrak {m}}) = X\). Given \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\), the following statements are equivalent:
-
(1)
\((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{Lip}(K,N)\).
-
(2)
For any 1-Lipschitz function \(u : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\), let \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\) denote the partition of \({\mathcal {T}}_{u}^{b}\) given by the equivalence classes of \(R_{u}^{b}\). Denote by \(X_\alpha \) the closure \(\overline{R_u^b(\alpha )}\). Then all the conditions (1)–(4) of Definition 8.1 hold for the family \(\left\{ X_\alpha \right\} _{\alpha \in Q}\). In particular, \(X_\alpha = R_{u}(\alpha )\) is a transport-ray for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).
Moreover, the sets \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) have disjoint interiors \(\{\mathring{R}_u^b(\alpha )\}_{\alpha \in Q}\) contained in \({\mathcal {T}}_u^b\), and the disintegration \((Q,{\mathscr {Q}},{\mathfrak {q}})\) of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_u}\) on \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) given by (8.1) is essentially unique.
Furthermore, Q may be chosen to be a section of the above partition so that \(Q \supset {{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) with \({{\bar{Q}}}\) an \({\mathfrak {m}}\)-section with \({\mathfrak {m}}\)-measurable quotient map, so that in particular \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\) and \({\mathfrak {q}}\) is concentrated on \({{\bar{Q}}}\).
An identical statement holds for \({\mathsf {CD}}^1(K,N)\) when only considering signed distance functions \(u = d_f\).
An identical statement also holds for \({\mathsf {MCP}}^1(K,N)\) when only considering the functions \(u = \mathsf {d}(\cdot ,o)\), after replacing above condition (4) of Definition 8.1 with condition (4’) of Definition 8.2.
Proof
The only direction requiring proof is \((1) \Rightarrow (2)\). Given a 1-Lipschitz function u as above, we may assume that \({\mathfrak {m}}({\mathcal {T}}_u) > 0\), otherwise there is nothing to prove. The \({\mathsf {CD}}^1_u(K,N)\) condition ensures there exists a family \(\left\{ Y_\beta \right\} _{\beta \in P}\) of sets and a disintegration:
so that for \({\mathfrak {p}}\)-a.e. \(\beta \in P\), \(Y_\beta \) is a transport ray for \(\Gamma _u\), \((Y_\beta ,\mathsf {d},{\mathfrak {m}}^P_\beta )\) satisfies \({\mathsf {CD}}(K,N)\) and \(\text {supp}({\mathfrak {m}}^P_\beta ) = Y_\beta \). By removing a \({\mathfrak {p}}\)-null-set from P, let us assume without loss of generality that the above properties hold for all \(\beta \in P\).
As \({\mathsf {CD}}^1_{Lip}(K,N) \Rightarrow {\mathsf {CD}}^1 (K,N) \Rightarrow {\mathsf {MCP}}^1(K,N) \Rightarrow {\mathsf {MCP}}(K,N)\), and as our space is essentially non-branching with full-support, Corollary 7.3 implies that \({\mathfrak {m}}(A_+ \cup A_-) = 0\) and that there exists an essentially unique disintegration \((Q,{\mathscr {Q}},{\mathfrak {q}})\) of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = {\mathfrak {m}}\llcorner _{{\mathcal {T}}^b_{u}}\) strongly consistent with the partition of \({\mathcal {T}}_{u}^{b}\) given by \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\):
By Corollary 7.3, Q may be chosen to be a section of the above partition satisfying the statement appearing in the formulation of Proposition 8.13. Again, let us assume without loss of generality that \({\mathfrak {m}}_{\alpha }(R_u^b(\alpha )) = 1\) for all \(\alpha \in Q\).
By Theorem 7.10, there exists \(Q_1 \subset Q\) of full \({\mathfrak {q}}\)-measure so that \(R_u(\alpha ) = \overline{R_u^b(\alpha )} \supset R_u^b(\alpha ) \supset \mathring{R}_u(\alpha )\) for all \(\alpha \in Q_1\). In addition, since \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\), there exists \(P_1 \subset P\) of full \({\mathfrak {p}}\)-measure so that \({\mathfrak {m}}^P_\beta ({\mathcal {T}}_u^b)=1\) for all \(\beta \in P_1\). By Lemmas 7.6 and 7.8, \((Y_\beta \cap {\mathcal {T}}_u^b,\mathsf {d})\) is isometric to an interval in \(({\mathbb {R}},\left| \cdot \right| )\), and therefore \((\overline{Y_\beta \cap {\mathcal {T}}_u^b}, \mathsf {d}, ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b})\) still satisfies \({\mathsf {CD}}(K,N)\), is of total measure 1 and satisfies \(\text {supp}(({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}) = \overline{Y_\beta \cap {\mathcal {T}}_u^b}\), for all \(\beta \in P_1\).
Now by Lemma 7.8, since \(Y_\beta \cap {\mathcal {T}}_u^b \ne \emptyset \) for all \(\beta \in P_1\), \(Y_\beta = R_u(x)\) for all \(x \in Y_\beta \cap {\mathcal {T}}_u^b\). In particular, for all \(\beta \in P_1\), there exists a unique (since \(R_u^b \) is an equivalence relation on \({\mathcal {T}}_u^b\) and by uniqueness of the section map) \(\alpha = \alpha (\beta ) \in Q\) so that \(Y_\beta = R_u(\alpha )\). Denoting by \({\tilde{Q}} \subset Q\) the set of indices \(\alpha \) obtained in this way, it is clear that \({\tilde{Q}}\) is of full \({\mathfrak {q}}\)-measure, since:
Consequently, \(Q_2 := {\tilde{Q}} \cap Q_1\) is of full \({\mathfrak {q}}\)-measure as well. Denoting \(P_2 := \alpha ^{-1}(Q_2)\) and repeating the above argument, it follows that \(P_2 \subset P_1\) is of full \({\mathfrak {p}}\)-measure and satisfies that for all \(\beta \in P_2\), \(Y_\beta = R_u(\alpha )\) for \(\alpha = \alpha (\beta )\in Q_2\).
We conclude that there is a one-to-one correspondence:
$$\begin{aligned} \eta : P_2 \ni \beta \mapsto \alpha (\beta ) \in Q_2 , \end{aligned}$$
so both of these representations yield an identical partition (up to relabeling) of the set
Clearly \({\mathfrak {m}}({\mathcal {T}}_u^b \setminus C) = 0\) and so C is \({\mathfrak {m}}\)-measurable. Therefore, by the above two disintegration formulae:
After identifying between \(P_2\) and \(Q_2\) via \(\eta \), it follows necessarily that \({\mathfrak {q}}\llcorner _{Q_2}={\mathfrak {p}}\llcorner _{P_2}\) as they are both the push-forward of \({\mathfrak {m}}\llcorner _{C}\) under the partition map (since \(({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}\) and \({\mathfrak {m}}_\alpha \) are both probability measures on \({\mathcal {T}}_u\)). Applying the Disintegration Theorem 6.19 to \((C,{\mathcal {B}}(C),{\mathfrak {m}}\llcorner _{C})\), we conclude that there is an essentially unique disintegration of \({\mathfrak {m}}\llcorner _{C}\) on the above partition of C. Consequently, there exist \(P_3 \subset P_2\) of full \({\mathfrak {p}}\)-measure and \(Q_3 = \eta (P_3) \subset Q_2\) of full \({\mathfrak {q}}\)-measure so that
for all pairs \((\beta ,\alpha ) \in P_3 \times Q_3\) related by the correspondence \(\eta \).
Recall that \(X_\alpha := \overline{R_u^b(\alpha )}\). It follows that for all \(\alpha \in Q_3\) (with corresponding \(\beta \in P_3\)):
-
(1)
\(X_\alpha = \overline{R_u^b(\alpha )} = R_u(\alpha )\) is a transport ray.
-
(2)
\((\overline{Y_\beta \cap {\mathcal {T}}_u^b} , \mathsf {d}, ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}) = (\overline{R_u^b(\alpha )} = X_\alpha ,\mathsf {d},{\mathfrak {m}}_\alpha )\) satisfies \({\mathsf {CD}}(K,N)\) with total measure 1.
-
(3)
Consequently
$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = \int _{Q} {\mathfrak {m}}_{\alpha } \,{\mathfrak {q}}(d\alpha ) , \end{aligned}$$(8.8)
is a disintegration on \(\left\{ X_\alpha \right\} _{\alpha \in Q}\).
-
(4)
\({\mathfrak {m}}_\alpha = ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}\) is supported on \(\overline{Y_\beta \cap {\mathcal {T}}_u^b} = \overline{R_u^b(\alpha )} = X_\alpha \).
This confirms the 4 conditions of Definition 8.1, and the essential uniqueness of the disintegration (8.8) readily follows from that of the disintegration (8.7) and the arguments above.
Finally, by Lemma 7.6, since \((R_u^b(\alpha ) = R_u(\alpha ) \cap {\mathcal {T}}_u^b , \mathsf {d})\) is isometric to an interval in \(({\mathbb {R}},\left| \cdot \right| )\), we have \(\mathring{X}_\alpha = \mathring{R}_{u}^b(\alpha )\) for all \(\alpha \in Q\). As \(\left\{ R_{u}^b(\alpha )\right\} _{\alpha \in Q}\) are equivalence classes, it follows that \(\{\mathring{X}_\alpha \}_{\alpha \in Q}\) is a family of disjoint subsets of \({\mathcal {T}}_u^b\). This concludes the proof for the case of \({\mathsf {CD}}^1_{Lip}\) and \({\mathsf {CD}}^1\).
For \({\mathsf {MCP}}^1\), one just needs to note that if \(u = \mathsf {d}(\cdot ,o)\) then \(o \in Y_\beta \) for all \(\beta \in P\) (by Remark 8.3, since \(Y_\beta \) is a transport ray). Recalling the definition of \(P_1 \subset P\), since \((Y_\beta \cap {\mathcal {T}}_u^b,\mathsf {d})\) is isometric to an interval and \({\mathfrak {m}}^P_\beta (Y_\beta \cap {\mathcal {T}}_u^b) = 1\) for all \(\beta \in P_1\), it follows necessarily that for those \(\beta \), \(o \in \overline{Y_\beta \cap {\mathcal {T}}_u^b}\) and \((\overline{Y_\beta \cap {\mathcal {T}}_u^b} , \mathsf {d}, ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b})\) still satisfies \({\mathsf {MCP}}(K,N)\) with respect to o and is of full support. The rest of the argument is identical to the one presented above, concluding the proof. \(\square \)
Recall moreover that we already derived several properties of \(W_{2}\)-geodesics in essentially non-branching m.m.s.’s verifying \({\mathsf {MCP}}(K,N)\). Hence from Proposition 8.9 we also obtain all the claims of Theorem 6.15 and Corollary 6.16, as well as all of the results of the next section, provided the m.m.s. is essentially non-branching and verifies \({\mathsf {CD}}^{1}(K,N)\) for \(N \in (1,\infty )\).
9 Temporal-regularity under \({\mathsf {MCP}}\)
In this section we deduce from the Measure Contraction and essentially non-branching properties various temporal-regularity results for the map \(t \mapsto \rho _{t}(\gamma _{t})\) and related objects, which we will require for this work. By Proposition 8.9, these results also apply under the \({\mathsf {CD}}^1\) condition. While these properties are essentially standard consequences of recently available results and tools, they appear to be new and may be of independent interest.
As usual, we assume that \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\). We begin with:
Proposition 9.1
Let \((X,\mathsf {d},{\mathfrak {m}})\) denote an essentially non-branching m.m.s.. Then the following are equivalent:
-
(1)
\((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}(K,N)\).
-
(2)
\((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}_{\varepsilon }(K,N)\).
-
(3)
For all \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _{0} \ll {\mathfrak {m}}\) and \(\text {supp}(\mu _{1}) \subset \text {supp}({\mathfrak {m}})\), there exists a unique \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\), \(\nu \) is induced by a map (i.e. \(\nu = S_{\sharp }(\mu _0)\) for some map \(S : X \rightarrow \mathrm{Geo}(X)\)), \(\mu _t := (\mathrm{e}_t)_{\#} \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1)\), and writing \(\mu _t = \rho _t {\mathfrak {m}}\), we have for all \(t \in [0,1)\):
$$\begin{aligned} \rho _t^{-\frac{1}{N}}(\gamma _t) \ge \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _0,\gamma _1)) \rho _0^{-\frac{1}{N}}(\gamma _0) \;\;\; \text {for }\nu \text {-a.e. }\gamma \in \mathrm{Geo}(X) , \end{aligned}$$(9.1)
and (integrating with respect to \(\nu \)):
$$\begin{aligned} {\mathcal {E}}_{N}(\mu _t) \ge \int \tau _{K,N}^{(1-t)} (\mathsf {d}(\gamma _0,\gamma _1)) \rho _0^{-\frac{1}{N}}(\gamma _0) \nu (d\gamma ) . \end{aligned}$$(9.2)
-
(4)
For all \(\mu _{0},\mu _{1} \in {\mathcal {P}}_2(X)\) of the form \(\mu _1 = \delta _o\) for some \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\) for some Borel set \(A \subset X\) with \(0< {\mathfrak {m}}(A) < \infty \), there exists a \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) so that for all \(t \in [0,1)\), \(\mu _t := (\mathrm{e}_t)_{\#} \nu \ll {\mathfrak {m}}\) and (9.1), (9.2) hold.
Moreover, the equivalence \((1) \Leftrightarrow (4)\) does not require the essentially non-branching assumption.
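As a hedged illustration of how (9.2) follows from (9.1) and of what (9.1) says in the simplest case, assume the convention \({\mathcal {E}}_N(\mu ) = \int \rho ^{1-\frac{1}{N}} d{\mathfrak {m}}\) for \(\mu = \rho {\mathfrak {m}}\). Since \(\rho _t > 0\) holds \(\mu _t\)-a.e. and \(\mu _t = (\mathrm{e}_t)_{\sharp } \nu \),
$$\begin{aligned} {\mathcal {E}}_N(\mu _t) = \int _{\{\rho _t > 0\}} \rho _t^{-\frac{1}{N}} \, d\mu _t = \int \rho _t^{-\frac{1}{N}}(\gamma _t) \, \nu (d\gamma ) , \end{aligned}$$
so integrating (9.1) against \(\nu \) gives (9.2). In the model case \(K = 0\), where the standard distortion coefficient reduces to \(\tau _{0,N}^{(r)}(\theta ) = r\), the pointwise bound (9.1) reads
$$\begin{aligned} \rho _t(\gamma _t) \le (1-t)^{-N} \rho _0(\gamma _0) \;\;\; \text {for }\nu \text {-a.e. }\gamma , \end{aligned}$$
which is the density bound one expects when mass is contracted towards \(\mu _1\) in dimension at most \(N\).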
Remark 9.2
In fact, for essentially non-branching spaces, it is also possible to add the \({\mathsf {MCP}}^1(K,N)\) condition to the above list of equivalent statements. Indeed, we have already seen in the previous section that \({\mathsf {MCP}}^1(K,N) \Rightarrow {\mathsf {MCP}}_{\varepsilon }(K,N)\) without any non-branching assumptions. The converse implication for non-branching spaces follows from [19, Proposition 9.5] (without identifying the \({\mathsf {MCP}}^1(K,N)\) condition by this name), and it is possible to extend this to essentially non-branching spaces by following the arguments of [23, Proposition A.1].
Remark 9.3
Note that in (3), one is allowed to test any \(\mu _1\) with \(\text {supp}(\mu _1) \subset \text {supp}({\mathfrak {m}})\), not only \(\mu _1 = \delta _o\) as in the other statements. By Theorem 6.15 (recall that \({\mathsf {MCP}}_{\varepsilon }(K,N)\) implies \({\mathsf {MCP}}(K,N)\)), note that the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) condition is precisely equivalent to the validity of (9.2) for all measures \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) of the form \(\mu _1 = \delta _o\) with \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 \ll {\mathfrak {m}}\) with bounded support.
Remark 9.4
While the equivalence \((1) \Leftrightarrow (4)\) will not be directly used in this work, it is worthwhile remarking that this is the only instance we are aware of, where one can obtain information on the density along geodesics without assuming or a-posteriori concluding some type of non-branching assumption. Indeed, the proof of \((1) \Rightarrow (4)\) relies on the (newly available) Theorem 3.11.
Proof of Proposition 9.1
\((1) \Rightarrow (4)\). \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper and geodesic by Lemma 6.12. Given \(\mu _0\) and \(\mu _1 = \delta _o\) as in (4), any \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) is concentrated on \(G_\varphi \) (where \(\varphi \) is the associated Kantorovich potential), and so Theorem 3.11 implies that \(\mathsf {d}(\gamma _{0},\gamma _{1}) = \ell _t(\gamma _t)\) for \(\nu \)-a.e. \(\gamma \). It follows that with the notation of Sect. 3:
The pointwise inequality between densities follows for \({\mathfrak {m}}\)-a.e. x, and since \(\ell _t < \infty \) (and hence \(\tau _{K,N}^{(1-t)}(\ell _t(x)) > 0\)) for \(t\in (0,1)\), this in fact implies that \((\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\) (without relying on Theorem 6.15, which is unavailable without the essentially non-branching assumption). Since \((\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\), the inequality between densities is verified at \(x = \gamma _t\) for \(\nu \)-a.e. \(\gamma \). Noting that \(\frac{1}{{\mathfrak {m}}(A)} = \rho _0(\gamma _0)\) for \(\nu \)-a.e. \(\gamma \), (9.1) and hence (9.2) are established for \(\mu _0,\mu _1\) as above.
\((4) \Rightarrow (1)\). This follows by applying (9.1) to \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\) and \(\mu _1 = \delta _o\), raising the resulting inequality to the power of N, and integrating it against \(\nu \llcorner _{\left\{ \gamma _t \in B\right\} }\) for all Borel sets \(B \subset \text {supp}(\mu _t)\), thereby verifying the \({\mathsf {MCP}}(K,N)\) inequality (6.5).
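In more detail, and assuming that (6.5) is the usual formulation \(\frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}} \ge (\mathrm{e}_t)_{\sharp }\big ( \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _0,\gamma _1))^N \nu (d\gamma )\big )\), the computation can be sketched as follows. Raising (9.1) to the power \(N\) with \(\rho _0(\gamma _0) = \frac{1}{{\mathfrak {m}}(A)}\) gives \(\tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _0,\gamma _1))^N \le \frac{1}{{\mathfrak {m}}(A)} \rho _t^{-1}(\gamma _t)\) for \(\nu \)-a.e. \(\gamma \), and therefore for any Borel set \(B\):
$$\begin{aligned} \int _{\{\gamma _t \in B\}} \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _0,\gamma _1))^N \, \nu (d\gamma ) \le \frac{1}{{\mathfrak {m}}(A)} \int _{B \cap \{\rho _t > 0\}} \rho _t^{-1} \, d\mu _t = \frac{{\mathfrak {m}}(B \cap \{\rho _t > 0\})}{{\mathfrak {m}}(A)} \le \frac{{\mathfrak {m}}(B)}{{\mathfrak {m}}(A)} . \end{aligned}$$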
\((4) \Rightarrow (2)\). Let \(o \in \text {supp}({\mathfrak {m}})\) and let \(\mu _{0} = \rho _{0} {\mathfrak {m}}\in {\mathcal {P}}(X)\) with bounded support. As \((4)\Rightarrow (1)\), Lemma 6.12 implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper, and in addition the assertions of Theorem 6.15 and Corollary 6.16 are in force.
Now, there exists a non-decreasing sequence \(\{f^{i}\}_{i\in {\mathbb {N}}}\) of simple functions, that is
$$\begin{aligned} f^i = \sum _{k \le n(i)} \alpha ^i_k 1_{A^i_k} , \end{aligned}$$
such that \(\mu _{0}^{i} := \rho _{0}^{i} {\mathfrak {m}}:= \frac{1}{z^i} f^i {\mathfrak {m}}\in {\mathcal {P}}(X)\) is of bounded support, \(z^i := \int f^i d{\mathfrak {m}}\nearrow 1\), \(f^i \nearrow \rho _0\) pointwise, and \(\mu _0^i \rightharpoonup \mu _0\) weakly, as \(i \rightarrow \infty \). By Theorem 6.15 there exists a unique \(\nu ^{i} \in \mathrm {OptGeo}(\mu _{0}^{i}, \delta _{o})\), it is induced by a map, and can be written as
with each \(\nu ^{i}_{k}\) the unique optimal dynamical plan between \(\mu _{0,k}^i := \rho _{0,k}^i {\mathfrak {m}}:= \frac{1}{{\mathfrak {m}}(A_k^i)} {\mathfrak {m}}\llcorner _{A_k^i}\) and \(\delta _{o}\). Moreover, \((\mathrm{e}_{t})_{\#}\nu ^{i}_{k} \perp (\mathrm{e}_{t})_{\#} \nu ^{i}_{j}\) whenever \(k\ne j\), for all \(t \in [0,1)\) by Corollary 6.16. Lastly, \(\text {supp}(\nu ^i) \subset \mathrm{Geo}(\text {supp}({\mathfrak {m}}))\) by Remark 6.11. It follows by (9.2) applied to \(\nu ^i_k\) that
Multiplying by \(\left( \frac{1}{z^i} \alpha ^i_k {\mathfrak {m}}(A^i_k)\right) ^{1-\frac{1}{N}}\), summing over k, and using the mutual singularity of all corresponding measures, we obtain
Passing to a subsequence if necessary, Lemma 6.1 implies that \(\nu ^i \rightharpoonup \nu ^\infty \in \mathrm {OptGeo}(\mu _0,\delta _o)\), and hence \((\mathrm{e}_{t})_{\#}\nu ^{i} \rightharpoonup (\mathrm{e}_{t})_{\#}\nu ^{\infty }\). It follows by upper semi-continuity of \({\mathcal {E}}_N\) on the left-hand side of (9.3), and monotone convergence (and \(z^i \rightarrow 1\)) on the right-hand side, that taking \(i \rightarrow \infty \) yields the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) inequality (6.4).
\((2) \Rightarrow (3)\). By Remark 6.11, we may reduce to the case \(\text {supp}({\mathfrak {m}}) = X\). In view of Remark 9.3, we first extend the validity of (9.2) by removing the (immaterial) restriction that \(\mu _0\) has bounded support. When \(K > 0\), \(\text {supp}(\mu _0)\) is automatically bounded since \({\mathsf {MCP}}_{\varepsilon }(K,N)\) implies \({\mathsf {MCP}}(K,N)\) which by Remark 6.10 implies a Bonnet-Myers diameter estimate. When \(K \le 0\), we may weakly approximate a general \(\mu _0 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) by measures \(\mu _0^i \ll {\mathfrak {m}}\) having bounded support and repeat the argument presented above in the proof of \((4) \Rightarrow (2)\).
The case of a general \(\mu _1 \in {\mathcal {P}}_2(X)\) with \(\text {supp}(\mu _1) \subset \text {supp}({\mathfrak {m}})\) follows by approximating \(\mu _1\) by a convex combination of delta-measures:
$$\begin{aligned} \mu _1^i = \sum _{k \le n(i)} \alpha ^i_k \delta _{o^i_k} , \end{aligned}$$
with \(W_{2}(\mu _{1}^{i}, \mu _{1}) \rightarrow 0\) as \(i\rightarrow \infty \). By Theorem 6.15 (recall again that \({\mathsf {MCP}}_{\varepsilon }(K,N)\) implies \({\mathsf {MCP}}(K,N)\)), for each i there exists a unique \(\nu ^{i} \in \mathrm {OptGeo}(\mu _{0},\mu _{1}^{i})\), and we may write \(\nu ^{i} = \sum _{k \le n(i)} \alpha ^i_k \nu ^{i}_{k}\) so that
Moreover, as explained above, \((\mathrm{e}_{t})_{\#} \nu ^{i}_{k} \perp (\mathrm{e}_{t})_{\#} \nu ^{i}_{j}\) whenever \(k\ne j\), for all \(t \in [0,1)\). Furthermore, as \((\mathrm{e}_{0})_{\#} \nu ^{i}_{k} \ll {\mathfrak {m}}\) (since \((\mathrm{e}_{0})_{\#} \nu ^i = \mu _0 = \rho _0 {\mathfrak {m}}\ll {\mathfrak {m}}\)), Theorem 6.15 implies that \((\mathrm{e}_{t})_{\#} \nu ^{i}_{k} \ll {\mathfrak {m}}\) for all \(t \in [0,1)\). Writing \((\mathrm{e}_{t})_{\#} \nu ^{i}_{k} = \rho _{k,t}^{i} {\mathfrak {m}}\), the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) condition implies for all \(t \in [0,1)\):
Multiplying by \((\alpha _k^i)^{1-1/N}\), summing over k and using the mutual singularity of the corresponding measures, we obtain
Passing as usual to a subsequence if necessary, Lemma 6.1 implies that \(\nu ^i \rightharpoonup \nu ^\infty \in \mathrm {OptGeo}(\mu _0,\mu _1)\), and hence \((\mathrm{e}_{t})_{\#}\nu ^{i} \rightharpoonup (\mathrm{e}_{t})_{\#}\nu ^{\infty }\). Invoking the upper semi-continuity of \({\mathcal {E}}_{N}\) on the left-hand-side, and lower semi-continuity of the right-hand-side (see [74, Lemma 3.3], noting that the first marginal of \(\nu ^i\) is fixed to be \(\mu _0 = \rho _0 {\mathfrak {m}}\)), (9.2) finally follows in full generality.
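The summation step used twice in this proof rests on an elementary identity, sketched here in generic notation (the weights \(w_k\) and measures \(\mu _k\) are placeholders introduced for illustration, and we assume the convention \({\mathcal {E}}_N(\mu ) = \int \rho ^{1-\frac{1}{N}} d{\mathfrak {m}}\)). If \(\mu = \sum _k w_k \mu _k\) with \(w_k \ge 0\), \(\sum _k w_k = 1\) and mutually singular \(\mu _k = \rho _k {\mathfrak {m}}\), then \(\mu \) has density \(\rho = w_k \rho _k\) on \(\{\rho _k > 0\}\), and so
$$\begin{aligned} {\mathcal {E}}_N(\mu ) = \sum _k \int _{\{\rho _k> 0\}} (w_k \rho _k)^{1-\frac{1}{N}} d{\mathfrak {m}} = \sum _k w_k^{1-\frac{1}{N}} {\mathcal {E}}_N(\mu _k) . \end{aligned}$$
This is why each inequality is multiplied by the weight raised to the power \(1-\frac{1}{N}\) before summing: the left-hand sides then add up exactly to \({\mathcal {E}}_N\) of the mixture.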
The density estimate (9.1) then follows using a straightforward variation of [41, Proposition 3.1], where it was shown how the existence of (a necessarily unique) transport map S may be used to obtain a pointwise density inequality such as (9.1) from an integral inequality such as (9.2) (the statement of [41, Proposition 3.1] involves an assumption on infinitesimal Hilbertianity of the space, but the only property used in the proof is the existence of a transport map S inducing a unique optimal dynamical plan).
Finally, \((3) \Rightarrow (4)\) is trivial. This concludes the proof. \(\square \)
Corollary 9.5
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. verifying \({\mathsf {MCP}}(K,N)\). Then with the same assumptions and notation as in Proposition 9.1 (3), there exist versions of the densities \(\rho _t = \frac{d\mu _t}{d{\mathfrak {m}}}\), \(t \in [0,1)\), so that for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\), for all \(0\le s \le t <1\):
(with \(\frac{s}{t} = \frac{0}{0}\) interpreted as 1 above). In particular, for \(\nu \)-a.e. \(\gamma \), the map \(t \mapsto \rho _{t}(\gamma _{t})\) is locally Lipschitz on (0, 1) and upper semi-continuous at \(t=0\).
Proof
Step 1. Given \(0 \le s \le t < 1\), observe that \((\text {restr}^{t}_{s})_{\sharp } \nu \) is the unique element of \(\mathrm {OptGeo}(\mu _{s},\mu _{t})\); indeed \(\mu _{s}\) is absolutely continuous with respect to \({\mathfrak {m}}\) and so Theorem 6.15 applies. In particular, we deduce that for each \(0 \le s \le t < 1\) and \(\nu \)-a.e. \(\gamma \):
with the exceptional set depending on s and t. Reversing time and the roles of \(\mu _s,\mu _t\), we similarly obtain for each \(0 \le s \le t < 1\) and \(\nu \)-a.e. \(\gamma \) that:
with the exceptional set depending on s and t (the case \(s=0\) is also included as the conclusion is then trivial). Note that given \(s \in [0,1)\), as \(\rho _s(x) > 0\) for \(\mu _s\)-a.e. x, we have that \(\rho _s(\gamma _s) > 0\) for \(\nu \)-a.e. \(\gamma \). Altogether, we see that for each \(0 \le s \le t < 1\), for \(\nu \)-a.e. \(\gamma \):
with the exceptional set depending on s and t.
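To illustrate the shape of the resulting two-sided estimate, consider the model case \(K = 0\), where \(\tau _{0,N}^{(r)}(\theta ) = r\). The following is a sketch under our reading of the argument, not a restatement of (9.5). Applying (9.1) to the restriction of \(\nu \) to \([s,1]\) at relative time \(r = \frac{t-s}{1-s}\), and to the time-reversed restriction to \([0,t]\), gives for \(0< s \le t < 1\) and \(\nu \)-a.e. \(\gamma \):
$$\begin{aligned} \rho _t^{-\frac{1}{N}}(\gamma _t) \ge \frac{1-t}{1-s} \, \rho _s^{-\frac{1}{N}}(\gamma _s) , \qquad \rho _s^{-\frac{1}{N}}(\gamma _s) \ge \frac{s}{t} \, \rho _t^{-\frac{1}{N}}(\gamma _t) , \end{aligned}$$
that is,
$$\begin{aligned} \left( \frac{s}{t}\right) ^{N} \rho _s(\gamma _s) \le \rho _t(\gamma _t) \le \left( \frac{1-s}{1-t}\right) ^{N} \rho _s(\gamma _s) . \end{aligned}$$
Taking logarithms, \(N \log \frac{s}{t} \le \log \rho _t(\gamma _t) - \log \rho _s(\gamma _s) \le N \log \frac{1-s}{1-t}\), which makes the local Lipschitz regularity on (0, 1) transparent in this case.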
Together with an application of Corollary 6.16, we deduce the existence of a Borel set \(H \subset \mathrm{Geo}(X)\) with \(\nu (H)=1\) such that \(\mathrm{e}_t|_H : H \rightarrow X\) is injective for all \(t \in [0,1)\), and such that for every \(\gamma \in H\), the double sided estimate (9.5) holds for all \(s,t \in [0,1) \cap {\mathbb {Q}}\). We then define for \(t \in [0,1)\) and \(\gamma \in H\):
and \({\hat{\rho }}_t = 0\) outside of \(\mathrm{e}_t(H)\). By (9.5) we see that for any \(\gamma \in H\) and \(t \in (0,1)\) the above limit always exists, and so by injectivity of \(\mathrm{e}_t|_H\), \({\hat{\rho }}_t\) is well-defined. Furthermore, (9.5) implies that for all \(\gamma \in H\), \({\hat{\rho }}_{\cdot }(\gamma _{\cdot })\) satisfies (9.5) itself for all \(0 \le s \le t < 1\). Finally, for each \(t \in [0,1)\) consider any sequence \(\{s_{n}\}\subset {\mathbb {Q}}\) converging to t; then (9.5) is valid for \(\nu \)-a.e. \(\gamma \) at t and \(s_{n}\), with the exceptional set not depending on n. Taking the limit as \(n\rightarrow \infty \) implies \( \rho _{t}(\gamma _{t}) = {\hat{\rho }}_{t}(\gamma _{t})\). Hence we have obtained that for each \(t \in [0,1)\), for \(\nu \)-a.e. \(\gamma \):
with the exceptional set depending only on t.
It follows that for all \(t \in [0,1)\), \(\rho _t(x) = {\hat{\rho }}_t(x)\) for \(\mu _t\)-a.e. x. As \(\mu _t\) and \({\mathfrak {m}}\) are mutually absolutely continuous on \(\left\{ \rho _t > 0\right\} \), it follows that \(\rho _t {\mathfrak {m}}= {\hat{\rho }}_t 1_{\left\{ \rho _t > 0\right\} }{\mathfrak {m}}\) for all \(t \in [0,1)\).
Step 2. We now claim that for all \(t \in [0,1)\), \({\mathfrak {m}}(\{ \rho _t = 0 \} \cap \mathrm{e}_t(H)) = 0\). This will establish that \(\mu _t = \rho _t {\mathfrak {m}}= {\hat{\rho }}_t {\mathfrak {m}}\), so that \({\hat{\rho }}_t\) is indeed a density of \(\mu _t\), thereby concluding the proof.
Suppose, to the contrary, that the claim is false, so that there exists \(t \in [0,1)\) with \({\mathfrak {m}}(\{ \rho _t = 0 \} \cap \mathrm{e}_t(H)) > 0\). As \(\mathrm{e}_t|_H\) is injective, there exists \(K \subset H\) such that \(K_{t} : = \mathrm{e}_{t}(K) = \{ \rho _t = 0 \} \cap \mathrm{e}_t(H)\).
Set \(K_s := \mathrm{e}_s(K)\) for all \(s \in [0,1)\). We claim that \({\mathfrak {m}}(K_s) > 0\) for all \(s \in (0,1)\). Indeed, define \(\eta _{t} : = {\mathfrak {m}}\llcorner _{K_{t}}/{\mathfrak {m}}(K_{t})\) and set \({\bar{\nu }} := (\mathrm{e}_{t}|_H)^{-1}_{\#} \eta _{t}\) and \(\eta _s := (\mathrm{e}_s)_{\sharp } {\bar{\nu }}\). As \({\bar{\nu }}\) is concentrated on \(K \subset H \subset \text {supp}(\nu )\), it follows that \((\text {restr}^{1}_{t})_{\sharp } {\bar{\nu }}\) must be an optimal dynamical plan between \(\eta _t\) and \(\eta _1\). As \(\eta _t \ll {\mathfrak {m}}\), Theorem 6.15 implies that the latter plan is in fact the unique element of \(\mathrm{OptGeo}(\eta _t,\eta _1)\), and that \(\eta _s \ll {\mathfrak {m}}\) for all \(s \in [t,1)\). As \(\eta _s(K_s) = 1\), it follows that \({\mathfrak {m}}(K_s) > 0\). If \(t > 0\), a similar argument applies to the range \(s \in (0,t]\).
However, by definition, for all \(s \in [0,1) \cap {\mathbb {Q}}\) we have \(0 < {\hat{\rho }}_s = \rho _s\) on \(\mathrm{e}_s(H)\), and in particular on \(\mathrm{e}_s(K) = K_s\). Choosing any \(s \in (0,1) \cap {\mathbb {Q}}\), we obtain the desired contradiction:
This concludes the proof. \(\square \)
Proposition 9.6
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. verifying \({\mathsf {MCP}}(K,N)\). Consider any \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _0 \ll {\mathfrak {m}}\) and \(\text {supp}(\mu _1) \subset \text {supp}({\mathfrak {m}})\), and let \(\nu \) denote the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\). Then for any compact set \(G \subset \mathrm{Geo}(X)\) with \(\nu (G)> 0\), such that (9.4) holds for all \(\gamma \in G\) and \(0 \le s \le t < 1\), we have for all \(s \in [0,1)\), \({\mathfrak {m}}(\mathrm{e}_{s}(G)) > 0\), and for all \(0\le s \le t <1\):
where \(d(G) = \sup \{ \ell (\gamma ) :\gamma \in G\} < \infty \) and \(K^{-} = \max \{0, - K\}\) (and with \(\frac{t}{s} = \frac{0}{0}\) interpreted as 1 above). In particular, the map \(t \mapsto {\mathfrak {m}}(\mathrm{e}_{t}(G))\) is locally Lipschitz on (0, 1) and lower semi-continuous at \(t=0\).
Proof
We proceed with the usual notation repeatedly used above. Fix \(s \in [0,1)\). Since \(\mu _{s}(\mathrm{e}_{s}(G)) \ge \nu (G) >0\) and \(\mu _s \ll {\mathfrak {m}}\), it follows that \({\mathfrak {m}}(\mathrm{e}_{s}(G)) > 0\). Define \({\bar{\mu }}_{0} : = {\mathfrak {m}}\llcorner _{\mathrm{e}_s(G)}/{\mathfrak {m}}(\mathrm{e}_s(G))\).
By Corollary 6.16, there exists a Borel set \(H \subset G\) such that \(\mathrm{e}_{s}^{-1} : \mathrm{e}_{s}(H) \rightarrow G\) is a single valued map and:
where the second assertion above follows since \({\mathfrak {m}}\) and \(\mu _s\) are mutually absolutely continuous on \(\left\{ \rho _s > 0 \right\} \), and since our assumption (9.4) guarantees that \(\mathrm{e}_{s}(G) \subset \{\rho _{s} > 0 \}\). Now consider:
By construction and (9.7), \((\mathrm{e}_0)_{\sharp }{\bar{\nu }} = {\bar{\mu }}_{0}\); define \({\bar{\mu }}_{1} : = (\mathrm{e}_1)_{\sharp }{\bar{\nu }}\) and note that necessarily \({\bar{\nu }} \in \mathrm {OptGeo}({\bar{\mu }}_{0}, {\bar{\mu }}_{1})\) (since \({\bar{\nu }}\) is still supported on a \(\mathsf {d}^2/2\)-cyclically monotone set) and that it is induced by the map \(T := \mathrm{e}_1 \circ \mathrm{e}_s^{-1}\). Theorem 6.15 then implies that \({\bar{\mu }}_{r} = {\bar{\rho }}_r {\mathfrak {m}}\ll {\mathfrak {m}}\) for all \(r \in [0,1)\). Note that \({\bar{\mu }}_r\) is concentrated on the compact set \(\mathrm{e}_{t}(G)\) with \(t := s + r(1-s)\), and therefore \({\mathfrak {m}}(\text {supp}({\bar{\mu }}_r)) \le {\mathfrak {m}}(\mathrm{e}_{t}(G) )\). It follows by Jensen’s inequality together with the \({\mathsf {MCP}}(K,N)\) assumption that:
where the last inequality follows from the lower bound (see e.g. [29, Remark 2.3]):
Substituting \(r = \frac{t-s}{1-s}\), the left-hand side of (9.6) is established. Reversing the time, the right-hand side of (9.6) immediately follows, thereby concluding the proof. \(\square \)
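The Jensen step in the proof above can be made explicit. What follows is a sketch rather than the displayed computation of the proof; the symbol \(S\) is notation introduced here for illustration and stands for any Borel set with \(0< {\mathfrak {m}}(S) < \infty \) on which \({\bar{\mu }}_r = {\bar{\rho }}_r {\mathfrak {m}}\) is concentrated (for instance \(S = \mathrm{e}_{t}(G)\) with \(t = s + r(1-s)\)).

```latex
\[
\int_S \bar\rho_r^{\,1-\frac1N}\, d\mathfrak m
= \mathfrak m(S) \int_S \bar\rho_r^{\,1-\frac1N}\, \frac{d\mathfrak m}{\mathfrak m(S)}
\le \mathfrak m(S) \Big( \int_S \bar\rho_r\, \frac{d\mathfrak m}{\mathfrak m(S)} \Big)^{1-\frac1N}
= \mathfrak m(S)^{\frac1N} .
\]
```

Only the concavity of \(x \mapsto x^{1-\frac{1}{N}}\) on \([0,\infty )\) and the normalization \(\int _S {\bar{\rho }}_r \, d{\mathfrak {m}}= 1\) are used, so an upper bound on \({\mathfrak {m}}(S)\) translates into an upper bound on the left-hand side.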
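The following two consequences of Proposition 9.6 will be required for the proof of the change-of-variables formula in Sect. 11. Recall that for any \(G \subset \mathrm{Geo}(X)\),
$$\begin{aligned} D(G) := \{ (x,t) \in X \times [0,1] :x = \gamma _{t}, \ \gamma \in G \} , \end{aligned}$$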
and that \(D(G)(x) = \{ t \in [0,1] :x = \gamma _{t}, \ \gamma \in G \}\) and \(D(G)(t) = \{ x \in X :x = \gamma _t,\ \gamma \in G \} = \mathrm{e}_{t}(G)\). To simplify the notation, we directly write G(x) instead of D(G)(x).
Proposition 9.7
With the same assumptions as in Proposition 9.6, we have for any \(t \in (0,1)\):
The same result also holds for \(t=0\) if we dispense with the factor of 2 in the denominator.
The proof follows the same lines as the proof of [26, Theorem 2.1]. We include it for the reader’s convenience.
Proof
Fix \(t \in (0,1)\). Suppose, by contradiction, that the claim is false:
Consider the complement \(G(x)^{c} = \{ t \in [0,1] : x \notin \mathrm{e}_{t}(G)\}\), and deduce the existence of a sequence \(\varepsilon _{n} \rightarrow 0\) such that
Now let
with E(x), E(s) the corresponding sections. By Fubini’s Theorem and (9.8) we obtain that:
so there must be a sequence \(\{s_{n}\}_{n \in {\mathbb {N}}}\) converging to t so that \({\mathfrak {m}}(E (s_{n})) \ge \kappa \), for some \(\kappa >0\). Repeating the above argument for the case \(t=0\) with the obvious modifications, the latter conclusion holds in that case as well. Note that
The compact sets \(\mathrm{e}_{s_{n}}(G)\) converge to \(\mathrm{e}_{t}(G)\) in Hausdorff distance: indeed, \(\mathsf {d}(\gamma _{t},\gamma _{s_{n}}) \le C |t-s_{n}|\) where \(C: =\sup _{\gamma \in G} \ell (\gamma ) < \infty \) by compactness of G. Hence, for each \(\varepsilon > 0\) there exists \(n(\varepsilon )\) such that for all \(n \ge n(\varepsilon )\) it holds \(\mathrm{e}_{t}(G)^{\varepsilon } \supset \mathrm{e}_{s_n}(G)\) (and vice-versa), where \(A^\varepsilon := \left\{ y \in X \; ; \; \mathsf {d}(y,A) \le \varepsilon \right\} \). It follows that
Taking the limit as \(n \rightarrow \infty \), the continuity property of Proposition 9.6 (lower semi-continuity if \(t=0\)) implies that for each \(\varepsilon > 0\):
with \(\kappa \) independent of \(\varepsilon \). Since \({\mathfrak {m}}(\mathrm{e}_{t}(G)) = \lim _{\varepsilon \rightarrow 0} {\mathfrak {m}}(\mathrm{e}_{t}(G)^{\varepsilon })\) we obtain a contradiction, and the claim is proved. \(\square \)
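The two elementary facts used at the end of the proof above can be spelled out as follows. This is a sketch under the standing conventions that geodesics are parametrized on [0, 1] with constant speed and that \({\mathfrak {m}}\) is finite on bounded sets; the proof itself does not display this computation.

```latex
\[
\mathsf d(\gamma_t,\gamma_{s_n}) = |t-s_n|\,\ell(\gamma) \le C\,|t-s_n| \quad \forall \gamma \in G
\;\;\Longrightarrow\;\;
\mathrm e_{s_n}(G) \subset \mathrm e_t(G)^{\varepsilon} \ \text{ whenever } C\,|t-s_n| \le \varepsilon ,
\]
\[
\mathrm e_t(G) = \bigcap_{\varepsilon>0} \mathrm e_t(G)^{\varepsilon}
\;\;\Longrightarrow\;\;
\mathfrak m(\mathrm e_t(G)) = \lim_{\varepsilon \to 0} \mathfrak m\big(\mathrm e_t(G)^{\varepsilon}\big) .
\]
```

The second line uses that \(\mathrm{e}_{t}(G)\) is closed, so that the closed \(\varepsilon \)-neighbourhoods decrease to it, together with continuity from above of \({\mathfrak {m}}\), which requires \({\mathfrak {m}}(\mathrm{e}_{t}(G)^{\varepsilon _0}) < \infty \) for some \(\varepsilon _0 > 0\).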
Corollary 9.8
With the same assumptions as in Proposition 9.6, and assuming that \(\text {supp}({\mathfrak {m}}) = X\), we have
where \(\varphi \) is an associated Kantorovich potential to the c-optimal-transport problem from \(\mu _{0}\) to \(\mu _{1}\) with \(c = \mathsf {d}^2/2\). In particular:
Recall from Sect. 3 that \(G_{\varphi }\subset \mathrm{Geo}(X)\) denotes the set of \(\varphi \)-Kantorovich geodesics, \(G^{+}_{\varphi }\) denotes the subset of geodesics in \(G_{\varphi }\) having positive length, and \(X^0 = \mathrm{e}_{[0,1]}(G_\varphi ^0)\) denotes the subset of null geodesic points in X. Necessarily \(\nu (G_{\varphi }) = 1\). The assumption \(\text {supp}({\mathfrak {m}}) = X\) guarantees by Lemma 6.12 that \((X,\mathsf {d})\) is proper and geodesic, so that the results of Part I are in force; by Remark 6.11 this poses no loss in generality.
Proof of Corollary 9.8
Suppose by contradiction that \(\nu (\mathrm{e}_{0}^{-1}(X^{0})\cap G^{+}_{\varphi } ) > 0\). By inner regularity, there exists a compact \(G \subset \mathrm{e}_{0}^{-1}(X^{0})\cap G^{+}_{\varphi }\) with \(\nu (G)>0\) verifying the hypothesis of Proposition 9.6 and therefore also the conclusion of Proposition 9.7 for \(t =0\). In particular, for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_{0}(G) \subset X^{0}\) there exists \(\gamma \in G \subset G^{+}_{\varphi }\) and \(t \in (0,1)\) (sufficiently small) such that \(x = \gamma _{t}\). But \(\mu _0(\mathrm{e}_0(G)) = \nu (\mathrm{e}_0^{-1}(\mathrm{e}_0(G))) \ge \nu (G) > 0\), and hence \({\mathfrak {m}}(\mathrm{e}_0(G)) > 0\) as \(\mu _0 \ll {\mathfrak {m}}\). It follows that there exists at least one \(x \in \mathrm{e}_{0}(G)\) as above, in direct contradiction to the characterization of \(X^0\) given in Lemma 3.15. Hence we can conclude that \(\nu \)-almost-surely, \(\mathrm{e}_{0}^{-1}(X^{0})\) is contained in the set of null geodesics \(G_{\varphi }^0\). For \(t \in (0,1)\), \(\mathrm{e}_t^{-1}(X^0) \subset G_\varphi ^0\) by Lemma 3.15, and so we conclude that \(\mu _{t}\llcorner _{X^0} = \mu _0\llcorner _{X^0}\) for all \(t \in [0,1)\). \(\square \)
Remark 9.9
When applying the results of this section, note that when both \(\mu _0,\mu _1 \ll {\mathfrak {m}}\), then by reversing the roles of \(\mu _0\) and \(\mu _1\), we in fact obtain all the above results also at the right end-point \(t=1\).
10 Two families of conditional measures
The next two sections will be devoted to the study of \(W_{2}\)-geodesics over \((X,\mathsf {d},{\mathfrak {m}})\), when \((X,\mathsf {d},{\mathfrak {m}})\) is assumed to be essentially non-branching and verifies \({\mathsf {CD}}^1(K,N)\). By Remark 8.8, we also assume \(\text {supp}({\mathfrak {m}}) = X\). We will use Proposition 8.13 as an equivalent definition for \({\mathsf {CD}}^1(K,N)\). By Proposition 8.9 and Remark 8.11, X also verifies \({\mathsf {MCP}}(K,N)\), and so Theorem 6.15 applies. In addition, it follows by Lemma 6.12 that \((X,\mathsf {d})\) is geodesic and proper, and so the results of Part I apply.
Fix \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\), and denote by \(\nu \) the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\). As usual, we denote \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1]\), and set
Fix also an associated Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\) for the c-optimal-transport problem from \(\mu _{0}\) to \(\mu _{1}\), with \(c = \mathsf {d}^2/2\). Recall that \(G_{\varphi }\subset \mathrm{Geo}(X)\) denotes the set of \(\varphi \)-Kantorovich geodesics and that necessarily \(\nu (G_{\varphi }) = 1\). We further recall from Sect. 3 that the interpolating Kantorovich potential and its time-reversed version at time \(t \in (0,1)\) are defined for any \(x \in X\) as
with \(\varphi _0 = {\bar{\varphi }}_0 = \varphi \) and \(\varphi _1 = {\bar{\varphi }}_1 = -\varphi ^c\). By Proposition 3.6 we have, for all \(t \in (0,1)\), \(\varphi _t(x) \le {\bar{\varphi }}_t(x)\), with equality iff \(x \in \mathrm{e}_t(G_\varphi )\).
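The inequality \(\varphi _t \le {\bar{\varphi }}_t\) admits a short direct verification. The sketch below assumes the Hopf–Lax forms \(\varphi _t(x) = \sup _{y} \{ \varphi (y) - \mathsf {d}(x,y)^2/(2t) \}\) and \({\bar{\varphi }}_t(x) = \inf _{z} \{ \mathsf {d}(x,z)^2/(2(1-t)) - \varphi ^c(z) \}\); these are consistent with the boundary values just stated, but they are an assumption here, and Proposition 3.6 remains the authoritative statement.

```latex
\[
\begin{aligned}
\varphi(y) + \varphi^c(z) &\le \frac{\mathsf d(y,z)^2}{2}
\le \frac{\big(\mathsf d(x,y) + \mathsf d(x,z)\big)^2}{2}
\le \frac{\mathsf d(x,y)^2}{2t} + \frac{\mathsf d(x,z)^2}{2(1-t)} , \\
\text{hence}\quad
\varphi(y) - \frac{\mathsf d(x,y)^2}{2t} &\le \frac{\mathsf d(x,z)^2}{2(1-t)} - \varphi^c(z)
\qquad \forall\, y,z \in X .
\end{aligned}
\]
```

The first inequality is the definition of the c-transform with \(c = \mathsf {d}^2/2\), the second is the triangle inequality, and the third is the Cauchy–Schwarz bound \((a+b)^2 \le a^2/t + b^2/(1-t)\) for \(t \in (0,1)\). Taking the supremum over y and the infimum over z gives \(\varphi _t(x) \le {\bar{\varphi }}_t(x)\).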
It will be convenient from a technical perspective to first restrict \(\nu \), by inner regularity of Radon measures, Corollary 9.5 (applied to both pairs \(\mu _0,\mu _1\) and \(\mu _1,\mu _0\)), Proposition 9.7 and Corollary 6.16, to a suitable good compact subset \(G \subset G^+_{\varphi }\) with \(\nu (G) \ge \nu (G_\varphi ^+)-\varepsilon \). Recall that \(G_\varphi ^+\) was defined in Sect. 3 as the subset of geodesics in \(G_\varphi \) having positive length, and note that the length function \(\ell : \mathrm{Geo}(X) \rightarrow [0,\infty )\) is continuous and hence is bounded away from 0 and \(\infty \) on a compact \(G \subset G^+_{\varphi }\).
Definition 10.1
(Good Subset of Geodesics) A subset \(G \subset G^+_{\varphi }\) is called good if the following properties hold:
- G is compact;
- there exists \(c > 0\) so that for every \(\gamma \in G\):
$$\begin{aligned} c \le \ell (\gamma ) \le 1/c \; ; \end{aligned} \tag{10.1}$$
- for every \(\gamma \in G\), \(\rho _s(\gamma _s) > 0\) for all \(s \in [0,1]\) and \((0,1) \ni s \mapsto \rho _s(\gamma _s)\) is continuous;
- the claim of Proposition 9.7 holds true for G;
- the map \(\mathrm{e}_{t}|_G : G \rightarrow X\) is injective (and we will henceforth restrict \(\mathrm{e}_t\) to G or its subsets).
Assumption 10.2
We will assume in this section and in Sect. 11.1 that \(\nu \) is concentrated on a good \(G \subset G^+_\varphi \).
We will dispose of this assumption in the Change-of-Variables Theorem 11.4.
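10.1 \(L^{1}\) partition
For \(s \in [0,1]\) and \(a_{s} \in {\mathbb {R}}\), we recall the following notation (introduced in Sect. 4 for \(G = G_\varphi \), but now we treat a general \(G \subset G_\varphi \) as above):
$$\begin{aligned} G_{a_{s}} := \{ \gamma \in G \; ; \; \varphi _{s}(\gamma _{s}) = a_{s} \} . \end{aligned}$$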
As G is compact and \(\mathrm{e}_s : G \rightarrow X\) is continuous, \(\mathrm{e}_s(G)\) is compact. When \(s \in (0,1)\), \(\varphi _s : X \rightarrow {\mathbb {R}}\) is continuous by Lemma 3.2, and hence \(G_{a_s}\) is compact as well.
The structure of the evolution of \(G_{a_{s}}\), i.e. \(\mathrm{e}_{[0,1]}(G_{a_{s}}) = \{ \gamma _{t} :t \in [0,1], \ \gamma \in G_{a_{s}} \}\), will be the topic of this subsection, so the properties we prove below are only meaningful for \(a_s \in \varphi _s(\mathrm{e}_s(G))\) (and moreover typically when \({\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}})) > 0\)). It will be convenient to use a short-hand notation for the signed-distance function from a level set of \(\varphi _{s}\), \(d_{a_{s}} : = d_{\varphi _{s} - a_{s}}\) (see (8.2)).
Lemma 10.3
For any \(s \in [0,1]\) and \(a_{s} \in \varphi _s(\mathrm{e}_s(G))\) the following holds: for each \(\gamma \in G_{a_{s}}\) and \(0 \le r \le t \le 1\), \((\gamma _{r},\gamma _{t}) \in \Gamma _{d_{a_{s}}}\). In particular, the evolution of \(G_{a_{s}}\) is a subset of the transport set associated to \(d_{a_{s}}\):
Proof
Fix \(\gamma \in G_{a_{s}}\). If \(s \in [0,1)\) then for any \(p \in \{ \varphi _{s} = a_{s} \}\):
by Lemma 3.3 and Proposition 3.6 (2), and hence \(\mathsf {d}(\gamma _{s},\gamma _{1}) \le \mathsf {d}(p,\gamma _{1})\); the latter also holds for \(s=1\) trivially. Similarly, if \(s \in (0,1]\) then for any \( q\in \{ \varphi _{s} = a_{s}\}\):
and therefore \(\mathsf {d}(\gamma _{0},\gamma _{s}) \le \mathsf {d}(\gamma _{0},q)\), with the latter also holding for \(s=0\) trivially. Consequently, for any \(p,q \in \{ \varphi _{s} = a_{s} \}\):
Taking infimum over p and q it follows that:
where the sign of \(d_{a_s}\) was determined by the fact that \(s \mapsto \varphi _s(\gamma _s)\) is decreasing (e.g. by Lemma 3.3). On the other hand
thanks to the 1-Lipschitz regularity of \(d_{a_{s}}\) ensured by Lemma 8.4 since \((X,\mathsf {d})\) is geodesic. Therefore equality holds and \((\gamma _{0},\gamma _{1}) \in \Gamma _{d_{a_{s}}}\). The assertion then follows by Lemma 7.1. \(\square \)
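For orientation, the chain of inequalities driving the proof above can be assembled in one display. This is a reconstruction from the surrounding text and not the proof's own displayed formulas; in particular, the middle equality presumes the sign assignment \(d_{a_s}(\gamma _0) \ge 0 \ge d_{a_s}(\gamma _1)\), which the proof attributes to the monotonicity of \(s \mapsto \varphi _s(\gamma _s)\).

```latex
\[
\begin{aligned}
\mathsf d(\gamma_0,\gamma_1)
&= \mathsf d(\gamma_0,\gamma_s) + \mathsf d(\gamma_s,\gamma_1)
\le \mathsf d(\gamma_0,q) + \mathsf d(p,\gamma_1)
\qquad \forall\, p,q \in \{\varphi_s = a_s\} , \\
\mathsf d(\gamma_0,\gamma_1)
&\le \inf_{q} \mathsf d(\gamma_0,q) + \inf_{p} \mathsf d(p,\gamma_1)
= d_{a_s}(\gamma_0) - d_{a_s}(\gamma_1)
\le \mathsf d(\gamma_0,\gamma_1) .
\end{aligned}
\]
```

The last inequality is the 1-Lipschitz property of \(d_{a_s}\), so all inequalities are equalities and \((\gamma _0,\gamma _1) \in \Gamma _{d_{a_s}}\).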
Next, recall by Proposition 8.13 applied to the function \(u = d_{a_{s}}\), that according to the equivalent characterization of \({\mathsf {CD}}^{1}_u(K,N)\), the following disintegration formula holds:
where Q is a section of the partition of \({\mathcal {T}}_{d_{a_s}}^b\) given by the equivalence classes \(\{R_{d_{a_s}}^b(\alpha )\}_{\alpha \in Q}\), and for \({\hat{{\mathfrak {q}}}}^{a_s}\)-a.e. \(\alpha \in Q\), the probability measure \({\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}\) is supported on the transport ray \(X_{\alpha } = \overline{R^b_{d_{a_s}}(\alpha )} = R_{d_{a_s}}(\alpha )\) and \((X_{\alpha }, \mathsf {d}, {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}})\) verifies \({\mathsf {CD}}(K,N)\). It follows by Lemma 10.3 that
It will be convenient to make the previous disintegration formula a bit more explicit. We refer to the “Appendix” for the definition of \({\mathsf {CD}}(K,N)\) density and the (suggestive) relation to one-dimensional \({\mathsf {CD}}(K,N)\) spaces. Recall that \(\ell _s(\gamma _s) = \ell (\gamma )\) for all \(\gamma \in G\).
Proposition 10.4
For any \(s \in (0,1)\) and \(a_{s} \in \varphi _s(\mathrm{e}_s(G))\), the following disintegration formula holds:
with \({\mathfrak {q}}^{a_{s}}\) a Borel measure concentrated on \(\mathrm{e}_{s}(G_{a_{s}})\) of mass \({\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}}))\), \(g^{a_{s}} : \mathrm{e}_s(G_{a_s}) \times [0,1] \rightarrow X\) is defined by \(g^{a_{s}}(\beta ,t) = \mathrm{e}_t(\mathrm{e}_{s}^{-1}(\beta ))\) and is Borel measurable, for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \in \mathrm{e}_{s}(G_{a_{s}})\), \(h^{a_{s}}_{\beta }\) is a \({\mathsf {CD}}(\ell _s(\beta )^2 K,N)\) probability density on [0, 1] vanishing at the end-points, and the map \(\mathrm{e}_{s}(G_{a_{s}}) \times [0,1] \ni (\beta ,t) \mapsto h^{a_{s}}_{\beta }(t)\) is \({\mathfrak {q}}^{a_s} \otimes {\mathcal {L}}^1\llcorner _{[0,1]}\)-measurable.
Proof
We will abbreviate \(u = d_{a_s}\).
Step 1. We claim that
Indeed, if \(x \in \mathrm{e}_{[0,1]}(\gamma )\), then \(R_u(x) \supset \mathrm{e}_{[0,1]}(\gamma )\) by Lemma 10.3. But on the other hand, \(R_u(x) = R_u(\alpha )\) for all \(x \in R_u^b(\alpha )\), since any two transport rays intersecting in \({\mathcal {T}}_{u}^b\) must coincide by Corollary 7.9. Hence, if \(\exists x \in \mathrm{e}_{[0,1]}(\gamma ) \cap R_{u}^b(\alpha )\), the assertion follows.
Step 2. We also claim that
Indeed, since \(\alpha \in Q \subset {\mathcal {T}}_u^b\) then \(R_u(\alpha )\) is a transport ray by Lemma 7.8, and since \(u = d_{a_s}\) is affine (with slope 1) on a transport ray, \(R_{u}(\alpha )\) must intersect \(\{d_{a_s} = 0\}=\{\varphi _{s} = a_{s} \}\), and hence \(\mathrm{e}_{s} (G_{a_{s}})\), at most once. It follows by Step 1 that \(\gamma ^1_s = \gamma ^2_s\), and so by injectivity of \(\mathrm{e}_s|_G : G \rightarrow X\), that \(\gamma ^1 = \gamma ^2\).
Step 3. Denote
We claim that there exists a bijective map:
for which
Indeed, for all \(\alpha \in Q^1\), there exists precisely one \(\gamma \in G_{a_s}\) (and hence \(\gamma \in G^1_{a_s}\)) so that \(R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(\gamma ) \ne \emptyset \) by Step 2. And vice versa, given any \(\gamma \in G^1_{a_s}\), there is at least one \(\alpha \in Q\) (and hence \(\alpha \in Q^1\)) so that \(R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(\gamma ) \ne \emptyset \), and it follows by Step 1 that \(\mathrm{e}_{[0,1]}(\gamma ) \subset R_u(\alpha )\) and hence \({\mathcal {T}}_u^b \cap \mathrm{e}_{[0,1]}(\gamma ) \subset R_u^b(\alpha )\); but this means that for all \(\alpha \ne \beta \in Q\), \(R_u^b(\beta ) \cap \mathrm{e}_{[0,1]}(\gamma ) = \emptyset \), since \(\left\{ R_u^b(\beta )\right\} _{\beta \in Q}\) is a partition of \({\mathcal {T}}_u^b\), implying the uniqueness of \(\alpha \in Q^1\).
Moreover, we claim that the map \(\eta : (Q^1,{\mathcal {B}}(Q^1)) \rightarrow (G^1_{a_s} , {\mathcal {B}}(G^1_{a_s}))\) is measurable. Indeed, recall that \(G_{a_s}\) is compact, and since \((X,\mathsf {d})\) is proper, \({\mathcal {T}}_u^b\) and \(R_u^b\) are Borel, and hence \(G^1_{a_s}\) is analytic. Then write:
and
Note that \(\Lambda \) is analytic and that \(\Lambda (x)\) is either an empty set or a singleton for all \(x \in {\mathcal {T}}_u^b\) by Step 2 (and the fact that \(R_u^b\) is an equivalence relation on \({\mathcal {T}}_u^b\)). It follows that for any \(B \in {\mathcal {B}}(G_{a_s})\), both \(A_1 = P_1(\Lambda \cap ({\mathcal {T}}_u^b \times B))\) and \(A_2 = P_1(\Lambda \cap ({\mathcal {T}}_u^b \times (G_{a_s} \setminus B)))\) are analytic, disjoint and \(Q^1 = (Q^1 \cap A_1) \cup (Q^1 \cap A_2)\). By the Lusin separability principle [72, Theorem 4.4.1], there exists a Borel subset \(B_1 \subset {\mathcal {T}}_u^b\) containing \(A_1\) which is still disjoint from \(A_2\). Consequently \(\eta ^{-1}(B \cap G_{a_s}^1) = \eta ^{-1}(B) = Q^1 \cap A_1 = Q^1 \cap B_1 \in {\mathcal {B}}(Q^1)\), concluding the proof that \(\eta \) is Borel measurable on \(Q^1\).
Step 4. Recall that for all \(\alpha \in {{\bar{Q}}}\) of full \({\hat{{\mathfrak {q}}}}^{a_s}\) measure, \({\hat{{\mathfrak {m}}}}_\alpha ^{a_s}\) is supported on the transport ray \(R_u(\alpha ) = \overline{R_u^b(\alpha )}\) and \((R_u(\alpha ),\mathsf {d},{\hat{{\mathfrak {m}}}}_\alpha ^{a_s})\) verifies \({\mathsf {CD}}(K,N)\). Consequently, for such \(\alpha \)’s, \({\hat{{\mathfrak {m}}}}_\alpha ^{a_s}\) gives positive mass to any relatively open subset of \(R_u(\alpha )\) and does not charge points. It follows that for \(\alpha \in {{\bar{Q}}}\), since \(\mathrm{e}_{[0,1]}(\gamma ^\alpha ) \subset R_u(\alpha )\) has non-empty relative interior, it holds that:
In particular, \(Q^1\) coincides up to a \({\hat{{\mathfrak {q}}}}^{a_s}\)-null set with the \({\hat{{\mathfrak {q}}}}^{a_s}\)-measurable set \(Q^2 := \{ \alpha \in Q \; ; \; {\hat{{\mathfrak {m}}}}_\alpha ^{a_s}(\mathrm{e}_{[0,1]}(G_{a_s})) > 0 \}\), and thus \(Q^1\) is itself \({\hat{{\mathfrak {q}}}}^{a_s}\)-measurable. In fact, it is easy to see that \(Q^1\) coincides with an analytic set up to a \({\hat{{\mathfrak {q}}}}^{a_s}\)-null-set.
Step 5. Recalling that \(\mathrm{e}_{[0,1]}(G_{a_s}) \subset {\mathcal {T}}_u\) by Lemma 10.3 and that \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\) by Corollary 7.3, we obtain from (10.2) the following disintegration of \({\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_s})}\):
where the last two transitions and the measurability of \(\alpha \mapsto {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}(\mathrm{e}_{[0,1]}(\gamma ^\alpha )) > 0\) follow from Step 4. For all \(\alpha \in {{\bar{Q}}} \cap Q^1\), define the probability measure:
Since \(\mathrm{e}_{[0,1]}(\gamma ^\alpha )\) is a convex subset of \(R_u(\alpha )\), it follows that the one-dimensional m.m.s. \((\mathrm{e}_{[0,1]}(\gamma ^\alpha ), \mathsf {d},{\bar{{\mathfrak {m}}}}_{\alpha }^{a_s})\) verifies \({\mathsf {CD}}(K,N)\) and is of full support for all \(\alpha \in {{\bar{Q}}} \cap Q^1\). Similarly, define:
Step 6. Recall that our original disintegration (10.2) was on \((Q,{\mathscr {Q}},{\hat{{\mathfrak {q}}}}^{a_s})\), so that there exists \({\tilde{Q}} \subset Q\) of full \({\hat{{\mathfrak {q}}}}^{a_s}\) measure so that \({\tilde{Q}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) and \( {\mathscr {Q}} \supset {\mathcal {B}}({\tilde{Q}})\). It follows that we may find \({\mathscr {Q}} \ni {\tilde{Q}}^1 \subset Q^1\) with \({\hat{{\mathfrak {q}}}}^{a_s}(Q^1 \setminus {\tilde{Q}}^1) = 0\) so that \({\mathscr {Q}} \supset {\mathcal {B}}({\tilde{Q}}^1)\). Let us now push-forward the measure space \((Q^1 , {\mathscr {Q}} \cap Q^1 , {\bar{{\mathfrak {q}}}}^{a_s})\) via the Borel measurable map \(\mathrm{e}_s \circ \eta \) (by Step 3), yielding the measure space \((\mathrm{e}_s(G_{a_s}^1), {\mathscr {S}}, {\mathfrak {q}}^{a_s})\), which is thus guaranteed to satisfy \({\mathscr {S}} \supset {\mathcal {B}}({\tilde{S}})\), where \({\tilde{S}} := \mathrm{e}_s \circ \eta ({\tilde{Q}}^1)\) is of full \({\mathfrak {q}}^{a_s}\) measure. Restricting the space to \({\tilde{S}}\) and abusing notation, we obtain \(({\tilde{S}}, {\mathscr {S}}, {\mathfrak {q}}^{a_s})\) with \({\mathscr {S}} \supset {\mathcal {B}}({\tilde{S}})\), implying that \({\mathfrak {q}}^{a_s}\) is a Borel measure concentrated on \({\tilde{S}} \subset \mathrm{e}_s(G^1_{a_s}) \subset \mathrm{e}_s(G_{a_s})\). Note that \({\hat{{\mathfrak {q}}}}^{a_{s}}\), \({\bar{{\mathfrak {q}}}}^{a_s}\) and \({\mathfrak {q}}^{a_s}\) all have total mass \({\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}}))\).
Denoting \({\mathfrak {m}}_{\gamma ^\alpha _s}^{a_s} := {\bar{{\mathfrak {m}}}}_{\alpha }^{a_s}\), the disintegration from Step 5 translates to
Furthermore, for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), the m.m.s. \((\mathrm{e}_{[0,1]}(\mathrm{e}_s^{-1}(\beta )), \mathsf {d}, {\mathfrak {m}}_{\beta }^{a_s})\) verifies \({\mathsf {CD}}(K,N)\) and is of full support, and is therefore isometric to \((I^{a_{s}}_{\beta },\left| \cdot \right| , {{\hat{h}}}^{a_{s}}_{\beta } {\mathcal {L}}^1\llcorner _{I^{a_{s}}_{\beta }})\), where \(I^{a_{s}}_{\beta } : = [0,\ell _{s}(\beta )]\) and \({{\hat{h}}}^{a_{s}}_{\beta }\) is a \({\mathsf {CD}}(K,N)\) probability density on \(I^{a_{s}}_{\beta }\) (see Definition A.1). To prevent measurability issues, we will use the convention that \({{\hat{h}}}^{a_{s}}_{\beta }\) vanishes at the end-points of \(I^{a_{s}}_{\beta }\).
Step 7. Next, we observe that \(g^{a_s}\) is Borel. Indeed, note that by injectivity of \(\mathrm{e}_s\):
As \(G_{a_s}\) is compact, it follows that \({\text {graph}}(g^{a_s})\) is analytic, and hence (see [72, Theorem 4.5.2]) \(g^{a_s}\) is Borel measurable.
Step 8. It follows that \({\mathfrak {m}}_{\beta }^{a_s} = g^{a_s}(\beta ,\cdot )_{\sharp }(h^{a_{s}}_{\beta } {\mathcal {L}}^1\llcorner _{[0,1]})\), where:
Clearly \(h_{\beta }^{a_{s}}\) is now a \({\mathsf {CD}}(\ell _{s}(\beta )^{2}K,N)\) probability density on the interval [0, 1]. The only remaining task is to prove that the map \(\mathrm{e}_s(G_{a_s}) \times [0,1] \ni (\beta ,t) \mapsto h^{a_s}_\beta (t)\) is \({\mathfrak {q}}^{a_s} \otimes {\mathcal {L}}^1\llcorner _{[0,1]}\)-measurable. By measurability of the disintegration (10.3) (recall Definition 6.18), the map \(Q \ni \alpha \mapsto {\hat{{\mathfrak {m}}}}_\alpha ^{a_s}(B)\) is \({\hat{{\mathfrak {q}}}}^{a_s}\)-measurable for any Borel set \(B \subset X\). It follows that for any compact \(I \subset (0,1)\), the map:
is \({\mathfrak {q}}^{a_{s}}\)-measurable, where \(\alpha (\beta ) := (\mathrm{e}_s \circ \eta )^{-1}(\beta )\) is \({\mathfrak {q}}^{a_s}\)-measurable as a map \(({\tilde{S}},{\mathscr {S}},{\mathfrak {q}}^{a_s}) \rightarrow (Q , {\mathscr {Q}} , {\hat{{\mathfrak {q}}}}^{a_s})\) by the construction from Step 6. As \(h^{a_{s}}_{\beta }\) is continuous on (0, 1) for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), we know that for such \(\beta \) and all \(t \in (0,1)\):
It follows by [72, Proposition 3.1.27] that for all \(t \in (0,1)\), the map
is \({\mathfrak {q}}^{a_{s}}\)-measurable. As for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), the map \((0,1) \ni t \mapsto h^{a_{s}}_{\beta }(t)\) is continuous, [72, Theorem 3.1.30] confirms the required measurability.
This concludes the proof. \(\square \)
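The change of curvature parameter from K to \(\ell _s(\beta )^2 K\) in Step 8 is a pure scaling effect, which the following sketch illustrates. Two assumptions are made here for illustration only: the rescaling is taken to be \(h(t) := \ell \, {\hat{h}}(\ell t)\) with \(\ell = \ell _s(\beta )\) (indices dropped), and the densities are taken smooth and positive, so that the \({\mathsf {CD}}(K,N)\) condition on an interval reads \((h^{\frac{1}{N-1}})'' + \frac{K}{N-1} h^{\frac{1}{N-1}} \le 0\); the synthetic Definition A.1 is not reproduced.

```latex
\[
h(t) := \ell\, \hat h(\ell t), \quad t \in [0,1]
\qquad\Longrightarrow\qquad
\int_0^1 h(t)\, dt = \int_0^{\ell} \hat h(r)\, dr = 1 ,
\]
\[
\big( h^{\frac{1}{N-1}} \big)''(t)
= \ell^{\frac{1}{N-1}}\, \ell^2\, \big( \hat h^{\frac{1}{N-1}} \big)''(\ell t)
\le - \ell^{\frac{1}{N-1}}\, \ell^2\, \frac{K}{N-1}\, \hat h^{\frac{1}{N-1}}(\ell t)
= - \frac{\ell^2 K}{N-1}\, h^{\frac{1}{N-1}}(t) .
\]
```

Thus a \({\mathsf {CD}}(K,N)\) probability density on \([0,\ell ]\) becomes a \({\mathsf {CD}}(\ell ^2 K,N)\) probability density on [0, 1] after reparametrization by arc-length fraction.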
It will be convenient to invert the order of integration in (10.4) using Fubini’s Theorem:
We thus define
so that the final formula is
Remark 10.5
Since for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), the \({\mathsf {CD}}(\ell _s^2(\beta ) K , N)\) density \(h^{a_s}_\beta \) must be strictly positive on (0, 1) (see “Appendix”), by multiplying and dividing \({\mathfrak {q}}^{a_s}\) by the positive \({\mathfrak {q}}^{a_s}\)-measurable function \(\beta \mapsto h^{a_s}_\beta (s)\) (recall that \(s \in (0,1)\)), we may always renormalize and assume that \(h^{a_{s}}_{\beta }(s) = 1\). Note that this does not affect the definition of \({\mathfrak {m}}_{t}^{a_{s}}\) above. This normalization ensures that \({\mathfrak {m}}_s^{a_s} = {\mathfrak {q}}^{a_s}\) so that
Remark 10.6
Note that since \({\mathfrak {q}}^{a_{s}}\) is concentrated on \(\mathrm{e}_s(G_{a_s})\), by definition \({\mathfrak {m}}_{t}^{a_{s}}\) is concentrated on \(\mathrm{e}_t(G_{a_s})\) for all \(t \in (0,1)\). By Corollary 4.3, the latter sets are disjoint for different t’s in (0, 1) (recall that \(s \in (0,1)\) and that \(G \subset G^+_{\varphi }\)). Formula (10.5) can thus be seen again as a disintegration formula over a partition. In particular, for any \(s \in (0,1)\) and \(0< t , \tau < 1\) with \(t \ne \tau \), the measures \({\mathfrak {m}}^{a_{s}}_{t}\) and \({\mathfrak {m}}^{a_{s}}_{\tau }\) are mutually singular.
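The inversion of the order of integration leading to (10.5), and the effect of the normalization in Remark 10.5, can be sketched as follows. The explicit form \({\mathfrak {m}}^{a_s}_t = g^{a_s}(\cdot ,t)_{\sharp }\big ( h^{a_s}_{\cdot }(t)\, {\mathfrak {q}}^{a_s} \big )\) used below is an assumption inferred from Proposition 10.4 and Remark 10.6, since the displayed definition is not reproduced here; f denotes an arbitrary bounded Borel function on X.

```latex
\[
\begin{aligned}
\int_{\mathrm e_{[0,1]}(G_{a_s})} f \, d\mathfrak m
&= \int_{\mathrm e_s(G_{a_s})} \int_0^1 f\big(g^{a_s}(\beta,t)\big)\, h^{a_s}_\beta(t)\, dt \, \mathfrak q^{a_s}(d\beta) \\
&= \int_0^1 \int_{\mathrm e_s(G_{a_s})} f\big(g^{a_s}(\beta,t)\big)\, h^{a_s}_\beta(t)\, \mathfrak q^{a_s}(d\beta)\, dt
= \int_0^1 \Big( \int_X f \, d\mathfrak m^{a_s}_t \Big)\, dt .
\end{aligned}
\]
```

Under this reading, since \(g^{a_s}(\beta ,s) = \mathrm{e}_s(\mathrm{e}_s^{-1}(\beta )) = \beta \), the normalization \(h^{a_s}_\beta (s) = 1\) gives \({\mathfrak {m}}^{a_s}_s = {\mathfrak {q}}^{a_s}\) directly.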
Proposition 10.7
For any \(s\in (0,1)\) and \(a_{s} \in \varphi _s(\mathrm{e}_s(G))\), the map
is continuous in the weak topology, we have
and
for some \(C> 0\) depending only on K, N and \(c > 0\) from assumption (10.1).
Proof
Recall that the definition of \({\mathfrak {m}}^{a_{s}}_{t}\) does not depend on the last normalization we performed, when we imposed that \(h^{a_s}_\beta (s) = 1\), so we revert to the normalization that \(h^{a_s}_\beta \) is a \({\mathsf {CD}}(\ell _s(\beta )^2 K , N)\) probability density on [0, 1], and hence \(\left\| {\mathfrak {q}}^{a_{s}}\right\| = {\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}}))\). The second assertion follows since whenever the latter mass is positive, by positivity of a \({\mathsf {CD}}(K,N)\) density in the interior of its support (see “Appendix”):
Similarly, it follows by Lemma A.8, the lower semi-continuity of \(h^{a_s}_\beta \) at the end-points (see “Appendix”), and assumption (10.1), that \(\max _{t \in [0,1]} h^{a_{s}}_{\beta }(t)\) is uniformly bounded in \(a_{s}\) and \(\beta \) for \({\mathfrak {q}}^{a_{s}}\)-a.e. \(\beta \) by a constant \(C>0\) as above, implying that
yielding the third assertion.
Now note that the density \((0,1) \ni t \mapsto h^{a_{s}}_{\beta }(t)\) is continuous (see “Appendix”) for \({\mathfrak {q}}^{a_{s}}\)-a.e. \(\beta \), and the same trivially holds for the map \([0,1] \ni t \mapsto g^{a_{s}}(\beta ,t)\). We conclude by Dominated Convergence that for any \(f \in C_{b}(X)\) and any \(t \in (0,1)\):
yielding the first assertion, and concluding the proof. \(\square \)
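The Dominated Convergence step can be written out as below. As before, this is a sketch assuming \({\mathfrak {m}}^{a_s}_t = g^{a_s}(\cdot ,t)_{\sharp }\big ( h^{a_s}_{\cdot }(t)\, {\mathfrak {q}}^{a_s} \big )\), with \(f \in C_b(X)\) and \(\tau \rightarrow t\) inside (0, 1).

```latex
\[
\int_X f \, d\mathfrak m^{a_s}_{\tau}
= \int_{\mathrm e_s(G_{a_s})} f\big(g^{a_s}(\beta,\tau)\big)\, h^{a_s}_\beta(\tau)\, \mathfrak q^{a_s}(d\beta)
\;\xrightarrow[\tau \to t]{}\;
\int_{\mathrm e_s(G_{a_s})} f\big(g^{a_s}(\beta,t)\big)\, h^{a_s}_\beta(t)\, \mathfrak q^{a_s}(d\beta)
= \int_X f \, d\mathfrak m^{a_s}_{t} .
\]
```

The integrand converges for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \) by continuity of \(\tau \mapsto g^{a_s}(\beta ,\tau )\) and \(\tau \mapsto h^{a_s}_\beta (\tau )\), and it is dominated by the constant \(C \sup |f|\), which is integrable because \({\mathfrak {q}}^{a_s}\) has finite mass.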
10.2 \(L^{2}\) partition
For each \(t \in (0,1)\), we can find a natural partition of \(\mathrm{e}_{t}(G) \subset \mathrm{e}_t(G_\varphi )\) consisting of level sets of the time-propagated intermediate Kantorovich potentials \(\Phi _{s}^{t}\) introduced in Sect. 4. Recall that the function \(\Phi _s^t\) (\(s,t \in (0,1)\)) was defined as:
and interpreted on \(\mathrm{e}_{t}(G_\varphi )\) as the propagation of \(\varphi _s\) from time s to t along \(G_\varphi \), i.e. \(\Phi _{s}^{t} = \varphi _{s} \circ \mathrm{e}_{s} \circ \mathrm{e}_{t}^{-1}\). In particular, for any \(\gamma \in G\), \(\Phi _{s}^{t}(\gamma _{t}) = \varphi _{s}(\gamma _{s})\), and \(\mathrm{e}_{t} (G_{a_{s}}) \cap \mathrm{e}_{t} (G_{b_{s}}) = \emptyset \) as soon as \(a_{s} \ne b_{s}\) (see Corollary 4.1). It follows that for any \(s,t \in (0,1)\), we can consider the partition of the compact set \(\mathrm{e}_{t}(G)\) given by its intersection with the family \(\{ \Phi _{s}^{t}= a_{s} \}_{a_{s} \in {\mathbb {R}}}\); as usual, it will be sufficient to take \(a_{s} \in \Phi _{s}^{t}(\mathrm{e}_{t}(G)) = \varphi _{s}(\mathrm{e}_{s}(G))\).
Since \(\Phi _{s}^{t}\) is continuous, the Disintegration Theorem 6.19 yields the following essentially unique disintegration of \({\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\) strongly consistent with respect to the quotient-map \(\Phi _{s}^{t}\):
so that for \({\mathfrak {q}}^{t}_{s}\)-a.e. \(a_{s}\), \({\hat{{\mathfrak {m}}}}^{t}_{a_{s}}\) is a probability measure concentrated on the set \(\mathrm{e}_t(G) \cap \left\{ \Phi _s^t = a_s\right\} = \mathrm{e}_{t}(G_{a_{s}})\). By definition, \({\mathfrak {q}}^{t}_{s} = (\Phi _{s}^{t})_{\#} {\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\). To make this disintegration more explicit, we show:
Proposition 10.8
(1) For any \(s,t,\tau \in (0,1)\), the quotient measures \({\mathfrak {q}}^{t}_{s}\) and \({\mathfrak {q}}^{\tau }_s\) are mutually absolutely continuous.
(2) For any \(s,t \in (0,1)\), the quotient measure \({\mathfrak {q}}^{t}_{s}\) is absolutely continuous with respect to Lebesgue measure \({\mathcal {L}}^1\) on \({\mathbb {R}}\).
Proof
Recall that \({\mathfrak {q}}^{t}_{s} = (\Phi _{s}^{t})_{\#} {\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\).
(1) For any Borel set \(I \subset {\mathbb {R}}\), note that:
since \(\mu _t \ll {\mathfrak {m}}\) and its density \(\rho _t\) is assumed to be positive on \(\mathrm{e}_t(G)\) where \(\mu _t\) is supported (see Definition 10.1). But \(\mu _\tau = (\mathrm{e}_\tau \circ \mathrm{e}_t^{-1})_{\sharp } \mu _t\), and so:
It follows that \({\mathfrak {q}}^{t}_{s}(I) > 0\) iff \({\mathfrak {q}}^{\tau }_{s}(I) > 0\), thereby establishing the first assertion.
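The mechanism behind assertion (1) is that the \(\mu _t\)-mass of a level-set region does not depend on t. A sketch of this computation, using only \(\mu _t = (\mathrm{e}_t)_{\sharp } \nu \), the concentration of \(\nu \) on G (Assumption 10.2) and the identity \(\Phi _s^t(\gamma _t) = \varphi _s(\gamma _s)\) for \(\gamma \in G\):

```latex
\[
\mu_t\big( (\Phi_s^t)^{-1}(I) \big)
= \nu\big( \{ \gamma \in G \; ; \; \Phi_s^t(\gamma_t) \in I \} \big)
= \nu\big( \{ \gamma \in G \; ; \; \varphi_s(\gamma_s) \in I \} \big)
\qquad \forall\, t \in (0,1) .
\]
```

Since \({\mathfrak {q}}^{t}_{s}(I) = {\mathfrak {m}}(\mathrm{e}_t(G) \cap (\Phi _s^t)^{-1}(I))\) is positive exactly when the left-hand side is positive (because \(\rho _t > 0\) on \(\mathrm{e}_t(G)\)), positivity of \({\mathfrak {q}}^{t}_{s}(I)\) is independent of t.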
(2) Thanks to the first assertion, it is enough to only consider the case \(t=s\) in the second one. Recall that \(\Phi _{s}^{s} = \varphi _{s}\). Then the claim boils down to showing that \({\mathfrak {m}}(\varphi _{s}^{-1} (I) \cap \mathrm{e}_{s}(G)) = 0\) whenever \(I \subset \varphi _{s}(\mathrm{e}_{s}(G))\) is a compact set with \({\mathcal {L}}^{1}(I) = 0\).
By compactness, we fix a ball \(B_{r}(o)\) containing \(\mathrm{e}_{s}(G)\). Since \(\varphi _{s}\) is Lipschitz continuous on bounded sets (Corollary 3.10 (1)), possibly using a cut-off Lipschitz function over \(B_{r}(o)\), we may assume that \(\varphi _{s}\) has bounded total variation measure \(\Vert D \varphi _{s} \Vert \) (we refer to [54] and [9] for all missing notions and background regarding BV-functions on metric-measure spaces). From the local Poincaré inequality (see Remark 7.5 and [54, page 992]) and the doubling property (see Lemma 6.12 and recall that \(\text {supp}({\mathfrak {m}}) = X\)), it follows that the total variation measure of \(\varphi _{s}\) is absolutely continuous with respect to \({\mathfrak {m}}\), and that
(see [54, page 992] or [12, Section 4]), where
By [31, Theorem 6.1], the previous quantity in fact coincides in our setting with the pointwise Lipschitz constant of \(\varphi _{s}\) at x, which in turn coincides with \(\ell ^+_s(x)\) by [6, Theorem 3.6]; hence for \(x = \gamma _s\) we have \(|\nabla \varphi _{s}|(x) =\ell _s(x)\). By the co-area formula (see [54, Proposition 4.2]), for any Borel set \(A \subset B_{r}(o)\):
where \(\Vert \partial \{ \varphi _{s} > \tau \} \Vert \) denotes the total variation measure associated to the set of finite perimeter \(\{ \varphi _{s} > \tau \}\). From [1, Theorem 5.3] it follows that \(\Vert \partial \{ \varphi _{s} > \tau \} \Vert \) is concentrated on \(\{\varphi _{s} = \tau \}\) and therefore, for any Borel set \(I \subset \varphi _{s}(\mathrm{e}_{s}(G))\) with \({\mathcal {L}}^{1}(I) = 0\), it follows by (10.9) and (10.8):
Since \(|\nabla \varphi _{s}| = \ell _s(x) > 0\) on \(\mathrm{e}_{s}(G)\), it follows that \({\mathfrak {m}}(\varphi _{s}^{-1}(I) \cap \mathrm{e}_{s}(G)) = 0\), thereby concluding the proof. \(\square \)
Remark 10.9
Inspecting the proof of Proposition 10.8, from the co-area formula ([54, Proposition 4.2]) and the Hausdorff representation of the perimeter measure ([1, Theorem 5.3]), it follows that for \({\mathfrak {q}}^{s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\) the measure \({\mathfrak {m}}^{s}_{a_{s}}\) is absolutely continuous with respect to the Hausdorff measure of codimension one (see [1] for more details).
Employing the previous proposition, we define:
obtaining from (10.7) the following disintegration (for every \(s,t \in (0,1)\)):
with \({\mathfrak {m}}^{t}_{a_{s}}\) concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\), for \({\mathcal {L}}^1\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\).
We now shed light on the relation of the above disintegration to \(L^2\)-Optimal-Transport, by relating it to another disintegration formula for \(\nu \), the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\). Observe that the family of sets \(\{G_{a_{s}}\}_{a_{s}\in {\mathbb {R}}}\) is a partition of G and that \(G_{a_{s}} = \left\{ \varphi _{s} \circ \mathrm{e}_{s} = a_{s}\right\} \). Since the quotient-map \(\varphi _{s} \circ \mathrm{e}_{s} : \mathrm{Geo}(X) \rightarrow {\mathbb {R}}\) is continuous and G is compact, the Disintegration Theorem 6.19 ensures the existence of an essentially unique disintegration of \(\nu \) strongly consistent with \(\varphi _{s} \circ \mathrm{e}_{s}\):
so that for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the probability measure \(\nu _{a_s}\) is concentrated on \(G_{a_s}\). Clearly \({\mathfrak {q}}^{\nu }_{s}(\varphi _{s}(\mathrm{e}_{s}(G))) = \left\| \nu \right\| =1\).
Corollary 10.10
(1) For any \(s \in (0,1)\), the quotient measure \({\mathfrak {q}}^{\nu }_{s}\) is mutually absolutely continuous with respect to \({\mathfrak {q}}^{s}_{s}\), and in particular it is absolutely continuous with respect to \({\mathcal {L}}^{1}\).
(2) For any \(s,t \in (0,1)\) and \({\mathcal {L}}^1\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\):
$$\begin{aligned} \rho _{t} \cdot {\mathfrak {m}}^{t}_{a_{s}} = q^{\nu }_s(a_{s}) \cdot (\mathrm{e}_{t})_{\#} \nu _{a_{s}} , \end{aligned}$$ (10.12)
where \(q^{\nu }_s := d {\mathfrak {q}}^{\nu }_{s} / d{\mathcal {L}}^{1}\). In particular, \({\mathfrak {m}}^{t}_{a_{s}}\) and \((\mathrm{e}_{t})_{\#} \nu _{a_{s}}\) are mutually absolutely-continuous for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\).
(3) In particular, for any \(s \in (0,1)\) and \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the map:
$$\begin{aligned}{}[0,1] \ni t \mapsto \rho _{t} \cdot {\mathfrak {m}}^{t}_{a_{s}} \end{aligned}$$
coincides for \({\mathcal {L}}^{1}\)-a.e. \(t \in [0,1]\) with the \(W_2\)-geodesic \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu _{a_{s}}\) up to a positive multiplicative constant depending only on \(a_{s}\).
Proof
Recall that \(\mu _s \ll {\mathfrak {m}}\) is supported on \(\mathrm{e}_s(G)\) and \(\rho _{s} > 0\) there (see Definition 10.1), so that \(\mu _s\) and \({\mathfrak {m}}\llcorner _{\mathrm{e}_{s}(G)}\) are mutually absolutely-continuous. It immediately follows that the same holds for \((\varphi _{s} )_{\#} \mu _s\) and \({\mathfrak {q}}^s_s = (\varphi _{s} )_{\#} {\mathfrak {m}}\llcorner _{\mathrm{e}_{s}(G)}\). But:
establishing (1).
Denoting the resulting probability density \(q^{\nu }_s := d {\mathfrak {q}}^{\nu }_{s} / d{\mathcal {L}}^{1}\), (10.11) translates to:
Pushing forward both sides via the evaluation map \(\mathrm{e}_t\) given \(t \in (0,1)\), we obtain:
with \(q^{\nu }_s(a_{s}) \cdot (\mathrm{e}_{t})_{\#}\nu _{a_{s}}\) concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\) for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\). On the other hand, multiplying both sides of (10.10) by \(\rho _t\) (which is supported on \(\mathrm{e}_t(G)\)), we obtain
with \(\rho _t \cdot {\mathfrak {m}}^{t}_{a_{s}}\) concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\) for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\). By the essential uniqueness of the disintegration (Theorem 6.19), noting that \(\varphi _s(\mathrm{e}_s(G))\) is compact, (10.12) immediately follows. As \(\rho _t > 0\) on \(\mathrm{e}_t(G)\) (see Definition 10.1) and \(q^\nu _s(a_s) \in (0,\infty )\) for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the “in particular” part of (2) is also established.
Finally, by Fubini’s theorem, it follows that for each \(s \in (0,1)\) and \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), (10.12) holds with \(q^\nu _s(a_s) \in (0,\infty )\) for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\). Note that for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the curve \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu _{a_{s}}\) is a \(W_2\)-geodesic (since \(\nu _{a_s}\) is concentrated on \(G_{a_s} \subset G\)). This establishes (3), thereby concluding the proof. \(\square \)
11 Comparison between conditional measures
So far we have proved, under Assumption 10.2, that for each \(s \in (0,1)\) we have the following two families of disintegrations:
for each \(t \in (0,1)\) and each \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), respectively, corresponding to the partitions:
Moreover, both \({\mathfrak {m}}_{a_{s}}^{t}\) and \({\mathfrak {m}}_{t}^{a_{s}}\) are concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\), for each \(t \in (0,1)\) for \({\mathcal {L}}^1\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), and for each \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) and all \(t \in (0,1)\), respectively, so that the above disintegrations are strongly consistent with respect to the corresponding partition. In addition, we have by (10.6) and (10.12) for all \(s,t \in (0,1)\) and a.e. \(a_s \in \varphi _s(G_{a_s})\):
The goal of the first subsection, in which we retain Assumption 10.2, is to prove that \({\mathfrak {m}}^{t}_{a_{s}}\) and \({\mathfrak {m}}_{t}^{a_{s}}\) are in fact equivalent measures. We will prove in particular that for all \(s \in (0,1)\):
A heuristic formal argument for establishing (11.3) may be seen as follows. Writing \(\Phi _s^t(x) = \Phi _s(t,x)\), we have:
Formally applying the coarea formula (assuming spatial regularity), we have:
where the last transition follows by the implicit function theorem \(\nabla _x \Phi _s + \partial _t \Phi _s \cdot \nabla _x\Phi ^{-1}_s = 0\).
In the second subsection, we deduce the change-of-variables formula (1.6) for the density along geodesics, discarding Assumption 10.2. An insightful heuristic argument may be seen by combining (11.2) and (11.3) as follows:
11.1 Equivalence of conditional measures
Recall that Assumption 10.2 is still in force in this subsection. We start with the following auxiliary:
Lemma 11.1
For every \(s,t\in (0,1)\) and \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the following limit:
holds true in the weak topology. Moreover, for any \(f \in C_b(X)\), the map \(\varphi _{s}(\mathrm{e}_{s}(G)) \ni a_{s} \mapsto \int _{X} f {\mathfrak {m}}_{t}^{a_{s}}\) is Borel.
Proof
By Proposition 10.7, \((0,1) \ni t\mapsto {\mathfrak {m}}^{a_{s}}_{t}\) is continuous in the weak topology, and so together with (11.1), we see that for any \(f \in C_b(X)\):
thereby concluding the proof of the first assertion. For the second assertion, given a compact set \(I \subset [0,1]\), consider the compact set
Hence \(B := P_{1,4}(K) = \{ (\mathrm{e}_I(G_{a_s}) , a_s) : a_s \in \varphi _{s}(\mathrm{e}_s(G)) \} \) is compact as well. It follows by Fubini’s theorem that the map \(\varphi _{s}(\mathrm{e}_{s}(G)) \ni a_{s} \mapsto \int _{\mathrm{e}_I(G_{a_s})} f \, {\mathfrak {m}}\) is Borel. Taking \(I = [t-\varepsilon ,t+\varepsilon ]\), employing the first assertion, and recalling that the pointwise limit of Borel functions is Borel, the second assertion follows. \(\square \)
Remark 11.2
One may similarly show (employing an additional density argument) that for every \(s,t\in (0,1)\) and \({\mathcal {L}}^{1}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the following limit:
holds true in the weak topology, but this will not be required.
We now find explicit expressions for the densities.
Theorem 11.3
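For any \(s \in (0,1)\),
$$\begin{aligned} {\mathfrak {m}}_{s}^{a_{s}} = \ell _s^2 \cdot {\mathfrak {m}}^{s}_{a_{s}} \quad \text {for } {\mathcal {L}}^1\text {-a.e. } a_s \in \varphi _{s}(\mathrm{e}_{s}(G)) . \end{aligned}$$
Moreover, for any \(s \in (0,1)\) and \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) including at \(t=s\), \(\partial _{t}\Phi _{s}^{t}(x)\) exists and is positive for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) and \({\mathfrak {m}}_{a_{s}}^{t}\)-a.e. x, and we have:
$$\begin{aligned} {\mathfrak {m}}_{t}^{a_{s}} = \partial _{t}\Phi _{s}^{t} \cdot {\mathfrak {m}}^{t}_{a_{s}} \quad \text {for } {\mathcal {L}}^1\text {-a.e. } a_s \in \varphi _{s}(\mathrm{e}_{s}(G)) . \end{aligned}$$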
For the ensuing proof, it will be convenient to introduce the following notation. For all \(t_0 \in {\mathbb {R}}\) and \(x_0 \in X\), denote
Recall that G(x) denotes the section \(\left\{ t \in [0,1] \; ; \; \exists \gamma \in G \;,\;\gamma _t = x\right\} \) and \(\mathring{G}(x) = G(x) \cap (0,1)\).
Proof of Theorem 11.3
Step 1. Fix \(s,t \in (0,1)\). By Lemma 11.1 and the boundedness of \(\Vert {\mathfrak {m}}_{\tau }^{a_{s}}\Vert \) uniformly in \(a_{s}\) and \(\tau \in [0,1]\) (see Proposition 10.7), it is easy to deduce (e.g. by the Dominated Convergence Theorem) the following limit of measures on \(\varphi _{s}(\mathrm{e}_{s}(G)) \times X\) in the weak topology (i.e. in duality with \(C_{b}(\varphi _{s}(\mathrm{e}_{s}(G)) \times X)\)):
Using Fubini’s Theorem and (11.1), we proceed as follows:
Moreover, we claim that it is enough to integrate on \(\mathrm{e}_{t}(G)\) above:
To see this, recall that by Proposition 4.4 (3) (relying on Theorem 3.11 (2)), the map \((t-\varepsilon ,t+\varepsilon )\cap \mathring{G}(x) \ni \tau \rightarrow \Phi _{s}^{\tau }(x)\) is Lipschitz with Lipschitz constant bounded uniformly in \(\varepsilon \in (0,t/2 \wedge (1-t)/2)\) and \(x \in \cup _{|\tau - t|<\varepsilon } \mathrm{e}_{\tau }(G)\) (recall that for any \(\gamma \in G\), \(\ell (\gamma )\le 1/c\)); we denote the latter Lipschitz bound by L. Hence the family of measures
is bounded in the total-variation norm by L, uniformly in \(\varepsilon \) and x as above. But by continuity:
and so we can modify the domain of integration in (11.6) yielding (11.7).
Step 2. Fixing \(x \in \mathrm{e}_t(G)\), we now focus on the weak limit:
Recall that \((t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x) \ni \tau \mapsto \Phi _{s}^{\tau }(x)\) has Lipschitz constant bounded by L, and moreover, is increasing by Proposition 4.4 (3). Now extend it to the entire (0, 1) while preserving (non-strict) monotonicity and the bound on the Lipschitz constant, e.g. \({\hat{\Phi }}_{s}^{\tau }(x) := \inf _{r \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} \Phi _{s}^{r}(x) + L (\tau - r)_+\). Then for any \(f \in C_{b}({\mathbb {R}})\), by the change-of-variables formula for (monotone) Lipschitz functions:
the last transition follows since \(\tau \mapsto \Phi _{s}^{\tau }(x)\) is differentiable a.e. on \(D_{\ell }(x)\) and hence \(\partial _{\tau } \Phi _{s}^{\tau }(x) = \partial _{\tau } \Phi _{s}^{\tau }(x)|_{(t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)}\) for a.e. \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) by Remark 2.1, and in addition since \(\partial _{\tau } \Phi _{s}^{\tau }(x)|_{(t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} = \partial _{\tau } {\hat{\Phi }}_{s}^{\tau }(x)\) for a.e. \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) by Remark 2.2. Recall that Proposition 4.4 ensures that for all \(x \in X\), \(\partial _t \Phi _{s}^{t}(x)\) exists for \({\mathcal {L}}^1\)-a.e. \(t \in \mathring{G}(x)\), including at \(t=s\) if \(s \in \mathring{G}(x)\) (in which case \(\partial _t \Phi _{s}^{t}|_{t = s} = \ell _s^2(x)\)). Moreover, Corollary 4.5 and our assumption that \(G \subset G_\varphi ^+\) ensure that \(\partial _t \Phi _{s}^{t}(x) > 0\) for \({\mathcal {L}}^1\)-a.e. \(t \in \mathring{G}(x)\), including at \(t=s\). Applying Fubini’s theorem, we have
It follows that for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\), \(\partial _t \Phi _{s}^{t}(x)\) exists and is positive for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\) (including at \(t=s\) for all \(x \in \mathrm{e}_s(G)\)).
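The one-dimensional change-of-variables fact invoked in Step 2 can be recorded in the following general form. This is only a sketch in notation of our own: the interval \((a,b)\) and the function \(\Theta \) are placeholders and not objects from the proof. If \(\Theta : (a,b) \rightarrow {\mathbb {R}}\) is non-decreasing and Lipschitz on a bounded interval, then \(\Theta '\) exists \({\mathcal {L}}^1\)-a.e., the one-sided limits \(\Theta (a^+)\) and \(\Theta (b^-)\) are finite, and for any \(f \in C_b({\mathbb {R}})\):
$$\begin{aligned} \int _a^b f(\Theta (\tau )) \, \Theta '(\tau ) \, d\tau = \int _{\Theta (a^+)}^{\Theta (b^-)} f(y) \, dy . \end{aligned}$$
Indeed, setting \(F(y) := \int _0^y f\), the composition \(F \circ \Theta \) is Lipschitz, with \((F \circ \Theta )'(\tau ) = f(\Theta (\tau )) \Theta '(\tau )\) at every differentiability point \(\tau \) of \(\Theta \), and the identity follows by integrating this derivative. In Step 2 the role of \(\Theta \) is played by the extension \(\tau \mapsto {\hat{\Phi }}_{s}^{\tau }(x)\).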
Step 3. We now claim that for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including \(t=s\), if \(f \in C_b({\mathbb {R}})\) and \(\Psi \in C_b(X)\) then
To this end, we will show that for such t’s, both
and
tend to 0 in \(L^1(\mathrm{e}_t(G),{\mathfrak {m}})\) as \(\varepsilon \rightarrow 0\).
Step 4. To see the claim about \(\text {I}_\varepsilon \), since \(\left| \partial _\tau \Phi _s^\tau (x)\right| \le L\) (uniformly in \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) and \(x \in \mathrm{e}_t(G)\)), it is clear that \(\lim _{\varepsilon \rightarrow 0} \text {I}_\varepsilon (x) = 0\) pointwise by continuity of f and \(\mathring{G}(x) \ni \tau \mapsto \Phi ^\tau _s(x)\) (see Proposition 4.4). To obtain convergence in \(L^1(\mathrm{e}_t(G),{\mathfrak {m}})\), it is therefore enough to show by Dominated Convergence that
uniformly in \(x \in \mathrm{e}_t(G)\). Since f is uniformly continuous on the compact set \(\varphi _s(\mathrm{e}_s(G))\), the uniform estimate (11.8) follows since \(\mathring{G}(x) \ni \tau \mapsto \Phi ^\tau _s(x)\) is Lipschitz on \([\delta ,1-\delta ]\), with Lipschitz constant depending only on \(\delta > 0\) and an upper bound on \(\{ \ell (\gamma ) \; ; \; \gamma \in G \}\) (see Proposition 4.4 (3) and Theorem 3.11 (2)).
Step 5. To see the claim about \(\text {II}_\varepsilon \), it is clearly enough to show that
Step 5a. We first establish (11.9) for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) (independently of f and \(\Psi \)). Since \(\partial _\tau \Phi _s^\tau (x) \le L\) uniformly in \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) and \(x \in \mathrm{e}_t(G)\), by Dominated Convergence, it is enough to establish pointwise convergence in (11.9) for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\).
For every \(x \in X\), denote
By Proposition 4.4 (based on Theorem 3.11), we know that for every \(x \in X\), the map \(\tau \mapsto \partial _{\tau }\Phi _{s}^{\tau }(x)\) is in \(L^{\infty }_{loc}(\mathring{G}(x))\), and so by Lebesgue’s Differentiation Theorem, \({\mathcal {L}}^{1}(\mathring{G}(x) \setminus Leb(x)) = 0\). Integrating over \({\mathfrak {m}}\) and applying Fubini’s Theorem, it follows that for \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\):
thereby establishing (by definition) the pointwise convergence in (11.9) for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\).
Step 5b. We next establish (11.9) at \(t=s\). Write
The first expression tends to 0 pointwise for all \(x \in X\) by Lemma 4.6, and hence by Dominated Convergence also in \(L^1(\mathrm{e}_t(G),{\mathfrak {m}})\) (since \(\left| \partial _\tau \Phi _s^\tau (x)\right| \le L\) and \(\ell _s(x) \le 1/c\) uniformly). The second expression tends to 0 in \(L^1(\mathrm{e}_t(G),{\mathfrak {m}})\) by Proposition 9.7 and the uniform boundedness of \(\ell _s^2(x)\).
Step 6. In other words, we have verified in Steps 3-5 the following weak convergence, for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including at \(t=s\):
where we recall that \(\Phi _s^s(x) = \varphi _s(x)\) and \(\partial _t \Phi _s^t |_{t=s} = \ell _{s}^{2}(x)\). Combining this with Step 1, we deduce that:
Integrating this identity against \(1 \otimes \psi \) with \(1 \in C_b({\mathbb {R}})\) and \(\psi \in C_{b}(X)\), we obtain:
where we used that \({\mathfrak {m}}^{a_{s}}_{t}\) is concentrated on \(\mathrm{e}_t(G_{a_s}) \subset \mathrm{e}_t(G)\) for all \(t \in (0,1)\) and \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) in the first expression, and the disintegration (11.1) of \({\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\) in the last transition. In other words, we obtained for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including at \(t=s\):
Since \({\mathfrak {m}}_{a_s}^t\) is also concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\) for all \(t \in (0,1)\) and \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\), the assertion follows by essential uniqueness of consistent disintegrations (Theorem 6.19). Note that by Step 2, \(\partial _{t}\Phi _{s}^{t}(x)\) exists and is positive for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including at \(t=s\) for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\), and so by (11.1), the same holds for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) and \({\mathfrak {m}}_{a_{s}}^{t}\)-a.e. x. \(\square \)
11.2 Change-of-variables formula
We now obtain the following main result of Sects. 10 and 11. At this time, we dispense with Assumption 10.2.
Theorem 11.4
(Change-of-Variables) Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. verifying \({\mathsf {CD}}^{1}(K,N)\) with \(\text {supp}({\mathfrak {m}}) = X\), and let \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\). Let \(\nu \) denote the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\), and set \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\) for all \(t \in (0,1)\).
Then there exist versions of the densities \(\rho _t := d\mu _t / d{\mathfrak {m}}\), \(t \in [0,1]\), so that for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\), (9.4) holds for all \(0 \le s \le t \le 1\), and in particular, for \(\nu \)-a.e. \(\gamma \), \(t \mapsto \rho _t(\gamma _t)\) is positive and locally Lipschitz on (0, 1), and upper semi-continuous at \(t=0,1\).
Moreover, for any \(s\in (0,1)\), for \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) and \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), \(\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})\) exists, is positive, and the following change-of-variables formula holds:
Here \(\varphi \) denotes a Kantorovich potential associated to the c-optimal-transport problem between \(\mu _0\) and \(\mu _1\) with cost \(c = \mathsf {d}^2/2\), and \(\Phi _s^t\) denotes the time-propagated intermediate Kantorovich potential introduced in Sect. 4; \(h^{\varphi _{s}(\gamma _{s})}_{\gamma _{s}}\) is the \({\mathsf {CD}}(\ell (\gamma )^2 K ,N)\) density on [0, 1] from Proposition 10.4, after applying the re-normalization from Remark 10.5, so that \(h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}(s) = 1\). In particular, for \(\nu \text {-a.e. } \gamma \in G_\varphi ^+\), the above change-of-variables formula holds for \({\mathcal {L}}^{1}\)-a.e. \(t,s \in (0,1)\).
Lastly, for all \(\gamma \in G_\varphi ^0\), we have
Recall that \(\nu \) is concentrated on \(G_\varphi = G_\varphi ^+ \cup G_\varphi ^0\), where \(G_\varphi ^+\) and \(G_\varphi ^0\) denote the subsets of positive and zero length \(\varphi \)-Kantorovich geodesics, respectively. Note that \(\partial _{t}|_{t=s}\Phi _{s}^{t}(\gamma _{s}) = \ell _s^2(\gamma _s) = \ell ^2(\gamma )\) by Proposition 4.4, so that together with our normalization that \(h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}(s) = 1\), we see that both sides of (11.10) are indeed equal to 1 for \(t=s\).
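The consistency check at \(t=s\) can be spelled out factor by factor. The sketch below assumes only that (11.10) is assembled from the three quantities named in the theorem (the density ratio, the derivative of \(\Phi _s^\tau \) normalized by \(\ell ^2(\gamma )\), and the density h); it does not depend on how these are arranged on the two sides:
$$\begin{aligned} \frac{\rho _s(\gamma _s)}{\rho _t(\gamma _t)} \Big |_{t=s} = 1 , \quad \frac{\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})}{\ell ^2(\gamma )} \Big |_{t=s} = \frac{\ell _s^2(\gamma _s)}{\ell ^2(\gamma )} = 1 , \quad h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}(t) \Big |_{t=s} = 1 . \end{aligned}$$
The first equality is trivial, the second is Proposition 4.4, and the third is the normalization from Remark 10.5.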
Proof of Theorem 11.4
Step 0. As usual, by Proposition 8.9 and Remark 8.11, \((X,\mathsf {d},{\mathfrak {m}})\) also verifies \({\mathsf {MCP}}(K,N)\), and so Theorem 6.15 and all the results of Sect. 9 apply. We will use the versions of the densities given by Corollary 9.5. On \(X^0 = \mathrm{e}_{[0,1]}(G_\varphi ^0)\), we know by Corollary 9.8 that \(\mu _0\llcorner _{X^0} = \mu _1\llcorner _{X^0} = \mu _t\llcorner _{X^0}\) for all \(t \in [0,1]\), and so if necessary, we simply redefine \(\rho _t|_{X^0} := \rho _0|_{X^0}\) for all \(t \in (0,1]\), so that (11.11) holds. Note that by Lemma 3.15, this will not affect \((0,1) \ni t \mapsto \rho _t(\gamma _t)\) for all \(\gamma \in G_{\varphi }^+\), and Corollary 9.8 (applied to the pair \(\mu _1,\mu _0\)) ensures that the same is true for \(\nu \)-a.e. \(\gamma \in G_{\varphi }^+\) at \(t=1\).
Step 1. As explained in the beginning of Sect. 10, by inner regularity of Radon measures, Corollary 9.5 (applied to both pairs \(\mu _0,\mu _1\) and \(\mu _1,\mu _0\)), Proposition 9.7 and Corollary 6.16, there exists a good compact subset \(G^{\varepsilon } \subset G^+_{\varphi }\) with \(\nu (G^{\varepsilon }) \ge \nu (G_\varphi ^+)-\varepsilon \) for any \(\varepsilon > 0\) (recall Definition 10.1). Of course, we may assume that \(G^{\varepsilon }\) is increasing as \(\varepsilon \) decreases to 0 (say, along a fixed sequence). Fixing \(\varepsilon > 0\) and a good \(G^{\varepsilon }\), denote \(\nu ^{\varepsilon } = \frac{1}{\nu (G^\varepsilon )} \nu \llcorner _{G^{\varepsilon }}\) and \(\mu _t^{\varepsilon } := (\mathrm{e}_t)_{\sharp } \nu ^{\varepsilon } \ll {\mathfrak {m}}\), so that all of the results of Sects. 10 and 11.1 apply to \(\nu ^{\varepsilon }\). Note that by Corollary 6.16, we have that \(\mu ^{\varepsilon }_t = \frac{1}{\nu (G^\varepsilon )} (\mu _t)\llcorner _{\mathrm{e}_t(G^{\varepsilon })}\) for all \(t \in [0,1]\), and therefore
Also note that as \(\nu ^{\varepsilon }\) is concentrated on \(G^{\varepsilon } \subset G_\varphi \), \(\varphi \) is still a Kantorovich potential for the associated transport-problem.
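Step 2. Recall that by Corollary 10.10 (3), for each \(s \in (0,1)\) and \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^\varepsilon ))\), the map
$$\begin{aligned}{}[0,1] \ni t \mapsto \rho ^{\varepsilon }_{t} \cdot {\mathfrak {m}}^{\varepsilon ,t}_{a_{s}} \end{aligned}$$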
coincides for \({\mathcal {L}}^{1}\)-a.e. \(t \in [0,1]\) with the geodesic \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu ^{\varepsilon }_{a_{s}}\) up to a (positive) constant \(C^{\varepsilon }_{a_s}\) depending on \(a_{s}\), where \(\nu ^{\varepsilon }_{a_s}\) is the conditional measure from the disintegration in (10.11). Consequently, for such s and \(a_s\), for \({\mathcal {L}}^{1}\)-a.e. \(t \in [0,1]\) and any Borel \(H \subset G^{\varepsilon }_{a_{s}}\), the quantity
is constant (where we used the fact that \(\mathrm{e}_t|_{G^{\varepsilon }} : G^{\varepsilon } \rightarrow X\) is injective).
By Theorem 11.3, for \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) and \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _s(G^{\varepsilon }_s)\) (and hence for \({\mathfrak {q}}_s^{\varepsilon ,s}\)-a.e. \(a_s \in \varphi _s(G_s^{\varepsilon })\) by Proposition 10.8), \(\partial _{t}\Phi _{s}^{t}(x)\) exists and is positive for \({\mathfrak {m}}^{\varepsilon ,t}_{a_s}\)-a.e. x, and \({\mathfrak {m}}_{t}^{\varepsilon ,a_{s}} = \partial _{t}\Phi _{s}^{t} \cdot {\mathfrak {m}}^{\varepsilon ,t}_{a_{s}}\). It follows that for those t and \(a_s\) for which this representation and (11.12) hold true:
where the second transition follows from our normalization and Remark 10.5, ensuring that \({\mathfrak {m}}_t^{\varepsilon ,a_s} = (g^{a_{s}}(\cdot ,t))_{\sharp }\, ( h^{a_{s}}_{\cdot }(t) {\mathfrak {m}}_s^{\varepsilon ,a_{s}})\), and the last transition follows from Theorem 11.3.
Note that g and h above do not depend on \(\varepsilon > 0\). For g, this follows by its very definition as \(g^{a_s}(\beta ,t) = \mathrm{e}_t(\mathrm{e}_s^{-1}(\beta ))\) (and the injectivity of \(\mathrm{e}_s|_{G_\varepsilon }\) for all \(\varepsilon > 0\)). For h, this immediately follows by inspecting the proof of Proposition 10.4, where \(h^{a_s}_{\gamma _s}(t)\) was uniquely defined (for \(t \in (0,1)\)) as the continuous version of the density of \({\hat{{\mathfrak {m}}}}^{a_s}_\alpha \) from (10.2) after conditioning it on \(\mathrm{e}_{[0,1]}(\gamma )\) and pulling it back to the interval [0, 1], where \(\alpha \in Q^{1,\varepsilon }\) was bijectively identified with \(\gamma \in G_{a_s}^{\varepsilon ,1}\) via \(\eta ^{\varepsilon }\); as \(Q^{1,\varepsilon }\) and \(G_{a_s}^{\varepsilon ,1}\) clearly increase as \(\varepsilon \) decreases to 0, with \(\eta ^{\varepsilon }|_{Q^{1,\varepsilon '}} = \eta ^{\varepsilon '}\) for \(0< \varepsilon < \varepsilon '\), we verify that h indeed does not depend on \(\varepsilon >0\).
Step 3. As the left-hand-side of (11.13) does not depend on t, it follows that for all \(s \in (0,1)\) and for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\) (both of which we fix for the time being), there exists a subset \(T \subset (0,1)\) of full \({\mathcal {L}}^1\) measure, so that for all \(H \subset G^{\varepsilon }_{a_s}\):
is constant. As any Borel subset of \(\mathrm{e}_s(G_{a_s})\) may be written as \(\mathrm{e}_s(H)\), equality of measures follows, and hence equality of densities for \({\mathfrak {m}}_{a_{s}}^{\varepsilon ,s}\)-a.e. \(\beta \). We have therefore proved that for \(t,t' \in T\):
for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\), where \(\gamma = \gamma ^\beta = \mathrm{e}_s^{-1}(\beta ) = g^{a_s}(\beta ,\cdot )\in G^{\varepsilon }_{a_{s}}\), with the exceptional set depending on \(t,t'\). Note that given \(t' \in T\), \(\partial _{\tau }|_{\tau =t'}\Phi _{s}^{\tau }(\gamma ^\beta _{t'})\) indeed exists for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\) by Corollary 10.10 (2).
It follows that for all \(t \in T\), for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\), (11.14) holds simultaneously for a countable sequence \(t' \in T^t \subset T\) which is dense in (0, 1). Taking the limit in (11.14) as \(T^t \ni t' \rightarrow s\), using Proposition 4.4 (5) which entails:
employing the continuity of \((0,1) \ni t' \mapsto h^{a_{s}}_{\gamma _s}(t')\), our normalization \(h^{a_{s}}_{\gamma _s}(s) = 1\), and the continuity of \((0,1) \ni t' \mapsto \rho ^{\varepsilon }_{t'}(\gamma _{t'})\) (as \(G^{\varepsilon }\) is good), it follows that for all \(s \in (0,1)\), for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\) and \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\):
for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\), with \(\gamma = \mathrm{e}_s^{-1}(\beta ) \in G^{\varepsilon }_{a_{s}}\).
Step 4. Recall that by Corollary 10.10 (2), \({\mathfrak {m}}^{\varepsilon ,s}_{a_s}\) and \((\mathrm{e}_s)_{\sharp } \nu ^{\varepsilon }_{a_s}\) are mutually absolutely continuous for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\). It follows that for all \(s \in (0,1)\), for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\) and \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\), (11.15) holds for \(\nu _{a_s}\)-a.e. \(\gamma \). By Corollary 10.10 (1), note that \({\mathfrak {q}}^{\varepsilon ,s}_{s}\) and \({\mathfrak {q}}^{\varepsilon ,\nu }_s\) are mutually absolutely continuous, and hence the disintegration formula (10.11) implies that for all \(s \in (0,1)\) and \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\):
for \(\nu \)-a.e. \(\gamma \in G^{\varepsilon }\), and in particular that \(\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(\gamma _{t})\) exists and is positive for those s, t and \(\gamma \). Taking the limit as \(\varepsilon \rightarrow 0\) along a countable sequence, it follows for all \(s \in (0,1)\), \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) and \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), that
thereby concluding the proof of (11.10). As a consequence, an application of Fubini’s Theorem verifies that for \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), (11.10) holds for \({\mathcal {L}}^1\)-a.e. \(s,t \in (0,1)\).
\(\square \)
Remark 11.5
Observe that all of the results of this section also equally hold for \({\bar{\Phi }}_s^t\) in place of \(\Phi _s^t\). Indeed, recall that for all \(x \in X\), \(\Phi _s^t(x) = {\bar{\Phi }}_s^t(x)\) for \(t \in \mathring{G}_\varphi (x)\), and that by Corollary 4.5, \(\partial _t \Phi _s^t(x) = \partial _t {\bar{\Phi }}_s^t(x)\) for a.e. \(t \in \mathring{G}_\varphi (x)\). As these were the only two properties used in the above derivation (in particular, in Step 2 of the proof of Theorem 11.3), the assertion follows.
Part III Putting it all together
12 Combining change-of-variables formula with Kantorovich 3rd order information
Let \((X,\mathsf {d},\mathfrak m)\) denote an essentially non-branching m.m.s. verifying \({\mathsf {CD}}^1(K,N)\). Let \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\), and let \(\nu \) be the unique element of \(\mathrm {OptGeo}(\mu _0,\mu _1)\) (by Proposition 8.9, Remark 8.11 and Theorem 6.15). Recall that \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1]\), and we subsequently denote by \(\rho _t\) the versions of the corresponding densities given by Theorem 11.4 (resulting from Corollary 9.5). Finally, denote by \(\varphi \) a Kantorovich potential associated to the corresponding optimal transference plan, so that \(\nu (G_\varphi ) = 1\).
12.1 Change-of-variables rigidity
Recall that by the Change-of-Variables Theorem 11.4, we know that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\) and for a.e. \(t,s \in (0,1)\), \(\partial _\tau |_{\tau =t} \Phi ^\tau _s(\gamma _t)\) exists, is positive, and it holds that:
In fact, by Remark 11.5, the same also holds with \({\bar{\Phi }}\) in place of \(\Phi \), so that in particular:
Recall that given \(t,s \in (0,1)\), for \({\tilde{\Phi }} = \Phi ,{\bar{\Phi }}\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), respectively, \({\tilde{\Phi }}_s^t\) was defined on \(D_{{\tilde{\ell }}}\) as:
and that by Proposition 4.4 (2), the differentiability points of \(t \mapsto {\tilde{\Phi }}_{s}^t(x)\) and \(t \mapsto {\tilde{\ell }}^2_t(x)\) coincide for all \(t \ne s\), and at those points:
It follows from (12.2) that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\) and for a.e. \(t \in (0,1)\):
Alternatively, (12.4) follows directly by Lemma 5.6, in fact for \(\nu \)-a.e. \(\gamma \) (not just \(\gamma \in G_\varphi ^+\)).
Plugging (12.3) and (12.4) into (12.1), it follows that we may express the Change-of-Variables Theorem 11.4 as the statement that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\), we have:
Note that the denominators on the right-hand-side of (12.5) are always positive (when defined) for all \(t,s \in (0,1)\) by Theorem 3.11 (3). Fixing the geodesic \(\gamma \), we denote for brevity \(\rho (t) := \rho _t(\gamma _t)\), \(h_s(t) := h^{\varphi _s(\gamma _s)}_{\gamma _s}(t)\) and \(K_0 := K \cdot \ell (\gamma )^2\). We then have the following additional information for \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), by Corollary 9.5 and Proposition 10.4, respectively:
(A) \((0,1) \ni t \mapsto \rho (t)\) is locally Lipschitz and strictly positive.
(B) For all \(s \in (0,1)\), \(h_s\) is a \({\mathsf {CD}}(K_0,N)\) density on [0, 1], satisfying \(h_s(s) = 1\). In particular, it is locally Lipschitz continuous on (0, 1) and strictly positive there.
Remark 12.1
It is in fact possible to deduce (A) just from the Change-of-Variables formula (12.5) and without referring to Corollary 9.5. This may be achieved by a careful bootstrap argument, exploiting the separation of variables on the left-hand-side of (12.5) and the a-priori estimates of Lemma A.9 in the “Appendix” on the logarithmic derivative of \({\mathsf {CD}}(K_0,N)\) densities. But since we already know (A), and since (A) was actually (mildly) used in the proof of the Change-of-Variables Theorem 11.4, we only mention this possibility in passing. Note that Corollary 9.5 applies to all \({\mathsf {MCP}}(K,N)\) essentially non-branching spaces, whereas the Change-of-Variables formula requires knowing the stronger \({\mathsf {CD}}^1(K,N)\) condition.
Fix a geodesic \(\gamma \in G_\varphi ^+\) satisfying (12.5), (A) and (B) above. Let \(I \subset (0,1)\) be the set of full measure where (12.5) holds for all \(s \in I\). It follows from (12.5) that for all \(s \in I\), \(t \mapsto \frac{\partial _\tau |_{\tau =t} {\tilde{\ell }}_\tau ^2/2(\gamma _t)}{\ell (\gamma )^2}\) coincide a.e. on (0, 1) for both \({\tilde{\ell }} = \ell ,{\bar{\ell }}\) with the same locally Lipschitz function \(t \mapsto z_s(t)\) defined on \((0,1) \setminus \left\{ s\right\} \):
By continuity, it follows that the functions \(\left\{ z_s\right\} _{s \in I}\) must all coincide on their entire domain of definition with a single function \(t \mapsto z(t)\) defined on (0, 1); the latter function must therefore be locally Lipschitz continuous, and satisfy
By Theorem 5.5, which provides us with 3rd order information on intermediate-time Kantorovich potentials, we obtain the following additional information on z:
(C) \((0,1) \ni t \mapsto z(t)\) is locally Lipschitz. For any \(\delta \in (0,1/2)\), there exists \(C_\delta > 0\) so that
$$\begin{aligned}&\frac{z(t) - z(s)}{t-s} \ge (1 - C_\delta (t-s)) \left| z(s)\right| \left| z(t)\right| \\&\quad \forall 0< \delta \le s< t \le 1 - \delta < 1 . \end{aligned}$$
In particular, \(z'(t) \ge z^2(t)\) for a.e. \(t \in (0,1)\).
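The final assertion of (C) can be checked directly from the displayed inequality. The following short derivation is a sketch of that step; it uses only the local Lipschitz regularity of z and introduces no notation beyond that of (C). Fix a point \(s \in (0,1)\) at which z is differentiable, and choose \(\delta \in (0,1/2)\) with \(\delta \le s < 1-\delta \). Letting \(t \downarrow s\) in the inequality:
$$\begin{aligned} z'(s) = \lim _{t \downarrow s} \frac{z(t) - z(s)}{t-s} \ge \lim _{t \downarrow s} (1 - C_\delta (t-s)) \left| z(s)\right| \left| z(t)\right| = \left| z(s)\right| ^2 = z^2(s) , \end{aligned}$$
where the last limit uses the continuity of z. Since a locally Lipschitz function is differentiable \({\mathcal {L}}^1\)-a.e., this yields \(z' \ge z^2\) a.e. on (0, 1).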
Remark 12.2
By Theorem 5.5, we obtain the following interpretation for z(t)—it coincides for all \(t \in (0,1)\) with the second Peano derivative of \(\tau \mapsto \varphi _{\tau }(\gamma _t)\) and of \(\tau \mapsto {\bar{\varphi }}_{\tau }(\gamma _t)\) at \(\tau = t\). In particular, these second Peano derivatives are guaranteed to exist for all \(t \in (0,1)\) and are a continuous function thereof.
We have already seen above how (12.5) enabled us to deduce (12.6), thereby gaining (by Theorem 5.5) an additional order of regularity for \(\partial _\tau |_{\tau =t} \ell _\tau ^2/2(\gamma (t))\). The purpose of this section is to show that the combination of the Change-of-Variables Formula:
together with properties (A), (B) and (C) above, forms a very rigid condition, and already implies the following representation for \(\frac{1}{\rho _t(\gamma _t)}\); we formulate this independently of the preceding discussion as follows:
Theorem 12.3
(Change-of-Variables Rigidity) Assume that (12.7) holds, where \(\rho \), \(\left\{ h_s\right\} \) and z satisfy (A), (B) and (C) above. Then
$$\begin{aligned} \frac{1}{\rho (t)} = L(t) Y(t) \;\;\; \forall t \in (0,1) , \end{aligned}$$
where L is concave and Y is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1).
15.2 Formal argument
To better motivate the ensuing proof of Theorem 12.3, we begin with a formal argument.
Assume that the functions \(\rho \) and z are \(C^2\) smooth and that equality holds in (12.7) for all \(t,s \in (0,1)\). It follows that the mapping \((s,t) \mapsto h_s(t)\) is also \(C^2\) smooth. Fix any \(r_0 \in (0,1)\), and define the functions L and Y by
Note that by (12.7):
As already noted in Lemma 5.7, the concavity of L follows from (C), since
The more interesting function is Y. We have for all \(r \in (0,1)\):
To handle the last term on the right-hand side above, note that by the separation of variables on the left-hand side of (12.7), we have by (C) again, after taking logarithms and calculating the partial derivatives in t and s:
We therefore conclude that for all \(r \in (0,1)\):
where the last inequality follows from (B) and the differential characterization of \({\mathsf {CD}}(K_0,N)\) densities (applied to \(h_r(t)\) at \(t=r\)). Applying the characterization again, we deduce that Y is a (\(C^2\)-smooth) \({\mathsf {CD}}(K_0,N)\) density on (0, 1). This concludes the formal proof that
with L and Y satisfying the desired properties. In a sense, the latter argument has been tailored to “reverse-engineer” the smooth Riemannian argument, where the separation to orthogonal and tangential components of the Jacobian is already encoded in the Jacobi equation, (B) is a consequence of the corresponding Riccati equation, and (C) is a consequence of Cauchy–Schwarz (cf. [74, Proof of Theorem 1.7]).
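The computation behind this formal argument can be sketched as follows. This is a reconstruction under stated assumptions, not a quotation of the displays: we assume that (12.7) reads \(\frac{\rho (s)}{\rho (t)} = \frac{h_s(t)}{1 + (t-s) z(t)}\), that \(L(r) = \exp (-\int _{r_0}^r z(s) ds)\) and \(Y(r) = \exp (\int _{r_0}^r \partial _t |_{t=s} \log h_s(t) ds)\), and that a \(C^2\)-smooth \({\mathsf {CD}}(K_0,N)\) density h is characterized by \((\log h)'' + \frac{((\log h)')^2}{N-1} \le -K_0\). The normalization by \(\rho (r_0)\) below is a positive constant that may be absorbed into L.
$$\begin{aligned} \log h_s(t)&= \log \rho (s) - \log \rho (t) + \log (1 + (t-s) z(t)) , \\ \partial _t |_{t=s} \log h_s(t)&= -(\log \rho )'(s) + z(s) \;\; \Rightarrow \;\; \frac{\rho (r_0)}{\rho (r)} = L(r) Y(r) , \\ \frac{L''(r)}{L(r)}&= z^2(r) - z'(r) \le 0 , \\ \partial _s \partial _t |_{t=s=r} \log h_s(t)&= z^2(r) - z'(r) \le 0 , \\ (\log Y)''(r)&= \partial _t^2 |_{t=r} \log h_r(t) + \partial _s \partial _t |_{t=s=r} \log h_s(t) \\&\le \partial _t^2 |_{t=r} \log h_r(t) \le -K_0 - \frac{((\log Y)'(r))^2}{N-1} . \end{aligned}$$
The fourth line is where the separation of variables enters: the terms \(\log \rho (s)\) and \(\log \rho (t)\) have vanishing mixed derivative, so only \(\log (1 + (t-s) z(t))\) contributes. The last inequality uses \((\log Y)'(r) = \partial _t |_{t=r} \log h_r(t)\) together with the assumed characterization applied to \(h_r\) at \(t=r\).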
15.3 Rigorous argument
It is surprisingly tedious to modify the above formal argument into a rigorous one. It seems that an approximation argument cannot be avoided, since the definition of Y above is inherently differential, and so on one hand we do not know how to check the \({\mathsf {CD}}(K_0,N)\) condition for Y synthetically, but on the other hand Y is not even differentiable, so it is not clear how to check the \({\mathsf {CD}}(K_0,N)\) condition by taking derivatives. The main difficulty in applying an approximation argument here stems from the fact that we do not know how to approximate \(\{h_s\}\) and z by smooth functions \(\{h^\varepsilon _s\}\) and \(z^\varepsilon \), so that simultaneously:
-
\(\{h^\varepsilon _s\}\) are \({\mathsf {CD}}(K_0-\varepsilon ,N)\) densities;
-
\(z^\varepsilon \) is a function of t only, and not of s;
-
and the separation of variables structure of (12.7) is preserved.
Our solution is to note that the main role of the separation of variables in the above formal argument was to ensure that (12.8) holds, and so we will replace the rigid third requirement with the following relaxed one:
-
\(\partial _s \partial _t |_{t=s=r} \log h^\varepsilon _s(t) \le B_\delta \varepsilon \) for all \(r \in [\delta ,1-\delta ]\) and \(\delta > 0\).
Proof of Theorem 12.3
Step 1: Redefining \(h_s(t)\).
First, observe that there exists \(I_y \subset (0,1)\) of full measure so that for all \(s \in I_y\), (12.7) is satisfied for a.e. \(t \in (0,1)\), and hence for all \(t \in (0,1)\), since all the functions \(\rho \), \(\left\{ h_s\right\} \) and z are assumed to be continuous on (0, 1). Unfortunately, we cannot extend this to all \(s \in (0,1)\) as well, since there may be a null set of s’s for which the densities \(h_s(t)\) do not comply at all with the equation (12.7). To remedy this, we simply force (12.7) to hold for all \(s,t \in (0,1)\) by defining
and claim that for all \(s \in (0,1)\), \({\tilde{h}}_s\) is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1). Indeed, for \(s \in I_y\), \({\tilde{h}}_s = h_s\) and there is nothing to check. If \(s_0 \in (0,1) \setminus I_y\), simply note that \({\tilde{h}}_s(t)\) is locally Lipschitz in \(s \in (0,1)\) (since \(\rho (s)\) is), and hence
But the family of \({\mathsf {CD}}(K_0,N)\) densities on (0, 1) is clearly closed under pointwise limits (it is characterized by a family of inequalities between 3 points), and so \({\tilde{h}}_{s_0}\) is a \({\mathsf {CD}}(K_0,N)\) density, as asserted.
Step 2: Properties of z and \(\{{\tilde{h}}_s\}\).
We next collect several additional observations regarding the functions z and \(\{{\tilde{h}}_s\}\). Recall that \(\rho \) (by assumption) and \({\tilde{h}}_s\) (as \({\mathsf {CD}}(K_0,N)\) densities) are strictly positive in (0, 1). Together with (12.9) (or directly from (12.7)), this implies that \(1 + (t-s) z(t) > 0\) for all \(t,s \in (0,1)\), and hence
-
(D)
\(-\frac{1}{t} \le z(t) \le \frac{1}{1-t} \;\;\; \forall t \in (0,1)\).
In fact, we already knew this by Theorem 3.11 (3) but refrained from including this into our assumption (C) since this is a consequence of the other assumptions. Furthermore
-
(E)
\(I_x := \{ t \in (0,1) \; ; \; \tau \mapsto {\tilde{h}}_s(\tau ) \text { is differentiable at } \tau =t \text { for all } s \in (0,1)\}\) is of full measure.
Indeed, this follows directly from the definition (12.9) by considering the set of all points t where \(\rho (t)\) and z(t) are differentiable. In addition, we clearly have:
-
(F)
\(\forall t \in I_x\), \((0,1) \ni s \mapsto \partial _t {\tilde{h}}_s(t)\) is continuous.
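As an illustration of how (D) is read off from the positivity of \(1 + (t-s) z(t)\) for all \(t,s \in (0,1)\), here is a sketch using nothing beyond that positivity: fix \(t \in (0,1)\) and let s tend to each endpoint of (0, 1).
$$\begin{aligned} s \downarrow 0:&\quad 1 + t z(t) \ge 0 \;\; \Rightarrow \;\; z(t) \ge -\frac{1}{t} , \\ s \uparrow 1:&\quad 1 - (1-t) z(t) \ge 0 \;\; \Rightarrow \;\; z(t) \le \frac{1}{1-t} . \end{aligned}$$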
Step 3: Defining L and Y.
Now fix \(r_0 \in (0,1)\), and define the functions L, Y on (0, 1) as follows:
Clearly, the function L is well defined for all \(r \in (0,1)\) as z is assumed locally Lipschitz. As for the function Y, (E) implies that \(\partial _t |_{t=s} \log {\tilde{h}}_s(t)\) exists for a.e. \(s \in (0,1)\), and the fact that the latter integrand is locally integrable on (0, 1) is a consequence of Lemma A.9 in the “Appendix”, which guarantees a-priori locally-integrable estimates on the logarithmic derivative of \({\mathsf {CD}}(K_0,N)\) densities.
Consequently, as in our formal argument, we may write (since \(\log \rho \) is locally absolutely continuous on (0, 1)):
and hence
We have already verified in Lemma 5.7 that the property \(z'(s) \ge z^2(s)\) a.e. in \(s \in (0,1)\) implies that L is concave on (0, 1), so it remains to show that Y is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1).
Step 4: Approximation argument
We now arrive at our approximation argument. Given \(\varepsilon _1 , \varepsilon _2 >0\), \(t \in (\varepsilon _1,1-\varepsilon _1)\) and \(s \in (\varepsilon _2,1-\varepsilon _2)\), define the double logarithmic mollification of \({\tilde{h}}_s(t)\) by
where \(\psi _\varepsilon (x) = \frac{1}{\varepsilon } \psi (x/\varepsilon )\) and \(\psi \) is a \(C^2\)-smooth non-negative function on \({\mathbb {R}}\) supported on \([-1,1]\) and integrating to 1. Since for all \(\eta \in (0,1/2)\), we clearly have by (12.9) (and, say, (D))
it follows by Proposition A.12 in the “Appendix” on logarithmic convolutions that \(\{{\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t)\}_{s \in (\varepsilon _2,1-\varepsilon _2)}\) is a \(C^2\)-smooth (in (t, s)) family of \({\mathsf {CD}}(K_0,N)\) densities on \((\varepsilon _1,1-\varepsilon _1)\).
Step 5: Concluding the proof assuming (H1) and (H2)
We will subsequently show the following two additional properties of the family \(\{{\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t)\}\):
-
(H1)
\(\lim _{\varepsilon _2 \rightarrow 0} \lim _{\varepsilon _1 \rightarrow 0} \partial _t |_{t=s} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) = \partial _t |_{t=s} \log {\tilde{h}}_s(t)\) for a.e. \(s \in (0,1)\).
-
(H2)
\(\forall \delta \in (0,1/2) \; \exists C_\delta > 0 \;\; \forall \varepsilon \in (0,\frac{\delta }{8}] \;\; \forall \varepsilon _1,\varepsilon _2 \in (0,\varepsilon ]\):
$$\begin{aligned} \partial _s \partial _t |_{t=s=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) \le 2 C_\delta \varepsilon \;\;\; \forall r \in [\delta ,1-\delta ] . \end{aligned}$$
Assuming these additional properties, let us show how to conclude the proof of Theorem 12.3. Set \(\varepsilon = \max (\varepsilon _1,\varepsilon _2)\), and assuming that \(\varepsilon < \min (r_0,1-r_0)\), define the function \(Y^{\varepsilon _1,\varepsilon _2}\) on \((\varepsilon ,1-\varepsilon )\) given by
First, we claim to have the following pointwise convergence for all \(r \in (0,1)\):
Indeed, the pointwise convergence of the integrands is ensured by property (H1), and as soon as \(r_0,r \in (\eta , 1-\eta )\) for some \(\eta > 0\), we obtain by the a-priori estimates of Lemma A.9 in the “Appendix” (since \({\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s\) is a \({\mathsf {CD}}(K_0,N)\) density on \((\eta ,1-\eta )\) for all \(\varepsilon _1,\varepsilon _2 \in (0,\eta ]\) and \(s \in (\eta ,1-\eta )\)):
Consequently, (12.10) follows by Lebesgue’s Dominated Convergence theorem.
Now \(Y^{\varepsilon _1,\varepsilon _2}\) is \(C^2\)-smooth, and so as in our formal argument, we have for all \(r \in (\varepsilon ,1-\varepsilon )\):
As \({\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r\) is a \({\mathsf {CD}}(K_0,N)\) density on \((\varepsilon ,1-\varepsilon )\), we know by the differential characterization of such densities that
Combining this with property (H2), we conclude that for any \(\delta \in (0,1/2)\), whenever \(\varepsilon = \max (\varepsilon _1,\varepsilon _2) \in (0,\min (r_0,1-r_0,\frac{\delta }{8}))\):
and hence \(Y^{\varepsilon _1,\varepsilon _2}\) is a \(C^2\)-smooth \({\mathsf {CD}}(K_0 - 2 C_{\delta } \varepsilon ,N)\) density on \([\delta ,1- \delta ]\).
Combining all of the preceding information, since (as before) the family of \({\mathsf {CD}}(K_0',N)\) densities is closed under pointwise limits, we conclude from (12.10) that Y is a \({\mathsf {CD}}(K_0 - 2 C_\delta \varepsilon , N)\) density on \([\delta ,1-\delta ]\), for any \(\delta \in (0,1/2)\) and \(\varepsilon \in (0,\min (r_0,1-r_0,\frac{\delta }{8}))\). Taking the limit as \(\varepsilon \rightarrow 0\) and then as \(\delta \rightarrow 0\), we confirm that Y must be a \({\mathsf {CD}}(K_0,N)\) density on (0, 1), concluding the proof.
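The differential inequality used in this step can be sketched as follows, assuming (by analogy with the formal argument) that \(Y^{\varepsilon _1,\varepsilon _2}(r) = \exp (\int _{r_0}^r \partial _t |_{t=s} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) ds)\) and that a \(C^2\)-smooth \({\mathsf {CD}}(K_0,N)\) density h satisfies \((\log h)'' + \frac{((\log h)')^2}{N-1} \le -K_0\). For \(r \in [\delta ,1-\delta ]\):
$$\begin{aligned} (\log Y^{\varepsilon _1,\varepsilon _2})'(r)&= \partial _t |_{t=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r(t) , \\ (\log Y^{\varepsilon _1,\varepsilon _2})''(r)&= \partial _t^2 |_{t=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r(t) + \partial _s \partial _t |_{t=s=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) \\&\le -K_0 - \frac{((\log Y^{\varepsilon _1,\varepsilon _2})'(r))^2}{N-1} + 2 C_\delta \varepsilon . \end{aligned}$$
The first term of the second line is controlled by the \({\mathsf {CD}}(K_0,N)\) property of \({\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r\) and the second by (H2). The resulting bound is the differential form of the \({\mathsf {CD}}(K_0 - 2 C_\delta \varepsilon ,N)\) condition on \([\delta ,1-\delta ]\).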
It remains to establish properties (H1) and (H2).
Step 6: proof of (H1)
Given \(y \in (0,1)\) and \(t \in (\varepsilon _1,1-\varepsilon _1)\), denote
so that for every \(s \in (\varepsilon _2,1-\varepsilon _2)\):
By Proposition A.10 in the “Appendix”, \({\tilde{h}}_y^{\varepsilon _1}\) is a \({\mathsf {CD}}(K_0,N)\) density on \((\varepsilon _1,1-\varepsilon _1)\) for all \(y \in (0,1)\). Consequently, Lemma A.9 implies that \(t \mapsto \log {\tilde{h}}_y^{\varepsilon _1}(t)\) is locally Lipschitz on \((\varepsilon _1,1-\varepsilon _1)\), uniformly in \(y \in (0,1)\):
In particular, it follows that we may differentiate in t under the integral in (12.11) at any \(t_0 \in (\varepsilon _1,1-\varepsilon _1)\):
Now, by a standard argument (see Lemma 12.5 at the end of this section), we know that the derivative of an \(\varepsilon \)-mollification of a Lipschitz function converges to the derivative itself, at all points where the derivative exists, namely:
Together with (12.12) and (12.13), it follows by the Dominated Convergence theorem that:
But by property (F), we know that \((0,1) \ni y \mapsto \partial _t |_{t=t_0} \log {\tilde{h}}_y(t)\) is continuous for all \(t_0 \in I_x\), and therefore taking the limit as \(\varepsilon _2 \rightarrow 0\):
By property (E), \(I_x\) has full measure, thereby concluding the proof of (an extension of) property (H1).
Step 7: proof of (H2)
We will require the following:
Lemma 12.4
Let z satisfy (C) and (D). Then for all \(\delta \in (0,1/2)\), there exists \(C_\delta > 0\), so that for all \(\varepsilon \in (0,\frac{\delta }{4}]\), \(r \in [\delta ,1-\delta ]\), \(r-\varepsilon \le t_1 < t_2 \le r + \varepsilon \) and \(r-\varepsilon \le s_1 < s_2 \le r+\varepsilon \), we have:
Proof
Opening the various brackets, the assertion is equivalent to the statement:
and after dividing by \((t_2 - t_1) (s_2 - s_1)\), we see that our goal is to establish:
for an appropriate \(C_\delta \). Note that the right-hand-side of (12.14) is always positive by (D). As \(\min (t_i,1-t_i) \ge \delta - \varepsilon \ge \frac{3}{4} \delta \), by our assumption (C), (12.14) would follow from:
or equivalently (assuming \(\left| z(t_1)\right| \left| z(t_2)\right| > 0\), otherwise there is nothing to prove):
But \(\frac{1}{\left| z(t_i)\right| } \ge \min (t_i,1-t_i) \ge \frac{3}{4} \delta \) by (D), and as \(\varepsilon \in (0,\frac{\delta }{4}]\), we see that (12.15) is ensured by setting:
\(\square \)
Translating the statement of Lemma 12.4 into a statement for \({\tilde{h}}_s(t)\) using (12.9), we obtain that for all \(\delta \in (0,1/2)\), there exists \(C_\delta > 0\), so that for all \(\varepsilon \in (0,\frac{\delta }{8}]\), \(r \in [\delta ,1-\delta ]\), \(r-\varepsilon \le t,s\le r + \varepsilon \) and \(\Delta t,\Delta s \in [0,\varepsilon ]\), we have:
Integrating the above in t against \(\psi _{\varepsilon _1}(r-t)\) and in s against \(\psi _{\varepsilon _2}(r-s)\) with \(\varepsilon _1,\varepsilon _2 \in (0,\varepsilon ]\), we obtain that under the same assumptions as above:
Exchanging sides, dividing by \(\Delta t > 0\) and taking the limit as \(\Delta t \rightarrow 0\), and then dividing by \(\Delta s > 0\) and taking the limit as \(\Delta s \rightarrow 0\), we obtain precisely:
thereby confirming (H2). \(\square \)
For completeness, we provide a proof of the following lemma, used in Step 6 above.
Lemma 12.5
Let f be a locally Lipschitz function on an open interval \(I \subset {\mathbb {R}}\). Let \(\psi \) denote a \(C^1\)-smooth compactly supported function on \({\mathbb {R}}\) which integrates to 1. Denote by \(\psi _\varepsilon (x) = \frac{1}{\varepsilon } \psi (x/\varepsilon )\), \(\varepsilon > 0\), the corresponding family of mollifiers. Then:
at all points \(x \in I\) where f is differentiable.
Proof
Without loss of generality, assume that \(0 \in I\), that f is differentiable at 0 and that \(f(0)=0\). Assume that \(\psi \) is supported in \([-M,M]\), and let \(\varepsilon > 0\) be small enough so that \([-M \varepsilon , M \varepsilon ] \subset I\). Then:
where the differentiation under the integral is justified since f is locally Lipschitz. Integrating by parts (which is justified as \(f \psi _\varepsilon \) is absolutely continuous), we obtain
But for each \(z \in [-M , M] \setminus \left\{ 0\right\} \), \(\lim _{\varepsilon \rightarrow 0} \frac{f (\varepsilon z)}{\varepsilon z} = f'(0)\), and since f is Lipschitz on \([-\varepsilon M , \varepsilon M]\), we obtain by Lebesgue’s Dominated Convergence Theorem that
as asserted. \(\square \)
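One way to organize the computation in this proof is sketched below; it may differ in detail from the displays above. We assume the convolution convention \((f * \psi _\varepsilon )(x) = \int f(y) \psi _\varepsilon (x-y) dy\) and keep the normalization \(f(0) = 0\), with f differentiable at 0 and \(\psi \) supported in \([-M,M]\). Substituting \(y = \varepsilon z\) and using \(\psi _\varepsilon '(x) = \varepsilon ^{-2} \psi '(x/\varepsilon )\):
$$\begin{aligned} (f * \psi _\varepsilon )'(0)&= \int f(y) \psi _\varepsilon '(-y) dy = \int _{-M}^{M} \frac{f(\varepsilon z)}{\varepsilon z} \, z \psi '(-z) dz \\&\xrightarrow [\varepsilon \rightarrow 0]{} f'(0) \int _{-M}^{M} z \psi '(-z) dz = f'(0) \int _{-M}^{M} \psi (w) dw = f'(0) . \end{aligned}$$
The passage to the limit is by Dominated Convergence, since \(\left| f(\varepsilon z) / (\varepsilon z)\right| \) is bounded by the Lipschitz constant of f near 0. The last-but-one equality is an integration by parts after the substitution \(w = -z\), whose boundary terms vanish because \(\psi (\pm M) = 0\).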
16 Final results
In this final section, we combine the results obtained in Parts I, II and the previous section, establishing at last the Main Theorem 1.1 and the globalization theorem for the \({\mathsf {CD}}(K,N)\) condition. We also treat the case of an infinitesimally Hilbertian space.
Throughout this section, recall that we assume \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\).
16.1 Proof of the Main Theorem 1.1
Theorem 13.1
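Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s., so that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length space. Then:
$$\begin{aligned} {\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {CD}}^1_{Lip}(K,N) . \end{aligned}$$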
Proof
By Remark 6.11, \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}_{loc}(K,N)\) if and only if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) does. By Remark 8.8, the same is true for \({\mathsf {CD}}^1_{Lip}(K,N)\). Consequently, we may assume that \(\text {supp}({\mathfrak {m}}) = X\). By Lemma 6.12 we deduce that \((X,\mathsf {d})\) is proper and geodesic (note that this would be false without the length space assumption above). Note that for geodesic essentially non-branching spaces, it is known that \({\mathsf {CD}}_{loc}(K,N)\) implies \({\mathsf {MCP}}(K,N)\)—see [30] for a proof assuming non-branching, but the same proof works under essentially non-branching, see the comments after [29, Corollary 5.4]. Consequently, the results of Sect. 7 apply.
Recall that given a 1-Lipschitz function \(u : X \rightarrow {\mathbb {R}}\), the equivalence relation \(R^b_{u}\) on the transport set \({\mathcal {T}}_{u}^{b}\) induces a partition \(\{R_u^b(\alpha )\}_{\alpha \in Q}\) of \({\mathcal {T}}_{u}^{b}\). By Corollary 7.3, we know that \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\) with associated strongly consistent disintegration:
It was proved in [27] that the \({\mathsf {CD}}_{loc}(K,N)\) condition ensures that for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \((\overline{R^b_u(\alpha )},\mathsf {d},{\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {CD}}(K,N)\) with \(\text {supp}({\mathfrak {m}}_\alpha ) = \overline{R^b_u(\alpha )}\). Denoting by \(X_{\alpha }\) the closure \(\overline{R_u^b(\alpha )}\), Theorem 7.10 ensures that \(X_{\alpha }\) coincides with the transport ray \(R_{u}(\alpha )\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\). Consequently, all 4 conditions of the \({\mathsf {CD}}^1_u(K,N)\) Definition 8.1 are verified, and the assertion follows. \(\square \)
Theorem 13.2
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. Then:
$$\begin{aligned} {\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N) . \end{aligned}$$
Proof
By Remark 8.8, \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}^1(K,N)\) if and only if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) does. By Remark 6.11, the same is true for \({\mathsf {CD}}(K,N)\). Consequently, we may assume that \(\text {supp}({\mathfrak {m}}) = X\).
By Proposition 8.9 and Remark 8.11, X also verifies \({\mathsf {MCP}}(K,N)\), and so Theorem 6.15 applies. Given \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\), consider the unique \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\), and denote \(\mu _t := (\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\) for all \(t \in [0,1]\). Let \(\rho _t := d\mu _t / d{\mathfrak {m}}\) denote the versions of the densities guaranteed by Corollary 9.5.
Denote an associated Kantorovich potential by \(\varphi \), and recall that \(\nu \) is concentrated on \(G_\varphi = G_\varphi ^+ \cup G_\varphi ^0\), where \(G_\varphi ^+\) and \(G_\varphi ^0\) denote the subsets of positive and zero length \(\varphi \)-Kantorovich geodesics, respectively. The change-of-variables Theorem 11.4 and Proposition 4.4 yield that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\):
where for all \(s \in (0,1)\), \(h_{s} = h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}\) is a \({\mathsf {CD}}(K_{0},N)\) density, with \(K_{0} = \ell ^{2}(\gamma ) K\) and \(h_{s}(s)=1\). Together with Corollary 9.5, which ensures the Lipschitz regularity (and positivity) of \((0,1) \ni t \mapsto \rho _t(\gamma _t)\), this verifies assumptions (A) and (B) of Theorem 12.3. As explained in Sect. 12, the 3rd order information on the Kantorovich potential \(\varphi \) asserted by Theorem 5.5 verifies assumption (C) of Theorem 12.3. It follows by Theorem 12.3 (and the discussion preceding it) that the rigidity of (13.1) necessarily implies that for those \(\gamma \in G_\varphi ^+\) satisfying (13.1), it holds:
where L is concave and Y is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1). Noting that \(\sigma _{K_0,N}^{(\alpha )}(\theta ) = \sigma _{K,N}^{(\alpha )}(\theta \ell (\gamma ))\), we obtain by a standard application of Hölder’s inequality that for any \(t_0,t_1 \in (0,1)\), \(\alpha \in [0,1]\) and \(t_\alpha = \alpha t_1 + (1-\alpha ) t_0\):
Using the upper semi-continuity of \(t \mapsto \rho _t(\gamma _t)\) at the end-points \(t=0,1\) ensured by Corollary 9.5 (as both \(\mu _0,\mu _1 \ll {\mathfrak {m}}\)), we conclude that for \(\nu \)-a.e. \(\gamma \in G^+_{\varphi }\), the previous inequality in fact holds for all \(t_0,t_1 \in [0,1]\). In particular, for \(t_0 = 0\), \(t_1=1\) and all \(\alpha \in [0,1]\):
As for null-geodesics \(\gamma \in G_\varphi ^0\) (having zero length), note that \(\tau _{K,N}^{(s)}(0) = s\) and that \([0,1] \ni t \mapsto \rho _t(\gamma _t)\) remains constant by Theorem 11.4, and therefore (13.3) holds trivially with equality for all \(\gamma \in G_\varphi ^0\). In conclusion, (13.3) holds for \(\nu \)-a.e. geodesic \(\gamma \), thereby confirming the validity of Definition 6.7 and verifying \({\mathsf {CD}}(K,N)\). \(\square \)
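The Hölder step in the proof above can be sketched as follows. The sketch assumes the usual definitions, which are not restated in this section: a \({\mathsf {CD}}(K_0,N)\) density Y satisfies the \(\sigma _{K_0,N-1}\)-concavity inequality for \(Y^{\frac{1}{N-1}}\), and \(\tau ^{(t)}_{K,N}(\theta ) = t^{\frac{1}{N}} \sigma ^{(t)}_{K,N-1}(\theta )^{\frac{N-1}{N}}\). Write \(\theta = \left| t_1 - t_0\right| \).
$$\begin{aligned} L(t_\alpha )&\ge (1-\alpha ) L(t_0) + \alpha L(t_1) , \\ Y(t_\alpha )^{\frac{1}{N-1}}&\ge \sigma ^{(1-\alpha )}_{K_0,N-1}(\theta ) Y(t_0)^{\frac{1}{N-1}} + \sigma ^{(\alpha )}_{K_0,N-1}(\theta ) Y(t_1)^{\frac{1}{N-1}} , \\ (a_0 + a_1)^{\frac{1}{N}} (b_0 + b_1)^{\frac{N-1}{N}}&\ge a_0^{\frac{1}{N}} b_0^{\frac{N-1}{N}} + a_1^{\frac{1}{N}} b_1^{\frac{N-1}{N}} \;\;\; \forall a_i , b_i \ge 0 , \\ \rho _{t_\alpha }(\gamma _{t_\alpha })^{-\frac{1}{N}} = L(t_\alpha )^{\frac{1}{N}} \left( Y(t_\alpha )^{\frac{1}{N-1}} \right) ^{\frac{N-1}{N}}&\ge \tau ^{(1-\alpha )}_{K,N}(\theta \ell (\gamma )) \rho _{t_0}(\gamma _{t_0})^{-\frac{1}{N}} + \tau ^{(\alpha )}_{K,N}(\theta \ell (\gamma )) \rho _{t_1}(\gamma _{t_1})^{-\frac{1}{N}} . \end{aligned}$$
The third line is Hölder's inequality with exponents N and \(\frac{N}{N-1}\), applied with \(a_0 = (1-\alpha ) L(t_0)\), \(a_1 = \alpha L(t_1)\) and \(b_0, b_1\) the two terms on the right of the second line. The last line then uses the scaling \(\sigma _{K_0,N-1}^{(\alpha )}(\theta ) = \sigma _{K,N-1}^{(\alpha )}(\theta \ell (\gamma ))\) and \(\frac{1}{\rho _t(\gamma _t)} = L(t) Y(t)\).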
As an immediate consequence of the previous two theorems, we obtain the Local-to-Global Theorem for the Curvature-Dimension condition.
Theorem 13.3
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. so that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length space. Then:
$$\begin{aligned} {\mathsf {CD}}_{loc}(K,N) \Leftrightarrow {\mathsf {CD}}(K,N) . \end{aligned}$$
Remark 13.4
It is clear that the above globalization theorem is false without some global assumption ultimately ensuring that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is geodesic. Indeed, simply consider a \({\mathsf {CD}}(K,N)\) space, and restrict it to two disjoint geodesically-convex closed subsets of \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) (each having positive measure)—the resulting space clearly satisfies \({\mathsf {CD}}_{loc}(K,N)\) but not \({\mathsf {CD}}(K,N)\); it is also easy to construct similar examples where \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is connected. In addition, as already mentioned in the Introduction, the globalization theorem is known to be false without some type of non-branching assumption (see [67]).
As an interesting byproduct, we also obtain that \({\mathsf {CD}}^{1}\) and \({\mathsf {CD}}^{1}_{Lip}\) are equivalent conditions on essentially non-branching spaces:
Corollary 13.5
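Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. Then:
$$\begin{aligned} {\mathsf {CD}}^{1}_{Lip}(K,N) \Leftrightarrow {\mathsf {CD}}^{1}(K,N) \Leftrightarrow {\mathsf {CD}}(K,N) . \end{aligned}$$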
Proof
\({\mathsf {CD}}_{Lip}^{1}(K,N)\) is by definition stronger than \({\mathsf {CD}}^{1}(K,N)\), which in turn implies \({\mathsf {CD}}(K,N)\) by Theorem 13.2. But \({\mathsf {CD}}(K,N)\) implies its local version \({\mathsf {CD}}_{loc}(K,N)\), as well as that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is geodesic by Lemma 6.12. The cycle is then closed by Theorem 13.1. \(\square \)
Finally, we deduce a complete equivalence between the reduced and the classic Curvature-Dimension conditions on essentially non-branching spaces. Recall that the reduced version \({\mathsf {CD}}^*(K,N)\), introduced in [14] (in the non-branching setting), is defined exactly in the same manner as \({\mathsf {CD}}(K,N)\), with the only (crucial) difference being that one employs the slightly smaller \(\sigma ^{(t)}_{K,N}(\theta )\) coefficients instead of the \(\tau ^{(t)}_{K,N}(\theta )\) ones in Definition 6.4.
Corollary 13.6
Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. Then:
$$\begin{aligned} {\mathsf {CD}}^{*}(K,N) \Leftrightarrow {\mathsf {CD}}(K,N) . \end{aligned}$$
Proof
By definition \({\mathsf {CD}}(K,N)\) is stronger than \({\mathsf {CD}}^*(K,N)\) (see [14, Proposition 2.5 (i)]). For the converse implication, note that \({\mathsf {CD}}^*(K,N)\) implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper and geodesic, by verbatim repeating the proof of Lemma 6.12. Then we observe that \({\mathsf {CD}}^*(K,N) \Rightarrow {\mathsf {CD}}_{loc}(K^{-},N)\), where \({\mathsf {CD}}_{loc}(K^{-},N)\) denotes that \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}_{loc}(K',N)\) for every \(K' < K\) (with the open neighborhoods possibly depending on \(K'\)). For non-branching spaces, this was proved in [14, Proposition 5.5] (see also [34, Lemma 2.1]), but the proof does not rely on any non-branching assumptions. Then, by Theorem 13.3, we obtain \({\mathsf {CD}}(K',N)\) for any \(K' < K\). Finally, by uniqueness of dynamical plans (see Theorem 6.15 and Lemma 6.13) and continuity of \(\tau _{K',N}^{(t)}(\theta )\) in \(K'\), the claim follows. \(\square \)
16.2 \({\mathsf {RCD}}(K,N)\) spaces
We also mention the more recent Riemannian Curvature Dimension condition \({\mathsf {RCD}}^{*}(K,N)\). In the infinite dimensional case \(N = \infty \), it was introduced in [7] for finite measures \({\mathfrak {m}}\) and in [4] for \(\sigma \)-finite ones. The class \({\mathsf {RCD}}^{*}(K,N)\) with \(N<\infty \) has been proposed in [40] and extensively investigated in [8, 11, 35]. We refer to these papers and references therein for a general account on the synthetic formulation of the latter Riemannian-type Ricci curvature lower bounds. Here we only briefly recall that it is a strengthening of the reduced Curvature Dimension condition: a m.m.s. verifies \({\mathsf {RCD}}^{*}(K,N)\) if and only if it satisfies \({\mathsf {CD}}^{*}(K,N)\) and is infinitesimally Hilbertian [40, Definition 4.19 and Proposition 4.22], meaning that the Sobolev space \(W^{1,2}(X,{\mathfrak {m}})\) is a Hilbert space (with the Hilbert structure induced by the Cheeger energy). Recall also that the local-to-global property for the \({\mathsf {RCD}}^{*}(K,N)\) condition (say for length spaces of full support) has already been established for \(N=\infty \) in [7, Theorem 6.22] for non-branching spaces with finite second moment, for \(N < \infty \) in [35, Theorems 3.17 and 3.25] for strong \({\mathsf {RCD}}^*(K,N)\) spaces, and for all \(N \in [1,\infty ]\) in [10, Theorems 7.2 and 7.8] for proper spaces without any non-branching assumptions.
We are now in a position to introduce the following (expected) definition:
Definition
We will say that a m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {RCD}}(K,N)\) if it verifies \({\mathsf {CD}}(K,N)\) and is infinitesimally Hilbertian.
We can now immediately deduce:
Corollary 13.7
$$\begin{aligned} {\mathsf {RCD}}^{*}(K,N) \Leftrightarrow {\mathsf {RCD}}(K,N) . \end{aligned}$$
Note that \({\mathsf {CD}}^{*}(K,\infty )\) and \({\mathsf {CD}}(K,\infty )\) are the same condition, so the above also holds for \(N = \infty \).
Proof
Since \({\mathsf {CD}}(K,N)\) is stronger than \({\mathsf {CD}}^{*}(K,N)\), one implication is straightforward. For the other implication, recall that \({\mathsf {RCD}}^{*}(K,N)\) forces the space to be essentially non-branching (see [68, Corollary 1.2]), and so the assertion follows by Corollary 13.6. \(\square \)
Corollary 13.8
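Let \((X,\mathsf {d},{\mathfrak {m}})\) be an m.m.s. so that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length space. Then:
$$\begin{aligned} {\mathsf {RCD}}_{loc}(K,N) \Leftrightarrow {\mathsf {RCD}}(K,N) . \end{aligned}$$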
Proof
One implication is trivial. For the converse, as usual, we may assume that \(\text {supp}({\mathfrak {m}}) = X\) by Remark 6.11. By Lemma 6.12, we know that \((X,\mathsf {d})\) is proper and geodesic (as usual, this would be false without the length space assumption above). As the local-to-global property has been proved for proper geodesic \({\mathsf {RCD}}^*(K,N)\) spaces without any non-branching assumptions in [10], it follows that:
$$\begin{aligned} {\mathsf {RCD}}_{loc}(K,N) \Rightarrow {\mathsf {RCD}}^{*}_{loc}(K,N) \Rightarrow {\mathsf {RCD}}^{*}(K,N) \Rightarrow {\mathsf {RCD}}(K,N) , \end{aligned}$$
where the last implication follows by Corollary 13.7. \(\square \)
16.3 Concluding remarks
We conclude this work with several brief remarks and suggestions for further investigation.
-
Note that the proof of Theorem 13.2 in fact yields more than stated: not only does the synthetic inequality (13.2) hold (for all \(t_0,t_1 \in [0,1]\)), but in fact we obtain for \(\nu \)-a.e. geodesic \(\gamma \) the a-priori stronger disentanglement (or “L-Y” decomposition):
$$\begin{aligned} \frac{1}{\rho _t(\gamma _t)} = L_\gamma (t) Y_\gamma (t) \;\;\; \forall t \in (0,1), \end{aligned} \tag{13.4}$$
where \(L_\gamma \) is concave and \(Y_\gamma \) is a \({\mathsf {CD}}(\ell (\gamma )^2 K, N)\) density on (0, 1). As explained in the Introduction, it follows from [34] that for a fixed \(\gamma \), (13.4) is indeed strictly stronger than (13.2). In view of Main Theorem 1.1, this constitutes a new characterization of essentially non-branching \({\mathsf {CD}}(K,N)\) spaces.
-
According to [35, p. 1026], it is possible to localize the argument of [68] and deduce from a strong \({\mathsf {CD}}_{loc}(K,\infty )\) condition (when K-convexity of the entropy is assumed along any \(W_2\)-geodesic with end-points inside the local neighborhood), that the space is globally essentially non-branching. In combination with our results, it follows that the strong \({\mathsf {CD}}(K,N)\) condition enjoys the local-to-global property, without a-priori requiring any additional non-branching assumptions.
-
It would still be interesting to clarify the relation between the \({\mathsf {CD}}(K,N)\) condition and the property \({\mathsf {BM}}(K,N)\) of satisfying a Brunn-Minkowski inequality (with sharp dependence on K, N as in [74]). Note that by Main Theorem 1.1, it is enough to understand this locally on essentially non-branching spaces.
-
It would also be interesting to study the \({\mathsf {CD}}^1(K,N)\) condition on its own, when no non-branching assumptions are made, and to verify the usual list of properties desired of a notion of Curvature-Dimension (see [28, 51, 74]).
-
A natural counterpart of \({\mathsf {RCD}}(K,N)\) would be \({\mathsf {RCD}}^{1}(K,N)\): we will say that a m.m.s. verifies \({\mathsf {RCD}}^{1}(K,N)\) if it verifies \({\mathsf {CD}}^{1}(K,N)\) and it is infinitesimally Hilbertian. Recall that an \({\mathsf {RCD}}(K,N)\) space is always essentially non-branching [68], and hence Main Theorem 1.1 immediately yields:
$$\begin{aligned} {\mathsf {RCD}}(K,N) \Rightarrow {\mathsf {RCD}}^{1}(K,N). \end{aligned}$$
The converse implication would be implied by the following claim which we leave for a future investigation: an \({\mathsf {RCD}}^{1}(K,N)\)-space is always essentially non-branching.
-
Regarding the novel third order temporal information on the intermediate-time Kantorovich potentials \(\varphi _t\) we obtain in this work, it would be interesting to explore whether it has any additional consequences pertaining to the spatial regularity of solutions to the Hamilton-Jacobi equation in general, and of the transport map \(T_{s,t} = \mathrm{e}_t \circ \mathrm{e}_{s}|_G^{-1}\) from an intermediate time \(s \in (0,1)\) in particular (where \(G \subset G_\varphi \) is the subset of injectivity guaranteed by Corollary 6.16). In the smooth Riemannian setting, the map \(T_{s,t}\) is known to be locally Lipschitz by Mather’s regularity theory (see [77, Chapter 8] and cf. [77, Theorem 8.22]). A starting point for this investigation could be the following bound on the (formal) Jacobian of \(T_{s,t}\), which follows immediately from (12.5), Theorem 3.11 (3) and Lemma A.9: for \(\mu _s\)-a.e. x, the Jacobian is bounded above by a function of \(s,t,K,N,\ell _s(x)\) only.
References
Ambrosio, L.: Fine properties of sets of finite perimeter in doubling metric measure spaces. Set Valued Anal. 10, 111–128 (2002)
Ambrosio, L.: Lecture notes on optimal transport problems. In: Mathematical Aspects of Evolving Interfaces (Funchal, 2000). Lecture Notes in Mathematics, vol. 1812, pp. 1–52. Springer, Berlin (2003)
Ambrosio, L., Gigli, N.: A user’s guide to optimal transport. In: Piccoli, B., Rascle, M. (eds.) Modelling and Optimisation of Flows on Networks. Lecture Notes in Mathematics, vol. 2062, pp. 1–155. Springer, Heidelberg (2013)
Ambrosio, L., Gigli, N., Mondino, A., Rajala, T.: Riemannian Ricci curvature lower bounds in metric measure spaces with \(\sigma \)-finite measure. Trans. Am. Math. Soc. 367(7), 4661–4701 (2015)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient flows in metric spaces and in the space of probability measures. Lectures in Mathematics ETH-Zürich. Birkhäuser, Basel (2005)
Ambrosio, L., Gigli, N., Savaré, G.: Calculus and heat flow in metric measure spaces and application to spaces with Ricci curvature bounded from below. Invent. Math. 195, 289–391 (2014)
Ambrosio, L., Gigli, N., Savaré, G.: Metric measure spaces with Riemannian Ricci curvature bounded from below. Duke Math. J. 163, 1405–1490 (2014)
Ambrosio, L., Gigli, N., Savaré, G.: Bakry–Émery curvature-dimension condition and Riemannian Ricci curvature bounds. Ann. Probab. 43, 339–404 (2015)
Ambrosio, L., Di Marino, S.: Equivalent definitions of BV space and of total variation on metric measure spaces. J. Funct. Anal. 266, 4150–4188 (2014)
Ambrosio, L., Mondino, A., Savaré, G.: On the Bakry–Émery condition, the gradient estimates and the local-to-global property of RCD\(^{*}(K, N)\) metric measure spaces. J. Geom. Anal. 26, 24–56 (2016)
Ambrosio, L., Mondino, A., Savaré, G.: Nonlinear diffusion equations and curvature conditions in metric measure spaces. Mem. Am. Math. Soc. 262(1270), v+121 (2019)
Ambrosio, L., Miranda Jr., M., Pallara, D.: Special functions of bounded variation in doubling metric measure spaces. In: Calculus of variations: topics from the mathematical heritage of E. De Giorgi, vol. 14, pp. 1–45. Dept. Math., Seconda Univ. Napoli, Caserta (2004)
Ambrosio, L., Pratelli, A.: Existence and stability results in the \(L^1\) theory of optimal transportation. In: Optimal Transportation and Applications (Martina Franca, 2001), Lecture Notes in Mathematics, vol. 1813, pp. 123–160. Springer, Berlin (2003)
Bacher, K., Sturm, K.T.: Localization and tensorization properties of the curvature-dimension condition for metric measure spaces. J. Funct. Anal. 259(1), 28–56 (2010)
Bakry, D.: L’hypercontractivité et son utilisation en théorie des semigroupes. In: Lectures on Probability Theory (Saint-Flour, 1992), Lecture Notes in Mathematics, vol. 1581, pp. 1–114. Springer, Berlin (1994)
Bakry, D., Émery, M.: Diffusions hypercontractives. In: Séminaire de probabilités, XIX, 1983/84, Lecture Notes in Mathematics, vol. 1123, pp. 177–206. Springer (1985)
Bakry, D., Gentil, I., Ledoux, M.: Analysis and geometry of Markov diffusion operators. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 348. Springer, Cham (2014)
Bianchini, S., Caravenna, L.: On the extremality, uniqueness and optimality of transference plans. Bull. Inst. Math. Acad. Sin. (N.S.) 4(4), 353–454 (2009)
Bianchini, S., Cavalletti, F.: The Monge problem for distance cost in geodesic spaces. Commun. Math. Phys. 318, 615–673 (2013)
Borwein, J.M., Vanderwerff, J.D.: Convex functions: constructions, characterizations and counterexamples. In: Encyclopedia of Mathematics and its Applications, vol. 109. Cambridge University Press, Cambridge (2010)
Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 44(4), 375–417 (1991)
Burago, D., Burago, Y., Ivanov, S.: A course in metric geometry. Graduate Studies in Mathematics, vol. 33, pp. xiv+415. American Mathematical Society, Providence, RI (2001)
Cavalletti, F.: Monge problem in metric measure spaces with Riemannian curvature-dimension condition. Nonlinear Anal. 99, 136–151 (2014)
Cavalletti, F.: Decomposition of geodesics in the Wasserstein space and the globalization property. Geom. Funct. Anal. 24, 493–551 (2014)
Cavalletti, F.: An overview of \(L^{1}\) optimal transportation on metric measure spaces. In: Measure Theory in Non-Smooth Spaces. Partial Differ. Equ. Meas. Theory, pp. 98–144. De Gruyter Open, Warsaw (2017)
Cavalletti, F., Huesmann, M.: Self-intersection of optimal geodesics. Bull. Lond. Math. Soc. 46, 653–656 (2014)
Cavalletti, F., Mondino, A.: Sharp and rigid isoperimetric inequalities in metric-measure spaces with lower Ricci curvature bounds. Invent. Math. 208(3), 803–849 (2017)
Cavalletti, F., Mondino, A.: Sharp geometric and functional inequalities in metric measure spaces with lower Ricci curvature bounds. Geom. Topol. 21(1), 603–645 (2017)
Cavalletti, F., Mondino, A.: Optimal maps in essentially non-branching spaces. Commun. Contemp. Math. 19(6), 1750007 (2017)
Cavalletti, F., Sturm, K.-T.: Local curvature-dimension condition implies measure-contraction property. J. Funct. Anal. 262, 5110–5127 (2012)
Cheeger, J.: Differentiability of Lipschitz functions on metric measure spaces. Geom. Funct. Anal. 9(3), 428–517 (1999)
Cordero-Erausquin, D.: Some applications of mass transport to Gaussian-type inequalities. Arch. Ration. Mech. Anal. 161(3), 257–269 (2002)
Cordero-Erausquin, D., McCann, R.J., Schmuckenschläger, M.: A Riemannian interpolation inequality à la Borell, Brascamp and Lieb. Invent. Math. 146, 219–257 (2001)
Deng, Q., Sturm, K.-T.: Localization and tensorization properties of the curvature-dimension condition for metric measure spaces. II. J. Funct. Anal. 260, 3718–3725 (2011)
Erbar, M., Kuwada, K., Sturm, K.-T.: On the equivalence of the entropic curvature-dimension condition and Bochner’s inequality on metric measure spaces. Invent. Math. 201(3), 993–1071 (2015)
Evans, L.C.: Partial differential equations and Monge–Kantorovich mass transfer. Current developments in mathematics. 1997 (Cambridge, MA), pp. 65–126. International Press, Boston, MA (1999)
Evans, L.C., Gangbo, W.: Differential equations methods for the Monge–Kantorovich mass transfer problem. Mem. Am. Math. Soc. 137(653), viii+66 (1999)
Feldman, M., McCann, R.J.: Monge’s transport problem on a Riemannian manifold. Trans. Am. Math. Soc. 354(4), 1667–1697 (2002)
Fremlin, D.H.: Measure Theory, vol. 4. Torres Fremlin, Colchester (2006)
Gigli, N.: On the differential structure of metric measure spaces and applications. Mem. Am. Math. Soc. 236(1113), vi+91 (2015)
Gigli, N., Rajala, T., Sturm, K.-T.: Optimal Maps and exponentiation on finite-dimensional spaces with Ricci curvature bounded from below. J. Geom. Anal. 26, 2914–2929 (2016)
Gromov, M.: Paul Lévy’s isoperimetric inequality. Preprint, I.H.E.S. (1980)
Gromov, M.: Metric Structures for Riemannian and Non-Riemannian Spaces. Birkhäuser, Basel (2001)
Gromov, M., Milman, V.D.: Generalization of the spherical isoperimetric inequality to uniformly convex Banach spaces. Compositio Math. 62(3), 263–282 (1987)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex analysis and minimization algorithms. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 305. Springer, Berlin (1993)
Kannan, R., Lovász, L., Simonovits, M.: Isoperimetric problems for convex bodies and a localization lemma. Discrete Comput. Geom. 13(3–4), 541–559 (1995)
Klartag, B.: Needle decompositions in Riemannian geometry. Mem. Am. Math. Soc. 249(1180), v+77 (2017)
Ledoux, M.: The Concentration of Measure Phenomenon. Mathematical Surveys and Monographs, vol. 89. American Mathematical Society, Providence, RI (2001)
Lott, J., Villani, C.: Hamilton–Jacobi semigroup on length spaces and applications. J. Math. Pures Appl. 88, 219–229 (2007)
Lott, J., Villani, C.: Weak curvature conditions and functional inequalities. J. Funct. Anal. 245(1), 311–333 (2007)
Lott, J., Villani, C.: Ricci curvature for metric-measure spaces via optimal transport. Ann. Math. 169(3), 903–991 (2009)
McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128(1), 153–179 (1997)
McCann, R.J., Guillen, N.: Five lectures on optimal transportation: geometry, regularity and applications. In: Analysis and Geometry of Metric Measure Spaces, CRM Proc. Lecture Notes, vol. 56, pp. 145–180. American Mathematical Society, Providence, RI (2013)
Miranda Jr., M.: Functions of bounded variation on “good” metric spaces. J. Math. Pures Appl. 82, 975–1004 (2003)
Milman, E.: Sharp isoperimetric inequalities and model spaces for the curvature-dimension-diameter condition. J. Eur. Math. Soc. (JEMS) 17(5), 1041–1078 (2015)
Milman, E.: Beyond traditional curvature-dimension I: new model spaces for isoperimetric and concentration inequalities in negative dimension. Trans. Am. Math. Soc. 369(5), 3605–3637 (2017)
Ohta, S.-I.: On the measure contraction property of metric measure spaces. Comment. Math. Helv. 82, 805–828 (2007)
Ohta, S.-I.: Finsler interpolation inequalities. Calc. Var. Partial Differ. Equ. 36(2), 211–249 (2009)
Ohta, S.-I.: Needle decompositions and isoperimetric inequalities in Finsler geometry. J. Math. Soc. Japan 70(2), 651–693 (2018)
Ohta, S.-I.: \((K, N)\)-convexity and the curvature-dimension condition for negative \(N\). J. Geom. Anal. 26(3), 2067–2096 (2016)
Oliver, H.W.: The exact Peano derivative. Trans. Am. Math. Soc. 76, 444–456 (1954)
Otto, F., Villani, C.: Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. J. Funct. Anal. 173(2), 361–400 (2000)
Payne, L.E., Weinberger, H.F.: An optimal Poincaré inequality for convex domains. Arch. Ration. Mech. Anal. 5, 286–292 (1960)
Petrunin, A.: Alexandrov meets Lott–Villani–Sturm. Münster J. Math. 4, 53–64 (2011)
Rachev, S.T., Rüschendorf, L.: Mass Transportation Problem, vol. I. Probability and its Applications (New York). Springer, New York (1998)
Rajala, T.: Interpolated measures with bounded densities in metric spaces satisfying the curvature-dimension conditions of Sturm. J. Funct. Anal. 263, 896–924 (2012)
Rajala, T.: Failure of the local-to-global property for CD\((K,N)\) spaces. Ann. Sc. Norm. Super. Pisa Cl. Sci. 16, 45–68 (2016)
Rajala, T., Sturm, K.-T.: Non-branching geodesics and optimal maps in strong CD\((K,\infty )\)-spaces. Calc. Var. Partial Differ. Equ. 50, 831–846 (2014)
von Renesse, M.-K.: On local Poincaré via transportation. Math. Z. 259, 21–31 (2008)
von Renesse, M.-K., Sturm, K.-T.: Transport inequalities, gradient estimates, entropy and Ricci curvature. Commun. Pure Appl. Math. 58, 923–940 (2005)
Schneider, R.: Convex Bodies: The Brunn–Minkowski Theory. Encyclopedia of Mathematics and its Applications, vol. 44. Cambridge University Press, Cambridge (1993)
Srivastava, S.M.: A Course on Borel Sets. Graduate Texts in Mathematics. Springer, Berlin (1998)
Sturm, K.-T.: On the geometry of metric measure spaces. I. Acta Math. 196(1), 65–131 (2006)
Sturm, K.-T.: On the geometry of metric measure spaces. II. Acta Math. 196(1), 133–177 (2006)
Urbas, J.: Mass Transfer Problems. Lecture notes, University of Bonn (1998)
Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics, vol. 58. American Mathematical Society, Providence, RI (2003)
Villani, C.: Optimal Transport: Old and New. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 338. Springer, Berlin (2009)
Acknowledgements
We would like to thank Theo Sturm and Cédric Villani for numerous discussions and for encouraging us to pursue the globalization problem. We also thank the referees for their careful reading of the manuscript and helpful comments.
Funding
Open access funding provided by Scuola Internazionale Superiore di Studi Avanzati - SISSA within the CRUI-CARE Agreement. E. Milman: The research leading to these results is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 637851).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: One dimensional \({\mathsf {CD}}(K,N)\) densities
Definition A.1
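A non-negative function h defined on an interval \(I \subset {\mathbb {R}}\) is called a \({\mathsf {CD}}(K,N)\) density on I, for \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\), if for all \(x_0,x_1 \in I\) and \(t \in [0,1]\):
$$\begin{aligned} h(t x_1 + (1-t) x_0)^{\frac{1}{N-1}} \ge \sigma ^{(t)}_{K,N-1}(\left| x_1-x_0\right| ) h(x_1)^{\frac{1}{N-1}} + \sigma ^{(1-t)}_{K,N-1}(\left| x_1-x_0\right| ) h(x_0)^{\frac{1}{N-1}} \end{aligned}$$
(recalling the coefficients \(\sigma \) from Definition 6.2). While the case \(N = \infty \) is avoided elsewhere in this work, it will be useful in this section to treat it as well; the latter condition is then interpreted by subtracting 1 from both sides, multiplying by \(N-1\), and taking the limit as \(N \rightarrow \infty \), namely:
$$\begin{aligned} \log h(t x_1 + (1-t) x_0) \ge t \log h(x_1) + (1-t) \log h(x_0) + \frac{K}{2} t (1-t) (x_1-x_0)^2 . \end{aligned}$$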
For completeness, we will say that h is a \({\mathsf {CD}}(K,1)\) density on I iff \(K \le 0\) and h is constant on the interior of I.
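The limiting procedure leading to the case \(N = \infty \) can be made explicit. The following is only a sketch, assuming the standard form \(\sigma ^{(t)}_{K,N}(\theta ) = \sin (t \theta \sqrt{K/N}) / \sin (\theta \sqrt{K/N})\) of the coefficients for \(K > 0\) (with the usual analytic continuation to \(K \le 0\)), which we take to agree with Definition 6.2. Abbreviating \(\theta = \left| x_1 - x_0\right| \) and \(x_t = t x_1 + (1-t) x_0\) (notation introduced here for convenience), and assuming h is positive at the three points involved, one has as \(N \rightarrow \infty \):
$$\begin{aligned} h(x_t)^{\frac{1}{N-1}}&= 1 + \frac{\log h(x_t)}{N-1} + O(N^{-2}) , \\ \sigma ^{(t)}_{K,N-1}(\theta )&= t + \frac{K \theta ^2}{6 (N-1)} t (1-t^2) + O(N^{-2}) , \\ t(1-t^2) + (1-t)(1-(1-t)^2)&= 3 t (1-t) . \end{aligned}$$
Inserting these expansions into the inequality \(h(x_t)^{\frac{1}{N-1}} \ge \sigma ^{(t)}_{K,N-1}(\theta ) h(x_1)^{\frac{1}{N-1}} + \sigma ^{(1-t)}_{K,N-1}(\theta ) h(x_0)^{\frac{1}{N-1}}\), subtracting \(1 = t + (1-t)\) from both sides, multiplying by \(N-1\) and letting \(N \rightarrow \infty \) gives:
$$\begin{aligned} \log h(x_t) \ge t \log h(x_1) + (1-t) \log h(x_0) + \frac{K}{2} t (1-t) \theta ^2 , \end{aligned}$$
which is the statement that \(\log h(x) + K \frac{x^2}{2}\) is concave on I.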
Unless otherwise stated, we assume in this Appendix that \(K \in {\mathbb {R}}\) and \(N \in (1,\infty ]\). The following is a specialization to dimension one of a well-known result in the theory of \({\mathsf {CD}}(K,N)\) mm-spaces, which explains the terminology above. Here we do not assume that a m.m.s. is necessarily equipped with a probability measure.
Theorem A.2
If h is a \({\mathsf {CD}}(K,N)\) density on an interval \(I \subset {\mathbb {R}}\) then the m.m.s. \((I,\left| \cdot \right| ,h(t) dt)\) verifies \({\mathsf {CD}}(K,N)\). Conversely, if the m.m.s. \(({\mathbb {R}},\left| \cdot \right| ,\mu )\) verifies \({\mathsf {CD}}(K,N)\) and \(I = \text {supp}(\mu )\) is not a point, then \(\mu \ll {\mathcal {L}}^1\) and there exists a version of the density \(h = d\mu / d{\mathcal {L}}^1\) which is a \({\mathsf {CD}}(K,N)\) density on I.
Proof
The first assertion follows from e.g. [74, Theorem 1.7 (ii)], and the second follows by considering the \({\mathsf {CD}}(K,N)\) condition for uniform measures \(\mu _0,\mu _1\) on intervals of length \(\varepsilon \) and \(\alpha \varepsilon \), respectively, letting \(\varepsilon \rightarrow 0\), employing Lebesgue’s differentiation theorem, and optimizing on \(\alpha > 0\) (e.g. as in the proof of [30, Theorem 4.3]). \(\square \)
Let h be a \({\mathsf {CD}}(K,N)\) density on an interval \(I \subset {\mathbb {R}}\). A few standard and easy consequences of Definition A.1 are:
- h is also a \({\mathsf {CD}}(K_2,N_2)\) density for all \(K_2 \le K\) and \(N_2 \in [N,\infty ]\) (this follows from the corresponding monotonicity of the coefficients \(\sigma ^{(t)}_{K,N-1}(\theta )\) in K and N, see e.g. [51, 74]).
- h is lower semi-continuous on I and locally Lipschitz continuous in its interior (this is easily reduced to a standard identical statement for concave functions on I).
- h is strictly positive in the interior whenever it does not identically vanish (this follows immediately from the definition).
- h is locally semi-concave in the interior, i.e. for all \(x_0\) in the interior of I, there exists \(C_{x_0} \in {\mathbb {R}}\) so that \(h(x) - C_{x_0} x^2\) is concave in a neighborhood of \(x_0\) (easily checked for \({\mathsf {CD}}(K,\infty )\) densities). In particular, it is twice differentiable (in the sense of Lemma 2.3) a.e. in I.
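The defining inequality of a \({\mathsf {CD}}(K,N)\) density and the monotonicity in K noted above lend themselves to a quick numerical sanity check. The sketch below is illustrative only: the function names `sigma`, `cd_gap` and `min_gap` are hypothetical, the formula used for the coefficients \(\sigma \) is the standard one and is assumed to agree with Definition 6.2, and the test density \(h(x) = \cos (x)^{N-1}\) on \((-\pi /2,\pi /2)\) is our choice of a model case for \(K = N-1\).

```python
import math

def sigma(t, theta, K, N):
    """Coefficient sigma^{(t)}_{K,N}(theta), standard form (assumed to match Definition 6.2)."""
    if theta == 0.0 or K == 0.0:
        return t
    if K > 0:
        a = theta * math.sqrt(K / N)
        if a >= math.pi:  # theta is at or beyond the diameter bound D_{K,N}
            return math.inf
        return math.sin(t * a) / math.sin(a)
    a = theta * math.sqrt(-K / N)
    return math.sinh(t * a) / math.sinh(a)

def cd_gap(h, x0, x1, t, K, N):
    """Left side minus right side of the CD(K,N) density inequality; >= 0 means it holds."""
    p = 1.0 / (N - 1.0)
    theta = abs(x1 - x0)
    xt = t * x1 + (1.0 - t) * x0
    return (h(xt) ** p
            - sigma(t, theta, K, N - 1.0) * h(x1) ** p
            - sigma(1.0 - t, theta, K, N - 1.0) * h(x0) ** p)

def min_gap(h, a, b, K, N, n=25):
    """Smallest gap over a grid of (x0, x1, t) with x0 < x1 in (a, b) and t in (0, 1)."""
    xs = [a + (b - a) * (i + 0.5) / n for i in range(n)]
    ts = [(j + 0.5) / n for j in range(n)]
    return min(cd_gap(h, x0, x1, t, K, N)
               for x0 in xs for x1 in xs if x0 < x1 for t in ts)

N = 3.0
model = lambda x: math.cos(x) ** (N - 1.0)  # h^{1/(N-1)} = cos solves g'' + g = 0

print(min_gap(model, -1.5, 1.5, N - 1.0, N))  # about 0: equality case K = N - 1
print(min_gap(model, -1.5, 1.5, 0.0, N))      # positive: also a CD(0, N) density
print(min_gap(model, -1.5, 1.5, N + 1.0, N))  # negative: the condition fails for larger K
```

A grid check of this kind can only refute the condition, never prove it; it is meant as a way to experiment with the definition rather than as a verification tool.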
1.1 Differential characterization
The following is a well-known differential characterization of \(C^2\)-smooth \({\mathsf {CD}}(K,N)\) densities:
Lemma A.3
Let \(h \in C^2_{loc}(I)\) on some open interval \(I \subset {\mathbb {R}}\). The following are equivalent:
- (1) h is a \({\mathsf {CD}}(K,N)\) density on I.
- (2) For all \(x \in I\):
$$\begin{aligned} (\log h)''(x) + \frac{1}{N-1} ((\log h)'(x))^2 = (N-1) \frac{(h^{\frac{1}{N-1}})''(x)}{h^{\frac{1}{N-1}}(x)} \le - K , \end{aligned}$$ (A.1)
where the left-hand side is interpreted as \((\log h)''(x)\) when \(N=\infty \).
Remark A.4
The equality in (A.1) holds for any \(N \in (1,\infty )\) by the Leibniz and chain rules at any point x where h(x) is positive and twice differentiable (and in particular, \(h^{\frac{1}{N-1}}\) and \(\log h\) are also twice differentiable at such a point x). The condition (A.1) is the one-dimensional specialization of the Bakry–Émery \({\mathsf {CD}}(K,N)\) condition for smooth weighted Riemannian manifolds [15, 16].
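For the reader's convenience, the computation behind the equality in (A.1) may be spelled out as follows, writing \(g = h^{\frac{1}{N-1}}\) at a point where h is positive and twice differentiable (the symbol g is used here only to shorten the formulas):
$$\begin{aligned} \log g&= \frac{\log h}{N-1} , \\ \frac{g''}{g}&= (\log g)'' + ((\log g)')^2 = \frac{(\log h)''}{N-1} + \frac{((\log h)')^2}{(N-1)^2} , \\ (N-1) \frac{g''}{g}&= (\log h)'' + \frac{1}{N-1} ((\log h)')^2 . \end{aligned}$$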
In fact, we will require a couple of extensions of the above standard claim, which in particular, together imply Lemma A.3; to avoid unnecessary generality, we only treat the case \(N \in (1,\infty )\).
Lemma A.5
Let h denote a \({\mathsf {CD}}(K,N)\) density on an interval \(I \subset {\mathbb {R}}\), \(N \in (1,\infty )\). Then h satisfies (A.1) at any point x in the interior where it is twice differentiable (in particular, (A.1) holds for a.e. \(x \in I\)).
Proof
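Let x be a point as above. Observe that:
$$\begin{aligned} \sigma ^{(1/2)}_{K,N-1}(2 \varepsilon ) = \frac{1}{2} + \frac{K}{4 (N-1)} \varepsilon ^2 + o(\varepsilon ^2) \quad \text {as } \varepsilon \rightarrow 0 , \end{aligned}$$
and so denoting \(g = h^{\frac{1}{N-1}}\), the \({\mathsf {CD}}(K,N)\) condition with \(x_0 = x - \varepsilon \), \(x_1 = x+\varepsilon \) and \(t = 1/2\) implies:
$$\begin{aligned} g(x) \ge \left( \frac{1}{2} + \frac{K}{4 (N-1)} \varepsilon ^2 + o(\varepsilon ^2) \right) \left( g(x+\varepsilon ) + g(x-\varepsilon ) \right) . \end{aligned}$$
It follows by Taylor’s theorem, which gives \(g(x+\varepsilon ) + g(x-\varepsilon ) = 2 g(x) + g''(x) \varepsilon ^2 + o(\varepsilon ^2)\), and continuity of g in the interior of I that:
$$\begin{aligned} g''(x) + \frac{K}{N-1} g(x) \le 0 , \end{aligned}$$
confirming (A.1) and concluding the proof. \(\square \)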
Lemma A.6
Let h be a positive differentiable function on an open interval \(I \subset {\mathbb {R}}\) whose derivative is locally absolutely continuous there (and hence h is twice differentiable a.e. in I). If h satisfies (A.1) for a.e. \(x \in I\) and \(N \in (1,\infty )\), then \(\ell (I) \le D_{K,N-1}\) and h is a \({\mathsf {CD}}(K,N)\) density on I.
Remark A.7
The differentiability assumption at every point cannot be relaxed, as witnessed by the convex function \(h(x) = \left| x\right| \), which satisfies \(h''(x) = 0\) for a.e. x but nevertheless is not concave.
Proof
Given \(x_0, x_1 \in I\) with \(\left| x_1 - x_0\right| < D_{K,N-1}\), consider the function \(\Delta \) on [0, 1] given by:
As h is positive and bounded away from zero on the compact segment between \(x_0\) and \(x_1\), and since \(y^{\frac{1}{N-1}}\) is Lipschitz on compact sub-intervals of \((0,\infty )\), it follows that \(\Delta \) is differentiable with absolutely continuous derivative on [0, 1]. In addition, clearly \(\Delta (0) = \Delta (1) = 0\). Abbreviating \(\sigma (t) = \sigma ^{(t)}_{K,N-1}(\left| x_1-x_0\right| )\), it is immediate to verify that:
and therefore our assumption (A.1) for a.e. \(x \in I\) implies:
Now set \(\Delta _0(t) = \Delta (t)\) and \(\Delta _1(t) = \Delta (1-t)\), and for each \(i \in \left\{ 0,1\right\} \), denote by \(\beta _i\) the absolutely continuous function on [0, 1] given by:
It follows by the Leibniz rule that for any \(i\in \left\{ 0,1\right\} \):
and since \(\sigma (0) = 0\) we also have \(\beta _i(0) = 0\). The absolute continuity implies that \(\beta _i\) is monotone non-increasing, and hence \(\beta _i(t) \le 0\) for all \(t \in [0,1]\).
We are ready to conclude that \(\Delta \ge 0\) on [0, 1], by showing that \(\Delta (t_0) \ge 0\) for any local extremum point \(t_0 \in (0,1)\) of \(\Delta \). Indeed, when \(K \le 0\), this is immediate, since \(\sigma ' > 0\) and:
When \(K > 0\), set \(t_1 = 1 - t_0\) which is a local extremal point of \(\Delta _1\) in (0, 1), and note that \(t_{i^*} \in (0,1/2]\) for some \(i^* \in \left\{ 0,1\right\} \). Since \(\left| x_1 - x_0\right| < D_{K,N-1}\), it follows that \(\sigma ' > 0\) on [0, 1/2], and so the same argument as for the case \(K \le 0\) but applied to \(\Delta _{i^*}\) yields that \(\Delta (t_0) = \Delta _{i^*}(t_{i^*}) \ge 0\), as asserted.
Finally, when \(K > 0\), assume for the sake of contradiction that there exist \(x_0,x_1 \in I\) with \(x_1 - x_0 = D_{K,N-1}\). Denote \(\Delta _0(t) = \Delta (t) := h^{\frac{1}{N-1}}(t x_1 + (1-t)x_0)\) and set \(\sigma (t) := \sin (\pi t)\) for \(t \in [0,1]\). Note that as before, (A.2) and (A.3) are satisfied, and so defining the function \(\beta _0\) by (A.4), \(\beta _0\) is again monotone non-increasing on [0, 1]. But:
yielding a contradiction to the monotonicity, and concluding the proof. \(\square \)
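The mechanism behind the monotonicity of \(\beta _i\) in the proof above can be summarized in the following sketch. It assumes that \(\beta _i\) is the Wronskian-type expression written below and abbreviates \(\kappa = K \left| x_1-x_0\right| ^2 / (N-1)\); both the explicit form and the symbol \(\kappa \) are assumptions made here for illustration, chosen to be consistent with the properties of \(\beta _i\) used in the proof, and they need not coincide verbatim with (A.2)–(A.4).
$$\begin{aligned} \sigma '' + \kappa \sigma&= 0 , \qquad \Delta _i'' + \kappa \Delta _i \le 0 \ \text { a.e. on } [0,1] , \\ \beta _i&:= \sigma \Delta _i' - \sigma ' \Delta _i , \\ \beta _i'&= \sigma \Delta _i'' - \sigma '' \Delta _i = \sigma (\Delta _i'' + \kappa \Delta _i) \le 0 \ \text { a.e. on } [0,1] . \end{aligned}$$
Under these assumptions, at a local extremum \(t_0 \in (0,1)\) of \(\Delta _i\) one has \(\Delta _i'(t_0) = 0\), so that \(\beta _i(t_0) = -\sigma '(t_0) \Delta _i(t_0) \le 0\), which forces \(\Delta _i(t_0) \ge 0\) wherever \(\sigma '(t_0) > 0\).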
1.2 A-priori estimates
We will also require the following a-priori estimates on the supremum and logarithmic derivative of \({\mathsf {CD}}(K,N)\) densities. Here it is crucial that \(N \in (1,\infty )\).
Lemma A.8
Let h denote a \({\mathsf {CD}}(K,N)\) density on a finite interval (a, b), \(N \in (1,\infty )\), which integrates to 1. Then:
In particular, for fixed K and N, h is uniformly bounded from above as long as \(b-a\) is uniformly bounded away from 0 (and from above if \(K < 0\)).
Proof
Given \(x_0 \in (a,b)\), we have by the \({\mathsf {CD}}(K,N)\) condition:
When \(K \ge 0\), the monotonicity of \(K \mapsto \sigma ^{(t)}_{K,N-1}(\theta )\) implies that \(\sigma ^{(t)}_{K,N-1}(\theta ) \ge \sigma ^{(t)}_{0,N-1}(\theta ) = t\), and we obtain:
When \(K < 0\), one may show that the function \(\theta \mapsto \sigma ^{(t)}_{K,N-1}(\theta )\) is decreasing on \({\mathbb {R}}_+\), as this is equivalent to showing that the function \(x \mapsto \log \sinh \exp (x)\) is convex on \({\mathbb {R}}_+\), and the latter may be verified by direct differentiation (and using that \(\sinh (x) \cosh (x) \ge x\)). Consequently, we obtain:
as asserted. We remark that when \(K > 0\), one may similarly show that the function \(\theta \mapsto \sigma ^{(t)}_{K,N-1}(\theta )\) is increasing on \([0,D_{K,N-1})\), and since \(\sigma ^{(t)}_{K,N-1}(0) = t\), we obtain the previous estimate we employed. \(\square \)
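The two monotonicity claims for \(\theta \mapsto \sigma ^{(t)}_{K,N-1}(\theta )\) used in the proof, together with the elementary inequality \(\sinh (x) \cosh (x) \ge x\), can be probed numerically. The following is a minimal sketch under the assumption that the coefficients have their standard form (taken to agree with Definition 6.2); the names `sigma` and `is_monotone`, the grids, and the sample values \(t = 0.3\), \(N - 1 = 2\), \(K = \pm 1\) are arbitrary illustrative choices.

```python
import math

def sigma(t, theta, K, N):
    """Coefficient sigma^{(t)}_{K,N}(theta), standard form (assumed to match Definition 6.2)."""
    if theta == 0.0 or K == 0.0:
        return t
    if K > 0:
        a = theta * math.sqrt(K / N)
        if a >= math.pi:
            return math.inf
        return math.sin(t * a) / math.sin(a)
    a = theta * math.sqrt(-K / N)
    return math.sinh(t * a) / math.sinh(a)

def is_monotone(values, increasing):
    """True if consecutive values are strictly increasing (or strictly decreasing)."""
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(u < v for u, v in pairs)
    return all(u > v for u, v in pairs)

N1 = 2.0   # plays the role of N - 1
t = 0.3

# K < 0: theta -> sigma should decrease from its limit t at theta = 0.
neg = [sigma(t, 0.1 * k, -1.0, N1) for k in range(1, 101)]

# K > 0: theta -> sigma should increase on (0, D) with D = pi * sqrt((N-1)/K).
D = math.pi * math.sqrt(N1 / 1.0)
pos = [sigma(t, D * k / 101, 1.0, N1) for k in range(1, 101)]

print(is_monotone(neg, increasing=False), all(v < t for v in neg))
print(is_monotone(pos, increasing=True), all(v > t for v in pos))

# Elementary inequality behind the convexity of x -> log sinh exp(x).
print(all(math.sinh(0.05 * k) * math.cosh(0.05 * k) >= 0.05 * k for k in range(0, 101)))
```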
Lemma A.9
Let h denote a \({\mathsf {CD}}(K,N)\) density on a finite interval (a, b), \(N \in (1,\infty )\). Then:
for any point \(x \in (a,b)\) where h is differentiable. In particular, \(\log h\) is locally Lipschitz on (a, b), with estimates depending continuously only on x, a, b, K, N.
Proof
Denote \(\Psi = h^{\frac{1}{N-1}}\). The inequality on the right-hand-side follows since:
with equality at \(t=1\), and hence we may compare derivatives at \(t=1\):
whenever \(\Psi \) is differentiable at x. The inequality on the left-hand-side follows similarly. \(\square \)
1.3 Logarithmic convolutions
We will require the following:
Proposition A.10
Let h denote a \({\mathsf {CD}}(K,N)\) density on an interval (a, b). Let \(\psi _\varepsilon \) denote a non-negative \(C^2\) function supported on \([-\varepsilon ,\varepsilon ]\) with \(\int \psi _\varepsilon = 1\). For any \(\varepsilon \in (0,\frac{b-a}{2})\), define the function \(h^\varepsilon \) on \((a+\varepsilon ,b-\varepsilon )\) by:
Then \(h^\varepsilon \) is a \(C^2\)-smooth \({\mathsf {CD}}(K,N)\) density on \((a+\varepsilon ,b -\varepsilon )\).
For the proof, we will require the following general lemma:
Lemma A.11
Let g denote a semi-concave function on an open interval I (i.e. \(g(x) - M\frac{x^2}{2}\) is concave for some \(M \ge 0\)). Let \(\psi \) denote a \(C^2\)-smooth non-negative test function with compact support in I. Then:
In other words, the singular part of g’s distributional second derivative is non-positive.
The argument is identical to the one used by D. Cordero-Erausquin in the proof of [32, Lemma 1]. For completeness, we present the proof.
Proof
Extend g and \(\psi \) to the entire \({\mathbb {R}}\) by defining them as equal to zero outside of I. Given \(\varepsilon > 0\) and \(x \in I\), denote:
and similarly for \(D^2_\varepsilon \psi (x)\). By Taylor’s theorem, for any point \(x \in I\) where g is twice differentiable we have \(\lim _{\varepsilon \rightarrow 0} D^2_\varepsilon g(x) = g''(x)\). In fact, this holds at any point where g has a second Peano derivative, see Sect. 2.2; in the context of convex functions on \({\mathbb {R}}^n\), such points are called points possessing a Hessian in the sense of Aleksandrov. Now since for small enough \(\varepsilon > 0\), \(D^2_\varepsilon g \le M\) on the support of \(\psi \) by semi-concavity (and since \(\psi \ge 0\)), we obtain by Fatou’s lemma:
where the last equality follows by Lebesgue’s Dominated Convergence theorem using the fact that \(\left| D^2_\varepsilon \psi (x)\right| \le \max \left| \psi ''\right| \) for all \(x \in I , \varepsilon > 0\), and the fact that g is locally integrable. \(\square \)
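As a simple illustration of the lemma (an example of ours, not taken from [32]), consider \(g(x) = -\left| x\right| \) on \(I = (-1,1)\), which is concave and hence semi-concave with \(M = 0\), and let \(\psi \) be a test function as in the lemma. We read the conclusion of the lemma as the inequality \(\int _I g \psi '' dx \le \int _I g'' \psi dx\), with \(g''\) denoting the a.e. defined second derivative. Integrating by parts on each side of the origin:
$$\begin{aligned} \int _I g \psi '' dx&= -\int _0^1 x \psi ''(x) dx + \int _{-1}^0 x \psi ''(x) dx = -\psi (0) - \psi (0) = -2 \psi (0) , \\ \int _I g'' \psi dx&= 0 , \end{aligned}$$
since \(g'' = 0\) away from the origin. The difference \(-2\psi (0) \le 0\) is exactly the contribution of the singular part \(-2\delta _0\) of the distributional second derivative of g.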
Proof of Proposition A.10
Note that \(\log h\) is locally integrable on (a, b), so that the integral:
is well-defined for all \(x \in (a+\varepsilon , b-\varepsilon )\), and we may take two derivatives in x under the integral (as \(\psi _\varepsilon \) is \(C^2\)-smooth with bounded corresponding derivatives), implying the asserted smoothness. In addition:
where the last equality follows from the usual integration by parts formula and Leibniz rule since \((\log h(y)) \psi _{\varepsilon }(x-y)\) is absolutely continuous. Furthermore:
where the last inequality follows by Lemma A.11 applied to \(g = \log h\), since h is a \({\mathsf {CD}}(K,\infty )\) density (by monotonicity in N), and hence \(\log h(x) + K \frac{x^2}{2}\) is concave on (a, b).
Putting everything together and applying Jensen’s inequality, we obtain:
where the last inequality follows since the integrand is non-positive (where it is defined) by Lemma A.5. A final application of Lemma A.3 concludes the proof. \(\square \)
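The statement of Proposition A.10 can be explored numerically on a non-smooth example. The sketch below assumes that the logarithmic convolution is given by \(\log h^\varepsilon (x) = \int \log h(y) \psi _\varepsilon (x-y) dy\), as the title of this subsection and the proof above suggest. The density \(h(x) = \min (x,1-x)^{N-1}\) on (0, 1) (a \({\mathsf {CD}}(0,N)\) density, since \(h^{\frac{1}{N-1}}\) is concave), the particular mollifier, the quadrature and the finite-difference steps are all illustrative choices of ours, and the names `log_h_eps` and `a1_lhs` are hypothetical.

```python
import math

N, K = 3.0, 0.0
EPS = 0.05

def h(x):
    """Non-smooth CD(0, N) density on (0, 1): h^{1/(N-1)} = min(x, 1 - x) is concave."""
    return min(x, 1.0 - x) ** (N - 1.0)

def psi(u):
    """C^2 bump supported on [-1, 1] with unit integral (one possible mollifier)."""
    return 35.0 / 32.0 * (1.0 - u * u) ** 3 if abs(u) < 1.0 else 0.0

def log_h_eps(x, n=2000):
    """Midpoint-rule value of int log h(y) psi_eps(x - y) dy over [x - EPS, x + EPS]."""
    step = 2.0 * EPS / n
    total = 0.0
    for i in range(n):
        y = x - EPS + (i + 0.5) * step
        total += math.log(h(y)) * psi((x - y) / EPS) / EPS
    return total * step

def a1_lhs(x, d=1e-3):
    """Finite-difference value of (log h_eps)'' + ((log h_eps)')^2 / (N - 1) at x."""
    lm, l0, lp = log_h_eps(x - d), log_h_eps(x), log_h_eps(x + d)
    first = (lp - lm) / (2.0 * d)
    second = (lp - 2.0 * l0 + lm) / (d * d)
    return second + first * first / (N - 1.0)

# The differential characterization (A.1) asks for a1_lhs(x) <= -K.
for x in (0.3, 0.5, 0.7):
    print(x, a1_lhs(x) <= -K)
```

At \(x = 0.5\) the kink of h lies inside the averaging window and the left-hand side of (A.1) is strongly negative, reflecting the non-positive singular part handled by Lemma A.11; away from the kink the slack comes only from Jensen's inequality and is correspondingly small.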
We will use Proposition A.10 in the following form:
Proposition A.12
Let \(\left\{ h_s(t)\right\} _{s \in (c,d)}\) denote a Borel measurable family of \({\mathsf {CD}}(K,N)\) densities on (a, b) (so that for every \(t \in (a,b)\), \((c,d) \ni s \mapsto h_s(t)\) is Borel measurable). Assume in addition that:
Given \(\varepsilon _1,\varepsilon _2 > 0\) and \(s \in (c+\varepsilon _2 , d-\varepsilon _2)\), denote the following function:
where as usual, \(\psi _{\varepsilon _i}\) denotes a non-negative \(C^2\) function supported on \([-\varepsilon _i,\varepsilon _i]\) with \(\int \psi _{\varepsilon _i} = 1\). Then \(\left\{ h^{\varepsilon _1,\varepsilon _2}_{s}(t)\right\} _{s \in (c+\varepsilon _2,d-\varepsilon _2)}\) is a \(C^2\)-smooth (in (s, t)) family of \({\mathsf {CD}}(K,N)\) densities on \((a+\varepsilon _1,b-\varepsilon _1)\).
Proof
The proof is a repetition of the proof of the previous proposition, so we will be brief. Our assumption (A.5) implies that (A.6) is well-defined, and justifies taking two derivatives in t and s under the integral, implying the assertion on smoothness. The first derivative in t under the integral may be integrated by parts, whereas for the second derivative we apply Lemma A.11. A final application of Jensen’s inequality as in Proposition A.10 establishes the asserted differential characterization of \({\mathsf {CD}}(K,N)\), concluding the proof. \(\square \)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Cavalletti, F., Milman, E. The globalization theorem for the Curvature-Dimension condition. Invent. math. 226, 1–137 (2021). https://doi.org/10.1007/s00222-021-01040-6
DOI: https://doi.org/10.1007/s00222-021-01040-6