1 Introduction

The Curvature-Dimension condition \({\mathsf {CD}}(K,N)\) was first introduced in the 1980’s by Bakry and Émery [15, 16] in the context of diffusion generators, having in mind primarily the setting of weighted Riemannian manifolds, namely smooth Riemannian manifolds endowed with a smooth density with respect to the Riemannian volume. The \({\mathsf {CD}}(K,N)\) condition serves as a generalization of the classical condition in the non-weighted Riemannian setting of having Ricci curvature bounded below by \(K \in {\mathbb {R}}\) and dimension bounded above by \(N \in [1,\infty ]\) (see e.g. [56, 60] for further possible extensions). Numerous consequences of this condition have been obtained over the past decades, extending results from the classical non-weighted setting and at times establishing new ones directly in the weighted one. These include diameter bounds, volume comparison theorems, heat-kernel and spectral estimates, Harnack inequalities, topological implications, Brunn–Minkowski-type inequalities, and isoperimetric, functional and concentration inequalities—see e.g. [17, 48, 77] and the references therein.

As a differential and Hilbertian condition, the Bakry–Émery definition long resisted extension beyond the smooth Riemannian setting, even as interest in (measured) Gromov-Hausdorff limits of Riemannian manifolds and other non-Hilbertian singular spaces steadily grew. In parallel, and apparently unrelatedly, the theory of Optimal-Transport was being developed in increasing generality following the influential work of Brenier [21] (see e.g. [2, 36, 53, 65, 75,76,77]). Given two probability measures \(\mu _0,\mu _1\) on a common geodesic space \((X,\mathsf {d})\) and a prescribed cost of transporting a single unit of mass from point x to y, the Monge-Kantorovich idea is to optimally couple \(\mu _0\) and \(\mu _1\) by minimizing the total transportation cost, and as a byproduct obtain a Wasserstein geodesic \([0,1] \ni t \mapsto \mu _t\) connecting \(\mu _0\) and \(\mu _1\) in the space of probability measures \({\mathcal {P}}(X)\). This gives rise to the notion of displacement convexity of a given functional on \({\mathcal {P}}(X)\) along Wasserstein geodesics, introduced and studied by McCann [52]. Following the works of Cordero-Erausquin–McCann–Schmuckenschläger [33], Otto–Villani [62] and von Renesse–Sturm [70], it was realized that the \({\mathsf {CD}}(K,\infty )\) condition in the smooth setting may be equivalently formulated synthetically as a certain convexity property of an entropy functional along \(W_2\) Wasserstein geodesics (associated to \(L^2\)-Optimal-Transport, when the transport-cost is given by the squared-distance function).

This idea culminated in the seminal works of Lott–Villani [51] and Sturm [73, 74], where a synthetic definition of \({\mathsf {CD}}(K,N)\) was proposed on a general (complete, separable) metric space \((X,\mathsf {d})\) endowed with a (locally-finite Borel) reference measure \({\mathfrak {m}}\) (“metric-measure space”, or m.m.s.); it was moreover shown that the latter definition coincides with the Bakry–Émery one in the smooth Riemannian setting (and in particular in the classical non-weighted one), that it is stable under measured Gromov-Hausdorff convergence of m.m.s.’s, and that it implies various geometric and analytic inequalities relating metric and measure, in complete analogy with the smooth setting. It was subsequently also shown [58, 64] that Finsler manifolds and Alexandrov spaces satisfy the Curvature-Dimension condition. Thus emerged an overwhelmingly convincing notion of Ricci curvature lower bound K and dimension upper bound N for a general (geodesic) m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\), leading to a rich and fruitful theory exploring the geometry of m.m.s.’s by means of Optimal-Transport.

One of the most important and longstanding open problems in the Lott–Sturm–Villani theory (see [73, 74] and [77, pp. 888, 907]) is whether the Curvature-Dimension condition on a general geodesic m.m.s. (say, having full-support \(\text {supp}({\mathfrak {m}}) = X\)) enjoys the globalization (or local-to-global) property: if the \({\mathsf {CD}}(K,N)\) condition is known to hold on a neighborhood \(X_o\) of any given point \(o \in X\) (a property henceforth denoted by \({\mathsf {CD}}_{loc}(K,N)\)), does it also necessarily hold on the entire space? Clearly this is indeed the case in the smooth setting, as both curvature and dimension may be computed locally (by equivalence with the differential \({\mathsf {CD}}\) definition). However, for reasons which we will expand on shortly, this is not at all clear and in some cases is actually false on general m.m.s.’s. An affirmative answer to this question would immensely facilitate the verification of the \({\mathsf {CD}}\) condition, which at present requires testing all possible \(W_2\)-geodesics on X, instead of locally on each \(X_o\). The analogous question for sectional curvature on Alexandrov spaces (where the dimension N is absent) does indeed have an affirmative answer, as shown by Toponogov and, in full generality, by Perelman (see [22]).

Several partial answers to the local-to-global problem have already been obtained in the literature. A geodesic space \((X,\mathsf {d})\) is called non-branching if geodesics are forbidden to branch at an interior-point into two separate geodesics. On a non-branching geodesic m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) having full support, it was shown by Sturm in [73, Theorem 4.17] that the local-to-global property is satisfied when \(N = \infty \) (assuming that the space of probability measures with finite \({\mathfrak {m}}\)-relative entropy is geodesically convex; see also [77, Theorem 30.42] where the same globalization result was proved under a different condition involving the existence of a full-measure totally-convex subset of X of finite-dimensional points). Still for non-branching geodesic m.m.s.’s having full support, a positive answer was also obtained by Villani in [77, Theorem 30.37] for the case \(K=0\) and \(N \in [1,\infty )\).

We stress that in these results, the restriction to non-branching spaces is not merely a technical assumption—an example of a heavily-branching m.m.s. verifying \({\mathsf {CD}}_{loc}(0,4)\) which does not verify \({\mathsf {CD}}(K,N)\) for any fixed \(K \in {\mathbb {R}}\) and \(N\in [1,\infty ]\) was constructed by Rajala in [67]. Consequently, a natural assumption is to require that \((X,\mathsf {d})\) be non-branching, or more generally, to require that the \(L^2\)-Optimal-Transport on \((X,\mathsf {d},{\mathfrak {m}})\) be concentrated (i.e. up to a null-set) on a non-branching subset of geodesics, an assumption introduced by Rajala and Sturm in [68] under the name essentially non-branching (see Sect. 6 for precise definitions). For instance, it is known [68] that measured Gromov-Hausdorff limits of Riemannian manifolds satisfying \({\mathsf {CD}}(K,\infty )\), and more generally, \({\mathsf {RCD}}(K,\infty )\) spaces, always satisfy the essentially non-branching assumption (see Sect. 13).

In this work, we provide an affirmative answer to the globalization problem in the remaining range of parameters: for \(N \in (1,\infty )\) and \(K \in {\mathbb {R}}\), the \({\mathsf {CD}}(K,N)\) condition verifies the local-to-global property on an essentially non-branching geodesic m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) having finite total-measure and full support. The exclusion of the case \(N=1\) is to avoid unnecessary pathologies, and is not essential. Our assumption that \({\mathfrak {m}}\) has finite total-measure (or equivalently, by scaling, that it is a probability measure) is most probably technical, but so as to avoid overloading the paper even further, we did not verify that it can be removed. This result is new even under the additional assumption that the space is infinitesimally Hilbertian (see [40])—we will say that such spaces verify \({\mathsf {RCD}}(K,N)\)—in which case the assumption of being (globally) essentially non-branching is in fact superfluous.

To better explain the difference between the previously known cases when \(\frac{K}{N} = 0\) and the conceptual challenge which the newly treated case \(\frac{K}{N} \ne 0\) poses, as well as to sketch our solution and its main new ingredients, which we believe are of independent interest, we provide some additional details below and refer to Sect. 6 for precise definitions.

1.1 Disentangling volume-distortion coefficients

Roughly speaking, the \({\mathsf {CD}}(K,N)\) condition prescribes a synthetic second-order bound on how an infinitesimal volume changes when it is moved along a \(W_2\)-geodesic: the volume distortion (or transport Jacobian) J along the geodesic should satisfy the following interpolation inequality for \(t_0 = 0\) and \(t_1 = 1\):

$$\begin{aligned} J^{\frac{1}{N}}(\alpha t_1 + (1-\alpha ) t_0) \ge \tau _{K,N}^{(\alpha )}(\left| t_1-t_0\right| \theta ) J^{\frac{1}{N}}(t_1) + \tau _{K,N}^{(1-\alpha )}(\left| t_1-t_0\right| \theta ) J^{\frac{1}{N}}(t_0) \;\;\; \forall \alpha \in [0,1] , \end{aligned}$$
(1.1)

where \(\tau _{K,N}^{(t)}(\theta )\) is an explicit coefficient depending on the curvature \(K \in {\mathbb {R}}\), dimension \(N \in [1,\infty ]\), the interpolating time parameter \(t \in [0,1]\) and the total length of the geodesic \(\theta \in [0,\infty )\) (with an appropriate interpretation of (1.1) when \(N=\infty \)). When \(N <\infty \), the latter coefficient is obtained by geometrically averaging two different volume distortion coefficients:

$$\begin{aligned} \tau _{K,N}^{(t)}(\theta ) := t^{\frac{1}{N}} \sigma _{K,N-1}^{(t)}(\theta )^{\frac{N-1}{N}} , \end{aligned}$$
(1.2)

where the \(\sigma _{K,N-1}^{(t)}(\theta )\) term encodes an \((N-1)\)-dimensional evolution orthogonal to the transport and thus affected by the curvature, and the linear term t represents a one-dimensional evolution tangential to the transport and thus independent of any curvature information. As with the Jacobi equation in the usual Riemannian setting, the function \([0,1] \ni t \mapsto \sigma (t) := \sigma _{K,N-1}^{(t)}(\theta )\) is explicitly obtained by solving the second-order differential equation:

$$\begin{aligned} \sigma ''(t) + \theta ^{2} \frac{K}{N-1} \sigma (t) =0 \;~\text {on}~\; t \in [0,1] ~,~ \sigma (0) = 0 ~,~ \sigma (1) = 1 . \end{aligned}$$
(1.3)
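For the reader's convenience, let us record the explicit solution of the linear ODE (1.3), a standard computation:

$$\begin{aligned} \sigma _{K,N-1}^{(t)}(\theta ) = {\left\{ \begin{array}{ll} \frac{\sin (t \theta \sqrt{K/(N-1)})}{\sin (\theta \sqrt{K/(N-1)})} &{} \text {if } K > 0 , \\ t &{} \text {if } K = 0 , \\ \frac{\sinh (t \theta \sqrt{-K/(N-1)})}{\sinh (\theta \sqrt{-K/(N-1)})} &{} \text {if } K < 0 , \end{array}\right. } \end{aligned}$$

(with the usual convention that \(\sigma _{K,N-1}^{(t)}(\theta ) = +\infty \) when \(K \theta ^2 \ge (N-1) \pi ^2\)). Note in particular that when \(K = 0\), the definition (1.2) yields \(\tau _{0,N}^{(t)}(\theta ) = t^{\frac{1}{N}} \cdot t^{\frac{N-1}{N}} = t\).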

The common feature of the previously known cases \(\frac{K}{N} = 0\) for the local-to-global problem is the linear behaviour in time of the distortion coefficient: \(\tau _{K,N}^{(t)}(\theta ) = t\). A major obstacle with the remaining cases \(\frac{K}{N} \ne 0\) is that the function \([0,1] \ni t \mapsto \tau _{K,N}^{(t)}(\theta )\) does not satisfy a second-order differential characterization such as (1.3). If it did, it would be possible to express the interpolation inequality (1.1) on \([t_0,t_1] \subset [0,1]\) as a second-order differential inequality for \(J^{\frac{1}{N}}\) on \([t_0,t_1]\) (see Lemmas A.5 and A.6), and so if (1.1) were known to hold for all \(\left\{ [t_0^i,t_1^i]\right\} _{i=1\ldots k}\) so that \(\cup _{i=1}^k (t_0^i,t_1^i) = (0,1)\), it would follow that (1.1) also holds for \([t_0,t_1] = [0,1]\). However, a counterexample to the latter implication was constructed by Deng and Sturm in [34], thereby showing that:

$$\begin{aligned} (1.1) \text { holds on } [t_0^i,t_1^i] \text { for } i=1,\ldots ,k \text { with } \cup _{i=1}^k (t_0^i,t_1^i) = (0,1) \;\; \not \Rightarrow \;\; (1.1) \text { holds on } [0,1] . \end{aligned}$$
(1.4)

On the other hand, the above argument does work if we were to replace \(\tau \) by the slightly smaller \(\sigma \) coefficients. This motivated Bacher and Sturm in [14] to define for \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\) the slightly weaker “reduced” Curvature-Dimension condition, denoted by \({\mathsf {CD}}^{*}(K,N)\), where the distortion coefficients \(\tau _{K,N}^{(t)}(\theta )\) are indeed replaced by \(\sigma _{K,N}^{(t)}(\theta )\). Using the above gluing argument (after resolving numerous technicalities), the local-to-global property for \({\mathsf {CD}}^*(K,N)\) was established in [14] on non-branching spaces (see also the work of Erbar–Kuwada–Sturm [35, Corollary 3.13, Theorem 3.14 and Remark 3.26] for an extension to the essentially non-branching setting, cf. [29, 68]). Let us also mention here the work of Ambrosio–Mondino–Savaré [10], who independently of a similar result in [35], established the local-to-global property for \({\mathsf {RCD}}^*(K,N)\) proper spaces, \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\), without a-priori assuming any non-branching assumptions (but a-posteriori, such spaces must be essentially non-branching by [68]).

Without requiring any non-branching assumptions, the \({\mathsf {CD}}^*(K,N)\) condition was shown in [14] to imply the same geometric and analytic inequalities as the \({\mathsf {CD}}(K,N)\) condition, but with slightly worse constants (typically missing the sharp constant by a factor of \(\frac{N-1}{N}\)), suggesting that the latter is still the “right” notion of Curvature-Dimension. We conclude that the local-to-global challenge is to properly disentangle the orthogonal and tangential components of the volume distortion J before attempting to individually integrate them as above. This also highlights the geometric nature of the globalization problem, and demonstrates that it is not merely a technical challenge.

1.2 Comparing \(L^2\)- and \(L^1\)-Optimal-Transport and main result

There have been a couple of prior attempts to disentangle the volume distortion into its orthogonal and tangential components, by comparing \(W_2\) and \(W_1\) Wasserstein geodesics (associated to \(L^2\)- and \(L^1\)-Optimal-Transport, respectively). In [30], this strategy was implicitly employed by Cavalletti and Sturm to show that \({\mathsf {CD}}_{loc}(K,N)\) implies the measure-contraction property \({\mathsf {MCP}}(K,N)\), which in a sense is a particular case of \({\mathsf {CD}}(K,N)\) when one end of the \(W_2\)-geodesic is a Dirac delta at a point \(o \in X\) (see [57, 74]). In that case, all of the transport-geodesics have o as a common end-point, so by considering a disintegration of \({\mathfrak {m}}\) on the family of spheres centered at o, and restricting the \(W_2\)-geodesic to these spheres, the desired disentanglement was obtained. In the subsequent work [24], Cavalletti generalized this approach to a particular family of \(W_2\)-geodesics, having the property that for a.e. transport-geodesic \(\gamma \), its length \(\ell (\gamma )\) is a function of \(\varphi (\gamma _{0})\), where \(\varphi \) is a Kantorovich potential associated to the corresponding \(L^2\)-Optimal-Transport problem. Here the disintegration was with respect to the individual level sets of \(\varphi \), and again the restriction of the \(W_2\)-geodesic enjoying the latter property to these level sets (formally of co-dimension one) induced a \(W_1\)-geodesic, enabling disentanglement.

Another application of \(L^1\)-Optimal-Transport, seemingly unrelated to disentanglement of \(W_2\)-geodesics, appeared in the recent breakthrough work of Klartag [47] on localization in the smooth Riemannian setting. The localization paradigm, developed by Payne–Weinberger [63], Gromov–Milman [44] and Kannan–Lovász–Simonovits [46], is a powerful tool to reduce various analytic and geometric inequalities on the space \(({\mathbb {R}}^n,\mathsf {d},{\mathfrak {m}})\) to appropriate one-dimensional counterparts. The original approach by these authors was based on a bisection method, and thus inherently confined to \({\mathbb {R}}^n\). In [47], Klartag extended the localization paradigm to the weighted Riemannian setting, by disintegrating the reference measure \({\mathfrak {m}}\) on \(L^1\)-Optimal-Transport geodesics (or “rays”) associated to the inequality under study (cf. Feldman–McCann [38]), and proving that the resulting conditional one-dimensional measures inherit the Curvature-Dimension properties of the underlying manifold.

Klartag’s idea is quite robust, and permitted Cavalletti and Mondino in [27] to avoid the smooth techniques used in [47] and to extend the localization paradigm to the framework of essentially non-branching geodesic m.m.s.’s \((X,\mathsf {d},{\mathfrak {m}})\) of full-support verifying \({\mathsf {CD}}_{loc}(K,N)\), \(N \in (1,\infty )\). By a careful study of the structure of \(W_1\)-geodesics, Cavalletti and Mondino were able to transfer the Curvature-Dimension information encoded in the \(W_2\)-geodesics to the individual rays along which a given \(W_1\)-geodesic evolves, thereby proving that on such spaces,

$$\begin{aligned} \text {the conditional measures obtained by disintegrating } {\mathfrak {m}} \text { along the transport rays of a } W_1\text {-geodesic verify } {\mathsf {CD}}(K,N) . \end{aligned}$$
(1.5)

Note that the densities of one-dimensional \({\mathsf {CD}}(K,N)\) spaces are characterized via the \(\sigma \) (as opposed to \(\tau \)) volume-distortion coefficients (see the “Appendix”), so by applying the gluing argument described in the previous subsection, only local \({\mathsf {CD}}_{loc}(K,N)\) information was required in [27] to obtain global control over the entire one-dimensional transport ray.

This allowed Cavalletti and Mondino (see [27, 28]) to obtain a series of sharp geometric and analytic inequalities for \({\mathsf {CD}}_{loc}(K,N)\) spaces as above, in particular extending from the smooth Riemannian setting the sharp Lévy-Gromov [42] and Milman [55] isoperimetric inequalities, as well as the sharp Brunn-Minkowski inequality of Cordero-Erausquin–McCann–Schmuckenschläger [33] and Sturm [74], all in global form (see also Ohta [59]).

We would like to address at this point a general belief shared by some in the Optimal-Transport community, namely that the property \({\mathsf {BM}}(K,N)\) of satisfying the Brunn-Minkowski inequality (with sharp coefficients correctly depending on \(K,N\)) should be morally equivalent to the \({\mathsf {CD}}(K,N)\) condition. Rigorously establishing such an equivalence would immediately yield the local-to-global property of \({\mathsf {CD}}(K,N)\), by the Cavalletti–Mondino localization proof that \({\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {BM}}(K,N)\). However, we were unsuccessful in establishing the missing implication \({\mathsf {BM}}(K,N) \Rightarrow {\mathsf {CD}}(K,N)\), and in fact a careful attempt in this direction seems to lead back to the circle of ideas we were ultimately able to successfully develop in this work.

Instead of starting our investigation from \({\mathsf {BM}}(K,N)\), our strategy is to directly start from a suitable modification of the property (1.5), which we dub \({\mathsf {CD}}^1(K,N)\), when (1.5) is required to hold for transport rays associated to (signed) distance functions from level sets of continuous functions. A stronger condition, when (1.5) is required to hold for transport rays associated to all 1-Lipschitz functions, is denoted by \({\mathsf {CD}}^1_{Lip}(K,N)\)—see Sect. 8 for precise definitions. The main result of this work consists of showing that \({\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N)\), by means of transferring the one-dimensional \({\mathsf {CD}}(K,N)\) information encoded in a family of suitably constructed \(L^1\)-Optimal-Transport rays, onto a given \(W_2\)-geodesic, thereby obtaining the correct disentanglement between tangential and orthogonal distortions. This goes in exactly the opposite direction to the one studied by Cavalletti and Mondino in [27], and completes the cycle

$$\begin{aligned} {\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {CD}}^1_{Lip}(K,N) \Rightarrow {\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N) . \end{aligned}$$

To the best of our knowledge, this decisive feature of our work—deducing \({\mathsf {CD}}(K,N)\) for a given \(W_2\)-geodesic by considering the \({\mathsf {CD}}_{\text {loc}}(K,N)\) information encoded in a family (in accordance with (1.4)) of different associated \(W_2\)-geodesics (manifesting itself in the \({\mathsf {CD}}^1(K,N)\) information along a family of different \(L^1\)-Optimal-Transport rays)—has not been previously explored.

Main Theorem 1.1

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. with \({\mathfrak {m}}(X) < \infty \), and let \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\). Then the following statements are equivalent:

(1) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}(K,N)\).

(2) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^*(K,N)\).

(3) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{Lip}(K,N)\).

(4) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1(K,N)\).

If in addition \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length-space, the above statements are equivalent to

(5) \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}_{loc}(K,N)\).

To this list one can also add the entropic Curvature-Dimension condition \({\mathsf {CD}}^e(K,N)\) of Erbar–Kuwada–Sturm [35], which is known to be equivalent to \({\mathsf {CD}}^*(K,N)\) for essentially non-branching spaces. In other words, all synthetic definitions of Curvature-Dimension are equivalent for essentially non-branching m.m.s.’s, and in particular, the local-to-global property holds for such spaces (recall that this is known to be false on m.m.s.’s where branching is allowed by [67]). The equivalence with \({\mathsf {CD}}_{loc}(K,N)\) is clearly false without some global assumption ultimately ensuring that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a geodesic-space, see Remark 13.4.

As already mentioned, and being slightly imprecise (see Sect. 13 for precise statements), the implications \({\mathsf {CD}}(K,N) \Rightarrow {\mathsf {CD}}^*(K,N) \Rightarrow {\mathsf {CD}}_{loc}(K,N)\) follow from the work of Bacher and Sturm [14], and the implication \({\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {CD}}^1_{Lip}(K,N)\) follows by adapting to the present framework what was already proved by Cavalletti and Mondino in [27] (after taking care of the important maximality requirement of transport-rays, see Theorem 7.10). So almost all of our effort goes into proving that \({\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N)\). For a smooth weighted Riemannian manifold \((M,\mathsf {d},{\mathfrak {m}})\), it is an easy exercise to show the latter implication using the Bakry–Émery differential characterization of \({\mathsf {CD}}(K,N)\)—simply use an appropriate umbilic hypersurface H passing through a given point \(p \in M\) and perpendicular to a given direction \(\xi \in T_p M\), and apply the \({\mathsf {CD}}^1(K,N)\) definition to the distance function from H. Of course, this provides no insight towards how to proceed in the m.m.s. setting, so it is natural to try to obtain an alternative synthetic proof, still in the smooth setting. While this is possible, it already poses a much greater challenge, which in some sense provided the required insight leading to the strategy we ultimately employ in this work.

1.3 Main new ingredients of proof

To achieve the right disentanglement, we are required to develop several new ingredients beyond the present state-of-the-art, which, being conceptual in nature, are in our opinion of independent interest.

(1)

    The first is a change-of-variables formula for the density of an \(L^2\)-Optimal-Transport geodesic in X (see Theorem 11.4), which depends on a second-order derivative of associated interpolating Kantorovich potentials.

    Let \(\mathrm{Geo}(X)\) denote the collection of constant speed geodesics on X parametrized on the interval [0, 1], and let \(\mathrm{e}_t : \mathrm{Geo}(X) \ni \gamma \mapsto \gamma _t \in X\) denote the evaluation map at time \(t \in [0,1]\). Given two Borel probability measures \(\mu _0,\mu _1 \in {\mathcal {P}}(X)\) with finite second moments, any \(W_2\)-geodesic \([0,1] \ni t\mapsto \mu _t \in {\mathcal {P}}(X)\) can be lifted to an optimal dynamical plan \(\nu \in {\mathcal {P}}(\mathrm{Geo}(X))\), so that \((\mathrm{e}_t)_{\sharp } \nu = \mu _t\) for all \(t \in [0,1]\). Let \(\varphi \) denote a Kantorovich potential associated to the \(L^2\)-transport problem between \(\mu _0\) and \(\mu _1\). Given \(s,t\in (0,1)\), we introduce the time-propagated intermediate Kantorovich potential \(\Phi _s^t\) by pushing forward \(\varphi _s\) via \(\mathrm{e}_t \circ \mathrm{e}_s^{-1}\), where \(\left\{ \varphi _t\right\} _{t\in [0,1]}\) is the family of interpolating Kantorovich potentials obtained via the Hopf–Lax semi-group applied to \(\varphi \). While \(\mathrm{e}_t^{-1}\) may be multi-valued, Theorem 3.11 ensures that \(\Phi _s^t = \varphi _s \circ \mathrm{e}_s \circ \mathrm{e}_t^{-1}\) is well-defined on \(\mathrm{e}_t(G_\varphi )\), the set of t-mid-points of transport geodesics.

    Theorem 11.4 states that if \((X,\mathsf {d},{\mathfrak {m}})\) is an essentially non-branching m.m.s. verifying \({\mathsf {CD}}^{1}(K,N)\) (\({\mathfrak {m}}(X) < \infty \) and \(N \in (1,\infty )\)), and if \(\mu _0,\mu _1 \ll {\mathfrak {m}}\), then for \(\nu \)-a.e. transport-geodesic \(\gamma \in \mathrm{Geo}(X)\) of positive length:

    $$\begin{aligned} \frac{\rho _{s} (\gamma _{s})}{\rho _{t}(\gamma _{t})} = \frac{\ell ^{2}(\gamma )}{\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})} \cdot h^\gamma _s(t) \;\;\; \text {for a.e. }t,s \in (0,1), \end{aligned}$$
    (1.6)

    where \(\rho _t\) are appropriate versions of the densities \(d\mu _t / d{\mathfrak {m}}\), and for every \(s \in (0,1)\), \(h^\gamma _s\) is a \({\mathsf {CD}}(\ell (\gamma )^2 K ,N)\) density on [0, 1] so that \(h^\gamma _s(s) = 1\). In particular, for a.e. \(t,s\in (0,1)\), \(\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})\) exists and is positive. Here \(h^\gamma _s\) is obtained from the \({\mathsf {CD}}^1(K,N)\) condition applied to the transport-ray associated to the (signed) distance function from the level set \(\left\{ \varphi _s = \varphi _s(\gamma _s)\right\} \).

    Theorem 11.4 constitutes the culmination of Part II of this work, which is mostly dedicated to introducing the \({\mathsf {CD}}^{1}(K,N)\) condition and rigorously establishing the change-of-variables formula (1.6). Note that we refrain from making any assumptions on (the challenging) spatial regularity of \(\Phi _{s}^{t}\) when \(t\ne s\), so we are precluded from invoking the coarea formula in our derivation. Our main tool for deriving (1.6) is a comparison between two disintegrations of appropriate measures, one encoding \(W_2\) information and another encoding \(W_1\) information—see Sect. 11 for a heuristic derivation.

(2)

    To obtain disentanglement of the “Jacobian” \(t \mapsto 1/\rho _t(\gamma _t)\) into its orthogonal and tangential components, we need to understand the first-order variation of the change-of-variables formula (1.6) at \(t=s\), i.e. the second-order variation of \(t \mapsto \Phi _s^t\) at \(t=s\), which amounts to a third-order variation of \(t \mapsto \varphi _t\). Our second main new ingredient in this work is a surprising third-order bound on the variation of \(t \mapsto \varphi _t\) along the Hopf–Lax semi-group (Theorem 5.5), which holds in complete generality on any proper geodesic space.

    To this end, we develop in Part I of this work a first, second, and finally third order temporal theory of intermediate Kantorovich potentials in a purely metric setting \((X,\mathsf {d})\), without specifying any reference measure \({\mathfrak {m}}\) and without making any non-branching assumptions. This part, which may be read independently of the other components of this work, is presented first (in Sects. 2–5), since its results are constantly used throughout the rest of this work.

    Our starting point here is the pioneering work by Ambrosio–Gigli–Savaré [5, 6, Section 3], who already investigated in a very general (extended) metric space setting the first and second order temporal behaviour of the Hopf-Lax semi-group \(Q_t\) applied to a general function \(f : X \rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \). However, the essential point we observe in our treatment is that when f is itself a Kantorovich potential \(\varphi \), characterized by the property that \(\varphi = Q_1(-\varphi ^c)\) and \(\varphi ^c = Q_1(-\varphi )\), much more may be said regarding the behaviour of \(t \mapsto \varphi _t := -Q_t(-\varphi )\), even in first and second order. This is due to the fact that if we reverse time and define \({\bar{\varphi }}_t := Q_{1-t}(-\varphi ^c)\), then we obtain two-sided control over \(\varphi _t\) on the set \(\left\{ \varphi _t = {\bar{\varphi }}_t\right\} \), which turns out to coincide with the set \(\mathrm{e}_t(G_\varphi )\). So for instance, two apparently novel observations which we constantly use throughout this work are that for all \(t \in (0,1)\), \(\ell _t^2/2 := \partial _t \varphi _t\) exists on \(\mathrm{e}_t(G_\varphi )\), and that transport geodesics having a given \(x \in X\) as their t-midpoint all have the same length \(\ell _t(x)\). In Sect. 3, we establish Lipschitz regularity properties of \(t \mapsto \ell ^2_t(x)\) for all \(x \in X\), as well as upper and lower derivative estimates, both pointwise and a.e., for appropriate times t. These are then transferred in Sect. 4 to corresponding estimates for the function \(\Phi _s^t\).

    Part I culminates in Sect. 5, whose goal is to prove a quantitative version of the following (somewhat oversimplified) statement, which crucially provides second order information on \(\ell _t\), or equivalently, third order information on \(\varphi _t\), along \(\gamma _t\):

    $$\begin{aligned} z(t) := \frac{1}{\ell (\gamma )^2} \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t) \;\; \text { satisfies } \;\; z'(t) \ge z(t)^2 . \end{aligned}$$
    (1.7)

    Equivalently, this amounts to the statement that:

    $$\begin{aligned} (0,1) \ni r \mapsto L(r) := \exp \left( - \frac{1}{\ell (\gamma )^2} \int ^r_{r_0} \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t)\, dt\right) \text { is concave }, \end{aligned}$$
    (1.8)

    since (formally)

    $$\begin{aligned} \frac{L''}{L} = (\log L) '' + ((\log L)')^2 = - z' + z^2 \le 0 . \end{aligned}$$
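    Here \(z\) denotes the normalized tangential derivative of \(\ell _\tau ^2/2\) along the geodesic; indeed, directly from the definition (1.8) of L:

    $$\begin{aligned} z(r) := \frac{1}{\ell (\gamma )^2} \partial _\tau |_{\tau =r} \frac{\ell _\tau ^2}{2}(\gamma _r) \;\; \Rightarrow \;\; (\log L)'(r) = -z(r) , \end{aligned}$$

    so that the concavity of L is formally equivalent to the differential inequality \(z' \ge z^2\).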

    It turns out that L(t) precisely corresponds to the tangential component of \(1/\rho _t(\gamma _t)\), and its concavity ensures that it is synthetically controlled by the linear term appearing in the definition of \(\tau ^{(t)}_{K,N}(\theta )\) in (1.2). The novel observation that it is possible to extract in a general metric setting third order information from the Hopf-Lax semi-group, which formally solves the first-order Hamilton-Jacobi equation, is in our opinion one of the most surprising parts of this work. Even in the smooth Riemannian setting, we were not able to find a synthetic proof which is easier than the one in the general metric setting; a formal differential proof of (1.7) assuming both temporal and (more challenging) spatial higher-order regularity of \(\varphi _t\) is provided in Sect. 5.1, but the latter seems to wrongly suggest that it would not be possible to extend (1.7) beyond a Hilbertian setting. Our proof in the general metric setting (Theorem 5.2) is based on a careful comparison of second order expansions of \(\varepsilon \mapsto \varphi _{\tau +\varepsilon }(\gamma _\tau )\) at \(\tau =t,s\), and subtle differences between the usual second derivative and the second Peano derivative (see Sect. 2) come into play.

(3)

    Our third main new ingredient, described in Part III, is a certain rigidity property of the change-of-variables formula (1.6), which allows us to bootstrap the a-priori available temporal regularity, and which in combination with the first and second ingredients, enables us to achieve disentanglement.

    Indeed, the definition of \(\Phi _s^t\) may be naturally extended to an appropriate domain beyond \(\mathrm{e}_t(G_\varphi )\) as follows, allowing us to easily (formally) calculate its partial derivative:

    $$\begin{aligned} \Phi _s^t = \varphi _t + (t-s) \frac{\ell _t^2}{2} , \;\;\; \partial _t \Phi _s^t = \ell _t^2 + (t-s) \partial _t\frac{\ell _t^2}{2} . \end{aligned}$$
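    The formula for the partial derivative follows (formally) from the identity \(\partial _t \varphi _t = \ell _t^2/2\) recalled above:

    $$\begin{aligned} \partial _t \Phi _s^t = \partial _t \varphi _t + \frac{\ell _t^2}{2} + (t-s) \partial _t \frac{\ell _t^2}{2} = \frac{\ell _t^2}{2} + \frac{\ell _t^2}{2} + (t-s) \partial _t \frac{\ell _t^2}{2} = \ell _t^2 + (t-s) \partial _t \frac{\ell _t^2}{2} . \end{aligned}$$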

    Evaluating at \(x = \gamma _t\) and plugging this into the change-of-variables formula (1.6), it follows that for \(\nu \)-a.e. geodesic \(\gamma \):

    $$\begin{aligned} \frac{\rho _s(\gamma _s)}{\rho _t(\gamma _t)} = \frac{h^\gamma _s(t)}{1 + (t-s) \frac{\partial _\tau |_{\tau =t}\ell _\tau ^2/2(\gamma _t)}{\ell ^2(\gamma )}} \;\;\; \text {for a.e. } t,s \in (0,1). \end{aligned}$$
    (1.9)

    Thanks to the idea of considering the initial-point s and the end-point t together, the latter formula takes on a very rigid structure: note that on the left-hand-side the s and t variables are separated, and that the denominator on the right-hand-side depends linearly on s. Consequently, we can easily bootstrap the a-priori available regularity in s and t of all terms involved. It follows that \(\frac{1}{\ell ^2(\gamma )} \partial _\tau |_{\tau =t}\ell _\tau ^2/2(\gamma _t)\) must coincide for a.e. \(t \in (0,1)\) with a locally-Lipschitz function z(t), so that (1.7) applies. In addition, by redefining \(\left\{ h^\gamma _s\right\} \) for s in a null subset of (0, 1), we can guarantee that \((0,1) \ni s \mapsto h^\gamma _s(t)\) is locally Lipschitz (for any given \(t \in (0,1)\)), even though there is a-priori no relation between the different densities \(\left\{ h^\gamma _s\right\} _{s \in (0,1)}\).

    At this point, if \(\rho _t(\gamma _t)\) and z(t) were known to be \(C^2\) smooth, and equality were to hold in (1.9) for all \(s,t \in (0,1)\), we could then define

    $$\begin{aligned} Y(r) := \exp \left( \int _{r_0}^r \partial _t|_{t=s} \log h^\gamma _s(t) ds \right) , \end{aligned}$$
    (1.10)

    and as \(\partial _t|_{t=s} \log (1 + (t-s) z(t)) = z(s)\), it would follow, recalling the definition (1.8) of L, that

    $$\begin{aligned} \frac{\rho _{r_0}(\gamma _{r_0})}{\rho _r(\gamma _r)} = L(r) Y(r) \;\;\; \forall r \in (0,1) . \end{aligned}$$
    (1.11)
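    Under the smoothness assumptions of this paragraph, the derivative identity invoked above is a one-line (formal) computation:

    $$\begin{aligned} \partial _t|_{t=s} \log \left( 1 + (t-s) z(t)\right) = \frac{z(t) + (t-s) z'(t)}{1 + (t-s) z(t)} \bigg |_{t=s} = z(s) . \end{aligned}$$

    Consequently, taking logarithms in (1.9) and differentiating in t at \(t=s\) formally yields \(-\partial _t|_{t=s} \log \rho _t(\gamma _t) = \partial _t|_{t=s} \log h^\gamma _s(t) - z(s)\); integrating this identity in s over \([r_0,r]\) produces the factor Y(r) of (1.10), while the integral of z(s) accounts, via the definition (1.8) of L, for the factor L(r) in (1.11).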

    Using the fact that all \(\left\{ h^\gamma _s\right\} _{s \in (0,1)}\) are \({\mathsf {CD}}(\ell (\gamma )^2 K,N)\) densities to control \(\partial ^2_t|_{t=r} \log h^\gamma _r(t)\), and surprisingly, also the concavity of L (again!) to control the mixed partial derivatives \(\partial _s \partial _t|_{t=s=r} \log h^\gamma _s(t)\), a formal computation described in Sect. 12.2 then verifies that Y is a \({\mathsf {CD}}(\ell (\gamma )^2 K,N)\) density itself. A rigorous justification without all of the above unrealistic assumptions turns out to be extremely tedious, due to the difficulty in applying an approximation argument while preserving the rigidity of the equation—this is worked out in Sect. 12 and the “Appendix”.

After taking care of all these details, we finally obtain the desired disentanglement (1.11) of the Jacobian: L is concave and so controlled synthetically by a linear distortion coefficient, whereas Y is a \({\mathsf {CD}}(\ell (\gamma )^2 K,N)\) density and so (by definition) \(Y^{1/(N-1)}\) is controlled synthetically by the \(\sigma ^{(t)}_{K,N-1}(\ell (\gamma ))\) coefficient. A standard application of Hölder’s inequality then verifies that \(J^{1/N}(r) = \rho _r(\gamma _r)^{-1/N}\) is controlled by the \(\tau ^{(t)}_{K,N}(\ell (\gamma ))\) distortion coefficient, i.e. satisfies (1.1)—in fact for all \(t_0,t_1 \in [0,1]\)—thereby establishing \({\mathsf {CD}}(K,N)\), see Theorem 13.2.
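In more detail, the Hölder step may be sketched as follows (stated here for the endpoints 0, 1), abbreviating \(\sigma ^{(t)} := \sigma ^{(t)}_{K,N-1}(\ell (\gamma ))\) and using the standard identity \(\tau ^{(t)}_{K,N}(\theta ) = t^{1/N} \sigma ^{(t)}_{K,N-1}(\theta )^{(N-1)/N}\):

$$\begin{aligned} J^{\frac{1}{N}}(t)&= L(t)^{\frac{1}{N}} \left( Y(t)^{\frac{1}{N-1}} \right) ^{\frac{N-1}{N}} \ge \left( (1-t) L(0) + t L(1) \right) ^{\frac{1}{N}} \left( \sigma ^{(1-t)} Y(0)^{\frac{1}{N-1}} + \sigma ^{(t)} Y(1)^{\frac{1}{N-1}} \right) ^{\frac{N-1}{N}} \\&\ge (1-t)^{\frac{1}{N}} \left( \sigma ^{(1-t)} \right) ^{\frac{N-1}{N}} J(0)^{\frac{1}{N}} + t^{\frac{1}{N}} \left( \sigma ^{(t)} \right) ^{\frac{N-1}{N}} J(1)^{\frac{1}{N}} = \tau ^{(1-t)}_{K,N}(\ell (\gamma )) J(0)^{\frac{1}{N}} + \tau ^{(t)}_{K,N}(\ell (\gamma )) J(1)^{\frac{1}{N}} , \end{aligned}$$

where the first inequality uses the concavity of L and the \(\sigma \)-concavity of \(Y^{1/(N-1)}\), and the second is the reverse Hölder inequality \((a_1+a_2)^{\frac{1}{N}} (b_1+b_2)^{\frac{N-1}{N}} \ge a_1^{\frac{1}{N}} b_1^{\frac{N-1}{N}} + a_2^{\frac{1}{N}} b_2^{\frac{N-1}{N}}\).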

The definition (1.10) of Y finally sheds light on the crucial role which the parameter \(s \in (0,1)\) plays in our strategy—its role is to vary between the different \(W_2\)-geodesics from which the \({\mathsf {CD}}_{loc}(K,N)\) information is extracted into the \({\mathsf {CD}}^1(K,N)\) information on the disintegration into transport-rays from the (signed) distance functions from level sets \(\left\{ \varphi _s = \varphi _s(\gamma _s)\right\} \), thereby coming full circle with the observation of (1.4).

Besides establishing the local-to-global property of \({\mathsf {CD}}(K,N)\) and the equivalence of its various variants (in our setting), we emphasize that as a by-product of our proof, we obtain a remarkable new self-improvement property of \({\mathsf {CD}}(K,N)\): the \(\tau _{K,N}\)-concavity (1.1) of the transport Jacobian \(J_t(\gamma _t)\) along all \(W_2\)-geodesics implies the (a-priori) stronger “L-Y” decomposition \(J_t(\gamma _t) = L_\gamma (t) Y_\gamma (t)\), where \(L_\gamma \) is concave and \(Y_\gamma \) is a \({\mathsf {CD}}(\ell (\gamma )^2 K, N)\) density on (0, 1). As already mentioned above, this self-improvement is false for a single \(W_2\)-geodesic. We believe that the stronger “L-Y” information will prove to be of further use in the study of \({\mathsf {CD}}(K,N)\) essentially non-branching spaces.

We refer to Sect. 13 for the final details and for additional immediate corollaries of the Main Theorem 1.1 pertaining to \({\mathsf {RCD}}(K,N)\) and strong \({\mathsf {CD}}(K,N)\) spaces. We also provide there several concluding remarks and suggestions for further investigation.

Part I: Temporal theory of Optimal-Transport

2 Preliminaries

2.1 Geodesics

A metric space \((X,\mathsf {d})\) is called a length space if for all \(x,y \in X\), \(\mathsf {d}(x,y) = \inf \ell (\sigma )\), where the infimum is over all (continuous) curves \(\sigma : I \rightarrow X\) connecting x and y, and \(\ell (\sigma ) := \sup \sum _{i=1}^k \mathsf {d}(\sigma (t_{i-1}),\sigma (t_{i}))\) denotes the curve’s length, where the latter supremum is over all \(k \in {\mathbb {N}}\) and \(t_0 \le \cdots \le t_k\) in the interval \(I \subset {\mathbb {R}}\). A curve \(\gamma \) is called a geodesic if \(\ell (\gamma |_{[t_0,t_1]}) = \mathsf {d}(\gamma (t_0),\gamma (t_1))\) for all \([t_0,t_1] \subset I\). If \(\ell (\gamma ) = 0\) we will say that \(\gamma \) is a null geodesic. The metric space is called a geodesic space if for all \(x,y \in X\) there exists a geodesic in X connecting x and y. We denote by \(\mathrm{Geo}(X)\) the set of all closed directed constant-speed geodesics parametrized on the interval [0, 1]:

$$\begin{aligned} \mathrm{Geo}(X) : = \left\{ \gamma : [0,1] \rightarrow X \; ; \; \begin{array}{c} \mathsf {d}(\gamma (s),\gamma (t)) = |s-t| \mathsf {d}(\gamma (0),\gamma (1)) \\ \forall s,t \in [0,1] \end{array} \right\} . \end{aligned}$$

We regard \(\mathrm{Geo}(X)\) as a subset of the space \(\text {Lip}([0,1], X)\) of all Lipschitz maps, endowed with the uniform topology. We will frequently use the abbreviation \(\gamma _t := \gamma (t)\).

The metric space is called proper if every closed ball (of finite radius) is compact. It follows from the metric version of the Hopf-Rinow Theorem (e.g. [22, Theorem 2.5.28]) that for complete length spaces, local compactness is equivalent to properness, and that complete proper length spaces are in fact geodesic.

Given a subset \(D \subset X \times {\mathbb {R}}\), we denote its sections by

$$\begin{aligned} D(t) := \left\{ x \in X \;;\; (x,t) \in D\right\} ~,~ D(x) := \left\{ t \in {\mathbb {R}}\; ; \; (x,t) \in D\right\} . \end{aligned}$$

Given a subset \(G \subset \mathrm{Geo}(X)\), we denote by \(\mathring{G} := \left\{ \gamma |_{(0,1)} \;;\; \gamma \in G\right\} \) the corresponding open-ended geodesics on (0, 1). For a subset of (closed or open) geodesics \({\tilde{G}}\), we denote

$$\begin{aligned} D({\tilde{G}}) := \left\{ (x,t) \in X \times {\mathbb {R}}\; ; \; \exists \gamma \in {\tilde{G}} ~,~ t \in \text {Dom}(\gamma ) \; , \; x = \gamma _t \right\} . \end{aligned}$$

We denote by \(\mathrm{e}_t : \mathrm{Geo}(X) \ni \gamma \mapsto \gamma _t \in X\) the (continuous) evaluation map at \(t \in [0,1]\), and abbreviate given \(I \subset [0,1]\) as follows:

$$\begin{aligned} \mathrm{e}_t({\tilde{G}}) = {\tilde{G}}(t)&:= D({\tilde{G}})(t) = \left\{ \gamma _t \; ; \; \gamma \in {\tilde{G}} \right\} ~,~ \mathrm{e}_I({\tilde{G}}) := \cup _{t \in I} \mathrm{e}_t({\tilde{G}}) ~,~\\ {\tilde{G}}(x)&:= D({\tilde{G}})(x) = \left\{ t \in [0,1] \; ; \; \exists \gamma \in {\tilde{G}} ~,~ t \in \text {Dom}(\gamma ) \; , \; \gamma _t = x\right\} . \end{aligned}$$

2.2 Derivatives

For a function \(g : A \rightarrow {\mathbb {R}}\) on a subset \(A \subset {\mathbb {R}}\), denote its upper and lower derivatives at a point \(t_0 \in A\) which is an accumulation point of A by

$$\begin{aligned} \frac{{\overline{d}}}{dt} g(t_0) = \limsup _{A \ni t \rightarrow t_0} \frac{g(t) - g(t_0)}{t-t_0} ~,~ \underline{\frac{d}{dt}} g(t_0) = \liminf _{A \ni t \rightarrow t_0} \frac{g(t) - g(t_0)}{t-t_0} . \end{aligned}$$

We will say that g is differentiable at \(t_0\) iff \(\frac{d}{dt} g(t_0) := \frac{{\overline{d}}}{dt} g(t_0) = \underline{\frac{d}{dt}} g(t_0) < \infty \). This is a slightly more general definition of differentiability than the traditional one which requires that \(t_0\) be an interior point of A.

Remark 2.1

Note that there are only a countable number of isolated points in A, so a.e. point in A is an accumulation point. In addition, it is clear that if \(t_0 \in B \subset A\) is an accumulation point of B and g is differentiable at \(t_0\), then \(g|_B\) is also differentiable at \(t_0\) with the same derivative. In particular, if g is a.e. differentiable on A then \(g|_B\) is also a.e. differentiable on B and the derivatives coincide.

Remark 2.2

Denote by \(A_1 \subset A\) the subset of density one points of A (which are in particular accumulation points of A). By Lebesgue’s Density Theorem \({\mathcal {L}}^1(A \setminus A_1) = 0\), where we denote by \({\mathcal {L}}^1\) the Lebesgue measure on \({\mathbb {R}}\) throughout this work. If \(g : A \rightarrow {\mathbb {R}}\) is locally Lipschitz, consider any locally Lipschitz extension \({\hat{g}} : {\mathbb {R}}\rightarrow {\mathbb {R}}\) of g. Then it is easy to check that for \(t_0 \in A_1\), g is differentiable in the above sense at \(t_0\) if and only if \({{\hat{g}}}\) is differentiable at \(t_0\) in the usual sense, in which case the derivatives coincide. In particular, as \({{\hat{g}}}\) is a.e. differentiable on \({\mathbb {R}}\), it follows that g is a.e. differentiable on \(A_1\) and hence on A, and it holds that \(\frac{d}{dt} g = \frac{d}{dt} {{\hat{g}}}\) a.e. on A.

Let \(f : I \rightarrow {\mathbb {R}}\) denote a convex function on an open interval \(I \subset {\mathbb {R}}\). It is well-known that the left and right derivatives \(f^{\prime ,-}\) and \(f^{\prime ,+}\) exist at every point in I and that f is locally Lipschitz there; in particular, f is differentiable at a given point iff the left and right derivatives coincide there. Denoting by \(D \subset I\) the differentiability points of f in I, it is also well-known that \(I \setminus D\) is at most countable. Consequently, any point in D is an accumulation point, and we may consider the differentiability in D of \(f' : D \rightarrow {\mathbb {R}}\) as defined above. We will require the following elementary one-dimensional version (probably due to Jessen) of the well-known Aleksandrov’s theorem about twice differentiability a.e. of convex functions on \({\mathbb {R}}^n\) (see [45, Theorem 5.2.1] or [20, Section 2.6], and [71, p. 31] for historical comments). Clearly, all of these results extend to locally semi-convex and semi-concave functions as well; recall that a function \(f : I \rightarrow {\mathbb {R}}\) is called semi-convex (semi-concave) if there exists \(C \in {\mathbb {R}}\) so that \(I \ni x \mapsto f(x) + C x^2\) is convex (concave).

Lemma 2.3

(Second Order Differentiability of Convex Function) Let \(f : I \rightarrow {\mathbb {R}}\) be a convex function on an open interval \(I \subset {\mathbb {R}}\), and let \(\tau _0 \in I\) and \(\Delta \in {\mathbb {R}}\). Then the following statements are equivalent:

  1. (1)

    f is differentiable at \(\tau _0\), and if \(D \subset I\) denotes the subset of differentiability points of f in I, then \(f' : D \rightarrow {\mathbb {R}}\) is differentiable at \(\tau _0\) with

    $$\begin{aligned} (f')'(\tau _0) := \lim _{D \ni \tau \rightarrow \tau _0} \frac{f'(\tau ) - f'(\tau _0)}{\tau -\tau _0} = \Delta . \end{aligned}$$
  2. (2)

    The right derivative \(f^{\prime ,+} \,{:}\, I \,{\rightarrow }\, {\mathbb {R}}\) is differentiable at \(\tau _0\) with \((f^{\prime ,+})'\) \((\tau _0)=\Delta \).

  3. (3)

    The left derivative \(f^{\prime ,-}\,{:}\, I \,{\rightarrow }\, {\mathbb {R}}\) is differentiable at \(\tau _0\) with \((f^{\prime ,-})'\) \((\tau _0)=\Delta \).

  4. (4)

    f is differentiable at \(\tau _0\) and has the following second order expansion there:

    $$\begin{aligned} f(\tau _0 + \varepsilon ) = f(\tau _0) + f'(\tau _0) \varepsilon + \Delta \frac{\varepsilon ^2}{2} + o(\varepsilon ^2)\quad \text { as }\varepsilon \rightarrow 0. \end{aligned}$$

    In this case, f is said to have a second Peano derivative at \(\tau _0\).

We remark that even for a differentiable function f, while the implication \((1) \Rightarrow (4)\) follows by Taylor’s theorem (existence of the second derivative at a point implies existence of the second Peano derivative there), the converse implication is in general false (see e.g. [61] for a nice discussion). For a locally semi-convex or semi-concave function f, we will say that f is twice differentiable at \(\tau _0\) if any (all) of the above equivalent conditions hold for some \(\Delta \in {\mathbb {R}}\), and write \((\frac{d}{d\tau })^{2}|_{\tau = \tau _0} f(\tau ) = \Delta \).
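As a concrete (non-convex) instance of the failure of \((4) \Rightarrow (1)\), consider the classical example \(f(\tau ) = \tau ^3 \sin (1/\tau )\) with \(f(0) = 0\): since \(f(\varepsilon ) = o(\varepsilon ^2)\), f has second Peano derivative \(\Delta = 0\) at \(\tau _0 = 0\), and yet

$$\begin{aligned} \frac{f'(\varepsilon ) - f'(0)}{\varepsilon } = 3 \varepsilon \sin (1/\varepsilon ) - \cos (1/\varepsilon ) \end{aligned}$$

oscillates as \(\varepsilon \rightarrow 0\), so \(f'\) is not differentiable at 0.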

Finally, we will require the following slightly more refined notation.

Definition

Given an open interval \(I \subset {\mathbb {R}}\) and a function \(f : I \rightarrow {\mathbb {R}}\) which is differentiable at \(\tau _0 \in I\), we define its upper and lower second Peano derivatives at \(\tau _0\), denoted \({\overline{{\mathcal {P}}}}_2 f(\tau _0)\) and \({\underline{{\mathcal {P}}}}_2 f(\tau _0)\) respectively, by

$$\begin{aligned} {\overline{{\mathcal {P}}}}_2 f(\tau _0) := \limsup _{\varepsilon \rightarrow 0} \frac{h(\varepsilon )}{\varepsilon ^2} \ge \liminf _{\varepsilon \rightarrow 0} \frac{h(\varepsilon )}{\varepsilon ^2} =: {\underline{{\mathcal {P}}}}_2 f(\tau _0) , \end{aligned}$$

where

$$\begin{aligned} h(\varepsilon ) := 2( f(\tau _0 + \varepsilon ) - f(\tau _0) - \varepsilon f'(\tau _0)) . \end{aligned}$$

Clearly f has a second Peano derivative at \(\tau _0\) iff \({\overline{{\mathcal {P}}}}_2 f(\tau _0) = {\underline{{\mathcal {P}}}}_2 f(\tau _0) < \infty \).

The following is a type of Stolz–Cesàro lemma:

Lemma 2.4

Given an open interval \(I \subset {\mathbb {R}}\) and a locally absolutely continuous function \(f : I \rightarrow {\mathbb {R}}\) which is differentiable at \(\tau _0 \in I\), we have

$$\begin{aligned} \underline{\frac{d}{dt}} f'(\tau _0) \le {\underline{{\mathcal {P}}}}_2 f(\tau _0) \le {\overline{{\mathcal {P}}}}_2 f(\tau _0) \le \frac{{\overline{d}}}{dt} f'(\tau _0) . \end{aligned}$$

Proof

By local absolute continuity, f is differentiable a.e. in I and we have for small enough \(\left| \varepsilon \right| \):

$$\begin{aligned} \frac{1}{2} h(\varepsilon ) = f(\tau _0+\varepsilon ) - f(\tau _0) - \varepsilon f'(\tau _0) =\int _{0}^{\varepsilon } (f'(\tau _0 + \delta ) - f'(\tau _0)) d\delta , \end{aligned}$$

and hence

$$\begin{aligned} \frac{h(\varepsilon )}{\varepsilon ^{2}} = \frac{1}{\varepsilon ^2} \int _{0}^{\varepsilon } 2 \delta \frac{f'(\tau _0 + \delta ) - f'(\tau _0)}{\delta } d\delta . \end{aligned}$$
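Since \(\frac{1}{\varepsilon ^2} \int _0^\varepsilon 2 \delta \, d\delta = 1\), the right-hand side is a weighted average (with non-negative weights) of difference quotients of \(f'\), so that, discarding the null set where \(f'\) is undefined:

$$\begin{aligned} \mathop {\mathrm {ess\,inf}}_{0< |\delta | \le |\varepsilon |} \frac{f'(\tau _0 + \delta ) - f'(\tau _0)}{\delta } \le \frac{h(\varepsilon )}{\varepsilon ^{2}} \le \mathop {\mathrm {ess\,sup}}_{0 < |\delta | \le |\varepsilon |} \frac{f'(\tau _0 + \delta ) - f'(\tau _0)}{\delta } . \end{aligned}$$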

Taking appropriate subsequential limits as \(\varepsilon \rightarrow 0\), the asserted inequalities readily follow. \(\square \)

3 Temporal theory of intermediate-time Kantorovich potentials: first and second order

In the next sections, we will only consider the quadratic cost function \(c=\mathsf {d}^2/2\) on \(X \times X\).

Definition

(c-Concavity, Kantorovich Potential) The c-transform of a function \(\psi : X \rightarrow {\mathbb {R}}\cup \left\{ \pm \infty \right\} \) is defined as the following (upper semi-continuous) function:

$$\begin{aligned} \psi ^c(x) = \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2} - \psi (y) . \end{aligned}$$

A function \(\varphi : X \rightarrow {\mathbb {R}}\cup \left\{ \pm \infty \right\} \) is called c-concave if \(\varphi = \psi ^c\) for some \(\psi \) as above. It is well known [76, Exercise 2.35] that \(\varphi \) is c-concave iff \((\varphi ^c)^c = \varphi \). In the context of Optimal-Transport with respect to the quadratic cost c, a c-concave function \(\varphi : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) which is not identically equal to \(-\infty \) is also known as a Kantorovich potential, and this is how we will refer to such functions in this work. In that case, \(\varphi ^c : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) is also a Kantorovich potential, called the dual or conjugate potential.

There is a natural way to interpolate between a Kantorovich potential and its dual by means of the Hopf-Lax semi-group, resulting in intermediate-time Kantorovich potentials \(\left\{ \varphi _t\right\} _{t \in (0,1)}\). The goal of the next three sections is to provide first, second and third order information on the time-behavior \(t \mapsto \varphi _t(x)\) at intermediate times \(t \in (0,1)\). In these sections, we only assume that \((X,\mathsf {d})\) is a proper geodesic metric space.

In this section, we focus on first and second order information. The main new result is Theorem 3.11.

3.1 Hopf-Lax semi-group

We begin with several well-known definitions which we slightly modify and specialize to our setting.

Definition

(Hopf-Lax Transform) Given \(f : X \rightarrow {\mathbb {R}}\cup \left\{ \pm \infty \right\} \) which is not identically \(+\infty \) and \(t > 0\), define the Hopf-Lax transform \(Q_t f : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) by

$$\begin{aligned} Q_t f (x) := \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2t} + f(y) . \end{aligned}$$
(3.1)

Clearly either \(Q_t f \equiv -\infty \) or \(Q_t f(x)\) is finite for all \(x \in X\) (as our metric \(\mathsf {d}\) is finite). Consequently, we denote:

$$\begin{aligned} t_*(f) := \sup \left\{ t > 0 \;;\; Q_t f \not \equiv -\infty \right\} \end{aligned}$$

setting \(t_*(f) = 0\) if the supremum is over an empty set. Finally, we set \(Q_0 f := f\).

It is not hard to check (see e.g. [49, Theorem 2.5 (i)]) that when \((X,\mathsf {d})\) is a length space (and in particular geodesic), the Hopf-Lax transform is in fact a semi-group on \([0,\infty )\):

$$\begin{aligned} Q_{s+t} f = Q_s \circ Q_t f \;\;\; \forall t,s \ge 0 . \end{aligned}$$
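One direction of the semi-group property holds in any metric space; a sketch of the standard argument:

$$\begin{aligned} Q_s \circ Q_t f(x) = \inf _{z,y \in X} \frac{\mathsf {d}(x,z)^2}{2s} + \frac{\mathsf {d}(z,y)^2}{2t} + f(y) \ge \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2(s+t)} + f(y) = Q_{s+t} f(x) , \end{aligned}$$

by the triangle inequality and the elementary estimate \(\frac{a^2}{2s} + \frac{b^2}{2t} \ge \frac{(a+b)^2}{2(s+t)}\) (Cauchy–Schwarz). It is for the reverse inequality that the length-space assumption is used, by selecting z as an approximate \(\frac{s}{s+t}\)-intermediate point on a curve from x to a nearly optimal y, of length nearly \(\mathsf {d}(x,y)\).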

Remark 3.1

It is also possible to extend the definition of \(Q_t f\) to negative times \(t < 0\) by setting

$$\begin{aligned} Q_t f (x) := - Q_{-t}(-f)(x) = \sup _{y \in X} \frac{\mathsf {d}(x,y)^2}{2t} + f(y) ~,~ t < 0 . \end{aligned}$$

This is called the backwards Hopf-Lax semi-group on \((-\infty ,0]\). However, \(({\mathbb {R}},+) \ni t \mapsto (Q_t,\circ )\) is in general not an abelian group homomorphism, not even for \(t \in [0,1]\) when applied to a Kantorovich potential \(\varphi \) (characterized by \(Q_{-1} \circ Q_1(-\varphi ) = -\varphi \))—see Sect. 3.3. This will be a rather significant nuisance we will need to cope with in this work.

Clearly \((0,\infty ) \times X \ni (t,x) \mapsto Q_t f(x)\) is upper semi-continuous as the infimum of continuous functions in \((t,x)\), and by definition \([0,\infty ) \ni t \mapsto Q_t f(x)\) is monotone non-increasing for each \(x \in X\). Consequently, \((0,\infty ) \ni t \mapsto Q_t f(x)\) must be continuous from the left.

It may also be shown (see [5, Lemma 3.1.2]) that \(X \times (0,t_*(f)) \ni (x,t) \mapsto Q_t f (x)\) is continuous (and in fact locally Lipschitz, see Theorem 3.4 below). Together with the left-continuity, we deduce that for every \(x \in X\), \((0,t_*(f)] \ni t \mapsto Q_t f(x)\) is continuous.

Note that by definition \(f^c = Q_1(-f)\), and that a Kantorovich pair of conjugate potentials \(\varphi ,\varphi ^c : X \rightarrow {\mathbb {R}}\cup \left\{ -\infty \right\} \) are characterized by not being identically equal to \(-\infty \) and satisfying:

$$\begin{aligned} \varphi = Q_1(-\varphi ^c) ~,~ \varphi ^c = Q_1(-\varphi ) . \end{aligned}$$

In particular, \(t_*(\varphi ),t_*(\varphi ^c) \ge 1\), and we a-posteriori deduce that \(\varphi , \varphi ^c\) are both finite on the entire space X (we have used above the fact that the metric \(\mathsf {d}\) is finite, which differs from other more general treatments).

Definition

(Interpolating Intermediate-Time Kantorovich Potentials) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), the interpolating Kantorovich potential at time \(t \in [0,1]\), \(\varphi _t : X \rightarrow {\mathbb {R}}\), is defined by

$$\begin{aligned} \varphi _t(x) := Q_{-t}(\varphi )(x) = -Q_t(-\varphi )(x) . \end{aligned}$$

Note that \(\varphi _0 = \varphi \), \(\varphi _1 = -\varphi ^c\), and

$$\begin{aligned} -\varphi _t(x) = \inf _{y \in X} \frac{\mathsf {d}^2(x,y)}{2t} - \varphi (y) \;\;\;\; \forall t \in (0,1] . \end{aligned}$$

Applying the above mentioned general properties of the Hopf-Lax semi-group to \(\varphi _t\), it will be useful to record:

Lemma 3.2

  1. (1)

    \((x,t) \mapsto \varphi _t(x)\) is lower semi-continuous on \(X \times (0,1]\) and continuous on \(X \times (0,1)\).

  2. (2)

    For every \(x \in X\), \([0,1] \ni t \mapsto \varphi _t(x)\) is monotone non-decreasing and continuous on (0, 1].

Definition

(Kantorovich Geodesic) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), a geodesic \(\gamma \in \mathrm{Geo}(X)\) is called a \(\varphi \)-Kantorovich (or optimal) geodesic if

$$\begin{aligned} \varphi (\gamma _0) + \varphi ^c(\gamma _1) = \frac{\mathsf {d}(\gamma _0,\gamma _1)^2}{2} = \frac{\ell (\gamma )^2}{2} . \end{aligned}$$

We denote all \(\varphi \)-Kantorovich geodesics by \(G_\varphi \). Note that \(\gamma \in G_{\varphi }\) iff \(\gamma ^c \in G_{\varphi ^c}\), where \(\gamma ^c(t) := \gamma (1-t)\) is the time-reversed geodesic. By upper semi-continuity of \(\varphi \) and \(\varphi ^c\), it follows that \(G_\varphi \) is a closed subset of \(\mathrm{Geo}(X)\).

The following is not hard to check (see e.g. [24, Corollary 2.16]):

Lemma 3.3

Let \(\gamma \) be a \(\varphi \)-Kantorovich geodesic. Then

$$\begin{aligned} \varphi _s(\gamma _s) - \varphi _r(\gamma _r) = \frac{\mathsf {d}(\gamma _s,\gamma _r)^2}{2 (r-s)} = (r-s)\frac{\ell (\gamma )^2}{2} \;\;\; \forall s,r \in [0,1] . \end{aligned}$$

3.2 Distance functions

The following important definition was given by Ambrosio–Gigli–Savaré [5, 6]:

Definition

(Distance functions \(D^{\pm }_f\)) Given \(f : X \rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \) which is not identically \(+\infty \), denote

$$\begin{aligned} D^+_{f}(x,t) := \sup \limsup _{n \rightarrow \infty } \mathsf {d}(x,y_n) \ge \inf \liminf _{n \rightarrow \infty } \mathsf {d}(x,y_n) =: D^-_{f}(x,t) , \end{aligned}$$

where the supremum and infimum above run over the set of minimizing sequences \(\left\{ y_n\right\} \) in the definition of the Hopf-Lax transform (3.1). A simple diagonal argument shows that the (outer) supremum and infimum above are in fact attained.

The following properties were established in [5, 6, Chapter 3]:

Theorem 3.4

(Ambrosio–Gigli–Savaré) For any metric space \((X,\mathsf {d})\) (not necessarily proper, complete nor geodesic):

  1. (1)

    Both functions \(D^{\pm }_f(x,t)\) are locally finite on \(X \times (0,t_*(f))\), and \((x,t) \mapsto Q_t f(x)\) is locally Lipschitz there.

  2. (2)

    \((x,t) \mapsto D^{\pm }_f(x,t)\) is upper (\(D^{+}_f(x,t)\)) / lower (\(D^{-}_f(x,t)\)) semi-continuous on \(X \times (0,t_*(f))\).

  3. (3)

    For every \(x \in X\), both functions \((0,t_*(f)) \ni t \mapsto D^{\pm }_f(x,t)\) are monotone non-decreasing, and they coincide except at the (at most countably many) points where they have jump discontinuities.

  4. (4)

    For every \(x \in X\), \(\partial _t^{\pm } Q_t f(x) = - \frac{(D^{\pm }_f(x,t))^2}{2 t^2}\) for all \(t \in (0,t_*(f))\), where \(\partial _t^{-}\) and \(\partial _t^+\) denote the left and right partial derivatives, respectively. In particular, the map \((0,t_*(f)) \ni t \mapsto Q_t f(x)\) is locally Lipschitz and locally semi-concave, and differentiable at \(t \in (0,t_*(f))\) iff \(D^+_f(x,t) = D^-_f(x,t)\).
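Property (4) is consistent with the following formal envelope-theorem heuristic: if \(y_t\) attains the infimum in (3.1), then differentiating only the explicit dependence on t gives

$$\begin{aligned} \partial _t Q_t f(x) = \partial _t \left( \frac{\mathsf {d}(x,y_t)^2}{2t} + f(y_t) \right) = - \frac{\mathsf {d}(x,y_t)^2}{2 t^2} , \end{aligned}$$

and the one-sided derivatives arise by replacing \(\mathsf {d}(x,y_t)\) with \(D^{\pm }_f(x,t)\).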

It may be instructive to recall the proof of property (3) above, which is related to some ensuing properties, so for completeness, we present it below. For simplicity, we restrict to the case of interest for us, and first record:

Lemma 3.5

Given a proper metric space X, a lower semi-continuous \(f : X \rightarrow {\mathbb {R}}\), \(x \in X\) and \(t \in (0,t_*(f))\), there exist \(y^{\pm }_t \in X\) so that

$$\begin{aligned} Q_t f(x) = \frac{\mathsf {d}(x,y^{\pm }_t)^2}{2 t} + f(y^{\pm }_t) \; \text { and } \;\; \mathsf {d}(x,y^{\pm }_t) = D^{\pm }_f(x,t) . \end{aligned}$$

Recall that \(-\varphi \) is indeed lower semi-continuous for any Kantorovich potential \(\varphi \).

Proof of Lemma 3.5

Let \(\{y_t^{\pm ,n}\}\) denote a minimizing sequence so that

$$\begin{aligned} Q_t f(x) = \lim _{n \rightarrow \infty } \frac{\mathsf {d}(x,y_t^{\pm ,n})^2}{2 t} + f(y_t^{\pm ,n}) \text { and } D^{\pm }_f(x,t) = \lim _{n \rightarrow \infty } \mathsf {d}(x,y_t^{\pm ,n}) . \end{aligned}$$

By property (1) we know that \(D^{\pm }_f(x,t)< R < \infty \), and properness implies that the closed geodesic ball \(B_R(x)\) is compact. Consequently \(\{y_t^{\pm ,n}\}\) has a subsequence converging to some \(y^{\pm }_t\), and the lower semi-continuity of f implies that:

$$\begin{aligned} Q_t f(x) = \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2 t} + f(y) = \min _{y \in B_R(x)} \frac{\mathsf {d}(x,y)^2}{2 t} + f(y) = \frac{\mathsf {d}(x,y^{\pm }_t)^2}{2 t} + f(y^{\pm }_t) , \end{aligned}$$

as asserted. \(\square \)

Proof of (3) for proper X and lower semi-continuous f. The assertion will follow immediately after establishing

$$\begin{aligned} D^+_f(x,s) \le D^-_f(x,t) \;\;\; \forall 0< s< t < t_*(f) , \end{aligned}$$

since trivially \(D^-_f \le D^+_f\) and since a monotone function can only have a countable number of jump discontinuities. By Lemma 3.5, there exist \(y^+_s\) and \(y^-_t\) so that

$$\begin{aligned}&Q_s f(x) = \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2 s} + f(y) = \frac{\mathsf {d}(x,y^+_s)^2}{2 s} + f(y^+_s) \text { and }\\&\mathsf {d}(x,y^+_s) = D^+_f(x,s) , \end{aligned}$$

and:

$$\begin{aligned}&Q_t f(x) = \inf _{y \in X} \frac{\mathsf {d}(x,y)^2}{2 t} + f(y) = \frac{\mathsf {d}(x,y^{-}_t)^2}{2 t} + f(y^{-}_t) \text { and }\\&\mathsf {d}(x,y^{-}_t) = D^-_f(x,t) . \end{aligned}$$

It follows that

$$\begin{aligned} \frac{\mathsf {d}(x,y^+_s)^2}{2 s} + f(y^+_s)&\le \frac{\mathsf {d}(x,y^-_t)^2}{2 s} + f(y^-_t) , \\ \frac{\mathsf {d}(x,y^-_t)^2}{2 t} + f(y^-_t)&\le \frac{\mathsf {d}(x,y^+_s)^2}{2 t} + f(y^+_s) . \end{aligned}$$

Summing these two inequalities and rearranging terms, one deduces

$$\begin{aligned} D^+_f(x,s)^2 \left( \frac{1}{s} - \frac{1}{t}\right) \le D^-_f(x,t)^2 \left( \frac{1}{s} - \frac{1}{t}\right) , \end{aligned}$$

as required. \(\square \)

3.3 Intermediate-time duality and time-reversed potential

It is immediate to show by inspecting the definitions that we always have (e.g. [77, Theorem 7.34 (iii)] or [3, Proposition 2.17 (ii)]):

$$\begin{aligned} Q_{-s} \circ Q_s f \le f \text { on }X \;\;\; \forall s > 0 ; \end{aligned}$$

this is an inherent group-structure incompatibility of the Hopf-Lax forward and backward semi-groups. Note that for \(f = -\varphi \) where \(\varphi \) is a Kantorovich potential, we do have equality for \(s=1\), and in fact for all \(s \in [0,1]\). However, for \(f = Q_t(-\varphi )\), \(t \in (0,1)\) and \(s=1-t\), we can only assert an inequality above ([77, Theorem 7.36], [3, Corollary 2.23 (i)]):

$$\begin{aligned} (\varphi ^c)_{1-t} = Q_{-(1-t)} \circ Q_1 (-\varphi ) \le Q_t(-\varphi ) = -\varphi _t \text { on }X, \end{aligned}$$
(3.2)

and equality may not hold at every point of X (cf. [77, Remark 7.37]). Nevertheless, in our setting, the subset where equality is attained may be characterized as in the next proposition. We first introduce the following very convenient:

Definition

(Time-Reversed Interpolating Potential) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), define the time-reversed interpolating Kantorovich potential at time \(t \in [0,1]\), \({\bar{\varphi }}_t : X \rightarrow {\mathbb {R}}\), as

$$\begin{aligned} {\bar{\varphi }}_t := -(\varphi ^c)_{1-t} = Q_{1-t}(-\varphi ^c) = - Q_{-(1-t)} \circ Q_{1-t}(-\varphi _t) . \end{aligned}$$

Note that \({\bar{\varphi }}_0 = \varphi \), \({\bar{\varphi }}_1 = -\varphi ^c\), and

$$\begin{aligned} {\bar{\varphi }}_t(x) = \inf _{y \in X} \frac{\mathsf {d}^2(x,y)}{2(1-t)} - \varphi ^c(y) \;\;\;\; \forall t \in [0,1) . \end{aligned}$$

Proposition 3.6

  1. (1)

    \(\varphi _0 = {\bar{\varphi }}_0 = \varphi \) and \(\varphi _1 = {\bar{\varphi }}_1 = -\varphi ^c\).

  2. (2)

    For all \(t \in [0,1]\), \(\varphi _t \le {\bar{\varphi }}_t\).

  3. (3)

    For any \(t \in (0,1)\), \(\varphi _t(x) = {\bar{\varphi }}_t(x)\) if and only if \(x \in \mathrm{e}_t(G_\varphi )\). In other words:

    $$\begin{aligned} D(\mathring{G}_\varphi ) = \left\{ (x,t) \in X \times (0,1) \; ; \; \varphi _t(x) = {\bar{\varphi }}_t(x) \right\} . \end{aligned}$$
    (3.3)

(1) is immediate by c-concavity, and (2) is a reformulation of (3.2), so the only assertion requiring proof is (3). The if direction is well-known (e.g. [77, Theorem 7.36], [3, Corollary 2.23 (ii)]), but the other direction appears to be new. It is based on the following simple lemma, which we will use again later on:

Lemma 3.7

Assume that for some \(x,y,z \in X\) and \(t \in (0,1)\):

$$\begin{aligned} \frac{\mathsf {d}(x,y)^2}{2 t} - \varphi (y) = \varphi ^c(z) - \frac{\mathsf {d}(x,z)^2}{2 (1-t)} . \end{aligned}$$

Then x is a t-intermediate point between y and z:

$$\begin{aligned} \mathsf {d}(y,z) = \frac{\mathsf {d}(x,y)}{t} = \frac{\mathsf {d}(x,z)}{1-t} , \end{aligned}$$
(3.4)

and there exists a \(\varphi \)-Kantorovich geodesic \(\gamma : [0,1] \rightarrow X\) with \(\gamma (0) = y\), \(\gamma (t) =x\) and \(\gamma (1) = z\).

Proof

Using that

$$\begin{aligned} \varphi (y) + \varphi ^c(z) \le \frac{\mathsf {d}(y,z)^2}{2} , \end{aligned}$$
(3.5)

our assumption yields

$$\begin{aligned} \frac{\mathsf {d}(x,y)^2}{2 t} + \frac{\mathsf {d}(x,z)^2}{2 (1-t)} \le \frac{\mathsf {d}(y,z)^2}{2} . \end{aligned}$$

On the other hand, the reverse inequality is always valid by the triangle and Cauchy–Schwarz inequalities:

$$\begin{aligned} \frac{\mathsf {d}(y,z)^2}{2} \le \frac{(\mathsf {d}(x,y) + \mathsf {d}(x,z))^2}{2} \le \frac{\mathsf {d}(x,y)^2}{2 t} + \frac{\mathsf {d}(x,z)^2}{2 (1-t)} . \end{aligned}$$
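The last inequality is Cauchy–Schwarz with weights t and \(1-t\): writing \(a = \mathsf {d}(x,y)\) and \(b = \mathsf {d}(x,z)\),

$$\begin{aligned} (a+b)^2 = \left( \sqrt{t} \, \frac{a}{\sqrt{t}} + \sqrt{1-t} \, \frac{b}{\sqrt{1-t}} \right) ^2 \le \left( t + (1-t)\right) \left( \frac{a^2}{t} + \frac{b^2}{1-t} \right) , \end{aligned}$$

with equality if and only if \(\frac{a}{t} = \frac{b}{1-t}\).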

It follows that we must have equality everywhere above, and (3.4) amounts to the equality case in the Cauchy–Schwarz inequality. Consequently, the concatenation \(\gamma : [0,1] \rightarrow X\) of any constant speed geodesic \(\gamma _1 : [0,t] \rightarrow X\) between y and x, with any constant speed geodesic \(\gamma _2 : [t,1] \rightarrow X\) between x and z, so that \(\gamma (0) = y\), \(\gamma (t) = x\) and \(\gamma (1) = z\), must be a constant speed geodesic itself (by the triangle inequality). Lastly, the equality in (3.5) implies that \(\gamma \in G_\varphi \), thereby concluding the proof. \(\square \)

Proof of Proposition 3.6 (3)

We begin with the known direction. Let \(x = \gamma _t\) with \(\gamma \in G_\varphi \). Apply Lemma 3.3 to \(\gamma \) with \(s=0\) and \(r=t\):

$$\begin{aligned} \varphi (\gamma _0) - \varphi _t(\gamma _t) = \varphi _0(\gamma _0) - \varphi _t(\gamma _t) = t \frac{\ell (\gamma )^2}{2} , \end{aligned}$$

and to \(\gamma ^c \in G_{\varphi ^c}\) with \(s=1\) and \(r=1-t\):

$$\begin{aligned} -\varphi (\gamma _0) - (\varphi ^c)_{1-t}(\gamma _t) = (\varphi ^c)_1(\gamma ^c_1) - (\varphi ^c)_{1-t}(\gamma ^c_{1-t}) = -t \frac{\ell (\gamma ^c)^2}{2} = -t \frac{\ell (\gamma )^2}{2}, \end{aligned}$$

where we used that \((\varphi ^c)_1 = -(\varphi ^c)^c = -\varphi \). Summing these two identities, we obtain:

$$\begin{aligned} \varphi _t(\gamma _t) = - (\varphi ^c)_{1-t}(\gamma _t) , \end{aligned}$$

as asserted.

For the other direction, assume that \(\varphi _t(x) = - (\varphi ^c)_{1-t}(x)\) for some \(x \in X\) and \(t \in (0,1)\). By Lemma 3.5 applied to the lower semi-continuous functions \(-\varphi \) and \(-\varphi ^c\), there exist \(y_t,z_t \in X\) so that

$$\begin{aligned} -\varphi _t(x)&= Q_t(-\varphi )(x) = \frac{\mathsf {d}(x,y_t)^2}{2 t} - \varphi (y_t) , \\ \varphi _t(x) = -(\varphi ^c)_{1-t}(x)&= Q_{1-t}(-\varphi ^c)(x) = \frac{\mathsf {d}(x,z_t)^2}{2 (1-t)} -\varphi ^c(z_t) . \end{aligned}$$

Summing the two equations, the assertion follows immediately from Lemma 3.7. \(\square \)

We also record the following immediate corollary of Lemma 3.2:

Corollary 3.8

  1. (1)

    \((x,t) \mapsto {\bar{\varphi }}_t(x)\) is upper semi-continuous on \(X \times [0,1)\) and continuous on \(X \times (0,1)\).

  2. (2)

    For every \(x \in X\), \([0,1] \ni t \mapsto {\bar{\varphi }}_t(x)\) is monotone non-decreasing and continuous on [0, 1).

Finally, in view of (3.3), we deduce for free:

Corollary 3.9

\(D(\mathring{G}_\varphi )\) is a closed subset of \(X \times (0,1)\).

Proof

Immediate from (3.3) by the continuity of \(\varphi _t(x)\) and \({\bar{\varphi }}_t(x)\) on \(X \times (0,1)\).

\(\square \)

4.4 Length functions \(\ell _t^{\pm }\) and \({\bar{\ell }}_t^{\pm }\)

Definition

(Length functions \(\ell _t^{\pm },{\bar{\ell }}_t^{\pm }\)) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), denote

$$\begin{aligned} \ell ^{\pm }_t(x) := \frac{D^{\pm }_{-\varphi }(x,t)}{t} \;\; , \;\; {\bar{\ell }}^{\pm }_t(x) := \frac{D^{\pm }_{-\varphi ^c}(x,1-t)}{1-t} \;\;, \;\; (x,t) \in X \times (0,1). \end{aligned}$$

To provide motivation for these definitions, let us mention that we will shortly see that if \(x = \gamma _t\) with \(\gamma \in G_\varphi \) and \(t \in (0,1)\), then

$$\begin{aligned} \ell ^{+}_t(x) = \ell ^{-}_t(x) = {\bar{\ell }}^{+}_t(x) = {\bar{\ell }}^{-}_t(x) = \ell (\gamma ) . \end{aligned}$$

In particular, all \(\varphi \)-Kantorovich geodesics having x as their t-midpoint have the same length. These facts seem not to have been previously noted in the literature, and they will be crucially exploited in this work.
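For a concrete illustration, consider the following standalone Euclidean sketch (our own simplifying assumptions, not taken from the text): \(X = {\mathbb {R}}^n\) with \(\mathsf {d}(x,y) = |x-y|\), and the linear Kantorovich potential \(\varphi (x) = \langle x, v\rangle \) for a fixed \(v \in {\mathbb {R}}^n\). Then:

```latex
% Euclidean sketch: X = R^n, cost |x - y|^2 / 2, \varphi(x) = <x, v>.
\varphi^c(y) \;=\; \inf_{x \in \mathbb{R}^n}
  \Big( \tfrac{1}{2}|x-y|^2 - \langle x, v\rangle \Big)
  \;=\; -\langle y, v\rangle - \tfrac{1}{2}|v|^2
  \qquad \text{(infimum attained at } x = y + v\text{)}.
% The equality \varphi(y) + \varphi^c(z) = |y - z|^2 / 2 forces z = y - v,
% so the \varphi-Kantorovich geodesics are exactly the translates
\gamma_t \;=\; \gamma_0 - t v , \qquad \ell(\gamma) \;=\; |v| ,
% whence \ell^{\pm}_t(x) = \bar{\ell}^{\pm}_t(x) = |v| for every x and t \in (0,1).
```

In this degenerate example all lengths trivially coincide; the point of the results below is that the coincidence at t-midpoints persists in full generality.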

Definition

For \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), introduce the following set:

$$\begin{aligned} D_{{\tilde{\ell }}} := \left\{ (x,t) \in X \times (0,1) \; ; \; {\tilde{\ell }}_t^+(x) = {\tilde{\ell }}_t^-(x) \right\} , \end{aligned}$$

and on it define \({\tilde{\ell }}_t(x)\) as the common value \({\tilde{\ell }}_t^+(x) = {\tilde{\ell }}_t^-(x)\).

Recalling that \(\varphi _t = -Q_t(-\varphi )\) and \({\bar{\varphi }}_t = Q_{1-t}(-\varphi ^c)\), we begin by translating Theorem 3.4 into the following corollary. We freely use standard properties of semi-convex (semi-concave) functions, like twice a.e. differentiability, non-negativity (non-positivity) of the singular part of the distributional second derivative (see e.g. Lemma A.11), etc.

Corollary 3.10

Let \(\varphi : X \rightarrow {\mathbb {R}}\) denote a Kantorovich potential. Then:

  1. (1)

    For \({\tilde{\ell }}= \ell ,{\bar{\ell }}\) and \({\tilde{\varphi }}=\varphi ,{\bar{\varphi }}\), \({\tilde{\ell }}^{\pm }_t(x)\) are locally finite on \(X \times (0,1)\), and \((x,t) \mapsto {\tilde{\varphi }}_t(x)\) is locally Lipschitz there.

  2. (2)

    For \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), \((x,t) \mapsto {\tilde{\ell }}^{\pm }_t(x)\) is upper (\({\tilde{\ell }}^{+}_t(x)\)) / lower (\({\tilde{\ell }}^{-}_t(x)\)) semi-continuous on \(X \times (0,1)\). In particular, the subset \(D_{{\tilde{\ell }}} \subset X \times (0,1)\) is Borel and \((x,t) \mapsto {\tilde{\ell }}_t(x)\) is continuous on \(D_{{\tilde{\ell }}}\).

  3. (3)

    For every \(x \in X\) we have

    $$\begin{aligned} \partial _t^{\pm } \varphi _t(x) = \frac{\ell ^{\pm }_t(x)^2}{2} ~,~ \partial _t^{\pm } {\bar{\varphi }}_t(x) = \frac{{\bar{\ell }}^{\mp }_t(x)^2}{2}\;\;\; \forall t \in (0,1) . \end{aligned}$$

    In particular, for \({\tilde{\ell }}= \ell ,{\bar{\ell }}\) and \({\tilde{\varphi }}=\varphi ,{\bar{\varphi }}\), respectively, the map \((0,1) \ni t \mapsto {\tilde{\varphi }}_t(x)\) is locally Lipschitz, and it is differentiable at \(t \in (0,1)\) iff \(t \in D_{{\tilde{\ell }}}(x)\), the set on which both maps \((0,1) \ni t \mapsto {\tilde{\ell }}^{\pm }_t(x)\) coincide. \(D_{{\tilde{\ell }}}(x)\) is precisely the set of continuity points of both maps, and thus coincides with (0, 1) up to at most countably many exceptions. In particular

    $$\begin{aligned} {\tilde{\varphi }}_{t_2}(x) - {\tilde{\varphi }}_{t_1}(x) = \int _{t_1}^{t_2} \frac{{\tilde{\ell }}^2_\tau (x)}{2} d\tau \;\;\; \forall t_1,t_2 \in (0,1) . \end{aligned}$$
  4. (4)

    For every \(x \in X\):

    1. (a)

      Both maps \((0,1) \ni t \mapsto t \ell ^{\pm }_t(x)\) are monotone non-decreasing. In particular, \(D_{\ell }(x) \ni t \mapsto \ell ^2_t(x)\) is differentiable a.e., the singular part of its distributional derivative is non-negative, \((0,1) \ni t\mapsto \varphi _t(x)\) is locally semi-convex, and

      $$\begin{aligned} {\underline{\partial }}_t \frac{\ell _t^2(x)}{2}&\ge -\frac{1}{t} \ell _t^2(x) \;\;\; \forall t \in D_{\ell }(x) . \end{aligned}$$
      (3.6)
    2. (b)

      Both maps \((0,1) \ni t \mapsto (1-t) {\bar{\ell }}^{\pm }_t(x)\) are monotone non-increasing. In particular, \(D_{{\bar{\ell }}}(x) \ni t \mapsto {\bar{\ell }}^2_t(x)\) is differentiable a.e., the singular part of its distributional derivative is non-positive, \((0,1) \ni t \mapsto {\bar{\varphi }}_t(x)\) is locally semi-concave, and

      $$\begin{aligned} {\overline{\partial }}_t \frac{{\bar{\ell }}_t^2(x)}{2}&\le \frac{1}{1-t} {\bar{\ell }}_t^2(x) \;\;\; \forall t \in D_{{\bar{\ell }}}(x) . \end{aligned}$$
      (3.7)

Proof

The only point requiring verification is that the monotonicity of \(t \mapsto t \ell _t(x)\) in (4a) and of \(t \mapsto (1-t) {\bar{\ell }}_t(x)\) in (4b) implies (3.6) and (3.7), respectively. For instance, using the continuity of \(t \mapsto \ell _t(x)\) on \(D_{\ell }(x)\), (3.6) is clearly equivalent to

$$\begin{aligned} {\underline{\partial }}_t \ell _t(x) \ge -\frac{1}{t} \ell _t(x) \;\;\; \forall t \in D_{\ell }(x) . \end{aligned}$$
(3.8)

Now, if \(\ell _t(x) = 0\) the monotonicity directly implies \({\underline{\partial }}_t \ell _t(x) \ge 0\) and establishes (3.8), whereas otherwise, (3.8) is equivalent by the chain-rule (and again the continuity of \(t \mapsto \ell _t(x)\) on \(D_{\ell }(x)\)) to

$$\begin{aligned} {\underline{\partial }}_t \log (t \ell _t(x)) = \frac{1}{t} + {\underline{\partial }}_t \log (\ell _t(x)) \ge 0 \;\;\; \forall t \in D_{\ell }(x) , \end{aligned}$$

which in turn is a consequence of the aforementioned monotonicity. The proof of (3.7) follows identically. \(\square \)
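As an aside, the bound (3.6) extracted here from monotonicity is sharp. A standalone numerical sketch with the borderline profile \(\ell _t = c/t\) (a hypothetical profile chosen only so that \(t \mapsto t \ell _t\) is constant; it is not claimed to arise from any particular potential) attains equality:

```python
# Borderline profile for (3.6): ell(t) = c/t makes t * ell(t) = c constant,
# hence (weakly) non-decreasing. Then d/dt [ell(t)^2 / 2] = -c^2 / t^3,
# which equals -(1/t) * ell(t)^2 exactly, i.e. equality holds in (3.6).
c = 2.0
ell = lambda t: c / t

h = 1e-6
for t in (0.2, 0.5, 0.8):
    # central difference approximation of d/dt [ell(t)^2 / 2]
    numeric = (ell(t + h)**2 - ell(t - h)**2) / (4 * h)
    exact = -(1 / t) * ell(t)**2
    assert abs(numeric - exact) <= 1e-4 * abs(exact)
```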

We now arrive at the main new result of this section, which will be constantly and crucially used in this work:

Theorem 3.11

Let \(\varphi : X \rightarrow {\mathbb {R}}\) denote a Kantorovich potential.

  1. (1)

    For all \(x \in \mathrm{e}_t(G_\varphi )\) with \(t \in (0,1)\), we have

    $$\begin{aligned} \ell ^{+}_t(x) = \ell ^{-}_t(x) = {\bar{\ell }}^{+}_t(x) = {\bar{\ell }}^{-}_t(x) = \ell (\gamma ) , \end{aligned}$$

    for any \(\gamma \in G_\varphi \) so that \(\gamma _t = x\). In other words

    $$\begin{aligned} D(\mathring{G}_\varphi ) = \left\{ (x,t) \in X \times (0,1) \; ; \; x = \gamma _t \; , \; \gamma \in G_\varphi \right\} \subset D_\ell \cap D_{{\bar{\ell }}}, \end{aligned}$$

    and moreover \(\ell _t(x) = {\bar{\ell }}_t(x)\) there.

  2. (2)

    For all \(x \in X\), \(\mathring{G}_\varphi (x) \ni t \mapsto \ell _t(x)={\bar{\ell }}_t(x)\) is locally Lipschitz:

    $$\begin{aligned}&\left| \sqrt{t (1-t)} \ell _{t}(x) - \sqrt{s (1-s)} \ell _{s}(x)\right| \nonumber \\&\quad \le \sqrt{\ell _{t}(x) \ell _{s}(x)} \left| \sqrt{t (1-s)} - \sqrt{s (1-t)} \right| \;\;\; \forall t,s \in \mathring{G}_\varphi (x).\qquad \end{aligned}$$
    (3.9)
  3. (3)

    For all \((x,t) \in D(\mathring{G}_\varphi ) \subset D_\ell \cap D_{{\bar{\ell }}}\) we have for both \(* = {\underline{{\mathcal {P}}}}_2 {\bar{\varphi }}_t(x) ,{\overline{{\mathcal {P}}}}_2 \varphi _t(x)\):

    $$\begin{aligned} -\frac{1}{t} \ell _t^2(x) \le {\underline{\partial }}_t \frac{\ell _t^2(x)}{2} \le {\underline{{\mathcal {P}}}}_2 \varphi _t(x) \le * \le {\overline{{\mathcal {P}}}}_2 {\bar{\varphi }}_t(x) \le {\overline{\partial }}_t \frac{{\bar{\ell }}_t^2(x)}{2} \le \frac{1}{1-t} \ell _t^2(x) , \end{aligned}$$

    where the Peano (partial) derivatives are with respect to the t variable.

  4. (4)

    For all \((x,t) \in D(\mathring{G}_\varphi ) \subset D_\ell \cap D_{{\bar{\ell }}}\) we have:

    $$\begin{aligned} {\overline{\partial }}_t \frac{\ell _t^2(x)}{2}&\le {\overline{\partial }}_t \frac{{\bar{\ell }}_t^2(x)}{2} + \left( \frac{1}{1-t} + \frac{1}{t}\right) \ell _t^2(x) \le \left( \frac{2}{1-t} + \frac{1}{t}\right) \ell _t^2(x) \\ \underline{\partial _t} \frac{{\bar{\ell }}_t^2(x)}{2}&\ge \underline{\partial _t} \frac{\ell _t^2(x)}{2} -\left( \frac{1}{t} + \frac{1}{1-t}\right) \ell _t^2(x) \ge -\left( \frac{2}{t} + \frac{1}{1-t}\right) \ell _t^2(x) . \end{aligned}$$

In particular, for every \(x \in X\), we have:

$$\begin{aligned} \partial _t \varphi _t(x) = \partial _t {\bar{\varphi }}_t(x) = \frac{\ell ^2_t(x)}{2} = \frac{{\bar{\ell }}^2_t(x)}{2} \;\;\; \forall t \in \mathring{G}_\varphi (x) , \end{aligned}$$

with \(t \mapsto \frac{\ell ^2_t(x)}{2}\) and \(t \mapsto \frac{{\bar{\ell }}^2_t(x)}{2}\) continuous on \(D_\ell (x) \cap D_{{\bar{\ell }}}(x)\), differentiable a.e. there, and having locally bounded lower and upper derivatives on \(\mathring{G}_{\varphi }(x) \subset D_\ell (x) \cap D_{{\bar{\ell }}}(x)\) as in (3) and (4).

Proof

To see (1), let \((x,t) \in D(\mathring{G}_\varphi )\). Equivalently, by Proposition 3.6 (3), we know that \(\varphi _t(x) = {\bar{\varphi }}_t(x)\). In addition, Lemma 3.5 assures the existence of \(y^{\pm }\) and \(z^{\pm }\) in X so that

$$\begin{aligned} -\varphi _t(x)&= \frac{\mathsf {d}(x,y^{\pm })^2}{2t} - \varphi (y^{\pm }), \;\; \mathsf {d}(x,y^{\pm }) = t \ell ^{\pm }_t(x) \\ -{\bar{\varphi }}_t(x)&= - \frac{\mathsf {d}(x,z^{\pm })^2}{2(1-t)} + \varphi ^c(z^{\pm }),\;\; \mathsf {d}(x,z^{\pm }) = (1-t) {\bar{\ell }}^{\pm }_t(x) . \end{aligned}$$

Equating both expressions and applying Lemma 3.7, we deduce that x is the t-midpoint of a geodesic connecting \(y^{\pm }\) and \(z^{\pm }\) (for all 4 possibilities), and that

$$\begin{aligned} \ell ^{\pm }_t(x) = \frac{\mathsf {d}(x,y^{\pm })}{t} = \frac{\mathsf {d}(x,z^{\pm })}{1-t} = {\bar{\ell }}^{\pm }_{t}(x) \end{aligned}$$
(3.10)

so that all 4 possibilities above coincide. We remark in passing that in a non-branching setting this already implies that necessarily \(y^+ = y^-\) and \(z^+=z^-\), i.e. the uniqueness of a \(\varphi \)-Kantorovich geodesic with t-midpoint x.

Furthermore, if \(x = \gamma _t\) for some \(\gamma \in G_\varphi \), then by Lemma 3.3:

$$\begin{aligned} -\varphi _t(x) = \frac{\mathsf {d}(x , \gamma _0)^2}{2t} - \varphi (\gamma _0) . \end{aligned}$$

It follows by definition of \(D^{\pm }_{-\varphi }(x,t)\) that:

$$\begin{aligned} t \ell ^-_t(x) = D^-_{-\varphi }(x,t) \le \mathsf {d}(x,\gamma _0) = t \ell (\gamma ) \le D^+_{-\varphi }(x,t) = t \ell ^+_t(x) , \end{aligned}$$

which together with (3.10) establishes that \(\ell (\gamma ) = \ell _t(x) = {\bar{\ell }}_t(x)\).

To see (2), let \(\gamma ^t, \gamma ^s \in G_{\varphi }\) be so that \(\gamma ^t_t = \gamma ^s_s = x\), for some \(t,s \in (0,1)\). Then

$$\begin{aligned} \varphi ^c(\gamma ^p_1) = \frac{\ell (\gamma ^p)^2}{2} - \varphi (\gamma ^p_0) \le \frac{\mathsf {d}(\gamma ^p_1,\gamma ^q_0)^2}{2} - \varphi (\gamma ^q_0) \end{aligned}$$

for \((p,q) = (t,s)\) and \((p,q) = (s,t)\). Summing these two inequalities, we obtain the well-known c-cyclic monotonicity of the set \(\left\{ (\gamma ^t_0,\gamma ^t_1),(\gamma ^s_0,\gamma ^s_1)\right\} \):

$$\begin{aligned} \ell (\gamma ^t)^2 + \ell (\gamma ^s)^2 \le \mathsf {d}(\gamma ^t_0,\gamma ^s_1)^2 + \mathsf {d}(\gamma ^s_0,\gamma ^t_1)^2 . \end{aligned}$$

To evaluate the right-hand-side, we simply pass through x and employ the triangle inequality:

$$\begin{aligned} \mathsf {d}(\gamma ^p_0,\gamma ^q_1) \le \mathsf {d}(\gamma ^p_0,x) + \mathsf {d}(x,\gamma ^q_1) = p \; \ell (\gamma ^p) + (1-q) \; \ell (\gamma ^q) . \end{aligned}$$

Plugging this above and rearranging terms, we obtain

$$\begin{aligned} t (1-t) \ell (\gamma ^t)^2 + s (1-s) \ell (\gamma ^s)^2 \le \left( t (1-s) + s (1-t)\right) \ell (\gamma ^t) \ell (\gamma ^s) . \end{aligned}$$

Completing the square by subtracting \(2 \sqrt{t(1-t)s(1-s)} \ell (\gamma ^t) \ell (\gamma ^s)\) from both sides, and recalling that \(\ell (\gamma ^p) = \ell _p(x)\) for \(p=t,s\), we readily obtain (3.9). In particular, using \(t=s\), the above argument recovers the last assertion of (1) that \(\ell (\gamma )\) is the same for all \(\gamma \in G_\varphi \) so that \(\gamma _t = x\).
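For the record, the completing-the-square step rests on the elementary identity \(t(1-s)+s(1-t) - 2\sqrt{t s (1-t)(1-s)} = \big (\sqrt{t(1-s)} - \sqrt{s(1-t)}\big )^2\). A standalone numerical confirmation (not part of the proof):

```python
import math
import random

random.seed(1)
for _ in range(10_000):
    t = random.uniform(0.01, 0.99)
    s = random.uniform(0.01, 0.99)
    # identity used when completing the square in the derivation of (3.9)
    lhs = t * (1 - s) + s * (1 - t) - 2 * math.sqrt(t * s * (1 - t) * (1 - s))
    rhs = (math.sqrt(t * (1 - s)) - math.sqrt(s * (1 - t))) ** 2
    assert abs(lhs - rhs) < 1e-10
```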

To see (3), recall that given \(x \in X\), we know by Proposition 3.6 that \(\varphi _t(x) \le {\bar{\varphi }}_t(x)\) for all \(t \in (0,1)\) with equality iff \(t \in \mathring{G}_\varphi (x)\). Since \(\mathring{G}_\varphi (x) \subset D_\ell (x) \cap D_{{\bar{\ell }}}(x)\) by (1), we know that both maps \(t \mapsto {\tilde{\varphi }}_t(x)\) are differentiable at \(t_0 \in \mathring{G}_\varphi (x)\), and we see again that \(\frac{\ell ^2_{t_0}(x)}{2} = \partial _t \varphi _{t_0}(x) = \partial _t {\bar{\varphi }}_{t_0}(x) = \frac{{\bar{\ell }}^2_{t_0}(x)}{2}\), since the derivatives of a function and its majorant must coincide at a mutual point of differentiability where they touch. Moreover, defining \({\tilde{h}} = h,{\bar{h}}\) as

$$\begin{aligned} {\tilde{h}}(\varepsilon ) := 2\left( {\tilde{\varphi }}_{t_0+\varepsilon }(x) - {\tilde{\varphi }}_{t_0}(x) - \varepsilon \partial _t {\tilde{\varphi }}_{t_0}(x)\right) \end{aligned}$$

it follows that \(h \le {\bar{h}}\) (on \((-t_0,1-t_0)\)). Dividing by \(\varepsilon ^2\) and taking appropriate subsequential limits, we obviously obtain

$$\begin{aligned} {\underline{{\mathcal {P}}}}_2 \varphi _t(x) \le {\underline{{\mathcal {P}}}}_2 {\bar{\varphi }}_t(x) ~,~ {\overline{{\mathcal {P}}}}_2 \varphi _t(x) \le {\overline{{\mathcal {P}}}}_2 {\bar{\varphi }}_t(x) . \end{aligned}$$

Combining these inequalities with those of Lemma 2.4, (3.6) and (3.7), the chain of inequalities in (3) readily follows.

To see (4), let \(t_0 \in \mathring{G}_\varphi (x)\). Consider the function \(f(t) := {\bar{\varphi }}_t(x) - \varphi _t(x)\) on (0, 1), which is locally semi-concave by Corollary 3.10. By Proposition 3.6, we know that \(f \ge 0\) with \(f(t_0) = 0\). The function f is differentiable on \(D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) and satisfies \(f'(t) = \frac{{\bar{\ell }}^2_{t}(x)}{2} - \frac{\ell ^2_{t}(x)}{2}\) there. In particular, this holds at \(t_0 \in \mathring{G}_\varphi (x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) by (1) and \(f'(t_0) = 0\). Note that by Corollary 3.10:

$$\begin{aligned} {\overline{\partial }}_t f'(t) \le {\overline{\partial }}_t \frac{{\bar{\ell }}^2_{t}(x)}{2} - {\underline{\partial }}_t \frac{\ell ^2_{t}(x)}{2} \le \frac{1}{1-t} {\bar{\ell }}_t^2(x) + \frac{1}{t} \ell _t^2(x) . \end{aligned}$$

In particular, since both \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}_t(x)\) are continuous at \(t = t_0 \in D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\), for \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), it follows that

$$\begin{aligned}&\forall \varepsilon> 0 \;\; \exists \delta > 0 \;\; \forall t \in (t_0-\delta ,t_0 + \delta ) \cap D_{\ell }(x) \cap D_{{\bar{\ell }}}(x) \;\;\; \\&{\overline{\partial }}_t f'(t) \le \frac{1}{1-t_0} \ell _{t_0}^2(x) + \frac{1}{t_0} \ell _{t_0}^2(x) + \varepsilon . \end{aligned}$$

It follows that on the open interval \(I_\delta := (t_0-\delta ,t_0 + \delta ) \cap (0,1)\), \(f - C_\varepsilon \frac{t^2}{2}\) is concave with \(C_\varepsilon \) defined as the constant on the right-hand-side above. Applying Lemma 3.12 below to the translated function \(f(\cdot + t_0)\) on the interval \(I_\delta - t_0\), it follows that:

$$\begin{aligned}&\frac{1}{t-t_0} \left( \frac{{\bar{\ell }}^2_{t}(x)}{2} - \frac{\ell ^2_{t}(x)}{2}\right) = \frac{f'(t) - f'(t_0)}{t-t_0} \quad \ge -C_\varepsilon \\&\;\;\; \forall t \in (t_0 - \frac{\delta }{2},t_0 + \frac{\delta }{2}) \cap D_{\ell }(x) \cap D_{{\bar{\ell }}}(x) . \end{aligned}$$

As \({\bar{\ell }}_{t_0}(x) = \ell _{t_0}(x)\) by (1), we obtain

$$\begin{aligned}&\frac{\frac{\ell ^2_{t}(x)}{2} - \frac{\ell ^2_{t_0}(x)}{2}}{t-t_0} \le \frac{\frac{{\bar{\ell }}^2_{t}(x)}{2} - \frac{{\bar{\ell }}^2_{t_0}(x)}{2}}{t-t_0} + C_{\varepsilon }\\&\;\;\; \forall t \in (t_0 - \frac{\delta }{2},t_0 + \frac{\delta }{2}) \cap D_{\ell }(x) \cap D_{{\bar{\ell }}}(x). \end{aligned}$$

The assertion of (4) now follows by taking appropriate subsequential limits as \(t \rightarrow t_0\) and using the fact that \(\varepsilon > 0\) was arbitrary. \(\square \)

Lemma 3.12

Given an open interval \(I \subset {\mathbb {R}}\) containing 0, let \(f : I \rightarrow {\mathbb {R}}\) denote a C-semi-concave function, meaning that \(I \ni t \mapsto f(t) - C \frac{t^2}{2}\) is concave, \(C \ge 0\). Assume that \(f \ge 0\) on I, that f is differentiable at 0 and that \(f(0) = f'(0) = 0\). Then \(\underline{\partial _t}|_{t=0} f'(t) \ge -C\), and moreover, \(\frac{f'(t)}{t} \ge -C\) for all \(t \in D \cap I/2\), where \(D \subset I\) denotes the subset (of full measure) of differentiability points of f.

Note that the C-semi-concavity is equivalent to \({\overline{\partial }}_t|_{t=0} f'(t) \le C\), while the conclusion is in the opposite direction. It is not hard to verify that the asserted lower bound is in fact best possible.

Proof of Lemma 3.12

Set \(g = f'\) on D. The C-semi-concavity is equivalent to the statement that \(g(t) - C t\) is non-increasing on D, so that \(g(t_2) \le g(t_1) + C (t_2 - t_1)\) for all \(t_1,t_2 \in D\) with \(t_1 < t_2\). It follows that necessarily \(g(t) \ge -C t\) for all \(t \in D \cap I/2\) with \(t \ge 0\), since:

$$\begin{aligned} 0\le & {} f(2t) - f(0) = \int _0^{2t} g(s) ds \le \int _0^t (g(0) + C s) ds \\&+ \int _t^{2t} (g(t) + C(s-t)) ds = C \frac{t^2}{2} + t g(t) + C \frac{t^2}{2} . \end{aligned}$$

Repeating the same argument for \(t \mapsto f(-t)\), we see that \(-g(t) \ge C t\) for all \(t \in D \cap I/2\) with \(t \le 0\). This concludes the proof. \(\square \)
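A standalone numerical sanity check of Lemma 3.12, on a hypothetical test function chosen only to satisfy the hypotheses (here with \(I = (-1,1)\) and \(C = 1\)):

```python
# Test function: f(t) = (C/2) * max(|t| - a, 0)^2 with 0 < a < 1.
# One checks f >= 0, f(0) = f'(0) = 0, and that f(t) - C t^2/2 is concave
# (piecewise: quadratic on [-a, a], affine outside, with matching slopes),
# so f is C-semi-concave on I = (-1, 1). Lemma 3.12 then asserts
# f'(t)/t >= -C on I/2 = (-1/2, 1/2).
C, a = 1.0, 0.3

def f(t):
    return 0.5 * C * max(abs(t) - a, 0.0) ** 2

def fprime(t, h=1e-6):
    # f is C^1, so a central difference suffices
    return (f(t + h) - f(t - h)) / (2 * h)

for k in range(-49, 50):
    if k == 0:
        continue
    t = k / 100  # grid over I/2, excluding t = 0
    assert fprime(t) / t >= -C - 1e-6
```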

In a sense, Theorem 3.11 (2) is the temporal analogue of the spatial 1/2-Hölder regularity proved by Villani in [77, Theorem 8.22]. Formally taking \(s \rightarrow t\) in (3.9), it is easy to check that one obtains (for both \({\tilde{\ell }} = \ell ,{\bar{\ell }}\)) stronger bounds than in Theorem 3.11 (3) and (4):

$$\begin{aligned} -\frac{1}{t} \ell ^2_t(x) \le \underline{\partial _t} \frac{{\tilde{\ell }}^2_t(x)|_{\mathring{G}_\varphi (x)}}{2} \le \overline{\partial _t} \frac{{\tilde{\ell }}^2_t(x)|_{\mathring{G}_\varphi (x)}}{2} \le \frac{1}{1-t} \ell ^2_t(x) \;\;\; \forall t \in \mathring{G}_\varphi (x) .\nonumber \\ \end{aligned}$$
(3.11)

However, we do not know how to rigorously pass from (3.9) to (3.11) or vice versa (by differentiation or integration, respectively), since we cannot exclude the possibility that the (relatively closed in (0, 1)) set \(\mathring{G}_\varphi (x)\) has isolated points, nor that it is disconnected. Instead, we can obtain the following stronger version of (3.11) which only holds for a.e. \(t \in \mathring{G}_\varphi (x)\), but will prove to be very useful later on.

Corollary 3.13

For all \(x \in X\), for a.e. \(t \in \mathring{G}_\varphi (x)\), \(\partial _t \ell ^2_t(x)\) and \(\partial _t {\bar{\ell }}^2_t(x)\) exist, coincide, and satisfy:

$$\begin{aligned} -\frac{1}{t} \ell ^2_t(x) \le \partial _t \frac{\ell ^2_t(x)}{2} = \partial _t \frac{\ell ^2_t(x)|_{\mathring{G}_\varphi (x)}}{2} = \partial _t \frac{{\bar{\ell }}^2_t(x)|_{\mathring{G}_\varphi (x)}}{2} = \partial _t \frac{{\bar{\ell }}^2_t(x)}{2} \le \frac{1}{1-t} \ell ^2_t(x). \nonumber \\ \end{aligned}$$
(3.12)

Proof

By Corollary 3.10, for all \(x \in X\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), \(t \mapsto {\tilde{\ell }}^2_t(x)\) is differentiable a.e. on \(D_{{\tilde{\ell }}}(x)\). Consequently, the first and third equalities in (3.12) follow for a.e. \(t \in \mathring{G}_\varphi (x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) by Remark 2.1. The second equality follows since \(\ell _t(x) = {\bar{\ell }}_t(x)\) for \(t \in \mathring{G}_\varphi (x)\) by Theorem 3.11. The lower and upper bounds in (3.12) then follow from Theorem 3.11 (3) (or as in (3.11), by taking the limit as \(s \rightarrow t\) in Theorem 3.11 (2)). \(\square \)

4.5 Null-geodesics

Definition 3.14

(Null-Geodesics and Null-Geodesic Points) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\), denote the subset of null \(\varphi \)-Kantorovich geodesics by

$$\begin{aligned} G_\varphi ^0 := \left\{ \gamma \in G_\varphi \; ; \; \ell (\gamma ) = 0 \right\} . \end{aligned}$$

Its complement in \(G_\varphi \) will be denoted by \(G_\varphi ^+\). The subset of X of null \(\varphi \)-Kantorovich geodesic points is denoted by

$$\begin{aligned} X^0 := \left\{ x \in X \; ; \; \exists \gamma \in G_\varphi ^0 \;\; \gamma \equiv x\right\} = \left\{ x \in X \; ; \; \varphi (x) + \varphi ^c(x) = 0 \right\} . \end{aligned}$$

Its complement in X will be denoted by \(X^+\).

The following provides a convenient equivalent characterization of \(X^0\) and \(X^+\):

Lemma 3.15

Given \(x \in X\), the following statements are equivalent:

  1. (1)

    \(x \in X^0\), i.e. \(\varphi (x) + \varphi ^c(x) = 0\).

  2. (2)

    \(\forall t \in (0,1)\), \(\varphi _t(x) = {\bar{\varphi }}_t(x) = \varphi (x) = -\varphi ^c(x)\).

  3. (3)

    \(\forall t \in (0,1)\), \(\varphi _t(x) = c\) and \({\bar{\varphi }}_t(x) = {\bar{c}}\) for some \(c,{\bar{c}} \in {\mathbb {R}}\).

  4. (4)

    \(D_{\ell }(x) = D_{{\bar{\ell }}}(x) = (0,1)\) and \(\;\forall t \in (0,1) \;\; \ell _t(x) = {\bar{\ell }}_t(x) = 0\).

  5. (5)

    \(\exists t_0 \in \mathring{G}_\varphi (x)\) so that \(\varphi _{t_0}(x) = \varphi (x)\) or \({\bar{\varphi }}_{t_0}(x) = \varphi (x)\) or \(\varphi _{t_0}(x) = -\varphi ^c(x)\) or \({\bar{\varphi }}_{t_0}(x) = -\varphi ^c(x)\).

  6. (6)

    \(\exists t_0 \in \mathring{G}_\varphi (x)\) so that \(\ell _{t_0}^{-}(x) = 0\) or \(\ell _{t_0}^{+}(x) = 0\) or \({\bar{\ell }}_{t_0}^{-}(x) = 0\) or \({\bar{\ell }}_{t_0}^+(x) = 0\).

In other words, we have the following dichotomy: all \(\varphi \)-Kantorovich geodesics having \(x \in X\) as some interior mid-point have either strictly positive length (iff \(x \in X^+\)) or zero length (iff \(x \in X^0\)).

Remark 3.16

In fact, we always have \(\varphi _t(x) = {\bar{\varphi }}_t(x)\) and \(\ell _t(x) = {\bar{\ell }}_t(x)\) for \(t \in \mathring{G}_\varphi (x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\) by Theorem 3.11, so we may simply write “\(\varphi _{t_0}(x) = \varphi (x)\) or \(\varphi _{t_0}(x)= -\varphi ^c(x)\)” and “\(\ell _{t_0}(x) = {\bar{\ell }}_{t_0}(x) = 0\)” in statements (5) and (6), respectively. However, we chose to formulate these statements with the (a-priori) minimal requirements.

Proof of Lemma 3.15

\((1) \Rightarrow (2)\) is straightforward: for instance, (1) is by definition identical to \(\varphi _1(x) = \varphi _0(x)\) and (2) follows by the monotonicity of \([0,1] \ni t \mapsto {\tilde{\varphi }}_t(x)\) for both \({\tilde{\varphi }} = \varphi ,{\bar{\varphi }}\); alternatively, apply Lemma 3.3 to the null geodesic \(\gamma ^0 \equiv x\) with respect to both Kantorovich potentials \(\varphi \) and \(\varphi ^c\).

\((2) \,\!\Rightarrow \!\, (3)\) is trivial.

\((3)\,\!\!\Leftrightarrow \!\!\,(4)\) follows by using that \(D_{{\tilde{\ell }}}(x)\) is characterized as the subset of t-differentiability points of \({\tilde{\varphi }}_t(x)\) on (0, 1) with \(\partial _t {\tilde{\varphi }}_t(x) = {\tilde{\ell }}_t^2(x)/2\) there.

\((3)\,\!\Rightarrow \!\,(1)\): by the continuity of \(t \mapsto \varphi _t(x)\) from the left at \(t=1\) it follows that \(c = \varphi _1(x)\), and similarly the continuity of \(t \mapsto {\bar{\varphi }}_t(x)\) from the right at \(t=0\) yields that \({\bar{c}} = {\bar{\varphi }}_0(x) = \varphi (x)\). Since always \(\varphi \le {\bar{\varphi }}\), we deduce \(\varphi _1(x) = c \le {\bar{c}} = \varphi (x)\). On the other hand, we always have \(\varphi (x) \le \varphi _1(x)\) by monotonicity, so we conclude that \(\varphi (x) = \varphi _1(x)\), establishing statement (1). This concludes the proof of the equivalence \((1)\,\! \Leftrightarrow \!\,(2) \,\!\Leftrightarrow \!\,(3) \,\!\Leftrightarrow \!\, (4)\).

\((2) \,\!\Rightarrow \!\, (5)\) and \((4)\,\! \Rightarrow \!\,(6)\) are trivial.

\((5) \,\!\!\Rightarrow \!\!\,(6)\) is straightforward: for instance, if \({\tilde{\varphi }}_{t_0}(x) = {\tilde{\varphi }}_0(x) = \varphi (x)\) for some \(t_0 \in (0,1)\) and \({\tilde{\varphi }} \in \left\{ \varphi ,{\bar{\varphi }}\right\} \), then by monotonicity, \({\tilde{\varphi }}_t(x) = \varphi (x)\) for all \(t \in [0,t_0]\), and hence the left derivative at \(t=t_0\) satisfies \(\frac{(\ell ^{-}_{t_0}(x))^2}{2} = \partial _t^{-}|_{t=t_0} \varphi _t(x) = 0\) if \({\tilde{\varphi }} = \varphi \) and \(\frac{({\bar{\ell }}^{+}_{t_0}(x))^2}{2} = \partial _t^{-}|_{t=t_0} {\bar{\varphi }}_t(x) = 0\) if \({\tilde{\varphi }} = {\bar{\varphi }}\). If \({\tilde{\varphi }}_{t_0}(x) = {\tilde{\varphi }}_1(x) = -\varphi ^c(x)\), repeat the argument using the right derivative.

The only direction requiring second-order information on \(\varphi _t\) is \((6) \Rightarrow (3)\). By Corollary 3.10, \(t \mapsto t \ell _{t}^{\pm }(x)\) and \(t \mapsto (1-t) {\bar{\ell }}_t^{\pm }(x)\) are monotone non-decreasing and non-increasing on (0, 1), respectively. Since \(t_0 \in \mathring{G}_\varphi (x)\), in view of Remark 3.16, (6) is equivalent to \(\ell ^{\pm }_{t_0}(x) = {\bar{\ell }}^{\pm }_{t_0}(x) = 0\). The monotonicity implies that \(\ell _t^{\pm }(x) = 0\) for all \(t \in (0,t_0]\) and that \({\bar{\ell }}_t^{\pm }(x) = 0\) for all \(t \in [t_0,1)\). It follows that \(\varphi _t(x)\) is constant on \((0,t_0]\) and \({\bar{\varphi }}_t(x)\) is constant on \([t_0,1)\). As \(\varphi _{t_0}(x) = {\bar{\varphi }}_{t_0}(x)\), the monotonicity of \(t \mapsto {\tilde{\varphi }}_t(x)\) and the majorization \(\varphi _t \le {\bar{\varphi }}_t\) force both \(t \mapsto \varphi _t(x)\) and \(t \mapsto {\bar{\varphi }}_t(x)\) to be constant on (0, 1), establishing (3) (in fact with \(c = {\bar{c}}\)). \(\square \)

Corollary 3.17

If \(x \in X^+\) then \(\ell _t(x) > 0\) for all \(t \in [\inf \mathring{G}_\varphi (x), 1) \cap D_{\ell }(x)\) and \({\bar{\ell }}_t(x) > 0\) for all \(t \in (0,\sup \mathring{G}_\varphi (x)] \cap D_{{\bar{\ell }}}(x)\).

Proof

Immediate by Lemma 3.15 (6) and the monotonicity of \(D_{\ell }(x) \ni t \mapsto t \ell _t(x)\) and \(D_{{\bar{\ell }}}(x) \ni t \mapsto (1-t) {\bar{\ell }}_t(x)\), together with the fact that \(\mathring{G}_\varphi (x)\) is relatively closed in (0, 1) by Corollary 3.9. \(\square \)

Corollary 3.18

Given \(x \in X\), assume that \(\exists t_1,t_2 \in \mathring{G}_\varphi (x)\) with \(t_1 \ne t_2\). Then \(x \in X^0\) iff \(\varphi _{t_1}(x) = \varphi _{t_2}(x)\) (or equivalently, \({\bar{\varphi }}_{t_1}(x) = {\bar{\varphi }}_{t_2}(x)\)).

Proof

The “only if” direction follows immediately by Lemma 3.15, whereas the “if” direction follows by Corollary 3.17, after recalling that \(\varphi _{t_2}(x) - \varphi _{t_1}(x) = \int _{t_1}^{t_2} \frac{\ell _\tau ^2(x)}{2} d\tau \) by Corollary 3.10. As usual, the equivalent condition follows by Theorem 3.11. \(\square \)

5 Temporal theory of intermediate-time Kantorovich potentials: time-propagation

The goal of this section is to introduce and study the following function(s):

Definition

(Time-Propagated Intermediate Kantorovich Potentials) Given a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\) and \(s,t \in (0,1)\), define the t-propagated s-Kantorovich potential \(\Phi _s^t\) on \(D_{\ell }(t)\), and its time-reversed version \({\bar{\Phi }}_s^t\) on \(D_{{\bar{\ell }}}(t)\), by:

$$\begin{aligned} \Phi _s^t := \varphi _t + (t-s) \frac{\ell _t^2}{2} \text { on } D_{\ell }(t),~ {\bar{\Phi }}_s^t := {\bar{\varphi }}_t + (t-s) \frac{{\bar{\ell }}_t^2}{2} \text { on }D_{{\bar{\ell }}}(t). \end{aligned}$$

Observe that for all \(s,t \in (0,1)\):

$$\begin{aligned} \Phi _s^t = {\bar{\Phi }}_s^t = \varphi _s \circ \mathrm{e}_s \circ \mathrm{e}_t^{-1} \quad \text { on }\mathrm{e}_t(G_\varphi ) ; \end{aligned}$$

indeed, while \(\mathrm{e}_t^{-1} : \mathrm{e}_t(G_\varphi ) \rightarrow G_\varphi \) may be multi-valued, Theorem 3.11 implies that \(\ell (\gamma ) = \ell _t(x) = {\bar{\ell }}_t(x)\) for any \(\gamma \in G_\varphi \) with \(\gamma _t = x\), and consequently Lemma 3.3 yields that \(\varphi _s(\gamma _s)\) takes a single common value over all such \(\gamma \) and (also recalling Proposition 3.6):

$$\begin{aligned} \Phi _s^t(\gamma _t) = {\bar{\Phi }}_s^t(\gamma _t) = \varphi _s(\gamma _s) \;\;\; \forall \gamma \in G_{\varphi } . \end{aligned}$$

Consequently, on \(\mathrm{e}_t(G_\varphi )\), \(\Phi _s^t={\bar{\Phi }}_s^t\) is identified as the push-forward of \(\varphi _s\) via \(\mathrm{e}_t \circ \mathrm{e}_s^{-1}\), i.e. its propagation along \(G_\varphi \) from time s to time t.
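To illustrate, in a standalone Euclidean sketch (our own simplifying assumptions: \(X = {\mathbb {R}}^n\), \(\varphi (x) = \langle x , v\rangle \), for which the \(\varphi \)-Kantorovich geodesics are the translates \(\gamma _t = \gamma _0 - t v\) with \(\ell (\gamma ) = |v|\)), the propagation identity can be verified by direct computation:

```latex
% Euclidean sketch: \varphi_t(x) = \sup_y ( \varphi(y) - |x - y|^2/(2t) )
%                               = <x, v> + t |v|^2 / 2,
% so along \gamma_t = \gamma_0 - t v:
\varphi_t(\gamma_t) \;=\; \langle \gamma_0, v\rangle - t\,\tfrac{|v|^2}{2} ,
\qquad
\Phi_s^t(\gamma_t) \;=\; \varphi_t(\gamma_t) + (t - s)\,\tfrac{|v|^2}{2}
  \;=\; \langle \gamma_0, v\rangle - s\,\tfrac{|v|^2}{2}
  \;=\; \varphi_s(\gamma_s) ,
% recovering \Phi_s^t \circ e_t = \varphi_s \circ e_s along geodesics.
```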

We will use the following short-hand notation. Given \(s \in [0,1]\) and \(a_s \in {\mathbb {R}}\), we denote:

$$\begin{aligned} G_{a_s} := \left\{ \gamma \in G_\varphi \; ; \; \varphi _s(\gamma (s)) = a_s\right\} , \end{aligned}$$

suppressing the implicit dependence of \(G_{a_s}\) on s. The above argument about why \(\varphi _s \circ \mathrm{e}_s \circ \mathrm{e}_t^{-1}\) is well-defined can be rewritten as:

Corollary 4.1

(Inter Level-Set Propagation) For all \(s,t \in (0,1)\), \(a_s, b_s \in {\mathbb {R}}\), \(a_s \ne b_s\), we have:

$$\begin{aligned} \mathrm{e}_t(G_\varphi ) \cap \left\{ \Phi _s^t = a_s\right\} \cap \left\{ \Phi _s^t = b_s\right\} = \mathrm{e}_{t}(G_{a_s}) \cap \mathrm{e}_{t}(G_{b_s}) = \emptyset . \end{aligned}$$

Note that while typically disjoint sets remain disjoint under optimal-transport only under some additional non-branching assumptions, Corollary 4.1 holds true in general.

5.1 Monotonicity

Lemma 4.2

Let \(x = \gamma ^1_{t_1} = \gamma ^2_{t_2}\) with \(\gamma ^1,\gamma ^2 \in G_\varphi \) and \(0< t_1< t_2 < 1\). Then for any \(s \in (0,1)\):

$$\begin{aligned} \varphi _s(\gamma ^2_s) - \varphi _s(\gamma ^1_s) \ge 2 \min \left( \frac{s}{t_2},\frac{1-s}{1-t_1}\right) (\varphi _{t_2}(x) - \varphi _{t_1}(x)) \ge 0 . \end{aligned}$$
(4.1)

Moreover, the left-hand-side is in fact strictly positive iff \(x \in X^+\).

Proof

We know by Lemma 3.3 and Theorem 3.11 that:

$$\begin{aligned} \varphi _s(\gamma ^i_s) = \varphi _{t_i}(\gamma ^i_{t_i}) + (t_i - s) \frac{\ell ^2(\gamma ^i)}{2} = \varphi _{t_i}(x) + (t_i - s) \frac{\ell _{t_i}^2(x)}{2} ~,~ i=1,2 . \end{aligned}$$

Recall that \(\varphi _{t_i}(x) = {\bar{\varphi }}_{t_i}(x)\) and \(\ell _{t_i}(x) = {\bar{\ell }}_{t_i}(x)\) by Proposition 3.6 and Theorem 3.11, as \(x = \gamma ^i_{t_i}\). Now set \({\bar{s}} := (s \vee t_1) \wedge t_2\). Since \({\bar{s}} \in \left\{ t_1,t_2,s\right\} \), it follows that:

$$\begin{aligned}&\varphi _s(\gamma ^2_s) - \varphi _s(\gamma ^1_s) - \left( \varphi _{t_2}(x) - \varphi _{t_1}(x)\right) \\&\quad = (t_2 - s) \frac{\ell _{t_2}^2(x)}{2} - ({\bar{s}} - s) \frac{\ell _{{\bar{s}}}^2(x)}{2} + ({\bar{s}} - s) \frac{{\bar{\ell }}_{{\bar{s}}}^2(x)}{2} - (t_1-s) \frac{{\bar{\ell }}_{t_1}^2(x)}{2} . \end{aligned}$$

By Corollary 3.10, we know for \({\tilde{\ell }} = \ell ,{\bar{\ell }}\) that \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}^2_t(x)\) is differentiable a.e., and that the singular part of its distributional derivative is non-negative for \({\tilde{\ell }} = \ell \) and non-positive for \({\tilde{\ell }} = {\bar{\ell }}\). Consequently, we may proceed as follows:

$$\begin{aligned} \ge \int _{{\bar{s}}}^{t_2} {\underline{\partial }}_{\tau } \left( (\tau - s) \frac{\ell _{\tau }^2(x)}{2}\right) d\tau + \int _{t_1}^{{\bar{s}}} {\overline{\partial }}_{\tau } \left( (\tau - s) \frac{{\bar{\ell }}_{\tau }^2(x)}{2}\right) d\tau , \end{aligned}$$

where we used that \(\tau -s \ge 0\) when \({\bar{s}} \le \tau < t_2\) and that \(\tau -s \le 0\) when \({\bar{s}} \ge \tau > t_1\). Using (3.6) and (3.7) to bound the above lower and upper derivatives on the sets (having full measure) \(D_{\ell }(x)\) and \(D_{{\bar{\ell }}}(x)\), respectively, we obtain:

$$\begin{aligned}&\ge \int _{{\bar{s}}}^{t_2} \left( 1 - 2 \frac{\tau -s}{\tau }\right) \frac{\ell _\tau ^2(x) }{2} d\tau + \int _{t_1}^{{\bar{s}}} \left( 1 + 2 \frac{\tau -s}{1-\tau }\right) \frac{{\bar{\ell }}_{\tau }^2(x)}{2} d\tau \\&\quad = \int _{{\bar{s}}}^{t_2} \left( 2 \frac{s}{\tau } - 1\right) \frac{\ell _\tau ^2(x) }{2} d\tau + \int _{t_1}^{{\bar{s}}} \left( 2 \frac{1-s}{1-\tau } - 1\right) \frac{{\bar{\ell }}_{\tau }^2(x)}{2} d\tau \\&\quad \ge \left( 2 \frac{s}{t_2} - 1\right) \int _{{\bar{s}}}^{t_2} \frac{\ell _\tau ^2(x) }{2} d\tau + \left( 2 \frac{1-s}{1-t_1} - 1\right) \int _{t_1}^{{\bar{s}}} \frac{{\bar{\ell }}_{\tau }^2(x)}{2} d\tau \\&\quad = \left( 2 \frac{s}{t_2} - 1\right) \left( \varphi _{t_2}(x) - \varphi _{{\bar{s}}}(x)\right) + \left( 2 \frac{1-s}{1-t_1} - 1\right) \left( {\bar{\varphi }}_{{\bar{s}}}(x) - {\bar{\varphi }}_{t_1}(x)\right) . \end{aligned}$$

Summarizing, we have obtained:

$$\begin{aligned} \varphi _s(\gamma ^2_s) - \varphi _s(\gamma ^1_s)&\ge \left( 2 \frac{s}{t_2} - 1\right) \left( \varphi _{t_2}(x) - \varphi _{{\bar{s}}}(x)\right) + \varphi _{t_2}(x) \\&\quad + \left( 2 \frac{1-s}{1-t_1} - 1\right) \left( {\bar{\varphi }}_{{\bar{s}}}(x) - \varphi _{t_1}(x)\right) - \varphi _{t_1}(x) . \end{aligned}$$

We now use the inequality \(\varphi _{{\bar{s}}}(x) \le {\bar{\varphi }}_{{\bar{s}}}(x)\) in the first line above when \(2 \frac{s}{t_2} - 1 \ge 0\), and in the second line when \(2 \frac{1-s}{1-t_1} - 1 \ge 0\), yielding

$$\begin{aligned} \ge {\left\{ \begin{array}{ll} 2 \frac{s}{t_2} ({\bar{\varphi }}_{t_2}(x) - {\bar{\varphi }}_{{\bar{s}}}(x)) + 2 \frac{1-s}{1-t_1} ({\bar{\varphi }}_{{\bar{s}}}(x) - {\bar{\varphi }}_{t_1}(x)) &{} s \ge \frac{t_2}{2} \\ 2 \frac{s}{t_2} (\varphi _{t_2}(x) - \varphi _{{\bar{s}}}(x)) + 2 \frac{1-s}{1-t_1} (\varphi _{{\bar{s}}}(x) - \varphi _{t_1}(x)) &{} 1-s \ge \frac{1-t_1}{2}. \end{array}\right. } \end{aligned}$$

In particular, the first estimate applies whenever \(s \ge \frac{1}{2}\) and the second one whenever \(s \le \frac{1}{2}\). Using that \([0,1] \ni \tau \mapsto {\tilde{\varphi }}_\tau (x)\) is monotone non-decreasing, the asserted (4.1) is established in either case. Moreover, (4.1) implies that if \(\varphi _s(\gamma ^2_s) -\varphi _s(\gamma ^1_s) = 0\) then \(\varphi _{t_1}(x) = \varphi _{t_2}(x)\), and hence by Corollary 3.18 that \(x \in X^0\); and vice-versa, if \(x \in X^0\) then all geodesics having x as an interior point are null by Lemma 3.15, and hence \(\gamma ^1_s = \gamma ^2_s = x\) and \(\varphi _s(\gamma ^2_s) -\varphi _s(\gamma ^1_s) = 0\). \(\square \)

We can already deduce the following important consequence, complementing Corollary 4.1, which holds for any proper geodesic space \((X,\mathsf {d})\), independently of any additional assumptions like various forms of non-branching:

Corollary 4.3

(Intra Level-Set Propagation) For any \(s \in (0,1)\), \(a_s \in {\mathbb {R}}\), and \(t_1, t_2 \in (0,1)\) with \(t_1 \ne t_2\):

$$\begin{aligned} \mathrm{e}_{t_1}(G_{a_s} \setminus G_\varphi ^0) \cap \mathrm{e}_{t_2}(G_{a_s} \setminus G_\varphi ^0) = \mathrm{e}_{t_1}(G_{a_s}) \cap \mathrm{e}_{t_2}(G_{a_s}) \cap X^+ = \emptyset . \end{aligned}$$

In other words, for each \(x \in \mathrm{e}_{(0,1)}(G_{a_s}) \cap X^+\), there exists a unique \(t \in (0,1)\) so that \(x \in \mathrm{e}_t(G_{a_s})\).

Proof

If \(x = \gamma ^1_{t_1} = \gamma ^2_{t_2} \in X^+\), \(0<t_1< t_2< 1\), then Lemma 4.2 yields \(\varphi _s(\gamma ^2_s) > \varphi _s(\gamma ^1_s)\), establishing the assertion. \(\square \)

4.2 Properties of \(\Phi _s^t\)

The following information will be crucially used when deriving the Change-Of-Variables formula in Sect. 11:

Proposition 4.4

For any \(s \in (0,1)\), the following properties of \(\Phi _s^t\) and \({\bar{\Phi }}_s^t\) hold:

(1)

    The maps \((x,t) \mapsto \Phi _s^t(x)\) and \((x,t) \mapsto {\bar{\Phi }}_s^t(x)\) are continuous on \(D_{\ell }\) and \(D_{{\bar{\ell }}}\), respectively.

(2)

    For each \(x \in X\), \({\tilde{\Phi }} = \Phi ,{\bar{\Phi }}\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), respectively, \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\Phi }}_s^t(x)\) is differentiable at t iff \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}^2_t(x)\) is differentiable at t or if \(t=s \in D_{{\tilde{\ell }}}(x)\), so in particular \(t \mapsto {\tilde{\Phi }}_s^t(x)\) is a.e. differentiable. At points t of differentiability:

    $$\begin{aligned} \partial _t {\tilde{\Phi }}_s^t(x) = {\tilde{\ell }}_t^2(x) + (t-s) \frac{\partial _t {\tilde{\ell }}^2_t(x)}{2} . \end{aligned}$$
    (4.2)

    In particular, if \(s \in D_{{\tilde{\ell }}}(x)\) then \(\exists \partial _t|_{t=s} {\tilde{\Phi }}_s^t(x) = {\tilde{\ell }}_s^2(x)\).

(3)

    For each \(x \in X\), the map \(\mathring{G}_\varphi (x) \ni t \mapsto \Phi _s^t(x) = {\bar{\Phi }}_s^t(x)\) is locally Lipschitz and non-decreasing (if \(\# \mathring{G}_\varphi (x) \ge 2\), it is strictly increasing iff \(x \in X^+\)).

(4)

    For all \(t \in (0,1)\):

    $$\begin{aligned}&\quad {\left\{ \begin{array}{ll} {\underline{\partial }}_t \Phi _s^t(x) \ge \frac{s}{t} \ell _t^2(x) &{} \quad \,\,t \ge s \\ {\overline{\partial }}_t \Phi _s^t(x) \le \frac{s}{t} \ell _t^2(x) &{} \quad \,\,t \le s \end{array}\right. } \;\; \forall x \in D_{\ell }(t) ~;~ \\&\quad {\left\{ \begin{array}{ll} {\overline{\partial }}_t {\bar{\Phi }}_s^t(x) \le \frac{1-s}{1-t} {\bar{\ell }}_t^2(x) &{} t \ge s \\ {\underline{\partial }}_t {\bar{\Phi }}_s^t(x) \ge \frac{1-s}{1-t} {\bar{\ell }}_t^2(x) &{} t \le s \end{array}\right. } \;\; \forall x \in D_{{\bar{\ell }}}(t). \end{aligned}$$
(5)

    For all \((x,t) \in D(\mathring{G}_\varphi )\):

    $$\begin{aligned}&\min \left( \frac{s}{t},\frac{1-s}{1-t} + \frac{t-s}{t(1-t)}\right) \ell _t^2(x) \le {\underline{\partial }}_t \Phi _s^t(x) \\&\le {\overline{\partial }}_t \Phi _s^t(x) \le \max \left( \frac{s}{t},\frac{1-s}{1-t} + \frac{t-s}{t(1-t)}\right) \ell _t^2(x), \\&\min \left( \frac{1-s}{1-t} , \frac{s}{t} - \frac{t-s}{t(1-t)}\right) \ell _t^2(x) \le {\underline{\partial }}_t {\bar{\Phi }}_s^t(x) \\&\le {\overline{\partial }}_t {\bar{\Phi }}_s^t(x) \le \max \left( \frac{1-s}{1-t} , \frac{s}{t} - \frac{t-s}{t(1-t)}\right) \ell _t^2(x). \end{aligned}$$

Proof

Recall that

$$\begin{aligned} {\tilde{\Phi }}_s^t(x) := {\tilde{\varphi }}_t(x) + (t-s) \frac{{\tilde{\ell }}^2_t(x)}{2}\quad \text { on }D_{{\tilde{\ell }}}. \end{aligned}$$

The first and second statements follow by Lemma 3.2 and Corollary 3.10. As \(t \mapsto {\tilde{\varphi }}_t(x)\) is differentiable on \(D_{{\tilde{\ell }}}(x)\) with derivative \(\frac{{\tilde{\ell }}_t^2(x)}{2}\), the points of differentiability of \(t \mapsto {\tilde{\Phi }}_s^t(x)\) must coincide with those of \(t \mapsto {\tilde{\ell }}_t^2(x)\), and (4.2) follows immediately, with the only possible exception being the point \(t=s\) if \(s \in D_{{\tilde{\ell }}}(x)\), where direct inspection and continuity of \(t \mapsto {\tilde{\ell }}_t^2(x)\) on \(D_{{\tilde{\ell }}}(x)\) verify (4.2). The local Lipschitzness follows by Theorem 3.11 (2). The monotonicity follows by Lemma 4.2, since if \(\gamma ^t \in G_\varphi \) is such that \(\gamma ^t_t = x\), then \(\Phi _s^t(\gamma ^t_t) = {\bar{\Phi }}_s^t(\gamma ^t_t) = \varphi _s(\gamma ^t_s)\). The last two assertions follow as in the proof of Lemma 4.2, after noting that:

$$\begin{aligned} {\left\{ \begin{array}{ll} {\underline{\partial }}_t {\tilde{\Phi }}_s^t(x) ={\tilde{\ell }}_t^2(x) + (t-s) {\underline{\partial }}_t \frac{{\tilde{\ell }}_t^2(x)}{2} &{} t \ge s \\ {\underline{\partial }}_t {\tilde{\Phi }}_s^t(x) = {\tilde{\ell }}_t^2(x) + (t-s) {\overline{\partial }}_t \frac{{\tilde{\ell }}_t^2(x)}{2} &{} t \le s \end{array}\right. } \;\;\;\; \forall x \in D_{{\tilde{\ell }}}(t) , \end{aligned}$$

and similarly for \({\overline{\partial }}_t\). Indeed, the estimates (3.6) and (3.7) of Corollary 3.10 yield (4), which already yields half of the inequalities in (5) for all \((x,t) \in D_\ell \cap D_{\bar{\ell }}\). To get the other half, we must restrict to \(D(\mathring{G}_\varphi )\) and use the estimates of Theorem 3.11 (4), thereby concluding the proof. \(\square \)

As an immediate corollary of Proposition 4.4, Corollary 3.13 and Lemma 3.15, we obtain:

Corollary 4.5

For all \(x \in X\), for a.e. \(t \in \mathring{G}_\varphi (x)\), \(\partial _t \Phi ^t_s(x)\) and \(\partial _t {\bar{\Phi }}^t_s(x)\) exist, coincide, and satisfy:

$$\begin{aligned} \min \left( \frac{s}{t},\frac{1-s}{1-t}\right) \ell ^2_t(x)&\le \partial _t \Phi ^t_s(x) = \partial _t \Phi ^t_s(x)|_{\mathring{G}_\varphi (x)} \\&= \partial _t {\bar{\Phi }}^t_s(x)|_{\mathring{G}_\varphi (x)} = \partial _t {\bar{\Phi }}^t_s(x) \le \max \left( \frac{s}{t},\frac{1-s}{1-t}\right) \ell _t^2(x) . \end{aligned}$$

In particular, if \(x \in X^+\) then \(\partial _t \Phi ^t_s(x) > 0\) for a.e. \(t \in \mathring{G}_\varphi (x)\).

We will also require the following consequence of Proposition 4.4 and Theorem 3.11:

Lemma 4.6

For any \(x \in X\), \(s \in (0,1)\), and \({\tilde{\Phi }} = \Phi ,{\bar{\Phi }}\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), respectively:

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{1}{2 \varepsilon } \int _{(s-\varepsilon ,s+\varepsilon ) \cap \mathring{G}_\varphi (x)} \left( \partial _t {\tilde{\Phi }}^t_s(x) - {\tilde{\ell }}_s^2(x)\right) dt = 0 . \end{aligned}$$

Proof

By (4.2), the claim boils down to proving

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon } \int _{(s-\varepsilon ,s+\varepsilon ) \cap \mathring{G}_\varphi (x)} (t-s) \partial _t {\tilde{\ell }}_t^2(x) dt = 0 . \end{aligned}$$

Using Corollary 3.13, it follows that:

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon } \left| \int _{(s-\varepsilon ,s+\varepsilon ) \cap \mathring{G}_\varphi (x)} (t-s) \partial _t {\tilde{\ell }}_t^2(x) dt \right| ~ \le \lim _{\varepsilon \rightarrow 0} \int _{(s-\varepsilon ,s+\varepsilon ) \cap \mathring{G}_\varphi (x)} \left| \partial _t \frac{{\tilde{\ell }}_t^2(x)}{2}\right| dt \\&\quad ~ \le \frac{1}{\min (s,1-s)} \lim _{\varepsilon \rightarrow 0} \int _{(s-\varepsilon ,s+\varepsilon ) \cap \mathring{G}_\varphi (x)} {\tilde{\ell }}_t^2(x) dt . \end{aligned}$$

But the latter limit is clearly 0 (e.g. by Corollary 3.10 (1)). \(\square \)

5 Temporal theory of intermediate-time Kantorovich potentials: third order

Fix a non-null Kantorovich geodesic \(\gamma \in G_\varphi ^+\), and denote for short \(\ell := \ell (\gamma ) > 0\). Recall by the results of Sect. 3 that for all \(t \in (0,1)\), \(\ell _t(\gamma _t) = {\bar{\ell }}_t(\gamma _t) = \ell \) and that \(\partial _t \varphi _t(x) = \partial _t {\bar{\varphi }}_t(x) =\ell _t^2(x)/2\) for all \(x \in \mathrm{e}_t(G_\varphi )\). Also, recall that given \(x \in X\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), the function \(D_{{\tilde{\ell }}}(x) \ni t \mapsto {\tilde{\ell }}_t(x)\) is only a.e. differentiable, and even on \(\mathring{G}_{\varphi }(x) \subset D_{\ell }(x) \cap D_{{\bar{\ell }}}(x)\), we presently only have upper and lower bounds on \({\underline{\partial }}_t {\tilde{\ell }}^2_t(x)/2\) and \({\overline{\partial }}_t {\tilde{\ell }}^2_t(x)/2\), i.e. second order information on \({\tilde{\varphi }}_t(x)\).

The goal of this section is to make rigorous sense of, and prove, the following formal statement, which provides second order information on \(\ell _t\), or equivalently, third order information on \(\varphi _t\), along \(\gamma _t\):

$$\begin{aligned} z(t) := \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t) \;\;\; \Rightarrow \;\;\; z'(t) \ge \frac{z(t)^2}{\ell ^2} . \end{aligned}$$
(5.1)

Equivalently, this amounts to the statement that the function:

$$\begin{aligned} L(r) = \exp \left( - \frac{1}{\ell ^2} \int ^r_{r_0} \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t) dt\right) \end{aligned}$$

is concave in \(r \in (0,1)\), since formally:

$$\begin{aligned} \frac{L''}{L} = (\log L) '' + ((\log L)')^2 = -\frac{z'}{\ell ^2} + \frac{z^2}{\ell ^4} \le 0 . \end{aligned}$$
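As a consistency check (not taken from the source), the extremal case of (5.1), where the inequality is saturated, can be solved in closed form, and the corresponding \(L\) is indeed affine, hence concave:

```latex
% Extremal case: z'(t) = z(t)^2/\ell^2 is solved by z(t) = \ell^2/(c-t)
% for any constant c > 1; indeed z'(t) = \ell^2/(c-t)^2 = z(t)^2/\ell^2.
% The corresponding function L is then
L(r) = \exp\Big( -\frac{1}{\ell^2} \int_{r_0}^{r} \frac{\ell^2}{c-t}\, dt \Big)
     = \exp\Big( \log \frac{c-r}{c-r_0} \Big)
     = \frac{c-r}{c-r_0} ,
% which is affine in r on (0,1), hence concave, with equality in L'' \le 0.
```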

5.1 Formal argument

We start by providing a formal proof of (5.1) in an infinitesimally Hilbertian setting, which is rigorously justified on a Riemannian manifold if all involved functions are smooth (in time and space).

Recall that the Hopf-Lax semi-group solves the Hamilton-Jacobi equation (e.g. [6]):

$$\begin{aligned} \partial _t\varphi _t = \frac{1}{2} \ell _t^2 = \frac{1}{2} \left| \nabla \varphi _t\right| ^2 . \end{aligned}$$
(5.2)
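In the Euclidean model case \(X = {\mathbb {R}}\), identity (5.2) can be sanity-checked numerically. The sketch below (not from the paper) uses the supremal-convolution convention \(\varphi _t(x) = \sup _y \left[ \varphi (y) - |x-y|^2/(2t)\right] \), consistent with the first-order identities used in the proof of Theorem 5.2; the test potential \(\varphi (y) = \cos y\), the grid, and the base point are arbitrary choices made for this illustration:

```python
import math

# Numerical sanity check of the Hamilton-Jacobi identity (5.2) on X = R:
# for phi_t(x) = sup_y [ phi(y) - |x - y|^2 / (2 t) ], one should have
# d/dt phi_t(x) = (1/2) |d/dx phi_t(x)|^2 at smooth points.
# The potential phi(y) = cos(y), grid and base point (x0, t0) are
# arbitrary choices for this illustration.

YS = [-8.0 + 16.0 * k / 40000 for k in range(40001)]  # grid for the sup

def phi_t(x, t):
    """Discretized sup_y [ phi(y) - (x - y)^2 / (2 t) ]."""
    return max(math.cos(y) - (x - y) ** 2 / (2.0 * t) for y in YS)

x0, t0, h = 0.3, 0.5, 1e-3
dphi_dt = (phi_t(x0, t0 + h) - phi_t(x0, t0 - h)) / (2.0 * h)  # time derivative
dphi_dx = (phi_t(x0 + h, t0) - phi_t(x0 - h, t0)) / (2.0 * h)  # spatial slope

print(dphi_dt, 0.5 * dphi_dx ** 2)  # the two sides of (5.2), approximately equal
```

Note that with this convention the time derivative is nonnegative, matching \(\partial _t \varphi _t = \ell _t^2/2 \ge 0\).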

We evaluate all subsequent functions at \(x = \gamma _t\). Since:

$$\begin{aligned} z(t) = \partial _t^2 \varphi _t(\gamma (t)) = \left\langle \nabla \partial _t \varphi _t , \nabla \varphi _t \right\rangle , \end{aligned}$$

and since \(\gamma '(t) = -\nabla \varphi _t\) (see e.g. [6] or Lemma 10.3),

$$\begin{aligned} z'(t) = \partial _t^3 \varphi _t - \left\langle \nabla \partial _t^2 \varphi _t , \nabla \varphi _t \right\rangle . \end{aligned}$$

But taking two time derivatives in (5.2), we know that:

$$\begin{aligned} \partial _t^3 \varphi _t = \left\langle \nabla \partial _t^2 \varphi _t , \nabla \varphi _t \right\rangle + \left\langle \nabla \partial _t \varphi _t , \nabla \partial _t \varphi _t \right\rangle , \end{aligned}$$

and so we conclude that:

$$\begin{aligned} z'(t) = \left| \nabla \partial _t \varphi _t\right| ^2 . \end{aligned}$$

It remains to apply Cauchy–Schwarz and deduce:

$$\begin{aligned} z'(t) \ge \left\langle \nabla \partial _t \varphi _t , \frac{\nabla \varphi _t}{\left| \nabla \varphi _t\right| } \right\rangle ^2 = \frac{z(t)^2}{\ell ^2} , \end{aligned}$$

as asserted. Note that in a general setting, we can try to interpret z(t) as minus the directional derivative of \(\ell _t^2/2 = \partial _t \varphi _t\) in the direction of \(\gamma '(t)\) (by taking the derivative of the identity \(\frac{\ell ^2_t}{2}(\gamma (t)) \equiv \frac{\ell ^2}{2}\)), and thus hope to justify the Cauchy–Schwarz inequality as the statement that the local Lipschitz constant of \(\partial _t \varphi _t\) is at least as large as any unit-directional derivative. However, a crucial point in the above argument of identifying \(z'(t)\) with \(\left| \nabla \partial _t \varphi _t\right| ^2\) was to use the linearity of \(\left\langle \cdot ,\cdot \right\rangle \) in both of its arguments, and so ultimately this formal proof is genuinely restricted to an infinitesimally Hilbertian setting.

The above discussion seems to suggest that there is no hope of proving (5.1) beyond the Hilbertian setting. Furthermore, it seems that the spatial regularity of \(\varphi _t\) and \(\partial _t \varphi _t = \frac{1}{2}\ell ^2_t\) should play an essential role in any rigorous justification. Remarkably, we will see that this is not the case on both counts, and that an appropriate interpretation of (5.1) holds true on a general proper geodesic space \((X,\mathsf {d})\).

5.2 Notation

Recall that by the results of Sect. 3, \(\tau \mapsto \varphi _\tau (x)\) and \(\tau \mapsto {\bar{\varphi }}_\tau (x)\) are locally semi-convex and semi-concave on (0, 1), respectively, and that \(\partial _t^{\pm } \varphi _t(x) = \ell ^{\pm }_t(x)^2/2\), \(\partial _t^{\pm } {\bar{\varphi }}_t(x) = {\bar{\ell }}^{\mp }_t(x)^2/2\) and \(\ell ^{\pm }_t(\gamma _t) = {\bar{\ell }}^{\pm }_t(\gamma _t) = \ell \) for all \(t \in (0,1)\). We respectively introduce \({\tilde{p}} = p,{\bar{p}}\) by defining at \(t \in (0,1)\):

$$\begin{aligned} {\tilde{p}}_{+}^\gamma (t) = {\tilde{p}}_{+}(t)&:= {\overline{\partial }}_\tau |_{\tau =t} {\tilde{\ell }}^2_{\tau }(\gamma _t)/2 = \ell \cdot {\overline{\partial }}_\tau |_{\tau =t} {\tilde{\ell }}_{\tau }(\gamma _t) = \ell \cdot {\overline{\partial }}_\tau |_{\tau =t} {\tilde{\ell }}^{\pm }_{\tau }(\gamma _t) ~,~ \\ {\tilde{p}}_{-}^\gamma (t) = {\tilde{p}}_{-}(t)&:= {\underline{\partial }}_\tau |_{\tau =t} {\tilde{\ell }}^2_{\tau }(\gamma _t)/2 = \ell \cdot {\underline{\partial }}_\tau |_{\tau =t} {\tilde{\ell }}_{\tau }(\gamma _t) = \ell \cdot {\underline{\partial }}_\tau |_{\tau =t} {\tilde{\ell }}^{\pm }_{\tau }(\gamma _t) ~,~ \end{aligned}$$

where the penultimate equalities in each of the lines above follow from the continuity of \(D_{{\tilde{\ell }}}(\gamma _t) \ni \tau \mapsto {\tilde{\ell }}_\tau (\gamma _t)\) at \(\tau = t \in G_\varphi (\gamma _t) \subset D_{{\tilde{\ell }}}(\gamma _t)\), and the last ones by the monotonicity of \(\tau \mapsto \tau \ell ^{\pm }_\tau (\gamma _t)\) and \(\tau \mapsto (1-\tau ) {\bar{\ell }}^{\pm }_\tau (\gamma _t)\) and the density of \(D_{{\tilde{\ell }}}\) in (0, 1). Clearly \({\tilde{p}}_{-}(t) \le {\tilde{p}}_{+}(t)\), and \({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = {\tilde{p}} \in {\mathbb {R}}\) iff \(D_{{\tilde{\ell }}}(\gamma _t) \ni \tau \mapsto {\tilde{\ell }}_\tau ^2/2(\gamma _t)\) is differentiable at \(\tau = t\) with derivative \({\tilde{p}}\). In addition, for \({\tilde{q}} = q,{\bar{q}}\), set:

$$\begin{aligned} {\tilde{q}}_+(t) := {\overline{{\mathcal {P}}}}_2 {\tilde{\varphi }}_t(x)|_{x=\gamma _t} \ge {\underline{{\mathcal {P}}}}_2 {\tilde{\varphi }}_t(x)|_{x=\gamma _t} =: {\tilde{q}}_-(t) , \end{aligned}$$

where the Peano (partial) derivatives are with respect to the t variable. It will be useful to recall that if we define \({\tilde{h}} = h , {\bar{h}}\) by:

$$\begin{aligned} {\tilde{h}}(t,\varepsilon )&:= 2\left( {\tilde{\varphi }}_{t+\varepsilon }(\gamma _t) - {\tilde{\varphi }}_t(\gamma _t) - \varepsilon \partial _t {\tilde{\varphi }}_t(\gamma _t)\right) \\&\; = 2\left( {\tilde{\varphi }}_{t+\varepsilon }(\gamma _t) - \varphi _t(\gamma _t) - \varepsilon \ell ^2/2\right) , \end{aligned}$$

then:

$$\begin{aligned} {\tilde{q}}_+(t) = \limsup _{\varepsilon \rightarrow 0} \frac{{\tilde{h}}(t,\varepsilon )}{\varepsilon ^2} \ge \liminf _{\varepsilon \rightarrow 0} \frac{{\tilde{h}}(t,\varepsilon )}{\varepsilon ^2} = {\tilde{q}}_-(t) . \end{aligned}$$

By definition, \({\tilde{q}}_-(t) = {\tilde{q}}_+(t) = {\tilde{q}} \in {\mathbb {R}}\) if and only if \(\tau \mapsto {\tilde{\varphi }}_{\tau }(\gamma _t)\) has second order Peano derivative at \(\tau = t\) equal to \({\tilde{q}}\), and hence by Lemma 2.3, iff \({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = {\tilde{q}}\), or equivalently, iff any of the other equivalent conditions for the second order differentiability of \((0,1) \ni \tau \mapsto {\tilde{\varphi }}_{\tau }(\gamma _t)\) at \(\tau =t\) are satisfied. Moreover, Lemma 2.4 implies:

$$\begin{aligned} {\tilde{p}}_-(t) \le {\tilde{q}}_-(t) \le {\tilde{q}}_+(t) \le {\tilde{p}}_+(t) \;\;\; \forall t \in (0,1) , \end{aligned}$$

but we will not require this here. We summarize the above discussion in:

Corollary 5.1

The following statements are equivalent for a given \(t \in (0,1)\):

(1)

    \({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = {\tilde{p}} \in {\mathbb {R}}\), i.e. \(D_{{\tilde{\ell }}}(\gamma _t) \ni \tau \mapsto \frac{{\tilde{\ell }}^2_\tau }{2}(\gamma _t)\) is differentiable at \(\tau = t\) with derivative \({\tilde{p}}\).

(2)

    \({\tilde{q}}_-(t) = {\tilde{q}}_+(t) = {\tilde{q}} \in {\mathbb {R}}\), i.e. \((0,1) \ni \tau \mapsto {\tilde{\varphi }}_\tau (\gamma _t)\) has a second Peano derivative at \(\tau = t\) equal to \({\tilde{q}}\).

In any of these cases \((0,1) \ni \tau \mapsto {\tilde{\varphi }}_\tau (\gamma _t)\) is twice differentiable at \(\tau = t\), and we have:

$$\begin{aligned} \partial _\tau ^2|_{\tau = t} {\tilde{\varphi }}_\tau (\gamma _t) := \partial _\tau |_{\tau =t} \frac{{\tilde{\ell }}^2_\tau }{2}(\gamma _t) = \ell \cdot \partial _\tau |_{\tau =t} {\tilde{\ell }}_\tau (\gamma _t) = {\tilde{p}} = {\tilde{q}} . \end{aligned}$$
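In general, without the semi-convexity/semi-concavity supplied by the results of Sect. 3 (via Lemma 2.3), the existence of a second Peano derivative is strictly weaker than twice differentiability. A standard one-dimensional example (not from the paper) illustrating this gap, together with a numerical check:

```python
import math

# Classical example: f(x) = x^3 sin(1/x), f(0) = 0, has second Peano
# derivative 0 at x = 0 (since f(x) = o(x^2)), yet f''(0) does not exist:
# the difference quotients of f' at 0 oscillate like -cos(1/x).

def f(x):
    return x ** 3 * math.sin(1.0 / x) if x != 0.0 else 0.0

def fprime(x):
    # f'(x) = 3 x^2 sin(1/x) - x cos(1/x) for x != 0, and f'(0) = 0
    return 3 * x ** 2 * math.sin(1 / x) - x * math.cos(1 / x) if x != 0.0 else 0.0

# Peano quotients 2 (f(eps) - f(0) - eps f'(0)) / eps^2 tend to 0:
peano = [2.0 * f(10.0 ** -k) / 10.0 ** (-2 * k) for k in range(3, 8)]

# Difference quotients (f'(x) - f'(0)) / x along two sequences x -> 0
# with different limits (-1 and +1), so f''(0) does not exist:
q1 = [fprime(1 / (2 * math.pi * n)) * (2 * math.pi * n) for n in (10, 100, 1000)]
q2 = [fprime(1 / (math.pi * (2 * n + 1))) * (math.pi * (2 * n + 1)) for n in (10, 100, 1000)]

print(peano, q1, q2)
```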

5.3 Main inequality

The following inequality and its consequences are the main results of this section.

Theorem 5.2

For all \(s < t\) and \(\varepsilon \) so that \(s,t,s+\varepsilon ,t+\varepsilon \in (0,1)\), we have (for both possibilities for ±):

$$\begin{aligned} \frac{h(t,\varepsilon ) - h(s,\varepsilon )}{t-s} \ge \frac{s+\varepsilon }{t+\varepsilon } (\ell ^{\pm }_{s+\varepsilon }(\gamma _s) - \ell _s(\gamma _s))^2 , \end{aligned}$$

and

$$\begin{aligned} \frac{{\bar{h}}(t,\varepsilon ) - {\bar{h}}(s,\varepsilon )}{t-s} \ge \frac{1-t-\varepsilon }{1-s-\varepsilon } ({\bar{\ell }}^{\pm }_{t+\varepsilon }(\gamma _t) - {\bar{\ell }}_t(\gamma _t))^2 . \end{aligned}$$

Proof

By Lemma 3.5, there exists \(y^{\pm }_\varepsilon \in X\) so that

$$\begin{aligned} -\varphi _{s+\varepsilon }(\gamma _{s}) = \frac{\mathsf {d}^{2}(y^{\pm }_{\varepsilon },\gamma _{s})}{2(s+\varepsilon )} - \varphi (y^{\pm }_{\varepsilon }) , \end{aligned}$$

with \(\mathsf {d}(y^{\pm }_\varepsilon ,\gamma _s) = D^{\pm }_{-\varphi }(\gamma _s,s+\varepsilon ) = (s+\varepsilon ) \ell ^{\pm }_{s+\varepsilon }(\gamma _s) =: D^{\pm }_{s+\varepsilon }\). By definition, note that:

$$\begin{aligned} -\varphi _{t+\varepsilon }(\gamma _{t}) \le \frac{\mathsf {d}^{2}(y^{\pm }_{\varepsilon },\gamma _{t})}{2(t+\varepsilon )} - \varphi (y^{\pm }_{\varepsilon }) . \end{aligned}$$

We abbreviate \(D_{r} := r \ell = \mathsf {d}(\gamma _r,\gamma _0)\), \(r=s,t\). The proof consists of subtracting the above two expressions and applying the triangle inequality:

$$\begin{aligned} \mathsf {d}(y^{\pm }_\varepsilon ,\gamma _t) \le \mathsf {d}(y^{\pm }_\varepsilon ,\gamma _s) + \mathsf {d}(\gamma _s,\gamma _t) = D^{\pm }_{s+\varepsilon } + (D_t - D_s) = D_t + (D^{\pm }_{s+\varepsilon } - D_{s}) . \end{aligned}$$

Indeed, we obtain after subtraction, recalling the definition of h, and an application of Lemma 3.3:

$$\begin{aligned} 0&\le ~ \varphi _{t+\varepsilon }(\gamma _{t}) -\varphi _{s+\varepsilon }(\gamma _{s}) + \frac{\mathsf {d}^{2}(y^{\pm }_{\varepsilon },\gamma _{t})}{2(t+\varepsilon )} - \frac{\mathsf {d}^{2}(y^{\pm }_{\varepsilon },\gamma _{s})}{2(s+\varepsilon )}\\&= ~ \frac{1}{2}\left( h(t,\varepsilon ) - h(s,\varepsilon )\right) +\varphi _{t}(\gamma _t) - \varphi _{s}(\gamma _s) +\frac{\mathsf {d}^{2}(y^{\pm }_{\varepsilon }, \gamma _{t})}{2(t+\varepsilon )} - \frac{(D^{\pm }_{s+\varepsilon })^2}{2(s+\varepsilon )} \\&\le ~ \frac{1}{2}\left( h(t,\varepsilon ) - h(s,\varepsilon )\right) -\frac{\ell ^{2}}{2}(t-s) - \frac{(D^{\pm }_{s+\varepsilon })^2}{2(s+\varepsilon )} \\&\quad +\frac{ (D^{\pm }_{s+\varepsilon } -D_{s})^{2} + D_{t}^{2} + 2(D^{\pm }_{s+\varepsilon } - D_{s})D_{t} }{2(t+\varepsilon )}. \end{aligned}$$

Carefully rearranging terms, we obtain:

$$\begin{aligned}&\frac{1}{2} \left( h(t,\varepsilon ) - h(s,\varepsilon )\right) \\&\quad \ge \frac{(D^{\pm }_{s+\varepsilon })^2}{2(s+\varepsilon )} - \frac{D^{2}_{s}}{2s} + \frac{D_{t}^{2}}{2}\left( \frac{1}{t} - \frac{1}{t+\varepsilon } \right) -\frac{ (D^{\pm }_{s+\varepsilon } -D_{s})^{2} + 2(D^{\pm }_{s+\varepsilon } - D_{s})D_{t} }{2(t+\varepsilon )} \\&\quad = \frac{1}{2(s+\varepsilon )} ((D^{\pm }_{s+\varepsilon })^{2} - D_{s}^{2}) + \frac{D_{s}^{2}}{2} \left( \frac{1}{s+\varepsilon } - \frac{1}{s} \right) + \frac{D_{t}^{2}}{2}\left( \frac{1}{t} - \frac{1}{t+\varepsilon } \right) \\&\qquad -\frac{ (D^{\pm }_{s+\varepsilon } -D_{s})^{2} + 2(D^{\pm }_{s+\varepsilon } - D_{s})D_{t} }{2(t+\varepsilon )} \\&\quad = (D^{\pm }_{s+\varepsilon } - D_{s}) \left( \frac{D^{\pm }_{s+\varepsilon } + D_{s}}{2(s+\varepsilon )} - \frac{D_{t}}{t+\varepsilon } \right) + \frac{\ell ^{2}}{2} \left( \frac{\varepsilon t}{t+\varepsilon } - \frac{\varepsilon s}{s+\varepsilon } \right) - \frac{(D^{\pm }_{s+\varepsilon } - D_{s})^{2}}{2(t+\varepsilon )} \\&\quad = (D^{\pm }_{s+\varepsilon } - D_{s}) \left( \frac{D^{\pm }_{s+\varepsilon } - D_{s} + 2D_{s} - 2(s+\varepsilon )\ell }{2(s+\varepsilon )} + D_{t}\left( \frac{1}{t}- \frac{1}{t+\varepsilon }\right) \right) \\&\qquad + \varepsilon ^{2}\frac{\ell ^{2}}{2} \left( \frac{1}{s+\varepsilon } - \frac{1}{t+\varepsilon } \right) - \frac{(D^{\pm }_{s+\varepsilon } - D_{s})^{2}}{2(t+\varepsilon )} \\&\quad = \frac{(D^{\pm }_{s+\varepsilon } - D_{s})^{2}}{2}\left( \frac{1}{s+\varepsilon } - \frac{1}{t+\varepsilon } \right) -\varepsilon \frac{D^{\pm }_{s+\varepsilon } - D_{s}}{s+\varepsilon } \ell \\&\qquad + (D^{\pm }_{s+\varepsilon } - D_{s})D_{t}\left( \frac{1}{t} - \frac{1}{t+\varepsilon } \right) + \varepsilon ^{2}\frac{\ell ^{2}}{2} \left( \frac{1}{s+\varepsilon } - \frac{1}{t+\varepsilon } \right) \\&\quad = \left( \frac{1}{s+\varepsilon } - \frac{1}{t+\varepsilon } \right) \left( \frac{(D^{\pm }_{s+\varepsilon } - D_{s})^{2}}{2} 
-\varepsilon \ell (D^{\pm }_{s+\varepsilon } - D_{s}) + \varepsilon ^{2} \frac{\ell ^{2}}{2} \right) \\&\quad = \frac{1}{2}\left( \frac{1}{s+\varepsilon } - \frac{1}{t+\varepsilon }\right) \left( D^{\pm }_{s+\varepsilon } - D_{s} -\varepsilon \ell \right) ^{2} \\&\quad = \frac{1}{2}\left( \frac{1}{s+\varepsilon } - \frac{1}{t+\varepsilon } \right) (s+\varepsilon )^{2} ( \ell ^{\pm }_{s+\varepsilon }(\gamma _s) - \ell )^{2} \\&\quad = \frac{t-s}{2} \frac{s+\varepsilon }{t+\varepsilon } ( \ell ^{\pm }_{s+\varepsilon }(\gamma _s) - \ell )^{2} , \end{aligned}$$

and the first claim follows.

The second claim follows by the duality between \(\varphi \) and \(\varphi ^c\). Indeed, exchange \(\varphi ,\gamma ,\varepsilon ,s,t\) with \(\varphi ^c,\gamma ^c,-\varepsilon ,1-t,1-s\), and recall that \({\bar{\varphi }}_t = -\varphi ^c_{1-t}\). A straightforward inspection of the definitions verifies:

$$\begin{aligned} h^{\varphi ^c}(1-r,-\varepsilon ) = -{\bar{h}}^{\varphi }(r,\varepsilon ) , \end{aligned}$$

and

$$\begin{aligned} \frac{(\ell _{1-t-\varepsilon }^{\varphi ^c,\pm }(\gamma ^c_{1-t}))^2}{2} = - \partial _t^{\mp } \varphi ^c_{1-t-\varepsilon }(\gamma ^c_{1-t}) = \partial _t^{\mp } {\bar{\varphi }}_{t+\varepsilon }(\gamma _t) = \frac{({\bar{\ell }}^{\varphi ,\pm }_{t+\varepsilon }(\gamma _t))^2}{2} , \end{aligned}$$

and so the second claim follows from the first one. Alternatively, one may repeat the above argument by subtracting the following two expressions:

$$\begin{aligned} {\bar{\varphi }}_{t+\varepsilon }(\gamma _{t}) =&~ \frac{\mathsf {d}^{2}(z^{\pm }_{\varepsilon },\gamma _{t})}{2(1-t-\varepsilon )} - \varphi ^c(z^{\pm }_{\varepsilon }) , \\ {\bar{\varphi }}_{s+\varepsilon }(\gamma _{s}) \le&~ \frac{\mathsf {d}^{2}(z^{\pm }_{\varepsilon },\gamma _{s})}{2(1-s-\varepsilon )} - \varphi ^c(z^{\pm }_{\varepsilon }) , \end{aligned}$$

with \(\mathsf {d}(z^{\pm }_\varepsilon ,\gamma _t) = D^{\pm }_{-\varphi ^c}(\gamma _t,1-t-\varepsilon ) = (1-t-\varepsilon ) \ell ^{\pm }_{t+\varepsilon }(\gamma _t)\) and applying the triangle inequality \(\mathsf {d}(z^{\pm }_\varepsilon ,\gamma _s) \le \mathsf {d}(z^{\pm }_\varepsilon ,\gamma _t) + \mathsf {d}(\gamma _t,\gamma _s)\). \(\square \)
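The long chain of equalities in the proof above reduces to a polynomial identity in \(\ell , s, t, \varepsilon \) and \(D^{\pm }_{s+\varepsilon }\). A numerical spot-check of that identity (a sketch, not from the paper; `Dp` stands for the free quantity \(D^{\pm }_{s+\varepsilon }\), while `Ds`, `Dt` are \(s\ell \) and \(t\ell \)):

```python
import random

# Spot-check of the algebraic identity behind the proof of Theorem 5.2:
# with Ds = s*ell, Dt = t*ell and Dp a free parameter (playing the role
# of D^{pm}_{s+eps}), the second expression in the chain equals
# (1/2) (1/(s+eps) - 1/(t+eps)) (Dp - Ds - eps*ell)^2 .

def sides(ell, s, t, eps, Dp):
    Ds, Dt = s * ell, t * ell
    lhs = (Dp ** 2 / (2 * (s + eps)) - Ds ** 2 / (2 * s)
           + Dt ** 2 / 2 * (1 / t - 1 / (t + eps))
           - ((Dp - Ds) ** 2 + 2 * (Dp - Ds) * Dt) / (2 * (t + eps)))
    rhs = 0.5 * (1 / (s + eps) - 1 / (t + eps)) * (Dp - Ds - eps * ell) ** 2
    return lhs, rhs

random.seed(0)
diffs = []
for _ in range(100):
    s = random.uniform(0.05, 0.4)
    t = random.uniform(s + 0.05, 0.9)
    eps = random.uniform(0.01, 0.09)   # keeps s+eps, t+eps inside (0,1)
    ell = random.uniform(0.5, 2.0)
    Dp = random.uniform(0.0, 1.0)
    lhs, rhs = sides(ell, s, t, eps, Dp)
    diffs.append(abs(lhs - rhs))

print(max(diffs))  # identically zero up to floating-point round-off
```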

5.4 Consequences

As immediate corollaries of Theorem 5.2, we obtain after dividing both sides by \(\varepsilon ^2\) and taking appropriate subsequential limits as \(\varepsilon \rightarrow 0\):

Corollary 5.3

For both \({\tilde{q}} = q,{\bar{q}}\), the functions \(t \mapsto {\tilde{q}}_-(t)\) and \(t \mapsto {\tilde{q}}_+(t)\) are monotone non-decreasing on (0, 1).

Corollary 5.4

For all \(0< s< t < 1\) (and both possibilities for ±):

$$\begin{aligned} \frac{q_+(t) - q_-(s)}{t-s} \ge \frac{s}{t} \left( \frac{p_{\pm }(s)}{\ell }\right) ^2 , \end{aligned}$$
(5.3)

and

$$\begin{aligned} \frac{{\bar{q}}_+(t) -{\bar{q}}_-(s)}{t-s} \ge \frac{1-t}{1-s} \left( \frac{{\bar{p}}_{\pm }(t)}{\ell }\right) ^2 . \end{aligned}$$
(5.4)

It will be convenient to use the above information in the following form:

Theorem 5.5

Assume that for a.e. \(t \in (0,1)\):

$$\begin{aligned} (0,1)\ni \tau \mapsto {\tilde{\varphi }}_\tau (\gamma _t) \text { is twice differentiable at }\tau = t\text { for } \mathbf{both} ~ {\tilde{\varphi }} = \varphi ,{\bar{\varphi }} \end{aligned}$$
(5.5)

in any of the equivalent senses given by Corollary 5.1, and that moreover:

$$\begin{aligned} \partial _\tau ^2|_{\tau = t} \varphi _\tau (\gamma _t) = \partial _\tau ^2|_{\tau = t} {\bar{\varphi }}_\tau (\gamma _t) \;\;\; \text {for a.e. } t \in (0,1) . \end{aligned}$$
(5.6)

Furthermore, assume that the latter joint value coincides a.e. on (0, 1) with some continuous function \(z_c\):

$$\begin{aligned} \partial _\tau ^2|_{\tau = t} \varphi _\tau (\gamma _t) = \partial _\tau ^2|_{\tau = t} {\bar{\varphi }}_\tau (\gamma _t) = z_c(t) \;\;\; \text {for a.e. } t \in (0,1) . \end{aligned}$$
(5.7)

Then (5.5) holds for all \(t \in (0,1)\), and we have:

$$\begin{aligned}&\partial _\tau ^2|_{\tau = t} \varphi _\tau (\gamma _t) = \partial _\tau ^2|_{\tau = t} {\bar{\varphi }}_\tau (\gamma _t) = \partial _\tau |_{\tau =t} \frac{\ell ^2_\tau }{2}(\gamma _t)\nonumber \\&\quad = \partial _\tau |_{\tau =t} \frac{{\bar{\ell }}^2_\tau }{2}(\gamma _t) = z_c(t) \;\;\; \forall t \in (0,1) . \end{aligned}$$
(5.8)

Moreover, we have the following third order information on \(\varphi _t(x)\) at \(x=\gamma _t\):

$$\begin{aligned} \frac{z_c(t) - z_c(s)}{t-s} \ge \sqrt{\frac{s}{t} \frac{1-t}{1-s}} \frac{\left| z_c(s)\right| \left| z_c(t)\right| }{\ell ^2} \;\;\; \forall 0< s< t < 1 . \end{aligned}$$
(5.9)

In particular, for any point \(t \in (0,1)\) where \(z_c(t)\) is differentiable:

$$\begin{aligned} z_c'(t) \ge \frac{z_c(t)^2}{\ell ^2} . \end{aligned}$$

Proof

The assumptions imply by Corollary 5.1 that \({\tilde{q}}_-(t) = {\tilde{q}}_+(t) = z_c(t)\) for a.e. \(t \in (0,1)\). It follows that the same is true for every \(t \in (0,1)\) by monotonicity of \({\tilde{q}}_{\pm }\) and the assumption that \(z_c\) is continuous, yielding (5.8). Furthermore, Corollary 5.1 implies that \({\tilde{p}}_-(t) = {\tilde{p}}_+(t) = z_c(t)\) for both \({\tilde{p}} = p,{\bar{p}}\) and for all \(t \in (0,1)\), and we obtain (5.9) by taking the geometric mean of (5.3) and (5.4). The final assertion follows by taking the limit in (5.9) as \(s \rightarrow t\). \(\square \)
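The geometric-mean step in the proof above can be spelled out as follows: under the theorem's assumptions, (5.3) and (5.4) share a common left-hand side, which dominates two nonnegative quantities and hence also their geometric mean. In the notation of the proof (\(D\), \(A\), \(B\) are introduced here only for bookkeeping):

```latex
% With q = \bar{q} = z_c, inequalities (5.3) and (5.4) read D \ge A, D \ge B, where
D := \frac{z_c(t) - z_c(s)}{t-s}, \quad
A := \frac{s}{t}\Big(\frac{z_c(s)}{\ell}\Big)^2 \ge 0, \quad
B := \frac{1-t}{1-s}\Big(\frac{z_c(t)}{\ell}\Big)^2 \ge 0 .
% Since D \ge A \ge 0 and D \ge B \ge 0 imply D \ge \sqrt{AB}, we obtain
\frac{z_c(t) - z_c(s)}{t-s} \;\ge\; \sqrt{AB}
\;=\; \sqrt{\frac{s}{t}\,\frac{1-t}{1-s}}\;\frac{|z_c(s)|\,|z_c(t)|}{\ell^2} ,
% which is exactly (5.9).
```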

We do not know whether all three assumptions (5.5), (5.6) and (5.7) hold for a.e. \(t \in (0,1)\) for a fixed Kantorovich geodesic \(\gamma \). However, we can guarantee the first two assumptions, at least for almost all Kantorovich geodesics, in the following sense:

Lemma 5.6

Let \(\nu \) denote any \(\sigma \)-finite Borel measure concentrated on \(G_\varphi \), so that for a.e. \(t \in (0,1)\), \(\mu _t := (\mathrm{e}_t)_{\sharp }(\nu ) \ll \mathfrak m\) for some \(\sigma \)-finite Borel measure \(\mathfrak m\) on X. Then for \(\nu \)-a.e. geodesic \(\gamma \), (5.5) and (5.6) hold for a.e. \(t \in (0,1)\).

Proof

Recall that \(D(\mathring{G}_\varphi )\) is closed in \(X \times (0,1)\) by Corollary 3.9. Denote the following Borel subsets:

$$\begin{aligned} P:= & {} \left\{ (x,t) \in D(\mathring{G}_\varphi ) \; ; \; \exists \partial _t \ell ^2_t(x) ~,~ \exists \partial _t {\bar{\ell }}^2_t(x) ~,~ \partial _t \ell ^2_t(x)/2 = \partial _t {\bar{\ell }}^2_t(x)/2 \right\} ,~\\ B:= & {} D(\mathring{G}_\varphi ) \setminus P. \end{aligned}$$

By Corollary 3.13, we know that \({\mathcal {L}}^1(B(x)) = 0\) for all \(x \in X\). By Fubini

$$\begin{aligned} 0 = \int {\mathcal {L}}^1(B(x)) \mathfrak m(dx) = \int _0^1 \mathfrak m(B(t)) {\mathcal {L}}^1(dt) , \end{aligned}$$

and so for a.e. \(t \in (0,1)\), \(\mathfrak m(B(t)) = 0\). Since \(\mu _t \ll \mathfrak m\) for a.e. \(t \in (0,1)\), it follows that for a.e. \(t \in (0,1)\), \(\nu (\mathrm{e}_t^{-1} B(t)) = \mu _t(B(t)) = 0\). In other words, for a.e. \(t \in (0,1)\), the Borel set \(\left\{ \gamma \in G_\varphi \; ; \; \gamma _t \in B(t) \right\} \) has zero \(\nu \)-measure. Applying Fubini again as before

$$\begin{aligned} 0 = \int \nu (\left\{ \gamma \in G_\varphi ; \; \gamma _t \in B(t) \right\} ) {\mathcal {L}}^1(dt)= \int {\mathcal {L}}^1(\left\{ t \in (0,1) ; \gamma _t \in B(t)\right\} ) \nu (d\gamma ) , \end{aligned}$$

we conclude that for \(\nu \)-a.e. \(\gamma \in G_\varphi \), the set \(\left\{ t \in (0,1) \; ; \; \gamma _t \in B(t) \right\} \) has zero Lebesgue measure, or equivalently, the set

$$\begin{aligned} \big \{ t \in (0,1) \; ; \; \exists \partial _\tau |_{\tau =t} \ell ^2_\tau (\gamma _t) ,~ \exists \partial _\tau |_{\tau =t} {\bar{\ell }}^2_\tau (\gamma _t) , \partial _\tau |_{\tau =t} \ell ^2_\tau (\gamma _t)/2 = \partial _\tau |_{\tau =t} {\bar{\ell }}^2_\tau (\gamma _t)/2\big \} \end{aligned}$$

has full Lebesgue measure. The asserted (5.5) and (5.6) now directly follow from an application of Corollary 5.1. \(\square \)

Finally, we obtain the following concise interpretation of the third-order information on \(\tau \mapsto \varphi _\tau \) along \(\gamma _t\), which will play a crucial role in this work:

Lemma 5.7

Assume that for some locally absolutely continuous function \(z_{ac}\) on (0, 1) we have

$$\begin{aligned} \exists \partial _\tau |_{\tau =t} \frac{\ell ^2_\tau }{2} (\gamma _t) = z_{ac}(t) \;\;\; \text {for a.e. } t \in (0,1) . \end{aligned}$$

Then for any fixed \(r_0 \in (0,1)\), the function

$$\begin{aligned} L(r) = \exp \left( -\frac{1}{\ell ^2} \int _{r_0}^r \partial _\tau |_{\tau =t} \frac{\ell ^2_\tau }{2}(\gamma _t) dt \right) = \exp \left( -\frac{1}{\ell ^2} \int _{r_0}^r z_{ac}(t) dt \right) , \end{aligned}$$

is concave on (0, 1).

Proof

Since \(L \in C^1(0,1)\), concavity of L is equivalent to showing that the function:

$$\begin{aligned} W(r) := - \ell ^2 L'(r) = L(r) z_{ac}(r) \end{aligned}$$

is monotone non-decreasing. Since W is locally absolutely continuous, this is in turn equivalent to showing that \(W'(r) \ge 0\) for a.e. \(r \in (0,1)\). Note that the points of differentiability of W and \(z_{ac}\) coincide. At these points (of full Lebesgue measure), we indeed have

$$\begin{aligned} W'(r) = L'(r) z_{ac}(r) + L(r) z'_{ac}(r) = L(r) (z_{ac}'(r) - z_{ac}(r)^2 / \ell ^2) \ge 0, \end{aligned}$$

where the last inequality follows from Theorem 5.5. This concludes the proof. \(\square \)

We will subsequently show that under synthetic curvature conditions, the above assumption is indeed satisfied for \(\nu \)-a.e. geodesic \(\gamma \).
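The mechanism of Lemma 5.7 can be sanity-checked numerically. In the following purely illustrative sketch (all choices are ours, not from the text) we take \(\ell = 1\) and \(z_{ac}(t) = \tan t\), which satisfies \(z_{ac}' = 1 + z_{ac}^2 \ge z_{ac}^2/\ell ^2\), the differential inequality furnished by Theorem 5.5; then \(L(r) = \cos r / \cos r_0\) in closed form, and we verify midpoint concavity of L and monotonicity of \(W = L \, z_{ac}\) on a grid:

```python
import math

ell = 1.0    # our choice of the constant ell
r0 = 0.5     # our choice of the base point

def z_ac(t):
    # z_ac' = 1 + tan^2 = 1 + z_ac^2 >= z_ac^2 / ell^2: hypothesis of Theorem 5.5
    return math.tan(t)

def L(r):
    # closed form of exp(-(1/ell^2) * int_{r0}^{r} z_ac(t) dt) for z_ac = tan
    return math.cos(r) / math.cos(r0)

def W(r):
    # W(r) = -ell^2 L'(r) = L(r) z_ac(r), as in the proof
    return L(r) * z_ac(r)

grid = [0.05 * k for k in range(1, 20)]   # sample points inside (0, 1)

# midpoint concavity of L, and monotonicity of W
assert all(L((a + b) / 2) >= (L(a) + L(b)) / 2 - 1e-12
           for a in grid for b in grid)
assert all(W(a) <= W(b) + 1e-12 for a, b in zip(grid, grid[1:]))
```

Here concavity is expected since \(L(r) = \cos r / \cos r_0\) and cosine is concave on \((0,1) \subset (0,\pi /2)\).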

Part II: Disintegration theory of Optimal-Transport

8 Preliminaries

So far we have worked without considering any reference measure over our metric space \((X,\mathsf {d})\). A triple \((X,\mathsf {d}, {\mathfrak {m}})\) is called a metric measure space, m.m.s. for short, if \((X, \mathsf {d})\) is a complete and separable metric space and \({\mathfrak {m}}\) is a non-negative Borel measure over X. In this work we will only be concerned with the case that \({\mathfrak {m}}\) is a probability measure, that is \({\mathfrak {m}}(X) =1\), and hence \({\mathfrak {m}}\) is automatically a Radon measure (i.e. inner-regular). We refer to [3, 5, 43, 76, 77] for background on metric measure spaces in general, and the theory of Optimal-Transport on such spaces in particular.

8.1 Geometry of Optimal-Transport on metric measure spaces

The space of all Borel probability measures over X will be denoted by \({\mathcal {P}}(X)\). It is naturally equipped with its weak topology, in duality with bounded continuous functions \(C_b(X)\) over X. The subspace of those measures having finite second moment will be denoted by \({\mathcal {P}}_{2}(X)\), and the subspace of \({\mathcal {P}}_{2}(X)\) of those measures absolutely continuous with respect to \({\mathfrak {m}}\) is denoted by \({\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\). The weak topology on \({\mathcal {P}}_{2}(X)\) is metrized by the \(L^{2}\)-Wasserstein distance \(W_{2}\), defined as follows for any \(\mu _0,\mu _1 \in {\mathcal {P}}(X)\):

$$\begin{aligned} W_2^2(\mu _0,\mu _1) := \inf _{ \pi } \int _{X\times X} \mathsf {d}^2(x,y) \, \pi (dx , dy), \end{aligned}$$
(6.1)

where the infimum is taken over all \(\pi \in {\mathcal {P}}(X \times X)\) having \(\mu _0\) and \(\mu _1\) as the first and the second marginals, respectively; such candidates \(\pi \) are called transference plans. It is known that the infimum in (6.1) is always attained for any \(\mu _0,\mu _1 \in {\mathcal {P}}(X)\), and the transference plans realizing this minimum are called optimal transference plans between \(\mu _0\) and \(\mu _1\). When \(W_2(\mu _0,\mu _1) < \infty \), it is known that given an optimal transference plan \(\pi \) between \(\mu _0\) and \(\mu _1\), there exists a Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\) (see Sect. 3), which is associated to \(\pi \), meaning that

$$\begin{aligned} \varphi (x) + \varphi ^c(y) = \frac{\mathsf {d}(x,y)^2}{2} \;\;\; \text {for } \pi \text {-a.e. }(x,y) \in X \times X . \end{aligned}$$
(6.2)

In particular, when \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X)\), then necessarily \(W_2(\mu _0,\mu _1) < \infty \) and the above discussion applies. Moreover, in this case, it is known that for any Kantorovich potential \(\varphi \) associated to an optimal transference plan between \(\mu _0\) and \(\mu _1\), (6.2) in fact holds for all optimal transference plans \(\pi \) between \(\mu _0\) and \(\mu _1\). In addition, in this case a transference plan \(\pi \) is optimal iff it is supported on a \(\mathsf {d}^2\)-cyclically monotone set. A set \(\Lambda \subset X \times X\) is said to be c-cyclically monotone if for any finite set of points \(\{(x_{i},y_{i})\}_{i = 1,\ldots ,N} \subset \Lambda \) the following holds:

$$\begin{aligned} \sum _{i = 1}^{N} c(x_{i},y_{i}) \le \sum _{i = 1}^{N} c(x_{i},y_{i+1}) \end{aligned}$$

with the convention that \(y_{N+1} = y_{1}\).
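For a finite set \(\Lambda \), the condition can be checked by brute force: since every permutation decomposes into cycles, testing all reorderings of the targets over small subsets subsumes the cyclic shifts in the definition. A purely illustrative sketch (function names and the tolerance are our own):

```python
from itertools import combinations, permutations

def is_cyclically_monotone(pairs, c, max_size=4, tol=1e-12):
    # brute-force check: for every subset of at most max_size pairs and every
    # reordering sigma of the targets, sum c(x_i, y_i) <= sum c(x_i, y_sigma(i))
    for k in range(2, min(max_size, len(pairs)) + 1):
        for subset in combinations(pairs, k):
            base = sum(c(x, y) for x, y in subset)
            for perm in permutations(y for _, y in subset):
                if base > sum(c(x, y) for (x, _), y in zip(subset, perm)) + tol:
                    return False
    return True

cost = lambda x, y: (x - y) ** 2        # squared-distance cost on the line
assert is_cyclically_monotone([(0, 0), (1, 1), (2, 2)], cost)   # monotone matching
assert not is_cyclically_monotone([(0, 1), (1, 0)], cost)       # crossing matching
```

For the squared-distance cost on the real line, the monotone matching passes the check while the crossing one fails, consistent with the optimality criterion above.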

Since \((X,\mathsf {d})\) is a complete and separable metric space, so is \(({\mathcal {P}}_2(X), W_2)\). Under these assumptions, it is known that \((X,\mathsf {d})\) is geodesic if and only if \(({\mathcal {P}}_2(X), W_2)\) is geodesic. Recall that \(\mathrm{e}_{t}\) denotes the (continuous) evaluation map at \(t \in [0,1]\):

$$\begin{aligned} \mathrm{e}_{t} : \mathrm{Geo}(X) \ni \gamma \mapsto \gamma _t \in X. \end{aligned}$$

A measure \(\nu \in {\mathcal {P}}(\mathrm{Geo}(X))\) is called an optimal dynamical plan if \((\mathrm{e}_0,\mathrm{e}_1)_{\sharp } \nu \) is an optimal transference plan; it easily follows in that case that \([0,1] \ni t \mapsto (\mathrm{e}_t)_\sharp \nu \) is a geodesic in \(({\mathcal {P}}_2(X), W_2)\). It is known that any geodesic \((\mu _t)_{t \in [0,1]}\) in \(({\mathcal {P}}_2(X), W_2)\) can be lifted to an optimal dynamical plan \(\nu \) so that \((\mathrm{e}_t)_\sharp \, \nu = \mu _t\) for all \(t \in [0,1]\) (see for instance [3, Theorem 2.10]). We denote by \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\) the space of all optimal dynamical plans \(\nu \) so that \((\mathrm{e}_i)_\sharp \, \nu = \mu _i\), \(i=0,1\). Consequently, whenever \((X,\mathsf {d})\) is geodesic, the set \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\) is non-empty for all \(\mu _0,\mu _1\in {\mathcal {P}}_2(X)\), and for any Kantorovich potential \(\varphi \) associated to an optimal transference plan between \(\mu _0\) and \(\mu _1\), we have \(\nu (G_\varphi ) = 1\) for all \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\).
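As a toy illustration of these notions, on the real line with uniform discrete marginals the monotone (sorted) coupling is well known to be optimal for the cost \(\mathsf {d}^2\); interpolating each matched pair linearly realizes \((\mathrm{e}_t)_\sharp \, \nu \), and \(t \mapsto \mu _t\) is then a constant-speed \(W_2\)-geodesic. A sketch under these assumptions (all names and data are ours):

```python
import math

def w2_1d(xs, ys):
    # in 1D with cost d^2 the monotone (sorted) coupling is optimal, so
    # W_2^2 is the mean squared displacement of the sorted matching
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def e_t(xs, ys, t):
    # evaluation map e_t applied to the geodesics gamma_t = (1-t) x + t y
    xs, ys = sorted(xs), sorted(ys)
    return [(1 - t) * x + t * y for x, y in zip(xs, ys)]

mu0, mu1 = [0.0, 1.0, 2.0], [4.0, 5.0, 9.0]
W = w2_1d(mu0, mu1)
for t in (0.25, 0.5, 0.75):
    # constant-speed geodesic: W_2(mu_0, mu_t) = t * W_2(mu_0, mu_1)
    assert abs(w2_1d(mu0, e_t(mu0, mu1, t)) - t * W) < 1e-12
```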

In order to consider restrictions of optimal dynamical plans, for any \(s,t \in [0,1]\) with \(s \le t\) we consider the restriction map

$$\begin{aligned} \text {restr}^{t}_{s} : C([0,1]; X) \ni \gamma \mapsto \gamma \circ f^t_s \in C([0,1]; X), \end{aligned}$$

where \(f^{t}_{s} : [0,1] \rightarrow [s,t]\) is defined by \(f^{t}_{s}(\tau ) = s + (t-s) \tau \). Throughout this work we will use the following facts: if \(\nu \in \mathrm {OptGeo}(\mu _{0}, \mu _{1})\) then the restriction \((\text {restr}^{t}_{s})_{\sharp }\nu \) is still an optimal dynamical plan, now between \(\mu _s\) and \(\mu _t\), where \(\mu _{r}:=(\mathrm{e}_{r})_{\sharp }\nu \). Moreover, any probability measure \(\nu ' \in {\mathcal {P}}(\mathrm{Geo}(X))\) with \(\text {supp}(\nu ') \subset \text {supp}(\nu ) ( \subset G_\varphi )\) is also an optimal dynamical plan, between \((\mathrm{e}_{0})_{\sharp } \nu '\) and \((\mathrm{e}_{1})_{\sharp } \nu '\).
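The reparametrization underlying \(\text {restr}^{t}_{s}\) is elementary; a minimal sketch (modelling a curve as a callable is our own choice, purely for illustration):

```python
def restr(s, t):
    # restr^t_s: gamma -> gamma o f^t_s, with f^t_s(tau) = s + (t - s) * tau
    return lambda gamma: (lambda tau: gamma(s + (t - s) * tau))

gamma = lambda tau: (2 * tau, 3 * tau)     # a toy constant-speed line in R^2
g = restr(0.25, 0.75)(gamma)
# g traverses gamma's trace between times 0.25 and 0.75, reparametrized on [0,1]
assert g(0.0) == gamma(0.25) and g(1.0) == gamma(0.75)
```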

On several occasions we will use the following standard lemma (whose proof is a straightforward adaptation of e.g. [29, Lemma 4.4], relying on the Arzelà–Ascoli and Prokhorov theorems):

Lemma 6.1

Assume that \((X,\mathsf {d})\) is a Polish and proper space. Let \(\left\{ \mu _0^i\right\} ,\left\{ \mu _1^i\right\} \subset {\mathcal {P}}_2(X)\) denote two sequences of probability measures weakly converging to \(\mu ^\infty _0 ,\mu ^\infty _1 \in {\mathcal {P}}_2(X)\), respectively. Assume that \(\nu ^i \in \mathrm {OptGeo}(\mu _0^i,\mu _1^i)\). Then there exists a subsequence \(\left\{ \nu ^{i_j}\right\} \) weakly converging to \(\nu ^\infty \in \mathrm {OptGeo}(\mu _0^\infty ,\mu _1^\infty )\).

Definition

(Essentially Non-Branching) A subset \(G \subset \mathrm{Geo}(X)\) of geodesics is called non-branching if for any \(\gamma ^{1}, \gamma ^{2} \in G\) the following holds:

$$\begin{aligned} \exists t \in (0,1) \;\;\; \gamma ^{1}_{s} = \gamma ^{2}_{s} \;\;\; \forall s \in [0,t] \quad \Longrightarrow \quad \gamma ^{1}_{s} = \gamma ^{2}_{s} \;\;\; \forall s \in [0,1]. \end{aligned}$$

\((X,\mathsf {d})\) is called non-branching if \(\mathrm{Geo}(X)\) is non-branching. \((X,\mathsf {d}, {\mathfrak {m}})\) is called essentially non-branching [68] if for all \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\), any \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\) is concentrated on a Borel non-branching set \(G\subset \mathrm{Geo}(X)\).

Recall that a measure \(\nu \) on a measurable space \((\Omega ,{\mathcal {F}})\) is said to be concentrated on \(A \subset \Omega \) if \(\exists B \subset A\) with \(B \in {\mathcal {F}}\) so that \(\nu (\Omega \setminus B) = 0\).

8.2 Curvature-Dimension conditions

We now turn to describe various synthetic conditions encapsulating generalized Ricci curvature lower bounds coupled with generalized dimension upper bounds.

Definition 6.2

(\(\sigma _{K,{\mathcal {N}}}\)-coefficients) Given \(K \in {\mathbb {R}}\) and \({\mathcal {N}}\in (0,\infty ]\), define

$$\begin{aligned} D_{K,{\mathcal {N}}} := {\left\{ \begin{array}{ll} \frac{\pi }{\sqrt{K/{\mathcal {N}}}} &{} K > 0,\; {\mathcal {N}}< \infty \\ +\infty &{} \text {otherwise} \end{array}\right. }. \end{aligned}$$

In addition, given \(t \in [0,1]\) and \(0< \theta < D_{K,{\mathcal {N}}}\), define

$$\begin{aligned} \sigma ^{(t)}_{K,{\mathcal {N}}}(\theta ) := \frac{\sin (t \theta \sqrt{\frac{K}{{\mathcal {N}}}})}{\sin (\theta \sqrt{\frac{K}{{\mathcal {N}}}})} = {\left\{ \begin{array}{ll} \frac{\sin (t \theta \sqrt{\frac{K}{{\mathcal {N}}}})}{\sin (\theta \sqrt{\frac{K}{{\mathcal {N}}}})} &{} K > 0,\; {\mathcal {N}}< \infty \\ t &{} K = 0 \text { or } {\mathcal {N}}= \infty \\ \frac{\sinh (t \theta \sqrt{\frac{-K}{{\mathcal {N}}}})}{\sinh (\theta \sqrt{\frac{-K}{{\mathcal {N}}}})} &{} K< 0,\; {\mathcal {N}}< \infty \end{array}\right. }, \end{aligned}$$

and set \(\sigma ^{(t)}_{K,{\mathcal {N}}}(0) = t\) and \(\sigma ^{(t)}_{K,{\mathcal {N}}}(\theta ) = +\infty \) for \(\theta \ge D_{K,{\mathcal {N}}}\).

Definition 6.3

(\(\tau _{K,N}\)-coefficients) Given \(K \in {\mathbb {R}}\) and \(N \in (1,\infty ]\), define

$$\begin{aligned} \tau _{K,N}^{(t)}(\theta ) := t^{\frac{1}{N}} \sigma _{K,N-1}^{(t)}(\theta )^{1 - \frac{1}{N}} . \end{aligned}$$

When \(N=1\), set \(\tau ^{(t)}_{K,1}(\theta ) = t\) if \(K \le 0\) and \(\tau ^{(t)}_{K,1}(\theta ) = +\infty \) if \(K > 0\).
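The two families of coefficients translate directly into code; the following illustrative sketch (function names are ours) implements the case distinctions of Definitions 6.2 and 6.3, including the convention \(\sigma ^{(t)}_{K,{\mathcal {N}}}(\theta ) = +\infty \) for \(\theta \ge D_{K,{\mathcal {N}}}\):

```python
import math

def sigma(K, Ncal, t, theta):
    # sigma^{(t)}_{K,Ncal}(theta) of Definition 6.2
    if theta == 0:
        return t
    if K > 0 and Ncal != math.inf:
        if theta >= math.pi / math.sqrt(K / Ncal):   # theta >= D_{K,Ncal}
            return math.inf
        a = theta * math.sqrt(K / Ncal)
        return math.sin(t * a) / math.sin(a)
    if K == 0 or Ncal == math.inf:
        return t
    a = theta * math.sqrt(-K / Ncal)                 # K < 0, Ncal < inf
    return math.sinh(t * a) / math.sinh(a)

def tau(K, N, t, theta):
    # tau^{(t)}_{K,N}(theta) = t^{1/N} * sigma^{(t)}_{K,N-1}(theta)^{1-1/N}
    if N == 1:
        return t if K <= 0 else math.inf
    s = sigma(K, N - 1, t, theta)
    return math.inf if s == math.inf else t ** (1 / N) * s ** (1 - 1 / N)

assert sigma(0, 5, 0.5, 1.0) == 0.5          # K = 0: linear coefficient
assert sigma(1, 1, 0.5, 4.0) == math.inf     # theta beyond D_{K,Ncal} = pi
assert abs(tau(0, 2, 0.25, 1.0) - 0.25) < 1e-12
```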

The synthetic Curvature-Dimension condition \({\mathsf {CD}}(K,N)\) has been defined on a general m.m.s. independently in several seminal works by Sturm and Lott–Villani: the case \(N=\infty \) and \(K \in {\mathbb {R}}\) was defined in [73] and [51], the case \(N \in [1,\infty )\) in [74] for \(K \in {\mathbb {R}}\) and in [51] for \(K=0\) (and subsequently for \(K \in {\mathbb {R}}\) in [50]). Our treatment in this work excludes the case \(N=\infty \) (for which the globalization result we are after is in any case known [73]). To exclude possible pathological behavior when \(N=1\), we will always assume, unless otherwise stated, that \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\).

We will use the following definition introduced in [74]. Recall that given \(N \in (1,\infty )\), the N-Rényi relative-entropy functional \({\mathcal {E}}_N : {\mathcal {P}}(X) \rightarrow [0,1]\) (since \({\mathfrak {m}}(X) = 1\)) is defined as

$$\begin{aligned} {\mathcal {E}}_N(\mu ) := \int \rho ^{1 - \frac{1}{N}} d{\mathfrak {m}}, \end{aligned}$$

where \(\mu = \rho {\mathfrak {m}}+ \mu ^{{\text {sing}}}\) is the Lebesgue decomposition of \(\mu \) with \(\mu ^{\text {sing}}\perp {\mathfrak {m}}\). It is known [74] that \({\mathcal {E}}_N\) is upper semi-continuous with respect to the weak topology on \({\mathcal {P}}(X)\).
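In the discrete case (a reference probability measure with weights \(m_i\) and \(\mu \ll {\mathfrak {m}}\) with density \(\rho _i\)), the functional reduces to the finite sum \(\sum _i \rho _i^{1-1/N} m_i\); by Jensen's inequality it is at most 1, with equality iff \(\rho \equiv 1\). An illustrative sketch (our own toy data):

```python
def renyi_entropy(rho, m, N):
    # E_N(mu) = sum_i rho_i^{1 - 1/N} m_i for mu = rho * m (no singular part)
    return sum(r ** (1 - 1 / N) * w for r, w in zip(rho, m))

m = [0.25] * 4                                       # uniform reference measure
assert renyi_entropy([1.0] * 4, m, N=2.0) == 1.0     # mu = m maximizes E_N
assert renyi_entropy([2.0, 2.0, 0.0, 0.0], m, N=2.0) < 1.0   # concentration lowers E_N
```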

Definition 6.4

(\({\mathsf {CD}}(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {CD}}(K,N)\) if for all \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\), there exists \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) so that for all \(t\in [0,1]\), \(\mu _t := (\mathrm{e}_t)_{\#} \nu \ll {\mathfrak {m}}\), and for all \(N' \ge N\):

$$\begin{aligned} {\mathcal {E}}_{N'}(\mu _t) \ge \int _{X \times X} \left( \begin{array}{l} \tau ^{(1-t)}_{K,N'}(\mathsf {d}(x_0,x_1)) \rho _0^{-1/N'}(x_0) \\ + \tau ^{(t)}_{K,N'}(\mathsf {d}(x_0,x_1)) \rho _1^{-1/N'}(x_1) \end{array}\right) \pi (dx_0,dx_1), \end{aligned}$$
(6.3)

for some optimal transference plan \(\pi \) between \(\mu _0 = \rho _0 \,{\mathfrak {m}}\) and \(\mu _1 = \rho _1\, {\mathfrak {m}}\).

Remark 6.5

When \({\mathfrak {m}}(X) < \infty \) as in our setting, it is known [74, Proposition 1.6 (ii)] that \({\mathsf {CD}}(K,N)\) implies \({\mathsf {CD}}(K,\infty )\), and hence the requirement \(\mu _t \ll {\mathfrak {m}}\) for all intermediate times \(t \in (0,1)\) is in fact superfluous, as it must hold automatically by finiteness of the Shannon entropy (see [73, 74]).

The following is a local version of \({\mathsf {CD}}(K,N)\):

Definition 6.6

(\({\mathsf {CD}}_{loc}(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {CD}}_{loc}(K,N)\) if for any \(o \in \text {supp}({\mathfrak {m}})\), there exists a neighborhood \(X_o \subset X\) of o, so that for all \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) supported in \(X_o\), there exists \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) so that for all \(t\in [0,1]\), \(\mu _t := (\mathrm{e}_t)_{\#} \nu \ll {\mathfrak {m}}\), and for all \(N' \ge N\), (6.3) holds.

Note that \((\mathrm{e}_t)_{\sharp } \nu \) is not required to be supported in \(X_o\) for intermediate times \(t \in (0,1)\) in the latter definition.

The following pointwise density inequality is a known equivalent definition of \({\mathsf {CD}}(K,N)\) on essentially non-branching spaces (the equivalence follows by combining the results of [29] and [41], see the proof of Proposition 9.1):

Definition 6.7

(\({\mathsf {CD}}(K,N)\) for essentially non-branching spaces) An essentially non-branching m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}(K,N)\) if and only if for all \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\), there exists a unique \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\); this \(\nu \) is induced by a map (i.e. \(\nu = S_{\sharp }(\mu _0)\) for some map \(S : X \rightarrow \mathrm{Geo}(X)\)), \(\mu _t := (\mathrm{e}_t)_{\#} \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1]\), and writing \(\mu _t = \rho _t {\mathfrak {m}}\), we have for all \(t \in [0,1]\):

$$\begin{aligned} \rho _t^{-1/N}(\gamma _t)\ge & {} \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _0,\gamma _1)) \rho _0^{-1/N}(\gamma _0)\\&+ \tau _{K,N}^{(t)}(\mathsf {d}(\gamma _0,\gamma _1)) \rho _1^{-1/N}(\gamma _1) \;\;\; \text {for }\nu \text {-a.e. }\gamma \in \mathrm{Geo}(X) . \end{aligned}$$
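As a toy verification of this pointwise inequality in the model case \(K=0\) (where \(\tau ^{(t)}_{0,N}(\theta ) = t\)): on \(({\mathbb {R}},|\cdot |)\) with the Lebesgue reference measure, transporting the uniform measure on \([0,a]\) to the uniform measure on \([c,c+b]\) by the monotone map gives the constant intermediate density \(\rho _t \equiv ((1-t)a+tb)^{-1}\), and the inequality reduces to concavity of \(x \mapsto x^{1/N}\). A sketch under these (our own) choices:

```python
# toy instance of the pointwise CD(0,N) density inequality on the real line:
# rho_t^{-1/N} = ((1-t)a + t b)^{1/N} must dominate the linear interpolation
# (1-t) rho_0^{-1/N} + t rho_1^{-1/N} = (1-t) a^{1/N} + t b^{1/N},
# which is exactly concavity of x -> x^{1/N}
a, b, N = 1.0, 3.0, 2.0
for k in range(1, 10):
    t = k / 10
    lhs = ((1 - t) * a + t * b) ** (1 / N)                 # rho_t^{-1/N}
    rhs = (1 - t) * a ** (1 / N) + t * b ** (1 / N)
    assert lhs >= rhs - 1e-12
```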

The Measure Contraction Property \({\mathsf {MCP}}(K,N)\) was introduced independently by Ohta in [57] and Sturm in [74]. The idea is to require the \({\mathsf {CD}}(K,N)\) condition to hold only when \(\mu _1\) degenerates to \(\delta _o\), a delta-measure at \(o \in \text {supp}({\mathfrak {m}})\). However, there are several possible implementations of this idea. We start with the following variant of the one used in [29]:

Definition 6.8

(\({\mathsf {MCP}}_{\varepsilon }(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {MCP}}_{\varepsilon }(K,N)\) if for any \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) with bounded support, there exists \(\nu \in \mathrm {OptGeo}(\mu _0, \delta _{o} )\) such that for all \(t \in [0,1)\), the measure \(\mu _t := (\mathrm{e}_t)_{\#} \nu \) satisfies \(\text {supp}(\mu _t) \subset \text {supp}({\mathfrak {m}})\) and:

$$\begin{aligned} {\mathcal {E}}_{N}(\mu _t) \ge \int _X \tau _{K,N}^{(1-t)} (\mathsf {d}(x_0,o)) \rho _0^{1-\frac{1}{N}}(x_0) {\mathfrak {m}}(dx_0) , \end{aligned}$$
(6.4)

where \(\mu _0 = \rho _0 {\mathfrak {m}}\).

The variant proposed in [57] is as follows:

Definition 6.9

(\({\mathsf {MCP}}(K,N)\)) A m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) is said to satisfy \({\mathsf {MCP}}(K,N)\) if for any \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) of the form \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\) for some Borel set \(A \subset X\) with \(0< {\mathfrak {m}}(A) < \infty \), there exists \(\nu \in \mathrm {OptGeo}(\mu _0, \delta _{o} )\) such that:

$$\begin{aligned} \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\ge (\mathrm{e}_{t})_{\sharp } \big ( \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _{0},\gamma _{1}))^{N} \nu (d \gamma ) \big ) \;\;\; \forall t \in [0,1] . \end{aligned}$$
(6.5)

Remark 6.10

Note that in [57] it was assumed in addition that \(\text {supp}({\mathfrak {m}}) = X\) and that \((X,\mathsf {d})\) is a length-space, but (6.5) was only required to hold for \(A \subset B(o, D_{K,N-1})\) if \(K>0\). Both our version and the one from [57] imply that the diameter of \(\text {supp}({\mathfrak {m}})\) is bounded above by \(D_{K,N-1}\) (in our version this follows since \(\tau _{K,N}(\theta ) = +\infty \) if \(\theta \ge D_{K,N-1}\), and in the version from [57] by [57, Theorem 4.3]), and also that \(\text {supp}({\mathfrak {m}})\) is a geodesic space (see Lemma 6.12 below); therefore both versions are ultimately equivalent.

When either the \({\mathsf {MCP}}(K,N)\) or \({\mathsf {MCP}}_{\varepsilon }(K,N)\) conditions hold for a given \(o \in \text {supp}({\mathfrak {m}})\), we will say that the space satisfies the corresponding condition with respect to o.

Remark 6.11

The \({\mathsf {CD}}(K,N)\), \({\mathsf {CD}}_{loc}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) and \({\mathsf {MCP}}(K,N)\) conditions all ensure that for all \(t \in [0,1]\), \(\text {supp}((\mathrm{e}_t)_\sharp \nu ) \subset \text {supp}({\mathfrak {m}})\) for the appropriate \(\nu \in \mathrm{OptGeo}(\mu _0,\mu _1)\) appearing in the corresponding definition. Consequently, for a fixed dense countable set of times \(t \in (0,1)\), \(\gamma _t \in \text {supp}({\mathfrak {m}})\) for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\); since \(\text {supp}({\mathfrak {m}})\) is closed, this in fact holds for all \(t \in [0,1]\), and hence \(\gamma \in \mathrm{Geo}(\text {supp}({\mathfrak {m}}))\) for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\), i.e. \(\text {supp}(\nu ) \subset \mathrm{Geo}(\text {supp}({\mathfrak {m}}))\). It follows that \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}(K,N)\), \({\mathsf {CD}}_{loc}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) or \({\mathsf {MCP}}(K,N)\) iff \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) does.

The following simple lemma will be useful for quickly establishing that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper and geodesic:

Lemma 6.12

Let \((X,\mathsf {d},{\mathfrak {m}})\) be a m.m.s. verifying \({\mathsf {CD}}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) or \({\mathsf {MCP}}(K,N)\). Then \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a Polish, proper and geodesic space. The same holds for \({\mathsf {CD}}_{loc}(K,N)\) if \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is assumed to be a length space.

Proof

As \(\text {supp}({\mathfrak {m}}) \subset X\) is closed, \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is Polish. It was shown in [57, Lemma 2.5, Theorem 5.1] for \({\mathsf {MCP}}(K,N)\) (and hence \({\mathsf {MCP}}_{\varepsilon }(K,N)\)) and in [74, Corollary 2.4] for \({\mathsf {CD}}(K,N)\) that these conditions imply a doubling condition, so that every closed bounded ball in \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is totally bounded. Together with completeness, this already implies that the latter space is proper. By Remark 6.11, \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies the same corresponding condition as \((X,\mathsf {d},{\mathfrak {m}})\). In particular, if \((X,\mathsf {d},{\mathfrak {m}})\) and hence \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}(K,N)\), \({\mathsf {MCP}}_{\varepsilon }(K,N)\) or \({\mathsf {MCP}}(K,N)\), then for any \(x,y \in \text {supp}({\mathfrak {m}})\), there is at least one geodesic in \(\text {supp}({\mathfrak {m}})\) from \(B(y,\varepsilon ) \cap \text {supp}({\mathfrak {m}})\) to x; together with properness and completeness, this already implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is geodesic. On the other hand, if \((X,\mathsf {d},{\mathfrak {m}})\) and hence \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}_{loc}(K,N)\), the above argument shows that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is complete and locally compact. Together with the assumption that the latter space is a length-space, the Hopf-Rinow theorem implies that it is proper and geodesic. \(\square \)

Lemma 6.13

The following chain of implications is known:

$$\begin{aligned} {\mathsf {CD}}(K,N) \Rightarrow {\mathsf {MCP}}_{\varepsilon }(K,N) \Rightarrow {\mathsf {MCP}}(K,N) . \end{aligned}$$

Proof

By Remark 6.11, we may reduce to the case \(\text {supp}({\mathfrak {m}}) = X\). Fixing \(\mu _0 \ll {\mathfrak {m}}\) with bounded support and \(o \in X\), let \(\nu ^\varepsilon \) be an element of \(\mathrm {OptGeo}(\mu _0,\mu _1^\varepsilon )\) satisfying the \({\mathsf {CD}}(K,N)\) condition for \(\mu _1^\varepsilon = {\mathfrak {m}}(B(o,\varepsilon ))^{-1}{\mathfrak {m}}\llcorner _{B(o,\varepsilon )}\). By Lemma 6.1 (which applies since the space is proper by Lemma 6.12), \(\left\{ \nu ^\varepsilon \right\} \) has a subsequence converging to \(\nu ^0 \in \mathrm {OptGeo}(\mu _0,\delta _o)\) as \(\varepsilon \rightarrow 0\). The upper semi-continuity of \({\mathcal {E}}_N\) and the continuity of the evaluation map \(\mathrm{e}_t\) ensure that \(\nu ^0\) satisfies the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) condition (6.4). The second implication follows by the arguments of [66, Section 5] (without any type of essential non-branching assumption). \(\square \)

Remark 6.14

We will show in Proposition 9.1 that for essentially non-branching spaces, \({\mathsf {MCP}}(K,N)\) implies back \({\mathsf {MCP}}_{\varepsilon }(K,N)\). We remark that for non-branching spaces, the implication \({\mathsf {CD}}(K,N) \Rightarrow {\mathsf {MCP}}(K,N)\) was first proved in [74].

Many additional useful results on the structure of \(W_{2}\)-geodesics can be obtained just from the \({\mathsf {MCP}}\) condition. The following has been shown in [29, Theorem 1.1 and Appendix] (when \(\text {supp}({\mathfrak {m}}) = X\); the formulation below is immediately obtained from Remark 6.11):

Theorem 6.15

([29]) Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. satisfying \({\mathsf {MCP}}(K,N)\). Given any pair \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _{0} \ll {\mathfrak {m}}\) and \(\text {supp}(\mu _{1}) \subset \text {supp}({\mathfrak {m}})\), the following holds:

  • there exists a unique \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\) and hence a unique optimal transference plan between \(\mu _0\) and \(\mu _1\);

  • there exists a map \(S : X \supset {\text {Dom}}\,(S) \rightarrow \mathrm{Geo}(X)\) such that \(\nu = S_{\sharp } \mu _{0}\);

  • for any \(t \in [0,1)\) the measure \((\mathrm{e}_{t})_{\sharp } \nu \) is absolutely continuous with respect to \({\mathfrak {m}}\).

The following is a standard corollary of the fact that the optimal dynamical plan is induced by a map (see e.g. the comments after [41, Theorem 1.1]); as we could not find a reference, we sketch the proof for completeness.

Corollary 6.16

With the same assumptions as in Theorem 6.15, the unique optimal dynamical plan \(\nu \) is concentrated on a (Borel) set \(G \subset \mathrm{Geo}(X)\), so that for all \(t \in [0,1)\), the evaluation map \(\mathrm{e}_t|_G : G \rightarrow X\) is injective. In particular, for any Borel subset \(H \subset G\):

$$\begin{aligned} (\mathrm{e}_t)_\sharp (\nu \llcorner _{H}) = (\mathrm{e}_t)_\sharp (\nu )\llcorner _{\mathrm{e}_t(H)} \;\;\; \forall t \in [0,1) . \end{aligned}$$

Sketch of proof

First, we claim the existence of \(X_1 \subset X\) with \(\mu _0(X_1) = 1\), such that for all \(x \in X_1\), there exists a unique \(\gamma \in G_\varphi \) with \(\gamma _0 = x\). Otherwise, if \(A \subset X\) is a set of positive \(\mu _0\)-measure where this is violated, there are at least two distinct geodesics in \(G_\varphi \) emanating from every \(x \in A\). As these geodesics must differ at some rational time in (0, 1), it follows that there exist a rational \({{\bar{t}}} \in (0,1)\) and \(B \subset A\) still of positive \(\mu _0\)-measure so that the two geodesics emanating from x differ at time \({{\bar{t}}}\) for all \(x \in B\). Consider \({\bar{\mu }}_0 = \mu _0 \llcorner _{B} / \mu _0(B) \ll {\mathfrak {m}}\), and transport to time \({{\bar{t}}}\) half of its mass along one geodesic and the second half along the other one (see e.g. the proof of [29, Theorem 5.1]). The latter transference plan is optimal but is not induced by a map, yielding a contradiction.

Now denote \(G := S(X_1)\) (and hence \(\nu (G) = 1\)), so that the injectivity of \(\mathrm{e}_0|_G\) is already guaranteed. To see the injectivity of \(\mathrm{e}_t|_G\) for all \(t \in (0,1)\), suppose by way of contradiction that there exist distinct \(\gamma ^{1}, \gamma ^{2} \in G\) with \(\gamma ^{1}_{t} = \gamma ^{2}_{t}\). Denoting by \(\eta \) the gluing of \(\gamma ^{1}\) restricted to [0, t] with \(\gamma ^{2}\) restricted to [t, 1], it follows by \(\mathsf {d}^{2}\)-cyclic monotonicity (see e.g. the proof of [14, Lemma 2.6] or that of Lemma 3.7) that \(\eta \in G_{\varphi }\) with \(\eta _{0} = \gamma _{0}^{1}\) and \(\eta \ne \gamma ^{1}\). But this contradicts the definition of \(X_1\), thereby concluding the proof. \(\square \)

8.3 Disintegration theorem

We include here a version of the Disintegration Theorem that we will use. We will follow [18, Appendix A] where a self-contained approach (and a proof) of the Disintegration Theorem in countably generated measure spaces can be found. An even more general version of the Disintegration Theorem can be found in [39, Section 452].

Recall that given a measure space \((X,{\mathscr {X}},{\mathfrak {m}})\), a set \(A \subset X\) is called \({\mathfrak {m}}\)-measurable if A belongs to the completion of the \(\sigma \)-algebra \({\mathscr {X}}\), generated by adding to it all subsets of \({\mathfrak {m}}\)-null sets; similarly, a function \(f : (X,{\mathscr {X}},{\mathfrak {m}}) \rightarrow {\mathbb {R}}\) is called \({\mathfrak {m}}\)-measurable if all of its sub-level sets are \({\mathfrak {m}}\)-measurable.

Definition 6.17

(Disintegration on sets) Let \((X,{\mathscr {X}},{\mathfrak {m}})\) denote a measure space. Given any family \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) of subsets of X, a disintegration of \({\mathfrak {m}}\) on \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) is a measure-space structure \((Q,{\mathscr {Q}},{\mathfrak {q}})\) and a map

$$\begin{aligned} Q \ni \alpha \longmapsto {\mathfrak {m}}_{\alpha } \in {\mathcal {P}}(X,{\mathscr {X}}) \end{aligned}$$

so that

  (1) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \({\mathfrak {m}}_\alpha \) is concentrated on \(X_\alpha \);

  (2) for all \(B \in {\mathscr {X}}\), the map \(\alpha \mapsto {\mathfrak {m}}_{\alpha }(B)\) is \({\mathfrak {q}}\)-measurable;

  (3) for all \(B \in {\mathscr {X}}\), \({\mathfrak {m}}(B) = \int {\mathfrak {m}}_{\alpha }(B)\, {\mathfrak {q}}(d\alpha )\).

The measures \({\mathfrak {m}}_\alpha \) are referred to as conditional probabilities.

Given a measurable space \((X, {\mathscr {X}})\) and a function \({\mathfrak {Q}}: X \rightarrow Q\), with Q a general set, we endow Q with the push forward \(\sigma \)-algebra \({\mathscr {Q}}\) of \({\mathscr {X}}\):

$$\begin{aligned} C \in {\mathscr {Q}} \quad \Longleftrightarrow \quad {\mathfrak {Q}}^{-1}(C) \in {\mathscr {X}}, \end{aligned}$$

i.e. the largest \(\sigma \)-algebra on Q such that \({\mathfrak {Q}}\) is measurable. Moreover, given a measure \({\mathfrak {m}}\) on \((X,{\mathscr {X}})\), define a measure \({\mathfrak {q}}\) on \((Q,{\mathscr {Q}})\) by pushing forward \({\mathfrak {m}}\) via \({\mathfrak {Q}}\), i.e. \({\mathfrak {q}}:= {\mathfrak {Q}}_\sharp \, {\mathfrak {m}}\).

Definition 6.18

(Consistent and Strongly Consistent Disintegration) A disintegration of \({\mathfrak {m}}\) consistent with \({\mathfrak {Q}}: X \rightarrow Q\) is a map:

$$\begin{aligned} Q \ni \alpha \longmapsto {\mathfrak {m}}_{\alpha } \in {\mathcal {P}}(X,{\mathscr {X}}) \end{aligned}$$

such that the following requirements hold:

  (1) for all \(B \in {\mathscr {X}}\), the map \(\alpha \mapsto {\mathfrak {m}}_{\alpha }(B)\) is \({\mathfrak {q}}\)-measurable;

  (2) for all \(B \in {\mathscr {X}}\) and \(C \in {\mathscr {Q}}\), the following consistency condition holds:

    $$\begin{aligned} {\mathfrak {m}}\left( B \cap {\mathfrak {Q}}^{-1}(C) \right) = \int _{C} {\mathfrak {m}}_{\alpha }(B)\, {\mathfrak {q}}(d\alpha ). \end{aligned}$$

A disintegration of \({\mathfrak {m}}\) is called strongly consistent with respect to \({\mathfrak {Q}}\) if in addition:

  (3) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \({\mathfrak {m}}_\alpha \) is concentrated on \({\mathfrak {Q}}^{-1}(\alpha )\).

The above general scheme fits the following situation: given a measure space \((X,{\mathscr {X}},{\mathfrak {m}})\), suppose we are given a partition of X into disjoint sets \(\{ X_{\alpha }\}_{\alpha \in Q}\), so that \(X = \cup _{\alpha \in Q} X_\alpha \). Here Q is the set of indices and \({\mathfrak {Q}}: X \rightarrow Q\) is the quotient map, i.e.

$$\begin{aligned} \alpha = {\mathfrak {Q}}(x) \iff x \in X_{\alpha }. \end{aligned}$$

We endow Q with the quotient \(\sigma \)-algebra \({\mathscr {Q}}\) and the quotient measure \({\mathfrak {q}}\) as described above, obtaining the quotient measure space \((Q, {\mathscr {Q}}, {\mathfrak {q}})\). When a disintegration \(\alpha \mapsto {\mathfrak {m}}_\alpha \) of \({\mathfrak {m}}\) is (strongly) consistent with the quotient map \({\mathfrak {Q}}\), we will simply say that it is (strongly) consistent with the partition. Note that any disintegration \(\alpha \mapsto {\mathfrak {m}}_\alpha \) of \({\mathfrak {m}}\) on a partition \(\{ X_{\alpha }\}_{\alpha \in Q}\) (as in Definition 6.17) is automatically strongly consistent with the partition (as in Definition 6.18), and vice versa.
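In the discrete case the above scheme is completely explicit: \({\mathfrak {q}}= {\mathfrak {Q}}_\sharp \, {\mathfrak {m}}\), and \({\mathfrak {m}}_\alpha \) is the normalized restriction of \({\mathfrak {m}}\) to \(X_\alpha \). A purely illustrative sketch (all data our own) verifying the consistency condition of Definition 6.18:

```python
from collections import defaultdict
from fractions import Fraction

# a finite probability space X = {0,...,5}, partitioned by parity
X = range(6)
m = {x: Fraction(1, 6) for x in X}       # uniform probability measure
Q_map = lambda x: x % 2                  # quotient map onto Q = {0, 1}

# quotient measure q = Q_# m
q = defaultdict(Fraction)
for x in X:
    q[Q_map(x)] += m[x]

# conditional probabilities: m_alpha is the normalized restriction of m to X_alpha
m_cond = {a: {x: m[x] / q[a] for x in X if Q_map(x) == a} for a in q}

# consistency: m(B ∩ Q^{-1}(C)) = sum over alpha in C of m_alpha(B) q(alpha)
B, C = {0, 1, 2}, {0}
lhs = sum(m[x] for x in B if Q_map(x) in C)
rhs = sum(sum(m_cond[a].get(x, Fraction(0)) for x in B) * q[a] for a in C)
assert lhs == rhs == Fraction(1, 3)
```

Exact rational arithmetic via `Fraction` makes the consistency identity hold with equality rather than up to floating-point error.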

We now formulate the Disintegration Theorem (it is formulated for probability measures but clearly holds for any finite non-zero measure):

Theorem 6.19

(Theorem A.7, Proposition A.9 of [18]) Assume that \((X,{\mathscr {X}},{\mathfrak {m}})\) is a countably generated probability space and that \(\left\{ X_{\alpha }\right\} _{\alpha \in Q}\) is a partition of X.

Then the quotient probability space \((Q, {\mathscr {Q}},{\mathfrak {q}})\) is essentially countably generated and there exists an essentially unique disintegration \(\alpha \mapsto {\mathfrak {m}}_{\alpha }\) consistent with the partition.

If in addition \({\mathscr {X}}\) contains all singletons, then the disintegration is strongly consistent if and only if there exists an \({\mathfrak {m}}\)-section \(S_{{\mathfrak {m}}} \in {\mathscr {X}}\) of the partition such that the \(\sigma \)-algebra on \(S_{{\mathfrak {m}}}\) induced by the quotient-map contains the trace \(\sigma \)-algebra \({\mathscr {X}} \cap S_{{\mathfrak {m}}} := \left\{ A \cap S_{{\mathfrak {m}}} ; A \in {\mathscr {X}}\right\} \).

Let us expand on the statement of Theorem 6.19. Recall that a \(\sigma \)-algebra \({\mathcal {A}}\) is countably generated if there exists a countable family of sets so that \({\mathcal {A}}\) coincides with the smallest \(\sigma \)-algebra containing them. On the measure space \((Q, {\mathscr {Q}},{\mathfrak {q}})\), the \(\sigma \)-algebra \({\mathscr {Q}}\) is called essentially countably generated if there exists a countable family of sets \(Q_{n} \subset Q\) such that for any \(C \in {\mathscr {Q}}\) there exists \({{\hat{C}}} \in \hat{{\mathscr {Q}}}\), where \(\hat{{\mathscr {Q}}}\) is the \(\sigma \)-algebra generated by \(\{ Q_{n} \}_{n \in {\mathbb {N}}}\), such that \({\mathfrak {q}}(C\, \Delta \, {{\hat{C}}}) = 0\).

Essential uniqueness is understood above in the following sense: if \(\alpha \mapsto {\mathfrak {m}}^{1}_{\alpha }\) and \(\alpha \mapsto {\mathfrak {m}}^{2}_{\alpha }\) are two consistent disintegrations with the partition then \({\mathfrak {m}}^{1}_{\alpha }={\mathfrak {m}}^{2}_{\alpha }\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).

Finally, a set \(S \subset X\) is a section for the partition \(X = \cup _{\alpha \in Q}X_{\alpha }\) if for any \(\alpha \in Q\), \(S \cap X_\alpha \) is a singleton \(\left\{ x_\alpha \right\} \). By the axiom of choice, a section S always exists, and we may identify Q with S via the map \(Q \ni \alpha \mapsto x_\alpha \in S\). A set \(S_{{\mathfrak {m}}}\) is an \({\mathfrak {m}}\)-section if there exists \(Y \in {\mathscr {X}}\) with \({\mathfrak {m}}(X \setminus Y) = 0\) such that the partition \(Y = \cup _{\alpha \in Q_{\mathfrak {m}}} (X_{\alpha } \cap Y)\) has section \(S_{{\mathfrak {m}}}\), where \(Q_{{\mathfrak {m}}} = \left\{ \alpha \in Q ; X_{\alpha } \cap Y \ne \emptyset \right\} \). As \({\mathfrak {q}}= {\mathfrak {Q}}_{\sharp } {\mathfrak {m}}\), clearly \({\mathfrak {q}}(Q \setminus Q_{{\mathfrak {m}}}) = 0\). As usual, we identify \(Q_{{\mathfrak {m}}}\) with \(S_{{\mathfrak {m}}}\), so that now \(Q_{{\mathfrak {m}}}\) carries two measurable structures: \({\mathscr {Q}} \cap Q_{{\mathfrak {m}}}\) (the push-forward of \({\mathscr {X}} \cap Y\) via \({\mathfrak {Q}}\)), and also \({\mathscr {X}} \cap S_{{\mathfrak {m}}}\) via our identification. The last condition of Theorem 6.19 is that \({\mathscr {Q}} \cap Q_{{\mathfrak {m}}} \supset {\mathscr {X}} \cap S_{{\mathfrak {m}}}\), i.e. that the restricted quotient-map \({\mathfrak {Q}}|_{Y} : (Y,{\mathscr {X}} \cap Y) \rightarrow (S_{\mathfrak {m}}, {\mathscr {X}} \cap S_{{\mathfrak {m}}})\) is measurable, so that the full quotient-map \({\mathfrak {Q}}: (X,{\mathscr {X}}) \rightarrow (S , {\mathscr {X}} \cap S)\) is \({\mathfrak {m}}\)-measurable.

We will typically apply the Disintegration Theorem to \((E,{\mathcal {B}}(E),{\mathfrak {m}}\llcorner _{E})\), where \(E \subset X\) is an \({\mathfrak {m}}\)-measurable subset (with \({\mathfrak {m}}(E) > 0\)) of the m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\). As our metric space is separable, \({\mathcal {B}}(E)\) is countably generated, and so Theorem 6.19 applies. In particular, if \(Q \subset {\mathbb {R}}\), E is a closed subset of X, the partition elements \(X_\alpha \) are closed, and the quotient-map \({\mathfrak {Q}}: E \rightarrow Q\) is known to be Borel (for instance, when \({\mathfrak {Q}}\) is continuous), then [72, Theorem 5.4.3] guarantees the existence of a Borel section S for the partition so that \({\mathfrak {Q}}: E \rightarrow S\) is Borel measurable, thereby guaranteeing by Theorem 6.19 the existence of an essentially unique disintegration strongly consistent with \({\mathfrak {Q}}\).

7 Theory of \(L^1\)-Optimal-Transport

In this section we recall various results from the theory of \(L^1\)-Optimal-Transport which are relevant to this work, and add some new information we will subsequently require. We refer to [2, 13, 19, 23, 37, 38, 47, 76] for more details.

7.1 Preliminaries

To any 1-Lipschitz function \(u : X \rightarrow {\mathbb {R}}\) there is a naturally associated \(\mathsf {d}\)-cyclically monotone set:

$$\begin{aligned} \Gamma _{u} : = \{ (x,y) \in X\times X : u(x) - u(y) = \mathsf {d}(x,y) \}. \end{aligned}$$
(7.1)

Its transpose is given by \(\Gamma ^{-1}_{u}= \{ (x,y) \in X \times X : (y,x) \in \Gamma _{u} \}\). We define the transport relation \(R_u\) and the transport set \({\mathcal {T}}_{u}\), as

$$\begin{aligned} R_{u} := \Gamma _{u} \cup \Gamma ^{-1}_{u} ~,~ {\mathcal {T}}_{u} := P_{1}(R_{u} \setminus \{ x = y \}) , \end{aligned}$$
(7.2)

where \(\{ x = y\}\) denotes the diagonal \(\{ (x,y) \in X^{2} : x=y \}\) and \(P_{i}\) the projection onto the i-th component. Recall that \(\Gamma _u(x) = \left\{ y \in X \; ;\; (x,y) \in \Gamma _u\right\} \) denotes the section of \(\Gamma _u\) through x in the first coordinate, and similarly for \(R_u(x)\) (through either coordinate, by symmetry). Since u is 1-Lipschitz, \(\Gamma _{u}, \Gamma ^{-1}_{u}\) and \(R_{u}\) are closed sets, and so are \(\Gamma _u(x)\) and \(R_u(x)\). Consequently \({\mathcal {T}}_{u}\) is a projection of a Borel set and hence analytic; it follows that it is universally measurable and, in particular, \({\mathfrak {m}}\)-measurable [72].

The following is immediate to verify (see [2, Proposition 4.2]):

Lemma 7.1

Let \((\gamma _0,\gamma _1) \in \Gamma _{u}\) for some \(\gamma \in \mathrm{Geo}(X)\). Then \((\gamma _{s},\gamma _{t}) \in \Gamma _{u}\) for all \(0\le s \le t \le 1\).

Also recall the following definitions, introduced in [23]:

$$\begin{aligned} A_{+} : =&~\{ x \in {\mathcal {T}}_{u} : \exists z,w \in \Gamma _{u}(x), (z,w) \notin R_{u} \}, \\ A_{-} : =&~\{ x \in {\mathcal {T}}_{u} : \exists z,w \in \Gamma ^{-1}_{u}(x), (z,w) \notin R_{u} \}. \end{aligned}$$

\(A_{\pm }\) are called the sets of forward and backward branching points, respectively. Note that both \(A_{\pm }\) are analytic sets; for instance:

$$\begin{aligned} A_{+} = P_{1} (\{ (x,z,w) \in {\mathcal {T}}_{u}\times X \times X :(x,z), (x,w) \in \Gamma _{u}, \ (z,w) \notin R_{u} \}), \end{aligned}$$

showing that \(A_{+}\) is a projection of an analytic set and therefore analytic. If \(x \in A_{+}\) and \((y,x) \in \Gamma _{u}\) necessarily also \(y \in A_{+}\) (as \(\Gamma _{u}(y) \supset \Gamma _{u}(x)\) by the triangle inequality); similarly, if \(x \in A_{-}\) and \((x,y) \in \Gamma _{u}\) then necessarily \(y \in A_{-}\).
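On a finite metric space the objects just defined can be computed by brute force, which may help build intuition for the branching sets. The following Python sketch is purely illustrative (the tripod example and all names are our own choices, not from the text):

```python
from itertools import product

def transport_sets(d, u, tol=1e-9):
    """Brute-force computation of Gamma_u, R_u, T_u and the branching
    sets A_+/A_- of (7.1)-(7.2) on a finite metric space, given the
    distance matrix d and the values u of a 1-Lipschitz function."""
    n = len(u)
    pairs = list(product(range(n), repeat=2))
    # Gamma_u = {(x, y) : u(x) - u(y) = d(x, y)} (contains the diagonal)
    gamma = {(i, j) for (i, j) in pairs if abs(u[i] - u[j] - d[i][j]) < tol}
    gamma_inv = {(j, i) for (i, j) in gamma}
    r = gamma | gamma_inv            # transport relation R_u
    t = {i for (i, j) in r if i != j}  # transport set T_u

    def branching(rel):
        # points x in T_u admitting z, w in rel(x) with (z, w) not in R_u
        return {x for x in t
                if any((x, z) in rel and (x, w) in rel and (z, w) not in r
                       for (z, w) in pairs)}

    return gamma, r, t, branching(gamma), branching(gamma_inv)

# Toy "tripod": a center o (index 0) with three leaves at distance 1
# from o and distance 2 from each other; u = d(., o).
d = [[0, 1, 1, 1],
     [1, 0, 2, 2],
     [1, 2, 0, 2],
     [1, 2, 2, 0]]
u = [0, 1, 1, 1]
gamma, r, t, a_plus, a_minus = transport_sets(d, u)
# The center is a backward branching point (a_minus == {0}), while no
# point branches forward (a_plus == set()).
```
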

Consider the non-branched transport set

$$\begin{aligned} {\mathcal {T}}_{u}^{b} : = {\mathcal {T}}_{u} \setminus (A_{+} \cup A_{-}), \end{aligned}$$

which belongs to the sigma-algebra \(\sigma ({\mathcal {A}})\) generated by analytic sets and is therefore \({\mathfrak {m}}\)-measurable. Define the non-branched transport relation:

$$\begin{aligned} R_u^b := R_u \cap ({\mathcal {T}}_u^b \times {\mathcal {T}}_u^b) . \end{aligned}$$

It was shown in [23] (cf. [19]) that \(R_u^b\) is an equivalence relation over \({\mathcal {T}}_{u}^{b}\) and that for any \(x \in {\mathcal {T}}_{u}^{b}\), \(R_{u}(x) \subset (X,\mathsf {d})\) is isometric to a closed interval in \(({\mathbb {R}},\left| \cdot \right| )\).

Remark 7.2

Note that even if \(x \in {\mathcal {T}}_{u}^{b}\), the transport ray \(R_{u}(x)\) need not be entirely contained in \({\mathcal {T}}_{u}^{b}\). However, we will soon prove that almost every transport ray (with respect to an appropriate measure) has its relative interior contained in \({\mathcal {T}}_{u}^{b}\).

It will be very useful to note that whenever the space \((X,\mathsf {d})\) is proper (for instance when \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\)), \({\mathcal {T}}_{u}\) and \(A_{\pm }\) are \(\sigma \)-compact sets: indeed, writing \(R_{u} \setminus \{ x= y\} = \cup _{\varepsilon > 0} \big ( R_{u} \cap \{ \mathsf {d}(x,y) \ge \varepsilon \} \big )\) and noting that each \(R_{u} \cap \{ \mathsf {d}(x,y) \ge \varepsilon \}\) is closed and hence \(\sigma \)-compact in the proper space \(X \times X\), it follows that \(R_{u} \setminus \{ x= y\}\) is \(\sigma \)-compact. Hence \({\mathcal {T}}_{u}\), being its projection, is \(\sigma \)-compact. Moreover:

$$\begin{aligned} A_{+} = P_{1} \Big ( \{ (x,z,w) \in {\mathcal {T}}_{u} \times (R_{u})^{c} :(x,z), (x,w) \in \Gamma _{u} \} \Big ); \end{aligned}$$

since \((R_{u})^{c}\) is open and open sets are \(F_{\sigma }\) in metric spaces, it follows that \(\{ (x,z,w) \in {\mathcal {T}}_{u} \times (R_{u})^{c} :(x,z), (x,w) \in \Gamma _{u} \}\) is \(\sigma \)-compact and therefore \(A_{+}\) is \(\sigma \)-compact; the same applies to \(A_{-}\). Consequently, \({\mathcal {T}}_u^b\) and \(R_u^b\) are Borel.

Now, from the first part of the Disintegration Theorem 6.19 applied to \(({\mathcal {T}}_u^b , {\mathcal {B}}({\mathcal {T}}_u^b), {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}})\), we obtain an essentially unique disintegration of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}}\) consistent with the partition of \({\mathcal {T}}_{u}^{b}\) given by the equivalence classes \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\) of \(R_{u}^{b}\):

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}} = \int _{Q} {\mathfrak {m}}_{\alpha }\,{\mathfrak {q}}(d\alpha ) \end{aligned}$$

with corresponding quotient space \((Q, {\mathscr {Q}},{\mathfrak {q}})\) (\(Q \subset {\mathcal {T}}_u^b\) may be chosen to be any section of the above partition). The next step is to show that the disintegration is strongly consistent. By the Disintegration Theorem, this is equivalent to the existence of an \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}}\)-section \({{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) (which by a mild abuse of notation we will call an \({\mathfrak {m}}\)-section), such that the quotient map associated to the partition is \({\mathfrak {m}}\)-measurable, where we endow \({{\bar{Q}}}\) with the trace \(\sigma \)-algebra. This has already been shown in [19, Proposition 4.4] in the framework of non-branching metric spaces; since its proof does not use any non-branching assumption, we can conclude that

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}} = \int _{Q} {\mathfrak {m}}_{\alpha }\,{\mathfrak {q}}(d\alpha ), \quad \text {and for } {\mathfrak {q}}\text {-a.e. } \alpha \in Q, \quad {\mathfrak {m}}_{\alpha }(R_u^b(\alpha )) =1, \end{aligned}$$

where now \(Q \supset {{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) with \({{\bar{Q}}}\) an \({\mathfrak {m}}\)-section for the above partition (and hence \({\mathfrak {q}}\) is concentrated on \({{\bar{Q}}}\)). For a more constructive approach under the additional assumption of properness of the space, see also [25, Proposition 4.8].

A priori the non-branched transport set \({\mathcal {T}}_u^b\) can be much smaller than \({\mathcal {T}}_u\). However, under fairly general assumptions one can prove that the sets \(A_{\pm }\) of forward and backward branching are both \({\mathfrak {m}}\)-negligible. In [23] this was shown for a m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) verifying \({\mathsf {RCD}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\). The proof only relies on the following two properties which hold for the latter spaces (see also [25]):

  • \(\text {supp}({\mathfrak {m}}) = X\).

  • Given \(\mu _{0}, \mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _{0}\ll {\mathfrak {m}}\), there exists a unique optimal transference plan for the \(W_{2}\)-distance and it is induced by an optimal-transport map.

By Theorem 6.15 these properties are also verified for an essentially non-branching m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) satisfying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}})= X\). We summarize the above discussion in:

Corollary 7.3

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. satisfying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\). Then for any 1-Lipschitz function \(u : X \rightarrow {\mathbb {R}}\), we have \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\). In particular, we obtain the following essentially unique disintegration of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = {\mathfrak {m}}\llcorner _{{\mathcal {T}}^b_{u}}\), with quotient space \((Q,{\mathscr {Q}},{\mathfrak {q}})\), strongly consistent with the partition of \({\mathcal {T}}_{u}^{b}\) given by the equivalence classes \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\) of \(R_{u}^{b}\):

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = \int _{Q} {\mathfrak {m}}_{\alpha } \,{\mathfrak {q}}(d\alpha ), \quad \text {and for } {\mathfrak {q}}\text {-a.e. } \alpha \in Q, \quad {\mathfrak {m}}_{\alpha }(R_u^b(\alpha )) =1 . \end{aligned}$$
(7.3)

Here Q may be chosen to be a section of the above partition so that \(Q \supset {{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) with \({{\bar{Q}}}\) an \({\mathfrak {m}}\)-section with \({\mathfrak {m}}\)-measurable quotient map. In particular, \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\) and \({\mathfrak {q}}\) is concentrated on \({{\bar{Q}}}\).

Remark 7.4

By modifying the definitions of \(A_+,A_-\) to only reflect branching inside \(\text {supp}({\mathfrak {m}})\), it is possible to remove the assumption that \(\text {supp}({\mathfrak {m}}) = X\), but we refrain from this extraneous generality here.

Remark 7.5

If we consider \(u = \mathsf {d}(\cdot ,o)\), it is easy to check that the set \(A_{+}\) coincides with the cut locus \(C_{o}\), i.e. the set of those \(z \in X\) such that there exist at least two distinct geodesics starting at z and ending at o. Hence the previous corollary implies that for any \(o \in X\), the cut locus has \({\mathfrak {m}}\)-measure zero: \({\mathfrak {m}}(C_{o}) = 0\). This in particular implies that an essentially non-branching m.m.s. verifying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\) also supports a local (1, 1)-weak Poincaré inequality, see [69].

7.2 Maximality of transport rays on non-branched transport-set

It is elementary to check that \(\Gamma _{u}\) induces a partial order relation on X:

$$\begin{aligned} y \le _{u} x \;\;\; \Leftrightarrow \;\;\; (x,y) \in \Gamma _{u} . \end{aligned}$$
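Indeed, reflexivity and antisymmetry are immediate, and transitivity (spelled out here for completeness) follows by combining the triangle inequality with the 1-Lipschitz bound:

```latex
z \le_u y \ \text{ and } \ y \le_u x
\;\Longrightarrow\;
u(x) - u(z) = \big( u(x) - u(y) \big) + \big( u(y) - u(z) \big)
            = \mathsf{d}(x,y) + \mathsf{d}(y,z)
            \ge \mathsf{d}(x,z) \ge u(x) - u(z),
% so all inequalities are equalities, i.e. (x,z) \in \Gamma_u and z \le_u x.
```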

Note that by definition:

$$\begin{aligned}&x \in A_+ ~,~ y \ge _u x \;\; \quad \Rightarrow \;\; y \in A_+, \\&x \in A_- ~,~ y \le _u x \;\; \quad \Rightarrow \;\; y \in A_-. \end{aligned}$$

Recall that for any \(x \in {\mathcal {T}}_{u}^{b}\), \((R_{u}(x),\mathsf {d})\) is isometric to a closed interval in \(({\mathbb {R}},\left| \cdot \right| )\). This isometry induces a total ordering on \(R_u(x)\) which must coincide with either \(\le _u\) or \(\ge _u\), implying that \((R_u(x), \le _u)\) is totally ordered.

Lemma 7.6

For any \(x \in {\mathcal {T}}_{u}^{b}\), the set \(R_u^b(x) = R_u(x) \cap {\mathcal {T}}_u^b\), endowed with \(\mathsf {d}\), is isometric to an interval in \(({\mathbb {R}},|\cdot |)\).

Proof

Consider \(z,w \in R_{u}(x) \cap {\mathcal {T}}_{u}^{b}\); as \((R_u(x), \le _u)\) is totally ordered, assume without loss of generality that \(z \le _u w\). Given \(y \in R_u(x)\) with \(z \le _u y \le _u w\), we must prove that \(y \in {\mathcal {T}}_{u}^{b}\). Indeed, since \(w \ge _u y\) and \(w \notin A_{+}\), necessarily \(y \notin A_{+}\), and since \(z \le _u y\) and \(z \notin A_{-}\), necessarily \(y \notin A_{-}\). Hence \(y \in {\mathcal {T}}_{u}^{b}\) and the claim follows. \(\square \)

Recall that given a partially ordered set, a chain is a totally ordered subset. A chain is called maximal if it is maximal with respect to inclusion. We introduce the following:

Definition 7.7

(Transport Ray) A maximal chain R in \((X,\mathsf {d},\le _u)\) is called a transport ray if it is isometric to a closed interval I in \(({\mathbb {R}},\left| \cdot \right| )\) of positive (possibly infinite) length.

In other words, a transport ray R is the image of a closed non-null geodesic \(\gamma \) parametrized by arclength on I so that the function \(u \circ \gamma \) is affine with slope 1 on I, and so that R is maximal with respect to inclusion.

Lemma 7.8

Given \(x \in {\mathcal {T}}_u^b\), R is a transport ray passing through x if and only if \(R = R_u(x)\).

Proof

Recall that for any \(x \in {\mathcal {T}}_{u}^{b}\), \((R_u(x), \mathsf {d},\le _u)\) is order isometric to a closed interval in \(({\mathbb {R}},\left| \cdot \right| )\). As \(R_u(x)\) is by definition maximal in X with respect to inclusion, it follows that it must be a transport ray.

Conversely, note that for any transport ray R we always have \(R \subset \cap _{w \in R} R_u(w)\). Indeed, for any \(w,z \in R\), we have \(z \le _u w\) or \(z \ge _u w\), and hence by definition \((w,z) \in R_u\) so that \(z \in R_u(w)\). If \(x \in R \cap {\mathcal {T}}_u^b\), we already showed above that \(R_{u}(x)\) is a transport ray. Since \(R \subset R_u(x)\) and R is assumed to be maximal with respect to inclusion, it follows that necessarily \(R = R_u(x)\). \(\square \)

Corollary 7.9

If \(R_1\) and \(R_2\) are two transport rays which intersect in \({\mathcal {T}}_u^b\) then they must coincide.

In this subsection, we reconcile the crucial maximality property of \(R_u(\alpha )\), which we will require for the definition of \({\mathsf {CD}}^1\) in the next section, with the fact that the disintegration in (7.3) is with respect to the (possibly non-maximal) \(R_u^b(\alpha ) = R_u(\alpha ) \cap {\mathcal {T}}_u^b\). We will show that under \({\mathsf {MCP}}\), for \({\mathfrak {q}}\)-a.e. \(\alpha \), the only parts of \(R_{u}(\alpha )\) which are possibly not contained in \({\mathcal {T}}_{u}^{b}\) are its end points; this fact is the main new result of this section.

To rigorously state this new observation, we recall the classical definition of initial and final points, \({\mathfrak {a}}\) and \({\mathfrak {b}}\), respectively:

$$\begin{aligned} {\mathfrak {a}} : =&~\{ x \in {\mathcal {T}}_{u} :\not \exists y \in {\mathcal {T}}_{u}, \ (y,x) \in \Gamma _{u}, \ y \ne x \}, \\ {\mathfrak {b}} : =&~\{ x \in {\mathcal {T}}_{u} :\not \exists y \in {\mathcal {T}}_{u}, \ (x,y) \in \Gamma _{u}, \ y \ne x \}. \end{aligned}$$

Note that

$$\begin{aligned} {\mathfrak {a}} = {\mathcal {T}}_{u} \setminus P_{2}\big ( \Gamma _u \setminus \{ x= y \} \big ) , \end{aligned}$$

so \({\mathfrak {a}}\) is the difference of analytic sets and consequently belongs to \(\sigma ({\mathcal {A}})\); similarly for \({\mathfrak {b}}\). As in the previous subsection, whenever \((X,\mathsf {d})\) is proper, \({\mathfrak {a}}, {\mathfrak {b}}\) are in fact Borel sets.

Theorem 7.10

(Maximality of transport rays on non-branched transport-set) Let \((X,\mathsf {d}, {\mathfrak {m}})\) be an essentially non-branching m.m.s.verifying \({\mathsf {MCP}}(K,N)\) and \(\text {supp}({\mathfrak {m}}) = X\). Let \(u : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\) be any 1-Lipschitz function, with (7.3) the associated disintegration of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_u}\).

Then there exists \({{\hat{Q}}} \subset Q\) such that \({\mathfrak {q}}(Q \setminus {{\hat{Q}}}) = 0\) and for any \(\alpha \in {{\hat{Q}}}\) it holds:

$$\begin{aligned} R_{u}(\alpha ) \setminus {\mathcal {T}}_{u}^{b} \subset {\mathfrak {a}} \cup {\mathfrak {b}}. \end{aligned}$$

In particular, for every \(\alpha \in {{\hat{Q}}}\):

$$\begin{aligned} R_u(\alpha ) = \overline{R_u^b(\alpha )} \supset R_u^b(\alpha ) \supset \mathring{R}_u(\alpha ) \end{aligned}$$

(with the latter interpreted as the relative interior).

Proof

Step 1. Consider the \({\mathfrak {m}}\)-section \({{\bar{Q}}}\) from Corollary 7.3 so that \(Q \supset {{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_{u}^{b})\), \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\) and \({\mathfrak {q}}(Q \setminus {{\bar{Q}}}) = 0\). Consider the set

$$\begin{aligned} Q_{1} : = \{ \alpha \in {{\bar{Q}}} :R_{u}(\alpha ) \setminus {\mathcal {T}}_{u}^{b} \nsubseteq {\mathfrak {a}}\cup {\mathfrak {b}} \}. \end{aligned}$$

The claim will be proved once we show that \({\mathfrak {q}}(Q_{1})=0\). First, observe that

$$\begin{aligned} Q_{1} = {{\bar{Q}}} \cap P_{1} \Big ( R_{u} \cap \big ( {\mathcal {T}}_{u}^{b} \times (( A_{+}\setminus {\mathfrak {a}}) \cup (A_{-}\setminus {\mathfrak {b}})) \big ) \Big ), \end{aligned}$$

and therefore \(Q_{1} \subset {{\bar{Q}}}\) is analytic; since \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\), it follows that \(Q_1\) is \({\mathfrak {q}}\)-measurable. Now suppose by contradiction that \({\mathfrak {q}}(Q_{1})> 0\).

We can divide \(Q_{1}\) into two sets:

$$\begin{aligned} Q_{1}^{+} : = \{ \alpha \in Q_1 :\Gamma _{u}(\alpha ) \setminus {\mathcal {T}}_{u}^{b} \nsubseteq {\mathfrak {b}} \}, \quad Q_{1}^{-} : = \{ \alpha \in Q_1 :\Gamma ^{-1}_{u}(\alpha ) \setminus {\mathcal {T}}_{u}^{b} \nsubseteq {\mathfrak {a}} \} . \end{aligned}$$

Since \(Q_{1} = Q_{1}^{+} \cup Q_{1}^{-}\), without loss of generality let us assume \({\mathfrak {q}}(Q_{1}^{+}) > 0\), and for ease of notation assume further that \(Q_{1}^{+} = Q_{1}\).

Hence, for any \(\alpha \in Q_{1}\), there exists \(z \in \Gamma _{u}(\alpha )\) such that \(z \notin {\mathcal {T}}_{u}^{b}\) and \(z \notin {\mathfrak {b}}\); note that necessarily \(z \in A_{-}\). Recall that for all \(\alpha \in Q\), \(R_{u}(\alpha )\) and hence \(\Gamma _u(\alpha )\) are isometric via the map u to closed intervals, and hence \(\Gamma _u(\alpha ) \setminus (\{ \alpha \} \cup {\mathfrak {b}})\) is isometric to an open interval. Since \(\Gamma _u(\alpha ) \cap {\mathcal {T}}_{u}^{b}\) is isometric to an interval and contains \(\alpha \), it follows that for \(\alpha \in Q_1\), there exist distinct \(a_\alpha ,b_\alpha \in \Gamma _{u}(\alpha ) \setminus {\mathcal {T}}_{u}^{b}\) so that

$$\begin{aligned} (u(b_{\alpha }), u(a_{\alpha })) \subset u (\Gamma _{u}(\alpha ) \setminus {\mathcal {T}}_{u}^{b}) \end{aligned}$$

is a non-empty open interval. Moreover, we may select \(a_{\alpha }\) and \(b_\alpha \) to be \({\mathfrak {q}}\)-measurable functions on \(Q_1\). To see this, consider the set \(\Sigma : = \{ (\alpha , x ,y) \in Q_{1} \times \Gamma _{u} :x \in A_{-} , \ (\alpha ,x) \in \Gamma _{u},\ \mathsf {d}(x,y) > 0\}\), and observe that it is analytic (being the intersection of analytic sets), and that \(P_1(\Sigma ) = Q_1\). By von Neumann’s selection Theorem (see [72, Theorem 5.5.2]), there exists a \(\sigma ({\mathcal {A}})\)-measurable selection of \(\Sigma \):

$$\begin{aligned} Q_{1} \ni \alpha \rightarrow (a_{\alpha },b_{\alpha }), \end{aligned}$$

and so in particular these functions are \({\mathfrak {q}}\)-measurable. It follows that

$$\begin{aligned} Q_{1} \ni \alpha \rightarrow u(a_{\alpha }), \qquad Q_{1} \ni \alpha \rightarrow u(b_{\alpha }), \end{aligned}$$

are also \(\sigma ({\mathcal {A}})\)-measurable and hence \({\mathfrak {q}}\)-measurable. Possibly restricting \(Q_{1}\), by Lusin’s Theorem we can also assume that the above functions are continuous.

Step 2. By Fubini’s Theorem

$$\begin{aligned} 0< \int _{Q_{1}} (u(a_{\alpha })-u(b_{\alpha })) \, {\mathfrak {q}}(d\alpha ) = \int _{{\mathbb {R}}} {\mathfrak {q}}\Big (\{\alpha \in Q_{1} :u (b_{\alpha })< t < u (a_{\alpha })\} \Big )\, dt . \end{aligned}$$

Hence there exist \(c \in {\mathbb {R}}\) and \(Q_{1,c} \subset Q_{1}\) with \({\mathfrak {q}}(Q_{1,c}) > 0\), such that for any \(\alpha \in Q_{1,c}\) it holds that \(c \in (u(b_{\alpha }), u(a_{\alpha }))\); in particular for any \(\alpha \in Q_{1,c}\) there exists a unique \(z_{\alpha } \in \Gamma _{u}(\alpha )\) such that \(u(z_{\alpha }) =c\). Furthermore, we can assume that \(Q_{1,c}\) is compact, and hence by continuity of \(u(a_\alpha )\) it follows that

$$\begin{aligned} \exists \varepsilon> 0 \;\;\; \forall \alpha \in Q_{1,c} \;\;\; u(a_{\alpha }) - c > \varepsilon . \end{aligned}$$

Then define the following set:

$$\begin{aligned} \Lambda : = \{ (\alpha , x,z) \in Q_{1,c} \times \Gamma _{u} :(\alpha ,x) \in R_{u}^{b}, \ u(z) =c\}. \end{aligned}$$

Recall that \(R_u^b\) is Borel since \((X,\mathsf {d})\) is proper, and therefore \(\Lambda \) is Borel. Note by the aforementioned discussion that \(P_1(\Lambda ) = Q_{1,c}\). Also note that for \((\alpha ,x,z) \in \Lambda \), since \(R_{u}(\alpha )\) is isometric to a closed interval, necessarily \(z = z_{\alpha }\). Finally, we claim that \(P_{2,3} (\Lambda )\) is \(\mathsf {d}^{2}\)-cyclically monotone: for \((x_{1},z_{1}), (x_{2},z_{2}) \in P_{2,3} (\Lambda )\) observe that

$$\begin{aligned} \mathsf {d}(x_{1},z_{1}) = u(x_{1}) - u(z_{1}) = u(x_{1}) - c = u(x_{1}) - u(z_{2}) \le \mathsf {d}(x_{1},z_{2}). \end{aligned}$$

Hence for \(\{(x_{i}, z_{i})\}_{i \le n } \subset P_{2,3} (\Lambda )\), setting \(z_{n+1} = z_{1}\),

$$\begin{aligned} \sum _{i\le n} \mathsf {d}^{2}(x_{i},z_{i}) \le \sum _{i\le n} \mathsf {d}^{2}(x_{i},z_{i+1}), \end{aligned}$$

and the monotonicity follows. We can then define a function T by imposing \({\text {graph}}(T) = P_{2,3}(\Lambda )\); note that \(P_{2,3}(\Lambda )\) is analytic and therefore T is Borel measurable (see [72, Theorem 4.5.2]).
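For finitely many pairs in Euclidean space, \(\mathsf {d}^2\)-cyclical monotonicity can be verified numerically by brute force over all reorderings; the following hedged Python sketch (the examples and names are our own, not from the text) illustrates why pairs flowing to a common level set of u, as in Step 2, pass the test while a "crossing" configuration fails it:

```python
from itertools import permutations

def d2(p, q):
    # squared Euclidean distance between points p, q of R^k
    return sum((a - b) ** 2 for a, b in zip(p, q))

def is_cyclically_monotone(pairs):
    """Brute-force d^2-cyclical monotonicity: for every reassignment of
    the second coordinates (covering every cyclic permutation of every
    subset), the original pairing is no more expensive."""
    xs = [p[0] for p in pairs]
    zs = [p[1] for p in pairs]
    base = sum(d2(x, z) for x, z in zip(xs, zs))
    return all(base <= sum(d2(x, zs[i]) for x, i in zip(xs, sigma))
               for sigma in permutations(range(len(pairs))))

# As in Step 2: horizontal rays for u(x) = x[0] flowing to the level set
# {u = c} with c = 0, so every z_i lies on the line {x[0] = 0}.
rays_to_level_set = [((2.0, 0.0), (0.0, 0.0)),
                     ((1.0, 3.0), (0.0, 3.0)),
                     ((0.5, -1.0), (0.0, -1.0))]
# A crossing configuration, which is not cyclically monotone:
crossing = [((0.0, 0.0), (1.0, 1.0)), ((1.0, 1.0), (0.0, 0.0))]
```
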

Step 3. Consider now the measure

$$\begin{aligned} \eta _{0} : = \int _{Q_{1,c}} {\mathfrak {m}}_{\alpha }\,{\mathfrak {q}}(d\alpha ), \end{aligned}$$

and since \({\mathfrak {q}}(Q_{1,c})> 0\) it follows that \(\eta _{0}(X) > 0\); note that \(\eta _{0}\) is concentrated on \({\text {Dom}}\,(T) = \cup _{\alpha \in Q_{1,c}} R_{u}^b(\alpha )\). Hence there exist \(x \in X\) and \(r > 0\) such that \(\eta _{0}(B_{r}(x)) > 0\), and we redefine \(\eta _0\) to be the probability measure obtained by conditioning \(\eta _0\) to \(B_r(x)\). Clearly \(\eta _{0} \ll {\mathfrak {m}}\). Finally we define \(\eta _{1} : = T_{\sharp } \,\eta _{0}\). By Step 2 and Theorem 6.15, the map T is the unique Optimal-Transport map between \(\eta _{0}\) and \(\eta _{1}\) for the \(W_{2}\)-distance (as its graph is contained in a \(\mathsf {d}^2\)-cyclically monotone set). Consider moreover the unique element \(\nu \) of \(\mathrm {OptGeo}(\eta _{0},\eta _{1})\); then for \(\nu \)-a.e. \(\gamma \) it holds that

$$\begin{aligned} \gamma _{0} \in {\text {Dom}}\,(T) \cap B_{r}(x) \subset {\mathcal {T}}_u^b \;\; , \;\; u(\gamma _{1}) = c \;\; , \;\; (\gamma _0,\gamma _1) \in \Gamma _u . \end{aligned}$$

It follows in particular by Lemma 7.1 that \(\gamma _{s} \in \Gamma _{u}(\gamma _{0})\) for all \(s \in [0,1]\).

Recalling that \(u(a_{\alpha }) - c > \varepsilon \) for all \(\alpha \in Q_{1,c}\), that \(u(a_{\alpha }) \le M\) for some finite M by continuity of \(\alpha \mapsto u(a_{\alpha })\) on the compact \(Q_{1,c}\), and that the support of \(\eta _0\) is bounded, it follows that there exists \({{\bar{t}}} \in (0,1)\) such that for \(\nu \)-a.e. \(\gamma \), \(\gamma _{{{\bar{t}}}} \in {\mathcal {T}}_{u} \setminus {\mathcal {T}}_{u}^{b} \subset A_{+} \cup A_{-}\). Since \({\mathfrak {m}}(A_{+} \cup A_{-}) = 0\), necessarily \((\mathrm{e}_{{{\bar{t}}}})_{\sharp } \nu \perp {\mathfrak {m}}\), but this is in contradiction with the assertion of Theorem 6.15 that \((\mathrm{e}_{{{\bar{t}}}})_{\sharp } \nu \ll {\mathfrak {m}}\) since \(\eta _{0} \ll {\mathfrak {m}}\) and \({{\bar{t}}} < 1\). The claim follows. \(\square \)

8 The \({\mathsf {CD}}^{1}\) condition

In this section we introduce the \({\mathsf {CD}}^1(K,N)\) condition, which plays a cardinal role in this work. As a first step towards understanding this new condition, we show that it always implies \({\mathsf {MCP}}_{\varepsilon }(K,N)\) (and \({\mathsf {MCP}}(K,N)\)), without requiring any type of non-branching assumption. By analogy, we also introduce the \({\mathsf {MCP}}^1(K,N)\) condition, which may be of independent interest.

8.1 Definitions of \({\mathsf {CD}}^{1}\) and \({\mathsf {MCP}}^1\)

We first assume that \(\text {supp}({\mathfrak {m}}) = X\). Note that we do not assume that the transport rays \(\{X_{\alpha }\}_{\alpha \in Q}\) below are disjoint or have disjoint relative interiors, in an attempt to obtain a useful definition also for m.m.s.’s which may have significant branching. However, throughout most of this work, we will typically assume in addition that the space is essentially non-branching, in which case an equivalent definition will be presented in Proposition 8.13 below.

Definition 8.1

(\({\mathsf {CD}}^1_{u}(K,N)\) when \(\text {supp}({\mathfrak {m}}) = X\)) Let \((X,\mathsf {d},{\mathfrak {m}})\) denote a m.m.s. with \(\text {supp}({\mathfrak {m}}) = X\), let \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\), and let \(u : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\) denote a 1-Lipschitz function. \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {CD}}^{1}_{u}(K,N)\) condition if there exists a family \(\{X_{\alpha }\}_{\alpha \in Q}\) of subsets of X, such that:

  (1)

    There exists a disintegration of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}}\) on \(\{X_{\alpha }\}_{\alpha \in Q}\):

    $$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = \int _{Q} {\mathfrak {m}}_{\alpha } \, {\mathfrak {q}}(d\alpha ), \quad \text {with } \quad {\mathfrak {m}}_{\alpha }(X_{\alpha }) = 1, \text { for } {\mathfrak {q}}\text {-a.e. }\alpha \in Q . \end{aligned}$$
    (8.1)
  (2)

    For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \(X_\alpha \) is a transport ray for \(\Gamma _u\) (recall Definition 7.7).

  (3)

    For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \({\mathfrak {m}}_\alpha \) is supported on \(X_\alpha \).

  (4)

    For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), the m.m.s. \((X_{\alpha }, \mathsf {d},{\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {CD}}(K,N)\).

We take this opportunity to define an analogous variant of \({\mathsf {MCP}}\):

Definition 8.2

(\({\mathsf {MCP}}^1_{u}(K,N)\) when \(\text {supp}({\mathfrak {m}}) = X\)) Let \((X,\mathsf {d},{\mathfrak {m}})\) denote a m.m.s. with \(\text {supp}({\mathfrak {m}}) = X\), let \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\), let \(o \in X\) and denote the 1-Lipschitz function \(u := \mathsf {d}(\cdot ,o)\). \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {MCP}}^{1}_{u}(K,N)\) condition if there exists a family \(\{X_{\alpha }\}_{\alpha \in Q}\) of subsets of X, such that conditions (1)–(3) above hold, together with:

  (4’)

    For \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), the m.m.s. \((X_{\alpha }, \mathsf {d},{\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\) with respect to \(o \in X_\alpha \).

Remark 8.3

Note that when \(u= \mathsf {d}(\cdot ,o)\) then necessarily \({\mathcal {T}}_{u} = X\) (if X is not a singleton). In addition \((x,o) \in \Gamma _{u}\) for any \(x \in X\), and hence by maximality of a transport ray, we must have \(o \in X_{\alpha }\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), and by condition (3) we deduce that \(o \in \text {supp}({\mathfrak {m}}_\alpha )\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\). As \({\mathsf {CD}}(K,N)\) implies \({\mathsf {MCP}}(K,N)\) (in the one-dimensional case this is a triviality), we obviously see that \({\mathsf {CD}}^1_u(K,N)\) implies \({\mathsf {MCP}}^1_u(K,N)\) for all \(u = \mathsf {d}(\cdot ,o)\).

We will focus on a particular class of 1-Lipschitz functions.

Definition

(Signed Distance Function) Given a continuous function \(f : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\) so that \(\left\{ f = 0\right\} \ne \emptyset \), the function

$$\begin{aligned} d_{f} : X \rightarrow {\mathbb {R}}, \qquad d_{f}(x) : = \text {dist}(x, \{ f = 0 \} ) \, \text {sgn}(f(x)), \end{aligned}$$
(8.2)

is called the signed distance function (from the zero-level set of f).
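For intuition, here is a minimal numerical sketch of (8.2) on \((\mathbb {R},\left| \cdot \right| )\), with the hypothetical choice \(f(x) = x^2 - 1\), so that \(\{f = 0\} = \{\pm 1\}\) and \(d_f\) is negative inside \((-1,1)\) and positive outside:

```python
def signed_distance(f, zero_set, x):
    """d_f(x) = dist(x, {f=0}) * sgn(f(x)), as in (8.2), for a finite
    representation zero_set of the zero-level set of f."""
    dist = min(abs(x - z) for z in zero_set)
    sgn = (f(x) > 0) - (f(x) < 0)
    return dist * sgn

f = lambda x: x * x - 1.0          # {f = 0} = {-1, +1}
zeros = [-1.0, 1.0]

# d_f is negative inside (-1, 1), positive outside, zero on the level set.
assert signed_distance(f, zeros, 0.0) == -1.0
assert signed_distance(f, zeros, 2.0) == 1.0
assert signed_distance(f, zeros, 1.0) == 0.0
```

On this example \(d_f\) is piecewise linear with slopes \(\pm 1\), consistent with the 1-Lipschitz property established in Lemma 8.4 below.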

Lemma 8.4

\(d_f\) is 1-Lipschitz on \(\left\{ f \ge 0\right\} \) and \(\left\{ f \le 0\right\} \). If \((X,\mathsf {d})\) is a length space, then \(d_f\) is 1-Lipschitz on the entire X.

Proof

Given \(x,y \in X\) with \(f(x) f(y) \ge 0\), the assertion follows by the usual triangle inequality, valid for any metric space:

$$\begin{aligned} \left| d_f(x) - d_f(y)\right| = \left| \text {dist}(x,\left\{ f=0\right\} ) - \text {dist}(y,\left\{ f=0\right\} )\right| \le \mathsf {d}(x,y) . \end{aligned}$$

When \(f(x) f(y) < 0\), and given \(\varepsilon > 0\), let \(\gamma : [0,1] \rightarrow X\) denote a continuous path with \(\gamma _0 = x\), \(\gamma _1=y\) and \(\ell (\gamma ) \le \mathsf {d}(x,y) + \varepsilon \). By continuity, it follows that there exists \(t \in (0,1)\) so that \(f(\gamma _t) = 0\). It follows that

$$\begin{aligned} \left| d_f(x) - d_f(y)\right|= & {} \text {dist}(x,\left\{ f = 0\right\} ) + \text {dist}(y,\left\{ f=0\right\} ) \\\le & {} \mathsf {d}(x,\gamma _t) + \mathsf {d}(y,\gamma _t) \le \ell (\gamma ) \le \mathsf {d}(x,y) + \varepsilon . \end{aligned}$$

As \(\varepsilon > 0\) was arbitrary, the assertion is proved. \(\square \)

Remark 8.5

To extend Remark 8.3 to more general signed distance functions, we will need to require that \((X,\mathsf {d})\) is proper, and in that case \({\mathcal {T}}_{d_{f}} \supset X \setminus \{f =0\}\). Indeed, given \(x \in X \setminus \{f =0 \}\), consider a point \(z \in \{ f= 0 \}\) minimizing the distance to x (such a minimizer exists since, by properness, closed bounded sets are compact). Then \((x,z) \in R_{d_{f}}\), and as \(x \ne z\) it follows that \(x \in {\mathcal {T}}_{d_{f}}\).

We now remove the restriction that \(\text {supp}({\mathfrak {m}}) = X\) and introduce the main new definitions of this work:

Definition 8.6

(\({\mathsf {CD}}^1_{Lip}(K,N)\), \({\mathsf {CD}}^1(K,N)\) and \({\mathsf {MCP}}^1(K,N)\)) Let \((X,\mathsf {d},{\mathfrak {m}})\) denote a m.m.s. and let \(K \in {\mathbb {R}}\) and \(N \in [1,\infty ]\).

  • \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {CD}}^{1}_{Lip}(K,N)\) condition if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{u}(K,N)\) for all 1-Lipschitz functions \(u : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\).

  • \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify the \({\mathsf {CD}}^{1}(K,N)\) condition if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{d_f}(K,N)\) for all continuous functions \(f : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\) so that \(\left\{ f=0\right\} \ne \emptyset \) and \(d_f : (\text {supp}({\mathfrak {m}}) , \mathsf {d}) \rightarrow {\mathbb {R}}\) is 1-Lipschitz.

  • \((X,\mathsf {d},{\mathfrak {m}})\) is said to verify \({\mathsf {MCP}}^{1}(K,N)\) if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}_u^1(K,N)\) for all functions \(u(x) = \mathsf {d}(x,o)\) with \(o \in \text {supp}({\mathfrak {m}})\).

Remark 8.7

Clearly \({\mathsf {CD}}^1_{Lip}(K,N) \Rightarrow {\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {MCP}}^1(K,N)\) in view of Remark 8.3. Note that we do not a priori know that \(d_f\) is 1-Lipschitz, since we do not know that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length-space (see Lemma 8.4); nevertheless, we will shortly see that the \({\mathsf {CD}}^1(K,N)\) condition implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) must be a geodesic space, and hence the requirement “so that \(d_f\) is 1-Lipschitz” is in fact redundant.

Remark 8.8

By definition, the \({\mathsf {CD}}^1_{Lip}\), \({\mathsf {CD}}^1\) and \({\mathsf {MCP}}^1\) conditions hold for \((X,\mathsf {d},{\mathfrak {m}})\) iff they hold for \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\). It is also possible to introduce a definition of \({\mathsf {CD}}^1_{u}\) and \({\mathsf {MCP}}^1_u\) which applies to \((X,\mathsf {d},{\mathfrak {m}})\) directly, without passing through \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\)—this would involve requiring that the transport rays \(\left\{ X_\alpha \right\} \) are maximal inside \(\text {supp}({\mathfrak {m}})\), and in the case of \({\mathsf {CD}}^1_u\) would only apply to functions u which are 1-Lipschitz on \(\text {supp}({\mathfrak {m}})\) (these may be extended to the entire X by McShane’s theorem). Our choice to use a tautological approach is motivated by the analogous situation for the more classical \(W_2\) definitions of curvature-dimension (see Remark 6.11) and is purely for convenience, so as not to overload the definitions.

8.2 \({\mathsf {MCP}}^{1}\) implies \({\mathsf {MCP}}_\varepsilon \)

Proposition 8.9

Let \((X,\mathsf {d},{\mathfrak {m}})\) be a m.m.s. verifying \({\mathsf {MCP}}^{1}(K,N)\) with \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\) (in particular, this holds if it verifies \({\mathsf {CD}}^1_{Lip}(K,N)\) or \({\mathsf {CD}}^1(K,N)\)). Then it verifies \({\mathsf {MCP}}_{\varepsilon }(K,N)\).

Proof

We will show that \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {MCP}}_{\varepsilon }(K,N)\), and consequently so will \((X,\mathsf {d},{\mathfrak {m}})\). By Remark 8.8, we may therefore assume that \(\text {supp}({\mathfrak {m}}) = X\). Fix any \(o \in X\) and consider the 1-Lipschitz function \(u (x) : = \mathsf {d}(x,o)\). From \({\mathsf {MCP}}^{1}(K,N)\) and Remark 8.5 we deduce the existence of a disintegration of \({\mathfrak {m}}\) on \({\mathcal {T}}_u = X\) along a family of Borel sets \(\{ X_{\alpha }\}_{\alpha \in Q}\):

$$\begin{aligned} {\mathfrak {m}}= \int _{Q} {\mathfrak {m}}_{\alpha } \, {\mathfrak {q}}(d\alpha ), \quad {\mathfrak {m}}_{\alpha } (X_{\alpha }) = 1, \ \text {for } {\mathfrak {q}}\text {-a.e. } \alpha \in Q, \end{aligned}$$

so that \(X_\alpha \) is a transport ray for \(\Gamma _u\), \({\mathfrak {m}}_\alpha \) is supported on \(X_\alpha \) and \((X_{\alpha }, \mathsf {d}, {\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\) with respect to \(o \in X_\alpha \), for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).

Now consider any \(\mu _0 \in {\mathcal {P}}(X)\) with \(\mu _0 \ll {\mathfrak {m}}\), so that \(\rho _0 := \frac{d\mu _0}{d{\mathfrak {m}}}\) has bounded support. By measurability of the disintegration, the function \(Q \ni \alpha \mapsto z_\alpha := \int \rho _0(x) {\mathfrak {m}}_{\alpha }(dx)\) is \({\mathfrak {q}}\)-measurable, and hence \({\bar{Q}} := \left\{ \alpha \in Q \; ; \; z_\alpha \in (0,\infty )\right\} \) is \({\mathfrak {q}}\)-measurable. Clearly \(\int _{{\bar{Q}}} z_\alpha {\mathfrak {q}}(d\alpha ) = \int _{Q} z_\alpha {\mathfrak {q}}(d\alpha ) = 1\) since \(z_\alpha < \infty \) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).

Define \(\mu _0^\alpha := \frac{1}{z_\alpha } \rho _0 {\mathfrak {m}}_\alpha \in {\mathcal {P}}(X_\alpha )\) for all \(\alpha \in {\bar{Q}}\). Since for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), the one-dimensional (non-branching) \((X_{\alpha },\mathsf {d})\) contains o, there exists a unique element \(\nu ^\alpha \) of \(\mathrm {OptGeo}(\mu _{0}^{\alpha }, \delta _{o}) \cap {\mathcal {P}}(\mathrm{Geo}(X_{\alpha }))\) where \(\mathrm{Geo}(X_{\alpha })\) denotes the space of geodesics in \(X_{\alpha }\). Define then

$$\begin{aligned} \nu : = \int _{{\bar{Q}}} \nu ^{\alpha } z_\alpha \,{\mathfrak {q}}(d\alpha ), \end{aligned}$$
(8.3)

and observe that \((\mathrm{e}_{0})_{\sharp } \nu = \rho _0 {\mathfrak {m}}= \mu _0\) and \((\mathrm{e}_{1})_{\sharp } \nu = \delta _{o}\). To conclude that \(\nu \in \mathrm {OptGeo}(\mu _0, \delta _{o})\) we must show that \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu = : \mu _{t}\) is a \(W_{2}\)-geodesic. Indeed, for any \(0 \le s < t \le 1\), consider the transference plan \((\mathrm{e}_{s},\mathrm{e}_{t})_{\sharp } \nu \) between \(\mu _s\) and \(\mu _t\), yielding:

$$\begin{aligned} W_{2}^{2}(\mu _{s}, \mu _{t})&~ \le \int _{{\bar{Q}}} \int _{X_{\alpha } \times X_{\alpha }} \mathsf {d}^{2}(x,y) (\mathrm{e}_{s},\mathrm{e}_{t})_{\sharp } \nu ^{\alpha } (dxdy) z_\alpha \, {\mathfrak {q}}(d\alpha ) \\&~ = \int _{{\bar{Q}}} (t-s)^{2} \int _{X_{\alpha } \times X_{\alpha }} \mathsf {d}^{2}(x,y) (\mathrm{e}_{0},\mathrm{e}_{1})_{\sharp } \nu ^{\alpha } (dxdy) z_\alpha \, {\mathfrak {q}}(d\alpha ) \\ {}&~ = (t-s)^{2} \int _{{\bar{Q}}} \int _{X_{\alpha }} \mathsf {d}^{2}(x,o) \mu _0^\alpha (dx) z_\alpha \,{\mathfrak {q}}(d\alpha ) \\&~ = (t-s)^{2} \int _{Q} \int _{X_{\alpha }} \mathsf {d}^{2}(x,o) \rho _0(x) {\mathfrak {m}}_\alpha (dx) \,{\mathfrak {q}}(d\alpha ) \\&~ = (t-s)^{2} \int _{X} \mathsf {d}^{2}(x,o) \rho _0(x) {\mathfrak {m}}(dx) \\&~ = (t-s)^{2} W_{2}^{2}(\mu _0,\delta _{o}). \end{aligned}$$

By the triangle inequality, it follows that \(t \mapsto \mu _{t}\) must indeed be a geodesic in \(({\mathcal {P}}_2(X),W_2)\). Note that this property is particular to transportation to a delta measure.
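The linear-contraction identity just used, \(W_{2}(\mu _s,\mu _t) = (t-s)\, W_{2}(\mu _0,\delta _o)\), is easy to verify numerically on the real line, where every atom of a (hypothetical) discrete \(\mu _0\) moves along \(t \mapsto (1-t)x + t o\) and the coupling onto the Dirac mass is forced:

```python
# Transport of a discrete measure mu_0 on the real line onto delta_o:
# each atom x moves along the geodesic t -> (1 - t) * x + t * o, and
# W_2(mu_s, mu_t) = (t - s) * W_2(mu_0, delta_o).
import math

o = 0.5
atoms = [(-1.0, 0.2), (0.0, 0.3), (2.0, 0.5)]   # (position, mass); masses sum to 1

def W2_to_delta(points):
    # Against a Dirac target the coupling is forced, so W_2 is an L^2 average.
    return math.sqrt(sum(w * (x - o) ** 2 for x, w in points))

def evolve(points, t):
    return [((1 - t) * x + t * o, w) for x, w in points]

def W2_between(s, t):
    # Couple mu_s and mu_t by matching atoms along their common geodesics
    # (this coupling is optimal here, as in the chain of (in)equalities above).
    return math.sqrt(sum(w * ((1 - s) * x + s * o - ((1 - t) * x + t * o)) ** 2
                         for x, w in atoms))

s, t = 0.25, 0.75
assert math.isclose(W2_between(s, t), (t - s) * W2_to_delta(atoms))
```

Since each displacement from time s to time t is exactly \((t-s)(x-o)\), the factor \((t-s)\) pulls out of the square root, which is the content of the computation above.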

It remains to establish the \({\mathsf {MCP}}_{\varepsilon }\) inequality of Definition 6.8. Fix \(t \in (0,1)\), and recall that for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), the (one-dimensional, non-branching) \((X_{\alpha }, \mathsf {d}, {\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\) (and hence \({\mathsf {MCP}}_{\varepsilon }(K,N)\)), and as \(\mu _0^\alpha \ll {\mathfrak {m}}_\alpha \) and \(o \in \text {supp}({\mathfrak {m}}_\alpha )\), in particular \(\mu _t^\alpha := (\mathrm{e}_t)_{\sharp }(\nu ^\alpha ) \ll {\mathfrak {m}}_\alpha \). Applying \(\mathrm{e}_t\) to both sides of (8.3), it follows that \(\mu _t = (\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\). Writing \(\mu _t = \rho _t {\mathfrak {m}}\) and \(\mu _t^\alpha = \rho _t^\alpha {\mathfrak {m}}_\alpha \) for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), the \({\mathsf {MCP}}_{\varepsilon }\) condition implies that:

$$\begin{aligned} \int _X (\rho _t^{\alpha }(x))^{1 - \frac{1}{N}} {\mathfrak {m}}_\alpha (dx) \ge \int _X \tau ^{(1-t)}_{K,N}(d(x,o)) \left( \frac{\rho _0(x)}{z_\alpha }\right) ^{1-\frac{1}{N}} {\mathfrak {m}}_\alpha (dx) \quad \text {for } {\mathfrak {q}}\text {-a.e. } \alpha \in {\bar{Q}} . \end{aligned}$$
(8.4)

In addition, the application of \(\mathrm{e}_t\) to both sides of (8.3) yields the following disintegration:

$$\begin{aligned} \rho _t {\mathfrak {m}}= \int _{{\bar{Q}}} \rho _t^\alpha z_\alpha {\mathfrak {m}}_\alpha {\mathfrak {q}}(d\alpha ) . \end{aligned}$$
(8.5)

Now consider the set \(Y = \left\{ \rho _t > 0\right\} \), and note that by (8.5):

$$\begin{aligned} \int _{X \setminus Y} \rho _t^\alpha (x) {\mathfrak {m}}_\alpha (dx) = 0 \quad \text {for } {\mathfrak {q}}\text {-a.e. } \alpha \in {\bar{Q}} . \end{aligned}$$
(8.6)

Integrating (8.5) against \(\rho _t^{-\frac{1}{N}}\) on \(Y = \left\{ \rho _t > 0\right\} \), applying Hölder’s inequality on the interior integral for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), using (8.6), employing the one-dimensional \({\mathsf {MCP}}_{\varepsilon }\) inequality (8.4) and canceling \(z_\alpha \), and finally applying Hölder’s inequality again on the exterior integral, we obtain

$$\begin{aligned}&\int _X \rho _t(x)^{1-\frac{1}{N}} {\mathfrak {m}}(dx) = \int _Y \rho _t(x)^{1-\frac{1}{N}} {\mathfrak {m}}(dx)\\&\quad =\int _{{\bar{Q}}} \int _Y \rho _t^\alpha (x) \rho _t(x)^{-\frac{1}{N}} {\mathfrak {m}}_\alpha (dx) z_\alpha {\mathfrak {q}}(d\alpha ) \\&\quad \ge \int _{{\bar{Q}}} \left( \int _Y (\rho _t^\alpha (x))^{1 - \frac{1}{N}} {\mathfrak {m}}_\alpha (dx)\right) ^{\frac{N}{N-1}} \left( \int _Y \rho _t(x)^{\frac{N-1}{N}} {\mathfrak {m}}_{\alpha }(dx)\right) ^{-\frac{1}{N-1}} z_\alpha {\mathfrak {q}}(d \alpha ) \\&\quad = \int _{{\bar{Q}}} \left( \int _X (\rho _t^\alpha (x))^{1 - \frac{1}{N}} {\mathfrak {m}}_\alpha (dx)\right) ^{\frac{N}{N-1}} \left( \int _X \rho _t(x)^{\frac{N-1}{N}} {\mathfrak {m}}_{\alpha }(dx)\right) ^{-\frac{1}{N-1}} z_\alpha {\mathfrak {q}}(d \alpha ) \\&\quad \ge \int _{{\bar{Q}}} \left( \int _X \tau ^{(1-t)}_{K,N}(d(x,o)) \rho _0(x)^{1-\frac{1}{N}} {\mathfrak {m}}_\alpha (dx)\right) ^{\frac{N}{N-1}}\\&\qquad \quad \times \, \left( \int _X \rho _t(x)^{\frac{N-1}{N}} {\mathfrak {m}}_{\alpha }(dx)\right) ^{-\frac{1}{N-1}} {\mathfrak {q}}(d \alpha ) \\&\quad \ge \left( \int _{{\bar{Q}}} \int _X \tau ^{(1-t)}_{K,N}(d(x,o)) \rho _0(x)^{1-\frac{1}{N}} {\mathfrak {m}}_\alpha (dx) {\mathfrak {q}}(d \alpha )\right) ^{\frac{N}{N-1}} \\&\qquad \left( \int _{{\bar{Q}}} \int _X \rho _t(x)^{\frac{N-1}{N}} {\mathfrak {m}}_{\alpha }(dx) {\mathfrak {q}}(d\alpha )\right) ^{-\frac{1}{N-1}} \\&\quad \ge \left( \int _{Q} \int _X \tau ^{(1-t)}_{K,N}(d(x,o)) \rho _0(x)^{1-\frac{1}{N}} {\mathfrak {m}}_\alpha (dx) {\mathfrak {q}}(d \alpha )\right) ^{\frac{N}{N-1}} \\&\qquad \left( \int _{Q} \int _X \rho _t(x)^{\frac{N-1}{N}} {\mathfrak {m}}_{\alpha }(dx) {\mathfrak {q}}(d\alpha )\right) ^{-\frac{1}{N-1}} \\&\quad = \left( \int _X \tau ^{(1-t)}_{K,N}(d(x,o)) \rho _0(x)^{1-\frac{1}{N}} {\mathfrak {m}}(dx)\right) ^{\frac{N}{N-1}} \left( \int _X \rho _t(x)^{1 - \frac{1}{N}} {\mathfrak {m}}(dx)\right) ^{-\frac{1}{N-1}} , \end{aligned}$$

where the last inequality above follows since \(\rho _0 {\mathfrak {m}}_\alpha = 0\) for \(\alpha \in Q \setminus {\bar{Q}}\) and since the exponent on the second term is negative. Note that we applied Hölder’s inequality above in reverse form

$$\begin{aligned} \int \left| f\right| ^{\alpha } \left| g\right| ^{\beta } d\omega \ge \left( \int \left| f\right| d\omega \right) ^{\alpha } \left( \int \left| g\right| d\omega \right) ^{\beta } , \end{aligned}$$

which is valid as soon as \(\alpha + \beta = 1\) and \(\beta < 0\), regardless of whether or not \(\left| g\right| > 0\) holds \(\omega \)-a.e.
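With hypothetical discrete data and the exponents \(\alpha = \frac{N}{N-1}\), \(\beta = -\frac{1}{N-1}\) used in the proof above, this reverse Hölder inequality can be sanity-checked numerically:

```python
# Discrete check of the reverse Holder inequality
#   sum f^alpha * g^beta * w  >=  (sum f * w)^alpha * (sum g * w)^beta
# with alpha + beta = 1, beta < 0 (here N = 3: alpha = 3/2, beta = -1/2).
N = 3.0
alpha, beta = N / (N - 1), -1.0 / (N - 1)
assert abs(alpha + beta - 1.0) < 1e-12

w = [0.2, 0.3, 0.5]                    # a probability measure omega
f = [1.0, 4.0, 2.5]
g = [3.0, 0.5, 1.2]

lhs = sum(wi * fi ** alpha * gi ** beta for wi, fi, gi in zip(w, f, g))
rhs = (sum(wi * fi for wi, fi in zip(w, f)) ** alpha
       * sum(wi * gi for wi, gi in zip(w, g)) ** beta)
assert lhs >= rhs
```

Equality holds when f and g are proportional, matching the equality case of the usual Hölder inequality from which the reverse form is derived.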

Rearranging terms above and raising to the power of \(\frac{N-1}{N}\), the desired inequality follows:

$$\begin{aligned} \int _X \rho _t(x)^{1-\frac{1}{N}} {\mathfrak {m}}(dx) \ge \int _X \tau ^{(1-t)}_{K,N}(d(x,o)) \rho _0(x)^{1-\frac{1}{N}} {\mathfrak {m}}(dx) . \end{aligned}$$

\(\square \)
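For reference, the distortion coefficients \(\tau ^{(t)}_{K,N}\) appearing in (8.4) and throughout may be sketched numerically via the standard coefficients \(\sigma ^{(t)}_{K,N}\); the sketch below assumes the usual definitions and does not handle the degenerate regime \(\theta \ge \pi \sqrt{N/K}\) for \(K > 0\), where the coefficient is \(+\infty \):

```python
import math

def sigma(K, N, t, theta):
    """sigma_{K,N}^{(t)}(theta): quotient of solutions of the 1-D
    comparison equation v'' + (K/N) v = 0 (standard definition)."""
    if K == 0:
        return t
    k = math.sqrt(abs(K) / N) * theta
    if K > 0:                      # assumes theta < pi * sqrt(N / K)
        return math.sin(t * k) / math.sin(k)
    return math.sinh(t * k) / math.sinh(k)

def tau(K, N, t, theta):
    """tau_{K,N}^{(t)}(theta) = t^(1/N) * sigma_{K,N-1}^{(t)}(theta)^((N-1)/N)."""
    return t ** (1.0 / N) * sigma(K, N - 1, t, theta) ** ((N - 1.0) / N)

# For K = 0 the coefficients collapse to sigma = t and tau = t:
assert math.isclose(tau(0, 4, 0.3, 1.7), 0.3)
# Positive curvature enlarges the coefficient, negative curvature shrinks it:
assert tau(1, 4, 0.3, 1.7) > 0.3 > tau(-1, 4, 0.3, 1.7)
```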

Remark 8.10

Note that the above proof shows not only that \(\text {supp}(\mu _t) \subset \text {supp}({\mathfrak {m}})\) for all \(t\in [0,1)\), as required in the definition of \({\mathsf {MCP}}_{\varepsilon }(K,N)\), but in fact that \(\mu _t \ll {\mathfrak {m}}\).

Remark 8.11

Recalling that \({\mathsf {MCP}}_{\varepsilon }(K,N)\) always implies \({\mathsf {MCP}}(K,N)\), we deduce that \({\mathsf {MCP}}^1(K,N)\) implies \({\mathsf {MCP}}(K,N)\). In fact, a direct proof of the latter implication is elementary. Indeed, let \(A \subset X\) be any Borel set with \(0< {\mathfrak {m}}(A) < \infty \), and denote \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\). Recall that for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\), \(o \in X_\alpha \), \(\text {supp}({\mathfrak {m}}_\alpha ) = X_\alpha \) and \((X_{\alpha },\mathsf {d}, {\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {MCP}}(K,N)\). Defining \(\nu \) as in (8.3) and continuing with the notation used there, it follows by uniqueness of \(\nu ^\alpha \) and the \({\mathsf {MCP}}\) condition with respect to the point \(o \in X_\alpha \), that for any Borel set \(B \subset X\):

$$\begin{aligned} {\mathfrak {m}}_{\alpha } (B) \ge \int _{\mathrm{e}_{t}^{-1}(B)} \tau _{K,N}^{(1-t)} (\mathsf {d}(\gamma _{0},\gamma _{1}))^{N} {\mathfrak {m}}_{\alpha }(A) \nu ^{\alpha }(d\gamma ), \end{aligned}$$

for \({\mathfrak {q}}\)-a.e. \(\alpha \in {\bar{Q}}\). Integrating over \({\bar{Q}}\) we obtain

$$\begin{aligned} {\mathfrak {m}}(B)&~ \ge \int _{{\bar{Q}}} {\mathfrak {m}}_\alpha (B) {\mathfrak {q}}(d\alpha ) \\&~ \ge \int _{\mathrm{e}_{t}^{-1}(B)} \int _{{\bar{Q}}} \tau _{K,N}^{(1-t)} (\mathsf {d}(\gamma _{0},\gamma _{1}))^{N} {\mathfrak {m}}_{\alpha }(A) \nu ^{\alpha }(d\gamma ) \,{\mathfrak {q}}(d\alpha ) \\&~ = \int _{\mathrm{e}_{t}^{-1}(B)} \tau _{K,N}^{(1-t)} (\mathsf {d}(\gamma _{0},\gamma _{1}))^{N} {\mathfrak {m}}(A) \nu (d\gamma ) , \end{aligned}$$

and the claim follows.
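The one-dimensional \({\mathsf {MCP}}\) inequality invoked on each ray can be made concrete in the model case \(K = 0\): on \(([0,1],\left| \cdot \right| )\) with the measure \(N x^{N-1}\, dx\) and \(o = 0\), contracting a set towards o by the factor \(1-t\) scales its measure by exactly \((1-t)^N\), so the \({\mathsf {MCP}}(0,N)\) inequality holds with equality. A Python check with hypothetical choices of N, t and \(A = [a,b]\):

```python
# Model check of MCP(0, N): on ([0,1], |.|) with measure N * x^(N-1) dx
# and o = 0, the evaluation map e_t contracts A towards o by the factor
# (1 - t), and  m(e_t-image of A) = (1 - t)^N * m(A),
# i.e. the MCP inequality m(B) >= (1 - t)^N * m(A) holds with equality.
N = 3.0

def m(a, b):
    """Measure of the interval [a, b] under N * x^(N-1) dx."""
    return b ** N - a ** N

t = 0.4
a, b = 0.2, 0.9                 # a hypothetical Borel set A = [a, b]
image = ((1 - t) * a, (1 - t) * b)

assert abs(m(*image) - (1 - t) ** N * m(a, b)) < 1e-12
```

This is precisely the sharpness of \({\mathsf {MCP}}(0,N)\) on the N-dimensional Euclidean cone over a point, which is why no better exponent than N can appear in the inequality.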

As a consequence, we immediately obtain from Lemmas 6.12 and 8.4:

Corollary 8.12

Let \((X,\mathsf {d},{\mathfrak {m}})\) be a m.m.s. verifying \({\mathsf {CD}}^{1}(K,N)\) with \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\). Then \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a Polish, proper and geodesic space. In particular, for any continuous function \(f : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\) with \(\{ f = 0 \} \ne \emptyset \), the function \(d_{f} : (\text {supp}({\mathfrak {m}}),\mathsf {d}) \rightarrow {\mathbb {R}}\) is 1-Lipschitz.

8.3 On essentially non-branching spaces

Having at our disposal \({\mathsf {MCP}}(K,N)\), we can now invoke the results of Sect. 7 concerning the theory of \(L^1\)-Optimal-Transport, and obtain the following important equivalent definitions of \({\mathsf {CD}}^1_{Lip}(K,N)\), \({\mathsf {CD}}^1(K,N)\) and \({\mathsf {MCP}}^1(K,N)\) assuming that \((X,\mathsf {d},{\mathfrak {m}})\) is essentially non-branching.

Proposition 8.13

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. with \(\text {supp}({\mathfrak {m}}) = X\). Given \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\), the following statements are equivalent:

  1. (1)

    \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}^1_{Lip}(K,N)\).

  2. (2)

    For any 1-Lipschitz function \(u : (X,\mathsf {d}) \rightarrow {\mathbb {R}}\), let \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\) denote the partition of \({\mathcal {T}}_{u}^{b}\) given by the equivalence classes of \(R_{u}^{b}\). Denote by \(X_\alpha \) the closure \(\overline{R_u^b(\alpha )}\). Then all the conditions (1)–(4) of Definition 8.1 hold for the family \(\left\{ X_\alpha \right\} _{\alpha \in Q}\). In particular, \(X_\alpha = R_{u}(\alpha )\) is a transport-ray for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\).

    Moreover, the sets \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) have disjoint interiors \(\{\mathring{R}_u^b(\alpha )\}_{\alpha \in Q}\) contained in \({\mathcal {T}}_u^b\), and the disintegration \((Q,{\mathscr {Q}},{\mathfrak {q}})\) of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_u}\) on \(\left\{ X_\alpha \right\} _{\alpha \in Q}\) given by (8.1) is essentially unique.

    Furthermore, Q may be chosen to be a section of the above partition containing a subset \({{\bar{Q}}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) which is an \({\mathfrak {m}}\)-section with \({\mathfrak {m}}\)-measurable quotient map; in particular, \({\mathscr {Q}} \supset {\mathcal {B}}({{\bar{Q}}})\) and \({\mathfrak {q}}\) is concentrated on \({{\bar{Q}}}\).

An identical statement holds for \({\mathsf {CD}}^1(K,N)\) when only considering signed distance functions \(u = d_f\).

An identical statement also holds for \({\mathsf {MCP}}^1(K,N)\) when only considering the functions \(u = \mathsf {d}(\cdot ,o)\), after replacing condition (4) of Definition 8.1 above with condition (4’) of Definition 8.2.

Proof

The only direction requiring proof is \((1) \Rightarrow (2)\). Given a 1-Lipschitz function u as above, we may assume that \({\mathfrak {m}}({\mathcal {T}}_u) > 0\), otherwise there is nothing to prove. The \({\mathsf {CD}}^1_u(K,N)\) condition ensures the existence of a family \(\left\{ Y_\beta \right\} _{\beta \in P}\) of sets and a disintegration:

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_u} = \int _P {\mathfrak {m}}^P_\beta {\mathfrak {p}}(d \beta ) ~, \quad \text {with } \quad {\mathfrak {m}}^P_{\beta }(Y_{\beta }) = 1, \text { for } {\mathfrak {p}}\text {-a.e. }\beta \in P , \end{aligned}$$

so that for \({\mathfrak {p}}\)-a.e. \(\beta \in P\), \(Y_\beta \) is a transport ray for \(\Gamma _u\), \((Y_\beta ,\mathsf {d},{\mathfrak {m}}^P_\beta )\) satisfies \({\mathsf {CD}}(K,N)\) and \(\text {supp}({\mathfrak {m}}^P_\beta ) = Y_\beta \). By removing a \({\mathfrak {p}}\)-null-set from P, let us assume without loss of generality that the above properties hold for all \(\beta \in P\).

As \({\mathsf {CD}}^1_{Lip}(K,N) \Rightarrow {\mathsf {CD}}^1 (K,N) \Rightarrow {\mathsf {MCP}}^1(K,N) \Rightarrow {\mathsf {MCP}}(K,N)\), and as our space is essentially non-branching with full-support, Corollary 7.3 implies that \({\mathfrak {m}}(A_+ \cup A_-) = 0\) and that there exists an essentially unique disintegration \((Q,{\mathscr {Q}},{\mathfrak {q}})\) of \({\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = {\mathfrak {m}}\llcorner _{{\mathcal {T}}^b_{u}}\) strongly consistent with the partition of \({\mathcal {T}}_{u}^{b}\) given by \(\left\{ R_u^b(\alpha )\right\} _{\alpha \in Q}\):

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = \int _{Q} {\mathfrak {m}}_{\alpha } \,{\mathfrak {q}}(d\alpha ) \quad \text {with } \quad {\mathfrak {m}}_{\alpha }(R_u^b(\alpha )) = 1, \text { for } {\mathfrak {q}}\text {-a.e. }\alpha \in Q . \end{aligned}$$
(8.7)

By Corollary 7.3, Q may be chosen to be a section of the above partition satisfying the statement appearing in the formulation of Proposition 8.13. Again, let us assume without loss of generality that \({\mathfrak {m}}_{\alpha }(R_u^b(\alpha )) = 1\) for all \(\alpha \in Q\).

By Theorem 7.10, there exists \(Q_1 \subset Q\) of full \({\mathfrak {q}}\)-measure so that \(R_u(\alpha ) = \overline{R_u^b(\alpha )} \supset R_u^b(\alpha ) \supset \mathring{R}_u(\alpha )\) for all \(\alpha \in Q_1\). In addition, since \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\), there exists \(P_1 \subset P\) of full \({\mathfrak {p}}\)-measure so that \({\mathfrak {m}}^P_\beta ({\mathcal {T}}_u^b)=1\) for all \(\beta \in P_1\). By Lemmas 7.6 and 7.8, \((Y_\beta \cap {\mathcal {T}}_u^b,\mathsf {d})\) is isometric to an interval in \(({\mathbb {R}},\left| \cdot \right| )\), and therefore \((\overline{Y_\beta \cap {\mathcal {T}}_u^b}, \mathsf {d}, ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b})\) still satisfies \({\mathsf {CD}}(K,N)\), is of total measure 1 and satisfies \(\text {supp}(({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}) = \overline{Y_\beta \cap {\mathcal {T}}_u^b}\), for all \(\beta \in P_1\).

Now by Lemma 7.8, since \(Y_\beta \cap {\mathcal {T}}_u^b \ne \emptyset \) for all \(\beta \in P_1\), \(Y_\beta = R_u(x)\) for all \(x \in Y_\beta \cap {\mathcal {T}}_u^b\). In particular, for all \(\beta \in P_1\), there exists a unique (since \(R_u^b \) is an equivalence relation on \({\mathcal {T}}_u^b\) and by uniqueness of the section map) \(\alpha = \alpha (\beta ) \in Q\) so that \(Y_\beta = R_u(\alpha )\). Denoting by \({\tilde{Q}} \subset Q\) the set of indices \(\alpha \) obtained in this way, it is clear that \({\tilde{Q}}\) is of full \({\mathfrak {q}}\)-measure, since:

$$\begin{aligned}&0 = {\mathfrak {p}}(P \setminus P_1) = {\mathfrak {m}}\left( {\mathcal {T}}_{u}^{b} \setminus \bigcup _{\beta \in P_{1}} Y_{\beta } \right) \\&\quad = {\mathfrak {m}}\left( {\mathcal {T}}_{u}^{b} \setminus \bigcup _{\alpha (\beta ) :\beta \in P_{1}} R_{u}(\alpha (\beta )) \right) = {\mathfrak {q}}(Q \setminus {\tilde{Q}}). \end{aligned}$$

Consequently, \(Q_2 := {\tilde{Q}} \cap Q_1\) is of full \({\mathfrak {q}}\)-measure as well. Denoting \(P_2 := \alpha ^{-1}(Q_2)\) and repeating the above argument, it follows that \(P_2 \subset P_1\) is of full \({\mathfrak {p}}\)-measure and satisfies that for all \(\beta \in P_2\), \(Y_\beta = R_u(\alpha )\) for \(\alpha = \alpha (\beta )\in Q_2\).

We conclude that there is a one-to-one correspondence:

$$\begin{aligned} \eta : P_2 \ni \beta \leftrightarrow \alpha \in Q_2 \;\;\; \text {whenever} \;\;\; Y_\beta \cap {\mathcal {T}}_u^b = R_u^b(\alpha ) ( = R_u(\alpha ) \cap {\mathcal {T}}_u^b), \end{aligned}$$

so both of these representations yield an identical partition (up to relabeling) of the set

$$\begin{aligned} C := \bigcup _{\beta \in P_2} (Y_\beta \cap {\mathcal {T}}_u^b) = \bigcup _{\alpha \in Q_2} R_u^b(\alpha ) . \end{aligned}$$

Clearly \({\mathfrak {m}}({\mathcal {T}}_u^b \setminus C) = 0\) and so C is \({\mathfrak {m}}\)-measurable. Therefore, by the above two disintegration formulae:

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_u} = {\mathfrak {m}}\llcorner _{C} = \int _{P_2} ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b} {\mathfrak {p}}(d\beta ) = \int _{Q_2} {\mathfrak {m}}_\alpha {\mathfrak {q}}(d\alpha ) . \end{aligned}$$

After identifying \(P_2\) with \(Q_2\) via \(\eta \), it follows necessarily that \({\mathfrak {q}}\llcorner _{Q_2}={\mathfrak {p}}\llcorner _{P_2}\) as they are both the push-forward of \({\mathfrak {m}}\llcorner _{C}\) under the partition map (since \(({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}\) and \({\mathfrak {m}}_\alpha \) are both probability measures on \({\mathcal {T}}_u\)). Applying the Disintegration Theorem 6.19 to \((C,{\mathcal {B}}(C),{\mathfrak {m}}\llcorner _{C})\), we conclude that there is an essentially unique disintegration of \({\mathfrak {m}}\llcorner _{C}\) on the above partition of C. Consequently, there exist \(P_3 \subset P_2\) of full \({\mathfrak {p}}\)-measure and \(Q_3 = \eta (P_3) \subset Q_2\) of full \({\mathfrak {q}}\)-measure so that

$$\begin{aligned} ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b} = {\mathfrak {m}}_{\alpha } \end{aligned}$$

for all pairs \((\beta ,\alpha ) \in P_3 \times Q_3\) related by the correspondence \(\eta \).

Recall that \(X_\alpha := \overline{R_u^b(\alpha )}\). It follows that for all \(\alpha \in Q_3\) (with corresponding \(\beta \in P_3\)):

  1. (1)

    \(X_\alpha = \overline{R_u^b(\alpha )} = R_u(\alpha )\) is a transport ray.

  2. (2)

    \((\overline{Y_\beta \cap {\mathcal {T}}_u^b} , \mathsf {d}, ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}) = (X_\alpha ,\mathsf {d},{\mathfrak {m}}_\alpha )\) satisfies \({\mathsf {CD}}(K,N)\) with total measure 1 (recall that \(X_\alpha = \overline{R_u^b(\alpha )}\)).

  3. (3)

    Consequently

    $$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = \int _{Q} {\mathfrak {m}}_{\alpha } \,{\mathfrak {q}}(d\alpha ) , \end{aligned}$$
    (8.8)

    is a disintegration on \(\left\{ X_\alpha \right\} _{\alpha \in Q}\).

  4. (4)

    \({\mathfrak {m}}_\alpha = ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b}\) is supported on \(\overline{Y_\beta \cap {\mathcal {T}}_u^b} = \overline{R_u^b(\alpha )} = X_\alpha \).

This confirms the four conditions of Definition 8.1, and the essential uniqueness of the disintegration (8.8) readily follows from that of the disintegration (8.7) and the arguments above.

Finally, by Lemma 7.6, since \((R_u^b(\alpha ) = R_u(\alpha ) \cap {\mathcal {T}}_u^b , \mathsf {d})\) is isometric to an interval in \(({\mathbb {R}},\left| \cdot \right| )\), we have \(\mathring{X}_\alpha = \mathring{R}_{u}^b(\alpha )\) for all \(\alpha \in Q\). As \(\left\{ R_{u}^b(\alpha )\right\} _{\alpha \in Q}\) are equivalence classes, it follows that \(\{\mathring{X}_\alpha \}_{\alpha \in Q}\) is a family of disjoint subsets of \({\mathcal {T}}_u^b\). This concludes the proof for the case of \({\mathsf {CD}}^1_{Lip}\) and \({\mathsf {CD}}^1\).

For \({\mathsf {MCP}}^1\), one just needs to note that if \(u = \mathsf {d}(\cdot ,o)\) then \(o \in Y_\beta \) for all \(\beta \in P\) (by Remark 8.3, since \(Y_\beta \) is a transport ray). Recalling the definition of \(P_1 \subset P\), since \((Y_\beta \cap {\mathcal {T}}_u^b,\mathsf {d})\) is isometric to an interval and \({\mathfrak {m}}^P_\beta (Y_\beta \cap {\mathcal {T}}_u^b) = 1\) for all \(\beta \in P_1\), it follows necessarily that for those \(\beta \), \(o \in \overline{Y_\beta \cap {\mathcal {T}}_u^b}\) and \((\overline{Y_\beta \cap {\mathcal {T}}_u^b} , \mathsf {d}, ({\mathfrak {m}}^P_\beta )\llcorner _{{\mathcal {T}}_u^b})\) still satisfies \({\mathsf {MCP}}(K,N)\) with respect to o and is of full support. The rest of the argument is identical to the one presented above, concluding the proof. \(\square \)

Recall moreover that we already derived several properties of \(W_{2}\)-geodesics in essentially non-branching m.m.s.’s verifying \({\mathsf {MCP}}(K,N)\). Hence from Proposition 8.9 we also obtain all the claims of Theorem 6.15 and Corollary 6.16, as well as all of the results of the next section, provided the m.m.s. is essentially non-branching and verifies \({\mathsf {CD}}^{1}(K,N)\) for \(N \in (1,\infty )\).

9 Temporal-regularity under \({\mathsf {MCP}}\)

In this section we deduce from the Measure Contraction and essentially non-branching properties various temporal-regularity results for the map \(t \mapsto \rho _{t}(\gamma _{t})\) and related objects, which we will require for this work. By Proposition 8.9, these results also apply under the \({\mathsf {CD}}^1\) condition. While these properties are essentially standard consequences of recently available results and tools, they appear to be new and may be of independent interest.

As usual, we assume that \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\). We begin with:

Proposition 9.1

Let \((X,\mathsf {d},{\mathfrak {m}})\) denote an essentially non-branching m.m.s. Then the following are equivalent:

  1. (1)

    \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}(K,N)\).

  2. (2)

    \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {MCP}}_{\varepsilon }(K,N)\).

  3. (3)

    For all \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _{0} \ll {\mathfrak {m}}\) and \(\text {supp}(\mu _{1}) \subset \text {supp}({\mathfrak {m}})\), there exists a unique \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\), \(\nu \) is induced by a map (i.e. \(\nu = S_{\sharp }(\mu _0)\) for some map \(S : X \rightarrow \mathrm{Geo}(X)\)), \(\mu _t := (\mathrm{e}_t)_{\#} \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1)\), and writing \(\mu _t = \rho _t {\mathfrak {m}}\), we have for all \(t \in [0,1)\):

    $$\begin{aligned} \rho _t^{-\frac{1}{N}}(\gamma _t) \ge \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _0,\gamma _1)) \rho _0^{-\frac{1}{N}}(\gamma _0) \quad \text {for }\nu \text {-a.e. }\gamma \in \mathrm{Geo}(X) , \end{aligned}$$
    (9.1)

    and (integrating with respect to \(\nu \)):

    $$\begin{aligned} {\mathcal {E}}_{N}(\mu _t) \ge \int \tau _{K,N}^{(1-t)} (\mathsf {d}(\gamma _0,\gamma _1)) \rho _0^{-\frac{1}{N}}(\gamma _0) \nu (d\gamma ) . \end{aligned}$$
    (9.2)
  4. (4)

    For all \(\mu _{0},\mu _{1} \in {\mathcal {P}}_2(X)\) of the form \(\mu _1 = \delta _o\) for some \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\) for some Borel set \(A \subset X\) with \(0< {\mathfrak {m}}(A) < \infty \), there exists a \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) so that for all \(t \in [0,1)\), \(\mu _t := (\mathrm{e}_t)_{\#} \nu \ll {\mathfrak {m}}\) and (9.1), (9.2) hold.

Moreover, the equivalence \((1) \Leftrightarrow (4)\) does not require the essentially non-branching assumption.

Remark 9.2

In fact, for essentially non-branching spaces, it is also possible to add the \({\mathsf {MCP}}^1(K,N)\) condition to the above list of equivalent statements. Indeed, we have already seen in the previous section that \({\mathsf {MCP}}^1(K,N) \Rightarrow {\mathsf {MCP}}_{\varepsilon }(K,N)\) without any non-branching assumptions. The converse implication for non-branching spaces follows from [19, Proposition 9.5] (without identifying the \({\mathsf {MCP}}^1(K,N)\) condition by this name), and it is possible to extend this to essentially non-branching spaces by following the arguments of [23, Proposition A.1].

Remark 9.3

Note that in (3), one is allowed to test any \(\mu _1\) with \(\text {supp}(\mu _1) \subset \text {supp}({\mathfrak {m}})\), not only \(\mu _1 = \delta _o\) as in the other statements. By Theorem 6.15 (recall that \({\mathsf {MCP}}_{\varepsilon }(K,N)\) implies \({\mathsf {MCP}}(K,N)\)), the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) condition is precisely equivalent to the validity of (9.2) for all measures \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) of the form \(\mu _1 = \delta _o\) with \(o \in \text {supp}({\mathfrak {m}})\) and \(\mu _0 \ll {\mathfrak {m}}\) with bounded support.

Remark 9.4

While the equivalence \((1) \Leftrightarrow (4)\) will not be directly used in this work, it is worth remarking that this is the only instance we are aware of where one can obtain information on the density along geodesics without assuming, or a-posteriori establishing, some type of non-branching condition. Indeed, the proof of \((1) \Rightarrow (4)\) relies on the (newly available) Theorem 3.11.

Proof of Proposition 9.1

\((1) \Rightarrow (4)\). \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper and geodesic by Lemma 6.12. Given \(\mu _0\) and \(\mu _1 = \delta _o\) as in (4), any \(\nu \in \mathrm {OptGeo}(\mu _0,\mu _1)\) is concentrated on \(G_\varphi \) (where \(\varphi \) is the associated Kantorovich potential), and so Theorem 3.11 implies that \(\mathsf {d}(\gamma _{0},\gamma _{1}) = \ell _t(\gamma _t)\) for \(\nu \)-a.e. \(\gamma \). It follows that with the notation of Sect. 3:

$$\begin{aligned} \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\ge (\mathrm{e}_{t})_{\sharp } \big ( \tau _{K,N}^{(1-t)}(\mathsf {d}(\gamma _{0},\gamma _{1}))^{N} \nu (d \gamma ) \big ) = \rho _t(x) \tau _{K,N}^{(1-t)}(\ell _t(x))^N {\mathfrak {m}}(dx) . \end{aligned}$$

The pointwise inequality between densities follows for \({\mathfrak {m}}\)-a.e. x, and since \(\ell _t < \infty \) (and hence \(\tau _{K,N}^{(1-t)}(\ell _t(x)) > 0\)) for \(t\in (0,1)\), this in fact implies that \((\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\) (without relying on Theorem 6.15, which is unavailable without the essentially non-branching assumption). Since \((\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\), the inequality between densities is verified at \(x = \gamma _t\) for \(\nu \)-a.e. \(\gamma \). Noting that \(\frac{1}{{\mathfrak {m}}(A)} = \rho _0(\gamma _0)\) for \(\nu \)-a.e. \(\gamma \), (9.1) and hence (9.2) are established for \(\mu _0,\mu _1\) as above.

\((4) \Rightarrow (1)\). This follows by applying (9.1) to \(\mu _0 = \frac{1}{{\mathfrak {m}}(A)} {\mathfrak {m}}\llcorner _{A}\) and \(\mu _1 = \delta _o\), raising the resulting inequality to the power of N, and integrating it against \(\nu \llcorner _{\left\{ \gamma _t \in B\right\} }\) for all Borel sets \(B \subset \text {supp}(\mu _t)\), thereby verifying the \({\mathsf {MCP}}(K,N)\) inequality (6.5).

\((4) \Rightarrow (2)\). Let \(o \in \text {supp}({\mathfrak {m}})\) and let \(\mu _{0} = \rho _{0} {\mathfrak {m}}\in {\mathcal {P}}(X)\) with bounded support. As \((4)\Rightarrow (1)\), Lemma 6.12 implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper, and in addition the assertions of Theorem 6.15 and Corollary 6.16 are in force.

Now, there exists a non-decreasing sequence \(\{f^{i}\}_{i\in {\mathbb {N}}}\) of simple functions, that is,

$$\begin{aligned} f^{i} = \sum _{k \le n(i)} \alpha ^{i}_{k} \chi _{A^{i}_{k}}, \qquad \alpha ^{i}_{k}> 0 ,\quad {\mathfrak {m}}(A^i_k) > 0, \quad A^{i}_{k} \cap A^{i}_{j} = \emptyset , \text { if } k\ne j, \end{aligned}$$

such that \(\mu _{0}^{i} := \rho _{0}^{i} {\mathfrak {m}}:= \frac{1}{z^i} f^i {\mathfrak {m}}\in {\mathcal {P}}(X)\) is of bounded support, \(z^i := \int f^i d{\mathfrak {m}}\nearrow 1\), \(f^i \nearrow \rho _0\) pointwise, and \(\mu _0^i \rightharpoonup \mu _0\) weakly, as \(i \rightarrow \infty \). By Theorem 6.15 there exists a unique \(\nu ^{i} \in \mathrm {OptGeo}(\mu _{0}^{i}, \delta _{o})\), it is induced by a map, and can be written as

$$\begin{aligned} \nu ^{i} = \sum _{k\le n(i)} \frac{1}{z^i} \alpha ^i_k {\mathfrak {m}}(A^i_k) \nu ^{i}_{k}, \end{aligned}$$

with each \(\nu ^{i}_{k}\) the unique optimal dynamical plan between \(\mu _{0,k}^i := \rho _{0,k}^i {\mathfrak {m}}:= \frac{1}{{\mathfrak {m}}(A_k^i)} {\mathfrak {m}}\llcorner _{A_k^i}\) and \(\delta _{o}\). Moreover, \((\mathrm{e}_{t})_{\#}\nu ^{i}_{k} \perp (\mathrm{e}_{t})_{\#} \nu ^{i}_{j}\) whenever \(k\ne j\), for all \(t \in [0,1)\) by Corollary 6.16. Lastly, \(\text {supp}(\nu ^i) \subset \mathrm{Geo}(\text {supp}({\mathfrak {m}}))\) by Remark 6.11. It follows by (9.2) applied to \(\nu ^i_k\) that

$$\begin{aligned} {\mathcal {E}}_N((\mathrm{e}_{t})_{\#}\nu ^{i}_k) \ge \int \tau _{K,N}^{(1-t)}(\mathsf {d}(x,o)) \left( \rho ^{i}_{0,k}(x)\right) ^{1-\frac{1}{N}} \, {\mathfrak {m}}(dx) . \end{aligned}$$

Multiplying by \(\left( \frac{1}{z^i} \alpha ^i_k {\mathfrak {m}}(A^i_k)\right) ^{1-\frac{1}{N}}\), summing over k, and using the mutual singularity of all corresponding measures, we obtain

$$\begin{aligned} {\mathcal {E}}_N((\mathrm{e}_{t})_{\#}\nu ^{i}) \ge \int \tau _{K,N}^{(1-t)}(\mathsf {d}(x,o)) \left( \rho ^i_{0}(x)\right) ^{1-\frac{1}{N}} \, {\mathfrak {m}}(dx) . \end{aligned}$$
(9.3)
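As an aside, the monotone approximation by simple functions invoked above is classical; the following sketch illustrates the standard dyadic construction \(f^i = \min (\lfloor 2^i \rho \rfloor 2^{-i}, i)\) (the density \(\rho \) and grid below are arbitrary choices for illustration, not objects from the proof):

```python
import numpy as np

def dyadic_level(rho, i):
    # standard dyadic truncation: f^i = min(floor(2^i * rho) / 2^i, i);
    # each f^i is simple (finitely many values) and f^i increases to rho
    return np.minimum(np.floor((2.0 ** i) * rho) / (2.0 ** i), float(i))

x = np.linspace(0.0, 1.0, 1001)
rho = 2.0 * x                      # toy density on [0, 1] (arbitrary choice)

prev = np.zeros_like(rho)
for i in range(1, 12):
    f = dyadic_level(rho, i)
    assert np.all(f >= prev)       # non-decreasing in i
    assert np.all(f <= rho)        # dominated by rho
    prev = f

# once rho <= i, the pointwise error at level i is below 2^{-i}
assert np.max(rho - prev) < 2.0 ** (-11) + 1e-15
```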

Passing to a subsequence if necessary, Lemma 6.1 implies that \(\nu ^i \rightharpoonup \nu ^\infty \in \mathrm {OptGeo}(\mu _0,\delta _o)\), and hence \((\mathrm{e}_{t})_{\#}\nu ^{i} \rightharpoonup (\mathrm{e}_{t})_{\#}\nu ^{\infty }\). It follows by upper semi-continuity of \({\mathcal {E}}_N\) on the left-hand side of (9.3), and by monotone convergence (together with \(z^i \rightarrow 1\)) on the right-hand side, that taking \(i \rightarrow \infty \) yields the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) inequality (6.4).

\((2) \Rightarrow (3)\). By Remark 6.11, we may reduce to the case \(\text {supp}({\mathfrak {m}}) = X\). In view of Remark 9.3, we first extend the validity of (9.2) by removing the (immaterial) restriction that \(\mu _0\) have bounded support. When \(K > 0\), \(\text {supp}(\mu _0)\) is automatically bounded, since \({\mathsf {MCP}}_{\varepsilon }(K,N)\) implies \({\mathsf {MCP}}(K,N)\), which by Remark 6.10 yields a Bonnet–Myers diameter estimate. When \(K \le 0\), we may weakly approximate a general \(\mu _0 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\) by measures \(\mu _0^i \ll {\mathfrak {m}}\) having bounded support and repeat the argument presented above in the proof of \((4) \Rightarrow (2)\).

The case of a general \(\mu _1 \in {\mathcal {P}}_2(X)\) with \(\text {supp}(\mu _1) \subset \text {supp}({\mathfrak {m}})\) follows by approximating \(\mu _1\) by a convex combination of delta-measures:

$$\begin{aligned} \mu _{1}^{i} = \sum _{k\le n(i)} a^{i}_{k} \delta _{o^{i}_{k}} \;\; , \;\; o^{i}_{k} \in \text {supp}({\mathfrak {m}}) \ \text {for} \ k \le n(i), \ \text {and} \ \sum _{k\le n(i)} a^{i}_{k} = 1; \end{aligned}$$

with \(W_{2}(\mu _{1}^{i}, \mu _{1}) \rightarrow 0\) as \(i\rightarrow \infty \). By Theorem 6.15 (recall again that \({\mathsf {MCP}}_{\varepsilon }(K,N)\) implies \({\mathsf {MCP}}(K,N)\)), for each i there exists a unique \(\nu ^{i} \in \mathrm {OptGeo}(\mu _{0},\mu _{1}^{i})\), and we may write \(\nu ^{i} = \sum _{k \le n(i)} \alpha ^i_k \nu ^{i}_{k}\) so that

$$\begin{aligned} \nu ^{i}_{k} \in \mathrm {OptGeo}( (\mathrm{e}_{0})_{\#} \nu ^{i}_{k} , \delta _{o^{i}_{k}}). \end{aligned}$$
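The quantization step \(W_{2}(\mu _{1}^{i}, \mu _{1}) \rightarrow 0\) is elementary; as a hedged one-dimensional illustration (the target measure and the midpoint-quantile placement of atoms below are arbitrary choices, not part of the proof), one may use the quantile-function formula for \(W_2\) on the line:

```python
import numpy as np

def w2_to_midpoint_quantization(n, grid=20001):
    # 1-D formula: W_2^2(mu, nu) = \int_0^1 |F_mu^{-1}(u) - F_nu^{-1}(u)|^2 du.
    # Target mu_1 = Uniform[0,1], whose quantile function is the identity.
    u = (np.arange(grid) + 0.5) / grid
    q_target = u
    atoms = (np.arange(n) + 0.5) / n               # a_k = 1/n, o_k = (k+1/2)/n
    q_disc = atoms[np.minimum((u * n).astype(int), n - 1)]
    return np.sqrt(np.mean((q_target - q_disc) ** 2))

errs = [w2_to_midpoint_quantization(n) for n in (2, 8, 32, 128)]
assert all(b < a for a, b in zip(errs, errs[1:]))  # strictly decreasing
assert errs[-1] < 5e-3                             # exact rate: 1/(2*sqrt(3)*n)
```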

Moreover, as explained above, \((\mathrm{e}_{t})_{\#} \nu ^{i}_{k} \perp (\mathrm{e}_{t})_{\#} \nu ^{i}_{j}\) whenever \(k\ne j\), for all \(t \in [0,1)\). Furthermore, as \((\mathrm{e}_{0})_{\#} \nu ^{i}_{k} \ll {\mathfrak {m}}\) (since \((\mathrm{e}_{0})_{\#} \nu ^i = \mu _0 = \rho _0 {\mathfrak {m}}\ll {\mathfrak {m}}\)), Theorem 6.15 implies that \((\mathrm{e}_{t})_{\#} \nu ^{i}_{k} \ll {\mathfrak {m}}\) for all \(t \in [0,1)\). Writing \((\mathrm{e}_{t})_{\#} \nu ^{i}_{k} = \rho _{k,t}^{i} {\mathfrak {m}}\), the \({\mathsf {MCP}}_{\varepsilon }(K,N)\) condition implies for all \(t \in [0,1)\):

$$\begin{aligned} \int (\rho _{k,t}^{i})^{1-\frac{1}{N}}(x) \, {\mathfrak {m}}(dx) \ge \int \tau _{K,N}^{(1-t)} (\mathsf {d}(x,o^{i}_{k})) (\rho _{0,k}^{i})^{1-\frac{1}{N}}(x) {\mathfrak {m}}(dx). \end{aligned}$$

Multiplying by \((\alpha _k^i)^{1-1/N}\), summing over k and using the mutual singularity of the corresponding measures, we obtain

$$\begin{aligned} {\mathcal {E}}_{N}((\mathrm{e}_{t})_{\#} \nu ^{i}) \ge \int _{X \times X} \tau _{K,N}^{(1-t)} (\mathsf {d}(x,y)) \rho _0^{-\frac{1}{N}}(x) \, (\mathrm{e}_{0},\mathrm{e}_{1})_{\#} \nu ^{i} (dxdy). \end{aligned}$$

Passing as usual to a subsequence if necessary, Lemma 6.1 implies that \(\nu ^i \rightharpoonup \nu ^\infty \in \mathrm {OptGeo}(\mu _0,\mu _1)\), and hence \((\mathrm{e}_{t})_{\#}\nu ^{i} \rightharpoonup (\mathrm{e}_{t})_{\#}\nu ^{\infty }\). Invoking the upper semi-continuity of \({\mathcal {E}}_{N}\) on the left-hand side, and the lower semi-continuity of the right-hand side (see [74, Lemma 3.3], noting that the first marginal of \(\nu ^i\) is fixed to be \(\mu _0 = \rho _0 {\mathfrak {m}}\)), (9.2) finally follows in full generality.

The density estimate (9.1) then follows using a straightforward variation of [41, Proposition 3.1], where it was shown how the existence of (a necessarily unique) transport map S may be used to obtain a pointwise density inequality such as (9.1) from an integral inequality such as (9.2) (the statement of [41, Proposition 3.1] involves an assumption on infinitesimal Hilbertianity of the space, but the only property used in the proof is the existence of a transport map S inducing a unique optimal dynamical plan).

Finally, \((3) \Rightarrow (4)\) is trivial. This concludes the proof. \(\square \)

Corollary 9.5

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. verifying \({\mathsf {MCP}}(K,N)\). Then with the same assumptions and notation as in Proposition 9.1 (3), there exist versions of the densities \(\rho _t = \frac{d\mu _t}{d{\mathfrak {m}}}\), \(t \in [0,1)\), so that for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\), for all \(0\le s \le t <1\):

$$\begin{aligned} \rho _s(\gamma _s) > 0 \;\; , \;\; \left( \tau _{K,N}^{(\frac{s}{t})} (\mathsf {d}(\gamma _{0},\gamma _{t})) \right) ^{N} \le \frac{\rho _{t}(\gamma _{t})}{\rho _{s}(\gamma _{s})} \le \left( \tau _{K,N}^{(\frac{1-t}{1-s})} (\mathsf {d}(\gamma _{s},\gamma _{1})) \right) ^{-N}\nonumber \\ \end{aligned}$$
(9.4)

(with \(\frac{s}{t} = \frac{0}{0}\) interpreted as 1 above). In particular, for \(\nu \)-a.e. \(\gamma \), the map \(t \mapsto \rho _{t}(\gamma _{t})\) is locally Lipschitz on (0, 1) and upper semi-continuous at \(t=0\).

Proof

Step 1. Given \(0 \le s \le t < 1\), observe that \((\text {restr}^{t}_{s})_{\sharp } \nu \) is the unique element of \(\mathrm {OptGeo}(\mu _{s},\mu _{t})\); indeed \(\mu _{s}\) is absolutely continuous with respect to \({\mathfrak {m}}\) and so Theorem 6.15 applies. In particular, we deduce that for each \(0 \le s \le t < 1\) and \(\nu \)-a.e. \(\gamma \):

$$\begin{aligned} \rho _{t}(\gamma _{t})^{-1/N} \ge \rho _{s}(\gamma _{s})^{-1/N} \tau _{K,N}^{(\frac{1-t}{1-s})} (\mathsf {d}(\gamma _{s},\gamma _{1})), \end{aligned}$$

with the exceptional set depending on s and t. Reversing time and the roles of \(\mu _s,\mu _t\), we similarly obtain for each \(0 \le s \le t < 1\) and \(\nu \)-a.e. \(\gamma \) that:

$$\begin{aligned} \rho _{s}(\gamma _{s})^{-1/N} \ge \rho _{t}(\gamma _{t})^{-1/N} \tau _{K,N}^{(\frac{s}{t})} (\mathsf {d}(\gamma _{0},\gamma _{t})), \end{aligned}$$

with the exceptional set depending on s and t (the case \(s=0\) is also included as the conclusion is then trivial). Note that given \(s \in [0,1)\), as \(\rho _s(x) > 0\) for \(\mu _s\)-a.e. x, we have that \(\rho _s(\gamma _s) > 0\) for \(\nu \)-a.e. \(\gamma \). Altogether, we see that for each \(0 \le s \le t < 1\), for \(\nu \)-a.e. \(\gamma \):

$$\begin{aligned}&\rho _s(\gamma _s) > 0,\nonumber \\&\quad \rho _{s}(\gamma _{s}) \left( \tau _{K,N}^{(\frac{s}{t})} (\mathsf {d}(\gamma _{0},\gamma _{t})) \right) ^{N} \le \rho _{t}(\gamma _{t}) \le \rho _{s}(\gamma _{s}) \left( \tau _{K,N}^{(\frac{1-t}{1-s})} (\mathsf {d}(\gamma _{s},\gamma _{1})) \right) ^{-N},\nonumber \\ \end{aligned}$$
(9.5)

with the exceptional set depending on s and t.

Together with an application of Corollary 6.16, we deduce the existence of a Borel set \(H \subset \mathrm{Geo}(X)\) with \(\nu (H)=1\) such that \(\mathrm{e}_t|_H : H \rightarrow X\) is injective for all \(t \in [0,1)\), and such that for every \(\gamma \in H\), the double sided estimate (9.5) holds for all \(s,t \in [0,1) \cap {\mathbb {Q}}\). We then define for \(t \in [0,1)\) and \(\gamma \in H\):

$$\begin{aligned} {\hat{\rho }}_{t}(\gamma _t) := {\left\{ \begin{array}{ll} \lim _{(0,1) \cap {\mathbb {Q}}\ni s \rightarrow t} \rho _{s}(\gamma _{s}) &{} t \in (0,1) \\ \rho _{0}(\gamma _{0}) &{} t = 0 \end{array}\right. }, \end{aligned}$$

and \({\hat{\rho }}_t = 0\) outside of \(\mathrm{e}_t(H)\). By (9.5) we see that for any \(\gamma \in H\) and \(t \in (0,1)\) the above limit always exists, and so by injectivity of \(\mathrm{e}_t|_H\), \({\hat{\rho }}_t\) is well-defined. Furthermore, (9.5) implies that for all \(\gamma \in H\), \({\hat{\rho }}_{\cdot }(\gamma _{\cdot })\) satisfies (9.5) itself for all \(0 \le s \le t < 1\). Finally, for each \(t \in [0,1)\) consider any sequence \(\{s_{n}\}\subset {\mathbb {Q}}\) converging to t; then (9.5) is valid for \(\nu \)-a.e. \(\gamma \) at t and \(s_{n}\), with the exceptional set not depending on n. Taking the limit as \(n\rightarrow \infty \) implies \( \rho _{t}(\gamma _{t}) = {\hat{\rho }}_{t}(\gamma _{t})\). Hence we have obtained that for each \(t \in [0,1)\), for \(\nu \)-a.e. \(\gamma \):

$$\begin{aligned} \rho _{t}(\gamma _{t}) = {\hat{\rho }}_{t}(\gamma _{t}), \end{aligned}$$

with the exceptional set depending only on t.

It follows that for all \(t \in [0,1)\), \(\rho _t(x) = {\hat{\rho }}_t(x)\) for \(\mu _t\)-a.e. x. As \(\mu _t\) and \({\mathfrak {m}}\) are mutually absolutely continuous on \(\left\{ \rho _t > 0\right\} \), it follows that \(\rho _t {\mathfrak {m}}= {\hat{\rho }}_t 1_{\left\{ \rho _t > 0\right\} }{\mathfrak {m}}\) for all \(t \in [0,1)\).

Step 2. We now claim that for all \(t \in [0,1)\), \({\mathfrak {m}}(\{ \rho _t = 0 \} \cap \mathrm{e}_t(H)) = 0\). This will establish that \(\mu _t = \rho _t {\mathfrak {m}}= {\hat{\rho }}_t {\mathfrak {m}}\), so that \({\hat{\rho }}_t\) is indeed a density of \(\mu _t\), thereby concluding the proof.

Suppose, by way of contradiction, that the above is false, so that there exists \(t \in [0,1)\) with \({\mathfrak {m}}(\{ \rho _t = 0 \} \cap \mathrm{e}_t(H)) > 0\). As \(\mathrm{e}_t|_H\) is injective, there exists \(K \subset H\) such that \(K_{t} : = \mathrm{e}_{t}(K) = \{ \rho _t = 0 \} \cap \mathrm{e}_t(H)\).

Set \(K_s := \mathrm{e}_s(K)\) for all \(s \in [0,1)\). We claim that \({\mathfrak {m}}(K_s) > 0\) for all \(s \in (0,1)\). Indeed, define \(\eta _{t} : = {\mathfrak {m}}\llcorner _{K_{t}}/{\mathfrak {m}}(K_{t})\) and set \({\bar{\nu }} := (\mathrm{e}_{t}|_H)^{-1}_{\#} \eta _{t}\) and \(\eta _s := (\mathrm{e}_s)_{\sharp } {\bar{\nu }}\). As \({\bar{\nu }}\) is concentrated on \(K \subset H \subset \text {supp}(\nu )\), it follows that \((\text {restr}^{1}_{t})_{\sharp } {\bar{\nu }}\) must be an optimal dynamical plan between \(\eta _t\) and \(\eta _1\). As \(\eta _t \ll {\mathfrak {m}}\), Theorem 6.15 implies that the latter plan is in fact the unique element of \(\mathrm{OptGeo}(\eta _t,\eta _1)\), and that \(\eta _s \ll {\mathfrak {m}}\) for all \(s \in [t,1)\). As \(\eta _s(K_s) = 1\), it follows that \({\mathfrak {m}}(K_s) > 0\). If \(t > 0\), a similar argument applies to the range \(s \in (0,t]\).

However, by definition, for all \(s \in [0,1) \cap {\mathbb {Q}}\) we have \(0 < {\hat{\rho }}_s = \rho _s\) on \(\mathrm{e}_s(H)\), and in particular on \(\mathrm{e}_s(K) = K_s\). Choosing any \(s \in (0,1) \cap {\mathbb {Q}}\), we obtain the desired contradiction:

$$\begin{aligned} 0 < \int _{K_{s}} \rho _{s} {\mathfrak {m}}=\mu _{s}(K_{s}) = \mu _t(K_t) = \int _{K_t} \rho _t {\mathfrak {m}}= 0 . \end{aligned}$$

This concludes the proof. \(\square \)
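The two-sided bound (9.4) can be sanity-checked in the Euclidean model case (a hedged illustration, not part of the proof): in \({\mathbb {R}}^N\) with \(K = 0\) one has \(\tau _{0,N}^{(t)}(\theta ) = t\), and for the transport of \(\mu _0\) onto a delta measure at the origin, \(\gamma _t = (1-t)\gamma _0\) and \(\rho _t(\gamma _t) = \rho _0(\gamma _0)/(1-t)^N\):

```python
import numpy as np

# Euclidean model case K = 0: contracting mu_0 toward a delta at the origin,
# the density along a geodesic is rho_t(gamma_t) = rho_0(gamma_0) / (1-t)^N.
# Check (9.4): (s/t)^N <= rho_t/rho_s <= ((1-s)/(1-t))^N, since tau^{(t)}_{0,N} = t.
N = 3
rho0 = 1.7                                   # arbitrary value of rho_0(gamma_0)

def rho_along(t):
    return rho0 / (1.0 - t) ** N

for s in np.linspace(0.0, 0.9, 10):
    for t in np.linspace(s, 0.95, 8):
        ratio = rho_along(t) / rho_along(s)
        lower = (s / t) ** N if t > 0 else 1.0   # s/t = 0/0 read as 1
        upper = ((1.0 - s) / (1.0 - t)) ** N     # attained with equality here
        assert lower <= ratio * (1.0 + 1e-12)
        assert ratio <= upper * (1.0 + 1e-12)
```

Note that the upper bound is attained with equality in this model case, consistently with the scaling \(\mu _t = ((1-t)\,\mathrm{id})_{\#}\mu _0\).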

Proposition 9.6

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. verifying \({\mathsf {MCP}}(K,N)\). Consider any \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X)\) with \(\mu _0 \ll {\mathfrak {m}}\) and \(\text {supp}(\mu _1) \subset \text {supp}({\mathfrak {m}})\), and let \(\nu \) denote the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\). Then for any compact set \(G \subset \mathrm{Geo}(X)\) with \(\nu (G)> 0\), such that (9.4) holds for all \(\gamma \in G\) and \(0 \le s \le t < 1\), we have for all \(s \in [0,1)\), \({\mathfrak {m}}(\mathrm{e}_{s}(G)) > 0\), and for all \(0\le s \le t <1\):

$$\begin{aligned}&\left( \frac{1-t}{1-s}\right) ^{N} e^{-d(G) (t-s) \sqrt{(N-1)K^{-}}}\nonumber \\&\quad \le \frac{{\mathfrak {m}}(\mathrm{e}_{t}(G))}{{\mathfrak {m}}(\mathrm{e}_{s}(G))} \le \left( \frac{t}{s}\right) ^{N} e^{d(G) (t-s) \sqrt{(N-1)K^{-}}} , \end{aligned}$$
(9.6)

where \(d(G) = \sup \{ \ell (\gamma ) :\gamma \in G\} < \infty \) and \(K^{-} = \max \{0, - K\}\) (and with \(\frac{t}{s} = \frac{0}{0}\) interpreted as 1 above). In particular, the map \(t \mapsto {\mathfrak {m}}(\mathrm{e}_{t}(G))\) is locally Lipschitz on (0, 1) and lower semi-continuous at \(t=0\).

Proof

We proceed with the usual notation repeatedly used above. Fix \(s \in [0,1)\). Since \(\mu _{s}(\mathrm{e}_{s}(G)) \ge \nu (G) >0\) and \(\mu _s \ll {\mathfrak {m}}\), it follows that \({\mathfrak {m}}(\mathrm{e}_{s}(G)) > 0\). Define \({\bar{\mu }}_{0} : = {\mathfrak {m}}\llcorner _{\mathrm{e}_s(G)}/{\mathfrak {m}}(\mathrm{e}_s(G))\).

By Corollary 6.16, there exists a Borel set \(H \subset G\) such that \(\mathrm{e}_{s}^{-1} : \mathrm{e}_{s}(H) \rightarrow G\) is a single valued map and:

$$\begin{aligned} \nu (G \setminus H) = 0, \quad {\mathfrak {m}}(\mathrm{e}_{s}(G) \setminus \mathrm{e}_{s}(H)) = 0, \end{aligned}$$
(9.7)

where the second assertion above follows since \({\mathfrak {m}}\) and \(\mu _s\) are mutually absolutely continuous on \(\left\{ \rho _s > 0 \right\} \), and since our assumption (9.4) guarantees that \(\mathrm{e}_{s}(G) \subset \{\rho _{s} > 0 \}\). Now consider:

$$\begin{aligned} {\bar{\nu }} := (\text {restr}^1_s \circ \mathrm{e}_s^{-1})_{\sharp } ({\bar{\mu }}_0\llcorner _{\mathrm{e}_s(H)}) = \int _{\mathrm{e}_s(H)} \delta _{\text {restr}^{1}_{s}(\mathrm{e}_{s}^{-1}(x))} {\bar{\mu }}_{0}(dx) \in {\mathcal {P}}(\mathrm{Geo}(X)) . \end{aligned}$$

By construction and (9.7), \((\mathrm{e}_0)_{\sharp }{\bar{\nu }} = {\bar{\mu }}_{0}\); define \({\bar{\mu }}_{1} : = (\mathrm{e}_1)_{\sharp }{\bar{\nu }}\) and note that necessarily \({\bar{\nu }} \in \mathrm {OptGeo}({\bar{\mu }}_{0}, {\bar{\mu }}_{1})\) (since \({\bar{\nu }}\) is still supported on a \(\mathsf {d}^2/2\)-cyclically monotone set) and that it is induced by the map \(T := \mathrm{e}_1 \circ \mathrm{e}_s^{-1}\). Theorem 6.15 then implies that \({\bar{\mu }}_{r} = {\bar{\rho }}_r {\mathfrak {m}}\ll {\mathfrak {m}}\) for all \(r \in [0,1)\). Note that \({\bar{\mu }}_r\) is concentrated on the compact set \(\mathrm{e}_{t}(G)\) with \(t := s + r(1-s)\), and therefore \({\mathfrak {m}}(\text {supp}({\bar{\mu }}_r)) \le {\mathfrak {m}}(\mathrm{e}_{t}(G) )\). It follows by Jensen’s inequality together with the \({\mathsf {MCP}}(K,N)\) assumption that:

$$\begin{aligned} {\mathfrak {m}}(\mathrm{e}_{t}(G) )^{1/N}&\ge {\mathfrak {m}}( \text {supp}({\bar{\mu }}_r) )^{1/N} \ge \int {\bar{\rho }}_{r}^{1-1/N}(x) \,{\mathfrak {m}}(dx) \\&\ge {\mathfrak {m}}(\mathrm{e}_s(G))^{1/N - 1} \int _{\mathrm{e}_s(G)} \tau _{K,N}^{(1-r)}(\mathsf {d}(x,T(x))) \, {\mathfrak {m}}(dx) \\&\ge {\mathfrak {m}}(\mathrm{e}_s(G))^{1/N} (1-r) e^{-(1-s) d(G) r \sqrt{(N-1)K^{-}}/N}, \end{aligned}$$

where the last inequality follows from the lower bound (see e.g. [29, Remark 2.3]):

$$\begin{aligned} \tau _{K,N}^{(1-r)} (\theta ) = (1-r) \left( \frac{\sigma ^{(1-r)}_{K,N-1} (\theta )}{ 1-r} \right) ^{\frac{N-1}{N}} \ge (1-r)e^{ - \theta r \sqrt{ (N-1)K^{-}}/N }. \end{aligned}$$

Substituting \(r = \frac{t-s}{1-s}\), the left-hand side of (9.6) is established. Reversing time, the right-hand side of (9.6) immediately follows, thereby concluding the proof. \(\square \)
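The elementary lower bound on \(\tau _{K,N}^{(1-r)}\) used in the last step can be verified numerically. In the sketch below, the standard coefficient \(\sigma ^{(t)}_{K,N}(\theta ) = \sinh (t \theta \sqrt{-K/N})/\sinh (\theta \sqrt{-K/N})\) for \(K < 0\) is taken as given, and the parameter values are arbitrary:

```python
import numpy as np

# Numerical check of tau^{(1-r)}_{K,N}(theta) >= (1-r) e^{-theta r sqrt((N-1)K^-)/N}
# using the standard distortion coefficients for K < 0 (an assumption spelled
# out here, since their definition is not restated in this section):
#   sigma^{(t)}_{K,N}(theta) = sinh(t*theta*sqrt(-K/N)) / sinh(theta*sqrt(-K/N)),
#   tau^{(t)}_{K,N}(theta)   = t^{1/N} * sigma^{(t)}_{K,N-1}(theta)^{(N-1)/N}.

def sigma(t, K, N, theta):
    a = theta * np.sqrt(-K / N)              # K < 0 only in this sketch
    return np.sinh(t * a) / np.sinh(a)

def tau(t, K, N, theta):
    return t ** (1.0 / N) * sigma(t, K, N - 1.0, theta) ** ((N - 1.0) / N)

K, N = -2.0, 4.0                             # arbitrary K < 0, N > 1
for r in np.linspace(0.01, 0.99, 25):
    for theta in np.linspace(0.1, 5.0, 25):
        lhs = tau(1.0 - r, K, N, theta)
        rhs = (1.0 - r) * np.exp(-theta * r * np.sqrt((N - 1.0) * (-K)) / N)
        assert lhs >= rhs * (1.0 - 1e-12)
```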

The following two consequences of Proposition 9.6 will be required for the proof of the change-of-variables formula in Sect. 11. Recall that for any \(G \subset \mathrm{Geo}(X)\),

$$\begin{aligned} D(G) := \{ (x,t) \in X \times [0,1] :x = \gamma _t, \ \gamma \in G\}, \end{aligned}$$

and that \(D(G)(x) = \{ t \in [0,1] :x = \gamma _{t}, \ \gamma \in G \}\) and \(D(G)(t) = \{ x \in X :x = \gamma _t,\ \gamma \in G \} = \mathrm{e}_{t}(G)\). To simplify the notation, we directly write G(x) instead of D(G)(x).

Proposition 9.7

With the same assumptions as in Proposition 9.6, we have for any \(t \in (0,1)\):

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0+} \frac{{\mathcal {L}}^{1} \big ( G(x) \cap (t- \varepsilon , t+\varepsilon ) \big ) }{2\varepsilon } = 1 \;\;\; \text { in }L^1(\mathrm{e}_{t}(G),{\mathfrak {m}}). \end{aligned}$$

The same result also holds for \(t=0\) if we dispense with the factor of 2 in the denominator.

The proof follows the same line as the proof of [26, Theorem 2.1]. We include it for the reader’s convenience.
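As a toy illustration (not part of the proof), the one-dimensional analogue of this density statement for an interval can be checked directly; the set \([0,1]\) below is an arbitrary stand-in for \(G(x)\):

```python
# Toy 1-D analogue: density of S = [0, 1] at t, i.e.
# L^1(S ∩ (t - eps, t + eps)) / (2 eps) as eps -> 0+.
def density(t, eps, a=0.0, b=1.0):
    lo, hi = max(a, t - eps), min(b, t + eps)
    return max(hi - lo, 0.0) / (2.0 * eps)

for t in (0.3, 0.5, 0.9):
    assert abs(density(t, 1e-3) - 1.0) < 1e-9    # interior points: density 1
# at the endpoint t = 0 the two-sided density is only 1/2;
# dropping the factor of 2 in the denominator restores the value 1
assert abs(density(0.0, 1e-3) - 0.5) < 1e-9
assert abs(2.0 * density(0.0, 1e-3) - 1.0) < 1e-9
```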

Proof

Fix \(t \in (0,1)\). Suppose, by way of contradiction, that the claim is false:

$$\begin{aligned} \limsup _{\varepsilon \rightarrow 0} \int _{\mathrm{e}_{t}(G)} \left| 1 - \frac{{\mathcal {L}}^{1} (G(x) \cap (t-\varepsilon , t+\varepsilon )) }{2\varepsilon } \right| \, {\mathfrak {m}}(dx) > 0. \end{aligned}$$

Consider the complement \(G(x)^{c} = \{ t \in [0,1] : x \notin \mathrm{e}_{t}(G)\}\), and deduce the existence of a sequence \(\varepsilon _{n} \rightarrow 0\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty } \int _{\mathrm{e}_{t}(G)} \frac{{\mathcal {L}}^{1} (G(x)^{c} \cap (t-\varepsilon _{n}, t+\varepsilon _{n})) }{2\varepsilon _{n}} \, {\mathfrak {m}}(dx)>0. \end{aligned}$$
(9.8)

Now let

$$\begin{aligned} E: = \{ (x,s) \in \mathrm{e}_{t}(G) \times (0,1) \; ; \; s \in G(x)^{c} \} \end{aligned}$$

with E(x), E(s) the corresponding sections. By Fubini’s Theorem and (9.8) we obtain that:

$$\begin{aligned}&\lim _{n \rightarrow \infty } \frac{1}{2\varepsilon _{n}} \int _{(t-\varepsilon _{n},t+\varepsilon _{n})} {\mathfrak {m}}(E(s))\, {\mathcal {L}}^{1}(ds) \\&\quad = \lim _{n \rightarrow \infty } \frac{1}{2\varepsilon _{n}} {\mathfrak {m}}\otimes {\mathcal {L}}^{1}\left( E \cap (\mathrm{e}_{t}(G) \times (t-\varepsilon _{n}, t+ \varepsilon _{n}))\right) \\&\quad = \lim _{n \rightarrow \infty } \frac{1}{2\varepsilon _{n}} \int _{\mathrm{e}_{t}(G)} {\mathcal {L}}^{1}(G(x)^{c} \cap (t-\varepsilon _{n}, t+ \varepsilon _{n})) \, {\mathfrak {m}}(dx) > 0, \end{aligned}$$

so there must exist a sequence \(\{s_{n}\}_{n \in {\mathbb {N}}}\) converging to t such that \({\mathfrak {m}}(E (s_{n})) \ge \kappa \) for some \(\kappa >0\). Repeating the above argument for the case \(t=0\) with the obvious modifications, the latter conclusion holds in that case as well. Note that

$$\begin{aligned} E(s_{n}) = \{ x\in \mathrm{e}_{t}(G) :x \notin \mathrm{e}_{s_{n}}(G) \} = \mathrm{e}_{t}(G) \setminus \mathrm{e}_{s_{n}}(G). \end{aligned}$$

The compact sets \(\mathrm{e}_{s_{n}}(G)\) converge to \(\mathrm{e}_{t}(G)\) in Hausdorff distance: indeed, \(\mathsf {d}(\gamma _{t},\gamma _{s_{n}}) \le C |t-s_{n}|\) where \(C: =\sup _{\gamma \in G} \ell (\gamma ) < \infty \) by compactness of G. Hence, for each \(\varepsilon > 0\) there exists \(n(\varepsilon )\) such that for all \(n \ge n(\varepsilon )\) it holds \(\mathrm{e}_{t}(G)^{\varepsilon } \supset \mathrm{e}_{s_n}(G)\) (and vice-versa), where \(A^\varepsilon := \left\{ y \in X \; ; \; \mathsf {d}(y,A) \le \varepsilon \right\} \). It follows that

$$\begin{aligned} {\mathfrak {m}}(\mathrm{e}_{t}(G)^{\varepsilon }) \ge {\mathfrak {m}}\big (\mathrm{e}_{t}(G) \setminus \mathrm{e}_{s_{n}}(G) \big ) + {\mathfrak {m}}(\mathrm{e}_{s_{n}}(G) ) \ge \kappa + {\mathfrak {m}}(\mathrm{e}_{s_{n}}(G) ). \end{aligned}$$

Taking the limit as \(n \rightarrow \infty \), the continuity property of Proposition 9.6 (lower semi-continuity if \(t=0\)) implies that for each \(\varepsilon > 0\):

$$\begin{aligned} {\mathfrak {m}}(\mathrm{e}_{t}(G)^{\varepsilon }) \ge \kappa + {\mathfrak {m}}(\mathrm{e}_{t}(G) ) \end{aligned}$$

with \(\kappa \) independent of \(\varepsilon \). Since \({\mathfrak {m}}(\mathrm{e}_{t}(G)) = \lim _{\varepsilon \rightarrow 0} {\mathfrak {m}}(\mathrm{e}_{t}(G)^{\varepsilon })\) we obtain a contradiction, and the claim is proved. \(\square \)

Corollary 9.8

With the same assumptions as in Proposition 9.6, and assuming that \(\text {supp}({\mathfrak {m}}) = X\), we have

$$\begin{aligned} \nu (\mathrm{e}_{0}^{-1}(X^{0})\cap G^{+}_{\varphi } ) = 0 , \end{aligned}$$

where \(\varphi \) is an associated Kantorovich potential to the c-optimal-transport problem from \(\mu _{0}\) to \(\mu _{1}\) with \(c = \mathsf {d}^2/2\). In particular:

$$\begin{aligned} \mu _{t}\llcorner _{X^0} = \mu _0\llcorner _{X^0} \;\;\; \forall t \in [0,1) . \end{aligned}$$

Recall from Sect. 3 that \(G_{\varphi }\subset \mathrm{Geo}(X)\) denotes the set of \(\varphi \)-Kantorovich geodesics, \(G^{+}_{\varphi }\) denotes the subset of geodesics in \(G_{\varphi }\) having positive length, and \(X^0 = \mathrm{e}_{[0,1]}(G_\varphi ^0)\) denotes the subset of null geodesic points in X. Necessarily \(\nu (G_{\varphi }) = 1\). The assumption \(\text {supp}({\mathfrak {m}}) = X\) guarantees by Lemma 6.12 that \((X,\mathsf {d})\) is proper and geodesic, so that the results of Part I are in force; by Remark 6.11 this poses no loss in generality.

Proof of Corollary 9.8

Suppose by contradiction that \(\nu (\mathrm{e}_{0}^{-1}(X^{0})\cap G^{+}_{\varphi } ) > 0\). By inner regularity, there exists a compact \(G \subset \mathrm{e}_{0}^{-1}(X^{0})\cap G^{+}_{\varphi }\) with \(\nu (G)>0\) verifying the hypothesis of Proposition 9.6 and therefore also the conclusion of Proposition 9.7 for \(t =0\). In particular, for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_{0}(G) \subset X^{0}\) there exists \(\gamma \in G \subset G^{+}_{\varphi }\) and \(t \in (0,1)\) (sufficiently small) such that \(x = \gamma _{t}\). But \(\mu _0(\mathrm{e}_0(G)) = \nu (\mathrm{e}_0^{-1}(\mathrm{e}_0(G))) \ge \nu (G) > 0\), and hence \({\mathfrak {m}}(\mathrm{e}_0(G)) > 0\) as \(\mu _0 \ll {\mathfrak {m}}\). It follows that there exists at least one \(x \in \mathrm{e}_{0}(G)\) as above, in direct contradiction to the characterization of \(X^0\) given in Lemma 3.15. Hence, up to a \(\nu \)-null set, \(\mathrm{e}_{0}^{-1}(X^{0})\) is contained in the set of null geodesics \(G_{\varphi }^0\). For \(t \in (0,1)\), \(\mathrm{e}_t^{-1}(X^0) \subset G_\varphi ^0\) by Lemma 3.15 (without any exceptional set), and since null geodesics are constant, we conclude that \(\mu _{t}\llcorner _{X^0} = \mu _0\llcorner _{X^0}\) for all \(t \in [0,1)\). \(\square \)

Remark 9.9

When applying the results of this section, note that when both \(\mu _0,\mu _1 \ll {\mathfrak {m}}\), then by reversing the roles of \(\mu _0\) and \(\mu _1\), we in fact obtain all the above results also at the right end-point \(t=1\).

10 Two families of conditional measures

The next two sections will be devoted to the study of \(W_{2}\)-geodesics over \((X,\mathsf {d},{\mathfrak {m}})\), when \((X,\mathsf {d},{\mathfrak {m}})\) is assumed to be essentially non-branching and verifies \({\mathsf {CD}}^1(K,N)\). By Remark 8.8, we also assume \(\text {supp}({\mathfrak {m}}) = X\). We will use Proposition 8.13 as an equivalent definition for \({\mathsf {CD}}^1(K,N)\). By Proposition 8.9 and Remark 8.11, X also verifies \({\mathsf {MCP}}(K,N)\), and so Theorem 6.15 applies. In addition, it follows by Lemma 6.12 that \((X,\mathsf {d})\) is geodesic and proper, and so the results of Part I apply.

Fix \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\), and denote by \(\nu \) the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\). As usual, we denote \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1]\), and set

$$\begin{aligned} \mu _t =: \rho _{t} {\mathfrak {m}}\;\;\; \forall t \in [0,1] . \end{aligned}$$

Fix also an associated Kantorovich potential \(\varphi : X \rightarrow {\mathbb {R}}\) for the c-optimal-transport problem from \(\mu _{0}\) to \(\mu _{1}\), with \(c = \mathsf {d}^2/2\). Recall that \(G_{\varphi }\subset \mathrm{Geo}(X)\) denotes the set of \(\varphi \)-Kantorovich geodesics and that necessarily \(\nu (G_{\varphi }) = 1\). We further recall from Sect. 3 that the interpolating Kantorovich potential and its time-reversed version at time \(t \in (0,1)\) are defined for any \(x \in X\) as

$$\begin{aligned}&-\varphi _{t} (x) = \inf _{y \in X} \frac{\mathsf {d}^{2}(x,y)}{2t} - \varphi (y) ,\\&\quad {\bar{\varphi }}_t(x) = \inf _{y \in X} \frac{\mathsf {d}^2(x,y)}{2(1-t)} - \varphi ^c(y) \;\;\;\; \forall t \in (0,1), \end{aligned}$$

with \(\varphi _0 = {\bar{\varphi }}_0 = \varphi \) and \(\varphi _1 = {\bar{\varphi }}_1 = -\varphi ^c\). By Proposition 3.6 we have, for all \(t \in (0,1)\), \(\varphi _t(x) \le {\bar{\varphi }}_t(x)\), with equality iff \(x \in \mathrm{e}_t(G_\varphi )\).
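These Hopf–Lax-type formulas are easily evaluated on a grid. The following sketch (with an arbitrary sample potential \(\varphi \) on \([0,1] \subset {\mathbb {R}}\) and \(\mathsf {d}(x,y) = |x-y|\), chosen only for illustration) computes \(\varphi ^c\), \(\varphi _t\) and \({\bar{\varphi }}_t\) and checks the inequality \(\varphi _t \le {\bar{\varphi }}_t\):

```python
import numpy as np

# Grid evaluation on [0, 1] with d(x, y) = |x - y| and c = d^2/2; the potential
# phi below is an arbitrary sample function, not a specific Kantorovich potential.
x = np.linspace(0.0, 1.0, 401)
phi = np.sin(3.0 * x)
C = 0.5 * (x[:, None] - x[None, :]) ** 2        # c(x_i, x_j)

phi_c = np.min(C - phi[None, :], axis=1)        # phi^c(y) = inf_x c(x,y) - phi(x)

def phi_t(t):
    # -phi_t(x) = inf_y d^2(x,y)/(2t) - phi(y), i.e. phi_t = sup_y phi(y) - C/t
    return np.max(phi[None, :] - C / t, axis=1)

def bar_phi_t(t):
    # bar-phi_t(x) = inf_y d^2(x,y)/(2(1-t)) - phi^c(y)
    return np.min(C / (1.0 - t) - phi_c[None, :], axis=1)

for t in (0.1, 0.25, 0.5, 0.75, 0.9):
    assert np.all(phi_t(t) <= bar_phi_t(t) + 1e-9)
```

In fact the discrete inequality holds for any \(\varphi \) here, by minimizing the quadratic \(z \mapsto \frac{|x-z|^2}{2(1-t)} - \frac{|z-y|^2}{2}\) over the line, which recovers exactly \(-\frac{|x-y|^2}{2t}\).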

It will be convenient from a technical perspective to first restrict \(\nu \), by inner regularity of Radon measures, Corollary 9.5 (applied to both pairs \(\mu _0,\mu _1\) and \(\mu _1,\mu _0\)), Proposition 9.7 and Corollary 6.16, to a suitable good compact subset \(G \subset G^+_{\varphi }\) with \(\nu (G) \ge \nu (G_\varphi ^+)-\varepsilon \). Recall that \(G_\varphi ^+\) was defined in Sect. 3 as the subset of geodesics in \(G_\varphi \) having positive length, and note that the length function \(\ell : \mathrm{Geo}(X) \rightarrow [0,\infty )\) is continuous and hence is bounded away from 0 and \(\infty \) on a compact \(G \subset G^+_{\varphi }\).

Definition 10.1

(Good Subset of Geodesics) A subset \(G \subset G^+_{\varphi }\) is called good if the following properties hold:

  • G is compact;

  • there exists \(c > 0\) so that for every \(\gamma \in G\):

    $$\begin{aligned} c \le \ell (\gamma ) \le 1/c \; ; \end{aligned}$$
    (10.1)
  • for every \(\gamma \in G\), \(\rho _s(\gamma _s) > 0\) for all \(s \in [0,1]\) and \((0,1) \ni s \mapsto \rho _s(\gamma _s)\) is continuous;

  • the claim of Proposition 9.7 holds true for G;

  • the map \(\mathrm{e}_{t}|_G : G \rightarrow X\) is injective (and we will henceforth restrict \(\mathrm{e}_t\) to G or its subsets).

Assumption 10.2

We will assume in this section and in Sect. 11.1 that \(\nu \) is concentrated on a good \(G \subset G^+_\varphi \).

We will dispense with this assumption in the Change-of-Variables Theorem 11.4.

10.1 \(L^{1}\) partition

For \(s \in [0,1]\) and \(a_{s} \in {\mathbb {R}}\), we recall the following notation (introduced in Sect. 4 for \(G = G_\varphi \), but now we treat a general \(G \subset G_\varphi \) as above):

$$\begin{aligned} G_{a_{s}} = G_{a_s,s} : = \{ \gamma \in G : \varphi _{s}(\gamma _{s}) = a_{s} \}. \end{aligned}$$

As G is compact and \(\mathrm{e}_s : G \rightarrow X\) is continuous, \(\mathrm{e}_s(G)\) is compact. When \(s \in (0,1)\), \(\varphi _s : X \rightarrow {\mathbb {R}}\) is continuous by Lemma 3.2, and hence \(G_{a_s}\) is compact as well.

The structure of the evolution of \(G_{a_{s}}\), i.e. \(\mathrm{e}_{[0,1]}(G_{a_{s}}) = \{ \gamma _{t} :t \in [0,1], \ \gamma \in G_{a_{s}} \}\), will be the topic of this subsection, so the properties we prove below are only meaningful for \(a_s \in \varphi _s(\mathrm{e}_s(G))\) (and moreover typically when \({\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}})) > 0\)). It will be convenient to use a short-hand notation for the signed-distance function from a level set of \(\varphi _{s}\), \(d_{a_{s}} : = d_{\varphi _{s} - a_{s}}\) (see (8.2)).
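For concreteness, here is a grid sketch of a signed-distance function from a level set; the sign convention (positive where \(\varphi _s > a_s\), negative where \(\varphi _s < a_s\)) and the sample function \(f\) below are assumptions made for illustration, and are not taken from (8.2) verbatim:

```python
import numpy as np

# Grid sketch of a signed distance from the level set {f = a}; the convention
# "positive where f > a, negative where f < a" is an assumption in this sketch.
x = np.linspace(-2.0, 2.0, 4001)
f = x ** 2                                   # arbitrary sample function
a = 1.0
level = np.array([-1.0, 1.0])                # known level set {f = a} for this f

dist = np.min(np.abs(x[:, None] - level[None, :]), axis=1)
d_fa = np.sign(f - a) * dist                 # signed distance d_{f - a}

# d_{f - a} is 1-Lipschitz on the grid and has the expected values
step = x[1] - x[0]
assert np.max(np.abs(np.diff(d_fa))) <= step + 1e-12
assert abs(d_fa[0] - 1.0) < 1e-6             # x = -2: f > a, distance 1 to -1
assert abs(d_fa[np.argmin(np.abs(x))] + 1.0) < 1e-6   # x = 0: f < a, distance 1
```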

Lemma 10.3

For any \(s \in [0,1]\) and \(a_{s} \in \varphi _s(\mathrm{e}_s(G))\) the following holds: for each \(\gamma \in G_{a_{s}}\) and \(0 \le r \le t \le 1\), \((\gamma _{r},\gamma _{t}) \in \Gamma _{d_{a_{s}}}\). In particular, the evolution of \(G_{a_{s}}\) is a subset of the transport set associated to \(d_{a_{s}}\):

$$\begin{aligned} \mathrm{e}_{[0,1]}(G_{a_{s}}) \subset {\mathcal {T}}_{d_{a_{s}}}. \end{aligned}$$

Proof

Fix \(\gamma \in G_{a_{s}}\). If \(s \in [0,1)\) then for any \(p \in \{ \varphi _{s} = a_{s} \}\):

$$\begin{aligned} \frac{\mathsf {d}^{2}(\gamma _{s}, \gamma _{1})}{2(1-s)} = \varphi _{s}(\gamma _{s}) + \varphi ^{c}(\gamma _{1}) = \varphi _{s}(p) + \varphi ^{c}(\gamma _{1}) \le {\bar{\varphi }}_{s}(p) + \varphi ^{c}(\gamma _{1}) \le \frac{\mathsf {d}^{2}(p, \gamma _{1})}{2(1-s)} \end{aligned}$$

by Lemma 3.3 and Proposition 3.6 (2), and hence \(\mathsf {d}(\gamma _{s},\gamma _{1}) \le \mathsf {d}(p,\gamma _{1})\); the latter also holds for \(s=1\) trivially. Similarly, if \(s \in (0,1]\) then for any \( q\in \{ \varphi _{s} = a_{s}\}\):

$$\begin{aligned} \frac{\mathsf {d}^{2}(\gamma _{0}, \gamma _{s})}{2s} = \varphi (\gamma _{0}) - \varphi _{s}(\gamma _{s}) = \varphi (\gamma _{0}) - \varphi _{s}(q) \le \frac{\mathsf {d}^{2}(\gamma _{0}, q)}{2s}, \end{aligned}$$

and therefore \(\mathsf {d}(\gamma _{0},\gamma _{s}) \le \mathsf {d}(\gamma _{0},q)\), with the latter also holding for \(s=0\) trivially. Consequently, for any \(p,q \in \{ \varphi _{s} = a_{s} \}\):

$$\begin{aligned} \mathsf {d}(\gamma _{0},\gamma _{1}) \le \mathsf {d}(\gamma _{0},p) + \mathsf {d}(q,\gamma _{1}) . \end{aligned}$$

Taking infimum over p and q it follows that:

$$\begin{aligned} \mathsf {d}(\gamma _{0},\gamma _{1}) \le d_{a_{s}}(\gamma _{0}) - d_{a_{s}}(\gamma _{1}) , \end{aligned}$$

where the sign of \(d_{a_s}\) was determined by the fact that \(s \mapsto \varphi _s(\gamma _s)\) is decreasing (e.g. by Lemma 3.3). On the other hand

$$\begin{aligned} d_{a_{s}}(\gamma _{0}) - d_{a_{s}}(\gamma _{1}) \le \mathsf {d}(\gamma _{0},\gamma _{1}), \end{aligned}$$

thanks to the 1-Lipschitz regularity of \(d_{a_{s}}\) ensured by Lemma 8.4 since \((X,\mathsf {d})\) is geodesic. Therefore equality holds and \((\gamma _{0},\gamma _{1}) \in \Gamma _{d_{a_{s}}}\). The assertion then follows by Lemma 7.1. \(\square \)

Next, recall by Proposition 8.13 applied to the function \(u = d_{a_{s}}\), that according to the equivalent characterization of \({\mathsf {CD}}^{1}_u(K,N)\), the following disintegration formula holds:

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{d_{a_s}}} = \int _{Q} {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}} \,{\hat{{\mathfrak {q}}}}^{a_{s}} (d\alpha ), \end{aligned}$$
(10.2)

where Q is a section of the partition of \({\mathcal {T}}_{d_{a_s}}^b\) given by the equivalence classes \(\{R_{d_{a_s}}^b(\alpha )\}_{\alpha \in Q}\), and for \({\hat{{\mathfrak {q}}}}^{a_s}\)-a.e. \(\alpha \in Q\), the probability measure \({\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}\) is supported on the transport ray \(X_{\alpha } = \overline{R^b_{d_{a_s}}(\alpha )} = R_{d_{a_s}}(\alpha )\) and \((X_{\alpha }, \mathsf {d}, {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}})\) verifies \({\mathsf {CD}}(K,N)\). It follows by Lemma 10.3 that

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_{s}})} = \int _{Q} {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_{s}})} \,{\hat{{\mathfrak {q}}}}^{a_{s}}(d\alpha ) . \end{aligned}$$
(10.3)

It will be convenient to make the previous disintegration formula a bit more explicit. We refer to the “Appendix” for the definition of \({\mathsf {CD}}(K,N)\) density and the (suggestive) relation to one-dimensional \({\mathsf {CD}}(K,N)\) spaces. Recall that \(\ell _s(\gamma _s) = \ell (\gamma )\) for all \(\gamma \in G\).

Proposition 10.4

For any \(s \in (0,1)\) and \(a_{s} \in \varphi _s(\mathrm{e}_s(G))\), the following disintegration formula holds:

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_{s}}) } = \int _{\mathrm{e}_{s}(G_{a_{s}})} g^{a_{s}}(\beta , \cdot )_{ \#} \left( h^{a_{s}}_{\beta } \cdot {\mathcal {L}}^{1} \llcorner _{[0,1]} \right) {\mathfrak {q}}^{a_{s}}(d\beta ), \end{aligned}$$
(10.4)

where: \({\mathfrak {q}}^{a_{s}}\) is a Borel measure concentrated on \(\mathrm{e}_{s}(G_{a_{s}})\) of total mass \({\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}}))\); the map \(g^{a_{s}} : \mathrm{e}_s(G_{a_s}) \times [0,1] \rightarrow X\), defined by \(g^{a_{s}}(\beta ,t) = \mathrm{e}_t(\mathrm{e}_{s}^{-1}(\beta ))\), is Borel measurable; for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \in \mathrm{e}_{s}(G_{a_{s}})\), \(h^{a_{s}}_{\beta }\) is a \({\mathsf {CD}}(\ell _s(\beta )^2 K,N)\) probability density on [0, 1] vanishing at the end-points; and the map \(\mathrm{e}_{s}(G_{a_{s}}) \times [0,1] \ni (\beta ,t) \mapsto h^{a_{s}}_{\beta }(t)\) is \({\mathfrak {q}}^{a_s} \otimes {\mathcal {L}}^1\llcorner _{[0,1]}\)-measurable.

Proof

We will abbreviate \(u = d_{a_s}\).

Step 1. We claim that

$$\begin{aligned} \forall \gamma \in G_{a_s} \;\; \forall \alpha \in Q ~,~ \mathrm{e}_{[0,1]}(\gamma ) \cap R_{u}^b(\alpha ) \ne \emptyset \; \Rightarrow \; R_u(\alpha ) \supset \mathrm{e}_{[0,1]}(\gamma ) . \end{aligned}$$

Indeed, if \(x \in \mathrm{e}_{[0,1]}(\gamma )\), then \(R_u(x) \supset \mathrm{e}_{[0,1]}(\gamma )\) by Lemma 10.3. On the other hand, \(R_u(x) = R_u(\alpha )\) for all \(x \in R_u^b(\alpha )\), since any two transport rays intersecting in \({\mathcal {T}}_{u}^b\) must coincide by Corollary 7.9. Hence, if there exists \(x \in \mathrm{e}_{[0,1]}(\gamma ) \cap R_{u}^b(\alpha )\), the assertion follows.

Step 2. We also claim that

$$\begin{aligned} \forall \gamma ^1,\gamma ^2 \in G_{a_s}\;\; \forall \alpha \in Q ~,~ \mathrm{e}_{[0,1]}(\gamma ^i) \cap R_{u}^b(\alpha ) \ne \emptyset \; ,\; i=1,2 \; \; \Rightarrow \; \gamma ^1 = \gamma ^2 . \end{aligned}$$

Indeed, since \(\alpha \in Q \subset {\mathcal {T}}_u^b\) then \(R_u(\alpha )\) is a transport ray by Lemma 7.8, and since \(u = d_{a_s}\) is affine (with slope 1) on a transport ray, \(R_{u}(\alpha )\) must intersect \(\{d_{a_s} = 0\}=\{\varphi _{s} = a_{s} \}\), and hence \(\mathrm{e}_{s} (G_{a_{s}})\), at most once. It follows by Step 1 that \(\gamma ^1_s = \gamma ^2_s\), and so by injectivity of \(\mathrm{e}_s|_G : G \rightarrow X\), that \(\gamma ^1 = \gamma ^2\).

Step 3. Denote

$$\begin{aligned} G^1_{a_s}:= & {} \left\{ \gamma \in G_{a_s} \; ; \; {\mathcal {T}}_u^b \cap \mathrm{e}_{[0,1]}(\gamma ) \ne \emptyset \right\} , \\ Q^1:= & {} \left\{ \alpha \in Q \; ; \; R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(G_{a_s}) \ne \emptyset \right\} . \end{aligned}$$

We claim that there exists a bijective map:

$$\begin{aligned} \eta : Q^1 \ni \alpha \mapsto \gamma ^\alpha \in G^1_{a_s} , \end{aligned}$$

for which

$$\begin{aligned} R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(G_{a_s}) = {\mathcal {T}}_u^b \cap \mathrm{e}_{[0,1]}(\gamma ^\alpha ) = R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(\gamma ^\alpha ) . \end{aligned}$$

Indeed, for all \(\alpha \in Q^1\), there exists precisely one \(\gamma \in G_{a_s}\) (and hence \(\gamma \in G^1_{a_s}\)) so that \(R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(\gamma ) \ne \emptyset \) by Step 2. And vice versa, given any \(\gamma \in G^1_{a_s}\), there is at least one \(\alpha \in Q\) (and hence \(\alpha \in Q^1\)) so that \(R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(\gamma ) \ne \emptyset \), and it follows by Step 1 that \(\mathrm{e}_{[0,1]}(\gamma ) \subset R_u(\alpha )\) and hence \({\mathcal {T}}_u^b \cap \mathrm{e}_{[0,1]}(\gamma ) \subset R_u^b(\alpha )\); but this means that for all \(\alpha \ne \beta \in Q\), \(R_u^b(\beta ) \cap \mathrm{e}_{[0,1]}(\gamma ) = \emptyset \), since \(\left\{ R_u^b(\beta )\right\} _{\beta \in Q}\) is a partition of \({\mathcal {T}}_u^b\), implying the uniqueness of \(\alpha \in Q^1\).

Moreover, we claim that the map \(\eta : (Q^1,{\mathcal {B}}(Q^1)) \rightarrow (G^1_{a_s} , {\mathcal {B}}(G^1_{a_s}))\) is measurable. Indeed, recall that \(G_{a_s}\) is compact, and since \((X,\mathsf {d})\) is proper, \({\mathcal {T}}_u^b\) and \(R_u^b\) are Borel, and hence \(G^1_{a_s}\) is analytic. Then write:

$$\begin{aligned} \Lambda := P_{1,2}( \{ (y, \gamma , x, t) \in {\mathcal {T}}_u^b \times G_{a_s} \times X \times [0,1] \; ; \; (y,x) \in R_u^b \; , \; x = \gamma _t \} ) , \end{aligned}$$

and

$$\begin{aligned} {\text {graph}}(\eta ) = \Lambda \cap (Q^1 \times G_{a_s}) = \Lambda \cap (Q^1 \times G^1_{a_s}) . \end{aligned}$$

Note that \(\Lambda \) is analytic and that, by Step 2 (and the fact that \(R_u^b\) is an equivalence relation on \({\mathcal {T}}_u^b\)), the section \(\Lambda (y)\) is either empty or a singleton for every \(y \in {\mathcal {T}}_u^b\). It follows that for any \(B \in {\mathcal {B}}(G_{a_s})\), both \(A_1 = P_1(\Lambda \cap ({\mathcal {T}}_u^b \times B))\) and \(A_2 = P_1(\Lambda \cap ({\mathcal {T}}_u^b \times (G_{a_s} \setminus B)))\) are analytic and disjoint, and \(Q^1 = (Q^1 \cap A_1) \cup (Q^1 \cap A_2)\). By the Lusin separability principle [72, Theorem 4.4.1], there exists a Borel subset \(B_1 \subset {\mathcal {T}}_u^b\) containing \(A_1\) which is still disjoint from \(A_2\). Consequently \(\eta ^{-1}(B \cap G_{a_s}^1) = \eta ^{-1}(B) = Q^1 \cap A_1 = Q^1 \cap B_1 \in {\mathcal {B}}(Q^1)\), concluding the proof that \(\eta \) is Borel measurable on \(Q^1\).

Step 4. Recall that there exists \({{\bar{Q}}} \subset Q\) of full \({\hat{{\mathfrak {q}}}}^{a_s}\) measure such that for all \(\alpha \in {{\bar{Q}}}\), \({\hat{{\mathfrak {m}}}}_\alpha ^{a_s}\) is supported on the transport ray \(R_u(\alpha ) = \overline{R_u^b(\alpha )}\) and \((R_u(\alpha ),\mathsf {d},{\hat{{\mathfrak {m}}}}_\alpha ^{a_s})\) verifies \({\mathsf {CD}}(K,N)\). Consequently, for such \(\alpha \)’s, \({\hat{{\mathfrak {m}}}}_\alpha ^{a_s}\) gives positive mass to any relatively open subset of \(R_u(\alpha )\) and does not charge points. It follows that for \(\alpha \in {{\bar{Q}}}\), since \(\mathrm{e}_{[0,1]}(\gamma ^\alpha ) \subset R_u(\alpha )\) has non-empty relative interior, it holds that:

$$\begin{aligned} R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(G_{a_s})&\ne \emptyset \; \Leftrightarrow \; \\ {\hat{{\mathfrak {m}}}}^{a_s}_\alpha (\mathrm{e}_{[0,1]}(G_{a_s}))&= {\hat{{\mathfrak {m}}}}^{a_s}_\alpha (R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(G_{a_s})) \\&= {\hat{{\mathfrak {m}}}}^{a_s}_\alpha ( R_u^b(\alpha ) \cap \mathrm{e}_{[0,1]}(\gamma ^\alpha )) \\&= {\hat{{\mathfrak {m}}}}^{a_s}_\alpha (\mathrm{e}_{[0,1]}(\gamma ^\alpha )) > 0 . \end{aligned}$$

In particular, \(Q^1\) coincides up to a \({\hat{{\mathfrak {q}}}}^{a_s}\)-null set with the \({\hat{{\mathfrak {q}}}}^{a_s}\)-measurable set \(Q^2 := \{ \alpha \in Q \; ; \; {\hat{{\mathfrak {m}}}}_\alpha ^{a_s}(\mathrm{e}_{[0,1]}(G_{a_s})) > 0 \}\), and thus \(Q^1\) is itself \({\hat{{\mathfrak {q}}}}^{a_s}\)-measurable. In fact, it is easy to see that \(Q^1\) coincides with an analytic set up to a \({\hat{{\mathfrak {q}}}}^{a_s}\)-null-set.

Step 5. Recalling that \(\mathrm{e}_{[0,1]}(G_{a_s}) \subset {\mathcal {T}}_u\) by Lemma 10.3 and that \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\) by Corollary 7.3, we obtain from (10.2) the following disintegration of \({\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_s})}\):

$$\begin{aligned}&{\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_s})} = {\mathfrak {m}}\llcorner _{{\mathcal {T}}_u^b \cap \mathrm{e}_{[0,1]}(G_{a_s})} = \int _{Q} {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}\llcorner _{{\mathcal {T}}_u^b \cap \mathrm{e}_{[0,1]}(G_{a_s})} \,{\hat{{\mathfrak {q}}}}^{a_{s}}(d\alpha ) \\&\quad = \int _{{{\bar{Q}}} \cap Q^1} {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}\llcorner _{\mathrm{e}_{[0,1]}(\gamma ^\alpha )} {\hat{{\mathfrak {q}}}}^{a_{s}}(d\alpha ) = \int _{{{\bar{Q}}} \cap Q^1} \frac{{\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}\llcorner _{\mathrm{e}_{[0,1]}(\gamma ^\alpha )}}{{\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}(\mathrm{e}_{[0,1]}(\gamma ^\alpha ))} {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}(\mathrm{e}_{[0,1]}(\gamma ^\alpha )) {\hat{{\mathfrak {q}}}}^{a_{s}}(d\alpha ) , \end{aligned}$$

where the last two transitions and the measurability of \(\alpha \mapsto {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}(\mathrm{e}_{[0,1]}(\gamma ^\alpha )) > 0\) follow from Step 4. For all \(\alpha \in {{\bar{Q}}} \cap Q^1\), define the probability measure:

$$\begin{aligned} {\bar{{\mathfrak {m}}}}_{\alpha }^{a_s} := \frac{{\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}\llcorner _{\mathrm{e}_{[0,1]}(\gamma ^\alpha )}}{{\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}(\mathrm{e}_{[0,1]}(\gamma ^\alpha ))} . \end{aligned}$$

Since \(\mathrm{e}_{[0,1]}(\gamma ^\alpha )\) is a convex subset of \(R_u(\alpha )\), it follows that the one-dimensional m.m.s. \((\mathrm{e}_{[0,1]}(\gamma ^\alpha ), \mathsf {d},{\bar{{\mathfrak {m}}}}_{\alpha }^{a_s})\) verifies \({\mathsf {CD}}(K,N)\) and is of full support for all \(\alpha \in {{\bar{Q}}} \cap Q^1\). Similarly, define:

$$\begin{aligned} {\bar{{\mathfrak {q}}}}^{a_s} := {\hat{{\mathfrak {m}}}}_{\alpha }^{a_{s}}(\mathrm{e}_{[0,1]}(\gamma ^\alpha )) {\hat{{\mathfrak {q}}}}^{a_{s}} \llcorner _{Q^1} (d \alpha ) . \end{aligned}$$

Step 6. Recall that our original disintegration (10.2) was on \((Q,{\mathscr {Q}},{\hat{{\mathfrak {q}}}}^{a_s})\), so that there exists \({\tilde{Q}} \subset Q\) of full \({\hat{{\mathfrak {q}}}}^{a_s}\) measure so that \({\tilde{Q}} \in {\mathcal {B}}({\mathcal {T}}_u^b)\) and \( {\mathscr {Q}} \supset {\mathcal {B}}({\tilde{Q}})\). It follows that we may find \({\mathscr {Q}} \ni {\tilde{Q}}^1 \subset Q^1\) with \({\hat{{\mathfrak {q}}}}^{a_s}(Q^1 \setminus {\tilde{Q}}^1) = 0\) so that \({\mathscr {Q}} \supset {\mathcal {B}}({\tilde{Q}}^1)\). Let us now push-forward the measure space \((Q^1 , {\mathscr {Q}} \cap Q^1 , {\bar{{\mathfrak {q}}}}^{a_s})\) via the Borel measurable map \(\mathrm{e}_s \circ \eta \) (by Step 3), yielding the measure space \((\mathrm{e}_s(G_{a_s}^1), {\mathscr {S}}, {\mathfrak {q}}^{a_s})\), which is thus guaranteed to satisfy \({\mathscr {S}} \supset {\mathcal {B}}({\tilde{S}})\), where \({\tilde{S}} := \mathrm{e}_s \circ \eta ({\tilde{Q}}^1)\) is of full \({\mathfrak {q}}^{a_s}\) measure. Restricting the space to \({\tilde{S}}\) and abusing notation, we obtain \(({\tilde{S}}, {\mathscr {S}}, {\mathfrak {q}}^{a_s})\) with \({\mathscr {S}} \supset {\mathcal {B}}({\tilde{S}})\), implying that \({\mathfrak {q}}^{a_s}\) is a Borel measure concentrated on \({\tilde{S}} \subset \mathrm{e}_s(G^1_{a_s}) \subset \mathrm{e}_s(G_{a_s})\). Note that \({\hat{{\mathfrak {q}}}}^{a_{s}}\), \({\bar{{\mathfrak {q}}}}^{a_s}\) and \({\mathfrak {q}}^{a_s}\) all have total mass \({\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}}))\).

Denoting \({\mathfrak {m}}_{\gamma ^\alpha _s}^{a_s} := {\bar{{\mathfrak {m}}}}_{\alpha }^{a_s}\), the disintegration from Step 5 translates to

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_s})} = \int _{\mathrm{e}_s(G_{a_s})} {\mathfrak {m}}_{\beta }^{a_s} {\mathfrak {q}}^{a_s}(d \beta ) . \end{aligned}$$

Furthermore, for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), the m.m.s. \((\mathrm{e}_{[0,1]}(\mathrm{e}_s^{-1}(\beta )), \mathsf {d}, {\mathfrak {m}}_{\beta }^{a_s})\) verifies \({\mathsf {CD}}(K,N)\) and is of full support, and is therefore isometric to \((I^{a_{s}}_{\beta },\left| \cdot \right| , {{\hat{h}}}^{a_{s}}_{\beta } {\mathcal {L}}^1\llcorner _{I^{a_{s}}_{\beta }})\), where \(I^{a_{s}}_{\beta } : = [0,\ell _{s}(\beta )]\) and \({{\hat{h}}}^{a_{s}}_{\beta }\) is a \({\mathsf {CD}}(K,N)\) probability density on \(I^{a_{s}}_{\beta }\) (see Definition A.1). To prevent measurability issues, we will use the convention that \({{\hat{h}}}^{a_{s}}_{\beta }\) vanishes at the end-points of \(I^{a_{s}}_{\beta }\).

Step 7. Next, we observe that \(g^{a_s}\) is Borel. Indeed, note that by injectivity of \(\mathrm{e}_s\):

$$\begin{aligned} {\text {graph}}(g^{a_s}) = P_{1,2,3} (\{&(\beta ,t,x,\gamma ) \in \mathrm{e}_{s}(G_{a_{s}}) \times [0,1] \times X \times G_{a_s} \; ; \; \nonumber \\&\gamma _s = \beta ~,~ \gamma _{t} = x \}) . \end{aligned}$$

As \(G_{a_s}\) is compact, it follows that \({\text {graph}}(g^{a_s})\) is analytic, and hence (see [72, Theorem 4.5.2]) \(g^{a_s}\) is Borel measurable.

Step 8. It follows that \({\mathfrak {m}}_{\beta }^{a_s} = g^{a_s}(\beta ,\cdot )_{\sharp }(h^{a_{s}}_{\beta } {\mathcal {L}}^1\llcorner _{[0,1]})\), where:

$$\begin{aligned}{}[0,1] \ni t \mapsto h^{a_{s}}_{\beta } (t) : = \ell _s(\beta ) {{\hat{h}}}^{a_{s}}_{\beta }( t \ell _{s}(\beta ) ) . \end{aligned}$$

Clearly \(h_{\beta }^{a_{s}}\) is now a \({\mathsf {CD}}(\ell _{s}(\beta )^{2}K,N)\) probability density on the interval [0, 1]. The only remaining task is to prove that the map \(\mathrm{e}_s(G_{a_s}) \times [0,1] \ni (\beta ,t) \mapsto h^{a_s}_\beta (t)\) is \({\mathfrak {q}}^{a_s} \otimes {\mathcal {L}}^1\llcorner _{[0,1]}\)-measurable. By measurability of the disintegration (10.3) (recall Definition 6.18), the map \(Q \ni \alpha \mapsto {\hat{{\mathfrak {m}}}}_\alpha ^{a_s}(B)\) is \({\hat{{\mathfrak {q}}}}^{a_s}\)-measurable for any Borel set \(B \subset X\). It follows that for any compact \(I \subset (0,1)\), the map:

$$\begin{aligned} \mathrm{e}_{s}(G_{a_{s}})\supset {\tilde{S}} \ni \beta \mapsto F (\beta ) : = \int _{I} h^{a_{s}}_{\beta } (\tau ) d\tau = \frac{{\hat{{\mathfrak {m}}}}^{a_s}_{\alpha (\beta )}(\mathrm{e}_{I}(G_{a_s}))}{{\hat{{\mathfrak {m}}}}^{a_s}_{\alpha (\beta )}(\mathrm{e}_{[0,1]}(G_{a_s}))} \end{aligned}$$

is \({\mathfrak {q}}^{a_{s}}\)-measurable, where \(\alpha (\beta ) := (\mathrm{e}_s \circ \eta )^{-1}(\beta )\) is \({\mathfrak {q}}^{a_s}\)-measurable as a map \(({\tilde{S}},{\mathscr {S}},{\mathfrak {q}}^{a_s}) \rightarrow (Q , {\mathscr {Q}} , {\hat{{\mathfrak {q}}}}^{a_s})\) by the construction from Step 6. As \(h^{a_{s}}_{\beta }\) is continuous on (0, 1) for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), we know that for such \(\beta \) and all \(t \in (0,1)\):

$$\begin{aligned} h^{a_{s}}_{\beta }(t) = \lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon }\int _{[t-\varepsilon ,t+\varepsilon ]} h^{a_{s}}_{\beta } (\tau ) d\tau . \end{aligned}$$

It follows by [72, Proposition 3.1.27] that for all \(t \in (0,1)\), the map

$$\begin{aligned} {\tilde{S}} \ni \beta \mapsto h^{a_{s}}_{\beta }(t) \end{aligned}$$

is \({\mathfrak {q}}^{a_{s}}\)-measurable. As for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), the map \((0,1) \ni t \mapsto h^{a_{s}}_{\beta }(t)\) is continuous, [72, Theorem 3.1.30] confirms the required measurability.

This concludes the proof.\(\square \)
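To make the rescaling of Step 8 concrete, here is a small numerical sketch (the particular density, N and L are our own hypothetical choices, not quantities from the text): for K = 0, a \({\mathsf {CD}}(0,N)\) density is one whose \((N-1)\)-th root is concave on its support, and both this property and unit mass survive the affine change of variables \(h(t) := \ell \, {{\hat{h}}}(t \ell )\).

```python
import numpy as np

# A CD(0, N) probability density on [0, L]: h_hat(x) = N x^(N-1) / L^N
# (its (N-1)-th root is affine in x, hence concave). N and L are
# illustrative choices.
N, L = 3.0, 2.0
h_hat = lambda x: N * x**(N - 1) / L**N

# The Step-8 rescaling to the unit interval: h(t) := L * h_hat(t * L)
h = lambda t: L * h_hat(t * L)

ts = np.linspace(0.0, 1.0, 100001)
ys = h(ts)
# trapezoidal mass: h is again a probability density, now on [0, 1]
mass = np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(ts))

# concavity of h^{1/(N-1)} is preserved by the affine reparametrization
g = ys ** (1.0 / (N - 1.0))
midpoint_gap = g[1:-1] - 0.5 * (g[:-2] + g[2:])   # >= 0 iff concave (midpoint test)
```

For \(K \ne 0\) the same reparametrization turns the curvature parameter K into \(\ell ^2 K\), which is the origin of the \({\mathsf {CD}}(\ell _{s}(\beta )^{2}K,N)\) statement above.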

It will be convenient to invert the order of integration in (10.4) using Fubini’s Theorem:

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_{s}}) } = \int _{[0,1]} g^{a_{s}}(\cdot ,t)_{\sharp } \left( h^{a_{s}}_{\cdot }(t) \cdot {\mathfrak {q}}^{a_{s}} \right) {\mathcal {L}}^{1}(dt) . \end{aligned}$$

We thus define

$$\begin{aligned} {\mathfrak {m}}_{t}^{a_{s}} : = g^{a_{s}}(\cdot ,t)_{\sharp } \left( h^{a_{s}}_{\cdot }(t) \cdot {\mathfrak {q}}^{a_{s}} \right) , \end{aligned}$$

so that the final formula is

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_{s}}) } = \int _{[0,1]} {\mathfrak {m}}_{t}^{a_{s}} \, {\mathcal {L}}^{1}(dt). \end{aligned}$$
(10.5)

Remark 10.5

Since for \({\mathfrak {q}}^{a_s}\)-a.e. \(\beta \), the \({\mathsf {CD}}(\ell _s^2(\beta ) K , N)\) density \(h^{a_s}_\beta \) must be strictly positive on (0, 1) (see “Appendix”), by multiplying and dividing \({\mathfrak {q}}^{a_s}\) by the positive \({\mathfrak {q}}^{a_s}\)-measurable function \(\beta \mapsto h^{a_s}_\beta (s)\) (recall that \(s \in (0,1)\)), we may always renormalize and assume that \(h^{a_{s}}_{\beta }(s) = 1\). Note that this does not affect the definition of \({\mathfrak {m}}_{t}^{a_{s}}\) above. This normalization ensures that \({\mathfrak {m}}_s^{a_s} = {\mathfrak {q}}^{a_s}\) so that

$$\begin{aligned} {\mathfrak {m}}_{t}^{a_{s}} : = g^{a_{s}}(\cdot ,t)_{\sharp } \left( h^{a_{s}}_{\cdot }(t) \cdot {\mathfrak {m}}_s^{a_{s}} \right) . \end{aligned}$$
(10.6)
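The renormalization argument of Remark 10.5 is simply a regrouping of the two factors, which a discrete sketch (with hypothetical weights of our own choosing) makes transparent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete sketch of Remark 10.5 with a few hypothetical "rays" beta:
# weights q(beta) and positive densities h_beta evaluated at times s and t.
q = rng.uniform(0.5, 2.0, size=5)      # the measure q^{a_s}
h_s = rng.uniform(0.1, 3.0, size=5)    # h_beta(s) > 0, since s lies in (0,1)
h_t = rng.uniform(0.1, 3.0, size=5)    # h_beta(t)

# m_t^{a_s} weighs the point g(beta, t) by h_beta(t) q(beta); dividing
# h_beta by h_beta(s) while multiplying q by h_beta(s) changes nothing:
weights = h_t * q
weights_renorm = (h_t / h_s) * (h_s * q)

# after the renormalization the density at time s is identically 1,
# so m_s^{a_s} coincides with the (new) quotient measure q^{a_s}
density_at_s = h_s / h_s
```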

Remark 10.6

Note that since \({\mathfrak {q}}^{a_{s}}\) is concentrated on \(\mathrm{e}_s(G_{a_s})\), by definition \({\mathfrak {m}}_{t}^{a_{s}}\) is concentrated on \(\mathrm{e}_t(G_{a_s})\) for all \(t \in (0,1)\). By Corollary 4.3, the latter sets are disjoint for different t’s in (0, 1) (recall that \(s \in (0,1)\) and that \(G \subset G^+_{\varphi }\)). Formula (10.5) can thus be seen again as a disintegration formula over a partition. In particular, for any \(s \in (0,1)\) and \(0< t , \tau < 1\) with \(t \ne \tau \), the measures \({\mathfrak {m}}^{a_{s}}_{t}\) and \({\mathfrak {m}}^{a_{s}}_{\tau }\) are mutually singular.

Proposition 10.7

For any \(s\in (0,1)\) and \(a_{s} \in \varphi _s(\mathrm{e}_s(G))\), the map

$$\begin{aligned} (0,1) \ni t \mapsto {\mathfrak {m}}^{a_{s}}_{t} \end{aligned}$$

is continuous in the weak topology. Moreover,

$$\begin{aligned} {\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}}))> 0 \;\;\; \Rightarrow \;\;\; \forall t \in (0,1) \;\;\; {\mathfrak {m}}^{a_{s}}_{t}(\mathrm{e}_{t}(G_{a_{s}})) > 0 , \end{aligned}$$

and

$$\begin{aligned} \forall t\in [0,1] \;\;\; {\mathfrak {m}}^{a_{s}}_{t}(\mathrm{e}_{t}(G_{a_{s}})) = \Vert {\mathfrak {m}}^{a_{s}}_{t} \Vert \le C \; {\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}})), \end{aligned}$$

for some \(C> 0\) depending only on K, N and \(c > 0\) from assumption (10.1).

Proof

Recall that the definition of \({\mathfrak {m}}^{a_{s}}_{t}\) does not depend on the normalization \(h^{a_s}_\beta (s) = 1\) imposed in Remark 10.5, so we revert to the original normalization in which \(h^{a_s}_\beta \) is a \({\mathsf {CD}}(\ell _s(\beta )^2 K , N)\) probability density on [0, 1]; consequently \(\left\| {\mathfrak {q}}^{a_{s}}\right\| = {\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}}))\). The second assertion follows since whenever the latter mass is positive, by positivity of a \({\mathsf {CD}}(K,N)\) density in the interior of its support (see “Appendix”):

$$\begin{aligned} \forall t \in (0,1) \;\;\; {\mathfrak {m}}^{a_{s}}_{t}(\mathrm{e}_{t}(G_{a_{s}})) = \left\| {\mathfrak {m}}^{a_{s}}_{t} \right\| = \int h^{a_{s}}_{\beta }(t) {\mathfrak {q}}^{a_s}(d\beta ) > 0 . \end{aligned}$$

Similarly, it follows by Lemma A.8, the lower semi-continuity of \(h^{a_s}_\beta \) at the end-points (see “Appendix”), and assumption (10.1), that \(\max _{t \in [0,1]} h^{a_{s}}_{\beta }(t)\) is uniformly bounded in \(a_{s}\) and \(\beta \) for \({\mathfrak {q}}^{a_{s}}\)-a.e. \(\beta \) by a constant \(C>0\) as above, implying that

$$\begin{aligned} \forall t \in [0,1] \;\;\; \left\| {\mathfrak {m}}^{a_{s}}_{t} \right\| = \left\| h^{a_{s}}_{\cdot }(t) \cdot {\mathfrak {q}}^{a_{s}}\right\| \le C \left\| {\mathfrak {q}}^{a_{s}}\right\| = C \; {\mathfrak {m}}(\mathrm{e}_{[0,1]}(G_{a_{s}})) , \end{aligned}$$

yielding the third assertion.

Now note that the density \((0,1) \ni t \mapsto h^{a_{s}}_{\beta }(t)\) is continuous (see “Appendix”) for \({\mathfrak {q}}^{a_{s}}\)-a.e. \(\beta \), and the same trivially holds for the map \([0,1] \ni t \mapsto g^{a_{s}}(\beta ,t)\). We conclude by Dominated Convergence that for any \(f \in C_{b}(X)\) and any \(t \in (0,1)\):

$$\begin{aligned} \lim _{\tau \rightarrow t}\int f(x) \, {\mathfrak {m}}^{a_{s}}_{\tau }(dx)&= \lim _{\tau \rightarrow t} \int f(g^{a_{s}}(\beta ,\tau )) h^{a_{s}}_{\beta }(\tau ) \, {\mathfrak {q}}^{a_{s}}(d\beta ) \\&= \int f(g^{a_{s}}(\beta ,t)) h^{a_{s}}_{\beta }(t) \, {\mathfrak {q}}^{a_{s}}(d\beta ) = \int f(x) \, {\mathfrak {m}}^{a_{s}}_{t}(dx) , \end{aligned}$$

yielding the first assertion, and concluding the proof. \(\square \)

10.2 \(L^{2}\) partition

For each \(t \in (0,1)\), we can find a natural partition of \(\mathrm{e}_{t}(G) \subset \mathrm{e}_t(G_\varphi )\) consisting of level sets of the time-propagated intermediate Kantorovich potentials \(\Phi _{s}^{t}\) introduced in Sect. 4. Recall that the function \(\Phi _s^t\) (\(s,t \in (0,1)\)) was defined as:

$$\begin{aligned} \Phi _{s}^{t} = \varphi _{t} + (t-s) \frac{\ell _{t}^{2}}{2} , \end{aligned}$$

and interpreted on \(\mathrm{e}_{t}(G_\varphi )\) as the propagation of \(\varphi _s\) from time s to t along \(G_\varphi \), i.e. \(\Phi _{s}^{t} = \varphi _{s} \circ \mathrm{e}_{s} \circ \mathrm{e}_{t}^{-1}\). In particular, for any \(\gamma \in G\), \(\Phi _{s}^{t}(\gamma _{t}) = \varphi _{s}(\gamma _{s})\), and \(\mathrm{e}_{t} (G_{a_{s}}) \cap \mathrm{e}_{t} (G_{b_{s}}) = \emptyset \) as soon as \(a_{s} \ne b_{s}\) (see Corollary 4.1). It follows that for any \(s,t \in (0,1)\), we can consider the partition of the compact set \(\mathrm{e}_{t}(G)\) given by its intersection with the family \(\{ \Phi _{s}^{t}= a_{s} \}_{a_{s} \in {\mathbb {R}}}\); as usual, it will be sufficient to take \(a_{s} \in \Phi _{s}^{t}(\mathrm{e}_{t}(G)) = \varphi _{s}(\mathrm{e}_{s}(G))\).
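The identity \(\Phi _{s}^{t}(\gamma _{t}) = \varphi _{s}(\gamma _{s})\) reduces to elementary arithmetic: the proof of Lemma 10.3 gives \(\mathsf {d}^{2}(\gamma _{0}, \gamma _{s})/(2s) = \varphi (\gamma _{0}) - \varphi _{s}(\gamma _{s})\), and since \(\mathsf {d}(\gamma _{0},\gamma _{s}) = s \, \ell (\gamma )\), this yields \(\varphi _{s}(\gamma _{s}) = \varphi (\gamma _{0}) - s \, \ell (\gamma )^2/2\) along a fixed geodesic. The following sketch (with arbitrary illustrative values) merely verifies the resulting cancellation:

```python
# Along a geodesic gamma in G_phi of speed ell, the relation used in the
# proof of Lemma 10.3 gives phi_s(gamma_s) = phi(gamma_0) - s * ell^2 / 2.
# Numeric check (illustrative values) that Phi_s^t(gamma_t) = phi_s(gamma_s):
phi_0, ell = 1.7, 0.8          # arbitrary illustrative values

def phi_along(s):
    """phi_s evaluated at gamma_s along the fixed geodesic."""
    return phi_0 - s * ell**2 / 2.0

def Phi(s, t):
    """Phi_s^t = phi_t + (t - s) * ell_t^2 / 2, evaluated at gamma_t."""
    return phi_along(t) + (t - s) * ell**2 / 2.0

# Phi_s^t(gamma_t) == phi_s(gamma_s) for all s, t
checks = [abs(Phi(s, t) - phi_along(s)) for s in (0.25, 0.5)
          for t in (0.1, 0.9)]
```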

Since \(\Phi _{s}^{t}\) is continuous, the Disintegration Theorem 6.19 yields the following essentially unique disintegration of \({\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\) strongly consistent with respect to the quotient-map \(\Phi _{s}^{t}\):

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)} = \int _{\varphi _{s}(\mathrm{e}_{s}(G))} {\hat{{\mathfrak {m}}}}^{t}_{a_{s}} \, {\mathfrak {q}}^{t}_{s}(da_{s}) \end{aligned}$$
(10.7)

so that for \({\mathfrak {q}}^{t}_{s}\)-a.e. \(a_{s}\), \({\hat{{\mathfrak {m}}}}^{t}_{a_{s}}\) is a probability measure concentrated on the set \(\mathrm{e}_t(G) \cap \left\{ \Phi _s^t = a_s\right\} = \mathrm{e}_{t}(G_{a_{s}})\). By definition, \({\mathfrak {q}}^{t}_{s} = (\Phi _{s}^{t})_{\#} {\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\). To make this disintegration more explicit, we show:

Proposition 10.8

  1. (1)

    For any \(s,t,\tau \in (0,1)\), the quotient measures \({\mathfrak {q}}^{t}_{s}\) and \({\mathfrak {q}}^{\tau }_s\) are mutually absolutely continuous.

  2. (2)

    For any \(s,t \in (0,1)\), the quotient measure \({\mathfrak {q}}^{t}_{s}\) is absolutely continuous with respect to Lebesgue measure \({\mathcal {L}}^1\) on \({\mathbb {R}}\).

Proof

Recall that \({\mathfrak {q}}^{t}_{s} = (\Phi _{s}^{t})_{\#} {\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\).

(1) For any Borel set \(I \subset {\mathbb {R}}\), note that:

$$\begin{aligned}&{\mathfrak {q}}^{t}_{s}(I) = {\mathfrak {m}}\left( \left\{ \gamma _{t} : \varphi _{s}(\gamma _{s}) \in I, \gamma \in G \right\} \right)> 0\\&\quad \Leftrightarrow \;\; \mu _{t}\left( \left\{ \gamma _{t} : \varphi _{s}(\gamma _{s}) \in I, \gamma \in G \right\} \right) > 0 , \end{aligned}$$

since \(\mu _t \ll {\mathfrak {m}}\) and its density \(\rho _t\) is assumed to be positive on \(\mathrm{e}_t(G)\) where \(\mu _t\) is supported (see Definition 10.1). But \(\mu _\tau = (\mathrm{e}_\tau \circ \mathrm{e}_t^{-1})_{\sharp } \mu _t\), and so:

$$\begin{aligned} \mu _{\tau }\left( \left\{ \gamma _{\tau } : \varphi _{s}(\gamma _{s}) \in I, \gamma \in G \right\} \right) = \mu _{t}\left( \left\{ \gamma _{t} : \varphi _{s}(\gamma _{s}) \in I, \gamma \in G \right\} \right) . \end{aligned}$$

It follows that \({\mathfrak {q}}^{t}_{s}(I) > 0\) iff \({\mathfrak {q}}^{\tau }_{s}(I) > 0\), thereby establishing the first assertion.

(2) Thanks to the first assertion, it suffices to consider the case \(t=s\). Recall that \(\Phi _{s}^{s} = \varphi _{s}\). The claim then boils down to showing that \({\mathfrak {m}}(\varphi _{s}^{-1} (I) \cap \mathrm{e}_{s}(G)) = 0\) whenever \(I \subset \varphi _{s}(\mathrm{e}_{s}(G))\) is a compact set with \({\mathcal {L}}^{1}(I) = 0\).

By compactness, we fix a ball \(B_{r}(o)\) containing \(\mathrm{e}_{s}(G)\). Since \(\varphi _{s}\) is Lipschitz continuous on bounded sets (Corollary 3.10 (1)), possibly using a cut-off Lipschitz function over \(B_{r}(o)\), we may assume that \(\varphi _{s}\) has bounded total variation measure \(\Vert D \varphi _{s} \Vert \) (we refer to [54] and [9] for all missing notions and background regarding BV-functions on metric-measure spaces). From the local Poincaré inequality (see Remark 7.5 and [54, page 992]) and the doubling property (see Lemma 6.12 and recall that \(\text {supp}({\mathfrak {m}}) = X\)), it follows that the total variation measure of \(\varphi _{s}\) is absolutely continuous with respect to \({\mathfrak {m}}\), and that

$$\begin{aligned} \exists c > 0 \;\;\; c |\nabla \varphi _{s}| \, {\mathfrak {m}}\le \Vert D \varphi _{s} \Vert \le |\nabla \varphi _{s}| \, {\mathfrak {m}}\end{aligned}$$
(10.8)

(see [54, page 992] or [12, Section 4]), where

$$\begin{aligned} |\nabla \varphi _{s}|(x) : = \liminf _{\delta \rightarrow 0} \sup _{y \in B_{\delta }(x)} \frac{|\varphi _{s}(y)-\varphi _{s}(x)|}{\delta }. \end{aligned}$$

By [31, Theorem 6.1], the previous quantity in fact coincides in our setting with the pointwise Lipschitz constant of \(\varphi _{s}\) at x, which in turn coincides with \(\ell ^+_s(x)\) by [6, Theorem 3.6]; hence for \(x = \gamma _s\) we have \(|\nabla \varphi _{s}|(x) =\ell _s(x)\). By the co-area formula (see [54, Proposition 4.2]), for any Borel set \(A \subset B_{r}(o)\):

$$\begin{aligned} \int _{-\infty }^{+\infty } \Vert \partial \{ \varphi _{s} > \tau \} \Vert (A) \, d\tau = \Vert D\varphi _{s} \Vert (A), \end{aligned}$$
(10.9)

where \(\Vert \partial \{ \varphi _{s} > \tau \} \Vert \) denotes the total variation measure associated to the set of finite perimeter \(\{ \varphi _{s} > \tau \}\). From [1, Theorem 5.3] it follows that \(\Vert \partial \{ \varphi _{s} > \tau \} \Vert \) is concentrated on \(\{\varphi _{s} = \tau \}\) and therefore, for any Borel set \(I \subset \varphi _{s}(\mathrm{e}_{s}(G))\) with \({\mathcal {L}}^{1}(I) = 0\), it follows by (10.9) and (10.8):

$$\begin{aligned} \Vert D \varphi _{s} \Vert (\varphi _{s}^{-1}(I)) = 0 \; , \; |\nabla \varphi _{s}| \,{\mathfrak {m}}(\varphi _{s}^{-1}(I)) = 0 . \end{aligned}$$

Since \(|\nabla \varphi _{s}|(x) = \ell _s(x) > 0\) for \(x \in \mathrm{e}_{s}(G)\), it follows that \({\mathfrak {m}}(\varphi _{s}^{-1}(I) \cap \mathrm{e}_{s}(G)) = 0\), thereby concluding the proof. \(\square \)
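The role the co-area formula (10.9) plays in the proof can be illustrated by its elementary one-dimensional counterpart \(\int _A |f'|\,dx = \int _{{\mathbb {R}}} \#(f^{-1}(\tau ) \cap A)\, d\tau \), checked numerically below for the hypothetical choice \(f(x)=x^2\) on \(A=[0,1]\):

```python
import numpy as np

# One-dimensional sketch of the co-area formula:
#   int_A |f'| dx  =  int_R #(f^{-1}(tau) cap A) dtau,
# checked for f(x) = x^2 on A = [0, 1].
xs = np.linspace(0.0, 1.0, 200001)
lhs = np.sum(np.abs(2.0 * xs[:-1]) * np.diff(xs))     # int_0^1 |f'| dx

taus = np.linspace(0.0, 1.0, 200001)
# f is strictly increasing on [0, 1], so each level tau in (0, 1) is hit once
counts = np.ones_like(taus)
rhs = np.sum(counts[:-1] * np.diff(taus))             # int_0^1 1 dtau
```

In particular, an \({\mathcal {L}}^1\)-null set of levels contributes nothing to the right-hand side, which is the one-dimensional analogue of how (10.9) and (10.8) force \(|\nabla \varphi _{s}| \,{\mathfrak {m}}(\varphi _{s}^{-1}(I)) = 0\) above.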

Remark 10.9

Inspecting the proof of Proposition 10.8, from the co-area formula ([54, Proposition 4.2]) and the Hausdorff representation of the perimeter measure ([1, Theorem 5.3]), it follows that for \({\mathfrak {q}}^{s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\) the measure \({\mathfrak {m}}^{s}_{a_{s}}\) is absolutely continuous with respect to the Hausdorff measure of codimension one (see [1] for more details).

Employing the previous proposition, we define:

$$\begin{aligned} {\mathfrak {m}}^{t}_{a_{s}} := (d{\mathfrak {q}}^t_s/d{\mathcal {L}}^1) \cdot {\hat{{\mathfrak {m}}}}^{t}_{a_{s}} , \end{aligned}$$

obtaining from (10.7) the following disintegration (for every \(s,t \in (0,1)\)):

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)} = \int _{\varphi _{s}(\mathrm{e}_{s}(G))} {\mathfrak {m}}^{t}_{a_{s}} \, {\mathcal {L}}^{1}(da_{s}), \end{aligned}$$
(10.10)

with \({\mathfrak {m}}^{t}_{a_{s}}\) concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\), for \({\mathcal {L}}^1\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\).
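The disintegration mechanism behind (10.7) and (10.10) has a transparent discrete analogue (with hypothetical atoms of our own choosing): push the measure forward under the quotient map to obtain the quotient measure, and normalize its restriction to each level set to obtain the conditional measures.

```python
from collections import defaultdict

# Discrete sketch of disintegration over a quotient map: a finite measure m
# on points x, the quotient map Phi, the quotient measure q = Phi_# m, and
# conditional probability measures on the level sets of Phi.
m = {0.0: 0.3, 0.5: 0.2, 1.0: 0.1, 1.5: 0.4}       # hypothetical atoms x -> m({x})
Phi = lambda x: x // 1.0                            # illustrative quotient map

q = defaultdict(float)                              # q = Phi_# m
for x, w in m.items():
    q[Phi(x)] += w

# conditional probability measure on each level set {Phi = a}
cond = {a: {x: w / q[a] for x, w in m.items() if Phi(x) == a} for a in q}

# integrating the conditional measures against q recovers m
recovered = {x: q[Phi(x)] * cond[Phi(x)][x] for x in m}
```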

We now shed light on the relation of the above disintegration to \(L^2\)-Optimal-Transport, by relating it to another disintegration formula for \(\nu \), the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\). Observe that the family of sets \(\{G_{a_{s}}\}_{a_{s}\in {\mathbb {R}}}\) is a partition of G and that \(G_{a_{s}} = \left\{ \varphi _{s} \circ \mathrm{e}_{s} = a_{s}\right\} \). Since the quotient-map \(\varphi _{s} \circ \mathrm{e}_{s} : \mathrm{Geo}(X) \rightarrow {\mathbb {R}}\) is continuous and G is compact, the Disintegration Theorem 6.19 ensures the existence of an essentially unique disintegration of \(\nu \) strongly consistent with \(\varphi _{s} \circ \mathrm{e}_{s}\):

$$\begin{aligned} \nu = \int _{\varphi _{s}(\mathrm{e}_{s}(G))} \nu _{a_{s}} \, {\mathfrak {q}}^{\nu }_{s}(da_{s}), \end{aligned}$$
(10.11)

so that for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the probability measure \(\nu _{a_s}\) is concentrated on \(G_{a_s}\). Clearly \({\mathfrak {q}}^{\nu }_{s}(\varphi _{s}(\mathrm{e}_{s}(G))) = \left\| \nu \right\| =1\).
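To fix ideas, the defining consistency property of such a disintegration can be illustrated in a toy setting. The following numerical sketch is purely illustrative (the quotient map \(q(x,y)=x\) on the unit square and the test function are our choices, not objects from the text): it verifies that integrating first along the fibers \(\{a\} \times [0,1]\) against the (uniform) conditional measures and then against the quotient measure recovers the integral against the whole measure, as in (10.11).

```python
import numpy as np

# Toy disintegration: m = Lebesgue on the unit square, quotient map q(x, y) = x.
# The conditional measure m_a is the uniform probability measure on the fiber
# {a} x [0,1], and the quotient measure is Lebesgue on [0,1].

def f(x, y):
    # an arbitrary bounded continuous test function
    return np.cos(3 * x) * (1 + y ** 2)

n = 1000
grid = (np.arange(n) + 0.5) / n  # midpoint quadrature nodes on [0,1]

X, Y = np.meshgrid(grid, grid, indexing="ij")

# Left-hand side: integral of f against m over the square.
lhs = f(X, Y).mean()

# Right-hand side: first integrate f along each fiber (against m_a),
# then integrate the fiber integrals against the quotient measure.
fiber_integrals = f(X, Y).mean(axis=1)   # a  ->  integral of f against m_a
rhs = fiber_integrals.mean()

assert abs(lhs - rhs) < 1e-9  # equality is exact up to floating point (Fubini)
```

This is of course nothing but Fubini's theorem on a product space; the point of the Disintegration Theorem is that an analogous decomposition exists, essentially uniquely, for general quotient maps without any product structure.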

Corollary 10.10

  1. (1)

    For any \(s \in (0,1)\), the quotient measures \({\mathfrak {q}}^{\nu }_{s}\) and \({\mathfrak {q}}^{s}_{s}\) are mutually absolutely continuous, and in particular \({\mathfrak {q}}^{\nu }_{s}\) is absolutely continuous with respect to \({\mathcal {L}}^{1}\).

  2. (2)

    For any \(s,t \in (0,1)\) and \({\mathcal {L}}^1\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\):

    $$\begin{aligned} \rho _{t} \cdot {\mathfrak {m}}^{t}_{a_{s}} = q^{\nu }_s(a_{s}) \cdot (\mathrm{e}_{t})_{\#} \nu _{a_{s}} , \end{aligned}$$
    (10.12)

    where \(q^{\nu }_s := d {\mathfrak {q}}^{\nu }_{s} / d{\mathcal {L}}^{1}\). In particular, \({\mathfrak {m}}^{t}_{a_{s}}\) and \((\mathrm{e}_{t})_{\#} \nu _{a_{s}}\) are mutually absolutely-continuous for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\).

  3. (3)

    In particular, for any \(s \in (0,1)\) and \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the map:

    $$\begin{aligned}{}[0,1] \ni t \mapsto \rho _{t} \cdot {\mathfrak {m}}^{t}_{a_{s}} \end{aligned}$$

    coincides for \({\mathcal {L}}^{1}\)-a.e. \(t \in [0,1]\) with the \(W_2\)-geodesic \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu _{a_{s}}\) up to a positive multiplicative constant depending only on \(a_{s}\).

Proof

Recall that \(\mu _s \ll {\mathfrak {m}}\) is supported on \(\mathrm{e}_s(G)\) and \(\rho _{s} > 0\) there (see Definition 10.1), so that \(\mu _s\) and \({\mathfrak {m}}\llcorner _{\mathrm{e}_{s}(G)}\) are mutually absolutely-continuous. It immediately follows that the same holds for \((\varphi _{s} )_{\#} \mu _s\) and \({\mathfrak {q}}^s_s = (\varphi _{s} )_{\#} {\mathfrak {m}}\llcorner _{\mathrm{e}_{s}(G)}\). But:

$$\begin{aligned} (\varphi _{s} )_{\#} (\mu _s) = (\varphi _{s} )_{\#} ( (\mathrm{e}_{s})_{\#} \nu ) = (\varphi _{s} \circ \mathrm{e}_{s} )_{\#} ( \nu ) = {\mathfrak {q}}^{\nu }_s , \end{aligned}$$

establishing (1).

Denoting the resulting probability density \(q^{\nu }_s := d {\mathfrak {q}}^{\nu }_{s} / d{\mathcal {L}}^{1}\), (10.11) translates to:

$$\begin{aligned} \nu = \int _{\varphi _{s}(\mathrm{e}_{s}(G))} q^{\nu }_{s}(a_{s}) \nu _{a_{s}} \, {\mathcal {L}}^{1}(da_{s}) . \end{aligned}$$

Pushing forward both sides via the evaluation map \(\mathrm{e}_t\) given \(t \in (0,1)\), we obtain:

$$\begin{aligned} \rho _{t} {\mathfrak {m}}= \int _{\varphi _{s}(\mathrm{e}_{s}(G))} q^{\nu }_s(a_{s}) \cdot (\mathrm{e}_{t})_{\#}\nu _{a_{s}} \, {\mathcal {L}}^{1}(da_{s}), \end{aligned}$$

with \(q^{\nu }_s(a_{s}) \cdot (\mathrm{e}_{t})_{\#}\nu _{a_{s}}\) concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\) for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\). On the other hand, multiplying both sides of (10.10) by \(\rho _t\) (which is supported on \(\mathrm{e}_t(G)\)), we obtain

$$\begin{aligned} \rho _t {\mathfrak {m}}= \int _{\varphi _{s}(\mathrm{e}_{s}(G))} \rho _t \cdot {\mathfrak {m}}^{t}_{a_{s}} \, {\mathcal {L}}^{1}(da_{s}), \end{aligned}$$

with \(\rho _t \cdot {\mathfrak {m}}^{t}_{a_{s}}\) concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\) for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\). By the essential uniqueness of the disintegration (Theorem 6.19), noting that \(\varphi _s(\mathrm{e}_s(G))\) is compact, (10.12) immediately follows. As \(\rho _t > 0\) on \(\mathrm{e}_t(G)\) (see Definition 10.1) and \(q^\nu _s(a_s) \in (0,\infty )\) for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the “in particular” part of (2) is also established.

Finally, by Fubini’s theorem, it follows that for each \(s \in (0,1)\) and \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), (10.12) holds with \(q^\nu _s(a_s) \in (0,\infty )\) for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\). Note that for \({\mathfrak {q}}^{\nu }_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the curve \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu _{a_{s}}\) is a \(W_2\)-geodesic (since \(\nu _{a_s}\) is concentrated on \(G_{a_s} \subset G\)). This establishes (3), thereby concluding the proof. \(\square \)

11 Comparison between conditional measures

So far we have proved, under Assumption 10.2, that for each \(s \in (0,1)\) we have the following two families of disintegrations:

$$\begin{aligned} {\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)} = \int _{\varphi _{s}(\mathrm{e}_{s}(G))} {\mathfrak {m}}^{t}_{a_{s}} \, {\mathcal {L}}^{1} (da_{s}) \;\;\; \text { and } \;\;\; {\mathfrak {m}}\llcorner _{\mathrm{e}_{[0,1]}(G_{a_{s}})} = \int _{[0,1]} {\mathfrak {m}}^{a_{s}}_{t} \, {\mathcal {L}}^{1} (dt)\nonumber \\ \end{aligned}$$
(11.1)

for each \(t \in (0,1)\) and each \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), respectively, corresponding to the partitions:

$$\begin{aligned} \{\mathrm{e}_t(G_{a_s}) \}_{a_s \in \varphi _{s}(\mathrm{e}_{s}(G))} \;\;\; \text { and } \;\;\; \{ \mathrm{e}_t(G_{a_s}) \}_{t \in (0,1)} . \end{aligned}$$

Moreover, both \({\mathfrak {m}}_{a_{s}}^{t}\) and \({\mathfrak {m}}_{t}^{a_{s}}\) are concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\), for each \(t \in (0,1)\) and \({\mathcal {L}}^1\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), and for each \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) and all \(t \in (0,1)\), respectively, so that the above disintegrations are strongly consistent with respect to the corresponding partition. In addition, we have by (10.6) and (10.12) for all \(s,t \in (0,1)\) and a.e. \(a_s \in \varphi _s(\mathrm{e}_{s}(G))\):

$$\begin{aligned} {\mathfrak {m}}_t^{a_s} = (\mathrm{e}_t \circ \mathrm{e}_s^{-1})_{\sharp }(h^{a_s}_{\cdot }(t) {\mathfrak {m}}_s^{a_s}) ~,~ \rho _t {\mathfrak {m}}_{a_s}^t = (\mathrm{e}_t \circ \mathrm{e}_s^{-1})_{\sharp }(\rho _s {\mathfrak {m}}_{a_s}^s) . \end{aligned}$$
(11.2)

The goal of the first subsection, in which we retain Assumption 10.2, is to prove that \({\mathfrak {m}}^{t}_{a_{s}}\) and \({\mathfrak {m}}_{t}^{a_{s}}\) are in fact equivalent measures. We will prove in particular that for all \(s \in (0,1)\):

$$\begin{aligned} {\mathfrak {m}}_t^{a_s} = \partial _t \Phi _s^t \; {\mathfrak {m}}_{a_s}^t \;\;\; \text {for a.e. }t \in (0,1), a_s \in \varphi _s(\mathrm{e}_{s}(G)) . \end{aligned}$$
(11.3)

A heuristic formal argument for establishing (11.3) may be seen as follows. Writing \(\Phi _s^t(x) = \Phi _s(t,x)\), we have:

$$\begin{aligned}&\mathrm{e}_t(G_{a_s}) = \mathrm{e}_t(G) \cap \left\{ x\in X \; ; \; \Phi _s(t,x) = a_s\right\} \\&\quad = \mathrm{e}_t(G) \cap \left\{ x \in X \; ; \; \Phi _s(\cdot ,x)^{-1}(a_s) = t\right\} . \end{aligned}$$

Formally applying the coarea formula (assuming spatial regularity), we have:

$$\begin{aligned} \frac{{\mathfrak {m}}_t^{a_s}}{{\mathfrak {m}}_{a_s}^t} = \frac{\left| \nabla _x \Phi _s(t,x)\right| }{\left| \nabla _x \Phi _s(\cdot ,x)^{-1}(a_s)\right| } = \left| - \partial _t \Phi _s(t,x)\right| , \end{aligned}$$

where the last transition follows by the implicit function theorem, which yields \(\nabla _x \Phi _s + \partial _t \Phi _s \cdot \nabla _x\Phi ^{-1}_s = 0\).

In the second subsection, we deduce the change-of-variables formula (1.6) for the density along geodesics, discarding Assumption 10.2. An insightful heuristic argument may be seen by combining (11.2) and (11.3) as follows:

$$\begin{aligned} \frac{\partial _\tau |_{\tau =t}\Phi _s^\tau (\gamma _t)}{\rho _t(\gamma _t)} = \left. \frac{{\mathfrak {m}}_t^{a_s}}{\rho _t {\mathfrak {m}}_{a_s}^t}\right| _{\gamma _t} = \left. \frac{h^{a_s}_{\cdot }(t) {\mathfrak {m}}_{s}^{a_s}}{\rho _s {\mathfrak {m}}_{a_s}^s} \right| _{\gamma _s} = \frac{h^{a_s}_{\gamma _s}(t)}{\rho _s(\gamma _s)} {\partial _\tau |_{\tau =s} \Phi _s^\tau (\gamma _s)} = \frac{h^{a_s}_{\gamma _s}(t)}{\rho _s(\gamma _s)} {\ell (\gamma )^2} . \end{aligned}$$

11.1 Equivalence of conditional measures

Recall that Assumption 10.2 is still in force in this subsection. We start with the following auxiliary lemma:

Lemma 11.1

For every \(s,t\in (0,1)\) and \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the following limit:

$$\begin{aligned} {\mathfrak {m}}_{t}^{a_{s}} = \lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon } \, {\mathfrak {m}}\llcorner _{\mathrm{e}_{[t-\varepsilon ,t+\varepsilon ]}(G_{a_{s}})} \end{aligned}$$

holds true in the weak topology. Moreover, for any \(f \in C_b(X)\), the map \(\varphi _{s}(\mathrm{e}_{s}(G)) \ni a_{s} \mapsto \int _{X} f {\mathfrak {m}}_{t}^{a_{s}}\) is Borel.

Proof

By Proposition 10.7, \((0,1) \ni t\mapsto {\mathfrak {m}}^{a_{s}}_{t}\) is continuous in the weak topology, and so together with (11.1), we see that for any \(f \in C_b(X)\):

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon } \int _X f(z) {\mathfrak {m}}\llcorner _{\mathrm{e}_{[t-\varepsilon ,t+\varepsilon ]}(G_{a_{s}})}(dz) \\&\quad = \lim _{\varepsilon \rightarrow 0}\frac{1}{2\varepsilon } \int _{t-\varepsilon }^{t+\varepsilon } \bigg ( \int _{X} f(z){\mathfrak {m}}^{a_s}_{\tau }(dz) \bigg ) {\mathcal {L}}^{1}(d \tau ) = \int _{X} f(z) {\mathfrak {m}}^{a_{s}}_{t}(dz) , \end{aligned}$$

thereby concluding the proof of the first assertion. For the second assertion, given a compact set \(I \subset [0,1]\), consider the compact set

$$\begin{aligned} K : = \{ (x,t,\gamma ,a_{s}) \in X \times I \times G \times \varphi _{s}(\mathrm{e}_s(G)) :x = \gamma _{t}, \ \varphi _{s}(\gamma _{s}) = a_{s} \}. \end{aligned}$$

Hence \(B := P_{1,4}(K) = \{ (x , a_s) \; ; \; x \in \mathrm{e}_I(G_{a_s}) ,\; a_s \in \varphi _{s}(\mathrm{e}_s(G)) \} \) is compact as well. It follows by Fubini’s theorem that the map \(\varphi _{s}(\mathrm{e}_{s}(G)) \ni a_{s} \mapsto \int _{\mathrm{e}_I(G_{a_s})} f \, d{\mathfrak {m}}\) is Borel. Taking \(I = [t-\varepsilon ,t+\varepsilon ]\), employing the first assertion, and recalling that the pointwise limit of Borel functions is Borel, the second assertion follows. \(\square \)

Remark 11.2

One may similarly show (employing an additional density argument) that for every \(s,t\in (0,1)\) and \({\mathcal {L}}^{1}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G))\), the following limit:

$$\begin{aligned} {\mathfrak {m}}_{a_{s}}^{t} = \lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon } \, {\mathfrak {m}}\llcorner _{(\Phi _{s}^{t})^{-1}[a_{s}-\varepsilon ,a_{s}+\varepsilon ] \cap \mathrm{e}_t(G) } \end{aligned}$$

holds true in the weak topology, but this will not be required.

We now find explicit expressions for the densities.

Theorem 11.3

For any \(s \in (0,1)\),

$$\begin{aligned} {\mathfrak {m}}^{a_{s}}_{s} = \ell _{s}^{2} \cdot {\mathfrak {m}}^{s}_{a_{s}} \;\;\; \text { for }{\mathcal {L}}^{1}\text {-a.e. }a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G)) . \end{aligned}$$
(11.4)

Moreover, for any \(s \in (0,1)\) and \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) including at \(t=s\), \(\partial _{t}\Phi _{s}^{t}(x)\) exists and is positive for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) and \({\mathfrak {m}}_{a_{s}}^{t}\)-a.e. x, and we have:

$$\begin{aligned} {\mathfrak {m}}^{a_{s}}_{t} = \partial _{t}\Phi _{s}^{t} \cdot {\mathfrak {m}}^{t}_{a_{s}} \;\;\; \text { for }{\mathcal {L}}^{1}\text {-a.e. }a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G)) . \end{aligned}$$
(11.5)

For the ensuing proof, it will be convenient to introduce the following notation. For all \(t_0 \in {\mathbb {R}}\) and \(x_0 \in X\), denote

$$\begin{aligned} \i ^1_{t_0} : X \ni x \mapsto (t_0,x) \in {\mathbb {R}}\times X ~ , ~ \i ^2_{x_0} : {\mathbb {R}}\ni t \mapsto (t,x_0) \in {\mathbb {R}}\times X . \end{aligned}$$

Recall that G(x) denotes the section \(\left\{ t \in [0,1] \; ; \; \exists \gamma \in G \;,\;\gamma _t = x\right\} \) and \(\mathring{G}(x) = G(x) \cap (0,1)\).

Proof of Theorem 11.3

Step 1. Fix \(s,t \in (0,1)\). By Lemma 11.1 and the boundedness of \(\Vert {\mathfrak {m}}_{\tau }^{a_{s}}\Vert \) uniformly in \(a_{s}\) and \(\tau \in [0,1]\) (see Proposition 10.7), it is easy to deduce (e.g. by Dominated Convergence Theorem) the following limit of measures on \(\varphi _{s}(\mathrm{e}_{s}(G)) \times X\) in the weak topology (i.e. in duality with \(C_{b}(\varphi _{s}(\mathrm{e}_{s}(G)) \times X)\)):

$$\begin{aligned} \int _{\varphi _{s}(\mathrm{e}_{s}(G))} (\i ^1_{a_s})_{\sharp } ({\mathfrak {m}}^{a_{s}}_{t}) {\mathcal {L}}^1(da_{s}) = \lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon } \int _{t-\varepsilon }^{t+\varepsilon } \int _{\varphi _{s}(\mathrm{e}_{s}(G))}(\i ^1_{a_s})_{\sharp } ({\mathfrak {m}}^{a_{s}}_{\tau }) \, {\mathcal {L}}^1(da_{s}) \, {\mathcal {L}}^1(d\tau ) . \end{aligned}$$

Using Fubini’s Theorem and (11.1), we proceed as follows:

$$\begin{aligned} \nonumber =&\lim _{\varepsilon \rightarrow 0} \frac{1}{2 \varepsilon } \int _{\varphi _{s}(\mathrm{e}_{s}(G))} (\i ^1_{a_s})_{\sharp } ({\mathfrak {m}}\llcorner _{\mathrm{e}_{[t-\varepsilon ,t+\varepsilon ]}(G_{a_s})}) {\mathcal {L}}^1(da_s) \\ \nonumber =&\lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon }(\mathcal {L}^{1}\otimes {\mathfrak {m}}) \llcorner \left\{ \begin{array}{l} (a_s,x) \in \varphi _{s}(\mathrm{e}_{s}(G)) \times X \; ; \\ \gamma _{\tau } = x, \, \gamma \in G, \,\varphi _{s}(\gamma _{s}) = a_s, \tau \in (t-\varepsilon ,t+\varepsilon ) \end{array} \right\} \\ \nonumber =&\lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon } (\mathcal {L}^{1}\otimes {\mathfrak {m}}) \llcorner \left\{ \begin{array}{l} (a_s,x) \in \varphi _{s}(\mathrm{e}_{s}(G)) \times X \; ; \\ a_s = \Phi _{s}^{\tau }(x), \tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x) \end{array} \right\} \\ =&\lim _{\varepsilon \rightarrow 0} \int _{\cup _{|\tau - t| < \varepsilon } \mathrm{e}_{\tau }(G)} \frac{1}{2\varepsilon } (\i ^2_{x})_{\sharp }(\mathcal {L}^{1} \llcorner \{ \Phi _{s}^{\tau }(x) \; ; \; \tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x) \} ) \, {\mathfrak {m}}(dx) . \end{aligned}$$
(11.6)

Moreover, we claim that it is enough to integrate on \(\mathrm{e}_{t}(G)\) above:

$$\begin{aligned} = \lim _{\varepsilon \rightarrow 0} \int _{ \mathrm{e}_{t}(G)} \frac{1}{2\varepsilon } (\i ^2_{x})_{\sharp }({\mathcal {L}}^{1} \llcorner \{ \Phi _{s}^{\tau }(x) \; ; \; \tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x) \} ) \, {\mathfrak {m}}(dx) .\nonumber \\ \end{aligned}$$
(11.7)

To see this, recall that by Proposition 4.4 (3) (relying on Theorem 3.11 (2)), the map \((t-\varepsilon ,t+\varepsilon )\cap \mathring{G}(x) \ni \tau \mapsto \Phi _{s}^{\tau }(x)\) is Lipschitz with Lipschitz constant bounded uniformly in \(\varepsilon \in (0,t/2 \wedge (1-t)/2)\) and \(x \in \cup _{|\tau - t|<\varepsilon } \mathrm{e}_{\tau }(G)\) (recall that for any \(\gamma \in G\), \(\ell (\gamma )\le 1/c\)); we denote the latter Lipschitz bound by L.

$$\begin{aligned} \frac{1}{2\varepsilon }{\mathcal {L}}^{1} \llcorner \{ \Phi _{s}^{\tau }(x) \; ; \; \tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\} \end{aligned}$$

is bounded in the total-variation norm by L, uniformly in \(\varepsilon \) and x as above. But the compact sets \(\cup _{|\tau - t|< \varepsilon } \mathrm{e}_{\tau }(G)\) decrease to \(\mathrm{e}_{t}(G)\) as \(\varepsilon \rightarrow 0\), so by continuity of \({\mathfrak {m}}\) from above:

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} {\mathfrak {m}}(\cup _{|\tau - t|< \varepsilon } \mathrm{e}_{\tau }(G) \setminus \mathrm{e}_{t}(G)) = 0, \end{aligned}$$

and so we can modify the domain of integration in (11.6) yielding (11.7).

Step 2. Fixing \(x \in \mathrm{e}_t(G)\), we now focus on the weak limit:

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{1}{2\varepsilon }{\mathcal {L}}^{1} \llcorner \{ \Phi _{s}^{\tau }(x) \; ; \; \tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x) \} . \end{aligned}$$

Recall that \((t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x) \ni \tau \mapsto \Phi _{s}^{\tau }(x)\) has Lipschitz constant bounded by L, and moreover, is increasing by Proposition 4.4 (3). Now extend it to the entire (0, 1) while preserving (non-strict) monotonicity and the bound on the Lipschitz constant, e.g. \({\hat{\Phi }}_{s}^{\tau }(x) := \inf _{r \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} \Phi _{s}^{r}(x) + L (\tau - r)_+\). Then for any \(f \in C_{b}({\mathbb {R}})\), by the change-of-variables formula for (monotone) Lipschitz functions:

$$\begin{aligned}&\frac{1}{2\varepsilon } \int _{\{\Phi _{s}^{\tau }(x) \; ; \; \tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\} } f(a) \, {\mathcal {L}}^1(da) \\&\quad = \frac{1}{2\varepsilon } \int _{ (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} f(\Phi _{s}^{\tau }(x)) \partial _{\tau } {\hat{\Phi }}_{s}^{\tau }(x) \, {\mathcal {L}}^1(d\tau ) \\&\quad = \frac{1}{2\varepsilon } \int _{ (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} f(\Phi _{s}^{\tau }(x)) \partial _{\tau } \Phi _{s}^{\tau }(x) \, {\mathcal {L}}^1(d\tau ) ; \end{aligned}$$

the last transition follows since \(\tau \mapsto \Phi _{s}^{\tau }(x)\) is differentiable a.e. on \(D_{\ell }(x)\) and hence \(\partial _{\tau } \Phi _{s}^{\tau }(x) = \partial _{\tau } \Phi _{s}^{\tau }(x)|_{(t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)}\) for a.e. \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) by Remark 2.1, and in addition since \(\partial _{\tau } \Phi _{s}^{\tau }(x)|_{(t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} = \partial _{\tau } {\hat{\Phi }}_{s}^{\tau }(x)\) for a.e. \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) by Remark 2.2. Recall that Proposition 4.4 ensures that for all \(x \in X\), \(\partial _t \Phi _{s}^{t}(x)\) exists for \({\mathcal {L}}^1\)-a.e. \(t \in \mathring{G}(x)\), including at \(t=s\) if \(s \in \mathring{G}(x)\) (in which case \(\partial _t \Phi _{s}^{t}|_{t = s} = \ell _s^2(x)\)). Moreover, Corollary 4.5 and our assumption that \(G \subset G_\varphi ^+\) ensure that \(\partial _t \Phi _{s}^{t}(x) > 0\) for \({\mathcal {L}}^1\)-a.e. \(t \in \mathring{G}(x)\), including at \(t=s\). Applying Fubini’s theorem, we have

$$\begin{aligned} 0= & {} \int _{X}{\mathcal {L}}^{1}(\mathring{G}(x) \setminus \{ t \in \mathring{G}(x) :\exists \partial _{t} \Phi _{s}^{t}(x)> 0 \} ){\mathfrak {m}}(dx) \\= & {} \int _0^1 {\mathfrak {m}}( \mathrm{e}_t(G) \setminus \{ x \in \mathrm{e}_t(G) :\exists \partial _{t} \Phi _{s}^{t}(x) > 0 \} ) {\mathcal {L}}^1(dt). \end{aligned}$$

It follows that for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\), \(\partial _t \Phi _{s}^{t}(x)\) exists and is positive for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\) (including at \(t=s\) for all \(x \in \mathrm{e}_s(G)\)).
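The one-dimensional change-of-variables identity for monotone Lipschitz maps used in Step 2 can be sanity-checked numerically. In the following sketch (purely illustrative; the smooth increasing function g stands in for \(\tau \mapsto \Phi _s^{\tau }(x)\), and f for the test function), both sides of \(\int _{g((t_0,t_1))} f(a)\,da = \int _{t_0}^{t_1} f(g(\tau )) g'(\tau )\,d\tau \) are computed by quadrature:

```python
import numpy as np

# Change of variables for an increasing Lipschitz map g:
#   integral of f over g((t0,t1))  ==  integral of f(g(tau)) g'(tau) over (t0,t1).

def trap(y, x):
    # trapezoid rule on the nodes x
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

g = lambda t: t + 0.3 * np.sin(t)          # increasing: g' = 1 + 0.3 cos t >= 0.7 > 0
dg = lambda t: 1.0 + 0.3 * np.cos(t)
f = lambda a: np.exp(-a) * np.cos(a)

t0, t1 = 0.2, 1.7
tau = np.linspace(t0, t1, 200_001)
a = np.linspace(g(t0), g(t1), 200_001)     # g((t0,t1)) = (g(t0), g(t1)), g increasing

lhs = trap(f(a), a)                        # integral over the image interval
rhs = trap(f(g(tau)) * dg(tau), tau)       # pulled-back integral

assert abs(lhs - rhs) < 1e-8
```

The monotone extension trick in Step 2 is precisely what licenses applying this identity when g is only defined on the (possibly disconnected) set \((t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\).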

Step 3. We now claim that for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including \(t=s\), if \(f \in C_b({\mathbb {R}})\) and \(\Psi \in C_b(X)\) then

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0} \int _{\mathrm{e}_t(G)} \left[ \frac{1}{2 \varepsilon } \int _{ (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} f(\Phi ^\tau _s(x)) \partial _\tau \Phi _s^\tau (x) {\mathcal {L}}^1(d \tau )\right. \\&\qquad \qquad \qquad \left. - f(\Phi ^t_s(x)) \partial _t \Phi _s^t(x) \right] \Psi (x) {\mathfrak {m}}(dx) = 0. \end{aligned}$$

To this end, we will show that for such t’s, both

$$\begin{aligned} \text {I}_\varepsilon (x) := \frac{1}{2 \varepsilon } \int _{ (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} \left( f(\Phi ^\tau _s(x)) - f(\Phi ^t_s(x))\right) \partial _\tau \Phi _s^\tau (x) {\mathcal {L}}^1(d \tau ) , \end{aligned}$$

and

$$\begin{aligned} \text {II}_\varepsilon (x) := f(\Phi ^t_s(x)) \left[ \frac{1}{2 \varepsilon } \int _{ (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} \partial _\tau \Phi _s^\tau (x) {\mathcal {L}}^1(d \tau ) - \partial _t\Phi _s^t(x) \right] , \end{aligned}$$

tend to 0 in \(L^1(\mathrm{e}_t(G),{\mathfrak {m}})\) as \(\varepsilon \rightarrow 0\).

Step 4. To see the claim about \(\text {I}_\varepsilon \), since \(\left| \partial _\tau \Phi _s^\tau (x)\right| \le L\) (uniformly in \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) and \(x \in \mathrm{e}_t(G)\)), it is clear that \(\lim _{\varepsilon \rightarrow 0} \text {I}_\varepsilon (x) = 0\) pointwise by continuity of f and \(\mathring{G}(x) \ni \tau \mapsto \Phi ^\tau _s(x)\) (see Proposition 4.4). To obtain convergence in \(L^1(\mathrm{e}_t(G),{\mathfrak {m}})\), it is therefore enough to show by Dominated Convergence that

$$\begin{aligned} \frac{1}{2 \varepsilon } \int _{ (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} \left| f(\Phi ^\tau _s(x)) - f(\Phi ^t_s(x))\right| {\mathcal {L}}^1(d \tau ) \le C , \end{aligned}$$
(11.8)

uniformly in \(x \in \mathrm{e}_t(G)\). Since f is uniformly continuous on the compact set \(\varphi _s(\mathrm{e}_s(G))\), the uniform estimate (11.8) follows since \(\mathring{G}(x) \ni \tau \mapsto \Phi ^\tau _s(x)\) is Lipschitz on \([\delta ,1-\delta ]\), with Lipschitz constant depending only on \(\delta > 0\) and an upper bound on \(\{ \ell (\gamma ) \; ; \; \gamma \in G \}\) (see Proposition 4.4 (3) and Theorem 3.11 (2)).

Step 5. To see the claim about \(\text {II}_\varepsilon \), it is clearly enough to show that

$$\begin{aligned}&\tilde{\text {II}}_\varepsilon (x) := \frac{1}{2 \varepsilon } \int _{ (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)} \partial _\tau \Phi _s^\tau (x) {\mathcal {L}}^1(d \tau ) - \partial _t\Phi _s^t(x)\nonumber \\&\quad \rightarrow 0 \text { in }L^1(\mathrm{e}_t(G),{\mathfrak {m}}). \end{aligned}$$
(11.9)

Step 5a. We first establish (11.9) for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) (independently of f and \(\Psi \)). Since \(\partial _\tau \Phi _s^\tau (x) \le L\) uniformly in \(\tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\) and \(x \in \mathrm{e}_t(G)\), by Dominated Convergence, it is enough to establish pointwise convergence in (11.9) for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\).

For every \(x \in X\), denote

$$\begin{aligned} Leb(x) : = \{ t \in \mathring{G}(x) \; ; \; t \text { is a Lebesgue point of } \tau \mapsto \partial _{\tau }\Phi _{s}^{\tau }(x) 1_{\mathring{G}(x)}(\tau ) \} . \end{aligned}$$

By Proposition 4.4 (based on Theorem 3.11), we know that for every \(x \in X\), the map \(\tau \mapsto \partial _{\tau }\Phi _{s}^{\tau }(x)\) is in \(L^{\infty }_{loc}(\mathring{G}(x))\), and so by Lebesgue’s Differentiation Theorem, \({\mathcal {L}}^{1}(\mathring{G}(x) \setminus Leb(x)) = 0\). Integrating over \({\mathfrak {m}}\) and applying Fubini’s Theorem, it follows that for \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\):

$$\begin{aligned} {\mathfrak {m}}(\mathrm{e}_{t}(G) \setminus \{ x \in \mathrm{e}_{t}(G) \; ; \; t \text { is a Lebesgue point of } \tau \mapsto \partial _{\tau }\Phi _{s}^{\tau }(x) 1_{\mathring{G}(x)}(\tau ) \}) = 0 , \end{aligned}$$

thereby establishing (by definition) the pointwise convergence in (11.9) for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\).
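The Lebesgue-point property driving Step 5a admits a simple numerical illustration (ours, not part of the argument): for a continuous integrand, the symmetric averages \(\frac{1}{2\varepsilon }\int _{t-\varepsilon }^{t+\varepsilon } g(\tau )\,d\tau \) converge to \(g(t)\) as \(\varepsilon \rightarrow 0\), with g standing in for \(\tau \mapsto \partial _{\tau }\Phi _{s}^{\tau }(x) 1_{\mathring{G}(x)}(\tau )\):

```python
import numpy as np

# Symmetric averages around a Lebesgue point converge to the pointwise value.

g = lambda tau: np.cos(5 * tau)            # a continuous stand-in integrand
t = 0.4

def symmetric_average(eps, n=50_001):
    tau = np.linspace(t - eps, t + eps, n)
    y = g(tau)
    integral = float(np.sum((y[1:] + y[:-1]) * np.diff(tau)) / 2)  # trapezoid rule
    return integral / (2 * eps)

errors = [abs(symmetric_average(eps) - g(t)) for eps in (0.1, 0.01, 0.001)]
assert errors[0] > errors[1] > errors[2]   # the averages approach g(t)
assert errors[2] < 1e-4
```

For a merely locally bounded integrand, as in the proof, the convergence holds at every Lebesgue point, which is an \({\mathcal {L}}^1\)-full set of times by Lebesgue's Differentiation Theorem.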

Step 5b. We next establish (11.9) at \(t=s\). Write

$$\begin{aligned} \tilde{\text {II}}_\varepsilon (x)= & {} \frac{1}{2 \varepsilon } \int _{ (s-\varepsilon ,s+\varepsilon ) \cap \mathring{G}(x)} \left( \partial _\tau \Phi _s^\tau (x) - \ell _s^2(x)\right) {\mathcal {L}}^1(d \tau ) \\&+ \ell _s^2(x) \left[ \frac{1}{2 \varepsilon } \int _{ (s-\varepsilon ,s+\varepsilon ) \cap \mathring{G}(x)} {\mathcal {L}}^1(d \tau ) - 1 \right] . \end{aligned}$$

The first expression tends to 0 pointwise for all \(x \in X\) by Lemma 4.6, and hence by Dominated Convergence also in \(L^1(\mathrm{e}_s(G),{\mathfrak {m}})\) (since \(\left| \partial _\tau \Phi _s^\tau (x)\right| \le L\) and \(\ell _s(x) \le 1/c\) uniformly). The second expression tends to 0 in \(L^1(\mathrm{e}_s(G),{\mathfrak {m}})\) by Proposition 9.7 and the uniform boundedness of \(\ell _s^2(x)\).

Step 6. In other words, we have verified in Steps 3-5 the following weak convergence, for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including at \(t=s\):

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0} \int _{ \mathrm{e}_{t}(G)} \frac{1}{2\varepsilon } (\i ^2_{x})_{\sharp }({\mathcal {L}}^{1} \llcorner \{ \Phi _{s}^{\tau }(x) \; ; \; \tau \in (t-\varepsilon ,t+\varepsilon ) \cap \mathring{G}(x)\} ) \, {\mathfrak {m}}(dx) \\&\quad = \int _{\mathrm{e}_{t}(G)} (\i ^2_{x})_{\sharp }(\delta _{\Phi _s^t(x)}) \partial _{t}\Phi _{s}^{t}(x) \,{\mathfrak {m}}(dx) , \end{aligned}$$

where recall \(\Phi _s^s(x) = \varphi _s(x)\) and \(\partial _t \Phi _s^t |_{t=s} = \ell _{s}^{2}(x)\). Combining this with Step 1, we deduce that:

$$\begin{aligned} \int _{\varphi _{s}(\mathrm{e}_{s}(G))} (\i ^1_{a_s})_{\sharp } ({\mathfrak {m}}^{a_{s}}_{t}) {\mathcal {L}}^1(da_{s}) = \int _{\mathrm{e}_{t}(G)} (\i ^2_{x})_{\sharp }(\delta _{\Phi _s^t(x)}) \partial _{t}\Phi _{s}^{t}(x) \,{\mathfrak {m}}(dx) . \end{aligned}$$

Integrating this identity against \(1 \otimes \psi \), where \(1 \in C_b({\mathbb {R}})\) denotes the constant function and \(\psi \in C_{b}(X)\), we obtain:

$$\begin{aligned}&\int _{\varphi _{s}(\mathrm{e}_{s}(G))} \int _{\mathrm{e}_t(G)} \psi (x) \, {\mathfrak {m}}^{a_{s}}_{t}(dx) \, {\mathcal {L}}^1(d a_{s}) = \int _{\mathrm{e}_{t}(G)} \psi (x) \partial _{t}\Phi _{s}^{t}(x) \, {\mathfrak {m}}(dx) \\&\quad = \int _{\varphi _{s}(\mathrm{e}_{s}(G))} \int _{\mathrm{e}_t(G)} \psi (x) \, \partial _{t}\Phi _{s}^{t}(x) {\mathfrak {m}}^{t}_{a_{s}}(dx) \, {\mathcal {L}}^1(d a_{s}) , \end{aligned}$$

where we used that \({\mathfrak {m}}^{a_{s}}_{t}\) is concentrated on \(\mathrm{e}_t(G_{a_s}) \subset \mathrm{e}_t(G)\) for all \(t \in (0,1)\) and \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) in the first expression, and the disintegration (11.1) of \({\mathfrak {m}}\llcorner _{\mathrm{e}_{t}(G)}\) in the last transition. In other words, we obtained for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including at \(t=s\):

$$\begin{aligned} \int _{\varphi _{s}(\mathrm{e}_{s}(G))} {\mathfrak {m}}^{a_{s}}_{t} {\mathcal {L}}^1(d a_{s}) = \int _{\varphi _{s}(\mathrm{e}_{s}(G))} \partial _{t}\Phi _{s}^{t} \; {\mathfrak {m}}^{t}_{a_{s}} {\mathcal {L}}^1(d a_{s}) . \end{aligned}$$

Since \({\mathfrak {m}}_{a_s}^t\) is also concentrated on \(\mathrm{e}_{t}(G_{a_{s}})\) for all \(t \in (0,1)\) and \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\), the assertion follows by essential uniqueness of consistent disintegrations (Theorem 6.19). Note that by Step 2, \(\partial _{t}\Phi _{s}^{t}(x)\) exists and is positive for \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\) including at \(t=s\) for \({\mathfrak {m}}\)-a.e. \(x \in \mathrm{e}_t(G)\), and so by (11.1), the same holds for \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _{s}(\mathrm{e}_{s}(G))\) and \({\mathfrak {m}}_{a_{s}}^{t}\)-a.e. x. \(\square \)

11.2 Change-of-variables formula

We now obtain the following main result of Sects. 10 and 11. At this point, we dispense with Assumption 10.2.

Theorem 11.4

(Change-of-Variables) Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. verifying \({\mathsf {CD}}^{1}(K,N)\) with \(\text {supp}({\mathfrak {m}}) = X\), and let \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\). Let \(\nu \) denote the unique element of \(\mathrm {OptGeo}(\mu _{0},\mu _{1})\), and set \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\) for all \(t \in (0,1)\).

Then there exist versions of the densities \(\rho _t := d\mu _t / d{\mathfrak {m}}\), \(t \in [0,1]\), so that for \(\nu \)-a.e. \(\gamma \in \mathrm{Geo}(X)\), (9.4) holds for all \(0 \le s \le t \le 1\), and in particular, for \(\nu \)-a.e. \(\gamma \), \(t \mapsto \rho _t(\gamma _t)\) is positive and locally Lipschitz on (0, 1), and upper semi-continuous at \(t=0,1\).

Moreover, for any \(s\in (0,1)\), for \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) and \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), \(\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})\) exists, is positive, and the following change-of-variables formula holds:

$$\begin{aligned} \frac{\rho _{t}(\gamma _{t})}{\rho _{s} (\gamma _{s})} = \frac{\partial _{\tau }|_{\tau = t}\Phi _{s}^{\tau }(\gamma _{t})}{\ell ^{2}(\gamma )} \cdot \frac{1}{ h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}(t)} . \end{aligned}$$
(11.10)

Here \(\varphi \) denotes a Kantorovich potential associated to the c-optimal-transport problem between \(\mu _0\) and \(\mu _1\) with cost \(c = \mathsf {d}^2/2\), and \(\Phi _s^t\) denotes the time-propagated intermediate Kantorovich potential introduced in Sect. 4; \(h^{\varphi _{s}(\gamma _{s})}_{\gamma _{s}}\) is the \({\mathsf {CD}}(\ell (\gamma )^2 K ,N)\) density on [0, 1] from Proposition 10.4, after applying the re-normalization from Remark 10.5, so that \(h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}(s) = 1\). In particular, for \(\nu \text {-a.e. } \gamma \in G_\varphi ^+\), the above change-of-variables formula holds for \({\mathcal {L}}^{1}\)-a.e. \(t,s \in (0,1)\).

Lastly, for all \(\gamma \in G_\varphi ^0\), we have

$$\begin{aligned} \rho _t(\gamma _t) = \rho _s(\gamma _s) \;\;\; \forall t,s \in [0,1] . \end{aligned}$$
(11.11)

Recall that \(\nu \) is concentrated on \(G_\varphi = G_\varphi ^+ \cup G_\varphi ^0\), where \(G_\varphi ^+\) and \(G_\varphi ^0\) denote the subsets of positive and zero length \(\varphi \)-Kantorovich geodesics, respectively. Note that \(\partial _{t}|_{t=s}\Phi _{s}^{t}(\gamma _{s}) = \ell _s^2(\gamma _s) = \ell ^2(\gamma )\) by Proposition 4.4, so that together with our normalization that \(h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}(s) = 1\), we see that both sides of (11.10) are indeed equal to 1 for \(t=s\).
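As a quick sanity check of (11.10), consider the model case \(X = {\mathbb {R}}\), \({\mathfrak {m}}= {\mathcal {L}}^1\) (so \(K = 0\)), with translation transport \(\gamma _t = \gamma _0 + t w\) at a fixed speed \(w > 0\); all formulas below are computed for this toy model only, with \(c_s\) an unspecified normalization constant (the exact normalization of \(\varphi _s\) plays no role):

$$\begin{aligned} \ell (\gamma ) = w , \quad \varphi _s(x) = -w x + c_s , \quad \Phi _s^t(x) = \varphi _s\left( x - (t-s) w\right) = -w x + c_s + (t-s) w^2 , \end{aligned}$$

so that \(\partial _{\tau }|_{\tau = t}\Phi _s^{\tau }(\gamma _t) = w^2 = \ell ^2(\gamma )\) for all t, and \(\ell _s = |\nabla \varphi _s| = w\). Each fiber \(\mathrm{e}_{[0,1]}(G_{a_s})\) is a segment traversed at constant speed, so the corresponding \({\mathsf {CD}}(0,N)\) density is constant, i.e. \(h^{\varphi _s(\gamma _s)}_{\gamma _s} \equiv 1\) after the normalization \(h^{\varphi _s(\gamma _s)}_{\gamma _s}(s) = 1\). Both sides of (11.10) then equal 1, consistent with the fact that \(\rho _t(\gamma _t) = \rho _0(\gamma _0)\) for translations of \({\mathcal {L}}^1\).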

Proof of Theorem 11.4

Step 0. As usual, by Proposition 8.9 and Remark 8.11, \((X,\mathsf {d},{\mathfrak {m}})\) also verifies \({\mathsf {MCP}}(K,N)\), and so Theorem 6.15 and all the results of Sect. 9 apply. We will use the versions of the densities given by Corollary 9.5. On \(X^0 = \mathrm{e}_{[0,1]}(G_\varphi ^0)\), we know by Corollary 9.8 that \(\mu _0\llcorner _{X^0} = \mu _1\llcorner _{X^0} = \mu _t\llcorner _{X^0}\) for all \(t \in [0,1]\), and so if necessary, we simply redefine \(\rho _t|_{X^0} := \rho _0|_{X^0}\) for all \(t \in (0,1]\), so that (11.11) holds. Note that by Lemma 3.15, this will not affect \((0,1) \ni t \mapsto \rho _t(\gamma _t)\) for all \(\gamma \in G_{\varphi }^+\), and Corollary 9.8 (applied to the pair \(\mu _1,\mu _0\)) ensures that the same is true for \(\nu \)-a.e. \(\gamma \in G_{\varphi }^+\) at \(t=1\).

Step 1. As explained in the beginning of Sect. 10, by inner regularity of Radon measures, Corollary 9.5 (applied to both pairs \(\mu _0,\mu _1\) and \(\mu _1,\mu _0\)), Proposition 9.7 and Corollary 6.16, there exists a good compact subset \(G^{\varepsilon } \subset G^+_{\varphi }\) with \(\nu (G^{\varepsilon }) \ge \nu (G_\varphi ^+)-\varepsilon \) for any \(\varepsilon > 0\) (recall Definition 10.1). Of course, we may assume that \(G^{\varepsilon }\) is increasing as \(\varepsilon \) decreases to 0 (say, along a fixed sequence). Fixing \(\varepsilon > 0\) and a good \(G^{\varepsilon }\), denote \(\nu ^{\varepsilon } = \frac{1}{\nu (G^\varepsilon )} \nu \llcorner _{G^{\varepsilon }}\) and \(\mu _t^{\varepsilon } := (\mathrm{e}_t)_{\sharp } \nu ^{\varepsilon } \ll {\mathfrak {m}}\), so that all of the results of Sects. 10 and  11.1 apply to \(\nu ^{\varepsilon }\). Note that by Corollary 6.16, we have that \(\mu ^{\varepsilon }_t = \frac{1}{\nu (G^\varepsilon )} (\mu _t)\llcorner _{\mathrm{e}_t(G^{\varepsilon })}\) for all \(t \in [0,1]\), and therefore

$$\begin{aligned} \mu _t^{\varepsilon } = \rho _t^{\varepsilon } {\mathfrak {m}}~,~ \rho _t^{\varepsilon } := \frac{1}{\nu (G^\varepsilon )} \rho _t|_{\mathrm{e}_t(G^{\varepsilon })} \;\;\; \forall t \in [0,1] . \end{aligned}$$

Also note that as \(\nu ^{\varepsilon }\) is concentrated on \(G^{\varepsilon } \subset G_\varphi \), \(\varphi \) is still a Kantorovich potential for the associated transport-problem.

Step 2. Recall that by Corollary 10.10 (3), for each \(s \in (0,1)\) and \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^\varepsilon ))\), the map

$$\begin{aligned}{}[0,1] \ni t \mapsto \rho _{t} \cdot {\mathfrak {m}}^{\varepsilon ,t}_{a_{s}} \end{aligned}$$

coincides for \({\mathcal {L}}^{1}\)-a.e. \(t \in [0,1]\) with the geodesic \(t \mapsto (\mathrm{e}_{t})_{\sharp } \nu ^{\varepsilon }_{a_{s}}\) up to a (positive) constant \(C^{\varepsilon }_{a_s}\) depending on \(a_{s}\), where \(\nu ^{\varepsilon }_{a_s}\) is the conditional measure from the disintegration in (10.11). Consequently, for such s and \(a_s\), for \({\mathcal {L}}^{1}\)-a.e. \(t \in [0,1]\) and any Borel \(H \subset G^{\varepsilon }_{a_{s}}\), the quantity

$$\begin{aligned} \int _{\mathrm{e}_{t}(H)}\rho ^{\varepsilon }_{t} (x) {\mathfrak {m}}^{\varepsilon ,t}_{a_{s}}(dx) = C^{\varepsilon }_{a_s} \int _{\mathrm{e}_t(H)} (\mathrm{e}_{t})_{\sharp } \nu ^{\varepsilon }_{a_{s}}( dx) = C^{\varepsilon }_{a_s} \nu ^{\varepsilon }_{a_{s}}(H)\nonumber \\ \end{aligned}$$
(11.12)

is constant (where we used the fact that \(\mathrm{e}_t|_{G^{\varepsilon }} : G^{\varepsilon } \rightarrow X\) is injective).

By Theorem 11.3, for \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) and \({\mathcal {L}}^1\)-a.e. \(a_s \in \varphi _s(G^{\varepsilon }_s)\) (and hence for \({\mathfrak {q}}_s^{\varepsilon ,s}\)-a.e. \(a_s \in \varphi _s(G_s^{\varepsilon })\) by Proposition 10.8), \(\partial _{t}\Phi _{s}^{t}(x)\) exists and is positive for \({\mathfrak {m}}^{\varepsilon ,t}_{a_s}\)-a.e. x, and \({\mathfrak {m}}_{t}^{\varepsilon ,a_{s}} = \partial _{t}\Phi _{s}^{t} \cdot {\mathfrak {m}}^{\varepsilon ,t}_{a_{s}}\). It follows that for those t and \(a_s\) for which this representation and (11.12) hold true:

$$\begin{aligned} C^{\varepsilon }_{a_s} \nu ^{\varepsilon }_{a_{s}}(H)&= \int _{\mathrm{e}_{t}(H)} \rho ^{\varepsilon }_{t}(x) {\mathfrak {m}}^{\varepsilon ,t}_{a_{s}}(dx) = \int _{\mathrm{e}_{t}(H)} \frac{\rho ^{\varepsilon }_{t}(x)}{\partial _{t}\Phi _{s}^{t}(x)} \, {\mathfrak {m}}^{\varepsilon ,a_{s}}_{t}(dx)\quad \\&~ = \int _{\mathrm{e}_{s}(H)}\frac{\rho ^{\varepsilon }_{t}(g^{a_{s}}(\beta ,t))}{\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(g^{a_{s}}(\beta ,t))} h^{a_{s}}_{\beta }(t)\, {\mathfrak {m}}_{s}^{\varepsilon ,a_{s}}(d\beta ) \nonumber \\&~ = \int _{\mathrm{e}_{s}(H)} \frac{\rho ^{\varepsilon }_{t}(g^{a_{s}}(\beta ,t))}{\partial _{\tau }|_{\tau =t} \Phi _{s}^{\tau }(g^{a_{s}}(\beta ,t))} h^{a_{s}}_{\beta }(t) \ell ^{2}_{s}(\beta ) {\mathfrak {m}}_{a_{s}}^{\varepsilon ,s}(d\beta ) ,\nonumber \end{aligned}$$
(11.13)

where the second transition follows from our normalization and Remark 10.5, ensuring that \({\mathfrak {m}}_t^{\varepsilon ,a_s} = (g^{a_{s}}(\cdot ,t))_{\sharp }\, ( h^{a_{s}}_{\cdot }(t) {\mathfrak {m}}_s^{\varepsilon ,a_{s}})\), and the last transition follows from Theorem 11.3.

Note that g and h above do not depend on \(\varepsilon > 0\). For g, this follows by its very definition as \(g^{a_s}(\beta ,t) = \mathrm{e}_t(\mathrm{e}_s^{-1}(\beta ))\) (and the injectivity of \(\mathrm{e}_s|_{G_\varepsilon }\) for all \(\varepsilon > 0\)). For h, this immediately follows by inspecting the proof of Proposition 10.4, where \(h^{a_s}_{\gamma _s}(t)\) was uniquely defined (for \(t \in (0,1)\)) as the continuous version of the density of \({\hat{{\mathfrak {m}}}}^{a_s}_\alpha \) from (10.2) after conditioning it on \(\mathrm{e}_{[0,1]}(\gamma )\) and pulling it back to the interval [0, 1], where \(\alpha \in Q^{1,\varepsilon }\) was bijectively identified with \(\gamma \in G_{a_s}^{\varepsilon ,1}\) via \(\eta ^{\varepsilon }\); as \(Q^{1,\varepsilon }\) and \(G_{a_s}^{\varepsilon ,1}\) clearly increase as \(\varepsilon \) decreases to 0, with \(\eta ^{\varepsilon }|_{Q^{1,\varepsilon '}} = \eta ^{\varepsilon '}\) for \(0< \varepsilon < \varepsilon '\), we verify that h indeed does not depend on \(\varepsilon >0\).

Step 3. As the left-hand-side of (11.13) does not depend on t, it follows that for all \(s \in (0,1)\) and for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\) (both of which we fix for the time being), there exists a subset \(T \subset (0,1)\) of full \({\mathcal {L}}^1\) measure, so that for all \(H \subset G^{\varepsilon }_{a_s}\):

$$\begin{aligned} T \ni t \mapsto \int _{\mathrm{e}_{s}(H)} \frac{\rho ^{\varepsilon }_{t}(g^{a_{s}}(\beta ,t))}{\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(g^{a_{s}}(\beta ,t))} h^{a_{s}}_{\beta }(t) \ell ^{2}_{s}(\beta ) {\mathfrak {m}}_{a_{s}}^{\varepsilon ,s}(d\beta ) \end{aligned}$$

is constant. As any Borel subset of \(\mathrm{e}_s(G_{a_s})\) may be written as \(\mathrm{e}_s(H)\), equality of measures follows, and hence equality of densities for \({\mathfrak {m}}_{a_{s}}^{\varepsilon ,s}\)-a.e. \(\beta \). We have therefore proved that for \(t,t' \in T\):

$$\begin{aligned} \rho ^{\varepsilon }_{t'}(\gamma _{t'}) (\partial _{\tau }|_{\tau =t'}\Phi _{s}^{\tau }(\gamma _{t'}))^{-1} h^{a_{s}}_{\gamma _s}(t') = \rho ^{\varepsilon }_{t}(\gamma _t) (\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(\gamma _t))^{-1} h^{a_{s}}_{\gamma _s}(t) ,\nonumber \\ \end{aligned}$$
(11.14)

for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\), where \(\gamma = \gamma ^\beta = \mathrm{e}_s^{-1}(\beta ) = g^{a_s}(\beta ,\cdot )\in G^{\varepsilon }_{a_{s}}\), with the exceptional set depending on \(t,t'\). Note that given \(t' \in T\), \(\partial _{\tau }|_{\tau =t'}\Phi _{s}^{\tau }(\gamma ^\beta _{t'})\) indeed exists for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\) by Corollary 10.10 (2).

It follows that for all \(t \in T\), for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\), (11.14) holds simultaneously for a countable sequence \(t' \in T^t \subset T\) which is dense in (0, 1). Taking the limit in (11.14) as \(T^t \ni t' \rightarrow s\), using Proposition 4.4 (5) which entails:

$$\begin{aligned} \lim _{T^t \ni t' \rightarrow s} \partial _{\tau }|_{\tau =t'}\Phi _{s}^{\tau }(\gamma ^\beta _{t'}) = \ell _s(\gamma ^\beta _s)^2 = \ell (\gamma ^\beta )^2 , \end{aligned}$$

employing the continuity of \((0,1) \ni t' \mapsto h^{a_{s}}_{\gamma _s}(t')\), our normalization \(h^{a_{s}}_{\gamma _s}(s) = 1\), and the continuity of \((0,1) \ni t' \mapsto \rho ^{\varepsilon }_{t'}(\gamma _{t'})\) (as \(G^{\varepsilon }\) is good), it follows that for all \(s \in (0,1)\), for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\) and \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\):

$$\begin{aligned} \rho ^{\varepsilon }_{s}(\gamma _s) \ell (\gamma )^{-2} = \rho ^{\varepsilon }_{t}(\gamma _t) (\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(\gamma _t))^{-1} h^{a_{s}}_{\gamma _s}(t) \end{aligned}$$
(11.15)

for \({\mathfrak {m}}^{\varepsilon ,s}_{a_{s}}\)-a.e. \(\beta \in \mathrm{e}_{s}(G^{\varepsilon }_{a_{s}})\), with \(\gamma = \mathrm{e}_s^{-1}(\beta ) \in G^{\varepsilon }_{a_{s}}\).

Step 4. Recall that by Corollary 10.10 (2), \({\mathfrak {m}}^{\varepsilon ,s}_{a_s}\) and \((\mathrm{e}_s)_{\sharp } \nu ^{\varepsilon }_{a_s}\) are mutually absolutely continuous for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\). It follows that for all \(s \in (0,1)\), for \({\mathfrak {q}}^{\varepsilon ,s}_{s}\)-a.e. \(a_{s} \in \varphi _{s}(\mathrm{e}_{s}(G^{\varepsilon }))\) and \({\mathcal {L}}^1\)-a.e. \(t \in (0,1)\), (11.15) holds for \(\nu _{a_s}\)-a.e. \(\gamma \). By Corollary 10.10 (1), note that \({\mathfrak {q}}^{\varepsilon ,s}_{s}\) and \({\mathfrak {q}}^{\varepsilon ,\nu }_s\) are mutually absolutely continuous, and hence the disintegration formula (10.11) implies that for all \(s \in (0,1)\) and \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\):

$$\begin{aligned} \rho ^{\varepsilon }_{s} (\gamma _{s}) \ell (\gamma )^{-2}= \rho ^{\varepsilon }_{t}(\gamma _{t}) (\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(\gamma _{t}))^{-1} h^{\varphi _s(\gamma _s)}_{\gamma _s}(t), \end{aligned}$$

for \(\nu \)-a.e. \(\gamma \in G^{\varepsilon }\), and in particular that \(\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(\gamma _{t})\) exists and is positive for those s, t and \(\gamma \). Taking the limit as \(\varepsilon \rightarrow 0\) along a countable sequence, it follows for all \(s \in (0,1)\), \({\mathcal {L}}^{1}\)-a.e. \(t \in (0,1)\) and \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), that

$$\begin{aligned} \rho _{s} (\gamma _{s}) \ell (\gamma )^{-2}= \rho _{t}(\gamma _{t}) (\partial _{\tau }|_{\tau =t}\Phi _{s}^{\tau }(\gamma _{t}))^{-1} h^{\varphi _s(\gamma _s)}_{\gamma _s}(t) , \end{aligned}$$

thereby concluding the proof of (11.10). As a consequence, an application of Fubini’s Theorem verifies that for \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), (11.10) holds for \({\mathcal {L}}^1\)-a.e. \(s,t \in (0,1)\).

\(\square \)

Remark 11.5

Observe that all of the results of this section also equally hold for \({\bar{\Phi }}_s^t\) in place of \(\Phi _s^t\). Indeed, recall that for all \(x \in X\), \(\Phi _s^t(x) = {\bar{\Phi }}_s^t(x)\) for \(t \in \mathring{G}_\varphi (x)\), and that by Corollary 4.5, \(\partial _t \Phi _s^t(x) = \partial _t {\bar{\Phi }}_s^t(x)\) for a.e. \(t \in \mathring{G}_\varphi (x)\). As these were the only two properties used in the above derivation (in particular, in Step 2 of the proof of Theorem 11.3), the assertion follows.

14 Part III: Putting it all together

15 Combining change-of-variables formula with Kantorovich 3rd order information

Let \((X,\mathsf {d},{\mathfrak {m}})\) denote an essentially non-branching m.m.s. verifying \({\mathsf {CD}}^1(K,N)\). Let \(\mu _0,\mu _1 \in {\mathcal {P}}_2(X,\mathsf {d},{\mathfrak {m}})\), and let \(\nu \) be the unique element of \(\mathrm {OptGeo}(\mu _0,\mu _1)\) (by Proposition 8.9, Remark 8.11 and Theorem 6.15). Recall that \(\mu _t := (\mathrm{e}_t)_{\sharp } \nu \ll {\mathfrak {m}}\) for all \(t \in [0,1]\), and we subsequently denote by \(\rho _t\) the versions of the corresponding densities given by Theorem 11.4 (resulting from Corollary 9.5). Finally, denote by \(\varphi \) a Kantorovich potential associated to the corresponding optimal transference plan, so that \(\nu (G_\varphi ) = 1\).

15.1 Change-of-variables rigidity

Recall that by the Change-of-Variables Theorem 11.4, we know that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\) and for a.e. \(t,s \in (0,1)\), \(\partial _\tau |_{\tau =t} \Phi ^\tau _s(\gamma _t)\) exists, is positive, and it holds that:

$$\begin{aligned} \frac{\rho _s(\gamma _s)}{\rho _t(\gamma _t)} = \frac{h^{\varphi _s(\gamma _s)}_{\gamma _s}(t)}{ \partial _\tau |_{\tau =t} \Phi ^\tau _s(\gamma _t) / \ell (\gamma )^2 } . \end{aligned}$$
(12.1)

In fact, by Remark 11.5, the same also holds with \({\bar{\Phi }}\) in place of \(\Phi \), so that in particular:

$$\begin{aligned} \partial _\tau |_{\tau =t} \Phi ^\tau _s(\gamma _t) = \partial _\tau |_{\tau =t} {\bar{\Phi }}^\tau _s(\gamma _t) \;\;\; \text {for }\nu \text {-a.e. } \gamma \in G_{\varphi }^+ \;\;\; \text {for a.e. } t,s \in (0,1) .\nonumber \\ \end{aligned}$$
(12.2)

Recall that given \(t,s \in (0,1)\), for \({\tilde{\Phi }} = \Phi ,{\bar{\Phi }}\) and \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), respectively, \({\tilde{\Phi }}_s^t\) was defined on \(D_{{\tilde{\ell }}}\) as:

$$\begin{aligned} {\tilde{\Phi }}_s^t = {\tilde{\varphi }}_t + (t-s) \frac{{\tilde{\ell }}_t^2}{2} , \end{aligned}$$

and that by Proposition 4.4 (2), the differentiability points of \(t \mapsto {\tilde{\Phi }}_{s}^t(x)\) and \(t \mapsto {\tilde{\ell }}^2_t(x)\) coincide for all \(t \ne s\), and at those points:

$$\begin{aligned} \partial _t {\tilde{\Phi }}_s^t (x) = {\tilde{\ell }}_t^2(x) + (t-s) \partial _t \frac{{\tilde{\ell }}_t^2}{2}(x) . \end{aligned}$$
(12.3)

It follows from (12.2) that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\) and for a.e. \(t \in (0,1)\):

$$\begin{aligned} \exists \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t) \;\; ,\;\; \exists \partial _\tau |_{\tau =t} \frac{{\bar{\ell }}_\tau ^2}{2}(\gamma _t) \;\;,\;\; \partial _\tau |_{\tau =t} \frac{\ell _\tau ^2}{2}(\gamma _t) = \partial _\tau |_{\tau =t} \frac{{\bar{\ell }}_\tau ^2}{2}(\gamma _t) .\qquad \end{aligned}$$
(12.4)

Alternatively, (12.4) follows directly by Lemma 5.6, in fact for \(\nu \)-a.e. \(\gamma \) (not just \(\gamma \in G_\varphi ^+\)).

Plugging (12.3) and (12.4) into (12.1), it follows that we may express the Change-of-Variables Theorem 11.4 as the statement that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\), we have:

$$\begin{aligned}&\frac{\rho _s(\gamma _s)}{\rho _t(\gamma _t)} = \frac{h^{\varphi _s(\gamma _s)}_{\gamma _s}(t)}{1 + (t-s) \frac{\partial _\tau |_{\tau =t}\ell _\tau ^2/2(\gamma _t)}{\ell (\gamma )^2}}\nonumber \\&\quad = \frac{h^{\varphi _s(\gamma _s)}_{\gamma _s}(t)}{1 + (t-s) \frac{\partial _\tau |_{\tau =t}{\bar{\ell }}_\tau ^2/2(\gamma _t)}{\ell (\gamma )^2}} \;\;\; \text {for a.e. } t,s \in (0,1) . \end{aligned}$$
(12.5)

Note that the denominators on the right-hand-side of (12.5) are always positive (when defined) for all \(t,s \in (0,1)\) by Theorem 3.11 (3). Fixing the geodesic \(\gamma \), we denote for brevity \(\rho (t) := \rho _t(\gamma _t)\), \(h_s(t) := h^{\varphi _s(\gamma _s)}_{\gamma _s}(t)\) and \(K_0 := K \cdot \ell (\gamma )^2\). We then have the following additional information for \(\nu \)-a.e. \(\gamma \in G_\varphi ^+\), by Corollary 9.5 and Proposition 10.4, respectively:

(A) \((0,1) \ni t \mapsto \rho (t)\) is locally Lipschitz and strictly positive.

(B) For all \(s \in (0,1)\), \(h_s\) is a \({\mathsf {CD}}(K_0,N)\) density on [0, 1], satisfying \(h_s(s) = 1\). In particular, it is locally Lipschitz continuous on (0, 1) and strictly positive there.
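The differential characterization of \({\mathsf {CD}}(K_0,N)\) densities used further below, namely \((\log h)'' + ((\log h)')^2/(N-1) \le -K_0\) at points of twice-differentiability, can be sanity-checked numerically. The following snippet (an illustration only, with arbitrary parameter choices) verifies it for the model equality case \(h(t) = \sin (a t + b)^{N-1}\), \(a = \sqrt{K_0/(N-1)}\):

```python
import numpy as np

# Illustrative numerical check (not part of the proof) of the differential
# characterization of CD(K0, N) densities used further below:
#   (log h)''(t) + ((log h)'(t))^2 / (N - 1) <= -K0 .
# The model density h(t) = sin(a t + b)^(N-1), a = sqrt(K0/(N-1)), attains
# equality: (log h)'' = -(N-1) a^2 / sin^2(a t + b), (log h)' = (N-1) a cot(a t + b).
K0, N = 1.0, 3.0
a, b = np.sqrt(K0 / (N - 1)), 0.5
t = np.linspace(0.1, 0.9, 801)
log_h = (N - 1) * np.log(np.sin(a * t + b))
d1 = np.gradient(log_h, t)
d2 = np.gradient(d1, t)
lhs = (d2 + d1 ** 2 / (N - 1))[2:-2]  # interior points of the finite-difference stencil
assert np.all(np.abs(lhs + K0) < 1e-3)
```

The edge points of the grid are discarded since `np.gradient` uses one-sided differences there.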

Remark 12.1

It is in fact possible to deduce (A) just from the Change-of-Variables formula (12.5) and without referring to Corollary 9.5. This may be achieved by a careful bootstrap argument, exploiting the separation of variables on the left-hand-side of (12.5) and the a-priori estimates of Lemma A.9 in the “Appendix” on the logarithmic derivative of \({\mathsf {CD}}(K_0,N)\) densities. But since we already know (A), and since (A) was actually (mildly) used in the proof of the Change-of-Variables Theorem 11.4, we only mention this possibility in passing. Note that Corollary 9.5 applies to all \({\mathsf {MCP}}(K,N)\) essentially non-branching spaces, whereas the Change-of-Variables formula requires knowing the stronger \({\mathsf {CD}}^1(K,N)\) condition.

Fix a geodesic \(\gamma \in G_\varphi ^+\) satisfying (12.5), (A) and (B) above. Let \(I \subset (0,1)\) denote a subset of full measure so that for every \(s \in I\), (12.5) holds for a.e. \(t \in (0,1)\). It follows from (12.5) that for all \(s \in I\), the functions \(t \mapsto \frac{\partial _\tau |_{\tau =t} {\tilde{\ell }}_\tau ^2/2(\gamma _t)}{\ell (\gamma )^2}\), for both \({\tilde{\ell }} = \ell ,{\bar{\ell }}\), coincide a.e. on (0, 1) with the same locally Lipschitz function \(t \mapsto z_s(t)\) defined on \((0,1) \setminus \left\{ s\right\} \):

$$\begin{aligned} z_s(t) := \frac{\frac{1}{\rho _s(\gamma _s)} h^{\varphi _s(\gamma _s)}_{\gamma _s}(t) \rho _t(\gamma _t) - 1}{t-s} . \end{aligned}$$

By continuity, it follows that the functions \(\left\{ z_s\right\} _{s \in I}\) must all coincide on their entire domain of definition with a single function \(t \mapsto z(t)\) defined on (0, 1); the latter function must therefore be locally Lipschitz continuous, and satisfy

$$\begin{aligned} z(t) = \frac{\partial _\tau |_{\tau =t} \ell _\tau ^2/2(\gamma _t)}{\ell (\gamma )^2} = \frac{\partial _\tau |_{\tau =t} {\bar{\ell }}_\tau ^2/2(\gamma _t)}{\ell (\gamma )^2} \;\;\; \text {for a.e. } t \in (0,1) . \end{aligned}$$
(12.6)

By Theorem 5.5, which provides us with 3rd order information on intermediate-time Kantorovich potentials, we obtain the following additional information on z:

(C) \((0,1) \ni t \mapsto z(t)\) is locally Lipschitz.

    For any \(\delta \in (0,1/2)\), there exists \(C_\delta > 0\) so that

    $$\begin{aligned}&\frac{z(t) - z(s)}{t-s} \ge (1 - C_\delta (t-s)) \left| z(s)\right| \left| z(t)\right| \\&\quad \forall 0< \delta \le s< t \le 1 - \delta < 1 . \end{aligned}$$

    In particular, \(z'(t) \ge z^2(t)\) for a.e. \(t \in (0,1)\).
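As a toy illustration of the equality case in (C) (with arbitrary parameter choices, not part of the proof): \(z(t) = 1/(c-t)\) with \(c > 1\) satisfies \(z' = z^2\), and the function \(L(t) = \exp (-\int _{r_0}^t z(s) ds) = (c-t)/(c-r_0)\) appearing below is then affine, hence concave. A minimal numerical check:

```python
import numpy as np

# Toy example (illustrative parameters) of the equality case in (C):
# z(t) = 1/(c - t) with c > 1 satisfies z'(t) = 1/(c - t)^2 = z(t)^2,
# and L(t) = exp(-int_{r0}^t z(s) ds) = (c - t)/(c - r0) is affine, hence concave.
c, r0 = 2.0, 0.5
t = np.linspace(0.05, 0.95, 181)
z = 1.0 / (c - t)
dz = np.gradient(z, t)
# central differences slightly overshoot z' for the convex z, so z' >= z^2 survives:
assert np.all(dz[1:-1] >= z[1:-1] ** 2)
L = (c - t) / (c - r0)
second_diff = L[:-2] - 2.0 * L[1:-1] + L[2:]  # discrete concavity: second differences <= 0
assert np.all(second_diff <= 1e-12)
```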

Remark 12.2

By Theorem 5.5, we obtain the following interpretation for z(t)—it coincides for all \(t \in (0,1)\) with the second Peano derivative of \(\tau \mapsto \varphi _{\tau }(\gamma _t)\) and of \(\tau \mapsto {\bar{\varphi }}_{\tau }(\gamma _t)\) at \(\tau = t\). In particular, these second Peano derivatives are guaranteed to exist for all \(t \in (0,1)\) and are a continuous function thereof.

We have already seen above how (12.5) enabled us to deduce (12.6), thereby gaining (by Theorem 5.5) an additional order of regularity for \(\partial _\tau |_{\tau =t} \ell _\tau ^2/2(\gamma _t)\). The purpose of this section is to show that the combination of the Change-of-Variables Formula:

$$\begin{aligned} \frac{\rho (s)}{\rho (t)} = \frac{h_{s}(t)}{1 + (t-s) z(t)} \;\;\; \text {for a.e. } t,s\in (0,1) , \end{aligned}$$
(12.7)

together with properties (A), (B) and (C) above, forms a very rigid condition, and already implies the following representation for \(\frac{1}{\rho _t(\gamma _t)}\); we formulate this independently of the preceding discussion as follows:

Theorem 12.3

(Change-of-Variables Rigidity) Assume that (12.7) holds, where \(\rho \), \(\left\{ h_s\right\} \) and z satisfy (A), (B) and (C) above. Then

$$\begin{aligned} \frac{1}{\rho (t)} = L(t) Y(t) \;\;\; \forall t \in (0,1) , \end{aligned}$$

where L is concave and Y is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1).

15.2 Formal argument

To better motivate the ensuing proof of Theorem 12.3, we begin with a formal argument.

Assume that the functions \(\rho \) and z are \(C^2\) smooth and that equality holds in (12.7) for all \(t,s \in (0,1)\). It follows that the mapping \((s,t) \mapsto h_s(t)\) is also \(C^2\) smooth. Fix any \(r_0 \in (0,1)\), and define the functions L and Y by

$$\begin{aligned} \log L(r) := - \int _{r_0}^r z(s) ds ~,~ \log Y(r) := \int _{r_0}^r \partial _t|_{t=s} \log h_s(t) ds . \end{aligned}$$

Note that by (12.7):

$$\begin{aligned}&\log \frac{\rho (r_0)}{\rho (r)} = \int _{r_0}^r \partial _t |_{t=s} \log \frac{\rho (s)}{\rho (t)} ds \\&\quad = \int _{r_0}^r \partial _t|_{t=s} \log h_s(t) ds - \int _{r_0}^r \partial _t|_{t=s} \log (1+(t-s) z(t)) ds \\&\quad = \log Y(r) + \log L(r) . \end{aligned}$$

As already noted in Lemma 5.7, the concavity of L follows from (C), since

$$\begin{aligned} \frac{L''}{L} = (\log L)'' + ((\log L)')^2 = -z' + z^2 \le 0 . \end{aligned}$$

The more interesting function is Y. We have for all \(r \in (0,1)\):

$$\begin{aligned} (\log Y)'(r)&= \partial _t|_{t=r} \log h_r(t) , \\ (\log Y)''(r)&= \partial _t^2|_{t=r} \log h_r(t) + \partial _s \partial _t |_{t=s=r} \log h_s(t) . \end{aligned}$$

To handle the last term on right-hand-side above, note that by the separation of variables on the left-hand-side of (12.7), we have by (C) again, after taking logarithms and calculating the partial derivatives in t and s:

$$\begin{aligned}&\partial _s \partial _t|_{t=s=r} \log h_s(t) = \partial _s \partial _t |_{t=s=r} \log (1 + (t-s) z(t)) \nonumber \\&\quad = -z'(r) + z^2(r) \le 0. \end{aligned}$$
(12.8)

We therefore conclude that for all \(r \in (0,1)\):

$$\begin{aligned} (\log Y)''(r) + \frac{((\log Y)'(r))^2}{N-1} \le \partial _t^2 |_{t=r} \log h_r(t) + \frac{(\partial _t |_{t=r} \log h_r(t))^2}{N-1} \le -K_0 , \end{aligned}$$

where the last inequality follows from (B) and the differential characterization of \({\mathsf {CD}}(K_0,N)\) densities (applied to \(h_r(t)\) at \(t=r\)). Applying the characterization again, we deduce that Y is a (\(C^2\)-smooth) \({\mathsf {CD}}(K_0,N)\) density on (0, 1). This concludes the formal proof that

$$\begin{aligned} \frac{\rho (r_0)}{\rho (r)} = L(r) Y(r) \;\;\; \forall r \in (0,1) , \end{aligned}$$

with L and Y satisfying the desired properties. In a sense, the latter argument has been tailored to “reverse-engineer” the smooth Riemannian argument, where the separation to orthogonal and tangential components of the Jacobian is already encoded in the Jacobi equation, (B) is a consequence of the corresponding Riccati equation, and (C) is a consequence of Cauchy–Schwarz (cf. [74, Proof of Theorem 1.7]).
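The telescoping computation above, that \(\log (\rho (r_0)/\rho (r)) = \log Y(r) + \log L(r)\), can be verified numerically in a toy smooth model, where we simply define \(h_s(t)\) through (12.7) with equality; the choices of \(\rho \) and z below are arbitrary illustrations, not derived from an actual transport problem:

```python
import numpy as np

# Toy smooth model (illustrative choices of rho and z, not derived from an
# actual transport problem): define h_s(t) so that (12.7) holds with equality,
# and verify numerically that rho(r0)/rho(r) = L(r) * Y(r), where
#   log L(r) = -int_{r0}^r z(s) ds ,  log Y(r) = int_{r0}^r d/dt|_{t=s} log h_s(t) ds .
rho = lambda t: 1.0 + 0.3 * np.sin(3.0 * t)
z = lambda t: 1.0 / (2.0 - t)                      # satisfies z' = z^2
h = lambda s, t: rho(s) / rho(t) * (1.0 + (t - s) * z(t))

r0, r = 0.3, 0.7
s = np.linspace(r0, r, 2001)
ds = s[1] - s[0]
eps = 1e-6
dlogh = (np.log(h(s, s + eps)) - np.log(h(s, s - eps))) / (2.0 * eps)
log_Y = float(np.sum(0.5 * (dlogh[:-1] + dlogh[1:])) * ds)   # trapezoid rule
log_L = -float(np.sum(0.5 * (z(s)[:-1] + z(s)[1:])) * ds)
assert abs(np.exp(log_Y + log_L) - rho(r0) / rho(r)) < 1e-6
```

Indeed, \(\partial _t|_{t=s} \log h_s(t) = -(\log \rho )'(s) + z(s)\) in this model, so the two integrals cancel the \(\int z\) contribution and leave exactly \(\log (\rho (r_0)/\rho (r))\).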

15.3 Rigorous argument

It is surprisingly tedious to upgrade the above formal argument into a rigorous one. It seems that an approximation argument cannot be avoided, since the definition of Y above is inherently differential: on one hand, we do not know how to check the \({\mathsf {CD}}(K_0,N)\) condition for Y synthetically, but on the other hand, Y is not even differentiable, so it is not clear how to check the \({\mathsf {CD}}(K_0,N)\) condition by taking derivatives. The main difficulty in applying an approximation argument here stems from the fact that we do not know how to approximate \(\{h_s\}\) and z by smooth functions \(\{h^\varepsilon _s\}\) and \(z^\varepsilon \), so that simultaneously:

  • \(\{h^\varepsilon _s\}\) are \({\mathsf {CD}}(K_0-\varepsilon ,N)\) densities;

  • \(z^\varepsilon \) is a function of t only, and not of s;

  • and the separation of variables structure of (12.7) is preserved.

Our solution is to note that the main role of the separation of variables in the above formal argument was to ensure that (12.8) holds, and so we will replace the rigid third requirement with the following relaxed one:

  • \(\partial _s \partial _t |_{t=s=r} \log h^\varepsilon _s(t) \le B_\delta \varepsilon \) for all \(r \in [\delta ,1-\delta ]\) and \(\delta > 0\).

Proof of Theorem 12.3

Step 1: Redefining \(h_s(t)\).

First, observe that there exists \(I_y \subset (0,1)\) of full measure so that for all \(s \in I_y\), (12.7) is satisfied for a.e. \(t \in (0,1)\), and hence for all \(t \in (0,1)\), since all the functions \(\rho \), \(\left\{ h_s\right\} \) and z are assumed to be continuous on (0, 1). Unfortunately, we cannot extend this to all \(s \in (0,1)\) as well, since there may be a null set of s’s for which the densities \(h_s(t)\) do not comply at all with the equation (12.7). To remedy this, we simply force (12.7) to hold for all \(s,t \in (0,1)\) by defining

$$\begin{aligned} {\tilde{h}}_s(t) := \frac{\rho (s)}{\rho (t)} (1 + (t-s) z(t)) \;\;\; s,t \in (0,1) , \end{aligned}$$
(12.9)

and claim that for all \(s \in (0,1)\), \({\tilde{h}}_s\) is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1). Indeed, for \(s \in I_y\), \({\tilde{h}}_s = h_s\) and there is nothing to check. If \(s_0 \in (0,1) \setminus I_y\), simply note that \({\tilde{h}}_s(t)\) is locally Lipschitz in \(s \in (0,1)\) (since \(\rho (s)\) is), and hence

$$\begin{aligned} {\tilde{h}}_{s_0}(t) = \lim _{s \rightarrow s_0} {\tilde{h}}_{s}(t) = \lim _{I_y \ni s \rightarrow s_0} {\tilde{h}}_s(t) = \lim _{I_y \ni s \rightarrow s_0} h_s(t) \;\;\; \forall t \in (0,1) . \end{aligned}$$

But the family of \({\mathsf {CD}}(K_0,N)\) densities on (0, 1) is clearly closed under pointwise limits (it is characterized by a family of inequalities between 3 points), and so \({\tilde{h}}_{s_0}\) is a \({\mathsf {CD}}(K_0,N)\) density, as asserted.
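For the reader's convenience, and assuming the standard synthetic definition via the distortion coefficients \(\sigma \) (as in the "Appendix"), the family of 3-point inequalities alluded to reads: h is a \({\mathsf {CD}}(K_0,N)\) density on an interval \(J\) if and only if for all \(t_0,t_1 \in J\) and \(\lambda \in [0,1]\):

$$\begin{aligned} h(\lambda t_1 + (1-\lambda ) t_0)^{\frac{1}{N-1}} \ge \sigma ^{(\lambda )}_{K_0,N-1}(|t_1 - t_0|) \, h(t_1)^{\frac{1}{N-1}} + \sigma ^{(1-\lambda )}_{K_0,N-1}(|t_1 - t_0|) \, h(t_0)^{\frac{1}{N-1}} . \end{aligned}$$

Each such inequality is clearly preserved under pointwise convergence, whence the claimed closedness.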

Step 2: Properties of z and \(\{{\tilde{h}}_s\}\).

We next collect several additional observations regarding the functions z and \(\{{\tilde{h}}_s\}\). Recall that \(\rho \) (by assumption) and \({\tilde{h}}_s\) (as \({\mathsf {CD}}(K_0,N)\) densities) are strictly positive in (0, 1). Together with (12.9) (or directly from (12.7)), this implies that \(1 + (t-s) z(t) > 0\) for all \(t,s \in (0,1)\), and hence

(D) \(-\frac{1}{t} \le z(t) \le \frac{1}{1-t} \;\;\; \forall t \in (0,1)\).
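Explicitly, the two bounds in (D) follow by letting s approach the endpoints of (0, 1) in the positivity of the denominator:

$$\begin{aligned} \lim _{s \rightarrow 0^+} \left( 1 + (t-s) z(t)\right) = 1 + t \, z(t) \ge 0 \;\; \Rightarrow \;\; z(t) \ge -\frac{1}{t} ~,~ \lim _{s \rightarrow 1^-} \left( 1 + (t-s) z(t)\right) = 1 - (1-t) z(t) \ge 0 \;\; \Rightarrow \;\; z(t) \le \frac{1}{1-t} . \end{aligned}$$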

In fact, we already knew this by Theorem 3.11 (3), but refrained from including it in our assumption (C), since it is a consequence of the other assumptions. Furthermore

(E) \(I_x := \{ t \in (0,1) \; ; \; \tau \mapsto {\tilde{h}}_s(\tau ) \text { is differentiable at } \tau =t \text { for all } s \in (0,1)\}\) is of full measure.

Indeed, this follows directly from the definition (12.9) by considering the set of all points t where \(\rho (t)\) and z(t) are differentiable. In addition, we clearly have:

(F) \(\forall t \in I_x\), \((0,1) \ni s \mapsto \partial _t {\tilde{h}}_s(t)\) is continuous.

Step 3: Defining L and Y.

Now fix \(r_0 \in (0,1)\), and define the functions LY on (0, 1) as follows:

$$\begin{aligned} \log L(r) := - \int _{r_0}^r z(s) ds ~,~ \log Y(r) := \int _{r_0}^r \partial _t |_{t=s} \log {\tilde{h}}_s(t) ds . \end{aligned}$$

Clearly, the function L is well defined for all \(r \in (0,1)\) as z is assumed locally Lipschitz. As for the function Y, (E) implies that \(\partial _t |_{t=s} \log {\tilde{h}}_s(t)\) exists for a.e. \(s \in (0,1)\), and the fact that the latter integrand is locally integrable on (0, 1) is a consequence of Lemma A.9 in the “Appendix”, which guarantees a-priori locally-integrable estimates on the logarithmic derivative of \({\mathsf {CD}}(K_0,N)\) densities.

Consequently, as in our formal argument, we may write (since \(\log \rho \) is locally absolutely continuous on (0, 1)):

$$\begin{aligned}&\log \frac{\rho (r_0)}{\rho (r)} = \int _{r_0}^r \partial _t |_{t=s} \log \frac{\rho (s)}{\rho (t)} ds \\&\quad = \int _{r_0}^r \partial _t|_{t=s} \log {\tilde{h}}_s(t) ds - \int _{r_0}^r \partial _t |_{t=s} \log (1+(t-s) z(t)) ds\\&\quad = \log Y(r) + \log L(r) , \end{aligned}$$

and hence

$$\begin{aligned} \frac{\rho (r_0)}{\rho (r)} = L(r) Y(r) \;\;\; \forall r \in (0,1). \end{aligned}$$

We have already verified in Lemma 5.7 that the property \(z'(s) \ge z^2(s)\) a.e. in \(s \in (0,1)\) implies that L is concave on (0, 1), so it remains to show that Y is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1).

Step 4: Approximation argument.

We now arrive at our approximation argument. Given \(\varepsilon _1 , \varepsilon _2 >0\), \(t \in (\varepsilon _1,1-\varepsilon _1)\) and \(s \in (\varepsilon _2,1-\varepsilon _2)\), define the double logarithmic mollification of \({\tilde{h}}_s(t)\) by

$$\begin{aligned} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) := \int \int \log {\tilde{h}}_y(x) \psi _{\varepsilon _1}(t-x) \psi _{\varepsilon _2}(s-y) dx dy , \end{aligned}$$

where \(\psi _\varepsilon (x) = \frac{1}{\varepsilon } \psi (x/\varepsilon )\) and \(\psi \) is a \(C^2\)-smooth non-negative function on \({\mathbb {R}}\) supported on \([-1,1]\) and integrating to 1. Since for all \(\eta \in (0,1/2)\), we clearly have by (12.9) (and, say, (D))

$$\begin{aligned} \int _{\eta }^{1-\eta } \int _{\eta }^{1-\eta } \left| \log {\tilde{h}}_y(x)\right| dx dy < \infty , \end{aligned}$$

it follows by Proposition A.12 in the “Appendix” on logarithmic convolutions that \(\{{\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t)\}_{s \in (\varepsilon _2,1-\varepsilon _2)}\) is a \(C^2\)-smooth (in \((t,s)\)) family of \({\mathsf {CD}}(K_0,N)\) densities on \((\varepsilon _1,1-\varepsilon _1)\).
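A minimal sketch of the logarithmic mollification in a single variable, with an illustrative \(C^2\)-smooth choice of the bump \(\psi \); the double mollification above is this construction applied consecutively in t and s:

```python
import numpy as np

def psi(u):
    # One convenient C^2-smooth bump on [-1, 1] integrating to 1 (illustrative choice):
    return np.where(np.abs(u) <= 1.0, (35.0 / 32.0) * (1.0 - u ** 2) ** 3, 0.0)

def mollify_log(log_h, t, eps, n=4001):
    # Single-variable logarithmic mollification (log h * psi_eps)(t) with
    # psi_eps(x) = psi(x / eps) / eps; the double mollification in (t, s)
    # applies this consecutively in each variable.
    x = np.linspace(t - eps, t + eps, n)
    w = psi((t - x) / eps) / eps
    g = w * log_h(x)
    return float(np.sum(0.5 * (g[:-1] + g[1:])) * (x[1] - x[0]))  # trapezoid rule

# the symmetric unit-mass kernel reproduces affine log-densities exactly:
affine = lambda x: 0.2 + 1.5 * x
assert abs(mollify_log(affine, 0.5, 0.05) - affine(0.5)) < 1e-6
```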

Step 5: Concluding the proof assuming (H1) and (H2).

We will subsequently show the following two additional properties of the family \(\{{\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t)\}\):

(H1) \(\lim _{\varepsilon _2 \rightarrow 0} \lim _{\varepsilon _1 \rightarrow 0} \partial _t |_{t=s} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) = \partial _t |_{t=s} \log {\tilde{h}}_s(t)\) for a.e. \(s \in (0,1)\).

(H2) \(\forall \delta \in (0,1/2) \; \exists C_\delta > 0 \;\; \forall \varepsilon \in (0,\frac{\delta }{8}] \;\; \forall \varepsilon _1,\varepsilon _2 \in (0,\varepsilon ]\):

    $$\begin{aligned} \partial _s \partial _t |_{t=s=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) \le 2 C_\delta \varepsilon \;\;\; \forall r \in [\delta ,1-\delta ] . \end{aligned}$$

Assuming these additional properties, let us show how to conclude the proof of Theorem 12.3. Set \(\varepsilon = \max (\varepsilon _1,\varepsilon _2)\), and assuming that \(\varepsilon < \min (r_0,1-r_0)\), define the function \(Y^{\varepsilon _1,\varepsilon _2}\) on \((\varepsilon ,1-\varepsilon )\) given by

$$\begin{aligned} \log Y^{\varepsilon _1,\varepsilon _2}(r) := \int _{r_0}^r \partial _t |_{t=s} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) ds . \end{aligned}$$

First, we claim to have the following pointwise convergence for all \(r \in (0,1)\):

$$\begin{aligned}&\lim _{\varepsilon _2 \rightarrow 0} \lim _{\varepsilon _1 \rightarrow 0} \log Y^{\varepsilon _1,\varepsilon _2}(r)\nonumber \\&\quad = \lim _{\varepsilon _2 \rightarrow 0} \lim _{\varepsilon _1 \rightarrow 0} \int _{r_0}^r \partial _t |_{t=s} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) ds \nonumber \\&\quad = \int _{r_0}^r \partial _t |_{t=s}\log {\tilde{h}}_s(t) ds = \log Y(r). \end{aligned}$$
(12.10)

Indeed, the pointwise convergence of the integrands is ensured by property (H1), and as soon as \(r_0,r \in (\eta , 1-\eta )\) for some \(\eta > 0\), we obtain by the a-priori estimates of Lemma A.9 in the “Appendix” (since \({\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s\) is a \({\mathsf {CD}}(K_0,N)\) density on \((\eta ,1-\eta )\) for all \(\varepsilon _1,\varepsilon _2 \in (0,\eta ]\) and \(s \in (\eta ,1-\eta )\)):

$$\begin{aligned} \forall t,s \in [r_0,r] \;\; \forall \varepsilon _1,\varepsilon _2 \in (0,\eta ] \;\; \left| \partial _t \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t)\right| \le C(r,r_0,\eta ,K_0,N) . \end{aligned}$$

Consequently, (12.10) follows by Lebesgue’s Dominated Convergence theorem.

Now \(Y^{\varepsilon _1,\varepsilon _2}\) is \(C^2\)-smooth, and so as in our formal argument, we have for all \(r \in (\varepsilon ,1-\varepsilon )\):

$$\begin{aligned} (\log Y^{\varepsilon _1,\varepsilon _2})'(r)&= \partial _t |_{t=r}\log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r(t) , \\ (\log Y^{\varepsilon _1,\varepsilon _2})''(r)&= \partial _t^2 |_{t=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r(t) + \partial _s \partial _t |_{t=s=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) . \end{aligned}$$

As \({\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r\) is a \({\mathsf {CD}}(K_0,N)\) density on \((\varepsilon ,1-\varepsilon )\), we know by the differential characterization of such densities that

$$\begin{aligned} \partial _t^2 |_{t=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r(t) + \frac{1}{N-1} (\partial _t |_{t=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_r(t))^2 \le -K_0 . \end{aligned}$$

Combining this with property (H2), we conclude that for any \(\delta \in (0,1/2)\), whenever \(\varepsilon = \max (\varepsilon _1,\varepsilon _2) \in (0,\min (r_0,1-r_0,\frac{\delta }{8}))\):

$$\begin{aligned}&(\log Y^{\varepsilon _1,\varepsilon _2})''(r) + \frac{1}{N-1} ((\log Y^{\varepsilon _1,\varepsilon _2})'(r))^2 \le -K_0 + 2 C_{\delta } \varepsilon \, \forall r \in [\delta , 1 - \delta ], \end{aligned}$$

and hence \(Y^{\varepsilon _1,\varepsilon _2}\) is a \(C^2\)-smooth \({\mathsf {CD}}(K_0 - 2 C_{\delta } \varepsilon ,N)\) density on \([\delta ,1- \delta ]\).

Combining all of the preceding information, since (as before) the family of \({\mathsf {CD}}(K_0',N)\) densities is closed under pointwise limits, we conclude from (12.10) that Y is a \({\mathsf {CD}}(K_0 - 2 C_\delta \varepsilon , N)\) density on \([\delta ,1-\delta ]\), for any \(\delta \in (0,1/2)\) and \(\varepsilon \in (0,\min (r_0,1-r_0,\frac{\delta }{8}))\). Taking the limit as \(\varepsilon \rightarrow 0\) and then as \(\delta \rightarrow 0\), we confirm that Y must be a \({\mathsf {CD}}(K_0,N)\) density on (0, 1), concluding the proof.

It remains to establish properties (H1) and (H2).

Step 6: proof of (H1)

Given \(y \in (0,1)\) and \(t \in (\varepsilon _1,1-\varepsilon _1)\), denote

$$\begin{aligned} \log {\tilde{h}}_y^{\varepsilon _1}(t) := \int \log {\tilde{h}}_y(x) \psi _{\varepsilon _1}(t-x) dx \end{aligned}$$

so that for every \(s \in (\varepsilon _2,1-\varepsilon _2)\):

$$\begin{aligned} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) = \int \log {\tilde{h}}^{\varepsilon _1}_y(t) \psi _{\varepsilon _2}(s-y) dy . \end{aligned}$$
(12.11)

By Proposition A.10 in the “Appendix”, \({\tilde{h}}_y^{\varepsilon _1}\) is a \({\mathsf {CD}}(K_0,N)\) density on \((\varepsilon _1,1-\varepsilon _1)\) for all \(y \in (0,1)\). Consequently, Lemma A.9 implies that \(t \mapsto \log {\tilde{h}}_y^{\varepsilon _1}(t)\) is locally Lipschitz on \((\varepsilon _1,1-\varepsilon _1)\), uniformly in \(y \in (0,1)\):

$$\begin{aligned} \sup _{y \in (0,1)} \left| \partial _t \log {\tilde{h}}^{\varepsilon _1}_y(t)\right| \le C(t,\varepsilon _1,K_0,N) . \end{aligned}$$
(12.12)

In particular, it follows that we may differentiate in t under the integral in (12.11) at any \(t_0 \in (\varepsilon _1,1-\varepsilon _1)\):

$$\begin{aligned} \partial _t |_{t = t_0} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) = \int \partial _t |_{t=t_0} \log {\tilde{h}}^{\varepsilon _1}_y(t) \psi _{\varepsilon _2}(s-y) dy . \end{aligned}$$
(12.13)

Now, by a standard argument (see Lemma 12.5 at the end of this section), we know that the derivative of an \(\varepsilon \)-mollification of a locally Lipschitz function converges, as \(\varepsilon \rightarrow 0\), to the derivative of the function itself at every point where the latter exists, namely:

$$\begin{aligned} \forall t_0 \in I_x \;\; \forall y \in (0,1) \;\; \lim _{\varepsilon _1 \rightarrow 0} \partial _t |_{t=t_0} \log {\tilde{h}}_y^{\varepsilon _1}(t) = \partial _t |_{t=t_0} \log {\tilde{h}}_y(t) . \end{aligned}$$

Together with (12.12) and (12.13), it follows by the Dominated Convergence Theorem that:

$$\begin{aligned}&\forall t_0 \in I_x \;\; \forall s \in (\varepsilon _2 , 1-\varepsilon _2) \\&\quad \lim _{\varepsilon _1 \rightarrow 0} \partial _t |_{t = t_0} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) = \int \partial _t |_{t=t_0} \log {\tilde{h}}_y(t) \psi _{\varepsilon _2}(s-y) dy . \end{aligned}$$

But by property (F), we know that \((0,1) \ni y \mapsto \partial _t |_{t=t_0} \log {\tilde{h}}_y(t)\) is continuous for all \(t_0 \in I_x\), and therefore taking the limit as \(\varepsilon _2 \rightarrow 0\):

$$\begin{aligned} \forall t_0 \in I_x \;\; \forall s \in (0,1) \;\;\; \lim _{\varepsilon _2 \rightarrow 0} \lim _{\varepsilon _1 \rightarrow 0} \partial _t |_{t = t_0} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t) = \partial _t |_{t=t_0} \log {\tilde{h}}_s(t) . \end{aligned}$$

By property (E), \(I_x\) has full measure, thereby concluding the proof of (an extension of) property (H1).

Step 7: proof of (H2)

We will require the following:

Lemma 12.4

Let z satisfy (C) and (D). Then for all \(\delta \in (0,1/2)\), there exists \(C_\delta > 0\), so that for all \(\varepsilon \in (0,\frac{\delta }{4}]\), \(r \in [\delta ,1-\delta ]\), \(r-\varepsilon \le t_1 < t_2 \le r + \varepsilon \) and \(r-\varepsilon \le s_1 < s_2 \le r+\varepsilon \), we have:

$$\begin{aligned}&(1 + (t_1 - s_1) z(t_1)) (1 + (t_2 - s_2) z(t_2)) \\&\quad \le (1 + C_\delta \varepsilon (t_2 - t_1) (s_2 - s_1))(1 + (t_2 - s_1) z(t_2)) (1 + (t_1 - s_2) z(t_1)) . \end{aligned}$$

Proof

Opening the various brackets, the assertion is equivalent to the statement:

$$\begin{aligned}&z(t_1) (s_2 - s_1) - z(t_2)(s_2 - s_1) + z(t_1) z(t_2) (t_2 - t_1)(s_2 - s_1) \\&\quad \le C_\delta \varepsilon (t_2 - t_1) (s_2 - s_1) (1 + (t_2 - s_1) z(t_2)) (1 + (t_1 - s_2) z(t_1)) , \end{aligned}$$

and after dividing by \((t_2 - t_1) (s_2 - s_1)\), we see that our goal is to establish:

$$\begin{aligned} z(t_1) z(t_2) - \frac{z(t_2) - z(t_1)}{t_2 - t_1} \le C_\delta \varepsilon (1 + (t_2 - s_1) z(t_2)) (1 + (t_1 - s_2) z(t_1)) ,\nonumber \\ \end{aligned}$$
(12.14)

for an appropriate \(C_\delta \). Note that the right-hand-side of (12.14) is always positive by (D). As \(\min (t_i,1-t_i) \ge \delta - \varepsilon \ge \frac{3}{4} \delta \), by our assumption (C), (12.14) would follow from:

$$\begin{aligned} \left| z(t_1)\right| \left| z(t_2)\right| B_{\frac{3}{4} \delta } 2 \varepsilon \le C_\delta \varepsilon (1 - 2 \varepsilon \left| z(t_2)\right| )(1 - 2 \varepsilon \left| z(t_1)\right| ) , \end{aligned}$$

or equivalently (assuming \(\left| z(t_1)\right| \left| z(t_2)\right| > 0\), otherwise there is nothing to prove):

$$\begin{aligned} 2 B_{\frac{3}{4} \delta } \le C_{\delta } \left( \frac{1}{\left| z(t_1)\right| } - 2 \varepsilon \right) \left( \frac{1}{\left| z(t_2)\right| } - 2 \varepsilon \right) . \end{aligned}$$
(12.15)

But \(\frac{1}{\left| z(t_i)\right| } \ge \min (t_i,1-t_i) \ge \frac{3}{4} \delta \) by (D), and as \(\varepsilon \in (0,\frac{\delta }{4}]\), we see that (12.15) is ensured by setting:

$$\begin{aligned} C_{\delta } := \frac{32}{\delta ^2} B_{\frac{3}{4} \delta }. \end{aligned}$$

\(\square \)
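The final step of the proof, namely that \(C_{\delta } = \frac{32}{\delta ^2} B_{\frac{3}{4} \delta }\) suffices for (12.15) given the bounds \(\frac{1}{|z(t_i)|} \ge \frac{3}{4}\delta \) and \(\varepsilon \le \frac{\delta }{4}\), can be sanity-checked numerically. In this sketch (all names are choices of the illustration) the constant \(B\) stands in for the \(\delta \)-dependent quantity \(B_{\frac{3}{4}\delta }\), treated as fixed:

```python
import random

def check_12_15(B=5.0, trials=10_000, seed=0):
    # Sanity check of (12.15): given 1/|z(t_i)| >= (3/4)*delta and
    # eps <= delta/4, the choice C_delta = 32*B/delta**2 makes
    # 2*B <= C_delta * (1/|z(t_1)| - 2*eps) * (1/|z(t_2)| - 2*eps).
    rng = random.Random(seed)
    for _ in range(trials):
        delta = rng.uniform(0.01, 0.49)
        eps = rng.uniform(0.0, delta / 4)
        C_delta = 32 * B / delta**2
        inv_z1 = rng.uniform(0.75 * delta, 10.0)  # plays 1/|z(t_1)|
        inv_z2 = rng.uniform(0.75 * delta, 10.0)  # plays 1/|z(t_2)|
        if 2 * B > C_delta * (inv_z1 - 2 * eps) * (inv_z2 - 2 * eps):
            return False
    return True
```

The check passes because each factor is at least \(\frac{3}{4}\delta - \frac{\delta }{2} = \frac{\delta }{4}\), exactly as in the proof.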

Translating the statement of Lemma 12.4 into a statement for \({\tilde{h}}_s(t)\) using (12.9), we obtain that for all \(\delta \in (0,1/2)\), there exists \(C_\delta > 0\), so that for all \(\varepsilon \in (0,\frac{\delta }{8}]\), \(r \in [\delta ,1-\delta ]\), \(r-\varepsilon \le t,s\le r + \varepsilon \) and \(\Delta t,\Delta s \in [0,\varepsilon ]\), we have:

$$\begin{aligned}&\log {\tilde{h}}_{s}(t) + \log {\tilde{h}}_{s+\Delta s}(t + \Delta t) \\&\le \log {\tilde{h}}_{s}(t + \Delta t) + \log {\tilde{h}}_{s+\Delta s}(t) + 2 C_\delta \varepsilon \; \Delta t \; \Delta s . \end{aligned}$$

Integrating the above in t against \(\psi _{\varepsilon _1}(r-t)\) and in s against \(\psi _{\varepsilon _2}(r-s)\) with \(\varepsilon _1,\varepsilon _2 \in (0,\varepsilon ]\), we obtain that under the same assumptions as above:

$$\begin{aligned}&\log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_{r}(r) + \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_{r+\Delta s}(r + \Delta t) \\&\quad \le \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_{r}(r + \Delta t) + \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_{r+\Delta s}(r) + 2 C_\delta \varepsilon \; \Delta t \; \Delta s. \end{aligned}$$

Exchanging sides, dividing by \(\Delta t > 0\) and taking the limit as \(\Delta t \rightarrow 0\), and then dividing by \(\Delta s > 0\) and taking the limit as \(\Delta s \rightarrow 0\), we obtain precisely:

$$\begin{aligned} \partial _s \partial _t |_{t=s=r} \log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_{s}(t) \le 2 C_\delta \varepsilon , \end{aligned}$$

thereby confirming (H2). \(\square \)
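The limiting step just performed is the standard fact that the second mixed difference quotient of a \(C^2\) function converges to the mixed partial derivative \(\partial _s \partial _t\). A small numerical sketch, with a hypothetical smooth \(F\) standing in for \(\log {\tilde{h}}^{\varepsilon _1,\varepsilon _2}_s(t)\):

```python
import math

def mixed_difference(F, s, t, ds, dt):
    # Second mixed difference quotient: the quantity whose limit, taken
    # first in dt and then in ds, yields d_s d_t F(s, t) in Step 7.
    return (F(s + ds, t + dt) - F(s + ds, t)
            - F(s, t + dt) + F(s, t)) / (ds * dt)

# Hypothetical smooth F (a stand-in for the mollified log-density):
F = lambda s, t: math.sin(s) * math.cos(t)
approx = mixed_difference(F, 0.4, 0.7, 1e-4, 1e-4)
exact = -math.cos(0.4) * math.sin(0.7)  # d_s d_t F at (0.4, 0.7)
```

In particular, a uniform upper bound on the mixed difference quotient, as obtained above, passes to the mixed partial derivative in the limit.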

For completeness, we provide a proof of the following lemma, used in Step 6 above.

Lemma 12.5

Let f be a locally Lipschitz function on an open interval \(I \subset {\mathbb {R}}\). Let \(\psi \) denote a \(C^1\)-smooth compactly supported function on \({\mathbb {R}}\) which integrates to 1. Denote by \(\psi _\varepsilon (x) = \frac{1}{\varepsilon } \psi (x/\varepsilon )\), \(\varepsilon > 0\), the corresponding family of mollifiers. Then:

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} (f *\psi _\varepsilon )'(x) = f'(x) , \end{aligned}$$

at all points \(x \in I\) where f is differentiable.

Proof

Without loss of generality, assume that \(0 \in I\), that f is differentiable at 0 and that \(f(0)=0\). Assume that \(\psi \) is supported in \([-M,M]\), and let \(\varepsilon > 0\) be small enough so that \([-M \varepsilon , M \varepsilon ] \subset I\). Then:

$$\begin{aligned} (f *\psi _\varepsilon )'(0) = \left. \frac{d}{dx} \right| _{x=0} \int f(x+y) \psi _\varepsilon (y) dy = \int f'(y) \psi _\varepsilon (y) dy , \end{aligned}$$

where the differentiation under the integral is justified since f is locally Lipschitz. Integrating by parts (which is justified as \(f \psi _\varepsilon \) is absolutely continuous), we obtain

$$\begin{aligned} (f *\psi _\varepsilon )'(0) = - \int _{-M\varepsilon }^{M\varepsilon } f(y) \psi _\varepsilon '(y) dy = - \int _{-M}^M \frac{f(\varepsilon z)}{\varepsilon z} z \psi '(z) dz . \end{aligned}$$

But for each \(z \in [-M , M] \setminus \left\{ 0\right\} \), \(\lim _{\varepsilon \rightarrow 0} \frac{f (\varepsilon z)}{\varepsilon z} = f'(0)\), and since f is Lipschitz on \([-\varepsilon M , \varepsilon M]\), we obtain by Lebesgue’s Dominated Convergence Theorem that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} (f *\psi _\varepsilon )'(0) = - \int _{-M}^M f'(0) z \psi '(z) dz = f'(0) \int _{-M}^M \psi (z) dz = f'(0) , \end{aligned}$$

as asserted. \(\square \)
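Lemma 12.5 can also be illustrated numerically. The sketch below (all names hypothetical, and the bump function a convenient concrete choice) mollifies \(f(x) = x^2 \sin (1/x)\), which is Lipschitz near 0 and differentiable at 0 with \(f'(0) = 0\) even though \(f'\) is discontinuous there, and evaluates \((f * \psi _\varepsilon )'(0)\) via the integration-by-parts formula from the proof:

```python
import math

def psi_prime(u):
    # Derivative of the C^1 bump psi(u) = (15/16)(1 - u^2)^2 on [-1, 1],
    # which integrates to 1 (a concrete choice of mollifier).
    return -(15.0 / 4.0) * u * (1 - u * u) if abs(u) < 1 else 0.0

def mollified_derivative_at_0(f, eps, n=100_000):
    # (f * psi_eps)'(0) = -(1/eps) * int_{-1}^{1} f(eps*z) psi'(z) dz,
    # as in the proof of Lemma 12.5; midpoint rule (n even avoids z = 0).
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        z = -1.0 + (k + 0.5) * h
        total += f(eps * z) * psi_prime(z)
    return -total * h / eps

# Lipschitz near 0, differentiable at 0 with f'(0) = 0, yet f' is
# discontinuous at 0:
f = lambda x: x * x * math.sin(1.0 / x) if x != 0 else 0.0
```

As \(\varepsilon \) decreases, the returned value tends to \(f'(0) = 0\), in line with the lemma.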

13 Final results

In this final section, we combine the results obtained in Parts I, II and the previous section, establishing at last the Main Theorem 1.1 and the globalization theorem for the \({\mathsf {CD}}(K,N)\) condition. We also treat the case of an infinitesimally Hilbertian space.

Throughout this section, recall that we assume \(K \in {\mathbb {R}}\) and \(N \in (1,\infty )\).

13.1 Proof of the Main Theorem 1.1

Theorem 13.1

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s., so that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length space. Then:

$$\begin{aligned} {\mathsf {CD}}_{loc}(K,N) \Rightarrow {\mathsf {CD}}^{1}_{Lip}(K,N) . \end{aligned}$$

Proof

By Remark 6.11, \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}_{loc}(K,N)\) if and only if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) does. By Remark 8.8, the same is true for \({\mathsf {CD}}^1_{Lip}(K,N)\). Consequently, we may assume that \(\text {supp}({\mathfrak {m}}) = X\). By Lemma 6.12 we deduce that \((X,\mathsf {d})\) is proper and geodesic (note that this would be false without the length space assumption above). Note that for geodesic essentially non-branching spaces, it is known that \({\mathsf {CD}}_{loc}(K,N)\) implies \({\mathsf {MCP}}(K,N)\)—see [30] for a proof assuming non-branching, but the same proof works under essentially non-branching, see the comments after [29, Corollary 5.4]. Consequently, the results of Sect. 7 apply.

Recall that given a 1-Lipschitz function \(u : X \rightarrow {\mathbb {R}}\), the equivalence relation \(R^b_{u}\) on the transport set \({\mathcal {T}}_{u}^{b}\) induces a partition \(\{R_u^b(\alpha )\}_{\alpha \in Q}\) of \({\mathcal {T}}_{u}^{b}\). By Corollary 7.3, we know that \({\mathfrak {m}}({\mathcal {T}}_u \setminus {\mathcal {T}}_u^b) = 0\) with associated strongly consistent disintegration:

$$\begin{aligned} {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}} = {\mathfrak {m}}\llcorner _{{\mathcal {T}}_{u}^{b}} = \int _{Q} {\mathfrak {m}}_{\alpha } \,{\mathfrak {q}}(d\alpha ), \qquad \text {with } {\mathfrak {m}}_{\alpha } (R_u^b(\alpha )) = 1, \ \text {for } {\mathfrak {q}}\text {-a.e. } \alpha \in Q . \end{aligned}$$

It was proved in [27] that the \({\mathsf {CD}}_{loc}(K,N)\) condition ensures that for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\), \((\overline{R^b_u(\alpha )},\mathsf {d},{\mathfrak {m}}_{\alpha })\) verifies \({\mathsf {CD}}(K,N)\) with \(\text {supp}({\mathfrak {m}}_\alpha ) = \overline{R^b_u(\alpha )}\). Denoting by \(X_{\alpha }\) the closure \(\overline{R_u^b(\alpha )}\), Theorem 7.10 ensures that \(X_{\alpha }\) coincides with the transport ray \(R_{u}(\alpha )\) for \({\mathfrak {q}}\)-a.e. \(\alpha \in Q\). Consequently, all 4 conditions of the \({\mathsf {CD}}^1_u(K,N)\) Definition 8.1 are verified, and the assertion follows. \(\square \)

Theorem 13.2

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. Then:

$$\begin{aligned} {\mathsf {CD}}^1(K,N) \Rightarrow {\mathsf {CD}}(K,N) . \end{aligned}$$

Proof

By Remark 8.8, \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {CD}}^1(K,N)\) if and only if \((\text {supp}({\mathfrak {m}}),\mathsf {d},{\mathfrak {m}})\) does. By Remark 6.11, the same is true for \({\mathsf {CD}}(K,N)\). Consequently, we may assume that \(\text {supp}({\mathfrak {m}}) = X\).

By Proposition 8.9 and Remark 8.11, X also verifies \({\mathsf {MCP}}(K,N)\), and so Theorem 6.15 applies. Given \(\mu _{0},\mu _{1} \in {\mathcal {P}}_{2}(X,\mathsf {d},{\mathfrak {m}})\), consider the unique \(\nu \in \mathrm {OptGeo}(\mu _{0},\mu _{1})\), and denote \(\mu _t := (\mathrm{e}_t)_{\sharp }(\nu ) \ll {\mathfrak {m}}\) for all \(t \in [0,1]\). Let \(\rho _t := d\mu _t / d{\mathfrak {m}}\) denote the versions of the densities guaranteed by Corollary 9.5.

Denote an associated Kantorovich potential by \(\varphi \), and recall that \(\nu \) is concentrated on \(G_\varphi = G_\varphi ^+ \cup G_\varphi ^0\), where \(G_\varphi ^+\) and \(G_\varphi ^0\) denote the subsets of positive and zero length \(\varphi \)-Kantorovich geodesics, respectively. The change-of-variables Theorem 11.4 and Proposition 4.4 yield that for \(\nu \)-a.e. geodesic \(\gamma \in G_\varphi ^+\):

$$\begin{aligned}&\frac{\rho _s(\gamma _s)}{\rho _t(\gamma _t)} = \frac{h^{\varphi _s(\gamma _s)}_{\gamma _s}(t)}{1 + (t-s) \frac{\partial _\tau |_{\tau =t}\ell _\tau ^2/2(\gamma _t)}{\ell (\gamma )^2}} \nonumber \\&\quad = \frac{h^{\varphi _s(\gamma _s)}_{\gamma _s}(t)}{1 + (t-s) \frac{\partial _\tau |_{\tau =t}{\bar{\ell }}_\tau ^2/2(\gamma _t)}{\ell (\gamma )^2}} \;\;\; \text {for a.e. } t,s \in (0,1) . \end{aligned}$$
(13.1)

where for all \(s \in (0,1)\), \(h_{s} = h^{\varphi _{s}(\gamma _{s})}_{\gamma _s}\) is a \({\mathsf {CD}}(K_{0},N)\) density, with \(K_{0} = \ell ^{2}(\gamma ) K\) and \(h_{s}(s)=1\). Together with Corollary 9.5, which ensures the Lipschitz regularity (and positivity) of \((0,1) \ni t \mapsto \rho _t(\gamma _t)\), this verifies assumptions (A) and (B) of Theorem 12.3. As explained in Sect. 12, the 3rd order information on the Kantorovich potential \(\varphi \) asserted by Theorem 5.5 verifies assumption (C) of Theorem 12.3. It follows by Theorem 12.3 (and the discussion preceding it) that the rigidity of (13.1) necessarily implies that for those \(\gamma \in G_\varphi ^+\) satisfying (13.1), it holds:

$$\begin{aligned} \frac{1}{\rho _{t}(\gamma _{t})} = L(t) Y(t) \;\;\; \forall t \in (0,1) , \end{aligned}$$

where L is concave and Y is a \({\mathsf {CD}}(K_0,N)\) density on (0, 1). Noting that \(\sigma _{K_0,N}^{(\alpha )}(\theta ) = \sigma _{K,N}^{(\alpha )}(\theta \ell (\gamma ))\), we obtain by a standard application of Hölder’s inequality that for any \(t_0,t_1 \in (0,1)\), \(\alpha \in [0,1]\) and \(t_\alpha = \alpha t_1 + (1-\alpha ) t_0\):

$$\begin{aligned} \rho _{t_\alpha }^{-\frac{1}{N}}(\gamma _{t_\alpha })&= L^{\frac{1}{N}}(t_\alpha ) Y^{\frac{1}{N}}(t_\alpha ) \nonumber \\&\ge \Big ( \alpha L(t_1) + (1-\alpha ) L(t_0) \Big )^{\frac{1}{N}} \cdot \nonumber \\&\qquad \Big ( \sigma _{K_0,N-1}^{(\alpha )}(\left| t_1-t_0\right| ) Y^{\frac{1}{N-1}}(t_1) + \sigma _{K_0,N-1}^{(1-\alpha )}(\left| t_1-t_0\right| ) Y^{\frac{1}{N-1}}(t_0) \Big )^{\frac{N-1}{N}} \nonumber \\&\ge \alpha ^{\frac{1}{N}} \sigma _{K_0,N-1}^{(\alpha )}(\left| t_1-t_0\right| )^{\frac{N-1}{N}} L^{\frac{1}{N}} (t_1)Y^{\frac{1}{N}}(t_1) \nonumber \\&\quad + (1-\alpha )^{\frac{1}{N}} \sigma _{K_0,N-1}^{(1-\alpha )}(\left| t_1-t_0\right| )^{\frac{N-1}{N}} L^{\frac{1}{N}} (t_0)Y^{\frac{1}{N}}(t_0) \nonumber \\&= \alpha ^{\frac{1}{N}} \sigma _{K,N-1}^{(\alpha )}(\left| t_1-t_0\right| \ell (\gamma ))^{\frac{N-1}{N}} \rho _{t_1}^{-\frac{1}{N}}(\gamma _{t_1}) \nonumber \\&\quad + (1-\alpha )^{\frac{1}{N}} \sigma _{K,N-1}^{(1-\alpha )}(\left| t_1-t_0\right| \ell (\gamma ))^{\frac{N-1}{N}} \rho _{t_0}^{-\frac{1}{N}}(\gamma _{t_0}) \nonumber \\&= \tau _{K,N}^{(\alpha )}(\mathsf {d}(\gamma _{t_0}, \gamma _{t_1})) \rho _{t_1}^{-\frac{1}{N}}(\gamma _{t_1}) + \tau _{K,N}^{(1-\alpha )}(\mathsf {d}(\gamma _{t_0}, \gamma _{t_1})) \rho _{t_0}^{-\frac{1}{N}}(\gamma _{t_0}) . \end{aligned}$$
(13.2)

Using the upper semi-continuity of \(t \mapsto \rho _t(\gamma _t)\) at the end-points \(t=0,1\) ensured by Corollary 9.5 (as both \(\mu _0,\mu _1 \ll {\mathfrak {m}}\)), we conclude that for \(\nu \)-a.e. \(\gamma \in G^+_{\varphi }\), the previous inequality in fact holds for all \(t_0,t_1 \in [0,1]\). In particular, for \(t_0 = 0\), \(t_1=1\) and all \(\alpha \in [0,1]\):

$$\begin{aligned} \rho _{\alpha }^{-1/N}(\gamma _{\alpha }) \ge \tau _{K,N}^{(\alpha )}(\mathsf {d}(\gamma _0, \gamma _1)) \rho _{1}^{-\frac{1}{N}}(\gamma _{1}) + \tau _{K,N}^{(1-\alpha )}(\mathsf {d}(\gamma _0, \gamma _1)) \rho _{0}^{-\frac{1}{N}}(\gamma _{0}) .\nonumber \\ \end{aligned}$$
(13.3)

As for null-geodesics \(\gamma \in G_\varphi ^0\) (having zero length), note that \(\tau _{K,N}^{(s)}(0) = s\) and that \([0,1] \ni t \mapsto \rho _t(\gamma _t)\) remains constant by Theorem 11.4, and therefore (13.3) holds trivially with equality for all \(\gamma \in G_\varphi ^0\). In conclusion, (13.3) holds for \(\nu \)-a.e. geodesic \(\gamma \), thereby confirming the validity of Definition 6.7 and verifying \({\mathsf {CD}}(K,N)\). \(\square \)
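The coefficient identities invoked in the last two equalities of (13.2), namely the scaling \(\sigma _{K_0,N-1}^{(\alpha )}(\theta ) = \sigma _{K,N-1}^{(\alpha )}(\theta \ell (\gamma ))\) with \(K_0 = \ell ^2(\gamma ) K\) and the definition \(\tau _{K,N}^{(t)}(\theta ) = t^{1/N} \sigma _{K,N-1}^{(t)}(\theta )^{1-1/N}\), can be sanity-checked numerically. The closed forms below are the standard ones for \(K > 0\) (with \(\theta \) in the admissible range); the sample parameter values are arbitrary choices of this sketch:

```python
import math

def sigma(t, K, N, theta):
    # sigma^{(t)}_{K,N}(theta) = sin(t*theta*sqrt(K/N)) / sin(theta*sqrt(K/N)),
    # valid for K > 0 and theta*sqrt(K/N) in (0, pi).
    a = theta * math.sqrt(K / N)
    return math.sin(t * a) / math.sin(a)

def tau(t, K, N, theta):
    # tau^{(t)}_{K,N}(theta) = t^(1/N) * sigma^{(t)}_{K,N-1}(theta)^(1 - 1/N).
    return t ** (1.0 / N) * sigma(t, K, N - 1, theta) ** (1.0 - 1.0 / N)

# Scaling identity used in (13.2), with K0 = ell^2 * K (sample values):
K, N, ell, alpha, theta = 2.0, 4.0, 0.7, 0.3, 0.5
lhs = sigma(alpha, ell**2 * K, N - 1, theta)
rhs = sigma(alpha, K, N - 1, theta * ell)
```

One can also check numerically that \(\tau _{K,N}^{(t)}(\theta ) \ge \sigma _{K,N}^{(t)}(\theta )\) for \(K > 0\), reflecting the fact that \({\mathsf {CD}}(K,N)\) is stronger than \({\mathsf {CD}}^*(K,N)\).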

As an immediate consequence of the previous two theorems, we obtain the Local-to-Global Theorem for the Curvature-Dimension condition.

Theorem 13.3

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. so that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length space. Then:

$$\begin{aligned} {\mathsf {CD}}_{loc}(K,N) \iff {\mathsf {CD}}(K,N). \end{aligned}$$

Remark 13.4

It is clear that the above globalization theorem is false without some global assumption ultimately ensuring that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is geodesic. Indeed, simply consider a \({\mathsf {CD}}(K,N)\) space, and restrict it to two disjoint geodesically-convex closed subsets of \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) (each having positive measure)—the resulting space clearly satisfies \({\mathsf {CD}}_{loc}(K,N)\) but not \({\mathsf {CD}}(K,N)\); it is also easy to construct similar examples where \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is connected. In addition, as already mentioned in the Introduction, the globalization theorem is known to be false without some type of non-branching assumption (see [67]).

As an interesting byproduct, we also obtain that \({\mathsf {CD}}^{1}\) and \({\mathsf {CD}}^{1}_{Lip}\) are equivalent conditions on essentially non-branching spaces:

Corollary 13.5

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. Then:

$$\begin{aligned} {\mathsf {CD}}(K,N) \iff {\mathsf {CD}}^{1}(K,N) \iff {\mathsf {CD}}_{Lip}^{1}(K,N). \end{aligned}$$

Proof

\({\mathsf {CD}}_{Lip}^{1}(K,N)\) is by definition stronger than \({\mathsf {CD}}^{1}(K,N)\), which in turn implies \({\mathsf {CD}}(K,N)\) by Theorem 13.2. But \({\mathsf {CD}}(K,N)\) implies its local version \({\mathsf {CD}}_{loc}(K,N)\), as well as that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is geodesic by Lemma 6.12. The cycle is then closed by Theorem 13.1. \(\square \)

Finally, we deduce a complete equivalence between the reduced and the classic Curvature-Dimension conditions on essentially non-branching spaces. Recall that the reduced version \({\mathsf {CD}}^*(K,N)\), introduced in [14] (in the non-branching setting), is defined exactly in the same manner as \({\mathsf {CD}}(K,N)\), with the only (crucial) difference being that one employs the slightly smaller \(\sigma ^{(t)}_{K,N}(\theta )\) coefficients instead of the \(\tau ^{(t)}_{K,N}(\theta )\) ones in Definition 6.4.

Corollary 13.6

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an essentially non-branching m.m.s. Then:

$$\begin{aligned} {\mathsf {CD}}^{*}(K,N) \iff {\mathsf {CD}}(K,N). \end{aligned}$$

Proof

By definition \({\mathsf {CD}}(K,N)\) is stronger than \({\mathsf {CD}}^*(K,N)\) (see [14, Proposition 2.5 (i)]). For the converse implication, note that \({\mathsf {CD}}^*(K,N)\) implies that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is proper and geodesic, by repeating the proof of Lemma 6.12 verbatim. Then we observe that \({\mathsf {CD}}^*(K,N) \Rightarrow {\mathsf {CD}}_{loc}(K^{-},N)\), where \({\mathsf {CD}}_{loc}(K^{-},N)\) denotes that \((X,\mathsf {d},{\mathfrak {m}})\) verifies \({\mathsf {CD}}_{loc}(K',N)\) for every \(K' < K\) (with the open neighborhoods possibly depending on \(K'\)). For non-branching spaces, this was proved in [14, Proposition 5.5] (see also [34, Lemma 2.1]), but the proof does not rely on any non-branching assumptions. Then, by Theorem 13.3, we obtain \({\mathsf {CD}}(K',N)\) for any \(K' < K\). Finally, by uniqueness of dynamical plans (see Theorem 6.15 and Lemma 6.13) and continuity of \(\tau _{K',N}^{(t)}(\theta )\) in \(K'\), the claim follows. \(\square \)

13.2 \({\mathsf {RCD}}(K,N)\) spaces

We also mention the more recent Riemannian Curvature Dimension condition \({\mathsf {RCD}}^{*}(K,N)\). In the infinite-dimensional case \(N = \infty \), it was introduced in [7] for finite measures \({\mathfrak {m}}\) and in [4] for \(\sigma \)-finite ones. The class \({\mathsf {RCD}}^{*}(K,N)\) with \(N<\infty \) was proposed in [40] and extensively investigated in [8, 11, 35]. We refer to these papers and references therein for a general account of the synthetic formulation of the latter Riemannian-type Ricci curvature lower bounds. Here we only briefly recall that it is a strengthening of the reduced Curvature Dimension condition: a m.m.s. verifies \({\mathsf {RCD}}^{*}(K,N)\) if and only if it satisfies \({\mathsf {CD}}^{*}(K,N)\) and is infinitesimally Hilbertian [40, Definition 4.19 and Proposition 4.22], meaning that the Sobolev space \(W^{1,2}(X,{\mathfrak {m}})\) is a Hilbert space (with the Hilbert structure induced by the Cheeger energy). Recall also that the local-to-global property for the \({\mathsf {RCD}}^{*}(K,N)\) condition (say for length spaces of full support) has already been established for \(N=\infty \) in [7, Theorem 6.22] for non-branching spaces with finite second moment, for \(N < \infty \) in [35, Theorems 3.17 and 3.25] for strong \({\mathsf {RCD}}^*(K,N)\) spaces, and for all \(N \in [1,\infty ]\) in [10, Theorems 7.2 and 7.8] for proper spaces without any non-branching assumptions.

We are now in a position to introduce the following (expected) definition:

Definition

We will say that a m.m.s. \((X,\mathsf {d},{\mathfrak {m}})\) satisfies \({\mathsf {RCD}}(K,N)\) if it verifies \({\mathsf {CD}}(K,N)\) and is infinitesimally Hilbertian.

We can now immediately deduce:

Corollary 13.7

$$\begin{aligned} {\mathsf {RCD}}(K,N) \iff {\mathsf {RCD}}^{*}(K,N). \end{aligned}$$

Note that \({\mathsf {CD}}^{*}(K,\infty )\) and \({\mathsf {CD}}(K,\infty )\) are the same condition, so the above also holds for \(N = \infty \).

Proof

Since \({\mathsf {CD}}(K,N)\) is stronger than \({\mathsf {CD}}^{*}(K,N)\), one implication is straightforward. For the other implication, recall that \({\mathsf {RCD}}^{*}(K,N)\) forces the space to be essentially non-branching (see [68, Corollary 1.2]), and so the assertion follows by Corollary 13.6. \(\square \)

Corollary 13.8

Let \((X,\mathsf {d},{\mathfrak {m}})\) be an m.m.s. so that \((\text {supp}({\mathfrak {m}}),\mathsf {d})\) is a length space. Then:

$$\begin{aligned} {\mathsf {RCD}}_{loc}(K,N) \iff {\mathsf {RCD}}(K,N). \end{aligned}$$

Proof

One implication is trivial. For the converse, as usual, we may assume that \(\text {supp}({\mathfrak {m}}) = X\) by Remark 6.11. By Lemma 6.12, we know that \((X,\mathsf {d})\) is proper and geodesic (as usual, this would be false without the length space assumption above). As the local-to-global property has been proved for proper geodesic \({\mathsf {RCD}}^*(K,N)\) spaces without any non-branching assumptions in [10], it follows that:

$$\begin{aligned} {\mathsf {RCD}}_{loc}(K,N) \Rightarrow {\mathsf {RCD}}^*_{loc}(K,N) \Rightarrow {\mathsf {RCD}}^*(K,N) \Rightarrow {\mathsf {RCD}}(K,N) , \end{aligned}$$

where the last implication follows by Corollary 13.7. \(\square \)

13.3 Concluding remarks

We conclude this work with several brief remarks and suggestions for further investigation.

  • Note that the proof of Theorem 13.2 in fact yields more than stated: not only does the synthetic inequality (13.2) hold (for all \(t_0,t_1 \in [0,1]\)), but in fact we obtain for \(\nu \)-a.e. geodesic \(\gamma \) the a-priori stronger disentanglement (or “L-Y” decomposition):

    $$\begin{aligned} \frac{1}{\rho _t(\gamma _t)} = L_\gamma (t) Y_\gamma (t) \;\;\; \forall t \in (0,1), \end{aligned}$$
    (13.4)

    where \(L_\gamma \) is concave and \(Y_\gamma \) is a \({\mathsf {CD}}(\ell (\gamma )^2 K, N)\) density on (0, 1). As explained in the Introduction, it follows from [34] that for a fixed \(\gamma \), (13.4) is indeed strictly stronger than (13.2). In view of Main Theorem 1.1, this constitutes a new characterization of essentially non-branching \({\mathsf {CD}}(K,N)\) spaces.

  • According to [35, p. 1026], it is possible to localize the argument of [68] and deduce from a strong \({\mathsf {CD}}_{loc}(K,\infty )\) condition (when K-convexity of the entropy is assumed along any \(W_2\)-geodesic with end-points inside the local neighborhood), that the space is globally essentially non-branching. In combination with our results, it follows that the strong \({\mathsf {CD}}(K,N)\) condition enjoys the local-to-global property, without a-priori requiring any additional non-branching assumptions.

  • It would still be interesting to clarify the relation between the \({\mathsf {CD}}(K,N)\) condition and the property \({\mathsf {BM}}(K,N)\) of satisfying a Brunn-Minkowski inequality (with sharp dependence on K and N, as in [74]). Note that by Main Theorem 1.1, it is enough to understand this locally on essentially non-branching spaces.

  • It would also be interesting to study the \({\mathsf {CD}}^1(K,N)\) condition on its own, when no non-branching assumptions are made, and to verify the usual list of properties desired of a notion of Curvature-Dimension (see [28, 51, 74]).

  • A natural counterpart of \({\mathsf {RCD}}(K,N)\) would be \({\mathsf {RCD}}^{1}(K,N)\): we will say that a m.m.s. verifies \({\mathsf {RCD}}^{1}(K,N)\) if it verifies \({\mathsf {CD}}^{1}(K,N)\) and it is infinitesimally Hilbertian. Recall that an \({\mathsf {RCD}}(K,N)\) space is always essentially non-branching [68], and hence Main Theorem 1.1 immediately yields:

    $$\begin{aligned} {\mathsf {RCD}}(K,N) \Rightarrow {\mathsf {RCD}}^{1}(K,N). \end{aligned}$$

    The converse implication would be implied by the following claim which we leave for a future investigation: an \({\mathsf {RCD}}^{1}(K,N)\)-space is always essentially non-branching.

  • Regarding the novel third-order temporal information on the intermediate-time Kantorovich potentials \(\varphi _t\) obtained in this work, it would be interesting to explore whether it has any additional consequences pertaining to the spatial regularity of solutions to the Hamilton-Jacobi equation in general, and of the transport map \(T_{s,t} = \mathrm{e}_t \circ \mathrm{e}_{s}|_G^{-1}\) from an intermediate time \(s \in (0,1)\) in particular (where \(G \subset G_\varphi \) is the subset of injectivity guaranteed by Corollary 6.16). In the smooth Riemannian setting, the map \(T_{s,t}\) is known to be locally Lipschitz by Mather’s regularity theory (see [77, Chapter 8] and cf. [77, Theorem 8.22]). A starting point for this investigation could be the following bound on the (formal) Jacobian of \(T_{s,t}\), which follows immediately from (12.5), Theorem 3.11 (3) and Lemma A.9: for \(\mu _s\)-a.e. x, the Jacobian is bounded above by a function of \(s,t,K,N,l_s(x)\) only.