The Globalization Theorem for the Curvature Dimension Condition

The Lott-Sturm-Villani Curvature-Dimension condition provides a synthetic notion for a metric-measure space to have Ricci-curvature bounded from below and dimension bounded from above. We prove that it is enough to verify this condition locally: an essentially non-branching metric-measure space $(X,{\mathsf d},{\mathfrak m})$ (so that $(\text{supp}({\mathfrak m}),{\mathsf d})$ is a length-space and ${\mathfrak m}(X)<\infty$) verifying the local Curvature-Dimension condition $\mathsf{CD}_{loc}(K,N)$ with parameters $K \in \mathbb{R}$ and $N \in (1,\infty)$, also verifies the global Curvature-Dimension condition $\mathsf{CD}(K,N)$, meaning that the Curvature-Dimension condition enjoys the globalization (or local-to-global) property. The main new ingredients of our proof are an explicit $\textit{change-of-variables}$ formula for densities of Wasserstein geodesics depending on a second-order derivative of an associated Kantorovich potential; a surprising $\textit{third-order}$ bound on the latter Kantorovich potential, which holds in complete generality on any proper geodesic space; and a certain $\textit{rigidity}$ property of the change-of-variables formula, allowing us to bootstrap the a-priori available regularity. The change-of-variables formula is obtained via a new synthetic notion of Curvature-Dimension we dub $\mathsf{CD}^{1}(K,N)$.


Introduction
The Curvature-Dimension condition $\mathsf{CD}(K,N)$ was first introduced in the 1980's by Bakry and Émery [16,15] in the context of diffusion generators, having in mind primarily the setting of weighted Riemannian manifolds, namely smooth Riemannian manifolds endowed with a smooth density with respect to the Riemannian volume. The $\mathsf{CD}(K,N)$ condition serves as a generalization of the classical condition in the non-weighted Riemannian setting of having Ricci curvature bounded below by $K \in \mathbb{R}$ and dimension bounded above by $N \in [1,\infty]$ (see e.g. [55,59] for further possible extensions). Numerous consequences of this condition have been obtained over the past decades, extending results from the classical non-weighted setting and at times establishing new ones directly in the weighted one. These include diameter bounds, volume comparison theorems, heat-kernel and spectral estimates, Harnack inequalities, topological implications, Brunn-Minkowski-type inequalities, and isoperimetric, functional and concentration inequalities (see e.g. [48,17,76] and the references therein).
Being a differential and Hilbertian condition, it was for many years unclear how to extend the Bakry-Émery definition beyond the smooth Riemannian setting, as interest in (measured) Gromov-Hausdorff limits of Riemannian manifolds and other non-Hilbertian singular spaces steadily grew. In parallel, and apparently unrelatedly, the theory of Optimal-Transport was being developed in increasing generality following the influential work of Brenier [21] (see e.g. [2,36,52,64,74,75,76]). Given two probability measures $\mu_0, \mu_1$ on a common geodesic space $(X,\mathsf{d})$ and a prescribed cost of transporting a single mass from point $x$ to $y$, the Monge-Kantorovich idea is to optimally couple $\mu_0$ and $\mu_1$ by minimizing the total transportation cost, and as a byproduct obtain a Wasserstein geodesic $[0,1] \ni t \mapsto \mu_t$ connecting $\mu_0$ and $\mu_1$ in the space of probability measures $\mathcal{P}(X)$. This gives rise to the notion of displacement convexity of a given functional on $\mathcal{P}(X)$ along Wasserstein geodesics, introduced and studied by McCann [51]. Following the works of Cordero-Erausquin-McCann-Schmuckenschläger [33], Otto-Villani [61] and von Renesse-Sturm [69], it was realized that the $\mathsf{CD}(K,\infty)$ condition in the smooth setting may be equivalently formulated synthetically as a certain convexity property of an entropy functional along $W_2$ Wasserstein geodesics (associated to $L^2$-Optimal-Transport, when the transport-cost is given by the squared-distance function).
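For orientation, the transport problem just described can be written out in the usual textbook form. This is a sketch in standard notation rather than a quotation from the works cited; the symbol $\Pi(\mu_0,\mu_1)$ for the set of couplings is introduced here only for illustration.

```latex
% \Pi(\mu_0,\mu_1): probability measures on X \times X with marginals \mu_0 and \mu_1
% (notation assumed here, not taken from the text above).
\begin{align*}
W_2(\mu_0,\mu_1)^2 &:= \inf_{\pi \in \Pi(\mu_0,\mu_1)} \int_{X \times X} \mathsf{d}(x,y)^2 \, d\pi(x,y) , \\
% A curve t \mapsto \mu_t is a (constant-speed) W_2-geodesic when:
W_2(\mu_s,\mu_t) &= |t-s| \, W_2(\mu_0,\mu_1) \qquad \forall s,t \in [0,1] .
\end{align*}
```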
This idea culminated in the seminal works of Lott-Villani [50] and Sturm [71,72], where a synthetic definition of CD(K, N ) was proposed on a general (complete, separable) metric space (X, d) endowed with a (locally-finite Borel) reference measure m ("metric-measure space", or m.m.s.); it was moreover shown that the latter definition coincides with the Bakry-Émery one in the smooth Riemannian setting (and in particular in the classical non-weighted one), that it is stable under measured Gromov-Hausdorff convergence of m.m.s.'s, and that it implies various geometric and analytic inequalities relating metric and measure, in complete analogy with the smooth setting. It was subsequently also shown [57,63] that Finsler manifolds and Alexandrov spaces satisfy the Curvature-Dimension condition. Thus emerged an overwhelmingly convincing notion of Ricci curvature lower bound K and dimension upper bound N for a general (geodesic) m.m.s. (X, d, m), leading to a rich and fruitful theory exploring the geometry of m.m.s.'s by means of Optimal-Transport.
One of the most important and longstanding open problems in the Lott-Sturm-Villani theory (see [71,72] and [76, pp. 888, 907]) is whether the Curvature-Dimension condition on a general geodesic m.m.s. (say, having full-support $\text{supp}(\mathfrak{m}) = X$) enjoys the globalization (or local-to-global) property: if the $\mathsf{CD}(K,N)$ condition is known to hold on a neighborhood $X_o$ of any given point $o \in X$ (a property henceforth denoted by $\mathsf{CD}_{loc}(K,N)$), does it also necessarily hold on the entire space? Clearly this is indeed the case in the smooth setting, as both curvature and dimension may be computed locally (by equivalence with the differential $\mathsf{CD}$ definition). However, for reasons which we will expand on shortly, this is not at all clear and in some cases is actually false on general m.m.s.'s. An affirmative answer to this question would immensely facilitate the verification of the $\mathsf{CD}$ condition, which at present requires testing all possible $W_2$-geodesics on $X$, instead of locally on each $X_o$. The analogous question for sectional curvature on Alexandrov spaces (where the dimension $N$ is absent) does indeed have an affirmative answer, as shown by Toponogov, and in full generality, by Perelman (see [22]).
Several partial answers to the local-to-global problem have already been obtained in the literature. A geodesic space (X, d) is called non-branching if geodesics are forbidden to branch at an interior-point into two separate geodesics. On a non-branching geodesic m.m.s. (X, d, m) having full support, it was shown by Sturm in [71,Theorem 4.17] that the local-to-global property is satisfied when N = ∞ (assuming that the space of probability measures with finite m-relative entropy is geodesically convex; see also [76,Theorem 30.42] where the same globalization result was proved under a different condition involving the existence of a full-measure totally-convex subset of X of finite-dimensional points). Still for non-branching geodesic m.m.s.'s having full support, a positive answer was also obtained by Villani in [76,Theorem 30.37] for the case K = 0 and N ∈ [1, ∞).
We stress that in these results, the restriction to non-branching spaces is not merely a technical assumption: an example of a heavily-branching m.m.s. verifying $\mathsf{CD}_{loc}(0,4)$ which does not verify $\mathsf{CD}(K,N)$ for any fixed $K \in \mathbb{R}$ and $N \in [1,\infty]$ was constructed by Rajala in [66].
Consequently, a natural assumption is to require that $(X,\mathsf{d})$ be non-branching, or more generally, to require that the $L^2$-Optimal-Transport on $(X,\mathsf{d},\mathfrak{m})$ be concentrated (i.e. up to a null-set) on a non-branching subset of geodesics, an assumption introduced by Rajala and Sturm in [67] under the name essentially non-branching (see Section 6 for precise definitions). For instance, it is known [67] that measured Gromov-Hausdorff limits of Riemannian manifolds satisfying $\mathsf{CD}(K,\infty)$, and (possibly) more generally, $\mathsf{RCD}(K,\infty)$ spaces, always satisfy the essentially non-branching assumption (see Section 13).
In this work, we provide an affirmative answer to the globalization problem in the remaining range of parameters: for $N \in (1,\infty)$ and $K \in \mathbb{R}$, the $\mathsf{CD}(K,N)$ condition verifies the local-to-global property on an essentially non-branching geodesic m.m.s. $(X,\mathsf{d},\mathfrak{m})$ having finite total-measure and full support. The exclusion of the case $N = 1$ is to avoid unnecessary pathologies, and is not essential. Our assumption that $\mathfrak{m}$ has finite total-measure (or equivalently, by scaling, that it is a probability measure) is most probably technical, but we did not verify it can be removed so as to avoid overloading the paper even further. This result is new even under the additional assumption that the space is infinitesimally Hilbertian (see Gigli [40]); we will say that such spaces verify $\mathsf{RCD}(K,N)$, in which case the assumption of being (globally) essentially non-branching is in fact superfluous.
To better explain the difference between the previously known cases when $\frac{K}{N} = 0$ and the challenge which the newly treated case $\frac{K}{N} \neq 0$ poses, as well as to sketch our solution and its main new ingredients, which we believe are of independent interest, we provide some additional details below. To avoid being too technical in this Introduction, we try to keep the discussion conceptual, and refer to Section 6 for precise definitions.

Disentangling volume-distortion coefficients
Roughly speaking, the $\mathsf{CD}(K,N)$ condition prescribes a synthetic second-order bound on how an infinitesimal volume changes when it is moved along a $W_2$-geodesic: the volume distortion (or transport Jacobian) $J$ along the geodesic should satisfy the following interpolation inequality for $t_0 = 0$ and $t_1 = 1$: (1.1) where $\tau^{(t)}_{K,N}(\theta)$ is an explicit coefficient depending on the curvature $K \in \mathbb{R}$, dimension $N \in [1,\infty]$, the interpolating time parameter $t \in [0,1]$ and the total length of the geodesic $\theta \in [0,\infty)$ (with an appropriate interpretation of (1.1) when $N = \infty$). When $N < \infty$, the latter coefficient is obtained by geometrically averaging two different volume distortion coefficients:
$$\tau^{(t)}_{K,N}(\theta) := t^{\frac{1}{N}} \, \sigma^{(t)}_{K,N-1}(\theta)^{\frac{N-1}{N}} , \qquad (1.2)$$
where the $\sigma^{(t)}_{K,N-1}(\theta)$ term encodes an $(N-1)$-dimensional evolution orthogonal to the transport and thus affected by the curvature, and the linear term $t$ represents a one-dimensional evolution tangential to the transport and thus independent of any curvature information. As with the Jacobi equation in the usual Riemannian setting, the function $[0,1] \ni t \mapsto \sigma(t) := \sigma^{(t)}_{K,N-1}(\theta)$ is explicitly obtained by solving the second-order differential equation:
$$\sigma''(t) + \theta^2 \frac{K}{N-1} \sigma(t) = 0 \ \text{ on } t \in [0,1] , \quad \sigma(0) = 0 , \ \sigma(1) = 1 . \qquad (1.3)$$
The common feature of the previously known cases $\frac{K}{N} = 0$ for the local-to-global problem is the linear behaviour in time of the distortion coefficient: $\tau^{(t)}_{K,N}(\theta) = t$. A major obstacle with the remaining cases $\frac{K}{N} \neq 0$ is that the function $[0,1] \ni t \mapsto \tau^{(t)}_{K,N}(\theta)$ does not satisfy a second-order differential characterization such as (1.3). If it did, it would be possible to express the interpolation inequality (1.1) on $[t_0,t_1] \subset [0,1]$ as a second-order differential inequality for $J^{\frac{1}{N}}$ on $[t_0,t_1]$ (see Lemmas A.5 and A.6), and so if (1.1) were known to hold for all $\{[t_0^i,t_1^i]\}_{i=1,\ldots,k}$, a finite covering of $[0,1]$, it would follow that (1.1) also holds for $[t_0,t_1] = [0,1]$.
However, a counterexample to the latter implication was constructed by Deng and Sturm in [34], thereby showing that:
the local-to-global property for $\frac{K}{N} \neq 0$, if true, cannot be obtained by a one-dimensional bootstrap argument on a single $W_2$-geodesic as above, and must follow from a deeper reason involving a family of $W_2$-geodesics simultaneously. (1.4)
On the other hand, the above argument does work if we were to replace $\tau$ by the slightly smaller $\sigma$ coefficients. This motivated Bacher and Sturm in [14] to define for $K \in \mathbb{R}$ and $N \in (1,\infty)$ the slightly weaker "reduced" Curvature-Dimension condition, denoted by $\mathsf{CD}^*(K,N)$, where the distortion coefficients $\tau^{(t)}_{K,N}(\theta)$ are indeed replaced by $\sigma^{(t)}_{K,N}(\theta)$. Using the above gluing argument (after resolving numerous technicalities), the local-to-global property for $\mathsf{CD}^*(K,N)$ was established in [14] on non-branching spaces (see also the work of Erbar-Kuwada-Sturm [35, Corollary 3.13, Theorem 3.14 and Remark 3.26] for an extension to the essentially non-branching setting, cf. [67,29]). Let us also mention here the work of Ambrosio-Mondino-Savaré [10], who independently of a similar result in [35], established the local-to-global property for $\mathsf{RCD}^*(K,N)$ proper spaces, $K \in \mathbb{R}$ and $N \in [1,\infty]$, without a-priori assuming any non-branching assumptions (but a-posteriori, such spaces must be essentially non-branching by [67]).
Without requiring any non-branching assumptions, the $\mathsf{CD}^*(K,N)$ condition was shown in [14] to imply the same geometric and analytic inequalities as the $\mathsf{CD}(K,N)$ condition, but with slightly worse constants (typically missing the sharp constant by a factor of $\frac{N-1}{N}$), suggesting that the latter is still the "right" notion of Curvature-Dimension. We conclude that the local-to-global challenge is to properly disentangle between the orthogonal and tangential components of the volume distortion $J$ before attempting to individually integrate them as above.
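To make the coefficients discussed in this subsection concrete, the following sketch records the standard closed form of the solution to an equation of type (1.3), together with the local (differential) form of the $\sigma$-interpolation inequality that makes the gluing argument possible. The dimension parameter $N'$ and the shorthand $\kappa$ are names introduced here for illustration only, and the equivalence in the last display is stated as the usual textbook fact for, say, a continuous non-negative function $f$, not as a quotation of the lemmas cited above.

```latex
% Closed-form solution of \sigma'' + \theta^2 (K/N') \sigma = 0, \sigma(0) = 0, \sigma(1) = 1
% (N' > 0 is a placeholder; (1.3) corresponds to N' = N - 1).
\[
\sigma^{(t)}_{K,N'}(\theta) =
\begin{cases}
\dfrac{\sin\big(t \theta \sqrt{K/N'}\big)}{\sin\big(\theta \sqrt{K/N'}\big)} & 0 < K \theta^2 < N' \pi^2 , \\[3mm]
t & K \theta^2 = 0 , \\[1mm]
\dfrac{\sinh\big(t \theta \sqrt{-K/N'}\big)}{\sinh\big(\theta \sqrt{-K/N'}\big)} & K \theta^2 < 0 , \\[3mm]
+\infty & K \theta^2 \geq N' \pi^2 .
\end{cases}
\]
% Consistency check with the text: for K = 0 one gets
%   \tau^{(t)}_{0,N}(\theta) = t^{1/N} \, t^{(N-1)/N} = t .
% Local form of the sigma-inequality, with \kappa := \theta^2 K / N' (hypothetical shorthand):
\[
f'' + \kappa f \leq 0 \ \text{(distributionally on } [0,1])
\iff
f\big((1-\alpha) t_0 + \alpha t_1\big) \geq
\sigma^{(1-\alpha)}_{K,N'}\big(|t_1 - t_0| \theta\big) f(t_0)
+ \sigma^{(\alpha)}_{K,N'}\big(|t_1 - t_0| \theta\big) f(t_1)
\]
\[
\text{for all } [t_0,t_1] \subset [0,1] \text{ and } \alpha \in [0,1] .
\]
```

Because the left-hand condition is local in $t$, it may be checked on each interval of a covering separately; no analogous differential characterization is available for the $\tau$ coefficients, which is the obstacle described above.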

Comparing L 2 and L 1 Optimal-Transport
There have been a couple of prior attempts to disentangle the volume distortion into its orthogonal and tangential components, by comparing between $W_2$ and $W_1$ Wasserstein geodesics (associated to $L^2$ and $L^1$ Optimal-Transport, respectively). In [30], this strategy was implicitly employed by Cavalletti and Sturm to show that $\mathsf{CD}_{loc}(K,N)$ implies the measure-contraction property $\mathsf{MCP}(K,N)$, which in a sense is a particular case of $\mathsf{CD}(K,N)$ when one end of the $W_2$-geodesic is a Dirac delta at a point $o \in X$ (see [72,56]). In that case, all of the transport-geodesics have $o$ as a common end point, so by considering a disintegration of $\mathfrak{m}$ on the family of spheres centered at $o$, and restricting the $W_2$-geodesic to these spheres, the desired disentanglement was obtained. In the subsequent work [24], Cavalletti generalized this approach to a particular family of $W_2$-geodesics, having the property that for a.e. transport-geodesic $\gamma$, its length $\ell(\gamma)$ is a function of $\varphi(\gamma_0)$, where $\varphi$ is a Kantorovich potential associated to the corresponding $L^2$-Optimal-Transport problem. Here the disintegration was with respect to the individual level sets of $\varphi$, and again the restriction of the $W_2$-geodesic enjoying the latter property to these level sets (formally of co-dimension one) induced a $W_1$-geodesic, enabling disentanglement.
Another application of $L^1$-Optimal-Transport, seemingly unrelated to disentanglement of $W_2$-geodesics, appeared in the recent breakthrough work of Klartag [47] on localization in the smooth Riemannian setting. The localization paradigm, developed by Payne-Weinberger [62], Gromov-Milman [44] and Kannan-Lovász-Simonovits [46], is a powerful tool to reduce various analytic and geometric inequalities on the space $(\mathbb{R}^n,\mathsf{d},\mathfrak{m})$ to appropriate one-dimensional counterparts. The original approach by these authors was based on a bisection method, and thus inherently confined to $\mathbb{R}^n$. In [47], Klartag extended the localization paradigm to the weighted Riemannian setting, by disintegrating the reference measure $\mathfrak{m}$ on $L^1$-Optimal-Transport geodesics (or "rays") associated to the inequality under study, and proving that the resulting conditional one-dimensional measures inherit the Curvature-Dimension properties of the underlying manifold.
Klartag's idea is quite robust, and permitted Cavalletti and Mondino in [27] to avoid the smooth techniques used in [47] and to extend the localization paradigm to the framework of essentially non-branching geodesic m.m.s.'s $(X,\mathsf{d},\mathfrak{m})$ of full-support verifying $\mathsf{CD}_{loc}(K,N)$, $N \in (1,\infty)$. By a careful study of the structure of $W_1$-geodesics, Cavalletti and Mondino were able to transfer the Curvature-Dimension information encoded in the $W_2$-geodesics to the individual rays along which a given $W_1$-geodesic evolves, thereby proving that on such spaces:
the conditional one-dimensional measures obtained by disintegration of $\mathfrak{m}$ on $L^1$-Optimal-Transport rays satisfy $\mathsf{CD}(K,N)$. (1.5)
Note that the densities of one-dimensional $\mathsf{CD}(K,N)$ spaces are characterized via the $\sigma$ (as opposed to $\tau$) volume-distortion coefficients (see the Appendix), so by applying the gluing argument described in the previous subsection, only local $\mathsf{CD}_{loc}(K,N)$ information was required in [27] to obtain global control over the entire one-dimensional transport ray.
This allowed Cavalletti and Mondino (see [27,28]) to obtain a series of sharp geometric and analytic inequalities for $\mathsf{CD}_{loc}(K,N)$ spaces as above, in particular extending from the smooth Riemannian setting the sharp Lévy-Gromov [42] and Milman [54] isoperimetric inequalities, as well as the sharp Brunn-Minkowski inequality of Cordero-Erausquin-McCann-Schmuckenschläger [33] and Sturm [72], all in global form (see also Ohta [58]).
We would like to address at this point a certain general belief shared by some in the Optimal-Transport community, stating that the property BM(K, N ) of satisfying the Brunn-Minkowski inequality (with sharp coefficients correctly depending on K, N ), should be morally equivalent to the CD(K, N ) condition. Rigorously establishing such an equivalence would immediately yield the local-to-global property of CD(K, N ), by the Cavalletti-Mondino localization proof that CD loc (K, N ) ⇒ BM(K, N ). However, we were unsuccessful in establishing the missing implication BM(K, N ) ⇒ CD(K, N ), and in fact a careful attempt in this direction seems to lead back to the circle of ideas we were ultimately able to successfully develop in this work.
Instead of starting our investigation from $\mathsf{BM}(K,N)$, our strategy is to directly start from a suitable modification of the property (1.5), which we dub $\mathsf{CD}^1(K,N)$. The main result of this work consists of showing that $\mathsf{CD}^1(K,N) \Rightarrow \mathsf{CD}(K,N)$, by means of transferring the one-dimensional $\mathsf{CD}(K,N)$ information encoded in a family of suitably constructed $L^1$-Optimal-Transport rays, onto a given $W_2$-geodesic. This goes in exactly the opposite direction to the one studied by Cavalletti and Mondino in [27], and completes the cycle:
$$\mathsf{CD}_{loc}(K,N) \Rightarrow \mathsf{CD}^1(K,N) \Rightarrow \mathsf{CD}(K,N) \Rightarrow \mathsf{CD}_{loc}(K,N) .$$
To the best of our knowledge, this decisive feature of our work, namely deducing $\mathsf{CD}(K,N)$ for a given $W_2$-geodesic by considering the $\mathsf{CD}_{loc}(K,N)$ information encoded in a family (in accordance with (1.4)) of different associated $W_2$-geodesics (manifesting itself in the $\mathsf{CD}^1(K,N)$ information along a family of different $L^1$-Optimal-Transport rays), has not been previously explored.
To achieve the right disentanglement, we are required to develop several new ingredients, which we believe are of independent interest. The first is a change-of-variables formula for the density of a $W_2$-geodesic along a given $L^2$-Optimal-Transport geodesic in $X$, which depends on a second-order derivative of an associated Kantorovich potential. The second is a surprising third-order bound on the latter Kantorovich potential, which holds in complete generality on any proper geodesic space. The third is a certain rigidity property of the change-of-variables formula, which allows us to bootstrap the a-priori available regularity, and which in combination with the first and second ingredients, enables us to achieve disentanglement. These ingredients and the strategy outlined above are described in more detail next.

Definition of $\mathsf{CD}^1(K,N)$ and main result
Motivated by the results of [47,27], we propose the following new definition of Curvature-Dimension, formulated in the language of $L^1$-Optimal-Transport. For simplicity, we only present the case when $\text{supp}(\mathfrak{m}) = X$, and refer to Section 8 for the general definition. Recall that by the classical Monge-Kantorovich-Rubinstein theorem (e.g. [76, Case 5.16]), the $L^1$-Optimal-Transport cost between two probability measures is characterized by duality with respect to integration of 1-Lipschitz test functions.
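The duality just recalled can be written out as follows. This is the usual textbook statement (valid, for instance, for probability measures with finite first moment), given here as a reminder rather than as the precise form used in [76].

```latex
% Kantorovich-Rubinstein duality for the L^1 transport cost W_1.
\[
W_1(\mu_0,\mu_1)
= \sup \Big\{ \int_X u \, d\mu_0 - \int_X u \, d\mu_1 \; ; \;
u : (X,\mathsf{d}) \to \mathbb{R} \ \text{is $1$-Lipschitz} \Big\} .
\]
```

A maximizer $u$ of this problem is the kind of 1-Lipschitz function with respect to which transport-rays are considered below.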
Definition. $R$ is called a transport-ray for $u$ if it is the image of a closed geodesic $\gamma$ parametrized by arclength (and of positive length), so that the function $u \circ \gamma$ is affine with slope 1, and so that $R$ is maximal with respect to inclusion.
The three main novelties in the above definitions are that we do not require $\{X_\alpha\}$ to be disjoint (as in the non-branching framework); the requirement that $\text{supp}(\mathfrak{m}_\alpha) = X_\alpha$ as opposed to simply $\text{supp}(\mathfrak{m}_\alpha) \subset X_\alpha$; and the maximality in the definition of transport ray. These are crucial if we wish to use $\mathsf{CD}^1_u(K,N)$ as a starting assumption, as opposed to an end conclusion.
We will be particularly interested in a certain distinguished sub-family of 1-Lipschitz functions. Given a continuous function $f : (X,\mathsf{d}) \to \mathbb{R}$ so that $\{f = 0\} \neq \emptyset$, the function:
$$d_f : X \to \mathbb{R} , \quad d_f(x) := \text{dist}(x,\{f = 0\}) \, \text{sgn}(f(x)) ,$$
is called the signed-distance function (from the zero-level set of $f$). When $(X,\mathsf{d})$ is a length space, it is easy to check that $d_f$ is 1-Lipschitz.
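The 1-Lipschitz claim can be checked along the following lines. This is a sketch under the assumption that the signed-distance function is $d_f(x) = \text{dist}(x,\{f=0\})\,\text{sgn}(f(x))$; the shorthand $Z$ for the zero-level set is introduced here for illustration.

```latex
% Assumed notation: Z := \{ f = 0 \}, \quad d_f(x) := \mathrm{dist}(x,Z) \, \mathrm{sgn}(f(x)).
\begin{align*}
&\textbf{Case 1: } f(x) f(y) \geq 0 . \\
&\qquad |d_f(x) - d_f(y)| = |\mathrm{dist}(x,Z) - \mathrm{dist}(y,Z)| \leq \mathsf{d}(x,y) . \\
&\textbf{Case 2: } f(x) < 0 < f(y) . \\
&\qquad \text{By continuity of } f, \text{ every continuous curve } \sigma
  \text{ from } x \text{ to } y \text{ passes through some } z \in Z, \text{ so} \\
&\qquad \ell(\sigma) \geq \mathsf{d}(x,z) + \mathsf{d}(z,y)
  \geq \mathrm{dist}(x,Z) + \mathrm{dist}(y,Z) = |d_f(x) - d_f(y)| . \\
&\qquad \text{Taking the infimum over } \sigma \text{ in a length space yields }
  \mathsf{d}(x,y) \geq |d_f(x) - d_f(y)| .
\end{align*}
```

The length-space assumption enters only in Case 2: without it, two points on opposite sides of $Z$ could be close in $\mathsf{d}$ without any short curve joining them.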
Note that we do not a-priori require $(X,\mathsf{d})$ to be a geodesic or length space, but these turn out to be consequences of the definition. We show in Section 8 that even without any additional non-branching assumptions, $\mathsf{CD}^1(K,N)$ always implies (a strong form of) the $\mathsf{MCP}(K,N)$ condition. Furthermore, for essentially non-branching $\mathsf{CD}^1(K,N)$ spaces, we show that the transport-rays $\{X_\alpha\}$ may be chosen to have disjoint interiors and that the disintegration (1.6) is essentially unique. This already makes a connection to the structure of $W_2$-geodesics, and provides us with a starting point for proving the following theorem, which is the main result of this work:
Main Theorem 1.1. Let $(X,\mathsf{d},\mathfrak{m})$ be an essentially non-branching m.m.s. with $\mathfrak{m}(X) < \infty$, and let $K \in \mathbb{R}$ and $N \in (1,\infty)$. Then the following statements are equivalent:
(1) $(X,\mathsf{d},\mathfrak{m})$ verifies $\mathsf{CD}(K,N)$.
If in addition $(\text{supp}(\mathfrak{m}),\mathsf{d})$ is a length-space, the above statements are equivalent to:
To this list one can also add the entropic Curvature-Dimension condition $\mathsf{CD}^e(K,N)$ of Erbar-Kuwada-Sturm [35], which is known to be equivalent to $\mathsf{CD}^*(K,N)$ for essentially non-branching spaces. In other words, all synthetic definitions of Curvature-Dimension are equivalent for essentially non-branching m.m.s.'s, and in particular, the local-to-global property holds for such spaces (recall that this is known to be false on m.m.s.'s where branching is allowed by [66]). The equivalence with $\mathsf{CD}_{loc}(K,N)$ is clearly false without some global assumption ultimately ensuring that $(\text{supp}(\mathfrak{m}),\mathsf{d})$ is a geodesic-space, see Remark 13.4. As an interesting by-product, we see that $\mathsf{CD}^1_{Lip}(K,N)$ and $\mathsf{CD}^1(K,N)$ are equivalent on essentially non-branching spaces, a fact which is not entirely clear even in the smooth Riemannian setting (cf. [47]). It would also be interesting to study the $\mathsf{CD}^1(K,N)$ condition in its own right, when no non-branching assumptions are present; we leave this for another time.
As already mentioned, and being slightly imprecise (see Section 13 for precise statements), the implications CD(K, N ) ⇒ CD * (K, N ) ⇒ CD loc (K, N ) follow from the work of Bacher and Sturm [14], and the implication CD loc (K, N ) ⇒ CD 1 Lip (K, N ) follows by adapting to the present framework what was already proved by Cavalletti and Mondino in [27] (after taking care of the important maximality requirement of transport-rays, see Theorem 7.10). So almost all of our effort goes into proving that CD 1 (K, N ) ⇒ CD(K, N ).
For a smooth weighted Riemannian manifold $(M,\mathsf{d},\mathfrak{m})$, it is an easy exercise to show the latter implication using the Bakry-Émery differential characterization of $\mathsf{CD}(K,N)$: simply use an appropriate umbilic hypersurface $H$ passing through a given point $p \in M$ and perpendicular to a given direction $\xi \in T_p M$, and apply the $\mathsf{CD}^1(K,N)$ definition to the distance function from $H$. Of course, this provides no insight towards how to proceed in the m.m.s. setting, so it is natural to try and obtain an alternative synthetic proof, still in the smooth setting. While this is possible, it already poses a much greater challenge, which in some sense provided the required insight leading to the strategy we ultimately employ in this work.
In fact, the density $h_s(t) = h^{\varphi_s(\gamma_s)}_{\gamma_s}(t)$ is obtained from the $\mathsf{CD}^1_u(K,N)$ condition for the signed-distance function $u = d_s = d_{\varphi_s - \varphi_s(\gamma_s)}$ from the level set $\{\varphi_s = \varphi_s(\gamma_s)\}$ (after applying time re-parametrization and scaling), as the geodesic $\gamma$ is formally perpendicular to the latter level set, and may be rigorously shown to be a subset of a transport-ray of $d_s$. The proof is based on the following rough idea. If $G \subset G_\varphi$ is an appropriate subset of Kantorovich geodesics of positive length with $\nu(G) = 1$, we consider $G_{a_s} = G_{a_s,s} = \{\gamma \in G \,;\, \varphi_s(\gamma_s) = a_s\}$ and the (formally co-dimension one) set $e_t(G_{a_s})$. We then compare between two conditional measures $\mathfrak{m}^{a_s}_t$ and $\mathfrak{m}^t_{a_s}$ concentrated on $e_t(G_{a_s})$: the first is obtained by disintegrating $\mathfrak{m}\llcorner_{e_{(0,1)}(G_{a_s})}$ on the partition $\{e_t(G_{a_s})\}_{t \in (0,1)}$, and its variation in $t$ is precisely governed by the density $h_s(t)$ supported on the $L^1$-Optimal-Transport ray of $d_s$; the second is obtained by disintegrating $\mathfrak{m}\llcorner_{e_t(G)}$ on the partition $\{e_t(G_{a_s})\}_{a_s \in \mathbb{R}}$, and coincides up to the density $\rho_t$ with the $W_2$-geodesic $(e_t)_\sharp(\nu\llcorner_{G_{a_s}})$. It turns out that $\mathfrak{m}^{a_s}_t$ and $\mathfrak{m}^t_{a_s}$ are mutually absolutely continuous, with Radon-Nikodym derivative $\partial_\tau|_{\tau=t}\Phi^\tau_s$ on $e_t(G_{a_s})$; this may be formally understood by writing $e_t(G_{a_s}) = e_t(G) \cap \{\Phi^t_s = a_s\}$ and comparing the variations of $t \mapsto \{\Phi^t_s = a_s\}$ and $a_s \mapsto \{\Phi^t_s = a_s\}$. Combining the above properties, we are able to conclude (1.7).
Part II of this work is mostly dedicated to introducing the $\mathsf{CD}^1(K,N)$ condition and rigorously establishing the change-of-variables formula (1.7). To this end, we require various temporal regularity properties shared by all (essentially non-branching) $\mathsf{MCP}(K,N)$ spaces, as well as of the function $\Phi^t_s$. Note that we refrain from making any assumptions on (the challenging) spatial regularity of $\Phi^t_s$ when $t \neq s$ (or equivalently, of the length-map $\ell_t$ defined below), so we are precluded from invoking the coarea formula in our derivation. Also note that even the assertions that $\{e_t(G_{a_s})\}_{t \in (0,1)}$ and $\{e_t(G_{a_s})\}_{a_s \in \mathbb{R}}$ are indeed partitions and that $\partial_\tau|_{\tau=t}\Phi^\tau_s(\gamma_t) > 0$ seem to be new and non-trivial.

Third order information on intermediate-time Kantorovich potentials
To obtain disentanglement of the "Jacobian" $t \mapsto 1/\rho_t(\gamma_t)$ into its orthogonal and tangential components, we need to understand the first-order variation of the change-of-variables formula (1.7) at $t = s$, i.e. the second-order variation of $t \mapsto \Phi^t_s$ at $t = s$, which amounts to a third-order variation of $t \mapsto \varphi_t$. Forgetting for a moment the fact that we do not a-priori have enough regularity to justify taking three derivatives (this will be taken care of in our third ingredient, the rigidity of the change-of-variables formula), we conclude that an in-depth investigation of intermediate-time Kantorovich potentials $\{\varphi_t\}$ is required. This constitutes Part I of this work, where we develop a first, second, and finally third order temporal theory of intermediate Kantorovich potentials in a purely metric setting $(X,\mathsf{d})$, without specifying any reference measure $\mathfrak{m}$. This part, which may be read independently of the other components of this work, is presented first (in Sections 2-5), since its results are constantly used throughout the rest of this work. Our only assumptions in Part I are that $(X,\mathsf{d})$ is a proper geodesic space, without invoking any non-branching assumptions.
Our starting point here is the pioneering work by Ambrosio-Gigli-Savaré [5], [6, Section 3], who already investigated in a very general (extended) metric space setting the first and second order temporal behaviour of the Hopf-Lax semi-group $Q_t$ applied to a general function $f : X \to \mathbb{R} \cup \{+\infty\}$. However, the essential point we observe in our treatment is that when $f$ is itself a Kantorovich potential $\varphi$, characterized by the property that $\varphi = Q_1(-\varphi^c)$ and $\varphi^c = Q_1(-\varphi)$, much more may be said regarding the behaviour of $t \mapsto \varphi_t := -Q_t(-\varphi)$, even in first and second order. This is due to the fact that if we reverse time and define $\bar{\varphi}_t := Q_{1-t}(-\varphi^c)$, then $\varphi_t \leq \bar{\varphi}_t$ with equality (when $t \in (0,1)$) precisely on $e_t(G_\varphi)$, yielding a two-sided control over $\varphi_t$ on $e_t(G_\varphi)$. So for instance, two apparently novel observations which we constantly use throughout this work are that for all $t \in (0,1)$, $\ell_t^2/2 := \partial_t \varphi_t$ exists on $e_t(G_\varphi)$, and that Kantorovich geodesics $\gamma \in G_\varphi$ having a given $x \in X$ as their $t$-midpoint all have the same length $\ell_t(x)$. In Section 3, we establish Lipschitz regularity properties of $\mathring{G}_\varphi(x) \ni t \mapsto \ell_t^2(x)$ for all $x \in X$, as well as upper and lower derivative estimates, both pointwise and a.e. on $\mathring{G}_\varphi(x) := \{ t \in (0,1) \,;\, e_t^{-1}(x) \cap G_\varphi \neq \emptyset \}$. These are then transferred in Section 4 to corresponding estimates for the function $\Phi^t_s$. Part I culminates in Section 5, whose goal is to prove a quantitative version of the following (somewhat oversimplified) statement, which crucially provides second order information on $\ell_t$, or equivalently, third order information on $\varphi_t$, along $\gamma_t$:
if $\frac{1}{\ell^2(\gamma)} \partial_\tau|_{\tau=t} \frac{\ell_\tau^2}{2}(\gamma_t)$ exists a.e. in $t \in (0,1)$ and coincides with an absolutely continuous function $z$, then $z'(t) \geq z(t)^2$ for a.e. $t \in (0,1)$. (1.8)
Equivalently, this amounts to the statement that:
It turns out that $L(t)$ precisely corresponds to the tangential component of $1/\rho_t(\gamma_t)$, and its concavity ensures that it is synthetically controlled by the linear term appearing in the definition of $\tau^{(t)}_{K,N}(\theta)$ in (1.2). This constitutes the second main new ingredient of this work. The novel observation that it is possible to extract in a general metric setting third order information from the Hopf-Lax semi-group, which formally solves the first-order Hamilton-Jacobi equation, is in our opinion one of the most surprising parts of this work. Even in the smooth Riemannian setting, we were not able to find a synthetic proof which is easier than the one in the general metric setting; a formal differential proof of (1.8) assuming both temporal and (more challenging) spatial higher-order regularity of $\varphi_t$ is provided in Subsection 5.1, but the latter seems to wrongly suggest that it would not be possible to extend (1.8) beyond a Hilbertian setting. The surprising proof in the general metric setting (Theorem 5.2) is based on a careful comparison of second order expansions of $\varepsilon \mapsto \varphi_{\tau+\varepsilon}(\gamma_\tau)$ at $\tau = t, s$, and subtle differences between the usual second derivative and the second Peano derivative (see Section 2) come into play.
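The link between the differential inequality $z' \geq z^2$ and the concavity of $L$ can be seen from a short computation. The sketch below assumes that $L$ is normalized as the exponential of minus a primitive of $z$; the base point $t_*$ is a hypothetical choice made here for illustration and only changes $L$ by a positive constant factor.

```latex
% Assumption: L(t) := \exp( - \int_{t_*}^t z(r) \, dr ) for some fixed t_* \in (0,1).
\begin{align*}
L(t) &:= \exp\Big( - \int_{t_*}^{t} z(r) \, dr \Big) , \\
L'(t) &= - z(t) \, L(t) , \\
L''(t) &= \big( z(t)^2 - z'(t) \big) \, L(t) \qquad \text{for a.e. } t \in (0,1) .
\end{align*}
% Since L > 0 and L' = -zL is locally absolutely continuous,
% L is concave on (0,1) iff L'' \leq 0 a.e., i.e. iff z' \geq z^2 a.e.
% Equality case: z(t) = 1/(C - t) with C \notin [0,1] (or z \equiv 0) satisfies z' = z^2,
% and then L(t) = (C - t)/(C - t_*) is affine.
```

The equality case corresponds to $L$ being affine in $t$, in line with the role of the linear term $t$ in the definition of $\tau^{(t)}_{K,N}(\theta)$.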

Rigidity of Change-of-Variables Formula
The definition of $\Phi^t_s$ may be naturally extended to an appropriate domain beyond $e_t(G_\varphi)$ as follows, allowing us to easily (formally) calculate its partial derivative: Evaluating at $x = \gamma_t$ and plugging this into the change-of-variables formula (1.7), it follows that for $\nu$-a.e. geodesic $\gamma$: for a.e. $t,s \in (0,1)$. (1.10) Thanks to the idea of considering together both initial-point $s$ and end-point $t$, the latter formula takes on a very rigid structure: note that on the left-hand-side the $s$ and $t$ variables are separated, and the denominator on the right-hand-side depends linearly on $s$. Consequently, we can easily bootstrap the a-priori available regularity in $s$ and $t$ of all terms involved. For all $s \in (0,1)$, $h_s$ is locally Lipschitz being a $\mathsf{CD}(\ell(\gamma)^2 K, N)$ density, and at this point we already know that $t \mapsto \rho_t(\gamma_t)$ is also locally Lipschitz (from the $\mathsf{MCP}(K,N)$ condition, although we can also deduce this from (1.10) directly). It easily follows that $\frac{1}{\ell^2(\gamma)} \partial_\tau|_{\tau=t} \frac{\ell_\tau^2}{2}(\gamma_t)$ must coincide for a.e. $t \in (0,1)$ with a locally-Lipschitz function $z(t)$, so that (1.8) applies. Similarly, by redefining $\{h_s\}$ for $s$ in a null subset of $(0,1)$, we can guarantee that $(0,1) \ni s \mapsto h_s(t)$ is locally Lipschitz (for any given $t \in (0,1)$), even though there is a-priori no relation between the different densities $\{h_s\}_{s \in (0,1)}$.
At this point, if ρ t (γ t ) and z(t) were known to be C 2 smooth, and equality were to hold in (1.10) for all s, t ∈ (0, 1), we could then define: and as ∂ t | t=s log(1 + (t − s)z(t)) = z(s), it would follow, recalling the definition (1.9) of L, that: Using the fact that all {h s } s∈(0,1) are CD(ℓ(γ) 2 K, N ) densities to control ∂ 2 t | t=r log h r (t), and surprisingly, also the concavity of L (again!) to control the mixed partial derivatives ∂ s ∂ t | t=s=r log h s (t), a formal computation described in Subsection 12.2 then verifies that Y is a CD(ℓ(γ) 2 K, N ) density itself. A rigorous justification without all of the above non-realistic assumptions turns out to be extremely tedious, due to the difficulty in applying an approximation argument while preserving the rigidity of the equation -this is worked out in Section 12 and the Appendix.
The definition (1.11) of Y finally sheds light on the crucial role which the parameter s ∈ (0, 1) plays in our strategy -its role is to vary between the different W 2 -geodesics from which the CD loc (K, N ) information is extracted into the CD 1 ds (K, N ) information on the disintegration into transport-rays from the level set {ϕ s = ϕ s (γ s )}, thereby coming full circle with the observation of (1.4).
We refer to Section 13 for the final details and for additional immediate corollaries of the Main Theorem 1.1 pertaining to RCD(K, N ) and strong CD(K, N ) spaces. We also provide there several concluding remarks and suggestions for further investigation.
Part I. Temporal Theory of Optimal Transport

Preliminaries

Geodesics
where the infimum is over all (continuous) curves σ : I → X connecting x and y, and ℓ(σ) ) denotes the curve's length, where the latter supremum is over all If ℓ(γ) = 0 we will say that γ is a null geodesic. The metric space is called a geodesic space if for all x, y ∈ X there exists a geodesic in X connecting x and y. We denote by Geo(X) the set of all closed directed constant-speed geodesics parametrized on the interval [0, 1]: We regard Geo(X) as a subset of all Lipschitz maps Lip([0, 1], X) endowed with the uniform topology. We will frequently use γ t := γ(t).
The metric space is called proper if every closed ball (of finite radius) is compact. It follows from the metric version of the Hopf-Rinow Theorem (e.g. [22,Theorem 2.5.28]) that for complete length spaces, local compactness is equivalent to properness, and that complete proper length spaces are in fact geodesic.
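For orientation, the notions used in this subsection can be written out in the standard way. The following is a sketch in conventional notation; the partition points $t_i$ and the index $k$ are our own labels and need not match the authors' display formulas.

```latex
% Length of a continuous curve \sigma : I \to X (supremum over finite partitions of I)
\[
  \ell(\sigma) := \sup \sum_{i=1}^{k} \mathsf{d}\big(\sigma(t_{i-1}), \sigma(t_i)\big),
  \qquad k \in \mathbb{N}, \quad t_0 \le t_1 \le \dots \le t_k \ \text{in } I .
\]
% Length space: the distance is recovered from the lengths of connecting curves
\[
  \mathsf{d}(x,y) = \inf_{\sigma} \ell(\sigma) .
\]
% Constant-speed geodesics parametrized on [0,1]
\[
  \mathrm{Geo}(X) := \big\{ \gamma : [0,1] \to X \;;\;
  \mathsf{d}(\gamma_s,\gamma_t) = |s-t| \, \mathsf{d}(\gamma_0,\gamma_1) \ \ \forall s,t \in [0,1] \big\},
  \qquad \ell(\gamma) = \mathsf{d}(\gamma_0,\gamma_1) .
\]
```

In this form a null geodesic is simply a constant curve, since $\ell(\gamma) = 0$ forces $\gamma_0 = \gamma_1$.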

Derivatives
For a function g : A → R on a subset A ⊂ R, denote its upper and lower derivatives at a point t 0 ∈ A which is an accumulation point of A by: We will say that g is differentiable at This is a slightly more general definition of differentiability than the traditional one which requires that t 0 be an interior point of A.
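A minimal sketch of the upper and lower derivatives just described, written with the usual limsup and liminf over points of $A$ (the overline and underline notation is an assumption on our part), together with a small example in which $t_0$ is an accumulation point but not an interior point:

```latex
\[
  \overline{\frac{d}{dt}}\, g(t_0) := \limsup_{A \ni t \to t_0} \frac{g(t) - g(t_0)}{t - t_0},
  \qquad
  \underline{\frac{d}{dt}}\, g(t_0) := \liminf_{A \ni t \to t_0} \frac{g(t) - g(t_0)}{t - t_0} .
\]
% g is differentiable at t_0 when both quantities coincide and are finite.
%
% Example: A = \{0\} \cup \{1/n : n \ge 1\}, g(t) = t^2, t_0 = 0.
\[
  \frac{g(1/n) - g(0)}{1/n - 0} = \frac{1}{n} \xrightarrow[n \to \infty]{} 0 ,
\]
% so g is differentiable at 0 in the above sense with derivative 0,
% although 0 is not an interior point of A.
```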
Remark 2.1. Note that there are only a countable number of isolated points in A, so a.e. point in A is an accumulation point. In addition, it is clear that if t 0 ∈ B ⊂ A is an accumulation point of B and g is differentiable at t 0 , then g| B is also differentiable at t 0 with the same derivative. In particular, if g is a.e. differentiable on A then g| B is also a.e. differentiable on B and the derivatives coincide.
Remark 2.2. Denote by A 1 ⊂ A the subset of density one points of A (which are in particular accumulation points of A). By Lebesgue's Density Theorem L 1 (A \ A 1 ) = 0, where we denote by L 1 the Lebesgue measure on R throughout this work. If g : A → R is locally Lipschitz, consider any locally Lipschitz extension ĝ : R → R of g. Then it is easy to check that for t 0 ∈ A 1 , g is differentiable in the above sense at t 0 if and only if ĝ is differentiable at t 0 in the usual sense, in which case the derivatives coincide. In particular, as ĝ is a.e. differentiable on R, it follows that g is a.e. differentiable on A 1 and hence on A, and it holds that (d/dt) g = (d/dt) ĝ a.e. on A.
Let f : I → R denote a convex function on an open interval I ⊂ R. It is well-known that the left and right derivatives f ′,− and f ′,+ exist at every point in I and that f is locally Lipschitz there; in particular, f is differentiable at a given point iff the left and right derivatives coincide there. Denoting by D ⊂ I the differentiability points of f in I, it is also well-known that I \ D is at most countable. Consequently, any point in D is an accumulation point, and we may consider the differentiability in D of f ′ : D → R as defined above. We will require the following elementary one-dimensional version (probably due to Jessen) of the well-known Aleksandrov's theorem about twice differentiability a.e. of convex functions on R n (see [45,Theorem 5.2.1] or [20,Section 2.6], and [70, p. 31] for historical comments). Clearly, all of these results extend to locally semi-convex and semi-concave functions as well; recall that a function f : (1) f is differentiable at τ 0 , and if D ⊂ I denotes the subset of differentiability points of f in I, then f ′ : D → R is differentiable at τ 0 with: (2) The right derivative f ′,+ : I → R is differentiable at τ 0 with (f ′,+ ) ′ (τ 0 ) = ∆.
(4) f is differentiable at τ 0 and has the following second order expansion there: In this case, f is said to have a second Peano derivative at τ 0 .
We remark that even for a differentiable function f , while the implication (1) ⇒ (4) follows by Taylor's theorem (existence of the second derivative at a point implies existence of the second Peano derivative there), the converse implication is in general false (see e.g. [60] for a nice discussion). For a locally semi-convex or semi-concave function f , we will say that f is twice differentiable at τ 0 if any (all) of the above equivalent conditions hold for some ∆ ∈ R, and write Finally, we will require the following slightly more refined notation.

Definition.
Given an open interval I ⊂ R and a function f : I → R which is differentiable at τ 0 ∈ I, we define its upper and lower second Peano derivatives at τ 0 , denoted $\overline{\mathcal{P}}_2 f(\tau_0)$ and $\underline{\mathcal{P}}_2 f(\tau_0)$ respectively, by: Clearly f has a second Peano derivative at τ 0 iff $\overline{\mathcal{P}}_2 f(\tau_0) = \underline{\mathcal{P}}_2 f(\tau_0) < \infty$.
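To make the distinction between the second Peano derivative and the usual second derivative concrete, here is a hedged sketch. The normalization of $h$ below is an assumption, chosen so that a $C^2$ function has Peano derivative $f''(\tau_0)$; the example is the classical one and is not taken from the text.

```latex
% Assumed normalization of the second-order remainder:
\[
  h(\varepsilon) := 2\big( f(\tau_0 + \varepsilon) - f(\tau_0) - \varepsilon f'(\tau_0) \big),
  \qquad
  \overline{\mathcal{P}}_2 f(\tau_0) := \limsup_{\varepsilon \to 0} \frac{h(\varepsilon)}{\varepsilon^2},
  \quad
  \underline{\mathcal{P}}_2 f(\tau_0) := \liminf_{\varepsilon \to 0} \frac{h(\varepsilon)}{\varepsilon^2} .
\]
% Classical example: f(x) = x^3 \sin(1/x) for x \neq 0, f(0) = 0, at \tau_0 = 0.
\[
  f'(0) = 0, \qquad
  \frac{h(\varepsilon)}{\varepsilon^2} = 2 \varepsilon \sin(1/\varepsilon) \to 0
  \quad \Longrightarrow \quad
  \overline{\mathcal{P}}_2 f(0) = \underline{\mathcal{P}}_2 f(0) = 0 ,
\]
% whereas the difference quotient of f' has no limit:
\[
  f'(x) = 3x^2 \sin(1/x) - x \cos(1/x), \qquad
  \frac{f'(x) - f'(0)}{x} = 3x \sin(1/x) - \cos(1/x) .
\]
```

This $f$ is neither locally semi-convex nor locally semi-concave near $0$ (its second derivative away from $0$ is unbounded in both directions), so it does not conflict with the equivalence available for semi-convex and semi-concave functions.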
The following is a type of Stolz-Cesàro lemma: Given an open interval I ⊂ R and a locally absolutely continuous function f : I → R which is differentiable at τ 0 ∈ I, we have: Proof. By local absolute continuity, f is differentiable a.e. in I and we have for small enough |ε|: and hence: Taking appropriate subsequential limits as ε → 0, the asserted inequalities readily follow.
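The mechanism behind the proof above can be spelled out as follows. This is a sketch under the assumption that the lemma compares the second Peano derivatives with the upper and lower derivatives of $f'$ at $\tau_0$; the symbols $q$, $m$ and $M$ are our own shorthand.

```latex
% Difference quotient of f' at \tau_0, defined for a.e. \delta:
\[
  q(\delta) := \frac{f'(\tau_0 + \delta) - f'(\tau_0)}{\delta} .
\]
% Local absolute continuity gives, for small |\varepsilon|:
\[
  f(\tau_0 + \varepsilon) - f(\tau_0) - \varepsilon f'(\tau_0)
  = \int_0^{\varepsilon} \big( f'(\tau_0 + \delta) - f'(\tau_0) \big) \, d\delta
  = \int_0^{\varepsilon} \delta \, q(\delta) \, d\delta .
\]
% With m(\varepsilon) := \operatorname{ess\,inf}_{0 < |\delta| \le |\varepsilon|} q(\delta)
% and M(\varepsilon) := \operatorname{ess\,sup}_{0 < |\delta| \le |\varepsilon|} q(\delta),
% and since \int_0^{\varepsilon} \delta \, d\delta = \varepsilon^2/2 for either sign of \varepsilon:
\[
  m(\varepsilon) \, \frac{\varepsilon^2}{2}
  \;\le\; f(\tau_0 + \varepsilon) - f(\tau_0) - \varepsilon f'(\tau_0)
  \;\le\; M(\varepsilon) \, \frac{\varepsilon^2}{2} .
\]
% Dividing by \varepsilon^2/2 and letting \varepsilon \to 0:
\[
  \liminf_{\delta \to 0} q(\delta)
  \;\le\; \underline{\mathcal{P}}_2 f(\tau_0)
  \;\le\; \overline{\mathcal{P}}_2 f(\tau_0)
  \;\le\; \limsup_{\delta \to 0} q(\delta) .
\]
```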

Temporal Theory of Intermediate-Time Kantorovich Potentials. First and Second Order
In the next sections, we will only consider the quadratic cost function c = d 2 /2 on X × X.
Definition (c-Concavity, Kantorovich Potential). The c-transform of a function ψ : X → R ∪ {±∞} is defined as the following (upper semi-continuous) function: In the context of optimal-transport with respect to the quadratic cost c, a c-concave function ϕ : X → R ∪ {−∞} which is not identically equal to −∞ is also known as a Kantorovich potential, and this is how we will refer to such functions in this work. In that case, ϕ c : X → R ∪ {−∞} is also a Kantorovich potential, called the dual or conjugate potential.
There is a natural way to interpolate between a Kantorovich potential and its dual by means of the Hopf-Lax semi-group, resulting in intermediate-time Kantorovich potentials {ϕ t } t∈(0,1) . The goal of the next three sections is to provide first, second and third order information on the time-behavior t → ϕ t (x) at intermediate times t ∈ (0, 1). In these sections, we only assume that (X, d) is a proper geodesic metric space.
In this section, we focus on first and second order information. The main new result is Theorem 3.11.

Hopf-Lax semi-group
We begin with several well-known definitions which we slightly modify and specialize to our setting.
Definition (Hopf-Lax Transform). Given f : X → R ∪ {±∞} which is not identically +∞ and t > 0, define the Hopf-Lax transform Q t f : X → R ∪ {−∞} by: Clearly is finite for all x ∈ X (as our metric d is finite). Consequently, we denote: Remark 3.1. It is also possible to extend the definition of Q t f to negative times t < 0 by setting: This is called the backwards Hopf-Lax semi-group on (−∞, 0]. However, (R, is in general not an abelian group homomorphism, not even for t ∈ [0, 1] when applied to a Kantorovich potential ϕ (characterized by Q −1 • Q 1 (−ϕ) = −ϕ) -see Subsection 3.3. This will be a rather significant nuisance we will need to cope with in this work.
is upper semi-continuous as the infimum of continuous functions in (t, x), and by definition [0, It may also be shown (see [5,Lemma 3 is continuous (and in fact locally Lipschitz, see Theorem 3.4 below). Together with the left-continuity, we deduce that for every Note that by definition f c = Q 1 (−f ), and that a Kantorovich pair of conjugate potentials ϕ, ϕ c : X → R ∪ {−∞} is characterized by not being identically equal to −∞ and satisfying: In particular, t * (ϕ), t * (ϕ c ) ≥ 1, and we a-posteriori deduce that ϕ, ϕ c are both finite on the entire space X (we have used above the fact that the metric d is finite, which differs from other more general treatments).
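As a point of reference, the objects of this subsection can be sketched in the usual form for the quadratic cost. The formula for $\varphi_t$ is our reading of the interpolation convention, inferred from the relation $f^c = Q_1(-f)$ above and from the identity $(\varphi^c)_1 = -(\varphi^c)^c$ used later; the quadratic example on $\mathbb{R}$ is an illustration of our own.

```latex
% Hopf-Lax transform (assumed standard form) and the c-transform for c = d^2/2:
\[
  Q_t f(x) := \inf_{y \in X} \Big\{ \frac{\mathsf{d}(x,y)^2}{2t} + f(y) \Big\} \quad (t > 0),
  \qquad
  f^c(x) = \inf_{y \in X} \Big\{ \frac{\mathsf{d}(x,y)^2}{2} - f(y) \Big\} = Q_1(-f)(x) .
\]
% Interpolation between a Kantorovich potential and its dual (with the convention Q_0 f = f):
\[
  \varphi_t := -Q_t(-\varphi), \qquad \varphi_0 = \varphi, \qquad \varphi_1 = -\varphi^c .
\]
% Worked example on X = \mathbb{R}: f(y) = a y^2 / 2 with a > 0.
% The minimizer solves (y - x)/t + a y = 0, i.e. y = x / (1 + a t), and
\[
  Q_t f(x) = \frac{a \, x^2}{2 (1 + a t)} ,
  \qquad
  \partial_t Q_t f(x) + \frac{1}{2} \big| \partial_x Q_t f(x) \big|^2
  = -\frac{a^2 x^2}{2 (1 + a t)^2} + \frac{a^2 x^2}{2 (1 + a t)^2} = 0 .
\]
```

The last identity is the Hamilton-Jacobi equation formally solved by the Hopf-Lax semi-group, checked here by direct differentiation.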

Distance functions
The following important definition was given by Ambrosio-Gigli-Savaré [5,6]: where the supremum and infimum above run over the set of minimizing sequences {y n } in the definition of the Hopf-Lax transform (3.1). A simple diagonal argument shows that the (outer) supremum and infimum above are in fact attained.
(3) For every x ∈ X, both functions (0, t * (f )) ∋ t → D ± f (x, t) are monotone non-decreasing and coincide except where they have (at most countably many) jump discontinuities.
for all t ∈ (0, t * (f )), where ∂ − t and ∂ + t denote the left and right partial derivatives, respectively. In particular, the map (0, t * (f )) ∋ t → Q t f (x) is locally Lipschitz and locally semi-concave, and differentiable at t It may be instructive to recall the proof of property (3) above, which is related to some ensuing properties, so for completeness, we present it below. For simplicity, we restrict to the case of interest for us, and first record: Lemma 3.5. Given a proper metric space X, a lower semi-continuous f : X → R, x ∈ X and t ∈ (0, t * (f )), there exist y ± t ∈ X so that: Recall that −ϕ is indeed lower semi-continuous for any Kantorovich potential ϕ.
Proof of Lemma 3.5. Let {y ±,n t } denote a minimizing sequence so that: By property (1) we know that D ± f (x, t) < R < ∞, and the properness implies that the closed geodesic ball B R (x) is compact. Consequently {y ±,n t } has a converging subsequence to y ± t , and the lower semi-continuity of f implies that: as asserted.
Proof of (3) for proper X and lower semi-continuous f . The assertion will follow immediately after establishing: and since a monotone function can only have a countable number of jump discontinuities. By Lemma 3.5, there exist y + s and y − t so that: and: It follows that: Summing these two inequalities and rearranging terms, one deduces: as required.
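The summation step in the proof above is short enough to write out. The following sketch assumes the Hopf-Lax transform has the usual form $Q_t f(x) = \inf_y \{ \mathsf{d}(x,y)^2/(2t) + f(y) \}$ and that $y_s^+$ and $y_t^-$ are minimizers at times $s < t$ realizing $D^+_f(x,s)$ and $D^-_f(x,t)$, respectively; the displayed inequalities are a reconstruction rather than a quotation.

```latex
% Minimality of y_s^+ at time s and of y_t^- at time t:
\[
  \frac{\mathsf{d}(x,y_s^+)^2}{2s} + f(y_s^+) \le \frac{\mathsf{d}(x,y_t^-)^2}{2s} + f(y_t^-),
  \qquad
  \frac{\mathsf{d}(x,y_t^-)^2}{2t} + f(y_t^-) \le \frac{\mathsf{d}(x,y_s^+)^2}{2t} + f(y_s^+) .
\]
% Summing, the values of f cancel:
\[
  \Big( \frac{1}{2s} - \frac{1}{2t} \Big)
  \Big( \mathsf{d}(x,y_s^+)^2 - \mathsf{d}(x,y_t^-)^2 \Big) \le 0 .
\]
% Since s < t, the first factor is positive, hence
\[
  D^+_f(x,s) = \mathsf{d}(x,y_s^+) \le \mathsf{d}(x,y_t^-) = D^-_f(x,t) .
\]
```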

Intermediate-time duality and time-reversed potential
this is an inherent group-structure incompatibility of the Hopf-Lax forward and backward semigroups. Note that for f = −ϕ where ϕ is a Kantorovich potential, we do have equality for s = 1, and in fact for all s ∈ [0, 1]. However, for f = Q t (−ϕ), t ∈ (0, 1) and s = 1 − t, we can only assert an inequality above ([76, Theorem 7.36], [3, Corollary 2.23 (i)]): and equality may not hold at every point of X (cf. [76, Remark 7.37]). Nevertheless, in our setting, the subset where equality is attained may be characterized as in the next proposition. We first introduce the following very convenient: Definition (Time-Reversed Interpolating Potential). Given a Kantorovich potential ϕ : X → R, define the time-reversed interpolating Kantorovich potential at time t ∈ [0, 1], φ t : X → R, as: Note that φ 0 = ϕ, φ 1 = −ϕ c , and: Proposition 3.6.
(1) is immediate by c-concavity, and (2) is a reformulation of (3.2), so the only assertion requiring proof is (3). The if direction is well-known (e.g. [76,Theorem 7.36], [3, Corollary 2.23 (ii)]), but the other direction appears to be new. It is based on the following simple lemma, which we will use again later on: Lemma 3.7. Assume that for some x, y, z ∈ X and t ∈ (0, 1): Then x is a t-intermediate point between y and z:
Proof. Using that: our assumption yields: On the other hand, the reverse inequality is always valid by the triangle and Cauchy-Schwarz inequalities: It follows that we must have equality everywhere above, and (3.3) amounts to the equality case in the Cauchy-Schwarz inequality. Consequently, the concatenation γ : [0, 1] → X of any constant speed geodesic γ 1 : [0, t] → X between y and x, with any constant speed geodesic γ 2 : [t, 1] → X between x and z, so that γ(0) = y, γ(t) = x and γ(1) = z, must be a constant speed geodesic itself (by the triangle inequality). Lastly, the equality in (3.4) implies that γ ∈ G ϕ , thereby concluding the proof.
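The inequality behind the proof above can be made explicit. As a sketch, write $a = \mathsf{d}(y,x)$ and $b = \mathsf{d}(x,z)$ (our shorthand) and assume the hypothesis of the lemma amounts to $a^2/t + b^2/(1-t) \le \mathsf{d}(y,z)^2$.

```latex
% Triangle inequality followed by Cauchy-Schwarz with weights t and 1 - t:
\[
  \mathsf{d}(y,z)^2 \le (a + b)^2
  = \Big( \sqrt{t} \cdot \frac{a}{\sqrt{t}} + \sqrt{1-t} \cdot \frac{b}{\sqrt{1-t}} \Big)^2
  \le \big( t + (1 - t) \big) \Big( \frac{a^2}{t} + \frac{b^2}{1-t} \Big)
  = \frac{a^2}{t} + \frac{b^2}{1-t} .
\]
% Under the assumed hypothesis both inequalities are equalities, which forces
\[
  a + b = \mathsf{d}(y,z), \qquad \frac{a}{t} = \frac{b}{1-t}
  \quad \Longrightarrow \quad
  \mathsf{d}(y,x) = t \, \mathsf{d}(y,z), \qquad \mathsf{d}(x,z) = (1-t) \, \mathsf{d}(y,z) .
\]
```

These two relations are exactly the statement that $x$ is a $t$-intermediate point between $y$ and $z$.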
Proof of Proposition 3.6 (3). We begin with the known direction. Let x = γ t with γ ∈ G ϕ . Apply Lemma 3.3 to γ with s = 0 and r = t: and to γ c ∈ G ϕ c with s = 1 and r = 1 − t: where we used that (ϕ c ) 1 = −(ϕ c ) c = −ϕ. Summing these two identities, we obtain: as asserted.
For the other direction, assume that ϕ t (x) = −(ϕ c ) 1−t (x) for some x ∈ X and t ∈ (0, 1). By Lemma 3.5 applied to the lower semi-continuous functions −ϕ and −ϕ c , there exist y t , z t ∈ X so that: Summing the two equations, the assertion follows immediately from Lemma 3.7.
We also record the following immediate corollary of Lemma 3.2: is upper semi-continuous on X × [0, 1) and continuous on X × (0, 1).
Finally, in view of Proposition 3.6 (3), we deduce for free: is a closed subset of X × (0, 1).

Length functions
. Given a Kantorovich potential ϕ : X → R, denote: To provide motivation for these definitions, let us mention that we will shortly see that if x = γ t with γ ∈ G ϕ and t ∈ (0, 1), then: In particular, all ϕ-Kantorovich geodesics having x as their t-mid-point have the same length. These facts seem to not have been previously noted in the literature, and they will be crucially exploited in this work.
Definition. Forl = ℓ,l, introduce the following set: and on it definel t (x) as the common valuel Recalling we begin by translating Theorem 3.4 into the following corollary. We freely use standard properties of semi-convex (semi-concave) functions, like twice a.e. differentiability, non-negativity (non-positivity) of the singular part of the distributional second derivative (see e.g. Lemma A.11), etc...
(4) For every x ∈ X: is differentiable a.e., the singular part of its distributional derivative is non-negative, (0, 1) ∋ t → ϕ t (x) is locally semi-convex, and: , the singular part of its distributional derivative is non-positive, (0, 1) ∋ t →φ t (x) is locally semi-concave, and: Proof. The only point requiring verification is that monotonicity of t → tℓ t (x) in (4a) and t → (1 − t)l t in (4b) implies (3.5) and (3.6), respectively. For instance, using the continuity of Now, if ℓ t (x) = 0 the monotonicity directly implies ∂ t ℓ t (x) ≥ 0 and establishes (3.7), whereas otherwise, (3.7) is equivalent by the chain-rule (and again the continuity which in turn is a consequence of the aforementioned monotonicity. The proof of (3.6) follows identically.
We now arrive at the main new result of this section, which will be constantly and crucially used in this work: Theorem 3.11. Let ϕ : X → R denote a Kantorovich potential.
(1) For all x ∈ e t (G ϕ ) with t ∈ (0, 1), we have: for any γ ∈ G ϕ so that γ t = x. In other words: where the Peano (partial) derivatives are with respect to the t variable.
In particular, for every x ∈ X, we have: , differentiable a.e. there, and having locally bounded lower and upper derivatives onG (3) and (4).
Proof. To see (1), let (x, t) ∈ D(G ϕ ). Equivalently, by Proposition 3.6 (3), we know that ϕ t (x) =φ t (x). In addition, Lemma 3.5 assures the existence of y ± and z ± in X so that: Equating both expressions and applying Lemma 3.7, we deduce that x is the t-midpoint of a geodesic connecting y ± and z ± (for all 4 possibilities), and that: so that all 4 possibilities above coincide. We remark in passing that this already implies in a nonbranching setting that necessarily y + = y − and z + = z − , i.e. the uniqueness of a ϕ-Kantorovich geodesic with t-mid point x. Furthermore, if x = γ t for some γ ∈ G ϕ , then by Lemma 3.3: It follows by definition of D ± −ϕ (x, t) that: which together with (3.9) establishes that ℓ(γ) To see (2), let γ t , γ s ∈ G ϕ be so that γ t t = γ s s = x, for some t, s ∈ (0, 1). Then: for (p, q) = (t, s) and (p, q) = (s, t). Summing these two inequalities, we obtain the well-known c-cyclic monotonicity of the set To evaluate the right-hand-side, we simply pass through x and employ the triangle inequality: Plugging this above and rearranging terms, we obtain: Completing the square by subtracting 2 t(1 − t)s(1 − s)ℓ(γ t )ℓ(γ s ) from both sides, and recalling that ℓ(γ p ) = ℓ p (x) for p = t, s, we readily obtain (3.8). In particular, using t = s, the above argument recovers the last assertion of (1) that ℓ(γ) is the same for all γ ∈ G ϕ so that γ t = x.
To see (3), recall that given x ∈ X, we know by Proposition 3. (1), we know that both maps t →φ t (x) are differentiable at t 0 ∈G ϕ (x), and we see again that , since the derivatives of a function and its majorant must coincide at a mutual point of differentiability where they touch. Moreover, definingh = h,h as: it follows that h ≤h (on (−t 0 , 1−t 0 )). Dividing by ε 2 and taking appropriate subsequential limits, we obviously obtain: Combining these inequalities with those of Lemma 2.4, (3.5) and (3.6), the chain of inequalities in (3) readily follows.
To see (4), , which is locally semi-concave by Corollary 3.10. By Proposition 3.6, we know that there. In particular, this holds at (1) and f ′ (t 0 ) = 0. Note that by Corollary 3.10: it follows that:

It follows that on the open interval
is concave with C ε defined as the constant on the right-hand-side above. Applying Lemma 3.12 below to the translated function f (· + t 0 ) on the interval I δ − t 0 , it follows that: . (1), we obtain: The assertion of (4) now follows by taking appropriate subsequential limits as t → t 0 and using the fact that ε > 0 was arbitrary.
Note that the C-semi-concavity is equivalent to ∂ t | t=0 f ′ (t) ≤ C, while the conclusion is from the opposite direction. It is not hard to verify that the asserted lower bound is in fact best possible.
Proof of Lemma 3.12. Set g = f ′ on D. The C-semi-concavity is equivalent to the statement that g(t) − Ct is non-increasing on D, so that g(t 2 ) ≤ g(t 1 ) + C(t 2 − t 1 ) for all t 1 , t 2 ∈ D with t 1 < t 2 . It follows that necessarily g(t) ≥ −Ct for all t ∈ D ∩ I/2 with t ≥ 0, since: Repeating the same argument for t → f (−t), we see that −g(t) ≥ Ct for all t ∈ D ∩ I/2 with t ≤ 0. This concludes the proof.
In a sense, Theorem 3.11 (2) is the temporal analogue of the spatial 1/2-Hölder regularity proved by Villani in [76,Theorem 8.22]. Formally taking s → t in (3.8), it is easy to check that one obtains (for bothl = ℓ,l) stronger bounds than in Theorem 3.11 (3) and (4): However, we do not know how to rigorously pass from (3.8) to (3.10) or vice versa (by differentiation or integration, respectively), since we cannot exclude the possibility that the (relatively closed in (0, 1)) setG ϕ (x) has isolated points, nor that it is disconnected. Instead, we can obtain the following stronger version of (3.10) which only holds for a.e. t ∈G ϕ (x), but will prove to be very useful later on.
Proof. By Corollary 3.10, for all x ∈ X andl = ℓ,l, t →l 2 t (x) is differentiable a.e. on Dl(x). Consequently, the first and third equalities in The lower and upper bounds in (3.11) then follow from Theorem 3.11 (3) (or as in (3.10), by taking the limit as s → t in Theorem 3.11 (2)).

Null-Geodesics
Definition 3.14 (Null-Geodesics and Null-Geodesic Points). Given a Kantorovich potential ϕ : X → R, denote the subset of null ϕ-Kantorovich geodesics by: Its complement in G ϕ will be denoted by G + ϕ . The subset of X of null ϕ-Kantorovich geodesic points is denoted by: Its complement in X will be denoted by X + .
The following provides a convenient equivalent characterization of X 0 and X + : Lemma 3.15. Given x ∈ X, the following statements are equivalent: In other words, we have the following dichotomy: all ϕ-Kantorovich geodesics having x ∈ X as some interior mid-point have either strictly positive length (iff x ∈ X + ) or zero length (iff x ∈ X 0 ).
Proof. Immediate by (6) and the monotonicity of , together with the fact thatG ϕ (x) is relatively closed in (0, 1) by Corollary 3.9.
Proof. The "only if" direction follows immediately by Lemma 3.15, whereas the "if" direction follows by Corollary 3.17, after recalling that 2 dτ by Corollary 3.10. As usual, the equivalent condition follows by Theorem 3.11.

Temporal Theory of Intermediate-Time Kantorovich Potentials. Time-Propagation
The goal of this section is to introduce and study the following function(s): Definition (Time-Propagated Intermediate Kantorovich Potentials). Given a Kantorovich potential ϕ : X → R and s, t ∈ (0, 1), define the t-propagated s-Kantorovich potential Φ t s on D ℓ (t), and its time-reversed versionΦ t s on Dl(t), by: Observe that for all s, t ∈ (0, 1): for any γ ∈ G ϕ with γ t = x, and consequently Lemma 3.3 yields that ϕ s •e s is single-valued for all such γ and (also recalling Proposition 3.6): We will use the following short-hand notation. Given s ∈ [0, 1] and a s ∈ R, we denote: suppressing the implicit dependence of G as on s. The above argument about why ϕ s • e s • e −1 t is well-defined can be rewritten as: Note that while typically disjoint sets remain disjoint under optimal-transport only under some additional non-branching assumptions, Corollary 4.1 holds true in general.

Monotonicity
Then for any s ∈ (0, 1): Moreover, the left-hand-side is in fact strictly positive iff x ∈ X + .
Proof. We know by Lemma 3.3 and Theorem 3.11 that: by Proposition 3.6 and Theorem 3.11, as x = γ i t i . Now sets := (s ∨ t 1 ) ∧ t 2 . Sinces ∈ {t 1 , t 2 , s}, it follows that: By Corollary 3.10, we know forl = ℓ,l that Dl(x) ∋ t →l 2 t (x) is differentiable a.e., and that the singular part of its distributional derivative is non-negative forl = ℓ and non-positive for ℓ =l. Consequently, we may proceed as follows: where we used that τ − s ≥ 0 whens ≤ τ < t 2 and that τ − s ≤ 0 whens ≥ τ > t 1 . Using (3.5) and (3.6) to bound the above lower and upper derivatives on the sets (having full measure) D ℓ (x) and Dl(x), respectively, we obtain: Summarizing, we have obtained: We now use the inequality ϕs(x) ≤φs(x) in the first line above when 2 s t 2 − 1 ≥ 0, and in the second line when 2 1−s 1−t 1 − 1 ≥ 0, yielding: In particular, the first estimate applies whenever s ≥ 1 2 and the second one whenever s ≤ 1 2 .
, and hence by Corollary 3.18 that x ∈ X 0 ; and vice-versa, if x ∈ X 0 then all geodesics having x as an interior point are null by Lemma 3.15, and hence γ 1 We can already deduce the following important consequence, complementing Corollary 4.1, which holds for any proper geodesic space (X, d), independently of any additional assumptions like various forms of non-branching: . For any s ∈ (0, 1), a s ∈ R, and t 1 , t 2 ∈ (0, 1) with t 1 ≠ t 2 : In other words, for each x ∈ e (0,1) (G as ) ∩ X + , there exists a unique t ∈ (0, 1) so that x ∈ e t (G as ).
, establishing the assertion.

Properties of Φ t s
The following information will be crucially used when deriving the Change-Of-Variables formula in Section 11: For any s ∈ (0, 1), the following properties of Φ t s andΦ t s hold: are continuous on D ℓ and Dl, respectively.
(4) For all t ∈ (0, 1): The first and second statements follow by Lemma 3.2 and Corollary 3.10. As 2 , the points of differentiability of t →Φ t s (x) must coincide with those of t →l 2 t (x) and (4.2) follows immediately, with the only possible exception being the point t = s if s ∈ Dl(x), where direct inspection and continuity of t →l 2 t (x) on Dl(x) verifies (4.2). The local Lipschitzness follows by Theorem 3.11 (2). The monotonicity follows by Lemma 4 The last two assertions follow as in the proof of Lemma 4.2, after noting that: and similarly for ∂ t . Indeed, the estimates (3.5) and (3.6) of Corollary 3.10 yield (4), which already yields half of the inequalities in (5) for all (x, t) ∈ D ℓ ∩ Dl. To get the other half, we must restrict to D(G ϕ ) and use the estimates of Theorem 3.11 (4), thereby concluding the proof.

Temporal Theory of Intermediate-Time Kantorovich Potentials. Third Order
Fix a non-null Kantorovich geodesic γ ∈ G + ϕ , and denote for short ℓ := ℓ(γ) > 0. Recall by the results of Section 3 that for all t ∈ (0, 1), ℓ t (γ t ) =l t (γ t ) = ℓ and that ∂ t ϕ t (x) = ∂ tφt (x) = ℓ 2 t (x)/2 for all x ∈ e t (G ϕ ). Also, recall that given x ∈ X andl = ℓ,l, the function Dl(x) ∋ t →l t (x) is only a.e. differentiable, and even onG ϕ (x) ⊂ D ℓ (x) ∩ Dl(x), we only have at the moment upper and lower bounds on ∂ tl The goal of this section is to rigorously make sense and prove the following formal statement, which provides second order information on ℓ t , or equivalently, third order information on ϕ t , along γ t : Equivalently, this amounts to the statement that the function: is concave in r ∈ (0, 1), since formally:

Formal Argument
We start by providing a formal proof of (5.1) in an infinitesimally Hilbertian setting, which is rigorously justified on a Riemannian manifold if all involved functions are smooth (in time and space).
Recall that the Hopf-Lax semi-group solves the Hamilton-Jacobi equation (e.g. [6]): We evaluate all subsequent functions at x = γ t . Since: and since γ ′ (t) = −∇ϕ t (see e.g. [6] or Lemma 10.3), But taking two time derivatives in (5.2), we know that: and so we conclude that: It remains to apply Cauchy-Schwarz and deduce: as asserted. Note that in a general setting, we can try and interpret z(t) as minus the directional derivative of ℓ 2 t /2 = ∂ t ϕ t in the direction of γ ′ (t) (by taking derivative of the identity 2 ), and thus hope to justify the Cauchy-Schwarz inequality as the statement that the local Lipschitz constant of ∂ t ϕ t is greater than any unit-directional derivative. However, a crucial point in the above argument of identifying z ′ (t) with |∇∂ t ϕ t | 2 was to use the linearity of ⟨·, ·⟩ in both of its arguments, and so ultimately this formal proof is genuinely restricted to an infinitesimally Hilbertian setting. The above discussion seems to suggest that there is no hope of proving (5.1) beyond the Hilbertian setting. Furthermore, it seems that the spatial regularity of ϕ t and ∂ t ϕ t = 1 2 ℓ 2 t should play an essential role in any rigorous justification. Remarkably, we will see that this is not the case on both counts, and that an appropriate interpretation of (5.1) holds true on a general proper geodesic space (X, d).
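The chain of formal identities described above may be sketched as follows. This is a reconstruction under the stated smoothness assumptions, taking the Hamilton-Jacobi equation in the sign convention $\partial_t \varphi_t = \frac{1}{2} |\nabla \varphi_t|^2$ (consistent with $\partial_t \varphi_t = \ell_t^2/2$), using $\gamma'(t) = -\nabla \varphi_t(\gamma_t)$, and writing $z(t) := \partial_\tau|_{\tau=t} \frac{\ell_\tau^2}{2}(\gamma_t) = \partial_t^2 \varphi_t(\gamma_t)$; it should not be read as the authors' exact displays.

```latex
% Two time derivatives of the Hamilton-Jacobi equation \partial_t \varphi_t = |\nabla \varphi_t|^2 / 2:
\[
  \partial_t^2 \varphi_t = \langle \nabla \varphi_t, \nabla \partial_t \varphi_t \rangle,
  \qquad
  \partial_t^3 \varphi_t = |\nabla \partial_t \varphi_t|^2
  + \langle \nabla \varphi_t, \nabla \partial_t^2 \varphi_t \rangle .
\]
% Total derivative of z along the geodesic, using \gamma'(t) = -\nabla \varphi_t:
\[
  z'(t) = \frac{d}{dt} \Big[ \partial_t^2 \varphi_t(\gamma_t) \Big]
  = \partial_t^3 \varphi_t + \langle \nabla \partial_t^2 \varphi_t, \gamma'(t) \rangle
  = \partial_t^3 \varphi_t - \langle \nabla \partial_t^2 \varphi_t, \nabla \varphi_t \rangle
  = |\nabla \partial_t \varphi_t|^2 .
\]
% Cauchy-Schwarz, with |\nabla \varphi_t(\gamma_t)| = |\gamma'(t)| = \ell:
\[
  z(t)^2 = \langle \nabla \varphi_t, \nabla \partial_t \varphi_t \rangle^2
  \le |\nabla \varphi_t|^2 \, |\nabla \partial_t \varphi_t|^2 = \ell^2 \, z'(t)
  \quad \Longrightarrow \quad
  z'(t) \ge \frac{z(t)^2}{\ell^2} .
\]
```

The cancellation in the second display is where bilinearity of the inner product enters, which is the point the surrounding discussion identifies as specific to the Hilbertian setting.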

Notation
Recall that by the results of Section 3, τ → ϕ τ (x) and τ →φ τ (x) are locally semi-convex and semi-concave on (0, 1), respectively, and that ∂ ± . We respectively introducep = p,p by defining at t ∈ (0, 1): where the penultimate equalities in each of the lines above follow from the continuity of Dl In addition, forq = q,q, set:q where the Peano (partial) derivatives are with respect to the t variable. It will be useful to recall that if we defineh = h,h by: By definition,q − (t) =q + (t) =q ∈ R if and only if τ →φ τ (γ t ) has second order Peano derivative at τ = t equal toq, and hence by Lemma 2.3, iffp − (t) =p + (t) =q, or equivalently, iff any of the other equivalent conditions for the second order differentiability of (0, 1) ∋ τ →φ τ (γ t ) at τ = t are satisfied. Moreover, Lemma 2.4 implies: but we will not require this here. We summarize the above discussion in: Corollary 5.1. The following statements are equivalent for a given t ∈ (0, 1): In any of these cases (0, 1) ∋ τ →φ τ (γ t ) is twice differentiable at τ = t, and we have:

Main Inequality
The following inequality and its consequences are the main results of this section.
Theorem 5.2. For all s < t and ε so that s, t, s + ε, t + ε ∈ (0, 1), we have (for both possibilities for ±): Proof. By Lemma 3.5, there exist y ± ε ∈ X so that: By definition, note that: We abbreviate D r := rℓ = d(γ r , γ 0 ), r = s, t. The proof consists of subtracting the above two expressions and applying the triangle inequality: Indeed, we obtain after subtraction, recalling the definition of h, and an application of Lemma 3.3: .
Carefully rearranging terms, we obtain: and the first claim follows. The second claim follows by the duality between ϕ and ϕ c . Indeed, exchange ϕ, γ, ε, s, t with ϕ c , γ c , −ε, 1−t, 1−s, and recall thatφ t = −ϕ c 1−t . A straightforward inspection of the definitions verifies: and: and so the second claim follows from the first one. Alternatively, one may repeat the above argument by subtracting the following two expressions:
Corollary 5.4. For all 0 < s < t < 1 (and both possibilities for ±): and:q It will be convenient to use the above information in the following form: Theorem 5.5. Assume that for a.e. t ∈ (0, 1): in any of the equivalent senses given by Corollary 5.1, and that moreover: Furthermore, assume that the latter joint value coincides a.e. on (0, 1) with some continuous function z c : Then (5.5) holds for all t ∈ (0, 1), and we have: Moreover, we have the following third order information on ϕ t (x) at x = γ t : In particular, for any point t Proof. The assumptions imply by Corollary 5.1 thatq − (t) =q + (t) = z c (t) for a.e. t ∈ (0, 1). It follows that the same is true for every t ∈ (0, 1) by monotonicity ofq ± and the assumption that z c is continuous, yielding (5.8). Furthermore, Corollary 5.1 implies thatp − (t) =p + (t) = z c (t) for bothp = p,p and for all t ∈ (0, 1), and we obtain (5.9) by taking geometric mean of (5.3) and (5.4). The final assertion obviously follows by taking the limit in (5.9) as s → t.
Finally, we obtain the following concise interpretation of the 3rd order information on τ → ϕ τ along γ t , which will play a crucial role in this work: Lemma 5.7. Assume that for some locally absolutely continuous function z ac on (0, 1) we have: Then for any fixed r 0 ∈ (0, 1), the function: is concave on (0, 1).
Proof. Since L ∈ C 1 (0, 1), concavity of L is equivalent to showing that the function: is monotone non-decreasing. But as this function is locally absolutely continuous, this is equivalent to showing that W ′ (r) ≥ 0 for a.e. r ∈ (0, 1). Note that the points of differentiability of W and z ac coincide. At these points (of full Lebesgue measure), we indeed have: where the last inequality follows from Theorem 5.5. This concludes the proof.
We will subsequently show that under synthetic curvature conditions, the above assumption is indeed satisfied for ν-a.e. geodesic γ.
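To see why a differential inequality of the form $z' \ge z^2/\ell^2$ is equivalent to concavity, one can differentiate twice. The sketch below assumes $L$ has the exponential form shown (our reading of the definition) and that $z$ is smooth enough to differentiate; the parameter $c$ in the equality case is an illustrative constant of our own.

```latex
% Assumed form of L, for a fixed r_0 \in (0,1):
\[
  L(r) := \exp\Big( -\frac{1}{\ell^2} \int_{r_0}^{r} z(t) \, dt \Big),
  \qquad
  L'(r) = -\frac{z(r)}{\ell^2} \, L(r),
  \qquad
  L''(r) = \frac{L(r)}{\ell^2} \Big( \frac{z(r)^2}{\ell^2} - z'(r) \Big) .
\]
% Since L > 0:
\[
  L'' \le 0 \quad \Longleftrightarrow \quad z'(r) \ge \frac{z(r)^2}{\ell^2} .
\]
% Equality case: z' = z^2 / \ell^2 is solved by z(t) = \ell^2 / (c - t) with c > 1, and then
\[
  L(r) = \exp\Big( -\int_{r_0}^{r} \frac{dt}{c - t} \Big) = \frac{c - r}{c - r_0} ,
\]
% which is affine in r, i.e. the borderline case of concavity.
```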

Part II. Disintegration Theory of Optimal Transport

Preliminaries
So far we have worked without considering any reference measure over our metric space (X, d).
A triple (X, d, m) is called a metric measure space, m.m.s. for short, if (X, d) is a complete and separable metric space and m is a non-negative Borel measure over X. In this work we will only be concerned with the case that m is a probability measure, that is m(X) = 1, and hence m is automatically a Radon measure (i.e. inner-regular). We refer to [3,5,43,75,76] for background on metric measure spaces in general, and the theory of optimal transport on such spaces in particular.

Geometry of Optimal Transport on Metric Measure Spaces
The space of all Borel probability measures over X will be denoted by P(X). It is naturally equipped with its weak topology, in duality with bounded continuous functions C b (X) over X. The subspace of those measures having finite second moment will be denoted by P 2 (X), and the subspace of P 2 (X) of those measures absolutely continuous with respect to m is denoted by P 2 (X, d, m). The weak topology on P 2 (X) is metrized by the L 2 -Wasserstein distance W 2 , defined as follows for any µ 0 , µ 1 ∈ P(X): where the infimum is taken over all π ∈ P(X × X) having µ 0 and µ 1 as the first and the second marginals, respectively; such candidates π are called transference plans. It is known that the infimum in (6.1) is always attained for any µ 0 , µ 1 ∈ P(X), and the transference plans realizing this minimum are called optimal transference plans between µ 0 and µ 1 . When W 2 (µ 0 , µ 1 ) < ∞, it is known that given an optimal transference plan π between µ 0 and µ 1 , there exists a Kantorovich potential ϕ : X → R (see Section 3), which is associated to π, meaning that: In particular, when µ 0 , µ 1 ∈ P 2 (X), then necessarily W 2 (µ 0 , µ 1 ) < ∞ and the above discussion applies. Moreover, in this case, it is known that for any Kantorovich potential ϕ associated to an optimal transference plan between µ 0 and µ 1 , (6.2) in fact holds for all optimal transference plans π between µ 0 and µ 1 . In addition, in this case a transference plan π is optimal iff it is supported on a d 2 -cyclically monotone set. A set Λ ⊂ X × X is said to be c-cyclically monotone if for any finite set of points with the convention that y N +1 = y 1 .
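For convenience, the three displayed notions referred to in the paragraph above can be sketched in their standard form for the cost $c = \mathsf{d}^2/2$. The normalization of $W_2$ (with or without a factor $\frac{1}{2}$) is an assumption; the two-point example on $\mathbb{R}$ is our own illustration.

```latex
% L^2-Wasserstein distance (assumed normalization):
\[
  W_2^2(\mu_0, \mu_1) := \inf_{\pi} \int_{X \times X} \mathsf{d}(x,y)^2 \, \pi(dx, dy) .
\]
% A Kantorovich potential \varphi associated to an optimal plan \pi:
\[
  \varphi(x) + \varphi^c(y) = \frac{\mathsf{d}(x,y)^2}{2}
  \qquad \text{for } \pi\text{-a.e. } (x,y) \in X \times X .
\]
% c-cyclical monotonicity of \Lambda \subset X \times X:
% for any finite set of points (x_1,y_1), \dots, (x_N,y_N) \in \Lambda,
\[
  \sum_{i=1}^{N} c(x_i, y_i) \le \sum_{i=1}^{N} c(x_i, y_{i+1}),
  \qquad y_{N+1} = y_1 .
\]
% Example on \mathbb{R}: \Lambda = \{ (0,1), (1,2) \}.
\[
  c(0,1) + c(1,2) = \tfrac{1}{2} + \tfrac{1}{2} = 1
  \;\le\; c(0,2) + c(1,1) = 2 + 0 = 2 ,
\]
% so the order-preserving pairing satisfies the inequality.
```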
Recall that a measure ν on a measurable space (Ω, F) is said to be concentrated on A ⊂ Ω if ∃B ⊂ A with B ∈ F so that ν(Ω \ B) = 0.

Curvature-Dimension Conditions
We now turn to describe various synthetic conditions encapsulating generalized Ricci curvature lower bounds coupled with generalized dimension upper bounds.
Remark 6.5. When m(X) < ∞ as in our setting, it is known [72, Proposition 1.6 (ii)] that CD(K, N ) implies CD(K, ∞), and hence the requirement µ t ≪ m for all intermediate times t ∈ (0, 1) is in fact superfluous, as it must hold automatically by finiteness of the Shannon entropy (see [71,72]).
The following is a local version of CD(K, N ): Note that (e t ) ♯ ν is not required to be supported in X o for intermediate times t ∈ (0, 1) in the latter definition.
The following pointwise density inequality is a known equivalent definition of CD(K, N ) on essentially non-branching spaces (the equivalence follows by combining the results of [29] and [41], see the proof of Proposition 9.1): Definition 6.7 (CD(K, N ) for essentially non-branching spaces). An essentially non-branching m.m.s. (X, d, m) satisfies CD(K, N ) if and only if for all µ 0 , µ 1 ∈ P 2 (X, d, m), there exists a unique ν ∈ OptGeo(µ 0 , µ 1 ), ν is induced by a map (i.e. ν = S ♯ (µ 0 ) for some map S : X → Geo(X)), µ t := (e t ) # ν ≪ m for all t ∈ [0, 1], and writing µ t = ρ t m, we have for all t ∈ [0, 1]: The Measure Contraction Property MCP(K, N ) was introduced independently by Ohta in [56] and Sturm in [72]. The idea is to only require the CD(K, N ) condition to hold when µ 1 degenerates to δ o , a delta-measure at o ∈ supp(m). However, there are several possible implementations of this idea. We start with the following one, which is a variation of the one used in [29]: where µ 0 = ρ 0 m.
The variant proposed in [56] is as follows (with the minor modification that our definition below does not require that supp(m) = X as in [56]): When either the MCP(K, N ) or MCP ε (K, N ) conditions hold for a given o ∈ supp(m), we will say that the space satisfies the corresponding condition with respect to o.
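For orientation, the pointwise density inequality of Definition 6.7 and its one-sided MCP analogue are customarily expressed through distortion coefficients. The block below records their usual form; it is a sketch in standard notation (the symbols \sigma, \tau and the case distinction are the conventional ones and are assumed here), not a quotation of the displays of this section.

```latex
% Distortion coefficients for K \in \mathbb{R}, N \in (1,\infty), t \in [0,1], \theta > 0 (standard convention, assumed)
\sigma^{(t)}_{K,N}(\theta) :=
\begin{cases}
  +\infty & K\theta^2 \ge N\pi^2, \\
  \dfrac{\sin\bigl(t\theta\sqrt{K/N}\bigr)}{\sin\bigl(\theta\sqrt{K/N}\bigr)} & 0 < K\theta^2 < N\pi^2, \\
  t & K = 0, \\
  \dfrac{\sinh\bigl(t\theta\sqrt{-K/N}\bigr)}{\sinh\bigl(\theta\sqrt{-K/N}\bigr)} & K < 0,
\end{cases}
\qquad
\tau^{(t)}_{K,N}(\theta) := t^{1/N}\, \sigma^{(t)}_{K,N-1}(\theta)^{1-1/N}.

% CD(K,N)-type density inequality along \nu-a.e. geodesic \gamma, with \mu_t = \rho_t \mathfrak{m}
\rho_t^{-1/N}(\gamma_t) \;\ge\;
\tau^{(1-t)}_{K,N}\bigl(\mathsf{d}(\gamma_0,\gamma_1)\bigr)\, \rho_0^{-1/N}(\gamma_0)
+ \tau^{(t)}_{K,N}\bigl(\mathsf{d}(\gamma_0,\gamma_1)\bigr)\, \rho_1^{-1/N}(\gamma_1).

% MCP-type inequality (target \mu_1 = \delta_o): only the first term survives
\rho_t^{-1/N}(\gamma_t) \;\ge\; \tau^{(1-t)}_{K,N}\bigl(\mathsf{d}(\gamma_0,\gamma_1)\bigr)\, \rho_0^{-1/N}(\gamma_0).
```

Note that for K = 0 one has \tau^{(t)}_{0,N}(\theta) = t, so the first inequality reduces to concavity of t \mapsto \rho_t^{-1/N}(\gamma_t).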
Lemma 6.11. The following chain of implications is known: CD(K, N ) ⇒ MCP ε (K, N ) ⇒ MCP(K, N ). Proof. For the first implication, by Remark 6.10 we may reduce to the case supp(m) = X. It is then known [72, Corollary 2.4] that CD(K, N ) implies a doubling condition for (X, d), and so by completeness, the latter space must be proper. Fixing µ 0 ≪ m with bounded support and o ∈ X, let ν ε be an element of OptGeo(µ 0 , µ ε 1 ) satisfying the CD(K, N ) condition for µ ε 1 = m(B(o, ε)) −1 m B(o,ε) . By Lemma 6.1, {ν ε } has a subsequence converging to some ν 0 ∈ OptGeo(µ 0 , δ o ) as ε → 0. The upper semi-continuity of E N and the continuity of the evaluation map e t ensure that ν 0 satisfies the MCP ε (K, N ) condition (6.4). The second implication follows by the arguments of [65, Section 5] (without any type of essential non-branching assumption).
Remark 6.12. We will show in Proposition 9.1 that for essentially non-branching spaces, MCP(K, N ) conversely implies MCP ε (K, N ). We remark that for non-branching spaces, the implication CD(K, N ) ⇒ MCP(K, N ) was first proved in [72].
The following simple lemma will be useful for quickly establishing that (supp(m), d) is proper and geodesic:  (K, N ), the above argument shows that (supp(m), d) is complete and locally compact. Together with the assumption that the latter space is a length-space, the Hopf-Rinow theorem implies that it is proper and geodesic.
The following is a standard corollary of the fact that the optimal dynamical plan is induced by a map (see e.g. the comments after [41, Theorem 1.1]); as we could not find a reference, for completeness, we sketch the proof. Corollary 6.15. With the same assumptions as in Theorem 6.14, the unique optimal dynamical plan ν is concentrated on a (Borel) set G ⊂ Geo(X), so that for all t ∈ [0, 1), the evaluation map e t | G : G → X is injective. In particular, for any Borel subset H ⊂ G: Sketch of proof. First, we claim the existence of X 1 ⊂ X with µ 0 (X 1 ) = 1, so that for all x ∈ X 1 , there exists a unique γ ∈ G ϕ with γ 0 = x. Otherwise, if A ⊂ X is a set of positive µ 0 -measure where this is violated, there are at least two distinct geodesics in G ϕ emanating from every x ∈ A. As these geodesics must differ at some rational time in (0, 1), it follows that there exist a rational t̄ ∈ (0, 1) and B ⊂ A still of positive µ 0 -measure so that the two geodesics emanating from x differ at time t̄ for all x ∈ B. Consider µ̄ 0 = µ 0 | B /µ 0 (B) ≪ m, and transport to time t̄ half of its mass along one geodesic and the second half along the other one (see e.g. the proof of [29, Theorem 5.1]). The latter transference plan is optimal but is not induced by a map, yielding a contradiction. Now denote G := S ♯ (X 1 ) (and hence ν(G) = 1), so that the injectivity of e 0 | G is already guaranteed. To see the injectivity of e t | G for all t ∈ (0, 1), suppose to the contrary that there exist distinct γ 1 , γ 2 ∈ G with γ 1 t = γ 2 t . Denoting by η the gluing of γ 1 restricted to [0, t] with γ 2 restricted to [t, 1], it follows by d 2 -cyclic monotonicity (see e.g. the proof of [14, Lemma 2.6] or that of Lemma 3.7) that η ∈ G ϕ with η 0 = γ 1 0 and η ≠ γ 1 . But this is in contradiction to the definition of X 1 , thereby concluding the proof.

Disintegration Theorem
We include here a version of the Disintegration Theorem that we will use. We will follow [18, Appendix A] where a self-contained approach (and a proof) of the Disintegration Theorem in countably generated measure spaces can be found. An even more general version of the Disintegration Theorem can be found in [39,Section 452].
Recall that given a measure space (X, X , m), a set A ⊂ X is called m-measurable if A belongs to the completion of the σ-algebra X , generated by adding to it all subsets of null m-sets; similarly, a function f : (X, X , m) → R is called m-measurable if all of its level sets are m-measurable. Definition 6.16 (Disintegration on sets). Let (X, X , m) denote a measure space. Given any family {X α } α∈Q of subsets of X, a disintegration of m on {X α } α∈Q is a measure-space structure (Q, Q, q) and a map Q ∋ α −→ m α ∈ P(X, X ) so that: (1) for q-a.e. α ∈ Q, m α is concentrated on X α ; (2) for all B ∈ X , the map α → m α (B) is q-measurable; The measures m α are referred to as conditional probabilities.
Given a measurable space (X, X ) and a function Q : X → Q, with Q a general set, we endow Q with the push forward σ-algebra Q of X : i.e. the biggest σ-algebra on Q such that Q is measurable. Moreover, given a measure m on (X, X ), define a measure q on (Q, Q) by pushing forward m via Q, i.e. q := Q ♯ m.
such that the following requirements hold: (1) for all B ∈ X , the map α → m α (B) is q-measurable; (2) for all B ∈ X and C ∈ Q, the following consistency condition holds: A disintegration of m is called strongly consistent with respect to Q if in addition: (3) for q-a.e. α ∈ Q, m α is concentrated on Q −1 (α); The above general scheme fits with the following situation: given a measure space (X, X , m), suppose a partition of X is given into disjoint sets {X α } α∈Q so that X = ∪ α∈Q X α . Here Q is the set of indices and Q : X → Q is the quotient map, i.e. Q(x) = α if and only if x ∈ X α .
We endow Q with the quotient σ-algebra Q and the quotient measure q as described above, obtaining the quotient measure space (Q, Q, q). When a disintegration α → m α of m is (strongly) consistent with the quotient map Q, we will simply say that it is (strongly) consistent with the partition. Note that any disintegration α → m α of m on a partition {X α } α∈Q (as in Definition 6.16) is automatically strongly consistent with the partition (as in Definition 6.17), and vice versa.
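To make the notions of (strongly) consistent disintegration concrete, the following block records the usual disintegration and consistency formulas, followed by a toy example. It is a sketch under stated assumptions: the quotient map is written \mathfrak{Q} and the σ-algebras \mathscr{X}, \mathscr{Q} to distinguish them from the index set Q, and the unit-square example is purely illustrative and does not appear in the source.

```latex
% Disintegration of m on \{X_\alpha\}_{\alpha \in Q} (usual form of the formula; notation assumed)
\mathfrak{m}(B) = \int_Q \mathfrak{m}_\alpha(B)\, \mathfrak{q}(d\alpha) \qquad \forall B \in \mathscr{X}.

% Consistency with the quotient map \mathfrak{Q} : X \to Q
\mathfrak{m}\bigl(B \cap \mathfrak{Q}^{-1}(C)\bigr) = \int_C \mathfrak{m}_\alpha(B)\, \mathfrak{q}(d\alpha)
\qquad \forall B \in \mathscr{X},\ C \in \mathscr{Q}.

% Toy example (illustrative): X = [0,1]^2 with \mathfrak{m} = \mathcal{L}^2, partitioned into vertical segments
X_\alpha = \{\alpha\} \times [0,1], \qquad \mathfrak{Q}(x,y) = x, \qquad Q = [0,1],
\qquad \mathfrak{q} = \mathfrak{Q}_\sharp \mathfrak{m} = \mathcal{L}^1\llcorner_{[0,1]},
\qquad \mathfrak{m}_\alpha = \delta_\alpha \otimes \mathcal{L}^1\llcorner_{[0,1]}.

% Consistency is Fubini's theorem, with B_\alpha := \{ y : (\alpha,y) \in B \}:
\mathcal{L}^2\bigl(B \cap (C \times [0,1])\bigr) = \int_C \mathcal{L}^1(B_\alpha)\, d\alpha .
```

In this example each \mathfrak{m}_\alpha is concentrated on X_\alpha = \mathfrak{Q}^{-1}(\alpha), so the disintegration is strongly consistent.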
We now formulate the Disintegration Theorem (it is formulated for probability measures but clearly holds for any finite non-zero measure): Theorem 6.18 (Theorem A.7, Proposition A.9 of [18]). Assume that (X, X , m) is a countably generated probability space and that {X α } α∈Q is a partition of X.
Then the quotient probability space (Q, Q, q) is essentially countably generated and there exists an essentially unique disintegration α → m α consistent with the partition.
If in addition X contains all singletons, then the disintegration is strongly consistent if and only if there exists a m-section S m ∈ X of the partition such that the σ-algebra on S m induced by the quotient-map contains the trace σ-algebra X ∩ S m := {A ∩ S m ; A ∈ X }.
Let us expand on the statement of Theorem 6.18. Recall that a σ-algebra A is countably generated if there exists a countable family of sets so that A coincides with the smallest σ-algebra containing them. In the measure space (Q, Q, q), the σ-algebra Q is called essentially countably generated if there exists a countable family of sets Q n ⊂ Q such that for any C ∈ Q there exists Ĉ ∈ Q̂, where Q̂ is the σ-algebra generated by {Q n } n∈N , such that q(C ∆ Ĉ) = 0.
Essential uniqueness is understood above in the following sense: if α → m 1 α and α → m 2 α are two disintegrations consistent with the partition then m 1 α = m 2 α for q-a.e. α ∈ Q. Finally, a set S ⊂ X is a section for the partition X = ∪ α∈Q X α if for any α ∈ Q, S ∩ X α is a singleton {x α }. By the axiom of choice, a section S always exists, and we may identify Q with S via the map Q ∋ α → x α ∈ S. A set S m is an m-section if there exists Y ∈ X with m(X \Y ) = 0 such that the partition Y = ∪ α∈Qm (X α ∩ Y ) has section S m , where Q m = {α ∈ Q; X α ∩ Y ≠ ∅}. As q = Q ♯ m, clearly q(Q \ Q m ) = 0. As usual, we identify Q m with S m , so that now Q m carries two measurable structures: Q ∩ Q m (the push-forward of X ∩ Y via Q), and also X ∩ S m via our identification. The last condition of Theorem 6.18 is that Q ∩ Q m ⊃ X ∩ S m , i.e. that the restricted quotient-map Q| Y : (Y, X ∩ Y ) → (S m , X ∩ S m ) is measurable, so that the full quotient-map Q : (X, X ) → (S, X ∩ S) is m-measurable.
We will typically apply the Disintegration Theorem to (E, B(E), m E ), where E ⊂ X is an m-measurable subset (with m(E) > 0) of the m.m.s. (X, d, m). As our metric space is separable, B(E) is countably generated, and so Theorem 6.18 applies. In particular, when Q ⊂ R, E is a closed subset of X, the partition elements X α are closed and the quotient-map Q : E → Q is known to be Borel (for instance, this is the case when Q is continuous), [73,Theorem 5.4.3] guarantees the existence of a Borel section S for the partition so that Q : E → S is Borel measurable, thereby guaranteeing by Theorem 6.18 the existence of an essentially unique disintegration strongly consistent with Q.

L 1 Optimal Transportation Theory
In this section we recall various results from the theory of L 1 optimal-transport which are relevant to this work, and add some new information we will subsequently require. We refer to [2,13,19,23,37,38,47,75] for more details.

Preliminaries
To any 1-Lipschitz function u : X → R there is a naturally associated d-cyclically monotone set: Its transpose is given by Γ −1 u = {(x, y) ∈ X × X : (y, x) ∈ Γ u }. We define the transport relation R u and the transport set T u , as: where {x = y} denotes the diagonal {(x, y) ∈ X 2 : x = y} and P i the projection onto the i-th component. Recall that Γ u (x) = {y ∈ X ; (x, y) ∈ Γ u } denotes the section of Γ u through x in the first coordinate, and similarly for R u (x) (through either coordinate, by symmetry). Since u is 1-Lipschitz, Γ u , Γ −1 u and R u are closed sets, and so are Γ u (x) and R u (x). Consequently T u is a projection of a Borel set and hence analytic; it follows that it is universally measurable, and in particular, m-measurable [73].
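For orientation, the sets Γ u , R u and T u are usually defined as in the block below, which also records the one-line verification that Γ u is d-cyclically monotone. This is a sketch in the standard notation of L 1 optimal transport; the displayed definitions are assumed to agree with those intended above.

```latex
% Standard definitions (assumed to match the omitted displays)
\Gamma_u := \{ (x,y) \in X \times X \;:\; u(x) - u(y) = \mathsf{d}(x,y) \}, \qquad
R_u := \Gamma_u \cup \Gamma_u^{-1}, \qquad
\mathcal{T}_u := P_1\bigl( R_u \setminus \{ x = y \} \bigr).

% d-cyclic monotonicity of \Gamma_u: for (x_i,y_i) \in \Gamma_u, i = 1,\dots,N, with y_{N+1} = y_1,
\sum_{i=1}^{N} \mathsf{d}(x_i,y_i)
  = \sum_{i=1}^{N} \bigl( u(x_i) - u(y_i) \bigr)
  = \sum_{i=1}^{N} \bigl( u(x_i) - u(y_{i+1}) \bigr)
  \le \sum_{i=1}^{N} \mathsf{d}(x_i,y_{i+1}),
% where the second equality is a cyclic relabeling and the inequality uses that u is 1-Lipschitz.
```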
The following is immediate to verify (see [2,Proposition 4.2]): Also recall the following definitions, introduced in [23]: A ± are called the sets of forward and backward branching points, respectively. Note that both A ± are analytic sets. If x ∈ A + and (y, x) ∈ Γ u necessarily also y ∈ A + (as Γ u (y) ⊃ Γ u (x) by the triangle inequality); similarly, if x ∈ A − and (x, y) ∈ Γ u then necessarily y ∈ A − . Consider the non-branched transport set which belongs to the sigma-algebra σ(A) generated by analytic sets and is therefore m-measurable.
Define the non-branched transport relation: It was shown in [23] (cf. [19]) that R b u is an equivalence relation over T b u and that for any Remark 7.2. Note that even if x ∈ T b u , the transport ray R u (x) need not be entirely contained in T b u . However, we will soon prove that almost every transport ray (with respect to an appropriate measure) has interior part contained in T b u . It will be very useful to note that whenever the space (X, d) is proper (for instance when (X, d, m) verifies MCP(K, N ) and supp(m) = X), T u and A ± are σ-compact sets: indeed writing R u \ {x = y} = ∪ ε>0 R u \ {d(x, y) < ε} it follows that R u \ {x = y} is σ-compact. Hence T u is σ-compact. Moreover: open and open sets are F σ in metric spaces, it follows that {(x, z, w) ∈ T u × (R u ) c : (x, z), (x, w) ∈ Γ u } is σ-compact and therefore A + is σ-compact; the same applies to A − . Consequently, T b u and R b u are Borel. Now, from the first part of the Disintegration Theorem 6.18 applied to (T b u , B(T b u ), m T b u ), we obtain an essentially unique disintegration of m T b u consistent with the partition of T b u given by the equivalence classes {R b u (α)} α∈Q of R b u : with corresponding quotient space (Q, Q, q) (Q ⊂ T b u may be chosen to be any section of the above partition). The next step is to show that the disintegration is strongly consistent. By the Disintegration Theorem, this is equivalent to the existence of an m T b u -section Q̄ ∈ B(T b u ) (which by a mild abuse of notation we will call m-section), such that the quotient map associated to the partition is m-measurable, where we endow Q̄ with the trace σ-algebra. This has already been shown in [19,Proposition 4.4] in the framework of non-branching metric spaces; since its proof does not use any non-branching assumption, we can conclude that: where now Q ⊃ Q̄ ∈ B(T b u ) with Q̄ an m-section for the above partition (and hence q is concentrated on Q̄).
For a more constructive approach under the additional assumption of properness of the space, see also [25,Proposition 4.8].
A-priori the non-branched transport set T b u can be much smaller than T u . However, under fairly general assumptions one can prove that the sets A ± of forward and backward branching are both m-negligible. In [23] this was shown for an m.m.s. (X, d, m) verifying RCD(K, N ) and supp(m) = X. The proof only relies on the following two properties which hold for the latter spaces (see also [25]):
- supp(m) = X.
- Given µ 0 , µ 1 ∈ P 2 (X) with µ 0 ≪ m, there exists a unique optimal transference plan for the W 2 -distance and it is induced by an optimal transport map.
By Theorem 6.14 these properties are also verified for an essentially non-branching m.m.s. (X, d, m) satisfying MCP(K, N ) and supp(m) = X. We summarize the above discussion in: $m\llcorner_{\mathcal{T}_u} = \int_Q m_\alpha \, q(d\alpha)$, and for q-a.e. α ∈ Q, $m_\alpha(R^b_u(\alpha)) = 1$. (7.3)
Here Q may be chosen to be a section of the above partition so that Q ⊃ Q̄ ∈ B(T b u ) with Q̄ an m-section with m-measurable quotient map. In particular, the σ-algebra Q contains B(Q̄) and q is concentrated on Q̄.
Remark 7.4. By modifying the definitions of A + , A − to only reflect branching inside supp(m), it is possible to remove the assumption that supp(m) = X, but we refrain from this extraneous generality here.
Remark 7.5. If we consider u = d(·, o), it is easy to check that the set A + coincides with the cut locus C o , i.e. the set of those z ∈ X such that there exist at least two distinct geodesics starting at z and ending in o. Hence the previous corollary implies that for any o ∈ X, the cut locus has m-measure zero: m(C o ) = 0. This in particular implies that an essentially non-branching m.m.s. verifying MCP(K, N ) and supp(m) = X also supports a local (1, 1)-weak Poincaré inequality, see [68].

Maximality of transport rays on non-branched transport-set
It is elementary to check that Γ u induces a partial order relation on X: Note that by definition: Recall that for any x ∈ T b u , (R u (x), d) is isometric to a closed interval in (R, |·|). This isometry induces a total ordering on R u (x) which must coincide with either ≤ u or ≥ u , implying that (R u (x), ≤ u ) is totally ordered.
is isometric to an interval in (R, | · |). Proof. Consider z, w ∈ R u (x) ∩ T b u ; as (R u (x), ≤ u ) is totally ordered, assume without loss of generality that z ≤ u w. Given y ∈ R u (x) with z ≤ u y ≤ u w, we must prove that y ∈ T b u . Indeed, since w ≥ u y and w / ∈ A + , necessarily y / ∈ A + , and since z ≤ u y and z / ∈ A − , necessarily y / ∈ A − . Hence y ∈ T b u and the claim follows.
Recall that given a partially ordered set, a chain is a totally ordered subset. A chain is called maximal if it is maximal with respect to inclusion. We introduce the following: Definition 7.7 (Transport Ray). A maximal chain R in (X, d, ≤ u ) is called a transport ray if it is isometric to a closed interval I in (R, |·|) of positive (possibly infinite) length.
In other words, a transport ray R is the image of a closed non-null geodesic γ parametrized by arclength on I so that the function u • γ is affine with slope 1 on I, and so that R is maximal with respect to the total ordering ≤ u . Lemma 7.8. Given x ∈ T b u , R is a transport ray passing through x if and only if R = R u (x). Proof. Recall that for any x ∈ T b u , (R u (x), d, ≤ u ) is order isometric to a closed interval in (R, |·|). As R u (x) is by definition maximal in X with respect to inclusion, it follows that it must be a transport ray. Conversely, note that for any transport ray R we always have R ⊂ ∩ w∈R R u (w). Indeed, for any w, z ∈ R, we have z ≤ u w or z ≥ u w, and hence by definition (w, z) ∈ R u so that z ∈ R u (w). If x ∈ R ∩ T b u , we already showed above that R u (x) is a transport ray. Since R ⊂ R u (x) and R is assumed to be maximal with respect to inclusion, it follows that necessarily R = R u (x). Corollary 7.9. If R 1 and R 2 are two transport rays which intersect in T b u then they must coincide.
In this subsection, we reconcile the crucial maximality property of R u (α), which we will require for the definition of CD 1 in the next section, with the fact that the disintegration in (7.3) is with respect to (the possibly non-maximal) R b u (α) = R u (α) ∩ T b u . We will show that under MCP, for q-a.e. α, the only parts of R u (α) which are possibly not contained in T b u are its end points; this fact is the main new result of this section.
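A one-dimensional example may help fix ideas about how a transport ray can leave T b u only through its end points. The block below works out the case X = R with u = |·|. It is an illustrative sketch, not taken from the source, and it assumes the customary definitions of Γ u , of the branching sets A ± and of the initial and final points a, b, which are restated in the comments.

```latex
% Setting (illustrative): X = \mathbb{R}, \mathsf{d}(x,y) = |x-y|, u(x) = |x| = \mathsf{d}(x,0).
% Assumed definitions:
%   \Gamma_u = \{ (x,y) : u(x) - u(y) = \mathsf{d}(x,y) \},  R_u = \Gamma_u \cup \Gamma_u^{-1},
%   A_+ = \{ x \in \mathcal{T}_u : \exists z,w \in \Gamma_u(x),\ (z,w) \notin R_u \},
%   A_- = \{ x \in \mathcal{T}_u : \exists z,w \in \Gamma_u^{-1}(x),\ (z,w) \notin R_u \},
%   a = \{ x \in \mathcal{T}_u : \nexists\, y \neq x,\ (y,x) \in \Gamma_u \},
%   b = \{ x \in \mathcal{T}_u : \nexists\, y \neq x,\ (x,y) \in \Gamma_u \}.

\Gamma_u = \{ (x,y) : |x| = |y| + |x-y| \} = \{ (x,y) : y = s x,\ s \in [0,1] \},
\qquad R_u = \{ (x,y) : xy \ge 0 \}, \qquad \mathcal{T}_u = \mathbb{R}.

% Branching: \Gamma_u(x) is the segment from x to 0, so A_+ = \emptyset;
% \Gamma_u^{-1}(0) = \mathbb{R} \ni \pm 1 with (1,-1) \notin R_u, so A_- = \{0\}.
A_+ = \emptyset, \qquad A_- = \{0\}, \qquad \mathcal{T}_u^b = \mathbb{R} \setminus \{0\}.

% Equivalence classes of R_u^b and the corresponding transport rays:
R_u^b(\alpha) = (0,\infty), \quad R_u(\alpha) = [0,\infty) \quad (\alpha > 0); \qquad
R_u^b(\alpha) = (-\infty,0), \quad R_u(\alpha) = (-\infty,0] \quad (\alpha < 0).

% End points: \Gamma_u(0) = \{0\}, hence 0 \in b, while a = \emptyset. Therefore
R_u(\alpha) \setminus \mathcal{T}_u^b = \{0\} \subset a \cup b \qquad \text{for every } \alpha \neq 0 .
```

Here the two rays share the final point 0, which is exactly the kind of overlap at end points that the maximal rays R u (α) may exhibit while their relative interiors remain disjoint and inside T b u .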
To rigorously state this new observation, we recall the classical definition of initial and final points, a and b, respectively: Note that: so a is the difference of analytic sets and consequently belongs to σ(A); similarly for b. As in the previous subsection, whenever (X, d) is proper, a, b are in fact Borel sets. Then there exists Q̂ ⊂ Q such that q(Q \ Q̂) = 0 and for any α ∈ Q̂ it holds: In particular, for every α ∈ Q̂: (with the latter interpreted as the relative interior). Proof.
Step 1. Consider the m-section Q̄ from Corollary 7.3 so that Q ⊃ Q̄ ∈ B(T b u ), Q ⊃ B(Q̄) and q(Q \ Q̄) = 0. Consider the set: The claim will be proved once we show that q(Q 1 ) = 0. First, observe that and therefore Q 1 ⊂ Q̄ is analytic; since Q ⊃ B(Q̄), it follows that Q 1 is q-measurable. Now suppose by contradiction that q(Q 1 ) > 0. We can divide Q 1 into two sets: Since Q 1 = Q + 1 ∪ Q − 1 , without loss of generality let us assume q(Q + 1 ) > 0, and for ease of notation assume further that Q + 1 = Q 1 . Hence, for any α ∈ Q 1 , there exists z ∈ Γ u (α) such that z ∉ T b u and z ∉ b; note that necessarily z ∈ A − . Recall that for all α ∈ Q, R u (α) and hence Γ u (α) are isometric to a closed interval, and we will use the following identification to denote the isometry Γ u (α) ∼ I α , with Moreover, we may select a α and b α to be q-measurable functions of Q 1 . To see this, consider the set Σ := {(α, x, y) ∈ Q 1 × Γ u : x ∈ A − , d(x, y) > 0}, and observe that it is analytic, and that P 1 (Σ) = Q 1 . By von Neumann's selection Theorem (see [73,Theorem 5.5.2]), there exists a σ(A)-measurable selection of Σ: and so in particular these functions are q-measurable. As u(a α ) = u(α) − a α (and similarly for b α ), it follows that are also σ(A)-measurable and hence q-measurable. Possibly restricting Q 1 , by Lusin's Theorem we can also assume that the above functions are continuous.
Then define the following set: Recall that R b u is Borel since (X, d) is proper, and therefore Λ is Borel. Also note that for (α, x, z) ∈ Λ, since R u (α) is isometric to a closed interval, necessarily z = z α . Finally, we claim that P 2,3 (Λ) is d 2 -cyclically monotone: for (x 1 , z 1 ), (x 2 , z 2 ) ∈ P 2,3 (Λ) observe that and the monotonicity follows. We can then define a function T by imposing graph(T ) = P 2,3 (Λ); note that P 2,3 (Λ) is analytic and therefore T is Borel measurable (see [73,Theorem 4

The CD 1 Condition
In this section we introduce the CD 1 (K, N ) condition, which plays a cardinal role in this work. As a first step towards understanding this new condition, we show that it always implies MCP ε (K, N ) (and MCP(K, N )), without requiring any type of non-branching assumption. By analogy, we also introduce the MCP 1 (K, N ) condition, which may be of independent interest.

Definitions of CD 1 and MCP 1
We first assume that supp(m) = X. Note that we do not assume that the transport rays {X α } α∈Q below are disjoint or have disjoint relative interiors, in an attempt to obtain a useful definition also for m.m.s.'s which may have significant branching. However, throughout most of this work, we will typically assume in addition that the space is essentially non-branching, in which case an equivalent definition will be presented in Proposition 8.13 below. (2) For q-a.e. α ∈ Q, X α is a transport ray for Γ u (recall Definition 7.7).
We take this opportunity to define an analogous variant of MCP: Remark 8.3. Note that when u = d(·, o) then necessarily T u = X (if X is not a singleton). In addition (x, o) ∈ Γ u for any x ∈ X, and hence by maximality of a transport ray, we must have o ∈ X α for q-a.e. α ∈ Q, and by condition (3) we deduce that o ∈ supp(m α ) for q-a.e. α ∈ Q. As CD(K, N ) implies MCP(K, N ) (in the one-dimensional case this is a triviality), we obviously see that CD 1 u (K, N ) implies MCP 1 u (K, N ) for all u = d(·, o).
We will focus on a particular class of 1-Lipschitz functions.
Remark 8.5. To extend Remark 8.3 to more general signed distance functions, we will need to require that (X, d) is proper, and in that case T d f ⊃ X \ {f = 0}. Indeed, given x ∈ X \ {f = 0}, consider the distance minimizing z ∈ {f = 0} (by compactness of bounded sets). Then (x, z) ∈ R d f and as x = z it follows that x ∈ T d f .
We now remove the restriction that supp(m) = X and introduce the main new definitions of this work:  . Note that we do not a-priori know that d f is 1-Lipschitz, since we do not know that (supp(m), d) is a length-space (see Lemma 8.4); nevertheless, we will shortly see that the CD 1 (K, N ) condition implies that (supp(m), d) must be a geodesic space, and hence the sentence "so that d f is 1-Lipschitz" is in fact redundant.
Remark 8.8. By definition, the CD 1 Lip , CD 1 and MCP 1 conditions hold for (X, d, m) iff they hold for (supp(m), d, m). It is also possible to introduce a definition of CD 1 u and MCP 1 u which applies to (X, d, m) directly, without passing through (supp(m), d, m); this would involve requiring that the transport rays {X α } are maximal inside supp(m), and in the case of CD 1 u would only apply to functions u which are 1-Lipschitz on supp(m) (these may be extended to the entire X by McShane's theorem). Our choice to use a tautological approach is motivated by the analogous situation for the more classical W 2 definitions of curvature-dimension (see Remark 6.10) and is purely for convenience, so as not to overload the definitions. Proof. We will show that (supp(m), d, m) satisfies MCP ε (K, N ), and consequently so will (X, d, m). By Remark 8.8, we may therefore assume that supp(m) = X. so that X α is a transport ray for Γ u , m α is supported on X α and (X α , d, m α ) verifies MCP(K, N ) with respect to o ∈ X α , for q-a.e. α ∈ Q. Now consider any µ 0 ∈ P(X) with µ 0 ≪ m, so that ρ 0 := dµ 0 /dm has bounded support. By measurability of the disintegration, the function Q ∋ α → z α := ∫ ρ 0 (x) m α (dx) is q-measurable, and hence Q̄ := {α ∈ Q ; z α ∈ (0, ∞)} is q-measurable. Clearly ∫ Q̄ z α q(dα) = ∫ Q z α q(dα) = 1 since z α < ∞ for q-a.e. α ∈ Q.
It remains to establish the MCP ε inequality of Definition 6.8. Fix t ∈ (0, 1), and recall that for q-a.e. α ∈ Q̄, the (one-dimensional, non-branching) (X α , d, m α ) verifies MCP(K, N ) (and hence MCP ε (K, N )), and as µ α 0 ≪ m α and o ∈ supp(m α ), in particular µ α t := (e t ) ♯ (ν α ) ≪ m α . Applying e t to both sides of (8.3), it follows that µ t = (e t ) ♯ (ν) ≪ m. Writing µ t = ρ t m and µ α t = ρ α t m α for q-a.e. α ∈ Q̄, the MCP ε condition implies that: In addition, the application of e t to both sides of (8.3) yields the following disintegration: Now consider the set Y = {ρ t > 0}, and note that by (8.5): , applying Hölder's inequality on the interior integral for q-a.e. α ∈ Q̄, using (8.6), employing the one-dimensional MCP ε inequality (8.4) and canceling z α , and finally applying Hölder's inequality again on the exterior integral, we obtain: where the last inequality above follows since ρ 0 m α = 0 for α ∈ Q \ Q̄ and since the exponent on the second term is negative. Note that we applied Hölder's inequality above in reverse form: which is valid as soon as α + β = 1, β < 0, regardless of whether or not |g| > 0 ω-a.e. Rearranging terms above and raising to the power of N −1 N , the desired inequality follows: Remark 8.10. Note that the above proof shows that, not only does it hold that supp(µ t ) ⊂ supp(m) for all t ∈ [0, 1), as required in the definition of MCP ε (K, N ), but in fact µ t ≪ m.
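For reference, the reverse form of Hölder's inequality invoked in the preceding proof can be written and derived as in the block below. This is a sketch of the standard statement for exponents α + β = 1 with β < 0 (so α > 1); the precise display used in the proof is assumed to be of this form, and the derivation is given for the non-degenerate case 0 < ∫ |g| dω < ∞.

```latex
% Reverse Hölder inequality: \alpha + \beta = 1, \beta < 0 (hence \alpha > 1), \omega a measure
\int |f|^{\alpha}\, |g|^{\beta}\, d\omega \;\ge\;
\Bigl( \int |f|\, d\omega \Bigr)^{\alpha} \Bigl( \int |g|\, d\omega \Bigr)^{\beta},
% with the convention |g|^{\beta} = +\infty on \{ g = 0 \}.

% Derivation: apply the usual Hölder inequality with exponents \alpha and \alpha/(\alpha-1) to
|f| = \bigl( |f|^{\alpha} |g|^{\beta} \bigr)^{1/\alpha} \cdot |g|^{-\beta/\alpha},
% noting that (-\beta/\alpha) \cdot \alpha/(\alpha-1) = -\beta/(\alpha-1) = 1 since \alpha - 1 = -\beta:
\int |f|\, d\omega \;\le\;
\Bigl( \int |f|^{\alpha} |g|^{\beta}\, d\omega \Bigr)^{1/\alpha}
\Bigl( \int |g|\, d\omega \Bigr)^{-\beta/\alpha}.
% Raising to the power \alpha and multiplying by ( \int |g| \, d\omega )^{\beta} gives the claim.
```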
for q-a.e. α ∈ Q̄. Integrating over Q̄ we obtain and the claim follows (when K > 0, there is no need to inspect B B(o, π √((N − 1)/K)) since A and hence (e t ) ♯ (ν) are supported inside the latter ball).
As a consequence, we immediately obtain from Lemmas 6.13 and 8.4:

On Essentially Non-Branching Spaces
Having at our disposal MCP(K, N ), we can now invoke the results of Section 7 concerning L 1 Optimal Transportation theory, and obtain the following important equivalent definitions of CD 1 Lip (K, N ), CD 1 (K, N ) and MCP 1 (K, N ) assuming that (X, d, m) is essentially non-branching. Given K ∈ R and N ∈ (1, ∞), the following statements are equivalent: (2) For any 1-Lipschitz function u : (X, d) → R, let {R b u (α)} α∈Q denote the partition of T b u given by the equivalence classes of R b u . Denote by X α the closure of R b u (α). Then all the conditions (1)-(4) of Definition 8.1 hold for the family {X α } α∈Q . In particular, X α = R u (α) is a transport-ray for q-a.e. α ∈ Q. Moreover, the sets {X α } α∈Q have disjoint interiors {R b u (α)} α∈Q contained in T b u , and the disintegration (Q, Q, q) of m Tu on {X α } α∈Q given by (8.1) is essentially unique. Furthermore, Q may be chosen to be a section of the above partition so that Q ⊃ Q̄ ∈ B(T b u ) with Q̄ an m-section with m-measurable quotient map, so that in particular the σ-algebra Q contains B(Q̄) and q is concentrated on Q̄.
An identical statement holds for CD 1 (K, N ) when only considering signed distance functions u = d f . An identical statement also holds for MCP 1 (K, N ) when only considering the functions u = d(·, o), after replacing above condition (4) of Definition 8.1 with condition (4') of Definition 8.2.
Proof. The only direction requiring proof is (1) ⇒ (2). Given a 1-Lipschitz function u as above, we may assume that m(T u ) > 0, otherwise there is nothing to prove. The CD 1 u (K, N ) condition ensures there exists a family {Y β } β∈P of sets and a disintegration: so that for p-a.e. β ∈ P , Y β is a transport ray for Γ u , (Y β , d, m P β ) satisfies CD(K, N ) and supp(m P β ) = Y β . By removing a p-null-set from P , let us assume without loss of generality that the above properties hold for all β ∈ P . N ), and as our space is essentially non-branching with full-support, Corollary 7.3 implies that m(A + ∪ A − ) = 0 and that there exists an essentially unique disintegration (Q, Q, q) of m Tu = m T b u strongly consistent with the partition of T b u given by {R b u (α)} α∈Q : By Corollary 7.3, Q may be chosen to be a section of the above partition satisfying the statement appearing in the formulation of Proposition 8.13. Again, let us assume without loss of generality that m α (R b u (α)) = 1 for all α ∈ Q. By Theorem 7.10, there exists In addition, since m(T u \ T b u ) = 0, there exists P 1 ⊂ P of full p-measure so that m P β (T b u ) = 1 for all β ∈ P 1 . By Lemmas 7.6 and 7.8, N ), is of total measure 1 and satisfies supp( In particular, for all β ∈ P 1 , there exists a unique (since R b u is an equivalence relation on T b u and by uniqueness of the section map) α = α(β) ∈ Q so that Y β = R u (α). Denoting by Q̃ ⊂ Q the set of indices α obtained in this way, it is clear that Q̃ is of full q-measure, since: Consequently, Q 2 := Q̃ ∩ Q 1 is of full q-measure as well. Denoting P 2 := α −1 (Q 2 ) and repeating the above argument, it follows that P 2 ⊂ P 1 is of full p-measure and satisfies that for all β ∈ P 2 , Y β = R u (α) for α = α(β) ∈ Q 2 . We conclude that there is a one-to-one correspondence: so both of these representations yield an identical partition (up to relabeling) of the set: Clearly m(T b u \ C) = 0 and so C is m-measurable.
Therefore, by the above two disintegration formulae: After identifying P 2 with Q 2 via η, it follows necessarily that q Q 2 = p P 2 as they are both the push-forward of m C under the partition map (since (m P β ) T b u and m α are both probability measures on T u ). Applying the Disintegration Theorem 6.18 to (C, B(C), m C ), we conclude that there is an essentially unique disintegration of m C on the above partition of C. Consequently, there exist P 3 ⊂ P 2 of full p-measure and Q 3 = η(P 3 ) ⊂ Q 2 of full q-measure so that: for all pairs (β, α) ∈ P 3 × Q 3 related by the correspondence η.
(3) Consequently: This confirms the four conditions of Definition 8.1, and the essential uniqueness of the disintegration (8.8) readily follows from that of the disintegration (8.7) and the arguments above. Finally, by Lemma 7.6, since u . This concludes the proof for the case of CD 1 Lip and CD 1 . For MCP 1 , one just needs to note that if u = d(·, o) then o ∈ Y β for all β ∈ P (by Remark 8.3, since Y β is a transport ray). Recalling the definition of N ) with respect to o and is of full support. The rest of the argument is identical to the one presented above, concluding the proof.
Recall moreover that we already derived several properties of W 2 -geodesics in essentially non-branching m.m.s.'s verifying MCP(K, N ). Hence from Proposition 8.9 we also obtain all the claims of Theorem 6.14 and Corollary 6.15, as well as all of the results of the next section, provided the m.m.s. is essentially non-branching and verifies CD 1 (K, N ) for N ∈ (1, ∞).

Temporal-Regularity under MCP
In this section we deduce from the Measure Contraction and essentially non-branching properties various temporal-regularity results for the map t → ρ t (γ t ) and related objects, which we will require for this work. By Proposition 8.9, these results also apply under the CD 1 condition. While these properties are essentially standard consequences of recently available results and tools, they appear to be new and may be of independent interest.
Remark 9.2. In fact, for essentially non-branching spaces, it is also possible to add the MCP 1 (K, N ) condition to the above list of equivalent statements. Indeed, we have already seen in the previous section that MCP 1 (K, N ) ⇒ MCP ε (K, N ) without any non-branching assumptions. The converse implication for non-branching spaces follows from [19,Proposition 9.5] (without identifying the MCP 1 (K, N ) condition by this name), and it is possible to extend this to essentially non-branching spaces by following the arguments of [23, Proposition A.1]. Remark 9.3. Note that in (3), one is allowed to test any µ 1 with supp(µ 1 ) ⊂ supp(m), not only µ 1 = δ o as in the other statements. By Theorem 6.14 (recall that MCP ε (K, N ) implies MCP(K, N )), note that the MCP ε (K, N ) condition is precisely equivalent to the validity of (9.2) for all measures µ 0 , µ 1 ∈ P 2 (X) of the form µ 1 = δ o with o ∈ supp(m) and µ 0 ≪ m with bounded support (contained in B(o, π √((N − 1)/K)) if K > 0). Remark 9.4. While the equivalence (1) ⇔ (4) will not be directly used in this work, it is worthwhile remarking that this is the only instance we are aware of, where one can obtain information on the density along geodesics without assuming or a-posteriori concluding some type of non-branching assumption. Indeed, the proof of (1) ⇒ (4) relies on the (newly available) Theorem 3.11.
Proof of Proposition 9.1.
(4) ⇒ (2). Let o ∈ supp(m) and let µ 0 = ρ 0 m ∈ P(X) with bounded support (contained in B(o, π√((N − 1)/K)) if K > 0). As (4) ⇒ (1), Lemma 6.13 implies that (supp(m), d) is proper, and in addition the assertions of Theorem 6.14 and Corollary 6.15 are in force. Now, there exists a non-decreasing sequence {f i } i∈N of simple functions of bounded support, with z i := ∫ f i dm ր 1, f i ր ρ 0 pointwise, and µ i 0 ⇀ µ 0 weakly, as i → ∞. By Theorem 6.14 there exists a unique ν i ∈ OptGeo(µ i 0 , δ o ), it is induced by a map, and can be written as: with each ν i k the unique optimal dynamical plan between µ i 0,k : for all t ∈ [0, 1) by Corollary 6.15. Lastly, supp(ν i ) ⊂ Geo(supp(m)) by Remark 6.10. It follows by (9.2) applied to ν i k that:

Multiplying by
N , summing over k, and using the mutual singularity of all corresponding measures, we obtain: Passing to a subsequence if necessary, Lemma 6.1 implies that ν i ⇀ ν ∞ ∈ OptGeo(µ 0 , δ o ), and hence (e t ) # ν i ⇀ (e t ) # ν ∞ . It follows by upper semi-continuity of E N on the left-hand side of (9.3), and monotone convergence (and z i → 1) on the right-hand side, that taking i → ∞ yields the MCP ε (K, N ) inequality (6.4).
(2) ⇒ (3). By Remark 6.10, we may reduce to the case supp(m) = X. In view of Remark 9.3, we first extend the validity of (9.2) by removing the (immaterial) restriction that µ 0 has bounded support. The restriction that supp(µ 0 ) ⊂ B(o, π√((N − 1)/K)) if K > 0 is automatically satisfied since MCP ε (K, N ) implies MCP(K, N ), which by [56] implies a Bonnet-Myers diameter estimate. When K ≤ 0, we may weakly approximate a general µ 0 ∈ P 2 (X, d, m) by measures µ i 0 ≪ m having bounded support and repeat the argument presented above in the proof of (4) ⇒ (2).
The case of a general µ 1 ∈ P 2 (X) with supp(µ 1 ) ⊂ supp(m) follows by approximating µ 1 by a convex combination of delta-measures: with W 2 (µ i 1 , µ 1 ) → 0 as i → ∞. By Theorem 6.14 (recall again that MCP ε (K, N ) implies MCP (K, N )), for each i there exists a unique ν i ∈ OptGeo(µ 0 , µ i 1 ), and we may write ν i = k≤n(i) α i k ν i k so that: Moreover, as explained above, N ) condition implies for all t ∈ [0, 1): Multiplying by (α i k ) 1−1/N , summing over k and using the mutual singularity of the corresponding measures, we obtain: Passing as usual to a subsequence if necessary, Lemma 6.1 implies that ν i ⇀ ν ∞ ∈ OptGeo(µ 0 , µ 1 ), and hence (e t ) # ν i ⇀ (e t ) # ν ∞ . Invoking the upper semi-continuity of E N on the left-hand-side, and lower semi-continuity of the right-hand-side (see [72,Lemma 3.3], noting that the first marginal of ν i is fixed to be µ 0 = ρ 0 m), (9.2) finally follows in full generality.
The density estimate (9.1) then follows using a straightforward variation of [41, Proposition 3.1], where it was shown how the existence of (a necessarily unique) transport map S may be used to obtain a pointwise density inequality such as (9.1) from an integral inequality such as (9.2) (the statement of [41, Proposition 3.1] involves an assumption on infinitesimal Hilbertianity of the space, but the only property used in the proof is the existence of a transport map S inducing a unique optimal dynamical plan).
Step 1. Given 0 ≤ s ≤ t < 1, observe that (restr t s ) ♯ ν is the unique element of OptGeo(µ s , µ t ); indeed µ s is absolutely continuous with respect to m and so Theorem 6.14 applies. In particular, we deduce that for each 0 ≤ s ≤ t < 1 and ν-a.e. γ: with the exceptional set depending on s and t. Reversing time and the roles of µ s , µ t , we similarly obtain for each 0 ≤ s ≤ t < 1 and ν-a.e. γ that: with the exceptional set depending on s and t (the case s = 0 is also included as the conclusion is then trivial). Note that given s ∈ [0, 1), as ρ s (x) > 0 for µ s -a.e. x, we have that ρ s (γ s ) > 0 for ν-a.e. γ. Altogether, we see that for each 0 ≤ s ≤ t < 1, for ν-a.e. γ: with the exceptional set depending on s and t.
Step 2. We now claim that for all t ∈ [0, 1), m({ρ t = 0} ∩ e t (H)) = 0. This will establish that µ t = ρ t m =ρ t m, so thatρ t is indeed a density of µ t , thereby concluding the proof.
Suppose, by contradiction, that the above is false, so that there exists t ∈ [0, 1) with m({ρ t = 0} ∩ e t (H)) > 0. As e t | H is injective, there exists K ⊂ H such that K t := e t (K) = {ρ t = 0} ∩ e t (H).
The following two consequences of Proposition 9.6 will be required for the proof of the change-of-variables formula in Section 11. Recall that for any G ⊂ Geo(X), we have D(G)(x) = {t ∈ [0, 1] : x = γ t , γ ∈ G} and D(G)(t) = {x ∈ X : x = γ t , γ ∈ G} = e t (G). To simplify the notation, we directly write G(x) instead of D(G)(x).
Proposition 9.7. With the same assumptions as in Proposition 9.6, we have for any t ∈ (0, 1): The same result also holds for t = 0 if we dispense with the factor of 2 in the denominator.
The proof follows the same line as the proof of [26, Theorem 2.1]. We include it for the reader's convenience.
Proof. Fix t ∈ (0, 1). Suppose in the contrapositive that the claim is false: Consider the complement G(x) c = {t ∈ [0, 1] : x / ∈ e t (G)}, and deduce the existence of a sequence ε n → 0 such that lim n→∞ et(G) 2ε n m(dx) > 0. m(E(s)) L 1 (ds) = lim so there must be a sequence of {s n } n∈N converging to t so that m(E(s n )) ≥ κ, for some κ > 0. Repeating the above argument for the case t = 0 with the appropriate obvious modifications, the latter conclusion also holds in that case as well. Note that: Since the compact sets e sn (G) converge to e t (G) in Hausdorff distance, for each ε > 0 there exists n(ε) such that for all n ≥ n(ε) it holds e t (G) ε ⊃ e sn (G) (and vice-versa), where A ε := {y ∈ X ; dist(y, A) ≤ ε}. It follows that: m(e t (G) ε ) ≥ m e t (G) \ e sn (G) + m(e sn (G)) ≥ κ + m(e sn (G)).
Taking the limit as n → ∞, the continuity property of Proposition 9.6 (lower semi-continuity if t = 0) implies that for each ε > 0: with κ independent of ε. Since m(e t (G)) = lim ε→0 m(e t (G) ε ) we obtain a contradiction, and the claim is proved.
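The limiting step at the end of this proof can be spelled out as follows. This is only a sketch reconstructed from the surrounding text: we assume that E(s) denotes e_t(G) \ e_s(G) (the displayed definition is missing in this copy), that Proposition 9.6 yields (lower semi-)continuity of s ↦ m(e_s(G)) at s = t, and that m(e_t(G)) is finite because e_t(G) is compact.

```latex
% Sketch only; E(s_n) := e_t(G) \setminus e_{s_n}(G) is our reading of the lost definition.
\begin{align*}
m\big(e_t(G)^{\varepsilon}\big)
  &\ge m\big(e_t(G)\setminus e_{s_n}(G)\big) + m\big(e_{s_n}(G)\big)
   \ge \kappa + m\big(e_{s_n}(G)\big)
   && (n \ge n(\varepsilon)), \\
m\big(e_t(G)^{\varepsilon}\big)
  &\ge \kappa + \liminf_{n\to\infty} m\big(e_{s_n}(G)\big)
   \ge \kappa + m\big(e_t(G)\big)
   && (\text{Proposition 9.6}), \\
m\big(e_t(G)\big)
  &= \lim_{\varepsilon\to 0} m\big(e_t(G)^{\varepsilon}\big)
   \ge \kappa + m\big(e_t(G)\big) > m\big(e_t(G)\big).
\end{align*}
```

The last line is the contradiction, since κ > 0 does not depend on ε.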
Corollary 9.8. With the same assumptions as in Proposition 9.6, and assuming that supp(m) = X, we have: ν(e −1 0 (X 0 ) ∩ G + ϕ ) = 0, where ϕ is an associated Kantorovich potential to the c-optimal-transport problem from µ 0 to µ 1 with c = d 2 /2. In particular: Recall from Section 3 that G ϕ ⊂ Geo(X) denotes the set of ϕ-Kantorovich geodesics, G + ϕ denotes the subset of geodesics in G ϕ having positive length, and X 0 = e [0,1] (G 0 ϕ ) denotes the subset of null geodesic points in X. Necessarily ν(G ϕ ) = 1. The assumption supp(m) = X guarantees by Lemma 6.13 that (X, d) is proper and geodesic, so that the results of Part I are in force; by Remark 6.10 this poses no loss in generality.

Two families of conditional measures
The next two sections will be devoted to the study of W 2 -geodesics over (X, d, m), when (X, d, m) is assumed to be essentially non-branching and verifies CD 1 (K, N ). By Remark 8.8, we also assume supp(m) = X. We will use Proposition 8.13 as an equivalent definition for CD 1 (K, N ). By Proposition 8.9 and Remark 8.11, X also verifies MCP(K, N ), and so Theorem 6.14 applies. In addition, it follows by Lemma 6.13 that (X, d) is geodesic and proper, and so the results of Part I apply.
Fix µ 0 , µ 1 ∈ P 2 (X, d, m), and denote by ν the unique element of OptGeo(µ 0 , µ 1 ). As usual, we denote µ t := (e t ) ♯ ν ≪ m for all t ∈ [0, 1], and set: Fix also an associated Kantorovich potential ϕ : X → R for the c-optimal transport problem from µ 0 to µ 1 , with c = d 2 /2. Recall that G ϕ ⊂ Geo(X) denotes the set of ϕ-Kantorovich geodesics and that necessarily ν(G ϕ ) = 1. We further recall from Section 3 that the interpolating Kantorovich potential and its time-reversed version at time t ∈ (0, 1) are defined for any x ∈ X as: with ϕ 0 =φ 0 = ϕ and ϕ 1 =φ 1 = −ϕ c . By Proposition 3.6 we have, for all t ∈ (0, 1), It will be convenient from a technical perspective to first restrict ν, by inner regularity of Radon measures, Corollary 9.5 (applied to both pairs µ 0 , µ 1 and µ 1 , µ 0 ), Proposition 9.7 and Corollary 6.15, to a suitable good compact subset G ⊂ G + ϕ with ν(G) ≥ ν(G + ϕ ) − ε. Recall that G + ϕ was defined in Section 3 as the subset of geodesics in G ϕ having positive length, and note that the length function ℓ : Geo(X) → [0, ∞) is continuous and hence is bounded away from 0 and ∞ on a compact G ⊂ G + ϕ . -The map e t | G : G → X is injective (and we will henceforth restrict e t to G or its subsets).
Assumption 10.2. We will assume in this section and in Subsection 11.1 that: ν is concentrated on a good G ⊂ G + ϕ .
We will dispose of this assumption in the Change-of-Variables Theorem 11.4.

L 1 partition
For s ∈ [0, 1] and a s ∈ R, we recall the following notation (introduced in Section 4 for G = G ϕ , but now we treat a general G ⊂ G ϕ as above): G as = G as,s := {γ ∈ G : ϕ s (γ s ) = a s }.
As G is compact and e s : G → X is continuous, e s (G) is compact. When s ∈ (0, 1), ϕ s : X → R is continuous by Lemma 3.2, and hence G as is compact as well.
The structure of the evolution of G as , i.e. e [0,1] (G as ) = {γ t : t ∈ [0, 1], γ ∈ G as }, will be the topic of this subsection, so the properties we prove below are only meaningful for a s ∈ ϕ s (e s (G)) (and moreover typically when m(e [0,1] (G as )) > 0). It will be convenient to use a short-hand notation for the signed-distance function from a level set of ϕ s , d as := d ϕs−as (see (8.2)).
Lemma 10.3. Let (X, d) be a geodesic space. For any s ∈ [0, 1] and a s ∈ ϕ s (e s (G)) the following holds: for each γ ∈ G as and 0 ≤ r ≤ t ≤ 1, (γ r , γ t ) ∈ Γ da s . In particular, the evolution of G as is a subset of the transport set associated to d as :
Proof. Fix γ ∈ G as . If s ∈ [0, 1) then for any p ∈ {ϕ s = a s }: by Lemma 3.3 and Proposition 3.6 (2), and hence d(γ s , γ 1 ) ≤ d(p, γ 1 ); the latter also holds for s = 1 trivially. Similarly, if s ∈ (0, 1] then for any q ∈ {ϕ s = a s }: and therefore d(γ 0 , γ s ) ≤ d(γ 0 , q), with the latter also holding for s = 0 trivially. Consequently, for any p, q ∈ {ϕ s = a s }: Taking infimum over p and q it follows that: where the sign of d as was determined by the fact that s → ϕ s (γ s ) is decreasing (e.g. by Lemma 3.3). On the other hand: thanks to the 1-Lipschitz regularity of d as ensured by Lemma 8.4 since (X, d) is geodesic. Therefore equality holds and (γ 0 , γ 1 ) ∈ Γ da s . The assertion then follows by Lemma 7.1.
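The chain of inequalities behind the proof of Lemma 10.3 can be sketched as follows. The displays are missing in this copy, so this is a reconstruction under stated assumptions rather than the authors' exact formulas: we assume that Γ_u denotes the pairs (x, y) with u(x) − u(y) = d(x, y), and that the signed distance satisfies d_{a_s}(γ_0) ≥ 0 ≥ d_{a_s}(γ_1), which is how we read the remark about the sign.

```latex
% Reconstruction; the sign convention d_{a_s}(\gamma_0) \ge 0 \ge d_{a_s}(\gamma_1) is an assumption.
\begin{align*}
d(\gamma_0,\gamma_1)
  &= d(\gamma_0,\gamma_s) + d(\gamma_s,\gamma_1)
   \le d(\gamma_0,q) + d(p,\gamma_1)
   \qquad \forall\, p,q \in \{\varphi_s = a_s\}, \\
d(\gamma_0,\gamma_1)
  &\le \operatorname{dist}\big(\gamma_0,\{\varphi_s = a_s\}\big)
     + \operatorname{dist}\big(\gamma_1,\{\varphi_s = a_s\}\big)
   = d_{a_s}(\gamma_0) - d_{a_s}(\gamma_1), \\
d_{a_s}(\gamma_0) - d_{a_s}(\gamma_1)
  &\le d(\gamma_0,\gamma_1)
   \qquad (\text{$d_{a_s}$ is $1$-Lipschitz}).
\end{align*}
```

Combining the last two lines forces equality, which is the statement (γ_0, γ_1) ∈ Γ_{d_{a_s}} under the assumed convention.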
Next, recall by Proposition 8.13 applied to the function u = d as , that according to the equivalent characterization of CD 1 u (K, N ), the following disintegration formula holds: where Q is a section of the partition of T b da s given by the equivalence classes {R b da s (α)} α∈Q , and forq as -a.e. α ∈ Q, the probability measurem as α is supported on the transport ray X α = R b da s (α) = R da s (α) and (X α , d,m as α ) verifies CD(K, N ). It follows by Lemma 10.3 that: It will be convenient to make the previous disintegration formula a bit more explicit. We refer to the Appendix for the definition of CD(K, N ) density and the (suggestive) relation to onedimensional CD(K, N ) spaces. Recall that ℓ s (γ s ) = ℓ(γ) for all γ ∈ G.

(10.4)
with q as a Borel measure concentrated on e s (G as ), g as : e s (G as ) × [0, 1] → X is defined by g as (β, ·) = e −1 s (β) and is Borel measurable, for q as -a.e. β ∈ e s (G as ), , h as β is a CD(ℓ s (β) 2 K, N ) probability density on [0, 1] vanishing at the end-points, and the map e s (G as ) Proof. First, we may assume that m(e [0,1] (G as )) > 0 since otherwise there is nothing to prove. We will abbreviate u = d as .
Step 2. We also claim that: Indeed, since α ∈ Q ⊂ T b u then R u (α) is a transport ray by Lemma 7.8, and since u = d as is affine (with slope 1) on a transport ray, R u (α) must intersect {d as = 0} = {ϕ s = a s }, and hence e s (G as ), at most once. It follows by Step 1 that γ 1 s = γ 2 s , and so by injectivity of e s | G : G → X, that γ 1 = γ 2 .
Step 4. Recall that for all α ∈Q of fullq as measure,m as α is supported on the transport ray R u (α) = R b u (α) and (R u (α), d,m as α ) verifies CD (K, N ). Consequently, for such α's,m as α gives positive mass to any relatively open subset of R u (α) and does not charge points. It follows that for α ∈Q, since e [0,1] (γ α ) ⊂ R u (α) has non-empty relative interior, it holds that: In particular, Q 1 coincides up to aq as -null set with theq as -measurable set Q 2 := {α ∈ Q ;m as α (e [0,1] (G as )) > 0}, and thus Q 1 is itselfq as -measurable. In fact, it is easy to show that Q 1 coincides with an analytic set up to aq as -null-set.
Step 5. Recalling that e .
Step 6. Recall that our original disintegration (10.2) was on (Q, Q,q as ), so that there exists Q ⊂ Q of fullq as measure so thatQ ∈ B(T b u ) and Q ⊃ B(Q). It follows that we may find Q ∋Q 1 ⊂ Q 1 withq as (Q 1 \Q 1 ) = 0 so that Q ⊃ B(Q 1 ). Let us now push-forward the measure space (Q 1 , Q ∩ Q 1 ,q as ) via the Borel measurable map e s • η (by Step 3), yielding the measure space (e s (G 1 as ), S , q as ), which is thus guaranteed to satisfy S ⊃ B(S), whereS := e s • η(Q 1 ) is of full q as measure. Restricting the space toS and abusing notation, we obtain (S, S , q as ) with S ⊃ B(S), implying that q as is a Borel measure concentrated onS ⊂ e s (G 1 as ) ⊂ e s (G as ). Denoting m  N ) and is of full support, and is therefore isometric to (I as β , |·| ,ĥ as β L 1 I as β ), where I as β := [0, ℓ s (β)] andĥ as β is a CD(K, N ) probability density on I as β (see Definition A.1). To prevent measurability issues, we will use the convention thatĥ as β vanishes at the end-points of I as β .
As G as is compact, it follows that graph(g as ) is analytic, and hence (see [73,Theorem 4.5.2]) g as is Borel measurable.
This concludes the proof.
It will be convenient to invert the order of integration in (10.4) using Fubini's Theorem: We thus define: m as t := g as (·, t) ♯ (h as · (t) · q as ) , so that the final formula is:
Remark 10.5. Since for q as -a.e. β, the CD(ℓ 2 s (β)K, N ) density h as β must be strictly positive on (0, 1) (see Appendix), by multiplying and dividing q as by the positive q as -measurable function β → h as β (s) (recall that s ∈ (0, 1)), we may always renormalize and assume that h as β (s) = 1. Note that this does not affect the definition of m as t above. This normalization ensures that m as s = q as so that: m as t := g as (·, t) ♯ (h as · (t) · m as s ) . (10.6)
Remark 10.6. Note that since q as is concentrated on e s (G as ), by definition m as t is concentrated on e t (G as ) for all t ∈ (0, 1). By Corollary 4.3, the latter sets are disjoint for different t's in (0, 1) (recall that s ∈ (0, 1) and that G ⊂ G + ϕ ). Formula (10.5) can thus be seen again as a disintegration formula over a partition. In particular, for any s ∈ (0, 1) and 0 < t, τ < 1 with t ≠ τ , the measures m as t and m as τ are mutually singular.
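For orientation, the Fubini step leading to (10.5) can be written out as below. The displayed forms of (10.4) and (10.5) are missing in this copy, so the first and last lines are our reconstruction from the definition of m^{a_s}_t and from Remark 10.6; they are an assumption, tested here against an arbitrary bounded Borel function f.

```latex
% Sketch: assumed form of (10.4) on the first line, assumed form of (10.5) on the last.
\begin{align*}
\int_{e_{[0,1]}(G_{a_s})} f \, dm
  &= \int_{e_s(G_{a_s})} \int_0^1 f\big(g^{a_s}(\beta,t)\big)\, h^{a_s}_\beta(t)\, dt \; q^{a_s}(d\beta) \\
  &= \int_0^1 \int_{e_s(G_{a_s})} f\big(g^{a_s}(\beta,t)\big)\, h^{a_s}_\beta(t)\, q^{a_s}(d\beta)\, dt \\
  &= \int_0^1 \Big( \int_X f \, dm^{a_s}_t \Big)\, dt ,
  \qquad m^{a_s}_t := g^{a_s}(\cdot,t)_\sharp \big( h^{a_s}_\cdot(t)\, q^{a_s} \big).
\end{align*}
```

In measure notation the last line reads m⌞e_{[0,1]}(G_{a_s}) = ∫_{[0,1]} m^{a_s}_t L^1(dt), which is consistent with viewing (10.5) as a disintegration over the disjoint sets e_t(G_{a_s}).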
Proposition 10.7. For any s ∈ (0, 1) and a s ∈ ϕ s (e s (G)), the map Proof. Recall that the definition of m as t does not depend on the last normalization we performed, when we imposed that h as β (s) = 1, so we revert to the normalization that h as β is a CD(ℓ s (β) 2 K, N ) probability density on [0, 1], and hence q as = m(e [0,1] (G as )). The second assertion follows since whenever the latter mass is positive, by positivity of a CD(K, N ) density in the interior of its support (see Appendix): ∀t ∈ (0, 1) m as t (e t (G as )) = m as t = h as β (t)q as (dβ) > 0.
Similarly, it follows by Lemma A.8, the lower semi-continuity of h as β at the end-points (see Appendix), and assumption (10.1), that max t∈[0,1] h as β (t) is uniformly bounded in a s and β for q as -a.e. β by a constant C > 0 as above, implying that: yielding the third assertion. Now note that the density (0, 1) ∋ t → h as β (t) is continuous (see Appendix) for q as -a.e. β, and the same trivially holds for the map [0, 1] ∋ t → g as (β, t). We conclude by Dominated Convergence that for any f ∈ C b (X) and any t ∈ (0, 1): yielding the first assertion, and concluding the proof.

L 2 partition
For each t ∈ (0, 1), we can find a natural partition of e t (G) ⊂ e t (G ϕ ) consisting of level sets of the time-propagated intermediate Kantorovich potentials Φ t s introduced in Section 4. Recall that the function Φ t s (s, t ∈ (0, 1)) was defined as: and interpreted on e t (G ϕ ) as the propagation of ϕ s from time s to t along G ϕ , i.e. Φ t s = ϕ s • e s • e −1 t . In particular, for any γ ∈ G, Φ t s (γ t ) = ϕ s (γ s ), and e t (G as ) ∩ e t (G bs ) = ∅ as soon as a s ≠ b s (see Corollary 4.1). It follows that for any s, t ∈ (0, 1), we can consider the partition of the compact set e t (G) given by its intersection with the family {Φ t s = a s } as∈R ; as usual, it will be sufficient to take a s ∈ Φ t s (e t (G)) = ϕ s (e s (G)). Since Φ t s is continuous, the Disintegration Theorem 6.18 yields the following essentially unique disintegration of m⌞e t (G) strongly consistent with respect to the quotient-map Φ t s : so that for q t s -a.e. a s ,m t as is a probability measure concentrated on the set e t (G) ∩ {Φ t s = a s }. To make this disintegration more explicit, we show:
Proposition 10.8.
(1) For any s, t, τ ∈ (0, 1), the quotient measures q t s and q τ s are mutually absolutely continuous.
(2) For any s, t ∈ (0, 1), the quotient measure q t s is absolutely continuous with respect to Lebesgue measure L 1 on R.
Proof. Recall that q t s = (Φ t s ) # m et(G) . (1) For any Borel set I ⊂ R, note that: since µ t ≪ m and its density ρ t is assumed to be positive on e t (G) where µ t is supported (see Definition 10.1). But µ τ = (e τ • e −1 t ) ♯ µ t , and so: It follows that q t s (I) > 0 iff q τ s (I) > 0, thereby establishing the first assertion. (2) Thanks to the first assertion, it is enough to only consider the case t = s in the second one. Recall that Φ s s = ϕ s . Then the claim boils down to showing that m(ϕ −1 s (I) ∩ e s (G)) = 0 whenever I ⊂ ϕ s (e s (G)) is a compact set with L 1 (I) = 0.
By compactness, we fix a ball B r (o) containing e s (G). Since ϕ s is Lipschitz continuous on bounded sets (Corollary 3.10 (1)), possibly using a cut-off Lipschitz function over B r (o), we may assume that ϕ s has bounded total variation measure Dϕ s (we refer to [53] and [9] for all missing notions and background regarding BV-functions on metric-measure spaces). From the local Poincaré inequality (see Remark 7.5 and [53,page 992]) and the doubling property (see Lemma 6.13 and recall that supp(m) = X), it follows that the total variation measure of ϕ t is absolutely continuous with respect to m, and that: follows that for q s s -a.e. a s ∈ ϕ s (e s (G)) the measure m s as is absolutely continuous with respect to the Hausdorff measure of codimension one (see [1] for more details).
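The argument for part (1) of Proposition 10.8 can be condensed into the following computation. It is a sketch under the standing assumptions on a good G (injectivity of e_t on G and ρ_t > 0 on e_t(G)); the notation G_I is introduced here only for illustration and does not appear in the original text.

```latex
% G_I is hypothetical notation: the geodesics in G whose \varphi_s-value at time s lies in I.
\begin{align*}
G_I &:= \{\gamma \in G \,:\, \varphi_s(\gamma_s) \in I\},
\qquad
e_t(G) \cap (\Phi^t_s)^{-1}(I) = e_t(G_I), \\
q^t_s(I) &= m\big(e_t(G_I)\big) > 0
\iff \mu_t\big(e_t(G_I)\big) = \int_{e_t(G_I)} \rho_t \, dm > 0, \\
\mu_t\big(e_t(G_I)\big) &= \nu\big(e_t^{-1}(e_t(G_I))\big) = \nu(G_I)
\qquad \text{(independent of } t \in (0,1)\text{)}.
\end{align*}
```

Since ν(G_I) does not depend on t, positivity of q^t_s(I) and of q^τ_s(I) are equivalent, which is the mutual absolute continuity claimed in (1).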
Employing the previous proposition, we define: obtaining from (10.7) the following disintegration (for every s, t ∈ (0, 1)): with m t as concentrated on e t (G as ), for L 1 -a.e. a s ∈ ϕ s (e s (G)). We now shed light on the relation of the above disintegration to L 2 -Optimal-Transport, by relating it to another disintegration formula for ν, the unique element of OptGeo(µ 0 , µ 1 ). Observe that the family of sets {G as } as∈R is a partition of G and that G as = {ϕ s • e s = a s }. Since the quotient-map ϕ s • e s : Geo(X) → R is continuous and G is compact, the Disintegration Theorem 6.18 ensures the existence of an essentially unique disintegration of ν strongly consistent with ϕ s • e s : ν = ∫ ϕs(es(G)) ν as q ν s (da s ), (10.11) so that for q ν s -a.e. a s ∈ ϕ s (e s (G)), the probability measure ν as is concentrated on G as . Clearly q ν s (ϕ s (e s (G))) = ‖ν‖ = 1.
(1) For any s ∈ (0, 1), the quotient measure q ν s is mutually absolutely continuous with respect to q s s , and in particular it is absolutely continuous with respect to L 1 .
Pushing forward both sides via the evaluation map e t given t ∈ (0, 1), we obtain: with q ν s (a s ) · (e t ) # ν as concentrated on e t (G as ) for L 1 -a.e. a s ∈ ϕ s (e s (G)). On the other hand, multiplying both sides of (10.10) by ρ t (which is supported on e t (G)), we obtain: with ρ t · m t as concentrated on e t (G as ) for L 1 -a.e. a s ∈ ϕ s (e s (G)). By the essential uniqueness of the disintegration (Theorem 6.18), noting that ϕ s (e s (G)) is compact, (10.12) immediately follows. As ρ t > 0 on e t (G) (see Definition 10.1) and q ν s (a s ) ∈ (0, ∞) for q ν s -a.e. a s ∈ ϕ s (e s (G)), the "in particular" part of (2) is also established.
Finally, by Fubini's theorem, it follows that for each s ∈ (0, 1) and q ν s -a.e. a s ∈ ϕ s (e s (G)), (10.12) holds with q ν s (a s ) ∈ (0, ∞) for L 1 -a.e. t ∈ (0, 1). Note that for q ν s -a.e. a s ∈ ϕ s (e s (G)), the curve t → (e t ) ♯ ν as is a W 2 -geodesic (since ν as is concentrated on G as ⊂ G). This establishes (3), thereby concluding the proof. respectively. Moreover, both m t as and m as t are concentrated on e t (G as ), for each t ∈ (0, 1) for L 1 -a.e. a s ∈ ϕ s (e s (G)), and for each a s ∈ ϕ s (e s (G)) and all t ∈ (0, 1), respectively, so that the above disintegrations are strongly consistent with respect to the corresponding partition. The goal of the first subsection, in which we retain Assumption 10.2, is to prove that m t as and m as t are equivalent measures; in particular we look for an explicit formula for the corresponding densities. In the second subsection, we deduce a change-of-variables formula for the density along geodesics, discarding Assumption 10.2.

Equivalence of conditional measures
Recall that Assumption 10.2 is still in force in this subsection. We start with the following auxiliary: Lemma 11.1. For every s, t ∈ (0, 1) and a s ∈ ϕ s (e s (G)), the following limit: holds true in the weak topology.
Proof. By Proposition 10.7, (0, 1) ∋ t → m as t is continuous in the weak topology, and so together with (11.1), we see that for any f ∈ C b (X): thereby concluding the proof.
Remark 11.2. One may similarly show (employing an additional density argument) that for every s, t ∈ (0, 1) and L 1 -a.e. a s ∈ ϕ s (e s (G)), the following limit: holds true in the weak topology, but this will not be required.
Proof of Theorem 11.3.
Step 3. We now claim that for L 1 -a.e. t ∈ (0, 1) including t = s, if f ∈ C b (R) and Ψ ∈ C b (X) then: To this end, we will show that for such t's, both: and: tend to 0 in L 1 (e t (G), m) as ε → 0.
Step 5. To see the claim about II ε , it is clearly enough to show that: Step 5a. We first establish (11.7) for L 1 -a.e. t ∈ (0, 1) (independently of f and Ψ). Since x ∈ e t (G), by Dominated Convergence, it is enough to establish pointwise convergence in (11.7) for m-a.e. x ∈ e t (G).
Step 5b. We next establish (11.7) at t = s. Write: The first expression tends to 0 pointwise for all x ∈ X by Lemma 4.6, and hence by Dominated Convergence also in L 1 (e t (G), m) (since |∂ τ Φ τ s (x)| ≤ L and ℓ s (x) ≤ 1/c uniformly). The second expression tends to 0 in L 1 (e t (G), m) by Proposition 9.7 and the uniform boundedness of ℓ 2 s (x). Step 6. In other words, we have verified in Steps 3-5 the following weak convergence, for L 1 -a.e. t ∈ (0, 1) including at t = s: where recall Φ s s (x) = ϕ s (x) and ∂ t Φ t s | t=s = ℓ 2 s (x). Combining this with Step 1, we deduce that: Integrating this identity against 1 ⊗ ψ with 1 ∈ C b (R) and ψ ∈ C b (X), we obtain: where we used that m as t is concentrated on e t (G as ) ⊂ e t (G) for all t ∈ (0, 1) and a s ∈ ϕ s (e s (G)) in the first expression, and the disintegration (11.1) of m et(G) in the last transition. In other words, we obtained for L 1 -a.e. t ∈ (0, 1) including at t = s: Since m t as is also concentrated on e t (G as ) for all t ∈ (0, 1) and L 1 -a.e. a s ∈ ϕ s (e s (G)), the assertion follows by essential uniqueness of consistent disintegrations (Theorem 6.18). Note that by Step 2, ∂ t Φ t s (x) exists and is positive for L 1 -a.e. t ∈ (0, 1) including at t = s for m-a.e. x ∈ e t (G), and so by (11.1), the same holds for L 1 -a.e. a s ∈ ϕ s (e s (G)) and m t as -a.e. x.

Change-of-Variables Formula
We now obtain the following main result of Sections 10 and 11. At this time, we dispense with Assumption 10.2.
Proof of Theorem 11.4.
Step 1. As explained in the beginning of Section 10, by inner regularity of Radon measures, Corollary 9.5 (applied to both pairs µ 0 , µ 1 and µ 1 , µ 0 ), Proposition 9.7 and Corollary 6.15, there exists a good compact subset G ε ⊂ G + ϕ with ν(G ε ) ≥ ν(G + ϕ ) − ε for any ε > 0 (recall Definition 10.1). Of course, we may assume that G ε is increasing as ε decreases to 0 (say, along a fixed sequence). Fixing ε > 0 and a good G ε , denote ν ε = (1/ν(G ε )) ν⌞G ε and µ ε t := (e t ) ♯ ν ε ≪ m, so that all of the results of Section 10 and Subsection 11.1 apply to ν ε . Note that by Corollary 6.15, we have that 1], and therefore: Also note that as ν ε is concentrated on G ε ⊂ G ϕ , ϕ is still a Kantorovich potential for the associated transport-problem.
Remark 11.5. Observe that all of the results of this section also equally hold forΦ t s in place of Φ t s . Indeed, recall that for all x ∈ X, Φ t s (x) =Φ t s (x) for t ∈G ϕ (x), and that by Corollary 4.5, for a.e. t ∈G ϕ (x). As these were the only two properties used in the above derivation (in particular, in Step 2 of the proof of Theorem 11.3), the assertion follows.

Part III
Putting it all together

Combining Change-of-Variables Formula with Kantorovich 3rd order information
Let (X, d, m) denote an essentially non-branching m.m.s. verifying CD 1 (K, N ). Let µ 0 , µ 1 ∈ P 2 (X, d, m), and let ν be the unique element of OptGeo(µ 0 , µ 1 ) (by Proposition 8.9, Remark 8.11 and Theorem 6.14). Recall that µ t := (e t ) ♯ ν ≪ m for all t ∈ [0, 1], and we subsequently denote by ρ t the versions of the corresponding densities given by Theorem 11.4 (resulting from Corollary 9.5). Finally, denote by ϕ a Kantorovich potential associated to the corresponding optimal transference plan, so that ν(G ϕ ) = 1.
Remark 12.1. It is in fact possible to deduce (A) just from the Change-of-Variables formula (12.5) and without referring to Corollary 9.5. This may be achieved by a careful bootstrap argument, exploiting the separation of variables on the left-hand side of (12.5) and the a-priori estimates of Lemma A.9 in the Appendix on the logarithmic derivative of CD(K 0 , N ) densities. But since we already know (A), and since (A) was actually (mildly) used in the proof of the Change-of-Variables Theorem 11.4, we only mention this possibility in passing. Note that Corollary 9.5 applies to all MCP(K, N ) essentially non-branching spaces, whereas the Change-of-Variables formula requires knowing the stronger CD 1 (K, N ) condition.
In particular, these second Peano derivatives are guaranteed to exist for all t ∈ (0, 1) and are a continuous function thereof.
We have already seen above how (12.5) enabled us to deduce (12.6), thereby gaining an additional order of regularity for ∂ τ | τ =t ℓ 2 τ /2(γ(t)). The purpose of this section is to show that the combination of the Change-of-Variables Formula: for a.e. t, s ∈ (0, 1), (12.7) together with properties (A), (B) and (C) above, forms a very rigid condition, and already implies the following representation for 1/ρ t (γ t ); we formulate this independently of the preceding discussion as follows: where L is concave and Y is a CD(K 0 , N ) density on (0, 1).

Formal Argument
To better motivate the ensuing proof of Theorem 12.3, we begin with a formal argument. Assume that the functions ρ and z are C 2 smooth and that equality holds in (12.7) for all t, s ∈ (0, 1). It follows that the mapping (s, t) → h s (t) is also C 2 smooth. Fix any r 0 ∈ (0, 1), and define the functions L and Y by: Note that by (12.7): As already noted in Lemma 5.7, the concavity of L follows from (C), since: The more interesting function is Y . We have for all r ∈ (0, 1): To handle the last term on right-hand-side above, note that by the separation of variables on the left-hand-side of (12.7), we have by (C) again, after taking logarithms and calculating the partial derivatives in t and s: We therefore conclude that for all r ∈ (0, 1): where the last inequality follows from (B) and the differential characterization of CD(K 0 , N ) densities (applied to h r (t) at t = r). Applying the characterization again, we deduce that Y is a (C 2 -smooth) CD(K 0 , N ) density on (0, 1). This concludes the formal proof that: ρ(r 0 ) ρ(r) = L(r)Y (r) ∀r ∈ (0, 1), with L and Y satisfying the desired properties. In a sense, the latter argument has been tailored to "reverse-engineer" the smooth Riemannian argument, where the separation to orthogonal and tangential components of the Jacobian is already encoded in the Jacobi equation
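Since the displayed formulas of the formal argument are missing in this copy, the following LaTeX sketch reconstructs the computation. It rests on assumptions read off from the surrounding text and not confirmed verbatim: that (12.7) has the form ρ(s)/ρ(t) = h_s(t)/(1 + (t − s) z(t)), that (C) reads z' ≥ z², and that a smooth positive CD(K_0, N) density h satisfies (log h)'' + ((log h)')²/(N − 1) ≤ −K_0.

```latex
% Formal sketch; the form of (12.7) and of (B), (C) below is assumed, all functions taken C^2.
\[
\log L(r) := -\int_{r_0}^{r} z(s)\,ds ,
\qquad
\log Y(r) := \int_{r_0}^{r} \partial_t\big|_{t=s} \log h_s(t)\,ds .
\]
% Take logarithms in (12.7) and differentiate in t at t = s:
\[
-(\log\rho)'(s) = \partial_t\big|_{t=s}\log h_s(t) - z(s)
\quad\Longrightarrow\quad
\log\frac{\rho(r_0)}{\rho(r)} = \log Y(r) + \log L(r) .
\]
% Concavity of L follows from (C):
\[
L' = -zL , \qquad L'' = (z^2 - z')\,L \le 0 .
\]
% For Y, differentiate once more and use the separation of variables in (12.7):
\begin{align*}
(\log Y)''(r)
  &= \partial_t^2\big|_{t=r}\log h_r(t) + \partial_s\partial_t\big|_{t=s=r}\log h_s(t), \\
\partial_s\partial_t \log h_s(t)
  &= \partial_s\partial_t \log\big(1+(t-s)z(t)\big)
  \;\xrightarrow{\;t=s=r\;}\; z(r)^2 - z'(r) \le 0 .
\end{align*}
% Hence, since (\log Y)'(r) = \partial_t|_{t=r} \log h_r(t):
\[
(\log Y)''(r) + \frac{\big((\log Y)'(r)\big)^2}{N-1}
\le
\partial_t^2\big|_{t=r}\log h_r(t) + \frac{\big(\partial_t|_{t=r}\log h_r(t)\big)^2}{N-1}
\le -K_0 .
\]
```

The mixed derivative is where the separation of variables enters: log ρ(s) − log ρ(t) has vanishing mixed derivative, so the mixed derivative of log h_s(t) equals that of log(1 + (t − s) z(t)).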

Rigorous Argument
It is surprisingly tedious to modify the above formal argument into a rigorous one. It seems that an approximation argument cannot be avoided, since the definition of Y above is inherently differential: on one hand we do not know how to check the CD(K 0 , N ) condition for Y synthetically, but on the other hand Y is not even differentiable, so it is not clear how to check the CD(K 0 , N ) condition by taking derivatives. The main difficulty in applying an approximation argument here stems from the fact that we do not know how to approximate {h s } and z by smooth functions {h ε s } and z ε , so that simultaneously:
- z ε is a function of t only, and not of s ;
- and the separation of variables structure of (12.7) is preserved.
Our solution is to note that the main role of the separation of variables in the above formal argument was to ensure that (12.8) holds, and so we will replace the rigid third requirement with the following relaxed one: and δ > 0. Proof of Theorem 12.3.
Step 1 - Redefining h s (t). First, observe that there exists I y ⊂ (0, 1) of full measure so that for all s ∈ I y , (12.7) is satisfied for a.e. t ∈ (0, 1), and hence for all t ∈ (0, 1), since all the functions ρ, {h s } and z are assumed to be continuous on (0, 1). Unfortunately, we cannot extend this to all s ∈ (0, 1) as well, since there may be a null set of s's for which the densities h s (t) do not comply at all with the equation (12.7). To remedy this, we simply force (12.7) to hold for all s, t ∈ (0, 1) by defining: h̃ s (t) := (ρ(s)/ρ(t)) (1 + (t − s)z(t)), s, t ∈ (0, 1), (12.9) and claim that for all s ∈ (0, 1), h̃ s is a CD(K 0 , N ) density on (0, 1). Indeed, for s ∈ I y , h̃ s = h s and there is nothing to check. If s 0 ∈ (0, 1) \ I y , simply note that h̃ s (t) is locally Lipschitz in s ∈ (0, 1) (since ρ(s) is), and hence: h̃ s 0 (t) = lim h̃ s (t) as I y ∋ s → s 0 , for all t ∈ (0, 1). But the family of CD(K 0 , N ) densities on (0, 1) is clearly closed under pointwise limits (it is characterized by a family of inequalities between 3 points), and so h̃ s 0 is a CD(K 0 , N ) density, as asserted.
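The closure under pointwise limits invoked in Step 1 can be made explicit through the three-point inequality. The display below uses the standard definition of a CD(K_0, N) density with the usual σ-coefficients; we assume this agrees with Definition A.1 of the Appendix, which is not reproduced in this excerpt.

```latex
% Standard three-point characterization (assumed to match Definition A.1).
% For all x_0, x_1 \in (0,1) and t \in [0,1], with \theta := |x_1 - x_0|:
\[
h\big((1-t)x_0 + t x_1\big)^{\frac{1}{N-1}}
\;\ge\;
\sigma^{(1-t)}_{K_0,N-1}(\theta)\, h(x_0)^{\frac{1}{N-1}}
+ \sigma^{(t)}_{K_0,N-1}(\theta)\, h(x_1)^{\frac{1}{N-1}} ,
\]
\[
\sigma^{(t)}_{K,N}(\theta) :=
\begin{cases}
\dfrac{\sin\!\big(t\theta\sqrt{K/N}\big)}{\sin\!\big(\theta\sqrt{K/N}\big)} & K > 0,\ K\theta^2 < N\pi^2, \\[2ex]
t & K = 0, \\[1ex]
\dfrac{\sinh\!\big(t\theta\sqrt{-K/N}\big)}{\sinh\!\big(\theta\sqrt{-K/N}\big)} & K < 0 .
\end{cases}
\]
```

For fixed x_0, x_1 and t, the inequality only involves the values of h at three points, and both sides depend continuously on those values. If it holds for every function in a pointwise convergent family, it therefore holds for the limit.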
Step 2 - Properties of z and {h̃ s }. We next collect several additional observations regarding the functions z and {h̃ s }. Recall that ρ (by assumption) and h̃ s (as CD(K 0 , N ) densities) are strictly positive in (0, 1). Together with (12.9) (or directly from (12.7)), this implies that 1 + (t − s)z(t) > 0 for all t, s ∈ (0, 1), and hence, letting s → 0 and s → 1: (D) −1/t ≤ z(t) ≤ 1/(1 − t) for all t ∈ (0, 1). In fact, we already knew this by Theorem 3.11 (3) but refrained from including this into our assumption (C) since this is a consequence of the other assumptions. Furthermore: (E) I x := {t ∈ (0, 1) ; τ → h̃ s (τ ) is differentiable at τ = t for all s ∈ (0, 1)} is of full measure.
Indeed, this follows directly from the definition (12.9) by considering the set of all points t where ρ(t) and z(t) are differentiable. In addition, we clearly have: (F) ∀t ∈ I x , (0, 1) ∋ s → ∂ t h̃ s (t) is continuous.
Step 3 - Defining L and Y . Now fix r 0 ∈ (0, 1), and define the functions L, Y on (0, 1) as follows: log L(r) := − ∫_{r 0}^{r} z(s) ds , log Y (r) := ∫_{r 0}^{r} ∂ t | t=s log h̃ s (t) ds. Clearly, the function L is well defined for all r ∈ (0, 1) as z is assumed locally Lipschitz. As for the function Y , (E) implies that ∂ t | t=s log h̃ s (t) exists for a.e. s ∈ (0, 1), and the fact that the latter integrand is locally integrable on (0, 1) is a consequence of Lemma A.9 in the Appendix, which guarantees a-priori locally-integrable estimates on the logarithmic derivative of CD(K 0 , N ) densities.
We have already verified in Lemma 5.7 that the property $z'(s) \ge z^2(s)$ for a.e. $s \in (0,1)$ implies that $L$ is concave on $(0,1)$, so it remains to show that $Y$ is a $\mathsf{CD}(K_0,N)$ density on $(0,1)$.
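For orientation, the two facts underlying Step 3 can be checked by a formal computation. This is only a sketch, under the assumptions that $\log L(r) = -\int_{r_0}^{r} z(s)\,ds$, that $\tilde h_s(t) = \frac{\rho(s)}{\rho(t)}(1 + (t-s)z(t))$, and that $\rho$ is locally absolutely continuous so that the fundamental theorem of calculus applies to $\log \rho$; it does not replace Lemma 5.7 or the argument for $Y$. First, since $z$ is locally Lipschitz, $L' = -zL$ is locally Lipschitz as well, and for a.e. $r \in (0,1)$:
$$L''(r) = \left(z(r)^2 - z'(r)\right) L(r) \le 0 \quad \text{whenever } z'(r) \ge z(r)^2,$$
so that $L'$ is non-increasing and $L$ is concave. Second, differentiating $\log \tilde h_s(t) = \log \rho(s) - \log \rho(t) + \log(1 + (t-s)z(t))$ in $t$ at $t = s$ gives:
$$\partial_t|_{t=s} \log \tilde h_s(t) = -(\log \rho)'(s) + z(s),$$
and integrating from $r_0$ to $r$:
$$\log Y(r) = \log \frac{\rho(r_0)}{\rho(r)} + \int_{r_0}^{r} z(s)\,ds = \log \frac{\rho(r_0)}{\rho(r)} - \log L(r), \qquad \text{i.e.} \qquad \frac{\rho(r_0)}{\rho(r)} = L(r)\, Y(r).$$
Under these assumptions, the product $L \cdot Y$ recovers the reciprocal density up to the normalization at $r_0$, which is why it suffices to treat the two factors separately.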

Final Results
In this final section, we combine the results obtained in Parts I, II and the previous section, establishing at last the Main Theorem 1.1 and the globalization theorem for the CD(K, N ) condition. We also treat the case of an infinitesimally Hilbertian space.
Throughout this section, recall that we assume $K \in \mathbb{R}$ and $N \in (1,\infty)$.
N). Consequently, we may assume that $\text{supp}(\mathfrak{m}) = X$. By Lemma 6.13 we deduce that $(X,\mathsf{d})$ is proper and geodesic (note that this would be false without the length space assumption above). Note that for geodesic essentially non-branching spaces, it is known that $\mathsf{CD}_{loc}(K,N)$ implies $\mathsf{MCP}(K,N)$: see [30] for a proof assuming non-branching; the same proof works under the essentially non-branching assumption, see the comments after [29, Corollary 5.4]. Consequently, the results of Section 7 apply.
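Recall that given a 1-Lipschitz function $u : X \to \mathbb{R}$, the equivalence relation $R_u^b$ on the transport set $\mathcal{T}_u^b$ induces a partition $\{R_u^b(\alpha)\}_{\alpha \in Q}$ of $\mathcal{T}_u^b$. By Corollary 7.3, we know that $\mathfrak{m}(\mathcal{T}_u \setminus \mathcal{T}_u^b) = 0$ with associated strongly consistent disintegration:
$$\mathfrak{m}\llcorner_{\mathcal{T}_u^b} = \int_Q \mathfrak{m}_\alpha \, \mathfrak{q}(d\alpha),$$
with $\mathfrak{m}_\alpha(R_u^b(\alpha)) = 1$, for $\mathfrak{q}$-a.e. $\alpha \in Q$.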
It was proved in [27] that the $\mathsf{CD}_{loc}(K,N)$ condition ensures that for $\mathfrak{q}$-a.e. $\alpha \in Q$, $(R_u^b(\alpha), \mathsf{d}, \mathfrak{m}_\alpha)$ verifies $\mathsf{CD}(K,N)$ with $\text{supp}(\mathfrak{m}_\alpha) = R_u^b(\alpha)$. Denoting by $X_\alpha$ the closure $\overline{R_u^b(\alpha)}$, Theorem 7.10 ensures that $X_\alpha$ coincides with the transport ray $R_u(\alpha)$ for $\mathfrak{q}$-a.e. $\alpha \in Q$. Consequently, all 4 conditions of the $\mathsf{CD}^1(K,N)$ Definition 8.1 are verified, and the assertion follows.
Proof. By Remark 8.8, $(X,\mathsf{d},\mathfrak{m})$ satisfies $\mathsf{CD}^1(K,N)$ if and only if $(\text{supp}(\mathfrak{m}),\mathsf{d},\mathfrak{m})$ does. By Remark 6.10, the same is true for $\mathsf{CD}(K,N)$. Consequently, we may assume that $\text{supp}(\mathfrak{m}) = X$.

Concluding remarks
We conclude this work with several brief remarks and suggestions for further investigation.
- Note that the proof of Theorem 13.2 in fact yields more than stated: not only does the synthetic inequality (13.2) hold (for all $t_0, t_1 \in [0,1]$), but in fact we obtain for $\nu$-a.e. geodesic $\gamma$ the a-priori stronger disentanglement (or "L-Y" decomposition):
$$\frac{1}{\rho_t(\gamma_t)} = L_\gamma(t)\, Y_\gamma(t) \quad \forall t \in (0,1), \tag{13.4}$$
where $L_\gamma$ is concave and $Y_\gamma$ is a $\mathsf{CD}(\ell(\gamma)^2 K, N)$ density on $(0,1)$. As explained in the Introduction, it follows from [34] that for a fixed $\gamma$, (13.4) is indeed strictly stronger than (13.2). In view of Main Theorem 1.1, this constitutes a new characterization of essentially non-branching $\mathsf{CD}(K,N)$ spaces.
- According to [35, p. 1026], it is possible to localize the argument of [67] and deduce from a strong $\mathsf{CD}_{loc}(K,\infty)$ condition (when $K$-convexity of the entropy is assumed along any $W_2$-geodesic with end-points inside the local neighborhood), that the space is globally essentially non-branching. In combination with our results, it follows that the strong $\mathsf{CD}(K,N)$ condition enjoys the local-to-global property, without a-priori requiring any additional non-branching assumptions.
- It would still be interesting to clarify the relation between the $\mathsf{CD}(K,N)$ condition and the property $\mathsf{BM}(K,N)$ of satisfying a Brunn-Minkowski inequality (with sharp dependence on $K, N$ as in [72]). Note that by Main Theorem 1.1, it is enough to understand this locally on essentially non-branching spaces.
- It would also be interesting to study the $\mathsf{CD}^1(K,N)$ condition on its own, when no non-branching assumptions are made, and to verify the usual list of properties desired of a notion of Curvature-Dimension (see [50,72,28]).
- A natural counterpart of $\mathsf{RCD}(K,N)$ would be $\mathsf{RCD}^1(K,N)$: we will say that a m.m.s. verifies $\mathsf{RCD}^1(K,N)$ if it verifies $\mathsf{CD}^1(K,N)$ and it is infinitesimally Hilbertian. Recall that an $\mathsf{RCD}(K,N)$ space is always essentially non-branching [67], and hence Main Theorem 1.1 immediately yields: $\mathsf{RCD}(K,N) \Rightarrow \mathsf{RCD}^1(K,N)$.
The converse implication would be implied by the following claim which we leave for a future investigation: an RCD 1 (K, N )-space is always essentially non-branching.
- Regarding the novel third-order temporal information on the intermediate-time Kantorovich potentials $\varphi_t$ we obtain in this work: it would be interesting to explore whether it has any additional consequences pertaining to the spatial regularity of solutions to the Hamilton-Jacobi equation in general, and of the transport map $T_{s,t} = e_t \circ e_s|_G^{-1}$ from an intermediate time $s \in (0,1)$ in particular (where $G \subset G_\varphi$ is the subset of injectivity guaranteed by Corollary 6.15). In the smooth Riemannian setting, the map $T_{s,t}$ is known to be locally Lipschitz by Mather's regularity theory (see [76, Chapter 8] and cf. [76, Theorem 8.22]). A starting point for this investigation could be the following bound on the (formal) Jacobian of $T_{s,t}$, which follows immediately from (12.5), Theorem 3.11 (3) and Lemma A.9: for $\mu_s$-a.e. $x$, the Jacobian is bounded above by a function of $s, t, K, N, \ell_s(x)$ only.
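A Appendix - One Dimensional CD(K, N) Densities

Definition A.1. A non-negative function $h$ defined on an interval $I \subset \mathbb{R}$ is called a $\mathsf{CD}(K,N)$ density on $I$, for $K \in \mathbb{R}$ and $N \in (1,\infty)$, if for all $x_0, x_1 \in I$ and $t \in [0,1]$:
$$h(t x_1 + (1-t) x_0)^{\frac{1}{N-1}} \ge \sigma^{(1-t)}_{K,N-1}(|x_1 - x_0|)\, h(x_0)^{\frac{1}{N-1}} + \sigma^{(t)}_{K,N-1}(|x_1 - x_0|)\, h(x_1)^{\frac{1}{N-1}}$$
(recalling the coefficients $\sigma$ from Definition 6.2). While we avoid in this work the case $N = \infty$, it will be useful in this section to also treat the case $N = \infty$, whence the latter condition is interpreted by subtracting 1 from both sides, multiplying by $N-1$, and taking the limit as $N \to \infty$, namely:
$$\log h(t x_1 + (1-t) x_0) \ge (1-t) \log h(x_0) + t \log h(x_1) + \frac{K}{2}\, t(1-t)(x_1 - x_0)^2.$$
For completeness, we will say that $h$ is a $\mathsf{CD}(K,1)$ density on $I$ iff $K \le 0$ and $h$ is constant on the interior of $I$.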
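The limiting procedure behind the case $N = \infty$ of Definition A.1 can be sketched as follows. We assume here the standard form of the distortion coefficients, $\sigma^{(t)}_{K,N-1}(\theta) = \sin\!\big(t\theta\sqrt{K/(N-1)}\big)/\sin\!\big(\theta\sqrt{K/(N-1)}\big)$ for $K > 0$ (with $\sinh$ and $-K$ for $K < 0$, and $\sigma^{(t)}_{0,N-1}(\theta) = t$), which is not restated in this section, and we take $h(x_0), h(x_1) > 0$. Writing $\theta = |x_1 - x_0|$ and expanding as $N \to \infty$:
$$\begin{aligned}
\sigma^{(t)}_{K,N-1}(\theta) &= t + \frac{K\theta^2}{6(N-1)}\, t(1-t^2) + O\!\left((N-1)^{-2}\right), \\
h^{\frac{1}{N-1}} &= 1 + \frac{\log h}{N-1} + O\!\left((N-1)^{-2}\right).
\end{aligned}$$
Since the zeroth-order terms satisfy $(1-t) + t = 1$, subtracting 1 from both sides of the inequality, multiplying by $N-1$ and letting $N \to \infty$ leaves:
$$\log h(t x_1 + (1-t) x_0) \ge (1-t)\log h(x_0) + t \log h(x_1) + \frac{K\theta^2}{6}\Big[(1-t)\big(1-(1-t)^2\big) + t\big(1-t^2\big)\Big],$$
and the bracket equals $3t(1-t)$, which produces the term $\frac{K}{2}\, t(1-t)(x_1-x_0)^2$. Equivalently, $x \mapsto \log h(x) + K\frac{x^2}{2}$ is concave on $I$.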
Unless otherwise stated, we assume in this section that $K \in \mathbb{R}$ and $N \in (1,\infty]$. The following is a specialization to dimension one of a well-known result in the theory of $\mathsf{CD}(K,N)$ mm-spaces, which explains the terminology above. Here we do not assume that a m.m.s. is necessarily equipped with a probability measure.
Proof. The first assertion follows from e.g. [72, Theorem 1.7 (ii)], and the second follows by considering the $\mathsf{CD}(K,N)$ condition for uniform measures $\mu_0, \mu_1$ on intervals of length $\varepsilon$ and $\alpha\varepsilon$, respectively, letting $\varepsilon \to 0$, employing Lebesgue's differentiation theorem, and optimizing over $\alpha > 0$ (e.g. as in the proof of [30, Theorem 4.3]).
Let $h$ be a $\mathsf{CD}(K,N)$ density on an interval $I \subset \mathbb{R}$. A few standard and easy consequences of Definition A.1 are:
• $h$ is also a $\mathsf{CD}(K_2,N_2)$ density for all $K_2 \le K$ and $N_2 \in [N,\infty]$ (this follows from the corresponding monotonicity of the coefficients $\sigma^{(t)}_{K,N-1}(\theta)$ in $K$ and $N$, see e.g. [72,50]).
• h is lower semi-continuous on I and locally Lipschitz continuous in its interior (this is easily reduced to a standard identical statement for concave functions on I).
• h is strictly positive in the interior whenever it does not identically vanish (follows immediately from the definition).
• $h$ is locally semi-concave in the interior, i.e. for all $x_0$ in the interior of $I$, there exists $C_{x_0} \in \mathbb{R}$ so that $h(x) - C_{x_0} x^2$ is concave in a neighborhood of $x_0$ (easily checked for $\mathsf{CD}(K,\infty)$ densities). In particular, it is differentiable in the interior of $I$ with at most countably many exceptions, and twice differentiable there almost everywhere.

A.2 A-priori estimates
We will also require the following a-priori estimates on the supremum and logarithmic derivative of CD(K, N ) densities. Here it is crucial that N ∈ (1, ∞).
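Lemma A.8. Let $h$ denote a $\mathsf{CD}(K,N)$ density on a finite interval $(a,b)$, $N \in (1,\infty)$, which integrates to 1. Then:
$$\sup_{x \in (a,b)} h(x) \le \frac{1}{b-a} \begin{cases} N & K \ge 0, \\ \left( \int_0^1 \big(\sigma^{(t)}_{K,N-1}(b-a)\big)^{N-1} dt \right)^{-1} & K < 0. \end{cases}$$
In particular, for fixed $K$ and $N$, $h$ is uniformly bounded from above as long as $b-a$ is uniformly bounded away from 0 (and from above if $K < 0$).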
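Proof. Given $x_0 \in (a,b)$, we have by the $\mathsf{CD}(K,N)$ condition:
$$1 = \int_a^b h(y)\,dy \ge h(x_0) \left[ (x_0 - a) \int_0^1 \big(\sigma^{(1-t)}_{K,N-1}(x_0 - a)\big)^{N-1} dt + (b - x_0) \int_0^1 \big(\sigma^{(1-t)}_{K,N-1}(b - x_0)\big)^{N-1} dt \right].$$
When $K \ge 0$, the monotonicity of $K \mapsto \sigma^{(t)}_{K,N-1}(\theta)$ implies that $\sigma^{(t)}_{K,N-1}(\theta) \ge \sigma^{(t)}_{0,N-1}(\theta) = t$, and we obtain:
$$1 \ge h(x_0)\,(b-a) \int_0^1 t^{N-1} dt = h(x_0)\, \frac{b-a}{N}.$$
When $K < 0$, one may show that the function $\theta \mapsto \sigma^{(t)}_{K,N-1}(\theta)$ is decreasing on $\mathbb{R}_+$, as this is equivalent to showing that the function $x \mapsto \log \sinh \exp(x)$ is convex on $\mathbb{R}_+$, and the latter may be verified by direct differentiation (and using that $\sinh(x)\cosh(x) \ge x$). Consequently, we obtain:
$$1 \ge h(x_0)\,(b-a) \int_0^1 \big(\sigma^{(t)}_{K,N-1}(b-a)\big)^{N-1} dt,$$
as asserted. We remark that when $K > 0$, one may similarly show that the function $\theta \mapsto \sigma^{(t)}_{K,N-1}(\theta)$ is increasing on $[0, D_{K,N-1})$, and since $\sigma^{(t)}_{K,N-1}(0) = t$, we obtain the previous estimate we employed.
$\in (a,b)$ where $h$ is differentiable. In particular, $\log h(x)$ is locally Lipschitz on $x \in (a,b)$ with estimates depending continuously only on $x, a, b, K, N$.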
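Returning to Lemma A.8, a simple example suggests that the constant $N$ in the case $K \ge 0$ cannot be improved in general. The example is ours and is offered only as a sanity check, for $K = 0$ and the interval $(a,b) = (0,1)$. Consider:
$$h(x) := N x^{N-1}, \quad x \in (0,1), \qquad \int_0^1 h(x)\,dx = 1, \qquad h(x)^{\frac{1}{N-1}} = N^{\frac{1}{N-1}}\, x.$$
Since $\sigma^{(t)}_{0,N-1}(\theta) = t$, being a $\mathsf{CD}(0,N)$ density amounts to concavity of $h^{\frac{1}{N-1}}$, which holds here because $h^{\frac{1}{N-1}}$ is linear. On the other hand:
$$\sup_{x \in (0,1)} h(x) = N = \frac{N}{b-a},$$
so the bound of Lemma A.8 is attained (as a supremum) in this case.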
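where the last equality follows from the usual integration by parts formula and Leibniz rule since $(\log h(y))\,\psi_\varepsilon(x-y)$ is absolutely continuous. Furthermore:
$$(\log h_\varepsilon)''(x) = \int \log h(y)\, \frac{d^2}{(dx)^2} \psi_\varepsilon(x-y)\,dy = \int \log h(y)\, \frac{d^2}{(dy)^2} \psi_\varepsilon(x-y)\,dy \le \int (\log h)''(y)\, \psi_\varepsilon(x-y)\,dy,$$
where the last inequality follows by Lemma A.11 applied to $g = \log h$, since $h$ is a $\mathsf{CD}(K,\infty)$ density (by monotonicity in $N$), and hence $\log h(x) + K\frac{x^2}{2}$ is concave on $(a,b)$. Putting everything together and applying Jensen's inequality, we obtain:
$$(\log h_\varepsilon)''(x) + \frac{1}{N-1}\big((\log h_\varepsilon)'(x)\big)^2 + K \le \int \left[ (\log h)''(y) + \frac{1}{N-1}\big((\log h)'(y)\big)^2 + K \right] \psi_\varepsilon(x-y)\,dy \le 0,$$
where the last inequality follows since the integrand is non-positive (where it is defined) by Lemma A.5. A final application of Lemma A.3 concludes the proof.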
Proof. The proof is a repetition of the proof of the previous proposition, so we will be brief. Our assumption (A.5) implies that (A.6) is well-defined, and justifies taking two derivatives in t and s under the integral, implying the assertion on smoothness. The first derivative in t under the integral may be integrated by parts, whereas for the second derivative we apply Lemma A.11. A final application of Jensen's inequality as in Proposition A.10 establishes the asserted differential characterization of CD(K, N ), concluding the proof.