Absolutely continuous and BV-curves in 1-Wasserstein spaces

We extend the result of Lisini (Calc Var Partial Differ Equ 28:85–120, 2007) on the superposition principle for absolutely continuous curves in $p$-Wasserstein spaces to the special case of $p=1$. In contrast to the case of $p>1$, it is not always possible to have lifts on absolutely continuous curves. Therefore, one needs to relax the notion of a lift by considering curves of bounded variation, or shortly BV-curves, and replace the metric speed by the total variation measure. We prove that any BV-curve in a 1-Wasserstein space can be represented by a probability measure on the space of BV-curves which encodes the total variation measure of the Wasserstein curve. In particular, when the curve is absolutely continuous, the result gives a lift concentrated on BV-curves which also characterizes the metric speed. The main theorem is then applied for the characterization of geodesics and the study of the continuity equation in a discrete setting.


Introduction
Let $(X, d)$ be a complete and separable metric space and let $\mathcal{P}_p(X)$ be the associated Wasserstein space of order $p \geq 1$, i.e., the space of Borel probability measures on $X$ with finite moment of order $p$, endowed with the (Kantorovitch-Rubinstein-)Wasserstein distance $W_p$. In [8], Lisini proved that, for $p > 1$, any absolutely continuous curve $(\mu_t) \in AC^p(I : \mathcal{P}_p(X))$ over a compact time interval $I \subset \mathbb{R}$ with finite $p$-energy can be represented by a Borel probability measure $\pi$ on continuous curves $(\gamma_t)$ in $X$, that is, $\pi \in \mathcal{P}(C(I : X))$, which satisfies the following properties: (i) $\pi$ is concentrated on $AC^p(I : X) \subset C(I : X)$; (ii) $(e_t)_\# \pi = \mu_t$ for all $t \in I$ (where $e_t$ is the evaluation map, defined by $e_t(\gamma) := \gamma_t$); (iii) the metric derivative $|\dot\mu_t|$ satisfies, for $\mathcal{L}^1$-a.e. $t \in I$,

$$|\dot\mu_t|^p = \int |\dot\gamma_t|^p \, d\pi(\gamma). \tag{1.1}$$

Measures on a path space, like $\pi$ above, are sometimes called path measures. Item (i) tells us that to characterize $(\mu_t)$, we can restrict our attention to a specific set of continuous curves, namely, absolutely continuous curves, or shortly AC-curves. Item (ii) ensures that $\pi$ has the desired time-marginals, or in other words, is a lift of $(\mu_t)$ to the path space. This is also known as a superposition principle since the curve of measures $(\mu_t)$ is obtained by superposing individual curves $(\gamma_t)$ in the underlying space. Finally, Item (iii) states that the metric speed $|\dot\mu_t|$ in the $p$-Wasserstein space can be obtained by averaging the metric speed of the characterizing curves in the base space with respect to the measure $\pi$. Equation (1.1) can be regarded as a minimality property for $\pi$. Indeed, for general lifts satisfying (i)-(ii), one can expect only an inequality ($\leq$) in (1.1) (see [8, Theorem 4]).
The minimal choice, which achieves equality, is in fact constructed using techniques of optimal transport. For Wasserstein geodesics, such lifts, often called optimal dynamical plans, were constructed in an earlier work by Lott and Villani [10, Proposition 2.10 and E.6] for the case $p = 2$ in complete locally compact length spaces. In Lisini's work [8, Theorem 5], where local compactness is no longer required, the lift is constructed for general absolutely continuous curves in $p$-Wasserstein spaces, $p > 1$, and is used in particular for the characterization of geodesics. Later, in [9], Lisini also extended the results above to the so-called Wasserstein-Orlicz distance, where the usual cost function $d^p$ is replaced by a more general function $\psi$ with suitable properties. This extension, however, does not cover the case $d^1$.
In this paper, we study the peculiar case of $p = 1$, where the cost function $d^p$ in the definition of the Wasserstein distance loses its strict convexity. We first provide a simple example (see Example 1.1 below) in which an absolutely continuous curve in a 1-Wasserstein space cannot be lifted to a measure $\pi$ on continuous curves. Nonetheless, we show that a similar superposition result still holds if we relax the notion of lifts. More precisely, we need to consider a larger class of curves, namely, curves of bounded variation, or shortly BV-curves (see also Example 3.5).
When considering the case $p > 1$, it is well known that the space of absolutely continuous curves with finite $p$-energy is closely connected to the Sobolev space of order 1 with finite $p$-norm via the following "identification-inclusion" relationship

$$W^{1,p}(I : X) \simeq AC^p(I : X) \subset C(I : X), \tag{1.2}$$

which succinctly indicates that every Sobolev curve can be identified with an absolutely continuous representative. Additionally, we have the Borel inclusion of absolutely continuous curves into the space of continuous curves equipped with the topology of uniform convergence, which turns it into a Polish space. In the present paper, where we study the case $p = 1$, these are replaced by the following

$$BV(I : X) \simeq \mathcal{BV}(I : X) \subset D(I : X). \tag{1.3}$$

Here $BV(I : X)$ denotes the space of all BV-curves. As an analogue to (1.2), every BV-curve can be identified through a Borel selection map with a Càdlàg (right-continuous and left-limited) curve of bounded variation. The space of such curves is denoted by $\mathcal{BV}(I : X)$, which is a Borel subset of the larger space of all Càdlàg curves, denoted by $D(I : X)$.
The space $D(I : X)$ can be equipped with a specific metric, which turns it into a Polish space, known as Skorokhod space. It is worth mentioning that, in restriction to $C(I : X)$, the Skorokhod topology is exactly the topology of uniform convergence. In short, we view BV-curves as a Borel subset, up to choosing a representative, of Skorokhod space.
Even though the metric derivative of a BV-curve $u \in BV(I : X)$ exists almost everywhere, as it does for AC-curves, it does not completely capture the "speed" of the curve. The natural replacement for the metric derivative in this situation is the so-called total variation measure $|Du| \in \mathcal{M}(I)$, which also takes the singular part of the speed, in particular the jumps of the curve, into account. Here $\mathcal{M}(I)$ is the set of all positive measures over $I$, and we use $\mathcal{L}^n$ to denote the $n$-dimensional Lebesgue measure.
Equation (1.4) is interpreted as an equality of measures, i.e., for any (non-negative) Borel function $f : I \to \mathbb{R}$, we have $\int_I f(t)\, d|D\mu|(t) = \int \int_I f(t)\, d|D\gamma|(t)\, d\pi(\gamma)$. Theorem 3.1, which we prove first, indicates that (1.4) can be viewed as an optimality condition among all lifts of $(\mu_t)$, as in the case of $p > 1$. To construct $\pi$, we use optimal mass transport as in [8] with modifications for BV-curves.
A motivating example. Here we present an elementary example of an absolutely continuous curve in the 1-Wasserstein space over $\mathbb{R}$ for which it is impossible to have lifts on continuous curves (also discussed briefly in [9, Remark 3.2]). Still, we construct a lift on discontinuous BV-curves. This provides a first insight into our results.
Example 1.1. Consider the curve of probability measures on $\mathbb{R}$ defined as

$$\mu_t := (1 - t)\,\delta_0 + t\,\delta_1, \qquad t \in I := [0, 1].$$

This is a basic situation where the mass is "teleported" from 0 to 1, but not continuously "transported," as shown in Fig. 1 (left). First of all, observe that for any $t, s \in I$,

$$W_p(\mu_t, \mu_s) = |t - s|^{1/p},$$

and thus the metric derivative in the $p$-Wasserstein space exists only for $p = 1$. Therefore, $(\mu_t) \notin AC^p(I : \mathcal{P}_p(\mathbb{R}))$ for all $p \in (1, \infty)$. Nevertheless, $(\mu_t) \in AC^1(I : \mathcal{P}_1(\mathbb{R}))$ and it is even a constant-speed geodesic in the 1-Wasserstein space between $\delta_0$ and $\delta_1$. It is clear that there is no measure $\pi$ on the set of continuous curves, i.e., no measure in $\mathcal{P}(C(I : \mathbb{R}))$, such that $\mu_t = (e_t)_\# \pi$ for all $t \in I$. However, we claim that there exists a measure $\pi \in \mathcal{P}(D(I : \mathbb{R}))$ concentrated on the set of BV-curves such that $\mu_t = (e_t)_\# \pi$ for all $t \in I$ and which, moreover, enjoys the optimality property

$$\int_a^b |\dot\mu_t| \, dt = \int |D\gamma|([a, b]) \, d\pi(\gamma) \tag{1.7}$$

for any $a, b \in I$ with $a < b$. Comparing with (1.4), we point out that the left-hand side of the equation above is nothing but $|D\mu|([a, b])$, since $(\mu_t)$ in this simple example is absolutely continuous.
To construct $\pi$, let us label the particles standing at position $x = 0$ at time 0 with a real-valued parameter $\alpha \in [0, 1]$. These particles gradually jump to position $x = 1$ and, since the rate of mass discharge is constant, we expect the jumps to happen uniformly in time. Let the particle with label $\alpha$ jump from 0 to 1 at time $\alpha$. Then its path is simply expressed using the indicator function as

$$\gamma^\alpha_t := \mathbb{1}_{[\alpha, 1]}(t), \qquad t \in I. \tag{1.8}$$

Some sample paths are plotted in Fig. 1 (right). Now, we consider the uniform measure over $\alpha$ (as jumps happen uniformly) and consequently construct a path measure $\pi$ by

$$\pi := (\alpha \mapsto \gamma^\alpha)_\# \, \mathcal{L}^1|_{[0,1]}. \tag{1.9}$$

We now show that $\pi$ has the desired time-marginals and satisfies (1.7) as well. As for the first claim, notice that for any Borel subset $B \in \mathcal{B}(\mathbb{R})$, we can write

$$(e_t)_\# \pi(B) = \pi(\{\gamma : \gamma_t \in B\}) = \mathcal{L}^1(\{\alpha \in [0, 1] : \mathbb{1}_{[\alpha, 1]}(t) \in B\}) = (1 - t)\,\delta_0(B) + t\,\delta_1(B) = \mu_t(B),$$

where we first used the definition of push-forward and then substituted (1.9) and (1.8). As for the second claim (1.7), we start from the right-hand side and write

$$\int |D\gamma|([a, b]) \, d\pi(\gamma) = \int_0^1 |D\gamma^\alpha|([a, b]) \, d\alpha = \int_0^1 \mathbb{1}_{[a, b]}(\alpha) \, d\alpha = b - a = \int_a^b |\dot\mu_t| \, dt,$$

which proves the claim. As already mentioned, $(\mu_t)$ here is a constant-speed geodesic connecting $\delta_0$ to $\delta_1$. In fact, there are infinitely many constant-speed $W_1$-geodesics between $\delta_0$ and $\delta_1$. In Example 4.7, we present a relatively general way of constructing different geodesics.

Applications. As a direct application of Theorems 3.1 and 3.3, we characterize BV-curves in 1-Wasserstein spaces. Furthermore, we characterize what we call BV-geodesics, i.e., variation-minimizing curves, in 1-Wasserstein spaces. With the understanding (thanks to Theorem 3.3) of continuity and metric derivatives of Wasserstein curves of bounded variation, we then distinguish continuous length-minimizing and constant-speed geodesics among all BV-geodesics. We also discuss why characterizing absolutely continuous curves in 1-Wasserstein spaces using their lifts remains challenging. As seen in Example 1.1, superposing discontinuous curves might result in continuity in the 1-Wasserstein space. On the other hand, continuous curves will always lead to a continuous Wasserstein curve. We investigate the relation between the regularity of the curves at the level of the base space and at the level of the Wasserstein space. The observations can be summarized as follows: superposing curves can only increase the regularity or, to put it differently, irregularities may average out when superposing; see Table 1.
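Returning to Example 1.1, the two claims about the lift can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the curve $\mu_t = (1-t)\delta_0 + t\delta_1$ and the jump paths $\gamma^\alpha = \mathbb{1}_{[\alpha,1]}$, replaces the uniform measure on $\alpha$ by a uniform midpoint grid, and uses the one-dimensional identity $W_1(\mu,\nu) = \int |F_\mu - F_\nu|\,dx$. All function names are hypothetical.

```python
from fractions import Fraction


def w1_on_line(mu, nu):
    """W_1 between finitely supported measures on R, given as {point: mass} dicts.

    Uses the one-dimensional identity W_1 = integral of |F_mu - F_nu|.
    """
    points = sorted(set(mu) | set(nu))
    cdf_mu = cdf_nu = 0
    total = 0
    for left, right in zip(points, points[1:]):
        cdf_mu += mu.get(left, 0)
        cdf_nu += nu.get(left, 0)
        total += abs(cdf_mu - cdf_nu) * (right - left)
    return total


def mu_t(t):
    """The curve of Example 1.1: mass 1 - t at x = 0 and mass t at x = 1."""
    return {0: 1 - t, 1: t}


def path(alpha, t):
    """Jump path gamma^alpha: sits at 0 before time alpha, at 1 from alpha on."""
    return 1 if t >= alpha else 0


def alpha_grid(n):
    """Midpoint grid on [0, 1]; a discrete stand-in for the uniform measure."""
    return [Fraction(2 * k + 1, 2 * n) for k in range(n)]


def marginal_mass_at_one(t, n):
    """Mass that the discretized lift places at x = 1 at time t."""
    return Fraction(sum(path(a, t) for a in alpha_grid(n)), n)


def averaged_variation(a, b, n):
    """Average over alpha of |D gamma^alpha|([a, b]); each path has one unit jump."""
    return Fraction(sum(1 for al in alpha_grid(n) if a <= al <= b), n)


s, t = Fraction(1, 4), Fraction(3, 4)
distance = w1_on_line(mu_t(s), mu_t(t))       # compare with |t - s|
mass_at_one = marginal_mass_at_one(s, 1000)   # compare with s
variation = averaged_variation(s, t, 1000)    # compare with t - s
```

Exact rational arithmetic is used so that the comparison with $|t-s|$ is not blurred by rounding; the grid size 1000 is an arbitrary choice.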
Finally, we study the continuity equation in a discrete setting. More precisely, using the lift coming from Theorem 3.3, we show that for any absolutely continuous curve $(\mu_t)$ living in a bounded subset of a (topologically) discrete metric space, there exists $(v_t)$, a suitable discrete analogue of a time-dependent vector field, such that $(\mu_t, v_t)$ satisfies the discrete continuity (or current) equation. We conclude with a discussion on a discrete Benamou-Brenier formula and on the challenges arising if one is interested not only in the metric structure of the discrete space but also in an additional graph structure.
Organisation of the paper. The remainder of the paper is structured as follows:
• In Section 2, we provide preliminary concepts concerning BV-curves in metric spaces and Skorokhod space. Additionally, we prove equivalent definitions of BV-curves in Theorem 2.17. Such a result (which we did not easily find in the literature) makes it more convenient to work with BV-curves in different situations.
• In Section 3, we present and prove the main results, Theorems 3.1 and 3.3. We then provide some examples to shed light on the main results.
• In Section 4, we apply the main theorems to characterize BV-curves and geodesics, to better understand the regularity of curves in superposition, and finally to study the continuity equation in a discrete setting.
Acknowledgement. We thank Matthias Erbar for helpful suggestions and stimulating discussions on related topics. We also thank Tapio Rajala for valuable comments on the paper. The third named author acknowledges the support provided by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through SPP 2026 Geometry at Infinity. The authors also thank the anonymous referee for providing detailed feedback on the manuscript.
Data availability. Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

The $\sigma$-algebra of Borel sets of $X$ is denoted by $\mathcal{B}(X)$. We denote by $\mathcal{P}(X)$ the space of Borel probability measures on $X$, and by $\mathcal{P}_p(X) \subset \mathcal{P}(X)$, $p \geq 1$, its subset of measures with finite $p$-th moment. The space $\mathcal{P}_p(X)$ is endowed with the (Kantorovitch-Rubinstein-)Wasserstein metric $W_p$. Given a map $T : X \to Y$ between two measurable spaces and a probability measure $\mu \in \mathcal{P}(X)$, the push-forward measure (or image measure) is denoted by $T_\# \mu \in \mathcal{P}(Y)$.
2.2. BV-curves in metric spaces. In this subsection, we recall basic definitions and notions related to curves of bounded variation. $L^1(I : X)$ denotes the space of all maps $u : I \to X$ such that $\int_I d(u(t), x) \, dt < \infty$ for some (and thus every) $x \in X$.

Definition 2.1 (Variation). The pointwise variation of a function $u : I \to X$ on a subset $J \subset I$ is defined as

$$\mathrm{Var}(u; J) := \sup \Big\{ \sum_{i=1}^{n-1} d\big(u(t_i), u(t_{i+1})\big) : n \in \mathbb{N},\ t_1 < \dots < t_n,\ t_i \in J \Big\},$$

and its essential variation is defined as $\mathrm{ess\,Var}(u; J) := \inf \{ \mathrm{Var}(v; J) \mid u = v \text{ a.e. on } J \}$.

Definition 2.2 (BV-curves). We call $u \in L^1(I : X)$ a curve of bounded variation, or shortly a BV-curve, if $\mathrm{ess\,Var}(u) := \mathrm{ess\,Var}(u; I) < \infty$. We use $BV(I : X)$ to denote the space of all BV-curves.
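To make the supremum in Definition 2.1 concrete, here is a small sketch that evaluates the partition sums for a real-valued curve. The curve `v` is a hypothetical illustration, not taken from the paper: it decreases at speed 2 on $[0, 1/2)$, jumps up by 1 at $t = 1/2$ and then increases at speed 2, so its pointwise variation is $1 + 1 + 1 = 3$. Each partition sum is a lower bound, and refining the partition can only increase it by the triangle inequality.

```python
def variation_along(u, partition, dist=lambda x, y: abs(x - y)):
    """Sum of d(u(t_i), u(t_{i+1})) along an increasing partition.

    Every such sum is a lower bound for the pointwise variation Var(u).
    """
    return sum(dist(u(s), u(t)) for s, t in zip(partition, partition[1:]))


def v(t):
    """Illustrative right-continuous curve with a unit jump at t = 1/2."""
    return abs(2 * t - 1) + (1.0 if t >= 0.5 else 0.0)


def dyadic(level):
    """Dyadic partition of [0, 1] with 2**level subintervals."""
    n = 2 ** level
    return [k / n for k in range(n + 1)]


coarse = variation_along(v, [0.0, 1.0])  # only sees the endpoints
medium = variation_along(v, dyadic(3))   # sees part of the decrease and the jump
fine = variation_along(v, dyadic(10))    # close to the full variation 3
```

The dyadic partitions are nested, so the three sums are ordered; none of them reaches 3 because a finite partition always misses a small part of the motion just before the jump.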
For a non-decreasing function $f : I \to \mathbb{R}$, we can define its variation measure as the Lebesgue-Stieltjes measure $|Df|$ (see [14, Section 6.3.3]). Using this, we can generalize the notion of variation measure to general metric spaces:

Definition 2.3 (Variation measure). Let $u \in BV(I : X)$. The variation measure of $u$ is defined as the Lebesgue-Stieltjes measure $|Du|$ induced by the non-decreasing function $V : I \to \mathbb{R}$ defined as $V(t) := \mathrm{ess\,Var}(u; (0, t))$.
By the Lebesgue(-Radon-Nikodym) decomposition, the variation measure can be written as

$$|Du| = |\dot u| \, \mathcal{L}^1 + |Du|^J + |Du|^C,$$

where $|\dot u|$ denotes the Radon-Nikodym derivative of the variation measure with respect to the Lebesgue measure, $|Du|^J$ is the purely atomic part, or the "jump part," and the remaining term $|Du|^C$ is the continuous singular part, or the "Cantor part". The density $|\dot u|(t)$ actually coincides almost everywhere with the metric derivative $|\dot u_t|$, hence the notation (see Lemma 2.8 at the end of this subsection). In this paper, we reserve the term metric speed for the metric derivative of absolutely continuous curves. We do not claim, however, that this is common practice in the literature.

Definition 2.4 (Càdlàg curves). A curve $u : I \to X$ is called a Càdlàg curve if it is right-continuous with left limits, i.e., for every $t \in I$ we have that $\lim_{s \searrow t} u(s) = u(t)$ and that $\lim_{s \nearrow t} u(s)$ exists. We use $D(I : X)$ to denote the set of all Càdlàg curves and $\mathcal{BV}(I : X)$ to denote the Càdlàg curves of bounded variation.
The reason for introducing Càdlàg curves here is that any BV-curve admits a Càdlàg representative, i.e., a Càdlàg curve that coincides with it $\mathcal{L}^1$-a.e. This is stated in the lemma below:

Proof. It suffices to consider the nontrivial case when $u$ has finite variation. Since a Càdlàg representative, if it exists, must be unique (up to the value at $t = 1$), we only need to show that any BV-curve has a Càdlàg representative satisfying (2.2). Let $u \in BV(I : X)$. By definition, there is a sequence $u_n : I \to X$ such that $u_n = u$ a.e., $\mathrm{Var}(u_n) < \infty$ and $\mathrm{Var}(u_n) \searrow \mathrm{ess\,Var}(u)$. For each $n$, the function $t \mapsto \mathrm{Var}(u_n; (0, t))$ is non-decreasing, so it has left and right limits at each $t \in (0, 1)$ and is continuous at all $t \in I \setminus N_n$, where $N_n$ is at most countable. Then, by the completeness of $(X, d)$, $u_n$ must have left and right limits at each $t \in (0, 1)$ and be continuous at all $t \in I \setminus N_n$ as well. As the functions $u_n$ coincide a.e., at each $t \in I$ the right limits $u_n(t^+)$ agree for all $n$; we denote the common value by $\tilde u(t)$. Clearly, on $I \setminus \bigcup_n N_n$, $u_n = \tilde u$ for all $n$, ensuring that $\tilde u$ is a representative of $u$. Notice that if a function has right limits everywhere, then its right-limit function is right-continuous. Therefore, $\tilde u$ is right-continuous on $[0, 1)$, and the existence of left limits is a direct consequence of $\tilde u \in BV$. As for its variation, given any $\varepsilon > 0$ and partition 0 where $0 < r_n \ll 1 - t_k$ and assume $u_n(t) := u_n(1^-)$ whenever $t \geq 1$ (in fact, without loss of generality, each $u_n$ can be chosen to be left-continuous at $t = 1$, as this never increases $\mathrm{Var}(u_n)$). After taking the supremum over partitions and letting $\varepsilon \to 0$, one concludes where the last line comes directly from the definition of pointwise variation.
Remark 2.6. Given $E \subset I$ with $\mathcal{L}^1(E) = 1$, if a function $u$ defined on $E$ has finite pointwise variation, then we can extend $u$ to $I$ with $\mathrm{Var}(u; E) = \mathrm{Var}(u; I)$. Indeed, arguing as in Lemma 2.5, the limit $\lim_{E \ni \tau \searrow t} u(\tau)$ exists at each $t \in I$, and for $t \in I \setminus E$ we define $u(t)$ as this limit. On $I \setminus E$, $u$ is right-continuous, so the variation does not increase after the extension.
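Lemma 2.7. The function $\mathrm{ess\,Var} : L^1(I : X) \to [0, \infty]$ is lower semicontinuous.

Proof. Let $u_n \to u$ in $L^1(I : X)$. Without loss of generality, we may assume that $\mathrm{ess\,Var}(u_n) < \infty$ for all $n \in \mathbb{N}$. By Lemma 2.5, we can assume each $u_n$ to achieve (2.2). As $(X, d)$ is complete, up to picking a subsequence, $u_n(t)$ converges to $u(t)$ on some $E \subset I$ with full measure, yielding

$$\mathrm{Var}(u; E) \leq \liminf_{n \to \infty} \mathrm{Var}(u_n; E) \leq \liminf_{n \to \infty} \mathrm{ess\,Var}(u_n).$$

By Remark 2.6, $\mathrm{ess\,Var}(u)$ is bounded by $\mathrm{Var}(u; E)$ and hence $\mathrm{ess\,Var}$ is lower semicontinuous.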
To end this subsection, we briefly comment on the relation between the variation measure and the metric derivative. Recall that the metric derivative of $u : I \to X$ at time $t$ is defined by

$$|\dot u_t| := \lim_{h \to 0} \frac{d(u(t + h), u(t))}{|h|}$$

whenever the above limit exists.
Proof. It is known, e.g. from [1, Theorem 2.2], that the metric derivative $|\dot u_t|$ exists almost everywhere and equals the density $|\dot u|(t)$. The second equality is a general fact for measures on $\mathbb{R}$, by the following argument. Assume that $\mu$ is a locally finite measure on $\mathbb{R}$ with the decomposition $\mu = \rho \mathcal{L}^1 + \mu^s$, where $\mu^s \perp \mathcal{L}^1$. By the Lebesgue differentiation theorem, it suffices to show Otherwise, there exist a Borel set $T \subset \mathbb{R}$ and some $c > 0$ such that $\mathcal{L}^1(T) > 0$ and Based on the standard differentiation theorem for measures (cf. [4, Theorem 2.4.3]), $\mu^s(T) > 0$, which contradicts the fact that $\mu^s$ and $\mathcal{L}^1$ are mutually singular.

2.3. Skorokhod space.
As alluded to in the previous subsection, Càdlàg curves are important in the study of BV-curves. The space of Càdlàg curves $D(I : X)$ can be equipped with a metric, known as the Skorokhod metric, which turns it into a complete and separable space, known as Skorokhod space. Recall from (1.2)-(1.3) that $D(I : X)$ with the Skorokhod topology plays the role of $C(I : X)$ with the topology of uniform convergence. The goal of this subsection is to recall the necessary notions concerning Skorokhod space.
Definition 2.9 (Skorokhod space). For two curves $\gamma, \tilde\gamma \in D(I : X)$, define a distance where the infimum runs over all increasing homeomorphisms $\lambda : I \to I$, and The set $D(I : X)$ equipped with the distance $d_{Sk}$ is called the Skorokhod space.
By definition, a sequence of curves $\gamma_n \in D(I : X)$ converges to $\gamma \in D(I : X)$ if and only if there exist functions $\{\lambda_n\}$ such that $d_{\sup}(\gamma, \gamma_n \circ \lambda_n) \to 0$ and $\|\lambda_n\|_B \to 0$, where the latter ensures that $|\lambda_n(t) - t| \to 0$ uniformly in $t$.
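The role of the time changes $\lambda_n$ can be seen on two unit jump paths with nearby jump times. The sketch below is only an illustration under stated assumptions: the paths, the two-piece linear time change and all function names are hypothetical, and the norm of $\lambda$ is taken to be Billingsley's $\sup_{s \neq t} |\log\frac{\lambda(t) - \lambda(s)}{t - s}|$, which for a piecewise linear map reduces to the largest $|\log(\text{slope})|$. The uniform distance between the two paths is 1, whereas after reparametrization it vanishes and only the size of the time change remains.

```python
import math


def jump_path(jump_time):
    """Unit jump path: 0 before jump_time, 1 from jump_time on (right-continuous)."""
    return lambda t: 1.0 if t >= jump_time else 0.0


def two_piece_time_change(a, b):
    """Increasing homeomorphism of [0, 1], linear on [0, a] and [a, 1], with lam(a) = b."""
    def lam(t):
        if t <= a:
            return b * (t / a)
        return b + (t - a) * (1 - b) / (1 - a)
    return lam


def log_slope_norm(a, b):
    """Largest |log slope| of the two-piece map (assumed form of the norm of lam)."""
    return max(abs(math.log(b / a)), abs(math.log((1 - b) / (1 - a))))


def sup_distance(f, g, grid):
    """Uniform distance between two real-valued paths, evaluated on a finite grid."""
    return max(abs(f(t) - g(t)) for t in grid)


a, b = 0.5, 0.6
gamma, gamma_tilde = jump_path(a), jump_path(b)
lam = two_piece_time_change(a, b)
grid = [k / 1000 for k in range(1001)]

before = sup_distance(gamma, gamma_tilde, grid)                   # jumps do not line up
after = sup_distance(gamma, lambda t: gamma_tilde(lam(t)), grid)  # jumps aligned by lam
size_of_time_change = log_slope_norm(a, b)                        # small when b is close to a
```

Letting $b \to a$ makes the size of the time change tend to 0 while the aligned uniform distance stays 0, which is the convergence criterion stated above; the plain uniform distance stays equal to 1 throughout.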
Theorem 2.10 (Billingsley-Skorokhod). Let $X$ be a complete and separable metric space. Then the Skorokhod space $(D(I : X), d_{Sk})$ is complete and separable.
The proof can be found in [5, Section 12], where the author only studies $D(I : \mathbb{R})$, but the argument carries over verbatim when Euclidean space is replaced by a general complete and separable metric space.
The next lemma shows that the topology induced by the distance $d_{Sk}$ is finer than the $L^1$-topology.

Lemma 2.11. Let $(\gamma_i)_i$ and $\gamma$ be in Here the last term converges to zero due to the almost everywhere continuity of Càdlàg curves (cf. [5, Lemma 12.1]) and the dominated convergence theorem.
Lemma 2.12. The Borel $\sigma$-algebra $\mathcal{B}(D(I : X))$ of the Skorokhod space is equal to the $\sigma$-algebra generated by the evaluation maps. More generally,

$$\mathcal{B}(D(I : X)) = \sigma(e_t : t \in T),$$

where $T \subset I$ is an arbitrary dense subset of $I$ for which $1 \in T$.
Proof. The proof goes as in the real-valued case, see [5]. By Proposition 2.15, it suffices to prove that $\mathcal{B}(D(I : X)) \subset \sigma(e_t : t \in T)$. Notice that by writing $e_0 = \lim_{t_i \searrow 0} e_{t_i}$ for some sequence $(t_i) \subset T$, we obtain Borel measurability of $e_0$. Thus, we may assume that $0 \in T$. The maps $e_{t_J} := (e_{t_1}, \dots, e_{t_n}) : D(I : X) \to X^n$ are $(\sigma(e_t : t \in T), \mathcal{B}(X^n))$-measurable by definition for all $t_J = (t_1, \dots, t_n)$ and $|J| := n \in \mathbb{N}$. Moreover, for a partition t is continuous, and therefore Borel measurable. Let now $(t_{J_n})$, $t_{J_n} \subset T$, be a sequence of partitions of $I$ with mesh $|t_{J_n}|$ going to zero. Then, by the above, the composition $S_n := \iota_{t_{J_n}} \circ e_{t_{J_n}}$ is $(\sigma(e_t : t \in T), \mathcal{B}(D(I : X)))$-measurable. Moreover, we have that $\mathrm{id}_{D(I : X)} = \lim_{n \to \infty} S_n$. Thus the identity is $(\sigma(e_t : t \in T), \mathcal{B}(D(I : X)))$-measurable and hence $\mathcal{B}(D(I : X)) \subset \sigma(e_t : t \in T)$, which concludes the proof.

2.4. Further auxiliary results.
Here we collect some results that are used later in the proofs of the main theorems. The first statement is about the lower semi-continuity of the pointwise variation, which follows immediately by combining Lemma 2.11 with Lemma 2.7 and Lemma 2.5 (and taking into account possible discontinuities at $t = 1$):

Lemma 2.13. The pointwise variation $\mathrm{Var} : D(I : X) \to [0, \infty]$ is lower semi-continuous.

The next proposition concerns the identification in (1.3):

Proposition 2.14 (Borel Selection). For any $\gamma \in BV(I : X)$, we let $\tilde\gamma$ denote the Càdlàg representative that is left-continuous at $t = 1$. Then the selection map $T : BV(I : X) \to D(I : X)$, $\gamma \mapsto \tilde\gamma$, is Borel.

Proof. By Lemma 2.7 and Lemma 2.13, $BV(I : X)$ and $\mathcal{BV}(I : X)$ are Borel subsets of $(L^1(I : X), d_{L^1})$ and $(D(I : X), d_{Sk})$, respectively. Notice that, by definition, the subset $\mathcal{BV}_1$ of all curves in $\mathcal{BV}$ which are left-continuous at $t = 1$ is closed under the Skorokhod metric. So it suffices to show that the following bijection

The following proposition is needed in the proof of (ii) in Theorem 3.3. With it, proving the time-marginal condition reduces to proving it on a dense set of times $t$.
Proposition 2.15. The evaluation map $e_t : D(I : X) \to X$ defined as $e_t(\gamma) := \gamma_t$ is Borel measurable. Moreover, for any $\pi \in \mathcal{P}(D(I : X))$ the function

$$t \mapsto \int \varphi \circ e_t \, d\pi$$

is Càdlàg for every continuous and bounded $\varphi \in C_b(X : \mathbb{R})$.
Proof. Step 1. For $t \in \{0, 1\}$, the map $e_t$ is actually continuous. Therefore it suffices to consider $t \in (0, 1)$. Moreover, it suffices to prove that $\psi \circ e_t$ is Borel measurable for every continuous and bounded $\psi \in C_b(X)$. The rest of the proof follows the lines of the real-valued case, cf. [5]; for $m \in \mathbb{N}$, define $\psi_m : D(I : X) \to \mathbb{R}$ by

$$\psi_m(\gamma) := m \int_t^{t + 1/m} \psi(\gamma(s)) \, ds.$$

We want to show that $\psi_m$ is continuous. Let $\gamma_n \to \gamma$ in $D(I : X)$. Then $\gamma_n(s) \to \gamma(s)$ $\mathcal{L}^1$-almost everywhere. Thus, by dominated convergence, we have that $\psi_m(\gamma_n) \to \psi_m(\gamma)$ when $n \to \infty$. We conclude that $\psi_m$ is continuous. On the other hand, by the right continuity of $\psi \circ \gamma$, we have that $\psi \circ e_t(\gamma) = \lim_{m \to \infty} \psi_m(\gamma)$. Hence, $\psi \circ e_t$ is Borel measurable.
Step 2. Let us first prove the right continuity for all $t \in [0, 1)$. For that, let $\varphi \in C_b(X)$, and $C > 0$ so that $|\varphi| \leq C$. Fix $t_0 \in [0, 1)$ and $\varepsilon > 0$. Write This is possible since $\varphi \circ \gamma$ is right-continuous for every $\gamma \in D(I : X)$. Since $\Gamma_n$ is increasing in $n$, there exists $n_0 \in \mathbb{N}$ so that Thus, the map $t \mapsto \int \varphi \circ e_t \, d\pi$ is right-continuous. The existence of left limits follows along the same lines. The only modifications needed are the following. First of all, in (2.5) the set $\Gamma_n$ is replaced by Second, in (2.6), the estimates are done for $t, s \in [t_0 - \frac{1}{n_0}, t_0)$, from which the existence of left limits follows by Cauchy's criterion.
Finally, the observation below will be useful when dealing with (non-continuous) BV-curves in the Wasserstein space.

Lemma 2.16. Let $\gamma \in D(I : X)$. Then the image of $\gamma$ is precompact.
Proof. Let $C \subset X$ be the collection of all left and right limits of $\gamma$, and let $S := C \setminus \mathrm{Im}(\gamma)$. Clearly $C$ is the closure of the image of $\gamma$. It suffices to prove that for any sequence $(x_i) \subset S$ there exists a subsequence that converges. Indeed, if $(x_i) \subset C$ is such that there exist infinitely many $i$ for which $x_i = \gamma(t_i)$, then we can take a monotone subsequence of $(t_i)$ and, by the existence of left and right limits, conclude that this subsequence converges to a point in $C$. Therefore, let $(x_i) \subset S$. For any $i$, choose $t_i$ so that $d(\gamma(t_i), x_i) < \frac{1}{2^i}$. Then, as before, there exists a monotone subsequence of $(t_i)$, still denoted by $(t_i)$. Let $a = \lim \gamma(t_i) \in C$. Then (for the corresponding subsequence of $(x_i)$) we have that $d(x_i, a) \leq d(x_i, \gamma(t_i)) + d(\gamma(t_i), a) \to 0$.

2.5. Equivalent definitions of BV-curves. It is known, especially in the real-valued case (see, e.g., [7]), that different definitions of BV-curves are equivalent. Although we expect that such a result is also known in the general setting of metric spaces, we did not easily find it in the literature. Thus, for the sake of completeness, we prove the equivalence in the general case here.
(4) ⇒ (1) Let $\{z_i\}_{i \in \mathbb{N}} \subset X$ be a dense set. Define the functions $\varphi_i(\cdot) := d(z_i, \cdot)$. By assumption, $\varphi_i \circ u \in BV(I : \mathbb{R})$. Thus for each $i$, we can take a representative $\phi_i \in D(I : \mathbb{R})$ and Our goal is to find a representative $\tilde u$ whose pointwise variation is finite. Notice that by the density of the set $\{z_i\}$ in $X$, we have Further, using the assumption on the measure $\mu$, for $s, t \in \tilde I$ and $s < t$, we have Now take a decreasing sequence Since $\mu$ is a finite measure, the right-hand side in the equation above goes to 0 as $k \to \infty$. Therefore, $(u(t_i))_{i \in \mathbb{N}}$ is a Cauchy sequence and, by completeness, the limit exists and we have $\tilde u = u$ over $\tilde I$. Finally, by a similar argument as in the proof of Lemma 2.5, equation (2.3), we conclude $\mathrm{Var}(\tilde u) \leq \mu(I) < \infty$.

Then
(1) for all 0 < h < b − a, l b a (h) ≤ l b a (h/2), and in particular, Proof.The first assertion follows simply by triangle inequality: Given arbitrary h, h ′ , by triangle inequality So the continuity of l b a at h boils down to show lim The conclusion follows if we take the uniform step-size 1/2 n and argue as in part (1): Remark 2.19 (Equivalence of variation measures).In Theorem 2.17, we show that five different definitions of the set of BV-curves are in fact equivalent.The proof has an even stronger implication, namely, that the five(-ish) different notions of the variation measure of a BV-curve are equal.To be more precise, given a BV-curve u, the following measures are equal: (5 ) the minimal measure satisfying (2.9).
The fact that all these different approaches lead to the same measure is evident from the corresponding steps in the proof of Theorem 2.17.The single measure obtained in one of the various ways is thus denoted by |Du| and called the variation measure of u.Moreover, we take advantage of the different approaches without mentioning it explicitly.
Remark 2.20. Sometimes it is useful to have bounds as above which hold not only for almost every $s$ and $t$, but in fact everywhere. For this, one needs to consider a specific choice of representative of the BV-curve in question. One choice, which we decided to work with in this paper, is the Càdlàg representative. Since the $D(I : X)$-representative of a BV-curve is unique (up to its value at $t = 1$), if $u \in \mathcal{BV}(I : X)$, then by Lemma 2.5 we indeed have that $d(u(s), u(t)) \leq |Du|((s, t])$ for all $0 \leq s \leq t < 1$.

Main results
3.1. Lifts of AC- and BV-curves in 1-Wasserstein spaces. As introduced in Sections 1 and 2.1, we consider a complete and separable metric space $(X, d)$ and the associated Wasserstein space $\mathcal{P}_1(X)$ of order $p = 1$. Without loss of generality, we fix the time interval to be $I = [0, 1]$.

Theorem 3.1. Let $\pi \in \mathcal{P}(D(I : X))$ be concentrated on $\mathcal{BV}(I : X) \subset D(I : X)$ such that

$$\int |D\gamma|(I) \, d\pi(\gamma) < \infty \tag{3.1}$$

and $\mu_0 := (e_0)_\# \pi \in \mathcal{P}_1(X)$. Then the curve $t \mapsto \mu_t := (e_t)_\# \pi$ belongs to $\mathcal{BV}(I : \mathcal{P}_1(X))$, and

$$|D\mu| \leq \int |D\gamma| \, d\pi(\gamma) \tag{3.2}$$

as measures.
The previous theorem states that any lift $\pi$ of a BV-curve $(\mu_t)$ provides an upper bound for the variation measure of $(\mu_t)$ through equation (3.2). In the next theorem, a measure $\pi$ is constructed, using techniques of optimal transportation, that achieves equality and thus entails a key relation on the variation of the curves. Our main result is the following:

Theorem 3.3 (AC- and BV-curves in 1-Wasserstein spaces). Let $(\mu_t) \in \mathcal{BV}(I : \mathcal{P}_1(X))$. Then there exists a probability measure $\pi \in \mathcal{P}(D(I : X))$ such that (i) $\pi$ is concentrated on $\mathcal{BV}(I : X) \subset D(I : X)$; (ii) $(e_t)_\# \pi = \mu_t$ for all $t \in I$; (iii) the total variation measure $|D\mu|$ satisfies

$$|D\mu| = \int |D\gamma| \, d\pi(\gamma). \tag{3.5}$$

Moreover, the absolutely continuous part $|\dot\mu| \mathcal{L}^1$ of the measure $|D\mu|$, given by the metric derivative, satisfies the identity (3.6) for $\mathcal{L}^1$-a.e. $t \in I$.
Proof.Our proof firmly follows the one of [8, Theorem 5] with modifications for BV-curves established in Section 2.
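For any integer $N \in \mathbb{N}$, we divide $I$ into $2^N$ pieces and denote $t_i := i/2^N$ for $i = 0, \dots, 2^N$. Let $X_i$ represent the $i$-th copy of $X$ and take the product space

$$X_N := X_0 \times X_1 \times \dots \times X_{2^N}.$$

It is always possible (see e.g. [3, Section 5.3]) to find $\eta_N \in \mathcal{P}(X_N)$ such that

$$(\mathrm{Pr}_i)_\# \eta_N = \mu_{t_i} \quad \text{and} \quad (\mathrm{Pr}_{i, i+1})_\# \eta_N \in \mathrm{Opt}(\mu_{t_i}, \mu_{t_{i+1}}),$$

where $\mathrm{Opt}(\mu_{t_i}, \mu_{t_{i+1}})$ is the set of optimal couplings between $\mu_{t_i}$ and $\mu_{t_{i+1}}$, and the maps $\mathrm{Pr}_i$, $\mathrm{Pr}_{i,j}$ are the projections from $X_N$ to the $i$-th and $(i, j)$-th components, respectively. Finally, we define the filling map $\sigma : X_N \to L^1(I : X)$ by

$$\sigma(x_0, \dots, x_{2^N})(t) := x_i \quad \text{for } t \in [t_i, t_{i+1}),$$

and set $\pi_N := \sigma_\# \eta_N \in \mathcal{P}(L^1(I : X))$.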
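The filling map can be sketched in a few lines. This is a hypothetical illustration, assuming that $\sigma$ sends a tuple $(x_0, \dots, x_{2^N})$ to the step curve equal to $x_i$ on $[t_i, t_{i+1})$, which is consistent with the identity $(e_t)_\# \pi_N = \mu_{t_i}$ for $t \in [t_i, t_{i+1})$ used in Step 3. The value assigned at $t = 1$ is a convention chosen here; it is irrelevant in $L^1(I : X)$. The variation of such a step curve is the sum of the distances between consecutive points, which is the quantity controlled by the optimal couplings.

```python
def filling_map(points):
    """Step curve equal to points[i] on [i / n, (i + 1) / n), where n = len(points) - 1.

    In the construction n = 2**N; the value at t = 1 is set to points[n] by convention.
    """
    n = len(points) - 1

    def curve(t):
        i = min(int(t * n), n)  # index of the subinterval containing t
        return points[i]

    return curve


def step_variation(points, dist=lambda x, y: abs(x - y)):
    """Pointwise variation of the step curve: sum of consecutive distances."""
    return sum(dist(x, y) for x, y in zip(points, points[1:]))


# One sample tuple for N = 2 on the real line: the particle jumps from 0 to 1
# between the dyadic times t_1 = 1/4 and t_2 = 1/2.
sample = [0.0, 0.0, 1.0, 1.0, 1.0]
curve = filling_map(sample)
values = [curve(t) for t in (0.0, 0.3, 0.5, 0.9, 1.0)]
total_variation = step_variation(sample)
```

Averaging `step_variation` over tuples drawn from $\eta_N$ gives $\sum_i W_1(\mu_{t_i}, \mu_{t_{i+1}})$, since consecutive components are optimally coupled.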
Combining the two estimates above, we obtain sup which proves (3.7). The precompactness criterion in [8, Theorem 2] guarantees that all sublevels $\{\Phi \leq c\}$ are precompact. For the tightness of $\{\pi_N\}$, it remains to show that all sublevels of $\Phi$ are closed in $L^1(I : X)$. It suffices to prove that $\Phi$ is lower semi-continuous with respect to $L^1$-convergence, which is a consequence of Fatou's Lemma. Indeed, given any $u_n \to u$ in $L^1$ (and it is not restrictive to assume further that $u_n(t) \to u(t)$ for $\mathcal{L}^1$-a.e. $t \in I$), we have In conclusion, by Prokhorov's theorem, there exist $\pi \in \mathcal{P}(L^1(I : X))$ and a subsequence $N_k$ such that $\pi_{N_k} \to \pi$ narrowly in $\mathcal{P}(L^1(I : X))$ as $k \to \infty$.
Step 2 ($\pi$ is concentrated on $BV(I : X)$). As shown at the end of Step 1, the function is lower semi-continuous and bounded from below. So by the narrow convergence of $\pi_N$ where the second inequality comes from (3.8). Therefore, By Theorem 2.17, $\pi$ is concentrated on the Borel subset $BV(I : X) \subset L^1(I : X)$. Considering the push-forward via the Borel selection map $T : BV(I : X) \subset L^1 \to D(I : X)$ from Proposition 2.14, we can construct the probability measure which is concentrated on $\mathcal{BV}(I : X)$.
Step 3 (Proof of (ii) and (iii)). Recall that for any BV-function $u$, Then we can repeat Step 1 to produce (3.9) on each subinterval $[s, t] \subset I$: Together with Theorem 3.1, (iii) will be proved after obtaining (ii). At last, for (ii), fix any test functions $\varphi \in C_b(X)$ and $\xi \in C_b(I)$. By noticing that $u \mapsto \int_{[0,1]} \xi(t) \varphi(u(t)) \, dt$ is continuous on $L^1(I : X)$ and that $(e_t)_\# \pi_N = \mu_{t_i}$ for each $N \in \mathbb{N}$ and $t \in [t_i, t_{i+1})$, we have where the last limit is guaranteed by the continuity and boundedness of $\xi$ and $\varphi$. Since $t \mapsto \mu_t \in D(I : \mathcal{P}_1(X))$, the function is in $D(I : \mathbb{R})$, so it is continuous outside a set of countably many points and, in particular, is Riemann integrable. As a result, the limit of Riemann sums in (3.10) is right-continuous. Therefore, (3.11) holds for all $t \in [0, 1)$ and $\varphi \in C_b(X)$, which implies $(e_t)_\# \pi = \mu_t$.
Proof of the final claim. Suppose first that $(\mu_t)$ is absolutely continuous. Then, in particular, it is a curve of bounded variation and thus (i)-(iii) already hold. To derive (3.6), observe that for $\mathcal{L}^1$-a.e.
where in the third step, we used (3.5). Therefore, the statement follows whenever $(\mu_t)$ is absolutely continuous. We now claim that the argument above also works in the non-absolutely continuous case. Indeed, Lemma 2.8 guarantees the equivalence between $|\dot\mu|(t)$ and the second line as well as the validity of the last equality (recall the notation explained in Section 2.2). In the other steps, absolute continuity was not used. Hence, the conclusion follows.
3.2. Remarks on the main results. In this section, we shed light on the main results by providing some examples. First of all, note that in general, we cannot expect any uniqueness of $\pi$ in Theorem 3.3. Indeed, uniqueness fails even in Lisini's original result [8, Theorem 5] for $p > 1$.
However, when $p > 1$, there are cases where the lift is unique. For instance, when the underlying space is non-branching, the lift of any constant-speed geodesic must be unique (see e.g. [2, Proposition 3.16]). The following example illustrates that in the case $p = 1$, where the cost lacks strict convexity, uniqueness fails throughout, even in Euclidean spaces.
Example 3.4 (Nonuniqueness of lifts). Let $\mu_0$ and $\mu_1$ be two probability measures on $\mathbb{R}$ supported inside $[-2, -1]$ and $[1, 2]$, respectively. Clearly, $t \mapsto \mu_t := (1 - t)\mu_0 + t\mu_1$ is a constant-speed geodesic under the $W_1$-distance, and every coupling between $\mu_0$ and $\mu_1$ is optimal. For any coupling induced by a Borel map $T$, let us define a family of curves labelled by $\alpha \in [0, 1]$ and $x \in \mathrm{supp}(\mu_0)$ in the following way This is in fact a generalized version of the construction in Example 1.1 (in which we had $\mathrm{supp}(\mu_0) = \{0\}$ and $T(x) = 1$). Now, similar to that example, one can check that the measure satisfies (i)-(iii) in the theorem above. Whenever there exist at least two transport maps (e.g. in the case that $\mu_0$ and $\mu_1$ are uniformly distributed), the lift $\pi \in \mathcal{P}(D(I : X))$ of $(\mu_t)$ is no longer unique.
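The two claims at the start of Example 3.4 (every coupling is optimal, and the linear interpolation is a constant-speed geodesic) can be checked by a short computation. The following is only a sketch: it uses the supports $[-2,-1]$ and $[1,2]$ from the example together with Kantorovich-Rubinstein duality, and the symbols $\sigma$ (an arbitrary coupling of $\mu_0$ and $\mu_1$) and $\psi$ (a 1-Lipschitz test function) are our own notation, not the paper's.

```latex
% Sketch under the assumptions of Example 3.4; \sigma and \psi are illustrative notation.
% (a) Every coupling is optimal: for x in [-2,-1] and y in [1,2] we have |x - y| = y - x,
%     so the transport cost depends only on the marginals.
\[
  \int |x-y| \, d\sigma(x,y)
  = \int y \, d\mu_1(y) - \int x \, d\mu_0(x)
  \qquad \text{for every coupling } \sigma \text{ of } (\mu_0,\mu_1).
\]
% (b) Constant speed: \mu_t - \mu_s = (t-s)(\mu_1 - \mu_0), and the class of
%     1-Lipschitz functions is symmetric under \psi \mapsto -\psi.
\[
  W_1(\mu_t,\mu_s)
  = \sup_{\mathrm{Lip}(\psi)\le 1} \int \psi \, d(\mu_t-\mu_s)
  = |t-s| \sup_{\mathrm{Lip}(\psi)\le 1} \int \psi \, d(\mu_1-\mu_0)
  = |t-s| \, W_1(\mu_0,\mu_1).
\]
```

Step (b) does not use the structure of $\mathbb{R}$, so the same identity holds for linear interpolations of measures with finite first moment in any metric space; only step (a) relies on the separation of the two supports.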
Another natural question is to what extent an AC-curve in the 1-Wasserstein space has lifts on AC-curves satisfying the optimality condition (3.5). Recall that Example 1.1 already provides an extreme case where no lift on continuous curves is possible. The example below demonstrates that the existence of lifts on AC-curves does not guarantee the existence of an optimal one.

Example 3.5 (Non-optimality of AC lifts). Take $\gamma \in AC(I : \mathbb{R}^2)$ with unit length and assume $\gamma$ is not length-minimizing, i.e., $d(\gamma_0, \gamma_1) < 1$. Consider $(\mu_t) \subset \mathcal{P}_1(\mathbb{R}^2)$ defined as where First of all, it can be readily checked that $(\mu_t)$ is a constant-speed geodesic. Secondly, there exists a lift $\pi$ of $(\mu_t)$ concentrated on AC-curves. For example, take two families of AC-curves Actually, there is no way for any lift $\pi$ concentrated on AC-curves to achieve the equality (3.5). Indeed, if $(\mu_t)$ is optimally transported along continuous curves, these curves have to be length-minimizing. On the other hand, ($\pi$-almost all) curves have to lie inside $\gamma$, as each $\mathrm{supp}(\mu_t)$ does.
Observe that if $\gamma$ in the example above is length-minimizing, then the constructed lift is indeed optimal. In this case, all measures $\mu_t$ live in a convex set. One can ask whether the strategy above could yield an optimal lift concentrated on AC-curves if we restrict all $\mu_t$ to be fully supported on a common convex domain. We further add the assumption $\mu_t = \rho_t \mathcal{L}^n$ of absolute continuity of the measures, trying to exclude the teleporting phenomenon, which appears for instance when replacing the one-dimensional Hausdorff measure by a higher-dimensional one in Example 3.5. However, such convincing-sounding assumptions (even with uniform bounds on the densities $\rho_t$) again turn out to be insufficient for obtaining a lift on AC-curves, at least in higher dimensions, as shown below. This again emphasizes that it is necessary to relax the classical notion of lift and consider a larger class of curves.
Example 3.6. Consider a curve of probability measures on $\mathbb{R}^2$ defined as where the density $\rho_t : \mathbb{R}^2 \to \mathbb{R}$ at time $t$ is defined as Here $\tfrac{1}{2} > \varepsilon > 0$ is a fixed parameter. The measure $\mu_t$ is shown in Fig. 2 (left). Since the curve $t \mapsto \rho_t \mathcal{L}^2$ is a constant-speed 1-Wasserstein geodesic, so is the curve $(\mu_t)$. While the curve $(\mu_t)$ has infinitely many lifts, all of them have the property that they (up to a negligible set of curves) only transport horizontally. This allows us to reduce the study of a possible lift to the one-dimensional problem for the following measures on $[0, 1]$. Given any $y \in [0, 1]$ (corresponding to the slice of the measures $\mu_t$ at height $y$), we define a curve of probability measures on $\mathbb{R}$, as shown in Fig. 2 (right). Consider now any lift $\pi^y$ of $(\mu^y_t)$ as in Theorem 3.3. Since where $\Gamma^y$ is the collection of curves that jump at $t = y$. We conclude that any optimal lift $\pi$ of $(\mu_t)$ as in Theorem 3.3 gives positive mass to the set of non-absolutely continuous curves.

Corollary 4.1 (Characterization of BV-curves in 1-Wasserstein spaces). Let $(\mu_t) \subset \mathcal{P}(X)$ with $\mu_0 \in \mathcal{P}_1(X)$. Then $(\mu_t) \in \mathrm{BV}(I : \mathcal{P}_1(X))$ if and only if there exists $\pi \in \mathcal{P}(D(I : X))$ so that

Characterization of AC-curves in 1-Wasserstein spaces using their lifts, however, remains challenging. A naive extension of the characterization in the case $p > 1$ to the case $p = 1$ would lead to $\int \int_I |\dot\gamma_t| \, dt \, d\pi < \infty$, which is a well-defined condition as the metric derivative of BV-curves still exists $\mathcal{L}^1$-a.e. However, this condition does not guarantee even continuity, let alone absolute continuity. In fact, it is already weaker than (iii). Recall Example 1.1, where any lift of the absolutely continuous curve $(\mu_t)$ is concentrated on discontinuous curves. Continuous curves of bounded variation, on the other hand, can be easily characterized, for which the following observation is useful:

Proof. By considering the atomic part in the Lebesgue decomposition of the variation measure and using equation (3.5), we
obtain Equation (4.1) simply means that the jump size in the 1-Wasserstein space is obtained by averaging over all jumps in the underlying space. One implication is that if $\mu_t$ jumps at time $t$ (i.e. the left-hand side of (4.1) is non-zero), then at least some curves must jump at this time as well. Notice that this does not contradict the observation in Example 1.1. Even though all the underlying curves in the example jump, this does not lead to a jump in $(\mu_t)$, since at any time $t$ only one curve jumps, and a single curve has measure zero in the lift. As a result, we conclude that a BV-curve $(\mu_t)$ is continuous if and only if for all $t \in I$, the set of curves $(\gamma_t)$ that jump at $t$ has measure zero. This is formally stated in the corollary below:

Theorem 4.5 (Characterization of geodesics in 1-Wasserstein spaces). Let $(\mu_t) \subset \mathcal{P}(X)$ with $\mu_0 \in \mathcal{P}_1(X)$. Then $(\mu_t)$ is
• a BV-geodesic with respect to the $W_1$ distance if and only if there exists $\pi \in \mathcal{P}(D(I : X))$ so that
(i) $\pi$ is concentrated on the set of BV-geodesics;
(ii) $(e_t)_\# \pi = \mu_t$ for all $t \in I$;
(iii) $W_1(\mu_0, \mu_1) = \int d(\gamma_0, \gamma_1) \, d\pi(\gamma) < \infty$.
• continuous and length-minimizing if and only if, in addition to (i)-(iii), $\pi$ satisfies
(iv) for all $t \in I$, $\int |D\gamma|(\{t\}) \, d\pi(\gamma) = 0$.
• a constant-speed geodesic if and only if, in addition to (i)-(iii), $\pi$ satisfies
Proof. Suppose first that $\pi$ satisfies (i)-(iii). Then by Theorem 3.1 we have that Hence, all the above inequalities are actually equalities, and thus $(\mu_t)$ is a BV-geodesic. Suppose now that $(\mu_t)$ is a BV-geodesic, and let $\pi$ be given by Theorem 3.3. Then (ii) holds and we have which proves (iii). Furthermore, since the first inequality is due to the pointwise inequality $|D\gamma|([0, 1]) \ge d(\gamma_0, \gamma_1)$, we can actually conclude that there has to be pointwise equality for $\pi$-almost every curve $\gamma$, and hence $\pi$ is concentrated on BV-geodesics.
The claim about continuous and length-minimizing curves follows immediately from Proposition 4.2. As for the characterization of constant-speed geodesics, thanks to the equality (3.6), it is enough to show sufficiency, i.e., that (i)-(iii) and (v) imply that $(\mu_t)$ is a constant-speed geodesic. By Theorem 3.3 and (iii), we always have Hence, once (v) holds, equality holds everywhere in the above and in particular, which means that $(\mu_t)$ has to be a constant-speed geodesic.
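For orientation, the two chains of (in)equalities behind the first bullet of Theorem 4.5 can be sketched as follows. This is a reconstruction under stated assumptions rather than the authors' display: we write $|D\mu|(I)$ for the total variation of $(\mu_t)$ on $I = [0,1]$, read "BV-geodesic" as equality between total variation and endpoint distance, and assume that Theorem 3.1 gives the inequality $|D\mu|(I) \le \int |D\gamma|(I)\,d\pi$ for any lift while Theorem 3.3 gives equality for the constructed lift.

```latex
% Sketch only; the precise forms of Theorem 3.1 and Theorem 3.3 are assumptions here.
% Sufficiency: \pi satisfies (i)-(iii).
\begin{align*}
  W_1(\mu_0,\mu_1)
    &\le |D\mu|(I)
       && \text{(variation dominates endpoint distance)} \\
    &\le \int |D\gamma|(I) \, d\pi(\gamma)
       && \text{(Theorem 3.1, assumed form)} \\
    &= \int d(\gamma_0,\gamma_1) \, d\pi(\gamma)
       && \text{(by (i))} \\
    &= W_1(\mu_0,\mu_1)
       && \text{(by (iii))}.
\end{align*}
% Necessity: (\mu_t) is a BV-geodesic and \pi is the lift from Theorem 3.3.
\begin{align*}
  W_1(\mu_0,\mu_1)
    &= |D\mu|(I)
       && \text{(BV-geodesic)} \\
    &= \int |D\gamma|(I) \, d\pi(\gamma)
       && \text{(Theorem 3.3, assumed form)} \\
    &\ge \int d(\gamma_0,\gamma_1) \, d\pi(\gamma)
       && \text{(pointwise inequality)} \\
    &\ge W_1(\mu_0,\mu_1)
       && \text{(since } (e_0,e_1)_\# \pi \text{ is a coupling of } \mu_0, \mu_1 \text{ by (ii))}.
\end{align*}
```

In both chains the outer terms coincide, so every intermediate inequality is an equality; in the second chain this forces $|D\gamma|(I) = d(\gamma_0,\gamma_1)$ for $\pi$-almost every $\gamma$, which is the pointwise statement used at the end of the proof.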
4.3. Regularity of the curves in superposition. Example 1.1 illustrates that the superposition of a family of discontinuous curves can produce an absolutely continuous Wasserstein curve. One could naturally ask similar questions, e.g., is it possible to get an absolutely continuous curve by superposing continuous singular curves? Here we answer these questions by investigating three different scenarios, where the lift as in Theorem 3.3 is concentrated purely on either absolutely continuous (AC), continuous singular (CS), or jump (J) curves. The answer is summarized in Table 1 and elaborated upon thereafter. First, note that we can always have a curve with the same regularity as the underlying curves by simply taking $\mu_t = \delta_{\gamma_t}$, $t \in I$ (diagonal entries of the table). Secondly, if all curves $\gamma$ are AC, then $(\mu_t)$ is necessarily AC and thus cannot be CS or J due to the following simple observation:

Remark 4.6 (A sufficient condition for absolute continuity). Under the assumptions of Theorem 3.3, if the lift $\pi$ is merely concentrated on $AC^1(I : X)$, then $(\mu_t) \in AC^1(I : \mathcal{P}_1(X))$. This directly follows from (3.

Next, the case $\gamma$-CS, $(\mu_t)$-J is not possible. As explained after Proposition 4.2, if $(\mu_t)$ jumps, then at least some curves must jump as well. The opposite case, $\gamma$-J, $(\mu_t)$-CS, can however happen. Take, for example, where $c : [0, 1] \to [0, 1]$ is the Cantor function. Then, in the same way as in Example 1.1, one can construct a lift that is concentrated on jump curves. Lastly, in Example 4.7 below, we show that the case $\gamma$-CS, $(\mu_t)$-AC is possible as well.
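To make the case $\gamma$-J, $(\mu_t)$-CS concrete, here is a hedged worked example. The display defining the curve is not reproduced above, so we assume (by analogy with Example 1.1, and purely for illustration) the interpolation $\mu_t = (1 - c(t))\delta_0 + c(t)\delta_1$ on $X = \mathbb{R}$; the jump curves $\gamma^\alpha$ and the label $\alpha$ are likewise our own notation.

```latex
% Illustrative sketch; the curve (\mu_t) and the lift below are assumptions, not the paper's display.
% Assumed curve, with c the Cantor function:
\[
  \mu_t := (1 - c(t)) \, \delta_0 + c(t) \, \delta_1 , \qquad t \in [0,1].
\]
% Two-point mixtures on {0,1}: the distance is the amount of mass moved over distance 1.
\[
  W_1(\mu_t,\mu_s) = |c(t) - c(s)| ,
  \qquad\text{hence}\qquad
  |D\mu|((s,t]) = c(t) - c(s) \quad (s < t).
\]
% So |D\mu| is the Cantor measure: no atoms (continuity of (\mu_t)) and singular
% with respect to Lebesgue measure (no absolute continuity).
% A candidate lift on jump curves, with \alpha uniformly distributed on [0,1]:
\[
  \gamma^\alpha_t := \mathbf{1}_{\{ c(t) \ge \alpha \}} ,
  \qquad
  \pi := \int_0^1 \delta_{\gamma^\alpha} \, d\alpha ,
  \qquad
  (e_t)_\# \pi = (1 - c(t)) \, \delta_0 + c(t) \, \delta_1 = \mu_t .
\]
```

Each $\gamma^\alpha$ with $\alpha > 0$ is right-continuous with a single unit jump from 0 to 1, so $\int |D\gamma|(I)\,d\pi = 1 = c(1) - c(0) = |D\mu|(I)$, consistent with the total variation being encoded by the lift.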
As a final remark, it is interesting to notice that all cases above the diagonal of Table 1 turn out to be impossible. In short, one can produce regular Wasserstein curves out of irregular curves, while producing irregular Wasserstein curves purely out of regular curves is not possible.
In conclusion, we have constructed a constant-speed geodesic $(\mu_t)$ and a lift $\pi$ of $(\mu_t)$ concentrated on BV-curves whose variation measures are cyclical translations of $\sigma_0$. Now, different choices of $\sigma_0$ give rise to different $W_1$-geodesics between $\delta_0$ and $\delta_1$, as shown in Fig. 3.
• Finally, by choosing $\sigma_0$ to be a probability measure with no atoms and no absolutely continuous part, we get that $\pi$ is concentrated on BV-curves that are continuous but not absolutely continuous.

4.4. Continuity equation in discrete setting. Theorem 3.3 is a useful tool for the study of BV-curves in 1-Wasserstein spaces. In the continuous setting, it is well known that whenever the space $X$ has a kind of differential structure, absolutely continuous curves $(\mu_t) \subset \mathcal{P}_p(X)$, for $p > 1$, are related to solutions of the continuity equation (see e.g. [3, Chapter 8]). More precisely, one can find a time-dependent Borel velocity field $v_t : X \to X$ of $(\mu_t)$ so that the continuity equation

Concerning the continuity equation, the case $p = 1$ is far more involved, not least due to the presence of non-localities as seen already in Example 1.1. While the exponent $p = 1$ creates great difficulties in the continuous setting, it also opens up the possibility of studying analogous questions in the discrete setting. The discrete counterpart to the continuity equation, sometimes referred to as the current equation, is also studied in the literature and has a tight connection with Markov chains. In [11], among other things, Léonard derives a Benamou-Brenier type formula relating $W_1(\mu_0, \mu_1)$ to the current equation on metric graphs. See also [12], where an alternative metric on the space of probability measures on a finite set $X$ is introduced by modifying the Benamou-Brenier formula in order to have a gradient flow interpretation of the heat flow in a discrete setting. Different aspects of this metric are studied later in [6], in particular, for the characterization of absolutely continuous curves in the corresponding metric space.
In this section, we study the current equation in a countable and proper metric space for which the induced topology is discrete. More precisely, we show that the current equation can be directly recovered from Theorem 3.3, yielding that for a given BV-curve $(\mu_t)$ there exists $v_t : X \to \mathcal{M}(X)$, $x \mapsto v^x_t$, so that the pair $(\mu_t, v_t)$ satisfies the current equation. The obtained $v_t$ (or rather $d(x, \cdot)\,v^x_t(\cdot)$) can be interpreted as a velocity field, but unlike in the continuous setting, it is a time-dependent positive measure over the space $X$. While the (pointwise defined) continuity equation makes sense for BV-curves due to almost everywhere differentiability, the result is more meaningful whenever $(\mu_t)$ is an absolutely continuous curve and therefore completely characterized by the continuity equation.
Setting. Throughout this section, $(X, d)$ is a countable and proper metric space whose induced topology is discrete, and we adopt the notation $\mu_t(x) = \mu_t(\{x\})$, $x \in X$. We start by recalling the current equation: The following lemma states a useful observation for Wasserstein curves in discrete spaces, whose proof is included in the argument of Theorem 4.10.

Lemma 4.9. Let $(\mu_t) \subset \mathcal{P}_1(X)$. If $t \mapsto \mu_t$ is BV or absolutely continuous, then for each $x \in X$, $t \mapsto \mu_t(x)$ is BV or absolutely continuous, respectively. The converse is also true when all measures $\mu_t$ are supported inside a common bounded set.

Theorem 4.10 (BV-curves and current equation). Let $(\mu_t) \in \mathrm{BV}(I : \mathcal{P}_1(X))$ and assume that for each $t \in I = [0, 1]$, $\mathrm{supp}(\mu_t)$ is bounded. Then there exists $(v_t)$ so that $(\mu_t, v_t)$ satisfies the current equation. If, further, all measures $\mu_t$ are supported inside a common bounded set, then
(i) For any $(v_t)$ such that the pair $(\mu_t, v_t)$ satisfies the current equation, we have
(ii) There exists a $(v_t)$ satisfying the current equation such that

Proof. Proof of the a priori estimate (i). Recall the Kantorovich-Rubinstein theorem, where the supremum runs over all Lipschitz functions $\psi$ : where in the third step, we simply exchanged the indices of summation in the second term. As all $\mathrm{supp}(\mu_t)$ are confined to a common bounded set, the above summation over $x$ is actually a finite sum. So

Proof of the existence and (ii). Let $\pi$ be the lift of $(\mu_t)$ given by Theorem 3.3 with Denote by $\{\pi^x_t\}$ the disintegration of $\pi$ with respect to $e_t$, i.e., $\pi = \int \pi^x_t \, d\mu_t(x)$, and furthermore by $\{\nu^x_{t+h}\}$ the push-forward $\nu^x_{t+h} := (e_{t+h})_\# \pi^x_t$. The goal is to first prove that, for fixed $x$, the measure $\tilde\nu^x_{t+h} := \nu^x_{t+h}|_{X \setminus \{x\}} / h$ converges weakly to some measure $v^x_t$. Since $X$ is proper with the discrete topology, the ball $B_1(x)$ contains only finitely many elements, and so $r_x := \min\{d(x, y) : y \neq x\} \wedge 1 > 0$.
In particular, $\tilde\nu^x_{t+h}(\{y : d(x, y) > M\}) \to 0$ uniformly in $h$ as $M \to \infty$. Thus, properness of $X$ implies that $\{\nu^x_{t+h}\}_h$ is tight, and for an arbitrary weakly convergent subsequence we define $v^x_t$ to be its limit.

A couple more comments about the continuity equation in the discrete setting are in order. First of all, Theorem 4.10 could be used to prove a Benamou-Brenier type formula for the 1-Wasserstein distance in the discrete setting, $W_1(\mu_0, \mu_1) = \inf$ where the infimum is taken over all $(\mu_t) \in AC^1([0, 1] : \mathcal{P}_1(X))$ with $(v_t, \mu_t)$ satisfying the current equation (4.5). In fact, the 1-Wasserstein space over any complete and separable metric space is geodesic$^3$, and thus the Benamou-Brenier formula follows whenever Theorem 4.10 is applicable. The disadvantage is that the Benamou-Brenier formula in such a general form is hardly useful. If, instead, one assumes more structure on the space, for instance, that the space is a discrete metric graph, then one can ask whether the Benamou-Brenier formula holds among all transports that respect the graph structure in a suitable manner. As alluded to before, such a formulation has been proven to hold by Léonard in [11, Theorem 3.1]. We note that the result can be recovered by the techniques introduced in this paper in the case of measures with bounded support. Indeed, given any $\mu_0$ and $\mu_1$, take $\sigma \in \mathrm{Opt}(\mu_0, \mu_1)$. For any $(x, y) \in \mathrm{supp}(\sigma)$, consider a "discrete geodesic" $(x = x_1, \dots, x_n = y)$, and perform successive linear interpolations between $\delta_{x_i}$ and $\delta_{x_{i+1}}$ to obtain a Wasserstein geodesic $(\mu^{xy}_t)$ between the measures $\delta_x$ and $\delta_y$. This can be done so that $(\mu^{xy}_t)$ has constant speed. Now apply Theorem 3.3 (or simply modify Example 1.1) to obtain a lift $\pi^{xy}$ of $(\mu^{xy}_t)$. Define $\pi := \int \pi^{xy} \, d\sigma(x, y)$, and let $\mu_t := (e_t)_\# \pi$ for all $t \in [0, 1]$. By construction and Theorem 3.1, we have that $(\mu_t)$ is a Wasserstein geodesic. Finally, it is readily checked that the measures $v_t$ constructed (from this particular $\pi$) in the proof of Theorem 4.10 respect the graph structure, that is, $v^x_t(y) = 0$ whenever $y$ is not a neighbor of $x$.

One challenge, however, in using our techniques to get more insight into the framework of graphs arises from the fact that it is not clear how to detect those curves on the level of the Wasserstein space which respect the graph structure. More precisely, it is not clear when a Wasserstein curve has a lift that is concentrated on curves that only jump along the edges of the graph (cf. the discussion in Section 3.2). For instance, simply by looking at linear interpolations between measures as in Example 1.1, one ends up with constant-speed Wasserstein geodesics which often do not have any lifts respecting the graph structure. Notice that for the construction of a pair $(\mu_t, v_t)$ realizing the Wasserstein distance via the Benamou-Brenier formula, we do not need to lift arbitrary curves or even geodesics, but rather construct a specific Wasserstein geodesic and its lift with the desired endpoints.
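The construction of $v_t$ in the proof of Theorem 4.10 can be traced by hand on the smallest possible example. The following is a sketch under stated assumptions: the two-point space $X = \{a, b\}$ with $d(a,b) = \ell$, the lift by jump curves $\gamma^\alpha$ (in the spirit of Example 1.1), and in particular the displayed form of the current equation are our own illustrative choices, since equation (4.5) is not reproduced in the text above.

```latex
% Illustrative sketch; X = {a,b}, \ell, \gamma^\alpha and the form of the current equation are assumptions.
% Curve and lift (\alpha uniformly distributed on [0,1]):
\[
  \mu_t = (1-t) \, \delta_a + t \, \delta_b ,
  \qquad
  \gamma^\alpha_t =
  \begin{cases}
    a, & t < \alpha, \\
    b, & t \ge \alpha,
  \end{cases}
  \qquad
  \pi = \int_0^1 \delta_{\gamma^\alpha} \, d\alpha .
\]
% Disintegration at time t < 1: \pi^a_t is the law of \gamma^\alpha conditioned on \alpha > t.
% For 0 < h \le 1 - t:
\[
  \nu^a_{t+h}(b) = \frac{\mathbb{P}(t < \alpha \le t+h)}{\mathbb{P}(\alpha > t)} = \frac{h}{1-t} ,
  \qquad
  \tilde\nu^a_{t+h} = \frac{1}{h} \, \nu^a_{t+h}\big|_{X \setminus \{a\}} = \frac{1}{1-t} \, \delta_b .
\]
% Curves sitting at b never leave it, so the limits are
\[
  v^a_t = \frac{1}{1-t} \, \delta_b , \qquad v^b_t = 0 .
\]
% Assumed form of the current equation:
%   d/dt \mu_t(x) = \sum_{y \ne x} ( v^y_t(x) \mu_t(y) - v^x_t(y) \mu_t(x) ).
\[
  \frac{d}{dt} \mu_t(b) = v^a_t(b) \, \mu_t(a) = \frac{1}{1-t} \, (1-t) = 1 ,
  \qquad
  \frac{d}{dt} \mu_t(a) = - v^a_t(b) \, \mu_t(a) = -1 .
\]
% Cost of the flux versus metric speed:
\[
  \sum_{x} \sum_{y \ne x} d(x,y) \, v^x_t(y) \, \mu_t(x) = \ell = \lim_{h \to 0} \frac{W_1(\mu_{t+h},\mu_t)}{h} .
\]
```

Note that the rate $v^a_t(b) = 1/(1-t)$ blows up as $t \to 1$ while the flux $v^a_t(b)\,\mu_t(a)$ stays constant; it is the product with $\mu_t$, not $v_t$ alone, that carries the speed of the curve in this example.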

4. Applications
4.1. Characterization of BV-curves. An immediate consequence of combining Theorem 3.1 and Theorem 3.3 is that one can characterize BV-curves in 1-Wasserstein spaces:

Table 1. Curves $(\mu_t)$ in 1-Wasserstein spaces vs. curves $\gamma$ in their lifts as in Theorem 3.3. The lift is assumed to be concentrated purely on either absolutely continuous (AC), continuous singular (CS), or jump (J) curves. We specify whether superposition is possible (✓) or not (×).

            $(\mu_t)$-AC    $(\mu_t)$-CS    $(\mu_t)$-J
$\gamma$-AC      ✓               ×               ×
$\gamma$-CS      ✓               ✓               ×
$\gamma$-J       ✓               ✓               ✓
Note that $r_x \, |\mu_t(x) - \mu_s(x)| \le W_1(\mu_t, \mu_s)$. So $t \mapsto \mu_t(x)$ is BV or absolutely continuous if $t \mapsto \mu_t$ is BV or absolutely continuous, respectively, which proves Lemma 4.9 as well. At $t \in (0, 1)$ where $t \mapsto \mu_t(x)$ is differentiable, where in the last equality we used the assumption that $\mu_t$ is concentrated on finitely many points. Finally, for (4.6), by weak convergence, where $d(x, \cdot)$ can be regarded as a bounded function, as all $\mu_t$ are confined to a common bounded set.