The Root solution to the multi-marginal embedding problem: an optimal stopping and time-reversal approach

We provide a complete characterisation of the Root solution to the Skorokhod embedding problem (SEP) by means of an optimal stopping formulation. Our methods are purely probabilistic and the analysis relies on a tailored time-reversal argument. This approach allows us to address the long-standing question of a multiple marginals extension of the Root solution of the SEP. Our main result establishes a complete solution to the n-marginal SEP using first hitting times of barrier sets by the time–space process. The barriers are characterised by means of a recursive sequence of optimal stopping problems. Moreover, we prove that our solution enjoys a global optimality property extending the one-marginal Root case. Our results hold for general, one-dimensional, martingale diffusions.


Introduction
The Skorokhod embedding problem (SEP) for Brownian motion (B_t)_{t≥0} consists of specifying a stopping time σ such that B_σ is distributed according to a given probability measure μ on R. It has been an active field of study in probability since the original paper by Skorokhod [39]; see Obłój [29] for an account. One of the most natural ideas for a solution is to take σ to be the first hitting time of some shape in time-space. This was carried out in an elegant paper of Root [36]. Root showed that for any centred and square integrable distribution μ there exists a barrier R, i.e. a subset of R_+ × R such that (t, x) ∈ R implies (s, x) ∈ R for all s ≥ t, for which B_{σ_R} ∼ μ, where σ_R := inf{t ≥ 0 : (t, B_t) ∈ R}. The barrier is (essentially) unique, as argued by Loynes [26].
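To fix ideas, the following is a minimal Monte Carlo sketch (our own illustration, not taken from the paper) of a Root-type stopping time, with Brownian motion approximated by a scaled simple random walk. For the target measure μ uniform on {−1, +1}, the Root barrier is simply R = R_+ × {x : |x| ≥ 1}, so σ_R is the exit time of (−1, 1); the helper names and discretisation parameters here are ad hoc choices.

```python
import random

def root_hitting_time(barrier_fn, dt=0.01, rng=random, max_steps=10**6):
    """First time the time-space walk (t, X_t) enters the barrier
    {(t, x) : t >= barrier_fn(x)}; X is a scaled simple random walk."""
    step = dt ** 0.5
    t, x = 0.0, 0.0
    for _ in range(max_steps):
        if t >= barrier_fn(x):           # (t, x) in R: stop
            return t, x
        x += step if rng.random() < 0.5 else -step
        t += dt
    raise RuntimeError("barrier not hit")

# Root barrier embedding the uniform law on {-1, +1}: t_R(x) = 0 for |x| >= 1
# and t_R(x) = +infinity inside (-1, 1), so sigma_R is the exit time of (-1, 1)
t_R = lambda x: 0.0 if abs(x) >= 1 - 1e-9 else float("inf")

rng = random.Random(0)
samples = [root_hitting_time(t_R, rng=rng) for _ in range(2000)]
mean_time = sum(t for t, _ in samples) / len(samples)
frac_plus = sum(1 for _, z in samples if z > 0) / len(samples)
# mean_time should be close to E[sigma_R] = 1, and the stopped values sit on
# {-1, +1} with roughly equal mass
```

For this two-point target the barrier is explicit and E[σ_R] = 1; for general target measures the barrier is not available in closed form and must be computed numerically.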
Root's solution enjoys a fundamental optimality property, established by Rost [38]: it minimises the variance of the stopping time among all solutions to the SEP. More generally, E[f(σ_R)] ≤ E[f(σ)] for any convex function f ≥ 0 and any stopping time σ with B_σ ∼ B_{σ_R}. This led to a recent revival of interest in this construction in the mathematical finance literature, where optimal solutions to the SEP are linked to robust pricing and hedging of derivatives; see Hobson [22,23]. More precisely, optimality of the Root solution translates into lower bounds on prices of options written on the realised volatility. A more detailed analysis of this application in the single-marginal setting can be found in Cox and Wang [13]. In the financial context, the results in this paper allow one to incorporate information contained in call prices at times before the maturity of the option on realised variance, as well as the call options which have the same maturity as the variance option.
In recent work Cox and Wang [14] show that the barrier R may be written as the unique solution to a Free Boundary Problem (FBP) or, more generally, to a Variational Inequality (VI). This yields directly its representation by means of an optimal stopping problem. This observation was the starting point for our study here. Subsequently, Gassiat et al. [20] used analytic methods based on the theory of viscosity solutions to extend Root's existence result to the case of general, integrable starting and target measures satisfying the convex ordering condition. Using methods from optimal transport, Beiglböck et al. [3] have also recently proved the existence and optimality of Root solutions for one-dimensional Feller processes and, under suitable assumptions on the target measure, for Brownian motion in higher dimensions.
The first contribution of our paper is to show that one can obtain the barrier R directly from the optimal stopping formulation, and to prove the embedding property using purely probabilistic methods. This also allows us to determine a number of interesting properties of R by means of a time-reversal technique. Our results will hold for a general one-dimensional diffusion.
Beyond the conceptual interest in deriving the Root solution from the optimal stopping formulation, the new perspective enables us to address the long-standing question of extending the Root solution of the Skorokhod embedding problem to the multiple-marginals case: given a non-decreasing (in convex order) family of probability measures (μ_0, …, μ_n) on R with finite first moments, and a diffusion X started from the measure μ_0, find stopping times σ_1 ≤ ⋯ ≤ σ_n such that X_{σ_i} ∼ μ_i and X_{·∧σ_n} is uniformly integrable. Our second contribution, and the main result of the paper, provides a complete characterisation of such a solution to the SEP which extends the Root solution in the sense that it enjoys the following two properties: first, the stopping times are defined as hitting times of a sequence of barriers, which are completely characterised by means of a recursive sequence of optimal stopping problems; second, similarly to the one-marginal case, we prove that our solution of the multiple marginal SEP minimises the expectation of any non-decreasing convex function of ρ_n among all families of stopping times ρ_1 ≤ ⋯ ≤ ρ_n such that X_{ρ_i} ∼ μ_i.
It is well known that solutions to the multiple marginal SEP exist if and only if the measures are in convex order; however, finding optimal solutions to the multiple marginal SEP is more difficult. While many classical constructions of solutions to embedding problems can, in special cases, be ordered (see [27]), in general the ordering condition is not satisfied except under strong conditions on the measures. The first paper to produce optimal solutions to the multiple marginal SEP was that of Brown et al. [8], who extended the single-marginal construction of Azéma and Yor [2] to the case where one intermediate marginal is specified. More recently, Obłój and Spoida [31] and Henry-Labordère et al. [21] extended these results to give an optimal construction for an arbitrary sequence of n marginals satisfying a mild technical condition.
There are also a number of papers which make explicit connections between optimal stopping problems and solutions to the SEP, including Jacka [24], Peskir [33], Obłój [30] and Cox et al. [12]. In these papers, the key observation is that the optimal solution to the SEP can be closely connected to a particular optimal stopping problem: in all these papers, the same stopping time gives rise to both the optimal solution to the SEP and the optimal solution to a related optimal stopping problem. In this paper, we will see that the key connection is not that the same stopping time solves both the SEP and a related optimal stopping problem, but rather that there is a time-reversed optimal stopping problem which has the same stopping region as the SEP; moreover, the value function of the optimal stopping problem has a natural interpretation in the SEP. The first paper we are aware of to exploit this connection is McConnell [28], who works in the setting of the solution of Rost [37] and Chacon [10] to the SEP (see also [13,20]), and uses analytic methods to show that Rost's solution to the SEP has a corresponding optimal stopping interpretation. More recently, De Angelis [16] has provided a probabilistic approach to understanding McConnell's connection, using a careful analysis of the differentiability of the value function to deduce the embedding properties of the SEP; both the papers of McConnell and De Angelis also require some regularity assumptions on the underlying measures in order to establish their results. In contrast, we consider the Root solution to the SEP. As noted above, a purely analytic connection between Root's solution to the SEP and a related (time-reversed) optimal stopping problem was observed in Cox and Wang [14].
In this paper, we are not only able to establish the embedding property based on properties of the related optimal stopping problem, but we are also able to use our methods to prove new results (in this case, the extension to multiple marginal solutions, and the characterisation of the corresponding stopping regions), without requiring any assumptions on the measures which we embed (beyond the usual convex ordering condition).
The paper is organized as follows. Section 2 formulates the multiple marginal Skorokhod embedding problem, reviews the Root solution together with the corresponding variational formulation, and states our optimal stopping characterisation of the Root barrier. In Sect. 3, we report the main characterisation of the multiple marginal solution of the SEP, and we derive the corresponding optimality property. The rest of the paper is devoted to the proof of the main results. In Sect. 4, we introduce some important definitions relating to potentials, state the main technical results, and use these to prove our main result regarding the embedding properties. The connection with optimal stopping is examined in Sect. 5. Given this preparation, we report the proof of the main result in Sect. 6 in the case of locally finitely supported measures. This is obtained by means of a time-reversal argument. Finally, we complete the proof in the case of general measures in Sect. 7 by a delicate limiting procedure.

Notation and standing assumptions
In the following, we consider a regular, time-homogeneous, martingale diffusion taking values in an interval I, defined on a filtered probability space (Ω, F, (F_t)_{t≥0}, P) satisfying the usual hypotheses. For (t, x) ∈ R_+ × R, we write E_{t,x} for expectations under the measure for which the diffusion departs from x at time t. We also write E_x = E_{0,x}. We use both (X_t) and (Y_t) to denote the diffusion process. While X and Y denote the same object, the double notation allows us to distinguish between two interpretations: with a fixed reference time-space domain R_+ × R, we think of (X_t) as starting in (t, x) and running forward in time, and of (Y_t) as starting in (t, x) and running backwards in time.
For a distribution ν on R, we write P_ν for the measure under which the diffusion has initial law ν. We suppose that the diffusion coefficient is η(x), so that d⟨X⟩_t = η(X_t)² dt, where η is locally Lipschitz, |η(x)|² ≤ C_η(1 + |x|²) for some constant C_η, and strictly positive on I°, where we write I° = (a_I, b_I); without loss of generality, we assume that 0 ∈ I°. In addition, we use Ī for the closure of I, and ∂Ī for its boundary, so ∂Ī = {a_I, b_I}. We assume that the corresponding endpoints are either absorbing (in which case they are in I) or inaccessible (in which case, if for example b_I is inaccessible and finite, then P(X_t → b_I as t → ∞) > 0). The measures we wish to embed will be assumed to be supported on Ī, and in the case where I ≠ Ī, it may be possible to embed mass at ∂Ī by taking a stopping time which takes the value ∞. We define S := [0, ∞] × Ī. We note also that, as a consequence of the assumption on η, we have E_x[X_t²] < ∞, and we further write m_μ(t) := E_μ|X_t|. We will also frequently want to restart the space-time process from some stopped distribution in both time and space, and we write ξ for a general probability measure on S, typically with ξ ∼ (σ, X_σ) for some stopping time σ. With this notation we have E_ξ[A] = ∫ E_{t,x}[A] ξ(dt, dx), and we denote by (T_ξ, X_{T_ξ}) the random starting point, which then has law ξ. Since ξ may put mass on ∂Ī, we interpret the process started at such a point as the constant process. For each of these processes, L^x_t denotes the (semimartingale) local time at x of the process X, with the convention that L^x_t = 0 for t ≤ T_ξ. In addition, given a barrier R, we define the corresponding hitting time of R by X under P_ξ and, similarly, hitting times after a given stopping time σ_0. Finally, we observe that, as a consequence of the (local) Lipschitz property of η, there exists a continuous transition density p.

The Root solution of the Skorokhod embedding problem
The lower and upper bounds of the support of μ_k relative to μ_{k−1} are denoted by ℓ_k and r_k respectively. We exclude the case μ_k = μ_{k−1} as a trivial special case, and so we always have ℓ_k < r_k for all k = 1, …, n, as a consequence of the convex ordering. The potential of a probability measure μ is defined by U_μ(x) := −∫ |x − y| μ(dy); see Chacon [11]. For centred measures 𝛍_n := (μ_0, …, μ_n) in convex order, we have U_{μ_0} ≥ U_{μ_1} ≥ ⋯ ≥ U_{μ_n}. Recall that (X_t)_{t∈R_+} is a martingale diffusion. A stopping time σ (which may take the value ∞ with positive probability) is said to be uniformly integrable (UI) if the process (X_{t∧σ})_{t≥0} is uniformly integrable under P_{μ_0}. We denote by T the collection of all UI stopping times.
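For discrete measures, the potential U_μ(x) = −∫|x − y| μ(dy) and the potential criterion for convex order (equal means and pointwise ordered potentials) are straightforward to check numerically. The sketch below is our own illustration; the toy measures and the grid-based check are ad hoc, and a grid check is of course only a heuristic for the pointwise inequality.

```python
def potential(mu, x):
    """U_mu(x) = -sum_i p_i |x - y_i| for a discrete measure mu = {y_i: p_i}."""
    return -sum(p * abs(x - y) for y, p in mu.items())

def leq_cx(mu_a, mu_b, grid):
    """Grid check of the potential criterion for convex order:
    mu_a <=_cx mu_b  iff  equal means and U_{mu_a} >= U_{mu_b} pointwise."""
    mean_a = sum(p * y for y, p in mu_a.items())
    mean_b = sum(p * y for y, p in mu_b.items())
    if abs(mean_a - mean_b) > 1e-12:
        return False
    return all(potential(mu_a, z) >= potential(mu_b, z) - 1e-12 for z in grid)

mu0 = {0.0: 1.0}                          # delta_0
mu1 = {-1.0: 0.5, 1.0: 0.5}               # uniform on {-1, +1}
mu2 = {-2.0: 0.25, 0.0: 0.5, 2.0: 0.25}   # a dilation of mu1
grid = [i / 10 for i in range(-40, 41)]
# mu0 <=_cx mu1 <=_cx mu2, so a multiple-marginal SEP for these targets is solvable
```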
The classical Skorokhod embedding problem with starting measure μ_0 and target measure μ_1 is: find σ ∈ T such that X_σ ∼ μ_1. We consider the problem with multiple marginals: find σ_1 ≤ ⋯ ≤ σ_n in T such that X_{σ_k} ∼ μ_k for k = 1, …, n. (2.7) In this paper, our interest is in a generalisation of the Root [36] solution of the Skorokhod embedding problem in which each stopping time σ_k is the first hitting time, after σ_{k−1}, by (t, X_t)_{t≥0} of some subset R of S. Further, and crucially, we require that R is a barrier in the following sense: R ⊆ S is closed, and (t, x) ∈ R implies (s, x) ∈ R for all s ≥ t. Given a barrier R, for x ∈ Ī, we define the corresponding barrier function t_R(x) := inf{t ≥ 0 : (t, x) ∈ R}. Since R is closed it follows, as observed by Root [36] and Loynes [26], that t_R(·) is lower semi-continuous on I. Also, from the second property, we see that a barrier is the epigraph of the corresponding barrier function in the (t, x)-plane. We say that a barrier R is regular if {x ∈ I : (0, x) ∉ R} is an open interval containing zero.
(ii) For a probability measure ξ = ξ(dt, dx) on S, we say that a barrier R is ξ-regular if it cannot be enlarged without altering the stopping distribution of the space-time diffusion started with law ξ and run to the first hitting of R.
Observe that a regular barrier is a δ_{(0,0)}-regular barrier. We have the following characterisation:

Lemma 2.4
Let ξ be a probability measure on S and R a barrier. If R is not ξ-regular then, by definition, the set of all barriers R̃ for which X_{σ_R} ∼ X_{σ_{R̃}} P_ξ-a.s. is not a singleton. For any two such barriers R̃_1, R̃_2, their union is also such a barrier, as shown by Loynes [26]. It follows that there exists a minimal such barrier with respect to inclusion, which then necessarily has to be ξ-regular.
It follows that, without loss of generality, we may restrict our attention to ξ -regular barriers. Henceforth, whenever a barrier is given it is assumed that it is a ξ -regular barrier, where the measure ξ will be clear from the context.
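Loynes's union argument is easy to visualise through barrier functions: a closed barrier is the epigraph {(t, x) : t ≥ t_R(x)} of its barrier function, and the union of two barriers corresponds to the pointwise minimum of the two barrier functions. A small sketch of our own, with toy barrier functions:

```python
import math

# A barrier is the epigraph {(t, x) : t >= t_R(x)} of its (lower
# semi-continuous) barrier function; the union of two barriers is then the
# epigraph of the pointwise minimum of the two barrier functions.
def union(t_R1, t_R2):
    return lambda x: min(t_R1(x), t_R2(x))

def in_barrier(t_R, t, x):
    return t >= t_R(x)

t_Ra = lambda x: 0.0 if abs(x) >= 1 else math.inf   # Root barrier for uniform{-1,+1}
t_Rb = lambda x: x * x                               # a toy parabolic barrier function
t_Ru = union(t_Ra, t_Rb)
# membership in the union agrees with membership in either barrier, and the
# union is again a barrier: if (t, x) is in it, so is (s, x) for all s >= t
```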

Root's solution and its PDE characterisation
The main result of Root [36] is the following. Theorem 2.5 (Root [36]) Let μ_0 = δ_0, η(x) ≡ 1, and let μ_1 be a centred probability measure on R with finite second moment. Then there exists a barrier R* such that σ_{R*} is a solution of SEP(μ_1).
The first significant generalisation of this result is due to Rost [38], who showed that the result extends to transient Markov processes under certain conditions. The condition that the probability measure μ_1 has a finite second moment has only very recently been relaxed to the more natural condition that the measure has a finite first moment. This was first achieved by Gassiat et al. [20], who extended Root's result to the case of one-dimensional (time-inhomogeneous) diffusions using PDE methods. The result was also obtained by Beiglböck et al. [3] using methods from optimal transport theory. Remark 2.6 Loynes [26] showed, as used above in Lemma 2.4, that in Theorem 2.5 the barrier can be taken to be regular and is then unique.
We next recall the recent work of Cox and Wang [14] and Gassiat et al. [20]. For a function u : (t, x) ∈ R_+ × R ↦ u(t, x) ∈ R, we denote by ∂_t u the t-derivative and by Du, D²u the first and second spatial derivatives, i.e. with respect to the x-variable, and we introduce the (heat) second-order operator. Consider the variational inequality, or obstacle problem, (2.10). Then, based on the existence result of Root [36], Cox and Wang [14] proved the following result.
Moreover, we have the representation (2.11) of u_1, given below. In Cox and Wang [14], the solution to the variational inequality was determined as a solution in an appropriate Sobolev space, while Gassiat et al. [20] show that the solution can be understood in the viscosity sense.
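The variational inequality suggests a simple numerical scheme: evolve U_{μ_0} by the heat flow ∂_t u = ½η²D²u while constraining u to stay above the obstacle U_{μ_1}; the contact set {u = U_{μ_1}} then approximates the Root barrier. The explicit finite-difference sketch below is our own (Brownian case η ≡ 1; the grid sizes, CFL constant and contact tolerance are ad hoc choices, and no convergence claim is made):

```python
import numpy as np

def root_barrier_fd(U0, U1, x, T=2.0):
    """Explicit finite differences for the obstacle problem
        u(0, .) = U0,   u >= U1,   du/dt = 0.5 d2u/dx2 off the contact set
    (Brownian case, eta = 1).  Returns u(T, .) and an estimate of the barrier
    function t_R(x) = first time the constraint u >= U1 binds at x."""
    dx = x[1] - x[0]
    dt = 0.4 * dx * dx                    # dt/dx^2 <= 1 keeps the scheme monotone
    u = U0.copy()
    t_R = np.full_like(x, np.inf)
    for n in range(1, int(T / dt) + 1):
        lap = np.zeros_like(u)
        lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / (dx * dx)
        u = np.maximum(U1, u + 0.5 * dt * lap)    # heat step, floored at the obstacle
        newly = (u <= U1 + 1e-9) & np.isinf(t_R)  # contact set at time n*dt
        t_R[newly] = n * dt
    return u, t_R

x = np.linspace(-3.0, 3.0, 301)
U0 = -np.abs(x)                                   # potential of mu_0 = delta_0
U1 = -0.5 * (np.abs(x - 1) + np.abs(x + 1))       # potential of mu_1 = uniform{-1,+1}
u, t_R = root_barrier_fd(U0, U1, x)
# expected: contact (t_R ~ 0) for |x| >= 1 and no contact inside (-1, 1) at
# this tolerance, i.e. sigma_R is approximately the exit time of (-1, 1)
```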

Optimal stopping characterisation
The objective of this paper is to provide a probabilistic version of the last result, and its generalisation to the multiple marginal problem. Our starting point is the classical probabilistic representation of the solution to (2.10) as an optimal stopping problem: define u_1 by (2.11), where T_t is the collection of all (F_t)-stopping times τ ≤ t. Then, using classical results, see e.g. Bensoussan and Lions [6], when properly understood, u_1 in (2.11) is a solution to (2.10). Uniqueness, in an appropriate sense, of solutions to (2.10) then allows us to deduce that the characterisation of the Root barrier given in Theorem 2.7 corresponds to the stopping region of the optimal stopping problem (2.11). The probabilistic approach we develop in this paper provides a self-contained construction of the Root solution, and does not rely on the existence result of Root [36] or on PDE results. Indeed, these follow from a direct characterisation which is a special case of Theorem 3.1 below.

Iterated optimal stopping and multiple marginal barriers
In order to extend the Root solution to the multiple marginal SEP(𝛍_n), we now introduce the following natural generalisation of the previous optimal stopping problem. The main ingredient for our construction is the iterated sequence of optimal stopping problems (3.1). The stopping regions corresponding to this sequence of optimal stopping problems are given by (3.2), and the optimal stopping time which solves (3.1) is the first entry to R_k by the time-space process starting in (t, x) and running backwards in time. Our main result shows that the same barriers, used to stop the process running forward in time as in (3.3), give the multiple marginal Root solution of SEP(𝛍_n). It is important to note that the barriers in (3.2) are not necessarily nested: both R_k and R_{k−1} may contain points which are not in the other barrier.
An example of a possible sequence of stopping times is depicted in Fig. 1. Since the barriers are not necessarily nested, in general σ_k will not be equal to the first entry time to the k-th barrier, only the first entry time after σ_{k−1}. It may also be the case that σ_{k−1} = σ_k. Both cases are shown in Fig. 1.
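The definition of the stopping times as first entry times after the previous stopping time is easy to state in code. The sketch below is our own: the two barrier functions are toy choices, engineered so that both σ_1 < σ_2 and σ_1 = σ_2 occur along scaled random-walk paths.

```python
import random

def sequential_hitting_times(path, dt, barrier_fns):
    """sigma_k = first time >= sigma_{k-1} at which (t, X_t) lies in R_k,
    where R_k = {(t, x) : t >= barrier_fns[k](x)}; times are grid points i*dt."""
    sigmas, start = [], 0
    for t_R in barrier_fns:
        i = start
        while i < len(path) and i * dt < t_R(path[i]):
            i += 1
        if i >= len(path):
            return None                   # path too short to hit this barrier
        sigmas.append(i)
        start = i                         # "after sigma_{k-1}" allows equality
    return [i * dt for i in sigmas]

# two toy, non-nested barriers: R_1 activates at t = 0.1 for |x| >= 1 but only
# at t = 0.5 near zero; R_2 is flat at t = 0.3, so the path can enter R_2
# before sigma_1, and sigma_1 = sigma_2 whenever sigma_1 >= 0.3
t_R1 = lambda x: 0.1 if abs(x) >= 1 - 1e-9 else 0.5
t_R2 = lambda x: 0.3

dt, rng = 0.01, random.Random(1)
step = dt ** 0.5
results = []
for _ in range(500):
    x, path = 0.0, [0.0]
    for _ in range(200):
        x += step if rng.random() < 0.5 else -step
        path.append(x)
    results.append(sequential_hitting_times(path, dt, [t_R1, t_R2]))
# every realisation satisfies sigma_1 <= sigma_2, with equality on many paths
```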
Finally, it will be useful to introduce the (time-space) measures on S defined for all Borel subsets A of S by: We are now ready to state our main result, which includes Theorem 2.8 as a special case.

Fig. 1
A realisation of a Root-type solution to the multiple marginal problem. Here we depict three barriers which are not ordered (in the sense that none of R_1, R_2, R_3 contains another). As a result, the given realisation can enter the second and third barriers before the first stopping time. Note also that since the first stopping time, σ_1, happens at a point which is also inside the second barrier, we have here σ_1 = σ_2.

Remark 3.2 In general, explicit examples of Root-type solutions to the SEP (and, by extension, its multi-marginal version) are hard to find. In fact, to the best of our knowledge, even for the one-marginal problem in a standard Brownian setting, the only cases where an explicit barrier can be computed are measures supported on two points and Gaussian marginals. In some cases, the barrier can be characterised as the solution to an integral equation, see Gassiat et al. [19]. As a result, numerical methods seem to be the only viable approach for explicit computation of Root-type barriers. A natural consequence of Theorem 3.1 is that numerical approaches to the multiple stopping problem can be used to find solutions to the SEP.
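Following the inductive structure described in Sect. 4 (the obstacle at step k combines the potential of the previously stopped process with the difference of potentials U_{μ_k} − U_{μ_{k−1}}), one can iterate a single-marginal obstacle scheme to approximate the multiple-marginal barriers. The sketch below is our own discretisation, not the paper's algorithm: the marginals are toy measures (δ_0, the uniform law on {±1}, and its dilation ¼δ_{−2} + ½δ_0 + ¼δ_2), and the grid, CFL constant and contact tolerance are ad hoc.

```python
import numpy as np

def solve_marginal(v_prev_hist, w, x, dt):
    """One discretised inductive step: evolve u by the heat flow, floored at
    the time-dependent obstacle v_prev(t, .) + w(.), starting from v_prev(0, .).
    Returns the full time history of u and the estimated barrier function."""
    dx = x[1] - x[0]
    u = v_prev_hist[0].copy()
    hist = [u.copy()]
    t_R = np.full_like(x, np.inf)
    for n in range(1, len(v_prev_hist)):
        lap = np.zeros_like(u)
        lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / (dx * dx)
        obstacle = v_prev_hist[n] + w
        u = np.maximum(obstacle, u + 0.5 * dt * lap)
        newly = (u <= obstacle + 1e-9) & np.isinf(t_R)
        t_R[newly] = n * dt
        hist.append(u.copy())
    return hist, t_R

x = np.linspace(-3.0, 3.0, 121)
dt = 0.4 * (x[1] - x[0]) ** 2
n_steps = int(2.0 / dt)
U = [-np.abs(x),                                          # mu_0 = delta_0
     -0.5 * (np.abs(x - 1) + np.abs(x + 1)),              # mu_1 = uniform{-1,+1}
     -0.25 * np.abs(x - 2) - 0.5 * np.abs(x)
         - 0.25 * np.abs(x + 2)]                          # mu_2 = its dilation
hist = [U[0].copy() for _ in range(n_steps + 1)]          # v_0(t, .) = U_{mu_0}
barriers = []
for k in (1, 2):
    hist, t_R = solve_marginal(hist, U[k] - U[k - 1], x, dt)
    barriers.append(t_R)
t1, t2 = barriers
# t1 detects contact immediately for |x| >= 1; t2 detects contact at x = 0 and
# for |x| >= 2: the two estimated barriers are not nested
```

Here the second barrier estimate contains the spatial point 0 and the region |x| ≥ 2 immediately, while the first contains |x| ≥ 1: the two barriers are not nested, as in Fig. 1.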

Optimality
In this section, we show optimality of the constructed n-fold Root solution of the multiple marginal Skorokhod embedding problem. We recall the main ingredients of our embedding defined in (3.1)-(3.3), and we also denote t_k := t_{R_k}. Let T(𝛍_n) be the set of all solutions to SEP(𝛍_n) in (2.7). For a given function f : R → R_+ we consider the optimal n-fold embedding problem (3.5): inf_{ρ∈T(𝛍_n)} E_{μ_0}[f(ρ_n)]. Our optimality result states that σ ∈ T(𝛍_n) and that σ attains this infimum. The above remains true for any stopping times ρ_1, …, ρ_n which embed 𝛍_n, since if ρ is not uniformly integrable then it is not minimal, see [29, Sect. 8], and we can find smaller stopping times ρ̃ ∈ T(𝛍_n) for which the above bound is already satisfied. As in many proofs of optimality of particular solutions to the SEP, see e.g. Hobson [23], Cox et al. [12] and Henry-Labordère et al. [21], at the heart of our argument lies the identification of a suitable pathwise inequality. Interpreting (3.5) as an iterated Martingale Optimal Transport problem, the pathwise inequality amounts to an explicit identification of the dual optimiser in the natural Kantorovich-type duality. Our inequality is inspired by the one developed by Cox and Wang [14].
For all (t, x) ∈ R_+ × Ī and k = n, …, 0, we introduce a family of auxiliary functions; our main result below involves these functions.

Lemma 3.4 Let f be a non-negative, non-decreasing function. Then the pathwise inequality holds for all paths and all times, with equality for paths stopped according to (3.3). The proof of the above inequality is entirely elementary, even if not immediate, and is reported in "Appendix A". The optimality in Theorem 3.3 then essentially follows by evaluating the inequality on stopped paths (ρ_i, X_{ρ_i}) and taking expectations. Technicalities in the proof are mainly related to checking suitable integrability of the various terms; this proof is also reported in "Appendix A". Finally, we note that the above pathwise inequality could be evaluated along the paths of an arbitrary martingale and, after taking expectations, would lead to a martingale inequality. The inequality would be sharp in the sense that we have equality for X stopped at σ in (3.3). This method of arriving at martingale inequalities is linked to the so-called Burkholder method, see e.g. Burkholder [9], and has been recently exploited in a number of works, see e.g. Acciaio et al. [1], Beiglböck and Nutz [4] and Obłój et al. [32].

The inductive step
In this section we outline the main ideas behind the proof of Theorem 3.1. The proof proceeds by induction. At the end of each step of the induction, we will have determined a stopping time σ_ξ, and the time-space distribution ξ, which corresponds to the distribution of the stopped process (σ_ξ, X_{σ_ξ}) under the starting measure μ_0. This measure will be the key ingredient of the subsequent definitions. Given this stopping time, and a new law β, we proceed to determine a new stopping time σ_{ξ_β}, and the corresponding time-space distribution ξ_β. This stopping time will embed the law β. The inductive step is summarised in Theorem 4.1 below. The stopping time σ_{ξ_β} is constructed as the solution of an optimal stopping problem u_β, introduced below, whose obstacle function is defined by combining the potential function v_ξ of the stopped process X_{·∧σ_ξ} with the difference of the potentials of the starting distribution (the spatial marginal of ξ, denoted α_ξ) and the target distribution β. We will also show that the function u_β is equal to the potential function v_{ξ_β}, allowing us to iterate the procedure.
We now introduce the precise definitions. The measure μ_0 will be a fixed, integrable measure throughout, and so we will typically not emphasise the dependence of various terms on this measure.
Let ξ be the P_{μ_0}-time-space distribution of (σ_ξ, X_{σ_ξ}) for some UI stopping time σ_ξ, with associated potential v_ξ defined in (4.1). Motivated by the iterated optimal stopping problems (3.1), we also introduce, for any probability measure β on Ī, the difference of potentials w_β := U_β − U_{α_ξ}. (4.2) The optimal stopping problem which will serve for our induction argument is (4.3). We also introduce the corresponding stopping region R_β, and we set σ_{ξ_β} to be the associated forward hitting time. Theorem 4.1 Let σ_ξ ∈ T with corresponding time-space distribution ξ, and let β be an integrable measure such that α_ξ ≼_cx β. Then σ_{ξ_β} is a UI stopping time embedding β, and u_β = v_{ξ_β}. Moreover, R_β is a ξ_β-regular barrier.
The rest of this paper is dedicated to the proof of Theorem 4.1. The following result isolates the main steps needed for this.
Proof From the assumptions and the definition of v_{ξ_β} we obtain the required bound, where the inequality follows from Fatou's lemma. This in particular implies that α_{ξ_β} is an integrable probability measure on Ī, and that U_{α_{ξ_β}}(x) > −∞ for all x ∈ Ī. Then, for x, y ∈ Ī, it follows from the dominated convergence theorem that the difference of the potentials is constant. In particular, U_β(x) = U_{α_{ξ_β}}(x) + c for some c ∈ R, for all x ∈ Ī, and, by the above, sending x → ∂Ī, we see that c = 0. We conclude that α_{ξ_β} = β, i.e. X_{σ_{ξ_β}} ∼ β, which is the required embedding property. Moreover, it follows from the Tanaka formula together with the monotone convergence theorem that the relevant quantities are finite; the uniform integrability of the stopping time σ_{ξ_β} now follows from [18, Corollary 3.4].
The pointwise convergence of u_β(t, ·) to U_β, as t → ∞, will be stated in Lemma 5.5 (iii), while the equality u_β = v_{ξ_β} is more involved and will be shown through a series of results; see Lemma 7.3.
Recall that, under P_ξ, the local time is set to L^x_t = 0 for t ≤ T_ξ, by convention. The claimed equivalence then follows from the strong Markov property.

Remark 4.4
Observe that the regularity of the barrier can now be seen as an easy consequence of Lemma 4.2. Suppose (in the setting of Theorem 4.1) that u_β = v_{ξ_β} and that u_β(t, ·) → U_β pointwise as t → ∞. From (4.6), (4.2) and applying monotone convergence to E_ξ[L^x_{t∧σ_{R_β}}] as t → ∞, we deduce that (4.6) holds. In view of Remark 2.3, this shows that R_β is ξ-regular.

Properties of the stopped potential function
The following lemma provides some direct properties of the stopped potential. Recall the definition m_μ(t) := E_μ|X_t|. We say that a function which is Lipschitz continuous with constant K is a K-Lipschitz function.

Lemma 5.1 Let σ_ξ ∈ T with corresponding time-space distribution ξ. Then v_ξ is concave and 1-Lipschitz-continuous in x, non-increasing in t, and v_ξ(t, x) is (uniformly in x) ½-Hölder continuous in t on [0, T] for all T > 0,
and the following identity holds in the distribution sense, by which we mean that, for any stopping time σ ≤ t, the integrated identity (5.1) holds, where q_σ is the space-time density of the process Y_s (started at t and running backwards in time) up to the stopping time σ.
Proof The definition of v_ξ(t, x) in (4.1) immediately shows that v_ξ is concave and 1-Lipschitz in x, and non-increasing in t. As in Remark 4.3 above, using Tanaka's formula and the strong Markov property we obtain (5.2). We now consider continuity properties of E_y[L^x_t]. First observe that, by the martingale property of X, we have a first bound; using the fact that η(x)² ≤ C_η(1 + |x|²) and the martingale property of X, we deduce a further bound, where the first inequality follows via a localisation and limiting argument using Fatou's lemma and monotone convergence. It now follows by Grönwall's lemma that the required estimate holds, from which we deduce the claimed Hölder continuity. It remains to compute L v_ξ. First, since v_ξ is non-increasing in t and concave in x, the partial derivatives ∂_t v_ξ and D²v_ξ are well-defined as distributions on Ī, so L v_ξ makes sense in terms of measures.
We first consider the case where η is suitably differentiable (say smooth). Note that, by a monotone convergence argument, we can restrict to the case where Y remains in a compact subinterval of I° up to σ, and hence is bounded. Let p(t, x, y) be the transition density of the diffusion, and recall that E_y[L^x_t] may be expressed in terms of p. It follows that for an arbitrary starting measure ν we obtain a corresponding expression, and we directly compute (using Kolmogorov's forward equation, which holds due to the smoothness assumption on η) the required identity. Suppose in addition that ξ has a smooth density with respect to Lebesgue measure (which we also denote by ξ). We then compute from (5.2) and the equation above an expression for L v_ξ, and applying Itô's lemma we obtain the claimed identity in this smooth setting. We now argue that our results hold for an arbitrary locally Lipschitz function η. Keeping ξ fixed as above, with a smooth density, let η_n be a sequence of Lipschitz functions obtained from η by mollification. Note that since we are on a compact interval, η, and hence the η_n, are all bounded, and from the mollification we may assume that there exists K such that η and the η_n are all K-Lipschitz; moreover ξ is bounded on the corresponding compact time-space set.
Write Y^n for the solution to the SDE dY^n_t = η_n(Y^n_t) dW_t, and note in particular, by standard results for SDEs (e.g. [34, Theorem V.4.15]), that sup_{r∈[0,t]} |Y^n_r − Y_r| → 0 almost surely (possibly after restricting to a subsequence), and in L¹, as n → ∞. Hence, by bounded convergence, we get convergence of the corresponding expectations on the right-hand side of (5.4) as n → ∞. In addition, writing v_{ξ,n} for the functions corresponding to the diffusions Y^n, we see from the first half of the proof that the functions v_{ξ,n}, v_ξ are 1-Lipschitz in x and uniformly Hölder continuous in t, with a common Hölder coefficient. It follows from the Arzelà-Ascoli theorem that the v_{ξ,n} converge uniformly (possibly along a subsequence) to v_ξ. We deduce that (5.1) holds for general η and smooth ξ.
Finally, approximating the measure ξ by smooth measures through a mollification argument, and observing that the local times of the diffusion are jointly continuous in x and t (by (1.1) and the discussion preceding that equation), we conclude that we can pass to the limit on the right-hand side of (5.2), and hence on the left-hand side of (5.1). On the other hand, when q_σ is continuous, we can also pass to the limit on the right-hand side of (5.1). Moreover, we can approximate σ by a sequence of stopping times σ_n for which q_{σ_n} has a continuous density, and this gives the required result after a monotone convergence argument.
For the next statement, we introduce the processes V^t, t ≥ 0, defined in (5.5).

Lemma 5.2
Let σ_ξ ∈ T with corresponding time-space distribution ξ. Then the processes V^t and V^t − V^{t′} are P_x-supermartingales, for all t ≤ t′ ≤ ∞ and x ∈ Ī.
Proof In this proof we will want to take expectations with respect to both the X and the Y processes at the same time; we will assume that these are defined on a product space, where the processes are independent. We denote expectation with respect to the X process alone by E^X_μ[A], etc., and the filtrations generated by the respective processes by F^X_s and F^Y_s. We first prove the supermartingale property of the process V^t. The case t = ∞ is an immediate consequence of Jensen's inequality. Next, fix t ∈ [0, ∞), and recall the definition of V^t. Using Hunt's switching identity (e.g. [7, Theorem VI.
Using the strong Markov property, and using Ỹ, Ẽ to denote independent copies of Y etc., we deduce the required bound where, in the final line, we used Jensen's inequality. A similar calculation to that above gives the analogous bound for u ≤ s, where σ̃ := σ_ξ + (s − u). Note that for any r > r′, the process |X_{r∧u} − y| − |X_{r′∧u} − y| is a supermartingale in u ≥ 0. It follows that, since σ_ξ ≤ σ̃, the supermartingale property of V^t − V^{t′} holds.

The optimal stopping problem
In this section we derive some useful properties of the function u_β(t, x). We first state some standard facts from the theory of optimal stopping. Introduce the stopping time τ_t, defined in (5.6).

Proposition 5.3
Let σ_ξ ∈ T with corresponding time-space distribution ξ, and let α_ξ ≼_cx β. Then, for all (t, x) ∈ S, τ_t ∈ T_t is an optimal stopping rule for the problem u_β in (4.3):

Moreover, the process u_β(t − s, Y_s)_{s≤t} satisfies natural supermartingale and martingale properties. Proof Recall that under P_{t,x} the diffusion Y_r, r ≥ t, departs from x at time t, and when t = 0 we write P_{0,x} = P_x. Then we have, for 0 ≤ s ≤ t, a dynamic programming representation. Notice that u^t(s, x) is a classical optimal stopping problem with horizon t and a continuous obstacle, so that by e.g. [17] the standard results of optimal stopping hold true. In particular, the process u_β(t − s, Y_s)_{s≤t} satisfies the announced martingale and supermartingale properties, and an optimal stopping time for the problem u^t(0, ·) is given by the first entry into the stopping region, which is exactly τ_t.
for all x ∈ R. We now consider separately the cases I = R, Ī = (−∞, b_I] and Ī = [a_I, b_I]. In the case where I = R, we have E_x[L^y_t] → ∞ as t → ∞, for any x, y ∈ I.
for all x ∈ R and all t ≥ 0. So there always exists x ∈ I with t_β(x) < ∞, and hence R_β ≠ ∅.
Similarly, consider the case where Ī = (−∞, b_I]. From the properties of the diffusion, we know that X_t → b_I almost surely as t → ∞, and a corresponding bound holds for all x ∈ I° and all t ≥ 0. Hence there always exists x ∈ I° with t_β(x) < ∞, and hence R_β ≠ ∅.
Finally, consider the case where Ī = [a I , b I ]. Hence lim t→∞ X t ∈ {a I , b I }, and a similar argument to the above gives U t (x) → a I − ((x − a I )/(b I − a I ))(b I + a I ) as t → ∞, for x ∈ Ī. This limit corresponds to U ν̄ (x), where ν̄ is the centred measure supported on {a I , b I }, and it is easy to check that this potential is strictly smaller than the potential of any other centred measure supported on Ī; so, for any other measure, there always exists x ∈ I • with t β (x) < ∞, and hence R β ≠ ∅. The case of the measure ν̄ itself is trivial, and we exclude it from the subsequent arguments.
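As a sanity check on this limit, the potential of the centred two-point measure ν̄ on {a I , b I } (mass b/(b − a) at a, the rest at b) can be computed directly and compared with the linear expression a I − ((x − a I )/(b I − a I ))(b I + a I ). A minimal numerical sketch, with illustrative endpoints and function names of our own choosing:

```python
# Potential U_nu(x) = -integral |x - y| nu(dy) of the centred two-point
# measure nu on {a, b}: mass p at a and 1 - p at b, centred means
# p*a + (1 - p)*b = 0, i.e. p = b / (b - a)  (requires a < 0 < b).
def two_point_potential(x, a, b):
    p = b / (b - a)                       # mass at a for the centred measure
    return -p * abs(x - a) - (1 - p) * abs(x - b)

def claimed_limit(x, a, b):
    # the linear limit a_I - (x - a_I)/(b_I - a_I) * (b_I + a_I) from the text
    return a - (x - a) / (b - a) * (a + b)

a, b = -2.0, 3.0                          # illustrative endpoints a_I < 0 < b_I
for x in [-2.0, -1.0, 0.0, 1.5, 3.0]:     # sample points of [a_I, b_I]
    assert abs(two_point_potential(x, a, b) - claimed_limit(x, a, b)) < 1e-12
```

The two expressions agree on all of [a I , b I ], as both are linear there and match at the endpoints.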

Hölder-continuous in t, and there is a constant C which is independent of
Proof (i) The 1-Lipschitz continuity of u β (t, x) in x follows directly from the Lipschitz continuity of v ξ and w β in x. The 1/2-Hölder continuity in t then follows by standard arguments using the dynamic programming principle (for example, as a simple modification of the proof of Proposition 2.7 in Touzi [40]).
(ii) Let t′ > t, fix ε > 0, and let τ ∈ T t be such that Recall from Lemma 5.2 the supermartingale property of the process V t introduced in (5.5). Then In addition, since w β ≤ 0, we have: Putting these together, we conclude that By the arbitrariness of ε > 0, this shows that u β − v ξ is non-increasing in t, and implies that u β inherits from v ξ the property of being non-increasing in t. By the supermartingale property of the process u β (t − s, Y s ) s∈[0,t] in Proposition 5.3, this in turn implies that u β is concave in x.
(iii) By definition, u β (t, x) ≥ U β (x) by the supermartingale property of V t established in the previous Lemma 5.2.
In the rest of this proof, we show that u β (t, x) → U β (x) as t → ∞ for all x ∈ Ī. We consider three cases: it follows from the previous case that u β (t, x n ) → U β (x n ), and therefore u β (t, x) → U β (x) by the Lipschitz continuity of u β . (a, b) does not intersect R β . By Remark 5.4, we may assume that R β is not empty, and hence (a x , b x ) ≠ I • . In the subsequent argument, we assume that a x is finite; the case where b x is finite follows by the same line of argument. The optimal stopping time τ t in (5.6) satisfies τ t ≥ H a x ,b x := inf{r ≥ 0 : Y r ∉ (a x , b x )} and τ t → H a x ,b x , P x -almost surely. If both a x and b x are finite, we use the inequality u β (t, x) ≥ U β (x), together with Fatou's lemma, Lemmas 5.1 and 5.2, and bounded convergence, to see that Hence lim t→∞ u β (t, x) = U β (x), and U β is linear on (a x , b x ). For the general case where b x may be infinite, a more careful argument is needed. Since w β (x) := (U β − U α ξ )(x) → 0 as |x| → ∞, it follows that δ := max(−w β ) < ∞.
Fix ε > 0 and choose c sufficiently large that δ/(c − a x ) < ε. Let H c := inf{s ≥ 0 : Y s ≥ c}, and note that τ t ∧ H c → H a x ,c := inf{s ≥ 0 : Y s ∉ (a x , c)} as t → ∞. Then, by the martingale property of u β on t ≤ τ t , and the fact that u β ≤ v ξ , we have where we wrote w β (t, x) = w β (x). Taking limits as t → ∞, and using Fatou's lemma as above, it follows from the definition of c that: Taking ε → 0 and using the concavity of U β , we get that lim t→∞ u β (t, x) = U β (x), and U β is linear on (a x , c). Letting c → ∞, we conclude that U β is linear on (a x , ∞).

Existence and basic properties of the barrier
We denote by t β := t R β the barrier function corresponding to the regular barrier R β defined in (4.4). It will be used on many occasions in our proofs. Recall from (2.3) the definition of the support of a measure μ k in terms of the measure μ k−1 . In what follows, we write ℓ β , r β for the bounds of the support of β in terms of the measure α ξ .

Corollary 5.6
Let σ ξ ∈ T with corresponding time-space distribution ξ , and α ξ ≤ cx β. Then the set R β is a (closed) barrier, and moreover it is then immediate from (iii) and (ii) of Lemma 5.5 that u β (t′, x) = v ξ (t′, x) + w β (x), and so (t′, x) ∈ R β , for all t′ > t. By the continuity of v ξ and u β , established in Lemmas 5.1 and 5.5, we conclude that R β is a closed barrier.

Suppose now that β[(a, b)] = 0 and w β < 0 on (a, b).
Here we have used the strict inequality w β (y) < 0 for all y ∈ (a, b) to get the second line. To get the final line, we use Lemma 5.2 to deduce that v ξ (t, ·) is linear on (a, b).
x) for all t, by (iii) of Lemma 5.5, and so

Remark 5.7 (On R β having rays for arbitrarily large |x|) We can now deduce, from the proof of the convergence u β → U β as t → ∞ in Lemma 5.5 (iii), the following. For any point x such that t β (x) = ∞, the proof shows that either there exist points a < x < b such that t β (a), t β (b) < ∞, or there exists a < x such that, for any c large enough, U β is linear on (a, c). Letting c → ∞, and using the fact that U β (c) + |c| → 0, we conclude that U β (y) = −|y| for all y ≥ a. Then U β (y) ≤ U α ξ (y) ≤ U μ 0 (y) ≤ −|y| = U β (y) implies U β (y) = U α ξ (y). In particular, w β (x) = 0, and by Corollary 5.6 we contradict the initial assumption that x is not in the barrier.
Remark 5.8 (On the structure of the stopping region) Let α ξ , β be integrable measures in convex order. It follows from Corollary 5.6 that the barrier can be divided into at most countably many (possibly infinite) non-overlapping open intervals J 1 , J 2 , . . . . Observing that, in both the embedding and the optimal stopping perspectives, the process started from x ∈ J k never exits the interval J k , it is sufficient to consider each interval separately, noting that in such a case u β (t, x) = v ξ (t, x) for all t ≥ 0 and all x ∉ ⋃ k≥1 J k . In the subsequent argument, we will assume that we are on a single such interval J k , which may then be finite, semi-infinite, or equal to I • . In addition, if the measures α ξ , β are in convex order, then their restrictions to each J k are also in convex order.

Remark 5.9 (On R β for atomic measures) Let α ξ , β be integrable measures in convex order. Bearing in mind Remark 5.8, we suppose that β is a probability measure on Ī such that, for some integer n ≥ 1 and some ordered scalars x 1 < · · · < x n , we have ∑ n i=1 β[{x i }] = 1 and β[{x i }] > 0 for all i = 1, . . . , n. From the representation of the optimal stopping time τ t (see Proposition 5.3 above) and the form of the set R β implied by Corollary 5.6, it follows that where T (x 1 , . . . , x n ) is the set of stopping times τ such that τ ≤ H x 1 ,x n and Y τ ∈ {x 1 , . . . , x n } a.s.

Locally finitely supported measures
A probability measure β is said to be α ξ -locally finitely supported if its support intersects any compact subset of supp(α ξ , β) := {x : U α ξ (x) > U β (x)} in a finite number of points. The measure β is α ξ -finitely supported if its support intersects supp(α ξ , β) in a finite number of points. Throughout, α ξ will be fixed, so we will typically only refer to (locally) finitely supported measures. Observe that an integrable, centred measure β can only be finitely supported if ℓ β and r β are both finite; indeed, in this case, a locally finitely supported measure is finitely supported if and only if ℓ β and r β are both finite.
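For discrete measures, the set supp(α ξ , β) and its bounds can be read off numerically from the potentials. A small sketch with hypothetical example measures (α = δ 0 and β uniform on {−1, 0, 1}, which are in convex order; all names are ours):

```python
# Potentials of discrete measures and numerical bounds of the set
# supp(alpha, beta) = {x : U_alpha(x) > U_beta(x)}.
def potential(atoms, weights, x):
    # U_mu(x) = -integral |x - y| mu(dy) for a discrete measure mu
    return -sum(w * abs(x - a) for a, w in zip(atoms, weights))

def U_alpha(x):  # alpha = delta_0
    return potential([0.0], [1.0], x)

def U_beta(x):   # beta = (1/3)(delta_{-1} + delta_0 + delta_1)
    return potential([-1.0, 0.0, 1.0], [1/3, 1/3, 1/3], x)

grid = [i / 100 for i in range(-300, 301)]
strict = [x for x in grid if U_alpha(x) > U_beta(x) + 1e-12]
lo, hi = min(strict), max(strict)   # numerical bounds of supp(alpha, beta)
assert -1.0 <= lo < hi <= 1.0       # here supp(alpha, beta) = (-1, 1)
# outside [-1, 1] the potentials coincide, as the definition requires
assert all(abs(U_alpha(x) - U_beta(x)) < 1e-12 for x in grid if x <= -1 or x >= 1)
```

Here β is α-finitely supported: its three atoms all lie in the closure of supp(α, β) = (−1, 1), a bounded set, consistent with the observation above that finite support forces finite bounds.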

Preparation
We start with two preliminary results which play crucial roles in the next section where we establish the main result for finitely supported measures. The first result is the key behind the time-reversal methodology which underpins the main results, see Sect. 3.1.
Here, we give a natural proof in the case where X = B is a Brownian motion, when the proof has a simple intuition. In "Appendix B" we give a PDE proof which works in the more general diffusion setting.
To understand the importance of the result, it is helpful to think of the local time of X and of Y on the two sides of the announced equality. This result is then used to obtain the key equality v ξ β = u β in a "box" setting where the barrier is locally composed of two rays. The case of finitely supported measures is then obtained with an inductive argument in Sect. 6.2.

Lemma 6.1 Let L be the local time of a Brownian motion B. For any a
Proof Without loss of generality we suppose b − y > x − a, and introduce two additional points c = x − (b − y) and d = y + (x − a). Note that by translation invariance and symmetry of Brownian motion we have Using this in the desired equality, and subtracting E x L y t∧H c,b , we see that it suffices to show that Finally, by shift invariance, we may suppose without loss of generality that x = 0. Consider three independent Brownian motions B (3) , B (4) , B (5) starting from 0, and denote by H (i) the hitting times for B (i) . Further, let ρ (3) and observe that these are standard Brownian motions. This construction is depicted in Fig. 2. We denote by L y,(i) the local time of B (i) at level y.
Recall that c < a < d < b, and consider L y,(1) t∧H (1) c,b . For this quantity to be non-zero, the following have to happen prior to t: first B (1) has to hit a without reaching b, then it has to come back to x = 0 and continue to y without ever reaching c. This happens at time ρ (3) + H (4) y , and from then onwards the local time L y,(1) is accumulated before time t ∧ H (1) c,b , and we see that it simply corresponds to L 0,(5) . With a similar reasoning for L y,(2) , we see that our construction gives us the desired coupling; taking expectations gives the required result.
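The identity can also be tested by simulation. Via the Tanaka formula, the local-time equality is equivalent to E x |B t∧H a,b − y| = E y |B t∧H a,b − x| (the form taken in Lemma B.1). The following Monte Carlo sketch checks this for Brownian motion; the parameters are illustrative, the scheme is a plain Euler discretisation, and the tolerance is loose to absorb the discretisation bias:

```python
import math
import random

def stopped_abs(start, target, a, b, t=1.0, n_steps=200, n_paths=20000, seed=2):
    """Monte Carlo estimate of E_start |B_{t ^ H_{a,b}} - target| for a
    Brownian motion stopped on exiting (a, b), via a simple Euler scheme."""
    rng = random.Random(seed)
    sq = math.sqrt(t / n_steps)           # standard deviation of one increment
    total = 0.0
    for _ in range(n_paths):
        w = start
        for _ in range(n_steps):
            w += sq * rng.gauss(0.0, 1.0)
            if w <= a or w >= b:          # exit of (a, b): stop this path
                w = a if w <= a else b    # project onto the exited boundary
                break
        total += abs(w - target)
    return total / n_paths

a, b, x, y = -1.0, 2.0, 0.3, 1.0          # illustrative choice with a < x, y < b
lhs = stopped_abs(x, y, a, b, seed=2)     # E_x |B_{t ^ H_{a,b}} - y|
rhs = stopped_abs(y, x, a, b, seed=3)     # E_y |B_{t ^ H_{a,b}} - x|
assert abs(lhs - rhs) < 0.1               # equal in the limit dt -> 0, N -> infinity
```

Both runs estimate the same quantity; in the continuum limit the two expectations coincide exactly, and the symmetric roles of (x, y) mirror the reflection coupling in the proof.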
We now prove an important consequence of the above result, which will form the basis of an induction argument.

Lemma 6.2 Let σ ξ ∈ T with corresponding time-space distribution ξ , and α ξ ≤ cx β.
Proof In view of Remark 4.3, and the continuity of v ξ β − u β , it is sufficient to show that v ξ β = u β for x ∈ (a, b).

Fig. 2 A depiction of the Brownian motions B (1) and B (2) constructed in the proof of Lemma 6.1. Observe that the blue and green sections in each process are mirror images, up to translation, while the magenta sections are equal, up to translation (color figure online)
We fix x ∈ (a, b).
where we introduced the measure m(dy) := P ξ [X t 0 ∈ dy, T ξ < t 0 < σ R β ], and used the fact that, conditional on starting in {t 0 } × (a, b), the stopping times σ R β and H a,b are equal (and, starting on {t 0 } × (a, b), we never hit x before σ R β ). Observe that for y ∈ (a, b), we have by the assumptions on R β . Moreover, since σ ξ is a UI embedding of α ξ , it follows from the Tanaka formula that for y ∈ (a, b), we have where the last equality follows from the assumption that together with Remark 4.3. Since D 2 U λ (dy) = λ(dy), substituting into (6.4) gives, for y ∈ (a, b): Plugging this expression into (6.3), we get ξ(ds, dy).
The required result now follows from the following claims involving ζ := inf{s ≥ 0 : which we now prove.
(ii) We next prove (6.6). Since v ξ (t 0 , .) is concave by Lemma 5.1, it follows from the Itô-Tanaka formula that: where the last equality follows from Lemma 6.1 together with a coordinate shift.
(iii) Finally we turn to (6.7). Recall that by application of the Itô-Tanaka formula, due to the concavity of the function u β (t, .), as established in Lemma 5.5. We finally conclude from Lemma 6.1/B.1 that

The case of finitely supported measures
We now start the proof of Theorem 4.1 for a (relatively) finitely supported probability measure β, where we call a measure on R finitely supported if it is supported on a finite set of points. Recall from Lemma 4.2 and Lemma 5.5 (iii) that we need to prove that u β = v ξ β . When there is no risk of confusion, we write σ β for σ R β . In the sequel, we will say that β is α ξ -supported on n points if the measure β restricted to (ℓ β , r β ) is a discrete measure supported on n points.
The proof proceeds by induction on the number of points in the support of β| (ℓ β ,r β ) . The case where α ξ = β is trivial, since it follows immediately from (iii) of Corollary 5.6 that R β = S. Hence we suppose that ℓ β < r β . We start with the case where β| (ℓ β ,r β ) vanishes, and therefore all mass starting in (ℓ β , r β ) under ξ will be embedded at the two points ℓ β , r β .

Lemma 6.4 Let σ ξ ∈ T with corresponding time-space distribution ξ , and α ξ ≤ cx β with β((ℓ β , r β )) = 0. Then v ξ β = u β holds for all (t, x) ∈ S.
The proof of Proposition 6.3 will be complete when we establish the following induction step.

Lemma 6.5 Let σ ξ ∈ T with time-space distribution ξ . Assume that v ξ β = u β for any β with α ξ ≤ cx β which is α ξ -supported on n points. Then v ξ β = u β for any measure β which is α ξ -supported on n + 1 points.
Proof Let β be a centred probability measure, α ξ -supported on the n + 1 ordered points x := {x 1 , . . . , x n+1 }, with β[{x i }] > 0 for all i = 1, . . . , n + 1. By Remark 5.9, the set R β is of the form Let j be such that t j = max i t i , so that [t j , ∞) × {x j } is a horizontal ray in R β starting farthest away from zero. Define a centred probability measure, α ξ -supported on x (− j) := x \ {x j }, by conveniently distributing the mass of β at x j among the closest neighbouring points: By a direct calculation, we see that U β * (x) = U β (x) for x ∉ I j , and U β * is affine and strictly smaller than U β on I j . 1. Consider first x ∉ I j . Recall (5.7), with the optimal stopping time τ t being the minimum of t and the first entry into R β for the diffusion X started in (t, x) and running backward in time. However, since max{t j−1 , t j+1 } ≤ t j , it follows that Y τ t ≠ x j on {τ t < t}. In consequence, we can rewrite (5.11) as An analogous argument shows that u β (t, x) = u β * (t, x) for x ∈ I j \ {x j } and t ≤ t j , and for x = x j and t < t j . By continuity of u β we also have u β (t j , x j ) = u β * (t j , x j ). 2. We now prove that u β = v ξ β holds for all (t, x).
Consequently, for all t ≤ t j and all s ≥ 0, It follows from the induction hypothesis that u β = v ξ β holds for all x ∈ R and t ≤ t j , and for all x ∉ I j . 2.2. It remains to consider x ∈ (x j−1 , x j+1 ) and t > t j . For x ∈ (x j , x j+1 ), we now know that u β = v ξ β holds at t = t j , and R β places no points in [0, ∞) × (x j , x j+1 ). Then it follows from Lemma 6.2 that u β = v ξ β on (x j , x j+1 ). The same argument applies for x ∈ (x j−1 , x j ).
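The construction of β* in the proof above (removing the atom at x j and pushing its mass to the neighbours x j−1 , x j+1 with the barycentric weights that preserve the mean) can be made concrete. A sketch with hypothetical atoms and weights, checking that the mean is preserved, that the potentials agree off I j = (x j−1 , x j+1 ), and that U β * is strictly smaller at x j :

```python
# Remove the atom of beta at position index j, redistributing its mass to
# the two closest neighbours so that the mean (hence the potential outside
# (x_{j-1}, x_{j+1})) is unchanged.
def redistribute(atoms, weights, j):
    xl, xj, xr = atoms[j - 1], atoms[j], atoms[j + 1]
    wl = weights[j] * (xr - xj) / (xr - xl)   # mass moved to x_{j-1}
    wr = weights[j] * (xj - xl) / (xr - xl)   # mass moved to x_{j+1}
    new = list(weights)
    new[j - 1] += wl; new[j + 1] += wr; new[j] = 0.0
    return new

def potential(atoms, weights, x):
    return -sum(w * abs(x - a) for a, w in zip(atoms, weights))

atoms   = [-2.0, -0.5, 1.0, 2.5]              # hypothetical x_1 < ... < x_4
weights = [0.3, 0.2, 0.4, 0.1]
star = redistribute(atoms, weights, 2)        # remove the atom at x_3 = 1.0

mean  = sum(a * w for a, w in zip(atoms, weights))
mean2 = sum(a * w for a, w in zip(atoms, star))
assert abs(mean - mean2) < 1e-9               # mean preserved
for z in [-3.0, -2.0, -0.5, 2.5, 4.0]:        # points outside (x_2, x_4)
    assert abs(potential(atoms, star, z) - potential(atoms, weights, z)) < 1e-9
assert potential(atoms, star, 1.0) < potential(atoms, weights, 1.0)
```

Since U β * agrees with U β outside I j and is a chord of U β across I j , it is affine and strictly below U β there, exactly as asserted by the direct calculation in the proof.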

The case of locally finitely supported measures
In this subsection, we consider the case of measures β which are α ξ -finitely supported on any compact subset of R, but which could have an accumulation of atoms at −∞ or ∞. We will establish Theorem 4.1 for such β by suitably approximating β with a sequence of measures with α ξ -finite support. Recall that ℓ β = sup{x : α ξ ((−∞, y]) = β((−∞, y]) ∀y ≤ x} = sup{x : U α ξ (y) = U β (y) ∀y ≤ x}, and similarly for r β . The desired result has already been shown when −∞ < ℓ β ≤ r β < ∞, see Proposition 6.3, so we consider the case where at least one of these is infinite. For simplicity, we suppose that both are infinite (and hence I = R), the case where only one is infinite being similar. The approximation is depicted graphically in Fig. 3.
For N > 0, we observe that we can define a new measure β N and corresponding constants as follows: we extend linearly to the right of N , with gradient (U β )′ + (N ), until the function meets U α ξ at the point r N , from which point on we take U β N (x) = U α ξ (x); a similar construction applies to the left of −N . The existence of the point r N follows from the fact that U β (x) − U α ξ (x) → 0 as x → ∞, which in turn is a consequence of the convex ordering. This construction guarantees In particular, β N is a sequence of atomic measures with α ξ -finite support. Hence, by Proposition 6.3, Theorem 4.1 holds for these measures. Moreover, we can prove the following:

Lemma 6.6 Let σ ξ ∈ T with corresponding time-space distribution ξ , and β a locally finitely supported measure such that α ξ ≤ cx β. Let β N be the sequence of measures constructed above. Then the sequence

Proof We proceed in two steps: 1. By definition of the optimal stopping problem, we see that by construction, and so if it is optimal to stop for β N , it is also optimal to stop for β N and for β. It follows that, for The desired monotonicity follows instantly, and R β ⊇ R follows since R β is closed. 2. It remains to show the reverse inclusion R ⊇ R β .
First, observe that for points where t R (x) = 0 or t R (x) = ∞ the inclusion holds. This is an immediate consequence of Corollary 5.6, together with the relation between the measures β and β N . The rest of the proof is devoted to showing that, for a point x in the support of β with 0 < t′ := t R (x) < ∞, we have (t, x) ∉ R β for all t < t′. We first carry out preparatory computations, which cover two cases. Then we combine the two to give the final result.

Since Theorem 4.1 holds for
This means that, for all t < t′ with t′ − t sufficiently small, there is a positive probability under P ξ that the process reaches (t, x) before hitting R (and hence also R β N ) or exiting [−N 0 , N 0 ]. In particular, considering possible paths, we can reverse this: for any such t < t′, running backwards, there exists a positive probability that we will reach the support of ξ before hitting R or exiting a bounded interval. More specifically, writing x − = sup{y < x : (0, y) ∈ R}, x + = inf{y > x : (0, y) ∈ R}, and ε = t′ − t, for some ε sufficiently small, at least one of the two cases described below is true. We refer to Fig. 4 for a graphical interpretation of the two cases, and of a number of the important quantities described below.

Fig. 4 The possible cases considered in step 3.1. of the proof of Lemma 6.6. In the first case, shown in the bottom half of the diagram, paths starting at (t 1 , x 1 ) can only reach points in the support of ξ (denoted by the red line) which are at time 0. In this case, we are interested in the behaviour of the process on the set A shown, given that it does not leave the set D 1 . In the second case, the process starting at (t 2 , x 2 ) can reach points in the support of ξ which are not in the set {t = 0}. In this case, we are interested in the behaviour of the process on the sets A and A + depicted, given that the process does not leave D 2 (color figure online)
Case 1 The only points of the support of ξ which can be reached from (t′, x) lie in {t = 0}. Let A be a closed and bounded interval such that ξ({0} × A) > 0. Observe that the measures β N are α ξ -finitely supported, and hence R β N ∩ (R + × ([x − ε, x + ε] \ {x})) = ∅ for some ε > 0 and all N . Moreover, we may assume that ε is also sufficiently small. For such an ε, write and note that R ∩ D = ∅.
Our aim is now to use the expression for Lv ξ in Lemma 5.1 to show that V t is a strict supermartingale on A := [0, ε] × A. Recall that t = t′ − ε, and recall the family of supermartingales V t defined in (5.5). Define We want to show Using the supermartingale property of V t , we can further reduce this to showing that Note that on {τ D > t} we have τ N = t and τ N ≥ τ D . We now write q(t − s, y) for the space-time density of the process (t − s, x + Y s ) killed when it leaves D, i.e.
for smooth functions f . Then, from the form of D, we know that q is bounded away from zero on A, and applying Lemma 5.1, we have by the assumption on the support of ξ under consideration. By the assumption on ξ , and the fact that q is bounded below on A, this final term is strictly positive and independent of N , so: for some δ > 0 independent of N .

Case 2 There exists a bounded rectangle A ⊂ (0, t′) × (x − , x + ) such that ξ(A) > 0, all points of A can be reached from (t′, x) via a continuous path which does not enter R, and the process spends a strictly positive time in A. More specifically, for all sufficiently small ε > 0, we can choose a ℓ , a r , s A such that A = [s A , s A + ε/2) × [a ℓ , a r ], ξ(A) > 0, s A + 3ε < t′, and the set satisfies D ∩ R = ∅. Further, recalling the definitions of τ D and τ N above, we have τ D ≤ τ N , P x -a.s. In a similar manner to the above, we now write q̃(t − s, y) for the space-time density of the process (t − s, x + Y s ) killed when it leaves D, and observe that q̃ is bounded away from zero on the set where in the last line we applied Lemma 5.1 and the fact that for (t − s, y) It follows that we can choose δ > 0 independent of N such that which, by an application of Itô's formula, implies that Finally, observe that, in view of the supermartingale properties of Lemma 5.2, we can combine (6.9) and (6.10) to get: for some δ > 0 independent of N , and for any ξ satisfying the conditions of the lemma.

2.2.
We are now ready to exploit the above to establish that (t, x) ∉ R β for t < t′. Take the values of t, ε, δ determined above, and consider the following calculation: Here we use (6.11) for the first two terms in the second inequality; the third term in the second inequality is at least 0, using the fact that τ N < t implies that τ ε N = τ N < t, and w β N (·) ≤ 0. It then follows, since v ξ is non-increasing in t, that We now use the fact that δ > 0 independently of N , and u β N (t, x) → u β (t, x) as N → ∞, to deduce that u β (t, x) − v ξ (t, x) > w β (x). In particular, it is not optimal to stop immediately for the u β optimal stopping problem at (t, x) with t < t′, whenever 0 < t R (x) < ∞.

Proposition 6.7 Let σ ξ ∈ T with corresponding time-space distribution ξ , and let β be a locally finitely supported measure such that α ξ ≤ cx β. Then u β = v ξ β , and Theorem 4.1 holds for β.
Proof It follows from Lemma 6.6 that σ β N decreases to σ β , and X σ β N converges to X σ β in probability, and therefore X σ β ∼ β. Finally, if we write H ±N = inf{t ≥ T ξ : |X t | = N }, we also have where we used (4.6) and monotone convergence. It follows from Remark 4.3 that v ξ β = u β .

The general case
In this section, we complete the proof of Theorem 4.1. We fix σ ξ ∈ T with its corresponding time-space distribution ξ , and let β be an arbitrary integrable measure such that α ξ ≤ cx β. We start by approximating β with a sequence of locally finitely supported measures. Let We set t k m = ∞ when there are no points of R β in [0, ∞) × I k m , see Corollary 5.6 for a characterisation. The existence of a minimiser x k m follows from the lower semicontinuity of the barrier function t β which, in turn, is implied by the closedness of the barrier R β . If there exists more than one minimiser, we choose the smallest. We now determine a sequence of approximating measures defined as follows: the measure β m is defined through its potential function U β m (x), and we set U β m (x) to be the smallest concave function such that U β m (x k m ) = U β (x k m ) for all k. In particular, we deduce that U β m (x) ≤ U β m+1 (x) ≤ U β (x); moreover, β m has the same mean as β, β m ≤ cx β m+1 ≤ cx β, and U β m (x) − U β (x) → 0 as x → ∂ I for each m. This approximation is depicted in Fig. 5.
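The monotonicity U β m ≤ U β m+1 ≤ U β can be illustrated numerically: for a concave potential, piecewise-linear interpolation at a nested family of grids (the smallest concave function matching the potential at those points) produces an increasing family of concave minorants. A sketch with a hypothetical smooth potential U(x) = −√(1 + x²) and illustrative grids:

```python
import bisect
import math

def U(x):
    # a hypothetical concave potential-like function, U(x) ~ -|x| at infinity
    return -math.sqrt(1.0 + x * x)

def interp(nodes, x):
    """Piecewise-linear interpolation of U through the sorted nodes: for a
    concave U this is the smallest concave function matching U there."""
    i = bisect.bisect_right(nodes, x)
    i = min(max(i, 1), len(nodes) - 1)
    x0, x1 = nodes[i - 1], nodes[i]
    t = (x - x0) / (x1 - x0)
    return (1 - t) * U(x0) + t * U(x1)

coarse = [-4.0, -1.0, 1.0, 4.0]               # interpolation points of level m
fine   = sorted(coarse + [-2.5, 0.0, 2.5])    # level m+1 refines level m
for z in [i / 10 for i in range(-40, 41)]:
    um, um1 = interp(coarse, z), interp(fine, z)
    # U_{beta_m} <= U_{beta_{m+1}} <= U_beta, pointwise on the common range
    assert um <= um1 + 1e-12 <= U(z) + 2e-12
```

The key point, used in the text, is that refining the set of interpolation points can only raise the interpolant (new chords pass through points of U lying above the old chords, by concavity), while every interpolant stays below U.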
Each β m is locally finitely supported, and so we can apply Proposition 6.7 to each β m . Write R m := R β m for the corresponding barrier. A typical sequence of barriers is depicted in Fig. 6. Since the potentials of the measures are increasing, we have and it follows from the optimal stopping formulation that t R m (x) ≤ t R m+1 (x) ≤ t β (x), i.e. new spikes may appear, but existing spikes get smaller. Taking a sequence k m such that x = x k m m for all m ≥ m 0 , for some m 0 , we see that t R m (x) increases to a limit. We now establish that this limit is equal to t β (x).

Fig. 5 The approximation sequence of a general measure β. In (a), the red points denote the smallest point in the barrier for the given subdivisions (marked in gray). In (b), the original potential (in blue) is interpolated at the corresponding x-values, to produce a smaller potential corresponding to a measure β m . In (c), a finer set of intervals is used to produce additional approximating points. Note that the previous (red) points are all in the new set of approximating points. In (d), these points are used to produce the potential of a new measure β m+1 (color figure online)

Lemma 7.1 Let
This shows that ⋃ m≥0 ⋂ k≥m R k ⊂ R β , and therefore R ⊂ R β by the closedness of R β . We now show the reverse inclusion, R β ⊆ R. For (t, x) ∈ R β and ε > 0, choose m 0 so that 2 −m 0 < ε. Then there exists x′ such that |x − x′| < ε and (t′, x′) ∈ R m 0 for some t′, with t β (x′) ≤ t β (x) ≤ t, by our choice of points x k m . Further, as argued above, The above shows R = R β , or equivalently t R = t β . As observed above, for x = x k m m , t R m (x) is an increasing sequence in m, and hence converges to some limit, which we denote t ∗ (x). By the barrier property of each R m and the definition of R, we see that (t ∗ (x), x) ∈ R. It follows that t R (x) ≤ t ∗ (x) ≤ t β (x), and hence all three are equal.

Proposition 7.2 Consider the approximation sequence above and define
Proof (i) By definition, σ m ≤ σ R m and, from Proposition 6.7, the process stopped at σ R m is uniformly integrable, which implies the result. (ii) Suppose that σ m does not converge a.s. to σ R β . Take ω such that, possibly passing to a subsequence, σ m (ω) → t ∞ for some t ∞ < σ R β (ω). Then We take limits on both sides. The left-hand side converges to v ξ (t ∞ , y ∞ ) + w β (y ∞ ) by continuity of w β and joint continuity of v ξ , see (5.2). For the right-hand side, we use the 1-Lipschitz continuity of each u β m (t, ·) and the 1/2-Hölder continuity in t given in Lemma 5.5. This shows, with y ∞ := X t ∞ , that for a constant C independent of m, and hence which then shows that (t ∞ , y ∞ ) ∈ R β , and hence σ R β (ω) ≤ t ∞ , which gives the desired contradiction.
(iii) Using the above, together with Proposition 6.7 and Remark 4.3, we deduce that We consider an alternative approximation, from which it follows that R̃ m is an increasing sequence of barriers. Moreover, from the definition of the points x k m , we have σ R̃ m ↓ σ R β , since when we hit R β , we are guaranteed to hit R̃ m as soon as we have travelled at least 2 −m+1 in both directions. However, σ R̃ m ≥ σ R m , and therefore: and the result follows.
We note that ξ β -regularity of R β now follows from Remark 4.4. The proof of Theorem 4.1 is complete.

A Proofs of the optimality results
We prove here the results announced in Sect. 3.2. We start by establishing the required pathwise inequality.
Proof of Lemma 3.4 We proceed in three steps.
where ζ k is the first time we enter R n , having previously entered the barriers R n−1 , R n−2 , . . . , R k in sequence. Then ζ k ≥ ζ k+1 , P t,x -a.s., implying that ϕ k ≥ ϕ k+1 by the non-decrease of f . 2. We next compute that: 3. By the previous steps, we have: To be able to take expectations in the pathwise inequality when applied to the stopped diffusion, we need to establish suitable (sub)martingale properties. These are isolated in the following Lemma A.1: for all k = 1, . . . , n, the process {h k (t, X t ) − h k (0, X 0 ), t ≥ 0} is a P μ 0 -submartingale, and a P μ 0 -martingale on [σ k−1 , σ k ].
Proof First, applying the Itô-Tanaka formula to the second term in the definition of h k , we have Since 0 ≤ ϕ k (u, x) ≤ ‖ f ‖ ∞ < ∞, (A.3) shows that h k (t, X t ) − h k (0, X 0 ) differs from a martingale by a bounded random variable, and in particular is integrable. We now proceed in two steps.
1. For 0 ≤ s ≤ t, using the above decomposition and (A.3), we have where E Then, with equality if σ k−1 ≤ s ≤ t ≤ σ k . 2. (i) We first argue, for all (s, x) ∈ R + × R, that The martingale property is immediate from the definition of ϕ k . The submartingale property follows by induction. First, the claim is obvious for k = n + 1, by the fact that f is non-decreasing. Next, suppose that the submartingale property in (A.7) holds for some k + 1. Introduce the stopping times σ t R k := inf{u ≥ t : (u, X u ) ∈ R k }, and notice that σ t R k ≥ σ r R k for s ≤ r ≤ t. Then, denoting by X̃, σ̃ independent copies of the same objects, and using the induction hypothesis, we see that: (ii) We now prove (A.4). For u ≥ t − s, it follows from (A.7) that

E μ 0 s ϕ k (u, X t ) = E 0,X s ϕ k (u, X̃ t−s ) = E u−(t−s),X s ϕ k (u, X̃ u ) = E u−(t−s),X s ϕ k+1 (σ̃ u R k , X̃ σ̃ u R k ) ≥ E u−(t−s),X s ϕ k+1 (σ̃ R k , X̃ σ̃ R k ) = ϕ k (u − (t − s), X s ), P μ 0 -a.s. (A.8)

(iii) We next prove (A.5). For u ≤ t − s, using again (A.7), we see that: (iv) Finally, to prove (A.6), we observe that the equality was lost in (A.4) and (A.5) only because of the inequalities in (A.8) and (A.9), which in turn become equalities provided that (u, X u ) does not enter R k for u ∈ [s, t). The condition that σ k−1 ≤ s ≤ t ≤ σ k ensures this is true.
Proof of Theorem 3.3 Finally, we complete the proof of the main result in Sect. 3.2. First, by monotone convergence arguments, and since ψ is convex, note that is the same for all ρ ∈ T (μ μ μ n ), so that adding a constant to f does not change the problem. We shall normalise f by taking f (0) = 0, and exclude the trivial case f ≡ 0. If the quantities in (A.10) are equal to +∞ then there is nothing to prove. We thus assume that (A.10) is finite. Note that this might be so even if ∫ ψ(x)μ i (dx) = ∞ for each 0 ≤ i ≤ n. More generally, thanks to the convex ordering of the measures, one can define the integral ∫ g(x)(μ j − μ i )(dx) for a convex g and 0 ≤ i ≤ j ≤ n. This is done by considering g k ↑ g which are convex, equal to g on a compact set, and affine on the complement. Further, if h = (h − g) + g with (h − g) and g convex with finite integrals against (μ j − μ i ), then the integral ∫ h(x)(μ j − μ i )(dx) is also well defined and finite, see Beiglböck et al. [5] for details. We shall use this fact below repeatedly, together with ∫ ψ(x)(μ j − μ i )(dx) < ∞, which follows from (A.10).
The aim is now to take expectations in (3.7) for (s i , x i ) = (ρ i , X ρ i ), where ρ ∈ T (μ μ μ n ). To do this, we need to check that the expectations under P μ 0 of individual terms on the right-hand side of (3.7) are well defined.
We can rewrite the first two terms on the right-hand side of (3.7) as: where we used that f (0) = 0, so that κ n+1 ≡ 0 and h n+1 (t, x) = ∫ 0 t f (u) du. The expectation of the first two terms is then equal to The integrals in the second sum are well defined and finite by the discussion above. As for the first sum, observe that Using |ϕ i | ≤ ‖ f ‖ ∞ , E μ 0 [ρ n ] < ∞, and the integrability properties of κ i , we see that the local martingale is a martingale on [0, ρ n ]. It then follows from Lemma A.1 that with equality if ρ i = σ i . Taking expectations under P μ 0 in (3.7), we deduce that with equality when we replace ρ n with σ n .

B Extension to continuous Markov local martingales
The following statement extends Lemma 6.1 to a class of continuous Markov local martingales.
Lemma B.1 Let X be a local martingale with d⟨X⟩ t = η(X t ) 2 dt, for some locally Lipschitz function η, and let a < b be fixed points in I • , and H a,b the first exit time of X from the interval (a, b). Then E x |X t∧H a,b − y| = E y |X t∧H a,b − x| for all x, y ∈ [a, b].
Since v ε → v locally uniformly, it follows from the stability result for viscosity solutions that v is a viscosity solution of ∂ t v − (1/2)η 2 D 2 v = 0 on R + × (a, b). We also directly see that v(t, a) = y − a and v(t, b) = b − y. Hence v is also a viscosity solution of (B.1).
Step 3 To conclude that u = v, we use the fact that Eq. (B.1) has a unique viscosity solution in C 0 (R + × [a, b]). Indeed, the equation satisfied by e λt u(t, x), for an arbitrary λ > 0, satisfies the conditions of Theorem 8.2 of Crandall et al. [15].