Optimal transport and Skorokhod embedding
Abstract
The Skorokhod embedding problem is to represent a given probability as the distribution of Brownian motion at a chosen stopping time. Over the last 50 years this has become one of the important classical problems in probability theory and a number of authors have constructed solutions with particular optimality properties. These constructions employ a variety of techniques ranging from excursion theory to potential and PDE theory and have been used in many different branches of pure and applied probability. We develop a new approach to Skorokhod embedding based on ideas and concepts from optimal mass transport. In analogy to the celebrated article of Gangbo and McCann on the geometry of optimal transport, we establish a geometric characterization of Skorokhod embeddings with desired optimality properties. This leads to a systematic method to construct optimal embeddings. It allows us, for the first time, to derive all known optimal Skorokhod embeddings as special cases of one unified construction and leads to a variety of new embeddings. While previous constructions typically used particular properties of Brownian motion, our approach applies to all sufficiently regular Markov processes.
Mathematics Subject Classification
Primary 60G42 60G44 Secondary 91G20
1 Introduction
Our aim is to develop a new approach to (SEP) based on ideas from optimal transport. Many of the previous developments are thus obtained as applications of one unifying principle (Theorem 1.3) and several difficult problems are rendered tractable. Moreover, our methods can easily handle a number of more general versions of the problem: for example, integrable measures, general starting distributions, and \(\mathbb {R}^d\)-valued Feller processes.
1.1 A motivating example: Root’s construction
1.2 Optimal Skorokhod embedding problem
The Root stopping time solves (OptSEP) in the case where \(\gamma (f,s)= s^2\). Other examples where the solution is known include functions depending on the running maximum \(\gamma ((f,s)):= {\bar{f}}(s):= \max _{t\le s} f(t)\) or functions of the local time at 0.
The solutions to (SEP) have their origins in many different branches of probability theory, and in many cases, the original derivation of the embedding occurred separately from the proof of the corresponding optimality properties. Moreover, the optimality of a given construction is often not immediate; for example, the optimality property of the Root embedding was first conjectured by Kiefer [36] and subsequently established by Rost [50].
In contrast to existing work, we will start with the optimization problem (OptSEP) and we seek a systematic method to determine the minimizer for a given function \(\gamma \). To develop a general theory for this optimization problem we interpret stopping times in terms of a transport plan from the Wiener space \(({C_0(\mathbb {R}_+)},{\mathbb {W}})\) to the target measure \(\mu \), i.e. we want to think of a stopping time \(\tau \) as transporting the mass of a trajectory \((B_t(\omega ))_{t \in \mathbb {R}_+}\) to the point \(B_{\tau (\omega )}(\omega )\in \mathbb {R}.\) Note that this is not a coupling between \({\mathbb {W}}\) and \(\mu \) in the usual sense and one cannot directly apply optimal transport theory. Nevertheless the transport perspective provides a powerful intuition that guides us to develop an analogous theory, which in particular accounts for the adaptedness properties of stopping times. To this end, it is necessary to combine ideas and results from optimal transport with concepts and techniques from stochastic analysis.
As in optimal transport, it is crucial to consider (OptSEP) in a suitably relaxed form, i.e. in (OptSEP) we will optimize over randomized stopping times (see Definition 3.7 below). These can be viewed as usual stopping times on a possibly enlarged probability space but in our context it is more natural to interpret them as stopping times of ‘Kantorovich-type’ (in the sense of optimal transport), i.e. stopping times which terminate a given path not at a single deterministic time instance but according to a distribution.
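The idea of terminating a path according to a distribution can be illustrated in discrete time. The following is a minimal sketch, not the construction used in this paper: the names `randomized_stopping_time` and `stop_prob` and the argument `u` are hypothetical, and the path is simply a list of positions. The rule `stop_prob` sees only the path up to the current step, which mimics adaptedness, and `u` plays the role of independent randomization.

```python
def randomized_stopping_time(path, stop_prob, u):
    """Discrete-time sketch of a randomized stopping time.

    path      : list of positions path[0], path[1], ...
    stop_prob : hypothetical rule; stop_prob(path[:k + 1]) is the conditional
                probability of stopping at step k given no earlier stop
    u         : number in [0, 1), standing in for independent randomization
    Returns the first step k at which the accumulated stopped mass exceeds u,
    or None if this never happens along the given path.
    """
    survived = 1.0  # mass of this path that has not been stopped yet
    stopped = 0.0   # mass already stopped; non-decreasing in k, values in [0, 1]
    for k in range(len(path)):
        p = stop_prob(path[:k + 1])  # depends on the past of the path only
        stopped += survived * p
        survived *= 1.0 - p
        if stopped > u:
            return k
    return None
```

A rule taking only the values 0 and 1 gives an ordinary stopping time that ignores `u`; a rule with values strictly between 0 and 1 spreads the mass of a single path over several stopping steps.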
This relaxation will allow us to transfer many of the convenient properties of classical transport theory to our probabilistic setup. Exactly as in classical transport theory, (OptSEP) can be viewed as a linear optimization problem. The set of couplings in mass transport is compact and similarly the set of all randomized stopping times solving (SEP) on Wiener space is compact in a natural sense. Under the standing assumption that B is defined on a sufficiently rich stochastic basis, these considerations allow us to prove:
Theorem 1.1
Let \(\gamma :S\rightarrow \mathbb {R}\) be lsc and bounded from below. Then (OptSEP) admits a minimizing stopping time \(\tau \).
Here we can talk about the continuity properties of \(\gamma \) since S possesses a natural Polish topology (cf. (3.1)).
In the language of linear optimization, the minimization problem treated in Theorem 1.1 is a primal problem. It is therefore natural to expect that there exists a corresponding dual problem, and our second main result concerns this duality:
Theorem 1.2
We will prove this result in Sect. 4, and variants of this result will prove to be important in establishing later results. Theorem 1.2 has close analogues in the literature. In particular, using Hobson’s time change argument [30, 31], Theorem 1.2 is comparable to the work of Dolinsky and Soner [20, 21]. Similar duality results in a discrete time framework are established by Bouchard and Nutz [9] among others.
1.3 Geometric characterization of optimizers: monotonicity principle
A fundamental idea in optimal transport is that the optimality of a transport plan is reflected by the geometry of its support set. Often this is key to understanding the transport problem. On the level of support sets, the relevant notion is c-cyclical monotonicity. The relevance of this concept for the theory of optimal transport has been fully recognized by Gangbo and McCann [24], based on earlier work of Knott and Smith [37] and Rüschendorf [51, 52] among others.
Inspired by these results, we establish a monotonicity principle which links the optimality of a stopping time \(\tau \) with ‘geometric’ properties of \(\tau \). Combined with Theorem 1.1, this principle will turn out to be surprisingly powerful. For the first time, all the known solutions to (SEP) with optimality properties can be established through one unifying principle. Moreover, the monotonicity principle allows us to treat the optimization problem (OptSEP) in a systematic manner, generating further embeddings as a byproduct.
Our third main result states:
Theorem 1.3
If (1.4) holds, we will loosely say that \(\Gamma \) supports \(\tau \). The significance of Theorem 1.3 is that it links the optimality of the stopping time \(\tau \) with a particular property of the set \(\Gamma \), i.e. \(\gamma \)-monotonicity. In applications, the latter turns out to be much more tangible. We emphasize that we do not require continuity assumptions on \(\gamma \) in this result. This will be important when we apply our results.
Definition 1.4
We note that a swapping of paths (as illustrated in Fig. 2) was used by Hobson [31, p. 34] to provide a heuristic derivation of the optimality properties of the Root embedding. Indeed Hobson’s approach was the starting point of the present paper.
Definition 1.5
By the monotonicity principle, Theorem 1.3, an optimal stopping time is supported by a set \(\Gamma \) such that \(\Gamma ^<\times \Gamma \) contains no stop-go pair \(\big ((f,s),(g,t)\big )\). Intuitively, such a pair gives rise to a possible modification, improving the given stopping rule: as \(f(s) = g(t)\), we can imagine stopping the path (f, s) at time s, and allowing (g, t) to go on by transferring all paths which extend (f, s), the ‘remaining lifetime’, onto (g, t), which is now going (see Fig. 2). By (1.6) this guarantees an improved value of \(P_\gamma \), contradicting the optimality of our stopping rule. Observe that the condition \(f(s)=g(t)\) is what guarantees that a modified stopping rule still embeds the measure \(\mu \). In Sect. 2 below we will briefly indicate how the monotonicity principle can be used to derive existing solutions to the Skorokhod embedding problem as well as a whole family of novel solutions to the Skorokhod embedding problem; many further examples will be provided in Sect. 6.
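The swap heuristic can be made concrete for the cost \(\gamma (f,s)=s^2\) from Sect. 1.2. The display (1.6) is not reproduced in this text, so the computation below is only a sketch under an assumption: that the comparison is between ‘go on from (f, s), stop at (g, t)’ and ‘stop at (f, s), go on from (g, t)’, where the continuation is stopped at a time \(\sigma \) with \(0<\mathbb {E}[\sigma ]<\infty \) and \(\mathbb {E}[\sigma ^2]<\infty \). The symbol \(\sigma \) is introduced here for illustration only.

```latex
\begin{align*}
\underbrace{\mathbb{E}\big[(s+\sigma)^2\big] + t^2}_{\text{go on from } (f,s),\ \text{stop at } (g,t)}
\;-\;
\underbrace{\Big(s^2 + \mathbb{E}\big[(t+\sigma)^2\big]\Big)}_{\text{stop at } (f,s),\ \text{go on from } (g,t)}
&= 2s\,\mathbb{E}[\sigma] - 2t\,\mathbb{E}[\sigma] \\
&= 2\,(s-t)\,\mathbb{E}[\sigma].
\end{align*}
```

Under this assumption the swap strictly lowers the cost exactly when \(s>t\), that is, when the path that is still going is older than the path stopped at the same level. This matches the barrier structure obtained in Theorem 2.1.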
Importantly, the transport-based approach readily admits a number of strong generalizations and extensions. With only minor changes the existence result, Theorem 1.1, the duality result, Theorem 1.2, and the monotonicity principle, Theorem 1.3 below, extend to general starting distributions and Brownian motion in \(\mathbb {R}^d\), and more generally to sufficiently regular Markov processes; see Sects. 5 and 7. This is notable since previous constructions usually exploit rather specific properties of Brownian motion.
The monotonicity principle, Theorem 1.3, represents the culmination of the three main results, and the proof of this result will be the most complex part of this paper, requiring substantial preparation in order to combine the relevant concepts from stochastic analysis and optimal transport. The preparation and proof of this result will therefore comprise the majority of the paper. In fact the proof will automatically imply a stronger version (Theorem 5.7) of Theorem 1.3. For our applications, it will also be helpful to introduce a version of this result which incorporates a secondary optimization, Theorem 5.16.
The ‘classical’ optimal transport version of Theorem 1.3 can be established through fairly direct arguments, at least in a reasonably regular setting, cf. [3, Thms. 3.2,3.3] and [57, p. 88f]. However, these approaches do not extend easily to our setup: stopping times are of course not couplings in the usual sense and there is no reason for particular combinatorial manipulations to carry over in a direct fashion. Another substantial difference is that the procedure of transferring paths described below Definition 1.5 necessarily refers to a continuum of paths while the classical notion of cyclical monotonicity is concerned with rearrangements along finite cycles. The argument given subsequently is more in the spirit of [6, 8] and requires a fusion of ideas from optimal transport and stochastic analysis.
1.4 New horizons
 1. Markov processes. The results presented in this paper should extend to a more general class of Markov processes with càdlàg paths. The main technical issues this would present lie in the generalization of the results in Sect. 3, where the specific structure of the space of continuous paths is exploited.
 2. Multiple path-swapping. In our monotonicity principle, Theorem 1.3, we consider the impact of swapping mass from a single unstopped path onto a single stopped path, and argue that if this improves the objective \(\gamma \) on average, then the stopping time in question was not optimal. In classical optimal transport, it is known that single swapping is not sufficient to guarantee optimality; rather, one needs to consider the impact of allowing a finite ‘cycle’ of swaps to occur, and moreover, that this is both a necessary and sufficient condition for optimality. It is natural to conjecture that a similar result applies in the present setup.
 3. Multiple marginals. A natural generalization of the Skorokhod embedding problem is to consider the case where a sequence of measures \(\mu _1, \mu _2, \ldots , \mu _n\) is given, and the aim is to find a sequence of stopping times \(\tau _1 \le \tau _2 \le \cdots \le \tau _n\) such that \(B_{\tau _k} \sim \mu _k\), and such that the chosen sequence of stopping times minimizes \(\mathbb {E}[\gamma ((B_t)_{t \le \tau _n},\tau _1,\ldots ,\tau _n)]\) for a suitable function \(\gamma \). In this setup, it is natural to ask whether there exists a suitable monotonicity principle, corresponding to Theorem 1.3.
 4. Constrained embedding problems. In this paper, we consider classical embedding problems, where the optimization is carried out over the class of solutions to (SEP). However, in many natural applications, one needs to further consider the class of constrained embedding problems: for example, where one minimizes some function over the class of embeddings which also satisfy a restriction on the probability of stopping after a given time. It would be natural to derive generalizations of our duality results, and a corresponding monotonicity principle for such problems.
1.5 Background
Since the first solution to (SEP) by Skorokhod [54] the embedding problem has received frequent attention in the literature, with new solutions appearing regularly, and exploiting a number of different mathematical tools. Many of these solutions also prove to be, by design or accident, solutions of (OptSEP) for a particular choice of \(\gamma \), e.g. [4, 33, 45, 48, 50, 56]. The survey [43] is a comprehensive account of all the solutions to (SEP) up to 2004 and references many articles which use or develop solutions to the Skorokhod embedding problem. More recently, novel twists on the classical Skorokhod embedding problem have been investigated by: Last et al. [38], who consider the closely related problem of finding unbiased shifts of Brownian motion (and where there are also natural connections to optimal transport); Hirsch et al. [29], who have used solutions to the Skorokhod embedding problem to construct Peacocks; and Gassiat et al. [25], who have exploited particular properties of Root’s solution to construct efficient numerical schemes for SDEs.
The Skorokhod embedding problem has also recently received substantial attention from the mathematical finance community. This goes back to an idea of Hobson [30]: through the Dambis-Dubins-Schwarz Theorem, the optimization problems (OptSEP) are related to the pricing of financial derivatives, and in particular to the problem of model-risk. We refer the reader to the survey article [31] for further details.
Recently there has been much interest in optimal transport problems where the transport plan must satisfy additional martingale constraints. Such problems arise naturally in the financial context, but are also of independent mathematical interest; for example, mirroring classical optimal transport, they have important consequences for the study of martingale inequalities (see e.g. [9, 28, 44]). The first papers to study such problems include [7, 19, 23, 32], and this field is commonly referred to as martingale optimal transport. The Skorokhod embedding problem has been considered in this context by Galichon et al. in [23]; through a stochastic control problem they recover the Azéma–Yor solution of the Skorokhod embedding problem. Notably, their approach is very different from the one pursued in the present paper.
1.6 Outline of the article
In Sect. 2 we establish the Root and the Rost embeddings as a consequence of Theorems 1.1 and 1.3, as well as constructing a family of new embeddings. The results presented in this section are intended as a motivation for the rest of the paper. In the derivation of these embeddings we highlight the interplay between arguments of a probabilistic nature, and arguments relating to the pathwise space S introduced in (1.2). A major benefit of working in these two separate domains is that it is typically relatively easy to prove pointwise statements in the setup of the space S; on the other hand, the associated probabilistic arguments are usually straightforward. However neither set of arguments naturally transfers to the other setup.
The link between these distinct domains is provided by Theorems 1.1 and 1.2, and in particular the monotonicity principle Theorem 1.3 which we establish in Sects. 3–5. In Sect. 3, we introduce a framework that allows us to view classical probabilistic concepts on the pathwise space S and establish a number of auxiliary results that will be needed later on. In Sect. 4 we prove our first two main results. As in the transport case, Theorem 1.1 will be a simple consequence of lower semicontinuity plus compactness of the set of solutions to the Skorokhod problem. To establish Theorem 1.2, we use classical duality results from optimal transport. In Sect. 5 we prove Theorem 1.3 based on a combination of arguments from optimal transport with Choquet’s capacitability theorem and ingredients from stochastic analysis.
In Sect. 6 we use our results to establish all known solutions to (OptSEP) as well as further embeddings. We also give an example in which (OptSEP) admits only optimizers depending on additional randomization. For readers who are mainly interested in these applications, it should be possible to read this section immediately after Sect. 2.
In Sect. 7 we describe a number of extensions of our previous results. In particular we consider general starting distributions and show that our main results extend to continuous Feller processes under certain assumptions which we are able to verify for a large class of processes. As a special case of the results in this section, we also show that, as usual, the moment condition on \(\mu \) can be dropped when the second condition in (SEP) is recast in terms of uniform integrability resp. minimality (cf. (2.1)).
1.7 Frequently used notation

The set of (sub)probability measures on a space \(\mathsf {X}\) is denoted by \({\mathcal {P}}(\mathsf {X})\) / \({\mathcal {P}}^{\le 1}(\mathsf {X})\).

For a measure \(\xi \) on \(\mathsf {X}\) we write \(f(\xi )\) for the pushforward of \(\xi \) under \(f:\mathsf {X}\rightarrow \mathsf {Y}\).

We use \(\xi (f)\) as well as \(\int f~ d\xi \) to denote the integral of a function f against a measure \(\xi \).

Stochastic processes are usually denoted by capital letters like X, Y, Z.

\({C_x(\mathbb {R}_+)}\) denotes the continuous functions starting in x; \({C(\mathbb {R}_+)}=\bigcup _{x\in \mathbb {R}}{C_x(\mathbb {R}_+)}\).

The set of stopped paths is \( S =\{(f,s): f:[0,s] \rightarrow \mathbb {R} \text{ is continuous}, f(0)=0\} \) and we define \(r:{C_0(\mathbb {R}_+)}\times \mathbb {R}_+\rightarrow S\) by \(r(\omega , t):= (\omega _{\upharpoonright [0,t]},t)\).

For \(\Gamma \subseteq S\) we set \(\Gamma ^<:=\{(f,s): \exists ({\tilde{f}},{\tilde{s}})\in \Gamma , s< {\tilde{s}} \text{ and } f\equiv {\tilde{f}} \hbox { on }[0,s]\}.\)

For \((f,s) \in S\) we write \(\bar{f} = \sup _{r \le s} f(r),\underline{f} = \inf _{r \le s} f(r)\) and \(f^* = \sup _{r \le s} |f(r)|\).

We use \(\oplus \) for the concatenation of paths: depending on the context the arguments may be elements of \(S,{C_0(\mathbb {R}_+)}\) or \({C_0(\mathbb {R}_+)}\times \mathbb {R}_+\).

If F is a function on S resp. \({C_0(\mathbb {R}_+)}\times \mathbb {R}_+\) and \((f,s)\in S\) we set \(F^{(f,s)\oplus } (y):= F((f,s)\oplus y)\), where y may be an element of \( S,{C_0(\mathbb {R}_+)}\), or \({C_0(\mathbb {R}_+)}\times \mathbb {R}_+\).

\({\mathbb {W}}\) denotes Wiener measure; \(\mathcal {F}^0\) (\(\mathcal {F}^a\)) the natural (augmented) filtration on \({C_0(\mathbb {R}_+)}\).

We commonly use two probability spaces. The first is \((\Omega , \mathcal {G}, (\mathcal {G}_t)_{t \ge 0}, \mathbb {P})\), an arbitrary filtered probability space carrying a Brownian motion B and sometimes also a \(\mathcal {G}_0\)-measurable random variable Y which is uniformly distributed on [0, 1]. On this space, the natural filtration generated by the process B is denoted by \((\mathcal {F}^B_t)_{t \ge 0}\). The second is the space \(({{\overline{C}}_0(\mathbb {R}_+)}, {\bar{\mathcal {F}}}, ({\bar{\mathcal {F}}}_t)_{t \ge 0}, \overline{{\mathbb {W}}})\), which is the product space \({{\overline{C}}_0(\mathbb {R}_+)}= {C_0(\mathbb {R}_+)}\times [0,1]\) equipped with a suitable filtration (see the discussion above Theorem 3.8 for further details) and the product measure \(\overline{{\mathbb {W}}}={\mathbb {W}}\otimes {\mathcal {L}}\) of Wiener and Lebesgue measure.
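The notation for stopped paths can be mirrored on a time grid. The following is a minimal sketch under the assumption that a path is a list of values at integer steps; the function names `r`, `concat` and `strictly_extends` are chosen here for illustration and are not taken from the paper. They correspond to the restriction map r, the concatenation \(\oplus \), and the relation that defines \(\Gamma ^<\).

```python
def r(omega, t):
    """Restrict a discretized path omega to steps 0..t, giving a stopped path (f, s)."""
    return (tuple(omega[:t + 1]), t)


def concat(stopped, g):
    """Concatenate a stopped path (f, s) with a path g that starts at 0.

    The result follows f up to step s and then continues as f(s) + g.
    """
    f, s = stopped
    assert g[0] == 0, "the appended path must start at 0"
    tail = tuple(f[-1] + x for x in g[1:])
    return (f + tail, s + len(tail))


def strictly_extends(short, long):
    """True if `short` = (f, s) is a proper initial segment of `long` = (g, t).

    If `long` lies in a set Gamma, then `short` lies in the discrete analogue
    of Gamma^<.
    """
    (f, s), (g, t) = short, long
    return s < t and g[:s + 1] == f
```

For instance, with `omega = [0, 1, 0, -1, 0]`, the stopped path `r(omega, 2)` is a proper initial segment of `r(omega, 4)`, but not the other way round.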
2 Particular embeddings
2.1 The Root embedding
We recall the definition of the Root embedding, \(\tau _{\text {Root}}\), from (1.1), and we wish to recover Root’s result [48] from an optimization problem. Remember that, according to Root’s terminology, a (closed) set \({\mathcal {R}} \subseteq \mathbb {R}_+ \times \mathbb {R}\) is a barrier if \((s,x) \in {\mathcal {R}}\) implies \((t,x) \in {\mathcal {R}}\) whenever \(t>s\). Then Root’s construction of a solution to the Skorokhod embedding problem can be summarized as follows:
Theorem 2.1
Let \(\gamma (f,t)= h(t)\), where \(h:\mathbb {R}_+\rightarrow \mathbb {R}\) is a strictly convex function such that (OptSEP) is well posed. Then a minimizer of (OptSEP) exists, and moreover for any minimizer \({\hat{\tau }}\), there exists a barrier \({\mathcal {R}}\) such that \({\hat{\tau }}=\inf \{ t \ge 0 : (t,B_t)\in {\mathcal {R}}\}\). In particular the Skorokhod embedding problem has a solution of barrier type as in (1.1).
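The hitting time of a barrier can be simulated in a simple way. The sketch below makes several assumptions that are not part of the paper: Brownian motion is replaced by a scaled simple random walk, and the barrier is described by a hypothetical function `rho`, where `rho(x)` is the earliest time at which level x belongs to the barrier. The set \(\{(t,x): t \ge \rho (x)\}\) satisfies the barrier property, since \((s,x)\) in the set implies \((t,x)\) in the set for \(t>s\).

```python
import random


def barrier_hitting_time(rho, n_steps, dt, rng):
    """Stop a scaled simple random walk on first entry into {(t, x) : t >= rho(x)}.

    rho     : hypothetical function x -> earliest barrier time at level x
              (float('inf') if level x never belongs to the barrier)
    n_steps : maximal number of steps to simulate
    dt      : time step; the space step is sqrt(dt)
    rng     : instance of random.Random
    Returns (t, x) at the stopping step, or None if the walk is not stopped
    within n_steps steps.
    """
    dx = dt ** 0.5
    x_index = 0  # position is x_index * dx; integer steps avoid rounding drift
    for k in range(n_steps + 1):
        t, x = k * dt, x_index * dx
        if t >= rho(x):
            return (t, x)
        x_index += rng.choice((-1, 1))
    return None
```

With `rho(x) = 0` for \(|x| \ge 1\) and infinity otherwise, this is the exit time of the walk from \((-1,1)\); a constant `rho` gives a deterministic stopping time.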
Proof
Step 1. We first pick—by Theorem 1.1—a stopping time \({\hat{\tau }}\) which attains \(P_\gamma .\) By Theorem 1.3 there exists a set \(\Gamma \subseteq S\) such that \(\left( \left( B_{s}\right) _{s \le {\hat{\tau }}}, {\hat{\tau }}\right) \in \Gamma \) almost surely, and such that \((\Gamma ^{<} \times \Gamma ) \cap \mathsf {SG}= \emptyset \).
A consequence of this proof is that (on a given stochastic basis) there exists exactly one solution of the Skorokhod embedding problem which minimizes \(\mathbb {E}[h(\tau )]\); this property was first established in [50], together with the optimality property of Root’s solution. To see this, assume that minimizers \(\tau _1\) and \(\tau _2\) are given. Then we can use an independent coin-flip to define a new minimizer \({\bar{\tau }}\) which is with probability 1/2 equal to \(\tau _1\) and with probability 1/2 equal to \(\tau _2\). By Theorem 2.1, \({\bar{\tau }}\) is of barrier type and hence \(\tau _1=\tau _2\).
Remark 2.2
We highlight here the nature of the proof of Theorem 2.1. The proof divides into three steps, two of these steps (Steps 1 and 3) being probabilistic in nature, making arguments about random variables on a particular probability space. The second step, however, is purely a pointwise argument about the properties of subsets of \(\Gamma \) in relation to the function \(\gamma \) which we look to optimize. The latter arguments are not probabilistic in nature.
Remark 2.3
The following argument, due to Loynes [39], can be used to argue that barriers are unique in the sense that if two barriers solve (SEP), then their hitting times must be equal. Suppose that \({\mathcal {R}}\) and \({\mathcal {S}}\) are both closed barriers which embed \(\mu \). Note that we can take the closed barriers without altering the stopping properties. Consider the barrier \({\mathcal {R}} \cup {\mathcal {S}}\): let \(A \subseteq \Omega _{{\mathcal {R}}} := \{x: (t,x) \in {\mathcal {S}} \implies (t,x) \in {\mathcal {R}}\}\). Then \(\mathbb {P}(B_{\tau _{{\mathcal {R}} \cup {\mathcal {S}}}} \in A) \le \mathbb {P}(B_{\tau _{{\mathcal {R}}}} \in A) = \mu (A)\). Similarly, for \(A' \subseteq \Omega _{\mathcal {S}} := \{x: (t,x) \in {\mathcal {R}} \implies (t,x) \in {\mathcal {S}}\},\mathbb {P}(B_{\tau _{{\mathcal {R}} \cup \mathcal {S}}} \in A') \le \mathbb {P}(B_{\tau _{\mathcal {S}}} \in A') = \mu (A')\). Since \(\mu (\Omega _{{\mathcal {R}}} \cup \Omega _\mathcal {S}) = 1,\tau _{{\mathcal {R}} \cup {\mathcal {S}}}\) embeds \(\mu \).
It is known (see Monroe [41]) that, when \(\mu \) has a second moment, the second condition in (SEP), \(\mathbb {E}[\tau ] < \infty \), is equivalent to minimality of the stopping time (recall (2.1)). It immediately follows from the argument above that if the barriers \({\mathcal {R}}\) and \({\mathcal {S}}\) solve (SEP), then \(\tau _{{\mathcal {R}}} = \tau _{\mathcal {S}}\) a.s. With minor modifications the argument of Loynes also applies to the Rost solution discussed below as well as to a number of further classical embeddings presented in Sect. 6 below.
2.2 The Rost embedding
A set \({{\mathcal {R}}} \subseteq \mathbb {R}_+\times \mathbb {R}\) is an inverse barrier if \((s,x)\in {{\mathcal {R}}}\) and \(s > t\) implies that \((t,x)\in {{\mathcal {R}}}\). It has been shown by Rost [50] that under the condition \(\mu (\{0\})=0\) there exists an inverse barrier such that the corresponding hitting time (in the sense of (1.1)) solves the Skorokhod problem (see Fig. 3(A)). It is not hard to see that without this condition some additional randomization is required. We derive this using an argument almost identical to the one above.
Theorem 2.4
Suppose \(\mu (\{0\}) = 0\). Let \(\gamma (f,t)= h(t)\), where \(h:\mathbb {R}_+\rightarrow \mathbb {R}_+\) is a strictly concave function such that \(\mathrm{(OptSEP)}\) is well posed. Then a minimizer \({\hat{\tau }}\) of \(\mathrm{(OptSEP)}\) exists, and moreover for any minimizer \({\hat{\tau }}\), there exists an inverse barrier \({\mathcal {R}}\) such that \({\hat{\tau }}=\inf \{ t \ge 0 : (t,B_t)\in {\mathcal {R}}\}\). In particular the Skorokhod embedding problem has a solution which is the hitting time of an inverse barrier.
Proof
Arguing likewise on c, we obtain \(\tau _\textsc {cl}= \tau _\textsc {op}\) a.s. \(\square \)
As in the case of the Root embedding we obtain that the minimizer of \(\mathbb {E}[h( \tau )]\) is unique.
2.3 The Cave embedding
In this section we give an example of a new embedding that can be derived from Theorem 1.3. It can be seen as a unification of the Root and Rost embeddings. A set \({\mathcal {R}} \subseteq \mathbb {R}_+\times \mathbb {R}\) is a cave barrier if there exists \(t_0\in \mathbb {R}_+\), an inverse barrier \({\mathcal {R}}^0\subseteq [0,t_0]\times \mathbb {R}\) and a barrier \({\mathcal {R}}^1 \subseteq [t_0,\infty )\times \mathbb {R}\) such that \({\mathcal {R}}={\mathcal {R}}^0\cup {\mathcal {R}}^1.\) We will show that there exists a cave barrier such that the corresponding hitting time (in the sense of (1.1)) solves the Skorokhod problem (Fig. 3(B)). We derive this using an argument similar to the one above, applied to a function \(\varphi \) on \(\mathbb {R}_+\) with the following properties:

\(\varphi (0)=0, \lim _{t\rightarrow \infty }\varphi (t)=0, \varphi (t_0)=1\)

\(\varphi \) is strictly concave on \([0,t_0]\)

\(\varphi \) is strictly convex on \([t_0,\infty )\).
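A function with the three properties listed above is easy to write down. The sketch below gives one hypothetical choice, with \(t_0 = 1\), a quadratic piece on \([0,t_0]\) and an exponential piece on \([t_0,\infty )\); it is an illustration only and not a function used in the paper. The helper `second_difference` is likewise introduced here to check concavity and convexity numerically.

```python
import math

T0 = 1.0  # hypothetical choice of t_0


def phi(t, t0=T0):
    """One example of a cave-type cost: phi(0) = 0, phi(t0) = 1, phi(t) -> 0."""
    if t <= t0:
        # quadratic piece: strictly concave on [0, t0], rising from 0 to 1
        return 1.0 - (1.0 - t / t0) ** 2
    # exponential piece: strictly convex on [t0, infinity), decaying to 0
    return math.exp(-(t - t0))


def second_difference(f, t, h=1e-3):
    """Central second difference; negative for strictly concave f near t,
    positive for strictly convex f near t."""
    return f(t + h) - 2.0 * f(t) + f(t - h)
```

Both pieces take the value 1 at \(t_0\), so this \(\varphi \) is continuous and takes values in [0, 1].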
Theorem 2.5
(Cave embedding) Suppose \(\mu (\{0\}) = 0\). Let \(\gamma (f,t)= \varphi (t)\). Then a minimizer \({\hat{\tau }}\) of (OptSEP) exists, and moreover for any minimizer \({\hat{\tau }}\), there exists a cave barrier \({\mathcal {R}}\) such that \({\hat{\tau }}=\inf \{ t \ge 0 : (t,B_t)\in {\mathcal {R}}\}\). In particular the Skorokhod embedding problem has a solution which is the hitting time of a cave barrier.
Since this construction does not already appear in the literature, we emphasize that the result remains true for integrable (centered) measures \(\mu \) (see Sect. 7).
Proof of Theorem 2.5
By the same argument as for the Root and Rost embeddings it then follows that \(\tau _{{\mathcal {R}}_{\textsc {cl}}}\le {\hat{\tau }}\le \tau _{{\mathcal {R}}_{\textsc {op}}}\) a.s. and also that \(\tau _{{\mathcal {R}}_{\textsc {cl}}}=\tau _{{\mathcal {R}}_{\textsc {op}}}\) a.s., proving the claim. \(\square \)
2.4 Remarks
In Sect. 6.3 we will show that the arguments above can be adapted to prove the existence of Rost and Root embeddings in a more general setting. Specifically, in Sects. 6 and 7 we will show that this approach generalizes to a multidimensional setup and (sufficiently regular) Markov processes. In the case of the Root embedding it does not matter for the argument whether the starting distribution is a Dirac in 0 as in our setup or a more general distribution \(\lambda \). For the Rost embedding a general starting distribution is slightly more difficult. When \(\lambda \) and \(\mu \) have common mass, it may happen that \({{\mathrm{proj}}}_{\mathbb {R}_+}({\mathcal {R}}_\textsc {cl}\cap (A \times \mathbb {R}_+)) = \{0\}\) for some set A; that is, all paths which stop at \(x \in A\) do so at time zero. In this case it is possible that \({\hat{\tau }} < \tau _\textsc {op}\) when the process starts in A, and in general, some proportion of the paths starting in A must be stopped instantly. As a result, in the case of general starting measures, independent randomization is necessary. In the Rost case, it is also straightforward to compute the independent randomization which preserves the embedding property.
Other recent approaches to the Root and Rost embeddings can be found in [13, 14, 25, 26]. These papers largely exploit PDE techniques and, as a consequence, are able to produce more explicit descriptions of the barriers; however, the methods tend to be highly specific to the problem under consideration.
3 Preliminaries on stopping times and filtrations
A key feature of this article is that we are taking a nonstandard perspective on stopping times; the main purpose of this section is to provide a convenient framework. To this end, we need to discuss connections between common notions defined on an arbitrary probability space and their related notions defined on the canonical path space \({C_0(\mathbb {R}_+)}\) and the space S. We then see (by Lemma 3.11, Theorem 3.8) that in the context of our optimization problem, rather than studying the class of all possible stopping times, we can equivalently focus on randomized stopping times on the canonical space. These can be characterized in various equivalent terms (cf. Theorem 3.8); e.g. viewing them as measures on \({C_0(\mathbb {R}_+)}\times \mathbb {R}_+\) is useful to establish compactness results while the representation through ‘increasing’ functions on S is necessary for the manipulations of stopping times which we need to consider in the proof of the monotonicity principle, Theorem 1.3, in Sect. 5. Finally, we shall consider the set of ‘joinings’ which can be interpreted as a type of coupling between a randomized stopping time and an abstract probability measure. This is an important ingredient in the proofs of Theorem 1.2 and Theorem 1.3.
3.1 Spaces and filtrations
For our arguments it will be important to be precise about the relationship between the sets \({{C_0(\mathbb {R}_+)}} \times \mathbb {R}_+\) and S. We therefore discuss the underlying filtrations in some detail.
We consider two different filtrations on the Wiener space \({{C_0(\mathbb {R}_+)}}\), the canonical or natural filtration \(\mathcal {F}^0=(\mathcal {F}_t^0)_{t\in \mathbb {R}_+}\) as well as its usual augmentation \(\mathcal {F}^a=(\mathcal {F}^a_t)_{t\in \mathbb {R}_+}\). As Brownian motion is a continuous Feller process, all right-continuous \(\mathcal {F}^a\)-martingales are continuous ([47, Theorem VI. 15.4]) and hence all \(\mathcal {F}^a\)-stopping times are predictable and the \(\mathcal {F}^a\)-optional and \(\mathcal {F}^a\)-predictable \(\sigma \)-algebras coincide [46, Corollary IV 5.7]. By [16, Theorem IV. 97, Rem. IV. 98] we also have that the \(\mathcal {F}^0\)-predictable, \(\mathcal {F}^0\)-optional and \(\mathcal {F}^0\)-progressive \(\sigma \)-algebras coincide because \({{C_0(\mathbb {R}_+)}}\) is the set of continuous paths. Moreover, we will use the following result.
Theorem 3.1
 (1) If \(\tau \) is a predictable time wrt \(\mathcal {G}^a\), then there exists a predictable time \(\tau '\) wrt \(\mathcal {G}\) such that \(\tau =\tau '\) a.s. For every \(\mathcal {G}^a\)-predictable process \((X_t)_{t\in \mathbb {R}_+}\) there is a \(\mathcal {G}\)-predictable process \((X_t')_{t\in \mathbb {R}_+}\) which is indistinguishable from \((X_t)_{t\in \mathbb {R}_+}.\)
 (2) If \((A_t)_{t\in \mathbb {R}_+}\) is an increasing right-continuous \(\mathcal {G}^a\)-predictable process there is an increasing right-continuous \(\mathcal {G}\)-predictable process \((A_t')_{t\in \mathbb {R}_+}\) (possibly assuming the value \(+\infty \)) which is indistinguishable from \((A_t)_{t\in \mathbb {R}_+}\).
Proof
For Statement (1) we refer to [16, Theorem IV. 78] and the comments directly afterwards. To prove statement (2), let \((A_t)_{t\in \mathbb {R}_+}\) be an increasing right-continuous \(\mathcal {G}^a\)-predictable process. Arguing on \((\frac{2}{\pi }\arctan (A_t-A_0))_{t\in \mathbb {R}_+}\), we may assume that A takes values in [0, 1].
We use an extension of the filtered probability space denoted \((\bar{\Omega },\bar{\mathcal {G}},(\bar{\mathcal {G}}_t)_{t \ge 0},\bar{\mathbb {P}})\), where we take \(\bar{\Omega } = \Omega \times [0,1],\bar{\mathcal {G}} = \mathcal {G}\otimes {\mathcal {B}}([0,1]), \bar{\mathbb {P}}(D_1\times D_2) = \mathbb {P}(D_1) \mathcal {L}(D_2)\), and set \(\bar{\mathcal {G}}_t= \mathcal {G}_t \otimes {\mathcal {B}}([0,1])\) and let \(\bar{\mathcal {G}}^a\) be its usual augmentation. Here, \(\mathcal {L}\) denotes Lebesgue measure. Abusing notation we also write A for the mapping \((\omega ,x,t) \mapsto A_t(\omega )\) on \( \bar{\Omega }\times \mathbb {R}_+\).
The following result is a particular case of [16, Theorem IV. 97] (in somewhat different notation).
Theorem 3.2
 (1)
A set \(D\subseteq {{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\) is \(\mathcal {F}^0\)-optional iff \(D=r^{-1}(A)\) for some Borel set \(A\subseteq S\).
 (2)
A process \(X=(X_t)_{t\in \mathbb {R}_+}\) is \(\mathcal {F}^0\)-optional iff \(X=H\circ r\) for some Borel measurable \(H:S\rightarrow \mathbb {R}\).
The mapping r is not a closed mapping: it is easy to see that there exist closed sets in \({{C_0(\mathbb {R}_+)}} \times \mathbb {R}_+\) with a non-closed image under r. However, this does not happen for closed optional sets: it is straightforward that an \(\mathcal {F}^0\)-optional set \(A\subseteq {{C_0(\mathbb {R}_+)}} \times \mathbb {R}_+\) is closed iff the corresponding set r(A) is closed in S.
Definition 3.3
If X is an \(\mathcal {F}^0\)-optional process we write \(X^S\) for the unique function \(S\rightarrow \mathbb {R}\) satisfying \(X=X^S\circ r\). We say that an optional process X is S-continuous (resp. S-lsc) if the corresponding function \(X^S: S \rightarrow \mathbb {R}\) is continuous (resp. lsc).
It is trivially true that an S-continuous process is continuous in the usual pathwise sense. The converse is not generally true: consider the case where \(X_t(\omega )=\text{sign}(\omega (1))(t-2)_+\). This is a continuous, optional process; however, the corresponding function \(X^S\) is not a continuous mapping from S to \(\mathbb {R}\). Other examples arise from functions connected to the local time of Brownian motion, cf. Sect. 6.2.
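The counterexample can be checked numerically. The sketch below reads the example as \(X^S(f,s)=\text{sign}(f(1))(s-2)_+\) and evaluates it along the straight-line paths \(r\mapsto \varepsilon r\), which converge uniformly to the zero path as \(\varepsilon \rightarrow 0\); the helper names `X_S` and `tilted` are illustrative and not taken from the text.

```python
def sign(x: float) -> int:
    """Sign function with sign(0) = 0."""
    return (x > 0) - (x < 0)

def X_S(f, s: float) -> float:
    """X^S(f, s) = sign(f(1)) * (s - 2)_+ for a stopped path (f, s) with s >= 1.

    `f` is a callable giving the path on [0, s]."""
    return sign(f(1.0)) * max(s - 2.0, 0.0)

def tilted(eps: float):
    """Path r -> eps * r; as eps -> 0 these converge uniformly on [0, 3] to the zero path."""
    return lambda r: eps * r

s = 3.0
limit_value = X_S(tilted(0.0), s)                                # value at the limit path
from_above = [X_S(tilted(1.0 / n), s) for n in (1, 10, 1000)]    # approaching with eps > 0
from_below = [X_S(tilted(-1.0 / n), s) for n in (1, 10, 1000)]   # approaching with eps < 0
```

The two approximating families stay at distance 1 from the value at the limit path, so \(X^S\) cannot be continuous at the zero path of length 3, although each individual trajectory \(t\mapsto X_t(\omega )\) is continuous.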
Definition 3.4
Clearly, (3.3) defines an \(\mathcal {F}^0_t\)-measurable function which is a version of the classical conditional expectation; subsequently, it will be useful to have this function defined for all \(\omega \). In accordance with Definition 3.3 we write \(X^{M,S}\) for the function satisfying \(X^M = X^{M,S}\circ r\).
Proposition 3.5
Let \(X\in C_b({{C_0(\mathbb {R}_+)}})\). Then \(X^M_t\) is an S-continuous martingale, \(X^M_\infty =\lim _{t\rightarrow \infty } X^M_t\) exists and equals X.
Proof
Note that \(X^{M,S}(f,s)= \int X^{(f,s)\oplus }(\omega )\, {\mathbb {W}}(d\omega )\) for \((f,s)\in S\). Also, \((f_n,s_n)\rightarrow (f,s)\) implies \(f_n\oplus \omega \rightarrow f\oplus \omega \) for \(\omega \in {{C_0(\mathbb {R}_+)}}\) and, by continuity of \(X\), \(X^{(f_n,s_n)\oplus }(\omega )\rightarrow X^{(f,s)\oplus }(\omega )\). Since X is bounded, dominated convergence implies \(X^{M,S}(f_n,s_n)\rightarrow X^{M,S}(f,s).\) \(\square \)
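The formula \(X^{M,S}(f,s)= \int X^{(f,s)\oplus }\, d{\mathbb {W}}\) has an exact discrete analogue that may help intuition. The sketch below replaces Brownian motion by a simple random walk (an assumption made purely for illustration), concatenates a stub with every possible continuation, and averages; the functional `X` and all function names are hypothetical.

```python
from itertools import product

def concat(f, w):
    """Discrete analogue of f (+) w: follow f, then continue with the increments of w.

    Both arguments are lists of positions at integer times; w starts at 0."""
    return list(f) + [f[-1] + x for x in w[1:]]

def walks(n):
    """All 2**n simple-random-walk paths of length n started at 0 (each has weight 2**-n)."""
    for steps in product((-1, 1), repeat=n):
        path = [0]
        for step in steps:
            path.append(path[-1] + step)
        yield path

def X_M_S(X, f, horizon):
    """Average of X over all continuations of the stub f up to time `horizon`."""
    n = horizon - (len(f) - 1)
    values = [X(concat(f, w)) for w in walks(n)]
    return sum(values) / len(values)

# Toy functional (hypothetical): X(path) = (terminal position)^2
X = lambda path: path[-1] ** 2
stub = [0, 1, 2, 1]      # a path followed up to time s = 3, currently at level 1
horizon = 6
value_at_stub = X_M_S(X, stub, horizon)
# Averaging the values one step later recovers the value at the stub (martingale property).
one_step_avg = sum(X_M_S(X, stub + [stub[-1] + d], horizon) for d in (-1, 1)) / 2
```

Here `value_at_stub` equals \(1^2 + 3 = 4\), the conditional expectation of the squared terminal position given the stub, and it coincides with the one-step average, which is the discrete counterpart of the martingale property of \(X^M\).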
For \(X\in C_b({{C_0(\mathbb {R}_+)}})\), \(X^M\) is a martingale with continuous paths and hence satisfies the optional stopping theorem. Using the functional monotone class theorem, we see that the optional stopping theorem holds for \(X^M\) for all bounded measurable \(X: {C_0(\mathbb {R}_+)}\rightarrow \mathbb {R}\). Also one can prove that \(X^M\) has almost surely continuous paths, even if X itself was not continuous, but we will not use this fact.
3.2 Randomized stopping times
Working on the probability space \(({C_0(\mathbb {R}_+)}, {\mathbb {W}})\), a stopping time \(\tau \) is a mapping which assigns to each path \(\omega \) the time \(\tau (\omega ) \) at which the path is stopped. If the stopping time depends on external randomization, then a path \(\omega \) is no longer stopped at a single point \(\tau (\omega )\); instead there is a subprobability measure \(\tau _\omega \) on \(\mathbb {R}\) which represents the probability that the path \(\omega \) is stopped at a given time, conditional on observing the path \(\omega \). The aim of this section is to make this idea precise, and to establish connections with related properties in the literature. Specifically, the notion of a randomized stopping time has previously appeared in e.g. [5, 40, 49].
Subsequently we will identify randomized stopping times as a subset of the well-studied \({\mathbf {P}}\)-measures: A finite measure \(\xi \) on \({C_0(\mathbb {R}_+)}\times \mathbb {R}_+\) is a \({\mathbf {P}}\)-measure (wrt \({\mathbb {W}}\)) if it does not charge any \({\mathbb {W}}\)-evanescent set. A basic result of Doléans [18] is the following
Theorem 3.6
Definition 3.7
(Randomized stopping times) A measure \(\xi \in \mathsf {M}\) is called a randomized stopping time, written \(\xi \in \mathsf {RST}\), iff the associated increasing process A is optional.
Below, it will sometimes be convenient to represent randomized stopping times on an extension of the space \(({C_0(\mathbb {R}_+)}, \mathcal {F}^0, (\mathcal {F}^0_t)_{t\ge 0},{\mathbb {W}})\): we will consider \(({{\overline{C}}_0(\mathbb {R}_+)},{\bar{\mathcal {F}}},({\bar{\mathcal {F}}}_t)_{t \ge 0},\overline{{\mathbb {W}}})\), where \({{\overline{C}}_0(\mathbb {R}_+)}= {C_0(\mathbb {R}_+)}\times [0,1], \overline{{\mathbb {W}}}(A_1\times A_2) = {\mathbb {W}}(A_1) \mathcal {L}(A_2)\) (where \(\mathcal {L}\) denotes Lebesgue measure), \({\bar{\mathcal {F}}}\) is the completion of \(\mathcal {F}^0 \otimes {\mathcal {B}}([0,1])\), and \({\bar{\mathcal {F}}}_t\) the usual augmentation of \((\mathcal {F}_t^0 \otimes {\mathcal {B}}([0,1]))_{t \ge 0}\). We will write \({\bar{B}}=({\bar{B}}_t)_{t\ge 0}\) for the process given by \({\bar{B}}_t(\omega ,u)=\omega _t.\) Observe that if \(Y_t(\omega ,u) = u\), then \(({\bar{B}}_t, Y_t)\) is (trivially) a continuous Feller process, and hence by the same arguments as above, the \({\bar{\mathcal {F}}}\)-predictable and \({\bar{\mathcal {F}}}\)-optional \(\sigma \)-algebras coincide.
Randomized stopping times play a key role in this paper; depending on the respective context, the following different characterizations will be useful:
Theorem 3.8
 (1)
There is a Borel function \(A:S\rightarrow [0,1]\) such that the process \(A\circ r\) is right-continuous increasing and
$$\begin{aligned} \xi _\omega ([0,s]):=A\circ r(\omega ,s) \end{aligned}$$(3.4)
defines a disintegration of \(\xi \) wrt \({\mathbb {W}}\).
 (2)
We have \(\xi \in \mathsf {RST}\), i.e. given a disintegration \((\xi _\omega )_{\omega \in {C_0(\mathbb {R}_+)}}\) of \(\xi \), the random variable \( {\tilde{A}}_t(\omega )=\xi _\omega ([0,t])\) is \(\mathcal {F}^a_t\)-measurable for all \(t\in \mathbb {R}_+\).
 (3)
For all \(f\in C_b(\mathbb {R}_+)\) supported on some \( [0,t],t\ge 0\) and all \(g\in C_b({C_0(\mathbb {R}_+)})\)
$$\begin{aligned} \int f(s) (g-\mathbb {E}[g\mid \mathcal {F}_t^0])(\omega ) \, \xi (d\omega , ds)=0. \end{aligned}$$(3.5)
 (4)
On the probability space \(({{\overline{C}}_0(\mathbb {R}_+)},{\bar{\mathcal {F}}},({\bar{\mathcal {F}}}_t)_{t \ge 0},\overline{{\mathbb {W}}})\), the random time
$$\begin{aligned} \rho (\omega ,u) :=\inf \{ t \ge 0 : \xi _\omega ([0,t]) \ge u\} \end{aligned}$$(3.6)
defines an \({\bar{\mathcal {F}}}\)-stopping time.
Proof
The equivalence of (1) and (2) follows directly from Theorems 3.1, 3.2 and 3.6.
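The passage between the increasing process \(\xi _\omega ([0,t])\) and the random time \(\rho \) in (3.6) is a generalized-inverse construction, which can be sketched on a time grid. In the snippet below the grid, the sample values of \(A\) and the function names are assumptions chosen for illustration; a midpoint grid in \(u\) stands in for Lebesgue measure on [0, 1].

```python
import bisect
import math

def rho(times, A, u):
    """First grid time t with A(t) >= u, or +inf if the path is never stopped at level u.

    `A` lists the nondecreasing values xi_omega([0, t]) along `times`."""
    k = bisect.bisect_left(A, u)
    return times[k] if k < len(times) else math.inf

times = [0, 1, 2, 3, 4]
A_randomized = [0.0, 0.25, 0.25, 0.75, 0.75]   # mass 1/4 at t=1, 1/2 at t=3, 1/4 never stopped
A_plain = [0.0, 0.0, 1.0, 1.0, 1.0]            # values in {0, 1}: an ordinary stopping time at t=2

us = [(j + 0.5) / 1000 for j in range(1000)]   # midpoint grid for the external randomization u

def stopped_by(t, A):
    """Fraction of randomization levels u with rho(omega, u) <= t."""
    return sum(rho(times, A, u) <= t for u in us) / len(us)

recovered = [stopped_by(t, A_randomized) for t in times]
plain_values = {rho(times, A_plain, u) for u in us}
```

Integrating out \(u\) recovers the original increasing process, and when \(A\) only takes the values 0 and 1 the time \(\rho \) does not depend on \(u\) at all.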
Remark 3.9
 (1)
The function A in (3.4) is unique up to indistinguishability (cf. Theorem 3.6). We will denote this function by \(A^\xi \).
 (2)
We will say \(\xi \in \mathsf {RST}\) is a non-randomized stopping time iff there is a disintegration \((\xi _\omega )_{\omega \in {C_0(\mathbb {R}_+)}}\) of \(\xi \) such that \(\xi _\omega \) is either null (corresponding to a path which is not stopped) or a Dirac measure (of mass 1) for every \(\omega \). Clearly this means that \(\xi _\omega = \delta _{\tau (\omega )}\) a.s. for some (non-randomized) stopping time \(\tau \). Moreover, \(\xi \) is a non-randomized stopping time iff there is a version of \(A^\xi \) which only attains the values 0 and 1.
 (3)
We will say \(\xi \in \mathsf {RST}\) is a finite randomized stopping time iff \(\xi ({C_0(\mathbb {R}_+)}\times \mathbb {R}_+) = 1\).
An immediate consequence of Theorem 3.8 (3) is the following
Corollary 3.10
The set \(\mathsf {RST}\) is closed wrt the weak topology induced by the continuous bounded functions on \({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\).
The next lemma implies that optimizing over usual stopping times on a rich enough probability space in (OptSEP) is equivalent to optimizing over randomized stopping times on Wiener space.
Lemma 3.11
Proof
To prove the second part, we observe that by Theorem 3.8 (4), there exists an \({\bar{\mathcal {F}}}\)-stopping time \(\rho '\) representing \(\xi \). Since \(\rho '\) is \({\bar{\mathcal {F}}}\)-predictable, it follows from Theorem 3.1 that there exists an almost surely equal \((\mathcal {F}_t^0 \times {\mathcal {B}}([0,1]))_{t \ge 0}\)-stopping time \(\rho \). Then we can define a random time on \(\Omega \) by \(\rho ((B_s)_{s \ge 0},Y)\), where B is the Brownian motion, and Y the independent \(\mathcal {G}_0\)-measurable, uniform random variable. Consider the map \({\bar{\Phi }}:\Omega \rightarrow {{\overline{C}}_0(\mathbb {R}_+)}, {\bar{\omega }}\mapsto ((B_t({\bar{\omega }}))_{t\ge 0}, Y({\bar{\omega }})).\) Since \(\rho \) is a \((\mathcal {F}_t^0 \times {\mathcal {B}}([0,1]))_{t \ge 0}\)-stopping time and \({\bar{\Phi }} \) is measurable from \( (\Omega , \mathcal {G}_t)\) to \(({{\overline{C}}_0(\mathbb {R}_+)}, \mathcal {F}_t^0 \times {\mathcal {B}} ([0,1]))\), \(\rho \circ (B,Y)\) is a \(\mathcal {G}\)-stopping time. \(\square \)
3.3 Randomized stopping times solving the Skorokhod problem and compactness
Lemma 3.12
 (1)
\(\xi (T)={\bar{\mathbb {E}}}[\rho ] < \infty \),
 (2)
\(\xi (T)={\bar{\mathbb {E}}}[\rho ] = V\) ,
 (3)
\((\bar{B}_{\rho \wedge t})\) is uniformly integrable.
Definition 3.13
We denote by \(\mathsf {RST}(\mu )\) the set of all finite randomized stopping times satisfying the conditions in Lemma 3.12.
For us it is crucial that randomized stopping times have the following property:
Theorem 3.14
The set \(\mathsf {RST}(\mu )\) is nonempty and compact wrt the weak topology induced by the continuous and bounded functions on \({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\).
Proof
If \(\mu \) is a centered probability then it is not hard to establish that the Skorokhod embedding problem has a solution, e.g. one can use the external randomization \(u\in [0,1]\) to stop \(({\bar{B}}_t(\omega ,u))_{t\ge 0}\) once it leaves (a(u), b(u)). Choosing a, b carefully we obtain a solution of (SEP), see e.g. [43, p. 332] for a detailed account.
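For a measure with finitely many atoms this randomized exit construction can be verified by exact arithmetic. The sketch below uses the standard exit probabilities of Brownian motion from an interval \((a,b)\) with \(a<0<b\); the target measure and the piecewise constant choice of \((a(u),b(u))\) are toy assumptions and do not reproduce the general construction of [43].

```python
from collections import defaultdict
from fractions import Fraction as Fr

def exit_law(a, b):
    """Law of Brownian motion started at 0 on first leaving (a, b), where a < 0 < b.

    Optional stopping gives P(exit at b) = -a / (b - a) and P(exit at a) = b / (b - a)."""
    a, b = Fr(a), Fr(b)
    return {a: b / (b - a), b: -a / (b - a)}

def mixture(intervals):
    """Law of the stopped process when (a(u), b(u)) is piecewise constant in u.

    `intervals` lists (weight, a, b), where weight is the Lebesgue measure of the
    set of u on which (a(u), b(u)) = (a, b)."""
    law = defaultdict(Fr)
    for weight, a, b in intervals:
        for point, p in exit_law(a, b).items():
            law[point] += Fr(weight) * p
    return dict(law)

# Toy centered target: mu = 5/8 delta_{-1} + 1/4 delta_{1} + 1/8 delta_{3}
target = {Fr(-1): Fr(5, 8), Fr(1): Fr(1, 4), Fr(3): Fr(1, 8)}
# Use (-1, 1) for u in [0, 1/2) and (-1, 3) for u in [1/2, 1]
embedded = mixture([(Fr(1, 2), -1, 1), (Fr(1, 2), -1, 3)])
mean = sum(x * p for x, p in embedded.items())
```

The mixture of the two exit laws reproduces the target exactly, and its mean is zero, as it must be for any law embedded by a stopping time with uniformly integrable stopped process.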
By Prokhorov’s theorem we have to show that \(\mathsf {RST}(\mu )\) is tight and closed.
Our use of randomization to achieve compactness of a set of stopping times has similarities to the work of Baxter and Chacon [5]. However their setup is different, and their intended applications are not connected to Skorokhod embedding.
3.4 Joinings
Remark 3.15
Write \({\mathsf {pred}}\) for the \(\sigma \)-algebra of \(\mathcal {F}^0\)-predictable sets in \({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\). We call a set \(A\subseteq {{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\times \mathsf {Y}\) predictable if it is an element of \({\mathsf {pred}}\otimes {\mathcal {B}}(\mathsf {Y})\). We will say that a function defined on \({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\times \mathsf {Y}\) is predictable if it is measurable wrt \({\mathsf {pred}}\otimes {\mathcal {B}}(\mathsf {Y})\). As before, predictable subsets of \( {{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\times \mathsf {Y}\) correspond to measurable subsets of \(S\times \mathsf {Y}\), and similarly for functions.
4 The optimization problem and duality
4.1 The primal problem
Theorem 4.1
By Lemma 3.11, Theorem 1.1 is a consequence of this result.
Proof of Theorem 4.1/Theorem 1.1
By the Portmanteau theorem, the functional (4.3) is lsc if \(\gamma :S\rightarrow \mathbb {R}\) is lsc and bounded from below by a constant.
In Sect. 7 below we establish existence of a minimizing stopping time in the case where the measure \(\mu \) does not necessarily admit a finite second moment. However we will then replace Assumption (4.2) by the requirement that \(\gamma \) is bounded from below.
4.2 The dual problem
The following result implies Theorem 1.2.
Theorem 4.2
We will establish Theorem 4.2 as a consequence of the following auxiliary duality result, where we write T for the projection map \({C_0(\mathbb {R}_+)}\times \mathbb {R}_+\times \mathbb {R}\rightarrow \mathbb {R}_+,T(\omega ,t,y)=t\).
Proposition 4.3
Proposition 4.3 should be compared to the (formally) very similar classical duality theorem of optimal transport, see e.g. [58, Section 5] for a proof as well as for a discussion of its origin and related literature.
Theorem 4.4
The strategy of the proof of Proposition 4.3 is to establish the duality relation \((\star )\) for \(\pi \), resp. \((\varphi , \psi ) \) taken from certain larger candidate sets, in which case the duality relation follows from Theorem 4.4. Then we introduce additional constraints via a variational approach to obtain an improved duality through the following min-max theorem.
Theorem 4.5
 (1)
K is compact,
 (2)
\(F(\cdot , y)\) is continuous and convex on K for every \(y\in L\),
 (3)
\(F(x,\cdot )\) is concave on L for every \(x\in K\)
Proof of Proposition 4.3
Claim 1
Taking the \(\inf \) over \(\pi \) satisfying (\(p[t_0]\)) and the \(\sup \) over \((\varphi , \psi )\) satisfying (\(d[c,t_0]\)), the duality relation \((\star )\) holds for continuous bounded \(c: {C_0(\mathbb {R}_+)}\times \mathbb {R}_+\times \mathbb {R}\rightarrow \mathbb {R}\).
Claim 2
Taking the \(\inf \) over \(\pi \) satisfying (\(p[t_0, V]\)) and the \(\sup \) over \((\varphi , \psi )\) satisfying \((d[c,t_0, V])\), the duality relation \((\star )\) holds for continuous bounded \(c: {C_0(\mathbb {R}_+)}\times \mathbb {R}_+\times \mathbb {R}\rightarrow \mathbb {R}\).
Claim 3
Taking the \(\inf \) over \(\pi \) satisfying (p[V]) and the \(\sup \) over \((\varphi , \psi )\) satisfying (d[c, V]), the duality relation \((\star )\) holds for \(c: {C_0(\mathbb {R}_+)}\times \mathbb {R}_+\times \mathbb {R}\rightarrow \mathbb {R}\) lsc and bounded from below.
It follows that \((\varphi ,\psi )\) satisfy (\(d^M[c,V]\)). Thus (4.10) yields the nontrivial part of \((\star )\) for the constraints \((p^M[V])\), (\(d^M[c,V]\)) in the case of continuous bounded c. As above, the extension to lsc c is straightforward. \(\square \)
Proof of Theorem 4.2
To prove the latter inequality, note that each \(\pi \in \mathsf {JOIN}^{1,V}( \mu )\) satisfying \(\int c\, d\pi <\infty \) is concentrated on \(\{(\omega ,t, y): \omega (t)=y\}\) and writing \(p(\omega , t, y):=(\omega ,t)\) we find \(\xi := p(\pi )\in \mathsf {RST}(\mu ),\int c\, d\pi = \int \gamma \, d\xi \). \(\square \)
4.3 General starting distribution
Theorem 4.6
5 The monotonicity principle
 1.
Consider an optimal stopping rule \(\xi \) and a stop-go pair \(((f,s),(g,t))\in \mathsf {SG}\) where (f, s) is still going according to the stopping rule \(\xi \) while (g, t) is stopped by \(\xi \). Intuitively speaking, we can find an (infinitesimal) improvement of \(\xi \) by switching the roles of f and g. As \(\xi \) is optimal, there should only exist a few such pairs. We formalize this in Proposition 5.8 by showing that if \(\pi (\mathsf {SG})>0\) for some \(\pi \in \mathsf {JOIN}(r(\xi ))\) we can explicitly construct a stopping rule with strictly lower ‘cost’.
 2.
Knowing that \(\mathsf {SG}\) is negligible in the sense that it is not seen by the ‘couplings’ \(\pi \) just described, it remains to find a support \(\Gamma \) of \(\xi \) such that \(\mathsf {SG}\cap (\Gamma ^< \times \Gamma )=\emptyset .\) The crucial step is the characterization of a set which is null wrt all \(\pi \in \mathsf {JOIN}(r(\xi ))\) which we establish in Proposition 5.9 based on Choquet’s capacitability theorem and an auxiliary duality result.
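For the cost \(\gamma (f,s)=s^2\) of the Root example, the switching heuristic in step 1 can be checked by hand. The sketch below assumes that the stop-go condition of Definition 1.4 compares the cost of "f continues for an extra time \(\sigma \) and g stops" with the cost of "f stops and g continues for \(\sigma \)"; since this cost depends only on time, only the law of \(\sigma \) matters. The helper `swap_gain` and the chosen law of \(\sigma \) are hypothetical.

```python
from fractions import Fraction as Fr

def gamma(s):
    """Root-type cost gamma(f, s) = s^2; it depends on the stub only through its length."""
    return s * s

def swap_gain(s, t, sigma_law):
    """Cost(f goes for sigma, g stops) minus cost(f stops, g goes for sigma).

    `sigma_law` maps each value of the extra running time sigma to its probability."""
    go_f = sum(p * gamma(s + x) for x, p in sigma_law.items()) + gamma(t)
    go_g = gamma(s) + sum(p * gamma(t + x) for x, p in sigma_law.items())
    return go_f - go_g

sigma_law = {Fr(1, 2): Fr(1, 3), Fr(2): Fr(2, 3)}     # hypothetical law, E[sigma] = 3/2
gain_late_goer = swap_gain(Fr(3), Fr(1), sigma_law)   # going stub is longer: s = 3 > t = 1
gain_early_goer = swap_gain(Fr(1), Fr(3), sigma_law)  # going stub is shorter: s = 1 < t = 3
```

Expanding the squares shows that the gain equals \(2(s-t)\,\mathbb {E}[\sigma ]\), so under the stated assumption the pair is a stop-go pair exactly when the going stub is the longer one. This matches the barrier picture of Root's construction, in which a path arriving at a given level later than a stopped path must itself be stopped.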
In the first part of this section we will give a number of definitions and results that are needed to establish Theorem 1.3 (including the statements of Propositions 5.8 and 5.9); the respective proofs will be given subsequently.
The notion of stop-go pairs introduced in Definition 1.4 requires that all possible extensions \(\sigma \) are considered. However, to establish the monotonicity principle, it is actually more natural to prove a stronger result that appeals to a relaxed notion of stop-go pairs which are sensitive to the stopping measure \(\xi \), or, more precisely, to a representation of \(\xi \) through a function \(A^\xi \) as in Theorem 3.8 (1).
Important Convention
Throughout this section we will fix \(\xi \in \mathsf {RST}(\mu )\), as well as the particular representation \(A^\xi \).
Definition 5.1
The measure \(\xi ^{(f,s)}\) is the normalized stopping measure given that we followed the path f up to time s. In other words this is the normalized stopping measure of the ‘bush’ which follows the ‘stub’ (f, s). We note that \(\xi ^{(f,s)}\) depends measurably on \((f,s)\in S\).
Informally, the following lemma asserts that if \(\xi \) is a well-behaved stopping time, then the same holds for \(\xi ^{(f,s)}\) for typical \((f,s)\in S\). More precisely, we say that \(V\subseteq S\) is evanescent if \(r^{-1}(V)\) is an evanescent subset of \({C_0(\mathbb {R}_+)}\times \mathbb {R}_+\). Equivalently, V is evanescent if there is a Borel set \(A\subseteq {C_0(\mathbb {R}_+)}\) with \({\mathbb {W}}(A)=1\) such that \(r(A\times \mathbb {R}_+)\cap V=\emptyset \). Recall that T denotes the projection from \({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+\) onto \( \mathbb {R}_+.\)
Lemma 5.2
The set \(\{(f,s)\in S : \xi ^{(f,s)}\notin \mathsf {RST}^1\}\) is evanescent. Moreover, if \(F:{C_0(\mathbb {R}_+)}\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) is predictable and satisfies \(\xi (F)<\infty \) then the set \(\{(f,s)\in S : \xi ^{(f,s)}(F^{(f,s)\oplus }) =\infty \}\) is evanescent. In particular, \(\{(f,s)\in S:\xi ^{(f,s)}(T) =\infty \}\) is evanescent, since \(\xi \in \mathsf {RST}(\mu )\).
Definition 5.3
 (1)
\(\int T\, d\xi ^{(f,s)}=\infty \) or \(\xi ^{(f,s)}({C_0(\mathbb {R}_+)}\times \mathbb {R}_+)<1\);
 (2)
the integral on the left side equals \(\infty \);
 (3)
either of the integrals is not defined.
Lemma 5.4
Remark 5.5
Note that \(\mathsf {SG}^\xi \) and \({\widehat{\mathsf {SG}}}^\xi \) are Borel subsets of \(S\times S\) (corresponding to predictable subsets of \(({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+)\times ({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+)=({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+)\times \mathsf {Y}\) in the sense of Remark 3.15). In contrast, \(\mathsf {SG}\) is in general just coanalytic.
Definition 5.6
Recall that we say that our optimization problem (OptSEP) is well posed if \({\int \gamma ~ d\xi }\) exists with values in \((-\infty ,\infty ]\) for all \(\xi \in \mathsf {RST}(\mu )\) and it is finite for one such \(\xi \). Together with Lemma 5.4, the following result implies Theorem 1.3 stated in the introduction, and is itself a slightly stronger result.
Theorem 5.7
Assume that \(\gamma :S\rightarrow \mathbb {R}\) is Borel measurable, the optimization problem (4.1) is well posed and that \(\xi \in \mathsf {RST}(\mu )\) is an optimizer. Then there exists a \((\gamma ,\xi )\)-monotone Borel set \(\Gamma \subseteq S\) which supports \(\xi \) in the sense that \(r(\xi )(\Gamma )=1\).
The proof of Theorem 5.7 relies on Proposition 5.8 and Proposition 5.9 below. The first result formalizes the heuristic idea that an optimizer cannot be improved on a large set of paths but at most on a small set of exceptional paths. The second result allows us to entirely exclude such an exceptional set of paths.
Given functions \(F:\mathsf {X}\rightarrow \mathsf {X}', G:\mathsf {Y}\rightarrow \mathsf {Y}'\) we denote the product map by \(F\otimes G:\mathsf {X}\times \mathsf {Y}\rightarrow \mathsf {X}'\times \mathsf {Y}'.\) Given a probability \(\nu \) on a Polish space \(\mathsf {Y}\), we defined the set \(\mathsf {JOIN}(\nu )\) in Sect. 3.4. An element \(\pi \in \mathsf {JOIN}(\nu )\) is a measure on \(({{C_0(\mathbb {R}_+)}}\times \mathbb {R}_+) \times \mathsf {Y}\), and we will commonly consider the pushforward measure \((F \otimes G)(\pi )\). Typically F will be the map \(r: {{C_0(\mathbb {R}_+)}} \times \mathbb {R}_+ \rightarrow S\), and G will be r or the identity.
Proposition 5.8
Assume that \(\gamma :S\rightarrow \mathbb {R}\) is Borel measurable, the optimization problem (4.1) is well posed and that \(\xi \in \mathsf {RST}(\mu )\) is an optimizer. Then \((r \otimes {{\mathrm{Id}}})(\pi )(\mathsf {SG}^\xi )=0\) for any \(\pi \in \mathsf {JOIN}^1( r(\xi ))\).
Below we apply Proposition 5.9 to \((\mathsf {Y}, \nu )=(S, r(\xi ))\), but this choice is not relevant for the proof of Proposition 5.9 and so we state it for an abstract Polish probability space \((\mathsf {Y}, \nu )\).
Proposition 5.9
 (1)
\((r \otimes {{\mathrm{Id}}})(\pi )(E)=0\) for all \(\pi \in \mathsf {JOIN}^1( \nu )\).
 (2)
\(E \subseteq (F \times \mathsf {Y})\ \cup \ (S\times N)\) for some evanescent set \(F\subseteq S\) and a \(\nu \)null set \(N\subseteq \mathsf {Y}\).
Intuitively speaking, Proposition 5.9 characterizes when a predictable set \(E\subseteq S\times \mathsf {Y}\) is ‘negligible’. In this sense it relates to the classical (cross) section theorem, which implies the following characterization of negligible subsets of S.
Proposition 5.10
 (1)
\(r(\alpha )(E)=0\) for all \(\alpha \in \mathsf {RST}\).
 (2)
E is evanescent.
 (1’)
\({\mathbb {W}}(((B_s)_{s\le \tau },\tau )\in E)=0\) for every \(\mathcal {F}^0\)stopping time \(\tau \).
Note that the equivalence of (1) and (2) in Proposition 5.10 corresponds precisely to Proposition 5.9 in the case where \(\mathsf {Y}\) consists of a single element.
Proof of Theorem 5.7
It remains to establish the auxiliary results stated above.
Proof of Lemma 5.2
Proof of Lemma 5.4
5.1 Proof of Proposition 5.8
 (1)
The terminal distributions \(\mu _0, \mu _1\) corresponding to \(\xi _0^\pi \) and \(\xi _1^\pi \) satisfy \((\mu _0+\mu _1)/2= \mu .\)
 (2)
\(\xi _0^\pi \) stops paths earlier than \(\xi \) while \(\xi _1^\pi \) stops later than \(\xi \).
 (3)
The cost of \(\xi _0^\pi \) plus the cost of \(\xi _1^\pi \) is less than twice the cost of \(\xi \), i.e.
$$\begin{aligned}&\int \gamma \circ r(\omega ,t) \, d\xi ^\pi _0(\omega , t)+ \int \gamma \circ r(\omega ,t) \, d\xi ^\pi _1(\omega , t) \\&\quad < 2 \int \gamma \circ r(\omega ,t) \, d\xi (\omega , t). \end{aligned}$$
If we are able to construct such a pair \(\xi _0^\pi , \xi _1^\pi \), then \(\xi ^\pi :=(\xi _0^\pi +\xi _1^\pi )/2 \in \mathsf {RST}(\mu )\) is strictly better than \(\xi \) and therefore yields the desired contradiction.
5.2 Proof of Proposition 5.9
Only the implication (1) \(\Rightarrow \) (2) of Proposition 5.9 is nontrivial. The proof is based on Choquet’s capacitability theorem and the following auxiliary duality result which is closely related to Proposition 4.3. We fix \(t_0\in \mathbb {R}_+\) and set \(S_{t_0}:=\{(f,s)\in S: s\le t_0\}\).
Proposition 5.11
Proof
We now state several consequences of Proposition 5.11 in which we switch the roles of \(\inf \) and \(\sup \) to provide a more natural formulation.
Corollary 5.12
Proof
It suffices to consider the case \({\mathbb {W}}(\varphi )\le 1\). Put \(\rho =\inf \{t\ge 0:\varphi ^M_t > 1\}.\) Due to S-continuity of \(\varphi ^M\) (by Proposition 3.5) the set \(O:=\{(\omega ,t) :\varphi ^{M,S}\circ r(\omega ,t)>1\}\) is open. Hence also \(\{\rho <\infty \}={{\mathrm{proj}}}_{{C_0(\mathbb {R}_+)}} O\) is open as projections are open mappings and the map \(\omega \mapsto \varphi ^M_{\rho (\omega )}(\omega )=:{\bar{\varphi }}(\omega )\le 1\) is lsc. Clearly, \((\bar{\varphi },{\bar{\psi }})\) satisfies (5.15) and \({\mathbb {W}}(\bar{\varphi })+\nu ({\bar{\psi }})\le {\mathbb {W}}(\varphi )+\nu (\psi ).\) \(\square \)
Lemma 5.13
\(D_{t_0}\) is a Choquet capacity on \(S\times \mathsf {Y}.\)
Proof
 (1)
monotonicity: \(A\subseteq B \Rightarrow \Psi (A)\le \Psi (B)\)
 (2)
continuity from below: \(A_1\subseteq A_2\subseteq \cdots \Rightarrow \Psi (A_n) \rightarrow \Psi (\bigcup _j A_j)\)
 (3)
boundedness: \(\Psi (K)<\infty \) for all compact K; if \(\Psi (K)<u\) there exists open \(U\supseteq K\) with \(\Psi (U)<u.\)
Lemma 5.14
Proof
We may assume \(D_{t_0}(K)<1/2\), otherwise simply take \(A=\mathsf {Y}, F=\emptyset \).
Proof of Proposition 5.9
Assume first that \(E\subseteq S_{t_0}\times \mathsf {Y}\). We have \(\sup _{\pi \in \mathsf {JOIN}^1( \nu )} \pi (K)=0\) for all compact \(K\subseteq E\). By Corollary 5.12, this implies that \(D_{t_0}(K)=0\) for all compact \(K\subseteq E\). By Choquet’s capacitability theorem [35, Theorem 30.13] and Lemma 5.13 this in turn implies \(D_{t_0}(E)=0\).
Hence, by Lemma 5.14, for each \(\varepsilon >0\) there exist \(F \subseteq S\) and a set \(N\subseteq \mathsf {Y}\) such that \(E \subseteq (F \times \mathsf {Y})\cup (S \times N)\) and \({\mathbb {W}}({{\mathrm{deb}}}_{t_0}(F))+\nu (N)\le 2\varepsilon .\)
5.3 A secondary minimization result
We need an extended version of the stop-go pairs introduced in Definition 5.3.
Definition 5.15
We also define secondary stop-go pairs in the wide sense by \({{\widehat{\mathsf {SG}}}_{2}^{\xi }}=\mathsf {SG}_2^\xi \cup \{(f,s)\in S: A^\xi (f,s)=1\}\times S\).
Then we have the following generalization of Theorem 5.7.
Theorem 5.16
The proof given for Theorem 5.7 also applies in the present situation. Hence, the result follows immediately from the following straightforward variant of Proposition 5.8.
Proposition 5.17
Assume that \(\gamma , \tilde{\gamma }: S \rightarrow \mathbb {R}\) are measurable, the optimization problem (5.20) is well posed, and that \(\xi \in \mathsf {RST}(\mu )\) is an optimizer. Then \((r \otimes {{\mathrm{Id}}})(\pi )(\mathsf {SG}_2^\xi )=0\) for any \(\pi \in \mathsf {JOIN}^1(r(\xi ))\).
Proof
As \(\xi \in \mathsf {Opt}_\gamma \) we have to show that \((r \otimes {{\mathrm{Id}}})(\pi )(\mathsf {SG}_2^\xi {\setminus }\mathsf {SG}^\xi )=0\), however this follows by considering the same construction as in the proof of Proposition 5.8. \(\square \)
6 Embeddings in abundance
Theorem 6.1
Let \(\gamma , \tilde{\gamma }:S\rightarrow \mathbb {R}\) be lsc and bounded from below in the sense of (4.2). Then \(\mathrm{({OptSEP}_2)}\) admits a minimizer \({\hat{\tau }}\).
We now provide the appropriate generalizations of Definitions 1.4 and 1.5 and Theorem 1.3 for this case.
Definition 6.2
Definition 6.3
From Theorem 5.16 together with a trivial modification of Lemma 5.4 we then obtain:
Theorem 6.4
6.1 Recovering classical embeddings
For subsequent use, it will be helpful to write, for \((f,s) \in S\), \(\bar{f} = \sup _{r \le s} f(r)\), \(\underline{f} = \inf _{r \le s} f(r)\) and \(f^* = \sup _{r \le s} |f(r)|\).
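On a discretized path these three functionals are straightforward to compute. The sketch below reads \(f^*\) as the running maximum of \(|f|\); the function name `running_extrema` and the sample path are illustrative assumptions.

```python
def running_extrema(path):
    """Return (f_bar, f_under, f_star) for a discretized stopped path.

    `path` lists the values f(r) on a grid of times r <= s, starting from f(0) = 0."""
    f_bar = max(path)                      # running maximum at time s
    f_under = min(path)                    # running minimum at time s
    f_star = max(abs(x) for x in path)     # running maximum of |f| at time s
    return f_bar, f_under, f_star

path = [0.0, 0.4, -0.3, -1.2, 0.7, 0.1]
f_bar, f_under, f_star = running_extrema(path)
```

Since every path starts at 0, one always has \(\underline{f}\le 0\le \bar{f}\) and \(f^* = \max (\bar{f}, -\underline{f})\).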
Theorem 6.5
Proof
Theorem 6.6
Proof
Remark 6.7
We observe that both the results hold for one-dimensional Brownian motion with an arbitrary starting distribution \(\lambda \) satisfying the usual convex ordering condition.
Theorem 6.8
Proof
Fix a bounded and strictly increasing continuous function \({\tilde{\varphi }}:\mathbb {R}^2_+\rightarrow \mathbb {R}\) and consider the continuous functions \(\gamma ((f,s)) = \varphi (\bar{f},\underline{f})\) and \({\tilde{\gamma }}((f,s)) = (f(s))^2 {\tilde{\varphi }}(\bar{f},\underline{f})\). Then \((\hbox {OptSEP}_2)\) is well posed and by Theorem 6.1 there exists a minimizer \(\tau _{P}\). By Theorem 6.4, pick a \({\tilde{\gamma }}|\gamma \)-monotone set \(\Gamma \subseteq S\) supporting \(\tau _{P}\). Note that we may assume that \(\Gamma \) only contains points such that \(\underline{g}< 0 < \bar{g}\), since \(\mu (\{0\}) = 0\).
In addition, consider a path \((g,t) \in S\) such that \(\underline{g}< g(t) < \bar{g}\). Then there exists \((f,s) \in S\) such that \(f(r) = g(r)\) for \(r \le s\), and such that \(f(s) = g(t)\), and exactly one of \(\bar{f} = \bar{g}\), or \(\underline{f} = \underline{g}\). This is true since there must exist a last time that \(g(r) = x\) before setting the most recent extremum. In particular, \(((f,s),(g,t)) \in {\mathsf {SG}_2}\). It follows that \(\Gamma \cap \{(g,t): \underline{g}< g(t) < \bar{g}\} = \emptyset \), that is, any stopped path must stop at a minimum or a maximum.
Theorem 6.9
Proof
Our primary objective will be to minimize \(\gamma ((f,s)) = \varphi (\bar{f},\underline{f})\), which is an lsc function on S. We again introduce a secondary minimization problem: specifically, we consider the function \({\tilde{\gamma }}((f,s)) = (f(s))^2 {\tilde{\varphi }}(\bar{f},\underline{f})\) for some bounded, continuous and strictly increasing function \({\tilde{\varphi }}:\mathbb {R}_+^2\rightarrow \mathbb {R}\). Then \((\hbox {OptSEP}_2)\) is well posed and by Theorem 6.1 there exists a minimizer \(\tau _{xr}\). By Theorem 6.4, pick a \({\tilde{\gamma }}|\gamma \)-monotone set \(\Gamma \subseteq S\) supporting \(\tau _{xr}.\)
By a similar argument to that given in the proof of Theorem 6.5 we can show \({\mathsf {SG}_2}\supseteq \{((f,s),(g,t))\in S\times S: f(s)=g(t), (\bar{f},  \underline{f})> (\bar{g}, \underline{g})\}\).
Remark 6.10
We observe that, in the case of Theorem 6.9, the characterization provided would not appear to be sufficient to identify the functions \(\alpha _+, \alpha _-\) given the measure \(\mu \). This is in contrast to the constructions of Azéma–Yor, Perkins and Jacka, where knowledge of the form of the embedding is sufficient to identify the corresponding stopping rule.
On a more abstract level, uniqueness of barrier type embeddings in a two dimensional phase space can be seen as a consequence of Loynes’ argument [39]. More precisely, let \(A_t\) be some continuous process and suppose that \(\tau _1\) and \( \tau _2\) denote the times when \((A_t, B_t)\) hits a closed barrier type set \(R_1\) resp. \(R_2\). If \(\mathbb {E}[ \tau _1], \mathbb {E}[\tau _2] < \infty \) and both stopping times embed the same measure, the argument presented in Remark 2.3 shows that \(\tau _1=\tau _2\).
Remark 6.11
6.2 The Vallois embedding and optimizing functions of local time
We say that a \(\mathcal {G}\)-adapted process \(\mathfrak {L}^x\) is a local time in x if it is a (right-continuous, increasing) compensator of \(|B-x|\) and we suppress x in the case of local time at 0. This determines \(\mathfrak {L}^x\) up to indistinguishability (and clearly the choice of \(\mathfrak {L}^x\) is irrelevant for (6.7)).
For us it is convenient to allow local time to assume the value \(+\infty \) on an evanescent set. Using this convention, Theorem 4.1 implies that there exists a Borel function \(L^x:S\rightarrow [0,\infty ]\) such that \(L^x\circ r \) is a (right-continuous, increasing) \(\mathcal {F}^0\)-predictable local time on Wiener space. We will call such a process \(L^x\) a raw local time in x. We note that the value \(+\infty \) cannot be avoided here, see [42].
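The description of local time as the compensator of \(|B-x|\) has an exact counterpart for the simple random walk, which may serve as a sanity check. In the sketch below (a discrete analogue chosen for illustration, with hypothetical function names) the compensator of \(|S_n|\) is the number of visits to 0 strictly before time n, and \(|S_n|\) decomposes pathwise into a martingale transform plus this predictable increasing process.

```python
def sign(x):
    """Sign function with sign(0) = 0."""
    return (x > 0) - (x < 0)

def visits_to_zero(walk):
    """L_n = #{k < n : S_k = 0}, the discrete local time at 0 (predictable and increasing)."""
    counts, total = [0], 0
    for position in walk[:-1]:
        total += (position == 0)
        counts.append(total)
    return counts

def tanaka_martingale(walk):
    """M_n = sum_{k<n} sign(S_k) (S_{k+1} - S_k), a martingale transform of the walk."""
    values, total = [0], 0
    for prev, nxt in zip(walk, walk[1:]):
        total += sign(prev) * (nxt - prev)
        values.append(total)
    return values

walk = [0, 1, 0, -1, -2, -1, 0, 1, 2, 1, 0, -1]   # a simple-random-walk path with +-1 steps
L = visits_to_zero(walk)
M = tanaka_martingale(walk)
decomposition_holds = all(abs(s) == m + l for s, m, l in zip(walk, M, L))
```

The identity \(|S_n| = M_n + L_n\) holds along every path with steps \(\pm 1\): away from 0 the increment of \(|S|\) is the martingale increment, while at 0 the absolute value increases by exactly one.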
Lemma 6.12
Proof
It follows that \({\mathbb {W}}({{\mathrm{deb}}}(V))=0\), hence we may pick a Borel set \(A\subseteq {{\mathrm{deb}}}(V)^c\) with \({\mathbb {W}}(A)=1\) such that (6.8) holds. \(\square \)
Our next goal is to verify that (6.7) admits an optimizer.
Proposition 6.13
Let \(h:[0,\infty ) \rightarrow \mathbb {R}\) be continuous and bounded. Then there exists an optimizer for (6.7). Moreover, if \({\tilde{\gamma }}(f,s) = e^{L(f,s)}f^2(s)\) or \({\tilde{\gamma }}(f,s) = e^{L(f,s)}f^2(s)\), the secondary minimization problem \(\mathrm{({OptSEP}_2)}\) also admits a solution.
Proof
The second assertion follows from a similar reasoning, using an approximation argument to handle the unboundedness of \({\tilde{\gamma }}\). \(\square \)
Note added in revision. Guo et al. [27] were able to relax the continuity assumption in our existence and duality results, Theorems 1.1 and 1.2. Based on the work of Jacod and Mémin [34] they establish these results under the assumption that \(t\mapsto \gamma \circ r(\omega , t)\) is lsc for every \(\omega \in {C_0(\mathbb {R}_+)}\). In particular their results would imply a more general version of Proposition 6.13.
We are now able to show:
Theorem 6.14
 (1) There exists a stopping time \(\tau _{V-}\) which maximizes$$\begin{aligned} \mathbb {E}\left[ h\left( \mathfrak {L}_\tau \right) \right] \end{aligned}$$over the set of all solutions to (SEP), and which is of the form$$\begin{aligned}\tau _{V-} = \inf \left\{ t > 0: B_t \notin \left( \alpha _-\left( \mathfrak {L}_t\right) ,\alpha _+\left( \mathfrak {L}_t\right) \right) \right\} \text { a.s.,} \end{aligned}$$for some decreasing function \(\alpha _+\ge 0\) and increasing function \(\alpha _-\le 0\).
 (2) There exists a stopping time \(\tau _{V+}\) which minimizes$$\begin{aligned} \mathbb {E}\left[ h\left( \mathfrak {L}_\tau \right) \right] \end{aligned}$$over the set of all solutions to (SEP), and which is of the form$$\begin{aligned} \tau _{V+} = Z \wedge \inf \left\{ t > 0: B_t \notin \left( \alpha _-\left( \mathfrak {L}_t\right) ,\alpha _+\left( \mathfrak {L}_t\right) \right) \right\} , \text { a.s.} \end{aligned}$$for some increasing function \(\alpha _+\ge 0\), some decreasing function \(\alpha _-\le 0\), and a \(\{0,\infty \}\)-valued \(\mathcal {G}_0\)-measurable random variable Z.
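The form of the stopping rule in part (1) can be illustrated by a small simulation. The sketch below is only a discrete stand-in, under stated assumptions: Brownian motion is replaced by a simple symmetric random walk on a lattice of mesh `dx` (time step `dx**2`), and local time at 0 is replaced by `dx` times the number of visits to 0, which is the compensator of the absolute value of the walk. The boundary functions `alpha_plus` and `alpha_minus`, and all other names, are hypothetical choices for illustration; they are not the optimal boundaries for any particular \(\mu \).

```python
import random

def alpha_plus(l):
    # Hypothetical decreasing upper boundary with alpha_+ >= 0.
    return 1.0 / (1.0 + l)

def alpha_minus(l):
    # Hypothetical increasing lower boundary with alpha_- <= 0.
    return -1.0 / (1.0 + l)

def vallois_type_stop(a_minus, a_plus, dx=0.05, max_steps=1_000_000, seed=0):
    """Run a scaled simple random walk until it leaves (a_minus(L), a_plus(L)).

    L is the discrete local time at 0. Returns (time, position, local_time)
    at the step where the walk is stopped.
    """
    rng = random.Random(seed)
    k = 0        # walk position in lattice units; the position is k * dx
    visits = 0   # number of visits to 0 so far
    for n in range(1, max_steps + 1):
        if k == 0:
            visits += 1  # each visit to 0 adds dx to the discrete local time
        k += rng.choice((-1, 1))
        x, l = k * dx, visits * dx
        if not (a_minus(l) < x < a_plus(l)):
            return n * dx * dx, x, l
    raise RuntimeError("walk did not exit within max_steps")

t, x, l = vallois_type_stop(alpha_minus, alpha_plus)
```

Repeating the run over many seeds gives an empirical law of the stopped position; matching a prescribed \(\mu \) would require solving for the boundaries, which this sketch does not attempt.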
Proof
We consider the second case, under the additional assumption that \(0<\mu (\{0\}) <1\), the other cases being slightly simpler. As above, we let L be a raw local time and observe that \((\mathfrak {L}_t)_{t \ge 0}:= (L\circ r((B_t)_{t \ge 0},t))_{t \ge 0}\) is (indistinguishable from) the local time of \((B_t)_{t \ge 0}\) on \((\Omega , \mathcal {G}, (\mathcal {G}_t)_{t\ge 0}, \mathbb {P})\).
Applying Proposition 6.13 and Theorem 6.4 to the optimizations corresponding to \(\gamma (\omega ,t) = h(L\circ r(\omega ,t))\) and \({\tilde{\gamma }}(\omega ,t) = e^{L\circ r(\omega ,t)}\omega ^2_t\) we obtain a minimizer \(\tau _{V+}\) and a \({\tilde{\gamma }}|{\gamma }\)-monotone set \(\Gamma \subseteq S\) supporting \(\tau _{V+}\).
Since \((\Gamma ^{<} \times \Gamma ) \cap \mathsf {SG}_2 = \emptyset \) and \((0,0) \in \Gamma ^{<}\) (\(\Gamma \) contains a nontrivial element since \(\mu (\{0\}) < 1\)), we have \((l,0) \not \in \Gamma \) for any \(l \ge 0\). It follows that \(\mathbb {P}(\tau _{V+} = 0) = \mu (\{0\})\).
We now consider \(\tau _{V+}\) on \(\{\tau _{V+} > 0 \}\). Note that \(\{\tau>0\}= \{ \mathfrak {L}_\tau >0\}\) a.s., for any stopping time \(\tau \) and hence in particular \(\{\tau _{V+}> 0 \}= \{ \mathfrak {L}_{\tau _{V+}} >0\}\) a.s. Then on \(\{\tau _{V+} >0\}\), \(\tau _\textsc {cl}^* \le \tau _{V+} \le \tau _\textsc {op}^*\) a.s., and hence \(\mathbb {P}(\tau _{V+} \le \tau _{\textsc {op}}^*) = 1\). Define \(\alpha _+(l) = \inf \{ x>0: (l,x) \in {\mathcal {R}}_\textsc {op}\}\) and \(\alpha _-(l) = \sup \{ x<0: (l,x) \in {\mathcal {R}}_\textsc {op}\}\).
Remark 6.15
The arguments above extend from local time at 0 to a general continuous additive functional A. Recalling that \(\mathfrak {L}^x\) denotes local time in x, A can be represented in the form \(A_t:=\int _0^t \mathfrak {L}_s^x\, dm_A(x)\). Let f be a convex function such that \(f'' = m_A\) in the sense of distributions. If \(\int f\, d\mu < \infty \) then the above proof is easily adapted to the more general situation.
In this manner, we deduce the existence of optimal solutions to (SEP) for functions depending on A. By analogy with Theorem 6.14 this can be used to generate (inverse/cave) barrier type embeddings of various kinds. Other generalizations and variants may be considered in a similar manner. We leave specific examples as an exercise for the reader.
6.3 Root and Rost embeddings in higher dimensions
In the case \(d=2\) it follows from Falkner’s results [22] that the Skorokhod problem admits a solution (i.e. \(\mathsf {RST}(\mu )\ne \emptyset \)) if (6.10) is satisfied for \(u(x,y)=-\ln |x-y|\) and then (6.11) applies.
In either case, assuming that we do have a solution satisfying (6.11), then the existence result as well as the monotonicity principle carry over to the present setup (with identical proofs) and we are able to state the following:
Theorem 6.16
Suppose \(\mathsf {RST}(\mu )\) is nonempty. If h is a strictly convex function and \(\hat{\tau } \in \mathsf {RST}(\mu )\) minimizes \(\mathbb {E}[h(\tau )]\) over \(\tau \in \mathsf {RST}(\mu )\) then there exists a barrier \({\mathcal {R}}\) such that \(\hat{\tau } = \inf \{ t > 0 : (B_t,t) \in {\mathcal {R}}\}\) on \(\{{\hat{\tau }} >0\}\) a.s.
The proof of this result is much the same as that of Theorem 2.1, except we no longer show that \(\tau _\textsc {cl}= \tau _\textsc {op}\). In higher dimensions with general initial laws, it is easy to construct examples where there are common atoms of \(\lambda \) and \(\mu \), but where the size of the atom in \(\lambda \) is strictly larger than the atom of \(\mu \). By the transience of the process, it is clear that the optimal (indeed, only) behaviour is to stop mass starting at such a point immediately with a probability strictly between 0 and 1, however the stopping times \(\tau _\textsc {cl}\) and \(\tau _\textsc {op}\) will always stop either all the mass, or none of this mass respectively. For this reason, we do not say anything about the behaviour of \({\hat{\tau }}\) when \({\hat{\tau }} = 0\). Trivially, the above result tells us that the solution of the optimal embedding problem is given by a barrier if there exists a set D such that \(\lambda (D) = 1 = \mu (D^c)\).
Proof of Theorem 6.16
We now consider the generalization of the Rost embedding. Recall that \((\min (\lambda , \mu ))(A) := \inf _{B \subseteq A} \left( \lambda (B)+ \mu (A{\setminus } B)\right) \) defines a measure.
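For measures with finitely many atoms, the infimum defining \(\min (\lambda , \mu )\) can be evaluated by brute force and compared with the atom-wise minimum of the two masses. The following sketch does this for a small example; the dictionaries `lam` and `mu` and the helper `min_measure` are hypothetical names introduced only for illustration.

```python
from itertools import combinations

def min_measure(lam, mu, A):
    """Evaluate the infimum over subsets B of A of lam(B) + mu(A minus B).

    lam and mu are dicts mapping atoms to masses; A is a finite set of atoms.
    """
    A = list(A)
    best = float("inf")
    for r in range(len(A) + 1):
        for B in combinations(A, r):
            rest = [x for x in A if x not in B]
            value = sum(lam.get(x, 0.0) for x in B) + sum(mu.get(x, 0.0) for x in rest)
            best = min(best, value)
    return best

lam = {0: 0.5, 1: 0.25, 2: 0.25}   # example starting law
mu = {0: 0.25, 1: 0.25, 3: 0.5}    # example target law
A = {0, 1, 2, 3}

# Atom-wise minimum of the masses, summed over A.
pointwise = sum(min(lam.get(x, 0.0), mu.get(x, 0.0)) for x in A)
```

In this example both computations give mass 0.5, carried by the common atoms 0 and 1; in the setting of Theorem 6.17 this is the mass that is stopped at time 0.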
Theorem 6.17
Suppose \(\lambda , \mu \) are measures in \(\mathbb {R}^d\) and \(\hat{\tau } \in \mathsf {RST}(\mu )\) maximizes \(\mathbb {E}[h(\tau )]\) over all stopping times in \(\mathsf {RST}(\mu )\), for a convex function \(h: \mathbb {R}_+ \rightarrow \mathbb {R}\), with \(\mathbb {E}[h(\tau )]<\infty \). Then \(\mathbb {P}(\hat{\tau }=0, B_0 \in A) = (\min (\lambda , \mu ))(A)\), for \(A \in {\mathcal {B}}(\mathbb {R}^d)\), and on \(\{\hat{\tau }>0\}\), \(\hat{\tau }\) is the first hitting time of an inverse barrier.
Proof
It follows from an identical argument to that in the proof of Theorem 2.4 that \(\tau _\textsc {cl}^{0,\delta } \le \hat{\tau } \le \tau _\textsc {op}^{0,\delta }\) on \(\{\hat{\tau } \ge \delta \}\). However, by similar arguments to those used above, we deduce that \(\tau _\textsc {op}^{0,\delta }\) and \(\tau _\textsc {cl}^{0,\delta }\) have the same law on \(\{\hat{\tau } \ge \delta \}\), and hence that \(\hat{\tau } = \tau _\textsc {op}^{0,\delta }\) on this set, and then by taking \(\delta \rightarrow 0\), we get \(\hat{\tau } = \tau _\textsc {op}\) on \(\{\hat{\tau }>0\}\).
To see the final claim, we note that trivially \(\mathbb {P}(\hat{\tau }=0, B_0 \in A) \le (\min (\lambda , \mu ))(A)\). If there is strict inequality, then there exist some paths in \(\Gamma \) which start at \(x \in A\), and paths in \(\Gamma \) which stop at x at strictly positive time, constituting a stop-go pair and therefore violating the monotonicity principle. \(\square \)
Remark 6.18
We observe that the arguments of Remark 2.3 can be applied again in this context. However, one needs to be a little more careful, since it is necessary to take the fine closure of the barriers with respect to the fine topology for the processes \((t,B_t)_{t\ge 0}\). With this modification in place, the argument of Loynes can be easily adapted to show that the (finely closed versions) of the barriers in Theorems 6.16 and 6.17 are unique in the sense of Remark 2.3.
6.4 An optimal Skorokhod embedding problem which admits only randomized solutions
By analogy with optimal transport, we might interpret a ‘natural stopping time’ (i.e. a stopping time with respect to the Brownian filtration) which solves (OptSEP) as a Monge-type solution whereas stopping times which depend on additional randomization are of Kantorovich-type. With the exception of the Rost solution, all optimal stopping times encountered in the previous section are natural stopping times, and in the Rost case external randomization is only needed at time 0. One might ask whether the optimal Skorokhod embedding problem always admits a solution \(\tau \) which is natural on \(\{\tau >0\}\). We sketch an example, showing that this is not the case:
Example 6.19
There exist an absolutely continuous probability \(\mu \) and a continuous adapted process \(\gamma _t=\gamma ((B_s)_{s\le t})\) with values in [0, 1] such that (OptSEP) admits only randomized solutions.
Proof
 (1) \(\tau =\sigma \) with probability 1/2,
 (2) otherwise \(\tau \) stops the first time the Brownian path reaches the level \(\pm l((B_s)_{s\le \sigma })\).
Write \({\hat{\tau }}\) for the randomized stopping time in \(\mathsf {RST}(\mu )\) corresponding to \( \tau \). It is then straightforward to see that \({\hat{\tau }} \) is the unique solution of (OptSEP). Thus, the optimal Skorokhod embedding problem admits no (non-randomized) solution in the natural filtration of B. \(\square \)
In optimal transport it is a difficult and interesting problem to understand under which conditions transport problems admit solutions of Monge-type. An interesting subject for future research would be to understand when Monge-type solutions exist for the optimal Skorokhod embedding problem.
7 Skorokhod embedding for Feller processes
Most of the arguments required to establish our main results are abstract and carry over to the present setup. In fact, only the parts building on the condition \(\mathbb {E}[\tau ]<\infty \) need to be adjusted to account for the more general condition of \(\tau \) being minimal. Therefore, to establish Theorems 1.1, 1.2, and 1.3 in the general Feller setup, we need the crucial Assumption 7.1 below which we verify in a number of natural examples in Sect. 7.2.
Assumption 7.1
 (1) that there exist continuous functions \(h:\mathbb {R}^n\rightarrow \mathbb {R}\) and \(\zeta :S\rightarrow \mathbb {R}\) such that:

\(\zeta _t:=\zeta ((Z_s)_{s\le t},t)\) is strictly increasing, \(\zeta _0=0\), \(\lim _{t\rightarrow \infty } \zeta _t=\infty \), \(\mathbb {P}\)-a.s., and

\(X_t := h(Z_t)-\zeta _t\) is a martingale and \((X^\tau _{ t})_{t\ge 0}\) is uniformly integrable for all \(\tau \) solving \((\hbox {SEP}^\mathrm{Z})\), or

 (2) that whenever \(\tau \) is a finite stopping time satisfying \(Z_\tau \sim \mu \) then \(\tau \) is minimal and there is an increasing function \(G:\mathbb {R}_+\rightarrow \mathbb {R}\), \(\lim _{t\rightarrow \infty }G(t) =\infty \), which satisfies$$\begin{aligned} \sup \{\mathbb {E}[G(\tau )]: \tau \text{ solves } (\hbox {SEP}^\mathrm{Z})\} =: V <\infty . \end{aligned}$$ (7.1)
Theorem 7.2
If \(\gamma :S\rightarrow \mathbb {R}\) is lsc and bounded from below, \(\mathrm{({OptSEP}^{Z})}\) admits a minimizer.
Theorem 7.3
Theorem 7.4
Remark 7.5
 (1) Of course, the analogues of the secondary optimization results, Theorems 6.1 (on existence of a minimizer) and 6.4 (monotonicity principle), carry over to the present setup with the obvious changes.
 (2) The continuity of \(\zeta \) on S which was imposed in Assumption 7.1 (1) is not required in Theorems 7.2 and 7.4.
 (3) The condition \(0< \mathbb {E}[\sigma ] < \infty \) in Definition 1.4 should be replaced by considering all stopping times with \(0< \mathbb {E}[\zeta ((B_s)_{s \le \sigma },\sigma )]<\infty \) in case (1) of Assumption 7.1, or \(0<\mathbb {E}[G(\tau )] < \infty \) in case (2). In addition, the expectation should be taken over the law of the Feller process started at \(f(s) = g(t)\).
7.1 Sketch of proofs
As in Sect. 3 we consider the canonical setup \((C(\mathbb {R}_+,\mathbb {R}^d),\mathcal {F}^0,\mathbb {Q})\) (where \(\mathbb {Q}\) denotes the law of the Feller process) and we write Y for the canonical process. It follows from continuity of Y (resp. Z) and the Feller property that the \(\mathcal {F}^a\)-optional and the \(\mathcal {F}^a\)-predictable \(\sigma \)-algebra on the canonical space agree; similarly Proposition 3.5 on the definition of S-continuous martingales extends to the present context. We define \(\mathsf {RST},\mathsf {JOIN}\) and related notions as before with \(\mathbb {Q}\) replacing \({\mathbb {W}}\). We say that \(\xi \in \mathsf {RST}\) is a minimal embedding of \(\mu \) if the corresponding stopping time \(\rho \) (cf. (3.6)) on the enlarged probability space \((C(\mathbb {R}_+,\mathbb {R}^d)\times [0,1],{\bar{\mathbb {Q}}})\) constitutes a minimal embedding. (Representing randomized stopping times as in Theorem 3.8 (1), the stopping time \(\xi \) constitutes a minimal embedding iff there is no randomized stopping time \(\xi '\ne \xi \) embedding the same measure which satisfies \(A^{\xi '}\ge A^\xi \).) For \(\mu \in {\mathcal {P}}(\mathbb {R}^d)\) we define \(\mathsf {RST}(\mu )\) to be the set of all minimal randomized stopping times embedding the measure \(\mu \).
Recalling the argument from Theorem 3.14, we see that the existence of a function \(\zeta : S \rightarrow \mathbb {R}\) such that \(\zeta \circ r\) increases to \(\infty \) and \(\sup _{\xi \in \mathsf {RST}(\mu )} \xi (\zeta \circ r)<\infty \) implies that \(\mathsf {RST}(\mu )\) is compact. (Vice versa, if \(\mathsf {RST}(\mu )\) is compact then such a function exists and can be chosen so that \(\zeta \circ r\) is deterministic). Hence, by (7.3) resp. (7.1), \(\mathsf {RST}(\mu )\) is compact.
Proof of Theorem 7.2
The argument follows the proof of Theorem 4.1 line by line. \(\square \)
Proof of Theorem 7.3
We give the argument in the case \(\lambda =\delta _0\) for ease of exposition. Setting \(h= \zeta \circ r\) resp. \(h= G\circ T\) (and using identical arguments as previously) we obtain the following extension of Proposition 4.3:
Proof of Theorem 7.4
Apart from the abstract theory, the ingredients of the proof of Theorem 5.7 are Propositions 5.8 and 5.9. The only stage where the proof of Proposition 5.8 has to be altered is when establishing that the randomized stopping time \(\xi ^\pi \) is minimal. Under Assumption 7.1 (1) this follows using the minimality characterization given in (7.3); under Assumption 7.1 (2) it is trivial.
Proposition 5.9 only uses transport duality, the Feller property to construct S-continuous martingales, and Choquet’s capacitability theorem. \(\square \)
7.2 Examples
We now provide a list of examples in which Assumption 7.1 is satisfied and Theorems 7.2, 7.3, and 7.4 apply.
7.2.1 Let Z be a one-dimensional Brownian motion and assume that \(\lambda \) and \(\mu \) have first moments and are in convex order: then Assumption 7.1 (1) holds
Proof
7.2.2 One-dimensional regular diffusions
 (i) Suppose \(s(I^\circ ) = (a,b)\) for \(a, b \in \mathbb {R}\). Then it follows from [10, Theorems 17 and 22] that a solution to \((\hbox {SEP}^\mathrm{Z})\) exists if and only if \(s(\lambda )\) precedes \(s(\mu )\) in convex order, and in fact, any finite \(\tau \) with \(Z_\tau \sim \mu \) is minimal. Moreover we note that

\(\{\tau ': B_{\tau '}\sim s(\mu ), \tau ' \text{ is } \text{ minimal }\} \) is bounded in probability;

\(A_t<\infty \) provided the path \((B_s)_{s\le t}\) stays inside an interval \([c,d]\subseteq (a,b)\);

given \(\varepsilon >0\) there exists an interval \([c,d]\subseteq (a,b)\) such that \((B_s)_{s\le \tau '}\) stays inside [c, d] with probability \(> 1-\varepsilon \) for each minimal \(\tau '\) with \(B_{\tau '}\sim s(\mu )\).

It follows that \( \{A_{\tau '}: B_{\tau '}\sim s(\mu ), \tau ' \text{ is } \text{ minimal }\}\) is bounded in probability, hence (7.2) and then Assumption 7.1 (2) holds.

 (ii) Suppose \(s(I^\circ ) = (a,\infty )\) for \(a \in \mathbb {R}\), and that \(s(\lambda )\) and \(s(\mu )\) are in convex order and that the moments \(m_\lambda = \int s(y) \,\lambda (dy)\), \(m_\mu = \int s(y)\, \mu (dy)\) exist. Then it follows from Theorems 17 and 22 and the discussion at the top of p. 245 of [10] that a solution to \((\hbox {SEP}^\mathrm{Z})\) exists if and only if for all \(x \ge a\),$$\begin{aligned} \int |s(y)-x|\,\mu (dy) \le \int |s(y) - x|\, \lambda (dy) + (m_\lambda -m_\mu ). \end{aligned}$$ (7.7) Again, any finite \(\tau \) with \(Z_\tau \sim \mu \) is minimal and (7.2) follows as above.
An analogous result holds if \(s(I^\circ ) = (-\infty ,b)\) for \(b \in \mathbb {R}\).
 (iii) Suppose \(s(I^\circ ) = (-\infty , \infty )\) and that \(s(\lambda ), s(\mu )\) are in convex order, \( \int s(y)^2 \,\mu (dy) < \infty \). Then we are in the classical case, and a stopping time \(\tau \) with \(Z_\tau \sim \mu \) is minimal if and only if \(\mathbb {E}[A^{-1}_\tau ] < \infty \). If the process Z is sufficiently well-behaved (as in the examples below) one can show that \(X_t = s(Z_t)^2 - A_t^{-1}\) is a martingale and that \(A^{-1}\) depends continuously on the path \((Z_s)_{s\le t}\). For all \(\tau \) solving \((\hbox {SEP}^\mathrm{Z})\), \(\mathbb {E}[A^{-1}_\tau ] < \infty \); hence \((X^\tau _t)_{t\ge 0}\) is uniformly integrable and Assumption 7.1 (1) is satisfied.
More generally, when only the integrals \(\int s(y) \,\lambda (dy),\int s(y) \,\mu (dy)\) are finite, (assuming sufficient regularity of the diffusion), Assumption 7.1 (1) follows as in Sect. 7.2.1.
Remark 7.6
Observe that none of the constructions described in Sects. 6.1 and 6.2 rely on fine properties of Brownian motion: the main properties used are the continuity of paths, the strong Markov property, and the regularity and diffusive nature of paths (that the process started at x immediately returns to x, and immediately enters the sets \((x,\infty )\) and \((-\infty ,x)\)). It follows that all the given constructions extend to the case of regular diffusions described above.
Example 7.7
(Brownian motion with drift) Let \(Z_t=B_t+at\) for some \(a<0\) with \(Z_0\sim \lambda \), and \(I=(-\infty ,\infty )\). Then a possible choice of the scale function is \(s(x) = \exp (-2 a x)\). Let \(\lambda , \mu \in {\mathcal {P}}(\mathbb {R})\) be such that \(s(\lambda ), s(\mu )\) are integrable and satisfy (7.7). Then Assumption 7.1 holds by (ii) above.
Example 7.8
(Geometric Brownian motion) Let Z be a geometric Brownian motion, given through the SDE \(dZ_t= Z_t\,dB_t\), \(Z_0\sim \lambda \). A possible choice of scale function is \(s(x)=x\). Let \(\lambda ,\mu \in {\mathcal {P}}(0,\infty )\) be such that \(s(\lambda ),s(\mu )\) are integrable and satisfy the corresponding version of (7.7). Then Assumption 7.1 holds by (ii) above. (More general versions of geometric Brownian motion can be treated similarly.)
Example 7.9
(Three-dimensional Bessel process) Let \(Z=|B|\) for a three-dimensional Brownian motion \((B_t)_{t\ge 0}\) with \(Z_0\sim \lambda \). A possible choice of scale function is \(s(x)=1-1/x\), and \(s(I^\circ )=(-\infty , 1)\). Let \(\lambda ,\mu \in {\mathcal {P}}(0,\infty )\) be such that \(s(\lambda ),s(\mu )\) are integrable and satisfy the corresponding version of (7.7). Then Assumption 7.1 holds by (ii) above. Similar results hold for d-dimensional Bessel processes, with \(d > 2\).
Example 7.10
(Ornstein–Uhlenbeck process) Let Z be an Ornstein–Uhlenbeck process, given for example as the solution to the SDE \(dZ_t = -Z_t \, dt + dW_t, Z_0\sim \lambda \). Then \(Z_t\) is a regular diffusion on \(I=(-\infty ,\infty )\) with scale function given (up to constants) by \(s'(x) = \exp (x^2)\), and \(s(I^\circ ) = (-\infty ,\infty )\). Suppose \(\lambda , \mu \) are measures on \(\mathbb {R}\) such that \(s(\lambda ), s(\mu )\) are in convex order and \( \int s(y)^2 \,\mu (dy) < \infty \). Then \(A_t^{-1} = \int _0^t \exp \{2Z_s^2\} \, ds\) is continuous as a function of \((Z_s)_{s \le t}\), and hence Assumption 7.1 holds by (iii) above.
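The scale functions in Examples 7.7–7.10 can be sanity-checked numerically. For a diffusion \(dZ_t = b(Z_t)\,dt + \sigma (Z_t)\,dB_t\), a scale function s satisfies \(\tfrac{1}{2}\sigma ^2 s'' + b\, s' = 0\). The sketch below evaluates this residual with a central finite difference, taking \(s(x)=\exp (-2ax)\) for drift a, \(s(x)=x\) for geometric Brownian motion, \(s(x)=1-1/x\) for the three-dimensional Bessel process (drift 1/x), and \(s'(x)=\exp (x^2)\) for the Ornstein–Uhlenbeck process with drift \(-x\). The value `a = -0.5`, the evaluation grid, and all function names are assumptions made for illustration.

```python
import math

def scale_residual(s_prime, drift, sigma, x, h=1e-4):
    """Residual of (1/2) sigma(x)^2 s''(x) + drift(x) s'(x) at the point x.

    s'' is approximated by a central difference of s_prime.
    """
    s_second = (s_prime(x + h) - s_prime(x - h)) / (2.0 * h)
    return 0.5 * sigma(x) ** 2 * s_second + drift(x) * s_prime(x)

a = -0.5  # assumed drift parameter (a < 0) for Brownian motion with drift

examples = {
    # name: (derivative of the scale function, drift b, volatility sigma)
    "bm_with_drift": (lambda x: -2 * a * math.exp(-2 * a * x), lambda x: a, lambda x: 1.0),
    "geometric_bm": (lambda x: 1.0, lambda x: 0.0, lambda x: x),
    "bessel_3": (lambda x: 1.0 / x**2, lambda x: 1.0 / x, lambda x: 1.0),
    "ornstein_uhlenbeck": (lambda x: math.exp(x * x), lambda x: -x, lambda x: 1.0),
}

grid = [0.5, 0.75, 1.0, 1.25, 1.5]  # evaluation points inside (0, infinity)
worst = {
    name: max(abs(scale_residual(sp, b, sig, x)) for x in grid)
    for name, (sp, b, sig) in examples.items()
}
```

Each entry of `worst` should be small (of the order of the finite-difference error), whereas a sign error in the exponent of the scale function for Brownian motion with drift produces a residual of order one.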
7.2.3 The Hoeffding–Frechet coupling as a very particular Root solution
Let Z be the deterministic process given by \(dZ_t= dt\) started in \(Z_0 \sim \lambda \). Z is not a regular diffusion; however, Assumption 7.1 (2) is easily checked. Let \(\mu \) be another probability and assume for simplicity that \(\max {{\mathrm{supp}}}\,\lambda \le \min {{\mathrm{supp}}}\,\mu \). Then the Root solution minimizes \(\mathbb {E}[\tau ^2]\). But note also that since \(\tau = Z_\tau - Z_0\), this minimization problem corresponds precisely to finding the joint distribution \((Z_0, Z_\tau )\) which minimizes \(\mathbb {E}[(Z_\tau - Z_0)^2]\): the classical transport problem in the most simple setup. Specifically, the Root solution for the particular case of the process Z corresponds precisely to the monotone (Hoeffding–Frechet) coupling. In the same fashion the Rost solution corresponds to the comonotone coupling between \(\lambda \) and \(\mu \).
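The correspondence with the monotone coupling can be checked on a toy example. The sketch below assumes that \(\lambda \) and \(\mu \) are empirical measures with the same number of equally weighted atoms, so that it suffices to compare pairings given by permutations; it verifies by brute force that pairing sorted starting points with sorted targets minimizes the empirical version of \(\mathbb {E}[(Z_\tau - Z_0)^2]\). The atom locations and all names are illustrative assumptions.

```python
from itertools import permutations

def quadratic_cost(starts, targets):
    # Empirical version of E[(Z_tau - Z_0)^2] for the pairing starts[i] -> targets[i].
    return sum((y - x) ** 2 for x, y in zip(starts, targets)) / len(starts)

starts = [0.0, 0.5, 1.0, 1.5]    # atoms of lambda, equal weights
targets = [2.0, 4.0, 2.5, 3.0]   # atoms of mu; max supp lambda <= min supp mu

# Monotone coupling: the i-th smallest start is sent to the i-th smallest target.
monotone = quadratic_cost(sorted(starts), sorted(targets))

# Brute force over all pairings of the sorted starts with the targets.
all_costs = [quadratic_cost(sorted(starts), perm) for perm in permutations(targets)]
```

Here each travel time \(\tau = Z_\tau - Z_0\) is nonnegative because every target lies to the right of every starting point, matching the assumption on the supports made above.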
Notes
Acknowledgements
The authors thank Julio Backhoff, Walter Schachermayer, and Nizar Touzi for useful discussions. We are particularly indebted to the anonymous referees and Manu Eder for many helpful suggestions which had a significant impact on the final version of this article.
References
 1. Acciaio, B., Beiglböck, M., Penkner, F., Schachermayer, W., Temme, J.: A trajectorial interpretation of Doob’s martingale inequalities. Ann. Appl. Probab. 23(4), 1494–1505 (2013)
 2. Adams, D.R., Hedberg, L.I.: Function Spaces and Potential Theory, volume 314 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Berlin (1996)
 3. Ambrosio, L., Pratelli, A.: Existence and stability results in the \(L^1\) theory of optimal transportation. In: Caffarelli, L.A., Salsa, S. (eds.) Optimal Transportation and Applications (Martina Franca, 2001), volume 1813 of Lecture Notes in Math., pp. 123–160. Springer, Berlin (2003)
 4. Azéma, J., Yor, M.: Une solution simple au problème de Skorokhod. In: Séminaire de Probabilités, XIII (Univ. Strasbourg, Strasbourg, 1977/78), volume 721 of Lecture Notes in Math., pp. 90–115. Springer, Berlin (1979)
 5. Baxter, J.R., Chacon, R.V.: Compactness of stopping times. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 40(3), 169–181 (1977)
 6. Beiglböck, M., Goldstern, M., Maresch, G., Schachermayer, W.: Optimal and better transport plans. J. Funct. Anal. 256(6), 1907–1927 (2009)
 7. Beiglböck, M., Henry-Labordère, P., Penkner, F.: Model-independent bounds for option prices: a mass transport approach. Finance Stoch. 17(3), 477–501 (2013)
 8. Bianchini, S., Caravenna, L.: On optimality of \(c\)-cyclically monotone transference plans. C. R. Math. Acad. Sci. Paris 348(11–12), 613–618 (2010)
 9. Bouchard, B., Nutz, M.: Arbitrage and duality in nondominated discrete-time models. Ann. Appl. Probab. 25(2), 823–859 (2015)
 10. Cox, A.M.G.: Extending Chacon–Walsh: minimality and generalised starting distributions. In: Donati-Martin, C., Émery, M., Rouault, A., Stricker, C. (eds.) Séminaire de probabilités XLI, volume 1934 of Lecture Notes in Math., pp. 233–264. Springer, Berlin (2008)
 11. Cox, A.M.G., Hobson, D.: Skorokhod embeddings, minimality and non-centred target distributions. Probab. Theory Relat. Fields 135, 395–414 (2005)
 12. Cox, A.M.G., Obłój, J.: Robust hedging of double touch barrier options. SIAM J. Financ. Math. 2, 141–182 (2011)
 13. Cox, A.M.G., Wang, J.: Root’s barrier: construction, optimality and applications to variance options. Ann. Appl. Probab. 23(3), 859–894 (2013)
 14. Cox, A.M.G., Peskir, G.: Embedding laws in diffusions by functions of time. Ann. Probab. 43(5), 2481–2510 (2015)
 15. Delbaen, F., Schachermayer, W.: A general version of the fundamental theorem of asset pricing. Math. Ann. 300(3), 463–520 (1994)
 16. Dellacherie, C., Meyer, P.-A.: Probabilities and Potential, volume 29 of North-Holland Mathematics Studies. North-Holland Publishing Co., Amsterdam (1978)
 17. Dellacherie, C., Meyer, P.-A.: Probabilities and Potential. B, volume 72 of North-Holland Mathematics Studies. North-Holland Publishing Co., Amsterdam (1982). Theory of martingales, Translated from the French by J. P. Wilson
 18. Doléans, C.: Existence du processus croissant naturel associé à un potentiel de la classe (d). Probab. Theory Relat. Fields 9(4), 309–314 (1968)
 19. Dolinsky, Y., Soner, H.M.: Martingale optimal transport and robust hedging in continuous time. Probab. Theory Relat. Fields 160(1–2), 391–427 (2014)
 20. Dolinsky, Y., Soner, H.M.: Martingale optimal transport in the Skorokhod space. Stoch. Process. Appl. 125(10), 3893–3931 (2015)
 21. Dolinsky, Y., Soner, M.H.: Robust hedging with proportional transaction costs. Finance Stoch. 18(2), 327–347 (2014)
 22. Falkner, N.: The distribution of Brownian motion in \({R}^{n}\) at a natural stopping time. Adv. Math. 40(2), 97–127 (1981)
 23. Galichon, A., Henry-Labordère, P., Touzi, N.: A stochastic control approach to no-arbitrage bounds given marginals, with an application to lookback options. Ann. Appl. Probab. 24(1), 312–336 (2014)
 24. Gangbo, W., McCann, R.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)
 25. Gassiat, P., Mijatović, A., Oberhauser, H.: An integral equation for Root’s barrier and the generation of Brownian increments. Ann. Appl. Probab. 25(4), 2039–2065 (2015)
 26. Gassiat, P., Oberhauser, H., dos Reis, G.: Root’s barrier, viscosity solutions of obstacle problems and reflected FBSDEs. Stoch. Process. Appl. 125(12), 4601–4631 (2015)
 27. Guo, G., Tan, X., Touzi, N.: Optimal Skorokhod embedding under finitely-many marginal constraints. ArXiv e-prints (2015)
 28. Henry-Labordère, P., Obłój, J., Spoida, P., Touzi, N.: The maximum maximum of a martingale with given \(n\) marginals. Ann. Appl. Probab. 26(1), 1–44 (2016)
 29. Hirsch, F., Profeta, C., Roynette, B., Yor, M.: Peacocks and associated martingales, with explicit constructions, volume 3 of Bocconi & Springer Series. Springer, Milan; Bocconi University Press, Milan (2011)
 30. Hobson, D.: Robust hedging of the lookback option. Finance Stoch. 2, 329–347 (1998)
 31. Hobson, D.: The Skorokhod embedding problem and model-independent bounds for option prices. In: Carmona, R., Çınlar, E., Ekeland, I., Jouini, E., Scheinkman, J.A., Touzi, N. (eds.) Paris-Princeton Lectures on Mathematical Finance 2010, volume 2003 of Lecture Notes in Math., pp. 267–318. Springer, Berlin (2011)
 32. Hobson, D., Neuberger, A.: Robust bounds for forward start options. Math. Finance 22(1), 31–56 (2012)
 33. Jacka, S.: Doob’s inequalities revisited: a maximal \(H^1\)-embedding. Stoch. Process. Appl. 29(2), 281–290 (1988)
 34. Jacod, J., Mémin, J.: Sur un type de convergence intermédiaire entre la convergence en loi et la convergence en probabilité. In: Azéma, J., Yor, M. (eds.) Séminaire de Probabilités XV. 1979/80 (Univ. Strasbourg, Strasbourg, 1979/1980) (French), volume 850 of Lecture Notes in Math., pp. 529–546. Springer, Berlin-New York (1981)
 35. Kechris, A.S.: Classical Descriptive Set Theory, volume 156 of Graduate Texts in Mathematics. Springer, New York (1995)
 36. Kiefer, J.: Skorohod embedding of multivariate RV’s, and the sample DF. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 24(1), 1–35 (1972)
 37. Knott, M., Smith, C.S.: On the optimal mapping of distributions. J. Optim. Theory Appl. 43(1), 39–49 (1984)
 38. Last, G., Mörters, P., Thorisson, H.: Unbiased shifts of Brownian motion. Ann. Probab. 42(2), 431–463 (2014)
 39. Loynes, R.M.: Stopping times on Brownian motion: some properties of Root’s construction. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 16, 211–218 (1970)
 40. Meyer, P.-A.: Convergence faible et compacité des temps d’arrêt, d’après Baxter et Chacon. In: Dellacherie, C., Meyer, P.-A., Weil, M. (eds.) Séminaire de probabilités, volume XII of Lecture Notes in Mathematics, 649, pp. 411–423. Springer, Berlin (1978)
 41. Monroe, I.: On embedding right continuous martingales in Brownian motion. Ann. Math. Stat. 43, 1293–1311 (1972)
 42. Najnudel, J., Nikeghbali, A.: A new kind of augmentation of filtrations. ESAIM Probab. Stat. 15, S39–S57 (2011)
 43. Obłój, J.: The Skorokhod embedding problem and its offspring. Probab. Surv. 1, 321–390 (2004)
 44. Obłój, J., Spoida, P., Touzi, N.: Martingale inequalities for the maximum via pathwise arguments. In: Donati-Martin, C., Lejay, A., Rouault, A. (eds.) In Memoriam Marc Yor - Séminaire de Probabilités XLVII, pp. 227–247. Springer (2015)
 45. Perkins, E.: The Cereteli-Davis solution to the \(H^1\)-embedding problem and an optimal embedding in Brownian motion. In: Çınlar, E., Chung, K.L., Getoor, R.K. (eds.) Seminar on Stochastic Processes, 1985, pp. 172–223. Birkhäuser, Boston (1986)
 46. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, volume 293 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 3rd edn. Springer, Berlin (1999)
 47. Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes and Martingales: Itô Calculus. Cambridge University Press, Cambridge (2000)
 48. Root, D.H.: The existence of certain stopping times on Brownian motion. Ann. Math. Stat. 40, 715–718 (1969)
 49. Rost, H.: The stopping distributions of a Markov process. Invent. Math. 14, 1–16 (1971)
 50. Rost, H.: Skorokhod stopping times of minimal variance. In: Meyer, P.-A. (ed.) Séminaire de Probabilités, X (Première partie, Univ. Strasbourg, Strasbourg, année universitaire 1974/1975), pp. 194–208. Lecture Notes in Math., vol. 511. Springer, Berlin (1976)
 51. Rüschendorf, L.: Fréchet-bounds and their applications. In: Dall’Aglio, G., Kotz, S., Salinetti, G. (eds.) Advances in Probability Distributions with Given Marginals (Rome, 1990), volume 67 of Math. Appl., pp. 151–187. Kluwer Acad. Publ., Dordrecht (1991)
 52. Rüschendorf, L.: Optimal solutions of multivariate coupling problems. Appl. Math. (Warsaw) 23(3), 325–338 (1995)
 53. Skorohod, A.V.: Issledovaniya po teorii sluchainykh protsessov (Stokhasticheskie differentsialnye uravneniya i predelnye teoremy dlya protsessov Markova). Izdat. Kiev. Univ., Kiev (1961)
 54. Skorokhod, A.V.: Studies in the Theory of Random Processes (Translated from the Russian by Scripta Technica Inc.). Addison-Wesley Publishing Co. Inc., Reading (1965)
 55. Strasser, H.: Mathematical Theory of Statistics: Statistical Experiments and Asymptotic Decision Theory, volume 7 of de Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin (1985)
 56. Vallois, P.: Le probleme de Skorokhod sur \({\mathbb{R}}\): une approche avec le temps local. In: Azéma, J., Yor, M. (eds.) Séminaire de Probabilités XVII 1981/82, pp. 227–239. Springer (1983)
 57. Villani, C.: Topics in Optimal Transportation, volume 58 of Graduate Studies in Mathematics. American Mathematical Society, Providence (2003)
 58. Villani, C.: Optimal Transport. Old and New, volume 338 of Grundlehren der mathematischen Wissenschaften. Springer, Berlin (2009)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.