Skip to main content

Geometry of distribution-constrained optimal stopping problems


We adapt ideas and concepts developed in optimal transport (and its martingale variant) to give a geometric description of optimal stopping times \(\tau \) of Brownian motion subject to the constraint that the distribution of \(\tau \) is a given probability \(\mu \). The methods work for a large class of cost processes. (At a minimum we need the cost process to be measurable and \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted. Continuity assumptions can be used to guarantee existence of solutions.) We find that for many of the cost processes one can come up with, the solution is given by the first hitting time of a barrier in a suitable phase space. As a by-product we recover classical solutions of the inverse first passage time problem/Shiryaev’s problem.


To whet the reader’s appetite and to give some idea of the kind of problems that can be solved with the methods presented in this paper we would like to start with two corollaries to our main results. In Sect. 3 we will present these main results and in Sect. 4 we will use them to prove Corollary 1.1 from them.

Both Corollaries 1.1 and 1.2 assert that the solutions of certain optimal stopping problems can be described by a barrier in an appropriate phase space.

In this section, let \((B_{t})_{t \ge 0}\) be a Brownian motion startedFootnote 1 in 0 on some filtered probability space satisfying the usual conditions and let \(\mu \) be a measure on \( (0,\infty ) \). First we consider optimal stopping problems of the following form.


(OptStop\(^{\psi (B_t,t)}\)) Among all stopping times \(\tau \sim \mu \) on find the maximizer of

$$\begin{aligned} \tau \mapsto \mathbb E[Z_{\tau }]\text {,}\end{aligned}$$

where the process Z is of the form \(Z_t = \psi (B_t,t)\).

Corollary 1.1

Assume that \(\mu \) has finite first moment. There is an upper semicontinuous function \( \beta : \mathbb {R}_{+}\rightarrow [-\infty ,\infty ]\) such that the stopping time

$$\begin{aligned} \tau := \inf \left\{ t > 0 : B_t \le \beta (t) \right\} \end{aligned}$$

has distribution \(\mu \).

\(\tau \) has the following uniqueness properties: On the one hand it is the a.s. unique stopping time which has distribution \(\mu \) and which is of the form (1.1) (we will later say that such a stopping time is the hitting time of a downwards barrier).

On the other hand \(\tau \) is also the a.s. unique solution of \((\textsc {OptStop}^{\psi (B_t,t)})\) for a number of different \(\psi \). Namely:

  • Let \(p \ge 0\), assume \(\mu \) has finite moment of order \(\frac{1}{2} + p + \varepsilon \) for some \(\varepsilon > 0\) and let \(A: \mathbb {R}_{+}\rightarrow \mathbb {R}\) be strictly increasing and \(|A(t)| \le K (1 + t^p)\) for some constant K.Footnote 2 Then we may choose

    $$\begin{aligned} \psi (B_t,t) = B_t A(t) \text {.}\end{aligned}$$
  • Let \(p \ge 2\), assume \(\mu \) has finite moment of order \(\frac{p}{2} + \varepsilon \) for some \(\varepsilon > 0\) and let \(\phi : \mathbb {R}\rightarrow \mathbb {R}\) satisfy \( \phi ''' > 0\) as well as \(\left| \phi (y)\right| \le K (1 + |y|^p)\) for some constant K. Then we may choose

    $$\begin{aligned} \psi (B_t,t) = \phi (B_t) \text {.}\end{aligned}$$

To give an example of a slightly more complicated functional amenable to analysis with our tools consider


(OptStop\(^{B^*_{t}}\)) Among all stopping times \(\tau \sim \mu \) on find the maximizer of

$$\begin{aligned} \tau \mapsto \mathbb E[B^*_{\tau }]\text {,}\end{aligned}$$

where \(B^*_{t} = \sup _{s \le t} B(s)\).

Corollary 1.2

Assume that \(\mu \) has finite moment of order \(\frac{3}{2}\). Then \((\textsc {OptStop}^{B^*_{t}})\) has a solution \(\tau \) given by

$$\begin{aligned} \tau = \inf \left\{ t >0 : B_t - B^*_{t} \le \beta (t) \right\} \end{aligned}$$

for some upper semicontinuous function \(\beta : \mathbb {R}_{+}\rightarrow [-\infty ,0]\).

We emphasize that the solutions to the constrained optimal stopping problems provided in Corollaries 1.1 and 1.2 represent particular applications of the abstract results obtained below. Figure 1 presents graphical depictions of stopping rules of several further solutions of constrained optimal stopping problems (together with the respective optimality properties). These stopping rules can be derived—under suitable moment conditions—using arguments very similar to those required for Corollaries 1.1 and 1.2 (see also the comments in Remark 7.1 at the end of the paper).

Fig. 1
figure 1

Solutions to constrained optimal stopping problems

Background: martingale optimal transport and Shiryaev’s problem

In this article we consider distribution-constrained stopping problems from a mass transport perspective. Specifically we find that problems of the form exemplified in \((\textsc {OptStop}^{\psi (B_t,t)})\) and \((\textsc {OptStop}^{B^*_{t}})\) are amenable to techniques originally developed for the martingale version of the classical mass transport problem. This martingale optimal transport problem arises naturally in robust finance; papers to investigate such problems include [8, 12, 16, 18, 20, 25, 31]. In mathematical finance, transport techniques complement the Skorokhod embedding approach (see [24, 32] for an overview) to model-independent/robust finance.

A fundamental idea in optimal transport is that the optimality of a transport plan is reflected by the geometry of its support set which can be characterized using the notion of c-cyclical monotonicity. The relevance of this concept for the theory of optimal transport has been fully recognized by Gangbo and McCann [19], based on earlier work of Knott and Smith [28] and Rüschendorf [36, 37] among others. Inspired by these ideas, the literature on martingale optimal transport has developed a ‘monotonicity principle’ which allows to characterize martingale transport plans through geometric properties of their support sets, cf. [6, 7, 9, 10, 22, 39].

The main contribution of this article is to establish a monotonicity principle which is applicable to distribution-constrained optimal stopping problems. This transport approach turns out to be remarkably powerful, in particular we will find that questions as raised in Problems \((\textsc {OptStop}^{\psi (B_t,t)})\) and \((\textsc {OptStop}^{B^*_{t}})\) can be addressed using a relatively intuitive set of arguments.

The distribution-constrained optimal stopping problem \((\textsc {OptStop})\) (and specifically \((\textsc {OptStop}^{B^*_{t}})\)) arises naturally in financial and actuarial mathematics. We refer the reader to [23] which describes various examples (unit-linked life insurances, stochastic modelling for health insurances, the liquidation of an investment portfolio, the valuation of swing options).

Bayraktar and Miller [5] consider the same optimization problem that we treat here. However their setup and methods are rather distinct from the ones used here: they assume that the target distribution is given by finitely many atoms and that the target functional depends solely on the terminal value of Brownian motion. Following the measure valued martingale approach of Cox and Källblad [5, 14] address the constrained optimal stopping problem using a Bellman perspective.

The problem to construct a stopping time \(\tau \) of Brownian motion such that the law of \(\tau \) matches a given distribution on the real line was proposed by Shiryaev in his Banach Center lectures in the 1970’s, it has since been called Shiryaev’s problem or inverse first passage problem. Dudley and Gutmann [17] provide an abstract measure-theoretic construction. An early barrier-type solution to the inverse first passage problem was given by Anulova [3]. She constructs a symmetric two-sided barrier (corresponding to the case \(a=0\) in the sixth picture of Fig. 1). Anulova discretises the measure \(\mu \) and concludes through approximation arguments. The solution to the inverse first passage problem given in Corollary 1.1 was derived by Chen et al. [13] based on a variational inequality which describes the corresponding barrier. Notably, this is predated by a (formal) PDE description of such barriers given by Avellaneda and Zhu [4] in the context of credit risk modeling. Ekström and Janson [13] relate this solution to an optimal stopping problem and provide an integral equation for the barrier. Analytic solutions to the inverse first passage problem are known only in a few cases ([1, 2, 11, 29, 33, 38]). An interesting connection between the inverse first passage problem and Skorokhod’s problem is provided by Jaimungal et al. [26].

Statement of main results

Assumption 1

Throughout we will assume that is a filtered probability space and that \( (B_{t})_{t \ge 0} \) is an adapted process which has continuous paths on , such that B can be regarded as a measurable map from \(\Omega \) to \(C(\mathbb {R}_{+})\), the space of continuous functions from \(\mathbb {R}_{+}\) to \(\mathbb {R}\). The cost function \(c\) will always be a measurable map \(C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\). \(\mu \) will denote a probability measure on \(\mathbb {R}_{+}\).

Then the problem we consider can be stated as follows.


(OptStop) Among all stopping times \(\tau \sim \mu \) find the minimizer of

$$\begin{aligned} \tau \mapsto \mathbb E[c(B,\tau )] \text {.}\end{aligned}$$

Here we formulate our main optimization problem in terms of minimization, following the usual convention in the optimal transport literature (which is also used in the closely related paper [6]). Clearly, a sign change transforms this into a maximization problem and in our applications we will in fact turn to this latter version when resulting formulations appear more natural. We trust that this will not cause confusion.

Throughout we will also make the following assumptions without further mention:

Assumption 2

  1. 1.

    \( c\) is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted, where \((\mathcal {F}^0_{t})_{t \ge 0}\) is the filtration on \(C(\mathbb {R}_{+})\) generated by the canonical process \(\left( \omega \mapsto \omega (t)\right) _{t \in \mathbb {R}_{+}{}} \).

  2. 2.

    There is a -measurable random variable U which is uniformly distributed on [0, 1] and independent of the process \( (B_{t})_{t \ge 0} \).

  3. 3.

    There is a probability measure \(\lambda \) s.t. \( (B_{t})_{t \ge 0} \) is a Brownian motion with initial law \(\lambda \), i.e. \(B_0 \sim \lambda \).

  4. 4.

    The problem is well-posed in the sense that \( \mathbb E[c(B,\tau )] \) is defined and \( > -\infty \) for all stopping times \( \tau \sim \mu \) and that \( \mathbb E[c(B,\tau )] < \infty \) for at least one such stopping time.

  5. 5.

    \( {\int }t^{p_0} \,d\mu (t) < \infty \), where \(p_0 \ge 0\) is some constant that we fix here and that can be chosen when applying the results from this section.

A note on language: The adjective “adapted” is usually applied to processes whose time argument is written in subscript form. For any filtered measurable space \(\tilde{\Omega }\) and any function \(f : \tilde{\Omega } \times \mathbb {R}_{+}\rightarrow \mathbb {R}\) (or possibly \(f : \tilde{\Omega } \times \mathbb {R}_{+}\rightarrow [-\infty ,\infty ] \)) we will interchangeably think of f simply as a function or as the process \( Y_t(\omega ) := f(\omega ,t) \). And so f being adapted means the same thing as \((Y_t)_{t \in \mathbb {R}_{+}}\) being adapted. Similarly for a subset \(\Gamma \) of \(\tilde{\Omega } \times \mathbb {R}_{+}\) we may also think of \(\Gamma \) as its indicator function or as the process \(Y_t(\omega ) := 1_{\Gamma }(\omega ,t)\) and will also say that the set \(\Gamma \) is adapted.

With that in mind, Assumption 2.1 should seem like an obvious thing to ask for from the cost function. Also, knowing about the existence of optional projections, it should be clear no later than Lemma 5.3 that Assumption 2.1 does not pose a real restriction on the class of problems we are treating.

The role of Assumption 2.2 should become clearer soon. We would like to note at this point though that often enough our results put together will imply that the solution of Problem \((\textsc {OptStop})\) for a space which satisfies Assumption 2.2 is essentially the same as the solution of the Problem for a space which may not satisfy said assumption, and we will find that we can describe this solution in detail. This can be seen executed in the proofs of the corollaries stated in the Appetizer.

The methods in this paper work not just for Brownian motion but for a class of processes which is conceptually bigger, but then turns out to not include much beyond Brownian motion—namely for any space-homogeneous but possibly time-inhomogeneous Markov process with continuous paths which has the strong Markov property. (Here space-homogeneous means that starting the process at location x and then moving its paths to start at location y results in a version of the process started at y.) If the reader so wishes, she may think of B as a process from this slightly larger class of processes. Care was taken not to reference any properties of Brownian motion beyond those stated here. In particular our results apply to multi-dimensional Brownian motion.

Assumption 2.4 is mostly just there to ensure that we are actually talking about an optimization problem in a meaningful sense. For the problems presented in the Appetizer, the moment conditions on \(\mu \) which are given in the statement of Corollary 1.1 and Corollary 1.2 ensure that Assumption 2.4 is satisfied (as we will see in the proofs of these corollaries).

The constant \(p_0\) in Assumption 2.5 will (implicitly) appear in the statement of Theorem 3.6, one of the main results. Its role is to ensure that \(\mathbb E[\varphi (B,\tau )]\) will be finite for some (class of) function(s) \(\varphi \) and any solution \(\tau \) of \((\textsc {OptStop})\). (The choice \(\varphi (B,\tau ) = \tau ^{p_0}\) is somewhat arbitrary here.)

The main results are Theorems 3.1 and 3.6.

We give two versions of Theorem 3.1. Version A is easier to state and may feel more natural, but we will need Version B (which is more general and has essentially the same proof as Version A) in the proof of the corollaries in the Appetizer.

Theorem 3.1

Version A. Assume that the cost function \(c\) is bounded from below and lower semicontinuous when we equip \(C(\mathbb {R}_{+})\) with the topology of uniform convergence on compacts. Then the Problem \((\textsc {OptStop})\) has a solution.

Version B. Assume that the cost function \(c\) is lower semicontinuous when we equip \(C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) with the product topology of two Polish topologies which generate the right sigma-algebras on \(C(\mathbb {R}_{+})\) and \(\mathbb {R}_{+}\) respectively and assume that the set \( \left\{ c_-(B,\tau ) : \tau \sim \mu , \tau \text { is a stopping time} \right\} \) is uniformly integrable, where \(c_- := -c\vee 0 \) denotes the negative part of \(c\). Then the Problem \((\textsc {OptStop})\) has a solution.

To state Theorem 3.6 we need a few more definitions.

Remark 3.2

We will find it convenient to talk about processes that don’t start at time 0 but instead at some time \( t > 0 \). Similarly we will consider stopping times taking values in \([t,\infty )\). These will be defined on the space \(C([t, \infty ))\) equipped with the filtration \((\mathcal {F}_{t}^{s})_{s \ge t}\), again generated by the canonical process \(\left( \omega \mapsto \omega (s)\right) _{s \ge t} \). We refer to the distribution of Brownian motion started at time t and location x by \(\mathbb {W}^{t}_{x}\). This is a measure on \(C([t, \infty ))\). For a probability measure \(\kappa \) on \(\mathbb {R}\) we write \(\mathbb {W}^{t}_{\kappa }\) for the distribution of Brownian motion started at time t with initial law \(\kappa \).

Definition 3.3

(Concatenation) For every \(t \in \mathbb {R}_{+}\) we have an operation \(\odot \) of concatenation, which is a map into \(C([t, \infty ))\) and is defined for \((\omega ,s) \in C([t, \infty ))\times [t, \infty )\) and \(\theta \in C\left( [s,\infty )\right) \) with \(\theta (s)=0\) by

$$\begin{aligned} \left( (\omega ,s) \odot \theta \right) (r) = {\left\{ \begin{array}{ll} \omega (r) &{} t\le r \le s \\ \omega (s) + \theta (r) &{} r > s \end{array}\right. }\text {.}\end{aligned}$$

Definition 3.4

(Stop-Go pairs) The set of Stop-Go pairs \(\mathsf {SG}\subseteq \left( C(\mathbb {R}_{+})\times \mathbb {R}_{+}\right) \times \left( C(\mathbb {R}_{+})\times \mathbb {R}_{+}\right) \) is defined as the set of all pairs \( ((\omega ,t),(\eta ,t)) \) (note that the time components have to match) such that

$$\begin{aligned} c(\omega ,t) + {\int }c((\eta ,t) \odot \theta , \sigma (\theta )) \,d\mathbb {W}^{t}_{0}(\theta ) < c(\eta ,t) + {\int }c((\omega ,t) \odot \theta , \sigma (\theta )) \,d\mathbb {W}^{t}_{0}(\theta ) \end{aligned}$$

for all \((\mathcal {F}_{t}^{s})_{s \ge t}\)-stopping times \(\sigma \) for which \(\mathbb {W}^{t}_{0}(\sigma = t) < 1\), \(\mathbb {W}^{t}_{0}(\sigma = \infty ) = 0\), \({\int }\sigma ^{p_0} \,d\mathbb {W}^{t}_{0} < \infty \) and for which both sides in (3.2) are defined and finite.

A hopefully intuitive way of putting the definition of Stop-Go pairs into words is the following: \(((\omega ,s),(\eta ,t))\) form a Stop-Go pair iff, irrespective of how we might stop after time t (i.e. which stopping rule \(\sigma \) we might use after time t), Stopping \(\omega \) at time t and letting \(\eta \) Go on is better—i.e. has lower cost—than stopping \(\eta \) and letting \(\omega \) go on. These two situations are contrasted in Fig. 2.

Fig. 2
figure 2

The left hand side of (3.2) corresponds to averaging the function c over the stopped paths on the left picture, the right hand side to averaging the function c over the stopped paths on the right picture

As hinted at earlier, the definition of Stop-Go pairs depends on the parameter \(p_0\) from Assumption 2.5. A larger \(p_0\) means that we are asking for more in Assumption 2.5 and implies that we get a larger set \(\mathsf {SG}\), as we are quantifying over fewer stopping times \(\sigma \) in the definition of \(\mathsf {SG}\). This in turn implies that the conclusion of Theorem 3.6 below will be stronger.

Definition 3.5

(Initial Segments) For a set \(\Gamma \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) define the set \(\Gamma ^<\subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) by

$$\begin{aligned} \Gamma ^< = \left\{ (\omega ,s) : (\omega ,t) \in \Gamma \text { for some } t > s \right\} \text {.}\end{aligned}$$

Theorem 3.6

(Monotonicity Principle) Assume that \( \tau \) solves \((\textsc {OptStop})\). Then there is a measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted set \( \Gamma \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) such that

$$\begin{aligned} \mathbb {P}[ ((B_{t})_{t \ge 0},\tau ) \in \Gamma ] = 1 \end{aligned}$$


$$\begin{aligned} \mathsf {SG}\cap \left( \Gamma ^< \times \Gamma \right) = \emptyset \text {.}\end{aligned}$$

The following lemma should give a first hint about how the Monotonicity can be applied.

Lemma 3.7

Let \(\tau \) be a solution of \((\textsc {OptStop})\) and assume that the cost function \(c\) is such that there exists a measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted process \((Y_{t})_{t \ge 0}\) such that

$$\begin{aligned} Y_t(\omega ) < Y_t(\eta ) \implies \left( (\omega ,t),(\eta ,t)\right) \in \mathsf {SG}\text {.}\end{aligned}$$

Define the barriers by

where \(\Gamma \) is a set with the properties in Theorem 3.6. Define the functions and \(\hat{\tau }\) on \(C(\mathbb {R}_{+})\) by



When applying this Lemma to show that some optimal stopping problem has a barrier-type solution as symbolized for example by the pictures in Fig. 1 the process \(Y_t(B)\) is of course what we are labelling the vertical axes in the pictures with. So for the first picture \(Y_t(\omega ) = \omega (t)\), for the second one \(Y_t(\omega ) = \omega (t) - \sup _{s \le t}\omega (s)\), for the third \(Y_t(\omega ) = -(\omega (t) - \sup _{s \le t}\omega (s))\) (the sign is flipped relative to the labelling in the picture because in this picture the barrier is drawn “up” instead of “down”), etc.

Notice that, contrary to customs, when we draw the barriers in the pictures in Fig. 1 the first coordinate is the vertical axis and the second coordinate is the horizontal axis. This is because, to make cross-referencing and comparison with [6] easier, we follow their convention of always having time as the second coordinate but still in the pictures it seems more natural to put the independent variable on the horizontal axis.

Note that a priori and \(\hat{\tau }\) need not be stopping times or even measurable, as we don’t know much about the sets and .

Using the properties of a concrete process \((Y_{t})_{t \ge 0}\) we will in the proofs of Corollaries 1.1 and 1.2 be able to show that a.s. (this should not be surprising as for each time t the barriers and differ by at most a single point) and therefore that the optimizer \(\tau \) is the hitting time of a barrier.

Proof of Lemma 3.7

Let \(\tilde{\omega } \in \Omega \) s.t. \(\left( B(\tilde{\omega }),\tau (\tilde{\omega })\right) \in \Gamma \). By assumption this holds for \(\mathbb {P}\)-a.a. \(\tilde{\omega }\). Then and therefore .

Next we show that \(\hat{\tau }(\tilde{\omega }) \ge \tau (\tilde{\omega })\). Assume that \(\left( Y_t(B(\tilde{\omega })),t\right) \in \hat{\mathcal {R}}\). We want to show that \(t \ge \tau (\tilde{\omega })\). By the definition of \(\hat{\mathcal {R}}\) we find that there is \(\eta \in C(\mathbb {R}_{+})\) with \((\eta ,t) \in \Gamma \) and \(Y_t(B(\tilde{\omega })) < Y_t(\eta )\), so by (3.5) we know \(\left( (B(\tilde{\omega }),t),(\eta ,t)\right) \in \mathsf {SG}\). Assuming, if possible, \(t < \tau (\tilde{\omega })\) we get according to Definition 3.5 that \((B(\tilde{\omega }),t) \in \Gamma ^<\). Therefore we have that \(\left( (B(\tilde{\omega }),t),(\eta ,t)\right) \in \mathsf {SG}\cap \left( \Gamma ^<\times \Gamma \right) \), but this is a contradiction to \(\mathsf {SG}\cap \left( \Gamma ^<\times \Gamma \right) = \emptyset \), so we must have \(t \ge \tau (\tilde{\omega })\). \(\square \)

Remark 3.8

(Duality) Problem \((\textsc {OptStop})\) is an infinite-dimensional linear programming problem and one would hence expect that a corresponding dual problem can be formulated. Indeed, assuming that c is lower semicontinuous and bounded from below, the value of the optimization problem equals

$$\begin{aligned} \sup _{M,\psi } \mathbb E[M_0]+ {\int }\psi \, d\mu , \end{aligned}$$

where the supremum is taken over bounded -martingales \(M=(M_t)_{t\ge 0}\) and bounded continuous functions \(\psi : \mathbb {R}_+\rightarrow \mathbb {R}\) satisfying (up to evanescence)

$$\begin{aligned} M_t+\psi (t)\le c(B,t)\text {.}\end{aligned}$$

This can be established in complete analogy to the duality result derived in [6, Theorem 1.2/Section 4.2] and we do not elaborate.

Digesting the appetizer

We will now demonstrate how to use the Monotonicity Principle of Theorem 3.6 to derive Corollary 1.1. The proof of Corollary 1.2 is very similar but relies on understanding a technical detail which does not add much to the story at this point, so we leave it for the end of the paper.

Both of the sets and in Lemma 3.7 have the property that (writing \(\mathcal {R}\) for the set in question) \((y,t) \in \mathcal {R}\) and \(y' \le y\) implies \((y',t) \in \mathcal {R}\). We call such sets (downwards) barriers. More specifically, for technical reasons in what follows it is slightly more convenient to talk about subsets of \([-\infty ,\infty ]\times \mathbb {R}_{+}\) instead of subsets of \( \mathbb {R}\times \mathbb {R}_{+}\), giving the following definition.

Definition 4.1

Let X be a topological space. A downwards barrier is a set \(\mathcal {R}\subseteq [-\infty ,\infty ]\times X\) such that \(\{-\infty \} \times X \subseteq \mathcal {R}\) and

$$\begin{aligned} (y,t) \in \mathcal {R}\text { and } y' \le y \text { implies } (y',t) \in \mathcal {R}\end{aligned}$$

Clearly, in Lemma 3.7, instead of talking about , we could have talked about without anything really changing, and likewise for \(\hat{\mathcal {R}}\).

The reader will easily verify the following lemma.

Lemma 4.2

Let X be a topological space. There is a bijection between the set of all upper semicontinuous functions \(\beta : X \rightarrow [-\infty ,\infty ]\) and the set of all closed downwards barriers \(\mathcal {R}\subseteq [-\infty ,\infty ]\times X\) (where closure is to be understood in the product topology). This bijection maps any upper semicontinuous function \(\beta \) to the barrier \(\mathcal {R}\) which is the hypograph of \(\beta \)

$$\begin{aligned} \mathcal {R}:= \left\{ (y,x) : y \le \beta (x) \right\} \text {,}\end{aligned}$$

while the inverse maps a barrier \(\mathcal {R}\) to the function \(\beta \) given by

$$\begin{aligned} \beta (x) := \sup \left\{ y : (y,x) \in \mathcal {R}\right\} \text {.}\end{aligned}$$

What we will show now, on the way to proving Corollary 1.1 is that the first hitting time after 0 of any downwards barrier by Brownian motion is a.s. equal to the first hitting time after 0 of the closure of that barrier. This serves to both resolve the question whether the times in Lemma 3.7 are stopping times and to show that a.s.

Let us assume for the rest of this section that B is actually a Brownian motion started in 0.

Lemma 4.3

Let \(\mathcal {R}\) be a downwards barrier in \([-\infty ,\infty ]\times \mathbb {R}_{+}\). Let \(\overline{\mathcal {R}}\) be the closure of \(\mathcal {R}\) (in the product topology of the usual topologies on \([-\infty ,\infty ]\) and \(\mathbb {R}_{+}\)). Define

$$\begin{aligned} \tau (\omega )&:= \inf \{ t> 0 : (B_t(\omega ),t) \in \mathcal {R}\} \\ \overline{\tau }(\omega )&:= \inf \{ t > 0 : (B_t(\omega ),t) \in \overline{\mathcal {R}}\} \text {.}\end{aligned}$$

Then \(\tau = \overline{\tau }\) a.s.


As \(\overline{\mathcal {R}}\supseteq \mathcal {R}\) we clearly have \(\overline{\tau }(\omega ) \le \tau (\omega )\) for all \(\omega \in \Omega \). Define

$$\begin{aligned} \overline{\tau }_{\varepsilon }(\omega ) := \inf \{ t > 0 : (B_t(\omega ) + \varepsilon \cdot A(t),t) \in \overline{\mathcal {R}}\} \text {,}\end{aligned}$$

where \(A(t) := \frac{t}{1+t} \) is a bounded, strictly increasing function. Using just that \(\overline{\mathcal {R}}\) is the closure of \(\mathcal {R}\) one proves by elementary methods that \(\tau (\omega ) \le \overline{\tau }_{\varepsilon }(\omega )\) for all \(\omega \in \Omega \) and any \(\varepsilon > 0\). Because \(A(t) = \int _0^t (1+s)^{-2} \,ds \) is the integral from 0 to t of a square integrable function we can apply Girsanov’s theorem (see e.g. [34, Theorem 38.5]) to see that \(\overline{\tau }_{1/n}\) converges to \(\overline{\tau }\) in distribution as \(n \rightarrow \infty \).

As \(\left( \overline{\tau }_{1/n}\right) _n\) is a decreasing sequence bounded below by \(\overline{\tau }\) we get that convergence holds almost surely. \(\square \)

The following is a particular case of [21, Corollary 2.3] (which in turn relies on arguments given in [30, 35]). Note that this lemma is purely a statement about barrier-type stopping times and is not directly connected to the optimization problem under consideration.

Lemma 4.4

(Uniqueness of Barrier-type solutions) Assume that \( (Y_{t})_{t \ge 0} \) is a measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted process and that the process Z defined through \(Z_t:=Y_t(B)\) has a.s. continuous paths. Let \( \mathcal {R}_1, \mathcal {R}_2 \subseteq [-\infty ,\infty ]\times \mathbb {R}_{+}\) be closed downwards barriers such that for

$$\begin{aligned} \tau _i(\omega ) := \inf \left\{ t > 0 : (Z_t(\omega ),t) \in \mathcal {R}_i \right\} \end{aligned}$$

we have \(\tau _1 \sim \tau _2\). Then \(\tau _1= \tau _2\) a.s.


Is to be found in [21, Corollary 2.3]. \(\square \)

We now have the necessary prerequisites to use our main results in showing that the first optimization problem in the Appetizer admits a (unique) barrier-type solution.

Proof of Corollary 1.1

The strategy is as follows: We choose a cost function and leverage Theorem 3.1 to show that an optimizer exists, the Monotonicity Principle in the form of Theorem 3.6 and Lemma 3.7 will—with some help from Lemma 4.3—show that any optimizer must be the hitting time of a barrier. Lemma 4.4 shows that any two barrier-type solutions must be equal.

We now provide the details. Start with a cost function \(c(\omega ,t) := -\omega (t) A(t)\) for a strictly monotone function \(A: \mathbb {R}_{+}\rightarrow \mathbb {R}\) which satisfies \(|A(t)| \le K(1+t^p)\) and assume that \(\mu \) has moment of order \(\frac{1}{2} + p + \varepsilon \) for some \(\varepsilon > 0\). To prove that a barrier-type solution exists when \(\mu \) has first moment, choose a bounded strictly increasing A and \(p=0\), \(\varepsilon =\frac{1}{2}\) in this step. (These assumptions guarantee in particular that the optimization problems considered below have a finite value.) Clearly the problem (OptStop) for c corresponds to \((\textsc {OptStop}^{\psi (B_t,t)})\) for \(\psi (B_t,t) = B_t A(t)\) (i.e. \(\psi \) takes the role of \(-c\) such that the minimal/maximal values agree up to a change of sign). We will deal with the case where \(\psi (B_t,t) = \phi (B_t)\) at the end of this proof.

We now check that the conditions in Version B of Theorem 3.1 are satisfied. We also need to check that Assumption 2 holds. Here we need the assumption that \(\mu \) has moment of order \(\frac{1}{2} + p + \varepsilon \), as well as the Hölder and Burkholder-Davis-Gundy inequalities. The latter specialized to Brownian motion state that for all \(q > 0\) there are positive constants \(K_0\) and \(K_1\) such that for any stopping time \(\tau \) we have \( K_0 \, \mathbb E\left[ \tau ^{q/2}\right] \le \mathbb E\left[ (|B|^*_\tau )^q\right] \le K_1 \, \mathbb E\left[ \tau ^{q/2}\right] \) (where \(|B|^*_t = \sup _{s \le t} |B_s| \)). With these in hand a straightforward calculation allows us to bound \(B_\tau A(\tau )\) in the \(L^{1+\delta }\)-norm for some \(\delta > 0\), independently of the stopping time \(\tau \sim \mu \).

This shows both that the uniform integrability condition in Version B of Theorem 3.1 is satisfied and that Assumption 2.4 is satisfied.

On \(C(\mathbb {R}_{+})\) we may choose the (Polish) topology of uniform convergence on compacts. For the topology on \(\mathbb {R}_{+}\) we start with the usual topology and turn A into a continuous function (if it wasn’t), by making use of the fact that any measurable function from a Polish space to a second countable space may be turned into a continuous function by passing to a larger Polish topology (with the same Borel sets) on the domain. (This can be found for example in [27, Theorem 13.11].)

In the statement of Corollary 1.1 we did not require that the probability space satisfy Assumption 2.2. To remedy this we can enlarge the probability space by setting \(\tilde{\Omega } := \Omega \times [0,1]\), and \(\tilde{\mathbb {P}} := \mathbb {P}\otimes \mathcal {L}\), where \(\mathcal {L}\) is Lebesgue measure on [0, 1]. On this space we consider the Brownian motion \(\tilde{B}_t (\omega , x) := B_t(\omega )\). Theorem 3.1 now gives us an optimal stopping time \(\tilde{\tau }\) on the enlarged probability space. If we can show that this stopping time is in fact the hitting time of a barrier, then it follows that \(\tilde{\tau } = \tau \circ ((\omega ,x) \mapsto \omega )\) for a stopping time \(\tau \) which is defined as the hitting time of the Brownian motion B of the same barrier. As there are more stopping times on than on in the sense that any stopping time \(\tau '\) on induces a stopping time \(\tilde{\tau }' := \tau ' \circ ((\omega ,x) \mapsto \omega )\) on we conclude that \(\tau \) must also be optimal among the stopping times on . With this out of the way, let us refer to our Brownian motion by B, to the optimal stopping time by \(\tau \) and to our filtered probability space by irrespective of whether this is the original process and space we started with, or an enlarged one.

Choosing \( p_0 := \frac{1}{2} + p + \varepsilon \) in Assumption 2.5 we apply Theorem 3.6 to obtain a set \(\Gamma \) on which \((B,\tau )\) is concentrated under \(\mathbb {P}\) and for which (3.4) holds. As \(\mu \) is concentrated on \((0,\infty )\), we may assume that \(\Gamma \cap (C(\mathbb {R}_{+})\times \{ 0 \}) = \emptyset \). Next we want to show that Lemma 3.7 applies with \(Y_t(\omega ) = \omega (t)\).

Translating (3.5) to our situation, we want to prove that \(\omega (t) < \eta (t)\) implies

$$\begin{aligned} -\omega (t) A(t) - \mathbb E\left[ \left( \eta (t) + \tilde{B}_\sigma \right) A(\sigma ) \right] < -\eta (t) A(t) - \mathbb E\left[ \left( \omega (t) + \tilde{B}_\sigma \right) A(\sigma ) \right] \text {,}\end{aligned}$$

where \(\tilde{B}\) is Brownian motion started in 0 at time t on \(C([t, \infty ))\) and \(\sigma \) is any stopping time thereon with \(\mathbb {W}^{t}_{0}(\sigma = t) < 1\), \(\mathbb {W}^{t}_{0}(\sigma = \infty ) = 0\), \({\int }\sigma ^{p_0} \,d\mathbb {W}^{t}_{0} < \infty \). Again the Burkholder–Davis–Gundy inequality shows that \(\mathbb E[\tilde{B}_\sigma A(\sigma )] < \infty \). So (4.1) turns into

$$\begin{aligned} \omega (t)\, \mathbb E[A(\sigma ) - A(t)] < \eta (t)\, \mathbb E[A(\sigma ) - A(t)] \end{aligned}$$

which clearly follows from the assumptions. So we know that Lemma 3.7 holds, i.e. using the names from said lemma we have \(\mathbb {P}\)-a.s.

\(\Gamma \cap (C(\mathbb {R}_{+})\times \{ 0 \}) = \emptyset \) implies and therefore , and likewise for \(\hat{\mathcal {R}}\) and \(\hat{\tau }\). As it follows from Lemma 4.3 that a.s. and that \(\tau \) is of the form claimed in (1.1) with \(\beta (t) := \sup \{y\in \mathbb {R}: (y,t) \in \overline{\mathcal {R}}\}\). The uniqueness claims follow from Lemma 4.4 and what we have already proven.

We now treat the case where \(\psi (B_t,t) = \phi (B_t)\) with \(\phi ''' > 0\), \(\left| \phi (y)\right| \le K (1 + |y|^p)\) and \(\mu \) has finite moment of order \( \frac{p}{2} + \varepsilon \) for some \(\varepsilon > 0\). Most of the proof remains unchanged. Setting \(c(\omega ,t) = -\phi (\omega (t))\) we may again use the Burkholder-Davis-Gundy inequalities to show that \(c(B_\tau ,\tau )\) is bounded in \(L^{1+\delta }\)-norm, independently of the stopping time \(\tau \sim \mu \), thereby showing both that Assumption 2.4 is satisfied and that the uniform-integrability condition in Version B of Theorem 3.1 is satisfied.

It remains to show that \(\omega (t) < \eta (t)\) implies \(((\omega ,t),(\eta ,t)) \in \mathsf {SG}\). \(\phi ''' > 0\) implies that the map \(y \mapsto \phi (\eta (t) + y) - \phi (\omega (t) + y)\) is strictly convex. By the strict Jensen inequality \(\mathbb E[ \phi (\eta (t) + \tilde{B}_\sigma ) - \phi (\omega (t) + \tilde{B}_\sigma ) ] > \phi (\eta (t)) - \phi (\omega (t))\) for any stopping time \(\sigma \) on \(C([t, \infty ))\) which is almost surely finite, satisfies optional stopping and is not almost surely equal to t. As we may choose \(p_0 := \frac{p}{2} + \varepsilon \), which is greater than 1, we may assume that the \(\sigma \) in the definition of \(\mathsf {SG}\) has finite first moment, which is enough to guarantee that it satisfies optional stopping. Rearranging the last inequality gives (3.2). \(\square \)

Existence of an optimizer

The proof of existence of solutions to the Problem (OptStop) crucially depends on thinking of stopping times as the joint distribution of the process to be stopped and the stopping time. We introduce some concepts to make this precise and give a proof of Theorem 3.1 at the end of this section.

Lemma 5.1

Let \(G: C([t, \infty ))\rightarrow \mathbb {R}\), and \(s \ge t\). The function

$$\begin{aligned} \omega \mapsto {\int }G((\omega ,s) \odot \theta ) \,d\mathbb {W}^{s}_{0}(\theta ) \end{aligned}$$

is a version of the conditional expectation \(\mathbb E_{\mathbb {W}^{t}_{\lambda }}[G|\mathcal F^t_s]\) (for any initial distribution \(\lambda \)). Henceforth, by we will mean this function.

If \(G \in C_b\left( C([t, \infty ))\right) \), then .


Obvious. \(\square \)

Here we use \(C_b(X)\) to denote the set of continuous bounded functions from a topological space X to \(\mathbb {R}\). The last sentence of the lemma is of course true for any topology on \(C([t, \infty ))\) for which the map \(\omega \mapsto \omega \odot \theta \) is continuous for all \(\theta \), but we will only need it for the topology of uniform convergence on compacts.Footnote 3

Given spaces X and Y we will denote the projection from \(X \times Y\) to X by \(\mathsf {proj}_{X}\) (and similarly for Y). For a measurable map \(F: X \rightarrow Y\) between measure spaces and a measure \(\nu \) on X we denote the pushforward of \(\nu \) under F by \(F_*(\nu ) := D \mapsto \nu (F^{-1}\left[ D\right] )\).

Definition 5.2

(\(\mathsf {RST}\)) The set \(\mathsf {RST}^{t}_{\kappa }\) of randomized stopping times (of Brownian motion started at time t with initial distribution \(\kappa \)) is defined as the set of all subprobability measures \(\xi \) on \(C([t, \infty ))\times [t, \infty )\) such that \((\mathsf {proj}_{C([t, \infty ))})_*(\xi ) \le \mathbb {W}^{t}_{\kappa }\) and that


for all \(s > t\), all \( G \in C_b\left( C\left( [t,\infty )\right) \right) \) and all \(F \in C_b\left( [t,\infty )\right) \) supported on [ts].

In this definition the topology on \(C([t, \infty ))\) is that of uniform convergence on compacts and the topology on \([t, \infty )\) is the usual topology.

Given a distribution \(\nu \) on \(C\left( [t,\infty )\right) \) we write

$$\begin{aligned} \mathsf {RST}^{t}_{\kappa }(\nu ) := \left\{ \xi \in \mathsf {RST}^{t}_{\kappa } : (\mathsf {proj}_{[t, \infty )})_*(\xi ) = \nu \right\} \text {.}\end{aligned}$$

We write \(\mathsf {RST}^{t}_{\kappa }(\mathcal {P})\) for the set of all \(\xi \in \mathsf {RST}^{t}_{\kappa }\) with mass 1 and call these the finite randomized stopping times.

In any of these, if we drop the superscript t then we will mean time \(t = 0\), while, if we drop the subscript \(\kappa \), then we mean that the initial distribution \(\kappa = \delta _0\), i.e. the Brownian motion to be stopped is started deterministically in 0.

To explain the qualifier finite it may help to imagine that for a non-finite randomized stopping time of mass \(\alpha < 1\), the mass \(1-\alpha \) which is missing is placed along \(C([t,\infty ))\times \{\infty \}\).

The following Lemma 5.3 from [6] shows that the problem (OptStop) is equivalent to the following optimization problem \((\textsc {OptStop'})\) in the sense that a solution of one can be translated into a solution of the other and vice versa. This of course also implies that the values of the two problems are equal, thereby showing that the concrete space has no bearing on this value, as long as Assumptions 1 and 2 are satisfied.

The definition we have given for a randomized stopping time is only the most convenient (for our purposes) of a number of possible equivalent definitions. Although Lemma 5.3 below should provide some intuition on what a randomized stopping time is, the reader may still wish to refer to [6, theorem 3.8] for the other possible ways of defining randomized stopping times. The first step in connecting condition (5.1), which is one of the equivalent conditions listen in said theorem, to the others, is to notice that (5.1) can be rewritten as

where \(\xi _\omega \) is a disintegration of \(\xi \) with respect to \(\mathbb {W}^{t}_{\kappa }\). This says that the function \(\omega \mapsto {\int }F(r) \,d\xi _\omega (r)\) is orthogonal to for all bounded continuous G, i.e. that it is a.s. \(\mathcal F^t_s\)-measurable whenever F is supported on [ts]. A limit argument then shows that \(\omega \mapsto \xi _\omega ([t,s])\) is a.s. \(\mathcal F^t_s\)-measurable. Again, we refer the reader to [6] for a more detailed exposition.


\((\textsc {OptStop'})\) Among all randomized stopping times \(\xi \in \mathsf {RST}_{\lambda }(\mu )\) find the minimizer of

$$\begin{aligned} \xi ' \mapsto {\int }c\,d\xi ' \text {.}\end{aligned}$$

Lemma 5.3

([6, Lemma 3.11]) Let \(\tau \) be a -stopping time and consider

$$\begin{aligned} \Phi&: \Omega \rightarrow C(\mathbb {R}_{+})\times [0,\infty ] \\ \Phi (\omega )&:= ((B_t(\omega ))_{t\ge 0}, \tau (\omega )) \text {.}\end{aligned}$$

Then is a randomized stopping time, i.e. \(\xi \in \mathsf {RST}_{\lambda }\), and for any non-negative measurable process \(F: C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\) we have

$$\begin{aligned} {\int }F \,d\xi = \mathbb E[(F \cdot 1_{C(\mathbb {R}_{+})\times \mathbb {R}_{+}}) \circ \Phi ] = \mathbb E[F(B,\tau ) \cdot 1_{\mathbb {R}_{+}}(\tau )] \text {.}\end{aligned}$$

For any \(\xi \in \mathsf {RST}_{\lambda }\), we can find a -stopping time \(\tau \) such that \(\xi = \Phi _*(\mathbb {P})\) and (5.2) holds.

\(\xi \) is a finite randomized stopping time iff \(\tau \) is a.s. finite.

Proof of Theorem 3.1

We prove Version B of the theorem. Version A is a special case. We show that Problem \((\textsc {OptStop'})\) has a solution. To this end we show that the set \(\mathsf {RST}_{\lambda }(\mu )\) is compact (in the weak topology). From the fact that \(c\) is lower semicontinuous and bounded from below in an appropriate sense we then deduce by the Portmanteau theorem that the map

$$\begin{aligned} \hat{c}&: \mathsf {RST}_{\lambda }(\mu ) \rightarrow (-\infty ,\infty ] \\ \hat{c}(\zeta )&:= {\int }c\,d\zeta \end{aligned}$$

is lower semicontinuous and therefore that the infimum \( \inf _{\zeta \in \mathsf {RST}_{\lambda }(\mu )} \hat{c}(\zeta ) \) is attained.

Now for the details. On each of the spaces \(C(\mathbb {R}_{+})\) and \(\mathbb {R}_{+}\) we are dealing with two topologies, one coming from the Definition 5.2 of randomized stopping times (to wit, the topology of uniform convergence on compacts on the space \(C(\mathbb {R}_{+})\) and the usual topology on \(\mathbb {R}_{+}\)) and one coming from the assumptions in the statement of this theorem. We can equip each of these spaces with the smallest topology which contains the two topologies in question. These are again Polish topologies and they still generate the standard sigma-algebras on the respective spaces. For the remainder of this proof all topological notions are to be understood relative to these topologies. So the topology on \(C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) is the product topology of these two topologies, and the weak topology on the space of measures on \(C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) is to be understood relative to this product topology. The cost function \(c\) of course remains lower semicontinuous and by Lemma 5.1 the functions appearing in Definition 5.2 are continuous.

Note that for \(\xi \in \mathsf {RST}_{\lambda }(\mu )\) as \(\mu \) has mass 1, so must \(\xi \) and \((\mathsf {proj}_{C(\mathbb {R}_{+})})_*(\xi )\), which together with \((\mathsf {proj}_{C(\mathbb {R}_{+})})_*(\xi ) \le \mathbb {W}^{0}_{\lambda }\) implies \((\mathsf {proj}_{C(\mathbb {R}_{+})})_*(\xi ) = \mathbb {W}^{0}_{\lambda }\). So we deduce

$$\begin{aligned} \mathsf {RST}_{\lambda }(\mu ) = \left\{ \xi \in \Pi : {\int }F(s) \big ( G - \mathbb E[G|\mathcal F^0_t] \big )(\omega ) \,d\xi (\omega ,s) = 0 \forall (t,F,G) \in \star \right\} \end{aligned}$$


The set \( \Pi \) is compact by Prokhorov’s Theorem and the fact that pushforwards are continuous maps between measure spaces. It remains to show that \(\mathsf {RST}_{\lambda }(\mu )\) is a nonempty closed subset. It is nonempty because the product measure \(\mathbb {W}^{0}_{\lambda } \otimes \mu \in \mathsf {RST}_{\lambda }(\mu )\). It is closed because, as noted, is continuous for all \((t,F,G) \in \star \).

Now we show that \( \hat{c} \) is lower semicontinuous. The functions \( c^N := c\vee -N \) are each bounded from below and lower semicontinuous. By the Portmanteau theorem the maps \(\hat{c}^N := \zeta \mapsto {\int }c^N \,d\zeta \) are lower semicontinuous. On \(\mathsf {RST}_{\lambda }(\mu )\) they converge uniformly to \(\hat{c}\) because

$$\begin{aligned} \sup _\zeta \left| \hat{c}(\zeta ) - \hat{c}^N(\zeta )\right| \le \sup _\zeta {\int }\left| c- c^N \right| \,d\zeta \le \sup _{\zeta \in \mathsf {RST}_{\lambda }(\mu )} {\int }c_- \cdot 1_{c_- \ge N} \,d\zeta \text {,}\end{aligned}$$

which converges to 0 as N goes to \(\infty \) by the uniform integrability assumption. As a uniform limit of lower semicontinuous functions is again lower semicontinuous we see that \(\hat{c}\) is lower semicontinuous. \(\square \)

Geometry of the optimizer

This section is devoted to the proof of Theorem 3.6. The proof closely mimicks that of Theorem 1.3/Theorem 5.7 in [6]. For the benefit of those readers already familiar with said paper we will first describe the changes required to the proofs there to make them work in our situation and then—for the sake of a more self-contained presentation—indulge in reiterating the main arguments and only citing results from [6] that we can use verbatim.

Sketch of differences in the proof of Theorem 3.6 relative to [6, Theorem 5.7]

Again the strategy is to show that for a larger set we can find a set such that . The definition of must of course be adapted analoguously to the changes required to the definition of \(\mathsf {SG}\).

Apart from that the only real changes are to [6, Theorem 5.8]. Whereas previously it was essential that the randomized stopping time \(\xi ^{r(\omega ,s)}\) is also a valid randomized stopping time of the Markov process in question when started at a different time but the same location \(\omega (s)\), we now need that \(\xi ^{r(\omega ,s)}\) will also be a randomized stopping time of our Markov process when started at the same time s but in a different place. Of course, when we are talking about Brownian motion both are true, but this difference is the reason why in the case of the Skorokhod embedding the right class of processes to generalize the argument to is that of Feller processes while in our setup we don’t need our processes to be time-homogeneous but we do need them to be space-homogeneous. That we are able to plant this “bush” \(\xi ^{r(\omega ,s)}\) in another location is what guarantees that the measure \(\xi _1^\pi \) defined in the proof of Theorem 5.8 of [6] is again a randomized stopping time.

Whereas in the Skorokhod case the task is to show that the new better randomized stopping time \(\xi ^\pi \) embeds the same distribution as \(\xi \) we now have to show that the randomized stopping time we construct has the same distribution as \(\xi \). The argument works along the same lines though—instead of using that \(\left( (\omega ,s),(\eta ,t)\right) \in \widehat{\mathsf {SG}}^\xi \) implies \(\omega (s)=\eta (t)\) we now use that \(\left( (\omega ,s),(\eta ,t)\right) \in \widehat{\mathsf {SG}}^\xi \) implies \(s=t\). \(\square \)

We now present the argument in more detail.

As may be clear by now, what we will show is that if \(\xi \in \mathsf {RST}_{\lambda }(\mu )\) is a solution of \((\textsc {OptStop'})\), then there is a measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted set \(\Gamma \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) such that \(\mathsf {SG}\cap \left( \Gamma ^< \times \Gamma \right) = \emptyset \). Using Lemma 5.3 this implies Theorem 3.6.

We need to make some preparations. To align the notation with [6] and to make some technical steps easier it is useful to have another characterization of measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted processes and sets. To this end define

Definition 6.1

r has many right inverses. A simple one is

$$\begin{aligned} r'&: S \rightarrow C(\mathbb {R}_{+})\times \mathbb {R}_{+}\\ r'(f,s)&:= \left( t \mapsto {\left\{ \begin{array}{ll} f(t) &{} \quad \text {for} t \le s \\ f(s) &{} \quad \text {for} t > s \end{array}\right. },s \right) . \end{aligned}$$

We endow S with the sigma algebra generated by \(r'\).

[6, Theorem 3.2], which is a direct consequence of [15, Theorem IV. 97], asserts that a process X is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted iff X factors as \(X=X'\circ r\) for a measurable function \(X' : S \rightarrow \mathbb {R}\). So a set \(D \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted iff \(D = r^{-1}\left[ D'\right] \) for some measurable \(D' \subseteq S\).

Note that \(r(\omega ,t) = r(\omega ',t')\) implies \((\omega ,t) \odot \theta = (\omega ',t') \odot \theta \) and therefore

$$\begin{aligned} \mathsf {SG}&=(r \times r)^{-1}\left[ \mathsf {SG}'\right] \end{aligned}$$

for a set \(\mathsf {SG}' \subseteq S \times S\) which is described by an expression almost identical to that in Definition 3.4. Namely we can overload \(\odot \) to also be the name for the operation whose first operand is an element of S, such that \((\omega ,t) \odot \theta = r(\omega ,t) \odot \theta \) and note that as \(c\) is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted we can write \(c= c'\circ r\) and thus get a cost function \( c'\) which is defined on S.

Given an optimal \(\xi \in \mathsf {RST}_{\lambda }(\mu )\) we may therefore rephrase our task as having to find a measurable set \(\Gamma \subseteq S\) such that \(r_*(\xi )\) is concentrated on \(\Gamma \) and that \(\mathsf {SG}' \cap \left( \Gamma ^< \times \Gamma \right) = \emptyset \), where .

Note that for \(\Gamma \subseteq S\) although \(\left( r^{-1}\left[ \Gamma \right] \right) ^<\) is not equal to \( r^{-1}\left[ \Gamma ^<\right] \) we still have \(\mathsf {SG}\cap \left( r^{-1}\left[ \Gamma ^<\right] \times r^{-1}\left[ \Gamma \right] \right) = \emptyset \) iff \(\mathsf {SG}\cap \left( (r^{-1}\left[ \Gamma \right] )^< \times r^{-1}\left[ \Gamma \right] \right) = \emptyset \).

One of the main ingredients of the proof of [6, Theorem 1.3] and of our Theorem 3.6 is a procedure whereby we accumulate many infinitesimal changes to a given randomized stopping time \(\xi \) to build a new stopping time \(\xi ^\pi \). The guiding intuition for the authors is to picture these changes as replacing certain “branches” of the stopping time \(\xi \) by different branches. Some of these branches will actually enter the statement of a somewhat stronger theorem (Theorem 6.8 below), so we begin by describing these. Our way to get a handle on “branches”—i.e. infinitesimal parts of a randomized stopping time—is to describe them through a disintegration (wrt \(\mathbb {W}^{0}_{\lambda }\)) of the randomized stopping time. We need the following statement from [6] which should also serve to provide more intuition on the nature of randomized stopping times.

Lemma 6.2

[6, Theorem 3.8] Let \(\xi \) be a measure on \(C(\mathbb {R}_{+})\times \mathbb {R}_{+}\). Then \( \xi \in \mathsf {RST}_{\lambda } \) iff there is a disintegration \((\xi _{\omega })_{\omega \in C(\mathbb {R}_{+})}\) of \(\xi \) wrt \(\mathbb {W}^{0}_{\lambda }\) such that \((\omega ,t) \mapsto \xi _\omega ([0,t])\) is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted and maps into [0, 1].

Using Lemma 6.2 above let us fix for the rest of this section both \(\xi \in \mathsf {RST}_{\lambda }(\mu )\) and a disintegration \(\left( \xi _{\omega }\right) _{\omega \in C(\mathbb {R}_{+})}\) with the properties above. Both Definition 6.3 below and Theorem 6.8 implicitly depend on this particular disintegration and we emphasize that whenever we write \(\xi _{\omega }\) in the following we are always referring to the same fixed disintegration with the properties given in Lemma 6.2. Note that the measurability properties of \(\left( \xi _{\omega }\right) _{\omega \in C(\mathbb {R}_{+})}\) imply that for any \( I \subseteq [0,s] \) we can determine \( \xi _\omega (I) \) from alone. For \((f,s) \in S\) we will again overload notation and use \(\xi _{(f,s)} \) to refer to the measure on [0, s] which is equal to for any \( \omega \in C(\mathbb {R}_{+})\) such that \(r(\omega ,s) = (f,s)\).

Definition 6.3

(conditional randomized stopping time) Let \((f,s) \in S\). We define a new randomized stopping time by setting


for all bounded measurable \(F : C([s, \infty ))\times [s, \infty )\rightarrow \mathbb {R}\), i.e. \((\xi ^{(f,s)}_{\omega })_{\omega \in C([s, \infty ))}\) is the disintegration of \( \xi ^{(f,s)} \) wrt \(\mathbb {W}^{s}_{0}\).

Here \(\delta _s\) is the Dirac measure concentrated at s. Really, the definition in the case where \( \xi _{(f,s)}([0,s]) = 1 \) is somewhat arbitrary—it’s more a convenience to avoid partially defined functions. What we will use is that .

Definition 6.4

(relative Stop-Go pairs) The set \(\mathsf {SG}^\xi \) consists of all \(\left( (f,t), (g,t)\right) \in S \times S\) (again the times have to match) such that either

$$\begin{aligned} c'(f,t) + {\int }c((g,t) \odot \theta , u) \,d\xi ^{(f,t)}(\theta ,u) < c'(g,t) + {\int }c((f,t) \odot \theta , u) \,d\xi ^{(f,t)}(\theta ,u) \end{aligned}$$

or any one of

  1. 1.

    \(\xi ^{(f,t)}\left( C(\mathbb {R}_{+})\times \mathbb {R}_{+}\right) < 1\) or \({\int }s^{p_0} \,d\xi ^{(f,t)}(\theta ,s) = \infty \)

  2. 2.

    the integral on the right hand side equals \(\infty \)

  3. 3.

    either of the integrals is not defined

holds. We also define

$$\begin{aligned} \widehat{\mathsf {SG}}^\xi := \mathsf {SG}^\xi \cup \left\{ (f,s) \in S : \xi _{(f,s)}([0,s]) = 1 \right\} \times S \end{aligned}$$

Lemma 6.6 below says that the numbered cases above are exceptional in an appropriate sense and one may consider them a technical detail. Note that when we say \(\left( (f,t),(g,t)\right) \in \mathsf {SG}^\xi \) we are implicitly saying that \( \xi _{(f,t)}([0,t]) < 1 \).

Note that the sets \(\mathsf {SG}^\xi \) and \(\widehat{\mathsf {SG}}^\xi \) are measurable (in contrast to \(\mathsf {SG}\), which may be more complicated).

Definition 6.5

We call a measurable set \(F \subseteq S\) evanescent if \(r^{-1}\left[ F\right] \) is evanescent, that is, if \(\mathbb {W}^{0}_{\lambda }\left( \mathsf {proj}_{C(\mathbb {R}_{+})}\left[ r^{-1}\left[ F\right] \right] \right) = 0\).

Lemma 6.6

[6, Lemma 5.2] Let \(F: C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\) be some measurable function for which \({\int }F \,d\xi \in \mathbb {R}\). Then the following sets are evanescent.

  • \(\left\{ (f,s) \in S : \xi ^{(f,s)}\left( C(\mathbb {R}_{+})\times \mathbb {R}_{+}\right) < 1 \right\} \)

  • \(\left\{ (f,s) \in S : {\int }F((f,s) \odot \theta ,u) \,d\xi ^{(f,s)}(\theta ,u) \not \in \mathbb {R}\right\} \)


See [6].\(\square \)

Lemma 6.7

[6, Lemma 5.4]

$$\begin{aligned} \mathsf {SG}' \subseteq \widehat{\mathsf {SG}}^\xi \end{aligned}$$


Can be found in [6]. Note that they fix \(p_0 = 1\). \(\square \)

Theorem 6.8

Assume that \( \xi \) is a solution of \((\textsc {OptStop'})\). Then there is a measurable set \( \Gamma \subseteq S \) such that \( r_*(\xi )(\Gamma ) = 1 \) and

$$\begin{aligned} \widehat{\mathsf {SG}}^\xi \cap \left( \Gamma ^< \times \Gamma \right) = \emptyset \text {.}\end{aligned}$$

Our argument follows [6, Theorem 5.7]. We also need the following two auxilliary propositions, which in turn require some definitions.

Definition 6.9

Let \(\upsilon \) be a probability measure on some measure space Y. The set \(\mathsf {JOIN}_{\lambda }(\upsilon )\) is the set of all subprobability measures \(\pi \) on \((C(\mathbb {R}_{+})\times \mathbb {R}_{+}) \times Y\) such that

Proposition 6.10

Let \( \xi \) be a solution of \((\textsc {OptStop'})\). Then \( \left( r \times \mathsf {Id}\right) _*(\pi )(\mathsf {SG}^\xi ) = 0 \) for all \( \pi \in \mathsf {JOIN}_{\lambda }(r_*(\xi )) \).

Here we use \(\times \) to denote the Cartesian product map, i.e. for sets \(X_i,Y_i\) and functions \(F_i : X_i \rightarrow Y_i\) where \(i \in \{1,2\}\) the map \(F_1 \times F_2 : X_1 \times X_2 \rightarrow Y_1 \times Y_2\) is given by \((F_1 \times F_2)(x_1,x_2) = (F_1(x_1),F_2(x_2))\). Proposition 6.10 is an analogue of [6, Proposition 5.8] and it is where the material changes compared to [6] take place. We will give the proof at the end of this section.

Proposition 6.11

[6, Proposition 5.9] Let \((Y, \upsilon )\) be a Polish probability space and let \( E \subseteq S \times Y \) be a measurable set. Then the following are equivalent

  1. 1.

    \( \left( r \times \mathsf {Id}\right) _*(\pi )(E) = 0 \) for all \(\pi \in \mathsf {JOIN}_{\lambda }(\upsilon )\)

  2. 2.

    \( E \subseteq (F \times Y) \cup (S \times N) \) for some evanescent set \(F \subseteq S\) and a measurable set \(N \subseteq Y\) which satisfies \(\upsilon (N) = 0\).

Proposition 6.11 is proved in [6] and we will not repeat the proof here.

Proof of Theorem 6.8

Using Proposition 6.10 we see that \( \left( r \times \mathsf {Id}\right) _*(\pi )(\mathsf {SG}^\xi ) = 0 \) for all \( \pi \in \mathsf {JOIN}_{\lambda }(r_*(\xi )) \). Plugging this into Proposition 6.11 we find an evanescent set \(F_1 \subseteq S\) and a set \( N \subseteq S\) such that \(r_*(\xi )(N) = 0\) and \(\mathsf {SG}^\xi \subseteq (F_1 \times S) \cup (S \times N)\). Defining for any Borel set \(E \subseteq S\) the analytic set

we observe that \( \left( (E^>)^c\right) ^< \subseteq E^c \) and find \(r_*(\xi )(F_1^>) = 0\).

Setting \(F_2 := \left\{ (f,s) \in S : \xi _{(f,s)}([0,s]) = 1 \right\} \) and arguing on the disintegration \(\left( \xi _\omega \right) _{\omega \in C(\mathbb {R}_{+})}\) we see that \( r_*(\xi )(F_2^>) = 0 \), so \(r_*(\xi )(F^>) = 0\) for \(F := F_1 \cup F_2\).

This shows that \(S {\setminus } (N \cup F^>)\) has full \(r_*(\xi )\)-measure. Let \(\Gamma \) be a Borel subset of that set which also has full \(r_*(\xi )\)-measure.


$$\begin{aligned} \Gamma ^< \times \Gamma&\subseteq \left( (F^>)^c\right) ^< \times N^c \subseteq F^c \times N^c \text { and}\\ \widehat{\mathsf {SG}}^\xi&\subseteq (F \times S) \cup (S \times N) \end{aligned}$$

which shows \( \widehat{\mathsf {SG}}^\xi \cap \left( \Gamma ^< \times \Gamma \right) = \emptyset \). \(\square \)

Lemma 6.12

If \( \alpha \in \mathsf {RST}_{\lambda } \) and \( G : C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow [0,1] \) is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted, then the measure defined by

$$\begin{aligned} F \mapsto {\int }F(\omega , t) G(\omega , t) \,d\alpha (\omega ,t) \end{aligned}$$

is still in \( \mathsf {RST}_{\lambda } \).


We use the criterion in Lemma 6.2. Let \( (\alpha _\omega )_{\omega \in C(\mathbb {R}_{+})} \) be a disintegration of \( \alpha \) wrt \( \mathbb {W}^{0}_{\lambda } \) for which \( (\omega ,t) \mapsto \alpha _\omega ([0,t]) \) is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted and maps into [0, 1]. Then \((\hat{\alpha }_\omega )_\omega \) defined by \( \hat{\alpha }_{\omega } := F \mapsto {\int }F(t) G(\omega ,t) \,d\alpha _{\omega }(t) \) is a disintegration of the measure in (6.5) for which \((\omega ,t) \mapsto \hat{\alpha }_\omega ([0,t]) \) is measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted and maps into [0, 1]. \(\square \)

Lemma 6.13

(Strong Markov property for RSTs) Let \( \alpha \in \mathsf {RST}_{\lambda } \). Then

$$\begin{aligned} {\int }F(\omega ,t) \,d\alpha (\omega ,t) = {\iint }F((\omega ,t) \odot \tilde{\omega }, t) \,d\mathbb {W}^{t}_{0}(\tilde{\omega }) \,d\alpha (\omega ,t) \end{aligned}$$

for all bounded measurable \( F : C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\).


Using integral notation instead of the more conventional \(\mathbb E\), we may write the classical form of the strong markov property as

$$\begin{aligned}&{\int }G\left( \Theta _{\tau (\omega )}(\omega )\right) H(\omega ) \cdot 1_{\mathbb {R}_{+}}(\tau (\omega )) \,d\mathbb {W}^{0}_{\lambda }(\omega ) \\&\quad ={\iint }G(\tilde{\omega }) H(\omega ) \cdot 1_{\mathbb {R}_{+}}(\tau (\omega )) \,d\mathbb {W}^{\tau (\omega )}_{\omega (\tau (\omega ))}(\tilde{\omega }) \,d\mathbb {W}^{0}_{\lambda }(\omega ) \end{aligned}$$

for all bounded measurable \(G : C(\mathbb {R}_{+})\rightarrow \mathbb {R}\) and all bounded \(\mathcal {F}^0_\tau \)-measurable \(H : C(\mathbb {R}_{+})\rightarrow \mathbb {R}\). Here \(\Theta _t\) is the function which cuts off the initial segment of a path up to time t. From this a simple monotone class argument shows that

$$\begin{aligned}&{\int }K\left( \Theta _{\tau (\omega )}(\omega ),\omega \right) \cdot 1_{\mathbb {R}_{+}}(\tau (\omega )) \,d\mathbb {W}^{0}_{\lambda }(\omega ) \\&\quad = {\iint }K(\tilde{\Omega },\omega ) \cdot 1_{\mathbb {R}_{+}}(\tau (\omega )) \,d\mathbb {W}^{\tau (\omega )}_{\omega (\tau (\omega ))}(\tilde{\Omega }) \,d\mathbb {W}^{0}_{\lambda }(\omega ) \end{aligned}$$

for all bounded \(\mathcal {F}^0_\infty \otimes \mathcal {F}^0_\tau \)-measurable \(K : C(\mathbb {R}_{+})\times C(\mathbb {R}_{+})\) \(\rightarrow \mathbb {R}\).

We may then choose for \(K(\tilde{\omega }, \omega )\) the function \(F(\eta , \tau (\omega ))\) where the path \(\eta \) is created by cutting off the tail of \(\omega \) after time \(\tau (\omega )\) and attaching \(\tilde{\omega }\) in its place. Noting the relationship between \(\mathbb {W}^{\tau (\omega )}_{x}\) and \(\mathbb {W}^{\tau (\omega )}_{0}\) we then get

$$\begin{aligned}&{\int }F(\omega ,\tau (\omega )) \cdot 1_{\mathbb {R}_{+}}(\tau (\omega )) \,d\mathbb {W}^{0}_{\lambda }(\omega )\\&\quad = {\iint }F((\omega ,\tau (\omega )) \odot \tilde{\Omega },\tau (\omega )) \cdot 1_{\mathbb {R}_{+}}(\tau (\omega )) \,d\mathbb {W}^{\tau (\omega )}_{0}(\tilde{\Omega }) \,d\mathbb {W}^{0}_{\lambda }(\omega ) \text {.}\end{aligned}$$

Using Lemma 5.3 with \(\Omega = [0,1] \times C(\mathbb {R}_{+})\) and we find a -stopping time \(\tau \) s.t. we may write \(\alpha \) as

(where \(\mathcal {L}\) is Lebesgue measure on [0, 1]). For a fixed \(y \in [0,1]\), \(\omega \mapsto \tau (y,\omega )\) is an \((\mathcal {F}^0_{t})_{t \ge 0}\)-stopping time, so we may apply the previous equation to these stopping times and integrate over \(y \in [0,1]\) to get

$$\begin{aligned}&{\int }F(\omega ,\tau (y,\omega )) \cdot 1_{\mathbb {R}_{+}}(\tau (y,\omega )) \,d(\mathcal {L}\otimes \mathbb {W}^{0}_{\lambda })(y,\omega )\\&\quad = {\iint }F((\omega ,\tau (y,\omega )) \odot \tilde{\Omega },\tau (y,\omega )) \cdot 1_{\mathbb {R}_{+}}(\tau (y,\omega )) \,d\mathbb {W}^{\tau (y,\omega )}_{0}(\tilde{\Omega }) \,d(\mathcal {L}\otimes \mathbb {W}^{0}_{\lambda }) (y,\omega ) \text {.}\end{aligned}$$

Using the equation for \(\alpha \) we see that this is what we wanted to prove. \(\square \)

Lemma 6.14

(Gardener’s Lemma) Assume that we have \(\xi \in \mathsf {RST}_{\lambda }(\mathcal {P})\), a measure \(\alpha \) on \(C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) and two families \( \beta ^{(\omega ,t)} \), \( \gamma ^{(\omega ,t)} \), where \( (\omega ,t) \in C(\mathbb {R}_{+})\times \mathbb {R}_{+}\), with such that both maps

$$\begin{aligned} (\omega ,t)&\mapsto {\int }1_{D}\left( (\omega ,t) \odot \tilde{\omega },s\right) \,d\beta ^{(\omega ,t)}(\tilde{\omega },s) \text { and } \\ (\omega ,t)&\mapsto {\int }1_{D}\left( (\omega ,t) \odot \tilde{\omega },s\right) \,d\gamma ^{(\omega ,t)}(\tilde{\omega },s) \end{aligned}$$

are measurable for all Borel \(D \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) and that

$$\begin{aligned} \xi (D) - {\iint }1_{D}\left( (\omega ,t) \odot \tilde{\omega },s\right) \,d\beta ^{(\omega ,t)}(\tilde{\omega },s) \,d\alpha (\omega ,t) \ge 0 \end{aligned}$$

for all Borel \(D \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\). Then for \(\hat{\xi }\) defined by

$$\begin{aligned} {\int }F \,d\hat{\xi } := {\int }F \,d\xi&- {\iint }F((\omega ,t) \odot \tilde{\omega },s) \,d\beta ^{(\omega ,t)}(\tilde{\omega },s) \,d\alpha (\omega ,t) \\&+ {\iint }F((\omega ,t) \odot \tilde{\omega },s) \,d\gamma ^{(\omega ,t)}(\tilde{\omega },s) \,d\alpha (\omega ,t) \end{aligned}$$

for all bounded measurable F we have \(\hat{\xi } \in \mathsf {RST}_{\lambda }(\mathcal {P})\).

Remark 6.15

The intuition behind the Gardener’s Lemma is that we are replacing certain branches \( \beta ^{(\omega ,t)} \) of the randomized stopping time \( \xi \) by other branches \( \gamma ^{(\omega ,t)} \) to obtain a new stopping time \( \hat{\xi } \). This process happens along the measure \(\alpha \). Note that (6.6) implies that \({\int }1_{D}\left( (\omega ,t) \odot \tilde{\omega }\right) \,d\mathbb {W}^{t}_{0}(\tilde{\omega }) \,d\alpha (\omega ,t) \le \mathbb {W}^{0}_{\lambda }(D) \) for all Borel \(D \subseteq C(\mathbb {R}_{+})\). The authors like to think of \(\alpha \) as a stopping time and of the maps \((\omega ,t) \mapsto \beta ^{(\omega ,t)}\) and \((\omega ,t) \mapsto \gamma ^{(\omega ,t)}\) as adapted (in some sense that would need to be made precise). As these assumptions aren’t necessary for the proof of the Gardener’s Lemma, they were left out, but it might help the reader’s intuition to keep them in mind.

Proof of Lemma 6.14

We need to check that the \(\hat{\xi }\) we define is indeed a measure, that \((\mathsf {proj}_{C(\mathbb {R}_{+})})_*(\hat{\xi }) = \mathbb {W}^{0}_{\lambda }\) and that (5.1) holds for \(\hat{\xi }\).

Checking that \(\hat{\xi }\) is a measure is routine—we just note that (6.6) guarantees that \(\hat{\xi }(D) \ge 0 \) for all Borel D.

Let \(G: C(\mathbb {R}_{+})\rightarrow \mathbb {R}\) be a bounded measurable function.

$$\begin{aligned} {\int }G(\omega ) \,d\hat{\xi }(\omega ,t)&= {\int }G(\omega ) \,d\xi (\omega ,t) - {\iint }G((\omega ,t) \odot \tilde{\omega }) \,d\beta ^{(\omega ,t)}(\tilde{\omega },s) \,d\alpha (\omega ,t) \\&\quad + {\iint }G((\omega ,t) \odot \tilde{\omega }) \,d\gamma ^{(\omega ,t)}(\tilde{\omega },s) \,d\alpha (\omega ,t) \\&= {\int }G \,d\mathbb {W}^{0}_{\lambda } - {\iint }G((\omega ,t) \odot \tilde{\omega }) \,d\mathbb {W}^{t}_{0} \,d\alpha (\omega ,t) \\&\quad + {\iint }G((\omega ,t) \odot \tilde{\omega }) \,d\mathbb {W}^{t}_{0} \,d\alpha (\omega ,t) \\&= {\int }G \,d\mathbb {W}^{0}_{\lambda } \end{aligned}$$

Now let \(F : \mathbb {R}_{+}\rightarrow \mathbb {R}\) and \(G: C(\mathbb {R}_{+})\rightarrow \mathbb {R}\) be bounded continuous functions, with F supported on [0, r].


The first summand is 0 because \(\xi \in \mathsf {RST}_{\lambda }(\mathcal {P})\). Looking at the second summand we expand the definition of .

whenever \(t \le r\), which is the case for those t which are relevant in the integrand above, because \( F(s) \ne 0 \) implies \( s \le r \) and moreover \(\beta ^{(\omega ,t)}\) is concentrated on \((\tilde{\omega },s)\) for which \( t \le s \).

Setting \(\hat{G}^{(\omega ,t)}(\tilde{\omega }) := G((\omega ,t) \odot \tilde{\omega })\) and we can write

which is 0 because and therefore

for all \((\omega ,t)\) and \(r \ge t\). The same argument works for the third summand in (6.7). \(\square \)

Proof of Proof of Proposition 6.10

We prove the contrapositive. Assuming that there exists a \( \pi ' \in \mathsf {JOIN}_{\lambda }(r_*(\xi )) \) with \( \left( r \times \mathsf {Id}\right) _*(\pi ')(\mathsf {SG}^\xi ) > 0 \), we construct a \( \xi ^\pi \in \mathsf {RST}_{\lambda }(\mu ) \) such that \( {\int }c\,d\xi ^\pi < {\int }c\,d\xi \).

If \(\pi ' \in \mathsf {JOIN}_{\lambda }(r_*(\xi ))\), then for any two measurable sets \(D_1,D_2 \subseteq S\), because and by making use of Lemma 6.12 we can deduce that . Using the monotone class theorem this extends to any measurable subset of \(S \times S\) in place of \(D_1 \times D_2\). So we can set and know that \((\mathsf {proj}_{C(\mathbb {R}_{+})\times \mathbb {R}_{+}})_*(\pi ) \in \mathsf {RST}_{\lambda }\) and that \(\pi \) is concentrated on \(\mathsf {SG}^\xi \).

We will be using a disintegration of \( \pi \) wrt \(r(\xi )\), which we call \( \left( \pi _{(g,t)}\right) _{(g,t) \in S} \) and for which we assume that \(\pi _{(g,t)}\) is a subprobability measure for all \((g,t) \in S\). It will also be useful to assume that \( \pi _{(g,t)} \) is concentrated on the set \( \{ (\omega ,s) \in C(\mathbb {R}_{+})\times \mathbb {R}_{+}: s = t \} \) not just for \( r(\xi ) \)-almost all (gt) but for all (gt) . Again this is no restriction of generality. We will also push \( \pi \) onto \( \left( C(\mathbb {R}_{+})\times \mathbb {R}_{+}\right) \times \left( C(\mathbb {R}_{+})\times \mathbb {R}_{+}\right) \), defining a measure \(\bar{\pi }\) via

$$\begin{aligned} {\int }F \,d\bar{\pi } := {\iint }F\left( (\omega ,s),((g,t) \odot \tilde{\eta }, t)\right) \,d\mathbb {W}^{t}_{0}(\tilde{\eta }) \,d\pi \left( (\omega ,s),(g,t)\right) \end{aligned}$$

for all bounded measurable F. Observe that by Lemma 6.13 the pushforward of \(\pi \) under projection onto the second coordinate (pair) is \(\xi \) and that a disintegration of \(\bar{\pi }\) wrt to \(\xi \) (again in the second coordinate) is given by \(\left( \pi _{r(\eta ,t)}\right) _{(\eta ,t) \in C(\mathbb {R}_{+})\times \mathbb {R}_{+}}\). Let us name \((\mathsf {proj}_{C(\mathbb {R}_{+})\times \mathbb {R}_{+}})_*(\pi ) =: \zeta \in \mathsf {RST}_{\lambda } \). We will now use the Gardener’s Lemma to define two modifications \(\xi _{0}^\pi \), \(\xi _{1}^\pi \) of \(\xi \) such that \(\xi ^\pi := \frac{1}{2}(\xi _{0}^\pi + \xi _{1}^\pi )\) is our improved randomized stopping time.

For all bounded measurable \(F : C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\) define

The concatenation on the last line is well-defined \(\bar{\pi }\)-almost everywhere because \(\bar{\pi }\) is concentrated on \( (r \times r)^{-1}\left[ \mathsf {SG}^\xi \right] \) and so in the integrand above \(s = t\) on a set of full measure.

We need to check that the Gardener’s Lemma applies in both cases. First of all observe that the product measure \( \mathbb {W}^{t}_{0} \otimes \delta _t \) is in and that Lemma 6.13 implies

$$\begin{aligned} {\int }F(\omega ,t) \,d\alpha (\omega ,t) = {\iint }F((\omega ,t) \odot \tilde{\omega }, s) \,d\left( \mathbb {W}^{t}_{0} \otimes \delta _t\right) (\tilde{\omega },s) \,d\alpha (\omega ,t) \text {.}\end{aligned}$$

for any randomized stopping time \(\alpha \). So for \(\xi _{0}^\pi \) the measures \(\gamma ^{(\omega ,t)}\) are given by \( \mathbb {W}^{t}_{0} \otimes \delta _t \) and for \(\xi _{1}^\pi \) the measures \(\beta ^{(\omega ,t)}\) are given by \( \mathbb {W}^{t}_{0} \otimes \delta _t \).

For \( \xi _{0}^\pi \) the measure along which we are replacing branches is given by

$$\begin{aligned} F \mapsto {\int }F(\omega ,s) (1-\xi _{\omega }([0,s])) \,d\zeta (\omega ,s) \text {.}\end{aligned}$$

The branches \( \beta ^{(\omega ,s)} \) we remove are \(\xi ^{r(\omega ,s)} \). We need to check that

$$\begin{aligned} {\int }F \,d\xi - {\int }(1-\xi _{\omega }([0,s])) {\int }F((\omega ,s) \odot \tilde{\omega }, u) \,d\xi ^{r(\omega ,s)}(\tilde{\omega },u) \,d\zeta (\omega ,s) \ge 0 \end{aligned}$$

for all positive, bounded, measurable \( F : C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\). Let us calculate.

Here we first used the definition of \( \xi ^{r(\omega ,s)} \) and then Lemma 6.13 and finally that \((\mathsf {proj}_{C(\mathbb {R}_{+})})_*(\zeta ) \le \mathbb {W}^{0}_{\lambda }\).

For \(\xi _{1}^\pi \) we replace branches along

$$\begin{aligned} F&\mapsto {\int }F(\eta ,t) (1-\xi _{\omega }([0,s])) \,d\bar{\pi }\left( (\omega ,s),(\eta ,t)\right) \\&= {\int }F(\eta ,t) {\int }(1-\xi _{\omega }([0,s])) \,d\pi _{r(\eta ,t)}(\omega ,s) \,d\xi (\eta ,t) \text {.}\end{aligned}$$

The calculation above shows that

$$\begin{aligned} {\int }F \,d\xi - {\int }(1-\xi _{\omega }([0,s])) F(\eta ,t) \,d\bar{\pi }\left( (\omega ,s),(\eta ,t)\right) \ge 0 \end{aligned}$$

for all positive, bounded, measurable \( F : C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\). For \(\xi _{1}^\pi \) the branches \(\gamma ^{(\eta ,t)}\) that we add are given by

$$\begin{aligned} F \mapsto \frac{ {\int }(1-\xi _{\omega }([0,s])) {\int }F(\tilde{\omega },u) \,d\xi ^{r(\omega ,s)}(\tilde{\omega },u) \,d\pi _{r(\eta ,t)}(\omega ,s) }{ {\int }(1-\xi _{\omega }([0,s])) \,d\pi _{r(\eta ,t)}(\omega ,s) } \end{aligned}$$

when \({\int }(1-\xi _{\omega }([0,s])) \,d\pi _{r(\eta ,t)}(\omega ,s) > 0\) and \( \delta _t \) otherwise (again, the latter is arbitrary). In the more interesting case \( \gamma ^{(\eta ,t)} \) is an average over elements of and therefore itself in . Here it is again crucial that for \( \pi _{r(\eta ,t)} \)-almost all \((\omega ,s)\) we have \(s = t\), otherwise we would be averaging randomized stopping times of our process started at unrelated times.

Putting this together we see that \(\xi ^\pi := \frac{1}{2}(\xi _{0}^\pi + \xi _{1}^\pi )\) is a randomized stopping time and that

$$\begin{aligned}&2 {\int }F \,d(\xi ^\pi - \xi ) = {\int }(1-\xi _{\omega }([0,s])) \Big ( F(\omega ,s) - {\int }F((\omega ,s) \odot \tilde{\omega }, u) \,d\xi ^{r(\omega ,s)}(\tilde{\omega },u) \nonumber \\&\quad - F(\eta ,t) + {\int }F((\eta ,t) \odot \tilde{\omega },u) \,d\xi ^{r(\omega ,s)}(\tilde{\omega }, u) \Big ) \,d\bar{\pi }((\omega ,s),(\eta ,t)) \end{aligned}$$

for all bounded measurable \( F : C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\). Specializing to \(F(\omega ,s) = G(s)\) for \(G : \mathbb {R}_{+}\rightarrow \mathbb {R}\) bounded measurable we find that

$$\begin{aligned} {\int }G(s) \,d(\xi -\xi ^\pi )(\omega ,s) = 0 \text { ,} \end{aligned}$$

again because for \(\bar{\pi }\)-almost all \(\left( (\omega ,s),(\eta ,t)\right) \) we have \(s=t\). This shows that \(\xi ^\pi \in \mathsf {RST}_{\lambda }(\mu )\).

We now want to extend (6.8) to \(c\). We first show that (6.8) also holds for \(F: C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\) which are measurable and positive and for which \({\int }F \,d\xi < \infty \). To see this, approximate such an F from below by bounded measurable functions (for which (6.8) holds) and note that by previous calculations both

Looking at positive and negative parts of \(c\) and using Assumption 2.4 to see that \( {\int }c_{-} \,d(\xi ^\pi -\xi ) \in \mathbb {R}\) we get that indeed (6.8) holds for \(F = c\).

Now we will argue that the integrand in the right hand side of (6.8) is negative \(\bar{\pi }\)-almost everywhere. This will conclude the proof.

By inserting an r in appropriate places we can read off from Definition 6.4 what it means that \(\bar{\pi }\) is concentrated on \((r \times r)^{-1}\left[ \mathsf {SG}^\xi \right] \). In the course of verifying that (6.8) applies to \(c\) we already saw that cases 2 and 3 in Definition 6.4 can only occur on a set of \(\bar{\pi }\)-measure 0. Lemma 6.6 excludes case 1 \(\bar{\pi }\)-almost everywhere. This means that (6.2) holds \(\bar{\pi }\)-almost everywhere—or more correctly, that for \(\bar{\pi }\)-a.a. \(((\omega ,s),(\eta ,t))\) we have \(s=t\) and

$$\begin{aligned}&c(\omega ,s) - {\int }c((\omega ,s) \odot \tilde{\omega }, u) \,d\xi ^{r(\omega ,s)}(\tilde{\omega },u) \nonumber \\&\quad - c(\eta ,t) + {\int }c((\eta ,t) \odot \tilde{\omega }, u) \,d\xi ^{r(\omega ,s)}(\tilde{\omega },u) < 0 \text {,}\end{aligned}$$

completing the proof. \(\square \)

Variations on the theme

We proceed to prove Corollary 1.2. This is closely modelled on the treatment of the Azema-Yor embedding in [6, Theorem 6.5]. As is the case there we run into a technical obstacle, though one which can be overcome by combining the ideas we have already seen in slightly new ways.

To demonstrate the problem let us begin an attempt to prove Corollary 1.2. Again, we read off \(c(\omega ,t) = -\omega ^*(t)\), with \(\omega ^*(t) = \sup _{s \le t}\omega (s)\). We may use Theorem 3.1 to find a solution \(\tau \) of the problem \((\textsc {OptStop}^{B^*_{t}})\) and we use Theorem 3.6 to find a set \(\Gamma \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\) for which \(\mathbb {P}[(B,\tau ) \in \Gamma ] = 1\) and \(\mathsf {SG}\cap (\Gamma ^<\times \Gamma ) = \emptyset \). Now we would like to apply Lemma 3.7 with \(Y_t(\omega ) = \omega (t) - \omega ^*(t)\), as proposed by Corollary 1.2, so we want to prove that \(\omega (t) - \omega ^*(t) < \eta (t) - \eta ^*(t)\) implies \(((\omega ,t),(\eta ,t)) \in \mathsf {SG}\). Let us do the calculations. We start with an \((\mathcal {F}_{t}^{s})_{s \ge t}\)-stopping time \(\sigma \), for which \(\mathbb {W}^{t}_{0}(\sigma = t) < 1\), \(\mathbb {W}^{t}_{0}(\sigma = \infty ) = 0\) and for which both sides in (3.2) are defined and finite. To reduce clutter, let us name \((\omega \mapsto (\omega ,\sigma (\omega )))_*(\mathbb {W}^{t}_{0}) =: \alpha \), so that (3.2), which we want to prove, reads

$$\begin{aligned} - \omega ^*(t) + {\int }((\omega ,t) \odot \theta )^*(s) \,d\alpha (\theta ,s) < - \eta ^*(t) + {\int }((\eta ,t) \odot \theta )^*(s) \,d\alpha (\theta ,s) \end{aligned}$$

We may rewrite the left hand side as

$$\begin{aligned}&{\int }\Big (\omega ^*(t) \vee \big (\omega (t) + \theta ^*(s)\big )\Big ) - \omega ^*(t) \,d\alpha (\theta ,s) \\&\quad ={\int }0 \vee \big (\omega (t) - \omega ^*(t) + \theta ^*(s) \big ) \,d\alpha (\theta ,s) \text {.}\end{aligned}$$

For the right hand side we get the same expression with \(\omega \) replaced by \(\eta \). Looking at the integrands we see that if

$$\begin{aligned} 0 < \eta (t) - \eta ^*(t) + \theta ^*(s) \end{aligned}$$


$$\begin{aligned} 0 \vee \big ( \omega (t) - \omega ^*(t) + \theta ^*(s) \big ) < 0 \vee \big ( \eta (t) - \eta ^*(t) + \theta ^*(s) \big ) \text {,}\end{aligned}$$

but in the other case

$$\begin{aligned} 0 \vee \big ( \omega (t) - \omega ^*(t) + \theta ^*(s) \big ) = 0 = 0 \vee \big ( \eta (t) - \eta ^*(t) + \theta ^*(s) \big ) \text {.}\end{aligned}$$

So if (7.2) holds for \((\theta ,s)\) from a set of positive \(\alpha \)-measure, then we proved what we wanted to prove. But if \( \theta ^*(s) \le \eta ^*(t) - \eta (t) \) for \(\alpha \)-a.a. \((\theta ,s)\) then in (3.2) we have equality instead of strict inequality.

As in [6, Theorem 6.5], one way of getting around this is to introduce a secondary optimization criterion. One way to explain the idea of secondary optimization is to think about what happens if, instead of considering a cost function \(c: C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\) we consider a cost function \(c: C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}^n\). Of course, to be able to talk about optimization, we will then want to have an order on \(\mathbb {R}^n\). For reasons that should become clear soon, we decide on the lexicographical order. For the case \(n = 2\) that we are actually interested in for Corollary 1.2 this means that

$$\begin{aligned} (x_1,x_2) \le (y_1,y_2) \iff x_1 < y_1 \text { or } (x_1 = y_1 \text { and } x_2 \le y_2) \text {.}\end{aligned}$$

We claim that Theorem 3.6 is still true if we replace \(c: C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}\) by \(c: C(\mathbb {R}_{+})\times \mathbb {R}_{+}\rightarrow \mathbb {R}^n\) and read any symbol \(\le \) which appears between vectors in \(\mathbb {R}^n\) as the lexicographic order on \(\mathbb {R}^n\) (and of course likewise for all the derived symbols and notions <, \(\ge \), >, \(\inf \), etc.). Moreover, the arguments are exactly the same. Indeed the crucial part that may deserve some mention is at the end of the proof of Proposition 6.10, where we use the assumption that (6.9) holds on a set of positive measure, i.e. that the integrand is \( < 0 \) on a set of positive measure, and that the integrand is 0 outside that set, to conclude that the integral itself must be \( < 0 \). This implication is also true for the lexicographical order on \(\mathbb {R}^n\). One more detail to be aware of is that integrating functions which map into \(\mathbb {R}^2\) may give results of the form \((\infty ,x)\), \((x,-\infty )\), etc. In the case of a one-dimensional cost function we excluded such problems by making Assumption 2.4. What we really want in the proof of Proposition 6.10 is that \({\int }c\,d\xi \) and \({\int }c\,d\xi ^\pi \) should be finite. Clearly a sufficient condition to guarantee this is to replace Assumption 2.4 by

  1. (4’)

    \( \mathbb E[c(B,\tau )] \in \mathbb {R}^n \) for all stopping times \( \tau \sim \mu \).

This is not the most general version possible but it will suffice for our purposes.

To get an existence result we may assume that \(c=(c_1,c_2)\) is component-wise lower semicontinuous and that both \(c_1\) and \(c_2\) are bounded below (in either of the ways described in the two versions of Theorem 3.1). Note that—because we are talking about the lexicographic order—\(\xi \in \mathsf {RST}_{\lambda }(\mu )\) is a solution of \((\textsc {OptStop'})\) for \(c\) iff \(\xi \) is a solution of \((\textsc {OptStop'})\) for \(c_1\) and among all such solutions \(\xi '\), \(\xi \) minimizes \({\int }c_2 \,d\xi '\). By Theorem 3.1 in the form that we have already proved the set of solutions of \((\textsc {OptStop'})\) for \(c_1\) is non-empty. It is also a closed subset of a compact set and therefore itself compact. This allows us to reiterate the argument that we used in the proof of Theorem 3.1 to find inside this set a minimizer of \(\xi ' \mapsto {\int }c_2 \,d\xi '\). This minimizer is the solution of \((\textsc {OptStop'})\) for \(c\).

With this in hand we may pick up our

Proof of Corollary 1.2

The same arguments as in the proof of Corollary 1.1 apply, so we may assume that our probability space satisfies Assumption 2.2. We start with a cost function \(c(\omega ,t) := (c_1(\omega ,t),c_2(\omega ,t)) := (-\omega ^*(t), (\omega ^*(t) - \omega (t))^3)\). \(||c_1(B,\tau ) ||_{L^{3}} \le ||\left| B\right| ^*_{\tau } ||_{L^{3}} \le K_1 ||\tau ||_{L^{3/2}}^{1/2}\), by the Burkholder-Davis-Gundy inequalities, so \((c_1)_-\) satisfies the uniform integrability condition and \(\mathbb E[c(B,\tau )]\) is finite for all stopping times \(\tau \sim \mu \). \(c_2 \ge 0\) and by the Burkholder-Davis-Gundy inequalities \(\mathbb E[c_2(B,\tau )] \le \mathbb E[(B^*(\tau ))^3] \le K_1 \mathbb E[\tau ^{3/2}] = K_1 {\int }t^{3/2} \,d\mu (t)\) for some constant \(K_1\). The last term is finite by assumption.

By our discussion in the preceding paragraphs we find a solution \(\tau \) of (OptStop) for \(c\) and a measurable, \((\mathcal {F}^0_{t})_{t \ge 0}\)-adapted set \(\Gamma \subseteq C(\mathbb {R}_{+})\times \mathbb {R}_{+}\), for which \(\mathbb {P}[(B,\tau ) \in \Gamma ] = 1\) and \(\mathsf {SG}\cap (\Gamma ^<\times \Gamma ) = \emptyset \), where now \( ((\omega ,t),(\eta ,t)) \in \mathsf {SG}\) iff for all \((\mathcal {F}_{t}^{s})_{s \ge t}\)-stopping times \(\sigma \) for which \(\mathbb {W}^{t}_{0}(\sigma = t) < 1\), \(\mathbb {W}^{t}_{0}(\sigma = \infty ) = 0\), \({\int }\sigma ^{3/2} \,d\mathbb {W}^{t}_{0} < \infty \), setting \(\alpha := (\omega \mapsto (\omega ,\sigma (\omega )))_*(\mathbb {W}^{t}_{0})\) we have that either equation (7.1) holds or

$$\begin{aligned} - \omega ^*(t) + {\int }((\omega ,t) \odot \theta )^*(s) \,d\alpha (\theta ,s) = - \eta ^*(t) + {\int }((\eta ,t) \odot \theta )^*(s) \,d\alpha (\theta ,s) \end{aligned}$$


$$\begin{aligned} c_2(\omega ,t) - {\int }c_2((\omega ,t) \odot \theta , s) \,d\alpha (\theta ,s) < c_2(\eta ,t) - {\int }c_2((\eta ,t) \odot \theta , s) \,d\alpha (\theta ,s) \text {.}\end{aligned}$$

Now we want to apply Lemma 3.7, so we want to show that \(\omega (t) - \omega ^*(t) < \eta (t) - \eta ^*(t)\) implies \(((\omega ,t),(\eta ,t)) \in \mathsf {SG}\). We already dealt with the case where \(\alpha \) is such that (7.2) holds on a set of positive \(\alpha \)-measure. We now deal with the other case, so we have

$$\begin{aligned} \theta ^*(s) \le \eta ^*(t) - \eta (t) < \omega ^*(t) - \omega (t) \end{aligned}$$

for \(\alpha \)-a.a. \((\theta ,s)\) and we know that (7.3) holds. We show that (7.4) holds. Because of (7.5), \( ((\omega ,t) \odot \theta )^*(s) = \omega ^*(t) \), and so \( c_2((\omega ,t) \odot \theta , s) = (\omega ^*(t) - \omega (t) - \theta (s))^3 \). We calculate the left hand side of (7.4).

$$\begin{aligned}&{\int }(\omega ^*(t) - \omega (t))^3 - (\omega ^*(t) - \omega (t) - \theta (s))^3 \,d\alpha (\theta ,s) \\&\quad ={\int }3 (\omega ^*(t) - \omega (t))^2 \theta (s) - 3 (\omega ^*(t) - \omega (t)) (\theta (s))^2 + (\theta (s))^3 \,d\alpha (\theta ,s) \\&\quad =(\omega (t) - \omega ^*(t)) 3 {\int }(\theta (s))^2 \,d\alpha (\theta ,s) + {\int }(\theta (s))^3 \,d\alpha (\theta ,s) \end{aligned}$$

Here the Burkholder–Davis–Gundy inequalities show that both \({\int }(\theta (s))^3 \,d\alpha (\theta ,s)\) and \( {\int }(\theta (s))^2 \,d\alpha (\theta ,s) \) are finite so that we may split the integral and they also show that \(\{\tilde{B}_{\sigma \wedge T}: T \ge t\}\) is uniformly integrable so that by the optional stopping theorem \({\int }\theta (s) \,d\alpha (\theta ,s) = 0\). (\(\tilde{B}\) is again Brownian motion started in 0 at time t on \(C([t, \infty ))\).)

For the right hand side of (7.4) we get the same expression with \(\omega \) replaced by \(\eta \). This concludes the proof that \(\omega (t) - \omega ^*(t) < \eta (t) - \eta ^*(t)\) implies \(((\omega ,t),(\eta ,t)) \in \mathsf {SG}\) and Lemma 3.7 gives us barriers , such that for their hitting times , by \(B_t - B^*_{t}\) we have a.s.

Again we want to show that a.s. and that they are actually stopping times. Again we do so by showing that they are both a.s. equal to the hitting time of the closure of the respective barrier. If then this works in exactly the same way as in Lemma 4.3. (This time we define \(\overline{\tau }_\varepsilon := \inf \{ t > 0 : (B^\varepsilon _t(\omega ) - (B^\varepsilon )^*_{t}(\omega ), t) \in \overline{\mathcal {R}}\}\) where \(B^\varepsilon _t(\omega ) := B_t(\omega ) + A(t) \varepsilon \).) If then \((B^\varepsilon _t(\omega ) - (B^\varepsilon )^*_{t}(\omega ), t) \in \overline{\mathcal {R}}\) and \(t>0\) need not imply \( B_t(\omega ) - B^*_{t} (\omega ) < B^\varepsilon _t(\omega ) - (B^\varepsilon )^*_{t}(\omega )\), which is essential for the topological argument showing that the hitting time of \(\mathcal {R}\) is less than or equal \(\overline{\tau }_\varepsilon \). But if , then and are both almost surely \(\le T\) where \(T := \inf \{ t > 0 : (0,t) \in \overline{\hat{\mathcal {R}}}\}\), so in the step where we show that the hitting time of \(\mathcal {R}\) is less than \(\overline{\tau }_\varepsilon \) we can argue under the assumption that \(\overline{\tau }_\varepsilon (\omega ) < T\). In this case we do have that \((B^\varepsilon _t(\omega ) - (B^\varepsilon )^*_{t}(\omega ), t) \in \overline{\mathcal {R}}\) and \(t>0\) implies \( B_t(\omega ) - B^*_{t} (\omega ) < B^\varepsilon _t(\omega ) - (B^\varepsilon )^*_{t}(\omega )\). \(\square \)

Remark 7.1

We hope that the proofs of Corollaries 1.1 and 1.2 have given the reader some idea of how to apply the main results of this paper to arrive at barrier-type solutions of constrained optimal stopping problems, as depicted in Fig. 1.

We would like to conclude by giving a couple of pointers to the interested reader who may want to work through the proofs corresponding to the remaining pictures in Fig. 1.

For the problem of minimizing \(\mathbb E[B^*_{\tau }]\), it may actually happen that the times from Lemma 3.7 do not coincide. Specifically one has to expect this to happen on a non-negligible set when contains parts of the time axis which \(\hat{\mathcal {R}}\) does not contain. Under these circumstances an optimizer may turn out to be a true randomized stopping time, with a proportion of a path hitting the time axis at a certain point needing to be stopped while the rest continues. In this situation the picture alone does not completely describe the optimal stopping time.

For the problems involving absolute values one needs to make a minor modification in the proof of Proposition 6.10. Specifically one can allow “mirroring” the paths which are “transplanted” using the Gardener’s Lemma. This leads to a slightly different definition of Stop-Go pairs, which is perhaps most easily described by saying that in Fig. 2 the green paths which are stoppen by \(\sigma \) may be flipped upside-down on either side.


  1. We note that the results presented in this section remain valid for Brownian motions started according to a general law \(\lambda \) at the cost of slightly more tedious moment conditions in the formulation of Corollaries 1.1 and 1.2.

  2. One may of course choose \(0 \le p < \frac{1}{2}\), \(\varepsilon := \frac{1}{2} - p\) and e.g. \(A(t) := t^p\) so that no moment conditions beyond those at the very beginning of this theorem are imposed on \(\mu \).

  3. And that choice is rather arbitrary itself, as close reading will reveal.


  1. Alili, L., Patie, P.: On the first crossing times of a Brownian motion and a family of continuous curves. C. R. Math. Acad. Sci. Paris 340(3), 225–228 (2005).

    MathSciNet  Article  MATH  Google Scholar 

  2. Alili, L., Patie, P.: Boundary crossing identities for Brownian motion and some nonlinear ODE’s. Proc. Am. Math. Soc. 142(11), 3811–3824 (2014).

    MathSciNet  Article  MATH  Google Scholar 

  3. Anulova, S.V.: On Markov stopping times with a given distribution for a Wiener process. Theory Probab. Appl. 25(2), 362–366 (1981).

    MathSciNet  Article  MATH  Google Scholar 

  4. Avellaneda, M., Zhu, J.: Distance to default. Risk 12(14), 125–129 (2001)

    Google Scholar 

  5. Bayraktar, E., Miller, C.W.: Distribution-constrained optimal stopping. arXiv e-prints (2016)

  6. Beiglböck, M., Cox, A.M.G., Huesmann, M.: Optimal transport and Skorokhod embedding. Invent. Math. 208(2), 327–400 (2017).

    MathSciNet  Article  MATH  Google Scholar 

  7. Beiglböck, M., Griessler, C.: An optimality principle with applications in optimal transport. arXiv e-prints (2014)

  8. Beiglböck, M., Henry-Labordère, P., Penkner, F.: Model-independent bounds for option prices: a mass transport approach. Finance Stoch. 17(3), 477–501 (2013)

    MathSciNet  Article  MATH  Google Scholar 

  9. Beiglböck, M., Juillet, N.: On a problem of optimal transport under marginal martingale constraints. Ann. Probab. 44(1), 42–106 (2016).

    MathSciNet  Article  MATH  Google Scholar 

  10. Beiglböck, M., Nutz, M., Touzi, N.: Complete duality for martingale optimal transport on the line. Ann. Probab. 45(5), 3038–3074 (2017)

    MathSciNet  Article  MATH  Google Scholar 

  11. Breiman, L.: First exit times from a square root boundary. In: Proceedings of Fifth Berkeley Symposium on Mathematical Statistics and Probability (Berkeley, California, 1965/66), vol. II. Contributions to Probability Theory, Part 2, pp. 9–16. University of California Press, Berkeley, California (1967)

  12. Campi, L., Laachir, I., Martini, C.: Change of numeraire in the two-marginals martingale transport problem. Finance Stoch. 21(2), 471–486 (2017).

    MathSciNet  Article  MATH  Google Scholar 

  13. Chen, X., Cheng, L., Chadam, J., Saunders, D.: Existence and uniqueness of solutions to the inverse boundary crossing problem for diffusions. Ann. Appl. Probab. 21(5), 1663–1693 (2011).

    MathSciNet  Article  MATH  Google Scholar 

  14. Cox, A.M.G., Källblad, S.: Model-independent bounds for Asian options: a dynamic programming approach. arXiv e-prints (2015)

  15. Dellacherie, C., Meyer, P.A.: Probabilities and Potential, A. North-Holland Mathematics Studies, vol. 29. North-Holland Publishing Co., Amsterdam (1978)

    Google Scholar 

  16. Dolinsky, Y., Soner, H.M.: Martingale optimal transport and robust hedging in continuous time. Probab. Theory Relat. Fields 160(1–2), 391–427 (2014).

    MathSciNet  Article  MATH  Google Scholar 

  17. Dudley, R.M., Gutmann, S.: Stopping times with given laws. Lecture Notes in Mathematics, vol. 581, pp. 51–58 (1977)

  18. Galichon, A., Henry-Labordère, P., Touzi, N.: A stochastic control approach to no-arbitrage bounds given marginals, with an application to lookback options. Ann. Appl. Probab. 24(1), 312–336 (2014)

    MathSciNet  Article  MATH  Google Scholar 

  19. Gangbo, W., McCann, R.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)

    MathSciNet  Article  MATH  Google Scholar 

  20. Ghoussoub, N., Kim, Y.H., Lim, T.: Structure of optimal martingale transport plans in general dimensions. arXiv e-prints (2016)

  21. Grass, A.: Uniqueness and stability properties of barrier type Skorokhod embeddings. Master thesis, Vienna. (2016)

  22. Guo, G., Tan, X., Touzi, N.: On the monotonicity principle of optimal Skorokhod embedding problem. SIAM J. Control Optim. 54(5), 2478–2489 (2016).

    MathSciNet  Article  MATH  Google Scholar 

  23. Hirhager, K.: Adapted dependence with applications to financial and actuarial risk management. PhD thesis, TU Vienna (2013)

  24. Hobson, D.: The Skorokhod embedding problem and model-independent bounds for option prices. In: Paris-Princeton Lectures on Mathematical Finance 2010, Lecture Notes in Mathematics, vol. 2003, pp. 267–318. Springer, Berlin (2011).

  25. Hobson, D., Neuberger, A.: Robust bounds for forward start options. Math. Finance 22(1), 31–56 (2012)

    MathSciNet  Article  MATH  Google Scholar 

  26. Jaimungal, S., Kreinin, A., Valov, A.: The generalized Shiryaev problem and Skorokhod embedding. Theory Probab. Appl. 58(3), 493–502 (2014).

    MathSciNet  Article  MATH  Google Scholar 

  27. Kechris, A.: Classical Descriptive Set Theory. Graduate Texts in Mathematics. Springer, New York (1995)

    Book  Google Scholar 

  28. Knott, M., Smith, C.S.: On the optimal mapping of distributions. J. Optim. Theory Appl. 43(1), 39–49 (1984).

    MathSciNet  Article  MATH  Google Scholar 

  29. Lerche, H.R.: Boundary Crossing of Brownian Motion, Lecture Notes in Statistics, vol. 40. Springer, Berlin (1986). (Its relation to the law of the iterated logarithm and to sequential analysis)

  30. Loynes, R.M.: Stopping times on Brownian motion: some properties of Root’s construction. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 16, 211–218 (1970)

    MathSciNet  Article  MATH  Google Scholar 

  31. Nutz, M., Stebegg, F.: Canonical supermartingale couplings. arXiv e-prints (2016)

  32. Obłój, J.: The Skorokhod embedding problem and its offspring. Probab. Surv. 1, 321–390 (2004).

    MathSciNet  Article  MATH  Google Scholar 

  33. Peskir, G.: On integral equations arising in the first-passage problem for Brownian motion. J. Integral Equ. Appl. 14(4), 397–423 (2002).

    MathSciNet  Article  MATH  Google Scholar 

  34. Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes, and Martingales, vol. 2. Cambridge Mathematical Library. Cambridge University Press, Cambridge (2000). (Itô calculus, Reprint of the second (1994) edition)

  35. Root, D.H.: The existence of certain stopping times on Brownian motion. Ann. Math. Stat. 40, 715–718 (1969)

    MathSciNet  Article  MATH  Google Scholar 

  36. Rüschendorf, L.: Fréchet-bounds and their applications. In: Advances in probability distributions with given marginals (Rome, 1990), Mathematics and Its Applications, vol. 67, pp. 151–187. Kluwer, Dordrecht (1991)

  37. Rüschendorf, L.: Optimal solutions of multivariate coupling problems. Appl. Math. (Warsaw) 23(3), 325–338 (1995)

    MathSciNet  Article  MATH  Google Scholar 

  38. Salminen, P.: On the first hitting time and the last exit time for a Brownian motion to/from a moving boundary. Adv. Appl. Probab. 20(2), 411–426 (1988).

    MathSciNet  Article  MATH  Google Scholar 

  39. Zaev, D.: On the Monge–Kantorovich problem with additional linear constraints. Mat. Zametki 98(5), 664–683 (2015).

    MathSciNet  Article  MATH  Google Scholar 

Download references


Open access funding provided by Austrian Science Fund (FWF).

Author information

Authors and Affiliations


Corresponding author

Correspondence to Mathias Beiglböck.

Additional information

Mathias Beiglböck and Manu Eder were supported by FWF-Grant Y00782, Christiane Elgert by Jubiläumsfonds 16549, Uwe Schmock by Jubiläumsfonds 16549 and FWF-Grant P25216.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Beiglböck, M., Eder, M., Elgert, C. et al. Geometry of distribution-constrained optimal stopping problems. Probab. Theory Relat. Fields 172, 71–101 (2018).

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI:


  • Distribution-constrained optimal stopping
  • Optimal transport
  • Inverse first passage problem
  • Shiryaev’s problem

Mathematics Subject Classification

  • Primary 60G42
  • 60G44
  • Secondary 91G20