
Optimal Stopping via Pathwise Dual Empirical Maximisation


Abstract

The optimal stopping problem arising in the pricing of American options can be tackled by the so-called dual martingale approach. In this approach, a dual problem is formulated over the space of adapted martingales. A feasible solution of the dual problem yields an upper bound for the solution of the original primal problem. In practice, the optimization is performed over a finite-dimensional subspace of martingales. A sample of paths of the underlying stochastic process is produced by Monte Carlo simulation, and the expectation is replaced by the empirical mean. As a rule, the resulting optimization problem, which can be written as a linear program, yields a martingale for which the variance of the obtained estimator can be large. In order to decrease this variance, a penalty term can be added to the objective function of the pathwise optimization problem. In this paper, we provide a rigorous analysis of the optimization problems obtained by adding different penalty functions. In particular, a convergence analysis implies that it is better to minimize the empirical maximum instead of the empirical mean. Numerical simulations confirm the variance reduction effect of the new approach.


References

  1. Belomestny, D.: Solving optimal stopping problems via empirical dual optimization. Ann. Appl. Probab. 23(5), 1988–2019 (2013)

  2. Davis, M.H.A., Karatzas, I.: A Deterministic Approach to Optimal Stopping, vol. 33. Wiley, Chichester (1994)

  3. Desai, V.V., Farias, V.F., Moallemi, C.C.: Pathwise optimization for optimal stopping problems. Manag. Sci. 58(12), 2292–2308 (2012)

  4. Haugh, M., Kogan, L.: Pricing American options: a duality approach. Oper. Res. 52, 258–270 (2004)

  5. Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I. Grundlehren der mathematischen Wissenschaften, vol. 305. Springer, Berlin (1996)

  6. Maehara, H.: A threshold for the size of random caps to cover a sphere. Ann. Inst. Stat. Math. 40(4), 665–670 (1988)

  7. Robbins, H.E.: On the measure of a random set. Ann. Math. Stat. 15(1), 70–74 (1944)

  8. Rogers, L.C.G.: Monte Carlo valuation of American options. Math. Financ. 12(3), 271–286 (2002)

  9. Schoenmakers, J., Zhang, J., Huang, J.: Optimal dual martingales, their analysis, and application to new algorithms for Bermudan products. SIAM J. Financ. Math. 4, 86–116 (2013)

  10. Stevens, W.L.: Solution to a geometrical problem in probability. Ann. Eugen. 9, 315–320 (1939)

  11. Rockafellar, R.T.: Convex Analysis. Princeton Mathematical Series, vol. 28. Princeton University Press, Princeton (1970)

  12. Wendel, J.G.: A problem in geometric probability. Math. Scand. 11, 109–111 (1962)


Acknowledgements

Roland Hildebrand and John Schoenmakers acknowledge support by Research Center MATHEON through project SE7 funded by the Einstein Center for Mathematics Berlin.

Author information

Corresponding author

Correspondence to Denis Belomestny.

Additional information

Denis Belomestny’s research was conducted at IITP RAS and supported by a Russian Scientific Foundation Grant (Project No. 14-50-00150).

Appendices

Appendix A: Convex Analysis

In this section we introduce some notions from convex analysis and conic optimization. The dual of the real vector space \(\mathbb {R}^{n}\) will be denoted by \(\mathbb {R}_{n}\), and the scalar product between \(y \in \mathbb R_n\) and \(x \in \mathbb R^n\) will be denoted by \(\langle y,x \rangle \). Let \(\mathbf {1}^{n}\) and \(\mathbf {1}_{n}\) denote the all-ones vector \((1,\dots ,1)^{T}\) in \(\mathbb {R}^{n}\) and in the dual space \(\mathbb R_n\), respectively.

1.1 Conic Programs

Conic programming is a generalization of linear programming where the ordinary inequality constraints are replaced by a more general notion of inequality defined by a convex cone.

A conic program over a closed convex cone \(\mathcal{K}\subset \mathbb {R}^{n}\) is an optimization problem of the form

$$\begin{aligned} \inf _{x\in \mathcal{K}}\langle c,x\rangle \ :\ Ax=b, \end{aligned}$$
(25)

where \(c \in \mathbb R_n\) is a vector defining the linear cost function of the problem, A is an \(m\times n\) matrix, and \(b\in \mathbb {R}^{m}\). Here A and b define the linear constraints of the problem.

The availability of algorithms for solving a conic program depends on the nature of the cone \(\mathcal{K}\). For example, if \(\mathcal{K}\) is the positive orthant \(\mathbb {R}_{+}^{n}\), then (25) is a linear program, which can be solved efficiently. Efficient solution algorithms are also available if \(\mathcal{K}\) is a second-order cone, in which case the program is called a conic quadratic program.
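
For illustration, the following is a minimal sketch (an assumed toy instance, not taken from the paper) of a conic program of the form (25) with \(\mathcal{K}=\mathbb {R}_{+}^{3}\), solved as a standard-form linear program with SciPy:

```python
# Toy instance of (25) with K the positive orthant (assumed data, for illustration only):
#   minimize <c, x>  subject to  A x = b,  x >= 0.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0, 0.0])           # cost vector
A_eq = np.array([[1.0, 1.0, 1.0]])      # single equality constraint x1 + x2 + x3 = 1
b_eq = np.array([1.0])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 3)
print(res.x, res.fun)                   # optimal point (0, 0, 1) with value 0
```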

1.2 Exposed and Extreme Points

In this subsection we introduce the notions of exposed and extreme points of a closed convex set and consider the relations between them.

Definition A.1

([11, p. 162]) Let \(C \subset \mathbb {R}^{n}\) be a closed convex set. A point \(x \in C\) is called an extreme point if there does not exist an open line segment \(L \subset C\) such that \(x \in L\).

Lemma A.2

([11, Corollary 18.5.1]) A closed bounded convex set is the convex hull of its extreme points.

Definition A.3

([11, pp. 162–163]) Let \(C \subset \mathbb {R}^{n}\) be a closed convex set. A point \(x \in C\) is called an exposed point if there exists a supporting affine hyperplane \(H \subset \mathbb {R}^{n}\) to C such that \(H \cap C = \{x\}\).

Lemma A.4

([11, Theorem 18.6]) Let \(C \subset \mathbb {R}^{n}\) be a closed convex set. Then the set of exposed points of C is dense in the set of extreme points of C.
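
As a standard illustration (not taken from the paper) of why Lemma A.4 asserts only density, consider the planar set

$$\begin{aligned} C = \mathrm {conv}\left( \left\{ (x,y) \,|\, x^{2}+y^{2} \le 1 \right\} \cup \{ (0,-2) \} \right) \subset \mathbb {R}^{2}. \end{aligned}$$

The two points where the circular arc meets the straight edges are extreme but not exposed: the only supporting line at such a point is the tangent line to the circle, and it meets C in a whole segment. These points are, however, limits of exposed points lying on the arc.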

Corollary A.5

Let \(C \subset \mathbb {R}^{n}\) be a bounded closed convex set and E its set of exposed points. Then C is the convex hull of the closure of E.

Proof

The corollary follows immediately from the two lemmas above. \(\square \)

1.3 Convex Functions and Subgradients

In this subsection we introduce the notion of a subgradient.

Definition A.6

([11, pp. 214–215]) Let \(D \subset \mathbb {R}^{n}\) be a convex set and \(f: D \rightarrow \mathbb {R}\) a convex function. A subgradient of f at \(x \in D\) is a dual vector \(y \in \mathbb {R}_{n}\) such that \(f(z) \ge f(x) + \langle y, z-x \rangle \) for all \(z \in D\). The set of all subgradients at \(x \in D\) is called the subdifferential of f at x and is denoted by \(\partial f(x)\).

The subdifferential is a closed convex set [11, p. 215]. If f is differentiable at x, then the gradient \(f^{\prime }(x)\) is the only subgradient [11, p. 216].
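
A standard example (not taken from the paper) that is relevant throughout this work is the coordinate maximum \(f(x) = \max _{i=1,\dots ,n} x_{i}\) on \(\mathbb {R}^{n}\). Its subdifferential is

$$\begin{aligned} \partial f(x) = \mathrm {conv}\left\{ e_{i} \,\Big |\, x_{i} = \max _{j=1,\dots ,n} x_{j} \right\} , \end{aligned}$$

the convex hull of those coordinate unit vectors (regarded as elements of \(\mathbb {R}_{n}\)) at which the maximum is attained. In particular, f is differentiable exactly at those points where the maximizing index is unique.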

Lemma A.7

([5, p. 261]) Let \(D \subset \mathbb {R}^{n}\) be a convex domain, \(F: D \rightarrow \mathbb {R}\) a convex function, and \(h: D \rightarrow \mathbb {R}\) a convex \(C^1\) function. For \(x \in D\) and \(\lambda \ge 0\) we then have \(\partial (\lambda F+h)(x) = \lambda \partial F(x) + h^{\prime }(x)\).

Lemma A.8

([11, Theorem 23.9]) Let \(D\subset \mathbb {R}^{m}\) be a convex domain, and \(H:\mathbb {R}^{m} \rightarrow \mathbb {R}^{n}\) an affine map given by \(H(x)=Ax+b\), with A and b the linear part of H and the shift, respectively. Let further \(F:H[D]\rightarrow \mathbb {R}\) be a convex function. Then for \(x\in D\) we have \(\partial (F\circ H)(x)=A^{*}[\partial F(Ax+b)]\), where \(A^{*}\) is the adjoint map of A.

1.4 Convex Sets and Polars

In this subsection we introduce the notion of a polar and study its properties.

Definition A.9

([11, p. 125]) Let \(C \subset \mathbb {R}^{n}\) be a closed convex set containing the origin of \(\mathbb {R}^{n}\). The set \(C^{o} = \{ y \in \mathbb {R}_{n} \,|\, \langle y,x \rangle \le 1\ \forall \ x \in C \}\) is called the polar of the set C.

The set \(C^{o}\) is also closed, convex, and contains the origin [11, p. 125]. It is bounded if and only if the origin of \(\mathbb {R}^{n}\) is contained in the interior of C [11, Corollary 14.5.1]. Moreover, the polar of \(C^{o}\) is again C [11, Theorem 14.5]. If \(C,C^{\prime }\) are two closed convex sets containing the origin and satisfying \(C \subset C^{\prime }\), then their polars satisfy \((C^{\prime })^o \subset C^{o}\) [11, p. 125].
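
As a simple example (not taken from the paper), the polar of the \(\ell _{1}\)-unit ball is the \(\ell _{\infty }\)-unit ball:

$$\begin{aligned} \left\{ x \in \mathbb {R}^{n} \,\Big |\, \sum _{i=1}^{n} |x_{i}| \le 1 \right\} ^{o} = \left\{ y \in \mathbb {R}_{n} \,\Big |\, \max _{i=1,\dots ,n} |y_{i}| \le 1 \right\} , \end{aligned}$$

and vice versa. Both sets contain the origin in their interiors, in accordance with the fact that each of the two polars is bounded.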

Let now \(L \subset \mathbb {R}^{n}\) be a linear subspace, and let \(L^{\perp } \subset \mathbb {R}_{n}\) be the orthogonal subspace. Then the dual space \(L^{*}\) of L can be identified with the quotient \(\mathbb {R}_{n}/L^{\perp }\). Let \(\Pi : \mathbb {R}_{n} \rightarrow \mathbb {R}_{n}/L^{\perp }\) be the corresponding projection. Let \(C \subset \mathbb {R}^{n}\) be a closed convex set containing the origin. Then the intersection \(C \cap L \subset L\) is a closed convex set in L, containing the origin of L. The next result gives a convenient description of the polar of \(C \cap L\) as a subset of L in terms of the polar \(C^{o}\).

Lemma A.10

Assume the notations of the previous paragraph. Then the polar \((C \cap L)^{o}\) is given by the closure of the projection \(\Pi [C^{o}]\).

Proof

Let \(y \in C^{o}\) be an arbitrary point in the polar of C and \(\Pi (y) = y + L^{\perp }\) its projection on the quotient \(\mathbb {R}_{n}/L^{\perp }\). Then we have \(\langle y,x \rangle \le 1\) for all \(x \in C\). In particular, we have \(\langle y,x \rangle \le 1\) for all \(x \in C \cap L\). Hence \(\Pi (y) \in (C \cap L)^{o}\). It follows that \(\Pi [C^{o}] \subset (C \cap L)^{o}\). However, \((C \cap L)^{o}\) is closed, and hence the closure of \(\Pi [C^{o}]\) is also a subset of \((C \cap L)^{o}\).

Let now \(y \in \mathbb {R}_{n}\) such that \(\Pi (y) = y + L^{\perp }\) is not contained in the closure of \(\Pi [C^{o}]\). Then there exists a hyperplane \(H \subset \mathbb {R}_{n}/L^{\perp }\) which separates \(\Pi (y)\) from \(\Pi [C^{o}]\), and such that \(\Pi (y) \not \in H\). Then the hyperplane \(\Pi ^{-1}[H] \subset \mathbb {R}_{n}\) separates y from \(C^{o}\), and \(y \not \in \Pi ^{-1}[H]\). Note also that \(y \not = 0\). It follows that there exists a vector \(z \in L\) such that \(\langle y,z \rangle > 1\) and \(\langle w,z \rangle \le 1\) for all \(w \in C^{o}\). Hence \(z \in C \cap L\). But then \(y + L^{\perp } \not \in (C \cap L)^{o}\). This proves the converse inclusion and completes the proof. \(\square \)

Appendix B: Convexity of the Penalized Problem

The following consideration shows that if the condition in Lemma 2.2 is not satisfied, then problem (8) may not be convex at all.

Let \(\alpha ^{*}\in \mathbb {R}^{K}\) be an arbitrary vector, and define the vector \(c^{*}\) by \(c_{k}^{*}=Z^{(k)}(M(\alpha ^{*}))\). Assume that g is differentiable and that \(g^{\prime }\) is not nonnegative at \(c^{*}\), i.e., there exists an index l such that \(\nabla _{l}g(c^{*})<0\). Suppose further that the maximum \(\max _{t}\left( Z_{t}^{(l)}-\sum _{r=1}^{K}\alpha _{r}^{*} M_{t}^{r,(l)}\right) \) is attained at more than one index t, say at the indices i, j, and suppose that there exists a direction \(\delta \in \mathbb {R}^{K}\) such that \(\sum _{r=1}^{K}\delta _{r}M_{t}^{r,(k)}\) is zero for all pairs (k, t) other than (l, i) and (l, j) with \(Z_{t}^{(k)}-\sum _{r=1}^{K}\alpha _{r}^{*}M_{t}^{r,(k)}=Z^{(k)}(M(\alpha ^{*}))\), and such that \(\sum _{r=1}^{K} \delta _{r}M_{i}^{r,(l)}\not =\sum _{r=1}^{K}\delta _{r}M_{j}^{r,(l)}\). Then problem (8) is not convex.

Indeed, for real \(\varepsilon \) define \(\alpha _{\varepsilon }=\alpha ^{*}+\varepsilon \delta \) and a vector \(c(\varepsilon )\) by \(c_{k}(\varepsilon )=Z^{(k)}(M(\alpha _{\varepsilon }))\). Assume without loss of generality that \(\sum _{r=1}^{K}\delta _{r}M_{i}^{r,(l)}<\sum _{r=1}^{K}\delta _{r}M_{j}^{r,(l)}\). For \(\varepsilon >0\) small enough we then have \(c_{k}(\pm \varepsilon )=c_{k}^{*}\) for all \(k\not =l\), \(c_{l}(\varepsilon )=c_{l}^{*}-\varepsilon \sum _{r=1}^{K}\delta _{r}M_{i}^{r,(l)}\), and \(c_{l}(-\varepsilon )=c_{l}^{*}+\varepsilon \sum _{r=1}^{K}\delta _{r}M_{j}^{r,(l)}\). The cost function of problem (8) is given by \(g(c(\varepsilon ))\) for \(\alpha =\alpha _{\varepsilon }\). We have \(\frac{d}{d\varepsilon } \frac{g(c(\varepsilon ))+g(c(-\varepsilon ))}{2}|_{\varepsilon =0}=\nabla _{l}g(c^{*})\frac{\sum _{r=1}^{K}\delta _{r}M_{j}^{r,(l)}-\sum _{r=1} ^{K}\delta _{r}M_{i}^{r,(l)}}{2}<0\). Hence for small \(\varepsilon >0\) the average of the cost function values at \(\alpha _{\varepsilon }\) and \(\alpha _{-\varepsilon }\) falls below its value at \(\alpha ^{*}\), and the cost function is not convex.

If K is not too small, then the above conditions are in general satisfied for some value of \(\alpha \). Hence it is reasonable to impose the condition given in Lemma 2.2.

Appendix C: Justification of Condition (12)

We need to prove the equivalence of the following two conditions.

(i) For every \(x \in \mathbb {R}^{n}\) there exists a subgradient \(y \in \partial g(x)\) whose elements are all nonnegative.

(ii) The set \(\frac{1}{n}\mathbf {1}_{n} + \lambda \Pi ^*\left[ F_{1}^{o}\right] \) is contained in the nonnegative orthant.

We shall prove the two directions of the equivalence relation separately.

(ii) \(\Rightarrow \) (i). First we consider condition (i) for the case \(\Pi x = 0\). We shall show that the dual vector \(y = \frac{1}{n}\mathbf {1}_{n} \ge 0\) is a subgradient of g at x. We have \(F(\Pi x) = 0\) and hence \(g(x) = \frac{1}{n}\langle \mathbf {1}_{n},x \rangle \). For every \(z \in \mathbb {R}^{n}\) we have \(F(\Pi z) \ge 0\) by assumption (7) on F and hence \(g(z) \ge \frac{1}{n}\langle \mathbf {1}_{n},z \rangle = g(x) + \langle \frac{1}{n}\mathbf {1}_{n}, z-x \rangle \). This proves \(\frac{1}{n}\mathbf {1}_{n} \in \partial g(x)\).

Now let \(x \in \mathbb {R}^{n}\) be such that \(L \ni \Pi x \not = 0\). Then \(F(\Pi x) > 0\), and we can define \(\tilde{x} = \frac{\Pi x}{F(\Pi x)} \in L\). By definition, we have \(F(\tilde{x}) = 1\). It follows that \(\tilde{x}\) is on the boundary of the set \(F_{1}\). Hence there exists an element \(w \in F_{1}^{o}\) such that \(\langle w,\tilde{x} \rangle = 1\), and hence \(\langle w,\Pi x \rangle = F(\Pi x)\). By assumption we have \(y = \frac{1}{n}\mathbf {1}_{n} + \lambda \Pi ^* w \ge 0\). We shall show that \(y \in \partial g(x)\).

Indeed, let \(z \in \mathbb {R}^{n}\). Then we have \(g(z) - g(x) - \langle y,z-x \rangle = \lambda (F(\Pi z) - F(\Pi x) - \langle \Pi ^* w,z-x \rangle ) = \lambda (F(\Pi z) - \langle w,\Pi z \rangle )\). If \(\Pi z = 0\), then \(F(\Pi z) - \langle w,\Pi z \rangle = 0\). Let us assume that \(\Pi z \not = 0\). Then \(F(\Pi z) > 0\), and we may define \(\tilde{z} = \frac{\Pi z}{F(\Pi z)}\). We get \(F(\tilde{z}) = 1\), and \(\tilde{z} \in F_{1}\). It follows that \(\langle w,\tilde{z} \rangle \le 1\), because \(w \in F_{1}^{o}\). But then \(\langle w,\Pi z \rangle \le F(\Pi z)\), which proves \(g(z) - g(x) - \langle y,z-x \rangle \ge 0\). Hence \(y \in \partial g(x)\), which yields (i).

(i) \(\Rightarrow \) (ii). First we shall prove an auxiliary result.

Lemma C.11

Let \(\tilde{F}_{1} = F_{1} \cap L\). Then the polar \(\tilde{F}_{1}^{o}\) of \(\tilde{F}_{1}\) in L is given by the projection \(\Pi ^*\left[ F_{1}^{o}\right] \).

Proof

By Lemma A.10 the polar \(\tilde{F}_{1}^{o}\) is given by the closure of \(\Pi ^*\left[ F_{1}^{o}\right] \). It remains to show that \(\Pi ^*\left[ F_{1}^{o}\right] \) is closed. We have \(F(0) = 0\), and hence \(F_{1}\) contains a ball around the origin with positive radius r. It follows that the polar \(F_{1}^{o}\) is contained in a ball with radius \(r^{-1}\), and is hence compact. But projections of compact sets are compact, and in particular closed. \(\square \)

We now come to the implication (i) \(\Rightarrow \) (ii). Assume (i) and consider first an exposed point \(w \in \tilde{F}_{1}^{o}\). Our aim is to show that \(\frac{1}{n}\mathbf {1}_{n} + \lambda w \ge 0\). By definition, there exists \(x \in \tilde{F}_{1}\) such that \(\langle w,x \rangle = 1\), \(\langle v,x \rangle \le 1\) for all \(v \in \tilde{F}_1^o\), and \(\left\{ v \in \tilde{F}_{1}^{o} \,|\, \langle v,x \rangle = 1 \right\} = \{ w \}\). Note that \(x \not = 0\), hence \(F(x) > 0\), and \(\tilde{x} = \frac{x}{F(x)} \in \tilde{F}_1\). Therefore \(\langle w,\tilde{x} \rangle \le 1\) and \(1 = \langle w,x \rangle \le F(x)\). It follows that \(F(x) = 1\).

Let \(y \ge 0\) be a subgradient of g at x. By Lemmas A.7 and A.8 there exists \(v \in \partial F(x)\) such that \(y = \frac{1}{n}\mathbf {1}_{n} + \lambda \Pi ^* v\). By definition, for all z we have \(F(z) - F(x) - \langle v,z - x \rangle \ge 0\). Inserting \(z = \alpha x\) for \(\alpha \ge 0\), we obtain \((\alpha - 1)F(x) \ge (\alpha - 1) \langle v,x \rangle \). Since \(\alpha - 1\) assumes positive as well as negative values for \(\alpha \ge 0\), it follows that \(1 = F(x) = \langle v,x \rangle = \langle v,\Pi x \rangle = \langle \Pi ^*v,x \rangle \). Thus we get for all z that \(F(z) - \langle v,z \rangle \ge 0\). In particular, for \(z \in \tilde{F}_{1}\) we have \(1 \ge F(z) \ge \langle v,z \rangle = \langle \Pi ^* v,z \rangle \), and \(\Pi ^* v \in \tilde{F}_{1}^{o}\). From \(\langle \Pi ^*v,x \rangle = 1\) it follows that \(\Pi ^* v = w\), and \(y = \frac{1}{n}\mathbf {1}_{n} + \lambda w \ge 0\).

Thus \(\frac{1}{n}\mathbf {1}_{n} + \lambda w \ge 0\) for all exposed points \(w \in \tilde{F}_{1}^{o}\). By Corollary A.5 we get that \(\frac{1}{n}\mathbf {1}_{n} + \lambda w \ge 0\) for all \(w \in \tilde{F}_{1}^{o} = \Pi ^*\left[ F_{1}^{o}\right] \). This shows (ii).

Appendix D: Proof of Theorem 3.1

We first need the following lemma.

Lemma D.1

Let \(K,N\in \mathbb {N}_{+}\) and \(\beta \in \mathbb {R}^{K}\) be fixed. For a fixed set of N Monte Carlo realizations, let \(t_{\beta }^{(n)}\), \(n=1,\ldots ,N\), be such that

$$\begin{aligned} \max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K}\beta _{k}M_{t} ^{k,(n)}\right) =Z_{t_{\beta }^{(n)}}^{(n)}-\sum _{k=1}^{K}\beta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}. \end{aligned}$$

If

$$\begin{aligned} \max _{n=1,\ldots ,N}\sum _{k=1}^{K}\delta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}\ge 0\text { for all }\delta \in \mathbb {R}^{K} \end{aligned}$$
(26)

then it holds that

$$\begin{aligned} \min _{n=1,\ldots ,N}\left( Z_{t_{\beta }^{(n)}}^{(n)}-\sum _{k=1}^{K}\beta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}\right)\le & {} \inf _{\alpha }\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)} - \sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}\right) \\\le & {} \max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K} \beta _{k}M_{t}^{k,(n)}\right) . \end{aligned}$$

Proof

With \(\alpha =\beta -\delta \) for \(\delta \in \mathbb {R}^{K}\) we have on the one hand

$$\begin{aligned} \inf _{\alpha }&\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)} -\sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}\right) \\&\qquad \quad =\inf _{\delta }\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)} -\sum _{k=1}^{K}\beta _{k}M_{t}^{k,(n)}+\sum _{k=1}^{K}\delta _{k}M_{t} ^{k,(n)}\right) \\&\qquad \quad \le \max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K} \beta _{k}M_{t}^{k,(n)}\right) , \end{aligned}$$

and on the other hand

$$\begin{aligned}&\inf _{\alpha }\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}\right) \\&\qquad \quad \ge \inf _{\delta }\max _{n=1,\ldots ,N}\left( Z_{t_{\beta }^{(n)}}^{(n)} -\sum _{k=1}^{K}\beta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}+\sum _{k=1}^{K}\delta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}\right) \\&\qquad \quad \ge \inf _{\delta }\left( \max _{n=1,\ldots ,N}\left( \min _{n^{\prime } =1,\ldots ,N}\left( Z_{t_{\beta }^{(n^{\prime })}}^{(n^{\prime })}-\sum _{k=1} ^{K}\beta _{k}M_{t_{\beta }^{(n^{\prime })}}^{k,(n^{\prime })}\right) +\sum _{k=1}^{K}\delta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}\right) \right) \\&\qquad \quad =\inf _{\delta }\left( \min _{n^{\prime }=1,\ldots ,N}\left( Z_{t_{\beta }^{(n^{\prime })}}^{(n^{\prime })}-\sum _{k=1}^{K}\beta _{k}M_{t_{\beta }^{(n^{\prime })}}^{k,(n^{\prime })}\right) +\max _{n=1,\ldots ,N}\sum _{k=1} ^{K}\delta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}\right) \\&\qquad \quad \ge \min _{n=1,\ldots ,N}\left( Z_{t_{\beta }^{(n)}}^{(n)}-\sum _{k=1}^{K} \beta _{k}M_{t_{\beta }^{(n)}}^{k,(n)}\right) , \end{aligned}$$

by using (26). \(\square \)
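
In computational practice the inf–max appearing in Lemma D.1 is evaluated as a linear program by introducing an epigraph variable. The following is a minimal sketch (with an assumed array layout of the simulated data; not the authors' implementation):

```python
# Epigraph reformulation of  inf_alpha max_{n,t} ( Z_t^{(n)} - sum_k alpha_k M_t^{k,(n)} ):
#   minimize u  subject to  Z_t^{(n)} - sum_k alpha_k M_t^{k,(n)} <= u  for all n, t.
import numpy as np
from scipy.optimize import linprog

def empirical_max_lp(Z, M):
    # Z: (N, T+1) simulated cash-flows; M: (K, N, T+1) simulated basis martingales (assumed layout)
    K, N, T1 = M.shape
    c = np.zeros(K + 1)
    c[0] = 1.0                                   # decision vector x = (u, alpha_1, ..., alpha_K)
    A_ub = np.zeros((N * T1, K + 1))
    A_ub[:, 0] = -1.0                            # -u ...
    A_ub[:, 1:] = -M.reshape(K, N * T1).T        # ... - sum_k alpha_k M_t^{k,(n)}
    b_ub = -Z.reshape(N * T1)                    # ... <= -Z_t^{(n)}
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (K + 1))
    return res.x[0], res.x[1:]                   # upper-bound estimate and minimizing alpha
```

The LP has one constraint per simulated path and exercise date, so its size grows only linearly in N and T.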

Corollary D.2

Suppose that for a fixed \(K\in \mathbb {N}_{+}\) there exists an \(\alpha ^{*}\in \mathbb {R}^{K}\) such that

$$\begin{aligned} M^{*}:=\sum _{k=1}^{K}\alpha _{k}^{*}M_{t}^{k} \end{aligned}$$
(27)

is surely optimal in the sense of [9]. That is

$$\begin{aligned} Y^{*}=\max _{t=0,\ldots ,T}\left( Z_{t}-\sum _{k=1}^{K}\alpha _{k}^{*} M_{t}^{k}\right) \text { almost surely,} \end{aligned}$$

and so we have

$$\begin{aligned} Y^{*}=\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K}\alpha _{k}^{*}M_{t}^{k,(n)}\right) , \,\, n=1,\ldots ,N. \end{aligned}$$

Let \(t_{*}^{(n)}\), \(n=1,\ldots ,N\), be such that

$$\begin{aligned} Y^{*}=\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K}\alpha _{k}^{*}M_{t}^{k,(n)}\right) =Z_{t_{*}^{(n)}}^{(n)}-\sum _{k=1}^{K}\alpha _{k} ^{*}M_{t_{*}^{(n)}}^{k,(n)} \end{aligned}$$

for each n. By virtue of Lemma D.1 we then obtain for \(\beta =\alpha ^{*}\)

$$\begin{aligned} Y^{*}=\inf _{\alpha }\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)} -\sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}\right) , \end{aligned}$$

provided that (26) holds for \(\beta =\alpha ^{*}.\)

Proposition D.3

Assume that \(M^{*}\) is as in (27) in Corollary D.2 and that

$$\begin{aligned} \max _{n=1,\ldots ,N}\sum _{k=1}^{K}\delta _{k}M_{t_{*}^{(n)}}^{k,(n)}\ge c\left\| \delta \right\| \, \, \text {for all} \, \, \delta \in \mathbb {R}^{K}\text { and some }\, c>0, \end{aligned}$$
(28)

that is, a stronger version of (26) holds. If now

$$\begin{aligned} \alpha ^{\circ }=\underset{\alpha }{\arg \inf }\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T} \left( Z_{t}^{(n)}-\sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}\right) , \end{aligned}$$

then it follows that \(\alpha ^{\circ }=\alpha ^{*}.\)

Proof

Let us define

$$\begin{aligned} F\left( \alpha \right) =\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t} ^{(n)}-\sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}\right) . \end{aligned}$$

So by Corollary D.2, \(F\left( \alpha ^{\circ }\right) =F\left( \alpha ^{*}\right) =Y^{*},\) and for any \(\delta \in \mathbb {R}^{K}\) we have

$$\begin{aligned} F\left( \alpha ^{*}-\delta \right)&=\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T} \left( Z_{t}^{(n)}-\sum _{k=1}^{K}\alpha _{k}^{*}M_{t}^{k,(n)}+\sum _{k=1} ^{K}\delta _{k}M_{t}^{k,(n)}\right) \\&\ge \max _{n=1,\ldots ,N}\left( Z_{t_{*}^{(n)}}^{(n)}-\sum _{k=1}^{K} \alpha _{k}^{*}M_{t_{*}^{(n)}}^{k,(n)}+\sum _{k=1}^{K}\delta _{k}M_{t_{*}^{(n)}}^{k,(n)}\right) \\&=Y^{*}+\max _{n=1,\ldots ,N}\sum _{k=1}^{K}\delta _{k}M_{t_{*}^{(n)}} ^{k,(n)}\ge Y^{*}+c\left\| \delta \right\| , \end{aligned}$$

hence \(\alpha ^{*}\) is a strict local minimum of F. Since F is convex, \(\alpha ^{*}\) is also a unique strict global minimum. Thus, it must hold that \(\alpha ^{\circ }=\alpha ^{*}.\)\(\square \)

We next suppose that an almost surely optimal martingale \(M^{*}\) satisfies

$$\begin{aligned} M^{*}:=\sum _{k=1}^{\infty }\alpha _{k}^{*}M_{t}^{k} \end{aligned}$$

where the convergence is understood almost surely (and, if needed, in an \(L_{p}\) sense for some \(p\ge 1\)). Let us introduce two convex functions

$$\begin{aligned} G_{K,N}(\alpha )=\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}-\sum _{k=K+1}^{\infty }\alpha _{k}^{*} M_{t}^{k,(n)}\right) \end{aligned}$$

and

$$\begin{aligned} F_{K,N}(\alpha )=\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K}\alpha _{k}M_{t}^{k,(n)}\right) . \end{aligned}$$

It then holds that

$$\begin{aligned} \sup _{\alpha }\left| F_{K,N}(\alpha )-G_{K,N}(\alpha )\right| \le \max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left| \sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t}^{k,(n)}\right| . \end{aligned}$$

Indeed, for fixed \(\alpha \), \(n^{*}\) and \(t_{*}^{n^{*}}\) such that

$$\begin{aligned} F_{K,N}(\alpha )=Z_{t_{*}^{n^{*}}}^{(n^{*})}-\sum _{k=1}^{K}\alpha _{k}M_{t_{*}^{n^{*}}}^{k,(n^{*})} \end{aligned}$$

we have on the one hand

$$\begin{aligned}&F_{K,N}(\alpha )-G_{K,N}(\alpha )\\&\qquad \qquad \le Z_{t_{*}^{n^{*}}}^{(n^{*})}-\sum _{k=1}^{K}\alpha _{k}M_{t_{*}^{n^{*}}}^{k,(n^{*})}-\left( Z_{t_{*}^{n^{*}} }^{(n^{*})}-\sum _{k=1}^{K}\alpha _{k}M_{t_{*}^{n^{*}}}^{k,(n^{*} )}-\sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t_{*}^{n^{*}}}^{k,(n^{*} )}\right) \\&\qquad \qquad =\sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t_{*}^{n^{*}}}^{k,(n^{*})}\le \max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left| \sum _{k=K+1}^{\infty } \alpha _{k}^{*}M_{t}^{k,(n)}\right| , \end{aligned}$$

and on the other hand, with \(n^{\circ }\) and \(t_{\circ }^{n^{\circ }}\) such that

$$\begin{aligned} G_{K,N}(\alpha )=Z_{t_{\circ }^{n^{\circ }}}^{(n^{\circ })}-\sum _{k=1}^{K} \alpha _{k}M_{t_{\circ }^{n^{\circ }}}^{k,(n^{\circ })}-\sum _{k=K+1}^{\infty } \alpha _{k}^{*}M_{t_{\circ }^{n^{\circ }}}^{k,(n^{\circ })} \end{aligned}$$
$$\begin{aligned}&G_{K,N}(\alpha )-F_{K,N}(\alpha )\\&\qquad \qquad \le Z_{t_{\circ }^{n^{\circ }}}^{(n^{\circ })}-\sum _{k=1}^{K}\alpha _{k}M_{t_{\circ }^{n^{\circ }}}^{k,(n^{\circ })}-\sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t_{\circ }^{n^{\circ }}}^{k,(n^{\circ })}-\left( Z_{t_{\circ }^{n^{\circ }}}^{(n^{\circ })}-\sum _{k=1}^{K}\alpha _{k}M_{t_{\circ }^{n^{\circ }} }^{k,(n^{\circ })}\right) \\&\qquad \qquad =-\sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t_{\circ }^{n^{\circ }} }^{k,(n^{\circ })}\le \max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left| \sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t}^{k,(n)}\right| . \end{aligned}$$

Now let \(t_{*}^{(n)}\), \(n=1,\ldots ,N\), be defined such that for \(\alpha ^{*}:=\left( \alpha _{1}^{*},\ldots ,\alpha _{K}^{*}\right) \),

$$\begin{aligned} G_{K,N}(\alpha ^{*})=Z_{t_{*}^{(n)}}^{(n)}-\sum _{k=1}^{K}\alpha _{k} ^{*}M_{t_{*}^{(n)}}^{k,(n)}-\sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t_{*}^{(n)}}^{k,(n)}=Y^{*} \end{aligned}$$

for each n,  and assume that

$$\begin{aligned} \max _{n=1,\ldots ,N}\sum _{k=1}^{K}\delta _{k}M_{t_{*}^{(n)}}^{k,(n)}\ge c\left\| \delta \right\| \, \, \text {for all}\, \, \delta \in \mathbb {R}^{K}\text { and some }\, c>0. \end{aligned}$$
(29)

By applying Proposition D.3 to the cash-flow

$$\begin{aligned} Z_{t}-\sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t}^{k} \end{aligned}$$

it thus follows that

$$\begin{aligned} \underset{\alpha \in \mathbb {R}^{K}}{\arg \inf }\,G_{K,N}(\alpha )=\left( \alpha _{1}^{*},\ldots ,\alpha _{K}^{*}\right) \end{aligned}$$

on \(\mathcal {E}_{c,K,N}.\) Then, using the Markov and Doob inequalities, we get

$$\begin{aligned} \mathbb {P}\left( \sup _{\alpha }\left| F_{K,N}(\alpha )-G_{K,N} (\alpha )\right| \ge \varepsilon \right)&\le \mathbb {P}\left( \max _{n}\max _{t=0,\ldots ,T}\left| \sum _{k=K+1}^{\infty }\alpha _{k}^{*} M_{t}^{k,(n)}\right| \ge \varepsilon \right) \nonumber \\&=1-\mathbb {P}\left( \max _{n}\max _{t=0,\ldots ,T}\left| \sum _{k=K+1} ^{\infty }\alpha _{k}^{*}M_{t}^{k,(n)}\right|<\varepsilon \right) \nonumber \\&=1-\left( \mathbb {P}\left( \max _{t=0,\ldots ,T}\left| \sum _{k=K+1} ^{\infty }\alpha _{k}^{*}M_{t}^{k,(n)}\right| <\varepsilon \right) \right) ^{N}\nonumber \\&\le 1-(1-A_p\,\eta \,\varepsilon ^{-p}\, K^{-\rho })^{N}\le A_p\,\eta N\varepsilon ^{-p}\, K^{-\rho }\nonumber \\ \end{aligned}$$
(30)

for \(K>K_0\) and some constant \(A_p\) depending on \(p.\) Now consider K and N to be fixed and let

$$\begin{aligned} \alpha _{\inf }^{F}:=\left( \alpha _{\inf ,1}^{F},\ldots ,\alpha _{\inf ,K}^{F}\right) :=\underset{\alpha \in \mathbb {R}^{K}}{\arg \inf }F_{K,N}(\alpha ). \end{aligned}$$

Due to

$$\begin{aligned} G_{K,N}\left( \alpha _{\inf }^{F}\right)&=G_{K,N}\left( \alpha ^{*}-\left( \alpha ^{*}-\alpha _{\inf }^{F}\right) \right) \\&=\max _{n=1,\ldots ,N}\max _{t=0,\ldots ,T}\left( Z_{t}^{(n)}-\sum _{k=1}^{K} \alpha _{k}^{*}M_{t}^{k,(n)}\right. \\&\qquad -\left. \sum _{k=K+1}^{\infty }\alpha _{k}^{*} M_{t}^{k,(n)}+\sum _{k=1}^{K}\left( \alpha _{k}^{*}-\alpha _{\inf ,k} ^{F}\right) M_{t}^{k,(n)}\right) \\&\ge \max _{n=1,\ldots ,N}\left( Z_{t_{*}^{(n)}}^{(n)}-\sum _{k=1}^{K} \alpha _{k}^{*}M_{t_{*}^{(n)}}^{k,(n)}-\sum _{k=K+1}^{\infty }\alpha _{k}^{*}M_{t_{*}^{(n)}}^{k,(n)}+\sum _{k=1}^{K}\left( \alpha _{k}^{*}-\alpha _{\inf ,k}^{F}\right) M_{t_{*}^{(n)}}^{k,(n)}\right) \\&=Y^{*}+\max _{n=1,\ldots ,N}\sum _{k=1}^{K}\left( \alpha _{k}^{*} -\alpha _{\inf ,k}^{F}\right) M_{t_{*}^{(n)}}^{k,(n)}\\&\ge Y^{*}+c\left\| \alpha ^{*}-\alpha _{\inf }^{F}\right\| , \end{aligned}$$

by virtue of (29), it holds that

$$\begin{aligned} \left\| \alpha ^{*}-\alpha _{\inf }^{F}\right\|&\le \frac{1}{c}\left( G_{K,N}\left( \alpha _{\inf }^{F}\right) -G_{K,N}(\alpha ^{*})\right) \\&\le \frac{1}{c}\left| G_{K,N}\left( \alpha _{\inf }^{F}\right) -F_{K,N}\left( \alpha _{\inf } ^{F}\right) \right| \\&\quad +\,\frac{1}{c}\left( F_{K,N}\left( \alpha _{\inf }^{F}\right) -F_{K,N} (\alpha ^{*})\right) +\frac{1}{c}\left| F_{K,N}(\alpha ^{*} )-G_{K,N}(\alpha ^{*})\right| \\&\le \frac{2}{c}\sup _{\alpha }\left| F_{K,N}(\alpha )-G_{K,N}(\alpha )\right| . \end{aligned}$$

So we have

$$\begin{aligned} \mathbb {P}\left( \left\{ \Vert \alpha ^{*,K}-\alpha _{\inf }^{F}\Vert \ge \varepsilon \right\} \cap \mathcal {E}_{c,K,N} \right)\le & {} \mathbb {P}\left( \frac{2}{c}\sup _{\alpha }\left| F_{K,N}(\alpha )-G_{K,N}(\alpha )\right| \ge \varepsilon \right) \\\le & {} A_p\,\eta 2^p\, N(c\varepsilon )^{-p}\, K^{-\rho } \end{aligned}$$

by (30).

Appendix E: Proof of Theorem 3.2

The proof follows the lines of [6], where a similar result was proved for the covering of the sphere by random spherical caps.

Let \(z \in \mathsf {S}^{K-1}\) be a point such that \(z \not \in \bigcup _{n=1}^{N} S^{(n)}\). Then the spherical cap \(B(z,2\delta )\) centered on z and with opening angle \(2\delta \) is contained in the complement of the union \(\bigcup _{n=1}^{N} S^{\delta ,(n)}\), where \(S^{\delta ,(n)}\) is the realization of the random subset \(S^{\delta } = \bigcap _{t=1,\dots ,T} S^{\delta }_t\) corresponding to the realization \(M_t^{k,(n)}\).

In particular, the fraction \(u_{\delta }\) of points of the sphere \(\mathsf {S}^{K-1}\) that is not covered by the union \(\bigcup _{n=1}^{N} S^{\delta ,(n)}\) is not smaller than

$$\begin{aligned} \frac{|B(z,2\delta )|}{|\mathsf {S}^{K-1}|} = \frac{\frac{2\pi ^{(K-1)/2}}{\Gamma ((K-1)/2)} \int _{0}^{\delta } (\sin \varphi )^{K-2} d\varphi }{\frac{2\pi ^{K/2} }{\Gamma (K/2)}} = \frac{\Gamma (K/2)\int _{0}^{\delta } (\sin \varphi )^{K-2} d\varphi }{\sqrt{\pi }\Gamma ((K-1)/2)}. \end{aligned}$$

Hence

$$\begin{aligned} \mathbb {E }u_{\delta } \ge \mathbb {P }\left\{ \bigcup _{n=1}^{N} S^{(n)} \not = \mathsf {S}^{K-1} \right\} \cdot \frac{\Gamma (K/2)\int _{0}^{\delta } (\sin \varphi )^{K-2} d\varphi }{\sqrt{\pi }\Gamma ((K-1)/2)}. \end{aligned}$$
(31)

The expectation of \(u_{\delta }\) can now be computed as in [6]. By the independence of the \(S^{\delta ,(n)}\) we have that

$$\begin{aligned} \mathbb P\left\{ z \not \in \bigcup _{n=1}^{N} S^{\delta ,(n)} \right\} = \left( 1 - \mathbb P \left\{ z \in S^{\delta } \right\} \right) ^N \end{aligned}$$

and by Robbins’ theorem [7]

$$\begin{aligned} \mathbb Eu_{\delta } = \int _{\mathsf {S}^{K-1}} \left( 1 - \mathbb P \left\{ z \in S^{\delta } \right\} \right) ^N d\mu (z) \le \left( 1 - \min _{z \in \mathsf {S}^{K-1}}\mathbb P \left\{ z \in S^{\delta } \right\} \right) ^N \end{aligned}$$

with \(\mu \) the canonical measure on the sphere, normalized to total mass 1.

We therefore obtain by the assumption of Theorem 3.2 that

$$\begin{aligned} \mathbb {P }\left\{ \bigcup _{n=1}^{N} S^{(n)} \not = \mathsf {S}^{K-1} \right\}\le & {} \frac{(1 - \pi _0 + \pi _1\delta )^N\sqrt{\pi }\Gamma ((K-1)/2)}{\Gamma (K/2)\int _{0}^{\delta } (\sin \varphi )^{K-2} d\varphi } \\\le & {} \frac{\pi ^{K-3/2}\sqrt{K}(1 - \pi _0 + \pi _1\delta )^N}{2^{K-5/2}\delta ^{K-1}}. \end{aligned}$$

Here we have used that \(\frac{\Gamma \left( \frac{K-1}{2}\right) }{\Gamma \left( \frac{K}{2}\right) } \le \frac{\sqrt{2K}}{K-1}\), and \(\int _{0}^{\delta } (\sin \varphi )^{K-2} d\varphi \ge \left( \frac{2}{\pi }\right) ^{K-2} \frac{\delta ^{K-1}}{K-1}\) for \(\delta \le \frac{\pi }{2}\). With \(\delta = \frac{(1-\pi _0)(K-1)}{\pi _1(N-K+1)}\) we then get

$$\begin{aligned} \mathbb {P }\left\{ \bigcup _{n=1}^{N} S^{(n)} \not = \mathsf {S}^{K-1} \right\} \le 2\sqrt{\frac{2K}{\pi }}(1-\pi _0)^{N-K+1}\left( \frac{\pi \pi _1Ne}{2(K-1)}\right) ^{K-1}, \end{aligned}$$

where we used \(\left( \frac{N}{N-K+1}\right) ^{N-K+1} \le e^{K-1}\).
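
For completeness, the integral bound used above follows from the concavity estimate \(\sin \varphi \ge \frac{2}{\pi }\varphi \) on \([0,\frac{\pi }{2}]\):

$$\begin{aligned} \int _{0}^{\delta } (\sin \varphi )^{K-2}\, d\varphi \ge \left( \frac{2}{\pi }\right) ^{K-2} \int _{0}^{\delta } \varphi ^{K-2}\, d\varphi = \left( \frac{2}{\pi }\right) ^{K-2} \frac{\delta ^{K-1}}{K-1}. \end{aligned}$$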

This completes the proof of Theorem 3.2.


Cite this article

Belomestny, D., Hildebrand, R. & Schoenmakers, J. Optimal Stopping via Pathwise Dual Empirical Maximisation. Appl Math Optim 79, 715–741 (2019). https://doi.org/10.1007/s00245-017-9454-9
