1 Introduction

Two approaches to pricing and hedging

The question of pricing and hedging of a contingent claim lies at the heart of mathematical finance. Following Merton’s seminal contribution [38], we may distinguish two ways of approaching it. First, one may want to make statements “based on assumptions sufficiently weak to gain universal support”,\(^{1}\) e.g. market efficiency combined with some broad mathematical idealisation of the market setting. We refer to this perspective as the model-independent approach. While very appealing at first, it has been traditionally criticised for producing outputs which are too imprecise to be of practical relevance. This is contrasted with the second, model-specific approach which focuses on obtaining explicit statements leading to unique prices and hedging strategies. “To do so, more structure must be added to the problem through additional assumptions at the expense of losing some agreement.”\(^{1}\) Typically this is done by fixing a filtered probability space \((\varOmega, \mathcal {F},(\mathcal {F}_{t})_{t\ge0}, \mathbb {P})\) with risky assets represented by some adapted process \((S_{t})\).

The model-specific approach, originating from the seminal works of Samuelson [47] and Black and Scholes [8], has revolutionised the financial industry and become the dominant paradigm for researchers in quantitative finance. Accordingly, we refer to it also as the classical approach. The original model of Black and Scholes has been extended and generalised, e.g. adding stochastic volatility and/or stochastic interest rates, trying to account for market complexity observed in practice. Such generalisations often lead to market incompleteness and a lack of unique rational warrant prices. Nevertheless, no-arbitrage pricing and hedging was fully characterised in a body of works on the fundamental theorem of asset pricing (FTAP) culminating in Delbaen and Schachermayer [20, 21]. The feasible prices for a contingent claim correspond to expectations of the (discounted) payoff under equivalent martingale measures (EMM) and form an interval. The bounds of the interval are also given by the super- and sub-hedging prices. Put differently, the supremum of expectations of the payoff under EMMs is equal to the infimum of prices of superhedging strategies. We refer to this fundamental result as the pricing–hedging duality.

Short literature review

The ability to obtain unique prices and hedging strategies, which is the strength of the model-specific approach, relies on its primary weakness—the necessity to postulate a fixed probability measure ℙ giving a full probabilistic description of future market dynamics. Put differently, this approach captures risks within a given model, but fails to tell us anything about the model uncertainty, also called Knightian uncertainty; see Knight [36, Chap. 2]. Accordingly, researchers have extended the classical setup to one where many measures \(\{\mathbb {P}_{\alpha}:\alpha\in\varLambda\}\) are simultaneously deemed feasible. This can be seen as weakening assumptions and going back from model-specific towards model-independent. The pioneering works considered uncertain volatility; see Lyons [37] and Avellaneda et al. [2]. More recently, a systematic approach based on quasi-sure analysis was developed, with stochastic integration based on capacity theory in Denis and Martini [23] and on the aggregation method in Soner et al. [48]; see also Neufeld and Nutz [41]. In discrete time, a corresponding generalisation of the FTAP and the pricing–hedging duality was obtained by Bouchard and Nutz [10] and in continuous time by Biagini et al. [6]; see also references therein. We also mention that setups with frictions, e.g. trading constraints, were considered; see Bayraktar and Zhou [3].

In parallel, the model-independent approach has also seen a revived interest. This was mainly driven by the observation that with the increasingly rich market reality, this “universally acceptable” setting may actually provide outputs precise enough to be practically relevant. Indeed, in contrast to when Merton [38] was examining this approach, at present typically not only the underlying is liquidly traded, but so are many European options written on it. Accordingly, these should be treated as inputs and hedging instruments, thus reducing the possible universe of no-arbitrage scenarios. Breeden and Litzenberger [11] were the first to observe that if many (all) European options for a given maturity trade, then this is equivalent to fixing the marginal distribution of the stock under any EMM in the classical setting. Hobson [33] in his pioneering work then showed how this can be used to compute model-independent prices and hedges of lookback options. Other exotic options were analysed in subsequent works; see Brown et al. [12], Cox and Wang [18], Cox and Obłój [17]. The resulting no-arbitrage price bounds could still be too wide even for market making, but the associated hedging strategies were shown to perform remarkably well when compared to traditional delta–vega hedging; see Obłój and Ulmer [43]. Note that the superhedging property here is understood in a pathwise sense, and typically the strategies involve buy-and-hold positions in options and simple dynamic trading in the underlying. The universality of the setting and relative insensitivity of the outputs to the (few) assumptions earned this setup the name of robust approach.

In the wake of the financial crisis, significant research focus shifted back to the model-independent approach, and many natural questions, such as establishing the pricing–hedging duality and a (robust) version of the FTAP, were pursued. In a one-period setting, the pricing–hedging duality was linked to the Karlin–Isii duality in linear programming by Davis et al. [19]; see also Riedel [45]. Beiglböck et al. [5] re-interpreted the problem as a martingale optimal transport problem and established a general discrete-time pricing–hedging duality as an analogue of the Kantorovich duality in optimal transport. Here the primal elements are martingale measures, starting at a given point and having fixed marginal distribution(s) via the Breeden and Litzenberger [11] formula. The dual elements are sub- or superhedging strategies, and the payoff of the contingent claim is the “cost functional”. An analogous result in continuous time, under suitable continuity assumptions, was obtained by Dolinsky and Soner [25], who also more recently considered the discontinuous setting [26]. These topics remain an active field of research. Acciaio et al. [1] considered the pricing–hedging duality and the FTAP with an arbitrary market input in discrete time and under significant technical assumptions. These were relaxed, offering great insights, in a recent work of Burzoni et al. [14]. Galichon et al. [30] applied the methods of stochastic control to deduce the model-independent prices and hedges; see also Henry-Labordère et al. [32]. Several authors considered setups with frictions, e.g. transaction costs in Dolinsky and Soner [24] or trading constraints in Cox et al. [16] and Fahim and Huang [28].

Main contribution

The present work contributes to the literature on robust pricing and hedging of contingent claims in two ways. First, inspired by Dolinsky and Soner [25], we study the pricing–hedging duality in continuous time and extend their results to multiple dimensions, different market setups and options with uniformly continuous payoffs. Our results are general and obtained in a comprehensive setting. We explicitly specify several important special cases, including the setting when finitely many options are traded, some dynamically and some statically, and the setting when all European call options for \(n\) maturities are traded. The latter gives the martingale optimal transport (MOT) duality with \(n\) marginal constraints which was also recently studied in a discontinuous setup by Dolinsky and Soner [26] and, in parallel to our work, by Guo et al. [31].

Our second main contribution is to propose a robust approach, which subsumes the model-independent setting, but allows us to include assumptions and move gradually towards the model-specific setting. In this sense, we strive to provide a setup which connects and interpolates between the two ends of the spectrum considered by Merton [38]. In contrast, all the above works on the model-independent approach stay within Merton’s [38] “universally accepted” setting and analyse the implications of incorporating the ability to trade some options at given market prices for the outputs, namely prices and hedging strategies of other contingent claims. We amend this setup and allow expressing modelling beliefs. These are articulated in a pathwise manner. More precisely, we allow the modeller to deem certain paths impossible and exclude them from the analysis; the superhedging property is only required to hold on the remaining set of paths \(\mathfrak {P}\). This is reflected in the form of the pricing–hedging duality we obtain.

Our framework was inspired by Mykland’s [39] idea of incorporating a prediction set of paths into the pricing and hedging problem. On a philosophical level, we start with the “universally acceptable” setting and proceed by ruling out more and more scenarios as impossible; see also Cassese [15]. We may proceed in this way until we end up with paths supporting a unique martingale measure, e.g. a geometric Brownian motion, giving us essentially a model-specific setting. The hedging arguments are required to work for all the paths which remain under consideration, and a (strong) arbitrage would be given by a strategy which makes positive profit for all these paths. In discrete time, these ideas were recently explored by Burzoni et al. [13]. This should be contrasted with another way of interpolating between the model-independent and the model-specific, namely one which starts from a given model ℙ and proceeds by adding more and more possible scenarios \(\{\mathbb {P}_{\alpha}: \alpha\in\varLambda\}\). This naturally leads to probabilistic (quasi-sure) hedging and different notions of no-arbitrage; see Bouchard and Nutz [10].

Our approach to establishing the pricing–hedging duality involves both discretisation, as in Dolinsky and Soner [25], and a variational approach, as in Galichon et al. [30]. We first prove an “unconstrained” duality result: (3.4) states that for any derivative with bounded and uniformly continuous payoff function \(G\), the minimal initial cost of setting up a portfolio consisting of cash and dynamic trading in the risky assets (some of which could be options themselves) which superhedges the payoff \(G\) for every nonnegative continuous path is equal to the supremum of the expected value of \(G\) over all nonnegative continuous martingale measures.\(^{2}\) This result is shown through an elaborate discretisation procedure building on ideas in [25, 26]. Subsequently, we develop a variational formulation which allows us to add statically traded options, or the specification of a prediction set \(\mathfrak {P}\), via Lagrange multipliers. In some cases, this leads to “constrained” duality results, similar to the ones obtained in the works cited above, with superhedging portfolios allowed to trade statically in market options and the martingale measures required to reprice these options. In particular, Theorems 3.6 and 3.12 extend the duality obtained in [19] and in [25], respectively. However, in general, we obtain an asymptotic duality result with the dual and primal problems defined through a limiting procedure. The primal value is the limit of superhedging prices on an \(\epsilon \)-neighbourhood of \(\mathfrak {P}\), and the dual value is the limit of suprema of expectations of the payoff over \(\epsilon\)-(mis)calibrated models; see Definitions 2.1 and 3.11.

The paper is organised as follows. Section 2 introduces our robust framework for pricing and hedging and defines the primal (pricing) and dual (hedging) problems. Section 3 contains all the main results. First, in Sect. 3.1, we outline the unconstrained pricing–hedging duality, displayed in (3.4), and state the constrained (asymptotic) duality results under suitable compactness assumptions. This allows us in particular to treat the case of finitely many traded options. Then, in Sect. 3.2, we apply the previous results to the martingale optimal transport case. All the results except the main unconstrained duality in (3.4) are proved in Sect. 4. The unconstrained duality itself is stated in Theorem 5.1 and shown in Sect. 5. The proof proceeds via discretisation, of the primal problem in Sect. 5.1 and of the dual problem in Sect. 5.3, with Sect. 5.2 connecting the two via classical duality results. The proofs of two auxiliary results are relegated to the Appendix.

2 Robust modelling framework

2.1 Traded assets

We consider a financial market with \(d+1\) primary assets: a numeraire (e.g. the money market account) and \(d\) underlying assets which may be traded at any time \(t\le T\). All prices are denominated in the units of the numeraire. In particular, the numeraire’s price is thus normalised and equal to one. The underlying assets’ price path is denoted \(S=((S^{(1)}_{t},\ldots,S^{(d)}_{t}):t\in[0,T])\), starts in \(S_{0}=(1,\ldots ,1)\) and is assumed to be nonnegative and continuous in time. It is thus an element of the canonical space \(\mathcal {C}([0,T],\mathbb {R}_{+}^{d})\) of all \(\mathbb {R}_{+}^{d}\)-valued continuous functions on \([0,T]\), which we endow with the supremum norm \(\|\cdot\|\). Throughout, trading is frictionless.

We pursue a robust approach and do not postulate any probability measure which would specify the dynamics for \(S\). Instead, we incorporate as inputs prices of traded derivative instruments. We assume that there is a set \(\mathcal {X}\) of market-traded options with prices \(\mathcal {P}(X)\), \(X\in \mathcal {X}\), known at time zero. These options can be traded frictionlessly at time zero, but are not assumed to be available for trading at future times. In particular, only buy-and-hold trading in these options is allowed (static trading). An option \(X\in \mathcal {X}\) is just a mapping \(X:\mathcal {C}([0,T],\mathbb {R}_{+}^{d}) \to \mathbb {R}\), measurable with respect to the \(\sigma\)-field generated by the coordinate process. In the sequel, we only consider continuous payoffs \(X\) and often specialise further to European options, i.e., \(X(S)=f(S_{T_{i}})\) for some \(f\) and \(0< T_{i}\leq T\).

Further, in addition to the above, we allow dynamically traded derivative assets. We consider \(K\) continuously traded European options and assume their prices evolve continuously and are strictly positive. The \(j\)th option has initial price \(\mathcal {P}(X^{(c)}_{j})\) and terminal payoff \(X^{(c)}_{j}(S^{(1)}_{T},\ldots, S^{(d)}_{T})\) at time \(T\). Since trading is frictionless, we can and do consider traded options with renormalised payoffs \(X^{(c)}_{j}/\mathcal {P}(X^{(c)}_{j})\) and initial prices equal to 1. When we need to consider perturbations to options’ prices, this then corresponds to a multiplicative perturbation of their payoffs; see e.g. Assumption 3.1 below. We thus have \(d+K\) dynamically traded assets whose price paths belong to

$$\varOmega=\{S\in \mathcal {C}([0,T],\mathbb {R}_{+}^{d+K}) : S_{0}=(1,\ldots,1) \}. $$

Their price process is given by the canonical process \(\mathbb {S}=(\mathbb {S}_{t})_{0\le t \le T}\) on \(\varOmega\), i.e., \(\mathbb {S}=(\mathbb {S}^{(1)},\ldots, \mathbb {S}^{(d+K)}) : [0,T]\to \mathbb {R}^{d+K}_{+}\). We let \(\mathbb {F}=(\mathcal {F}_{t})_{0\le t\le T}\) be its natural raw filtration.\(^{3}\) The subset of paths which respect the market information about the future payoff constraints is given by

$$\mathcal {I}:= \{S\in \varOmega: S^{(d+i)}_{T} = X^{(c)}_{i}(S^{(1)}_{T},\ldots , S^{(d)}_{T})/\mathcal {P}(X^{(c)}_{i}), i= 1, \dots, K \}. $$

We sometimes refer to ℐ as the information space. For a random variable \(G\) on \(\varOmega\), we clearly have \(G=G\circ \mathbb {S}\), and we exploit this to write \(G(\mathbb {S})\) instead of simply \(G\) when we want to stress that \(G\) is seen as a function of the assets’ path. It is also convenient to think of \(X\in \mathcal {X}\) as functions on \(\varOmega\) with \(X(S)=X(S^{(1)},\ldots ,S^{(d)})\). We only consider continuous \(X\), i.e., \(\mathcal {X}\subseteq \mathcal {C}(\varOmega ,\mathbb {R})\) with \(\|X\|_{\infty}:= \sup\{|X(S)|:S\in\varOmega\}\). We write \(\mathcal {X}=\emptyset\) to indicate the situation with no statically traded options, and \(K=0\) to indicate when there are no dynamically traded options.

2.2 Beliefs

As argued in the introduction, we allow our agents to express modelling beliefs. These are encoded as restrictions of the path space and may come from time series analysis of past data, or idiosyncratic views about the market in the future. Put differently, we are allowed to rule out paths which we deem impossible. The paths which remain are referred to as the prediction set or beliefs. Note that such beliefs may also encode one agent’s superior information about the market. As the agent rejects more and more paths, the framework’s outputs, the robust price bounds, should get tighter and tighter. This can be seen as a way to quantify the impact of making assumptions or acquiring additional insights or information.

The choice of paths is expressed by the prediction set \(\mathfrak {P}\subseteq \mathcal {I}\). Our arguments are required to work pathwise on \(\mathfrak {P}\), while paths in the complement of \(\mathfrak {P}\) are ignored in our considerations. This binary way of specifying beliefs is motivated by the fact that in the end, we only see one path and hence are interested in arguments which work pathwise. Nevertheless, the approach is very comprehensive, and as \(\mathfrak {P}\) changes from all paths in ℐ to the support of a given model, we essentially interpolate between model-independent and model-specific setups. It also allows incorporating the information from time series of data coherently into the option pricing setup, as no probability measure is fixed and hence no distinction between real-world and risk-neutral measures is made. The idea of such a prediction set first appeared in Mykland [39]; see also Nadtochiy and Obłój [40] and Cox et al. [16] for an extended discussion.

2.3 Trading strategies and superreplication

We consider two types of trading: buy-and-hold strategies in options in \(\mathcal {X}\) and dynamic trading in assets \(S^{(i)}\), \(i\leq d+K\). The gains from the latter take the integral form \(\int \gamma _{u}(S)\,\mathrm {d}S_{u}\), and to define this integral pathwise, we need to impose suitable restrictions on \(\gamma \). We may, following Dolinsky and Soner [25], take \(\gamma =(\gamma _{t})_{0 \leq t\leq T}\) to be an \(\mathbb {F}\)-progressively measurable process of finite variation and use the integration by parts formula to define

$$ \int_{0}^{t}\gamma _{u}(S)\,\mathrm {d}S_{u}:=\gamma _{t} \cdot S_{t}-\gamma _{0}\cdot S_{0}-\int_{0}^{t}S_{u} \,\mathrm {d}\gamma _{u},\qquad S\in\varOmega, $$

where we write \(a\cdot b\) to denote the usual scalar product for any \(a, b\in \mathbb {R}^{d+K}\) and the last term on the right-hand side is a Stieltjes integral. However, for our duality, it is sufficient to consider a smaller class of processes. Namely, we say that \(\gamma \) is admissible if it is \(\mathbb {F}\)-adapted, \(\gamma (S)\) is a simple, i.e., right-continuous and piecewise constant, function for all \(S\in\varOmega\), and

$$ \int_{0}^{t}\gamma _{u}(S)\,\mathrm {d}S_{u}\ge-M, \quad\forall\, S\in \mathcal {I}, t\in [0,T]\text{, for some }M>0. \tag{2.1} $$

We denote by \(\mathcal{A}\) the set of such integrands \(\gamma \). To define static trading, we consider

$$\begin{aligned} \mathrm {Lin}_{N}(\mathcal {X})&=\bigg\{ a_{0}+\sum_{i=1}^{m}a_{i}X_{i} : m\in \mathbb {N},\,X_{i}\in \mathcal {X},\, a_{i}\in \mathbb {R},\,\sum_{i=0}^{m}|a_{i}|\le N\bigg\} ,\\ \mathrm {Lin}(\mathcal {X})&=\bigcup_{N\geq1} \mathrm {Lin}_{N}(\mathcal {X}). \end{aligned} \tag{2.2}$$

An admissible (semi-static) trading strategy is a pair \((X,\gamma )\) with \(X\in \mathrm {Lin}(\mathcal {X})\) and \(\gamma\in\mathcal{A}\). We denote the class of such trading strategies by \(\mathcal{A}_{\mathcal {X}}\). The cost of following \((X,\gamma )\in\mathcal{A}_{\mathcal {X}}\) is equal to the cost of setting up its static part, i.e., of buying the options at time zero, and is given by

$$\begin{aligned} \mathcal {P}(X):=a_{0}+\sum_{i=1}^{m} a_{i}\mathcal {P}(X_{i}),\qquad \mbox{ for }X=a_{0}+\sum _{i=1}^{m} a_{i} X_{i}. \end{aligned} \tag{2.3}$$

Throughout, we assume that the above defines \(\mathcal {P}\) uniquely as a linear operator on \(\mathrm {Lin}(\mathcal {X})\). This is in particular true if the elements in \(\mathcal {X}\) are linearly independent. Further, to eliminate obvious arbitrages, we assume that for \(X\in \mathcal {X}\), we have \(\mathcal {P}(X)\leq \|X\|_{\infty}\), which by (2.3) then holds for all \(X\in \mathrm {Lin}(\mathcal {X})\). It follows that \(\mathcal {P}\) is bounded linear, and hence continuous, on \((\mathrm {Lin}(\mathcal {X}),\|\cdot\|_{\infty})\). Note that with our definitions, \(\mathrm {Lin}(\emptyset)=\mathbb {R}\) and \(\mathcal {P}(a)=a\) for \(a\in \mathbb {R}\). We also note that dynamically traded assets can be traded statically; so our previous notation \(\mathcal {P}(X^{(c)}_{j})\) is consistent.
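As an illustration of the trading objects defined in this subsection, the following Python sketch evaluates the gains of a simple integrand on a path sampled at its rebalancing dates, once as a finite sum and once via the integration by parts formula above, and computes the cost of a static position. It is only a sketch under stated assumptions: the path is observed on a finite time grid, the holdings `gamma[i]` are constant on \([t_{i},t_{i+1})\), and the function names (`gains_simple`, `gains_by_parts`, `min_running_gain`, `static_cost`) are hypothetical and not part of the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Usual scalar product a . b."""
    return sum(x * y for x, y in zip(a, b))


def diff(a: Vector, b: Vector) -> List[float]:
    """Componentwise difference b - a."""
    return [y - x for x, y in zip(a, b)]


def gains_simple(gamma: Sequence[Vector], path: Sequence[Vector]) -> float:
    """Finite sum  sum_i gamma[i] . (S_{t_{i+1}} - S_{t_i})  for a simple integrand."""
    assert len(gamma) == len(path) - 1
    return sum(dot(g, diff(path[i], path[i + 1])) for i, g in enumerate(gamma))


def gains_by_parts(gamma: Sequence[Vector], path: Sequence[Vector]) -> float:
    """gamma_T . S_T - gamma_0 . S_0 - (Stieltjes integral of S against gamma).

    For a right-continuous, piecewise constant gamma the Stieltjes integral
    is the sum of S_{t_i} . (jump of gamma at t_i) over the jump dates.
    """
    boundary = dot(gamma[-1], path[-1]) - dot(gamma[0], path[0])
    stieltjes = sum(
        dot(path[i], diff(gamma[i - 1], gamma[i])) for i in range(1, len(gamma))
    )
    return boundary - stieltjes


def min_running_gain(gamma: Sequence[Vector], path: Sequence[Vector]) -> float:
    """Lowest running gain over the grid dates (including time 0).

    Assumption: only grid dates are inspected, so this is a necessary check
    for the lower bound -M in the admissibility condition, not a proof of it.
    """
    running, lowest = 0.0, 0.0
    for i, g in enumerate(gamma):
        running += dot(g, diff(path[i], path[i + 1]))
        lowest = min(lowest, running)
    return lowest


def static_cost(a0: float, weights: Vector, prices: Vector) -> float:
    """Cost a_0 + sum_i a_i P(X_i) of the static position a_0 + sum_i a_i X_i."""
    return a0 + dot(weights, prices)
```

On a sampled path the two gain formulas agree by Abel summation, which is the discrete counterpart of the integration by parts definition.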

Our prime interest is in understanding robust pricing and hedging of a non-liquidly traded derivative with payoff \(G:\varOmega\to \mathbb {R}\). Our main results consider bounded payoffs \(G\), and since the setup is frictionless and there are no trading restrictions, without any loss of generality, we may consider only the superhedging price. The subhedging price follows by considering \(-G\).

Definition 2.1

Consider \(G:\varOmega\to \mathbb {R}\). A portfolio \((X,\gamma )\in\mathcal{A}_{\mathcal {X}}\) is said to superreplicate \(G\) on \(\mathfrak {P}\) if

$$ X(S)+\int_{0}^{T}\gamma _{u}(S)\,\mathrm {d}S_{u}\ge G(S),\qquad \forall S\in \mathfrak {P}.$$

The (minimal) superreplication cost or superhedging price of \(G\) on \(\mathfrak {P}\) is defined as

$$ V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G):=\inf\{\mathcal {P}(X) : \exists(X,\gamma )\in\mathcal{A}_{\mathcal {X}} \text{ which superreplicates $G$ on $\mathfrak {P}$}\}. $$

The approximate superreplication cost of \(G\) on \(\mathfrak {P}\) is defined as

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G):=\inf\{\mathcal {P}(X) : \ & \exists(X,\gamma )\in\mathcal{A}_{\mathcal {X}}\text{ which} \\ &\text{superreplicates $G$ on $\mathfrak {P}^{\epsilon}$ for some $\epsilon> 0$}\}, \end{aligned}$$

where \(\mathfrak {P}^{\epsilon}=\{\omega\in \mathcal {I}: \inf_{\upsilon\in \mathfrak {P}}\| \omega-\upsilon\|\le\epsilon\}\).

As we shall see below, the approximate superreplication cost appears naturally as the correct object to obtain a duality with general \(\mathfrak {P}\) and \(\mathcal {X}\). We note, however, that ex post, it is also a natural object from the financial point of view: it requires the superreplication to be robust with respect to an arbitrarily small perturbation of the beliefs.
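To make Definition 2.1 concrete, the following Python sketch checks the superreplication inequality on a finite set of sampled paths and tests membership in the neighbourhood \(\mathfrak {P}^{\epsilon}\). It is a toy illustration under explicit assumptions: \(d=1\) and \(K=0\) (so that \(\mathcal {I}=\varOmega\)), paths are sampled on a common finite grid, the prediction set is finite, and the names `sup_distance`, `in_eps_neighbourhood`, `gains` and `superreplicates` are hypothetical.

```python
from typing import Callable, Sequence

Path = Sequence[float]  # scalar price path on a fixed time grid, path[0] == 1


def sup_distance(omega: Path, upsilon: Path) -> float:
    """Supremum norm of omega - upsilon on the sampling grid."""
    return max(abs(a - b) for a, b in zip(omega, upsilon))


def in_eps_neighbourhood(omega: Path, prediction_set: Sequence[Path], eps: float) -> bool:
    """True if omega lies within sup-distance eps of some path in the prediction set."""
    return min(sup_distance(omega, u) for u in prediction_set) <= eps


def gains(holdings: Sequence[float], path: Path) -> float:
    """Gains of a simple strategy: holdings[i] is held between grid dates i and i+1."""
    return sum(h * (path[i + 1] - path[i]) for i, h in enumerate(holdings))


def superreplicates(
    static_payoff: Callable[[Path], float],
    strategy: Callable[[Path], Sequence[float]],
    payoff: Callable[[Path], float],
    paths: Sequence[Path],
) -> bool:
    """Check X(S) + gains >= G(S) on every path of the given finite set.

    Assumption: `strategy` is supplied by the caller and must be adapted,
    i.e. holdings[i] may only depend on path[: i + 1]; this is not verified.
    """
    return all(static_payoff(s) + gains(strategy(s), s) >= payoff(s) for s in paths)


# Example: a call struck at 1, hedged by holding one unit of the asset.
up, down = [1.0, 1.25, 1.5], [1.0, 0.75, 0.5]
call = lambda s: max(s[-1] - 1.0, 0.0)
hold_one = lambda s: [1.0] * (len(s) - 1)
zero_cash = lambda s: 0.0
one_cash = lambda s: 1.0
```

In this example, zero initial cash suffices when only the rising path `up` is believed possible, whereas one unit of cash is needed once `down` is also admitted. This mirrors the remark in Sect. 2.2 that ruling out paths should tighten the robust price bounds.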

Note that by definition, \(\mathcal {I}^{\epsilon}= \mathcal {I}\) and consequently \(V_{\mathcal {X}, \mathcal {P}, \mathcal {I}}(G)=\widetilde{V}_{\mathcal {X}, \mathcal {P}, \mathcal {I}}(G)\). Finally, we denote by \(\mathbf {V}_{\mathcal {I}}(G) = V_{\emptyset, \mathcal {P}, \mathcal {I}}(G)\) the superreplication cost of \(G\) in the absence of constraints.

2.4 Market models

Our aim is to relate the robust superhedging price as introduced above to the classical pricing-by-expectation arguments. To this end, we look at all classical models which reprice market-traded options.

Definition 2.2

We denote by ℳ the set of probability measures ℙ on \((\varOmega ,\mathcal {F}_{T})\) such that \(\mathbb {S}\) is an \((\mathbb {F},\mathbb {P})\)-martingale and let \(\mathcal {M}_{\mathcal {I}}\) be the set of probability measures \(\mathbb {P}\in \mathcal {M}\) such that \(\mathbb {P}[ \mathcal {I}] =1\). A probability measure \(\mathbb {P}\in \mathcal {M}_{\mathcal {I}}\) is called an \((\mathcal {X},\mathcal {P},\mathfrak {P})\)-market model or simply a calibrated model if \(\mathbb {P}[\mathfrak {P}]=1\) and \(\mathbb {E}_{\mathbb {P}}[X]=\mathcal {P}(X)\) for all \(X\in \mathcal {X}\). The set of such measures is denoted by \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}\). More generally, a probability measure \(\mathbb {P}\in \mathcal {M}_{\mathcal {I}}\) is called an \(\eta\)-\((\mathcal {X},\mathcal {P},\mathfrak {P})\)-market model if \(\mathbb {P}[\mathfrak {P}^{\eta}]>1-\eta\) and \(|\mathbb {E}_{\mathbb {P}}[X]-\mathcal {P}(X)|<\eta\) for all \(X\in \mathcal {X}\). The set of such measures is denoted by \({ \mathcal {M}}^{\eta}_{\mathcal {X},\mathcal {P},\mathfrak {P}}\).

Any \(\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}\) provides us with a feasible no-arbitrage price \(\mathbb {E}_{\mathbb {P}}[G]\) for a derivative with payoff \(G\). The robust price for \(G\) is given as

$$ P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G):=\sup_{\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}}\mathbb {E}_{\mathbb {P}}[G], \tag{2.4} $$

where throughout, the expectation is defined with the convention that \(\infty{-} \infty{=} {-}\infty\). In the cases of particular interest, \((\mathcal {X},\mathcal {P})\) uniquely determines the marginal distributions of \(\mathbb {S}\) at given maturities, and \(P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\) is then the value of the corresponding martingale optimal transport problem. We often use this terminology, even in the case of an arbitrary \(\mathcal {X}\). Finally, in the special case where there are no constraints, i.e., \(\mathcal {X}= \emptyset\) and \(\mathfrak {P}= \mathcal {I}\), we write \(\mathbf {P}_{\mathcal {I}}(G)\) to represent the corresponding maximal modelling value, i.e.,

$$\mathbf {P}_{\mathcal {I}}(G) = P_{\emptyset, \mathcal {P}, \mathcal {I}}(G) = \sup_{\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})]. $$
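The statement above that \((\mathcal {X},\mathcal {P})\) may determine the marginal distributions can be illustrated in the simplest case, where \(\mathcal {X}\) consists of call options with one maturity. The Python sketch below recovers the law of \(S_{T}\) from call prices by second differences in the strike, in the spirit of the Breeden and Litzenberger [11] observation. It assumes, purely for illustration, that the law is supported on a uniform strike grid, in which case the second difference recovers the atoms exactly; the names `call_price` and `implied_masses` are hypothetical.

```python
from typing import Dict, List, Sequence


def call_price(strike: float, atoms: Dict[float, float]) -> float:
    """E[(S_T - strike)^+] for a finitely supported law {value: probability}."""
    return sum(p * max(s - strike, 0.0) for s, p in atoms.items())


def implied_masses(strikes: Sequence[float], calls: Sequence[float]) -> List[float]:
    """Probability mass at each interior strike of a uniform grid.

    The call price is piecewise linear in the strike, and its slope increases
    by P[S_T = k] at the strike k, so the scaled second difference
    (C(k-h) - 2 C(k) + C(k+h)) / h returns that mass when the law lives on the grid.
    """
    h = strikes[1] - strikes[0]
    return [
        (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / h
        for i in range(1, len(strikes) - 1)
    ]


# Example law with mean 1, consistent with the normalisation S_0 = 1.
law = {0.5: 0.25, 1.0: 0.5, 1.5: 0.25}
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
prices = [call_price(k, law) for k in grid]
```

For a law with a density, the same second difference divided once more by the grid step approximates the density, with an error that depends on the grid.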

We shall see below that with a general \(\mathcal {X}\) and \(\mathfrak {P}\), we do not obtain a duality using \(P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\) in (2.4), but rather have to consider its approximate value given as

$$\widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G):=\lim_{\eta\searrow0}\sup_{\mathbb {P}\in { \mathcal {M}}^{\eta}_{\mathcal {X},\mathcal {P},\mathfrak {P}}}\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})]. $$

Ex post, and similarly to the approximate superhedging above, this may be seen as a natural robust object to consider: instead of requiring a perfect calibration, the concept of \(\eta\)-market model allows a controlled degree of mis-calibration. This seems practically relevant since the market prices \(\mathcal {P}\) are an idealised concept obtained e.g. via averaging of bid–ask spreads.
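The notion of an \(\eta\)-market model in Definition 2.2 can be checked mechanically for a finitely supported measure. The following Python sketch does so under simplifying assumptions that are not in the paper: \(d=1\), \(K=0\), paths sampled on a common finite grid, a finite prediction set, and a measure given as a list of `(weight, path)` pairs. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, List, Sequence, Tuple

Path = Tuple[float, ...]
Measure = List[Tuple[float, Path]]  # finitely supported law: (weight, path) pairs


def sup_distance(a: Path, b: Path) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def expectation(measure: Measure, f: Callable[[Path], float]) -> float:
    return sum(w * f(p) for w, p in measure)


def is_martingale(measure: Measure, tol: float = 1e-12) -> bool:
    """Check E[S_{i+1} | S_0, ..., S_i] = S_i by grouping paths on their history."""
    n = len(measure[0][1])
    for i in range(n - 1):
        mass, mean = defaultdict(float), defaultdict(float)
        for w, path in measure:
            history = path[: i + 1]
            mass[history] += w
            mean[history] += w * path[i + 1]
        if any(abs(mean[h] - mass[h] * h[-1]) > tol for h in mass):
            return False
    return True


def is_eta_market_model(
    measure: Measure,
    options: Sequence[Callable[[Path], float]],
    prices: Sequence[float],
    prediction_set: Sequence[Path],
    eta: float,
) -> bool:
    """Martingale property, mass > 1 - eta near the prediction set, repricing error < eta."""
    if not is_martingale(measure):
        return False
    near = sum(
        w for w, p in measure
        if min(sup_distance(p, q) for q in prediction_set) <= eta
    )
    if not near > 1.0 - eta:
        return False
    return all(
        abs(expectation(measure, x) - price) < eta
        for x, price in zip(options, prices)
    )


# Example: symmetric one-period model and a call struck at 1 (model value 0.25).
model: Measure = [(0.5, (1.0, 0.5)), (0.5, (1.0, 1.5))]
call = lambda p: max(p[-1] - 1.0, 0.0)
both_paths = [(1.0, 0.5), (1.0, 1.5)]
```

With a quoted call price of 0.3, the example model misprices the call by 0.05, so it qualifies as an \(\eta\)-market model for \(\eta=0.1\) but not for \(\eta=0.01\).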

3 Main results

Our prime interest, as discussed in the introduction, is in establishing a general robust pricing–hedging duality. Given a non-traded derivative with payoff \(G\), we have two candidate robust prices for it. The first one, \(V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\), is obtained through pricing-by-hedging arguments. The second one, \(P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\), is obtained by pricing-via-expectation arguments. In a classical setting, by fundamental results, see e.g. Delbaen and Schachermayer [20, Theorem 5.7], the analogous two prices are equal.
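The two candidate prices can be compared by brute force in a deliberately simple toy setting, which is not the continuous-time framework of this paper: one period, a single asset with \(S_{0}=1\), finitely many possible terminal values on both sides of 1, and no traded options. The Python sketch below computes the pricing-via-expectation value by enumerating extreme martingale measures (supported on at most two points) and the pricing-by-hedging value by enumerating candidate hedge ratios. The names `model_price` and `superhedging_price` are hypothetical.

```python
from itertools import combinations
from typing import Callable, Sequence


def model_price(states: Sequence[float], payoff: Callable[[float], float]) -> float:
    """sup of E[G(S_T)] over probability vectors on `states` with mean 1.

    The constraint set is a polytope whose extreme points are supported on at
    most two states, so it suffices to scan a Dirac mass at 1 (if available)
    and all pairs straddling 1.
    """
    assert min(states) < 1.0 < max(states)
    best = float("-inf")
    if 1.0 in states:
        best = payoff(1.0)
    for lo, hi in combinations(sorted(states), 2):
        if lo < 1.0 < hi:
            p_hi = (1.0 - lo) / (hi - lo)  # weight on hi making the mean equal to 1
            best = max(best, (1.0 - p_hi) * payoff(lo) + p_hi * payoff(hi))
    return best


def superhedging_price(states: Sequence[float], payoff: Callable[[float], float]) -> float:
    """min of a such that a + g * (s - 1) >= G(s) for all states s, for some g.

    For fixed g the cheapest a is max_s [G(s) - g (s - 1)], a convex piecewise
    linear function of g whose minimum is attained at a kink, i.e. at a slope
    (G(b) - G(a)) / (b - a) through two states.
    """
    assert min(states) < 1.0 < max(states)
    slopes = {
        (payoff(b) - payoff(a)) / (b - a)
        for a, b in combinations(sorted(states), 2)
    }
    return min(max(payoff(s) - g * (s - 1.0) for s in states) for g in slopes)


# Example: three terminal values and a call struck at 1.
states = [0.5, 1.0, 1.5]
call = lambda s: max(s - 1.0, 0.0)
```

In this finite setting the two values coincide by linear programming duality; the results below concern the far more delicate continuous-time, pathwise analogue.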

Within the present pathwise robust approach, the pricing–hedging duality was obtained for specific payoffs \(G\) in the literature linking the robust approach with the Skorokhod embedding problem; see Hobson [33] or Obłój [42] for a discussion. Subsequently, an abstract result was established in Dolinsky and Soner [25]. For \(d=1\), \(K=0\), \(\mathfrak {P}= \mathcal {I}=\varOmega\) and \(\mathcal {X}\) the set of all call (or put) options with a common maturity \(T\) and with \(\mathcal {P}(X)=\int_{0}^{\infty} X(x)\mu(\mathrm {d}x)\), \(\forall X\in \mathcal {X}\), where \(\mu\) is a probability measure on \(\mathbb {R}_{+}\) with mean equal to 1, they showed that

$$V_{\mathcal {X},\mathcal {P},\mathcal {I}}(G)=P_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G)\qquad \text{for a ``strongly continuous'' class of bounded $G$}. $$

The result was extended to unbounded claims by broadening the class of admissible strategies and imposing a technical assumption on \(\mu\). We extend this duality to a much more general setting of abstract \(\mathcal {X}\), possibly involving options with multiple maturities, a multidimensional setting, and with an arbitrary prediction set \(\mathfrak {P}\). However, in this generality, the duality may only hold between approximate values. We first give the statements, illustrated with examples, and all the proofs are postponed to Sect. 4.

Note that, for any Borel \(G:\varOmega\to \mathbb {R}\), the inequality

$$\begin{aligned} V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\ge P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) \end{aligned} \tag{3.1}$$

is true as long as there is at least one \(\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}\) and at least one \((X,\gamma)\in \mathcal{A}_{\mathcal {X}}\) which superreplicates \(G\) on \(\mathfrak {P}\). Indeed, since \(\gamma \) is progressively measurable, the integral \(\int_{0}^{\cdot} \gamma _{u}(\mathbb {S})\,\mathrm {d}\mathbb {S}_{u}\), defined pathwise via integration by parts, agrees a.s. with the stochastic integral under ℙ. Then by (2.1), the stochastic integral is a ℙ-supermartingale and hence \(\mathbb {E}_{\mathbb {P}}[\int_{0}^{T} \gamma _{u}(\mathbb {S})\,\mathrm {d}\mathbb {S}_{u}]\le0\). This in turn implies that \(\mathbb {E}_{\mathbb {P}}\left[G\right]\leq \mathcal {P}(X)\). The result follows since \((X,\gamma )\) and ℙ were arbitrary. The converse inequality, however, is very involved. The fundamental difficulty lies in the fact that even in the martingale optimal transport setting of [25], the set \(\mathcal {M}_{\mathcal {X},\mathcal {P}, \mathcal {I}}\) is not compact. This is in contrast to the discrete-time case; see [5]. In our general setting, the converse inequality to (3.1) may fail, see Example 3.7 below, making it necessary to look at the duality between the approximate values.
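For convenience, the argument given above for the inequality \(V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\ge P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\) can be written as a single chain; this is only a restatement, assuming all expectations are well defined. Fix \(\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}\) and \((X,\gamma )\in\mathcal{A}_{\mathcal {X}}\) which superreplicates \(G\) on \(\mathfrak {P}\). The first inequality uses \(\mathbb {P}[\mathfrak {P}]=1\), the equality uses \(\mathbb {E}_{\mathbb {P}}[X]=\mathcal {P}(X)\), which extends from \(\mathcal {X}\) to \(\mathrm {Lin}(\mathcal {X})\) by linearity, and the last inequality uses the supermartingale property:

$$\begin{aligned} \mathbb {E}_{\mathbb {P}}[G] \le \mathbb {E}_{\mathbb {P}}\bigg[X+\int_{0}^{T}\gamma _{u}(\mathbb {S})\,\mathrm {d}\mathbb {S}_{u}\bigg] = \mathcal {P}(X)+\mathbb {E}_{\mathbb {P}}\bigg[\int_{0}^{T}\gamma _{u}(\mathbb {S})\,\mathrm {d}\mathbb {S}_{u}\bigg] \le \mathcal {P}(X). \end{aligned}$$

Taking the supremum over \(\mathbb {P}\) on the left and the infimum over \((X,\gamma )\) on the right gives the claimed inequality.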

3.1 General duality

We start with a general duality between the approximate values \(\widetilde{V}, \widetilde{P}\). But first, we give our standing assumption which states that the prices of dynamically traded options are not “on the boundary of the no-arbitrage region”, i.e., calibrated martingale measures exist under arbitrarily small perturbation of the initial prices.

Assumption 3.1

Either \(K = 0\) or \(X^{(c)}_{1},\ldots, X^{(c)}_{K}\) are bounded and uniformly continuous with market prices \(\mathcal {P}(X^{(c)}_{1}),\ldots, \mathcal {P}(X^{(c)}_{K})\) satisfying that there exists an \(\epsilon> 0\) such that for any \((p_{i})_{1\le i\le K}\) with \(|\mathcal {P}(X^{(c)}_{i}) - p_{i}|\le\epsilon \) for all \(i\le K\), we have \(\mathcal {M}_{\tilde{ \mathcal {I}}}\neq\emptyset\), where

$$\begin{aligned} \tilde{ \mathcal {I}}:= \{S\in\varOmega: S^{(d+i)}_{T} = X^{(c)}_{i}(S^{(1)}_{T},\ldots, S^{(d)}_{T})/p_{i}\text{ for all }1\leq i\le K\}. \end{aligned}$$

Theorem 3.2

Assume that \(\mathfrak {P}\) is a measurable subset of ℐ, Assumption 3.1 holds, all \(X\in \mathcal {X}\) are uniformly continuous and bounded, and \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta}\neq\emptyset\) for any \(\eta>0\). Then for any uniformly continuous and bounded \(G:\varOmega\to \mathbb {R}\), we have

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) \ge \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G), \end{aligned}$$

and if \(\mathrm {Lin}_{1}(\mathcal {X})\) defined in (2.2) is a compact subset of \((\mathcal {C}(\varOmega, \mathbb {R}), \|\cdot\|_{\infty})\), then equality holds, i.e.,

$$ \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G). $$

Remark 3.3

The above remains true if instead of martingale measures in ℳ, we restrict to Brownian martingales; see Sect. 4.1.

Example 3.4

Finite \(\mathcal {X}\)

Consider \(\mathcal {X}= \{ X_{1}, \ldots, X_{m}\}\), where the \(X_{i}\) are bounded and uniformly continuous. In this case, \(\mathrm {Lin}_{1}(\mathcal {X})\) is a convex and compact subset of \(\mathcal {C}(\varOmega, \mathbb {R})\). Therefore, if \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta} \neq\emptyset\) for any \(\eta>0\), we can apply Theorem 3.2 to conclude that \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=\widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\).

Let us outline the proof of Theorem 3.2. The inequality (3.2) is relatively easy to obtain, and the main effort is in establishing the converse inequality which yields (3.3). This is done in two steps. First, we consider the case without constraints, i.e., \(\mathcal {X}=\emptyset\) and \(\mathfrak {P}= \mathcal {I}\). The approximate values \(\widetilde{V}, \widetilde{P}\) then reduce to \(V\) and \(P\) respectively; so we need to show that for any bounded and uniformly continuous \(G:\varOmega\to \mathbb {R}\), we have

$$\begin{aligned} \mathbf {V}_{\mathcal {I}}(G)=\mathbf {P}_{\mathcal {I}}(G). \end{aligned}$$

This result is a special case of Theorem 5.1, which is shown in Sects. 5.1 and 5.3. Our proof proceeds through discretisation of both the primal and the dual problem and is inspired by the methods in [25, 26] but involves significant technical differences which are necessary to obtain our more general results. The first key difference, when comparing with [25], is that the discretisation therein entangles discretisation of the dynamic hedging part and static hedging part, while we develop a “clean” decoupled discretisation of the dynamic hedging part only. Second, we have to improve the discretisation to deal with payoff functions which are uniformly continuous. This is crucial for the subsequent use of a variational approach to generalise the pricing–hedging duality results to include static hedging in options with different maturities. The time-continuity assumptions on the payoff made in [25] are much stronger, and their results could not be applied directly in our framework.

We note also that in a quasi-sure setting, an analogue of (3.4) was obtained in Possamaï et al. [44] and earlier papers, as discussed therein. However, while similar in spirit, there is no immediate link between our results or proofs and those in [44]. Here, we consider a comparatively smaller set of admissible trading strategies and require a pathwise superhedging property. Consequently, we also need to impose stronger regularity constraints on \(G\).

In the second step of the proof, we use a variational approach combined with a minimax argument to obtain duality under all the constraints. Specifically, in analogy to e.g. Proposition 5.2 in Henry-Labordère et al. [32], for any uniformly continuous and bounded \(G:\varOmega\to \mathbb {R}\), we can write

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)&=\inf_{X\in \mathrm {Lin}_{N}(\mathcal {X}),\,N\ge0}\big(\mathbf {V}_{\mathcal {I}}(G-X-N\lambda _{\mathfrak {P}})+\mathcal {P}(X)\big) \\ &=\inf_{X\in \mathrm {Lin}_{N}(\mathcal {X}),\,N\ge0}\big(\mathbf {P}_{\mathcal {I}}(G-X-N\lambda_{\mathfrak {P}})+\mathcal {P}(X)\big) , \end{aligned}$$

where \(\lambda_{\mathfrak {P}}(\omega):=\inf_{\upsilon\in \mathfrak {P}}\|\omega-\upsilon \|\wedge1\). An application of the minimax theorem then yields

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = \lim_{N\to\infty} \sup_{\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}}\inf_{X\in \mathrm {Lin}_{N}(\mathcal {X})}\big(\mathbb {E}_{\mathbb {P}} [G-X-N\lambda_{\mathfrak {P}}]+\mathcal {P}(X)\big). \end{aligned}$$

The final, and somewhat technical, argument is to show that the above is dominated by \(\widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\).
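To make the penalty \(\lambda_{\mathfrak {P}}\) concrete, the following minimal Python sketch evaluates it for one hypothetical choice of prediction set, \(\mathfrak {P}=\{S:\|S\|\le b\}\) with \(d=1\) (the set that also appears in Example 3.9). For this set, clipping a path at level \(b\) yields a nearest element of \(\mathfrak {P}\), so the distance reduces to \((\|\omega\|-b)^{+}\). That reduction, the function name `lambda_band`, and the representation of a path by finitely many sampled values are assumptions made for illustration only.

```python
def lambda_band(path, b):
    """Penalty lambda_P(omega) = min(dist(omega, P), 1) for P = {S : sup|S_t| <= b}.

    `path` is a list of sampled values of a one-dimensional path (an
    illustrative stand-in for a continuous path); distances use the sup norm.
    """
    sup_norm = max(abs(s) for s in path)
    distance = max(sup_norm - b, 0.0)  # attained by clipping the path to [-b, b]
    return min(distance, 1.0)


inside = [1.0, 1.4, 1.9, 1.2]   # never exceeds b = 2, so no penalty
near = [1.0, 1.5, 2.25, 1.8]    # exceeds b = 2 by 0.25
far = [1.0, 3.0, 5.0, 4.0]      # distance 3 to the set, capped at 1

print(lambda_band(inside, 2.0), lambda_band(near, 2.0), lambda_band(far, 2.0))
```

The term \(-N\lambda_{\mathfrak {P}}\) therefore leaves the payoff unchanged on \(\mathfrak {P}\) and lowers it by up to \(N\) on paths far from \(\mathfrak {P}\), which is how the constraint enters the unconstrained problem.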

We end this section with a study of the relation between \(V,P\) and their approximate values \(\widetilde{V}, \widetilde{P}\). As already noted, by definition, \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\ge V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\) and \(\widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\ge P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\). Therefore, when \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\), the duality \(V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\) follows if we can show that \(P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\). We establish this equality for an important family of market setups, but also provide examples when it fails.

Consider first the case with no specific beliefs, \(\mathfrak {P}= \mathcal {I}\), and finitely many traded put options with maturities \(0< T_{1}<\cdots<T_{n}=T\), i.e.,

$$\begin{aligned} \mathcal {X}=\big\{ (K^{(i)}_{k,j}-\mathbb {S}^{(i)}_{T_{j}})^{+} : 1\le i\le d, 1\le j\le n, 1\le k\le m(i,j)\big\} , \end{aligned}$$

where \(0< K^{(i)}_{k,j} < K^{(i)}_{k^{\prime},j}\) for any \(k < k^{\prime }\) and \(m(i,j)\in \mathbb {N}\). To simplify the notation, we write

$$\begin{aligned} \mathcal {P}\big((K^{(i)}_{k,j} - \mathbb {S}^{(i)}_{T_{j}})^{+}\big) = p_{k,i,j}, \qquad \forall i,j,k. \end{aligned}$$

In analogy to Assumption 3.1, we need to impose that the put prices are in the interior of the no-arbitrage region.

Assumption 3.5

Market put prices are such that there exists an \(\epsilon> 0\) such that for any \((\tilde{p}_{k,i,j})_{i,j,k}\) with \(|\tilde{p}_{k,i,j} - p_{k,i,j}|\le\epsilon\) for all \(i,j,k\), there exists a \(\tilde{\mathbb {P}}\in \mathcal {M}_{ \mathcal {I}}\) with

$$\begin{aligned} \tilde{p}_{k,i,j} = \mathbb {E}_{\tilde{\mathbb {P}}}[(K^{(i)}_{k,j} - \mathbb {S}^{(i)}_{T_{j}})^{+}], \qquad \forall i,j,k. \end{aligned}$$

Theorem 3.6

Let \(\mathcal {X}\) be given by (3.6) and assume that the market prices satisfy Assumptions 3.1 and 3.5. Then for any uniformly continuous and bounded \(G: \varOmega\to \mathbb {R}\), we have

$$\begin{aligned} V_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G) = P_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G). \end{aligned}$$

The above result establishes a general robust pricing–hedging duality when finitely many put options are traded. It extends in many ways the duality obtained in Davis et al. [19] for \(d=n=1\) and \(K=0\). Note that in general, \(\widetilde{V}_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G)=V_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G)\); so it follows from Example 3.4 that we also have \(\widetilde{P}_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G)=P_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G)\) in Theorem 3.6. These equalities may still hold, but may also fail, when nontrivial beliefs are specified. We present now three examples to highlight various possible scenarios. In the first two examples, for different reasons, the pricing–hedging duality fails, i.e., \(V_{\mathfrak {P}}>P_{\mathfrak {P}}\), while the approximate duality (3.3) holds. In the last example, all quantities are equal.

Example 3.7

Consider the case when there are no traded options, \(\mathcal {X}=\emptyset\), \(K=0\) and \(d=1\), and let

$$\mathfrak {P}= \{S \in\varOmega: S_{T}\le2 \}. $$

Define \(G:\varOmega\to \mathbb {R}\) by \(G(S ) = ( \max_{0 \leq t\leq T}S_{t} - 4)^{+}\wedge1\). Theorem 3.2 implies that \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\). On the other hand, it is straightforward to see that \(P_{\mathfrak {P}}(G) = 0\). By letting \(\mathcal {M}^{\mathrm{loc}}_{\mathfrak {P}}\) be the set of ℙ such that \(\mathbb {S}\) is a ℙ-local martingale and \(\mathbb {P}[\mathfrak {P}] = 1\), we further have

$$V_{\mathfrak {P}}(G) \geq\sup_{\mathbb {P}\in \mathcal {M}^{\mathrm{loc}}_{\mathfrak {P}}}\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})] > 0=P_{\mathfrak {P}}(G). $$

Example 3.8

In this example, we consider \(\mathfrak {P}\) corresponding to the Black–Scholes model. For simplicity, consider the case without any traded options, i.e., \(K=0\), \(\mathcal {X}= \emptyset\), \(d=1\), and let (see footnote 4)

$$\mathfrak {P}= \{S \in\varOmega: S \text{ admits a quadratic variation and } \mathrm {d}\langle S \rangle_{t} = \sigma^{2} S^{2}_{t} \mathrm {d}t, 0\leq t\leq T\}. $$

Then \(\mathcal {M}_{\mathfrak {P}} = \{\mathbb {P}_{\sigma}\}\), where \(\mathbb {S}\) is a geometric Brownian motion with constant volatility \(\sigma\) under \(\mathbb {P}_{\sigma}\). The duality in Theorem 3.2 then gives that for any bounded and uniformly continuous \(G\),

$$\begin{aligned} \widetilde{V}_{\mathfrak {P}}(G) &= \inf\{x : \exists\gamma\!\in\! \mathcal{A}\text{ which superreplicates $G-x$ on $\mathfrak {P}^{\epsilon}$ for some } \epsilon>0\}\\ &= \lim_{\eta\searrow0} \sup_{\mathbb {P}\in \mathcal {M}_{\mathfrak {P}}^{\eta}} \mathbb {E}_{\mathbb {P}}[G]. \end{aligned}$$

However, in this case, \(\mathbb {P}_{\sigma}\) has full support on \(\varOmega\) so that \(\mathfrak {P}^{\epsilon} = \varOmega\) and \(\mathcal {M}_{\mathfrak {P}}^{\epsilon}= \mathcal {M}\) for any \(\epsilon>0\). The above then boils down to the case with no beliefs, and we have

$$\begin{aligned} V_{ \mathcal {I}}(G)=\widetilde{V}_{\mathfrak {P}}(G)= \widetilde{P}_{\mathfrak {P}}(G)=\sup_{\mathbb {P}\in \mathcal {M}}\mathbb {E}_{\mathbb {P}}[G] \ge \mathbb {E}_{\mathbb {P}_{\sigma}}[G] = P_{\mathfrak {P}}(G), \end{aligned}$$

where for most \(G\) the inequality is strict.
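The size of the gap can be illustrated numerically. The Python sketch below makes several assumptions that are not in the text: the payoff is the European call spread \(G(S)=g(S_{T})\) with \(g(s)=(s-1)^{+}\wedge1\), interest rates are zero, \(S_{0}=1\), \(\sigma=0.2\), \(T=1\), and the supremum over all martingale measures is computed as the concave envelope of \(g\) at \(S_{0}\) (a standard fact for European payoffs, approximated here by chords over a finite grid of terminal values). All function names are hypothetical.

```python
import math


def bs_call(strike, sigma, maturity, spot=1.0):
    """Black-Scholes call price with zero interest rate (an assumption here)."""
    vol = sigma * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + 0.5 * vol * vol) / vol
    d2 = d1 - vol
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return spot * cdf(d1) - strike * cdf(d2)


def g(s):
    """Hypothetical bounded payoff g(s) = min((s - 1)^+, 1)."""
    return min(max(s - 1.0, 0.0), 1.0)


def concave_envelope_at(s0, grid):
    """Largest chord value at s0 over grid points a <= s0 <= b with a < b."""
    best = g(s0)
    for a in grid:
        for b in grid:
            if a <= s0 <= b and a < b:
                w = (s0 - a) / (b - a)  # weight of b in a two-point law with mean s0
                best = max(best, (1.0 - w) * g(a) + w * g(b))
    return best


grid = [i / 10.0 for i in range(101)]          # terminal values in [0, 10]
robust_price = concave_envelope_at(1.0, grid)  # proxy for the sup over martingale laws
bs_price = bs_call(1.0, 0.2, 1.0) - bs_call(2.0, 0.2, 1.0)  # call spread equals g(S_T)

print(robust_price, bs_price)
```

Under these assumptions the model-free value is attained by the two-point terminal law on \(\{0,2\}\), while the Black–Scholes value is considerably smaller, so the inequality is strict for this \(G\).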

Example 3.9

Consider again the case with no traded options, \(K=0\) and \(\mathcal {X}=\emptyset\), and let

$$\mathfrak {P}= \{S \in\varOmega: \|S\|\le b\} \qquad \text{for some $b\ge1$}. $$

Given a bounded and uniformly continuous payoff function \(G\), consider the duality in Theorem 3.2. For each \(N\in \mathbb {N}\), we pick \(\mathbb {P}^{(N)}\in \mathcal {M}_{\mathfrak {P}}^{1/N}\) such that

$$ \mathbb {E}_{\mathbb {P}^{(N)}}[G]\ge\sup_{\mathbb {P}\in \mathcal {M}_{\mathfrak {P}}^{1/N}}\mathbb {E}_{\mathbb {P}}[G]-1/N. $$

Let \(\tau\) be the first hitting time of \(b+1/N\) by \(\mathbb {S}\) and define \(\tilde{\mathbb {S}}^{(N)}\) by

$$\tilde{\mathbb {S}}^{(N)}_{t} = \mathbb {S}_{0} + \frac{b}{b+1/N}(\mathbb {S}_{t\wedge\tau} - \mathbb {S}_{0}). $$

By definition, \(\mathbb {P}^{(N)} \circ(\tilde{\mathbb {S}}^{(N)})^{-1} \in \mathcal {M}_{\mathfrak {P}}\). Also note that \(\mathbb {P}^{(N)}[\tau< T] \le1/N\). Hence by uniform continuity of \(G\), it is straightforward to see that

$$\big|\mathbb {E}_{\mathbb {P}^{(N)}}[G(\tilde{\mathbb {S}}^{(N)})] - \mathbb {E}_{\mathbb {P}^{(N)}}[G(\mathbb {S})]\big|\longrightarrow 0 \qquad \text{as } N\to\infty, $$

which leads to \(\widetilde{P}_{\mathfrak {P}}(G) = P_{\mathfrak {P}}(G)\). As \(V_{\mathfrak {P}}(G)\le\widetilde{V}_{\mathfrak {P}}(G) = \widetilde{P}_{\mathfrak {P}}(G)\), we then conclude that

$$ \widetilde{V}_{\mathfrak {P}}(G) = \widetilde{P}_{\mathfrak {P}}(G) = P_{\mathfrak {P}}(G) = V_{\mathfrak {P}}(G). $$

3.2 Martingale optimal transport duality

We focus now on the case when \((\mathcal {X},\mathcal {P})\) determines the marginal distributions of \(\mathbb {S}^{(i)}_{T_{j}}\) for \(i\le d\) and given maturities \(0< T_{1}<\cdots<T_{n}=T\). For concreteness, let us consider the case when put options are traded, i.e.,

$$\begin{aligned} \mathcal {X}=\big\{ (\kappa-\mathbb {S}^{(i)}_{T_{j}})^{+}: i=1,\ldots,d,j=1,\ldots,n,\kappa \in \mathbb {R}_{+}\big\} . \end{aligned}$$

Arbitrage considerations, see e.g. Cox and Obłój [17] and Cox et al. [16], show that absence of (a weak type of) arbitrage is equivalent to \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathcal {I}}\neq\emptyset\). Note that the latter is equivalent to market prices \(\mathcal {P}\) being encoded by a vector \(\pmb {\mu}\) of probability measures \((\mu^{(i)}_{j})\) with

$$\begin{aligned} p_{i,j}(\kappa)=\mathcal {P}\big((\kappa-\mathbb {S}^{(i)}_{T_{j}})^{+}\big)=\int(\kappa -s)^{+}\mu^{(i)}_{j}(\mathrm {d}s), \end{aligned}$$

where for each \(i=1,\ldots, d\), \(\mu^{(i)}_{1},\ldots,\mu^{(i)}_{n}\) have finite first moments, mean 1 and increase in convex order (written as \(\mu^{(i)}_{1}\preceq\mu^{(i)}_{2}\preceq\cdots\preceq\mu ^{(i)}_{n}\)), i.e., we have \(\int\phi(x)\mu^{(i)}_{1}(\mathrm {d}x)\le\cdots \le\int\phi(x)\mu^{(i)}_{n}(\mathrm {d}x)\) for any convex function \(\phi: \mathbb {R}_{+}\to \mathbb {R}\). In fact, as noted already by Breeden and Litzenberger [11], the \(\mu^{(i)}_{j}\) are defined by

$$\begin{aligned} \mu^{(i)}_{j}([0,\kappa])= p^{\prime}_{i,j}(\kappa+) \qquad \text{for }\kappa\in \mathbb {R}_{+}. \end{aligned}$$
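The Breeden–Litzenberger relation can be checked on a toy example. The Python sketch below assumes a hypothetical finitely supported marginal with mean 1, computes put prices from it, and recovers the distribution function from a forward difference in the strike as a stand-in for the right derivative \(p^{\prime}(\kappa+)\). The atoms, weights, step size `h` and function names are illustrative choices only.

```python
def put_price(kappa, atoms, weights):
    """Put price p(kappa) = integral of (kappa - s)^+ against a discrete law."""
    return sum(w * max(kappa - s, 0.0) for s, w in zip(atoms, weights))


def cdf_from_puts(kappa, atoms, weights, h=1e-6):
    """Forward difference approximating the right derivative p'(kappa+)."""
    up = put_price(kappa + h, atoms, weights)
    here = put_price(kappa, atoms, weights)
    return (up - here) / h


# Hypothetical marginal with mean 1: atoms 0.5, 1.0, 1.5 with weights 1/4, 1/2, 1/4.
atoms = [0.5, 1.0, 1.5]
weights = [0.25, 0.5, 0.25]

for kappa in (0.25, 0.5, 1.0, 1.2, 2.0):
    exact = sum(w for s, w in zip(atoms, weights) if s <= kappa)
    print(kappa, round(cdf_from_puts(kappa, atoms, weights), 6), exact)
```

The forward difference matches \(\mu([0,\kappa])\) up to rounding whenever no atom lies in \((\kappa,\kappa+h]\); with market data one would instead difference quoted prices across neighbouring strikes.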

We may think of \((\mu^{(i)}_{j})\) and \(\mathfrak {P}\) as the modelling inputs. The set of calibrated market models \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}\) is simply the set of probability measures \(\mathbb {P}\in \mathcal {M}\) such that \(\mathbb {S}^{(i)}_{T_{j}}\) is distributed according to \(\mu^{(i)}_{j}\) and \(\mathbb {P}[\mathfrak {P}]=1\). Accordingly, we write \(\mathcal {M}_{\pmb {\mu },\mathfrak {P}}= \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}\) and \(P_{\pmb {\mu },\mathfrak {P}}(G)=P_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\).

Remark 3.10

It follows, see Strassen [49], that \(\mathcal {M}_{\pmb {\mu}, \mathcal {I}}\) is nonempty if and only if \(\mu^{(i)}_{1},\ldots,\mu^{(i)}_{n}\) have finite first moments, mean 1 and increase in convex order, for any \(i=1,\ldots,d\). However, in general, the additional constraints associated with a nontrivial \(\mathfrak {P}\subsetneq \mathcal {I}\) are much harder to understand.
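For finitely supported marginals, the convex-order condition can be tested directly. The Python sketch below relies on the standard characterisation that two laws on \(\mathbb {R}_{+}\) with equal means are in convex order if and only if their put prices are ordered at every strike; since put prices of discrete laws are piecewise linear in the strike, it suffices to compare them at the atoms. The representation of a law as a list of `(atom, weight)` pairs and all names are assumptions for illustration.

```python
def put_price(kappa, law):
    """Integral of (kappa - s)^+ against a law given as (atom, weight) pairs."""
    return sum(w * max(kappa - s, 0.0) for s, w in law)


def mean(law):
    return sum(w * s for s, w in law)


def in_convex_order(mu, nu, tol=1e-12):
    """Check mu <= nu in convex order for finitely supported laws on R_+."""
    if abs(mean(mu) - mean(nu)) > tol:
        return False
    # Put prices are piecewise linear with kinks at the atoms, so comparing
    # them at all atoms of both laws is enough once the means agree.
    strikes = sorted({s for s, _ in mu} | {s for s, _ in nu})
    return all(put_price(k, mu) <= put_price(k, nu) + tol for k in strikes)


# Hypothetical marginals at three maturities, all with mean 1.
mu1 = [(1.0, 1.0)]
mu2 = [(0.5, 0.5), (1.5, 0.5)]
mu3 = [(0.0, 0.5), (2.0, 0.5)]

print(in_convex_order(mu1, mu2), in_convex_order(mu2, mu3), in_convex_order(mu2, mu1))
```

In this example the three laws increase in convex order, so by the remark above a martingale with these marginals exists, whereas reversing the order of `mu1` and `mu2` violates the condition.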

In this context, we can improve Theorem 3.2 and narrow down the class of approximate market models by requiring that they match exactly the marginal distributions at the last maturity.

Definition 3.11

Let \(\mathcal {M}_{\pmb {\mu },\mathfrak {P}}^{\eta}\) be the set of all measures \(\mathbb {P}\in \mathcal {M}\) such that \(\mathcal {L}_{\mathbb {P}}(\mathbb {S}^{(i)}_{T_{j}})\), the law of \(\mathbb {S}^{(i)}_{T_{j}}\) under ℙ, satisfies

$$\begin{aligned} \mathcal {L}_{\mathbb {P}}(\mathbb {S}^{(i)}_{T_{n}})=\mu^{(i)}_{n}\mbox{ and }d_{p}\big(\mathcal {L}_{\mathbb {P}}(\mathbb {S}^{(i)}_{T_{j}}),\mu^{(i)}_{j}\big)\le\eta, \qquad \text{for } j=1,\ldots ,n-1, i=1,\ldots,d, \end{aligned}$$

and furthermore \(\mathbb {P}[\mathfrak {P}^{\eta}]\ge1-\eta\), where \(d_{p}\) is the Lévy–Prokhorov metric on probability measures. Finally, let

$$\begin{aligned} \widetilde {P}_{\pmb {\mu },\mathfrak {P}}(G):=\lim_{\eta\searrow0}\sup_{\mathbb {P}\in \mathcal {M}_{\pmb {\mu },\mathfrak {P}}^{\eta}}\mathbb {E}_{\mathbb {P}}[G]. \end{aligned}$$

Note that \(\mathcal {M}_{\pmb {\mu },\mathfrak {P}}^{\eta}\subseteq \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\epsilon(\eta)}\) for a suitable choice (see footnote 5) of \(\epsilon(\eta)\) converging to zero as \(\eta\to0\). It follows that \(P_{\pmb {\mu },\mathfrak {P}}(G)\leq \widetilde {P}_{\pmb {\mu },\mathfrak {P}}(G)\leq \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\). The following result extends and sharpens the duality obtained in Theorem 3.2 to the current setting.

Theorem 3.12

Under Assumption 3.1, let \(\mathfrak {P}\) be a measurable subset of ℐ, \(\mathcal {X}\) given by (3.7) and \(\mathcal {P}\) such that for any \(\eta>0\), \(\mathcal {M}_{\pmb {\mu },\mathfrak {P}}^{\eta}\neq\emptyset\), where \(\pmb {\mu}\) is defined via (3.8). Then for any uniformly continuous and bounded \(G\), the robust pricing–hedging duality holds between the approximate values, i.e.,

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=\widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=\widetilde {P}_{\pmb {\mu },\mathfrak {P}}(G). \end{aligned}$$

Remark 3.13

Theorem 3.12 readily extends to unbounded exotic options, e.g. lookback options, following the approach of Dolinsky and Soner [25]. Fix \(p>1\) and relax the admissibility in (2.1) to

$$ \int_{0}^{t}\gamma _{u}(S)\,\mathrm {d}S_{u}\ge-M\Big(1+\sup_{0\le s\le t}|S_{s}|^{p}\Big), \qquad \forall\, S\in \mathcal {I},\,t\in[0,T], \text{ for some $M>0$.} $$

Likewise, assume that all \(\mu^{(i)}_{j}\) admit a finite \(p\)th moment and allow static trading in European options with payoffs which grow at most as \(|x|^{p}\). Then the duality in Theorem 3.12 extends to uniformly continuous \(G\) with \(|G(S)|{\le} \mbox{const} (1{+} \sup_{0\le t\le T}|S_{t}|^{p})\).

In the case of one maturity, \(n=1\), we have \(\widetilde {P}_{\pmb {\mu },\mathcal {I}}(G)=P_{\pmb {\mu },\mathcal {I}}(G)\). In particular, Theorem 3.12 extends the duality of Dolinsky and Soner [25] by allowing arbitrary dimension \(d\). It is also possible to consider a multidimensional extension where the whole marginal distribution \(\mathcal {L}_{\mathbb {P}}(\mathbb {S}_{T})\) is fixed, or equivalently \(\mathcal {X}\) is large enough, e.g. dense in the Lipschitz-continuous functions on \(\mathbb {R}^{d}\). For \(n=1\) and \(\mathfrak {P}= \mathcal {I}\), such an extension follows via Theorem 3.2 and Lemma 4.3; see Hou [34, Sect. 3.2 of Chap. 5].

Assumption 3.14

\(G\) is bounded and uniformly continuous, and such that there exists a constant \(L>0\) such that for all \(\mathbb {R}_{+}^{d+K}\)-valued functions \(\upsilon , \hat{\upsilon}\) of the form

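$$\begin{aligned} \upsilon_{t}&=\sum_{i=1}^{n}\sum_{j=0}^{m_{i}-1}\upsilon_{i,j}\mathbf{1}_{[t_{i,j},t_{i,j+1})}(t)+\upsilon_{n,m_{n}-1}\mathbf{1}_{\{T_{n}\}}(t), \\ \hat{\upsilon}_{t}&=\sum_{i=1}^{n}\sum_{j=0}^{m_{i}-1}\upsilon_{i,j}\mathbf{1}_{[\hat{t}_{i,j},\hat{t}_{i,j+1})}(t)+\upsilon_{n,m_{n}-1}\mathbf{1}_{\{T_{n}\}}(t), \end{aligned}$$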

where \(t_{1,0}=\hat{t}_{1,0}=0\), \(t_{i,0}=\hat{t}_{i,0} = T_{i-1}\), \(2\le i\le n\), \(t_{i,m_{i} }=\hat{t}_{i,m_{i}} = T_{i}\), \(1\le i\le n\), we have

$$ |G(\upsilon)-G(\hat{\upsilon})|\le L\|\upsilon\|\sum^{n}_{i=1}\sum _{j=1}^{m_{i}}|\Delta t_{i,j}-\Delta\hat{t}_{i,j}|, $$

where \(\Delta t_{i,j}:=t_{i,j}-t_{i,j-1}\) and \(\Delta\hat{t}_{i,j}:=\hat{t}_{i,j}-\hat{t}_{i,j-1}\).

Note that Assumption 3.14 is close in spirit to Assumption 2.1 in [25], but is weaker and, unlike the latter, is satisfied by European options with intermediate maturities \(T_{1},\ldots, T_{n-1}\). Next we introduce a particular class of prediction sets. Our definition is closely related to time-invariant sets in Vovk [51], also recently used in Beiglböck et al. [4], but slightly different as we work with all continuous functions and also require that maturities \(T_{i}\) are preserved.

Definition 3.15

We say \(\mathfrak {P}\) is time-invariant if \((S_{t})_{t\in[0,T]}\in \mathfrak {P}\) implies that we have \((S_{f(t)})_{t\in[0,T]}\in \mathfrak {P}\) for any nondecreasing and continuous function \(f:[0,T]\to[0,T]\) with \(f(0)=0\) and \(f(T_{i}) = T_{i}\) for \(i=1,\ldots,n\).

We note that many natural path restrictions are time-invariant. Particular examples include \(\{S\in \mathcal {I}: \|S\|\leq b\}\) for a given bound \(b\), or the set of paths which satisfy a drawdown constraint for a selection of assets \(J\subseteq\{1,\ldots, d+K\}\), i.e.,

$$\Big\{ S\in \mathcal {I}: S^{(i)}_{t}\geq\alpha_{i}\sup_{u\leq t}S^{(i)}_{u}, t\in [0,T], i\in J\Big\} ,\qquad \mbox{for some fixed }\alpha_{i}\in[0,1]. $$
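Time-invariance of the drawdown constraint can be illustrated on sampled paths. The Python sketch below is a discrete analogue only: a path is a list of values, a time change is a nondecreasing index map that moves by at most one index per step (mimicking continuity of \(f\)) and fixes the first and last index, and only the final maturity is preserved. These simplifications and all names are assumptions for illustration.

```python
def satisfies_drawdown(path, alpha):
    """Check S_t >= alpha * sup_{u <= t} S_u along a sampled one-dimensional path."""
    running_max = float("-inf")
    for s in path:
        running_max = max(running_max, s)
        if s < alpha * running_max:
            return False
    return True


def time_change(path, f):
    """Return the samples S_{f(k)} for a discrete time change f.

    f must start at index 0, end at the last index of `path`, and increase
    by 0 or 1 at each step (a discrete stand-in for continuity).
    """
    assert f[0] == 0 and f[-1] == len(path) - 1
    assert all(b - a in (0, 1) for a, b in zip(f, f[1:]))
    return [path[i] for i in f]


path = [1.0, 1.2, 1.1, 1.3, 1.0]
f = [0, 0, 1, 2, 2, 2, 3, 4]  # pauses at indices 0 and 2, then catches up
changed = time_change(path, f)

print(satisfies_drawdown(path, 0.7), satisfies_drawdown(changed, 0.7))
print(satisfies_drawdown(path, 0.8), satisfies_drawdown(changed, 0.8))
```

The time-changed path visits the same values in the same order, so its running maximum at each step coincides with that of the original path at the corresponding time, and the constraint holds for one exactly when it holds for the other.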

Theorem 3.16

Under Assumption 3.1, let \(\mathfrak {P}\) be a closed and time-invariant subset of ℐ, \(\mathcal {X}\) given by (3.7) and \(\mathcal {P}\) such that \(\mathcal {M}_{\pmb {\mu },\mathfrak {P}}\neq\emptyset\), where \(\pmb {\mu}\) is defined via (3.8). Assume there exists \(p>1\) for which \(\int|x|^{p} \mu_{n}^{(i)}(\mathrm {d}x)<\infty\) for \(i=1,\ldots, d\). Then for any \(G\) which satisfies Assumption 3.14, we have

$$ \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=P_{\pmb {\mu}, \mathfrak {P}}(G)=\widetilde {P}_{\pmb {\mu },\mathfrak {P}}(G).$$

4 Auxiliary results and proofs

We present now the proofs of all the results in Sect. 3. As noted before, we use the unconstrained duality (3.4) which follows from Theorem 5.1 stated and proved in Sect. 5 below. We start by describing a discretisation of a continuous path, often referred to as the “Lebesgue discretisation”, a term we also use. The discretisation is a crucial tool in Sect. 5.1, but is also employed in the proofs of Lemmas 4.3, 4.4, 4.5 and Theorem 3.16 below.

Definition 4.1

For a positive integer \(N\) and any \(S \in\varOmega\), we set \(\tau ^{(N)}_{0}(S)=0\), then define

$$\begin{aligned} \tau^{(N)}_{k}(S):=\inf\bigg\{ t\ge\tau^{(N)}_{k-1}(S):|S_{t}-S_{\tau ^{(N)}_{k-1}(S)}|=\frac{1}{2^{N}} \bigg\} \wedge T \end{aligned}$$

and let \(m^{(N)}(S):=\min\{k\in \mathbb {N}: \tau^{(N)}_{k}(S)=T\}\). We write \(m^{(N)}\) for the measurable map \(\varOmega\ni S\mapsto m^{(N)}(S)\) and note that by definition, \(m^{(N)}=m^{(N)}(\mathbb {S})\).

Following the observation that \(m^{(N)}(S)<\infty\) for all \(S\in\varOmega \), we say that the sequence of stopping times \(0=\tau^{(N)}_{0}<\tau ^{(N)}_{1}<\cdots<\tau^{(N)}_{m^{(N)}}=T\) forms a Lebesgue partition of \([0,T]\) on \(\varOmega\). Similar partitions were studied previously; see e.g. Bichteler [7] and Vovk [51]. Their main appearances have been as tools to build a pathwise version of the Itô integral. They can also be interpreted, from a financial point of view, as candidate times for rebalancing portfolio holdings; see Whalley and Wilmott [52].
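The Lebesgue partition of Definition 4.1 is straightforward to compute on sampled data. The Python sketch below is a discrete approximation under stated assumptions: the path is one-dimensional and given by finitely many samples, a "hit" is recorded at the first sample where the move from the last partition point is at least \(2^{-N}\) (rather than exactly equal to it), and times are reported as sample indices with the last index playing the role of \(T\). The function name is hypothetical.

```python
def lebesgue_partition(path, N):
    """Indices tau_0 < tau_1 < ... < tau_m of a discrete Lebesgue partition.

    A new partition point is placed at the first sample whose distance from
    the value at the previous partition point is at least 2**(-N); the last
    index (playing the role of T) always closes the partition.
    """
    step = 2.0 ** (-N)
    last = len(path) - 1
    taus = [0]
    while taus[-1] < last:
        anchor = path[taus[-1]]
        nxt = last  # default: no further hit, so the next time is T
        for t in range(taus[-1] + 1, last + 1):
            if abs(path[t] - anchor) >= step:
                nxt = t
                break
        taus.append(nxt)
    return taus


path = [1.0, 1.25, 1.5, 1.25, 1.0, 1.1]
taus = lebesgue_partition(path, 2)   # grid size 2**(-2) = 0.25
m = len(taus) - 1                    # discrete analogue of m^{(N)}(S)
print(taus, m)
```

Between consecutive partition points the sampled path moves by less than \(2^{-N}\), which is the property used when a strategy is rebalanced only at the times \(\tau^{(N)}_{k}\).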

Remark 4.2

Consider \(N\geq3\) and two paths \(S, \tilde{S}\in\varOmega\) such that \(\| S-\tilde{S}\|< 2^{-N}\). Then for each \(i< m^{(N-2)}(S)\), \(\{|S_{t}| : t\in(\tau^{(N-2)}_{i-1}(S), \tau^{(N-2)}_{i}(S)]\}\cap\{k/2^{N} : k\in \mathbb {N}_{+}\}\) has at least four elements, which implies that there exists at least one \(j< m^{(N)}(\tilde{S})\) such that \(\tau^{(N)}_{j}(\tilde{S})\in (\tau^{(N-2)}_{i-1}(S), \tau^{(N-2)}_{i}(S)]\). Consequently, \(m^{(N-2)}(S) \le m^{(N)}(\tilde{S})\) and hence for any weakly converging sequence of probability measures \(\mathbb {P}^{(k)}\to \mathbb {P}\) and any bounded nonincreasing function \(\phi: \mathbb {N}\to \mathbb {R}\),

$$\begin{aligned} \mathbb {E}_{\mathbb {P}}\big[\phi\big(m^{(N)}(\mathbb {S})\big)\big]\le\liminf_{k\to\infty} \mathbb {E}_{\mathbb {P}^{(k)}}\big[\phi\big(m^{(N-2)}(\mathbb {S})\big)\big]. \end{aligned}$$

4.1 Proof of Theorem 3.2 and Remark 3.3

To establish (3.2), we consider an \((X,\gamma )\in\mathcal{A}_{\mathcal {X}}\) that superreplicates \(G\) on \(\mathfrak {P}^{\epsilon}\) for some \(\epsilon>0\). Since \(X\) is bounded and \(\gamma \) is admissible, we can find suitable \(M>0\) such that

$$\begin{aligned} X(\mathbb {S})+\int_{0}^{T}\gamma _{u} \,\mathrm {d}\mathbb {S}_{u}\ge G(\mathbb {S})-M\lambda_{\mathfrak {P}}(\mathbb {S}), \end{aligned}$$

where we recall that \(\lambda_{\mathfrak {P}}(\omega)=\inf_{\upsilon\in \mathfrak {P}}\| \omega-\upsilon\|\wedge1\). Next, for each \(N\ge1\), we pick \(\mathbb {P}^{(N)}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{1/N}\) such that

$$\mathbb {E}_{\mathbb {P}^{(N)}}[G(\mathbb {S})]\ge\sup_{\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{1/N}}\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})]-\frac{1}{N}. $$

Since \(\gamma \) is progressively measurable, the integral \(\int_{0}^{\cdot} \gamma _{u}(\mathbb {S})\,\mathrm {d}\mathbb {S}_{u}\), defined pathwise via integration by parts, agrees a.s. with the stochastic integral under any \(\mathbb {P}^{(N)}\). Then by (2.1), the stochastic integral is a \(\mathbb {P}^{(N)}\)-supermartingale and hence \(\mathbb {E}_{\mathbb {P}^{(N)}}[\int_{0}^{T} \gamma _{u}(\mathbb {S})\,\mathrm {d}\mathbb {S}_{u}]\le0\). Therefore, from (4.2),

$$ \mathbb {E}_{\mathbb {P}^{(N)}}[X(\mathbb {S})]\ge \mathbb {E}_{\mathbb {P}^{(N)}}[G(\mathbb {S})-M\lambda_{\mathfrak {P}}(\mathbb {S})] \ge \sup_{\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{1/N}}\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})]-\frac{1}{N}-\frac{2M}{N}. $$

Also note that \(X\) takes the form \(a_{0} + \sum_{i=1}^{m} a_{i} X_{i}\), \(X_{i}\in \mathcal {X}\), and hence by the definition of \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{1/N}\),

$$\begin{aligned} |\mathcal {P}(X)-\mathbb {E}_{\mathbb {P}^{(N)}}[X(\mathbb {S})]|\longrightarrow 0 \qquad \text{as } N\to\infty. \end{aligned}$$

Together with (4.3), this yields \(\mathcal {P}(X)\ge \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\) and (3.2) follows because \((X,\gamma )\in\mathcal{A}_{\mathcal {X}}\) was arbitrary.

To establish (3.3), we show the converse inequality in three steps.

Step 1: Duality without constraints. This is the crucial and also the most technical part of the proof which we defer to Sect. 5. The duality in (3.4) follows as a special case of Theorem 5.1, which is stated and proved in Sect. 5.

Step 2: Calculus of variations approach. Fix \(G\). Note that any \((X,\gamma )\) that superreplicates \(G-N\lambda_{\mathfrak {P}}\) on ℐ also superreplicates \(G-N/M\) on \(\mathfrak {P}^{\frac{1}{M}}\). It follows that for any fixed \(M,N\geq1\),

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) &= \inf\{\mathcal {P}(X):\exists(X,\gamma )\in\mathcal{A}_{\mathcal {X}}, \epsilon>0 \text{ such that} \\ &\phantom{=::\inf\{\mathcal {P}(X):}\text{$(X,\gamma )$ superreplicates $G$ on }\mathfrak {P}^{\epsilon}\}\\ & \le\frac{N}{M} + \inf\{\mathcal {P}(X): \text{$\exists(X,\gamma )\in\mathcal{A}_{\mathcal {X}}$ which} \\ &\phantom{=:\frac{N}{M} + \inf\{\mathcal {P}(X)::}\text{superreplicates $G-N\lambda_{\mathfrak {P}}$ on $ \mathcal {I}$}\}. \end{aligned}$$

Taking the infimum over \(M\) and then over \(N\), we obtain

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) &\le\inf_{N\ge0} \inf\{\mathcal {P}(X): \text{$\exists(X,\gamma )\in \mathcal{A}_{\mathcal {X}}$ which} \\ &\phantom{=:\inf_{N\ge0} \inf\{\mathcal {P}(X)::}\text{superreplicates $G-N\lambda_{\mathfrak {P}}$ on $ \mathcal {I}$}\} \\ &= \inf_{N\ge0}{V}_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G-N\lambda_{\mathfrak {P}}) = \inf_{N\ge0}\widetilde{V}_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G-N\lambda_{\mathfrak {P}}). \end{aligned}$$
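The pointwise bound behind the first inequality of this step can be spelled out as follows. The display assumes, as the notation suggests, that \(\mathfrak {P}^{1/M}\) consists of the paths within sup-distance \(1/M\) of \(\mathfrak {P}\), and that adding the constant \(N/M\) to the static position increases its price by exactly \(N/M\).

```latex
\begin{aligned}
S\in\mathfrak{P}^{1/M}
  &\;\Longrightarrow\; \lambda_{\mathfrak{P}}(S)\le \tfrac{1}{M}, \\
X(S)+\int_{0}^{T}\gamma_{u}(S)\,\mathrm{d}S_{u} \ge G(S)-N\lambda_{\mathfrak{P}}(S)
  &\;\Longrightarrow\;
  \Big(X(S)+\tfrac{N}{M}\Big)+\int_{0}^{T}\gamma_{u}(S)\,\mathrm{d}S_{u} \ge G(S)
  \quad\text{for } S\in\mathfrak{P}^{1/M}.
\end{aligned}
```

Hence \((X+N/M,\gamma)\) superreplicates \(G\) on \(\mathfrak {P}^{1/M}\) at cost \(\mathcal {P}(X)+N/M\), and letting \(M\to\infty\) removes the extra term.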

On the other hand, given any \((X,\gamma )\in\mathcal{A}_{\mathcal {X}}\) and \(\epsilon>0\) such that \((X,\gamma )\) superreplicates \(G\) on \(\mathfrak {P}^{\epsilon}\), by the admissibility of \((X,\gamma )\) and boundedness of \(X\) and \(G\), if \(N>0\) is sufficiently large, then

$$\begin{aligned} X(S)+ \int_{0}^{T}\gamma _{u}(S)\,\mathrm {d}S_{u}\ge G(S)-N\lambda_{\mathfrak {P}}(S),\qquad S\in \mathcal {I}, \end{aligned}$$

that is, \((X,\gamma )\) superreplicates \(G-N\lambda_{\mathfrak {P}}\) on ℐ. It follows that we have equality in (4.4). We also have

$$\begin{aligned} \widetilde{V}_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G-N\lambda_{\mathfrak {P}}) & = \inf_{X\in \mathrm {Lin}(\mathcal {X})}\big(\mathcal {P}(X) +\inf\{x\in \mathbb {R}:\exists \gamma \in\mathcal{A} \text{ such that} \\ &\phantom{=::\inf_{X\in \mathrm {Lin}(\mathcal {X})}\big(\mathcal {P}(X) +\inf\{x\in \mathbb {R}:} \text{$(x,\gamma )$ superreplicates} \\ &\phantom{=::\inf_{X\in \mathrm {Lin}(\mathcal {X})}\big(\mathcal {P}(X) +\inf\{x\in \mathbb {R}:} \text{$G-N\lambda_{\mathfrak {P}}-X$ on $ \mathcal {I}\}\big)$} \\ &= \inf_{X\in \mathrm {Lin}(\mathcal {X})}\big(\mathcal {P}(X)+ \mathbf {V}_{\mathcal {I}}(G - N\lambda_{\mathfrak {P}} - X)\big)\\ &= \inf_{X\in \mathrm {Lin}(\mathcal {X})}\big(\mathcal {P}(X)+\mathbf {P}_{\mathcal {I}}(G-N\lambda_{\mathfrak {P}}-X)\big), \end{aligned}$$

where the last equality is justified by Theorem 5.1 as \(\lambda_{\mathfrak {P}}\) and \(X\) are bounded and uniformly continuous. Combining the above with (4.4), we conclude that (3.5) holds.

Step 3: Application of the minimax theorem. We rewrite (3.5) and apply a minimax argument to get

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)&= \inf_{X\in \mathrm {Lin}(\mathcal {X}),\,N\ge0}\big(\mathbf {P}_{\mathcal {I}}(G-X-N\lambda _{\mathfrak {P}})+\mathcal {P}(X)\big) \\ &= \lim_{N\to\infty} \inf_{X\in \mathrm {Lin}_{N}(\mathcal {X})}\bigg(\sup_{\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}}\mathbb {E}_{\mathbb {P}} [G-X-N\lambda_{\mathfrak {P}}]+\mathcal {P}(X)\bigg) \\ &= \lim_{N\to\infty} \sup_{\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}}\inf_{X\in \mathrm {Lin}_{N}(\mathcal {X})}\big(\mathbb {E}_{\mathbb {P}} [G-X-N\lambda_{\mathfrak {P}}]+\mathcal {P}(X)\big) \end{aligned}$$
$$\begin{aligned} &\le \lim_{N\to\infty}\sup_{\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta_{N}}}\mathbb {E}_{\mathbb {P}}[G] = \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G), \end{aligned}$$

for \(\eta_{N} = 2 \kappa/\sqrt{N}\) with \(\kappa=1+ \|G\|_{\infty}\), where \(\|G\|_{\infty}=\sup_{S\in\varOmega}|G(S)|\). The crucial third equality follows by a minimax theorem (see e.g. Terkelsen [50, Corollary 2]) by observing that the mapping

$$ \mathrm {Lin}_{N}(\mathcal {X})\times \mathcal {M}_{ \mathcal {I}} \ni(X,\mathbb {P}) \mapsto \mathbb {E}_{\mathbb {P}}[G(\mathbb {S})-X(\mathbb {S})-N\lambda_{\mathfrak {P}}(\mathbb {S})]+\mathcal {P}(X)\in \mathbb {R}$$

is bilinear and \(\mathrm {Lin}_{N}(\mathcal {X})\) is convex and compact. To justify the inequality between (4.5) and (4.6), consider \(\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}\setminus \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta_{N}}\). Then in particular, either there exists \(X^{*}\in \mathcal {X}\) such that \(|\mathbb {E}_{\mathbb {P}}[X^{*}] -\mathcal {P}(X^{*})|> \eta_{N}\frac{1}{\sqrt{N}}\) or \(\mathbb {P}[\mathbb {S}\notin \mathfrak {P}^{\eta_{N}}]\ge\eta_{N}\). In the former case, since \(\pm NX^{*}\in \mathrm {Lin}_{N}(\mathcal {X})\), we obtain

$$\begin{aligned} \mathbb {E}_{\mathbb {P}}[G - NX^{*} -N\lambda_{\mathfrak {P}}]+ \mathcal {P}(NX^{*}) &\le \mathbb {E}_{\mathbb {P}}[G] - N\big(\mathbb {E}_{\mathbb {P}}[X^{*}] -\mathcal {P}(X^{*})\big)\\ & < \kappa- 2\kappa = -\kappa, \end{aligned}$$

where, without loss of generality, we assume \(\mathbb {E}_{\mathbb {P}}[X^{*}] > \mathcal {P}(X^{*})\) (otherwise we use \(-NX^{*}\) in place of \(NX^{*}\)). In the latter case, we have \(\mathbb {E}_{\mathbb {P}}[N\lambda_{\mathfrak {P}}] \geq N\frac {2\kappa}{\sqrt{N}}\frac{2\kappa}{\sqrt{N}}=4\kappa^{2}\geq4\kappa\), while \(|\mathbb {E}_{\mathbb {P}}[X] -\mathcal {P}(X)|\leq N\frac{2\kappa}{N}=2\kappa\) for any \(X\in \mathrm {Lin}_{N}(\mathcal {X})\). It follows that

$$\begin{aligned} \mathbb {E}_{\mathbb {P}}[G - X - N\lambda_{\mathfrak {P}}]+ \mathcal {P}(X) \leq\kappa- 4\kappa+2\kappa =-\kappa. \end{aligned}$$

On the other hand, since (3.2) implies \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(0)= 0\), we have

$$\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G+\|G\|_{\infty})-\|G\|_{\infty} \geq \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(0) -\| G\|_{\infty} = -\kappa+1, $$

and hence we may restrict to measures in \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta_{N}}\) in (4.5). Dropping nonpositive terms, we obtain (4.6) which completes the proof of Theorem 3.2.  □

For Remark 3.3, it remains to argue that Theorem 3.2 remains true when we restrict to Brownian martingales. Specifically, given \(T\) and a probability space \((\varOmega ^{W},\mathbb {F}^{W},P^{W})\) with a \(\tilde{d}\)-dimensional Brownian motion \(W\) on \([0,T]\), where \(\mathbb {F}^{W}= (\mathcal {F}_{t}^{W})_{0 \leq t\leq T}\) is the \(P^{W}\)-completion of the natural filtration of \(W\), consider

$$ \mathbb {P}:= P^{W}\circ(Z^{\alpha})^{-1},\qquad \text{where } Z^{\alpha}:= \int _{0}^{\cdot}\alpha_{u} \,\mathrm {d}W_{u} $$

for some \(\mathbb {F}^{W}\)-progressively measurable process \(\alpha\) with values in the \((d+K)\times\tilde{d}\) matrices such that the above vector integral is well defined. Let \(\underline{\mathcal {M}}_{ \mathcal {I}}\) be the family of all \(\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}\) which admit such a representation. From (3.5), as argued above, and Remark 5.2 below, we have

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) &= \inf_{X\in \mathrm {Lin}(\mathcal {X}),\, N\ge0,}\bigg(\sup_{\mathbb {P}\in \underline{\mathcal {M}}_{ \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-X-N\lambda_{\mathfrak {P}}] + \mathcal {P}(X)\bigg). \end{aligned}$$

Then by following the same argument as in Step 3 above, we can show that \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta_{N}}\cap\underline{\mathcal {M}}_{ \mathcal {I}}\neq \emptyset\) when \(N\) is sufficiently large and

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = \lim_{N\to\infty}\sup_{\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta_{N}}\cap \underline{\mathcal {M}}_{ \mathcal {I}} }\mathbb {E}_{\mathbb {P}}[G]. \end{aligned}$$

4.2 Proof of Theorem 3.6

The set \(\mathcal {X}\) is finite and as discussed in Example 3.4, we can apply Theorem 3.2. Together with \(V_{\mathcal {X},\mathcal {P},\mathcal {I}}=\widetilde{V}_{\mathcal {X},\mathcal {P}, \mathcal {I}}\), this yields

$$\begin{aligned} V_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G) = \widetilde{P}_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G) = \lim_{N\to \infty}\sup_{\mathbb {P}\in \mathcal {M}^{1/N}_{\mathcal {X},\mathcal {P}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G]. \end{aligned}$$

Now for every positive integer \(N\), we pick \(\mathbb {P}^{(N)}\in \mathcal {M}^{1/N}_{\mathcal {X},\mathcal {P}, \mathcal {I}}\) such that

$$\mathbb {E}_{\mathbb {P}^{(N)}}[G] + 1/N \ge\sup_{\mathbb {P}\in \mathcal {M}^{1/N}_{\mathcal {X},\mathcal {P}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G]. $$

We let

$$\begin{aligned} p^{(N)}_{k,i,j} := \mathbb {E}_{\mathbb {P}^{(N)}}[(K^{(i)}_{k,j} - S^{(i)}_{k,j})^{+}], \qquad \tilde{p}^{(N)}_{k,i,j} := \sqrt{N}\big(p_{k,i,j} - (1- 1/\sqrt {N})p^{(N)}_{k,i,j}\big), \end{aligned}$$

for any \(i=1,\ldots, d\), \(j=1,\ldots,n\), \(k=1,\ldots,m(i,j)\). Note that

$$\begin{aligned} |\tilde{p}^{(N)}_{k,i,j} - p_{k,i,j}| = (\sqrt{N} -1)|p_{k,i,j} - p^{(N)}_{k,i,j}|\le\frac{\sqrt{N}}{N} =\frac{1}{\sqrt{N}}, \qquad \forall i,j,k. \end{aligned}$$

Then it follows from Assumption 3.5 that when \(N\) is large, there exists a \(\tilde{\mathbb {P}}^{(N)}\in \mathcal {M}_{ \mathcal {I}}\) such that

$$\begin{aligned} \tilde{p}^{(N)}_{k,i,j} = \mathbb {E}_{\tilde{\mathbb {P}}^{(N)}}[(K^{(i)}_{k,j} - S^{(i)}_{k,j})^{+}],\qquad \forall i,j,k. \end{aligned}$$

Now we consider \(\mathbb {Q}:= (1-1/\sqrt{N})\mathbb {P}^{(N)} + \tilde{\mathbb {P}}^{(N)}/\sqrt {N}\). It follows that

$$\begin{aligned} \mathbb {E}_{\mathbb {Q}}[(K^{(i)}_{k,j} - S^{(i)}_{k,j})^{+}] &= (1-1/\sqrt{N})\mathbb {E}_{\mathbb {P}^{(N)}}[(K^{(i)}_{k,j} - S^{(i)}_{k,j})^{+}]\\ &\phantom{=}{}+\frac{1}{\sqrt{N}} \mathbb {E}_{\tilde{\mathbb {P}}^{(N)}}[(K^{(i)}_{k,j} - S^{(i)}_{k,j})^{+}]\\ &= (1-1/\sqrt{N})p^{(N)}_{k,i,j} + \tilde{p}^{(N)}_{k,i,j}/\sqrt{N} = p_{k,i,j} \end{aligned}$$

and hence \(\mathbb {Q}\in \mathcal {M}_{\mathcal {X},\mathcal {P}, \mathcal {I}}\). In addition,

$$\begin{aligned} \big|\mathbb {E}_{\mathbb {Q}}[G] - \mathbb {E}_{\mathbb {P}^{(N)}}[G]\big| \le \frac{1}{\sqrt{N}}(\mathbb {E}_{\mathbb {P}^{(N)}}[|G|]+\mathbb {E}_{\tilde{\mathbb {P}}^{(N)}}[|G|]) \le\frac{2\|G\|_{\infty }}{\sqrt{N}}. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \sup_{\mathbb {P}\in \mathcal {M}^{1/N}_{\mathcal {X},\mathcal {P}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G] &\le \sup_{\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G] + \frac{2\|G\| _{\infty}}{\sqrt{N}} +\frac{1}{N} \\ &= P_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G) + \frac{2\|G\|_{\infty}}{\sqrt{N}} + \frac{1}{N}, \end{aligned}$$

and taking limits as \(N\to\infty\) yields \(\widetilde{P}_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G)\le P_{\mathcal {X},\mathcal {P}, \mathcal {I}}(G)\). Together with (4.8) and (3.1), this completes the proof. □
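
The mixing step of this proof is elementary arithmetic and can be checked numerically. The following is a minimal sketch, assuming a single scalar put price; the names `mix_prices`, `p_target` and `p_approx` are hypothetical and stand for \(p_{k,i,j}\) and \(p^{(N)}_{k,i,j}\). It only verifies the identities for \(\tilde{p}^{(N)}_{k,i,j}\) and for the convex combination defining \(\mathbb {Q}\); it does not construct any measure, and the existence of \(\tilde{\mathbb {P}}^{(N)}\) still rests on Assumption 3.5.

```python
import math


def mix_prices(p_target, p_approx, n):
    """Return (p_tilde, p_mixed) for one option price.

    p_target -- market price p (to be matched exactly)
    p_approx -- price under the 1/N-calibrated measure, |p - p_approx| <= 1/n
    n        -- the calibration index N (a positive integer)
    """
    w = 1.0 / math.sqrt(n)                      # weight of the auxiliary measure
    # Auxiliary price, as in the definition of p-tilde in the proof.
    p_tilde = math.sqrt(n) * (p_target - (1.0 - w) * p_approx)
    # Price under the mixture (1 - w) * P^(N) + w * P-tilde^(N).
    p_mixed = (1.0 - w) * p_approx + w * p_tilde
    return p_tilde, p_mixed


if __name__ == "__main__":
    n = 400
    p_target = 0.25
    p_approx = p_target + 0.8 / n               # within 1/N of the target
    p_tilde, p_mixed = mix_prices(p_target, p_approx, n)
    # The auxiliary price stays within 1/sqrt(N) of the market price ...
    print(abs(p_tilde - p_target) <= 1.0 / math.sqrt(n))
    # ... and the mixture reproduces the market price up to rounding.
    print(math.isclose(p_mixed, p_target))
```

The weights \(1-1/\sqrt{N}\) and \(1/\sqrt{N}\) are what make the correction \(\tilde{p}^{(N)}_{k,i,j}-p_{k,i,j}\) of order \(1/\sqrt{N}\) while the cost of mixing in \(\tilde{\mathbb {P}}^{(N)}\) is also of order \(1/\sqrt{N}\).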

4.3 Proof of Theorem 3.12

From Theorem 3.2, we know that \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\ge \widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\) and by definition, \(\widetilde {P}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\geq \widetilde {P}_{\pmb {\mu },\mathfrak {P}}(G)\). Hence, to establish Theorem 3.12, it suffices to show that

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\le \widetilde {P}_{\pmb {\mu },\mathfrak {P}}(G). \end{aligned}$$

This is a special case (\(\alpha=\beta=0\)) of Proposition 4.5 below, which is a crucial technical result also used to prove Theorem 3.16 below. We recall that \(\underline{\mathcal {M}}_{ \mathcal {I}}=\mathcal {M}_{ \mathcal {I}}\cap\underline{\mathcal {M}}\), where \(\underline{\mathcal {M}}:=\{\mathbb {P}\in \mathcal {M}: \mathbb {P}\mbox{ satisfies (4.7)}\}\). We also require additional notation for sets of martingale measures with a constraint on the final marginal only. For a probability measure \(\pi\) on \(\mathbb {R}^{d}\), we let \(\mathcal {M}_{\pi, \mathcal {I}}:= \{ \mathbb {P}\in \mathcal {M}_{ \mathcal {I}}: \mathcal{L}_{\mathbb {P}}(S_{T_{n}})=\pi\}\) and likewise, for a \(d\)-tuple \(\mu_{n}=(\mu_{n}^{(1)},\ldots,\mu_{n}^{(d)})\) of probability measures on ℝ, we let \(\mathcal {M}_{\mu_{n}, \mathcal {I}}:= \{\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}: \mathcal{L}_{\mathbb {P}}(S_{T_{n}}^{(i)})=\mu_{n}^{(i)}, i=1,\ldots,d\}\). These notations are used in statements and proofs below, and it will be clear from the context if we work with the former or the latter object. Finally, we allow a perturbation by defining

$$\mathcal {M}_{\mu_{n}, \mathcal {I},\eta}:=\{\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}: d_{p}(\mathcal{L}_{\mathbb {P}}(S_{T_{n}}^{(i)}),\mu_{n}^{(i)})\leq\eta, i=1,\ldots,d\}. $$

We note that these sets are different from the main objects introduced in Definition 3.11 and are only needed for some technical arguments below.

We start with two lemmas leading to Proposition 4.5.

Lemma 4.3

Consider probability measures \(\pi^{(N)},\pi\) on \(\mathbb {R}^{d}_{+}\) with mean vectors \(\pmb {1}\) and \((\pi^{(N)})\) converging weakly to \(\pi\). Then, for any \(\alpha, \beta\ge0\), \(D\in \mathbb {N}\) and a bounded uniformly continuous \(G\),

$$ \limsup_{N\to\infty}\sup_{\mathbb {P}\in\underline{\mathcal {M}}\cap \mathcal {M}_{\pi^{(N)},\mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt{m^{(D)}}\wedge\alpha] \le\sup_{\mathbb {P}\in\underline{\mathcal {M}}\cap \mathcal {M}_{\pi, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt {m^{(D-2)}}\wedge\alpha], $$

where \(m^{(D)}\) is given in Definition 4.1.


Proof. See Sect. A.1 in the Appendix. □

Lemma 4.4

Under Assumption 3.1, let \(\mathfrak {P}\) be a measurable subset of ℐ, \(\mathrm {Lin}_{1}(\mathcal {X})\) a compact subset of \((\mathcal {C}(\varOmega, \mathbb {R}),\|\cdot\|_{\infty})\) and \(\mathcal {M}_{s}\) a nonempty convex subset of \(\mathcal {M}_{ \mathcal {I}}\) such that \(\mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{\eta}\cap \mathcal {M}_{s}\neq\emptyset\) for all \(\eta>0\). Then for any \(\alpha, \beta\ge0\), \(D\in \mathbb {N}\) and a bounded uniformly continuous \(G\),

$$\begin{aligned} &\inf_{X\in \mathrm {Lin}(\mathcal {X}),\,N\ge0}\bigg(\sup_{\mathbb {P}\in \mathcal {M}_{s}}\mathbb {E}_{\mathbb {P}}[G-\beta \sqrt{m^{(D)}}\wedge\alpha-X-N\lambda_{\mathfrak {P}}]+\mathcal {P}(X)\bigg) \\ &\phantom{:::::::}\le\lim_{N\to\infty}\sup_{\mathbb {P}\in \mathcal {M}_{\mathcal {X},\mathcal {P},\mathfrak {P}}^{1/N}\cap \mathcal {M}_{s}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt{m^{(D-2)}}\wedge\alpha], \end{aligned}$$

with equality when \(\alpha= \beta= 0\), where \(m^{(D)}\) is defined in Definition 4.1.


Proof. See Sect. A.2 in the Appendix. □

Proposition 4.5

Under Assumption 3.1, let \(\mathfrak {P}\) be a measurable subset of ℐ, \(\mathcal {X}\) given by (3.7) and \(\mathcal {P}\) such that for any \(\eta>0\), , where \(\pmb {\mu }\) is defined via (3.8). Then for any uniformly continuous and bounded \(G\) and \(\alpha,\beta\ge0\), \(D\in \mathbb {N}\),

$$\begin{aligned} \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G-\beta\sqrt{m^{(D)}}\wedge\alpha) \le\widetilde{P}_{\pmb {\mu }, \mathfrak {P}}(G-\beta\sqrt{m^{(D-8)}}\wedge\alpha), \end{aligned}$$

where \(m^{(D)}\) is defined in Definition 4.1.



and . Let , and write

$$\mathcal {Z}= \bigcup_{M\ge0} \mathcal {Z}_{M},\qquad \mathcal {Y}= \bigcup_{M\ge0} \mathcal {Y}_{M}. $$

Note that given any \(f\in \mathcal {C}_{b}(\mathbb {R}_{+},\mathbb {R})\), \(\epsilon>0\) and a measure \(\mu\) on \(\mathbb {R}_{+}\) with a finite first moment, there is some \(u:\mathbb {R}_{+}\to \mathbb {R}\) of the form \(u(s)=a_{0}+\sum_{i=1}^{n}a_{i}(\kappa_{i}-s)^{+}\) such that \(u\ge f\) and \(\int(u-f)\,\mathrm {d}\mu<\epsilon\). This gives the first inequality in the following:

$$\begin{aligned} &\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G-\beta\sqrt{m^{(D)}}\wedge\alpha) \\ &\quad\le\widetilde{V}_{\mathcal {Z}\cup \mathcal {Y},\mathcal {P},\mathfrak {P}}(G-\beta\sqrt{m^{(D)}}\wedge \alpha) \\ &\quad= \inf_{X\in \mathrm {Lin}(\mathcal {Z}\cup \mathcal {Y}), N\ge0}\big(\mathbf {V}_{\mathcal {I}}(G-X-\beta\sqrt {m^{(D)}}\wedge\alpha-N\lambda_{\mathfrak {P}})+\mathcal {P}(X)\big) \\ &\quad\le\inf_{X\in \mathrm {Lin}(\mathcal {Z}\cup \mathcal {Y}), N\ge0}\bigg(\sup_{\mathbb {P}\in\underline{\mathcal {M}}_{ \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-X-\beta\sqrt{m^{(D-2)}}\wedge\alpha-N\lambda_{\mathfrak {P}}]+\mathcal {P}(X)\bigg) \\ &\quad= \inf_{Y\in \mathrm {Lin}(\mathcal {Y}), N\ge0, M\ge0}\inf_{Z\in \mathrm {Lin}(\mathcal {Z}_{M})}\bigg(\sup_{\mathbb {P}\in\underline{\mathcal {M}}_{ \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-Y-Z-\beta\sqrt {m^{(D-2)}}\wedge\alpha-N\lambda_{\mathfrak {P}}] \\ &\hphantom{ =:\inf_{Y\in \mathrm {Lin}(\mathcal {Y}), N\ge0, M\ge0}\inf_{Z\in \mathrm {Lin}(\mathcal {Z}_{M})}\Big\{ }\quad{}+\mathcal {P}(Y+Z)\bigg) \\ &\quad\le\inf_{Y\in \mathrm {Lin}(\mathcal {Y}), M\ge0, N\ge0}\;\lim_{L\to\infty}\sup_{\mathbb {P}\in\underline{\mathcal {M}}_{ \mathcal {I}}\cap \mathcal {M}^{1/L}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}}} \mathbb {E}_{\mathbb {P}}[G -\beta\sqrt{m^{(D-4)}}\wedge\alpha-Y \\ &\hphantom{=:\inf_{Y\in \mathrm {Lin}(\mathcal {Y}), M\ge0, N\ge0}\;\lim_{L\to\infty }\quad\sup_{\mathbb {P}\in\underline{\mathcal {M}}_{ \mathcal {I}}\cap \mathcal {M}^{1/L}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}}} \mathbb {E}_{\mathbb {P}}[}{} -N\lambda_{\mathfrak {P}}+\mathcal {P}(Y)] \\ &\quad\le\inf_{Y\in \mathrm {Lin}(\mathcal {Y}), M\ge0, N\ge0}\sup_{\mathbb {P}\in\underline{\mathcal {M}}_{ \mathcal {I}}\cap \mathcal {M}^{1/M}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}}} \mathbb {E}_{\mathbb {P}}[G-\beta\sqrt {m^{(D-4)}}\wedge\alpha-Y \\ &\hphantom{=:\inf_{Y\in \mathrm {Lin}(\mathcal {Y}), M\ge0, 
N\ge0}\quad\sup_{\mathbb {P}\in\underline {\mathcal {M}}_{ \mathcal {I}}\cap \mathcal {M}^{1/M}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}}} \mathbb {E}_{\mathbb {P}}[}{} -N\lambda_{\mathfrak {P}}+\mathcal {P}(Y)]. \end{aligned}$$

Above, the first equality follows from the previously argued equality in (4.4). The second inequality is justified by Theorem 5.1 and Remark 5.2 below. Both the ensuing equality and the last inequality are clear. It remains to observe that the third inequality follows from Lemma 4.4. To justify this, note that is a convex and compact subset of \(\mathcal {C}(\mathbb {R}_{+},\mathbb {R})\), and it follows that \(\mathrm {Lin}_{1}(\mathcal {Z}_{M})=\mathcal {Z}_{M}\) is a convex compact subset of \((\mathcal {C}(\varOmega, \mathbb {R}),\|\cdot\|_{\infty})\). In addition, for all \(\eta>0 \) implies that \(\mathcal {M}^{1/M}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}} \neq\emptyset\) for all \(M\), and clearly we can obtain such measures on a Wiener space so that \(\mathcal {M}^{1/M}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}}\cap\underline{\mathcal {M}}_{ \mathcal {I}}\neq\emptyset\) for all \(M\), as required.

For any \(\mathbb {P}\in \mathcal {M}_{\mathcal {Z}_{M}, \mathcal {P}, \mathcal {I}}^{1/M}\), let \(\epsilon_{\mathbb {P}} = \max\{d_{p}(\mu_{n}^{(i)}, \mathcal {L}_{\mathbb {P}}(\mathbb {S}^{(i)}_{T_{n}})) : i=1, \dots, d\}\), where the Lévy–Prokhorov metric \(d_{p}\) on probability measures on \(\mathbb {R}_{+}^{d}\) is given by

$$ d_{p}(\mu,\nu):=\sup_{f\in\mathfrak{G}^{b}_{1}(\mathbb {R}_{+}^{d})}\bigg|\int f \,\mathrm {d}\nu -\int f \,\mathrm {d}\mu\bigg|, $$


$$\mathfrak{G}^{b}_{1}(\mathbb {R}_{+}^{d}):=\{f\in \mathcal {C}(\mathbb {R}^{d}_{+},\mathbb {R}): \|f\|\le1 \text{ and } |f(\pmb {x})-f(\pmb {y})|\le|\pmb {x}-\pmb {y}|, \forall \pmb {x}\neq \pmb {y}\} $$

(see e.g. Bogachev [9, Theorem 8.3.2]). Pick \(g\in\mathfrak {G}^{b}_{1}(\mathbb {R}_{+})\) such that

$$\begin{aligned} \bigg|\int_{\mathbb {R}_{+}}g(x)\mu_{n}^{(i)}(\mathrm {d}x)-\mathbb {E}_{\mathbb {P}}[g(\mathbb {S}_{T_{n}}^{(i)})]\bigg| > \epsilon_{\mathbb {P}}/2 \qquad \text{for some } i=1,\ldots,d, \end{aligned}$$

and define \(\hat{g}\in\mathfrak{G}_{M}(\mathbb {R}_{+})\) via \(\hat{g}(x)=Mg(x\wedge M^{2})\). Then by the definition of \(\mathcal {M}_{\mathcal {Z}_{M}, \mathcal {P}, \mathcal {I}}^{1/M}\) and \(\hat{g}\),

$$\begin{aligned} \frac{1}{M}&\geq \bigg|\int_{\mathbb {R}_{+}}\hat{g}(x)\mu_{n}^{(i)}(\mathrm {d}x)-\mathbb {E}_{\mathbb {P}}[\hat{g}(\mathbb {S}_{T_{n}}^{(i)})]\bigg| \\ & \ge M\bigg|\int g \,\mathrm {d}\mu_{n}^{(i)} - \mathbb {E}_{\mathbb {P}}[g(\mathbb {S}_{T_{n}}^{(i)})] \bigg|-M\mu_{n}^{(i)}(\{|x|\geq M^{2}\})-M\mathbb {P}[|\mathbb {S}_{T_{n}}^{(i)}|\geq M^{2}] \\ & \ge M\frac{\epsilon_{\mathbb {P}}}{2}-\frac{2}{M}. \end{aligned}$$

It follows that \(\epsilon_{\mathbb {P}} \le3/M^{2}\) and hence \(\mathcal {M}_{\mathcal {Z}_{M}, \mathcal {P}, \mathcal {I}}^{1/M} \subseteq \mathcal {M}_{\mu_{n}, \mathcal {I}, 1/M}\) for \(M\geq3\). Fix \(Y\in \mathrm {Lin}(\mathcal {Y})\) and for each \(M\geq3\), take \(\mathbb {P}^{(M)}\in \underline{\mathcal {M}}\cap \mathcal {M}_{\mu_{n}, \mathcal {I},1/M}\) such that

$$\begin{aligned} & \mathbb {E}_{\mathbb {P}^{(M)}}[G-\beta\sqrt{m^{(D-4)}}\wedge\alpha-Y-N\lambda_{\mathfrak {P}} +\mathcal {P}(Y)]\\ &\quad\ge\sup_{\mathbb {P}\in \underline{\mathcal {M}}\cap \mathcal {M}_{\mu_{n}, \mathcal {I}, 1/M}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt{m^{(D-4)}}\wedge\alpha-Y-N\lambda_{\mathfrak {P}} +\mathcal {P}(Y)]-\frac{1}{M}. \end{aligned}$$

Let \(\pi^{(M)}_{n}\) be the law of \((\mathbb {S}^{(1)}_{T_{n}},\ldots, \mathbb {S}^{(d)}_{T_{n}})\) under \(\mathbb {P}^{(M)}\) and note that its marginals have mean 1. The family \((\pi^{(M)}_{n})_{M\ge3}\) is tight and by Prokhorov’s theorem, there exists a subsequence \((\pi^{(M_{k})}_{n})_{k \in{\mathbb {N}}}\) converging to some \(\pi_{n}\). Note that the marginal distributions of \(\pi_{n}\) are \(\mu_{n}^{(i)}\), \(i=1,\ldots, d\). By the choice of \(\mathbb {P}^{(M)}\) and Lemma 4.3, it follows that

$$\begin{aligned} &\lim_{M\to\infty}\sup_{\mathbb {P}\in \underline{\mathcal {M}}\cap \mathcal {M}_{\mu_{n}, \mathcal {I}, 1/M}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt{m^{(D-4)}}\wedge\alpha-Y-N\lambda_{\mathfrak {P}} +\mathcal {P}(Y)]\\ &\quad\le\sup_{\mathbb {P}\in \underline{\mathcal {M}}\cap \mathcal {M}_{\mu_{n}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-\beta \sqrt{m^{(D-6)}}\wedge\alpha-Y-N\lambda_{\mathfrak {P}} +\mathcal {P}(Y)]. \end{aligned}$$

With this and using \(\mathcal {M}_{\mathcal {Z}_{M}, \mathcal {P}, \mathcal {I}}^{1/M} \subseteq \mathcal {M}_{\mu_{n}, \mathcal {I}, 1/M}\), we may continue (4.10) by writing

$$\begin{aligned} &\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\\ &\quad\le\inf_{Y\in \mathrm {Lin}(\mathcal {Y})} \inf_{M\ge0,\, N\ge0}\;\sup_{\mathbb {P}\in \underline{\mathcal {M}}\cap \mathcal {M}^{1/M}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt {m^{(D-4)}}\wedge\alpha-Y \\ &\phantom{=:\inf_{Y\in \mathrm {Lin}(\mathcal {Y})} \inf_{M\ge0,\, N\ge0}\;\sup_{\mathbb {P}\in \underline{\mathcal {M}}\cap \mathcal {M}^{1/M}_{\mathcal {Z}_{M},\mathcal {P}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[}\quad{}-N\lambda_{\mathfrak {P}}+\mathcal {P}(Y)]\\ &\quad\le \inf_{Y\in \mathrm {Lin}(\mathcal {Y}),\, N\ge0}\; \sup_{\mathbb {P}\in \mathcal {M}_{\mu_{n},\mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt{m^{(D-6)}}\wedge\alpha-Y-N\lambda_{\mathfrak {P}} +\mathcal {P}(Y)]\\ &\quad\le \inf_{M\ge0} \inf_{Y\in \mathrm {Lin}(\mathcal {Y}_{M}), N\ge0}\sup_{\mathbb {P}\in \mathcal {M}_{\mu _{n}, \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt{m^{(D-6)}}\wedge\alpha-Y-N\lambda_{\mathfrak {P}} +\mathcal {P}(Y)]\\ &\quad\le \inf_{M\ge0} \lim_{N\to\infty}\sup_{\mathbb {P}\in \mathcal {M}_{\mu_{n}, \mathcal {I}}\cap \mathcal {M}^{1/N}_{\mathcal {Y}_{M},\mathcal {P},\mathfrak {P}}}\mathbb {E}_{\mathbb {P}}[G-\beta\sqrt{m^{(D-8)}}\wedge \alpha], \end{aligned}$$

where the last inequality follows from Lemma 4.4 since by analogous arguments to the ones above, we may argue that when \(M\) is large enough. The first inclusion implies that \(\mathcal {M}_{\mathcal {Y}_{M}, \mathcal {P}, \mathfrak {P}}^{1/M}\cap \mathcal {M}_{\mu_{n}, \mathcal {I}}\neq \emptyset\), justifying the application of Lemma 4.4. The second inclusion allows us to continue the above chain of inequalities to conclude the proof via


4.4 Proof of Theorem 3.16

We first make two simple observations.

Remark 4.6

If \(\mathfrak {P}\) is a nonempty closed (with respect to the sup-norm) subset of \(\varOmega\), then

$$\begin{aligned} \mathfrak {P}= \bigcap_{\epsilon>0}\mathfrak {P}^{\epsilon} = \bigcap_{\epsilon >0}\mkern 1.5mu\overline {\mkern -1.5mu\mathfrak {P}^{\epsilon}\mkern -1.5mu}\mkern 1.5mu, \end{aligned}$$

where \(\mkern 1.5mu\overline {\mkern -1.5mu\mathfrak {P}^{\epsilon}\mkern -1.5mu}\mkern 1.5mu\) is the closure of \(\mathfrak {P}^{\epsilon}\).

Lemma 4.7

If \(\mathfrak {P}\) is time-invariant, then for every \(\epsilon>0\), \(\mathfrak {P}^{\epsilon}\) is also time-invariant.


Proof. This follows easily by observing that for two paths \(S,\tilde{S}\in \varOmega\) and any nondecreasing continuous function \(f:[0,T_{n}]\to [0,T_{n}]\) with \(f(0)=0\) and \(f(T_{i}) = T_{i}\) for any \(i=1,\ldots,n\), we have \(\|\tilde{S}_{\cdot}-S_{\cdot}\|=\|\tilde{S}_{f(\cdot)}-S_{f(\cdot)}\| \). □

We now proceed with the proof of Theorem 3.16. Recall that the inequalities \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\ge V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)\ge P_{\pmb {\mu}, \mathfrak {P}}(G)\) hold in general. In addition, according to Theorem 3.12, \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G) = \widetilde{P}_{\pmb {\mu}, \mathfrak {P}}(G)\). Therefore, we only need to show that \(\widetilde{P}_{\pmb {\mu}, \mathfrak {P}}(G) = P_{\pmb {\mu}, \mathfrak {P}}(G)\). Our proof of this equality is divided into six steps. First, using Proposition 4.5, we argue that it suffices to consider measures with “good control” on the expectation of \(m^{(D)}(\mathbb {S})\). Next, we perform three time changes within each trading period \([T_{i}, T_{i+1}]\). The resulting time change of \(\mathbb {S}\), denoted by \(\ddot{\mathbb {S}}\), allows a “good control” over its quadratic variation process. At the same time, we keep \(G(\mathbb {S})\) and \(G(\ddot{\mathbb {S}})\) “close”, and given a measure with “good control” on \(\mathbb {E}_{\mathbb {P}}[m^{(D)}(\mathbb {S})]\), since \(\mathfrak {P}^{\eta}\) is time-invariant, the law of the time-changed price process \(\ddot{\mathbb {S}}\) remains an element of . Then in Step 5, given a sequence of models with improved calibration precisions, we show tightness of the quadratic variation process of the time-changed price process \(\ddot {\mathbb {S}}\) under these measures. This then leads to tightness of the image measures via \(\ddot{\mathbb {S}}\). In Step 6, we deduce the duality \(\widetilde {P}_{\pmb {\mu}, \mathfrak {P}}(G) = P_{\pmb {\mu}, \mathfrak {P}}(G)\) from tightness and conclude.

Recall that \(\mathcal {X}\) is given by (3.7). Let

$$\begin{aligned} \mathcal {X}_{n}=\{(\kappa-\mathbb {S}^{(i)}_{T_{n}})^{+}: i=1,\ldots,d,\,\,\kappa\in \mathbb {R}_{+}\} \end{aligned}$$

and write \(P_{\mu_{n},\mathfrak {P}}:=P_{\mathcal {X}_{n},\mathcal {P},\mathfrak {P}}\) for the associated primal problem, where the martingale measures have fixed marginals \(\mu _{n}^{(i)}\), given by (3.8), of the distribution of \(\mathbb {S}_{T_{n}}\). Note that by definition, \(P_{\mu_{n}, \mathcal {I}}=\widetilde{P}_{\mu _{n}, \mathcal {I}}\) and that since the \(\mu_{n}^{(i)}\) have finite \(p\)th moment, we have \(P_{\mu_{n}, \mathcal {I}}\left(\|\mathbb {S}\|\right)<\infty\).

Step 1: Reducing to measures with good control on \(\mathbb {E}_{\mathbb {P}}[\sqrt{m^{(D)}(\mathbb {S})}]\). Let \(G\) satisfy Assumption 3.14. Choose \(\kappa\geq1\) such that \(\|G\|\le\kappa\) and let \(f_{e}:\mathbb {R}^{d+K}_{+}\to \mathbb {R}_{+}\) be a modulus of continuity of \(G\), i.e.,

$$|G(\omega)-G(\upsilon)|\le f_{e}(|\omega-\upsilon|) \qquad \text{for any }\omega, \upsilon\in\varOmega $$

with \(\lim_{x\to0} f_{e}(x) = 0\). Fix \(D\in \mathbb {N}\). Consider \(X_{D}:\varOmega\to \mathbb {R}\) given by

$$\begin{aligned} X_{D}(S) &= \sqrt{\sum_{j=1}^{m^{(D)}(S)\wedge2^{6D}\kappa^{2}}\, \sum _{i=1}^{d+K}\big|S^{(i)}_{\tau^{(D)}_{j}(S)}- S^{(i)}_{\tau ^{(D)}_{j-1}(S)}\big|^{2}} \\ &\ge2^{-D}\sqrt{m^{(D)}(S)\wedge(2^{6D}\kappa^{2})-1}\\ &\ge\Big(2^{-D}\big(\sqrt{m^{(D)}(S)\wedge2^{6D}\kappa^{2}} -1\big)\Big)= \kappa2^{2D}\wedge\frac{\sqrt{m^{(D)}(S)}}{2^{D}} -2^{-D}, \end{aligned}$$

where the \(\tau^{(D)}_{i}\) and \(m^{(D)}\) are defined in Definition 4.1. It follows from the proof of Lemma 5.4 in Dolinsky and Soner [26] that there exists a \(\gamma\in\mathcal{A}\) such that

$$ \int_{0}^{\tau_{m^{(D)}(S)\wedge2^{6D}\kappa^{2}}}\gamma_{u} \,\mathrm {d}S_{u} + 3(d+K)\max_{0\le j\le(m^{(D)}(S)\wedge2^{6D}\kappa^{2})}|S_{\tau ^{(D)}_{j}}|\ge X_{D}(S),\qquad S\in \mathcal {I}. $$

Hence \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(X_{D})\le3(d+K)\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(\|\mathbb {S}\|\wedge(\kappa^{2}2^{5D} + 1))\). Reducing \(\mathcal {X}\) to options with maturity \(T_{n}\) and considering ℐ instead of \(\mathfrak {P}\) only increases the superhedging price, and therefore

$$ 0\le \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(X_{D}) \le3(d+K)V_{\mathcal {X}_{n},\mathcal {P}, \mathcal {I}}\big(\|\mathbb {S}\|\wedge (\kappa^{2}2^{5D} + 1)\big)\leq3(d+K)P_{\mu_{n}, \mathcal {I}}\left(\|\mathbb {S}\|\right), $$

which is finite, and where the last inequality follows from Theorem 3.12 applied to the case of a single maturity. It now follows from sublinearity of \(\widetilde{V}\) that


where \(c_{2}\) is a constant independent of \(D\) and the last inequality follows from Proposition 4.5.

Next we denote by \(\widehat {\mathcal {M}}_{\mathcal {I}}^{\kappa}\) the set of \(\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}\) such that

$$\begin{aligned} \mathbb {E}_{\mathbb {P}}\bigg[\kappa2^{D}\wedge\frac{\sqrt{m^{(D-8)}(\mathbb {S})}}{2^{2D}}\bigg]\le2\kappa+ 2. \end{aligned}$$

We notice that if \(\mathbb {P}\notin \widehat {\mathcal {M}}_{\mathcal {I}}^{\kappa}\), then

$$\begin{aligned} \mathbb {E}_{\mathbb {P}}\bigg[G(\mathbb {S})-\kappa2^{D}\wedge\frac{\sqrt{m^{(D-8)}(\mathbb {S})}}{2^{2D}}\bigg] < \kappa- 2\kappa-2 =-\kappa-2, \end{aligned}$$

while by the inequalities in (4.11) above, for \(D\) sufficiently large,

It follows that in (4.11), it suffices to consider , which in particular is nonempty.

Step 2: First time change: “squeezing paths and adding constant paths”. The first time change squeezes the evolution on \([T_{i-1},T_{i}]\) to \([T_{i-1},T_{i}-1/D]\) and adds a constant piece to the path on \([T_{i}-1/D,T_{i}]\). To achieve this, define an increasing function \(f: [0,T_{n}]\to[0,T_{n}]\) by

$$ f(t)=\sum_{i=1}^{n}\bigg(T_{i}\wedge\Big(T_{i-1}+\frac{(T_{i}-T_{i-1})(t-T_{i-1})}{T_{i}-T_{i-1}-1/D}\Big)\bigg)\mathbf{1}_{\{T_{i-1}< t\le T_{i}\}} $$

and then a process \((\tilde{\mathbb {S}}_{t})_{t\in[0,T_{n}]}\) by a time change of \(\mathbb {S}\) via \(f\), i.e., \(\tilde{\mathbb {S}}_{t} = \mathbb {S}_{f(t)}\). Note that \(f(T_{i}-1/D)=T_{i}\), as required. We argue below that (3.9) implies that we have \(|G(\mathbb {S}) - G(\tilde{\mathbb {S}})|\to 0\) as \(D\to\infty\).
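
To make the first time change concrete, the following minimal sketch evaluates \(f\) with exact rational arithmetic. It assumes that `maturities` is the list \([T_{0},T_{1},\ldots,T_{n}]\) with \(T_{0}=0\) and that \(1/D\) is smaller than every period length; the function name `squeeze_time_change` is hypothetical.

```python
from fractions import Fraction


def squeeze_time_change(t, maturities, D):
    """Evaluate the time change f of Step 2 at time t.

    On each period (T_{i-1}, T_i], f maps [T_{i-1}, T_i - 1/D] linearly
    onto [T_{i-1}, T_i] and is constant, equal to T_i, afterwards.
    """
    if not maturities[0] <= t <= maturities[-1]:
        raise ValueError("t must lie in [T_0, T_n]")
    for t_prev, t_next in zip(maturities, maturities[1:]):
        if t_prev < t <= t_next:
            length = t_next - t_prev
            if length <= Fraction(1, D):
                raise ValueError("1/D must be smaller than the period length")
            stretched = t_prev + length * (t - t_prev) / (length - Fraction(1, D))
            return min(t_next, stretched)       # the cap T_i in the formula
    return maturities[0]                        # only reached for t = T_0


if __name__ == "__main__":
    Ts = [Fraction(0), Fraction(1), Fraction(3)]
    # f(T_1 - 1/D) = T_1: the whole period is traversed by time T_1 - 1/D.
    print(squeeze_time_change(Fraction(3, 4), Ts, 4))
    # On [T_1 - 1/D, T_1] the time change stays at T_1, so the path is constant.
    print(squeeze_time_change(Fraction(7, 8), Ts, 4))
```

Since \(\tilde{\mathbb {S}}_{t}=\mathbb {S}_{f(t)}\), the flat part of \(f\) is exactly the constant piece of \(\tilde{\mathbb {S}}\) on \([T_{i}-1/D,T_{i}]\).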

Now for every \(N\in \mathbb {N}\), take such that

Since \(\mathbb {S}_{T_{i}} =\tilde {\mathbb {S}}_{T_{i}}\), we have in particular \(\mathcal {L}_{\mathbb {P}^{(N)}}(\mathbb {S}_{T_{i}}) = \mathcal {L}_{\mathbb {P}^{(N)}}(\tilde {\mathbb {S}}_{T_{i}})\) for all \(i\le n\). Also, being a time change of \(\mathbb {S}\), the process \((\tilde{\mathbb {S}}_{t})_{t\in[0,T_{n}]}\) is a martingale (in the time-changed filtration). It follows that its distribution \(\mathbb {P}^{(N)} \circ(\tilde {\mathbb {S}}_{t})^{-1}\) is an element of as \(\mathfrak {P}^{1/N}\) is time-invariant, by Lemma 4.7.

Step 3: Second time change: introducing a lower bound on the time step. The second time change ensures that we can bound from below the difference between any two consecutive stopping times in the Lebesgue discretisation in Definition 4.1. We want to do this by adding a constancy interval of length \(\delta\) to each step of the discretisation. As we have squeezed the paths above, we have length \(1/D\) to use up while still keeping the time changes to within the intervals \([T_{i-1},T_{i}]\). Taking suitably small \(\delta\), this allows us, with high probability, to alter all the steps in the Lebesgue discretisation.

For ease of notation, it is helpful to rename the elements of the set

$$\{\tau^{(D)}_{j} : j\le m^{(D)}\}\cup\{T_{i} : i = 1,\ldots, n\} $$

as follows. We define a sequence of stopping times \(\tau _{i,j}^{(D)}: \varOmega\to[T_{i-1},T_{i}]\) and \(m^{(D)}_{i}: \varOmega\to \mathbb {N}_{+}\) in a recursive manner. Set \(T_{0}=m^{(D)}_{0}(\mathbb {S}) = \tau^{(D)}_{0,-1}= 0\) and for \(i=1,\ldots,n\), set \(\tau^{(D)}_{i,0}(\mathbb {S})=T_{i-1}\) and let

$$\begin{aligned} \tau^{(D)}_{i,1}(\mathbb {S})&=\inf\bigg\{ t\ge T_{i-1}:\big|\mathbb {S}_{t}-\mathbb {S}_{\tau ^{(D)}_{i-1,m^{(D)}_{i-1}(\mathbb {S})-1}(\mathbb {S})}\big|=\frac{1}{2^{D}}\bigg\} \wedge T_{i},\\ \tau^{(D)}_{i,k}(\mathbb {S})&=\inf\bigg\{ t\ge\tau^{(D)}_{i,k-1}(\mathbb {S}):\big|\mathbb {S}_{t}-\mathbb {S}_{\tau ^{(D)}_{i,k-1}(\mathbb {S})}\big|=\frac{1}{2^{D}}\bigg\} \wedge T_{i},\\ m^{(D)}_{i}(\mathbb {S})&=m^{(D)}_{i-1}(\mathbb {S})+\min\{k\in \mathbb {N}: \tau^{(D)}_{i,k}(\mathbb {S})=T_{i}\}. \end{aligned}$$

It follows that for any \(S\in \mathcal {I}\),

$$\begin{aligned} m^{(D)}(S)\le m^{(D)}_{n}(S) \le m^{(D)}(S) + n-1. \end{aligned}$$
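
The recursive definition above can be illustrated on a sampled path. The sketch below is only a grid proxy under explicit assumptions: the path is scalar, it is observed on a finite time grid containing every \(T_{i}\), the exact hitting condition \(|\cdot|=2^{-D}\) is replaced by the first grid time at which the distance is at least `eps`, and the reference level at the start of a period is read as the path value at the last stopping time strictly before the preceding maturity. The name `lebesgue_times` is hypothetical.

```python
def lebesgue_times(times, values, maturities, eps):
    """Grid proxy for the stopping times tau_{i,k} and the counts m_i.

    times      -- increasing sample times, containing every maturity
    values     -- scalar path values at those times
    maturities -- [T_0, T_1, ..., T_n]
    eps        -- step size, playing the role of 2**(-D)
    Returns (taus, counts): taus[i-1] = [tau_{i,0}, ..., T_i] and
    counts[i-1] is the cumulative count m_i.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    ref = values[0]                 # level at the last stopping time so far
    m = 0
    taus, counts = [], []
    for t_prev, t_next in zip(maturities, maturities[1:]):
        idx, end_idx = times.index(t_prev), times.index(t_next)
        period = [t_prev]           # tau_{i,0} = T_{i-1}
        while True:
            # First grid point at distance >= eps from the reference level;
            # if there is none, the stopping time is capped at T_i.
            hit = next((j for j in range(idx, end_idx + 1)
                        if abs(values[j] - ref) >= eps), end_idx)
            if hit == end_idx:
                period.append(t_next)
                break
            period.append(times[hit])
            ref, idx = values[hit], hit
        m += len(period) - 1        # min{k : tau_{i,k} = T_i}
        taus.append(period)
        counts.append(m)
    return taus, counts


if __name__ == "__main__":
    grid = list(range(9))
    path = [0.0, 0.5, 1.0, 1.5, 1.75, 2.25, 2.0, 3.5, 3.0]
    print(lebesgue_times(grid, path, [0, 4, 8], 1.0))
```

Adding the maturities to the partition contributes at most one extra point per period, which is consistent with the inequality between \(m^{(D)}\) and \(m^{(D)}_{n}\) stated above.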

Set \(\varTheta= 2\lceil\kappa^{2} 2^{6D} \rceil+n\) and \(\delta= 1/(4D\varTheta^{2})\). We now define a sequence of stopping times \(\sigma _{i,j}:\varOmega\to[0,T_{n}]\) by \(\sigma_{i,0}(S) := T_{i-1}\), \(\sigma _{i,\varTheta+1}(S) := T_{i}\), and for \(j\leq\varTheta\), we put

$$\sigma_{i,j}(S) := \big(\tau^{(D-8)}_{i, j}(S) + \delta j\big)\wedge \big(T_{i}-1/(2D)\big)\qquad \text{if }j< m^{(D-8)}_{i}(S), $$

while \(\sigma_{i,j}(S) := T_{i}-1/(2D)\) otherwise, where \(i=1,\ldots, n\). Then it follows from the definition that

$$T_{i-1}=\sigma_{i,0}(S)\le\sigma_{i,1}(S)\le\cdots\le\sigma _{i,\varTheta}(S)< \sigma_{i,\varTheta+1}(S)= T_{i} $$

for all \(S\in\varOmega\). Further, since the process \(\tilde {\mathbb {S}}\) is always constant on \([T_{i}-1/D, T_{i}]\), we have \(\tau^{(D-8)}_{i, j}(\tilde {\mathbb {S}})\le T_{i}-1/D\) and hence for \(j\le\varTheta\wedge (m^{(D-8)}_{i}(\tilde {\mathbb {S}})-1)\) that

$$\sigma_{i,j}(\tilde {\mathbb {S}}) \le\tau^{(D-8)}_{i, m^{(D-8)}_{i}-1}(\tilde {\mathbb {S}}) + \delta \varTheta\le T_{i}-\frac{1}{D} + \frac{1}{4D\varTheta}< T_{i} - \frac{1}{2D}. $$

Also, for all \(j =1,\ldots, (\varTheta\wedge (m^{(D-8)}_{i}(\tilde {\mathbb {S}})-1))\),

$$ \sigma_{i,j}(\tilde {\mathbb {S}}) - \sigma_{i,j-1}(\tilde {\mathbb {S}}) = \delta+ \big(\tau ^{(D-8)}_{i,j}(\tilde {\mathbb {S}}) - \tau^{(D-8)}_{i,j-1}(\tilde {\mathbb {S}})\big)\ge\delta.$$
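
The construction of the shifted times \(\sigma_{i,j}\) can be checked on a toy example. The sketch below treats a single period and takes the stopping times as given; it assumes a small value of `theta` purely for illustration (in the text \(\varTheta= 2\lceil\kappa^{2} 2^{6D} \rceil+n\) is much larger), and the name `shifted_times` is hypothetical.

```python
from fractions import Fraction


def shifted_times(taus, t_next, D, theta):
    """Return ([sigma_0, ..., sigma_{theta+1}], delta) for one period.

    taus   -- [tau_0 = T_{i-1}, tau_1, ..., tau_m = T_i], with
              tau_j <= T_i - 1/D for j < m (path constant near T_i)
    t_next -- the maturity T_i
    """
    delta = Fraction(1, 4 * D * theta ** 2)
    cap = t_next - Fraction(1, 2 * D)
    m = len(taus) - 1                       # number of steps in this period
    sigmas = []
    for j in range(theta + 1):
        if j < m:
            # Shift the j-th stopping time by j constancy intervals.
            sigmas.append(min(taus[j] + delta * j, cap))
        else:
            sigmas.append(cap)
    sigmas.append(t_next)                   # sigma_{theta+1} = T_i
    return sigmas, delta


if __name__ == "__main__":
    taus = [Fraction(0), Fraction(1, 10), Fraction(3, 10), Fraction(3, 4), Fraction(1)]
    sigmas, delta = shifted_times(taus, Fraction(1), D=4, theta=10)
    gaps = [b - a for a, b in zip(sigmas, sigmas[1:])]
    # The first m - 1 gaps are at least delta, as in the display above.
    print(all(g >= delta for g in gaps[:3]))
```

Because \(\delta\varTheta=1/(4D\varTheta)\), the total shift is small compared with the spare time \(1/D\) created by the first time change, so the cap \(T_{i}-1/(2D)\) is not active for the shifted stopping times.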

We are now ready to define the time-changed process \(\check{\mathbb {S}}\) by

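$$\begin{aligned} \check{\mathbb {S}}_{t} &= \sum_{i=1}^{n}\Bigg(\sum_{j=0}^{\varTheta-1}\tilde {\mathbb {S}}_{\tau^{(D-8)}_{i,j}(\tilde {\mathbb {S}})+(t-\sigma_{i,j}(\tilde {\mathbb {S}})-\delta)^{+}}\mathbf{1}_{[\sigma_{i,j}(\tilde {\mathbb {S}}),\sigma_{i,j+1}(\tilde {\mathbb {S}}))}(t) \\ &\phantom{=\sum_{i=1}^{n}\Bigg(}{}+\tilde {\mathbb {S}}_{\big(\tau^{(D-8)}_{i,\varTheta}(\tilde {\mathbb {S}})+\frac{1}{T_{i}-t}-\frac{1}{T_{i}-\sigma_{i,\varTheta}(\tilde {\mathbb {S}})}\big)\wedge T_{i}}\mathbf{1}_{[\sigma_{i,\varTheta}(\tilde {\mathbb {S}}),T_{i}]}(t)\Bigg). \end{aligned}$$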

Observe that \(\check{\mathbb {S}}\) is a (continuous) time change of \(\tilde {\mathbb {S}}\) and \(\tilde {\mathbb {S}}_{T_{i}} = \check{\mathbb {S}}_{T_{i}} = \mathbb {S}_{T_{i}}\) for \(i\le n\). As before, this implies that \(\check{\mathbb {S}}\) remains a martingale and .

We argue now that \(|G(\mathbb {S}) - G(\check{\mathbb {S}})|\) is small for large \(D\). To this end, we approximate a path \(S\) with a piecewise constant function \(\tilde{F}^{(D)}(S)\) which jumps at the times \(\tau_{i,j}^{(D)}\). A similar discretisation is used later in Sect. 5; see (5.2). For \(S\in\varOmega\), consider

$$ \tilde{F}^{(D)}_{t}(S)=\sum_{i=1}^{n}\sum_{j=0}^{m^{(D)}_{i}-1}S_{\tau^{(D)}_{i,j}}\mathbf{1}_{[\tau^{(D)}_{i,j},\tau^{(D)}_{i,j+1})}(t)+S_{T_{n}}\mathbf{1}_{\{T_{n}\}}(t),\qquad t\in[0,T]. $$

Then the time-continuity property of \(G\) in (3.9) ensures that

$$\begin{aligned} |G(\mathbb {S}) - G(\tilde{\mathbb {S}})| &\le\big|G(\mathbb {S}) - G\big(\tilde{F}^{(D)}(\mathbb {S})\big)\big|+ \big|G(\tilde{\mathbb {S}}) - G\big(\tilde{F}^{(D)}(\tilde{\mathbb {S}})\big)\big| \\ & \phantom{=}{}+ \big|G\big(\tilde{F}^{(D)}(\mathbb {S})\big) - G\big(\tilde{F}^{(D)}(\tilde{\mathbb {S}})\big)\big| \\ &\le2f_{e}(2^{-D+9}) + \frac{2nL\|\mathbb {S}\|}{D}. \end{aligned}$$

Similarly, for any \(S\in\varOmega\) with \(m^{(D-8)}_{n}(\tilde {\mathbb {S}}(S))=m^{(D-8)}_{n}(S)\le\varTheta\), again by (3.9), we have

$$\begin{aligned} \big|G\big(\tilde {\mathbb {S}}(S)\big)-G\big(\check{\mathbb {S}}(S)\big)\big| &\le\Big|G\big(\tilde {\mathbb {S}}(S)\big) - G\Big(\tilde{F}^{(D-8)}\big(\tilde {\mathbb {S}}(S)\big)\Big)\Big| \\ &\phantom{=}{} + \Big|G\big(\check{\mathbb {S}}(S)\big) - G\Big(\tilde{F}^{(D-8)}\big(\check{\mathbb {S}}(S)\big)\Big)\Big| \\ &\phantom{=}{}+ \Big|G\Big(\tilde{F}^{(D-8)}\big(\tilde {\mathbb {S}}(S)\big)\Big) - G\Big(\tilde{F}^{(D-8)}\big(\check{\mathbb {S}}(S)\big)\Big)\Big| \\ &\le2f_{e}(2^{-D+9})+nL\|\tilde {\mathbb {S}}(S)\|\varTheta\delta \\ &\le2f_{e}(2^{-D+9}) +nL\|\mathbb {S}(S)\|/D, \end{aligned}$$

when \(D\) is sufficiently large. From (4.12), the Markov inequality gives

$$ \mathbb {P}^{(N)}[\{S\in \mathcal {I}:\, m^{(D-8)}(S)\ge\varTheta-n+2\}]\le\frac{2\kappa +2}{\kappa2^{D}}, $$

and hence by (4.13),

$$ \mathbb {P}^{(N)}[\{S\in \mathcal {I}:\, m^{(D-8)}_{n}(S)\ge\varTheta+1\}]\le\frac{2\kappa +2}{\kappa2^{D}}. $$

Furthermore, by (4.15) and (4.16),

$$\begin{aligned} |\mathbb {E}_{\mathbb {P}^{(N)}}[G(\tilde {\mathbb {S}})] - \mathbb {E}_{\mathbb {P}^{(N)}}[G(\check{\mathbb {S}})]| &\le2\kappa \mathbb {P}^{(N)}[m^{(D-8)}_{n}(\tilde {\mathbb {S}})>\varTheta]+2f_{e}(2^{-D+9}) \\ &\phantom{=}{} +nL\mathbb {E}_{\mathbb {P}^{(N)}}[\|\mathbb {S}\|]/D \\ &\le\frac{4\kappa+4}{2^{D}} +2f_{e}(2^{-D+9}) \\ &\phantom{=}{} +nLV_{\mathcal {X}_{n}, \mathcal {P}, \mathcal {I}}(\|\mathbb {S}\|)/D. \end{aligned}$$

Step 4: Third time change: controlling the increments of the quadratic variation. We say that \(\omega\in \mathcal {C}([0,T],\mathbb {R})\) admits a quadratic variation if

$$\lim_{N\to\infty}\sum_{k=0}^{m^{(N)}(\omega)-1}\Big(\omega_{\tau ^{(N)}_{k}(\omega)\land t}-\omega_{\tau^{(N)}_{k+1}(\omega)\land t}\Big)^{2} $$

exists and is a continuous function of \(t\in[0,T]\). In this case, we denote the limit by \(\langle\omega\rangle\); otherwise we set \(\langle\omega\rangle\) to be zero. In addition, for \(S\in\varOmega\), we say that \(S\) admits a quadratic variation if \(S^{(i)}\) admits a quadratic variation for every \(i\le d+K\).
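To make the definition concrete, the following minimal Python sketch evaluates the sum above at \(t=T\) for a scalar path known only on a finite grid. The function name `lebesgue_qv` and the grid representation are illustrative assumptions and not part of the construction; on a grid, the hitting condition \(|\omega_{t}-\omega_{\tau_{k}}|=2^{-N}\) is replaced by the first grid point at which the increment is at least \(2^{-N}\).

```python
def lebesgue_qv(path, N):
    """Sum of squared increments of a sampled scalar path along its
    Lebesgue partition of mesh 2**-N, evaluated at the terminal time.

    `path` is a list of samples on a time grid; the last sample plays
    the role of the value at T (illustrative assumption).
    """
    step = 2.0 ** (-N)
    anchor = path[0]          # value at the most recent partition time
    total = 0.0
    for x in path[1:-1]:
        # a new partition time is reached once the path has moved by >= 2**-N
        if abs(x - anchor) >= step:
            total += (x - anchor) ** 2
            anchor = x
    # the final partition time is T itself
    total += (path[-1] - anchor) ** 2
    return total


# A linear path t -> t on [0, 1], sampled on a dyadic grid.
linear_path = [i / 1024 for i in range(1025)]
qv_coarse = lebesgue_qv(linear_path, 3)
qv_fine = lebesgue_qv(linear_path, 5)
```

For the linear path the partition consists of \(2^{N}\) increments of size \(2^{-N}\), so the sum equals \(2^{-N}\) and tends to zero as \(N\to\infty\), as one expects for a path of finite variation.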

It follows from Theorem 4.30.1 in Rogers and Williams [46] and its proof that for any \(\mathbb {P}\in \mathcal {M}\), \(\langle \mathbb {S}\rangle:= (\langle \mathbb {S}^{(1)} \rangle, \ldots, \langle \mathbb {S}^{(d+K)} \rangle)\) agrees ℙ-a.s. with the classical definition of the quadratic variation of \(\mathbb {S}\) under ℙ, i.e., \(\mathbb {S}^{2}-\langle \mathbb {S}\rangle\) is a ℙ-martingale. Further, Doob’s inequality gives for all \(i\le d\) that

$$ \mathbb {E}_{\mathbb {P}^{(N)}}[\|\check{\mathbb {S}}^{(i)}\|^{p}]\le\bigg(\frac{p}{p-1}\bigg)^{p}\int_{[0,\infty)} x^{p}\mu^{(i)}_{n}(\mathrm {d}x), $$

and by the BDG inequalities, there exist constants \(c_{p}, C_{p}\in (0,\infty)\) such that

$$ c_{p}\mathbb {E}_{\mathbb {P}^{(N)}}\big[\langle\check{\mathbb {S}}^{(i)} \rangle^{p/2}_{T_{n}}\big]\le \mathbb {E}_{\mathbb {P}^{(N)}}[\|\check{\mathbb {S}}^{(i)}\|^{p}] \le C_{p}\mathbb {E}_{\mathbb {P}^{(N)}}\big[\langle\check{\mathbb {S}}^{(i)} \rangle^{p/2}_{T_{n}}\big]. $$

It follows that

$$ \mathbb {E}_{\mathbb {P}^{(N)}}\bigg[\sum_{i=1}^{d+K}\langle\check{\mathbb {S}}^{(i)} \rangle ^{p/2}_{T_{n}}\bigg]\le K_{1}, $$

where \(K_{1} := \frac{1}{c_{p}}((\frac{p}{p-1})^{p}\sum_{i=1}^{d}\int _{[0,\infty)} x^{p}\mu^{(i)}_{n}(\mathrm {d}x)+K\kappa^{p})\).

In the following, we want to modify \(\check{\mathbb {S}}\) on

$$\begin{aligned} \tilde{ \mathcal {I}} &:= \{S\in \mathcal {I}: \check{\mathbb {S}}(S) \text{ admits a quadratic variation}\} \\ &\phantom{:}= \{S\in \mathcal {I}: S \text{ admits a quadratic variation}\} \end{aligned}$$

to obtain another process \(\ddot{\mathbb {S}}\) with a better control of the quadratic variation, while its law remains in . In fact, \(\ddot{\mathbb {S}}\) will be obtained as a time change of \(\check{\mathbb {S}}\) on each interval \([\sigma_{i,j}(\tilde {\mathbb {S}}), \sigma_{i,j+1}(\tilde {\mathbb {S}}))\). Then by the continuity of \(G\), it follows that

$$ \big|G\big(\check{\mathbb {S}}(S)\big) - G\big(\ddot{\mathbb {S}}(S)\big)\big|\le f_{e}(2^{-D+9}), \qquad \forall\, S\in\tilde{ \mathcal {I}}\cap\big\{ h\in \mathcal {I}: m^{(D-8)}_{n}\big(\tilde {\mathbb {S}}(h)\big) \le\varTheta\big\} . $$

This together with (4.16) and the fact that \(\mathbb {P}[\tilde{ \mathcal {I}}] = 1\) for any \(\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}\) yields

$$\begin{aligned} |\mathbb {E}_{\mathbb {P}^{(N)}}[G(\check{\mathbb {S}}) - G(\ddot{\mathbb {S}})]|&\le f_{e}(2^{-D+9}) + 2\kappa \mathbb {P}^{(N)}\big[\big\{ S\in \mathcal {I}:\, m^{(D-8)}_{n}\big(\tilde {\mathbb {S}}(S)\big)\ge \varTheta+1\big\} \big]\\ &\le f_{e}(2^{-D+9}) + \frac{4\kappa+4}{2^{D}}. \end{aligned}$$

Hence, by (4.14) and (4.17),

$$\begin{aligned} |\mathbb {E}_{\mathbb {P}^{(N)}}[G(\mathbb {S}) - G(\ddot{\mathbb {S}})]| &\le5f_{e}(2^{-D+9}) + \frac{2nL\|\mathbb {S}\|}{D}+ \frac{8\kappa+8}{2^{D}} \\ &\phantom{=}{}+\frac{2nLV_{\mathcal {X}_{n},\mathcal {P}, \mathcal {I}}(\|\mathbb {S}\|)}{D}. \end{aligned}$$

First, for every \(i,j,k\), define \(\rho^{(i,j,k)}:\varOmega\to [T_{i-1},T_{i}]\) by

$$\rho^{(i,j,k)}(S) = \sigma_{i,j}\big(\check{\mathbb {S}}(S)\big)+\delta(1-2^{-k+1}). $$

Then for \(i = 1,\ldots,n\), \(j = 0,1,\ldots\), let \(\theta ^{(i,j,0)}_{t}=\sigma_{i,j}\) and define recursively for \(k = 1,2,\ldots\) a change of time \(\theta^{(i,j,k)}: \mathcal {I}\times[\rho^{(i,j,k)},\rho ^{(i,j,k+1)}] \to[T_{i-1},T_{i}]\) by

$$\begin{aligned} \theta^{(i,j,k)}_{t}(S) &= \inf\bigg\{ u\ge\theta^{(i,j,k-1)}_{\rho^{(i,j,k)}}(S) :\sum_{\ell =1}^{d+K} \big(\langle\check{\mathbb {S}}^{(\ell)}(S)\rangle_{u} - \langle\check{\mathbb {S}}^{(\ell)}(S) \rangle_{\theta^{(i,j,k-1)}_{\rho^{(i,j,k)}}}\big) \\ &\hphantom{=:\inf\bigg\{ u\ge\theta^{(i,j,k-1)}_{\rho^{(i,j,k)}}(S) :} > 2^{k}(t-\rho^{(i,j,k)})/\delta\bigg\} \wedge\sigma_{i,j+1}\big(\check{\mathbb {S}}(S)\big)\\ &\phantom{=:}\text{ for } t\in[\rho^{(i,j,k)},\rho^{(i,j,k+1)}], S\in \tilde{ \mathcal {I}}. \end{aligned}$$

For \(S\in\varOmega\setminus\tilde{ \mathcal {I}}\), set \(\theta^{(i,j,k)}_{t}(S) = t\), \(0 \leq t\leq T_{n}\).

We consider a time change of \(\check{\mathbb {S}}\) via the \(\theta^{(i,j,k)}\), defined by \(\ddot{\mathbb {S}}_{t} := \check{\mathbb {S}}_{\theta^{(i,j,k)}_{t}(\mathbb {S})}\) for \(t\in[\rho^{(i,j,k)}(\mathbb {S}), \rho^{(i,j,k+1)}(\mathbb {S}))\) for all \(i,j,k\) as above. Note that \(\theta^{(i,j,k-1)}_{\rho ^{(i,j,k)}}=\theta^{(i,j,k)}_{\rho^{(i,j,k)}}\) so that the resulting process is continuous. Consider \(S\in\tilde { \mathcal {I}}\) and \(i,j\) such that \(\sigma_{i,j+1}(\tilde {\mathbb {S}}(S)) - \sigma_{i,j}(\tilde {\mathbb {S}}(S)) > 0\), as otherwise everything collapses to one point. Then the quadratic variation of \(\ddot{\mathbb {S}}(S)\) grows on \([\rho^{(i,j,k)}(S), \rho^{(i,j,k+1)}(S))\) linearly at the rate \(2^{k}/\delta\), and \(\rho^{(i,j,k+1)}(S) - \rho ^{(i,j,k)}(S) = 2^{-k}\delta\). In particular, \(\ddot{\mathbb {S}}\) accumulates one unit of quadratic variation over each interval \([\rho^{(i,j,k)}(S), \rho^{(i,j,k+1)}(S))\) for \(k\) increasing until the total quadratic variation of \(\check{\mathbb {S}}\) on \([\sigma_{i,j}(\tilde {\mathbb {S}}(S)), \sigma_{i,j+1}(\tilde {\mathbb {S}}(S))]\) is exhausted. Trivially bounding the quadratic variation of \(\check{\mathbb {S}}\) over a small interval by its quadratic variation over \([0,T_{n}]\), we see that

$$ \sum_{\ell=1}^{d+K}\big(\langle\ddot{\mathbb {S}}^{(\ell)}(S)\rangle_{t} - \langle \ddot{\mathbb {S}}^{(\ell)}(S)\rangle_{s}\big) \le2^{k_{0}}|t-s|/\delta \quad \text{for }\sigma_{i,j}\big(\tilde {\mathbb {S}}(S)\big)\le s \le t\le\sigma_{i,j+1}\big(\tilde {\mathbb {S}}(S)\big), $$

whenever \(S\in\tilde{ \mathcal {I}}\) is such that \(\sum_{i=1}^{d+K}\langle \check{\mathbb {S}}^{(i)}(S) \rangle_{T_{n}} \le k_{0}\). Therefore, for such \(S\), we have

$$ \sum_{\ell=1}^{d+K}\big(\langle\ddot{\mathbb {S}}^{(\ell)}\rangle_{t} - \langle \ddot{\mathbb {S}}^{(\ell)}\rangle_{s}\big) \le2^{k_{0}+1}|t-s|/\delta, \quad \forall s,t\in[0,T_{n}] \text{ with } |t-s|\le\delta. $$

We can ensure this happens with large probability since by Markov’s inequality,

$$\begin{aligned} \mathbb {P}^{(N)}\bigg[\sum_{i=1}^{d+K}\langle\ddot{\mathbb {S}}^{(i)} \rangle_{T_{n}} > k_{0}\bigg] &= \mathbb {P}^{(N)}\bigg[\sum_{i=1}^{d+K}\langle\check{\mathbb {S}}^{(i)} \rangle_{T_{n}} > k_{0}\bigg]\\ &\le\frac{\mathbb {E}_{\mathbb {P}^{(N)}}[\sum_{i=1}^{d+K}\langle\check{\mathbb {S}}^{(i)} \rangle^{p/2}_{T_{n}}]}{k_{0}^{p/2}}\le K_{1}k_{0}^{-p/2}. \end{aligned}$$

Finally, we observe that each \(\theta^{(i,j,k)}_{t}(\mathbb {S})\) is a stopping time relative to the natural filtration of \(\check{\mathbb {S}}\), and hence \(\ddot{\mathbb {S}}\) is a continuous \(\mathbb {P}^{(N)}\)-martingale.
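For orientation, the arithmetic behind the rate bound in this step can be spelled out as follows. This is only a sketch, using the definition of \(\rho^{(i,j,k)}\) and assuming that \(k_{0}\) is a positive integer:

$$\begin{aligned} \rho^{(i,j,k+1)}(S)-\rho^{(i,j,k)}(S) &= \delta\big(2^{-k+1}-2^{-k}\big) = 2^{-k}\delta, \\ \frac{2^{k}}{\delta}\cdot2^{-k}\delta &= 1, \qquad \sum_{k=1}^{\infty}2^{-k}\delta = \delta. \end{aligned}$$

Thus each interval \([\rho^{(i,j,k)}(S), \rho^{(i,j,k+1)}(S))\) carries at most one unit of quadratic variation, and if the total is at most \(k_{0}\), it is exhausted within the first \(k_{0}\) intervals, on which the rate never exceeds \(2^{k_{0}}/\delta\).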

Step 5: Tightness of the measures through tightness of the quadratic variation processes. Together with (4.19), by the Arzelà–Ascoli theorem, the above implies that the family \(\{\mathbb {P}^{(N)}\circ(\langle\ddot{\mathbb {S}} \rangle)^{-1}: N\in \mathbb {N}\}\) is tight in \(\mathcal {C}([0,T_{n}], \mathbb {R}^{d+K})\). Then by Theorem VI.4.13 in Jacod and Shiryaev [35], \(\{\mathbb {P}^{(N)}\circ\ddot{\mathbb {S}}^{-1}\}_{N\in \mathbb {N}}\) is tight in \(\mathbb {D}([0,T_{n}],\mathbb {R}^{d+K})\), the space of right-continuous functions with left limits. By Theorem VI.3.21 in Jacod and Shiryaev [35], this implies that for all \(\epsilon>0, \eta>0\), there are \(N_{0}\in \mathbb {N}\) and \(\theta>0\) with

$$\begin{aligned} N\ge N_{0} \Longrightarrow \mathbb {P}^{(N)}[w_{T_{n}}^{\prime}(\ddot{\mathbb {S}}, \theta )\ge\eta]\le\epsilon, \end{aligned}$$

where \(w_{T_{n}}^{\prime}\) is defined by

$$\begin{aligned} w_{T_{n}}^{\prime}(S, \theta) &= \inf\Big\{ \max_{i\le r} \sup_{t_{i-1}\le s\le t< t_{i}}|S_{t}- S_{s}|:r\in \mathbb {N}, 0 = t_{0}< \cdots< t_{r} = T_{n},\\ &\phantom{=:\inf\Big\{ \max_{i\le r} \sup_{t_{i-1}\le s\le t< t_{i}}|S_{t}- S_{s}|:} \inf_{i< r}(t_{i}-t_{i-1})\ge\theta\Big\} . \end{aligned}$$

Clearly, for \(S\in\varOmega\), continuity of \(S\) implies that

$$\begin{aligned} w_{T_{n}}(S, \theta):=\sup\{|S_{t}- S_{s}|:\, 0\le s< t\le T_{n},\, t-s\le\theta \} \le2w_{T_{n}}^{\prime}(S, \theta). \end{aligned}$$

Then we have

$$\begin{aligned} N\ge N_{0} \Longrightarrow \mathbb {P}^{(N)}[w_{T_{n}}(\ddot{\mathbb {S}}, \theta)\ge2\eta ]\le\epsilon, \end{aligned}$$

which then by Theorem VI.1.5 in Jacod and Shiryaev [35] implies that the family \(\{\mathbb {P}^{(N)}\circ\ddot{\mathbb {S}}^{-1}: N\in \mathbb {N}\}\) is tight, now in \(\mathcal {C}([0,T_{n}],\mathbb {R}^{d+K})\).

Step 6: Tightness gives exact duality. By tightness, there exists a converging subsequence \(\{\mathbb {P}^{(N_{k})}\circ \ddot{\mathbb {S}}^{-1}\}\) such that \(\mathbb {P}^{(N_{k})}\circ\ddot{\mathbb {S}}^{-1} \to \mathbb {P}\) weakly for some probability measure ℙ on \(\varOmega\). Consequently,

$$\begin{aligned} \lim_{k\to\infty} \mathbb {E}_{\mathbb {P}^{(N_{k})}}[G(\ddot{\mathbb {S}})]=\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})]. \end{aligned}$$

In addition, if ℙ is an element of \(\mathcal {M}_{\pmb {\mu },\mathfrak {P}}\), then

where \(e(x) := 5f_{e}(2^{-x+9}) + \frac{2nL\|\mathbb {S}\|}{x}+ \frac{c_{2}+ 8\kappa +8}{2^{x}} +\frac{2nLV_{\mathcal {X}_{n},\mathcal {P}, \mathcal {I}}(\|\mathbb {S}\|)}{x}\) and the third inequality follows from (4.18). Recalling that \(\widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}=\widetilde {P}_{\pmb {\mu },\mathfrak {P}}\) and letting \(D\to\infty\), we obtain the desired equality \(\widetilde {P}_{\pmb {\mu },\mathfrak {P}}=P_{\pmb {\mu },\mathfrak {P}}\) and conclude that

$$ \widetilde {V}_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=V_{\mathcal {X},\mathcal {P},\mathfrak {P}}(G)=P_{\pmb {\mu}, \mathfrak {P}}(G)=\widetilde{P}_{\pmb {\mu}, \mathfrak {P}}(G). $$

It remains to argue that ℙ is an element of \(\mathcal {M}_{\pmb {\mu}, \mathfrak {P}}\). First, it is straightforward to see that \(\mathbb {S}\) is a ℙ-martingale and \(\mathcal {L}_{\mathbb {P}}(S_{T_{i}}) = \mu_{i}\) for any \(i\le n\). To show that \(\mathbb {P}[\mathbb {S}\in \mathfrak {P}] = 1\), notice that by the Portmanteau theorem, for every \(\epsilon>0\),

$$\mathbb {P}[\mathbb {S}\in \mkern 1.5mu\overline {\mkern -1.5mu\mathfrak {P}^{\epsilon}\mkern -1.5mu}\mkern 1.5mu] \ge\limsup_{k\to\infty} \mathbb {P}^{(N_{k})}[\mathbb {S}\in \mkern 1.5mu\overline {\mkern -1.5mu\mathfrak {P}^{\epsilon}\mkern -1.5mu}\mkern 1.5mu] \ge\limsup_{k\to\infty} \mathbb {P}^{(N_{k})}[\mathbb {S}\in \mathfrak {P}^{1/N_{k}}] =1. $$

Therefore, it follows from Remark 4.6 and monotone convergence that

$$ \mathbb {P}[\mathbb {S}\in \mathfrak {P}] = \lim_{\epsilon\searrow0}\mathbb {P}[\mathbb {S}\in \mkern 1.5mu\overline {\mkern -1.5mu\mathfrak {P}^{\epsilon}\mkern -1.5mu}\mkern 1.5mu] =1, $$

and hence \(\mathbb {P}\in \mathcal {M}_{\pmb {\mu}, \mathfrak {P}}\).  □

5 Pricing–hedging duality without constraints

This and the subsequent section are devoted to establishing the crucial pricing–hedging duality result in the absence of constraints, which was exploited in all the proofs above.

Theorem 5.1

Under Assumption 3.1, for any \(\alpha, \beta\ge0\) and \(D\in \mathbb {N}\),

$$\begin{aligned} \mathbf {V}_{\mathcal {I}}(G- \beta\sqrt{m^{(D)}}\wedge\alpha) \le \mathbf {P}_{\mathcal {I}}(G - \beta\sqrt {m^{(D-2)}}\wedge\alpha), \end{aligned}$$

where \(m^{(D)}\) is defined in Definition 4.1.

Remark 5.2

As a by-product of the proof of Theorem 5.1, (5.1) still holds true when the probabilistic models ℙ are restricted to those which arise within a Brownian setup, i.e., ℙ satisfies (4.7).

The strategy of the proof is inspired by Dolinsky and Soner [25] and proceeds via discretisation, of the dual side in Sect. 5.1 and of the primal side in Sect. 5.3. The duality between the discrete counterparts is obtained by using classical probabilistic results of Föllmer and Kramkov [29].

5.1 Discretisation of the dual

5.1.1 A discrete-time approximation through simple strategies

The proof of Theorem 5.1 is based on a discretisation method involving a discretisation of the path space into a countable set of piecewise constant functions. These are obtained as a “shift” of the “Lebesgue discretisation” of a path. Recall from Definition 4.1 that for a positive integer \(N\) and any \(S \in\varOmega\), \(\tau^{(N)}_{0}(S)=0\), \(m^{(N)}_{0}(S)=0\),

$$\begin{aligned} \tau^{(N)}_{k}(S)=\inf\bigg\{ t\ge\tau^{(N)}_{k-1}(S):|S_{t}-S_{\tau ^{(N)}_{k-1}(S)}|=\frac{1}{2^{N}} \bigg\} \wedge T \end{aligned}$$

and \(m^{(N)}(S)=\min\{k\in \mathbb {N}: \tau^{(N)}_{k}(S)=T\}\). Now denote by \(\mathcal{A}_{N}\) the set of \(\gamma \in\mathcal{A}\) with \(|\gamma |\le N\) and for which trading in the risky assets only takes place at the moments \(0=\tau_{0}^{(N)}(S)<\tau_{1}^{(N)}(S)< \cdots<\tau_{m^{(N)}(S)}^{(N)}(S)=T\). Set

$$\begin{aligned} \mathbf {V}_{\mathcal {I}}^{(N)}(G):=\inf\{x : \exists \gamma \in\mathcal{A}_{N} \text{ which superreplicates $G-x$}\}. \end{aligned}$$

Then it is obvious from the definition of \(\mathbf {V}_{\mathcal {I}}^{(N)}\) that \(\mathbf {V}_{\mathcal {I}}^{(N_{1})}(G)\ge \mathbf {V}_{\mathcal {I}}^{(N_{2})}(G)\ge \mathbf {V}_{\mathcal {I}}(G)\) for any \(N_{2}\ge N_{1}\), and in fact, the following result states that \(\mathbf {V}_{\mathcal {I}}^{(N)}(G)\) converges to \(\mathbf {V}_{\mathcal {I}}(G)\) asymptotically.

Corollary 5.3

Under the assumptions of Theorem 5.1,

$$ \lim_{N\to\infty} \mathbf {V}_{\mathcal {I}}^{(N)}(G)=\mathbf {V}_{\mathcal {I}}(G). $$

5.1.2 A countable class of piecewise constant functions

In this section, we construct a countable set of piecewise constant functions which can approximate any continuous function \(S\) to a certain degree. It is achieved in three steps. The first step is to use the Lebesgue partition defined in the last section to discretise a continuous function into a piecewise constant function whose jump times are the stopping times. Due to the arbitrary nature of jump times and jump sizes, the set of piecewise constant functions \(F^{(N)}(S)\), generated through this procedure over all \(S\), is uncountable. To overcome this, in the subsequent two steps, we restrict the jump times and sizes to a countable set and hence define a class of approximating schemes. As explained in Sect. 3.1, our methods are closely inspired by [25], but in order to deal with payoff functions which are uniformly continuous, so that in applications we can include static hedging in options with different maturities, we had to devise an improved discretisation scheme.

We denote by \(\mathcal {D}([0,T],\mathbb {R}^{d+K})\) the set of all \(\mathbb {R}^{d+K}\)-valued measurable functions on \([0,T]\) and by \(\mathbb {D}([0,T],\mathbb {R}^{d+K})\) the subset of all right-continuous functions with left limits.

Step 1. Let \(\tau_{k}^{(N)}(S)\) and \(m^{(N)}(S)\) be defined as in Sect. 5.1.1. To simplify the notation, in this section, we often suppress their dependences on \(S\) and \(N\) and simply write

$$\begin{aligned} m=m^{(N)}(S), \qquad \tau_{k}=\tau_{k}^{(N)}(S). \end{aligned}$$

Our first “naive” approximation \(F^{(N)}:\varOmega\to \mathbb {D}([0,T],\mathbb {R}^{d+K})\) is defined as

$$ F^{(N)}_{t}(S)=\sum_{k=0}^{m-1}S_{\tau_{k}}\mathbf{1}_{[\tau_{k},\tau_{k+1})}(t)+S_{T}\mathbf{1}_{\{T\}}(t) \qquad\text{for } t\in[0,T],\ S\in\varOmega. $$

Note that \(F^{(N)}(\mathbb {S})\) is piecewise constant and \(\|F^{(N)}(\mathbb {S})-\mathbb {S}\| \le1/2^{N}\).
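Step 1 can be illustrated by a minimal Python sketch for a scalar path sampled on a grid. The name `lebesgue_discretise`, the scalar setting and the grid representation are assumptions made for illustration only; on a grid, a hitting time is taken to be the first grid index at which the path has moved by at least \(2^{-N}\) from its last recorded value.

```python
import math


def lebesgue_discretise(path, N):
    """Piecewise-constant Lebesgue discretisation of a sampled scalar path.

    Returns (hits, F): the grid indices playing the role of
    tau_0 < ... < tau_m, and the values of F^(N) on the same grid.
    """
    if len(path) < 2:
        raise ValueError("need at least two samples")
    step = 2.0 ** (-N)
    last = len(path) - 1
    hits = [0]                                # tau_0 = 0
    for i in range(1, last):
        if abs(path[i] - path[hits[-1]]) >= step:
            hits.append(i)
    hits.append(last)                         # tau_m = T
    F = []
    for a, b in zip(hits, hits[1:]):
        F.extend([path[a]] * (b - a))         # F = S_{tau_k} on [tau_k, tau_{k+1})
    F.append(path[last])                      # F_T = S_T
    return hits, F


# Example: a smooth oscillating path started at 1.
sample_path = [1 + 0.5 * math.sin(2 * math.pi * i / 1000) for i in range(1001)]
hits, F = lebesgue_discretise(sample_path, 3)
sup_error = max(abs(f - s) for f, s in zip(F, sample_path))
```

By construction, `sup_error` does not exceed \(2^{-N}\) at the grid points, mirroring the bound \(\|F^{(N)}(\mathbb {S})-\mathbb {S}\|\le1/2^{N}\).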

Step 2. Define a map

$$\begin{aligned} &\zeta ^{(N)}:\mathbb {R}^{d+K}_{+}\to A^{(N)}:=\{2^{-N}k : k=(k_{1},\ldots,k_{d+K})\in \mathbb {N}^{d+K}\},\\ &\zeta ^{(N)}(x)_{i}:=2^{-N}\lceil2^{N}x_{i}\rceil, \qquad i=1,\ldots,d+K. \end{aligned}$$

We then define our second approximation \(\check {F}^{(N)}: \varOmega\to \mathbb {D}([0,T],\mathbb {R}^{d+K})\) by

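$$\begin{aligned} \check {F}^{(N)}_{t}(S) &= \big(S_{0}-\zeta ^{(N+1)}(S_{\tau_{1}})\big)+\sum_{k=0}^{m-2}\zeta ^{(N+k+1)}(S_{\tau_{k+1}})\mathbf{1}_{[\tau_{k},\tau_{k+1})}(t) \\ &\phantom{=}{}+\zeta ^{(N+m)}(S_{\tau_{m}})\mathbf{1}_{[\tau_{m-1},T]}(t), \qquad t\in[0,T]. \end{aligned}$$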
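The rounding map of Step 2 is easy to sketch in Python. The function name `zeta` and the use of floating-point lists are illustrative assumptions; the sketch only implements the componentwise rule \(x_{i}\mapsto2^{-N}\lceil2^{N}x_{i}\rceil\).

```python
import math


def zeta(x, N):
    """Round each coordinate of x up to the grid 2**-N * {0, 1, 2, ...}."""
    scale = 2 ** N
    return [math.ceil(scale * xi) / scale for xi in x]


point = [0.3, 1.0]
rounded = zeta(point, 2)   # 0.3 is rounded up to 0.5; 1.0 is already on the grid
```

Each coordinate moves upwards by less than \(2^{-N}\), which is what makes the rounding error summable when the grid is refined from \(N+k\) to \(N+k+1\) at successive jump times.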

Step 3. We now construct the shifted jump times \(\hat {\tau }^{(N)}_{k}:\varOmega\to \mathbb {Q}_{+}\cup\{T\}\). Firstly, set \(\hat {\tau }^{(N)}_{0}=0\). Then for any \(S\in\varOmega\) and \(k=1,\ldots,m^{(N)}(S)\), define \(\Delta\tau ^{(N)}_{k}:=\tau^{(N)}_{k}-\tau^{(N)}_{k-1}\) and let \(\Delta \hat {\tau }^{(N)}_{k} = p_{k}/q_{k}\) with

$$\begin{aligned} (p_{k},q_{k}) &=\arg \!\min \bigg\{ p+q: (p,q)\in \mathbb {N}^{2},\\ &\phantom{=::\arg \!\min \bigg\{ p+q:} \tau^{(N)}_{k-1}-\hat {\tau }^{(N)}_{k-1}< \frac {p}{q}\le\Delta\tau^{(N)}_{k}+\tau^{(N)}_{k-1}-\hat {\tau }^{(N)}_{k-1}\bigg\} \end{aligned}$$

if \(k< m^{(N)}(S)\) and \(\Delta \hat {\tau }^{(N)}_{k}=T-\hat {\tau }^{(N)}_{m^{(N)}-1}\) otherwise. Finally, set \(\hat {\tau }^{(N)}_{k} \!:=\sum_{i=1}^{k}\Delta \hat {\tau }^{(N)}_{i}\). Here we also suppress the dependences of these shifted jump times on \(S\) and \(N\) and write \(\hat {\tau }_{k}=\hat {\tau }_{k}^{(N)}(S)\). Clearly \(0=\hat {\tau }_{0}<\hat {\tau }_{1}<\hat {\tau }_{2}<\cdots<\hat {\tau }_{m}=T\), \(\tau_{k-1}<\hat {\tau }_{k}\le \tau_{k}\) for all \(k< m\) and \(\hat {\tau }_{m}=\tau_{m}=T\). These \(\hat {\tau }\) are the shifted versions of the \(\tau \) and are uniquely defined for any \(S\). We are going to use the \(\hat {\tau }\) to define a class of approximating schemes.
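The construction of the shifted jump times can be sketched in Python as follows. The names `smallest_fraction` and `shifted_times` are hypothetical, \(\mathbb {N}\) is read as the positive integers, and ties in the minimisation of \(p+q\) are broken by the smallest \(p\); this tie-breaking rule is an assumption made here for definiteness.

```python
from fractions import Fraction


def smallest_fraction(lo, hi):
    """Return positive integers (p, q) minimising p + q with lo < p/q <= hi.

    Ties are broken by the smallest p (assumption for definiteness).
    """
    if not (hi > 0 and lo < hi):
        raise ValueError("need lo < hi and hi > 0")
    total = 2
    while True:
        for p in range(1, total):
            q = total - p
            if lo < Fraction(p, q) <= hi:
                return p, q
        total += 1


def shifted_times(taus):
    """Map jump times 0 = tau_0 < ... < tau_m = T to shifted times that are
    rational for k < m and equal to T for k = m."""
    m = len(taus) - 1
    hat = [Fraction(0)]
    for k in range(1, m):
        lo = taus[k - 1] - hat[-1]            # tau_{k-1} - hat_tau_{k-1}
        hi = taus[k] - hat[-1]                # Delta tau_k + tau_{k-1} - hat_tau_{k-1}
        p, q = smallest_fraction(lo, hi)
        hat.append(hat[-1] + Fraction(p, q))
    hat.append(Fraction(taus[-1]))            # hat_tau_m = T
    return hat


taus = [0.0, 0.3, 0.55, 1.0]
hat_taus = shifted_times(taus)
```

In this example the shifted times are \(0, 1/4, 1/2, 1\), and each of them satisfies \(\tau_{k-1}<\hat {\tau }_{k}\le\tau_{k}\) for \(k<m\), as claimed in the text.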

We can define an approximation \(\hat {F}^{(N)}:\varOmega\to \mathbb {D}([0,T],\mathbb {R}^{d+K})\) by

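$$\begin{aligned} \hat {F}^{(N)}_{t}(S) &= \big(S_{0}-\zeta ^{(N+1)}(S_{\tau_{1}})\big)+\sum_{k=0}^{m-2}\zeta ^{(N+k+1)}(S_{\tau_{k+1}})\mathbf{1}_{[\hat {\tau }_{k},\hat {\tau }_{k+1})}(t) \\ &\phantom{=}{}+\zeta ^{(N+m)}(S_{\tau_{m}})\mathbf{1}_{[\hat {\tau }_{m-1},T]}(t), \qquad t\in[0,T]. \end{aligned}$$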

Notice that \(\hat {F}^{(N)}(\mathbb {S})\) is piecewise constant and

$$\begin{aligned} \| \hat {F}^{(N)}(\mathbb {S})-\mathbb {S}\|&\le\| \hat {F}^{(N)}(\mathbb {S})- \check {F}^{(N)}(\mathbb {S})\|+\| \check {F}^{(N)}(\mathbb {S})-F^{(N)}(\mathbb {S})\|+\| F^{(N)}(\mathbb {S})-\mathbb {S}\| \\ &\le\frac{2}{2^{N-1}}+\frac{2}{2^{N}}+\frac{1}{2^{N}}< \frac {1}{2^{N-3}}. \end{aligned}$$
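The final strict inequality is elementary. Writing the three terms over the common denominator \(2^{N}\) gives

$$ \frac{2}{2^{N-1}}+\frac{2}{2^{N}}+\frac{1}{2^{N}}=\frac{4+2+1}{2^{N}}=\frac{7}{2^{N}}< \frac{8}{2^{N}}=\frac{1}{2^{N-3}}. $$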

Definition 5.4

Let \(\hat{\mathbb {D}}^{(N)}\subseteq \mathbb {D}([0,T],\mathbb {R}^{d+K})\) be the set of functions \(f=(f^{(i)})_{i=1}^{d+K}\) which satisfy the following:

  1. For any \(i=1,\ldots,d+K\), \(f^{(i)}(0)=1\);

  2. \(f\) is piecewise constant with jumps at times \(t_{1},\ldots,t_{\ell -1}\in \mathbb {Q}_{+}\) for some \(\ell<\infty\), where \(t_{0}=t_{\ell_{0}}=0< t_{1}< t_{2}<\cdots<t_{\ell-1}<T\);

  3. For any \(k=1,\ldots,\ell-1\) and \(i=1,\ldots,d+K\), \(f^{(i)}(t_{k})-f^{(i)}(t_{k-1})=j/2^{N+k}\) for \(j\in \mathbb {Z}\) with \(|j|\le2^{k}\);

  4. \(\inf_{t\in[0,T],\, 1\le i\le d+K}f^{(i)}(t)\ge-2^{-N+3}\);

  5. \(\|f^{(i)}\| \le\kappa+ 1\) for \(i = d+1, \ldots, d+K\), where \(\kappa= \max_{1\le j\le K}\frac{\|X^{(c)}_{j}\|_{\infty}}{ \mathcal {P}(X^{(c)}_{j})}\);

  6. If \(f^{(i)}(t_{k}) = -2^{-N+3}\) for some \(i\le d+K\) and \(k\le\ell -1\), then \(f(t_{j}) = f(t_{k})\) for all \(k< j<\ell\);

  7. If \(f^{(i)}(t_{k}) = \kappa+1\) for some \(i> d\) and \(k\le\ell-1\), then \(f(t_{j}) = f(t_{k})\) for all \(k< j<\ell\).

It is clear that \(\hat{\mathbb {D}}^{(N)}\) is countable.

5.1.3 A countable probabilistic structure

Let \(\hat{\varOmega}:=\mathbb {D}([0,T],\mathbb {R}^{d+K})\) and denote by \(\hat {\mathbb {S}}=(\hat {\mathbb {S}}_{t})_{0\le t \le T}\) the canonical process on the space \(\hat{\varOmega}\).

The set \(\hat {\mathbb {D}}^{(N)}\) is a countable subset of \(\hat{\varOmega}\). There exists a local martingale measure \(\hat {\mathbb {P}}^{(N)}\) on \(\hat{\varOmega}\) which satisfies \(\hat {\mathbb {P}}^{(N)}[\hat {\mathbb {D}}^{(N)}]=1\) and \(\hat {\mathbb {P}}^{(N)}[\{f\}]>0\) for all \(f\in \hat {\mathbb {D}}^{(N)}\). In fact, such a local martingale measure \(\hat {\mathbb {P}}^{(N)}\) on \(\hat {\mathbb {D}}^{(N)}\) can be constructed “by hand” as a continuous-time Markov chain with jump times decided independently of the jump positions. Let \(\hat{\mathbb {F}}^{(N)}:=(\hat{\mathcal {F}}^{(N)}_{t})_{0\le t\le T}\) be the filtration generated by the process \(\hat {\mathbb {S}}\) and satisfying the usual assumptions (right-continuous and \(\hat {\mathbb {P}}^{(N)}\)-complete).

In the last section, we saw definitions of \(\hat{\tau}^{(N)}_{k}\) on \(\varOmega\). Here we extend their definitions to \(\bigcup_{N\in \mathbb {N}}\hat {\mathbb {D}}^{(N)}\). Define the jump times by setting \(\hat {\tau }_{0}(\hat {\mathbb {S}})=0\) and for \(k>0\),

$$ \hat {\tau }_{k}(\hat {\mathbb {S}})=\inf\{t>\hat {\tau }_{k-1}(\hat {\mathbb {S}}):\hat {\mathbb {S}}_{t}\neq \hat {\mathbb {S}}_{t-} \}\wedge T. $$

Next we introduce the (random) index of the first jump time that reaches \(T\),

$$\begin{aligned} m(\hat {\mathbb {S}}):=\min\{k:\hat {\tau }_{k}(\hat {\mathbb {S}})= T\}. \end{aligned}$$

Observe that for \(S\in\varOmega\), we have \(\hat {F}^{(N)}(S)\in \hat {\mathbb {D}}^{(N)}\), \(\hat {\tau }_{k}(\hat {F}^{(N)}(S))=\hat {\tau }_{k}(S)\) for all \(k\) and \(m(\hat {F}^{(N)}(S))=m^{(N)}(S)\). It follows that the definitions are consistent. In this context, a trading strategy \((\hat {\gamma }_{t})_{t=0}^{T}\) on the filtered probability space \((\hat{\varOmega},\hat{\mathbb {F}}^{(N)},\hat {\mathbb {P}}^{(N)})\) is a predictable stochastic process. So \(\hat {\gamma }\) is a map from \(\mathbb {D}([0,T],\mathbb {R}^{d+K})\) to \(\mathcal {D}([0,T],\mathbb {R}^{d+K})\). Now choose \(a\in \mathcal {D}([0,T],\mathbb {R}^{d+K})\) such that \(a\notin \hat {\gamma }(\hat {\mathbb {D}}^{(N)})\) and then define a mapping \(\phi: \mathbb {D}([0,T],\mathbb {R}^{d+K})\to \mathcal {D}([0,T],\mathbb {R}^{d+K})\) by \(\phi(\hat{S})=\hat {\gamma }(\hat{S})\) if \(\hat{S}\in \hat {\mathbb {D}}^{(N)}\), and equal to \(a\) otherwise. Since \(\hat {\mathbb {P}}^{(N)}\) has full support on \(\hat {\mathbb {D}}^{(N)}\), we get \(\hat {\gamma }=\phi (\hat {\mathbb {S}})\) \(\hat {\mathbb {P}}^{(N)}\)-a.s. In particular, for any \(A\) that is a Borel-measurable subset of \(\mathbb {R}^{d+K}\), the symmetric difference of \(\{\hat {\gamma }_{t}\in A\}\) and \(\{\phi(\hat {\mathbb {S}})_{t}\in A\}\) is a nullset for \(\hat {\mathbb {P}}^{(N)}\). Thus \(\phi\) is a predictable map. Furthermore, since \(\hat {\mathbb {P}}^{(N)}\) charges all elements in \(\hat {\mathbb {D}}^{(N)}\), for any \(\upsilon, \tilde{\upsilon}\in \hat {\mathbb {D}}^{(N)}\) and \(t\in[0,T]\),

$$ \upsilon_{u}=\tilde{\upsilon}_{u}, \forall u\in[0,t) \quad\Longrightarrow \quad\phi(\upsilon)_{t}=\phi(\tilde{\upsilon})_{t}. $$

In the sequel, we always consider the above version \(\phi(\hat {\mathbb {S}})\) of a predictable process \(\hat {\gamma }\).

We now formally define the probabilistic superreplication problem and later build a connection between the probabilistic superreplication problem on the discretised space and the pathwise discretised robust hedging problem. For the rest of the section, we write \(\int _{t_{1}}^{t_{2}}\) to mean \(\int_{(t_{1},t_{2}]}\).

As \(G\) is defined only on \(\varOmega\), to consider paths in \(\hat{\varOmega}\), we need to extend the domain of \(G\) to \(\hat{\varOmega}\). For most financial contracts, the extension is natural. However, we pursue a general approach here. We first define a projection by

where \(\omega^{1}\) is the constant path equal to 1. Put differently, when \(\hat{S}\in\bigcup_{N\in \mathbb {N}}\hat {\mathbb {D}}^{(N)}\), is the linear interpolation of the points

$$\Big(\big(\hat {\tau }_{0}(\hat{S}),\hat{S}_{\hat {\tau }_{0}(\hat{S})}\big),\ldots, \big(\hat {\tau }_{m(\hat{S})}(\hat{S}),\hat{S}_{\hat {\tau }_{m(\hat{S})}(\hat{S})}\big)\Big). $$

We can then define \(\hat {G}:\hat{\varOmega}\to\mathbb {R}\) using the projection by , where \(\hat{S}\vee0:= ((\hat{S}^{(1)}_{t}\vee0, \ldots, \hat{S}^{(d+K)}_{t}\vee 0))_{0\le t\le T}\) for any \(\hat{S}\in\hat{\varOmega}\). Note that \(G\) and \(\hat {G}\) are equal on \(\varOmega\). In addition, for every \(N\in \mathbb {N}\) and \(\hat{S}\in \hat {\mathbb {D}}^{(N)}\), we have


Therefore, we can deduce that


where the last inequality follows from (5.4) and (5.7). Similarly, for each \(D\in \mathbb {N}\), we define \(\hat{m}^{(D)}:\hat{\varOmega}\to \mathbb {N}\) by . Then by Remark 4.2 and (5.8), when \(N\) is sufficiently large,

$$ \hat{m}^{(D-2)}\big(\hat{F}^{(N)}(S)\big) \le m^{(D)}(S), \qquad \forall\,S\in\varOmega. $$

Definition 5.5

\(\hat {\gamma }:\hat{\varOmega}\to \mathcal {D}([0,T],\mathbb {R}^{d+K})\) is \(\hat {\mathbb {P}}^{(N)}\)-admissible if \(\hat {\gamma }\) is predictable and bounded by \(N\), and the stochastic integral \((\int_{0}^{t}\hat {\gamma }_{u}(\hat {\mathbb {S}})\,\mathrm {d}\hat {\mathbb {S}}_{u})_{0\le t\le T}\), which is well defined under \(\hat {\mathbb {P}}^{(N)}\), satisfies that there is some \(M >0\) such that

$$ \int_{0}^{t}\hat {\gamma }_{u}(\hat {\mathbb {S}})\,\mathrm {d}\hat {\mathbb {S}}_{u}\ge-M \qquad \hat {\mathbb {P}}^{(N)}\text{-a.s., } t \in[0,T). $$

An admissible strategy \(\hat {\gamma }\) is said to \(\hat {\mathbb {P}}^{(N)}\)-superreplicate \(\hat {G}\) if

$$ \int_{0}^{T}\hat {\gamma }_{u}(\hat {\mathbb {S}})\,\mathrm {d}\hat {\mathbb {S}}_{u}\ge \hat {G}(\hat {\mathbb {S}}) \qquad \hat {\mathbb {P}}^{(N)}\text{-a.s.} $$

The superreplication cost of \(\hat {G}\) is defined as

$$\begin{aligned} \hat {\mathbb {V}}^{(N)}:=\inf\{x : \exists \hat {\gamma }\text{ which is $\hat {\mathbb {P}}^{(N)}$-admissible and $\hat {\mathbb {P}}^{(N)}$-superreplicates $\hat {G}-x$}\}. \end{aligned}$$

Similarly to [25], we can now connect the probabilistic superhedging problem and the discretised robust hedging problem.

Definition 5.6

Given a predictable stochastic process \((\hat {\gamma }_{t})_{0\leq t \leq T}\) on \((\hat{\varOmega},\hat{\mathbb {F}}^{(N)},\hat {\mathbb {P}}^{(N)})\), we define \(\gamma ^{(N)}:\varOmega\to \mathbb {D}([0,T],\mathbb {R}^{d+K})\) by

$$ \gamma ^{(N)}_{t}(S):=\sum_{k=0}^{m-1}\hat {\gamma }_{\hat {\tau }_{k}}\big(\hat {F}^{(N)}(S)\big)\mathbf{1}_{(\tau_{k},\tau_{k+1}]}(t), $$

where \(\tau_{k}=\tau_{k}^{(N)}(S)\), \(m=m^{(N)}(S)\) are given in Definition 4.1, \(\hat {F}^{(N)}\) in (5.3), \(\hat {\tau }_{k}=\hat {\tau }_{k}( \hat {F}^{(N)}(S))\) in (5.5), and we recall that \(m^{(N)}(S)=m(\hat {F}^{(N)}(S))\).
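The lifting in Definition 5.6 can be sketched in Python for a single asset. The names `lifted_position` and `lifted_gain` are hypothetical, and the list `positions` stands for the values \(\hat {\gamma }_{\hat {\tau }_{k}}(\hat {F}^{(N)}(S))\), \(k=0,\ldots,m-1\), which are taken as given here rather than computed from a strategy on \(\hat{\varOmega}\).

```python
from bisect import bisect_left


def lifted_position(t, taus, positions):
    """Value of the lifted strategy at time t: positions[k] on (tau_k, tau_{k+1}]."""
    k = bisect_left(taus, t) - 1        # index k with taus[k] < t <= taus[k + 1]
    if k < 0 or k >= len(positions):
        return 0.0                      # no position at t = 0 or outside (0, T]
    return positions[k]


def lifted_gain(values_at_taus, positions):
    """Trading gain of the lifted strategy up to T:
    sum over k of positions[k] * (S_{tau_{k+1}} - S_{tau_k})."""
    return sum(
        g * (b - a)
        for g, a, b in zip(positions, values_at_taus, values_at_taus[1:])
    )


taus = [0.0, 0.25, 0.6, 1.0]            # tau_0, ..., tau_m with m = 3
positions = [1.0, -2.0, 0.5]            # illustrative holdings
values_at_taus = [1.0, 1.125, 1.0, 1.125]
gain = lifted_gain(values_at_taus, positions)
```

Because the position on \((\tau_{k},\tau_{k+1}]\) is fixed using information available at \(\tau_{k}\), the gain reduces to a finite sum over the Lebesgue partition; this is the form used in the estimates of the proof of Proposition 5.8.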

Lemma 5.7

For any admissible strategy \(\hat {\gamma }\) in the sense of Definition 5.5, \(\gamma ^{(N)}\) defined in (5.10) is \(\mathbb {F}\)-predictable.


We first show that if we equip \(\varOmega\) and \(\mathbb {D}([0,T],\mathbb {R}^{d+K})\) with the respective \(\sigma\)-algebras \(\mathcal {F}_{T}\) and \(\hat{\mathcal {F}}^{(N)}_{T}\), the function \(\hat {F}^{(N)}:\varOmega\to \mathbb {D}([0,T],\mathbb {R}^{d+K})\) is measurable. Since \(\hat {\mathbb {P}}^{(N)}\) has full support on \(\hat {\mathbb {D}}^{(N)}\), for any \(A\in\hat{\mathcal {F}}^{(N)}_{T}\),

$$\{ \hat {F}^{(N)}\in A\} = \bigcup_{\hat{S}\in \hat {\mathbb {D}}^{(N)}\cap A}\{S\in\varOmega: \hat {F}^{(N)}(S) = \hat{S}\}. $$

It is clear from the construction that \(\hat {\tau }^{(N)}_{k}\), \(m^{(N)}\) and \(\zeta ^{(N+k)}\) are all \(\mathcal {F}_{T}\)-measurable, and we note that for any \(\hat{S}\in \hat {\mathbb {D}}^{(N)}\),

$$\{S\in\varOmega: \hat {F}^{(N)}(S) = \hat{S}\} = \{S\in\varOmega: m^{(N)} = m, \hat {\tau }^{(N)}_{k} = t_{k}, \zeta ^{(N+k)} = s_{k}, \forall k< m\} $$

for some \(m, t_{k}, s_{k}\). Therefore, we can conclude that \(\hat {F}^{(N)}\) has the desired measurability.

To prove that \(\gamma ^{(N)}\) is \(\mathbb {F}\)-predictable, we need to show that \(\hat {\gamma }_{\hat {\tau }_{k}}\circ \hat {F}^{(N)}\) is \(\mathcal {F}_{\tau_{k}}\)-measurable. Galmarino’s test, see Dellacherie and Meyer [22, Theorem IV.100], states that given any \(\mathcal {F}_{T}\)-measurable random variable \(\phi:\varOmega\to \mathbb {R}^{d+K}\) and any \(\mathbb {F}\)-stopping time \(\tau\), \(\phi\) is \(\mathcal {F}_{\tau }\)-measurable if and only if

$$ \forall\upsilon,\omega\in\varOmega: \qquad \upsilon_{u}=\omega_{u}, \forall u\in[0,\tau(\upsilon)] \quad\Longrightarrow\quad\phi(\upsilon )=\phi(\omega). $$

It follows from the definition of \(\hat {F}^{(N)}\) that for such \(\upsilon, \omega \) and \(\tau=\tau_{k}\), we have \(\hat {F}^{(N)}_{u}(\omega)= \hat {F}^{(N)}_{u}(\upsilon), \forall u\in[0,\hat {\tau }_{k})\). Hence by (5.6), \(\hat {\gamma }_{\hat {\tau }_{k}}( \hat {F}^{(N)}(\omega))=\hat {\gamma }_{\hat {\tau }_{k}}( \hat {F}^{(N)}(\upsilon))\). Therefore Galmarino’s test implies that \(\gamma ^{(N)}\) defined in (5.10) is \(\mathbb {F}\)-predictable. □

The following result is crucial. It states that the probabilistic superreplication value is asymptotically larger than the value of the discretised robust hedging problem. Recall that \(\lambda_{ \mathcal {I}}(\omega ):=\inf_{\upsilon\in \mathcal {I}}\|\omega-\upsilon\|\wedge1\).

Proposition 5.8

For uniformly continuous and bounded \(G\), \(\alpha, \beta\ge0\) and \(D\in \mathbb {N}\), we have

$$\begin{aligned} &\liminf_{N\to\infty} \mathbf {V}_{\mathcal {I}}^{(N)}\big(G(\mathbb {S}) - \beta\sqrt{m^{(D)}(\mathbb {S})}\wedge\alpha\big) \\ &\quad \le\liminf_{N \to\infty} \hat {\mathbb {V}}^{(N)}\big(\hat {G}(\hat {\mathbb {S}}) - \beta\sqrt{\hat {m}^{(D-2)}(\hat {\mathbb {S}})}\wedge\alpha- N\lambda_{ \mathcal {I}}(\hat {\mathbb {S}})\big). \end{aligned}$$


Fix \(N\ge6\). Let \(f_{e}:\mathbb {R}_{+}\to \mathbb {R}_{+}\) be a modulus of continuity for \(G\) so that \(\lim_{x\to0}f_{e}(x)=0\). Define \(G^{(N)}:\varOmega\to \mathbb {R}\) as

$$ G^{(N)}(S):=\hat {G}(S)-f_{e}(2^{-N+4})-\frac{14(d+K)N}{2^{N}}. $$

Note that

$$\begin{aligned} \mathbf {V}_{\mathcal {I}}^{(N)}(G-\beta\sqrt{m^{(D)}}\wedge\alpha) =&\mathbf {V}_{\mathcal {I}}^{(N)}(G^{(N)}- \beta\sqrt{m^{(D)}}\wedge\alpha)\\ &{}+f_{e}(2^{-N+4})+\frac{14(d+K)N}{2^{N}}. \end{aligned}$$

Hence, to show (5.11), it suffices to show that

$$ \mathbf {V}_{\mathcal {I}}^{(N)}(G^{(N)}-\beta\sqrt{m^{(D)}}\wedge\alpha)\le \hat {\mathbb {V}}^{(N)}(\hat {G}-\beta\sqrt{\hat{m}^{(D-2)}}\wedge\alpha- N\lambda_{ \mathcal {I}}). $$

The rest of the proof establishes (5.12). Given a probabilistic semi-static portfolio \(\hat {\gamma }\) which superreplicates \(\hat {G}-\beta\sqrt{\hat{m}^{(D-2)}}\wedge\alpha-N\lambda_{ \mathcal {I}}-x\), we argue that the lifted trading strategy \(\gamma ^{(N)}\) superreplicates \(G^{(N)}-\beta\sqrt {m^{(D)}}\wedge\alpha-x\) on ℐ. To simplify notation, throughout the rest of the proof, we fix \(S\in \mathcal {I}\) and write \(\hat {F}:=\hat {F}^{(N)}(S)\).

Superreplication. We first notice that for any \(j< m-1\),

$$\begin{aligned} |(S_{\tau_{j+1}}-S_{\tau_{j}})-(\hat {F}_{\hat {\tau }_{j}}-\hat {F}_{\hat {\tau }_{j-1}})| &\le|S_{\tau_{j+1}}-\hat {F}_{\hat {\tau }_{j}}|+|S_{\tau_{j}}-\hat {F}_{\hat {\tau }_{j-1}}|\\ &\le\frac{1}{2^{N+j+1}}+\frac{1}{2^{N+j}}=\frac{3}{2^{N+j+1}}. \end{aligned}$$

It follows that for any \(k< m\),

$$\begin{aligned} &\bigg|\int_{0}^{\tau_{k}}\gamma _{u}^{(N)}(S)\,\mathrm {d}S_{u}-\int_{0}^{\hat {\tau }_{k}} \hat {\gamma }_{u}(\hat {F})\,\mathrm {d}\hat {F}_{u}\bigg| \\ &\quad\le \bigg|\sum_{j=0}^{k-1}\hat {\gamma }_{\hat {\tau }_{j}}(\hat {F}) (S_{\tau_{j+1}}-S_{\tau_{j}})-\sum_{j=0}^{k-1}\hat {\gamma }_{\hat {\tau }_{j+1}}(\hat {F}) (\hat {F}_{\hat {\tau }_{j+1}}-\hat {F}_{\hat {\tau }_{j}})\bigg| \\ &\quad\le \sum_{j=0}^{k-2} \big|\hat {\gamma }_{\hat {\tau }_{j+1}}(\hat {F}) \big((S_{\tau _{j+2}}-S_{\tau_{j+1}})- (\hat {F}_{\hat {\tau }_{j+1}}-\hat {F}_{\hat {\tau }_{j}})\big)\big| + \frac{2(d+K)N}{2^{N-1}} \\ &\quad\le\sum_{j=0}^{\infty}\frac{N(d+K)}{2^{N+j+2}} + \frac {2(d+K)N}{2^{N-1}}\le\frac{5(d+K)N}{2^{N}}. \end{aligned}$$
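The last inequality in this chain is a geometric-series computation, which can be checked as follows:

$$ \sum_{j=0}^{\infty}\frac{N(d+K)}{2^{N+j+2}}=\frac{N(d+K)}{2^{N+1}}, \qquad \frac{N(d+K)}{2^{N+1}}+\frac{2(d+K)N}{2^{N-1}}=\frac{9}{2}\cdot\frac{(d+K)N}{2^{N}}\le\frac{5(d+K)N}{2^{N}}. $$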

In addition,

$$\begin{aligned} &\bigg|\int_{\tau_{m-1}}^{T}\gamma _{u}^{(N)}(S)\,\mathrm {d}S_{u}-\int_{\hat {\tau }_{m-1}}^{T} \hat {\gamma }_{u}(\hat {F})\,\mathrm {d}\hat {F}_{u}\bigg| \\ &\quad= |\hat {\gamma }_{\hat {\tau }_{m-1}}(\hat {F}) (S_{T}-S_{\tau_{m-1}})-\hat {\gamma }_{\hat {\tau }_{m}}(\hat {F}) (\hat {F}_{\hat {\tau }_{m}}-\hat {F}_{\hat {\tau }_{m-1}})|\le\frac{N(d+K)}{2^{N}}. \end{aligned}$$


$$\begin{aligned} x+\int_{0}^{T}\gamma _{u}^{(N)}(S)\,\mathrm {d}S_{u} &\ge x+\int_{0}^{T} \hat {\gamma }_{u}(\hat {F})\,\mathrm {d}\hat {F}_{u}-\frac{5(d+K)N}{2^{N}}-\frac {(d+K)N}{2^{N}} \\ &\ge \hat {G}(\hat {F})- \beta\sqrt{\hat{m}^{(D-2)}(\hat {F})}\wedge\alpha-N\lambda _{ \mathcal {I}}(\hat {F})-\frac{6(d+K)N}{2^{N}} \\ &\ge \hat {G}(\hat {F})-\beta\sqrt{\hat{m}^{(D-2)}(\hat {F})}\wedge\alpha -N/2^{N-3}-\frac{6(d+K)N}{2^{N}} \\ &\ge G(S)-\beta\sqrt{m^{(D)}(S)}\wedge\alpha-f_{e}(2^{-N+4}) -\frac {14(d+K)N}{2^{N}}\\ &=G^{(N)}(S), \end{aligned}$$

where the second inequality follows from the superreplicating property of \(\hat {\gamma }\) and the fact that \(\hat {\mathbb {P}}^{(N)}[\{f\}]>0\), \(\forall f\in \hat {\mathbb {D}}^{(N)}\), the third inequality is justified by (5.4), and the last inequality is due to (5.8) and (5.9).

Admissibility. Now, for a given \(t< T\), let \(k< m\) be the largest integer so that \(\tau _{k}(S)\le t\). It follows from (5.13) and (5.14) that

$$\begin{aligned} \int_{0}^{t}\gamma ^{(N)}_{u}(S)\,\mathrm {d}S_{u} &=\int_{0}^{\tau_{k}}\gamma _{u}^{(N)}(S)\,\mathrm {d}S_{u}+\int_{\tau_{k}}^{t}\gamma _{u}^{(N)}(S)\,\mathrm {d}S_{u} \\ &\ge \int_{0}^{\hat {\tau }_{k}} \hat {\gamma }_{u}(\hat {F})\,\mathrm {d}\hat {F}_{u} - \frac{5(d+K)N}{2^{N}} - N(d+K)\max_{i}|S^{(i)}_{t}-S^{(i)}_{\tau_{k}}|\\ &\ge -M-\frac{6(d+K)N}{2^{N}}, \end{aligned}$$

where the last inequality follows from the admissibility of \(\hat {\gamma }\) and again the fact that \(\hat {\mathbb {P}}^{(N)}[\{f\}]>0, \forall f\in \hat {\mathbb {D}}^{(N)}\). Hence \(\gamma ^{(N)}\) is admissible. □

5.2 Duality for the discretised problems

Definition 5.9

Let \(\hat {\Pi }^{(N)}\) be the set of all probability measures \(\hat{\mathbb {Q}}\) which are equivalent to \(\hat {\mathbb {P}}^{(N)}\). For any \(\kappa\ge0\), denote by \(\hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(\kappa)\) the set of all probability measures \(\hat{\mathbb {Q}}\in \hat {\Pi }^{(N)}\) such that

$$\hat{\mathbb {Q}}\Big[\Big\{ \omega\in\hat{\varOmega} : \inf_{\upsilon\in \mathcal {I}}\| \hat{\mathbb {S}}(\omega)-\upsilon\|\ge1/N\Big\} \Big]\le\frac{\kappa}{N} $$

and


$$\mathbb {E}_{\hat{\mathbb {Q}}}\bigg[\sum_{k=1}^{m(\hat {\mathbb {S}})}\sum_{i=1}^{d+K}\big|\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k}}|\hat{\mathcal {F}}^{(N)}_{\hat {\tau }_{k}-}] -\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k-1}}\big|\bigg]\le\frac{\kappa}{N}, $$

where \(\hat {\tau }_{k}=\hat {\tau }_{k}(\hat {\mathbb {S}})\) and \(m=m(\hat {\mathbb {S}})\) are defined in (5.5).

Lemma 5.10

Let \(\kappa>1\) and suppose \(\hat {G}\) is bounded by \(\kappa-1\) and \(\mathcal {M}_{ \mathcal {I}}\neq\emptyset\). Then there are at most finitely many \(N\in \mathbb {N}\) such that \(\hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa)= \emptyset\), and we have

$$ \liminf_{N \to\infty} \hat {\mathbb {V}}^{(N)}\big(\hat {G}(\hat {\mathbb {S}})-N\lambda_{ \mathcal {I}}(\hat {\mathbb {S}})\big)\le \liminf_{N \to\infty}\sup_{\hat{\mathbb {Q}}\in \hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa)}\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})]. $$


For any \(\hat{\mathbb {Q}}\in \hat {\Pi }^{(N)}\), the support of \(\hat{\mathbb {Q}}\) is \(\hat {\mathbb {D}}^{(N)}\) whose elements are piecewise constant. Therefore, the canonical process \(\hat {\mathbb {S}}\) is a semimartingale under \(\hat{\mathbb {Q}}\). Moreover, it has the decomposition \(\hat {\mathbb {S}}=\hat{M}^{\hat{\mathbb {Q}}}+\hat{A}^{\hat{\mathbb {Q}}}\), where

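$$\begin{aligned} \hat{A}^{\hat{\mathbb {Q}}}_{t} &=\sum_{k=1}^{m(\hat {\mathbb {S}})}\big(\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {\mathbb {S}}_{\hat {\tau }_{k}}|\hat{\mathcal {F}}^{(N)}_{\hat {\tau }_{k}-}]-\hat {\mathbb {S}}_{\hat {\tau }_{k-1}}\big)\mathbf{1}_{[\hat {\tau }_{k},\hat {\tau }_{k+1})}(t), \qquad t< T,\\ \hat{A}^{\hat{\mathbb {Q}}}_{T} &:=\lim_{t\uparrow T}\hat{A}^{\hat{\mathbb {Q}}}_{t}, \end{aligned}$$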

is a predictable process of bounded variation and \(\hat{M}^{\hat{\mathbb {Q}}}\) is a martingale under \(\hat{\mathbb {Q}}\). Then similarly to Dolinsky and Soner [26], it follows from Example 2.3 and Proposition 4.1 in Föllmer and Kramkov [29] that

$$\begin{aligned} & \hat {\mathbb {V}}^{(N)}\big(\hat {G}(\hat {\mathbb {S}})-N\lambda_{ \mathcal {I}}(\hat {\mathbb {S}})\big) \\ &\quad=\sup_{\hat{\mathbb {Q}}\in \hat {\Pi }^{(N)}}\mathbb {E}_{\hat{\mathbb {Q}}}\bigg[\hat{G}(\hat {\mathbb {S}})- N\lambda_{\mathcal {I}}(\hat {\mathbb {S}})-N\sum_{k=1}^{m(\hat {\mathbb {S}})}\sum_{i=1}^{d+K}\big|\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k}}|\hat{\mathcal {F}}^{(N)}_{\hat {\tau }_{k}-}] -\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k-1}}\big| \bigg]. \end{aligned}$$

By Proposition 5.8,

$$\begin{aligned} \liminf_{N \to\infty} \hat {\mathbb {V}}^{(N)}\big(\hat {G}(\hat {\mathbb {S}})-N\lambda_{ \mathcal {I}}(\hat {\mathbb {S}})\big) \ge \liminf_{N\to\infty} \mathbf {V}_{\mathcal {I}}^{(N)}(G)\ge \mathbf {P}_{\mathcal {I}}(G)> -\kappa. \end{aligned}$$

Then, in (5.15), it suffices to consider the supremum over \(\hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa)\). In particular, \(\hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa )\neq\emptyset\) for \(N\) large enough. □

5.3 Discretisation of the primal

Next, we show that we can lift any measure in \(\hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(c)\) to a continuous martingale measure in \(\mathcal {M}_{ \mathcal {I}}\) such that the difference of the expected value of \(G\) under this continuous martingale measure and the expected value of \(\hat {G}\) under the original measure is within a bounded error, which goes to zero as \(N\to\infty\). Through this, we asymptotically connect the primal problems on the discretised space to the approximation of the primal problems on the space of continuous functions.

Proposition 5.11

Under the assumptions of Theorem 5.1, if \(G\) and all \(X^{(c)}_{i}/\mathcal {P}(X^{(c)}_{i})\) are bounded by \(\kappa-1\) for some \(\kappa \ge1\), then for any \(\alpha, \beta\ge0\), \(D\in \mathbb {N}\),

$$\begin{aligned} &\limsup_{N \to\infty}\sup_{\hat{\mathbb {Q}}\in \hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa+2\alpha)}\mathbb {E}_{\hat {\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})-\beta\sqrt{\hat{m}^{(D)}(\hat {\mathbb {S}})}\wedge\alpha] \\ &\quad\le\sup_{\mathbb {P}\in \mathcal {M}_{ \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})-\beta\sqrt{m^{(D-2)}(\mathbb {S})}\wedge\alpha]. \end{aligned}$$


Let \(f_{e}:\mathbb {R}^{d+K}_{+}\to \mathbb {R}_{+}\) be a modulus of continuity of \(G\), i.e.,

$$|G(\omega)-G(\upsilon)|\le f_{e}(|\omega-\upsilon|) \qquad \text{for any }\omega, \upsilon\in\varOmega $$

and \(\lim_{x\searrow0}f_{e}(x)=0\). Recall from Lemma 5.10 that \(\hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa+2\alpha)\neq\emptyset\) for \(N\) large enough. Hence to show (5.16), it suffices to prove that for any \(\hat{\mathbb {Q}}\in \hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa+2\alpha)\),

$$\begin{aligned} \mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})-\beta\sqrt{\hat{m}^{(D)}(\hat {\mathbb {S}})}\wedge\alpha] \le \sup_{\mathbb {P}\in \underline {\mathcal {M}}_{ \mathcal {I}}}\mathbb {E}_{\mathbb {P}}[G(\mathbb {S})-\beta\sqrt{m^{(D-2)}(\mathbb {S})}\wedge\alpha] + g(1/N), \end{aligned}$$

for some \(g:\mathbb {R}_{+}\!\to \mathbb {R}_{+}\) such that \(\lim_{x\searrow0}g(x) = 0\). We fix \(N\) and \(\hat{\mathbb {Q}}\in \hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa+2\alpha)\) and prove the above inequality in four steps.

Step 1. We first construct a semimartingale \(\hat{Z}=\hat {M}+\hat{A}\) on a Wiener space \((\varOmega^{W},\mathcal {F}^{W},P^{W})\) such that

$$\begin{aligned} |\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})] - E^{W}[\hat {G}(\hat{Z})]|\le\kappa2^{-N+1} \end{aligned}$$

and


$$ P^{W}\Big[\Big\{ \omega\in\varOmega^{W} : \inf_{\upsilon\in \mathcal {I}}\|\hat {M}(\omega)+\hat{A}(\omega)-\upsilon\|\ge1/N\Big\} \Big]\le\frac {2\kappa+2\alpha}{N}+2^{-N}, $$

where \(\hat{M}\) is constructed from a martingale, and both \(\hat{M}\) and \(\hat{A}\) have piecewise constant paths.

Since the measure \(\hat{\mathbb {Q}}\) is supported on \(\hat{\mathbb {D}}^{(N)}\), the canonical process \(\hat {\mathbb {S}}\) is a pure jump process under \(\hat{\mathbb {Q}}\), with a finite number of jumps \(\hat{\mathbb {Q}}\)-a.s. Consequently, there exists a deterministic positive integer \(m_{0}\) (depending on \(N\)) such that

$$ \hat{\mathbb {Q}}[m(\hat {\mathbb {S}})>m_{0}]< 2^{-N}. $$

It follows that

$$ |\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})]-\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}}^{\hat {\tau }_{m_{0}}})]|\le\kappa 2^{-N+1}. $$

Notice that by the definition of \(\hat {\mathbb {D}}^{(N)}\), the law of \(\hat {\mathbb {S}}^{\hat {\tau }_{m_{0}}}\) under \(\hat{\mathbb {Q}}\) is also supported on \(\hat {\mathbb {D}}^{(N)}\).

Let \((\varOmega^{W},\mathcal {F}^{W},P^{W})\) be a complete probability space together with a standard \((m_{0}+2)\)-dimensional Brownian motion \(\{ W_{t}=(W_{t}^{(1)},\ldots, W_{t}^{(m_{0}+2)})\}_{t \geq0}\) and let \((\mathcal {F}^{W}_{t})_{t\geq0}\) be the \(P^{W}\)-completion of the natural filtration of \(W\). With a small modification of Lemma 5.1 in Dolinsky and Soner [25], we can construct a sequence of stopping times (with respect to the Brownian filtration) \(\sigma_{1}<\sigma_{2}<\cdots< \sigma_{m_{0}}\) together with \(\mathcal {F}^{W}_{\sigma_{i}}\)-measurable random variables \(Y_{i}\), \(i=1,\ldots,m_{0}\) such that

$$\begin{aligned} &\mathcal {L}_{P^{W}}\big((\sigma_{1},\ldots,\sigma_{m_{0}}, Y_{1},\ldots,Y_{m_{0}})\big)\\ &\quad=\mathcal {L}_{\hat{\mathbb {Q}}}\big((\hat {\tau }_{1}, \ldots, \hat {\tau }_{m_{0}}, \hat {\mathbb {S}}_{\hat {\tau }_{1}}-\hat {\mathbb {S}}_{\hat {\tau }_{0}},\ldots, \hat {\mathbb {S}}_{\hat {\tau }_{m_{0}}}-\hat {\mathbb {S}}_{\hat {\tau }_{m_{0}-1}})\big); \end{aligned}$$

see Sect. A.3 for details. Define \(X_{i}\) as

$$\begin{aligned} X_{i}=E^{W}[Y_{i}|\mathcal {F}^{W}_{\sigma_{i-1}}\vee\sigma(\sigma_{i})], \qquad i=1,\ldots,m_{0}. \end{aligned}$$

Note that by the definition of \(\hat {\mathbb {D}}^{(N)}\), we have \(|Y_{i}|\le2^{-N}\) and hence also \(|X_{i}|\le2^{-N}\). Also by the construction of the \(\sigma_{i}\) and \(Y_{i}\), we have

$$ E^{W}[Y_{i}|\mathcal {F}^{W}_{\sigma_{i-1}}\vee\sigma(\sigma_{i})] = E^{W}[Y_{i}| \pmb {\sigma}_{i},\, \pmb {Y}_{i-1}], $$

where \(\pmb {\sigma}_{i}:=(\sigma_{1},\ldots,\sigma_{i})\), \(\pmb {Y}_{i}:=(Y_{1},\ldots,Y_{i})\) and \(E^{W}\) is the expectation with respect to \(P^{W}\). From these, we can construct a jump process \((\hat{A}_{t})_{0\leq t\leq T}\) by

$$ \hat{A}_{t}=\sum_{j=1}^{m_{0}}X_{j}\mathbf{1}_{[\sigma_{j},T]}(t). $$

In particular, for \(k\le m_{0}\), \(\hat{A}_{\sigma_{k}} = \sum_{j=1}^{k}X_{j}\). Define a martingale \((M_{t})_{0\leq t\leq T}\) via

$$\begin{aligned} M_{t}= 1+E^{W}\bigg[\sum_{j=1}^{m_{0}}(Y_{j}-X_{j})\bigg|\mathcal {F}_{t}^{W}\bigg], \qquad t\in[0,T]. \end{aligned}$$

Since all Brownian martingales are continuous, so is \(M\). Moreover, Brownian motion increments are independent and therefore

$$ M_{\sigma_{k}}=1+\sum_{j=1}^{k}(Y_{j}-X_{j}) \qquad P^{W} \text{-a.s., $k\le m_{0}$.} $$
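
The split of the jumps \(Y_{j}\) into a predictable part \(X_{j}\) and a martingale part \(Y_{j}-X_{j}\) can be illustrated on a toy example. The sketch below makes strong simplifying assumptions: there are only two jumps, the jump times and the Brownian embedding are ignored, and the joint law of \((Y_{1},Y_{2})\) is a made-up four-point distribution. All names are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical joint law of two jumps (Y1, Y2); the probabilities sum to one.
law = {
    (F(1, 8), F(1, 8)): F(1, 2),
    (F(1, 8), F(-1, 8)): F(1, 4),
    (F(-1, 8), F(1, 8)): F(1, 8),
    (F(-1, 8), F(-1, 8)): F(1, 8),
}


def cond_mean(func, y1):
    """E[func(Y1, Y2) | Y1 = y1], computed by direct enumeration."""
    mass = sum(p for (a, _), p in law.items() if a == y1)
    return sum(func(a, b) * p for (a, b), p in law.items() if a == y1) / mass


x1 = sum(a * p for (a, _), p in law.items())  # X1 = E[Y1]


def x2(y1):
    """X2 = E[Y2 | Y1], the predictable part of the second jump."""
    return cond_mean(lambda a, b: b, y1)


def m1(y1):
    """Martingale after one jump: 1 + (Y1 - X1)."""
    return 1 + (y1 - x1)


def m2(y1, y2):
    """Martingale after two jumps: 1 + (Y1 - X1) + (Y2 - X2)."""
    return m1(y1) + (y2 - x2(y1))


for y1 in (F(1, 8), F(-1, 8)):
    # Conditional on the first jump, the second martingale value averages to the first.
    print(y1, cond_mean(m2, y1) == m1(y1))
```

In this toy setting the compensator after two jumps is \(x_{1}+x_{2}(Y_{1})\), mirroring \(\hat{A}_{\sigma_{k}}=\sum_{j\le k}X_{j}\).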

We now introduce a stochastic process \((\hat{M}_{t})_{0\leq t\leq T}\) on the Brownian probability space by setting \(\hat{M}_{t}=M_{\sigma_{k}}\) for \(t\in[\sigma_{k},\sigma_{k+1})\), \(k< m_{0}\), and \(\hat{M}_{t}=M_{\sigma_{m_{0}}}\) for \(t\in[\sigma_{m_{0}},T]\). Note that as \(|Y_{i}-X_{i}|\le2^{-N+1}\), for any \(k\le m_{0}\) and \(t\le T\), we have

$$\begin{aligned} |\hat{M}_{t\wedge\sigma_{k+1}\vee\sigma_{k}}\! - M_{t\wedge\sigma _{k+1}\vee\sigma_{k}}| &= \bigg|\!\sum_{j=k+1}^{m_{0}}\!\!\!E^{W}[(Y_{j}-X_{j})|\mathcal {F}_{t\wedge\sigma _{k+1}\vee\sigma_{k}}^{W}]\bigg| \\ & = \bigg|\!\sum_{j=k+2}^{m_{0}}\!\!\!E^{W}\big[E^{W}[(Y_{j}-X_{j})|\mathcal {F}^{W}_{\sigma _{j-1}}\!\!\vee\! \sigma(\sigma_{j})]\big|\mathcal {F}_{t\wedge\sigma_{k+1}\vee \sigma_{k}}^{W}\big] \\ & \quad{} + E^{W}[Y_{k+1}-X_{k+1}|\mathcal {F}_{t\wedge\sigma _{k+1}\vee\sigma_{k}}^{W}]\bigg| \\ &=| E^{W}[Y_{k+1}-X_{k+1}|\mathcal {F}_{t\wedge\sigma_{k+1}\vee\sigma_{k}}^{W}]|\\ & \le E^{W}\big[|Y_{k+1}-X_{k+1}|\big|\mathcal {F}_{t\wedge\sigma_{k+1}\vee\sigma _{k}}^{W}\big]\le2^{-N+1} \end{aligned}$$

and hence

$$\begin{aligned} \|\hat{M}-M\|< 2^{-N+2}. \end{aligned}$$

We also notice that \(\hat{Z}=\hat{M}+\hat{A}\) satisfies \(\hat{Z}_{0} = \hat {\mathbb {S}}_{0}\) and

$$\begin{aligned} &\mathcal {L}_{P^{W}}\big((\sigma_{1},\ldots,\sigma_{m_{0}}, \hat{Z}_{\sigma_{1}}-\hat{Z}_{0},\ldots ,\hat{Z}_{\sigma_{m_{0}}}-\hat{Z}_{\sigma_{m_{0}-1}})\big)\\ &\quad=\mathcal {L}_{\hat{\mathbb {Q}}}\big((\hat {\tau }_{1}, \ldots, \hat {\tau }_{m_{0}}, \hat {\mathbb {S}}_{\hat {\tau }_{1}}-\hat {\mathbb {S}}_{\hat {\tau }_{0}},\ldots, \hat {\mathbb {S}}_{\hat {\tau }_{m_{0}}}-\hat {\mathbb {S}}_{\hat {\tau }_{m_{0}-1}})\big). \end{aligned}$$

It follows that

$$\begin{aligned} E^{W}[\hat {G}(\hat{Z})] = \mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}}^{\hat {\tau }_{m_{0}}})]. \end{aligned}$$

In particular, by (5.20), we see that (5.17) holds, and also by (5.19) and the definition of \(\hat{M}\) and \(\hat{A}\), (5.18) holds.

Step 2. We shall shortly construct a continuous martingale \(M^{\theta_{0}}\) from \(M\) such that \(M^{\theta_{0}}\) is bounded below by \(-2^{-N+2}-N^{-\frac {1}{2}}\) and

$$ |E^{W}[\hat {G}(M^{\theta_{0}})]-\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})]|\le c^{2}N^{-\frac{1}{2}} + 2f_{e}(N^{-\frac{1}{2}} + 2^{-N+2}) +2^{-N}. $$

As the law of \(\hat{Z}\) under \(P^{W}\) is the same as that of \(\hat {\mathbb {S}}^{m_{0}}\) under \(\hat{\mathbb {Q}}\), it follows from the fact that \(\hat{\mathbb {Q}}\) is supported on \(\hat{\mathbb {D}}^{(N)}\) and any \(f\in\hat{\mathbb {D}}^{(N)}\) is above \(-2^{-N+3}\) that

$$\begin{aligned} \hat{Z}\ge-2^{-N+3} \qquad \text{ $P^{W}$-a.s.} \end{aligned}$$

By combining this with (5.7) and (5.21), we can deduce that

It follows that

where we use the fact that . Hence, since \(\hat {G}\) is bounded by \(\kappa\),

$$\begin{aligned} |E^{W}[\hat {G}(M)]-E^{W}[\hat {G}(\hat{Z})]| &\le f_{e}(2^{-N+4}+N^{-\frac{1}{2}})\\ &\phantom{=}{}+2\kappa P^{W}\bigg[\max_{1\le i\le d+K}\sum _{k=1}^{m_{0}}|X^{(i)}_{k}|> N^{-\frac{1}{2}}\bigg]. \end{aligned}$$

Note that with the notation of the proof in Sect. A.3 in the Appendix,

$$\begin{aligned} X_{k} = E^{W}[Y_{k}\big| \pmb {\sigma}_{k},\, \pmb {Y}_{k-1}] \stackrel {(d)}{=}&\,\mathbb {E}_{\hat{\mathbb {Q}}}\big[\hat {\mathbb {S}}_{\hat {\tau }_{k}} - \hat {\mathbb {S}}_{\hat {\tau }_{k-1}}\big|\pmb {\hat {\tau }}_{k},\, \pmb {\Delta \hat {\mathbb {S}}}_{\hat {\tau }_{k-1}}\big]\\ =&\, \mathbb {E}_{\hat{\mathbb {Q}}}\big[\hat {\mathbb {S}}_{\hat {\tau }_{k}} - \hat {\mathbb {S}}_{\hat {\tau }_{k-1}}\big|\hat{\mathcal {F}}^{(N)}_{\hat {\tau }_{k}-}\big], \end{aligned}$$

where \(\Delta \hat {\mathbb {S}}_{k}=\hat {\mathbb {S}}_{\tilde {\tau }_{k}}-\hat {\mathbb {S}}_{\tilde {\tau }_{k}-}=\hat {\mathbb {S}}_{\tilde {\tau }_{k}}-\hat {\mathbb {S}}_{\tilde {\tau }_{k-1}}\) for \(k\le m_{0}\) and hence

$$\begin{aligned} E^{W}\bigg[\sum_{i=1}^{d+K}\sum_{k=1}^{m_{0}}\big|X^{(i)}_{k}\big|\bigg]=\mathbb {E}_{\hat{\mathbb {Q}}}\bigg[\sum_{k=1}^{m_{0}}\sum_{i=1}^{d+K}\big|\mathbb {E}_{\hat{\mathbb {Q}}}\big[\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k}}\big|\hat{\mathcal {F}}^{(N)}_{\hat {\tau }_{k}-}\big]-\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k-1}}\big|\bigg]. \end{aligned}$$

By Markov’s inequality and the definition of \(\hat {\mathbb {M}}^{(N)}_{\mathcal {I}}(2\kappa+2\alpha)\), we have

$$\begin{aligned} P^{W}\bigg[\sum_{i=1}^{d+K}\sum_{k=1}^{m_{0}}|X^{(i)}_{k}|> N^{-\frac {1}{2}}\bigg] &\le\sqrt{N}E^{W}\bigg[\sum_{i=1}^{d+K}\sum _{k=1}^{m_{0}}|X^{(i)}_{k}|\bigg] \\ & \le\sqrt{N} \mathbb {E}_{\hat{\mathbb {Q}}}\bigg[\sum_{k=1}^{m(\hat {\mathbb {S}})}\sum_{i=1}^{d+K}\big|\mathbb {E}_{\hat{\mathbb {Q}}}\big[\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k}}\big|\hat{\mathcal {F}}^{(N)}_{\hat {\tau }_{k}-}\big] -\hat {\mathbb {S}}^{(i)}_{\hat {\tau }_{k-1}}\big|\bigg] \\ &\le2(\kappa+\alpha) N^{-\frac{1}{2}}. \end{aligned}$$
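
The chain above is Markov's inequality with threshold \(N^{-1/2}\), followed by the \(\kappa/N\)-type budget from Definition 5.9. A minimal numerical sketch, assuming a made-up three-point distribution for the nonnegative quantity \(\sum_{i}\sum_{k}|X^{(i)}_{k}|\) and illustrative values of \(N\), \(\kappa\) and \(\alpha\):

```python
def mean(values, probs):
    """Expectation of a finite distribution."""
    return sum(v * p for v, p in zip(values, probs))


def tail_probability(values, probs, threshold):
    """Exact P[Z > threshold] for a finite distribution."""
    return sum(p for v, p in zip(values, probs) if v > threshold)


# Illustrative parameters and a hypothetical law for Z >= 0.
N, kappa, alpha = 400, 2.0, 1.0
values = [0.0, 0.01, 0.2]
probs = [0.5, 0.45, 0.05]

budget = (2 * kappa + 2 * alpha) / N           # bound on E[Z] from the definition
threshold = N ** -0.5                          # N^(-1/2)
tail = tail_probability(values, probs, threshold)
markov = N ** 0.5 * mean(values, probs)        # sqrt(N) * E[Z]
final = 2 * (kappa + alpha) * N ** -0.5        # right-hand side of the display

print(mean(values, probs) <= budget, tail <= markov <= final)
```

The example only confirms the ordering of the three quantities for one distribution satisfying the budget; it is not a substitute for the general argument.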

Therefore, we have

$$\begin{aligned} |E^{W}[\hat {G}(M)]-E^{W}[\hat {G}(\hat{Z})]|\le f_{e}(2^{-N+4}+N^{-\frac {1}{2}})+4(\kappa+\alpha)^{2}N^{-\frac{1}{2}}. \end{aligned}$$

By (5.21)–(5.23),

$$\begin{aligned} P^{W}\bigg[&\inf_{0\le t\le T}\min_{1\le i\le d+K}M^{(i)}_{t}> -2^{-N+4}-N^{-\frac{1}{2}}\text{ and }\\ &\max_{d\le i\le d+K}\|M^{(i)}\|< \kappa+1+2^{-N+2}+N^{-\frac {1}{2}}\bigg]\ge1-2\kappa N^{-\frac{1}{2}}. \end{aligned}$$

Hence the stopped process \(M^{\theta_{0}}\), with

$$ \begin{aligned} \theta_{0}:=\inf\Big\{ t\ge0:& \min_{1\le i\le d+K}M^{(i)}_{t}\le -2^{-N+4}-N^{-\frac{1}{2}}\text{ or }\\ &\max_{d\le i\le d+K}\|M^{(i)}\|\ge\kappa+1+2^{-N+2}+N^{-\frac {1}{2}}\Big\} , \end{aligned} $$

satisfies


$$\begin{aligned} |E^{W}[\hat {G}(M)]-E^{W}[\hat {G}(M^{\theta_{0}})]|\le4(\kappa+\alpha)^{2} N^{-\frac {1}{2}}. \end{aligned}$$

By (5.20), (5.24) and (5.25), it follows that

$$\begin{aligned} &|E^{W}[\hat {G}(M^{\theta_{0}})]-\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})]| \\ &\quad\le|E^{W}[\hat {G}(M^{\theta_{0}})]-E^{W}[\hat {G}(M)]| \\ &\qquad{} +|E^{W}[\hat {G}(M)]-E^{W}[\hat {G}(\hat{Z})]| + |\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}}^{\hat {\tau }_{m_{0}}})]-\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})]| \\ &\quad\le4(\kappa+\alpha)^{2}N^{-\frac{1}{2}}+4(\kappa+\alpha)^{2}N^{-\frac {1}{2}}+f_{e}(2^{-N+4}+N^{-\frac{1}{2}})+\kappa2^{-N+1} . \end{aligned}$$

In addition, by (5.21) and (5.23), we can deduce from (5.18) that

$$\begin{aligned} &P^{W}\Big[\Big\{ \omega\in\varOmega^{W} : \inf_{\upsilon\in \mathcal {I}}\|M^{\theta _{0}}(\omega)-\upsilon\|\ge1/N + N^{-\frac{1}{2}}+2^{-N+2}\Big\} \Big]\\ &\quad\le\frac{2\kappa+2\alpha}{N}+2^{-N}+2(\kappa+\alpha) N^{-\frac{1}{2}}, \end{aligned}$$

which for \(N\) large enough easily implies that

$$ P^{W}\Big[\Big\{ \omega\in\varOmega^{W} : \inf_{\upsilon\in \mathcal {I}}\|M^{\theta _{0}}(\omega)-\upsilon\|\ge4(\kappa+\alpha) N^{-\frac{1}{2}}\Big\} \Big]\le4(\kappa+\alpha) N^{-\frac{1}{2}} . $$

Similarly, by (5.21) and (5.23), we have

$$ P^{W}[\|\hat{Z} - M^{\theta_{0}}\| \ge2^{-N+2} + N^{-\frac{1}{2}}]\le 2(\kappa+\alpha) N^{-\frac{1}{2}}. $$

Step 3. The next step is to modify the martingale \(M^{\theta _{0}}\) in such a way that \(\varGamma\), the new continuous martingale, is nonnegative. Write \(\epsilon_{N}=2^{-N+4}+N^{-\frac{1}{2}}\) and define an \(\mathcal {F}^{W}_{T}\)-measurable random variable \(\varLambda\geq0\) by \(\varLambda=(M_{T\wedge{\theta_{0}}}+\epsilon_{N})/(1+\epsilon_{N})\). Then

$$\begin{aligned} |\varLambda-M_{T\wedge\theta_{0}}|=\bigg|\epsilon_{N}\frac{1-M_{T\wedge \theta_{0}}}{1+\epsilon_{N}}\bigg| \le\epsilon_{N}(1+|M_{T\wedge\theta_{0}}|). \end{aligned}$$
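
The affine map \(x\mapsto(x+\epsilon_{N})/(1+\epsilon_{N})\) sends values above \(-\epsilon_{N}\) to nonnegative values and leaves the value \(1\) unchanged, which is why the mean of the terminal value is preserved. A small exact check, with a hypothetical helper name and an arbitrary choice of \(\epsilon_{N}\):

```python
from fractions import Fraction as F


def shift_to_nonnegative(m, eps):
    """Affine map (m + eps) / (1 + eps) used to define Lambda from M."""
    return (m + eps) / (1 + eps)


eps = F(1, 10)  # stands in for epsilon_N; the value is arbitrary
for m in (-eps, F(0), F(1, 2), F(1), F(3)):
    lam = shift_to_nonnegative(m, eps)
    # Exact identity and the bound from the display above.
    identity_ok = lam - m == eps * (1 - m) / (1 + eps)
    bound_ok = abs(lam - m) <= eps * (1 + abs(m))
    print(m, lam, lam >= 0, identity_ok, bound_ok)
```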

Note that for any \(i>d\), we have \(\|\varLambda^{(i)}\|\le\kappa +1+2^{-N+2}+N^{-\frac{1}{2}}+\epsilon_{N}\le\kappa+2\) for \(N\) large enough. We now construct a continuous martingale from \(\varLambda\) by setting

$$ \varGamma_{t}=E^{W}[\varLambda|\mathcal {F}_{t}^{W}], \qquad t\in[0,T], $$

and \(\varLambda\ge0\) implies that \(\varGamma\) is nonnegative, and \(\varGamma ^{(i)}_{0} = 1\) for \(i\le d+K\). Hence \(\mathbb {P}^{(N)} := P^{W}\circ(\varGamma _{t})^{-1}\in \underline {\mathcal {M}}\).

We first note that for all \(i=1,\ldots,d+K\),

$$\begin{aligned} E^{W}[|M_{T\wedge\theta_{0}}^{(i)}|]=E^{W}[M_{T\wedge\theta _{0}}^{(i)}+2(M^{(i)}_{T\wedge\theta_{0}})^{-}]\le E^{W}[M_{T\wedge\theta _{0}}^{(i)}+2]=3. \end{aligned}$$

Then by Doob’s martingale inequality,

$$\begin{aligned} P^{W}[\|\varGamma-M^{\theta_{0}}\|\ge\epsilon_{N}^{1/2}] &\le\epsilon_{N}^{-1/2}\sum_{i=1}^{d+K}E^{W}[|\varLambda ^{(i)}-M^{(i)}_{T\wedge\theta_{0}}|] \\ &\le\epsilon_{N}^{-1/2}4(d+K)\epsilon_{N}=4(d+K)\epsilon _{N}^{1/2}. \end{aligned}$$

This together with (5.26), writing \(\kappa_{1}=\kappa+\alpha\), yields

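$$\begin{aligned} &|E^{W}[G(\varGamma)]-\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})]| \\ &\quad\le E^{W}[|G(\varGamma)-\hat {G}(M^{\theta_{0}})|]+|E^{W}[\hat {G}(M^{\theta_{0}})]-\mathbb {E}_{\hat{\mathbb {Q}}}[\hat {G}(\hat {\mathbb {S}})]| \\ &\quad\le E^{W}\big[|G(\varGamma)-G(M^{\theta_{0}}\vee0)|\mathbf{1}_{\{\|\varGamma-M^{\theta_{0}}\|< \epsilon_{N}^{1/2}\}}\big]+8\kappa_{1}(d+K)\epsilon_{N}^{1/2} \\ &\qquad{}+8\kappa_{1}^{2}N^{-\frac{1}{2}}+f_{e}(2^{-N+4}+N^{-\frac{1}{2}})+\kappa_{1}2^{-N+1} \\ &\quad\le f_{e}(\epsilon_{N}^{1/2})+8\kappa_{1}(d+K)\epsilon_{N}^{1/2}+9\kappa_{1}^{2}\epsilon_{N}^{1/2}+f_{e}(\epsilon_{N}^{1/2}) \\ &\quad\le2f_{e}(\epsilon_{N}^{1/2})+17\kappa_{1}^{2}(d+K)\epsilon_{N}^{1/2}. \end{aligned}$$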

Finally, we can deduce from (5.27) and (5.29) that

$$\begin{aligned} &P^{W}\Big[\Big\{ \omega\in\varOmega^{W} : \inf_{\upsilon\in \mathcal {I}}\|\varGamma (\omega)-\upsilon\|\ge4\kappa_{1} N^{-\frac{1}{2}}+\epsilon_{N}^{1/2}\Big\} \Big] \\ &\quad\le4\kappa_{1} N^{-\frac{1}{2}} + 4(d+K)\epsilon^{1/2}_{N}, \end{aligned}$$

and from (5.28) and (5.29) that

$$\begin{aligned} P^{W}[ \|\hat{Z} - \varGamma\| \ge\epsilon_{N}+\epsilon_{N}^{1/2}]\le2\kappa_{1} N^{-\frac{1}{2}} + 4(d+K)\epsilon^{1/2}_{N}. \end{aligned}$$

Step 4. The last step is to construct a new process \(\tilde {\varGamma}\) from \(\varGamma\) such that the law of \(\tilde{\varGamma}\) under \(P^{W}\) is an element of \(\mathcal {M}_{ \mathcal {I}}\). We write \(\eta_{N} = 4\kappa_{1} N^{-\frac{1}{2}} + 4(d+K)\epsilon ^{1/2}_{N}\) and

$$\begin{aligned} p^{(N)}_{i} := \mathbb {E}_{\mathbb {P}^{(N)}}[X^{(c)}_{i}(\mathbb {S}_{T}^{(1)},\ldots, \mathbb {S}_{T}^{(d)})] \end{aligned}$$

for any \(i=1,\ldots,K\), and define \(\tilde{p}^{(N)}_{i}\) by

$$\begin{aligned} \tilde{p}^{(N)}_{i} := \frac{\mathcal {P}(X^{(c)}_{i}) - (1- \sqrt{\eta _{N}})p^{(N)}_{i}}{\sqrt{\eta_{N}}}. \end{aligned}$$

We can deduce from (5.30) and (5.31) that

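$$\begin{aligned} |\mathcal {P}(X^{(c)}_{i})-p^{(N)}_{i}| &\le E^{W}\big[|X^{(c)}_{i}(\varGamma^{(1)}_{T},\ldots,\varGamma^{(d)}_{T})-\mathcal {P}(X^{(c)}_{i})\varGamma^{(d+i)}_{T}|\big] \\ &\le \mathcal {P}(X^{(c)}_{i})\eta_{N}+E^{W}\big[|X^{(c)}_{i}(\varGamma^{(1)}_{T},\ldots,\varGamma^{(d)}_{T})-\mathcal {P}(X^{(c)}_{i})\varGamma^{(d+i)}_{T}| \\ &\phantom{\le \mathcal {P}(X^{(c)}_{i})\eta_{N}+E^{W}\big[}\times\mathbf{1}_{\{|X^{(c)}_{i}(\varGamma^{(1)}_{T},\ldots,\varGamma^{(d)}_{T})/\mathcal {P}(X^{(c)}_{i})-\varGamma^{(d+i)}_{T}|>\eta_{N}\}}\big] \\ &\le \mathcal {P}(X^{(c)}_{i})\eta_{N}+2(\kappa+2)\mathcal {P}(X^{(c)}_{i})\eta_{N}, \qquad i=1,\ldots,K. \end{aligned}$$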

It follows immediately that

$$\begin{aligned} |\tilde{p}^{(N)}_{i} - \mathcal {P}(X^{(c)}_{i})| &= \bigg(\frac{1}{\sqrt{\eta_{N}}} -1\bigg)|\mathcal {P}(X^{(c)}_{i}) - p^{(N)}_{i}| \\ &\le\frac{2(\kappa+2)\mathcal {P}(X^{(c)}_{i})\eta_{N}}{\sqrt{\eta_{N}}} =2(\kappa +2)\mathcal {P}(X^{(c)}_{i})\sqrt{\eta_{N}}, \qquad \forall i\le K. \end{aligned}$$
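
The definition of \(\tilde{p}^{(N)}_{i}\) is chosen so that mixing the two prices with weights \(1-\sqrt{\eta_{N}}\) and \(\sqrt{\eta_{N}}\) recovers the market price exactly. The sketch below checks this and the identity in the first line of the display above; the numerical values and the function name are hypothetical.

```python
import math


def compensating_price(target: float, p: float, eta: float) -> float:
    """p_tilde = (target - (1 - sqrt(eta)) * p) / sqrt(eta)."""
    s = math.sqrt(eta)
    return (target - (1 - s) * p) / s


# Illustrative numbers: market price, model price under P^(N), and eta_N.
target, p, eta = 1.0, 0.97, 0.04
s = math.sqrt(eta)
p_tilde = compensating_price(target, p, eta)

mixture = (1 - s) * p + s * p_tilde
gap = abs(p_tilde - target)
predicted_gap = (1 / s - 1) * abs(target - p)

print(math.isclose(mixture, target), math.isclose(gap, predicted_gap))
```

Because the correcting measure only receives weight \(\sqrt{\eta_{N}}\), its price may deviate from the market price by a factor of order \(1/\sqrt{\eta_{N}}\) more than the original mismatch.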

Then it follows from Assumption 3.1 that when \(N\) is large enough, there exists a \(\tilde{\mathbb {P}}^{(N)}\in \underline {\mathcal {M}}_{\tilde{ \mathcal {I}}}\) such that

$$\begin{aligned} \tilde{p}^{(N)}_{i} = \mathbb {E}_{\tilde{\mathbb {P}}^{(N)}}[X^{(c)}_{i}(\mathbb {S}_{T}^{(1)},\ldots, \mathbb {S}_{T}^{(d)})], \qquad \forall i\le K. \end{aligned}$$

On the Wiener space \((\varOmega^{W}, \mathcal {F}^{W}, P^{W})\), or a suitable enlargement if necessary, there are continuous martingales \(\varGamma\) and \(\tilde{M}\) which have laws equal to \(\mathbb {P}^{(N)}\) and \(\tilde{\mathbb {P}}^{(N)}\) respectively, and an \(\mathcal {F}^{W}_{T}\)-measurable random variable \(\xi\in\{0,1\} \) that is independent of \(\varGamma\) and \(\tilde{M}\) with

$$P^{W}[\xi= 1] = 1- \sqrt{\eta_{N}}, \qquad P^{W}[\xi= 0]=\sqrt{\eta _{N}}. $$

Define \(\mathcal {F}^{W}_{T}\)-measurable random variables \(\tilde{\varLambda}^{(i)}\) by

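$$\begin{aligned} \tilde{\varLambda}^{(i)} &=\varGamma^{(i)}_{T}\mathbf{1}_{\{\xi=1\}}+\tilde{M}^{(i)}_{T}\mathbf{1}_{\{\xi=0\}}, \qquad i=1,\ldots,d,\\ \tilde{\varLambda}^{(i)} &=X^{(c)}_{i-d}(\tilde{\varLambda}^{(1)},\ldots,\tilde{\varLambda}^{(d)})/\mathcal {P}(X^{(c)}_{i-d}), \qquad d+K\ge i> d. \end{aligned}$$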

We now construct a continuous martingale from \(\tilde{\varLambda}\) by setting

$$ \tilde{\varGamma}_{t}=E^{W}[\tilde{\varLambda}|\mathcal {F}_{t}^{W}], \qquad t\in[0,T]. $$

It follows from the fact that \(\xi\) is independent of \(\varGamma\) and \(\tilde{M}\) that

$$\begin{aligned} \tilde{\varGamma}^{(i+d)}_{0} &= E^{W}[\tilde{\varGamma}^{(i+d)}_{T}|\mathcal {F}_{0}^{W}]\\ &=(1-\sqrt{\eta_{N}})E^{W}[X_{i}^{(c)}(\varGamma_{T}^{(1)},\ldots,\varGamma _{T}^{(d)})/\mathcal {P}(X^{(c)}_{i}) ]\\ &\phantom{=}{}+ \sqrt{\eta_{N}} E^{W}[X_{i}^{(c)}(\tilde{M}_{T}^{(1)},\ldots ,\tilde{M}_{T}^{(d)})/\mathcal {P}(X^{(c)}_{i})]\\ &= \frac{(1-\sqrt{\eta_{N}})p_{i}^{(N)}+\sqrt{\eta_{N}}\tilde{p}_{i}^{(N)}}{\mathcal {P}(X^{(c)}_{i})} =1, \qquad 1\leq i \leq K, \end{aligned}$$

and


$$\begin{aligned} \tilde{\varGamma}^{(i)}_{0} &= E^{W}[\tilde{\varGamma}^{(i)}_{T}|\mathcal {F}_{0}^{W}] = E^{W}[\tilde{\varLambda}^{(i)}_{T}|\mathcal {F}_{0}^{W}] \\ &= (1-\sqrt{\eta_{N}})E^{W}[\varGamma^{(i)}_{T}]+ \sqrt{\eta_{N}} E^{W}[\tilde {M}^{(i)}_{T}] = 1, \qquad i\le d. \end{aligned}$$

Hence \(\tilde{\mathbb {P}} := P^{W} \circ(\tilde{\varGamma}_{t})^{-1} \in \mathcal {M}_{\mathcal {I}}\). Also by the independence between \(\xi\) and \((\varGamma, \tilde{M})\), we have

$$\begin{aligned} E^{W}[|\tilde{\varLambda}^{(i)} - \varGamma_{T}^{(i)}|] = \sqrt{\eta_{N}} E^{W}[|\tilde{M}_{T}^{(i)}- \varGamma_{T}^{(i)}|] \le2\sqrt{\eta_{N}}, \qquad i\le d, \end{aligned}$$

and by (5.30),

$$\begin{aligned} P^{W}[|\varGamma_{T}^{(i)} - \tilde{\varLambda}^{(i)}|>\eta_{N}]\le\eta_{N} +\sqrt {\eta_{N}}\le2\sqrt{\eta_{N}}, \qquad i> d, \end{aligned}$$

which implies that

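$$\begin{aligned} E^{W}[|\tilde{\varLambda}^{(i)}-\varGamma^{(i)}_{T}|] &=2E^{W}[(\tilde{\varLambda}^{(i)}-\varGamma^{(i)}_{T})^{+}]-E^{W}[\tilde{\varLambda}^{(i)}-\varGamma^{(i)}_{T}] =2E^{W}[(\tilde{\varLambda}^{(i)}-\varGamma^{(i)}_{T})^{+}] \\ &\le2\eta_{N}+2E^{W}[\tilde{\varLambda}^{(i)}\mathbf{1}_{\{|\tilde{\varLambda}^{(i)}-\varGamma^{(i)}_{T}|>\eta_{N}\}}] \\ &\le2\eta_{N}+4(\kappa+2)\sqrt{\eta_{N}}\le14\kappa\sqrt{\eta_{N}}, \qquad i=d+1,\ldots,d+K. \end{aligned}$$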

Then by Doob’s martingale inequality,

$$P^{W}[\|\tilde{\varGamma}-\varGamma\|\ge\kappa\eta_{N}^{1/4}]\le\frac {1}{\kappa\eta_{N}^{1/4}}\sum_{i=1}^{d+K}E^{W}[|\tilde{\varLambda }^{(i)}-\varGamma^{(i)}_{T}|]\le14(d+K)\eta_{N}^{1/4} $$

and hence

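$$\begin{aligned} |\mathbb {E}_{\tilde{\mathbb {P}}}[G(\mathbb {S})]-\mathbb {E}_{\mathbb {P}^{(N)}}[G(\mathbb {S})]| &=|E^{W}[G(\tilde{\varGamma})-G(\varGamma)]| \\ &\le f_{e}(\kappa\eta_{N}^{1/4})+E^{W}\big[|G(\tilde{\varGamma})-G(\varGamma)|\mathbf{1}_{\{\|\tilde{\varGamma}-\varGamma\|\ge\kappa\eta_{N}^{1/4}\}}\big] \\ &\le f_{e}(\kappa\eta_{N}^{1/4})+28\kappa(d+K)\eta_{N}^{1/4}. \end{aligned}$$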

In addition, we can deduce from (5.31) that

$$ P^{W}[ \|\hat{Z} - \tilde{\varGamma}\| \ge\kappa\eta_{N}^{1/4} +\epsilon _{N}+\epsilon_{N}^{1/2}]\le2\kappa_{1} N^{-\frac{1}{2}} + 4(d+K)\epsilon ^{1/2}_{N} +14(d+K)\kappa\eta_{N}^{1/4}. $$
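
To get a feeling for how the error terms \(\epsilon_{N}\) and \(\eta_{N}\) from Steps 3 and 4 behave, one can tabulate the distance threshold \(\kappa\eta_{N}^{1/4}+\epsilon_{N}+\epsilon_{N}^{1/2}\) for a few values of \(N\). This is only an illustration: the parameter values are made up, `dim` stands for \(d+K\), and the function names are hypothetical.

```python
def epsilon(N: int) -> float:
    """epsilon_N = 2^(-N+4) + N^(-1/2), as in Step 3."""
    return 2.0 ** (4 - N) + N ** -0.5


def eta(N: int, kappa1: float, dim: int) -> float:
    """eta_N = 4*kappa_1*N^(-1/2) + 4*(d+K)*epsilon_N^(1/2), as in Step 4."""
    return 4 * kappa1 * N ** -0.5 + 4 * dim * epsilon(N) ** 0.5


def distance_threshold(N: int, kappa: float, kappa1: float, dim: int) -> float:
    """kappa*eta_N^(1/4) + epsilon_N + epsilon_N^(1/2)."""
    e = epsilon(N)
    return kappa * eta(N, kappa1, dim) ** 0.25 + e + e ** 0.5


kappa, alpha, dim = 2.0, 1.0, 3  # illustrative values only
for N in (16, 64, 256, 1024):
    print(N, distance_threshold(N, kappa, kappa + alpha, dim))
```

The threshold decreases in \(N\), but slowly, since its leading term is of order \(N^{-1/16}\); the argument only needs convergence to zero, not a rate.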

Notice that when \(N\) is sufficiently large such that \(\kappa\eta _{N}^{1/4} +\epsilon_{N}+\epsilon_{N}^{1/2} < 2^{-D-1}\), we can deduce from (5.7) and (5.22) that on the event

$$\{\omega\in\varOmega^{W} : \|\hat{Z}(\omega) - \tilde{\varGamma}(\omega)\| < \kappa\eta_{N}^{1/4} +\epsilon_{N}+\epsilon_{N}^{1/2} \text{ and } \hat {Z}(\omega)\in \hat {\mathbb {D}}^{(N)}\}, $$

we have

and hence by Remark 4.2, the inequality \(\hat{m}^{(D)}(\hat{Z}) \ge m^{(D-2)}(\tilde{\varGamma})\) holds on

$$\{\omega\in\varOmega^{W} : \|\hat{Z}(\omega) - \tilde{\varGamma}(\omega)\| < \kappa\eta_{N}^{1/4} +\epsilon_{N}+\epsilon_{N}^{1/2} \text{ and } \hat {Z}(\omega)\in \hat {\mathbb {D}}^{(N)}\}. $$

It follows that

$$\begin{aligned} &\mathbb {E}_{\hat{\mathbb {Q}}}[\beta\sqrt{\hat{m}^{(D)}(\hat {\mathbb {S}})}\wedge\alpha] \ge \mathbb {E}_{\hat {\mathbb {Q}}}[\beta\sqrt{\hat{m}^{(D)}(\hat {\mathbb {S}}^{m_{0}})}\wedge\alpha] = E^{W}[\beta\sqrt{\hat{m}^{(D)}(\hat{Z})}\wedge\alpha] \\ &\quad\ge E^{W}[\beta\sqrt{m^{(D-2)}(\tilde{\varGamma})}\wedge\alpha] - \alpha \big( 2\kappa_{1} N^{-\frac{1}{2}} + 4(d+K)\epsilon^{1/2}_{N} +14(d+K)\kappa\eta_{N}^{1/4}\big). \end{aligned}$$