Robust pricing–hedging dualities in continuous time

We pursue a robust approach to pricing and hedging in mathematical finance. We consider a continuous-time setting in which some underlying assets and options, with continuous price paths, are available for dynamic trading and a further set of European options, possibly with varying maturities, is available for static trading. Motivated by the notion of prediction set in Mykland (Ann. Stat. 31:1413–1438, 2003), we include modelling beliefs in our setup by allowing the modeller to specify a set of paths to be considered, e.g. superreplication of a contingent claim is required only for paths falling in the given set. Our framework thus interpolates between model-independent and model-specific settings and allows us to quantify the impact of making assumptions or gaining information. We obtain a general pricing–hedging duality result: the infimum over superhedging prices of an exotic option with payoff $G$ is equal to the supremum of expectations of $G$ under calibrated martingale measures. Our results include in particular the martingale optimal transport duality of Dolinsky and Soner (Probab. Theory Relat. Fields 160:391–427, 2014) and extend it to multiple dimensions, multiple maturities and beliefs which are invariant under time-changes. In a general setting with arbitrary beliefs and for a uniformly continuous $G$, the asserted duality holds between limiting values of perturbed problems.


Introduction
Two approaches to pricing and hedging. The question of pricing and hedging of a contingent claim lies at the heart of mathematical finance. Following Merton's seminal contribution [38], we may distinguish two ways of approaching it. First, one may want to make statements "based on assumptions sufficiently weak to gain universal support", e.g. market efficiency combined with some broad mathematical idealisation of the market setting. We refer to this perspective as the model-independent approach. While very appealing at first, it has been traditionally criticised for producing outputs which are too imprecise to be of practical relevance. This is contrasted with the second, model-specific approach which focuses on obtaining explicit statements leading to unique prices and hedging strategies. "To do so, more structure must be added to the problem through additional assumptions at the expense of losing some agreement." Typically this is done by fixing a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, \mathbb{P})$ with risky assets represented by some adapted process $(S_t)$.
The model-specific approach, originating from the seminal works of Samuelson [47] and Black and Scholes [8], has revolutionised the financial industry and become the dominating paradigm for researchers in quantitative finance. Accordingly, we refer to it also as the classical approach. The original model of Black and Scholes has been extended and generalised, e.g. adding stochastic volatility and/or stochastic interest rates, trying to account for market complexity observed in practice. Such generalisations often lead to market incompleteness and a lack of unique rational warrant prices. Nevertheless, no-arbitrage pricing and hedging was fully characterised in a body of works on the fundamental theorem of asset pricing (FTAP) culminating in Delbaen and Schachermayer [20, 21]. The feasible prices for a contingent claim correspond to expectations of the (discounted) payoff under equivalent martingale measures (EMM) and form an interval. The bounds of the interval are also given by the super- and sub-hedging prices. Put differently, the supremum of expectations of the payoff under EMMs is equal to the infimum of prices of superhedging strategies. We refer to this fundamental result as the pricing-hedging duality.
Short literature review. The ability to obtain unique prices and hedging strategies, which is the strength of the model-specific approach, relies on its primary weakness, namely the necessity to postulate a fixed probability measure $\mathbb{P}$ giving a full probabilistic description of future market dynamics. Put differently, this approach captures risks within a given model, but fails to tell us anything about the model uncertainty, also called Knightian uncertainty; see Knight [36, Chap. 2]. Accordingly, researchers have extended the classical setup to one where many measures $\{\mathbb{P}_\alpha : \alpha \in \Lambda\}$ are simultaneously deemed feasible. This can be seen as weakening assumptions and going back from model-specific towards model-independent. The pioneering works considered uncertain volatility; see Lyons [37] and Avellaneda et al. [2]. More recently, a systematic approach based on quasi-sure analysis was developed, with stochastic integration based on capacity theory in Denis and Martini [23] and on the aggregation method in Soner et al. [48]; see also Neufeld and Nutz [41]. In discrete time, a corresponding generalisation of the FTAP and the pricing-hedging duality was obtained by Bouchard and Nutz [10] and in continuous time by Biagini et al. [6]; see also references therein. We also mention that setups with frictions, e.g. trading constraints, were considered; see Bayraktar and Zhou [3].
In parallel, the model-independent approach has also seen a revived interest. This was mainly driven by the observation that with the increasingly rich market reality, this "universally acceptable" setting may actually provide outputs precise enough to be practically relevant. Indeed, in contrast to when Merton [38] was examining this approach, at present typically not only the underlying is liquidly traded, but so are many European options written on it. Accordingly, these should be treated as inputs and hedging instruments, thus reducing the possible universe of no-arbitrage scenarios. Breeden and Litzenberger [11] were the first to observe that if many (all) European options for a given maturity trade, then this is equivalent to fixing the marginal distribution of the stock under any EMM in the classical setting. Hobson [33] in his pioneering work then showed how this can be used to compute model-independent prices and hedges of lookback options. Other exotic options were analysed in subsequent works; see Brown et al. [12], Cox and Wang [18], Cox and Obłój [17]. The resulting no-arbitrage price bounds could still be too wide even for market making, but the associated hedging strategies were shown to perform remarkably well when compared to traditional delta-vega hedging; see Obłój and Ulmer [43]. Note that the superhedging property here is understood in a pathwise sense, and typically the strategies involve buy-and-hold positions in options and simple dynamic trading in the underlying. The universality of the setting and relative insensitivity of the outputs to the (few) assumptions earned this setup the name of robust approach.
In the wake of the financial crisis, significant research focus shifted back to the model-independent approach, and many natural questions, such as establishing the pricing-hedging duality and a (robust) version of the FTAP, were pursued. In a one-period setting, the pricing-hedging duality was linked to the Karlin-Isii duality in linear programming by Davis et al. [19]; see also Riedel [45]. Beiglböck et al. [5] reinterpreted the problem as a martingale optimal transport problem and established a general discrete-time pricing-hedging duality as an analogue of the Kantorovich duality in optimal transport. Here the primal elements are martingale measures, starting in a given point and having fixed marginal distribution(s) via the Breeden and Litzenberger [11] formula. The dual elements are sub- or superhedging strategies, and the payoff of the contingent claim is the "cost functional". An analogous result in continuous time, under suitable continuity assumptions, was obtained by Dolinsky and Soner [25], who also more recently considered the discontinuous setting [26]. These topics remain an active field of research. Acciaio et al. [1] considered the pricing-hedging duality and the FTAP with an arbitrary market input in discrete time and under significant technical assumptions. These were relaxed, offering great insights, in a recent work of Burzoni et al. [14]. Galichon et al. [30] applied the methods of stochastic control to deduce the model-independent prices and hedges; see also Henry-Labordère et al. [32]. Several authors considered setups with frictions, e.g. transaction costs in Dolinsky and Soner [24] or trading constraints in Cox et al. [16] and Fahim and Huang [28].

Main contribution
The present work contributes to the literature on robust pricing and hedging of contingent claims in two ways. First, inspired by Dolinsky and Soner [25], we study the pricing-hedging duality in continuous time and extend their results to multiple dimensions, different market setups and options with uniformly continuous payoffs. Our results are general and obtained in a comprehensive setting. We explicitly specify several important special cases, including the setting when finitely many options are traded, some dynamically and some statically, and the setting when all European call options for n maturities are traded. The latter gives the martingale optimal transport (MOT) duality with n marginal constraints which was also recently studied in a discontinuous setup by Dolinsky and Soner [26] and, in parallel to our work, by Guo et al. [31].
Our second main contribution is to propose a robust approach, which subsumes the model-independent setting, but allows us to include assumptions and move gradually towards the model-specific setting. In this sense, we strive to provide a setup which connects and interpolates between the two ends of the spectrum considered by Merton [38]. In contrast, all the above works on the model-independent approach stay within Merton's [38] "universally accepted" setting and analyse how the ability to trade some options at given market prices affects the outputs, namely prices and hedging strategies of other contingent claims. We amend this setup and allow expressing modelling beliefs. These are articulated in a pathwise manner. More precisely, we allow the modeller to deem certain paths impossible and exclude them from the analysis; the superhedging property is only required to hold on the remaining set of paths $\mathfrak{P}$. This is reflected in the form of the pricing-hedging duality we obtain.
Our framework was inspired by Mykland's [39] idea of incorporating a prediction set of paths into the pricing and hedging problem. On a philosophical level, we start with the "universally acceptable" setting and proceed by ruling out more and more scenarios as impossible; see also Cassese [15]. We may proceed in this way until we end up with paths supporting a unique martingale measure, e.g. a geometric Brownian motion, giving us essentially a model-specific setting. The hedging arguments are required to work for all the paths which remain under consideration, and a (strong) arbitrage would be given by a strategy which makes positive profit for all these paths. In discrete time, these ideas were recently explored by Burzoni et al. [13]. This should be contrasted with another way of interpolating between the model-independent and the model-specific, namely one which starts from a given model $\mathbb{P}$ and proceeds by adding more and more possible scenarios $\{\mathbb{P}_\alpha : \alpha \in \Lambda\}$. This naturally leads to probabilistic (quasi-sure) hedging and different notions of no-arbitrage; see Bouchard and Nutz [10].
Our approach to establishing the pricing-hedging duality involves both discretisation, as in Dolinsky and Soner [25], and a variational approach, as in Galichon et al. [30]. We first prove an "unconstrained" duality result: (3.4) states that for any derivative with bounded and uniformly continuous payoff function $G$, the minimal initial cost of setting up a portfolio consisting of cash and dynamic trading in the risky assets (some of which could be options themselves) which superhedges the payoff $G$ for every nonnegative continuous path is equal to the supremum of the expected value of $G$ over all nonnegative continuous martingale measures. This result is shown through an elaborate discretisation procedure building on ideas in [25, 26]. Subsequently, we develop a variational formulation which allows us to add statically traded options, or the specification of a prediction set $\mathfrak{P}$, via Lagrange multipliers. In some cases, this leads to "constrained" duality results, similar to the ones obtained in the works cited above, with superhedging portfolios allowed to trade statically in market options and the martingale measures required to reprice these options. In particular, Theorems 3.6 and 3.12 extend the duality obtained in [19] and in [25], respectively. However, in general, we obtain an asymptotic duality result with the dual and primal problems defined through a limiting procedure. The primal value is the limit of superhedging prices on an $\varepsilon$-neighbourhood of $\mathfrak{P}$, and the dual value is the limit of suprema of expectations of the payoff over $\eta$-(mis)calibrated models; see Definitions 2.1 and 3.11.
The paper is organised as follows. Section 2 introduces our robust framework for pricing and hedging and defines the primal (pricing) and dual (hedging) problems. Section 3 contains all the main results. First, in Sect. 3.1, we outline the unconstrained pricing-hedging duality, displayed in (3.4), and state the constrained (asymptotic) duality results under suitable compactness assumptions. This allows us in particular to treat the case of finitely many traded options. Then, in Sect. 3.2, we apply the previous results to the martingale optimal transport case. All the results except the main unconstrained duality in (3.4) are proved in Sect. 4. The unconstrained duality is stated in Theorem 5.1 and shown in Sect. 5. The proof proceeds via discretisation, of the primal problem in Sect. 5.1 and of the dual problem in Sect. 5.3, with Sect. 5.2 connecting the two via classical duality results. The proofs of two auxiliary results are relegated to the Appendix.

Traded assets
We consider a financial market with $d + 1$ primary assets: a numeraire (e.g. the money market account) and $d$ underlying assets which may be traded at any time $t \le T$. All prices are denominated in the units of the numeraire. In particular, the numeraire's price is thus normalised and equal to one. The underlying assets' price path is denoted by $S = (S^{(i)})_{i=1}^{d}$, starts at $S_0 = (1, \dots, 1)$, and is assumed to be nonnegative and continuous in time. It is thus an element of the canonical space $C([0, T], \mathbb{R}^d_+)$, which we endow with the supremum norm $\|\cdot\|$. Throughout, trading is frictionless.
We pursue a robust approach and do not postulate any probability measure which would specify the dynamics for $S$. Instead, we incorporate as inputs prices of traded derivative instruments. We assume that there is a set $\mathcal{X}$ of market-traded options with prices $\mathcal{P}(X)$, $X \in \mathcal{X}$, known at time zero. These options can be traded frictionlessly at time zero, but are not assumed to be available for trading at future times. In particular, only buy-and-hold trading in these options is allowed (static trading). An option $X \in \mathcal{X}$ is just a mapping $X : C([0, T], \mathbb{R}^d_+) \to \mathbb{R}$, measurable with respect to the $\sigma$-field generated by the coordinate process. In the sequel, we only consider continuous payoffs $X$ and often specialise further to European options, i.e., $X(S) = f(S_{T_i})$ for some $f$ and $0 < T_i \le T$.
Further, in addition to the above, we allow dynamically traded derivative assets. We consider $K$ continuously traded European options $X^{(c)}_1, \dots, X^{(c)}_K$ and assume their prices evolve continuously and are strictly positive. The $j$th option has initial price $\mathcal{P}(X^{(c)}_j)$; we normalise its payoff to $X^{(c)}_j / \mathcal{P}(X^{(c)}_j)$ so that all dynamically traded options have initial prices equal to 1. When we need to consider perturbations to options' prices, this then corresponds to a multiplicative perturbation of their payoffs; see e.g. Assumption 3.1 below. We thus have $d + K$ dynamically traded assets whose price paths belong to
\[ \Omega := \{ S \in C([0, T], \mathbb{R}^{d+K}_+) : S_0 = (1, \dots, 1) \}. \]
Their price process is given by the canonical process $S = (S_t)_{0 \le t \le T}$ on $\Omega$, i.e., $S = (S^{(1)}, \dots, S^{(d+K)})$, and we let $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$ be its natural raw filtration. The subset of paths which respect the market information about the future payoff constraints is given by
\[ \mathfrak{I} := \{ S \in \Omega : S^{(d+j)}_T = X^{(c)}_j(S^{(1)}, \dots, S^{(d)}) / \mathcal{P}(X^{(c)}_j) \text{ for all } j \le K \}. \]
We sometimes refer to $\mathfrak{I}$ as the information space. For a random variable $G$ on $\Omega$, we clearly have $G = G \circ S$, and we exploit this to write $G(S)$ instead of simply $G$ when we want to stress that $G$ is seen as a function of the assets' path. It is also convenient to think of $X \in \mathcal{X}$ as functions on $\Omega$ with $X(S) = X(S^{(1)}, \dots, S^{(d)})$. We only consider continuous $X$, i.e., $\mathcal{X} \subseteq C(\Omega, \mathbb{R})$, with $\|X\|_\infty := \sup\{|X(S)| : S \in \Omega\}$. We write $\mathcal{X} = \emptyset$ to indicate the situation with no statically traded options, and $K = 0$ to indicate when there are no dynamically traded options.

Beliefs
As argued in the introduction, we allow our agents to express modelling beliefs. These are encoded as restrictions of the path space and may come from time series analysis of past data, or idiosyncratic views about the market in the future. Put differently, we are allowed to rule out paths which we deem impossible. The paths which remain are referred to as the prediction set or beliefs. Note that such beliefs may also encode one agent's superior information about the market. As the agent rejects more and more paths, the framework's outputs, namely the robust price bounds, should get tighter and tighter. This can be seen as a way to quantify the impact of making assumptions or acquiring additional insights or information.
The choice of paths is expressed by the prediction set $\mathfrak{P} \subseteq \mathfrak{I}$. Our arguments are required to work pathwise on $\mathfrak{P}$, while paths in the complement of $\mathfrak{P}$ are ignored in our considerations. This binary way of specifying beliefs is motivated by the fact that in the end, we only see one path and hence are interested in arguments which work pathwise. Nevertheless, the approach is very comprehensive, and as $\mathfrak{P}$ changes from all paths in $\mathfrak{I}$ to the support of a given model, we essentially interpolate between model-independent and model-specific setups. It also allows incorporating the information from time series of data coherently into the option pricing setup, as no probability measure is fixed and hence no distinction between real-world and risk-neutral measures is made. The idea of such a prediction set first appeared in Mykland [39]; see also Nadtochiy and Obłój [40] and Cox et al. [16] for an extended discussion.

Trading strategies and superreplication
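We consider two types of trading: buy-and-hold strategies in options in $\mathcal{X}$ and dynamic trading in assets $S^{(i)}$, $i \le d + K$. The gains from the latter take the integral form $\int \gamma_u(S)\,dS_u$, and to define this integral pathwise, we need to impose suitable restrictions on $\gamma$. We may, following Dolinsky and Soner [25], take $\gamma = (\gamma_t)_{0 \le t \le T}$ to be an $\mathbb{F}$-progressively measurable process of finite variation and use the integration by parts formula to define
\[ \int_0^t \gamma_u(S)\,dS_u := \gamma_t \cdot S_t - \gamma_0 \cdot S_0 - \int_0^t S_u \cdot d\gamma_u, \]
where we write $a \cdot b$ to denote the usual scalar product for any $a, b \in \mathbb{R}^{d+K}$ and the last term on the right-hand side is a Stieltjes integral. However, for our duality, it is sufficient to consider a smaller class of processes. Namely, we say that $\gamma$ is admissible if it is $\mathbb{F}$-adapted, $\gamma(S)$ is a simple, i.e., right-continuous and piecewise constant, function for all $S \in \Omega$, and
\[ \int_0^t \gamma_u(S)\,dS_u \ge -M \quad \text{for all } S \in \mathfrak{I},\ t \in [0, T], \text{ for some } M > 0. \tag{2.1} \]
We denote by $\mathcal{A}$ the set of such integrands $\gamma$. To define static trading, we consider
\[ \mathrm{Lin}_N(\mathcal{X}) := \Big\{ a_0 + \sum_{i=1}^m a_i X_i : m \in \mathbb{N},\ X_i \in \mathcal{X},\ \sum_{i=0}^m |a_i| \le N \Big\}, \qquad \mathrm{Lin}(\mathcal{X}) := \bigcup_{N \ge 1} \mathrm{Lin}_N(\mathcal{X}). \tag{2.2} \]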
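For a simple integrand, both sides of the integration by parts formula reduce to finite sums, which makes the pathwise definition easy to check numerically. The following is a minimal sketch under simplifying assumptions that are not part of the text: a single asset, a path known only on a finite time grid (with linear interpolation in between), and hypothetical function names `simple_gains` and `by_parts_gains`.

```python
import numpy as np

def simple_gains(jump_times, positions, path_times, path_values):
    """Gains of a right-continuous, piecewise-constant strategy along one path.

    positions[i] is held on [jump_times[i], jump_times[i + 1]); jump_times[0]
    must be 0 and the last position is held until the final path time.
    All names here are illustrative, not notation from the text.
    """
    positions = np.asarray(positions, dtype=float)
    # Trading dates plus the terminal time T.
    grid = np.append(np.asarray(jump_times, dtype=float), path_times[-1])
    # Path values at the trading dates (linear interpolation on the sample grid).
    s = np.interp(grid, path_times, path_values)
    # Sum of position times price increment over each holding period.
    return float(np.sum(positions * np.diff(s)))

def by_parts_gains(jump_times, positions, path_times, path_values):
    """Same quantity via gamma_T*S_T - gamma_0*S_0 - (Stieltjes integral of S against gamma)."""
    positions = np.asarray(positions, dtype=float)
    # gamma only moves at its jump times, so the Stieltjes integral is a finite sum.
    s_at_jumps = np.interp(np.asarray(jump_times, dtype=float)[1:], path_times, path_values)
    stieltjes = np.sum(s_at_jumps * np.diff(positions))
    return float(positions[-1] * path_values[-1] - positions[0] * path_values[0] - stieltjes)

# Toy path sampled at four dates, started at 1.
times = np.array([0.0, 1.0, 2.0, 3.0])
path = np.array([1.0, 1.2, 0.9, 1.1])
direct = simple_gains([0.0, 1.0, 2.0], [1.0, -1.0, 2.0], times, path)
parts = by_parts_gains([0.0, 1.0, 2.0], [1.0, -1.0, 2.0], times, path)
print(direct, parts)
```

The two printed numbers agree up to floating-point rounding, which is the discrete (Abel summation) form of the integration by parts identity. An admissibility check in the spirit of (2.1) would additionally require the running gains to stay above a fixed lower bound on every path considered.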
An admissible (semi-static) trading strategy is a pair $(X, \gamma)$ with $X \in \mathrm{Lin}(\mathcal{X})$ and $\gamma \in \mathcal{A}$. We denote the class of such trading strategies by $\mathcal{A}_{\mathcal{X}}$. The cost of following $(X, \gamma) \in \mathcal{A}_{\mathcal{X}}$ is equal to the cost of setting up its static part, i.e., of buying the options at time zero, and is given by
\[ \mathcal{P}(X) := a_0 + \sum_{i=1}^m a_i \mathcal{P}(X_i) \quad \text{for } X = a_0 + \sum_{i=1}^m a_i X_i. \tag{2.3} \]
Throughout, we assume that the above defines $\mathcal{P}$ uniquely as a linear operator on $\mathrm{Lin}(\mathcal{X})$. This is in particular true if the elements in $\mathcal{X}$ are linearly independent. Further, to eliminate obvious arbitrages, we assume that for $X \in \mathcal{X}$, we have $\mathcal{P}(X) \le \|X\|_\infty$, which by (2.3) then holds for all $X \in \mathrm{Lin}(\mathcal{X})$. It follows that $\mathcal{P}$ is bounded linear, and hence continuous, on $(\mathrm{Lin}(\mathcal{X}), \|\cdot\|_\infty)$. Note that with our definitions, $\mathrm{Lin}(\emptyset) = \mathbb{R}$ and $\mathcal{P}(a) = a$ for $a \in \mathbb{R}$. We also note that dynamically traded assets can be traded statically; so our previous notation $\mathcal{P}(X^{(c)}_j)$ is consistent. Our prime interest is in understanding robust pricing and hedging of a non-liquidly traded derivative with payoff $G : \Omega \to \mathbb{R}$. Our main results consider bounded payoffs $G$, and since the setup is frictionless and there are no trading restrictions, without any loss of generality, we may consider only the superhedging price. The subhedging price follows by considering $-G$.
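The (minimal) superreplication cost or superhedging price of $G$ on $\mathfrak{P}$ is defined as
\[ V_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) := \inf\{ \mathcal{P}(X) : \exists (X, \gamma) \in \mathcal{A}_{\mathcal{X}} \text{ which superreplicates } G \text{ on } \mathfrak{P} \}. \]
Here $(X, \gamma)$ superreplicates $G$ on a set of paths if $X(S) + \int_0^T \gamma_u(S)\,dS_u \ge G(S)$ for every path $S$ in that set. The approximate superreplication cost of $G$ on $\mathfrak{P}$ is defined as
\[ \widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) := \inf\{ \mathcal{P}(X) : \exists (X, \gamma) \in \mathcal{A}_{\mathcal{X}} \text{ which superreplicates } G \text{ on } \mathfrak{P}^{\varepsilon} \text{ for some } \varepsilon > 0 \}, \]
where $\mathfrak{P}^{\varepsilon} = \{ \omega \in \mathfrak{I} : \inf_{\upsilon \in \mathfrak{P}} \|\omega - \upsilon\| \le \varepsilon \}$.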
As we shall see below, the approximate superreplicating cost appears naturally as the correct object to obtain a duality with general P and X . We note, however, that ex post, it is also a natural object from the financial point of view: It requires the superreplication to be robust with respect to an arbitrarily small perturbation of the beliefs.
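Note that by definition, $\mathfrak{I}^{\varepsilon} = \mathfrak{I}$ and consequently $\widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G) = V_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G)$. Finally, we denote by $V_{\mathfrak{I}}(G) = V_{\emptyset,\mathcal{P},\mathfrak{I}}(G)$ the superreplication cost of $G$ in the absence of constraints.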
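The ordering of the superhedging prices introduced in this section can be written out explicitly. The lines below are a short sketch, writing $\widetilde{V}$ for the approximate cost, $\mathfrak{P}$ for the prediction set, $\mathfrak{P}^{\varepsilon}$ for its $\varepsilon$-neighbourhood and $\mathfrak{I}$ for the information space; the only fact used is that a strategy which superreplicates on a larger set of paths also superreplicates on any smaller one.

```latex
\[
\mathfrak{P} \subseteq \mathfrak{P}^{\varepsilon} \subseteq \mathfrak{I}
\quad \text{for every } \varepsilon > 0
\qquad \Longrightarrow \qquad
V_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)
\;\le\; \widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)
\;\le\; V_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G).
\]
```

In words, robustness to small perturbations of the beliefs can only raise the price, and it can never cost more than hedging without any beliefs at all.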

Market models
Our aim is to relate the robust superhedging price as introduced above to the classical pricing-by-expectation arguments. To this end, we look at all classical models which reprice market-traded options.

Definition 2.2
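We denote by $\mathcal{M}$ the set of probability measures $\mathbb{P}$ on $(\Omega, \mathcal{F}_T)$ such that $S$ is an $(\mathbb{F}, \mathbb{P})$-martingale and let $\mathcal{M}_{\mathfrak{I}}$ be the set of probability measures $\mathbb{P} \in \mathcal{M}$ such that $\mathbb{P}[\mathfrak{I}] = 1$. A probability measure $\mathbb{P} \in \mathcal{M}_{\mathfrak{I}}$ is called an $(\mathcal{X}, \mathcal{P}, \mathfrak{P})$-market model or simply a calibrated model if $\mathbb{P}[\mathfrak{P}] = 1$ and $\mathbb{E}_{\mathbb{P}}[X] = \mathcal{P}(X)$ for all $X \in \mathcal{X}$. The set of such measures is denoted by $\mathcal{M}_{\mathcal{X},\mathcal{P},\mathfrak{P}}$. More generally, a probability measure $\mathbb{P} \in \mathcal{M}_{\mathfrak{I}}$ is called an $\eta$-$(\mathcal{X}, \mathcal{P}, \mathfrak{P})$-market model if $\mathbb{P}[\mathfrak{P}^{\eta}] > 1 - \eta$ and $|\mathbb{E}_{\mathbb{P}}[X] - \mathcal{P}(X)| < \eta$ for all $X \in \mathcal{X}$. The set of such measures is denoted by $\mathcal{M}^{\eta}_{\mathcal{X},\mathcal{P},\mathfrak{P}}$.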
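Any $\mathbb{P} \in \mathcal{M}_{\mathcal{X},\mathcal{P},\mathfrak{P}}$ provides us with a feasible no-arbitrage price $\mathbb{E}_{\mathbb{P}}[G]$ for a derivative with payoff $G$. The robust price for $G$ is given as
\[ P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) := \sup_{\mathbb{P} \in \mathcal{M}_{\mathcal{X},\mathcal{P},\mathfrak{P}}} \mathbb{E}_{\mathbb{P}}[G], \tag{2.4} \]
where throughout, the expectation is defined with the convention that $\infty - \infty = -\infty$.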
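In the cases of particular interest, $(\mathcal{X}, \mathcal{P})$ uniquely determines the marginal distributions of $S$ at given maturities, and $P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$ is then the value of the corresponding martingale optimal transport problem. We often use this terminology, even in the case of an arbitrary $\mathcal{X}$. Finally, in the special case where there are no constraints, i.e., $\mathcal{X} = \emptyset$ and $\mathfrak{P} = \mathfrak{I}$, we write $P_{\mathfrak{I}}(G)$ to represent the corresponding maximal modelling value, i.e.,
\[ P_{\mathfrak{I}}(G) := \sup_{\mathbb{P} \in \mathcal{M}_{\mathfrak{I}}} \mathbb{E}_{\mathbb{P}}[G]. \]
We shall see below that with a general $\mathcal{X}$ and $\mathfrak{P}$, we do not obtain a duality using $P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$ in (2.4), but rather have to consider its approximate value given as
\[ \widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) := \lim_{\eta \downarrow 0} \sup_{\mathbb{P} \in \mathcal{M}^{\eta}_{\mathcal{X},\mathcal{P},\mathfrak{P}}} \mathbb{E}_{\mathbb{P}}[G]. \]
Ex post, and similarly to the approximate superhedging above, this may be seen as a natural robust object to consider: instead of requiring a perfect calibration, the concept of $\eta$-market model allows a controlled degree of mis-calibration. This seems practically relevant since the market prices $\mathcal{P}$ are an idealised concept obtained e.g. via averaging of bid-ask spreads.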
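It may help to see why the approximate value is well defined and dominates the exact one. The sketch below takes the approximate value to be the limit, as $\eta \downarrow 0$, of suprema over $\eta$-market models (as described in the introduction) and uses only Definition 2.2; the symbol $\eta'$ is introduced here for illustration.

```latex
% A calibrated model is an eta-market model for every eta > 0, and a smaller
% tolerance is more restrictive (note P[P^{eta'}] >= P[P^{eta}] > 1 - eta > 1 - eta'):
\[
\mathcal{M}_{\mathcal{X},\mathcal{P},\mathfrak{P}}
\;\subseteq\; \mathcal{M}^{\eta}_{\mathcal{X},\mathcal{P},\mathfrak{P}}
\;\subseteq\; \mathcal{M}^{\eta'}_{\mathcal{X},\mathcal{P},\mathfrak{P}}
\qquad \text{for } 0 < \eta < \eta'.
\]
% Hence the supremum is nonincreasing as eta decreases, so the limit exists and
\[
P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)
\;\le\; \inf_{\eta > 0} \sup_{\mathbb{P} \in \mathcal{M}^{\eta}_{\mathcal{X},\mathcal{P},\mathfrak{P}}} \mathbb{E}_{\mathbb{P}}[G]
\;=\; \lim_{\eta \downarrow 0} \sup_{\mathbb{P} \in \mathcal{M}^{\eta}_{\mathcal{X},\mathcal{P},\mathfrak{P}}} \mathbb{E}_{\mathbb{P}}[G]
\;=\; \widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G).
\]
```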

Main results
Our prime interest, as discussed in the introduction, is in establishing a general robust pricing-hedging duality. Given a non-traded derivative with payoff $G$, we have two candidate robust prices for it. The first one, $V_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$, is obtained through pricing-by-hedging arguments. The second one, $P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$, is obtained by pricing-via-expectation arguments. In a classical setting, by fundamental results, see e.g. Delbaen and Schachermayer [20, Theorem 5.7], the analogous two prices are equal.
Within the present pathwise robust approach, the pricing-hedging duality was obtained for specific payoffs $G$ in the literature linking the robust approach with the Skorokhod embedding problem; see Hobson [33] or Obłój [42] for a discussion. Subsequently, an abstract result was established in Dolinsky and Soner [25]. For $d = 1$, $K = 0$, $\mathfrak{P} = \mathfrak{I} = \Omega$ and $\mathcal{X}$ the set of all call (or put) options with a common maturity $T$ and with $\mathcal{P}(X) = \int_0^\infty X(x)\,\mu(dx)$ for all $X \in \mathcal{X}$, where $\mu$ is a probability measure on $\mathbb{R}_+$ with mean equal to 1, they showed that $V_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G) = P_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G)$ for a "strongly continuous" class of bounded $G$.
The result was extended to unbounded claims by broadening the class of admissible strategies and imposing a technical assumption on $\mu$. We extend this duality to a much more general setting of abstract $\mathcal{X}$, possibly involving options with multiple maturities, a multidimensional setting, and with an arbitrary prediction set $\mathfrak{P}$. However, in this generality, the duality may only hold between approximate values. We first give the statements, illustrated with examples; all the proofs are postponed to Sect. 4. Note that, for any Borel $G : \Omega \to \mathbb{R}$, the inequality
\[ V_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) \ge P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) \tag{3.1} \]
is true as long as there is at least one $\mathbb{P} \in \mathcal{M}_{\mathcal{X},\mathcal{P},\mathfrak{P}}$ and at least one $(X, \gamma) \in \mathcal{A}_{\mathcal{X}}$ which superreplicates $G$ on $\mathfrak{P}$. Indeed, since $\gamma$ is progressively measurable, the integral $\int_0^{\cdot} \gamma_u(S)\,dS_u$, defined pathwise via integration by parts, agrees a.s. with the stochastic integral under $\mathbb{P}$. Then by (2.1), the stochastic integral is a $\mathbb{P}$-supermartingale and hence $\mathbb{E}_{\mathbb{P}}[\int_0^T \gamma_u(S)\,dS_u] \le 0$. This in turn implies that $\mathbb{E}_{\mathbb{P}}[G] \le \mathcal{P}(X)$. The result follows since $(X, \gamma)$ and $\mathbb{P}$ were arbitrary. The converse inequality, however, is very involved. The fundamental difficulty lies in the fact that even in the martingale optimal transport setting of [25], the set $\mathcal{M}_{\mathcal{X},\mathcal{P},\mathfrak{I}}$ is not compact. This is in contrast to the discrete-time case; see [5]. In our general setting, the converse inequality to (3.1) may fail, see Example 3.7 below, making it necessary to look at the duality between the approximate values.
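The weak duality argument just given can be laid out as a chain of inequalities. This is a sketch only: it reads "superreplicates $G$ on $\mathfrak{P}$" as the pathwise inequality in the first line, takes $\mathbb{P}$ to be any calibrated model and $(X, \gamma)$ any admissible superreplicating strategy, and uses that calibration on $\mathcal{X}$ extends to $\mathrm{Lin}(\mathcal{X})$ by linearity of both the expectation and the pricing operator.

```latex
\begin{align*}
G(S) &\le X(S) + \int_0^T \gamma_u(S)\,dS_u
  && \text{for all } S \in \mathfrak{P}, \\
\mathbb{E}_{\mathbb{P}}[G]
  &\le \mathbb{E}_{\mathbb{P}}[X]
     + \mathbb{E}_{\mathbb{P}}\Big[\int_0^T \gamma_u(S)\,dS_u\Big]
  && \text{since } \mathbb{P}[\mathfrak{P}] = 1, \\
  &\le \mathcal{P}(X) + 0
  && \text{by calibration and the supermartingale property.}
\end{align*}
% Taking the supremum over calibrated models on the left and the infimum over
% superreplicating strategies on the right gives the inequality (3.1).
```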

General duality
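We start with a general duality between the approximate values $\widetilde{V}$, $\widetilde{P}$. But first, we give our standing assumption which states that the prices of dynamically traded options are not "on the boundary of the no-arbitrage region", i.e., calibrated martingale measures exist under arbitrarily small perturbation of the initial prices.
Theorem 3.2 Assume that $\mathfrak{P}$ is a measurable subset of $\mathfrak{I}$, Assumption 3.1 holds, all $X \in \mathcal{X}$ are uniformly continuous and bounded, and $\mathcal{M}^{\eta}_{\mathcal{X},\mathcal{P},\mathfrak{P}} \ne \emptyset$ for any $\eta > 0$. Then for any uniformly continuous and bounded $G : \Omega \to \mathbb{R}$, we have
\[ \widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) \ge \widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G), \tag{3.2} \]
and if $\mathrm{Lin}_1(\mathcal{X})$ defined in (2.2) is a compact subset of $(C(\Omega, \mathbb{R}), \|\cdot\|_\infty)$, then equality holds, i.e.,
\[ \widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) = \widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G). \tag{3.3} \]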
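Example 3.4 Suppose that $\mathcal{X}$ is finite, say $\mathcal{X} = \{X_1, \dots, X_m\}$, where the $X_i$ are bounded and uniformly continuous. In this case, $\mathrm{Lin}_1(\mathcal{X})$ is a convex and compact subset of $C(\Omega, \mathbb{R})$. Therefore, if $\mathcal{M}^{\eta}_{\mathcal{X},\mathcal{P},\mathfrak{P}} \ne \emptyset$ for any $\eta > 0$, we can apply Theorem 3.2 to conclude that $\widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) = \widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$.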
Let us outline the proof of Theorem 3.2. The first inequality in (3.2) is relatively easy to obtain, and the main effort is in establishing the converse inequality which yields (3.3). This is done in two steps. First, we consider the case without constraints, i.e., $\mathcal{X} = \emptyset$ and $\mathfrak{P} = \mathfrak{I}$. The approximate values $\widetilde{V}$, $\widetilde{P}$ then reduce to $V$ and $P$ respectively; so we need to show that for any bounded and uniformly continuous $G : \Omega \to \mathbb{R}$, we have
\[ V_{\mathfrak{I}}(G) = P_{\mathfrak{I}}(G). \tag{3.4} \]
This result is a special case of Theorem 5.1, which is shown in Sects. 5.1 and 5.3. Our proof proceeds through discretisation of both the primal and the dual problem and is inspired by the methods in [25, 26] but involves significant technical differences which are necessary to obtain our more general results. The first key difference, when comparing with [25], is that the discretisation therein entangles discretisation of the dynamic hedging part and static hedging part, while we develop a "clean" decoupled discretisation of the dynamic hedging part only. Second, we have to improve the discretisation to deal with payoff functions which are uniformly continuous. This is crucial for the subsequent use of a variational approach to generalise the pricing-hedging duality results to include static hedging in options with different maturities. The time-continuity assumptions on the payoff made in [25] are much stronger, and their results could not be applied directly in our framework. We note also that in a quasi-sure setting, an analogue of (3.4) was obtained in Possamaï et al. [44] and earlier papers, as discussed therein. However, while similar in spirit, there is no immediate link between our results or proofs and those in [44]. Here, we consider a comparatively smaller set of admissible trading strategies and require a pathwise superhedging property. Consequently, we also need to impose stronger regularity constraints on $G$.
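In the second step of the proof, we use a variational approach combined with a minimax argument to obtain duality under all the constraints. Specifically, in analogy to e.g. Proposition 5.2 in Henry-Labordère et al. [32], for any uniformly continuous and bounded $G : \Omega \to \mathbb{R}$, we can write An application of the minimax theorem then yields The final, and somewhat technical, argument is to show that the above is dominated by $\widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$.
We end this section with a study of the relation between $V$, $P$ and their approximate values $\widetilde{V}$, $\widetilde{P}$. As already noted, by definition, $\widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) \ge V_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$ and $\widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) \ge P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$. Therefore, when $\widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) = \widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$, the duality $V_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) = P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$ follows if we can show that $\widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G) = P_{\mathcal{X},\mathcal{P},\mathfrak{P}}(G)$. We establish this equality for an important family of market setups, but also provide examples when it fails.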
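The step from the approximate duality to the exact one is a sandwich argument, sketched below with the subscripts $\mathcal{X}, \mathcal{P}, \mathfrak{P}$ suppressed and a tilde marking the approximate values. Each relation is labelled with the fact it uses; the third one is the equality that has to be verified case by case.

```latex
\[
P(G)
\;\underset{(3.1)}{\le}\; V(G)
\;\underset{\text{def.}}{\le}\; \widetilde{V}(G)
\;\underset{\text{Thm 3.2}}{=}\; \widetilde{P}(G)
\;\underset{\text{to be shown}}{=}\; P(G).
\]
% The chain starts and ends with P(G), so every inequality is an equality
% and in particular V(G) = P(G).
```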
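Consider first the case with no specific beliefs, $\mathfrak{P} = \mathfrak{I}$, and finitely many traded put options with maturities $0 < T_1 < \dots < T_n = T$, i.e.,
\[ \mathcal{X} = \{ (K^{(i)}_{k,j} - S^{(i)}_{T_j})^+ : 1 \le i \le d,\ 1 \le j \le n,\ 1 \le k \le m(i, j) \}, \tag{3.6} \]
where $0 < K^{(i)}_{k,j} < K^{(i)}_{k',j}$ for any $k < k'$ and $m(i, j) \in \mathbb{N}$. To simplify the notation, we write In analogy to Assumption 3.1, we need to impose that the put prices are in the interior of the no-arbitrage region.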
Assumption 3.5 Market put prices are such that there exists an $\varepsilon > 0$ such that for
Theorem 3.6 Let $\mathcal{X}$ be given by (3.6) and assume that the market prices satisfy Assumptions 3.1 and 3.5. Then for any uniformly continuous and bounded $G : \Omega \to \mathbb{R}$, we have $V_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G) = P_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G)$.
The above result establishes a general robust pricing-hedging duality when finitely many put options are traded. It extends in many ways the duality obtained in Davis et al. [19] for $d = n = 1$ and $K = 0$. Note that in general, $V_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G) = \widetilde{V}_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G)$; so it follows from Example 3.4 that we also have $\widetilde{P}_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G) = P_{\mathcal{X},\mathcal{P},\mathfrak{I}}(G)$ in Theorem 3.6. These equalities may still hold, but may also fail, when nontrivial beliefs are specified. We now present three examples to highlight various possible scenarios. In the first two examples, for different reasons, the pricing-hedging duality fails, i.e., $V_{\mathfrak{P}} > P_{\mathfrak{P}}$, while the approximate duality (3.3) holds. In the last example, all quantities are equal.
On the other hand, it is straightforward to see that $P_{\mathfrak{P}}(G) = 0$. By letting $\mathcal{M}^{\mathrm{loc}}_{\mathfrak{P}}$ be the set of $\mathbb{P}$ such that $S$ is a $\mathbb{P}$-local martingale and $\mathbb{P}[\mathfrak{P}] = 1$, we further have
Example 3.8 In this example, we consider $\mathfrak{P}$ corresponding to the Black-Scholes model. For simplicity, consider the case without any traded options, i.e., $K = 0$, $\mathcal{X} = \emptyset$, $d = 1$, and let
\[ \mathfrak{P} = \{ S \in \Omega : S \text{ admits a quadratic variation and } d\langle S \rangle_t = \sigma^2 S_t^2\,dt,\ 0 \le t \le T \}. \]
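Then $\mathcal{M}_{\mathfrak{P}} = \{\mathbb{P}_\sigma\}$, where $S$ is a geometric Brownian motion with constant volatility $\sigma$ under $\mathbb{P}_\sigma$. The duality in Theorem 3.2 then gives that for any bounded and uniformly continuous $G$,
\[ \widetilde{V}_{\mathfrak{P}}(G) = \widetilde{P}_{\mathfrak{P}}(G). \]
However, in this case, $\mathbb{P}_\sigma$ has full support on $\Omega$ so that $\mathfrak{P}^{\varepsilon} = \Omega$ and $\mathcal{M}^{\varepsilon}_{\mathfrak{P}} = \mathcal{M}$ for any $\varepsilon > 0$. The above then boils down to the case with no beliefs, and we have
\[ \widetilde{P}_{\mathfrak{P}}(G) = P_{\mathfrak{I}}(G) \ge P_{\mathfrak{P}}(G) = \mathbb{E}_{\mathbb{P}_\sigma}[G], \]
where for most $G$ the inequality is strict.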

Example 3.9
Consider again the case with no traded options, $K = 0$ and $\mathcal{X} = \emptyset$, and let Given a bounded and uniformly continuous payoff function $G$, consider the duality in Theorem 3.2. For each $N \in \mathbb{N}$, we pick $\mathbb{P}^{(N)} \in \mathcal{M}^{1/N}_{\mathfrak{P}}$ such that Let $\tau$ be the first hitting time of $b + 1/N$ by $S$ and define $\tilde{S}^{(N)}$ by

Martingale optimal transport duality
We focus now on the case when (X , P) determines the marginal distributions of S (i) T j for i ≤ d and given maturities 0 < T 1 < · · · < T n = T . For concreteness, let us consider the case when put options are traded, i.e., Arbitrage considerations, see e.g. Cox and Obłój [17] and Cox et al. [16], show that absence of (a weak type of) arbitrage is equivalent to M X ,P,I ≠ ∅. Note that the latter is equivalent to market prices P being encoded by a vector μ of probability measures (μ (i) j ), where for each i the measures μ (i) 1 , . . . , μ (i) n have finite first moments, mean 1 and increase in convex order (written as $\mu^{(i)}_1 \preceq \dots \preceq \mu^{(i)}_n$), i.e., $\int \phi(x)\,\mu^{(i)}_1(dx) \le \dots \le \int \phi(x)\,\mu^{(i)}_n(dx)$ for any convex function φ : R + → R. In fact, as noted already by Breeden and Litzenberger [11], the μ (i) j are defined by We may think of (μ (i) j ) and P as the modelling inputs. The set of calibrated market models M X ,P,P is simply the set of probability measures P ∈ M such that S (i) T j is distributed according to μ (i) j and P[P] = 1. Accordingly, we write M μ,P = M X ,P,P and P μ,P (G) = P X ,P,P (G).

Remark 3.10 It follows, see Strassen [49], that M μ,I is nonempty if and only if μ (i) 1 , . . . , μ (i) n have finite first moments, mean 1 and increase in convex order, for any i = 1, . . . , d. However, in general, the additional constraints associated with a nontrivial P ⊊ I are much harder to understand.
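The Breeden-Litzenberger observation can be illustrated on a strike grid: the second difference of put prices in the strike recovers the mass that the implied marginal places at each strike. The sketch below assumes a marginal supported on a uniform strike grid; the helper names are hypothetical. The last function checks the ordering of put prices across two maturities, which for marginals with equal means is a necessary condition for the convex order when tested on finitely many strikes.

```python
def put_price(strike, support, probs):
    """Undiscounted put price E[(K - S_T)^+] for a discrete marginal."""
    return sum(p * max(strike - x, 0.0) for x, p in zip(support, probs))

def implied_masses(strikes, puts):
    """Second finite difference of put prices on a uniform strike grid.

    Returns the mass placed at each interior strike.
    """
    h = strikes[1] - strikes[0]
    return [(puts[j + 1] - 2.0 * puts[j] + puts[j - 1]) / h
            for j in range(1, len(strikes) - 1)]

def puts_ordered(puts_early, puts_late, tol=1e-12):
    """Necessary condition for convex order: earlier puts never cost more."""
    return all(a <= b + tol for a, b in zip(puts_early, puts_late))

strikes = [0.0, 0.5, 1.0, 1.5, 2.0]
support = [0.5, 1.0, 1.5]
mu_1 = [0.25, 0.5, 0.25]   # mean 1
mu_2 = [0.5, 0.0, 0.5]     # mean 1, more spread out

puts_1 = [put_price(k, support, mu_1) for k in strikes]
puts_2 = [put_price(k, support, mu_2) for k in strikes]

print(implied_masses(strikes, puts_1))  # [0.25, 0.5, 0.25]
print(puts_ordered(puts_1, puts_2))     # True
```

With only finitely many strikes observed, as in Theorem 3.6, the marginal is not pinned down; the recovery above is exact only because the example marginal lives on the strike grid.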
In this context, we can improve Theorem 3.2 and narrow down the class of approximate market models by requiring that they match exactly the marginal distributions at the last maturity.
X ,P,P for a suitable choice 5 of (η) converging to zero as η → 0. It follows that P μ,P (G) ≤ P̃ μ,P (G) ≤ P̃ X ,P,P (G). The following result extends and sharpens the duality obtained in Theorem 3.2 to the current setting.

Theorem 3.12 Under Assumption 3.1, let P be a measurable subset of I, X given by (3.7) and P such that for any η > 0, M η μ,P ≠ ∅, where μ is defined via (3.8). Then for any uniformly continuous and bounded G, the robust pricing-hedging duality holds between the approximate values, i.e.,

Remark 3.13 Theorem 3.12 readily extends to unbounded exotic options, e.g. lookback options, following the approach of Dolinsky and Soner [25]. Fix p > 1 and relax the admissibility in (2.1) to Likewise, assume that all μ (i) j admit a finite pth moment and allow static trading in European options with payoffs which grow at most as |x|^p . Then the duality in Theorem 3.12 extends to uniformly continuous G with |G(S)| ≤ const (1 + sup 0≤t≤T |S t |^p ).
In the case of one maturity, n = 1, we have P μ μ μ,I (G) = P μ μ μ,I (G). In particular, Theorem 3.12 extends the duality of Dolinsky and Soner [25] by allowing arbitrary dimension d. It is also possible to consider a multidimensional extension where the whole marginal distribution L P (S T ) is fixed, or equivalently X is large enough, e.g. dense in the Lipschitz-continuous functions on R d . For n = 1 and P = I, such an extension follows via Theorem 3. Assumption 3.14 G is bounded and uniformly continuous, and such that there exists a constant L > 0 such that for all R d+K + -valued functions υ,υ of the form Note that Assumption 3.14 is close in spirit to Assumption 2.1 in [25], but is weaker and, unlike the latter, is satisfied by European options with intermediate maturities T 1 , . . . , T n−1 . Next we introduce a particular class of prediction sets. Our definition is closely related to time-invariant sets in Vovk [51], also recently used in Beiglböck et al. [4], but slightly different as we work with all continuous functions and also require that maturities T i are preserved.

Auxiliary results and proofs
We present now the proofs of all the results in Sect. 3. As noted before, we use the unconstrained duality (3.4) which follows from Theorem 5.1 stated and proved in Sect. 5 below. We start by describing a discretisation of a continuous path, often referred to as the "Lebesgue discretisation", a term we also use. The discretisation is a crucial tool in Sect. 5.1, but is also employed in the proofs of Lemmas 4.4, 4.3, 4.5 and Theorem 3.16 below.
We write m (N ) for the measurable map Ω ∋ S ↦ m (N ) (S) and note that by definition, m (N ) = m (N ) (S).
Following the observation that m (N ) (S) < ∞ for all S ∈ Ω, we say that the sequence of stopping times 0 = τ Similar partitions were studied previously; see e.g. Bichteler [7] and Vovk [51]. Their main appearances have been as tools to build a pathwise version of the Itô integral. They can also be interpreted, from a financial point of view, as candidate times for rebalancing portfolio holdings; see Whalley and Wilmott [52].
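The Lebesgue discretisation can be sketched on a sampled one-dimensional path. Definition 4.1 is not reproduced in this section, so the sketch assumes the usual form: τ_0 = 0, each subsequent τ_k is the first time the path has moved by 2^{-N} from its value at τ_{k-1}, the final time T is always included, and m^{(N)} is the number of steps. On a time grid, the hitting condition is tested with ≥ rather than equality.

```python
def lebesgue_discretisation(times, path, N):
    """Stopping times at which the sampled path has moved by 2**-N.

    Returns the list of times tau_0 < tau_1 < ... < tau_m = T and m.
    """
    step = 2.0 ** (-N)
    taus = [times[0]]
    anchor = path[0]              # value of the path at the last stopping time
    for t, s in zip(times[1:], path[1:]):
        if abs(s - anchor) >= step:
            taus.append(t)
            anchor = s
    if taus[-1] != times[-1]:     # the terminal time is always the last one
        taus.append(times[-1])
    return taus, len(taus) - 1

times = [0, 1, 2, 3, 4]
path = [1.0, 1.25, 1.5, 1.25, 1.25]

print(lebesgue_discretisation(times, path, N=2))  # ([0, 1, 2, 3, 4], 4)
print(lebesgue_discretisation(times, path, N=1))  # ([0, 2, 4], 2)
```

The example shows the role of N: a finer threshold produces more rebalancing times, and m^{(N)} is finite for every continuous path on a compact interval.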
has at least four elements, which implies that there exists at least one j < m (N ) and hence for any weakly converging sequence of probability measures P (k) → P and any bounded nonincreasing function φ : N → R, (4.1)

Proof of Theorem 3.2 and Remark 3.3
To establish (3.2), we consider an (X, γ ) ∈ A X that superreplicates G on P^ε for some ε > 0. Since X is bounded and γ is admissible, we can find a suitable M > 0 such that Since γ is progressively measurable, the integral $\int_0^{\cdot} \gamma_u(S)\,dS_u$, defined pathwise via integration by parts, agrees a.s. with the stochastic integral under any P (N ) . Then by (2.1), the stochastic integral is a P (N ) -supermartingale and hence Together with (4.3), this yields P(X) ≥ P̃ X ,P,P (G) and (3.2) follows because (X, γ ) ∈ A X was arbitrary. To establish (3.3), we show the converse inequality in three steps.
Step 1: Duality without constraints. This is the crucial and also the most technical part of the proof which we defer to Sect. 5. The duality in (3.4) follows as a special case of Theorem 5.1, which is stated and proved in Sect. 5.
Step 2: Calculus of variations approach. Fix G. Note that any (X, γ ) that superreplicates G − Nλ P on I also superreplicates G − N/M on P^{1/M} . It follows that for any fixed M, N ≥ 1, Taking the infimum over M and then over N , we obtain On the other hand, given any (X, γ ) ∈ A X and ε > 0 such that (X, γ ) superreplicates G on P^ε , by the admissibility of (X, γ ) and boundedness of X and G, if N > 0 is sufficiently large, then It follows that we have equality in (4.4). We also have where the last equality is justified by Theorem 5.1 as λ P and X are bounded and uniformly continuous. Combining the above with (4.4), we conclude that (3.5) holds.
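The first observation in Step 2 can be spelled out as follows. This is a sketch which assumes, by analogy with the definition of λ_I recalled in Sect. 5, that λ_P(ω) = inf_{υ∈P} ‖ω − υ‖ ∧ 1 and that P^ε consists of the paths in I within sup-norm distance ε of P.

```latex
% Assumed definitions (not restated in this section):
%   \lambda_{\mathcal P}(\omega) := \inf_{\upsilon \in \mathcal P} \|\omega - \upsilon\| \wedge 1,
%   \mathcal P^{\varepsilon} := \{\omega \in \mathcal I : \inf_{\upsilon \in \mathcal P} \|\omega - \upsilon\| \le \varepsilon\}.
\begin{align*}
S \in \mathcal P^{1/M}
  &\;\Longrightarrow\; \lambda_{\mathcal P}(S) \le \tfrac{1}{M}, \\
G(S) - \tfrac{N}{M}
  &\;\le\; G(S) - N \lambda_{\mathcal P}(S)
   \;\le\; X(S) + \int_0^T \gamma_u(S)\, dS_u .
\end{align*}
```

The second inequality is the superreplication of G − Nλ_P on I, so the same portfolio superreplicates G − N/M on P^{1/M}, at the cost of the constant N/M.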
Step 3: Application of the minimax theorem. We rewrite (3.5) and apply a minimax argument to get In the former case, since ±NX * ∈ Lin N (X ), we obtain where, without loss of generality, we assume E P [X * ] < P(X * ). In the latter case, On the other hand, since (3.2) implies V X ,P,P (0) = 0, we have and hence we may restrict to measures in M η N X ,P,P in (4.5). Dropping nonpositive terms, we obtain (4.6) which completes the proof of Theorem 3.2.
For Remark 3.3, it remains to argue that Theorem 3.2 remains true when we restrict to Brownian martingales. Specifically, given T and a probability space 0≤t≤T is the P W -completion of the natural filtration of W , consider for some F W -progressively measurable process α with values in the (d + K) ×d matrices such that the above vector integral is well defined. Let M I be the family of all P ∈ M I which admit such a representation. From (3.5), as argued above, and Remark 5.2 below, we have Then by following the same argument as in Step 3 above, we can show that we have M

Proof of Theorem 3.6
The set X is finite and as discussed in Example 3.4, we can apply Theorem 3.2. Together with V X ,P,I = V X ,P,I , this yields V X ,P,I (G) = P X ,P,I (G) = lim Now for every positive integer N , we pick P (N ) ∈ M 1/N X ,P,I such that We let Then it follows from Assumption 3.5 that when N is large, there exists aP (N ) ∈ M I such thatp Now we consider Q : and hence Q ∈ M X ,P,I . In addition, Therefore, we have and taking limits as N → ∞ yields P X ,P,I (G) ≤ P X ,P,I (G). Together with (4.8) and (3.1), this completes the proof.

Proof of Theorem 3.12
From Theorem 3.2, we know that Ṽ X ,P,P (G) ≥ P̃ X ,P,P (G) and by definition, P̃ X ,P,P (G) ≥ P̃ μ,P (G). Hence, to establish Theorem 3.12, it suffices to show that This is a special case (α = β = 0) of Proposition 4.5 below, which is a crucial technical result also used to prove Theorem 3.16 below. We recall that We note that these sets are different from the main objects introduced in Definition 3.11 and are only needed for some technical arguments below. We start with two lemmas leading to Proposition 4.5.     7) and P such that for any η > 0, M η μ,P ≠ ∅, where μ is defined via (3.8). Then for any uniformly continuous and bounded G and α, β ≥ 0, D ∈ N, Note that given any f ∈ C b (R + , R), ε > 0 and a measure μ on R + with a finite first moment, there is some u : R + → R of the form $u(s) = a_0 + \sum_{i=1}^{n} a_i (\kappa_i - s)^+$ such that u ≥ f and $\int (u - f)\,d\mu < \varepsilon$. This gives the first inequality in the following: It follows that P ≤ 3/M 2 and hence M Let π (M) n be the law of (S With this and using M 1/M Z M ,P,I ⊆ M μ n ,I,1/M , we may continue (4.10) by writing where the last inequality follows from Lemma 4.4 since by analogous arguments to the ones above, we may argue that M

Proof of Theorem 3.16
We first make two simple observations.

Remark 4.6 If P is a nonempty closed (with respect to the sup-norm) subset of Ω, then where P̄^ε is the closure of P^ε .

Lemma 4.7 If P is time-invariant, then for every ε > 0, P^ε is also time-invariant.
Proof This follows easily by observing that for two paths S, S̃ ∈ Ω and any nondecreasing continuous function f : [0, T n ] → [0, T n ] with f (0) = 0 and f (T i ) = T i for any i = 1, . . . , n, we have $\|S - \tilde S\| = \|S_{f(\cdot)} - \tilde S_{f(\cdot)}\|$.
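The identity used in the proof rests on the surjectivity of the time change. A short derivation, assuming that ‖·‖ denotes the sup norm on [0, T_n] and writing S̃ for the second path:

```latex
% f is continuous and nondecreasing with f(0) = 0 and f(T_n) = T_n,
% hence f maps [0, T_n] onto [0, T_n] by the intermediate value theorem.
\begin{align*}
\| S_{f(\cdot)} - \tilde S_{f(\cdot)} \|
  &= \sup_{t \in [0, T_n]} | S_{f(t)} - \tilde S_{f(t)} | \\
  &= \sup_{u \in f([0, T_n])} | S_u - \tilde S_u |
   = \sup_{u \in [0, T_n]} | S_u - \tilde S_u |
   = \| S - \tilde S \| .
\end{align*}
```

In particular, if a path lies within ε of some path in a time-invariant set, then its time change lies within ε of the time change of that path, which is again in the set.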
We now proceed with the proof of Theorem 3.16. Recall that the inequalities Ṽ X ,P,P (G) ≥ V X ,P,P (G) ≥ P μ,P (G) hold in general. In addition, according to Theorem 3.12, Ṽ X ,P,P (G) = P̃ μ,P (G). Therefore, we only need to show that P̃ μ,P (G) = P μ,P (G). Our proof of this equality is divided into six steps. First, using Proposition 4.5, we argue that it suffices to consider measures with "good control" on the expectation of m (D) (S). Next, we perform three time changes within each trading period [T i , T i+1 ]. The resulting time change of S, denoted byS, allows a "good control" over its quadratic variation process. At the same time, we keep G(S) and G(S) "close", and given a measure P ∈ M η μ,P with "good control" on E P [m (D) (S)], since P^η is time-invariant, the law of the time-changed price processS remains an element of M η μ,P . Then in Step 5, given a sequence of models with improved calibration precisions, we show tightness of the quadratic variation process of the time-changed price processS under these measures. This then leads to tightness of the image measures viaS. In Step 6, we deduce the duality P̃ μ,P (G) = P μ,P (G) from tightness and conclude.
Recall that X is given by (3.7). Let  8), of the distribution of S T n . Note that by definition, P μ n ,I = P μ n ,I and that since the μ (i) n have finite pth moment, we have P μ n ,I ( S ) < ∞.
Step 1: Reducing to measures P with good control on E P [ m (D) (S)]. Let G satisfy Assumption 3.14. Choose κ ≥ 1 such that G ≤ κ and let f e : R d+K + → R + be a modulus of continuity of G, i.e.,

|G(ω) − G(υ)| ≤ f e (|ω − υ|)
for any ω, υ ∈ Ω with lim x→0 f e (x) = 0. Fix D ∈ N. Consider X D : Ω → R given by where the τ (D) i and m (D) are defined in Definition 4.1. It follows from the proof of Lemma 5.4 in Dolinsky and Soner [26] that there exists a γ ∈ A such that Hence V X ,P,P (X D ) ≤ 3(d +K) V X ,P,P ( S ∧(κ 2 2 5D +1)). Reducing X to options with maturity T n and considering I instead of P only increases the superhedging price, and therefore which is finite, and where the last inequality follows from Theorem 3.12 applied to the case of a single maturity. It now follows from sublinearity of V that where c 2 is a constant independent of D and the last inequality follows from Proposition 4.5. Next we denote by M κ I the set of P ∈ M I such that We notice that if P / ∈ M κ I , then while by the inequalities in (4.11) above, for D sufficiently large, It follows that in (4.11), it suffices to consider P ∈ M κ I ∩ M 1/N μ μ μ,P , which in particular is nonempty.
and then a process (S t ) t∈[0,T n ] by a time change of S via f , i.e.,S t = S f (t) . Note that f (T i − 1/D) = T i , as required. We argue below that (3.9) implies that we have Also, being a time change of S, the process (S t ) t∈[0,T n ] is a martingale (in the time-changed filtration). It follows that its distribution P (N ) • (S t ) −1 is an element of M 1/N μ μ μ,P as P 1/N is time-invariant, by Lemma 4.7.
Step 3: Second time change: introducing a lower bound on the time step. The second time change ensures that we can bound from below the difference between any two consecutive stopping times in the Lebesgue discretisation in Definition 4.1. We want to do this by adding a constancy interval of length δ to each step of the discretisation. As we have squeezed the paths above, we have length 1/D to use up while still keeping the time changes to within the intervals [T i−1 , T i ]. Taking suitably small δ, this allows us, with high probability, to alter all the steps in the Lebesgue discretisation.
For ease of notation, it is helpful to rename the elements of the set Further, since the processS is always constant on We are now ready to define the time-changed processŠ by Observe thatŠ is a (continuous) time change ofS andS T i =Š T i = S T i for i ≤ n. As before, this implies thatŠ remains a martingale and P (N ) • (Š t ) −1 ∈ M 1/N μ μ μ,P . We argue now that |G(S)−G(Š)| is small for large D. To this end, we approximate a path S with a piecewise constant functionF (D) (S) which jump at the times τ (D) i,j . A similar discretisation is used later in Sect. 5; see (5.2). For S ∈ Ω, consider Then the time-continuity property of G in (3.9) ensures that (4.14) Similarly, for any S ∈ Ω with m (D−8) n (S(S)) = m (D−8) n (S) ≤ Θ, again by (3.9), we have (4.15) when D is sufficiently large. From (4.12), the Markov inequality gives and hence by (4.13), Furthermore, by (4.15) and (4.16), (4.17) Step 4: Third time change: controlling the increments of the quadratic variation. We say that ω ∈ C([0, T ], R) admits a quadratic variation if exists and is a continuous function for t ∈ [0, T ]. In this case, we denote this limit with ω and otherwise we let ω be zero. In addition, for S ∈ Ω, we say S admits a quadratic variation if S (i) admits a quadratic variation for any i ≤ d + K. It follows from Theorem 4.30.1 in Rogers and Williams [46] and its proof that for any P ∈ M, S := ( S (1) , . . . , S (d+K) ) agrees P-a.s. with the classical definition of the quadratic variation of S under P, i.e., S 2 − S is a P-martingale. Further, Doob's inequality gives for all i ≤ d that and by the BDG inequalities, there exist constants c p , C p ∈ (0, ∞) such that Kκ p ). In the following, we want to modifyŠ oñ This together with (4.16) and the fact that P[Ĩ] = 1 for any P ∈ M I yields Hence, by (4.14) and (4.17), First, for every i, j, k, define ρ (i,j,k) : Then for i = 1, . . . , n, j = 0, 1, . . ., let θ (i,j,0) t = σ i,j and define recursively for k = 1, 2, . . . 
a change of time θ (i,j,k) : We consider a time change ofŠ via the θ (i,j,k) , defined byS t :=Š θ (i,j,k) t (S) for t ∈ [ρ (i,j,k) (S), ρ (i,j,k+1) (S)) for all i, j, k as above. Note that θ (i,j,k−1) ρ i,j,k = θ (i,j,k) ρ i,j,k so that the resulting process is continuous. Consider S ∈Ĩ and i, j such that we have σ i,j +1 (S(S)) − σ i,j (S(S)) > 0, as otherwise everything collapses to one point. Then the quadratic variation ofS(S) grows on [ρ (i,j,k) (S), ρ (i,j,k+1) (S)) linearly at the rate 2 k /δ, and ρ (i,j,k+1) (S) − ρ (i,j,k) (S) = 2 −k δ. In particular,S accumulates one unit of quadratic variation over each interval [ρ (i,j,k) (S), ρ (i,j,k+1) (S)) for k increasing until the total quadratic variation ofŠ on [σ i,j +1 (S(S))−σ i,j (S(S))] is exhausted. Trivially bounding the quadratic variation ofŠ over a small interval by its quadratic variation over [0, T n ], we see that We can ensure this happens with large probability since by Markov's inequality, Finally, we observe that each θ (i,j,k) t (S) is a stopping time relative to the natural filtration ofŠ, and henceS is a continuous P (N ) -martingale.
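The bookkeeping behind the third time change is a geometric series: the k-th interval has length 2^{-k} δ and the quadratic variation grows there at rate 2^k/δ, so each full interval absorbs exactly one unit. The sketch below computes the time needed to exhaust a given amount of quadratic variation. The starting index of k is not recoverable from this section, so it is exposed as a parameter; the function name and the default `k_start=0` are assumptions.

```python
def time_to_exhaust(total_qv, delta, k_start=0):
    """Time used by the dyadic schedule to absorb `total_qv` units.

    Interval k has length delta * 2**-k and rate 2**k / delta.
    Returns the elapsed time and the number of intervals used.
    """
    elapsed = 0.0
    remaining = total_qv
    k = k_start
    while remaining > 0.0:
        length = delta * 2.0 ** (-k)
        rate = 2.0 ** k / delta
        used = min(remaining, rate * length)   # rate * length equals one unit
        elapsed += used / rate
        remaining -= used
        k += 1
    return elapsed, k - k_start

print(time_to_exhaust(2.5, delta=1.0))   # (1.625, 3)
```

However large the quadratic variation is, the elapsed time stays below 2δ when k starts at 0 (and below δ when it starts at 1), which is why the modification fits inside a short interval.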
Step 5: Tightness of the measures through tightness of the quadratic variation processes. Together with (4.19), by the Arzelà-Ascoli theorem, the above implies that the family Then by Theorem VI.4.13 in Jacod and Shiryaev [35], , the space of right-continuous functions with left limits. By Theorem VI.3.21 in Jacod and Shiryaev [35], this implies that for all ε > 0, η > 0, there are N 0 ∈ N and θ > 0 with where w′ T n is defined by Clearly, for S ∈ Ω, continuity of S implies that w T n (S, θ ) := sup{|S t − S s | : 0 ≤ s < t ≤ T n , t − s ≤ θ } ≤ 2w′ T n (S, θ ).
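For a sampled path, the modulus of continuity w_{T_n}(S, θ) displayed above can be evaluated directly from its definition. The following minimal sketch works on a finite time grid with increasing times, so it returns a lower bound for the modulus of the underlying continuous path; the function name is an assumption.

```python
def modulus_of_continuity(times, path, theta):
    """sup |S_t - S_s| over grid points s < t with t - s <= theta."""
    best = 0.0
    for i, (s, x) in enumerate(zip(times, path)):
        for t, y in zip(times[i + 1:], path[i + 1:]):
            if t - s > theta:
                break                     # times are increasing
            best = max(best, abs(y - x))
    return best

times = [0, 1, 2, 3]
path = [0.0, 1.0, 0.5, 2.0]

print(modulus_of_continuity(times, path, theta=1))    # 1.5
print(modulus_of_continuity(times, path, theta=3))    # 2.0
```

Tightness in Step 5 amounts to this quantity being small, uniformly over the family of measures and with high probability, once θ is small.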

Then we have
which then by Theorem VI.1.5 in Jacod and Shiryaev [35] implies that the family Step 6: Tightness gives exact duality. By tightness, there exists a converging subsequence {P (N k ) •S −1 } such that P (N k ) •S −1 → P weakly for some probability measure P on Ω. Consequently, In addition, if P is an element of M μ μ μ,P , then and the third inequality follows from (4.18). Recalling that V X ,P,P = P μ μ μ,P and letting D → ∞, we obtain the desired equality P μ μ μ,P = P μ μ μ,P and conclude that V X ,P,P (G) = V X ,P,P (G) = P μ μ μ,P (G) = P μ μ μ,P (G).
It remains to argue that P is an element of M μ,P . First, it is straightforward to see that S is a P-martingale and L P (S T i ) = μ i for any i ≤ n. To show that P[S ∈ P] = 1, notice that by the Portmanteau theorem, for every ε > 0, Therefore, it follows from Remark 4.6 and monotone convergence that $P[S \in \mathcal P] = \lim_{\varepsilon \downarrow 0} P[S \in \bar{\mathcal P}^{\varepsilon}] = 1$, and hence P ∈ M μ,P .

Pricing-hedging duality without constraints
This and the subsequent section are devoted to establishing the crucial pricing-hedging duality result in the absence of constraints, which was exploited in all the proofs above.

Theorem 5.1 Under Assumption 3.1, for any α, β ≥ 0 and D ∈ N,

where m (D) is defined in Definition 4.1.

Remark 5.2
As a by-product of the proof of Theorem 5.1, (5.1) still holds true when the probabilistic models P are restricted to those which arise within a Brownian setup, i.e., P satisfies (4.7).
The strategy of the proof is inspired by Dolinsky and Soner [25] and proceeds via discretisation, of the dual side in Sect. 5.1 and of the primal side in Sect. 5.3. The duality between the discrete counterparts is obtained by using classical probabilistic results of Föllmer and Kramkov [29].

A discrete-time approximation through simple strategies
The proof of Theorem 5.1 is based on a discretisation method involving a discretisation of the path space into a countable set of piecewise constant functions. These are obtained as a "shift" of the "Lebesgue discretisation" of a path. Recall from Defini- Then it is obvious from the definition of V (N ) I (G) ≥ V I (G) for any N 2 ≥ N 1 , and in fact, the following result states that V (N ) I (G) converges to V I (G) asymptotically.

A countable class of piecewise constant functions
In this section, we construct a countable set of piecewise constant functions which can approximate any continuous function S to a certain degree. It is achieved in three steps. The first step is to use the Lebesgue partition defined in the last section to discretise a continuous function into a piecewise constant function whose jump times are the stopping times. Due to the arbitrary nature of jump times and jump sizes, the set of piecewise constant functions F (N ) (S), generated through this procedure over all S, is uncountable. To overcome this, in the subsequent two steps, we restrict the jump times and sizes to a countable set and hence define a class of approximating schemes. As explained in Sect. 3.1, our methods are closely inspired by [25], but in order to deal with payoff functions which are uniformly continuous, so that in applications we can include static hedging in options with different maturities, we had to devise an improved discretisation scheme.
We Step 2. Define a map We then define our second approximationF (N ) : Step 3. We now construct the shifted jump timesτ Here we also suppress the dependences of these shifted jump times on S and N and writeτ k =τ (N ) k (S). Clearly 0 =τ 0 <τ 1 <τ 2 < · · · <τ m = T , τ k−1 <τ k ≤ τ k for all k < m andτ m = τ m = T . Theseτ are the shifted versions of the τ and are uniquely defined for any S. We are going to use theτ to define a class of approximating schemes.
It is clear thatD (N ) is countable.

A countable probabilistic structure
LetΩ := D([0, T ], R d+K ) and denote byŜ = (Ŝ t ) 0≤t≤T the canonical process on the spaceΩ. The setD (N ) is a countable subset ofΩ. There exists a local martingale measurê P ( In the last section, we saw definitions ofτ on Ω. Here we extend their definitions to N ∈ND (N ) . Define the jump times by settingτ 0 (Ŝ) = 0 and for k > 0,  ∈D (N ) , and equal to a otherwise. SinceP (N ) has full support onD (N ) , we getγ = φ(Ŝ)P (N ) -a.s. In particular, for any A that is a Borel-measurable subset of R d+K , the symmetric difference of {γ t ∈ A} and {φ(Ŝ) t ∈ A} is a nullset forP (N ) . Thus φ is a predictable map. Furthermore, sinceP (N ) charges all elements inD (N ) , for any υ,υ ∈D (N ) and t ∈ [0, T ], In the sequel, we always consider the above version φ(Ŝ) of a predictable processγ . We now formally define the probabilistic superreplication problem and later build a connection between the probabilistic superreplication problem on the discretised space and the pathwise discretised robust hedging problem. For the rest of the section, we write As G is defined only on Ω, to consider paths inΩ, we need to extend the domain of G toΩ. For most financial contracts, the extension is natural. However, we pursue a general approach here. We first define a projection :Ω → C([0, T ], R d+K ) by ifŜ is continuous, , where ω 1 is the constant path equal to 1. Put differently, whenŜ ∈ N ∈ND (N ) , (Ŝ) is the linear interpolation of the points We then can defineĜ :Ω → Ω using the projection byĜ(Ŝ) = G( (Ŝ)∨0), wherê S ∨ 0 := ((Ŝ (1) t ∨ 0, . . . ,Ŝ (d+K) t ∨ 0)) 0≤t≤T for any S ∈Ω. Note that G andĜ are equal on Ω. In addition, for every N ∈ N andŜ ∈D (N ) , we have Therefore, we can deduce that Similarly to [25], we can now connect the probabilistic superhedging problem and the discretised robust hedging problem. for some m, t k , s k . Therefore, we can conclude thatF (N ) has the desired measurability.
The following result is crucial. It states that the probabilistic superreplication value is asymptotically larger than the value of the discretised robust hedging problem. Recall that λ I (ω) := inf υ∈I ω − υ ∧ 1.
Proof Fix N ≥ 6. Let f e : R + → R + be a modulus of continuity for G so that lim x→0 f e (x) = 0. Define G (N ) : Ω → R as Note that V (N ) Hence, to show (5.11), it suffices to show that V (N ) The rest of the proof is structured to establish (5.12). Given a probabilistic semistatic portfolioγ which superreplicatesĜ − β √m (D−2) ∧ α − Nλ I − x, we argue that the lifted trading strategy γ (N ) To simplify notations, throughout the rest of the proof, we fix S ∈ I and writê F :=F (N ) (S).
Superreplication. We first notice that for any j < m − 1, It follows that for any k < m, In addition, (5.14) Hence, x where the second inequality follows from the superreplicating property ofγ and the fact thatP (N ) [{f }] > 0, ∀f ∈D (N ) , the third inequality is justified by (5.4), and the last inequality is due to (5.8) and (5.9).
Admissibility. Now, for a given t < T , let k < m be the largest integer so that τ k (S) ≤ t. It follows from (5.13) and (5.14) that where the last inequality follows from the admissibility ofγ and again the fact that Hence γ (N ) is admissible.

Duality for the discretised problems
Definition 5.9 Letˆ (N ) be the set of all probability measuresQ which are equivalent toP (N ) . For any κ ≥ 0, denote byM (N ) I (κ) the set of all probability measureŝ Q ∈ˆ (N ) such thatQ Proof For anyQ ∈ˆ (N ) , the support ofQ isD (N ) whose elements are piecewise constant. Therefore, the canonical processŜ is a semimartingale underQ. Moreover, it has the decompositionŜ =MQ +ÂQ, wherê is a predictable process of bounded variation andMQ is a martingale underQ. Then similarly to Dolinsky and Soner [26], it follows from Example 2.3 and Proposition 4.1 in Föllmer and Kramkov [29] that Then, in (5.15), it suffices to consider the supremum overM

Discretisation of the primal
Next, we show that we can lift any measure inM (N ) I (c) to a continuous martingale measure in M I such that the difference of the expected value of G under this continuous martingale measure and the expected value ofĜ under the original measure is within a bounded error, which goes to zero as N → ∞. Through this, we asymptotically connect the primal problems on the discretised space to the approximation of the primal problems on the space of continuous functions.
for some g : R + → R + such that lim x 0 g(x) = 0. We fix N andQ ∈M (N ) I (2κ + 2α) and prove the above inequality in four steps.
Step 1. We first construct a semimartingaleẐ =M +Â on a Wiener space (Ω W , F W , P W ) such that (5.17) and whereM is constructed from a martingale and both have piecewise constant paths.
Since the measureQ is supported onD (N ) , the canonical processŜ is a pure jump process underQ, with a finite number of jumpsQ-a.s. Consequently, there exists a deterministic positive integer m 0 (depending on N ) such that It follows that Notice that by the definition ofD (N ) , the law ofŜτ m 0 underQ is also supported onD (N ) . Let (Ω W , F W , P W ) be a complete probability space together with a standard t≥0 be the P W -completion of the natural filtration of W . With a small modification of Lemma 5.1 in Dolinsky and Soner [25], we can construct a sequence of stopping times (with respect to the Brownian filtration) σ 1 < σ 2 < · · · < σ m 0 together with F W σ i -measurable random variables Y i , i = 1, . . . , m 0 such that Note that by the definition ofD (N ) , we have |Y i | ≤ 2 −N and hence also |X i | ≤ 2 −N . Also by the construction of the σ i and Y i , we have where σ σ σ i := (σ 1 , . . . , σ i ), Y Y Y i := (Y 1 , . . . , Y i ) and E W is the expectation with respect to P W . From these, we can construct a jump process (Â t ) 0≤t≤T bŷ In particular, for k ≤ m 0 ,Â σ k = k j =1 X j . Define a martingale (M t ) 0≤t≤T via Since all Brownian martingales are continuous, so is M. Moreover, Brownian motion increments are independent and therefore We now introduce a stochastic process (M t ) 0≤t≤T on the Brownian probability space Note that as |Y i − X i | ≤ 2 −N +1 , for any k ≤ m 0 and t ≤ T , we have and hence We also notice thatẐ =M +Â satisfiesẐ 0 =Ŝ 0 and

It follows that
In particular, by (5.20), we see that (5.17) holds, and also by (5.19) and the definition ofM andÂ, (5.18) holds.
Step 2. We shall shortly construct a continuous martingale M θ 0 from M such that M θ 0 is bounded below by −2 −N +2 − N − 1 2 and As the law ofẐ under P W is the same as that ofŜ m 0 underQ, it follows from the fact thatQ is supported onD (N ) and any f ∈D (N ) is above −2 −N +3 that Z ≥ −2 −N +3 P W -a.s. (5.22) By combining this with (5.7) and (5.21), we can deduce that It follows that where we use the fact that (Ẑ) ∨ 0 − M ∨ 0 ≤ (Ẑ) − M . Hence, sinceĜ is bounded by κ, Note that with the notation of the proof in Sect. A.3 in the Appendix, By Markov's inequality and the definition ofM (N ) Therefore, we have By (5.21)-(5.23), Hence the stopped process M θ 0 , with In addition, by (5.21) and (5.23), we can deduce from (5.18) that which for N large enough easily implies that Similarly, by (5.21) and (5.23), we have Step 3. The next step is to modify the martingale M θ 0 in such way that Γ , the new continuous martingale, is nonnegative. Write N = 2 −N +4 + N − 1 2 and define an Note that for any i > d, we have Λ (i) ≤ κ + 1 + 2 −N +2 + N − 1 2 + N ≤ κ + 2 for N large enough. We now construct a continuous martingale from Λ by setting and Λ ≥ 0 implies that Γ is nonnegative, and Γ (i) We first note that for all i = 1, . . . , d + K, Then by Doob's martingale inequality, This together with (5.26), writing κ 1 = κ + α, yields ≤ f e (  (5.30) and from (5.28) and (5.29) that Step 4. The last step is to construct a new processΓ from Γ such that the law of Γ under P W is an element of M I . We write η N = 4κ 1 N − 1 2 + 4(d + K) We can deduce from (5.30) and (5.31) that It follows immediately that Then it follows from Assumption 3.1 that when N is large enough, there exists ã P (N ) ∈ MĨ such that We now construct a continuous martingale fromΛ by setting It follows from the fact that ξ is independent of Γ andM that which implies that Then by Doob's martingale inequality, Notice that when N is sufficiently large such that κη We now define η N by Note that η N → 0 as N → ∞. 
Then by Doob's martingale inequality, It follows that Then by (4.1) in Remark 4.2, for any sequence (P (k) ) k≥1 converging to P weakly, The next step is to interchange the order of the infimum and supremum. Notice that when we fix P, G is affine in the first variable and continuous due to the dominated convergence theorem. In addition, by definition, G is lower semi-continuous in the second variable. Furthermore, G is convex in the second variable. To justify this, we notice that P → E where the second inequality follows from the fact that −X ∈ Lin N (X ) for every X ∈ Lin N (X ). This completes the verification of (4.9).

A.3 Construction of σ and Y in Proposition 5.11
Given a sequence a 1 , a 2 , . . ., we denote by a a a m := (a 1 , . . . , a m ) the vector of its first m elements. We denote by Π(E) the set of probability measures on E. In addition, set (a 1 , . . . , a d+K ) : a j ∈ Z, |a j | ≤ 2 k , j = 1, . . . , d + K .