Abstract
The objective of this paper is to develop a duality between a novel entropy martingale optimal transport (EMOT) problem and an associated optimisation problem. In EMOT, we follow the approach taken in the entropy optimal transport (EOT) problem developed in Liero et al. (Invent. Math. 211:969–1117, 2018), but we add the constraint, typical of martingale optimal transport (MOT) theory, that the infimum of the cost functional is taken over martingale probability measures. In the associated problem, the objective functional, related via Fenchel conjugacy to the entropic term in EMOT, is no longer linear as in (martingale) optimal transport. This leads to a novel optimisation problem which also has a clear financial interpretation as a nonlinear subhedging problem. Our theory allows us to establish a nonlinear robust pricing–hedging duality which also covers a wide range of known robust results. We also focus on Wasserstein-induced penalisations and study how the duality is affected by variations in the penalty terms, with a special focus on the convergence of EMOT to the extreme case of MOT.
1 Introduction
As a consequence of the financial crisis in 2008, the uncertainty in the selection of a reference probability \(P\) gained increasing attention and led to the investigation of the notions of arbitrage and of the pricing–hedging duality in different settings. On the one hand, the single reference probability \(P\) was replaced with a family of – a priori non-dominated – probability measures, leading to the theory of quasi-sure stochastic analysis. On the other hand, taking an even more radical approach, a probability-free, pathwise theory of financial markets made substantial advances in the second decade of this century. In this context, it was shown in the seminal paper by Beiglböck et al. [4] that optimal transport theory is a powerful tool to prove pathwise pricing–hedging duality results. The theory we are going to present fits in this conceptual framework that we now briefly recall.
The market model is in discrete time with a finite horizon \(T\in \mathbb{N}\) and zero interest rate. Let
$$ \Omega :=K_{0}\times \cdots \times K_{T} $$
for closed (possibly noncompact) subsets \(K_{0},\dots ,K_{T}\) of ℝ and let \(X_{0},\dots ,X_{T}\) be the canonical projections \(X_{t}:\Omega \rightarrow K_{t}\) for \(t=0,1,\dots ,T\). The process \(X=(X_{t}) \) represents the price of some underlying asset. Later we allow a multidimensional price process, but in this introduction, we stick to the one-dimensional case for notational simplicity. We assume no reference probability measure. We write
and when \(\mu \) is a measure defined on the Borel \(\sigma \)-algebra of \(\Omega\), its marginals are denoted by \(\mu _{0},\dots ,\mu _{T}\). One then considers a contingent claim \(c:\Omega \rightarrow (-\infty ,+\infty ]\) which is allowed to depend on the whole path of the underlying asset, and one admits semistatic trading strategies for hedging. This means that in addition to dynamic trading in \(X\) via admissible integrands \(\Delta \in \mathcal{H}\), one may invest in vanilla options \(\varphi _{t}:K_{t}\rightarrow \mathbb{R}\). For modelling purposes, one can take vector subspaces \(\mathcal{E}_{t}\subseteq \mathcal{C}(K_{t})\) for \(t=0,\dots ,T\), where \(\mathcal{C}(K_{t})\) is the space of real-valued continuous functions on \(K_{t}\). For each \(t\), \(\mathcal{E}_{t}\) is the set of static options that can be used for hedging, say affine combinations of vanilla options with different strikes and the same maturity \(t\), and \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) is the space of all hedging instruments. The key assumption in the robust, optimal-transport-based formulation is that the marginals \((\widehat{Q}_{0},\widehat{Q}_{1},\dots ,\widehat{Q}_{T})\) of the underlying price process \(X\) are known; see the seminal papers by Breeden and Litzenberger [13] and Hobson [27]. Such marginals can be identified if one knows a (very) large number of prices of plain vanilla options maturing at each intermediate date, for example the prices of all call options with intermediate maturities and a full range of strikes. In this case, the class of arbitrage-free pricing measures that are compatible with the observed prices of the options is given by
$$ \mathcal{M}(\widehat{Q}_{0},\widehat{Q}_{1},\dots ,\widehat{Q}_{T}):=\{Q\in \mathrm{Mart}(\Omega ) : Q_{t}=\widehat{Q}_{t} \text{ for all } t=0,\dots ,T\}, $$
with \(\mathrm{Mart}(\Omega )\) the set of martingale probability measures on \(\Omega \).
Let ℋ consist of admissible (predictable) trading strategies, given as in Beiglböck et al. [4] via bounded continuous functions, and let
denote the corresponding set of stochastic integrals. In this framework, the subhedging duality, obtained in [4, Theorem 1.1], takes the form
for
and the right-hand side of (1.1) is known as the robust subhedging price of \(c\). Obviously, an analogous theory for the superhedging price can be developed as well. Several relevant papers contributed to this stream of literature, as for example Davis et al. [18], Dolinsky and Soner [20], Galichon et al. [22], Henry-Labordère et al. [26], Tan and Touzi [37]. More recent works on the topic include also Bartl et al. [3], Cheridito et al. [14], Guo and Obłój [24], Hou and Obłój [28].
1.1 The dual problem
The left-hand side of (1.1), namely \(\inf _{Q\in \mathcal{M}(\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})}E_{Q}[c] \), represents the dual problem in the financial application, but is typically the primal problem in martingale optimal transport (MOT). We label this case as the sublinear case of MOT. Inspired by the entropy optimal transport (EOT) introduced in Liero et al. [30], we are naturally led to the study of the convex case of MOT, i.e., an entropy martingale optimal transport (EMOT) problem, in the form
$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c]+\sum _{t=0}^{T}\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}(Q_{t})\bigg), $$(1.3)
where \(\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}\) is a divergence in the usual form (see (3.5) below for an explicit expression). Notice that in the EMOT primal problem (1.3), the typical MOT constraint that \(Q\) has prescribed marginals \((\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})\) is relaxed (as the infimum is taken with respect to all martingale probability measures) by penalising via \(\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}\) those martingale measures \(Q\) whose marginals are far from some reference marginals \((\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})\). This is a key difference with classical MOT. Nevertheless, when \(\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}} (\,\cdot \, )=\delta _{\{ \widehat{Q}_{t}\}}(\,\cdot \, )\), the EMOT reduces to the classical MOT problem where only martingale probability measures with fixed marginals are allowed. Here \(\delta _{A}:=\infty 1_{A^{c}}\) is the characteristic function of a set \(A\) as customarily defined in convex analysis. We also stress that in (1.3), we only consider martingale probability measures, while the EOT problem of [30] is obtained by replacing in (1.3) the set \(\mathrm{Mart}(\Omega )\) with \(\mathrm{Meas}(\Omega )\) consisting of all positive finite measures \(\mu \) on \(\Omega \).
In EMOT, the marginals are no longer fixed a priori as in the left-hand side of (1.1), because we may not have sufficient information to detect them with enough accuracy. This might be the case for example if there are not sufficiently many traded call and put options on the underlying assets in the market, so that we cannot extract the marginals precisely via the Breeden and Litzenberger [13] approach. Alternatively, the exact prices of the options might be unknown, e.g. because of market impact effects.
1.2 The primal problem
To describe the nonlinear subhedging value, we start with the space \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) of hedging instruments consisting of some vectors of continuous functions and consider a functional \(U:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\). An example is given by (a sum of) expected utility functions, as detailed below, so that \(U\) is not necessarily linear or even cash-additive.
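We restore cash-additivity via the notion of the optimised certainty equivalent (OCE) studied in Ben Tal and Teboulle [5]. To this end, we introduce the generalised optimised certainty equivalent associated to \(U\) as
$$ S^{U}(\varphi ):=\sup _{\beta \in \mathbb{R}^{T+1}}\bigg( U(\varphi +\beta )-\sum _{t=0}^{T}\beta _{t}\bigg), \qquad \varphi \in \mathcal{E}. $$(1.4)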
This is a cash-additive map (see (2.2)), yet nonlinear in general, and can be considered as a valuation of options \(\varphi =(\varphi _{t})\) instead of the linear cost \(\sum _{t=0}^{T} E_{\widehat{Q}_{t}}[\varphi _{t}]\) in (1.1). For a possibly path-dependent contingent claim \(c:\Omega \rightarrow (-\infty ,+\infty ]\), the nonlinear subhedging value of \(c\) when valuation is done by \(S^{U}\) then reads
$$ \sup _{\Delta \in \mathcal{H}}\sup _{\varphi \in {\Phi }_{\Delta }(c)}S^{U}(\varphi ) $$
for \({\Phi }_{\Delta }(c):=\{ \varphi \in \mathcal{E}: \sum _{t=0}^{T} \varphi _{t}(x_{t})+I^{\Delta }(x)\leq c(x), \forall \,x\in \Omega \}\).
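The cash-additivity of \(S^{U}\) claimed above can be verified in one line. The computation below is a sketch that assumes the generalised OCE has the form \(S^{U}(\varphi )=\sup _{\beta \in \mathbb{R}^{T+1}}( U(\varphi +\beta )-\sum _{t=0}^{T}\beta _{t})\) and that constants belong to each \(\mathcal{E}_{t}\). For \(\alpha \in \mathbb{R}^{T+1}\), the change of variable \(\beta '=\alpha +\beta \) gives
$$ S^{U}(\varphi +\alpha )=\sup _{\beta \in \mathbb{R}^{T+1}}\bigg( U(\varphi +\alpha +\beta )-\sum _{t=0}^{T}\beta _{t}\bigg) =\sup _{\beta '\in \mathbb{R}^{T+1}}\bigg( U(\varphi +\beta ')-\sum _{t=0}^{T}\beta '_{t}\bigg) +\sum _{t=0}^{T}\alpha _{t}=S^{U}(\varphi )+\sum _{t=0}^{T}\alpha _{t}. $$
No linearity or cash-additivity of \(U\) itself is used, which is why the construction restores cash-additivity for a general concave \(U\).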
1.3 The duality
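One of the main results of the paper in Theorem 2.4 is the duality
$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\big( E_{Q}[c]+\mathcal{D}_{U}(Q)\big) =\sup _{\Delta \in \mathcal{H}}\sup _{\varphi \in {\Phi }_{\Delta }(c)}S^{U}(\varphi ), $$(1.5)
where
$$ \mathcal{D}_{U}(Q):=\sup _{\varphi \in \mathcal{E}}\bigg( U(\varphi )-\sum _{t=0}^{T}\int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}\bigg) $$(1.6)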
is the penalisation term associated to \(U\) via the Fenchel conjugate. In addition, we also prove the existence of an optimiser for the problem on the left-hand side of (1.5). We now understand that the dual problem for the latter, namely of EMOT in its general form, is the nonlinear subhedging problem appearing on the right-hand side of (1.5).
Observe that \(\mathcal{D} := \mathcal{D}_{U}\) in (1.6) does not necessarily have an additive structure, or a divergence formulation \(\mathcal{D} (Q)=\sum _{t=0}^{T} \mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}(Q_{t})\) as in (1.3), and so it does not necessarily depend on a given martingale measure \(\widehat{Q}\). For example, such penalisation terms could be induced by market prices (see Sect. 4.3) or by a Wasserstein distance (see Sect. 4.4). This additional flexibility in choosing \(\mathcal{D}\) constitutes one key generalisation of the entropy optimal transport theory of Liero et al. [30]. Of course, the other difference with EOT is the presence in (1.5) of the additional supremum with respect to admissible integrands \(\Delta \in \mathcal{H}\). As a consequence, on the left-hand side of (1.5), the infimum is now taken with respect to martingale probability measures instead of positive measures.
In the special case of a valuation functional \(U\) induced by utility functions, the duality (1.5) has a particularly interesting formulation (see Sects. 3.4 and 4.1 for the assumptions and more details). We provide here only two special cases. Let \(\widehat{Q}_{t}\) be the marginals of some \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) and \(U(\varphi )= \sum _{t=0}^{T} {E}_{\widehat{Q}_{t}}[u_{t}(\varphi _{t})]\). If \(u_{t}(x)=\frac{1}{\gamma _{t}}(1-\exp {(-\gamma _{t}x)})\), with \(\gamma _{0},\dots ,\gamma _{T}>0\), is an exponential utility function for each \(t\), then (1.5) takes the form
where \(H(\,\cdot \, , \,\cdot \, )\) denotes the relative entropy. If \(u_{t}(x)=x\) is the linear utility function, then \(\mathcal{D}_{U}(\,\cdot \, )=\sum _{t=0}^{T} \delta _{\widehat{Q}_{t}}( \,\cdot \, )\) and the EMOT reduces to the classical MOT problem.
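For the exponential case, the valuation side can be checked numerically in a single period. The sketch below is illustrative only and rests on an assumption: that the optimised certainty equivalent of one component is \(\sup _{\beta \in \mathbb{R}}( E[u(\varphi +\beta )]-\beta )\). Under that assumption the supremum has the closed form \(-\frac{1}{\gamma }\log E[\exp (-\gamma \varphi )]\), the entropic certainty equivalent, which is the Fenchel counterpart of the relative-entropy penalty. The function names and the three-point distribution are hypothetical.

```python
import math


def exp_utility(x, gamma):
    """Exponential utility u(x) = (1 - exp(-gamma * x)) / gamma."""
    return (1.0 - math.exp(-gamma * x)) / gamma


def oce_numeric(values, probs, gamma, lo=-20.0, hi=20.0, iters=200):
    """Maximise beta -> E[u(phi + beta)] - beta by ternary search.

    The objective is concave in beta, so ternary search on [lo, hi]
    converges to the maximiser as long as it lies in that interval.
    """
    def objective(beta):
        expected = sum(p * exp_utility(v + beta, gamma) for v, p in zip(values, probs))
        return expected - beta

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


def oce_closed_form(values, probs, gamma):
    """Entropic certainty equivalent -(1/gamma) * log E[exp(-gamma * phi)]."""
    moment = sum(p * math.exp(-gamma * v) for v, p in zip(values, probs))
    return -math.log(moment) / gamma


if __name__ == "__main__":
    payoff = [-1.0, 0.5, 2.0]   # hypothetical option payoffs phi(x)
    law = [0.3, 0.4, 0.3]       # hypothetical reference marginal
    print(oce_numeric(payoff, law, 2.0), oce_closed_form(payoff, law, 2.0))
```

The two printed numbers should agree up to the tolerance of the search; the agreement reflects the first-order condition \(E[\exp (-\gamma (\varphi +\beta ))]=1\) for the optimal cash shift.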
Our framework allows us to establish and comprehend several different robust pricing–hedging duality results: new nonlinear utility-based formulations (in Corollaries 4.3, 4.4 and 5.1); the linear case (in Corollary 5.3) and the case without options (in Corollary 4.6); a new duality with penalisation functions based on market data (see Sect. 4.3) or on a Wasserstein distance (see Sect. 4.4).
One additional feature of the paper consists in replacing the set of stochastic integrals ℐ with a general set \(\mathcal{A}\) of suitable hedging instruments that is a general convex cone. Particular choices of such an \(\mathcal{A}\), apart from the usual set of stochastic integrals, allow us to work with \(\varepsilon \)-martingale measures, supermartingales and submartingales in the duality (see Sect. 2.2.1). This extends EMOT beyond the strict martingale property.
Section 2.5 is devoted to stability and convergence issues, as we analyse how the duality is affected by variations in the penalty terms. In Examples 4.12, 4.16 and 5.4, we apply this result to the convergence of EMOT to the extreme case of MOT, and in Sect. 4.4, we focus on Wasserstein-induced penalisation terms.
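The convergence of EMOT to MOT can be seen in a toy example. The sketch below is not taken from the paper: it assumes a single period with \(X_{0}=0\), terminal values in \(\{-1,0,1\}\), a reference marginal \(\widehat{Q}_{1}=(1/4,1/2,1/4)\) and a penalty of the form `weight * H(Q_1 | Q_ref)`, with `weight` a hypothetical parameter. Martingale laws are then exactly \((p,1-2p,p)\) with \(p\in [0,1/2]\), so the infimum can be approximated by a grid search.

```python
import math


def relative_entropy(q, q_ref):
    """H(q | q_ref) for finite distributions, with the convention 0 * log 0 = 0."""
    return sum(qi * math.log(qi / ri) for qi, ri in zip(q, q_ref) if qi > 0.0)


def emot_value(cost, q_ref, weight, grid=2000):
    """Approximate inf over martingale laws of E_Q[c] + weight * H(Q | q_ref).

    Martingale laws on {-1, 0, 1} started at 0 are (p, 1 - 2p, p), p in [0, 1/2].
    """
    best = math.inf
    for k in range(grid + 1):
        p = 0.5 * k / grid
        q = (p, 1.0 - 2.0 * p, p)
        expected_cost = sum(qi * ci for qi, ci in zip(q, cost))
        best = min(best, expected_cost + weight * relative_entropy(q, q_ref))
    return best


def mot_value(cost, q_ref):
    """With the marginal fixed to q_ref (which has mean 0), Q = q_ref is the only candidate."""
    return sum(qi * ci for qi, ci in zip(q_ref, cost))


if __name__ == "__main__":
    cost = (1.0, 0.0, 1.0)        # hypothetical claim c(x_1) = x_1 ** 2
    q_ref = (0.25, 0.5, 0.25)
    for weight in (0.0, 0.1, 1.0, 10.0, 1000.0):
        print(weight, emot_value(cost, q_ref, weight))
    print("MOT", mot_value(cost, q_ref))
```

As the weight grows, the penalised value increases towards the MOT value, since deviations of the marginal from the reference become more and more expensive.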
2 The entropy martingale optimal transport duality
In this section, we present a precise mathematical setting, the main results and their proofs. The main result in Theorem 2.4 relies on (i) a Fenchel–Moreau argument applied to the dual system \((C_{0:T},(C_{0:T})^{\ast })\), where \(C_{0:T}\) is a set of appropriately weighted continuous functions, (ii) the Daniell–Stone theorem that guarantees that the elements in the dual space \((C_{0:T})^{\ast }\) that enter in the dual representation can be represented by probability measures. In order to make this possible, an order-continuity-type assumption on the valuation functional is enforced (see (2.6)).
2.1 The setting
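Fix \(d\in \mathbb{N}\) modelling the number of stocks in the market, and fix \(d(T+1)\) closed subsets \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) of ℝ. Set, for \(0\leq s\leq t\leq T\),
$$ \Omega _{s:t}:=K_{s}\times \cdots \times K_{t}, \qquad \text{where } K_{u}:=K_{u}^{1}\times \cdots \times K_{u}^{d} \text{ for } u=0,\dots ,T, $$
and \(\Omega :=\Omega _{0:T}\).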
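Let \(\mathcal{C}\left (\Omega _{s:t}\right )\) be the vector space of continuous real-valued functions on \(\Omega _{s:t}\), and let
$$ C_{s:t}:=\bigg\{ \varphi \in \mathcal{C}(\Omega _{s:t}) : \Vert \varphi \Vert _{s:t}:=\sup _{x\in \Omega _{s:t}}\frac{\vert \varphi (x)\vert }{1+\sum _{u=s}^{t}\sum _{j=1}^{d}\vert x_{u}^{j}\vert }< \infty \bigg\} . $$(2.1)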
We introduce the space \(B_{s:t}\) in a similar fashion, just substituting the requirement of continuity for \(\varphi \) with the request that \(\varphi \) be measurable with respect to the Borel \(\sigma \)-algebra of \(\Omega _{s:t}\). Then \(C_{s:t}\) and \(B_{s:t}\) are Banach lattices under the norm \(\Vert \cdot \Vert _{s:t}\). The topological dual of \(C_{s:t}\) is denoted by \((C_{s:t})^{\ast }\).
In a discrete-time framework with finite horizon \(T\) and assuming zero interest rate, we model a market with \(d\) stocks using the canonical \(d\)-dimensional process given by \(X_{t}^{j}(x)=x_{t}^{j},j=1,\dots ,d,t=0,\dots ,T\), for \(x \in \Omega \). We introduce the set \(\mathrm{Prob}(\Omega )\) of probability measures on \(\Omega \), endowed with its Borel \(\sigma \)-algebra, and the set of those probability measures under which the \(X_{t}^{j}\) are integrable as
$$ \mathrm{Prob}^{1}(\Omega ):=\{Q\in \mathrm{Prob}(\Omega ) : X_{t}^{j}\in L^{1}(Q) \text{ for all } j=1,\dots ,d,\ t=0,\dots ,T\}. $$
Fix now vector subspaces \(\mathcal{E}_{0},\dots ,\mathcal{E}_{T}\) with \(\mathcal{E}_{t}\subseteq C_{0:t}\). The space \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) represents the class of financial instruments that can be used for static hedging. Since \(\mathcal{E}_{t}\subseteq {C}_{0:t}\), the sets \(\mathcal{E}_{t}\) may also contain Asian and other path-dependent options \(\varphi _{t}(x_{0},\dots ,x_{t})\). Nonetheless, the choice \(\mathcal{E}_{t}\subseteq C_{t:t}\) is permitted, too; see Sects. 4 and 5. Moreover, in some of the subsequent results (see Sect. 4.1), we take as \(\mathcal{E}_{t}\subseteq {C}_{t:t}\) the subspace consisting of (combinations of) deterministic amounts, units of underlying stock at time \(t\) and call options with different strike prices and the same maturity \(t\). Let \(U:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\) be a proper (i.e., \(\mathrm{dom}(U):=\{\varphi \in \mathcal{E} : U(\varphi )>-\infty \} \neq \emptyset \)) concave functional. Recall from (1.4) the definition of \(S^{U}:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty ]\) which represents the valuation functional of the hedging instruments in ℰ. Let \(\mathrm{dom}(S^{U}):=\{\varphi \in \mathcal{E} : S^{U}(\varphi )>- \infty \}\). Observe that we are considering the valuation of the whole process \(\varphi =(\varphi _{0},\dots ,\varphi _{T}) \in \mathcal{E}\) rather than the valuation of the terminal payoffs only. Under the usual convention \(\infty \cdot 0=0\cdot \infty =0\), one can check that the functional \(S^{U}\) is concave on the convex set \(\mathrm{dom}(S^{U}) \) and cash-additive, meaning that
$$ S^{U}(\varphi +\beta )=S^{U}(\varphi )+\sum _{t=0}^{T}\beta _{t} \qquad \text{for all } \varphi \in \mathcal{E} \text{ and } \beta \in \mathbb{R}^{T+1}. $$(2.2)
Definition 2.1
Given a convex cone \(\mathcal{A}\subseteq C_{0:T}\) and a Borel function \(c\), we define
$$ \pi (c):=\sup _{z\in -\mathcal{A}}\sup _{\varphi \in { \Phi }_{z}(c)}S^{U}(\varphi ), $$(2.3)
where
and the usual convention \(\sup \emptyset =-\infty \) is adopted.
We recognise that \(\pi (c)\) in (2.3) is a generalised robust subhedging value for \(c\), with a general set \(-\mathcal{A}\) replacing the set of terminal values of stochastic integrals used before. Some relevant examples for choices of \(\mathcal{A}\) are provided in Sect. 2.2.1.
Definition 2.2
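We define the polar \(\mathcal{A}^{\circ }\) of the cone \(\mathcal{A}\subseteq C_{0:T}\) to be the set
$$ \mathcal{A}^{\circ }:=\{\lambda \in (C_{0:T})^{\ast } : \langle z,\lambda \rangle \leq 0 \text{ for all } z\in \mathcal{A}\}, $$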
where \(\langle \,\cdot \,,\,\cdot \, \rangle \) is the usual pairing between \(C_{0:T}\) and its topological dual \((C_{0:T})^{\ast }\), and we observe that for any \(\lambda \) in \((C_{0:T})^{\ast }\),
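As will be clarified in Sect. 2.4.1, \(\mathrm{Prob}^{1}(\Omega )\) can be identified with a subset of \((C_{0:T})^{\ast }\); so we introduce the set of probability measures
$$ \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }=\{Q\in \mathrm{Prob}^{1}(\Omega ) : E_{Q}[z]\leq 0 \text{ for all } z\in \mathcal{A}\}. $$(2.4)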
2.2 The main results
Assumption 2.3
(i) Let \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) be closed subsets of ℝ and denote . The vector subspaces \(\mathcal{E}_{0},\dots ,\mathcal{E}_{T}\) satisfy that \({\mathbb{R}} \subseteq \mathcal{E}_{t}\subseteq C_{0:t}\), \(t=0,\dots ,T\), and we set \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T} \). The functional \(U: \mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\) is concave with \(U(0)\in {\mathbb{R}} \). Moreover, \(\mathcal{A}\subseteq C_{0:T}\) is a convex cone with \(0\in \mathcal{A}\).
(ii) For every \(t=0,\dots ,T\), there exist compact sets , and functions \(0\leq f_{t}^{n}\in \mathcal{E}_{t},n\geq 1\), such that
and
Theorem 2.4
Suppose Assumption 2.3 is fulfilled.
(i) If
$$ \pi (\,\widehat{c}\,)< \infty \textit{ for some } \widehat{c}\in B_{0:T}, $$(2.7)
then \(\pi (c)\in {\mathbb{R}}\) for every \(c\in B_{0:T}\) and \(\pi :B_{0:T}\rightarrow {\mathbb{R}}\) is norm-continuous, cash-additive, concave and nondecreasing on \(B_{0:T}\).
(ii) For every lower semicontinuous \(c:\Omega \rightarrow (-\infty ,+\infty ]\) satisfying
$$ c(x)\geq -A\bigg( 1+\sum _{t=0}^{T}\sum _{j=1}^{d} \vert x_{t}^{j} \vert \bigg), \qquad \forall x\in \Omega , \textit{ for some } A\in \lbrack 0, \infty ), $$(2.8)
we have the duality
$$ \inf _{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ}}\big( E_{Q} [c ]+\mathcal{D}(Q)\big)=\sup _{z\in -\mathcal{A}}\sup _{\varphi \in { \Phi }_{z}(c)}S^{U} ( \varphi )=\pi (c), $$(2.9)
where \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ}\) is given in (2.4),
$$ \mathcal{D}(Q)=\sup _{\varphi \in \mathcal{E}}\bigg( U(\varphi )- \sum _{t=0}^{T}\int _{\Omega _{0:t}}\varphi _{t}\mathrm{d}Q_{t}\bigg), $$(2.10)
and \(Q_{t}\) is the marginal of \(Q\in \mathrm{Prob}^{1}(\Omega )\) on \(\mathcal{B}(\Omega _{0:t})\). Furthermore, if \(\pi (c)<\infty \), the infimum on the left-hand side of (2.9) is a minimum.
Notice that the condition \(\pi (\,\widehat{c}\,)<\infty \) for some \(\widehat{c}\in B_{0:T}\) is not required for the validity of Theorem 2.4 (ii). In addition, we allow in (2.9) \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }=\emptyset \) with the usual convention \(\inf \emptyset =+\infty \). Recall also that the existence of an optimiser in MOT implies that \(\mathcal{M}(\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})\) is not empty and that the marginals must be in convex order. In EMOT, the marginals are no longer assigned, and so an optimiser \(Q^{\ast }\) of the left-hand side of (2.9) belongs to \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) with \(\mathcal{D}(Q^{\ast })< \infty \) with no other requirement.
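The easy half of (2.9), namely \(\pi (c)\leq \inf _{Q}( E_{Q}[c]+\mathcal{D}(Q))\), follows from a short computation. The sketch below rests on our reading of the definitions, which should be treated as assumptions: \(S^{U}(\varphi )=\sup _{\beta \in \mathbb{R}^{T+1}}( U(\varphi +\beta )-\sum _{t}\beta _{t})\), the set \({\Phi }_{z}(c)\) consists of those \(\varphi \) with \(\sum _{t}\varphi _{t}+z\leq c\) on \(\Omega \), and \(Q\in \mathcal{A}^{\circ }\) means \(E_{Q}[z']\leq 0\) for all \(z'\in \mathcal{A}\). Fix \(Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\), \(z\in -\mathcal{A}\), \(\varphi \in {\Phi }_{z}(c)\) and \(\beta \in \mathbb{R}^{T+1}\). Applying (2.10) to \(\varphi +\beta \in \mathcal{E}\) and using \(E_{Q}[z]\geq 0\), we get
$$ U(\varphi +\beta )-\sum _{t=0}^{T}\beta _{t}\leq \mathcal{D}(Q)+\sum _{t=0}^{T}\int _{\Omega _{0:t}}(\varphi _{t}+\beta _{t})\,\mathrm{d}Q_{t}-\sum _{t=0}^{T}\beta _{t} =\mathcal{D}(Q)+E_{Q}\bigg[ \sum _{t=0}^{T}\varphi _{t}\bigg] \leq \mathcal{D}(Q)+E_{Q}[c-z]\leq \mathcal{D}(Q)+E_{Q}[c]. $$
Taking the supremum over \(\beta \), \(\varphi \) and \(z\), and then the infimum over \(Q\), yields the inequality. The substance of Theorem 2.4 (ii) is the reverse inequality together with the attainment of the infimum.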
Corollary 2.5
Suppose Assumption 2.3 (i) holds with the subsets \(K_{t}^{j}\), \(t=0,\dots ,T\), \(j=1,\dots ,d\), of ℝ being compact, \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous and \(U(0 )=0\). Then (2.9) holds true and if \({\pi (c)<\infty }\), there exists an optimum for the left-hand side of (2.9).
Proof
When \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) are compact, then \(C_{0:T}=\mathcal{C}_{b}(\Omega )\). As \(U(0)=0\), (2.5) and (2.6) are automatically satisfied: just take and \(f_{t}^{n}=0,t=0,\dots ,T,n\geq 1\). □
Assumption 2.3 (ii) is inspired by Cheridito et al. [15] and is for instance satisfied if \(K_{t}^{j} \subseteq \lbrack 0,\infty )\), \(t=0,\dots ,T\), \(j=1,\dots ,d\), and if the valuations over a suitable sequence of call options on the underlying stocks converge to zero when the corresponding strikes diverge to infinity, as explained in the following example.
Example 2.6
Let
and suppose that \(f_{j,t}^{\alpha }\in \mathcal{E}_{t}\) for every \(\alpha \geq 0\), \(j=1,\dots ,d\), \(t=0,\dots ,T\). As shown in Proposition A.1 (ii), to guarantee that (2.5) and (2.6) are satisfied, it is enough to require that \(U\) is (componentwise) nondecreasing on , \(U(0)=0\) and that for \(\beta \in {\mathbb{R}}_{+}\) given in A.1 (i), we have
Condition (2.12) is a requirement on the valuation of single options having maturity \(t\).
Remark 2.7
The proof of Theorem 2.4 will clarify that the use of \(-\mathcal{A}\) in place of \(\mathcal{A}\) in defining \(\pi (c)\) is to some extent a matter of taste. With this choice, the infimum in (2.9) is taken over measures in the polar \(\mathcal{A}^{\circ }\). Without the minus sign in front of \(\mathcal{A}\) in the definition of \(\pi (c)\), we would instead have to work with \((-\mathcal{A})^{\circ }\), which is less convenient in the computations of the proof.
2.2.1 Examples for \(\mathcal{A}\)
We anticipate here financially relevant examples of possible choices of the convex cone \(\mathcal{A}\) and the corresponding set \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ}\).
Example 2.8
To introduce martingale measures in this setup, we set
Thus the space \(\mathcal{H}^{d}\) is the class of admissible trading strategies and ℐ is the set of elementary stochastic integrals. The (possibly empty) class of martingale measures for the canonical process is denoted by \(\mathrm{Mart}(\Omega )\) and consists of all probability measures on \(\mathcal{B}(\Omega )\) which make each of the processes \((X_{t}^{j})\) a martingale under the natural filtration \(\mathcal{F}_{t}:=\sigma (X_{s}^{j},s\leq t,j=1,\dots ,d),t=0,\dots ,T\). Equivalently,
It is then clear that choosing \(\mathcal{A}=\mathcal{I}\), we get \(\mathrm{Mart}(\Omega )=\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\). When \(d=1\), we simply write \(\mathcal{H}=\mathcal{H}^{1}\).
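On a finite path space, the identification of martingale measures with the polar of the stochastic integrals can be checked by hand. The sketch below is illustrative only: it assumes one asset, one trading period and the standard elementary integral \(I^{\Delta }(x)=\Delta (x_{0})(x_{1}-x_{0})\), and all function names are hypothetical. On a finite set, every strategy is a linear combination of indicators of starting points, so testing those indicators suffices.

```python
def integral_expectation(measure, delta):
    """E_Q[I^Delta] with I^Delta(x) = Delta(x0) * (x1 - x0) for a path x = (x0, x1).

    `measure` maps paths (x0, x1) to their probabilities.
    """
    return sum(q * delta(x0) * (x1 - x0) for (x0, x1), q in measure.items())


def is_martingale(measure, tol=1e-12):
    """Check E_Q[I^Delta] = 0 for Delta = indicator of each starting point."""
    starts = {x0 for (x0, _x1) in measure}
    return all(
        abs(integral_expectation(measure, lambda y, a=a: 1.0 if y == a else 0.0)) <= tol
        for a in starts
    )


if __name__ == "__main__":
    # From 0 the price moves to -1 or 1, from 1 it moves to 0 or 2, all equally likely.
    q_mart = {(0, -1): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 2): 0.25}
    # Zero expected increment overall, but not conditionally on the starting point.
    q_swap = {(0, 1): 0.5, (1, 0): 0.5}
    print(is_martingale(q_mart), is_martingale(q_swap))
```

The second measure shows why a single test integrand is not enough: the constant strategy \(\Delta \equiv 1\) has zero expected gain under it, yet the indicator strategies detect the failure of the martingale property.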
Example 2.9
For every \(\varepsilon \geq 0\), the set of so-called \(\varepsilon \)-martingale measures (see Guo and Obłój [24]) is
Thus, taking
(here \(\mathrm{conv}(\,\cdot \, )\) stands for the convex hull in \(C_{0:T}\), which is easily seen to be a cone since \(\mathcal{H}^{d}\) is a vector space), one sees that
Taking in particular \(\varepsilon =0\), we have \(\mathrm{Mart}_{0}(\Omega )=\mathrm{Mart}(\Omega )\) as in Example 2.8. It is interesting to notice that for any sequence \(\varepsilon _{n}\downarrow 0\), we have
Example 2.10
Alternative choices for the set \(\mathcal{A}\) which produce supermartingale or submartingale measures are \(\mathcal{A}^{\pm }=\{I^{\Delta } : \Delta \in (\mathcal{H}^{\pm })^{d} \}\), where we define the sets \(\mathcal{H}^{+}=\{\Delta \in \mathcal{H} : \Delta _{t}\geq 0, \forall t=0,\dots ,T\}\) and \(\mathcal{H}^{-}=-\mathcal{H}^{+}\). The set \(\mathcal{A}^{+}\) models dynamic trading with no short selling and yields
2.2.2 Rephrasing the main results: superhedging and the martingale measures case
For a given proper concave \(U:\mathcal{E}\rightarrow {\mathbb{R}}\), recall the definition of \(S^{U}\) in (1.4) and for \(V(\,\cdot \, )=-U(-\,\cdot \, )\), set \(S_{V}(\varphi ):=-S^{U}(-\varphi )\) and
Observe that in our notation, the superhedging value for \(c\) is
where
The selection of \(-\mathcal{A}\) for \(\pi \) and \(\mathcal{A}\) for \(\pi _{+}\) makes it clear that the two are linked by \(\pi _{+}(c)=-\pi (-c)\), and so the duality results for \(\pi \) can easily be translated into duality results for \(\pi _{+}\). Of course, when \(\mathcal{A}\) is a vector space as in the case of stochastic integrals (see (2.13) and Example 2.8), we have \(\mathcal{A}=-\mathcal{A}\) and there is no need for the different choices \(-\mathcal{A}\) for \(\pi \) and \(\mathcal{A}\) for \(\pi _{+}\).
We now rephrase our findings in Theorem 2.4, with minor additions, to get the formulations in Corollary 2.12 and Corollary 2.13 which will simplify our discussion in Sects. 4 and 5.
We associate to the functions \(c:\Omega \rightarrow (-\infty ,+\infty ]\), \(g:\Omega \rightarrow \lbrack -\infty ,+\infty )\) the sets
Remark 2.11
If \(\mathcal{E}_{t}\subseteq C_{t:t}\) for \(t=0,\dots ,T\), then \(\mathrm{dom}(S^{U})\subseteq C_{0:0}\times \cdots \times C_{T:T}\) and each element \(\varphi _{t}\) in (2.16) is a function of the single variable \(x_{t}\). If additionally \(\mathrm{dom}(S^{U})=\mathcal{E}\) and \(d=1\), (2.16) is consistent with (1.2).
From Theorem 2.4 and the equalities \(\mathcal{S}_{\sup}(\,\cdot \, )=-\mathcal{S}_{{\mathrm {sub}}}(-\,\cdot \, )\), \(S_{V}(\,\cdot \, )=-S^{U}(-\,\cdot \, )\), one easily deduces
Corollary 2.12
Let \(\mathcal{A}=\mathcal{I}\) as in (2.13). Suppose that the assumptions in Theorem 2.4 are satisfied, \(g:\Omega \rightarrow \lbrack -\infty ,+\infty )\) is upper semicontinuous and also condition (2.8) holds with \(c \) replaced by \(-g\). Then
If the left-hand side of (2.18) (resp. (2.19)) is finite, then an optimum exists for the left-hand side of (2.18) (resp. (2.19)).
Corollary 2.13
If \(d=1\) and \(\Omega :=K_{0}\times \cdots \times K_{T}\) for compact sets \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\), then (2.18) and (2.19) as well as existence of optima are guaranteed by the following simplified set of assumptions: \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous, \(g:\Omega \rightarrow \lbrack -\infty ,+\infty )\) is upper semicontinuous and \(U(0)=0\).
Proof
When \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\) are compact, we may repeat the proof of Corollary 2.12 invoking Corollary 2.5 in place of the more general Theorem 2.4. □
In the subsequent sections, we only consider the subhedging price; the corresponding statements for the superhedging price can be obtained in the obvious way just described.
2.3 Literature review
We observe that the EMOT problem on the left-hand side of (1.5) was not previously considered in the literature. The associated subhedging value on the right-hand side of (1.5) is also new, even though different formulations of nonlinear subhedging prices already appeared in the literature. For example, in Föllmer and Schied [21, Sect. 4.8], the use of general risk measures in a non-robust framework allows weakening the pointwise inequality constraint in subhedging problems. In Cheridito et al. [15], the authors, now in a robust framework, consider additionally a general set of discounted trading gains that may describe different market structures, such as transaction costs or trading constraints. In the present paper, we consider instead explicitly nonlinear pricing (i.e., \(S^{U}\) on the right-hand side of (1.5)) of static parts of semistatic trading strategies and its impact in the duality (i.e., \(\mathcal{D}_{U}\) on the left-hand side of (1.5)). Pennanen and Perkkiö [33] also developed a generalised optimal transport duality, which can be applied to study the pricing–hedging duality in a context similar to our additive setup of Sect. 3.
The addition of an entropic term to optimal transport problems was popularised by Cuturi [17], with several applications especially from the computational point of view (see for example the survey/monograph by Peyré and Cuturi [34, Chap. 4]). The Sinkhorn algorithm can be applied with the entropic regularisation procedure described in these works (see Benamou et al. [6] for some advantages). Convergence for this algorithm is studied e.g. in Ireland and Kullback [29] and Rüschendorf [35]. After the present paper was posted on arXiv, several relevant advances were made regarding this topic. We mention here Nutz and Wiesel [32], Bernton et al. [7], Ghosal et al. [23]. We stress that all the papers mentioned in this paragraph address a different problem: in Cuturi [17] and subsequent works, the requirement of an exact matching of the marginal distributions is maintained. In the present setting, we relax this constraint in order to model uncertainty regarding the marginals themselves.
A Sinkhorn algorithm approach was adopted in De March and Henry-Labordère [19] for building an arbitrage-free implied volatility surface from bid-ask quotes, while Henry-Labordère [25] studied a problem related to the entropic relaxation of an optimal transportation problem and Blanchet et al. [9] studied the number of operations needed for approximation of the transport cost with a given accuracy, in the case of entropic regularisation. Our framework also allows the use of a penalisation of the form \(Q\mapsto \sum _{t=0}^{T}\delta _{\widehat{Q}_{t}}(Q_{t} )+\widetilde{D}(Q) \), for some entropic term \(\widetilde{D}\), so that with this choice, the EMOT reduces to the MOT problem with an additional entropic regularisation term, as analysed in the abovementioned literature.
Stability issues have been studied in Backhoff-Veraguas and Pammer [2] and Neufeld and Sester [31] in what we called the sublinear case, namely with no penalty and with fixed marginals.
The works by Bernton et al. [7] and Ghosal et al. [23] study geometric properties of minimisers of the entropic OT, by means of the concept of cyclical invariance. This is a counterpart to the characterisation, using \(c\)-cyclical monotonicity, of the geometry of optimal transport plans in the classical framework of OT. Even though a similar study of geometric properties for optimisers of EMOT would be of great interest, this topic is beyond the scope of the present paper and is left for future research.
In the framework of Liero et al. [30] (i.e., with penalisations of the marginals induced by divergence functions) and after the first version of the present work was posted on arXiv, duality results were obtained in the context of weak martingale optimal entropy transport problems by Chung and Trinh [16].
2.4 Proof of Theorem 2.4
2.4.1 The full technical setup
For a metric space \(\mathbb{X}\), \(\mathcal{B}(\mathbb{X})\) denotes the Borel \(\sigma \)-algebra and \(m\mathcal{B}(\mathbb{X})\) the class of real-valued, Borel-measurable functions on \(\mathbb{X}\). We define the sets
Recall that we fixed \(d(T+1)\) closed sets \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d} \subseteq{\mathbb{R}}\), where \(d\) is the number of stocks and \(T\) the time horizon. We use the following weighted spaces of continuous functions: for an index set \(I\subseteq \{1,\dots ,d\}\times \{0,\dots ,T\}\), we take
For example, we already encountered \(C_{s:t}\) in (2.1), and we also consider
The corresponding norms are denoted by \(\Vert \cdot \Vert _{I},\Vert \cdot \Vert _{s:t},\Vert \cdot \Vert _{t}\), respectively. Some additional details on weighted spaces can be found in Appendix A.1.
Notice that if \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) are compact, then
Observe that by a slight abuse of notation (regarding the domains of the functions), for index sets \(I\subseteq J\subseteq \{1,\dots ,d\}\times \{0,\dots ,T\}\), there is a constant \(0<\theta \leq 1\) such that
As already mentioned in Pennanen and Perkkiö [33] and Cheridito et al. [15], every finite signed Borel measure \(\gamma \) on \(\mathbb{X}\) with \(C_{0:T}\subseteq L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\vert \gamma \vert )\) induces a continuous linear functional \(\lambda \in (C_{0:T})^{\ast }\) via integration, namely
$$ \langle \varphi ,\lambda \rangle =\int _{\mathbb{X}}\varphi \,\mathrm{d}\gamma , \qquad \varphi \in C_{0:T}. $$
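The collection of such functionals, identified with the corresponding measures, is denoted by \(\mathrm{ca}^{1}(\mathbb{X})\), that is,
$$ \mathrm{ca}^{1}(\mathbb{X}):=\{\gamma \text{ finite signed Borel measure on } \mathbb{X} : C_{0:T}\subseteq L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\vert \gamma \vert )\}, $$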
while the classes of nonnegative and of probability measures in \(\mathrm{ca}^{1}(\mathbb{X})\) are denoted by \(\mathrm{Meas}^{1}(\mathbb{X})\) and \(\mathrm{Prob}^{1}(\mathbb{X}) \), respectively. The canonical \(d\)-dimensional process is given by \(X_{t}^{j}(x)=x_{t}^{j}\), \(j=1,\dots ,d\), \(t=0,\dots ,T\). Observe that every \(\phi \in C_{0:T}\) satisfies \(\vert \phi (x)\vert \leq \Vert \phi \Vert _{0:T}( 1+\sum _{t=0}^{T} \sum _{j=1}^{d}\vert x_{t}^{j}\vert ) \), and so for any \(\mu \in \mathrm{Meas}^{1}(\mathbb{X})\), we have \(C_{0:T}\subseteq L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\mu )\) iff \(X_{t}^{j}\in L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\mu )\) for all \(j\) and \(t\).
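Both directions of the last equivalence are short. The sketch below assumes that \(\Vert \cdot \Vert _{0:T}\) is the weighted supremum norm with weight \(1+\sum _{t=0}^{T}\sum _{j=1}^{d}\vert x_{t}^{j}\vert \), which is consistent with the bound just stated. If \(X_{t}^{j}\in L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\mu )\) for all \(j\) and \(t\), then for every \(\phi \in C_{0:T}\),
$$ \int _{\mathbb{X}}\vert \phi \vert \,\mathrm{d}\mu \leq \Vert \phi \Vert _{0:T}\bigg( \mu (\mathbb{X})+\sum _{t=0}^{T}\sum _{j=1}^{d}\int _{\mathbb{X}}\vert X_{t}^{j}\vert \,\mathrm{d}\mu \bigg) < \infty , $$
since \(\mu \) is a finite measure. Conversely, each coordinate map \(X_{t}^{j}\) is continuous and satisfies
$$ \vert X_{t}^{j}(x)\vert \leq 1+\sum _{s=0}^{T}\sum _{i=1}^{d}\vert x_{s}^{i}\vert , $$
so that \(X_{t}^{j}\in C_{0:T}\) with \(\Vert X_{t}^{j}\Vert _{0:T}\leq 1\), and the inclusion \(C_{0:T}\subseteq L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\mu )\) forces \(X_{t}^{j}\) to be integrable.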
Remark 2.14
Under Assumption 2.3 (i), with the notation from there, consider the proper, convex functional \(V(\varphi ):=-U(-\varphi ) \), with \(\mathrm{dom}(V)=\{ \varphi \in \mathcal{E} : V(\varphi )< \infty \}\). We define the (convex) conjugate of \(U\) by
Then \(\mathcal{D}\) is a convex functional and -lower semicontinuous, even if we do not require that \(U\) is -upper semicontinuous. When some \(\gamma \in (C_{0:T})^{\ast }\) is given, we slightly improperly write \(\mathcal{D}(\gamma )=\mathcal{D}(\gamma _{0},\dots ,\gamma _{T})\), where \(\gamma _{t}\) is the restriction of \(\gamma \) to \(C_{0:t}\). We also set
As an immediate consequence of the definitions, the Fenchel inequality holds: if \((\varphi _{0},\dots ,\varphi _{T})\in \mathcal{E}\) and , then
Remark 2.15
Another way to introduce our setting, which is used in Sects. 4.3 and 4.4, is to start with a proper convex functional \(\mathcal{D}:\mathrm{ca}^{1}(\Omega )\rightarrow (-\infty ,+\infty ]\) which is \(\sigma (\mathrm{ca}^{1}(\Omega ),\mathcal{E})\)-lower semicontinuous for an \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\subseteq (C_{0:T})^{T+1}\). By the Fenchel–Moreau theorem, we then have the representation
where now \(V\) is the Fenchel–Moreau (convex) conjugate of \(\mathcal{D}\), namely
Setting \(U(\varphi ):=-V(-\varphi )\), \(\varphi \in \mathcal{E}\), we get back that \(\mathcal{D}\) satisfies (2.21) and additionally that \(U\) is \(\sigma (\mathcal{E},\mathrm{ca}^{1}(\Omega ))\)-upper semicontinuous. In conclusion, a pair \((U,\mathcal{D})\) satisfying (2.21) might be defined either by providing a proper concave \(U:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\) as in Sect. 2.1, or by assigning a proper convex and \(\sigma (\mathcal{E},\mathrm{ca}^{1}(\Omega ))\)-lower semicontinuous \(\mathcal{D}:\mathrm{ca}^{1}(\Omega )\rightarrow (-\infty ,+\infty ]\) as explained in this remark.
2.4.2 Technical comments on Theorem 2.4
Remark 2.16
We now provide conditions ensuring that \(\pi (0)<\infty \), which by Theorem 2.4 (i) implies that \(\pi (c)\in {\mathbb{R}}\) for every \(c\in B_{0:T}\).
(a) If there exists \(\lambda \in \mathcal{A}^{\circ }\cap \partial U(0)\subseteq (C_{0:T})^{ \ast }\), then \(\pi (0)< \infty \). (Here \(\partial U(0)\) is the supergradient of \(U\) at \(0\in \mathcal{E}\), and we identify \(\lambda \) with the vector of its restrictions in writing, improperly, \(\lambda \in \partial U(0)\).) To see this, let \(\lambda \) satisfy \(S^{U}(\varphi )\leq \sum _{t=0}^{T}\langle \varphi _{t},\lambda _{t} \rangle ,\,\forall \varphi \in \mathcal{E}\). In particular, for all \(z\in -\mathcal{A}\) and \(\varphi \in \Phi _{z}(0)\), it then holds that \(S^{U}(\varphi )\leq \langle \sum _{t=0}^{T}\varphi _{t},\lambda \rangle \leq \langle \sum _{t=0}^{T}\varphi _{t}+z,\lambda \rangle \leq 0\), as \(\langle z,\lambda \rangle \geq 0\) for all \(\lambda \in \mathcal{A}^{\circ }\), which in turn yields \(\pi (0)\leq 0\).
(b) We have \(\pi (0)< \infty \) if and only if there exists \(Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) such that \(\mathcal{D}(Q)< \infty \). Indeed, by definition, \(\pi (0)\leq \int _{\Omega }0\,\mathrm{d}Q+\pi ^{\ast }(Q)=\pi ^{\ast }(Q)\). But from Lemma 2.20 (which does not rely on Lemma 2.18), we have
(the latter equality coming from \(Q\in \mathcal{A}^{\circ } \)). Hence \(\pi (0)\leq \mathcal{D}(Q)< \infty \). Conversely, \(\pi (0)< \infty \) implies the existence of a minimum point in (2.9).
Remark 2.17
Set
and observe that \({\Phi }_{z}(c)=\widetilde{{\Phi }}_{z}(c)\cap \mathrm{dom}(S^{U})\). Then with \(\sup \emptyset =-\infty \),
To see this, we consider different cases for a fixed \(z\in -\mathcal{A}\).
Case 1: \({\Phi }_{z }(c)=\emptyset \), which means \(\sup _{\varphi \in {\Phi }_{z }(c)}S^{U}(\varphi )=-\infty \) by convention. If \(\widetilde{{\Phi }}_{z }(c)=\emptyset \), then \(\sup _{\varphi \in \widetilde{{\Phi }}_{z }(c)}S^{U}(\varphi )=- \infty \) by convention. If \(\widetilde{{\Phi }}_{z }(c)\neq \emptyset \), then \(\sup _{\varphi \in \widetilde{{\Phi }}_{z }(c)}S^{U}(\varphi )=- \infty \) since for every \(\varphi \in \widetilde{{\Phi }}_{z }(c)\), we have \(S^{U}(\varphi )=-\infty \) as \(\varphi \notin \mathrm{dom}(S^{U})\).
Case 2: \({\Phi }_{z}(c)\neq \emptyset \). Then \(\widetilde{{\Phi }}_{z}(c)\neq \emptyset \), too, and since we can ignore all functions \(\varphi \in \widetilde{{\Phi }}_{z}(c)\setminus {{\Phi }}_{z}(c)\) (which produce values \(S^{U}(\varphi )=-\infty \)), we get
2.4.3 The proof
The proof of Theorem 2.4 is split into Lemmas 2.18, 2.20 and 2.22–2.24, which are then combined in Lemma 2.25.
Lemma 2.18
Under Assumption 2.3 and if (2.7) holds, Theorem 2.4 (i) holds. Moreover, the restriction of \(\pi \) to \(C_{0:T}\) satisfies
for
Proof
Suppose that \(\pi (\,\widehat{c}\,)<\infty \) for some \(\widehat{c}\in B_{0:T}\). To prove that \(\pi (c)>-\infty \) for every \(c\in B_{0:T}\), it is (more than) enough to show that
Set \(H_{n}=H_{0}(n)\times \cdots \times H_{T}(n)\subseteq \Omega \). Observe that whenever \(c\in B_{0:T}\) is given, we have for every \(n\geq 1\) and \(x\in H_{n}\) that
and for every \(x\in \Omega \setminus H_{n}\) that
using (2.5) in the last inequality. Thus for every \(x\in \Omega \),
If we now show that \((- \Vert c-z \Vert _{0:T}f_{t}^{n})_{0\leq t\leq T}\in \mathrm{dom}(S^{U})\) for \(n\) big enough, we then conclude that
by cash-additivity of \(S^{U}\), and at the same time,
by definition. This in particular proves that \(\pi (c)>-\infty \). Going then back to checking \((- \Vert c-z \Vert _{0:T}f_{t}^{n})_{0\leq t\leq T}\in \mathrm{dom}(S^{U})\), observe that
by Assumption 2.3. The fact that \(\pi (c)< \infty \) will follow once we show monotonicity, cash-additivity and concavity of \(\pi \). Monotonicity is trivial: if \(c_{1}\leq c_{2}\), then \({ \Phi }_{z}(c_{1})\subseteq {\Phi }_{z}(c_{2})\) for every \(z\in -\mathcal{A}\) (both sets might be empty). The cash-additivity property can be seen as follows: given \(\beta \in {\mathbb{R}}\) and setting \({1}=(1,\dots ,1)\in {\mathbb{R}}^{T+1}\), observe that whenever \(z\in -\mathcal{A}\) is given, \(\varphi \in {\Phi }_{z}(c+\beta )\) is equivalent to \(\varphi -\frac{\beta }{T+1}{1}\in {\Phi }_{z}(c)\) since by cash-additivity of \(S^{U}\), \(\mathrm{dom}(S^{U})+{\mathbb{R}}^{T+1}=\mathrm{dom}(S^{U})\). Consequently, \(\pi (c+\beta )=\pi (c)+\beta \).
Coming to concavity, it is useful to rewrite \(\pi (c)\) in a slightly more convenient form as
and to recall that whenever \(c\in B_{0:T}\) is given, the set over which we take the supremum on the right-hand side of (2.27) is not empty by (2.26). Take then \(c_{i}\in B_{0:T}\) and associated \(z_{i}\in -\mathcal{A},\varphi ^{i}\in \mathrm{dom}(S^{U})\) with \(\sum _{t=0}^{T}\varphi _{t}^{i}+z_{i}\leq c_{i}\). Define \(c_{ \alpha }=\alpha c_{1}+(1-\alpha )c_{2}\) and analogously \(z_{\alpha }\) and \(\varphi ^{\alpha } \) for \(\alpha \in \lbrack 0,1]\). Then clearly \(\sum _{t=0}^{T}\varphi _{t}^{\alpha }+z_{\alpha }\leq c_{\alpha }\). Combining this with concavity of \(S^{U}\) on \(\mathrm{dom}(S^{U})\), we obtain
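using (2.27) in the last equality. Taking now the supremum over all \(z_{i},\varphi ^{i}\) with \(\sum _{t=0}^{T}\varphi _{t}^{i}+z_{i}\leq c_{i}\), we obtain the concavity inequality (2.28), namely \(\pi (c_{\alpha })\geq \alpha \pi (c_{1})+(1-\alpha )\pi (c_{2})\).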
Notice that up to this point, we have \(\pi (c_{i})\in (-\infty ,+\infty ]\) so that (2.28) makes sense.
Now we can combine (2.28) with the fact that \(\pi (c)>-\infty \) for every \(c\in B_{0:T}\) to show that \(\pi (c)<\infty \) for every \(c\in B_{0:T}\). Indeed, suppose that \(\pi (\,\widetilde{c}\,)= \infty \) for some \(\widetilde{c}\in B_{0:T}\). We know by hypothesis that \(\pi (\,\widehat{c}\,)< \infty \) for some \(\widehat{c}\in B_{0:T}\), and by what we have previously proved, we know that \(\pi (2\,\widehat{c}-\widetilde{c}\,)>-\infty \). Observing that \(\widehat{c}= \alpha (2\,\widehat{c}- \widetilde{c}\,)+ \left ( 1-\alpha \right ) \widetilde{c}\) for \(\alpha =\frac{1}{2}\), we have from (2.28) that \(\pi (\,\widehat{c}\,)\geq \frac{1}{2}\pi (2\,\widehat{c}-\widetilde{c}\,)+\frac{1}{2}\pi (\,\widetilde{c}\,)=\infty \).
This yields a contradiction; thus there can be no \(\widetilde{c}\in B_{0:T}\) with \(\pi (\widetilde{c})= \infty \). Hence \(\pi :B_{0:T}\rightarrow {\mathbb{R}}\) is cash-additive, concave and nondecreasing on \(B_{0:T}\). It is then automatically norm-continuous on \(B_{0:T}\) by the extended Namioka–Klee theorem (see Biagini and Frittelli [8]). The Fenchel–Moreau-type dual representation (2.25) holds, again by the extended Namioka–Klee theorem, this time applied to the restriction of \(\pi \) to \(C_{0:T}\), plus standard arguments involving monotonicity and cash-additivity to prove that \(\pi ^{\ast }(\lambda )<\infty \) implies that \(\lambda \geq 0\) and \(\lambda (1)=1\). See for example Föllmer and Schied [21, Theorem 4.16] for a similar argument whose technique can be adapted here. □
Remark 2.19
Under Assumption 2.3 and if (2.7) holds, \(S^{U}(\varphi )< \infty \) for every \(\varphi \in \mathrm{dom}(S^{U})\). Indeed, choosing \(c_{\varphi }:=\sum _{t=0}^{T}\varphi _{t}\), we get that \(\varphi \in {\Phi }_{0}(c_{\varphi })\) and \(S^{U}(\varphi )\leq \pi (c_{\varphi })< \infty \) by Lemma 2.18.
Lemma 2.20
For every \(\lambda \in (C_{0:T})^{\ast }\) with \(\lambda \geq 0\), we have
If in addition \(\lambda (1)=1\), then
Proof
Fix \(\lambda \in (C_{0:T})^{*}\) with \(\lambda \geq 0\). Then
where the third equality follows from (2.24). Consequently,
At the same time, for every \(\varphi \in \mathcal{E}, z\in -\mathcal{A}\) and for \(\widehat{c}=\sum _{t=0}^{T}\varphi _{t}+z\in C_{0:T}\), we have that \(\varphi \in \Phi _{z}(\,\widehat{c}\,)\). Thus
using (2.29) in the second inequality, and hence
Combining (2.30) and (2.31), we get \(\pi ^{*}(\lambda )=\sigma _{\mathcal{A}}(\lambda )+(S^{U})^{*}(\lambda _{0}, \dots ,\lambda _{T})\). If additionally \(\lambda (1)=1\), then we have
□
Remark 2.21
Under Assumption 2.3, we have \(U(0)\in {\mathbb{R}}\) and therefore
for every \(0\leq \lambda \in (C_{0:T})^{\ast }\).
Lemma 2.22
Under Assumption 2.3 and if (2.7) holds, let \(0\leq \lambda \in (C_{0:T})^{\ast }\) with \(\lambda (1)=1\) be given and define \(0\leq \lambda _{t}=\lambda |_{C_{0:t}}\in (C_{0:t})^{\ast } \). If \((\lambda _{0},\dots ,\lambda _{T})\in \mathrm{dom}(\mathcal{D})\), then there exists a unique \(Q\in \mathrm{Prob}^{1}(\Omega )\) which represents \(\lambda \) on \(C_{0:T}\), i.e.,
Proof
The proof is an adaptation of Bogachev [10, Theorem 7.10.6]. We first stress the fact that \(\lambda _{t}=\lambda |_{C_{0:t}}\in (C_{0:t})^{*}\) is a consequence of (2.20). We apply Proposition A.2. To do so, we show that for a fixed \(\varepsilon >0\) and \(n\) big enough, we may define a set \(H_{0}(n)\times \cdots \times H_{T}(n)\) that is compact (since so are all the factors) and satisfies the assumptions in Proposition A.2. Suppose that a given \(\varphi \in C_{0:T}\) satisfies \(\varphi (x)=0\) for every \(x\in H_{0}(n)\times \cdots \times H_{T}(n)\). We also have automatically that
By Assumption 2.3, we then have for every \(x\in \Omega \setminus (H_{0}(n)\times \cdots \times H_{T}(n))\) that
Since, moreover, \(\varphi = 0\) on \(H_{0}(n)\times \cdots \times H_{T}(n)\) by assumption, we get
Then for every \(a>0\), we have
where (2.33) follows from positivity of \(\lambda \), (2.32), linearity and the fact that we have \(\lambda _{t}:=\lambda |_{C_{0:t}}\in (C_{0:t})^{*}\), while (2.34) follows from the Fenchel inequality (2.22).
Since \((\lambda _{0},\dots ,\lambda _{T}) \in \mathrm{dom}(\mathcal{D})\) by hypothesis, we can select \(a>0\) such that \(\frac{1}{a} \mathcal{D}(\lambda _{0},\dots ,\lambda _{T})\leq \frac{\varepsilon}{2}\). Choose now \(n\) in such a way that \(\frac{1}{a}V ( a f^{n}_{0},\dots ,a f^{n}_{T} )\leq \frac{\varepsilon}{2}\) for every \(s\leq T\) (which is possible by Assumption 2.3). Continuing from (2.34), we get
The result now follows by combining Proposition A.2 and the Daniell–Stone result in Theorem A.3. □
Lemma 2.23
Under Assumption 2.3 and if (2.7) holds, equation (2.9) is true for every \(c\in C_{0:T}\), with a minimum in place of the infimum.
Proof
Combining Lemmas 2.18, 2.20 and 2.22, we have
where the first equality uses Lemma 2.18, the second Lemma 2.20, and the third and fifth equalities the fact that \(\mathcal{D}\) is bounded from below by \(S^{U}(0)\) by Remark 2.21, hence \((\lambda _{0},\dots ,\lambda _{T})\in \mathrm{dom}(\mathcal{D}) \) if and only if \(\mathcal{D}(\lambda _{0},\dots ,\lambda _{T})< \infty \). Moreover, in (2.35), we use Lemma 2.22 and identify probability measures \(Q\in \mathrm{Prob}^{1}(\Omega )\) and their induced functionals, as well as the marginals \(Q_{t}\) of such measures, with the restrictions of such functionals to \({C}_{0:t}\). Finally, in the last equality, we just use the definition of \(\sigma _{\mathcal{A}}\) and the fact that \(\pi (c)< \infty \) by Lemma 2.18. □
Lemma 2.24
Under Assumption 2.3 and if (2.7) holds, the sublevel set
is \(\sigma ((C_{0:T})^{\ast },C_{0:T})|_{\mathrm{Prob}^{1}(\Omega )}\)-(sequentially) compact for every \(b \in {\mathbb{R}}\).
Proof
To begin with, we show that \(\{\lambda \in (C_{0:T})^{\ast } : \lambda \geq 0,\lambda (1)=1,\pi ^{ \ast }(\lambda )\leq b \}\) is weak∗ compact. First, we prove that \(B := \{\lambda \in (C_{0:T})^{\ast } : \pi ^{\ast }(\lambda )\leq b \} \) is weak∗ compact. Observe that by (2.25), we have for every \(r>0\) and \(\lambda \in (C_{0:T})^{\ast }\) with \(\pi ^{\ast }(\lambda )\leq b \) that
Now since \(-\pi (\,\cdot \, )\) is real-valued, convex and continuous on \(C_{0:T}\) by Lemma 2.18, it follows from Aliprantis and Border [1, Theorem 5.43] that the right-hand side in (2.36) is finite for some \(r>0\). Thus the operator norms of elements of the set \(B\) are uniformly bounded, and so \(B\) is contained in some (weak∗ compact, by the Banach–Alaoglu theorem) ball of \((C_{0:T})^{\ast }\). Since \(\pi ^{\ast }\) is weak∗ lower semicontinuous by its very definition, its sublevel sets are weak∗ closed. This concludes the proof of weak∗ compactness of \(B\). Next,
is the intersection of a weak∗ compact set and weak∗ closed sets; hence it is weak∗ compact. Combining the fact that \(\sigma _{\mathcal{A}}=\delta _{\mathcal{A}^{\circ }} \) and Lemma 2.20,
By Lemma 2.22, we can identify normalised nonnegative functionals in \(\mathrm{dom}(\mathcal{D})\) and measures in \(\mathrm{Prob}^{1}(\Omega )\), so that by a slight abuse of notation, we can write
Moving to sequential compactness, the topology \(\tau =\sigma ((C_{0:T})^{\ast },C_{0:T})|_{\mathrm{Prob}^{1}(\Omega )}\) induced on \(\mathrm{Prob}^{1}(\Omega )\) is the topology generated by the 1-Wasserstein distance by Bolley [11, Theorem A.2] (see also the discussion in the introduction of Bolley [12], and Villani [38, Definition 6.8.(iv) and Theorem 6.9]). By the above argument, the set \(\{Q \in \mathrm{Prob}^{1}(\Omega ) : \mathcal{D}(Q )\leq b \}\) is then a compact subset of \((\mathrm{Prob}^{1}(\Omega ),\tau )\), which is then 1-Wasserstein sequentially compact. As \(\mathcal{A}^{\circ}\) is weak∗ closed by its definition, the result follows. □
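To see why the topology on \(\mathrm{Prob}^{1}(\Omega )\) is stronger than the usual weak topology, it may help to look at a one-dimensional example. The sketch below is our own illustration and not part of the proof: it computes the 1-Wasserstein distance between finitely supported measures on \({\mathbb{R}}\) through the formula \(W_{1}(P,Q)=\int \vert F_{P}-F_{Q}\vert \,\mathrm{d}x\), and applies it to the hypothetical sequence \(Q_{n}=(1-\frac{1}{n})\delta _{0}+\frac{1}{n}\delta _{n}\).

```python
def wasserstein1(p, q):
    """W1 between finitely supported measures on R, given as {point: mass} dicts.

    Uses W1 = integral of |F_p - F_q| over the real line.
    """
    points = sorted(set(p) | set(q))
    total, cdf_gap = 0.0, 0.0
    for left, right in zip(points, points[1:]):
        cdf_gap += p.get(left, 0.0) - q.get(left, 0.0)
        total += abs(cdf_gap) * (right - left)
    return total

delta0 = {0.0: 1.0}

def q_n(n):
    # Hypothetical sequence: mass 1/n escapes to the point n.
    return {0.0: 1.0 - 1.0 / n, float(n): 1.0 / n}

distances = [wasserstein1(q_n(n), delta0) for n in (2, 10, 100, 1000)]
```

Each \(Q_{n}\) integrates bounded continuous functions like \(\delta _{0}\) in the limit, but the distance stays equal to 1 and \(\int x\,\mathrm{d}Q_{n}=1\) for every \(n\). Such a sequence therefore does not converge in \(\tau \), since \(x\mapsto x\) belongs to \(C_{0:T}\).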
Lemma 2.25
Under Assumption 2.3, for every lower semicontinuous functional
satisfying (2.8), the duality (2.9) holds, and if \(\pi (c)<\infty \), the infimum in (2.9) is a minimum.
Proof
Take \(c\) as in the statement. Observe that from the definition of \(\pi \) and the Fenchel inequality on \(S^{U}\), for any \(Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\), we have
where the second inequality follows from \({Q}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) and the third is a consequence of Lemma 2.20. Hence
The case \(\pi (c)= \infty \) is thus trivial, and we focus on the case \(\pi (c)<\infty \). Let \(c^{A}(x):=-A( 1+\sum _{t=0}^{T}\sum _{j=1}^{d} \vert x_{t}^{j}\vert )\), for \(x\in \Omega \). Then \(c\geq c^{A}\in C_{0:T}\) and we have \(\pi (c^{A})\leq \pi (c)< \infty \), as can be easily verified. A standard argument produces a sequence \((c_{n})\subseteq C_{0:T}\) with \(c_{n}\uparrow c\) pointwise on \(\Omega \). We claim that given a sequence of optima for the dual problems of \(\pi (c_{n})\), taking a suitable converging subsequence, the limit \(\widehat{Q}\) satisfies \(\widehat{Q}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) and \(E_{\widehat{Q}}[c] +\mathcal{D}(\widehat{Q})\leq \pi (c)\). This and (2.37) then imply (2.9).
To prove the claim, recall from Lemma 2.23 and \(\infty >\pi (c)\geq \pi (c_{n})\) that each dual problem for \(\pi (c_{n})\) admits an optimum; call it \(Q^{n}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\). We proceed by observing that \(\mathcal{D}(Q^{n})\in{\mathbb{R}}\) for every \(n\) and
where we set \(\eta _{t}(x_{t})=\sum _{j=1}^{d}\vert x_{t}^{j}\vert \) for \(x_{t}\in K_{t}\). Now by the Fenchel inequality (2.22), mimicking the argument in (2.34),
Going back to (2.38), we then get
where \(\zeta \in {\mathbb{R}}\) is a constant depending on \(c_{1},V,\eta _{0},\dots ,\eta _{T}\). As \(\pi (c_{n})\leq \pi (c)< \infty \), we conclude that \(\sup _{n}\mathcal{D}(Q^{n})<\infty \), which in turn implies that the sequence \((Q^{n})\) lies in \(\{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ } : \mathcal{D}(Q)\leq b \}\) for \(b \in {\mathbb{R}}\) big enough. We know that the latter set is weak∗ sequentially compact by Lemma 2.24. Thus we can extract a subsequence, which we call again \((Q^{n})\), weak∗ converging to a \(\widehat{Q}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\). Now it is easily seen that
where the first inequality uses the fact that \(Q\mapsto E_{Q}[c_{n}] +\mathcal{D}(Q)\) is weak∗ lower semicontinuous as a sum of weak∗ lower semicontinuous functionals, and the second uses that \(c_{n}\leq c_{m}\) if \(m\geq n\). □
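The approximating sequence \(c_{n}\uparrow c\) used in the proof is only said to come from a standard argument. One common construction, which we assume here purely for illustration, is the Lipschitz regularisation \(c_{n}(x)=\inf _{y}(c(y)+n\vert x-y\vert )\). The sketch below evaluates it on a grid for a hypothetical bounded lower semicontinuous cost on \([-1,1]\). The general linear-growth case (2.8) needs extra care that is not reproduced here.

```python
def lipschitz_regularisation(c, grid, n):
    """Return x -> min_y (c(y) + n * |x - y|), with y ranging over the grid."""
    def c_n(x):
        return min(c(y) + n * abs(x - y) for y in grid)
    return c_n

# Hypothetical lower semicontinuous cost with a jump at 0.
def c(x):
    return 0.0 if x <= 0 else 1.0

grid = [k / 100 for k in range(-100, 101)]
approximants = [lipschitz_regularisation(c, grid, n) for n in (1, 2, 4, 8)]
```

Each approximant is Lipschitz with constant \(n\), hence continuous, and the values increase with \(n\) towards \(c\) at every grid point. For this cost, one finds \(c_{n}(x)=\min (1,n\max (x,0))\).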
2.5 Convergence of EMOT
In this section, we study some stability and convergence results for the EMOT problem. In particular, we show how under suitable convergence assumptions on the penalty terms, one can see the classical MOT as a limit case for EMOT.
We suppose that for each \(n\in \mathbb{N}\cup \{\infty \} \), we are given a functional \(U_{n}\) and a set \(\mathcal{A}_{n}\subseteq C_{0:T}\). We denote the corresponding value as in (2.3) by \(\pi _{n}(c)\).
Proposition 2.26
Suppose that for each \(n\in \mathbb{N}\cup \{\infty \}\), the assumptions of Theorem 2.4 hold for \(\pi _{n}(c)\) and that \(\pi _{n}(c)< \infty \). Suppose that
for every \(Q\in \mathrm{Prob}^{1}(\Omega )\). Then \(\pi _{n}(c)\uparrow \pi _{\infty }(c)\) for every \(c:\Omega \rightarrow (-\infty ,+\infty ]\) which is lower semicontinuous and satisfies (2.8).
Proof
By Lemma 2.23, the dual problem for \(\pi _{n}(c)\) admits an optimum \(Q^{n}\) in the set \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}_{n}^{\circ }\) for each \(n\in \mathbb{N}\). Observe that \(\infty >\pi _{\infty }(c)\geq \sup _{n}\pi _{n}(c)=\lim _{n}\pi _{n}(c)\) and that with an argument similar to the one yielding (2.39),
using Lemma 2.20 to get the equality in the fourth line. As a consequence, for some constant \(\eta \),
Hence all the measures \(Q^{n}, n \in \mathbb{N}\), belong to a sublevel set of the form
which is \(\sigma (\mathrm{Prob}^{1}(\Omega ),C_{0:T})\)-(sequentially) compact by Lemma 2.24. Extract a subsequence, again called \((Q^{n})\), converging to a limit \(Q^{\infty }\in \mathrm{Prob}^{1}(\Omega )\). Since \(\mathcal{D}_{n}\) and \(\sigma _{\mathcal{A}_{n}}\) are lower semicontinuous, so is \(\mathcal{D}_{n}+\sigma _{\mathcal{A}_{n}}\) for every \(n\in \mathbb{N}\cup \{\infty \}\). Hence
using (2.40) in the first equality and the second inequality. Up to taking a further subsequence, again called \((Q^{n})\), we may assume that the lim inf above is in fact a limit, so that
Since \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous and satisfies (2.8) for some \(A\geq 0\), there exists a sequence \((c_{n}) \subseteq C_{0:T}\) with \(c_{n}\uparrow c\) pointwise on \(\Omega \), just as in the proof of Lemma 2.25. By monotone convergence, we then have \(E_{Q}[c]\! =\!\sup _{n}\!E_{Q} [ c_{n} ] \). We conclude that \(Q\mapsto E_{Q}\left [ c\right ] \) is the supremum of linear functionals, each being continuous with respect to the topology induced by \(\sigma ((C_{0:T})^{\ast },C_{0:T})\) on \(\mathrm{Prob}^{1}(\Omega )\). Thus \(Q\mapsto E_{Q}[ c] \) is lower semicontinuous with respect to that topology, and we obtain \(E_{Q^{\infty }}[c]\leq \liminf _{n}E_{Q^{n}}[c]\). Passing to a further subsequence, we can assume that \(\liminf _{n}E_{Q^{n}}[c]=\lim _{n}E_{Q^{n}}[c]\). From the previous arguments, we then get
where we use that the \(Q^{n}\) are optima. Since we already have \(\lim _{n}\pi _{n}(c)\leq \pi _{\infty }(c)\), this concludes the proof of \(\pi _{n}(c)\uparrow \pi _{\infty }(c)\). □
3 Additive structure
In Sect. 2, we did not require any particular structural form of the functionals \(\mathcal{D},U\). We now assume an additive structure of \(U\) and, complementarily, an additive structure of \(\mathcal{D}\). In the entire Sect. 3, we take for each \(t=0,\dots ,T\) a vector subspace \(\mathcal{E}_{t}\subseteq {C}_{t}=C_{t:t}\) such that \(\mathcal{E}_{t}+{\mathbb{R}}=\mathcal{E}_{t}\) and set \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\). Observe that we automatically have \(\mathcal{E}+{\mathbb{R}}^{T+1}=\mathcal{E}\). It is also clear that \(\mathcal{E}\) is a subspace of \((C_{0:T})^{T+1}\) if we interpret \(\mathcal{E}_{0},\dots ,\mathcal{E}_{T}\) as subspaces of \(C_{0:T}\). We also mention here that up to now, we used for \(\lambda \in (C_{0:T})^{*}\) (resp. for a measure \(\mu \in \mathrm{ca}(\Omega )\)) the notation \(\lambda _{t}\) (resp. \(\mu _{t}\)) for the restriction to \(C_{0:t}\) (resp. for the marginal on \(\Omega _{0:t}\)). This was motivated by the fact that we were considering general \(\mathcal{E}_{t}\subseteq C_{0:t}\). Since we now mostly work with \(\mathcal{E}_{t}\subseteq C_{t}\), we change notation slightly.
Notation 3.1
Throughout Sects. 3–5, given \(\lambda \in (C_{0:T})^{*}\) (or a measure \(\mu \in \mathrm{ca}(\Omega )\)), we use the notation \(\lambda _{t}\) (resp. \(\mu _{t}\)) for the restriction to \(C_{t}\) (resp. for the marginal on \(K_{t}\)).
3.1 Additive structure of \(U\)
Setup 3.2
We consider a proper concave functional \(U_{t}:\mathcal{E}_{t}\rightarrow [ -\infty ,+\infty )\) for every \(t=0,\dots ,T\). We define \(\mathcal{D}_{t}\) on \(\mathrm{ca}^{1}(K_{t})\) similarly to (2.21) as
and observe that \(\mathcal{D}_{t}\) can also be viewed as defined on \(\mathrm{ca}^{1}(\Omega )\) by using for \(\gamma \in \mathrm{ca}^{1}(\Omega )\) the marginals \(\gamma _{0},\dots ,\gamma _{T} \) and setting \(\mathcal{D}_{t}(\gamma ):=\mathcal{D}_{t}(\gamma _{t})\). We now define, for each \(\varphi \in \mathcal{E}\), \(U(\varphi ):=\sum _{t=0}^{T}U_{t}(\varphi _{t})\) and define \(\mathcal{D}\) on \(\mathrm{ca}^{1}(\Omega )\) using (2.21). Recall from (1.4) that
Lemma 3.3
In Setup 3.2 and under the convention \(+\infty -\infty =-\infty \), we have
and for all \(\varphi \in \mathcal{E}\) that
Proof
The simple proof is omitted. □
3.2 Duality for the general cash-additive setup
As a consequence of Theorem 2.4, we now obtain a duality in a general cash-additive setup.
Theorem 3.4
Suppose that \(\mathcal{E}_{t}\subseteq C_{t}\) with \(X_{t}\in \mathcal{E}_{t}\) and that \(U_{t}:\mathcal{E}_{t}\rightarrow {\mathbb{R}}\) is a concave, cash-additive functional null in 0. Set \(U(\varphi ):=\sum _{t=0}^{T}U_{t}(\varphi _{t})\) for \(\varphi \in \mathcal{E}= \mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) and suppose that Assumption 2.3 is fulfilled. Recall the formulation of \(\mathcal{S}_{{\mathrm {sub}}}(c)\) from (1.2) and consider for every \(t=0,\dots ,T\) the penalisations
Let \(c:\Omega \rightarrow (-\infty ,+\infty ]\) be lower semicontinuous and such that (2.8) holds. Then
and the infimum in (3.3) is a minimum provided that \(\pi (c)< \infty \).
Proof
Let \(\mathcal{D}\) be defined as in (2.21). Observe that we are in Setup 3.2. Lemma 3.3 tells us that \(S^{U}(\varphi )=\sum _{t=0}^{T}S^{U_{t}}(\varphi _{t})=\sum _{t=0}^{T}U_{t}( \varphi _{t})\), since \(U_{0},\dots ,U_{T}\) are cash-additive, and that \(\mathcal{D}\) coincides on \(\mathrm{Mart}(\Omega )\) with the penalisation term \(Q\mapsto \sum _{t=0}^{T}\mathcal{D}_{t}(Q_{t})\), where \(\mathcal{D}_{t}(Q_{t})\) is given in (3.2). Thus \(S^{U}=U\) and \(\mathrm{dom}(S^{U})=\mathcal{E}\), so all the assumptions of Theorem 2.4 are fulfilled. We can therefore apply Corollary 2.12, which together with Remark 2.11 yields exactly (3.3). □
3.3 Additive structure of \(\mathcal{D}\)
The results of this subsection will be applied in Sects. 4.3 and 4.4. In the spirit of Remark 2.15, we now reverse the procedure taken in the previous subsection: We start from some functionals \(\mathcal{D}_{t}\) on \(\mathrm{ca}^{1}(K_{t})\) for \(t=0,\dots ,T\) and build an additive functional \(\mathcal{D} \) on \(\mathrm{ca}^{1}(\Omega )\). Our aim is to find the counterparts of the results in Sect. 3.1.
Setup 3.5
We consider a proper, convex, \(\sigma (\mathrm{ca}^{1}(K_{t}),\mathcal{E}_{t})\)-lower semicontinuous functional \(\mathcal{D}_{t}:\mathrm{ca}^{1}(K_{t})\rightarrow (-\infty ,+\infty ]\) for every \(t=0,\dots ,T\). We extend the functionals \(\mathcal{D}_{t}\) to \(\mathrm{ca}^{1}(\Omega )\) by using, for any \(\gamma \in \mathrm{ca}(\Omega )\), the marginals \(\gamma _{0},\dots ,\gamma _{T}\). If \(\gamma \in \mathrm{ca}^{1}(\Omega )\), we set
We define \(V(\varphi )\) for \(\varphi \in \mathcal{E}\) and \(V_{t}(\varphi _{t})\) for \(\varphi _{t}\in \mathcal{E}_{t}\) for \(t=0,\dots ,T\) similarly to (2.23) as
We define on \(\mathcal{E}\) the functional \(U(\,\cdot \, )=-V(-\,\cdot \, )\) and similarly \(U_{t}(\,\cdot \, )=-V_{t}(-\,\cdot \, )\) on \(\mathcal{E}_{t}\) for \(t=0,\dots ,T\). Finally, \(S^{U}(\varphi )\), \(S^{U_{0}}(\varphi _{0}),\dots ,S^{U_{T}}(\varphi _{T})\) are defined as in Setup 3.2.
Lemma 3.6
In Setup 3.5, we have the following:
1) \(\mathcal{D}_{0},\dots ,\mathcal{D}_{T}\) as well as \(\mathcal{D}\) are \(\sigma (\mathrm{ca}^{1}(\Omega ),\mathcal{E})\)-lower semicontinuous.
2) Under the additional assumption that \(\mathrm{dom}(\mathcal{D}_{t})\subseteq \mathrm{Prob}^{1}(K_{t})\) for \(t=0,\dots ,T\), we have for any \(\varphi =(\varphi _{0},\dots ,\varphi _{T})\in \mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) that
Proof
The proof is omitted. □
3.4 Divergences induced by utility functions
In this section, we provide the exact formulation of the divergences induced by utility functions \(u_{t}:{\mathbb{R}}\rightarrow \lbrack -\infty ,+\infty )\), distinguishing the two cases \(\mathrm{dom}(u_{t})={\mathbb{R}}\) and \(\mathrm{dom}(u_{t})\supseteq \lbrack 0,\infty )\).
Assumption 3.7
We consider concave, upper semicontinuous nondecreasing functions \(u_{0},\dots ,u_{T}:{\mathbb{R}}\rightarrow \lbrack -\infty ,+\infty )\) with \(u_{0}(0)=\cdots =u_{T}(0)=0\) and \(u_{t}(x)\leq x\), \(\forall x\in {\mathbb{R}}\) (that is, \(1\in \partial u_{0}(0)\cap \cdots \cap \partial u_{T}(0)\)). For each \(t=0,\dots ,T\), we define \(v_{t}(x):=-u_{t}(-x)\), \(x\in {\mathbb{R}}\), and
Remark 3.8
We observe that \(v_{t}(y)=v_{t}^{\ast \ast }(y)=\sup _{x\in {\mathbb{R}}}(xy-v_{t}^{\ast }(x))\) for all \(y\in {\mathbb{R}}\) by the Fenchel–Moreau theorem and that \(v_{t}^{\ast }\) is convex, lower semicontinuous and bounded from below on \({\mathbb{R}}\). Assumption 3.7 is satisfied by a wide range of utility functions.
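As an illustration of the biconjugate identity, consider the exponential utility \(u(x)=1-e^{-x}\). This choice is ours and is not singled out in the text. It satisfies Assumption 3.7, and taking \(v^{\ast }\) to be the usual convex conjugate \(v^{\ast }(y)=\sup _{x}(xy-v(x))\) of \(v(x)=e^{x}-1\), one finds \(v^{\ast }(y)=y\log y-y+1\) for \(y>0\), \(v^{\ast }(0)=1\) and \(v^{\ast }(y)=+\infty \) for \(y<0\). The following sketch checks both the conjugate and the identity \(v=v^{\ast \ast }\) on a grid; the helper names are hypothetical.

```python
import math

def v(x):
    # v(x) = -u(-x) for the exponential utility u(x) = 1 - exp(-x).
    return math.exp(x) - 1.0

def v_star_closed_form(y):
    if y < 0:
        return math.inf
    if y == 0:
        return 1.0
    return y * math.log(y) - y + 1.0

def sup_on_grid(f, lo, hi, steps=20001):
    """Maximum of f over an equally spaced grid on [lo, hi]."""
    h = (hi - lo) / (steps - 1)
    return max(f(lo + k * h) for k in range(steps))

def v_star_numeric(y):
    return sup_on_grid(lambda x: x * y - v(x), -20.0, 5.0)

def v_biconjugate_numeric(x):
    return sup_on_grid(lambda y: x * y - v_star_closed_form(y), 0.0, 10.0)
```

The grid supremum approximates the true supremum from below, so the comparison with the closed form only holds up to a small discretisation error. For this utility, the associated divergence is of relative-entropy type.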
Fix \(\widehat{\mu _{t}} \in \mathrm{Meas}(K_{t})\). We define, for \(\mu \in \mathrm{Meas}(K_{t})\),
In the next two results, whose proofs are postponed to Appendix A.2, we provide the dual representation of the divergence terms.
Proposition 3.9
Take \(u_{0},\dots ,u_{T}\) satisfying Assumption 3.7 with
consider closed (possibly noncompact) \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\) and let \(\widehat{\mu }_{t}\in \mathrm{Meas}(K_{t})\), \(t= 0,\dots ,T\). Then
Let \(\widehat{Q}_{t}\in \mathrm{Prob}(K_{t})\) and for \(\mu \in \mathrm{Meas}(K_{t})\), let \(\mu =\mu _{a}+\mu _{s} \) be the Lebesgue decomposition of \(\mu \) with respect to \(\widehat{Q}_{t}\), where \(\mu _{a}\ll \widehat{Q}_{t}\) and \(\mu _{s}\perp \widehat{Q}_{t} \). Set
As \(u_{t}(0)=0\), \((v_{t}^{\ast })_{\infty }^{\prime }\in \lbrack 0, \infty ]\) since \(v^{*}_{t}(y)\geq u_{t}(0)-0 y=0\). Then we can define, for \(\mu \in \mathrm{Meas}(K_{t})\),
where we use the convention \(\infty \cdot 0=0\) if \((v_{t}^{\ast })_{\infty }^{\prime }=\infty \) and \(\mu _{s}(K_{t})=0\). Observe that the restriction of \(\mathcal{F}(\,\cdot \, | \widehat{Q}_{t})\) to \(\mathrm{Meas}(K_{t})\) coincides with the functional in Liero et al. [30, Equation (2.35)] with \(F=v_{t}^{\ast }\), and that whenever \(\mathrm{dom}(u_{t})={\mathbb{R}}\), we have \((v_{t}^{\ast })_{ \infty }^{\prime }=\lim _{y\rightarrow \infty } \frac{v_{t}^{\ast }(y)}{y}= \infty \) and \(\mathcal{F}_{t}(\,\cdot \, | \widehat{Q}_{t})\) coincides with \(\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(\,\cdot \, )\) on \(\mathrm{Meas}(K_{t})\); see (3.5).
Proposition 3.10
Suppose that \(u_{0},\dots ,u_{T}:{\mathbb{R}}\rightarrow [-\infty ,+\infty )\) satisfy Assumption 3.7 and \(K_{0},\dots ,K_{T}\subseteq{\mathbb{R}}\) are compact. If \(\widehat{Q}_{t}\in \mathrm{Prob}(K_{t})\) has full support for all \(t\in \{0, \dots ,T\}\), then
Example 3.11
The requirement that \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\) have full support is crucial for the proof of Proposition 3.10. We provide a simple example to show that (3.7) does not hold in general when that assumption is not fulfilled. Take \(K= \{-2,0,2\}\), \(\widehat{Q}=\frac{1}{2}\delta _{-2}+\frac{1}{2}\delta _{+2}\), \(\mu =\delta _{0}\), \(u(x)=\frac{x}{x+1}\) for \(x> -1\) and \(u(x)=-\infty \) for \(x\leq -1\). It is easy to see that \(v^{\ast }\) associated via (3.4) is given by \(v^{\ast }(y)=1+y-2\sqrt{y}\) for \(y\geq 0\) and \(v^{\ast }(y)=+\infty \) for \(y<0\), so that \((v^{\ast })_{\infty }^{\prime }=1\). It is also easy to see that \(\mu \perp \widehat{Q}\); hence \(\mu _{a}=0\) and \(\mu _{s}=\mu \) in the Lebesgue decomposition with respect to \(\widehat{Q}\). Hence \(\mathcal{F}(\mu | \widehat{Q})=1+1\mu (K)=2\). At the same time, we see that taking \(\varphi _{N}\in \mathcal{C}_{b}(K)\) defined via \(\varphi _{N}(-2)=\varphi _{N}(2)=0,\varphi _{N}(0)=-N\) (observe that \(u(\varphi _{N})\notin \mathcal{C}_{b}(K)\) for \(N\) sufficiently large), we have
4 Applications in the compact case
We suppose in the entire Sect. 4 that the following requirements are fulfilled.
Standing Assumption 4.1
Let \(d=1\) and \(\Omega :=K_{0}\times \cdots \times K_{T}\) for compact sets \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\), \(K_{0}=\{x_{0}\}\) for some \(x_{0}\in{\mathbb{R}}\), the functional \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous, \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) is a given martingale measure with marginals \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\), and \(c\in L^{1}(\widehat{Q})\).
Under this assumption, \(C_{0:T}=\mathcal{C}_{b}(\Omega )\) and \((C_{0:T})^{\ast }=\mathrm{ca}(\Omega )=\mathrm{ca}^{1}(\Omega )\). We observe that the stock price \((X_{t})\) is bounded due to the compactness of \(K_{0},\dots ,K_{T}\). As a consequence, if we consider for example the call option \((X_{t}-\alpha )^{+}\), \(\alpha \in \mathbb{R}\), then it is also bounded on \(\Omega \). The selection \(\mathcal{E} \subseteq \mathcal{C}_{b}(K_{0})\times \cdots \times \mathcal{C}_{b}(K_{T})\) is appropriate in this context.
4.1 Subhedging with vanilla options
As in Beiglböck et al. [4], we suppose in Sect. 4.1 that the elements in \(\mathcal{E}_{t}\) represent portfolios obtained by combining call options with maturity \(t\), units of the underlying stock at time \(t\) and deterministic amounts, that is, \(\mathcal{E}_{t}\) consists of all the functions in \(\mathcal{C}_{b}(K_{t})\) of the form
and take \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\). As one can see in the proofs of Corollaries 4.3 and 4.5, which are the core content of this section, one could as well take instead the space \(\mathcal{E}=\mathcal{C}_{b}(K_{0}) \times \cdots \times \mathcal{C}_{b}(K_{T})\) and preserve the validity of (4.3) and (4.5).
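As a small illustration of the elements of \(\mathcal{E}_{t}\), the sketch below builds a payoff of the type described above. We assume the parametrisation \(x\mapsto a+bx+\sum _{k}w_{k}(x-\kappa _{k})^{+}\) with cash amount \(a\), \(b\) units of stock and finitely many calls with weights \(w_{k}\) and strikes \(\kappa _{k}\). The function name and the butterfly example are hypothetical.

```python
def vanilla_portfolio(cash, stock_units, calls):
    """Payoff x -> cash + stock_units * x + sum_k weight_k * max(x - strike_k, 0).

    `calls` is a list of (weight, strike) pairs.
    """
    def payoff(x):
        return cash + stock_units * x + sum(w * max(x - k, 0.0) for w, k in calls)
    return payoff

# Hypothetical butterfly spread centred at 100, built from three calls.
butterfly = vanilla_portfolio(0.0, 0.0, [(1.0, 90.0), (-2.0, 100.0), (1.0, 110.0)])
values = [butterfly(x) for x in (80.0, 95.0, 100.0, 105.0, 120.0)]
```

Every such payoff is piecewise linear with kinks at the strikes, which is the property used later to obtain norm-density in \(\mathcal{C}_{b}(K_{t})\).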
In all the results in Sect. 4.1, the functional \(U\) is real-valued on the whole space ℰ and cash-additive, which yields \(\mathrm{dom}(U)= \mathrm{dom}(S^{U})=\mathcal{E}\). Thus we can use Corollaries 2.5 and 2.12, in particular (2.16) and (2.17), in the case \(\mathrm{dom}(S^{U})=\mathcal{E}\).
Take \(U_{t}(\varphi _{t}):=\int _{K_{t}}u_{t}(\varphi _{t}(x_{t})) \mathrm{d}\widehat{Q}_{t}(x_{t})\). We work with the associated (one-dimensional) optimised certainty equivalent \(S^{U_{t}}\) that we rename, for \(\varphi _{t}\in \mathcal{C} _{b}(K_{t})\), as
We observe that Assumption 3.7 does not impose that the functions \(u_{t}\) are real-valued on all of ℝ. Nevertheless, for the functional \(U_{\widehat{Q}_{t}}\), it can be easily shown that we have the following result whose proof is omitted.
Lemma 4.2
Under Assumption 3.7, for each \(t=0,\dots ,T\), \(U_{\widehat{Q}_{t}}\) is real-valued on \(\mathcal{C}_{b}(K_{t})\), null at 0, concave, nondecreasing and cash-additive.
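The cash-additivity stated in Lemma 4.2 follows from a change of variable. The sketch below assumes that \(U_{\widehat{Q}_{t}}\) has the usual optimised-certainty-equivalent form of Ben-Tal and Teboulle [5], that is, a supremum over a cash amount \(\alpha \); this form is an assumption made for illustration and should be compared with (4.2).

```latex
% Assumed form (to be compared with (4.2)):
%   U_{\widehat{Q}_t}(\varphi_t)
%     = \sup_{\alpha\in\mathbb{R}} \Big( \int_{K_t} u_t(\varphi_t+\alpha)\,\mathrm{d}\widehat{Q}_t - \alpha \Big).
\begin{align*}
U_{\widehat{Q}_{t}}(\varphi_{t}+m)
 &= \sup_{\alpha\in\mathbb{R}}\Big(\int_{K_{t}}u_{t}(\varphi_{t}+m+\alpha)\,\mathrm{d}\widehat{Q}_{t}-\alpha\Big)\\
 &= \sup_{\beta\in\mathbb{R}}\Big(\int_{K_{t}}u_{t}(\varphi_{t}+\beta)\,\mathrm{d}\widehat{Q}_{t}-\beta\Big)+m
 \qquad (\beta=\alpha+m)\\
 &= U_{\widehat{Q}_{t}}(\varphi_{t})+m, \qquad m\in\mathbb{R}.
\end{align*}
```

Under the same assumed form, the choice \(u_{t}(x)=x\) makes the expression inside the supremum independent of \(\alpha \), so that \(U_{\widehat{Q}_{t}}(\varphi _{t})=E_{\widehat{Q}_{t}}[\varphi _{t}]\).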
Corollary 4.3
Take \(u_{0},\dots ,u_{T}\) satisfying Assumption 3.7 and suppose that we have \(\mathrm{dom}(u_{0})=\cdots = \mathrm{dom}(u_{T})={\mathbb{R}}\). Then
Moreover, if the left-hand side of (4.3) is finite, a minimum point exists.
Proof
Set \(U(\varphi )=\sum _{t=0}^{T}U_{\widehat{Q}_{t}}(\varphi _{t})\) for \(\varphi \in \mathcal{E}\). By Lemma 4.2, for each \(t=0,\dots ,T\) the functional \(\varphi _{t}\mapsto U_{\widehat{Q}_{t}}(\varphi _{t})\) is well defined, finite-valued, concave and nondecreasing on all of \(\mathcal{C}_{b}(K_{t})\). Hence by the extended Namioka–Klee theorem (see [8]), it is norm-continuous on \(\mathcal{C}_{b}(K_{t})\). We observe that in this case, we are in Setup 3.2 and can apply (3.1) from Lemma 3.3. We have for every \(Q\in \mathrm{Mart}(\Omega )\) that
Indeed, in (4.4), we combine the continuity of \(U_{\widehat{Q}_{t}}\) on \(\mathcal{C}_{b}(K_{t})\) with the fact that \(\mathcal{E}_{t}\) consists of all piecewise linear functions on \(K_{t}\) so that \(\mathcal{E}_{t}\) is norm-dense in \(\mathcal{C}_{b}(K_{t})\); in the fourth equality, we use the fact that for \(\widetilde{\varphi}_{t}:=\varphi _{t}+\alpha _{t} \) and for every \({Q}\in \mathrm{Mart}(\Omega )\), \(\int _{K_{t}}\widetilde{\varphi}_{t}\,\mathrm{d}{Q}_{t}=\int _{K_{t}}{\varphi}_{t}\,\mathrm{d}{Q}_{t}+\alpha _{t} \); in the fifth equality, we exploit the fact that \(\widetilde{\varphi}_{t}\in \mathcal{E}_{t}\) for every \(\varphi _{t}\in \mathcal{E}_{t}\), \(\alpha _{t} \in {\mathbb{R}}\); and the last equality follows from (3.6).
Using Lemma 3.3 and the fact that \(U_{\widehat{Q}_{0}},\dots ,U_{\widehat{Q}_{T}}\) are cash-additive, we obtain \(S^{U}(\varphi )=\sum _{t=0}^{T}S^{U_{\widehat{Q}_{t}}}(\varphi _{t})= \sum _{t=0}^{T}U_{\widehat{Q}_{t}}(\varphi _{t})=U(\varphi )\). By Lemma 4.2, the assumptions of Corollary 2.13 are satisfied so that we obtain
Existence of optima follows again from Corollary 2.13. □
We stress the fact that in Corollary 4.3, we assume that all the functions \(u_{0},\dots ,u_{T}\) are real-valued on all of ℝ. A more general result can be obtained when weakening this assumption, but it requires an additional assumption on the marginals of \(\widehat{Q}\).
Corollary 4.4
Suppose Assumption 3.7 is fulfilled. Assume that \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\) have full support on \(K_{0},\dots ,K_{T}\), respectively. Then (4.3) holds true if we replace \(\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})\) with \(\mathcal{F}_{t}(Q_{t}| \widehat{Q}_{t})\). Moreover, finiteness of the problem on the left-hand side of (4.3) implies the existence of a minimum.
Proof
The proof carries over almost literally from the proof of Corollary 4.3, except for replacing the reference to Proposition 3.9 with a reference to Proposition 3.10. □
We stress that in Corollary 4.4, we impose the full support property on \(K_{0},\dots ,K_{T}\) with respect to their induced (Euclidean) topology. In particular, this means that whenever \(k_{t}\in K_{t}\) is an isolated point, \(\widehat{Q}_{t}[\{k_{t}\}]>0\). This is consistent with the assumption \(K_{0}=\{x_{0}\}\), which implies that \(\mathrm{Prob}(K_{0})\) reduces to the Dirac measure, i.e., \(\mathrm{Prob}(K_{0})=\{\delta _{x_{0}}\}\).
We now take \(u_{t}(x)=x\) for \(t=0,\dots ,T\) and get \(U_{\widehat{Q}_{t}}(\varphi _{t})= {E}_{\widehat{Q}_{t}}[\varphi _{t}]\). Hence an easy computation yields for all \(Q\in \mathrm{Mart}(\Omega )\) that
Recalling that \(\mathrm{Mart}(\widehat{Q}_{0},\dots ,\widehat{Q}_{T})=\{Q\in \mathrm{Mart}(\Omega ) : Q_{t}= \widehat{Q}_{t},\forall \,t=0,\dots ,T \}\), we recover from Corollary 4.3 the following result of Beiglböck et al. [4] (under the compactness assumption, which will be dropped in Corollary 5.3).
Corollary 4.5
We have the equality
Moreover, if the left-hand side of (4.5) is finite, a minimum point exists.
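The reduction from Corollary 4.3 to Corollary 4.5 rests on the fact that, for linear utilities, the penalty collapses onto the marginal constraints. A minimal sketch of this computation follows; it assumes that the penalty at time \(t\) is a conjugate-type supremum over the vector space \(\mathcal{E}_{t}\). The exact normalisation in (2.10) may differ, and only the vector-space structure of \(\mathcal{E}_{t}\) is used.

```latex
% Assumption: \mathcal{D}_t(Q_t) is a supremum over \varphi_t \in \mathcal{E}_t of the
% difference of the two expectations; the scalar \lambda is introduced for illustration.
\begin{align*}
\mathcal{D}_{t}(Q_{t})
 &= \sup_{\varphi_{t}\in\mathcal{E}_{t}}
    \Big(E_{\widehat{Q}_{t}}[\varphi_{t}]-E_{Q_{t}}[\varphi_{t}]\Big)
  = \sup_{\varphi_{t}\in\mathcal{E}_{t}}\ \sup_{\lambda\in\mathbb{R}}\
    \lambda\Big(E_{\widehat{Q}_{t}}[\varphi_{t}]-E_{Q_{t}}[\varphi_{t}]\Big)\\
 &= \begin{cases}
      0 & \text{if } E_{Q_{t}}[\varphi_{t}]=E_{\widehat{Q}_{t}}[\varphi_{t}]
          \text{ for all } \varphi_{t}\in\mathcal{E}_{t},\\
      +\infty & \text{otherwise.}
    \end{cases}
\end{align*}
```

Since \(\mathcal{E}_{t}\) is norm-dense in \(\mathcal{C}_{b}(K_{t})\), the first case holds exactly when \(Q_{t}=\widehat{Q}_{t}\), so the penalised infimum over \(\mathrm{Mart}(\Omega )\) reduces to an infimum over \(\mathrm{Mart}(\widehat{Q}_{0},\dots ,\widehat{Q}_{T})\).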
4.2 Subhedging without options
The pricing–hedging duality without options takes the following form.
Corollary 4.6
We have the equality
Moreover, if the left-hand side of (4.6) is finite, a minimum point exists.
Proof
We take \(\mathcal{E}_{0}=\cdots =\mathcal{E}_{T}={\mathbb{R}}\) and \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}={\mathbb{R}}^{T+1}\). If \(u_{t}(x_{t})=x_{t}\), \(t=0,\dots ,T\), and \(\widehat{Q}\in \mathrm{Mart}(\Omega )\), the functional \(U_{\widehat{Q}_{t}}\) defined in (4.2) is given by \(U_{\widehat{Q}_{t}}(m_{t})=m_{t}\) and so \(U(m)=\sum _{t=0}^{T}U_{\widehat{Q}_{t}}(m_{t})=\sum _{t=0}^{T}m_{t}\) for all \(m\in \mathcal{E}\). Hence for each \(\varphi \in \mathcal{E}\) with \(\varphi =(m_{0},\dots ,m_{T})\), \(m\in {\mathbb{R}}^{T+1}\), we select \(U(\varphi )=\sum _{t=0}^{T}m_{t}\). Then by the definition of \(\mathcal{D}\) (see (2.10)), we get
In particular, \(\mathcal{D}(Q)=0\) for every \(Q\in \mathrm{Mart}(\Omega )\). Moreover, we observe that we have \({S}^{U}(\varphi )=U(\varphi )\) for every \(\varphi \in \mathcal{E}\). Applying Corollary 2.13, we get from (2.18) that
We recognise on the right-hand side above the right-hand side of (4.6). Finally, the existence of optima follows again from Corollary 2.13. □
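The quantity \(\inf _{Q\in \mathrm{Mart}(\Omega )}E_{Q}[c]\) can be computed by hand in small cases. The sketch below treats an illustrative one-period setting, \(K_{0}=\{x_{0}\}\) and a finite \(K_{1}\), which is an assumption made for the example and not a setting singled out in the text. It uses the fact that a linear programme with one mass constraint and one mean constraint attains its minimum at a measure supported on at most two points. The function name `subhedging_price` is hypothetical.

```python
from itertools import product


def subhedging_price(points, x0, cost):
    """Infimum of E_Q[cost(X_1)] over probability measures Q on `points` with mean x0.

    Illustrative one-period setting: K_0 = {x0}, K_1 = `points` (finite).
    The mean constraint is the martingale property E_Q[X_1] = x0.
    The minimum is attained at a measure supported on at most two points,
    so scanning all pairs (a, b) with a <= x0 <= b is enough.
    """
    best = float("inf")
    for a, b in product(points, repeat=2):
        if a <= x0 <= b:
            if a == b:
                value = cost(a)  # Dirac mass at x0
            else:
                w = (b - x0) / (b - a)  # weight on a so that the mean equals x0
                value = w * cost(a) + (1 - w) * cost(b)
            best = min(best, value)
    return best  # +inf if no martingale measure exists on `points`


K1 = [-1.0, 0.0, 1.0]
price_concave = subhedging_price(K1, 0.0, lambda x: -x * x)  # spreads mass to -1 and +1
price_convex = subhedging_price(K1, 0.0, lambda x: x * x)    # keeps mass at 0
```

For a concave payoff the minimising martingale measure spreads the mass as far as possible, while for a convex payoff it stays at \(x_{0}\); in both cases the value is the lower convex envelope of the payoff on \(K_{1}\), evaluated at \(x_{0}\).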
4.3 Penalty terms induced by market prices
In this section, we change our perspective. Instead of starting from a given \(U\), we give a particular form of the penalisation term \(\mathcal{D}\) and proceed by identifying the corresponding \(U\) in the spirit of Remark 2.15. For each \(t=0,\dots ,T\), we suppose that finite sequences \((c_{t,n})_{1\leq n\leq N_{t}}\) in ℝ and \((f_{t,n})_{1\leq n\leq N_{t}}\) in \(\mathcal{C}_{b}(K_{t})\) are given. The functions \(f_{t,n}\) represent payoffs of options whose prices \(c_{t,n}\) are known from the market. We also take \(\mathcal{E}_{t}=\mathcal{C}_{b}(K_{t})\) for \(t=0,\dots ,T\) and define
Lemma 4.7
The set \(\mathrm{Mart}_{t}(K_{t})\) is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact.
Proof
Consider the topology \(\tau = \sigma (\mathrm{ca}(\Omega ),\mathcal{C}_{b}(\Omega ))\). We see that \(\mathrm{Mart}(\Omega )\) is a \(\tau \)-closed subset of the \(\tau \)-compact set \(\mathrm{Prob}(\Omega )\) (which is \(\tau \)-compact since \(\Omega \) is a compact Polish space, see [1, Theorem 15.11]); hence it is \(\tau \)-compact. Then \(\mathrm{Mart}_{t}(K_{t})\) is the image of a \(\tau \)-compact set via the marginal map \(\gamma \mapsto \gamma _{t}\) which is \(\tau \)-\(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-continuous; hence it is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact. □
We introduce the notion of a loss function that will be useful here and also in the sequel (see Sects. 4.4 and 4.4.1) to build penalisation functions.
Definition 4.8
A function \(G:{\mathbb{R}}\rightarrow (-\infty ,+\infty ] \) is called a loss function if it is convex, nondecreasing, lower semicontinuous and satisfies \(G(0)=0\). We set \(\mathrm{dom}(G):= \{ x\in {\mathbb{R}} : G(x)< \infty \} \). The conjugate function \(G^{\ast }:{\mathbb{R}}\rightarrow (-\infty ,+\infty ]\) defined by \(G^{\ast }(y)=\sup _{x\in {\mathbb{R}}}(xy-G(x))\) satisfies by the monotonicity of \(G\) that \(G^{\ast }(y)=\infty \) for every \(y<0\).
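As a numerical illustration of Definition 4.8, the sketch below approximates the conjugate \(G^{\ast }\) on a grid for the illustrative loss function \(G(x)=x^{2}/2\) for \(x>0\) and \(G(x)=0\) for \(x\leq 0\). The grid bounds and the helper names `power_loss` and `conjugate_on_grid` are assumptions made for the example.

```python
def power_loss(x, p=2.0):
    """Power-type loss function: x^p / p for x > 0 and 0 for x <= 0."""
    return x ** p / p if x > 0 else 0.0


def conjugate_on_grid(G, y, lo=-50.0, hi=50.0, n=100_001):
    """Approximate G*(y) = sup_x (x*y - G(x)) by a maximum over a uniform grid on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max((lo + i * step) * y - G(lo + i * step) for i in range(n))


approx = conjugate_on_grid(power_loss, 3.0)  # p = 2, hence the conjugate exponent is q = 2
exact = 3.0 ** 2 / 2                         # y^q / q at y = 3

# For y < 0 the grid maximum sits at the left end point and equals -lo * |y|:
# it grows without bound as the grid widens, in line with G*(y) = +infinity.
negative_slope = conjugate_on_grid(power_loss, -1.0)
```

The second computation only reports the grid maximum; the divergence for \(y<0\) is seen by widening `lo`, not from a single call.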
Our requirements allow a wide range of penalisations. For example, we might use power-like penalisations, i.e., \(G(x)=\frac{x^{p}}{{p}}\) for \(x>0\) and \({p}\in (1, \infty )\), \(G(x)=0\) for \(x\leq 0\). In that case, we have \(G^{\ast }(y)=\frac{y^{q}}{{q}}\) for every \(y\geq 0\) for \(\frac{1}{p}+\frac{1}{q}=1\). Alternatively, we might take
For \(\gamma _{t}\in \mathrm{ca}(K_{t})\) we set
Proposition 4.9
Assume that \(G_{t,n}:{\mathbb{R}}\rightarrow (-\infty ,+\infty ]\) is a loss function for all \(n= 1,\dots ,N_{t}\) and \(t=0,\dots ,T\), and that the martingale measure \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) in the Standing Assumption 4.1 also satisfies \(\vert \int _{K_{t}}f_{t,n}\,\mathrm{d}\widehat{Q}_{t}-c_{t,n}\vert \in \mathrm{dom}(G_{t,n})\). Then
where
and \(\Pi ^{{{\mathrm {sub}}}}\) is given in (4.6). Finally, if the left-hand side of (4.8) is finite, a minimum point exists.
Proof
1) Set \(g_{t,n}:=f_{t,n}-c_{t,n} \). For any \(t\in \{0,\dots ,T\}\), we prove that the functional \(\mathcal{D}_{t}^{G}\) is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-lower semicontinuous and that for every \(\varphi _{t}\in \mathcal{C}_{b}(K_{t})\), its Fenchel–Moreau (convex) conjugate satisfies
and thus
Here we use the definition of the superhedging price as
by Corollary 4.6. We observe that \(\mathcal{D}_{t}^{G} \) is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-lower semicontinuous as it is a sum of functions, each being a composition of a lower semicontinuous function and a continuous function on \(\mathrm{Mart}_{t}(K_{t})\) which is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact by Lemma 4.7. We now need to compute
Recall that \(G_{t,n}(x)=\sup _{y\in {\mathbb{R}}}(xy-G_{t,n}^{\ast }(y))\) by the Fenchel–Moreau theorem. Hence
where \(\mathrm{dom}=\mathrm{dom}(G_{t,1}^{\ast })\times \cdots \times \mathrm{dom}(G_{t,N_{t}}^{\ast })\subseteq {\mathbb{R}}^{N_{t}}\). We see that \(\mathcal{T}\) is real-valued on \(\mathrm{dom}\times \mathrm{Mart}_{t}(K_{t})\), convex in the first variable and concave in the second. Moreover, \(\{\mathcal{T}(y_{t},\,\cdot \, )\geq C\}\) is \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-closed in \(\mathrm{Mart}_{t}(K_{t})\) for every \(y_{t}\in \mathrm{dom}\) and every \(C\in {\mathbb{R}}\), and \(\mathrm{Mart}_{t}(K_{t})\) is \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact by Lemma 4.7. As a consequence, \(\mathcal{T}(y_{t},\,\cdot \, )\) is \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-upper semicontinuous on \(\mathrm{Mart}_{t}(K_{t})\). We can apply Simons [36, Theorem 3.1], with \(A=\mathrm{dom}\) and \(B=\mathrm{Mart}_{t}(K_{t})\) endowed with the topology \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\), and interchange inf and sup. From our previous computations, we then get
Equation (4.9) can be obtained with minor manipulations.
2) To conclude, we are clearly in the setup of Corollary 2.13 with \(\mathcal{D}\) given as in Setup 3.5 from \(\mathcal{D}_{0}^{G},\dots ,\mathcal{D}_{T}^{G}\), and by definition \(\mathrm{dom}(\mathcal{D}_{t}^{G})\subseteq \mathrm{Prob}(K_{t})\) for each \(t=0,\dots ,T\). Using Lemma 3.6, 2) together with the computations in 1) and the fact that \(S^{U_{t}^{G}}= U_{t}^{G}\) by cash-additivity of \(U_{t}^{G}\), we get the desired equality from (2.18) in Corollary 2.13; indeed, \(G_{t,n}^{\ast }\) is bounded from below and proper by our assumptions on \(G_{t,n}\), and \(\Pi ^{{\mathrm {sub}}}\) is real-valued and cash-additive on bounded continuous functions. This guarantees that \(V_{t}^{G}(\varphi _{t})\) is null for an appropriate choice of (constant) \(\varphi _{t}\). The existence of optima follows again from Corollary 2.13. □
Remark 4.10
Our assumption of existence of a particular \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) in Proposition 4.9 expresses the fact that we are assuming that our market prices \(c_{t,n}\) are close enough to those given by expectations under some martingale measure.
Example 4.11
Proposition 4.9 covers a wide range of penalisations. For example, we might impose a threshold for the error in computing option prices by taking into account only those martingale measures \(Q\) such that \(\vert \int _{K_{t}}f_{t,n}\,\mathrm{d}Q_{t}-c_{t,n}\vert \leq \varepsilon _{t,n}\) for some \(\varepsilon _{t,n}\geq 0\). To express this, just take \(G_{t,n}\) in the form (4.7) for \(\varepsilon =\varepsilon _{t,n}\).
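The threshold penalisation of Example 4.11 can be sketched for a discrete marginal as follows. The sketch assumes that the penalty is the sum over \(n\) of \(G_{t,n}\) applied to the absolute pricing error, with \(G_{t,n}(x)=0\) for \(x\leq \varepsilon _{t,n}\) and \(+\infty \) otherwise, and that the options are calls. It does not check that the marginal belongs to \(\mathrm{Mart}_{t}(K_{t})\). All names and numbers are hypothetical.

```python
import math


def call_price(support, weights, strike):
    """Expectation of (x - strike)^+ under the discrete measure sum_i weights[i] * delta_{support[i]}."""
    return sum(w * max(x - strike, 0.0) for x, w in zip(support, weights))


def threshold_penalty(support, weights, strikes, market_prices, eps):
    """Sum over n of G_n(|model price - market price|), with G_n = 0 on (-inf, eps[n]] and +inf above."""
    for strike, price, tolerance in zip(strikes, market_prices, eps):
        if abs(call_price(support, weights, strike) - price) > tolerance:
            return math.inf  # one pricing error exceeds its threshold
    return 0.0               # every pricing error is within its threshold


# Hypothetical marginal with mean 100 and two quoted calls.
support, weights = [90.0, 100.0, 110.0], [0.25, 0.5, 0.25]
strikes, market = [95.0, 105.0], [6.0, 1.5]

loose = threshold_penalty(support, weights, strikes, market, eps=[0.5, 0.5])
tight = threshold_penalty(support, weights, strikes, market, eps=[0.1, 0.1])
```

Here the model prices are 6.25 and 1.25, so both pricing errors equal 0.25: the marginal is admissible for the looser thresholds and excluded by the tighter ones.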
Example 4.12
We now study the convergence of the penalised problem described above to the classical MOT problem. We suppose that our information on the marginal distributions increases, by increasing the number of prices available from the market. We take \(f_{t,n}(x_{t})=(x_{t}-\alpha _{n})^{+}\) to be call options with maturity \(t\) and strikes \(\alpha _{n}\), \(n \in \mathbb{N}\), that form a dense subset of ℝ.
We take as loss functions \(G_{t,n}(x)=0\) for \(x\leq 0\) and \(G_{t,n}(x)=\infty \) for all \(x> 0\), \(t=0,\dots ,T\), \(n\geq 1\). This means that on the left-hand side of (4.8), the infimum is taken only over martingale measures whose theoretical prices exactly match the ones for the data, namely \(c_{t,n}\). For each \(t=0,\dots ,T\), \((c_{t,n})\) is a given sequence of prices, and we suppose that they are all computed under the same martingale measure \(\widehat{Q}\in \mathrm{Mart}(\Omega )\). We consider for each \(k\in \mathbb{N}\) the initial segment \(c_{t,1},\dots ,c_{t,N_{t}(k)}\) for sequences \(N_{t}(k)\uparrow \infty \), \(t=0,\dots ,T\). This means that for every \(Q\in \mathrm{Mart}(\Omega )\),
and
so that
From the denseness of \((\alpha _{n})\), we conclude that \(\mathcal{D}_{\infty }(Q)=0\) if \(Q_{t}= \widehat{Q}_{t}\) for \(0\leq t\leq T\), and \(\mathcal{D}_{\infty }(Q)= \infty \) otherwise. As a consequence, by Proposition 2.26, we have the convergence
4.4 Penalty terms given via Wasserstein distance
Let \(d_{t}\) be a metric on \(K_{t}\) (equivalent to the Euclidean one). The (1-)Wasserstein distance induced by \(d_{t}\) is denoted by \(W_{t}:\mathrm{Prob}(K_{t})\times \mathrm{Prob}(K_{t})\rightarrow {\mathbb{R}}\). Let \(\mathrm{Lip}(1,K_{t})\) be the class of \(d_{t}\)-Lipschitz functions on \(K_{t}\) with Lipschitz constant at most 1. Notice that \(\mathrm{Lip}(1,K_{t})\subseteq \mathcal{C}_{b}(K_{t})\) since \(d_{t}\) is equivalent to the Euclidean metric. For each \(t\), let \(G_{t}:{\mathbb{R}}\rightarrow (-\infty ,+\infty ]\) be a loss function as in Definition 4.8. For \(\mathrm{Mart}_{t}(K_{t})\) as in Sect. 4.3, we introduce
Then \(\mathcal{D}^{W}_{t}\) is lower semicontinuous with respect to the topology of weak convergence of probability measures, since the Wasserstein metric metrises the latter for compact underlying spaces and \(\mathrm{Mart}_{t}(K_{t})\) is compact under that topology by Lemma 4.7. We are then in Setup 3.5 and the case 2) of Lemma 3.6. As in Sect. 4.3, we take \(\mathcal{E}_{t}=\mathcal{C}_{b}(K_{t})\) for \(t=0,\dots ,T\).
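For discrete measures on the real line, the distance \(W_{t}\) is easy to evaluate. The sketch below assumes that \(d_{t}\) is the Euclidean metric itself (the text only requires equivalence to it) and uses the one-dimensional identity \(W_{1}(P,Q)=\int \vert F_{P}(x)-F_{Q}(x)\vert \,\mathrm{d}x\), where \(F_{P}\) and \(F_{Q}\) are the distribution functions. The function name `wasserstein1` is hypothetical.

```python
def wasserstein1(support_p, weights_p, support_q, weights_q):
    """1-Wasserstein distance between two discrete measures on the real line (Euclidean metric).

    Uses W_1(P, Q) = integral of |F_P(x) - F_Q(x)| dx, which for discrete
    measures is a finite sum over the gaps between consecutive atoms.
    """
    points = sorted(set(support_p) | set(support_q))
    mass_p = dict.fromkeys(points, 0.0)
    mass_q = dict.fromkeys(points, 0.0)
    for x, w in zip(support_p, weights_p):
        mass_p[x] += w
    for x, w in zip(support_q, weights_q):
        mass_q[x] += w

    total, cdf_gap = 0.0, 0.0
    for left, right in zip(points, points[1:]):
        cdf_gap += mass_p[left] - mass_q[left]  # F_P - F_Q on [left, right)
        total += abs(cdf_gap) * (right - left)
    return total


# Shifting a measure by 0.5 costs 0.5; splitting a Dirac mass at 0 to -1 and +1 costs 1.
d_shift = wasserstein1([0.0, 1.0], [0.5, 0.5], [0.5, 1.5], [0.5, 0.5])
d_split = wasserstein1([0.0], [1.0], [-1.0, 1.0], [0.5, 0.5])
```

The second pair of measures has the same mean, so both can arise as time-1 marginals of one-period martingales started at 0, yet their distance is 1; a penalty \(G_{t}(W_{t}(\cdot ,\widehat{Q}_{t}))\) therefore distinguishes martingale measures that a pure mean constraint cannot.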
Proposition 4.13
For each \(t=0,\dots ,T\), suppose that \(G_{t}\) is a loss function, that there exists a \(Q\in \mathrm{Mart}(\Omega )\) with \(G_{t}(W_{t}(Q_{t},\widehat{Q}_{t}))<\infty \), where \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) is the martingale measure from the Standing Assumption 4.1, and take \(\mathcal{D}^{W}_{t}\) as in (4.10). Then
where
and \(\Pi ^{{{\mathrm {sub}}}}\) is given in (4.6). Finally, if the left-hand side of (4.11) is finite, a minimum point exists.
Proof
Starting from \(\mathcal{D}^{W}_{t}\), we compute the associated \(V^{W}_{t}\) as
for the penalty \(\alpha (y,\ell _{t}):=\int _{K_{t}}y\ell _{t}\mathrm{d}\widehat{Q}_{t}+G_{t}^{\ast }(y)\). In the equality chain above, we use the following: in the third equality, the dual representation of \(G_{t}\); in the fifth, the definition of \(\mathrm{dom}(G_{t}^{\ast })\) and the classical Kantorovich–Rubinstein duality (see Villani [38, Remark 6.5]); in the sixth, Simons [36, Theorem 3.1] (observe that \(\mathrm{Mart}_{t}(K_{t})\) is compact by Lemma 4.7); in the seventh, the definition of the superhedging price \(\Pi ^{{{\mathrm {sup}}}}(g):=-\Pi ^{{{\mathrm {sub}}}}(-g)=\sup _{Q\in \mathrm{Mart}(\Omega )}E_{Q}[ g] \) by Corollary 4.6. Once we have \(V^{W}_{t}\), we have \(U^{W}_{t}\), and then we can argue as in step 2) of the proof of Proposition 4.9, also regarding existence of an optimum. □
Remark 4.14
If \(U^{W}_{t}\) (as well as \(U_{t}^{G}\) in Proposition 4.9) is real-valued on \(\mathcal{C}_{b}(K_{t})\), one might take \(\mathcal{E}_{t}\) as the set of functions of the form (4.1) in place of \(\mathcal{E}_{t}=\mathcal{C}_{b}(K_{t})\) in both Propositions 4.9 and 4.13, using norm-density of the piecewise linear functions just as in the proof of Corollary 4.3.
Remark 4.15
The reader can check that the property \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) is not used in the proof, and that it would suffice to have only \(\widehat{Q}\in \mathrm{Prob}(\Omega )\). This will be exploited in Sect. 4.4.1.
Example 4.16
Taking \(G_{t}(x)=0\) if \(x\leq \varepsilon _{t}\) and \(G_{t}(x)=\infty \) otherwise, we get \(G_{t}^{\ast }(y)= \varepsilon _{t}y\) if \(y\geq 0\) and \(G_{t}^{\ast }(y)= \infty \) otherwise. In this case,
One can verify with the same techniques as in Example 4.12 that we have convergence, as \(\varepsilon _{t}\downarrow 0\) for every \(t=0,\dots ,T\), of the values on the right-hand side of (4.12) to the MOT value on the left-hand side of (4.5).
4.4.1 Convergence with Wasserstein-induced penalisation
As already mentioned, in the classical MOT framework, the marginals \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\) need to be determined, potentially from the prices of many vanilla options. It is then reasonable to suppose that in a real-world situation, one proceeds by approximation, that is, one determines sequences of candidates \((\widehat{Q}_{t}^{n})\subseteq \mathrm{Prob}(K_{t} )\) for \(t=0,\dots ,T\). If such an approximation scheme (whose details are beyond the scope of this paper) is working, one should have a convergence of these sequences to the true marginals. One suitable candidate for the type of convergence is the weak one, namely one might want to have \(\widehat{Q}_{t}^{n}\rightarrow \widehat{Q}_{t}^{\infty }:= \widehat{Q}_{t} \) as \(n \to \infty \) for \(t=0,\dots ,T\) in the weak sense for probability measures. We suppose here that \(K_{0},\dots ,K_{T}\) are compact sets, and so weak convergence is equivalent to convergence in the Wasserstein distance. Proposition 4.17 below shows how the EMOT values treated in Proposition 4.13 and associated to the approximating measures \(\widehat{Q}_{t}^{n},t=0,\dots ,T\), converge to the original MOT value for the true marginals \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\), provided that the loss functions \(G_{t}^{n}\) converge appropriately.
For the next result, it is convenient to rename the martingale measure \(\widehat{Q}\) from the Standing Assumption 4.1 as \(\widehat{Q}^{\infty }\), so that \(\widehat{Q}^{\infty }\in \mathrm{Mart}(\Omega )\) with marginals \(\widehat{Q}_{t}^{\infty }\) and \(c\in L^{1}(\widehat{Q}^{\infty })\).
Proposition 4.17
For each \(n\in \mathbb{N}\cup \{\infty \}\) and \(t=0,\dots ,T\), let \(G_{t}^{n}\) be a loss function with \(G_{t}^{n}(x)\uparrow G_{t}^{\infty }(x)\) as \(n \to \infty \) for every \(x\in {\mathbb{R}}\) and \(G_{t}^{\infty }(x)=\infty \) for every \(x>0\). For every \(t=0,\dots ,T\) and \(n\in \mathbb{N}\), we assume that \(\widehat{Q}_{t}^{n}\in \mathrm{Prob}(K_{t}) \), \(\lim _{n}W_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n})=0\) and \(\lim _{n}G_{t}^{n}(W_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n}))=0\). Then
where
Proof
The second equality in (4.13) is just a consequence of the definition of \(\pi _{\infty }^{W}\) and of \(G_{t}^{\infty }(x)=\infty \) for every \(x>0\). Since we may always pass to a subsequence of \((\widehat{Q}^{n})\), we may assume without loss of generality that \(G_{t}^{n}(W_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n}))<\infty \) for all \(n\in \mathbb{N}\cup \{\infty \}\) and all \(t\) (the case \(n=\infty \) is obvious from \(G_{t}^{\infty }(0)=0\)). We first claim that \(\pi _{n}^{W}(c)\) is finite for all \(n\in \mathbb{N}\cup \{\infty \}\). Indeed, since \(\Omega \) is compact, \(c\) is lower semicontinuous and \(G_{t}^{n}\) are nonnegative on \([0, \infty )\), we have for \(n\in \mathbb{N}\cup \{\infty \}\) that
We now prove that \(\pi _{\infty }^{W}(c)\geq \limsup _{n}\pi _{n}^{W}(c)\). Since \(\pi _{\infty }^{W}(c)< \infty \), there exists an optimum \(Q^{\infty }\in \mathrm{Mart}(\Omega )\) for \(\pi _{\infty }^{W}(c)\), and its marginals satisfy \(Q_{t}^{\infty }=\widehat{Q}_{t}^{\infty }\), \(t=0,\dots ,T\). Then \(G_{t}^{n}(W_{t}({Q}_{t}^{\infty },\widehat{Q}_{t}^{n}))=G_{t}^{n}(W_{t}(\widehat{Q}_{t}^{\infty }, \widehat{Q}_{t}^{n}))\rightarrow 0\) as \(n \to \infty \) and
It only remains to show that \(\liminf _{n}\pi _{n}^{W}(c)\geq \pi _{\infty }^{W}(c)\). Proposition 4.13 and Remark 4.15 guarantee for each \(n\in \mathbb{N}\) the existence of an optimum \(Q^{n}\in \mathrm{Mart}(\Omega )\) for the value \(\pi _{n}^{W}(c)<\infty \). From the proof of Lemma 4.7, we know that \(\mathrm{Mart}(\Omega )\) is weakly compact, and so we can take a subsequence of \(( Q^{n} )\) such that for some \(\widetilde{Q}\in \mathrm{Mart}(\Omega )\), \(W_{t}(Q_{t}^{n_{k}},\widetilde{Q}_{t})\rightarrow 0\) as \(k \to \infty \) for every \(t\) and
Let \(N\in \mathbb{N}\) and recall that \(\sup _{N}G_{t}^{N}(x)=G_{t}^{\infty }(x)\). Then we compute by using the particular form of \(G_{t}^{\infty }\) that
where the first inequality uses \(\widetilde{Q}\!\in \!\mathrm{Mart}(\Omega )\) and the second is justified as follows. From the lower semicontinuity with respect to the weak convergence of \(Q\mapsto \int _{\Omega }c\mathrm{d}Q\), the lower semicontinuity of \(G_{t}^{N}\) and \(W_{t}(\widetilde{Q}_{t},\widehat{Q}_{t}^{\infty })=\lim _{k}W_{t}(Q_{t}^{n_{k}}, \widehat{Q}_{t}^{n_{k}})\) for all \(t\), we obtain
Moreover, \(G_{t}^{n}\) is increasing in \(n\) for each \(t\), and so for all \(n_{k}>N\),
This and (4.16) imply
by (4.14). Taking the supremum over \(N\in \mathbb{N}\), we obtain (4.15). □
5 Applications in the noncompact case
We stress that the extension of the results in Sects. 4.3 and 4.4 to the noncompact case seems to be nontrivial. The main issues come from the verification of Assumption 2.3 (ii), when one starts the analysis from penalisation terms rather than from valuation functionals. Not excluding that such an extension is possible, we leave this topic for future research. However, the case of valuations induced by utility functions can be treated also in the noncompact case, as we describe below. In the noncompact case, Corollary 4.3 takes the following form.
Corollary 5.1
Take \(d=1\), \(K_{0}=\{x_{0}\}\) for some \(x_{0}\in {\mathbb{R}}\) and let \(K_{1},\dots ,K_{T}\subseteq {\mathbb{R}}\) be closed subsets of ℝ. Consider utility functions \(u_{0},\dots ,u_{T}\) satisfying Assumption 3.7, and suppose \(\mathrm{dom}(u_{0})=\cdots =\mathrm{dom}(u_{T})={\mathbb{R}}\). Take for each \(t=0,\dots ,T \) the vector space \(\mathcal{E}_{t}\subseteq C_{t} =C_{t:t}\) (see (2.1)) of functions of the form (4.1), let \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) and fix a \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) such that
Suppose that \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous and satisfies (2.8). Then
where \(U_{\widehat{Q}_{t}}(\varphi _{t})\) is defined in (4.2) for general \(\varphi _{t}\in C_{t}\) and \(\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}\) is given in (3.5). Moreover, the infimum in (5.1) is a minimum provided that the right-hand side of (5.1) is finite.
Proof
All the claims follow from Theorem 3.4 if we show that all its hypotheses are satisfied. To do so, we check the following properties: (i) the functional \({U}_{t}(\varphi _{t})=U_{\widehat{Q}_{t}}(\varphi _{t})\) is real-valued on \({C}_{t}\), concave, nondecreasing and cash-additive; (ii) \(\mathcal{D}_{t}(Q_{t})=\mathcal{D}_{v_{t}^{ \ast },\widehat{Q}_{t}}(Q_{t})\) for \(Q\in \mathrm{Mart}(\Omega )\); (iii) \({U}_{t}(0)=0\) for \(t=0,\dots ,T\) and the conditions (2.5) and (2.6) hold, which we verify using Example 2.6.
To check (i), observe that for every \(t=0,\dots ,T\) and \(\varphi _{t}\in C_{t}\),
where the first inequality in the last line uses the fact that \(u_{t}(x)\leq x\), \(\forall x\in {\mathbb{R}}\), and the finiteness of the last term comes from \(\widehat{Q}\in \mathrm{Prob}^{1}(\Omega )\). Concavity, monotonicity and cash-additivity can be checked by direct computation. Coming to (ii), from Proposition 3.9, for every \(Q\in \mathrm{Prob}^{1}(\Omega )\) and \(t=0,\dots ,T\), we have
where the first equality exploits (3.6) and the last inequality is from the Fenchel inequality \(v_{t}^{\ast }(y)\geq (\varphi _{t}y-v_{t}(\varphi _{t}))\). To conclude the proof of (ii), we have to show that the sup in the above expression can be taken over \(\mathcal{E}_{t}\), as in the penalty term \(\mathcal{D}_{t}\) in Theorem 3.4. Observe that for every \(\varphi _{t}\in C_{t}\), there exists a sequence \((\varphi _{t}^{n}) \subseteq C_{t}\), with each \(\varphi _{t}^{n}\in \mathcal{E}_{t} \) of the form (4.1), such that \(\varphi _{t}^{n}\rightarrow \varphi _{t}\) pointwise on \(K_{t}\) and \(\sup _{n} \Vert \varphi _{t}^{n} \Vert _{t}<\infty \). This implies
for all \(Q\in \mathrm{Prob}^{1}(\Omega )\) by dominated convergence and using the assumption that
Finally, we work on (iii). Take \(f_{t}^{\alpha _{n}}(x_{t})=(\vert x_{t}\vert -\alpha _{n})^{+}\) as in (2.11) and \(\alpha _{n}\uparrow \infty \). Observe that \(f_{t}^{\alpha _{n}}(x_{t})\rightarrow 0\) as \(n \to \infty \) and that the assumption \(u_{t}(x)\leq x\) for all \(x\in \mathbb{R}\) implies \({U}_{t}(0)\leq 0\). Then for every \(a>0\),
where the last limit is by dominated convergence. Indeed, \(f_{t}^{\alpha _{n}}(x_{t})\leq 1+\vert x_{t}\vert \) for every \(x_{t}\in K_{t}\) so that
for every \(x_{t}\in K_{t},a>0\). Now (5.2) yields simultaneously that \({U}_{t}(0)=0\) and \({U}_{t}(-af_{t}^{\alpha _{n}})\rightarrow 0\) as \(n \to \infty \) for all \(t=0,\dots ,T,a>0\). To apply Example 2.6, it is then enough to observe that taking \(U(\varphi ):=\sum _{t=0}^{T}{U}_{t}(\varphi _{t})\) as in Theorem 3.4, we have \(U(0)=0\) and \(U(0,\dots ,0,-af_{t}^{\frac{n}{\beta }},0,\dots ,0)={U}_{t}(-af_{t}^{\frac{n}{\beta }})\rightarrow 0\) as \(n \to \infty \) for all \(a>0\), which is (2.12). □
Remark 5.2
Observe that Corollary 5.1 remains valid for general \(\widehat{Q}_{t}\in \mathrm{Prob}^{1}(K_{t})\), without requiring these to be the marginals of a martingale measure. Indeed, we did not use the martingale property at any point in the above proof.
Just as we obtained Corollary 4.5 from Corollary 4.3 by using the linear utility functions \(u_{t} (x_{t})=x_{t}\), we now deduce the following result from Corollary 5.1; see Beiglböck et al. [4, Theorem 1.1 and Corollary 1.2].
Corollary 5.3
Take \(d=1\), \(K_{0}=\{x_{0}\}\) for some \(x_{0}\in {\mathbb{R}}\) and let \(K_{1},\dots ,K_{T}\subseteq {\mathbb{R}}\) be closed subsets of ℝ. Take for each \(t=0,\dots ,T\) the vector space \(\mathcal{E}_{t}\subseteq C_{t}\) of functions of the form (4.1), let \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) and fix a \(\widehat{Q}\in \mathrm{Mart}(\Omega )\). Then for any \(c:\Omega \rightarrow (-\infty ,+\infty ]\) which is lower semicontinuous and satisfies (2.8), we have
and if \(\pi (c)<\infty \), a minimum point exists for the infimum in (5.3).
Example 5.4
We now study the convergence to the MOT problem. Take functions \(u_{0}, \dots ,u_{T}:{\mathbb{R}}\rightarrow {\mathbb{R}}\) satisfying Assumption 3.7, and assume additionally that these are all differentiable at 0 (which implies that \(\{1\}=\partial u_{0}(0)=\cdots =\partial u_{T}(0)\)). Observe that if we set \(u_{t}^{n}(x):=nu_{t} ( \frac{x}{n} ) \) for \(x\in {\mathbb{R}},t=0,\dots ,T\), the functions \(u_{0}^{n},\dots ,u_{T}^{n}\) still satisfy Assumption 3.7. Moreover, \((v_{t}^{n})^{\ast }(y)=\sup _{x\in {\mathbb{R}}}(u_{t}^{n}(x)-xy)=nv_{t}^{\ast }(y)\), \(y\in {\mathbb{R}}\). Since \(u_{t}(0)=0\), we have \(v_{t}^{\ast }\geq 0\) and as a consequence \(\sup _{n}(v_{t}^{n})^{\ast }(y)=0\) if \(v_{t}^{\ast }(y)=0\) and \(\sup _{n}(v_{t}^{n})^{\ast }(y)=\infty \) otherwise. Moreover, \(v_{t}^{\ast }(y)=0\) implies that we have \(y\in \partial u_{t}(0)=\{1\}\). Consider the set \(\mathcal{A}^{\varepsilon }\) of \(\varepsilon \)-martingale measures defined in (2.14), take \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) and a sequence \(\varepsilon _{n}\downarrow 0\). Using (2.15) gives for every \(Q\in \mathrm{Prob}^{1}(\Omega )\) that
where
As a consequence, by Proposition 2.26,
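The scaling identity \((v_{t}^{n})^{\ast }=nv_{t}^{\ast }\) used in Example 5.4 can be checked numerically. The sketch below takes the illustrative choice \(u(x)=1-e^{-x}\), which is concave, nondecreasing, satisfies \(u(0)=0\) and \(u(x)\leq x\), and is differentiable at 0 with \(u'(0)=1\); whether it meets every requirement of Assumption 3.7 should be checked against that assumption. For this \(u\), one has \(\sup _{x}(u(x)-xy)=1-y+y\log y\) for \(y>0\). The grid bounds and function names are assumptions made for the example.

```python
import math


def v_star(y):
    """Closed form of sup_x (u(x) - x*y) for the illustrative u(x) = 1 - exp(-x), valid for y > 0."""
    return 1.0 - y + y * math.log(y)


def v_star_scaled_numeric(y, n, lo=-20.0, hi=20.0, m=200_001):
    """Grid approximation of sup_x (u_n(x) - x*y), where u_n(x) = n * u(x / n)."""
    step = (hi - lo) / (m - 1)
    best = -math.inf
    for i in range(m):
        x = lo + i * step
        best = max(best, n * (1.0 - math.exp(-x / n)) - x * y)
    return best


y = 2.0
scaled = [v_star_scaled_numeric(y, n) for n in (1, 2, 4)]
expected = [n * v_star(y) for n in (1, 2, 4)]  # n * v*(y), which grows linearly in n
at_one = v_star(1.0)                           # the only point where v* vanishes
```

Because \(v^{\ast }(1)=0\) and \(v^{\ast }(y)>0\) for \(y\neq 1\), the scaled conjugates stay at 0 when \(y=1\) and increase to \(+\infty \) elsewhere, which is the mechanism that forces the limiting marginals to coincide with \(\widehat{Q}_{t}\).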
References
Aliprantis, C.D., Border, K.C.: Infinite Dimensional Analysis, 3rd edn. Springer, Berlin (2006)
Backhoff-Veraguas, J., Pammer, G.: Stability of martingale optimal transport and weak optimal transport. Ann. Appl. Probab. 32, 721–752 (2022)
Bartl, D., Kupper, M., Prömel, D.J., Tangpi, L.: Duality for pathwise superhedging in continuous time. Finance Stoch. 23, 697–728 (2019)
Beiglböck, M., Henry-Labordère, P., Penkner, F.: Model-independent bounds for option prices—a mass transport approach. Finance Stoch. 17, 477–501 (2013)
Ben-Tal, A., Teboulle, M.: An old-new concept of convex risk measures: the optimized certainty equivalent. Math. Finance 17, 449–476 (2007)
Benamou, J.-D., Carlier, G., Cuturi, M., Nenna, L., Peyré, G.: Iterative Bregman projections for regularized transportation problems. SIAM J. Sci. Comput. 37, A1111–A1138 (2015)
Bernton, E., Ghosal, P., Nutz, M.: Entropic optimal transport: geometry and large deviations. Duke Math. J. 171, 3363–3400 (2022)
Biagini, S., Frittelli, M.: On the extension of the Namioka–Klee theorem and on the Fatou property for risk measures. In: Delbaen, F., et al. (eds.) Optimality and Risk—Modern Trends in Mathematical Finance. The Kabanov Festschrift, pp. 1–28. Springer, Berlin (2009)
Blanchet, J., Jambulapati, A., Kent, C., Sidford, A.: Towards optimal running times for optimal transport. Preprint (2020). Available online at https://arxiv.org/abs/1810.07717
Bogachev, V.I.: Measure Theory. Vol. II. Springer, Berlin (2007)
Bolley, F.: Applications du Transport Optimal à des Problèmes de Limites de Champ Moyen. Ph.D. Thesis, Ecole Normale Supérieure de Lyon – ENS LYON (2005). Available online at https://theses.hal.science/tel-00011462
Bolley, F.: Separability and completeness for the Wasserstein distance. In: Donati-Martin, C., et al. (eds.) Séminaire de Probabilités XLI. Lecture Notes in Math., vol. 1934, pp. 371–377. Springer, Berlin (2008)
Breeden, D.T., Litzenberger, R.H.: Prices of state-contingent claims implicit in option prices. J. Bus. 51, 621–651 (1978)
Cheridito, P., Kiiski, M., Prömel, D.J., Soner, H.M.: Martingale optimal transport duality. Math. Ann. 379, 1685–1712 (2021)
Cheridito, P., Kupper, M., Tangpi, L.: Duality formulas for robust pricing and hedging in discrete time. SIAM J. Financ. Math. 8, 738–765 (2017)
Chung, N.-P., Trinh, T.-S.: Weak optimal entropy transport problems. Preprint (2021). Available online at https://arxiv.org/abs/2101.04986
Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Burges, C.J.C., et al. (eds.) Advances in Neural Information Processing Systems, vol. 26, pp. 2292–2300. Curran Associates, Red Hook (2013)
Davis, M., Obłój, J., Raval, V.: Arbitrage bounds for prices of weighted variance swaps. Math. Finance 24, 821–854 (2014)
De March, H., Henry-Labordère, P.: Building arbitrage-free implied volatility: Sinkhorn’s algorithm and variants. Preprint (2020). Available online at https://arxiv.org/abs/1902.04456
Dolinsky, Y., Soner, H.M.: Martingale optimal transport and robust hedging in continuous time. Probab. Theory Relat. Fields 160, 391–427 (2014)
Föllmer, H., Schied, A.: Stochastic Finance. An Introduction in Discrete Time, 4th revised and extended edn. de Gruyter, Berlin (2016)
Galichon, A., Henry-Labordère, P., Touzi, N.: A stochastic control approach to no-arbitrage bounds given marginals, with an application to lookback options. Ann. Appl. Probab. 24, 312–336 (2014)
Ghosal, P., Nutz, M., Bernton, E.: Stability of entropic optimal transport and Schrödinger bridges. J. Funct. Anal. 283, 109622 (2022)
Guo, G., Obłój, J.: Computational methods for martingale optimal transport problems. Ann. Appl. Probab. 29, 3311–3347 (2019)
Henry-Labordère, P.: From (martingale) Schrödinger bridges to a new class of stochastic volatility models. Preprint (2019). Available online at https://arxiv.org/abs/1904.04554
Henry-Labordère, P., Obłój, J., Spoida, P., Touzi, N.: The maximum maximum of a martingale with given \(n\) marginals. Ann. Appl. Probab. 26, 1–44 (2016)
Hobson, D.G.: Robust hedging of the lookback option. Finance Stoch. 2, 329–347 (1998)
Hou, Z., Obłój, J.: Robust pricing–hedging dualities in continuous time. Finance Stoch. 22, 511–567 (2018)
Ireland, C.T., Kullback, S.: Contingency tables with given marginals. Biometrika 55, 179–188 (1968)
Liero, M., Mielke, A., Savaré, G.: Optimal entropy-transport problems and a new Hellinger–Kantorovich distance between positive measures. Invent. Math. 211, 969–1117 (2018)
Neufeld, A., Sester, J.: On the stability of the martingale optimal transport problem: a set-valued map approach. Stat. Probab. Lett. 176, 109131 (2021)
Nutz, M., Wiesel, J.: Entropic optimal transport: convergence of potentials. Probab. Theory Relat. Fields 184, 401–424 (2022)
Pennanen, T., Perkkiö, A.-P.: Convex duality in nonlinear optimal transport. J. Funct. Anal. 277, 1029–1060 (2019)
Peyré, G., Cuturi, M.: Computational Optimal Transport: With Applications to Data Science. Now Publishers, Hanover (2019)
Rüschendorf, L.: Convergence of the iterative proportional fitting procedure. Ann. Stat. 23, 1160–1174 (1995)
Simons, S.: Minimax and Monotonicity. Lecture Notes in Mathematics, vol. 1693. Springer, Berlin (1998)
Tan, X., Touzi, N.: Optimal transportation under controlled stochastic dynamics. Ann. Probab. 41, 3201–3240 (2013)
Villani, C.: Optimal Transport. Old and New. Springer, Berlin (2009)
Funding
Open access funding provided by Università degli Studi di Milano within the CRUI-CARE Agreement.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Appendix
Proposition A.1
(i) There exist constants \(\gamma =\gamma (d,T) > 0, \beta =\beta (d,T)>0 \) such that for every \(A>1\), we have
where \(f^{\alpha}_{j,s}\) is defined in (2.11).
(ii) The conditions in Example 2.6 are sufficient for (2.5) and (2.6) to hold.
Proof
(i) Note that \(f^{\frac{A}{\beta}}_{j,s}(x^{j}_{s})=(|x^{j}_{s}|-\frac{A}{\beta})^{+}\). Fix \(x\in ({\mathbb{R}}^{d} \setminus [-A,A]^{d})^{T+1}\). Define
Then \(\{1,\dots ,d\}\times \{0,\dots ,T\}=I(x)\cup I^{c}(x)\) and \(I(x)\neq \emptyset \). Moreover, for \(\beta >1\),
Choosing e.g. \(\beta =2d(T+1)\), the requirement \(\gamma A \frac{1}{2} - ( A+d(T+1)A+1 ) \geq 0\) is satisfied whenever \(\gamma \geq 2d(T+2)+\frac{2}{A}\). Since \(A>1\), one admissible choice is \(\gamma =2d(T+2)+2\), which depends only on the dimensions \(d,T\).
(ii) Observe first that for any concave function \(F:\mathbb{R}^{T+1}\to \mathbb{R}\), we have
where \(\{ e^{(t)} : t=0,\dots ,T\}\) is the canonical basis in \(\mathbb{R}^{T+1}\). Now we define the set and set \(f_{t}^{n}:=\gamma \sum _{j=1}^{d}f_{j,t}^{\frac{n}{\beta }}\) for \(\gamma ,\beta \) given in (i) above. Then (i) guarantees that (2.5) holds. The concavity and monotonicity of \(U\) and (2.12) imply that for any \(a\geq 0\),
which is (2.6). □
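As a quick sanity check on the constants selected in part (i) of the proof above (a verification of that one inequality only, under the assumptions \(d\geq 1\) and \(A>1\)), substituting \(\gamma =2d(T+2)+2\) gives
\[
\gamma A\tfrac{1}{2}-\big(A+d(T+1)A+1\big)=\big(d(T+2)+1\big)A-\big(d(T+1)+1\big)A-1=dA-1>0,
\]
so the inequality holds for every \(A>1\), and \(\gamma \) depends only on \(d\) and \(T\).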
A.1 Weighted spaces
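We use the notation introduced in Sect. 2.4.1. Consider the following spaces of continuous functions. For \(\psi \in \mathcal{C}(\mathbb{X})\), we set
\[
C_{\psi }:=\Big\{ \varphi \in \mathcal{C}(\mathbb{X}) : \Vert \varphi \Vert _{\psi }:=\sup _{x\in \mathbb{X}}\frac{ \vert \varphi (x) \vert }{1+ \vert \psi (x) \vert }< \infty \Big\} .
\]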
As can be easily verified by following the classical case of bounded continuous functions with the sup-norm, \(C_{\psi }\) is a Banach lattice under the norm \(\Vert \cdot \Vert _{\psi }\). Notice also that \(\mathcal{C}_{b}(\mathbb{X})\ni \varphi \mapsto (1+ \vert \psi \vert ) \varphi \in C_{\psi }\) defines an isomorphism between Banach spaces. The topological dual of \(C_{\psi }\) is denoted by \((C_{\psi })^{\ast }\).
Proposition A.2
Let \(L\in (C_{\psi})^{*}\) be continuous, linear and positive. Suppose that for every \(\varepsilon >0\), there exists a compact \(K_{\varepsilon}\subseteq \mathbb{X}\) such that
Then for every sequence \((c_{n}) \subseteq C_{\psi}\) with \(c_{n}\downarrow 0\) pointwise on \(\mathbb{X}\), we have \(L(c_{n})\downarrow 0\).
Proof
Fix \(\varepsilon >0\) and take the associated compact \(K_{\varepsilon}\). By Dini’s lemma, we have \(\sup _{x\in K_{ \varepsilon}} c_{n}(x)\downarrow 0\). Take \(n\) big enough such that we have \(\sup _{x\in K_{\varepsilon}}c_{n}(x) <\varepsilon \). Define \(0\leq g^{\varepsilon}_{n}:=\min (c_{n},\varepsilon )\). Then clearly \(g_{n}^{\varepsilon}(x)= |g_{n}^{\varepsilon}(x) |\leq \varepsilon \leq \varepsilon (1+ |\psi (x) | )\) for all \(x\in \mathbb{X} \) implies that \(\| g_{n}^{\varepsilon }\|_{\psi}\leq \varepsilon \) and therefore
where \(\| L\|\) is the operator norm (note that \(\| L\|< \infty \) since \(L\) is continuous). Also, since \(\sup _{x\in K_{\varepsilon}}c_{n}(x)<\varepsilon \), we get that \(c_{n}\) and \(g_{n}^{\varepsilon}\) coincide on \(K_{\varepsilon}\) so that \((c_{n}-g_{n}^{\varepsilon})|_{K_{\varepsilon}}= 0\). By using the hypothesis on \(L\), we then get
where the last step uses the Banach lattice property of \(\| \cdot \|_{\psi}\) and that \(\| g_{n}^{\varepsilon }\|_{\psi}\leq \varepsilon \), as shown before. We now combine (A.1) and (A.2) to get
Since \(\varepsilon >0\) is arbitrary, \(L(c_{n})\downarrow 0\). □
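The truncation step in the proof above can be illustrated numerically. The following is a minimal sketch, not part of the original argument. It assumes that the weighted norm has the form \(\sup _{x}|c(x)|/(1+|\psi (x)|)\), which is consistent with the bound \(\| g_{n}^{\varepsilon }\|_{\psi}\leq \varepsilon \) used in the proof. The grid, the weight \(\psi (x)=|x|\), the sequence \(c_{n}(x)=(1+|x|)\min (1,|x|/n)\) and the compact set \([-2,2]\) are all hypothetical choices made for illustration. The sequence decreases to zero pointwise while its weighted norm does not, so the example shows why the argument truncates at level \(\varepsilon \) rather than relying on norm convergence.

```python
# Illustrative sketch only: a finite grid stands in for the space X = R.

def weighted_norm(values, weights):
    """Assumed weighted sup-norm: max of |c(x)| / (1 + |psi(x)|) over the grid."""
    return max(abs(v) / (1.0 + abs(w)) for v, w in zip(values, weights))

R = 1000                                              # grid covers [-R, R] with step 0.1
xs = [i / 10.0 for i in range(-10 * R, 10 * R + 1)]
psi = [abs(x) for x in xs]                            # hypothetical weight psi(x) = |x|

def c(n):
    """c_n(x) = (1 + |x|) * min(1, |x| / n), decreasing to 0 pointwise as n grows."""
    return [(1.0 + abs(x)) * min(1.0, abs(x) / n) for x in xs]

eps, k, n = 0.1, 2.0, 100                             # K_eps = [-k, k] is a hypothetical compact set
c_n = c(n)
g_n = [min(v, eps) for v in c_n]                      # truncation g_n^eps = min(c_n, eps)
on_K = [abs(x) <= k for x in xs]

sup_on_K = max(v for v, inside in zip(c_n, on_K) if inside)
norm_c = weighted_norm(c_n, psi)                      # does not become small for n <= R
norm_g = weighted_norm(g_n, psi)                      # bounded by eps
coincide_on_K = all(v == g for v, g, inside in zip(c_n, g_n, on_K) if inside)

print(sup_on_K, norm_c, norm_g, coincide_on_K)
```

In this sketch, once the supremum of \(c_{n}\) over the compact set drops below \(\varepsilon \), the difference \(c_{n}-g_{n}^{\varepsilon }\) vanishes on that set, which is the situation in which the hypothesis on \(L\) is applied.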
We state below the celebrated Daniell–Stone theorem.
Theorem A.3
Let \(V\) be a vector lattice of functions (i.e., \(f,g\in V\) implies that \(\max (f,g)\in V\)) on a set \(\mathbb{X}\) such that \(1\in V\). Let \(L\) be a linear functional on \(V\) with the properties that \(L(f)\geq 0\) whenever \(f\geq 0\), \(L(1) = 1\), and \(L(f_{n})\rightarrow 0\) for every sequence \((f_{n})\) of functions in \(V\) monotonically decreasing to zero. Then there exists a unique probability measure \(\mu \) on the \(\sigma \)-algebra \(\mathcal{F} = \sigma (V)\) generated by \(V\) such that \(V \subseteq L^{1}(\mu )\) and
\[
L(f)=\int _{\mathbb{X}}f\,\mathrm{d}\mu \quad \text{for all } f\in V.
\]
Proof
See Bogachev [10, Theorem 7.8.1]. □
A.2 Proofs
Proof of Proposition 3.9
We use Liero et al. [30, Theorem 2.7 and Remark 2.8]. To do so, let us rename \(F:=v_{t}^{\ast }\) (see (3.4) for the definition of \(v^{\ast }\)), which implies that \(F^{\circ }(y):=-F^{\ast }(-y) \) of [30, Equation (2.45)] satisfies
\[
F^{\circ }(y)=u_{t}(y), \qquad y\in {\mathbb{R}},
\]
by the Fenchel–Moreau theorem. All the assumptions of [30, Sect. 2.3] on \(F\) are satisfied since \(F(y)\geq u_{t}(0)-0y=0\) for \(y\geq 0\) and \(F(1)=\sup _{x\in {\mathbb{R}}}(u_{t}(x)-x)\leq 0\) (recall that \(u_{t}(x)\leq x\) for all \(x\in {\mathbb{R}}\)). Also, we have \(\lim _{y\to \infty }\frac{F(y)}{y}=F_{\infty }^{\prime }=\infty \) since \(\mathrm{dom}(u_{t})={\mathbb{R}}\). We can then apply [30, Theorem 2.7 and Remark 2.8] to obtain (3.6). We stress the fact that since \(u_{t}\) is finite-valued on all of ℝ, it is continuous there and for every \(\varphi _{t}\in \mathcal{C}_{b}(K_{t})\), we have \(F^{\circ }(\varphi _{t})=u_{t}(\varphi _{t})\in \mathcal{C}_{b}(K_{t})\). So the additional constraint \(F^{\circ }(\varphi _{t})\in \mathcal{C}_{b}(K_{t})\) (below [30, Equation (2.49)]) is redundant in our setup. □
Proof of Proposition 3.10
We exploit again Liero et al. [30, Theorem 2.7 and Remark 2.8] (with \(u_{t}\) in place of \(F^{\circ }\)), as we explain now. Since \(u_{t}\) is nondecreasing, either its domain is of the form \([M,\infty )\) or \((M, \infty )\), with \(M\leq 0\). Given \(\varphi _{t}\in \mathcal{C}_{b}(K_{t})\) and \(\mu \in \mathrm{Meas}(K_{t})\), we have three cases:
If \(\inf (\varphi _{t}({\mathbb{R}}))>M\), then \(u_{t}(\varphi _{t})\in \mathcal{C}_{b}(K_{t})\) since \(u_{t}\) is continuous on the interior of its domain.
If \(\inf (\varphi _{t}({\mathbb{R}}))< M\), then \(\{\varphi _{t}< M\}\) is open and non-empty and hence has positive \(\widehat{Q}_{t}\)-measure as \(\widehat{Q}_{t}\) has full support. Thus \(\int _{K_{t}}u_{t}(\varphi _{t})\,\mathrm{d}\widehat{Q}_{t}=-\infty \).
Finally, if \(\inf (\varphi _{t}({\mathbb{R}}))=M\), then \(u_{t}(\varphi _{t})=\lim _{\varepsilon \downarrow 0}u_{t}(\max ( \varphi _{t},M+\varepsilon ))\) (since \(u_{t}\) is nondecreasing and upper semicontinuous), \(u_{t}(\max (\varphi _{t},M+\varepsilon ))\in \mathcal{C}_{b}(K_{t})\) as in the first case, and by the monotone convergence theorem,
Then we infer that
and from [30, Theorem 2.7 and Remark 2.8] and (A.3), we deduce the result. □
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Doldi, A., Frittelli, M. Entropy martingale optimal transport and nonlinear pricing–hedging duality. Finance Stoch 27, 255–304 (2023). https://doi.org/10.1007/s00780-023-00498-x
DOI: https://doi.org/10.1007/s00780-023-00498-x
Keywords
- Martingale optimal transport problem
- Entropy optimal transport problem
- Pricing–hedging duality
- Robust finance
- Pathwise finance