1 Introduction

As a consequence of the financial crisis in 2008, the uncertainty in the selection of a reference probability \(P\) gained increasing attention and led to the investigation of the notions of arbitrage and of the pricing–hedging duality in different settings. On the one hand, the single reference probability \(P\) was replaced with a family of – a priori non-dominated – probability measures, leading to the theory of quasi-sure stochastic analysis. On the other hand, taking an even more radical approach, a probability-free, pathwise theory of financial markets made substantial advances in the second decade of this century. In this context, it was shown in the seminal paper by Beiglböck et al. [4] that optimal transport theory is a powerful tool to prove pathwise pricing–hedging duality results. The theory we are going to present fits in this conceptual framework that we now briefly recall.

The market model is in discrete time with a finite horizon \(T\in \mathbb{N}\) and zero interest rate. Let

$$ \Omega :=K_{0}\times \cdots \times K_{T} $$

for closed (possibly noncompact) subsets \(K_{0},\dots ,K_{T}\) of ℝ and let \(X_{0},\dots ,X_{T}\) be the canonical projections \(X_{t}:\Omega \rightarrow K_{t}\) for \(t=0,1,\dots ,T\). The process \(X=(X_{t}) \) represents the price of some underlying asset. Later we allow a multidimensional price process, but in this introduction, we stick to the one-dimensional case for notational simplicity. We assume no reference probability measure. We write

$$ \mathrm{Mart}(\Omega ):=\{ \text{martingale probability measures for $X$ under the natural filtration} \}, $$

and when \(\mu \) is a measure defined on the Borel \(\sigma \)-algebra of \(\Omega\), its marginals are denoted by \(\mu _{0},\dots ,\mu _{T}\). One then considers a contingent claim \(c:\Omega \rightarrow (-\infty ,+\infty ]\) which is allowed to depend on the whole path of the underlying asset, and one admits semistatic trading strategies for hedging. This means that in addition to dynamic trading in \(X\) via admissible integrands \(\Delta \in \mathcal{H}\), one may invest in vanilla options \(\varphi _{t}:K_{t}\rightarrow \mathbb{R}\). For modelling purposes, one can take vector subspaces \(\mathcal{E}_{t}\subseteq \mathcal{C}(K_{t})\) for \(t=0,\dots ,T\), where \(\mathcal{C}(K_{t})\) is the space of real-valued continuous functions on \(K_{t}\). For each \(t\), \(\mathcal{E}_{t}\) is the set of static options that can be used for hedging, say affine combinations of vanilla options with different strikes and the same maturity \(t\), and \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) is the space of all hedging instruments. The key assumption in the robust, optimal-transport-based formulation is that the marginals \((\widehat{Q}_{0},\widehat{Q}_{1},\dots ,\widehat{Q}_{T})\) of the underlying price process \(X \) are known; see the seminal papers by Breeden and Litzenberger [13] and Hobson [27]. Such marginals can be identified if one knows a (very) large number of prices of plain vanilla options maturing at each intermediate date, for example the prices of all call options with intermediate maturities and ranging strikes. In this case, the class of arbitrage-free pricing measures that are compatible with the observed prices of the options is given by

$$ \mathcal{M}(\widehat{Q}_{0},\widehat{Q}_{1},\dots ,\widehat{Q}_{T}):= \{ Q\in \mathrm{Mart}(\Omega ) : X_{t}\sim _{Q}\widehat{Q}_{t} \text{ for }t=0,\dots ,T \} . $$

Let ℋ consist of admissible (predictable) trading strategies, given as in Beiglböck et al. [4] via bounded continuous functions, and let

$$ \mathcal{I} :=\bigg\{ I^{\Delta }(x):=\sum _{t=0}^{T-1}\Delta _{t}(x_{0}, \dots ,x_{t})(x_{t+1}-x_{t}) : \Delta \in \mathcal{H}\bigg\} $$

denote the corresponding set of stochastic integrals. In this framework, the subhedging duality, obtained in [4, Theorem 1.1], takes the form

$$ \inf _{Q\in \mathcal{M}(\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})}E_{Q}[c] = \sup \bigg\{ \sum _{t=0}^{T}E_{\widehat{Q}_{t}}[\varphi _{t}] : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c) \bigg\} , $$
(1.1)

for

$$ \mathcal{S}_{{\mathrm {sub}}}(c):=\bigg\{ \varphi \in \mathcal{E} : \exists \Delta \in \mathcal{H}\text{ with }\sum _{t=0}^{T}\varphi _{t}(x_{t})+I^{ \Delta }(x)\leq c(x), \forall x\in \Omega \bigg\} , $$
(1.2)

and the right-hand side of (1.1) is known as the robust subhedging price of \(c\). Obviously, an analogous theory for the superhedging price can be developed as well. Several relevant papers contributed to this stream of literature, as for example Davis et al. [18], Dolinsky and Soner [20], Galichon et al. [22], Henry-Labordère et al. [26], Tan and Touzi [37]. More recent works on the topic include also Bartl et al. [3], Cheridito et al. [14], Guo and Obłój [24], Hou and Obłój [28].

1.1 The dual problem

The left-hand side of (1.1), namely \(\inf _{Q\in \mathcal{M}(\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})}E_{Q}[c] \), represents the dual problem in the financial application, but is typically the primal problem in martingale optimal transport (MOT). We label this case as the sublinear case of MOT. Inspired by the entropy optimal transport (EOT) introduced in Liero et al. [30], we are naturally led to the study of the convex case of MOT, i.e., an entropy martingale optimal transport (EMOT) problem, in the form

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q} [ c ] +\sum _{t=0}^{T}\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}(Q_{t})\bigg) , $$
(1.3)

where \(\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}\) is a divergence in the usual form (see (3.5) below for an explicit expression). Notice that in the EMOT primal problem (1.3), the typical MOT constraint that \(Q\) has prescribed marginals \((\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})\) is relaxed (as the infimum is taken with respect to all martingale probability measures) by penalising via \(\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}\) those martingale measures \(Q\) whose marginals are far from some reference marginals \((\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})\). This is a key difference with classical MOT. Nevertheless, when \(\mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}} (\,\cdot \, )=\delta _{\{ \widehat{Q}_{t}\}}(\,\cdot \, )\), the EMOT reduces to the classical MOT problem where only martingale probability measures with fixed marginals are allowed. Here \(\delta _{A}:=\infty 1_{A^{c}}\) is the characteristic function of a set \(A\) as customarily defined in convex analysis. We also stress that in (1.3), we only consider martingale probability measures, while the EOT problem of [30] is obtained by replacing in (1.3) the set \(\mathrm{Mart}(\Omega )\) with \(\mathrm{Meas}(\Omega )\) consisting of all positive finite measures \(\mu \) on \(\Omega \).

In EMOT, the marginals are no longer fixed a priori as in the left-hand side of (1.1), because we may not have sufficient information to detect them with enough accuracy. This might be the case for example if too few call and put options on the underlying assets are traded in the market, so that we cannot extract the marginals precisely via the Breeden and Litzenberger [13] approach. Alternatively, the exact prices of the options might be unknown, e.g. because of market impact effects.

1.2 The primal problem

To describe the nonlinear subhedging value, we start with the space \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) of hedging instruments consisting of some vectors of continuous functions and consider a functional \(U:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\). An example is given by (a sum of) expected utility functions, as detailed below, so that \(U\) is not necessarily linear or even cash-additive.

We restore cash-additivity via the notion of the optimised certainty equivalent (OCE) studied in Ben Tal and Teboulle [5]. To this end, we introduce the generalised optimised certainty equivalent associated to \(U\) as

$$ S^{U}(\varphi ):=\sup _{\beta \in {\mathbb{R}}^{T+1}}\bigg( U( \varphi +\beta )-\sum _{t=0}^{T}\beta _{t}\bigg). $$
(1.4)

This is a cash-additive map (see (2.2)), yet nonlinear in general, and can be considered as a valuation of options \(\varphi =(\varphi _{t})\) instead of the linear cost \(\sum _{t=0}^{T} E_{\widehat{Q}_{t}}[\varphi _{t}]\) in (1.1). For a possibly path-dependent contingent claim \(c:\Omega \rightarrow (-\infty ,+\infty ]\), the nonlinear subhedging value of \(c\) when valuation is done by \(S^{U}\) then reads
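As a numerical sanity check of (1.4), the following sketch assumes a single date, a finitely supported reference law and the exponential utility \(u(x)=\frac{1}{\gamma }(1-\exp (-\gamma x))\). The function names and the toy data are illustrative assumptions, not part of the paper. Under these assumptions, the first-order condition in \(\beta \) gives the closed form \(-\frac{1}{\gamma }\ln E[\exp (-\gamma \varphi )]\), which the code compares with a direct numerical maximisation.

```python
import math

def exp_utility(x, gamma):
    """Exponential utility u(x) = (1 - exp(-gamma * x)) / gamma."""
    return (1.0 - math.exp(-gamma * x)) / gamma

def oce_numeric(payoffs, probs, gamma, lo=-20.0, hi=20.0, iters=200):
    """Maximise beta -> E[u(phi + beta)] - beta by ternary search.

    The objective is concave in beta, so ternary search applies.
    The search window [lo, hi] is an assumption suited to small toy payoffs.
    """
    def objective(beta):
        expected = sum(p * exp_utility(v + beta, gamma)
                       for v, p in zip(payoffs, probs))
        return expected - beta

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

def oce_closed_form(payoffs, probs, gamma):
    """Entropic expression -(1/gamma) * ln E[exp(-gamma * phi)]."""
    moment = sum(p * math.exp(-gamma * v) for v, p in zip(payoffs, probs))
    return -math.log(moment) / gamma

if __name__ == "__main__":
    payoffs, probs, gamma = [-1.0, 0.5, 2.0], [0.2, 0.5, 0.3], 1.5
    print(oce_numeric(payoffs, probs, gamma))
    print(oce_closed_form(payoffs, probs, gamma))
```

Shifting every payoff by a constant should shift both values by the same constant, which is the cash-additivity property in this one-date toy case.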

$$ \pi (c)=\sup \{ S^{U}(\varphi ) : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c) \}=\sup _{\Delta \in \mathcal{H}}\sup _{\varphi \in {\Phi }_{\Delta }(c)}S^{U}( \varphi ) $$

for \({\Phi }_{\Delta }(c):=\{ \varphi \in \mathcal{E}: \sum _{t=0}^{T} \varphi _{t}(x_{t})+I^{\Delta }(x)\leq c(x), \forall \,x\in \Omega \}\).

1.3 The duality

One of the main results of the paper, Theorem 2.4, is the duality

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\big( E_{Q} [ c ] +\mathcal{D}_{U}(Q)\big) =\sup _{\Delta \in \mathcal{H}}\sup _{\varphi \in {\Phi }_{\Delta }(c)}S^{U}(\varphi ) , $$
(1.5)

where

$$ \mathcal{D}_{U}(Q)=\sup _{\varphi \in \mathcal{E}}\bigg( U(\varphi )- \sum _{t=0}^{T}\int _{K_{t}}\varphi _{t}\mathrm{d}Q_{t}\bigg) \qquad \text{for }Q\in \mathrm{Mart}(\Omega ) $$
(1.6)

is the penalisation term associated to \(U\) via the Fenchel conjugate. In addition, we also prove the existence of an optimiser for the problem on the left-hand side of (1.5). We now understand that the dual problem for the latter, namely of EMOT in its general form, is the nonlinear subhedging problem appearing on the right-hand side of (1.5).

Observe that \(\mathcal{D} := \mathcal{D}_{U}\) in (1.6) does not necessarily have an additive structure, or a divergence formulation \(\mathcal{D} (Q)=\sum _{t=0}^{T} \mathcal{D}_{v^{*}_{t},\widehat{Q}_{t}}(Q_{t})\) as in (1.3), and so it does not necessarily depend on a given martingale measure \(\widehat{Q}\). For example, such penalisation terms could be induced by market prices (see Sect. 4.3) or by a Wasserstein distance (see Sect. 4.4). This additional flexibility in choosing \(\mathcal{D}\) constitutes one key generalisation of the entropy optimal transport theory of Liero et al. [30]. Of course, the other difference with EOT is the presence in (1.5) of the additional supremum with respect to admissible integrands \(\Delta \in \mathcal{H}\). As a consequence, on the left-hand side of (1.5), the infimum is now taken with respect to martingale probability measures instead of positive measures.

In the special case of a valuation functional \(U\) induced by utility functions, the duality (1.5) has a particularly interesting formulation (see Sects. 3.4 and 4.1 for the assumptions and more details). We provide here only two special cases. Let \(\widehat{Q}_{t}\) be the marginals of some \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) and \(U(\varphi )= \sum _{t=0}^{T} {E}_{\widehat{Q}_{t}}[u_{t}(\varphi _{t})]\). If \(u_{t}(x)=\frac{1}{\gamma _{t}}(1-\exp {(-\gamma _{t}x)})\), \(\gamma _{0},\dots ,\gamma _{T}>0\), is an exponential utility function, then (1.5) takes the form

$$\begin{aligned} &\inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q} [ c ] +\sum _{t=0}^{T}\frac{1}{\gamma _{t}}H(Q_{t},\widehat{Q}_{t})\bigg) \\ &=\sup \bigg\{ \sum _{t=0}^{T}-\frac{1}{\gamma _{t}}\ln {E}_{ \widehat{Q}_{t}} [ \exp ( -\gamma _{t}\varphi _{t} ) ] : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} , \end{aligned}$$

where \(H(\,\cdot \, , \,\cdot \, )\) denotes the relative entropy. If \(u_{t}(x)=x\) is the linear utility function, then \(\mathcal{D}_{U}(\,\cdot \, )=\sum _{t=0}^{T} \delta _{\widehat{Q}_{t}}( \,\cdot \, )\) and the EMOT reduces to the classical MOT problem.
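The link between the exponential utility and the relative entropy penalty can be checked numerically in a toy case. The sketch below assumes a single marginal, a common finite support and strictly positive weights for both \(Q_{t}\) and \(\widehat{Q}_{t}\); all function names are hypothetical. Under these assumptions, the pointwise first-order condition \(\exp (-\gamma \varphi )=\mathrm{d}Q_{t}/\mathrm{d}\widehat{Q}_{t}\) yields the maximiser in (1.6), and the optimal value is \(\frac{1}{\gamma }H(Q_{t},\widehat{Q}_{t})\).

```python
import math

def relative_entropy(q, q_hat):
    """H(Q, Q_hat) = sum_i q_i * ln(q_i / q_hat_i) on a common finite support."""
    return sum(qi * math.log(qi / ri) for qi, ri in zip(q, q_hat) if qi > 0.0)

def penalty_objective(phi, q, q_hat, gamma):
    """U(phi) - E_Q[phi], with U(phi) = E_{Q_hat}[(1 - exp(-gamma * phi)) / gamma]."""
    utility = sum(ri * (1.0 - math.exp(-gamma * f)) / gamma
                  for f, ri in zip(phi, q_hat))
    cost = sum(qi * f for f, qi in zip(phi, q))
    return utility - cost

def optimal_phi(q, q_hat, gamma):
    """Solve exp(-gamma * phi_i) = q_i / q_hat_i pointwise (needs q_i > 0)."""
    return [-math.log(qi / ri) / gamma for qi, ri in zip(q, q_hat)]

if __name__ == "__main__":
    q, q_hat, gamma = [0.1, 0.6, 0.3], [0.25, 0.5, 0.25], 2.0
    phi_star = optimal_phi(q, q_hat, gamma)
    print(penalty_objective(phi_star, q, q_hat, gamma))
    print(relative_entropy(q, q_hat) / gamma)
```

Because the objective is concave in \(\varphi \), any perturbation of the computed maximiser should not increase the value.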

Our framework allows us to establish and comprehend several different robust pricing–hedging duality results: new nonlinear utility-based formulations (in Corollaries 4.3, 4.4 and 5.1); the linear case (in Corollary 5.3) and the case without options (in Corollary 4.6); a new duality with penalisation functions based on market data (see Sect. 4.3) or on a Wasserstein distance (see Sect. 4.4).

One additional feature of the paper consists in replacing the set of stochastic integrals ℐ with a general set \(\mathcal{A}\) of suitable hedging instruments that is a general convex cone. Particular choices of such an \(\mathcal{A}\), apart from the usual set of stochastic integrals, allow us to work with \(\varepsilon \)-martingale measures, supermartingales and submartingales in the duality (see Sect. 2.2.1). This extends EMOT beyond the strict martingale property.

Section 2.5 is devoted to stability and convergence issues, as we analyse how the duality is affected by variations in the penalty terms. In Examples 4.12, 4.16 and 5.4, we apply this result to the convergence of EMOT to the extreme case of MOT, and in Sect. 4.4, we focus on Wasserstein-induced penalisation terms.

2 The entropy martingale optimal transport duality

In this section, we present a precise mathematical setting, the main results and their proofs. The main result in Theorem 2.4 relies on (i) a Fenchel–Moreau argument applied to the dual system \((C_{0:T},(C_{0:T})^{\ast })\), where \(C_{0:T}\) is a set of appropriately weighted continuous functions, (ii) the Daniell–Stone theorem that guarantees that the elements in the dual space \((C_{0:T})^{\ast }\) that enter in the dual representation can be represented by probability measures. In order to make this possible, an order-continuity-type assumption on the valuation functional is enforced (see (2.6)).

2.1 The setting

Fix \(d\in \mathbb{N}\) modelling the number of stocks in the market, and fix \(d(T+1)\) closed subsets \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) of ℝ. Set, for \(0\leq s\leq t\leq T\),

$$ K_{t}:=K_{t}^{1}\times \cdots \times K_{t}^{d}, \qquad \Omega _{s:t}:=K_{s}\times \cdots \times K_{t}, \qquad \Omega :=\Omega _{0:T} . $$

Let \(\mathcal{C}\left (\Omega _{s:t}\right )\) be the vector space of continuous real-valued functions on \(\Omega _{s:t}\), and let

$$ C_{s:t}:= \bigg\{ \varphi \in \mathcal{C} (\Omega _{s:t} ) : \Vert \varphi \Vert _{s:t}:=\sup _{x\in \Omega _{s:t} } \frac{\vert \varphi (x) \vert }{1+\sum _{u=s}^{t}\sum _{j=1}^{d}\vert x_{u}^{j}\vert }< \infty \bigg\} . $$
(2.1)

We introduce the space \(B_{s:t}\) in a similar fashion, just substituting the requirement of continuity for \(\varphi \) with the request that \(\varphi \) be measurable with respect to the Borel \(\sigma \)-algebra of \(\Omega _{s:t}\). Then \(C_{s:t}\) and \(B_{s:t}\) are Banach lattices under the norm \(\Vert \cdot \Vert _{s:t}\). The topological dual of \(C_{s:t}\) is denoted by \((C_{s:t})^{\ast }\).
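To illustrate the growth condition in (2.1), consider the special case \(d=1\) and a single date \(t\), with a hypothetical strike \(k\in {\mathbb{R}}\) introduced here only for illustration. A call payoff grows at most linearly, so it belongs to \(C_{t:t}\):

$$ (x-k)^{+}\leq \vert x\vert +\vert k\vert \leq \max (1,\vert k\vert )\big(1+\vert x\vert \big), \qquad \text{so that } \Vert (\,\cdot \, -k)^{+}\Vert _{t:t}\leq \max (1,\vert k\vert ) . $$

By contrast, if \(K_{t}\) is unbounded, the function \(x\mapsto x^{2}\) is continuous but \(x^{2}/(1+\vert x\vert )\) is unbounded on \(K_{t}\), so \(x\mapsto x^{2}\) does not belong to \(C_{t:t}\).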

In a discrete-time framework with finite horizon \(T\) and assuming zero interest rate, we model a market with \(d\) stocks using the canonical \(d\)-dimensional process given by \(X_{t}^{j}(x)=x_{t}^{j},j=1,\dots ,d,t=0,\dots ,T\), for \(x \in \Omega \). We introduce the set \(\mathrm{Prob}(\Omega )\) of probability measures on \(\Omega \), endowed with its Borel \(\sigma \)-algebra, and the set of those probability measures under which the \(X_{t}^{j}\) are integrable as

$$ \mathrm{Prob}^{1}(\Omega ):=\bigg\{ Q\in \mathrm{Prob}(\Omega ) : E_{Q} \bigg[ \sum _{t=0}^{T}\sum _{j=1}^{d}\vert X_{t}^{j} \vert \bigg] < \infty \bigg\} . $$

Fix now vector subspaces \(\mathcal{E}_{0},\dots ,\mathcal{E}_{T}\) with \(\mathcal{E}_{t}\subseteq C_{0:t}\). The space \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) represents the class of financial instruments that can be used for static hedging. Since \(\mathcal{E}_{t}\subseteq {C}_{0:t}\), the sets \(\mathcal{E}_{t}\) may also contain Asian and other path-dependent options \(\varphi _{t}(x_{0},\dots ,x_{t})\). Nonetheless, the choice \(\mathcal{E}_{t}\subseteq C_{t:t}\) is permitted, too; see Sects. 4 and 5. Moreover, in some of the subsequent results (see Sect. 4.1), we take as \(\mathcal{E}_{t}\subseteq {C}_{t:t}\) the subspace consisting of (combinations of) deterministic amounts, units of underlying stock at time \(t\) and call options with different strike prices and the same maturity \(t\). Let \(U:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\) be a proper (i.e., \(\mathrm{dom}(U):=\{\varphi \in \mathcal{E} : U(\varphi )>-\infty \} \neq \emptyset \)) concave functional. Recall from (1.4) the definition of \(S^{U}:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty ]\) which represents the valuation functional of the hedging instruments in ℰ. Let \(\mathrm{dom}(S^{U}):=\{\varphi \in \mathcal{E} : S^{U}(\varphi )>- \infty \}\). Observe that we are considering valuation of the process \(\varphi =(\varphi _{0},\dots ,\varphi _{T}) \in \mathcal{E}\) rather than the valuation of the terminal payoffs only. Under the usual convention \(\infty \cdot 0=0\cdot \infty =0\), one can check that the functional \(S^{U}\) is concave on the convex set \(\mathrm{dom}(S^{U}) \) and cash-additive, meaning that

$$ S^{U}(\varphi +\alpha )=S^{U}(\varphi )+\sum _{t=0}^{T}\alpha _{t}, \qquad \forall \varphi \in \mathcal{E}, \forall \alpha \in{\mathbb{R}}^{T+1} . $$
(2.2)
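The cash-additivity property (2.2) follows from a change of variable in (1.4). A short derivation, where \(\beta ^{\prime }:=\alpha +\beta \) is an auxiliary variable introduced here for illustration:

$$\begin{aligned} S^{U}(\varphi +\alpha ) &=\sup _{\beta \in {\mathbb{R}}^{T+1}}\bigg( U(\varphi +\alpha +\beta )-\sum _{t=0}^{T}\beta _{t}\bigg) \\ &=\sup _{\beta ^{\prime }\in {\mathbb{R}}^{T+1}}\bigg( U(\varphi +\beta ^{\prime })-\sum _{t=0}^{T}(\beta ^{\prime }_{t}-\alpha _{t})\bigg) =S^{U}(\varphi )+\sum _{t=0}^{T}\alpha _{t} . \end{aligned}$$

The second equality uses that \(\beta \mapsto \alpha +\beta \) is a bijection of \({\mathbb{R}}^{T+1}\), and the last one that \(\sum _{t=0}^{T}\alpha _{t}\) does not depend on \(\beta ^{\prime }\).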

Definition 2.1

Given a convex cone \(\mathcal{A}\subseteq C_{0:T}\) and a Borel function \(c\), we define

$$ \pi (c):=\sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c)}S^{U}(\varphi ) \in \lbrack -\infty ,+\infty ], $$
(2.3)

where

$$ {\Phi }_{z}(c):=\bigg\{ \varphi \in \mathrm{dom}(S^{U}) : \sum _{t=0}^{T} \varphi _{t}(x_{0},\dots ,x_{t})+z(x)\leq c(x), \forall x\in \Omega \bigg\} $$

and the usual convention \(\sup \emptyset =-\infty \) is adopted.

We recognise that \(\pi (c)\) in (2.3) is a generalised robust subhedging value for \(c\), with a general set \(-\mathcal{A}\) replacing the set of terminal values of stochastic integrals used before. Some relevant examples for choices of \(\mathcal{A}\) are provided in Sect. 2.2.1.

Definition 2.2

We define the polar \(\mathcal{A}^{\circ }\) of the cone \(\mathcal{A}\subseteq C_{0:T}\) to be the set

$$ \mathcal{A}^{\circ }:= \{\lambda \in (C_{0:T})^{\ast } : \langle z, \lambda \rangle \leq 0, \forall z\in \mathcal{A} \} $$

where \(\langle \,\cdot \,,\,\cdot \, \rangle \) is the usual pairing between \(C_{0:T}\) and its topological dual \((C_{0:T})^{\ast }\), and we observe that for any \(\lambda \) in \((C_{0:T})^{\ast }\),

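$$ \sigma _{\mathcal{A}}(\lambda ):=\sup _{z\in \mathcal{A}}\langle z,\lambda \rangle = \begin{cases} 0 & \text{if }\lambda \in \mathcal{A}^{\circ }, \\ +\infty & \text{otherwise.} \end{cases} $$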

As will be clarified in Sect. 2.4.1, \(\mathrm{Prob}^{1}(\Omega )\) can be identified with a subset of \((C_{0:T})^{\ast }\); so we introduce the set of probability measures

$$ \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }= \{ Q\in \mathrm{Prob}^{1}(\Omega ) : E_{Q} [ z ] \leq 0, \forall z\in \mathcal{A}\} . $$
(2.4)

2.2 The main results

Assumption 2.3

(i) Let \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) be closed subsets of ℝ and denote \(\Omega :=\prod _{t=0}^{T}\prod _{j=1}^{d}K_{t}^{j}\). The vector subspaces \(\mathcal{E}_{0},\dots ,\mathcal{E}_{T}\) satisfy that \({\mathbb{R}} \subseteq \mathcal{E}_{t}\subseteq C_{0:t}\), \(t=0,\dots ,T\), and we set \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T} \). The functional \(U: \mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\) is concave with \(U(0)\in {\mathbb{R}} \). Moreover, \(\mathcal{A}\subseteq C_{0:T}\) is a convex cone with \(0\in \mathcal{A}\).

(ii) For every \(t=0,\dots ,T\), there exist compact sets , and functions \(0\leq f_{t}^{n}\in \mathcal{E}_{t},n\geq 1\), such that

(2.5)

and

$$ U(-af_{0}^{n},\dots ,-af_{T}^{n}) \longrightarrow 0 \qquad \text{as $ n \to \infty , \forall a\in {\mathbb{R}},a>0$ .} $$
(2.6)

Theorem 2.4

Suppose Assumption 2.3 is fulfilled.

  1. (i)

    If

    $$ {\pi (\,\widehat{c}\,)< \infty }\textit{ for some }{\widehat{c}\in B_{0:T},} $$
    (2.7)

    then \(\pi (c)\in {\mathbb{R}}\) for every \(c\in B_{0:T}\) and \(\pi :B_{0:T}\rightarrow {\mathbb{R}}\) is norm-continuous, cash-additive, concave and nondecreasing on \(B_{0:T}\).

  2. (ii)

    For every lower semicontinuous \(c:\Omega \rightarrow (-\infty ,+\infty ]\) satisfying

    $$ c(x)\geq -A\bigg( 1+\sum _{t=0}^{T}\sum _{j=1}^{d} \vert x_{t}^{j} \vert \bigg), \qquad \forall x\in \Omega , \textit{for some}{\ A\in \lbrack 0, \infty ){,}} $$
    (2.8)

    we have the duality

    $$ \inf _{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ}}\big( E_{Q} [c ]+\mathcal{D}(Q)\big)=\sup _{z\in -\mathcal{A}}\sup _{\varphi \in { \Phi }_{z}(c)}S^{U} ( \varphi )=\pi (c), $$
    (2.9)

    where \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ}\) is given in (2.4),

    $$ \mathcal{D}(Q)=\sup _{\varphi \in \mathcal{E}}\bigg( U(\varphi )- \sum _{t=0}^{T}\int _{\Omega _{0:t}}\varphi _{t}\mathrm{d}Q_{t}\bigg), $$
    (2.10)

    and \(Q_{t}\) is the marginal of \(Q\in \mathrm{Prob}^{1}(\Omega )\) on \(\mathcal{B}(\Omega _{0:t})\). Furthermore, if \(\pi (c)<\infty \), the infimum on the left-hand side of (2.9) is a minimum.

Notice that the condition \(\pi (\,\widehat{c}\,)<\infty \) for some \(\widehat{c}\in B_{0:T}\) is not required for the validity of Theorem 2.4 (ii). In addition, we allow in (2.9) \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }=\emptyset \) with the usual convention \(\inf \emptyset =+\infty \). Recall also that the existence of an optimiser in MOT implies that \(\mathcal{M}(\widehat{Q}_{0},\widehat{Q}_{1},\dots , \widehat{Q}_{T})\) is not empty and that the marginals must be in convex order. In EMOT, the marginals are no longer assigned, and so an optimiser \(Q^{\ast }\) of the left-hand side of (2.9) belongs to \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) with \(\mathcal{D}(Q^{\ast })< \infty \) with no other requirement.

Corollary 2.5

Suppose that Assumption 2.3 (i) holds with the subsets \(K_{t}^{j}\), \(t=0,\dots ,T\), \(j=1,\dots ,d\), of ℝ being compact, that \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous and that \(U(0 )=0\). Then (2.9) holds true and if \({\pi (c)<\infty }\), there exists an optimum for the left-hand side of (2.9).

Proof

When \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) are compact, then \(C_{0:T}=\mathcal{C}_{b}(\Omega )\). As \(U(0)=0\), (2.5) and (2.6) are automatically satisfied: just take and \(f_{t}^{n}=0,t=0,\dots ,T,n\geq 1\). □

Assumption 2.3 (ii) is inspired by Cheridito et al. [15] and is for instance satisfied if \(K_{t}^{j} \subseteq \lbrack 0,\infty )\), \(t=0,\dots ,T\), \(j=1,\dots ,d\), and if the valuations over a suitable sequence of call options on the underlying stocks converge to zero when the corresponding strikes diverge to infinity, as explained in the following example.

Example 2.6

Let

$$ f_{j,t}^{\alpha }(x_{t,j}):= (\vert x_{t,j}\vert -\alpha )^{+}, \qquad x_{t,j}\in K_{t}^{j}, j=1,\dots ,d, t=0,\dots ,T, $$
(2.11)

and suppose that \(f_{j,t}^{\alpha }\in \mathcal{E}_{t}\) for every \(\alpha \geq 0\), \(j=1,\dots ,d\), \(t=0,\dots ,T\). As shown in Proposition A.1 (ii), to guarantee that (2.5) and (2.6) are satisfied, it is enough to require that \(U\) is (componentwise) nondecreasing on , \(U(0)=0\) and that for \(\beta \in {\mathbb{R}}_{+}\) given in A.1 (i), we have

$$\begin{aligned} U (0,\dots ,0,-af_{j,t}^{\frac{n}{\beta }},0,\dots ,0 )\longrightarrow 0 \qquad \text{as $n \to \infty $, }&\forall j=1,\dots ,d, \\ &t=0,\dots ,T,a\geq 0 . \end{aligned}$$
(2.12)

Condition (2.12) is a requirement on the valuation of single options having maturity \(t\).

Remark 2.7

The proof of Theorem 2.4 will clarify that the use of \(-\mathcal{A}\) in place of \(\mathcal{A}\) in defining \(\pi (c)\) is somehow a matter of taste. Now the infimum in (2.9) is in fact taken over measures in the polar \(\mathcal{A}^{\circ }\). Instead, without the minus sign \((-\mathcal{A})\) in defining \(\pi (c)\), we should work with \((-\mathcal{A})^{\circ }\), which is less convenient in the computations of the proof.

2.2.1 Examples for \(\mathcal{A}\)

We anticipate here financially relevant examples of possible choices of the convex cone \(\mathcal{A}\) and the corresponding set \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ}\).

Example 2.8

To introduce martingale measures in this setup, we set

$$\begin{aligned} \mathcal{H}^{d}& :=\big\{ \Delta =(\Delta _{0},\dots ,\Delta _{T-1}) : \Delta _{t}\in \big(\mathcal{C}_{b}(K_{0}\times \cdots \times K_{t}) \big)^{d}\big\} , \\ I^{\Delta }(x)& :=\sum _{t=0}^{T-1}\sum _{j=1}^{d}\Delta _{t}^{j}(x_{0}, \dots ,x_{t})(x_{t+1}^{j}-x_{t}^{j}), \qquad \forall x\in \Omega , \\ \mathcal{A}& \phantom{:} =\mathcal{I}:= \{ I^{\Delta } : \Delta \in \mathcal{H}^{d} \} \subseteq C_{0:T} . \end{aligned}$$
(2.13)

Thus the space \(\mathcal{H}^{d}\) is the class of admissible trading strategies and ℐ is the set of elementary stochastic integrals. The (possibly empty) class of martingale measures for the canonical process is denoted by \(\mathrm{Mart}(\Omega )\) and consists of all probability measures on \(\mathcal{B}(\Omega )\) which make each of the processes \((X_{t}^{j})\) a martingale under the natural filtration \(\mathcal{F}_{t}:=\sigma (X_{s}^{j},s\leq t,j=1,\dots ,d),t=0,\dots ,T\). Equivalently,

$$ \mathrm{Mart}(\Omega ):= \{Q\in \mathrm{Prob}^{1}(\Omega ) : {E}_{Q}[I^{\Delta }]=0,\,\forall \Delta \in \mathcal{H}^{d} \} . $$

It is then clear that choosing \(\mathcal{A}=\mathcal{I}\), we get \(\mathrm{Mart}(\Omega )=\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\). When \(d=1\), we simply write \(\mathcal{H}=\mathcal{H}^{1}\).
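The characterisation of \(\mathrm{Mart}(\Omega )\) through \(E_{Q}[I^{\Delta }]=0\) can be illustrated on a finite path space, where every function is bounded and continuous. The sketch below assumes \(d=1\), \(T=2\) and a toy tree with moves of size 10; the measures, the strategy and all function names are hypothetical. It uses exact rational arithmetic and tests a single strategy only, so a zero expected gain is a necessary check rather than a proof of the martingale property.

```python
from fractions import Fraction

def stochastic_integral(delta, path):
    """I^Delta(x) = sum_t Delta_t(x_0, ..., x_t) * (x_{t+1} - x_t), for d = 1."""
    return sum(
        delta[t](path[: t + 1]) * (path[t + 1] - path[t])
        for t in range(len(path) - 1)
    )

def expectation(measure, payoff):
    """E_Q[payoff] for a finitely supported measure given as {path: probability}."""
    return sum(weight * payoff(path) for path, weight in measure.items())

# Symmetric up/down moves: the canonical process is a martingale under this law.
martingale_q = {
    (100, 110, 120): Fraction(1, 4), (100, 110, 100): Fraction(1, 4),
    (100, 90, 100): Fraction(1, 4), (100, 90, 80): Fraction(1, 4),
}
# Tilted weights: the first step has positive drift, so this is not a martingale law.
biased_q = {
    (100, 110, 120): Fraction(1, 2), (100, 110, 100): Fraction(1, 6),
    (100, 90, 100): Fraction(1, 6), (100, 90, 80): Fraction(1, 6),
}
# Hypothetical strategy: Delta_0 is constant, Delta_1 depends on the time-1 price.
delta = [
    lambda history: Fraction(3, 2),
    lambda history: Fraction(history[-1] - 100, 10),
]

def expected_gain(measure):
    """E_Q[I^Delta] for the fixed strategy `delta` above."""
    return expectation(measure, lambda path: stochastic_integral(delta, path))

if __name__ == "__main__":
    print(expected_gain(martingale_q), expected_gain(biased_q))
```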

Example 2.9

For every \(\varepsilon \geq 0\), the set of so-called \(\varepsilon \)-martingale measures (see Guo and Obłój [24]) is

$$ \mathrm{Mart}_{\varepsilon }(\Omega ):=\bigg\{ Q\in \mathrm{Prob}^{1}( \Omega ) : E_{Q} [ I^{\Delta } ] \leq \varepsilon \sum _{t=0}^{T-1} \max _{j=1,\dots ,d} \Vert \Delta _{t}^{j} \Vert _{\infty }, \forall \Delta \in \mathcal{H}^{d}\bigg\} . $$

Thus, taking

$$ \mathcal{A}^{\varepsilon }:=\mathrm{conv}\bigg( \bigg\{ I^{\Delta }- \varepsilon \sum _{t=0}^{T-1}\max _{j=1,\dots ,d} \Vert \Delta _{t}^{j} \Vert _{\infty }:\Delta \in \mathcal{H}^{d}\bigg\} \bigg) $$
(2.14)

(here \(\mathrm{conv}(\,\cdot \, )\) stands for the convex hull in \(C_{0:T}\), which is easily seen to be a cone since \(\mathcal{H}^{d}\) is a vector space), one sees that

$$ \mathrm{Mart}_{\varepsilon }(\Omega )=\mathrm{Prob}^{1}(\Omega )\cap (\mathcal{A}^{\varepsilon })^{\circ }. $$

Taking in particular \(\varepsilon =0\), we have \(\mathrm{Mart}_{0}(\Omega )=\mathrm{Mart}(\Omega )\) as in Example 2.8. It is interesting to notice that for any sequence \(\varepsilon _{n}\downarrow 0\), we have

$$ \sigma _{\mathcal{A}^{\varepsilon _{n}}}(Q)\uparrow \sigma _{ \mathcal{I}}(Q), \qquad \forall Q\in \mathrm{Prob}^{1}(\Omega ) . $$
(2.15)
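For intuition on \(\mathrm{Mart}_{\varepsilon }(\Omega )\), consider the toy case \(T=1\), \(d=1\) with a finitely supported \(Q\). Choosing \(\Delta _{0}\) as the sign of the conditional drift shows that the defining inequality holds for all \(\Delta \) if and only if \(E_{Q}[\vert E_{Q}[X_{1}-X_{0}\,\vert \,X_{0}]\vert ]\leq \varepsilon \). This reformulation is an elementary observation under these toy assumptions and is not taken from the paper; the function names below are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def drift_gap(measure):
    """E_Q[ |E_Q[X_1 - X_0 | X_0]| ] for a one-period law {(x0, x1): probability}."""
    drift = defaultdict(Fraction)
    for (x0, x1), weight in measure.items():
        # Accumulates E_Q[(X_1 - X_0) 1_{X_0 = x0}], i.e. conditional drift times Q(X_0 = x0).
        drift[x0] += weight * (x1 - x0)
    return sum(abs(value) for value in drift.values())

def is_eps_martingale(measure, eps):
    """Check the one-period epsilon-martingale condition via the drift gap."""
    return drift_gap(measure) <= eps

if __name__ == "__main__":
    q = {
        (100, 110): Fraction(3, 10), (100, 90): Fraction(1, 5),
        (120, 130): Fraction(1, 4), (120, 110): Fraction(1, 4),
    }
    print(drift_gap(q), is_eps_martingale(q, 1), is_eps_martingale(q, Fraction(1, 2)))
```

With \(\varepsilon =0\), the check reduces to a vanishing conditional drift, in line with \(\mathrm{Mart}_{0}(\Omega )=\mathrm{Mart}(\Omega )\).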

Example 2.10

Alternative choices for the set \(\mathcal{A}\) which produce supermartingale or submartingale measures are \(\mathcal{A}^{\pm }=\{I^{\Delta } : \Delta \in (\mathcal{H}^{\pm })^{d} \}\), where we define the sets \(\mathcal{H}^{+}=\{\Delta \in \mathcal{H} : \Delta _{t}\geq 0, \forall t=0,\dots ,T-1\}\) and \(\mathcal{H}^{-}=-\mathcal{H}^{+}\). The set \(\mathcal{A}^{+}\) models dynamic trading with no short selling and yields

$$ \mathrm{Prob}^{1}(\Omega )\cap (\mathcal{A}^{+})^{\circ } = \{ \text{supermartingale measures for the canonical process}\}. $$

2.2.2 Rephrasing the main results: superhedging and the martingale measures case

For a given proper concave \(U:\mathcal{E}\rightarrow {\mathbb{R}}\), recall the definition of \(S^{U}\) in (1.4) and for \(V(\,\cdot \, )=-U(-\,\cdot \, )\), set \(S_{V}(\varphi ):=-S^{U}(-\varphi )\) and

$$ \mathrm{dom}(S_{V}):=\{\varphi \in \mathcal{E} : S_{V}(\varphi )< \infty \}=-\mathrm{dom}(S^{U}). $$

Observe that in our notation, the superhedging value for \(c\) is

$$ \pi _{+}(c):=\inf _{z\in \mathcal{A}}\inf _{\varphi \in {\Psi }_{z}(c)}S_{V}(\varphi ) \in \lbrack -\infty ,+\infty ], $$

where

$$ {\Psi }_{z}(c):=\bigg\{ \varphi \in \mathrm{dom}(S_{V}) : \sum _{t=0}^{T} \varphi _{t}(x_{0},\dots ,x_{t})+z(x)\geq c(x), \forall x\in \Omega \bigg\} . $$

The selection of \(-\mathcal{A}\) for \(\pi \) and \(\mathcal{A}\) for \(\pi _{+}\) makes it apparent that the two are linked by \(\pi _{+}(c)=-\pi (-c)\), and so the duality results for \(\pi \) can easily be translated into duality results for \(\pi _{+}\). Of course, when \(\mathcal{A}\) is a vector space as in the case of stochastic integrals (see (2.13) and Example 2.8), we have \(\mathcal{A}=-\mathcal{A}\) and there is no need for the different choices \(-\mathcal{{A}}\) for \(\pi \) and \(\mathcal{{A}}\) for \(\pi _{+}\).
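The identity \(\pi _{+}(c)=-\pi (-c)\) can be verified directly from the definitions. A short derivation, using the auxiliary substitutions \(\psi :=-\varphi \) and \(w:=-z\) (notation introduced here for illustration):

$$\begin{aligned} -\pi (-c) &=-\sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(-c)}S^{U}(\varphi ) =\inf _{z\in -\mathcal{A}}\inf _{\varphi \in {\Phi }_{z}(-c)}\big(-S^{U}(\varphi )\big) \\ &=\inf _{w\in \mathcal{A}}\inf _{\psi \in {\Psi }_{w}(c)}S_{V}(\psi ) =\pi _{+}(c) . \end{aligned}$$

Indeed, \(\varphi \in \mathrm{dom}(S^{U})\) with \(\sum _{t=0}^{T}\varphi _{t}+z\leq -c\) is equivalent to \(\psi \in -\mathrm{dom}(S^{U})=\mathrm{dom}(S_{V})\) with \(\sum _{t=0}^{T}\psi _{t}+w\geq c\), and \(-S^{U}(\varphi )=-S^{U}(-\psi )=S_{V}(\psi )\).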

We now rephrase our findings in Theorem 2.4, with minor additions, to get the formulations in Corollary 2.12 and Corollary 2.13 which will simplify our discussion in Sects. 4 and 5.

We associate to the functions \(c:\Omega \rightarrow (-\infty ,+\infty ]\), \(g:\Omega \rightarrow \lbrack -\infty ,+\infty )\) the sets

$$\begin{aligned} \mathcal{S}_{{\mathrm {sub}}}(c)&:=\bigg\{ \varphi \in \mathrm{dom}(S^{U}) : \exists \Delta \in \mathcal{H}^{d}\text{ such that } \\ & \phantom{=:} \quad \sum _{t=0}^{T}\varphi _{t}(x_{0},\dots ,x_{t})+I^{\Delta }(x) \leq c(x), \forall x\in \Omega \bigg\} , \end{aligned}$$
(2.16)
$$\begin{aligned} \mathcal{S}_{\sup}(g)&:=\bigg\{ \varphi \in \mathrm{dom}(S_{V}) : \exists \Delta \in \mathcal{H}^{d}\text{ such that } \\ & \phantom{=:} \quad \sum _{t=0}^{T}\varphi _{t} (x_{0},\dots ,x_{t})+I^{\Delta }(x) \geq g(x), \forall x\in \Omega \bigg\} . \end{aligned}$$
(2.17)

Remark 2.11

If \(\mathcal{E}_{t}\subseteq C_{t:t}\) for \(t=0,\dots ,T\), then \(\mathrm{dom}(S^{U})\subseteq C_{0:0}\times \cdots \times C_{T:T}\) and each element \(\varphi _{t}\) in (2.16) is a function of the single variable \(x_{t}\). If additionally \(\mathrm{dom}(S^{U})=\mathcal{E}\) and \(d=1\), (2.16) is consistent with (1.2).

From Theorem 2.4 and the equalities \(\mathcal{S}_{\sup}(\,\cdot \, )=-\mathcal{S}_{{\mathrm {sub}}}(-\,\cdot \, )\), \(S_{V}(\,\cdot \, )=-S^{U}(-\,\cdot \, )\), one easily deduces the following result.

Corollary 2.12

Let \(\mathcal{A}=\mathcal{I}\) as in (2.13). Suppose that the assumptions in Theorem 2.4 are satisfied, \(g:\Omega \rightarrow \lbrack -\infty ,+\infty )\) is upper semicontinuous and also condition (2.8) holds with \(c \) replaced by \(-g\). Then

$$\begin{aligned} \inf _{Q\in \mathrm{Mart}(\Omega )}\big( E_{Q}[c] +\mathcal{D}(Q)\big) &=\sup _{\varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)}S^{U}\left ( \varphi \right ) , \end{aligned}$$
(2.18)
$$\begin{aligned} \sup _{Q\in \mathrm{Mart}(\Omega )}\big( E_{Q}[g] -\mathcal{D}(Q)\big) &=\inf _{\varphi \in \mathcal{S}_{\sup}(g)}S_{V}\left ( \varphi \right ) . \end{aligned}$$
(2.19)

If the left-hand side of (2.18) (resp. (2.19)) is finite, then an optimum exists for the left-hand side of (2.18) (resp. (2.19)).

Corollary 2.13

If \(d=1\) and \(\Omega :=K_{0}\times \cdots \times K_{T}\) for compact sets \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\), then (2.18) and (2.19) as well as existence of optima are guaranteed by the following simplified set of assumptions: \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous, \(g:\Omega \rightarrow \lbrack -\infty ,+\infty )\) is upper semicontinuous and \(U(0)=0\).

Proof

When \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\) are compact, we may repeat the proof of Corollary 2.12 invoking Corollary 2.5 in place of the more general Theorem 2.4. □

In the subsequent sections, we only consider the subhedging price; the corresponding statements for the superhedging price can be obtained in the obvious way just described.

2.3 Literature review

We observe that the EMOT problem on the left-hand side of (1.5) was not previously considered in the literature. The associated subhedging value on the right-hand side of (1.5) is also new, even though different formulations of nonlinear subhedging prices already appeared in the literature. For example, in Föllmer and Schied [21, Sect. 4.8], the use of general risk measures in a non-robust framework allows weakening the pointwise inequality constraint in subhedging problems. In Cheridito et al. [15], the authors, now in a robust framework, consider additionally a general set of discounted trading gains that may describe different market structures, such as transaction costs or trading constraints. In the present paper, we consider instead explicitly nonlinear pricing (i.e., \(S^{U}\) on the right-hand side of (1.5)) of static parts of semistatic trading strategies and its impact in the duality (i.e., \(\mathcal{D}_{U}\) on the left-hand side of (1.5)). Pennanen and Perkkiö [33] also developed a generalised optimal transport duality, which can be applied to study the pricing–hedging duality in a context similar to our additive setup of Sect. 3.

The addition of an entropic term to optimal transport problems was popularised by Cuturi [17], with several applications especially from the computational point of view (see for example the survey/monograph by Peyré and Cuturi [34, Chap. 4]). The Sinkhorn algorithm can be applied with the entropic regularisation procedure described in these works (see Benamou et al. [6] for some advantages). Convergence for this algorithm is studied e.g. in Ireland and Kullback [29] and Rüschendorf [35]. After the present paper was posted on arXiv, several relevant advances were made regarding this topic. We mention here Nutz and Wiesel [32], Bernton et al. [7], Ghosal et al. [23]. We stress that all the papers mentioned in this paragraph address a different problem: in Cuturi [17] and subsequent works, the requirement of an exact matching of the marginal distributions is maintained. In the present setting, we relax this constraint in order to model uncertainty regarding the marginals themselves.
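For orientation, the following is a minimal sketch of the Sinkhorn iteration for the classical entropic optimal transport problem with exactly matched marginals, i.e., the setting of Cuturi [17] and not the EMOT problem studied here (there is no martingale constraint and no relaxation of the marginals). The function name `sinkhorn`, the grid, the marginals and the parameter values are illustrative assumptions.

```python
import numpy as np

def sinkhorn(mu, nu, cost, eps=0.5, n_iter=200):
    """Entropic OT between discrete marginals mu and nu (exact marginal matching)."""
    K = np.exp(-cost / eps)            # Gibbs kernel associated with the cost
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)               # rescale rows so that the first marginal is mu
        v = nu / (K.T @ u)             # rescale columns so that the second marginal is nu
    return u[:, None] * K * v[None, :]  # coupling of the form diag(u) K diag(v)

# Hypothetical toy data: two marginals on a common grid, quadratic cost.
x = np.linspace(0.0, 1.0, 5)
mu = np.full(5, 0.2)
nu = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
cost = (x[:, None] - x[None, :]) ** 2
plan = sinkhorn(mu, nu, cost)
```

In this sketch both marginal constraints are enforced as hard constraints by the alternating rescalings, which is precisely the requirement that is relaxed in the present paper.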

A Sinkhorn algorithm approach was adopted in De March and Henry-Labordère [19] for building an arbitrage-free implied volatility surface from bid-ask quotes, while Henry-Labordère [25] studied a problem related to the entropic relaxation of an optimal transportation problem and Blanchet et al. [9] studied the number of operations needed for approximation of the transport cost with a given accuracy, in the case of entropic regularisation. Our framework also allows the use of a penalisation of the form \(Q\mapsto \sum _{t=0}^{T}\delta _{\widehat{Q}_{t}}(Q_{t} )+\widetilde{D}(Q) \), for some entropic term \(\widetilde{D}\), so that with this choice, the EMOT reduces to the MOT problem with an additional entropic regularisation term, as analysed in the abovementioned literature.

Stability issues have been studied in Backhoff-Veraguas and Pammer [2] and Neufeld and Sester [31] in what we called the sublinear case, namely with no penalty and with fixed marginals.

The works by Bernton et al. [7] and Ghosal et al. [23] study geometric properties of minimisers of the entropic OT, by means of the concept of cyclical invariance. This is a counterpart to the characterisation, using \(c\)-cyclical monotonicity, of the geometry of optimal transport plans in the classical framework of OT. Even though a similar study of geometric properties for optimisers of EMOT would be of great interest, this topic is beyond the scope of the present paper and is left for future research.

In the framework of Liero et al. [30] (i.e., with penalisations of the marginals induced by divergence functions) and after the first version of the present work was posted on arXiv, duality results were obtained in the context of weak martingale optimal entropy transport problems by Chung and Trinh [16].

2.4 Proof of Theorem 2.4

2.4.1 The full technical setup

For a metric space \(\mathbb{X}\), \(\mathcal{B}(\mathbb{X})\) denotes the Borel \(\sigma \)-algebra and \(m\mathcal{B}(\mathbb{X})\) the class of real-valued, Borel-measurable functions on \(\mathbb{X}\). We define the sets

$$\begin{aligned} \mathrm{ca}(\mathbb{X})& := \{ \gamma :\mathcal{B}(\mathbb{X})\rightarrow (-\infty ,+\infty ) : \gamma \text{ finite signed Borel measure on }\mathbb{X} \} , \\ \mathrm{Meas}(\mathbb{X})& := \{\mu :\mathcal{B}(\mathbb{X}) \rightarrow \lbrack 0,\infty ) : \mu \geq 0 \text{ finite Borel measure on }\mathbb{X} \}, \\ \mathrm{Prob}(\mathbb{X})& := \{Q:\mathcal{B}(\mathbb{X})\rightarrow \lbrack 0,1] : Q\text{ probability Borel measure on }\mathbb{X} \}, \\ \mathcal{C}(\mathbb{X})& := \{\varphi :\mathbb{X}\rightarrow { \mathbb{R}} : \varphi \text{ continuous on }\mathbb{X} \}, \\ \mathcal{C}_{b}(\mathbb{X})& := \{\varphi :\mathbb{X}\rightarrow { \mathbb{R}}: \varphi \text{ bounded and continuous on }\mathbb{X} \} . \end{aligned}$$

Recall that we fixed \(d(T+1)\) closed sets \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d} \subseteq{\mathbb{R}}\), where \(d\) is the number of stocks and \(T\) the time horizon. We use the following weighted spaces of continuous functions: for an index set \(I\subseteq \{1,\dots ,d\}\times \{0,\dots ,T\}\), we take

For example, we already encountered \(C_{s:t}\) in (2.1), and we also consider

$$ C_{t}:=C_{t:t}. $$

The corresponding norms are denoted by \(\Vert \cdot \Vert _{I},\Vert \cdot \Vert _{s:t},\Vert \cdot \Vert _{t}\), respectively. Some additional details on weighted spaces can be found in Appendix A.1.

Notice that if \(K_{0}^{1},\dots ,K_{0}^{d},\dots ,K_{T}^{1},\dots ,K_{T}^{d}\) are compact, then

$$ C_{0:T}=\mathcal{C}_{b}(\Omega )\qquad \text{and} \qquad (C_{0:T})^{ \ast }=\mathrm{ca}(\Omega ). $$

Observe that by a slight abuse of notation (regarding the domains of the functions), for index sets \(I\subseteq J\subseteq \{1,\dots ,d\}\times \{0,\dots ,T\}\), there is a constant \(0<\theta \leq 1\) such that

$$ C_{I}\subseteq C_{J}, \qquad \theta \Vert \phi \Vert _{I}\leq \Vert \phi \Vert _{J}\leq \Vert \phi \Vert _{I}, \quad \forall \phi \in C_{I} . $$
(2.20)

As already mentioned in Pennanen and Perkkiö [33] and Cheridito et al. [15], every finite signed Borel measure \(\gamma \) on \(\mathbb{X}\) with \(C_{0:T}\subseteq L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\vert \gamma \vert )\) induces a continuous linear functional \(\lambda \in (C_{0:T})^{\ast }\) via integration, namely

$$ c\mapsto \langle c,\lambda \rangle =\int _{\mathbb{X}}c\,\mathrm{d} \gamma , \qquad \forall c\in C_{0:T} . $$

The collection of such functionals, identified with the corresponding measures, is denoted by \(\mathrm{ca}^{1}(\mathbb{X})\), that is,

$$\begin{aligned} \mathrm{ca}^{1}(\mathbb{X}):=\big\{ \gamma : & \, \gamma \text{ is a finite signed Borel measure on $\mathcal{B}(\mathbb{X})$} \\ &\, \text{with } C_{0:T}\subseteq L^{1}\big(\mathbb{X},\mathcal{B}(\mathbb{X}), \vert \gamma \vert \big)\big\} , \end{aligned}$$

while the classes of nonnegative and of probability measures in \(\mathrm{ca}^{1}(\mathbb{X})\) are denoted by \(\mathrm{Meas}^{1}(\mathbb{X})\) and \(\mathrm{Prob}^{1}(\mathbb{X}) \), respectively. The canonical \(d\)-dimensional process is given by \(X_{t}^{j}(x)=x_{t}^{j}\), \(j=1,\dots ,d\), \(t=0,\dots ,T\). Observe that every \(\phi \in C_{0:T}\) satisfies \(\vert \phi (x)\vert \leq \Vert \phi \Vert _{0:T}( 1+\sum _{t=0}^{T} \sum _{j=1}^{d}\vert x_{t}^{j}\vert ) \), and so for any \(\mu \in \mathrm{Meas}^{1}(\mathbb{X})\), we have \(C_{0:T}\subseteq L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\mu )\) iff \(X_{t}^{j}\in L^{1}(\mathbb{X},\mathcal{B}(\mathbb{X}),\mu )\) for all \(j\) and \(t\).

Remark 2.14

Under Assumption 2.3 (i), with the notation from there, consider the proper, convex functional \(V(\varphi ):=-U(-\varphi ) \), with \(\mathrm{dom}(V)=\{ \varphi \in \mathcal{E} : V(\varphi )< \infty \}\). We define the (convex) conjugate of \(U\) by

$$ \mathcal{D}(\gamma _{0},\dots ,\gamma _{T}):=\sup _{\varphi \in \mathcal{E}}\bigg( U(\varphi )-\sum _{t=0}^{T}\langle \varphi _{t},\gamma _{t} \rangle \bigg) =\sup _{\varphi \in \mathcal{E}}\bigg( \sum _{t=0}^{T} \langle \varphi _{t},\gamma _{t}\rangle -V(\varphi )\bigg). $$
(2.21)

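Then \(\mathcal{D}\) is a convex functional and lower semicontinuous with respect to the weak topology induced by the pairing with \(\mathcal{E}\) (being a pointwise supremum of affine functionals that are continuous for this topology), even if we do not require that \(U\) is upper semicontinuous in the corresponding weak topology on \(\mathcal{E}\). When some \(\gamma \in (C_{0:T})^{\ast }\) is given, we slightly improperly write \(\mathcal{D}(\gamma )=\mathcal{D}(\gamma _{0},\dots ,\gamma _{T})\), where \(\gamma _{t}\) is the restriction of \(\gamma \) to \(C_{0:t}\). We also set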

As an immediate consequence of the definitions, the Fenchel inequality holds: if \((\varphi _{0},\dots ,\varphi _{T})\in \mathcal{E}\) and \((\gamma _{0},\dots ,\gamma _{T})\) is any argument of \(\mathcal{D}\) in (2.21), then

$$ \sum _{t=0}^{T}\langle \varphi _{t},\gamma _{t}\rangle \leq \mathcal{D}(\gamma _{0},\dots ,\gamma _{T})+V(\varphi _{0},\dots ,\varphi _{T}) . $$
(2.22)
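The conjugacy (2.21) and the Fenchel inequality (2.22) can be checked numerically in a toy case. The sketch below assumes a single period (\(T=0\)), a finite state space with six points, a reference probability `q` and the exponential-type choice \(U(\varphi )=\sum _{i}q_{i}(1-e^{-\varphi _{i}})\); none of these choices are taken from the paper. Under these assumptions, \(V(\varphi )=\sum _{i}q_{i}(e^{\varphi _{i}}-1)\), and for a probability vector \(\gamma \) with positive entries, \(\mathcal{D}(\gamma )\) reduces to the relative entropy of \(\gamma \) with respect to \(q\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
q = rng.dirichlet(np.ones(n))          # assumed reference probability (illustrative)

def V(phi):
    # V(phi) = -U(-phi) for the assumed U(phi) = sum_i q_i * (1 - exp(-phi_i))
    return float(np.sum(q * (np.exp(phi) - 1.0)))

def D(gamma):
    # Closed form of sup_phi ( <phi, gamma> - V(phi) ) for gamma with positive entries;
    # the supremum is attained at phi_i = log(gamma_i / q_i).
    return float(np.sum(q - gamma + gamma * np.log(gamma / q)))

gamma = rng.dirichlet(np.ones(n))      # a probability vector playing the role of gamma_0

# Fenchel inequality (2.22): <phi, gamma> <= D(gamma) + V(phi) for every phi.
gaps = [D(gamma) + V(phi) - float(phi @ gamma)
        for phi in rng.normal(size=(1000, n))]
min_gap = min(gaps)

# Equality holds at the maximiser phi* = log(gamma / q).
phi_star = np.log(gamma / q)
gap_star = D(gamma) + V(phi_star) - float(phi_star @ gamma)

# For a probability vector gamma, D(gamma) is the relative entropy of gamma w.r.t. q.
kl = float(np.sum(gamma * np.log(gamma / q)))
```

The gap `min_gap` is nonnegative up to rounding, and `gap_star` vanishes up to rounding, which is the equality case of (2.22) in this toy setting.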

Remark 2.15

Another way to introduce our setting, which is used in Sects. 4.3 and 4.4, is to start with a proper convex functional \(\mathcal{D}:\mathrm{ca}^{1}(\Omega )\rightarrow (-\infty ,+\infty ]\) which is \(\sigma (\mathrm{ca}^{1}(\Omega ),\mathcal{E})\)-lower semicontinuous for an \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\subseteq (C_{0:T})^{T+1}\). By the Fenchel–Moreau theorem, we then have the representation

$$ \mathcal{D}(\gamma )=\sup _{\varphi \in \mathcal{E}}\bigg( \sum _{t=0}^{T} \int _{\Omega }\varphi _{t}\,\mathrm{d}\gamma -V(\varphi )\bigg) , $$

where now \(V\) is the Fenchel–Moreau (convex) conjugate of \(\mathcal{D}\), namely

$$ V(\varphi ):=\sup _{\gamma \in \mathrm{ca}^{1}(\Omega )}\bigg( \sum _{t=0}^{T} \int _{\Omega }\varphi _{t}\,\mathrm{d}\,\gamma -\mathcal{D}(\gamma )\bigg) . $$
(2.23)

Setting \(U(\varphi ):=-V(-\varphi )\), \(\varphi \in \mathcal{E}\), we get back that \(\mathcal{D}\) satisfies (2.21) and additionally that \(U\) is \(\sigma (\mathcal{E},\mathrm{ca}^{1}(\Omega ))\)-upper semicontinuous. In conclusion, a pair \((U,\mathcal{D})\) satisfying (2.21) might be defined either by providing a proper concave \(U:\mathcal{E}\rightarrow \lbrack -\infty ,+\infty )\) as in Sect. 2.1, or by assigning a proper convex and \(\sigma (\mathrm{ca}^{1}(\Omega ),\mathcal{E})\)-lower semicontinuous \(\mathcal{D}:\mathrm{ca}^{1}(\Omega )\rightarrow (-\infty ,+\infty ]\) as explained in this remark.

2.4.2 Technical comments on Theorem 2.4

Remark 2.16

We now provide conditions ensuring that \(\pi (0)<\infty \), which by Theorem 2.4 (i) implies that \(\pi (c)\in {\mathbb{R}}\) for every \(c\in B_{0:T}\).

(a) If there exists \(\lambda \in \mathcal{A}^{\circ }\cap \partial U(0)\subseteq (C_{0:T})^{ \ast }\), then \(\pi (0)< \infty \). (Note that here, \(\partial U(0)\) is the supergradient of \(U\) at \(0\in \mathcal{E}\), and we identify \(\lambda \) with the vector of its restrictions in writing, slightly improperly, \(\lambda \in \partial U(0)\).) To see this, let \(\lambda \) satisfy \(S^{U}(\varphi )\leq \sum _{t=0}^{T}\langle \varphi _{t},\lambda _{t} \rangle ,\,\forall \varphi \in \mathcal{E}\). In particular, for all \(z\in -\mathcal{A}\) and \(\varphi \in \Phi _{z}(0)\), it then holds that \(S^{U}(\varphi )\leq \langle \sum _{t=0}^{T}\varphi _{t},\lambda \rangle \leq \langle \sum _{t=0}^{T}\varphi _{t}+z,\lambda \rangle \leq 0\), as \(\langle z,\lambda \rangle \geq 0\) for all \(\lambda \in \mathcal{A}^{\circ }\), which in turn yields \(\pi (0)\leq 0\).

(b) We have \(\pi (0)< \infty \) if and only if there exists \(Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) such that \(\mathcal{D}(Q)< \infty \). Indeed, by definition, \(\pi (0)\leq \int _{\Omega }0\,\mathrm{d}Q+\pi ^{\ast }(Q)=\pi ^{\ast }(Q)\). But from Lemma 2.20 (which does not rely on Lemma 2.18), we have

$$ \pi ^{\ast }(Q)=\mathcal{D}(Q)+\sigma _{\mathcal{A}}(Q)=\mathcal{D}(Q) $$

(the latter equality coming from \(Q\in \mathcal{A}^{\circ } \)). Hence \(\pi (0)\leq \mathcal{D}(Q)< \infty \). Conversely, \(\pi (0)< \infty \) implies the existence of a minimum point in (2.9).

Remark 2.17

Set

$$ \widetilde{{\Phi }}_{z}(c):=\bigg\{ \varphi \in \mathcal{E}: \sum _{t=0}^{T} \varphi _{t}(x_{0},\dots ,x_{t})+z(x)\leq c(x), \forall \,x\in \Omega \bigg\} $$

and observe that \({\Phi }_{z}(c)=\widetilde{{\Phi }}_{z}(c)\cap \mathrm{dom}(S^{U})\). Then with \(\sup \emptyset =-\infty \),

$$ \pi (c):=\sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c)}S^{U}(\varphi ) =\sup _{z\in -\mathcal{A}}\sup _{\varphi \in \widetilde{{\Phi }}_{z}(c)}S^{U}(\varphi ) . $$
(2.24)

To see this, we consider different cases for a fixed \(z\in -\mathcal{A}\).

Case 1: \({\Phi }_{z }(c)=\emptyset \), which means \(\sup _{\varphi \in {\Phi }_{z }(c)}S^{U}(\varphi )=-\infty \) by convention. If \(\widetilde{{\Phi }}_{z }(c)=\emptyset \), then \(\sup _{\varphi \in \widetilde{{\Phi }}_{z }(c)}S^{U}(\varphi )=- \infty \) by convention. If \(\widetilde{{\Phi }}_{z }(c)\neq \emptyset \), then \(\sup _{\varphi \in \widetilde{{\Phi }}_{z }(c)}S^{U}(\varphi )=- \infty \) since for every \(\varphi \in \widetilde{{\Phi }}_{z }(c)\), we have \(S^{U}(\varphi )=-\infty \) as \(\varphi \notin \mathrm{dom}(S^{U})\).

Case 2: \({\Phi }_{z}(c)\neq \emptyset \). Then \(\widetilde{{\Phi }}_{z}(c)\neq \emptyset \), too, and since we can ignore all functions \(\varphi \in \widetilde{{\Phi }}_{z}(c)\setminus {{\Phi }}_{z}(c)\) (which produce values \(S^{U}(\varphi )=-\infty \)), we get

$$ \sup _{\varphi \in \widetilde{{\Phi }}_{z}(c)}S^{U}(\varphi )=\sup _{ \varphi \in {{\Phi }}_{z}(c)}S^{U}(\varphi ). $$

2.4.3 The proof

The proof of Theorem 2.4 is split into Lemmas 2.18, 2.20 and 2.22–2.24, which are then combined in Lemma 2.25.

Lemma 2.18

Under Assumption 2.3 and if (2.7) holds, Theorem 2.4 (i) holds. Moreover, the restriction of \(\pi \) to \(C_{0:T}\) satisfies

$$ \pi (c)=\min _{ \substack{ \lambda \in (C_{0:T})^{\ast }, \\ \lambda \geq 0,\lambda (1)=1}} \big( \langle c,\lambda \rangle +\pi ^{\ast }(\lambda )\big), \qquad \forall c\in C_{0:T}, $$
(2.25)

for

$$ \pi ^{\ast }(\lambda )=\sup _{c\in C_{0:T}}\big( \pi (c)-\langle c,\lambda \rangle \big) ,\qquad \lambda \in (C_{0:T})^{ \ast } . $$

Proof

Suppose that \(\pi (\,\widehat{c}\,)<\infty \) for some \(\widehat{c}\in B_{0:T}\). To prove that \(\pi (c)>-\infty \) for every \(c\in B_{0:T}\), it is (more than) enough to show that

$$ {\Phi }_{z}(c)\neq \emptyset , \qquad \forall z\in -\mathcal{A}. $$
(2.26)

Set \(H_{n}=H_{0}(n)\times \cdots \times H_{T}(n)\subseteq \Omega \). Observe that whenever \(c\in B_{0:T}\) is given, we have for every \(n\geq 1\) and \(x\in H_{n}\) that

$$\begin{aligned} c(x)-z(x)&\geq -\sup _{x\in H_{n}} \vert c(x)-z(x) \vert \\ &\geq - \Vert c-z \Vert _{0:T}\sup _{x\in H_{n}}\bigg( 1+\sum _{t=0}^{T} \sum _{j=1}^{d} \vert x_{t}^{j} \vert \bigg) >-\infty , \end{aligned}$$

and for every \(x\in \Omega \setminus H_{n}\) that

$$\begin{aligned} c(x)-z(x)& \geq - \Vert c-z \Vert _{0:T}\bigg( 1+\sum _{t=0}^{T}\sum _{j=1}^{d} \vert x_{t}^{j} \vert \bigg) \\ & \geq - \Vert c-z \Vert _{0:T}- \Vert c-z \Vert _{0:T}\sum _{t=0}^{T}f_{t}^{n}(x_{0},\dots ,x_{t}), \end{aligned}$$

using (2.5) in the last inequality. Thus for every \(x\in \Omega \),

$$ c(x)-z(x)\geq - \Vert c-z \Vert _{0:T}-\sup _{x\in H _{n}} \vert c(x)-z(x) \vert - \Vert c-z \Vert _{0:T}\sum _{t=0}^{T}f_{t}^{n}(x_{0},\dots ,x_{t}) . $$

If we now show that \((- \Vert c-z \Vert _{0:T}f_{t}^{n})_{0\leq t\leq T}\in \mathrm{dom}(S^{U})\) for \(n\) big enough, we then conclude that

$$ \Big(- \Vert c-z \Vert _{0:T}-\sup _{x\in H_{n}} \vert c(x)-z(x) \vert - \Vert c-z \Vert _{0:T}f_{t}^{n}\Big)_{0\leq t\leq T}\in \mathrm{dom}(S^{U}) $$

by cash-additivity of \(S^{U}\), and at the same time,

$$ \Big(- \Vert c-z \Vert _{0:T}-\sup _{x\in H_{n}} \vert c(x)-z(x) \vert - \Vert c-z \Vert _{0:T}f_{t}^{n}\Big)_{0\leq t\leq T}\in { \Phi }_{z}(c) $$

by definition. This in particular proves that \(\pi (c)>-\infty \). Going then back to checking \((- \Vert c-z \Vert _{0:T}f_{t}^{n})_{0\leq t\leq T}\in \mathrm{dom}(S^{U})\), observe that

$$\begin{aligned} &S^{U}\big((- \Vert c-z \Vert _{0:T}f_{t}^{n})_{0 \leq t \leq T}\big) \\ &=\sup _{\alpha \in {\mathbb{R}}^{T+1}}\bigg( U\big((- \Vert c-z \Vert _{0:T}f_{t}^{n})_{0 \leq t \leq T}+\alpha \big)-\sum _{t=0}^{T}\alpha _{t}\bigg) \\ & \geq U\big(- \Vert c-z \Vert _{0:T}(f_{t}^{n})_{0 \leq t \leq T} \big) \\ &=-V\big( \Vert c-z \Vert _{0:T}(f_{t}^{n})_{0 \leq t \leq T}\big) \longrightarrow 0>-\infty \qquad \text{as $n\to \infty $} \end{aligned}$$

by Assumption 2.3. The fact that \(\pi (c)< \infty \) will follow once we show monotonicity, cash-additivity and concavity of \(\pi \). Monotonicity is trivial: if \(c_{1}\leq c_{2}\), then \({ \Phi }_{z}(c_{1})\subseteq {\Phi }_{z}(c_{2})\) for every \(z\in -\mathcal{A}\) (both sets might be empty). The cash-additivity property can be seen as follows: given \(\beta \in {\mathbb{R}}\) and setting \({1}=(1,\dots ,1)\in {\mathbb{R}}^{T+1}\), observe that whenever \(z\in -\mathcal{A}\) is given, \(\varphi \in {\Phi }_{z}(c+\beta )\) is equivalent to \(\varphi -\frac{\beta }{T+1}{1}\in {\Phi }_{z}(c)\) since by cash-additivity of \(S^{U}\), \(\mathrm{dom}(S^{U})+{\mathbb{R}}^{T+1}=\mathrm{dom}(S^{U})\). Consequently,

$$\begin{aligned} \pi (c+\beta )& =\sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c+\beta )}S^{U}(\varphi ) =\sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c)}S^{U}\bigg( \varphi + \frac{\beta }{T+1}{1}\bigg) \\ & =\sum _{t=0}^{T}\frac{\beta }{T+1}+\sup _{z\in -\mathcal{A}}\sup _{ \varphi \in {\Phi }_{z}(c)}S^{U}(\varphi )=\pi (c)+\beta . \end{aligned}$$

Coming to concavity, it is useful to rewrite \(\pi (c)\) in the slightly more convenient form

$$ \pi (c)=\sup \bigg\{ S^{U}(\varphi ) : \varphi \in \mathrm{dom}(S^{U}),\exists \,z\in -\mathcal{A}\text{ such that }\sum _{t=0}^{T} \varphi _{t}+z\leq c \bigg\} $$
(2.27)

and to recall that whenever \(c\in B_{0:T}\) is given, the set over which we take the supremum on the right-hand side of (2.27) is not empty by (2.26). Take then \(c_{i}\in B_{0:T}\) and associated \(z_{i}\in -\mathcal{A},\varphi ^{i}\in \mathrm{dom}(S^{U})\) with \(\sum _{t=0}^{T}\varphi _{t}^{i}+z_{i}\leq c_{i}\). Define \(c_{ \alpha }=\alpha c_{1}+(1-\alpha )c_{2}\) and analogously \(z_{\alpha }\) and \(\varphi ^{\alpha } \) for \(\alpha \in \lbrack 0,1]\). Then clearly \(\sum _{t=0}^{T}\varphi _{t}^{\alpha }+z_{\alpha }\leq c_{\alpha }\). Combining this with concavity of \(S^{U}\) on \(\mathrm{dom}(S^{U})\), we obtain

$$\begin{aligned} & \alpha S^{U}(\varphi ^{1})+(1-\alpha )S^{U}(\varphi ^{2}) \\ &\leq S^{U}(\varphi ^{\alpha }) \\ & \leq \sup \bigg\{ S^{U}(\varphi ) : \varphi \in \mathrm{dom}(S^{U}),\exists \,z\in -\mathcal{A}\text{ such that }\sum _{t=0}^{T} \varphi _{t}+z\leq c_{\alpha } \bigg\} \\ & = \pi (c_{\alpha }), \end{aligned}$$

using (2.27) in the last equality. Taking now the supremum over all \(z_{i},\varphi ^{i}\) with \(\sum _{t=0}^{T}\varphi _{t}^{i}+z_{i}\leq c_{i}\), we obtain

$$\begin{aligned} \alpha \pi (c_{1})+(1-\alpha )\pi (c_{2})\leq \pi \big(\alpha c_{1}+(1- \alpha ) c_{2}\big), \quad \forall \alpha \in [0,1], c_{1},c_{2} \in B_{0:T} . \end{aligned}$$
(2.28)

Notice that up to this point, we have \(\pi (c_{i})\in (-\infty ,+\infty ]\) so that (2.28) makes sense.

Now we can combine (2.28) with the fact that \(\pi (c)>-\infty \) for every \(c\in B_{0:T}\) to show that \(\pi (c)<\infty \) for every \(c\in B_{0:T}\). Indeed, suppose that \(\pi (\,\widetilde{c}\,)= \infty \) for some \(\widetilde{c}\in B_{0:T}\). We know by hypothesis that \(\pi (\,\widehat{c}\,)< \infty \) for some \(\widehat{c}\in B_{0:T}\), and by what we have previously proved, we know that \(\pi (2\,\widehat{c}-\widetilde{c}\,)>-\infty \). Observing that \(\widehat{c}= \alpha (2\,\widehat{c}- \widetilde{c}\,)+ \left ( 1-\alpha \right ) \widetilde{c}\) for \(\alpha =\frac{1}{2}\), we have from (2.28) that

$$ \infty =\alpha \pi (2\,\widehat{c}-\widetilde{c}\,)+(1-\alpha )\pi (\,\widetilde{c}\,)\leq \pi \big(\alpha (2\,\widehat{c}-\widetilde{c}\,)+ ( 1-\alpha ) \widetilde{c}\,\big)=\pi (\,\widehat{c}\,)< \infty . $$

This yields a contradiction; thus there can be no \(\widetilde{c}\in B_{0:T}\) with \(\pi (\widetilde{c})= \infty \). Hence \(\pi :B_{0:T}\rightarrow {\mathbb{R}}\) is cash-additive, concave and nondecreasing on \(B_{0:T}\). Then it is automatically norm-continuous on \(B_{0:T}\) by the extended Namioka–Klee theorem (see Biagini and Frittelli [8]). The Fenchel–Moreau-type dual representation (2.25) holds, again by the extended Namioka–Klee theorem, this time applied to the restriction of \(\pi \) to \(C_{0:T}\), plus standard arguments involving monotonicity and cash-additivity to prove that \(\pi ^{\ast }(\lambda )<\infty \) implies that \(\lambda \geq 0\) and \(\lambda (1)=1\). See for example Föllmer and Schied [21, Theorem 4.16] for a technique that can be adapted to this argument. □

Remark 2.19

Under Assumption 2.3 and if (2.7) holds, \(S^{U}(\varphi )< \infty \) for every \(\varphi \in \mathrm{dom}(S^{U})\). Indeed, choosing \(c_{\varphi }:=\sum _{t=0}^{T}\varphi _{t}\), we get that \(\varphi \in {\Phi }_{0}(c_{\varphi })\) and \(S^{U}(\varphi )\leq \pi (c_{\varphi })< \infty \) by Lemma 2.18.
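The cash-additivity of \(S^{U}\) used in the proof of Lemma 2.18 can be made concrete in a toy case. The sketch below assumes a single period (\(T=0\)), a three-point state space, a reference probability `q` and the exponential-type choice \(U(\varphi )=\sum _{i}q_{i}(1-e^{-\varphi _{i}})\); these are illustrative assumptions, not the paper's setting. Under them, \(S^{U}(\varphi )=\sup _{\alpha \in \mathbb{R}}(U(\varphi +\alpha )-\alpha )\) has the closed form \(-\log \sum _{i}q_{i}e^{-\varphi _{i}}\), which the code compares with a grid search over \(\alpha \).

```python
import numpy as np

q = np.array([0.2, 0.5, 0.3])          # assumed reference probability (illustrative)

def U(phi):
    # Assumed concave valuation with U(0) = 0.
    return float(np.sum(q * (1.0 - np.exp(-phi))))

def S_U(phi):
    # Closed form of sup_alpha ( U(phi + alpha) - alpha ) for this particular U;
    # the supremum is attained at alpha = log(sum_i q_i * exp(-phi_i)).
    return float(-np.log(np.sum(q * np.exp(-phi))))

phi = np.array([0.4, -1.0, 2.0])

# Brute-force approximation of the supremum over alpha on a fine grid.
alphas = np.linspace(-5.0, 5.0, 20001)
values = np.sum(q * (1.0 - np.exp(-(phi[None, :] + alphas[:, None]))), axis=1) - alphas
grid_sup = float(values.max())

# Cash-additivity: adding a constant beta to phi shifts S^U by exactly beta.
beta = 0.7
shift = S_U(phi + beta) - S_U(phi)
```

Taking \(\alpha =0\) in the supremum also gives \(S^{U}(\varphi )\geq U(\varphi )\), the inequality used in the proof of Lemma 2.18 to show that the relevant strategies lie in \(\mathrm{dom}(S^{U})\).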

Lemma 2.20

For every \(\lambda \in (C_{0:T})^{\ast }\) with \(\lambda \geq 0\), we have

$$ \pi ^{\ast }(\lambda )=(S^{U})^{\ast }(\lambda _{0},\dots ,\lambda _{T})+ \sigma _{\mathcal{A}}(\lambda ). $$

If in addition \(\lambda (1)=1\), then

$$ (S^{U})^{\ast }(\lambda _{0},\dots ,\lambda _{T}):=\sup _{\varphi \in \mathcal{E}}\bigg( S^{U}(\varphi ) -\sum _{t=0}^{T}\langle \varphi _{t},\lambda _{t}\rangle \bigg) =\mathcal{D}(\lambda _{0}, \dots ,\lambda _{T}) = \mathcal{D}(\lambda ) . $$

Proof

Fix \(\lambda \in (C_{0:T})^{*}\) with \(\lambda \geq 0\). Then

$$\begin{aligned} \pi ^{*}(\lambda )&=\sup _{c\in C_{0:T}}\big(\pi (c)-\langle c, \lambda \rangle \big) \\ &=\sup _{c\in C_{0:T}}\Big(\sup _{z \in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z }(c)}S^{U} ( \varphi )-\langle c, \lambda \rangle \Big) \\ & = \sup _{c\in C_{0:T}}\Big(\sup _{z \in -\mathcal{A}}\sup _{ \varphi \in \widetilde{{\Phi }}_{z }(c)}S^{U}(\varphi )-\langle c, \lambda \rangle \Big) \\ &=\sup _{c\in C_{0:T}}\sup _{z \in -\mathcal{A}}\sup _{\varphi \in \widetilde{{\Phi }}_{z }(c)}\big(S^{U}(\varphi )-\langle c,\lambda \rangle \big) \\ &\leq \sup _{c\in C_{0:T}}\sup _{z \in -\mathcal{A}}\sup _{\varphi \in \widetilde{{\Phi }}_{z }(c)}\bigg(S^{U}(\varphi )-\bigg\langle \sum _{t=0}^{T}\varphi _{t}+z,\lambda \bigg\rangle \bigg) \\ &=\sup _{z \in -\mathcal{A}}\sup _{\varphi \in \mathcal{E}}\bigg(S^{U} ( \varphi )-\bigg\langle \sum _{t=0}^{T}\varphi _{t}+z,\lambda \bigg\rangle \bigg) \\ &=\sup _{z\in \mathcal{A}}\langle z,\lambda \rangle +\sup _{\varphi \in \mathcal{E}}\bigg(S^{U} ( \varphi )-\sum _{t=0}^{T}\langle \varphi _{t},\lambda _{t}\rangle \bigg) \\ &=\sigma _{\mathcal{A}}(\lambda )+(S^{U})^{*}(\lambda _{0},\dots , \lambda _{T}), \end{aligned}$$
(2.29)

where the third equality follows from (2.24). Consequently,

$$ \pi ^{*}(\lambda )\leq \sigma _{\mathcal{A}}(\lambda )+(S^{U})^{*}( \lambda _{0},\dots ,\lambda _{T}) . $$
(2.30)

At the same time, for every \(\varphi \in \mathcal{E}, z\in -\mathcal{A}\) and for \(\widehat{c}=\sum _{t=0}^{T}\varphi _{t}+z\in C_{0:T}\), we have that \(\varphi \in \Phi _{z}(\,\widehat{c}\,)\). Thus

$$ S^{U}(\varphi )-\bigg\langle \sum _{t=0}^{T}\varphi _{t}+z,\lambda \bigg\rangle \leq \sup _{\varphi \in \widetilde{{\Phi }}_{z}(\,\widehat{c}\,)}\big(S^{U}(\varphi )-\langle \widehat{c}, \lambda \rangle \big)\leq \pi ^{*}(\lambda ), $$

using (2.29) in the second inequality, and hence

$$\begin{aligned} &\sup _{z \in \mathcal{A}}\,\langle z,\lambda \rangle +\sup _{ \varphi \in \mathcal{E}}\bigg(S^{U}(\varphi )-\bigg\langle \sum _{t=0}^{T} \varphi _{t},\lambda \bigg\rangle \bigg) \\ &=\sup _{z \in -\mathcal{A}}\sup _{\varphi \in \mathcal{E}}\bigg(S^{U}(\varphi )-\bigg\langle \sum _{t=0}^{T}\varphi _{t}+z,\lambda \bigg\rangle \bigg)\leq \pi ^{*}( \lambda ) . \end{aligned}$$
(2.31)

Combining (2.30) and (2.31), we get \(\pi ^{*}(\lambda )=\sigma _{\mathcal{A}}(\lambda )+(S^{U})^{*}(\lambda _{0}, \dots ,\lambda _{T})\). If additionally \(\lambda (1)=1\), then we have

$$\begin{aligned} (S^{U})^{*}(\lambda _{0},\dots ,\lambda _{T})&=\sup _{\varphi \in \mathcal{E}}\bigg(S^{U}(\varphi )-\sum _{t=0}^{T}\langle \varphi _{t}, \lambda _{t}\rangle \bigg) \\ &=\sup _{\varphi \in \mathcal{E}}\bigg(\sup _{\alpha \in{\mathbb{R}}^{T+1}} \Big( U(\varphi +\alpha )-\sum _{t=0}^{T}\alpha _{t}\Big)- \bigg\langle \sum _{t=0}^{T}\varphi _{t},\lambda \bigg\rangle \bigg) \\ &=\sup _{\varphi \in \mathcal{E}}\sup _{\alpha \in{\mathbb{R}}^{T+1}}\bigg(\Big( U(\varphi +\alpha )-\sum _{t=0}^{T}\alpha _{t} \Big)-\bigg\langle \sum _{t=0}^{T}\varphi _{t},\lambda \bigg\rangle \bigg) \\ &=\sup _{\varphi \in \mathcal{E}}\sup _{\alpha \in{\mathbb{R}}^{T+1}}\bigg(U(\varphi +\alpha )-\sum _{t=0}^{T}\langle \varphi _{t}+ \alpha _{t},\lambda \rangle \bigg) \\ &=\sup _{\varphi \in \mathcal{E}}\bigg(U(\varphi )-\bigg\langle \sum _{t=0}^{T} \varphi _{t},\lambda \bigg\rangle \bigg)=\mathcal{D}(\lambda _{0}, \dots ,\lambda _{T})=\mathcal{D}(\lambda ) . \end{aligned}$$

 □
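To see why the normalisation \(\lambda (1)=1\) is needed in the second part of Lemma 2.20, consider the following toy case, which is an illustrative assumption and not taken from the paper: \(T=0\), a finite set \(\Omega =\{x_{1},\dots ,x_{N}\}\), a reference probability \(q\) with \(q_{i}>0\), and \(U(\varphi )=\sum _{i=1}^{N}q_{i}(1-e^{-\varphi _{i}})\). A direct computation then gives, for \(\lambda \) with \(\lambda _{i}>0\),

$$ S^{U}(\varphi )=-\log \sum _{i=1}^{N}q_{i}e^{-\varphi _{i}}, \qquad \mathcal{D}(\lambda )=\sum _{i=1}^{N}\bigg( q_{i}-\lambda _{i}+\lambda _{i}\log \frac{\lambda _{i}}{q_{i}}\bigg) . $$

Testing constant functions \(\varphi \equiv k\) yields

$$ (S^{U})^{\ast }(\lambda )\geq \sup _{k\in \mathbb{R}}\big( S^{U}(k)-k\lambda (1)\big) =\sup _{k\in \mathbb{R}}k\big( 1-\lambda (1)\big), $$

which is \(+\infty \) unless \(\lambda (1)=1\), whereas \(\mathcal{D}(\lambda )\) is finite. If \(\lambda (1)=1\), both quantities coincide with the relative entropy \(\sum _{i=1}^{N}\lambda _{i}\log (\lambda _{i}/q_{i})\) by the Donsker–Varadhan variational formula, in line with the lemma.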

Remark 2.21

Under Assumption 2.3, we have \(U(0)\in {\mathbb{R}}\) and therefore

$$ (S^{U})^{\ast }(\lambda )\geq S^{U}(0)\geq U(0)>-\infty $$

for every \(0\leq \lambda \in (C_{0:T})^{\ast }\).

Lemma 2.22

Under Assumption 2.3 and if (2.7) holds, let \(0\leq \lambda \in (C_{0:T})^{\ast }\) with \(\lambda (1)=1\) be given and define \(0\leq \lambda _{t}=\lambda |_{C_{0:t}}\in (C_{0:t})^{\ast } \). If \((\lambda _{0},\dots ,\lambda _{T})\in \mathrm{dom}(\mathcal{D})\), then there exists a unique \(Q\in \mathrm{Prob}^{1}(\Omega )\) which represents \(\lambda \) on \(C_{0:T}\), i.e.,

$$ \langle \varphi ,\lambda \rangle =E_{Q} [\varphi ], \qquad \forall \varphi \in C_{0:T} . $$

Proof

The proof is an adaptation of Bogachev [10, Theorem 7.10.6]. We first stress the fact that \(\lambda _{t}=\lambda |_{C_{0:t}}\in (C_{0:t})^{*}\) is a consequence of (2.20). We apply Proposition A.2. To do so, we show that for a fixed \(\varepsilon >0\) and \(n\) big enough, we may define a set \(H_{0}(n)\times \cdots \times H_{T}(n)\) that is compact (since so are all the factors) and satisfies the assumptions in Proposition A.2. Suppose that a given \(\varphi \in C_{0:T}\) satisfies \(\varphi (x)=0\) for every \(x\in H_{0}(n)\times \cdots \times H_{T}(n)\). We also have automatically that

$$ \left |\varphi (x)\right |\leq \left \| \varphi \right \|_{0:T}\bigg(1+ \sum _{t=0}^{T}\sum _{j=1}^{d} |x^{j}_{t} |\bigg), \qquad \forall x\in \Omega . $$

By Assumption 2.3, we then have for every \(x\in \Omega \setminus (H_{0}(n)\times \cdots \times H_{T}(n))\) that

$$ |\varphi (x) |\leq \| \varphi \|_{0:T}\bigg(\sum _{t=0}^{T} f^{n}_{t}(x_{0}, \dots ,x_{t})\bigg) . $$

Since, moreover, \(\varphi = 0\) on \(H_{0}(n)\times \cdots \times H_{T}(n)\) by assumption, we get

$$ |\varphi (x) |\leq \| \varphi \|_{0:T}\bigg(\sum _{t=0}^{T} f^{n}_{t}(x_{0}, \dots ,x_{t})\bigg), \qquad \forall (x_{0},\dots ,x_{T})\in \Omega . $$
(2.32)

Then for every \(a>0\), we have

$$\begin{aligned} |\langle \varphi ,\lambda \rangle |&\leq \langle |\varphi |,\lambda \rangle \leq \bigg\langle \| \varphi \|_{0:T}\sum _{t=0}^{T} f^{n}_{t}, \lambda \bigg\rangle = \| \varphi \|_{0:T}\sum _{t=0}^{T}\langle f^{n}_{t}, \lambda \rangle \end{aligned}$$
(2.33)
$$\begin{aligned} &= \| \varphi \|_{0:T}\sum _{t=0}^{T}\langle f^{n}_{t},\lambda _{t} \rangle = \| \varphi \|_{0:T}\frac{1}{a}\sum _{t=0}^{T}\langle a f^{n}_{t}, \lambda _{t}\rangle \\ &\leq \| \varphi \|_{0:T}\bigg(\frac{1}{a}\mathcal{D}(\lambda _{0},\dots ,\lambda _{T})+\frac{1}{a}V ( a f^{n}_{0}, \dots ,a f^{n}_{T} )\bigg), \end{aligned}$$
(2.34)

where (2.33) follows from positivity of \(\lambda \), (2.32), linearity and the fact that we have \(\lambda _{t}:=\lambda |_{C_{0:t}}\in (C_{0:t})^{*}\), while (2.34) follows from the Fenchel inequality (2.22).

Since \((\lambda _{0},\dots ,\lambda _{T}) \in \mathrm{dom}(\mathcal{D})\) by hypothesis, we can select \(a>0\) such that \(\frac{1}{a} \mathcal{D}(\lambda _{0},\dots ,\lambda _{T})\leq \frac{\varepsilon}{2}\). Choose now \(n\) in such a way that \(\frac{1}{a}V ( a f^{n}_{0},\dots ,a f^{n}_{T} )\leq \frac{\varepsilon}{2}\) for every \(s\leq T\) (which is possible by Assumption 2.3). Continuing from (2.34), we get

$$ |\langle \varphi ,\lambda \rangle |\leq \| \varphi \|_{0:T}\bigg( \frac{\varepsilon}{2}+\frac{\varepsilon}{2}\bigg)\leq \varepsilon \| \varphi \|_{0:T} . $$

The result now follows by combining Proposition A.2 and the Daniell–Stone result in Theorem A.3. □

Lemma 2.23

Under Assumption 2.3 and if (2.7) holds, equation (2.9) is true for every \(c\in C_{0:T}\), with a minimum in place of the infimum.

Proof

Combining Lemmas 2.18, 2.20 and 2.22, we have

$$\begin{aligned} \pi (c)& = \min _{ \substack{ \lambda \in (C_{0:T})^{*}, \\ \lambda \geq 0,\lambda (1)=1}} \big(\langle c,\lambda \rangle +\pi ^{*}(\lambda )\big) \\ &= \min _{ \substack{ \lambda \in (C_{0:T})^{*}, \\ \lambda \geq 0,\lambda (1)=1}} \big(\langle c,\lambda \rangle +\mathcal{D}(\lambda )+\sigma _{\mathcal{A}}(\lambda )\big) \\ & = \min _{ \substack{ \lambda \in (C_{0:T})^{*},(\lambda _{0},\dots ,\lambda _{T})\in \mathrm{dom}(\mathcal{D}), \\ \lambda \geq 0,\lambda (1)=1}} \big(\langle c,\lambda \rangle +\mathcal{D}(\lambda )+\sigma _{\mathcal{A}}(\lambda )\big) \\ &= \min _{ \substack{ Q\in \mathrm{Prob}^{1}(\Omega ), \\ Q\in \mathrm{dom}(\mathcal{D})}} \big(E_{Q} [c ]+\mathcal{D}(Q)+\sigma _{\mathcal{A}}(Q)\big) \\ & = \min _{Q\in \mathrm{Prob}^{1}(\Omega )}\big(E_{Q} [c ]+ \mathcal{D}(Q)+\sigma _{\mathcal{A}}(Q)\big) \\ &= \min _{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ}} \big(E_{Q} [c ]+\mathcal{D}(Q)\big), \end{aligned}$$
(2.35)

where the first equality uses Lemma 2.18, the second uses Lemma 2.20, and the third and fifth equalities use the fact that \(\mathcal{D}\) is bounded from below by \(S^{U}(0)\) by Remark 2.21, hence \((\lambda _{0},\dots ,\lambda _{T})\in \mathrm{dom}(\mathcal{D}) \) if and only if \(\mathcal{D}(\lambda _{0},\dots ,\lambda _{T})< \infty \). Moreover, in the fourth equality of (2.35), we use Lemma 2.22 and identify probability measures \(Q\in \mathrm{Prob}^{1}(\Omega )\) and their induced functionals, as well as the marginals \(Q_{t}\) of such measures, with the restrictions of such functionals to \({C}_{0:t}\). Finally, in the last equality, we just use the definition of \(\sigma _{\mathcal{A}}\) and the fact that \(\pi (c)< \infty \) by Lemma 2.18. □

Lemma 2.24

Under Assumption 2.3 and if (2.7) holds, the sublevel set

$$ \{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ } : \mathcal{D}(Q)\leq b \} $$

is \(\sigma ((C_{0:T})^{\ast },C_{0:T})|_{\mathrm{Prob}^{1}(\Omega )}\)-(sequentially) compact for every \(b \in {\mathbb{R}}\).

Proof

To begin with, we show that \(\{\lambda \in (C_{0:T})^{\ast } : \lambda \geq 0,\lambda (1)=1,\pi ^{ \ast }(\lambda )\leq b \}\) is weak\(^{\ast }\) compact. First, we prove that \(B := \{\lambda \in (C_{0:T})^{\ast } : \pi ^{\ast }(\lambda )\leq b \} \) is weak\(^{\ast }\) compact. Observe that by (2.25), we have for every \(r>0\) and \(\lambda \in (C_{0:T})^{\ast }\) with \(\pi ^{\ast }(\lambda )\leq b \) that

$$\begin{aligned} \sup _{\substack{ c\in C_{0:T}, \\ \Vert c \Vert _{0:T}\leq r}}\vert \langle c,\lambda \rangle \vert &=\sup _{ \substack{ c\in C_{0:T}, \\ \Vert c \Vert _{0:T}\leq r}}\langle c, \lambda \rangle \\ &\leq \sup _{\substack{ c\in C_{0:T}, \\ \Vert c \Vert _{0:T}\leq r }}\big( -\pi (-c)\big) +\pi ^{\ast }(\lambda ) \\ &\leq b+\sup _{ \substack{ c\in C_{0:T}, \\ \Vert c \Vert _{0:T}\leq r}}\big( -\pi (-c)\big) . \end{aligned}$$
(2.36)

Now since \(-\pi (\,\cdot \, )\) is real-valued, convex and continuous on \(C_{0:T}\) by Lemma 2.18, it follows from Aliprantis and Border [1, Theorem 5.43] that the right-hand side in (2.36) is finite for some \(r>0\). Thus the operator norms of elements of the set \(B\) are uniformly bounded, and so \(B\) is contained in some (weak\(^{\ast }\) compact, by the Banach–Alaoglu theorem) ball of \((C_{0:T})^{\ast }\). Since \(\pi ^{\ast }\) is weak\(^{\ast }\) lower semicontinuous by its very definition, its sublevel sets are weak\(^{\ast }\) closed. This concludes the proof of weak\(^{\ast }\) compactness of \(B\). Next,

$$\begin{aligned} & \{\lambda \in (C_{0:T})^{\ast } : \lambda \geq 0,\lambda (1)=1,\pi ^{ \ast }(\lambda )\leq b \} \\ &= B \cap \{\lambda \in (C_{0:T})^{\ast } : \lambda (1)=1 \}\cap \bigcap _{\varphi \in C_{0:T},\varphi \geq 0}\{\lambda \in (C_{0:T})^{ \ast } : \lambda (\varphi )\geq 0 \} \end{aligned}$$

is the intersection of a weak\(^{\ast }\) compact set and weak\(^{\ast }\) closed sets; hence it is weak\(^{\ast }\) compact. Combining the fact that \(\sigma _{\mathcal{A}}=\delta _{\mathcal{A}^{\circ }} \) and Lemma 2.20,

$$\begin{aligned} & \{\lambda \in (C_{0:T})^{\ast } : \lambda \geq 0,\lambda (1)=1, \pi ^{\ast }(\lambda )\leq b \} \\ &= \{\lambda \in (C_{0:T})^{\ast } : \lambda \geq 0,\lambda (1)=1, \mathcal{D}(\lambda )\leq b \}\cap \mathcal{A}^{\circ } . \end{aligned}$$

By Lemma 2.22, we can identify normalised nonnegative functionals in \(\mathrm{dom}(\mathcal{D})\) and measures in \(\mathrm{Prob}^{1}(\Omega )\), so that by a slight abuse of notation, we can write

$$\begin{aligned} & \{\lambda \in (C_{0:T})^{\ast } : \lambda \geq 0,\lambda (1)=1, \mathcal{D}(\lambda )\leq b \}\cap \mathcal{A}^{\circ } \\ &= \{Q \in \mathrm{Prob}^{1}(\Omega ) : \mathcal{D}(Q )\leq b \}\cap \mathcal{A}^{\circ } . \end{aligned}$$

Moving to sequential compactness, the topology \(\tau =\sigma ((C_{0:T})^{\ast },C_{0:T})|_{\mathrm{Prob}^{1}(\Omega )}\) induced on \(\mathrm{Prob}^{1}(\Omega )\) is the topology generated by the 1-Wasserstein distance by Bolley [11, Theorem A.2] (see also the discussion in the introduction of Bolley [12], and Villani [38, Definition 6.8.(iv) and Theorem 6.9]). By the above argument, the set \(\{Q \in \mathrm{Prob}^{1}(\Omega ) : \mathcal{D}(Q )\leq b \}\) is then a compact subset of \((\mathrm{Prob}^{1}(\Omega ),\tau )\) and hence, since \(\tau \) is metrisable, 1-Wasserstein sequentially compact. As \(\mathcal{A}^{\circ}\) is weak\(^{\ast}\) closed by its definition, the result follows. □

Lemma 2.25

Under Assumption 2.3, for every lower semicontinuous functional

$$ c:\Omega \rightarrow (-\infty ,+\infty ] $$

satisfying (2.8), the duality (2.9) holds, and if \(\pi (c)<\infty \), the infimum in (2.9) is a minimum.

Proof

Take \(c\) as in the statement. Observe that from the definition of \(\pi \) and the Fenchel inequality on \(S^{U}\), for any \(Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\), we have

$$\begin{aligned} \pi (c)& =\sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c)}S^{U}(\varphi ) \\ &\leq \sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c)}\bigg( (S^{U})^{\ast }({Q}_{0},\dots ,Q_{T})+E_{Q}\bigg[ \sum _{t=0}^{T}\varphi _{t}\bigg] \bigg) \\ & \leq \sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c)}\bigg( (S^{U})^{ \ast }({Q}_{0},\dots ,Q_{T})+E_{Q}\bigg[ \sum _{t=0}^{T}\varphi _{t}+z \bigg] \bigg) \\ & \leq \sup _{z\in -\mathcal{A}}\sup _{\varphi \in {\Phi }_{z}(c)}\big( E_{Q}[c] +\mathcal{D}({Q})\big) \\ &=E_{Q}[c] +\mathcal{D}({Q}), \end{aligned}$$

where the second inequality follows from \({Q}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) (so that \(E_{Q}[z]\geq 0\) for \(z\in -\mathcal{A}\)) and the third is a consequence of Lemma 2.20. Hence

$$ \pi (c)\leq \inf _{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }}\big(E_{Q}[c] +\mathcal{D}({Q})\big) . $$
(2.37)

The case \(\pi (c)= \infty \) is thus trivial, and we focus on the case \(\pi (c)<\infty \). Let \(c^{A}(x):=-A( 1+\sum _{t=0}^{T}\sum _{j=1}^{d} \vert x_{t}^{j}\vert )\), for \(x\in \Omega \). Then \(c\geq c^{A}\in C_{0:T}\) and we have \(\pi (c^{A})\leq \pi (c)< \infty \), as can be easily verified. A standard argument produces a sequence \((c_{n})\subseteq C_{0:T}\) with \(c_{n}\uparrow c\) pointwise on \(\Omega \). We claim that given a sequence of optima for the dual problems of \(\pi (c_{n})\), taking a suitable converging subsequence, the limit \(\widehat{Q}\) satisfies \(\widehat{Q}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\) and \(E_{\widehat{Q}}[c] +\mathcal{D}(\widehat{Q})\leq \pi (c)\). This and (2.37) then imply (2.9).

To prove the claim, recall from Lemma 2.23 and \(\infty >\pi (c)\geq \pi (c_{n})\) that each dual problem for \(\pi (c_{n})\) admits an optimum; call it \(Q^{n}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\). We proceed by observing that \(\mathcal{D}(Q^{n})\in{\mathbb{R}}\) for every \(n\) and

$$ \pi (c_{n})=E_{Q^{n}}[c_{n}]+\mathcal{D}(Q^{n})\geq -E_{Q^{n}}\bigg[ \Vert c_{1} \Vert _{0:T}\bigg( 1+\sum _{t=0}^{T}\eta _{t}\bigg) \bigg] +\mathcal{D}(Q^{n}), $$
(2.38)

where we set \(\eta _{t}(x_{t})=\sum _{j=1}^{d}\vert x_{t}^{j}\vert \) for \(x_{t}\in K_{t}\). Now by the Fenchel inequality (2.22), mimicking the argument in (2.34),

$$ E_{Q^{n}}\bigg[ \Vert c_{1} \Vert _{0:T}\sum _{t=0}^{T}\eta _{t}\bigg] \leq \frac{1}{2}\mathcal{D}(Q^{n})+\frac{1}{2}V ( 2 \Vert c_{1} \Vert _{0:T}\eta _{0},\dots ,2 \Vert c_{1} \Vert _{0:T}\eta _{T} ) . $$

Going back to (2.38), we then get

$$ \pi (c_{n})\geq \zeta +\frac{1}{2}\mathcal{D}(Q^{n}), $$
(2.39)

where \(\zeta \in {\mathbb{R}}\) is a constant depending on \(c_{1},V,\eta _{0},\dots ,\eta _{T}\). As \(\pi (c_{n})\leq \pi (c)< \infty \), we conclude that \(\sup _{n}\mathcal{D}(Q^{n})<\infty \), which in turn implies that the sequence \((Q^{n})\) lies in \(\{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ } : \mathcal{D}(Q)\leq b \}\) for \(b \in {\mathbb{R}}\) sufficiently large. We know that the latter set is weak\(^{\ast}\) sequentially compact by Lemma 2.24. Thus we can extract a subsequence, which we again call \((Q^{n})\), converging weak\(^{\ast}\) to some \(\widehat{Q}\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}^{\circ }\). Now it is easily seen that

$$\begin{aligned} E_{\widehat{Q}}[c] +\mathcal{D}(\widehat{Q})& =\lim _{n}E_{\widehat{Q}}[ c_{n}] +\mathcal{D}(\widehat{Q}) \leq \lim _{n}\liminf _{m}\big( E_{Q^{m}}[c_{n}]+\mathcal{D}(Q^{m})\big) \\ & \leq \lim _{n}\liminf _{m}\big( E_{Q^{m}}[c_{m}]+\mathcal{D}(Q^{m})\big) \\ & =\liminf _{m}\big( E_{Q^{m}}[c_{m}]+\mathcal{D}(Q^{m})\big) =\lim _{m}\pi (c_{m})\leq \pi (c), \end{aligned}$$

where the first equality holds by monotone convergence, the first inequality uses the fact that \(Q\mapsto E_{Q}[c_{n}] +\mathcal{D}(Q)\) is weak\(^{\ast}\) lower semicontinuous as a sum of weak\(^{\ast}\) lower semicontinuous functionals, and the second inequality uses that \(c_{n}\leq c_{m}\) if \(m\geq n\). □

2.5 Convergence of EMOT

In this section, we study some stability and convergence results for the EMOT problem. In particular, we show how under suitable convergence assumptions on the penalty terms, one can see the classical MOT as a limit case for EMOT.

We suppose that for each \(n\in \mathbb{N}\cup \{\infty \} \), we are given a functional \(U_{n}\) and a set \(\mathcal{A}_{n}\subseteq C_{0:T}\). We denote the corresponding value as in (2.3) by \(\pi _{n}(c)\).

Proposition 2.26

Suppose that for each \(n\in \mathbb{N}\cup \{\infty \}\), the assumptions of Theorem 2.4 hold for \(\pi _{n}(c)\) and that \(\pi _{n}(c)< \infty \). Suppose that

$$ \mathcal{D}_{n}(Q) + \sigma _{\mathcal{A}_{n}}(Q) \uparrow \mathcal{D}_{\infty}(Q) + \sigma _{\mathcal{A}_{\infty}}(Q) \qquad \textit{as $ n\to \infty $} $$
(2.40)

for every \(Q\in \mathrm{Prob}^{1}(\Omega )\). Then \(\pi _{n}(c)\uparrow \pi _{\infty }(c)\) for every \(c:\Omega \rightarrow (-\infty ,+\infty ]\) which is lower semicontinuous and satisfies (2.8).

Proof

By Lemma 2.23, the dual problem for \(\pi _{n}(c)\) admits an optimum \(Q^{n}\) in the set \(\mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}_{n}^{\circ }\) for each \(n\in \mathbb{N}\). Observe that \(\infty >\pi _{\infty }(c)\geq \sup _{n}\pi _{n}(c)=\lim _{n}\pi _{n}(c)\) and that with an argument similar to the one yielding (2.39),

$$\begin{aligned} \pi _{n}(c)& =E_{Q^{n}}[c]+\mathcal{D}_{n}(Q^{n})+\sigma _{ \mathcal{A}_{n}}(Q^{n}) \\ &\geq E_{Q^{n}} \bigg[ - \Vert c \Vert _{0:T}\sum _{t=0}^{T}\psi _{t} \bigg] +\mathcal{D}_{n}(Q^{n})+\sigma _{\mathcal{A}_{n}}(Q^{n}) \\ & \geq -\frac{1}{2}\pi _{n}^{\ast }(Q^{n})+\frac{1}{2}\pi _{n}\bigg( -2 \Vert c \Vert _{0:T}\sum _{t=0}^{T}\psi _{t}\bigg) +\mathcal{D}_{n}(Q^{n})+\sigma _{\mathcal{A}_{n}}(Q^{n}) \\ & = \frac{1}{2}\pi _{n}\bigg( -2 \Vert c \Vert _{0:T} \sum _{t=0}^{T}\psi _{t}\bigg) +\frac{1}{2}\mathcal{D}_{n}(Q^{n})+\sigma _{\mathcal{A}_{n}}(Q^{n}) \\ & \geq \frac{1}{2}\pi _{1}\bigg( -2 \Vert c \Vert _{0:T}\sum _{t=0}^{T} \psi _{t}\bigg) +\frac{1}{2}\mathcal{D}_{1}(Q^{n})+\sigma _{\mathcal{A}_{1}}(Q^{n}), \end{aligned}$$

using Lemma 2.20 to get the equality in the fourth line. As a consequence, for some constant \(\eta \),

$$ \infty >\pi _{\infty }(c)\geq \pi _{n}(c)\geq \eta +\frac{1}{2}\mathcal{D}_{1}(Q^{n})+\sigma _{\mathcal{A}_{1}}(Q^{n}). $$

Hence all the measures \(Q^{n}, n \in \mathbb{N}\), belong to a sublevel set of the form

$$ \{Q\in \mathrm{Prob}^{1}(\Omega )\cap \mathcal{A}_{1}^{\circ } : \mathcal{D}_{1}(Q)\leq b \} $$

which is \(\sigma (\mathrm{Prob}^{1}(\Omega ),C_{0:T})\)-(sequentially) compact by Lemma 2.24. Extract a subsequence, again called \((Q^{n})\), converging to a limit \(Q^{\infty }\in \mathrm{Prob}^{1}(\Omega )\). Since \(\mathcal{D}_{n}\) and \(\sigma _{\mathcal{A}_{n}}\) are lower semicontinuous, so is \(\mathcal{D}_{n}+\sigma _{\mathcal{A}_{n}}\) for every \(n\in \mathbb{N}\cup \{\infty \}\). Hence

$$\begin{aligned} \mathcal{D}_{\infty }(Q^{\infty })+\sigma _{\mathcal{A}_{\infty }}(Q^{ \infty }) &= \sup _{K}\big( \mathcal{D}_{K}(Q^{\infty })+\sigma _{\mathcal{A}_{K}}(Q^{\infty })\big) \\ &\leq \sup _{K}\liminf _{n}\big( \mathcal{D}_{K}(Q^{n})+\sigma _{ \mathcal{A}_{K}}(Q^{n})\big) \\ &\leq \sup _{K}\liminf _{n}\big( \mathcal{D}_{n}(Q^{n})+\sigma _{\mathcal{A}_{n}}(Q^{n})\big) \\ &=\liminf _{n}\big( \mathcal{D}_{n}(Q^{n})+\sigma _{\mathcal{A}_{n}}(Q^{n}) \big), \end{aligned}$$

using (2.40) in the first equality and the second inequality. Up to taking a further subsequence, again called \((Q^{n})\), we may assume that the \(\liminf \) above is in fact a limit, so that

$$ \mathcal{D}_{\infty }(Q^{\infty })+\sigma _{\mathcal{A}_{\infty }}(Q^{ \infty })\leq \lim _{n}\big( \mathcal{D}_{n}(Q^{n})+\sigma _{ \mathcal{A}_{n}}(Q^{n})\big) . $$

Since \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous and satisfies (2.8) for some \(A\geq 0\), there exists a sequence \((c_{n}) \subseteq C_{0:T}\) with \(c_{n}\uparrow c\) pointwise on \(\Omega \), just as in the proof of Lemma 2.25. By monotone convergence, we then have \(E_{Q}[c]=\sup _{n}E_{Q}[c_{n}]\). We conclude that \(Q\mapsto E_{Q}[c]\) is the supremum of linear functionals, each being continuous with respect to the topology induced by \(\sigma ((C_{0:T})^{\ast },C_{0:T})\) on \(\mathrm{Prob}^{1}(\Omega )\). Thus \(Q\mapsto E_{Q}[c]\) is lower semicontinuous with respect to that topology, and we obtain \(E_{Q^{\infty }}[c]\leq \liminf _{n}E_{Q^{n}}[c]\). Passing to a further subsequence, we can assume that \(\liminf _{n}E_{Q^{n}}[c]=\lim _{n}E_{Q^{n}}[c]\). From the previous arguments, we then get

$$\begin{aligned} \pi _{\infty }(c)& \leq E_{Q^{\infty }}[c]+\mathcal{D}_{\infty }(Q^{ \infty })+\sigma _{\mathcal{A}_{\infty }}(Q^{\infty }) \\ &\leq \lim _{n}E_{Q^{n}}[c]+\mathcal{D}_{\infty }(Q^{\infty })+ \sigma _{\mathcal{A}_{\infty }}(Q^{\infty }) \\ & \leq \lim _{n}E_{Q^{n}}[c]+\lim _{n}\big(\mathcal{D}_{n}(Q^{n})+ \sigma _{\mathcal{A}_{n}}(Q^{n})\big) \\ &=\lim _{n}\big( E_{Q^{n}}[c]+\mathcal{D}_{n}(Q^{n})+\sigma _{\mathcal{A}_{n}}(Q^{n})\big) =\lim _{n}\pi _{n}(c), \end{aligned}$$

where we use that the \(Q^{n}\) are optima. Since we already have \(\lim _{n}\pi _{n}(c)\leq \pi _{\infty }(c)\), this concludes the proof of \(\pi _{n}(c)\uparrow \pi _{\infty }(c)\). □
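As a hedged illustration of how condition (2.40) can produce a classical MOT constraint in the limit, consider the following sketch. The fixed set \(\mathcal{A}_{n}=\mathcal{A}\), the reference marginals \(\widehat{Q}_{t}\) and the scaled relative entropy penalty, written here as \(H(\,\cdot \,|\,\cdot \,)\), are assumptions chosen for illustration only; the hypotheses of Theorem 2.4 are not verified for this choice.

$$\begin{aligned} \mathcal{D}_{n}(Q)&:=n\sum _{t=0}^{T}H(Q_{t}\,|\,\widehat{Q}_{t}), \qquad n\in \mathbb{N}, \\ \mathcal{D}_{n}(Q)&\uparrow \mathcal{D}_{\infty }(Q):=\begin{cases} 0 & \quad \text{if }Q_{t}=\widehat{Q}_{t}\text{ for all }t=0,\dots ,T, \\ \infty & \quad \text{otherwise,}\end{cases} \end{aligned}$$

because \(H(Q_{t}\,|\,\widehat{Q}_{t})\geq 0\) with equality if and only if \(Q_{t}=\widehat{Q}_{t}\). Under these assumptions, the limiting penalty is finite only on measures with the prescribed marginals, which is the marginal constraint of the classical MOT problem.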

3 Additive structure

In Sect. 2, we did not require any particular structural form of the functionals \(\mathcal{D},U\). We now assume an additive structure of \(U\) and, complementarily, an additive structure of \(\mathcal{D}\). In the entire Sect. 3, we take for each \(t=0,\dots ,T\) a vector subspace \(\mathcal{E}_{t}\subseteq {C}_{t}=C_{t:t}\) such that \(\mathcal{E}_{t}+{\mathbb{R}}=\mathcal{E}_{t}\) and set \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\). Observe that we automatically have \(\mathcal{E}+{\mathbb{R}}^{T+1}=\mathcal{E}\). It is also clear that ℰ is a subspace of \((C_{0:T})^{T+1}\) if we interpret \(\mathcal{E}_{0},\dots ,\mathcal{E}_{T}\) as subspaces of \(C_{0:T}\). We also mention here that up to now, we used for \(\lambda \in (C_{0:T})^{*}\) (resp. for a measure \(\mu \in \mathrm{ca}(\Omega )\)) the notation \(\lambda _{t}\) (resp. \(\mu _{t}\)) for the restriction to \(C_{0:t}\) (resp. for the marginal on \(\Omega _{0:t}\)). This was motivated by the fact that we were considering general \(\mathcal{E}_{t}\subseteq C_{0:t}\). Since we now mostly work with \(\mathcal{E}_{t}\subseteq C_{t}\), we change notation slightly.

Notation 3.1

Throughout Sects. 3–5, given \(\lambda \in (C_{0:T})^{*}\) (or a measure \(\mu \in \mathrm{ca}(\Omega )\)), we use the notation \(\lambda _{t}\) (resp. \(\mu _{t}\)) for the restriction to \(C_{t}\) (resp. for the marginal on \(K_{t}\)).

3.1 Additive structure of \(U\)

Setup 3.2

We consider a proper concave functional \(U_{t}:\mathcal{E}_{t}\rightarrow [ -\infty ,+\infty )\) for every \(t=0,\dots ,T\). We define \(\mathcal{D}_{t}\) on \(\mathrm{ca}^{1}(K_{t})\) similarly to (2.21) as

$$ \mathcal{D}_{t}(\gamma _{t}):=\sup _{\varphi _{t}\in \mathcal{E}_{t}} \bigg(U_{t}(\varphi _{t})-\int _{K_{t}}\varphi _{t}\,\mathrm{d}\gamma _{t}\bigg), \qquad \gamma _{t}\in \mathrm{ca}^{1}(K_{t}), $$

and observe that \(\mathcal{D}_{t}\) can also be viewed as defined on \(\mathrm{ca}^{1}(\Omega )\) by using for \(\gamma \in \mathrm{ca}^{1}(\Omega )\) the marginals \(\gamma _{0},\dots ,\gamma _{T} \) and setting \(\mathcal{D}_{t}(\gamma ):=\mathcal{D}_{t}(\gamma _{t})\). We now define, for each \(\varphi \in \mathcal{E}\), \(U(\varphi ):=\sum _{t=0}^{T}U_{t}(\varphi _{t})\) and define \(\mathcal{D}\) on \(\mathrm{ca}^{1}(\Omega )\) using (2.21). Recall from (1.4) that

$$\begin{aligned} S^{U}(\varphi )&:=\sup _{\beta \in {\mathbb{R}}^{T+1}}\bigg( U( \varphi +\beta )-\sum _{t=0}^{T}\beta _{t}\bigg), \qquad \varphi \in \mathcal{E}, \\ S^{U_{t}}(\varphi _{t})&:=\sup _{\alpha \in {\mathbb{R}}}\big( U_{t}( \varphi _{t}+\alpha )-\alpha \big),\qquad \varphi _{t}\in \mathcal{E}_{t}. \end{aligned}$$

Lemma 3.3

In Setup 3.2 and under the convention \(+\infty -\infty =-\infty \), we have

$$\begin{aligned} \mathcal{D}(\gamma )&=\sum _{t=0}^{T}\mathcal{D}_{t}(\gamma )=\sum _{t=0}^{T}\mathcal{D}_{t}(\gamma _{t}),\qquad \forall \gamma \in \mathrm{ca}^{1}(\Omega ), \\ S^{U}(\varphi )&=\sum _{t=0}^{T}S^{U_{t}}(\varphi _{t}),\qquad \forall \varphi \in \mathcal{E}, \end{aligned}$$
(3.1)

and for all \(\varphi \in \mathcal{E}\) that

$$\begin{aligned} S^{U}(\varphi +\beta )&=S^{U}(\varphi )+\sum _{t=0}^{T}\beta _{t}, \qquad \forall \beta \in {\mathbb{R}}^{T+1}, \\ S^{U_{t}}(\varphi _{t}+\beta )&=S^{U_{t}}(\varphi _{t})+\beta , \qquad \forall \beta \in {\mathbb{R}} . \end{aligned}$$

Proof

The simple proof is omitted. □
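For the reader's convenience, here is a sketch of the computation behind Lemma 3.3. It assumes that (2.21) reads \(\mathcal{D}(\gamma )=\sup _{\varphi \in \mathcal{E}}\big( U(\varphi )-\sum _{t=0}^{T}\int _{K_{t}}\varphi _{t}\,\mathrm{d}\gamma _{t}\big)\), by analogy with the definition of \(\mathcal{D}_{t}\) in Setup 3.2. Since \(\mathcal{E}\) is a product and each summand depends on one coordinate only, the supremum splits:

$$\begin{aligned} \mathcal{D}(\gamma )&=\sup _{\varphi \in \mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}}\sum _{t=0}^{T}\bigg( U_{t}(\varphi _{t})-\int _{K_{t}}\varphi _{t}\,\mathrm{d}\gamma _{t}\bigg) =\sum _{t=0}^{T}\sup _{\varphi _{t}\in \mathcal{E}_{t}}\bigg( U_{t}(\varphi _{t})-\int _{K_{t}}\varphi _{t}\,\mathrm{d}\gamma _{t}\bigg) =\sum _{t=0}^{T}\mathcal{D}_{t}(\gamma _{t}), \\ S^{U}(\varphi )&=\sup _{\beta \in {\mathbb{R}}^{T+1}}\sum _{t=0}^{T}\big( U_{t}(\varphi _{t}+\beta _{t})-\beta _{t}\big) =\sum _{t=0}^{T}S^{U_{t}}(\varphi _{t}), \\ S^{U_{t}}(\varphi _{t}+\beta )&=\sup _{\alpha \in {\mathbb{R}}}\big( U_{t}(\varphi _{t}+\beta +\alpha )-(\alpha +\beta )\big) +\beta =S^{U_{t}}(\varphi _{t})+\beta . \end{aligned}$$

In the first line, each \(\mathcal{D}_{t}(\gamma _{t})\) is larger than \(-\infty \) because \(U_{t}\) is proper. In the second line, the convention \(+\infty -\infty =-\infty \) is only needed when some \(S^{U_{s}}(\varphi _{s})\) equals \(-\infty \), in which case \(U(\varphi +\beta )=-\infty \) for every \(\beta \) and both sides equal \(-\infty \). The third line is the change of variable \(\alpha \mapsto \alpha +\beta \).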

3.2 Duality for the general cash-additive setup

As a consequence of Theorem 2.4, we now obtain a duality in a general cash-additive setup.

Theorem 3.4

Suppose that \(\mathcal{E}_{t}\subseteq C_{t}\) with \(X_{t}\in \mathcal{E}_{t}\) and that \(U_{t}:\mathcal{E}_{t}\rightarrow {\mathbb{R}}\) is a concave, cash-additive functional null in 0. Set \(U(\varphi ):=\sum _{t=0}^{T}U_{t}(\varphi _{t})\) for \(\varphi \in \mathcal{E}= \mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) and suppose that Assumption 2.3 is fulfilled. Recall the formulation of \(\mathcal{S}_{{\mathrm {sub}}}(c)\) from (1.2) and consider for every \(t=0,\dots ,T\) the penalisations

$$ \mathcal{D}_{t}(Q_{t}):=\sup _{\varphi _{t}\in \mathcal{E}_{t}}\bigg( U_{t}( \varphi _{t})-\int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}\bigg) \qquad \textit{for }Q_{t}\in \mathrm{Prob}^{1}(K_{t}). $$
(3.2)

Let \(c:\Omega \rightarrow (-\infty ,+\infty ]\) be lower semicontinuous and such that (2.8) holds. Then

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c] +\sum _{t=0}^{T}\mathcal{D}_{t}(Q_{t})\bigg) =\sup \bigg\{ \sum _{t=0}^{T}U_{t}( \varphi _{t}) : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} =\pi (c), $$
(3.3)

and the infimum in (3.3) is a minimum provided that \(\pi (c)< \infty \).

Proof

Let \(\mathcal{D}\) be defined as in (2.21). Observe that we are in Setup 3.2. Lemma 3.3 tells us that \(S^{U}(\varphi )=\sum _{t=0}^{T}S^{U_{t}}(\varphi _{t})=\sum _{t=0}^{T}U_{t}( \varphi _{t})\), since \(U_{0},\dots ,U_{T}\) are cash-additive, and that \(\mathcal{D}\) coincides on \(\mathrm{Mart}(\Omega )\) with the penalisation term \(Q\mapsto \sum _{t=0}^{T}\mathcal{D}_{t}(Q_{t})\), where \(\mathcal{D}_{t}(Q_{t})\) is given in (3.2). Hence \(S^{U}=U\) and \(\mathrm{dom}(S^{U})=\mathcal{E}\). Since all the assumptions of Theorem 2.4 are fulfilled, we can apply Corollary 2.12, which together with Remark 2.11 yields exactly (3.3). □
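The identity \(S^{U_{t}}=U_{t}\) used in this proof is a one-line computation. The sketch below assumes that cash-additivity means \(U_{t}(\varphi _{t}+\alpha )=U_{t}(\varphi _{t})+\alpha \) for all \(\alpha \in {\mathbb{R}}\), which is the usual definition:

$$ S^{U_{t}}(\varphi _{t})=\sup _{\alpha \in {\mathbb{R}}}\big( U_{t}(\varphi _{t}+\alpha )-\alpha \big) =\sup _{\alpha \in {\mathbb{R}}}\big( U_{t}(\varphi _{t})+\alpha -\alpha \big) =U_{t}(\varphi _{t}), \qquad \varphi _{t}\in \mathcal{E}_{t}. $$

The expression inside the supremum does not depend on \(\alpha \), so every \(\alpha \) is optimal.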

3.3 Additive structure of \(\mathcal{D}\)

The results of this subsection will be applied in Sects. 4.3 and 4.4. In the spirit of Remark 2.15, we now reverse the procedure taken in the previous subsection: We start from some functionals \(\mathcal{D}_{t}\) on \(\mathrm{ca}^{1}(K_{t})\) for \(t=0,\dots ,T\) and build an additive functional \(\mathcal{D} \) on \(\mathrm{ca}^{1}(\Omega )\). Our aim is to find the counterparts of the results in Sect. 3.1.

Setup 3.5

We consider a proper, convex, \(\sigma (\mathrm{ca}^{1}(K_{t}),\mathcal{E}_{t})\)-lower semicontinuous functional \(\mathcal{D}_{t}:\mathrm{ca}^{1}(K_{t})\rightarrow (-\infty ,+\infty ]\) for every \(t=0,\dots ,T\). We extend the functionals \(\mathcal{D}_{t}\) to \(\mathrm{ca}^{1}(\Omega )\) by using, for any \(\gamma \in \mathrm{ca}(\Omega )\), the marginals \(\gamma _{0},\dots ,\gamma _{T}\). If \(\gamma \in \mathrm{ca}^{1}(\Omega )\), we set

$$ \mathcal{D}_{t}(\gamma ):=\mathcal{D}_{t}(\gamma _{t}) \qquad \text{and}\qquad \mathcal{D}(\gamma ):=\sum _{t=0}^{T}\mathcal{D}_{t}(\gamma ) =\sum _{t=0}^{T}\mathcal{D}_{t}(\gamma _{t}) . $$

We define \(V(\varphi )\) for \(\varphi \in \mathcal{E}\) and \(V_{t}(\varphi _{t})\) for \(\varphi _{t}\in \mathcal{E}_{t}\) for \(t=0,\dots ,T\) similarly to (2.23) as

$$\begin{aligned} V(\varphi )&:=\sup _{\gamma \in \mathrm{ca}^{1}(\Omega )}\bigg(\int _{ \Omega}\Big( \sum _{t=0}^{T}\varphi _{t} \Big)\,\mathrm{d}\gamma - \mathcal{D}(\gamma )\bigg), \\ V_{t}(\varphi _{t})&:=\sup _{\gamma \in \mathrm{ca}^{1}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t}\,\mathrm{d}\gamma -\mathcal{D}_{t}(\gamma ) \bigg) . \end{aligned}$$

We define on ℰ the functional \(U(\,\cdot \, )=-V(-\,\cdot \, )\) and similarly \(U_{t}(\,\cdot \, )=-V_{t}(-\,\cdot \, )\) on \(\mathcal{E}_{t}\) for \(t=0,\dots ,T\). Finally, \(S^{U}(\varphi )\), \(S^{U_{0}}(\varphi _{0}),\dots ,S^{U_{T}}(\varphi _{T})\) are defined as in Setup 3.2.

Lemma 3.6

In Setup 3.5, we have the following:

1) \(\mathcal{D}_{0},\dots ,\mathcal{D}_{T}\) as well as \(\mathcal{D}\) are \(\sigma (\mathrm{ca}^{1}(\Omega ),\mathcal{E})\)-lower semicontinuous.

2) Under the additional assumption that \(\mathrm{dom}(\mathcal{D}_{t})\subseteq \mathrm{Prob}^{1}(K_{t})\) for \(t=0,\dots ,T\), we have for any \(\varphi =(\varphi _{0},\dots ,\varphi _{T})\in \mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) that

$$ U(\varphi )=\sum _{t=0}^{T}U_{t}(\varphi _{t})=\sum _{t=0}^{T}-V_{t}(- \varphi _{t}),\qquad S^{U}(\varphi )=\sum _{t=0}^{T}S^{U_{t}}( \varphi _{t}). $$

Proof

The proof is omitted. □

3.4 Divergences induced by utility functions

In this section, we provide the exact formulation of the divergences induced by utility functions \(u_{t}:{\mathbb{R}}\rightarrow \lbrack -\infty ,+\infty )\), distinguishing the two cases \(\mathrm{dom}(u_{t})={\mathbb{R}}\) and \(\mathrm{dom}(u_{t})\supseteq \lbrack 0,\infty )\).

Assumption 3.7

We consider concave, upper semicontinuous nondecreasing functions \(u_{0},\dots ,u_{T}:{\mathbb{R}}\rightarrow \lbrack -\infty ,+\infty )\) with \(u_{0}(0)=\cdots =u_{T}(0)=0\) and \(u_{t}(x)\leq x\), \(\forall x\in {\mathbb{R}}\) (that is, \(1\in \partial u_{0}(0)\cap \cdots \cap \partial u_{T}(0)\)). For each \(t=0,\dots ,T\), we define \(v_{t}(x):=-u_{t}(-x)\), \(x\in {\mathbb{R}}\), and

$$ v_{t}^{\ast }(y):=\sup _{x\in {\mathbb{R}}}\big(xy-v_{t}(x)\big)\,= \sup _{x\in {\mathbb{R}}}\big(u_{t}(x)-xy\big), \qquad y\in { \mathbb{R}} . $$
(3.4)

Remark 3.8

We observe that \(v_{t}(y)=v_{t}^{\ast \ast }(y)=\sup _{x\in {\mathbb{R}}}(xy-v_{t}^{\ast }(x))\) for all \(y\in {\mathbb{R}}\) by the Fenchel–Moreau theorem and that \(v_{t}^{\ast }\) is convex, lower semicontinuous and bounded from below on ℝ. Assumption 3.7 is satisfied by a wide range of utility functions.

Fix \(\widehat{\mu }_{t} \in \mathrm{Meas}(K_{t})\). We define, for \(\mu \in \mathrm{Meas}(K_{t})\),

$$ \mathcal{D}_{v_{t}^{\ast },\widehat{\mu}_{t}}(\mu ):=\textstyle\begin{cases} \int _{K_{t}}v_{t}^{\ast } ( \frac{\mathrm{d}\mu }{\mathrm{d}\widehat{\mu}_{t}} ) \,\mathrm{d}\widehat{\mu}_{t} &\quad \text{if }\mu \ll \widehat{\mu}_{t}, \\ \infty & \quad \text{otherwise.}\end{cases} $$
(3.5)
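For orientation, the following worked computation shows what (3.4) and (3.5) give for one specific utility. The choice \(u_{t}(x)=1-e^{-x}\) is an assumption made here for illustration; it is concave, nondecreasing and real-valued with \(u_{t}(0)=0\), and \(u_{t}(x)\leq x\) because \(e^{-x}\geq 1-x\). For \(y>0\), the map \(x\mapsto 1-e^{-x}-xy\) is maximised at \(x=-\ln y\), so that

$$ v_{t}^{\ast }(y)=\sup _{x\in {\mathbb{R}}}\big( 1-e^{-x}-xy\big) =\begin{cases} y\ln y-y+1 & \quad \text{if }y>0, \\ 1 & \quad \text{if }y=0, \\ +\infty & \quad \text{if }y<0. \end{cases} $$

Consequently, if \(\widehat{\mu }_{t}\) and \(\mu \ll \widehat{\mu }_{t}\) are both probability measures and \(Z:=\frac{\mathrm{d}\mu }{\mathrm{d}\widehat{\mu }_{t}}\), then

$$ \mathcal{D}_{v_{t}^{\ast },\widehat{\mu }_{t}}(\mu )=\int _{K_{t}}( Z\ln Z-Z+1) \,\mathrm{d}\widehat{\mu }_{t}=\int _{K_{t}}Z\ln Z\,\mathrm{d}\widehat{\mu }_{t}, $$

which is the relative entropy of \(\mu \) with respect to \(\widehat{\mu }_{t}\).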

In the next two results, whose proofs are postponed to Appendix A.2, we provide the dual representation of the divergence terms.

Proposition 3.9

Take \(u_{0},\dots ,u_{T}\) satisfying Assumption 3.7 with

$$ \mathrm{dom}(u_{0})=\cdots =\mathrm{dom}(u_{T})={\mathbb{R}}, $$

consider closed (possibly noncompact) \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\) and let \(\widehat{\mu }_{t}\in \mathrm{Meas}(K_{t})\), \(t= 0,\dots ,T\). Then

$$ \mathcal{D}_{v_{t}^{\ast },\widehat{\mu }_{t}}(\mu )=\sup _{\varphi _{t} \in \mathcal{C}_{b}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t}(x_{t})\, \mathrm{d}\mu (x_{t})-\int _{K_{t}}v_{t}\big(\varphi _{t}(x_{t})\big) \,\mathrm{d}\widehat{\mu }_{t}(x_{t})\bigg) . $$
(3.6)

Let \(\widehat{Q}_{t}\in \mathrm{Prob}(K_{t})\) and for \(\mu \in \mathrm{Meas}(K_{t})\), let \(\mu =\mu _{a}+\mu _{s} \) be the Lebesgue decomposition of \(\mu \) with respect to \(\widehat{Q}_{t}\), where \(\mu _{a}\ll \widehat{Q}_{t}\) and \(\mu _{s}\perp \widehat{Q}_{t} \). Set

$$ (v_{t}^{\ast })_{\infty }^{\prime }:=\lim _{y\rightarrow \infty } \frac{v_{t}^{\ast }(y)}{y}, \qquad t=0,\dots ,T . $$

As \(u_{t}(0)=0\), \((v_{t}^{\ast })_{\infty }^{\prime }\in \lbrack 0, \infty ]\) since \(v^{*}_{t}(y)\geq u_{t}(0)-0 y=0\). Then we can define, for \(\mu \in \mathrm{Meas}(K_{t})\),

$$ \mathcal{F}_{t}(\mu | \widehat{Q}_{t}):=\int _{K_{t}}v_{t}^{\ast } \bigg( \frac{\mathrm{d}\mu _{a}}{\mathrm{d}\widehat{Q}_{t}}\bigg) \, \mathrm{d}\widehat{Q}_{t}+(v_{t}^{\ast })_{\infty }^{\prime }\,\mu _{s}(K_{t}), $$

where we use the convention \(\infty \cdot 0=0\) if \((v_{t}^{\ast })_{\infty }^{\prime }=\infty \) and \(\mu _{s}(K_{t})=0\). Observe that the restriction of \(\mathcal{F}_{t}(\,\cdot \, | \widehat{Q}_{t})\) to \(\mathrm{Meas}(K_{t})\) coincides with the functional in Liero et al. [30, Equation (2.35)] with \(F=v_{t}^{\ast }\), and that whenever \(\mathrm{dom}(u_{t})={\mathbb{R}}\), we have \((v_{t}^{\ast })_{ \infty }^{\prime }=\lim _{y\rightarrow \infty } \frac{v_{t}^{\ast }(y)}{y}= \infty \) and \(\mathcal{F}_{t}(\,\cdot \, | \widehat{Q}_{t})\) coincides with \(\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(\,\cdot \, )\) on \(\mathrm{Meas}(K_{t})\); see (3.5).
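The claim that \((v_{t}^{\ast })_{\infty }^{\prime }=\infty \) when \(\mathrm{dom}(u_{t})={\mathbb{R}}\) can be checked as follows. This is a sketch using only (3.4) and Assumption 3.7; the constants \(M\) and \(a\) are notation introduced here. For \(M>0\) and \(y>0\), choosing \(x=-M\) in (3.4) gives

$$ \frac{v_{t}^{\ast }(y)}{y}\geq \frac{u_{t}(-M)}{y}+M\longrightarrow M \qquad \text{as }y\rightarrow \infty , $$

since \(u_{t}(-M)\in {\mathbb{R}}\). As \(M\) is arbitrary, the limit is \(\infty \). In contrast, if \(u_{t}(x)=-\infty \) for all \(x<-a\) with some \(a\geq 0\), then \(u_{t}(x)\leq x\) yields for \(y\geq 1\) that

$$ v_{t}^{\ast }(y)=\sup _{x\geq -a}\big( u_{t}(x)-xy\big) \leq \sup _{x\geq -a}x(1-y)=a(y-1), $$

so that \((v_{t}^{\ast })_{\infty }^{\prime }\leq a<\infty \) and the singular part \(\mu _{s}\) may carry a finite penalty.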

Proposition 3.10

Suppose that \(u_{0},\dots ,u_{T}:{\mathbb{R}}\rightarrow [-\infty ,+\infty )\) satisfy Assumption 3.7 and \(K_{0},\dots ,K_{T}\subseteq{\mathbb{R}}\) are compact. If \(\widehat{Q}_{t}\in \mathrm{Prob}(K_{t})\) has full support for all \(t\in \{0, \dots ,T\}\), then

$$ \mathcal{F}_{t}(\mu | \widehat{Q}_{t})=\sup _{\varphi _{t}\in \mathcal{C}_{b}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t}(x_{t})\,\mathrm{d}\mu (x_{t})- \int _{K_{t}}v_{t}\big(\varphi _{t}(x_{t})\big)\,\mathrm{d} \widehat{Q }_{t}(x_{t})\bigg) . $$
(3.7)

Example 3.11

The requirement that \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\) have full support is crucial for the proof of Proposition 3.10. We provide a simple example to show that (3.7) does not hold in general when that assumption is not fulfilled. Take \(K= \{-2,0,2\}\), \(\widehat{Q}=\frac{1}{2}\delta _{-2}+\frac{1}{2}\delta _{+2}\), \(\mu =\delta _{0}\), \(u(x)=\frac{x}{x+1}\) for \(x> -1\) and \(u(x)=-\infty \) for \(x\leq -1\). It is easy to see that \(v^{\ast }\) associated via (3.4) is given by \(v^{\ast }(y)=1+y-2\sqrt{y}\) for \(y\geq 0\) and \(v^{\ast }(y)=+\infty \) for \(y<0\), so that \((v^{\ast })_{\infty }^{\prime }=1\). It is also easy to see that \(\mu \perp \widehat{Q}\); hence \(\mu _{a}=0\) and \(\mu _{s}=\mu \) in the Lebesgue decomposition with respect to \(\widehat{Q}\). Hence \(\mathcal{F}(\mu | \widehat{Q})=v^{\ast }(0)+1\cdot \mu (K)=1+1=2\). At the same time, we see that taking \(\varphi _{N}\in \mathcal{C}_{b}(K)\) defined via \(\varphi _{N}(-2)=\varphi _{N}(2)=0,\varphi _{N}(0)=-N\) (observe that \(u(\varphi _{N})\notin \mathcal{C}_{b}(K)\) for \(N\geq 1\), since then \(u(-N)=-\infty \)), we have

$$\begin{aligned} \sup _{\varphi \in \mathcal{C}_{b}(K)}\bigg( \int _{K}\varphi \, \mathrm{d}\mu -\int _{K}v(\varphi )\,\mathrm{d}\widehat{Q }\bigg) &= \sup _{\varphi \in \mathcal{C}_{b}(K)}\bigg( \int _{K}u(\varphi )\, \mathrm{d}\widehat{Q}-\int _{K}\varphi \,\mathrm{d}\mu \bigg) \\ & \geq \sup _{N}\bigg( \int _{K}u(\varphi _{N})\,\mathrm{d} \widehat{Q}-\int _{K}\varphi _{N}\,\mathrm{d}\mu \bigg) \\ & = \sup _{N}\bigg( 0\cdot \frac{1}{2}+ 0\cdot \frac{1}{2}-(-N)\bigg) = \infty . \end{aligned}$$
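The conjugate used in this example can be verified directly from (3.4); the following is a sketch of that computation, with \(g\) and \(x^{\ast }\) as notation introduced here. For \(y>0\), the concave map \(g(x):=\frac{x}{x+1}-xy\) on \((-1,\infty )\) satisfies \(g^{\prime }(x)=(1+x)^{-2}-y\), which vanishes at \(x^{\ast }=y^{-1/2}-1>-1\). Hence

$$ v^{\ast }(y)=g(x^{\ast })=\big( 1-\sqrt{y}\big) -\bigg( \frac{1}{\sqrt{y}}-1\bigg) y=1+y-2\sqrt{y}=\big( 1-\sqrt{y}\big) ^{2}, \qquad y>0 . $$

For \(y=0\), one gets \(v^{\ast }(0)=\sup _{x>-1}\frac{x}{x+1}=1\), and for \(y<0\), letting \(x\rightarrow \infty \) gives \(v^{\ast }(y)=+\infty \). Finally,

$$ \lim _{y\rightarrow \infty }\frac{v^{\ast }(y)}{y}=\lim _{y\rightarrow \infty }\bigg( \frac{1}{y}+1-\frac{2}{\sqrt{y}}\bigg) =1 . $$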

4 Applications in the compact case

We suppose in the entire Sect. 4 that the following requirements are fulfilled.

Standing Assumption 4.1

Let \(d=1\) and \(\Omega :=K_{0}\times \cdots \times K_{T}\) for compact sets \(K_{0},\dots ,K_{T}\subseteq {\mathbb{R}}\), \(K_{0}=\{x_{0}\}\) for some \(x_{0}\in{\mathbb{R}}\), the functional \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous, \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) is a given martingale measure with marginals \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\), and \(c\in L^{1}(\widehat{Q})\).

Under this assumption, \(C_{0:T}=\mathcal{C}_{b}(\Omega )\) and \((C_{0:T})^{\ast }=\mathrm{ca}(\Omega )=\mathrm{ca}^{1}(\Omega )\). We observe that the stock price \((X_{t})\) is bounded due to the compactness of \(K_{0},\dots ,K_{T}\). As a consequence, if we consider for example the call option \((X_{t}-\alpha )^{+}\), \(\alpha \in \mathbb{R}\), then it is also bounded on \(\Omega \). The selection \(\mathcal{E} \subseteq \mathcal{C}_{b}(K_{0})\times \cdots \times \mathcal{C}_{b}(K_{T})\) is appropriate in this context.

4.1 Subhedging with vanilla options

As in Beiglböck et al. [4], we suppose in Sect. 4.1 that the elements in \(\mathcal{E}_{t}\) represent portfolios obtained combining call options with maturity \(t\), units of the underlying stock at time \(t\) and deterministic amounts, that is, \(\mathcal{E}_{t}\) consists of all the functions in \(\mathcal{C}_{b}(K_{t})\) of the form

$$ \varphi _{t}(x_{t})=a+bx_{t}+\sum _{n=1}^{N}c_{n}(x_{t}-\alpha _{n})^{+} \qquad \text{with } N\geq 1, a,b,c_{n},\alpha _{n}\in \mathbb{R}, x_{t}\in K_{t}, $$
(4.1)

and take \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\). As one can see in the proofs of Corollaries 4.3 and 4.5, which are the core content of this section, one could as well take instead the space \(\mathcal{E}=\mathcal{C}_{b}(K_{0}) \times \cdots \times \mathcal{C}_{b}(K_{T})\) and preserve the validity of (4.3) and (4.5).
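As a small worked example of the form (4.1), put options are also contained in \(\mathcal{E}_{t}\). The strike \(\alpha \in {\mathbb{R}}\) below is notation introduced for this illustration. Checking the two cases \(x_{t}\geq \alpha \) and \(x_{t}<\alpha \) separately gives the put–call parity identity

$$ (\alpha -x_{t})^{+}=\alpha -x_{t}+(x_{t}-\alpha )^{+}, \qquad x_{t}\in K_{t}, $$

which is (4.1) with \(N=1\), \(a=\alpha \), \(b=-1\), \(c_{1}=1\) and \(\alpha _{1}=\alpha \).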

In all the results in Sect. 4.1, the functional \(U\) is real-valued on the whole space ℰ and cash-additive, which yields \(\mathrm{dom}(U)= \mathrm{dom}(S^{U})=\mathcal{E}\). Thus we can use Corollaries 2.5 and 2.12, in particular (2.16) and (2.17), in the case \(\mathrm{dom}(S^{U})=\mathcal{E}\).

Take \(U_{t}(\varphi _{t}):=\int _{K_{t}}u_{t}(\varphi _{t}(x_{t})) \mathrm{d}\widehat{Q}_{t}(x_{t})\). We work with the associated (one-dimensional) optimised certainty equivalent \(S^{U_{t}}\) that we rename, for \(\varphi _{t}\in \mathcal{C} _{b}(K_{t})\), as

$$ S^{U_{t}}(\varphi _{t})=\sup _{\alpha \in {\mathbb{R}}}\bigg( \int _{K_{t}}u_{t} \big(\varphi _{t}(x_{t})+\alpha \big)\mathrm{d}\widehat{Q}_{t}(x_{t})-\alpha \bigg)=:U_{\widehat{Q}_{t}}(\varphi _{t}). $$
(4.2)
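To make (4.2) concrete, here is a worked instance under the assumption (made only for this illustration) that \(u_{t}(x)=1-e^{-x}\). The map \(\alpha \mapsto 1-e^{-\alpha }E_{\widehat{Q}_{t}}[e^{-\varphi _{t}}]-\alpha \) is concave, and its derivative vanishes at \(\alpha ^{\ast }=\ln E_{\widehat{Q}_{t}}[e^{-\varphi _{t}}]\). Substituting gives

$$ U_{\widehat{Q}_{t}}(\varphi _{t})=\sup _{\alpha \in {\mathbb{R}}}\big( 1-e^{-\alpha }E_{\widehat{Q}_{t}}[e^{-\varphi _{t}}]-\alpha \big) =-\ln E_{\widehat{Q}_{t}}[e^{-\varphi _{t}}], \qquad \varphi _{t}\in \mathcal{C}_{b}(K_{t}), $$

the entropic certainty equivalent. One checks directly on this formula that \(U_{\widehat{Q}_{t}}(0)=0\) and \(U_{\widehat{Q}_{t}}(\varphi _{t}+\beta )=U_{\widehat{Q}_{t}}(\varphi _{t})+\beta \) for \(\beta \in {\mathbb{R}}\).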

We observe that Assumption 3.7 does not impose that the functions \(u_{t}\) are real-valued on all of ℝ. Nevertheless, for the functional \(U_{\widehat{Q}_{t}}\), it can be easily shown that we have the following result whose proof is omitted.

Lemma 4.2

Under Assumption 3.7, for each \(t=0,\dots ,T\), \(U_{\widehat{Q}_{t}}\) is real-valued on \(\mathcal{C}_{b}(K_{t})\) and null in 0, concave, nondecreasing and cash-additive.

Corollary 4.3

Take \(u_{0},\dots ,u_{T}\) satisfying Assumption 3.7 and suppose that we have \(\mathrm{dom}(u_{0})=\cdots = \mathrm{dom}(u_{T})={\mathbb{R}}\). Then

$$\begin{aligned} \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c] +\sum _{t=0}^{T}\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})\bigg) & =\sup \bigg\{ \sum _{t=0}^{T}U_{\widehat{Q}_{t}}(\varphi _{t}) : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} . \end{aligned}$$
(4.3)

Moreover, if the left-hand side of (4.3) is finite, a minimum point exists.

Proof

Set \(U(\varphi )=\sum _{t=0}^{T}U_{\widehat{Q}_{t}}(\varphi _{t})\) for \(\varphi \in \mathcal{E}\). By Lemma 4.2, for each \(t=0,\dots ,T\) the functional \(\varphi _{t}\mapsto U_{\widehat{Q}_{t}}(\varphi _{t})\) is well defined, finite-valued, concave and nondecreasing on all of \(\mathcal{C}_{b}(K_{t})\). Hence by the extended Namioka–Klee theorem (see [8]), it is norm-continuous on \(\mathcal{C}_{b}(K_{t})\). We observe that in this case, we are in Setup 3.2 and can apply (3.1) from Lemma 3.3. We have for every \(Q\in \mathrm{Mart}(\Omega )\) that

$$\begin{aligned} \mathcal{D}(Q)& :=\sup _{\varphi \in \mathcal{E}}\bigg( U(\varphi )- \sum _{t=0}^{T}\int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}\bigg) \\ & \phantom{:} =\sum _{t=0}^{T}\sup _{\varphi _{t} \in \mathcal{E}_{t}}\bigg( U_{ \widehat{Q}_{t}}(\varphi _{t} )-\int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}\bigg) \\ & \phantom{:} =\sum _{t=0}^{T}\sup _{\varphi _{t} \in \mathcal{C}_{b}(K_{t})}\bigg( U_{ \widehat{Q}_{t}}(\varphi _{t} )-\int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}\bigg) \\ & \phantom{:} = \sum _{t=0}^{T}\sup _{\varphi \in \mathcal{C}_{b}(K_{t}),\alpha _{t}\in{\mathbb{R}}}\bigg( \int _{K_{t}}u_{t}( \varphi _{t}+\alpha _{t})\,\mathrm{d}\widehat{Q}_{t}-\int _{K_{t}}( \varphi _{t}+\alpha _{t})\,\mathrm{d}Q_{t}\bigg) \\ & \phantom{:} = \sum _{t=0}^{T}\sup _{\varphi \in \mathcal{C}_{b}(K_{t})}\bigg( \int _{K_{t}}u_{t}(\varphi _{t})\,\mathrm{d} \widehat{Q}_{t}-\int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}\bigg) \\ & \phantom{:} =\sum _{t=0}^{T}\sup _{\psi _{t}\in \mathcal{C}_{b}(K_{t})}\bigg( \int _{K_{t}}\psi _{t}\,\mathrm{d}Q_{t}-\int _{K_{t}}v_{t}(\psi _{t}) \,\mathrm{d}\widehat{Q}_{t}\bigg) =\sum _{t=0}^{T}\mathcal{D}_{v_{t}^{\ast }, \widehat{Q}_{t}}(Q_{t}). \end{aligned}$$
(4.4)

Indeed, in (4.4), we combine the continuity of \(U_{\widehat{Q}_{t}}\) on \(\mathcal{C}_{b}(K_{t})\) with the fact that \(\mathcal{E}_{t}\) consists of all piecewise linear functions on \(K_{t}\) so that \(\mathcal{E}_{t}\) is norm-dense in \(\mathcal{C}_{b}(K_{t})\); in the fourth equality, we use the fact that for \(\widetilde{\varphi}_{t}:=\varphi _{t}+\alpha _{t} \) and for every \({Q}\in \mathrm{Mart}(\Omega )\), \(\int _{K_{t}}\widetilde{\varphi}_{t}\,\mathrm{d}{Q}_{t}=\int _{K_{t}}{\varphi}_{t}\,\mathrm{d}Q_{t}+\alpha _{t} \), since \(Q_{t}\) is a probability measure; in the fifth equality, we exploit the fact that \(\widetilde{\varphi}_{t}\in \mathcal{C}_{b}(K_{t})\) for every \(\varphi _{t}\in \mathcal{C}_{b}(K_{t})\), \(\alpha _{t} \in {\mathbb{R}}\); and the last equality follows from (3.6).

Using Lemma 3.3 and the fact that \(U_{\widehat{Q}_{0}},\dots ,U_{\widehat{Q}_{T}}\) are cash-additive, we obtain \(S^{U}(\varphi )=\sum _{t=0}^{T}S^{U_{\widehat{Q}_{t}}}(\varphi _{t})= \sum _{t=0}^{T}U_{\widehat{Q}_{t}}(\varphi _{t})=U(\varphi )\). By Lemma 4.2, the assumptions of Corollary 2.13 are satisfied so that we obtain

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q} [ c(X) ] +\sum _{t=0}^{T} \mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})\bigg) =\sup \bigg\{ \sum _{t=0}^{T}U_{\widehat{Q}_{t}}(\varphi _{t}): \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} . $$

Existence of optima follows again from Corollary 2.13. □

We stress the fact that in Corollary 4.3, we assume that all the functions \(u_{0},\dots ,u_{T}\) are real-valued on all of ℝ. A more general result can be obtained when weakening this assumption, but it requires an additional assumption on the marginals of \(\widehat{Q}\).

Corollary 4.4

Suppose Assumption 3.7 is fulfilled. Assume that \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\) have full support on \(K_{0},\dots ,K_{T}\), respectively. Then (4.3) holds true if we replace \(\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})\) with \(\mathcal{F}_{t}(Q_{t}| \widehat{Q}_{t})\). Moreover, finiteness of the problem on the left-hand side of (4.3) implies the existence of a minimum.

Proof

The proof carries over almost literally from the proof of Corollary 4.3, except for replacing the reference to Proposition 3.9 with a reference to Proposition 3.10. □

We stress that in Corollary 4.4, we impose the full support property on \(K_{0},\dots ,K_{T}\) with respect to their induced (Euclidean) topology. In particular, this means that whenever \(k_{t}\in K_{t}\) is an isolated point, \(\widehat{Q}_{t}[\{k_{t}\}]>0\). This is consistent with the assumption \(K_{0}=\{x_{0}\}\), which implies that \(\mathrm{Prob}(K_{0})\) reduces to the Dirac measure, i.e., \(\mathrm{Prob}(K_{0})=\{\delta _{x_{0}}\}\).

We now take \(u_{t}(x)=x\) for \(t=0,\dots ,T\) and get \(U_{\widehat{Q}_{t}}(\varphi _{t})= {E}_{\widehat{Q}_{t}}[\varphi _{t}]\). Hence an easy computation yields for all \(Q\in \mathrm{Mart}(\Omega )\) that

$$ \mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})=\textstyle\begin{cases} 0 & \qquad \text{if }Q_{t}= \widehat{Q}_{t}, \\ \infty &\qquad \text{otherwise. }\end{cases}$$
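
The computation can be sketched as follows, assuming the dual representation (3.6) of the divergence (as recalled in the proof of Corollary 5.1) and the convention \(v_{t}(x)=-u_{t}(-x)\), so that \(v_{t}(x)=x\) when \(u_{t}(x)=x\):

$$ \mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})=\sup _{\varphi _{t}\in \mathcal{C}_{b}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}-\int _{K_{t}}\varphi _{t}\,\mathrm{d}\widehat{Q}_{t}\bigg) =\sup _{\varphi _{t}\in \mathcal{C}_{b}(K_{t})}\int _{K_{t}}\varphi _{t}\,\mathrm{d}(Q_{t}-\widehat{Q}_{t}) . $$

If \(Q_{t}=\widehat{Q}_{t}\), every term in the supremum vanishes. Otherwise, there is some \(\varphi _{t}\in \mathcal{C}_{b}(K_{t})\) with \(\int _{K_{t}}\varphi _{t}\,\mathrm{d}(Q_{t}-\widehat{Q}_{t})>0\), and replacing \(\varphi _{t}\) by \(\lambda \varphi _{t}\) with \(\lambda \to \infty \) shows that the supremum is \(\infty \).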

Recalling that \(\mathrm{Mart}(\widehat{Q}_{0},\dots ,\widehat{Q}_{T})=\{Q\in \mathrm{Mart}(\Omega ) : Q_{t}= \widehat{Q}_{t},\forall \,t=0,\dots ,T \}\), we recover from Corollary 4.3 the following result of Beiglböck et al. [4] (under the compactness assumption, which will be dropped in Corollary 5.3).

Corollary 4.5

We have the equality

$$\begin{aligned} \inf _{Q\in \mathrm{Mart}(\widehat{Q}_{0},\dots ,\widehat{Q}_{T})}E_{Q} [ c ] & =\sup \bigg\{ \sum _{t=0}^{T}{E}_{\widehat{Q}_{t}}[\varphi _{t}] : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} . \end{aligned}$$
(4.5)

Moreover, if the left-hand side of (4.5) is finite, a minimum point exists.

4.2 Subhedging without options

The pricing–hedging duality without options takes the following form.

Corollary 4.6

We have the equality

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}E_{Q}[c]=\sup \{ m\in {\mathbb{R}} : \exists \Delta \in \mathcal{H}\textit{ with }m+I^{\Delta} \leq c \} =:\Pi ^{{{\mathrm {sub}}}}(c) . $$
(4.6)

Moreover, if the left-hand side of (4.6) is finite, a minimum point exists.

Proof

We take \(\mathcal{E}_{0}=\cdots =\mathcal{E}_{T}={\mathbb{R}}\) and \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}={\mathbb{R}}^{T+1}\). If \(u_{t}(x_{t})=x_{t}\), \(t=0,\dots ,T\), and \(\widehat{Q}\in \mathrm{Mart}(\Omega )\), the functional \(U_{\widehat{Q}_{t}}\) defined in (4.2) is given by \(U_{\widehat{Q}_{t}}(m_{t})=m_{t}\) and so \(U(m)=\sum _{t=0}^{T}U_{\widehat{Q}_{t}}(m_{t})=\sum _{t=0}^{T}m_{t}\) for all \(m\in \mathcal{E}\). Hence for each \(\varphi \in \mathcal{E}\) with \(\varphi =(m_{0},\dots ,m_{T})\), \(m\in {\mathbb{R}}^{T+1}\), we select \(U(\varphi )=\sum _{t=0}^{T}m_{t}\). Then by the definition of \(\mathcal{D}\) (see (2.10)), we get

$$\begin{aligned} \mathcal{D}(\gamma )= \textstyle\begin{cases} 0 &\text{ for }\gamma \in \mathrm{ca}(\Omega )\text{ with }\gamma ( \Omega )=1, \\ \infty &\text{ otherwise.} \end{cases}\displaystyle \end{aligned}$$

In particular, \(\mathcal{D}(Q)=0\) for every \(Q\in \mathrm{Mart}(\Omega )\). Moreover, we observe that we have \({S}^{U}(\varphi )=U(\varphi )\) for every \(\varphi \in \mathcal{E}\). Applying Corollary 2.13, we get from (2.18) that

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}E_{Q}[c] =\sup \bigg\{ \sum _{t=0}^{T}m_{t} : \exists \,\Delta \in \mathcal{H}\text{ such that }\sum _{t=0}^{T}m_{t}+I^{ \Delta }\leq c\bigg\} . $$

We recognise on the right-hand side above the right-hand side of (4.6). Finally, the existence of optima follows again from Corollary 2.13. □
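
The form of \(\mathcal{D}\) used in the proof above can be checked directly. The following sketch assumes that (2.10), which is not reproduced in this section, defines \(\mathcal{D}(\gamma )\) as the supremum over \(\varphi \in \mathcal{E}\) of \(U(\varphi )-\int _{\Omega }\sum _{t=0}^{T}\varphi _{t}\,\mathrm{d}\gamma \); under this assumption, for \(\gamma \in \mathrm{ca}(\Omega )\),

$$ \mathcal{D}(\gamma )=\sup _{m\in {\mathbb{R}}^{T+1}}\bigg( \sum _{t=0}^{T}m_{t}-\gamma (\Omega )\sum _{t=0}^{T}m_{t}\bigg) =\sup _{m\in {\mathbb{R}}^{T+1}}\big( 1-\gamma (\Omega )\big) \sum _{t=0}^{T}m_{t} . $$

The last supremum equals \(0\) when \(\gamma (\Omega )=1\) and \(\infty \) otherwise, since \(\sum _{t=0}^{T}m_{t}\) ranges over all of \({\mathbb{R}}\).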

4.3 Penalty terms induced by market prices

In this section, we change our perspective. Instead of starting from a given \(U\), we give a particular form of the penalisation term \(\mathcal{D}\) and proceed by identifying the corresponding \(U\) in the spirit of Remark 2.15. For each \(t=0,\dots ,T\), we suppose that finite sequences \((c_{t,n})_{1\leq n\leq N_{t}}\) in ℝ and \((f_{t,n})_{1\leq n\leq N_{t}}\) in \(\mathcal{C}_{b}(K_{t})\) are given. The functions \(f_{t,n}\) represent payoffs of options whose prices \(c_{t,n}\) are known from the market. We also take \(\mathcal{E}_{t}=\mathcal{C}_{b}(K_{t})\) and define

$$ \mathrm{Mart}_{t}(K_{t})= \{ \gamma _{t}\in \mathrm{Prob}(K_{t}) : \exists \,Q\in \mathrm{Mart}(\Omega )\text{ with }\gamma _{t}= Q_{t} \} \subseteq \mathrm{ca}(K_{t}) . $$

Lemma 4.7

The set \(\mathrm{Mart}_{t}(K_{t})\) is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact.

Proof

Consider the topology \(\tau = \sigma (\mathrm{ca}(\Omega ),\mathcal{C}_{b}(\Omega ))\). We see that \(\mathrm{Mart}(\Omega )\) is a \(\tau \)-closed subset of the \(\tau \)-compact set \(\mathrm{Prob}(\Omega )\) (which is \(\tau \)-compact since \(\Omega \) is a compact Polish space, see [1, Theorem 15.11]); hence it is \(\tau \)-compact. Then \(\mathrm{Mart}_{t}(K_{t})\) is the image of a \(\tau \)-compact set via the marginal map \(\gamma \mapsto \gamma _{t}\) which is \(\tau \)-\(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-continuous; hence it is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact. □
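
The continuity of the marginal map used in the proof above can be verified in one line. For \(f\in \mathcal{C}_{b}(K_{t})\), the function \(f\circ X_{t}\) belongs to \(\mathcal{C}_{b}(\Omega )\) because the projection \(X_{t}\) is continuous, and

$$ \int _{K_{t}}f\,\mathrm{d}\gamma _{t}=\int _{\Omega }f\circ X_{t}\,\mathrm{d}\gamma , \qquad \gamma \in \mathrm{ca}(\Omega ) . $$

Hence if a net \((\gamma ^{\alpha })\) converges to \(\gamma \) in \(\tau \), then \(\int _{K_{t}}f\,\mathrm{d}\gamma _{t}^{\alpha }\rightarrow \int _{K_{t}}f\,\mathrm{d}\gamma _{t}\) for every \(f\in \mathcal{C}_{b}(K_{t})\).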

We introduce the notion of a loss function that will be useful here and also in the sequel (see Sects. 4.4 and 4.4.1) to build penalisation functions.

Definition 4.8

A function \(G:{\mathbb{R}}\rightarrow (-\infty ,+\infty ] \) is called a loss function if it is convex, nondecreasing, lower semicontinuous and satisfies \(G(0)=0\). We set \(\mathrm{dom}(G):= \{ x\in {\mathbb{R}} : G(x)< \infty \} \). The conjugate function \(G^{\ast }:{\mathbb{R}}\rightarrow (-\infty ,+\infty ]\) defined by \(G^{\ast }(y)=\sup _{x\in {\mathbb{R}}}(xy-G(x))\) satisfies by the monotonicity of \(G\) that \(G^{\ast }(y)=\infty \) for every \(y<0\).

Our requirements allow a wide range of penalisations. For example, we might use power-like penalisations, i.e., \(G(x)=\frac{x^{p}}{{p}}\) for \(x>0\) and \({p}\in (1, \infty )\), \(G(x)=0\) for \(x\leq 0\). In that case, we have \(G^{\ast }(y)=\frac{y^{q}}{{q}}\) for every \(y\geq 0\) for \(\frac{1}{p}+\frac{1}{q}=1\). Alternatively, we might take

$$ G(x)=\textstyle\begin{cases} 0 & \quad \text{ if }x\leq \varepsilon , \\ \infty & \quad \text{ otherwise,}\end{cases}\displaystyle \qquad \text{so that $G^{\ast }(y)=\varepsilon y$ for $y\geq 0$.} $$
(4.7)
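
Both conjugates can be verified by a short computation. For the power-like loss and \(y\geq 0\), the supremum defining \(G^{\ast }(y)\) can be restricted to \(x\geq 0\), since \(xy-G(x)=xy\leq 0\) for \(x\leq 0\). The first-order condition \(y=x^{p-1}\) gives the maximiser \(x^{\ast }=y^{1/(p-1)}\), and with \(q=\frac{p}{p-1}\),

$$ G^{\ast }(y)=\sup _{x\geq 0}\Big( xy-\frac{x^{p}}{p}\Big) =y^{\frac{p}{p-1}}-\frac{1}{p}y^{\frac{p}{p-1}}=\Big( 1-\frac{1}{p}\Big) y^{q}=\frac{y^{q}}{q} . $$

For the loss function in (4.7) and \(y\geq 0\), only the points \(x\leq \varepsilon \) contribute, so that

$$ G^{\ast }(y)=\sup _{x\leq \varepsilon }xy=\varepsilon y . $$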

For \(\gamma _{t}\in \mathrm{ca}(K_{t})\) we set

$$ \mathcal{D}^{G}_{t}(\gamma _{t}):=\textstyle\begin{cases} \sum _{n=1}^{N_{t}}G_{t,n} ( \vert \int _{K_{t}}f_{t,n}\,\mathrm{d}\gamma _{t}-c_{t,n} \vert ) & \quad \text{ for }\gamma _{t}\in \mathrm{Mart}_{t}(K_{t}), \\ \infty & \quad \text{ otherwise}. \end{cases} $$

Proposition 4.9

Assume that \(G_{t,n}:{\mathbb{R}}\rightarrow (-\infty ,+\infty ]\) is a loss function for all \(n= 1,\dots ,N_{t}\) and \(t=0,\dots ,T\), and that the martingale measure \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) in the Standing Assumption 4.1 also satisfies \(\vert \int _{K_{t}}f_{t,n}\,\mathrm{d}\widehat{Q}_{t}-c_{t,n}\vert \in \mathrm{dom}(G_{t,n})\) for all such \(n\) and \(t\). Then

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c] +\sum _{t=0}^{T}\mathcal{D}_{t}^{G}(Q_{t})\bigg) =\sup \bigg\{ \sum _{t=0}^{T}U_{t}^{G}( \varphi _{t}) : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} , $$
(4.8)

where

$$ U_{t}^{G}(\varphi _{t}):=\sup _{y_{t}\in {\mathbb{R}}^{N_{t}}}\bigg( \Pi ^{{{\mathrm {sub}}}}\Big( \varphi _{t}+\sum _{n=1}^{N_{t}}y_{t,n}(f_{t,n}-c_{t,n}) \Big) -\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) $$

and \(\Pi ^{{{\mathrm {sub}}}}\) is given in (4.6). Finally, if the left-hand side of (4.8) is finite, a minimum point exists.

Proof

1) Set \(g_{t,n}:=f_{t,n}-c_{t,n} \). For any \(t\in \{0,\dots ,T\}\), we prove that the functional \(\mathcal{D}_{t}^{G}\) is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-lower semicontinuous and that for every \(\varphi _{t}\in \mathcal{C}_{b}(K_{t})\), its Fenchel–Moreau (convex) conjugate satisfies

$$\begin{aligned} V_{t}^{G}(\varphi _{t})&:=\sup _{\gamma _{t}\in \mathrm{ca}(K_{t})} \bigg( \int _{K_{t}}\varphi _{t}\,\mathrm{d}\gamma _{t}-\mathcal{D}_{t}^{G}( \gamma _{t})\bigg) \\ & \phantom{:} =\inf _{y_{t}\in {\mathbb{R}}^{N_{t}}}\bigg( \Pi ^{{\sup}}\Big( \varphi _{t}-\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n}\Big) +\sum _{n=1}^{N_{t}}G_{t,n}^{ \ast }(y_{t,n})\bigg) , \end{aligned}$$

and thus

$$ U_{t}^{G}(\varphi _{t}):=-V_{t}^{G}(-\varphi _{t})=\sup _{y_{t}\in { \mathbb{R}}^{N_{t}}}\bigg( \Pi ^{{{\mathrm {sub}}}}\Big( \varphi _{t}+\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n} \Big) -\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) . $$
(4.9)

Here we use the definition of the superhedging price as

$$ \Pi ^{{\sup}}(g):=-\Pi ^{{{\mathrm {sub}}}}(-g)=\sup _{Q\in \mathrm{Mart}(\Omega )}E_{Q}[ g] $$

by Corollary 4.6. We observe that \(\mathcal{D}_{t}^{G} \) is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-lower semicontinuous as it is a sum of functions, each being a composition of a lower semicontinuous function and a continuous function on \(\mathrm{Mart}_{t}(K_{t})\) which is \(\sigma (\mathrm{ca}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact by Lemma 4.7. We now need to compute

$$\begin{aligned} V_{t}^{G}(\varphi _{t})&=\sup _{\gamma _{t}\in \mathrm{ca}(K_{t})} \bigg( \int _{K_{t}}\varphi _{t}\,\mathrm{d}\gamma _{t}-\mathcal{D}_{t}^{G}( \gamma _{t})\bigg) \\ &=\sup _{Q_{t}\in \mathrm{Mart}_{t}(K_{t})}\bigg( \int _{K_{t}} \varphi _{t}\,\mathrm{d}Q_{t}-\mathcal{D}_{t}^{G}(Q_{t})\bigg) . \end{aligned}$$

Recall that \(G_{t,n}(x)=\sup _{y\in {\mathbb{R}}}(xy-G_{t,n}^{\ast }(y))\) by the Fenchel–Moreau theorem. Hence

$$\begin{aligned} V_{t}^{G}(\varphi _{t}) & =\sup _{Q_{t}\in \mathrm{Mart}_{t}(K_{t})} \bigg( \int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}-\sum _{n=1}^{N_{t}} \sup _{y_{t,n}\in {\mathbb{R}}}\Big( y_{t,n}\int _{K_{t}}g_{t,n}\, \mathrm{d}Q_{t}-G_{t,n}^{\ast }(y_{t,n})\Big) \bigg) \\ & =\sup _{Q_{t}\in \mathrm{Mart}_{t}(K_{t})}\inf _{y_{t}\in \mathrm{dom}}\bigg( \int _{K_{t}}\Big( \varphi _{t}-\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n} \Big) \,\mathrm{d}Q_{t}+\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) \\ & =:\sup _{Q_{t}\in \mathrm{Mart}_{t}(K_{t})}\inf _{y_{t}\in \mathrm{dom}} \mathcal{T}(y_{t},Q_{t}), \end{aligned}$$

where \(\mathrm{dom}=\mathrm{dom}(G_{t,1}^{\ast })\times \cdots \times \mathrm{dom}(G_{t,N_{t}}^{\ast })\subseteq {\mathbb{R}}^{N_{t}}\). We see that \(\mathcal{T}\) is real-valued on \(\mathrm{dom}\times \mathrm{Mart}_{t}(K_{t})\), convex in the first variable and concave in the second. Moreover, \(\{\mathcal{T}(y_{t},\,\cdot \, )\geq C\}\) is \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-closed in \(\mathrm{Mart}_{t}(K_{t})\) for every \(y_{t}\in \mathrm{dom}\) and every \(C\in {\mathbb{R}}\), and \(\mathrm{Mart}_{t}(K_{t})\) is \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-compact by Lemma 4.7. As a consequence, \(\mathcal{T}(y_{t},\,\cdot \, )\) is \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\)-upper semicontinuous on \(\mathrm{Mart}_{t}(K_{t})\). We can apply Simons [36, Theorem 3.1], with \(A=\mathrm{dom}\) and \(B=\mathrm{Mart}_{t}(K_{t})\) endowed with the topology \(\sigma (\mathrm{Mart}_{t}(K_{t}),\mathcal{C}_{b}(K_{t}))\), and interchange inf and sup. From our previous computations, we then get

$$\begin{aligned} V_{t}^{G}(\varphi _{t})& =\sup _{Q_{t}\in \mathrm{Mart}_{t}(K_{t})}\inf _{y_{t}\in \mathrm{dom}}\mathcal{T}(y_{t},Q_{t}) \\ &=\inf _{y_{t}\in \mathrm{dom}}\sup _{Q_{t}\in \mathrm{Mart}_{t}(K_{t})}\mathcal{T}(y_{t},Q_{t}) \\ & =\inf _{y_{t}\in \mathrm{dom}}\bigg( \sup _{Q_{t}\in \mathrm{Mart}_{t}(K_{t})}\int _{K_{t}}\Big( \varphi _{t}-\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n} \Big) \,\mathrm{d}Q_{t}+\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) \\ & =\inf _{y_{t}\in \mathrm{dom}}\bigg( \sup _{Q\in \mathrm{Mart}( \Omega )}\int _{\Omega }\Big( \varphi _{t}-\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n} \Big) \,\mathrm{d}Q+\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) \\ & {=}\inf _{y_{t}\in \mathrm{dom}}\bigg( \Pi ^{{\sup}}\Big( \varphi _{t}- \sum _{n=1}^{N_{t}}y_{t,n}g_{t,n}\Big) +\sum _{n=1}^{N_{t}}G_{t,n}^{ \ast }(y_{t,n})\bigg) \\ & =\inf _{y_{t}\in {\mathbb{R}}^{N_{t}}}\bigg( \Pi ^{{\sup}}\Big( \varphi _{t}-\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n}\Big) +\sum _{n=1}^{N_{t}}G_{t,n}^{ \ast }(y_{t,n})\bigg) . \end{aligned}$$

Equation (4.9) can be obtained with minor manipulations.
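
For completeness, the manipulations can be sketched as follows. They only use \(-\inf =\sup (-\,\cdot \,)\) and the relation \(\Pi ^{{{\mathrm {sub}}}}(g)=-\Pi ^{{\sup}}(-g)\):

$$\begin{aligned} -V_{t}^{G}(-\varphi _{t})& =-\inf _{y_{t}\in {\mathbb{R}}^{N_{t}}}\bigg( \Pi ^{{\sup}}\Big( -\varphi _{t}-\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n}\Big) +\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) \\ & =\sup _{y_{t}\in {\mathbb{R}}^{N_{t}}}\bigg( -\Pi ^{{\sup}}\Big( -\Big( \varphi _{t}+\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n}\Big) \Big) -\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) \\ & =\sup _{y_{t}\in {\mathbb{R}}^{N_{t}}}\bigg( \Pi ^{{{\mathrm {sub}}}}\Big( \varphi _{t}+\sum _{n=1}^{N_{t}}y_{t,n}g_{t,n}\Big) -\sum _{n=1}^{N_{t}}G_{t,n}^{\ast }(y_{t,n})\bigg) . \end{aligned}$$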

2) To conclude, we are clearly in the setup of Corollary 2.13 with \(\mathcal{D}\) given as in Setup 3.5 from \(\mathcal{D}_{0}^{G},\dots ,\mathcal{D}_{T}^{G}\), and by definition \(\mathrm{dom}(\mathcal{D}_{t}^{G})\subseteq \mathrm{Prob}(K_{t})\) for each \(t=0,\dots ,T\). Using Lemma 3.6, 2) together with the computations in 1) and the fact that \(S^{U_{t}^{G}}= U_{t}^{G}\) by cash-additivity of \(U_{t}^{G}\), we get the desired equality from (2.18) in Corollary 2.13; indeed, \(G_{t,n}^{\ast }\) is bounded from below and proper by our assumptions on \(G_{t,n}\), and \(\Pi ^{{\mathrm {sub}}}\) is real-valued and cash-additive on bounded continuous functions. This guarantees that \(V_{t}^{G}(\varphi _{t})\) is null for an appropriate choice of (constant) \(\varphi _{t}\). The existence of optima follows again from Corollary 2.13. □

Remark 4.10

Our assumption of existence of a particular \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) in Proposition 4.9 expresses the fact that we are assuming that our market prices \(c_{t,n}\) are close enough to those given by expectations under some martingale measure.

Example 4.11

Proposition 4.9 covers a wide range of penalisations. For example, we might impose a threshold for the error in computing option prices by taking into account only those martingale measures \(Q\) such that \(\vert \int _{K_{t}}f_{t,n}\,\mathrm{d}Q_{t}-c_{t,n}\vert \leq \varepsilon _{t,n}\) for some \(\varepsilon _{t,n}\geq 0\). To express this, just take \(G_{t,n}\) in the form (4.7) for \(\varepsilon =\varepsilon _{t,n}\).

Example 4.12

We now study the convergence of the penalised problem described above to the classical MOT problem. We suppose that our information on the marginal distributions increases, by increasing the number of prices available from the market. We take \(f_{t,n}(x_{t})=(x_{t}-\alpha _{n})^{+}\) to be call options with maturity \(t\) and strikes \(\alpha _{n}\), \(n \in \mathbb{N}\), that form a dense subset of ℝ.

We take as loss functions \(G_{t,n}(x)=0\) for \(x\leq 0\) and \(G_{t,n}(x)=\infty \) for all \(x> 0\), \(t=0,\dots ,T\), \(n\geq 1\). This means that on the left-hand side of (4.8), the infimum is taken only over martingale measures whose theoretical prices exactly match the ones for the data, namely \(c_{t,n}\). For each \(t=0,\dots ,T\), \((c_{t,n})\) is a given sequence of prices, and we suppose that they are all computed under the same martingale measure \(\widehat{Q}\in \mathrm{Mart}(\Omega )\). We consider for each \(k\in \mathbb{N}\) the initial segment \(c_{t,1},\dots ,c_{t,N_{t}(k)}\) for sequences \(N_{t}(k)\uparrow \infty \), \(t=0,\dots ,T\). This means that for every \(Q\in \mathrm{Mart}(\Omega )\),

$$\begin{aligned} \mathcal{D}_{k}(Q)& :=\sum _{t=0}^{T}\sum _{n=1}^{N_{t}(k)}G_{t,n} \bigg( \bigg\vert \int _{K_{t}}f_{t,n}\,\mathrm{d}Q_{t}-c_{t,n} \bigg\vert \bigg) \\ & \phantom{:} \leq \sum _{t=0}^{T}\sum _{n=1}^{N_{t}(k+1)}G_{t,n}\bigg( \bigg\vert \int _{K_{t}}f_{t,n}\,\mathrm{d}Q_{t}-c_{t,n}\bigg\vert \bigg) = \mathcal{D}_{k+1}(Q) \end{aligned}$$

and

$$ \mathcal{D}_{\infty }(Q)=\sup _{k}\mathcal{D}_{k+1}(Q)=\sum _{t=0}^{T}\sum _{n=1}^{\infty }G_{t,n}\bigg( \bigg\vert \int _{K_{t}}f_{t,n}\, \mathrm{d}Q_{t}-c_{t,n}\bigg\vert \bigg), $$

so that

$$ \mathcal{D}_{\infty }(Q)=\textstyle\begin{cases} 0 &\quad \text{if $\int _{K_{t}}f_{t,n}\,\mathrm{d}Q_{t}=c_{t,n}$ for all $ 0\leq t\leq T,n\geq 1$,} \\ \infty & \quad \text{otherwise.}\end{cases}$$

From the denseness of \((\alpha _{n})\), we conclude that \(\mathcal{D}_{\infty }(Q)=0\) if \(Q_{t}= \widehat{Q}_{t}\) for \(0\leq t\leq T\), and \(\mathcal{D}_{\infty }(Q)= \infty \) otherwise. As a consequence, by Proposition 2.26, we have the convergence

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\big( E_{Q}[c]+\mathcal{D}_{k}(Q) \big) \longrightarrow \inf _{Q\in \mathrm{Mart}(\widehat{Q}_{0},\dots , \widehat{Q}_{T})}E_{Q}[c]\qquad \text{as $k \to \infty $.} $$
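
The step from the denseness of the strikes to \(Q_{t}=\widehat{Q}_{t}\) can be made explicit as follows. The notation \(C_{\mu }\) is introduced here only for illustration and is not used elsewhere. For \(\mu \in \mathrm{Prob}(K_{t})\), set \(C_{\mu }(\alpha ):=\int _{K_{t}}(x_{t}-\alpha )^{+}\,\mathrm{d}\mu (x_{t})\) for \(\alpha \in {\mathbb{R}}\). Since \(\vert (x_{t}-\alpha )^{+}-(x_{t}-\beta )^{+}\vert \leq \vert \alpha -\beta \vert \), the function \(C_{\mu }\) is continuous, so that \(C_{Q_{t}}(\alpha _{n})=C_{\widehat{Q}_{t}}(\alpha _{n})\) for all \(n\) implies \(C_{Q_{t}}=C_{\widehat{Q}_{t}}\) on ℝ. Moreover, dominated convergence yields

$$ \lim _{h\downarrow 0}\frac{C_{\mu }(\alpha +h)-C_{\mu }(\alpha )}{h}=-\mu \big( (\alpha ,\infty )\big) , \qquad \alpha \in {\mathbb{R}} . $$

Hence \(Q_{t}\) and \(\widehat{Q}_{t}\) assign the same mass to every half-line \((\alpha ,\infty )\) and therefore coincide.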

4.4 Penalty terms given via Wasserstein distance

Let \(d_{t}\) be a metric on \(K_{t}\) (equivalent to the Euclidean one). The (1-)Wasserstein distance induced by \(d_{t}\) is denoted by \(W_{t}:\mathrm{Prob}(K_{t})\times \mathrm{Prob}(K_{t})\rightarrow {\mathbb{R}}\). Let \(\mathrm{Lip}(1,K_{t})\) be the class of \(d_{t}\)-Lipschitz functions on \(K_{t}\) with Lipschitz constant at most 1. Notice that \(\mathrm{Lip}(1,K_{t})\subseteq \mathcal{C}_{b}(K_{t})\) since \(d_{t}\) is equivalent to the Euclidean metric and \(K_{t}\) is compact. For each \(t\), let \(G_{t}:{\mathbb{R}}\rightarrow (-\infty ,+\infty ]\) be a loss function as in Definition 4.8. For \(\mathrm{Mart}_{t}(K_{t})\) as in Sect. 4.3, we introduce

$$ \mathrm{Prob}(K_{t}) \ni Q_{t} \mapsto \mathcal{D}^{W}_{t}(Q_{t}):=\textstyle\begin{cases} G_{t} (W_{t}(Q_{t},\widehat{Q}_{t}) ) & \quad \text{for }Q_{t}\in \mathrm{Mart}_{t}(K_{t}), \\ \infty &\quad \text{otherwise.}\end{cases} $$
(4.10)

Then \(\mathcal{D}^{W}_{t}\) is lower semicontinuous with respect to the topology of weak convergence of probability measures, since the Wasserstein metric metrises the latter for compact underlying spaces and \(\mathrm{Mart}_{t}(K_{t})\) is compact under that topology by Lemma 4.7. We are then in Setup 3.5 and the case 2) of Lemma 3.6. As in Sect. 4.3, we take \(\mathcal{E}_{t}=\mathcal{C}_{b}(K_{t})\).

Proposition 4.13

For each \(t=0,\dots ,T\), suppose that \(G_{t}\) is a loss function, that there exists a \(Q\in \mathrm{Mart}(\Omega )\) with \(G_{t}(W_{t}(Q_{t},\widehat{Q}_{t}))<\infty \), where \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) is the martingale measure from the Standing Assumption 4.1, and take \(\mathcal{D}^{W}_{t}\) as in (4.10). Then

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c] +\sum _{t=0}^{T}\mathcal{D}^{W}_{t}(Q_{t})\bigg) =\sup \bigg\{ \sum _{t=0}^{T}U^{W}_{t}( \varphi _{t}) : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} , $$
(4.11)

where

$$ U^{W}_{t}(\varphi _{t}):=\sup _{ \substack{ y\geq 0, \\ \ell _{t}\in \mathrm{Lip}(1,K_{t})}}\bigg( \Pi ^{{\mathrm {sub}}}(\varphi _{t}+y\ell _{t})-\int _{K_{t}}y \ell _{t}\mathrm{d}\widehat{Q}_{t}-G_{t}^{\ast }(y)\bigg) $$

and \(\Pi ^{{{\mathrm {sub}}}}\) is given in (4.6). Finally, if the left-hand side of (4.11) is finite, a minimum point exists.

Proof

Starting from \(\mathcal{D}^{W}_{t}\), we compute the associated \(V^{W}_{t}\) as

$$\begin{aligned} V^{W}_{t}(\varphi _{t}) &:=\sup _{\gamma \in \mathrm{ca}(K_{t})} \bigg( \int _{K_{t}}\varphi _{t}\mathrm{d}\gamma -\mathcal{D}^{W}_{t}( \gamma )\bigg) \\ & \phantom{:} =\sup _{Q\in \mathrm{Mart}_{t}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t} \mathrm{d}Q-G_{t}\big(W_{t}(Q,\widehat{Q}_{t})\big)\bigg) \\ & \phantom{:} = \sup _{Q\in \mathrm{Mart}_{t}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t} \mathrm{d}Q-\sup _{y\geq 0}\big( yW_{t}(Q,\widehat{Q}_{t})-G_{t}^{\ast }(y)\big) \bigg) \\ & \phantom{:} =\sup _{Q\in \mathrm{Mart}_{t}(K_{t})}\inf _{y\geq 0}\bigg( \int _{K_{t}} \varphi _{t}\mathrm{d}Q-yW_{t}(Q,\widehat{Q}_{t})+G_{t}^{\ast }(y) \bigg) \\ & \phantom{:} = \sup _{Q\in \mathrm{Mart}_{t}(K_{t})}\inf _{ \substack{ y\in \mathrm{dom}(G_{t}^{\ast }), \\ \ell _{t} \in \mathrm{Lip}(1,K_{t})}} \bigg( \int _{K_{t}}(\varphi _{t}-y\ell _{t} )\mathrm{d}Q+\int _{K_{t}}y \ell _{t} \mathrm{d}\widehat{Q}_{t}+G_{t}^{\ast }(y)\bigg) \\ & \phantom{:} = \inf _{ \substack{ y\in \mathrm{dom}(G_{t}^{\ast }), \\ \ell _{t} \in \mathrm{Lip}(1,K_{t})}}\bigg( \sup _{Q\in \mathrm{Mart}_{t}(K_{t})} \Big( \int _{K_{t}}(\varphi _{t}-y\ell _{t} )\mathrm{d}Q\Big) +\int _{K_{t}}y \ell _{t}\mathrm{d}\widehat{Q}_{t}+G_{t}^{\ast }(y)\bigg) \\ & \phantom{:} = \inf _{ \substack{ y\in \mathrm{dom}(G_{t}^{\ast }), \\ \ell _{t}\in \mathrm{Lip}(1,K_{t})}} \bigg( \Pi ^{\sup}(\varphi _{t}-y\ell _{t})+\int _{K_{t}}y\ell _{t} \mathrm{d}\widehat{Q}_{t}+G_{t}^{\ast }(y)\bigg) \\ & \phantom{:} =\inf _{\substack{ y\geq 0, \\ \ell _{t}\in \mathrm{Lip}(1,K_{t}) }}\big( \Pi ^{\sup}(\varphi _{t}-y\ell _{t})+\alpha (y,\ell _{t}) \big) \end{aligned}$$

for the penalty \(\alpha (y,\ell _{t}):=\int _{K_{t}}y\ell _{t}\mathrm{d}\widehat{Q}_{t}+G_{t}^{\ast }(y)\). In the equality chain above, we use the following: in the third equality, the dual representation of \(G_{t}\); in the fifth, the definition of \(\mathrm{dom}(G_{t}^{\ast })\) and the classical Kantorovich–Rubinstein duality (see Villani [38, Remark 6.5]); in the sixth, Simons [36, Theorem 3.1] (observe that \(\mathrm{Mart}_{t}(K_{t})\) is compact by Lemma 4.7); in the seventh, the definition of the superhedging price \(\Pi ^{{\sup}}(g):=-\Pi ^{{{\mathrm {sub}}}}(-g)=\sup _{Q\in \mathrm{Mart}(\Omega )}E_{Q}[ g] \) by Corollary 4.6. Once we have \(V^{W}_{t}\), we have \(U^{W}_{t}\), and then we can argue as in step 2) of the proof of Proposition 4.9, also regarding existence of an optimum. □
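
The Kantorovich–Rubinstein step in the fifth equality can be spelled out as follows. The duality reads

$$ W_{t}(Q,\widehat{Q}_{t})=\sup _{\ell _{t}\in \mathrm{Lip}(1,K_{t})}\bigg( \int _{K_{t}}\ell _{t}\,\mathrm{d}Q-\int _{K_{t}}\ell _{t}\,\mathrm{d}\widehat{Q}_{t}\bigg) , $$

so that for every \(y\geq 0\),

$$ -yW_{t}(Q,\widehat{Q}_{t})=\inf _{\ell _{t}\in \mathrm{Lip}(1,K_{t})}\bigg( -\int _{K_{t}}y\ell _{t}\,\mathrm{d}Q+\int _{K_{t}}y\ell _{t}\,\mathrm{d}\widehat{Q}_{t}\bigg) . $$

Substituting this into the fourth line gives the fifth one; the infimum over \(y\geq 0\) may be restricted to \(\mathrm{dom}(G_{t}^{\ast })\) because the values of \(y\) with \(G_{t}^{\ast }(y)=\infty \) do not affect it.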

Remark 4.14

If \(U^{W}_{t}\) (as well as \(U_{t}^{G}\) in Proposition 4.9) is real-valued on \(\mathcal{C}_{b}(K_{t})\), one might take \(\mathcal{E}_{t}\) as the set of functions of the form (4.1) in place of \(\mathcal{E}_{t}=\mathcal{C}_{b}(K_{t})\) in both Propositions 4.9 and 4.13, using norm-density of the piecewise linear functions just as in the proof of Corollary 4.3.

Remark 4.15

The reader can check that the property \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) is not used in the proof of Proposition 4.13, and that it would suffice to have only \(\widehat{Q}\in \mathrm{Prob}(\Omega )\). This will be exploited in Sect. 4.4.1.

Example 4.16

Taking \(G_{t}(x)=0\) if \(x\leq \varepsilon _{t}\) and \(G_{t}(x)=\infty \) otherwise, we get \(G_{t}^{\ast }(y)= \varepsilon _{t}y\) if \(y\geq 0\) and \(G_{t}^{\ast }(y)= \infty \) otherwise. In this case,

$$\begin{aligned} &\inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c]+\sum _{t=0}^{T} \mathcal{D}^{W}_{t}(Q_{t})\bigg) \\ &=\inf \{ E_{Q}[c] : Q\in \mathrm{Mart}(\Omega ) \text{ and } W_{t}(Q_{t}, \widehat{Q}_{t})\leq \varepsilon _{t}, t=0,\dots ,T \} . \end{aligned}$$
(4.12)

One can verify with the same techniques as in Example 4.12 that we have convergence, as \(\varepsilon _{t}\downarrow 0\) for every \(t=0,\dots ,T\), of the values on the right-hand side of (4.12) to the MOT value on the left-hand side of (4.5).
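
For this choice of \(G_{t}\), the valuation functional of Proposition 4.13 takes an explicit form. Substituting \(G_{t}^{\ast }(y)=\varepsilon _{t}y\) for \(y\geq 0\) into the definition of \(U^{W}_{t}\) gives

$$ U^{W}_{t}(\varphi _{t})=\sup _{ \substack{ y\geq 0, \\ \ell _{t}\in \mathrm{Lip}(1,K_{t})}}\bigg( \Pi ^{{\mathrm {sub}}}(\varphi _{t}+y\ell _{t})-y\Big( \int _{K_{t}}\ell _{t}\,\mathrm{d}\widehat{Q}_{t}+\varepsilon _{t}\Big) \bigg) , $$

so that each unit of the Lipschitz payoff \(\ell _{t}\) is charged its price under \(\widehat{Q}_{t}\) plus the tolerance \(\varepsilon _{t}\).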

4.4.1 Convergence with Wasserstein-induced penalisation

As already mentioned, in the classical MOT framework, the marginals \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\) need to be determined, potentially from the prices of many vanilla options. It is then reasonable to suppose that in a real-world situation, one proceeds by approximation, that is, one determines sequences of candidates \((\widehat{Q}_{t}^{n})\subseteq \mathrm{Prob}(K_{t} )\) for \(t=0,\dots ,T\). If such an approximation scheme (whose details are beyond the scope of this paper) is working, one should have a convergence of these sequences to the true marginals. One suitable candidate for the type of convergence is the weak one, namely one might want to have \(\widehat{Q}_{t}^{n}\rightarrow \widehat{Q}_{t}^{\infty }:= \widehat{Q}_{t} \) as \(n \to \infty \) for \(t=0,\dots ,T\) in the weak sense for probability measures. We suppose here that \(K_{0},\dots ,K_{T}\) are compact sets, and so weak convergence is equivalent to convergence in the Wasserstein distance. Proposition 4.17 below shows how the EMOT values treated in Proposition 4.13 and associated to the approximating measures \(\widehat{Q}_{t}^{n},t=0,\dots ,T\), converge to the original MOT value for the true marginals \(\widehat{Q}_{0},\dots ,\widehat{Q}_{T}\), provided that the loss functions \(G_{t}^{n}\) converge appropriately.

For the next result, it is convenient to rename the martingale measure \(\widehat{Q}\) from the Standing Assumption 4.1 as \(\widehat{Q}^{\infty }\), so that \(\widehat{Q}^{\infty }\in \mathrm{Mart}(\Omega )\) with marginals \(\widehat{Q}_{t}^{\infty }\) and \(c\in L^{1}(\widehat{Q}^{\infty })\).

Proposition 4.17

For each \(n\in \mathbb{N}\cup \{\infty \}\) and \(t=0,\dots ,T\), let \(G_{t}^{n}\) be a loss function with \(G_{t}^{n}(x)\uparrow G_{t}^{\infty }(x)\) as \(n \to \infty \) for every \(x\in {\mathbb{R}}\) and \(G_{t}^{\infty }(x)=\infty \) for every \(x>0\). For every \(t=0,\dots ,T\) and \(n\in \mathbb{N}\), we assume that \(\widehat{Q}_{t}^{n}\in \mathrm{Prob}(K_{t}) \), \(\lim _{n}W_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n})=0\) and \(\lim _{n}G_{t}^{n}(W_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n}))=0\). Then

$$ \lim _{n}\pi _{n}^{W}(c)=\pi _{\infty }^{W}(c)=\inf _{Q\in \mathrm{Mart}(\widehat{Q}_{0}^{\infty },\dots ,\widehat{Q}_{T}^{ \infty })}E_{Q}[c] , $$
(4.13)

where

$$ \pi _{n}^{W}(c)=\inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q} [ c ] + \sum _{t=0}^{T}G_{t}^{n}\big(W_{t}(Q_{t},\widehat{Q}_{t}^{n})\big) \bigg) , \qquad n\in \mathbb{N}\cup \{\infty \} . $$

Proof

The second equality in (4.13) is just a consequence of the definition of \(\pi _{\infty }^{W}\) and of \(G_{t}^{\infty }(x)=\infty \) for every \(x>0\). Since we may always pass to a subsequence of \((\widehat{Q}^{n})\), we may assume without loss of generality that \(G_{t}^{n}(W_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n}))<\infty \) for all \(n\in \mathbb{N}\cup \{\infty \}\) and all \(t\) (the case \(n=\infty \) is obvious from \(G_{t}^{\infty }(0)=0\)). We first claim that \(\pi _{n}^{W}(c)\) is finite for all \(n\in \mathbb{N}\cup \{\infty \}\). Indeed, since \(\Omega \) is compact, \(c\) is lower semicontinuous and \(G_{t}^{n}\) are nonnegative on \([0, \infty )\), we have for \(n\in \mathbb{N}\cup \{\infty \}\) that

$$\begin{aligned} -\infty < \inf _{x\in \Omega }c(x)&\leq \inf _{Q\in \mathrm{Mart}( \Omega )}\bigg( E_{Q}[c] +\sum _{t=0}^{T}G_{t}^{n}\big(W_{t}(Q_{t}, \widehat{Q}_{t}^{n})\big)\bigg) \\ &\leq E_{\widehat{Q}^{\infty }}[c] +\sum _{t=0}^{T}G_{t}^{n}\big(W_{t}( \widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n})\big)< \infty . \end{aligned}$$

We now prove that \(\pi _{\infty }^{W}(c)\geq \limsup _{n}\pi _{n}^{W}(c)\). Since \(\pi _{\infty }^{W}(c)< \infty \), there exists an optimum \(Q^{\infty }\in \mathrm{Mart}(\Omega )\) for \(\pi _{\infty }^{W}(c)\), and its marginals satisfy \(Q_{t}^{\infty }=\widehat{Q}_{t}^{\infty }\), \(t=0,\dots ,T\). Then \(G_{t}^{n}(W_{t}({Q}_{t}^{\infty },\widehat{Q}_{t}^{n}))=G_{t}^{n}(W_{t}(\widehat{Q}_{t}^{\infty }, \widehat{Q}_{t}^{n}))\rightarrow 0\) as \(n \to \infty \) and

$$\begin{aligned} \pi _{\infty }^{W}(c)& =E_{Q^{\infty }}[c] \\ &=\lim _{n}\bigg( E_{Q^{\infty }}[c] +\sum _{t=0}^{T}G_{t}^{n}\big(W_{t}({Q}_{t}^{ \infty },\widehat{Q}_{t}^{n})\big)\bigg) \\ & \geq \limsup _{n}\inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c] + \sum _{t=0}^{T}G_{t}^{n}\big(W_{t}(Q_{t},\widehat{Q}_{t}^{n})\big)\bigg) =\limsup _{n}\pi _{n}^{W}(c). \end{aligned}$$

It only remains to show that \(\liminf _{n}\pi _{n}^{W}(c)\geq \pi _{\infty }^{W}(c)\). Proposition 4.13 and Remark 4.15 guarantee for each \(n\in \mathbb{N}\) the existence of an optimum \(Q^{n}\in \mathrm{Mart}(\Omega )\) for the value \(\pi _{n}^{W}(c)<\infty \). From the proof of Lemma 4.7, we know that \(\mathrm{Mart}(\Omega )\) is weakly compact, and so we can take a subsequence of \(( Q^{n} )\) such that for some \(\widetilde{Q}\in \mathrm{Mart}(\Omega )\), \(W_{t}(Q_{t}^{n_{k}},\widetilde{Q}_{t})\rightarrow 0\) as \(k \to \infty \) for every \(t\) and

$$ \lim _{k}\bigg( E_{Q^{n_{k}}}[c] +\sum _{t=0}^{T}G_{t}^{n_{k}}\big(W_{t}(Q_{t}^{n_{k}}, \widehat{Q}_{t}^{n_{k}})\big)\bigg) =\liminf _{n}\pi _{n}^{W}(c) . $$
(4.14)

Let \(N\in \mathbb{N}\) and recall that \(\sup _{N}G_{t}^{N}(x)=G_{t}^{\infty }(x)\). Then we compute by using the particular form of \(G_{t}^{\infty }\) that

$$\begin{aligned} \pi _{\infty }^{W}(c)& =\inf _{Q\in \mathrm{Mart}(\widehat{Q}_{0}^{\infty },\dots ,\widehat{Q}_{T}^{\infty })}E_{Q}[c] \\ &=\inf _{Q\in \mathrm{Mart}(\Omega )}\sup _{N\in \mathbb{N}}\bigg( E_{Q} [ c] +\sum _{t=0}^{T}G_{t}^{N}\big(W_{t}(Q_{t},\widehat{Q}_{t}^{\infty }) \big)\bigg) \\ & \leq \sup _{N\in \mathbb{N}}\bigg( E_{\widetilde{Q}}[c] +\sum _{t=0}^{T}G_{t}^{N} \big(W_{t}(\widetilde{Q}_{t},\widehat{Q}_{t}^{\infty })\big)\bigg) \\ &\leq \liminf _{n}\pi _{n}^{W}(c), \end{aligned}$$
(4.15)

where the first inequality uses \(\widetilde{Q}\!\in \!\mathrm{Mart}(\Omega )\) and the second is justified as follows. From the lower semicontinuity with respect to the weak convergence of \(Q\mapsto \int _{\Omega }c\mathrm{d}Q\), the lower semicontinuity of \(G_{t}^{N}\) and \(W_{t}(\widetilde{Q}_{t},\widehat{Q}_{t}^{\infty })=\lim _{k}W_{t}(Q_{t}^{n_{k}}, \widehat{Q}_{t}^{n_{k}})\) for all \(t\), we obtain

$$\begin{aligned} &E_{\widetilde{Q}}[c] +\sum _{t=0}^{T}G_{t}^{N}\big(W_{t}( \widetilde{Q}_{t},\widehat{Q}_{t}^{\infty })\big) \\ &\leq \liminf _{k}\bigg( E_{Q^{n_{k}}}[c] +\sum _{t=0}^{T}G_{t}^{N} \big(W_{t}(Q_{t}^{n_{k}},\widehat{Q}_{t}^{n_{k}})\big)\bigg) . \end{aligned}$$
(4.16)

Moreover, \(G_{t}^{n}\) is increasing in \(n\) for each \(t\), and so for all \(n_{k}>N\),

$$ E_{Q^{n_{k}}}[c] +\sum _{t=0}^{T}G_{t}^{N}\big(W_{t}(Q_{t}^{n_{k}},\widehat{Q}_{t}^{n_{k}})\big)\leq E_{Q^{n_{k}}}[c] +\sum _{t=0}^{T}G_{t}^{n_{k}} \big(W_{t}(Q_{t}^{n_{k}},\widehat{Q}_{t}^{n_{k}})\big). $$

This and (4.16) imply

$$\begin{aligned} &\bigg( E_{\widetilde{Q}}[c] +\sum _{t=0}^{T}G_{t}^{N}\big(W_{t}(\widetilde{Q}_{t},\widehat{Q}_{t}^{\infty })\big)\bigg) \\ &\leq \liminf _{k}\bigg( E_{Q^{n_{k}}}[c] +\sum _{t=0}^{T}G_{t}^{n_{k}} \big(W_{t}(Q_{t}^{n_{k}},\widehat{Q}_{t}^{n_{k}})\big)\bigg) =\liminf _{n}\pi _{n}^{W}(c), \end{aligned}$$

by (4.14). Taking the supremum over \(N\in \mathbb{N}\), we obtain (4.15). □
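
A simple family of loss functions satisfying the hypotheses of Proposition 4.17 is the following. It is given here only as an illustration and is not taken from the results above. For \(n\in \mathbb{N}\), let

$$ G_{t}^{n}(x):=n\max (x,0), \qquad G_{t}^{\infty }(x):=\textstyle\begin{cases} 0 & \quad \text{if }x\leq 0, \\ \infty & \quad \text{otherwise.}\end{cases} $$

Each \(G_{t}^{n}\) is convex, nondecreasing, lower semicontinuous and vanishes at \(0\), and \(G_{t}^{n}(x)\uparrow G_{t}^{\infty }(x)\) for every \(x\in {\mathbb{R}}\). The conjugates are \((G_{t}^{n})^{\ast }(y)=0\) for \(y\in [0,n]\) and \((G_{t}^{n})^{\ast }(y)=\infty \) otherwise. In this case, the requirement \(\lim _{n}G_{t}^{n}(W_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n}))=0\) reads \(\lim _{n}nW_{t}(\widehat{Q}_{t}^{\infty },\widehat{Q}_{t}^{n})=0\), that is, the approximating marginals must converge faster than \(1/n\) in the Wasserstein distance.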

5 Applications in the noncompact case

We stress that the extension of the results in Sects. 4.3 and 4.4 to the noncompact case seems to be nontrivial. The main issues come from the verification of Assumption 2.3 (ii), when one starts the analysis from penalisation terms rather than from valuation functionals. Not excluding that such an extension is possible, we leave this topic for future research. However, the case of valuations induced by utility functions can be treated also in the noncompact case, as we describe below. In the noncompact case, Corollary 4.3 takes the following form.

Corollary 5.1

Take \(d=1\), \(K_{0}=\{x_{0}\}\) for some \(x_{0}\in {\mathbb{R}}\) and let \(K_{1},\dots ,K_{T}\) be closed subsets of ℝ. Consider utility functions \(u_{0},\dots ,u_{T}\) satisfying Assumption 3.7, and suppose \(\mathrm{dom}(u_{0})=\cdots =\mathrm{dom}(u_{T})={\mathbb{R}}\). Take for each \(t=0,\dots ,T \) the vector space \(\mathcal{E}_{t}\subseteq C_{t} =C_{t:t}\) (see (2.1)) of functions of the form (4.1), let \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) and fix a \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) such that

$$ \int _{K_{t}}v_{t}\big( a(1+\left \vert x_{t}\right \vert )\big) \, \mathrm{d}\widehat{Q}_{t}(x_{t})< \infty , \qquad \forall a>0,t=0,\dots ,T . $$

Suppose that \(c:\Omega \rightarrow (-\infty ,+\infty ]\) is lower semicontinuous and satisfies (2.8). Then

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c] +\sum _{t=0}^{T}\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})\bigg) =\sup \bigg\{ \sum _{t=0}^{T}U_{\widehat{Q}_{t}}(\varphi _{t}) : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} , $$
(5.1)

where \(U_{\widehat{Q}_{t}}(\varphi _{t})\) is defined in (4.2) for general \(\varphi _{t}\in C_{t}\) and \(\mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}\) is given in (3.5). Moreover, the infimum in (5.1) is a minimum provided that the right-hand side of (5.1) is finite.

Proof

All the claims follow from Theorem 3.4 if we show that all its hypotheses are satisfied. To do so, we check the following properties: (i) the functional \({U}_{t}(\varphi _{t})=U_{\widehat{Q}_{t}}(\varphi _{t})\) is real-valued on \({C}_{t}\), concave, nondecreasing and cash-additive; (ii) \(\mathcal{D}_{t}(Q_{t})=\mathcal{D}_{v_{t}^{ \ast },\widehat{Q}_{t}}(Q_{t})\) for \(Q\in \mathrm{Mart}(\Omega )\); (iii) \({U}_{t}(0)=0\ \) for \(t=0,\dots ,T\) and the conditions (2.5) and (2.6) hold, which we do using Example 2.6.

To check (i), observe that for every \(t=0,\dots ,T\) and \(\varphi _{t}\in C_{t}\),

$$\begin{aligned} -\infty &< -\int _{K_{t}}v_{t}\big( \Vert \varphi _{t} \Vert _{t} (1+ \vert x_{t} \vert )\big) \,\mathrm{d}\widehat{Q}_{t}(x_{t}) \\ &=\int _{K_{t}}u_{t}\big( - \Vert \varphi _{t} \Vert _{t} (1+ \vert x_{t} \vert )\big) \,\mathrm{d}\widehat{Q}_{t}(x_{t}) \leq \int _{K_{t}}u_{t} \big( \varphi _{t}(x_{t})\big) \,\mathrm{d}\widehat{Q}_{t}(x_{t})\leq U_{\widehat{Q}_{t}}(\varphi _{t}) \\ &\leq \int _{K_{t}}\varphi _{t}(x_{t})\,\mathrm{d}\widehat{Q}_{t}(x_{t})\leq \int _{K_{t}} \Vert \varphi _{t} \Vert _{t} (1+ \vert x_{t} \vert )\,\mathrm{d}\widehat{Q}_{t}(x_{t})< \infty , \end{aligned}$$

where the first inequality in the last line uses the fact that \(u_{t}(x)\leq x\), \(\forall x\in {\mathbb{R}}\), and the finiteness of the last term comes from \(\widehat{Q}\in \mathrm{Prob}^{1}(\Omega )\). Concavity, monotonicity and cash-additivity can be checked by direct computation. Coming to (ii), from Proposition 3.9, for every \(Q\in \mathrm{Prob}^{1}(\Omega )\) and \(t=0,\dots ,T\), we have

$$\begin{aligned} \mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})& = \sup _{\varphi _{t} \in \mathcal{C}_{b}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t}(x_{t})\,\mathrm{d}Q_{t}(x_{t})-\int _{K_{t}} \!v_{t}\big(\varphi _{t}(x_{t})\big)\, \mathrm{d}\widehat{Q}_{t}(x_{t})\bigg) \\ &\leq \sup _{\varphi _{t}\in {C}_{t}}\bigg( \int _{K_{t}}\varphi _{t}(x_{t}) \,\mathrm{d}Q_{t}(x_{t})-\int _{K_{t}} \!v_{t}\big(\varphi _{t}(x_{t}) \big)\,\mathrm{d}\widehat{Q}_{t}(x_{t})\bigg) \\ & \leq \mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t}), \end{aligned}$$

where the first equality exploits (3.6) and the last inequality is from the Fenchel inequality \(v_{t}^{\ast }(y)\geq (\varphi _{t}y-v_{t}(\varphi _{t}))\). To conclude the proof of (ii), we have to show that the sup in the above expression can be taken over \(\mathcal{E}_{t}\), as in the penalty term \(\mathcal{D}_{t}\) in Theorem 3.4. Observe that for every \(\varphi _{t}\in C_{t}\), there exists a sequence \((\varphi _{t}^{n}) \subseteq C_{t}\), with each \(\varphi _{t}^{n}\in \mathcal{E}_{t} \) of the form (4.1), such that \(\varphi _{t}^{n}\rightarrow \varphi _{t}\) pointwise on \(K_{t}\) and \(\sup _{n} \Vert \varphi _{t}^{n} \Vert _{t}<\infty \). This implies

$$\begin{aligned} &\int _{K_{t}}\varphi _{t}(x_{t})\,\mathrm{d}Q_{t}(x_{t})-\int _{K_{t}}v_{t} \big(\varphi _{t}(x_{t})\big)\,\mathrm{d}\widehat{Q}_{t}(x_{t}) \\ &=\lim _{n}\bigg( \int _{K_{t}}\varphi _{t}^{n}(x_{t})\,\mathrm{d}Q_{t}(x_{t})- \int _{K_{t}}v_{t}\big(\varphi _{t}^{n}(x_{t})\big)\,\mathrm{d}\widehat{Q}_{t}(x_{t})\bigg) \end{aligned}$$

for all \(Q\in \mathrm{Prob}^{1}(\Omega )\) by dominated convergence and using the assumption that

$$ \int _{K_{t}}v_{t}\big( a(1+\vert x_{t}\vert )\big) \,\mathrm{d} \widehat{Q}_{t}(x_{t})< \infty , \qquad \forall a>0. $$

Finally, we work on (iii). Take \(f_{t}^{\alpha _{n}}(x_{t})=(\vert x_{t}\vert -\alpha _{n})^{+}\) as in (2.11) and \(\alpha _{n}\uparrow \infty \). Observe that \(f_{t}^{\alpha _{n}}(x_{t})\rightarrow 0\) as \(n \to \infty \) and that the assumption \(u_{t}(x)\leq x\) for all \(x\in \mathbb{R}\) implies \({U}_{t}(0)\leq 0\). Then for every \(a>0\),

$$ 0\geq {U}_{t}(0)\geq {U}_{t}(-af_{t}^{\alpha _{n}}) \geq \int _{K_{t}}u_{t} ( -af_{t}^{\alpha _{n}} ) \mathrm{d}\widehat{Q}_{t} \longrightarrow 0 \qquad \text{as $n \to \infty $}, $$
(5.2)

where the last limit is by dominated convergence. Indeed, \(f_{t}^{\alpha _{n}}(x_{t})\leq 1+\vert x_{t}\vert \) for every \(x_{t}\in K_{t}\) so that

$$\begin{aligned} \big\vert u_{t}\big( -af_{t}^{\alpha _{n}}(x_{t})\big)\big\vert &=-u_{t} \big( -af_{t}^{\alpha _{n}}(x_{t})\big) \\ &\leq -u_{t}\big(-a(1+\vert x_{t}\vert )\big)=v_{t}\big(a(1+\vert x_{t} \vert )\big)\in L^{1}(\widehat{Q}_{t}) \end{aligned}$$

for every \(x_{t}\in K_{t},a>0\). Now (5.2) yields simultaneously that \({U}_{t}(0)=0\) and \({U}_{t}(-af_{t}^{\alpha _{n}})\rightarrow 0\) as \(n \to \infty \) for all \(t=0,\dots ,T,a>0\). To apply Example 2.6, it is then enough to observe that taking \(U(\varphi ):=\sum _{t=0}^{T}{U}_{t}(\varphi _{t})\) as in Theorem 3.4, we have \(U(0)=0\) and \(U(0,\dots ,0,-af_{t}^{\frac{n}{\beta }},0,\dots ,0)={U}_{t}(-af_{t}^{\frac{n}{\beta }})\rightarrow 0\) as \(n \to \infty \) for all \(a>0\), which is (2.12). □
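The dominated convergence step behind (5.2) can be illustrated numerically. The following is only a sketch under assumptions that are not part of the corollary: we pick the exponential utility \(u(x)=1-e^{-x}\) (which has \(u(0)=0\) and \(u(x)\leq x\), and is assumed here to satisfy Assumption 3.7), a standard normal law as a stand-in for \(\widehat{Q}_{t}\), and \(a=1\). The function names are hypothetical. The quantity computed is \(\int u(-af^{\alpha })\,\mathrm{d}\widehat{Q}_{t}\) with \(f^{\alpha }(x)=(\vert x\vert -\alpha )^{+}\), for increasing values of \(\alpha \).

```python
import math

def u(x):
    """Exponential utility (an illustrative choice): u(0) = 0 and u(x) <= x."""
    return 1.0 - math.exp(-x)

def normal_pdf(x):
    """Density of N(0, 1), used here as a stand-in for the marginal Q-hat_t."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def expected_utility_of_tail(a, alpha, lo=-12.0, hi=12.0, steps=24_001):
    """Trapezoidal approximation of the integral of u(-a (|x| - alpha)^+)
    against the standard normal density on [lo, hi]."""
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lo + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0
        f = max(abs(x) - alpha, 0.0)          # f^alpha(x) = (|x| - alpha)^+
        total += weight * u(-a * f) * normal_pdf(x)
    return total * h

alphas = (0.0, 1.0, 2.0, 4.0, 8.0)
values = [expected_utility_of_tail(a=1.0, alpha=al) for al in alphas]
for al, val in zip(alphas, values):
    print(f"alpha = {al:4.1f}   integral = {val:.3e}")
```

In this toy setting the integrals are nonpositive and increase towards 0 as \(\alpha \) grows, which is the behaviour that (5.2) exploits; the integrability condition on \(v_{t}(a(1+\vert x_{t}\vert ))\) holds here because the Gaussian tails dominate the exponential growth.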

Remark 5.2

Observe that Corollary 5.1 remains valid for general \(\widehat{Q}_{t}\in \mathrm{Prob}^{1}(K_{t})\), without requiring that these be the marginals of a martingale measure. Indeed, the martingale property of \(\widehat{Q}\) was not used at any point in the above proof.

Just as we obtained Corollary 4.5 from Corollary 4.3 by using the linear utility functions \(u_{t} (x_{t})=x_{t}\), we now deduce the following result from Corollary 5.1; see Beiglböck et al. [4, Theorem 1.1 and Corollary 1.2].

Corollary 5.3

Take \(d=1\), \(K_{0}=\{x_{0}\}\) for some \(x_{0}\in {\mathbb{R}}\) and let \(K_{1},\dots ,K_{T}\subseteq {\mathbb{R}}\) be closed subsets of ℝ. Take for each \(t=0,\dots ,T\) the vector space \(\mathcal{E}_{t}\subseteq C_{t}\) of functions of the form (4.1), let \(\mathcal{E}=\mathcal{E}_{0}\times \cdots \times \mathcal{E}_{T}\) and fix a \(\widehat{Q}\in \mathrm{Mart}(\Omega )\). Then for any \(c:\Omega \rightarrow (-\infty ,+\infty ]\) which is lower semicontinuous and satisfies (2.8), we have

$$ \inf _{Q\in \mathrm{Mart}(\widehat{Q}_{0},\dots ,\widehat{Q}_{T})}E_{Q}[c]=\sup \bigg\{ \sum _{t=0}^{T}{E}_{\widehat{Q}_{t}}[\varphi _{t}] : \varphi \in \mathcal{S}_{{\mathrm {sub}}}(c)\bigg\} =\pi (c), $$
(5.3)

and if \(\pi (c)<\infty \), a minimum point exists for the infimum in (5.3).
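To see why the linear choice turns the penalty in (5.1) into a marginal constraint, the following short computation may help. It is a sketch which assumes that the dual representation of the divergence used in the proof of Corollary 5.1 (the supremum over \(\mathcal{C}_{b}(K_{t})\)) applies verbatim to \(u_{t}(x)=x\), so that \(v_{t}(x)=-u_{t}(-x)=x\). Under that assumption, for \(Q\in \mathrm{Prob}^{1}(\Omega )\),

$$ \mathcal{D}_{v_{t}^{\ast },\widehat{Q}_{t}}(Q_{t})=\sup _{\varphi _{t}\in \mathcal{C}_{b}(K_{t})}\bigg( \int _{K_{t}}\varphi _{t}\,\mathrm{d}Q_{t}-\int _{K_{t}}\varphi _{t}\,\mathrm{d}\widehat{Q}_{t}\bigg) =\begin{cases} 0 \quad & \text{if } Q_{t}=\widehat{Q}_{t}, \\ \infty \quad & \text{otherwise,}\end{cases} $$

because any \(\varphi _{t}\) with \(\int \varphi _{t}\,\mathrm{d}Q_{t}\neq \int \varphi _{t}\,\mathrm{d}\widehat{Q}_{t}\) can be multiplied by an arbitrarily large scalar of the appropriate sign. Hence only measures with \(Q_{t}=\widehat{Q}_{t}\) for all \(t\) contribute to the infimum in (5.1), which is consistent with the restriction to \(\mathrm{Mart}(\widehat{Q}_{0},\dots ,\widehat{Q}_{T})\) in (5.3).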

Example 5.4

We now study the convergence to the MOT problem. Take functions \(u_{0}, \dots ,u_{T}:{\mathbb{R}}\rightarrow {\mathbb{R}}\) satisfying Assumption 3.7, and assume additionally that these are all differentiable at 0 (which implies that \(\{1\}=\partial u_{0}(0)=\cdots =\partial u_{T}(0)\)). Observe that if we set \(u_{t}^{n}(x):=nu_{t} ( \frac{x}{n} ) \) for \(x\in {\mathbb{R}},t=0,\dots ,T\), the functions \(u_{0}^{n},\dots ,u_{T}^{n}\) still satisfy Assumption 3.7. Moreover, \((v_{t}^{n})^{\ast }(y)=\sup _{x\in {\mathbb{R}}}(u_{t}^{n}(x)-xy)=nv_{t}^{\ast }(y)\), \(y\in {\mathbb{R}}\). Since \(u_{t}(0)=0\), we have \(v_{t}^{\ast }\geq 0\) and as a consequence \(\sup _{n}(v_{t}^{n})^{\ast }(y)=0\) if \(v_{t}^{\ast }(y)=0\) and \(\sup _{n}(v_{t}^{n})^{\ast }(y)=\infty \) otherwise. Moreover, \(v_{t}^{\ast }(y)=0\) implies that \(y\in \partial u_{t}(0)=\{1\}\). Consider the set \(\mathcal{A}^{\varepsilon }\) of \(\varepsilon \)-martingale measures defined in (2.14), take \(\widehat{Q}\in \mathrm{Mart}(\Omega )\) and a sequence \(\varepsilon _{n}\downarrow 0\). Using (2.15) gives for every \(Q\in \mathrm{Prob}^{1}(\Omega )\) that

$$ \sum _{t=0}^{T}\mathcal{D}_{(v_{t}^{n})^{\ast },\widehat{Q}_{t}}(Q_{t})+\sigma _{\mathcal{A}^{\varepsilon _{n}}}(Q) \uparrow \mathcal{D}_{\infty }(Q)+\sigma _{\mathcal{A}_{\infty }}(Q) \qquad \text{as $n \to \infty $}, $$

where

$$ \mathcal{D}_{\infty }(Q)+\sigma _{\mathcal{A}_{\infty }}(Q)=\textstyle\begin{cases} 0\quad &\quad \text{if }Q\in \mathrm{Mart}(\Omega ) \text{ and $Q_{t}= \widehat{Q}_{t}$ for $t=0,\dots ,T$,} \\ \infty \quad & \quad \text{otherwise.}\end{cases}$$

As a consequence, by Proposition 2.26,

$$ \inf _{Q\in \mathrm{Mart}(\Omega )}\bigg( E_{Q}[c] +\sum _{t=0}^{T}\mathcal{D}_{(v_{t}^{n})^{\ast },\widehat{Q}_{t}}(Q_{t})\bigg) \longrightarrow \inf _{Q\in \mathrm{Mart}(\widehat{Q}_{0},\dots ,\widehat{Q}_{T})}E_{Q}[c] \qquad \text{as $n \to \infty $}. $$
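The scaling identity \((v_{t}^{n})^{\ast }=nv_{t}^{\ast }\) used in Example 5.4 can be checked numerically in a simple case. The sketch below is illustrative only: it assumes the exponential utility \(u(x)=1-e^{-x}\) (taken here to satisfy Assumption 3.7, with \(u^{\prime }(0)=1\)), for which a direct computation gives \(v^{\ast }(y)=\sup _{x}(u(x)-xy)=y\log y-y+1\) for \(y>0\). The function names and the grid-based supremum are hypothetical helpers, not part of the paper.

```python
import math

def u(x):
    """Exponential utility (an illustrative choice): u(0) = 0, u'(0) = 1."""
    return 1.0 - math.exp(-x)

def u_scaled(x, n):
    """Rescaled utility u^n(x) = n * u(x / n), as in Example 5.4."""
    return n * u(x / n)

def conjugate(util, y, lo=-40.0, hi=40.0, steps=100_001):
    """Grid approximation of sup_x (util(x) - x * y) over [lo, hi]."""
    h = (hi - lo) / (steps - 1)
    best = -math.inf
    for i in range(steps):
        x = lo + i * h
        best = max(best, util(x) - x * y)
    return best

def v_star_closed(y):
    """Closed form of v*(y) for the exponential utility, valid for y > 0."""
    return y * math.log(y) - y + 1.0

y = 2.0
base = conjugate(u, y)
scaled = {n: conjugate(lambda x, n=n: u_scaled(x, n), y) for n in (1, 5, 10)}

print("v*(y) numeric vs closed form:", base, v_star_closed(y))
for n, val in scaled.items():
    print(f"n = {n:2d}   (v^n)*(y) = {val:.6f}   n * v*(y) = {n * v_star_closed(y):.6f}")
```

For \(y\neq 1\) the conjugate grows linearly in \(n\), while at \(y=1\) it stays at 0; this is the mechanism by which the penalties in the example converge to the indicator of the marginal constraint.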