A concept of copula robustness and its applications in quantitative risk management

Zähle, Henryk

doi:10.1007/s00780-022-00485-8

Open access
Published: 13 September 2022

Volume 26, pages 825–875, (2022)

Finance and Stochastics

Henryk Zähle¹

Abstract

In financial and actuarial applications, marginal risks and their dependence structure are often modelled separately. While it is sometimes reasonable to assume that the marginal distributions are ‘known’, it is usually quite involved to obtain information on the copula (dependence structure). Therefore copula models used in practice are quite often only rough guesses. For many purposes, it is thus relevant to know whether certain characteristics derived from $d$-variate risks are robust with respect to (at least small) deviations in the copula. In this article, a general concept of copula robustness is introduced and criteria for copula robustness are presented. These criteria are illustrated by means of several examples from quantitative risk management. The concept of aggregation robustness introduced by Embrechts et al. (Finance Stoch. 19:763–790, 2015) can be embedded in our framework of copula robustness.

1 Introduction

As pointed out by Embrechts et al. [17] and McNeil et al. [33, Sect. 6.2.1], in financial mathematics and actuarial science, marginal risks and their dependence structure are often modelled separately. While the marginal risks of a $d$-variate risk are identified with probability distributions $\mu _{1},\ldots ,\mu _{d}$ on the real line, the dependence structure is most often modelled by a $d$-variate copula $C$. The distribution function $F_{\mu}$ of the joint distribution $\mu $ is then given by

$$ F_{\mu}(x_{1},\ldots ,x_{d})=C\big(F_{\mu _{1}}(x_{1}),\ldots ,F_{\mu _{d}}(x_{d})\big)\quad \mbox{for all }x_{1},\ldots ,x_{d}\in \mathbb{R}, $$

(1.1)

where $F_{\mu _{i}}$ is the distribution function of $\mu _{i}$.
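As a minimal sketch of how (1.1) assembles a joint distribution function from separately modelled ingredients, consider the following Python fragment. It is not taken from the article: the Clayton copula, the normal and exponential marginals, and all function names are assumptions chosen only for illustration.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Distribution function of a univariate normal law."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def exponential_cdf(x, rate=1.0):
    """Distribution function of an exponential law."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0

def clayton_copula(u, v, theta=2.0):
    """Bivariate Clayton copula (theta > 0); an illustrative choice only."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

def joint_cdf(copula, marginal_cdfs):
    """Return F_mu(x_1, ..., x_d) = C(F_1(x_1), ..., F_d(x_d)) as in (1.1)."""
    def F(*xs):
        return copula(*(F_i(x) for F_i, x in zip(marginal_cdfs, xs)))
    return F

# Joint distribution function with a normal and an exponential marginal.
F_mu = joint_cdf(clayton_copula, [normal_cdf, exponential_cdf])
print(F_mu(0.0, 1.0))
```

Sending one argument to $+\infty$ recovers the other marginal distribution function, which is a quick sanity check that the constructed $F_{\mu}$ has the prescribed marginals.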

In practical applications, quantitative risk managers and actuaries are interested in various aspects $\mathcal{T}_{d}(\mu )$ of the joint distribution $\mu $ of the individual risks. An important example is $\mathcal{T}_{d}=\mathcal{R}_{A_{d}}$ with

$$ {\mathcal{R}}_{A_{d}}(\mu ):=\mathcal{R}(\mu \circ A_{d}^{-1}), $$

(1.2)

where $\mathcal{R}$ is the risk functional corresponding to some distribution-invariant ‘downside’ risk measure and $A_{d}:\mathbb{R}^{d}\to \mathbb{R}$ is a fixed Borel-measurable map regarded as an aggregation map in the spirit of McNeil et al. [33, Sect. 6.2.1]. Standard examples of the aggregation map are $A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}x_{i}$ and the three other maps presented in Example 4.4 below. Note that $\mu \circ A_{d}^{-1}$ is the distribution of $A_{d}(X_{1},\ldots ,X_{d})$ when $(X_{1},\ldots ,X_{d})$ is a random vector distributed according to $\mu $. Therefore $\mathcal{R}_{A_{d}}(\mu )$ can be seen as the downside risk of the aggregate position $A_{d}(X_{1},\ldots ,X_{d})$.
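The functional in (1.2) can be approximated by simulation. The sketch below is an illustration under stated assumptions rather than the article's method: it takes $d=2$, the sum as aggregation map, Value at Risk at level $\alpha$ (read as the empirical $\alpha$-quantile of a loss) as the risk functional, standard exponential marginals and a Gaussian copula with correlation `rho`. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sample_aggregate(rho, n, seed=0, rate=1.0):
    """Draw n samples of X_1 + X_2 with Exp(rate) marginals and a Gaussian copula."""
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho * rho)
    sums = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + root * rng.gauss(0.0, 1.0)  # correlation rho with z1
        total = 0.0
        for z in (z1, z2):
            u = min(STD_NORMAL.cdf(z), 1.0 - 1e-12)  # keep u strictly below 1
            total += -math.log1p(-u) / rate  # inverse of the Exp(rate) distribution function
        sums.append(total)
    return sums

def empirical_var(samples, alpha):
    """Empirical alpha-quantile of the sample (assumed convention for Value at Risk)."""
    ordered = sorted(samples)
    k = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[k]

def var_of_sum(rho, alpha=0.95, n=20000, seed=0):
    """Monte Carlo estimate of R_{A_2}(mu) for the assumptions stated above."""
    return empirical_var(sample_aggregate(rho, n, seed), alpha)

# A small change in the copula parameter versus a large one.
print(var_of_sum(0.50), var_of_sum(0.52), var_of_sum(1.0))
```

Using the same seed for every value of `rho` means that differences between the estimates mainly reflect the change in the dependence structure and not simulation noise.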

More generally, one could consider $\mathcal{T}_{d}=\mathcal{R}_{\mathfrak{A}_{d}}$ with

$$ {\mathcal{R}}_{\mathfrak{A}_{d}}(\mu ):=\inf \{\mathcal{R}(\mu \circ A_{d}^{-1}) : A_{d}\in \mathfrak{A}_{d} \}=\inf \{\mathcal{R}_{A_{d}}(\mu ) : A_{d} \in \mathfrak{A}_{d} \}, $$

(1.3)

where $\mathfrak{A}_{d}$ is a fixed set of Borel-measurable maps $A_{d}:\mathbb{R}^{d}\to \mathbb{R}$. If there exists an $A_{d}^{*}\in \mathfrak{A}_{d}$ at which the infimum in (1.3) is attained, then $\mathcal{R}_{\mathfrak{A}_{d}}(\mu )$ can be seen as the smallest possible risk of a position $A_{d}(X_{1},\ldots ,X_{d})$ derived from the single risks $X_{1},\ldots ,X_{d}$ with joint distribution $\mu $ through a function $A_{d}\in \mathfrak{A}_{d}$. It is worth noting that ‘risk’ here does not necessarily mean downside risk, but can also be for instance a mean–downside risk mixture which is the target value in many portfolio optimisation problems. For details, see Sect. 5.2, in particular Remark 5.5.

Of course, there are many other examples for $\mathcal{T}_{d}$. One of them is the optimal value in a multi-period portfolio optimisation problem that is addressed in Sect. 6.2. In this example, the role of $\mu $ is played by the joint distribution of the relative price changes of the $d$ risky assets that are available on the considered financial market.

When starting from separate models for the copula and the marginal distributions, it is reasonable to regard $\mathcal{T}_{d}$ as a functional of the copula $C$ and the marginal distributions $\mu _{1},\ldots ,\mu _{d}$ via

$$ \mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d}):=\mathcal{T}_{d}\Big( \mathfrak{p}_{d}\big(C(F_{\mu _{1}},\ldots ,F_{\mu _{d}})\big)\Big), $$

(1.4)

where $\mathfrak{p}_{d}$ assigns to a $d$-variate distribution function its corresponding Borel probability measure on $\mathbb{R}^{d}$.

In [33, Sect. 6.2.1], McNeil et al. point out that practitioners are often required to work only with partial information. For instance, in some situations, it is possible to obtain (sufficient) information on $\mu _{1},\ldots ,\mu _{d}$, but it is much more difficult to obtain information on the dependence structure. Carrying this to the extreme, McNeil et al. assume that $\mu _{1},\ldots ,\mu _{d}$ are fully known and $C$ is fully unknown. In this case, one cannot specify $\mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d})$, because $C$ is unknown. This leads to the ‘Fréchet problem’ of specifying the range of the map $C\mapsto \mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d})$. In the special case where $\mathfrak{T}_{d}$ takes values in ℝ, this is often related to finding (sharp) upper and lower bounds for this map. There is a vast literature dealing with this problem; see for instance the works of Rüschendorf [43], [44, Chap. 4], Embrechts and Puccetti [14], Embrechts et al. [15], Puccetti [39], Embrechts et al. [17] and the references cited therein.

In the present paper, a related but different problem is addressed. Still in the case where $\mu _{1},\ldots ,\mu _{d}$ are known (and fixed), assume that $\widehat{C}$ is a guess for the true copula $C$. It might be based on an expert opinion, a statistical estimation, or the like. Of course, as a guess, $\widehat{C}$ can differ from $C$. It is clear that a deviation of $\widehat{C}$ from $C$ can imply a significant difference between $\mathfrak{T}_{d}(\widehat{C},\mu _{1},\ldots ,\mu _{d})$ and $\mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d})$. On the other hand, one might ask whether the difference remains small if the deviation of $\widehat{C}$ from $C$ is small. This question was raised and answered by Embrechts et al. [17] in the context of (1.2) with $A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}x_{i}$. Krätschmer et al. [28, Sect. 4.2.4] took up this concept and generalised the respective result of [17]. In fact, in the latter two references, continuity of the functional $\mathcal{T}_{d}$ at the probability measure $\mathfrak{p}_{d}(C(F_{\mu _{1}},\ldots ,F_{\mu _{d}}))$ (with fixed marginal distributions $\mu _{1},\ldots ,\mu _{d}$ having finite $p$th moments) was not considered with respect to a metric on the set of copulas, but with respect to the (relative) weak topology on the set of $d$-variate distributions (with marginal distributions $\mu _{1},\ldots ,\mu _{d}$). However, it can be seen from Theorem 3.10 below that this is equivalent when the set of copulas is equipped with the supremum distance.

Despite this equivalence, it might be a little more accessible for some readers to measure the difference between two dependence structures directly through the difference between the corresponding copulas, in particular if one starts from separate models for the copula and the marginal distributions. If one follows this approach, one ought to take into account that a $d$-variate distribution $\mu $ with fixed marginal distributions $\mu _{1},\ldots ,\mu _{d}$ depends on the copula $C$ only through the values that $C$ takes on $\mathrm{ran}F_{\mu _{1}}\times \cdots \times\mathrm{ran}F_{\mu _{d}}$ ($\subseteq [0,1]^{d}$), where $\mathrm{ran}F_{\mu _{i}}$ is the range of $F_{\mu _{i}}$. This is apparent from (1.1) and suggests measuring the distance between copulas (in the considered framework) only on $\mathrm{ran}F_{\mu _{1}}\times \cdots \times\mathrm{ran}F_{\mu _{d}}$.

We propose to say that the functional $\mathcal{T}_{d}$ underlying $\mathfrak{T}_{d}$ (recall Eq. (1.4)) is copula robust if for any ‘admissible’ univariate distributions $\mu _{1},\ldots ,\mu _{d}$, the map $C\mapsto \mathfrak{T}_{d}(C, \mu _{1},\ldots ,\mu _{d})$ is continuous with respect to pointwise (or uniform) convergence on $\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}$, where it is assumed that $\mathcal{T}_{d}$ (and thus $\mathfrak{T}_{d}$) takes values in a topological space. By ‘admissible’ we mean that one can find at least one copula $C$ such that the probability measure $\mathfrak{p}_{d}(C(F_{\mu _{1}},\ldots ,F_{\mu _{d}}))$ is contained in the domain of $\mathcal{T}_{d}$. The precise definition of copula robustness is given in Sect. 3. The required notation and terminology as well as some auxiliary results are given before in Sect. 2. It is worth mentioning that Theorem 2.3 provides a generalisation of Deheuvels’ [10] copula convergence theorem and that Corollary 2.9 provides a characterisation of weak convergence in Fréchet classes of $d$-variate distributions.

In the second part of the paper, we discuss three examples for copula robust functionals $\mathcal{T}_{d}$. First, in Sect. 4, we address the quantification of the ‘downside risk’ of aggregate financial positions. It will be seen that the functional in (1.2) is copula robust under mild assumptions (Sect. 4.2). The relation of copula robustness to the concept of aggregation robustness of Embrechts et al. [17] (Sect. 4.3) as well as copula robustness of inf-convolution functionals (Sect. 4.4) are also discussed in detail. Second, in Sect. 5, we address stochastic programming problems. It can be inferred from results of Claus et al. [9] that the optimal value of a general stochastic programming problem depends copula robustly on the distribution of the underlying $d$-variate input random variable $Z$. This covers in particular classical one-period portfolio optimisation problems (where the role of $Z$ is played by the vector of the relative price changes of $d$ risky assets) and therefore backs in a way a hypothesis of Saida and Prigent [45]. In [45, Sect. 1], they conclude from their numerical investigations that ‘investors must more take care of the specification of the marginal distribution than of the copula function’. Third, in Sect. 6, we address multi-period portfolio optimisation problems and derive results that are similar to those in the one-period case. The main tool in this context is Theorem 6.2 which is a variant of a result of Müller [35] about the continuous dependence of the value function on the transition function in a Markov decision model. Theorem 6.2 is of independent interest and contributes to the general theory of Markov decision processes.

Throughout this paper, $|\cdot |$ denotes any norm on $\mathbb{R}^{d}$ and $\langle \,\cdot \,,\,\cdot \,\rangle $ is the Euclidean scalar product defined by $\langle x,y\rangle :=\sum _{i=1}^{d}x_{i}y_{i}$ for any elements $x=(x_{1},\ldots ,x_{d})$ and $y=(y_{1},\ldots ,y_{d})$ of $\mathbb{R}^{d}$. Moreover, we set $\mathbb{R}_{+}:=[0,\infty )$ and $\mathbb{R}_{++}:=(0,\infty )$. The proofs of all results can be found in Appendix A.

2 Preliminary notation, terminology and results

2.1 Fréchet classes and copulas

For any $d\in \mathbb{N}$, let us use $\mathcal{M}_{d}$ to denote the set of all Borel probability measures on $\mathbb{R}^{d}$. For any fixed $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$, denote by $\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$ the set of all $\mu \in{\mathcal{M}}_{d}$ having marginals $\mu _{1},\ldots ,\mu _{d}$, i.e., satisfying $\mu \circ \pi _{i}^{-1}=\mu _{i}$ for any $i=1,\ldots ,d$, where $\pi _{i}:\mathbb{R}^{d}\to \mathbb{R}$ is the projection on the $i$th coordinate. The set $\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$ is known as the Fréchet class associated with the univariate Borel probability measures $\mu _{1},\ldots ,\mu _{d}$. The distribution function of a Borel probability measure $\mu $ will be denoted by $F_{\mu}$.

By definition, a $d$-variate copula is the distribution function $C:[0,1]^{d}\rightarrow [0,1]$ of a Borel probability measure on $[0,1]^{d}$ whose marginal distributions are all given by the uniform distribution on $[0,1]$. The latter condition ensures that each $d$-variate copula $C$ is Lipschitz-continuous. Theorem 2.10.7 in Nelsen’s textbook [36] indeed shows that every $d$-variate copula $C$ satisfies $|C(u)-C(v)|\le |u-v|_{1}$, where $|x|_{1}:=\sum _{i=1}^{d}|x_{i}|$ for any $x=(x_{1},\ldots ,x_{d})\in \mathbb{R}^{d}$.

Let us denote by $\mathbf{C}_{d}$ the set of all $d$-variate copulas. With any $C\in \mathbf{C}_{d}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$, we associate an element $\mu $ of the Fréchet class $\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$ through (1.1). It is indeed easily seen that the right-hand side of (1.1) defines a probability distribution function on $R^{d}$ and that the corresponding Borel probability measure $\mu $ has $\mu _{1},\ldots ,\mu _{d}$ as its marginal distributions. Sklar’s theorem ([49]; see also [36, Theorem 2.10.9]) shows that the distribution function of any element $\mu $ of $\mathcal{M}_{d}(\mu _{1},\dots ,\mu _{d})$ admits the representation (1.1). That is, for any $\mu \in{\mathcal{M}}_{d}(\mu _{1},\dots ,\mu _{d})$, one can find a copula $C\in \mathbf{C}_{d}$ such that (1.1) holds. On the set $\mathrm{ran}F_{\mu _{1}}\times \cdots \times\mathrm{ran}F_{\mu _{d}}$, the copula $C$ is uniquely determined and given by

$$ C(u_{1},\ldots ,u_{d})=F_{\mu}\big(F_{\mu _{1}}^{\leftarrow}(u_{1}), \ldots ,F_{\mu _{d}}^{\leftarrow}(u_{d})\big), $$

(2.1)

where $F_{\mu _{i}}^{\leftarrow}(u_{i}):=\inf \{x\in \mathbb{R}:F_{\mu _{i}}(x)\ge u_{i}\}$. In particular, if $F_{\mu _{1}},\ldots ,F_{\mu _{d}}$ are all continuous, then the copula $C$ is unique and given by (2.1) on the whole unit cube $[0,1]^{d}$. For background on copulas, see for instance the textbooks by Durante and Sempi [12, Chaps. 1–2] or Nelsen [36, Chaps. 1–2].
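The role of the generalised inverse in (2.1) can be made concrete for discrete marginals. The following sketch is an illustration only: it assumes two independent Bernoulli coordinates with success probabilities 0.3 and 0.6, and the helper names are hypothetical. It evaluates the right-hand side of (2.1) and shows that this expression yields copula values on $\mathrm{ran}F_{\mu _{1}}\times \mathrm{ran}F_{\mu _{2}}$ but not necessarily elsewhere.

```python
import math

def make_discrete_cdf(atoms, probs):
    """Distribution function of a discrete law, plus its sorted (atom, weight) pairs."""
    pairs = sorted(zip(atoms, probs))
    def F(x):
        return sum(p for a, p in pairs if a <= x)
    return F, pairs

def generalised_inverse(pairs, u):
    """F^{<-}(u) = inf{x : F(x) >= u} for a discrete law given as sorted pairs."""
    if u <= 0.0:
        return -math.inf  # F(x) >= 0 holds for every real x
    cumulative = 0.0
    for atom, p in pairs:
        cumulative += p
        if cumulative >= u - 1e-12:  # small tolerance for floating-point sums
            return atom
    return pairs[-1][0]

# Bernoulli marginals: ran F_1 = {0, 0.7, 1} and ran F_2 = {0, 0.4, 1}.
F1, pairs1 = make_discrete_cdf([0, 1], [0.7, 0.3])
F2, pairs2 = make_discrete_cdf([0, 1], [0.4, 0.6])

def joint_cdf(x1, x2):
    # Assumption for this example: the two coordinates are independent.
    return F1(x1) * F2(x2)

def copula_on_range(u1, u2):
    """Right-hand side of (2.1)."""
    return joint_cdf(generalised_inverse(pairs1, u1), generalised_inverse(pairs2, u2))

print(copula_on_range(0.7, 0.4))  # point of ran F_1 x ran F_2
print(copula_on_range(0.5, 1.0))  # 0.5 is not in ran F_1
```

At the point $(0.7,0.4)$ the result agrees with the independence copula, whereas at $(0.5,1)$ the expression returns $F_{\mu _{1}}(0)=0.7$ instead of the value $0.5$ that every copula takes there. This is consistent with the statement that (2.1) determines $C$ only on the product of the ranges.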

For any nonempty compact set $K\subseteq [0,1]^{d}$, we can define a pseudo-metric $d_{K}$ on $\mathbf{C}_{d}$ through

$$ d_{K}(C_{1},C_{2}):=\sup _{u\in K} |C_{1}(u)-C_{2}(u) |. $$

Since the elements of $\mathbf{C}_{d}$ are all Lipschitz-continuous with Lipschitz constant 1 on $[0,1]^{d}$, the set $\mathbf{C}_{d}$ is uniformly equicontinuous. This implies that convergence of a sequence $(C_{n})_{n\in \mathbb{N}}\in \mathbf{C}_{d}^{\mathbb{N}}$ to some $C\in \mathbf{C}_{d}$ with respect to $d_{K}$ is equivalent to pointwise convergence of $(C_{n})_{n\in \mathbb{N}}$ to $C$ on $K$. The topology on $\mathbf{C}_{d}$ generated by $d_{K}$ is denoted by $\mathcal{O}_{K}$. For any $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$, we let

$$ d_{\mu _{1},\ldots ,\mu _{d}}:=d_{K}\quad \mbox{and}\quad\mathcal{O}_{ \mu _{1},\ldots ,\mu _{d}}:=\mathcal{O}_{K}\quad \mbox{with}\ K:= \overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}. $$

(2.2)

For $K=[0,1]^{d}$, the pseudo-metric $d_{K}$ is even a metric, and the topology $\mathcal{O}_{K}$ is the standard topology on $\mathbf{C}_{d}$ (and the counterpart of the weak topology on the set of all Borel probability measures on $[0,1]^{d}$ whose distribution functions are $d$-variate copulas). In particular, if $F_{\mu _{1}},\ldots ,F_{\mu _{d}}$ are all continuous, then $d_{\mu _{1},\ldots ,\mu _{d}}=d_{[0,1]^{d}}$ and $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}=\mathcal{O}_{[0,1]^{d}}$.
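To illustrate how the choice of $K$ affects $d_{K}$, the sketch below compares the comonotonicity copula $\min (u,v)$ with the independence copula $uv$ for $d=2$. It is an illustration only: a finite grid is used as a stand-in for $[0,1]^{2}$, the marginals in the second computation are assumed to be Bernoulli with success probability 0.1, and the function names are hypothetical.

```python
from itertools import product

def independence(u, v):
    """Independence copula."""
    return u * v

def comonotone(u, v):
    """Comonotonicity copula (upper Frechet-Hoeffding bound)."""
    return min(u, v)

def sup_distance(c1, c2, points):
    """d_K(C1, C2) = sup over u in K of |C1(u) - C2(u)| for a finite set K of points."""
    return max(abs(c1(u, v) - c2(u, v)) for u, v in points)

# Fine grid as a stand-in for K = [0,1]^2 (the case of continuous marginals).
grid = [i / 100 for i in range(101)]
d_full = sup_distance(comonotone, independence, product(grid, grid))

# Bernoulli marginals with success probability 0.1: the closure of ran F is {0, 0.9, 1}.
ran_F = [0.0, 0.9, 1.0]
d_restricted = sup_distance(comonotone, independence, product(ran_F, ran_F))

print(d_full, d_restricted)
```

The restricted distance is smaller than the distance on the whole unit square, in line with the fact that $d_{\mu _{1},\ldots ,\mu _{d}}$ only sees the copulas on $\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}$.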

Convergence of copulas with respect to $\mathcal{O}_{[0,1]^{d}}$ has been addressed in the literature several times, for instance by Charpentier and Segers [7] and Trutschnig [51]. Metrics inducing topologies that are at least as fine as $\mathcal{O}_{[0,1]^{d}}$ have been studied for instance by Li et al. [30], Trutschnig [50], Fernández Sánchez and Trutschnig [19] and Kasper et al. [24]. On the other hand, the (pseudo-) metric $d_{\mu _{1},\ldots ,\mu _{d}}$ defined by (2.2) generates a topology that is at most as fine as $\mathcal{O}_{[0,1]^{d}}$. It is finally worth mentioning that the metric on the set of bivariate subcopulas that was recently introduced by Rachasingho and Tasena [40] basically differs from the metric $d_{\mu _{1},\mu _{2}}$ and from its variant $d_{\mu _{1},\mu _{2}}^{\sim}$ introduced in the following Remark 2.1; for details, see Appendix B.

Remark 2.1

For any fixed $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$, one can regard the pseudo-metric $d_{\mu _{1},\ldots ,\mu _{d}}$ as a metric when changing from $\mathbf{C}_{d}$ to the quotient set $\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ with respect to the equivalence relation

$$ \sim _{\mu _{1},\ldots ,\mu _{d}}:=\{(C,C')\in \mathbf{C}_{d}\times \mathbf{C}_{d}:C=C'\mbox{ on }\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}\}. $$

On the resulting quotient set $\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$, one may then define a metric through $d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}(\mathsf{C},\mathsf{C}'):=d_{\mu _{1},\ldots ,\mu _{d}}(C,C')$, where $C,C'$ are (arbitrary) representatives of the equivalence classes $\mathsf{C},\mathsf{C}'\in \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$. The topology on $\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ generated by $d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$, henceforth denoted by $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$, preserves the topological structure in the sense that a set $G\subseteq \mathbf{C}_{d}$ lies in $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}$ if and only if the set

$$ \{\mathsf{C}\in \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}:\mbox{there exists a }C\in \mathsf{C}\mbox{ with }C\in G\} $$

lies in $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$.

2.2 The set $\mathcal{M}_{d}^{p}$ and the $p$-weak topology

Fix $p\in \mathbb{R}_{+}$ and let $\mathcal{M}_{d}^{p}$ be the set of all $\mu \in{\mathcal{M}}_{d}$ for which $\int _{\mathbb{R}^{d}}|x|^{p}\,\mu (dx)<\infty $. Note that $\mathcal{M}_{d}=\mathcal{M}_{d}^{0}\supseteq{\mathcal{M}}_{d}^{p_{1}}\supseteq{\mathcal{M}}_{d}^{p_{2}}$ for any $p_{1},p_{2}\in \mathbb{R}_{+}$ with $p_{1}\le p_{2}$. The $p$-weak topology on $\mathcal{M}_{d}^{p}$, henceforth denoted by $\mathcal{O}_{d}^{p}$, is defined as the coarsest topology for which all mappings $\mu \mapsto \int f\,d\mu $, $f\in {\mathcal{C}}_{d}^{p}$, are continuous, where $\mathcal{C}_{d}^{p}$ is the space of all continuous functions $f:\mathbb{R}^{d}\to \mathbb{R}$ with $\sup _{x\in \mathbb{R}^{d}}|f(x)|/(1+|x|^{p})<\infty $. The 0-weak topology on $\mathcal{M}_{d}^{0}$ ($=\mathcal{M}_{d}$) is just the classical weak topology, and the $p$-weak topology $\mathcal{O}_{d}^{p}$ is finer than the relative weak topology $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}$ when $p>0$.
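A standard example, which is not taken from the article, may help to see why the $p$-weak topology can be strictly finer than the relative weak topology. Take $d=1$, $p=1$ and, with $\delta _{x}$ denoting the Dirac measure at $x$,

$$ \mu _{n}:=\Big(1-\frac{1}{n}\Big)\delta _{0}+\frac{1}{n}\,\delta _{n},\qquad n\in \mathbb{N}. $$

For every bounded continuous function $f:\mathbb{R}\to \mathbb{R}$, we have

$$ \int f\,d\mu _{n}=\Big(1-\frac{1}{n}\Big)f(0)+\frac{1}{n}\,f(n)\longrightarrow f(0)=\int f\,d\delta _{0}, $$

so that $\mu _{n}\to \delta _{0}$ in $\mathcal{O}_{1}^{0}$. On the other hand, the function $f(x):=|x|$ lies in $\mathcal{C}_{1}^{1}$ and

$$ \int |x|\,\mu _{n}(dx)=1\quad \mbox{for all }n\in \mathbb{N},\qquad \int |x|\,\delta _{0}(dx)=0, $$

so that $(\mu _{n})$ does not converge to $\delta _{0}$ in $\mathcal{O}_{1}^{1}$, although all measures involved lie in $\mathcal{M}_{1}^{1}$.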

It is known from Krätschmer et al. [28, Lemma 2.1] that $(\mathcal{M}_{d}^{p},\mathcal{O}_{d}^{p})$ is a Polish space and that $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p}$ if and only if both $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}$ and

$$ \int _{\mathbb{R}^{d}}|x|^{p}\,\mu _{n}(dx)\longrightarrow \int _{\mathbb{R}^{d}}|x|^{p}\,\mu (dx). $$

In particular, $\mathcal{O}_{d}^{p}$ is metrised by

$$ d(\mu ,\nu ):=d_{\mathrm{weak}}(\mu ,\nu )+\bigg|\int _{\mathbb{R}^{d}}|x|^{p}\,\mu (dx)-\int _{\mathbb{R}^{d}}|x|^{p}\,\nu (dx)\bigg| $$

for any metric $d_{\mathrm{weak}}$ which metrises $\mathcal{O}_{d}^{0}$. Already in the 1980s, Bickel and Freedman [5, Lemma 8.3] proved for $p\in [1,\infty )$ that $\mathcal{O}_{d}^{p}$ is also metrisable by the $L^{p}$-Wasserstein metric. The following proposition is a sort of continuous mapping theorem.

Proposition 2.2

Let $d,d'\in \mathbb{N}$ and $p,p'\in \mathbb{R}_{+}$. Let $h:\mathbb{R}^{d}\to \mathbb{R}^{d'}$ be a continuous function such that $\sup _{x\in \mathbb{R}^{d}}|h(x)|^{p'}/(1+|x|^{p})<\infty $. Then $\mathfrak{h}(\mu ):=\mu \circ h^{-1}$ lies in $\mathcal{M}_{d'}^{p'}$ for any $\mu \in{\mathcal{M}}_{d}^{p}$, and the map $\mathfrak{h}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d'}^{p'}$ is $(\mathcal{O}_{d}^{p},\mathcal{O}_{d'}^{p'})$-continuous.

2.3 A generalisation of Deheuvels’ copula convergence theorem

Deheuvels’ convergence theorem [10, Théorème 2.3, Lemma 4.1] says that given a $d$-variate distribution $\mu \in{\mathcal{M}}_{d}$ whose marginal distributions $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$ all possess continuous distribution functions $F_{\mu _{1}},\ldots ,F_{\mu _{d}}$, a sequence $(\mu _{n})_{n\in \mathbb{N}}\in \mathcal{M}_{d}^{\mathbb{N}}$ converges to $\mu $ in $\mathcal{O}_{d}^{0}$ if and only if $(\mu _{n,i})$ converges to $\mu _{i}$ in $\mathcal{O}_{1}^{0}$, $i=1,\ldots ,d$, and $d_{[0,1]^{d}}(C_{n},C)\to 0$. Here $C$ is the unique copula of $\mu $, $C_{n}$ is any copula of $\mu _{n}$, and $\mu _{n,i}$ is the $i$th marginal distribution of $\mu _{n}$. Sempi [47] and Lindner and Szimayer [31] extended Deheuvels’ result to the general case where the marginal distribution functions $F_{\mu _{1}},\ldots ,F_{\mu _{d}}$ might be discontinuous. Theorem 2.1 in [31] shows that $(\mu _{n})_{n\in \mathbb{N}}\in \mathcal{M}_{d}^{\mathbb{N}}$ converges to $\mu $ in $\mathcal{O}_{d}^{0}$ if and only if $(\mu _{n,i})$ converges to $\mu _{i}$ in $\mathcal{O}_{1}^{0}$, $i=1,\ldots ,d$, and $d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0$ (a similar result was proved earlier for $d=2$ in [47, Theorems 2 and 3]). Recall that $C$ is uniquely determined only on $\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}$. The results of [47, Example 2] and [31, Example 2.2] show that convergence of the copula on the whole unit cube $[0,1]^{d}$ can indeed fail. Theorem 2.3 below is a version of the Sempi–Lindner–Szimayer result where the weak topologies are replaced by $p$-weak topologies.

Consider the map $\mathfrak{P}_{d}:\mathbf{C}_{d}\times\mathcal{M}_{1}\times \cdots \times\mathcal{M}_{1}\to{\mathcal{M}}_{d}$ defined by

$$ \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d}):=\mathfrak{p}_{d}\big(C(F_{ \mu _{1}},\ldots ,F_{\mu _{d}})\big), $$

(2.3)

where $\mathfrak{p}_{d}$ assigns to a $d$-variate distribution function its corresponding Borel probability measure on $\mathbb{R}^{d}$. Note that $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ remains unchanged when $C$ is modified outside $\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}$. It is easily seen (see Appendix A.2) that for any $p\in \mathbb{R}_{+}$, the univariate distributions $\mu _{1},\ldots ,\mu _{d}$ lie in $\mathcal{M}_{1}^{p}$ if and only if the $d$-variate distribution $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ lies in $\mathcal{M}_{d}^{p}$, regardless of the copula $C$. In particular, the restriction of $\mathfrak{P}_{d}$ to $\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}$ can be regarded as an $\mathcal{M}_{d}^{p}$-valued map.

Theorem 2.3

Fix $p\in \mathbb{R}_{+}$ and let $(C,\mu _{1},\ldots ,\mu _{d})$ and $(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})$, $n\in \mathbb{N}$, be elements of $\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}$. Then

$$ \mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\longrightarrow \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d}) $$

in $\mathcal{O}_{d}^{p}$ if and only if $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{p}$, $i=1,\ldots ,d$, and $d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0$.

Recall that a sequence in a product space converges in the product topology if and only if for each projection, the corresponding marginal sequence converges. As a direct consequence, we can obtain from Theorem 2.3 the following corollary, taking into account that each of the involved topologies is metrisable, or at least pseudo-metrisable, and that $d_{\mu _{1},\ldots ,\mu _{d}}(C,C)=0$ for any $C\in \mathbf{C}_{d}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$.

Corollary 2.4

For any $p\in \mathbb{R}_{+}$, $C\in \mathbf{C}_{d}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, the following two assertions hold:

(i) The map $\mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to{\mathcal{M}}_{d}^{p}$ is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{d}^{p})$-continuous.

(ii) The map $\mathfrak{P}_{d}(C,\,\cdot \,,\,\ldots ,\,\cdot \,):\mathcal{M}_{1}^{p} \times \cdots \times\mathcal{M}_{1}^{p}\to{\mathcal{M}}_{d}^{p}$ is continuous for the pair $(\mathcal{O}_{1}^{p}\times \cdots \times\mathcal{O}_{1}^{p},\mathcal{O}_{d}^{p})$.

2.4 Characterisation of ($p$-)weak convergence in Fréchet classes

For any $p\in \mathbb{R}_{+}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, the image of the map

$$ \mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to{\mathcal{M}}_{d}^{p} $$

is the Fréchet class $\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$, and we have $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p'}$ as well as ${\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\subseteq{\mathcal{M}}_{d}^{p'}$ for any $p'\in [0,p]$. Therefore, Corollary 2.4 (i) immediately yields the following result.

Corollary 2.5

For any $p\in \mathbb{R}_{+}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, the map

$$ \mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}) $$

is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}( \mu _{1},\ldots ,\mu _{d}))$-continuous for any $p'\in [0,p]$.

As a simple consequence of Theorem 2.3, we obtain the following corollary (see Appendix A.3). The result is already known from Krätschmer et al. [28, Proposition 3.9] (with $A_{d}$ chosen to be the identity on $\mathbb{R}^{d}$), where other arguments have been used for the proof.

Corollary 2.6

For any $p\in \mathbb{R}_{+}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, we have that

$$ {\mathcal{O}}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})=\mathcal{O}_{d}^{p'} \cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}) $$

for any $p'\in [0,p]$.

For any fixed $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, we use as before $\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ to denote the quotient set of $\mathbf{C}_{d}$ with respect to the equivalence relation $\sim _{\mu _{1},\ldots ,\mu _{d}}$ of identity on $\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}$. Recall from Remark 2.1 that we denote by $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$ the topology on $\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ generated by the metric $d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$ corresponding to the pseudo-metric $d_{\mu _{1},\ldots ,\mu _{d}}$, and that $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$ preserves the topological structure of $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}$.

Let us denote by $\mathfrak{P}_{\mu _{1},\ldots ,\mu _{d}}:\mathbf{C}_{d}/_{\sim _{ \mu _{1},\ldots ,\mu _{d}}}\to{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$ the map that assigns to each equivalence class $\mathsf{C}\in \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ the unique probability measure $\mu _{\mathsf{C}}\in \mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$ that satisfies $\mu _{\mathsf{C}}=\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ for all representatives $C\in \mathsf{C}$. Then Corollary 2.5 can be reformulated as follows.

Corollary 2.7

For any $p\in \mathbb{R}_{+}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, the map

$$ \mathfrak{P}_{\mu _{1},\ldots ,\mu _{d}}:\mathbf{C}_{d}/_{\sim _{\mu _{1}, \ldots ,\mu _{d}}}\to{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}) $$

is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim},\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}))$-continuous for any $p'\in [0,p]$.

Let $\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}:\mathcal{M}_{d}(\mu _{1}, \ldots ,\mu _{d})\to \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ be the map that assigns to each $\mu \in{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$ the unique equivalence class $\mathsf{C}_{\mu}\in \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ whose representatives are copulas of $\mu $. Then we have the following converse of Corollary 2.7.

Corollary 2.8

For any $p\in \mathbb{R}_{+}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, the map

$$ \mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}:\mathcal{M}_{d}(\mu _{1}, \ldots ,\mu _{d})\to \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}} $$

is $(\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mu _{1},\ldots ,\mu _{d}}^{\sim})$-continuous for any $p'\in [0,p]$.

As an immediate consequence of Corollaries 2.7 and 2.8, we obtain the following result. Note that the equivalence (a) ⇔ (b) also follows from Corollary 2.6, and that condition (c) is equivalent to $d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0$ for any copulas $C$ and $C_{n}$, $n\in \mathbb{N}$, of $\mu $ and $\mu _{n}$, $n\in \mathbb{N}$, respectively.

Corollary 2.9

Fix $p\in \mathbb{R}_{+}$ and $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$. Then the following assertions are equivalent for any $(\mu _{n})_{n\in \mathbb{N}}\in \mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})^{\mathbb{N}}$ and $\mu \in \mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$:

(a) $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$.

(b) $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$.

(c) $\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}(\mu _{n})\to \mathfrak{C}_{ \mu _{1},\ldots ,\mu _{d}}(\mu )$ in $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$.
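A simple worked example, which is not part of the article, shows what this equivalence amounts to in a discrete setting. Let $d=2$ and let the marginals be Bernoulli distributions $\mu _{i}:=(1-p_{i})\delta _{0}+p_{i}\delta _{1}$ with $p_{i}\in (0,1)$, so that $\overline{\mathrm{ran}F_{\mu _{i}}}=\{0,1-p_{i},1\}$. Every copula satisfies $C(0,v)=C(u,0)=0$, $C(1,v)=v$ and $C(u,1)=u$, so that two copulas can differ on $\overline{\mathrm{ran}F_{\mu _{1}}}\times \overline{\mathrm{ran}F_{\mu _{2}}}$ only at the point $(1-p_{1},1-p_{2})$. Hence

$$ d_{\mu _{1},\mu _{2}}(C,C')=\big|C(1-p_{1},1-p_{2})-C'(1-p_{1},1-p_{2})\big|. $$

Writing $c:=C(1-p_{1},1-p_{2})$ for a copula $C$ of $\mu \in \mathcal{M}_{2}(\mu _{1},\mu _{2})$, it follows from (1.1) and the marginal constraints that

$$ \mu [\{(0,0)\}]=c,\quad \mu [\{(0,1)\}]=1-p_{1}-c,\quad \mu [\{(1,0)\}]=1-p_{2}-c,\quad \mu [\{(1,1)\}]=c+p_{1}+p_{2}-1. $$

Since all elements of $\mathcal{M}_{2}(\mu _{1},\mu _{2})$ are concentrated on the finite set $\{0,1\}^{2}$, convergence $\mu _{n}\to \mu $ in any of the $p$-weak topologies is equivalent to convergence of these four point masses, that is, to $c_{n}\to c$, which in turn is the same as $d_{\mu _{1},\mu _{2}}(C_{n},C)\to 0$.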

3 Copula robustness

3.1 Definition of copula robustness

Let $\mathcal{M}_{d}'\subseteq{\mathcal{M}}_{d}$ and $\mathcal{T}_{d}:\mathcal{M}_{d}'\longrightarrow \mathbf{E} $ be any map taking values in some topological space $(\mathbf{E},\mathcal{O}_{\mathbf{E}})$. As before, let the map $\mathfrak{P}_{d}:\mathbf{C}_{d}\times\mathcal{M}_{1}\times \cdots \times\mathcal{M}_{1}\to{\mathcal{M}}_{d}$ be defined by (2.3). Let $\mathfrak{D}_{d}'$ be the set of all $(C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}\times\mathcal{M}_{1} \times \cdots \times\mathcal{M}_{1}$ for which $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ lies in $\mathcal{M}_{d}'$. Then we can associate with $\mathcal{T}_{d}$ a functional $\mathfrak{T}_{d}:\mathfrak{D}_{d}'\to \mathbf{E}$ through

$$ \mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d}):=\mathcal{T}_{d}\big( \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\big). $$

(3.1)

Let $\mathfrak{M}_{d}'$ be the set of all $d$-tuples $(\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{1}\times \cdots \times\mathcal{M}_{1}$ for which there exists a copula $C\in \mathbf{C}_{d}$ such that $(C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'$. Moreover, for any fixed $(\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'$, let the set $\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})$ consist of all those copulas $C\in \mathbf{C}_{d}$ for which $(C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'$.

Definition 3.1

The map $\mathcal{T}_{d}$ is copula robust if for any fixed $(\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'$, the map $\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d}'( \mu _{1},\ldots ,\mu _{d})\to \mathbf{E}$ is continuous for the pair

$$ (\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}\cap \mathbf{C}_{d}'(\mu _{1}, \ldots ,\mu _{d}),\mathcal{O}_{\mathbf{E}}).$$
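For orientation, here is a hedged worked example of a copula robust functional that is not taken from the article. Let $d=2$, fix real numbers $a<b$, let $\mathcal{M}_{2}'$ be the set of all $\mu \in \mathcal{M}_{2}$ that are concentrated on $[a,b]^{2}$, let $\mathbf{E}=\mathbb{R}$ with the usual topology, and set $\mathcal{T}_{2}(\mu ):=\int x_{1}x_{2}\,\mu (dx)$. For marginals $\mu _{1},\mu _{2}$ concentrated on $[a,b]$ and copulas $C,C'\in \mathbf{C}_{2}$, the two joint distributions have the same means, so Hoeffding's covariance identity yields

$$ \mathfrak{T}_{2}(C,\mu _{1},\mu _{2})-\mathfrak{T}_{2}(C',\mu _{1},\mu _{2})=\int _{a}^{b}\int _{a}^{b}\Big(C\big(F_{\mu _{1}}(x_{1}),F_{\mu _{2}}(x_{2})\big)-C'\big(F_{\mu _{1}}(x_{1}),F_{\mu _{2}}(x_{2})\big)\Big)\,dx_{1}\,dx_{2}. $$

The integrand is bounded in absolute value by $d_{\mu _{1},\mu _{2}}(C,C')$, and therefore

$$ \big|\mathfrak{T}_{2}(C,\mu _{1},\mu _{2})-\mathfrak{T}_{2}(C',\mu _{1},\mu _{2})\big|\le (b-a)^{2}\,d_{\mu _{1},\mu _{2}}(C,C'). $$

In this special case, the map $C\mapsto \mathfrak{T}_{2}(C,\mu _{1},\mu _{2})$ is thus even Lipschitz-continuous with respect to $d_{\mu _{1},\mu _{2}}$, so that $\mathcal{T}_{2}$ is copula robust in the sense of Definition 3.1.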

The sets $\mathfrak{M}_{d}'$ and $\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})$ are illustrated by Examples 3.2 and 3.3 below. The examples show in particular that the set $\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})$ can be quite different from case to case. In Example 3.2, and in the further course, let $\mathcal{N}_{1}$ be the set of all non-degenerate univariate normal distributions and for $d\ge 2$, let $\mathcal{N}_{d}$ be the set of all (possibly degenerate) $d$-variate normal distributions with continuous marginals. In Example 3.2, we also need the notion of a Gaussian copula. Recall that, by definition, a $d$-variate Gaussian copula is an element $C\in \mathbf{C}_{d}$ given through

$$ C(u_{1},\ldots ,u_{d}):=\boldsymbol{\varPhi}_{\boldsymbol{0},R}\big( \varPhi _{0,1}^{-1}(u_{1}),\ldots ,\varPhi _{0,1}^{-1}(u_{d})\big) $$

(3.2)

for some correlation matrix $R$, i.e., for some symmetric and positive semi-definite matrix $R\in [-1,1]^{d\times d}$ which has entries 1 on the diagonal. Here $\varPhi _{0,1}$ and $\boldsymbol{\varPhi}_{\boldsymbol{0},R}$ are respectively the distribution function of the univariate standard normal distribution and the distribution function of the centered $d$-variate normal distribution $\mathbf{N}_{\boldsymbol{0},R}$ with covariance matrix equal to $R$, and we set $\varPhi _{0,1}^{-1}(0):=-\infty $ and $\varPhi _{0,1}^{-1}(1):=+ \infty $ as well as

$$ \boldsymbol{\varPhi}_{\boldsymbol{0},R}(x_{1},\ldots ,x_{d}):=\mathbf{N}_{\boldsymbol{0},R}\Big[\Big(\mathop{\times}\limits_{i=1}^{d}(-\infty ,x_{i}]\Big)\cap \mathbb{R}^{d}\Big] $$

for any $x_{1},\ldots ,x_{d}\in \overline{\mathbb{R}}:=\mathbb{R}\cup \{-\infty ,+\infty \}$. The set of all Gaussian copulas is denoted by $\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}$.
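As an illustrative aside, a bivariate Gaussian copula of the form (3.2) can be explored by simulation. The following Python sketch is not part of the original exposition: the function names, the restriction to $d=2$ and the Monte Carlo approach are assumptions made here. It draws a correlated standard normal pair, maps each coordinate through $\varPhi _{0,1}$, and evaluates the empirical counterpart of $C$.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(r, n, seed=0):
    """Draw n points (u1, u2) from the bivariate Gaussian copula with correlation r.

    Illustrative helper (hypothetical name): r may equal +1 or -1, which gives
    a degenerate correlation matrix R as allowed in (3.2).
    """
    rng = random.Random(seed)
    std = NormalDist()  # univariate standard normal, i.e. Phi_{0,1}
    points = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # z2 has unit variance and correlation r with z1
        z2 = r * z1 + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        points.append((std.cdf(z1), std.cdf(z2)))
    return points


def empirical_copula(points, u1, u2):
    """Fraction of sampled points in [0, u1] x [0, u2], an estimate of C(u1, u2)."""
    return sum(1 for a, b in points if a <= u1 and b <= u2) / len(points)
```

For $r=1/2$ the estimate of $C(1/2,1/2)$ should be close to $1/4+\arcsin (r)/(2\pi )=1/3$, and for $r=1$ all sampled points lie on the diagonal of $[0,1]^{2}$.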

Example 3.2

If $\mathcal{M}_{d}'=\mathcal{N}_{d}$, then $\mathfrak{D}_{d}'=\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\times\mathcal{N}_{1}\times \cdots \times\mathcal{N}_{1}$ (see Appendix A.5). In particular, $\mathfrak{M}_{d}'=\mathcal{N}_{1}\times \cdots \times\mathcal{N}_{1}$ and $\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})=\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}$ for any $(\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'$.

Example 3.3

If $\mathcal{M}_{d}'=\mathcal{M}_{d}^{p}$ for some $p\in \mathbb{R}_{+}$, then $\mathfrak{D}_{d}'=\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}$ (see Appendix A.6). In particular, $\mathfrak{M}_{d}'=\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}$ and $\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})=\mathbf{C}_{d}$ for any $(\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'$.

The following lemma is trivial, but nevertheless worth writing down. In the lemma, $(\mathbf{E}',\mathcal{O}_{\mathbf{E}}')$ is another topological space.

Lemma 3.4

If $\mathcal{T}_{d}$ is copula robust and $\mathcal{U}:\mathbf{E}\to \mathbf{E}'$ is any $(\mathcal{O}_{\mathbf{E}},\mathcal{O}_{\mathbf{E}}')$-continuous map, then the composition $\mathcal{T}_{d}':=\mathcal{U}\circ{\mathcal{T}}_{d}$ is copula robust.

3.2 Copula robustness of functionals on $\mathcal{N}_{d}$

In this section, let specifically $\mathcal{M}_{d}'=\mathcal{N}_{d}$. That is, let $\mathcal{T}_{d}:\mathcal{N}_{d}\to \mathbf{E}$ be any map taking values in some topological space $(\mathbf{E},\mathcal{O}_{\mathbf{E}})$. In view of Example 3.2, the definition of copula robustness of $\mathcal{T}_{d}$ (Definition 3.1) can then be reformulated as follows.

Definition 3.5

The map $\mathcal{T}_{d}$ on $\mathcal{N}_{d}$ is copula robust if for any fixed $\mu _{1},\ldots ,\mu _{d}\in \mathcal{N}_{1}$, the map $\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d}^{ {\mbox{\textup {{\scriptsize {Ga}}}}}}\to \mathbf{E}$ is $(\mathcal{O}_{[0,1]^{d}}\cap \mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}},\mathcal{O}_{ \mathbf{E}})$-continuous.

Remark 3.6

Convergence in $(\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}},\mathcal{O}_{[0,1]^{d}}\cap \mathbf{C}_{d}^{ {\mbox{\textup {{\scriptsize {Ga}}}}}})$ is nothing but pointwise (or uniform) convergence in $\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}$. This sort of convergence is in turn equivalent to convergence of the respective correlation matrices in any matrix norm; for details, see Appendix A.7.

Example 3.7

The identity map $\mathcal{P}_{d}:\mathcal{N}_{d}\to{\mathcal{N}}_{d}$ is copula robust in the sense of Definition 3.5 when the role of $(\mathbf{E},\mathcal{O}_{\mathbf{E}})$ is played by $(\mathcal{N}_{d},\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d})$ for arbitrary (but fixed) $p\in \mathbb{R}_{+}$. For details, see Appendix A.8.

Example 3.7 and Lemma 3.4 (applied to $\mathcal{T}_{d}':=\mathcal{T}_{d}$, $\mathcal{T}_{d}:=\mathcal{P}_{d}$, $\mathcal{U}:=\mathcal{T}_{d}$) immediately yield the following result.

Theorem 3.8

If $\mathcal{T}_{d}$ is $(\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d},\mathcal{O}_{\mathbf{E}})$-continuous for some $p\in \mathbb{R}_{+}$, then it is copula robust.

Of course, Theorem 3.8 can be generalised to larger sets of parametric distributions as for instance to the set $\mathcal{S}_{d}$ of all $d$-variate (Student) $t$-distributions with continuous marginals; see for instance Demarta and McNeil [11] for the definitions of $d$-variate $t$-distributions and $t$-copulas. However, for the sake of clarity and ease, the exposition here is restricted to the Gaussian setting. A perhaps more interesting setting is addressed in the next section.

3.3 Copula robustness of functionals on $\mathcal{M}_{d}^{p}$

In this section, let specifically $\mathcal{M}_{d}'=\mathcal{M}_{d}^{p}$ for some $p\in \mathbb{R}_{+}$. That is, let $\mathcal{T}_{d}:\mathcal{M}_{d}^{p}\to \mathbf{E}$ be any map taking values in some topological space $(\mathbf{E},\mathcal{O}_{\mathbf{E}})$. In view of Example 3.3, the definition of copula robustness of $\mathcal{T}_{d}$ (Definition 3.1) can then be reformulated as follows.

Definition 3.9

The map $\mathcal{T}_{d}$ on $\mathcal{M}_{d}^{p}$ is copula robust if for any fixed $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, the map $\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to \mathbf{E}$ is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})$-continuous.

With the help of Corollaries 2.5 and 2.8, we can derive the following characterisation of copula robustness of $\mathcal{T}_{d}$. For details, see Appendix A.9.

Theorem 3.10

Let $\mathcal{T}_{d}: \mathcal{M}_{d}^{p} \to \mathbf{E}$ be any map. Then $\mathcal{T}_{d}$ is copula robust if and only if for any fixed $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, its restriction $\mathcal{T}_{d}|_{\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})}$ to the Fréchet class $\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$ is continuous for the pair $(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mathbf{E}})$.

Example 3.11

Corollary 2.4 (i) shows that the identity map $\mathcal{P}_{d}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d}^{p}$ is copula robust in the sense of Definition 3.9 when the role of $(\mathbf{E},\mathcal{O}_{\mathbf{E}})$ is played by $(\mathcal{M}_{d}^{p},\mathcal{O}_{d}^{p})$.

Example 3.11 and Lemma 3.4 (applied to $\mathcal{T}_{d}':=\mathcal{T}_{d}$, $\mathcal{T}_{d}:=\mathcal{P}_{d}$, $\mathcal{U}:=\mathcal{T}_{d}$) immediately yield the following result.

Theorem 3.12

If $\mathcal{T}_{d}$ is $(\mathcal{O}_{d}^{p},\mathcal{O}_{\mathbf{E}})$-continuous, then it is copula robust.

Now fix $d'\in \mathbb{N}$ and $p'\in \mathbb{R}_{+}$. Proposition 2.2 ensures that in the scope of the following corollary, we have $\mu \circ h^{-1}\in{\mathcal{M}}_{d'}^{p'}$ for any $\mu \in{\mathcal{M}}_{d}^{p}$.

Corollary 3.13

Let $\mathcal{T}_{d'}:\mathcal{M}_{d'}^{p'}\to \mathbf{E}$ be an $(\mathcal{O}_{d'}^{p'},\mathcal{O}_{\mathbf{E}})$-continuous map and suppose $h:\mathbb{R}^{d}\to \mathbb{R}^{d'}$ is a continuous map with $\sup _{x\in \mathbb{R}^{d}}|h(x)|^{p'}/(1+|x|^{p})<\infty $. Then the map $\mathcal{T}_{d}':\mathcal{M}_{d}^{p}\to \mathbf{E}$ defined by $\mathcal{T}_{d}'(\mu ):=\mathcal{T}_{d'}(\mu \circ h^{-1})$ is copula robust.

4 Example 1: risk measures of aggregate risks

4.1 Foundations of risk measures

Let $(\varOmega ,\mathcal{F},\mathbb{P})$ be an atomless probability space and denote by $L^{0}:=L^{0}(\varOmega ,\mathcal{F},\mathbb{P})$ the usual class of all finite-valued random variables modulo the equivalence relation of $\mathbb{P}$-a.s. identity. Moreover, let $L^{p}=L^{p}(\varOmega ,\mathcal{F},\mathbb{P})$ be the usual $L^{p}$-space, $p>0$. For any $p\in \mathbb{R}_{+}$, we say that a map $\rho :L^{p}\to \mathbb{R}$ is a risk measure when the following three conditions are satisfied:

(i) (monotonicity) $\rho (X)\le \rho (Y)$ for $X$, $Y\in L^{p}$ with $X\le Y$;

(ii) (cash-additivity) $\rho (X+m)=\rho (X)+m$ for $X\in L^{p}$ and $m\in \mathbb{R}$;

(iii) (distribution-invariance) $\rho (X)=\rho (Y)$ for $X,Y\in L^{p}$ with $\mathbb{P}_{X}=\mathbb{P}_{Y}$.

In this context, the elements of $L^{p}$ should be seen as payoff profiles where positive realisations correspond to losses. Following Föllmer and Schied [22], [23, Chap. 4], a risk measure $\rho :L^{p}\to \mathbb{R}$ is said to be convex if it satisfies the following condition:

(iv) (convexity) $\rho (\lambda X+(1-\lambda ) Y)\le \lambda \rho (X)+(1-\lambda ) \rho (Y)$ for all $X,Y\in L^{p}$ and $\lambda \in [0,1]$.

The following example recalls three risk measures which are popular in practice and/or among academics. For background, see Emmer et al. [18] and references cited therein.

Example 4.1

Fix $\alpha \in (0,1)$.

(i) The value at risk at level $\alpha $ is the risk measure $\mathrm{VaR}_{\alpha}:L^{0}\to \mathbb{R}$ defined by $\mathrm{VaR}_{\alpha}(X):=F_{X}^{\leftarrow}(\alpha )$, where $F_{X}^{\leftarrow}(\alpha ):=\inf \{x\in \mathbb{R}:F_{X}(x)\ge \alpha \}$ is the lower $\alpha $-quantile of $\mathbb{P}_{X}$. It is not convex.

(ii) The average value at risk at level $\alpha $ is the risk measure $\mathrm{AVaR}_{\alpha}:L^{1}\to \mathbb{R}$ defined by $\mathrm{AVaR}_{\alpha}(X):=\frac{1}{1-\alpha}\int _{\alpha}^{1}F_{X}^{ \leftarrow}(s)\,ds$ and known to be convex; see for instance the work of Wang and Dhaene [53].

(iii) The $\alpha $-expectile is the risk measure $\mathrm{Ept}_{\alpha}:L^{1}\to \mathbb{R}$ defined by $\mathrm{Ept}_{\alpha}(X):=\mathcal{U}_{\alpha}(X)^{-1}(0)$, where $\mathcal{U}_{\alpha}(X)^{-1}$ denotes the inverse of the function $\mathcal{U}_{\alpha}(X)(m):=\mathbb{E}[U_{\alpha}(X-m)]$ with $U_{\alpha}(x):=\alpha x$ or $(1-\alpha )x$ depending on whether $x\ge 0$ or $x<0$. It is well defined, and known to be convex if and only if $\alpha \ge 1/2$; see the work of Bellini et al. [3].

For any risk measure $\rho :L^{p}\to \mathbb{R}$, we may define a functional $\mathcal{R}_{\rho}:\mathcal{M}_{1}^{p}\to \mathbb{R}$ through
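To make the three definitions in Example 4.1 concrete, the following Python sketch evaluates them for the empirical distribution of a finite sample (each observation carrying mass $1/n$). The function names and the bisection used for the expectile are illustrative assumptions made here, not part of the original text.

```python
import math


def value_at_risk(sample, alpha):
    """Lower alpha-quantile inf{x : F(x) >= alpha} of the empirical distribution."""
    xs = sorted(sample)
    n = len(xs)
    # smallest k with k/n >= alpha; the small offset guards against rounding
    k = max(1, math.ceil(alpha * n - 1e-9))
    return xs[k - 1]


def average_value_at_risk(sample, alpha):
    """(1/(1-alpha)) * integral of the empirical quantile function over (alpha, 1)."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for k, x in enumerate(xs, start=1):
        # the quantile function equals x on ((k-1)/n, k/n]
        lo = max((k - 1) / n, alpha)
        hi = k / n
        if hi > lo:
            total += x * (hi - lo)
    return total / (1.0 - alpha)


def expectile(sample, alpha, iterations=200):
    """Root m of m -> E[U_alpha(X - m)], found by bisection (decreasing in m)."""
    def mean_u(m):
        return sum(alpha * (x - m) if x >= m else (1.0 - alpha) * (x - m)
                   for x in sample) / len(sample)

    lo, hi = min(sample), max(sample)  # mean_u(lo) >= 0 >= mean_u(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_u(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $\alpha =1/2$ the function $U_{\alpha}$ is linear, so the expectile reduces to the sample mean, which gives a quick sanity check of the bisection.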

$$ {\mathcal{R}}_{\rho}(\mu ):=\rho (X_{\mu}), $$

(4.1)

where $X_{\mu}$ is any random variable on $(\varOmega ,\mathcal{F},\mathbb{P})$ with distribution $\mu $. We refer to $\mathcal{R}_{\rho}$ as the risk functional associated with $\rho $. The assertion of the following result is a direct consequence of Cheridito and Li [8, Theorem 4.1] combined with the representation theorem of Krätschmer et al. [27, Theorem 3.5]. Here $\mathcal{O}_{\mathbb{R}}$ refers to the natural topology on $\mathbb{R}$.

Theorem 4.2

Let $p\in \mathbb{R}_{+}$. For any convex risk measure $\rho :L^{p}\to \mathbb{R}$, the corresponding risk functional $\mathcal{R}_{\rho}:\mathcal{M}_{1}^{p}\to \mathbb{R}$ is $(\mathcal{O}_{1}^{p},\mathcal{O}_{\mathbb{R}})$-continuous.

4.2 Copula robustness of risk measures of aggregate risks

Let $\rho :L^{p'}\to \mathbb{R}$ be a risk measure for some $p'\in \mathbb{R}_{+}$. Let $A_{d}:\mathbb{R}^{d}\to \mathbb{R}$ be any continuous map, regarded as an aggregation map in the spirit of McNeil et al. [33, Sect. 6.2.1]. Assume that for some $p\in \mathbb{R}_{+}$ and any $X_{1},\ldots ,X_{d}\in L^{p}$, the random variable $A_{d}(X_{1},\ldots ,X_{d})$ lies in $L^{p'}$. Then we can define a map $\mathcal{R}_{\rho ,A_{d}}:\mathcal{M}_{d}^{p}\to \mathbb{R}$ through

$$ \mathcal{R}_{\rho ,A_{d}}(\mu ):=\mathcal{R}_{\rho }(\mu \circ A_{d}^{-1} ). $$

(4.2)

We refer to $\mathcal{R}_{\rho ,A_{d}}$ as the aggregation risk functional associated with $\rho $ and $A_{d}$. Note that the right-hand side in (4.2) equals $\rho (A_{d}(X_{1},\ldots ,X_{d}))$ when $(X_{1},\ldots ,X_{d})$ is an $\mathbb{R}^{d}$-valued random variable with distribution $\mu $. As a direct consequence of Corollary 3.13 (applied to $\mathcal{T}_{1}:=\mathcal{R}_{\rho}$, $h:=A_{d}$, $\mathcal{T}_{d}':=\mathcal{R}_{\rho ,A_{d}}$) and Theorem 4.2, we obtain the following result.

Corollary 4.3

Take $p,p'\in \mathbb{R}_{+}$, a convex risk measure $\rho :L^{p'}\to \mathbb{R}$ and a continuous map $A_{d}:\mathbb{R}^{d}\to \mathbb{R}$ satisfying $\sup _{x\in \mathbb{R}^{d}}|A_{d}(x)|^{p'}/(1+|x|)^{p}<\infty $. Then the aggregation risk functional $\mathcal{R}_{\rho ,A_{d}}:\mathcal{M}_{d}^{p}\to \mathbb{R}$ defined by (4.2) is copula robust.

Example 4.4

In risk management, $A_{d}$ is frequently chosen as one of the following maps; see for instance the textbook by McNeil et al. [33, Sect. 6.2]:

(i) $A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}x_{i}$;

(ii) $A_{d}(x_{1},\dots ,x_{d}) := \max \{x_{1},\dots ,x_{d}\}$;

(iii) $A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}(x_{i} - t_{i})^{+}$ for thresholds $t_{1},\dots ,t_{d} > 0$;

(iv) $A_{d}(x_{1},\dots ,x_{d}) := (\sum _{i=1}^{d}x_{i} - t)^{+}$ for a threshold $t > 0$.

It is easily seen that for each of these four maps, $\sup _{x\in \mathbb{R}^{d}}|A_{d}(x)|^{p}/(1+|x|)^{p}<\infty $ holds for any $p\in \mathbb{R}_{+}$. That is, all these maps satisfy the assumptions of Corollary 4.3 for $p'=p$ (and thus for any $p\in \mathbb{R}_{+}$ and $p'\in [0,p]$). In particular, for each of these four maps $A_{d}$ and for any convex risk measure $\rho :L^{p}\to \mathbb{R}$, the corresponding aggregation risk functional $\mathcal{R}_{\rho ,A_{d}}:\mathcal{M}_{d}^{p}\to \mathbb{R}$ defined by (4.2) is copula robust for any $d\in \mathbb{N}$.
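The four aggregation maps of Example 4.4 and the growth condition of Corollary 4.3 can be written down directly. The following Python sketch is an illustration only: the function names are hypothetical, and $|x|$ is taken to be the Euclidean norm, which is an assumption made here. Under that assumption each map satisfies $|A_{d}(x)|\le \sqrt{d}\,|x|$, so the ratio below is bounded by $d^{p/2}$.

```python
import math


def agg_sum(x):
    """Map (i): sum of the components."""
    return sum(x)


def agg_max(x):
    """Map (ii): largest component."""
    return max(x)


def agg_excess_sum(x, thresholds):
    """Map (iii): sum of the componentwise excesses (x_i - t_i)^+."""
    return sum(max(xi - ti, 0.0) for xi, ti in zip(x, thresholds))


def agg_stop_loss(x, threshold):
    """Map (iv): excess (sum_i x_i - t)^+ of the total over one threshold."""
    return max(sum(x) - threshold, 0.0)


def growth_ratio(agg, x, p):
    """|A_d(x)|^p / (1 + |x|)^p with |x| the Euclidean norm (assumed here)."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    return abs(agg(x)) ** p / (1.0 + norm) ** p
```

Evaluating `growth_ratio` on randomly drawn points gives a quick numerical plausibility check of the boundedness claim, although it is of course no substitute for the elementary proof.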

Remark 4.5

Of course, the assertion of Corollary 4.3 and the last assertion in Example 4.4 also hold true for any other risk measure $\rho $ for which the corresponding risk functional $\mathcal{R}_{\rho}:\mathcal{M}_{1}^{p'}\to \mathbb{R}$ is $(\mathcal{O}_{1}^{p'},\mathcal{O}_{\mathbb{R}})$-continuous.

Remark 4.5 indicates that in the setting of Corollary 4.3, the assumed convexity of $\rho $ is not necessary. To give an example that shows that this is indeed true, let $\rho $ be the $\alpha $-expectile $\mathrm{Ept}_{\alpha}$ with $\alpha <1/2$ (see Example 4.1 (iii)). Then $\rho $ is not convex (see Bellini et al. [3, Proposition 7(b–c)]), but the corresponding risk functional $\mathcal{R}_{\rho}:\mathcal{M}_{1}^{1}\to \mathbb{R}$ is $(\mathcal{O}_{1}^{1},\mathcal{O}_{\mathbb{R}})$-continuous (see Krätschmer and Zähle [29, Theorem 2.1]), and the latter implies that the aggregation risk functional $\mathcal{R}_{\rho ,A_{d}}:\mathcal{M}_{d}^{1}\to \mathbb{R}$ is copula robust.

On the other hand, if the risk functional $\mathcal{R}_{\rho}:\mathcal{M}_{1}^{p'}\to \mathbb{R}$ corresponding to some $\rho $ is not $(\mathcal{O}_{1}^{p'},\mathcal{O}_{\mathbb{R}})$-continuous, then copula robustness of $\mathcal{R}_{\rho ,A_{d}}:\mathcal{M}_{d}^{p'}\to \mathbb{R}$ can indeed fail to hold. For instance, Example 4.7 below shows that $\mathcal{R}_{\rho ,A_{2}}:\mathcal{M}_{2}^{0}\to \mathbb{R}$ is not copula robust when $\rho :=\mathrm{VaR}_{\alpha}$ (see Example 4.1 (i)) and $A_{2}(x_{1},x_{2}):=x_{1}+x_{2}$. Note here that the risk functional $\mathcal{R}_{\rho}:\mathcal{M}_{1}^{0}\to \mathbb{R}$ associated with $\rho :=\mathrm{VaR}_{\alpha}$ is known not to be weakly continuous, and that weak continuity is just $(\mathcal{O}_{1}^{0},\mathcal{O}_{\mathbb{R}})$-continuity.

It is further known that the risk functional $\mathcal{R}_{\rho}:\mathcal{M}_{1}^{0}\to \mathbb{R}$ associated with $\rho :=\mathrm{VaR}_{\alpha}$ becomes weakly continuous when restricted to the set $\mathcal{M}_{1}^{(\alpha )}$ of all those Borel probability measures on $\mathbb{R}$ that possess a unique $\alpha $-quantile (see e.g. van der Vaart [52, Lemma 21.2]), or even to the set $\mathcal{M}_{1}^{\mathcal{L}}$ of all $\mu \in \bigcap _{s\in (0,1)}{\mathcal{M}}_{1}^{(s)}$ that possess a Lebesgue density. Nonetheless, the corresponding aggregation risk functional $\mathcal{R}_{\rho ,A_{2}}$, with $A_{2}(x_{1},x_{2}):=x_{1}+x_{2}$, defined on the set $\mathcal{M}_{2}^{\mathcal{L}}$ of all Borel probability measures on $\mathbb{R}^{2}$ with marginal distributions in $\mathcal{M}_{1}^{\mathcal{L}}$, is still not copula robust. This is also a consequence of Example 4.7. The lack of copula robustness of $\mathcal{R}_{\rho ,A_{2}}$ on $\mathcal{M}_{2}^{\mathcal{L}}$ is not immediately obvious. Note, however, that for $\mu \in{\mathcal{M}}_{2}^{\mathcal{L}}$, the image measure $\mu \circ A_{2}^{-1}$ can be purely discrete (see Example 4.7), i.e., $\mu \circ A_{2}^{-1}$ can lie outside the set $\mathcal{M}_{1}^{(\alpha )}$ on which $\mathcal{R}_{\rho}$ is weakly continuous.

When $\mathcal{R}_{\rho ,A_{2}}$, with $\rho :=\mathrm{VaR}_{\alpha}$ and $A_{2}(x_{1},x_{2}):=x_{1}+x_{2}$, is restricted to the much smaller set $\mathcal{N}_{2}$ introduced before (3.2), copula robustness holds true. Note that $\mu \circ A_{2}^{-1}\in{\mathcal{N}}_{1}'\subseteq{\mathcal{M}}_{1}^{(\alpha )}$ for all $\mu \in{\mathcal{N}}_{2}$, where $\mathcal{N}_{1}'$ ($\supseteq{\mathcal{N}}_{1}$) is the set of all (possibly degenerate) univariate normal distributions. The copula robustness follows from Theorem 3.8 since the restriction of $\mathcal{R}_{\rho ,A_{2}}$ to $\mathcal{N}_{2}$ is $(\mathcal{O}_{2}^{0}\cap \mathcal{N}_{2},\mathcal{O}_{\mathbb{R}})$-continuous. The latter follows from the $(\mathcal{O}_{2}^{0}\cap{\mathcal{N}}_{2},\mathcal{O}_{1}^{0}\cap{\mathcal{N}}_{1}')$-continuity of the map $\mathfrak{h}:\mathcal{N}_{2}\to{\mathcal{N}}_{1}'$ defined by $\mathfrak{h}(\mu ):=\mu \circ A_{2}^{-1}$ and the $(\mathcal{O}_{1}^{0}\cap \mathcal{N}_{1}',\mathcal{O}_{\mathbb{R}})$-continuity of the restriction of $\mathcal{R}_{\rho}$ to $\mathcal{N}_{1}'$ ($\subseteq{\mathcal{M}}_{1}^{(\alpha )}$).
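In the Gaussian case the value at risk of the sum is available in closed form, which makes the continuity in the copula parameter visible. The following Python sketch is illustrative only; the function name and parametrisation are assumptions made here. It uses the standard fact that the sum of a bivariate normal pair with marginals $\mathrm{N}_{m_{i},s_{i}^{2}}$ and correlation $r$ is normal with mean $m_{1}+m_{2}$ and variance $s_{1}^{2}+s_{2}^{2}+2rs_{1}s_{2}$.

```python
import math
from statistics import NormalDist


def var_sum_normal(alpha, m1, s1, m2, s2, r):
    """VaR_alpha of X1 + X2 for a bivariate normal pair with correlation r.

    The sum is normal (possibly degenerate when the variance vanishes), so its
    alpha-quantile is mean + sd * Phi^{-1}(alpha), which is continuous in r.
    """
    mean = m1 + m2
    variance = s1 * s1 + s2 * s2 + 2.0 * r * s1 * s2
    sd = math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding
    return mean + sd * NormalDist().inv_cdf(alpha)
```

By Remark 3.6, convergence of Gaussian copulas amounts to convergence of the correlation parameter $r$, so the continuity of $r\mapsto $ `var_sum_normal(alpha, m1, s1, m2, s2, r)` on $[-1,1]$ is exactly the copula robustness described above for $d=2$.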

4.3 Relation to aggregation robustness of risk measures

In [17], Embrechts et al. consider the special case where $A_{d}$ is defined as in (i) of Example 4.4 and $\rho $ is a coherent distortion risk measure defined on a subset of $L^{1}$. In this case, they obtain an analogue of Corollary 4.3 and refer to it as aggregation robustness. In fact, they do not explicitly consider continuity in the copula, but rather weak continuity of the analogous functional defined on the corresponding Fréchet class. However, as seen in Theorem 3.10, this is the same. A generalisation to more general risk measures and more general aggregation maps is given in the work of Krätschmer et al. [28, Sect. 4.2.4].

The following definition is a reformulation of the definition of aggregation robustness of a risk measure $\rho :L^{p}\to \mathbb{R}$ (i.e., of [17, Definition 2.1]). As before, the aggregation risk functional $\mathcal{R}_{\rho ,A_{d}}$ associated with $\rho $ and $A_{d}(x_{1},\ldots ,x_{d}):=\sum _{i=1}^{d}x_{i}$ is defined by (4.2).

Definition 4.6

Let $p\in \mathbb{R}_{+}$. A risk measure $\rho :L^{p}\to \mathbb{R}$ is said to be aggregation robust if the corresponding aggregation risk functionals $\mathcal{R}_{\rho ,A_{d}}:\mathcal{M}_{d}^{p}\to \mathbb{R}$, $d\ge 2$, are copula robust.

In view of Corollary 4.3 and Example 4.4, any convex risk measure $\rho :L^{p}\to \mathbb{R}$ is aggregation robust. This assertion remains true when replacing in Definition 4.6 the map $A_{d}(x_{1},\ldots ,x_{d}):=\sum _{i=1}^{d}x_{i}$ by any other of the maps introduced in Example 4.4.

In their Example 2.2, Embrechts et al. [17] demonstrated that for any $\alpha \in (0,1)$, the value at risk $\mathrm{VaR}_{\alpha}:L^{0}\to \mathbb{R}$ is not aggregation robust. The following example extends the first part of that example (from $\alpha =1/2$ to general $\alpha \in (0,1)$) and shows that for any $\alpha \in (0,1)$ and $p\in \mathbb{R}_{+}$, the aggregation risk functional $\mathcal{R}_{\mathrm{VaR}_{\alpha},A_{2}}:\mathcal{M}_{2}^{p}\to \mathbb{R}$ is not copula robust. The example is in particular interesting in that it shows that even if the marginal distributions $\mu _{1},\ldots ,\mu _{d}$ possess Lebesgue densities and unique quantiles, the map $C\mapsto \mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{d}}(C,\mu _{1},\ldots , \mu _{d})$ need not be continuous when choosing $A_{d}(x_{1},\ldots ,x_{d}):=\sum _{i=1}^{d}x_{i}$ (here $\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{d}}$ is derived from $\mathcal{R}_{\rho ,A_{d}}$ as $\mathfrak{T}_{d}$ is derived from $\mathcal{T}_{d}$ in (3.1)). The point here is that for random variables $X_{1},\ldots ,X_{d}$ with distributions $\mu _{1},\ldots ,\mu _{d}$, the distribution of $\sum _{i=1}^{d}X_{i}$ can be discrete even if $\mu _{1},\ldots ,\mu _{d}$ possess Lebesgue densities. This fact has already been pointed out in [17].

Example 4.7

Generalising the first part of Embrechts et al. [17, Example 2.2], define for $\alpha \in (0,1/2]$ a bivariate copula $C_{0}^{(\alpha )}$ through

$$\begin{aligned} &C_{0}^{(\alpha )}(u_{1},u_{2}) \\ &\ := \max \big\{ \min \{u_{1},\alpha \}+\min \{u_{2},\alpha \}- \alpha ,0\big\} +\max \{u_{1}+u_{2}-(1+\alpha ),0 \}, \end{aligned}$$

let $C_{1}$ be the bivariate independence copula, i.e., $C_{1}(u_{1},u_{2}):=u_{1}u_{2}$, and define for any $t\in [0,1]$ the copula $C_{t}^{(\alpha )}$ as a mixture of $C_{0}^{(\alpha )}$ and $C_{1}$ via

$$ C_{t}^{(\alpha )}(u_{1},u_{2}):= (1-t)\,C_{0}^{(\alpha )}(u_{1},u_{2})+t \,C_{1}(u_{1},u_{2}). $$

Moreover, for any $t\in [0,1]$, let $\hat{C}_{t}^{(\alpha )}$ be the survival copula of $C_{t}^{(\alpha )}$ which is defined by $\hat{C}_{t}^{(\alpha )}(u_{1},u_{2}):=u_{1}+u_{2}-1+C_{t}^{(\alpha )}(1-u_{1},1-u_{2})$. Finally, let $\mu _{1}:=\mu _{2}:=\mathrm{U}_{[0,1]}$ as well as $\hat{\mu}_{1}:=\hat{\mu}_{2}:=\mathrm{U}_{[-1,0]}$, where $\mathrm{U}_{I}$ is used to denote the uniform distribution on $I$. Then the following two assertions are valid:

(i) $\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{0}^{(\alpha )},\mu _{1}, \mu _{2})=\alpha $ and $\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{t}^{(\alpha )},\mu _{1}, \mu _{2})=\sqrt{2\alpha}$ for any $t\in (0,1]$. Therefore we have $\lim _{t\searrow 0}C_{t}^{(\alpha )}=C_{0}^{(\alpha )}$ uniformly, but

$$ \lim _{t\searrow 0}\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{t}^{( \alpha )},\mu _{1},\mu _{2})\neq\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{0}^{( \alpha )},\mu _{1},\mu _{2}). $$

(ii) $\mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(\hat{C}_{0}^{(\alpha )}, \hat{\mu}_{1},\hat{\mu}_{2})=-1-\alpha $ and $\mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(\hat{C}_{t}^{(\alpha )}, \hat{\mu}_{1},\hat{\mu}_{2})=-\sqrt{2\alpha}$ for any $t\in (0,1]$. Therefore we have that $\lim _{t\searrow 0}\hat{C}_{t}^{(\alpha )}=\hat{C}_{0}^{(\alpha )}$ uniformly, but

$$ \lim _{t\searrow 0}\mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(\hat{C}_{t}^{( \alpha )},\hat{\mu}_{1},\hat{\mu}_{2})\neq\mathfrak{R}_{\mathrm{VaR}_{1- \alpha},A_{2}}(\hat{C}_{0}^{(\alpha )},\hat{\mu}_{1},\hat{\mu}_{2}).$$

For details, see Appendix A.11.
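The jump in assertion (i) can be reproduced numerically. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, and it uses that under the mixture copula $C_{t}^{(\alpha )}$ with $\mathrm{U}_{[0,1]}$ marginals, the distribution function of $X_{1}+X_{2}$ is the corresponding mixture of the two-point distribution function (under $C_{0}^{(\alpha )}$) and the triangular one (under $C_{1}$). The lower $\alpha $-quantile is then located by bisection.

```python
import math


def cdf_sum_c0(x, alpha):
    """Distribution function of U1 + U2 under C_0^(alpha): mass alpha at alpha,
    mass 1 - alpha at 1 + alpha."""
    if x < alpha:
        return 0.0
    if x < 1.0 + alpha:
        return alpha
    return 1.0


def cdf_sum_indep(x):
    """Distribution function of the sum of two independent U[0,1] variables."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return 0.5 * x * x
    if x <= 2.0:
        return 1.0 - 0.5 * (2.0 - x) ** 2
    return 1.0


def var_sum(alpha, t, iterations=200):
    """Lower alpha-quantile of U1 + U2 under the mixture copula C_t^(alpha)."""
    def cdf(x):
        return (1.0 - t) * cdf_sum_c0(x, alpha) + t * cdf_sum_indep(x)

    lo, hi = -1.0, 2.0  # cdf(lo) = 0 < alpha <= 1 = cdf(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) >= alpha:
            hi = mid  # mid belongs to {x : F(x) >= alpha}
        else:
            lo = mid
    return hi
```

For $\alpha \in (0,1/2]$, `var_sum(alpha, 0.0)` returns $\alpha $ up to numerical precision, whereas `var_sum(alpha, t)` stays at $\sqrt{2\alpha}$ for every $t\in (0,1]$, however small.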

It is worth commenting on the copulas $C_{0}^{(\alpha )}$, $C_{1}$ and $C_{t}^{(\alpha )}$ in the preceding example. The copula $C_{1}$ is well known; it is simply the distribution function of the uniform distribution on $[0,1]^{2}$. The copula $C_{0}^{(\alpha )}$ is the distribution function of the ‘uniform distribution’ on the union of the two disjoint line segments $S_{1}^{\alpha}$ and $S_{2}^{\alpha}$ with endpoints $(\alpha ,0),(0,\alpha )$ and $(1,\alpha ),(\alpha ,1)$, respectively (see Appendix A.11 for the precise definition). Thus $C_{t}^{(\alpha )}$ is the distribution function of the Borel probability measure on $[0,1]^{2}$ that is defined as a convex combination (with coefficients $t$ and $1-t$) of the uniform distribution on $[0,1]^{2}$ and the ‘uniform distribution’ on $S_{1}^{\alpha}\uplus S_{2}^{\alpha}$. For a visualisation of $C_{0}^{(\alpha )}$, see Fig. 1, and note that the distribution of the sum of two $\mathrm{U}_{[0,1]}$-distributed random variables coupled via $C_{0}^{(\alpha )}$ is the two-point distribution $\alpha \delta _{\alpha}+(1-\alpha )\delta _{1+\alpha}$.
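The description of $C_{0}^{(\alpha )}$ as a 'uniform distribution' on two line segments translates into a simple sampling scheme. The following Python sketch is an illustration only: the function name and the parametrisation of the segments are assumptions made here. With probability $\alpha $ it picks a uniform point on $S_{1}^{\alpha}$, otherwise a uniform point on $S_{2}^{\alpha}$.

```python
import random


def sample_c0(alpha, n, seed=0):
    """Draw n points from the distribution on S_1 and S_2 with copula C_0^(alpha)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        v = rng.random()
        if rng.random() < alpha:
            # segment S_1 from (0, alpha) to (alpha, 0): coordinates sum to alpha
            points.append((alpha * v, alpha * (1.0 - v)))
        else:
            # segment S_2 from (alpha, 1) to (1, alpha): coordinates sum to 1 + alpha
            points.append((alpha + (1.0 - alpha) * v, 1.0 - (1.0 - alpha) * v))
    return points
```

Each coordinate is uniform on $[0,\alpha ]$ with probability $\alpha $ and uniform on $[\alpha ,1]$ with probability $1-\alpha $, hence uniform on $[0,1]$, while the sum of the two coordinates only takes the values $\alpha $ and $1+\alpha $.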

4.4 Application to optimal capital and risk allocations

Let $p\in [1,\infty )$. As in the work of Filipović and Svindland [21], consider $d$ agents, or business units, with endowments $X_{1},\ldots ,X_{d}\in L^{p}$. We then assume that these agents assess the riskiness of their positions by means of some convex risk measures $\rho _{1},\ldots ,\rho _{d}:L^{p}\to \mathbb{R}$ (in the sense of Sect. 4.1). In order to minimise the total and individual risk, the agents redistribute the aggregate endowment $X:=\sum _{i=1}^{d}X_{i}$ among themselves. By a redistribution of $X$, we mean any $d$-tuple $(Y_{1},\ldots ,Y_{d})$ of random variables (payoffs) in $L^{p}$ such that $X=\sum _{i=1}^{d}Y_{i}$. A redistribution $(X_{1}^{*},\ldots ,X_{d}^{*})$ is called an optimal capital and risk allocation of $X$ if

$$ \sum _{i=1}^{d}\rho _{i}(X_{i}^{*})=\inf \bigg\{ \sum _{i=1}^{d}\rho _{i}(Y_{i}) : Y_{1},\ldots ,Y_{d}\in L^{p}\mbox{ and }\sum _{i=1}^{d}Y_{i}=X \bigg\} . $$

(4.3)

Here it is assumed that the redistribution is not subject to frictions, i.e., that every redistribution of $X$ is admissible, even if this is not always the case (as pointed out by Filipović and Kupper [20]).

Note that an optimal capital and risk allocation $(X_{1}^{*},\ldots ,X_{d}^{*})$ as a redistribution must satisfy $X=\sum _{i=1}^{d}X_{i}^{*}$. An optimal capital and risk allocation of $X$ need not exist. If it exists, its total risk $\sum _{i=1}^{d}\rho _{i}(X_{i}^{*})$ coincides with the inf-convolution of $\rho _{1},\ldots ,\rho _{d}$ at $X$, denoted by $\mathop {\square }_{i=1}^{d}\rho _{i}(X)$, which is defined by the right-hand side of (4.3). Using the convention $\inf \emptyset =\infty $, the inf-convolution can be seen as a map $\mathop {\square }_{i=1}^{d}\rho _{i}:L^{p}\to (-\infty ,\infty ]$. For background, see [21] and references cited therein.

In the above economic setting, the inf-convolution can also be regarded as a map $\mathop {\boxtimes }_{i=1}^{d}\rho _{i}:L^{p}\times \cdots \times L^{p}\to (- \infty ,\infty ]$ through

$$ \mathop {\boxtimes }_{i=1}^{d}\rho _{i}(X_{1},\ldots ,X_{d}):=\mathop {\square }_{i=1}^{d} \rho _{i}\bigg(\sum _{i=1}^{d}X_{i} \bigg). $$

Since we assumed $\rho _{1},\ldots ,\rho _{d}$ to be convex risk measures on $L^{p}$, a result of Filipović and Svindland [21, Corollary 2.7] ensures that the inf-convolution $\mathop {\square }_{i=1}^{d}\rho _{i}$ is a convex risk measure on $L^{p}$, too (note that in [21, Corollary 2.7], $\mathop {\square }_{i=1}^{d}\rho _{i}$ is exact, and hence it is $\mathbb{R}$-valued if $\rho _{1},\ldots ,\rho _{d}$ are $\mathbb{R}$-valued). As a convex risk measure, $\mathop {\square }_{i=1}^{d}\rho _{i}$ is distribution-invariant, and so is $\mathop {\boxtimes }_{i=1}^{d}\rho _{i}$. Thus we may associate with $\mathop {\boxtimes }_{i=1}^{d}\rho _{i}$ a corresponding functional $\mathcal{R}_{\mathop {\boxtimes }_{i=1}^{d}\rho _{i}}:\mathcal{M}_{d}^{p}\to \mathbb{R}$ through

$$ {\mathcal{R}}_{\mathop {\boxtimes }_{i=1}^{d}\rho _{i}}(\mu ):=\mathcal{R}_{\mathop {\square }_{i=1}^{d} \rho _{i},A_{d}}(\mu )=\mathcal{R}_{\mathop {\square }_{i=1}^{d}\rho _{i}} (\mu \circ A_{d}^{-1} ) $$

with $A_{d}(x_{1},\ldots ,x_{d})=\sum _{i=1}^{d}x_{i}$, where $\mathcal{R}_{\mathop {\square }_{i=1}^{d}\rho _{i}}$ and $\mathcal{R}_{\mathop {\square }_{i=1}^{d}\rho _{i},A_{d}}$ are defined as in (4.1) and (4.2), respectively.

It is worth mentioning that [21, Corollary 2.7] even ensures that for any $X\in L^{p}$, there exists a comonotone optimal capital and risk allocation $(X_{1}^{*},\ldots ,X_{d}^{*})$. This implies that whenever $\rho _{1}=\cdots =\rho _{d}$ and $\rho :=\rho _{1}$ is comonotonic (i.e., finitely additive for all comonotone risks), we have

$$ {\mathcal{R}}_{\mathop {\boxtimes }_{i=1}^{d}\rho}(\mu )=\mathcal{R}_{\rho }(\mu \circ A_{d}^{-1} ) $$

(4.4)

for any $\mu \in{\mathcal{M}}_{d}^{p}$ (see Appendix A.12). Of course, for convex risk measures $\rho $ that are not comonotonic, the representation (4.4) need not apply. An example of a comonotonic convex risk measure is the average value at risk at level $\alpha \in (0,1)$. A counterexample is the $\alpha $-expectile with $\alpha \in (1/2,1)$; see Emmer et al. [18].
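The comonotonic additivity of the average value at risk, on which (4.4) rests, can be checked numerically for empirical distributions. The following Python sketch is an illustration under assumptions made here: the function names are hypothetical and the comonotone pair is generated as $(X,f(X))$ for a nondecreasing function $f$.

```python
import math
import random


def average_value_at_risk(sample, alpha):
    """AVaR_alpha of the empirical distribution of the sample (mass 1/n each)."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for k, x in enumerate(xs, start=1):
        # the empirical quantile function equals x on ((k-1)/n, k/n]
        lo = max((k - 1) / n, alpha)
        hi = k / n
        if hi > lo:
            total += x * (hi - lo)
    return total / (1.0 - alpha)


def comonotone_gap(xs, f, alpha):
    """AVaR(X + f(X)) - AVaR(X) - AVaR(f(X)) for a nondecreasing function f."""
    ys = [f(x) for x in xs]
    total = [x + y for x, y in zip(xs, ys)]
    return (average_value_at_risk(total, alpha)
            - average_value_at_risk(xs, alpha)
            - average_value_at_risk(ys, alpha))
```

For a comonotone pair the quantile function of the sum is the sum of the quantile functions, so `comonotone_gap` should vanish up to rounding error, for instance with `f = math.exp`.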

The following result is a direct consequence of Corollary 4.3 and Example 4.4 since we have seen above that $\mathop {\square }_{i=1}^{d}\rho _{i}$ is a convex risk measure on $L^{p}$ if $\rho _{1},\ldots ,\rho _{d}$ are.

Corollary 4.8

Let $p\in [1,\infty )$ and $\rho _{1},\ldots ,\rho _{d}:L^{p}\to \mathbb{R}$ be convex risk measures. Then $\mathcal{R}_{\mathop {\boxtimes }_{i=1}^{d}\rho _{i}}:\mathcal{M}_{d}^{p}\to \mathbb{R}$ is copula robust.

The following example shows that if the risk measures $\rho _{1},\ldots ,\rho _{d}$ are not assumed to be convex, copula robustness of $\mathcal{R}_{\mathop {\boxtimes }_{i=1}^{d}\rho _{i}}$ may fail; recall that $\mathrm{VaR}_{\alpha}$ is not convex.

Example 4.9

It is known from the work of Embrechts et al. [13, Corollary 2] that $\mathop {\square }_{i=1}^{2}{\mathrm{VaR}}_{\alpha}=\mathrm{VaR}_{2\alpha}$ on $L^{1}$ when $\alpha \in (0,1/2)$. Therefore

$$ {\mathcal{R}}_{\mathop {\boxtimes }_{i=1}^{2}{\mathrm{VaR}}_{\alpha}}(\mu )=\mathcal{R}_{ \mathop {\square }_{i=1}^{2}{\mathrm{VaR}}_{\alpha},A_{2}}(\mu )=\mathcal{R}_{\mathrm{VaR}_{2 \alpha},A_{2}}(\mu )=\mathcal{R}_{\mathrm{VaR}_{2\alpha}}(\mu \circ A_{2}^{-1}) $$

for any $\mu \in{\mathcal{M}}_{2}^{1}$ and $\alpha \in (0,1/2)$. Thus it follows from Example 4.7 that

$$ {\mathcal{R}}_{\mathop {\boxtimes }_{i=1}^{2}{\mathrm{VaR}}_{\alpha}}:\mathcal{M}_{2}^{1}\to \mathbb{R} $$

is not copula robust for any $\alpha \in (0,1/2)$.

5 Example 2: stochastic programming problems

5.1 A class of stochastic programming problems

Adopting the framework of Claus et al. [9], let $\varXi $ be a nonempty and compact subset of $\mathbb{R}^{k}$, $h:\varXi \times \mathbb{R}^{d}\to \mathbb{R}$ a Borel-measurable function and $Z$ an $\mathbb{R}^{d}$-valued random variable on an atomless probability space $(\varOmega ,\mathcal{F},\mathbb{P})$. Let $p\in [1,\infty )$ and assume that $h(\xi ,Z)$ is contained in $L^{p}=L^{p}(\varOmega ,\mathcal{F},\mathbb{P})$ for any $\xi \in \varXi $. Consider the optimisation problem

$$ \min \big\{ \rho \big(h(\xi ,Z)\big) : \xi \in \varXi \big\} , $$

(5.1)

where $\rho :L^{p}\to \mathbb{R}$ is any map. A classical example for $\rho $ is the expectation, i.e., $\rho (Y)=\mathbb{E}[Y]$, where $p=1$. In Sect. 5.2, we consider another example where $\rho $ is a more general monotone, distribution-invariant and convex function on $L^{p}$. Problem (5.1) can be written as $\min \{\mathcal{R}_{\rho}(\mathbb{P}\circ h(\xi ,Z)^{-1}):\xi \in \varXi \}$ or, equivalently, as

$$ \min \big\{ \mathcal{R}_{\rho}\big((\delta _{\xi}\otimes \mu )\circ h^{-1} \big) : \xi \in \varXi \big\} , $$

(5.2)

where $\mathcal{R}_{\rho}$ is derived from $\rho $ as in (4.1) and $\mu $ denotes the distribution of $Z$.

Lemma 5.1 below assumes the following three conditions, where monotonicity, distribution-invariance and convexity are defined as in (i), (iii) and (iv) in Sect. 4.1. Recall that $(\varOmega ,\mathcal{F},\mathbb{P})$ is assumed to be atomless.

(a) $\rho :L^{p}\to \mathbb{R}$, for some $p\in [1,\infty )$, is monotone, distribution-invariant and convex.

(b) $h:\varXi \times \mathbb{R}^{d}\to \mathbb{R}$ is Borel-measurable and limited by an exponent $\gamma \in \mathbb{R}_{++}$.

(c) $(\delta _{\xi}\otimes \mu )[D_{h}]=0$ for any $\xi \in \varXi $ and $\mu \in{\mathcal{M}}_{d}^{\gamma p}$.

The second requirement in (b) means that there exists some locally bounded map $\eta :\varXi \to (0,\infty )$ such that $|h(\xi ,z)|\le \eta (\xi )(1+|z|)^{\gamma}$ for all $(\xi ,z)\in \varXi \times \mathbb{R}^{d}$. In (c), the set $D_{h}$ is the set of all discontinuity points of $h$. Under conditions (a) and (b), the map $\mathcal{Q}_{\rho ,h}:\varXi \times \mathcal{M}_{d}^{\gamma p}\to \mathbb{R}$ given by

$$ \mathcal{Q}_{\rho ,h}(\xi ,\mu ):=\mathcal{R}_{\rho} \big((\delta _{\xi} \otimes \mu )\circ h^{-1}\big) $$

is well defined. The following lemma is known from Claus et al. [9, Theorem 5.2].

Lemma 5.1

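If conditions (a)–(c) hold true, then the map $\mathcal{Q}_{\rho ,h}:\varXi \times \mathcal{M}_{d}^{\gamma p}\to \mathbb{R}$ is $((\mathcal{O}_{\mathbb{R}^{k}}\cap \varXi )\times \mathcal{O}_{d}^{\gamma p},\mathcal{O}_{\mathbb{R}})$-continuous.

Lemma 5.1 can be used to obtain the following result on the map

$$\begin{aligned} &\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{\gamma p}\to \mathbb{R}\cup \{-\infty \}, \\ &\mathcal{R}_{\rho ,h}(\mu ):=\inf \big\{ \mathcal{Q}_{\rho ,h}(\xi ,\mu ) : \xi \in \varXi \big\} . \end{aligned}$$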

(5.3)

Recall that the set $\varXi $ was assumed to be compact.

Theorem 5.2

If conditions (a)–(c) hold true, then the infimum in (5.3) is attained for any $\mu \in{\mathcal{M}}_{d}^{\gamma p}$, and the map $\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{\gamma p}\to \mathbb{R}$ is $(\mathcal{O}_{d}^{\gamma p},\mathcal{O}_{\mathbb{R}})$-continuous.

Note here that if the infimum in (5.3) is attained, then $\mathcal{R}_{\rho ,h}(\mu )$ is the optimal value of problem (5.2). Theorem 5.2 is a variant of Claus et al. [9, Corollary 2.4].

Remark 5.3

The $(\mathcal{O}_{d}^{\gamma p},\mathcal{O}_{\mathbb{R}})$-continuity of $\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{\gamma p}\to \mathbb{R}$ obtained in the preceding theorem can be seen as robustness of $\rho $ relative to $(\mathcal{G},Z,\pi _{d}^{\gamma p})$ in the sense of Embrechts et al. [16, Definition 1], where $\mathcal{G}:=\{h(\xi ,\,\cdot \,):\xi \in \varXi \}$ and $\pi _{d}^{\gamma p}$ is any metric metrising the $(p\gamma )$-weak topology $\mathcal{O}_{d}^{\gamma p}$.

5.2 Example: one-period mean–risk portfolio optimisation

Consider a one-period financial market consisting of one riskless bond and $d$ risky assets with prices per unit $S_{0}^{0}:=1$ and $S_{0}^{1},\ldots ,S_{0}^{d}\in \mathbb{R}_{++}$ at time 0. Between time 0 and time 1, the prices change to $S_{1}^{0},S_{1}^{1},\ldots ,S_{1}^{d}$ according to $S_{1}^{i}=Z^{i}S_{0}^{i}$, $i=0,\ldots ,d$, where the bond’s relative price change $Z^{0}$ is deterministic ($\in \mathbb{R}_{++}$) and known at time 0 and the assets’ relative price changes $Z^{1},\ldots ,Z^{d}$ are $\mathbb{R}_{+}$-valued random variables on a common atomless probability space $(\varOmega ,\mathcal{F},\mathbb{P})$ and are unobservable at time 0. Let $x_{0}\in \mathbb{R}_{++}$ be an amount of capital to be invested in the bond and in the $d$ assets at time 0. If for any $i=1,\ldots ,d$, the amount of capital invested in the asset $i$ is denoted by $\xi _{i}$, then the amount of capital invested in the bond is $\xi _{0}:=x_{0}-\langle \xi ,\boldsymbol{1}\rangle $, where $\xi :=(\xi _{1},\ldots ,\xi _{d})$ and $\boldsymbol{1}:=(1,\ldots ,1)\in \mathbb{R}^{d}$. When identifying a portfolio with the corresponding amounts of capital $\xi _{1},\ldots ,\xi _{d}$ and assuming that taking loans and short selling are banned, the set

$$ \varXi :=\big\{ (\xi _{1},\ldots ,\xi _{d})\in \mathbb{R}_{+}^{d} : \langle \xi ,\boldsymbol{1}\rangle \le x_{0}\big\} $$

(5.4)

can be seen as the set of all admissible portfolios. The realised loss at time 1 of a portfolio $\xi =(\xi _{1},\ldots ,\xi _{d})\in \varXi $ is given by

$$ h(\xi ,z):=(x_{0}-\langle \xi ,\boldsymbol{1}\rangle )(1-Z^{0})+ \langle \xi ,\boldsymbol{1}-z\rangle , $$

(5.5)

when $z=(z_{1},\ldots ,z_{d})$ is the vector of the assets’ realised relative price changes, i.e., the realisation of $Z:=(Z^{1},\ldots ,Z^{d})$.

Of course, the portfolio $\xi =(\xi _{1},\ldots ,\xi _{d})\in \varXi $ should be chosen such that the expected profit is as high as possible, i.e., such that the expected loss $\mathbb{E}[h(\xi ,Z)]$ is as small as possible. Simultaneously the portfolio’s downside risk should be as small as possible, where the downside risk can be measured by $\sigma (h(\xi ,Z))$ for a suitable given ‘downside’ risk measure $\sigma :L^{p}\to \mathbb{R}$. This leads to the mean–risk model

$$ \min \{\mathbb{E}[h(\xi ,Z)]+\kappa \,\sigma (h(\xi ,Z)) : \xi \in \varXi \}, $$

(5.6)

where $\kappa \in \mathbb{R}_{++}$ is a risk aversion parameter. Note that the model (5.6) aims at minimising the weighted sum of two competing objectives and is in line with Markowitz’ [32] classical mean–variance optimisation theory (where $\sigma (h(\xi ,Z))=\mathbb{V}\mathrm{ar}[h(\xi ,Z)]$). It is also worth mentioning that mean–risk models are related to the corresponding multiobjective optimisation problems; see for instance the works of Ogryczak and Ruszczyński [37, 38] and Schultz and Tiedemann [46].
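The objective in (5.6) can be evaluated on a finite set of scenarios for $Z$. The following is a minimal sketch, assuming that $\sigma $ is the Average Value at Risk of an empirical loss distribution; this choice of $\sigma $, the function names and the parameter `level` are illustrative assumptions and are not prescribed by the text.

```python
import math

def loss(xi, z, x0, z0):
    """Realised loss h(xi, z) from (5.5) for portfolio xi and price changes z."""
    invested = sum(xi)
    return (x0 - invested) * (1.0 - z0) + sum(a * (1.0 - zi) for a, zi in zip(xi, z))

def is_admissible(xi, x0):
    """Membership in the set (5.4): no short selling, no loans."""
    return all(a >= 0.0 for a in xi) and sum(xi) <= x0

def avar(losses, level):
    """Average Value at Risk of an empirical loss distribution, 0 < level < 1.

    Averages the upper (1 - level) fraction of the sorted sample and splits
    the boundary observation when (1 - level) * n is not an integer.
    """
    ordered = sorted(losses, reverse=True)
    mass = (1.0 - level) * len(ordered)      # number of tail observations
    whole = int(math.floor(mass + 1e-12))
    tail = sum(ordered[:whole])
    if whole < len(ordered) and mass > whole:
        tail += (mass - whole) * ordered[whole]
    return tail / mass

def mean_risk_objective(xi, scenarios, x0, z0, kappa, level):
    """Empirical version of the objective in (5.6) with sigma = AVaR (assumed)."""
    losses = [loss(xi, z, x0, z0) for z in scenarios]
    return sum(losses) / len(losses) + kappa * avar(losses, level)
```

Minimising `mean_risk_objective` over admissible `xi` (for instance on a grid) then gives a sample-based approximation of the minimal value in (5.6).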

The mean–risk model (5.6) coincides with problem (5.2) when $Z$ is distributed according to $\mu $ and $\rho :L^{p}\to \mathbb{R}$ is defined by

$$ \rho (Y):=\mathbb{E}[Y]+\kappa \,\sigma (Y), $$

(5.7)

where one should note that $\rho $ is monotone, distribution-invariant and convex if $\sigma $ is. For any fixed $p\in [1,\infty )$, the following corollary is a simple consequence of Theorem 5.2; see Appendix A.14.

Corollary 5.4

Let $\sigma :L^{p}\to \mathbb{R}$ be monotone, distribution-invariant and convex. For any $\mu \in{\mathcal{M}}_{d}^{p}$, let $\mathcal{R}_{\rho ,h}(\mu )$ be defined by (5.3) (and (5.7)). Then the infimum on the right-hand side of (5.3) is attained (and thus finite) for any $\mu \in{\mathcal{M}}_{d}^{p}$, and the map $\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{p}\to \mathbb{R}$ is $(\mathcal{O}_{d}^{p},\mathcal{O}_{\mathbb{R}})$-continuous.

Remark 5.5

If we use $\mathfrak{A}_{d}$ to denote the set of all functions $h(\xi ,\,\cdot \,):\mathbb{R}^{d}\to \mathbb{R}$, $\xi \in \varXi $, then $\mathcal{R}_{\rho ,h}=\mathcal{R}_{\rho ,\mathfrak{A}_{d}}$ for the functional $\mathcal{R}_{\rho ,\mathfrak{A}_{d}}$ defined by (1.3), i.e., by

$$ {\mathcal{R}}_{\rho ,\mathfrak{A}_{d}}(\mu ):=\inf \{\mathcal{R}_{\rho}(\mu \circ A_{d}^{-1}) : A_{d}\in \mathfrak{A}_{d} \}=\inf \{\mathcal{R}_{ \rho ,A_{d}}(\mu ) : A_{d}\in \mathfrak{A}_{d} \}. $$

Here $\mathcal{R}_{\rho}$ is derived from $\rho $ as in (4.1), and $\mathcal{R}_{\rho ,A_{d}}$ is derived from $\mathcal{R}_{\rho}$ and $A_{d}$ as in (4.2).

5.3 Copula robustness of stochastic programming problems

In the setting of Sect. 5.1, assume that conditions (a)–(c) are satisfied and recall that $\varXi $ was assumed to be compact. Then by Theorem 5.2, the map $\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{\gamma p}\to \mathbb{R}$ is $(\mathcal{O}_{d}^{\gamma p},\mathcal{O}_{\mathbb{R}})$-continuous. Together with Theorem 3.12, this leads to the following result.

Corollary 5.6

If conditions (a)–(c) hold true, then the map $\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{\gamma p}\to \mathbb{R}$ is copula robust.

Corollary 5.6 shows that under conditions (a)–(c), the minimal value of problem (5.2) is robust with respect to slight changes in the copula of $\mu $.

Example 5.7

Let us return to the specific setting of Sect. 5.2 (mean–risk portfolio optimisation), where $\mu $ played the role of the joint distribution of the relative price changes $(Z^{1},\ldots ,Z^{d})$. In this framework, it can be seen in the proof of Corollary 5.4 that conditions (a)–(c) are satisfied for $\gamma =1$ when $\sigma :L^{p}\to \mathbb{R}$ is monotone, distribution-invariant and convex. Thus under the latter assumptions on $\sigma $, Corollary 5.6 ensures that the functional $\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{p}\to \mathbb{R}$ is copula robust. Of course, the copula robustness of $\mathcal{R}_{\rho ,h}$ also directly follows from Theorem 3.12 and Corollary 5.4.
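The statement that the minimal value of (5.6) reacts only slightly to a small change in the copula can be illustrated numerically. The sketch below is a rough Monte Carlo experiment and not a proof: it assumes $d=2$, log-normal marginals, a Gaussian copula whose correlation parameter is perturbed from 0.50 to 0.52, an empirical Average Value at Risk as $\sigma $, and a grid search over the admissible set (5.4). All numerical parameters and function names are hypothetical.

```python
import math
import random

def sample_price_changes(n, corr, seed):
    """Draw n pairs (Z^1, Z^2) with fixed log-normal marginals and a Gaussian
    copula with correlation `corr` (all parameters are illustrative)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        g1, e = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        g2 = corr * g1 + math.sqrt(1.0 - corr * corr) * e
        out.append((math.exp(0.05 + 0.2 * g1), math.exp(0.03 + 0.1 * g2)))
    return out

def objective(xi, scenarios, x0, z0, kappa, level):
    """Empirical E[h] + kappa * AVaR(h) with the loss h from (5.5)."""
    losses = sorted(
        ((x0 - sum(xi)) * (1.0 - z0) + sum(a * (1.0 - z) for a, z in zip(xi, zs))
         for zs in scenarios),
        reverse=True,
    )
    k = max(1, round((1.0 - level) * len(losses)))   # size of the upper tail
    return sum(losses) / len(losses) + kappa * sum(losses[:k]) / k

def minimal_value(scenarios, x0=1.0, z0=1.01, kappa=0.2, level=0.9, steps=10):
    """Grid approximation of the minimal value over the set (5.4) for d = 2."""
    grid = [x0 * i / steps for i in range(steps + 1)]
    return min(
        objective((a, b), scenarios, x0, z0, kappa, level)
        for a in grid for b in grid if a + b <= x0 + 1e-12
    )

# Same seed for both runs, so only the copula parameter differs.
base = minimal_value(sample_price_changes(2000, 0.50, seed=1))
perturbed = minimal_value(sample_price_changes(2000, 0.52, seed=1))
print(base, perturbed, abs(base - perturbed))
```

Using the same seed in both runs keeps the sampling noise common to both values, so the printed difference mainly reflects the change in the copula parameter.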

6 Example 3: multi-period portfolio optimisation

In this section, the objective is to show that the maximal expected utility of the terminal wealth of a portfolio in a multi-period financial market model (see Sect. 6.2) is copula robust if it is regarded as a function of the joint distribution of the assets’ relative price changes. The terminal wealth portfolio optimisation problem can be regarded as a Markov decision problem as introduced in the textbook by Bäuerle and Rieder [1, Chaps. 1 and 2] and in other standard monographs. To prove the main result of this section (Corollary 6.8), it is therefore useful to first establish a variant of a result of Müller [34, Theorem 4.2] about the dependence of the value function on the Markov transition probability function. This variant can be found in Theorem 6.2 and is of independent interest. It is worth pointing out that the factor $1/\psi (x)$ on the right-hand side of (6.3) is essential for our purposes; see the proof of Corollary 6.7.

6.1 Groundwork: a class of Markov decision models

6.1.1 Basic notation and terminology

Let $(E,\mathcal{E})$ be a measurable space, to be regarded as the state space, and $N\in \mathbb{N}$ the fixed finite planning horizon. For each $n=0,\ldots ,N-1$ and $x\in E$, let $A_{n}(x)$ be a nonempty set whose elements are regarded as the admissible actions at time $n$ in state $x$. For each $n=0,\ldots ,N-1$, let $A_{n}:=\bigcup _{x\in E} A_{n}(x)$ and $D_{n}:=\{(x,a)\in E \times A_{n} : a\in A_{n}(x)\}$. The elements of $A_{n}$ can be seen as the actions that may basically be selected at time $n$, whereas the elements of $D_{n}$ are the possible state–action combinations at time $n$. We equip $A_{n}$ with a $\sigma $-algebra $\mathcal{A}_{n}$ and $D_{n}$ with the trace $\sigma $-algebra $\mathcal{D}_{n}:=(\mathcal{E}\otimes{\mathcal{A}}_{n})\cap D_{n}$. We use $\boldsymbol{A}$ to denote the family that consists of all sets $A_{n}(x)$, $n=0,\ldots ,N-1$, $x\in E$, and of all $\sigma $-algebras $\mathcal{A}_{n}$, $n=0,\ldots ,N-1$. All the sets and spaces just introduced are fully determined by $(E,\mathcal{E})$ and $\boldsymbol{A}$. Although all the objects introduced in what follows depend on $(E,\mathcal{E})$ and $\boldsymbol{A}$, we suppress this dependence in the notation.

By a (Markov decision) transition function associated with $(E,\mathcal{E})$ and $\boldsymbol{A}$, we mean an $N$-tuple $P=(P_{n})_{n=0}^{N-1}$, where $P_{n}$ is a probability kernel from $(D_{n},\mathcal{D}_{n})$ to $(E,\mathcal{E})$, to be seen as the one-step transition kernel at time $n$. The set of all transition functions is denoted by $\overline{\mathcal{P}}$. The actions are governed by a so called $N$-stage strategy, i.e., by an $N$-tuple $\pi =(f_{0},\ldots ,f_{N-1})$ where $f_{n}$ is a decision rule at time $n$, i.e., an $(\mathcal{E},\mathcal{A}_{n})$-measurable map $f_{n}:E\rightarrow A_{n}$ satisfying $f_{n}(x)\in A_{n}(x)$ for all $x\in E$. Let $F_{n}$ be a nonempty set of decision rules at time $n$, and define the set of all ‘admissible’ strategies by $\varPi :=F_{0}\times \cdots \times F_{N-1}$.

For any $P=(P_{n})_{n=0}^{N-1}\in \overline{\mathcal{P}}$, $\pi =(f_{n})_{n=0}^{N-1}\in \varPi $ and $n=0,\ldots ,N-1$, define the probability kernel $P_{n}^{\pi}$ from $(E,\mathcal{E})$ to $(E,\mathcal{E})$ by $P_{n}^{\pi}(x,B) := P_{n}((x,f_{n}(x)),B)$, $x\in E$, $B\in{\mathcal{E}}$. The probability measure $P_{n}^{\pi}(x,\,\cdot \,)$ can be seen as the one-step transition probability at time $n$ given state $x$ when the actions are chosen according to $\pi $. On the measurable space $(\varOmega ,\mathcal{F}):=(E^{N+1},\mathcal{E}^{\otimes (N+1)})$, we can define for any $x_{0}\in E$ and $\pi \in \varPi $ the probability measure $\mathbb{P}^{x_{0},P;\pi}:=\delta _{x_{0}}\otimes P_{0}^{\pi}\otimes \cdots \otimes P_{N-1}^{\pi}$, where the right-hand side is the usual product of the probability measure $\delta _{x_{0}}$ and the kernels $P_{0}^{\pi},\ldots ,P_{N-1}^{\pi}$. Under the probability measure $\mathbb{P}^{x_{0},P;\pi}$, the identity map $X=(X_{n})_{n=0}^{N}$ on $\varOmega $ is called Markov decision process (MDP) associated with initial state $x_{0}$, transition function $P$ and strategy $\pi $.

Let $r_{n}:D_{n}\to \mathbb{R}$ be a $(\mathcal{D}_{n},\mathcal{B}(\mathbb{R}))$-measurable map, referred to as one-stage reward function, and $r_{N}:E\to \mathbb{R}$ an $(\mathcal{E},\mathcal{B}(\mathbb{R}))$-measurable map, referred to as terminal reward function. Here $r_{n}(x,a)$ specifies the one-stage reward when action $a$ is taken at time $n$ in state $x$, and $r_{N}(x)$ specifies the reward of being in state $x$ at the terminal time $N$. Finally, set $\vec{\boldsymbol{r}}:=(r_{n})_{n=0}^{N}$.

For any fixed subset $\mathcal{P}\subseteq \overline{\mathcal{P}}$, the collection of the objects $(E,\mathcal{E})$, $\boldsymbol{A}$, $\varPi $, $\mathcal{P}$, $\{\mathbb{P}^{x_{0},P;\pi} : x_{0}\in E,\,P\in \mathcal{P},\,\pi \in \varPi \}$, $X$ and $\vec{\boldsymbol{r}}$ introduced so far is often referred to as a Markov decision model. In fact, in the standard literature, the set $\mathcal{P}$ is typically a singleton. If, however, there is uncertainty with respect to the ‘true’ transition function, then one should allow a whole bundle of transition functions in the model.

6.1.2 Intrinsic optimisation problem

We assume that $r_{k}(X_{k},f_{k}(X_{k}))$, $k=0,\ldots ,N-1$, and $r_{N}(X_{N})$ are $\mathbb{P}^{x_{0},P;\pi}$-integrable for any $x_{0}\in E$, $P\in{\mathcal{P}}$, $\pi \in \varPi $ (for a sufficient condition, see Lemma 6.1 below). As a consequence, we can define for any $P\in{\mathcal{P}}$ and $\pi =(f_{n})_{n=0}^{N-1}\in \varPi $ an $(\mathcal{E},\mathcal{B}(\mathbb{R}))$-measurable map $V_{0}^{P;\pi}:E\to \mathbb{R}$ by

$$ V_{0}^{P;\pi}(x_{0}):=\mathbb{E}^{x_{0},P;\pi}\bigg[\sum _{k=0}^{N-1}r_{k}(X_{k},f_{k}(X_{k}))+r_{N}(X_{N})\bigg]. $$

The value $V_{0}^{P;\pi}(x_{0})$ specifies the expected total reward of $X$ under $\mathbb{P}^{x_{0},P;\pi}$. Here ‘under $\mathbb{P}^{x_{0},P;\pi}$’ means that $X$ starts in $x_{0}$ and that the random transitions of $X$ are governed by $P$ and $\pi $. For fixed $P\in{\mathcal{P}}$, it is natural to look for those strategies $\pi \in \varPi $ for which the expected total reward from time 0 to $N$ is maximal for any given initial state $x_{0}\in E$. This results in the optimisation problem

$$ \max \{V_{0}^{P;\pi}(x_{0}) : \pi \in \varPi \}. $$

(6.1)

We assume that $\sup _{\pi \in \varPi}V_{0}^{P;\pi}(x_{0})<\infty $ for any $x_{0}\in E$, which means that it is impossible to gain an arbitrarily high reward. A strategy $\pi ^{P}\in \varPi $ is said to be optimal for (6.1) if $V_{0}^{P;\pi ^{P}}(x_{0})=V_{0}^{P}(x_{0})$ for any $x_{0}\in E$, where the map $V_{0}^{P}:E\to \mathbb{R}$ is defined by $V_{0}^{P}(x_{0}):=\sup _{\pi \in \varPi}V_{0}^{P;\pi}(x_{0})$. The map $V_{0}^{P}$ is referred to as the value function.

Some known facts about the existence of optimal strategies are recalled in Appendix C. Part (i) of Theorem C.1 shows that under some assumptions, the value function can be obtained by the Bellman iteration scheme. The latter involves the time-$n$ value functions $V_{n}^{P}$, $n=1,\ldots ,N$, defined by $V_{n}^{P}(x):=\sup _{\pi \in \varPi}V_{n}^{P;\pi}(x)$, where for any $\pi =(f_{n})_{n=0}^{N-1}\in \varPi $ the $(\mathcal{E},\mathcal{B}(\mathbb{R}))$-measurable map $V_{n}^{P;\pi}:E\to \mathbb{R}$ is defined by $V_{n}^{P;\pi}(x):=\mathbb{E}^{x_{0},P;\pi}\big[\sum _{k=n}^{N-1}r_{k}(X_{k},f_{k}(X_{k}))+r_{N}(X_{N})\,\big|\,X_{n}=x\big]$ (note that the right-hand side is independent of $x_{0}\in E$). Here and in the following, we use the convention $\sum _{n=N}^{N-1}:=0$. The maps $V_{n}^{P;\pi} (\, \cdot \,)$, $\pi \in \varPi $, are sometimes called policy value functions and appear in Theorem 6.2.
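For a model with finitely many states and actions, the Bellman iteration mentioned above can be written out directly. The following sketch assumes a finite state space $\{0,\ldots ,S-1\}$, finite action sets and the set of all decision rules as $F_{n}$; these are simplifying assumptions, and the function and argument names are hypothetical. The general conditions under which the iteration is valid are those of Appendix C.

```python
def backward_induction(n_states, actions, transition, reward, terminal_reward, horizon):
    """Compute V_n(x) for n = 0..horizon by the Bellman recursion
    V_N = r_N,  V_n(x) = max_a { r_n(x, a) + sum_y P_n((x, a), y) V_{n+1}(y) }.

    actions(n, x)        -> iterable of admissible actions A_n(x)
    transition(n, x, a)  -> list of probabilities over states 0..n_states-1
    reward(n, x, a)      -> one-stage reward r_n(x, a)
    """
    values = [None] * (horizon + 1)
    rules = [None] * horizon
    values[horizon] = [terminal_reward(x) for x in range(n_states)]
    for n in range(horizon - 1, -1, -1):
        values[n], rules[n] = [], []
        for x in range(n_states):
            best_a, best_v = None, float("-inf")
            for a in actions(n, x):
                probs = transition(n, x, a)
                v = reward(n, x, a) + sum(p * w for p, w in zip(probs, values[n + 1]))
                if v > best_v:
                    best_a, best_v = a, v
            values[n].append(best_v)
            rules[n].append(best_a)
    return values, rules

# Toy model: two states; action 0 stays, action 1 moves to either state with probability 1/2.
def toy_transition(n, x, a):
    if a == 0:
        return [1.0, 0.0] if x == 0 else [0.0, 1.0]
    return [0.5, 0.5]

values, rules = backward_induction(
    n_states=2,
    actions=lambda n, x: (0, 1),
    transition=toy_transition,
    reward=lambda n, x, a: 0.0,
    terminal_reward=lambda x: float(x),
    horizon=2,
)
```

Here `values[n][x]` plays the role of $V_{n}^{P}(x)$, and `rules[n][x]` is a maximising action at time $n$ in state $x$, so `rules` encodes a strategy $\pi =(f_{0},\ldots ,f_{N-1})$.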

6.1.3 Bounding function

For the Markov decision model introduced above and $P\in{\mathcal{P}}$, an $(\mathcal{E},\mathcal{B}([1,\infty )))$-measurable function $\psi :E\to [1,\infty )$ is called a bounding function for $P$ if there exist constants $K_{1},K_{2},K_{3}\in \mathbb{R}_{+}$ such that the following three assertions hold:

(a) $|r_{n}(x,a)| \le K_{1} \psi (x)$ for any $n=0,\ldots ,N-1$ and $(x,a)\in D_{n}$;

(b) $|r_{N}(x)| \le K_{2} \psi (x)$ for any $x\in E$;

(c) $\int _{E}\psi (y)\,P_{n}((x,a),dy)\le K_{3} \psi (x)$ for any $n=0,\ldots ,N-1$ and $(x,a)\in D_{n}$.

This terminology is adapted from the work of Müller [34, Definition 2.4] and the textbook by Bäuerle and Rieder [1, Definition 2.4.1]. Denote by $\mathbb{M}(E)$ the set of all $(\mathcal{E},\mathcal{B}(\mathbb{R}))$-measurable maps $v:E\to \mathbb{R}$, and by $\mathbb{M}_{\psi}(E)$ the set of all $v\in \mathbb{M}(E)$ satisfying $\|v\|_{\psi}<\infty $, where $\|v\|_{\psi}:=\sup _{x\in E}|v(x)|/\psi (x)$.

Lemma 6.1

Let $P\in{\mathcal{P}}$. If there exists a bounding function $\psi $ for $P$, then the random variables $r_{k}(X_{k},f_{k}(X_{k}))$, $k=0,\ldots ,N-1$, and $r_{N}(X_{N})$ are $\mathbb{P}^{x_{0},P;\pi}$-integrable for any $x_{0}\in E$ and $\pi \in \varPi $, and moreover, $\|V_{n}^{P}\|_{\psi}<\infty $ (in particular, $V_{n}^{P;\pi}\in \mathbb{M}_{\psi}(E)$ for any $\pi \in \varPi $) for any $n=0,\ldots ,N-1$.

6.1.4 Continuous dependence of the optimal value on the transition function

Let $\psi :E\to [1,\infty )$ be an $(\mathcal{E},\mathcal{B}([1,\infty )))$-measurable function, and note that the integral $\int _{E} v\,d\mathfrak{m}$ exists and is finite for any $v\in \mathbb{M}_{\psi}(E)$ and $\mathfrak{m}\in{\mathcal{M}}_{1}^{\psi}(E)$, the set of all probability measures on $(E,\mathcal{E})$ with $\int \psi \,d\mathfrak{m}<\infty $. For any fixed subset $\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)$, the distance between $\mathfrak{m}_{1}$ and $\mathfrak{m}_{2}$ from $\mathcal{M}_{1}^{\psi}(E)$ can be measured by

$$ d_{\mathbb{M}}(\mathfrak{m}_{1},\mathfrak{m}_{2}):=\sup _{v\in \mathbb{M}}\bigg|\int _{E}v\,d\mathfrak{m}_{1}-\int _{E}v\,d\mathfrak{m}_{2}\bigg|. $$

(6.2)

Note that (6.2) defines a probability pseudo-metric (in the sense of Rachev [41, Sect. 2.3]), i.e., a map $d_{\mathbb{M}}:\mathcal{M}_{1}^{\psi}(E)\times \mathcal{M}_{1}^{\psi}(E)\to \overline{\mathbb{R}}_{+}$ which is symmetric and fulfils the triangle inequality. If $\mathbb{M}$ separates points in $\mathcal{M}_{1}^{\psi}(E)$ (i.e., if any two $\mathfrak{m}_{1},\mathfrak{m}_{2}\in{\mathcal{M}}_{1}^{\psi}(E)$ coincide when $\int _{E} v\,d\mathfrak{m}_{1}=\int _{E} v\,d\mathfrak{m}_{2}$ for all $v\in \mathbb{M}$), then $d_{\mathbb{M}}$ is even a probability metric. It is sometimes called integral probability metric or probability metric with a $\zeta $-structure; see Müller [35] and Zolotarev [54].

In some situations, the (pseudo-)metric $d_{\mathbb{M}}$ (with $\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)$ fixed) can be represented by the right-hand side of (6.2) with $\mathbb{M}$ replaced by a different subset $\mathbb{M}'$ of $\mathbb{M}_{\psi}(E)$. Each such set $\mathbb{M}'$ is said to be a generator of $d_{\mathbb{M}}$. The largest generator of $d_{\mathbb{M}}$ is called the maximal generator of $d_{\mathbb{M}}$ and will be denoted by $\overline{\mathbb{M}}$. That is, $\overline{\mathbb{M}}$ is the set of all $v\in \mathbb{M}_{\psi}(E)$ for which $|\int _{E}v\,d\mathfrak{m}_{1}-\int _{E}v\,d\mathfrak{m}_{2}|\le d_{\mathbb{M}}(\mathfrak{m}_{1},\mathfrak{m}_{2})$ for all $\mathfrak{m}_{1},\mathfrak{m}_{2}\in{\mathcal{M}}_{1}^{\psi}(E)$; see [35, Definition 3.1]. Examples for $d_{\mathbb{M}}$ and $\overline{\mathbb{M}}$ are discussed in Kern et al. [26] and Müller [34, 35].

Now denote by $\mathcal{P}_{\psi}$ the set of all transition functions $P=(P_{n})_{n=0}^{N-1}\in{\mathcal{P}}$ with $P_{n}((x,a),\,\cdot \,) \in{\mathcal{M}}_{1}^{\psi}(E)$ for all $(x,a)\in D_{n}$ and $n=0,\ldots ,N-1$. For any $P\in{\mathcal{P}}_{\psi}$, the integrals $\int _{E} v(y)\,P_{n}((x,a),dy)$, $v\in \mathbb{M}_{\psi}(E)$, $(x,a)\in D_{n}$, $n=0,\ldots ,N-1$, exist and are finite. For any $\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)$, we may define the distance between two transition functions $P=(P_{n})_{n=0}^{N-1}$ and $Q=(Q_{n})_{n=0}^{N-1}$ from $\mathcal{P}_{\psi}$ by

$$ d_{\mathbb{M},\psi}(P,Q):=\max _{n=0,\ldots ,N-1}\,\sup _{(x,a)\in D_{n}}\frac{1}{\psi (x)}\,d_{\mathbb{M}}\big(P_{n}((x,a),\,\cdot \,),Q_{n}((x,a),\,\cdot \,)\big). $$

(6.3)
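When the state space and the class $\mathbb{M}$ are finite, the distances (6.2) and (6.3) reduce to finite maxima. The following sketch makes these finiteness assumptions; the data layout (probability vectors as lists, transition functions as lists of dictionaries) and all names are hypothetical choices for illustration.

```python
def d_M(m1, m2, test_functions):
    """Right-hand side of (6.2) for probability vectors m1, m2 on a finite E."""
    return max(
        abs(sum(v[y] * (m1[y] - m2[y]) for y in range(len(m1))))
        for v in test_functions
    )

def d_M_psi(P, Q, test_functions, psi):
    """Right-hand side of (6.3).

    P and Q are lists over n = 0..N-1 of dicts mapping a state-action pair
    (x, a) to the probability vector P_n((x, a), .).
    """
    return max(
        d_M(P_n[(x, a)], Q_n[(x, a)], test_functions) / psi[x]
        for P_n, Q_n in zip(P, Q)
        for (x, a) in P_n
    )

indicators = [[1.0, 0.0], [0.0, 1.0]]      # hypothetical finite class M
psi = [1.0, 2.0]                            # weight function with psi >= 1
P = [{(0, 0): [0.5, 0.5], (1, 0): [1.0, 0.0]}]
Q = [{(0, 0): [0.25, 0.75], (1, 0): [0.5, 0.5]}]
distance = d_M_psi(P, Q, indicators, psi)
```

In this toy example the unweighted distance at the pair $(1,0)$ is twice as large as at $(0,0)$, but the division by $\psi (x)$ puts both on the same scale.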

For any $\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)$, the Minkowski functional $\varrho _{\mathbb{M}}:\mathbb{M}_{\psi}(E)\to \overline{\mathbb{R}}_{+}$ (in the sense of Rudin [42, paragraph after Definition 1.33]) is defined by

$$ \varrho _{\mathbb{M}}(v):=\inf \{\lambda \in \mathbb{R}_{++} : v/\lambda \in \mathbb{M}\}, $$

where we set $\inf \emptyset :=\infty $. Examples for $\mathbb{M}$ and $\varrho _{\mathbb{M}}$ are discussed in Kern et al. [26] and Müller [34]. In the following result, we assume that $\psi $ is a bounding function for any $Q\in{\mathcal{P}}_{\psi}$. By Lemma 6.1, it then follows that $V_{n}^{Q}(x)<\infty $ for any $n=0,\ldots ,N$, $Q\in{\mathcal{P}}_{\psi}$ and $x\in E$. In particular, we can define a functional $\overline{\mathcal{V}}_{n}^{x}:\mathcal{P}_{\psi}\to \mathbb{R}$ by $\overline{\mathcal{V}}_{n}^{x}(Q):=V_{n}^{Q}(x)$. Note that Theorem 6.2 is a refinement of Kern’s PhD thesis [25, Theorem 2.2.8] and that a related result was proved earlier by Müller [34, Theorem 4.2]. We use $K_{3,P}$ to denote the constant in condition (c) of a bounding function for $P$.

Theorem 6.2

We assume that $\psi $ is a bounding function for any $Q\in{\mathcal{P}}_{\psi}$, and we let $\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)$ and $\mathbb{M}'$ be a generator of $d_{\mathbb{M}}$. Then for any $n=0,\ldots ,N-1$, $x_{n}\in E$ and $Q,P\in{\mathcal{P}}_{\psi}$, we have

$$\begin{aligned} &\big|\overline{\mathcal{V}}_{n}^{x_{n}}(Q)-\overline{\mathcal{V}}_{n}^{x_{n}}(P)\big| \\ &\quad \le \sum _{j=n}^{N-1}\sup _{\pi \in \varPi}\varrho _{\mathbb{M}'}(V_{j+1}^{P;\pi})\,\big(K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P)\big)^{j-n}\,\psi (x_{n})\,d_{\mathbb{M},\psi}(Q,P). \end{aligned}$$
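The structure of this bound can be motivated by a one-step recursion. What follows is only a heuristic sketch and not the proof given in the article. It assumes that the policy value functions satisfy $V_{n}^{P;\pi}(x)=r_{n}(x,f_{n}(x))+\int _{E}V_{n+1}^{P;\pi}(y)\,P_{n}^{\pi}(x,dy)$, and the abbreviations $\Delta $ and $c_{n}$ are introduced here for illustration only. Write $\Delta :=d_{\mathbb{M},\psi}(Q,P)$ and suppose that $|V_{n+1}^{Q;\pi}(y)-V_{n+1}^{P;\pi}(y)|\le c_{n+1}\psi (y)$ for all $y\in E$ and $\pi \in \varPi $. Since $|\int _{E}v\,d\mathfrak{m}_{1}-\int _{E}v\,d\mathfrak{m}_{2}|\le \varrho _{\mathbb{M}'}(v)\,d_{\mathbb{M}}(\mathfrak{m}_{1},\mathfrak{m}_{2})$ whenever $\varrho _{\mathbb{M}'}(v)<\infty $, condition (c) of a bounding function gives $\int _{E}\psi (y)\,Q_{n}((x,a),dy)\le (K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\Delta )\,\psi (x)$, and therefore

$$\begin{aligned} \big|V_{n}^{Q;\pi}(x)-V_{n}^{P;\pi}(x)\big| &\le \int _{E}\big|V_{n+1}^{Q;\pi}-V_{n+1}^{P;\pi}\big|\,dQ_{n}^{\pi}(x,\,\cdot \,)+\bigg|\int _{E}V_{n+1}^{P;\pi}\,dQ_{n}^{\pi}(x,\,\cdot \,)-\int _{E}V_{n+1}^{P;\pi}\,dP_{n}^{\pi}(x,\,\cdot \,)\bigg| \\ &\le \Big(c_{n+1}\big(K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\Delta \big)+\varrho _{\mathbb{M}'}(V_{n+1}^{P;\pi})\,\Delta \Big)\,\psi (x). \end{aligned}$$

Starting from $c_{N}=0$ (the terminal reward does not depend on the transition function) and using $|\sup _{\pi}a_{\pi}-\sup _{\pi}b_{\pi}|\le \sup _{\pi}|a_{\pi}-b_{\pi}|$, iterating this estimate backwards from $n=N-1$ accumulates the stage-wise terms $\sup _{\pi \in \varPi}\varrho _{\mathbb{M}'}(V_{j+1}^{P;\pi})\,\Delta $ together with powers of the factor $K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\Delta $.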

As a direct consequence of Theorem 6.2, we get the following result.

Corollary 6.3

Assume that $\psi $ is a bounding function for any $Q\in{\mathcal{P}}_{\psi}$ and let $P\in{\mathcal{P}}_{\psi}$. Let $\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)$ and $\mathbb{M}'$ be a generator of $d_{\mathbb{M}}$. If $\varrho _{\mathbb{M}'}(\psi )<\infty $ and $\sup _{\pi \in \varPi}\varrho _{\mathbb{M}'}(V_{n+1}^{P;\pi})<\infty $ for any $n=0,\ldots ,N-1$, then $\overline{\mathcal{V}}_{n}^{x_{n}}$ is $(d_{\mathbb{M},\psi},|\cdot |)$-continuous at $P$ for any $n=0,\ldots ,N-1$ and $x_{n}\in E$.

6.2 A utility-based portfolio optimisation problem

6.2.1 Financial market model and a terminal wealth optimisation problem

Consider an $N$-period financial market consisting of one riskless bond $S^{0}=(S^{0}_{n})_{n=0}^{N}$ and $d$ risky assets $S^{i}=(S^{i}_{n})_{n=0}^{N}$, $i=1,\ldots ,d$, for some fixed $d\in \mathbb{N}$. Assume that the value of the bond evolves deterministically according to

$$ S^{0}_{0}=1\quad \mbox{ and }\quad S^{0}_{n+1}=Z^{0}_{n+1} S^{0}_{n}, \quad n=0,\ldots ,N-1, $$

for some fixed constants $Z^{0}_{1},\ldots ,Z^{0}_{N}\in [1,\infty )$, and that the value of the $i$th asset evolves stochastically according to

$$ S^{i}_{0}=s^{i}_{0}\quad \mbox{ and }\quad S^{i}_{n+1}=Z^{i}_{n+1} S^{i}_{n}, \quad n=0,\ldots ,N-1, $$

for a constant $s_{0}^{i}\in \mathbb{R}_{++}$ and independent $\mathbb{R}_{+}$-valued random variables $Z^{i}_{1},\ldots ,Z^{i}_{N}$ on a common probability space $(\varOmega ,\mathcal{F},\mathbb{P})$. For $n=0,\ldots ,N$, set $S_{n}:=(S_{n}^{1},\ldots ,S_{n}^{d})$ and $Z_{n}:=(Z_{n}^{1},\ldots ,Z_{n}^{d})$ and denote by $\mu _{n}$ the distribution of $Z_{n}$. We also define $\boldsymbol{1}:=(1,\ldots ,1)\in \mathbb{R}^{d}$, $\mathcal{F}_{0}:=\{\emptyset ,\varOmega \}$, $\mathcal{F}_{n}:=\sigma (S_{0},\ldots ,S_{n})=\sigma (Z_{1},\ldots ,Z_{n})$, $n=1,\ldots ,N$, and $\mathbb{F}:=(\mathcal{F}_{n})_{n=0}^{N}$.

Now, an agent invests a given amount of capital $x_{0}\in \mathbb{R}_{++}$ in the bond and the assets according to some self-financing trading strategy. By a trading strategy, we mean an $\mathbb{F}$-adapted $\mathbb{R}_{+}^{d+1}$-valued stochastic process $\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}$ with $\xi _{n}=(\xi _{n}^{1},\ldots ,\xi _{n}^{d})$, where $\xi _{n}^{0}$ and $\xi _{n}^{i}$ specify the amounts of capital invested in the bond and in the $i$th asset, respectively, during the time interval $[n,n+1)$. The nonnegativity of $\xi _{n}^{0},\xi _{n}^{1},\ldots ,\xi _{n}^{d}$, $n=0,\ldots ,N-1$, means that taking loans and short selling of the assets are excluded. The corresponding ($\mathbb{F}$-adapted) portfolio process $X^{\xi}=(X_{n}^{\xi})_{n=0}^{N}$ associated with $\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}$ is defined by

$$ X_{0}^{\xi}:=\xi _{0}^{0}+\langle \xi _{0},\boldsymbol{1}\rangle , \qquad X_{n+1}^{\xi}:=\xi _{n}^{0}Z^{0}_{n+1} + \langle \xi _{n},Z_{n+1} \rangle ,\quad n=0,\ldots ,N-1. $$

(6.4)

A trading strategy $\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}$ is called self-financing with respect to the initial capital $x_{0}$ if $x_{0}=\xi _{0}^{0}+\langle \xi _{0},\boldsymbol{1}\rangle $ and $X_{n}^{\xi}=\xi _{n}^{0}+\langle \xi _{n},\boldsymbol{1}\rangle $ for any $n=1,\ldots ,N$. Note that $\xi _{n}^{0}$ and $\langle \xi _{n},\boldsymbol{1}\rangle $ specify the amounts of capital invested during the time interval $[n,n+1)$ in the bond and in the $d$ assets, respectively. For any self-financing trading strategy $\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}$ with respect to $x_{0}$, we have $\xi _{n}^{0}=X_{n}^{\xi}-\langle \xi _{n},\boldsymbol{1}\rangle $ for any $n=0,\ldots ,N-1$, and therefore the corresponding portfolio process admits the representation

$$ X_{0}^{\xi}=x_{0}, \qquad X_{n+1}^{\xi}=Z^{0}_{n+1} X_{n}^{\xi} + \langle \xi _{n},Z_{n+1}-Z^{0}_{n+1}\boldsymbol{1}\rangle \quad \mbox{for }n=0,\ldots ,N-1. $$

(6.5)
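The equivalence of the two representations (6.4) and (6.5) under the self-financing condition can be checked numerically on a single path. The sketch below is illustrative only; the function name, the proportional decision rule and the price changes are hypothetical.

```python
def wealth_path(x0, bond_changes, asset_changes, decision_rule):
    """Compute X_0, ..., X_N twice: via (6.4) with explicit bond holdings and
    via the reduced recursion (6.5). `decision_rule(n, x)` returns the amounts
    invested in the d assets on [n, n+1) when the current wealth is x."""
    direct, reduced = [x0], [x0]
    for n, (z0, z) in enumerate(zip(bond_changes, asset_changes)):
        xi = decision_rule(n, direct[-1])
        bond = direct[-1] - sum(xi)          # self-financing: the rest goes to the bond
        if bond < 0 or any(a < 0 for a in xi):
            raise ValueError("loans and short selling are excluded")
        # Representation (6.4): bond holding plus asset holdings, each revalued.
        direct.append(bond * z0 + sum(a * zi for a, zi in zip(xi, z)))
        # Representation (6.5): wealth grows at the bond rate plus excess returns.
        xi_r = decision_rule(n, reduced[-1])
        reduced.append(z0 * reduced[-1] + sum(a * (zi - z0) for a, zi in zip(xi_r, z)))
    return direct, reduced

direct, reduced = wealth_path(
    x0=100.0,
    bond_changes=[1.01, 1.01],
    asset_changes=[(1.1, 0.9), (0.95, 1.2)],
    decision_rule=lambda n, x: (0.3 * x, 0.2 * x),   # hypothetical Markovian rule
)
```

Both lists agree up to rounding, which reflects that (6.5) is obtained from (6.4) by substituting $\xi _{n}^{0}=X_{n}^{\xi}-\langle \xi _{n},\boldsymbol{1}\rangle $.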

In view of (6.5), we identify a self-financing trading strategy with respect to $x_{0}$ with an $\mathbb{F}$-adapted $\mathbb{R}_{+}^{d}$-valued stochastic process $\xi =(\xi _{n})_{n=0}^{N-1}$ with $\xi _{n}=(\xi _{n}^{1}, \ldots ,\xi _{n}^{d})$ such that $\langle \xi _{0},\boldsymbol{1}\rangle \in [0,x_{0}]$ and $\langle \xi _{n},\boldsymbol{1}\rangle \in [0,X_{n}^{\xi}]$ for any $n=1,\ldots ,N-1$. We restrict ourselves to Markovian self-financing trading strategies $\xi =(\xi _{n})_{n=0}^{N-1}$ with respect to $x_{0}$, which means that $\xi _{n}$ only depends on $n$ and $X_{n}^{\xi}$. To put it another way, we assume that for any $n=0,\ldots ,N-1$, there exists a Borel-measurable map $f_{n}:\mathbb{R}_{+}\to \mathbb{R}_{+}^{d}$ such that $\xi _{n} = f_{n}(X_{n}^{\xi})$. Then in particular, $X^{\xi}$ is an $\mathbb{R}_{+}$-valued $\mathbb{F}$-Markov process whose one-step transition probability at time $n\in \{0,\ldots ,N-1\}$ given state $x\in \mathbb{R}_{+}$ and strategy $\xi =(\xi _{n})_{n=0}^{N-1}$ (resp. $\pi :=(f_{n})_{n=0}^{N-1}$) is given by $\mu _{n+1}\circ \eta _{n,(x,f_{n}(x))}^{-1} $, where

$$ \eta _{n,(x,a)}(z):=Z_{n+1}^{0}x+\langle a,z-Z_{n+1}^{0}\boldsymbol{1}\rangle ,\qquad z\in \mathbb{R}_{+}^{d}. $$

(6.6)

The agent’s aim is to find a self-financing trading strategy $\xi =(\xi _{n})_{n=0}^{N-1}$ (resp. $\pi =(f_{n})_{n=0}^{N-1}$) with respect to $x_{0}$ for which her expected utility of the relative terminal wealth is maximised. We assume that the agent is risk-averse and that her attitude towards risk is set via the power utility function $u_{\alpha}:\mathbb{R}_{+}\to \mathbb{R}_{+}$ defined by

$$ u_{\alpha}(x):=x^{\alpha }$$

(6.7)

for some fixed $\alpha \in (0,1)$. Hence the agent is interested in those self-financing trading strategies $\xi =(\xi _{n})_{n=0}^{N-1}$ (resp. $\pi =(f_{n})_{n=0}^{N-1}$) with respect to $x_{0}$ for which the expectation of $u_{\alpha}(X_{N}^{\xi}/(x_{0}S_{N}^{0}))$ is maximised. Since $u_{\alpha}(X_{N}^{\xi}/(x_{0}S_{N}^{0})) = u_{\alpha}(X_{N}^{\xi})/(x_{0}S_{N}^{0})^{ \alpha}$, this is equivalent to maximising the expectation of $u_{\alpha}(X_{N}^{\xi})$. For notational simplicity, we consider the terminal wealth optimisation problem in the latter form. We assume that $Z_{n}^{1},\ldots ,Z_{n}^{d}$ are $\mathbb{P}$-a.s. strictly positive and $\mathbb{E}[u_{\alpha}(\langle Z_{n},\boldsymbol{1}\rangle )]<\infty $ for any $n=1,\ldots ,N$.

Example 6.4

Assume that the bond and the $d$ assets evolve according to the 1-dimensional ordinary (Itô stochastic) differential equations

$$\begin{aligned} & d\mathsf{s}^{0}_{t} = \delta _{0}\mathsf{s}^{0}_{t}\,dt,\quad \mathsf{s}_{0}^{0}=1, \\ & d\mathsf{s}_{t}^{i}=\delta _{i}\mathsf{s}_{t}^{i}\,dt+\sigma _{i} \mathsf{s}_{t}^{i}\,dB_{t}^{i},\quad \mathsf{s}_{0}^{i}=s_{0}^{i}, \qquad i=1,\ldots ,d, \end{aligned}$$

where $\delta _{0},\delta _{1},\ldots ,\delta _{d},\sigma _{1},\ldots ,\sigma _{d}\in \mathbb{R}_{++}$ are constants and $B^{1},\ldots ,B^{d}$ are (jointly Gaussian) correlated 1-dimensional standard Brownian motions which satisfy for any $t\in \mathbb{R}_{+}$ that $\mathbb{C}\mathrm{ov}(B_{t}^{i},B_{t}^{j})=R_{i,j}t$, where $R=(R_{i,j})_{1\le i,j\le d}\in \mathbb{R}^{d\times d}$ is a fixed correlation matrix (i.e., $R$ is symmetric and positive semi-definite with entries 1 on the diagonal). This is a multivariate version of the classical Black–Scholes–Merton model. Choose the trading period to be the unit interval $[0,1]$ and assume that the bond and the assets can be traded only at $N$ equidistant time points in $[0,1]$, namely at $t_{N,n}:=n/N$, $n=0,\ldots ,N-1$. Then the relative price changes $Z^{0}_{n+1}:=S^{0}_{n+1}/S^{0}_{n}=\mathsf{s}^{0}_{t_{N,n+1}}/ \mathsf{s}^{0}_{t_{N,n}}$ and $Z_{n+1}^{i}:=S_{n+1}^{i}/S_{n}^{i}=\mathsf{s}^{i}_{t_{N,n+1}}/ \mathsf{s}^{i}_{t_{N,n}}$ are given by, respectively, $e^{\delta _{0}(t_{N,n+1}-t_{N,n})}$ and $e^{(\delta _{i} - \sigma _{i}^{2}/2)(t_{N,n+1}-t_{N,n})+\sigma _{i}(B^{i}_{t_{N,n+1}}-B^{i}_{t_{N,n}})}$, i.e., for $n=0,\ldots ,N-1$,

$$ Z_{n+1}^{0}=e^{\delta _{0}/N}\quad \mbox{ and }\quad Z_{n+1}^{i}=e^{( \delta _{i} - \sigma _{i}^{2}/2)/N+\sigma _{i}(B^{i}_{t_{N,n+1}}-B^{i}_{t_{N,n}})}, \qquad i=1,\ldots ,d. $$

That is, we have $Z_{n+1}=(e^{G_{1}},\ldots ,e^{G_{d}})$ for a $d$-variate random variable $(G_{1},\ldots ,G_{d})$ which has a $d$-variate normal distribution $\mathbf{N}_{\delta ,\varGamma}$ with $\delta :=(( \delta _{i} - \sigma _{i}^{2}/2)/N)_{i=1}^{d}$ and $\varGamma :=( \sigma _{i}R_{i,j}\sigma _{j}/N)_{1\le i,j\le d}$. Thus we have $\mu _{1}=\cdots =\mu _{N}=\mathrm{LN}_{\delta ,\varGamma}$, where $\mathrm{LN}_{\delta ,\varGamma}$ is a $d$-variate log-normal distribution with parameters $\delta $ and $\varGamma $.
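The parameters $\delta $ and $\varGamma $ of Example 6.4 and one draw of $Z_{n+1}$ can be computed as follows. This is a minimal sketch with hypothetical numerical inputs; it additionally assumes that $\varGamma $ is positive definite (the text only requires $R$ to be positive semi-definite) so that a plain Cholesky factorisation applies.

```python
import math
import random

def lognormal_parameters(drift, vol, corr, n_periods):
    """Parameters (delta, Gamma) of LN_{delta,Gamma} from Example 6.4."""
    d = len(drift)
    delta = [(drift[i] - vol[i] ** 2 / 2.0) / n_periods for i in range(d)]
    gamma = [[vol[i] * corr[i][j] * vol[j] / n_periods for j in range(d)] for i in range(d)]
    return delta, gamma

def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (matrix assumed positive definite)."""
    d = len(matrix)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def sample_relative_price_changes(delta, gamma, rng):
    """One draw of Z_{n+1} = (exp(G_1), ..., exp(G_d)) with G ~ N(delta, Gamma)."""
    low = cholesky(gamma)
    noise = [rng.gauss(0.0, 1.0) for _ in delta]
    return [
        math.exp(delta[i] + sum(low[i][k] * noise[k] for k in range(i + 1)))
        for i in range(len(delta))
    ]

delta, gamma = lognormal_parameters(
    drift=[0.05, 0.08], vol=[0.2, 0.3], corr=[[1.0, 0.5], [0.5, 1.0]], n_periods=4
)
z = sample_relative_price_changes(delta, gamma, random.Random(0))
```

Changing only the off-diagonal entries of `corr` alters the dependence structure of $Z_{n+1}$ while leaving its log-normal marginals unchanged.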

6.2.2 Interpretation as a Markov decision problem

The terminal wealth optimisation problem just introduced can be embedded in the framework of Sect. 6.1 as follows. Let $Z^{0}_{1},\ldots ,Z^{0}_{N}\in [1,\infty )$ be a priori fixed and choose $(E,\mathcal{E}):=(\mathbb{R}_{+},\mathcal{B}(\mathbb{R}_{+}))$. For any $x\in \mathbb{R}_{+}$ and $n=0,\ldots ,N-1$, let

$$ A_{n}(x):=A(x):=\{a\in \mathbb{R}_{+}^{d} : \langle a,\boldsymbol{1}\rangle \le x\}. $$

Hence $A_{n}=\mathbb{R}_{+}^{d}$ and $D_{n}=D:=\{(x,a)\in \mathbb{R}_{+}^{d+1} : a\in A(x)\}$ for $n=0, \ldots ,N-1$. Let $\mathcal{A}_{n}:=\mathcal{B}(\mathbb{R}_{+}^{d})$ and $\mathcal{D}_{n}:=\mathcal{B}(\mathbb{R}_{+}^{d+1})\cap D$ for any $n=0,\ldots ,N-1$, and let the set $F$ consist of all those Borel-measurable maps $f:\mathbb{R}_{+}\to \mathbb{R}_{+}^{d}$ that satisfy $\langle f(x),\boldsymbol{1}\rangle \in [0,x]$ for any $x\in \mathbb{R}_{+}$. Finally, let $F_{n}:=F$ for $n=0,\ldots ,N-1$ and $\varPi :=F_{0}\times \cdots \times F_{N-1}=F^{N}$.

Let $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})$ be the set of all Borel probability measures $\mu $ on $\mathbb{R}_{++}^{d}$ for which $\int _{\mathbb{R}_{+}^{d}}|z|^{\alpha}\,\mu (dz)<\infty $. The latter condition is equivalent to $\int _{\mathbb{R}_{++}^{d}}\langle z,\boldsymbol{1}\rangle ^{\alpha}\,\mu (dz)<\infty $, which can be shown by using arguments as at the beginning of Appendix A.2. For any $\vec{\boldsymbol{\mu}}=(\mu _{n})_{n=1}^{N}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})^{N}$, we define a transition function $P^{\vec{\boldsymbol{\mu}}}=(P_{n}^{\vec{\boldsymbol{\mu}}})_{n=0}^{N-1}$ by

$$P_{n}^{\vec{\boldsymbol{\mu}}}\big((x,a),\,\cdot \,\big)=(\mu _{n+1} \circ \eta _{n,(x,a)}^{-1}) [\,\cdot \,],\qquad (x,a)\in D_{n},\,n=0, \ldots ,N-1, $$

where the map $\eta _{n,(x,a)}:\mathbb{R}_{+}^{d}\to \mathbb{R}_{+}$ is defined by (6.6). The set of all such transition functions is denoted by $\mathcal{P}_{\alpha}$, i.e., $\mathcal{P}_{\alpha}:=\{P^{\vec{\boldsymbol{\mu}}} : \vec{\boldsymbol{\mu}}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})^{N}\}$, and plays the role of $\mathcal{P}$.

Let $r_{n}:= 0$, $n=0,\ldots ,N-1$, and $r_{N}(x):=u_{\alpha}(x)$, $x\in \mathbb{R}_{+}$. Then

$$ V_{0}^{P;\pi}(x_{0})=\mathbb{E}^{x_{0},P;\pi}[r_{N}(X_{N})]=\mathbb{E}^{x_{0},P;\pi}[u_{\alpha}(X_{N})] $$

for any $x_{0}\in \mathbb{R}_{+}$, $P\in{\mathcal{P}}_{\alpha}$ and $\pi \in \varPi $, and the terminal wealth problem introduced subsequent to (6.7) can be identified with the optimisation problem (6.1), i.e., with

$$ \max \{\mathbb{E}^{x_{0},P;\pi}[u_{\alpha}(X_{N})] : \pi \in \varPi \} $$

(6.8)

for any $x_{0}\in \mathbb{R}_{+}$ and $P\in{\mathcal{P}}_{\alpha}$. A strategy $\pi ^{P}\in \varPi $ is called an optimal (self-financing) trading strategy for $P$ if it solves the maximisation problem (6.8) for any $x_{0}\in \mathbb{R}_{+}$. Note that the coordinate process $X$ plays the role of the portfolio process $X^{\xi}$ introduced in (6.4), and that for each $x_{0}\in \mathbb{R}_{+}$, any self-financing trading strategy $\xi =(\xi _{n})_{n=0}^{N-1}$ with respect to $x_{0}$ may be identified with some $\pi =(f_{n})_{n=0}^{N-1}\in \varPi $ through $\xi _{n}=f_{n}(X_{n}^{\xi})$. Theorem C.3 ensures that optimal trading strategies exist.
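A short scaling argument indicates how the value function of this problem depends on the current wealth. The sketch is not part of the source text: it assumes that the Bellman recursion $V_{N}^{P}=u_{\alpha}$, $V_{n}^{P}(x)=\sup _{a\in A(x)}\int V_{n+1}^{P}(\eta _{n,(x,a)}(z))\,\mu _{n+1}(dz)$ holds for $P=P^{\vec{\boldsymbol{\mu}}}$ (see Appendix C for the relevant conditions), and the constants $v_{k}$ are notation introduced here for illustration. For $x>0$, we have $a\in A(x)$ if and only if $a/x\in A(1)$, and $\eta _{n,(x,a)}(z)=x\,\eta _{n,(1,a/x)}(z)$ by (6.6). Since $u_{\alpha}(xy)=x^{\alpha}u_{\alpha}(y)$, backward induction then suggests

$$ V_{n}^{P}(x)=u_{\alpha}(x)\prod _{k=n+1}^{N}v_{k},\qquad v_{k}:=\sup _{b\in A(1)}\int u_{\alpha}\big(Z_{k}^{0}+\langle b,z-Z_{k}^{0}\boldsymbol{1}\rangle \big)\,\mu _{k}(dz). $$

Under these assumptions, the distributions $\mu _{1},\ldots ,\mu _{N}$ enter the value function only through the one-period constants $v_{k}$.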

6.2.3 Continuous dependence of the optimal value on $P^{\vec{\boldsymbol{\mu}}}$

Let the function $\psi _{\alpha}:\mathbb{R}_{+}\to [1,\infty )$ be defined by $\psi _{\alpha}(x):=1 + u_{\alpha}(x)$. Moreover, let $\mathcal{P}_{\psi _{\alpha}}$ be derived from $\mathcal{P}_{\alpha}$ as $\mathcal{P}_{\psi}$ is derived from $\mathcal{P}$ in Sect. 6.1.

Lemma 6.5

$\psi _{\alpha}$ is a bounding function for any $P\in{\mathcal{P}}_{\alpha}$, and we have $\mathcal{P}_{\psi _{\alpha}}=\mathcal{P}_{\alpha}$.

Let $\mathbb{M}:=\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}:=\{v\in \mathbb{R}^{\mathbb{R}_{+}} : \|v\|_{\mathrm{H\ddot{o}l},\alpha}\le 1\}$, where the Hölder-$\alpha $ norm is defined by $\|v\|_{\mathrm{H\ddot{o}l},\alpha}:=\sup _{x,y\in \mathbb{R}_{+} : x\neq y}|v(x)-v(y)|/|x-y|^{\alpha}$. We obviously have $\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}\subseteq \mathbb{M}_{\psi _{\alpha}}(\mathbb{R}_{+})$, and in view of Lemmas 6.5 and 6.1, we can therefore define a functional $\overline{\mathcal{V}}_{n}^{x}:\mathcal{P}_{\psi _{\alpha}}\to \mathbb{R}$ through $\overline{\mathcal{V}}_{n}^{x}(P):=V_{n}^{P}(x)$. The set $\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}$ separates points in $\mathcal{M}_{1}^{\psi _{\alpha}}(\mathbb{R}_{+})$, implying that $d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}$ (defined by (6.2) with $\mathbb{M}:=\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}$) provides a metric on $\mathcal{M}_{1}^{\psi _{\alpha}}(\mathbb{R}_{+})$; see Kern et al. [26] for details. Let $d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha},\psi _{\alpha}}$ be defined by (6.3) with $\mathbb{M}:=\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}$ and $\psi :=\psi _{\alpha}$.
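The Hölder-$\alpha $ quantity can be explored numerically on a grid. The sketch below is only an illustration: a maximum over finitely many grid points is a lower bound for the supremum over $\mathbb{R}_{+}$, and the grid, the value $\alpha =1/2$ and the function names are hypothetical. It relies on the elementary inequality $|x^{\alpha}-y^{\alpha}|\le |x-y|^{\alpha}$ for $x,y\in \mathbb{R}_{+}$ and $\alpha \in (0,1)$, which shows that the power utility $u_{\alpha}$ itself has Hölder-$\alpha $ norm at most 1.

```python
def holder_seminorm_on_grid(v, alpha, grid):
    """Largest ratio |v(x) - v(y)| / |x - y|**alpha over distinct grid points.
    This is only a lower bound for the supremum over all of R_+."""
    return max(
        abs(v(x) - v(y)) / abs(x - y) ** alpha
        for i, x in enumerate(grid)
        for y in grid[:i]
    )

alpha = 0.5
grid = [k / 50.0 for k in range(201)]             # points in [0, 4]
power_utility = lambda x: x ** alpha              # u_alpha from (6.7)
identity = lambda x: x

ratio_utility = holder_seminorm_on_grid(power_utility, alpha, grid)
ratio_identity = holder_seminorm_on_grid(identity, alpha, grid)
```

On this grid the ratio for $u_{\alpha}$ does not exceed 1, whereas the ratio for the identity map grows like $|x-y|^{1-\alpha}$ and exceeds 1 as soon as two grid points are more than one unit apart.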

Theorem 6.6

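For any $n=0,\ldots ,N-1$ and $x\in \mathbb{R}_{+}$, the map $\overline{\mathcal{V}}_{n}^{x}:\mathcal{P}_{\psi _{\alpha}}\to \mathbb{R}$ is $(d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha},\psi _{\alpha}},|\cdot |)$-continuous.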

Recall that the elements of $\mathcal{P}_{\alpha}$ ($=\mathcal{P}_{\psi _{ \alpha}}$) are parametrised by the elements of the set $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})^{N}$. For any $\mu \in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})$, denote by $\overline{\mu}$ the element of $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})^{N}$ whose $N$ entries are all equal to $\mu $, i.e., $\overline{\mu}:=(\mu )_{n=1}^{N}$. Then we can define a functional $\mathcal{V}_{n}^{x}:\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})\to \mathbb{R}$ by

$$ \mathcal{V}_{n}^{x}(\mu )\,:=\,\overline{\mathcal{V}}_{n}^{x}( \overline{\mu}) = V_{n}^{P^{\overline{\mu}}}(x). $$

(6.9)

Since we used $\mathcal{O}_{d}^{\alpha}$ to denote the $\alpha $-weak topology on $\mathcal{M}_{1}^{\alpha}$ (see Sect. 2.2), we use $\mathcal{O}_{d}^{\alpha}(\mathbb{R}_{++}^{d})$ to denote the analogous topology on $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})$.

Corollary 6.7

For any $n=0,\ldots ,N-1$ and $x\in \mathbb{R}_{+}$, the map $\mathcal{V}_{n}^{x}:\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})\to \mathbb{R}$ defined by (6.9) is $(\mathcal{O}_{d}^{\alpha}(\mathbb{R}_{++}^{d}),\mathcal{O}_{\mathbb{R}})$-continuous.

6.3 Copula robustness of the maximal expected utility of the terminal wealth

For any $n=0,\ldots ,N-1$ and $x\in \mathbb{R}_{+}$, let the map $\mathcal{V}_{n}^{x}:\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})\to \mathbb{R}$ be defined by (6.9), and note that $\mathcal{V}_{0}^{x_{0}}(\mu )$ corresponds to the maximal expected utility of the terminal wealth in (6.8) with $P=P^{\overline{\mu}}$. When regarding each $\mu \in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})$ as a Borel probability measure on the whole Euclidean space $\mathbb{R}^{d}$ (with $\mu [\mathbb{R}_{++}^{d}]=1$), the set $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})$ can be seen as a subset of $\mathcal{M}_{1}^{\alpha}$. Thus $\mathbf{C}_{d}(\mu _{1},\ldots ,\mu _{d})=\mathbf{C}_{d}$ for any $\mu _{1},\ldots ,\mu _{d}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})$, and Theorem 3.12 and Corollary 6.7 together imply the following result.

Corollary 6.8

For any $n=0,\ldots ,N-1$ and $x\in \mathbb{R}_{+}$, the map $\mathcal{V}_{n}^{x}:\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})\to \mathbb{R}$ defined by (6.9) is copula robust.

References

Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Berlin (2011)
Bauer, H.: Measure and Integration Theory. de Gruyter, Berlin (2001)
Book MATH Google Scholar
Bellini, F., Klar, B., Müller, A., Rosazza Gianin, E.: Generalized quantiles as risk measures. Insur. Math. Econ. 54, 41–48 (2014)
Article MathSciNet MATH Google Scholar
Berge, C.: Topological Spaces. Macmillan, New York (1963)
MATH Google Scholar
Bickel, P.J., Freedman, D.A.: Some asymptotic theory for the bootstrap. Ann. Stat. 9, 1196–1217 (1981)
Article MathSciNet MATH Google Scholar
Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
Book MATH Google Scholar
Charpentier, A., Segers, J.: Convergence of Archimedean copulas. Stat. Probab. Lett. 78, 412–419 (2008)
Article MathSciNet MATH Google Scholar
Cheridito, P., Li, T.: Risk measures on Orlicz hearts. Math. Finance 19, 189–214 (2009)
Article MathSciNet MATH Google Scholar
Claus, M., Krätschmer, V., Schultz, R.: Weak continuity of risk functionals with applications to stochastic programming. SIAM J. Optim. 27, 91–109 (2017)
Article MathSciNet MATH Google Scholar
Deheuvels, P.: Caractérisation complète des lois extrêmes multivariées et de la convergence des types extrêmes. Publ. Inst. Stat. Univ. Paris 23, 1–36 (1978)
MATH Google Scholar
Demarta, S., McNeil, A.J.: The $t$ copula and related copulas. Int. Stat. Rev. 73, 111–129 (2005)
Article MATH Google Scholar
Durante, F., Sempi, C.: Principles of Copula Theory. Taylor & Francis, London (2015)
Book MATH Google Scholar
Embrechts, P., Liu, H., Wang, R.: Quantile-based risk sharing. Oper. Res. 66, 936–949 (2018)
Article MathSciNet MATH Google Scholar
Embrechts, P., Puccetti, G.: Bounds for functions of dependent risks. Finance Stoch. 10, 341–352 (2006)
Article MathSciNet MATH Google Scholar
Embrechts, P., Puccetti, G., Rüschendorf, L.: Model uncertainty and VaR aggregation. J. Bank. Finance 37, 2750–2764 (2013)
Article Google Scholar
Embrechts, P., Schied, A., Wang, R.: Robustness in the optimization of risk measures. Oper. Res. 70, 95–110 (2021)
Article MathSciNet MATH Google Scholar
Embrechts, P., Wang, B., Wang, R.: Aggregation-robustness and model uncertainty of regulatory risk measures. Finance Stoch. 19, 763–790 (2015)
Article MathSciNet MATH Google Scholar
Emmer, S., Kratz, M., Tasche, D.: What is the best risk measure in practice? A comparison of standard measures. J. Risk 18, 31–60 (2015)
Article Google Scholar
Fernández Sánchez, J., Trutschnig, W.: Conditioning-based metrics on the space of multivariate copulas and their interrelation with uniform and levelwise convergence and iterated function systems. J. Theor. Probab. 28, 1311–1336 (2015)
Article MathSciNet MATH Google Scholar
Filipović, D., Kupper, M.: Optimal capital and risk transfers for group diversification. Math. Finance 18, 55–76 (2007)
Article MathSciNet MATH Google Scholar
Filipović, D., Svindland, G.: Optimal capital and risk allocations for law- and cash-invariant convex functions. Finance Stoch. 12, 423–439 (2008)
Article MathSciNet MATH Google Scholar
Föllmer, H., Schied, A.: Robust preferences and convex measures of risk. In: Sandmann, K., Schönbucher, P. (eds.) Advances in Finance and Stochastics. Essays in Honour of Dieter Sondermann, pp. 39–56. Springer, Berlin (2002)
Chapter Google Scholar
Föllmer, H., Schied, A.: Stochastic Finance. An Introduction in Discrete Time, 3rd revised and extended edn. de Gruyter, Berlin (2011)
Book MATH Google Scholar
Kasper, T., Fuchs, S., Trutschnig, W.: On weak conditional convergence of bivariate Archimedean and extreme value copulas, and consequences to nonparametric estimation. Bernoulli 27, 2217–2240 (2021)
Article MathSciNet MATH Google Scholar
Kern, P.: Sensitivity and Statistical Inference in Markov Decision Models and Collective Risk Models. PhD thesis, Saarland University, Saarbrücken, (2020). Available online at https://doi.org/10.22028/D291-32385
Kern, P., Simroth, A., Zähle, H.: First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function. Math. Methods Oper. Res. 92, 165–197 (2020)
Article MathSciNet MATH Google Scholar
Krätschmer, V., Schied, A., Zähle, H.: Comparative and qualitative robustness for law-invariant risk measures. Finance Stoch. 18, 271–295 (2014)
Article MathSciNet MATH Google Scholar
Krätschmer, V., Schied, A., Zähle, H.: Domains of weak continuity of statistical functionals with a view toward robust statistics. J. Multivar. Anal. 158, 1–19 (2017)
Article MathSciNet MATH Google Scholar
Krätschmer, V., Zähle, H.: Statistical inference for expectile-based risk measures. Scand. J. Stat. 44, 425–454 (2017)
MathSciNet MATH Google Scholar
Li, X., Mikusiński, P., Taylor, M.D.: Strong approximation of copulas. J. Math. Anal. Appl. 225, 608–623 (1998)
Article MathSciNet MATH Google Scholar
Lindner, A., Szimayer, A.: A limit theorem for copulas. Discussion paper 433, Sonderforschungsbereich 386, Ludwig-Maximilian-Universität München (2005). Available online at https://epub.ub.uni-muenchen.de/1802/
Markowitz, H.M.: Portfolio selection. J. Finance 7, 77–91 (1952)
Google Scholar
McNeil, A.J., Frey, R., Embrechts, P.: Quantitative Risk Management, 1st edn. Princeton University Press, Princeton (2005)
MATH Google Scholar
Müller, A.: How does the value function of a Markov decision process depend on the transition probabilities? Math. Oper. Res. 22, 872–885 (1997)
Article MathSciNet MATH Google Scholar
Müller, A.: Integral probability metrics and their generating classes of functions. Adv. Appl. Probab. 29, 429–443 (1997)
Article MathSciNet MATH Google Scholar
Nelsen, R.B.: An Introduction to Copulas, 2nd edn. Springer, New York (2006)
MATH Google Scholar
Ogryczak, W., Ruszczyński, A.: From stochastic dominance to mean–risk models: semideviations as risk measures. Eur. J. Oper. Res. 116, 33–50 (1999)
Article MATH Google Scholar
Ogryczak, W., Ruszczyński, A.: Dual stochastic dominance and related mean–risk models. SIAM J. Optim. 13, 60–78 (2002)
Article MathSciNet MATH Google Scholar
Puccetti, G.: Sharp bounds on the expected shortfall for a sum of dependent random variables. Stat. Probab. Lett. 83, 1227–1232 (2013)
Article MathSciNet MATH Google Scholar
Rachasingho, J., Tasena, S.: A metric space of subcopulas – an approach via Hausdorff distance. Fuzzy Sets Syst. 378, 144–156 (2020)
Article MathSciNet MATH Google Scholar
Rachev, S.T.: Probability Metrics and the Stability of Stochastic Models. Wiley, Chichester (1991)
MATH Google Scholar
Rudin, W.: Functional Analysis, 2nd edn. McGraw-Hill, New York (1991)
MATH Google Scholar
Rüschendorf, L.: Random variables with maximum sums. Adv. Appl. Probab. 14, 623–632 (1982)
Article MathSciNet MATH Google Scholar
Rüschendorf, L.: Mathematical Risk Analysis. Springer, Berlin (2013)
Book MATH Google Scholar
Saida, A.B., Prigent, J.L.: On the robustness of portfolio allocation under copula misspecification. Ann. Oper. Res. 262, 631–652 (2018)
Article MathSciNet MATH Google Scholar
Schultz, R., Tiedemann, S.: Conditional value-at-risk in stochastic programs with mixed-integer recourse. Math. Program. 105, 365–386 (2006)
Article MathSciNet MATH Google Scholar
Sempi, C.: Convergence of copulas: a critical remark. Rad. Mat. 12, 241–249 (2004)
MathSciNet MATH Google Scholar
Shiryaev, A.N.: Probability, 2nd edn. Springer, New York (1996)
Book MATH Google Scholar
Sklar, A.: Fonctions de répartition à $n$ dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 8, 229–231 (1959)
MATH Google Scholar
Trutschnig, W.: On a strong metric on the space of copulas and its induced dependence measure. J. Math. Anal. Appl. 384, 690–705 (2011)
Article MathSciNet MATH Google Scholar
Trutschnig, W.: Some results on the convergence of (quasi-)copulas. Fuzzy Sets Syst. 191, 113–121 (2012)
Article MathSciNet MATH Google Scholar
van der Vaart, A.W.: Asymptotic Statistics. Cambridge University Press, Cambridge (1998)
Book MATH Google Scholar
Wang, S.S., Dhaene, J.: Comonotonicity, correlation order and premium principles. Insur. Math. Econ. 22, 235–242 (1998)
Article MathSciNet MATH Google Scholar
Zolotarev, V.M.: Probability metrics. Theory Probab. Appl. 28, 278–302 (1983)
Article MathSciNet MATH Google Scholar

Download references

Acknowledgements

The author would like to thank an Associate Editor and two referees for their constructive comments and suggestions that led to an improved version of the article.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Department of Mathematics, Saarland University, Campus E2 4, D-66123, Saarbrücken, Germany
Henryk Zähle


Corresponding author

Correspondence to Henryk Zähle.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proofs

A.1 Proof of Proposition 2.2

By the assumption imposed on $h$, we can find a constant $c \in \mathbb{R}_{++}$ such that

$$ \int_{\mathbb{R}^{d'}} |y|^{p'}\,\mathfrak{h}(\mu )(dy) = \int_{\mathbb{R}^{d}} |h(x)|^{p'}\,\mu (dx) \le c\Big(1+\int_{\mathbb{R}^{d}} |x|^{p}\,\mu (dx)\Big) < \infty $$

for any $\mu \in{\mathcal{M}}_{d}^{p}$. This gives the first assertion. Since the involved topologies are metrisable, it suffices for the second assertion to show that the map $\mathfrak{h}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d'}^{p'}$ is sequentially continuous. Let $\mu $ and $\mu _{n}$, $n \in \mathbb{N}$, be elements of $\mathcal{M}_{d}^{p}$ such that $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p}$ and thus in particular in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}$. By the classical continuous mapping theorem, we have $\mathfrak{h}(\mu _{n})\to \mathfrak{h}(\mu )$ in $\mathcal{O}_{d'}^{0}\cap{\mathcal{M}}_{d'}^{p'}$. Moreover, by the assumption on $h$, the function $f_{h} : \mathbb{R}^{d} \to \mathbb{R}$ defined by $f_{h}(x):=|h(x)|^{p'}$ lies in $\mathcal{C}_{d}^{p}$. This implies

$$\begin{aligned} \lim_{n\to \infty} \int_{\mathbb{R}^{d'}} |y|^{p'}\,\mathfrak{h}(\mu _{n})(dy) &= \lim_{n\to \infty} \int_{\mathbb{R}^{d}} f_{h}(x)\,\mu _{n}(dx) \\ &= \int_{\mathbb{R}^{d}} f_{h}(x)\,\mu (dx) = \int_{\mathbb{R}^{d'}} |y|^{p'}\,\mathfrak{h}(\mu )(dy). \end{aligned}$$

Thus $\mathfrak{h}(\mu _{n})\to \mathfrak{h}(\mu )$ in $\mathcal{O}_{d'}^{p'}$. This gives the second assertion. □
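
The change-of-variables identity and the growth bound used at the start of this proof can be checked on a small discrete example. The following is only a sketch under assumptions that are not in the article: the measure `mu`, the aggregation map `h(x) = x_1 + x_2`, the exponents `p = p' = 2` and the constant `c = 2` are hypothetical choices, and the helper names `pushforward` and `moment` are made up for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(mu, h):
    """Image measure mu∘h^{-1} of a finitely supported measure (dict: point -> mass)."""
    image = defaultdict(Fraction)
    for x, mass in mu.items():
        image[h(x)] += mass
    return dict(image)


def moment(mu, f):
    """Integral of f with respect to the finitely supported measure mu."""
    return sum(f(x) * mass for x, mass in mu.items())


# Hypothetical measure on R^2 with three atoms and exact rational masses.
mu = {(1, 2): Fraction(1, 2), (-3, 1): Fraction(1, 4), (0, -2): Fraction(1, 4)}


def h(x):
    """Aggregation map from R^2 to R (assumed example, so d = 2 and d' = 1)."""
    return x[0] + x[1]


p_prime = 2

# Left-hand side: p'-th absolute moment of the image measure h(mu).
lhs = moment(pushforward(mu, h), lambda y: abs(y) ** p_prime)
# Right-hand side: integral of |h(x)|^{p'} with respect to mu.
rhs = moment(mu, lambda x: abs(h(x)) ** p_prime)
# Growth bound c * (1 + integral of |x|^p) with c = 2, p = 2 and the Euclidean norm.
bound = 2 * (1 + moment(mu, lambda x: x[0] ** 2 + x[1] ** 2))

print(lhs, rhs, bound)
```

Exact rational arithmetic is used so that the two sides of the identity can be compared with `==` rather than up to rounding.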

A.2 Proof of Theorem 2.3

We first prove that $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$ if and only if $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}$, regardless of the copula $C\in \mathbf{C}_{d}$. If we have $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$, then

$$\begin{aligned} \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) &\le \int_{\mathbb{R}^{d}} c_{1}|x|_{1}^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) \\ &= c_{1}\int_{\mathbb{R}^{d}} \Big(\sum_{i=1}^{d}|\pi _{i}(x)|\Big)^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) \\ &\le c_{p}\sum_{i=1}^{d}\int_{\mathbb{R}} |x_{i}|^{p}\,\mu _{i}(dx_{i}) < \infty , \end{aligned}$$

where $c_{p}:=c_{1}2^{\max \{0,p-1\}}$, $|x|_{1}:=\sum _{i=1}^{d}|x_{i}|$ is the 1-norm of $x=(x_{1},\ldots ,x_{d})$ and $c_{1} \in \mathbb{R}_{++}$ is a suitable constant. Thus $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}$. Conversely, assume that $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}$. Then we have

$$\begin{aligned} \int_{\mathbb{R}} |x_{i}|^{p}\,\mu _{i}(dx_{i}) &= \int_{\mathbb{R}^{d}} |\pi _{i}(x)|^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) \\ &\le \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) < \infty , \end{aligned}$$

i.e., $\mu _{i}\in{\mathcal{M}}_{1}^{p}$, for any $i=1,\ldots ,d$.

To prove the main assertion of Theorem 2.3, we first let $(C,\mu _{1},\ldots ,\mu _{d})$ and $(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})$, $n \in \mathbb{N}$, be elements of $\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}$ such that $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{p}$, $i=1,\ldots ,d$, and $d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0$. Since $p$-weak convergence implies weak convergence, we then have in particular that $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{0}\cap{\mathcal{M}}_{1}^{p}$, $i=1,\ldots ,d$. So Lindner and Szimayer [31, Theorem 2.1] implies that $\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}$. For the convergence $\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ in $\mathcal{O}_{d}^{p}$ (when $p>0$), it remains to show the convergence

$$ \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})(dx) \longrightarrow \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx). $$

Clearly,

$$\begin{aligned} & \bigg| \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})(dx) - \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) \bigg| \\ &\quad \le \bigg| \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})(dx) - \int_{\mathbb{R}^{d}} (|x|^{p}\wedge a)\,\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})(dx) \bigg| \\ &\qquad + \bigg| \int_{\mathbb{R}^{d}} (|x|^{p}\wedge a)\,\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})(dx) - \int_{\mathbb{R}^{d}} (|x|^{p}\wedge a)\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) \bigg| \\ &\qquad + \bigg| \int_{\mathbb{R}^{d}} (|x|^{p}\wedge a)\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) - \int_{\mathbb{R}^{d}} |x|^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) \bigg| \\ &\quad =: S_{1}(n,a)+S_{2}(n,a)+S_{3}(a) \end{aligned}$$

(A.1)

for any $a \in \mathbb{R}_{++}$. Since $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}$ (as seen above), we can choose for every $\varepsilon >0$ a suitable constant $a_{3} \in \mathbb{R}_{++}$ such that

Furthermore, for any $a_{1} \in \mathbb{R}_{++}$, we have

where $c_{p}:=c_{1}\,2^{\max \{0,p-1\}}$, $|x|_{1}:=\sum _{i=1}^{d}|x_{i}|$ and $|x|_{\infty}:=\max _{i=1,\ldots ,d}|x_{i}|$ for $x=(x_{1},\ldots ,x_{d})$, and $c_{1}, c_{\infty} \in \mathbb{R}_{++}$ are suitable constants. For $i=j$, the last integral equals . Since we assumed $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{p}$ (i.e., $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{0}\cap{\mathcal{M}}_{1}^{p}$ and $\int_{\mathbb{R}} |x_{i}|^{p}\,\mu _{n,i}(dx_{i}) \to \int_{\mathbb{R}} |x_{i}|^{p}\,\mu _{i}(dx_{i})$), Krätschmer et al. [28, Theorem 2.3 (5.⇒3.)] ensures that we can choose $a_{1,i,i} \in \mathbb{R}_{++}$ so large that this expression (with $a_{1}=a_{1,i,i}$) is bounded above by $\varepsilon /(3dc_{p})$ uniformly in $n \in \mathbb{N}$. For $i\neq j$ and any $b \in \mathbb{R}_{++}$, the summand is bounded above by

Again by [28, Theorem 2.3] and the assumed convergence $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{p}$, we can choose $b_{i}$ so large that $\sup_{n \in \mathbb{N}} S_{1,i}(n,b_{i}) \le \varepsilon /(6d^{2}c_{p})$. Once we have chosen $b_{i}$, we can in view of $S_{1,i,j}(n,a_{1},b)\le b^{p}\mu _{n,j}[[-a_{1}^{1/p}/c_{\infty},a_{1}^{1/p}/c_{ \infty}]^{c}]$ choose $a_{1,i,j} \in \mathbb{R}_{++}$ so large that $\sup_{n \in \mathbb{N}} S_{1,i,j}(n,a_{1,i,j},b_{i}) \le \varepsilon /(6d(d-1)c_{p})$; take into account that $(\mu _{n,j})_{n \in \mathbb{N}}$ as a weakly convergent sequence is tight. That is, we have $\sup_{n \in \mathbb{N}} S_{1}(n,a_{1}) \le \varepsilon /3$ when we set $a_{1}:=\max _{i,j=1,\ldots ,d}a_{1,i,j}$. Finally, by the already established convergence $\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}$, we can choose $n_{0} \in \mathbb{N}$ such that $S_{2}(n,a)\le \varepsilon /(3c_{p})$ for $a:=\max \{a_{1},a_{3}\}$ and all $n\ge n_{0}$. Altogether, we have shown that for any given $\varepsilon >0$, we can find an $n_{0} \in \mathbb{N}$ such that the left-hand side of (A.1) is $\le \varepsilon $ for all $n\ge n_{0}$.

Conversely, let $(C,\mu _{1},\ldots ,\mu _{d})$ and $(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})$, $n \in \mathbb{N}$, be elements of $\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}$ such that $\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ in $\mathcal{O}_{d}^{p}$. Since $p$-weak convergence implies weak convergence, it is a direct consequence of Lindner and Szimayer [31, Theorem 2.1] that this implies $d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0$ and $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{0}$, $i=1,\ldots ,d$. Moreover, for any $i=1,\ldots ,d$ we have

$$\begin{aligned} & \lim_{n\to \infty} \bigg| \int_{\mathbb{R}} |x_{i}|^{p}\,\mu _{n,i}(dx_{i}) - \int_{\mathbb{R}} |x_{i}|^{p}\,\mu _{i}(dx_{i}) \bigg| \\ &\quad = \lim_{n\to \infty} \bigg| \int_{\mathbb{R}^{d}} |\pi _{i}(x)|^{p}\,\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})(dx) - \int_{\mathbb{R}^{d}} |\pi _{i}(x)|^{p}\,\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})(dx) \bigg| = 0 \end{aligned}$$

since the function $|\pi _{i}|^{p} : \mathbb{R}^{d} \to \mathbb{R}$ lies in $\mathcal{C}_{d}^{p}$. Thus we even have $\mu _{n,i}\to \mu _{i}$ in $\mathcal{O}_{1}^{p}$, $i=1,\ldots ,d$. □

A.3 Proof of Corollary 2.6

Of course, it suffices to show that $\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})=\mathcal{O}_{d}^{0} \cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$. Recall that the $p$-weak topology is metrisable. Therefore it suffices to show that for any sequence $(\mu _{n})_{n \in \mathbb{N}} \in \mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})^{\mathbb{N}}$ and any $\mu \in{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$, it holds that $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$ if and only if $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$. If $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$, then $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$ because $\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$ is finer than $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$. Conversely, if $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$, we obtain by Theorem 2.3 (with $p=0$) that $(C_{n})_{n \in \mathbb{N}}$ converges to $C$ in $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}$, where $C_{n}$ and $C$ are (arbitrary) copulas of $\mu _{n}$ and $\mu $, respectively. Again with Theorem 2.3, we conclude that $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$. □

A.4 Proof of Corollary 2.8

Since the involved topologies are both metrisable, it suffices to show that the map $\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}:\mathcal{M}_{d}(\mu _{1}, \ldots ,\mu _{d})\to \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}$ is sequentially continuous for the pair $(\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mu _{1},\ldots ,\mu _{d}}^{\sim})$ for any $p'\in [0,p]$. Let $\mu $ and $\mu _{n}$, $n \in \mathbb{N}$, be elements of $\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$ such that $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$ for some $p'\in [0,p]$. By Corollary 2.6, we obtain that $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$, and by Theorem 2.3, it follows that $\lim _{n\to \infty}d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)=0$ for any copulas $C_{n}$ and $C$ of $\mu _{n}$ and $\mu $, respectively. So we arrive at

$$ \lim _{n\to \infty}d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\big( \mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}} (\mu _{n}),\mathfrak{C}_{ \mu _{1},\ldots ,\mu _{d}}(\mu )\big)=0, $$

i.e., $\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}(\mu _{n})\to \mathfrak{C}_{ \mu _{1},\ldots ,\mu _{d}}(\mu )$ in $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}$. □

A.5 Proof of Example 3.2

We here show that if $\mathcal{M}_{d}'=\mathcal{N}_{d}$, then $\mathfrak{D}_{d}'=\mathbf{C}_{d}^{\mathrm{Ga}}\times\mathcal{N}_{1}\times \cdots \times\mathcal{N}_{1}$.

“⊇” Let $(C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}^{\mathrm{Ga}}\times\mathcal{N}_{1} \times \cdots \times\mathcal{N}_{1}$, which means that we have $\mu _{1}=\mathrm{N}_{m_{1},s_{1}^{2}}, \ldots , \mu _{d}=\mathrm{N}_{m_{d},s_{d}^{2}}$ for some $m_{1}, \ldots , m_{d} \in \mathbb{R}$ and $s_{1}, \ldots , s_{d} \in \mathbb{R}_{++}$, and $C$ is given by (3.2) for some correlation matrix $R$. Recall that quantiles are translation-equivariant and positively homogeneous (on the level of univariate random variables). For the distribution function of the Borel probability measure $\mu :=\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$, we thus obtain

$$\begin{aligned} & F_{\mu}(x_{1},\ldots ,x_{d}) \\ &= C\big(F_{\mu _{1}}(x_{1}),\ldots ,F_{\mu _{d}}(x_{d})\big) \\ & = \boldsymbol{\varPhi}_{\boldsymbol{0},R}\Big(\varPhi _{0,1}^{-1}\big( \varPhi _{m_{1},s_{1}^{2}}(x_{1})\big),\ldots ,\varPhi _{0,1}^{-1}\big( \varPhi _{m_{d},s_{d}^{2}}(x_{d} )\big)\Big) \\ & = \boldsymbol{\varPhi}_{\boldsymbol{0},R}\bigg( \frac{\varPhi _{m_{1},s_{1}^{2}}^{-1}(\varPhi _{m_{1},s_{1}^{2}}(x_{1}))-m_{1}}{s_{1}}, \ldots , \frac{\varPhi _{m_{d},s_{d}^{2}}^{-1}(\varPhi _{m_{d},s_{d}^{2}}(x_{d}))-m_{d}}{s_{d}} \bigg) \\ & = \boldsymbol{\varPhi}_{\boldsymbol{0},R}\bigg( \frac{x_{1}-m_{1}}{s_{1}},\ldots ,\frac{x_{d}-m_{d}}{s_{d}}\bigg) = \boldsymbol{\varPhi}_{m,SRS}(x_{1},\ldots ,x_{d}), \end{aligned}$$

where $m:=(m_{1},\ldots ,m_{d})^{\top}$ and $S$ is the $d\times d$ diagonal matrix with entries $s_{1},\ldots ,s_{d}$ on the diagonal. Note that the matrix $SRS$ is again symmetric and positive semi-definite, implying that $\mu $ is the $d$-variate normal distribution $\mathbf{N}_{m,SRS}$. Since the entries on the diagonal of the matrix $SRS$ are the strictly positive numbers $s_{1}^{2},\ldots ,s_{d}^{2}$, the marginal distributions of $\mu =\mathbf{N}_{m,SRS}$ are $\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}}$, i.e., elements of $\mathcal{N}_{1}$. In particular, $\mu $ is a (possibly degenerate) $d$-variate normal distribution with continuous marginals, i.e., $\mu =\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})$ lies in $\mathcal{M}_{d}'=\mathcal{N}_{d}$. Thus we obtain that $(C,\mu _{1}, \ldots ,\mu _{d})\in \mathfrak{D}_{d}'$.

“⊆” Let $(C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'$, i.e., $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$ and $C\in \mathbf{C}_{d}$ are such that $\mu :=\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}'={\mathcal{N}}_{d}$. Then $\mu _{1}=\mathrm{N}_{m_{1},s_{1}^{2}}, \ldots , \mu _{d}=\mathrm{N}_{m_{d},s_{d}^{2}}$ for some $m_{1}, \ldots , m_{d} \in \mathbb{R}$ and $s_{1}, \ldots , s_{d} \in \mathbb{R}_{++}$, and $\mu =\mathbf{N}_{m,V}$ for the vector $m:=(m_{1},\ldots ,m_{d})^{ \top}$ and a symmetric and positive semi-definite matrix $V \in \mathbb{R}^{d \times d}$ with entries $s_{1}^{2},\ldots ,s_{d}^{2}$ on the diagonal. By (2.1) and the translation-equivariance and positive homogeneity of quantiles (on the level of univariate random variables), we have

$$\begin{aligned} C(u_{1},\ldots ,u_{d}) = & \boldsymbol{\varPhi}_{m,V}\big(\varPhi _{m_{1},s_{1}^{2}}^{-1}(u_{1}), \ldots ,\varPhi _{m_{d},s_{d}^{2}}^{-1}(u_{d})\big) \\ = & \boldsymbol{\varPhi}_{m,V}\big(s_{1}\varPhi _{0,1}^{-1}(u_{1})+m_{1}, \ldots ,s_{d}\varPhi _{0,1}^{-1}(u_{d})+m_{d}\big) \\ = & \boldsymbol{\varPhi}_{\boldsymbol{0},S^{-1}VS^{-1}}\big(\varPhi _{0,1}^{-1}(u_{1}), \ldots ,\varPhi _{0,1}^{-1}(u_{d})\big), \end{aligned}$$

where $S^{-1}$ is the $d\times d$ diagonal matrix with entries $s_{1}^{-1},\ldots ,s_{d}^{-1}$ on the diagonal. The matrix $R:=S^{-1}VS^{-1}$ is again symmetric and positive semi-definite, and for any $i,j\in \{1,\ldots ,d\}$, its entry at $(i,j)$ is equal to $v_{i,j}/(s_{i}s_{j})$, where $v_{i,j}$ is the entry at $(i,j)$ of the matrix $V$. Since $v_{i,i}=s_{i}^{2}$ for any $i=1,\ldots ,d$, it follows that $R$ is a correlation matrix. Thus $C\in \mathbf{C}_{d}^{\mathrm{Ga}}$ and therefore $(C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}^{\mathrm{Ga}}\times\mathcal{N}_{1} \times \cdots \times\mathcal{N}_{1}$. □
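
The two matrix identities behind this proof, $R=S^{-1}VS^{-1}$ and $V=SRS$, can be illustrated numerically. The sketch below is not taken from the article: the covariance matrix `V` is a hypothetical example, and `diag_scale` is a helper name introduced here only for illustration.

```python
import math


def diag_scale(M, left, right):
    """Return diag(left) @ M @ diag(right) for a square matrix M given as nested lists."""
    d = len(M)
    return [[left[i] * M[i][j] * right[j] for j in range(d)] for i in range(d)]


# Hypothetical covariance matrix V with diagonal entries s_1^2, ..., s_d^2.
V = [[4.0, 1.2, -0.2],
     [1.2, 1.0, 0.15],
     [-0.2, 0.15, 0.25]]

s = [math.sqrt(V[i][i]) for i in range(len(V))]   # standard deviations s_i
s_inv = [1.0 / s_i for s_i in s]

R = diag_scale(V, s_inv, s_inv)   # R = S^{-1} V S^{-1}, entries v_ij / (s_i s_j)
V_back = diag_scale(R, s, s)      # S R S recovers the covariance matrix

for row in R:
    print(row)
```

In this example the diagonal of `R` consists of ones and the off-diagonal entries lie in $[-1,1]$, which is what makes `R` a correlation matrix, and rescaling with `s` returns `V` up to floating-point rounding.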

A.6 Proof of Example 3.3

We here show that if $\mathcal{M}_{d}'=\mathcal{M}_{d}^{p}$ for some $p \in \mathbb{R}_{+}$, then $\mathfrak{D}_{d}'=\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}$.

“⊇” Let $(C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}\times\mathcal{M}_{1}^{p} \times \cdots \times\mathcal{M}_{1}^{p}$. Then, as shown in the first paragraph of Sect. A.2, $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}={\mathcal{M}}_{d}'$. Thus $(C,\mu _{1},\ldots ,\mu _{d})$ lies in $\mathfrak{D}_{d}'$.

“⊆” Let $(C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'$, i.e., $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}$ and $C\in \mathbf{C}_{d}$ are such that $\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}'=\mathcal{M}_{d}^{p}$. Then, as shown in the first paragraph of Sect. A.2, $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$. Thus $(C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}\times\mathcal{M}_{1}^{p} \times \cdots \times\mathcal{M}_{1}^{p}$. □

A.7 Proof of Remark 3.6

The univariate standard normal distribution function $\varPhi _{0,1}$ is continuous and strictly increasing. This implies that pointwise convergence of a sequence of $d$-variate Gaussian copulas $\big(\boldsymbol{\varPhi}_{\boldsymbol{0},R_{n}}(\varPhi _{0,1}^{-1} (\, \cdot \,),\ldots ,\varPhi _{0,1}^{-1} (\, \cdot \,))\big)_{n \in \mathbb{N}}$ to a $d$-variate Gaussian copula

$$ \boldsymbol{\varPhi}_{\boldsymbol{0},R}\big(\varPhi _{0,1}^{-1} (\, \cdot \,),\ldots ,\varPhi _{0,1}^{-1} (\, \cdot \,)\big) $$

is the same as pointwise convergence of $\boldsymbol{\varPhi}_{\boldsymbol{0},R_{n}}(\,\cdot \,,\ldots ,\,\cdot \,)$ to $\boldsymbol{\varPhi}_{\boldsymbol{0},R}(\,\cdot \,,\ldots ,\,\cdot \,)$. The $d$-variate distribution function $\boldsymbol{\varPhi}_{\boldsymbol{0},R}(\,\cdot \,,\ldots ,\,\cdot \,)$ is continuous since it can be represented as

$$ \boldsymbol{\varPhi}_{\boldsymbol{0},R}(\,\cdot \,,\ldots ,\,\cdot \,)= \boldsymbol{\varPhi}_{\boldsymbol{0},R}\Big(\varPhi _{0,1}^{-1}\big(\varPhi _{0,1} (\, \cdot \,)\big),\ldots ,\varPhi _{0,1}^{-1}\big(\varPhi _{0,1} (\, \cdot \,)\big)\Big) $$

and $\boldsymbol{\varPhi}_{\boldsymbol{0},R}(\varPhi _{0,1}^{-1} (\, \cdot \,),\ldots ,\varPhi _{0,1}^{-1} (\, \cdot \,)) $ as a (Gaussian) copula is Lipschitz-continuous with respect to $|\cdot |_{1}$ and the univariate standard normal distribution function $\varPhi _{0,1} (\, \cdot \,)$ is continuous. Therefore (see Shiryaev [48, Sect. III.1]) the latter pointwise convergence is equivalent to weak convergence of the $d$-variate normal distribution $\mathbf{N}_{\boldsymbol{0},R_{n}}$ to the $d$-variate normal distribution $\mathbf{N}_{\boldsymbol{0},R}$, and by Lévy’s continuity theorem, this is the same as

$$ e^{-\langle R_{n}t,t\rangle /2}=\varphi _{\mathbf{N}_{\boldsymbol{0},R_{n}}}(t) \longrightarrow \varphi _{\mathbf{N}_{\boldsymbol{0},R}}(t)=e^{- \langle Rt,t\rangle /2} $$

for any $t \in \mathbb{R}^{d}$. Obviously, the latter holds if and only if $t^{\top}(R_{n}-R)t\to 0$ for any $t \in \mathbb{R}^{d}$.

If $\|R_{n}-R\|_{\mathrm{Mat}}\to 0$ for some matrix norm $\|\cdot \|_{\mathrm{Mat}}$, then $\|R_{n}-R\|_{\mathrm{max}}\to 0$ for the maximum norm $\|\cdot \|_{\mathrm{max}}$, and thus $t^{\top}(R_{n}-R)t\to 0$ for any $t \in \mathbb{R}^{d}$.

Conversely, assume that $t^{\top}(R_{n}-R)t\to 0$ for any $t \in \mathbb{R}^{d}$. Then for any $i=1,\ldots ,d$, we can conclude by choosing $t$ as the $i$th unit vector $\mathrm{e}_{i}$ that the entry $(i,i)$ of the matrix $R_{n}-R$ converges to 0. For $t:=\mathrm{e}_{i}+\mathrm{e}_{j}$, the expression $t^{\top}(R_{n}-R)t$ is twice the entry $(i,j)$ plus entries $(i,i)$ and $(j,j)$ of the symmetric matrix $R_{n}-R$. It follows that also the entry $(i,j)$ of the matrix $R_{n}-R$ converges to 0 for any $i,j=1,\ldots ,d$ with $i\neq j$. Thus $\|R_{n}-R\|_{\mathrm{max}}\to 0$, and consequently $\|R_{n}-R\|_{\mathrm{Mat}}\to 0$ for any matrix norm $\|\cdot \|_{\mathrm{Mat}}$. □
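
The last step of this proof is a polarisation argument: every entry of a symmetric matrix is determined by the values of its quadratic form at $\mathrm{e}_{i}$ and $\mathrm{e}_{i}+\mathrm{e}_{j}$. A minimal sketch of that recovery follows; the matrix `D` stands in for a hypothetical difference $R_{n}-R$, and the function names are illustrative only.

```python
def quad_form(M, t):
    """Evaluate t^T M t for a square matrix M (nested lists) and a vector t."""
    d = len(t)
    return sum(t[i] * M[i][j] * t[j] for i in range(d) for j in range(d))


def unit(d, i):
    """i-th unit vector of R^d (zero-based index)."""
    return [1.0 if k == i else 0.0 for k in range(d)]


def recover_symmetric(q, d):
    """Rebuild a symmetric d x d matrix from its quadratic form q."""
    M = [[0.0] * d for _ in range(d)]
    for i in range(d):
        M[i][i] = q(unit(d, i))                      # q(e_i) is the entry (i, i)
    for i in range(d):
        for j in range(i + 1, d):
            t = [a + b for a, b in zip(unit(d, i), unit(d, j))]
            # q(e_i + e_j) = M_ii + M_jj + 2 M_ij for symmetric M
            M[i][j] = M[j][i] = (q(t) - M[i][i] - M[j][j]) / 2.0
    return M


# Hypothetical difference of two correlation matrices (zero diagonal, symmetric).
D = [[0.0, 0.05, -0.02],
     [0.05, 0.0, 0.01],
     [-0.02, 0.01, 0.0]]

D_rec = recover_symmetric(lambda t: quad_form(D, t), 3)
print(D_rec)
```

Since each entry is a fixed linear combination of finitely many values of the quadratic form, convergence of $t^{\top}(R_{n}-R)t$ to 0 at those finitely many $t$ already forces entrywise convergence.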

A.8 Proof of Example 3.7

For any correlation matrix $R$, use $C^{R}$ to denote the Gaussian copula associated with $R$. By definition, the set $\mathbf{C}_{d}^{\mathrm{Ga}}$ is parametrised by the set of all correlation matrices. Now fix $\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}}\in{\mathcal{N}}_{1}$ and let $R,R_{1},R_{2},\ldots $ be correlation matrices such that $C^{R_{n}}$ converges to $C^{R}$ pointwise, i.e., $C^{R_{n}}(u_{1},\ldots ,u_{d})\to C^{R}(u_{1},\ldots ,u_{d})$ for any $u_{1},\ldots ,u_{d}\in [0,1]$. In particular,

$$ C^{R_{n}}\big(\varPhi _{m_{1},s_{1}^{2}}(x_{1}),\ldots ,\varPhi _{m_{d},s_{d}^{2}}(x_{d}) \big)\longrightarrow C^{R}\big(\varPhi _{m_{1},s_{1}^{2}}(x_{1}),\ldots , \varPhi _{m_{d},s_{d}^{2}}(x_{d})\big) $$

(A.2)

for all $x_{1}, \ldots , x_{d} \in \mathbb{R}$. Now $F_{n}(x_{1},\ldots ,x_{d}):=C^{R_{n}}(\varPhi _{m_{1},s_{1}^{2}}(x_{1}), \ldots ,\varPhi _{m_{d},s_{d}^{2}}(x_{d}))$ and $F(x_{1},\ldots ,x_{d}):=C^{R}(\varPhi _{m_{1},s_{1}^{2}}(x_{1}),\ldots , \varPhi _{m_{d},s_{d}^{2}}(x_{d}))$ are the distribution functions of $\mathfrak{P}_{d}(C^{R_{n}},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}})$ and $\mathfrak{P}_{d}(C^{R},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}})$, respectively. Moreover, $F$ is continuous since $C^{R}$ as a copula is Lipschitz-continuous with respect to $|\cdot |_{1}$ and the univariate normal distribution functions $\varPhi _{m_{1},s_{1}^{2}} (\, \cdot \,),\ldots ,\varPhi _{m_{d},s_{d}^{2}} (\, \cdot \,)$ are continuous. Therefore (see Shiryaev [48, Sect. III.1]) the convergence in (A.2) is equivalent to

$$ \mathfrak{P}_{d}(C^{R_{n}},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}}) \longrightarrow \mathfrak{P}_{d}(C^{R},\mathrm{N}_{m_{1},s_{1}^{2}}, \ldots ,\mathrm{N}_{m_{d},s_{d}^{2}})\qquad \mbox{in $\mathcal{O}_{d}^{0}$}. $$

Moreover, as in (A.1) (with $\mu _{n,i}=\mu _{i}=\mathrm{N}_{m_{i},s_{i}^{2}}$, $n \in \mathbb{N}$, $i=1,\ldots ,d$), we obtain

Thus $\mathfrak{P}_{d}(C^{R_{n}},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}}) \to \mathfrak{P}_{d}(C^{R},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}})$ in $\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d}$. In view of $\mathcal{P}_{d}\circ \mathfrak{P}_{d}=\mathfrak{P}_{d}$ and since the involved topologies are metrisable, we arrive at copula robustness of $\mathcal{P}_{d}:\mathcal{N}_{d}\to{\mathcal{N}}_{d}$, where the image space $\mathcal{N}_{d}$ is equipped with $\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d}$. □

A.9 Proof of Theorem 3.10

Let us first assume that the map $\mathcal{T}_{d}|_{\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})}:\mathcal{M}_{d}( \mu _{1},\ldots ,\mu _{d})\to \mathbf{E}$ is continuous for the pair $(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),{\mathcal{O}}_{\mathbf{E}})$ for any $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$. Since the map

$$ \mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}) $$

is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}( \mu _{1},\ldots ,\mu _{d}))$-continuous for any $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$ by Corollary 2.5, it follows that the map

$$ \mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d})=\mathcal{T}_{d} \circ \mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}): \mathbf{C}_{d}\to \mathbf{E} $$

is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})$-continuous for any $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$. Thus $\mathcal{T}_{d}$ is copula robust.

Conversely, assume $\mathcal{T}_{d}$ is copula robust, i.e., $\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}): \mathbf{C}_{d}\to \mathbf{E}$ is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})$-continuous for any $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$. By way of contradiction, assume that $\mathcal{T}_{d}|_{\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})}:\mathcal{M}_{d}( \mu _{1},\ldots ,\mu _{d})\to \mathbf{E}$ is not continuous for the pair $(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mathbf{E}})$, for some $\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}$. Then one can find elements $\mu _{n}$, $n \in \mathbb{N}$, and $\mu $ of $\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})$ such that we have $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}( \mu _{1},\ldots ,\mu _{d})$, but $\mathcal{T}_{d}(\mu _{n})\not \to{\mathcal{T}}_{d}(\mu )$ in $\mathcal{O}_{\mathbf{E}}$. However, in view of Corollary 2.8, the convergence $\mu _{n}\to \mu $ in $\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})$ implies that $C_{n}\to C$ in $\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}$ for any copulas $C_{n}$ and $C$ of $\mu _{n}$ and $\mu $, respectively. Because the map $\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to \mathbf{E}$ is $(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})$-continuous by assumption, we obtain that $\mathcal{T}_{d}(\mu _{n})\to{\mathcal{T}}_{d}(\mu )$ in $\mathcal{O}_{\mathbf{E}}$. This contradicts $\mathcal{T}_{d}(\mu _{n})\not \to{\mathcal{T}}_{d}(\mu )$ in $\mathcal{O}_{\mathbf{E}}$. □

A.10 Proof of Corollary 3.13

The map $\mathfrak{h}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d'}^{p'}$ defined by $\mathfrak{h}(\mu ):=\mu \circ h^{-1}$ is $(\mathcal{O}_{d}^{p},\mathcal{O}_{d'}^{p'})$-continuous by Proposition 2.2 and therefore copula robust by Theorem 3.12. Because the map $\mathcal{T}_{d'}:\mathcal{M}_{d'}^{p'}\to \mathbf{E}$ is $(\mathcal{O}_{d'}^{p'},\mathcal{O}_{\mathbf{E}})$-continuous by assumption, it follows by Lemma 3.4 (with $\mathcal{U}:=\mathcal{T}_{d'}$ and $\mathcal{T}_{d}:=\mathfrak{h}$) that $\mathcal{T}_{d}'=\mathcal{T}_{d'}\circ \mathfrak{h}:\mathcal{M}_{d}^{p}\to \mathbf{E}$ is copula robust. □

A.11 Proof of Example 4.7

(i) The natural extension $\overline{C}_{0}^{(\alpha )}$ of $C_{0}^{(\alpha )}$ to $\mathbb{R}^{2}$ is the distribution function of the Borel probability measure $\mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})$ on $\mathbb{R}^{2}$. Thus $\mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})=\mathcal{H}_{S_{1}^{ \alpha}\uplus S_{2}^{\alpha}}^{1}/\sqrt{2}$, where $\mathcal{H}_{S_{1}^{\alpha}\uplus S_{2}^{\alpha}}^{1}[\,\cdot \,]:={\mathcal{H}}^{1}[\,\cdot \,\cap (S_{1}^{\alpha}\uplus S_{2}^{\alpha})]$ is the 1-dimensional (Borel) Hausdorff measure $\mathcal{H}^{1}$ on $\mathbb{R}^{2}$ restricted to the union of the two disjoint line segments $S_{1}^{\alpha}$ and $S_{2}^{\alpha}$ with endpoints $(\alpha ,0),(0,\alpha )$ and $(1,\alpha ),(\alpha ,1)$, respectively. That is, the total mass 1 of $\mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})$ is uniformly distributed over $S_{1}^{\alpha}\uplus S_{2}^{\alpha}$. In particular, the two segments carry the masses $\mathcal{H}^{1}[S_{1}^{\alpha}]/\sqrt{2}=\alpha $ and $\mathcal{H}^{1}[S_{2}^{\alpha}]/\sqrt{2}=1-\alpha $. In view of $S_{1}^{\alpha}=\{(x_{1},x_{2})\in [0,1]^{2}:x_{1}+x_{2}=\alpha \}$ and $S_{2}^{\alpha}=\{(x_{1},x_{2})\in [0,1]^{2}:x_{1}+x_{2}=1+\alpha \}$, we can conclude that

$$ \mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})\circ A_{2}^{-1}= \alpha \delta _{\alpha}+(1-\alpha )\delta _{1+\alpha}. $$

In particular, $\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{0}^{(\alpha )},\mu _{1}, \mu _{2})=\alpha $.

The natural extension $\overline{C}_{1}$ of $C_{1}$ to $\mathbb{R}^{2}$ is the distribution function of the Borel probability measure $\mathfrak{P}_{2}(C_{1},\mu _{1},\mu _{2})$ on $\mathbb{R}^{2}$. Thus $\mathfrak{P}_{2}(C_{1},\mu _{1},\mu _{2})=\mathcal{L}^{(2)}_{[0,1]^{2}}$, where $\mathcal{L}^{(2)}_{[0,1]^{2}}[\,\cdot \,]:=\mathcal{L}^{(2)}[\,\cdot \, \cap [0,1]^{2}]$ is the (Borel) Lebesgue measure $\mathcal{L}^{(2)}$ on $\mathbb{R}^{2}$ restricted to $[0,1]^{2}$. Then

$$ \mathfrak{P}_{2}(C_{1},\mu _{1},\mu _{2})\circ A_{2}^{-1}=(\mathcal{L}^{(1)}_{[0,1]} \otimes{\mathcal{L}}^{(1)}_{[0,1]})\circ A_{2}^{-1}=\mathcal{L}^{(1)}_{[0,1]}*{\mathcal{L}}^{(1)}_{[0,1]}=\Delta _{[0,2]}, $$

where $\mathcal{L}^{(1)}_{[0,1]}[\,\cdot \,]:=\mathcal{L}^{(1)}[\,\cdot \,\cap [0,1]]$ is the (Borel) Lebesgue measure $\mathcal{L}^{(1)}$ on ℝ restricted to $[0,1]$ and $\Delta _{[0,2]}$ is the symmetric triangular distribution. Therefore

$$\begin{aligned} &\mathfrak{P}_{2}(C_{t}^{(\alpha )},\mu _{1},\mu _{2})\circ A_{2}^{-1} \\ &=(1-t)\mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})\circ A_{2}^{-1}+t \mathfrak{P}_{2}(C_{1},\mu _{1},\mu _{2})\circ A_{2}^{-1} \\ &=(1-t)\big(\alpha \delta _{\alpha}+(1-\alpha )\delta _{1+\alpha} \big)+t\Delta _{[0,2]} \end{aligned}$$

for any $t\in (0,1]$. For the distribution function $F_{\Delta _{2}}$ of $\Delta _{[0,2]}$, we have $F_{\Delta _{2}}(x)=0$, $=\frac{1}{2}x^{2}$, $=1-\frac{1}{2}(2-x)^{2}$, $=1$ according to whether $x<0$, $x\in [0,1]$, $x\in (1,2]$, $x>2$. For the distribution function $F^{(\alpha )}$ of $\alpha \delta _{\alpha}+(1-\alpha )\delta _{1+\alpha}$, we have $F^{(\alpha )}(x)=0$, $=\alpha $, $=1$ according to whether $x<\alpha $, $x\in [\alpha ,1+\alpha )$, $x\ge 1+\alpha $. Thus for any $t\in (0,1]$, the distribution function $F_{t}^{(\alpha )}$ of $\mathfrak{P}_{2}(C_{t}^{(\alpha )},\mu _{1},\mu _{2})\circ A_{2}^{-1}$ is given by

$$ F_{t}^{(\alpha )}(x) = \textstyle\begin{cases} 0 , &\quad x\in (-\infty ,0), \\ t\tfrac{1}{2}x^{2} , & \quad x\in [0,\alpha ), \\ t\tfrac{1}{2}x^{2} +(1-t)\alpha , &\quad x\in [\alpha ,1), \\ t(1-\tfrac{1}{2}(2-x)^{2})+(1-t)\alpha , &\quad x\in [1,1+\alpha ), \\ t(1-\tfrac{1}{2}(2-x)^{2})+(1-t) , &\quad x\in [1+\alpha ,2), \\ 1 , &\quad x\in [2,\infty ), \end{cases} $$

so that $\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{t}^{(\alpha )},\mu _{1}, \mu _{2})=\sqrt{2\alpha}$.
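The jump of the Value at Risk at $t=0$ can be checked numerically. The following is a minimal sketch, not part of the original proof: it evaluates the distribution function $F_{t}^{(\alpha )}$ displayed above and computes its lower $\alpha $-quantile by bisection. The function names are illustrative, and the sketch assumes $\alpha \in (0,1/2)$, so that $\sqrt{2\alpha}<1$, and that $\mathrm{VaR}_{\alpha}$ is the lower $\alpha $-quantile.

```python
import math


def mixture_cdf(x, t, alpha):
    """F_t^(alpha)(x): (1 - t) * two-point part + t * triangular part on [0, 2]."""
    if x < 0:
        tri = 0.0
    elif x <= 1:
        tri = 0.5 * x * x
    elif x <= 2:
        tri = 1.0 - 0.5 * (2.0 - x) ** 2
    else:
        tri = 1.0
    if x < alpha:
        two_point = 0.0
    elif x < 1 + alpha:
        two_point = alpha
    else:
        two_point = 1.0
    return (1.0 - t) * two_point + t * tri


def lower_quantile(cdf, level, lo=-1.0, hi=3.0, iterations=200):
    """Bisection for inf{x : cdf(x) >= level}; needs cdf(lo) < level <= cdf(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) >= level:
            hi = mid
        else:
            lo = mid
    return hi


def value_at_risk(t, alpha):
    """Lower alpha-quantile of X_1 + X_2 under the mixture copula C_t^(alpha)."""
    return lower_quantile(lambda x: mixture_cdf(x, t, alpha), alpha)


if __name__ == "__main__":
    alpha = 0.3  # illustrative level in (0, 1/2)
    for t in (0.0, 1e-3, 0.5, 1.0):
        print(t, value_at_risk(t, alpha), math.sqrt(2 * alpha))
```

For $t=0$ the routine returns (up to rounding) $\alpha $, whereas for every $t>0$, however small, it returns $\sqrt{2\alpha}$; this is the discontinuity in $t$ that the example exploits.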

(ii) For any $t\in [0,1]$, let $(X_{1}^{(\alpha ),t},X_{2}^{(\alpha ),t})$ be a bivariate random variable with distribution $\mathfrak{P}_{2}(C_{t}^{(\alpha )},\mu _{1},\mu _{2})$. The distribution of the random variable $-(X_{1}^{(\alpha ),0}+X_{2}^{(\alpha ),0})$ is then $\alpha \delta _{-\alpha}+(1-\alpha )\delta _{-(1+ \alpha )}$. Thus $\mathrm{VaR}_{1-\alpha}(-(X_{1}^{(\alpha ),0}+X_{2}^{(\alpha ),0}))=-(1+ \alpha )$. For any $t\in (0,1]$, the distribution function $\hat{F}_{t}^{(\alpha )}$ of $-(X_{1}^{(\alpha ),t}+X_{2}^{(\alpha ),t})$ satisfies $\hat{F}_{t}^{( \alpha )}(x)=1-F_{t}^{(\alpha )}(-x)$ for all those $x\in \mathbb{R}$ for which $-x$ is a continuity point of $F_{t}^{(\alpha )}$. Thus for any $t\in (0,1]$, we get $\mathrm{VaR}_{1-\alpha}(-(X_{1}^{(\alpha ),t}+X_{2}^{(\alpha ),t}))=- \sqrt{2\alpha}$. Hence,

$$\begin{aligned} \mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(C_{0}^{(\alpha ),-}, \hat{\mu}_{1},\hat{\mu}_{2}) &=-(1+\alpha ), \\ \mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(C_{t}^{(\alpha ),-}, \hat{\mu}_{1},\hat{\mu}_{2})&=-\sqrt{2\alpha}\qquad \text{for any $t\in (0,1]$}, \end{aligned}$$

where $C_{t}^{(\alpha ),-}$ denotes the copula of the distribution of $(-X_{1}^{(\alpha ),t},-X_{2}^{(\alpha ),t})$; here we use that $\hat{\mu}_{j}$ is the distribution of $-X_{j}^{(\alpha ),t}$, $j=1,2$. This gives the assertion since $C_{t}^{(\alpha ),-}=\hat{C}_{t}^{(\alpha )}$ for any $t\in [0,1]$. The latter equality holds true since the distribution function $F^{(\alpha ),t,-}$ of $(-X_{1}^{(\alpha ),t},-X_{2}^{(\alpha ),t})$ satisfies

$$\begin{aligned} F^{(\alpha ),t,-}(x_{1},x_{2})=\overline{F}^{(\alpha ),t}(-x_{1},-x_{2})&= \hat{C}_{t}^{(\alpha )}\big(1-F_{\mu _{1}}(-x_{1}),1-F_{\mu _{2}}(-x_{2}) \big) \\ &=\hat{C}_{t}^{(\alpha )}\big(F_{\hat{\mu}_{1}}(x_{1}),F_{\hat{\mu}_{2}}(x_{2}) \big) \end{aligned}$$

for any $t\in [0,1]$, where $\overline{F}^{(\alpha ),t}$ denotes the survival function of $(X_{1}^{(\alpha ),t},X_{2}^{(\alpha ),t})$. □
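The description of $\mathfrak{P}_{2}(C_{t}^{(\alpha )},\mu _{1},\mu _{2})$ used in this proof also suggests a direct sampler. The sketch below is an illustration under stated assumptions rather than part of the argument: it takes $\mu _{1}=\mu _{2}$ to be uniform on $[0,1]$ (as the identification with the restricted Lebesgue measure indicates), draws from the independence copula with probability $t$ and from the uniform distribution on $S_{1}^{\alpha}\uplus S_{2}^{\alpha}$ otherwise, and estimates the quantiles of the aggregate position and of its negative. All function names, the seed and the sample size are arbitrary choices.

```python
import math
import random


def sample_pair(t, alpha, rng):
    """One draw (x1, x2) from the mixture (1 - t) * C_0^(alpha) + t * C_1."""
    if rng.random() < t:
        # Independence copula: two independent uniforms on [0, 1).
        return rng.random(), rng.random()
    # Uniform mass on the two segments x1 + x2 = alpha and x1 + x2 = 1 + alpha.
    u = rng.random()
    v = alpha - u if u < alpha else 1.0 + alpha - u
    return u, v


def empirical_lower_quantile(values, level):
    """Smallest order statistic whose empirical distribution function is >= level."""
    ordered = sorted(values)
    index = max(math.ceil(level * len(ordered)) - 1, 0)
    return ordered[index]


def simulate_sums(t, alpha, n, seed=0):
    """n simulated values of X_1 + X_2 under the mixture copula."""
    rng = random.Random(seed)
    return [sum(sample_pair(t, alpha, rng)) for _ in range(n)]


if __name__ == "__main__":
    alpha, t = 0.3, 0.5
    sums = simulate_sums(t, alpha, 200_000)
    print(empirical_lower_quantile(sums, alpha), math.sqrt(2 * alpha))
    print(empirical_lower_quantile([-s for s in sums], 1 - alpha), -math.sqrt(2 * alpha))
```

For $t>0$ both estimates settle near $\pm \sqrt{2\alpha}$. At $t=0$ the aggregate takes only the two values $\alpha $ and $1+\alpha $, and an empirical quantile at the level of the atom is unstable, so the exact computation in the proof is the more reliable guide in that case.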

A.12 Proof of (4.4)

Let $p\in [1,\infty )$ and $\rho :L^{p}\to \mathbb{R}$ be a comonotonic convex risk measure. Moreover, let $\mu \in{\mathcal{M}}_{d}^{p}$. Since $(\Omega ,\mathcal{F},\mathbb{P})$ is assumed to be atomless, we can choose $(X_{1},\ldots ,X_{d})\in L^{p}\times \cdots \times L^{p}$ such that $\mathbb{P}\circ (X_{1},\ldots ,X_{d})^{-1}=\mu $. A result of Filipović and Svindland [21, Corollary 2.7] ensures that there exists a comonotone optimal capital and risk allocation $(X_{1}^{*},\ldots ,X_{d}^{*})$ of $X:=\sum _{i=1}^{d}X_{i}$ ($\in L^{p}$). Thus

$$\begin{aligned} \mathcal{R}_{\boxtimes _{i=1}^{d}\rho}(\mu ) &=\mathcal{R}_{\boxtimes _{i=1}^{d}\rho}\big(\mathbb{P}\circ (X_{1},\ldots ,X_{d})^{-1}\big) \\ &=\boxtimes _{i=1}^{d}\rho (X_{1},\ldots ,X_{d})=\square _{i=1}^{d}\rho \Big(\sum _{i=1}^{d}X_{i}\Big)=\sum _{i=1}^{d}\rho (X_{i}^{*}) \\ &=\rho \Big(\sum _{i=1}^{d}X_{i}^{*}\Big)=\rho \Big(\sum _{i=1}^{d}X_{i}\Big) \\ &=\mathcal{R}_{\rho}\Big(\mathbb{P}\circ \Big(\sum _{i=1}^{d}X_{i}\Big)^{-1}\Big)=\mathcal{R}_{\rho}(\mu \circ A_{d}^{-1}), \end{aligned}$$

where the fifth equality relies on the comonotonicity of $\rho $. □

A.13 Proof of Theorem 5.2

We can use arguments similar to those used in the proof of Claus et al. [9, Corollary 2.4]. In view of Lemma 5.1 and the compactness of $\varXi $, we can apply Bonnans and Shapiro [6, Proposition 4.4] to obtain that the map $\mathcal{R}_{\rho ,h}:\mathcal{M}_{d}^{\gamma p}\to \mathbb{R}$ is continuous with respect to $(\mathcal{O}_{d}^{\gamma p},\mathcal{O}_{\mathbb{R}})$. Berge [4, Theorem VI.3.2] ensures that the infimum in (5.3) is attained for any $\mu \in{\mathcal{M}}_{d}^{\gamma p}$. □

A.14 Proof of Corollary 5.4

Conditions (a)–(c) of Sect. 5.1 hold true for $p=1$. Indeed, condition (a) holds true since monotonicity, distribution-invariance and convexity carry over from $\sigma $ to $\rho $. Condition (b) with $p=1$ clearly holds true for the function $h:\varXi \times \mathbb{R}^{d}\to \mathbb{R}$ defined by (5.5), and condition (c) holds true since $h$ is continuous everywhere. Thus, since the set $\varXi $ defined by (5.4) is a compact subset of $\mathbb{R}^{d}$, the assertions follow from Theorem 5.2. □

A.15 Proof of Lemma 6.1

Since we assumed that conditions (a) and (c) hold true, we have that

$$\begin{aligned} & \mathbb{E}^{x_{0},P;\pi}\big[|r_{k}(X_{k},f_{k}(X_{k}))|\big] \\ & \le \mathbb{E}^{x_{0},P;\pi}\big[K_{1}\psi (X_{k})\big] \\ & = K_{1}\int \cdots \int \int \psi (y_{k})\,P_{k-1}\big((y_{k-1},f_{k-1}(y_{k-1})),dy_{k}\big) \\ & \phantom{=} \qquad P_{k-2}\big((y_{k-2},f_{k-2}(y_{k-2})),dy_{k-1}\big)\cdots P_{0}\big((x_{0},f_{0}(x_{0})),dy_{1}\big) \\ & \le K_{1}K_{3}^{k}\psi (x_{0}). \end{aligned}$$

Thus $r_{k}(X_{k},f_{k}(X_{k}))$, $k=0,\ldots ,N-1$, are $\mathbb{P}^{x_{0},P;\pi}$-integrable for any $x_{0}\in E$, $P\in{\mathcal{P}}'$ and $\pi \in \varPi $. Using (b) and (c), we analogously obtain that $r_{N}(X_{N})$ is $\mathbb{P}^{x_{0},P;\pi}$-integrable for any $x_{0}\in E$, $P\in{\mathcal{P}}'$ and $\pi \in \varPi $. For any $n=1,\ldots ,N-1$, we get in the same way that

$$\begin{aligned} |V_{n}^{P,\pi}(x)| & = \bigg|\mathbb{E}^{x_{0},P;\pi}\bigg[\sum _{k=n}^{N-1}r_{k}(X_{k},f_{k}(X_{k}))+r_{N}(X_{N})\,\bigg|\,X_{n}=x\bigg]\bigg| \\ & \le \sum _{k=n}^{N-1}\mathbb{E}^{x_{0},P;\pi}\big[|r_{k}(X_{k},f_{k}(X_{k}))|\,\big|\,X_{n}=x\big]+\mathbb{E}^{x_{0},P;\pi}\big[|r_{N}(X_{N})|\,\big|\,X_{n}=x\big] \\ & \le \sum _{k=n}^{N-1}K_{1}K_{3}^{k-n}\psi (x)+K_{2}K_{3}^{N-n}\psi (x)=K_{n}\psi (x) \end{aligned}$$

holds true for all $\pi \in \varPi $ and $x\in E$, where $K_{n}:=\sum _{k=n}^{N-1}K_{1}K_{3}^{k-n}+K_{2}K_{3}^{N-n}$. Therefore, we indeed have that

$$ \|V_{n}^{P}\|_{\psi}=\sup _{x\in E}|\sup _{\pi \in \varPi}V_{n}^{P,\pi}(x)|/ \psi (x)\le \sup _{x\in E}\sup _{\pi \in \varPi}|V_{n}^{P,\pi}(x)|/\psi (x)< \infty $$

for any $n=0,\ldots ,N-1$ and $P\in{\mathcal{P}}$. □
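The constants $K_{n}$ can be evaluated directly or by a backward recursion. The sketch below is an illustration with arbitrary constants: it compares the closed form $K_{n}=\sum _{k=n}^{N-1}K_{1}K_{3}^{k-n}+K_{2}K_{3}^{N-n}$ with the recursion $K_{N}=K_{2}$, $K_{n}=K_{1}+K_{3}K_{n+1}$. The recursion is not stated in the text; it is an elementary consequence of the closed form and mirrors the backward structure of the value functions.

```python
import math


def bound_constant(n, N, K1, K2, K3):
    """Closed form K_n = sum_{k=n}^{N-1} K1 * K3**(k-n) + K2 * K3**(N-n)."""
    return sum(K1 * K3 ** (k - n) for k in range(n, N)) + K2 * K3 ** (N - n)


def bound_constant_recursive(n, N, K1, K2, K3):
    """Backward recursion K_N = K2 and K_m = K1 + K3 * K_{m+1} for m = N-1, ..., n."""
    K = K2
    for _ in range(N - n):
        K = K1 + K3 * K
    return K


if __name__ == "__main__":
    N, K1, K2, K3 = 5, 1.5, 2.0, 1.2  # illustrative constants, not from the paper
    for n in range(N + 1):
        print(n, bound_constant(n, N, K1, K2, K3),
              bound_constant_recursive(n, N, K1, K2, K3))
```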

A.16 Proof of Theorem 6.2

Let $n\in \{0,\ldots ,N-1\}$ and $x_{n}\in E$. Since $\mathbb{P}^{x_{0},Q;\pi}=\delta _{x_{0}}\otimes Q_{0}^{\pi}\otimes \cdots \otimes Q_{N-1}^{\pi}$, we have for any $Q\in{\mathcal{P}}_{\psi}$ that

$$\begin{aligned} V_{n}^{Q;\pi}(x_{n}) & = \sum _{k=n}^{N-1}\mathbb{E}^{x_{0},Q;\pi}\big[r_{k}(X_{k},f_{k}(X_{k}))\,\big|\,X_{n}=x_{n}\big]+\mathbb{E}^{x_{0},Q;\pi}\big[r_{N}(X_{N})\,\big|\,X_{n}=x_{n}\big] \\ & = r_{n}\big(x_{n},f_{n}(x_{n})\big) \\ & \phantom{=:} +\sum _{k=n+1}^{N-1}\int _{E}\cdots \int _{E} r_{k}\big(x_{k},f_{k}(x_{k})\big)\,Q_{k-1}^{\pi}(x_{k-1},dx_{k})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} +\int _{E}\cdots \int _{E} r_{N}(x_{N})\,Q_{N-1}^{\pi}(x_{N-1},dx_{N})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \end{aligned}$$

and thus

$$\begin{aligned} & V_{n}^{Q;\pi}(x_{n})-V_{n}^{P;\pi}(x_{n}) \\ & = \sum _{k=n+1}^{N-1}\int _{E}\cdots \int _{E} r_{k}\big(x_{k},f_{k}(x_{k}) \big)\,Q_{k-1}^{\pi}(x_{k-1},dx_{k})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} -\sum _{k=n+1}^{N-1}\int _{E}\cdots \int _{E} r_{k}\big(x_{k},f_{k}(x_{k}) \big)\,P_{k-1}^{\pi}(x_{k-1},dx_{k})\cdots P_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} +\int _{E}\cdots \int _{E} r_{N}(x_{N})\,Q_{N-1}^{\pi}(x_{N-1},dx_{N}) \cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} -\int _{E}\cdots \int _{E} r_{N}(x_{N})\,P_{N-1}^{\pi}(x_{N-1},dx_{N}) \cdots P_{n}^{\pi}(x_{n},dx_{n+1}) \\ & = \sum _{k=n+1}^{N-1}\sum _{j=n}^{k-1}\int _{E}\cdots \int _{E} \int _{E}\int _{E}\cdots \int _{E} r_{k}\big(x_{k},f_{k}(x_{k})\big) \, P_{k-1}^{\pi}(x_{k-1},dx_{k})\,\cdots \\ & \phantom{=} \quad \qquad \qquad \qquad \qquad \qquad \qquad P_{j+1}^{ \pi}(x_{j+1},dx_{j+2})\,(Q_{j}^{\pi}-P_{j}^{\pi})(x_{j},dx_{j+1}) \\ & \phantom{=} \quad \qquad \qquad \qquad \qquad \qquad \qquad Q_{j-1}^{ \pi}(x_{j-1},dx_{j})\,\cdots \,Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} +\sum _{j=n}^{N-1}\int _{E}\cdots \int _{E}\int _{E}\int _{E}\cdots \int _{E} r_{N}(x_{N})\, P_{N-1}^{\pi}(x_{N-1},dx_{N})\,\cdots \\ & \phantom{=:} \ \ \, \quad \qquad \qquad \qquad \qquad \qquad P_{j+1}^{\pi}(x_{j+1},dx_{j+2}) \,(Q_{j}^{\pi}-P_{j}^{\pi})(x_{j},dx_{j+1}) \\ & \phantom{=:} \ \ \, \quad \qquad \qquad \qquad \qquad \qquad Q_{j-1}^{\pi}(x_{j-1},dx_{j}) \,\cdots \,Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & = \sum _{j=n}^{N-2}\int _{E}\cdots \int _{E}\int _{E} \\ & \phantom{=:} \qquad \bigg(\sum _{k=j+1}^{N-1}\int _{E}\cdots \int _{E} r_{k}\big(x_{k},f_{k}(x_{k}) \big)\, P_{k-1}^{\pi}(x_{k-1},dx_{k})\,\cdots \,P_{j+1}^{\pi}(x_{j+1},dx_{j+2}) \\ & \qquad \qquad +\int _{E}\cdots \int _{E} r_{N}(x_{N})\, P_{N-1}^{ \pi}(x_{N-1},dx_{N})\,\cdots \,P_{j+1}^{\pi}(x_{j+1},dx_{j+2})\bigg) \\ & \ \ \qquad \qquad \qquad \qquad \, \, \, (Q_{j}^{\pi}-P_{j}^{\pi})(x_{j},dx_{j+1}) 
\,Q_{j-1}^{\pi}(x_{j-1},dx_{j})\,\cdots \,Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} +\int _{E}\cdots \int _{E}\int _{E} r_{N}(x_{N})\, (Q_{N-1}^{\pi}-P_{N-1}^{ \pi})(x_{N-1},dx_{N}) \\ & \phantom{=:} \qquad \qquad \qquad \, Q_{N-2}^{\pi}(x_{N-2},dx_{N-1})\, \cdots \,Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & = \sum _{j=n}^{N-2}\int _{E}\cdots \int _{E}\int _{E} V_{j+1}^{P; \pi}(x_{j+1})\,(Q_{j}^{\pi}-P_{j}^{\pi})(x_{j},dx_{j+1}) \\ & \phantom{=:} \qquad \qquad \qquad \quad \,\,Q_{j-1}^{\pi}(x_{j-1},dx_{j})\, \cdots Q_{n}^{\pi}(x,dx_{n+1}) \\ & \phantom{=:} +\int _{E}\cdots \int _{E}\int _{E} r_{N}(x_{N})\, (Q_{N-1}^{\pi}-P_{N-1}^{ \pi})(x_{N-1},dx_{N}) \\ & \phantom{=:} \qquad \qquad \qquad \,\, Q_{N-2}^{\pi}(x_{N-2},dx_{N-1})\, \cdots \,Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & = \sum _{j=n}^{N-1}\int _{E}\cdots \int _{E}\int _{E} V_{j+1}^{P; \pi}(x_{j+1})\,(Q_{j}^{\pi}-P_{j}^{\pi})(x_{j},dx_{j+1}) \\ & \phantom{=:} \qquad \qquad \qquad \quad \,\, Q_{j-1}^{\pi}(x_{j-1},dx_{j}) \,\cdots \,Q_{n}^{\pi}(x,dx_{n+1}), \end{aligned}$$

where we used the conventions

$$\begin{aligned} & \int _{E}\int _{E} r_{N}(x_{N})\, P_{N-1}^{\pi}(x_{N-1},dx_{N})\,P_{N}^{ \pi}(x_{j+1},dx_{N+1}):=r_{N}(x_{N}), \\ & \int _{E}\cdots \int _{E}\int _{E} V_{n+1}^{P;\pi}(x_{n+1})\,(Q_{n}^{ \pi}-P_{n}^{\pi})(x_{n},dx_{n+1}) \\ & \phantom{=} \qquad \qquad \ \, Q_{n-1}^{\pi}(x_{n-1},dx_{n})\,\cdots Q_{n}^{\pi}(x,dx_{n+1}) \\ & :=\int _{E} V_{n+1}^{P;\pi}(x_{n+1})(Q_{n}^{\pi}-P_{n}^{\pi})(x_{n},dx_{n+1}), \end{aligned}$$

and other similar ones. Therefore,

$$\begin{aligned} & \big|\overline{V}_{n}^{x_{n}}(Q)-\overline{V}_{n}^{x_{n}}(P)\big| \\ & = \Big|\sup _{\pi \in \varPi}V_{n}^{Q;\pi}(x_{n})-\sup _{\pi \in \varPi}V_{n}^{P;\pi}(x_{n})\Big| \\ & \le \sup _{\pi \in \varPi}\big|V_{n}^{Q;\pi}(x_{n})-V_{n}^{P;\pi}(x_{n})\big| \\ & \le \sum _{j=n}^{N-1}\sup _{\pi \in \varPi}\sup _{x\in E}\bigg(\Big|\int _{E}V_{j+1}^{P;\pi}(y)\,Q_{j}^{\pi}(x,dy)-\int _{E}V_{j+1}^{P;\pi}(y)\,P_{j}^{\pi}(x,dy)\Big|\frac{1}{\psi (x)} \\ & \phantom{=} \qquad \times \int _{E}\cdots \int _{E}\psi (x_{j})\,Q_{j-1}^{\pi}(x_{j-1},dx_{j})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1})\bigg) \\ & \le \sum _{j=n}^{N-1}\sup _{\pi \in \varPi}\bigg(\varrho _{\mathbb{M}'}(V_{j+1}^{P;\pi}) \\ & \phantom{=} \qquad \times \sup _{(x,a)\in D_{j}}\Big(d_{\mathbb{M}}\big(Q_{j}((x,a),\,\cdot \,),P_{j}((x,a),\,\cdot \,)\big)\frac{1}{\psi (x)}\Big) \\ & \phantom{=} \qquad \times \int _{E}\cdots \int _{E}\psi (x_{j})\,Q_{j-1}^{\pi}(x_{j-1},dx_{j})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1})\bigg) \\ & \le \sum _{j=n}^{N-1}\sup _{\pi \in \varPi}\bigg(\varrho _{\mathbb{M}'}(V_{j+1}^{P;\pi})\,d_{\mathbb{M},\psi}(Q,P) \\ & \phantom{=} \qquad \times \int _{E}\cdots \int _{E}\psi (x_{j})\,Q_{j-1}^{\pi}(x_{j-1},dx_{j})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1})\bigg). \end{aligned}$$

For the latter multiple integral, we have

$$\begin{aligned} & \int _{E}\cdots \int _{E}\psi (x_{j})\,Q_{j-1}^{\pi}(x_{j-1},dx_{j})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \le \int _{E}\cdots \int _{E}\int _{E}\psi (x_{j})\,P_{j-1}^{\pi}(x_{j-1},dx_{j})\,Q_{j-2}^{\pi}(x_{j-2},dx_{j-1})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} +\int _{E}\cdots \int _{E}\Big|\int _{E}\psi (x_{j})\,\big(Q_{j-1}^{\pi}(x_{j-1},dx_{j})-P_{j-1}^{\pi}(x_{j-1},dx_{j})\big)\Big| \\ & \phantom{=:} \qquad Q_{j-2}^{\pi}(x_{j-2},dx_{j-1})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \le \int _{E}\cdots \int _{E}K_{3,P}\,\psi (x_{j-1})\,Q_{j-2}^{\pi}(x_{j-2},dx_{j-1})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} +\int _{E}\cdots \int _{E}\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M}}\big(Q_{j-1}^{\pi}(x_{j-1},\,\cdot \,),P_{j-1}^{\pi}(x_{j-1},\,\cdot \,)\big) \\ & \phantom{=:} \qquad Q_{j-2}^{\pi}(x_{j-2},dx_{j-1})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \le K_{3,P}\int _{E}\cdots \int _{E}\psi (x_{j-1})\,Q_{j-2}^{\pi}(x_{j-2},dx_{j-1})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} +\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P) \\ & \phantom{=:} \qquad \times \int _{E}\cdots \int _{E}\psi (x_{j-1})\,Q_{j-2}^{\pi}(x_{j-2},dx_{j-1})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & = \big(K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P)\big) \\ & \phantom{=:} \qquad \times \int _{E}\cdots \int _{E}\psi (x_{j-1})\,Q_{j-2}^{\pi}(x_{j-2},dx_{j-1})\cdots Q_{n}^{\pi}(x_{n},dx_{n+1}). \end{aligned}$$

Treating the remaining multiple integral analogously and proceeding iteratively in this way, we may continue with

$$\begin{aligned} & \le \big(K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P)\big)^{j-n-1}\int _{E}\psi (x_{n+1})\,Q_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \le \big(K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P)\big)^{j-n-1} \\ & \phantom{=:} \qquad \times \bigg(\int _{E}\psi (x_{n+1})\,P_{n}^{\pi}(x_{n},dx_{n+1}) \\ & \phantom{=:} \qquad \qquad +\int _{E}\psi (x_{n+1})\,\big(Q_{n}^{\pi}(x_{n},dx_{n+1})-P_{n}^{\pi}(x_{n},dx_{n+1})\big)\bigg) \\ & \le \big(K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P)\big)^{j-n-1}\big(K_{3,P}\,\psi (x_{n})+\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P)\,\psi (x_{n})\big) \\ & = \big(K_{3,P}+\varrho _{\mathbb{M}'}(\psi )\,d_{\mathbb{M},\psi}(Q,P)\big)^{j-n}\,\psi (x_{n}). \end{aligned}$$

Altogether, we obtain the asserted inequality. □
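The telescoping representation of $V_{n}^{Q;\pi}-V_{n}^{P;\pi}$ derived in this proof can be checked on a finite state space, where the kernels $P_{j}^{\pi}$ and $Q_{j}^{\pi}$ become row-stochastic matrices and the iterated integrals become matrix-vector products. The following sketch is only an illustration: the state space, the randomly generated kernels and rewards, and all function names are assumptions made for the example, and the strategy $\pi $ is fixed, so that the one-stage rewards depend on the state only.

```python
import random


def mat_vec(matrix, vector):
    """Apply a kernel (row-stochastic matrix) to a function on the state space."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def random_kernel(rng, size):
    """A random row-stochastic matrix with strictly positive entries."""
    rows = []
    for _ in range(size):
        weights = [rng.random() + 0.1 for _ in range(size)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def value_functions(kernels, rewards, terminal):
    """V_N = r_N and V_n = r_n + P_n V_{n+1} for a fixed strategy."""
    N = len(kernels)
    values = [None] * (N + 1)
    values[N] = list(terminal)
    for n in range(N - 1, -1, -1):
        expected = mat_vec(kernels[n], values[n + 1])
        values[n] = [r + e for r, e in zip(rewards[n], expected)]
    return values


def telescoping_sum(P, Q, values_P, n):
    """sum_{j=n}^{N-1} Q_n ... Q_{j-1} (Q_j - P_j) V^P_{j+1}."""
    N = len(P)
    total = [0.0] * len(values_P[N])
    for j in range(n, N):
        under_Q = mat_vec(Q[j], values_P[j + 1])
        under_P = mat_vec(P[j], values_P[j + 1])
        term = [q - p for q, p in zip(under_Q, under_P)]
        for i in range(j - 1, n - 1, -1):  # apply Q_{j-1} first, Q_n last
            term = mat_vec(Q[i], term)
        total = [a + b for a, b in zip(total, term)]
    return total


if __name__ == "__main__":
    rng = random.Random(1)
    N, size = 4, 3
    P = [random_kernel(rng, size) for _ in range(N)]
    Q = [random_kernel(rng, size) for _ in range(N)]
    rewards = [[rng.uniform(-1, 1) for _ in range(size)] for _ in range(N)]
    terminal = [rng.uniform(-1, 1) for _ in range(size)]
    VP, VQ = value_functions(P, rewards, terminal), value_functions(Q, rewards, terminal)
    for n in range(N):
        print(n, [q - p for q, p in zip(VQ[n], VP[n])], telescoping_sum(P, Q, VP, n))
```

In this finite setting the identity follows by induction from $V_{n}^{Q}-V_{n}^{P}=(Q_{n}-P_{n})V_{n+1}^{P}+Q_{n}(V_{n+1}^{Q}-V_{n+1}^{P})$, which is the one-step version of the decomposition used above.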

A.17 Proof of Lemma 6.5

Let $P\in{\mathcal{P}}_{\alpha}$, i.e., $P=P^{\vec{\boldsymbol{\mu}}}$ for some $\vec{\boldsymbol{\mu}}=(\mu _{n})_{n=1}^{N}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})^{N}$. We have to verify that the defining conditions (a)–(c) of a bounding function are satisfied. Conditions (a) and (b) are trivially satisfied. Since $\langle a,\boldsymbol{1}\rangle \in [0,x]$ for any $(x,a)\in D$ and $\langle a,z\rangle \le \langle a,\langle z, \mathbf{1}\rangle \mathbf{1}\rangle =\langle a,\mathbf{1}\rangle \langle z,\mathbf{1}\rangle $ for any $a,z\in \mathbb{R}_{+}^{d}$, we also have

$$\begin{aligned} & \int _{\mathbb{R}_{+}}\psi _{\alpha}(y)\,P_{n}^{\vec{\boldsymbol{\mu}}}\big((x,a),dy\big) \\ & = 1+\int _{\mathbb{R}_{++}}u_{\alpha}\big(\eta _{n,(x,a)}(z)\big)\,\mu _{n+1}(dz) \\ & = 1+\int _{\mathbb{R}_{++}}\big(Z_{n+1}^{0}x+\langle a,z-Z_{n+1}^{0}\boldsymbol{1}\rangle \big)^{\alpha}\,\mu _{n+1}(dz) \\ & \le 1+(Z_{n+1}^{0})^{\alpha}\big(x-\langle a,\boldsymbol{1}\rangle \big)^{\alpha}+\int _{\mathbb{R}_{++}}\langle a,z\rangle ^{\alpha}\,\mu _{n+1}(dz) \\ & \le 1+\Big(\max _{k=1,\ldots ,N}Z_{k}^{0}\Big)^{\alpha}x^{\alpha}+\int _{\mathbb{R}_{++}}\langle a,\boldsymbol{1}\rangle ^{\alpha}\langle z,\boldsymbol{1}\rangle ^{\alpha}\,\mu _{n+1}(dz) \\ & \le 1+\bigg(\Big(\max _{k=1,\ldots ,N}Z_{k}^{0}\Big)^{\alpha}+\int _{\mathbb{R}_{++}}\langle z,\boldsymbol{1}\rangle ^{\alpha}\,\mu _{n+1}(dz)\bigg)x^{\alpha}\le K_{3}\,\psi _{\alpha}(x) \end{aligned}$$

for any $n=0,\ldots ,N-1$ and $(x,a)\in D$, where

$$ K_{3}:=\Big(\max _{k=1,\ldots ,N}Z_{k}^{0}\Big)^{\alpha}+\max _{k=1,\ldots ,N}\int _{\mathbb{R}_{++}}\langle z,\boldsymbol{1}\rangle ^{\alpha}\,\mu _{k}(dz) $$

is independent of $n=0,\ldots ,N-1$ and $(x,a)\in D$. This shows that condition (c) is also satisfied and that $\mathcal{P}_{\psi _{\alpha}}=\mathcal{P}_{\alpha}$. □

A.18 Proof of Theorem 6.6

In view of Theorem C.3 below, we may and do replace without loss of generality the set of all strategies $\varPi $ by the subset $\varPi _{\mathrm{lin}}$ of all those $\pi =(f_{n})_{n=0}^{N-1}\in \varPi $ for which for any $n=0,\ldots ,N-1$, the decision rule $f_{n}:\mathbb{R}_{+}\to \mathbb{R}^{d}$ admits the representation $f_{n}(x)=\kappa _{n}x$ for some $\kappa _{n}\in K$. For any $\vec{\boldsymbol{\kappa}}=(\kappa _{n})_{n=0}^{N-1}\in K^{N}$, we set $\pi _{\vec{\boldsymbol{\kappa}}}:=(f_{n}^{\kappa _{n}})_{n=0}^{N-1}$ with $f_{n}^{\kappa _{n}}(x):=\kappa _{n} x$. Thus $K^{N}$ can be seen as a parameter set for $\varPi _{\mathrm{lin}}$.

We now show that the assumptions of Corollary 6.3 are met, so that this result ensures the assertion of Theorem 6.6. Let $\vec{\boldsymbol{\mu}}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})^{N}$. By Lemma 6.5, we know that $\psi _{\alpha}$ is a bounding function for $Q\in{\mathcal{P}}_{\alpha}$, and obviously $\varrho _{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}(\psi _{\alpha})<\infty $. So it remains to show that $\sup _{\vec{\boldsymbol{\kappa}}\in K^{N}}\varrho _{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}(V_{n+1}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}})<\infty $ for $n=0,\ldots ,N-1$. By Lemma C.4, we have for any $n=0,\ldots ,N-1$ and $\vec{\boldsymbol{\kappa}}\in K^{N}$ that $V_{n}^{P^{ \vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}(\,\cdot \,)= \phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}\,u_{ \alpha}(\,\cdot \,)$, where $\phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}:= \prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}$ is finite, and therefore $\varrho _{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}(V_{n+1}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}})=\phi _{n+1}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}<\infty $ (with the empty product $\phi _{N}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}:=1$) since $\varrho _{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}(u_{\alpha})=1$. Along with Lemma C.2, we conclude that the assumptions of Corollary 6.3 are indeed met. □

A.19 Proof of Corollary 6.7

Recall the definition $\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}^{d}:=\{h\in \mathbb{R}^{\mathbb{R}_{++}^{d}}:\|h\|_{\mathrm{H\ddot{o}l},\alpha}^{d}\le 1\}$ with the Hölder-$\alpha $ norm $\|h\|_{\mathrm{H\ddot{o}l},\alpha}^{d}:=\sup _{z_{1},z_{2}\in \mathbb{R}_{++}^{d}:z_{1}\neq z_{2}}|h(z_{1})-h(z_{2})|/|z_{1}-z_{2}|^{\alpha}$. It is known from Kern et al. [26] that $d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}^{d}}$ (defined analogously to (6.2)) provides a metric on $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++}^{d})$ that metrises the $\alpha $-weak topology $\mathcal{O}_{d}^{\alpha}(\mathbb{R}_{++}^{d})$.

We now show that the mapping $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})\to \mathcal{P}_{\alpha}$, $\mu \mapsto P^{\overline{\mu}}$, is continuous for the pair $(d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}^{d}},d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha},\psi _{\alpha}})$. The assertion of Corollary 6.7 then directly follows from Theorem 6.6. For any $\nu ,\mu \in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})$, we have

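$$\begin{aligned} & d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha},\psi _{\alpha}}(P^{\overline{\nu}},P^{\overline{\mu}}) \\ & = \max _{n=0,\ldots ,N-1}\sup _{(x,a)\in D_{n}}d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}\big(P_{n}^{\overline{\nu}}((x,a),\,\cdot \,),P_{n}^{\overline{\mu}}((x,a),\,\cdot \,)\big)/\psi _{\alpha}(x) \\ & = \max _{n=0,\ldots ,N-1}\sup _{(x,a)\in D}d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}\big(\nu \circ \eta _{n,(x,a)}^{-1}[\,\cdot \,],\mu \circ \eta _{n,(x,a)}^{-1}[\,\cdot \,]\big)/\psi _{\alpha}(x) \\ & = \max _{n=0,\ldots ,N-1}\sup _{(x,a)\in D}\sup _{v\in \mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}} \\ & \phantom{=} \qquad \Big|\int _{\mathbb{R}_{+}^{d}}v\big(Z_{n+1}^{0}(x-\langle a,\boldsymbol{1}\rangle )+\langle a,z\rangle \big)\,\big(\nu (dz)-\mu (dz)\big)\Big|\frac{1}{\psi _{\alpha}(x)} \\ & \le \sup _{(x,a)\in D}\sup _{v\in \mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}\sup _{y\in \mathbb{R}_{+}} \\ & \phantom{=} \qquad \Big|\int _{\mathbb{R}_{+}^{d}}v\big(y+\langle a,z\rangle \big)\,\nu (dz)-\int _{\mathbb{R}_{+}^{d}}v\big(y+\langle a,z\rangle \big)\,\mu (dz)\Big|\frac{1}{\psi _{\alpha}(x)} \\ & \le \sup _{(x,a)\in D}\sup _{w\in \mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}\Big|\int _{\mathbb{R}_{+}^{d}}w\big(\langle a,z\rangle \big)\,\nu (dz)-\int _{\mathbb{R}_{+}^{d}}w\big(\langle a,z\rangle \big)\,\mu (dz)\Big|\frac{1}{\psi _{\alpha}(x)} \\ & = \sup _{(x,a)\in D}\sup _{w\in \mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}}\Big|\int _{\mathbb{R}_{+}^{d}}h_{w,a}(z)\,\nu (dz)-\int _{\mathbb{R}_{+}^{d}}h_{w,a}(z)\,\mu (dz)\Big|\frac{1}{\psi _{\alpha}(x)}, \end{aligned}$$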

where $D_{n}:=D:=\{(x,a)\in \mathbb{R}_{+}^{d+1}:\langle a,\boldsymbol{1}\rangle \le x\}$ and $h_{w,a}(z):=w(\langle a,z\rangle )$. For the map $h_{w,a}:\mathbb{R}_{++}^{d}\to \mathbb{R}$, we have

$$\begin{aligned} \|h_{w,a}\|_{\mathrm{H\ddot{o}l},\alpha}^{d} & = \sup _{z_{1},z_{2}\in \mathbb{R}_{++}^{d}:z_{1}\neq z_{2}}|h_{w,a}(z_{1})-h_{w,a}(z_{2})|/|z_{1}-z_{2}|^{\alpha} \\ & = \sup _{z_{1},z_{2}\in \mathbb{R}_{++}^{d}:z_{1}\neq z_{2}}\frac{|\langle a,z_{1}\rangle -\langle a,z_{2}\rangle |^{\alpha}}{|z_{1}-z_{2}|^{\alpha}}\,\frac{|h_{w,a}(z_{1})-h_{w,a}(z_{2})|}{|\langle a,z_{1}\rangle -\langle a,z_{2}\rangle |^{\alpha}} \\ & \le \sup _{z_{1},z_{2}\in \mathbb{R}_{++}^{d}:z_{1}\neq z_{2}}\frac{x^{\alpha}|z_{1}-z_{2}|_{\infty}^{\alpha}}{|z_{1}-z_{2}|^{\alpha}}\,\|w\|_{\mathrm{H\ddot{o}l},\alpha}\le c_{\infty}^{\alpha}x^{\alpha}, \end{aligned}$$

where $c_{\infty}\in \mathbb{R}_{++}$ is chosen such that $|\cdot |_{\infty}\le c_{\infty}|\cdot |$ and we used that

$$ |\langle a,z_{1}-z_{2}\rangle |\le \langle a,\mathbf{1}\rangle |z_{1}-z_{2}|_{ \infty}\le x|z_{1}-z_{2}|_{\infty}. $$

Since $\psi _{\alpha}(x)=1+x^{\alpha}$, the above calculation can therefore be continued with

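$$ \le \sup _{x\in \mathbb{R}_{+}}\big(c_{\infty}^{\alpha}x^{\alpha}\,d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}^{d}}(\nu ,\mu )/\psi _{\alpha}(x)\big)\le c_{\infty}^{\alpha}\,d_{\mathbb{M}_{\mathrm{H\ddot{o}l},\alpha}^{d}}(\nu ,\mu ). $$ □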

Appendix B: A comment on the relation between $d_{\mu _{1},\mu _{2}}$ and the metric introduced in [40]

Rachasingho and Tasena [40] recently defined a distance $d$ on the set of bivariate subcopulas as follows. For two bivariate subcopulas $C_{0}$ and $C_{0}'$, they put

$$ d(C_{0},C_{0}'):=\mathfrak{h}_{d_{[0,1]^{2}}}([C_{0}],[C_{0}'])+ \mathfrak{h}_{|\,\cdot \,|}\big(\mathrm{dom}(C_{0}),\mathrm{dom}(C_{0}^{ \prime})\big), $$

where the summand $\mathfrak{h}_{d_{[0,1]^{2}}}([C_{0}],[C_{0}'])$ is the Hausdorff distance (with respect to $d_{[0,1]^{2}}$) between the sets of bivariate copulas $[C_{0}]$ and $[C_{0}']$ induced by $C_{0}$ and $C_{0}'$, respectively, and the summand $\mathfrak{h}_{|\,\cdot \,|}(\mathrm{dom}(C_{0}),\mathrm{dom}(C^{\prime}_{0}))$ is the Hausdorff distance between the domains $\mathrm{dom}(C_{0})$ and $\mathrm{dom}(C_{0}')$ of $C_{0}$ and $C_{0}'$, respectively.

The distance $d_{\mu _{1},\mu _{2}}$ defined by (2.2) basically differs from $d$ for the following reasons. First of all, $d$ is a metric on the set of bivariate subcopulas, whereas $d_{\mu _{1},\mu _{2}}$ is a pseudo-metric on the set of bivariate copulas. Moreover, $d_{\mu _{1},\mu _{2}}$ is designed to be a reasonable distance measure on the set of copulas associated with probability measures from the Fréchet class $\mathcal{M}_{2}(\mu _{1},\mu _{2})$; it defines the distance between two such copulas $C$ and $C'$ by the maximal pointwise distance between the corresponding subcopulas with domain $K:=\overline{\mathrm{ran}F_{\mu _{1}}}\times \overline{\mathrm{ran}F_{\mu _{2}}}$. On the other hand, $d$ is designed to be a reasonable distance measure on the set of arbitrary subcopulas. Even when restricting $d$ to the set of those subcopulas with domain $K$ (and $d_{\mu _{1},\mu _{2}}$ to the set of copulas associated with probability measures from $\mathcal{M}_{2}(\mu _{1},\mu _{2})$), the resulting distance measures are different. Indeed, if $C_{0}$ and $C_{0}'$ are two subcopulas with domain $K$, then

$$ d_{\mu _{1},\mu _{2}}(C,C')=\sup _{u\in K}|C_{0}(u)-C_{0}'(u)|, $$

but $d(C_{0},C_{0}')\ge \sup _{u\in [0,1]^{2}}|C(u)-C'(u)|$ for any copulas $C$ and $C'$ induced by $C_{0}$ and $C_{0}'$, respectively. Note here that $d_{\mu _{1},\mu _{2}}$ is defined in such a way that it does not take into account the behaviour of copulas outside $K=\overline{\mathrm{ran}F_{\mu _{1}}}\times \overline{\mathrm{ran}F_{\mu _{2}}}$, which is motivated by the fact that the behaviour of a copula $C$ outside $K$ is irrelevant for a probability measure from the Fréchet class $\mathcal{M}_{2}(\mu _{1},\mu _{2})$ with copula $C$.
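That $d_{\mu _{1},\mu _{2}}$ ignores the behaviour of copulas outside $K$ can be made concrete with a small computation. The sketch below is an illustration under assumptions not taken from the text: it uses the hypothetical marginals $\mu _{1}=\mu _{2}=\frac{1}{2}(\delta _{0}+\delta _{1})$, for which $K=\{0,\frac{1}{2},1\}^{2}$, and compares the independence copula with an equal-weight mixture of the comonotonicity and countermonotonicity copulas. The two copulas coincide on $K$ but differ elsewhere on $[0,1]^{2}$, which also shows why $d_{\mu _{1},\mu _{2}}$ is only a pseudo-metric.

```python
def independence(u, v):
    """Independence copula."""
    return u * v


def comonotone(u, v):
    """Upper Frechet-Hoeffding bound."""
    return min(u, v)


def countermonotone(u, v):
    """Lower Frechet-Hoeffding bound (bivariate case)."""
    return max(u + v - 1.0, 0.0)


def mixture(u, v):
    """Equal-weight mixture of the two bounds; a convex combination of copulas."""
    return 0.5 * comonotone(u, v) + 0.5 * countermonotone(u, v)


def sup_distance(c1, c2, points):
    """Maximal pointwise distance between two copulas over a finite set of points."""
    return max(abs(c1(u, v) - c2(u, v)) for u, v in points)


# K for the assumed marginals: closure of ran F = {0, 1/2, 1} in each coordinate.
K = [(u, v) for u in (0.0, 0.5, 1.0) for v in (0.0, 0.5, 1.0)]
# A fine grid on the unit square as a stand-in for the supremum over [0, 1]^2.
GRID = [(i / 100, j / 100) for i in range(101) for j in range(101)]

if __name__ == "__main__":
    print(sup_distance(independence, mixture, K))
    print(sup_distance(independence, mixture, GRID))
```

The first distance vanishes, while the second is bounded below by the gap at $(\frac{1}{4},\frac{1}{4})$, where the two copulas take the values $\frac{1}{16}$ and $\frac{1}{8}$.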

Appendix C: Supplements to Sect. 6

Here we discuss the existence of optimal strategies in the Markov decision model considered in Sect. 6. In Sect. C.1, we first consider the general model introduced in Sect. 6.1. Thereafter, in Sect. C.2, we study in detail the special case of the multi-period portfolio optimisation problem considered in Sect. 6.2.

C.1 Existence of optimal strategies in the general model

For any $n=0,\ldots ,N-1$ and $P\in{\mathcal{P}}$, denote by $\mathbb{M}_{n}^{P}(E)$ the set of all $v\in \mathbb{M}(E)$ for which $\int _{E} |v(y)|\,P_{n}((x,f_{n}(x)),dy)<\infty $ for any $x\in E$ and $f_{n}\in F_{n}$. Recall that $\mathbb{M}(E)$ is the set of all $(\mathcal{E},\mathcal{B}(\mathbb{R}))$-measurable maps $v:E\to \mathbb{R}$. For any $n=0,\ldots ,N-1$, $f_{n}\in F_{n}$ and $v\in \mathbb{M}_{n}^{P}(E)$, we define maps $T_{n,f_{n}}^{P}v:E\to \mathbb{R}$ and $T_{n}^{P}v:E\to \mathbb{R}\cup \{\infty \}$ by

$$\begin{aligned} T^{P}_{n,f_{n}}v(x) := & r_{n}\big(x,f_{n}(x)\big)+\int _{E} v(x') \,P_{n}\Big(\big(x,f_{n}(x)\big),dx'\Big), \\ T^{P}_{n}v(x) := & \sup _{f_{n}\in F_{n}}T_{n,f_{n}}^{P} v(x). \end{aligned}$$

Note that $T^{P}_{n,f_{n}}$ and $T^{P}_{n}$ can be seen as maps from $\mathbb{M}_{n}^{P}(E)$ to $\mathbb{M}(E)$ and from $\mathbb{M}_{n}^{P}(E)$ to $(\mathbb{R}\cup \{\infty \})^{E}$, respectively.

For any $n=0,\ldots ,N-1$, $P\in{\mathcal{P}}$ and $v\in \mathbb{M}_{n}^{P}(E)$, a decision rule $f_{n}^{P}\in F_{n}$ is said to be a maximiser of $v$ if $T_{n,f_{n}^{P}}^{P}v(x) = T_{n}^{P}v(x)$ for all $x\in E$. The following result is known from Bäuerle and Rieder [1, Theorem 2.3.8].

Theorem C.1

Let $P\in{\mathcal{P}}$ and assume that for any $n=0,\ldots ,N-1$, there exist sets $\mathbb{M}_{n}^{P}\subseteq \mathbb{M}_{n}^{P}(E)$ and $F^{P}_{n}\subseteq F_{n}$ such that the following three conditions hold:

(a) $r_{N}\in \mathbb{M}_{N-1}^{P}$.

(b) $T_{n}^{P}v\in \mathbb{M}_{n-1}^{P}$ for any $v\in \mathbb{M}_{n}^{P}$ and $n=1,\ldots ,N-1$.

(c) For any $n=0,\ldots ,N-1$ and $v\in \mathbb{M}_{n}^{P}$, there exists an $f_{n}^{P}\in F_{n}^{P}$ that is a maximiser of $v$.

Then the following three assertions hold true:

(i) $V_{0}^{P}\in \mathbb{M}(E)$ and $V_{n+1}^{P}\in \mathbb{M}_{n}^{P}$, $n=0,\ldots ,N-1$. Moreover, the Bellman iteration scheme holds true, i.e., $V_{N}^{P}=r_{N}$ and $V_{n}^{P}=T_{n}^{P}V_{n+1}^{P}$, $n=0,\ldots ,N-1$.

(ii) $V_{n}^{P}=T_{n}^{P}T_{n+1}^{P}\cdots T_{N-1}^{P}r_{N}$ for any $n=0,\ldots ,N-1$.

(iii) For any $n=0,\ldots ,N-1$, there exists an $f_{n}^{P}\in F_{n}^{P}$ that is a maximiser of $V_{n+1}^{P}$. Any such maximisers $f_{0}^{P},\ldots ,f_{N-1}^{P}$ form a strategy $\pi ^{P}:=(f_{n}^{P})_{n=0}^{N-1}\in \varPi $ that is optimal for the optimisation problem (6.1).
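On a finite model the assertions of Theorem C.1 can be traced by hand. The following sketch is a toy illustration, not the general setting of the theorem: the two-point state space, the two actions, the horizon, the rewards and the transition probabilities are all hypothetical. It runs the Bellman iteration $V_{N}=r_{N}$, $V_{n}=T_{n}V_{n+1}$, collects the maximisers into a strategy, and compares the result with a brute-force search over all deterministic Markov strategies.

```python
from itertools import product

STATES = (0, 1)
ACTIONS = (0, 1)
N = 3  # time horizon of the toy model


def reward(n, x, a):
    """Hypothetical one-stage reward r_n(x, a)."""
    return 1.0 * x - 0.3 * a + 0.1 * n


def terminal(x):
    """Hypothetical terminal reward r_N(x)."""
    return 2.0 * x


def kernel(n, x, a):
    """Hypothetical transition probabilities P_n((x, a), .) on STATES."""
    p_up = 0.2 + 0.5 * a + 0.2 * x
    return {0: 1.0 - p_up, 1: p_up}


def q_value(n, x, a, v):
    """r_n(x, a) + integral of v with respect to P_n((x, a), .)."""
    probs = kernel(n, x, a)
    return reward(n, x, a) + sum(probs[y] * v[y] for y in STATES)


def backward_induction():
    """Bellman iteration; returns V_0 and the decision rules f_0, ..., f_{N-1}."""
    v = {x: terminal(x) for x in STATES}
    rules = []
    for n in reversed(range(N)):
        rule = {x: max(ACTIONS, key=lambda a: q_value(n, x, a, v)) for x in STATES}
        v = {x: q_value(n, x, rule[x], v) for x in STATES}
        rules.append(rule)
    rules.reverse()
    return v, rules


def evaluate(rules):
    """Value V_0^pi of a fixed strategy pi = (f_0, ..., f_{N-1})."""
    v = {x: terminal(x) for x in STATES}
    for n in reversed(range(N)):
        v = {x: q_value(n, x, rules[n][x], v) for x in STATES}
    return v


def brute_force():
    """Pointwise supremum of V_0^pi over all deterministic Markov strategies."""
    all_rules = [dict(zip(STATES, choice))
                 for choice in product(ACTIONS, repeat=len(STATES))]
    best = {x: float("-inf") for x in STATES}
    for strategy in product(all_rules, repeat=N):
        v = evaluate(strategy)
        for x in STATES:
            best[x] = max(best[x], v[x])
    return best


if __name__ == "__main__":
    values, rules = backward_induction()
    print(values, rules, brute_force())
```

In this toy model the strategy assembled from the maximisers attains the brute-force optimum in every initial state simultaneously, in line with assertion (iii).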

C.2 Existence of optimal trading strategies in the setting of Sect. 6.2

We now focus on the specific setting of Sect. 6.2, i.e., we discuss the existence of optimal trading strategies for the multi-period portfolio optimisation problem considered there. Let $K:=\{\kappa \in [0,1]^{d}:\langle \kappa ,\boldsymbol{1}\rangle \le 1 \}$ and note that $K$ is compact. For any $\vec{\boldsymbol{\mu}}=(\mu _{n})_{n=1}^{N}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})^{N}$, $\kappa \in K$ and $n=0,\ldots ,N-1$, set $\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}:=\int _{\mathbb{R}_{+}^{d}}u_{\alpha}(Z_{n+1}^{0}+\langle \kappa ,z-Z_{n+1}^{0}\boldsymbol{1}\rangle )\,\mu _{n+1}(dz)$. Moreover, set $\gamma _{n}^{\vec{\boldsymbol{\mu}}}:=\sup _{\kappa \in K}\gamma _{n}^{ \vec{\boldsymbol{\mu}};\kappa}$ for any $n=0,\ldots ,N-1$.

Lemma C.2

For any $\vec{\boldsymbol{\mu}}=(\mu _{n})_{n=1}^{N}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})^{N}$ and $n=0,\ldots ,N-1$, there exists at least one solution $\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K$ to the optimisation problem $\max \{\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}:\kappa \in K\}$. In particular, the maximal value $\gamma _{n}^{\vec{\boldsymbol{\mu}}}=\gamma _{n}^{ \vec{\boldsymbol{\mu}};\kappa _{n}^{\vec{\boldsymbol{\mu}}}}$ is finite.

Proof

Let $\vec{\boldsymbol{\mu}}=(\mu _{n})_{n=1}^{N}\in \mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})^{N}$ and $n\in \{0,\ldots ,N-1\}$. Define a map $g_{n}:\mathbb{R}_{+}^{d}\times K\to \mathbb{R}_{+}$ by $g_{n}(z,\kappa ):=u_{\alpha}(Z_{n+1}^{0}+\langle \kappa ,z - Z_{n+1}^{0} \boldsymbol{1}\rangle )$. The map $g_{n}(\,\cdot \,,\kappa )$ is Borel-measurable for any fixed $\kappa \in K$, and we have

$$ |g_{n}(z,\kappa )|=u_{\alpha}(Z_{n+1}^{0}+\langle \kappa ,z - Z_{n+1}^{0} \boldsymbol{1}\rangle )\le u_{\alpha}(Z_{n+1}^{0} + \langle \boldsymbol{1},z\rangle ) $$

for any $(z,\kappa )\in \mathbb{R}_{+}^{d}\times K$. Therefore, $g_{n}$ is dominated by the Borel-measurable function $h_{n}:\mathbb{R}_{+}^{d}\to \mathbb{R}_{+}$ defined by $h_{n}(z):=u_{\alpha}(Z_{n+1}^{0}+\langle z,\boldsymbol{1}\rangle )$. This function is $\mu _{n+1}$-integrable since $\int _{\mathbb{R}_{+}^{d}}h_{n}\,d\mu _{n+1}\le (Z_{n+1}^{0})^{\alpha}+\int _{\mathbb{R}_{+}^{d}}\langle \boldsymbol{1},z\rangle ^{\alpha}\,\mu _{n+1}(dz)<\infty $ (take into account the definition of $\mathcal{M}_{1}^{\alpha}(\mathbb{R}_{++})$), and $g_{n}(z,\,\cdot \,)$ is continuous on $K$ for any $z\in \mathbb{R}_{+}^{d}$. So we can apply the continuity lemma (in the form of Bauer [2, Lemma 16.1]) to obtain that the map $G_{n}:K\to \mathbb{R}_{++}$ defined by $G_{n}(\kappa ):=\int _{\mathbb{R}_{+}^{d}}g_{n}(z,\kappa )\,\mu _{n+1}(dz)$ is continuous. Since $K$ is compact, we can infer that there exists a solution $\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K$ to the optimisation problem $\max \{\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}:\kappa \in K\}$. □

Part (ii) of the following result shows in particular that an optimal trading strategy can be found in the subset $\varPi _{\mathrm{lin}}$ of all those $\pi =(f_{n})_{n=0}^{N-1}\in \varPi $ for which for any $n=0, \ldots ,N-1$, the decision rule $f_{n}:\mathbb{R}_{+}\to \mathbb{R}^{d}$ admits the representation $f_{n}(x)=\kappa _{n}x$ for some $\kappa _{n}\in K$. For any $n=0,\ldots ,N-1$, let $\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K$ be any solution to the optimisation problem $\max \{\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}:\kappa \in K\}$ (see Lemma C.2).

Theorem C.3

For any $\vec{\boldsymbol{\mu}} \in M_{1}^{\alpha}(R_{++})^{N}$, the following two assertions hold true:

(i) For any $n=0,\ldots ,N-1$, the time-$n$ value function $V_{n}^{P^{\vec{μ}}} : R_{+} \to R$ admits the representation $V_{n}^{P^{\vec{\boldsymbol{\mu}}}}(\,\cdot \,)=\phi _{n}^{ \vec{\boldsymbol{\mu}}}\,u_{\alpha}(\,\cdot \,)$, where $\phi _{n}^{\vec{\boldsymbol{\mu}}}:= \prod _{j=n}^{N-1}\gamma _{j}^{ \vec{\boldsymbol{\mu}}}$.

(ii) If for every $n=0,\ldots ,N-1$, a decision rule $f_{n}^{\vec{μ}} : R_{+} \to R_{+}^{d}$ at time $n$ is defined by $f_{n}^{\vec{\boldsymbol{\mu}}}(x):=\kappa _{n}^{ \vec{\boldsymbol{\mu}}}\,x$, then $\pi ^{\vec{\boldsymbol{\mu}}}:=(f_{n}^{\vec{\boldsymbol{\mu}}})_{n=0}^{N-1}$ forms an optimal trading strategy for $P^{\vec{\boldsymbol{\mu}}}$.

Proof

(i) We intend to apply Theorem C.1. Let $M_{n}^{P^{\vec{\boldsymbol{\mu}}}} := M'$ and $F_{n}^{P^{\vec{\boldsymbol{\mu}}}}:=F'$ for any $n=0,\ldots ,N-1$, where $M' := \{v_{\theta} : \theta \in R_{++}\}$ and $F':=\{f_{\kappa}:\kappa \in K\}$ with $v_{\theta}(x):=\theta u_{\alpha}(x)$ and $f_{\kappa}(x):=\kappa x$, $x \in R_{+}$. It can be inferred from Lemma 6.5 that $M_{n}^{P^{\vec{\boldsymbol{\mu}}}} := M' \subseteq M_{n}^{P^{\vec{\boldsymbol{\mu}}}}(R_{+})$ for any $n=0,\ldots ,N-1$, where $M_{n}^{P}(R_{+})$ is defined as in Sect. C.1. Moreover, we clearly have $F_{n}^{P^{\vec{\boldsymbol{\mu}}}}:=F'\subseteq F =:F_{n}$ for any $n=0,\ldots ,N-1$.

Below, we show that conditions (a)–(c) of Theorem C.1 are met. Thus the Bellman iteration scheme in part (i) of Theorem C.1 holds true, and so

$$
\begin{aligned}
V_{N-1}^{P^{\vec{\boldsymbol{\mu}}}}(x_{N-1}) &= T_{N-1}^{P^{\vec{\boldsymbol{\mu}}}} V_{N}^{P^{\vec{\boldsymbol{\mu}}}}(x_{N-1}) = T_{N-1}^{P^{\vec{\boldsymbol{\mu}}}} r_{N}(x_{N-1}) = T_{N-1}^{P^{\vec{\boldsymbol{\mu}}}} u_{\alpha}(x_{N-1}) \\
&= \sup_{f_{N-1} \in F_{N-1}} T_{N-1,f_{N-1}}^{P^{\vec{\boldsymbol{\mu}}}} u_{\alpha}(x_{N-1}) \\
&= \sup_{f_{N-1} \in F_{N-1}} \int_{R_{+}} u_{\alpha}(x_{N}) \, P_{N-1}\big((x_{N-1}, f_{N-1}(x_{N-1})), dx_{N}\big) \\
&= \sup_{f_{N-1} \in F_{N-1}} \int_{R_{+}^{d}} \big(Z_{N}^{0} x_{N-1} + \langle f_{N-1}(x_{N-1}), z - Z_{N}^{0} \boldsymbol{1} \rangle \big)^{\alpha} \, \mu_{N}(dz) \\
&= \sup_{\kappa \in K} \int_{R_{+}^{d}} \big(Z_{N}^{0} x_{N-1} + \langle \kappa x_{N-1}, z - Z_{N}^{0} \boldsymbol{1} \rangle \big)^{\alpha} \, \mu_{N}(dz) \\
&= u_{\alpha}(x_{N-1}) \sup_{\kappa \in K} \int_{R_{+}^{d}} \big(Z_{N}^{0} + \langle \kappa, z - Z_{N}^{0} \boldsymbol{1} \rangle \big)^{\alpha} \, \mu_{N}(dz) \\
&= u_{\alpha}(x_{N-1}) \sup_{\kappa \in K} \gamma_{N-1}^{\vec{\boldsymbol{\mu}};\kappa} \\
&= \gamma_{N-1}^{\vec{\boldsymbol{\mu}}} \, u_{\alpha}(x_{N-1}) = \phi_{N-1}^{\vec{\boldsymbol{\mu}}} \, u_{\alpha}(x_{N-1})
\end{aligned}
$$

(C.1)

for any $x_{N-1} \in R_{+}$. If we continue this way, we successively get $V_{n}^{P^{\vec{\boldsymbol{\mu}}}}(\,\cdot \,)=\phi _{n}^{ \vec{\boldsymbol{\mu}}}\,u_{\alpha}(\,\cdot \,)$, $n=N-2,\ldots ,0$.

It remains to verify that conditions (a)–(c) of Theorem C.1 are met. Condition (a) is trivially satisfied, and (b) can be shown by proceeding analogously to (C.1). Furthermore, similarly to (C.1), we obtain for any $n=0,\ldots ,N-1$ and $\theta \in R_{++}$ that $\sup _{f_{n}\in F_{n}}T_{n,f_{n}}^{P^{\vec{\boldsymbol{\mu}}}}v_{ \theta}(\,\cdot \,)=\theta u_{\alpha}(\,\cdot \,)\,\sup _{\kappa \in K} \gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}$ and $\theta u_{\alpha}(\,\cdot \,)\gamma _{n}^{\vec{\boldsymbol{\mu}}; \kappa}=T_{n,f_{\kappa}}^{P^{\vec{\boldsymbol{\mu}}}}v_{\theta}(\, \cdot \,)$ for all $\kappa \in K$. By Lemma C.2, we know that there exists a maximum point $\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K$ of the map $\kappa \mapsto \gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}$, and therefore for any $n=0,\ldots ,N-1$, the decision rule $f_{\kappa _{n}^{\vec{\boldsymbol{\mu}}}}$ is a maximiser of $v_{\theta}$. This shows that condition (c) is satisfied, too.

(ii) In the proof of (i), we have seen that the assumptions of Theorem C.1 are fulfilled. Thus Theorem C.1 (i) gives $V_{n+1}^{P^{\vec{\boldsymbol{\mu}}}} \in M' =: M_{n}^{P^{\vec{\boldsymbol{\mu}}}}$ for any $n=0,\ldots ,N-1$. In particular, the above elaborations under (c) show that for any $n=0,\ldots ,N-1$, the decision rule $f_{\kappa _{n}^{\vec{\boldsymbol{\mu}}}}\in F'=:F_{n}^{P^{ \vec{\boldsymbol{\mu}}}}$ provides a maximiser of $V_{n+1}^{P^{\vec{\boldsymbol{\mu}}}}$. Hence Theorem C.1 (iii) ensures that the strategy $\pi ^{\vec{\boldsymbol{\mu}}}:=(f_{\kappa _{n}^{ \vec{\boldsymbol{\mu}}}})_{n=0}^{N-1}\in \varPi _{\mathrm{lin}}$ forms an optimal trading strategy for $P^{\vec{\boldsymbol{\mu}}}$. □
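The backward recursion behind Theorem C.3 can be sketched in a few lines. This is only an illustration under assumptions that are not part of the original text: one risky asset ($d=1$), $K=[0,1]$, discrete measures in place of $\mu _{1},\ldots ,\mu _{N}$, and a grid search in place of an exact maximisation. All function names and data are hypothetical. The sketch computes $\phi _{n}^{\vec{\boldsymbol{\mu}}}=\prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}}}$ and checks one Bellman step against the closed form $\phi _{n}^{\vec{\boldsymbol{\mu}}}u_{\alpha}(x)$.

```python
# Illustrative sketch only: d = 1, K = [0, 1], discrete measures and grid search
# are assumptions. measures[n] plays the role of mu_{n+1}, z0s[n] of Z^0_{n+1}.

def gamma(kappa, z0, atoms, probs, alpha):
    """Expected power utility of the one-period gross return for fraction kappa."""
    return sum(p * (z0 + kappa * (z - z0)) ** alpha for z, p in zip(atoms, probs))

def best_gamma(z0, atoms, probs, alpha, steps=1000):
    """Return (max value, grid maximiser) of kappa -> gamma(kappa) on [0, 1]."""
    return max((gamma(i / steps, z0, atoms, probs, alpha), i / steps)
               for i in range(steps + 1))

def phi_coefficients(z0s, measures, alpha, steps=1000):
    """Backward recursion: phis[n] = gamma_n^* * phis[n + 1], with phis[N] = 1."""
    n_periods = len(measures)
    phis = [1.0] * (n_periods + 1)
    kappas = [0.0] * n_periods
    for n in range(n_periods - 1, -1, -1):
        atoms, probs = measures[n]
        g, k = best_gamma(z0s[n], atoms, probs, alpha, steps)
        kappas[n] = k
        phis[n] = g * phis[n + 1]
    return phis, kappas

def bellman_step(x, n, phis, z0s, measures, alpha, steps=1000):
    """sup over kappa of E[V_{n+1}(next wealth)] with V_{n+1}(y) = phis[n+1] * y**alpha."""
    atoms, probs = measures[n]
    z0 = z0s[n]
    return max(
        sum(p * phis[n + 1] * (x * (z0 + (i / steps) * (z - z0))) ** alpha
            for z, p in zip(atoms, probs))
        for i in range(steps + 1)
    )

# Hypothetical two-period market with identical return distributions.
alpha = 0.5
z0s = [1.0, 1.0]
measures = [([0.5, 1.6], [0.5, 0.5])] * 2
phis, kappas = phi_coefficients(z0s, measures, alpha)
print(phis, kappas)
print(bellman_step(4.0, 0, phis, z0s, measures, alpha), phis[0] * 4.0 ** alpha)
```

The two printed numbers in the last line agree up to rounding because $u_{\alpha}$ is positively homogeneous of degree $\alpha$, which is the step that lets $u_{\alpha}(x_{N-1})$ be pulled out of the supremum in (C.1).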

The following lemma specifies the value functions for fixed strategies. Recall that the values $\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}$, $n=0,\ldots ,N-1$, $\kappa \in K$, were defined before Lemma C.2, and note that for any $\vec{\boldsymbol{\kappa}}=(\kappa _{n})_{n=0}^{N-1}\in K^{N}$, one defines a trading strategy $\pi _{\vec{\boldsymbol{\kappa}}}:=(f_{n}^{ \vec{\boldsymbol{\kappa}}})_{n=0}^{N-1}\in \varPi _{\mathrm{lin}}$ by setting $f_{n}^{\vec{\boldsymbol{\kappa}}}(x):=\kappa _{n}x$ for any $x \in R_{+}$ and $n=0,\ldots ,N-1$.

Lemma C.4

Let $\vec{μ} \in M_{1}^{α} {(R_{+ +})}^{N}$ and $\vec{\boldsymbol{\kappa}}=(\kappa _{n})_{n=0}^{N-1}\in K^{N}$. For $n=0,\ldots ,N-1$, the time-$n$ value function $V_{n}^{P^{\vec{μ}}; π_{\vec{κ}}} : R_{+} \to R$ associated with strategy $\pi _{\vec{\boldsymbol{\kappa}}}$ then admits the representation $V_{n}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}( \,\cdot \,)=\phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{ \vec{\boldsymbol{\kappa}}}}\,u_{\alpha}(\,\cdot \,)$, where $\phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}:= \prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}$.

Proof

For any $n=0,\ldots ,N-1$ and $x_{n} \in R_{+}$, we have

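$$
\begin{aligned}
& V_{n}^{P^{\vec{\boldsymbol{\mu}}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{n}) \\
&= E^{x_{0},P^{\vec{\boldsymbol{\mu}}};\pi_{\vec{\boldsymbol{\kappa}}}}\big[r_{N}(X_{N}) \,\big|\, X_{n} = x_{n}\big] = E^{x_{0},P^{\vec{\boldsymbol{\mu}}};\pi_{\vec{\boldsymbol{\kappa}}}}\big[u_{\alpha}(X_{N}) \,\big|\, X_{n} = x_{n}\big] \\
&= \int_{R_{+}} \cdots \int_{R_{+}} \int_{R_{+}} u_{\alpha}(x_{N}) \, P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{N-1}, dx_{N}) \\
&\qquad P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{N-2}, dx_{N-1}) \cdots P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{n}, dx_{n+1}) \\
&= \int_{R_{+}} \cdots \int_{R_{+}} \int_{R_{+}^{d}} \big(Z_{N}^{0} x_{N-1} + \langle \kappa_{N-1} x_{N-1}, z - Z_{N}^{0} \boldsymbol{1} \rangle \big)^{\alpha} \, \mu_{N}(dz) \\
&\qquad P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{N-2}, dx_{N-1}) \cdots P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{n}, dx_{n+1}) \\
&= \int_{R_{+}} \cdots \int_{R_{+}} x_{N-1}^{\alpha} \int_{R_{+}^{d}} \big(Z_{N}^{0} + \langle \kappa_{N-1}, z - Z_{N}^{0} \boldsymbol{1} \rangle \big)^{\alpha} \, \mu_{N}(dz) \\
&\qquad P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{N-2}, dx_{N-1}) \cdots P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{n}, dx_{n+1}) \\
&= \gamma_{N-1}^{\vec{\boldsymbol{\mu}};\kappa_{N-1}} \int_{R_{+}} \cdots \int_{R_{+}} u_{\alpha}(x_{N-1}) \, P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{N-2}, dx_{N-1}) \cdots P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{n}, dx_{n+1}) .
\end{aligned}
$$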

If we continue successively in this way, we end up with

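$$ V_{n}^{P^{\vec{\boldsymbol{\mu}}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{n}) = \prod_{j=n+1}^{N-1} \gamma_{j}^{\vec{\boldsymbol{\mu}};\kappa_{j}} \int_{R_{+}} u_{\alpha}(x_{n+1}) \, P^{\vec{\boldsymbol{\mu}};\pi_{\vec{\boldsymbol{\kappa}}}}(x_{n}, dx_{n+1}) = \prod_{j=n}^{N-1} \gamma_{j}^{\vec{\boldsymbol{\mu}};\kappa_{j}} \, u_{\alpha}(x_{n}) . $$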

This proves the assertion of the lemma, since $\prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}= \phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}$. □
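The product formula of Lemma C.4 can be checked by brute force in a small discrete setting. The sketch below is an illustration under assumptions that go beyond the text: one risky asset ($d=1$), discrete measures with independent returns across periods, and hypothetical data and function names. It compares $\prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}u_{\alpha}(x_{n})$ with the expected terminal utility obtained by enumerating all return paths under the linear strategy $f_{n}(x)=\kappa _{n}x$.

```python
# Illustrative sketch only: d = 1 and discrete, independent returns are assumptions.
# measures[j] = (atoms, probs) plays the role of mu_{j+1}, z0s[j] of Z^0_{j+1}.
from itertools import product
from math import prod

def gamma(kappa, z0, atoms, probs, alpha):
    """Expected power utility of the one-period gross return for fraction kappa."""
    return sum(p * (z0 + kappa * (z - z0)) ** alpha for z, p in zip(atoms, probs))

def value_by_formula(x, n, kappas, z0s, measures, alpha):
    """Closed form of Lemma C.4: product of the gammas times x**alpha."""
    factors = (gamma(kappas[j], z0s[j], measures[j][0], measures[j][1], alpha)
               for j in range(n, len(measures)))
    return prod(factors) * x ** alpha

def value_by_enumeration(x, n, kappas, z0s, measures, alpha):
    """Expected terminal utility, summing over every path of returns from time n."""
    periods = range(n, len(measures))
    scenario_sets = [list(zip(*measures[j])) for j in periods]
    total = 0.0
    for path in product(*scenario_sets):
        wealth, weight = x, 1.0
        for j, (z, p) in zip(periods, path):
            # Wealth update under f_j(x) = kappa_j * x.
            wealth *= z0s[j] + kappas[j] * (z - z0s[j])
            weight *= p
        total += weight * wealth ** alpha
    return total

# Hypothetical three-period example with a fixed strategy.
alpha = 0.5
z0s = [1.0, 1.01, 1.02]
measures = [([0.5, 1.6], [0.5, 0.5]),
            ([0.8, 1.3], [0.3, 0.7]),
            ([0.9, 1.1, 1.4], [0.2, 0.5, 0.3])]
kappas = [0.2, 0.5, 0.8]
for n in range(3):
    print(n, value_by_formula(2.0, n, kappas, z0s, measures, alpha),
          value_by_enumeration(2.0, n, kappas, z0s, measures, alpha))
```

The enumeration is exponential in the number of remaining periods, so it only serves as a sanity check on small examples; the closed form needs one one-period integral per period.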

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zähle, H. A concept of copula robustness and its applications in quantitative risk management. Finance Stoch 26, 825–875 (2022). https://doi.org/10.1007/s00780-022-00485-8

Received: 20 July 2021
Accepted: 30 March 2022
Published: 13 September 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s00780-022-00485-8
