Abstract
In financial and actuarial applications, marginal risks and their dependence structure are often modelled separately. While it is sometimes reasonable to assume that the marginal distributions are ‘known’, it is usually quite involved to obtain information on the copula (dependence structure). Therefore copula models used in practice are quite often only rough guesses. For many purposes, it is thus relevant to know whether certain characteristics derived from \(d\)-variate risks are robust with respect to (at least small) deviations in the copula. In this article, a general concept of copula robustness is introduced and criteria for copula robustness are presented. These criteria are illustrated by means of several examples from quantitative risk management. The concept of aggregation robustness introduced by Embrechts et al. (Finance Stoch. 19:763–790, 2015) can be embedded in our framework of copula robustness.
1 Introduction
As pointed out by Embrechts et al. [17] and McNeil et al. [33, Sect. 6.2.1], in financial mathematics and actuarial science, marginal risks and their dependence structure are often modelled separately. While the marginal risks of a \(d\)-variate risk are identified with probability distributions \(\mu _{1},\ldots ,\mu _{d}\) on the real line, the dependence structure is most often modelled by a \(d\)-variate copula \(C\). The distribution function \(F_{\mu}\) of the joint distribution \(\mu \) is then given by
\[F_{\mu}(x_{1},\ldots ,x_{d}) = C\big(F_{\mu _{1}}(x_{1}),\ldots ,F_{\mu _{d}}(x_{d})\big),\qquad (x_{1},\ldots ,x_{d})\in \mathbb{R}^{d}, \tag{1.1}\]
where \(F_{\mu _{i}}\) is the distribution function of \(\mu _{i}\).
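The construction in (1.1) can be sketched in a few lines of Python. This is only an illustration under assumed choices: the exponential and standard normal marginals, the two copulas, and all function names are hypothetical and not taken from the paper.

```python
import math

def exp_cdf(x, rate=1.0):
    """Distribution function of an exponential law (illustrative marginal)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def std_normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def independence_copula(u, v):
    return u * v

def comonotone_copula(u, v):
    # Upper Frechet-Hoeffding bound M(u, v) = min(u, v).
    return min(u, v)

def joint_cdf(copula, marginal_cdfs):
    """Return F(x_1, ..., x_d) = C(F_1(x_1), ..., F_d(x_d)) as in (1.1)."""
    def F(*xs):
        return copula(*(F_i(x) for F_i, x in zip(marginal_cdfs, xs)))
    return F

F_indep = joint_cdf(independence_copula, [exp_cdf, std_normal_cdf])
F_comon = joint_cdf(comonotone_copula, [exp_cdf, std_normal_cdf])

# Same marginals: sending one argument to +infinity recovers the other marginal.
print(F_indep(1.0, math.inf), F_comon(1.0, math.inf), exp_cdf(1.0))
# Different dependence: the joint distribution functions differ.
print(F_indep(1.0, 0.0), F_comon(1.0, 0.0))
```

Both joint laws lie in the same Fréchet class; only the copula argument changes between them.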
In practical applications, quantitative risk managers and actuaries are interested in various aspects \(\mathcal{T}_{d}(\mu )\) of the joint distribution \(\mu \) of the individual risks. An important example is \(\mathcal{T}_{d}=\mathcal{R}_{A_{d}}\) with
\[\mathcal{R}_{A_{d}}(\mu ) := \mathcal{R}(\mu \circ A_{d}^{-1}), \tag{1.2}\]
where \(\mathcal{R}\) is the risk functional corresponding to some distribution-invariant ‘downside’ risk measure and \(A_{d}:\mathbb{R}^{d}\to \mathbb{R}\) is a fixed Borel-measurable map regarded as an aggregation map in the spirit of McNeil et al. [33, Sect. 6.2.1]. Standard examples for the aggregation map are \(A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}x_{i}\) and the three other maps presented in Example 4.4 below. Note that \(\mu \circ A_{d}^{-1}\) is the distribution of \(A_{d}(X_{1},\ldots ,X_{d})\) when \((X_{1},\ldots ,X_{d})\) is a random vector distributed according to \(\mu \). Therefore \(\mathcal{R}_{A_{d}}(\mu )\) can be seen as the downside risk of the aggregate position \(A_{d}(X_{1},\ldots ,X_{d})\).
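For a finitely supported \(\mu \), the composition of image measure and risk functional can be computed exactly. The following sketch assumes a hypothetical joint law of two losses and takes the Value at Risk (a lower quantile of the loss distribution) as the risk functional; neither choice comes from the paper, and the function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(joint_pmf, aggregate):
    """Image measure mu o A^{-1} of a finitely supported mu under the map A."""
    image = defaultdict(Fraction)
    for point, prob in joint_pmf.items():
        image[aggregate(point)] += prob
    return dict(image)

def value_at_risk(pmf, level):
    """Lower level-quantile of a finitely supported univariate law."""
    cumulative = Fraction(0)
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative >= level:
            return value
    raise ValueError("probabilities do not sum to at least `level`")

def risk_of_aggregate(joint_pmf, aggregate, level):
    """R_A(mu) = R(mu o A^{-1}) with R chosen as the Value at Risk."""
    return value_at_risk(pushforward(joint_pmf, aggregate), level)

# Hypothetical joint law of two losses (X_1, X_2).
mu = {
    (0, 0): Fraction(1, 2), (1, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 1): Fraction(1, 8), (2, 2): Fraction(1, 8),
}

print(pushforward(mu, sum))                          # law of X_1 + X_2
print(risk_of_aggregate(mu, sum, Fraction(9, 10)))   # aggregation by the sum
print(risk_of_aggregate(mu, max, Fraction(9, 10)))   # aggregation by the maximum
```

Swapping `sum` for `max` changes only the aggregation map, which mirrors how \(A_{d}\) enters (1.2).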
More generally, one could consider \(\mathcal{T}_{d}=\mathcal{R}_{\mathfrak{A}_{d}}\) with
\[\mathcal{R}_{\mathfrak{A}_{d}}(\mu ) := \inf _{A_{d}\in \mathfrak{A}_{d}}\mathcal{R}(\mu \circ A_{d}^{-1}), \tag{1.3}\]
where \(\mathfrak{A}_{d}\) is a fixed set of Borel-measurable maps \(A_{d}:\mathbb{R}^{d}\to \mathbb{R}\). If there exists an \(A_{d}^{*}\in \mathfrak{A}_{d}\) at which the infimum in (1.3) is attained, then \(\mathcal{R}_{\mathfrak{A}_{d}}(\mu )\) can be seen as the smallest possible risk of a position \(A_{d}(X_{1},\ldots ,X_{d})\) derived from the single risks \(X_{1},\ldots ,X_{d}\) with joint distribution \(\mu \) through a function \(A_{d}\in \mathfrak{A}_{d}\). It is worth noting that ‘risk’ here does not necessarily mean downside risk, but can also be for instance a mean–downside risk mixture which is the target value in many portfolio optimisation problems. For details, see Sect. 5.2, in particular Remark 5.5.
Of course, there are many other examples for \(\mathcal{T}_{d}\). One of them is the optimal value in a multi-period portfolio optimisation problem that is addressed in Sect. 6.2. In this example, the role of \(\mu \) is played by the joint distribution of the relative price changes of the \(d\) risky assets that are available on the considered financial market.
When starting from separate models for the copula and the marginal distributions, it is reasonable to regard \(\mathcal{T}_{d}\) as a functional of the copula \(C\) and the marginal distributions \(\mu _{1},\ldots ,\mu _{d}\) via
\[\mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d}) := \mathcal{T}_{d}\big(\mathfrak{p}_{d}(C(F_{\mu _{1}},\ldots ,F_{\mu _{d}}))\big), \tag{1.4}\]
where \(\mathfrak{p}_{d}\) assigns to a \(d\)-variate distribution function its corresponding Borel probability measure on \(\mathbb{R}^{d}\).
In [33, Sect. 6.2.1], McNeil et al. point out that practitioners are often required to work only with partial information. For instance, in some situations, it is possible to obtain (sufficient) information on \(\mu _{1},\ldots ,\mu _{d}\), but it is much more difficult to obtain information on the dependence structure. Carrying this to the extreme, McNeil et al. assume that \(\mu _{1},\ldots ,\mu _{d}\) are fully known and \(C\) is fully unknown. In this case, one cannot specify \(\mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d})\), because \(C\) is unknown. This leads to the ‘Fréchet problem’ of specifying the range of the map \(C\mapsto \mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d})\). In the special case where \(\mathfrak{T}_{d}\) takes values in ℝ, this is often related to finding (sharp) upper and lower bounds for this map. There is a vast literature dealing with this problem; see for instance the works of Rüschendorf [43], [44, Chap. 4], Embrechts and Puccetti [14], Embrechts et al. [15], Puccetti [39], Embrechts et al. [17] and the references cited therein.
In the present paper, a related but different problem is addressed. Still in the case where \(\mu _{1},\ldots ,\mu _{d}\) are known (and fixed), assume that \(\widehat{C}\) is a guess for the true copula \(C\). It might be based on an expert opinion, a statistical estimation, or the like. Of course, as a guess, \(\widehat{C}\) can differ from \(C\). It is clear that a deviation of \(\widehat{C}\) from \(C\) can imply a significant difference between \(\mathfrak{T}_{d}(\widehat{C},\mu _{1},\ldots ,\mu _{d})\) and \(\mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d})\). On the other hand, one might ask whether the difference remains small if the deviation of \(\widehat{C}\) from \(C\) is small. This question was raised and answered by Embrechts et al. [17] in the context of (1.2) with \(A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}x_{i}\). Krätschmer et al. [28, Sect. 4.2.4] took up this concept and generalised the respective result of [17]. In fact, in the latter two references, continuity of the functional \(\mathcal{T}_{d}\) at the probability measure \(\mathfrak{p}_{d}(C(F_{\mu _{1}},\ldots ,F_{\mu _{d}}))\) (with fixed marginal distributions \(\mu _{1},\ldots ,\mu _{d}\) having finite \(p\)th moments) was not considered with respect to a metric on the set of copulas, but with respect to the (relative) weak topology on the set of \(d\)-variate distributions (with marginal distributions \(\mu _{1},\ldots ,\mu _{d}\)). However, it can be seen from Theorem 3.10 below that this is equivalent when the set of copulas is equipped with the supremum distance.
Despite this equivalence, it might be a little more accessible for some readers to measure the difference between two dependence structures directly through the difference between the corresponding copulas, in particular if one starts from separate models for the copula and the marginal distributions. If one follows this approach, one ought to take into account that a \(d\)-variate distribution \(\mu \) with fixed marginal distributions \(\mu _{1},\ldots ,\mu _{d}\) depends on the copula \(C\) only through the values that \(C\) takes on \(\mathrm{ran}F_{\mu _{1}}\times \cdots \times\mathrm{ran}F_{\mu _{d}}\) (\(\subseteq [0,1]^{d}\)), where \(\mathrm{ran}F_{\mu _{i}}\) is the range of \(F_{\mu _{i}}\). This is apparent from (1.1) and suggests measuring the distance between copulas (in the considered framework) only on \(\mathrm{ran}F_{\mu _{1}}\times \cdots \times\mathrm{ran}F_{\mu _{d}}\).
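A small exact computation makes this point concrete. The sketch below assumes two Bernoulli(1/2) marginals, so that the ranges of the marginal distribution functions are \(\{0,1/2,1\}\), and compares the independence copula with the average of the two Fréchet–Hoeffding bounds. The marginals, the copulas and all helper names are illustrative assumptions, not examples from the paper.

```python
from fractions import Fraction
from itertools import product

def independence(u, v):
    return u * v

def mixture_of_bounds(u, v):
    # (W + M) / 2 with W(u, v) = max(u + v - 1, 0) and M(u, v) = min(u, v);
    # a convex combination of copulas is again a copula.
    return (max(u + v - 1, 0) + min(u, v)) / 2

def bernoulli_cdf(p):
    """Distribution function of the Bernoulli(p) law; its range is {0, 1 - p, 1}."""
    def F(x):
        if x < 0:
            return Fraction(0)
        return Fraction(1) - p if x < 1 else Fraction(1)
    return F

def joint_pmf(copula, F1, F2, support=(0, 1)):
    """Point masses of the law with distribution function C(F1, F2) on integer atoms."""
    def H(x, y):
        return copula(F1(x), F2(y))
    # Rectangle formula; valid here because the atoms are consecutive integers.
    return {
        (x, y): H(x, y) - H(x - 1, y) - H(x, y - 1) + H(x - 1, y - 1)
        for x, y in product(support, repeat=2)
    }

F = bernoulli_cdf(Fraction(1, 2))
quarter = Fraction(1, 4)

# The two copulas induce the same joint law for these marginals ...
print(joint_pmf(independence, F, F) == joint_pmf(mixture_of_bounds, F, F))
# ... although they differ away from the ranges of the marginal distribution functions.
print(independence(quarter, quarter), mixture_of_bounds(quarter, quarter))
```

The two copulas agree on \(\{0,1/2,1\}^{2}\), which is all that enters the joint law through (1.1).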
We propose to say that the functional \(\mathcal{T}_{d}\) underlying \(\mathfrak{T}_{d}\) (recall Eq. (1.4)) is copula robust if for any ‘admissible’ univariate distributions \(\mu _{1},\ldots ,\mu _{d}\), the map \(C\mapsto \mathfrak{T}_{d}(C, \mu _{1},\ldots ,\mu _{d})\) is continuous with respect to pointwise (or uniform) convergence on \(\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}\), where it is assumed that \(\mathcal{T}_{d}\) (and thus \(\mathfrak{T}_{d}\)) takes values in a topological space. By ‘admissible’ we mean that one can find at least one copula \(C\) such that the probability measure \(\mathfrak{p}_{d}(C(F_{\mu _{1}},\ldots ,F_{\mu _{d}}))\) is contained in the domain of \(\mathcal{T}_{d}\). The precise definition of copula robustness is given in Sect. 3. The required notation and terminology as well as some auxiliary results are given before in Sect. 2. It is worth mentioning that Theorem 2.3 provides a generalisation of Deheuvels’ [10] copula convergence theorem and that Corollary 2.9 provides a characterisation of weak convergence in Fréchet classes of \(d\)-variate distributions.
In the second part of the paper, we discuss three examples for copula robust functionals \(\mathcal{T}_{d}\). First, in Sect. 4, we address the quantification of the ‘downside risk’ of aggregate financial positions. It will be seen that the functional in (1.2) is copula robust under mild assumptions (Sect. 4.2). The relation of copula robustness to the concept of aggregation robustness of Embrechts et al. [17] (Sect. 4.3) as well as copula robustness of inf-convolution functionals (Sect. 4.4) are also discussed in detail. Second, in Sect. 5, we address stochastic programming problems. It can be inferred from results of Claus et al. [9] that the optimal value of a general stochastic programming problem depends copula robustly on the distribution of the underlying \(d\)-variate input random variable \(Z\). This covers in particular classical one-period portfolio optimisation problems (where the role of \(Z\) is played by the vector of the relative price changes of \(d\) risky assets) and therefore backs in a way a hypothesis of Saida and Prigent [45]. In [45, Sect. 1], they conclude from their numerical investigations that ‘investors must more take care of the specification of the marginal distribution than of the copula function’. Third, in Sect. 6, we address multi-period portfolio optimisation problems and derive results that are similar to those in the one-period case. The main tool in this context is Theorem 6.2 which is a variant of a result of Müller [35] about the continuous dependence of the value function on the transition function in a Markov decision model. Theorem 6.2 is of independent interest and contributes to the general theory of Markov decision processes.
Throughout this paper, \(|\cdot |\) denotes any norm on \(\mathbb{R}^{d}\) and \(\langle \,\cdot \,,\,\cdot \,\rangle \) is the Euclidean scalar product defined by \(\langle x,y\rangle :=\sum _{i=1}^{d}x_{i}y_{i}\) for any elements \(x=(x_{1},\ldots ,x_{d})\) and \(y=(y_{1},\ldots ,y_{d})\) of \(\mathbb{R}^{d}\). Moreover, we set and . The proofs of all results can be found in Appendix A.
2 Preliminary notation, terminology and results
2.1 Fréchet classes and copulas
For any \(d\in \mathbb{N}\), let us use \(\mathcal{M}_{d}\) to denote the set of all Borel probability measures on \(\mathbb{R}^{d}\). For any fixed \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\), denote by \(\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\) the set of all \(\mu \in{\mathcal{M}}_{d}\) having marginals \(\mu _{1},\ldots ,\mu _{d}\), i.e., satisfying \(\mu \circ \pi _{i}^{-1}=\mu _{i}\) for any \(i=1,\ldots ,d\), where \(\pi _{i}:\mathbb{R}^{d}\to \mathbb{R}\) is the projection on the \(i\)th coordinate. The set \(\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\) is known as the Fréchet class associated with the univariate Borel probability measures \(\mu _{1},\ldots ,\mu _{d}\). The distribution function of a Borel probability measure \(\mu \) will be denoted by \(F_{\mu}\).
By definition, a \(d\)-variate copula is the distribution function \(C:[0,1]^{d}\rightarrow [0,1]\) of a Borel probability measure on \([0,1]^{d}\) whose marginal distributions are all given by the uniform distribution on \([0,1]\). The latter condition ensures that each \(d\)-variate copula \(C\) is Lipschitz-continuous. Theorem 2.10.7 in Nelsen’s textbook [36] indeed shows that every \(d\)-variate copula \(C\) satisfies \(|C(u)-C(v)|\le |u-v|_{1}\), where \(|x|_{1}:=\sum _{i=1}^{d}|x_{i}|\) for any \(x=(x_{1},\ldots ,x_{d})\in \mathbb{R}^{d}\).
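The Lipschitz bound can be checked numerically on a grid. The following sketch does so in exact rational arithmetic for three standard bivariate copulas (the Fréchet–Hoeffding bounds \(W\), \(M\) and the independence copula \(\varPi \)); the grid size and the helper name are arbitrary illustrative choices, and a grid check is of course no substitute for the proof in [36].

```python
from fractions import Fraction
from itertools import product

COPULAS = {
    "W": lambda u, v: max(u + v - 1, Fraction(0)),   # lower Frechet-Hoeffding bound
    "Pi": lambda u, v: u * v,                        # independence copula
    "M": lambda u, v: min(u, v),                     # upper Frechet-Hoeffding bound
}

def max_lipschitz_ratio(copula, n=6):
    """Largest value of |C(u) - C(v)| / |u - v|_1 over distinct grid points."""
    grid = [Fraction(i, n) for i in range(n + 1)]
    points = list(product(grid, repeat=2))
    worst = Fraction(0)
    for (u1, u2), (v1, v2) in product(points, repeat=2):
        l1 = abs(u1 - v1) + abs(u2 - v2)
        if l1 > 0:
            worst = max(worst, abs(copula(u1, u2) - copula(v1, v2)) / l1)
    return worst

for name, C in COPULAS.items():
    print(name, max_lipschitz_ratio(C))
```

The ratio never exceeds 1, and the value 1 is attained along the boundary, where every copula satisfies \(C(u,1)=u\).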
Let us denote by \(\mathbf{C}_{d}\) the set of all \(d\)-variate copulas. With any \(C\in \mathbf{C}_{d}\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\), we associate an element \(\mu \) of the Fréchet class \(\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\) through (1.1). It is indeed easily seen that the right-hand side of (1.1) defines a probability distribution function on \(\mathbb{R}^{d}\) and that the corresponding Borel probability measure \(\mu \) has \(\mu _{1},\ldots ,\mu _{d}\) as its marginal distributions. Sklar’s theorem ([49]; see also [36, Theorem 2.10.9]) shows that the distribution function of any element \(\mu \) of \(\mathcal{M}_{d}(\mu _{1},\dots ,\mu _{d})\) admits the representation (1.1). That is, for any \(\mu \in{\mathcal{M}}_{d}(\mu _{1},\dots ,\mu _{d})\), one can find a copula \(C\in \mathbf{C}_{d}\) such that (1.1) holds. On the set \(\mathrm{ran}F_{\mu _{1}}\times \cdots \times\mathrm{ran}F_{\mu _{d}}\), the copula \(C\) is uniquely determined and given by
\[C(u_{1},\ldots ,u_{d}) = F_{\mu}\big(F_{\mu _{1}}^{\leftarrow}(u_{1}),\ldots ,F_{\mu _{d}}^{\leftarrow}(u_{d})\big), \tag{2.1}\]
where \(F_{\mu _{i}}^{\leftarrow}(u_{i}):=\inf \{x\in \mathbb{R}: F_{\mu _{i}}(x)\ge u_{i}\}\) is the left-continuous inverse of \(F_{\mu _{i}}\), and \(F_{\mu}\) is set to \(0\) if one of its arguments equals \(-\infty \). In particular, if \(F_{\mu _{1}},\ldots ,F_{\mu _{d}}\) are all continuous, then the copula \(C\) is unique and given by (2.1) on the whole unit cube \([0,1]^{d}\). For background on copulas, see for instance the textbooks by Durante and Sempi [12, Chaps. 1–2] or Nelsen [36, Chaps. 1–2].
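For any nonempty compact set \(K\subseteq [0,1]^{d}\), we can define a pseudo-metric \(d_{K}\) on \(\mathbf{C}_{d}\) through
\[d_{K}(C,C') := \sup _{u\in K}|C(u)-C'(u)|,\qquad C,C'\in \mathbf{C}_{d}.\]
Since the elements of \(\mathbf{C}_{d}\) are all Lipschitz-continuous with Lipschitz constant 1 on \([0,1]^{d}\), the set \(\mathbf{C}_{d}\) is uniformly equicontinuous. This implies that convergence of a sequence \((C_{n})\subseteq \mathbf{C}_{d}\) to some \(C\in \mathbf{C}_{d}\) with respect to \(d_{K}\) is equivalent to pointwise convergence of \((C_{n})\) to \(C\) on \(K\). The topology on \(\mathbf{C}_{d}\) generated by \(d_{K}\) is denoted by \(\mathcal{O}_{K}\). For any \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\), we let
\[d_{\mu _{1},\ldots ,\mu _{d}} := d_{\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}} \quad \text{and}\quad \mathcal{O}_{\mu _{1},\ldots ,\mu _{d}} := \mathcal{O}_{\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}}. \tag{2.2}\]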
For \(K=[0,1]^{d}\), the pseudo-metric \(d_{K}\) is even a metric, and the topology \(\mathcal{O}_{K}\) is the standard topology on \(\mathbf{C}_{d}\) (and the counterpart of the weak topology on the set of all Borel probability measures on \([0,1]^{d}\) whose distribution functions are \(d\)-variate copulas). In particular, if \(F_{\mu _{1}},\ldots ,F_{\mu _{d}}\) are all continuous, then \(d_{\mu _{1},\ldots ,\mu _{d}}=d_{[0,1]^{d}}\) and \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}=\mathcal{O}_{[0,1]^{d}}\).
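The dependence of \(d_{K}\) on \(K\) can be seen in a small computation. The sketch below assumes Bernoulli(1/2) marginals, for which \(\overline{\mathrm{ran}F_{\mu _{1}}}\times \overline{\mathrm{ran}F_{\mu _{2}}}=\{0,1/2,1\}^{2}\), and uses a finite grid as a stand-in for \([0,1]^{2}\), so the second distance is a grid approximation of the supremum. The copulas and names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def independence(u, v):
    return u * v

def mixture_of_bounds(u, v):
    # Average of the Frechet-Hoeffding bounds W and M, itself a copula.
    return (max(u + v - 1, 0) + min(u, v)) / 2

def d_K(C1, C2, K):
    """Largest absolute difference of two copulas over a finite set K."""
    return max(abs(C1(*u) - C2(*u)) for u in K)

# Closure of ran F x ran F for two Bernoulli(1/2) marginals.
ran = [Fraction(0), Fraction(1, 2), Fraction(1)]
K_marginals = list(product(ran, repeat=2))

# Finite grid standing in for the unit square [0, 1]^2.
n = 100
grid = [Fraction(i, n) for i in range(n + 1)]
K_cube = list(product(grid, repeat=2))

print(d_K(independence, mixture_of_bounds, K_marginals))
print(d_K(independence, mixture_of_bounds, K_cube))
```

The first distance vanishes although the copulas are different, which is why \(d_{\mu _{1},\ldots ,\mu _{d}}\) is in general only a pseudo-metric.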
Convergence of copulas with respect to \(\mathcal{O}_{[0,1]^{d}}\) has been addressed in the literature several times, for instance by Charpentier and Segers [7] and Trutschnig [51]. Metrics inducing topologies that are at least as fine as \(\mathcal{O}_{[0,1]^{d}}\) have been studied for instance by Li et al. [30], Trutschnig [50], Fernández Sánchez and Trutschnig [19] and Kasper et al. [24]. On the other hand, the (pseudo-) metric \(d_{\mu _{1},\ldots ,\mu _{d}}\) defined by (2.2) generates a topology that is at most as fine as \(\mathcal{O}_{[0,1]^{d}}\). It is finally worth mentioning that the metric on the set of bivariate subcopulas that was recently introduced by Rachasingho and Tasena [40] basically differs from the metric \(d_{\mu _{1},\mu _{2}}\) and from its variant \(d_{\mu _{1},\mu _{2}}^{\sim}\) introduced in the following Remark 2.1; for details, see Appendix B.
Remark 2.1
For any fixed \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\), one can regard the pseudo-metric \(d_{\mu _{1},\ldots ,\mu _{d}}\) as a metric when changing from \(\mathbf{C}_{d}\) to the quotient set \(\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\) with respect to the equivalence relation
\[C\sim _{\mu _{1},\ldots ,\mu _{d}}C' \quad :\Longleftrightarrow \quad C(u)=C'(u)\ \text{ for all } u\in \overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}.\]
On the resulting quotient set \(\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\), one may then define a metric \(d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\) through \(d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}([C],[C']):=d_{\mu _{1},\ldots ,\mu _{d}}(C,C')\), where \(C,C'\) are (arbitrary) representatives of the equivalence classes \([C],[C']\). The topology on \(\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\) generated by \(d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\), henceforth denoted by \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\), preserves the topological structure in the sense that a set \(G\subseteq \mathbf{C}_{d}\) lies in \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}\) if and only if the set
\[\{[C]: C\in G\}\]
lies in \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\).
2.2 The set \(\mathcal{M}_{d}^{p}\) and the \(p\)-weak topology
Fix \(p\in [0,\infty )\) and let \(\mathcal{M}_{d}^{p}\) be the set of all \(\mu \in{\mathcal{M}}_{d}\) for which \(\int |x|^{p}\,\mu (dx)<\infty \). Note that \(\mathcal{M}_{d}=\mathcal{M}_{d}^{0}\supseteq{\mathcal{M}}_{d}^{p_{1}}\supseteq{\mathcal{M}}_{d}^{p_{2}}\) for any \(p_{1},p_{2}\in [0,\infty )\) with \(p_{1}\le p_{2}\). The \(p\)-weak topology on \(\mathcal{M}_{d}^{p}\), henceforth denoted by \(\mathcal{O}_{d}^{p}\), is defined as the coarsest topology for which all mappings \(\mu \mapsto \int f\,d\mu \), \(f\in {\mathcal{C}}_{d}^{p}\), are continuous, where \(\mathcal{C}_{d}^{p}\) is the space of all continuous functions \(f:\mathbb{R}^{d}\to \mathbb{R}\) with \(\sup _{x\in \mathbb{R}^{d}}|f(x)|/(1+|x|^{p})<\infty \). The 0-weak topology on \(\mathcal {M}_{d}^{0}\) (\(=\mathcal{M}_{d}\)) is just the classical weak topology, and the \(p\)-weak topology \(\mathcal{O}_{d}^{p}\) is finer than the relative weak topology \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}\) when \(p>0\).
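A standard example shows why the \(p\)-weak topology is strictly finer for \(p>0\): the two-point laws \(\mu _{n}=(1-1/n)\delta _{0}+(1/n)\delta _{n}\) converge weakly to \(\delta _{0}\), but their first absolute moments stay equal to 1. The sketch below checks this in exact arithmetic for \(d=1\) and \(p=1\); the sequence, the bounded test function and the helper names are illustrative assumptions and do not appear in the paper.

```python
from fractions import Fraction

def mu_n(n):
    """Two-point law (1 - 1/n) * delta_0 + (1/n) * delta_n as {atom: mass}."""
    return {0: 1 - Fraction(1, n), n: Fraction(1, n)}

def integrate(f, pmf):
    """Integral of f with respect to a finitely supported law."""
    return sum(f(x) * mass for x, mass in pmf.items())

def bounded_f(x):
    # Continuous and bounded, hence an admissible test function for p = 0.
    return Fraction(1, 1 + x * x)

def abs_moment(x):
    # Growth of order |x|^1, admissible for p = 1 but not for p = 0.
    return abs(x)

delta_0 = {0: Fraction(1)}

for n in (1, 10, 100, 1000):
    law = mu_n(n)
    # The first integral approaches bounded_f(0) = 1, the second stays at 1
    # and therefore does not approach the first moment 0 of delta_0.
    print(n, integrate(bounded_f, law), integrate(abs_moment, law))
```

A single bounded test function does not prove weak convergence, but here the escaping mass \(1/n\) tends to zero, so the same limit arises for every bounded continuous \(f\).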
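It is known from Krätschmer et al. [28, Lemma 2.1] that \((\mathcal{M}_{d}^{p},\mathcal{O}_{d}^{p})\) is a Polish space and that \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p}\) if and only if both \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}\) and
\[\int |x|^{p}\,\mu _{n}(dx)\longrightarrow \int |x|^{p}\,\mu (dx).\]
In particular, \(\mathcal{O}_{d}^{p}\) is metrised by
\[d_{p}(\mu ,\nu ) := d_{\mathrm{weak}}(\mu ,\nu )+\Big|\int |x|^{p}\,\mu (dx)-\int |x|^{p}\,\nu (dx)\Big|\]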
for any metric \(d_{\mathrm{weak}}\) which metrises \(\mathcal{O}_{d}^{0}\). Already in the 1980s, Bickel and Freedman [5, Lemma 8.3] proved for \(p\in [1,\infty )\) that \(\mathcal{O}_{d}^{p}\) is also metrisable by the \(L^{p}\)-Wasserstein metric. The following proposition is a sort of continuous mapping theorem.
Proposition 2.2
Let \(d,d'\in \mathbb{N}\) and \(p,p'\in [0,\infty )\). Let \(h:\mathbb{R}^{d}\to \mathbb{R}^{d'}\) be a continuous function such that \(\sup _{x\in \mathbb{R}^{d}}|h(x)|^{p'}/(1+|x|^{p})<\infty \). Then \(\mathfrak{h}(\mu ):=\mu \circ h^{-1}\) lies in \(\mathcal{M}_{d'}^{p'}\) for any \(\mu \in{\mathcal{M}}_{d}^{p}\), and the map \(\mathfrak{h}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d'}^{p'}\) is \((\mathcal{O}_{d}^{p},\mathcal{O}_{d'}^{p'})\)-continuous.
2.3 A generalisation of Deheuvels’ copula convergence theorem
Deheuvels’ convergence theorem [10, Théorème 2.3, Lemma 4.1] says that given a \(d\)-variate distribution \(\mu \in{\mathcal{M}}_{d}\) whose marginal distributions \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\) all possess continuous distribution functions \(F_{\mu _{1}},\ldots ,F_{\mu _{d}}\), a sequence \((\mu _{n})\subseteq{\mathcal{M}}_{d}\) converges to \(\mu \) in \(\mathcal{O}_{d}^{0}\) if and only if \((\mu _{n,i})\) converges to \(\mu _{i}\) in \(\mathcal{O}_{1}^{0}\), \(i=1,\ldots ,d\), and \(d_{[0,1]^{d}}(C_{n},C)\to 0\). Here \(C\) is the unique copula of \(\mu \), \(C_{n}\) is any copula of \(\mu _{n}\), and \(\mu _{n,i}\) is the \(i\)th marginal distribution of \(\mu _{n}\). Sempi [47] and Lindner and Szimayer [31] extended Deheuvels’ result to the general case where the marginal distribution functions \(F_{\mu _{1}},\ldots ,F_{\mu _{d}}\) might be discontinuous. Theorem 2.1 in [31] shows that \((\mu _{n})\) converges to \(\mu \) in \(\mathcal{O}_{d}^{0}\) if and only if \((\mu _{n,i})\) converges to \(\mu _{i}\) in \(\mathcal{O}_{1}^{0}\), \(i=1,\ldots ,d\), and \(d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0\) (a similar result was proved earlier for \(d=2\) in [47, Theorems 2 and 3]). Recall that \(C\) is uniquely determined only on \(\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}\). The results of [47, Example 2] and [31, Example 2.2] show that convergence of the copula on the whole unit cube \([0,1]^{d}\) can indeed fail. Theorem 2.3 below is a version of the Sempi–Lindner–Szimayer result where the weak topologies are replaced by \(p\)-weak topologies.
Consider the map \(\mathfrak{P}_{d}:\mathbf{C}_{d}\times\mathcal{M}_{1}\times \cdots \times\mathcal{M}_{1}\to{\mathcal{M}}_{d}\) defined by
\[\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d}) := \mathfrak{p}_{d}\big(C(F_{\mu _{1}},\ldots ,F_{\mu _{d}})\big), \tag{2.3}\]
where \(\mathfrak{p}_{d}\) assigns to a \(d\)-variate distribution function its corresponding Borel probability measure on \(\mathbb{R}^{d}\). Note that \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) remains unchanged when \(C\) is modified outside \(\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}\). It is easily seen (see Appendix A.2) that for any \(p\in [0,\infty )\), the univariate distributions \(\mu _{1},\ldots ,\mu _{d}\) lie in \(\mathcal{M}_{1}^{p}\) if and only if the \(d\)-variate distribution \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) lies in \(\mathcal{M}_{d}^{p}\), regardless of the copula \(C\). In particular, the restriction of \(\mathfrak{P}_{d}\) to \(\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}\) can be regarded as an \(\mathcal{M}_{d}^{p}\)-valued map.
Theorem 2.3
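Fix \(p\in [0,\infty )\) and let \((C,\mu _{1},\ldots ,\mu _{d})\) and \((C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\), \(n\in \mathbb{N}\), be elements of \(\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}\). Then
\[\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\longrightarrow \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\]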
in \(\mathcal{O}_{d}^{p}\) if and only if \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{p}\), \(i=1,\ldots ,d\), and \(d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0\).
Recall that a sequence in a product space converges in the product topology if and only if for each projection, the corresponding marginal sequence converges. As a direct consequence, we can obtain from Theorem 2.3 the following corollary, taking into account that each of the involved topologies is metrisable, or at least pseudo-metrisable, and that \(d_{\mu _{1},\ldots ,\mu _{d}}(C,C)=0\) for any \(C\in \mathbf{C}_{d}\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\).
Corollary 2.4
For any \(p\in [0,\infty )\), \(C\in \mathbf{C}_{d}\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), the following two assertions hold:
(i) The map \(\mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to{\mathcal{M}}_{d}^{p}\) is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{d}^{p})\)-continuous.
(ii) The map \(\mathfrak{P}_{d}(C,\,\cdot \,,\,\ldots ,\,\cdot \,):\mathcal{M}_{1}^{p} \times \cdots \times\mathcal{M}_{1}^{p}\to{\mathcal{M}}_{d}^{p}\) is continuous for the pair \((\mathcal{O}_{1}^{p}\times \cdots \times\mathcal{O}_{1}^{p},\mathcal{O}_{d}^{p})\).
2.4 Characterisation of (\(p\)-)weak convergence in Fréchet classes
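For any \(p\in [0,\infty )\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), the image of the map
\[\mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d}\longrightarrow{\mathcal{M}}_{d}^{p}\]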
is the Fréchet class \(\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\), and we have \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p'}\) as well as \({\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\subseteq{\mathcal{M}}_{d}^{p'}\) for any \(p'\in [0,p]\). Therefore, Corollary 2.4 (i) immediately yields the following result.
Corollary 2.5
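For any \(p\in [0,\infty )\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), the map
\[\mathfrak{P}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d}\longrightarrow{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\]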
is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}( \mu _{1},\ldots ,\mu _{d}))\)-continuous for any \(p'\in [0,p]\).
As a simple consequence of Theorem 2.3, we obtain the following corollary (see Appendix A.3). The result is already known from Krätschmer et al. [28, Proposition 3.9] (with \(A_{d}\) chosen to be the identity on \(\mathbb{R}^{d}\)), where other arguments have been used for the proof.
Corollary 2.6
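For any \(p\in [0,\infty )\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), we have that
\[\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}) = \mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\]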
for any \(p'\in [0,p]\).
For any fixed \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), we use as before \(\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\) to denote the quotient set of \(\mathbf{C}_{d}\) with respect to the equivalence relation \(\sim _{\mu _{1},\ldots ,\mu _{d}}\) of identity on \(\overline{\mathrm{ran}F_{\mu _{1}}}\times \cdots \times \overline{\mathrm{ran}F_{\mu _{d}}}\). Recall from Remark 2.1 that we denote by \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\) the topology on \(\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\) generated by the metric \(d_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\) corresponding to the pseudo-metric \(d_{\mu _{1},\ldots ,\mu _{d}}\), and that \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\) preserves the topological structure of \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}\).
Let us denote by \(\mathfrak{P}_{\mu _{1},\ldots ,\mu _{d}}:\mathbf{C}_{d}/_{\sim _{ \mu _{1},\ldots ,\mu _{d}}}\to{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) the map that assigns to each equivalence class \([C]\) the unique probability measure \(\mu \) that satisfies \(\mu =\mathfrak{P}_{d}(C',\mu _{1},\ldots ,\mu _{d})\) for all representatives \(C'\in [C]\). Then Corollary 2.5 can be reformulated as follows.
Corollary 2.7
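For any \(p\in [0,\infty )\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), the map
\[\mathfrak{P}_{\mu _{1},\ldots ,\mu _{d}}:\mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\longrightarrow{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\]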
is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim},\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}))\)-continuous for any \(p'\in [0,p]\).
Let \(\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}:\mathcal{M}_{d}(\mu _{1}, \ldots ,\mu _{d})\to \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\) be the map that assigns to each \(\mu \in{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) the unique equivalence class whose representatives are copulas of \(\mu \). Then we have the following converse of Corollary 2.7.
Corollary 2.8
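For any \(p\in [0,\infty )\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), the map
\[\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}:\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\longrightarrow \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\]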
is \((\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mu _{1},\ldots ,\mu _{d}}^{\sim})\)-continuous for any \(p'\in [0,p]\).
As an immediate consequence of Corollaries 2.7 and 2.8, we obtain the following result. Note that the equivalence (a) ⇔ (b) also follows from Corollary 2.6, and that condition (c) is equivalent to \(d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0\) for any copulas \(C\) and \(C_{n}\), \(n\in \mathbb{N}\), of \(\mu \) and \(\mu _{n}\), \(n\in \mathbb{N}\), respectively.
Corollary 2.9
Fix \(p\in [0,\infty )\) and \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\). Then the following assertions are equivalent for any \((\mu _{n})\subseteq{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) and \(\mu \in \mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\):
(a) \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\).
(b) \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\).
(c) \(\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}(\mu _{n})\to \mathfrak{C}_{ \mu _{1},\ldots ,\mu _{d}}(\mu )\) in \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\).
3 Copula robustness
3.1 Definition of copula robustness
Let \(\mathcal{M}_{d}'\subseteq{\mathcal{M}}_{d}\) and \(\mathcal{T}_{d}:\mathcal{M}_{d}'\longrightarrow \mathbf{E} \) be any map taking values in some topological space \((\mathbf{E},\mathcal{O}_{\mathbf{E}})\). As before, let the map \(\mathfrak{P}_{d}:\mathbf{C}_{d}\times\mathcal{M}_{1}\times \cdots \times\mathcal{M}_{1}\to{\mathcal{M}}_{d}\) be defined by (2.3). Let \(\mathfrak{D}_{d}'\) be the set of all \((C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}\times\mathcal{M}_{1} \times \cdots \times\mathcal{M}_{1}\) for which \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) lies in \(\mathcal{M}_{d}'\). Then we can associate with \(\mathcal{T}_{d}\) a functional \(\mathfrak{T}_{d}:\mathfrak{D}_{d}'\to \mathbf{E}\) through
\[\mathfrak{T}_{d}(C,\mu _{1},\ldots ,\mu _{d}) := \mathcal{T}_{d}\big(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\big).\]
Let \(\mathfrak{M}_{d}'\) be the set of all \(d\)-tuples \((\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{1}\times \cdots \times\mathcal{M}_{1}\) for which there exists a copula \(C\in \mathbf{C}_{d}\) such that \((C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'\). Moreover, for any fixed \((\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'\), let the set \(\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})\) consist of all those copulas \(C\in \mathbf{C}_{d}\) for which \((C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'\).
Definition 3.1
The map \(\mathcal{T}_{d}\) is copula robust if for any fixed \((\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'\), the map \(\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d}'( \mu _{1},\ldots ,\mu _{d})\to \mathbf{E}\) is continuous for the pair
\[\big(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}\cap \mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d}),\,\mathcal{O}_{\mathbf{E}}\big).\]
The sets \(\mathfrak{M}_{d}'\) and \(\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})\) are illustrated by Examples 3.2 and 3.3 below. The examples show in particular that the set \(\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})\) can be quite different from case to case. In Example 3.2, and in the further course, let \(\mathcal{N}_{1}\) be the set of all non-degenerate univariate normal distributions and for \(d\ge 2\), let \(\mathcal{N}_{d}\) be the set of all (possibly degenerate) \(d\)-variate normal distributions with continuous marginals. In Example 3.2, we also need the notion of a Gaussian copula. Recall that, by definition, a \(d\)-variate Gaussian copula is an element \(C\in \mathbf{C}_{d}\) given through
\[C(u_{1},\ldots ,u_{d}) := \boldsymbol{\varPhi}_{\boldsymbol{0},R}\big(\varPhi _{0,1}^{-1}(u_{1}),\ldots ,\varPhi _{0,1}^{-1}(u_{d})\big),\qquad (u_{1},\ldots ,u_{d})\in [0,1]^{d},\]
for some correlation matrix \(R\), i.e., for some symmetric and positive semi-definite matrix \(R\in [-1,1]^{d\times d}\) which has entries 1 on the diagonal. Here \(\varPhi _{0,1}\) and \(\boldsymbol{\varPhi}_{\boldsymbol{0},R}\) are respectively the distribution function of the univariate standard normal distribution and the distribution function of the centered \(d\)-variate normal distribution \(\mathbf{N}_{\boldsymbol{0},R}\) with covariance matrix equal to \(R\), and we set \(\varPhi _{0,1}^{-1}(0):=-\infty \) and \(\varPhi _{0,1}^{-1}(1):=+ \infty \) as well as
\[\boldsymbol{\varPhi}_{\boldsymbol{0},R}(x_{1},\ldots ,x_{d}) := \mathbf{N}_{\boldsymbol{0},R}\big(\{y\in \mathbb{R}^{d}: y_{i}\le x_{i},\ i=1,\ldots ,d\}\big)\]
for any \((x_{1},\ldots ,x_{d})\in [-\infty ,+\infty ]^{d}\). The set of all Gaussian copulas is denoted by \(\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\).
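For \(d=2\) and a correlation \(\rho \in (-1,1)\), a Gaussian copula can be evaluated numerically with the standard library alone. The sketch below uses the conditional representation \(C(u,v)=\int _{0}^{u}\varPhi _{0,1}\big((\varPhi _{0,1}^{-1}(v)-\rho \,\varPhi _{0,1}^{-1}(t))/\sqrt{1-\rho ^{2}}\big)\,dt\) together with a midpoint rule. The function name, the step count and the quadrature rule are illustrative choices, and the result is an approximation only.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # univariate standard normal law

def gaussian_copula_2d(u, v, rho, steps=4000):
    """Approximate the bivariate Gaussian copula with correlation rho in (-1, 1)."""
    # Boundary values, matching the conventions Phi^{-1}(0) = -inf, Phi^{-1}(1) = +inf.
    if u <= 0.0 or v <= 0.0:
        return 0.0
    if u >= 1.0:
        return min(v, 1.0)
    if v >= 1.0:
        return u
    y = _STD_NORMAL.inv_cdf(v)
    scale = math.sqrt(1.0 - rho * rho)
    h = u / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h                 # midpoint, stays strictly inside (0, u)
        x = _STD_NORMAL.inv_cdf(t)
        # Conditional probability P(Y <= y | X = x) for a standard bivariate normal pair.
        total += _STD_NORMAL.cdf((y - rho * x) / scale)
    return total * h

print(gaussian_copula_2d(0.3, 0.7, 0.0))   # close to 0.3 * 0.7 (independence)
print(gaussian_copula_2d(0.5, 0.5, 0.5))   # close to 1/4 + asin(0.5) / (2 * pi)
```

The second value can be compared with the classical orthant formula \(\boldsymbol{\varPhi}_{\boldsymbol{0},R}(0,0)=1/4+\arcsin (\rho )/(2\pi )\), which equals \(1/3\) for \(\rho =1/2\).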
Example 3.2
If \(\mathcal{M}_{d}'=\mathcal{N}_{d}\), then \(\mathfrak{D}_{d}'=\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\times\mathcal{N}_{1}\times \cdots \times\mathcal{N}_{1}\) (see Appendix A.5). In particular, \(\mathfrak{M}_{d}'=\mathcal{N}_{1}\times \cdots \times\mathcal{N}_{1}\) and \(\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})=\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\) for any \((\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'\).
Example 3.3
If \(\mathcal{M}_{d}'=\mathcal{M}_{d}^{p}\) for some \(p\in [0,\infty )\), then \(\mathfrak{D}_{d}'=\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}\) (see Appendix A.6). In particular, \(\mathfrak{M}_{d}'=\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}\) and \(\mathbf{C}_{d}'(\mu _{1},\ldots ,\mu _{d})=\mathbf{C}_{d}\) for any \((\mu _{1},\ldots ,\mu _{d})\in \mathfrak{M}_{d}'\).
The following lemma is trivial but nevertheless worth recording. In the lemma, \((\mathbf{E}',\mathcal{O}_{\mathbf{E}}')\) is another topological space.
Lemma 3.4
If \(\mathcal{T}_{d}\) is copula robust and \(\mathcal{U}:\mathbf{E}\to \mathbf{E}'\) is any \((\mathcal{O}_{\mathbf{E}},\mathcal{O}_{\mathbf{E}}')\)-continuous map, then the composition \(\mathcal{T}_{d}':=\mathcal{U}\circ{\mathcal{T}}_{d}\) is copula robust.
3.2 Copula robustness of functionals on \(\mathcal{N}_{d}\)
In this section, let specifically \(\mathcal{M}_{d}'=\mathcal{N}_{d}\). That is, let \(\mathcal{T}_{d}:\mathcal{N}_{d}\to \mathbf{E}\) be any map taking values in some topological space \((\mathbf{E},\mathcal{O}_{\mathbf{E}})\). In view of Example 3.2, the definition of copula robustness of \(\mathcal{T}_{d}\) (Definition 3.1) can then be reformulated as follows.
Definition 3.5
The map \(\mathcal{T}_{d}\) on \(\mathcal{N}_{d}\) is copula robust if for any fixed \(\mu _{1},\ldots ,\mu _{d}\in \mathcal{N}_{1}\), the map \(\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d}^{ {\mbox{\textup {{\scriptsize {Ga}}}}}}\to \mathbf{E}\) is \((\mathcal{O}_{[0,1]^{d}}\cap \mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}},\mathcal{O}_{ \mathbf{E}})\)-continuous.
Remark 3.6
Convergence in \((\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}},\mathcal{O}_{[0,1]^{d}}\cap \mathbf{C}_{d}^{ {\mbox{\textup {{\scriptsize {Ga}}}}}})\) is nothing but pointwise (or uniform) convergence in \(\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\). This sort of convergence is therefore equivalent to convergence of the respective correlation matrices in any matrix norm; for details, see Appendix A.7.
Example 3.7
The identity map \(\mathcal{P}_{d}:\mathcal{N}_{d}\to{\mathcal{N}}_{d}\) is copula robust in the sense of Definition 3.5 when the role of \((\mathbf{E},\mathcal{O}_{\mathbf{E}})\) is played by \((\mathcal{N}_{d},\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d})\) for arbitrary (but fixed) \(p\in [0,\infty )\). For details, see Appendix A.8.
Example 3.7 and Lemma 3.4 (applied to \(\mathcal{T}_{d}':=\mathcal{T}_{d}\), \(\mathcal{T}_{d}:=\mathcal{P}_{d}\), \(\mathcal{U}:=\mathcal{T}_{d}\)) immediately yield the following result.
Theorem 3.8
If \(\mathcal{T}_{d}\) is \((\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d},\mathcal{O}_{\mathbf{E}})\)-continuous for some , then it is copula robust.
Of course, Theorem 3.8 can be generalised to larger sets of parametric distributions, for instance to the set \(\mathcal{S}_{d}\) of all \(d\)-variate (Student) \(t\)-distributions with continuous marginals; see Demarta and McNeil [11] for the definitions of \(d\)-variate \(t\)-distributions and \(t\)-copulas. However, for the sake of clarity and simplicity, the exposition here is restricted to the Gaussian setting. A perhaps more interesting setting is addressed in the next section.
3.3 Copula robustness of functionals on \(\mathcal{M}_{d}^{p}\)
In this section, let specifically \(\mathcal{M}_{d}'=\mathcal{M}_{d}^{p}\) for some . That is, let \(\mathcal{T}_{d}:\mathcal{M}_{d}^{p}\to \mathbf{E}\) be any map taking values in some topological space \((\mathbf{E},\mathcal{O}_{\mathbf{E}})\). In view of Example 3.3, the definition of copula robustness of \(\mathcal{T}_{d}\) (Definition 3.1) can then be reformulated as follows.
Definition 3.9
The map \(\mathcal{T}_{d}\) on \(\mathcal{M}_{d}^{p}\) is copula robust if for any fixed \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), the map \(\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to \mathbf{E}\) is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})\)-continuous.
With the help of Corollaries 2.5 and 2.8, we can derive the following characterisation of copula robustness of \(\mathcal{T}_{d}\). For details, see Appendix A.9.
Theorem 3.10
Let \(\mathcal{T}_{d}: \mathcal{M}_{d}^{p} \to \mathbf{E}\) be any map. Then \(\mathcal{T}_{d}\) is copula robust if and only if for any fixed \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), its restriction \(\mathcal{T}_{d}|_{\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})}\) to the Fréchet class \(\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\) is continuous for the pair \((\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mathbf{E}})\).
Example 3.11
Corollary 2.4 (i) shows that the identity map \(\mathcal{P}_{d}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d}^{p}\) is copula robust in the sense of Definition 3.9 when the role of \((\mathbf{E},\mathcal{O}_{\mathbf{E}})\) is played by \((\mathcal{M}_{d}^{p},\mathcal{O}_{d}^{p})\).
Example 3.11 and Lemma 3.4 (applied to \(\mathcal{T}_{d}':=\mathcal{T}_{d}\), \(\mathcal{T}_{d}:=\mathcal{P}_{d}\), \(\mathcal{U}:=\mathcal{T}_{d}\)) immediately yield the following result.
Theorem 3.12
If \(\mathcal{T}_{d}\) is \((\mathcal{O}_{d}^{p},\mathcal{O}_{\mathbf{E}})\)-continuous, then it is copula robust.
Now fix and . Proposition 2.2 ensures that in the scope of the following corollary, we have \(\mu \circ h^{-1}\in{\mathcal{M}}_{d'}^{p'}\) for any \(\mu \in{\mathcal{M}}_{d}^{p}\).
Corollary 3.13
Let \(\mathcal{T}_{d'}:\mathcal{M}_{d'}^{p'}\to \mathbf{E}\) be an \((\mathcal{O}_{d'}^{p'},\mathcal{O}_{\mathbf{E}})\)-continuous map and suppose is a continuous map with . Then the map \(\mathcal{T}_{d}':\mathcal{M}_{d}^{p}\to \mathbf{E}\) defined by \(\mathcal{T}_{d}'(\mu ):=\mathcal{T}_{d'}(\mu \circ h^{-1})\) is copula robust.
4 Example 1: risk measures of aggregate risks
4.1 Foundations of risk measures
Let be an atomless probability space and denote by the usual class of all finite-valued random variables modulo the equivalence relation of ℙ-a.s. identity. Moreover, let be the usual \(L^{p}\)-space, \(p>0\). For any , we say that a map is a risk measure when the following three conditions are satisfied:
(i) (monotonicity) \(\rho (X)\le \rho (Y)\) for \(X\), \(Y\in L^{p}\) with \(X\le Y\);
(ii) (cash-additivity) \(\rho (X+m)=\rho (X)+m\) for \(X\in L^{p}\) and ;
(iii) (distribution-invariance) \(\rho (X)=\rho (Y)\) for \(X,Y\in L^{p}\) with .
In this context, the elements of \(L^{p}\) should be seen as payoff profiles where positive realisations correspond to losses. Following Föllmer and Schied [22], [23, Chap. 4], a risk measure is said to be convex if it satisfies the following condition:
(iv) (convexity) \(\rho (\lambda X+(1-\lambda ) Y)\le \lambda \rho (X)+(1-\lambda ) \rho (Y)\) for all \(X,Y\in L^{p}\) and \(\lambda \in [0,1]\).
The following example recalls three risk measures which are popular in practice and/or among academics. For background, see Emmer et al. [18] and references cited therein.
Example 4.1
Fix \(\alpha \in (0,1)\).
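(i) The value at risk at level \(\alpha \) is the risk measure defined by \(\mathrm{VaR}_{\alpha}(X):=F_{X}^{\leftarrow}(\alpha )\), where \(F_{X}^{\leftarrow}(\alpha ):=\inf \{x\in \mathbb{R}:F_{X}(x)\ge \alpha \}\) is the lower \(\alpha \)-quantile of the distribution function \(F_{X}\) of \(X\). It is not convex.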
(ii) The average value at risk at level \(\alpha \) is the risk measure defined by \(\mathrm{AVaR}_{\alpha}(X):=\frac{1}{1-\alpha}\int _{\alpha}^{1}F_{X}^{ \leftarrow}(s)\,ds\) and known to be convex; see for instance the work of Wang and Dhaene [53].
(iii) The \(\alpha \)-expectile at level \(\alpha \) is the risk measure defined by , where denotes the inverse of the function with \(U_{\alpha}(x):=\alpha x\) or \((1-\alpha )x\) depending on whether \(x\ge 0\) or \(x<0\). It is well defined, and known to be convex if and only if \(\alpha \ge 1/2\); see the work of Bellini et al. [3].
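For a finite sample, the first two risk measures of Example 4.1 can be evaluated exactly from the empirical distribution. The following Python sketch is an illustration only: the function names `var_lower` and `avar` are hypothetical, and the sample is assumed to consist of equally weighted loss observations (positive values are losses).

```python
def var_lower(sample, alpha):
    """Lower alpha-quantile of the empirical distribution of `sample`."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    xs = sorted(sample)
    n = len(xs)
    # Smallest order statistic x with empirical distribution function F_n(x) >= alpha.
    for i, x in enumerate(xs):
        if (i + 1) / n >= alpha:
            return x
    return xs[-1]


def avar(sample, alpha):
    """Average value at risk: (1 - alpha)^{-1} times the integral of the
    empirical quantile function over (alpha, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        # The empirical quantile function equals x on (i/n, (i+1)/n].
        lo = max(i / n, alpha)
        hi = (i + 1) / n
        if hi > lo:
            total += x * (hi - lo)
    return total / (1.0 - alpha)
```

For the sample \(1,2,3,4\) and \(\alpha =1/2\), this gives \(\mathrm{VaR}_{1/2}=2\) and \(\mathrm{AVaR}_{1/2}=3.5\).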
For any risk measure , we may define a functional through
where \(X_{\mu}\) is any random variable on with distribution \(\mu \). We refer to \(\mathcal{R}_{\rho}\) as the risk functional associated with \(\rho \). The assertion of the following result is a direct consequence of Cheridito and Li [8, Theorem 4.1] combined with the representation theorem of Krätschmer et al. [27, Theorem 3.5]. Here refers to the natural topology on ℝ.
Theorem 4.2
Let . For any convex risk measure , the corresponding risk functional is -continuous.
4.2 Copula robustness of risk measures of aggregate risks
Let be a risk measure for some . Let be any continuous map, regarded as an aggregation map in the spirit of McNeil et al. [33, Sect. 6.2.1]. Assume that for some and any \(X_{1},\ldots ,X_{d}\in L^{p}\), the random variable \(A_{d}(X_{1},\ldots ,X_{d})\) lies in \(L^{p'}\). Then we can define a map through
We refer to \(\mathcal{R}_{\rho ,A_{d}}\) as the aggregation risk functional associated with \(\rho \) and \(A_{d}\). Note that the right-hand side in (4.2) equals \(\rho (A_{d}(X_{1},\ldots ,X_{d}))\) when \((X_{1},\ldots ,X_{d})\) is an \(\mathbb{R}^{d}\)-valued random variable with distribution \(\mu \). As a direct consequence of Corollary 3.13 (applied to \(\mathcal{T}_{1}:=\mathcal{R}_{\rho}\), \(h:=A_{d}\), \(\mathcal{T}_{d}':=\mathcal{R}_{\rho ,A_{d}}\)) and Theorem 4.2, we obtain the following result.
Corollary 4.3
Take , a convex risk measure and a continuous map satisfying . Then the aggregation risk functional defined by (4.2) is copula robust.
Example 4.4
In risk management, \(A_{d}\) is frequently chosen as one of the following maps; see for instance the textbook by McNeil et al. [33, Sect. 6.2]:
(i) \(A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}x_{i}\);
(ii) \(A_{d}(x_{1},\dots ,x_{d}) := \max \{x_{1},\dots ,x_{d}\}\);
(iii) \(A_{d}(x_{1},\dots ,x_{d}) := \sum _{i=1}^{d}(x_{i} - t_{i})^{+}\) for thresholds \(t_{1},\dots ,t_{d} > 0\);
(iv) \(A_{d}(x_{1},\dots ,x_{d}) := (\sum _{i=1}^{d}x_{i} - t)^{+}\) for a threshold \(t > 0\).
It is easily seen that for each of these four maps, holds for any . That is, all these maps satisfy the assumptions of Corollary 4.3 for \(p'=p\) (and thus for any and \(p'\in [0,p]\)). In particular, for each of these four maps \(A_{d}\) and for any convex risk measure , the corresponding aggregation risk functional defined by (4.2) is copula robust for any .
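The four aggregation maps of Example 4.4 are straightforward to write down in code. The sketch below is purely illustrative; the function names are hypothetical, and a risk vector is assumed to be given as a list of floats.

```python
def aggregate_sum(x):
    """Map (i): total of the individual risks."""
    return sum(x)


def aggregate_max(x):
    """Map (ii): largest individual risk."""
    return max(x)


def aggregate_excess(x, thresholds):
    """Map (iii): sum of the individual excesses (x_i - t_i)^+."""
    return sum(max(xi - ti, 0.0) for xi, ti in zip(x, thresholds))


def aggregate_stop_loss(x, t):
    """Map (iv): excess (sum_i x_i - t)^+ of the total over one threshold t."""
    return max(sum(x) - t, 0.0)
```

All four maps are continuous, which is the property required of the aggregation map in Corollary 4.3.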
Remark 4.5
Of course, the assertion of Corollary 4.3 and the last assertion in Example 4.4 also hold true for any other risk measure \(\rho \) for which the corresponding risk functional is -continuous.
Remark 4.5 indicates that in the setting of Corollary 4.3, the assumed convexity of \(\rho \) is not necessary. To give an example that shows that this is indeed true, let \(\rho \) be the \(\alpha \)-expectile \(\mathrm{Ept}_{\alpha}\) with \(\alpha <1/2\) (see Example 4.1 (iii)). Then \(\rho \) is not convex (see Bellini et al. [3, Proposition 7(b–c)]), but the corresponding risk functional is -continuous (see Krätschmer and Zähle [29, Theorem 2.1]), and the latter implies that the aggregation risk functional is copula robust.
On the other hand, if the risk functional corresponding to some \(\rho \) is not -continuous, then copula robustness of can indeed fail to hold. For instance, Example 4.7 below shows that is not copula robust when \(\rho :=\mathrm{VaR}_{\alpha}\) (see Example 4.1 (i)) and \(A_{2}(x_{1},x_{2}):=x_{1}+x_{2}\). Note here that the risk functional associated with \(\rho :=\mathrm{VaR}_{\alpha}\) is known not to be weakly continuous, and that weak continuity is just -continuity.
It is further known that the risk functional associated with \(\rho :=\mathrm{VaR}_{\alpha}\) becomes weakly continuous when it is restricted to the set \(\mathcal{M}_{1}^{(\alpha )}\) of all those Borel probability measures on ℝ that possess a unique \(\alpha \)-quantile (see e.g. van der Vaart [52, Lemma 21.2]), or even to the set \(\mathcal{M}_{1}^{\mathcal{L}}\) of all \(\mu \in \bigcap _{s\in (0,1)}{\mathcal{M}}_{1}^{(s)}\) that possess a Lebesgue density. Nonetheless, the corresponding aggregation risk functional \(\mathcal{R}_{\rho ,A_{2}}\), with \(A_{2}(x_{1},x_{2}):=x_{1}+x_{2}\), defined on the set \(\mathcal{M}_{2}^{\mathcal{L}}\) of all Borel probability measures on \(\mathbb{R}^{2}\) with marginal distributions in \(\mathcal{M}_{1}^{\mathcal{L}}\), is still not copula robust. This is also a consequence of Example 4.7. The lack of copula robustness of \(\mathcal{R}_{\rho ,A_{2}}\) on \(\mathcal{M}_{2}^{\mathcal{L}}\) is not immediately obvious. Note, however, that for \(\mu \in{\mathcal{M}}_{2}^{\mathcal{L}}\), the image measure \(\mu \circ A_{2}^{-1}\) can be purely discrete (see Example 4.7), i.e., \(\mu \circ A_{2}^{-1}\) can lie outside the set \(\mathcal{M}_{1}^{(\alpha )}\) on which \(\mathcal{R}_{\rho}\) is weakly continuous.
When \(\mathcal{R}_{\rho ,A_{2}}\), with \(\rho :=\mathrm{VaR}_{\alpha}\) and \(A_{2}(x_{1},x_{2}):=x_{1}+x_{2}\), is restricted to the much smaller set \(\mathcal{N}_{2}\) introduced before (3.2), copula robustness holds true. Note that \(\mu \circ A_{2}^{-1}\in{\mathcal{N}}_{1}'\subseteq{\mathcal{M}}_{1}^{(\alpha )}\) for all \(\mu \in{\mathcal{N}}_{2}\), where \(\mathcal{N}_{1}'\) (\(\supseteq{\mathcal{N}}_{1}\)) is the set of all (possibly degenerate) univariate normal distributions. The copula robustness follows from Theorem 3.8 since the restriction of \(\mathcal{R}_{\rho ,A_{2}}\) to \(\mathcal{N}_{2}\) is -continuous. The latter follows from the \((\mathcal{O}_{2}^{0}\cap{\mathcal{N}}_{2},\mathcal{O}_{1}^{0}\cap{\mathcal{N}}_{1}')\)-continuity of the map \(\mathfrak{h}:\mathcal{N}_{2}\to{\mathcal{N}}_{1}'\) defined by \(\mathfrak{h}(\mu ):=\mu \circ A_{2}^{-1}\) and the -continuity of the restriction of \(\mathcal{R}_{\rho}\) to \(\mathcal{N}_{1}'\) (\(\subseteq{\mathcal{M}}_{1}^{(\alpha )}\)).
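In the Gaussian setting, the continuity in the copula can be made explicit. Assuming that \(\mu \) is bivariate normal with marginals \(\mathrm{N}(m_{i},s_{i}^{2})\) and correlation \(r\), the sum of the two components is normal with mean \(m_{1}+m_{2}\) and variance \(s_{1}^{2}+s_{2}^{2}+2rs_{1}s_{2}\), so that its value at risk is an explicit continuous function of \(r\). The Python sketch below illustrates this under these assumptions; the function name `var_sum_gaussian` and its parametrisation are hypothetical.

```python
import math
from statistics import NormalDist


def var_sum_gaussian(alpha, m1, s1, m2, s2, r):
    """VaR_alpha of X1 + X2 for a bivariate normal vector with marginals
    N(m1, s1^2), N(m2, s2^2) and correlation r (assumed parametrisation)."""
    mean = m1 + m2
    # Clip at zero to guard against rounding when the sum is (nearly) degenerate.
    variance = max(s1 * s1 + s2 * s2 + 2.0 * r * s1 * s2, 0.0)
    return mean + math.sqrt(variance) * NormalDist().inv_cdf(alpha)
```

Since the Gaussian copula is parametrised by \(r\) (see Remark 3.6), a small change of the copula corresponds to a small change of \(r\) and hence, by the formula above, to a small change of the value at risk of the sum.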
4.3 Relation to aggregation robustness of risk measures
In [17], Embrechts et al. consider the special case where \(A_{d}\) is defined as in (i) of Example 4.4 and \(\rho \) is a coherent distortion risk measure defined on a subset of \(L^{1}\). In this case, they obtain an analogue of Corollary 4.3 and refer to it as aggregation robustness. In fact, they do not explicitly consider continuity in the copula, but rather weak continuity of the analogous functional defined on the corresponding Fréchet class. However, by Theorem 3.10, the two notions are equivalent. An extension to more general risk measures and more general aggregation maps is given in the work of Krätschmer et al. [28, Sect. 4.2.4].
The following definition is a reformulation of the definition of aggregation robustness of a risk measure (i.e., of [17, Definition 2.1]). As before, the aggregation risk functional \(\mathcal{R}_{\rho ,A_{d}}\) associated with \(\rho \) and \(A_{d}(x_{1},\ldots ,x_{d}):=\sum _{i=1}^{d}x_{i}\) is defined by (4.2).
Definition 4.6
Let . A risk measure is said to be aggregation robust if the corresponding aggregation risk functionals , \(d\ge 2\), are copula robust.
In view of Corollary 4.3 and Example 4.4, any convex risk measure is aggregation robust. This assertion remains true when replacing in Definition 4.6 the map \(A_{d}(x_{1},\ldots ,x_{d}):=\sum _{i=1}^{d}x_{i}\) by any of the other maps introduced in Example 4.4.
In their Example 2.2, Embrechts et al. [17] demonstrated that for any \(\alpha \in (0,1)\), the value at risk is not aggregation robust. The following example extends the first part of that example (from \(\alpha =1/2\) to general \(\alpha \in (0,1)\)) and shows that for any \(\alpha \in (0,1)\) and , the aggregation risk functional is not copula robust. The example is in particular interesting in that it shows that even if the marginal distributions \(\mu _{1},\ldots ,\mu _{d}\) possess Lebesgue densities and unique quantiles, the map \(C\mapsto \mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{d}}(C,\mu _{1},\ldots , \mu _{d})\) need not be continuous when choosing \(A_{d}(x_{1},\ldots ,x_{d}):=\sum _{i=1}^{d}x_{i}\) (here \(\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{d}}\) is derived from \(\mathcal{R}_{\rho ,A_{d}}\) as \(\mathfrak{T}_{d}\) is derived from \(\mathcal{T}_{d}\) in (3.1)). The point here is that for random variables \(X_{1},\ldots ,X_{d}\) with distributions \(\mu _{1},\ldots ,\mu _{d}\), the distribution of \(\sum _{i=1}^{d}X_{i}\) can be discrete even if \(\mu _{1},\ldots ,\mu _{d}\) possess Lebesgue densities. This fact has already been pointed out in [17].
Example 4.7
Generalising the first part of Embrechts et al. [17, Example 2.2], define for \(\alpha \in (0,1/2]\) a bivariate copula \(C_{0}^{(\alpha )}\) through
let \(C_{1}\) be the bivariate independence copula, i.e., \(C_{1}(u_{1},u_{2}):=u_{1}u_{2}\), and define for any \(t\in [0,1]\) the copula \(C_{t}^{(\alpha )}\) as a mixture of \(C_{0}^{(\alpha )}\) and \(C_{1}\) via \(C_{t}^{(\alpha )}:=(1-t)\,C_{0}^{(\alpha )}+t\,C_{1}\).
Moreover, for any \(t\in [0,1]\), let \(\hat{C}_{t}^{(\alpha )}\) be the survival copula of \(C_{t}^{(\alpha )}\) which is defined by \(\hat{C}_{t}^{(\alpha )}(u_{1},u_{2}):=u_{1}+u_{2}-1+C_{t}^{(\alpha )}(1-u_{1},1-u_{2})\). Finally, let \(\mu _{1}:=\mu _{2}:=\mathrm{U}_{[0,1]}\) as well as \(\hat{\mu}_{1}:=\hat{\mu}_{2}:=\mathrm{U}_{[-1,0]}\), where \(\mathrm{U}_{I}\) is used to denote the uniform distribution on \(I\). Then the following two assertions are valid:
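(i) \(\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{0}^{(\alpha )},\mu _{1}, \mu _{2})=\alpha \) and \(\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{t}^{(\alpha )},\mu _{1}, \mu _{2})=\sqrt{2\alpha}\) for any \(t\in (0,1]\). Therefore we have \(\lim _{t\searrow 0}C_{t}^{(\alpha )}=C_{0}^{(\alpha )}\) uniformly, but \(\lim _{t\searrow 0}\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{t}^{(\alpha )},\mu _{1},\mu _{2})=\sqrt{2\alpha}\neq \alpha =\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})\).
(ii) \(\mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(\hat{C}_{0}^{(\alpha )}, \hat{\mu}_{1},\hat{\mu}_{2})=-1-\alpha \) and \(\mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(\hat{C}_{t}^{(\alpha )}, \hat{\mu}_{1},\hat{\mu}_{2})=-\sqrt{2\alpha}\) for any \(t\in (0,1]\). Therefore we have that \(\lim _{t\searrow 0}\hat{C}_{t}^{(\alpha )}=\hat{C}_{0}^{(\alpha )}\) uniformly, but \(\lim _{t\searrow 0}\mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(\hat{C}_{t}^{(\alpha )},\hat{\mu}_{1},\hat{\mu}_{2})=-\sqrt{2\alpha}\neq -1-\alpha =\mathfrak{R}_{\mathrm{VaR}_{1-\alpha},A_{2}}(\hat{C}_{0}^{(\alpha )},\hat{\mu}_{1},\hat{\mu}_{2})\).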
For details, see Appendix A.11.
It is worth commenting on the copulas \(C_{0}^{(\alpha )}\), \(C_{1}\) and \(C_{t}^{(\alpha )}\) in the preceding example. The copula \(C_{1}\) is well known; it is simply the distribution function of the uniform distribution on \([0,1]^{2}\). The copula \(C_{0}^{(\alpha )}\) is the distribution function of the ‘uniform distribution’ on the union of the two disjoint line segments \(S_{1}^{\alpha}\) and \(S_{2}^{\alpha}\) with endpoints \((\alpha ,0),(0,\alpha )\) and \((1,\alpha ),(\alpha ,1)\), respectively (see Appendix A.11 for the precise definition). Thus \(C_{t}^{(\alpha )}\) is the distribution function of the Borel probability measure on \([0,1]^{2}\) that is defined as a convex combination (with coefficients \(t\) and \(1-t\)) of the uniform distribution on \([0,1]^{2}\) and the ‘uniform distribution’ on \(S_{1}^{\alpha}\uplus S_{2}^{\alpha}\). For a visualisation of \(C_{0}^{(\alpha )}\), see Fig. 1, and note that the distribution of the sum of two \(\mathrm{U}_{[0,1]}\)-distributed random variables coupled via \(C_{0}^{(\alpha )}\) is the two-point distribution \(\alpha \delta _{\alpha}+(1-\alpha )\delta _{1+\alpha}\).
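The discontinuity in assertion (i) of Example 4.7 can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes \(C_{t}^{(\alpha )}\) to be the mixture with weight \(1-t\) on \(C_{0}^{(\alpha )}\) and weight \(t\) on \(C_{1}\), as the description above indicates, and it uses the standard fact that the sum of two independent \(\mathrm{U}_{[0,1]}\)-distributed random variables has a triangular distribution on \([0,2]\). The function names `sum_cdf` and `lower_quantile` are hypothetical.

```python
def sum_cdf(s, t, alpha):
    """Distribution function of X1 + X2 for U[0,1] marginals coupled via the
    mixture copula (1 - t) * C_0^(alpha) + t * C_1 (assumed form)."""
    # Under C_0^(alpha), the sum is alpha w.p. alpha and 1 + alpha w.p. 1 - alpha.
    discrete = (alpha if s >= alpha else 0.0) + ((1.0 - alpha) if s >= 1.0 + alpha else 0.0)
    # Under independence, the sum has the triangular distribution on [0, 2].
    if s <= 0.0:
        indep = 0.0
    elif s <= 1.0:
        indep = s * s / 2.0
    elif s <= 2.0:
        indep = 1.0 - (2.0 - s) ** 2 / 2.0
    else:
        indep = 1.0
    return (1.0 - t) * discrete + t * indep


def lower_quantile(level, t, alpha, tol=1e-12):
    """inf{s : F(s) >= level} by bisection; F is nondecreasing and right-continuous."""
    lo, hi = 0.0, 2.0  # invariant: F(lo) < level <= F(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum_cdf(mid, t, alpha) >= level:
            hi = mid
        else:
            lo = mid
    return hi
```

For \(\alpha =0.3\), for instance, `lower_quantile(0.3, t, 0.3)` is approximately \(0.3\) for \(t=0\), whereas it is approximately \(\sqrt{0.6}\) for every \(t\in (0,1]\), however small.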
4.4 Application to optimal capital and risk allocations
Let \(p\in [1,\infty )\). As in the work of Filipović and Svindland [21], consider \(d\) agents, or business units, with endowments \(X_{1},\ldots ,X_{d}\in L^{p}\). We then assume that these agents assess the riskiness of their positions by means of some convex risk measures (in the sense of Sect. 4.1). In order to minimise the total and individual risk, the agents redistribute the aggregate endowment \(X:=\sum _{i=1}^{d}X_{i}\) among themselves. By a redistribution of \(X\), we mean any \(d\)-tuple \((Y_{1},\ldots ,Y_{d})\) of random variables (payoffs) in \(L^{p}\) such that \(X=\sum _{i=1}^{d}Y_{i}\). A redistribution \((X_{1}^{*},\ldots ,X_{d}^{*})\) is called an optimal capital and risk allocation of \(X\) if
Here it is assumed that the redistribution is not subject to frictions, i.e., that every redistribution of \(X\) is admissible, even if this is not always the case (as pointed out by Filipović and Kupper [20]).
Note that an optimal capital and risk allocation \((X_{1}^{*},\ldots ,X_{d}^{*})\), being a redistribution, must satisfy \(X=\sum _{i=1}^{d}X_{i}^{*}\). An optimal capital and risk allocation of \(X\) need not exist. If it exists, its total risk \(\sum _{i=1}^{d}\rho _{i}(X_{i}^{*})\) coincides with the inf-convolution of \(\rho _{1},\ldots ,\rho _{d}\) at \(X\), denoted by \(\mathop {\square }_{i=1}^{d}\rho _{i}(X)\), which is defined by the right-hand side of (4.3). Using the convention \(\inf \emptyset =\infty \), the inf-convolution can be seen as a map \(\mathop {\square }_{i=1}^{d}\rho _{i}:L^{p}\to (-\infty ,\infty ]\). For background, see [21] and references cited therein.
In the above economic setting, the inf-convolution can also be regarded as a map \(\mathop {\boxtimes }_{i=1}^{d}\rho _{i}:L^{p}\times \cdots \times L^{p}\to (- \infty ,\infty ]\) through
Since we assumed \(\rho _{1},\ldots ,\rho _{d}\) to be convex risk measures on \(L^{p}\), a result of Filipović and Svindland [21, Corollary 2.7] ensures that the inf-convolution \(\mathop {\square }_{i=1}^{d}\rho _{i}\) is a convex risk measure on \(L^{p}\), too (note that in [21, Corollary 2.7], \(\mathop {\square }_{i=1}^{d}\rho _{i}\) is exact, and hence it is ℝ-valued if \(\rho _{1},\ldots ,\rho _{d}\) are ℝ-valued). As a convex risk measure, \(\mathop {\square }_{i=1}^{d}\rho _{i}\) is distribution-invariant, and so is \(\mathop {\boxtimes }_{i=1}^{d}\rho _{i}\). Thus we may associate with \(\mathop {\boxtimes }_{i=1}^{d}\rho _{i}\) a corresponding functional through
with \(A_{d}(x_{1},\ldots ,x_{d})=\sum _{i=1}^{d}x_{i}\), where \(\mathcal{R}_{\mathop {\square }_{i=1}^{d}\rho _{i}}\) and \(\mathcal{R}_{\mathop {\square }_{i=1}^{d}\rho _{i},A_{d}}\) are defined as in (4.1) and (4.2), respectively.
It is worth mentioning that [21, Corollary 2.7] even ensures that for any \(X\in L^{p}\), there exists a comonotone optimal capital and risk allocation \((X_{1}^{*},\ldots ,X_{d}^{*})\). This implies that whenever \(\rho _{1}=\cdots =\rho _{d}\) and \(\rho :=\rho _{1}\) is comonotonic (i.e., finitely additive for all comonotone risks), we have
for any \(\mu \in{\mathcal{M}}_{d}^{p}\) (see Appendix A.12). Of course, for convex risk measures \(\rho \) that are not comonotonic, the representation (4.4) need not apply. An example of a comonotonic convex risk measure is the average value at risk at level \(\alpha \in (0,1)\). A counterexample is the \(\alpha \)-expectile at level \(\alpha \in [1/2,1)\); see Emmer et al. [18].
The following result is a direct consequence of Corollary 4.3 and Example 4.4 since we have seen above that \(\mathop {\square }_{i=1}^{d}\rho _{i}\) is a convex risk measure on \(L^{p}\) if \(\rho _{1},\ldots ,\rho _{d}\) are.
Corollary 4.8
Let \(p\in [1,\infty )\) and be convex risk measures. Then is copula robust.
The following example shows that if the risk measures \(\rho _{1},\ldots ,\rho _{d}\) are not assumed to be convex, copula robustness of \(\mathcal{R}_{\mathop {\boxtimes }_{i=1}^{d}\rho _{i}}\) may fail; recall that \(\mathrm{VaR}_{\alpha}\) is not convex.
Example 4.9
It is known from the work of Embrechts et al. [13, Corollary 2] that \(\mathop {\square }_{i=1}^{2}{\mathrm{VaR}}_{\alpha}=\mathrm{VaR}_{2\alpha}\) on \(L^{1}\) when \(\alpha \in (0,1/2)\). Therefore
for any \(\mu \in{\mathcal{M}}_{2}^{1}\) and \(\alpha \in (0,1/2)\). Thus it follows from Example 4.7 that
is not copula robust for any \(\alpha \in (0,1/2)\).
5 Example 2: stochastic programming problems
5.1 A class of stochastic programming problems
Adopting the framework of Claus et al. [9], let \(\varXi \) be a nonempty and compact subset of , a Borel-measurable function and \(Z\) an -valued random variable on an atomless probability space . Let \(p\in [1,\infty )\) and assume that \(h(\xi ,Z)\) is contained in for any \(\xi \in \varXi \). Consider the optimisation problem
where is any map. A classical example for \(\rho \) is the expectation, i.e., , where \(p=1\). In Sect. 5.2, we consider another example where \(\rho \) is a more general monotone, distribution-invariant and convex function on \(L^{p}\). Problem (5.1) can be written as or, equivalently, as
where \(\mathcal{R}_{\rho}\) is derived from \(\rho \) as in (4.1) and \(\mu \) denotes the distribution of \(Z\).
Lemma 5.1 below assumes the following three conditions, where monotonicity, distribution-invariance and convexity are defined as in (i), (iii) and (iv) in Sect. 4.1. Recall that is assumed to be atomless.
(a) , for some \(p\in [1,\infty )\), is monotone, distribution-invariant and convex.
(b) is Borel-measurable and limited by an exponent .
(c) \((\delta _{\xi}\otimes \mu )[D_{h}]=0\) for any \(\xi \in \varXi \) and \(\mu \in{\mathcal{M}}_{d}^{\gamma p}\).
The second requirement in (b) means that there exists some locally bounded map \(\eta :\varXi \to (0,\infty )\) such that \(|h(\xi ,z)|\le \eta (\xi )(1+|z|)^{\gamma}\) for all . In (c), the set \(D_{h}\) is the set of all discontinuity points of \(h\). Under conditions (a) and (b), the map given by
is well defined. The following lemma is known from Claus et al. [9, Theorem 5.2].
Lemma 5.1
If conditions (a)–(c) hold true, then the map is -continuous.
Lemma 5.1 can be used to obtain the following result on the map
Recall that the set \(\varXi \) was assumed to be compact.
Theorem 5.2
If conditions (a)–(c) hold true, then the infimum in (5.3) is attained for any \(\mu \in{\mathcal{M}}_{d}^{\gamma p}\), and the map is -continuous.
Note here that if the infimum in (5.3) is attained, then \(\mathcal{R}_{\rho ,h}(\mu )\) is a solution to (5.2). Theorem 5.2 is a variant of Claus et al. [9, Corollary 2.4].
Remark 5.3
The -continuity of obtained in the preceding theorem can be seen as robustness of \(\rho \) relative to \((\mathcal{G},Z,\pi _{d}^{\gamma p})\) in the sense of Embrechts et al. [16, Definition 1], where \(\mathcal{G}:=\{h(\xi ,\,\cdot \,):\xi \in \varXi \}\) and \(\pi _{d}^{\gamma p}\) is any metric metrising the \((p\gamma )\)-weak topology \(\mathcal{O}_{d}^{\gamma p}\).
5.2 Example: one-period mean–risk portfolio optimisation
Consider a one-period financial market consisting of one riskless bond and \(d\) risky assets with prices per unit \(S_{0}^{0}:=1\) and at time 0. In between time 0 and time 1, the prices change to \(S_{1}^{0},S_{1}^{1},\ldots ,S_{1}^{d}\) according to \(S_{1}^{i}=Z^{i}S_{0}^{i}\), \(i=0,\ldots ,d\), where the bond’s relative price change \(Z^{0}\) is deterministic () and known at time 0 and the assets’ relative price changes \(Z^{1},\ldots ,Z^{d}\) are -valued random variables on a common atomless probability space and are unobservable at time 0. Let be an amount of capital to be invested in the bond and in the \(d\) assets at time 0. If for any \(i=1,\ldots ,d\), the amount of capital invested in the asset \(i\) is denoted by \(\xi _{i}\), then the amount of capital invested in the bond is \(\xi _{0}:=x_{0}-\langle \xi ,\boldsymbol{1}\rangle \), where \(\xi :=(\xi _{1},\ldots ,\xi _{d})\) and . When identifying a portfolio with the corresponding amounts of capital \(\xi _{1},\ldots ,\xi _{d}\) and assuming that taking loans and short selling are banned, the set
can be seen as the set of all admissible portfolios. The realised loss at time 1 of a portfolio \(\xi =(\xi _{1},\ldots ,\xi _{d})\in \varXi \) is given by
when \(z=(z_{1},\ldots ,z_{d})\) is the vector of the assets’ realised relative price changes, i.e., the realisation of \(Z:=(Z^{1},\ldots ,Z^{d})\).
Of course, the portfolio \(\xi =(\xi _{1},\ldots ,\xi _{d})\in \varXi \) should be chosen such that the expected profit is as high as possible, i.e., such that the expected loss is as small as possible. Simultaneously the portfolio’s downside risk should be as small as possible, where the downside risk can be measured by \(\sigma (h(\xi ,Z))\) for a suitable given ‘downside’ risk measure . This leads to the mean–risk model
where is a risk aversion parameter. Note that the model (5.6) aims at minimising the weighted sum of two competing objectives and is in line with Markowitz’ [32] classical mean–variance optimisation theory (where ). It is also worth mentioning that mean–risk models are related to the corresponding multiobjective optimisation problems; see for instance the works of Ogryczak and Ruszczyński [37, 38] and Schultz and Tiedemann [46].
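To make the structure of a mean–risk problem concrete, the following Python sketch evaluates a weighted sum of expected loss and downside risk on a finite set of equally likely scenarios and minimises it over a grid of portfolios. Everything specific in it is an assumption made for illustration and is not taken from the text: the loss is taken to be initial capital minus terminal wealth, the admissible set is taken to be all nonnegative \(\xi \) with \(\langle \xi ,\boldsymbol{1}\rangle \le x_{0}\), the downside risk is measured by the average value at risk, and the names `portfolio_loss`, `mean_risk_objective` and `best_portfolio_on_grid` are hypothetical.

```python
from itertools import product


def avar(losses, alpha):
    """Empirical average value at risk at level alpha (equally weighted losses)."""
    xs = sorted(losses)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        lo = max(i / n, alpha)
        hi = (i + 1) / n
        if hi > lo:
            total += x * (hi - lo)
    return total / (1.0 - alpha)


def portfolio_loss(xi, z, x0, z0):
    """ASSUMED loss: initial capital minus terminal wealth."""
    xi0 = x0 - sum(xi)  # capital invested in the bond
    terminal_wealth = xi0 * z0 + sum(a * b for a, b in zip(xi, z))
    return x0 - terminal_wealth


def mean_risk_objective(xi, scenarios, x0, z0, lam, alpha):
    """Expected loss plus lam times AVaR of the loss (assumed objective)."""
    losses = [portfolio_loss(xi, z, x0, z0) for z in scenarios]
    expected_loss = sum(losses) / len(losses)
    return expected_loss + lam * avar(losses, alpha)


def best_portfolio_on_grid(scenarios, x0, z0, lam, alpha, steps=4):
    """Brute-force minimiser over a grid of portfolios without loans or short sales."""
    d = len(scenarios[0])
    grid = [x0 * k / steps for k in range(steps + 1)]
    candidates = [xi for xi in product(grid, repeat=d) if sum(xi) <= x0 + 1e-12]
    return min(candidates,
               key=lambda xi: mean_risk_objective(xi, scenarios, x0, z0, lam, alpha))
```

With a single asset whose relative price change is \(0.5\) or \(1.5\) with equal probability and a bond with relative price change \(1\), the asset offers no extra expected return but adds risk, so the grid search selects the pure bond portfolio.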
The mean–risk model (5.6) coincides with problem (5.2) when \(Z\) is distributed according to \(\mu \) and is defined by
where one should note that \(\rho \) is monotone, distribution-invariant and convex if \(\sigma \) is. For any fixed \(p\in [1,\infty )\), the following corollary is a simple consequence of Theorem 5.2; see Appendix A.14.
Corollary 5.4
Let be monotone, distribution-invariant and convex. For any \(\mu \in{\mathcal{M}}_{d}^{p}\), let \(\mathcal{R}_{\rho ,h}(\mu )\) be defined by (5.3) (and (5.7)). Then the infimum on the right-hand side of (5.3) is attained (and thus finite) for any \(\mu \in{\mathcal{M}}_{d}^{p}\), and the map is -continuous.
Remark 5.5
If we use \(\mathfrak{A}_{d}\) to denote the set of all functions , \(\xi \in \varXi \), then \(\mathcal{R}_{\rho ,h}=\mathcal{R}_{\rho ,\mathfrak{A}_{d}}\) for the functional \(\mathcal{R}_{\rho ,\mathfrak{A}_{d}}\) defined by (1.3), i.e., by
Here \(\mathcal{R}_{\rho}\) is derived from \(\rho \) as in (4.1), and \(\mathcal{R}_{\rho ,A_{d}}\) is derived from \(\mathcal{R}_{\rho}\) and \(A_{d}\) as in (4.2).
5.3 Copula robustness of stochastic programming problems
In the setting of Sect. 5.1, assume that conditions (a)–(c) are satisfied and recall that \(\varXi \) was assumed to be compact. Then by Theorem 5.2, the map is -continuous. Together with Theorem 3.12, this leads to the following result.
Corollary 5.6
If conditions (a)–(c) hold true, then the map is copula robust.
Corollary 5.6 shows that under conditions (a)–(c), the minimal value of problem (5.2) is robust with respect to slight changes in the copula of \(\mu \).
Example 5.7
Let us return to the specific setting of Sect. 5.2 (mean–risk portfolio optimisation), where \(\mu \) played the role of the joint distribution of the relative price changes \((Z^{1},\ldots ,Z^{d})\). In this framework, it can be seen in the proof of Corollary 5.4 that conditions (a)–(c) are satisfied for \(p=1\) when is monotone, distribution-invariant and convex. Thus under the latter assumptions on \(\sigma \), Corollary 5.6 ensures that the functional is copula robust. Of course, the copula robustness of \(\mathcal{R}_{\rho ,h}\) also directly follows from Theorem 3.12 and Corollary 5.4.
6 Example 3: multi-period portfolio optimisation
In this section, the objective is to show that the maximal expected utility of the terminal wealth of a portfolio in a multi-period financial market model (see Sect. 6.2) is copula robust if it is regarded as a function of the joint distribution of the assets’ relative price changes. The terminal wealth portfolio optimisation problem can be regarded as a Markov decision problem as introduced in the textbook by Bäuerle and Rieder [1, Chaps. 1 and 2] and in other standard monographs. To prove the main result of this section (Corollary 6.8), it is therefore useful to first establish a variant of a result of Müller [35, Theorem 4.2] about the dependence of the value function on the Markov transition probability function. This variant can be found in Theorem 6.2 and is of independent interest. It is worth pointing out that the factor \(1/\psi (x)\) on the right-hand side of (6.3) is essential for our purposes; see the proof of Corollary 6.7.
6.1 Groundwork: a class of Markov decision models
6.1.1 Basic notation and terminology
Let \((E,\mathcal{E})\) be a measurable space, to be regarded as the state space, and the fixed finite planning horizon. For each \(n=0,\ldots ,N-1\) and \(x\in E\), let \(A_{n}(x)\) be a nonempty set whose elements are regarded as the admissible actions at time \(n\) in state \(x\). For each \(n=0,\ldots ,N-1\), let \(A_{n}:=\bigcup _{x\in E} A_{n}(x)\) and \(D_{n}:=\{(x,a)\in E \times A_{n} : a\in A_{n}(x)\}\). The elements of \(A_{n}\) can be seen as the actions that may basically be selected at time \(n\), whereas the elements of \(D_{n}\) are the possible state–action combinations at time \(n\). We equip \(A_{n}\) with a \(\sigma \)-algebra \(\mathcal{A}_{n}\) and \(D_{n}\) with the trace \(\sigma \)-algebra \(\mathcal{D}_{n}:=(\mathcal{E}\otimes{\mathcal{A}}_{n})\cap D_{n}\). We use \(\boldsymbol{A}\) to denote the family that consists of all sets \(A_{n}(x)\), \(n=0,\ldots ,N-1\), \(x\in E\), and of all \(\sigma \)-algebras \(\mathcal{A}_{n}\), \(n=0,\ldots ,N-1\). All the sets and spaces just introduced are fully determined by \((E,\mathcal{E})\) and \(\boldsymbol{A}\). Although all the objects introduced in what follows depend on \((E,\mathcal{E})\) and \(\boldsymbol{A}\), we suppress this dependence in the notation.
By a (Markov decision) transition function associated with \((E,\mathcal{E})\) and \(\boldsymbol{A}\), we mean an \(N\)-tuple \(P=(P_{n})_{n=0}^{N-1}\), where \(P_{n}\) is a probability kernel from \((D_{n},\mathcal{D}_{n})\) to \((E,\mathcal{E})\), to be seen as the one-step transition kernel at time \(n\). The set of all transition functions is denoted by \(\overline{\mathcal{P}}\). The actions are governed by a so called \(N\)-stage strategy, i.e., by an \(N\)-tuple \(\pi =(f_{0},\ldots ,f_{N-1})\) where \(f_{n}\) is a decision rule at time \(n\), i.e., an \((\mathcal{E},\mathcal{A}_{n})\)-measurable map \(f_{n}:E\rightarrow A_{n}\) satisfying \(f_{n}(x)\in A_{n}(x)\) for all \(x\in E\). Let \(F_{n}\) be a nonempty set of decision rules at time \(n\), and define the set of all ‘admissible’ strategies by \(\varPi :=F_{0}\times \cdots \times F_{N-1}\).
For any \(P=(P_{n})_{n=0}^{N-1}\in \overline{\mathcal{P}}\), \(\pi =(f_{n})_{n=0}^{N-1}\in \varPi \) and \(n=0,\ldots ,N-1\), define the probability kernel \(P_{n}^{\pi}\) from \((E,\mathcal{E})\) to \((E,\mathcal{E})\) by \(P_{n}^{\pi}(x,B) := P_{n}((x,f_{n}(x)),B)\), \(x\in E\), \(B\in{\mathcal{E}}\). The probability measure \(P_{n}^{\pi}(x,\,\cdot \,)\) can be seen as the one-step transition probability at time \(n\) given state \(x\) when the actions are chosen according to \(\pi \). On the measurable space \((\varOmega ,\mathcal{F}):=(E^{N+1},\mathcal{E}^{\otimes (N+1)})\), we can define for any \(x_{0}\in E\) and \(\pi \in \varPi \) the probability measure , where the right-hand side is the usual product of the probability measure \(\delta _{x_{0}}\) and the kernels \(P_{0}^{\pi},\ldots ,P_{N-1}^{\pi}\). Under the probability measure , the identity map \(X=(X_{n})_{n=0}^{N}\) on \(\varOmega \) is called Markov decision process (MDP) associated with initial state \(x_{0}\), transition function \(P\) and strategy \(\pi \).
Let \(r_{n}:D_{n}\to \mathbb{R}\), \(n=0,\ldots ,N-1\), be a \((\mathcal{D}_{n},\mathcal{B}(\mathbb{R}))\)-measurable map, referred to as one-stage reward function, and \(r_{N}:E\to \mathbb{R}\) an \((\mathcal{E},\mathcal{B}(\mathbb{R}))\)-measurable map, referred to as terminal reward function. Here \(r_{n}(x,a)\) specifies the one-stage reward when action \(a\) is taken at time \(n\) in state \(x\), and \(r_{N}(x)\) specifies the reward of being in state \(x\) at the terminal time \(N\). Finally, set \(\vec{\boldsymbol{r}}:=(r_{n})_{n=0}^{N}\).
For any fixed subset \(\mathcal{P}\subseteq \overline{\mathcal{P}}\), the collection of the objects \((E,\mathcal{E})\), \(\boldsymbol{A}\), \(\varPi \), \(\mathcal{P}\), the probability measures \(\mathbb{P}^{x_{0},P;\pi}\), \(X\) and \(\vec{\boldsymbol{r}}\) introduced so far is often referred to as a Markov decision model. In the standard literature, the set \(\mathcal{P}\) is typically a singleton. If, however, there is uncertainty with respect to the ‘true’ transition function, then one should allow a whole bundle of transition functions in the model.
6.1.2 Intrinsic optimisation problem
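We assume that \(r_{k}(X_{k},f_{k}(X_{k}))\), \(k=0,\ldots ,N-1\), and \(r_{N}(X_{N})\) are \(\mathbb{P}^{x_{0},P;\pi}\)-integrable for any \(x_{0}\in E\), \(P\in{\mathcal{P}}\), \(\pi \in \varPi \) (for a sufficient condition, see Lemma 6.1 below). As a consequence, we can define for any \(P\in{\mathcal{P}}\) and \(\pi =(f_{n})_{n=0}^{N-1}\in \varPi \) an \((\mathcal{E},\mathcal{B}(\mathbb{R}))\)-measurable map \(V_{0}^{P;\pi}:E\to \mathbb{R}\) by
\[V_{0}^{P;\pi}(x_{0}):=\mathbb{E}^{x_{0},P;\pi}\Big[\sum _{k=0}^{N-1}r_{k}(X_{k},f_{k}(X_{k}))+r_{N}(X_{N})\Big],\]
where \(\mathbb{E}^{x_{0},P;\pi}\) denotes the expectation with respect to \(\mathbb{P}^{x_{0},P;\pi}\).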
The value \(V_{0}^{P;\pi}(x_{0})\) specifies the expected total reward of \(X\) under \(\mathbb{P}^{x_{0},P;\pi}\). Here ‘under \(\mathbb{P}^{x_{0},P;\pi}\)’ means that \(X\) starts in \(x_{0}\) and that the random transitions of \(X\) are governed by \(P\) and \(\pi \). For fixed \(P\in{\mathcal{P}}\), it is natural to look for those strategies \(\pi \in \varPi \) for which the expected total reward from time 0 to \(N\) is maximal for a given initial state \(x_{0}\in E\). This results in the optimisation problem
\[V_{0}^{P;\pi}(x_{0})\longrightarrow \max \quad (\text{in } \pi \in \varPi ).\tag{6.1}\]
We assume that \(\sup _{\pi \in \varPi}V_{0}^{P;\pi}(x_{0})<\infty \) for any \(x_{0}\in E\), which means that it is impossible to gain an arbitrarily high reward. A strategy \(\pi ^{P}\in \varPi \) is said to be optimal for (6.1) if \(V_{0}^{P;\pi ^{P}}(x_{0})=V_{0}^{P}(x_{0})\) for any \(x_{0}\in E\), where the map \(V_{0}^{P}:E\to \mathbb{R}\) is defined by \(V_{0}^{P}(x_{0}):=\sup _{\pi \in \varPi}V_{0}^{P;\pi}(x_{0})\). The map \(V_{0}^{P}\) is referred to as the value function.
Some known facts about the existence of optimal strategies are recalled in Appendix C. Part (i) of Theorem C.1 shows that under some assumptions, the value function can be obtained by the Bellman iteration scheme. The latter involves the time-\(n\) value functions \(V_{n}^{P}\), \(n=1,\ldots ,N\), defined by \(V_{n}^{P}(x):=\sup _{\pi \in \varPi}V_{n}^{P;\pi}(x)\), where for any \(\pi =(f_{n})_{n=0}^{N-1}\in \varPi \) the \((\mathcal{E},\mathcal{B}(\mathbb{R}))\)-measurable map \(V_{n}^{P;\pi}:E\to \mathbb{R}\) is defined by
\[V_{n}^{P;\pi}(x):=\mathbb{E}^{x_{0},P;\pi}\Big[\sum _{k=n}^{N-1}r_{k}(X_{k},f_{k}(X_{k}))+r_{N}(X_{N})\,\Big|\,X_{n}=x\Big]\]
(note that the right-hand side is independent of \(x_{0}\in E\)). Here and in the following, we use the convention \(\sum _{n=N}^{N-1}:=0\). The maps \(V_{n}^{P;\pi} (\, \cdot \,)\), \(\pi \in \varPi \), are sometimes called policy value functions and appear in Theorem 6.2.
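For a finite state space, the Bellman iteration scheme mentioned above can be sketched in a few lines. The following Python sketch is only an illustration under simplifying assumptions that are not part of the text: \(E\) and all sets \(A_{n}(x)\) are finite, every decision rule is admissible, and the standard backward recursion \(V_{N}=r_{N}\), \(V_{n}(x)=\max _{a\in A_{n}(x)}\{r_{n}(x,a)+\sum _{y}P_{n}((x,a),\{y\})V_{n+1}(y)\}\) is used. All function names and the toy model are hypothetical.

```python
from itertools import product


def backward_step(n, x, a, next_values, kernel, reward):
    """r_n(x, a) plus the expected continuation value under P_n((x, a), .)."""
    return reward(n, x, a) + sum(
        p * next_values[y] for y, p in kernel(n, x, a).items()
    )


def bellman_iteration(states, actions, kernel, reward, terminal_reward, horizon):
    """Return (values, rules) with values[n][x] = V_n(x) and rules[n][x] = f_n(x)."""
    values = [dict() for _ in range(horizon + 1)]
    rules = [dict() for _ in range(horizon)]
    values[horizon] = {x: terminal_reward(x) for x in states}
    for n in reversed(range(horizon)):
        for x in states:
            candidates = [
                (backward_step(n, x, a, values[n + 1], kernel, reward), a)
                for a in actions(n, x)
            ]
            values[n][x], rules[n][x] = max(candidates, key=lambda c: c[0])
    return values, rules


def policy_values(states, rules, kernel, reward, terminal_reward, horizon):
    """Policy value functions of the strategy pi = (rules[0], ..., rules[N-1])."""
    values = [dict() for _ in range(horizon + 1)]
    values[horizon] = {x: terminal_reward(x) for x in states}
    for n in reversed(range(horizon)):
        for x in states:
            values[n][x] = backward_step(
                n, x, rules[n][x], values[n + 1], kernel, reward
            )
    return values


def brute_force_value(states, actions, kernel, reward, terminal_reward, horizon, x0):
    """Supremum over all strategies of the policy value at x0, by enumeration."""
    rules_per_stage = [
        [dict(zip(states, c)) for c in product(*(actions(n, x) for x in states))]
        for n in range(horizon)
    ]
    return max(
        policy_values(states, list(pi), kernel, reward, terminal_reward, horizon)[0][x0]
        for pi in product(*rules_per_stage)
    )


# Toy model with hypothetical numbers: two states, two actions, horizon N = 2.
STATES = [0, 1]
HORIZON = 2


def toy_actions(n, x):
    return ["stay", "move"]


def toy_kernel(n, x, a):
    stay_prob = 0.9 if a == "stay" else 0.2
    return {x: stay_prob, 1 - x: 1.0 - stay_prob}


def toy_reward(n, x, a):
    return float(x) - (0.3 if a == "move" else 0.0)


def toy_terminal(x):
    return 2.0 * x


VALUES, RULES = bellman_iteration(
    STATES, toy_actions, toy_kernel, toy_reward, toy_terminal, HORIZON
)
```

In this toy model, `VALUES[0][x0]` can be compared with `brute_force_value(...)`, which enumerates all strategies explicitly; the enumeration is only feasible for very small models and serves as a sanity check.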
6.1.3 Bounding function
For the Markov decision model introduced above and \(P\in{\mathcal{P}}\), an \((\mathcal{E},\mathcal{B}([1,\infty ))\)-measurable function \(\psi :E\to [1,\infty )\) is called a bounding function for \(P\) if there exist finite constants \(K_{1},K_{2},K_{3}>0\) such that the following three assertions hold:
(a) \(|r_{n}(x,a)| \le K_{1} \psi (x)\) for any \(n=0,\ldots ,N-1\) and \((x,a)\in D_{n}\);
(b) \(|r_{N}(x)| \le K_{2} \psi (x)\) for any \(x\in E\);
(c) \(\int _{E}\psi (y)\,P_{n}((x,a),dy)\le K_{3} \psi (x)\) for any \(n=0,\ldots ,N-1\) and \((x,a)\in D_{n}\).
This terminology is adapted from the work of Müller [34, Definition 2.4] and the textbook by Bäuerle and Rieder [1, Definition 2.4.1]. Denote by \(\mathbb{M}(E)\) the set of all \((\mathcal{E},\mathcal{B}(\mathbb{R}))\)-measurable maps \(v:E\to \mathbb{R}\), and by \(\mathbb{M}_{\psi}(E)\) the set of all \(v\in \mathbb{M}(E)\) satisfying \(\|v\|_{\psi}<\infty \), where \(\|v\|_{\psi}:=\sup _{x\in E}|v(x)|/\psi (x)\).
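To make conditions (a)–(c) and the weighted norm \(\|\cdot \|_{\psi}\) concrete, the following Python sketch computes the smallest admissible constants \(K_{1},K_{2},K_{3}\) in a finite model. It is a hedged illustration only: on a finite state space every \(\psi \ge 1\) is trivially a bounding function, so the concept only becomes restrictive for unbounded state spaces. The reflected random walk, the rewards and all function names are hypothetical.

```python
import math


def psi_norm(v, psi, states):
    """Weighted sup-norm sup_x |v(x)| / psi(x) over a finite state space."""
    return max(abs(v(x)) / psi(x) for x in states)


def bounding_constants(states, actions, kernel, reward, terminal_reward, psi, horizon):
    """Smallest K1, K2, K3 for which (a)-(c) hold in a finite model."""
    pairs = [(n, x, a) for n in range(horizon) for x in states for a in actions(n, x)]
    k1 = max(abs(reward(n, x, a)) / psi(x) for n, x, a in pairs)
    k2 = max(abs(terminal_reward(x)) / psi(x) for x in states)
    k3 = max(
        sum(p * psi(y) for y, p in kernel(n, x, a).items()) / psi(x)
        for n, x, a in pairs
    )
    return k1, k2, k3


# Hypothetical example: reflected random walk on {0, ..., 5} with psi(x) = 1 + x.
STATES = list(range(6))


def walk_actions(n, x):
    return [0, 1]


def walk_kernel(n, x, a):
    # One step up or down with probability 1/2 each, reflected at 0 and 5.
    return {min(x + 1, 5): 0.5, max(x - 1, 0): 0.5}


def walk_reward(n, x, a):
    return float(a * x)


def walk_terminal(x):
    return math.sqrt(x)


def psi(x):
    return 1.0 + x


K1, K2, K3 = bounding_constants(
    STATES, walk_actions, walk_kernel, walk_reward, walk_terminal, psi, horizon=3
)
```

Here `K2` coincides with `psi_norm(walk_terminal, psi, STATES)`, which reflects that condition (b) is just the statement \(\|r_{N}\|_{\psi}\le K_{2}\).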
Lemma 6.1
Let \(P\in{\mathcal{P}}\). If there exists a bounding function \(\psi \) for \(P\), then the random variables \(r_{k}(X_{k},f_{k}(X_{k}))\), \(k=0,\ldots ,N-1\), and \(r_{N}(X_{N})\) are \(\mathbb{P}^{x_{0},P;\pi}\)-integrable for any \(x_{0}\in E\) and \(\pi \in \varPi \), and moreover, \(\|V_{n}^{P}\|_{\psi}<\infty \) (in particular, \(\|V_{n}^{P;\pi}\|_{\psi}<\infty \) for any \(\pi \in \varPi \)) for any \(n=0,\ldots ,N-1\).
6.1.4 Continuous dependence of the optimal value on the transition function
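Let \(\psi :E\to [1,\infty )\) be an \((\mathcal{E},\mathcal{B}([1,\infty ))\)-measurable function, and note that the integral \(\int _{E} v\,d\mathfrak{m}\) exists and is finite for any \(v\in \mathbb{M}_{\psi}(E)\) and \(\mathfrak{m}\in{\mathcal{M}}_{1}^{\psi}(E)\), the set of all probability measures on \((E,\mathcal{E})\) with \(\int \psi \,d\mathfrak{m}<\infty \). For any fixed subset \(\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)\), the distance between \(\mathfrak{m}_{1}\) and \(\mathfrak{m}_{2}\) from \(\mathcal{M}_{1}^{\psi}(E)\) can be measured by
\[d_{\mathbb{M}}(\mathfrak{m}_{1},\mathfrak{m}_{2}):=\sup _{v\in \mathbb{M}}\Big|\int _{E} v\,d\mathfrak{m}_{1}-\int _{E} v\,d\mathfrak{m}_{2}\Big|.\tag{6.2}\]
Note that (6.2) defines a probability pseudo-metric (in the sense of Rachev [41, Sect. 2.3]), i.e., a map \(d_{\mathbb{M}}:\mathcal{M}_{1}^{\psi}(E)\times \mathcal{M}_{1}^{\psi}(E)\to [0,\infty ]\) which is symmetric and fulfils the triangle inequality. If \(\mathbb{M}\) separates points in \(\mathcal{M}_{1}^{\psi}(E)\) (i.e., if any two \(\mathfrak{m}_{1},\mathfrak{m}_{2}\in{\mathcal{M}}_{1}^{\psi}(E)\) coincide when \(\int _{E} v\,d\mathfrak{m}_{1}=\int _{E} v\,d\mathfrak{m}_{2}\) for all \(v\in \mathbb{M}\)), then \(d_{\mathbb{M}}\) is even a probability metric. It is sometimes called integral probability metric or probability metric with a \(\zeta \)-structure; see Müller [35] and Zolotarev [54].
In some situations, the (pseudo-)metric \(d_{\mathbb{M}}\) (with \(\mathbb{M}\) fixed) can be represented by the right-hand side of (6.2) with \(\mathbb{M}\) replaced by a different subset \(\mathbb{M}'\) of \(\mathbb{M}_{\psi}(E)\). Each such set \(\mathbb{M}'\) is said to be a generator of \(d_{\mathbb{M}}\). The largest generator of \(d_{\mathbb{M}}\) is called the maximal generator of \(d_{\mathbb{M}}\) and will be denoted by \(\overline{\mathbb{M}}\). That is, \(\overline{\mathbb{M}}\) is the set of all \(v\in \mathbb{M}_{\psi}(E)\) for which \(|\int _{E} v\,d\mathfrak{m}_{1}-\int _{E} v\,d\mathfrak{m}_{2}|\le d_{\mathbb{M}}(\mathfrak{m}_{1},\mathfrak{m}_{2})\) for all \(\mathfrak{m}_{1},\mathfrak{m}_{2}\in{\mathcal{M}}_{1}^{\psi}(E)\); see [35, Definition 3.1]. Examples for \(\mathbb{M}\) and \(\overline{\mathbb{M}}\) are discussed in Kern et al. [26] and Müller [34, 35].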
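The structure of an integral probability metric can be illustrated on a finite set. The following Python sketch assumes the standard form of such a metric, namely the supremum over a class of test functions of the absolute difference of the two integrals, and evaluates it for two finite classes of indicator functions: half-lines (which gives the Kolmogorov distance) and all subsets (which gives the total variation distance). The measures `M1`, `M2` and all names are hypothetical.

```python
from itertools import combinations


def integral(v, m):
    """Integral of v with respect to a discrete measure m (dict: point -> mass)."""
    return sum(v(x) * p for x, p in m.items())


def ipm_distance(test_functions, m1, m2):
    """Supremum over a finite class of |int v dm1 - int v dm2|."""
    return max(abs(integral(v, m1) - integral(v, m2)) for v in test_functions)


def indicator(subset):
    members = frozenset(subset)
    return lambda x: 1.0 if x in members else 0.0


POINTS = [0, 1, 2, 3]

# Indicators of the half-lines {x <= t}: Kolmogorov distance.
KOLMOGOROV = [indicator([x for x in POINTS if x <= t]) for t in POINTS]

# Indicators of all subsets of POINTS: total variation distance.
TOTAL_VARIATION = [
    indicator(c) for k in range(len(POINTS) + 1) for c in combinations(POINTS, k)
]

M1 = {0: 0.4, 1: 0.1, 2: 0.1, 3: 0.4}
M2 = {0: 0.1, 1: 0.4, 2: 0.4, 3: 0.1}

D_KOLMOGOROV = ipm_distance(KOLMOGOROV, M1, M2)
D_TOTAL_VARIATION = ipm_distance(TOTAL_VARIATION, M1, M2)
```

Since the half-line indicators form a subset of all indicators, the first distance cannot exceed the second. This shows in miniature how the choice of the class of test functions determines how fine the resulting metric is.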
Now denote by \(\mathcal{P}_{\psi}\) the set of all transition functions \(P=(P_{n})_{n=0}^{N-1}\in{\mathcal{P}}\) with \(P_{n}((x,a),\,\cdot \,) \in{\mathcal{M}}_{1}^{\psi}(E)\) for all \((x,a)\in D_{n}\) and \(n=0,\ldots ,N-1\). For any \(P\in{\mathcal{P}}_{\psi}\), the integrals \(\int _{E} v(y)\,P_{n}((x,a),dy)\), \(v\in \mathbb{M}_{\psi}(E)\), \((x,a)\in D_{n}\), \(n=0,\ldots ,N-1\), exist and are finite. For any \(\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)\), we may define the distance between two transition functions \(P=(P_{n})_{n=0}^{N-1}\) and \(Q=(Q_{n})_{n=0}^{N-1}\) from \(\mathcal{P}_{\psi}\) by
\[d_{\infty ,\mathbb{M}}^{\psi}(P,Q):=\max _{n=0,\ldots ,N-1}\,\sup _{(x,a)\in D_{n}}\frac{1}{\psi (x)}\,d_{\mathbb{M}}\big(P_{n}((x,a),\,\cdot \,),Q_{n}((x,a),\,\cdot \,)\big).\tag{6.3}\]
For any \(\mathbb{M}\subseteq \mathbb{M}_{\psi}(E)\), the Minkowski functional \(\rho _{\mathbb{M}}:\mathbb{M}_{\psi}(E)\to [0,\infty ]\) (in the sense of Rudin [42, paragraph after Definition 1.33]) is defined by
\[\rho _{\mathbb{M}}(v):=\inf \{\lambda >0: v/\lambda \in \mathbb{M}\},\]
where we set \(\inf \emptyset :=\infty \). Examples are discussed in Kern et al. [26] and Müller [34]. In the following result, we assume that \(\psi \) is a bounding function for any \(Q\in{\mathcal{P}}_{\psi}\). By Lemma 6.1, it then follows that \(V_{n}^{Q}(x)<\infty \) for any \(n=0,\ldots ,N\), \(Q\in{\mathcal{P}}_{\psi}\) and \(x\in E\). In particular, we can define a functional \(\overline{\mathcal{V}}_{n}^{x}:\mathcal{P}_{\psi}\to \mathbb{R}\) by \(\overline{\mathcal{V}}_{n}^{x}(Q):=V_{n}^{Q}(x)\). Note that Theorem 6.2 is a refinement of Kern’s PhD thesis [25, Theorem 2.2.8] and that a related result was proved earlier by Müller [34, Theorem 4.2]. We use \(K_{3,P}\) to denote the constant in condition (c) of a bounding function for \(P\).
Theorem 6.2
We assume that \(\psi \) is a bounding function for any \(Q\in{\mathcal{P}}_{\psi}\), and we let and be a generator of . Then for any \(n=0,\ldots ,N-1\), \(x_{n}\in E\) and \(Q,P\in{\mathcal{P}}_{\psi}\), we have
As a direct consequence of Theorem 6.2, we get the following result.
Corollary 6.3
Assume that \(\psi \) is a bounding function for any \(Q\in{\mathcal{P}}_{\psi}\) and let \(P\in{\mathcal{P}}_{\psi}\). Let and be a generator of . If and for any \(n=0,\ldots ,N-1\), then \(\overline{\mathcal{V}}_{n}^{x_{n}}\) is -continuous at \(P\) for any \(n=0,\ldots ,N-1\) and \(x_{n}\in E\).
6.2 A utility-based portfolio optimisation problem
6.2.1 Financial market model and a terminal wealth optimisation problem
Consider an \(N\)-period financial market consisting of one riskless bond \(S^{0}=(S^{0}_{n})_{n=0}^{N}\) and \(d\) risky assets \(S^{i}=(S^{i}_{n})_{n=0}^{N}\), \(i=1,\ldots ,d\), for some fixed \(d\in \mathbb{N}\). Assume that the value of the bond evolves deterministically according to
\[S_{n+1}^{0}:=Z_{n+1}^{0}S_{n}^{0},\qquad n=0,\ldots ,N-1,\]
for some fixed constants \(Z^{0}_{1},\ldots ,Z^{0}_{N}\in [1,\infty )\), and that the value of the \(i\)th asset evolves stochastically according to
\[S_{n+1}^{i}:=Z_{n+1}^{i}S_{n}^{i},\qquad n=0,\ldots ,N-1,\]
for a constant initial value \(S_{0}^{i}\in (0,\infty )\) and independent \(\mathbb{R}_{\ge 0}\)-valued random variables \(Z^{i}_{1},\ldots ,Z^{i}_{N}\) on a common probability space \((\varOmega ,\mathcal{F},\mathbb{P})\). For \(n=0,\ldots ,N\), set \(S_{n}:=(S_{n}^{1},\ldots ,S_{n}^{d})\) and \(Z_{n}:=(Z_{n}^{1},\ldots ,Z_{n}^{d})\) and denote by \(\mu _{n}\) the distribution of \(Z_{n}\). We also define \(\mathcal{F}_{0}:=\{\emptyset ,\varOmega \}\), \(\mathcal{F}_{n}:=\sigma (S_{0},\ldots ,S_{n})=\sigma (Z_{1},\ldots ,Z_{n})\), \(n=1,\ldots ,N\), and \(\mathbb{F}:=(\mathcal{F}_{n})_{n=0}^{N}\).
Now, an agent invests a given initial capital \(x_{0}\) in the bond and the assets according to some self-financing trading strategy. By a trading strategy, we mean an \(\mathbb{F}\)-adapted \(\mathbb{R}_{\ge 0}^{d+1}\)-valued stochastic process \(\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}\) with \(\xi _{n}=(\xi _{n}^{1},\ldots ,\xi _{n}^{d})\), where \(\xi _{n}^{0}\) and \(\xi _{n}^{i}\) specify the amounts of capital invested in the bond and in the \(i\)th asset, respectively, during the time interval \([n,n+1)\). The nonnegativity of \(\xi _{n}^{0},\xi _{n}^{1},\ldots ,\xi _{n}^{d}\), \(n=0,\ldots ,N-1\), means that taking loans and short selling of the assets are excluded. The corresponding (\(\mathbb{F}\)-adapted) portfolio process \(X^{\xi}=(X_{n}^{\xi})_{n=0}^{N}\) associated with \(\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}\) is defined by
\[X_{0}^{\xi}:=\xi _{0}^{0}+\langle \xi _{0},\boldsymbol{1}\rangle ,\qquad X_{n+1}^{\xi}:=\xi _{n}^{0}Z_{n+1}^{0}+\langle \xi _{n},Z_{n+1}\rangle ,\qquad n=0,\ldots ,N-1.\tag{6.4}\]
A trading strategy \(\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}\) is called self-financing with respect to the initial capital \(x_{0}\) if \(x_{0}=\xi _{0}^{0}+\langle \xi _{0},\boldsymbol{1}\rangle \) and \(X_{n}^{\xi}=\xi _{n}^{0}+\langle \xi _{n},\boldsymbol{1}\rangle \) for any \(n=1,\ldots ,N-1\). Note that \(\xi _{n}^{0}\) and \(\langle \xi _{n},\boldsymbol{1}\rangle \) specify the amounts of capital invested during the time interval \([n,n+1)\) in the bond and in the \(d\) assets, respectively. For any self-financing trading strategy \(\xi =(\xi _{n}^{0},\xi _{n})_{n=0}^{N-1}\) with respect to \(x_{0}\), we have \(\xi _{n}^{0}=X_{n}^{\xi}-\langle \xi _{n},\boldsymbol{1}\rangle \) for any \(n=0,\ldots ,N-1\), and therefore the corresponding portfolio process admits the representation
\[X_{0}^{\xi}=x_{0},\qquad X_{n+1}^{\xi}=Z_{n+1}^{0}X_{n}^{\xi}+\langle \xi _{n},Z_{n+1}-Z_{n+1}^{0}\boldsymbol{1}\rangle ,\qquad n=0,\ldots ,N-1.\tag{6.5}\]
In view of (6.5), we identify a self-financing trading strategy with respect to \(x_{0}\) with an \(\mathbb{F}\)-adapted \(\mathbb{R}_{\ge 0}^{d}\)-valued stochastic process \(\xi =(\xi _{n})_{n=0}^{N-1}\) with \(\xi _{n}=(\xi _{n}^{1}, \ldots ,\xi _{n}^{d})\) such that \(\langle \xi _{0},\boldsymbol{1}\rangle \in [0,x_{0}]\) and \(\langle \xi _{n},\boldsymbol{1}\rangle \in [0,X_{n}^{\xi}]\) for any \(n=1,\ldots ,N-1\). We restrict ourselves to Markovian self-financing trading strategies \(\xi =(\xi _{n})_{n=0}^{N-1}\) with respect to \(x_{0}\), which means that \(\xi _{n}\) only depends on \(n\) and \(X_{n}^{\xi}\). To put it another way, we assume that for any \(n=0,\ldots ,N-1\), there exists a Borel-measurable map \(f_{n}:\mathbb{R}_{\ge 0}\to \mathbb{R}_{\ge 0}^{d}\) such that \(\xi _{n} = f_{n}(X_{n}^{\xi})\). Then in particular, \(X^{\xi}\) is an \(\mathbb{R}_{\ge 0}\)-valued \(\mathbb{F}\)-Markov process whose one-step transition probability at time \(n\in \{0,\ldots ,N-1\}\) given state \(x\in \mathbb{R}_{\ge 0}\) and strategy \(\xi =(\xi _{n})_{n=0}^{N-1}\) (resp. \(\pi :=(f_{n})_{n=0}^{N-1}\)) is given by \(\mu _{n+1}\circ \eta _{n,(x,f_{n}(x))}^{-1} \), where
\[\eta _{n,(x,a)}(z):=Z_{n+1}^{0}x+\langle a,z-Z_{n+1}^{0}\boldsymbol{1}\rangle ,\qquad z\in \mathbb{R}_{\ge 0}^{d}.\tag{6.6}\]
The agent’s aim is to find a self-financing trading strategy \(\xi =(\xi _{n})_{n=0}^{N-1}\) (resp. \(\pi =(f_{n})_{n=0}^{N-1}\)) with respect to \(x_{0}\) for which her expected utility of the relative terminal wealth is maximised. We assume that the agent is risk-averse and that her attitude towards risk is set via the power utility function \(u_{\alpha}:\mathbb{R}_{\ge 0}\to \mathbb{R}_{\ge 0}\) defined by
\[u_{\alpha}(x):=x^{\alpha}\tag{6.7}\]
for some fixed \(\alpha \in (0,1)\). Hence the agent is interested in those self-financing trading strategies \(\xi =(\xi _{n})_{n=0}^{N-1}\) (resp. \(\pi =(f_{n})_{n=0}^{N-1}\)) with respect to \(x_{0}\) for which the expectation of \(u_{\alpha}(X_{N}^{\xi}/(x_{0}S_{N}^{0}))\) is maximised. Since \(u_{\alpha}(X_{N}^{\xi}/(x_{0}S_{N}^{0})) = u_{\alpha}(X_{N}^{\xi})/(x_{0}S_{N}^{0})^{ \alpha}\), this is equivalent to maximising the expectation of \(u_{\alpha}(X_{N}^{\xi})\). For notational simplicity, we consider the terminal wealth optimisation problem in the latter form. We assume that \(Z_{n}^{1},\ldots ,Z_{n}^{d}\) are ℙ-a.s. strictly positive and for any \(n=1,\ldots ,N\).
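The wealth dynamics and the scaling argument used above can be checked numerically. The following Python sketch rests on two assumptions: that the portfolio value is updated by \(X_{n+1}=Z^{0}_{n+1}(X_{n}-\langle \xi _{n},\boldsymbol{1}\rangle )+\langle \xi _{n},Z_{n+1}\rangle \), and that the power utility is \(u_{\alpha}(x)=x^{\alpha}\). The constant-proportion rule, the price paths and all function names are hypothetical.

```python
def power_utility(x, alpha):
    """Power utility x**alpha for x >= 0 and 0 < alpha < 1 (assumed form)."""
    return x ** alpha


def wealth_step(x, xi, z0_next, z_next):
    """One step of the self-financing wealth recursion.

    x       : current wealth X_n
    xi      : amounts invested in the d risky assets, 0 <= sum(xi) <= x
    z0_next : relative price change of the bond over the next period
    z_next  : relative price changes of the d assets over the next period
    """
    invested = sum(xi)
    if min(xi) < 0 or invested > x * (1 + 1e-12):
        raise ValueError("loans and short selling are excluded")
    return z0_next * (x - invested) + sum(a * z for a, z in zip(xi, z_next))


def terminal_wealth(x0, rules, z0_path, z_path):
    """Terminal wealth under a Markovian strategy xi_n = f_n(X_n)."""
    x = x0
    for f, z0_next, z_next in zip(rules, z0_path, z_path):
        x = wealth_step(x, f(x), z0_next, z_next)
    return x


def constant_proportions(weights):
    """Decision rule investing the fraction weights[i] of wealth in asset i."""
    return lambda x: [w * x for w in weights]


# Hypothetical two-period example with d = 2 risky assets.
RULE = constant_proportions([0.5, 0.25])
X_N = terminal_wealth(
    100.0,
    [RULE, RULE],
    z0_path=[1.01, 1.01],
    z_path=[(1.1, 0.9), (0.95, 1.2)],
)
```

With these numbers the wealth grows by the factor 1.0275 in each period, so `X_N` is approximately 105.5756. The identity \(u_{\alpha}(x/c)=u_{\alpha}(x)/c^{\alpha}\) for \(c>0\) is what makes maximising the utility of the relative terminal wealth equivalent to maximising the utility of the terminal wealth itself.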
Example 6.4
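Assume that the bond and the \(d\) assets evolve according to the 1-dimensional ordinary and Itô stochastic differential equations, respectively,
\[d\mathsf{s}_{t}^{0}=\delta _{0}\mathsf{s}_{t}^{0}\,dt,\qquad d\mathsf{s}_{t}^{i}=\delta _{i}\mathsf{s}_{t}^{i}\,dt+\sigma _{i}\mathsf{s}_{t}^{i}\,dB_{t}^{i},\qquad i=1,\ldots ,d,\]
where \(\delta _{0},\delta _{1},\ldots ,\delta _{d}\) and \(\sigma _{1},\ldots ,\sigma _{d}\) are constants and \(B^{1},\ldots ,B^{d}\) are (jointly Gaussian) correlated 1-dimensional standard Brownian motions which satisfy for any \(s,t\ge 0\) and \(i,j=1,\ldots ,d\) that \(\mathbb{C}\mathrm{ov}(B_{s}^{i},B_{t}^{j})=R_{i,j}\min \{s,t\}\), where \(R=(R_{i,j})_{1\le i,j\le d}\) is a fixed correlation matrix (i.e., \(R\) is symmetric and positive semi-definite with entries 1 on the diagonal). This is a multivariate version of the classical Black–Scholes–Merton model. Choose the trading period to be the unit interval \([0,1]\) and assume that the bond and the assets can be traded only at \(N\) equidistant time points in \([0,1]\), namely at \(t_{N,n}:=n/N\), \(n=0,\ldots ,N-1\). Then the relative price changes \(Z^{0}_{n+1}:=S^{0}_{n+1}/S^{0}_{n}=\mathsf{s}^{0}_{t_{N,n+1}}/ \mathsf{s}^{0}_{t_{N,n}}\) and \(Z_{n+1}^{i}:=S_{n+1}^{i}/S_{n}^{i}=\mathsf{s}^{i}_{t_{N,n+1}}/ \mathsf{s}^{i}_{t_{N,n}}\) are given by, respectively, \(e^{\delta _{0}(t_{N,n+1}-t_{N,n})}\) and \(e^{(\delta _{i} - \sigma _{i}^{2}/2)(t_{N,n+1}-t_{N,n})+\sigma _{i}(B^{i}_{t_{N,n+1}}-B^{i}_{t_{N,n}})}\), i.e., for \(n=0,\ldots ,N-1\),
\[Z_{n+1}^{0}=e^{\delta _{0}/N},\qquad Z_{n+1}^{i}=e^{(\delta _{i}-\sigma _{i}^{2}/2)/N+\sigma _{i}(B_{(n+1)/N}^{i}-B_{n/N}^{i})},\qquad i=1,\ldots ,d.\]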
That is, we have \(Z_{n+1}=(e^{G_{1}},\ldots ,e^{G_{d}})\) for a \(d\)-variate random variable \((G_{1},\ldots ,G_{d})\) which has a \(d\)-variate normal distribution \(\mathbf{N}_{\delta ,\varGamma}\) with \(\delta :=(( \delta _{i} - \sigma _{i}^{2}/2)/N)_{i=1}^{d}\) and \(\varGamma :=( \sigma _{i}R_{i,j}\sigma _{j}/N)_{1\le i,j\le d}\). Thus we have \(\mu _{1}=\cdots =\mu _{N}=\mathrm{LN}_{\delta ,\varGamma}\), where \(\mathrm{LN}_{\delta ,\varGamma}\) is a \(d\)-variate log-normal distribution with parameters \(\delta \) and \(\varGamma \).
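The parameters \(\delta \) and \(\varGamma \) of Example 6.4 and a draw from \(\mathrm{LN}_{\delta ,\varGamma}\) can be computed as in the following Python sketch. It uses only the standard library and a hand-written Cholesky factorisation, which additionally assumes that \(\varGamma \) is positive definite (the text only requires \(R\) to be positive semi-definite). The numerical values of drifts, volatilities and correlation are hypothetical.

```python
import math
import random


def lognormal_parameters(drift, vol, corr, n_periods):
    """delta_i = (drift_i - vol_i^2 / 2) / N and Gamma_ij = vol_i R_ij vol_j / N."""
    d = len(drift)
    delta = [(drift[i] - vol[i] ** 2 / 2) / n_periods for i in range(d)]
    gamma = [
        [vol[i] * corr[i][j] * vol[j] / n_periods for j in range(d)]
        for i in range(d)
    ]
    return delta, gamma


def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (matrix assumed positive definite)."""
    d = len(matrix)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def sample_relative_changes(delta, gamma, rng):
    """One draw of (exp(G_1), ..., exp(G_d)) with G normal with mean delta, covariance gamma."""
    low = cholesky(gamma)
    d = len(delta)
    normals = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return [
        math.exp(delta[i] + sum(low[i][k] * normals[k] for k in range(i + 1)))
        for i in range(d)
    ]


# Hypothetical parameters: d = 2 assets, N = 4 trading periods.
DELTA, GAMMA = lognormal_parameters(
    drift=[0.05, 0.08],
    vol=[0.2, 0.3],
    corr=[[1.0, 0.5], [0.5, 1.0]],
    n_periods=4,
)
SAMPLE = sample_relative_changes(DELTA, GAMMA, random.Random(0))
```

Changing the off-diagonal entry of `corr` changes only the dependence structure of the relative price changes, while each marginal distribution stays log-normal with the same parameters. This is the kind of perturbation that copula robustness is concerned with.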
6.2.2 Interpretation as a Markov decision problem
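The terminal wealth optimisation problem just introduced can be embedded in the framework of Sect. 6.1 as follows. Let \(Z^{0}_{1},\ldots ,Z^{0}_{N}\in [1,\infty )\) be a priori fixed and choose \((E,\mathcal{E}):=(\mathbb{R}_{\ge 0},\mathcal{B}(\mathbb{R}_{\ge 0}))\). For any \(x\in \mathbb{R}_{\ge 0}\) and \(n=0,\ldots ,N-1\), let
\[A_{n}(x):=\{a\in \mathbb{R}_{\ge 0}^{d}:\langle a,\boldsymbol{1}\rangle \in [0,x]\}.\]
Hence \(A_{n}=\mathbb{R}_{\ge 0}^{d}\) and \(D_{n}=\{(x,a)\in \mathbb{R}_{\ge 0}\times \mathbb{R}_{\ge 0}^{d}:\langle a,\boldsymbol{1}\rangle \in [0,x]\}\) for \(n=0, \ldots ,N-1\). Let \(\mathcal{A}_{n}:=\mathcal{B}(\mathbb{R}_{\ge 0}^{d})\) for any \(n=0,\ldots ,N-1\), and let the set \(F\) consist of all those Borel-measurable maps \(f:\mathbb{R}_{\ge 0}\to \mathbb{R}_{\ge 0}^{d}\) that satisfy \(\langle f(x),\boldsymbol{1}\rangle \in [0,x]\) for any \(x\in \mathbb{R}_{\ge 0}\). Finally, let \(F_{n}:=F\) for \(n=0,\ldots ,N-1\) and \(\varPi :=F_{0}\times \cdots \times F_{N-1}=F^{N}\).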
Let be the set of all Borel probability measures on for which . The latter condition is equivalent to , which can be shown by using arguments as at the beginning of Appendix A.2. For any , we define a transition function \(P^{\vec{\boldsymbol{\mu}}}=(P_{n}^{\vec{\boldsymbol{\mu}}})_{n=0}^{N-1}\) by
where the map is defined by (6.6). The set of all such transition functions is denoted by \(\mathcal{P}_{\alpha}\), i.e., , and plays the role of \(\mathcal{P}\).
Let \(r_{n}:= 0\), \(n=0,\ldots ,N-1\), and \(r_{N}(x):=u_{\alpha}(x)\), \(x\in \mathbb{R}_{\ge 0}\). Then
\[V_{0}^{P;\pi}(x_{0})=\mathbb{E}^{x_{0},P;\pi}[u_{\alpha}(X_{N})]\]
for any \(x_{0}\in \mathbb{R}_{\ge 0}\), \(P\in{\mathcal{P}}_{\alpha}\) and \(\pi \in \varPi \), and the terminal wealth problem introduced subsequent to (6.7) can be identified with the optimisation problem (6.1), i.e., with
\[\mathbb{E}^{x_{0},P;\pi}[u_{\alpha}(X_{N})]\longrightarrow \max \quad (\text{in } \pi \in \varPi )\tag{6.8}\]
for any \(x_{0}\in \mathbb{R}_{\ge 0}\) and \(P\in{\mathcal{P}}_{\alpha}\). A strategy \(\pi ^{P}\in \varPi \) is called an optimal (self-financing) trading strategy for \(P\) if it solves the maximisation problem (6.8) for any \(x_{0}\in \mathbb{R}_{\ge 0}\). Note that the coordinate process \(X\) plays the role of the portfolio process \(X^{\xi}\) introduced in (6.4), and that for each \(x_{0}\in \mathbb{R}_{\ge 0}\), any self-financing trading strategy \(\xi =(\xi _{n})_{n=0}^{N-1}\) with respect to \(x_{0}\) may be identified with some \(\pi =(f_{n})_{n=0}^{N-1}\in \varPi \) through \(\xi _{n}=f_{n}(X_{n}^{\xi})\). Theorem C.3 ensures that optimal trading strategies exist.
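For intuition, the Markov decision formulation can be solved numerically in a very simple special case. The following Python sketch is not the statement of Theorem C.3; it assumes \(d=1\), a relative price change with finitely many outcomes, the power utility \(u_{\alpha}(x)=x^{\alpha}\), and wealth dynamics of the form \(x\mapsto Z^{0}x+a(z-Z^{0})\) with \(a\in [0,x]\). Writing \(a=\theta x\) with \(\theta \in [0,1]\), the homogeneity of \(u_{\alpha}\) gives \(V_{n}(x)=v_{n}x^{\alpha}\) with \(v_{N}=1\) and \(v_{n}=v_{n+1}\sup _{\theta}\mathbb{E}[(Z^{0}_{n+1}+\theta (Z_{n+1}-Z^{0}_{n+1}))^{\alpha}]\). The supremum is approximated by a grid search; all names and numbers are hypothetical.

```python
def one_period_factor(alpha, z0, outcomes, grid_size=1000):
    """Grid search for the sup over theta in [0, 1] of E[(z0 + theta (Z - z0))^alpha].

    outcomes : list of (value, probability) pairs for a single risky asset
    Returns (best_value, best_theta).
    """
    best_value, best_theta = float("-inf"), 0.0
    for k in range(grid_size + 1):
        theta = k / grid_size
        value = sum(p * (z0 + theta * (z - z0)) ** alpha for z, p in outcomes)
        if value > best_value:
            best_value, best_theta = value, theta
    return best_value, best_theta


def value_coefficients(alpha, z0_path, outcomes_path):
    """Coefficients v_n with V_n(x) = v_n * x**alpha, computed backwards from v_N = 1.

    z0_path[n] and outcomes_path[n] describe the period from time n to n + 1.
    Returns (coeffs, fractions), where fractions[n] is the maximising fraction
    of wealth invested in the risky asset at time n.
    """
    horizon = len(z0_path)
    coeffs = [1.0] * (horizon + 1)
    fractions = [0.0] * horizon
    for n in reversed(range(horizon)):
        factor, theta = one_period_factor(alpha, z0_path[n], outcomes_path[n])
        coeffs[n] = factor * coeffs[n + 1]
        fractions[n] = theta
    return coeffs, fractions


# Hypothetical market: bond factor 1, asset factor 1.5 or 0.6 with probability 1/2.
ALPHA = 0.5
OUTCOMES = [(1.5, 0.5), (0.6, 0.5)]
COEFFS, FRACTIONS = value_coefficients(ALPHA, [1.0] * 3, [OUTCOMES] * 3)
```

In this example the first-order condition can be solved by hand and gives the fraction \(\theta =1/2\) in every period, which the grid search reproduces. Replacing `OUTCOMES` by a slightly perturbed distribution changes `COEFFS` only slightly, which is the kind of continuity in the transition function studied in the next subsection.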
6.2.3 Continuous dependence of the optimal value on \(P^{\vec{\boldsymbol{\mu}}}\)
Let the function \(\psi _{\alpha}:\mathbb{R}_{\ge 0}\to [1,\infty )\) be defined by \(\psi _{\alpha}(x):=1 + u_{\alpha}(x)\). Moreover, let \(\mathcal{P}_{\psi _{\alpha}}\) be derived from \(\mathcal{P}_{\alpha}\) as \(\mathcal{P}_{\psi}\) is derived from \(\mathcal{P}\) in Sect. 6.1.
Lemma 6.5
\(\psi _{\alpha}\) is a bounding function for any \(P\in{\mathcal{P}}_{\alpha}\), and we have \(\mathcal{P}_{\psi _{\alpha}}=\mathcal{P}_{\alpha}\).
Let , where the Hölder-\(\alpha \) norm is defined by . We obviously have , and in view of Lemmas 6.5 and 6.1, we can therefore define a functional through \(\overline{\mathcal{V}}_{n}^{x}(P):=V_{n}^{P}(x)\). The set separates points in , implying that (defined by (6.2) with ) provides a metric on ; see Kern et al. [26] for details. Let be defined by (6.3) with and \(\psi :=\psi _{\alpha}\).
Theorem 6.6
For any \(n=0,\ldots ,N-1\) and , the map is -continuous.
Recall that the elements of \(\mathcal{P}_{\alpha}\) (\(=\mathcal{P}_{\psi _{ \alpha}}\)) are parametrised by the elements of the set . For any , denote by \(\overline{\mu}\) the element of whose \(N\) entries are all equal to \(\mu \), i.e., \(\overline{\mu}:=(\mu )_{n=1}^{N}\). Then we can define a functional by
Since we used \(\mathcal{O}_{d}^{\alpha}\) to denote the \(\alpha \)-weak topology on \(\mathcal{M}_{1}^{\alpha}\) (see Sect. 2.2), we use to denote the analogous topology on .
Corollary 6.7
For any \(n=0,\ldots ,N-1\) and , the map defined by (6.9) is -continuous.
6.3 Copula robustness of the maximal expected utility of the terminal wealth
For any \(n=0,\ldots ,N-1\) and , let the map be defined by (6.9), and note that \(\mathcal{V}_{0}^{x_{0}}(\mu )\) corresponds to the maximal expected utility of the terminal wealth in (6.8) with \(P=P^{\overline{\mu}}\). When regarding each as a Borel probability measure on the whole Euclidean space (with ), the set can be seen as a subset of \(\mathcal{M}_{1}^{\alpha}\). Thus \(\mathbf{C}_{d}(\mu _{1},\ldots \mu _{d})=\mathbf{C}_{d}\) for any , and Theorem 3.12 and Corollary 6.7 together imply the following result.
Corollary 6.8
For any \(n=0,\ldots ,N-1\) and , the map defined by (6.9) is copula robust.
References
[1] Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Berlin (2011)
[2] Bauer, H.: Measure and Integration Theory. de Gruyter, Berlin (2001)
[3] Bellini, F., Klar, B., Müller, A., Rosazza Gianin, E.: Generalized quantiles as risk measures. Insur. Math. Econ. 54, 41–48 (2014)
[4] Berge, C.: Topological Spaces. Macmillan, New York (1963)
[5] Bickel, P.J., Freedman, D.A.: Some asymptotic theory for the bootstrap. Ann. Stat. 9, 1196–1217 (1981)
[6] Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
[7] Charpentier, A., Segers, J.: Convergence of Archimedean copulas. Stat. Probab. Lett. 78, 412–419 (2008)
[8] Cheridito, P., Li, T.: Risk measures on Orlicz hearts. Math. Finance 19, 189–214 (2009)
[9] Claus, M., Krätschmer, V., Schultz, R.: Weak continuity of risk functionals with applications to stochastic programming. SIAM J. Optim. 27, 91–109 (2017)
[10] Deheuvels, P.: Caractérisation complète des lois extrêmes multivariées et de la convergence des types extrêmes. Publ. Inst. Stat. Univ. Paris 23, 1–36 (1978)
[11] Demarta, S., McNeil, A.J.: The \(t\) copula and related copulas. Int. Stat. Rev. 73, 111–129 (2005)
[12] Durante, F., Sempi, C.: Principles of Copula Theory. Taylor & Francis, London (2015)
[13] Embrechts, P., Liu, H., Wang, R.: Quantile-based risk sharing. Oper. Res. 66, 936–949 (2018)
[14] Embrechts, P., Puccetti, G.: Bounds for functions of dependent risks. Finance Stoch. 10, 341–352 (2006)
[15] Embrechts, P., Puccetti, G., Rüschendorf, L.: Model uncertainty and VaR aggregation. J. Bank. Finance 37, 2750–2764 (2013)
[16] Embrechts, P., Schied, A., Wang, R.: Robustness in the optimization of risk measures. Oper. Res. 70, 95–110 (2021)
[17] Embrechts, P., Wang, B., Wang, R.: Aggregation-robustness and model uncertainty of regulatory risk measures. Finance Stoch. 19, 763–790 (2015)
[18] Emmer, S., Kratz, M., Tasche, D.: What is the best risk measure in practice? A comparison of standard measures. J. Risk 18, 31–60 (2015)
[19] Fernández Sánchez, J., Trutschnig, W.: Conditioning-based metrics on the space of multivariate copulas and their interrelation with uniform and levelwise convergence and iterated function systems. J. Theor. Probab. 28, 1311–1336 (2015)
[20] Filipović, D., Kupper, M.: Optimal capital and risk transfers for group diversification. Math. Finance 18, 55–76 (2007)
[21] Filipović, D., Svindland, G.: Optimal capital and risk allocations for law- and cash-invariant convex functions. Finance Stoch. 12, 423–439 (2008)
[22] Föllmer, H., Schied, A.: Robust preferences and convex measures of risk. In: Sandmann, K., Schönbucher, P. (eds.) Advances in Finance and Stochastics. Essays in Honour of Dieter Sondermann, pp. 39–56. Springer, Berlin (2002)
[23] Föllmer, H., Schied, A.: Stochastic Finance. An Introduction in Discrete Time, 3rd revised and extended edn. de Gruyter, Berlin (2011)
[24] Kasper, T., Fuchs, S., Trutschnig, W.: On weak conditional convergence of bivariate Archimedean and extreme value copulas, and consequences to nonparametric estimation. Bernoulli 27, 2217–2240 (2021)
[25] Kern, P.: Sensitivity and Statistical Inference in Markov Decision Models and Collective Risk Models. PhD thesis, Saarland University, Saarbrücken (2020). Available online at https://doi.org/10.22028/D291-32385
[26] Kern, P., Simroth, A., Zähle, H.: First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function. Math. Methods Oper. Res. 92, 165–197 (2020)
[27] Krätschmer, V., Schied, A., Zähle, H.: Comparative and qualitative robustness for law-invariant risk measures. Finance Stoch. 18, 271–295 (2014)
[28] Krätschmer, V., Schied, A., Zähle, H.: Domains of weak continuity of statistical functionals with a view toward robust statistics. J. Multivar. Anal. 158, 1–19 (2017)
[29] Krätschmer, V., Zähle, H.: Statistical inference for expectile-based risk measures. Scand. J. Stat. 44, 425–454 (2017)
[30] Li, X., Mikusiński, P., Taylor, M.D.: Strong approximation of copulas. J. Math. Anal. Appl. 225, 608–623 (1998)
[31] Lindner, A., Szimayer, A.: A limit theorem for copulas. Discussion paper 433, Sonderforschungsbereich 386, Ludwig-Maximilian-Universität München (2005). Available online at https://epub.ub.uni-muenchen.de/1802/
[32] Markowitz, H.M.: Portfolio selection. J. Finance 7, 77–91 (1952)
[33] McNeil, A.J., Frey, R., Embrechts, P.: Quantitative Risk Management, 1st edn. Princeton University Press, Princeton (2005)
[34] Müller, A.: How does the value function of a Markov decision process depend on the transition probabilities? Math. Oper. Res. 22, 872–885 (1997)
[35] Müller, A.: Integral probability metrics and their generating classes of functions. Adv. Appl. Probab. 29, 429–443 (1997)
[36] Nelsen, R.B.: An Introduction to Copulas, 2nd edn. Springer, New York (2006)
[37] Ogryczak, W., Ruszczyński, A.: From stochastic dominance to mean–risk models: semideviations as risk measures. Eur. J. Oper. Res. 116, 33–50 (1999)
[38] Ogryczak, W., Ruszczyński, A.: Dual stochastic dominance and related mean–risk models. SIAM J. Optim. 13, 60–78 (2002)
[39] Puccetti, G.: Sharp bounds on the expected shortfall for a sum of dependent random variables. Stat. Probab. Lett. 83, 1227–1232 (2013)
[40] Rachasingho, J., Tasena, S.: A metric space of subcopulas – an approach via Hausdorff distance. Fuzzy Sets Syst. 378, 144–156 (2020)
[41] Rachev, S.T.: Probability Metrics and the Stability of Stochastic Models. Wiley, Chichester (1991)
[42] Rudin, W.: Functional Analysis, 2nd edn. McGraw-Hill, New York (1991)
[43] Rüschendorf, L.: Random variables with maximum sums. Adv. Appl. Probab. 14, 623–632 (1982)
[44] Rüschendorf, L.: Mathematical Risk Analysis. Springer, Berlin (2013)
[45] Saida, A.B., Prigent, J.L.: On the robustness of portfolio allocation under copula misspecification. Ann. Oper. Res. 262, 631–652 (2018)
[46] Schultz, R., Tiedemann, S.: Conditional value-at-risk in stochastic programs with mixed-integer recourse. Math. Program. 105, 365–386 (2006)
[47] Sempi, C.: Convergence of copulas: a critical remark. Rad. Mat. 12, 241–249 (2004)
[48] Shiryaev, A.N.: Probability, 2nd edn. Springer, New York (1996)
[49] Sklar, A.: Fonctions de répartition à \(n\) dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 8, 229–231 (1959)
[50] Trutschnig, W.: On a strong metric on the space of copulas and its induced dependence measure. J. Math. Anal. Appl. 384, 690–705 (2011)
[51] Trutschnig, W.: Some results on the convergence of (quasi-)copulas. Fuzzy Sets Syst. 191, 113–121 (2012)
[52] van der Vaart, A.W.: Asymptotic Statistics. Cambridge University Press, Cambridge (1998)
[53] Wang, S.S., Dhaene, J.: Comonotonicity, correlation order and premium principles. Insur. Math. Econ. 22, 235–242 (1998)
[54] Zolotarev, V.M.: Probability metrics. Theory Probab. Appl. 28, 278–302 (1983)
Acknowledgements
The author would like to thank an Associate Editor and two referees for their constructive comments and suggestions that led to an improved version of the article.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Appendices
Appendix A: Proofs
A.1 Proof of Proposition 2.2
By the assumption imposed on \(h\), we can find a constant such that
for any \(\mu \in{\mathcal{M}}_{d}^{p}\). This gives the first assertion. Since the involved topologies are metrisable, it suffices for the second assertion to show that the map \(\mathfrak{h}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d'}^{p'}\) is sequentially continuous. Let \(\mu \) and \(\mu _{n}\), \(n\in \mathbb{N}\), be elements of \(\mathcal{M}_{d}^{p}\) such that \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p}\) and thus in particular in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}\). By the classical continuous mapping theorem, we have \(\mathfrak{h}(\mu _{n})\to \mathfrak{h}(\mu )\) in \(\mathcal{O}_{d'}^{0}\cap{\mathcal{M}}_{d'}^{p'}\). Moreover, by the assumption on \(h\), the function \(f_{h}:\mathbb{R}^{d}\to \mathbb{R}\) defined by \(f_{h}(x):=|h(x)|^{p'}\) lies in \(\mathcal{C}_{d}^{p}\). This implies
Thus \(\mathfrak{h}(\mu _{n})\to \mathfrak{h}(\mu )\) in \(\mathcal{O}_{d'}^{p'}\). This gives the second assertion. □
A.2 Proof of Theorem 2.3
We first prove that \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\) if and only if \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}\), regardless of the copula \(C\in \mathbf{C}_{d}\). If we have \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\), then
where \(c_{p}:=c_{1}2^{\max \{0,p-1\}}\), \(|x|_{1}:=\sum _{i=1}^{d}|x_{i}|\) is the 1-norm of \(x=(x_{1},\ldots ,x_{d})\) and is a suitable constant. Thus \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}\). Conversely, assume that \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}\). Then we have
i.e., \(\mu _{i}\in{\mathcal{M}}_{1}^{p}\), for any \(i=1,\ldots ,d\).
To prove the main assertion of Theorem 2.3, we first let \((C,\mu _{1},\ldots ,\mu _{d})\) and \((C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\), \(n\in \mathbb{N}\), be elements of \(\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}\) such that \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{p}\), \(i=1,\ldots ,d\), and \(d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0\). Since \(p\)-weak convergence implies weak convergence, we then have in particular that \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{0}\cap{\mathcal{M}}_{1}^{p}\), \(i=1,\ldots ,d\). So Lindner and Szimayer [31, Theorem 2.1] implies that \(\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}\). For the convergence \(\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) in \(\mathcal{O}_{d}^{p}\) (when \(p>0\)), it remains to show the convergence
Clearly,
for any . Since \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}\) (as seen above), we can choose for every \(\varepsilon >0\) a suitable constant such that
Furthermore, for any , we have
where \(c_{p}:=c_{1}\,2^{\max \{0,p-1\}}\), \(|x|_{1}:=\sum _{i=1}^{d}|x_{i}|\) and \(|x|_{\infty}:=\max _{i=1,\ldots ,d}|x_{i}|\) for \(x=(x_{1},\ldots ,x_{n})\), and are suitable constants. For \(i=j\), the last integral equals . Since we assumed \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{p}\) (i.e., \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{0}\cap{\mathcal{M}}_{1}^{p}\) and ), Krätschmer et al. [28, Theorem 2.3 (5.⇒3.)] ensures that we can choose so large so that this expression (with \(a_{1}=a_{1,i,i}\)) is bounded above by \(\varepsilon /(3dc_{p})\) uniformly in . For \(i\neq j\) and any , the summand is bounded above by
Again by [28, Theorem 2.3] and the assumed convergence \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{p}\), we can choose \(b_{i}\) so large that . Once we have chosen \(b_{i}\), we can in view of \(S_{1,i,j}(n,a_{1},b)\le b^{p}\mu _{n,j}[[-a_{1}^{1/p}/c_{\infty},a_{1}^{1/p}/c_{ \infty}]^{c}]\) choose so large that ; take into account that as a weakly convergent sequence is tight. That is, we have when we set \(a_{1}:=\max _{i,j=1,\ldots ,d}a_{1,i,j}\). Finally, by the already established convergence \(\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}^{p}\), we can choose such that \(S_{2}(n,a)\le \varepsilon /(3c_{p})\) for \(a:=\max \{a_{1},a_{3}\}\) and all \(n\ge n_{0}\). Altogether, we have shown that for any given \(\varepsilon >0\), we can find an such that the left-hand side of (A.1) is \(\le \varepsilon \) for all \(n\ge n_{0}\).
Conversely, let \((C,\mu _{1},\ldots ,\mu _{d})\) and \((C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\), \(n\in \mathbb{N}\), be elements of \(\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}\) such that \(\mathfrak{P}_{d}(C_{n},\mu _{n,1},\ldots ,\mu _{n,d})\to \mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) in \(\mathcal{O}_{d}^{p}\). Since \(p\)-weak convergence implies weak convergence, it is a direct consequence of Lindner and Szimayer [31, Theorem 2.1] that this implies \(d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)\to 0\) and \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{0}\), \(i=1,\ldots ,d\). Moreover, for any \(i=1,\ldots ,d\) we have
since the function lies in \(\mathcal{C}_{d}^{p}\). Thus we even have \(\mu _{n,i}\to \mu _{i}\) in \(\mathcal{O}_{1}^{p}\), \(i=1,\ldots ,d\). □
A.3 Proof of Corollary 2.6
It suffices to show that \(\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})=\mathcal{O}_{d}^{0} \cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\). Recall that the \(p\)-weak topology is metrisable. Therefore it suffices to show that for any sequence \((\mu _{n})_{n\in \mathbb{N}}\subseteq \mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\) and any \(\mu \in{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\), it holds that \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) if and only if \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\). If \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\), then \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) because \(\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) is finer than \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\). Conversely, if \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\), we obtain by Theorem 2.3 (with \(p=0\)) that \(C_{n}\) converges to \(C\) in \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}\), where \(C_{n}\) and \(C\) are (arbitrary) copulas of \(\mu _{n}\) and \(\mu \), respectively. Again with Theorem 2.3, we conclude that \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\). □
A.4 Proof of Corollary 2.8
Since the involved topologies are both metrisable, it suffices to show that the map \(\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}:\mathcal{M}_{d}(\mu _{1}, \ldots ,\mu _{d})\to \mathbf{C}_{d}/_{\sim _{\mu _{1},\ldots ,\mu _{d}}}\) is sequentially continuous for the pair \((\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mu _{1},\ldots ,\mu _{d}}^{\sim})\) for any \(p'\in [0,p]\). Let \(\mu \) and \(\mu _{n}\), \(n\in \mathbb{N}\), be elements of \(\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\) such that \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p'}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) for some \(p'\in [0,p]\). By Corollary 2.6, we obtain that \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{p}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\), and by Theorem 2.3, it follows that \(\lim _{n\to \infty}d_{\mu _{1},\ldots ,\mu _{d}}(C_{n},C)=0\) for any copulas \(C_{n}\) and \(C\) of \(\mu _{n}\) and \(\mu \), respectively. So we arrive at
i.e., \(\mathfrak{C}_{\mu _{1},\ldots ,\mu _{d}}(\mu _{n})\to \mathfrak{C}_{ \mu _{1},\ldots ,\mu _{d}}(\mu )\) in \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}^{\sim}\). □
A.5 Proof of Example 3.2
We here show that if \(\mathcal{M}_{d}'=\mathcal{N}_{d}\), then \(\mathfrak{D}_{d}'=\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\times\mathcal{N}_{1}\times \cdots \times\mathcal{N}_{1}\).
“⊇” Let \((C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\times\mathcal{N}_{1} \times \cdots \times\mathcal{N}_{1}\), which means that we have \(\mu _{1}=\mathrm{N}_{m_{1},s_{1}^{2}}, \ldots , \mu _{d}=\mathrm{N}_{m_{d},s_{d}^{2}}\) for some \(m_{1},\ldots ,m_{d}\in \mathbb{R}\) and \(s_{1},\ldots ,s_{d}\in (0,\infty )\), and \(C\) is given by (3.2) for some correlation matrix \(R\). Recall that quantiles are translation-equivariant and positively homogeneous (on the level of univariate random variables). For the distribution function of the Borel probability measure \(\mu :=\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\), we thus obtain
where \(m:=(m_{1},\ldots ,m_{d})^{\top}\) and \(S\) is the \(d\times d\) diagonal matrix with entries \(s_{1},\ldots ,s_{d}\) on the diagonal. Note that the matrix \(SRS\) is again symmetric and positive semi-definite, implying that \(\mu \) is the \(d\)-variate normal distribution \(\mathbf{N}_{m,SRS}\). Since the entries on the diagonal of the matrix \(SRS\) are the strictly positive numbers \(s_{1}^{2},\ldots ,s_{d}^{2}\), the marginal distributions of \(\mu =\mathbf{N}_{m,SRS}\) are \(\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}}\), i.e., elements of \(\mathcal{N}_{1}\). In particular, \(\mu \) is a (possibly degenerate) \(d\)-variate normal distribution with continuous marginals, i.e., \(\mu =\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\) lies in \(\mathcal{M}_{d}'=\mathcal{N}_{d}\). Thus we obtain that \((C,\mu _{1}, \ldots ,\mu _{d})\in \mathfrak{D}_{d}'\).
“⊆” Let \((C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'\), i.e., \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\) and \(C\in \mathbf{C}_{d}\) are such that \(\mu :=\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}'={\mathcal{N}}_{d}\). Then \(\mu _{1}=\mathrm{N}_{m_{1},s_{1}^{2}}, \ldots , \mu _{d}=\mathrm{N}_{m_{d},s_{d}^{2}}\) for some \(m_{1},\ldots ,m_{d}\in \mathbb{R}\) and \(s_{1},\ldots ,s_{d}\in (0,\infty )\), and \(\mu =\mathbf{N}_{m,V}\) for the vector \(m:=(m_{1},\ldots ,m_{d})^{ \top}\) and a symmetric and positive semi-definite matrix \(V\) with entries \(s_{1}^{2},\ldots ,s_{d}^{2}\) on the diagonal. By (2.1) and the translation-equivariance and positive homogeneity of quantiles (on the level of univariate random variables), we have
where \(S^{-1}\) is the \(d\times d\) diagonal matrix with entries \(s_{1}^{-1},\ldots ,s_{d}^{-1}\) on the diagonal. The matrix \(R:=S^{-1}VS^{-1}\) is again symmetric and positive semi-definite, and for any \(i,j\in \{1,\ldots ,d\}\), its entry at \((i,j)\) is equal to \(v_{i,j}/(s_{i}s_{j})\), where \(v_{i,j}\) is the entry at \((i,j)\) of the matrix \(V\). Since \(v_{i,i}=s_{i}^{2}\) for any \(i=1,\ldots ,d\), it follows that \(R\) is a correlation matrix. Thus \(C\in \mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\) and therefore \((C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\times\mathcal{N}_{1} \times \cdots \times\mathcal{N}_{1}\). □
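The passage between a covariance matrix \(V=SRS\) and the correlation matrix \(R=S^{-1}VS^{-1}\) used in the two inclusions can be checked numerically. The following is a minimal sketch in Python with NumPy; the function names and the sample values of \(R\) and \(s_{1},\ldots ,s_{d}\) are illustrative assumptions and not part of the original argument.

```python
import numpy as np

def covariance_from_correlation(R, s):
    """Return V = S R S, where S = diag(s) holds the standard deviations."""
    S = np.diag(s)
    return S @ R @ S

def correlation_from_covariance(V):
    """Return R = S^{-1} V S^{-1}, where S = diag(sqrt(diag(V)))."""
    s = np.sqrt(np.diag(V))
    S_inv = np.diag(1.0 / s)
    return S_inv @ V @ S_inv

# Illustrative (assumed) inputs: a 3x3 correlation matrix and positive scales.
R = np.array([[1.0, 0.3, -0.2],
              [0.3, 1.0, 0.5],
              [-0.2, 0.5, 1.0]])
s = np.array([2.0, 0.5, 3.0])

V = covariance_from_correlation(R, s)   # diagonal entries are s_i^2
R_back = correlation_from_covariance(V) # recovers R, with unit diagonal
```

In this sketch the diagonal of `V` equals `s**2`, and rescaling by \(S^{-1}\) returns a matrix with unit diagonal, which mirrors the step showing that \(R\) is a correlation matrix.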
A.6 Proof of Example 3.3
We here show that if \(\mathcal{M}_{d}'=\mathcal{M}_{d}^{p}\) for some \(p\ge 0\), then \(\mathfrak{D}_{d}'=\mathbf{C}_{d}\times\mathcal{M}_{1}^{p}\times \cdots \times\mathcal{M}_{1}^{p}\).
“⊇” Let \((C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}\times\mathcal{M}_{1}^{p} \times \cdots \times\mathcal{M}_{1}^{p}\). Then, as shown in the first paragraph of Sect. A.2, \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}^{p}={\mathcal{M}}_{d}'\). Thus \((C,\mu _{1},\ldots ,\mu _{d})\) lies in \(\mathfrak{D}_{d}'\).
“⊆” Let \((C,\mu _{1},\ldots ,\mu _{d})\in \mathfrak{D}_{d}'\), i.e., \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}\) and \(C\in \mathbf{C}_{d}\) are such that \(\mathfrak{P}_{d}(C,\mu _{1},\ldots ,\mu _{d})\in{\mathcal{M}}_{d}'=\mathcal{M}_{d}^{p}\). Then, as shown in the first paragraph of Sect. A.2, \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\). Thus \((C,\mu _{1},\ldots ,\mu _{d})\in \mathbf{C}_{d}\times\mathcal{M}_{1}^{p} \times \cdots \times\mathcal{M}_{1}^{p}\). □
A.7 Proof of Remark 3.6
The univariate standard normal distribution function \(\varPhi _{0,1}\) is continuous and strictly increasing. This implies that pointwise convergence of a sequence of \(d\)-variate Gaussian copulas to a \(d\)-variate Gaussian copula
is the same as pointwise convergence of \(\boldsymbol{\varPhi}_{\boldsymbol{0},R_{n}}(\,\cdot \,,\ldots ,\,\cdot \,)\) to \(\boldsymbol{\varPhi}_{\boldsymbol{0},R}(\,\cdot \,,\ldots ,\,\cdot \,)\). The \(d\)-variate distribution function \(\boldsymbol{\varPhi}_{\boldsymbol{0},R}(\,\cdot \,,\ldots ,\,\cdot \,)\) is continuous since it can be represented as
and \(\boldsymbol{\varPhi}_{\boldsymbol{0},R}(\varPhi _{0,1}^{-1} (\, \cdot \,),\ldots ,\varPhi _{0,1}^{-1} (\, \cdot \,)) \) as a (Gaussian) copula is Lipschitz-continuous with respect to \(|\cdot |_{1}\) and the univariate standard normal distribution function \(\varPhi _{0,1} (\, \cdot \,)\) is continuous. Therefore (see Shiryaev [48, Sect. III.1]) the latter pointwise convergence is equivalent to weak convergence of the \(d\)-variate normal distribution \(\mathbf{N}_{\boldsymbol{0},R_{n}}\) to the \(d\)-variate normal distribution \(\mathbf{N}_{\boldsymbol{0},R}\), and by Lévy’s continuity theorem, this is the same as
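\[\exp \bigl(-\tfrac{1}{2}\,t^{\top}R_{n}t\bigr)\longrightarrow \exp \bigl(-\tfrac{1}{2}\,t^{\top}Rt\bigr)\]
for any \(t\in \mathbb{R}^{d}\). Obviously, the latter holds if and only if \(t^{\top}(R_{n}-R)t\to 0\) for any \(t\in \mathbb{R}^{d}\).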
If \(\|R_{n}-R\|_{\mathrm{Mat}}\to 0\) for some matrix norm \(\|\cdot \|_{\mathrm{Mat}}\), then \(\|R_{n}-R\|_{\mathrm{max}}\to 0\) for the maximum norm \(\|\cdot \|_{\mathrm{max}}\), and thus \(t^{\top}(R_{n}-R)t\to 0\) for any \(t\in \mathbb{R}^{d}\).
Conversely, assume that \(t^{\top}(R_{n}-R)t\to 0\) for any \(t\in \mathbb{R}^{d}\). Then for any \(i=1,\ldots ,d\), we can conclude by choosing \(t\) as the \(i\)th unit vector \(\mathrm{e}_{i}\) that the entry \((i,i)\) of the matrix \(R_{n}-R\) converges to 0. For \(t:=\mathrm{e}_{i}+\mathrm{e}_{j}\), the expression \(t^{\top}(R_{n}-R)t\) is twice the entry \((i,j)\) plus the entries \((i,i)\) and \((j,j)\) of the symmetric matrix \(R_{n}-R\). It follows that also the entry \((i,j)\) of the matrix \(R_{n}-R\) converges to 0 for any \(i,j=1,\ldots ,d\) with \(i\neq j\). Thus \(\|R_{n}-R\|_{\mathrm{max}}\to 0\), and consequently \(\|R_{n}-R\|_{\mathrm{Mat}}\to 0\) for any matrix norm \(\|\cdot \|_{\mathrm{Mat}}\). □
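The last step of the proof recovers every entry of a symmetric matrix from its quadratic form evaluated at the vectors \(\mathrm{e}_{i}\) and \(\mathrm{e}_{i}+\mathrm{e}_{j}\). The following minimal Python sketch illustrates this; the helper names and the sample matrix `D` (standing in for a difference \(R_{n}-R\) of two correlation matrices) are assumptions made for illustration only.

```python
import numpy as np

def quad(D, t):
    """Quadratic form t^T D t."""
    return float(t @ D @ t)

def entries_from_quadratic_forms(D):
    """Recover a symmetric matrix D from t^T D t at t = e_i and t = e_i + e_j."""
    d = D.shape[0]
    E = np.eye(d)
    out = np.empty((d, d))
    for i in range(d):
        out[i, i] = quad(D, E[i])  # t = e_i gives the entry (i, i)
    for i in range(d):
        for j in range(d):
            if i != j:
                # t = e_i + e_j gives 2*D_ij + D_ii + D_jj
                out[i, j] = 0.5 * (quad(D, E[i] + E[j]) - out[i, i] - out[j, j])
    return out

# Assumed example: symmetric with zero diagonal, like a difference R_n - R.
D = np.array([[0.0, 0.2, -0.1],
              [0.2, 0.0, 0.05],
              [-0.1, 0.05, 0.0]])
recovered = entries_from_quadratic_forms(D)
```

Since each entry is a fixed linear combination of finitely many quadratic forms, convergence of all quadratic forms to 0 forces convergence of all entries to 0.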
A.8 Proof of Example 3.7
For any correlation matrix \(R\), use \(C^{R}\) to denote the Gaussian copula associated with \(R\). By definition, the set \(\mathbf{C}_{d}^{{\mbox{\textup {{\scriptsize {Ga}}}}}}\) is parametrised by the set of all correlation matrices. Now fix \(\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}}\in{\mathcal{N}}_{1}\) and let \(R,R_{1},R_{2},\ldots \) be correlation matrices such that \(C^{R_{n}}\) converges to \(C^{R}\) pointwise, i.e., \(C^{R_{n}}(u_{1},\ldots ,u_{d})\to C^{R}(u_{1},\ldots ,u_{d})\) for any \(u_{1},\ldots ,u_{d}\in [0,1]\). In particular,
\[C^{R_{n}}\bigl(\varPhi _{m_{1},s_{1}^{2}}(x_{1}),\ldots ,\varPhi _{m_{d},s_{d}^{2}}(x_{d})\bigr)\longrightarrow C^{R}\bigl(\varPhi _{m_{1},s_{1}^{2}}(x_{1}),\ldots ,\varPhi _{m_{d},s_{d}^{2}}(x_{d})\bigr) \tag{A.2}\]
for all \((x_{1},\ldots ,x_{d})\in \mathbb{R}^{d}\). Now \(F_{n}(x_{1},\ldots ,x_{d}):=C^{R_{n}}(\varPhi _{m_{1},s_{1}^{2}}(x_{1}), \ldots ,\varPhi _{m_{d},s_{d}^{2}}(x_{d}))\) and \(F(x_{1},\ldots ,x_{d}):=C^{R}(\varPhi _{m_{1},s_{1}^{2}}(x_{1}),\ldots , \varPhi _{m_{d},s_{d}^{2}}(x_{d}))\) are the distribution functions of \(\mathfrak{P}_{d}(C^{R_{n}},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}})\) and \(\mathfrak{P}_{d}(C^{R},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}})\), respectively. Moreover, \(F\) is continuous since \(C^{R}\) as a copula is Lipschitz-continuous with respect to \(|\cdot |_{1}\) and the univariate normal distribution functions \(\varPhi _{m_{1},s_{1}^{2}} (\, \cdot \,),\ldots ,\varPhi _{m_{d},s_{d}^{2}} (\, \cdot \,)\) are continuous. Therefore (see Shiryaev [48, Sect. III.1]) the convergence in (A.2) is equivalent to
Moreover, as in (A.1) (with \(\mu _{n,i}=\mu _{i}=\mathrm{N}_{m_{i},s_{i}^{2}}\), \(n\in \mathbb{N}\), \(i=1,\ldots ,d\)), we obtain
Thus \(\mathfrak{P}_{d}(C^{R_{n}},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}}) \to \mathfrak{P}_{d}(C^{R},\mathrm{N}_{m_{1},s_{1}^{2}},\ldots ,\mathrm{N}_{m_{d},s_{d}^{2}})\) in \(\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d}\). In view of \(\mathcal{P}_{d}\circ \mathfrak{P}_{d}=\mathfrak{P}_{d}\) and since the involved topologies are metrisable, we arrive at copula robustness of \(\mathcal{P}_{d}:\mathcal{N}_{d}\to{\mathcal{N}}_{d}\), where the image space \(\mathcal{N}_{d}\) is equipped with \(\mathcal{O}_{d}^{p}\cap{\mathcal{N}}_{d}\). □
A.9 Proof of Theorem 3.10
Let us first assume that the map \(\mathcal{T}_{d}|_{\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})}:\mathcal{M}_{d}( \mu _{1},\ldots ,\mu _{d})\to \mathbf{E}\) is continuous for the pair \((\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),{\mathcal{O}}_{\mathbf{E}})\) for any \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\). Since the map
is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}( \mu _{1},\ldots ,\mu _{d}))\)-continuous for any \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\) by Corollary 2.5, it follows that the map
is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})\)-continuous for any \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\). Thus \(\mathcal{T}_{d}\) is copula robust.
Conversely, assume \(\mathcal{T}_{d}\) is copula robust, i.e., \(\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}): \mathbf{C}_{d}\to \mathbf{E}\) is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})\)-continuous for any \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\). By way of contradiction, assume that \(\mathcal{T}_{d}|_{\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})}:\mathcal{M}_{d}( \mu _{1},\ldots ,\mu _{d})\to \mathbf{E}\) is not continuous for the pair \((\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d}),\mathcal{O}_{ \mathbf{E}})\), for some \(\mu _{1},\ldots ,\mu _{d}\in{\mathcal{M}}_{1}^{p}\). Then one can find elements \(\mu _{n}\), \(n\in \mathbb{N}\), and \(\mu \) of \(\mathcal{M}_{d}(\mu _{1},\ldots ,\mu _{d})\) such that we have \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}( \mu _{1},\ldots ,\mu _{d})\), but \(\mathcal{T}_{d}(\mu _{n})\not \to{\mathcal{T}}_{d}(\mu )\) in \(\mathcal{O}_{\mathbf{E}}\). However, in view of Corollary 2.8, the convergence \(\mu _{n}\to \mu \) in \(\mathcal{O}_{d}^{0}\cap{\mathcal{M}}_{d}(\mu _{1},\ldots ,\mu _{d})\) implies that \(C_{n}\to C\) in \(\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}}\) for any copulas \(C_{n}\) and \(C\) of \(\mu _{n}\) and \(\mu \), respectively. Because the map \(\mathfrak{T}_{d}(\,\cdot \,,\mu _{1},\ldots ,\mu _{d}):\mathbf{C}_{d} \to \mathbf{E}\) is \((\mathcal{O}_{\mu _{1},\ldots ,\mu _{d}},\mathcal{O}_{\mathbf{E}})\)-continuous by assumption, we obtain that \(\mathcal{T}_{d}(\mu _{n})\to{\mathcal{T}}_{d}(\mu )\) in \(\mathcal{O}_{\mathbf{E}}\). This contradicts \(\mathcal{T}_{d}(\mu _{n})\not \to{\mathcal{T}}_{d}(\mu )\) in \(\mathcal{O}_{\mathbf{E}}\). □
A.10 Proof of Corollary 3.13
The map \(\mathfrak{h}:\mathcal{M}_{d}^{p}\to{\mathcal{M}}_{d'}^{p'}\) defined by \(\mathfrak{h}(\mu ):=\mu \circ h^{-1}\) is \((\mathcal{O}_{d}^{p},\mathcal{O}_{d'}^{p'})\)-continuous by Proposition 2.2 and therefore copula robust by Theorem 3.12. Because the map \(\mathcal{T}_{d'}:\mathcal{M}_{d'}^{p'}\to \mathbf{E}\) is \((\mathcal{O}_{d'}^{p'},\mathcal{O}_{\mathbf{E}})\)-continuous by assumption, it follows by Lemma 3.4 (with \(\mathcal{U}:=\mathcal{T}_{d'}\) and \(\mathcal{T}_{d}:=\mathfrak{h}\)) that \(\mathcal{T}_{d}'=\mathcal{T}_{d'}\circ \mathfrak{h}:\mathcal{M}_{d}^{p}\to \mathbf{E}\) is copula robust. □
A.11 Proof of Example 4.7
(i) The natural extension \(\overline{C}_{0}^{(\alpha )}\) of \(C_{0}^{(\alpha )}\) to \(\mathbb{R}^{2}\) is the distribution function of the Borel probability measure \(\mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})\) on \(\mathbb{R}^{2}\). Thus \(\mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})=\mathcal{H}_{S_{1}^{ \alpha}\uplus S_{2}^{\alpha}}^{1}/\sqrt{2}\), where \(\mathcal{H}_{S_{1}^{\alpha}\uplus S_{2}^{\alpha}}^{1}[\,\cdot \,]:={\mathcal{H}}^{1}[\,\cdot \,\cap (S_{1}^{\alpha}\uplus S_{2}^{\alpha})]\) is the 1-dimensional (Borel) Hausdorff measure \(\mathcal{H}^{1}\) on \(\mathbb{R}^{2}\) restricted to the union of the two disjoint line segments \(S_{1}^{\alpha}\) and \(S_{2}^{\alpha}\) with endpoints \((\alpha ,0),(0,\alpha )\) and \((1,\alpha ),(\alpha ,1)\), respectively. That is, the total mass 1 of \(\mathfrak{P}_{2}(C_{0}^{(\alpha )},\mu _{1},\mu _{2})\) is uniformly distributed over \(S_{1}^{\alpha}\uplus S_{2}^{\alpha}\). In particular, \(\mathcal{H}^{1}[S_{1}^{\alpha}]/\sqrt{2}=\alpha \) and \(\mathcal{H}^{1}[S_{2}^{\alpha}]/\sqrt{2}=1-\alpha \). In view of \(S_{1}^{\alpha}=\{(x_{1},x_{2})\in [0,1]^{2}:x_{1}+x_{2}=\alpha \}\) and \(S_{2}^{\alpha}=\{(x_{1},x_{2})\in [0,1]^{2}:x_{1}+x_{2}=1+\alpha \}\), we can conclude that
In particular, \(\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{0}^{(\alpha )},\mu _{1}, \mu _{2})=\alpha \).
The natural extension \(\overline{C}_{1}\) of \(C_{1}\) to \(\mathbb{R}^{2}\) is the distribution function of the Borel probability measure \(\mathfrak{P}_{2}(C_{1},\mu _{1},\mu _{2})\) on \(\mathbb{R}^{2}\). Thus \(\mathfrak{P}_{2}(C_{1},\mu _{1},\mu _{2})=\mathcal{L}^{(2)}_{[0,1]^{2}}\), where \(\mathcal{L}^{(2)}_{[0,1]^{2}}[\,\cdot \,]:=\mathcal{L}^{(2)}[\,\cdot \, \cap [0,1]^{2}]\) is the (Borel) Lebesgue measure \(\mathcal{L}^{(2)}\) on \(\mathbb{R}^{2}\) restricted to \([0,1]^{2}\). Then
where \(\mathcal{L}^{(1)}_{[0,1]}[\,\cdot \,]:=\mathcal{L}^{(1)}[\,\cdot \,\cap [0,1]]\) is the (Borel) Lebesgue measure \(\mathcal{L}^{(1)}\) on ℝ restricted to \([0,1]\) and \(\Delta _{[0,2]}\) is the symmetric triangular distribution. Therefore
for any \(t\in (0,1]\). For the distribution function \(F_{\Delta _{2}}\) of \(\Delta _{[0,2]}\), we have \(F_{\Delta _{2}}(x)=0\), \(=\frac{1}{2}x^{2}\), \(=1-\frac{1}{2}(2-x)^{2}\), \(=1\) according to whether \(x<0\), \(x\in [0,1]\), \(x\in (1,2]\), \(x>2\). For the distribution function \(F^{(\alpha )}\) of \(\alpha \delta _{\alpha}+(1-\alpha )\delta _{1+\alpha}\), we have \(F^{(\alpha )}(x)=0\), \(=\alpha \), \(=1\) according to whether \(x<\alpha \), \(x\in [\alpha ,1+\alpha )\), \(x\ge 1+\alpha \). Thus for any \(t\in (0,1]\), the distribution function \(F_{t}^{(\alpha )}\) of \(\mathfrak{P}_{2}(C_{t}^{(\alpha )},\mu _{1},\mu _{2})\circ A_{2}^{-1}\) is given by
so that \(\mathfrak{R}_{\mathrm{VaR}_{\alpha},A_{2}}(C_{t}^{(\alpha )},\mu _{1}, \mu _{2})=\sqrt{2\alpha}\).
(ii) For any \(t\in [0,1]\), let \((X_{1}^{(\alpha ),t},X_{2}^{(\alpha ),t})\) be a bivariate random variable with distribution \(\mathfrak{P}_{2}(C_{t}^{(\alpha )},\mu _{1},\mu _{2})\). The distribution of the random variable \(-(X_{1}^{(\alpha ),0}+X_{2}^{(\alpha ),0})\) is then \(\alpha \delta _{-\alpha}+(1-\alpha )\delta _{-(1+ \alpha )}\). Thus \(\mathrm{VaR}_{1-\alpha}(-(X_{1}^{(\alpha ),0}+X_{2}^{(\alpha ),0}))=-(1+ \alpha )\). For any \(t\in (0,1]\), the distribution function \(\hat{F}_{t}^{(\alpha )}\) of \(-(X_{1}^{(\alpha ),t}+X_{2}^{(\alpha ),t})\) satisfies \(\hat{F}_{t}^{( \alpha )}(x)=1-F_{t}^{(\alpha )}(-x)\) for all those \(x\in \mathbb{R}\) for which \(-x\) is a continuity point of \(F_{t}^{(\alpha )}\). Thus for any \(t\in (0,1]\), we get \(\mathrm{VaR}_{1-\alpha}(-(X_{1}^{(\alpha ),t}+X_{2}^{(\alpha ),t}))=- \sqrt{2\alpha}\). Hence,
where \(C_{t}^{(\alpha ),-}\) denotes the copula of the distribution of \((-X_{1}^{(\alpha ),t},-X_{2}^{(\alpha ),t})\); here we use that \(\hat{\mu}_{j}\) is the distribution of \(-X_{j}^{(\alpha ),t}\), \(j=1,2\). This gives the assertion since \(C_{t}^{(\alpha ),-}=\hat{C}_{t}^{(\alpha )}\) for any \(t\in [0,1]\). The latter equality holds true since the distribution function \(F^{(\alpha ),t,-}\) of \((-X_{1}^{(\alpha ),t},-X_{2}^{(\alpha ),t})\) satisfies
for any \(t\in [0,1]\), where \(\overline{F}^{(\alpha ),t}\) denotes the survival function of \((X_{1}^{(\alpha ),t},X_{2}^{(\alpha ),t})\). □
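The jump of the Value at Risk in part (i) can be reproduced numerically from the two distribution functions \(F_{\Delta _{2}}\) and \(F^{(\alpha )}\) stated above. The following Python sketch rests on two assumptions that are not taken from the text: the distribution function of the aggregate under \(C_{t}^{(\alpha )}\) is modelled as the convex combination \((1-t)F^{(\alpha )}+tF_{\Delta _{2}}\), and \(\mathrm{VaR}_{\alpha}\) is taken to be the lower \(\alpha \)-quantile. The value \(\alpha =0.3\) and all function names are illustrative.

```python
import math

def F_triangular(x):
    """Distribution function of the symmetric triangular distribution on [0, 2]."""
    if x < 0:
        return 0.0
    if x <= 1:
        return 0.5 * x * x
    if x <= 2:
        return 1.0 - 0.5 * (2.0 - x) ** 2
    return 1.0

def F_two_point(x, alpha):
    """Distribution function of alpha*delta_alpha + (1-alpha)*delta_{1+alpha}."""
    if x < alpha:
        return 0.0
    if x < 1.0 + alpha:
        return alpha
    return 1.0

def F_mixture(x, alpha, t):
    # ASSUMPTION: the aggregate distribution is the t-mixture of the two cases.
    return (1.0 - t) * F_two_point(x, alpha) + t * F_triangular(x)

def lower_quantile(F, level, lo=-1.0, hi=3.0, iters=200):
    """Approximate inf{x : F(x) >= level} by bisection for nondecreasing F."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(mid) >= level:
            hi = mid
        else:
            lo = mid
    return hi

alpha = 0.3  # assumed level, chosen in (0, 1/2]
var_at_0 = lower_quantile(lambda x: F_mixture(x, alpha, 0.0), alpha)
var_at_small_t = lower_quantile(lambda x: F_mixture(x, alpha, 0.01), alpha)
```

Under these assumptions the quantile equals \(\alpha \) at \(t=0\) and \(\sqrt{2\alpha}\) for the small positive \(t\), however close \(t\) is to 0, which is the discontinuity the example exhibits.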
A.12 Proof of (4.4)
Let \(p\in [1,\infty )\) and let \(\rho \) be a comonotonic convex risk measure. Moreover, let \(\mu \in{\mathcal{M}}_{d}^{p}\). Since the underlying probability space is assumed to be atomless, we can choose \((X_{1},\ldots ,X_{d})\in L^{p}\times \cdots \times L^{p}\) such that \((X_{1},\ldots ,X_{d})\) has distribution \(\mu \). A result of Filipović and Svindland [21, Corollary 2.7] ensures that there exists a comonotone optimal capital and risk allocation \((X_{1}^{*},\ldots ,X_{d}^{*})\) of \(X:=\sum _{i=1}^{d}X_{i}\) (\(\in L^{p}\)). Thus
where the fifth equality relies on the comonotonicity of \(\rho \). □
A.13 Proof of Theorem 5.2
We can use arguments similar to those used in the proof of Claus et al. [9, Corollary 2.4]. In view of Lemma 5.1 and the compactness of \(\varXi \), we can apply Bonnans and Shapiro [6, Proposition 4.4] to obtain that the map is continuous with respect to . Berge [4, Theorem VI.3.2] ensures that the infimum in (5.3) is attained for any \(\mu \in{\mathcal{M}}_{d}^{\gamma p}\). □
A.14 Proof of Corollary 5.4
Conditions (a)–(c) of Sect. 5.1 hold true for \(p=1\). Indeed, condition (a) holds true since monotonicity, distribution-invariance and convexity carry over from \(\sigma \) to \(\rho \). Condition (b) with \(p=1\) clearly holds true for the function defined by (5.5), and condition (c) holds true since \(h\) is continuous everywhere. Thus, since the set \(\varXi \) defined by (5.4) is a compact subset of , the assertions follow from Theorem 5.2. □
A.15 Proof of Lemma 6.1
Since we assumed that conditions (a) and (c) hold true, we have that
Thus \(r_{k}(X_{k},f_{k}(X_{k}))\), \(k=0,\ldots ,N-1\), are -integrable for any \(x_{0}\in E\), \(P\in{\mathcal{P}}'\) and \(\pi \in \varPi \). Using (b) and (c), we analogously obtain that \(r_{N}(X_{N})\) is -integrable for any \(x_{0}\in E\), \(P\in{\mathcal{P}}'\) and \(\pi \in \varPi \). For any \(n=1,\ldots ,N-1\), we get in the same way that
holds true for all \(\pi \in \varPi \) and \(x\in E\), where \(K_{n}:=\sum _{k=n}^{N-1}K_{1}K_{3}^{k-n}+K_{2}K_{3}^{N-n}\). Therefore, we indeed have that
for any \(n=0,\ldots ,N-1\) and \(P\in{\mathcal{P}}\). □
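The constants \(K_{n}=\sum _{k=n}^{N-1}K_{1}K_{3}^{k-n}+K_{2}K_{3}^{N-n}\) appearing in this proof satisfy the backward recursion \(K_{N}=K_{2}\), \(K_{n}=K_{1}+K_{3}K_{n+1}\), which matches the iterative way the bound is built. The Python sketch below compares the two expressions; the numerical values of \(K_{1},K_{2},K_{3}\) and \(N\) are arbitrary assumptions, and the index range is extended to \(n=0,\ldots ,N\) purely for the purpose of the check.

```python
def K_closed_form(n, N, K1, K2, K3):
    """K_n = sum_{k=n}^{N-1} K1*K3^(k-n) + K2*K3^(N-n)."""
    return sum(K1 * K3 ** (k - n) for k in range(n, N)) + K2 * K3 ** (N - n)

def K_backward(N, K1, K2, K3):
    """Backward recursion K_N = K2, K_n = K1 + K3*K_{n+1}."""
    K = [0.0] * (N + 1)
    K[N] = K2
    for n in range(N - 1, -1, -1):
        K[n] = K1 + K3 * K[n + 1]
    return K

# Assumed illustrative constants.
N, K1, K2, K3 = 5, 1.5, 2.0, 1.25
K_rec = K_backward(N, K1, K2, K3)
K_direct = [K_closed_form(n, N, K1, K2, K3) for n in range(N + 1)]
```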
A.16 Proof of Theorem 6.2
Let \(n\in \{0,\ldots ,N-1\}\) and \(x_{n}\in E\). Since , we have for any \(Q\in{\mathcal{P}}_{\psi}\) that
and thus
where we used the conventions
and other similar ones. Therefore,
For the latter multiple integral, we have
Treating the remaining multiple integral analogously and proceeding iteratively in this way, we may continue with
Altogether, we obtain the asserted inequality. □
A.17 Proof of Lemma 6.5
Let \(P\in{\mathcal{P}}_{\alpha}\), i.e., \(P=P^{\vec{\boldsymbol{\mu}}}\) for some . We have to verify that the defining conditions (a)–(c) of a bounding function are satisfied. Conditions (a) and (b) are trivially satisfied. Since \(\langle a,\boldsymbol{1}\rangle \in [0,x]\) for any \((x,a)\in D\) and \(\langle a,z\rangle \le \langle a,\langle z, \mathbf{1}\rangle \mathbf{1}\rangle =\langle a,\mathbf{1}\rangle \langle z,\mathbf{1}\rangle \) for any , we also have
for any \(n=0,\ldots ,N-1\) and \((x,a)\in D\), where
is independent of \(n=0,\ldots ,N-1\) and \((x,a)\in D\). This shows that condition (c) is also satisfied and that \(\mathcal{P}_{\psi _{\alpha}}=\mathcal{P}_{\alpha}\). □
A.18 Proof of Theorem 6.6
In view of Theorem C.3 below, we may and do replace without loss of generality the set of all strategies \(\varPi \) by the subset \(\varPi _{\mathrm{lin}}\) of all those \(\pi =(f_{n})_{n=0}^{N-1}\in \varPi \) for which for any \(n=0,\ldots ,N-1\), the decision rule admits the representation \(f_{n}(x)=\kappa _{n}x\) for some \(\kappa _{n}\in K\). For any \(\vec{\boldsymbol{\kappa}}=(\kappa _{n})_{n=0}^{N-1}\in K^{N}\), we set \(\pi _{\vec{\boldsymbol{\kappa}}}:=(f_{n}^{\kappa _{n}})_{n=0}^{N-1}\) with \(f_{n}^{\kappa _{n}}(x):=\kappa _{n} x\). Thus \(K^{N}\) can be seen as a parameter set for \(\varPi _{\mathrm{lin}}\).
We now show that the assumptions of Corollary 6.3 are met, so that this result ensures the assertion of Theorem 6.6. Let . By Lemma 6.5, we know that \(\psi _{\alpha}\) is a bounding function for \(Q\in{\mathcal{P}}_{\alpha}\), and obviously . So it remains to show that for \(n=0,\ldots ,N-1\). By Lemma C.4, we have for any \(n=0,\ldots ,N-1\) and \(\vec{\boldsymbol{\kappa}}\in K^{N}\) that \(V_{n}^{P^{ \vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}(\,\cdot \,)= \phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}\,u_{ \alpha}(\,\cdot \,)\), where \(\phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}:= \prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}\) is finite, and therefore since . Along with Lemma C.2, we conclude that the assumptions of Corollary 6.3 are indeed met. □
A.19 Proof of Corollary 6.7
Recall the definition with the Hölder-\(\alpha \) norm . It is known from Kern et al. [26] that (defined analogously to (6.2)) provides a metric on that metrises the \(\alpha \)-weak topology .
We now show that the mapping , \(\mu \mapsto P^{\overline{\mu}}\), is continuous for the pair . The assertion of Corollary 6.7 then directly follows from Theorem 6.6. For any , we have
where and \(h_{w,a}(z):=w(\langle a,z\rangle )\). For the map , we have
where is chosen such that \(|\cdot |_{\infty}\le c_{\infty}|\cdot |\) and we used that
Since \(\psi _{\alpha}(x)=1+x^{\alpha}\), the above calculation can therefore be continued with
Appendix B: A comment on the relation between \(d_{\mu _{1},\mu _{2}}\) and the metric introduced in [40]
Rachasingho and Tasena [40] recently defined a distance \(d\) on the set of bivariate subcopulas as follows. For two bivariate subcopulas \(C_{0}\) and \(C_{0}'\), they put
where the summand \(\mathfrak{h}_{d_{[0,1]^{2}}}([C_{0}],[C_{0}'])\) is the Hausdorff distance (with respect to \(d_{[0,1]^{2}}\)) between the sets of bivariate copulas \([C_{0}]\) and \([C_{0}']\) induced by \(C_{0}\) and \(C_{0}'\), respectively, and the summand \(\mathfrak{h}_{|\,\cdot \,|}(\mathrm{dom}(C_{0}),\mathrm{dom}(C^{\prime}_{0}))\) is the Hausdorff distance between the domains \(\mathrm{dom}(C_{0})\) and \(\mathrm{dom}(C_{0}')\) of \(C_{0}\) and \(C_{0}'\), respectively.
The distance \(d_{\mu _{1},\mu _{2}}\) defined by (2.2) basically differs from \(d\) for the following reasons. First of all, \(d\) is a metric on the set of bivariate subcopulas, whereas \(d_{\mu _{1},\mu _{2}}\) is a pseudo-metric on the set of bivariate copulas. Moreover, \(d_{\mu _{1},\mu _{2}}\) is designed to be a reasonable distance measure on the set of copulas associated with probability measures from the Fréchet class \(\mathcal{M}_{2}(\mu _{1},\mu _{2})\); it defines the distance between two such copulas \(C\) and \(C'\) by the maximal pointwise distance between the corresponding subcopulas with domain \(K:=\overline{\mathrm{ran}F_{\mu _{1}}}\times \overline{\mathrm{ran}F_{\mu _{2}}}\). On the other hand, \(d\) is designed to be a reasonable distance measure on the set of arbitrary subcopulas. Even when restricting \(d\) to the set of those subcopulas with domain \(K\) (and \(d_{\mu _{1},\mu _{2}}\) to the set of copulas associated with probability measures from \(\mathcal{M}_{2}(\mu _{1},\mu _{2})\)), the resulting distance measures are different. Indeed, if \(C_{0}\) and \(C_{0}'\) are two subcopulas with domain \(K\), then
but \(d(C_{0},C_{0}')\ge \sup _{u\in [0,1]^{2}}|C(u)-C'(u)|\) for any copulas \(C\) and \(C'\) induced by \(C_{0}\) and \(C_{0}'\), respectively. Note here that \(d_{\mu _{1},\mu _{2}}\) is defined in such a way that it does not take into account the behaviour of copulas outside \(K=\overline{\mathrm{ran}F_{\mu _{1}}}\times \overline{\mathrm{ran}F_{\mu _{2}}}\), which is motivated by the fact that the behaviour of a copula \(C\) outside \(K\) is irrelevant for a probability measure from the Fréchet class \(\mathcal{M}_{2}(\mu _{1},\mu _{2})\) with copula \(C\).
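The effect of restricting the comparison of two copulas to \(K=\overline{\mathrm{ran}F_{\mu _{1}}}\times \overline{\mathrm{ran}F_{\mu _{2}}}\) can be seen in a small example. The Python sketch below follows the verbal description above (maximal pointwise distance on \(K\)) rather than the exact formula (2.2); the choice of Bernoulli marginals with success probability 0.1, the two copulas, and the grid used to approximate the supremum over \([0,1]^{2}\) are all assumptions made for illustration.

```python
def independence_copula(u, v):
    """Pi(u, v) = u * v."""
    return u * v

def comonotonicity_copula(u, v):
    """M(u, v) = min(u, v)."""
    return min(u, v)

def max_distance_on(points, C, C_prime):
    """Maximal pointwise distance between two copulas over a finite point set."""
    return max(abs(C(u, v) - C_prime(u, v)) for (u, v) in points)

# ASSUMPTION: both marginals are Bernoulli with P(X = 1) = 0.1, so that each
# marginal distribution function has range {0, 0.9, 1}.
a, b = 0.9, 0.9
K = [(u, v) for u in (0.0, a, 1.0) for v in (0.0, b, 1.0)]

# Grid approximation of the supremum over the whole unit square.
grid = [(i / 100, j / 100) for i in range(101) for j in range(101)]

d_on_K = max_distance_on(K, comonotonicity_copula, independence_copula)
d_on_square = max_distance_on(grid, comonotonicity_copula, independence_copula)
```

Here the distance on \(K\) is \(0.9-0.81=0.09\), whereas the supremum over the unit square is \(1/4\), attained at \((1/2,1/2)\). The two copulas therefore look closer when only their behaviour on \(K\) is taken into account.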
Appendix C: Supplements to Sect. 6
Here we discuss the existence of optimal strategies in the Markov decision model considered in Sect. 6. In Sect. C.1, we first consider the general model introduced in Sect. 6.1. Thereafter, in Sect. C.2, we study in detail the special case of the multi-period portfolio optimisation problem considered in Sect. 6.2.
C.1 Existence of optimal strategies in the general model
For any \(n=0,\ldots ,N-1\) and \(P\in{\mathcal{P}}\), denote by the set of all for which \(\int _{E} |v(y)|\,P_{n}((x,f_{n}(x)),dy)<\infty \) for any \(x\in E\) and \(f_{n}\in F_{n}\). Recall that is the set of all -measurable maps . For any \(n=0,\ldots ,N-1\), \(f_{n}\in F_{n}\) and , we define maps and by
Note that \(T^{P}_{n,f_{n}}\) and \(T^{P}_{n}\) can be seen as maps from to and from to , respectively.
For any \(n=0,\ldots ,N-1\), \(P\in{\mathcal{P}}\) and , a decision rule \(f_{n}^{P}\in F_{n}\) is said to be a maximiser of \(v\) if \(T_{n,f_{n}^{P}}^{P}v(x) = T_{n}^{P}v(x)\) for all \(x\in E\). The following result is known from Bäuerle and Rieder [1, Theorem 2.3.8].
Theorem C.1
Let \(P\in{\mathcal{P}}\) and assume that for any \(n=0,\ldots ,N-1\), there exist sets and \(F^{P}_{n}\subseteq F_{n}\) such that the following three conditions hold:
(a) .
(b) for any and \(n=1,\ldots ,N-1\).
(c) For any \(n=0,\ldots ,N-1\) and , there exists an \(f_{n}^{P}\in F_{n}^{P}\) that is a maximiser of \(v\).
Then the following three assertions hold true:
(i) and , \(n=0,\ldots ,N-1\). Moreover, the Bellman iteration scheme holds true, i.e., \(V_{N}^{P}=r_{N}\) and \(V_{n}^{P}=T_{n}^{P}V_{n+1}^{P}\), \(n=0,\ldots ,N-1\).
(ii) \(V_{n}^{P}=T_{n}^{P}T_{n+1}^{P}\cdots T_{N-1}^{P}r_{N}\) for any \(n=0,\ldots ,N-1\).
(iii) For any \(n=0,\ldots ,N-1\), there exists an \(f_{n}^{P}\in F_{n}^{P}\) that is a maximiser of \(V_{n+1}^{P}\). Any such maximisers \(f_{0}^{P},\ldots ,f_{N-1}^{P}\) form a strategy \(\pi ^{P}:=(f_{n}^{P})_{n=0}^{N-1}\in \varPi \) that is optimal for the optimisation problem (6.1).
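The Bellman iteration scheme in (i) and the construction of an optimal strategy in (iii) can be illustrated on a toy model with finitely many states and actions. The Python sketch below assumes the standard form of the operator, \(T_{n}v(x)=\max _{a}\{r_{n}(x,a)+\sum _{y}P_{n}((x,a),\{y\})v(y)\}\), since the defining display is not reproduced here; the two-state model, the array layout and all names are hypothetical.

```python
import numpy as np

def backward_induction(P, r, r_N):
    """Finite-horizon Bellman iteration on finite state and action spaces.

    P[n][a] is the transition matrix at time n under action a,
    r[n][a] is the vector of one-stage rewards, r_N the terminal reward.
    Returns value functions V[0..N] and maximising decision rules f[0..N-1].
    """
    N = len(P)
    V = [None] * (N + 1)
    f = [None] * N
    V[N] = np.asarray(r_N, dtype=float)  # V_N = r_N
    for n in range(N - 1, -1, -1):
        # Q[a, x] = r_n(x, a) + sum_y P_n((x, a), y) * V_{n+1}(y)
        Q = np.array([r[n][a] + P[n][a] @ V[n + 1] for a in range(len(P[n]))])
        f[n] = Q.argmax(axis=0)  # a maximiser of V_{n+1}
        V[n] = Q.max(axis=0)     # V_n = T_n V_{n+1}
    return V, f

# Assumed toy model: two states, actions 0 = "stay" and 1 = "switch",
# no running reward, terminal reward 1 in state 1 only, horizon N = 2.
P_stay = np.eye(2)
P_switch = np.array([[0.0, 1.0], [1.0, 0.0]])
P = [[P_stay, P_switch], [P_stay, P_switch]]
r = [[np.zeros(2), np.zeros(2)], [np.zeros(2), np.zeros(2)]]
V, f = backward_induction(P, r, r_N=[0.0, 1.0])
```

In this toy model the last decision rule switches from state 0 and stays in state 1, so the terminal reward 1 is reached from both states.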
C.2 Existence of optimal trading strategies in the setting of Sect. 6.2
We now focus on the specific setting of Sect. 6.2, i.e., we discuss the existence of optimal trading strategies for the multi-period portfolio optimisation problem considered there. Let \(K:=\{\kappa \in [0,1]^{d}:\langle \kappa ,\boldsymbol{1}\rangle \le 1 \}\) and note that \(K\) is compact. For any , \(\kappa \in K\) and \(n=0,\ldots ,N-1\), set . Moreover, set \(\gamma _{n}^{\vec{\boldsymbol{\mu}}}:=\sup _{\kappa \in K}\gamma _{n}^{ \vec{\boldsymbol{\mu}};\kappa}\) for any \(n=0,\ldots ,N-1\).
Lemma C.2
For any and \(n=0,\ldots ,N-1\), there exists at least one solution \(\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K\) to the optimisation problem \(\max \{\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}:\kappa \in K\}\). In particular, the maximal value \(\gamma _{n}^{\vec{\boldsymbol{\mu}}}=\gamma _{n}^{ \vec{\boldsymbol{\mu}};\kappa _{n}^{\vec{\boldsymbol{\mu}}}}\) is finite.
Proof
Let and \(n\in \{0,\ldots ,N-1\}\). Define a map by \(g_{n}(z,\kappa ):=u_{\alpha}(Z_{n+1}^{0}+\langle \kappa ,z - Z_{n+1}^{0} \boldsymbol{1}\rangle )\). The map \(g_{n}(\,\cdot \,,\kappa )\) is Borel-measurable for any fixed \(\kappa \in K\), and we have
for any . Therefore, \(g_{n}\) is dominated by the Borel-measurable function defined by \(h_{n}(z):=u_{\alpha}(Z_{n+1}^{0}+\langle z,\boldsymbol{1}\rangle )\). This function is \(\mu _{n+1}\)-integrable since (take into account the definition of ), and \(g_{n}(z,\,\cdot \,)\) is continuous on \(K\) for any . So we can apply the continuity lemma (in the form of Bauer [2, Lemma 16.1]) to obtain that the map defined by is continuous. Since \(K\) is compact, we can infer that there exists a solution \(\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K\) to the optimisation problem \(\max \{\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}:\kappa \in K\}\). □
Part (ii) of the following result shows in particular that an optimal trading strategy can be found in the subset \(\varPi _{\mathrm{lin}}\) of all those \(\pi =(f_{n})_{n=0}^{N-1}\in \varPi \) for which for any \(n=0, \ldots ,N-1\), the decision rule admits the representation \(f_{n}(x)=\kappa _{n}x\) for some \(\kappa _{n}\in K\). For any \(n=0,\ldots ,N-1\), let \(\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K\) be any solution to the optimisation problem \(\max \{\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}:\kappa \in K\}\) (see Lemma C.2).
Theorem C.3
For any , the following two assertions hold true:
(i) For any \(n=0,\ldots ,N-1\), the time-\(n\) value function admits the representation \(V_{n}^{P^{\vec{\boldsymbol{\mu}}}}(\,\cdot \,)=\phi _{n}^{ \vec{\boldsymbol{\mu}}}\,u_{\alpha}(\,\cdot \,)\), where \(\phi _{n}^{\vec{\boldsymbol{\mu}}}:= \prod _{j=n}^{N-1}\gamma _{j}^{ \vec{\boldsymbol{\mu}}}\).
(ii) If for every \(n=0,\ldots ,N-1\), a decision rule at time \(n\) is defined by \(f_{n}^{\vec{\boldsymbol{\mu}}}(x):=\kappa _{n}^{ \vec{\boldsymbol{\mu}}}\,x\), then \(\pi ^{\vec{\boldsymbol{\mu}}}:=(f_{n}^{\vec{\boldsymbol{\mu}}})_{n=0}^{N-1}\) forms an optimal trading strategy for \(P^{\vec{\boldsymbol{\mu}}}\).
Proof
(i) We intend to apply Theorem C.1. Let and \(F_{n}^{P^{\vec{\boldsymbol{\mu}}}}:=F'\) for any \(n=0,\ldots ,N-1\), where and \(F':=\{f_{\kappa}:\kappa \in K\}\) with \(v_{\theta}(x):=\theta u_{\alpha}(x)\) and \(f_{\kappa}(x)=\kappa x\), . It can be inferred from Lemma 6.5 that for any \(n=0,\ldots ,N-1\), where is defined as in Sect. C.1. Moreover, we clearly have \(F_{n}^{P^{\vec{\boldsymbol{\mu}}}}:=F'\subseteq F =:F_{n}\) for any \(n=0,\ldots ,N-1\).
Below, we show that conditions (a)–(c) of Theorem C.1 are met. Thus the Bellman iteration scheme in part (i) of Theorem C.1 holds true, and so
for any . If we continue this way, we successively get \(V_{n}^{P^{\vec{\boldsymbol{\mu}}}}(\,\cdot \,)=\phi _{n}^{ \vec{\boldsymbol{\mu}}}\,u_{\alpha}(\,\cdot \,)\), \(n=N-2,\ldots ,0\).
It remains to verify that conditions (a)–(c) of Theorem C.1 are met. Condition (a) is trivially satisfied, and (b) can be shown by proceeding analogously to (C.1). Furthermore, similarly to (C.1), we obtain for any \(n=0,\ldots ,N-1\) and that \(\sup _{f_{n}\in F_{n}}T_{n,f_{n}}^{P^{\vec{\boldsymbol{\mu}}}}v_{ \theta}(\,\cdot \,)=\theta u_{\alpha}(\,\cdot \,)\,\sup _{\kappa \in K} \gamma _{n}^{\vec{\boldsymbol{\mu}},\kappa}\) and \(\theta u_{\alpha}(\,\cdot \,)\gamma _{n}^{\vec{\boldsymbol{\mu}}, \kappa}=T_{n,f_{\kappa}}^{P^{\vec{\boldsymbol{\mu}}}}v_{\theta}(\, \cdot \,)\) for all \(\kappa \in K\). By Lemma C.2, we know that there exists a maximum point \(\kappa _{n}^{\vec{\boldsymbol{\mu}}}\in K\) of the map \(\kappa \mapsto \gamma _{n}^{\vec{\boldsymbol{\mu}},\kappa}\), and therefore for any \(n=0,\ldots ,N-1\), the decision rule \(f_{\kappa _{n}^{\vec{\boldsymbol{\mu}}}}\) is a maximiser of \(v_{\theta}\). This shows that condition (c) is satisfied, too.
(ii) In the proof of (i), we have seen that the assumptions of Theorem C.1 are fulfilled. Thus Theorem C.1 (i) gives for any \(n=0,\ldots ,N-1\). In particular, the above elaborations under (c) show that for any \(n=0,\ldots ,N-1\), the decision rule \(f_{\kappa _{n}^{\vec{\boldsymbol{\mu}}}}\in F'=:F_{n}^{P^{ \vec{\boldsymbol{\mu}}}}\) provides a maximiser of \(V_{n+1}^{P^{\vec{\boldsymbol{\mu}}}}\). Hence Theorem C.1 (iii) ensures that the strategy \(\pi ^{\vec{\boldsymbol{\mu}}}:=(f_{\kappa _{n}^{ \vec{\boldsymbol{\mu}}}})_{n=0}^{N-1}\in \varPi _{\mathrm{lin}}\) forms an optimal trading strategy for \(P^{\vec{\boldsymbol{\mu}}}\). □
The following lemma specifies the value functions for fixed strategies. Recall that the values \(\gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}\), \(n=0,\ldots ,N-1\), \(\kappa \in K\), were defined before Lemma C.2, and note that for any \(\vec{\boldsymbol{\kappa}}=(\kappa _{n})_{n=0}^{N-1}\in K^{N}\), one defines a trading strategy \(\pi _{\vec{\boldsymbol{\kappa}}}:=(f_{n}^{ \vec{\boldsymbol{\kappa}}})_{n=0}^{N-1}\in \varPi _{\mathrm{lin}}\) by setting \(f_{n}^{\vec{\boldsymbol{\kappa}}}(x):=\kappa _{n}x\) for any and \(n=0,\ldots ,N-1\).
Lemma C.4
Let and \(\vec{\boldsymbol{\kappa}}=(\kappa _{n})_{n=0}^{N-1}\in K^{N}\). For \(n=0,\ldots ,N-1\), the time-\(n\) value function associated with strategy \(\pi _{\vec{\boldsymbol{\kappa}}}\) then admits the representation \(V_{n}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}( \,\cdot \,)=\phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{ \vec{\boldsymbol{\kappa}}}}\,u_{\alpha}(\,\cdot \,)\), where \(\phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}:= \prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}\).
Proof
For any \(n=0,\ldots ,N-1\) and , we have
If we continue successively in this way, we end up with
This proves the assertion of the lemma, since \(\prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}= \phi _{n}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}\). □
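The recursion behind Lemma C.4 can be sketched as a backward induction. The sketch rests on two assumptions that are not restated in this part of the appendix: the terminal condition \(V_{N}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}=u_{\alpha}\), and the usual policy-evaluation step \(V_{n}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}=T_{n,f_{n}^{\vec{\boldsymbol{\kappa}}}}^{P^{\vec{\boldsymbol{\mu}}}}V_{n+1}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}\). It also uses the identity \(T_{n,f_{\kappa}}^{P^{\vec{\boldsymbol{\mu}}}}v_{\theta}=\theta \gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa}u_{\alpha}\) from the preceding proof, applied with \(\theta =\phi _{n+1}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}\) (provided this value lies in the admissible range of \(\theta \)) and with the convention \(\phi _{N}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}:=1\) for the empty product. If the representation holds at time \(n+1\), then

\[
V_{n}^{P^{\vec{\boldsymbol{\mu}}};\pi _{\vec{\boldsymbol{\kappa}}}}(x)
= T_{n,f_{n}^{\vec{\boldsymbol{\kappa}}}}^{P^{\vec{\boldsymbol{\mu}}}}\bigl(\phi _{n+1}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}\,u_{\alpha}\bigr)(x)
= \gamma _{n}^{\vec{\boldsymbol{\mu}};\kappa _{n}}\,\phi _{n+1}^{\vec{\boldsymbol{\mu}};\pi _{\vec{\boldsymbol{\kappa}}}}\,u_{\alpha}(x)
= \Bigl(\prod _{j=n}^{N-1}\gamma _{j}^{\vec{\boldsymbol{\mu}};\kappa _{j}}\Bigr)u_{\alpha}(x),
\]

which is the representation at time \(n\).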
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zähle, H. A concept of copula robustness and its applications in quantitative risk management. Finance Stoch 26, 825–875 (2022). https://doi.org/10.1007/s00780-022-00485-8