Abstract
Distortion (Denneberg in ASTIN Bull 20(2):181–190, 1990) is a well-known premium calculation principle for insurance contracts. In this paper, we study sensitivity properties of distortion functionals w.r.t. the assumptions for risk aversion as well as robustness w.r.t. ambiguity of the loss distribution. Ambiguity is measured by the Wasserstein distance. We study variants of distances for probability models and identify some worst case distributions. In addition to the direct problem we also investigate the inverse problem, that is, how to identify the distortion density on the basis of observations of insurance premia.
1 Introduction
The function of the insurance business is to carry the customer's risk of a loss in exchange for a fixed amount, called the premium. The premium has to be larger than the expected loss, otherwise the insurance company faces ruin with probability one. The difference between the premium and the expectation is called the risk premium. There are several principles by which an insurance premium is calculated on the basis of the loss distribution.
Let X be a (non-negative) random loss variable. Traditionally, an insurance premium is a functional, \(\pi {:}\, \{ X\ge 0 \text { defined on } (\varOmega , \mathcal {F}, P) \} \rightarrow \mathbb {R}_{\ge 0}\). We will work with functionals that depend only on the distribution of the loss random variable (sometimes called law-invariance or version-independence property, Young 2014). If X has distribution function F we use the notation \(\pi (F)\) for the pertaining insurance premium, and \(\mathbb {E}(F)\) for the expectation of F. We use the notations \(\pi (F)\) and \(\pi (X)\), resp. \(\mathbb {E}(F)\) and \(\mathbb {E}(X)\), interchangeably, whichever is more convenient. Throughout the paper, more specific notation is used for particular cases of the premium.
We consider the following basic pricing principles:
- The distortion principle (Denneberg 1990).
- The certainty equivalence principle (Von Neumann and Morgenstern 1947).
- The ambiguity principle (Gilboa and Schmeidler 1989).
- Combinations of the previous (for instance Luan 2001).
1.1 The distortion principle
The distortion principle is related to the idea of stress testing. The original distribution function F is modified (distorted) and the premium is the expectation of the modified distribution. If \(g:\, [0,1] \rightarrow \mathbb {R}\) is a concave monotonically increasing function with the property \(g(0)=0\), \(g(1)=1\), then the distorted distribution \(F^{g}\) is given by
$$\begin{aligned} F^{g}(x) = 1 - g(1-F(x)). \end{aligned}$$
The function g is called the distortion function and
$$\begin{aligned} h(v) = g^\prime (1-v), \end{aligned}$$
with \(g^\prime \) being the derivative of g, is the distortion density (Footnote 1). Notice that h is a density in [0, 1]. We denote by \(H(u)=\int _0^u h(v) \, dv\) the distortion distribution. Since the assumptions imply that \(g(x) \ge x\) for \(0\le x \le 1\), \(F^g \le F\), i.e. \(F^g\) is first order stochastically larger than F (Footnote 2). The distortion premium is the expectation of \(F^{g}\),
$$\begin{aligned} \pi _h(F) = \int _0^\infty g(1-F(x)) \, dx. \end{aligned}$$
By a simple integral transform, one may easily see that the premium can equivalently be written as
$$\begin{aligned} \pi _h(F) = \int _0^1 {\text {V@R}}_v(F)\, h(v) \, dv, \end{aligned}$$
where \({\text {V@R}}_v(F) = F^{-1}(v)\), the quantile function. Note that a functional of this form is called an L-estimate (Huber 2011). If the random variable X also takes negative values, we could generally define the premium as a Choquet integral
$$\begin{aligned} \pi _h(F) = \int _{-\infty }^0 \left[ g(1-F(x)) - 1 \right] dx + \int _0^\infty g(1-F(x)) \, dx. \end{aligned}$$
In principle, any distortion function which is monotonic and satisfies \(g(u) \ge u\) is a valid basis for a distortion premium. However, the concavity of g guarantees that the pertaining distortion density h is increasing, which, in insurance applications, reflects the fact that putting aside risk capital gets more expensive for higher quantiles of the risk distribution. Nondecreasing distortion functions lead to non-negative distortion densities with the consequence that the premium is monotone w.r.t. the first stochastic order: if \(F_1^{-1}(v)\le F_2^{-1}(v)\) for all \(v\in (0,1)\), then \(\pi _h(F_1)\le \pi _h(F_2)\).
Relaxing the monotonicity assumption for g would in general violate the monotonicity w.r.t. the first stochastic order.
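The quantile representation of the premium can be made concrete on a finite sample. The following is a minimal sketch, assuming the loss distribution is the empirical distribution of n observations, so that the quantile function equals the i-th order statistic on ((i-1)/n, i/n] and the premium becomes a weighted sum of order statistics. The function names `distortion_premium`, `H_identity` and `H_avar` are illustrative and not taken from the paper.

```python
from typing import Callable, Sequence


def distortion_premium(sample: Sequence[float], H: Callable[[float], float]) -> float:
    """Empirical distortion premium sum_i x_[i] * (H(i/n) - H((i-1)/n)).

    H is the distortion distribution, i.e. the integral of the density h.
    """
    xs = sorted(sample)
    n = len(xs)
    return sum(x * (H(i / n) - H((i - 1) / n)) for i, x in enumerate(xs, start=1))


def H_identity(u: float) -> float:
    """Distortion distribution for h = 1 (no distortion)."""
    return u


def H_avar(alpha: float) -> Callable[[float], float]:
    """Distortion distribution for the density h = 1/(1-alpha) on [alpha, 1]."""
    return lambda u: max(u - alpha, 0.0) / (1.0 - alpha)


losses = [1.0, 2.0, 3.0, 4.0]
expected_loss = distortion_premium(losses, H_identity)   # plain sample mean
avar_half = distortion_premium(losses, H_avar(0.5))      # mean of the upper half
print(expected_loss, avar_half)
```

With an increasing density h the larger order statistics receive larger weights, so the premium is never below the sample mean.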
1.2 Examples of distortion functions
Widely used distortion functions g resp. the pertaining distortion densities h are
- the power distortion with exponent s. If \(0<s< 1\),
$$\begin{aligned} g^{(s)}(v)=v^{s},\quad h^{(s)}(v)=s(1-v)^{s-1}. \end{aligned}$$(3)
The premium is known as the proportional hazard transform (Wang 1995) and is calculated as
$$\begin{aligned} \pi _{h^{(s)}}(F) = \int _0^\infty (1- F(x))^s \, dx = s\int _0^1 F^{-1}(v)(1-v)^{s-1} \, dv. \end{aligned}$$(4)
If \(s\ge 1\), then we take
$$\begin{aligned} g^{(s)}(v)= 1- (1-v)^s, \quad h^{(s)}(v) = s v^{s-1}. \end{aligned}$$(5)
The premium is
$$\begin{aligned} \pi _{h^{(s)}}(F) = \int _0^\infty 1- F(x)^s \, dx = s\int _0^1 F^{-1}(v)v^{s-1} \, dv. \end{aligned}$$(6)
If we consider an integer exponent, the premium has a special representation.
Proposition 1
Let \(X^{(i)}\), \(i=1, \ldots , s\), be independent copies of the random variable X. Then the power distortion premium with integer power s has the representation
$$\begin{aligned} \pi _{h^{(s)}}(X) = \mathbb {E}\left( \max \lbrace X^{(1)}, \ldots , X^{(s)}\rbrace \right) . \end{aligned}$$
Proof
Let F be the distribution of X. The power distortion premium for integer power s is computed with \(g^{(s)}\) in (5) and by definition
$$\begin{aligned} \pi _{h^{(s)}}(F) = \int _0^\infty 1- F(x)^s \, dx. \end{aligned}$$
The assertion follows from the fact that the distribution function of the random variable \(\max \lbrace X^{(1)}, \ldots , X^{(s)}\rbrace \) is \(F(x)^s\). \(\square \)
Finally, notice that the distortion density is bounded for \(s\ge 1\), but unbounded for \(0<s<1\).
- the Wang distortion or Wang transform (Wang 2000)
$$\begin{aligned} g(v)=\varPhi \left( \varPhi ^{-1}(v)+\lambda \right) ,\qquad h(v)=\frac{\phi (\varPhi ^{-1}(1-v)+\lambda )}{\phi \left( \varPhi ^{-1}(1-v)\right) },\quad \lambda >0, \end{aligned}$$
where \(\varPhi \) is the standard normal distribution function and \(\phi \) its density.
- the \({\text {AV@R}}\) (average value-at-risk) distortion function and density are
$$\begin{aligned} g_\alpha (v)=\min \left\{ \frac{v}{1-\alpha } ,1\right\} ,\qquad h_\alpha (v)=\frac{1}{1-\alpha }\,\mathbb {1}_{v\ge \alpha }, \end{aligned}$$(7)
where \(0\le \alpha <1\). The pertaining premium has different names, such as conditional tail expectation (CTE), CV@R (conditional value at risk) or ES (expected shortfall) (Embrechts et al. 1997). The premium is
$$\begin{aligned} \pi _{h_\alpha }(F)=\int _{0}^{\infty }\min \left\{ \frac{1-F(x)}{1-\alpha },1\right\} \,dx=\frac{1}{1-\alpha }\int _{\alpha }^{1}F^{-1}(v)\,dv. \end{aligned}$$(8)
- piecewise constant distortion densities. The insurance industry also uses piecewise constant increasing distortion densities. For example, the following distortion density is used by a large reinsurer.
v | \(h\,(v)\) |
---|---|
[0,0.85) | 0.8443 |
[0.85,0.947) | 1.1731 |
[0.947,0.965) | 1.4121 |
[0.965,0.975) | 1.7335 |
[0.975,0.988) | 2.4806 |
[0.988,0.992) | 3.6462 |
[0.992,0.993) | 4.0572 |
[0.993,0.996) | 6.5378 |
[0.996,0.997) | 12.7020 |
[0.997,1] | 14.9436 |
For more examples on different choices of h and also for different families of distributions, see Wang (1996) and Furman and Zitikis (2008).
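Proposition 1 can be checked exactly on a small example. The sketch below assumes that X follows the empirical distribution of a short list of distinct losses, so that the expected maximum of s independent copies can be computed by enumerating all s-tuples. It compares this brute-force value with the premium obtained from the density \(h^{(s)}(v)=sv^{s-1}\) in (5), whose distortion distribution is \(H(v)=v^s\). The helper names are hypothetical.

```python
from itertools import product


def power_premium(sample, s):
    """Premium with density h(v) = s * v**(s-1), i.e. H(v) = v**s, on an empirical law."""
    xs = sorted(sample)
    n = len(xs)
    return sum(x * ((i / n) ** s - ((i - 1) / n) ** s) for i, x in enumerate(xs, start=1))


def expected_max(sample, s):
    """Exact E[max of s i.i.d. draws] from the empirical law, by full enumeration."""
    draws = list(product(sample, repeat=s))
    return sum(max(d) for d in draws) / len(draws)


losses = [0.5, 1.0, 2.0, 7.0]
premium = power_premium(losses, 3)
brute_force = expected_max(losses, 3)
print(premium, brute_force)
```

The enumeration grows like n to the power s, so it only serves as a check for small inputs.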
1.3 Certainty equivalence principle
Let V be a convex, strictly monotonic disutility function (Footnote 3). The certainty equivalence premium is the solution of
$$\begin{aligned} V\left( \pi ^V(F)\right) = \mathbb {E}\left( V(X)\right) , \end{aligned}$$
i.e. it is obtained by equating the disutility of the premium and the expected disutility of the loss. The premium is written as follows
$$\begin{aligned} \pi ^V(F) = V^{-1}\left( \mathbb {E}\left( V(X)\right) \right) . \end{aligned}$$
By Jensen’s inequality \(\pi ^V(F) \ge \mathbb {E}(F)\). Examples for disutilities V are the power disutility \(V(x)=x^{s}\) for \(s \ge 1\) or the exponential disutility \(V(x)=\exp (x)\).
Related to this premium, one could consider just the expected value and compute the expected disutility (Borch 1961) obtaining
For generalizations of the CEQ premium see Vinel and Krokhmal (2017).
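As an illustration of the certainty equivalence principle, the following sketch evaluates the premium as the inverse disutility of the average disutility on a small sample, for the power disutility with s = 2 and for the exponential disutility. It assumes the empirical distribution of the sample as loss distribution; `ceq_premium` is an illustrative name.

```python
import math


def ceq_premium(sample, V, V_inv):
    """Certainty equivalence premium V^{-1}(E[V(X)]) under the empirical distribution."""
    return V_inv(sum(V(x) for x in sample) / len(sample))


losses = [1.0, 2.0, 3.0, 6.0]
mean_loss = sum(losses) / len(losses)

power_ceq = ceq_premium(losses, lambda x: x ** 2, math.sqrt)  # V(x) = x^2
exp_ceq = ceq_premium(losses, math.exp, math.log)             # V(x) = exp(x)
print(mean_loss, power_ceq, exp_ceq)
```

Both premia exceed the sample mean, in line with Jensen's inequality for convex V.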
1.4 The ambiguity principle
Let \(\mathfrak {F}\) be a family of distributions, which contains the “most probable” loss distribution F. The ambiguity insurance premium is
\(\mathfrak {F}\) is called the ambiguity set. In an alternative, but equivalent notation, the ambiguity premium is given by
where \(\mathcal {Q}\) is a family of probability models containing the baseline model P. The functional inside the maximization need not be the expectation, but can be more general, see e.g. Wozabal (2012), Wozabal (2014), Gilboa and Schmeidler (1989) and our Sect. 6.
Remark 1
In their seminal paper from 1989, Gilboa and Schmeidler (1989) give an axiomatic approach to extended utility functionals of the form
where U is a utility function and Y is a profit variable. For the insurance case, U should be replaced by a disutility function V and Y should be replaced by a loss variable X leading to an equivalent expression
The link to (10) is obvious and it can be seen as a combination of expected disutility (9) and ambiguity.
Remark 2
Recall that the fundamental pricing formula for derivatives in financial markets states that the price can be obtained by taking the maximum of the discounted expected payoffs, where the maximum is taken over all probability measures which make the discounted price of the underlying a martingale. This can be seen as an ambiguity price.
The ambiguity premium is characterized by the choice of the ambiguity set \(\mathfrak {F}\). In principle, this set can be chosen arbitrarily as long as it contains F. Convex premium functionals have a dual representation, which is also of the form of an ambiguity functional. For distortion functionals, this will be illustrated in the next section. Other important examples for ambiguity premium prices can be defined through distances for probability distributions. Let D be such a distance, then an ambiguity set is given by
with ambiguity premium
We call \(\epsilon \) the ambiguity radius. This radius quantifies not only the risk premium, but also the model uncertainty, since the real distribution is typically not exactly known and all we have is a baseline model F. In our Sect. 6 we base ambiguity models on the Wasserstein distance WD.
1.5 Combined models
Luan (2001) introduced a combination of distortion and certainty equivalence premium prices by defining a variable W distributed according to \(F^g\) and setting
Notice that \((F^g)^{-1} (v) = F^{-1}(1-g^{-1}(1-v))\).
More generally, one may also add ambiguity with respect to the model and set
Notice that (11) contains all previous definitions by making some of the following parameter settings
If all parameters are set like that, we recover the expectation.
We could also consider the expected disutility premium (9) and combine it with the distortion premium,
Section 6 will be dedicated to study the combination of distortion and ambiguity premium prices.
As to notation, we denote by \(\mathcal {L}^p\), \(p\ge 1\), the space of all random variables with finite p-norm
$$\begin{aligned} \Vert X \Vert _p = \left( \mathbb {E}\, |X|^p \right) ^{1/p}, \end{aligned}$$
resp. \(\Vert X \Vert _\infty = \hbox {ess sup } (|X|)\), the essential supremum. The same notation is used for any real valued function on [0, 1], and p and q are conjugates if \(1/p + 1/q =1\).
2 The distortion premium and generalizations
The characterization and representations of the distortion premium have been studied extensively. Among the most classic contributions we mention the dual theory of Yaari (1987) and the characterization by axioms of this premium developed in Wang et al. (1997), where the power distortion for \(0<s<1\) is also characterized in a unique manner. A summary of other known representations and new generalizations of this premium will be presented below. Recall that any mapping \(X \mapsto \pi (X)\) which is monotone, convex and fulfils translation equivariance (Footnote 4) is a risk measure. Furthermore, if \(\pi \) is also positively homogeneous, monotonic w.r.t. the first stochastic order and subadditive (Footnote 5), then it is a coherent risk measure (Artzner et al. 1999). The distortion premium fulfils all these properties, therefore by the Fenchel–Moreau–Rockafellar theorem, it has a dual representation.
Theorem 1
(see Pflug 2006) The dual representation of the distortion premium with distortion density h is given by
Note that all admissible Z’s in Theorem 1 are densities on [0, 1], since \(h\ge 0\) and \(\mathbb {E}(h(U))=1\). To put it differently, let X be defined on \((\varOmega , \mathcal {F}, P)\) and let \(\mathcal {Q}\) be the set of all probability measures on \((\varOmega , \mathcal {F})\) such that the density \(\frac{dQ}{dP}\) has distribution function H, the distortion distribution; then
Therefore, every distortion premium can be seen as well as an ambiguity premium with \(\mathcal {Q}\) as the ambiguity set.
Let us look in more detail at the special case of the \({\text {AV@R}}\) premium. In this case, the dual representation specializes to
From the previous representation, we can see that the \({\text {AV@R}}\)-distortion densities \(h_{\alpha }\) are the extreme points of the convex set of all distortion densities. This fact implies that any distortion premium can be represented as a mixture of \({\text {AV@R}}\)’s; such representations are called Kusuoka representations (Kusuoka 2001; Jouini et al. 2006). Coherent risk measures have a Kusuoka representation of the form
where \(\mathcal {K}\) is a collection of probability measures on [0, 1]. In particular, for the distortion premium we have the following result (Pflug and Römisch 2007).
Theorem 2
Any distortion premium can be written as
The mixture distribution K is determined by the way in which h is represented as a mixture of the \({\text {AV@R}}\)-distortion densities, i.e.
The pure \({\text {AV@R}}_\beta \) is contained in this class by setting \(K(\alpha ) = \delta _\beta \), the Dirac measure at \(\beta \). Moreover, the integral of the \({\text {AV@R}}\)’s is obtained for \(K(\alpha ) = \alpha \) and is defined as
if the integral exists.
Remark 3
Some other generalizations of the distortion premium were studied in Greselin and Zitikis (2018), where they consider a class of functionals
with \(\nu (\cdot ,\cdot )\) an integrable function and show the Gini-index and Bonferroni-index belong to this class. These generalizations lead to inequality measures instead of risk measures.
As a related generalization of the distortion premium one may consider
for some convex and monotonic Lipschitz function \(\nu \) and some non-negative function k on [0, 1]. Clearly, R(X) is convex and monotonic, but in general is neither positively homogeneous nor translation equivariant unless \(\nu \) is the identity (see “Appendix” section for a proof). To our knowledge, functionals of the form (12) are not used in the insurance sector. For this and some other generalizations see the papers of Goovaerts et al. (2004) and Furman and Zitikis (2008).
3 Continuity of the premium w.r.t. the Wasserstein distance
In this section we study sensitivity properties of the distortion premium with respect to the underlying distribution. Some results in this section are related to those in Pichler (2013), Pflug and Pichler (2014) and Kiesel et al. (2016). Similar continuity results for variability measures are studied in Furman et al. (2017). To start, we recall the notion of the Wasserstein distance.
Definition 1
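Let \((\varOmega ,d)\) be a metric space and P, \(\tilde{P}\) be two Borel probability measures on it. Then the Wasserstein distance of order \(r\ge 1\) is defined as
$$\begin{aligned} WD_{r,d}(P,\tilde{P}) = \left( \inf _{X\sim P,\, Y\sim \tilde{P}} \mathbb {E}\left[ d(X,Y)^r\right] \right) ^{1/r}. \end{aligned}$$
Here the infimum is over all joint distributions of the pair (X, Y), such that the marginal distributions are P resp. \(\tilde{P}\), i.e. \(X\sim P\), \(Y \sim \tilde{P}\).
For two distributions F and G on the real line endowed with metric
$$\begin{aligned} d_1(x,y) = |x-y|, \end{aligned}$$
this definition specializes to (see Vallender 1974)
$$\begin{aligned} WD_{1,d_1}(F,G) = \int _{-\infty }^{\infty } |F(x)-G(x)| \, dx = \int _0^1 |F^{-1}(v)-G^{-1}(v)| \, dv. \end{aligned}$$
Therefore, the Wasserstein distance is the (absolute) area between the distribution functions which is also the (absolute) area between the inverse distributions. By a similar argument one may prove that the Wasserstein distance of order \(r\ge 1\) with the \(d_1\) metric on the real line is
$$\begin{aligned} WD_{r,d_1}(F,G) = \left( \int _0^1 |F^{-1}(v)-G^{-1}(v)|^r \, dv \right) ^{1/r}. \end{aligned}$$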
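For two empirical distributions with the same number of points, both quantile functions are step functions on the grid i/n, so the quantile formula reduces to an average over paired order statistics. The sketch below implements this special case only; the equal-size restriction and the name `wasserstein_empirical` are assumptions made for the illustration.

```python
def wasserstein_empirical(xs, ys, r=1.0):
    """Wasserstein distance of order r (metric |x - y|) between two
    empirical distributions with the same number of points."""
    if len(xs) != len(ys):
        raise ValueError("this sketch only covers samples of equal size")
    a, b = sorted(xs), sorted(ys)
    # Average the r-th powers of the gaps between matching order statistics.
    return (sum(abs(x - y) ** r for x, y in zip(a, b)) / len(a)) ** (1.0 / r)


f_sample = [1.0, 2.0, 3.0, 4.0]
g_sample = [2.0, 2.0, 5.0, 4.0]
wd1 = wasserstein_empirical(f_sample, g_sample, 1)
wd2 = wasserstein_empirical(f_sample, g_sample, 2)
print(wd1, wd2)
```

Sorting both samples realizes the monotone coupling, which is why no optimization over joint distributions is needed on the real line.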
We now study continuity properties of the functional \(F \mapsto \pi _h(F)\).
Proposition 2
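(Continuity for bounded distortion densities) Let F and G be two distributions on the real line and h a distortion density function. If the distributions have both finite first moments and h is bounded, then
$$\begin{aligned} | \pi _h(F) - \pi _h(G)| \le \Vert h\Vert _\infty \cdot WD_{1,d_1} (F,G). \end{aligned}$$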
Proof
See Pichler (2010). \(\square \)
Remark 4
The boundedness of h is ensured if g has a finite right hand side derivative at 0, and also if g has finite Lipschitz constant L, since \(\Vert h\Vert _\infty \le L\).
Proposition 2 can be easily generalized as follows.
Proposition 3
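(Continuity for distortion densities in \(\mathcal {L}^q\) for \(q<\infty \)) Let F and G be two distributions on the real line and h a distortion density function. If F, G have finite p-moments and \(h\in \mathcal {L}^q\), then
$$\begin{aligned} | \pi _h(F) - \pi _h(G)| \le \Vert h\Vert _q \cdot WD_{p,d_1} (F,G), \end{aligned}$$
where p and q are conjugates.
Proof
By Hölder’s inequality for p and q we obtain
$$\begin{aligned} | \pi _h(F) - \pi _h(G)| = \left| \int _0^1 \left( F^{-1}(v)-G^{-1}(v)\right) h(v) \, dv \right| \le \Vert h\Vert _q \left( \int _0^1 |F^{-1}(v)-G^{-1}(v)|^p \, dv \right) ^{1/p} = \Vert h\Vert _q \cdot WD_{p,d_1} (F,G). \end{aligned}$$
\(\square \)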
Example 1
Let F and G be two distributions with finite first moments.
- For the \({\text {AV@R}}\) distortion premium \( ||h_\alpha ||_\infty = \frac{1}{1-\alpha }\), and therefore
$$\begin{aligned} | \pi _{h_\alpha }(F) - \pi _{h_\alpha }(G)|\le \frac{1}{1-\alpha } \cdot WD_{1,d_1} (F,G) . \end{aligned}$$
- For the power distortion with \(s\ge 1\), \( ||h^{(s)}||_\infty = s\), and therefore
$$\begin{aligned} | \pi _{h^{(s)}}(F) - \pi _{h^{(s)}}(G)|\le s\cdot WD_{1,d_1} (F,G) . \end{aligned}$$
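The first bound of Example 1 can be checked numerically. The following sketch assumes two empirical distributions of equal size and the metric |x - y|, computes the AV@R premium of each as a weighted sum of order statistics, and compares the premium difference with the Wasserstein distance of order 1 divided by 1 - α. The helper names `avar` and `wd1` are illustrative.

```python
def avar(sample, alpha):
    """Empirical AV@R premium: weights H(i/n) - H((i-1)/n) on the order statistics."""
    xs = sorted(sample)
    n = len(xs)
    H = lambda u: max(u - alpha, 0.0) / (1.0 - alpha)
    return sum(x * (H(i / n) - H((i - 1) / n)) for i, x in enumerate(xs, start=1))


def wd1(xs, ys):
    """Wasserstein distance of order 1 between equal-size empirical distributions."""
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


f_sample = [1.0, 2.0, 3.0, 10.0]
g_sample = [1.5, 2.0, 4.0, 12.0]
alpha = 0.75

gap = abs(avar(f_sample, alpha) - avar(g_sample, alpha))
bound = wd1(f_sample, g_sample) / (1.0 - alpha)
print(gap, bound)
```

The bound is not tight here because part of the distance between the two samples sits below the α-quantile, where the AV@R density vanishes.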
The power distortion density with \(0<s<1\) is not bounded. The next result is dedicated to this particular case.
Proposition 4
(Continuity for the power distortion with \(0<s<1\)) Let F and G be distribution functions and \(h^{(s)}\) the distortion density defined in (3). If F and G have finite p-moments for \(p>\frac{1}{s}\) and \(h^{(s)}\in \mathcal {L}^q\), then
where p and q are conjugates.
Proof
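We first note that \(p>\frac{1}{s}\) implies \(q< \frac{1}{1-s} \) and let \(t=1+q\, (s-1)>0\). Then
$$\begin{aligned} \Vert h^{(s)}\Vert _q^q = s^q \int _0^1 (1-v)^{q(s-1)} \, dv = \frac{s^q}{t} < \infty , \end{aligned}$$
so \(h^{(s)}\in \mathcal {L}^q\) and Proposition 3 proves the statement. \(\square \)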
The next result is a direct consequence of Proposition 4.
Corollary 1
(Continuity for distortion densities dominated by power distortion densities with \(0<s<1\)) Let F and G be distribution functions and h a distortion density. If h is such that \(h(v)\le c\cdot h^{(s)}(v)\), for all \(v\in [0,1]\), \(c>0\) and \(0<s<1\), F and G have finite p-moments for \(p>\frac{1}{s}\) , then \(h\in \mathcal {L}^q\) and
where p and q are conjugates.
Corollary 2
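(Convergence) If \(F, F_n\) for all \(n\ge 1\) have finite uniformly bounded p-moments, \(h\in \mathcal {L}^q\) and \(WD_{p,d_1} (F_n,F) \rightarrow 0\) as \(n\rightarrow \infty \), then
$$\begin{aligned} \left| \pi _h(F_n) - \pi _h(F) \right| \xrightarrow [n \rightarrow \infty ]{} 0, \end{aligned}$$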
where p and q are conjugates.
Remark 5
Corollary 2 holds when the sequence of distributions consists of the empirical distributions \(\widehat{F}_n\) defined on an i.i.d. sample of size n, \((x_1, \ldots , x_n)\) from \(X\sim F\). If F has finite p-moments, then \(WD_{p,d_1} (\widehat{F}_n,F) \xrightarrow [n \rightarrow \infty ]{} 0\), hence \( \left| \pi _h(\widehat{F}_n) - \pi _h(F) \right| \xrightarrow [n \rightarrow \infty ]{} 0\). This result follows by applying Lemma 4.1 in Pflug and Pichler (2014).
Finally notice that, for continuity, the order of the Wasserstein distance r coincides with the number of finite moments of F.
3.1 Partial coverage
Many insurance contracts do not guarantee complete indemnity, but their payoff is just a part of the full damage. Such contracts include proportional insurance, deductibles and capped insurance. In general, there is a (monotonic) payoff function T such that the payoff is T(X), if the total loss is X. A quite flexible form is for instance the excess-of-loss insurance (XL-insurance), which has a payoff function
Denote by \(F^T\) the distribution of T(X), if F is the distribution of X. The distortion premium for partial coverage is \(\pi _h(F^T)\). We study the relationship between \(F^T\) and \(G^T\) as well as between \(\pi _h(F^T)\) and \(\pi _h(G^T)\) in a slightly more general setup, namely for Hölder continuous T. Recall that T is Hölder continuous with constant \(H_\beta \), if \(|T(x)-T(y)|\le H_\beta \cdot |x-y|^\beta \), for some \(\beta \le 1\).
Theorem 3
(Distance between the original and image probabilities by T) Let P and Q be two probability measures and consider their image probabilities under T denoted by \(P^T\) and \(Q^T\), respectively. If T is a \(\beta \)-Hölder continuous mapping, then
for \(r_\beta =\frac{r}{\beta }\ge 1\) and \(r\ge 1\), where \(H_\beta \) is the \(\beta \)-Hölder constant.
Proof
Let the joint distribution of X and Y be such that
then
Taking the \(r_\beta \) root on both sides finishes the proof. \(\square \)
For the XL-insurance, the Hölder-constant is a Lipschitz constant (\(\beta =1\)) and has the value 1.
From the previous Theorem we can conclude that, if two probabilities are close, then the image probabilities by a mapping T with the characteristics of Theorem 3, are close in Wasserstein distance as well. Theorem 3 isolates the argument also used in Theorem 3.31 in Pflug and Pichler (2014). Note that the underlying distances for the Wasserstein distances are the metrics of the respective spaces.
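The Lipschitz case of Theorem 3 can be illustrated with an excess-of-loss payoff. The sketch below assumes the common form T(x) = min(max(x - a, 0), e - a) with a retention a and an upper limit e, which is 1-Lipschitz; the parameter names `retention` and `limit` and the sample values are hypothetical. It compares the Wasserstein distance of order 1 before and after applying T to two equal-size samples.

```python
def xl_payoff(x, retention, limit):
    """Assumed excess-of-loss payoff: pays the part of the loss between retention and limit."""
    return min(max(x - retention, 0.0), limit - retention)


def wd1(xs, ys):
    """Wasserstein distance of order 1 between equal-size empirical distributions."""
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


f_sample = [0.5, 2.0, 4.0, 9.0]
g_sample = [1.0, 3.0, 4.5, 12.0]

f_paid = [xl_payoff(x, retention=1.0, limit=5.0) for x in f_sample]
g_paid = [xl_payoff(x, retention=1.0, limit=5.0) for x in g_sample]

wd_before = wd1(f_sample, g_sample)
wd_after = wd1(f_paid, g_paid)
print(wd_before, wd_after)
```

Since the payoff cannot stretch distances between losses, the image distributions are at least as close as the original ones.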
Corollary 3
Let F, G be two distributions defined by the probabilities P and Q, respectively, and \(F^T, G^T\) be their image distributions by T, respectively. If T is a \(\beta \)-Hölder continuous mapping with constant \(H_\beta \), \(h\in \mathcal {L}^q\), and the distributions \(F^T\), \(G^T\) have finite p-moments, then for all \(r=p\cdot \beta \) (\(r\ge 1\)), the distortion premium with payment function T satisfies
We proceed now to study sensitivity properties of the distortion premium w.r.t. the distortion density.
4 Continuity of the premium w.r.t. the distortion density
Previously, we studied the mapping \(F \mapsto \pi _h(F)\) for fixed h. In this section, we consider and present properties of the mapping \(h \mapsto \pi _h(F)\) for fixed F. Different sensitivity properties w.r.t. the distortion parameters were studied in Gourieroux and Liu (2006).
Proposition 5
(Continuity of the distortion premium w.r.t. the distortion density h) Let F be a distribution and consider two different distortion densities \(h_1, \, h_2\). If F has finite p-moments and \(h_1, h_2\in \mathcal {L}^q\), then
$$\begin{aligned} | \pi _{h_1}(F) - \pi _{h_2}(F)| \le \Vert F^{-1}\Vert _p \cdot \Vert h_1 - h_2\Vert _q, \end{aligned}$$
where p and q are conjugates. Here the choices \(p=1\), \(q=\infty \) and \(p=\infty \), \(q=1\) are included.
Proof
The result follows directly from Hölder's inequality. \(\square \)
We can conclude that, if \(h_1\) and \(h_2\) are close, then the premium prices are close as well. Nevertheless, h is always identifiable, as the following Proposition shows.
Proposition 6
If \(\pi _{h_1}(F) = \pi _{h_2}(F)\) for all distribution functions F (the value \(\infty \) is not excluded), then
$$\begin{aligned} h_1 = h_2 \quad \text {almost surely}. \end{aligned}$$
Proof
Let \(F_a\) be the distribution which takes the value 0 with probability a and the value 1 with probability \(1-a\), for some \(a\in (0,1)\); then its inverse \(F_a^{-1}\) is the indicator function of the interval [a, 1]. Hence,
$$\begin{aligned} \pi _{h_i}(F_a) = \int _a^1 h_i(v) \, dv = 1 - H_i(a), \quad i=1,2. \end{aligned}$$
Thus, the distortion distributions \(H_1\) and \(H_2\) are equal and therefore \(h_1 = h_2\) almost surely.
\(\square \)
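The argument of the proof also gives a direct recipe: the premium of a loss that is 0 with probability a and 1 with probability 1 - a equals the mass of h on [a, 1], so such premia reveal the distortion distribution H point by point. The sketch below assumes the power distortion with s = 3 as the unknown truth and recovers H on a grid, together with a piecewise constant approximation of h. All names are illustrative.

```python
def bernoulli_premium(a, H):
    """Premium of a loss equal to 0 w.p. a and 1 w.p. 1 - a: the integral of h over [a, 1]."""
    return H(1.0) - H(a)


H_true = lambda u: u ** 3          # assumed truth: H(v) = v^3, i.e. h(v) = 3 v^2
grid = [k / 10 for k in range(11)]

premia = [bernoulli_premium(a, H_true) for a in grid]

# Invert premium = 1 - H(a) and difference H to get average densities per cell.
H_recovered = [1.0 - p for p in premia]
h_steps = [(H_recovered[k + 1] - H_recovered[k]) * 10 for k in range(10)]
print(h_steps)
```

In practice the observed contracts are rarely of this two-point form, which is why the estimation in Sect. 5 works with general samples instead.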
Remark 6
Note that the previous proposition remains true if the family of distributions on which the premium prices coincide contains all Bernoulli distributions. Compare also Theorem 2 in Wang et al. (1997).
Remark 7
Another family with the property that the premium prices for this family determine the distortion in a unique manner is the family of Power distributions of the form \(F_{\gamma }(u)=u^\gamma \) on [0, 1] and more general of the form \(F_{\gamma , \beta }(u)=\beta ^{-\gamma }u^\gamma \) on \([0, \beta ]\). The distortion premium prices for this family are
and the uniqueness of h and \(\beta \) is obtained since
and the inversion formula for the Mellin transform (see Zwillinger 2002).
5 Estimating the distortion density from observations
The way in which insurance companies calculate a premium is typically not revealed to the customer. Notice that risk premia appear not only in the insurance business, see the link between insurance premium prices and asset pricing in Nguyen et al. (2012). Risk premia appear in other areas such as
- Power future markets. A future contract fixes the price today for delivery of energy later. There is the risk of price changes between now and the delivery period. Thus, such a contract has the character of an insurance and the pricing principles apply, although the price is found in exchange markets (e.g. electricity future markets).
- Exotic options. While standard options are priced through a replication strategy argument, this argument does not apply for other types of options and these options have the character of insurance contracts. Pricing of such contracts is often done over the counter, but again the pricing principle is not revealed to the counterparty.
- Credit derivatives. These contracts also carry the character of insurance and can be priced according to insurance price principles.
In this section we assume that we know the distortion premium prices of m contracts, which are all priced with the same distortion density h. For each contract j, we also have a sample \(x_1^{(j)}, \ldots , x_n^{(j)}\) of size n drawn from the loss distribution of this contract at our disposal. For simplicity we assume that n is the same for all contracts, but this is not crucial.
The goal of this section is to show how the distortion density h can be regained from the observations of the insurance prices, which would help us to shed more light on the price formation of contract counterparties. Notice that our aim is not to estimate the distortion premium prices from empirical data as is done in Gourieroux and Liu (2006) or Tsukahara (2013).
A simulation example As an example we consider m different loss distributions, all of Gamma type. From each distribution, we obtain a sample of size n. For each sample, we calculate the \({\text {AV@R}}\) and power distortion premium prices. Based on the prices obtained and our samples, we aim to recover the distortion density h. We denote the ordered sample from the j-th loss distribution by \(x^{(j)}_{[1]} , \ldots , x^{(j)}_{[n]}\). The distortion premium, with distortion density h for each sample \(j=1, \ldots , m\), is
In the following, we develop (16) for the particular cases of \({\text {AV@R}}\) and power distortion premium prices for each sample \(j=1, \ldots , m\).
AV@R distortion premium The price for \(h_{ \alpha }\) defined in (7) is
where \(1<i_\alpha <n\) s.t. \( \frac{i_\alpha -1}{n}\le \alpha < \frac{i_\alpha }{n}\).
Power distortion premium The price given by the power distortion \(h^{(s)}\) defined in (3) with \(0<s<1\) is
and the price given by \(h^{(s)}\) defined in (5) with \(s\ge 1\) is
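For the AV@R case, the empirical price can be written in closed form using the index \(i_\alpha \). The sketch below is one way to do this, assuming the empirical quantile function is constant on ((i-1)/n, i/n]: the order statistic with index \(i_\alpha \) enters with the partial weight \(i_\alpha /n-\alpha \) and all larger ones with weight 1/n. The function name and the sample are hypothetical, and the closed form is derived here from (8) rather than copied from the paper.

```python
import math


def avar_empirical(sample, alpha):
    """Empirical AV@R: (1/(1-alpha)) * integral of the empirical quantile function over [alpha, 1]."""
    xs = sorted(sample)
    n = len(xs)
    i_alpha = math.floor(alpha * n) + 1           # (i_alpha - 1)/n <= alpha < i_alpha/n
    partial = (i_alpha / n - alpha) * xs[i_alpha - 1]
    tail = sum(xs[i_alpha:]) / n                  # order statistics above index i_alpha
    return (partial + tail) / (1.0 - alpha)


losses = [4.0, 1.0, 10.0, 3.0, 2.0]
price = avar_empirical(losses, 0.5)
print(price)
```

For α = 0.5 and n = 5 the third order statistic is only partly inside the tail, which is the situation the partial weight handles.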
The inverse problem consists in estimating the distortion density h from observed prices. Recall that the examples of common distortion densities presented above include both step functions and continuous functions; therefore we will use step and spline functions in order to estimate h. We do so for the prices obtained in (17)–(19).
5.1 Estimation of the distortion density with a step function
Distortion density as a step function Let \(\widehat{h}^1_l\) denote the step function consisting of l equal-size steps, defined as
where \(L= n/l\), \( \lambda _k\in \mathbb {R}\) for \(k=1, \ldots , l\) and l denotes the dimension of the step function space. We also impose
with \(0\le \lambda _1\le \cdots \le \lambda _l. \) In this way, \(\widehat{h}^1_l\) fulfils the density constraints as well as the non-decreasing constraints.
Prices with the step function For each sample \(j=1, \ldots , m\), the prices with \(\widehat{h}^1_l\) are
Estimation In order to estimate \(\widehat{h}^1_l\) we will minimize the squares of the differences between the prices obtained by a distortion function h and the premium obtained by \(\widehat{h}^1_l\) in (22). We will test our results with the given prices \(\pi ^{(j)}\) calculated in (17), (18) and (19). We solve,
5.2 Estimation of the distortion density with a cubic monotone spline
B-splines construction For our purposes we will define the splines on the interval [0, 1]. Any B-spline is a linear combination of the B-spline basis functions. The B-spline basis functions all have the same degree b, and we choose to define them at equally spaced knots \(t_k=k/L\), for \(k=0, \ldots , L\), hence on L subintervals. The functions of this basis are denoted by \(B_{k,b}\) and constructed following a recursion formula. The B-spline basis function of degree 0 is denoted and defined as
The B-spline basis functions of degree b, \(B_{k,b}\) are obtained as an interpolation between \(B_{k,b-1}\) and \(B_{k+1,b-1}\), following the recursion formula
In the recursion we need to define fake knots \(t_{-k}=0\) and \(t_{L+k}=1\) for \(k=1, \ldots ,b\). In our case, we consider splines of degree \(b=2\). If we divide [0, 1] in L equally sized intervals, the basis has \(L+2 \) functions
Notice that all the elements of the basis can be obtained by translating the B-spline basis function \(B_{0,2}\) defined on the first \(b+2=4\) knots. In order to have a base of increasing monotone cubic splines we integrate the functions of (23) and obtain a new base
where \(S_k(v)= \int _0^v B_{k,2}(w)\, dw\) for all \(k=-2, \ldots , L-1\). We scale the functions of (23) so the splines in (24) are distribution functions. Note that no linear combination of (24) gives us a constant function, due to the construction of (24). Therefore, we need to add one element to our base, say \(S_{L}(v)=c\), and hence
is our final base with \(l=L+3\) elements, where l denotes its dimension.
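The recursive construction of the quadratic basis can be sketched as follows. The code assumes that the recursion is the standard Cox-de Boor formula, that the fake knots are clamped to 0 and 1 as described, and that a term with zero knot spacing is set to zero. The function names `knot` and `bspline` are illustrative, and the scaling used in the paper is not reproduced.

```python
def knot(k, L):
    """Knot t_k = k/L, with the fake knots outside 0..L clamped to 0 resp. 1."""
    return min(max(k / L, 0.0), 1.0)


def bspline(k, b, v, L):
    """Value at v of the degree-b basis function B_{k,b} (assumed Cox-de Boor recursion)."""
    if b == 0:
        return 1.0 if knot(k, L) <= v < knot(k + 1, L) else 0.0
    left_den = knot(k + b, L) - knot(k, L)
    right_den = knot(k + b + 1, L) - knot(k + 1, L)
    left = right = 0.0
    if left_den > 0:   # convention: terms with zero knot spacing are dropped
        left = (v - knot(k, L)) / left_den * bspline(k, b - 1, v, L)
    if right_den > 0:
        right = (knot(k + b + 1, L) - v) / right_den * bspline(k + 1, b - 1, v, L)
    return left + right


L = 5
points = [0.0, 0.1, 0.37, 0.5, 0.93]
# The L + 2 quadratic basis functions B_{-2,2}, ..., B_{L-1,2} evaluated at each point.
values = [[bspline(k, 2, v, L) for k in range(-2, L)] for v in points]
sums = [sum(row) for row in values]
print(sums)
```

Under these assumptions the L + 2 basis functions are non-negative and sum to one on [0, 1), which is a convenient sanity check before integrating them to obtain the monotone splines.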
As an example we illustrate the base obtained for \(L=5\). Starting with \(B_{0,2}\) defined on \(t_0=0, t_1=1/5, t_2=2/5, t_3=3/5\), precisely
We denote by \(S_0\) the distribution of \(B_{0,2}\) and obtain the rest of the monotone cubic splines by translating \(S_0\). The basis of cubic monotone splines of dimension \(l=8\), illustrated in Fig. 1, is denoted as
where \(S_k(v)=S_0(v-k/5)\) for \(k=-2, \ldots , 4\) and \(S_{5}(v)=c\).
Any linear combination with positive scalars of the splines in (26) defines a spline which is an increasing and positive function.
Distortion density as a spline Let \(\widehat{h}^2_l(v)\) denote an increasing monotone cubic density defined as a linear combination of \(l=L+3\) splines in (25)
where \(\lambda _k\ge 0\) for all \(k=-2, \ldots , L\). Notice that by setting the scalars to be non-negative, \(\widehat{h}^2_l \) is increasing. However, \(\widehat{h}^2_l\) must integrate to 1 on [0, 1], hence
where
Prices with the spline For each sample \(j=1, \ldots , m\), the prices with \(\widehat{h}^2_l\) are
Estimation Given prices \(\pi ^{(j)}\) calculated as in (17), (18) or (19) and the prices calculated in (29) for every sample \(j=1, \ldots , m\), we solve
where \(a_k\) is defined in (28).
The estimations obtained by solving (\(P_1\)) and (\(P_2\)) are presented below.
AV@R distortion premium We consider particular cases of \(h_\alpha \) for \(\alpha =0.9, 0.95\). We estimate the distortion density for each of the cases, with two different step functions, corresponding to \(l=8, 10\) steps, and two different spline basis functions of dimensions \(l=8, 13\), respectively.
Step function The estimated step distortions \(\widehat{h}^1_l\) for \(l=8, 10\) are obtained by solving (\(P_1\)) and illustrated below (Fig. 2).
Splines The estimated spline distortions \(\widehat{h}^2_l\) for \(l=8,13\) are obtained by solving (\(P_2\)) and illustrated below (Fig. 3).
Power distortion premium For this case we consider \(h^{(s)}\) for \(s=0.8, 3\). We solve (\(P_1\)) and (\(P_2\)) with the same number of steps and number of spline basis functions as before.
Step function The estimated step distortions \(\widehat{h}^1_l\) for \(l=8,10\) are obtained by solving (\(P_1\)) and illustrated below (Fig. 4).
Splines The estimated spline distortions \(\widehat{h}^2_l\) for \(l=8,13\) are obtained by solving (\(P_2\)) and illustrated below (Fig. 5).
The optimal values of the optimization problems for all the cases can be seen in the following Table 1.
6 Ambiguity
In this section we combine the distortion premium with the ambiguity principle. Such an approach allows us to incorporate model uncertainty into the premium. Recall that, by setting the distortion density to \(h=1\), we would price just with the ambiguity principle. As was mentioned in Sect. 1, distances can be used to define ambiguity sets. Here, closed Wasserstein balls will serve as ambiguity sets. These sets will be centred at F, an initial distribution, that we refer to as our baseline model.
Definition 2
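(Robust distortion premium under Wasserstein balls with \(d_1\)) Let F be the baseline loss distribution and h a distortion density. The robust distorted price of order \(r\ge 1\) is
$$\begin{aligned} \pi ^\epsilon _{h,r,d_1}(F) = \sup \left\{ \pi _h(G):\, G \in \mathcal {B}_{r,d_1}(F,\epsilon ) \right\} , \end{aligned} \tag{P-r}$$
where \(\mathcal {B}_{r,d_1}(F,\epsilon ) =\{G:\, WD_{r,d_1}(G, F)\le \epsilon \}.\) We call \(F^*\) the worst case distribution if \(F^* \in \mathcal {B}_{r,d_1}(F,\epsilon ) \) and
$$\begin{aligned} \pi _h(F^*) = \pi ^\epsilon _{h,r,d_1}(F). \end{aligned}$$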
Remark 8
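Notice that for \(r_1 \le r_2\)
$$\begin{aligned} WD_{r_1,d_1}(G, F) \le WD_{r_2,d_1}(G, F), \end{aligned}$$
thus \(\mathcal {B}_{r_1,d_1}(F,\epsilon ) \supseteq \mathcal {B}_{r_2,d_1}(F,\epsilon )\).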
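The ordering of the balls in Remark 8 can be checked numerically. The sketch below assumes the quantile representation of the Wasserstein distance on the real line for two empirical distributions with equally many, equally weighted atoms. The function name `wasserstein_1d` and the sample values are illustrative choices, not taken from the paper.

```python
def wasserstein_1d(xs, ys, r):
    """Order-r Wasserstein distance w.r.t. d_1(x, y) = |x - y| between two
    empirical distributions with the same number of equally weighted atoms.
    On the real line the optimal coupling pairs the sorted values (quantiles)."""
    if len(xs) != len(ys):
        raise ValueError("samples must have equal size")
    pairs = zip(sorted(xs), sorted(ys))
    mean_cost = sum(abs(x - y) ** r for x, y in pairs) / len(xs)
    return mean_cost ** (1.0 / r)


baseline = [0.5, 1.0, 1.5, 2.0, 4.0]      # illustrative baseline losses
alternative = [0.7, 1.0, 1.9, 2.0, 6.5]   # illustrative alternative model
distances = [wasserstein_1d(baseline, alternative, r) for r in (1, 2, 3)]
```

The entries of `distances` are non-decreasing in the order r, so a model within distance \(\epsilon \) of the baseline for a larger order is also within \(\epsilon \) for a smaller one.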
We can say more about the value and solution of (P-r) if we choose \(r=p\). We start with bounded distortion densities, i.e. for \(p=1\) and \(q=\infty \).
Proposition 7
(Characterization of the worst case distribution for \(r\ge p=1 \)) Let the baseline distribution F have its first moment finite.
(i) If h is unbounded, then (P-r) for \(r = 1\) is unbounded.
(ii) If h is bounded with \(\sup _v h(v) = \Vert h\Vert _\infty \), then (P-r) is bounded for all \(r\ge 1\). If \(r=1\), the optimal value of (P-r) is
$$\begin{aligned} \pi ^\epsilon _{h,1,d_1}(F) = \pi _h(F) + \epsilon \cdot \Vert h\Vert _\infty . \end{aligned}$$
We interpret the additional term \(\epsilon \cdot \Vert h\Vert _\infty \) as the ambiguity premium. For the worst case distribution,
- if \(h(v) = \Vert h\Vert _\infty \) for \(v \ge 1-\eta \) and \(0<\eta \le 1\), then the supremum is attained at
$$\begin{aligned} F_\eta ^*(x) = \left\{ \begin{array}{ll} F(x) &{} \quad x< F^{-1}(1-\eta ),\\ 1-\eta &{} \quad F^{-1}(1-\eta )\le x < F^{-1}(1-\eta ) + \epsilon / \eta ,\\ F\left( x- \epsilon / \eta \right) &{} \quad x \ge F^{-1}(1-\eta ) + \epsilon / \eta . \end{array} \right. \end{aligned}$$
- Otherwise, the supremum is not attained, but can be approximated by the sequence \(F^*_{1/n}(x)\), \(\forall n\in \mathbb {N}\).
Proof
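(i) Given that h is increasing and unbounded, the increasing sequence \(K_n = h\left( 1- 1/n \right) \) is such that \(\lim _{n\rightarrow \infty } K_n =\infty \). For all \(n\in \mathbb {N}\) we define a distribution \(G_n\) such that
$$\begin{aligned} G_n^{-1}(v) = F^{-1}(v) + \epsilon \cdot n\, \mathbb {1}_{[1-1/n , 1]}(v). \end{aligned}$$
\(G_n\) is on the boundary of \( \mathcal {B}_{1,d_1}(F,\epsilon ) \), since \(\int _0^1 |G_n^{-1}(v) - F^{-1}(v)| \, dv = \epsilon \), and
$$\begin{aligned} \pi _h(G_n) = \pi _h(F) + \epsilon \cdot n \int _{1-1/n}^1 h(v) \, dv \ge \pi _h(F) + \epsilon \cdot K_n \rightarrow \infty \quad (n\rightarrow \infty ). \end{aligned}$$
Hence, (P-r) is unbounded for \(r=1\).
(ii) It is sufficient to prove (P-r) is bounded for \(r=1\) since \(\mathcal {B}_{1,d_1} \supseteq \mathcal {B}_{r,d_1}\) for all \(r\ge 1\) (see Remark 8). Any admissible G for \(r=1\) can be written as \(G^{-1}(v) = F^{-1}(v) + G_1^{-1}(v)\), where \(G_1\) is such that \(\int _0^1 |G_1^{-1}(v)| \, dv \le \epsilon \). Since F has its first moment finite, the following upper bound is finite:
$$\begin{aligned} \pi _h(G) = \int _0^1 F^{-1}(v)\, h(v) \, dv + \int _0^1 G_1^{-1}(v)\, h(v) \, dv \le \pi _h(F) + \epsilon \cdot \Vert h\Vert _\infty . \end{aligned} \tag{31}$$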
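The distribution \(F_\eta ^*(x)\) given in the Proposition has inverse
$$\begin{aligned} {F_\eta ^*}^{-1}(v) = F^{-1}(v) + \frac{\epsilon }{\eta }\, \mathbb {1}_{[1-\eta , 1]}(v). \end{aligned}$$
Therefore, \(F_\eta ^* \) is on the boundary of \(\mathcal {B}_{1,d_1}(F,\epsilon )\) and
$$\begin{aligned} \pi _h(F_\eta ^*) = \pi _h(F) + \frac{\epsilon }{\eta } \int _{1-\eta }^1 h(v) \, dv. \end{aligned}$$
If \(h(v) = \Vert h\Vert _\infty \) for \(v \ge 1-\eta \), then \(F_\eta ^*\) attains the upper bound in (31). Otherwise, \(F^*_{1/n}\) approaches the maximum from below, since
$$\begin{aligned} \pi _h(F^*_{1/n}) = \pi _h(F) + \epsilon \cdot n \int _{1-1/n}^1 h(v) \, dv \le \pi _h(F) + \epsilon \cdot \Vert h\Vert _\infty \end{aligned}$$
and
$$\begin{aligned} \lim _{n\rightarrow \infty } n \int _{1-1/n}^1 h(v) \, dv = \Vert h\Vert _\infty . \end{aligned}$$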
\(\square \)
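Part (i) of the proof can be made concrete with a small numerical sketch. It assumes one particular unbounded distortion density, \(h(v) = 1/(2\sqrt{1-v})\) with distortion function \(H(v) = 1-\sqrt{1-v}\), which is an illustrative choice and not a case singled out in the paper. The function names and the value of \(\epsilon \) are likewise hypothetical. For the sequence \(G_n\) of the proof, the premium increase over the baseline is \(\epsilon \cdot n \int _{1-1/n}^1 h(v)\, dv\), which here equals \(\epsilon \sqrt{n}\).

```python
import math


def distortion_H(v):
    """Assumed distortion function H(v) = 1 - sqrt(1 - v); its density
    h(v) = 1 / (2 * sqrt(1 - v)) is increasing and unbounded near v = 1."""
    return 1.0 - math.sqrt(1.0 - v)


def premium_increase(n, eps):
    """Premium of G_n minus premium of F: eps * n * integral of h over
    [1 - 1/n, 1], where the integral equals H(1) - H(1 - 1/n)."""
    return eps * n * (distortion_H(1.0) - distortion_H(1.0 - 1.0 / n))


eps = 0.05  # illustrative ambiguity radius
increases = [premium_increase(n, eps) for n in (10, 100, 1000, 10000)]
```

Every \(G_n\) stays at Wasserstein distance \(\epsilon \) from the baseline, yet the entries of `increases` grow without bound as n increases.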
Remark 9
The solution \(F^*_\eta \) in Proposition 7 is not unique. Any distribution \(\tilde{F}_\eta \) such that \({\tilde{F}_\eta }^{-1}(v) = F^{-1}(v) + \frac{\epsilon }{\eta }\cdot k(v)\mathbb {1}_{[1-\eta ,1]} \), with \(\frac{1}{\eta }\cdot k(v)\mathbb {1}_{[1-\eta ,1]}\) a density on [0, 1], attains the supremum.
As an example, we illustrate the worst case distribution for the \({\text {AV@R}}\) premium (Fig. 6).
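The AV@R case can also be reproduced numerically. The sketch below assumes an Exp(1) baseline loss, a midpoint-rule approximation of \(\int _0^1 F^{-1}(v) h(v)\, dv\), and the values \(\alpha = 0.9\), \(\epsilon = 0.05\). All function names and parameter values are illustrative and not taken from the paper. For AV@R the density equals its supremum \(1/(1-\alpha )\) on \([\alpha , 1]\), so \(\eta = 1-\alpha \) and the worst case quantile function is the baseline shifted by \(\epsilon /\eta \) on that interval.

```python
import math


def avar_density(alpha):
    """Distortion density of AV@R at level alpha: 1/(1-alpha) on [alpha, 1]."""
    return lambda v: 1.0 / (1.0 - alpha) if v >= alpha else 0.0


def distortion_premium(quantile, h, n=10000):
    """Midpoint-rule approximation of the integral of quantile(v) * h(v) on [0, 1]."""
    total = 0.0
    for i in range(n):
        v = (i + 0.5) / n
        total += quantile(v) * h(v)
    return total / n


def worst_case_quantile(quantile, eps, eta):
    """Quantile function of F*_eta: baseline shifted by eps/eta on [1 - eta, 1]."""
    return lambda v: quantile(v) + (eps / eta if v >= 1.0 - eta else 0.0)


alpha, eps = 0.9, 0.05
baseline = lambda v: -math.log(1.0 - v)   # assumed baseline: Exp(1) losses
h = avar_density(alpha)
worst = worst_case_quantile(baseline, eps, eta=1.0 - alpha)

base_premium = distortion_premium(baseline, h)
robust_premium = distortion_premium(worst, h)
ambiguity_premium = robust_premium - base_premium
```

Under these assumptions `ambiguity_premium` agrees with \(\epsilon \cdot \Vert h\Vert _\infty = \epsilon /(1-\alpha )\) up to rounding, independently of the chosen baseline.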
If h is unbounded we can characterize the solution of (P-r) as follows.
Proposition 8
(Characterization of the worst case distribution for \(\mathbf {r\ge p>1}\)) Let the baseline distribution F have finite p-moments. If \(h\in \mathcal {L}^q\), then (P-r) is bounded for \(r\ge p\). If \(r=p\), the optimal value of (P-r) is
Also in this case, the term \(\epsilon \cdot \Vert h\Vert _q^q\) is interpreted as ambiguity premium.
Furthermore, the worst case distribution \(F^*\) of (P-r) for \(r=p\) is such that
Proof
We prove that (P-r) is bounded for \(r=p\); by Remark 8, boundedness then holds for all \(r\ge p\). Notice that, for all admissible G, if \(r=p\), we have
\(F^*\) is admissible since it is on the boundary of \(\mathcal {B}_{p, d_1}(F, \epsilon )\)
and \(F^*\) attains the upper bound
Under some conditions on h we can also prove unboundedness of (P-r) for \(r>p>1\) in the case where h is not in \(\mathcal {L}^q\), where q is the conjugate exponent of p and F has finite p-moments.
Proposition 9
(Unboundedness for \(\mathbf {r}>\mathbf {p}>\mathbf {1}\)) Let the baseline distribution F have finite p-moments and let \(h\notin \mathcal {L}^q\), for \(p,\, q\) conjugates and \(r,\,s\) conjugates with \(r>1\). If there exists \(s_1<s\) such that \(\int _0^1 h(v)^{s_1} \, dv =\infty \) and \(h\in \mathcal {L}^t\), for all \(t<s_1\), then (P-r) is unbounded for all \(r>p\).
Proof
Define \(\psi _\eta (v) = h(v)^{s_1-1}\mathbb {1}_{[1-\eta , 1]}\). Since \(\psi _\eta \in \mathcal {L}^r\) for \(r>1\) (note that \(r(s_1 - 1)<s_1\)), there exists an \(0<\eta <1\) such that
Thus, the distribution \(G_\eta \) such that \(G_\eta ^{-1}(v) = F^{-1}(v) + \psi _\eta (v)\) is in \(\mathcal {B}_{r,d_1}(F, \epsilon )\), and its premium is unbounded:
Remark 10
If instead of the metric \(d_1\) we consider \(d_p(x,y)= |x^p - y^p|\) as the underlying metric for the Wasserstein distance, we could define the ambiguity principle
$$\begin{aligned} \pi ^\epsilon _{h,r,d_p}(F) = \sup \left\{ \pi _h(G):\, G \in \mathcal {B}_{r,d_p}(F,\epsilon ) \right\} , \end{aligned} \tag{P-dp}$$
where \(\mathcal {B}_{r,d_p}(F,\epsilon ) =\{G:\, WD_{r,d_p}(G, F)\le \epsilon \}.\) It is easy to see that, if F has finite p-moments, the ball constraint forces all admissible distributions to have finite p-moments as well. Therefore, by Proposition 3, if \(h\in \mathcal {L}^q\), then (P-dp) is bounded. Furthermore, continuity with respect to this Wasserstein distance implies our continuity results in Sect. 3.
7 Conclusions
After an introduction to general premium principles, we proposed generalizations of the distortion premium. In addition, we studied in detail three functional relationships for the distortion premium:
- the premium function \(F \mapsto \pi _h(F)\), i.e. the properties of \(\pi _h\) as a premium principle,
- the direct function \(h \mapsto \pi _h(F)\), i.e. the dependency on the distortion density,
- the inverse function \(\pi _h(F) \mapsto h\).
The smoothness properties are important for robustness; however, it is well known that a quite smooth direct function makes the inverse problem difficult. We showed nevertheless that the inverse problem is identifiable and gave a simple quadratic optimization problem to estimate the distortion density from empirical data. We successfully illustrated this in a simulation study; the application to real data is left for further research. We also identified the ambiguity premium for Wasserstein balls as ambiguity sets, offering, in some cases, an explicit form of the worst case distribution. It turned out that the extra premium for ambiguity depends on the distortion function h and, in a multiplicative way, on the ambiguity radius \(\epsilon \), but not on the loss distribution F itself. Thus it is the same for all contracts and can be calculated separately. Finally, by using different distances as underlying metrics for the Wasserstein ball, and hence for the ambiguity set, we found conditions under which the robust premium is always bounded.
Notes
The derivative of a concave function is a.e. defined, even if it is not differentiable everywhere.
\(F_1\) is first order stochastically larger than \(F_2\) if \(F_1(x)\le F_2(x)\) for all x.
The original notion of a utility function introduced by von Neumann and Morgenstern was a concave monotonic U, such that the decision maker maximizes the expectation \(\mathbb {E}(U(Y))\) of a profit variable Y. A disutility function can be defined from a utility function by setting \(V(u) = - U(-u)\).
\(\pi \) has the translation equivariance property if \(\pi (X+c) = \pi (X) + c\) for \(c\in \mathbb {R}\).
A premium \(\pi \) is called subadditive, if \(\pi (X+Y) \le \pi (X) + \pi (Y)\). Subadditivity and positive homogeneity imply convexity.
References
Artzner, P., Delbaen, F., Eber, J. M., & Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3), 203–228.
Borch, K. (1961). The utility concept applied to the theory of insurance. ASTIN Bulletin: The Journal of the IAA, 1(5), 245–255.
Denneberg, D. (1990). Premium calculation: Why standard deviation should be replaced by absolute deviation. ASTIN Bulletin, 20(2), 181–190.
Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997). Modelling extremal events for insurance and finance. Berlin: Springer.
Furman, E., Wang, R., & Zitikis, R. (2017). Gini-type measures of risk and variability: Gini shortfall, capital allocations, and heavy-tailed risks. Journal of Banking and Finance, 83, 70–84.
Furman, E., & Zitikis, R. (2008). Weighted premium calculation principles. Insurance: Mathematics and Economics, 42(1), 459–465.
Gilboa, I., & Schmeidler, D. (1989). Maxmin expected utility with non-unique prior. Journal of Mathematical Economics, 18(2), 141–153.
Goovaerts, M. J., Kaas, R., Laeven, R. J. A., & Tang, Q. (2004). A comonotonic image of independence for additive risk measures. Insurance: Mathematics and Economics, 35(3), 581–594.
Gourieroux, C., & Liu, W. (2006). Sensitivity analysis of distortion risk measures (No. 2006-33).
Greselin, F., & Zitikis, R. (2018). From the classical Gini index of income inequality to a new Zenga-type relative measure of risk: A modellers perspective. Econometrics, 6(1), 4.
Huber, P. J. (2011). Robust statistics. In International encyclopedia of statistical science (pp. 1248–1251). Berlin: Springer.
Jouini, E., Schachermayer, W., & Touzi, N. (2006). Law invariant risk measures have the Fatou property. In Advances in mathematical economics (pp. 49–71). Tokyo: Springer.
Kiesel, R., Rühlicke, R., Stahl, G., & Zheng, J. (2016). The Wasserstein metric and robustness in risk management. Risks, 4(3), 32.
Kusuoka, S. (2001). On law invariant coherent risk measures. Advances in mathematical economics (pp. 83–95). Berlin: Springer.
Luan, C. (2001). Insurance premium calculations with anticipated utility theory. ASTIN Bulletin: The Journal of the IAA, 31(1), 23–35.
Nguyen, H. T., Pham, U. H., & Tran, H. D. (2012). On some claims related to Choquet integral risk measures. Annals of Operations Research, 195(1), 5–31.
Pflug, G Ch. (2006). Subdifferential representations of risk measures. Mathematical Programming, 108(2–3), 339–354.
Pflug, G Ch., & Pichler, A. (2014). Multistage stochastic optimization (1st ed.). Berlin: Springer.
Pflug, G Ch., & Römisch, W. (2007). Modeling, measuring and managing risk. Singapore: World Scientific.
Pichler, A. (2010). Distance of probability measures and respective continuity properties of acceptability functionals (Doctoral dissertation, University of Vienna).
Pichler, A. (2013). The natural Banach space for version independent risk measures. Insurance: Mathematics and Economics, 53(2), 405–415.
Tsukahara, H. (2013). Estimation of distortion risk measures. Journal of Financial Econometrics, 12(1), 213–235.
Vallender, S. S. (1974). Calculation of the Wasserstein distance between probability distributions on the line. Theory of Probability and Its Applications, 18(4), 784–786.
Vinel, A., & Krokhmal, P. A. (2017). Certainty equivalent measures of risk. Annals of Operations Research, 249(1–2), 75–95.
Von Neumann, J., & Morgenstern, O. (1947). Theory of games and economic behavior (2nd ed.). Princeton, NJ: Princeton University Press.
Wang, S. S. (1995). Insurance pricing and increased limits ratemaking by proportional hazards transforms. Insurance: Mathematics and Economics, 17(1), 43–54.
Wang, S. S. (1996). Premium calculation by transforming the layer premium density. ASTIN Bulletin: The Journal of the IAA, 26, 71–92.
Wang, S. S. (2000). A class of distortion operators for pricing financial and insurance risks. Journal of Risk and Insurance, 67, 15–36.
Wang, S. S., Young, V. R., & Panjer, H. H. (1997). Axiomatic characterization of insurance prices. Insurance: Mathematics and Economics, 21(2), 173–183. http://EconPapers.repec.org/RePEc:eee:insuma:v:21:y:1997:i:2:p:173-183.
Wozabal, D. (2012). A framework for optimization under ambiguity. Annals of Operations Research, 193(1), 21–47.
Wozabal, D. (2014). Robustifying convex risk measures for linear portfolios: A nonparametric approach. Operations Research, 62(6), 1302–1315.
Yaari, M. E. (1987). The dual theory of choice under risk. Econometrica: Journal of the Econometric Society, 55, 95–115.
Young, V. R. (2014). Premium principles. London: Wiley.
Zwillinger, D. (2002). CRC standard mathematical tables and formulae. London: Chapman and Hall.
Acknowledgements
Open access funding provided by University of Vienna.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 Properties of the generalized distortion premium
We consider here the generalized distortion premium
where \(X \in \mathcal {L}^1\), \(\nu \) is a convex, monotone Lipschitz function and k is a non-negative weight function on [0, 1] which satisfies \(\int _0^1 (1-\alpha )^{-1} \, k(\alpha ) \, d\alpha < \infty \). Clearly, \(X \mapsto R(X)\) is convex and monotone, but it is positively homogeneous and/or translation equivariant only if \(\nu \) is a multiple of the identity. To see this, note that the subdifferential of R at \(Y \in \mathcal {L}^1\) is
where \(F_Y\) is the distribution function of Y. Notice that \(\mathbb {E}(Y \cdot Z_Y)\) depends only on the distribution function \(F_Y\). After some calculation, one finds that
Finally, based on the subdifferential, one gets a dual representation
where \(Z_Y\) is given by (33).
It is well known (see Pflug and Römisch 2007) that R is positively homogeneous only if
when it is finite. This implies that \(\nu (x)= \gamma \cdot x\) for some \(\gamma > 0\). R is translation equivariant if, in addition, the expectation of the dual multiplier \(Z_Y\) is one, which happens only if \( \int _0^1 \gamma \, k(\alpha ) \, d\alpha =1\).
1.2 On different underlying metrics for the Wasserstein distance
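There is a whole family of distances on \(\mathbb {R}\), which are generalizations of \(d_1\). Set for \(x,y \ge 0\), \(d_p(x,y)= |x^p - y^p|\). The Wasserstein distance of order 1 with distance \(d_p\) is
$$\begin{aligned} WD_{1,d_p}(F,G) = \int _0^1 \left| F^{-1}(v)^p - G^{-1}(v)^p \right| \, dv. \end{aligned}$$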
Lemma 1
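Notice that for \( p\ge 1\)
$$\begin{aligned} WD_{p,d_1}(F,G)^p \le WD_{1,d_p}(F,G). \end{aligned}$$
Proof
By the superadditivity of \(x \mapsto x^p\) on \(\mathbb {R}_{\ge 0}\) one has that \(|x-y|^p \le |x^p - y^p|\) and therefore
$$\begin{aligned} WD_{p,d_1}(F,G)^p = \int _0^1 \left| F^{-1}(v) - G^{-1}(v) \right| ^p \, dv \le \int _0^1 \left| F^{-1}(v)^p - G^{-1}(v)^p \right| \, dv = WD_{1,d_p}(F,G). \end{aligned}$$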
\(\square \)
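The pointwise inequality \(|x-y|^p \le |x^p - y^p|\) and the resulting comparison \(WD_{p,d_1}^p \le WD_{1,d_p}\) can be checked on small samples. The sketch below assumes the quantile (sorted pairing) representation of both distances for non-negative empirical samples of equal size. The function names and the data are illustrative and not part of the paper.

```python
def wd_p_d1_pow(xs, ys, p):
    """p-th power of the order-p Wasserstein distance under d_1(x, y) = |x - y|
    for equally weighted samples of the same size (sorted pairing)."""
    pairs = list(zip(sorted(xs), sorted(ys)))
    return sum(abs(x - y) ** p for x, y in pairs) / len(pairs)


def wd_1_dp(xs, ys, p):
    """Order-1 Wasserstein distance under d_p(x, y) = |x**p - y**p| for
    non-negative samples; sorted pairing stays optimal because x -> x**p
    is increasing on the non-negative reals."""
    pairs = list(zip(sorted(xs), sorted(ys)))
    return sum(abs(x ** p - y ** p) for x, y in pairs) / len(pairs)


losses_f = [0.2, 0.9, 1.4, 2.0, 3.5]   # illustrative non-negative losses
losses_g = [0.0, 1.1, 1.2, 2.6, 5.0]
checks = [
    (p, wd_p_d1_pow(losses_f, losses_g, p), wd_1_dp(losses_f, losses_g, p))
    for p in (1, 1.5, 2, 3)
]
```

For \(p=1\) the two quantities coincide; for larger p the \(d_p\)-based distance dominates, which is why a finite \(WD_{1,d_p}\) radius controls p-moments.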
Remark 11
This argument also shows that if F has finite p-moments and if \(WD_{1,d_p}(F,G)< \infty \) (and a fortiori if \(WD_{p,d_1}(F,G)< \infty \)), then also G has finite p-moments. On the other hand, if both F and G have finite p-moments, then
[see Lemma 2.19 in Pflug and Pichler (2014)]. Therefore, imposing conditions on \(WD_{1,d_p}\) or on \(WD_{p,d_1}\) leads to quite similar results.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Escobar, D.D., Pflug, G.C. The distortion principle for insurance pricing: properties, identification and robustness. Ann Oper Res 292, 771–794 (2020). https://doi.org/10.1007/s10479-018-3119-1
DOI: https://doi.org/10.1007/s10479-018-3119-1