1 Extreme Value Distributions and Copulas

For risk measurement and solvency purposes one should focus especially on the tails of the joint probability distribution of the risk factors. Unfortunately, it is often infeasible to obtain a good description of these tails from observations alone: by definition, information in the tail is scarce because such events occur only very rarely. Therefore, the available tail observations should be combined with extreme value theory in order to arrive at reasonable solvency models. In the present section we briefly sketch the essentials of univariate and multivariate extreme value theory. For an extended mathematical outline we refer to the relevant literature: McNeil et al. [108], Embrechts et al. [61], Resnick [128], Kotz–Nadarajah [96], Joe [90] and Nelsen [120].

In order to characterize the tails of univariate distributions one often studies the asymptotic behavior of extreme outcomes. One statistic to analyze is the maximum of an i.i.d. sequence \((X_{s})_{s\ge1}\) with distribution F, defined by

$$ M_m = \max_{1\le s \le m } X_s. $$

The Fisher–Tippett [68] theorem says that if there exist normalizing constants \(c_{m}>0\) and \(d_{m} \in\mathbb{R}\) such that for some non-degenerate distribution function H we have the following convergence in distribution

$$ \ c_m^{-1} (M_m - d_m ) \stackrel{(d)}{\longrightarrow} H, $$
(10.1)

then H is either of Fréchet, Gumbel or Weibull type, see Theorem 3.2.3 in Embrechts et al. [61]. The Weibull type distributions are short tailed because their support has a finite right endpoint. The Gumbel and the Fréchet type distributions have an infinite right endpoint but the Gumbel type survival distributions decay much faster than the Fréchet type ones.
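The convergence (10.1) can be illustrated numerically. The following sketch is an illustration only: it uses standard exponential variables, for which the choices \(c_m=1\) and \(d_m=\log m\) are known to give the Gumbel limit \(H(x)=\exp\{-e^{-x}\}\). The function names, the sample sizes and the seed are illustrative choices and do not come from the text.

```python
import numpy as np

def normalized_exponential_maxima(m, n_sim, seed=0):
    """Simulate (M_m - d_m) / c_m for i.i.d. Exp(1) variables with c_m = 1, d_m = log(m)."""
    rng = np.random.default_rng(seed)
    # The maximum of m i.i.d. Exp(1) variables has distribution function (1 - e^{-x})^m,
    # so it can be sampled by inversion from one uniform draw instead of m exponential draws.
    u = rng.uniform(size=n_sim)
    maxima = -np.log(-np.expm1(np.log(u) / m))
    return maxima - np.log(m)

def gumbel_cdf(x):
    """Gumbel limit distribution H(x) = exp(-exp(-x))."""
    return np.exp(-np.exp(-x))

samples = normalized_exponential_maxima(m=1000, n_sim=100_000)
for x in (-1.0, 0.0, 1.0, 2.0):
    # Empirical distribution function of the normalized maxima versus the Gumbel limit.
    print(x, np.mean(samples <= x), gumbel_cdf(x))
```

For a heavy tailed input distribution (for example Pareto) the same experiment would require different normalizing constants and would lead to a Fréchet limit instead.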

Most of the distributions used in applications satisfy (10.1) for appropriate constants \(c_{m}>0\), \(d_{m}\in\mathbb{R}\) and a non-degenerate limit distribution H. The Gumbel type distributions range from moderately heavy tailed distributions (log-normal distribution) to light tailed distributions (normal, gamma or exponential distributions). The Fréchet type distributions are heavy tailed; examples are the Pareto, Cauchy, log-gamma and Burr distributions. We concentrate on this last class because these distributions usually show the appropriate type of tail behavior for extreme value modeling. For Fréchet type distributions there is a simple characterization: a distribution F is of Fréchet type in (10.1) with tail exponent α>0 if and only if the survival function 1−F is regularly varying at infinity with index −α, i.e.

$$ \lim_{x\to\infty} \frac{1-F(tx)}{1-F(x)}=t^{-\alpha}, \qquad\text{ for all } t>0. $$

An equivalent characterization to (10.1) is the following Pickands–Balkema–de Haan [10, 127] theorem. A random variable \(X\sim F\) is of Weibull (ξ<0), Gumbel (ξ=0) or Fréchet (ξ>0) type if and only if there exists a positive measurable function a(⋅) such that for \(1+\xi x>0\)

$$ \lim_{u \to x_F} \mathbb{P}\bigl[ X > u+x a(u) \big\vert X >u \bigr] = \left\{ \begin{array}{ll} (1+\xi x)^{-1/\xi}& \text{ if } \xi\neq0,\\ \exp \{-x \}& \text{ if } \xi=0, \end{array} \right. $$
(10.2)

where \(x_{F}=\inf\{x\in\mathbb{R}; F(x)=1\}\) denotes the right endpoint of the distribution F. Statement (10.2) is especially helpful because it describes the tail of the distribution above large thresholds. In practice, one uses the Fréchet type distributions for extreme value modeling with tail exponent α=1/ξ>0 (sometimes also called the Pareto parameter). This parameter can be estimated using maximum likelihood methods, Bayesian methods, log-log plots, Hill estimators, peaks-over-threshold methods, etc. For a detailed discussion we refer to McNeil et al. [108], Chap. 7.
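As a minimal sketch of one of the estimators mentioned above, the following code applies the Hill estimator to a simulated Pareto sample. The function name `hill_estimator`, the number k of upper order statistics, the true tail exponent and the seed are illustrative assumptions. In practice the choice of k is delicate and is usually supported by a plot of the estimate against k.

```python
import numpy as np

def hill_estimator(data, k):
    """Hill estimate of the tail exponent alpha based on the k largest observations."""
    x = np.sort(np.asarray(data, dtype=float))[::-1]  # descending order statistics
    if not 0 < k < len(x):
        raise ValueError("k must satisfy 0 < k < len(data)")
    # The mean log-excess of the k largest points over the (k+1)-th largest estimates xi = 1/alpha.
    xi_hat = np.mean(np.log(x[:k])) - np.log(x[k])
    return 1.0 / xi_hat

rng = np.random.default_rng(1)
alpha_true = 2.5  # hypothetical tail exponent used for the simulation
# Pareto sample with survival function x^{-alpha} on [1, infinity), generated by inversion.
sample = (1.0 - rng.uniform(size=50_000)) ** (-1.0 / alpha_true)
alpha_hat = hill_estimator(sample, k=2_000)
print(alpha_hat)
```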

Once we have specified univariate (marginal) distributions we need to couple them to a multivariate model that respects the inherent dependence structure. One method is to use a so-called copula, which is a multivariate distribution function on the unit cube with uniform marginals, see Joe [90] and Nelsen [120]. As for univariate marginals, some copulas are more appropriate than others for modeling the dependence of joint extreme outcomes. In general, such a copula should show tail dependence, which says, in non-mathematical terms, that if one risk factor has an extreme outcome then it is more likely that the other risk factor is also extreme. Copulas that have this property are the Gumbel copula, the Clayton survival copula and the multivariate t-copula. In practice, the Gaussian copula is often used and the correlation matrix is estimated as described in Sect. 9.5. From a pure risk measurement point of view, the Gaussian copula should not be used because it does not put sufficient probability weight on the joint occurrence of extremes. We refer to Chaps. 5–7 in McNeil et al. [108] and Embrechts et al. [60].
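The difference between a copula with and without tail dependence can be made visible by simulation. The sketch below compares the empirical conditional probability that the second component exceeds its q-quantile, given that the first one does, for a bivariate Gaussian copula and a bivariate t-copula with the same correlation. The correlation 0.5, the 3 degrees of freedom, the sample size and all function names are illustrative assumptions.

```python
import numpy as np

def joint_exceedance_ratio(x, y, q):
    """Empirical P[V > q | U > q], where U and V are the rank-transformed samples."""
    n = len(x)
    # Ranks scaled to (0, 1) give pseudo-observations from the copula of (x, y).
    u = (np.argsort(np.argsort(x)) + 1) / (n + 1)
    v = (np.argsort(np.argsort(y)) + 1) / (n + 1)
    return np.mean(v[u > q] > q)

rng = np.random.default_rng(2)
n, rho, nu = 200_000, 0.5, 3
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(np.zeros(2), cov, size=n)   # Gaussian copula sample
# Bivariate t sample: divide the Gaussian vector by a common sqrt(chi2_nu / nu) variable.
w = np.sqrt(rng.chisquare(nu, size=n) / nu)
t = z / w[:, None]

for q in (0.95, 0.99, 0.999):
    gauss_ratio = joint_exceedance_ratio(z[:, 0], z[:, 1], q)
    t_ratio = joint_exceedance_ratio(t[:, 0], t[:, 1], q)
    print(q, gauss_ratio, t_ratio)
```

For the Gaussian copula the ratio decreases as q approaches 1, whereas for the t-copula it stabilizes at a strictly positive level, which is the tail dependence described above.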

2 Parameter Uncertainty

In Sect. 8.3 we have studied numerical examples of an endowment policy (Example 8.20), a life-time annuity (Example 8.19) and a non-life claims reserving run-off (Sect. 8.3.1). However, the analysis done in Sect. 8.3 is not the whole story. In all these examples we have assumed that the model parameters are known, for example, the life table was given by the Gompertz mortality law with known parameters and also the parameters for Hertig’s claims reserving model were assumed to be known, see Table 8.1. Of course, in general, these parameters are not known and need to be estimated from data. This estimation adds an additional source of uncertainty to the problem. This was already mentioned in Remark 8.18. In the present section we model this parameter uncertainty. We therefore use a Bayesian approach because this provides a consistent modeling framework.

2.1 Parameter Uncertainty for a Non-life Run-Off

In this subsection we revisit Hertig’s [83] log-normal claims reserving model from Examples 7.8, 7.17 and 8.11, but we modify the model assumptions such that we can incorporate the uncertainty about the true model parameters. We change Model Assumptions 7.9 as follows, where \(X_{i,j}\) again denotes the incremental payments for accident year i in development year j and \(C_{i,j}\) denotes the corresponding cumulative payments.

Model Assumptions 10.1

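We set Assumption 6.3 and assume \(I\ge J+1\) and

  1. (a)

    \(\mathcal{T}_{t}=\sigma \{X_{i,j};~ i+j\le t \}\) for all t=1,…,I+J;

  2. (b)

    conditionally, given \(\boldsymbol{\varPhi}=(\varPhi_{0},\ldots,\varPhi_{J-1})\) and \(\boldsymbol{\sigma}=(\sigma_{0},\ldots,\sigma_{J-1})\), we have

    • \(X_{i,j}\) are independent for different accident years i and \(X_{i,0}>0\);

    • for cumulative payments it holds that

      $$ \xi_{i,j+1}=\log \biggl(\frac{C_{i,j+1}}{C_{i,j}}-1 \biggr) \sim\mathcal{N} \bigl(\varPhi_j, \sigma_j^2 \bigr) $$

      for j=0,…,J−1 and i=1,…,I;

  3. (c)

    σ>0 is deterministic and \(\varPhi_{j}\) for j=0,…,J−1 are independent random variables with

    $$ \varPhi_j \sim\mathcal{N} \bigl(\phi_j, s_j^2 \bigr) $$

    and prior parameters \(\phi_{j}\in\mathbb{R}\) and \(s_{j}>0\);

  4. (d)

    \((X_{1,0},\ldots,X_{I,0})\) and Φ are independent.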

Contrary to Model Assumptions 7.9 we now assume that we do not know the true value of the chain ladder parameters. Therefore, we model \(\varPhi_{j}\) stochastically, using a prior distribution for \(\varPhi_{j}\) with prior parameters \(\phi_{j}\) and \(s_{j}\). The parameter \(\phi_{j}\) is our prior knowledge about the mean of \(\varPhi_{j}\) and \(s_{j}\) quantifies our uncertainty in this prior knowledge. That is, for very uncertain prior knowledge we choose \(s_{j}\) large, which gives so-called vague priors, and for small \(s_{j}\) we have strong prior knowledge. If we have no prior knowledge we let \(s_{j}\to\infty\) and then we obtain non-informative prior distributions for \(\varPhi_{j}\). However, non-informative prior distributions need some care because they do not always lead to sensible models.

For simplicity we assume that σ is known. In a full Bayesian approach we should also model this parameter stochastically. We refrain from doing so in order to get analytically tractable solutions. If we used a full Bayesian model then we could obtain numerical answers using Markov chain Monte Carlo (MCMC) simulation methods, see Gilks et al. [75], Asmussen–Glynn [8], Johansen et al. [91] and Robert–Casella [129].

Note that \(\mathcal{T}_{t}\) is generated by \(X_{i,j}\), \(i+j\le t\). This implies that we assume to have no other insurance technical information than the payments \(X_{i,j}\) themselves. If we have other insurance technical information, then we need to specify how this additional information influences the estimation of the parameters Φ and the prediction of future cash flows (for an example see Merz–Wüthrich [114]). Moreover, the assumptions on \(\mathcal{T}_{t}\), t≥0, also imply (implicitly) that the parameters Φ are independent of financial information. Finally, note that we do not add new accident years to \(\mathbb{T}\) after time t=I, which corresponds to the run-off situation after time t=I.

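The aim is to predict the ultimate claim \(C_{i,J}\), given information \(\mathcal{T}_{t}\), for i+J>t. We have already seen in Lemma 7.10 that conditional on the model parameters Φ we obtain the predictor (\(i\le t\))

$$ \mathbb{E} \bigl[C_{i,J} \big\vert \mathcal{T}_t, \boldsymbol{\varPhi} \bigr] = C_{i,t-i} \prod_{j=t-i}^{J-1} \bigl( \exp \bigl\{ \varPhi_j+\sigma_j^2\big/2 \bigr\}+1 \bigr). $$

Therefore, we would like to make Bayesian inference on Φ, given the observations \(\mathcal{T}_{t}\), i.e. we aim to determine the posterior distribution of Φ at time t. This then provides the Bayesian predictor for \(C_{i,J}\) at time t:

$$ \mathbb{E} \bigl[C_{i,J} \big\vert \mathcal{T}_t \bigr] = \mathbb{E} \bigl[ \mathbb{E} \bigl[C_{i,J} \big\vert \mathcal{T}_t, \boldsymbol{\varPhi} \bigr] \big\vert \mathcal{T}_t \bigr] = C_{i,t-i}~ \mathbb{E} \Biggl[ \prod_{j=t-i}^{J-1} \bigl( \exp \bigl\{ \varPhi_j+\sigma_j^2\big/2 \bigr\}+1 \bigr) \bigg\vert \mathcal{T}_t \Biggr]. $$
(10.3)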

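As mentioned above, we study the run-off situation after time t=I. At time t=I, the first column of the claims development is observed, therefore we can consider all distributions conditional on \((X_{1,0},\ldots,X_{I,0})\). Choose \(t\ge I\), then the joint density of \(((\xi_{i,j})_{i+j\le t},\boldsymbol{\varPhi})\), conditional on \((X_{1,0},\ldots,X_{I,0})\), is given by

$$ f \bigl((\xi_{i,j})_{i+j\le t}, \boldsymbol{\varPhi} \bigr) = \prod_{j=0}^{J-1} \prod_{i=1}^{(t-j-1)\wedge I} \frac{1}{\sqrt{2\pi}\sigma_j} \exp \biggl\{ -\frac{(\xi_{i,j+1}-\varPhi_j)^2}{2\sigma_j^2} \biggr\} ~\prod_{j=0}^{J-1} \frac{1}{\sqrt{2\pi}s_j} \exp \biggl\{ -\frac{(\varPhi_j-\phi_j)^2}{2s_j^2} \biggr\}. $$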

The first term on the right-hand side is the likelihood function of the observations \(\xi_{i,j}\), \(i+j\le t\), given the parameters Φ, the second term is the prior information about the parameters Φ.

There are two different ways to calculate the Bayesian predictor for C i,J given in (10.3). The first way uses the property that we can completely integrate out the parameters Φ. This approach is described in Wüthrich [164], see also Sect. 10.4 below. The disadvantage of this approach is that it sometimes lacks interpretation. Therefore, we calculate (10.3) by directly evaluating the posterior distribution of the parameters, which is stated in the next theorem.

Theorem 10.2

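Set Model Assumptions 10.1 and choose \(t\ge I\). Conditionally, given \(\mathcal{T}_{t}\), the random variables \(\varPhi_{j}\), j=0,…,J−1, are independent Gaussian distributed with posterior means and variances given by

$$ \phi_j^{(t)}= \bigl(s_j^{(t)} \bigr)^2 \Biggl[ \frac{\phi_j}{s_j^2}+ \frac{1}{\sigma_j^2} \sum_{i=1}^{(t-j-1)\wedge I} \xi_{i,j+1} \Biggr] \quad\text{ and } \quad \bigl(s_j^{(t)} \bigr)^2= \biggl[ \frac{1}{s_j^2}+ \frac{(t-j-1)\wedge I}{\sigma_j^2} \biggr]^{-1}. $$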

Proof of Theorem 10.2

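The posterior distribution of Φ, given \(\mathcal{T}_{t}\), is proportional to the joint density of the observations and Φ. So, if we use the explicit form of the density we obtain

$$ f \bigl(\boldsymbol{\varPhi} \big\vert \mathcal{T}_t \bigr) \propto \prod_{j=0}^{J-1} \exp \Biggl\{ -\frac{(\varPhi_j-\phi_j)^2}{2s_j^2} -\sum_{i=1}^{(t-j-1)\wedge I} \frac{(\xi_{i,j+1}-\varPhi_j)^2}{2\sigma_j^2} \Biggr\} \propto \prod_{j=0}^{J-1} \exp \Biggl\{ -\frac{ (\varPhi_j-\phi_j^{(t)} )^2}{2 (s_j^{(t)} )^2} \Biggr\}. $$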

But this immediately implies that we have independent Gaussian posteriors with the posterior parameters \(\phi_{j}^{(t)}\) and \((s_{j}^{(t)})^{2}\). □

Theorem 10.2 implies the following identity for the posterior means

$$ \phi_j^{(t)}=\alpha_j^{(t)}~ \overline{\xi}_j^{(t)}+ \bigl(1-\alpha_j^{(t)} \bigr) \phi_j, $$
(10.4)

with sample means and credibility weights given by

$$ \overline{\xi}_j^{(t)}= \frac{1}{(t-j-1)\wedge I} \sum _{i=1}^{(t-j-1)\wedge I} \xi_{i,j+1} \quad\text{ and } \quad \alpha_j^{(t)}= \frac{ [(t-j-1)\wedge I ]s_j^2}{ \sigma_j^2+ [(t-j-1)\wedge I ]s_j^2}. $$

We find that the posterior mean of \(\varPhi_{j}\) is a credibility weighted average between the sample mean \(\overline {\xi}_{j}^{(t)}\) and the prior mean \(\phi_{j}\) with credibility weight \(\alpha_{j}^{(t)}\). For non-informative prior information we let \(s_{j}\to\infty\) and find that \(\alpha_{j}^{(t)}\to1\), which means that we give full credibility to the observation based parameter estimate \(\overline{\xi}_{j}^{(t)}\). For perfect prior information we send \(s_{j}\to0\) and find that \(\alpha_{j}^{(t)}\to0\), i.e. we give full credibility to the prior estimate \(\phi_{j}\) and we are back in the situation of Model Assumptions 7.9.
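The credibility formula can be checked numerically for a single development period j. The following sketch computes the credibility weight, the posterior mean in its credibility form and the posterior variance from the standard Gaussian conjugate update, and compares the mean with the precision-weighted form. The function name and all numerical inputs are hypothetical.

```python
import numpy as np

def posterior_parameters(xi, phi_prior, s_prior, sigma):
    """Posterior mean/variance of Phi_j for xi ~ N(Phi_j, sigma^2) and prior N(phi_prior, s_prior^2)."""
    xi = np.asarray(xi, dtype=float)
    n = len(xi)  # plays the role of (t - j - 1) ^ I, assumed to be at least 1
    alpha = n * s_prior**2 / (sigma**2 + n * s_prior**2)     # credibility weight
    post_mean = alpha * xi.mean() + (1.0 - alpha) * phi_prior  # credibility form of the mean
    post_var = 1.0 / (1.0 / s_prior**2 + n / sigma**2)         # Gaussian conjugate update
    return post_mean, post_var, alpha

xi_obs = [-0.25, -0.10, -0.40]            # hypothetical observations xi_{i,j+1}
phi_prior, s_prior, sigma = -0.3, 0.2, 0.1  # hypothetical prior and variance parameters
post_mean, post_var, alpha = posterior_parameters(xi_obs, phi_prior, s_prior, sigma)

# Precision-weighted form of the same posterior mean, for comparison.
precision_form = post_var * (phi_prior / s_prior**2 + sum(xi_obs) / sigma**2)
print(alpha, post_mean, precision_form, post_var)

# A vague prior (large s_prior) pushes the credibility weight towards 1.
print(posterior_parameters(xi_obs, phi_prior, 100.0, sigma)[2])
```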

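Theorem 10.2 implies the following corollary for the Bayesian predictor. We define the posterior chain-ladder factors for l=0,…,J−1 and \(t\ge I\) by

$$ f_l^{(t)}= \mathbb{E} \bigl[ \exp \bigl\{ \varPhi_l+\sigma_l^2\big/2 \bigr\} \big\vert \mathcal{T}_t \bigr]+1. $$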

Corollary 10.3

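Under Model Assumptions 10.1 we obtain for \(i+j\ge t\ge I\) (j<J)

$$ \mathbb{E} \bigl[C_{i,j+1} \big\vert \mathcal{T}_t \bigr] = C_{i,t-i} \prod_{l=t-i}^{j} f_l^{(t)}, $$

with posterior chain-ladder factors at time \(t\ge I\)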

$$ f_l^{(t)}= \exp \bigl\{ \phi_l^{(t)}+ \bigl(s_l^{(t)}\bigr)^2\big/2+\sigma_l^2\big/2 \bigr\}+1. $$

Moreover, \((f_{l}^{(t)})_{t\ge I}\) are \((\mathbb{P}, \mathbb {T})\)-martingales for all l=0,…,J−1.

One should compare Lemma 7.10 and Corollary 10.3. We now obtain a chain-ladder factor \(f_{l}^{(t)}\) that incorporates the experience contained in the set of observations \(\mathcal{T}_{t}\), i.e. Bayesian inference tells us how we need to adjust the parameters w.r.t. the collected observations up to time t.

Corollary 10.4

Under Model Assumptions 10.1 the VaPo at time \(t\ge I\) is given by

$$ \mathrm{VaPo}_t(\mathbf{X}_{(t+1)}) =\sum _{i=t+1-J}^{I} C_{i,t-i} ~\sum _{j=t-i+1}^{J} ~\prod_{l=t-i}^{j-2} f^{(t)}_l ~ \bigl(f^{(t)}_{j-1}-1 \bigr) ~\mathfrak{Z}^{(i+j)}, $$

and the discounted best-estimate reserves are

Proof of Corollary 10.4

The VaPo construction follows completely analogously to the one in Example 7.8. However, we would like to emphasize that the choice of the ZCBs as financial basis is motivated by the fact that insurance technical and financial events are independent and that the insurance technical filtration \(\mathbb{T}\) is generated by \(X_{i,j}\). □
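The structure of the VaPo in Corollary 10.4 can be sketched for a single accident year: the expected payment in development year j is the last observed cumulative payment, rolled forward with the chain-ladder factors up to j−1, times \(f_{j-1}-1\). In the sketch below the factors, the cumulative payment and the zero-coupon bond prices are hypothetical numbers, and the function names are illustrative.

```python
import numpy as np

def expected_cash_flows(c_latest, dev_latest, factors):
    """Expected future incremental payments of one accident year.

    c_latest   : last observed cumulative payment C_{i,t-i}
    dev_latest : its development year t-i
    factors    : chain-ladder factors f_0, ..., f_{J-1}
    Returns the expected payments for development years t-i+1, ..., J.
    """
    factors = np.asarray(factors, dtype=float)
    flows = []
    expected_cumulative = float(c_latest)
    for j in range(dev_latest + 1, len(factors) + 1):
        # Expected X_{i,j} = (expected C_{i,j-1}) * (f_{j-1} - 1)
        flows.append(expected_cumulative * (factors[j - 1] - 1.0))
        expected_cumulative *= factors[j - 1]
    return np.array(flows)

factors = [1.5, 1.2, 1.1]       # hypothetical f_0, f_1, f_2, i.e. J = 3
flows = expected_cash_flows(100.0, 0, factors)
zcb_prices = np.array([0.98, 0.95, 0.92])  # hypothetical zero-coupon bond prices per payment date

nominal_reserve = flows.sum()
discounted_reserve = float(np.dot(flows, zcb_prices))
print(flows, nominal_reserve, discounted_reserve)
```

The nominal reserve telescopes to the last observed cumulative payment times the product of the remaining factors minus one, which gives a quick consistency check.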

The CDR becomes more sophisticated in this case because we also need to study the update of information in the chain-ladder factor estimates \(f_{l}^{(t)}\). Assume we purchase the VaPo at time t; this generates the value

at time t+1. This value \(V_{t+1}\) should be used to finance the best-estimate liability at time t+1

(10.5)

As a consequence, the CDR at time t+1 is given by

$$ \mathrm{CDR}_{t+1}(\mathbf{X}_{(t+1)})= V_{t+1}-Q^0_{t+1} [\mathbf{X}_{(t+1)} ], $$

with

because Bayesian predictors are \((\mathbb{P},\mathbb {T})\)-martingales for \(t\le i+j\). The variance of the CDR will have a complex form because the observations in \(\mathcal{T}_{t+1}\) also enter the parameter estimates. Namely, we obtain the following lemma.

Lemma 10.5

Under Model Assumptions 10.1 the chain-ladder factor estimates have the following recursive structure for \(j>t-I-1\)

$$ \phi_j^{(t+1)}=\beta_j^{(t)}~ \xi_{t-j, j+1}+ \bigl(1-\beta_j^{(t)} \bigr) \phi_j^{(t)}, $$

with credibility weight

$$ \beta_j^{(t)}= \frac{s_j^2}{ \sigma_j^2+(t-j)s_j^2}. $$

For \(j\le t-I-1\) we have \(\phi_{j}^{(t+1)}=\phi_{j}^{(t)}\).

Proof of Lemma 10.5

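Theorem 10.2 implies, see (10.4),

$$ \phi_j^{(t+1)}=\alpha_j^{(t+1)}~ \overline{\xi}_j^{(t+1)}+ \bigl(1-\alpha_j^{(t+1)} \bigr) \phi_j, $$

with \(\mathcal{T}_{t+1}\)-measurable sample mean \(\overline{\xi}_{j}^{(t+1)}\) and credibility weight \(\alpha_{j}^{(t+1)}\). Note that the case \(j\le t-I-1\) immediately follows because in that case we have \([(t-j-1)\wedge I]=[(t-j-1+1)\wedge I]=I\). Hence, there remains the case \(j>t-I-1\). In that case we find for the sample mean

$$ \overline{\xi}_j^{(t+1)}= \frac{1}{t-j} \sum_{i=1}^{t-j} \xi_{i,j+1} = \frac{1}{t-j}~ \xi_{t-j,j+1} + \frac{t-j-1}{t-j}~ \overline{\xi}_j^{(t)}. $$

So the first credibility term gives

$$ \alpha_j^{(t+1)}~ \overline{\xi}_j^{(t+1)}= \frac{s_j^2}{ \sigma_j^2+(t-j)s_j^2}~ \xi_{t-j,j+1} + \frac{(t-j-1)s_j^2}{ \sigma_j^2+(t-j)s_j^2}~ \overline{\xi}_j^{(t)} = \beta_j^{(t)}~ \xi_{t-j,j+1} + \bigl(1-\beta_j^{(t)} \bigr) \alpha_j^{(t)}~ \overline{\xi}_j^{(t)}. $$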

In a similar fashion we obtain for the second credibility term

$$ \bigl(1-\alpha_j^{(t+1)} \bigr)\phi_j= \frac{\sigma_j^2}{ \sigma_j^2+(t-j)s_j^2}~ \phi_j = \bigl(1-\beta_j^{(t)} \bigr) \bigl(1-\alpha_j^{(t)} \bigr) ~\phi_j. $$

Collecting all the terms proves the claim. □
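The recursion of Lemma 10.5 can be verified numerically: updating the posterior mean one observation at a time with the weight \(\beta_j^{(t)}\) must reproduce the credibility formula (10.4) applied to all observations at once. In the following sketch the number of previous observations `n_previous` stands for t−j−1, and all numerical values and function names are hypothetical.

```python
import numpy as np

def batch_posterior_mean(xi, phi_prior, s_prior, sigma):
    """Posterior mean from the credibility formula (10.4), using all observations at once."""
    n = len(xi)
    alpha = n * s_prior**2 / (sigma**2 + n * s_prior**2)
    return alpha * float(np.mean(xi)) + (1.0 - alpha) * phi_prior

def recursive_update(phi_current, xi_new, n_previous, s_prior, sigma):
    """One-step update phi^(t+1) = beta * xi_new + (1 - beta) * phi^(t)."""
    # n_previous + 1 corresponds to t - j in the credibility weight beta_j^(t).
    beta = s_prior**2 / (sigma**2 + (n_previous + 1) * s_prior**2)
    return beta * xi_new + (1.0 - beta) * phi_current

xi = [-0.25, -0.10, -0.40, -0.31]           # hypothetical observations for one development period
phi_prior, s_prior, sigma = -0.3, 0.2, 0.1   # hypothetical prior and variance parameters

# Start from the posterior mean based on the first observation and add one diagonal at a time.
phi = batch_posterior_mean(xi[:1], phi_prior, s_prior, sigma)
for n_previous in range(1, len(xi)):
    phi = recursive_update(phi, xi[n_previous], n_previous, s_prior, sigma)

phi_batch = batch_posterior_mean(xi, phi_prior, s_prior, sigma)
print(phi, phi_batch)
```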

This lemma enables us to rewrite the chain-ladder factor in the following way.

Proposition 10.6

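Under Model Assumptions 10.1, for \(j>t-I-1\), the chain-ladder factors are given by

$$ f_j^{(t+1)}= \exp \bigl\{ \beta_j^{(t)}~ \xi_{t-j,j+1}+ \bigl(1-\beta_j^{(t)} \bigr) \phi_j^{(t)}+ \bigl(s_j^{(t+1)} \bigr)^2\big/2+\sigma_j^2\big/2 \bigr\}+1. $$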

This result allows for the calculation of the variance of the CDR, see also (10.5). The crucial point is that the new observations on the diagonal enter the calculation in a simple log-normal way. We will come back to this calculation in the context of nominal reserves, see Sect. 10.3 below.

Finally, we aim to calculate the risk-adjusted reserves. To this end we need to identify the risk factors so that we can use the FKG inequality (8.11). The simplest choice is to use \(\xi_{i,j}\) and \(\varPhi_{j}\) as risk factors. We choose a coordinate-wise increasing function in these risk factors. For Gaussian risk factors the exponential function is at hand. Therefore we make the following choice for the probability distortion, see also formula (4.4) in Wüthrich et al. [169],

$$ \varphi_n^{T} = \prod_{j=1}^J \exp \Biggl\{ \sum _{i=1}^I \alpha\xi_{i,j} + \widetilde{\alpha}\varPhi_{j-1} - (I \alpha+\widetilde{\alpha}) \phi_{j-1} -(I\alpha+\widetilde{\alpha})^2 \frac{s_{j-1}^2}{2} -I\alpha^2 \frac{\sigma_{j-1}^2}{2} \Biggr\}, $$
(10.6)

where \(\alpha,\widetilde{\alpha}\ge0\) are fixed constants. \(\varphi_{n}^{T}\) satisfies the necessary normalization condition. Namely, using the conditional independence of the \(\xi_{i,j}\) and the independence of the \(\varPhi_{j}\), we obtain

The probability distortions for \(t\le n\) are then defined by (martingale property)

Theorem 10.7

Under Model Assumptions 10.1 we have for \(k>t\ge I\) and \(i\in\{k-J,\ldots,I\}\)

with risk-adjusted chain-ladder factors

$$ f_l^{(+t)}= \bigl(f_l^{(t)}-1 \bigr) \exp \bigl\{\bigl(\widetilde{\alpha}+ \alpha\bigl[I-(t-l-1)\bigr]\bigr) \bigl(s_l^{(t)}\bigr)^2 + \alpha \sigma_l^2 \bigr\}+1. $$

In comparison to Theorem 8.12 we obtain an additional loading term that corresponds to parameter uncertainty. For \(l\ge t-I\ge0\) and \(\alpha,\widetilde{\alpha}\ge0\) we define

$$ \tau_{t,l}(\widetilde{\alpha},\alpha)= \exp \bigl\{\bigl(\widetilde{ \alpha}+ \alpha\bigl[I-(t-l-1)\bigr]\bigr) \bigl(s_l^{(t)} \bigr)^2 + \alpha\sigma_l^2 \bigr\} \ge1. $$

In view of Theorem 10.7 we obtain

$$ f_l^{(+t)} = \bigl(f_l^{(t)}-1 \bigr) \tau_{t,l}(\widetilde{ \alpha},\alpha)+1\ge f_l^{(t)}. $$
(10.7)

The posterior chain-ladder factors \(f_{l}^{(t)}\) provide the best-estimate reserves at time t, the chain-ladder factors \(f_{l}^{(+t)}\) provide risk-adjusted reserves that consider both process risk in \(\xi_{i,j}\) and parameter uncertainty in \(\varPhi_{j}\) and account for the corresponding market-value margin.
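To make the loading in (10.7) concrete, the following sketch evaluates the best-estimate factor \(f_l^{(t)}\), the loading \(\tau_{t,l}(\widetilde{\alpha},\alpha)\) and the risk-adjusted factor \(f_l^{(+t)}\) for one development period. The argument `n_open` stands for I−(t−l−1); the posterior parameters and risk aversion parameters used in the call are hypothetical, and the function name is illustrative.

```python
import numpy as np

def risk_adjusted_factor(phi_post, s_post_sq, sigma_sq, alpha, alpha_tilde, n_open):
    """Return (f_l^(t), tau_{t,l}, f_l^(+t)) for one development period l.

    phi_post, s_post_sq : posterior mean and variance of Phi_l at time t
    sigma_sq            : process variance sigma_l^2
    alpha, alpha_tilde  : risk aversion parameters (process risk, parameter uncertainty)
    n_open              : the term I - (t - l - 1)
    """
    f_best = np.exp(phi_post + s_post_sq / 2.0 + sigma_sq / 2.0) + 1.0
    tau = np.exp((alpha_tilde + alpha * n_open) * s_post_sq + alpha * sigma_sq)
    f_plus = (f_best - 1.0) * tau + 1.0
    return f_best, tau, f_plus

# Hypothetical inputs for a single development period.
f_best, tau, f_plus = risk_adjusted_factor(
    phi_post=-2.0, s_post_sq=0.002, sigma_sq=0.04, alpha=0.04, alpha_tilde=0.10, n_open=6
)
print(f_best, tau, f_plus)

# Without risk aversion the loading disappears and both factors coincide.
print(risk_adjusted_factor(-2.0, 0.002, 0.04, 0.0, 0.0, 6))
```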

Proof of Theorem 10.7

The proof is similar to the one of Theorem 8.12. We first condition on the knowledge of the chain-ladder parameters Φ. We obtain for \(k>t\ge I\)

We start with analyzing \(\varphi_{n}^{T}\). It is given by

This means that, conditionally on Φ, the first term is the only random term. We define

For k>t we apply the tower property in the first step to

in the second step we have used the conditional independence of \(\xi_{i,j}\), given Φ. Putting all the pieces together we find

(10.8)

There are three important observations that allow us to calculate (10.8). The first is , the second comes from Theorem 10.2, namely we have posterior independence of the \(\varPhi_{j}\), conditionally given \(\mathcal{T}_{t}\). This implies that expected values over the products of \(\varPhi_{j}\) in (10.8) can be rewritten as products of expected values. The third observation is that in the expected value of (10.8) we have exactly the same product terms as in \(\varphi_{t}^{T}\) except for the development periods \(j\in\{t-i,\ldots,k-i-1\}\). This implies that all terms cancel except the ones that belong to these development parameters. If we in addition cancel all constants and \(\mathcal{T}_{t}\)-measurable terms we arrive at

So there remains the calculation of the terms in the product of the right-hand side of the equality above. Using Theorem 10.2 we obtain for \(j\in\{t-i,\ldots,k-i-1\}\)

This proves Theorem 10.7. □

The protected VaPo including a protection against parameter uncertainty in the chain-ladder parameters \(\varPhi_{j}\) is now given by

$$ \mathrm{VaPo}^\mathrm{prot}_t(\mathbf{X}_{(t+1)}) = \sum_{i=t+1-J}^{I} C_{i,t-i} ~\sum _{j=t-i+1}^{J} ~\prod _{l=t-i}^{j-2} f^{(+t)}_l \bigl(f^{(+t)}_{j-1}-1 \bigr) \mathfrak{Z}^{(i+j)}, $$

and the discounted risk-adjusted reserves are

In order to calculate the expected CDR gain of the risk-adjusted reserves we need (10.7) and the martingale property from Corollary 10.3. This immediately implies for \(s\le t\) (the first equality sign is a definition)

(10.9)

Moreover, this provides the following corollary.

Corollary 10.8

Under Model Assumptions 10.1 for \(t>s\ge I\) the expected discounted risk-adjusted reserves are given by

Proof of Corollary 10.8

From Proposition 10.6 we have the following recursive structure for t>I

Therefore, conditional on \(\mathcal{T}_{t-1}\), \(\xi_{t-1-l,l+1}\) is the only random term in \(f_{l}^{(t)}\). Since these terms all belong to different accident years and development periods for \(l\in\{t-i,\ldots,J-1\}\) we have posterior independence conditional on \(\mathcal{T}_{t-1}\), which implies for \(t>s\ge I\)

Iteration of this argument and (10.9) imply the claim. This completes the proof. □

For the expected CDR gain for risk-adjusted reserves we then obtain

The first term on the right-hand side is the expected value generated by the protected \(\mathrm{VaPo}^{\mathrm{prot}}_{t}(\mathbf{X}_{(t+1)})\) at time t+1, the second term is the expected payments in accounting year t+1 and the expected remaining risk-adjusted reserves. The latter are calculated with Corollary 10.8.

Example 10.9

(Hertig’s model with parameter uncertainty)

We revisit the numerical example from Sect. 8.3.1 but apply Model Assumptions 10.1. In order to do this analysis we need to choose the prior parameters of \(\varPhi_{j}\) as well as the variance parameters \(\sigma_{j}^{2}\). For \(\phi_{j}\) and \(\sigma_{j}\) we choose the values given in Table 8.1. Moreover, we choose the prior coefficient of variation for all j=0,…,J−1

$$ \mathrm{Vco} \bigl(\exp\{ \varPhi_j \} \bigr)= \bigl(\exp\bigl \{s_j^2\bigr\} -1 \bigr)^{1/2} = 20~\%. $$
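The coefficient of variation above determines the prior standard deviation via \(s_j^2=\log(1+\mathrm{Vco}^2)\), which follows by inverting the displayed formula. A short sketch of this inversion is given below; the function name is an illustrative choice.

```python
import math

def prior_sd_from_vco(vco):
    """Invert Vco(exp{Phi_j}) = (exp{s_j^2} - 1)^{1/2} for the prior standard deviation s_j."""
    return math.sqrt(math.log1p(vco**2))

s_j = prior_sd_from_vco(0.20)
print(s_j)

# Round trip: plugging s_j back into the Vco formula recovers the 20 % input.
print(math.sqrt(math.expm1(s_j**2)))
```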

With these parameter choices we are able to calculate the credibility weights \(\alpha_{j}^{(t)}\) and the posterior means \(\phi_{j}^{(t)}\). In Fig. 10.1 we present the prior means ϕ j , sample means \(\overline{\xi}_{j}^{(t)}\) and posterior means \(\phi_{j}^{(t)}\) based on the data with t=17. We see that the posterior mean smooths the sample mean using the prior mean with credibility weights \(1-\alpha_{j}^{(t)}\). With Corollaries 10.3 and 10.4 we can then calculate the best-estimate reserves and the corresponding valuation portfolio. In Table 10.1 we present these best-estimate reserves both under Model Assumptions 7.9 (no parameter uncertainty) and Model Assumptions 10.1 (with parameter uncertainty), see also Sect. 8.3.1. We see that the resulting reserves are very similar, which is not surprising in view of Fig. 10.1.

Fig. 10.1 Prior means \(\phi_{j}\), sample means \(\overline{\xi}_{j}^{(t)}\) and posterior means \(\phi_{j}^{(t)}\) for t=17

Table 10.1 Discounted best-estimate reserves and nominal best-estimate reserves under Model Assumptions 7.9 (no parameter uncertainty) and Model Assumptions 10.1 (with parameter uncertainty)

In the next step we calculate the risk-adjusted reserves using risk aversion parameter α=4 % (for process risk) and \(\widetilde{\alpha}= 10~\%\) (for parameter uncertainty). These choices provide the risk-adjusted reserves given in Table 10.2. These parameter choices substantially add to the market-value margin compared to the model without parameter uncertainty. Another observation that often holds true is that the market-value margin has a comparable size to the discounting effect (of course this observation depends on the choices of the risk aversion parameters and the term structure of interest rates, but remarkably it often holds true in practice).

Table 10.2 Best-estimate reserves, risk-adjusted reserves and the market-value margin \(\mathrm{MVM}_{17}^{\boldsymbol{\varphi}}(\mathbf{X}_{(18)})\) under Model Assumptions 7.9 (no parameter uncertainty) and Model Assumptions 10.1 (with parameter uncertainty)

Finally, we revisit Figs. 8.3 and 8.4 but under Model Assumptions 10.1. The nominal reserves are given by setting P(k,i+j)≡1; this provides for \(k\ge I\)

In order to calculate the run-off patterns we need to calculate the weights (8.31) and (8.32). Using Theorem 10.7 and Corollary 10.8 we find for nominal reserves (\(k\ge t\))

This provides Figs. 10.2 and 10.3 for Hertig’s claims reserving model with parameter uncertainty. We see that the parameter uncertainty increases the market-value margin and hence the risk-adjusted reserves (Fig. 10.2 versus Fig. 8.3). The expected CDR gain of the risk-adjusted (nominal) reserves is 312 in this model. In Fig. 10.3 we plot the run-off pattern of the expected best-estimate reserves and the expected market-value margin run-off pattern. The first pattern \((v_{t},v_{t+1},\ldots,v_{n-1})\) denotes the run-off of the market-value margin under Model Assumptions 10.1 and the second \((v_{t}^{(2)},v_{t+1}^{(2)},\ldots,v_{n-1}^{(2)})\) the market-value margin under Model Assumptions 7.9. We observe that these two patterns almost completely coincide, which says that the run-off of the relative uncertainties is very similar.

Fig. 10.2 Expected run-offs of the best-estimate reserves and the expected market-value margin for k=17,…,n−1 under Model Assumptions 10.1

Fig. 10.3 Best-estimate reserves run-off pattern \((w_{t},w_{t+1},\ldots,w_{n-1})\) and market-value margin run-off pattern \((v_{t},v_{t+1},\ldots,v_{n-1})\) under Model Assumptions 10.1 and \((v_{t}^{(2)},v_{t+1}^{(2)},\ldots,v_{n-1}^{(2)})\) under Model Assumptions 7.9

We conclude that Model Assumptions 10.1 provide a sensible model for solvency considerations of non-life insurance run-offs. The remaining thing that needs to be done is the calibration of the risk aversion parameters α and \(\widetilde{\alpha}\) so that one obtains a suitable market-value margin in a regulatory solvency model.

2.2 Modeling of Longevity Risk

In this section we revisit the life-time annuity Examples 7.6, 7.16 and 8.8. The VaPo at time t was given by

$$ \mathrm{VaPo}_t(\mathbf{X}_{(t+1)})= \sum _{k=t+1}^{55} \Biggl(\prod _{s=t+1}^{k} p_{x+s} \Biggr) L_{x+t} ~a~\mathfrak{I}, $$

and the protected VaPo by

$$ \mathrm{VaPo}^\mathrm{prot}_t(\mathbf{X}_{(t+1)})= \sum_{k=t+1}^{55} \Biggl(\prod _{s=t+1}^{k} p^+_{x+s} \Biggr) L_{x+t} ~a~\mathfrak{I}. $$

The second order life table was constructed with the Gompertz [77] mortality law

$$ p_{x+t+1}=p_{x+t+1}(m) =\exp \bigl\{ - e^{(x+t-m)/\varsigma} \bigl( e^{1/\varsigma}-1 \bigr) \bigr\}, $$

and the first order life table was obtained with a probability distortion resulting in

$$ p^+_{x+t+1} =\exp \bigl\{ - e^{(x+t-m^+)/\varsigma} \bigl( e^{1/\varsigma}-1 \bigr) \bigr\}, $$

with \(m^{+}\ge m\). In Sect. 8.2.4 we have argued that single lives are i.i.d. and then we have chosen the span probability distortions by (using single lives as risk drivers)

$$ \breve{\varphi}_{t+1}^{T} = \prod _{i=1}^{L_{x+t}} \breve{\varphi}_{t+1}^{T} \bigl(Y_{x+t+1}^{(i)}\bigr), $$

where \(Y_{x+t+1}^{(i)}\) is the \(\mathcal{T}_{t+1}\)-measurable indicator whether person i of portfolio \(L_{x+t}\) survives the period (t,t+1], and

$$ \breve{\varphi}_{t+1}^{T}(1)=\frac{p_{x+t+1}^+}{p_{x+t+1}} \qquad \text{ and } \qquad \breve{\varphi}_{t+1}^{T}(0)= \frac{1-p_{x+t+1}^+}{1-p_{x+t+1}} =\frac{q_{x+t+1}^+}{q_{x+t+1}}. $$
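The Gompertz survival probabilities and the resulting span distortion for a single life can be evaluated directly. The sketch below uses hypothetical parameter values (m = 88, \(m^+\) = 92.4, ς = 10 and age x+t = 65) that are not taken from the text, and it checks that the distortion has mean one under the second order (best-estimate) survival probability.

```python
import math

def gompertz_survival(age, m, varsigma):
    """One-year survival probability p_{age+1}(m) = exp{-e^{(age-m)/varsigma} (e^{1/varsigma} - 1)}."""
    return math.exp(-math.exp((age - m) / varsigma) * math.expm1(1.0 / varsigma))

age, m, m_plus, varsigma = 65, 88.0, 92.4, 10.0   # hypothetical values with m_plus >= m

p = gompertz_survival(age, m, varsigma)            # second order (best-estimate) survival
p_plus = gompertz_survival(age, m_plus, varsigma)  # first order (prudent) survival

phi_survive = p_plus / p                   # distortion weight if the person survives
phi_die = (1.0 - p_plus) / (1.0 - p)       # distortion weight if the person dies

# Normalization: the distortion has expected value 1 under the best-estimate probabilities.
normalization = p * phi_survive + (1.0 - p) * phi_die
print(p, p_plus, phi_survive, phi_die, normalization)
```

Since survival is the adverse event for a life-time annuity, the distortion puts more weight on survival (weight above one) and less on death (weight below one).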

This then implies that, see (8.13),

(10.10)

Note that these considerations were all done conditional on the knowledge of the true model parameters m and ς. Assume that we do not know the true value for m. Then, of course, an i.i.d. deflection in the probability distortion is not appropriate because we do not only have the pure stochastic risk that people do not die as expected. We have parameter risk in m which is much worse, because if m is not appropriate for one person it is not appropriate for all people. Thus, there is no diversification in this uncertainty. In particular, if the true m is larger than the chosen value we have longevity misspecification, which means that people live longer in expectation than we have built best-estimate reserves for.

Assume that \(m^{+}=(1+\alpha)m\) for a risk aversion parameter α≥0 for process risk. Then, conditional on the model parameter m, (10.10) is rephrased as

Note that this is an increasing function in the unknown parameter m. We now choose a prior distribution h on \(\mathbb{R}_{+}\) which specifies our knowledge about the true value of m. If we have poor knowledge we will choose a vague prior distribution h (having a considerable variance), for good knowledge we choose a prior distribution h that is very concentrated around the expected value of m. This then provides

In particular, this means that we should evaluate the posterior distribution of m, conditional on the collected information up to time t. Similar to (10.6) we would like to add a distortion part for parameter uncertainty. The simplest way to achieve this is to consider m as an additional risk driver and then to choose an increasing function in m. For simplicity we choose the function \(\exp \{\widetilde{\alpha}m \}\) with risk aversion parameter \(\widetilde{\alpha}\ge0\) (but of course any other strictly positive and increasing function would also do). The new span probability distortion including risk drivers for process risk and parameter uncertainty is then defined by, under suitable integrability assumptions,

where \(\mathbb{E} [\exp \{\widetilde{\alpha}m \} \vert \mathcal{T}_{t} ]\) is the conditional moment generating function of m at position \(\widetilde{\alpha}\), given \(\mathcal{T}_{t}\). Because \(\breve{\varphi}_{t+1}^{T}\) is normalized, conditional on m, see Sect. 8.2.4, we have that

This implies that the corresponding probability distortion is a normalized \((\mathbb{P},\mathbb{T})\)-martingale as required in Assumption 6.3. The probability distorted mean is then given by

(10.11)

The risk aversion parameter α≥0 accounts for process risk and the risk aversion parameter \(\widetilde{\alpha} \ge0\) for parameter uncertainty giving the total market-value margin. This should be compared to the best-estimate reserves given by

(10.12)

In general, (10.11) and (10.12) cannot be calculated directly. Therefore, we briefly present importance sampling in the next example. Note that for simplicity we only present one period \(t\mapsto t+1\) in this path-dependent problem.

Example 10.10

(Importance sampling)

In this example we briefly sketch importance sampling, a simulation technique that allows one to evaluate expressions like (10.11) and (10.12). Broadly speaking, importance sampling is counted among the Markov chain Monte Carlo (MCMC) methods. However, MCMC methods go far beyond importance sampling and can be applied to much more general situations. Probably the two most powerful MCMC methods are the Metropolis–Hastings [80, 116] algorithm and the Gibbs sampling algorithm. For a detailed introduction we refer to Gilks et al. [75], Asmussen–Glynn [8], Johansen et al. [91] and Robert–Casella [129].

In order to apply the importance sampling technique we need an explicit prior distributional assumption for m. We assume that m has a gamma distribution with parameters γ and c so that \(\mathbb{E} [m ]=\gamma/c\) and \(\operatorname{Var}(m)=\gamma/c^{2}\). Moreover, we assume that

i.e. the insurance technical filtration \(\mathbb{T}\) is generated by the numbers of survivors \(L_{x+s}\) only. This implies that for \((l_{x+s})_{s\le t}\), with \(l_{x+s}\le l_{x+s-1}\) for all s (i.e. \(d_{x+s}=l_{x+s-1}-l_{x+s}\ge0\)), we have

where, of course, we have applied all the necessary conditional independence assumptions between individual lives. Hence, the posterior distribution of m, given the observations \((l_{x+s})_{s\le t}\), has density

where the normalizing constant is given by

Using this posterior density we need to calculate (10.11), rewritten as

(10.13)

and (10.12), rewritten as

Unfortunately, these three integrals cannot be calculated explicitly because the posterior density does not have a nice form and because the normalizing constant cannot be calculated in closed form. Therefore, the aim is to get rid of the normalizing constant and to modify the integrals such that we can apply Monte Carlo simulation methods. First we treat (10.13). Note that we have an explicit form for

Therefore, in a first step we divide the numerator and the denominator in (10.13) by the normalizing constant and we see that it cancels in the evaluation of the two integrals. Secondly, we choose a random variable \(\widetilde{m}\) which has the same support as m, from which we can easily simulate i.i.d. samples and which has an explicit form for its density denoted by \(\widetilde{h}(\cdot)\). With this we modify (10.13) as follows

with random variable \(\widetilde{m}\sim\widetilde{h}(\cdot)\) (under \(\mathbb{P}_{t}\), given \(\mathcal{T}_{t}\)) and importance weights

which have an explicit form. But now numerical evaluation of these integrals is straightforward: sample \(\widetilde{m}_{1}, \ldots, \widetilde{m}_{T}\) i.i.d. from \(\widetilde{h}(\cdot)\) and obtain sample estimate

To obtain fast convergence the sampling distribution \(\widetilde {h}(\cdot)\) should be chosen such that the importance weights \(w_{t}(\cdot)\) do not become heavy tailed. For (10.12) we proceed analogously. All that we need to change is to set \(\alpha=\widetilde{\alpha}=0\) in the above estimate and then we obtain sample estimate

We remark that this procedure works in great generality: basically the only thing that we need is a closed form for the density up to the normalizing constant, and then we can apply importance sampling. The weakness of importance sampling is that we do not obtain the whole posterior distribution and it only works well in low dimensions. For higher dimensions one usually applies other MCMC methods which provide full posterior distributions. This closes the example.
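The mechanics of the example can be sketched in a few lines. The code below is a toy version under stated assumptions and not the calculation of (10.13) itself: it uses a gamma prior for m, a binomial likelihood for hypothetical survivor counts under the Gompertz law, and the prior as sampling density \(\widetilde{h}\), so that the importance weights reduce to the likelihood. The normalizing constant is never needed because it cancels in the self-normalized ratio. All data, parameter values and function names are hypothetical.

```python
import numpy as np

def gompertz_survival(age, m, varsigma):
    """One-year Gompertz survival probability at the given age, vectorized in m."""
    return np.exp(-np.exp((age - m) / varsigma) * np.expm1(1.0 / varsigma))

def log_likelihood(m, ages, survivors, varsigma):
    """Binomial log-likelihood (up to a constant) of the survivor counts, given m."""
    m = np.atleast_1d(np.asarray(m, dtype=float))
    ll = np.zeros_like(m)
    for age, l_prev, l_next in zip(ages, survivors[:-1], survivors[1:]):
        p = gompertz_survival(age, m, varsigma)
        # l_next lives survive the year, l_prev - l_next lives die.
        ll += l_next * np.log(p) + (l_prev - l_next) * np.log1p(-p)
    return ll

def importance_sampling_mean(g, gamma_shape, c_rate, ages, survivors, varsigma,
                             n_samples=50_000, seed=0):
    """Self-normalized importance sampling estimate of E[g(m) | observations]."""
    rng = np.random.default_rng(seed)
    # Sampling density = gamma prior with mean gamma_shape / c_rate (an assumption of this sketch).
    m_tilde = rng.gamma(shape=gamma_shape, scale=1.0 / c_rate, size=n_samples)
    log_w = log_likelihood(m_tilde, ages, survivors, varsigma)
    w = np.exp(log_w - log_w.max())  # rescaling avoids underflow and cancels in the ratio
    return float(np.sum(w * g(m_tilde)) / np.sum(w))

ages = [65, 66, 67, 68, 69]                     # hypothetical ages at the start of each year
survivors = [1000, 990, 978, 965, 950, 934]     # hypothetical survivor counts l_{x+s}
gamma_shape, c_rate, varsigma = 484.0, 5.5, 10.0  # prior mean 88 and standard deviation 4

posterior_mean_m = importance_sampling_mean(lambda m: m, gamma_shape, c_rate,
                                            ages, survivors, varsigma)
print(posterior_mean_m)
```

Using the prior as sampling density is convenient but not necessarily efficient. If the data are very informative, most weights become negligible and a sampling density closer to the posterior would be preferable, in line with the remark on heavy tailed importance weights.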

3 Cost-of-Capital Loading in Practice

3.1 General Considerations

In the introduction to Chap. 8 we have stated that risk-adjusted reserves should be such that the “liabilities could be transferred between two knowledgeable and willing parties in an arm’s length transaction at that amount”. We have used the concept of probability distortion in order to calculate a market-value margin which reflects the corresponding (marked-to-model) price for risk bearing beyond best-estimate reserves. The crucial object was the insurance technical probability distortion \(\varphi^{T}\) which, through risk aversion, determines the size of the risk margin. In practice, one often uses an approach different from probability distortions. We have briefly touched on this different approach in the rule of thumb (8.30) and in Sect. 9.4.4 on dividend payments. The argumentation is the following: for each accounting year t a shareholder provides protection against shortfalls in terms of solvency capital. Since he may lose part of this capital in case of an adverse event, he expects a return (dividend rate) that is above the risk-free rate. Often, one chooses a constant cost-of-capital spread \(\mathrm{sp}_{\mathrm{CoC}}>0\) above the risk-free rate \(r_t\), which then provides the cost-of-capital rate defined by

$$ r^{(t)}_\mathrm{CoC}= r_t + \mathrm{sp}_\mathrm{CoC}~>~r_t. $$

In that case the risk-adjusted reserves at time t are defined by

where \(\mathrm{MVM}_{t}^{\mathrm{CoC}}(\mathbf{X}_{(t+1)})\) is now a margin that is sufficient to fulfill all future cost-of-capital payments defined by

$$ X_{s+1}^\mathrm{CoC}= r^{(s)}_\mathrm{CoC}~ \rho_s \bigl(-\mathrm{CDR}_{s+1}(\mathbf {X}_{(s+1)}) \bigr), $$

for \(s\ge t\). One should compare this to (8.7). In the language of Sect. 9.4.1 we describe the insurance technical result in accounting year \(s\ge t\) for the best-estimate reserves by \(-\mathrm{CDR}_{s+1}(\mathbf{X}_{(s+1)})\). In order to protect against possible shortfalls in this insurance technical result in accounting year s, we need to hold as a buffer the risk measure \(\rho_s(-\mathrm{CDR}_{s+1}(\mathbf{X}_{(s+1)}))\). This risk measure is provided by a risk-averse investor who expects return \(r^{(s)}_{\mathrm{CoC}}\) on his investment. Thus, the market-value margin in this cost-of-capital approach is defined by

(10.14)

Remarks

  • Depending on whether dividends are paid at the beginning or at the end of accounting year (s,s+1] we need to deflate with \(\varphi_{s}^{A}\) or \(\varphi_{s+1}^{A}\), respectively.

  • Definition (10.14) does not consider the uncertainties in the cost-of-capital payments themselves. Note that the cost-of-capital cash flow

    $$ \mathbf{X}_{(t+1)}^\mathrm{CoC}= \bigl(0,\ldots, 0, X_{t+1}^\mathrm{CoC},\ldots, X_{n}^\mathrm{CoC}\bigr) $$
    (10.15)

    is \(\mathbb{F}\)-previsible (one period), but in a fully-fledged approach we should also calculate a risk measure and a market-value margin for the fluctuations (over all periods) in this cost-of-capital cash flow. In most cases this leads to models that are not analytically tractable and require nested simulations, see Salzmann–Wüthrich [139] for an example. To avoid this we could work with probability distortions, as introduced above. Here, for simplicity, we choose the simplified version (10.14).

  • The cost-of-capital cash flow as defined in (10.15) does not consider the possible default of the insurance company, i.e. exercising the limited liability option. This is described in more detail in Möhr [118].

Our aim is to compare \(\mathrm{MVM}_{t}^{\mathrm{CoC}}(\mathbf{X}_{(t+1)})\) and \(\mathrm{MVM}_{t}^{\boldsymbol{\varphi}}(\mathbf{X}_{(t+1)})\) for an explicit example in non-life insurance. This is done in the next subsection.

3.2 Cost-of-Capital Loading Example

In this subsection we compare \(\mathrm{MVM}_{t}^{\mathrm{CoC}}(\mathbf{X}_{(t+1)})\) and \(\mathrm{MVM}_{t}^{\boldsymbol{\varphi}}(\mathbf{X}_{(t+1)})\) for an explicit market-value margin example in non-life insurance. \(\mathrm{MVM}_{t}^{\boldsymbol{\varphi}}(\mathbf{X}_{(t+1)})\) has already been calculated in Sect. 10.2.1. Here, we would like to calculate \(\mathrm{MVM}_{t}^{\mathrm{CoC}}(\mathbf{X}_{(t+1)}) \) in a particular situation. We slightly change the setup of Hertig’s [83] log-normal claims reserving model in order that the calculations do not become “too messy”:

Model Assumptions 10.11

We assume that Model Assumptions 10.1 hold with the only change that

Remarks

  • Under Model Assumptions 10.1 we have assumed that the excess payments \(C_{i,j+1}/C_{i,j}-1\) have a log-normal distribution, whereas now we assume that \(C_{i,j+1}/C_{i,j}\) has a log-normal distribution. The result of this change is that the cost-of-capital formulas simplify. In particular examples, statistical methods should explain which model fits best.

  • Considering excess payments is especially important for modeling claims inflation, see Sect. 10.4. For the current example we refrain from doing so.

  • Of course, at the end, \(\mathrm{MVM}_{t}^{\mathrm{CoC}}(\mathbf{X}_{(t+1)})\) and \(\mathrm{MVM}_{t}^{\boldsymbol{\varphi}}(\mathbf{X}_{(t+1)})\) are not fully compatible since they were calculated in different models. However, our analysis will provide interesting insights.

In order to further simplify the comparison we only consider nominal payments, i.e. we set \(\varphi^{A}\equiv1\). This implies \(r_t=0\), \(P(t,m)=1\) and \(r^{(t)}_{\mathrm{CoC}}= \mathrm{sp}_{\mathrm{CoC}}>0\). Thus, for \(t\ge I\)

and

with chain-ladder factors for \(l\ge s-i\) and \(s=t,t+1\) under Model Assumptions 10.11 given by

The difference between Model Assumptions 10.1 and 10.11 is that all the ±1 terms disappear in the results of the latter, but all the statements of Sect. 10.2.1 remain true in a similar spirit. The nominal CDR is then defined by, see also Merz–Wüthrich [113],

As conditional risk measure we choose the standard deviation based risk measure of Example 9.7, i.e. for given constant ψ>0 we set

note that . We denote the nominal ultimate claim predictor at time t by . Then we can rewrite the nominal CDRs as follows

This immediately gives the following result (see also Wüthrich–Merz [166], Sect. 1.2.1):

Corollary 10.12

Set Model Assumptions 10.11. The sequence is a \((\mathbb {P},\mathbb{T})\)-martingale and the nominal CDRs \(\mathrm{CDR}^{\mathrm{nom}}_{t+1}(\mathbf{X}_{(t+1)})\), , are uncorrelated.

In the following we give two different approaches for the calculation of the cost-of-capital margin. These two approaches are based on different calculations of future conditional risk measures \(\rho_t\) and correspond to two different levels of complexity. The reason for giving two approaches is that in general (10.14) cannot be calculated in closed form. For the standard deviation based risk measure we can calculate \(\rho_s\) explicitly for s=t but we are not able to determine it for s>t. Therefore, the latter is approximated (deterministically).

3.2.1 Cost-of-Capital Loading: Approach 1

In Approach 1 we calculate \(\rho_{t}(-\mathrm{CDR}^{\mathrm{nom}}_{t+1}(\mathbf{X}_{(t+1)}))\) for the standard deviation based risk measure at time t analytically. This provides the next theorem.

Theorem 10.13

Under Model Assumptions 10.11 we have for t=I,…,n−1

with, setting \(i^\ast=\min\{i,l\}=i\wedge l\),

$$ \varDelta_t(i,l) =\exp \bigl\{\bigl(\beta_{t-i^\ast}^{(t)} \bigr)^{1_{\{i\neq l\}}} \bigl[\bigl(s^{(t)}_{t-i^\ast} \bigr)^2+ \sigma_{t-i^\ast}^2 \bigr] \bigr\} \hspace{-.3cm} \prod_{j=t-i^\ast+1}^{J-1} \exp \bigl\{ \bigl(\beta_{j}^{(t)}\bigr)^2 \bigl[ \bigl(s^{(t)}_{j}\bigr)^2+ \sigma_{j}^2 \bigr] \bigr\}. $$

In Approach 1 we then define the conditional risk measures as follows: for \(s\ge t\)

This implies that \(\rho_{s}^{(1)}\) is \(\mathcal{T}_t\)-measurable for all \(s\ge t\). Basically, we choose the conditional risk measure \(\rho_{t}^{(1)}\) of the next accounting year and then scale this risk measure proportionally to the expected run-off of the best-estimate reserves to calibrate the future risk measures \(\rho_{s}^{(1)}\), s>t. This approach is risk-based for accounting year t but it is not risk-based for later accounting years. Approach 1 then provides the market-value margin at time t defined by

$$ \mathrm{MVM}_t^{\mathrm{CoC}(1)}(\mathbf{X}_{(t+1)}) = r^{(t)}_\mathrm{CoC}~\sum_{s \ge t} \rho_s^{(1)}. $$

Note that this approach is very simple since it can be solved analytically. Therefore, approaches taken in practice are often of this type, see e.g. TP.5.41 in QIS5 [64]. We will compare its performance to a more risk-based approach and to the probability distortion approach \(\mathrm{MVM}_{t}^{\boldsymbol{\varphi}}(\mathbf{X}_{(t+1)})\).
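The proportional scaling behind Approach 1 can be illustrated with a small sketch in the nominal setting. All inputs are hypothetical: `rho_t` stands for the analytically calculated risk measure of the next accounting year, `expected_reserves` for the expected run-off of the best-estimate reserves seen from time t, and the function name is an assumption made for illustration.

```python
def coc_margin_proportional(rho_t, expected_reserves, coc_rate):
    """Approach-1-style margin: scale rho_t by the expected reserve run-off."""
    base = expected_reserves[0]
    if base <= 0:
        raise ValueError("best-estimate reserves at time t must be positive")
    # Future risk measures are proportional to the expected remaining reserves.
    rhos = [rho_t * reserve / base for reserve in expected_reserves]
    # Nominal market-value margin: cost-of-capital rate times the sum of risk measures.
    return coc_rate * sum(rhos), rhos


# Hypothetical inputs: reserves expected at times t, t+1, t+2, t+3.
expected_reserves = [1000.0, 600.0, 300.0, 100.0]
mvm_1, rhos_1 = coc_margin_proportional(rho_t=80.0,
                                        expected_reserves=expected_reserves,
                                        coc_rate=0.06)
print(rhos_1, mvm_1)
```

Only the first risk measure is risk-based in this sketch; the later ones inherit their size from the reserve volume alone.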

Proof of Theorem 10.13

We have for t=I,…,n−1

Thus, we need to calculate the second last term. Therefore we consider

where we have used the analog to Proposition 10.6 under Model Assumptions 10.11. Note that this last product only involves stochastic terms \((\xi_{m,t-m+1})_{m\le i}\) that belong to different accident years \(m\le i\), given \(C_{i,t-i}\). Therefore, they are all independent, conditionally given \(\mathcal{T}_t\) and Φ. Moreover, these terms all belong to different development periods, and since we have posterior independence between the components of Φ, given \(\mathcal{T}_t\), see Theorem 10.2, the product decouples into independent terms. We start with i<l and obtain (using this posterior independence)

We need to calculate these last two terms. In the case i=l we obtain

We start with the calculation, using the tower property for conditional expectations in the first step,

The next term provides

Note that similar to Corollary 10.3 we have

and similar to Proposition 10.6 we have (using the martingale property of chain-ladder factors)

Comparing the last three formulas to each other we find

Completely analogously we obtain

This implies for i<l

and for i=l

Collecting all the terms completes the proof. □

3.2.2 Cost-of-Capital Loading: Approach 2

For Approach 2 we use a nice property of conditional expectations, namely that successive predictions constitute a martingale and, hence, the CDRs are uncorrelated, see Corollary 10.12. The total run-off uncertainty at time t for nominal reserves is given by the variability of the differences \(\widehat{C}_{i,J}^{(t)}-C_{i,J}\). If we choose the conditional \(L^2\)-distance to measure this uncertainty (conditional mean square error of prediction (MSEP), see Wüthrich–Merz [166], Sect. 3.1) we obtain for

where in the last step we have used the uncorrelatedness of the CDRs. As a consequence, we know how the total uncertainty is experienced over time via the CDRs. This exactly suggests the second approach, once we have calculated the variances of the CDRs.

Theorem 10.14

Under Model Assumptions 10.11 we have for \(s\ge t\)

For Approach 2 we then define the conditional risk measures as follows: for \(s\ge t\)

This implies that \(\rho_{s}^{(2)}\) is \(\mathcal{T}_t\)-measurable for all \(s\ge t\) and provides a risk-based allocation of the total run-off uncertainty to individual accounting years. Approach 2 then motivates the market-value margin definition at time t

$$ \mathrm{MVM}_t^{\mathrm{CoC}(2)}(\mathbf{X}_{(t+1)}) = r^{(t)}_\mathrm{CoC}~\sum_{s \ge t} \rho_s^{(2)}. $$

Note that this is the approach suggested by Salzmann–Wüthrich [139] but for a different claims reserving model.
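A risk-based allocation in the spirit of Approach 2 can be sketched as follows. The sketch assumes that the variances of the nominal CDRs of the accounting years \(s\ge t\), seen from time t, are already available (here they are hypothetical inputs rather than the output of Theorem 10.14), and it applies the standard deviation based risk measure with constant ψ to each of them. Since the CDRs are uncorrelated, these variances add up to the total run-off uncertainty.

```python
import math


def coc_margin_risk_based(cdr_variances, psi, coc_rate):
    """Approach-2-style margin from per-year CDR variances (assumed given)."""
    # Standard deviation based risk measure for each accounting year s >= t.
    rhos = [psi * math.sqrt(variance) for variance in cdr_variances]
    # Uncorrelated CDRs: the yearly variances sum to the total run-off uncertainty.
    total_msep = sum(cdr_variances)
    return coc_rate * sum(rhos), rhos, total_msep


# Hypothetical CDR variances for accounting years t, t+1, t+2, t+3.
cdr_variances = [1600.0, 900.0, 400.0, 100.0]
mvm_2, rhos_2, msep = coc_margin_risk_based(cdr_variances, psi=2.0, coc_rate=0.06)
print(rhos_2, mvm_2, msep)
```

In contrast to the proportional scaling of Approach 1, every yearly risk measure here is driven by the uncertainty attributed to that year.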

Proof of Theorem 10.14

For \(s\ge t\) we have from the proof of Theorem 10.13

Iteration provides the claim. □

Example 10.15

(Non-life run-off cost-of-capital loadings)

We revisit the data example given in Sect. 8.3.1. However, since we choose a different model now, we need to specify new prior parameters. These are given in Table 10.3.

Table 10.3 Cumulative payments \(C_{i,j}\) for \(i+j\le17\) as well as parameters \(\phi_j\), \(s_j\) and \(\sigma_j\) under Model Assumptions 10.11

These model choices provide the results presented in Table 10.4. We see that the prior parameters are chosen such that the nominal best-estimate reserves almost coincide in the two models. Next, we calculate the standard deviations of the CDRs. We obtain from Theorem 10.13

Using the expected run-off of the best-estimate reserves we can then calculate the conditional risk measures \(\rho_{s}^{(1)}\). We choose ψ=2, which corresponds to a confidence interval of two standard deviations, and we choose cost-of-capital rate \(r_{\mathrm{CoC}}^{(t)}=6~\%\) (for the nominal considerations we choose risk-free rate \(r_t=0\)). Similarly, we can use Theorem 10.14 which provides the conditional risk measures \(\rho_{s}^{(2)}\). This provides the market-value margins presented in Table 10.5. We conclude that cost-of-capital loading Approach 1 does not provide a sensible market-value margin. The reason is that the expected run-off of the best-estimate reserves is not a good volume measure for the scaling of the run-off of the underlying risks. On the other hand, the choices \(\mathrm{MVM}_{17}^{\mathrm{CoC}(2)}(\mathbf{X}_{(18)})\) and \(\mathrm{MVM}_{17}^{\boldsymbol{\varphi}}(\mathbf{X}_{(18)})\) are very similar and one could fine-tune the risk aversion parameters \(\alpha, \widetilde{\alpha}\) as well as the parameters ψ and \(r_{\mathrm{CoC}}^{(t)}\) so that they almost coincide.

Table 10.4 Nominal best-estimate reserves under Model Assumptions 10.1 and 10.11
Table 10.5 Market-value margins under Model Assumptions 10.1 and 10.11

Finally, in Fig. 10.4 we compare the expected run-off patterns of the market-value margins. For Approaches l=1,2 we define the patterns by

$$ v_s(l)= r^{(t)}_\mathrm{CoC} \sum _{u\ge s} \rho_u^{(l)}/ \mathrm{MVM}_{t}^{\mathrm{CoC}(l)}(\mathbf{X}_{(t+1)}) $$

and \((v_t(\boldsymbol{\varphi}),v_{t+1}(\boldsymbol{\varphi}),\ldots,v_{n-1}(\boldsymbol{\varphi}))\) corresponds to \((v_t,v_{t+1},\ldots,v_{n-1})\) of Fig. 10.3. We see that the cost-of-capital Approach 2 pattern \((v_s(2))_{s\ge t}\) and the probability distortion approach pattern \((v_s(\boldsymbol{\varphi}))_{s\ge t}\) give a very similar picture for the run-off risks. However, the cost-of-capital Approach 1 pattern \((v_s(1))_{s\ge t}\) underestimates the run-off risks. This has to do with the fact that usually small simple claims can be settled immediately (and their reserves are released), whereas more complicated risky claims stay in the claims settlement process for a longer time period. Therefore, the volume of the reserves decreases much faster than the underlying risks. These findings are in line with the case study and conclusions presented in Wüthrich [162].
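The run-off pattern \(v_s(l)\) is easy to evaluate once the risk measures \(\rho_u^{(l)}\) are given. Because the market-value margin in the nominal setting equals the cost-of-capital rate times the sum of all risk measures, the rate cancels and only the risk measures matter. The sketch below uses hypothetical risk measures for both approaches; the function name and the numbers are assumptions for illustration and are not the values behind Fig. 10.4.

```python
def runoff_pattern(rhos):
    """v_s = sum_{u >= s} rho_u / sum_{u >= t} rho_u for s = t, t+1, ..."""
    total = sum(rhos)
    if total <= 0:
        raise ValueError("the risk measures must have a positive sum")
    pattern = []
    remaining = total
    for rho in rhos:
        pattern.append(remaining / total)  # share of the margin still needed at time s
        remaining -= rho
    return pattern


# Hypothetical risk measures for accounting years t, ..., t+3.
pattern_1 = runoff_pattern([80.0, 48.0, 24.0, 8.0])   # reserve-proportional scaling
pattern_2 = runoff_pattern([80.0, 60.0, 40.0, 20.0])  # risk-based allocation
print(pattern_1)
print(pattern_2)
```

With these illustrative inputs the first pattern decays faster than the second, which is the qualitative effect described for Approach 1.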

Fig. 10.4 Market-value margin run-off patterns \((v_t(1),v_{t+1}(1),\ldots,v_{n-1}(1))\) and \((v_t(2),v_{t+1}(2),\ldots,v_{n-1}(2))\) in Model 10.11 and \((v_t(\boldsymbol{\varphi}),v_{t+1}(\boldsymbol{\varphi}),\ldots,v_{n-1}(\boldsymbol{\varphi}))\) in Model 10.1

Summarizing: we prefer the probability distortion approach for this example, because it gives sensible results and its application is fairly straightforward by simply distorting the chain-ladder factors. This finishes the example.

4 Accounting Year Factors in Run-Off Triangles

4.1 Model Assumptions

Under Model Assumptions 10.1 we do not allow for the modeling of accounting year effects (calendar year effects) such as (super-imposed) claims inflation and change in jurisdiction, because conditional on the model parameters Φ we assume that the individual chain-ladder ratios \(C_{i,j+1}/C_{i,j}\) are independent for different accident years i. This assumption is often not fulfilled in practice. Define the following model structure

$$ C_{i,j+1} = \bigl(\exp \{\zeta_{i,j+1} \}+1 \bigr)~ C_{i,j} = \bigl(\exp \{\xi_{i,j+1} \}\exp \{ \varPsi_{i+j+1} \}+1 \bigr)~ C_{i,j}, $$

or, in view of Model Assumptions 10.1, we choose log-link ratios

$$ \log \biggl(\frac{C_{i,j+1}}{C_{i,j}} -1 \biggr)= \zeta_{i,j+1}= \xi_{i,j+1}+ \varPsi_{i+j+1}. $$
(10.16)

Compared to Model Assumptions 10.1 we add a term \(\varPsi_{i+j+1}\) that is able to cope with diagonal effects that act within a fixed accounting year \(k=i+j+1\). Note that (i) these diagonal effects should only act on the incremental part \(X_{i,j+1}=C_{i,j+1}-C_{i,j}\) of the claims payments because the payments \(C_{i,j}\) up to accounting year i+j are already made; (ii) the diagonal factors \(\exp\{\varPsi_{i+j+1}\}\) will be independent of the financial filtration. Diagonal factors adapted to the financial filtration (such as economic inflation) should be captured by the choice of the right financial instruments (e.g. inflation protected ZCBs, real estate price index, etc.). Therefore, \(\exp\{\varPsi_{i+j+1}\}\) allows for the modeling of legal changes, technical progress, change of claims settlement philosophy, weather conditions, environmental changes, etc. Similar models have been considered by Shi et al. [145], Merz et al. [115], Salzmann–Wüthrich [140] and Wüthrich [161, 164].

We generalize Ansatz (10.16) in the sense that we allow for any correlation structure. Similar to Merz et al. [115] we define the incremental individual log-link ratios

$$ \zeta_{i,j+1}= \log \biggl(\frac{C_{i,j+1}}{C_{i,j}} -1 \biggr), \quad \boldsymbol{\zeta}_j=(\zeta_{1,j},\ldots, \zeta_{I,j})' \quad\text{ and } \quad \boldsymbol{\zeta}= \bigl(\boldsymbol{\zeta}'_1,\ldots, \boldsymbol { \zeta}'_J\bigr)'. $$

Model Assumptions 10.16

We set Assumption 6.3 and assume

  1. (a)

    for all t=1,…,I+J;

  2. (b)

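    conditionally, given vector \(\boldsymbol{\varPhi}=(\varPhi_0,\ldots,\varPhi_{J-1})'\) and matrix \(\varSigma\in\mathbb{R}^{IJ\times IJ}\), we have

    $$ \boldsymbol{\zeta} \vert_{\{\boldsymbol{\varPhi}, \varSigma\}} \sim \mathcal{N} (A\boldsymbol{\varPhi}, \varSigma ), $$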

    with matrix \(A \in\mathbb{R}^{IJ \times J}\) such that for all j=0,…,J−1 and i=1,…,I

    $$ \mathbb{E} [ \zeta_{i,j+1} \vert \boldsymbol{\varPhi}, \varSigma ] = \varPhi_j; $$
  3. (c)

    \(\varSigma\in\mathbb{R}^{IJ\times IJ}\) is a deterministic positive definite covariance matrix and the vector \(\boldsymbol{\varPhi}\sim\mathcal{N}(\boldsymbol{\phi},T)\) has prior parameters \(\boldsymbol{\phi}\in\mathbb{R}^{J}\) and \(T\in\mathbb{R}^{J\times J}\) (positive definite);

  4. (d)

    \((X_{1,0},\ldots,X_{I,0})\) and \(\boldsymbol{\varPhi}\) are independent.

For the covariance matrix Σ we can choose any correlation structure. For instance, we can choose correlations along accounting year diagonals as suggested by (10.16). If Σ is a diagonal matrix with variances only depending on j (and not on i) we are back in Model Assumptions 10.1.

The conditional distribution of ζ, given Φ, is given by the multivariate Gaussian density

$$ h (\boldsymbol{\zeta}\vert \boldsymbol{\varPhi} ) = \frac{1}{(2\pi)^{IJ/2}|\varSigma|^{1/2}} \exp \biggl\{ - \frac{1}{2}(\boldsymbol{\zeta}-A\boldsymbol{ \varPhi})' \varSigma^{-1}(\boldsymbol{\zeta}-A\boldsymbol{ \varPhi}) \biggr\}. $$

The prior density of Φ is given by

$$ \pi (\boldsymbol{\varPhi} ) = \frac{1}{(2\pi)^{J/2}|T|^{1/2}} \exp \biggl\{ - \frac{1}{2}(\boldsymbol{\varPhi}-\boldsymbol{\phi})' T^{-1}(\boldsymbol{\varPhi}-\boldsymbol{\phi}) \biggr\}, $$

with prior mean ϕ and prior covariance matrix T. This immediately provides the joint density of (ζ,Φ) given by the product

$$ h(\boldsymbol{\zeta},\boldsymbol{\varPhi})= h (\boldsymbol{\zeta}\vert \boldsymbol{\varPhi} ) \pi (\boldsymbol{\varPhi} ). $$

Theorem 10.17

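Set Model Assumptions 10.16. The random vector \((\boldsymbol{\zeta},\boldsymbol{\varPhi})\) has a multivariate Gaussian distribution

$$ \begin{pmatrix} \boldsymbol{\zeta}\\ \boldsymbol{\varPhi} \end{pmatrix} \sim \mathcal{N} \left( \begin{pmatrix} A\boldsymbol{\phi}\\ \boldsymbol{\phi} \end{pmatrix}, \begin{pmatrix} \varSigma+ATA' & AT\\ TA' & T \end{pmatrix} \right). $$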

Proof of Theorem 10.17

We prove the statement via moment generating functions. Choose \(\boldsymbol{r}=(r_{1}',r_{2}')'\in\mathbb{R}^{IJ+J}\) with \(r_{1} \in \mathbb{R}^{IJ}\). We have, using the tower property for conditional expectations,

$$ \mathbb{E} \bigl[\exp \bigl\{ \bigl(\boldsymbol{\zeta}', \boldsymbol {\varPhi}'\bigr) \boldsymbol{r} \bigr\} \bigr] = \mathbb{E} \bigl[\exp \bigl\{r_1' \boldsymbol{\zeta} + r_2' \boldsymbol{\varPhi} \bigr\} \bigr] =\mathbb{E} \bigl[\exp \bigl\{r_2' \boldsymbol{\varPhi} \bigr\} \mathbb{E} \bigl[\exp \bigl\{r_1' \boldsymbol{ \zeta} \bigr\}\big\vert \boldsymbol{\varPhi} \bigr] \bigr]. $$

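Note that \(\boldsymbol{\zeta}\vert_{\{\boldsymbol{\varPhi}\}}\sim\mathcal{N}(A\boldsymbol{\varPhi},\varSigma)\), so that \(\mathbb{E}[\exp\{r_1'\boldsymbol{\zeta}\}\vert\boldsymbol{\varPhi}]=\exp\{r_1'A\boldsymbol{\varPhi}+\frac{1}{2}r_1'\varSigma r_1\}\). Since \(\boldsymbol{\varPhi}\sim\mathcal{N}(\boldsymbol{\phi},T)\), this immediately implies that

$$ \mathbb{E} \bigl[\exp \bigl\{ \bigl(\boldsymbol{\zeta}', \boldsymbol{\varPhi}'\bigr) \boldsymbol{r} \bigr\} \bigr] = \exp \biggl\{ r_1'A\boldsymbol{\phi} + r_2'\boldsymbol{\phi} + \frac{1}{2} r_1' \varSigma r_1 + \frac{1}{2} \bigl(r_2+A'r_1\bigr)' T \bigl(r_2+A'r_1\bigr) \biggr\}. $$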

This is exactly the desired moment generating function. □

Corollary 10.18

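Set Model Assumptions 10.16. The unconditional (marginal) distribution of ζ is given by

$$ \boldsymbol{\zeta} \sim \mathcal{N} \bigl( \boldsymbol{\mu}=A\boldsymbol{\phi},~ S=\varSigma+ATA' \bigr). $$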

Corollary 10.18 shows that under Model Assumptions 10.16 the parameters Φ can be eliminated completely, i.e. once we have specified the prior parameters ϕ and T we can work with the marginal distribution of ζ given by Corollary 10.18. In the literature, Model Assumptions 10.16 are also known as fixed and random effects models, see Shi et al. [145]. The fixed effects are the latent factors Φ which are specified through a prior distribution. The random effects can, for instance, be understood in the sense of (10.16). That is, the random effects are used to explain the choice of the covariance matrix Σ. For example, we can assume that the diagonal effects \(\varPsi_k\), \(k=1,\ldots,I+J\), are i.i.d., which gives a particular choice of Σ (see Shi et al. [145] and Wüthrich [164]). Another choice would be to assume that \((\varPsi_k)_{k=1,\ldots,I+J}\) is an AR(1) process, which gives another explanation for the choice of Σ (see Shi et al. [145] and Donnelly–Wüthrich [58]). In conclusion, the detailed Model Assumptions 10.16 are used to explain the choices of the matrices Σ and T; once this is done, we work under the resulting marginal distribution given by Corollary 10.18.
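How i.i.d. diagonal effects translate into a covariance matrix Σ can be sketched as follows. The sketch rests on assumptions that go beyond the text: conditionally on Φ, the terms \(\xi_{i,j}\) are taken to be independent with variances depending only on the development period, the diagonal effects \(\varPsi_k\) are taken to be i.i.d. with variance `tau2` and independent of the \(\xi_{i,j}\), and the marginal covariance is taken as \(\varSigma+ATA'\), as implied by the Gaussian model. All function names and numbers are hypothetical.

```python
def position(i, j, n_acc):
    """Index of zeta_{i,j} (1-based i, j) in zeta = (zeta_1', ..., zeta_J')'."""
    return (j - 1) * n_acc + (i - 1)


def diagonal_covariance(n_acc, n_dev, sigma2, tau2):
    """Sigma for zeta_{i,j} = xi_{i,j} + Psi_{i+j} under the stated assumptions."""
    size = n_acc * n_dev
    cov = [[0.0] * size for _ in range(size)]
    for j in range(1, n_dev + 1):
        for i in range(1, n_acc + 1):
            a = position(i, j, n_acc)
            cov[a][a] += sigma2[j - 1]  # variance of xi_{i,j}, by development period
            for l in range(1, n_dev + 1):
                for m in range(1, n_acc + 1):
                    if i + j == m + l:  # same accounting year diagonal
                        cov[a][position(m, l, n_acc)] += tau2
    return cov


def marginal_covariance(cov, n_acc, n_dev, prior_cov):
    """Sigma + A T A', where row (i, j) of A selects component j - 1 of Phi."""
    total = [row[:] for row in cov]
    for j in range(1, n_dev + 1):
        for i in range(1, n_acc + 1):
            a = position(i, j, n_acc)
            for l in range(1, n_dev + 1):
                for m in range(1, n_acc + 1):
                    total[a][position(m, l, n_acc)] += prior_cov[j - 1][l - 1]
    return total


# Hypothetical 2 x 2 setup: two accident years, two development periods.
sigma_matrix = diagonal_covariance(2, 2, sigma2=[0.04, 0.01], tau2=0.02)
s_matrix = marginal_covariance(sigma_matrix, 2, 2, prior_cov=[[0.09, 0.0], [0.0, 0.09]])
print(sigma_matrix[1][2], s_matrix[0][1])
```

Entries of `sigma_matrix` are non-zero off the diagonal exactly for pairs of cells on the same accounting year diagonal, whereas `s_matrix` additionally couples all cells of the same development period through the prior uncertainty in Φ.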

4.2 Predictive Distribution

Assume that \(\boldsymbol{\zeta}_R\) are the observed components of ζ, and the remaining components \(\boldsymbol{\zeta}_P\) need to be predicted. Let r be the dimension of \(\boldsymbol{\zeta}_R\) with \(1\le r<IJ\). We define the projections \(B_{R}:\mathbb{R}^{IJ} \to\mathbb{R}^{r}\) and \(B_{P}:\mathbb{R}^{IJ} \to\mathbb{R}^{IJ-r}\) such that we obtain a bijective decomposition

$$ \boldsymbol{\zeta} \mapsto (B_R\boldsymbol{\zeta}, B_P\boldsymbol{\zeta} ) = ( \boldsymbol{\zeta}_{R}, \boldsymbol{\zeta}_{P} ), $$

with \(B_R\boldsymbol{\zeta}=\boldsymbol{\zeta}_R\) and \(B_P\boldsymbol{\zeta}=\boldsymbol{\zeta}_P\). We identify the mappings \(B_R\) and \(B_P\) with their matrices.

Corollary 10.19

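Set Model Assumptions 10.16. The random vector \((\boldsymbol{\zeta}_R,\boldsymbol{\zeta}_P)\) has a multivariate Gaussian distribution with the first two moments given by

$$ \boldsymbol{\mu}_{R}= \mathbb{E} [ \boldsymbol{\zeta}_{R} ] = B_R~ A \boldsymbol{\phi}, \qquad \boldsymbol{\mu}_{P}= \mathbb{E} [ \boldsymbol{\zeta}_{P} ] = B_P~ A \boldsymbol{\phi}, $$

$$ S_{R}= \mathrm{Cov} (\boldsymbol{\zeta}_{R} ) = B_R~ S~ B_R', \qquad S_{P}= \mathrm{Cov} (\boldsymbol{\zeta}_{P} ) = B_P~ S~ B_P', $$

with \(S=\varSigma+ATA'\) denoting the covariance matrix of ζ.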

The covariance matrix between the components ζ R and ζ P is given by

$$ S'_{P,R}= S_{R,P}= \mathrm{Cov} (\boldsymbol{ \zeta}_{R},\boldsymbol{\zeta}_{P} ) = B_R~ S~ B_P'. $$

This corollary is an easy consequence of Corollary 10.18 because it only describes a permutation (relabeling) of the components of ζ. The following theorem is a standard result for Gaussian distributions, see for instance Johnson–Wichern [92].

Theorem 10.20

Set Model Assumptions 10.16. The posterior distribution of ζ P , given ζ R , is a multivariate Gaussian distribution with posterior mean given by

$$ \boldsymbol{\mu}_{P}^\mathrm{post}= \boldsymbol{ \mu}_{P}^\mathrm{post} (\boldsymbol{\zeta}_{R})= \mathbb{E} [ \boldsymbol{\zeta}_{P} \vert \boldsymbol{ \zeta}_{R} ] =\boldsymbol{\mu}_{P}+ S_{P,R}~ (S_{R} )^{-1} (\boldsymbol{\zeta}_{R}- \boldsymbol{\mu}_{R} ), $$

and posterior covariance matrix given by

$$ S_{P}^\mathrm{post}= \mathrm{Cov} ( \boldsymbol{ \zeta}_{P} \vert \boldsymbol{\zeta}_{R} ) =S_{P}- S_{P,R} ~ (S_{R} )^{-1} S_{R,P}. $$

Theorem 10.20 allows us to determine the predictive distribution of \(\boldsymbol{\zeta}_P\), given observation \(\boldsymbol{\zeta}_R\). We summarize the findings:

  • The predictive distribution of \(\boldsymbol{\zeta}_P\), given \(\boldsymbol{\zeta}_R\), is obtained in closed form from Theorem 10.20. Thus, if we have observations we can determine the predictive distribution of \(\boldsymbol{\zeta}_P\) explicitly, given the \(\mathcal{T}_t\)-measurable components \(\boldsymbol{\zeta}_R\).

  • The fixed effects Φ and the random effects are only used for the choices of T and Σ and for interpretation and data analysis. Otherwise, they can be integrated out, see Corollary 10.19.

  • Theorem 10.20 also allows for closed form solutions of the prediction of the outstanding liabilities and of the prediction uncertainty; for explicit formulas and examples we refer to Merz et al. [115] and Salzmann–Wüthrich [140].
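The posterior mean and covariance of Theorem 10.20 can be evaluated with elementary linear algebra. The sketch below is a minimal pure-Python illustration: the projections \(B_R\) and \(B_P\) are represented by index lists `obs` and `pred`, the small Gauss-Jordan solver `solve` is a helper written for this example, and the mean vector and covariance matrix are hypothetical numbers rather than the output of a fitted model.

```python
def solve(matrix, rhs):
    """Solve matrix * X = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(matrix[r]) + list(rhs[r]) for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [[aug[r][n + c] / aug[r][r] for c in range(len(rhs[0]))] for r in range(n)]


def condition_gaussian(mu, cov, obs, pred, z_obs):
    """Posterior mean and covariance of the components pred, given the components obs."""
    s_r = [[cov[a][b] for b in obs] for a in obs]
    s_rp = [[cov[a][b] for b in pred] for a in obs]
    resid = [[z - mu[a]] for z, a in zip(z_obs, obs)]
    # One solve for both S_R^{-1} (z_R - mu_R) and S_R^{-1} S_{R,P}.
    sol = solve(s_r, [r + row for r, row in zip(resid, s_rp)])
    weights = [row[0] for row in sol]
    gain = [row[1:] for row in sol]
    mu_post = [mu[p] + sum(cov[p][a] * weights[k] for k, a in enumerate(obs))
               for p in pred]
    cov_post = [[cov[p][q] - sum(cov[p][a] * gain[k][c] for k, a in enumerate(obs))
                 for c, q in enumerate(pred)] for p in pred]
    return mu_post, cov_post


# Hypothetical three-dimensional example: components 0 and 1 observed, 2 predicted.
mu = [1.0, 2.0, 3.0]
cov = [[2.0, 1.0, 0.5], [1.0, 3.0, 1.0], [0.5, 1.0, 2.0]]
mu_post, cov_post = condition_gaussian(mu, cov, obs=[0, 1], pred=[2], z_obs=[2.0, 1.0])
print(mu_post, cov_post)
```

The posterior covariance does not depend on the observed values, only on which components are observed, so it can be computed before any data arrive.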

5 Premium Liability Modeling

In the sections on non-life insurance modeling we have only considered the run-off of the insurance liabilities (see for instance Example 7.8). That is, we have fixed a final accident year I and then we have constructed the VaPo at time \(t\ge I\) for the run-off of these accident years \(i\le I\), see (7.16). At time t=I the VaPo was given by

The crucial point is that in the second summation we only consider accident years \(i\le I\). This situation is called the run-off situation of accident years \(i\le I\), because it considers the run-off of the liabilities with accident years \(i\le I\) (old business) but it does not consider any new business thereafter. We denote the cash flow of this run-off situation at time t=I by \(\mathbf{X}^{\text{run-off}}_{(I+1)}\). We would now like to add new business to the portfolio which corresponds to a new accident year I+1. On the one hand, we receive a premium \(\varPi_{I+1}\) for this new accident year and, on the other hand, from this premium exposure we need to finance all claims that are generated in this accident year I+1 (capital cover system). Let us, for the time being, consider the liability situation only. The VaPo is given by (for sufficiently large final time horizon n)

where we have added the VaPo of the new business (nb) for accident year I+1 denoted by

For modeling the run-off liabilities \(\mathbf{X}^{\text{run-off}}_{(I+1)}\) we use Hertig’s [83] claims reserving model. For modeling the premium liabilities \(\mathbf{X}^{\mathrm{nb}}_{(I+1)}\) we extend this model. Note that in Model Assumptions 7.9 and 10.1 (Hertig’s claims reserving models) we have only specified the distributions of \(C_{i,1},\ldots,C_{i,J}\), conditional on the first payment \(C_{i,0}=X_{i,0}\). That is, we have not specified the distribution of the first column of the claims development triangle. In order to model premium liabilities in a consistent way we make model assumptions for this first column. A simple model is to assume that Model Assumptions 10.1 hold true for accident years i=1,…,I+1 (we extend the assumptions from I to I+1) and that \(X_{I+1,0}=C_{I+1,0}\) is independent of \(\mathcal{T}_I\) and Φ with

In this way we have a simple model that allows for the joint modeling of run-off liabilities (old business) and premium liabilities (new business).

In practice, premium liabilities are often modeled differently because a log-normal assumption on \(X_{I+1,0}\) is not appropriate and, typically, one considers business line segmentations. A common approach is to directly model the total ultimate claim \(C_{I+1,J}\) of accident year I+1 and then, using a cash flow pattern, this total ultimate claim is allocated to the different accounting year payments so that the discounted claim can be calculated. The difficulty with this second approach is that two completely different models are chosen for run-off liability risks and premium liability risks. This becomes apparent one year later, where suddenly the model for accident year I+1 is changed when we study the run-off of this accident year after accounting year t=I+1 (using for example Hertig’s claims reserving model, Model Assumptions 10.1). This inconsistency is also criticized in Ohlsson–Lauzeningks [122] when one considers the time series of solvency assessments.

In many situations it is appropriate to separate large claims from attritional (small) claims, see Gisler [76]. In our terminology a large claim is either an individual large claim (e.g. a large general liability claim) or a catastrophe claim (like a storm event that causes lots of small claims). The reason for separating large and attritional claims is that the underlying risk factors have a very different stochastic nature. On the one hand, large claims are low frequency and high severity claims (either they occur or they do not). On the other hand, attritional claims are high frequency and low severity claims where the law of large numbers takes effect and the main risk drivers are of parameter uncertainty nature. We denote the cash flow generated by large claims of accident year t by \(\mathbf{X}^{\mathrm{lc}}_{(t)}\) and the one generated by attritional claims by \(\mathbf{X}^{\mathrm{ac}}_{(t)}\). Thus, the premium liability from new business in accounting year t=I+1 is determined by the cash flow

$$ \mathbf{X}^\mathrm{nb}_{(I+1)} =\mathbf{X}^\mathrm{ac}_{(I+1)}+ \mathbf{X}^\mathrm{lc}_{(I+1)}. $$

We now model these two claims classes separately in the next subsections.

5.1 Modeling Attritional Claims

We assume that attritional claims \(\mathbf{X}^{\mathrm{ac}}_{(1)}, \ldots, \mathbf{X}^{\mathrm{ac}}_{(I+1)}\) satisfy Model Assumptions 10.1 (with I extended to I+1) and all that we need to specify additionally are the initial distributions for \(X^{\mathrm{ac}}_{1,0},\ldots, X^{\mathrm{ac}}_{I+1,0}\).

Model Assumptions 10.21

(Attritional claims)

Set Assumption 6.3 and assume

  1. (a)

    for all t≥1;

  2. (b)

    \(\boldsymbol{\sigma}=(\sigma_0,\ldots,\sigma_{J-1})>0\), \(w_i>0\) and \(\lambda_i>0\), for \(i=1,\ldots,I+1\), are given deterministic parameters;

  3. (c)

    conditionally, given \(\boldsymbol{\varPhi}=(\varPhi_0,\ldots,\varPhi_{J-1})\) and \(\boldsymbol{\varTheta}_i=(\varTheta_{i,1},\varTheta_{i,2})\), \(i=1,\ldots,I+1\), we have

    • \(X^{\mathrm{ac}}_{i,j}\) are independent for different accident years i;

    • \(X_{i,0}^{\mathrm{ac}}\) has a compound Poisson distribution

      $$ X^\mathrm{ac}_{i,0}= \sum_{l=1}^{N_i} \varTheta_{i,2} Y_{i,l}, $$

      where \(N_i\) only depends on \(\varTheta_{i,1}\) and has a Poisson distribution with mean \(w_i\lambda_i\varTheta_{i,1}\). The components of \(\mathbf{Y}_i=(Y_{i,l})_{l\ge1}\) are i.i.d., positive, with finite second moments and do not depend on \((\boldsymbol{\varPhi},(\boldsymbol{\varTheta}_m)_{m\ge1})\);

    • for cumulative payments \(C_{i,j}^{\mathrm{ac}} =\sum_{l=0}^{j} X^{\mathrm{ac}}_{i,l}\) we have (conditional on \(X^{\mathrm{ac}}_{i,0}>0\))

      for j=0,…,J−1 and i=1,…,I+1;

  4. (d)

    all components of \((\boldsymbol{\varPhi},(\boldsymbol{\varTheta}_i)_{i\ge1})\) are independent and

    with prior parameters \(\phi_j\) and \(s_j>0\). Moreover, \(\boldsymbol{\varTheta}_i\), \(i=1,\ldots,I+1\), are i.i.d. with positive independent components \(\varTheta_{i,1}\) and \(\varTheta_{i,2}\) having mean 1 and finite variance.

Remarks 10.22

  • To illustrate the independence assumptions it is often easier to write down the appropriate likelihood functions. We refrain from doing so but briefly mention the crucial properties.

  • Φ and \(\boldsymbol{\varTheta}_1,\ldots,\boldsymbol{\varTheta}_{I+1}\) are the underlying (unknown) risk characteristics. The random effect \(\boldsymbol{\varTheta}_i\) models the risk drivers of accident year i (like weather conditions, etc.) and the fixed effect Φ models the (unknown) chain-ladder parameters. We assume that all their (prior) distributions are independent.

  • Conditionally, given Φ and \(\boldsymbol{\varTheta}_1,\ldots,\boldsymbol{\varTheta}_{I+1}\), we assume that

    • \(X^{\mathrm{ac}}_{i,j}\) are independent for different accident years i;

    • \(C^{\mathrm{ac}}_{i,j}\) satisfy the chain-ladder property for j=1,…,J with chain-ladder factors determined by Φ (conditional on the first payment \(X^{\mathrm{ac}}_{i,0}\) being strictly positive);

    • \(X^{\mathrm{ac}}_{i,0}\) has a compound Poisson distribution with i.i.d. severities \(\varTheta_{i,2}Y_{i,l}\) and an independent Poisson distribution \(N_i\) for the number of payments. This is the model proposed in Gisler [76] and SST [151];

    • \(w_i\) is a deterministic volume measure for the exposure of accident year i and \(\lambda_i\) is the frequency of an average accident year i (before having any additional information about this particular year). \(\varTheta_{i,1}\) then models the state of nature which influences this accident year; this results in the frequency \(\lambda_i\varTheta_{i,1}\) and in the expected number of initial payments \(w_i\lambda_i\varTheta_{i,1}\) for accident year i. Similarly, \(\varTheta_{i,2}\) influences the attritional claim severities \(Y_{i,l}\). If the attritional claim severities are linked to economic factors, then we may drop out of the basic actuarial model framework (Model Assumptions 6.3) and \(\varTheta_{i,2}\) determines an appropriate financial instrument.

Proposition 10.23

(Attritional claims moments)

Model Assumptions 10.21 imply

Proof of Proposition 10.23

We first apply the tower property of conditional expectations, secondly we use the conditional independence of \(X_{i,j}\) for different accident years i to obtain

where we have used the structure of the compound Poisson distribution and the independence assumptions. Because \(\boldsymbol{\varTheta}_{I+1}\) and \(\mathcal{T}_I\) are independent, and because the components of \(\boldsymbol{\varTheta}_{I+1}\) are independent with mean 1, the first claim of the proposition follows. Similarly, we obtain

where in the third step we have used Corollary 4.2.1 in Rolski et al. [134] for the second moment of the compound Poisson distribution and in the fourth and fifth step the independence between \(\varTheta_{i,1}\) and \(\varTheta_{i,2}\). This proves the claim for the coefficient of variation. □

We define

and

Then we obtain for the coefficient of variation of accident year I+1

(10.17)

That is, the coefficient of variation for the initial payment \(X^{\mathrm{ac}}_{I+1,0}\) of attritional claims has two terms:

  • The first term \(v_{I+1}^{\mathrm{param}}\) (parameter uncertainty) is simply a constant which serves as a lower bound for the coefficient of variation. The idea behind this term is that this is the non-diversifiable part of the initial attritional claims payments. The uncertainty drivers are the unknown characteristics \(\boldsymbol{\varTheta}_{I+1}\) of the next accident year I+1 which cannot be diversified by increasing the volume of the insurance portfolio. This lower bound for the uncertainty can be estimated with market data (large volume) and is often specified by the local regulator, see for example SST [151], Sect. 8.4.3. This specification serves as a lower bound for possible diversification benefits.

  • The second term corresponds to the diversifiable part of the initial attritional claims payments (process uncertainty). It is inversely proportional to the expected number of payments w I+1 λ I+1. Also for this second term it is preferable that the regulator chooses an appropriate value for \(v_{I+1}^{\mathrm{process}}\) (depending on the line of business and the definition of attritional claims) and then the second term scales according to the volume of the company.
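The interplay of the two terms can be illustrated numerically. The following minimal sketch assumes that the decomposition has the additive form Vco² = (v^param)² + (v^process)²/(w λ), which is our reading of the verbal description above; the function name `attritional_cov` and all parameter values are hypothetical.

```python
import math

def attritional_cov(v_param, v_process, w, lam):
    """Coefficient of variation under the ASSUMED additive form
    Vco^2 = v_param^2 + v_process^2 / (w * lam)."""
    expected_payments = w * lam  # expected number of initial payments
    return math.sqrt(v_param**2 + v_process**2 / expected_payments)

# Hypothetical values: the volume w grows, the frequency lam stays fixed.
for w in (1_000, 10_000, 100_000, 1_000_000):
    print(w, round(attritional_cov(0.05, 3.0, w, 0.1), 4))
```

Under this assumption the process term vanishes for growing volume, whereas the coefficient of variation never falls below the parameter term.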

Concluding Remarks

  • Model Assumptions 10.21 allow for a risk based and volume adapted modeling of the attritional claim of the premium liability of new business. The initial payment \(X^{\mathrm{ac}}_{i,0}\) of each accident year i has a compound Poisson structure with coefficient of variation bounded from below by \(v_{i}^{\mathrm{param}}\), see (10.17). The attritional claims payments \(X^{\mathrm{ac}}_{i,j}\) beyond development period 0 (j≥1) are then given by a chain-ladder model.

  • The main risk driver is the parameter uncertainty in Θ i . In order to obtain the protected VaPo for the attritional claims payments \(\mathbf{X}^{\mathrm{ac}}_{(1)}, \ldots, \mathbf{X}^{\mathrm{ac}}_{(I+1)}\) we can choose this risk driver for applying the FKG inequality (8.11).

  • For the time being we have only considered one single line of business. For a portfolio of different lines of business we can start to aggregate these lines of business by assuming an appropriate dependence structure between the structural parameters Θ i of the different lines of business, see Wüthrich [160], Sect. 2.2. Moreover, we can also introduce dependence for the run-off of different lines of business by assuming that each run-off satisfies Model Assumptions 10.16 and making the run-off triangles dependent for different lines of business analogously to Merz et al. [115].

5.2 Modeling Large Claims

In this subsection we turn to the modeling of large claims \(\mathbf{X}^{\mathrm{lc}}_{(i)}\) for accident years i=1,…,I+1. The fundamental difference from attritional claims is that large claims are low-frequency, high-severity claims. Therefore, diversification is often only obtained over time and a reliable estimation of a cash flow pattern for each accident year is impossible. For this reason, large claims are often studied in an undiscounted view (financial returns and discounting only play a marginal role for large claims, at least for direct insurers).

Let us for the time being fix one accident year i. We assume that we have in accident year i a finite number of risk drivers (either individual large claims or catastrophe events such as storm, flood, hail, earthquake, etc.). The total large claim amount of accident year i over all risk drivers ν is then given by

$$ C_{i}^\mathrm{lc}= \sum_{\nu} \sum_{l=1}^{N_i^\nu} Y_{i,l}^{\nu}, $$

where \(N_{i}^{\nu}\) models the number of events of risk driver ν in accident year i and \(Y_{i,l}^{\nu}\) the total aggregate claim of the l-th event of this risk driver ν in year i. If we assume that the large claims of all risk drivers ν are independent and the total claim \(\sum_{l=1}^{N_{i}^{\nu}} Y_{i,l}^{\nu}\) of each risk driver ν follows a compound Poisson distribution, then \(C_{i}^{\mathrm{lc}}\) has again a compound Poisson distribution, see Mikosch [117], Proposition 3.3.4. Therefore, we can merge the large claim of accident year i into one single compound Poisson distribution, thus, for simplicity, we assume that the large claim of accident year i has a compound Poisson distribution given by

$$ C_{i}^\mathrm{lc}= \sum_{l=1}^{N_i^\mathrm{lc}} Y_{i,l}^\mathrm{lc}, $$

where \(N_{i}^{\mathrm{lc}}\) is the number of large claims in accident year i and \(Y_{i,l}^{\mathrm{lc}}\) denotes the severity of the l-th large claim in accident year i.
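The merging of independent compound Poisson risk drivers can be sketched as follows: the merged frequency is the sum of the individual frequencies, and the merged severity distribution is the frequency-weighted mixture of the individual severity distributions. The sketch assumes discrete severity distributions for illustration; the function name and the two risk drivers with their parameters are hypothetical.

```python
def merge_compound_poisson(drivers):
    """Merge independent compound Poisson risk drivers into a single one.

    drivers: list of (lam, severity_pmf), where severity_pmf maps a claim
    size to its probability (discrete severities, assumed for illustration).
    Returns the merged frequency and the merged severity distribution.
    """
    lam_total = sum(lam for lam, _ in drivers)
    merged = {}
    for lam, pmf in drivers:
        weight = lam / lam_total  # share of events caused by this risk driver
        for size, prob in pmf.items():
            merged[size] = merged.get(size, 0.0) + weight * prob
    return lam_total, merged

# Hypothetical risk drivers: (expected number of events, severity distribution).
storm = (0.5, {10.0: 0.8, 50.0: 0.2})
flood = (0.1, {50.0: 0.5, 200.0: 0.5})
lam_lc, severity_lc = merge_compound_poisson([storm, flood])
```

The expected total claim is preserved by construction: the merged frequency times the merged mean severity equals the sum of the expected claims of the single risk drivers.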

Model Assumptions 10.24

(Large claims)

We set Assumption 6.3 and assume that the large claim \(C_{i}^{\mathrm{lc}}\) is -measurable for all i=1,…,I+1 with

  (a) \(C_{i}^{\mathrm{lc}}\) are independent for different accident years i;

  (b) \(C_{i}^{\mathrm{lc}}\) has a compound Poisson distribution with expected number of claims \(\lambda_{i}^{\mathrm{lc}}>0\), and the claims severities \(Y_{i,l}^{\mathrm{lc}}\) are i.i.d. Pareto distributed with threshold θ>0 and Pareto parameter \(\chi_i>1\).

Remarks 10.25

  • The parameter θ is the threshold for large claims and catastrophe events. Below this threshold the single claims are defined to be attritional claims that satisfy Model Assumptions 10.21, above the threshold the events are defined to be large claims. The choice of threshold θ may depend on the size of the company, the line of business considered, but also on the choice of the reinsurance program.

  • Often one chooses a heavy tailed distribution function for large claims modeling. Typical examples are all regularly varying (at infinity) survival distributions such as Pareto, Burr, log-gamma, etc., see Sect. 10.1, Embrechts et al. [61] and Mikosch [117], Sect. 3.2.5.

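  • We assume the Pareto parameter \(\chi_i\) to be bigger than 1. Otherwise we have an infinite mean model, which means that we need to charge an infinite premium for such risks. Note that for \(\chi_i>1\) the expected claim severity is finite and given by \(\mathbb{E}[Y_{i,l}^{\mathrm{lc}}] = \theta \chi_i/(\chi_i-1)\).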

  • So far, we have not considered cash flows, we have only defined the total large claim amounts \(C^{\mathrm{lc}}_{i}\) of accident year i. Since discounting is less important for large claims (the main risk drivers are of insurance technical nature), large claims are often studied on a nominal basis (no discounting or bank account discounting). Moreover, n should be so large that all large claims \(C^{\mathrm{lc}}_{1}, \ldots, C^{\mathrm{lc}}_{I+1}\) are settled after accounting year n.

  • As in the non-life run-off triangle examples we should now study the flow of information and the claims development result for (1) best-estimate corrections on reported claims and (2) newly reported IBNyR claims (incurred but not yet reported claims). We come back to this in (10.20) and (10.21), below.

  • In practice, the distribution of the compound Poisson random variable \(C^{\mathrm{lc}}_{i}\) is often determined numerically with Monte Carlo simulation. However, if we allow for discretization of the claim severities \(Y_{i,l}^{\mathrm{lc}}\) we can apply the Fast Fourier Transform or the Panjer [124, 125] algorithm which allows for closed form solutions.
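The recursive evaluation mentioned in the last remark can be sketched as follows for the compound Poisson case. We assume that the claim severities have already been discretized on an equidistant grid (the discretization of the Pareto distribution itself is not shown); the function name and the grid convention are hypothetical.

```python
import math

def panjer_poisson(lam, severity_pmf, n_max):
    """Panjer recursion for a compound Poisson distribution.

    severity_pmf[j] is the probability that a single claim equals j grid
    units (discretized severities, assumed given). Returns the list g with
    g[k] = P(total claim = k grid units) for k = 0, ..., n_max.
    """
    # Pad the severity probabilities with zeros up to index n_max.
    f = list(severity_pmf) + [0.0] * max(0, n_max + 1 - len(severity_pmf))
    g = [math.exp(-lam * (1.0 - f[0]))]  # probability of a zero total claim
    for k in range(1, n_max + 1):
        s = sum(j * f[j] * g[k - j] for j in range(1, k + 1))
        g.append(lam / k * s)
    return g
```

A simple plausibility check: if every claim equals exactly one grid unit, the total claim is the claim count and the recursion reproduces the Poisson probabilities.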

5.3 Reinsurance

In the previous chapters and sections we have described (insurance) risk bearing and the price for risk bearing comprised in the market-value margin. We could also try to mitigate risk by buying reinsurance. Typically, reinsurance is bought for peak risks that go beyond the risk capacity of a single insurance company and, therefore, are pooled by reinsurance companies.

Reinsurance contracts and programs are very diverse. Many reinsurance products have a rather complicated structure. This includes path dependencies of reinsurance triggers, etc. Therefore, reinsurance can often only be modeled numerically and there are powerful simulation tools on the market that optimize reinsurance programs (reinsurance is always a trade-off between price of reinsurance cover versus amount of risk mitigation). In the present text we only consider very basic reinsurance covers that can still be handled analytically. A brief introduction can be found in Teugels–Sundt [153], p. 1400.

Probably the easiest form of reinsurance is a so-called proportional reinsurance contract. There are two forms of proportional reinsurance contracts, namely quota-share reinsurance and surplus treaty.

Assume that we have an original claim of size Y≥0. For a quota-share reinsurance cover one chooses a fixed proportion p∈(0,1) and then the reinsurer covers the claim pY which means that

$$ (1-p)~Y $$

remains at the ceding insurer.

For a surplus treaty the ceding company determines a maximum loss θ>0 that it can retain for each risk Y l ≥0 in its portfolio. Every risk Y l which provides a maximal coverage M l greater than the retained line θ is ceded proportionally to the size of risk. Thus, the proportion is defined by

$$ p_l = \frac{(M_l-\theta)\vee0}{M_l} = \frac{(M_l-\theta)_+}{M_l}. $$

This provides that

$$ \sum_l (1-p_l)~Y_l $$

stays with the ceding company. Concluding, the ceding company covers at most the retention limit θ per risk l.
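The two proportional covers can be written down in a few lines. This is a minimal sketch of the formulas above; the function names are hypothetical, and we assume that each claim \(Y_l\) does not exceed the maximal coverage \(M_l\) of its risk.

```python
def quota_share_retained(claim, p):
    """Claim (1 - p) * Y remaining at the ceding insurer under a quota share."""
    return (1.0 - p) * claim

def surplus_proportion(max_cover, retained_line):
    """Ceded proportion p_l = (M_l - theta)_+ / M_l of a surplus treaty."""
    return max(max_cover - retained_line, 0.0) / max_cover

def surplus_retained(claims, max_covers, retained_line):
    """Total claim sum_l (1 - p_l) * Y_l that stays with the ceding company."""
    return sum(
        (1.0 - surplus_proportion(m, retained_line)) * y
        for y, m in zip(claims, max_covers)
    )
```

For a total loss \(Y_l=M_l\) on a risk with \(M_l>\theta\), the retained amount equals exactly θ, which is the retention limit stated above.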

Another form of reinsurance is non-proportional reinsurance. The excess-of-loss reinsurance is often used for catastrophe covers. Assume we have an original claim of size Y≥0. We choose a fixed deductible θ D >0 and a fixed cover θ C >0. In that case the reinsurance company covers the claim in the layer “θ C xs θ D ”, i.e.

$$ (Y-\theta_D)_+-(Y-\theta_D-\theta_C)_+ = \min\bigl\{\theta_C, (Y-\theta_D)_+\bigr\}. $$

A stop-loss reinsurance is an excess-of-loss reinsurance cover that acts on the total annual claim amount Y=∑ l Y l , if Y l denotes the claims within a fixed accounting year.
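The layer formula and its stop-loss variant can be sketched as follows; the function names are hypothetical, and an unlimited cover is represented by `float("inf")`.

```python
def layer_payout(claim, deductible, cover):
    """Reinsurer's payment min{theta_C, (Y - theta_D)_+} in the layer
    "theta_C xs theta_D"."""
    return min(cover, max(claim - deductible, 0.0))

def stop_loss_payout(claims, deductible, cover):
    """The same layer applied to the total annual claim amount."""
    return layer_payout(sum(claims), deductible, cover)
```

Note that applying the layer to every single claim and applying it to the annual total generally give different payments, which is the difference between an excess-of-loss and a stop-loss cover.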

These contracts are well understood. However, we would like to emphasize that there is also a default risk of the reinsurer involved! Assume that we consider an excess-of-loss reinsurance cover with deductible \(\theta_D\) and cover \(\theta_C=\infty\). In that case the reinsurance company faces the claim \((Y-\theta_D)_+\) and the ceding company faces the claim

$$ Y \wedge\theta_D, $$

but only if the reinsurer is able to fulfill its obligations!

We make Model Assumptions 10.24 (large claims) and we assume (for simplicity) that all large claims can be paid and settled immediately. Then the large claims cash flow of accident year i is given by \(\mathbf{X}^{\mathrm{lc}}_{(i)}=(0, \ldots, 0, X_{i,0}^{\mathrm{lc}}, 0, \ldots, 0)\in\mathbb{R}^{n+1}\) with (i+1)-st component given by

$$ X_{i,0}^\mathrm{lc}= C_{i}^\mathrm{lc}= \sum_{l=1}^{N_i^\mathrm{lc}} Y_{i,l}^\mathrm{lc}. $$

The insurance company now decides to buy an excess-of-loss cover “∞ xs θ D ” at the beginning of accounting year i. This implies that it faces the following claim

$$ X_{i,0}^\mathrm{lc,~ ri}= \sum _{l=1}^{N_i^\mathrm{lc}} \bigl\{\bigl(Y_{i,l}^\mathrm{lc} \wedge \theta_D\bigr) + \bigl(Y_{i,l}^\mathrm{lc}- \theta_D\bigr)_+ ~1^c_{\{\mathrm{ri}\}}(i) \bigr\}, $$
(10.18)

where \(1^{c}_{\{\mathrm{ri}\}}(i) \in\{0,1\}\) is the indicator whether the reinsurance company has defaulted up to (and including) accounting year i and hence it is not able to fulfill its obligations (here we have assumed that in case of a reinsurance default we cannot recover anything). That is, similar to the coupon bond in Example 5.3, we define

$$ 1^c_{\{\mathrm{ri}\}}(t)= \left\{ \begin{array}{ll} 1 & \text{ ~reinsurance company has defaulted in $[0,t]$,}\\ 0 & \text{ ~otherwise.} \end{array} \right. $$

Moreover, \(1_{\{\mathrm{ri}\}}(t)=1-1^{c}_{\{\mathrm{ri}\}}(t)\) is the indicator whether the reinsurance company has not defaulted until time t.
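Formula (10.18) translates directly into code. The following minimal sketch takes the default indicator as a given input and assumes, as in the text, that nothing is recovered in case of a reinsurance default; the function name is hypothetical.

```python
def retained_large_claims(severities, deductible, reinsurer_defaulted):
    """Claim faced by the ceding company under the cover "infinity xs theta_D",
    see (10.18). Without default every claim is capped at the deductible;
    with default the excess layer falls back on the ceding company."""
    default_indicator = 1.0 if reinsurer_defaulted else 0.0
    return sum(
        min(y, deductible) + max(y - deductible, 0.0) * default_indicator
        for y in severities
    )
```

In case of default the retained claim is simply the sum of all claim severities, i.e. the reinsurance cover is worthless.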

Conclusion

We have a counter-party risk which needs a careful study for solvency purposes.

If claims development is involved, reinsurance cover analysis and modeling get much more complicated. We briefly explain why. We split the total number of large claims into their reporting delays. For simplicity, we assume that the maximal reporting delay is one year

$$ N_i^\mathrm{lc}=N_{i,0}^\mathrm{lc}+N_{i,1}^\mathrm{lc}, $$

where \(N_{i,j}^{\mathrm{lc}}\) is the number of reported claims with a reporting delay of j=0,1 years. The best-estimate liability in accounting year i with reporting delay j=0 (without reinsurance) is then given by

Note that if we cannot immediately settle \(Y_{i,l}^{\mathrm{lc}}\) we need to predict this claim by which is then booked in the profit and loss statement. Typically, reinsurance is immediately triggered on these predicted values, i.e. reinsurance needs to be considered as an asset in a full balance sheet approach. The reinsurance situation (10.18) in accounting year i is then modified as follows

(10.19)

that is, reinsurance is applied to the best-estimates of reported claims \(l=1,\ldots, N_{i,0}^{\mathrm{lc}}\) of accounting year i and the booked figures are predicted values. One year later we obtain the claims development result on these reported claims \(l=1, \ldots, N_{i,0}^{\mathrm{lc}}\) given by

(10.20)

this is the update of information. In addition we obtain the best-estimate for late reported (IBNyR) claims for accident year i given by (note \(N_{i}^{\mathrm{lc}}= N_{i,0}^{\mathrm{lc}}+N_{i,1}^{\mathrm{lc}}\))

(10.21)

Reinsurance for accounting year i+1 then needs to be applied to (10.20) and (10.21). We assume that if the reinsurance company has defaulted in period i it cannot recover for period i+1, that is \(1^{c}_{\{\mathrm{ri}\}}(i)\le1^{c}_{\{\mathrm{ri}\}}(i+1)\). In this case the ceding company faces the (nominal) claim after accounting year i+1

This assumes that in case of a default of the reinsurance company no money is transferred between the two counter-parties (in both directions). The first term states that if the reinsurance company has already defaulted in accounting year i, the ceding company has to bear the whole claim in accounting year i+1 (the claims development result and the IBNyR claims). In case the reinsurance company survives accounting year i, term two becomes active. Term two states that we have the excess-of-loss cover on all IBNyR claims \(l= N_{i,0}^{\mathrm{lc}}+1,\ldots, N_{i}^{\mathrm{lc}}\) (in case the reinsurer does not default in accounting year i+1). The situation for reported claims \(l=1,\ldots, N_{i,0}^{\mathrm{lc}}\) is more involved:

  (a) If the reinsurer does not default in accounting year i+1, i.e. \(1^{c}_{\{\mathrm{ri}\}}(i)=1^{c}_{\{\mathrm{ri}\}}(i+1)=0\), then term two for reported claims in period j=0 reads as

    That is, gains and losses in the claims development result are shared by the two parties according to the excess-of-loss contract (note that gains have a negative sign and losses a positive sign).

  (b) The case when the reinsurer defaults in accounting year i+1, i.e. \(1^{c}_{\{\mathrm{ri}\}}(i)=0\) and \(1^{c}_{\{\mathrm{ri}\}}(i+1)=1\), is trickier. In that case term two for reported claims reads as

    In the case claim l does not enter the difference \(\widehat{X}_{i,0}^{\mathrm{lc,~ri}} - \widehat{X}_{i,0}^{\mathrm{lc}}\) (i.e. we have not called for any reinsurance for these claims at time i). This implies that these claims development results are completely consumed by the insurance company (because the reinsurance has defaulted in accounting year i+1).

    The case is more delicate. These claims l enter the difference \(\widehat{X}_{i,0}^{\mathrm{lc,~ri}} - \widehat{X}_{i,0}^{\mathrm{lc}}\) because the first layer was paid by the insurer and the excess layer was paid by the reinsurer. Our definition now says that the claims development result is completely consumed by the insurance company (because the reinsurance has defaulted in accounting year i+1). This may be fine in case of a loss but it may be questionable in case of a CDR gain because in this latter case we make a gain on something we have not really paid for (excess layer in accounting year i). From this point of view we should forward this CDR gain on the excess layer to the defaulted reinsurance company. On the other hand, we need to finance possible negative CDRs on other claims and a possible loss on the excess layer of IBNyR claims \(l=N_{i,0}^{\mathrm{lc}}+1, \ldots, N_{i}^{\mathrm{lc}}\). From this point of view the insurance company will use the possible CDR gain on reported claims to finance other losses (and not forward any gains to the defaulted reinsurance company). All this assumes that the reinsurer has paid the difference \(\widehat{X}_{i,0}^{\mathrm{lc}}-\widehat{X}_{i,0}^{\mathrm{lc,~ri}}\) in cash to the insurer in period i.

For modeling purposes we often assume that the default process is a non-decreasing Markov process with values in {0,1}. This means that a defaulted company cannot recover. Moreover, we assume that every default is a total loss of the reinsurance contract for present and future insurance benefits. Also this can be relaxed if we assume that there are possible recoveries given default (and the company may also buy new reinsurance covers for the claims development of old claims). These last remarks apply to every credit risk towards a counter-party and are of special interest in surety and credit risk insurance.
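A non-decreasing default process with values in {0,1} can be simulated as follows. The sketch assumes, for illustration only, a constant annual default probability for the surviving reinsurer; this parameter and the function name are hypothetical and not part of the model above.

```python
import random

def simulate_default_path(n_years, annual_default_prob, rng):
    """One path of the default indicator over n_years accounting years.

    State 1 (default) is absorbing: a defaulted company cannot recover.
    annual_default_prob is an ASSUMED constant one-year default probability.
    """
    path = []
    defaulted = 0
    for _ in range(n_years):
        if defaulted == 0 and rng.random() < annual_default_prob:
            defaulted = 1
        path.append(defaulted)
    return path

rng = random.Random(2024)  # fixed seed for reproducibility
path = simulate_default_path(10, 0.02, rng)
```

Recoveries given default, as mentioned above, would require a richer state space than {0,1} and are not covered by this sketch.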

6 Risk Measurement and Solvency Modeling

The aim (and difficult task) now is to bring all pieces together for a comprehensive risk assessment and solvency analysis. In many situations such comprehensive models become rather complicated and can only be solved numerically. However, we would like to emphasize that one should not “over-complicate” the models. Overly involved models are hard to understand and to interpret. One should always understand what the model is doing, and one should always be able to interpret its parameters and understand their sensitivities. More complex models may match past observations better, but in many cases this does not imply that they also give better predictions for future events (predictive power). In conclusion, one should always concentrate on the essential risk drivers. There are many stochastic positions in the balance sheet of an insurance company which are negligible from a risk measurement point of view, so one should not invest too much time in modeling them very accurately as long as they are under control.

Our aim in this section is to present a toy model. We assume that our insurance company is a mono-liner that, for example, sells automobile policies. This mono-liner faces premium liability risk from new business as well as run-off risk from old business. In addition, there are many other risk factors like financial and ALM risk, reinsurance default risk, etc. We are going to model these factors in a comprehensive stochastic model.

6.1 Insurance Liabilities

We assume that Assumption 6.3 (basic actuarial model) is fulfilled. Our company faces attritional claims \(X_{i,j}^{\mathrm{ac}}\) that satisfy Model Assumptions 10.21 and it faces independent large claims \(C_{i}^{\mathrm{lc}}\) which have, for accident years i=1,…,I+1, i.i.d. Pareto distributions with threshold θ>0 and Pareto parameter χ>1. We assume an immediate settlement of large claims, thus, large claims in accident year i generate a \(\mathcal{T}_{i}\)-measurable cash flow \(X_{i}^{\mathrm{lc}}=C_{i}^{\mathrm{lc}}\) in accounting year i. Moreover, the company holds a life-time annuity portfolio according to Example 7.6. We assume that attritional claims, large claims and the life-times of the annuity portfolio are independent and generate the insurance technical filtration \(\mathbb{T}\). We denote the insurance cash flow generated by these three claims categories by X. The VaPo for the liabilities after time I is then at time I given by

The first line describes the run-off of the outstanding liabilities of attritional claims with accident years i≤I, the second line describes the premium liability claims of new business of accident year I+1 (attritional and large claims) and the third line is the run-off of the life-time annuities. For the life-time annuities we use the convention that \(L_{x+t}\) is \(\mathcal{T}_{I+t}\)-measurable, thus, \(L_x\) is the number of people alive at time I.

If we, in addition, consider a stop-loss reinsurance cover for large claims \(X_{i}^{\mathrm{lc}}\) (we choose “∞ xs θ D ” with θ D >θ) and assume that the default probability of the reinsurance company is completely driven by financial market information, then we can introduce a new financial instrument \(\mathfrak{B}^{(I+1)}\) which provides the value B I+1 of the bank account \(\mathfrak{B}\) at time I+1 if the reinsurance company does not default and zero otherwise. The large claims part is then modified and we obtain the reinsurance integrated VaPo (because we assume an immediate settlement of large claims formula (10.19) has a simpler structure)

(10.22)

Note that large claims are modeled on an annual basis and \(X_{I+1}^{\mathrm{lc}}\) denotes the total large claim amount of accident year I+1 (with the assumption that this total large claim has a Pareto distribution). The reinsurance cover “∞ xs θ D ” is then understood as a stop-loss cover on this annual claim amount. The appropriate financial instrument for this reinsurance cover is the bank account that may possibly default. In this spirit the reinsurance cover is modeled as a defaultable bond with time to maturity of one year, see Sect. 5.1.2.
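For a Pareto distributed annual claim the expected retained amount under the cover “∞ xs θ D ” is available in closed form. The sketch below ignores the reinsurance default and assumes the parametrization with survival function \((x/\theta)^{-\chi}\) for x≥θ; the function names are hypothetical.

```python
def pareto_mean(theta, chi):
    """Mean theta * chi / (chi - 1) of a Pareto claim (finite only for chi > 1)."""
    if chi <= 1.0:
        raise ValueError("infinite mean model: chi must be bigger than 1")
    return theta * chi / (chi - 1.0)

def pareto_expected_retained(theta, chi, deductible):
    """E[min(X, theta_D)] for X Pareto with threshold theta and parameter chi,
    assuming theta_D >= theta and no reinsurance default."""
    if deductible < theta:
        raise ValueError("deductible must not lie below the threshold")
    ceded = theta / (chi - 1.0) * (theta / deductible) ** (chi - 1.0)
    return pareto_mean(theta, chi) - ceded

# Hypothetical parameters: the expected retained claim decreases in chi.
for chi in (1.5, 2.0, 3.0):
    print(chi, pareto_expected_retained(1.0, chi, 10.0))
```

The subtracted term is the expected claim in the reinsured layer; it vanishes as the deductible grows, so that the retained claim converges to the full Pareto mean.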

We calculate all the single terms. For the run-off of the outstanding liabilities of attritional claims for old accident years, first line in (10.22), we obtain from Corollary 10.4 the VaPo

$$ \mathrm{VaPo}_I \bigl(\mathbf{X}^{\text{run-off}}_{(I+1)} \bigr)= \sum_{i=I+1-J}^{I} C^\mathrm{ac}_{i,I-i} ~\sum_{j=I-i+1}^{J} ~\prod _{l=I-i}^{j-2} f^{(I)}_l ~ \bigl(f^{(I)}_{j-1}-1 \bigr) ~\mathfrak{Z}^{(i+j)}. $$
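The double sum of the run-off VaPo can be evaluated by looping over accident years and development periods. The following minimal sketch collects, for each accounting year i+j, the number of units of the corresponding zero coupon bond \(\mathfrak{Z}^{(i+j)}\); the function name and the data layout (a dictionary of the latest cumulative payments \(C^{\mathrm{ac}}_{i,I-i}\) and a list of chain-ladder factors) are hypothetical.

```python
def runoff_cash_flows(latest_cumulative, factors, I):
    """Units of zero coupon bonds in the run-off VaPo at time I.

    latest_cumulative: dict mapping accident year i to C_{i, I-i}.
    factors: chain-ladder factors [f_0, ..., f_{J-1}].
    Returns a dict mapping accounting year i + j to the expected payment
    C_{i,I-i} * prod_{l=I-i}^{j-2} f_l * (f_{j-1} - 1), summed over i.
    """
    J = len(factors)
    flows = {}
    for i, cumulative in latest_cumulative.items():
        for j in range(I - i + 1, J + 1):
            payment = cumulative * (factors[j - 1] - 1.0)
            flows[i + j] = flows.get(i + j, 0.0) + payment
            cumulative *= factors[j - 1]  # roll forward to development period j
    return flows

# Hypothetical example with I = 3 and J = 2.
flows = runoff_cash_flows({2: 100.0, 3: 50.0}, [2.0, 1.5], I=3)
```

The sum over all accounting years equals the chain-ladder reserve, i.e. the predicted ultimate claims minus the latest cumulative payments.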

For the premium liability risk of attritional claims of new business, second line in (10.22), we obtain the VaPo, see Proposition 10.23 and Corollary 10.4,

with . To further simplify the model we assume that we can approximate the distribution of \(C_{I+1,0}^{\mathrm{ac}}\) by a gamma distribution, i.e. we assume

with parameters γ I+1>0 and c I+1>0 calibrated by, see (10.17),

For the premium liability risk of new business large claims \(\mathbf{X}^{\mathrm{lc,~ri}}_{(I+1)}\), third line in (10.22), we obtain using the Pareto assumption with χ>1

Finally, the life-time annuity portfolio, fourth line in (10.22), is given by

$$ \mathrm{VaPo}_I \bigl(\mathbf{X}^{\dagger}_{(I+1)} \bigr)= L_{x} ~ \sum_{k=1}^{55} \Biggl(\prod_{s=1}^{k} p_{x+s} \Biggr) ~a~\mathfrak{I}. $$
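The number of units of the instrument \(\mathfrak{I}\) in this annuity VaPo is a sum of survival products. A minimal sketch, in which the function name is hypothetical and the list `survival_probs` is assumed to hold \(p_{x+1}, p_{x+2}, \ldots\) in this order:

```python
def annuity_units(lives, annual_payment, survival_probs):
    """Units L_x * a * sum_k prod_{s=1}^{k} p_{x+s} of the instrument in the
    life-time annuity VaPo; survival_probs[s - 1] is assumed to be p_{x+s}."""
    survival = 1.0  # running product of the one-year survival probabilities
    total = 0.0
    for p in survival_probs:
        survival *= p
        total += survival
    return lives * annual_payment * total
```

Replacing the survival probabilities by larger first order values increases the number of units, which is how a safety loading enters the annuity part.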

For the calculation of the VaPo protected against insurance technical risk we need to explain the choice of the probability distortion φ T. We assume that the probability distortion decouples into the product

$$ \varphi_t^{T}=\varphi_t^{T, \mathrm{ac}}~ \varphi_t^{T, \mathrm{lc}} ~\varphi_t^{T, {\dagger}}, $$

where the three terms provide independent density processes for the real world probability measure \(\mathbb{P}\) and the insurance technical filtration \(\mathbb{T}\). The protected VaPo at time I is then obtained by

(10.23)

where the protected VaPo for the run-off \(\mathbf{X}^{\text{run-off}}_{(I+1)}\) is obtained from Theorem 10.7 (this specifies the choice \(\varphi_{t}^{T, \mathrm{ac}}\)). The protected VaPo for the attritional claims from new business \(\mathbf{X}^{\mathrm{ac}}_{(I+1)}\) is also obtained from \(\varphi_{t}^{T, \mathrm{ac}}\) (where we include a distortion for the first payment \(C_{I+1,0}^{\mathrm{ac}}\) and the remaining payments are treated simultaneously with the run-off liabilities of accident years i≤I according to Theorem 10.7). \(\varphi_{t}^{T, \mathrm{lc}}\) distorts the large claims \(X_{I+1}^{\mathrm{lc}}\). Finally, the distorted life-time annuity VaPo is obtained by choosing an appropriate first order life table.

In order to calculate an explicit example we would like to specify all the terms in detail. Therefore, we need to know the protected VaPo at time I for the accounting condition (a) and at time I+1 for the insurance contract condition (b) (acceptability of chosen business plan).

(a) Protected VaPo for the Accounting Condition at Time I

We choose for the run-off of the old attritional claims

$$ \mathrm{VaPo}^\mathrm{prot}_I \bigl(\mathbf{X}^{\text {run-off}}_{(I+1)} \bigr)= \sum_{i=I+1-J}^{I} C^\mathrm{ac}_{i,I-i} ~\sum_{j=I-i+1}^{J} ~\prod_{l=I-i}^{j-2} f^{(+I)}_l ~ \bigl(f^{(+I)}_{j-1}-1 \bigr) ~\mathfrak{Z}^{(i+j)}, $$

where the risk-adjusted chain-ladder factors \(f^{(+I)}_{l}\) are given by Theorem 10.7 with risk aversion parameters α, \(\widetilde{\alpha} > 0\). For the attritional claims from premium liability risk of new business we choose

with where ψ nb>0 is an appropriate loading factor. For the large claims from premium liability risk we choose

where we assume that \(X_{I+1}^{\mathrm{lc}}\) has a Pareto distribution with Pareto parameter χ +∈(1,χ) and threshold θ>0 under the measure \(\mathbb{P}^{+}\), note that the retained claim is decreasing in χ. Finally, the protected life-time annuity VaPo is given by

$$ \mathrm{VaPo}^\mathrm{prot}_I \bigl(\mathbf{X}^{\dagger}_{(I+1)} \bigr) =L_{x} ~ \sum_{k=1}^{55} \Biggl(\prod_{s=1}^{k} p^+_{x+s} \Biggr) ~a~\mathfrak{I}, $$

where \((p^{+}_{x+s})_{x,s}\) is an appropriate first order life table according to Example 8.8.

(b) Protected VaPo for the Insurance Contract Condition at Time I+1

The run-off of old attritional claims from time I is at time I+1 given by

where \(f^{(+,I+1)}_{l}\) are the risk-adjusted chain ladder factors at time I+1 given by Theorem 10.7 (for the parameter update process we use Lemma 10.5). For the attritional claims from premium liability risk of new business we have

$$ \mathrm{VaPo}^\mathrm{prot}_{I+1} \bigl(\mathbf{X}^\mathrm{ac}_{(I+1)} \bigr)= C_{I+1,0}^\mathrm{ac} \Biggl(\mathfrak{Z}^{(I+1)}+ \sum_{j=1}^{J} \prod _{l=0}^{j-2} f^{(+,I+1)}_l ~ \bigl(f^{(+,I+1)}_{j-1}-1 \bigr) \mathfrak{Z}^{(I+1+j)} \Biggr). $$

For the large claims from premium liability risk we have (note that we have assumed an immediate settlement and thus \(\mathcal{T}_{I+1}\)-measurability)

the only remaining uncertainty is the possible reinsurance default. Finally, the protected life-time annuity VaPo is given by

$$ \mathrm{VaPo}^\mathrm{prot}_{I+1} \bigl(\mathbf{X}^{\dagger}_{(I+1)} \bigr) =L_{x+1} ~ \sum_{k=1}^{55} \Biggl(\prod_{s=2}^{k} p^+_{x+s} \Biggr) ~a~\mathfrak{I}, $$

where \((p^{+}_{x+s})_{x,s}\) is the same first order life table as above.

Note that if we are only interested in acceptability (insurance contract condition (b)) we do not need to specify the loadings in accounting condition (a) introduced by the measures \(\mathbb{P}^{+}\). This is further discussed in Sect. 10.6.4.

6.2 Asset Portfolio and Premium Income

In the previous subsection we have considered the insurance liabilities. There are two kinds of insurance liabilities, the ones that were triggered in accident years i≤I (past, prior accident years) and the ones that occur in accident year I+1 (future). The ones that belong to the prior years i≤I are represented by

$$ \mathrm{VaPo}^\mathrm{prot}_I \bigl(\mathbf{X}^{\text {run-off}}_{(I+1)} \bigr) +\mathrm{VaPo}^\mathrm{prot}_I \bigl( \mathbf{X}^{\dagger}_{(I+1)} \bigr). $$

For these liabilities we hold an asset portfolio denoted by with value \(S_{I}^{(I-)}\) at time I. Secondly, there are the claims with accident year I+1 that belong to the new business

$$ \mathrm{VaPo}^\mathrm{prot}_I \bigl(\mathbf{X}^\mathrm{ac}_{(I+1)} \bigr)+ \mathrm{VaPo}^\mathrm{prot}_I \bigl( \mathbf{X}^\mathrm{lc,~ri}_{(I+1)} \bigr). $$

For these claims we obtain an insurance premium (for new business). In general, the insurance premium that is collected for accident year I+1 is not entirely known at the beginning of accounting year I+1. This may be for different reasons:

  • Often, the expiry of the contracts is December 31. Some of these contracts are renewed, others are canceled. Usually, on January 1, the insurance company does not have all the information about the renewals and adaptions of contracts that expire on December 31.

  • During accounting year I+1 the insurance company will write and sell new contracts. Of course, this number of new contracts is also random at time I.

Therefore, the exposure in accounting year I+1 is not completely known at the beginning of the accounting year. Moreover, we may have contracts whose exposures extend into accounting year I+2, e.g. assume we sell a one-year contract on April 1, year I+1, and collect the entire one-year premium in accounting year I+1. This contract provides us with a three-month risk exposure also in accounting year I+2, which makes it necessary to build so-called unearned premium reserves at the end of accounting year I+1 for the exposure in accounting year I+2 that has not yet been experienced. For more details see Sect. 1.1.1 of Wüthrich–Merz [166].
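The April 1 example can be checked with a small calculation. The sketch assumes a pro rata temporis allocation of the premium by months and contracts that start on the first day of a month; both the allocation rule and the function name are assumptions made for illustration.

```python
def unearned_fraction(start_month, term_months=12):
    """Fraction of the premium that covers exposure after December 31 of the
    year of inception, ASSUMING a pro rata allocation by months.

    start_month: 1 for a contract starting on January 1, 4 for April 1, etc.
    """
    months_in_first_year = 12 - (start_month - 1)
    earned_months = min(months_in_first_year, term_months)
    return (term_months - earned_months) / term_months
```

For the one-year contract starting on April 1 this gives 3/12, i.e. a quarter of the premium belongs to the exposure in accounting year I+2, whereas a contract starting on January 1 needs no unearned premium reserve.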

Let us, for simplicity, assume that we do not need to build unearned premium reserves, i.e. we only have one-year contracts starting on January 1 for which we collect a yearly premium. We assume that this premium is divided into two parts \(\varPi_0\) and \(\varPi_1\). \(\varPi_0\) denotes the part that we already know at the beginning of accounting year I+1 and \(\varPi_1\) is the part of the premium which is only known at the end of this accounting year. In a further simplification we assume that \(\varPi_1\) is only received at the end of accounting year I+1 and that it is independent of the financial filtration \(\mathbb{A}\). The insurance technical filtration \(\mathbb{T}\) is then enlarged such that \(\varPi_k\) is \(\mathcal{T}_{I+k}\)-measurable for k=0,1. This implies that the premium cash flow is given by

$$ \varPi_0 ~\mathfrak{Z}^{(I)}+ \varPi_1~ \mathfrak{Z}^{(I+1)}, $$
(10.24)

and the expected premium at time I for accounting year I+1 is given by the portfolio

This implies that at time I we have expected asset value (including the premium for new business)

We invest this asset value \(S_{I}^{(I)}\) into a portfolio of eligible assets, see at the end of Sect. 9.3.3,

with value \(S_{I}^{(I)}\) at time I. Note that we need to account for the expected premium at time I. Otherwise, we are short of asset values because we also account for the total expected claim of new business that corresponds to that premium. Of course, the premium for accounting year I+1 and the claim of new business with accident year I+1 stem from the same exposure, which implies that these two random variables are dependent (usually positively correlated). We will come back to this in Sect. 10.6.5.6. This will also clarify the enlarged filtration \(\mathbb{T}\) which was introduced rather sloppily in the present section.

6.3 Cost Process and Other Risk Factors

In the previous sections we have considered the VaPo of the insurance liability cash flow \(\mathbf{X}^{\mathrm{ri}}_{(I+1)}\), see (10.23), and the corresponding asset position . In order to produce these insurance contracts (and the insurance cover) additional costs are generated, e.g. commissions are paid to the insurance underwriter, salaries of the employees need to be paid, buildings need to be rented, the IT system needs to be maintained, etc. All these expenses (usually called administrative and claims handling expenses) guarantee the production and the smooth run-off of the liabilities. Here, we consider two different types of costs: (i) costs at the inception of a contract and (ii) claims handling costs. These costs (and any other expenses) generate a cost cash flow

$$ \mathbf{X}_{(I+1)}^\mathrm{costs} =\bigl(0,\ldots, 0, X^\mathrm{costs}_{I+1}, \ldots, X^\mathrm{costs}_{n} \bigr). $$

We choose a very simple stochastic model for this cost cash flow. We assume that the costs for the inceptions are paid at time I+1, are \(\mathcal{T}_{I+1}\)-measurable and are independent of the financial filtration \(\mathbb{A}\); hence they are represented by the portfolio

$$ X^\mathrm{incept}_{I+1}~\mathfrak{Z}^{(I+1)}. $$

The claims handling costs \(\mathbf{X}^{\mathrm{claims\ handling}}_{(I+1)}\) are assumed to be proportional to the insurance liability cash flow \(\mathbf{X}^{\mathrm{ri}}_{(I+1)}\) with given proportionality constant c>0.

Note that for a complete study of expenses other than insurance liability payments a comprehensive cost calculation is necessary. An appropriate activity-based cost allocation method then allocates the total expenses to the different expense classes, which makes it possible to model the cost cash flow \(\mathbf{X}_{(I+1)}^{\mathrm{costs}}\), see also Sect. 5.6 in Wüthrich et al. [168] and Buchwalder et al. [27]. Such cost allocation methods are very important because they also determine the profitability of sub-portfolios and units, but one should always keep in mind that these allocations are never unique but driven by expert judgment. For instance, how should the costs for maintenance of the IT system be allocated to different sub-portfolios? Therefore, profitability results need to be interpreted with care.

For the modeling of an insurance company further factors need to be considered. Besides the classical cost process for production and run-off of liabilities, all other expenses and risk factors of the company must be taken into account. For example, operational risk may also generate costs and losses (for operational risk modeling, see Böckner [18] and Shevchenko [144]). We assume that our cost process \(\mathbf{X}_{(I+1)}^{\mathrm{costs}}\) contains all these payouts.

6.4 Accounting Condition and Acceptability

We merge the insurance cash flow \(\mathbf{X}^{\mathrm{ri}}_{(I+1)}\), the cost cash flow \(\mathbf{X}_{(I+1)}^{\mathrm{costs}}\) and the asset position at time I to the full balance sheet approach (note that the asset portfolio contains the value of the premium of new business). The protected VaPo of insurance liabilities and cost process is then given by

(10.25)

As described above, we assume (for simplicity) the claims handling costs to be proportional to the insurance liabilities resulting in the factor (1+c). In addition, we consider the distorted costs at the inception of the new contracts, modeled by the last term in (10.25).

The asset portfolio needs special care. Note that the premium part Π 1 is only observable at time I+1, hence we also have an uncertainty in the premium income. The portfolio contains the expected premium at time I+1 viewed from time I. Therefore on the asset side the true value may deviate from the one given by , the deviation at time I+1 is given by the term

(10.26)

We may also introduce a (negative) risk margin on the asset side for the uncertainty in this premium income cash flow. For simplicity we refrain from doing so.

(a) Accounting Condition

We first evaluate the accounting condition. We apply an accounting principle at time I to (10.25). This provides liability value

For the accounting condition to be fulfilled at time I we need to have

$$ S_I^{(I)} \ge Q_I \bigl[\mathbf{X}^\mathrm{liability}_{(I+1)} \bigr]. $$

The liability side considers the run-off \(\mathbf{X}_{(I+1)}^{\text{run-off}}\) of old claims liabilities, the run-off of the life-time annuity \(\mathbf{X}^{\dagger}_{(I+1)}\), the claim generated by new business \(\mathbf{X}^{\mathrm{ac}}_{(I+1)}+\mathbf{X}^{\mathrm{lc,~ri}}_{(I+1)}\) (including reinsurance) and the cost process \(\mathbf{X}_{(I+1)}^{\mathrm{costs}}\) of old liabilities and new business (we assume that the reinsurance premium has already been deducted). The asset side considers the initial provisions for old claims and the premium cash flow for new business (where the reinsurance premium is already deducted). This value should be reduced if we also integrate a risk margin for the uncertainty in the premium income (10.26).

(b) Insurance Contract Condition and Acceptability

We first evaluate the cash flow in accounting year I+1. It is given by

The first term \(X_{I+1}^{\mathrm{ri}}\) is the insurance liability cash flow which consists of payments \(X_{i,I+1-i}^{\mathrm{ac}}\) for attritional claims with accident years i≤I+1, payments \(X_{I+1}^{\mathrm{lc, \ ri}}\) for large claims of new business (including the reinsurance default risk), and the annuity payments \(X_{I+1}^{\dagger}=L_{x+1}~a~I_{I+1}\). The second term \({X}_{I+1}^{\mathrm{costs}}\) consists of the expenses \(X^{\mathrm{incept}}_{I+1}\) for the inception of new business and of the claims handling costs \(X^{\mathrm{claims~handling}}_{I+1}\), which are both assumed to be -measurable. Note that these cash flows are not necessarily independent (this depends on the chosen model assumptions). The final term is the difference between the expected premium and the true premium Π 1 received during accounting year I+1.

The risk-adjusted reserves for the liabilities at time I+1 are given by

with

Note that the run-off liabilities \(\mathbf{X}^{\mathrm{ri}}_{(I+2)}\) by assumption only consist of life-time annuity payments and attritional claims (and the corresponding loading factor (1+c) for expenses). The attritional claims liabilities are predicted by Hertig’s claims reserving model (see Model Assumptions 10.1), where we have assumed that the initial attritional claim \(C^{\mathrm{ac}}_{I+1,0}\) is given by a gamma distribution. The risk-adjusted chain ladder factors \(f^{(+,I+1)}_{j}\) at time I+1 are then given by Theorem 10.7 (based on the cumulative attritional claims observations). Finally at time I, we choose a conditional risk measure ρ I according to Definition 9.15 and then the insurance contract condition (acceptability) is given by

$$ \rho_I(\mathrm{AD}_{I+1})= \rho_I \bigl(X_{I+1}+ Q_{I+1} \bigl[\mathbf{X}^\mathrm{liability}_{(I+2)} \bigr] -S_{I+1}^{(I)} \bigr) \le0, $$

for asset value

In the next subsection we give explicit numerical examples in different situations.

6.5 Solvency Toy Model in Action

In this subsection we present a numerical analysis using the risk measurement model specified in the previous subsections. First we give a detailed description of the model choice and then we present a sensitivity analysis. We assume that the basic actuarial model (Assumption 6.3) is fulfilled. Moreover, we aim to achieve acceptability and solvency at time I, i.e. we would like to study the conditional risk measure ρ I (AD I+1) of the asset deficit \(\mathrm{AD}_{I+1} =X_{I+1}+ Q_{I+1}[\mathbf{X}^{\mathrm{liability}}_{(I+2)}] -S_{I+1}^{(I)}\).

6.5.1 State Price Deflator Model

For the financial deflator φ A modeling we choose the discrete time one-factor Vasicek financial model (see Model 5.7) with parameter values provided in (5.3). We emphasize once more that the Vasicek financial model has some unpleasant features (see important Remark 5.1 and Sect. 9.4.6). For our (educational) purposes, however, it is sufficient. In the Vasicek model we obtain an affine term structure for the ZCB prices and we can calculate their expected future prices with Proposition 5.6. In Fig. 10.5 we provide these prices at time I and the corresponding expected future prices viewed from time I. We observe that we expect a slight decrease of prices which corresponds to an expected increase in the yield curve.

Fig. 10.5 ZCB prices P(I,I+s) and the corresponding expected future prices (viewed from time I) as a function of time to maturity s=0,…,16

6.5.2 Run-Off of Old Attritional Claims

For the run-off of the old attritional claims liabilities we choose Hertig’s log-normal claims reserving model with parameter uncertainty (see Model Assumptions 10.1 and 10.21). The market-value margin is calculated with the help of the risk-adjusted chain ladder factors provided by Theorem 10.7. Finally, the prior parameters, the risk aversion parameters and the data are chosen from Example 10.9. This provides the values given in Table 10.6. We see that the best-estimate reserves increase because we obtain a financial return on them. The risk-adjusted reserves and the market-value margin, on the other hand, decrease because the uncertainty in accounting year I+1 is eliminated at time I+1 (this provides a positive expected CDR gain for risk bearing).

Table 10.6 Reserves at time I and expected reserves at time I+1 under Model Assumptions 10.1 (with parameter uncertainty) of run-off liabilities of old attritional claims

6.5.3 Attritional Claims of New Business

For modeling attritional claims of new business we choose the simplified version of Model Assumptions 10.21 described in Sect. 10.6.1. We choose γ I+1=100 and c I+1=0.0082. This provides initially expected payments for attritional claims of accident year I+1

As initial loading factor we choose ψ nb=1.80 %; this corresponds to a cost-of-capital loading factor of \(r^{(I)}_{\mathrm{CoC}}=6.5~\%\) of the 99.5 % VaR of . For obtaining the expected (risk-adjusted) ultimate claim \(C^{\mathrm{ac}}_{I+1,J}\) we then use the (risk-adjusted) chain ladder factors from Hertig’s log-normal claims reserving model. This provides the values in Table 10.7. We again see that the best-estimate liabilities increase because we have a financial return on them. The risk-adjusted liabilities and the market-value margin, on the other hand, decrease because the uncertainty in accounting year I+1 is eliminated at time I+1 (this provides a positive expected CDR gain for risk bearing).

Table 10.7 Attritional claims of new business at time I and expected attritional claims of new business at time I+1

6.5.4 Large Claims of New Business

As described in Sect. 10.6.1 we choose a Pareto distribution for modeling the annual large claim \(X_{I+1}^{\mathrm{lc}}\) of new business. The Pareto parameters are given by θ=500 and χ=2. For the reinsurance cover we choose a stop-loss contract “∞ xs θ D =2000”, and the default probability of the reinsurance company is chosen to be 2 %. For the market-value margin loading we choose the loading factor of 8.31 % which corresponds to a cost-of-capital loading factor of \(r^{(I)}_{\mathrm{CoC}}=6.5~\%\) of the 99.5 % VaR above the expected large claims. Moreover, we assume that the large claims are immediately settled at the end of the accident year, i.e. there is no further claims development. This provides the (expected) liabilities given in Table 10.8. Note that as a consequence of the immediate settlement of large claims the market-value margin is completely released in accounting year I+1.

Table 10.8 Large claims of new business at time I and expected large claims of new business at time I+1

In Table 10.9 we present the situation for different reinsurance covers. The first line gives the expected large claims without reinsurance, the second line provides the figure with the stop-loss cover where we assume that the reinsurance company may default with default probability 2 % and the last line is the situation where the reinsurance company cannot default.

Table 10.9 Expected large claims of new business at time I+1 with different reinsurance cover versions
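The three cover versions can be compared with a short calculation. The following is a minimal sketch under assumptions that are not stated in this form in the text: the Pareto claim has survival function (x/θ)^(−χ) for x≥θ, a defaulting reinsurer pays nothing, defaults are independent of the claim size, and discounting as well as the market-value margin loading are ignored. The function names are hypothetical, and the output is not meant to reproduce Table 10.9.

```python
THETA = 500.0        # Pareto threshold theta from the text
CHI = 2.0            # Pareto tail parameter chi from the text
RETENTION = 2000.0   # stop-loss retention theta_D from the text
DEFAULT_PROB = 0.02  # reinsurance default probability from the text


def pareto_mean(theta: float, chi: float) -> float:
    """Mean of a Pareto claim with survival (x/theta)**(-chi), x >= theta."""
    if chi <= 1.0:
        raise ValueError("the mean is infinite for chi <= 1")
    return theta * chi / (chi - 1.0)


def pareto_limited_mean(theta: float, chi: float, cap: float) -> float:
    """E[min(X, cap)] for cap >= theta and chi != 1.

    Uses E[min(X, cap)] = theta + integral from theta to cap of the
    survival function (x/theta)**(-chi).
    """
    return theta + theta**chi * (theta ** (1.0 - chi) - cap ** (1.0 - chi)) / (chi - 1.0)


def expected_retained(theta: float, chi: float, cap: float, default_prob: float) -> float:
    """Expected retained claim if the reinsurer defaults independently.

    Assumption: in the default case the full claim stays with the insurer.
    """
    return (1.0 - default_prob) * pareto_limited_mean(theta, chi, cap) + default_prob * pareto_mean(
        theta, chi
    )


no_reinsurance = pareto_mean(THETA, CHI)
with_default = expected_retained(THETA, CHI, RETENTION, DEFAULT_PROB)
no_default = pareto_limited_mean(THETA, CHI, RETENTION)
```

Under these simplifying assumptions the three expected values are approximately 1,000, 877.5 and 875, so an independent default probability of 2 % moves the expectation only slightly.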

6.5.5 Life-Time Annuity

For the modeling of the life-time annuity liabilities we choose exactly the model and parameters from Example 8.19. In Table 10.10 we present the reserves of the life-time annuity liabilities.

Table 10.10 Life-time annuity at time I and expected values at time I+1

6.5.6 Premium Income and Administrative Costs

As described above, we divide the administrative costs into two different categories, namely costs for claims handling and costs at the inception of a contract. For the claims handling costs we assume that they are completely in parallel with the claims payments and therefore we choose a constant cost ratio of c=5 % (on the best-estimate reserves for expected claims handling costs and on the risk-adjusted reserves in order to obtain also a market-value margin for the uncertainties in the cost cash flow).

For the costs at the inception of a contract we assume that these costs are paid at the end of the accounting year and that these inception costs are log-normally distributed with an expected value of 4519 and a coefficient of variation of 10 % (we do not calculate a loading for these inception costs). The total costs are provided in Table 10.11.

Table 10.11 Administrative costs at time I and expected values at time I+1

Finally, we need to model the uncertainty in the premium income. We assume that we receive an expected premium that has market value 28,280 at time I. This provides a claims ratio of 78.0 % (best-estimate liabilities for new business divided by the expected premium) and a cost ratio of 19.8 % (inception costs and claims handling costs related to new business divided by the expected premium). If we add these two ratios we obtain the combined ratio of 97.8 %, which means that on average the business is profitable (since the ratio is below 100 %).

In order to determine the uncertainty in the premium income we simply choose a log-normal distribution for Π 1 with mean 28,280 and coefficient of variation of 5 %. The uncertainty at time I+1 is then determined by . Moreover, we do not choose a margin for this uncertainty.
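Both the inception costs and the premium income are specified through a mean and a coefficient of variation, whereas a simulation needs the parameters of the underlying normal distribution. The following minimal sketch shows the standard conversion; the function name `lognormal_params` and the seed are illustrative assumptions, not part of the model description.

```python
import math
import random


def lognormal_params(mean: float, cov: float) -> tuple[float, float]:
    """Return (mu, sigma) of log X for a log-normal X with given mean and CoV.

    Uses sigma^2 = log(1 + CoV^2) and mu = log(mean) - sigma^2 / 2.
    """
    sigma2 = math.log(1.0 + cov**2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


# Inception costs: expected value 4519, coefficient of variation 10 %
mu_cost, sigma_cost = lognormal_params(4519.0, 0.10)

# Premium income: mean 28,280, coefficient of variation 5 %
mu_prem, sigma_prem = lognormal_params(28280.0, 0.05)

# A few illustrative draws of the premium income (the seed is arbitrary)
rng = random.Random(2024)
premium_draws = [rng.lognormvariate(mu_prem, sigma_prem) for _ in range(5)]
```

Subtracting the log-variance correction σ²/2 ensures that the simulated variable has the prescribed mean rather than the prescribed median.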

Remark on the choice of the insurance technical filtration \(\mathbb {T}\): In Sect. 10.6.1 we write that \(\mathbb{T}\) is generated by the insurance liability cash flows. Furthermore, before formula (10.24) we then say that the insurance technical filtration is enlarged. This enlargement should be done such that (i) the cost processes and the premium income become \(\mathbb{T}\)-adapted, and (ii) the prediction of the claims cash flows does not change.

6.5.7 Asset Portfolio and Full Balance Sheet

Next, we specify the choice of the asset portfolio. The asset portfolio consists of two parts. The first part is a proportion of the replicating portfolio of the expected liabilities and the second part is invested in other asset classes. We denote the asset portfolio at time I by . Then this asset portfolio is given by an aggregation of six sub-portfolios

where we are going to describe the six sub-portfolios \(\mathfrak{U}^{(l)}\). The insurance cash flow (outstanding liabilities including administrative cost cash flow) is given by \(\mathbf{X}^{\mathrm{liability}}_{(I+1)}\), see (10.25). For the first sub-portfolio we choose a replicating portfolio for the ratio w 1=90 %, i.e.

$$ \mathfrak{U}^{(1)}= w_1~\mathrm{VaPo}_I \bigl( \mathbf{X}^\mathrm{liability}_{(I+1)} \bigr). $$

This means that we hold w 1=90 % of the VaPo on the asset side of the balance sheet. The remaining asset values are invested as follows: We invest in the bank account

$$ \mathfrak{U}^{(2)}= w_2~ \mathfrak{B}, $$

in basis financial instruments \(\mathfrak{A}^{(l)}\), l=3,4,5, satisfying the Vasicek financial model with exponential growth price processes according to Proposition 5.5 (see also Model 5.7) such that

$$ \mathfrak{U}^{(l)}= w_l~\mathfrak{A}^{(l)}, $$

and finally \(\mathfrak{U}^{(6)}\) is a European put option on \(\mathfrak{U}^{(3)}\) with maturity I+1 and strike equal to the expected value of the price of \(\mathfrak{U}^{(3)}\) at time I+1 viewed from time I. The prices of \(\mathfrak{U}^{(l)}\) at times t=I,I+1 are given by \(U_{t}^{(l)}\), for l=1,…,6. The price of the European put option at time t is calculated by

Because \(\mathfrak{U}^{(3)}\) fulfills the Vasicek financial model this European put option can explicitly be calculated with Theorem 5.13.

Next, we need to specify the parameters of the basis financial instruments \(\mathfrak{A}^{(l)}\), l=3,4,5. We denote their innovations by \(\delta_{t}^{(l)}\), their standard deviation parameters by σ (l) and their correlation parameters by c (l), see Sect. 5.2.1. Moreover, the innovation of the inflation process (modeled by instrument \(\mathfrak{I}\) for the life-time annuity example) is denoted by δ. For (ε,δ,δ (3),δ (4),δ (5)) we choose a multivariate Gaussian distribution with the following (positive definite) correlation matrix

$$ \left( \begin{array}{rrrrr} 1.00&~-0.30&0.50&0.50&0.50\\ ~-0.30&1.00&~-0.15&~-0.15&~-0.15\\ 0.50&~-0.15&1.00&0.44&0.44\\ 0.50&~-0.15&0.44&1.00&0.44\\ 0.50&~-0.15&0.44&0.44&1.00 \end{array} \right). $$
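Positive definiteness of this matrix can be checked, and correlated innovations can be drawn, with a Cholesky factorization. The sketch below uses only the Python standard library; the helper names and the seed are hypothetical, and the ordering of the components is taken to be (ε,δ,δ (3),δ (4),δ (5)) as in the text.

```python
import math
import random

# Correlation matrix of (eps, delta, delta3, delta4, delta5) from the text
CORR = [
    [1.00, -0.30, 0.50, 0.50, 0.50],
    [-0.30, 1.00, -0.15, -0.15, -0.15],
    [0.50, -0.15, 1.00, 0.44, 0.44],
    [0.50, -0.15, 0.44, 1.00, 0.44],
    [0.50, -0.15, 0.44, 0.44, 1.00],
]


def cholesky(a):
    """Lower-triangular factor with low * low^T = a.

    Raises ValueError if a is not positive definite.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def draw_innovations(low, rng):
    """One draw of standard Gaussian innovations with correlation low * low^T."""
    z = [rng.gauss(0.0, 1.0) for _ in low]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(low))]


chol = cholesky(CORR)      # succeeds, so CORR is positive definite
rng = random.Random(1)     # arbitrary seed for reproducibility
sample = draw_innovations(chol, rng)
```

The factorization succeeds only if all pivots are strictly positive, which is exactly the positive definiteness asserted above.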

Next, we specify the standard deviation parameters: we choose (σ (3),σ (4),σ (5))=(25 %,10 %,5 %). That is, \(\mathfrak{A}^{(3)}\) is the most risky asset with an expected log return of 1.13 %, \(\mathfrak{A}^{(4)}\) provides an expected log return of 0.75 %, and \(\mathfrak{A}^{(5)}\) of 0.63 %. One should compare this to the risk-free return r I =0.50 % and to the expected log return of \(\mathfrak{I}\) given by 0.45 %. For their calculations we refer to (5.9). Thus, the European put option \(\mathfrak{U}^{(6)}\) is bought on the most risky asset \(\mathfrak{U}^{(3)}\). Finally, we need to specify w l , l=1,…,5; they are chosen such that we obtain the balance sheet given in Table 10.12. We see an excessive growth in the equity position of 9.19 %; this is because part of the market-value margin is released for risk bearing in accounting year I+1. The risk bearing capital is obtained by

$$ \mathrm{RBC}_I = 2{,}279+8{,}074= 10{,}353. $$
Table 10.12 Balance sheet at time I and predicted balance sheet at time I+1

6.5.8 Risk Assessment and Solvency Example

Above we have made all the specifications that are necessary for testing the solvency of our company. First of all, we have a negative asset deficit AD I at time I of −8,074, which says that accounting condition (a) is fulfilled. For the validation of the insurance contract condition (b) (acceptability) we need to simulate the distribution of the balance sheet positions at time I+1. We have run 100,000 Monte Carlo simulations in order to obtain the empirical distribution of the asset deficit AD I+1 (viewed from time I).

In Fig. 10.6 we give the empirical distribution of the asset deficit AD I+1 at time I+1, conditional on the information available at time I. The conditional mean is and we can now evaluate any conditional risk measure ρ I on this empirical distribution given in Fig. 10.6. The right tail of the empirical distribution of the asset deficit is shown in Fig. 10.7, and in Fig. 10.8 we give the log-log plot of its survival distribution. The log-log plot has a slope of roughly −2. This corresponds to heavy-tailedness and it is driven by the large claims distribution with parameter χ=2.

Finally, in Table 10.13 we present the acceptability and solvency analysis. As conditional risk measures ρ I we choose value-at-risk (VaR) and conditional tail expectation (CTE), both on a 99 % security level, see Examples 9.8 and 9.9. We observe that we obtain solvency for the value-at-risk risk measure but we are not solvent for the more conservative conditional tail expectation risk measure. The value-at-risk risk measure calculates the threshold for severe adverse events whereas the conditional tail expectation calculates the average adverse event above this threshold. Value-at-risk provides a solvency capital SC I at time I of 7,978 which gives a free capital F I of 96. For the conditional tail expectation, on the other hand, a capital injection of 2,602 (in terms of ZCBs with maturity I+1) is necessary in order to obtain solvency.

Fig. 10.6 Empirical distribution of asset deficit AD I+1 at time I+1, conditional on the information available at time I

Fig. 10.7 Right tail of the empirical distribution of asset deficit AD I+1, conditional on the information available at time I

Fig. 10.8 Log-log plot of empirical survival distribution of asset deficit AD I+1, conditional on the information available at time I

Table 10.13 Acceptability analysis with value-at-risk and conditional tail expectation risk measures both on the security level 99 %
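To make the difference between the two risk measures concrete, the following minimal sketch evaluates both on a small, purely hypothetical sample of asset deficits. The estimators are assumptions chosen for illustration (the empirical quantile for value-at-risk, and the average of the observations strictly above it for the conditional tail expectation); they are not taken from Examples 9.8 and 9.9 and do not reproduce Table 10.13.

```python
import math


def empirical_var(sample, level):
    """Smallest observation x such that at least `level` of the sample is <= x."""
    ordered = sorted(sample)
    index = max(math.ceil(level * len(ordered)) - 1, 0)
    return ordered[index]


def empirical_cte(sample, level):
    """Average of the observations strictly above the empirical VaR.

    Falls back to the VaR itself if no observation exceeds it.
    """
    var = empirical_var(sample, level)
    tail = [x for x in sample if x > var]
    return sum(tail) / len(tail) if tail else var


def is_acceptable(sample, risk_measure, level=0.99):
    """Acceptability in the sense rho(AD) <= 0 for the simulated asset deficit."""
    return risk_measure(sample, level) <= 0.0


# Hypothetical asset deficits -98, -97, ..., 0, 1 (not the simulated values of the text)
deficits = list(range(-98, 2))

var_99 = empirical_var(deficits, 0.99)
cte_99 = empirical_cte(deficits, 0.99)
acceptable_var = is_acceptable(deficits, empirical_var)
acceptable_cte = is_acceptable(deficits, empirical_cte)
```

On this toy sample the value-at-risk equals 0, so the position is acceptable, whereas the conditional tail expectation equals 1, so it is not. This mirrors, in miniature, a position that passes under value-at-risk but fails under the more conservative conditional tail expectation.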

6.5.9 Analysis of Reinsurance Cover

In this subsection we analyze the influence of the reinsurance cover on the large claims. We consider three different cases: (i) no reinsurance cover on large claims; (ii) stop-loss reinsurance cover as above with θ D =2000 and reinsurance company default probability 2 %; (iii) stop-loss reinsurance cover with θ D =2000 but no reinsurance company default. The differences in expected claims were already presented in Table 10.9. In order to make the analysis comparable we adjust the initial balance sheet by these differences in expectations. The results are provided in Table 10.14.

A first observation is that the reinsurance cover is necessary to obtain solvency under the value-at-risk risk measure: the truncation of large excess losses is of high importance in this example. A second observation is that the influence of the reinsurance cover is much higher for the conditional tail expectation than for the value-at-risk risk measure. This is plausible because the conditional tail expectation measures not only the threshold but also the size by which the threshold is exceeded. In many cases the conditional tail expectation should therefore be preferred.

Finally, we observe that the reinsurance default probability of 2 % does not affect the picture too much, i.e. in this example we can almost neglect the fact that the reinsurance company can default. However, this picture is very misleading! The reason is that reinsurance defaults have been chosen independently of any other risk factors (in particular, independently of insurance technical events through the choice of the financial instrument \(\mathfrak{B}^{(I+1)}\)). If we, for example, link reinsurance defaults to large losses of our company then this picture changes dramatically and reinsurance default considerations become important, see Table 10.15.

In Table 10.15 we present the example where the reinsurance company defaults exactly in the 2 % worst case events of large claims, i.e. whenever the large claim satisfies

$$ X_{I+1}^\mathrm{lc} > \mathrm{VaR}_{98~\%} \bigl( X_{I+1}^\mathrm{lc} \bigr)= \theta~(1-98~\%)^{-1/\chi}=3{,}535, $$

then the reinsurance company defaults. Hence, exactly the worst large claims are not covered because in these cases the reinsurance company defaults (of course, in that case reinsurance defaults need to be chosen \(\mathbb{T}\)-adapted). Table 10.15 shows that in this case there is little point in buying reinsurance, because having reinsurance is almost as bad as having no reinsurance.

Table 10.14 Solvency analysis for different reinsurance covers
Table 10.15 Solvency analysis for different dependence structures of reinsurance defaults
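The worst-case default rule can be sketched in a few lines. This is a minimal illustration under stated assumptions: the Pareto claim has survival function (x/θ)^(−χ) for x≥θ (consistent with the quantile formula above), and a defaulting reinsurer pays nothing, so that the full claim stays with the insurer. The function names and the seed are hypothetical.

```python
import random

THETA, CHI = 500.0, 2.0   # Pareto parameters from the text
RETENTION = 2000.0        # stop-loss retention theta_D from the text
LEVEL = 0.98              # reinsurer defaults in the worst 2 % of large claims


def pareto_quantile(p, theta=THETA, chi=CHI):
    """Quantile theta * (1 - p)**(-1/chi) of the Pareto distribution."""
    return theta * (1.0 - p) ** (-1.0 / chi)


def retained_claim(x, threshold, retention=RETENTION):
    """Claim kept by the insurer if the reinsurer defaults exactly when x > threshold.

    Assumption: a defaulting reinsurer pays nothing (no recovery).
    """
    if x > threshold:
        return x                  # default: the stop-loss cover is worthless
    return min(x, retention)      # no default: claim is capped at the retention


threshold = pareto_quantile(LEVEL)   # roughly 3,535, as in the text

# Inverse-transform sampling of a few large claims (the seed is arbitrary)
rng = random.Random(7)
claims = [pareto_quantile(rng.random()) for _ in range(5)]
retained = [retained_claim(x, threshold) for x in claims]
```

Claims between the retention 2,000 and the threshold are capped at 2,000, but every claim above the threshold falls back on the insurer in full, which is why the cover gives so little relief in the far tail.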

6.5.10 Analysis of European Put Option

In the next analysis we assume that we do not have the put option \(\mathfrak{U}^{(6)}\) on the most risky asset \(\mathfrak{U}^{(3)}\) but instead we put its value into the bank account \(\mathfrak{B}\). For the reinsurance cover we choose an independent default probability of 2 %, see Table 10.14. This provides the results in Table 10.16. We see that we are short of free capital (the free capital is reduced by 96+27=123 for the value-at-risk risk measure) which means that we need to have the European put option on the most risky asset in order to stay solvent for the value-at-risk risk measure (or otherwise we need to invest in less risky assets).

Table 10.16 Solvency analysis in terms of the European put option asset position

6.5.11 Other Asset Allocations

In this subsection we illustrate the importance of optimal asset allocations. In Table 10.12 we have chosen an asset allocation such that \(\mathfrak{U}^{(1)}\) corresponds to 90 % of the VaPo. We now increase this ratio to 100 % which means that \(\mathfrak{U}^{(1)}=\mathrm{VaPo}_{I}(\mathbf{X}_{(I+1)}^{\mathrm{liability}})\). The remaining asset positions are reduced proportionally such that the total asset value remains constant. This provides the balance sheet in Table 10.17. We see that the expected value of \(\mathfrak{U}^{(1)}\) now moves in parallel with the best-estimate reserves of the insurance liabilities and the administrative cost process \(\mathbf{X}^{\mathrm{liability}}=\mathbf{X}^{\mathrm{ri}}+\mathbf{X}^{\mathrm{costs}}\), i.e. we have a better ALM strategy at the price of a lower expected log return (0.56 % is reduced to 0.55 %). If we now run the simulations we obtain the results presented in Table 10.18. We see that the free capital increases because the asset portfolio replicates the expected liabilities more closely.

We can even go one step further and choose as asset allocation the protected VaPo, i.e. \(\mathfrak{U}^{(1)}= \mathrm{VaPo}^{\mathrm{prot}}_{I}(\mathbf{X}_{(I+1)}^{\mathrm{liability}})\) and scale the remaining assets proportionally, that is, we choose the values in Table 10.19. In Table 10.18 we see that this asset allocation further increases the free capital (and reduces the solvency capital). As in Artzner–Eisele [4] we may now ask the question for which asset allocation the free capital is maximized. Of course, the answer depends on the chosen conditional risk measure ρ I and on the set of eligible assets. In particular, we need to make sure that this pair does not allow for acceptability arbitrage (see Artzner et al. [7], Remark 9.42 and Example 9.46). This becomes especially important when we allow for short positions (which was not the case in the examples above).

Table 10.17 Balance sheet at time I and predicted balance sheet at time I+1
Table 10.18 Solvency analysis for different asset allocations
Table 10.19 Balance sheet at time I and predicted balance sheet at time I+1

6.5.12 Margrabe Option

Finally, we study numerically the case where we buy a Margrabe option \(\mathfrak{M}^{(I+1)}\) to hedge financial risk. We choose the asset portfolio given in Table 10.20. We assume that we buy financial asset \(\mathfrak{U}^{(4)}\) with the same volatility and correlation parameters as above and the total volume is chosen such that \(U^{(4)}_{I}\) is equal to the technical provision obtained from the protected VaPo \(\mathrm{VaPo}^{\mathrm{prot}}_{I}(\mathbf{X}_{(I+1)}^{\mathrm{liability}})\) at time I. This protected VaPo generates value \(V^{+}_{I+1}\) at time I+1 and we buy the Margrabe option \(\mathfrak{M}^{(I+1)}\) with maturity I+1 to protect against possible shortfalls in the asset value. The price of the Margrabe option \(\mathfrak{M}^{(I+1)}\) at time I is given by

Numerical simulation provides the value of the Margrabe option given in Table 10.20. If we hold this Margrabe option we are protected against adverse financial events, and from Table 10.20 we see that this asset portfolio generates an expected financial log return of 0.62 %. If we instead invested the price of the Margrabe option into the bank account \(\mathfrak{B}\) we would have a higher expected log return of 0.71 %, but we would then have no protection against financial shortfalls in the form of the Margrabe option. Note that we also change the total value of the assets so that we obtain a meaningful analysis.

Table 10.20 Balance sheet at time I and predicted balance sheet at time I+1 including the Margrabe option \(\mathfrak{M}^{(I+1)}\)

We now compare these two situations in the light of solvency: (i) with Margrabe option, and (ii) without Margrabe option but with a correspondingly increased value in the bank account \(\mathfrak{B}\). We see in Table 10.21 that in this case the Margrabe option prevents insolvency. We accept a lower expected log return of 0.62 % versus 0.71 %, but in exchange we obtain a guarantee in terms of the Margrabe option. We also see that the Margrabe option is comparatively cheap: it costs 2,368 but gives an additional free capital of almost 5,000!

Table 10.21 Solvency analysis for asset allocation of Table 10.20 considering the Margrabe option

7 Concluding Remarks

In Sect. 10.6 we have presented a simple risk measurement model that allows for the study of solvency. In practice, of course, the risk landscape is much more involved. Typically, one defines different risk classes and risk factors. Then one models each of these individually by marginal distributions. Finally, the aggregation is done either by a simple aggregation mechanism (which often is not really consistent in a mathematical sense) or by using copulas.

The risk classes typically studied are (see Sandström [141], Chap. 10):

  • Financial market and ALM risk. Typical asset classes studied are interest rate, equity, currency, commodity, real estate, hedge fund, private equity, etc.

  • Credit risk. Default or downgrading of mortgages, fixed income and reinsurance companies, etc.

  • Non-life insurance risk. Run-off risk, premium liability risks including catastrophic events.

  • Life insurance risk. Mortality and longevity risks, disability and health risks, lapse rates.

  • Operational risk.

  • Other risks like concentration, liquidity, etc.

Between many of these risk classes and risk factors we can have dependencies that are not always easily captured in an appropriate model. Moreover, especially in life insurance products, one often has policyholder options and guarantees that are difficult to model. In addition, minimal interest rate guarantees are often combined with profit sharing (and legal quota) which lead to implicit equations.

This may give the impression that the basic actuarial model (Assumption 6.3) is too restrictive. This is not necessarily the case. In many situations it suffices to choose the right financial instruments \(\mathfrak{A}^{(i)}\) for replication; this has for instance been done for reinsurance defaults in (10.22) by the choice of \(\mathfrak{B}^{(I+1)}\). This portfolio then replicates the liabilities in terms of financial instruments. However, if these financial instruments do not belong to the set of (traded) eligible assets we are allowed to invest in, then this introduces additional ALM risk. This is also the case in Sect. 7.4 where we construct the approximate VaPo.

Concluding Remarks

We have tried to model risk factors and financial exposures very accurately. This hopefully results in a reasonable solvency model which allows one to evaluate critical financial situations. However, solvency modeling should go much beyond simple quantitative analysis. The aim should be to develop a solvency model which is sufficiently sophisticated on the one hand, and sufficiently simple on the other hand. Complexity is needed to get an appropriate description of the risk factors and their interaction. Simplicity is needed to understand and interpret the sensitivities and the influence of these risk drivers. The risk manager should know exactly the weaknesses and the strengths of his model and how it is embedded into the real world problem he tries to solve. This requires a deep understanding of the real world problem and the corresponding mathematical model, which in particular allows him to identify the gaps between the real world problem and his model.

Besides this quantitative view there are also risk factors that cannot be captured by a mathematical model. In particular, the human factor is a crucial one, and one should always be aware of the fact that all systems are designed and run by human beings, where it is natural that errors just happen. Therefore, it is important that also sensible qualitative control systems are in place and that these are revised on a regular basis.

The risk manager should always keep in mind that the ultimate goal of solvency is the protection of the policyholder. Therefore, it is recommended that he takes a sufficiently conservative position about the unknowns, because past experience has shown that risk assessment is too optimistic in most cases.