1 Introduction

1.1 Motivation

The classical loss reserving problem concerns the estimation of existing (unpaid and/or unreported) claim liabilities from past data. Traditional reserving techniques, widely used in practice, are applied to data at a certain level of aggregation (usually quarters or years), often presented in triangles. Parameter estimates typically rely on only a few such data points, sometimes only one. This makes those techniques very vulnerable to the presence of outliers.

This issue is well known. However, being able to quantify the specific impact of each observation on certain statistics of interest provides greater information regarding the nature of the data at hand and will often provide insights regarding the techniques themselves. This is of particular importance when implementing and adjusting models. The objective of this paper is to provide a mathematically tractable approach to understanding how changes in each incremental claim in a loss triangle will impact certain statistics of interest. The presented impact functions are not designed to detect outliers but rather to deepen our understanding of how outliers may impact the results of a model. Impact functions coupled with statistically sound procedures to detect and treat abnormal observations will improve the robustness of reserving techniques and ultimately lead to more informed and reliable decisions.

Venter and Tampubolon [5] calculate the impact of incremental claims on total reserve estimates under a range of models. They consider the traditional chain-ladder technique; however, the impacts are calculated numerically and hence no closed-form equations are provided. Additionally, individual accident years, mean squared errors (mse) and quantiles are not considered; their treatment constitutes a major contribution of this paper. Verdonck et al. [7] show that traditional chain-ladder reserve estimates are highly susceptible to even just one outlier and highlight that the impact on reserves may be positive or negative. See Verdonck et al. [7] and Verdonck and Debruyne [6] for recent robust reserving techniques.

In this paper we rigorously investigate the impact that incremental observations have on reserve estimates, their variability, and their quantiles. Notably, we provide closed-form equations for the first derivative of these statistics of interest under Mack’s Model [2]. These equations highlight numerous properties of this technique, including the areas of a loss triangle where outliers are likely to have the greatest effect on results and hence where observations should be most heavily scrutinised. It appears that observations in the corners of a loss triangle have the potential to impact results most significantly. Additionally, we compare the impact of incremental observations on reserves under Mack’s Model and the Bornhuetter–Ferguson technique [1]; the comparison suggests that the latter approach is more robust.

These techniques may be applied in practice to identify the areas of a given loss triangle to which reserves are particularly sensitive and hence where outliers, if present, may have a significant impact on results. These impact functions may also be used to compare reserve sensitivities under different techniques, as we have done for Mack’s Model and the Bornhuetter–Ferguson approach. The impact of incremental observations may be calculated for different loss triangles, and the areas of sensitivity compared with the properties of those triangles. Through such a comparative study, trends may begin to emerge, making it easier to identify anomalous observations or even whole data sets with abnormal properties.

The paper is structured as follows. In Sect. 2 we define notation and briefly review the reserving techniques that will be considered. Impact functions are defined in Sect. 2.5. The following sections summarise the impact functions for certain statistics of interest and apply them to real data where 3D graphical representations are used to highlight their features. Section 3 focuses on central estimates, Sect. 4 on mean squared errors, and Sect. 5 on quantiles. The data used for the examples in this paper is presented and discussed in Appendix A and detailed proofs for the impact functions can be found in Appendix B. Section 6 concludes.

2 Notation and framework

2.1 Loss triangles

The loss reserving problem is concerned with using currently available data to predict future claim amounts in a reliable manner. The available data is often arranged in a loss triangle which provides a visual representation of the development of claims up to the current time as well as what is required to be predicted (see Fig. 1). We denote by \(X_{i,j}\) and \(C_{i,j}\) the incremental and cumulative claims for accident year i and development year j respectively. Denote by \({\mathbb {B}}=\{X_{i,j}:i+j\le I+1\}\) the past claims data. Let \(R_i\) represent reserves for accident year i and R represent total reserves.

Fig. 1 Aggregate claims run-off triangle

2.2 Chain-ladder

The traditional chain-ladder method is probably the most famous reserving technique. This approach hinges on the assumption that development factors \(f_1,f_2,\ldots ,f_{I-1}\) exist, such that \( E[C_{i,j+1}|C_{i,j}]=f_jC_{i,j}\). These development factors are unknown and estimated by

$$\begin{aligned} {\widehat{f}}_j=\frac{\sum _{i=1}^{I-j}C_{i,j+1}}{\sum _{i=1}^{I-j}C_{i,j}},\ 1\le j\le I-1. \end{aligned}$$
(2.1)

Ultimate claims for accident year i are then estimated by \( {\widehat{C}}_{i,I}=C_{i,I-i+1}{\widehat{f}}_{I-i+1}\cdots {\widehat{f}}_{I-1}\). From here accident year reserves and total reserves are subsequently estimated by

$$\begin{aligned} {\widehat{R}}_i={\widehat{C}}_{i,I}-C_{i,I-i+1} \quad \text {and} \quad {\widehat{R}}=\sum _{i=1}^{I}{\widehat{R}}_i. \end{aligned}$$
(2.2)
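As an illustration of Eqs. (2.1) and (2.2), the following Python sketch computes development factors and reserves for a small 4 by 4 triangle. The triangle `X`, the function names and the 0-based storage are assumptions made for this example only; they are not the data of Appendix A.

```python
# Hypothetical incremental claims: X[i][j] is accident year i+1, development year j+1.
X = [
    [100.0, 50.0, 20.0, 10.0],
    [110.0, 60.0, 25.0],
    [120.0, 55.0],
    [130.0],
]

def cumulate(X):
    """Cumulative claims C[i][j] from incremental claims."""
    return [[sum(row[:j + 1]) for j in range(len(row))] for row in X]

def dev_factors(C):
    """Eq. (2.1): f[j] is the factor from development year j+1 to j+2."""
    I = len(C)
    f = []
    for j in range(I - 1):
        rows = range(I - 1 - j)  # accident years observed in both columns
        f.append(sum(C[i][j + 1] for i in rows) / sum(C[i][j] for i in rows))
    return f

def cl_reserves(X):
    """Eq. (2.2): chain-ladder reserve for each accident year."""
    C = cumulate(X)
    f = dev_factors(C)
    I = len(C)
    reserves = []
    for i in range(I):
        latest = C[i][-1]             # value on the most recent diagonal
        ultimate = latest
        for j in range(I - 1 - i, I - 1):
            ultimate *= f[j]          # project to ultimate
        reserves.append(ultimate - latest)
    return reserves

R = cl_reserves(X)
print(dev_factors(cumulate(X)))  # the first factor is 495/330 = 1.5
print(R, sum(R))
```

The oldest accident year is fully developed, so its reserve is zero by construction.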

2.3 Mack’s model

Mack’s Model [2] is able to retain much of the simplicity of the deterministic chain-ladder whilst providing a formula for the mse of reserve estimates. The assumptions underlying classical chain-ladder reserves are the same for the first moment of reserves under Mack’s Model. The mse of prediction for individual accident year reserves is given by

$$\begin{aligned} \text {mse}({\widehat{R}}_i)= & {} C_{i,I-i+1}\sum _{j=I-i+1}^{I-1}(f_{I-i+1}\cdots f_{j-1}\sigma ^2_jf^2_{j+1}\cdots f^2_{I-1})\nonumber \\{} & {} +C_{i,I-i+1}^2(f_{I-i+1}\cdots f_{I-1}-{\widehat{f}}_{I-i+1}\cdots {\widehat{f}}_{I-1})^2. \end{aligned}$$
(2.3)

The estimate for the mse of total reserves is given by

$$\begin{aligned} \widehat{\text {mse}({\widehat{R}})}=\sum _{i=2}^{I}\left\{ \text {mse}({\widehat{R}}_i)+{\widehat{C}}_{i,I}\left( \sum _{j=i+1}^{I}{\widehat{C}}_{j,I}\right) \sum _{k=I-i+1}^{I-1}\frac{2{\widehat{\sigma }}_k^2/{\widehat{f}}_k^2}{\sum _{n=1}^{I-k}C_{n,k}}\right\} \end{aligned}$$
(2.4)

where

$$\begin{aligned} {\widehat{\sigma }}_k^2=\frac{1}{I-k-1}\sum _{i=1}^{I-k}C_{i,k}\left( \frac{C_{i,k+1}}{C_{i,k}}-\widehat{f_{k}}\right) ^2,\ 1\le k \le I-2. \end{aligned}$$
(2.5)

This paper is not concerned with tail-fitting. However, as outlined in [2], this approach leaves us without an estimator for \(\sigma _{I-1}\). We follow the same approach as in that paper and estimate \(\sigma _{I-1}\) by requiring that \(\frac{{\widehat{\sigma }}_{I-3}}{{\widehat{\sigma }}_{I-2}}=\frac{{\widehat{\sigma }}_{I-2}}{{\widehat{\sigma }}_{I-1}}\) holds as long as \({\widehat{\sigma }}_{I-3}>{\widehat{\sigma }}_{I-2}\), leading to

$$\begin{aligned} {\widehat{\sigma }}_{I-1}^2=\min \left( \frac{{\widehat{\sigma }}_{I-2}^4}{{\widehat{\sigma }}_{I-3}^2},\min ({\widehat{\sigma }}_{I-3}^2,{\widehat{\sigma }}_{I-2}^2)\right) . \end{aligned}$$
(2.6)
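A minimal sketch of Eqs. (2.5) and (2.6) is given below. The cumulative triangle `C` is hypothetical, the function name `sigma2_estimates` is an assumption of this example, and the code presumes at least four accident years so that the extrapolation in Eq. (2.6) is defined.

```python
# Hypothetical cumulative claims: C[i][j] is accident year i+1, development year j+1.
C = [
    [100.0, 150.0, 170.0, 180.0],
    [110.0, 170.0, 195.0],
    [120.0, 175.0],
    [130.0],
]

def sigma2_estimates(C):
    """Return (f, s2): development factors and variance parameters sigma_k^2."""
    I = len(C)
    f, s2 = [], []
    for j in range(I - 1):
        rows = range(I - 1 - j)
        f.append(sum(C[i][j + 1] for i in rows) / sum(C[i][j] for i in rows))
    for k in range(I - 2):               # Eq. (2.5), 1-based k = 1..I-2
        rows = range(I - 1 - k)
        ss = sum(C[i][k] * (C[i][k + 1] / C[i][k] - f[k]) ** 2 for i in rows)
        s2.append(ss / (I - k - 2))      # divisor I-k-1 in the 1-based notation
    # Eq. (2.6): extrapolate the final variance parameter
    s2.append(min(s2[-1] ** 2 / s2[-2], min(s2[-2], s2[-1])))
    return f, s2

f, s2 = sigma2_estimates(C)
print(s2)
```

In this example the variance parameters decrease with development year, so the extrapolated value is the ratio term of Eq. (2.6).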

2.4 Bornhuetter–Ferguson reserves

The Bornhuetter–Ferguson (BF) reserving methodology [1] is an opposing extreme to the chain-ladder (and Mack’s Model) in that it uses prior estimates of ultimate claims and development patterns rather than inducing them from the available data to date. It can be considered a highly robust method as the presence of outliers will not influence reserve estimates. However in practice, chain-ladder development factors are often used to infer the development pattern when using the BF approach. It is this BF approach that we will consider (otherwise meaningful results will not present themselves). This means that the only difference between the two techniques is the estimate of ultimate claims for a given accident year. In particular, the BF method uses a prior (exogenously determined) estimate whereas the chain-ladder method uses the available data to estimate ultimate claims. Bornhuetter–Ferguson reserves are given by

$$\begin{aligned} {\widehat{R}}_i^{BF}= & {} {\widehat{C}}^{BF}_{i,I}-C_{i,I-i+1}= {\widehat{\mu }}_i-{\widehat{\mu }}_i\frac{1}{{\widehat{f}}_{I-i+1}\cdot \ldots \cdot {\widehat{f}}_{I-1}} \quad \text {and } \quad {\widehat{R}}^{BF}=\sum _{i=1}^{I}{\widehat{R}}_i^{BF},\nonumber \\ \end{aligned}$$
(2.7)

where \({\widehat{\mu }}_i\) represents the prior estimate of ultimate claims for accident year i.
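The sketch below evaluates Eq. (2.7) with the development pattern inferred from chain-ladder factors, as described above. The triangle `C`, the prior `mu_prior` and all function names are hypothetical. It also checks that setting the prior equal to the chain-ladder ultimate reproduces the chain-ladder reserve, which is the setting used for the later comparisons.

```python
# Hypothetical cumulative claims: C[i][j] is accident year i+1, development year j+1.
C = [
    [100.0, 150.0, 170.0, 180.0],
    [110.0, 170.0, 195.0],
    [120.0, 175.0],
    [130.0],
]

def remaining_factor(C, i):
    """Product of chain-ladder factors still to be applied to accident year i (0-based)."""
    I, prod = len(C), 1.0
    for j in range(I - 1 - i, I - 1):
        rows = range(I - 1 - j)
        prod *= sum(C[r][j + 1] for r in rows) / sum(C[r][j] for r in rows)
    return prod

def bf_reserves(C, mu):
    """Eq. (2.7): prior ultimate times the expected undeveloped proportion."""
    return [mu[i] * (1.0 - 1.0 / remaining_factor(C, i)) for i in range(len(C))]

def cl_reserves(C):
    """Chain-ladder reserves, for comparison."""
    return [C[i][-1] * (remaining_factor(C, i) - 1.0) for i in range(len(C))]

mu_prior = [180.0, 200.0, 210.0, 230.0]          # assumed exogenous priors
cl_ultimates = [C[i][-1] * remaining_factor(C, i) for i in range(len(C))]

print(bf_reserves(C, mu_prior))
print(bf_reserves(C, cl_ultimates), cl_reserves(C))  # these two agree
```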

2.5 Using impact functions to explore reserve sensitivities

In this section we present impact functions for numerous statistics of interest under the assumptions of Mack’s Model. An impact function is able to highlight the sensitivity of a statistic of interest to a particular observation as well as pinpoint the marginal contribution of that observation to the final value of the statistic in some instances. This is done by taking the first derivative of the statistic with respect to the given observation. In our case we are interested in how an incremental claim \(X_{k,j}\) may influence a given statistic T such that the impact function is given by

$$\begin{aligned} \text {IF}_{k,j}(T)=\frac{\partial T}{\partial X_{k,j}}. \end{aligned}$$
(2.8)

Further we have that if the statistic of interest T is homogeneous of order one with respect to the \(X_{k,j}\)’s then

$$\begin{aligned} T=\sum _{\{k+j\le I+1\}}\frac{\partial T}{\partial X_{k,j}}X_{k,j}. \end{aligned}$$
(2.9)

The statistic, T, may represent reserves, the mse of reserve estimates or quantiles. It is interesting to investigate both the sign and magnitude of the impact functions to better understand the relationship a statistic has with an incremental claim. Furthermore, for those statistics that are homogeneous of order one, we may wish to see whether \(\text {IF}_{k,j}(T)\cdot X_{k,j}\) is bounded or not. Boundedness will highlight that an outlying value of \(X_{k,j}\) may only have a limited effect on T and hence this is a desirable property for robust estimators. If bounded, one should investigate the maximum value that \(\text {IF}_{k,j}(T)\cdot X_{k,j}\) may take.
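Before turning to closed-form results, Eq. (2.8) can be approximated numerically. The sketch below uses central finite differences on a hypothetical triangle and then checks the decomposition in Eq. (2.9) for total chain-ladder reserves. The triangle, the step size `h` and the function names are assumptions of this example, and finite differences are a stand-in for the analytical derivatives, not the method of this paper.

```python
# Hypothetical incremental claims: X[k][j] is accident year k+1, development year j+1.
X = [
    [100.0, 50.0, 20.0, 10.0],
    [110.0, 60.0, 25.0],
    [120.0, 55.0],
    [130.0],
]

def total_reserve(X):
    """Total chain-ladder reserve computed from incremental claims."""
    I = len(X)
    C = [[sum(row[:j + 1]) for j in range(len(row))] for row in X]
    f = [sum(C[i][j + 1] for i in range(I - 1 - j)) /
         sum(C[i][j] for i in range(I - 1 - j)) for j in range(I - 1)]
    total = 0.0
    for i in range(I):
        ultimate = C[i][-1]
        for j in range(I - 1 - i, I - 1):
            ultimate *= f[j]
        total += ultimate - C[i][-1]
    return total

def impact(stat, X, k, j, h=1e-4):
    """Central-difference approximation of d stat / d X[k][j]."""
    up = [row[:] for row in X]
    down = [row[:] for row in X]
    up[k][j] += h
    down[k][j] -= h
    return (stat(up) - stat(down)) / (2.0 * h)

IF = [[impact(total_reserve, X, k, j) for j in range(len(X[k]))]
      for k in range(len(X))]

# Eq. (2.9): for a statistic that is homogeneous of order one,
# the impacts weighted by the claims add up to the statistic itself.
euler_sum = sum(IF[k][j] * X[k][j] for k in range(len(X)) for j in range(len(X[k])))
print(euler_sum, total_reserve(X))
print(IF[0][0], IF[0][-1])  # upper left and upper right corners
```

In this example the upper left corner has a negative impact and the upper right corner a positive one.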

In the following subsections, we provide closed form equations to calculate the impact that each incremental claim is having on the aforementioned statistics in Mack’s Model. We also provide the impact that incremental claims are having on reserves under the Bornhuetter–Ferguson methodology (see Sect. 2.4).

The impact functions will be presented with the aid of an example using real data. This data is given in Appendix A. By applying Mack’s Model to this data set we calculate reserves for accident year 8 as $226,403,952, total reserves as $1,463,388,942, rmse for accident year 8 reserves as $9,448,925 and the rmse for total reserves as $45,480,914.

3 Impact functions on central estimates

3.1 Individual accident years

3.1.1 Mack’s model

The impact function for reserves of individual accident years (\({\widehat{R}}_i\)) under Mack’s Model is given by

$$\begin{aligned}{} & {} \text {IF}_{k,j}({\widehat{R}}_i) = \frac{\partial {\widehat{R}}_i}{\partial X_{k,j}} \end{aligned}$$
(3.1)
$$\begin{aligned}{} & {} \quad = {\left\{ \begin{array}{ll} 0, &{}\text {if } k>i \\ \frac{{\widehat{R}}_i}{C_{i,I-i+1}},&{} \text {if } k=i \\ {\widehat{C}}_{i,I}\sum _{p=k}^{i-1}\left( \left( \frac{1}{\sum _{q=1}^{p}C_{q,I-p+1}}\right) {{\textbf {1}}}_{\{j\le I-p+1\}}-\left( \frac{1}{\sum _{q=1}^{p}C_{q,I-p}}\right) {{\textbf {1}}}_{\{j\le I-p\}}\right) ,&{} \text {if } k\le i-1 \end{array}\right. } \end{aligned}$$
(3.2)

where empty sums equal zero. See Appendix B.1 for proof. For \(k=i\) this can be simplified further to a function only of future estimated development factors. For the other cases, the impact of an incremental claim can be understood by noting whether the claim appears in the numerator and/or the denominator of each development factor in Eq. (2.1).

$$\begin{aligned} \text {IF}_{i,j}({\widehat{R}}_i)= \frac{{\widehat{R}}_i}{C_{i,I-i+1}}= & {} \frac{C_{i,I-i+1}\left( \prod _{s=I-i+1}^{I-1}{\widehat{f}}_s-1\right) }{C_{i,I-i+1}}= \prod _{s=I-i+1}^{I-1}{\widehat{f}}_s-1 \end{aligned}$$
(3.3)
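The closed form in Eq. (3.2) can be checked against a numerical derivative. The sketch below does so on a hypothetical 4 by 4 triangle, using 1-based indices in the function arguments to mirror the notation of the text. The triangle and all function names are assumptions of this example.

```python
# Hypothetical incremental claims: X[k-1][j-1] is accident year k, development year j.
X = [
    [100.0, 50.0, 20.0, 10.0],
    [110.0, 60.0, 25.0],
    [120.0, 55.0],
    [130.0],
]

def cumulate(X):
    return [[sum(row[:j + 1]) for j in range(len(row))] for row in X]

def reserve(X, i):
    """Chain-ladder reserve for accident year i (1-based)."""
    C, I = cumulate(X), len(X)
    ultimate = C[i - 1][-1]
    for s in range(I - i + 1, I):        # factors f_s, s = I-i+1..I-1
        n = I - s                        # accident years 1..I-s enter f_s
        ultimate *= sum(C[q][s] for q in range(n)) / sum(C[q][s - 1] for q in range(n))
    return ultimate - C[i - 1][-1]

def if_closed_form(X, i, k, j):
    """Eq. (3.2) for an observed cell (k, j); all indices 1-based."""
    C, I = cumulate(X), len(X)

    def col_sum(p, col):                 # sum over q = 1..p of C_{q,col}
        return sum(C[q][col - 1] for q in range(p))

    latest = C[i - 1][-1]
    R = reserve(X, i)
    if k > i:
        return 0.0
    if k == i:
        return R / latest
    total = 0.0
    for p in range(k, i):                # p = k..i-1
        if j <= I - p + 1:
            total += 1.0 / col_sum(p, I - p + 1)
        if j <= I - p:
            total -= 1.0 / col_sum(p, I - p)
    return (R + latest) * total          # estimated ultimate times the bracketed sum

def if_numeric(X, i, k, j, h=1e-4):
    """Central-difference approximation of the same derivative."""
    up = [row[:] for row in X]
    down = [row[:] for row in X]
    up[k - 1][j - 1] += h
    down[k - 1][j - 1] -= h
    return (reserve(up, i) - reserve(down, i)) / (2.0 * h)

I = len(X)
worst = max(abs(if_closed_form(X, i, k, j) - if_numeric(X, i, k, j))
            for i in range(1, I + 1)
            for k in range(1, I + 1)
            for j in range(1, len(X[k - 1]) + 1))
print(worst)  # largest discrepancy over all cells and accident years
```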

An interesting point to note about the impact function for \({\widehat{R}}_i\) is that its value is heavily dependent on the position of the incremental claim in the loss triangle. In particular, three different cases for the accident year k have been given in Eq. (3.2) and furthermore the third case (i.e. \(k\le i-1\)) includes a summation that is further dependent on the value of k and two indicator functions that rely on the development period j. This dependence on position is a feature that is common to all impact functions provided in this paper. Notably, the impact of a positive change in any specific cell of a triangle may be positive in some other cells, negative in others, and the net effect on the reserve estimate may be positive or negative. This is an inherent effect of the reserving algorithm under consideration, and indicates how the results of that algorithm might be affected by an outlier in the original cell. Table 1 provides the impact of each incremental claim on accident year 8 reserves. A 3D graphical representation of these impacts is given in Fig. 2.

Table 1 \(\text {IF}_{k,j}({\widehat{R}}_8)\)
Fig. 2 Illustration of \(\text {IF}_{k,j}({\widehat{R}}_8)\)

The first case in Eq. (3.2) is represented to the right of accident year 8 where incremental claims from accident years greater than the year of interest have no impact on reserves. The row of columns with equal height for accident year 8 corresponds to the second case of Eq. (3.2) where incremental claims are having an equal and positive effect on the reserve estimate. Now, the area in the upper left of the loss triangle (i.e. between accident year 7 and development year 3) represents an area where all incremental claims are having a negative impact on reserves. More specifically, \( \text {IF}_{k,j}({\widehat{R}}_i)\le 0, \text { for all } k\le i-1 \text { and } j\le I-i+1 \). For \(j>I-i+1\) (and \(k\le i-1\)), the situation is somewhat murkier and we have the result that

$$\begin{aligned} \text {IF}_{k,j}({\widehat{R}}_i)> & {} 0, \text { if }\sum _{p=k}^{\min \{i-1,I-j\}}\left( \left( \frac{1}{\sum _{q=1}^{p}C_{q,I-p}}\right) \right. \nonumber \\{} & {} -\left. \left( \frac{1}{\sum _{q=1}^{p}C_{q,I-p+1}}\right) \right) <\left( \frac{1}{\sum _{q=1}^{I-j+1}C_{q,j}}\right) . \end{aligned}$$
(3.4)

This inequality is readily calculable from the original loss triangle and we have found that in most instances we have considered it holds true. Figure 2 represents when this inequality holds as we see that for \(j>I-i+1=3\), all impacts are positive.

Additionally, note that for any choice of development period j the impact is increasing with accident year k throughout the loss triangle. Now we focus on the diagonals when \(j>I-i+1\). For the most recent diagonal (i.e. \(k+j=I+1\)) we have that \( \text {IF}_{k,j}({\widehat{R}}_i)>\text {IF}_{k+1,j-1}({\widehat{R}}_i) \Longleftrightarrow \sum _{q=1}^{k}X_{q,j}<C_{k+1,j-1} \). This says that the impact will be increasing as we move up the most recent diagonal (from accident year \(k+1\) and development year \(j-1\) to accident year k and development j) if the sum of incremental claims in column j is less than the cumulative claims up to development year \(j-1\) for accident year \(k+1\). It is likely that this will hold for situations when incremental claims in later development periods are usually less than those in earlier periods and hence the column sums in these development years can be expected to be less than cumulative claims for the following accident year. Additionally, if this decreasing development pattern is present it will be more likely for this inequality to hold at later development periods than earlier ones. For the other diagonals we have that
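The condition for the most recent diagonal can be traced back to Eq. (3.2). The following is a sketch under the assumptions that \(k+1\le i-1\), so that both cells fall under the third case of Eq. (3.2), and that all column sums and \({\widehat{C}}_{i,I}\) are positive. For \(k+j=I+1\) the indicator \({{\textbf {1}}}_{\{j\le I-p+1\}}\) is non-zero only for \(p=k\), while \({{\textbf {1}}}_{\{j\le I-p\}}\) vanishes for all \(p\ge k\), hence

$$\begin{aligned} \text {IF}_{k,I-k+1}({\widehat{R}}_i)=\frac{{\widehat{C}}_{i,I}}{\sum _{q=1}^{k}C_{q,I-k+1}} \quad \text {and} \quad \text {IF}_{k+1,I-k}({\widehat{R}}_i)=\frac{{\widehat{C}}_{i,I}}{\sum _{q=1}^{k+1}C_{q,I-k}}. \end{aligned}$$

With \(j=I-k+1\), the first impact exceeds the second exactly when \(\sum _{q=1}^{k}C_{q,j}<\sum _{q=1}^{k}C_{q,j-1}+C_{k+1,j-1}\). Subtracting \(\sum _{q=1}^{k}C_{q,j-1}\) from both sides and using \(X_{q,j}=C_{q,j}-C_{q,j-1}\) gives \(\sum _{q=1}^{k}X_{q,j}<C_{k+1,j-1}\).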

$$\begin{aligned}{} & {} \text {IF}_{k,j}({\widehat{R}}_i)>\text {IF}_{k+1,j-1}({\widehat{R}}_i) \Longleftrightarrow \frac{1}{\sum _{q=1}^{k}C_{q,I-k+1}}\nonumber \\{} & {} \quad -\frac{1}{\sum _{q=1}^{k}C_{q,I-k}}>\frac{1}{\sum _{q=1}^{k+l}C_{q,I-k+1-l}}-\frac{1}{\sum _{q=1}^{k+l-1}C_{q,I-k+1-l}}, \end{aligned}$$
(3.5)

where l represents the diagonal that is being evaluated such that for the second most recent diagonal \(l=2\), for the third most recent diagonal \(l=3\) and so on. Note that in most examples that we have considered these inequalities hold and as a result we see the impact increasing for incremental claims as we move towards the top right hand corner of the loss triangle.

The final property that we have derived is that for fixed accident year k, \(\text {IF}_{k,j}({\widehat{R}}_i)\) is increasing with j for \(j\ge I-i+1\). The proofs for these properties are given in Appendix B.2.

3.1.2 Bornhuetter–Ferguson

The impact function for BF individual accident year reserves is given by

$$\begin{aligned} \text {IF}_{k,j}({\widehat{R}}^{BF}_i)={\left\{ \begin{array}{ll} 0 &{} \text {if } k\ge i\\ {\widehat{\mu }}_i\frac{\sum _{p=k}^{i-1}\left( \left( \frac{1}{\sum _{q=1}^{p}C_{q,I-p+1}}\right) {{\textbf {1}}}_{\{j\le I-p+1\}}-\left( \frac{1}{\sum _{q=1}^{p}C_{q,I-p}}\right) {{\textbf {1}}}_{\{j\le I-p\}}\right) }{({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})} &{} \text {if } k<i \end{array}\right. }\nonumber \\ \end{aligned}$$
(3.6)

Note that \({\widehat{\mu }}_i\) is set before the calculation and held fixed when taking the derivative. If we set \({\widehat{\mu }}_i={\widehat{C}}_{i,I}\) and allowed it to vary with the data, we would obtain the same results as under Mack’s Model. We will discuss some of the interesting results for \(\text {IF}_{k,j}({\widehat{R}}^{BF}_i)\) with the aid of Fig. 3, which shows the impact function for Bornhuetter–Ferguson accident year 8 reserves under the assumption that \({\widehat{\mu }}_i={\widehat{C}}_{i,I}\). The results for \(\text {IF}_{k,j}({\widehat{R}}_i^{BF})\) differ from the corresponding impact function for the chain-ladder (CL) reserves \((\text {IF}_{k,j}({\widehat{R}}_i))\) in two major ways. Firstly, incremental claims in the same accident year as the reserve under inspection have no impact on that reserve in the BF case whereas they do in the CL case. This is shown by zero values for each accident year greater than or equal to 8 in Fig. 3. Secondly, for the case when \(k<i\), under the CL approach \({\widehat{\mu }}_i\) is replaced by \({\widehat{C}}_{i,I}\) and there is no denominator term (i.e. \(({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})\) is absent). If the prior estimate of ultimate claims \({\widehat{\mu }}_i\) is less than or reasonably close to the CL estimate of ultimate claims \({\widehat{C}}_{i,I}\), then the impact of incremental claims under the BF method is less than the corresponding impact under the CL approach, as it is divided by the factor \(({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})\). Of course, this assertion depends on the difference between \({\widehat{\mu }}_i\) and \({\widehat{C}}_{i,I}\), particularly because \(({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})\) may be only slightly greater than 1 in some instances. Further, \(({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})\) will be increasing with accident year i such that the relative difference between the impacts under the BF and CL approaches will increase with i.

Fig. 3 Illustration of \(\text {IF}_{k,j}({\widehat{R}}^{BF}_8)\)

Apart from the aforementioned changes in magnitude (and the change for \(k=i\)), the trends that we observe for this impact function will be similar to what was described in Sect. 3.1.1 for \(\text {IF}_{k,j}({\widehat{R}}_i)\). The proof for this impact function as well as a formal statement regarding the relationship between \(\text {IF}_{k,j}({\widehat{R}}_i)\) and \(\text {IF}_{k,j}({\widehat{R}}^{BF}_i)\) is given in Appendix B.3.
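The relationship between the two impact functions can be illustrated numerically. In the sketch below the prior \({\widehat{\mu }}_i\) is frozen at the chain-ladder ultimate of a hypothetical triangle before any cell is perturbed, and impacts are approximated by central finite differences. The triangle, the choice of accident year 3 and the function names are assumptions of this example.

```python
# Hypothetical incremental claims: X[k-1][j-1] is accident year k, development year j.
X = [
    [100.0, 50.0, 20.0, 10.0],
    [110.0, 60.0, 25.0],
    [120.0, 55.0],
    [130.0],
]

def factor_product(X, i):
    """Product of estimated factors f_s, s = I-i+1..I-1, for accident year i (1-based)."""
    C = [[sum(row[:j + 1]) for j in range(len(row))] for row in X]
    I, prod = len(X), 1.0
    for s in range(I - i + 1, I):
        n = I - s
        prod *= sum(C[q][s] for q in range(n)) / sum(C[q][s - 1] for q in range(n))
    return prod

def cl_reserve(X, i):
    return sum(X[i - 1]) * (factor_product(X, i) - 1.0)

def bf_reserve(X, i, mu_i):
    return mu_i * (1.0 - 1.0 / factor_product(X, i))

def numeric_if(stat, X, k, j, h=1e-4):
    """Central-difference approximation of d stat / d X_{k,j} (1-based k, j)."""
    up = [row[:] for row in X]
    down = [row[:] for row in X]
    up[k - 1][j - 1] += h
    down[k - 1][j - 1] -= h
    return (stat(up) - stat(down)) / (2.0 * h)

i = 3
prod = factor_product(X, i)
mu_i = sum(X[i - 1]) * prod              # prior frozen at the chain-ladder ultimate

impacts = {}
for k in range(1, i + 1):
    for j in range(1, len(X[k - 1]) + 1):
        cl = numeric_if(lambda T: cl_reserve(T, i), X, k, j)
        bf = numeric_if(lambda T: bf_reserve(T, i, mu_i), X, k, j)
        impacts[(k, j)] = (cl, bf)
        print(k, j, round(cl, 6), round(bf, 6))
```

For \(k<i\) the chain-ladder impact equals the BF impact multiplied by the product of development factors, and for \(k=i\) the BF impact is zero while the chain-ladder impact is positive.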

3.2 Total reserves

3.2.1 Mack’s model

We have that \( {\widehat{R}}=\sum _{i=1}^{I}{\widehat{R}}_i \), such that the impact function for total reserves is simply given by

$$\begin{aligned} \text {IF}_{k,j}({\widehat{R}})=\frac{\partial }{\partial X_{k,j}}\sum _{i=1}^{I}{\widehat{R}}_i=\sum _{i=1}^{I}\text {IF}_{k,j}({\widehat{R}}_i) \end{aligned}$$
(3.7)

Again we use the aid of a diagram to illustrate the main properties of the impact function.

Fig. 4 Illustration of \(\text {IF}_{k,j}({\widehat{R}})\)

The observation in the upper left corner of a loss triangle (\(X_{1,1}\)) has a negative impact on reserves for each accident year in every case. This corner point is shown as the closest observation in Fig. 4 and has the largest negative impact on total reserve estimates (\(-1.3875\)).

The impacts towards the latest development periods are also significant; however, they are positive. Importantly, this positive impact usually increases for each accident year as we move towards the upper right corner observation and hence this corner point (\(X_{1,I}\)) will likely have a large positive impact. In our example, this observation (\(X_{1,10}\)) has the largest impact on total reserves (9.3050). The increasing pattern towards the top right corner can be understood by noting that the impact function for each accident year is increasing with j for the same k when \(j\ge I-i+1\). However, the result for final reserves is somewhat dependent on the inequalities given in Sect. 3.1.1 regarding the diagonals for \(j\ge I-i+1\).

The result for \(X_{1,I}\) can be further understood by noting that any positive increase in this observation leads to a greater estimate of \(f_{I-1}\) without a decrease in another estimated development factor. Hence final reserve estimates will be increased as this development factor is used for forecasting final cumulative claims for every other accident year.

Next, the bottom left corner observation (\(X_{I,1}\)) is the only observation currently available for the final accident year. The impact this observation has on final reserves is given solely by Eq. (3.3). Importantly, this value is greatest when considering observations in the first column as there are more development factors being multiplied than when \(j>1\). Furthermore, in the other accident years (i.e. \(k\ne I\)), observations will be impacting estimated development factors \({\widehat{f}}_s\) such that one development factor will be increased and the other decreased as observations are altered. This is true except for the first column where the impact will only be felt for \({\widehat{f}}_1\) and it will be negative, and the last column where only \({\widehat{f}}_{I-1}\) will be impacted and the impact will be positive.

For the first column of observations, the impact will be negative or zero for each accident year except when \(k=i\). We see that those observations around \(X_{1,1}\) also often have negative impacts as they are encapsulated in the set \(k\le i-1\) and \(j\le I-i+1\) where their impact is negative for a larger number of accident years than other observations.

An additional interesting result is that for constant k the impact is increasing with j and for constant j the impact is increasing with k throughout this triangle. The impact is also increasing as we move along diagonals towards the top right corner for all \(j\ge 4\). These results can be understood by noting similar properties in the impact function for individual accident year reserves.

Results similar to those stated here are mentioned on a heuristic basis in Venter and Tampubolon [5]. Notably, Venter and Tampubolon [5] highlight that impact functions can be used to evaluate the robustness of models and in turn compare and refine models based on robustness. This work provides mathematical justification for these conclusions and allows the impact of each observation to be traced precisely. These impact functions also provide insight into how adjustment of outlying points will affect results.

Note that the value of \(\text {IF}_{k,j}({\widehat{R}})\) and \(\text {IF}_{k,j}({\widehat{R}}_i)\) is independent of \(X_{k,j}\) for \(k+j=I+1\) (i.e. the last diagonal of the loss triangle). The proof for this result is given in Appendix B.4. A further result is that \({\widehat{R}}_i\) and \({\widehat{R}}\) are homogeneous of order 1 such that

$$\begin{aligned} {\widehat{R}}_i=\sum _{k+j\le I+1}\text {IF}_{k,j}({\widehat{R}}_i)\cdot X_{k,j} \quad \text {and}\quad {\widehat{R}}=\sum _{k+j\le I+1}\text {IF}_{k,j}({\widehat{R}})\cdot X_{k,j} \end{aligned}$$
(3.8)

The proof for homogeneity of order 1 is given in Appendix B.5. This allows us to find the marginal contribution of each incremental claim to reserves given by \(\text {IF}_{k,j}({\widehat{R}})\cdot X_{k,j}\). The 3D graph for these marginal contributions to total reserves is given in Fig. 5 and we note that the result is somewhat different than when considering \(\text {IF}_{k,j}({\widehat{R}})\). This highlights how the magnitude of the incremental claim itself can impact the contribution it makes to reserves. In particular, we note that the magnitude of incremental claims significantly decreases in later development periods and this is reflected in the graph. Hence, this analysis allows us to identify influential observations within a loss triangle.

Fig. 5 Illustration of \(\text {IF}_{k,j}({\widehat{R}})\cdot X_{k,j}\)

3.2.2 Bornhuetter–Ferguson

The impact function for total Bornhuetter–Ferguson reserves is given by \( \text {IF}_{k,j}({\widehat{R}}^{BF})=\sum _{i=1}^{I}\text {IF}_{k,j}({\widehat{R}}_i^{BF})\). The 3D graph for this impact function under the assumption that \({\widehat{\mu }}_i={\widehat{C}}_{i,I}\) for all i is given in Fig. 6.

Fig. 6 Illustration of \(\text {IF}_{k,j}({\widehat{R}}^{BF})\)

Under this technique, the significant impact of \(X_{I,1}\) is completely eliminated. Additionally, all impacts have been reduced in comparison to Fig. 4 and in particular, the impact for late accident years at early development periods is significantly reduced. This is because the denominator in Eq. (3.6) for \(k<i\) is greater for earlier development periods and the impact of \(X_{k,j}\) on \({\widehat{R}}_i\) is zero for a greater proportion of accident years i as we increase k.

4 Impact on mean squared error under Mack’s model

4.1 Individual accident year MSE

When calculating the impact function for this statistic we have considered the \(\sigma _j\) and \(f_j\) terms as known constants, such that we are calculating the sensitivity of the mean squared error to incremental claims rather than the sensitivity of the estimate of this term. To approximate this function we may then plug in the estimated values of \(\sigma _j\) and \(f_j\). The impact function is given by

$$\begin{aligned}{} & {} \text {IF}_{k,j}(\text {mse}({\widehat{R}}_i))\nonumber \\{} & {} \quad ={\left\{ \begin{array}{ll} 0 &{}\text {if } k>i\\ \sum _{j=I-i+1}^{I-1}(f_{I-i+1}\cdot ...\cdot f_{j-1}\sigma ^2_jf_{j+1}^2\cdot ...\cdot f_{I-1}^2)\\ +2C_{i,I-i+1}({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})^2\sum _{s=I-i+1}^{I-1}\frac{{\widehat{\sigma }}^2_s/{\widehat{f}}_s^2}{\sum _{i=1}^{I-s}C_{i,s}}, &{} \text {if } k=i\\ -2C_{i,I-i+1}({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})\sqrt{\sum _{s=I-i+1}^{I-1}\frac{{\widehat{\sigma }}^2_s/{\widehat{f}}_s^2}{\sum _{i=1}^{I-s}C_{i,s}}}\\ \quad \times {\widehat{C}}_{i,I}\sum _{p=k}^{i-1}\left( \left( \frac{1}{\sum _{i=1}^{p}C_{i,I-p+1}}\right) {{\textbf {1}}}_{\{j\le I-p+1\}}-\left( \frac{1}{\sum _{i=1}^{p}C_{i,I-p}}\right) {{\textbf {1}}}_{\{j\le I-p\}}\right) , &{} \text {if } k\le i-1 \end{array}\right. } \end{aligned}$$
(4.1)

See Appendix B.6 for proof. Note that as the results for \(\text {IF}_{k,j}(\text {mse}({\widehat{R}}_i))\) will be in units of $\(^2\), it is often desirable to look at the impact function for the root mean squared error (rmse). This is simply given by \( \text {IF}_{k,j}\left( \sqrt{\text {mse}({\widehat{R}}_i)}\right) =\frac{1}{2}\cdot \frac{\text {IF}_{k,j}(\text {mse}({\widehat{R}}_i))}{\sqrt{\text {mse}({\widehat{R}}_i)}} \) and allows the results to be expressed in the same units as reserves. The impact that each incremental claim has on the rmse of the reserves for accident year 8 is given in Table 2, with a graphical representation provided in Fig. 7. From Eq. (4.1) we have that for \(k=i\), the impact is always the sum of two positive terms and is independent of j. Hence for \(k=i\) the impact is always positive and equal. This is represented by the row of equal height positive columns for accident year 8 in Fig. 7. Additionally, we observe that the sign of the impact is the opposite of that for \(\text {IF}_{k,j}({\widehat{R}}_8)\) (except when \(k=8\)) and we observe similar trends in terms of magnitude. In particular, note that the impact is increasing in magnitude towards the top right corner observation; however, these impacts are negative.

Table 2 \(\text {IF}_{k,j}\left( \sqrt{\text {mse}({\widehat{R}}_8)}\right) \)

For the cases when \(k\le i-1\) note that the term \(-2C_{i,I-i+1}({\widehat{f}}_{I-i+1}\cdot ...\cdot {\widehat{f}}_{I-1})\sqrt{\sum _{s=I-i+1}^{I-1}\frac{{\widehat{\sigma }}^2_s/{\widehat{f}}_s^2}{\sum _{i=1}^{I-s}C_{i,s}}}\) is always negative and is independent of k and j (i.e. the same value for this term is used throughout the triangle for all \(k\le i-1\)). Additionally, the term \({\widehat{C}}_{i,I}\sum _{p=k}^{i-1}\left( \left( \frac{1}{\sum _{i=1}^{p}C_{i,I-p+1}}\right) {{{\textbf {1}}}}_{\{j\le I-p+1\}}-\left( \frac{1}{\sum _{i=1}^{p}C_{i,I-p}}\right) {{{\textbf {1}}}}_{\{j\le I-p\}}\right) \) is equal to \(\text {IF}_{k,j}({\widehat{R}}_i)\) provided above. Hence we see that \(\text {IF}_{k,j}(\text {mse}({\widehat{R}}_i))\) will have opposite sign to \(\text {IF}_{k,j}({\widehat{R}}_i)\) throughout the triangle for \(k\le i-1\).

Notably, for \(k\le i-1\) and \(j\le I-i+1\) the impact will be positive which is shown in Fig. 7 for \(k\le 7\) and \(j\le 3\). We will see a change of sign for \(\text {IF}_{k,j}(\text {mse}({\widehat{R}}_i))\) from positive for \(j\le I-i+1\) to negative for \(j=I-i+2\) when \(k=i-1\). A proof of this property is given in Appendix B.7.

Fig. 7 Illustration of \(\text {IF}_{k,j}\left( \sqrt{\text {mse}({\widehat{R}}_8)}\right) \)

Trends in magnitude similar to those outlined above for \(\text {IF}_{k,j}({\widehat{R}}_i)\) are seen for \(\text {IF}_{k,j}(\text {mse}({\widehat{R}}_i))\). For instance, as we move towards the top right corner of the loss triangle we will tend to see the impact become increasingly negative (as opposed to increasingly positive for \(\text {IF}_{k,j}({\widehat{R}}_i)\)). Further investigation into the relationship between \(\text {IF}_{k,j}({\widehat{R}}_i)\) and \(\text {IF}_{k,j}(\text {mse}({\widehat{R}}_i))\), particularly the change in sign when \(k\le i-1\), is a warranted extension of this work.

4.2 Estimate of total MSE

We have that for Mack’s Model the mean squared error of prediction for total reserves is given by

$$\begin{aligned} \text {mse}({\widehat{R}})=\sum _{i=2}^{I}\left\{ (\text {s.e.}({\widehat{R}}_i))^2+{\widehat{C}}_{i,I}\left( \sum _{q=i+1}^{I}{\widehat{C}}_{q,I}\right) \left( \sum _{r=I-i+1}^{I-1}\frac{2\sigma _r^2/{\widehat{f}}_r^2}{\sum _{n=1}^{I-r}C_{n,r}}\right) \right\} \nonumber \\ \end{aligned}$$
(4.2)
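As a minimal sketch of how (4.2) can be evaluated, the Python fragment below computes the chain-ladder development factors, the ultimates \({\widehat{C}}_{i,I}\) and the cross-accident-year term of (4.2) from a cumulative triangle. The function names, the ragged-list representation of the triangle and the inputs `sigma2` (the \(\sigma _r^2\), treated as known constants) and `se2` (the squared standard errors \((\text {s.e.}({\widehat{R}}_i))^2\), supplied by the caller) are assumptions made for illustration; this is not the code from the repository referenced in Sect. 7.

```python
def chain_ladder_factors(C):
    """Volume-weighted development factors f_1, ..., f_{I-1}.

    C is a cumulative triangle stored as a ragged list: row i (0-based)
    holds the I - i observed cumulative claims of accident year i + 1.
    """
    I = len(C)
    factors = []
    for j in range(I - 1):
        rows = range(I - 1 - j)  # accident years observed in columns j and j+1
        factors.append(sum(C[i][j + 1] for i in rows) / sum(C[i][j] for i in rows))
    return factors


def ultimates(C, factors):
    """Projected ultimates: latest diagonal times the remaining factors."""
    I = len(C)
    result = []
    for i in range(I):
        value = C[i][-1]
        for j in range(I - 1 - i, I - 1):
            value *= factors[j]
        result.append(value)
    return result


def cross_term(C, sigma2):
    """Sum over i = 2..I of the second term inside the braces of (4.2)."""
    I = len(C)
    f = chain_ladder_factors(C)
    ult = ultimates(C, f)
    # Per development period r: 2 * sigma_r^2 / f_r^2 / sum_{n=1}^{I-r} C_{n,r}
    w = [2.0 * sigma2[r] / f[r] ** 2 / sum(C[n][r] for n in range(I - 1 - r))
         for r in range(I - 1)]
    total = 0.0
    for i in range(1, I):  # 0-based index of accident years 2..I
        total += ult[i] * sum(ult[i + 1:]) * sum(w[I - 1 - i:])
    return total


def total_mse(C, sigma2, se2):
    """mse of total reserves as in (4.2); se2[i] is (s.e.(R_i))^2 per accident year."""
    return sum(se2[1:]) + cross_term(C, sigma2)
```

Because the ultimates and the development factors both depend on every observed claim, the cross term changes whenever any incremental claim changes, which is why the impact on the total mse is not the sum of the accident-year impacts.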

In this case, we are again considering the unknown \(\sigma _r\) values as constants rather than taking their estimates. This again allows us to focus on the impact that incremental claims are having on the mean squared error rather than its associated estimate. The impact function is given by

$$\begin{aligned}{} & {} \text {IF}_{k,j}\left( \widehat{\text {mse}({\widehat{R}})}\right) =\sum _{i=2}^{I}\left\{ \widehat{\text {IF}}_{k,j}(\text {mse}({\widehat{R}}_i))+{\widehat{C}}_{i,I}\left( \sum _{q=i+1}^{I}{\widehat{C}}_{q,I}\right) \right. \nonumber \\{} & {} \qquad \left. \times \sum _{r=I-i+1}^{I-1}\frac{-2\sigma _r^2\sum _{n=1}^{I-r}{\widehat{f}}_r^2C_{n,r}\left( \frac{\partial \ln C_{n,r}}{\partial X_{k,j}}+\frac{2\partial \ln {\widehat{f}}_r}{\partial X_{k,j}}\right) }{\left( \sum _{n=1}^{I-r}C_{n,r}{\widehat{f}}_r^2\right) ^2} +\left( \sum _{r=I-i+1}^{I-1}\frac{2\sigma _r^2/{\widehat{f}}_r^2}{\sum _{n=1}^{I-r}C_{n,r}}\right) \right. \nonumber \\{} & {} \qquad \times \left. \left( {\widehat{C}}_{i,I}\left( \sum _{q=i+1}^{I}\left( \text {IF}_{k,j}({\widehat{R}}_q)+\frac{\partial C_{q,I-q+1}}{\partial X_{k,j}}\right) \right) +\left( \sum _{q=i+1}^{I}{\widehat{C}}_{q,I}\right) \left( \text {IF}_{k,j}({\widehat{R}}_i)+\frac{\partial C_{i,I-i+1}}{\partial X_{k,j}}\right) \right) \right\} .\nonumber \\ \end{aligned}$$
(4.3)

Note that this formulation of the impact function still contains derivative terms, all of which are readily calculable. Importantly, these impacts are not simply the sum of the impacts for the mse of each individual accident year, because (4.2) contains cross terms between accident years. As before, in practice we take the impact on the rmse of total reserves (\(\sqrt{\text {mse}({\widehat{R}})}\)) so that the impact is expressed in the same units as reserves. The impact of individual claims on the estimated rmse of total reserves is given in Fig. 8. The main pattern is that all impacts are positive for development period 1; then, holding k constant, the impacts decrease with j towards zero and, continuing in this pattern, become increasingly negative towards the upper right corner.

Fig. 9

Illustration of \(\text {IF}_{k,j}\left( \sqrt{\text {mse}({\widehat{R}})}\right) \)

5 Impact on lognormal quantiles

We now provide the impact function for total reserves under the common assumption that they are lognormally distributed. In principle, a similar approach may be employed for any location-scale distribution, though the mathematics of quantiles can become intractable for discrete distributions. For this reason, the over-dispersed Poisson distribution has been avoided, despite it being a more natural distribution to associate with the chain-ladder. An example given in Chapter 11 of [4] shows (for that case) that quantiles other than extreme ones are little affected by choosing the lognormal rather than a shorter-tailed distribution. Nonetheless, we would advise validating that the lognormal is an appropriate choice for the data at hand before implementing the results provided here. We make the following assumptions:

$$\begin{aligned} E[R]={\widehat{R}}=e^{\mu +\frac{1}{2}\sigma ^2} \quad \text {and} \quad \text {Var}(R)=\text {mse}({\widehat{R}})=e^{2\mu +\sigma ^2}(e^{\sigma ^2}-1) \end{aligned}$$
(5.1)

such that \( R\sim LN(\mu ,\sigma ^2)\). The q-quantile of a lognormal distribution, \(X\sim LN(\mu ,\sigma ^2)\), is given by

$$\begin{aligned} F_X^{-1}(q)=e^{\mu +\sigma \Phi ^{-1}(q)} \end{aligned}$$
(5.2)
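The moment matching in (5.1) can be inverted to give \(\sigma ^2=\ln (1+\text {mse}({\widehat{R}})/{\widehat{R}}^2)\) and \(\mu =\ln {\widehat{R}}-\frac{1}{2}\sigma ^2\), which are then substituted into (5.2). The short Python sketch below illustrates this; the function names are hypothetical and the inputs are assumed to satisfy \({\widehat{R}}>0\) and \(\text {mse}({\widehat{R}})\ge 0\).

```python
import math
from statistics import NormalDist


def lognormal_params(reserve, mse):
    """Solve (5.1) for (mu, sigma^2) given the mean and variance of R."""
    sigma2 = math.log(1.0 + mse / reserve ** 2)
    mu = math.log(reserve) - 0.5 * sigma2
    return mu, sigma2


def lognormal_quantile(reserve, mse, q):
    """q-quantile of the moment-matched lognormal, as in (5.2)."""
    mu, sigma2 = lognormal_params(reserve, mse)
    z = NormalDist().inv_cdf(q)  # standard normal quantile, Phi^{-1}(q)
    return math.exp(mu + math.sqrt(sigma2) * z)
```

For example, `lognormal_quantile(reserve, mse, 0.995)` gives the 99.5% quantile considered later in this section.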

where \(\Phi (.)\) is the cumulative distribution function of the standard normal distribution. The impact function for lognormal quantiles is given by

$$\begin{aligned} \text {IF}_{k,j}\left( F_R^{-1}(q)\right)= & {} \left( \frac{2\cdot \text {IF}_{k,j}({\widehat{R}})\cdot {\widehat{R}}-\text {IF}_{k,j}\left( \text {mse}({\widehat{R}})\right) }{2(\text {mse}({\widehat{R}})+{\widehat{R}}^2)}+\frac{\Phi ^{-1}(q)\left( \text {IF}_{k,j}(\text {mse}({\widehat{R}}))\cdot {\widehat{R}}-2\text {mse}({\widehat{R}})\cdot \text {IF}_{k,j}({\widehat{R}})\right) }{2{\widehat{R}}\left( \text {mse}({\widehat{R}})+{\widehat{R}}^2\right) \sqrt{\ln \left( 1+\frac{\text {mse}({\widehat{R}})}{{\widehat{R}}^2}\right) }}\right) \nonumber \\{} & {} \quad \times \exp \left[ \ln ({\widehat{R}})-\frac{1}{2}\ln \left( 1+\frac{\text {mse}({\widehat{R}})}{{\widehat{R}}^2}\right) +\sqrt{\ln \left( 1+\frac{\text {mse}({\widehat{R}})}{{\widehat{R}}^2}\right) }\Phi ^{-1}(q)\right] \nonumber \\ \end{aligned}$$
(5.3)

See Appendix 1 for proof. The impact that each incremental claim has on the 99.5% quantile of total reserves, under the assumption that they are lognormally distributed, is given in Table 3 and the corresponding 3D graph is given in Fig. 9. Importantly, we see similar trends in this impact triangle as were seen for \(\text {IF}_{k,j}({\widehat{R}})\) (Fig. 4). Notably, the three corner points \(X_{1,1}\), \(X_{1,10}\) and \(X_{10,1}\) have significant impacts on the 99.5% quantile of reserves. This can be understood intuitively: if an incremental claim has a given impact on reserves, then we may expect to see a similar impact on the associated quantiles.
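When implementing quantile impacts, an independent numerical check can be useful. The Python sketch below does not transcribe (5.3); instead it differentiates (5.1) and (5.2) directly by the chain rule, taking the impacts \(\text {IF}_{k,j}({\widehat{R}})\) and \(\text {IF}_{k,j}(\text {mse}({\widehat{R}}))\) as inputs. The function names and argument names are hypothetical, and \(\text {mse}({\widehat{R}})>0\) is assumed so that \(\sigma >0\).

```python
import math
from statistics import NormalDist


def lognormal_quantile(reserve, mse, q):
    """q-quantile of the lognormal matched to mean `reserve` and variance `mse`."""
    sigma2 = math.log(1.0 + mse / reserve ** 2)
    mu = math.log(reserve) - 0.5 * sigma2
    return math.exp(mu + math.sqrt(sigma2) * NormalDist().inv_cdf(q))


def quantile_impact(reserve, mse, q, if_reserve, if_mse):
    """Chain-rule derivative of the quantile, given the derivatives of
    the reserve (if_reserve) and of the mse (if_mse) w.r.t. one claim."""
    z = NormalDist().inv_cdf(q)
    sigma = math.sqrt(math.log(1.0 + mse / reserve ** 2))
    # d(sigma^2) for sigma^2 = ln(reserve^2 + mse) - 2 ln(reserve)
    d_sigma2 = (reserve * if_mse - 2.0 * mse * if_reserve) / (
        reserve * (mse + reserve ** 2))
    d_mu = if_reserve / reserve - 0.5 * d_sigma2  # mu = ln(reserve) - sigma^2 / 2
    d_sigma = d_sigma2 / (2.0 * sigma)
    return lognormal_quantile(reserve, mse, q) * (d_mu + z * d_sigma)
```

A simple sanity check is scale invariance: if a claim moves the reserve and its rmse in the same proportion (`if_reserve = reserve`, `if_mse = 2 * mse`), then \(\sigma \) is unchanged and the returned impact equals the quantile itself.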

Table 3 \(\text {IF}_{k,j}\left( F_R^{-1}(0.995)\right) \)
Fig. 10

Illustration of \(\text {IF}_{k,j}\left( F_R^{-1}(0.995)\right) \)

6 Conclusion

In this paper we have provided impact functions for a range of statistics of interest under Mack’s Model as well as reserves under the Bornhuetter–Ferguson technique. Properties of these impact functions have been discussed and we have illustrated their calculation on real data. These impact functions capture the rate at which the relevant statistic of interest will change given movement in a particular incremental claim. Additionally, we can highlight the marginal contribution of each incremental claim to reserves, since reserves are homogeneous functions of order one in the incremental claims.

We have illustrated that there is often a small set of observations to which these statistics are particularly sensitive. This highlights a lack of robustness, in that deviations in some observations may largely dictate results. A further feature of these functions is that they are heavily dependent on the numerous other observations within a loss triangle, highlighting the interdependence of the claims development and each incremental observation. Additionally, all impact functions that have been derived in this paper are unbounded with respect to individual incremental claims, except for the cases when they equal zero. Hence the relevant statistics of interest may be carried arbitrarily far from their true value in the presence of outlying observations.

Finally, we have illustrated results using data from a Belgian non-life insurer.

7 Code

The R code to replicate numerical results and figures is available on the GitHub repository https://github.com/agi-lab/reserving-impact-factors.