Optimal Exercise Strategies for Operational Risk Insurance via Multiple Stopping Times

In this paper we demonstrate how to develop analytic closed form solutions to optimal multiple stopping time problems arising in the setting in which the value function acts on a compound process that is modified by the actions taken at the stopping times. This class of problem is particularly relevant in insurance and risk management settings, and we demonstrate this on an important application domain based on insurance strategies in Operational Risk management for financial institutions. In this area of risk management the most prevalent class of loss process models is the Loss Distribution Approach (LDA) framework, which involves modelling annual losses via a compound process. Given an LDA model framework, we consider Operational Risk insurance products that mitigate the risk for such loss processes and may reduce capital requirements. In particular, we consider insurance products that grant the policy holder the right to insure k of its annual Operational losses in a horizon of T years. We consider two insurance product structures and two general model settings: first, families of relevant LDA loss models for which we can obtain closed form optimal stopping rules under each generic insurance mitigation structure, and secondly, classes of LDA models for which we can develop closed form approximations of the optimal stopping rules. In particular, for losses following a compound Poisson process with jump size given by an Inverse-Gaussian distribution and two generic types of insurance mitigation, we are able to derive analytic expressions for the loss process modified by the insurance application, as well as closed form solutions for the optimal multiple stopping rules in discrete time (annually). When the combination of insurance mitigation and jump size distribution does not lead to tractable stopping rules we develop a principled class of closed form approximations to the optimal decision rule. These approximations are developed based on a class of orthogonal Askey polynomial series basis expansion representations of the annual loss compound process distribution and functions of this annual loss.


Introduction
In this paper we consider probability models comprised of classes of compound processes that are widely used in applications in risk and insurance modelling. In particular we consider how to solve multiple optimal stopping time problems where at times of stopping the compound process under consideration is modified by an action taken, and we aim to find the optimal set of times at which such actions should be taken to achieve the minimization of an objective function or gain function.
In the context of applied probability modelling where such frameworks are particularly important to consider we address the class of insurance mitigation problems in Operational Risk modelling. In the Basel II document (BCBS 2006), OpRisk "is defined as the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events. This definition includes legal risk, but excludes strategic and reputational risk". Also, OpRisk events should be classified in one of the event types of Table 1 and recorded as taking place in one of the business lines of the same Table. Under the Basel III (BCBS 2010) banking regulation that is the core financial regulation framework for Operational Risk modelling in all financial institutions, the core advanced risk modelling framework known as the Advanced Measurement Approach (AMA) discusses such classes of loss model primarily given by compound processes. It advocates their use for practical application under the Loss Distribution Approach (LDA) framework.
Under this LDA, the fundamental model framework used by large banks to capture Operational Risk (OpRisk), annual OpRisk losses are usually modelled as a compound process, such as a compound Poisson process (CPP), where the Poisson random variable denotes the frequency of the losses (how many of them occur during the year) and the jump size distribution models the severity of each loss. Under Basel III guidelines, the application of an action to mitigate loss exposures and reduce regulatory capital requirements is permitted.
In this context we consider classes of risk transfer mechanisms based on insurance products applied to modify the compound process loss model in some manner dictated by the type of insurance product under consideration. We design objective functions that relate to loss mitigation over a time horizon of T years and solve for optimal insurance exercise times under a framework of multiple optimal stopping criteria.
Within this context we solve, in discrete time, a class of optimal multiple stopping problems which arise from a new family of insurance products, providing closed form solutions for the stopping rules. Since knowledge of the distribution of the annual OpRisk loss after the usage of the insurance policy is essential for the calculation of the multiple stopping rule, for LDA models that (after insurance mitigation) are not analytically tractable we derive a closed form approximation to this distribution based on Askey polynomial expansions. Since the approximating distribution has support on the positive real line, Gamma densities are used as a basis, and possible problems with the positivity of the estimated density are also discussed.

Insurance and Operational Risk
Since the New Basel Capital Accord in 2004 (Basel II), Operational Risk (OpRisk) quantification has become increasingly important for financial institutions. However, the same degree of attention has not yet been devoted to insurance mitigation of OpRisk losses nor, consequently, to detailed analysis of potential risk and capital reduction that different risk transfer strategies in OpRisk may allow.
Historically the transference of credit and market risks through credit derivatives and interest rate swaps, for example, has been the subject of extensive study by both practitioners and academics, while only a few references about OpRisk transfer of risk and possible approaches to such risk transfers can be found in the literature (see Brandts 2004, Bazzarello et al. 2006 and Peters et al. 2011). This slow uptake of insurance policies in OpRisk for capital mitigation can be partially attributed to four general factors: (a) there still remains a rather limited understanding of the impact on capital reduction of currently available OpRisk insurance products, especially in the complex multi-risk, multi-period scenarios; (b) the relatively conservative Basel II regulatory cap of 20 % in a given year (for AMA models); (c) the limited understanding at present of the products and types of risk transfer mechanisms available for OpRisk processes; and (d) the limited competition for insurance products available primarily for OpRisk, where yearly premiums and minimum Tier I capital requirements required to even enter into the market for such products preclude the majority of banks and financial institutions in many jurisdictions.
Some of the reasons for these four factors arise when one realises that general risk transfer strategies are particularly challenging to undertake for OpRisk, since its risk processes range from loss processes which are insurable in a traditional sense (see Definition 2) to infrequent high consequence loss processes which may be only partially insurable and may result from extreme losses typically covered by catastrophe bonds and other types of risk transfer mechanisms. For these reasons, the development of risk transfer products for OpRisk settings by insurers is a relatively new and growing field in both academic research and industry, where new products are developed as catastrophe and high consequence, low frequency loss processes become better understood.
To qualify these points, consider factor (d). In terms of special products, a large reinsurance company that offers a number of products in the space of OpRisk loss processes to a global market is Swiss Re. Its teams include, in the US, the Excess and Surplus market Casualty group, which specialises in "U.S-domiciled surplus lines wholesale brokers with primary, umbrella and follow-form excess capacity for difficult-to-place risks in the Excess and Surplus market". This group aims to seek coverage solutions for challenging risks not in the standard/admitted market. The types of coverage limits offered are quoted as being of the range: USD 10 million limits in umbrella and follow form excess; USD 5 million CGL limits for each occurrence; USD 5 million general aggregate limit; USD 5 million products/completed operations; and USD 5 million personal and advertising injury. There are also groups like the Professional and Management Liability team in Swiss Re that provide bespoke products for "protection for organisations and their executives, as well as other professionals, against allegations of wrong-doing, mismanagement, negligence, and other related exposures." In addition, as discussed in Van den Brink (2002), there are some special products available for OpRisk insurance coverage offered by Swiss Re, known as the Financial Institutions OpRisk Insurance (FIORI), which covers OpRisk causes such as Liability, Fidelity and unauthorised activity, Technology risk, Asset protection and External fraud. It is noted in Chernobai et al. (2008) that the existence of such specialised products is limited in scope and market, since the premium one may be required to pay for such an insurance product can typically run into very significant costs, removing the actual gain from obtaining the insurance contract in terms of capital mitigation in the first place.
Hence, although the impact of insurance in OpRisk management is yet to be fully understood it is clear that it is a critical tool for the management of exposures and should be studied more carefully.
At this stage it is beneficial to recall the fundamental definition of an insurance policy or contract.
Definition 1 (Insurance Policy) At a fundamental level one can consider insurance to be the fair transfer of risk associated with a loss process between two financial entities. The transfer of risk is formalized in a legal insurance contract which is facilitated by the financial entity taking out the insurance mitigation making a payment to the insurer offering the reduction in risk exposure. The contract or insurance policy legally sets out the terms of the coverage with regard to the conditions and circumstances under which the insured will be financially compensated in the event of a loss. As a consequence the insurance contract policy holder assumes a guaranteed and often known proportionally small loss in the form of a premium payment corresponding to the cost of the contract in return for the legal requirement for the insurer to indemnify the policy holder in the event of a loss.
Under this definition one can then interpret the notion of insurance as a risk management process in which a financial institution may hedge against potential losses from a given risk process or group of risk processes. Mehr et al. (1980) and Berliner (1982) discuss, at a high level, the fundamental characteristics of what it means to be an insurable loss or risk process, which we note in Definition 2. We observe that this standard actuarial view on insurability does not always coincide directly with the economist's view.
Definition 2 (Insurable Losses) Mehr et al. (1980) and Chernobai et al. (2008, chapter 3) define an insurable risk as one that should satisfy the following characteristics:
1. The risks must satisfy the "Law of Large Numbers", i.e. there should be a large number of similar exposures.
2. The loss must take place at a known recorded time and place, and from a reportable cause.
3. The loss process must be considered subject to randomness. That is, the events that result in the generation of a claim should be random or at a minimum outside the control of the policy holder.
4. The loss amounts generated by a particular risk process must be commensurate with the charged premium and associated insurer business costs such as claim analysis, contract issuance and processing.
5. The estimated premium associated with a loss process must be affordable. This is particularly important in high consequence rare-event settings; see the discussion in Peters et al. (2011), who consider this question in a general setting.
6. It should be possible to estimate, for a given risk process, the probability of a loss as well as some statistic characterizing the typical (average, median, etc.) loss amount.
7. Either the risk process has a very limited chance of a catastrophic loss that would bankrupt the insurer and, in addition, the events that create a loss occur in a non-clustered fashion, or the insurer will cap the total exposure.
In Gollier (2005) it is argued that there is also a need to consider the economic ramifications for insurable risks. In particular, the definition of insurable risks is extended with the need to consider the economic market for such risk transfers. The discussion covers uninsurable and partially insurable losses, where an uninsurable loss occurs when "..., given the economic environment, no mutually advantageous risk transfer can be exploited by the consumer and the supplier of insurance". A partially insurable loss arises when the two parties to the risk transfer exchange can only partially benefit from or exploit the mutually advantageous components of the risk transfer. This has been considered in numerous studies; see Aase (1993), Arrow (1953), Arrow (1965), Borch (1962), and Raviv (1979).
As noted in Gollier (2005), from the economist's perspective the basic model for risk transfer involves a competitive insurance market in which the Law of Large Numbers is utilised as part of the evaluation of the social surplus of the transfer of risk. However, unlike the actuarial view presented above, the maximum potential loss and the probabilities associated with this loss are not directly influential when it comes to assessing the size of risk transfers at market equilibrium. In addition the economic model adds factors related to the degree of risk aversion of market participants (agents) and their degree of optimism when assessing the insurability of risks in the economy. Classically these features are all captured by the economic model known as the Arrow-Borch-Raviv model of perfect competition in insurance markets; see a good review in (Gollier 2005, section 2) and Ghossoub (2012).
In the work developed here we pose an interesting general question of how one may construct insurance products satisfying the axioms and definitions above whilst allowing a sufficiently general class of policies, to be discussed in the sequel. This class may actually be suitable for a wider range of financial institutions and banks than those specialised products currently on offer. More specifically, in this paper we discuss aspects of an insurance product that provides its owner several opportunities to decide which annual OpRisk loss(es) to insure. This product can be thought of as a way to decrease the cost paid by its owner to the insurance company in a similar way to what occurs with swing options in energy markets (see, for example, Jaillet et al. 2004 and Carmona and Touzi 2008): instead of buying T yearly insurance policies over a period of T years, the buyer can negotiate with the insurance company a contract that covers only k of the T years (to be chosen by the owner). This type of structured product will result in a reduction in the cost of insurance or partial insurance for OpRisk losses, and this aspect is highlighted in Carmona and Touzi (2009, page 188), where they note that "even without considering the cost of major catastrophes, insurance coverage is very expensive". In addition, we argue it may be interesting to explore such structures if the flexibility they provide results in an increased uptake of such products for OpRisk coverage, further reducing insurance premiums and resulting perhaps in greater competition in the market for these products.
The general insurance product presented here can accommodate any form of insurance policy, but we will focus on two basic generic "building block" policies (see Definitions 3 and 4) which can be combined to create more complex types of protection. For these two basic policies we present a "moderate-tailed" model for annual risks that leads to closed form usage strategies of the insurance product, answering the question: when is it optimal to ask the insurance company to cover the annual losses?
For the rest of the paper we assume that throughout a year a financial institution incurs a random number of loss events, say $N$, with severities (loss amounts) $X_1, \ldots, X_N$. Additionally, we suppose the company holds an insurance product that lasts for $T$ years and grants the company the right to mitigate $k$ of its $T$ annual losses through utilisation of its insurance claims. To clarify, consider a given year $t \le T$ in which the company incurs $N(t)$ losses adding up to $Z(t) = \sum_{n=1}^{N(t)} X_n(t)$. Assuming it has not yet utilised all its $k$ insurance mitigations, it then has the choice to make an insurance claim or not. If it utilises the insurance claim in this year, the resulting annual loss will be denoted by $\widetilde{Z}(t)$. Such a loss process model structure is standard in OpRisk and insurance and is typically referred to as the LDA, an example instance of which is illustrated in Fig. 1.
In this context the company's aim is to choose $k$ distinct years out of the $T$ in order to minimize its expected operational loss over the time interval $[0, T]$, where it is worth noting that if $Z > \widetilde{Z}$, i.e., if the insured annual loss $\widetilde{Z}$ is smaller than the uninsured annual loss $Z$ so that the insurance is actually mitigating the company's losses, all its $k$ rights should be exercised. The question that then must be addressed is what the optimal decision rule is, i.e. how to define the multiple optimal stopping times for making the $k$ sets of insurance claims.
The rest of the paper is organized as follows. In Section 2 we present the insurance policies we use as mitigation for the insurance product described above. Section 3 presents an overview of useful theoretical results in the field of multiple stopping rules for independent observations in discrete time, in particular Theorem 1 which is the main result in this Section. A summary of properties related to the LDA model used in this paper is presented in Section 4 and used in Section 5 to present the main contribution of this work, namely closed form solutions for the optimal multiple stopping rules for the insurance products considered. In Section 6 we check the theoretical optimality of the rules derived in Section 5, comparing them with predefined rules.
Since these closed form results rely upon the stochastic loss model considered, we also provide a general framework applicable for any loss process. Therefore, in Section 7 we discuss a method based on series expansions of unknown densities to calculate the optimal rules when the combination of insurance policy and severity density does not lead to analytical results. The conclusions and some final considerations are shown in Section 8.

Insurance Policies
As stated before, the insurance policies presented here must be thought of as building blocks for more elaborate ones, leading to mitigation of more complex sources of risk. It is also worth noting that the policies presented are just a mathematical model of the actual policies that would be sold in practice and, although some characteristics, such as deductibles, can be incorporated in the model, they are not presented at this stage.
In the sequel we present these basic insurance policies the company can use in the insurance product. For the sake of notational simplicity, if a process $\{Z(t)\}_{t=1}^{T}$ is a sequence of i.i.d. random variables, we will drop the time index and denote a generic r.v. from this process as $Z$. For the rest of the paper $\mathbb{I}_A$ will denote the indicator function of the event $A$, i.e., $\mathbb{I}_A = 1$ if $A$ occurs and zero otherwise.

Definition 3 (Individual Loss Policy (ILP))
This policy applies a constant haircut to the loss process in year $t$ in which individual losses experience a Top Cover Limit (TCL) as specified by
$$\widetilde{Z} = \sum_{n=1}^{N} \max(X_n - TCL,\, 0).$$

Definition 4 (Accumulated Loss Policy (ALP))
The ALP provides a specified maximum compensation on losses experienced over a year. If this maximum compensation is denoted by $ALP$ then the annual insured process is defined as
$$\widetilde{Z} = \max\Big(\sum_{n=1}^{N} X_n - ALP,\, 0\Big).$$
To characterize the annual application of such policies we provide a schematic representation of each of these policies in Figs. 2 and 3, assuming the same losses as in Fig. 1. The (part of the) loss mitigated by the insurance policy is represented by a white bar and the remaining loss due to the owner of the insurance product is painted grey. As in Fig. 1, annual losses are represented by hatched bars.
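The effect of the two building-block policies on a single year of losses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the ILP to leave the holder with the part of each individual loss above the TCL, and the ALP to leave the part of the accumulated annual loss above the maximum compensation. The function names `ilp_insured_loss` and `alp_insured_loss` and the numerical severities are hypothetical and are not taken from the paper.

```python
def ilp_insured_loss(losses, tcl):
    """Annual loss retained under the ILP (assumed form): each individual
    loss is compensated up to the Top Cover Limit `tcl`."""
    return sum(max(x - tcl, 0.0) for x in losses)


def alp_insured_loss(losses, alp):
    """Annual loss retained under the ALP (assumed form): the accumulated
    annual loss is compensated up to the maximum amount `alp`."""
    return max(sum(losses) - alp, 0.0)


# Hypothetical severities observed in one year (illustration only).
losses = [3.0, 10.0, 1.0]
annual_loss = sum(losses)                  # uninsured annual loss Z
z_ilp = ilp_insured_loss(losses, tcl=4.0)  # only the loss of 10.0 exceeds the TCL
z_alp = alp_insured_loss(losses, alp=5.0)  # accumulated loss minus compensation
print(annual_loss, z_ilp, z_alp)
```

In both cases the retained loss never exceeds the uninsured annual loss, which is the situation in which all exercise rights should be used.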

Multiple Optimal Decision Rules
Assume an agent sequentially observes a process $\{W(t)\}_{t=1}^{T}$, for a fixed $T < +\infty$, and wants to choose $k < T$ of these observations in order to maximize (or minimize, as discussed later in Remark 2) the expected sum of these chosen observations. For $k = 1$, this problem is known in the literature as the house selling problem (see Sofronov 2013 for an updated literature review), since one of its interpretations is as follows: if the agent is willing to sell a house and assumes that at most $T$ bids will be observed, he wants to choose the optimal time $\tau$ such that the house will be sold for the highest possible value. The extension of this problem to $k > 1$ is known as the multiple house selling problem, where the agent wants to sell $k$ identical houses. It is worth noting that in our insurance problem the agent is interested in choosing $k$ periods to exercise the insurance policy in order to minimize loss, in a sense that will be made precise shortly.
Formally, the mathematical framework of this problem consists of a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$, where $\mathcal{F}_t$ is the sigma-algebra generated by $W$ up to time $t$. Within this framework, where we assume the flow of information is given only by the observed values of $W$, it is clear that any decision at time $t$ should take into account only values of the process $W$ up to time $t$. It is also required that two actions cannot take place at the same time, i.e., we do not allow two stopping times to occur at the same discrete time instant. These assumptions are precisely stated in the following definition, but for further details on the theory of multiple optimal stopping rules we refer the reader to Sofronov et al. (2006), Nikolaev and Sofronov (2007), and Sofronov (2013).
Definition 5 A collection of integer-valued random variables $(\tau_1, \ldots, \tau_i)$ is called an $i$-multiple stopping rule if the following conditions hold: (a) $\{\omega : \tau_1 = m_1, \ldots, \tau_j = m_j\} \in \mathcal{F}_{m_j}$ for all $m_j > m_{j-1} > \ldots > m_1 \ge 1$, $j = 1, \ldots, i$; (b) $1 \le \tau_1 < \tau_2 < \ldots < \tau_i < +\infty$ ($\mathbb{P}$-a.s.).
Given the mathematical definition of a stopping rule, the notion of optimality of these rules can be made precise in the following definitions.
Definition 6 For a given multiple stopping rule $\tau = (\tau_1, \ldots, \tau_k)$ the gain function utilized in this paper takes the following additive form:
$$g(\tau) = W(\tau_1) + \ldots + W(\tau_k).$$
Definition 7 Let $S_m$ be the class of multiple stopping rules $\tau = (\tau_1, \ldots, \tau_k)$ such that $\tau_1 \ge m$ ($\mathbb{P}$-a.s.). The function
$$v_m = \sup_{\tau \in S_m} E[g(\tau)]$$
is defined as the $m$-value of the game and, in particular, if $m = 1$ then $v_1$ is the value of the game.
Definition 8 A multiple stopping rule $\tau^* \in S_m$ is called an optimal multiple stopping rule in $S_m$ if $E[g(\tau^*)]$ exists and $E[g(\tau^*)] = v_m$.
The following result (presented in Sofronov et al. 2006 and Nikolaev and Sofronov 2007) provides the optimal multiple stopping rule that maximizes the expectation of the sum of the observations (see Fig. 4 for a schematic representation).
In the context we consider it will always be optimal to stop the process exactly $k$ times, but this may not be true, for example, if some reward is given to the product holder for fewer than $k$ years of insurance claims. In the absence of such considerations, we proceed assuming that $k$ years of claims will always be made. Writing $v^{L,l}$ for the value of the game with $L$ steps left and $l$ stops remaining, we can see in Theorem 1 that the value function for $L < l$ (fewer steps left than stops remaining) is artificial and $v^{0,1}$, for example, has no interpretation. On the other hand, $v^{1,1}$ cannot be calculated using the general formula (it would depend on $v^{0,1}$). With one stop remaining and one step left we are, for the reasons given above, obliged to stop and, therefore, there is no maximization step when calculating $v^{1,1}$, i.e., $v^{1,1} = E[W(T - 1 + 1)] = E[W(T)]$. The same argument is valid for $l > 1$: if we have $l \le T - 1$ steps left and also $l$ stops, we must stop in all the steps remaining, so
$$v^{l,l} = E\Big[\sum_{t=T-l+1}^{T} W(t)\Big].$$
From Theorem 1 and the assumption of independence of the annual losses, we can see that to be able to calculate the optimal rule we only need to calculate (unconditional) expectations of the form $E[\max\{c_1 + W, c_2\}]$, for different values of $c_1$ and $c_2$ (see Fig. 4).
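The role of the expectations $E[\max\{c_1 + W, c_2\}]$ can be illustrated with a short backward-induction sketch in Python. It is not a transcription of Theorem 1: it assumes i.i.d. observations and the standard dynamic programming recursion in which, with $L$ steps and $l$ stops left, one compares stopping now (collecting $W$ and continuing with $l - 1$ stops) against continuing with $l$ stops, and in which stopping is forced when $L = l$. The names `game_values`, `stop_now` and the two-point toy distribution for $W$ are hypothetical.

```python
def game_values(T, k, expect_max, mean_w):
    """Backward induction for v[L][l] (L steps left, l stops left).

    Assumed recursion for i.i.d. observations W:
        v[L][0] = 0
        v[l][l] = v[l-1][l-1] + E[W]                    (forced stops)
        v[L][l] = E[max(v[L-1][l-1] + W, v[L-1][l])]    for L > l
    `expect_max(c1, c2)` must return E[max(c1 + W, c2)].
    """
    v = [[0.0] * (k + 1) for _ in range(T + 1)]
    for L in range(1, T + 1):
        for l in range(1, min(L, k) + 1):
            if L == l:
                v[L][l] = v[L - 1][l - 1] + mean_w
            else:
                v[L][l] = expect_max(v[L - 1][l - 1], v[L - 1][l])
    return v


def stop_now(w, L, l, v):
    """Stop at the observed value w when stopping is at least as good as
    continuing; stopping is forced when steps left equal stops left."""
    if L == l:
        return True
    return w + v[L - 1][l - 1] >= v[L - 1][l]


# Toy distribution (illustration only): W is 0 or 1 with probability 1/2 each.
def expect_max_two_point(c1, c2):
    return 0.5 * max(c1, c2) + 0.5 * max(c1 + 1.0, c2)


v = game_values(T=3, k=2, expect_max=expect_max_two_point, mean_w=0.5)
print(v[3][2])  # value of choosing 2 out of 3 observations
```

With this toy distribution the rule says that, with three observations left and two stops available, one stops on the first observation only if it equals 1; the table `v` is all that is needed to apply the rule sequentially.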

Objective Functions for Rational and Boundedly Rational Insurers
In this paper we will consider two possible general populations for the potential insurer. The first group are those that are perfectly rational, meaning that they will always act in an optimal fashion when given the chance and, more importantly, are capable (i.e. have the resources) to figure out what is the optimal behaviour. In this case we will consider a global objective function to be optimized.
The second group represent boundedly rational insurers who act sub-optimally. This group represents firms who are incapable or lack the resources/knowledge to understand how to act optimally when determining their optimal behaviours/actions and will be captured by local behaviours.
Hence, these two populations will be encoded in two objective functions: one which is optimal (globally) and one which represents a sub-optimal (local strategy) the boundedly rational population would likely adopt. These behaviours can be made precise through the following exercising strategies, for the first and second groups, respectively.

1. Global Risk Transfer Strategy: Minimizes the (expected) total loss over the period $[0, T]$;
2. Local Risk Transfer Strategy: Minimizes the (expected) sum of the losses at the insurance times (i.e. stopping times).
These two different groups can be understood as, for example, large corporations, with employees dedicated to fully understand the mathematical nuances of this kind of contract and small companies, with limited access to information. The group with "bounded rationality" may decide (heuristically, without the usage of any mathematical tool) to follow the so-called Local Risk Transfer Strategy, which will produce smaller gain in the period [0, T ]. As we will see in Section 6 these two different objective functions can lead to completely different exercising strategies, and we believe the insurance company who sells this contract should be aware of these different behaviours.
For the first loss function the formal objective is to minimize
$$E\Big[\sum_{t=1}^{T} Z(t) - \sum_{j=1}^{k} \big(Z(\tau_j) - \widetilde{Z}(\tau_j)\big)\Big],$$
where $\widetilde{Z}(t)$ denotes the annual loss in year $t$ after the insurance is applied. Since $\sum_{t=1}^{T} Z(t)$ does not depend on the choice of $\tau_1, \ldots, \tau_k$, this is, in fact, equivalent to maximizing
$$E\Big[\sum_{j=1}^{k} W(\tau_j)\Big],$$
where the process $W$ is defined as $W(t) = Z(t) - \widetilde{Z}(t)$. For the second objective function, the company aims to minimise the total loss not over the period $[0, T]$ but instead only at the times at which the decisions are taken to apply insurance and therefore claim against losses in the given year and, in this case, the process $W$ should be viewed as $W(t) = -\widetilde{Z}(t)$.
Remark 1 Note that if the agent is working with the first loss function (using $W = Z - \widetilde{Z}$), then $W$ is a non-negative stochastic process, and only one kind of expectation is required to be calculated.
Remark 2 If the agent is trying to minimize the expected sum of the $Z(t)$ random variables (instead of maximizing it) one can rewrite the problem as follows. Define a process $W(t) = -Z(t)$, so that $\min_{\tau} E[\sum_{j=1}^{k} Z(\tau_j)] = -\max_{\tau} E[\sum_{j=1}^{k} W(\tau_j)]$. Therefore the optimal stopping times that maximize the expected sum of the process $W$ are the same that minimize the expected sum of the process $Z$.
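The equivalence between minimizing the total loss and maximizing the sum of $W = Z - \widetilde{Z}$ at the exercise years can be checked on a single hypothetical path. The sketch below is an after-the-fact (hindsight) comparison over all choices of $k$ years, not a stopping rule, and the loss figures are invented for illustration; it only shows that the total loss equals the sum of the uninsured losses minus the sum of $W$ over the chosen years.

```python
from itertools import combinations


def total_loss(z, z_ins, stops):
    """Total loss over the horizon when insurance is used in the years `stops`."""
    return sum(z_ins[t] if t in stops else z[t] for t in range(len(z)))


# Hypothetical annual losses Z(t) and insured losses Z~(t) for T = 4 years.
z = [5.0, 2.0, 9.0, 4.0]
z_ins = [3.0, 2.0, 4.0, 1.0]
w = [a - b for a, b in zip(z, z_ins)]  # W(t) = Z(t) - Z~(t), non-negative here

k = 2
years = range(len(z))
# Choice of k years with the smallest total loss ...
best_min = min(combinations(years, k), key=lambda s: total_loss(z, z_ins, set(s)))
# ... coincides with the choice with the largest sum of W.
best_max = max(combinations(years, k), key=lambda s: sum(w[t] for t in s))
print(best_min, best_max, total_loss(z, z_ins, set(best_min)))
```

The same bookkeeping with $W(t) = -\widetilde{Z}(t)$ gives the local objective, where only the losses in the exercise years enter the sum.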
The present work is mainly devoted to the study of a combination of insurance policy and severity distribution that leads to closed form results of the value function integrals, which, in turn, are required for closed form multiple optimal stopping rules. Nonetheless, in Section 7 we also show how one can develop principled approximation procedures in order to calculate the distribution of the insured process $\widetilde{Z}$ and, consequently, the optimal rule. In the remainder of this section we present a very simple example using the second (local) objective function, where we assume the annual insured losses are modelled as Log-Normal random variables. The reader should note this assumption is an approximation to the usual LDA model, where severities are assumed to be Log-Normally distributed.
Example 1 (Log-Normal) Assume that the insured losses $\widetilde{Z}(1), \ldots, \widetilde{Z}(T)$ form a sequence of i.i.d. random variables such that $\widetilde{Z} \sim$ Log-Normal(0,1). To calculate the multiple optimal rule that minimizes the expected loss let us define $W = -\widetilde{Z}$. The values of the game using the equations in Theorem 1 can be seen in Table 2.
Note that Table 2 presents the value of the expected loss at the times we stop, i.e., $E\big[\sum_{j=1}^{k} -\widetilde{Z}(\tau_j)\big]$, so it only makes sense to compare values within the same column. Doing so one can see that for a fixed number of stops $l$, the value of the game is increasing with the number of steps remaining. In other words, the longer one can wait to decide in which step to stop, the smaller the expected loss.
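Values of the kind reported for Example 1 can be computed with a short Python sketch. It assumes the same backward recursion for i.i.d. observations as is standard in dynamic programming (stop when the current value plus the continuation value with one stop fewer is at least the continuation value with the same number of stops, with forced stops when steps and stops coincide), and it uses the elementary identity $E[\max\{c_1 - Z, c_2\}] = c_2 + d\,\Phi(\ln d) - e^{1/2}\,\Phi(\ln d - 1)$ for $Z \sim$ Log-Normal(0,1) and $d = c_1 - c_2 > 0$. The function names are hypothetical and no attempt is made to reproduce the entries of Table 2.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def expect_max_lognormal(c1, c2):
    """E[max(c1 + W, c2)] for W = -Z with Z ~ Log-Normal(0, 1)."""
    d = c1 - c2
    if d <= 0.0:
        return c2  # c1 - Z < c2 almost surely, so the maximum is c2
    log_d = math.log(d)
    # E[(d - Z)^+] = d * Phi(ln d) - exp(1/2) * Phi(ln d - 1)
    return c2 + d * PHI(log_d) - math.exp(0.5) * PHI(log_d - 1.0)


def lognormal_game_values(T, k):
    """v[L][l]: expected (negative) loss with L steps left and l stops left."""
    mean_w = -math.exp(0.5)  # E[W] = -E[Z] = -exp(1/2)
    v = [[0.0] * (k + 1) for _ in range(T + 1)]
    for L in range(1, T + 1):
        for l in range(1, min(L, k) + 1):
            if L == l:  # as many stops as steps: forced to stop every year
                v[L][l] = v[L - 1][l - 1] + mean_w
            else:
                v[L][l] = expect_max_lognormal(v[L - 1][l - 1], v[L - 1][l])
    return v


values = lognormal_game_values(T=6, k=3)
for L in range(1, 7):
    print(L, [round(values[L][l], 4) for l in range(1, min(L, 3) + 1)])
```

Reading the printed rows column by column shows the monotonicity described above: for a fixed number of stops, the value increases (the expected loss decreases) as the number of remaining steps grows.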

Loss Process Models via LDA
Before discussing the application of Theorem 1 to the problem of choosing the multiple exercising dates of the insurance product presented in Section 1.1, in this Section we present the LDA model that leads to closed form solutions in Section 5.
The LDA in OpRisk assumes that during a year $t$ a company suffers $N(t)$ operational losses, with $N(t)$ following some counting distribution (usually Poisson or Negative Binomial). The severity of each of these losses is denoted by $X_1(t), \ldots, X_{N(t)}(t)$ and the cumulative loss by the end of year $t$ is given by $Z(t) = \sum_{n=1}^{N(t)} X_n(t)$. For the purpose of modelling OpRisk losses it is essential that the severity density allows extreme events to occur, since these events often occur in practice, as shown, for example, in Peters et al. (2013, Section 1.1). Following the nomenclature in Franzetti (2011, Table 3.3), the Inverse Gaussian distribution possesses a "moderate tail", which makes it a reasonable model for OpRisk losses for many risk process types, and it is often used in practice. This family of distributions also has the advantage of being closed under convolution, and this characteristic is essential if closed form solutions for the multiple optimal stopping problem are to be obtained.
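The closure under convolution can be checked empirically with a small simulation. The sketch below draws Inverse Gaussian severities with the transformation method of Michael, Schucany and Haas (a standard sampler that is brought in here for illustration and is not part of the paper), sums $n$ of them, and compares the sample mean and variance of the sums with those of an Inverse Gaussian with mean $n\mu$ and shape $n^2\lambda$, namely $n\mu$ and $(n\mu)^3/(n^2\lambda)$. The parameter values, the seed and the function names are arbitrary choices.

```python
import math
import random


def sample_ig(mu, lam, rng):
    """One draw from the Inverse Gaussian with mean mu and shape lam,
    using the transformation method of Michael, Schucany and Haas."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + mu * mu * y / (2.0 * lam) - (mu / (2.0 * lam)) * math.sqrt(
        4.0 * mu * lam * y + mu * mu * y * y)
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x


def sample_sum(n, mu, lam, rng):
    """Sum of n i.i.d. Inverse Gaussian severities."""
    return sum(sample_ig(mu, lam, rng) for _ in range(n))


rng = random.Random(2024)
n, mu, lam = 4, 1.0, 2.0
sums = [sample_sum(n, mu, lam, rng) for _ in range(20000)]
mean_hat = sum(sums) / len(sums)
var_hat = sum((s - mean_hat) ** 2 for s in sums) / (len(sums) - 1)

# Moments implied by closure under convolution.
mean_theory = n * mu
var_theory = (n * mu) ** 3 / (n ** 2 * lam)
print(mean_hat, mean_theory, var_hat, var_theory)
```

Agreement of the first two moments is of course only a sanity check and not a proof; the exact statement is the convolution property of the Inverse Gaussian family.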

Remark 3
The recently released European Banking Authority (EBA) Regulatory Technical Standards for Operational Risk (EBA 2015) requires that sub-exponential distributions should be used to model OpRisk severities, "unless exceptional reasons exist". Although Inverse Gaussian distributions are not sub-exponential (see Embrechts 1983) they can provide similar fittings to Log-Normal distributions (Chhikara and Folks 1977). Moreover, for datasets not sufficiently well approximated by Inverse Gaussian severities, Section 7 provides a general approximation scheme for the optimal strategy.
In the closed form solutions we present for the different insurance policies we use properties of the Inverse Gaussian distribution and its relationship with the Generalized Inverse Gaussian distribution. The following Lemmas will be used throughout; see additional details in Folks and Chhikara (1978) and Jørgensen (1982).
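In the following, let $X_1, \ldots, X_n$ be a sequence of i.i.d. Inverse Gaussian (IG) random variables with parameters $\mu, \lambda > 0$, i.e.,
$$f_X(x; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\Big\{-\frac{\lambda (x - \mu)^2}{2 \mu^2 x}\Big\}, \quad x > 0.$$
Let also $G$ be a Generalized Inverse Gaussian (GIG) r.v. with parameters $\alpha, \beta > 0$, $p \in \mathbb{R}$, i.e.,
$$f_G(x; \alpha, \beta, p) = \frac{(\alpha/\beta)^{p/2}}{2 K_p(\sqrt{\alpha\beta})}\, x^{p-1} \exp\Big\{-\frac{1}{2}\Big(\alpha x + \frac{\beta}{x}\Big)\Big\}, \quad x > 0,$$
where $K_p$ is a modified Bessel function of the third kind (sometimes called modified Bessel function of the second kind), defined as
$$K_p(z) = \frac{1}{2} \int_0^{+\infty} u^{p-1} e^{-z(u + 1/u)/2}\, du.$$
Lemma 1 The Inverse Gaussian family is closed under convolution: the sum $S_n = \sum_{i=1}^{n} X_i$ is again Inverse Gaussian, with $S_n \sim IG(n\mu, n^2\lambda)$.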

Lemma 2 Any Inverse Gaussian random variable can be represented as a Generalized Inverse Gaussian, and for the particular case of Lemma 1 the relationship is $f_{S_n}(x; n\mu, n^2\lambda) \equiv f_G(x; \lambda/\mu^2, n^2\lambda, -1/2)$.
Lemma 3 Modified Bessel functions of the third kind are symmetric around zero in the parameter $p$. In particular when $p = 1/2$, $K_{1/2}(z) = K_{-1/2}(z) = \sqrt{\pi/(2z)}\, e^{-z}$.

Lemma 4 The density of an Inverse Gaussian r.v. has the following property (which clearly holds for any power of $x$, with the proper adjustment in the last parameter of the GIG on the right hand side of Eq. 5):
$$x f_G(x; \lambda/\mu^2, n^2\lambda, -1/2) \equiv n\mu\, f_G(x; \lambda/\mu^2, n^2\lambda, 1/2). \qquad (5)$$
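The identities in Lemmas 2 to 4 can be spot-checked numerically. The following is a minimal sketch, assuming the standard IG density with mean $\mu$ and shape $\lambda$ and the GIG density normalised by $2K_p(\sqrt{\alpha\beta})$; the function names and the test values of $\mu$, $\lambda$, $n$ are illustrative choices, not taken from the paper.

```python
import math

def ig_pdf(x, mu, lam):
    """Inverse Gaussian density with mean mu and shape lam (assumed parameterisation)."""
    return math.sqrt(lam / (2.0 * math.pi * x**3)) * math.exp(
        -lam * (x - mu) ** 2 / (2.0 * mu**2 * x)
    )

def bessel_k_half(z):
    """K_{1/2}(z) = K_{-1/2}(z) = sqrt(pi / (2 z)) * exp(-z), as in Lemma 3."""
    return math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)

def gig_pdf_half(x, alpha, beta, p):
    """GIG density restricted to p = +1/2 or p = -1/2, where K_p is elementary."""
    if abs(p) != 0.5:
        raise ValueError("this sketch only covers p = +/- 1/2")
    norm = (alpha / beta) ** (p / 2.0) / (2.0 * bessel_k_half(math.sqrt(alpha * beta)))
    return norm * x ** (p - 1.0) * math.exp(-0.5 * (alpha * x + beta / x))

# Illustrative values (hypothetical): severity parameters and number of summands.
mu, lam, n = 2.0, 3.0, 4
alpha, beta = lam / mu**2, n**2 * lam
xs = [0.5, 2.0, 8.0, 20.0]

# Lemma 2: IG(n mu, n^2 lam) density equals GIG(lam/mu^2, n^2 lam, -1/2) density.
lemma2_gap = max(
    abs(ig_pdf(x, n * mu, n**2 * lam) - gig_pdf_half(x, alpha, beta, -0.5)) for x in xs
)
# Lemma 4: x * f_G(x; ., ., -1/2) equals n mu * f_G(x; ., ., +1/2).
lemma4_gap = max(
    abs(x * gig_pdf_half(x, alpha, beta, -0.5) - n * mu * gig_pdf_half(x, alpha, beta, 0.5))
    for x in xs
)
print(lemma2_gap, lemma4_gap)
```

Both gaps should be at the level of floating-point rounding; the check is only a sanity test at a few points and does not replace the proofs.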
Proof (of Lemmas 1-4) The proof of Lemma 1 can be found in Tweedie (1957, Section 2) and the result in Lemma 2 can be seen by comparing the kernel of both distributions. The symmetry in Lemma 3 can be seen through the following characterization of modified Bessel functions of the third kind

Closed-Form Multiple Optimal Stopping Rules for Multiple Insurance Purchase Decisions
In this Section we present some models in which the optimal rules can be calculated explicitly, with all the technical proofs postponed to the Appendix. Using the results presented in Section 4 we show that if we assume a Poisson-Inverse Gaussian LDA model, where $X_n \sim IG(\lambda, \mu)$ and $N \sim Poi(\lambda_N)$, the optimal times (years) to exercise or make claims on the insurance policy for the Accumulated Loss Policy (ALP) can be calculated analytically regardless of whether the global or the local gain (objective) function is considered. For the Individual Loss Policy (ILP), when the gain function is the local objective function given by the sum of the losses at the stopping times (insurance claim years), we propose to model the losses after the insurance policy is applied and, in this case, we present analytical solutions for the stopping rules. On the other hand, the ILP total loss case given by the global objective function does not produce a closed form solution. However, we show how a simple Monte Carlo scheme can be used to accurately estimate the results.
Since we assume the annual losses $Z(1), \ldots, Z(T)$ are identically distributed we will denote by $Z$ a r.v. such that $Z \sim Z(1)$. As in the other Sections, $\tilde Z$ is the insured process; $S_n = \sum_{k=1}^{n} X_k$ is the partial sum up to the $n$-th loss; $p_m = P[N = m]$ is the probability of observing $m$ losses in one year. The gain $W$ will be defined as either $-\tilde Z$, when the objective is to minimize the loss at the times the company uses the insurance policy (local optimality), or $Z - \tilde Z$, in case the function to be minimized is the total loss over the time horizon $[0, T]$ (global optimality).

Accumulated Loss Policy (ALP)
For the ALP case (see Definition 4) we can model the severity of the losses before applying the insurance policy. Conditional upon the fact that $\sum_{n=1}^{m} X_n > ALP$, the annual loss after the application of the insurance policy will be $\sum_{n=1}^{m} X_n - ALP$. With this in mind, we can then calculate the c.d.f.'s of the insured process, $\tilde Z$, and also of the random variable $Z - \tilde Z$.

Local Risk Transfer Objective: Minimizing the Loss at the Stopping Times
Proposition 1 (Local Risk Transfer Case) The cdf and pdf of the insured process are given, respectively, by where the constant $C_0$ is defined as $C_0 := \sum_{m=1}^{+\infty} F_{IG}(ALP; m\mu, m^2\lambda)\, p_m + p_0$.
After calculating the distribution of $\tilde Z$ we can calculate expectations of the form $E[\max\{c_1 + W, c_2\}]$ w.r.t. the loss process $\tilde Z$ and, therefore, obtain the multiple optimal stopping rules under the Accumulated Loss Policy via direct application of Theorem 1.
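As an illustration of how the constant $C_0$ and the insured c.d.f. might be evaluated, the sketch below assumes that the ALP insured loss is the excess of the annual loss over the level ALP (zero otherwise), so that for $z \geq 0$ the insured c.d.f. is $p_0 + \sum_{m \geq 1} p_m F_{IG}(z + ALP; m\mu, m^2\lambda)$. This form is derived here from the description of the policy and is not quoted from Proposition 1. The IG c.d.f. uses the standard normal-c.d.f. representation; the function names and the truncation point `m_max` are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal c.d.f. via erfc, which is stable in the left tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def ig_cdf(x, mu, lam):
    """Inverse Gaussian c.d.f. with mean mu and shape lam."""
    if x <= 0.0:
        return 0.0
    r = math.sqrt(lam / x)
    return norm_cdf(r * (x / mu - 1.0)) + math.exp(2.0 * lam / mu) * norm_cdf(
        -r * (x / mu + 1.0)
    )

def poisson_pmf(m, rate):
    """Poisson p.m.f. computed on the log scale."""
    return math.exp(-rate + m * math.log(rate) - math.lgamma(m + 1))

def insured_cdf_alp(z, mu, lam, lam_N, alp, m_max=60):
    """Assumed c.d.f. of the insured annual loss max(S_N - ALP, 0), truncated at m_max."""
    if z < 0.0:
        return 0.0
    total = poisson_pmf(0, lam_N)  # no losses in the year
    for m in range(1, m_max + 1):
        # Given N = m, the annual loss is IG(m * mu, m^2 * lam) by closure under convolution.
        total += poisson_pmf(m, lam_N) * ig_cdf(z + alp, m * mu, m * m * lam)
    return total

# Parameters of the ALP case study: (lambda, mu, lambda_N) = (3, 2, 3) and ALP = 10.
mu, lam, lam_N, alp = 2.0, 3.0, 3.0, 10.0
C0 = insured_cdf_alp(0.0, mu, lam, lam_N, alp)  # mass of the insured loss at zero
prob_exceed = 1.0 - C0                          # P[Z > ALP]
print(C0, prob_exceed)
```

The value of `prob_exceed` can be compared with the probability $P[Z > ALP]$ quoted in the case study for the same parameters.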

Global Risk Transfer Objective: Minimizing the Loss Over Period [0, T ]
If we assume the company wants to minimize its total loss over the period $[0, T]$, the gain achieved through the Accumulated Loss Policy (ALP) is given by $W = Z - \tilde Z = \min\{ALP, Z\}$. For notational convenience we will denote by $W_m = \min\{ALP, \sum_{n=1}^{m} X_n\}$ the annual gain conditional on the fact that $m$ losses were observed.

Proposition 2 (Global Risk Transfer Case: ALP)
The cdf and pdf of the gain process are given, respectively, by After calculating the distribution of the gain, W, we can calculate expectations w.r.t. it and, therefore, the multiple optimal stopping rule under the Accumulated Loss Policy is then obtained via direct application of Theorem 1.
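To make the role of expectations of the form $E[\max\{c_1 + W, c_2\}]$ concrete, the following sketch runs a generic backward induction for a multiple stopping problem with i.i.d. annual gains and exactly $k$ exercise rights over $T$ years. It is not a restatement of Theorem 1: the recursion is the textbook dynamic programme, the expectations are replaced by sample means, and the function names `sample_gain_alp` and `value_table` are hypothetical. The gain is taken as $W = \min\{ALP, Z\}$ (global objective).

```python
import numpy as np

def sample_gain_alp(rng, M, mu, lam, lam_N, alp):
    """Draw M annual gains W = min(ALP, Z) for a Poisson-IG compound loss Z."""
    n = rng.poisson(lam_N, size=M)
    z = np.zeros(M)
    pos = n > 0
    # Closure under convolution: given N = m >= 1, the annual loss is IG(m*mu, m^2*lam).
    z[pos] = rng.wald(n[pos] * mu, n[pos] ** 2 * lam)
    return np.minimum(z, alp)

def value_table(w, T, k):
    """V[t, j]: expected total gain from years t..T when exactly j rights must still be used."""
    V = np.full((T + 2, k + 1), -np.inf)  # -inf marks infeasible states
    V[:, 0] = 0.0                          # no rights left: no further gain
    for t in range(T, 0, -1):
        for j in range(1, min(k, T - t + 1) + 1):
            c1 = V[t + 1, j - 1]  # continuation value if the right is exercised in year t
            c2 = V[t + 1, j]      # continuation value if the company waits
            V[t, j] = np.mean(np.maximum(c1 + w, c2))
    return V

rng = np.random.default_rng(0)
w = sample_gain_alp(rng, 50_000, mu=2.0, lam=3.0, lam_N=3.0, alp=10.0)
V = value_table(w, T=8, k=3)
print(V[1, 3], 3 * w.mean())
```

In this sketch a right is exercised in year $t$ with $j$ rights left whenever the observed gain is at least `V[t + 1, j] - V[t + 1, j - 1]`; when waiting is infeasible the threshold is $-\infty$ and the right is always used. The same sample `w` is reused for every expectation.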

Individual Loss Policy (ILP)
The previous insurance policy, the ALP structure, is based on the aggregated loss amount throughout the year. In the ILP structure, by contrast, the coverage applies to each individual loss event rather than to the accumulated annual loss.

Local Risk Transfer Objective: Minimizing the Loss at the Stopping Times
Let us assume a company buys the insurance policy called Individual Loss Policy (ILP). In this case, a particular loss process observed by the company after applying the insurance policy may be given by In this case we can define a new process $(\tilde X_n)_{n \geq 1}$ such that and the annual insured loss would be given by $\tilde Z = \sum_{n=1}^{\tilde N} \tilde X_n$. Note that in this example the new process $(\tilde X_n)_{n \geq 1}$ would have $\tilde N < N$ non-zero observations and, in general, $\tilde N \leq N$. The process $(\tilde X_n)_{n \geq 1}$ can be interpreted as an auxiliary process, meaning that if the company had claimed on the insurance policy for this year then the observed losses would have been $\tilde X_n$ instead of $X_n$.
In our approach we will model the random variable $\tilde N$ and the process $(\tilde X_n)_{n \geq 1}$, the first as a homogeneous Poisson process with mean $\lambda_N$ and the second as a sequence of i.i.d. random variables such that $\tilde X \sim IG(\lambda, \mu)$.

Theorem 4 (Local Risk Transfer Case: ILP) Assuming that $\tilde N \sim Poi(\lambda_N)$ and
for t = 1, . . . , T . In this case the optimal stopping rule is given by Eq. 1, where

Global Risk Transfer Objective: Minimizing the Loss Over Period [0, T ] via Monte Carlo
If we assume the frequency of annual losses is given by $N \sim Poi(\lambda_N)$ and its severities by $X_i \sim IG(\lambda, \mu)$ then the gain process $W$ is given by From Lemma 1 we know the Inverse Gaussian family is closed under convolution, but the distribution of the sum of truncated Inverse Gaussian r.v.'s does not take any known form. A simple and effective way to approximate the expectations needed to calculate the optimal multiple stopping rule is to use a Monte Carlo scheme as follows.
By the end of this process we will have a sample $W^{(1)}, \ldots, W^{(M)}$ from the gain, which can be used to approximate, for any given values of $0 < c_1 < c_2$, the expectations as
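A Monte Carlo scheme of this kind might look as follows. This is a minimal sketch under a loudly stated assumption: the ILP is taken to reimburse each individual loss up to a cap, so the annual gain is the sum of $\min\{X_n, \text{cap}\}$ over the losses of the year. The cap `cover_limit`, its value, and the function names are hypothetical and should be replaced by the definition of the ILP used in the paper. The severity sampler relies on NumPy's `Generator.wald(mean, scale)`.

```python
import numpy as np

def sample_gain_ilp(rng, M, mu, lam, lam_N, cover_limit):
    """Draw M annual gains, assuming each loss is reimbursed up to cover_limit."""
    n = rng.poisson(lam_N, size=M)                 # number of losses in each scenario
    x = rng.wald(mu, lam, size=int(n.sum()))       # all IG severities, mean mu and shape lam
    year = np.repeat(np.arange(M), n)              # scenario index of each severity
    # Sum the capped severities within each scenario; empty years give zero gain.
    return np.bincount(year, weights=np.minimum(x, cover_limit), minlength=M)

def mc_expectations(w, c1, c2):
    """Sample-mean estimates of E[W] and E[max{c1 + W, c2}]."""
    return w.mean(), np.maximum(c1 + w, c2).mean()

rng = np.random.default_rng(1)
# Severity and frequency parameters of the ILP case study: (lambda, mu, lambda_N) = (3, 1, 4).
w = sample_gain_ilp(rng, 20_000, mu=1.0, lam=3.0, lam_N=4.0, cover_limit=1.5)
mean_w, mean_max = mc_expectations(w, c1=1.0, c2=3.0)
print(mean_w, mean_max)
```

Drawing the sample once and reusing it for every pair $(c_1, c_2)$ keeps the estimates of the different expectations mutually consistent.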

Case Studies
In this Section we will analyse the results provided by the optimal rule in the scenario where analytical expressions are available. Although the loss distribution parameters are different for each insurance policy, in this section we will assume the insurance product is valid for $T = 8$ years and gives its owner the right to mitigate $k = 3$ losses. First, for the Accumulated Loss Policy (ALP), Fig. 5 presents a comparison of the two objective functions (Global and Local Risk Transfer), when the LDA parameters are $(\lambda, \mu, \lambda_N) = (3, 2, 3)$ and the insurance specific parameter is set to $ALP = 10$. In this case the probability of having an annual loss that would make it worth utilising the insurance product in one year is $P[Z > ALP] \approx 20\%$. In this study, for a large number of scenarios, $M$ = 50,000, the optimal rules for both objective functions were calculated and, for each scenario, the set of stopping times $(m_1, m_2, m_3)$ was recorded. At the bottom of Fig. 5 we can see that the exercising strategy is considerably different for the two objective functions. For the Global Risk Transfer, we can see that, fixing the first two stopping times, say $(m_1, m_2) = (1, 2)$, it is preferable (on average) to use the remaining right as early as possible. Another way to see the same pattern is to verify that the frequency of occurrence of the set of strategies (1, 2, 3); (2, 3, 4); (3, 4, 5); (4, 5, 6) is decreasing, again indicating a prevalence of early exercise strategies. On the other hand, if the objective is to minimize the "local risk", in more than 25% of the cases the optimal strategy will be to use the rights as soon as possible.
On the top of Fig. 5 we present histograms of the total loss over [0, T ] (i) without insurance (solid line); (ii) using the global objective function (dark grey); (iii) using the local objective function (light grey). As expected the mean of the total loss when using the local loss function is greater than the global one, but still smaller than the total loss without any insurance.
For both the ALP and the ILP case, we want to check the optimality of the rules presented, comparing them with pre-specified stopping rules. Denoting by $(m_1, m_2, m_3)$ the three stopping times, the rules are defined as follows.

[Fig. 5, bottom panels: frequency of each exercise strategy $(m_1, m_2, m_3)$, from (1, 2, 3) to (6, 7, 8), under the Global Risk Transfer and Local Risk Transfer objectives.]

For a large number of scenarios, $M$ = 10,000, we calculated the loss for each of the four rules (the Optimal, the Deterministic, the Random and the Average rules) and plotted the histogram, comparing with the expected loss under the optimal rule; see Fig. 6 for the Accumulated Loss Policy (ALP) and Fig. 7 for the Individual Loss Policy (ILP). In all the examples the Optimal rule outperforms the other three, showing the difficulty of creating a stopping rule that leads to losses as small as the optimal one. In the first row of histograms in Fig. 6 the results are related to the global loss function, and in the second one to the local loss. Note that the horizontal axis in each row is exactly the objective function we are trying to minimize, precisely, for the global optimization and $\sum_{j=1}^{k} \tilde Z(\tau_j)$ for the local one. In this figure the vertical dashed bar represents the average total loss under the different rules and the solid grey line is defined as These values must be understood as the expected loss under each of the two different gain functions and are easily derived from the definition of the gain functions and Theorem 1.
In Fig. 7 we present the same comparison as in the second row of Fig. 6 using the modelling proposed in Section 5.2.1, with parameters $(\lambda, \mu, \lambda_N) = (3, 1, 4)$. For this simulation study the conclusion is similar to the one drawn from the ALP case, where the pre-defined stopping rules underperformed the multiple optimal rule.

[Fig. 7: histograms of the loss when insurance is used, under the Local Risk Transfer objective; panels include "Optimal Rule" and "Rule 3".]

Series Expansion for the Density of the Insured Process
Section 5 presented some combinations of Insurance Policies and LDA models that led to closed form solutions for the multiple stopping rule. For the cases where analytical solutions cannot be found, one alternative is to create a series expansion of the density of the insured process $\tilde Z$ such that all the expectations necessary in Theorem 1 can be analytically calculated. In this Section we will assume the first $n$ moments of the distribution of the insured process $\tilde Z$ are known and our objective is to minimize the local risk, but the calculations are also valid if we work with the global optimization problem (in this case one should use $Z - \tilde Z$ instead of $\tilde Z$).

Gamma Basis Approximation
If the first $n$ moments of the insured process $\tilde Z$ can be calculated (either algebraically or numerically) and the support of the insured random variable is $[0, +\infty)$, one can use a series expansion of the density of $\tilde Z$ in a Gamma basis. For notational convenience, define a new random variable $U = b\tilde Z$, where $b = E[\tilde Z]/Var[\tilde Z]$, and set $a = E[\tilde Z]^2/Var[\tilde Z]$. Denoting by $f_U$ the density of $U$, the idea, as in the Gaussian case of a Gram-Charlier expansion (see, e.g., Jondeau and Rockinger 2001), is to write $f_U$ as Since $supp(U) = supp(\tilde Z) = [0, +\infty)$ we assume the kernel $g(\cdot\,; a)$ also has positive support (differently from the Gram-Charlier expansion, where $g(\cdot)$ is chosen as a Gaussian kernel). If $g(u; a) = u^{a-1} e^{-u}/\Gamma(a)$, i.e., a Gamma kernel with shape $a$ and scale 1, then the orthonormal polynomial basis (with respect to this kernel) is given by the Laguerre polynomials (in contrast to Hermite polynomials in the Gaussian case) defined as
$$L_n^{(a)}(u) = (-1)^n u^{1-a} e^{u} \frac{d^n}{du^n}\left(u^{n+a-1} e^{-u}\right). \qquad (11)$$
Remark 4 Note that the definition of the Laguerre polynomials in Eq. 11 is slightly different from the usual one, i.e., the one based on Rodrigues' formula and using the fact that $f_U$ can be written in the form of Eq. 10 we find that Then, using the characterization of $A_n$ in Eq. 12 and the fact that $E[U] = Var[U] = a$ we can see that

Similar but lengthier calculations show that for
Therefore, matching the first four moments, the density of the original random variable $\tilde Z$ can be approximated as where $u = bz$, $A_3$ and $A_4$ are given, respectively, by Eqs. 13 and 14 and the Laguerre polynomials can be found in Table 3. For additional details on the Gamma expansion we refer the reader to Bowers (1966). Since this expansion does not ensure positivity of the density at all points (it can be negative for particular choices of skewness and kurtosis) we will adopt the approach discussed in Jondeau and Rockinger (2001) for the Gauss-Hermite Gram-Charlier case, modified to the Gamma-Laguerre setting. To find the region on the $(\mu_3, \mu_4)$-plane where $f_U(u)$ is positive for all $u$ we will first find the region where $f_U(u) = 0$ (Eq. 15).

Table 3 Laguerre polynomials
$L_2^{(a)}(u) = u^2 - 2(a+1)u + (a+1)a$
$L_3^{(a)}(u) = u^3 - 3(a+2)u^2 + 3(a+2)(a+1)u - (a+2)(a+1)a$
$L_4^{(a)}(u) = u^4 - 4(a+3)u^3 + 6(a+3)(a+2)u^2 - 4(a+3)(a+2)(a+1)u + (a+3)(a+2)(a+1)a$
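The polynomials $L_2^{(a)}$, $L_3^{(a)}$ and $L_4^{(a)}$ can be checked for orthogonality with respect to the Gamma kernel using only the moments of a Gamma(a, 1) variable, $E[U^k] = a(a+1)\cdots(a+k-1)$. The sketch below is an illustration, not part of the original derivation: $L_0^{(a)} = 1$ and $L_1^{(a)} = u - a$ are obtained here from the Rodrigues-type definition, the value $a = 1.05$ is arbitrary, and the helper names are hypothetical.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in ascending powers."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def gamma_moment(k, a):
    """E[U^k] for U ~ Gamma(shape=a, scale=1), i.e. a (a+1) ... (a+k-1)."""
    m = 1.0
    for i in range(k):
        m *= a + i
    return m

def kernel_expectation(p, a):
    """Expectation of the polynomial p(U) under the Gamma(a, 1) kernel."""
    return sum(c * gamma_moment(k, a) for k, c in enumerate(p))

def laguerre_table(a):
    """Coefficients (ascending powers) of L_0, ..., L_4 in the monic scaling of the table."""
    return {
        0: [1.0],
        1: [-a, 1.0],
        2: [(a + 1) * a, -2 * (a + 1), 1.0],
        3: [-(a + 2) * (a + 1) * a, 3 * (a + 2) * (a + 1), -3 * (a + 2), 1.0],
        4: [
            (a + 3) * (a + 2) * (a + 1) * a,
            -4 * (a + 3) * (a + 2) * (a + 1),
            6 * (a + 3) * (a + 2),
            -4 * (a + 3),
            1.0,
        ],
    }

a = 1.05  # arbitrary shape parameter for the check
L = laguerre_table(a)
gram = {(m, n): kernel_expectation(poly_mul(L[m], L[n]), a) for m in L for n in L}
print(max(abs(v) for (m, n), v in gram.items() if m != n))
```

With this scaling the off-diagonal entries vanish, while the diagonal entries equal $n!\,a(a+1)\cdots(a+n-1)$, so the polynomials are orthogonal and would need to be rescaled to have unit norm.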
For a fixed value $u$, we now want to find the set $(\mu_3, \mu_4)$ as a function of $u$ such that Eq. 15 remains zero for small variations in $u$. This set is given by $(\mu_3, \mu_4)$ such that $\frac{d}{du}$ We can then rewrite (15) and (16) as the following system of algebraic equations Therefore, one can solve this system to show that the curve where the approximation will stay positive for all $u$ is given by: , for $u \in [0, +\infty)$.
As an illustration, Fig. 8 presents (on the left) the histogram of the loss process $Z = \sum_{n=1}^{N} X_n$ for $X \sim LN(\mu = 1, \sigma = 0.8)$ and $N \sim Poi(\lambda_N = 2)$ and, in grey, the Gamma approximation using the first four moments of $Z$. On the right is the graph of the region where the density is positive for all values of $u$, given by Eq. 17. The grey area was ... If the third and fourth moments of the chosen model lie outside the permitted area, one could choose $\mu_3$ and $\mu_4$ as the estimates that solve some constrained optimization problem, for instance, the Maximum Likelihood Estimator (using the approximating density $f_U(u; \mu_3, \mu_4)$ as the likelihood). The constrained region is clearly given by a segment of the curve in Eq. 17 and the endpoints can be found using a root-search method, checking for which values of $u$ the red curve in Fig. 8 touches the grey area.
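For the Poisson-LogNormal example, the constants $a$ and $b$ of the Gamma basis can be obtained in closed form, because the $j$-th cumulant of a compound Poisson sum is $\lambda_N E[X^j]$ and the raw moments of a LogNormal are $E[X^j] = \exp(j\mu + j^2\sigma^2/2)$. The sketch below is illustrative: it takes $\mu_3$ and $\mu_4$ to be the third and fourth central moments of $U = bZ$, which is an assumption about the convention and may differ from the one used in Eqs. 13 and 14.

```python
import math

def lognormal_raw_moment(j, m, s):
    """E[X^j] for X ~ LogNormal(m, s)."""
    return math.exp(j * m + 0.5 * j * j * s * s)

# Parameters of the illustration: X ~ LN(1, 0.8), N ~ Poi(2).
m, s, rate = 1.0, 0.8, 2.0

# Cumulants of the compound Poisson sum Z: kappa_j = rate * E[X^j].
k1, k2, k3, k4 = (rate * lognormal_raw_moment(j, m, s) for j in (1, 2, 3, 4))

mean_Z, var_Z = k1, k2
a = mean_Z**2 / var_Z   # shape of the Gamma kernel
b = mean_Z / var_Z      # scaling U = b * Z, so that E[U] = Var[U] = a

# Central moments of U from the cumulants (mu_3 = kappa_3, mu_4 = kappa_4 + 3 kappa_2^2).
mu3_U = b**3 * k3
mu4_U = b**4 * (k4 + 3.0 * k2**2)
print(a, b, mu3_U, mu4_U)
```

These values could then be compared with the admissible region for the third and fourth moments before the four-moment approximation is used.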
Given the approximation of $f_U$, and consequently of $f_{\tilde Z}$, one can easily calculate the optimal multiple stopping rule, since $E[\tilde Z]$ is assumed to be known and $E[\min\{c_1 + \tilde Z, c_2\}]$ can be calculated as follows. If $G \sim Gamma(a, 1)$, i.e., $f_G(x) = x^{a-1} e^{-x}/\Gamma(a)$, then, similarly to Lemma 4, the following property holds: $x f_G(x; a, 1) \equiv a f_G(x; a + 1, 1)$.
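As a worked illustration of how this property is used, consider only the leading Gamma kernel, i.e. a variable $G \sim Gamma(a, 1)$, and write $d := c_2 - c_1 > 0$ (the symbol $d$ is introduced here for convenience). The higher-order Laguerre terms of the expansion can be handled in the same way by applying the property repeatedly.

```latex
% Assumptions: G ~ Gamma(a, 1), 0 < c_1 < c_2, d := c_2 - c_1,
% F_G(. ; a, 1) denotes the Gamma(a, 1) c.d.f.
\begin{align*}
E[\min\{c_1 + G, c_2\}]
  &= \int_0^{d} (c_1 + x)\, f_G(x; a, 1)\, dx + c_2 \int_d^{+\infty} f_G(x; a, 1)\, dx \\
  &= c_1 F_G(d; a, 1) + a \int_0^{d} f_G(x; a + 1, 1)\, dx + c_2 \left[1 - F_G(d; a, 1)\right] \\
  &= a\, F_G(d; a + 1, 1) + (c_1 - c_2)\, F_G(d; a, 1) + c_2 .
\end{align*}
```

The result has the same structure as the Inverse Gaussian expression in the Appendix, with the Gamma c.d.f.'s of shapes $a + 1$ and $a$ playing the roles of the GIG c.d.f.'s with $p = 1/2$ and $p = -1/2$.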

Conclusion and Final Remarks
In this paper we studied some properties of an insurance product whose owner has the right to choose in which $k$ of the next $T$ years the issuer should mitigate its annual losses. For two different forms of mitigation we presented closed form solutions for the exercising strategy that minimizes (on average) the sum of all annual losses in the next $T$ years. This model assumed a "moderate tail" for the severity of the losses the owner incurs, namely a Poisson-Inverse Gaussian LDA model. Although it is assumed the company already holds the proposed contract, the company can use the analysis presented in Fig. 6 as a proxy for the price of the insurance product. The value, from the company's point of view, of the insurance product should be the expected difference (under the natural probability) between the losses that would be incurred without the product and the losses incurred using the product in the most profitable way (for the buyer). It must also be said that this price does not include the premium asked by the insurance company and does not take into consideration the fact that external insurance companies will not have access to the models used by the company, but it can still be a valuable proxy. Also, the impact of the proposed products on the capital requirements and the effect of the inclusion of deductibles require further research in future studies.
An alternative to the results presented in Section 7 can involve the use of a Monte Carlo method. If there exists a mechanism to sample from the severity distribution one can easily create a sample of the insured process $\tilde Z$ and use this sample to calculate all the necessary expectations in Theorem 1. The advantage of this approach is that one can handle any combination of severity distribution and insurance policy, but it can be extremely time consuming and the variance of the estimate can be prohibitive. It is important to note that the sampling of the severity can be done offline, i.e., the same sample should be used to calculate all the integrals. Another alternative to solve the optimal multiple stopping problem is the usage of an extended version of the so-called Least-Squares Monte Carlo method, first presented in Longstaff and Schwartz (2001) and extended to the multiple stopping scenario in Bender and Schoenmakers (2006) (see also Bender et al. 2013 for recent related work).
Regarding the results presented in Theorems 4 and 2, the truncation point for the infinite sums can be chosen to be much larger than the expected number of losses (parameter $\lambda_N$), since the summands are composed of a p.m.f. of a Poisson r.v. (which presents exponential decay) and a bounded term (difference of c.d.f.'s times constants).
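One simple way to choose such a truncation point is to stop once the neglected Poisson tail mass falls below a tolerance; because the remaining factor of each summand is bounded, the tail mass controls the truncation error up to a constant. The sketch below is a hypothetical helper (the name `poisson_truncation_point` and the tolerance `eps` are not from the paper) and is adequate for moderate rates, where `exp(-rate)` does not underflow.

```python
import math

def poisson_truncation_point(rate, eps=1e-12):
    """Smallest n such that the computed tail mass P[N > n] is at most eps, N ~ Poi(rate)."""
    pmf = math.exp(-rate)  # P[N = 0]
    cdf = pmf
    n = 0
    while 1.0 - cdf > eps:
        n += 1
        pmf *= rate / n    # recursion P[N = n] = P[N = n - 1] * rate / n
        cdf += pmf
    return n

n_max = poisson_truncation_point(3.0)
print(n_max)
```

For the frequency parameters used in the case studies this gives a truncation point that is several times larger than $\lambda_N$, while still keeping the sums short.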
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Compliance with Ethical Standards
Conflict of interests The authors declare that they have no conflict of interest.

Appendix: Proofs
Proof (of Proposition 1) The p.d.f. easily follows from the differentiation of $F_{\tilde Z}(z)$ with respect to $z$, but it is important to note that $f_{\tilde Z}$ is a continuous density with discrete mass at $z = 0$, i.e.,

Proof (of Theorem 2) As in Theorem 4, to calculate the optimal rule we only need to calculate $E[W]$ and $E[\max\{-c_1 + W, -c_2\}]$, for $0 < c_1 < c_2$. Given the expression (7) Since we assumed $w < ALP$, we have that $\min\{w, ALP\} = w$ and the indicator function on the second term is always equal to zero, leading to the following expression for an arbitrary w ≥:

For the second term, first note that $E[\max\{-c_1 + W, -c_2\}] = (-1) E[\min\{c_1 + \tilde Z, c_2\}]$ and it then follows that, for $0 < c_1 < c_2$,
$$E[\min\{c_1 + \tilde Z, c_2\}] = \sum_{n=1}^{+\infty} \Pr[\tilde N = n] \left[ n\mu\, F_{GIG}(c_2 - c_1; \lambda/\mu^2, n^2\lambda, 1/2) + (c_1 - c_2)\, F_{GIG}(c_2 - c_1; \lambda/\mu^2, n^2\lambda, -1/2) + c_2 \right] + c_1 \Pr[\tilde N = 0].$$
Note that, for notational ease, $f_{\tilde Z}$ must be understood as the absolutely continuous part of the density of $\tilde Z$.