Optimal reinsurance via BSDEs in a partially observable model with jump clusters

We investigate the optimal reinsurance problem when the loss process exhibits jump clustering features and the insurance company has restricted information about the loss process. We maximize the expected exponential utility of terminal wealth and show that an optimal solution exists. By exploiting both the Kushner-Stratonovich and Zakai approaches, we provide the equation governing the dynamics of the (infinite-dimensional) filter and characterize the solution of the stochastic optimization problem in terms of a BSDE, for which we prove existence and uniqueness of the solution. After discussing the optimal strategy for a general reinsurance premium, we provide more explicit results in some relevant cases.


Introduction
Optimal reinsurance problems have attracted special attention during the past few years and have been investigated in many different model settings. Insurance companies can hardly deal with all the different sources of risk in the real world, so they hedge against at least part of them by re-insuring with other institutions. A reinsurance agreement allows the primary insurer to transfer part of the risk to another company, and it is well known that this is an effective tool in risk management. Moreover, the subscription of such contracts is required by some financial regulators; see e.g. the Solvency II Directive in the European Union. A large part of the existing literature focuses mainly on classical reinsurance contracts such as the proportional and the excess-of-loss, which were extensively investigated under a variety of optimization criteria, e.g. ruin probability minimization, dividend optimization and expected utility maximization. Here we are interested in the latter approach (see Irgens and Paulsen [20], Mania and Santacroce [26], Brachetta and Ceci [3] and references therein). Some of the classical papers devoted to the subject assume diffusive dynamics for the surplus process, while the more recent literature considers surplus processes including jumps.
The pioneering risk model with jumps in non-life insurance is the classical Cramér-Lundberg model, where the claims arrival process is a Poisson process with constant intensity. This assumption implies that the instantaneous probability that an accident occurs is always constant, which is too restrictive in the real world, as already argued by Grandell [18]. In recent years, many authors have made a great effort to go beyond the classical model formulation. For example, Cox processes were employed to introduce a stochastic intensity for the claims arrival process; see e.g. Albrecher and Asmussen [1], Bjork and Grandell [2], Embrechts et al. [17]. Moreover, other authors introduced Hawkes processes in order to capture the self-exciting property of the insurance risk model in the presence of catastrophic events. Hawkes processes were introduced by Hawkes [19] to describe geological phenomena with clustering features, such as earthquakes. Hawkes processes with general kernels are not Markov processes and can even exhibit long-range dependence, while Hawkes processes with exponential kernel have the appealing property that the pair process-intensity is Markovian; moreover, they are affine processes according to the definition provided by Duffie, Filipovic and Schachermayer [16]. For the latter strand of literature we mention Stabile and Torrisi [32] and Swishchuk et al. [34]. Dassios and Zhao [12] proposed a model which combines the two approaches by introducing a Cox process with shot noise intensity and a Hawkes process with exponential kernel for describing the claim arrival dynamics. Recently, Cao, Landriault and Li [7] investigated the optimal reinsurance-investment problem in the model setting proposed by Dassios and Zhao [12] with a reward function of mean-variance type.
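To illustrate the clustering mechanism of such dynamic contagion models, the following sketch simulates claim arrivals whose intensity decays exponentially, jumps at exogenous (Poisson) shocks and self-excites at each claim. It is a minimal toy implementation via Ogata-style thinning; all parameter names (`beta`, `alpha`, `rho`, `ext_jump`, `self_jump`) are our illustrative assumptions, not the paper's notation.

```python
import math
import random

def simulate_dynamic_contagion(T=10.0, lam0=2.0, beta=1.0, alpha=1.5,
                               rho=0.5, ext_jump=1.0, self_jump=0.8, seed=7):
    """Toy dynamic contagion process simulated by Ogata-style thinning.

    The claim intensity decays toward the baseline `beta` at rate `alpha`,
    jumps by `ext_jump` at exogenous shocks (Poisson with rate `rho`) and
    by `self_jump` at each claim arrival (the self-exciting part).
    """
    random.seed(seed)
    t, lam = 0.0, lam0
    claims, shocks = [], []
    while True:
        bound = lam + rho                # dominates the total rate until the next event
        w = random.expovariate(bound)
        lam = beta + (lam - beta) * math.exp(-alpha * w)   # deterministic decay
        t += w
        if t > T:
            return claims, shocks
        u = random.uniform(0.0, bound)
        if u < rho:                      # exogenous shock: only excites the intensity
            lam += ext_jump
            shocks.append(t)
        elif u < rho + lam:              # accepted claim arrival: self-excitation
            lam += self_jump
            claims.append(t)
        # else: thinning rejection, no event occurs

claims, shocks = simulate_dynamic_contagion()
print(len(claims), "claims,", len(shocks), "exogenous shocks")
```

Since the intensity only decreases between events, the current value plus `rho` is a valid thinning bound over each inter-event interval.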
A different line of research related to the optimal reinsurance-investment problem focuses on the possibility that the insurer does not have access to all the information when choosing the reinsurance strategy. As a matter of fact, only the claim arrivals and the corresponding disbursements are observable. In this case we need to solve a stochastic optimization problem under partial information. Liang and Bayraktar [23] were the first to introduce a partial information framework in optimal reinsurance problems. They consider the optimal reinsurance and investment problem in an unobservable Markov-modulated compound Poisson risk model, where the intensity and the jump size distribution are not known but have to be inferred from the observations of claim arrivals. Ceci, Colaneri and Cretarola [10] derive risk-minimizing investment strategies when the information available to investors is restricted, and they provide optimal hedging strategies for unit-linked life insurance contracts. Jang, Kim and Lee [21] present a systematic comparison between optimal reinsurance strategies in the complete and partial information frameworks and quantify the information value in a diffusion setting.
More recently, Brachetta and Ceci [4] investigate the optimal reinsurance problem under the criterion of maximizing the expected exponential utility of terminal wealth when the insurance company has restricted information on the loss process, in a model where the claim arrival intensity and the claim size distribution are affected by an unobservable environmental stochastic factor.
In the present paper we investigate the optimal reinsurance strategy for a risk model with jump clustering properties in a partial information setting. The risk model is similar to that proposed by Dassios and Zhao [12] and it includes two different jump processes driving the claim arrivals: one process with constant intensity, describing the exogenous jumps, and another with stochastic intensity, representing the endogenous jumps and exhibiting self-exciting features. The externally-excited component represents catastrophic events, which generate claim clustering by increasing the claim arrival intensity. The endogenous part allows us to capture the clustering effect due to self-excitation: when an accident occurs, it increases the likelihood of further such events. The insurance company has only partial information at its disposal; more precisely, the insurer can only observe the cumulative claims process. The externally-excited component of the intensity is not observable, and the insurer needs to estimate the stochastic intensity by solving a filtering problem. Our approach is substantially different from that of Cao et al. [7] in several respects: firstly, we work in a partial information setting; secondly, the intensity of the self-excited claim arrivals exhibits a slightly more general dependence on the claim severity; finally, we maximize an exponential utility function instead of following a mean-variance criterion. In this partially observable framework, our goal is to characterize the value process and the optimal strategy. The optimal stochastic control problem in our case turns out to be infinite-dimensional, and the characterization of the optimal strategy cannot be performed by solving a Hamilton-Jacobi-Bellman equation, but rather via a BSDE approach.
A difficulty naturally arises when dealing with Hawkes processes: the intensity of the jumps is not bounded a priori, although a non-explosion condition holds. Hence we cannot exploit some relevant bounds which are usually required to prove a verification theorem and results on existence and uniqueness of the solution to the related BSDE. Nevertheless, we show that the optimal stochastic control problem has a solution, which admits a characterization in terms of the unique solution to a suitable BSDE.
Our paper contributes to the literature on optimal reinsurance problems in several directions. First, we provide a rigorous and formal construction of the dynamic contagion model. Second, we study the filtering problem associated with our model, providing a characterization of the filter process in terms of both the Kushner-Stratonovich equation and the Zakai equation. To the best of our knowledge, this problem has not been addressed in the existing literature. We refer to Dassios and Jang [13] for a similar problem without the self-exciting component. Third, we solve the optimal reinsurance problem under the expected utility criterion.
We remark that our study differs from Brachetta and Ceci [4] in many key aspects. The risk model is substantially different: it requires considerable effort to be rigorously constructed and calls for the study of a new filtering problem. What is more, a crucial assumption in Brachetta and Ceci [4] is the boundedness of the claim arrival intensity, which is not satisfied in our case, thus leading to additional technicalities in most of the proofs; this is the case, for example, in the proof of existence and uniqueness of the solution to the BSDE. Moreover, we perform the optimization over a class of admissible contracts, instead of maximizing over the retention level. This feature allows us to cover a larger class of problems. Finally, we do not require the existence of an optimal control for the derivation of the BSDE, hence the general presentation turns out to be different.
The paper is organized as follows. In Section 2 we introduce the risk model and specify what information is available to the insurer. A rigorous mathematical construction, based on a measure change approach, is provided; it is necessary to develop the subsequent analysis in full detail. In Section 3 the filtering problem is investigated in order to reduce the optimal stochastic control problem to a complete information setting. The stochastic differential equation satisfied by the filter is obtained by exploiting both the Kushner-Stratonovich and the Zakai approaches. In Section 4 the optimal stochastic control problem is formulated, while in Section 5 a characterization of the value process associated with the optimal stochastic control problem is illustrated. Due to the infinite dimensionality of the filter, the approach based on the Hamilton-Jacobi-Bellman equation cannot be exploited, so the value process is characterized as the unique solution of a BSDE. In Section 6 the optimal reinsurance strategy is investigated under general assumptions and some relevant cases are discussed. Some proofs and useful computations are collected in Appendices A, B and C.

The mathematical model
Let (Ω, F, P; F) be a filtered probability space and assume that the filtration F = {F_t, t ∈ [0, T]} satisfies the usual hypotheses. The time T > 0 is a finite time horizon that represents the maturity of a reinsurance contract. Here we start by giving an overview of the optimal reinsurance problem from the primary insurer's point of view; then, in Section 2.1, we provide a rigorous construction of our model setting.
Notice that the counting process N^(1) is defined via its intensity λ in Equation (2.1), which in turn depends on the history of N^(1). So, an apparent logical loop seems to arise concerning the existence of λ. We postpone this issue to Section 2.1, where we perform a rigorous construction of the model based on an equivalent change of probability measure.
The following assumption will hold from now on.

Assumption 2.1. We assume N^(2), {Z_n^(1)}_{n≥1} and {Z_n^(2)}_{n≥1} to be independent of each other.
We define the cumulative claim process C = {C_t, t ∈ [0, T]} as

C_t = Σ_{n ≥ 1} Z_n^(1) 1_{T_n^(1) ≤ t}, t ∈ [0, T]. (2.2)

Remark 2.2. Our model includes many meaningful properties of risk models. The claim arrival process has stochastic intensity, reflecting random changes in the instantaneous probability that accidents occur. Most importantly, our framework captures both self-exciting (endogenous) and externally-exciting (exogenous) factors via, respectively, the claim arrival times and sizes {(T_n^(1), Z_n^(1))}_{n≥1} and {(T_n^(2), Z_n^(2))}_{n≥1}. For this reason, it is well suited to describe, for instance, catastrophic events; see Cao, Landriault and Li [7], where the self-exciting jump sizes are independent of the claim severity. In contrast, in our model they depend on the claim sizes through ℓ(Z_j^(1)). Moreover, a decay coefficient is included, because catastrophic events typically exhibit this decaying behavior.
The insurance company is allowed to subscribe a reinsurance contract with a retention function Φ(z, u), parametrized by a dynamic reinsurance strategy u_t ∈ U, t ∈ [0, T] (the control). That is, under a dynamic strategy u = {u_t, t ∈ [0, T]}, the aggregate losses covered by the insurer, denoted by C^u, are given by

C_t^u = Σ_{n ≥ 1} Φ(Z_n^(1), u_{T_n^(1)}) 1_{T_n^(1) ≤ t}, t ∈ [0, T],

so that the remaining losses (C − C^u) are undertaken by the reinsurer. We highlight that in our setting the insurer can choose the optimal reinsurance arrangement over a class of admissible contracts; see Section 4 for details. For this service a reinsurance premium rate q^u = {q_t^u, t ∈ [0, T]} must be paid. Hence the primary insurer receives the insurance premium rate c, pays the reinsurance premium rate q^u and bears the aggregate losses C^u, so that the surplus process R^u follows the SDE

dR_t^u = (c_t − q_t^u) dt − dC_t^u, R_0^u = R_0,

where R_0 denotes the initial capital. Investing the surplus in a risk-free asset with interest rate r > 0, the total wealth X^u of the primary insurer satisfies

dX_t^u = dR_t^u + r X_t^u dt, X_0^u = R_0.

We assume that the information at disposal is limited: the insurer only observes the cumulative claims process C in Equation (2.2). Let us denote by H the natural filtration generated by C:

H_t = σ(C_s, s ≤ t), t ∈ [0, T]. (2.3)

We assume that the insurer and the reinsurer have the same information, represented by H. Therefore, the insurance and the reinsurance premia have to be H-predictable. The same applies to the insurer's control u. The insurer aims at maximizing the expected exponential utility of terminal wealth over a suitable class U of H-predictable strategies (which will be made precise later in Definition 4.4):

sup_{u ∈ U} E[−e^{−η X_T^u}],

where η > 0 denotes the insurer's risk aversion. More mathematical details on the control problem to be solved will be given in Section 4.
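The interplay between premia, retained claims and interest accrual can be sketched numerically. The toy simulation below uses proportional retention and a constant claim intensity in place of the contagion intensity, and computes the reinsurance premium by the expected value principle; all numerical values and names are our illustrative assumptions.

```python
import math
import random

def wealth_path(T=1.0, r=0.03, R0=10.0, c=5.0, theta_R=0.3,
                lam=2.0, mean_claim=1.0, u=0.7, seed=3):
    """Sketch of the insurer's terminal wealth under proportional reinsurance.

    The insurer retains the fraction `u` of each claim; the reinsurance
    premium rate follows the expected value principle. Claims arrive as a
    Poisson process (a simplification of the contagion intensity); wealth
    earns interest at rate r between claims.
    """
    random.seed(seed)
    q = (1.0 + theta_R) * lam * (1.0 - u) * mean_claim  # reinsurance premium rate
    X, t = R0, 0.0
    while True:
        w = random.expovariate(lam)
        dt = min(w, T - t)
        # exact solution of dX = (c - q) dt + r X dt over the claim-free interval
        X = X * math.exp(r * dt) + (c - q) * (math.exp(r * dt) - 1.0) / r
        t += dt
        if t >= T:
            return X
        X -= u * random.expovariate(1.0 / mean_claim)   # retained part of the claim

print(wealth_path())
```

With `u = 0` (full reinsurance) no claim cost is retained and the path becomes deterministic, which makes the accrual formula easy to check against the closed-form solution of the linear ODE.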
Remark 2.3. Notice that the stochastic wealth X^u can possibly take negative values, due to the possibility of borrowing money from the bank account.
This setting leads us to investigate a stochastic control problem under partial information. Due to the presence of the externally-excited component, the claim arrival intensity in Equation (2.1) is F-adapted rather than H-adapted, hence it is not observable by the insurance and reinsurance companies. We will reduce the original problem to a stochastic control problem under complete information by solving a filtering problem in Section 3. The knowledge of the filter process allows us to compute the H-adapted (predictable) intensity of the claim arrival process N^(1), which represents the best estimate of the stochastic intensity λ based on the available information.
The next subsection provides a formal and rigorous construction of our model.
2.1. Model construction. We are going to introduce the dynamic contagion model by a suitable measure change, starting from two Poisson processes with constant intensity on a given probability space (Ω, F, Q; F): N^(1) is standard and N^(2) has constant intensity ρ > 0. Moreover, we take two sequences {Z_n^(1)}_{n≥1} and {Z_n^(2)}_{n≥1} of i.i.d. positive random variables with distribution functions F^(1) and F^(2), respectively, and such that E_Q[ℓ(Z^(1))] < +∞ and E_Q[Z^(2)] < +∞. We assume N^(1), N^(2), {Z_n^(1)}_{n≥1} and {Z_n^(2)}_{n≥1} to be independent of each other under Q.
The key idea behind our construction is to introduce a new measure P, equivalent to Q on (Ω, F; F), such that, under P, the intensity of N^(2) and the distributions of {Z_n^(1)}_{n≥1} and {Z_n^(2)}_{n≥1} do not change, while N^(1) is a counting process with stochastic intensity λ given by Equation (2.1). Notice that, under P, N^(1), N^(2), {Z_n^(1)}_{n≥1} and {Z_n^(2)}_{n≥1} are no longer independent. Let us introduce the integer-valued random measures

m^(i)(dt, dz) = Σ_{n ≥ 1} δ_{(T_n^(i), Z_n^(i))}(dt, dz), i = 1, 2,

where δ_{(t,z)} denotes the Dirac measure at (t, z). Under Q, m^(i)(dt, dz), i = 1, 2, are independent Poisson measures with compensator measures given respectively by

ν^(1),Q(dt, dz) = F^(1)(dz) dt, ν^(2),Q(dt, dz) = ρ F^(2)(dz) dt.
The measure change from (Q, F) to (P, F) will be performed via the stochastic process L defined as follows, for t ∈ [0, T], where E(M_t) denotes the Doléans-Dade exponential of a martingale M and where λ under Q is defined by Equation (2.1). This process will be proved to be a (Q, F)-martingale under the following assumption.

Assumption 2.4. We assume that there exists ε > 0 such that the stated exponential integrability condition holds.

Before proving the martingale property, we notice the following fact, which holds by Equation (2.1). We then have an explicit expression for L_t, where we used the mutual independence of N^(1), N^(2), {Z_n^(1)}_{n≥1} and {Z_n^(2)}_{n≥1}, which holds by construction under Q. By exploiting Lemma A.2 we immediately find the first bound; using similar arguments, one shows the second. Now that the change of measure has been rigorously introduced, we can safely introduce the (P, F)-compensator measures of m^(i)(dt, dz), i = 1, 2.
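For counting processes, the Doléans-Dade exponential invoked above admits a well-known closed form. As a sketch (the standard point-process Girsanov density for moving the compensator of N^(1) from dt to λ_t dt; this is not a verbatim reproduction of the paper's Equation (2.7)):

```latex
L_t \;=\; \mathcal{E}\!\left(\int_0^t (\lambda_{s^-}-1)\,\big(\mathrm{d}N^{(1)}_s-\mathrm{d}s\big)\right)
    \;=\; \exp\!\Big(\int_0^t \big(1-\lambda_s\big)\,\mathrm{d}s\Big)
          \prod_{n\,:\,T^{(1)}_n\le t}\lambda_{T^{(1)-}_n}.
```

The exponential factor adjusts for the change of compensator between jumps, while the product reweights each observed jump by the new intensity evaluated just before the jump time.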
It turns out that, for any F-predictable random field, the stated identity holds, where ν^(i)(ds, dz), i = 1, 2, are defined in Equation (2.9). Moreover, under the stated condition, the corresponding stochastic integral is a (P, F)-martingale.

Markov property.
In this subsection we discuss and characterize the Markov structure of the intensity, working on (Ω, F, P; F). Equation (2.1) reads as (2.10).

Proposition 2.8. The process λ is a (P, F)-Markov process with generator L. The domain of the generator L, denoted by D(L), is given by the class of functions f ∈ C^1((0, +∞)) satisfying condition (2.12).

In what follows we will need the following assumption, which is crucial to prove Proposition 2.10.

Assumption 2.9. Condition (2.13) holds.

Hence, by Remark 2.7 and an application of Gronwall's Lemma, we obtain a bound h_1(t); it is immediate to verify that h_1(t) ≥ 0. By Itô's formula, and again by Gronwall's Lemma, it follows that E[λ_t^k] ≤ h_k(t), with h_k a measurable, integrable and nonnegative function on [0, T], and this concludes the proof.

Proposition 2.11. Under Assumption 2.9, analogous bounds hold for the corresponding moment functions. Proof. Under Assumption 2.9, by computations similar to those performed in the proof of Proposition 2.10, we get the claim.

The filtering problem
We assume that the insurance company has partial information because the externally-exciting component in the intensity process λ introduced in Equation (2.1) is not observable. For the filtering of Cox processes with shot noise intensity, that is, without the self-exciting component in Equation (2.1), we refer to Dassios and Jang [13], where the estimation of the intensity λ given the observations of the claim arrival process N^(1) reduces to the use of the classical Kalman-Bucy filter after a Gaussian approximation of the intensity is performed. This result applies in the case where the intensity ρ of the externally-exciting component is sufficiently large. Their working setting can be seen as a particular case of our contagion model, and their results can then be obtained as special cases, with no assumption on ρ needed (see also Remark 3.7).
The insurance company aims at estimating the intensity λ by observing the cumulative claim process C defined in Equation (2.2), that is, by observing the double sequence {(T_n^(1), Z_n^(1))}_{n≥1} of arrival times and claim sizes. This leads to a filtering problem with marked point process observations.
Let us recall that H = F^C, defined in Equation (2.3), is the observation flow, representing the information at the disposal of the insurance company. So, the estimate of the intensity λ can be described through the filter process π = {π_t, t ∈ [0, T]}, which provides the conditional distribution of λ_t given H_t, for any time t ∈ [0, T]. More precisely, the filter is the H-càdlàg (right-continuous with left limits) process taking values in the space of probability measures on [0, +∞) such that, for any bounded measurable function f,

π_t(f) = E[f(λ_t) | H_t].

By applying the innovation method (see for instance Brémaud [5, Chapter IV]) we will characterize the filter in terms of the so-called Kushner-Stratonovich (KS henceforth) equation.
where L and D(L) are given in Proposition 2.8.
Proof. We denote by R̄ the (P, H)-optional projection of an F-progressively measurable process R. We will use two well-known facts:
• for every (P, F)-martingale m, the (P, H)-optional projection m̄ is a (P, H)-martingale;
• for any F-progressively measurable process Ψ, the stated projection identity holds.
By Itô's formula, for any f ∈ D(L), we obtain a decomposition where m^f is a (P, F)-martingale, and taking the (P, H)-optional projection we get an analogous decomposition where M^f is a (P, H)-martingale. By the martingale representation theorem there exists an H-predictable random field h^f. To derive the expression of h^f, we consider an H-adapted and bounded process Γ, with U an H-predictable bounded random field. Since Γ is H-adapted, the following equality holds. By applying the product rule we get a decomposition where m^f is a (P, F)-martingale. Taking the (P, H)-optional projection we obtain an expression where M^f is a (P, H)-martingale. On the other hand, we have another expression where M^f is a (P, H)-martingale. By (3.4), the finite variation parts in Equations (3.5) and (3.6) have to coincide for any t ∈ [0, T]. We then choose U = {U_t, t ∈ [0, T]} any bounded, positive, H-predictable process and A ∈ B([0, +∞)). With this choice we get that Γ is bounded and, recalling that λ_t > 0 for all t ∈ [0, T] (which implies π_t(λ) > 0 for all t ∈ [0, T]), we obtain the expression of h^f. Finally, since the counting process N^(1) is non-explosive, we have that T_n^(1) → +∞ as n → +∞, and by Equations (3.2) and (3.3) we obtain that the filter solves the KS Equation (3.1).
It remains to prove uniqueness for this equation. As in Theorem 3.3 in Ceci and Colaneri [8], strong uniqueness of the solution to the KS equation follows from uniqueness of the Filtered Martingale Problem (FMP(L, λ_0, C_0)) associated with the generator L of the pair {(λ_t, C_t), t ∈ [0, T]}, for any initial condition (λ_0, C_0) ∈ (0, +∞) × [0, +∞). For details on the FMP we refer to Kurtz and Ocone [22]. The operator L is given, for a suitable class of functions f(λ, C), by the corresponding expression involving the kernel F^(1)(dz).
Next, to prove that the FMP(L, λ_0, C_0) has a unique solution we apply Theorem 3.3 in Kurtz and Ocone [22], after checking that the required hypotheses are fulfilled. First, let us observe that the martingale problem for the operator L is well posed on the space of càdlàg (0, +∞) × [0, +∞)-valued paths. Furthermore, we can choose a domain D(L) such that, for any function in it, the stated bound holds with K a positive constant. Moreover, it is easy to verify that Lf(λ, C) is a continuous function of its arguments. Finally, D(L) is dense in the space of continuous functions which vanish at infinity, so all the hypotheses of Theorem 3.3 in Kurtz and Ocone [22] are satisfied; this concludes the proof.
The filtering Equation (3.1) has a natural recursive structure in terms of the sequence {T_n^(1)}_{n≥1}. Indeed, between two consecutive jump times, for t ∈ [T_{n−1}^(1) ∧ T, T_n^(1) ∧ T), the filter evolves deterministically. At a jump time T_n^(1) ≤ T, the value of the filter is completely determined by the knowledge of the filter π_t, with t ∈ [T_{n−1}^(1) ∧ T, T_n^(1) ∧ T), and the observed data (T_n^(1), Z_n^(1)). Notice that L is the Markov generator of a shot noise Cox process, obtained by taking ℓ(z) = 0 in Equation (2.1).
By Equations (3.10) and (3.12) we get, for any k = 1, 2, ..., the equations satisfied by π_t(f_k) between two consecutive jump times and at a jump time T_n^(1). In particular, for k = 1 we have that π_t(f_1) = π_t(λ) provides the (P, H)-intensity of N^(1), and the KS equation reads as (3.15). Notice that the equations for π_t(f_k) depend on π_t(f_1), ..., π_t(f_{k+1}), for any k = 1, 2, .... The (P, H)-predictable intensity of N^(1) can be compared with a process Y which has the same jumps as π(λ) and, between two consecutive jumps, solves an SDE on (T_n^(1) ∧ T, T_{n+1}^(1) ∧ T). Hence the filter is dominated by a process with exponential decay behaviour between consecutive jump times.
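The decay-then-update structure of the filter can be mimicked by a schematic particle approximation: between claim arrivals the particles decay deterministically toward the baseline (for brevity we ignore the unobserved exogenous shocks, which in the paper enter through the prediction operator), and at each claim time the weights are reweighted by the current intensity value, in the spirit of a Bayes-type update π(f) ∝ π^-(λ f), followed by the observable self-exciting shift. This is our simplified sketch, not the paper's filter; all names are illustrative.

```python
import math
import random

def particle_filter(claim_times, beta=1.0, alpha=1.5, self_jump=0.8,
                    n_particles=500, seed=11):
    """Schematic particle filter for an unobserved claim-arrival intensity.

    Prediction: exponential decay toward the baseline `beta` (unobserved
    exogenous shocks are ignored in this sketch).  Update at each claim:
    Bayes reweighting by the intensity value, then the observable
    self-exciting shift.  Returns the posterior mean after each claim.
    """
    random.seed(seed)
    lam = [beta + random.expovariate(1.0) for _ in range(n_particles)]  # prior draws
    w = [1.0 / n_particles] * n_particles
    t, means = 0.0, []
    for tn in claim_times:
        dt = tn - t
        lam = [beta + (x - beta) * math.exp(-alpha * dt) for x in lam]  # decay step
        w = [wi * xi for wi, xi in zip(w, lam)]   # Bayes: likely particles had high intensity
        s = sum(w)
        w = [wi / s for wi in w]
        lam = [x + self_jump for x in lam]        # observable self-excitation
        means.append(sum(wi * xi for wi, xi in zip(w, lam)))
        t = tn
    return means

print(particle_filter([0.3, 0.5, 2.0]))
```

Each posterior mean stays above the baseline plus the self-exciting jump, reflecting the exponential-decay domination noted above.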
Thanks to Theorem 3.2 we have characterized the filter in terms of a nonlinear stochastic equation. In our framework it is also possible to describe the filter in terms of the unnormalized filter, as the solution of the so-called Zakai equation, which has the advantage of being linear.
By the Kallianpur-Striebel formula we get that, for any t ∈ [0, T],

π_t(f) = σ_t(f) / σ_t(1),

where Q is the equivalent probability measure introduced in Section 2.1 and L is given in Equation (2.7). The process σ = {σ_t, t ∈ [0, T]} denotes the unnormalized filter and is a finite measure-valued H-càdlàg process.
Proposition 3.5 (Zakai equation). For any f ∈ D(L), the unnormalized filter is the unique strong solution to the Zakai equation (3.17), for any t ∈ [0, T], driven by the compensated measure m^(1)(ds, dz) − F^(1)(dz) ds. Equation (3.17) can be easily obtained by considering the effect of the Girsanov change of measure: σ(1) is the Doléans-Dade exponential of the corresponding compensated process, hence it solves the associated linear SDE, and by Itô's formula we get the claimed dynamics.
Taking into account Equations (3.1) and (3.20), we get Equation (3.17). Finally, as in Theorem 4.7 in Ceci and Colaneri [9], we can prove strong uniqueness for the Zakai equation via the strong uniqueness of the KS equation.
The Zakai equation can also be written in terms of the operator L defined in Equation (3.11) and, like the KS equation, it has a natural recursive structure in terms of the sequence {T_n^(1)}_{n≥1}: one equation holds between two consecutive jump times, for t ∈ [T_{n−1}^(1) ∧ T, T_n^(1) ∧ T), and an update occurs at each jump time T_n^(1). By the linear structure of the Zakai equation between consecutive jumps we get a convenient expression for the filter.
Here λ^n is the shot noise Cox process which solves, for t ∈ (T_{n−1}^(1) ∧ T, T_n^(1) ∧ T), the SDE (3.23) with initial law π_{T_{n−1}^(1)}.

Proof. Let λ^{s,x} denote the solution to Equation (3.23) with initial condition (s, x) ∈ [0, T) × (0, +∞). By Itô's formula we obtain, for all s, a decomposition with M a (P, F)-martingale. Setting γ_t = e^{−∫_s^t (λ_u^{s,x} − 1) du}, by the product rule and taking the expectation we obtain the claimed representation. Thus, for any f ∈ D(L), Ψ_t(s, x)(f) coincides with the filter at the jump time T_{n−1}^(1).

Remark 3.7 (Filtering of a shot noise Cox process). Taking β = 0 and ℓ(z) = 0 in Equation (2.1), the claim arrival process N^(1) reduces to the Cox process with shot noise intensity considered in Dassios and Jang [14]. Denoting by L^{SN} the corresponding Markov generator, in this special case the KS and the Zakai equations are driven by N^(1). In particular, the KS equation between two consecutive jump times coincides with that of the general case in Equation (3.10) (with L replaced by L^{SN}), while the update at a jump time T_n^(1) (see Equation (3.12)) simplifies accordingly. Analogously, the Zakai equation between two consecutive jump times coincides with that of the general case in Equation (3.20) (with L replaced by L^{SN}), while the update at a jump time T_n^(1) is given by σ_{T_n^(1)}(f) = σ_{T_n^(1)−}(λf).

The reduced optimal control problem under complete information
By the filtering techniques developed in Section 3, the original problem under partial information is now reduced to a complete-observation stochastic control problem, which involves only processes adapted or predictable with respect to the filtration H, under P. The (P, H)-predictable projection of the measure m^(1)(dt, dz) (see Equation (2.4)) associated with the loss process C can be written in terms of the filter π as π_{t−}(λ) F^(1)(dz) dt. In the sequel we shall denote by m̃^(1)(dt, dz) the (P, H)-compensated jump measure

m̃^(1)(dt, dz) = m^(1)(dt, dz) − π_{t−}(λ) F^(1)(dz) dt.
The primary insurer wishes to subscribe a reinsurance contract to optimally control her wealth. The surplus process without reinsurance evolves according to the corresponding equation, where {c_t, t ∈ [0, T]} denotes the insurance premium, which is assumed to be H-predictable and such that E[∫_0^T c_t dt] < +∞, and R_0 is the initial capital. The primary insurer subscribes a generic reinsurance contract, characterized by the retention function Φ, which is, in general, an H-predictable random field. We assume that the insurer can choose any reinsurance arrangement in a given class of admissible contracts, which is a family of functions of z ∈ [0, +∞) representing the retained loss. For practical applications, we suppose that the contracts are parametrized by an n-tuple u (the control) taking values in U ⊆ R̄^n, with n ∈ N and R̄ denoting the compactification of R. Under an admissible strategy u ∈ U (the definition of the admissible set U will be given in Definition 4.4), she retains the amount Φ(Z_j^(1), u_{T_j^(1)}) of the j-th claim, while the remaining Z_j^(1) − Φ(Z_j^(1), u_{T_j^(1)}) is paid by the reinsurer.
We suppose that Φ(z, u) is continuous in u and that there exist at least two points u_N, u_M ∈ U such that Φ(z, u_N) = z and Φ(z, u_M) ≤ Φ(z, u) for every u ∈ U, so that u = u_N corresponds to null reinsurance, while u = u_M represents the maximum reinsurance protection. Notice that u_M corresponds to full reinsurance when applicable.
Example 4.2. We can show how standard reinsurance contracts fit our model formulation.
(1) Under proportional reinsurance, the insurer transfers a percentage (1 − u) of any future loss to the reinsurer, so we set Φ(z, u) = u z. Selecting the scalar u ∈ [0, 1] =: U is equivalent to choosing the retention level of the contract. Notice that here u_N = 1 means no reinsurance and u_M = 0 is full reinsurance.
(2) Under an excess-of-loss reinsurance policy, the reinsurer covers all the losses exceeding a retention level u, hence we fix the class of all functions of the form Φ(z, u) = z ∧ u. So, here U := [0, +∞], u_N = +∞ and u_M = 0 is full reinsurance.
(3) Under a limited stop-loss reinsurance, for any claim the reinsurer covers the losses exceeding a threshold u_1, up to a maximum level u_2 > u_1, so that the maximum loss on the reinsurer's side is limited to (u_2 − u_1). In this case Φ(z, (u_1, u_2)) = z − min{(z − u_1)^+, u_2 − u_1}. Clearly, we have that u_M = (u_{M,1}, u_{M,2}) = (0, +∞), and u_N can be any point on the line u_1 = u_2. A particular case is the so-called limited stop-loss with fixed reinsurance coverage, in which u_2 = u_1 + β, β > 0. Here U = [0, +∞], u_N = +∞ and u_M = 0 corresponds to the maximum reinsurance coverage β.
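The three contract families above can be written as plain retention functions. A minimal sketch, following the verbal descriptions in the example (function names are ours):

```python
def prop_retention(z, u):
    """Proportional reinsurance: the insurer keeps the fraction u of each loss."""
    return u * z

def xl_retention(z, u):
    """Excess-of-loss: the insurer keeps losses only up to the retention level u."""
    return min(z, u)

def lsl_retention(z, u1, u2):
    """Limited stop-loss: the reinsurer pays the part of the loss above u1,
    capped at u2 - u1; the insurer keeps the rest."""
    return z - min(max(z - u1, 0.0), u2 - u1)
```

The boundary cases match the text: `u = 1` (proportional) and `u = +inf` (excess-of-loss) give null reinsurance, while `u = 0` gives full reinsurance, and on the line `u1 = u2` the limited stop-loss cedes nothing.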
Clearly the insurer will have to pay a reinsurance premium q^u = {q_t^u, t ∈ [0, T]}, which depends on the strategy u. We assume that the reinsurance premium admits the representation q_t^u = q(t, ω, u_t) for a given function q(t, ω, u): [0, T] × Ω × U → [0, +∞), continuous in u, H-predictable and with continuous partial derivatives ∂q(t, ω, u)/∂u_i, i = 1, ..., n. We assume that, for any (t, ω) ∈ [0, T] × Ω,

q(t, ω, u_N) = 0, q(t, ω, u) ≤ q(t, ω, u_M), for all u ∈ U,

since a null protection is not expensive and the maximum reinsurance is the most expensive. In the following, q^u will denote the reinsurance premium associated with the dynamic reinsurance strategy {u_t, t ∈ [0, T]}. Notice that both the insurance and the reinsurance premia are assumed to be H-predictable, since insurer and reinsurer share the same information. Finally, we require an integrability condition which ensures that, for any u ∈ U, E[∫_0^T q_s^u ds] < +∞.
Example 4.3 (Expected value principle). Under any admissible reinsurance strategy u ∈ U, the expected cumulative losses covered by the reinsurer in the interval [0, t] are given by E[C_t − C_t^u]. According to the expected value principle, the premium q^u applied by the reinsurer has to satisfy E[∫_0^t q_s^u ds] = (1 + θ_R) E[C_t − C_t^u], where θ_R > 0 denotes the safety loading applied by the reinsurer.

Summarizing, the surplus process with reinsurance evolves according to the corresponding dynamics. Let us observe that the compensated claims process turns out to be a (P, H)-martingale, because the relevant expectation is finite, since Proposition 2.10 holds and Remarks 3.1 and 4.1 apply.
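The premium rate implied by the expected value principle can be sketched numerically. In the fragment below a Monte Carlo average stands in for the integral of the ceded loss against F^(1), and a plain number `lam_hat` replaces the filter estimate π_{t−}(λ); both simplifications, and all names, are our illustrative assumptions.

```python
import random

def ev_premium_rate(lam_hat, theta_R, retention, claim_sampler, n=100_000, seed=5):
    """Expected value principle sketch: the premium rate is (1 + theta_R)
    times the expected instantaneous ceded loss, lam_hat * E[Z - Phi(Z, u)],
    with the expectation approximated by Monte Carlo."""
    random.seed(seed)
    ceded = sum(z - retention(z) for z in (claim_sampler() for _ in range(n))) / n
    return (1.0 + theta_R) * lam_hat * ceded

# proportional retention u = 0.6: the ceded fraction of each claim is 0.4
rate = ev_premium_rate(lam_hat=2.0, theta_R=0.3,
                       retention=lambda z: 0.6 * z,
                       claim_sampler=lambda: random.expovariate(1.0))
print(rate)  # close to (1 + 0.3) * 2.0 * 0.4 * E[Z] = 1.04 for E[Z] = 1
```

Null reinsurance (`retention` equal to the identity) cedes nothing and yields a zero premium, consistent with q(t, ω, u_N) = 0 above.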
The insurance company invests its surplus in a risk-free asset with constant interest rate r > 0, so that for any reinsurance strategy u ∈ U the wealth dynamics follows the corresponding SDE, whose solution is given explicitly. As announced before, the insurer aims at optimally controlling her wealth using reinsurance. More formally, she aims at maximizing the expected exponential utility of terminal wealth, that is,

sup_{u ∈ U} E[−e^{−η X_T^u}],

which turns out trivially to be equivalent to the minimization problem

inf_{u ∈ U} E[e^{−η X_T^u}],

where η > 0 denotes the insurer's risk aversion. Clearly, the admissible strategies must be H-predictable, since they are based on the information at disposal. The next assumptions are required in the sequel.

Assumption 4.5. We assume that for every a > 0:
i) E[e^{a ℓ(Z^(1))}] < +∞, E[e^{a Z^(1)}] < +∞, E[e^{a Z^(2)}] < +∞.
ii) E[e^{a ∫_0^T q_t^{u_M} dt}] < +∞.

Lemma 4.6. Under Assumption 4.5 i), for every a > 0 we have that E[e^{a C_T}] < +∞.

Proof. See Appendix B.
Remark 4.7. Usually insurance companies apply a maximum policy D > 0, i.e., they only repay claims up to the amount D to the policyholders. In this setting, claims' sizes are of the form min{Z_n^{(1)}, D}.

The value process and its BSDE characterization
In this section we study the value process associated to the problem in Equation (4.8). Let us introduce the Snell envelope for any u ∈ U, with U(t, u) defined, for an arbitrary control u ∈ U, as the restricted class of controls almost surely equal to u over [0, t]: U(t, u) := {ū ∈ U : ū_s = u_s a.s. for all s ≤ t}.
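One natural reading of the elided display, consistent with the minimization form of the problem (a sketch, since the precise discounting convention is not shown at this point):

```latex
W_t(u) \;=\; \operatorname*{ess\,inf}_{\bar u \,\in\, \mathcal U(t,u)}
  \mathbb{E}\big[\,e^{-\eta X^{\bar u}_T}\;\big|\;\mathcal H_t\,\big],
\qquad t\in[0,T],
```

so that W(u) is the Snell envelope of the terminal cost along strategies that agree with u up to time t.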
Denoting by X̄_t^u = e^{−rt} X_t^u the discounted wealth and introducing the value process V (where U_t is introduced in Definition 4.4), we can show the corresponding inequality for all u ∈ U and, in turn, choosing null reinsurance, i.e. u_t = u^N for any t ∈ [0, T], we get the identity in which X̄^N and W^N denote the discounted wealth and the Snell envelope in Equations (5.2) and (5.1), respectively, associated to null reinsurance. Our aim is to develop a BSDE characterization for the process {W_t^N, t ∈ [0, T]}, which also provides a complete description of the value process {V_t, t ∈ [0, T]} in Equation (5.3).
The following definitions will play a key role for our BSDE characterization and its solution.
Definition 5.1. We define three classes of stochastic processes:
• S² denotes the space of càdlàg H-adapted processes Y such that the corresponding norm is finite;
• L² denotes the space of càdlàg H-adapted processes Y such that the corresponding norm is finite.
Definition 5.2. We define the set M and, similarly, we denote by M_u the same set augmented with the variable u ∈ U. Definition 5.3. Let ξ be an H_T-measurable random variable. A solution to a BSDE driven by the compensated random measure m^{(1)}(dt, dz) given in Equation (4.1) and generator g is a pair (Y, Θ^Y). We first give some preliminary results.
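The elided norms are the standard choices for these spaces (filling the displays under the usual conventions):

```latex
\|Y\|_{\mathcal S^2}^2 \;=\; \mathbb{E}\Big[\sup_{t\in[0,T]}|Y_t|^2\Big] < +\infty,
\qquad
\|Y\|_{\mathcal L^2}^2 \;=\; \mathbb{E}\Big[\int_0^T |Y_t|^2\,dt\Big] < +\infty,
```

and the inclusion S² ⊆ L² used below follows since ∫_0^T |Y_t|² dt ≤ T sup_{t∈[0,T]} |Y_t|².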
Proposition 5.4. Under Assumption 4.5 i), we have the stated bounds, where M^{(i)}, i = 1, 2, are (P, H)-martingales (5.7). By Equation (5.5), we get the corresponding identity for any t ∈ [0, T], where M^{(2)} is a (P, H)-martingale. To complete the proof, we observe that Doob's martingale inequality implies a bound which is finite according to Lemma 4.6. Remark 5.6. By Proposition 5.5, since u = u^N ∈ U, {W_t^N, t ∈ [0, T]} is a (P, H)-submartingale and W^N ∈ S² ⊆ L² (this follows from Proposition 5.4). As a consequence, by the Doob-Meyer decomposition and (P, H)-martingale representation theorems, it admits an expression in which Θ^{W^N} ∈ L² by (5.8) and {A_t, t ∈ [0, T]} is an increasing (P, H)-predictable process such that E[∫_0^T A_t² dt] < +∞. Moreover, W_T^N = e^{−ηX_T^N} =: ξ and, since the wealth associated to null reinsurance u = u^N is explicit, we get the inequality ξ ≤ e^{ηe^{rT} C_T}. Thus Lemma 4.6 guarantees that ξ is a random variable with finite moments of any order. The next step provides an explicit expression for the process A and characterizes W^N and the optimal control via a BSDE approach.
We now give the main result of this section.
Theorem 5.7. Under Assumption 4.5, (W^N, Θ^{W^N}) ∈ L² × L² is the unique solution of the following BSDE with terminal condition ξ = e^{−ηX_T^N}. Moreover, any process u* ∈ U which satisfies Equation (5.11) is an optimal control.
Proof. Theorem 5.7 follows directly from an existence result for solutions to the BSDE (5.9) (see Theorem 5.9 below) and a verification result (see Theorem 5.11 below), which imply that any solution to the BSDE (5.9) coincides with the process (W^N, Θ^{W^N}).
Remark 5.8. Let us notice that: i) the driver of the BSDE (5.9) is always nonnegative, as seen via Equation (5.10); ii) there exists u* ∈ U which satisfies Equation (5.11): by hypothesis q_t^u and Φ(z, u) are continuous in u ∈ U and U is compact, hence measurable selection results ensure that the maximizer is a (P, H)-predictable process, and Proposition 4.8 holds.
Theorem 5.9 (Existence and uniqueness). Under Assumption 4.5, the BSDE (5.9) with terminal condition ξ = e^{−ηX_T^N} admits a unique solution (Y, Θ^Y) ∈ L² × L². Proof. The proof is postponed to Appendix C.
We now wish to provide a verification result. To this end we recall the following result from Brachetta and Ceci [4, Proposition 3.4].
Proposition 5.10. Suppose there exists an H-adapted process D such that: i) D_T = 1; ii) the associated process is a (P, H)-sub-martingale for any u ∈ U and a (P, H)-martingale for some u* ∈ U. Then D_t = V_t and u* is an optimal control. Theorem 5.11 (Verification Theorem). Under Assumption 4.5, let (Y, Θ^Y) ∈ L² × L² be a solution to the BSDE (5.9) and let u* ∈ U be a process satisfying Equation (5.11). Then Y coincides with W^N and u* is an optimal control.
Proof. Let (Y, Θ^Y) ∈ L² × L² be a solution to the BSDE (5.9) and u* ∈ U be the process satisfying Equation (5.11) (see ii) in Remark 5.8). Define D_t := e^{η X̄_t^N e^{rT}} Y_t, t ∈ [0, T], and observe that D_T = e^{ηX_T^N} ξ = 1. We now prove that {D_t e^{−η X̄_t^u e^{rT}}, t ∈ [0, T]} is a (P, H)-sub-martingale for any u ∈ U and a (P, H)-martingale for u*. Then the statement will follow by Proposition 5.10.

By the product rule, recalling Equation (5.2) and applying Itô's formula, after some calculations we obtain, for any u ∈ U, a decomposition in which it remains to verify that the process {M_t^u, t ∈ [0, T]} is a (P, H)-martingale. To this end, it is sufficient to prove that two integrability conditions hold. Using Equation (5.13), Φ(z, u_t) ≤ z, the well-known inequality 2ab ≤ a² + b² for all a, b ∈ R, and Jensen's inequality, the first expectation is dominated by a quantity which is finite because of Assumption 4.5 ii), Remark 3.1, Proposition 2.10 and the fact that Θ^Y ∈ L². The second expectation is bounded by three terms: the first is finite because Y ∈ L², the second is finite by Assumption 4.5 ii), and the third by Remark 3.1 and Proposition 2.10.

The optimal reinsurance strategy
The aim of this section is to provide more insight into the structure of the optimal reinsurance strategy and to investigate some special cases. By Theorem 5.7, (W^N, Θ^{W^N}) ∈ L² × L² is the unique solution to the BSDE (5.9) and any maximizer in Equation (5.11) provides an optimal control. Hence, exploiting the expression in Equation (4.3), we look for the maximizer over u ∈ U of the function f : M_u → R given by f(t, ω, w, θ(•), u) = −wηe^{r(T−t)} q(t, ω, u) plus the jump term. The following general result provides a characterization of the optimal reinsurance strategy in the one-dimensional case. In order to obtain some definite results we need to introduce a concavity hypothesis for the function f w.r.t. u. Then the optimal reinsurance strategy is u*_t = û(t, ω, W^N_{t^−}, Θ^{W^N}_t(•)), where û is given below; we define the two regions accordingly, and ū(t, ω, w, θ(•)) ∈ (u^M, u^N) solves the first-order condition. Proof. We observe that f given in Equation (6.1) is continuous and strictly concave in u ∈ [u^M, u^N] by hypothesis. Hence the first-order condition, which reads as Equation (6.3), admits a unique solution ū(t, ω, w, θ(•)), a measurable function on M. If we extend the function f to the whole real line, i.e. u ∈ R, it is increasing for u < ū and decreasing for u > ū, hence the maximizer on [u^M, u^N] must be given by û(t, ω, w, θ(•)) = max{u^M, min{ū(t, ω, w, θ(•)), u^N}}, which is equivalent to Equation (6.2).
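The clipping rule û = max{u_M, min{ū, u_N}} in Equation (6.2) is simply the projection of the unconstrained first-order-condition root onto [u_M, u_N]; a minimal numerical sketch (the bounds below are illustrative placeholders, not the paper's model data):

```python
def clip_maximizer(u_bar, u_M, u_N):
    # For a strictly concave objective with unconstrained maximizer u_bar,
    # the maximizer over [u_M, u_N] is the projection (Equation (6.2)):
    # u_hat = max{u_M, min{u_bar, u_N}}.
    return max(u_M, min(u_bar, u_N))

# Illustrative bounds: u_M = 0 (full reinsurance), u_N = 1 (null reinsurance).
u_M, u_N = 0.0, 1.0
assert clip_maximizer(0.4, u_M, u_N) == 0.4    # interior root: keep u_bar
assert clip_maximizer(-0.3, u_M, u_N) == u_M   # root below the band
assert clip_maximizer(1.7, u_M, u_N) == u_N    # root above the band
```

The projection is valid precisely because concavity makes f monotone on each side of ū, as in the proof above.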
Remark 6.2. If q(t, ω, u) and Φ(z, u) are linear or convex in u ∈ [u^M, u^N], then f is strictly concave in u ∈ [u^M, u^N] and Proposition 6.1 applies.
We now consider a few examples under the expected value principle for the reinsurance premium (see Example 4.3).
Let us briefly comment on the previous result. We can distinguish three cases, depending on the stochastic conditions (in particular, on the solution of the BSDE (5.9)): • if the reinsurer's safety loading θ_R is smaller than θ_t^F, then full reinsurance is optimal; • if θ_R is larger than θ_t^N, then null reinsurance is optimal and the contract is not subscribed; • lastly, if θ_t^F < θ_R < θ_t^N, then the optimal retention level takes values in (0, 1), that is, the ceding company transfers to the reinsurer a non-null percentage of the risk (but not the full risk).
In other words, if the reinsurance contract is inexpensive, full reinsurance is purchased. On the contrary, when the reinsurance cost is excessive, the primary insurer retains all the risk. In the intermediate case θ_t^F < θ_R < θ_t^N the retention level takes values in the interval (0, 1). In any case, the notions of inexpensive and expensive must be related to the underlying risk through the stochastic processes W^N and Θ^{W^N}; hence the thresholds are stochastic.
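The three regimes can be summarized in a small decision function; the thresholds theta_F < theta_N are the stochastic quantities computed from (W^N, Θ^{W^N}), so the numerical values below are purely illustrative:

```python
def reinsurance_regime(theta_R, theta_F, theta_N):
    # theta_F < theta_N are the (stochastic) thresholds from the BSDE solution;
    # the source treats the strict-inequality cases, so boundary conventions
    # here are an illustrative choice.
    if theta_R < theta_F:
        return "full reinsurance"
    if theta_R > theta_N:
        return "null reinsurance"
    return "intermediate retention in (0, 1)"

assert reinsurance_regime(0.05, 0.10, 0.40) == "full reinsurance"
assert reinsurance_regime(0.50, 0.10, 0.40) == "null reinsurance"
assert reinsurance_regime(0.25, 0.10, 0.40) == "intermediate retention in (0, 1)"
```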
To obtain explicit results we will reduce our analysis to the case where the control is u = u_1, while u_2 = u_1 + β is uniquely determined, β > 0 being the fixed maximum reinsurance coverage.
According to Equation (4.4), the expected value principle takes the corresponding form, where S_Z is the survival function S_Z(z) = 1 − F^{(1)}(z).
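For a limited stop-loss cover min{(Z − u_1)^+, β}, the standard survival-function identity E[min{(Z − u_1)^+, β}] = ∫_{u_1}^{u_1+β} S_Z(z) dz makes the expected value principle explicit; a numerical check under a hypothetical Exponential(1) claim size (a sketch, not the paper's premium formula):

```python
import math

def expected_cover(S, u1, beta, n=100_000):
    # E[min{(Z - u1)^+, beta}] = integral of the survival function S_Z
    # over [u1, u1 + beta]; computed here with the midpoint rule.
    h = beta / n
    return h * sum(S(u1 + (k + 0.5) * h) for k in range(n))

# Exponential(1): S_Z(z) = e^{-z}, so the integral is e^{-u1} - e^{-(u1+beta)}.
S = lambda z: math.exp(-z)
u1, beta = 0.5, 2.0
closed_form = math.exp(-u1) - math.exp(-(u1 + beta))
assert abs(expected_cover(S, u1, beta) - closed_form) < 1e-6
```

The expected value principle premium is then this quantity scaled by (1 + θ_R) and the claim intensity.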
Let us observe that Assumption 4.5 ii) is automatically satisfied, by virtue of Lemma B.1.
Let us briefly comment on the previous result. Differently from proportional reinsurance, null reinsurance is never optimal and we can distinguish two cases, depending on the maximum coverage β and the solution of the BSDE (5.9): • if the reinsurer's safety loading θ_R is smaller than θ^L (i.e. the contract is inexpensive), then the maximum reinsurance coverage β is optimal; • if θ_R is larger than θ^L (i.e. the contract is expensive), then it is optimal to purchase reinsurance, but not with maximum coverage. The (unlimited) stop-loss reinsurance case can be easily obtained from the previous one by letting β → ∞. The optimal reinsurance strategy, under Assumption 4.5 i), then takes the clipped form, where ū(t, w, θ(•)) ∈ (0, +∞) solves the corresponding first-order condition. As in the limited stop-loss case, null reinsurance is never optimal and two cases are possible, depending on the solution of the BSDE (5.9): • when the reinsurance contract is inexpensive (θ_R < θ^L), full reinsurance is optimal; • otherwise, it becomes optimal to purchase an intermediate protection level.
Let 0 ≤ t_1 < t_2 ≤ T, A ∈ F_{t_1}, and denote by A^C the complementary set of A. The inner expectation corresponds to the Laplace transform of a Poisson random variable, since E[e^{N_{t_2} − N_{t_1}}] = e^{(e−1)λ(t_2 − t_1)}. Substituting and rearranging the terms we then get that E[e^{∫_0^T b_t dN_t}] = E[e^{(e−1)λ(t_2−t_1)𝟙_A}]. On the other hand, a direct computation yields the same value E[e^{(e−1)λ(t_2−t_1)𝟙_A}], which proves the statement for any bounded F-predictable process. To complete the proof, we extend this result to unbounded processes. Assume that {b_t, t ≥ 0} is an arbitrary F-predictable process and define a sequence of F-stopping times τ_n = inf{t ≥ 0 : b_t > n}, n ≥ 1. Clearly, τ_n → +∞ as n → +∞. By the first part of the proof, we know that E[e^{∫_0^{T∧τ_n} b_t dN_t}] = E[e^{∫_0^{T∧τ_n} (e^{b_t} − 1)λ dt}], so that it remains to pass to the limit n → +∞ and to apply the monotone convergence theorem to the family of random variables X_n := e^{∫_0^{T∧τ_n} b_t dN_t}, n ≥ 1, in the case when b is positive, or to the analogous family in the general case. Here we have used that N((0, t] × A) is a Poisson process with intensity λ∫_A F(dz).
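The Laplace-transform step uses E[e^{aN}] = exp(λΔt(e^a − 1)) for a Poisson random variable N with mean λΔt (with a = 1 this gives the factor e^{(e−1)λ(t_2−t_1)}); a quick numerical confirmation via the truncated series, with illustrative parameter values:

```python
import math

def poisson_exp_moment(a, mean, kmax=120):
    # E[exp(a*N)] for N ~ Poisson(mean), via the truncated series
    # sum_k e^{a k} P(N = k); the closed form is exp(mean * (e^a - 1)).
    total, pk = 0.0, math.exp(-mean)     # pk = P(N = k), starting at k = 0
    for k in range(kmax):
        total += math.exp(a * k) * pk
        pk *= mean / (k + 1)             # Poisson recursion p_{k+1} = p_k * mean/(k+1)
    return total

mean = 1.5 * 0.7                          # lambda * (t2 - t1), illustrative
series = poisson_exp_moment(1.0, mean)
closed = math.exp((math.e - 1.0) * mean)
assert abs(series - closed) < 1e-9
```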

Appendix B. Proof of key Lemmas
We focus here on the finiteness of E[e^{aN_T^{(1)}}], E[e^{a∫_0^T λ_s ds}], E[e^{a∫_0^T π_s(λ) ds}] and E[e^{aC_T}], which are computed under P for an arbitrary real constant a > 0. Here N^{(1)} is a standard Poisson process under (Q, F) and a counting process with intensity λ (given in Equation (2.1)) under (P, F). We will exploit the measure change introduced in detail in Section 2 and we will work under Assumption 4.5 i). We prove the following:
Proof. First of all, we show that under Assumption 4.5 i) the stated bound holds. Recalling Equation (2.6), for a suitable c_1 > 0 and for c_2 = aT we find the required estimate, where we used the mutual independence of N^{(1)}, N^{(2)}, {Z_n^{(1)}}_{n≥1}, {Z_n^{(2)}}_{n≥1} under Q and, in the last equality, we followed the path traced in the proof of Proposition 2.6. Finally, Assumption 4.5 i) gives the finiteness of the expectation under Q.
To prove that E[e^{aN_T^{(1)}}] is finite we exploit the change of measure from P to Q. We show now that E[e^{a∫_0^T λ_s ds}] < +∞ for all a > 0. We proceed as above: passing under Q via L_T, recalling Equation (2.6) and introducing the integer-valued random measure m^{(1)}(dt, dz), we find the desired bound for a suitable constant C_1 > 0. We now apply Lemma A.2 under Q for H(t, z) = C_2 ℓ(z) + ln(λ_{t^−}) and with ν^{(1),Q}(dt, dz) = F^{(1)}(dz) dt, and we get a quantity which is finite under Assumption 4.5 i).
It remains to prove that E[e^{a∫_0^T π_s(λ) ds}] < +∞ for all a > 0. The structure of the filtering equation implies that over [0, T] the filter attains its maximum value at a jump time. More precisely, we showed in Remark 3.4 that the filter is dominated by a process with exponential decay between two consecutive jumps, hence the maximum over [0, T] is attained at a jump time τ ≤ T such that π_τ(λ) = max{π_{T_1^{(1)}}(λ), . . ., π_{T_{N_T^{(1)}}^{(1)}}(λ)}.
Notice that the maximum is taken over a finite number of elements, because the jump process N^{(1)} is non-explosive. Then, using Jensen's inequality, we have that E[e^{a∫_0^T π_t(λ) dt}] ≤ E[e^{aT π_τ(λ)}] ≤ E[π_τ(e^{aT λ})] = E[e^{aT λ_τ}] < +∞.
The last inequality is implied by the fact that τ ≤ T, so that E[e^{aT λ_τ}] = E^Q[L_T e^{aT λ_τ}] ≤ C_1 E^Q[e^{aT λ_τ} e^{∫_0^T ln(λ_{s^−}) dN_s^{(1)}}] for suitable constants C_i > 0, i = 1, 2, and we can prove the finiteness with the same computations used to prove that E[e^{a∫_0^T λ_s ds}] < +∞.
Based on the previous lemma, we conclude this section by proving the useful result given in Lemma 4.6, i.e., for every a > 0, E[e^{aC_T}] < +∞.
Proof of Lemma 4.6. For a suitable constant κ > 0, passing under Q via the Radon–Nikodym derivative L_T given in Equation (2.7) and using the lemmas of Appendix A, the claim follows.

Appendix C. Proof of Theorem 5.9

Proof. In order to apply Papapantoleon, Possamai and Saplaouras [27, Theorem 3.5], we start by verifying that the BSDE data are standard under β, i.e., that assumptions (F1)–(F5) therein are satisfied for some β > 0. We will show that in our setting any β > 0 works (see (F4) below). Here we have used the boundedness of |e^{−ηe^{r(T−t)}(z−Φ(z,u))} − 1| and the fact that q_t^u ≤ q_t^{u_M} for any u ∈ U. Now, since the inequality above does not depend on u, the ess sup over u ∈ U satisfies it as well and we can take its square (using the trivial relation (a + b + c)² ≤ 3(a² + b² + c²)). Recalling now that the transition kernel reads K_t^ω(dz) = π_{t^−}(λ) F^{(1)}(dz), we use the corresponding identity for an integrable function ϑ. The target, Equation (C.2), is thus reached and we obtain the following stochastic Lipschitz coefficients: γ_t = 3η² e^{2r(T−t)} (q_t^{u_M})² + 3π²_{t^−}(λ), γ̃_t = 3π_{t^−}(λ), which, as expected, are independent of the control u. Setting α_s² = max{3η² e^{2r(T−s)} (q_s^{u_M})² + 3π²_{s^−}(λ), 3π_{s^−}(λ)} and A_t = ∫_0^t α_s² ds, we can easily verify that the inequality ∆A_t ≤ Φ holds P-a.s. for any Φ > 0, since A has no jumps. Notice that (F2) requires that the terminal condition ξ = e^{−ηX_T^N} belong to the set of H_T-measurable random variables such that E[e^{βA_T} e^{−2ηX_T^N}] < ∞ for some β > 0. This is true for any β > 0.

Theorem 3.2 (Kushner-Stratonovich equation). For any f ∈ D(L), the filter is the unique strong solution to the filtering equation, for any t ∈ [0, T]. The process solves Equation (3.20) and, as a consequence, Ψ_t(s, x)(f)/Ψ_t(s, x)(1) solves the KS-equation between two consecutive jump times, given in Equation (3.10). Finally, the statement follows by uniqueness of the KS-equation.

Definition 4.4. We define by U the class of admissible strategies, which are all the U-valued and H-predictable processes {u_t, t ∈ [0, T]} such that E[e^{−ηX_T^u}] < +∞. Given t ∈ [0, T], we will denote by U_t the class U restricted to the time interval [t, T].

Since min{Z_n^{(1)}, D} ≤ D, the condition E[e^{aZ^{(1)}}] < +∞ in Assumption 4.5 is trivially satisfied. The class of admissible strategies is non-empty, as shown by the next result. Proposition 4.8. Under Assumption 4.5, every H-predictable process {u_t, t ∈ [0, T]} with values in U is admissible. Proof. Thanks to Lemma 4.6, the proof is basically the same as in Brachetta and Ceci [4, Prop. 2.2, p. 4].