The Markovian Shot-noise Risk Model: A Numerical Method for Gerber-Shiu Functions

In this paper, we consider discounted penalty functions, also called Gerber-Shiu functions, in a Markovian shot-noise environment. First, we exploit the underlying structure of piecewise-deterministic Markov processes (PDMPs) to show that these penalty functions solve certain partial integro-differential equations (PIDEs). Since these equations cannot be solved exactly, we develop a numerical scheme that allows us to determine an approximation of such functions. These numerical solutions can be identified with penalty functions of continuous-time Markov chains with finite state space. Finally, we show the convergence of the corresponding generators over suitable sets of functions to prove that these Markov chains converge weakly to the original PDMP. This implies that the numerical approximations converge to the discounted penalty functions of the original Markovian shot-noise model.


Introduction and Overview
The introduction of the family of penalty functions by Gerber and Shiu in Gerber and Shiu (1998) had, and still has, a huge impact on the field of ruin theory. This unifying approach generalizes previously considered risk measures and allows a comprehensive analysis of the ruin event of an insurance portfolio. Since then, Gerber-Shiu functions have been analysed in different types of risk models, for example in the renewal model in Gerber and Shiu (2005), Li and Garrido (2005) and Willmot and Dickson (2003), the Markov-modulated model in Zhang (2008), and the Björk-Grandell model in Schmidli (2010). The case of spectrally negative Lévy risk processes was already considered in Garrido and Morales (2006) and resolved in a very general form by the so-called quintuple law derived in Doney and Kyprianou (2006).
Initially, the main aim was to establish explicit formulas that allow for a direct calculation of discounted penalty functions. This was achieved in the classical and renewal models if the claim sizes are exponentially or phase-type distributed. Due to the increasing complexity of the underlying models and of the penalty functions considered, such explicit formulas are rarely attainable nowadays. Since simulation techniques like (quasi-)Monte Carlo methods are time-consuming and not always directly implementable, there is an increasing effort in finding efficient numerical procedures to determine suitable approximations of penalty functions for more complex models. Exemplary contributions are Chau et al. (2015), Diko and Usábel (2011), Lee et al. (2021), and Preischl et al. (2018). For the renewal risk model, Strini and Thonhauser (2020) introduced a numerical scheme based on a discretization of the corresponding generator to determine discounted penalty functions depending on a local cost functional and the deficit at ruin.
We consider Gerber-Shiu functions in the context of a Markovian shot-noise environment. The motivation for using the Markovian shot-noise model is the modelling of disasters, like earthquakes, as it was applied in Dassios and Jang (2003) in the context of pricing of reinsurance of catastrophic events. A generalized version of this model was considered by Albrecher and Asmussen (2006), who were interested in the asymptotic behaviour of the ruin probability in a general shot-noise model and derived exponentially decaying upper and lower bounds. Further extensions of this model are introduced by Stabile and Torrisi (2010), who considered heavy-tailed claim events, and Macci and Torrisi (2011), considering a non-constant premium rate. Recently, Pojer and Thonhauser (2022) were able to show the convergence behaviour of the ruin probability in the Markovian model.
In this contribution we deal with Gerber-Shiu functions in their full generality. The introduction of an additional process allows us to include functions depending on the surplus before ruin. Using the underlying structure of piecewise-deterministic Markov processes, we can represent these discounted penalty functions as solutions to Feynman-Kac-type partial integro-differential equations. Since the resulting equations admit no evident explicit solution, we develop a scheme to solve them numerically. First, we resolve the problem of the unboundedness of the involved intensity process. In a second step, we discretize the bounded version of the partial integro-differential equations and solve the corresponding system of linear equations. The obtained numerical solutions correspond to Gerber-Shiu functions of approximating Markov chains with finite state space. Eventually, we use weak convergence on the Skorokhod space of càdlàg functions to obtain a convergence result for the computed function values.
This paper is organized in the following way. In Section 2, we define the considered model, the concept of Gerber-Shiu functions, and their analytic properties. In Section 3, we introduce families of auxiliary processes used to approximate the original PDMPs of the Markovian shot-noise model and motivate the proposed numerical scheme. In Section 4, we show convergence of the numerical approximation by exploiting convergence in distribution of processes over the space of càdlàg functions. Finally, in Section 5, we give examples that show the performance of the proposed numerical scheme.

Risk Model and Gerber-Shiu Functions
At first, we briefly introduce the considered Markovian shot-noise model as it is also used in Pojer and Thonhauser (2022). For this, we consider a probability space (Ω, F, ℙ), which is assumed to be rich enough to carry all of the subsequently defined stochastic objects.
Definition 1 Let δ and ρ be positive constants and N a homogeneous Poisson process with intensity ρ and jump times (T_i)_{i ≥ 1}. Let further (Y_i)_{i ≥ 1} be i.i.d. copies of a positive random variable Y with distribution F_Y, independent of N. Then, the intensity process (λ_t)_{t ≥ 0} given by

λ_t = λ_0 e^{−δt} + ∑_{i=1}^{N_t} Y_i e^{−δ(t − T_i)}

is called a Markovian shot-noise process.
Using this, we can define the surplus process in the following way.
Definition 2 Let Ñ be a Cox process whose stochastic intensity is a Markovian shot-noise process (λ_t)_{t ≥ 0} and (U_i)_{i ≥ 1} an i.i.d. sequence of positive random variables, independent of Ñ and λ, with distribution function F_U. Let further c be a positive and x a non-negative constant. Then, the surplus process (X_t)_{t ≥ 0} is given by

X_t = x + ct − ∑_{i=1}^{Ñ_t} U_i.

The Markovian shot-noise model was used by Dassios and Jang (2003) to model catastrophic events like earthquakes. A single catastrophic event, e.g. an earthquake, increases the intensity by a random quantity Y, called a shock, and induces Poi(Y∕δ) many claims, called a cluster, which do not occur immediately, but are instead reported over a period of time. This allows the following interpretation of the involved parameters and random variables. The parameter ρ is the inverse of the expected time between two catastrophic events. Since the total number of claims due to a single catastrophe is Poi(Y∕δ), the random variable Y∕δ determines the distribution of the number of claims in a single cluster. The decay parameter δ determines how long it takes until all claims of a cluster are reported and paid. Despite this easy interpretation, it might be hard to estimate these quantities, especially the distribution of Y.
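The dynamics just defined can be simulated exactly at their jump times. The following sketch (Python; parameter values are those of the examples in Section 5, and all function and variable names are our own) draws one path of (X_t, λ_t) by simulating the shocks of the underlying Poisson process directly, and the claims of the Cox process by thinning, using that the intensity only decays between shocks, so its value at the left end of each inter-shock interval is a valid upper bound.

```python
import numpy as np

def simulate_path(x0, lam0, horizon, c=15/4, delta=1.0, rho=1.5, seed=0):
    """One path of (X_t, lambda_t) up to time `horizon`.

    Shocks arrive as a Poisson process with rate rho and lift the
    intensity by Exp(1) jumps; between shocks the intensity only decays,
    so its value at the left end of an inter-shock interval bounds it
    from above, which justifies thinning the claim candidates."""
    rng = np.random.default_rng(seed)
    t, x, lam, tau = 0.0, x0, lam0, np.inf
    next_shock = rng.exponential(1.0 / rho)
    while t < horizon:
        bound = lam                                  # valid until the next shock
        w = rng.exponential(1.0 / bound)             # candidate claim gap
        if t + w >= min(next_shock, horizon):
            dt = min(next_shock, horizon) - t        # no claim in between
            x += c * dt
            lam *= np.exp(-delta * dt)
            t += dt
            if t < horizon:                          # we hit a shock time
                lam += rng.exponential(1.0)          # shock Y ~ Exp(1)
                next_shock = t + rng.exponential(1.0 / rho)
            continue
        x += c * w                                   # premiums until candidate
        lam *= np.exp(-delta * w)
        t += w
        if rng.uniform() < lam / bound:              # accept: claim U ~ Exp(1)
            x -= rng.exponential(1.0)
            if x < 0 and tau == np.inf:
                tau = t
    return x, lam, tau
```

Under the net profit condition of Assumption 1, such paths drift upwards on average, so ruin becomes unlikely for large initial capital.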
From now on, we will assume the following: Assumption 1 Assume that the net profit condition c > ρ𝔼[U]𝔼[Y]∕δ holds, where U has distribution F_U and is independent of all other stochastic objects introduced so far.
In many cases, the ruin probability itself is not a satisfying measure of the risk in a given model. One well-established and much more general approach is to use Gerber-Shiu functions. For the precise setup, we follow the presentation used in Schmidli (2010). Let w(x, y) be a continuous and bounded function, α > 0, and let τ denote the time of ruin. Then, the corresponding Gerber-Shiu function is defined by

g_α(x, λ) = 𝔼_{(x,λ)}[w(X_{τ−}, −X_τ) e^{−ατ} 1{τ < ∞}]

for (x, λ) ∈ [0, ∞) × (0, ∞). Further, g_α(x, λ) = 0 for x < 0 or λ ≤ 0. Here, 𝔼_{(x,λ)} is the expectation with initial conditions X_0 = x and λ_0 = λ. Even though this representation is commonly used, it is not satisfying in our case. Since we want to exploit weak convergence of càdlàg processes to justify our numerical scheme, we have to extend the definition of GS-functions.
Definition 3 Let (λ_t)_{t ≥ 0} denote the Markovian shot-noise process and (X_t)_{t ≥ 0} the corresponding surplus process. As a third process, define (m_t)_{t ≥ 0} by m_t ≔ U_{Ñ_t}, the process which remembers the size of the latest claim. Using these processes, we define for a continuous and bounded function w and a constant α > 0 the Gerber-Shiu function

g_α(x, λ, m) = 𝔼_{(x,λ,m)}[w(X_τ + m_τ, −X_τ) e^{−ατ} 1{τ < ∞}].

As already mentioned, the main advantage of this representation is that we use the càdlàg process m_t instead of the càglàd process X_{t−}. Since the claims (U_i)_{i ≥ 1} are i.i.d., the value g_α(x, λ, m) does not depend on the initial value m, i.e. the size of the latest claim. Therefore, we will omit m in the sequel and still write g_α(x, λ) for the GS-function. For the sake of completeness, we define the filtration (F_t)_{t ≥ 0} to be the natural filtration of the multivariate process (X_t, m_t, λ_t)_{t ≥ 0}. In this setting, the multivariate process is a strong Markov process with respect to this filtration.
To obtain a partial integro-differential equation satisfied by the GS-functions, we use the Markovian structure of our model. The process (X_t, m_t, λ_t)_{t ≥ 0} is a piecewise-deterministic Markov process (PDMP) with generator

A f(x, m, λ) = c ∂f∕∂x (x, m, λ) − δλ ∂f∕∂λ (x, m, λ) + λ ∫_0^∞ (f(x − u, u, λ) − f(x, m, λ)) dF_U(u) + ρ ∫_0^∞ (f(x, m, λ + y) − f(x, m, λ)) dF_Y(y),

which is certainly well-defined for all bounded and continuously differentiable functions f. Since our Gerber-Shiu functions are in general not continuously differentiable, we use the general definition of the generator of a PDMP from Rolski et al. (1999). For a function f, the path-derivative is defined by

𝔛f(x, m, λ) ≔ d∕dt f(x + ct, m, λe^{−δt})|_{t=0}.

Then, the domain of the generator of our PDMP consists of all functions f which are path-differentiable a.e. and whose jumps are integrable, in the sense that the expected sum of the absolute jump differences of f is finite for all t ≥ 0. The Gerber-Shiu function g_α belongs to this domain.

Proof To show this, we prove that the Gerber-Shiu functions are path-differentiable and bounded.
For the path-differentiability, we follow the line of arguments given in Strini and Thonhauser (2020). Define, for some deterministic r > 0, the bounded stopping time σ = r ∧ T_1, where T_1 denotes the first jump time of the PDMP. Applying the strong Markov property at σ gives a dynamic-programming representation of g_α. Now, there are two cases: either σ = r or σ = T_1. Adding and subtracting e^{−∫_0^r (λe^{−δs} + ρ + α) ds} g_α(x, λ) and rearranging yields a difference quotient whose integral term, expressed through an auxiliary function H, is differentiable in r = 0 from the right with derivative H(0). Hence, letting r → 0 gives the differentiability of g_α along the paths of our PDMP. The integrability is an immediate consequence of the boundedness of w. ◻

If we use the derived form of the path-derivative of g_α in the definition of the generator, we see that the GS-function solves the partial integro-differential equation A g_α(x, λ) = α g_α(x, λ) on (x, λ) ∈ [0, ∞) × (0, ∞). But we still have to show that it is the unique solution.
Proof As already shown, the GS-function is bounded and solves the equation stated above. Now, observe that the PIDE can be rewritten in terms of the generator of the PDMP. Let h be an arbitrary bounded solution of this equation, t > 0, and S a bounded stopping time. Since h is path-differentiable and bounded, it is in the domain of the generator A, which we use to obtain a Dynkin-type martingale representation of h. Using this representation for the bounded stopping time τ ∧ t yields a decomposition into several terms. Let us now focus on the third term. Using that Ñ is a counting process with intensity λ, we can rewrite it accordingly; the same arguments apply to the remaining jump terms. Since h is bounded, we can find a positive constant K bounding the resulting remainder. Using this, we finally get that h coincides with the GS-function g_α. ◻

Auxiliary Processes
As shown in the previous section, the function g_α satisfies a partial integro-differential equation (PIDE). Generally, this equation cannot be solved explicitly. Hence, we need a numerical scheme that allows us to calculate an approximation of the desired value. An intuitive way to do so is to bound the state space and suitably discretize the PIDE on the bounded domain. This results in a system of linear equations which we can solve. Our approach is to approximate a bounded version of the PDMP by Markov chains, determine the corresponding Gerber-Shiu functions, and show that these converge to the original ones.

Bounded Processes
The first step of the approximation procedure is to bound some components of the processes in a suitable way.
Then, we define the bounded intensity process (λ_t^{(b)})_{t ≥ 0} in the same way as λ, but with truncated shocks. Here, the distribution of the random variable Y_j^{(b)} depends on the original j-th shock and the pre-jump location of the process λ^{(b)}: the bounded shock is cut off at the remaining headroom below the level λ_max^{(b)}.

This might seem complicated, but it ensures that our new shocks have bounded support and that the whole process λ^{(b)} does not leave the bounded state space (0, λ_max^{(b)}]. Given this new process, we define our new counting process Ñ^{(b)} using the acceptance-rejection method, also called thinning: a jump time T of the original counting process Ñ is accepted for the new jump process with probability λ_T^{(b)}∕λ_T. Then Ñ^{(b)} is a Cox process with intensity λ^{(b)}, whose jump times coincide with jump times of our original process. Defining the sequence of bounded claims by truncation at U_max (indexed by the accepted jump times, hence no longer an i.i.d. sequence), this new triplet of processes is again a PDMP with generator A^{(b)}. Having this, we can now define the GS-function of the bounded process.
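The two truncation devices above can be sketched as follows (Python; the exact capping rule is our reading of the construction, namely that a shock is cut to the headroom left below λ_max^{(b)}, and that a claim time of the original Cox process is kept with probability λ^{(b)}∕λ; all names are our own):

```python
import numpy as np

LAM_MAX = 4.5                          # bound lambda_max^{(b)} from the examples

def capped_shock(lam_pre, y, lam_max=LAM_MAX):
    """Cut a shock so the bounded intensity stays inside (0, lam_max]."""
    return min(y, lam_max - lam_pre)

def thin(jump_times, lam_at_jumps, lam_b_at_jumps, seed=0):
    """Keep a jump of the original Cox process with probability lam_b / lam
    (acceptance-rejection); since lam_b <= lam, this is a probability."""
    rng = np.random.default_rng(seed)
    return [t for t, lam, lam_b in zip(jump_times, lam_at_jumps, lam_b_at_jumps)
            if rng.uniform() <= lam_b / lam]

# the capped process can never leave (0, LAM_MAX]:
assert capped_shock(4.0, 2.0) == 0.5
```

When the bounded intensity agrees with the original one, the acceptance probability is 1 and no jumps are lost, which is why the jump times of Ñ^{(b)} form a subset of those of Ñ.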
Definition 5 Let g_α(x, λ) = 𝔼_{(x,λ)}[w(X_τ + m_τ, −X_τ) e^{−ατ} 1{τ < ∞}] be an arbitrary Gerber-Shiu function. Then, we define the corresponding GS-function of the bounded process by

g_α^{(b)}(x, λ) = 𝔼_{(x,λ)}[w(X_τ^{(b)} + m_τ^{(b)}, −X_τ^{(b)}) e^{−ατ^{(b)}} 1{τ^{(b)} < ∞}].

We call these processes bounded, since the intensity process (λ_t^{(b)})_{t ≥ 0} and the random variables Y^{(b)} and U^{(b)} are a.s. bounded. Despite this denomination, the surplus component of the multivariate process is left unbounded, since any change there would disturb the strictly monotone increasing drift. We could resolve this problem by using external states, which could be introduced to change the deterministic flow. This would create an active boundary, i.e. an area where jumps occur deterministically. Unfortunately, this causes several problems in the proofs of weak convergence in Section 4.

Discrete State Processes
Let us now fix some b and h > 0. Then we introduce a continuous-time Markov chain with countable state space approximating the bounded process in the following way: Definition 6 Define the state space as a grid of mesh size h. Further, we set, for some fixed j, the number of points which we can reach from λ_j with a bounded shock event, together with the corresponding transition probabilities. Using this, we define the Markov chain on the discrete state space via its generator, where we set λ_{−1} ≔ λ_1 and λ_n ≔ λ_max^{(b)} for all n ≥ N.
This generator still consists of infinitely many expressions, due to the unbounded state space. To bypass this, we introduce a third family of processes with finite state space.
Here, we write again λ_{−1} ≔ λ_1 and λ_n ≔ λ_max^{(b)} for all n ≥ N. Now, we can introduce the corresponding GS-function of the Markov chain with finite state space.
The GS-function of the finite-state Markov chain is the unique solution of the corresponding finite system of linear equations. Proof Since α > 0, the matrix corresponding to the above system of equations is strictly diagonally dominant, hence regular. Since the GS-function solves the system, it is the unique solution. ◻
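The regularity argument can be illustrated numerically: for any conservative generator Q (rows summing to zero) and α > 0, the matrix αI − Q is strictly diagonally dominant and hence invertible, so a system of the form (αI − Q)g = r has a unique solution. A toy sketch (Python; the 3-state generator and right-hand side below are made up and are not the scheme's actual matrix):

```python
import numpy as np

# Toy generator Q of a 3-state continuous-time Markov chain (rows sum to 0)
Q = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -3.0,  2.0],
              [ 0.0,  0.5, -0.5]])
alpha = 0.1
M = alpha * np.eye(3) - Q

# strict diagonal dominance: |M_ii| > sum_{j != i} |M_ij| holds since alpha > 0
assert all(abs(M[i, i]) > sum(abs(M[i, j]) for j in range(3) if j != i)
           for i in range(3))

reward = np.array([1.0, 0.0, 0.2])   # hypothetical penalty payoffs
g = np.linalg.solve(M, reward)       # unique solution of (alpha*I - Q) g = r
```

The diagonal entry α + |q_ii| always exceeds the off-diagonal row sum |q_ii| by exactly α, which is the one-line reason the proof only needs α > 0.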

Convergence of Gerber-Shiu Functions
In this section, we prove that our numerical scheme converges as h → 0 and b → ∞. For this, we exploit the convergence in distribution of processes as random variables on the Skorokhod space of càdlàg functions. This convergence implies the convergence of Skorokhod-continuous and bounded functionals of the corresponding processes. For further details on this metric space see (Ethier and Kurtz 2009, Chapter 3). Since our processes are Markov processes, the main idea is to reduce this to the convergence of the corresponding generators. For Feller processes, these properties are equivalent, as shown in (Kallenberg 2002, Theorem 19.25). Since our processes are not Feller, we will use Theorem 8.2 in Chapter 5 of Ethier and Kurtz (2009) to establish the same result. Consequently, we have to find a suitable subdomain of our generators such that the induced semigroup is strongly continuous on this set of functions. If this domain is convergence determining, e.g. if it contains C_c^∞, and the generators converge for all f from this domain, then the corresponding processes converge weakly.

Lemma 4 The generator A of the original PDMP generates a strongly continuous contraction semigroup
Proof Since A is the generator of the Markov process (X_t, m_t, λ_t)_{t ≥ 0}, we have to show that T_t maps this set into itself and is strongly continuous there.
By a small modification of the proof of Theorem 27.6 in Davis (1993), we can relax the assumption that the intensity is bounded. This gives us that, for all bounded and continuous f, we have T_t f ∈ C_b too. By Theorem 7.7.4 of (Jacobsen 2006, pp. 181-182), the operators map path-differentiable functions f satisfying ‖Af‖_∞ < ∞ into themselves. The strong continuity is an immediate consequence of the boundedness of Af: the difference ‖T_t f − f‖_∞ is bounded by t‖Af‖_∞. Since this upper bound is independent of (x, m, λ), we can let t tend to 0, which gives us that the contraction semigroup T_t is strongly continuous in t = 0. ◻

Theorem 5 Let f ∈ D be arbitrary. Then, for all t, s > 0, k ≥ 0, h_1, …, h_k ∈ C_b and t_1 < t_2 < … < t_k ≤ t, the corresponding expectations of the bounded processes converge to those of the original PDMP as b → ∞.

Proof At first, we cover the case k = 0. Let f ∈ D be arbitrary and g = Af. It is easy to see that, for every b > 0, f is in the domain of the generator A^{(b)} too. Writing g^{(b)} for A^{(b)}f, we can decompose the difference of the expectations into two terms. The first term is the expectation of a zero-mean martingale, hence 0. For the second term, we take a closer look at the difference of the generators A and A^{(b)} applied to the same function f at the same point (x, m, λ). The derivatives coincide, and so do the integrals from 0 to U_max^{(b)} and from 0 to min(λ_max^{(b)} − λ, Y_max^{(b)}), respectively. The absolute value of the remaining parts tends to 0 as b → ∞, since U_max^{(b)}, λ_max^{(b)} and Y_max^{(b)} tend to infinity, but not uniformly in λ.

If we now get back to our expectation, the second part tends to 0 as b → ∞, but the first part still needs some work, since it depends on λ_v. For this, we recall that, given λ_0 = λ, the intensity λ_v with v ≤ s + t can be bounded from above by λ plus the compound-Poisson-distributed random variable ∑_{i=1}^{N_{t+s}} Y_i. This yields a bound which is independent of v and tends to 0 as b tends to infinity, which settles the case k = 0.

For k > 0, we observe that the chosen time points t_1, …, t_k are prior to time t. Therefore, similarly to the case k = 0, we can rewrite the difference. The functions h_i are in C_b; hence we can bound the absolute value of their product uniformly by some constant c and conclude as before. ◻

Corollary 6 The bounded processes converge weakly to the original PDMP as b → ∞.

Proof This is a direct consequence of Theorem 5 and Theorem 8.2 in (Ethier and Kurtz 2009, pp. 226-227). ◻

Convergence of the Discrete Processes
Now we will use the same ideas as before, but on the set D^{(b)}. Again we define a contraction semigroup and want to show that this semigroup is strongly continuous at 0 over the set D^{(b)}.

Lemma 7 Let f be in D^{(b)}. Then the Lipschitz continuity of f is preserved under the semigroup of the bounded process.
Proof The altered initial condition in the first variable only affects the surplus process. Let X̃_t^{(b)} be the reserve process with initial capital y and X_t^{(b)} the corresponding process with starting value x. By the linear structure of the surplus process, we see that X̃_t^{(b)} − X_t^{(b)} = y − x for all t ≥ 0 and all ω ∈ Ω. By this we get that the Lipschitz continuity in the first variable is preserved, where L denotes the Lipschitz constant of f with respect to ‖·‖_1. The same idea leads to a preserved Lipschitz continuity in the second variable. Now we want to show that this holds for the third variable too. Here, things get a little more complicated, since small changes in the intensity process influence all three processes. Let us now consider a realization λ_t^{(b)} of the intensity process with initial condition λ_0^{(b)} = λ and, for some h > 0, the altered realization λ̃^{(b)} with starting value λ + h, and take a look at the difference of those processes. If no shock event appeared until time t, or shocks happened but λ̃^{(b)} did not hit λ_max^{(b)}, the difference between those processes is he^{−δt}. Otherwise, the difference decreases and may even become 0 if both λ^{(b)} and λ̃^{(b)} hit λ_max^{(b)}.
As already mentioned, the difference in the starting intensity leads to a change in the surplus process too. To be precise, we again consider two realizations, X̃_t^{(b)} with starting intensity λ + h and X_t^{(b)} corresponding to λ_0^{(b)} = λ. Finally, the corresponding realizations m̃_t^{(b)} and m_t^{(b)} may relate in three different ways. The first case is that Ñ_t^{(b)} > 0 and the last jump before time t is due to Ñ; in this case m̃_t^{(b)} and m_t^{(b)} coincide. Let us first consider the second term. We can rewrite the difference, and by the Lipschitz continuity of f and the boundedness of the λ_j, we get that it is less than or equal to 2Lλ_max^{(b)}ch, where L denotes a Lipschitz constant of f. By the same idea, we can bound the third term. For the remaining term, we define the function g: [0, ∞) → ℝ by g(u) ≔ f(x + cu, x_l, λe^{−δu}). This is a Lipschitz continuous function in one real variable. Hence, it is differentiable almost everywhere, and at every u where g is differentiable the equality g′(u) = 𝔛f(x + cu, x_l, λe^{−δu}) holds. By this we obtain the desired Lipschitz bound in the third variable. ◻

Analogously to Theorem 5, we prove the following lemma.
Lemma 10 Let f ∈ D^{(b)} be arbitrary but fixed. Then, for all t, s > 0, k ≥ 0, h_1, …, h_k ∈ C_b and t_1 < t_2 < … < t_k ≤ t, the corresponding expectations converge as h → 0.

Examples
In this section, we give some explicit examples of Gerber-Shiu functions and corresponding numerical approximations in a Markovian shot-noise model with the following specific parameters. We choose the decay parameter δ = 1, the intensity of the underlying Poisson process ρ = 1.5, and the premium rate c = 15∕4. Further, we assume that the shock events Y_i and the claim events U_i are exponentially distributed with mean 1. All computations and simulations are made on a standard notebook with an Intel Core i5-10210U processor at 1.60 GHz and 16 GB of RAM.
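These parameters satisfy the net profit condition of Assumption 1, here written in the form c > ρ𝔼[U]𝔼[Y]∕δ (an assumed rendering of the condition); the arithmetic check is immediate:

```python
# Section 5 parameters: delta = 1, rho = 1.5, c = 15/4, Exp(1) shocks and claims
delta, rho, c = 1.0, 1.5, 15 / 4
E_Y = E_U = 1.0                       # mean of an Exp(1) random variable
assert c > rho * E_U * E_Y / delta    # 3.75 > 1.5: premiums dominate claims
```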

Laplace-Transform Function of the Time of Ruin
The first example is the GS-function g^{(1)}(x, λ) ≔ 𝔼_{(x,λ)}[e^{−ατ} 1{τ < ∞}], i.e. the Laplace transform of τ, with α = 0.1 and for fixed λ = 2.3. In Fig. 1, the function in black is a Monte Carlo simulation using 10000 sample paths, and the red area is the corresponding 95-percent confidence interval. For the numerical approximations, we choose x_max = 50, λ_max = 4.5, U_max = 10, Y_max = 4.5 and h = 1∕(cm) for m ∈ {5, 10, 12}. As we can see in Table 1, the main advantage of the numerical method is the speed of computation. The scheme with h ≈ 0.02 needed about 38 minutes, whereas the computation of the corresponding simulation needed approximately 27 hours.

Discounted Surplus Before Ruin
Here, we consider the same setting as in the previous example, but now with penalty function g^{(2)}(x, λ) = 𝔼_{(x,λ)}[e^{−ατ} X_{τ−} 1{τ < ∞}]. Again, the black function in Fig. 2 is an MC simulation from 10000 paths, which we use as a reference solution. This time, the plot shows the behaviour of the GS-function in x for fixed λ = 3.9. As before, the numerical approximations are calculated with parameters x_max = 50, λ_max = 4.5, U_max = 10, Y_max = 4.5 and h = 1∕(cm) for m ∈ {5, 10, 12}. The function w(x, y) = x is continuous but not bounded. We bypass this problem by considering penalty functions of the form w(x, y) = min(x, n) for n ∈ ℕ. These functions are continuous and bounded; hence, the theory derived before is applicable. Further, the sequence (e^{−ατ} min(X_{τ−}, n) 1{τ < ∞})_{n ∈ ℕ} is monotone increasing. Consequently, the approximations converge for n → ∞ by monotone convergence. As we can see in Table 2, the behaviour in terms of computing time and relative error is similar to the corresponding values in the example of the Laplace transform.

Ruin Probability
As a third example, we consider the ruin probability g^{(3)}(x, λ) = 𝔼_{(x,λ)}[1{τ < ∞}]. The reference solution is again obtained by MC simulation, and the bounds x_max, λ_max, U_max, and Y_max are chosen as in the previous examples. An illustration of the simulation and the numerical approximations for fixed λ = 2.3 can be seen in Fig. 3.
In Table 3, we see that the run time and the relative error are very similar to those of the first two examples. Again, even at the finest step size considered, the numerical scheme beats the simulation by a factor of ≈ 40 in terms of computation time, which is the main advantage of our approach.

Empirical Convergence Order
Another topic of interest is the convergence order of numerical schemes. This has been studied, for example, by Chau et al. (2015), who considered GS-functions in a Lévy subordinator model. Numerical methods for integro-differential equations related to ours are also derived in Brunner (1988), who proposed spline collocation methods for ordinary Volterra integro-differential equations. He was able to achieve a convergence order up to 2m, given that the coefficients of the Volterra equation are 2m times continuously differentiable.
Since we consider partial integro-differential equations, we cannot use his results to obtain a theoretical convergence order. Alternatively, we compute the empirical order of convergence of our numerical scheme in the examples given before. For this, we consider two different approaches. The first is the estimated order of convergence (EOC) as defined in (Steinbach 2008, p. 253). For a sequence (x_n)_{n ≥ 0} with limit x, we define the sequence of absolute errors by e_n ≔ |x − x_n|. Assuming that e_n ≈ Cn^{−γ} for some fixed constant C, i.e. that the sequence converges with order γ, we divide by e_{n−1} and get e_n∕e_{n−1} ≈ (n∕(n−1))^{−γ}. Applying the logarithm and dividing by ln((n−1)∕n) gives us the EOC

γ̂_n = ln(e_n∕e_{n−1}) ∕ ln((n−1)∕n).
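The EOC can be computed directly from a sequence of errors; on an exactly first-order sequence e_n = C∕n it returns 1, and on e_n = C∕n² it returns 2 (a short Python sketch; function names are our own):

```python
import numpy as np

def eoc(errors):
    """gamma_hat_n = ln(e_n / e_{n-1}) / ln((n-1) / n) for n = 2, 3, ...

    `errors` holds e_1, e_2, ...; the returned array starts at n = 2."""
    e = np.asarray(errors, dtype=float)
    n = np.arange(1, len(e) + 1)
    return np.log(e[1:] / e[:-1]) / np.log(n[:-1] / n[1:])

# sanity check on e_n = 2 / n, which converges with order exactly 1:
assert np.allclose(eoc(2.0 / np.arange(1, 13)), 1.0)
```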

Fig. 3 Ruin probability and numerical approximations
The second procedure uses the same assumption e_n ≈ Cn^{−γ}, or equivalently ln(e_n) ≈ ln(C) − γ ln(n). Having this form, we use a linear regression approach to get an estimator γ̃ for the parameter γ, as done by Chau et al. (2015). We are interested in the convergence behaviour of the sequence of our numerical approximations at some fixed points (x_i, λ_j) as the fineness of the discretization tends to 0. To determine the error terms e_n correctly, we have to know the limit of this sequence, which is not the GS-function of our original process, but the GS-function of the bounded process, which we obtain by simulation.
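The regression estimator is simply the least-squares slope of −ln(e_n) against ln(n) (a Python sketch; function names are our own):

```python
import numpy as np

def regression_order(errors):
    """Fit -ln(e_n) = a + gamma * ln(n) by least squares; return gamma."""
    e = np.asarray(errors, dtype=float)
    n = np.arange(1, len(e) + 1)
    gamma, _intercept = np.polyfit(np.log(n), -np.log(e), 1)
    return gamma

# on exactly first-order data e_n = 3 / n the fitted slope is gamma = 1
assert abs(regression_order(3.0 / np.arange(1, 13)) - 1.0) < 1e-8
```

Unlike the EOC, which yields one estimate per consecutive pair of errors, the regression pools all error levels into a single estimate and is therefore less sensitive to noise in individual entries.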
In the following examples, we fix the bounds x_max = 50, Y_max = 4.5, U_max = 10 and λ_max = 4.5 and the GS-functions g^{(1)}, g^{(2)}, and g^{(3)} as before. Then, we consider the sequence of numerical approximations with step size h = 1∕(cm) for m ∈ {1, …, 12} and compute the EOC and the regression estimate γ̃ at the points (0.4, 2.3), (1.4, 2.3), and (2.5, 2.3). As we can see in Table 4, it seems plausible that we observe a linear convergence behaviour.
In Fig. 4 we see the linear function obtained by regressing −ln(e_n) on ln(n) for the ruin probability at the point (0.4, 2.3), together with the corresponding observed values in red. The coefficient of determination R² = 0.9996 indicates that there is indeed a linear relationship between ln(e_n) and ln(n), i.e. that the assumption e_n ≈ Cn^{−γ} is reasonable, and that γ̃ ≈ 1.037 is a good estimate of the true convergence order in this example.