On The Randomized Schmitter Problem

We revisit the classical Schmitter problem in ruin theory and consider it for a randomly chosen initial surplus level U. We show that the computational simplification obtained for exponentially distributed U allows us to connect the problem to m-convex ordering, from which simple and sharp analytical bounds for the ruin probability are obtained, both for the original (but randomized) problem and for extensions involving higher moments. In addition, we show that the solution to the classical problem with deterministic initial surplus level can conveniently be approximated via Erlang(k)-distributed U for sufficiently large k, utilizing the computational advantages of the advocated randomization approach.


Introduction
At the ASTIN Colloquium 1990 in Montreux, the Swiss actuary Hans Schmitter presented an algorithm for the exact evaluation of the ruin probability $\psi(u)$ of a Cramér-Lundberg surplus process for an insurance portfolio with initial surplus u, for the case when the claim amount distribution is discrete on a finite range (Schmitter (1990)). Also inspired by Bowers (1969), he then posed the following question: If the individual claims are known to have mean $\mu$ and variance $\sigma^2$, which claim size distributions minimize or maximize the ruin probability for a given u, respectively? I.e., the problems are
$$\min/\max\ \psi(u) \quad \text{subject to } X \text{ being a non-negative random variable with } \mathbb{E}(X) = \mu \text{ and } \operatorname{Var}(X) = \sigma^2,$$
where X is the random variable describing the individual claim sizes. This problem was then further discussed by Brockett et al. (1991) and taken up in Kaas (1991), where it was also extended to the related problem of finding extremal values of stop-loss premiums for compound Poisson distributions with similar moment restrictions. Much later, De Vylder et al. (1997b) provided a numerical solution to the Schmitter problem based on a renewal equation that approximates the classical ruin model using a discrete time grid, and the original problem was partially solved in De Vylder et al. (1997a).
While on the basis of these contributions the problem can be considered as quite well understood, it was never solved in full generality. Correspondingly, despite the considerable time that has passed since then and the gradual shift of criteria for solvency considerations in insurance practice in the meantime, we would like to add an additional layer of complexity and understanding of the Schmitter problem in this paper by taking the perspective of a randomized initial surplus level.
Randomization as a principle has proven to be a very useful tool in risk theory leading to simpler expressions (see e.g. Albrecher et al. (2013), Ivanovs (2013)) or even unexpected identities (Albrecher and Ivanovs (2017)), but particularly also to considerable computational advantages (cf. Carr (1998), Asmussen et al. (2002), Albrecher and Goffard (2021)). The idea for the latter computational approach is to replace a deterministic quantity by a random variable with matching expected value, often with the advantage of smoothing the corresponding computational problem, leading to simpler and amenable expressions. In a final step, if possible the variance of that random variable is reduced considerably such that the resulting value can be an excellent approximation of the original computationally complex problem ("Erlangization").
In our setting, we replace the deterministic initial surplus level u by an exponentially distributed random variable U with mean u. The expected value of the resulting ruin probability can then be expressed in terms of the (simpler) Laplace transform of the classical ruin probability under the Cramér-Lundberg model. At this level, analytical lower and upper bounds for the ruin probability in the Schmitter problem can then be established utilizing the strong results in the theory of m-convex orders obtained by Denuit et al. (1998, 1999) (see also Lefèvre et al. (2020) for a recent application of this ordering concept). In addition, the generality of the latter results in fact allows us to give sharp upper and lower bounds for the ruin probability when more than two moments of the underlying claim size distribution are specified, which can be seen as an extension of the Schmitter problem that naturally narrows the gap between the upper and lower bound. For a comprehensive survey of stochastic orderings we refer to the monographs by Kaas et al. (1994), Shaked and Shanthikumar (1994, 2007) and Müller and Stoyan (2002). More recent treatments in a specifically actuarial context include Kaas et al. [Ch.7] (2008) and Asmussen and Steffensen [Ch.8] (2020).
Eventually, we are also interested in using these explicit expressions of the randomized model to approximate the classical situation of deterministic initial surplus level u. Extending the results to Erlang(k)-distributed initial surplus with the expected value kept at u, increasing k provides increasingly accurate approximations for the classical deterministic case, expressed through the explicit formulas of the randomized model.
The remainder of the paper is structured as follows. First, Section 2 recapitulates the model setting and summarizes relevant results from the existing literature. In Section 3 we then analyze the problem for an exponentially distributed initial surplus level U. We obtain an expression for the corresponding (expected) ruin probability in terms of the Laplace transform of the classical ruin probability in the Cramér-Lundberg model, and provide sharp lower and upper bounds for it when the claim size is bounded. We also provide corresponding bounds in the case of more than two pre-specified moments of the claim size distribution. Moreover, we illustrate the resulting interval for particular numerical parameters and place various concrete (truncated) claim size distributions within these bounds. In Section 4, we expand the randomization idea towards Erlang(k)-distributed initial surplus, and in the spirit of Asmussen et al. (2002) we approximate the ruin probability with deterministic surplus via Erlangization and Richardson extrapolation. We give numerical illustrations which show that the somewhat curious kinks in the graphs of the known optimal solutions of the classical Schmitter problem can be smoothly approximated with this randomization approach. In some cases, a small value of k is already sufficient for a good approximation, in others k has to be considerably larger. Section 5 concludes.

Preliminaries and Previous Results
Consider the classical Cramér-Lundberg model with surplus process
$$C(t) = u + ct - S(t)$$
at time $t \ge 0$, where u is the initial surplus level. Here, $S(t) = X_1 + \dots + X_{N(t)}$ denotes the aggregate claims up to time t, where the number of claims $\{N(t); t \ge 0\}$ up to time t is a homogeneous Poisson process with rate $\lambda > 0$ and the claim sizes $X_i$, $i = 1, 2, \dots$, are independent and identically distributed random variables with distribution function $F_X$ and expected value $\mathbb{E}(X_1) = \mu$, independent of $\{N(t); t \ge 0\}$. We assume that all moments of $X_1$ exist. The premium income per unit of time is $c = (1+\theta)\lambda\mu$, where $\theta > 0$ is the safety loading. Define the associated aggregate loss process as $R(t) = S(t) - ct$, for $t \ge 0$. The probability $\psi(u)$ of ultimate ruin is the probability that the surplus process C(t) ever drops below zero,
$$\psi(u) = \mathbb{P}\big(C(t) < 0 \text{ for some } t > 0 \mid C(0) = u\big).$$
The maximal aggregate loss $L = \sup_{t \ge 0} R(t)$ can be decomposed as the sum of ladder heights, i.e. as the sum of the amounts by which record lows (here denoted by $L_1, L_2, \dots$) in the insurer's surplus C(t) appear. Furthermore, the distribution of the $L_i$ ($i = 1, 2, \dots$) is given by the integrated tail distribution
$$F_{L_i}(x) = \frac{1}{\mu}\int_0^x \big(1 - F_X(y)\big)\,dy, \quad x \ge 0,$$
and
$$\psi(u) = \mathbb{P}(L > u) = 1 - \sum_{n=0}^{\infty} \frac{\theta}{1+\theta}\Big(\frac{1}{1+\theta}\Big)^{n} F_{L_i}^{*n}(u) \qquad (1)$$
(see e.g. Asmussen and Albrecher (2009)). The latter expression shows that L is a compound geometric random variable and may be written as $L = \sum_{i=1}^{M} L_i$, with M being the number of ladder heights. It is easy to see that M has a geometric distribution with $\mathbb{P}(M = n) = \frac{\theta}{1+\theta}\big(\frac{1}{1+\theta}\big)^n$, $n = 0, 1, \dots$
For the case when the claim amount distribution has discrete support $\{x_1, x_2, \dots, x_m\}$ (with probabilities $p_1, p_2, \dots, p_m$), Schmitter (1990) gave an explicit expression to compute $\psi(u)$. In the context of the Schmitter problem, 2-point distributions for the claim size play a special role. If X assumes the values $x_1$ with probability p and $x_2 > x_1$ with probability $1 - p$, then for fixed mean $\mu > 0$ and variance $\sigma^2 > 0$ we simply have
$$x_2 = \mu + \frac{\sigma^2}{\mu - x_1}, \qquad p = \frac{\sigma^2}{\sigma^2 + (\mu - x_1)^2},$$
or correspondingly $x_1 = \mu - \sigma^2/(x_2 - \mu)$. Notice that $x_2$ is increasing in $x_1$. Moreover, one has the relationships $x_1 < \mu < x_2$ and $p = (x_2 - \mu)/(x_2 - x_1)$ (see e.g. Kaas et al. [Ch 10.2] (1994)).
If we additionally assume that $X \in [0, b]$, naturally $x_2 \le b$, and we have $0 \le \mu \le b$ and $0 \le \sigma^2 \le \mu(b - \mu)$. The following two extremal cases will be particularly relevant later. Namely, $X = \{0, 0^* := (\sigma^2 + \mu^2)/\mu\}$, and so $p = \sigma^2/(\sigma^2 + \mu^2)$, and $X = \{b^* := \mu + \sigma^2/(\mu - b), b\}$. Here, $x^*$ denotes the function that assigns to x the unique real number such that the random variable $X = \{x, x^*\}$ has mean $\mu$ and variance $\sigma^2$. Note that if the support is unbounded, then as $x_1 \uparrow \mu$, $p \uparrow 1$ and $x_2 \to \infty$; while the probability mass at $x_2$ becomes arbitrarily small, it significantly contributes to the variance.
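The 2-point parametrization can be checked numerically. The following is a minimal sketch, assuming the relations $x_2 = \mu + \sigma^2/(\mu - x_1)$ and $p = \sigma^2/(\sigma^2 + (\mu - x_1)^2)$; the function names `two_point` and `mean_and_variance` are hypothetical helpers chosen for this illustration, and the parameter values $\mu = 3$, $\sigma^2 = 1$ are those used in the numerical examples of the paper.

```python
def two_point(x1, mu, sigma2):
    """Return (x2, p) so that X in {x1, x2} with P(X = x1) = p
    has mean mu and variance sigma2 (requires x1 < mu)."""
    if not x1 < mu:
        raise ValueError("x1 must be strictly below the mean")
    x2 = mu + sigma2 / (mu - x1)
    p = sigma2 / (sigma2 + (mu - x1) ** 2)
    return x2, p


def mean_and_variance(x1, x2, p):
    """Mean and variance of the 2-point distribution {x1, x2}."""
    mean = p * x1 + (1 - p) * x2
    var = p * (x1 - mean) ** 2 + (1 - p) * (x2 - mean) ** 2
    return mean, var


mu, sigma2 = 3.0, 1.0
zero_star, p0 = two_point(0.0, mu, sigma2)   # extremal case X = {0, 0*}
check_mean, check_var = mean_and_variance(0.0, zero_star, p0)
```

For $x_1 = 0$ this reproduces $0^* = (\sigma^2 + \mu^2)/\mu$ and $p = \sigma^2/(\sigma^2 + \mu^2)$, and the recomputed mean and variance agree with the prescribed values.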
For any non-negative loss variable X, the stop-loss premium $\pi_X$ is defined by
$$\pi_X(d) = \mathbb{E}\big[(X - d)_+\big] = \int_d^{\infty} \big(1 - F_X(x)\big)\,dx, \quad d \ge 0.$$
Note that there is a one-to-one relation between the integrated tail distribution of X and its stop-loss premium, namely $F_{L_i}(z) = 1 - \pi_X(z)/\mu$. One important concept in the theory of risk ordering is the stop-loss order. Concretely, a random variable X is said to be less risky than another random variable Y in stop-loss order ($X \le_{sl} Y$) if $\pi_X(d) \le \pi_Y(d)$ for all retentions $d \ge 0$ (it is equivalent to increasing convex ordering, cf. Shaked and Shanthikumar (2007)). The problem of finding bounds for stop-loss premiums is a classical topic in actuarial science, see for example Bühlmann et al. (1977), Kaas and Goovaerts (1986) and Steenackers and Goovaerts (1991). For a study of the relation between stop-loss premiums and their associated ruin probabilities as well as general upper bounds for both stop-loss premiums and ruin probabilities see Cai and Garrido (1998) and the references therein. A consequence of the above concept is that for two Cramér-Lundberg risk processes with equal premium per unit of time and claim intensity parameter, but different claim sizes, say X and Y, with $X \le_{sl} Y$, we have $\psi_X(u) \le \psi_Y(u)$ for all $u \ge 0$ (see Kaas et al. [Ch.8.2,Th.2.1] (1994)). Correspondingly, the Schmitter problem may be seen as being reduced to finding extremal distributions in the stop-loss order in the class of random variables in [0, b] with mean $\mu$ and variance $\sigma^2$. However, as pointed out in Brockett et al. (1991), there are no extremal distributions in terms of stop-loss order in such a class.
Nevertheless, one can construct stop-loss transforms in the corresponding range (bounded or not) with the given mean, but with minimal variance, larger than the given one. For two given moments, the latter is achieved by constructing a polynomial of degree 2 above the function $(x - d)_+$ which is tangent to this function in 2 points. The abscissas of these points will be the mass points. For a comprehensive description of this construction see Kaas et al. [Ch.10] (1994). In the following we briefly state its main consequences.
For unbounded X with mean $\mu$ and variance $\sigma^2$, the maximal stop-loss premium at fixed retention d is attained by a random variable Z with support $\{r, r^*\}$, where
$$r = d - \sqrt{(d - \mu)^2 + \sigma^2}, \qquad r^* = d + \sqrt{(d - \mu)^2 + \sigma^2}.$$
If X is restricted to [0, b], then for given retention d the maximal stop-loss premium is attained by the distribution with the mass points
$$\{0, 0^*\} \ \text{ if } 0 \le d \le \tfrac12 0^*, \qquad \{r, r^*\} \ \text{ if } \tfrac12 0^* \le d \le \tfrac12 (b + b^*), \qquad \{b^*, b\} \ \text{ if } \tfrac12 (b + b^*) \le d \le b,$$
with the notation introduced before. However, these results do not provide an upper bound for the ruin probability in the Schmitter problem, because the extremal distribution is not the same across all values of d, which would be needed to carry over the dominance in terms of the stop-loss premium from the integrated tail to all its convolutions in (1). However, Kaas (1991) showed that if X has lower stop-loss premiums than Y on the interval [0, u], then the same property holds for compound sums with N terms of these random variables respectively, and ruin probabilities with an initial surplus u are lower for X than for Y. That is, for values of u smaller than $\tfrac12 0^*$, the ruin probability is maximized by the 2-point claim random variable $X = \{0, 0^*\}$. Consequently, in terms of the upper bound the Schmitter problem is solved for small values of the initial surplus u.
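As a numerical illustration of the extremal 2-point support for a fixed retention, the following minimal sketch assumes the classical expressions $r = d - \sqrt{(d-\mu)^2 + \sigma^2}$ and $r^* = d + \sqrt{(d-\mu)^2 + \sigma^2}$, for which the maximal stop-loss premium equals $\tfrac12\big(\sqrt{(d-\mu)^2+\sigma^2} - (d-\mu)\big)$. The helper names and the chosen retention $d = 4$ are hypothetical and serve only this example.

```python
import math


def max_stop_loss_support(d, mu, sigma2):
    """Mass points {r, r*} maximizing the stop-loss premium at retention d."""
    root = math.sqrt((d - mu) ** 2 + sigma2)
    return d - root, d + root


def stop_loss_two_point(x1, x2, mu, d):
    """Stop-loss premium E[(X - d)_+] of a 2-point X on {x1, x2} with mean mu."""
    p = (x2 - mu) / (x2 - x1)          # P(X = x1), fixed by the mean
    return p * max(x1 - d, 0.0) + (1 - p) * max(x2 - d, 0.0)


mu, sigma2, d = 3.0, 1.0, 4.0
r, r_star = max_stop_loss_support(d, mu, sigma2)
premium_max = stop_loss_two_point(r, r_star, mu, d)

# Competitor with the same mean and variance: X = {2.5, 5}
premium_other = stop_loss_two_point(2.5, 5.0, mu, d)
```

The pair $\{r, r^*\}$ satisfies $r^* = \mu + \sigma^2/(\mu - r)$, so it has the prescribed mean and variance, and its stop-loss premium at d exceeds that of the competing 2-point distribution.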
De Vylder et al. (1997b) provided numerical solutions to the problem based on a renewal equation in a discretized risk model. By restricting to lattice distributions, they used the method of linear combinations (see also Kaas et al. [Sec.3] (1992)) to obtain optimal solutions to the problem. They noted that for $u \gg b$ the maximal ruin probability was given by the 2-point claim random variable $X = \{b^*, b\}$. In fact, De Vylder et al. (1997a) then proved that there exists a constant $c > 0$ such that for all $u \ge c$ the maximal ruin probability is given by that 2-point claim random variable. However, the concrete value of c as well as the optimal result for intermediate values of u still seem not to be settled. The minimal stop-loss premium for risks X with mean $\mu$ is given by $(\mu - d)_+$ for all retentions $d \ge 0$, i.e. it is attained by the degenerate random variable Z concentrated at $\mu$, implying $Z \le_{sl} X$ and therefore $\psi_Z(u) \le \psi_X(u)$ for all u. However, Z does not fulfill the variance constraint, so that this is not a valid solution to the Schmitter problem. It does provide a general lower bound for its solution though, and for unbounded X the variance constraint can then be satisfied by adding an $\varepsilon$ ($\downarrow 0$) mass at infinity, see also Asmussen and Albrecher [Cor. IV.8.4] (2009).

Exponentially Distributed Initial Surplus
Let us now replace the deterministic initial surplus u by a random variable U that has an exponential distribution with parameter $s > 0$. The redefined surplus process then is
$$C_R(t) = U + ct - S(t),$$
where c and S(t) are defined as in the classical ruin model, and U is independent of S(t).
Using the convenient fact that this choice of U simply puts us in the framework of Laplace transforms, due to (2) the ruin probability $\psi_U(s) := \mathbb{P}(C_R(t) < 0 \text{ for some } t > 0)$ is then given by
$$\psi_U(s) = \int_0^{\infty} s\,e^{-su}\,\psi(u)\,du = s\,\hat\psi(s) = 1 - \frac{\theta\mu s}{(1+\theta)\mu s - 1 + \hat f_X(s)}, \qquad (4)$$
where $\hat\psi$ denotes the Laplace transform of $\psi$ and $\hat f_X(s) = \mathbb{E}(e^{-sX})$ the Laplace transform of the claim size. Since the randomization of the initial surplus corresponds to a probability-weighted averaging over situations with deterministic surplus, it is clear that this step leads to a smoothing of the ruin probability shape. Figure 1 compares the ruin probabilities $\psi(u)$ for deterministic surplus $u \in \{1.5, 4.5, 9.0\}$ and $\theta = 0.5$ (the parameters from Kaas [Fig. 1] (1991)) with the corresponding randomized quantities of the same expected initial surplus $\mathbb{E}(U) = 1/s = u$ for 2-point distributions with given mean $\mu = 3$ and variance $\sigma^2 = 1$. One observes that the sensitivity w.r.t. the choice of the only free parameter $x_1$ is substantially different, and the somewhat curious shape change for increasing u from the classical deterministic case is indeed evened out.
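The randomized ruin probability is straightforward to evaluate once the Laplace transform of the claim size is available. The sketch below assumes the compound geometric representation $\psi_U(s) = 1 - \mathbb{E}(e^{-sL}) = 1 - \theta\mu s / ((1+\theta)\mu s - 1 + \mathbb{E}(e^{-sX}))$; the function names are hypothetical. As a sanity check it uses exponential claims, for which the classical closed form $\psi(u) = \frac{1}{1+\theta} e^{-\theta u/((1+\theta)\mu)}$ gives $\psi_U(s) = \mu s/(\theta + (1+\theta)\mu s)$.

```python
import math


def psi_U(s, theta, mu, laplace_X):
    """Ruin probability for Exp(s)-distributed initial surplus,
    given the Laplace transform s -> E[exp(-s X)] of the claim size."""
    denom = (1 + theta) * mu * s - 1 + laplace_X(s)
    return 1 - theta * mu * s / denom


def laplace_two_point(x1, mu, sigma2):
    """Laplace transform of the 2-point claim {x1, x2} with mean mu, variance sigma2."""
    x2 = mu + sigma2 / (mu - x1)
    p = sigma2 / (sigma2 + (mu - x1) ** 2)
    return lambda s: p * math.exp(-s * x1) + (1 - p) * math.exp(-s * x2)


def laplace_exponential(mu):
    """Laplace transform of an exponential claim with mean mu."""
    return lambda s: 1.0 / (1.0 + mu * s)


theta, mu, sigma2 = 0.5, 3.0, 1.0
s = 1 / 4.5                                   # expected initial surplus E(U) = 4.5
psi_two_point = psi_U(s, theta, mu, laplace_two_point(0.0, mu, sigma2))
psi_exp = psi_U(s, theta, mu, laplace_exponential(mu))
```

For exponential claims the value agrees with the closed form above, which provides a simple check of the Laplace transform expression.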
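Let us now look at the randomized and extended Schmitter problem with possibly more than two fixed moments of the claim size distribution. Inspired by Kaas (1991), using the maximal aggregate loss L and assuming that the moments of the claim size are finite, one can express the ruin probability in terms of the claim size moments, namely
$$\psi_U(s) = 1 - \mathbb{E}(e^{-sL}) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1} s^n}{n!}\,\mathbb{E}(L^n). \qquad (5)$$
Writing $\mu_k = \mathbb{E}(X^k)$ (so that $\mu_1 = \mu$), the first four terms of this series are given by
$$\psi_U(s) = \frac{\mu_2}{2\theta\mu}\,s - \Big(\frac{\mu_2^2}{4\theta^2\mu^2} + \frac{\mu_3}{6\theta\mu}\Big)s^2 + \Big(\frac{\mu_2^3}{8\theta^3\mu^3} + \frac{\mu_2\mu_3}{6\theta^2\mu^2} + \frac{\mu_4}{24\theta\mu}\Big)s^3 - \Big(\frac{\mu_2^4}{16\theta^4\mu^4} + \frac{\mu_2^2\mu_3}{8\theta^3\mu^3} + \frac{\mu_2\mu_4}{24\theta^2\mu^2} + \frac{\mu_3^2}{36\theta^2\mu^2} + \frac{\mu_5}{120\theta\mu}\Big)s^4 + O(s^5).$$
Therefore, distributions with large third moment will make $\psi_U(s)$ small and vice versa.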
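The moment expansion can be compared numerically with the exact Laplace transform expression for small s. The following sketch assumes the expansion of $1 - \theta\mu s/((1+\theta)\mu s - 1 + \mathbb{E}(e^{-sX}))$ in powers of s, written with the auxiliary coefficients $a_j = \mu_{j+1}/((j+1)!\,\theta\mu)$ (a notation introduced here only for compactness); the function names and the test distribution $X = \{2, 4\}$ with equal weights (mean 3, variance 1) are illustrative choices.

```python
import math


def psi_U_exact(s, theta, mu, laplace_X):
    """Exact randomized ruin probability from the claim size Laplace transform."""
    return 1 - theta * mu * s / ((1 + theta) * mu * s - 1 + laplace_X(s))


def psi_U_series(s, theta, m):
    """Four-term expansion in s; m[k] = E[X^k] for k = 1..5."""
    a1 = m[2] / (2 * theta * m[1])
    a2 = m[3] / (6 * theta * m[1])
    a3 = m[4] / (24 * theta * m[1])
    a4 = m[5] / (120 * theta * m[1])
    return (a1 * s
            - (a1 ** 2 + a2) * s ** 2
            + (a1 ** 3 + 2 * a1 * a2 + a3) * s ** 3
            - (a1 ** 4 + 3 * a1 ** 2 * a2 + 2 * a1 * a3 + a2 ** 2 + a4) * s ** 4)


theta = 0.5
# 2-point claim {2, 4} with probability 1/2 each: mean 3, variance 1
moments = {k: 0.5 * (2.0 ** k + 4.0 ** k) for k in range(1, 6)}
laplace = lambda s: 0.5 * math.exp(-2.0 * s) + 0.5 * math.exp(-4.0 * s)

s_small = 0.01
exact = psi_U_exact(s_small, theta, moments[1], laplace)
approx = psi_U_series(s_small, theta, moments)
```

For small s the truncated series and the exact value agree up to an error of order $s^5$; the coefficient of $s^2$ shows directly that a larger third moment lowers $\psi_U(s)$ to leading order.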
For 2-point distributions, a simple calculation shows that
$$\mathbb{E}(X^3) = \mu^3 + 3\mu\sigma^2 + \sigma^2\Big(\frac{\sigma^2}{\mu - x_1} - (\mu - x_1)\Big).$$
Thus, for $x_1 \in [0, \mu)$, the third moment is increasing and convex in $x_1$, so the maximum of $\psi_U(s)$ will be at $x_1 = 0$ and the minimum at $x_1 \to \mu$. In fact, for deterministic surplus and 2-point distributions, Kaas (1991) argued that since $\int_0^{\infty} \psi(u)\,du = \mathbb{E}(L)$ does not depend on $x_1$ and $\int_0^{\infty} u\,\psi(u)\,du = \mathbb{E}(L^2)/2$ increases linearly with the third moment of the claim distribution, the ruin probability for small u will be large for $x_1 = 0$.
While these considerations are intuitive, from (4) it becomes clear that for the extremal values of the randomized ruin probability it suffices to minimize (maximize) the Laplace transform of the individual claim sizes, i.e. to find extremal random variables in the Laplace transform order. The Laplace transform order has been introduced by Rolski and Stoyan (1976) to compare waiting times in queuing theory. In actuarial science, Denuit (2001) studied both univariate and multivariate versions of the Laplace transform order and gave several actuarial applications. We can now give sharp bounds for the randomized Schmitter problem for two given moments.

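Proposition 3.1 Let X be a non-negative random variable with mean $\mu$ and variance $\sigma^2$. Then
$$1 - \frac{\theta\mu s}{(1+\theta)\mu s - 1 + e^{-s\mu}} \;\le\; \psi_U(s) \;\le\; 1 - \frac{\theta\mu s}{(1+\theta)\mu s - 1 + \frac{\sigma^2}{\mu^2+\sigma^2} + \frac{\mu^2}{\mu^2+\sigma^2}\,e^{-s(\mu + \sigma^2/\mu)}}.$$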
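Proof Note that $e^{-s\mu}$ is the Laplace transform of a random variable Y degenerate at $\mu$. Moreover, $\frac{\sigma^2}{\mu^2+\sigma^2} + \frac{\mu^2}{\mu^2+\sigma^2}\,e^{-s(\mu + \sigma^2/\mu)}$ is the Laplace transform of a random variable Z with mean $\mu$, variance $\sigma^2$ and such that $\mathbb{P}(Z = 0) = 1 - \mathbb{P}(Z = (\mu^2+\sigma^2)/\mu) = \frac{\sigma^2}{\mu^2+\sigma^2}$. By (4), $\psi_U(s)$ is increasing in the Laplace transform of the individual claim sizes, so that maximizing this Laplace transform maximizes $\psi_U(s)$ and vice versa. It therefore suffices to show that $\mathbb{E}(e^{-sY}) \le \mathbb{E}(e^{-sX}) \le \mathbb{E}(e^{-sZ})$ for all $s > 0$, i.e. that Y and Z are extremal in the Laplace transform order. The proof of the latter can be found in Shaked and Shanthikumar [Ch. 5, Theorem 5.A.21] (2007). ◻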
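The two-moment bounds can be evaluated directly. The sketch below assumes that the lower bound is obtained from the Laplace transform $e^{-s\mu}$ of the degenerate claim and the upper bound from the Laplace transform of the 2-point claim $\{0, (\mu^2+\sigma^2)/\mu\}$, both inserted into $1 - \theta\mu s/((1+\theta)\mu s - 1 + \mathbb{E}(e^{-sX}))$. The function names are hypothetical, and the gamma claim used as a test case (shape 9, rate 3, hence mean 3 and variance 1) is an illustrative choice that does not appear in the text.

```python
import math


def psi_from_laplace(s, theta, mu, lt_value):
    """Randomized ruin probability for a given value of E[exp(-s X)]."""
    return 1 - theta * mu * s / ((1 + theta) * mu * s - 1 + lt_value)


def schmitter_bounds(s, theta, mu, sigma2):
    """Lower and upper bound for psi_U(s) given mean mu and variance sigma2."""
    lt_min = math.exp(-s * mu)                         # degenerate at mu
    w0 = sigma2 / (mu ** 2 + sigma2)                   # mass at 0
    lt_max = w0 + (1 - w0) * math.exp(-s * (mu + sigma2 / mu))
    return (psi_from_laplace(s, theta, mu, lt_min),
            psi_from_laplace(s, theta, mu, lt_max))


theta, mu, sigma2 = 0.5, 3.0, 1.0
s = 1 / 4.5
lower, upper = schmitter_bounds(s, theta, mu, sigma2)

# Two claim distributions with mean 3 and variance 1
lt_two_point = 0.5 * math.exp(-2 * s) + 0.5 * math.exp(-4 * s)   # X = {2, 4}
lt_gamma = (1 + s / 3.0) ** (-9)                                  # Gamma(9, rate 3)
psi_two_point = psi_from_laplace(s, theta, mu, lt_two_point)
psi_gamma = psi_from_laplace(s, theta, mu, lt_gamma)
```

Both test distributions yield ruin probabilities inside the interval `[lower, upper]`, as the proposition requires.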
It is worth noticing that the distribution maximizing the randomized ruin probability coincides with the 2-point distribution that maximizes the ruin probability under deterministic surplus for small values of u. This is rather intuitive, since $\psi_U(s)$ is a weighted average of $\psi(u)$ with a lot of weight on small values of u.
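If more moments of the claim size X in [0, b] are specified, then one can obtain tighter upper and lower bounds for the randomized ruin probability. In view of (4), this reduces to the derivation of bounds for the Laplace transform of X in the moment space $B_S([0, b]; \mu_1, \mu_2, \dots, \mu_m)$ of all risks X with range [0, b] such that $\mathbb{E}(X^k) = \mu_k$ for $k = 1, 2, \dots, m$. Fortunately, our context fits exactly into the framework of Denuit et al. (1998, 1999), see in particular Denuit et al. [Table 1, Table 2] (1999). Moreover, the latter reference also provided the extrema with respect to the order $\le_{m\text{-cx}}$ when not only the first $m - 1$ moments and the support are given, but also when the density function of X is known to be unimodal with a known mode.
Proposition 3.2 Let $X \in [0, b]$, $b > 0$, with moments $(\mu_1, \mu_2, \dots, \mu_m)$, and let $X^{(m)}_{\min}$ and $X^{(m)}_{\max}$ denote the corresponding extremal random variables in this moment space. Then, for all $s > 0$,
$$\min\Big\{\mathbb{E}\big(e^{-sX^{(m)}_{\min}}\big), \mathbb{E}\big(e^{-sX^{(m)}_{\max}}\big)\Big\} \;\le\; \mathbb{E}(e^{-sX}) \;\le\; \max\Big\{\mathbb{E}\big(e^{-sX^{(m)}_{\min}}\big), \mathbb{E}\big(e^{-sX^{(m)}_{\max}}\big)\Big\}, \qquad (8)$$
which can then be translated to bounds for $\psi_U(s)$.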
The bounds for the Laplace transform using m-convex risks were already described in Denuit et al. (2000), extending earlier works of Eckberg (1977), Whitt (1983) and Lefèvre et al. (1986). In particular, Eckberg (1977) derived bounds for the Laplace transform up to the third moment using the theory of Chebychev systems and applied the bounds to problems in queueing and traffic theory. Moreover, the latter reference provides bounds for the case where no upper bound on the support is known. We would also like to mention that, closely related to the theory of m-convex stochastic orders, using Markov-Krein theory and the theory of Chebychev systems, Cox (1984, 1985) obtained similar upper and lower bounds for the expected value of a function of some random variable with given moments. Also, De Vylder and Goovaerts (1982) and Hürlimann (1998) examined related bounding problems.
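Using (8) we can give explicit bounds for the ruin probability with exponentially distributed initial surplus in terms of the given parameters. For reference, we restate here the respective bounds given in Denuit et al. [Table 1, Table 2] (1999) in terms of ruin probabilities when up to 4 moments of X are given:
Case m = 1. If $\mu_1$ is given, then $X^{(1)}_{\min}$ is a random variable degenerate at $\mu_1$, and $X^{(1)}_{\max}$ takes the values 0 and b with probabilities $1 - \mu_1/b$ and $\mu_1/b$. Therefore,
$$1 - \frac{\theta\mu_1 s}{(1+\theta)\mu_1 s - 1 + e^{-s\mu_1}} \;\le\; \psi_U(s) \;\le\; 1 - \frac{\theta\mu_1 s}{(1+\theta)\mu_1 s - \frac{\mu_1}{b}\big(1 - e^{-sb}\big)}.$$
Case m = 2. If $\mu_1$ and $\mu_2$ are given, then $X^{(2)}_{\min}$ has support $\{0, \mu_2/\mu_1\}$ with $\mathbb{P}(X^{(2)}_{\min} = 0) = 1 - \mu_1^2/\mu_2$, and $X^{(2)}_{\max}$ has support $\{b^*, b\}$ with $b^* = (b\mu_1 - \mu_2)/(b - \mu_1)$ and $\mathbb{P}(X^{(2)}_{\max} = b^*) = p^* := (b - \mu_1)^2/\big((b - \mu_1)^2 + \mu_2 - \mu_1^2\big)$. In this case, it can be seen that
$$1 - \frac{\theta\mu_1 s}{(1+\theta)\mu_1 s - 1 + p^* e^{-sb^*} + (1 - p^*)\,e^{-sb}} \;\le\; \psi_U(s) \;\le\; 1 - \frac{\theta\mu_1 s}{(1+\theta)\mu_1 s - 1 + 1 - \frac{\mu_1^2}{\mu_2} + \frac{\mu_1^2}{\mu_2}\,e^{-s\mu_2/\mu_1}}.$$
Note that for $b \to \infty$ the above expressions indeed converge to the bounds given in Proposition 3.1.
Cases m = 3 and m = 4. If the moments up to $\mu_3$ (respectively up to $\mu_4$) are given, the extremal random variables $X^{(m)}_{\min}$ and $X^{(m)}_{\max}$ are those listed in Denuit et al. [Table 1, Table 2] (1999), and the bounds for the ruin probability follow by inserting their Laplace transforms into (4).
As in the previous section, one sees that the ruin probability decreases with increasing third moment. As expected, the bounds are tighter once the knowledge of the second moment is incorporated. Finally, the graph at the bottom depicts the bounds of the generalized randomized Schmitter problem for given four moments of X as a function of $\mu_4$, leading to yet tighter bounds. Note that in this numerical illustration the values of the first three moments were selected in such a way that one finds a feasible set of distribution parameters for all of the distributions in the following numerical illustration. In order to illustrate the performance of the bounds and how they improve with the addition of moment information, we explicitly calculate $\psi_U(s)$ for some chosen claim size distribution in each case, suitably truncated so that it fits into the model setup. In each case, the distribution parameters were determined using the method of moments for the given moment values set in the example above. For further details on claim size distributions and truncation see, for example, Albrecher et al. [Sec.3.3 & Ch.4] (2017).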
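To see how the second moment tightens the interval for bounded claims, the cases m = 1 and m = 2 can be evaluated side by side. This is a minimal sketch under the assumption that the extremal distributions are the degenerate claim at $\mu$ and the 2-point claim on $\{0, b\}$ for m = 1, and the 2-point claims $\{0, 0^*\}$ and $\{b^*, b\}$ for m = 2; the function names and the bound $b = 10$ are hypothetical choices for this illustration.

```python
import math


def psi_from_laplace(s, theta, mu, lt_value):
    """Randomized ruin probability for a given value of E[exp(-s X)]."""
    return 1 - theta * mu * s / ((1 + theta) * mu * s - 1 + lt_value)


def bounds_m1(s, theta, mu, b):
    """Bounds when only the mean of X in [0, b] is known."""
    lt_low = math.exp(-s * mu)                              # degenerate at mu
    lt_high = 1 - mu / b + (mu / b) * math.exp(-s * b)      # 2-point on {0, b}
    return (psi_from_laplace(s, theta, mu, lt_low),
            psi_from_laplace(s, theta, mu, lt_high))


def bounds_m2(s, theta, mu, sigma2, b):
    """Bounds when mean and variance of X in [0, b] are known."""
    b_star = mu - sigma2 / (b - mu)                         # partner point of b
    p_star = (b - mu) ** 2 / ((b - mu) ** 2 + sigma2)       # mass at b_star
    lt_low = p_star * math.exp(-s * b_star) + (1 - p_star) * math.exp(-s * b)
    p_zero = sigma2 / (mu ** 2 + sigma2)                    # mass at 0
    lt_high = p_zero + (1 - p_zero) * math.exp(-s * (mu + sigma2 / mu))
    return (psi_from_laplace(s, theta, mu, lt_low),
            psi_from_laplace(s, theta, mu, lt_high))


theta, mu, sigma2, b = 0.5, 3.0, 1.0, 10.0
s = 1 / 4.5
low1, high1 = bounds_m1(s, theta, mu, b)
low2, high2 = bounds_m2(s, theta, mu, sigma2, b)
```

The m = 2 interval is nested inside the m = 1 interval, and for large b its lower end approaches the bound obtained from the degenerate claim.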

Numerical Illustrations
The results are given in Fig. 3a where the exact ruin probabilities obtained using (4) together with the general bounds are plotted as a function of the expected initial surplus $\mathbb{E}(U) = 1/s$ for the same set of parameters as above. In particular, b was selected in such a way that no strong truncation effect is present in the distributions. One sees that, for fixed $\mu_1$ only, the truncated exponential case is nicely between the sharp bounds. However, these bounds are very wide. When information about the second moment of X is included, the tightness of the bounds improves significantly. From (5), one would expect that to be the case only for small values of s where information about the first two moments provides a good approximation for the ruin probability. However, we can see that even for large values of s the improvement is considerable. The tightness of the interval for possible ruin probabilities becomes even more remarkable when the first three moments are fixed. This illustrates that in the context of ruin probabilities, the knowledge of the first few moments of the claim size distribution already provides a very accurate approximation. In a broader statistical context, for an account of reconstructions of arbitrary distributions from given moments, see e.g. Mnatsakanov (2008). Finally, for recent progress on the general probability level concerning criteria of moment-determinacy of distributions, see Yarovaya et al. (2020).

Remark 3.1
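All results from this section can easily be generalized to the case where the initial surplus is assumed to be a mixture of exponential random variables. Indeed, consider a random initial surplus O with density
$$f_O(u) = \sum_{i=1}^{n} q_i k_i e^{-k_i u}, \quad u \ge 0, \qquad (9)$$
with $0 < q_i < 1$, $\sum_{i=1}^{n} q_i = 1$ and $k_i > 0$ for $i = 1, \dots, n$. Then
$$\psi_O := \mathbb{P}(L > O) = \sum_{i=1}^{n} q_i\,\psi_U(k_i). \qquad (10)$$
Since Proposition 3.2 applies to any value $s > 0$, for every given set of m moment constraints and all $k_i > 0$, $i = 1, \dots, n$, the same claim size distribution is extremal for every $\psi_U(k_i)$, and we obtain
$$\min_X \psi_O = \sum_{i=1}^{n} q_i \min_X \psi_U(k_i), \qquad \max_X \psi_O = \sum_{i=1}^{n} q_i \max_X \psi_U(k_i),$$
where the extrema are taken over the moment space. Consequently, when the initial surplus is a mixture of exponential random variables, the lower and upper bounds for the expected ruin probability are linear combinations of the respective lower and upper bounds for the ruin probability with exponentially distributed initial surplus.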
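The mixture case reduces to a weighted sum of exponential cases, which the following sketch makes explicit. It assumes the representation of the mixed ruin probability as $\sum_i q_i\,\psi_U(k_i)$ together with the two-moment extremal Laplace transforms for non-negative claims; the weights, rates and function names are hypothetical values chosen for illustration.

```python
import math


def psi_from_laplace(s, theta, mu, lt_value):
    """Randomized ruin probability for a given value of E[exp(-s X)]."""
    return 1 - theta * mu * s / ((1 + theta) * mu * s - 1 + lt_value)


def psi_mixture(weights, rates, theta, mu, laplace_X):
    """Ruin probability when the initial surplus is a mixture of exponentials."""
    return sum(q * psi_from_laplace(k, theta, mu, laplace_X(k))
               for q, k in zip(weights, rates))


theta, mu, sigma2 = 0.5, 3.0, 1.0
weights = [0.3, 0.7]          # q_i, summing to one
rates = [1 / 1.5, 1 / 9.0]    # k_i, exponential parameters of the components

w0 = sigma2 / (mu ** 2 + sigma2)
lt_degenerate = lambda s: math.exp(-s * mu)                        # yields the lower bound
lt_zero_star = lambda s: w0 + (1 - w0) * math.exp(-s * (mu + sigma2 / mu))  # upper bound
lt_two_point = lambda s: 0.5 * math.exp(-2 * s) + 0.5 * math.exp(-4 * s)    # X = {2, 4}

lower = psi_mixture(weights, rates, theta, mu, lt_degenerate)
upper = psi_mixture(weights, rates, theta, mu, lt_zero_star)
value = psi_mixture(weights, rates, theta, mu, lt_two_point)
```

Because the same extremal claim distribution applies at every rate $k_i$, the weighted sums of the componentwise bounds enclose the mixture value.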
Note that for obtaining these sharp bounds, one still needed to reduce the expressions to purely exponential components so that the bounds on Laplace transforms can be used. For more general assumptions on U (like a general phase-type distribution) that link cannot be carried over in such a direct way. In the next section, we will, however, study the case of Erlang(k) distributed U in more detail, which is of particular interest, as for large k a deterministic initial surplus level can be approximated.

Erlang Distributed Initial Surplus
A natural extension of exponentially distributed random surplus is now to consider Erlang distributed initial surplus. Concretely, consider U to be an Erlang(k, s) random variable $E_k$ with density
$$f_{E_k}(u) = \frac{s^k u^{k-1} e^{-su}}{(k-1)!}, \quad u > 0.$$
We then get
$$\psi_E(k, s) := \mathbb{P}(L > E_k) = \int_0^{\infty} \psi(u)\,f_{E_k}(u)\,du = \frac{(-1)^{k-1} s^k}{(k-1)!}\,\frac{\partial^{k-1}}{\partial s^{k-1}}\hat\psi(s) = 1 - \frac{(-1)^{k-1} s^k}{(k-1)!}\,\frac{\partial^{k-1}}{\partial s^{k-1}}\hat\varphi(s).$$
Here $\hat\varphi(s) = 1/s - \hat\psi(s)$ denotes the Laplace transform of the survival probability of the classical Cramér-Lundberg risk process and we observe that its derivatives w.r.t. the Laplace argument lead to an explicit expression for the case of random Erlang-distributed initial surplus. We focus here on the classical Schmitter setting with fixed mean and variance of the claim size distribution. In Fig. 4 we depict the ruin probabilities for Erlang(k, s) distributed initial surplus for 2-point distributions as a function of $x_1$ for a given mean and variance, for two expected surplus levels.
In contrast to the exponential case (k = 1), there is unfortunately no direct relation between the optimization problem and the minimization (maximization) of the Laplace transform of the ruin probability. What we obtain is in fact an expression in terms of its (k − 1)-th derivative (with $\partial^0/\partial s^0\,\hat\psi(s) = \hat\psi(s)$). For example, for k = 2 we get
$$\psi_E(2, s) = -s^2\,\hat\psi'(s) = 1 - \frac{\theta\mu s^2\big((1+\theta)\mu + \hat f_X'(s)\big)}{\big((1+\theta)\mu s - 1 + \hat f_X(s)\big)^2},$$
where $\hat f_X(s) = \mathbb{E}(e^{-sX})$ denotes the Laplace transform of X. Thus, for a fixed parameter s, in order to maximize our ruin probability, we need to minimize an expression that depends on both the Laplace transform of X and its first derivative.
Since the variance of an Erlang distribution with fixed mean goes to 0 as $k \to \infty$, one particular motivation to consider Erlang distributed initial surplus is as a tool to approximate the case of deterministic initial surplus, as in fact one has $\psi_E(k, k/u) \to \psi(u)$ as $k \to \infty$. The approximation $\psi(u) \approx \psi_E(k, k/u)$, or Erlang smoothing, was considered in Asmussen et al. (2002) as a numerical scheme to approximate the finite horizon ruin probability by replacing the deterministic time horizon T by a "standardized" Erlang(k, k/T) random variable, which for $k \to \infty$ becomes exact (see Asmussen and Albrecher [Ch.IX.8] (2009) for a more general discussion, as well as Stanford et al. (2005), Carr (1998) and Kyprianou and Pistorius (2003) for applications of this approach to other fields). Concerning the convergence rate with increasing k, for our context of random initial surplus one can adapt Theorem 6 of Asmussen et al. (2002) in a straightforward way to obtain the following result:
Proposition 4.1 Let u > 0 be the expected initial surplus and let $E_k$ denote the Erlang distribution with shape parameter k and mean u. Then $\psi_E(k, k/u) \to \psi(u)$ as $k \to \infty$. More precisely, for some constant C,
$$\psi_E(k, k/u) - \psi(u) = \frac{C}{k} + O(k^{-2}).$$
As already suggested in Asmussen et al. (2002), a further improvement of accuracy for fixed k can be obtained by Richardson extrapolation. This is a general method (see e.g. Press et al. (2007) for details) for computing an abstract quantity y (it could be an integral, a derivative, etc.)
accurately using a sequence $y_k \to y$ for which the convergence rate is known,
$$y_k = y + c_1 k^{-1} + O(k^{-2}),$$
where $c_1$ is typically unknown but can be eliminated. In fact, setting $\tilde y_k = (k + 1)\,y_{k+1} - k\,y_k$, the term $c_1 k^{-1}$ cancels, we get that $\tilde y_k \to y$ and one obtains an improved approximation of convergence rate $O(k^{-2})$.
Translated into our context, we then get
$$\psi(u) \approx (k + 1)\,\psi_E\big(k + 1, (k + 1)/u\big) - k\,\psi_E(k, k/u)$$
with an error rate of order $1/k^2$.
For an illustration of the method, consider the same example as in Fig. 1, namely the set of 2-point distributions with mean $\mu = 3$, variance $\sigma^2 = 1$, and safety loading $\theta = 0.5$. Figure 5 shows the results of the approximation. One observes that the approximation of the deterministic case via the randomized initial surplus is quite satisfactory already for k = 11. The numerical approximation works well even for intermediate values of the initial surplus for which the ruin probability (and its kink) is difficult to approximate. In order to also reproduce the particular shape of that curve, higher values of k are however needed. It is worth noting the tremendous improvement when employing Richardson extrapolation for larger values of u (cf. the graph for u = 9).
Remark 4.1 Analogous to the exponential initial surplus case, one can obtain an expression for the ruin probability in terms of moments of L. Concretely,
$$\psi_E(k, s) = \sum_{n=0}^{\infty} \frac{(-1)^n s^{k+n}}{(k-1)!\,n!\,(k+n)}\,\mathbb{E}(L^{k+n}).$$
However, in the Erlang(k) case already the first two terms of this series need the (k + 2)-th moment of X. This is unfortunate, as the deterministic $\psi(u)$ will only be obtained for $k \to \infty$, and we see that even in this simple approximation higher-order moments of X already play a crucial role. This is in particular the case for moderate values of $\mathbb{E}(U)$, and in those cases we have indeed seen in the graphs above that a good approximation of the deterministic case needed large values of k.
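The effect of Erlangization and Richardson extrapolation can be reproduced in a case where everything is available in closed form. The sketch below is an illustration under an assumption that is not part of the Schmitter setting: it uses exponential claims with mean $\mu$, for which $\psi(u) = \frac{1}{1+\theta} e^{-\rho u}$ with $\rho = \theta/((1+\theta)\mu)$, so that averaging over an Erlang(k, k/u) surplus gives $\frac{1}{1+\theta}(1 + \rho u/k)^{-k}$. The function names are hypothetical.

```python
import math


def psi_exact(u, theta, mu):
    """Classical ruin probability for exponential claims with mean mu."""
    rho = theta / ((1 + theta) * mu)
    return math.exp(-rho * u) / (1 + theta)


def psi_erlang(k, u, theta, mu):
    """Ruin probability for Erlang(k, k/u) initial surplus (mean u)."""
    rho = theta / ((1 + theta) * mu)
    s = k / u
    return (s / (s + rho)) ** k / (1 + theta)


def psi_richardson(k, u, theta, mu):
    """Richardson extrapolation (k+1) y_{k+1} - k y_k of the Erlang values."""
    return ((k + 1) * psi_erlang(k + 1, u, theta, mu)
            - k * psi_erlang(k, u, theta, mu))


theta, mu, u, k = 0.5, 3.0, 9.0, 10
target = psi_exact(u, theta, mu)
err_erlang = abs(psi_erlang(k + 1, u, theta, mu) - target)
err_richardson = abs(psi_richardson(k, u, theta, mu) - target)
```

In this example the plain Erlang(11) approximation has an error of about 0.011, while the extrapolated value built from k = 10 and k = 11 is accurate to about 0.0004, in line with the improvement from order $1/k$ to order $1/k^2$.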
Remark 4.2 When the goal is to approximate a deterministic initial surplus level, combinations of exponentials (i.e. densities of the form (9) but with $q_i \in \mathbb{R}$, $\sum_{i=1}^{n} q_i = 1$) could a priori also be candidates for U, as that class is dense in the class of all distributions on the positive halfline, see e.g. Dufresne (2007). Unfortunately, apart from the fact that an enormous number n will be needed for a reasonable approximation of a deterministic u, the differing signs of $q_i$ in (10) then also do not allow one to identify extremal distributions as in Section 3.

Conclusion
In this paper we showed how randomization can be used to provide a solution to the Schmitter problem in ruin theory and its extension to higher moments. Linking this problem with established results in the theory of m-convex stochastic orders, we provided sharp bounds for the ruin probability under the assumption of an exponential initial surplus. For the more general case of Erlang distributed initial surplus, such analytical sharp bounds are not within reach. However, we showed how the deterministic classical case can be approximated by the simple expressions of the randomized case using Erlangization.