Goodness-of-Fit Procedures for Compound Distributions with an Application to Insurance

Abstract

Goodness-of-fit procedures are introduced for testing the validity of compound models. New tests that utilize the Laplace transform as well as classical tests based on the distribution function are investigated. A major area of application of compound laws is in insurance, to model total claims resulting from specific claim frequencies and individual claim sizes. Monte Carlo simulations are used to compare the different test procedures under a variety of specifications for these two components of total claims. A detailed application to an insurance dataset is presented.

References

  1. Allison J, Santana L (2015) On a data-dependent choice of the tuning parameter appearing in certain goodness-of-fit tests. J Stat Comput Simul 85(16):3276–3288. https://doi.org/10.1080/00949655.2014.968781

  2. Allison JS, Santana L, Smit N, Visagie IJH (2017) An ‘apples to apples’ comparison of various tests for exponentiality. Comput Stat 32(4):1241–1283. https://doi.org/10.1007/s00180-017-0733-3

  3. Beirlant J, Goegebeur Y, Segers J, Teugels J (2006) Statistics of extremes: theory and applications. Wiley

  4. Besbeas P, Morgan BJT (2004) Efficient and robust estimation for the one-sided stable distribution of index 1/2. Stat Probab Lett 66(3):251–257. https://doi.org/10.1016/j.spl.2003.10.013

  5. Charpentier A (2014) Computational actuarial science with R. CRC Press, Boca Raton

  6. Chaudhury M (2010) A review of the key issues in operational risk capital modeling. J Oper Risk 5(3):37

  7. Choulakian V, Lockhart R, Stephens M (1994) Cramér-von Mises statistics for discrete distributions. Can J Stat 22(1):125–137

  8. Conover W (1972) A Kolmogorov goodness-of-fit test for discontinuous distributions. J Am Stat Assoc 67(339):591–596

  9. D’Agostino RB, Stephens MA (1986) Goodness-of-fit techniques. Statistics: textbooks and monographs, vol 68. Marcel Dekker, New York

  10. Dimitrova DS, Kaishev V, Tan S (2017) Computing the Kolmogorov-Smirnov distribution when the underlying cdf is purely discrete, mixed or continuous. Available at http://openaccess.city.ac.uk/18541/

  11. Escanciano J (2009) On the lack of power of omnibus specification tests. Economet Theor 25:162–194

  12. Feller W (2008) An introduction to probability theory and its applications, vol 2. Wiley, New York

  13. Genest C, Rémillard B (2008) Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Ann Inst Henri Poincaré Probab Stat 44(6):1096–1127

  14. Ghosh S, Beran J (2006) On estimating the cumulant generating function of linear processes. Ann Inst Stat Math 58(1):53–71. https://doi.org/10.1007/s10463-005-0009-5

  15. Giacomini R, Politis D, White H (2013) A warp-speed method for conducting Monte Carlo experiments involving bootstrap. Economet Theor 29(3):567–589

  16. Gleser L (1985) Exact power of goodness-of-fit tests of Kolmogorov type for discontinuous distributions. J Am Stat Assoc 80(392):954–958

  17. Goffard PO (2019) Online accompaniment for "Goodness-of-fit tests for compound distributions with applications in insurance". Available at https://github.com/LaGauffre/Online_accoompaniement_GOF_Test_Compound_Distribution

  18. Henze N (1992) A new flexible class of omnibus tests for exponentiality. Commun Stat Theor Methods 22(1):115–133. https://doi.org/10.1080/03610929308831009

  19. Henze N (1996) Empirical-distribution-function goodness-of-fit tests for discrete models. Can J Stat 24(1):81–93

  20. Henze N, Klar B (2002) Goodness-of-fit tests for the inverse Gaussian distribution based on the empirical Laplace transform. Ann Inst Stat Math 54(2):425–444. https://doi.org/10.1023/A:1022442506681

  21. Henze N, Meintanis SG (2002) Tests of fit for exponentiality based on the empirical Laplace transform. Statistics 36(2):147–161. https://doi.org/10.1080/02331880212042

  22. Janssen A (2000) Global power function of goodness-of-fit tests. Ann Stat 28:239–253

  23. Katz L (1965) Unified treatment of a broad class of discrete probability distributions. In: Classical and contagious discrete distributions. Pergamon Press, Oxford

  24. Lockhart R, Spinelli J, Stephens M (2007) Cramér-von Mises statistics for discrete distributions with unknown parameters. Can J Stat 35(1):125–133

  25. Meintanis S, Iliopoulos G (2003) Tests of fit for the Rayleigh distribution based on the empirical Laplace transform. Ann Inst Stat Math 55(1):137–151. https://doi.org/10.1007/BF02530490

  26. Milošević B, Obradović M (2016) New class of exponentiality tests based on U-empirical Laplace transform. Stat Pap 57(4):977–990. https://doi.org/10.1007/s00362-016-0818-z

  27. Noether G (1963) Note on the Kolmogorov statistic in the discrete case. Metrika 7(1):115–116

  28. Panjer HH (1981) Recursive evaluation of a family of compound distributions. ASTIN Bull 12(1):22–26. https://doi.org/10.1017/S0515036100006796

  29. Pril ND (1986) Moments of a class of compound distributions. Scand Actuar J 2:117–120. https://doi.org/10.1080/03461238.1986.10413800

  30. Schmid P (1958) On the Kolmogorov and Smirnov limit theorems for discontinuous distribution functions. Ann Math Stat 29(4):1011–1027

  31. Slakter MJ (1965) A comparison of the Pearson chi-square and Kolmogorov goodness-of-fit tests with respect to validity. J Am Stat Assoc 60(311):854–858

  32. Spinelli J (2001) Testing fit for the grouped exponential distribution. Can J Stat 29(3):451–458

  33. Spinelli J, Stephens M (1997) Cramér-von Mises tests of fit for the Poisson distribution. Can J Stat 25(2):257–268

  34. Stute W, Manteiga WG, Quindimil MP (1993) Bootstrap based goodness-of-fit-tests. Metrika 40(1):243–256

  35. Sundt B, Jewell WS (1981) Further results on recursive evaluation of compound distributions. ASTIN Bull 12(1):27–39

  36. Tenreiro C (2019) On the automatic selection of the tuning parameter appearing in certain families of goodness-of-fit tests. J Stat Comput Simul 89(10):1780–1797. https://doi.org/10.1080/00949655.2019.1598409

  37. Thas O (2010) Comparing distributions. Springer

  38. van der Vaart AW, Wellner JA (1996) Weak convergence. In: Weak Convergence and Empirical Processes, pp. 16–28. Springer New York. https://doi.org/10.1007/978-1-4757-2545-2_3

  39. Walsh JE (1963) Bounded probability properties of Kolmogorov-Smirnov and similar statistics for discrete data. Ann Inst Stat Math 15(1):153–158

Acknowledgements

The authors thank an anonymous referee for helpful comments and suggestions that helped improve the original manuscript. This work was initiated while Pierre-Olivier Goffard and Simos Meintanis were visiting the department of Statistics and Applied Probability at UC Santa Barbara. The authors are grateful for the warm welcome they received there. Pierre-Olivier Goffard’s work is conducted within the Research Chair DIALog under the aegis of the Risk Foundation, an initiative by CNP Assurances.

Author information

Corresponding author

Correspondence to P.-O. Goffard.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances in Probability and Statistics: an Issue in Memory of Theophilos Cacoullos” guest edited by Narayanaswamy Balakrishnan, Charalambos A. Charalambides, Tasos Christofides, Markos Koutras, and Simos Meintanis.

Appendix: Consistency and Limit Null Distribution

In this section, we discuss the consistency and limiting distribution of the LT-based test statistics under the null hypothesis \(H_0\). We focus our attention on the test criterion \(S_{n,w}\) defined in (28), but note that similar results may be obtained for the test statistic \(T_{n,w}\). We begin with the consistency of the test based on \(S_{n,w}\) under the following assumptions:

  (A.1) The estimator satisfies \({{\widehat{\vartheta }}}\rightarrow {\widetilde{\vartheta }}\), a.s., as \(n\rightarrow \infty \), for some \({\widetilde{\vartheta }} \in \Theta \), with \({\widetilde{\vartheta }}\equiv \vartheta _0\) when the null hypothesis \(H_0\) is true, with \(\vartheta _0\) being the true value.

  (A.2) The LT \(L^X_0(\cdot ;\vartheta )\) is continuous in \( \vartheta \).

  (A.3) The weight function satisfies

    (i) \(w(t)>0\) for all \(t>0\), except on a set of measure zero,

    (ii) \(\int _0^\infty w(t) \mathrm{{d}} t<\infty \).

Theorem 1

Let \(L^X(t)\) denote the LT of X. Then if assumptions (A.1) to (A.3) are satisfied,

$$\begin{aligned} \frac{S_{n,w}}{n} \rightarrow \int _0^\infty \left( L^{X}(t)-L^{X}_0(t;{\widetilde{\vartheta }})\right) ^2 w(t) \mathrm{{d}} t, \end{aligned}$$
(45)

a.s., as \(n\rightarrow \infty \).

Proof

The strong consistency of the empirical Laplace transform, together with (A.1) and the continuity of \(L^X_0(\cdot ;\vartheta )\) in \(\vartheta \) assumed in (A.2), implies that, for each \(t>0\),

$$\begin{aligned} \mathrm{{SE}}_{n}(t) \rightarrow L^X(t)-L^X_0(t;\widetilde{\vartheta }), \end{aligned}$$

a.s., as \(n\rightarrow \infty \). Since \(\mathrm{{SE}}^2_n(t)\le 4\) and \(w\) is integrable by (A.3)(ii), the result follows from Lebesgue’s dominated convergence theorem. \(\square \)

The right-hand side of (45) is positive unless \(L^X(t)=L^X_0(t;\widetilde{\vartheta })\), for all \(t>0\). However, by the uniqueness of the LT, the last identity holds true only under the null hypothesis \(H_0\), in which case \({\widetilde{\vartheta }} \equiv \vartheta _0\), thus implying the strong consistency of the test that rejects \(H_0\) for large values of \(S_{n,w}\).
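The convergence in (45) can be illustrated numerically. The following is a minimal sketch under assumptions that are ours and not taken from the paper: the null model is a compound Poisson law with exponential claim sizes, whose LT is \(\exp (-\lambda t/(\mu +t))\); the parameters are fitted by the method of moments; the weight is \(w(t)=e^{-at}\); and the integral is approximated by Gauss-Laguerre quadrature. All function names are hypothetical.

```python
import numpy as np

def empirical_lt(x, t):
    """Empirical Laplace transform L_n(t) = mean_j exp(-t * X_j) on a grid t."""
    return np.exp(-np.outer(t, x)).mean(axis=1)

def null_lt(t, lam, mu):
    """LT of a Poisson(lam) sum of Exp(rate mu) claims: exp(-lam * t / (mu + t))."""
    return np.exp(-lam * t / (mu + t))

def moment_estimates(x):
    """Method-of-moments fit (an assumption), from E[X] = lam/mu, Var[X] = 2*lam/mu**2."""
    m, v = x.mean(), x.var()
    mu_hat = 2.0 * m / v
    return m * mu_hat, mu_hat  # (lam_hat, mu_hat)

def scaled_statistic(x, a=1.0, deg=40):
    """Approximate S_{n,w}/n = int (L_n - L_0)^2 w(t) dt for w(t) = exp(-a*t)."""
    lam_hat, mu_hat = moment_estimates(x)
    # Gauss-Laguerre nodes/weights approximate int f(u) exp(-u) du; substitute t = u/a.
    u, q = np.polynomial.laguerre.laggauss(deg)
    t = u / a
    diff = empirical_lt(x, t) - null_lt(t, lam_hat, mu_hat)
    return float(np.sum(q * diff**2) / a)

def simulate_null(rng, n, lam, mu):
    """Draw n totals from the compound Poisson-exponential null model."""
    counts = rng.poisson(lam, size=n)
    claims = rng.exponential(1.0 / mu, size=counts.sum())
    owner = np.repeat(np.arange(n), counts)  # index of the total each claim belongs to
    return np.bincount(owner, weights=claims, minlength=n)

def demo(n=50_000, seed=0):
    """Return S_{n,w}/n under the null and under a unit-claim-size alternative."""
    rng = np.random.default_rng(seed)
    x_null = simulate_null(rng, n, lam=2.0, mu=1.0)
    x_alt = rng.poisson(2.0, size=n).astype(float)  # every claim has size 1
    return scaled_statistic(x_null), scaled_statistic(x_alt)

if __name__ == "__main__":
    print(demo())
```

Under the null sample the scaled statistic should be close to zero, while under the alternative it should settle near a strictly positive value, which is the behaviour that drives the consistency argument.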

We continue with the limit distribution of the test statistic \(S_{n,w}\) under the null hypothesis \(H_0\). For simplicity, we assume that \(\vartheta \) is a scalar parameter. To this end assume that

  (A.4) The estimator \({{\widehat{\vartheta }}}:={{\widehat{\vartheta }}}_n\) satisfies the Bahadur representation

    $$\begin{aligned} {{\widehat{\vartheta }}}_n-\vartheta _0=\frac{1}{n}\sum _{j=1}^n \ell (X_j;\vartheta _0)+o_P(n^{-1/2}), \end{aligned}$$

    where \(\ell (\cdot ;\cdot )\) is such that \({\mathbb {E}}(\ell (X;\vartheta _0))=0\) and \({\mathbb {E}}(\ell ^2(X;\vartheta _0))<\infty \).

  (A.5) The LT \(L_0^X(t;\vartheta )\) is twice differentiable with respect to \(\vartheta \), with a continuous second derivative in a neighborhood of the true value \(\vartheta _0\).

  (A.6) The weight function is such that

    $$\begin{aligned} \int _{0}^\infty \left( \frac{\partial L_0^X(t;\vartheta _0)}{\partial \vartheta }\right) ^2 w(t) \mathrm{{d}}t<\infty , \end{aligned}$$

    and

    $$\begin{aligned} \int _{0}^\infty \left( \frac{\partial ^2 L_0^X(t;\vartheta )}{\partial \vartheta ^2}\right) ^2_{\vartheta =\vartheta ^*} w(t) \mathrm{{d}}t<\infty , \end{aligned}$$

    for all \(\vartheta ^*\) in a neighborhood of \(\vartheta _0\).
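As an illustration of (A.4) only, consider the simplifying case, assumed here and not a model treated in the paper, where \(X\) is exponential with rate \(\vartheta _0\) and \({{\widehat{\vartheta }}}_n=1/{\bar{X}}_n\) is the maximum likelihood estimator, with \({\bar{X}}_n\) the sample mean. A first-order expansion of \(x\mapsto 1/x\) around \(1/\vartheta _0\) gives

$$\begin{aligned} {{\widehat{\vartheta }}}_n-\vartheta _0=-\vartheta _0^2\left( {\bar{X}}_n-\frac{1}{\vartheta _0}\right) +O_P(n^{-1})=\frac{1}{n}\sum _{j=1}^n \ell (X_j;\vartheta _0)+O_P(n^{-1}), \qquad \ell (x;\vartheta )=\vartheta -\vartheta ^2 x. \end{aligned}$$

Here \({\mathbb {E}}(\ell (X;\vartheta _0))=\vartheta _0-\vartheta _0^2/\vartheta _0=0\) and \({\mathbb {E}}(\ell ^2(X;\vartheta _0))=\vartheta _0^4\,\mathrm{{Var}}(X)=\vartheta _0^2<\infty \), and the remainder is negligible after multiplication by \(\sqrt{n}\), so the representation required in (A.4) holds in this case.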

Theorem 2

Under assumptions (A.1) to (A.6) we have under \(H_0\),

$$\begin{aligned} Z_n(t)=\sqrt{n} (L^X_n(t)-L_0^X(t;{{\widehat{\vartheta }}}_n)){\mathop {\longrightarrow }\limits ^{{{{\mathcal {L}}}}}}Z(t), \end{aligned}$$

as \(n\rightarrow \infty \), where Z(t) is the zero-mean Gaussian process with covariance kernel \(K(s,t;\vartheta _0)={\mathbb {E}}(Y(t;\vartheta _0)Y(s;\vartheta _0))\) with

$$\begin{aligned} Y(t;\vartheta )=e^{-tX}-L_0^X(t;\vartheta )-\frac{\partial L_0^X(t;\vartheta )}{\partial \vartheta }\ell (X;\vartheta ). \end{aligned}$$

The covariance kernel is specified by

$$\begin{aligned} K(s,t;\vartheta )&= L_0^X(t+s;\vartheta )-L_0^X(t;\vartheta )L_0^X(s; \vartheta ) \\ &\quad - \frac{\partial L_0^X(s;\vartheta )}{\partial \vartheta } {\mathbb {E}}\left( e^{-tX} \ell (X;\vartheta ) \right) -\frac{\partial L_0^X(t;\vartheta )}{\partial \vartheta } {\mathbb {E}}\left( e^{-sX} \ell (X;\vartheta ) \right) \\ &\quad + \frac{\partial L_0^X(s;\vartheta )}{\partial \vartheta } \frac{\partial L_0^X(t;\vartheta )}{\partial \vartheta } {\mathbb {E}}(\ell ^2(X;\vartheta )). \end{aligned}$$
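This expression can be checked by expanding the product \(Y(t;\vartheta )Y(s;\vartheta )\). Writing, as shorthand introduced only for this check, \(a(t)=e^{-tX}-L_0^X(t;\vartheta )\) and \(b(t)=\partial L_0^X(t;\vartheta )/\partial \vartheta \), so that \(Y(t;\vartheta )=a(t)-b(t)\ell (X;\vartheta )\), and using \({\mathbb {E}}(e^{-tX})=L_0^X(t;\vartheta )\) under \(H_0\) together with \({\mathbb {E}}(\ell (X;\vartheta ))=0\), we get

$$\begin{aligned} {\mathbb {E}}(a(t)a(s))&= L_0^X(t+s;\vartheta )-L_0^X(t;\vartheta )L_0^X(s;\vartheta ), \\ {\mathbb {E}}(a(t)\ell (X;\vartheta ))&= {\mathbb {E}}\left( e^{-tX}\ell (X;\vartheta )\right) , \\ {\mathbb {E}}(Y(t;\vartheta )Y(s;\vartheta ))&= {\mathbb {E}}(a(t)a(s))-b(s){\mathbb {E}}(a(t)\ell (X;\vartheta ))-b(t){\mathbb {E}}(a(s)\ell (X;\vartheta ))+b(s)b(t){\mathbb {E}}(\ell ^2(X;\vartheta )). \end{aligned}$$

Substituting the first two lines into the third recovers the four groups of terms in the kernel.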

Proof

Throughout the proof we write \(Z^{(1)}_n\approx Z^{(2)}_n\) if the two random processes \(Z^{(k)}_n(t)\), \(k=1,2\), satisfy \(Z^{(1)}_n(t)-Z^{(2)}_n(t)=\varepsilon _n(t)\), where the remainder \(\varepsilon _n(t)\) has no effect on the limit null distribution of the test statistic \(S_{n,w}\).

With this understanding, and using assumption (A.5) and the second part of (A.6), a two-term Taylor expansion yields

$$\begin{aligned} Z_n\approx Z^*_n, \end{aligned}$$

where

$$\begin{aligned} Z^*_n(t)=\sqrt{n} \left( L^X_n(t)-L_0^X(t;\vartheta _0)\right) -\sqrt{n}\left( {{\widehat{\vartheta }}}_n-\vartheta _0\right) \frac{\partial L_0^X(t;\vartheta _0)}{\partial \vartheta }. \end{aligned}$$
(46)

In turn, using assumption (A.4) and the first part of (A.6) in (46) leads to

$$\begin{aligned} Z^*_n\approx Z^{**}_n, \end{aligned}$$

where

$$\begin{aligned} Z^{**}_n(t)=\sqrt{n} \left( L^X_n(t)-L_0^X(t;\vartheta _0)\right) -\frac{\partial L_0^X(t;\vartheta _0)}{\partial \vartheta } \frac{1}{\sqrt{n}}\sum _{j=1}^n \ell (X_j;\vartheta _0). \end{aligned}$$
(47)

The result now follows by applying the Central Limit Theorem in Hilbert spaces (see, e.g., van der Vaart and Wellner [38], p. 50) to the process \(Z^{**}_n(t)\) given in (47). \(\square \)

Now the limit distribution of the test statistic follows from Theorem 2 and the Continuous Mapping theorem. Specifically we have

$$\begin{aligned} S_{n,w}=\int _0^\infty Z^2_n(t) w(t) \mathrm{{d}}t {\mathop {\longrightarrow }\limits ^{{{{\mathcal {L}}}}}}\int _0^\infty Z^2(t) w(t) \mathrm{{d}}t:=Z_w \end{aligned}$$

where \(Z(t)\) is the process defined in Theorem 2. The distribution of \(Z_w\) is the same as that of \(\sum _{j=1}^\infty \lambda _j N^2_j\), where \(\lambda _1,\lambda _2,\ldots \) are the eigenvalues of the integral operator

$$\begin{aligned} Ag(s)=\int _0^\infty K(s,t;\vartheta _0) g(t) w(t) \mathrm{{d}}t, \end{aligned}$$

i.e. the solutions of the equation \(Ag(s)=\lambda g(s)\), and where \(N_j\), \(j\ge 1\), are iid standard normal variates.
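The eigenvalue characterisation suggests one way to approximate the limit law numerically. The sketch below rests on simplifying assumptions of ours: the null parameters are treated as known, so the kernel reduces to \(L_0^X(s+t)-L_0^X(s)L_0^X(t)\) and the estimation terms of \(K\) are omitted; the null is the compound Poisson-exponential law with LT \(\exp (-\lambda t/(\mu +t))\); and \(w(t)=e^{-at}\). The operator is discretised on Gauss-Laguerre nodes (a Nyström-type approximation), and all names are hypothetical.

```python
import numpy as np

def null_lt(t, lam=2.0, mu=1.0):
    """LT of a Poisson(lam) sum of Exp(rate mu) claims (assumed null model)."""
    return np.exp(-lam * t / (mu + t))

def limit_eigenvalues(lt, a=1.0, deg=40):
    """Approximate eigenvalues of g -> int K(s,t) g(t) exp(-a*t) dt."""
    u, q = np.polynomial.laguerre.laggauss(deg)
    t, q = u / a, q / a  # now sum(q * f(t)) approximates int f(t) exp(-a*t) dt
    s_col, t_row = t[:, None], t[None, :]
    # Known-parameter kernel only: the estimation terms of K are left out here.
    K = lt(s_col + t_row) - lt(s_col) * lt(t_row)
    # Symmetrised discretisation: same eigenvalues as K @ diag(q).
    root_q = np.sqrt(q)
    M = root_q[:, None] * K * root_q[None, :]
    eig = np.linalg.eigvalsh(M)[::-1]  # sorted from largest to smallest
    trace_integral = float(np.sum(q * np.diag(K)))  # approximates int K(t,t) w(t) dt
    return eig, trace_integral

def simulate_limit(eig, reps=20_000, seed=0):
    """Draw from sum_j lambda_j * N_j**2 using the approximated eigenvalues."""
    rng = np.random.default_rng(seed)
    lam = np.clip(eig, 0.0, None)  # drop tiny negative values due to rounding
    return rng.standard_normal((reps, lam.size)) ** 2 @ lam

if __name__ == "__main__":
    eig, trace_integral = limit_eigenvalues(null_lt)
    draws = simulate_limit(eig)
    print(eig[:5], trace_integral, np.quantile(draws, 0.95))
```

An upper quantile of the simulated draws can serve as an approximate critical value in this simplified setting. When parameters are estimated, the full kernel \(K(s,t;\vartheta _0)\) of Theorem 2 would have to replace the known-parameter kernel used here.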

Remark 71

The assumptions (A.1)–(A.3) made in order to prove consistency, as well as those pertaining to the limit null distribution, (A.4)–(A.6), are standard in the context of testing goodness-of-fit based on the empirical LT; see for instance Henze [18], Henze and Klar [20], and Henze and Meintanis [21].

About this article

Cite this article

Goffard, PO., Jammalamadaka, S.R. & Meintanis, S.G. Goodness-of-Fit Procedures for Compound Distributions with an Application to Insurance. J Stat Theory Pract 16, 52 (2022). https://doi.org/10.1007/s42519-022-00276-6

Keywords

  • Compound distributions
  • Goodness-of-fit tests
  • Katz family
  • Laplace transform

Mathematics Subject Classification

  • 62G10
  • 62G20