
On the goodness-of-fit tests for gamma generalized linear models

  • Research Article
Journal of the Korean Statistical Society

Abstract

An omitted covariate in the regression function leads to hidden or unobserved heterogeneity in generalized linear models (GLMs). Using this fact, we develop two novel goodness-of-fit tests for gamma GLMs. The first is a score test to check the existence of hidden heterogeneity and the second is a Hausman-type specification test to detect the difference between two estimators for the dispersion parameter. In addition to these developments, we reveal the undesirable behavior of the deviance test for gamma GLMs, which is still used by many scholars in practice. Exploiting real-world data, we demonstrate the application of our proposed method.



Acknowledgements

Seongil Jo was supported by INHA UNIVERSITY Research Grant. Woojoo Lee was supported by the New Faculty Startup Fund from Seoul National University. Myeongjee Lee was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2017R1A6A3A110335).

Author information

Corresponding author

Correspondence to Woojoo Lee.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Details of the asymptotic variance used in the score test

First,

$$\begin{aligned} I_{\sigma ^2_{v}\sigma ^2_{v}}= &\, {} E\left( \frac{\partial \ell }{\partial \sigma ^2_{v}}\right) ^2\\= &\, {} \frac{1}{4}\sum _{i}E\left\{ \left( \frac{\nu w_i (y_{i}-\mu _{i})}{\mu _{i}}\right) ^2-\frac{\nu w_i y_{i}}{\mu _{i}} \right\} ^2\\= &\, {} \frac{1}{4}\sum _{i}E\left\{ \left( \frac{\nu w_i(y_{i}-\mu _{i})}{\mu _{i}}\right) ^2-\frac{\nu w_i (y_{i}-\mu _{i})}{\mu _{i}} -\nu w_i \right\} ^2. \end{aligned}$$

Since \(E(y_{i}-\mu _{i})^4 = \mu ^4_{i}(6/(\nu w_i)+3)/(\nu w_i)^2\), \(E(y_{i}-\mu _{i})^3 = 2\mu ^3_{i}/(\nu w_i)^2\), \(E(y_{i}-\mu _{i})^2 = \mu ^2_{i}/(\nu w_i)\), we have

$$\begin{aligned} I_{\sigma ^2_{v}\sigma ^2_{v}}=\frac{1}{4} \sum _{i} \left[ 2(\nu w_i)^2 + 3\nu w_i\right] \end{aligned}$$
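The closed form above can be checked with exact rational arithmetic. The following is a minimal sketch, not code from the article. It assumes that \(y_{i}\) is gamma distributed with shape \(k=\nu w_i\) and mean \(\mu _{i}\), so that \(y_{i}/\mu _{i}=G/k\) with \(G\sim Gamma(k,1)\) and \(E(G^{n})=k(k+1)\cdots (k+n-1)\). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def raw_moment(k, n):
    """E[G^n] for G ~ Gamma(shape=k, scale=1): k (k+1) ... (k+n-1)."""
    out = Fraction(1)
    for j in range(n):
        out *= k + j
    return out


def z_moment(k, n):
    """E[z^n] for z = (y - mu)/mu = G/k - 1, via the binomial expansion."""
    k = Fraction(k)
    return sum(
        comb(n, j) * raw_moment(k, j) / k**j * (-1) ** (n - j)
        for j in range(n + 1)
    )


def squared_score_term(k):
    """E[(k^2 z^2 - k z - k)^2], expanded term by term in moments of z."""
    k = Fraction(k)
    m1, m2, m3, m4 = (z_moment(k, n) for n in (1, 2, 3, 4))
    return (k**4 * m4 + k**2 * m2 + k**2
            - 2 * k**3 * m3 - 2 * k**3 * m2 + 2 * k**2 * m1)


if __name__ == "__main__":
    for k in (Fraction(1, 2), Fraction(1), Fraction(3), Fraction(7, 3)):
        # Each value should equal 2 k^2 + 3 k, the bracket in the closed form.
        print(k, squared_score_term(k), 2 * k**2 + 3 * k)
```

Multiplying `squared_score_term(k)` by 1/4 gives the contribution of one observation to \(I_{\sigma ^2_{v}\sigma ^2_{v}}\). Exact fractions are used instead of floating point so that the comparison with \(2(\nu w_i)^2+3\nu w_i\) is an identity rather than an approximation.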

Meanwhile,

$$\begin{aligned} I_{\varvec{\theta }\varvec{\theta }}=\left( \begin{array}{cc} I_{\varvec{\beta }\varvec{\beta }} &{} I_{\varvec{\beta }\nu } \\ I^{T}_{\varvec{\beta }\nu } &{} I_{\nu \nu } \\ \end{array} \right) \end{aligned}$$

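Since \(\frac{\partial \ell _{i}}{\partial \varvec{\beta }} = \frac{\nu w_i (y_{i}-\mu _{i})}{\mu ^2_{i}}\frac{\partial \mu _{i}}{\partial \varvec{\beta }}=\nu w_i \frac{y_{i}-\mu _{i}}{\mu _{i}}{\varvec{x}}_{i}\), where \({\varvec{x}}_{i}\) denotes the column vector of covariates and the second equality uses \(\partial \mu _{i}/\partial \varvec{\beta }=\mu _{i}{\varvec{x}}_{i}\) (as holds under a log link),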

$$\begin{aligned} I_{\varvec{\beta }\varvec{\beta }}=\sum _{i}-E\left( \frac{\partial ^2 \ell _{i}}{\partial \varvec{\beta }\partial \varvec{\beta }^{T}}\right) =\sum _{i} \nu w_i {\varvec{x}}_{i}{\varvec{x}}^{T}_{i} \end{aligned}$$

and

$$\begin{aligned} I_{\varvec{\beta }\nu }=0. \end{aligned}$$

Next,

$$\begin{aligned} I_{\nu \nu }= &\, {} \sum _{i}E\left\{ \left( \frac{\partial \ell _{i}}{\partial \nu }\right) ^2\right\} \\= &\, {} \sum _{i}w_{i}^2\left[ E\left\{ \left( \log y_{i}\right) ^2\right\} + \frac{1}{\mu _{i}^2}E(y_{i}^2) + C^2 - \frac{2}{\mu _{i}}E\left( y_i\log y_i\right) + 2CE(\log y_i) - \frac{2C}{\mu _i}E(y_i)\right] \\= &\, {} \sum _{i}w_i^2\left[ \psi ^{(1)}(\nu w_i) + \frac{1}{\nu w_i} + 1 - 2\left( \psi (\nu w_i + 1) - \log \frac{\nu w_i}{\mu _i}\right) + \left( C + \psi (\nu w_i) + \log \frac{\mu _i}{\nu w_i}\right) ^2 - 2C\right] \\= &\, {} \sum _{i}w_i^2\left[ \psi ^{(1)}(\nu w_i) + \frac{1}{\nu w_i} - 2\left\{ \psi (\nu w_i + 1) - \psi (\nu w_i) \right\} \right] , \end{aligned}$$

where \(E(\log y_{i}) =\psi (\nu w_i)+\log (\mu _{i}/(\nu w_i))\), \(E(y_{i}^2)=\mu _{i}^2(1+1/(\nu w_i))\), \(E(y_{i}\log y_{i}) =\mu _{i}\left( \psi (\nu w_i+1)-\log \frac{\nu w_i}{\mu _{i}}\right) \), \(E\{(\log y_{i})^2\} =\psi ^{(1)}(\nu w_i)+\left( \psi (\nu w_i) + \log \frac{\mu _{i}}{\nu w_i}\right) ^2\), and \(\psi (\cdot )\) and \(\psi ^{(1)}(\cdot )\) denote the digamma and trigamma functions, respectively. The constant \(C\) is defined as \(\log (\nu w_i) - \log \mu _i - \psi (\nu w_i) + 1\), so it depends on \(i\) and satisfies \(C+E(\log y_{i})=1\).
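The last bracket in the expression for \(I_{\nu \nu }\) can be reduced one step further. This worked step is not stated in the derivation above; it relies only on the digamma recurrence \(\psi (x+1)=\psi (x)+1/x\), which gives \(\psi (\nu w_i+1)-\psi (\nu w_i)=1/(\nu w_i)\). The contribution of observation \(i\) then becomes

$$\begin{aligned} w_i^2\left[ \psi ^{(1)}(\nu w_i) + \frac{1}{\nu w_i} - \frac{2}{\nu w_i}\right] = w_i^2\left[ \psi ^{(1)}(\nu w_i) - \frac{1}{\nu w_i}\right] , \end{aligned}$$

which is strictly positive because \(\psi ^{(1)}(x)>1/x\) for every \(x>0\).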

Finally, \(I_{\sigma ^2_{v}\varvec{\theta }}=(I_{\sigma ^2_{v}\varvec{\beta }},I_{\sigma ^2_{v}\nu })\), where

$$\begin{aligned} I_{\sigma ^2_{v}\varvec{\beta }}= &\, {} E\left( \frac{\partial \ell }{\partial \sigma ^2_{v}}\frac{\partial \ell }{\partial \varvec{\beta }}\right) \\= &\, {} \sum _{i}\frac{1}{2} E\left[ \left( \left( \frac{\nu w_i (y_{i}-\mu _{i})}{\mu _{i}}\right) ^2-\frac{\nu w_i y_{i}}{\mu _{i}}\right) \left( \nu w_i \frac{y_{i}-\mu _{i}}{\mu _{i}}{\varvec{x}}_{i}\right) \right] \\= &\, {} \frac{1}{2}\nu \sum _{i}w_{i}{\varvec{x}}_{i} \end{aligned}$$

and

$$\begin{aligned} I_{\sigma ^2_{v}\nu }= &\, {} E\left( \frac{\partial \ell }{\partial \sigma ^2_{v}}\frac{\partial \ell }{\partial \nu }\right) \\= &\, {} \sum _{i}\frac{1}{2} E\left[ \left( \left( \frac{\nu w_{i}(y_{i}-\mu _{i})}{\mu _{i}}\right) ^2-\frac{\nu w_{i} y_{i}}{\mu _{i}}\right) \left( w_i\log \nu w_i+w_i+w_i\log y_{i}-\frac{w_iy_{i}}{\mu _{i}}-w_i\log \mu _{i}- w_i\psi (\nu w_i)\right) \right] \\= &\, {} \sum _{i}\frac{1}{2} E\left[ \left( \left( \frac{\nu w_i (y_{i}-\mu _{i})}{\mu _{i}}\right) ^2-\frac{\nu w_i (y_{i}-\mu _{i})}{\mu _{i}}-\nu w_i\right) \left( w_i\log y_{i}-\frac{w_i(y_{i}-\mu _{i})}{\mu _{i}}\right) \right] . \end{aligned}$$

Since \(E(y^2_{i}\log y_{i})=\frac{\nu w_i(\nu w_i+1)}{\nu ^2 w_i^2}\mu ^2_{i}(\psi (\nu w_i+2)-\log (\frac{\nu w_i}{\mu _{i}}))\),

$$\begin{aligned} I_{\sigma ^2_{v}\nu }=-\frac{1}{2} \sum _{i} w_i. \end{aligned}$$
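The two cross terms can be checked numerically for a single observation. The sketch below is an illustration under stated assumptions, not the authors' code: it takes a scalar covariate \(x_i=1\), a shape \(\nu w_i>1\) so that the integrand is bounded near zero, and evaluates the expectations with a plain midpoint rule. The helper names, the integration limit and the grid size are hypothetical choices.

```python
import math


def digamma(x):
    """Digamma via psi(x) = psi(x + 1) - 1/x and the standard asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return acc + math.log(x) - 0.5 / x - series


def expect(f, k, mu, upper=60.0, n=120_000):
    """E[f(y)] for y ~ Gamma(shape=k, mean=mu), midpoint rule on y = (mu/k) g."""
    h = upper / n
    log_norm = math.lgamma(k)
    total = 0.0
    for j in range(n):
        g = (j + 0.5) * h
        dens = math.exp((k - 1.0) * math.log(g) - g - log_norm)
        total += f(mu * g / k) * dens
    return total * h


def cross_information(mu, nu, w):
    """Per-observation E[s_sigma * s_beta] and E[s_sigma * s_nu] with x_i = 1."""
    k = nu * w
    # The constant C of Appendix 1.
    const = math.log(k) - math.log(mu) - digamma(k) + 1.0

    def s_sigma(y):
        z = (y - mu) / mu
        return 0.5 * ((k * z) ** 2 - k * y / mu)

    def s_beta(y):  # scalar covariate x_i = 1 (assumption)
        return k * (y - mu) / mu

    def s_nu(y):
        return w * (math.log(y) - y / mu + const)

    i_sigma_beta = expect(lambda y: s_sigma(y) * s_beta(y), k, mu)
    i_sigma_nu = expect(lambda y: s_sigma(y) * s_nu(y), k, mu)
    return i_sigma_beta, i_sigma_nu


if __name__ == "__main__":
    i_sb, i_sn = cross_information(mu=2.0, nu=2.0, w=1.5)
    # Compare with nu * w / 2 and -w / 2 from the closed forms.
    print(i_sb, 2.0 * 1.5 / 2.0)
    print(i_sn, -1.5 / 2.0)
```

Deterministic quadrature is used here rather than simulation because the products of scores have heavy tails, which makes Monte Carlo averages converge slowly. For larger shapes the `upper` limit would need to grow with \(\nu w_i\).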

Appendix 2: Details of the asymptotic variance used in the Hausman-type test

Consider \(y_{i} \sim Gamma(\mu _{i},\phi /w_{i})\) and write \(\nu =1/\phi \). The derivation uses the moments \(E(y_{i}-\mu _{i})^4 = \mu ^4_{i}(6/(\nu w_i)+3)/(\nu w_i)^2\), \(E(y_{i}-\mu _{i})^3 = 2\mu ^3_{i}/(\nu w_i)^2\), and \(E(y_{i}-\mu _{i})^2 = \mu ^2_{i}/(\nu w_i)\).

The estimating equations for \(\varvec{\beta }\) and \(\phi \) are

$$\begin{aligned} u_{r}= &\, {} \sum _{i=1}^{n} \frac{\nu w_{i}(y_{i}-\mu _{i})}{\mu _{i}}x_{ir}=0 ~~~~~ (r=1,\ldots ,p),\\ u_{p+1}= &\, {} \sum _{i=1}^{n} \frac{\nu w_{i}(y_{i}-\mu _{i})^2}{\mu ^2_{i}}-n=0 \end{aligned}$$
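Solving the second estimating equation for the dispersion gives a closed-form moment estimator. The sketch below assumes \(\nu =1/\phi \), which is consistent with \(y_{i} \sim Gamma(\mu _{i},\phi /w_{i})\) and the moments used in this appendix, and it treats the fitted means as given (in practice they would come from solving \(u_{r}=0\) for \(r=1,\ldots ,p\)). The function name is hypothetical.

```python
def mme_dispersion(y, mu, w):
    """Moment estimator of phi: solves sum_i nu * w_i * ((y_i - mu_i)/mu_i)^2 = n
    for phi = 1/nu, so phi_hat = (1/n) * sum_i w_i * ((y_i - mu_i)/mu_i)^2."""
    n = len(y)
    total = 0.0
    for yi, mi, wi in zip(y, mu, w):
        z = (yi - mi) / mi  # relative residual
        total += wi * z * z
    return total / n


if __name__ == "__main__":
    # Toy inputs, chosen only to show the call signature.
    print(mme_dispersion([1.0, 3.0], [2.0, 2.0], [1.0, 1.0]))
```

The divisor is \(n\), as in the estimating equation above, rather than a degrees-of-freedom-corrected \(n-p\).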

Let \(\varvec{\theta }=(\varvec{\beta }^{T},\phi )^{T}\). For \(r,s=1,\ldots ,p+1\),

$$\begin{aligned} A(\varvec{\theta })_{r,s}= &\, {} \lim _{n\rightarrow \infty } \frac{1}{n}E\left( -\frac{\partial u_{r}}{\partial \theta _{s}}\right) \\= &\, {} \lim _{n\rightarrow \infty } \left( \begin{array}{cc} n^{-1}\sum _{i=1}^{n}\nu w_{i}{\varvec{x}}_{i}{\varvec{x}}^{T}_{i} &{} 0 \\ n^{-1}\sum _{i=1}^{n}2 {\varvec{x}}^{T}_{i} &{} \nu \\ \end{array} \right) \end{aligned}$$

and

$$\begin{aligned} B(\varvec{\theta })_{r,s}= &\, {} \lim _{n\rightarrow \infty } \frac{1}{n}E(u_{r}u_{s})\\= &\, {} \lim _{n\rightarrow \infty } \left( \begin{array}{cc} n^{-1}\sum _{i=1}^{n}\nu w_{i}{\varvec{x}}_{i}{\varvec{x}}^{T}_{i} &{} n^{-1}\sum _{i=1}^{n}2{\varvec{x}}_{i} \\ n^{-1}\sum _{i=1}^{n}2{\varvec{x}}^{T}_{i} &{} n^{-1}\sum _{i=1}^{n}(2+6/(\nu w_{i})) \\ \end{array} \right) . \end{aligned}$$

The asymptotic distribution of the MME of \(\phi \) is

$$\begin{aligned} \sqrt{n}({\widehat{\phi }}^{MME}-\phi ) \overset{d}{\rightarrow } N\left( 0, \left( A^{-1}(\varvec{\theta })B(\varvec{\theta })(A^{-1}(\varvec{\theta }))^{T}\right) _{p+1,p+1}\right) . \end{aligned}$$

Let \(c_{p+1}=\lim _{n\rightarrow \infty } n^{-1}\sum _{i=1}^{n}(2+6/(\nu w_{i}))\), \({\varvec{b}}=\lim _{n\rightarrow \infty } n^{-1}\sum _{i=1}^{n}2{\varvec{x}}_{i}\), \(b_{p+1}=\nu \), and \(I_{1}=\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^{n}\nu w_{i}{\varvec{x}}_{i}{\varvec{x}}^{T}_{i}\). Then \(\left( A^{-1}(\varvec{\theta })B(\varvec{\theta })(A^{-1}(\varvec{\theta }))^{T}\right) _{p+1,p+1}\) is

$$\begin{aligned} \frac{c_{p+1}-{\varvec{b}}^{T}(I_{1})^{-1} {\varvec{b}}}{b^2_{p+1}}. \end{aligned}$$
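A finite-sample plug-in version of this variance replaces each limit by the corresponding average over the observed design. The sketch below is one possible implementation under that assumption, not the authors' code. The function names are hypothetical, and a small Gaussian-elimination helper stands in for a linear algebra library so that the example stays self-contained.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(row) + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][j] * x[j] for j in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def mme_asymptotic_variance(X, w, nu):
    """Plug-in value of (c_{p+1} - b^T I_1^{-1} b) / nu^2 using sample averages.

    X is a list of covariate rows x_i, w the prior weights, nu = 1/phi.
    """
    n, p = len(X), len(X[0])
    I1 = [[sum(nu * w[i] * X[i][r] * X[i][s] for i in range(n)) / n
           for s in range(p)] for r in range(p)]
    b = [sum(2.0 * X[i][r] for i in range(n)) / n for r in range(p)]
    c = sum(2.0 + 6.0 / (nu * w[i]) for i in range(n)) / n
    sol = solve(I1, b)  # I_1^{-1} b
    quad = sum(b[r] * sol[r] for r in range(p))
    return (c - quad) / nu ** 2


if __name__ == "__main__":
    X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
    print(mme_asymptotic_variance(X, [1.0] * 4, nu=2.0))
```

When the design contains an intercept and all weights equal one, \({\varvec{b}}^{T}(I_{1})^{-1}{\varvec{b}}=4/\nu \), so the expression reduces to \((2+2/\nu )/\nu ^2\) whatever the other covariates are. That special case gives a convenient sanity check for the function.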


Cite this article

Jo, S., Lee, M. & Lee, W. On the goodness-of-fit tests for gamma generalized linear models. J. Korean Stat. Soc. 50, 315–332 (2021). https://doi.org/10.1007/s42952-020-00095-0


  • DOI: https://doi.org/10.1007/s42952-020-00095-0
