Abstract
An omitted covariate in the regression function leads to hidden (unobserved) heterogeneity in generalized linear models (GLMs). Using this fact, we develop two novel goodness-of-fit tests for gamma GLMs. The first is a score test for the presence of hidden heterogeneity, and the second is a Hausman-type specification test that detects a difference between two estimators of the dispersion parameter. In addition, we reveal the undesirable behavior of the deviance test for gamma GLMs, which many scholars still use in practice. Using real-world data, we demonstrate the application of the proposed tests.
Acknowledgements
Seongil Jo was supported by an INHA UNIVERSITY Research Grant. Woojoo Lee was supported by the New Faculty Startup Fund from Seoul National University. Myeongjee Lee was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2017R1A6A3A110335).
Appendices
Appendix 1: The details about the asymptotic variance used in the score test
First,
Since \(E(y_{i}-\mu _{i})^4 = \mu ^4_{i}(6/(\nu w_i)+3)/(\nu w_i)^2\), \(E(y_{i}-\mu _{i})^3 = 2\mu ^3_{i}/(\nu w_i)^2\), \(E(y_{i}-\mu _{i})^2 = \mu ^2_{i}/(\nu w_i)\), we have
Meanwhile,
Since \(\frac{\partial \ell _{i}}{\partial \varvec{\beta }} = \frac{\nu w_i (y_{i}-\mu _{i})}{\mu ^2_{i}}\frac{\partial \mu _{i}}{\partial \varvec{\beta }}=\nu w_i \frac{y_{i}-\mu _{i}}{\mu _{i}}{\varvec{x}}_{i}\), where \({\varvec{x}}_{i}\) denotes the column vector of covariates and the second equality uses \(\partial \mu _{i}/\partial \varvec{\beta } = \mu _{i}{\varvec{x}}_{i}\) (the log link),
and
where \(E(\log y_{i}) =\psi (\nu w_i)+\log (\mu _{i}/(\nu w_i))\), \(E(y_{i}\log y_{i}) =\mu _{i}\left( \psi (\nu w_i+1)-\log (\nu w_i/\mu _{i})\right) \), \(E(\log y_{i})^2 =\psi ^{(1)}(\nu w_i)+\left( \psi (\nu w_i) + \log (\mu _{i}/(\nu w_i))\right) ^2\), and \(\psi (\cdot )\) and \(\psi ^{(1)}(\cdot )\) denote the digamma function and the trigamma function, respectively. Here C is defined as \(\log (\nu w_i) - \log \mu _i - \psi (\nu w_i) + 1\).
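These logarithmic moments can be recovered from the raw moments of the gamma distribution. As a sketch, assume \(y_i\) has shape \(\nu w_i\) and scale \(\mu _i/(\nu w_i)\), so that its mean is \(\mu _i\); the auxiliary exponent \(s\) is introduced here for illustration and is not part of the original notation. Differentiating \(E(y_i^{s})\) with respect to \(s\) gives:

```latex
% Raw moments of y_i (shape \nu w_i, scale \mu_i / (\nu w_i)):
\[
E(y_i^{s}) = \left(\frac{\mu_i}{\nu w_i}\right)^{s}
             \frac{\Gamma(\nu w_i + s)}{\Gamma(\nu w_i)},
\qquad s > -\nu w_i .
\]
% One derivative in s brings down a factor \log y_i:
\[
E(y_i^{s}\log y_i) = E(y_i^{s})
   \left\{\psi(\nu w_i + s) + \log\frac{\mu_i}{\nu w_i}\right\}.
\]
% s = 0 and s = 1:
\[
E(\log y_i) = \psi(\nu w_i) + \log\frac{\mu_i}{\nu w_i},
\qquad
E(y_i\log y_i) = \mu_i\left\{\psi(\nu w_i + 1) - \log\frac{\nu w_i}{\mu_i}\right\}.
\]
% Two derivatives in s, evaluated at s = 0:
\[
E(\log y_i)^2 = \psi^{(1)}(\nu w_i)
   + \left\{\psi(\nu w_i) + \log\frac{\mu_i}{\nu w_i}\right\}^{2}.
\]
```

Setting \(s=2\) in the second display, with \(E(y_i^{2})=\mu _i^{2}(\nu w_i+1)/(\nu w_i)\), gives the corresponding expression for \(E(y_i^{2}\log y_i)\).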
Finally, \(I_{\sigma ^2_{v}\varvec{\theta }}=(I_{\sigma ^2_{v}\varvec{\beta }},I_{\sigma ^2_{v}\nu })\)
and
Since \(E(y^2_{i}\log y_{i})=\frac{\nu w_i(\nu w_i+1)}{\nu ^2 w_i^2}\mu ^2_{i}(\psi (\nu w_i+2)-\log (\frac{\nu w_i}{\mu _{i}}))\),
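The moment identities quoted in this appendix can be checked numerically. The following is a minimal sketch using only the Python standard library, not part of the original derivation. The function names (`digamma`, `trigamma`, `gamma_expectation`, `moment_checks`) and the parameter values are hypothetical choices for illustration, the digamma and trigamma functions are approximated by finite differences of `math.lgamma`, and the integration limits are assumed wide enough for the chosen parameters. The check for \(E(\log y_i)^2\) uses the standard identity \(E(X^2)=\mathrm{Var}(X)+(E X)^2\) with \(\mathrm{Var}(\log y_i)=\psi ^{(1)}(\nu w_i)\).

```python
import math


def digamma(x, h=1e-5):
    """Approximate psi(x) by a central difference of log Gamma (illustrative only)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)


def trigamma(x, h=1e-3):
    """Approximate psi^(1)(x) by a second difference of log Gamma (illustrative only)."""
    return (math.lgamma(x + h) - 2.0 * math.lgamma(x) + math.lgamma(x - h)) / h ** 2


def gamma_expectation(g, shape, mean, lo=-40.0, hi=10.0, n=50_000):
    """E[g(y, log y)] for a gamma variable with the given shape and mean.

    Uses the trapezoid rule in t = log y; [lo, hi] is an assumed range that
    must cover essentially all of the probability mass.
    """
    scale = mean / shape
    log_norm = math.lgamma(shape) + shape * math.log(scale)
    step = (hi - lo) / n
    total = 0.0
    for j in range(n + 1):
        t = lo + j * step
        y = math.exp(t)
        # Density of t = log y: exp(shape * t - y / scale) / (Gamma(shape) * scale^shape)
        density = math.exp(shape * t - y / scale - log_norm)
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * g(y, t) * density
    return total * step


def moment_checks(mu, nu, w):
    """Return {name: (numerical value, closed form)} for the quoted identities."""
    k = nu * w  # gamma shape parameter, nu * w_i in the text
    log_scale = math.log(mu / k)
    closed = {
        "E(y-mu)^2": mu ** 2 / k,
        "E(y-mu)^3": 2.0 * mu ** 3 / k ** 2,
        "E(y-mu)^4": mu ** 4 * (6.0 / k + 3.0) / k ** 2,
        "E(log y)": digamma(k) + log_scale,
        "E(y log y)": mu * (digamma(k + 1.0) + log_scale),
        "E(log y)^2": trigamma(k) + (digamma(k) + log_scale) ** 2,
        "E(y^2 log y)": (k + 1.0) / k * mu ** 2 * (digamma(k + 2.0) + log_scale),
    }
    integrands = {
        "E(y-mu)^2": lambda y, t: (y - mu) ** 2,
        "E(y-mu)^3": lambda y, t: (y - mu) ** 3,
        "E(y-mu)^4": lambda y, t: (y - mu) ** 4,
        "E(log y)": lambda y, t: t,
        "E(y log y)": lambda y, t: y * t,
        "E(log y)^2": lambda y, t: t ** 2,
        "E(y^2 log y)": lambda y, t: y ** 2 * t,
    }
    return {
        name: (gamma_expectation(integrands[name], k, mu), closed[name])
        for name in closed
    }


if __name__ == "__main__":
    # Arbitrary illustrative values for mu_i, nu and w_i.
    for name, (numeric, formula) in moment_checks(mu=3.0, nu=2.0, w=1.25).items():
        print(f"{name:>13}: numeric={numeric:.8f}  closed form={formula:.8f}")
```

Running the script prints each identity's numerically integrated value next to its closed form, so a sign or factor error in any of the expressions would show up as a visible mismatch.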
Appendix 2: The details about the asymptotic variance used in the Hausman-type test
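Consider \(y_{i} \sim \text{Gamma}(\mu _{i},\phi /w_{i})\), that is, a gamma distribution with mean \(\mu _{i}\) and dispersion \(\phi /w_{i}\), where \(\phi =1/\nu \). Using \(E(y_{i}-\mu _{i})^4 = \mu ^4_{i}(6/(\nu w_i)+3)/(\nu w_i)^2\), \(E(y_{i}-\mu _{i})^3 = 2\mu ^3_{i}/(\nu w_i)^2\), \(E(y_{i}-\mu _{i})^2 = \mu ^2_{i}/(\nu w_i)\),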
The estimating equations for \(\varvec{\beta }\) and \(\phi \) are
Let \(\varvec{\theta }=(\varvec{\beta }^{T},\phi )^{T}\). For \(r,s=1,\ldots ,p+1\),
and
The asymptotic distribution of the method-of-moments estimator (MME) is
Let \(c_{p+1}=\lim _{n\rightarrow \infty } n^{-1}\sum _{i=1}^{n}(2+6/(\nu w_{i}))\), \({\varvec{b}}=\lim _{n\rightarrow \infty } n^{-1}\sum _{i=1}^{n}2{\varvec{x}}_{i}\), \(b_{p+1}=\nu \), and \(I_{1}=\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^{n}\nu w_{i}{\varvec{x}}_{i}{\varvec{x}}^{T}_{i}\). Here, \(\left( A^{-1}(\varvec{\theta })B(\varvec{\theta })(A^{-1}(\varvec{\theta }))^{T}\right) _{p+1,p+1}\) is
Cite this article
Jo, S., Lee, M. & Lee, W. On the goodness-of-fit tests for gamma generalized linear models. J. Korean Stat. Soc. 50, 315–332 (2021). https://doi.org/10.1007/s42952-020-00095-0