Goodness of Fit of Product Multinomial Regression Models to Sparse Data

Sankhya B

Abstract

Tests of goodness of fit for sparse multinomial models with non-canonical links are proposed, based on approximations to the first three moments of the conditional distribution of a modified Pearson chi-square statistic. The modified Pearson statistic is obtained using a supplementary estimating equation approach. Approximations to the first three conditional moments of the modified Pearson statistic are derived. A simulation study compares, in terms of empirical size and power, four methods: the usual Pearson chi-square statistic, the standardized modified Pearson chi-square statistic based on the first two conditional moments, an Edgeworth approximation of the p-values based on the first three conditional moments, and a score test statistic. There does not seem to be any qualitative difference in size among the four methods. However, the standardized modified Pearson chi-square statistic and the Edgeworth approximation method show power advantages over the usual Pearson chi-square statistic and the score test statistic. In some situations, for example at small nominal levels, the standardized modified Pearson chi-square statistic shows some power advantage over the Edgeworth approximation method. The former is also easier to use and so is preferable. Two data sets are analyzed and a discussion is given.
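As a rough illustration of the kinds of quantities the abstract compares, the following Python sketch computes the usual Pearson chi-square statistic for a product multinomial table and turns a statistic into a p-value in two ways: by standardizing with a mean and variance, and by adding a one-term Edgeworth skewness correction. It is not the authors' method. The modified Pearson statistic and the formulas for its conditional moments are not given on this page, so `mean`, `var`, and `kappa3` are hypothetical inputs the reader would have to supply, and the upper-tail normal reference and the function names are assumptions made for this example.

```python
import math


def pearson_chi_square(counts, probs):
    """Usual Pearson statistic for a product multinomial table.

    counts[i][j] is the observed count in group i, category j;
    probs[i][j] is the fitted probability of category j in group i.
    """
    total = 0.0
    for y_row, p_row in zip(counts, probs):
        n_i = sum(y_row)  # multinomial index for group i
        for y, p in zip(y_row, p_row):
            expected = n_i * p
            total += (y - expected) ** 2 / expected
    return total


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def standardized_p_value(stat, mean, var):
    """Upper-tail p-value from a two-moment normal approximation.

    mean and var are supplied by the caller (assumed inputs).
    """
    z = (stat - mean) / math.sqrt(var)
    return 1.0 - normal_cdf(z)


def edgeworth_p_value(stat, mean, var, kappa3):
    """Upper-tail p-value with a one-term Edgeworth skewness correction.

    kappa3 is the third cumulant, supplied by the caller (assumed input).
    """
    z = (stat - mean) / math.sqrt(var)
    rho3 = kappa3 / var ** 1.5  # standardized skewness
    p = 1.0 - normal_cdf(z) + (rho3 / 6.0) * (z * z - 1.0) * normal_pdf(z)
    return min(1.0, max(0.0, p))  # clip, since the expansion can leave [0, 1]
```

With `kappa3 = 0` the Edgeworth p-value reduces to the standardized one, which is one way to see why the two approaches can behave similarly when the skewness is small.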

Author information

Corresponding author

Correspondence to Dianliang Deng.

About this article

Cite this article

Deng, D., Paul, S.R. Goodness of Fit of Product Multinomial Regression Models to Sparse Data. Sankhya B 78, 78–95 (2016). https://doi.org/10.1007/s13571-015-0109-z

  • DOI: https://doi.org/10.1007/s13571-015-0109-z
