Goodness of Fit of Product Multinomial Regression Models to Sparse Data

Deng, Dianliang; Paul, Sudhir R.

doi:10.1007/s13571-015-0109-z

Published: 16 November 2015

Volume 78, pages 78–95, (2016)

Sankhya B


Abstract

Tests of goodness of fit of sparse multinomial models with non-canonical links are proposed, using approximations to the first three moments of the conditional distribution of a modified Pearson Chi-square statistic. The modified Pearson statistic is obtained using a supplementary estimating equation approach. Approximations to the first three conditional moments of the modified Pearson statistic are derived. A simulation study is conducted to compare, in terms of empirical size and power, four methods: the usual Pearson Chi-square statistic, the standardized modified Pearson Chi-square statistic using the first two conditional moments, a method using an Edgeworth approximation of the p-values based on the first three conditional moments, and a score test statistic. There does not seem to be any qualitative difference in size among the four methods. However, the standardized modified Pearson Chi-square statistic and the Edgeworth approximation method of obtaining p-values using the first three conditional moments show power advantages over the usual Pearson Chi-square statistic and the score test statistic. In some situations, for example at a small nominal level, the standardized modified Pearson Chi-square statistic shows some power advantage over the method using the Edgeworth approximation of the p-values. The former is also easier to use and so is preferable. Two data sets are analyzed and a discussion is given.
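
To make the idea of standardizing a Pearson statistic concrete, the following is a minimal Python sketch, not the authors' procedure. It computes the usual Pearson Chi-square statistic for product multinomial counts and standardizes it with a mean and standard deviation estimated by Monte Carlo simulation. The simulation is a stand-in for the analytical conditional-moment approximations derived in the paper, and the cell probabilities are treated as known rather than estimated from a regression model. The function names `pearson_chi2` and `monte_carlo_z` are hypothetical.

```python
import numpy as np


def pearson_chi2(counts, probs):
    """Usual Pearson statistic summed over all groups (rows) and categories."""
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)  # n_i for each multinomial group
    expected = totals * probs                   # n_i * pi_ij
    return float(((counts - expected) ** 2 / expected).sum())


def monte_carlo_z(counts, probs, n_sim=2000, seed=0):
    """Standardize the Pearson statistic with simulated moments.

    Assumption: probs are the true cell probabilities. The paper instead
    uses approximations to conditional moments that account for estimation.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    probs = np.asarray(probs, dtype=float)
    totals = counts.sum(axis=1)
    observed = pearson_chi2(counts, probs)
    sims = np.empty(n_sim)
    for s in range(n_sim):
        # Draw one product multinomial table with the same group totals.
        sim = np.vstack([rng.multinomial(n, p) for n, p in zip(totals, probs)])
        sims[s] = pearson_chi2(sim, probs)
    z = (observed - sims.mean()) / sims.std(ddof=1)
    return observed, z


if __name__ == "__main__":
    # Sparse illustration: many groups with small totals (made-up data).
    probs = np.tile([0.5, 0.25, 0.25], (20, 1))
    rng = np.random.default_rng(1)
    counts = np.vstack([rng.multinomial(3, p) for p in probs])
    x2, z = monte_carlo_z(counts, probs)
    print(f"X2 = {x2:.3f}, standardized Z = {z:.3f}")
```

A large positive Z would be compared with an upper normal quantile. With sparse data the Chi-square reference distribution for the raw statistic is unreliable, which is the motivation for standardizing with moments instead.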

Author information

Authors and Affiliations

Department of Mathematics and Statistics, University of Regina, Regina, SK, S4S 0A2, Canada
Dianliang Deng
Department of Mathematics and Statistics, University of Windsor, Windsor, ON, N9B 3P4, Canada
Sudhir R. Paul

Corresponding author

Correspondence to Dianliang Deng.

About this article

Cite this article

Deng, D., Paul, S.R. Goodness of Fit of Product Multinomial Regression Models to Sparse Data. Sankhya B 78, 78–95 (2016). https://doi.org/10.1007/s13571-015-0109-z

Received: 24 January 2013
Published: 16 November 2015
Issue Date: May 2016
DOI: https://doi.org/10.1007/s13571-015-0109-z
