How Should We Assess the Fit of Rasch-Type Models? Approximating the Power of Goodness-of-Fit Statistics in Categorical Data Analysis

Psychometrika

Abstract

We investigate the performance of three statistics, $R_1$, $R_2$ (Glas in Psychometrika 53:525–546, 1988), and $M_2$ (Maydeu-Olivares & Joe in J. Am. Stat. Assoc. 100:1009–1020, 2005, Psychometrika 71:713–732, 2006), for assessing the overall fit of a one-parameter logistic model (1PL) estimated by (marginal) maximum likelihood (ML). $R_1$ and $R_2$ were designed to target specific assumptions of Rasch models, whereas $M_2$ is a general-purpose test statistic. We report asymptotic power rates under several violations of model assumptions (unequal item discrimination, presence of guessing, and multidimensionality), as well as empirical rejection rates for correctly specified models and some misspecified models. All three statistics were found to be more powerful than Pearson’s $X^2$ against two- and three-parameter logistic alternatives (2PL and 3PL) and against multidimensional 1PL models. The results suggest that there is no clear advantage in using goodness-of-fit statistics specifically designed for Rasch-type models to test these models when marginal ML estimation is used.
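The 1PL model and the 2PL and 3PL alternatives named in the abstract can be illustrated with a minimal sketch of their item response functions. This is an illustration only, not the authors' code: the function name `irf`, the parameter names, and the slope-difficulty parameterization a(θ − b) are assumptions made here, and the article itself may use a different but equivalent parameterization.

```python
import math


def irf(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic IRT model.

    theta: latent trait value of the respondent
    b:     item difficulty
    a:     item discrimination (hypothetical default of 1.0)
    c:     lower asymptote, i.e. guessing (hypothetical default of 0.0)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# 1PL: every item shares one discrimination value; only b varies by item.
p_1pl = irf(theta=0.5, b=0.0, a=1.0)

# 2PL: discrimination is allowed to differ across items.
p_2pl = irf(theta=0.5, b=0.0, a=2.0)

# 3PL: adds a nonzero lower asymptote to model guessing.
p_3pl = irf(theta=0.5, b=0.0, a=2.0, c=0.2)

print(p_1pl, p_2pl, p_3pl)
```

Under this sketch, fitting a 1PL to data generated with unequal `a` values or a nonzero `c` corresponds to the discrimination and guessing violations described above; the multidimensional case is not covered here.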

References

  • Agresti, A., & Yang, M. (1987). An empirical investigation of some effects of sparseness in contingency tables. Computational Statistics & Data Analysis, 5, 9–21.

  • Andersen, E.B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140.

  • Bartholomew, D.J., & Leung, S.O. (2002). A goodness of fit test for sparse 2^p contingency tables. British Journal of Mathematical & Statistical Psychology, 55, 1–15.

  • Bartholomew, D., & Tzamourani, P. (1999). The goodness of fit of latent trait models in attitude measurement. Sociological Methods & Research, 27, 525–546.

  • Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika, 46, 443–459.

  • Bock, R.D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179–197.

  • Cai, L., Maydeu-Olivares, A., Coffman, D.L., & Thissen, D. (2006). Limited information goodness of fit testing of item response theory models for sparse 2^p tables. British Journal of Mathematical & Statistical Psychology, 59, 173–194.

  • Christoffersson, A. (1975). Factor analysis of dichotomized variables. Psychometrika, 40, 5–32.

  • De Leeuw, J., & Verhelst, N. (1986). Maximum likelihood estimation in generalized Rasch models. Journal of Educational and Behavioral Statistics, 11, 183–196.

  • Fischer, G.H., & Molenaar, I.W. (Eds.) (1995). Rasch models: foundations, recent developments and applications. New York: Springer.

  • Glas, C.A.W. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53, 525–546.

  • Glas, C.A.W., & Verhelst, N.D. (1989). Extensions of the partial credit model. Psychometrika, 54, 635–659.

  • Glas, C.A.W., & Verhelst, N.D. (1995). Testing the Rasch model. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: foundations, recent developments and applications (pp. 69–96). New York: Springer.

  • Glas, C.A.W. (2009). Personal communication.

  • Irtel, H. (1995). An extension of the concept of specific objectivity. Psychometrika, 60, 115–118.

  • Joe, H., & Maydeu-Olivares, A. (2010). A general family of limited information goodness-of-fit statistics for multinomial data. Psychometrika, 75, 393–419.

  • Jöreskog, K.G. (1994). On the estimation of polychoric correlations and their asymptotic covariance matrix. Psychometrika, 59, 381–389.

  • Jöreskog, K.G., & Moustaki, I. (2001). Factor analysis of ordinal variables: a comparison of three approaches. Multivariate Behavioral Research, 36, 347–387.

  • Koehler, K., & Larntz, K. (1980). An empirical investigation of goodness of fit statistics for sparse multidimensional tables. Journal of the American Statistical Association, 75, 336–344.

  • Kullback, S., & Leibler, R.A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.

  • Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.

  • Mathai, A.M., & Provost, S.B. (1992). Quadratic forms in random variables: theory and applications. New York: Marcel Dekker.

  • Maydeu-Olivares, A., & Joe, H. (2005). Limited and full information estimation and goodness-of-fit testing in 2^n tables: a unified approach. Journal of the American Statistical Association, 100, 1009–1020.

  • Maydeu-Olivares, A., & Joe, H. (2006). Limited information goodness-of-fit in multidimensional contingency tables. Psychometrika, 71, 713–732.

  • Maydeu-Olivares, A., & Joe, H. (2008). An overview of limited information goodness-of-fit testing in multidimensional contingency tables. In K. Shigemasu, A. Okada, T. Imaizumi, & T. Hoshino (Eds.), New trends in psychometrics (pp. 253–262). Tokyo: Universal Academy Press.

  • Maydeu-Olivares, A., & Liu, Y. (2012). Item diagnostics in multivariate discrete data. Manuscript under review.

  • Mavridis, D., Moustaki, I., & Knott, M. (2007). Goodness-of-fit measures for latent variable models for binary data. In S.-Y. Lee (Ed.), Handbook of latent variables and related models (pp. 135–162). Amsterdam: Elsevier.

  • McDonald, R.P. (1999). Test theory: a unified treatment. Mahwah: Lawrence Erlbaum.

  • Montaño, R. (2009). Una comparación de las estadísticas de bondad de ajuste $R_1$ y $M_2$ para modelos de la Teoría de Respuesta al Ítem [Comparing the $R_1$ and $M_2$ statistics for goodness of fit assessment in IRT models]. Unpublished Ph.D. dissertation, University of Barcelona.

  • Pfanzagl, J. (1993). A case of asymptotic equivalence between conditional and marginal maximum likelihood estimators. Journal of Statistical Planning and Inference, 35, 301–307.

  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Paedagogiske Institut.

  • Reiser, M. (1996). Analysis of residuals for the multinomial item response model. Psychometrika, 61, 509–528.

  • Reiser, M. (2008). Goodness-of-fit testing using components based on marginal frequencies of multinomial data. British Journal of Mathematical & Statistical Psychology, 61, 331–360.

  • Satorra, A., & Saris, W.E. (1985). Power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83–90.

  • Suárez-Falcon, J.C., & Glas, C.A.W. (2003). Evaluation of global testing procedure for item fit to the Rasch model. British Journal of Mathematical & Statistical Psychology, 56, 127–143.

  • Swaminathan, H., Hambleton, R.K., & Rogers, H.J. (2007). Assessing the fit of item response models. In C.R. Rao & S. Sinharay (Eds.), Handbook of statistics: Vol. 26. Psychometrics (pp. 683–718). Amsterdam: Elsevier.

  • Teugels, J.L. (1990). Some representations of the multivariate Bernoulli and binomial distributions. Journal of Multivariate Analysis, 32, 256–268.

  • Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175–186.

  • van den Wollenberg, A.L. (1982). Two new test statistics for the Rasch model. Psychometrika, 47, 123–139.

Author information

Corresponding author

Correspondence to Alberto Maydeu-Olivares.

Additional information

This research was supported by an ICREA-Academia Award and Grant SGR 2009 74 from the Catalan Government, and by Grants PSI2009-07726 and PR2010-0252 from the Spanish Ministry of Education awarded to the first author, and by a Dissertation Research Award of the Society of Multivariate Experimental Psychology awarded to the second author. The authors are indebted to the reviewers and to David Thissen for comments that improved the manuscript.

Appendix

Relationship $\boldsymbol{\pi}_{R_2} = \mathbf{T}_{R_2}\,\boldsymbol{\pi}$ for $n = 4$ items

About this article

Cite this article

Maydeu-Olivares, A., Montaño, R. How Should We Assess the Fit of Rasch-Type Models? Approximating the Power of Goodness-of-Fit Statistics in Categorical Data Analysis. Psychometrika 78, 116–133 (2013). https://doi.org/10.1007/s11336-012-9293-1

  • DOI: https://doi.org/10.1007/s11336-012-9293-1
