Volume 60, Issue 4, pp 459–487

Some neglected problems in IRT

  • Gerhard H. Fischer


The paper addresses three neglected questions from IRT. In section 1, the properties of the “measurement” of ability or trait parameters and item difficulty parameters in the Rasch model are discussed. It is shown that the solution to this problem is rather complex and depends both on general assumptions about properties of the item response functions and on assumptions about the available item universe. Section 2 deals with the measurement of individual change or “modifiability” based on a Rasch test. A conditional likelihood approach is presented that (a) yields an ML estimator of modifiability for given item parameters, (b) allows one to test hypotheses about change by means of a Clopper-Pearson confidence interval for the modifiability parameter, and (c) allows one to estimate modifiability jointly with the item parameters. Uniqueness results for all three methods are also presented. In section 3, the Mantel-Haenszel method for detecting DIF is discussed from a novel perspective: What is the most general framework within which the Mantel-Haenszel method correctly detects DIF of a studied item? The answer is that this is a 2PL model in which, however, all discrimination parameters are known and the studied item has the same discrimination in both populations. Since these requirements would hardly be satisfied in practical applications, the case of constant discrimination parameters, that is, the Rasch model, is the only realistic framework. A simple Pearson χ² test for DIF of one studied item is proposed as an alternative to the Mantel-Haenszel test; moreover, this test is generalized to the case of two items simultaneously studied for DIF.
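For orientation, the Mantel-Haenszel procedure referred to in section 3 can be sketched in its usual textbook form: examinees are stratified by raw score, and for each score group a 2×2 table (group by response to the studied item) is formed. The sketch below is not the paper's proposed Pearson χ² alternative, and it is not taken from the paper. The function name `mantel_haenszel`, the tuple layout `(a, b, c, d)`, and the choice to clamp the continuity-corrected deviation at zero are assumptions made for illustration.

```python
def mantel_haenszel(strata, correction=0.5):
    """Textbook Mantel-Haenszel quantities for one studied item.

    `strata` is a list of (a, b, c, d) counts, one tuple per score group:
      a = reference group, item correct    b = reference group, item wrong
      c = focal group, item correct        d = focal group, item wrong
    Returns (common odds ratio estimate, chi-square statistic with 1 df).
    The layout and names are hypothetical, chosen for this sketch.
    """
    num = den = 0.0
    sum_a = sum_expected = sum_var = 0.0
    for a, b, c, d in strata:
        n_ref, n_foc = a + b, c + d          # group sizes in this stratum
        m_right, m_wrong = a + c, b + d      # response margins
        n = n_ref + n_foc
        if n < 2:
            continue                         # stratum carries no information
        num += a * d / n
        den += b * c / n
        sum_a += a
        # Mean and variance of `a` under the hypergeometric null (no DIF)
        sum_expected += n_ref * m_right / n
        sum_var += n_ref * n_foc * m_right * m_wrong / (n * n * (n - 1))
    alpha = num / den if den > 0 else float("inf")
    # Assumption of this sketch: the corrected deviation is clamped at 0
    deviation = max(0.0, abs(sum_a - sum_expected) - correction)
    chi2 = deviation ** 2 / sum_var if sum_var > 0 else 0.0
    return alpha, chi2


if __name__ == "__main__":
    # Made-up counts for three score groups, purely illustrative
    tables = [(12, 8, 9, 11), (15, 5, 12, 8), (18, 2, 16, 4)]
    alpha_hat, chi2_stat = mantel_haenszel(tables)
    print(f"common odds ratio: {alpha_hat:.3f}, MH chi-square: {chi2_stat:.3f}")
```

An odds ratio estimate near 1 (and a small χ² value) is what one would expect when the studied item shows no DIF across the matched score groups. Under the Rasch model the raw score is a sufficient statistic for ability, which is why matching on it is natural in that framework.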

Key words

measurement; IRT; Rasch model; measurement of change; DIF; Mantel-Haenszel statistic





Copyright information

© The Psychometric Society 1995

Authors and Affiliations

  • Gerhard H. Fischer, University of Vienna, Austria
