Item bias detection using loglinear IRT

Psychometrika

Abstract

A method is proposed for the detection of item bias with respect to observed or unobserved subgroups. The method uses quasi-loglinear models for the incomplete subgroup × test score × item 1 × ... × item k contingency table. If subgroup membership is unknown, the models are Haberman's incomplete-latent-class models.
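
As a rough illustration of the incomplete table described above, the following Python sketch counts observations in a subgroup × test score × item 1 × ... × item k table for dichotomous items. The function names and the toy data are hypothetical and are not taken from the article; the sketch only assumes that the test score is the sum of the item responses, so that every cell whose score disagrees with its item pattern is a structural zero.

```python
from collections import Counter
from itertools import product


def build_incomplete_table(groups, responses):
    """Count persons per (subgroup, score, x_1, ..., x_k) cell.

    Cells whose score differs from the sum of the item responses are
    structural zeros; they can never be observed and are never created.
    """
    counts = Counter()
    for g, x in zip(groups, responses):
        x = tuple(x)
        counts[(g, sum(x)) + x] += 1
    return counts


def structural_cells(group_labels, k):
    """List every cell that is NOT a structural zero (score == sum of items)."""
    return [
        (g, sum(x)) + x
        for g in group_labels
        for x in product((0, 1), repeat=k)
    ]


# Hypothetical toy data: two subgroups, three dichotomous items.
groups = ["a", "a", "b", "b", "b"]
responses = [(1, 0, 0), (1, 1, 0), (1, 0, 0), (0, 0, 1), (1, 1, 1)]

table = build_incomplete_table(groups, responses)
cells = structural_cells(["a", "b"], 3)
```

With two subgroups and three items, the full cross-classification would have 2 × 4 × 8 = 64 cells, but only 16 of them can be non-empty, which is why the table is called incomplete.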

The (conditional) Rasch model is formulated as a quasi-loglinear model. The parameters in this loglinear model that correspond to the main effects of the item responses are the conditional estimates of the parameters in the Rasch model. Item bias can then be tested by comparing the quasi-loglinear Rasch model with models that contain parameters for the interaction of item responses and the subgroups.
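
The comparison described above can be sketched in equations. The notation here is an assumption made for illustration and need not match the article's: m denotes the expected count in a cell that is not structurally zero, g indexes the subgroup, t the test score, and x_j the response to item j, with t = x_1 + ... + x_k.

```latex
% Illustrative notation (assumed, not necessarily the article's).
% Quasi-loglinear Rasch model without item bias,
% defined only for cells with t = x_1 + ... + x_k:
\log m_{g t x_1 \cdots x_k}
  = \lambda + \lambda^{G}_{g} + \lambda^{T}_{t} + \lambda^{GT}_{g t}
  + \sum_{j=1}^{k} \lambda^{I_j}_{x_j}

% Model allowing bias in one item i: add a subgroup-by-item interaction.
\log m_{g t x_1 \cdots x_k}
  = \lambda + \lambda^{G}_{g} + \lambda^{T}_{t} + \lambda^{GT}_{g t}
  + \sum_{j=1}^{k} \lambda^{I_j}_{x_j} + \lambda^{G I_i}_{g x_i}
```

In this sketch the item main effects play the role of the conditional Rasch item parameters, and the subgroup-by-score term lets the score distribution differ between subgroups. Because the first model is nested in the second, a difference in fit between them (for example, in a likelihood-ratio statistic, which is an assumption here) would point to bias in item i.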

References

  • Baker, R. J., & Nelder, J. A. (1978). The GLIM system: Generalized linear interactive modeling. Oxford: The Numerical Algorithms Group.

  • Berk, R. A. (1982). Handbook of methods for detecting test bias. Baltimore: The Johns Hopkins University Press.

  • Binet, A., & Simon, T. (1916). The development of intelligence in children. Baltimore: Williams & Wilkins.

  • Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis. Cambridge, MA: MIT Press.

  • Camilli, G. (1979). A critique of the chi-square method for assessing item bias. Unpublished paper, University of Colorado, Laboratory of Educational Research, Boulder.

  • Cressie, N., & Holland, P. W. (1983). Characterizing the manifest probabilities of latent trait models. Psychometrika, 48, 129–142.

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.

  • Duncan, O. D. (1984). Rasch measurement: Further examples and discussion. In C. F. Turner & E. Martin (Eds.), Surveying subjective phenomena (Vol. 2, pp. 367–403). New York: Russell Sage Foundation.

  • Durovic, J. (1975). Definitions of test bias: A taxonomy and an illustration of an alternative model. Unpublished doctoral dissertation, State University of New York at Albany.

  • Fienberg, S. E. (1972). The analysis of incomplete multi-way contingency tables. Biometrics, 28, 177–202. (Corrig. 1972, 29, 829)

  • Fischer, G. H., & Formann, A. K. (1982). Some applications of logistic latent trait models with linear constraints on parameters. Applied Psychological Measurement, 6, 397–416.

  • Goodman, L. A. (1974). Exploratory latent structure analysis. Biometrika, 61, 215–231.

  • Goodman, L. A. (1975). A new model for scaling response patterns: An application of the quasi-independence concept. Journal of the American Statistical Association, 70, 755–768.

  • Goodman, L. A. (1978). Analyzing qualitative/categorical data: Loglinear models and latent structure analysis. London: Addison Wesley.

  • Goodman, L. A., & Fay, R. (1974). ECTA program, description for users. Chicago: University of Chicago, Department of Statistics.

  • Haberman, S. J. (1979). Analysis of qualitative data: New developments (Vol. 2). New York: Academic Press.

  • Holland, P. W. (1985). On the study of differential item performance without IRT. Paper presented at the Annual Meeting of the Military Testing Association, San Diego.

  • Holland, P. W., & Thayer, D. (1986). Differential item performance and the Mantel-Haenszel statistic. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco.

  • Ironson, G. H. (1982). Use of chi-square and latent trait approaches for detecting item bias. In R. A. Berk (Ed.), Handbook of methods for detecting test bias. Baltimore: The Johns Hopkins University Press.

  • Jensen, A. R. (1980). Bias in mental testing. London: Methuen.

  • Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223–245.

  • Kelderman, H. (1987). Estimating quasi-loglinear models for a Rasch table if the number of items is large (Research Report 87-5). Enschede: University of Twente, Department of Education.

  • Kelderman, H., & Steen, R. (1988). LOGIMO: A program for loglinear IRT modeling. Enschede: University of Twente, Department of Education.

  • Kok, F. G. (1982). Het partijdige item [The biased item]. Psychologisch Laboratorium, University of Amsterdam.

  • Kok, F. G., & Mellenbergh, G. J. (1985, July). A mathematical model for item bias and a definition of bias effect size. Paper presented at the Fourth Meeting of the Psychometric Society, Cambridge, Great Britain.

  • Kok, F. G., Mellenbergh, G. J., & van der Flier, H. (1985). An iterative procedure for detecting biased items. Journal of Educational Measurement, 22, 295–303.

  • Larntz, K. (1978). Small-sample comparisons of exact levels for chi-square statistics. Journal of the American Statistical Association, 73, 412–419.

  • Lazarsfeld, P. F. (1950). The interpretation and computation of some latent structures. In S. A. Stouffer et al. (Eds.), Measurement and prediction in World War II (Vol. 4, pp. 413–472). Princeton: Princeton University Press.

  • Lazarsfeld, P. F., & Henry, N. W. (1968). Latent structure analysis. Boston: Houghton Mifflin.

  • Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, New Jersey: Lawrence Erlbaum.

  • McHugh, R. B. (1956). Efficient estimation and local identification in latent class analysis. Psychometrika, 21, 331–347.

  • Mellenbergh, G. J. (1982). Contingency table methods for assessing item bias. Journal of Educational Statistics, 7, 105–118.

  • Mislevy, R. J. (1981). A general linear model for the analysis of Rasch item threshold estimates. Unpublished doctoral dissertation, University of Chicago.

  • Muthén, B., & Lehman, J. (1985). Multiple group IRT modeling: Applications to item bias analysis. Journal of Educational Statistics, 10, 133–142.

  • Nungester, R. J. (1977). An empirical examination of three models of item bias. Dissertation Abstracts International, 38, 2726A. (University Microfilms No. 77-24,289; doctoral dissertation, Florida State University, 1977)

  • Osterlind, S. J. (1983). Test item bias. Beverly Hills: Sage.

  • Petersen, N. S., & Novick, M. R. (1976). An evaluation of some models for culture-fair selection. Journal of Educational Measurement, 3–29.

  • Rao, C. R. (1965). Linear statistical inference and its applications. New York: Wiley.

  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Paedagogiske Institut.

  • Rasch, G. (1966). An item analysis that takes individual differences into account. British Journal of Mathematical and Statistical Psychology, 19, 49–57.

  • Scheuneman, J. (1979). A method of assessing bias in test items. Journal of Educational Measurement, 16, 143–152.

  • Shepard, L. A., Camilli, G., & Averill, M. (1981). Comparison of procedures for detecting test-item bias with both internal and external ability criteria. Journal of Educational Statistics, 6, 317–377.

  • Tjur, T. (1982). A connection between Rasch's item analysis model and a multiplicative Poisson model. Scandinavian Journal of Statistics, 9, 23–30.

  • Wright, B. D., Mead, R. J., & Draba, R. (1975). Detecting and correcting test item bias with a logistic response model (RM 22). Chicago: University of Chicago, Department of Education, Statistical Laboratory.

Additional information

The author thanks Wim J. van der Linden and Gideon J. Mellenbergh for comments and suggestions and Frank Kok for empirical data.

About this article

Cite this article

Kelderman, H. Item bias detection using loglinear IRT. Psychometrika 54, 681–697 (1989). https://doi.org/10.1007/BF02296403

  • DOI: https://doi.org/10.1007/BF02296403
