Abstract
A method is proposed for the detection of item bias with respect to observed or unobserved subgroups. The method uses quasi-loglinear models for the incomplete subgroup × test score × item 1 × ... × item k contingency table. If subgroup membership is unknown, the models are Haberman's incomplete-latent-class models.
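To make the incomplete table concrete, the following is a minimal sketch of how such a table could be assembled for dichotomous items. The function names and the toy records are hypothetical and are not taken from the article. The table is incomplete because a cell can only be observed when its test score equals the sum of its item responses; every other cell is a structural zero.

```python
from collections import Counter
from itertools import product


def build_incomplete_table(records, k):
    """Count observations per (subgroup, score, x_1, ..., x_k) cell.

    `records` is an iterable of (subgroup, responses) pairs, where
    `responses` is a tuple of k zeros and ones. The score is derived
    from the responses, so only consistent cells ever receive counts.
    """
    counts = Counter()
    for subgroup, responses in records:
        if len(responses) != k:
            raise ValueError("each response tuple must have k entries")
        counts[(subgroup, sum(responses)) + tuple(responses)] += 1
    return counts


def possible_cells(subgroups, k):
    """Enumerate the cells that are not structural zeros."""
    return [
        (g, sum(x)) + x
        for g in subgroups
        for x in product((0, 1), repeat=k)
    ]


# Hypothetical toy data: two subgroups, three items.
records = [
    ("A", (1, 0, 0)),
    ("A", (1, 1, 0)),
    ("B", (0, 0, 1)),
    ("B", (1, 1, 0)),
    ("A", (1, 0, 0)),
]
table = build_incomplete_table(records, k=3)
cells = possible_cells(["A", "B"], k=3)

# Size of the full cross-classification: subgroups x scores 0..3 x patterns.
full_size = 2 * 4 * 2 ** 3
print(len(cells), "of", full_size, "cells can be non-empty")
```

With three items and two subgroups, only 16 of the 64 cells in the full cross-classification can contain observations, which is why a quasi-loglinear model (one defined only on the non-structural cells) is needed.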
The (conditional) Rasch model is formulated as a quasi-loglinear model. The parameters in this loglinear model that correspond to the main effects of the item responses are the conditional estimates of the parameters in the Rasch model. Item bias can then be tested by comparing the quasi-loglinear Rasch model with models that contain parameters for the interaction of item responses and the subgroups.
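The loglinear form can be checked numerically. The sketch below assumes the standard dichotomous Rasch model with item difficulties `b` and ability `theta`; the function names and the sign convention (item main effect equal to minus the difficulty) are illustrative assumptions, not notation from the article. It shows that the log probability of a response pattern, conditional on the test score, equals a sum of item main effects plus a score-specific constant and does not depend on ability.

```python
import math
from itertools import product


def rasch_pattern_prob(x, theta, b):
    """Probability of response pattern x under the Rasch model."""
    p = 1.0
    for xi, bi in zip(x, b):
        p_correct = 1.0 / (1.0 + math.exp(-(theta - bi)))
        p *= p_correct if xi else 1.0 - p_correct
    return p


def conditional_prob(x, theta, b):
    """P(pattern x | test score), computed directly from the Rasch model."""
    t = sum(x)
    same_score = [y for y in product((0, 1), repeat=len(b)) if sum(y) == t]
    total = sum(rasch_pattern_prob(y, theta, b) for y in same_score)
    return rasch_pattern_prob(x, theta, b) / total


def loglinear_log_prob(x, b):
    """Same quantity in loglinear form: item main effects + score term."""
    t = sum(x)
    beta = [-bi for bi in b]  # assumed convention: main effect = -difficulty
    # Score-specific normalizer: elementary symmetric function of order t.
    gamma_t = sum(
        math.exp(sum(yi * be for yi, be in zip(y, beta)))
        for y in product((0, 1), repeat=len(b))
        if sum(y) == t
    )
    return sum(xi * be for xi, be in zip(x, beta)) - math.log(gamma_t)


b = [-0.5, 0.0, 1.0]  # hypothetical item difficulties
x = (1, 0, 1)
for theta in (0.3, -1.2):
    direct = math.log(conditional_prob(x, theta, b))
    print(theta, round(direct, 6), round(loglinear_log_prob(x, b), 6))
```

Both ability values give the same conditional log probability, matching the loglinear expression. In this framing, a model for item bias would add parameters for the interaction between a subgroup indicator and an item response, and its fit would be compared with that of the model above.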
References
Baker, R. J., & Nelder, J. A. (1978). The GLIM system: Generalized linear interactive modeling. Oxford: The Numerical Algorithms Group.
Berk, R. A. (1982). Handbook of methods for detecting test bias. Baltimore: The Johns Hopkins University Press.
Binet, A., & Simon, T. (1916). The development of intelligence in children. Baltimore: Williams & Wilkins.
Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis. Cambridge, MA: MIT Press.
Camilli, G. (1979). A critique of the chi-square method for assessing item bias. Unpublished paper, University of Colorado, Laboratory of Educational Research, Boulder.
Cressie, N., & Holland, P. W. (1983). Characterizing the manifest probabilities of latent trait models. Psychometrika, 48, 129–142.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.
Duncan, O. D. (1984). Rasch measurement: Further examples and discussion. In C. F. Turner & E. Martin (Eds.), Surveying subjective phenomena (Vol. 2, pp. 367–403). New York: Russell Sage Foundation.
Durovic, J. (1975). Definitions of test bias: A taxonomy and an illustration of an alternative model. Unpublished doctoral dissertation, State University of New York at Albany.
Fienberg, S. E. (1972). The analysis of incomplete multi-way contingency tables. Biometrics, 28, 177–202. (Corrig. 1972, 29, 829)
Fischer, G. H., & Formann, A. K. (1982). Some applications of logistic latent trait models with linear constraints on parameters. Applied Psychological Measurement, 6, 397–416.
Goodman, L. A. (1974). Exploratory latent structure analysis. Biometrika, 61, 215–231.
Goodman, L. A. (1975). A new model for scaling response patterns: An application of the quasi-independence concept. Journal of the American Statistical Association, 70, 755–768.
Goodman, L. A. (1978). Analyzing qualitative/categorical data: Loglinear models and latent structure analysis. London: Addison Wesley.
Goodman, L. A., & Fay, R. (1974). ECTA program, description for users. Chicago: University of Chicago, Department of Statistics.
Haberman, S. J. (1979). Analysis of qualitative data: New developments (Vol. 2). New York: Academic Press.
Holland, P. W. (1985). On the study of differential item performance without IRT. Paper presented at the Annual Meeting of the Military Testing Association, San Diego.
Holland, P. W., & Thayer, D. (1986). Differential item performance and the Mantel-Haenszel statistic. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco.
Ironson, G. H. (1982). Use of chi-square and latent trait approaches for detecting item bias. In R. A. Berk (Ed.), Handbook of methods for detecting test bias. Baltimore: The Johns Hopkins University Press.
Jensen, A. R. (1980). Bias in mental testing. London: Methuen.
Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223–245.
Kelderman, H. (1987). Estimating quasi-loglinear models for a Rasch table if the number of items is large (Research Report 87-5). Enschede: University of Twente, Department of Education.
Kelderman, H., & Steen, R. (1988). LOGIMO: A program for loglinear IRT modeling. Enschede: University of Twente, Department of Education.
Kok, F. G. (1982). Het partijdige item [The biased item]. Psychologisch Laboratorium, University of Amsterdam.
Kok, F. G., & Mellenbergh, G. J. (1985, July). A mathematical model for item bias and a definition of bias effect size. Paper presented at the Fourth Meeting of the Psychometric Society, Cambridge, Great Britain.
Kok, F. G., Mellenbergh, G. J., & van der Flier, H. (1985). An iterative procedure for detecting biased items. Journal of Educational Measurement, 22, 295–303.
Larntz, K. (1978). Small-sample comparisons of exact levels for chi-square statistics. Journal of the American Statistical Association, 73, 412–419.
Lazarsfeld, P. F. (1950). The interpretation and computation of some latent structures. In S. A. Stouffer, et al. (Eds.), Measurement and prediction in World War II (Vol. 4, pp. 413–472). Princeton: Princeton University Press.
Lazarsfeld, P. F., & Henry, N. W. (1968). Latent structure analysis. Boston: Houghton Mifflin.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, New Jersey: Lawrence Erlbaum.
McHugh, R. B. (1956). Efficient estimation and local identification in latent class analysis. Psychometrika, 21, 331–347.
Mellenbergh, G. J. (1982). Contingency table methods for assessing item bias. Journal of Educational Statistics, 7, 105–118.
Mislevy, R. J. (1981). A general linear model for the analysis of Rasch item threshold estimates. Unpublished doctoral dissertation, University of Chicago.
Muthén, B., & Lehman, J. (1985). Multiple group IRT modeling: Applications to item bias analysis. Journal of Educational Statistics, 10, 133–142.
Nungester, R. J. (1977). An empirical examination of three models of item bias. Dissertation Abstracts International, 38, 2726 A. (University Microfilms No. 77-24, 289, Doctoral dissertation, Florida State University, 1977)
Osterlind, S. J. (1983). Test item bias. Beverly Hills: Sage.
Petersen, N. S., & Novick, M. R. (1976). An evaluation of some models for culture-fair selection. Journal of Educational Measurement, 3–29.
Rao, C. R. (1965). Linear statistical inference and its applications. New York: Wiley.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Paedagogiske Institut.
Rasch, G. (1966). An item analysis that takes individual differences into account. British Journal of Mathematical and Statistical Psychology, 19, 49–57.
Scheuneman, J. (1979). A method of assessing bias in test items. Journal of Educational Measurement, 16, 143–152.
Shepard, L. A., Camilli, G., & Averill, M. (1981). Comparison of procedures for detecting test-item bias with both internal and external ability criteria. Journal of Educational Statistics, 6, 317–377.
Tjur, T. (1982). A connection between Rasch's item analysis model and a multiplicative Poisson model. Scandinavian Journal of Statistics, 9, 23–30.
Wright, B. D., Mead, R. J., & Draba, R. (1975). Detecting and correcting test item bias with a logistic response model (RM 22). Chicago: University of Chicago, Department of Education, Statistical Laboratory.
Additional information
The author thanks Wim J. van der Linden and Gideon J. Mellenbergh for comments and suggestions and Frank Kok for empirical data.
Cite this article
Kelderman, H. Item bias detection using loglinear IRT. Psychometrika 54, 681–697 (1989). https://doi.org/10.1007/BF02296403