Abstract
Measurement invariance (lack of bias) of a manifest variable Y with respect to a latent variable W is defined as invariance of the conditional distribution of Y given W over selected subpopulations. Invariance is commonly assessed by studying subpopulation differences in the conditional distribution of Y given a manifest variable Z, chosen to substitute for W. A unified treatment of conditions that may allow the detection of measurement bias using statistical procedures involving only observed or manifest variables is presented. Theorems are provided that give conditions for measurement invariance, and for invariance of the conditional distribution of Y given Z. Additional theorems and examples explore the Bayes sufficiency of Z, stochastic ordering in W, local independence of Y and Z, exponential families, and the reliability of Z. It is shown that when Bayes sufficiency of Z fails, the two forms of invariance will often not be equivalent in practice. Bayes sufficiency holds under Rasch model assumptions, and in long tests under certain conditions. It is concluded that bias detection procedures that rely strictly on observed variables are not in general diagnostic of measurement bias, or the lack of bias.
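The gap between the two forms of invariance can be illustrated with a small exact calculation. The sketch below is a toy example, not taken from the article: it assumes a binary latent variable W, binary manifest variables Y and Z that are locally independent given W, and two hypothetical subpopulations that differ only in the distribution of W. All probabilities and names (`p_y_given_z`, `p_group_a`, `p_group_b`) are assumptions chosen for illustration. The conditional distribution of Y given W is the same in both groups, so Y is measurement invariant by construction, yet the conditional distribution of Y given Z still differs between groups.

```python
# Toy illustration with hypothetical numbers (not from the article).
# W: binary latent variable; Y, Z: binary manifest variables.
# P(Y=1 | W) and P(Z=1 | W) are identical in both groups, so Y is
# measurement invariant with respect to W by construction.
# Y and Z are assumed locally independent given W.

def p_y_given_z(z, p_w1, p_y1_given_w=(0.2, 0.8), p_z1_given_w=(0.3, 0.7)):
    """Return P(Y=1 | Z=z) in a subpopulation where P(W=1) = p_w1."""
    num = 0.0  # accumulates P(Y=1, Z=z)
    den = 0.0  # accumulates P(Z=z)
    for w in (0, 1):
        p_w = p_w1 if w == 1 else 1.0 - p_w1
        p_z = p_z1_given_w[w] if z == 1 else 1.0 - p_z1_given_w[w]
        joint_zw = p_z * p_w  # P(Z=z, W=w)
        num += p_y1_given_w[w] * joint_zw
        den += joint_zw
    return num / den

# The groups differ only in the latent distribution: P(W=1) = 0.3 vs 0.7.
p_group_a = p_y_given_z(z=1, p_w1=0.3)
p_group_b = p_y_given_z(z=1, p_w1=0.7)
print(p_group_a, p_group_b)
```

With these assumed values, P(Y=1 | Z=1) is 0.5 in the first group and about 0.71 in the second, even though no measurement bias is present. A procedure that conditions only on Z would flag a group difference here, which is the kind of non-equivalence the abstract describes when Z is not Bayes sufficient.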
Additional information
Preparation of this article was supported in part by PSC-CUNY grant #661282 to Roger E. Millsap.
Cite this article
Meredith, W., Millsap, R.E. On the misuse of manifest variables in the detection of measurement bias. Psychometrika 57, 289–311 (1992). https://doi.org/10.1007/BF02294510
DOI: https://doi.org/10.1007/BF02294510