Ackerman, T. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional IRT perspective. *Journal of Educational Measurement, 29*, 67–91.

Ackerman, T. (1992, April). *Assessing construct validity using multidimensional item response theory*. Paper presented at the 1992 AERA/NCME joint meeting, San Francisco, CA.

Ansley, T. N., & Forsyth, R. A. (1985). An examination of the characteristics of unidimensional IRT parameter estimates derived from two-dimensional data. *Applied Psychological Measurement, 9*, 37–48.

Dorans, N. J. (1992, November). *Implications in choice of metric for DIF effect size on decisions about DIF*. Paper presented at the 1991 International Symposium on Modern Theories in Measurement, Montebello, Quebec.

Dorans, N. J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. *Journal of Educational Measurement, 23*, 355–368.

Drasgow, F. (1987). A study of measurement bias of two standard psychological tests. *Journal of Applied Psychology, 72*, 19–30.

Fraser, C. (1983). *NOHARM II, A Fortran program for fitting unidimensional and multi-dimensional normal ogive models of latent trait theory* (Technical Report). University of New England, Australia.

Hambleton, R. K., & Rogers, H. J. (1989). Detecting potentially biased test items: Comparison of IRT area and Mantel-Haenszel methods. *Applied Measurement in Education, 2*, 313–334.

Hambleton, R. K., & Swaminathan, H. (1985). *Item response theory: Principles and applications*. Boston: Kluwer-Nijhoff Publishing.

Holland, P. W., & Thayer, D. T. (1988). Differential item functioning and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), *Test validity* (pp. 129–145). Hillsdale, NJ: Lawrence Erlbaum.

Kok, F. (1988). Item bias and test multidimensionality. In R. Langeheine & J. Rost (Eds.), *Latent trait and latent class models* (pp. 263–275). New York: Plenum Press.

Lautenschlager, G., & Park, D. (1988). IRT item bias detection procedures: Issues of model mis-specification, robustness, and parameter linking. *Applied Psychological Measurement, 12*, 365–376.

Linn, R., Levine, M., Hastings, C., & Wardrop, J. (1981). Item bias on a test of reading comprehension. *Applied Psychological Measurement, 5*, 159–173.

Lord, F. M. (1980). *Applications of item response theory to practical testing problems*. Hillsdale, NJ: Lawrence Erlbaum.

Lord, F. M., & Novick, M. R. (1968). *Statistical theories of mental test scores*. Reading, MA: Addison-Wesley.

Mellenbergh, G. J. (1982). Contingency table methods for assessing item bias. *Journal of Educational Statistics, 7*, 105–118.

Meredith, W., & Millsap, R. E. (1992). On the misuse of manifest variables in the detection of measurement bias. *Psychometrika, 57*, 289–311.

Millsap, R. E., & Meredith, W. (1989, July). *The detection of DIF: Why there is no free lunch*. Paper presented at the Annual Meeting of the Psychometric Society, University of California at Los Angeles.

Mislevy, R. J., & Bock, R. D. (1984). *Item operating characteristics of the Armed Services Vocational Aptitude Battery (ASVAB), Form 8A* (Tech. Rep. N00014-83-C-0283). Washington, DC: Office of Naval Research.

Nandakumar, R. (in press). Simultaneous DIF amplification and cancellation: Shealy-Stout's test for DIF. *Journal of Educational Measurement*.

Raju, N. S., van der Linden, W. J., & Fleer, P. J. (1992, April). *An IRT-based internal measure of test bias with applications for differential item functioning*. Paper presented at the 1992 AERA meeting, San Francisco, CA.

Reckase, M. D. (1992, April). *Mathematics test item formats versus the skill being assessed: A brief review*. Paper presented at the 1992 NCME Meeting, San Francisco, CA.

Roussos, L. (1993). *Simulation studies of effects of small sample size and studied item parameters on SIBTEST and Mantel-Haenszel Type 1 error performance* (Technical Report). Champaign, IL: University of Illinois.

Shealy, R. T. (1989). *An item response theory-based statistical procedure for detecting concurrent internal bias in ability tests*. Unpublished doctoral dissertation, Department of Statistics, University of Illinois, Urbana-Champaign.

Shealy, R. T., & Stout, W. F. (1991a). *An item response theory model for test bias* (Tech. Rep. 1991-#2). Washington, DC: Office of Naval Research.

Shealy, R. T., & Stout, W. F. (1991b). *A procedure to detect test bias present simultaneously in several items* (Tech. Rep. 1991-#3). Washington, DC: Office of Naval Research.

Shealy, R. T., & Stout, W. F. (1993). An item response theory model for test bias and differential test functioning. In P. Holland & H. Wainer (Eds.), *Differential item functioning: Theory and practice* (pp. 197–240). Hillsdale, NJ: Erlbaum.

Stout, W. F. (1987). A nonparametric approach for assessing latent trait unidimensionality. *Psychometrika, 52*, 589–617.

Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. *Journal of Educational Measurement, 27*, 361–370.

Wainer, H. (1993). Model-based standardized measurement of an item's differential impact. In P. Holland & H. Wainer (Eds.), *Differential item functioning: Theory and practice* (pp. 123–136). Hillsdale, NJ: Erlbaum.

Zwick, R. (1990). When do item response function and Mantel-Haenszel definitions of differential item functioning coincide? *Journal of Educational Statistics, 15*, 185–197.