A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF

Shealy, Robin; Stout, William

doi:10.1007/BF02294572

Published: June 1993

Psychometrika, Volume 58, pages 159–194 (1993)



Abstract

A model-based modification (SIBTEST) of the standardization index, based on a multidimensional IRT bias modeling approach, is presented that detects and estimates DIF or item bias simultaneously for several items. A distinction between DIF and bias is proposed. SIBTEST detects bias/DIF without the usual Type 1 error inflation due to group target ability differences. In simulations, SIBTEST performs comparably to Mantel-Haenszel for the one-item case. SIBTEST investigates bias/DIF for several items at the test score level (multiple-item DIF, called differential test functioning, or DTF), thereby allowing the study of test bias/DIF, in particular bias/DIF amplification or cancellation and the cognitive bases for bias/DIF.
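To make the idea of a standardization-type index concrete, the following is a minimal sketch and not the authors' procedure. It computes a weighted difference in mean studied-item scores between the reference and focal groups, with examinees matched on a separate matching score, along with a simple standard error. The function name `weighted_dif_index`, the input layout, the pooled stratum weights, and the rule of skipping sparse strata are all assumptions made for illustration. The correction SIBTEST applies for group target ability differences is deliberately omitted, so this uncorrected version remains open to the Type 1 error inflation the abstract mentions.

```python
from collections import defaultdict
from statistics import mean, variance


def weighted_dif_index(ref, foc):
    """Uncorrected standardization-type DIF index (illustrative only).

    ref, foc: lists of (matching_score, studied_score) pairs, one pair per
    examinee in the reference and focal group respectively.
    Returns (beta_hat, std_err). Positive beta_hat favors the reference group.
    """
    by_k_ref, by_k_foc = defaultdict(list), defaultdict(list)
    for k, y in ref:
        by_k_ref[k].append(y)
    for k, y in foc:
        by_k_foc[k].append(y)

    # Assumption: use only matching scores where both groups have at least
    # two examinees, so a sample variance exists in every stratum.
    strata = [k for k in by_k_ref
              if len(by_k_ref[k]) >= 2 and len(by_k_foc.get(k, [])) >= 2]
    n_total = sum(len(by_k_ref[k]) + len(by_k_foc[k]) for k in strata)
    if n_total == 0:
        raise ValueError("no matching-score stratum has enough examinees")

    beta_hat, var_hat = 0.0, 0.0
    for k in strata:
        r, f = by_k_ref[k], by_k_foc[k]
        p_k = (len(r) + len(f)) / n_total  # assumed pooled stratum weight
        beta_hat += p_k * (mean(r) - mean(f))
        var_hat += p_k ** 2 * (variance(r) / len(r) + variance(f) / len(f))
    return beta_hat, var_hat ** 0.5
```

Passing the summed score over several studied items as `studied_score` gives a test-level analogue of the same index, which is the sense in which amplification or cancellation across items could be examined. Dividing `beta_hat` by `std_err` gives a rough standardized statistic, but its reference distribution is not established by this sketch.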



Author information

Authors and Affiliations

Department of Statistics, University of Illinois at Urbana-Champaign, USA
Robin Shealy & William Stout


Additional information

This research was partially supported by Office of Naval Research Cognitive and Neural Sciences Grant N0014-90-J-1940, 4421-548 and National Science Foundation Mathematics Grant NSF-DMS-91-01436. The research reported here is collaborative in every respect and the order of authorship is alphabetical. The assistance of Hsin-hung Li and Louis Roussos in conducting the simulation studies was of great help. Discussions with Terry Ackerman, Paul Holland, and Louis Roussos were very helpful.


About this article

Cite this article

Shealy, R., Stout, W. A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika 58, 159–194 (1993). https://doi.org/10.1007/BF02294572


Issue Date: June 1993
DOI: https://doi.org/10.1007/BF02294572
