The Assessment of Differential Item Functioning in Computer Adaptive Tests

  • Chapter
Computerized Adaptive Testing: Theory and Practice

References

  • Agresti, A. (1990). Categorical data analysis. New York: Wiley.

  • Camilli, G., & Shepard, L. A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.

  • Donoghue, J. R., Holland, P. W., & Thayer, D. T. (1993). A Monte Carlo study of factors that affect the Mantel-Haenszel and standardization measures of differential item functioning. In P. W. Holland & H. Wainer (Eds.), Differential item functioning. Hillsdale, NJ: Erlbaum.

  • Dorans, N. J., & Holland, P. W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 35–66). Hillsdale, NJ: Erlbaum.

  • Dorans, N. J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. Journal of Educational Measurement, 23, 355–368.

  • Fischer, G.H. (1995). Some neglected problems in IRT. Psychometrika, 60, 459–487.

  • Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (1995). Bayesian data analysis. London: Chapman & Hall.

  • Holland, P. W., & Thayer, D. T. (1985). An alternative definition of the ETS delta scale of item difficulty. (ETS Research Report No. 85-43). Princeton, NJ: Educational Testing Service.

  • Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129–145). Hillsdale, NJ: Erlbaum.

  • Holland, P. W., & Zwick, R. (1991). A simulation study of some simple approaches to the study of DIF for CAT’s. (Internal memorandum). Princeton, NJ: Educational Testing Service.

  • Holland, P. W., & Wainer, H. (Eds.). (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.

  • Kelley, T. L. (1923). Statistical methods. New York: Macmillan.

  • Krass, I., & Segall, D. (1998). Differential item functioning and on-line item calibration. (Draft report). Monterey, CA: Defense Manpower Data Center.

  • Legg, S.M., & Buhr, D.C. (1992). Computerized adaptive testing with different groups. Educational Measurement: Issues and Practice, 11, 23–27.

  • Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719–748.

  • Miller, T.R. (1992, April). Practical considerations for conducting studies of differential item functioning in a CAT environment. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.

  • Miller, T.R., & Fan, M. (1998, April). Assessing DIF in high dimensional CATs. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego.

  • Nandakumar, R., & Roussos, L. (in press). CATSIB: A modified SIBTEST procedure to detect differential item functioning in computerized adaptive tests. (Research report). Newtown, PA: Law School Admission Council.

  • Pashley, P. J. (1997). Computerized LSAT research agenda: Spring 1997 update. (LSAC report). Newtown, PA: Law School Admission Council.

  • Phillips, A., & Holland, P. W. (1987). Estimation of the variance of the Mantel-Haenszel log-odds-ratio estimate. Biometrics, 43, 425–431.

  • Pommerich, M., Spray, J.A., & Parshall, C.G. (1995). An analytical evaluation of two common-odds ratios as population indicators of DIF. (ACT Report 95-1). Iowa City: American College Testing Program.

  • Powers, D. E., & O’Neill, K. (1993). Inexperienced and anxious computer users: Coping with a computer-administered test of academic skills. Educational Assessment, 1, 153–173.

  • Robins, J., Breslow, N., & Greenland, S. (1986). Estimators of the Mantel-Haenszel variance consistent in both sparse data and large-strata limiting models. Biometrics, 42, 311–323.

  • Roussos, L. (1996, June). A type I error rate study of a modified SIBTEST DIF procedure with potential application to computerized-adaptive tests. Paper presented at the annual meeting of the Psychometric Society, Banff, Alberta, Canada.

  • Roussos, L., & Nandakumar, R. (1998, June). Kernel-smoothed CATSIB. Paper presented at the annual meeting of the Psychometric Society, Urbana-Champaign, IL.

  • Roussos, L., & Stout, W.F. (1996). Simulation studies of effects of small sample size and studied item parameters on SIBTEST and Mantel-Haenszel Type I error performance. Journal of Educational Measurement, 33, 215–230.

  • Schaeffer, G., Reese, C., Steffen, M., McKinley, R. L., & Mills, C. N. (1993). Field test of a computer-based GRE general test. (ETS Research Report No. RR 93-07). Princeton, NJ: Educational Testing Service.

  • Shealy, R., & Stout, W.F. (1993a). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159–194.

  • Shealy, R., & Stout, W.F. (1993b). An item response theory model for test bias and differential test functioning. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 197–239). Hillsdale, NJ: Erlbaum.

  • Steinberg, L., Thissen, D., & Wainer, H. (1990). Validity. In H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 187–231). Hillsdale, NJ: Erlbaum.

  • Stocking, M.L., Jirele, T., Lewis, C., & Swanson, L. (1998). Moderating possibly irrelevant multiple mean score differences on a test of mathematical reasoning. Journal of Educational Measurement, 35, 199–222.

  • Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Erlbaum.

  • Way, W. D. (1994). A simulation study of the Mantel-Haenszel procedure for detecting DIF for the NCLEX using CAT. (Internal technical report). Princeton, NJ: Educational Testing Service.

  • Wenglinsky, H. (1998). Does it compute? The relationship between educational technology and student achievement in mathematics. (ETS Policy Information Center report). Princeton, NJ: Educational Testing Service.

  • Wingersky, M. S., Patrick, R., & Lord, F. M. (1988). LOGIST user’s guide: LOGIST Version 6.00. Princeton, NJ: Educational Testing Service.

  • Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 337–347). Hillsdale, NJ: Erlbaum.

  • Zwick, R. (1990). When do item response function and Mantel-Haenszel definitions of differential item functioning coincide? Journal of Educational Statistics, 15, 185–197.

  • Zwick, R. (1992). Application of Mantel’s chi-square test to the analysis of differential item functioning for ordinal items. (Technical memorandum). Princeton, NJ: Educational Testing Service.

  • Zwick, R. (1997). The effect of adaptive administration on the variability of the Mantel-Haenszel measure of differential item functioning. Educational and Psychological Measurement, 57, 412–421.

  • Zwick, R., & Thayer, D. T. (1996). Evaluating the magnitude of differential item functioning in polytomous items. Journal of Educational and Behavioral Statistics, 21, 187–201.

  • Zwick, R., & Thayer, D. T. (in press). Application of an empirical Bayes enhancement of Mantel-Haenszel DIF analysis to computer-adaptive tests. Applied Psychological Measurement.

  • Zwick, R., Thayer, D. T., & Lewis, C. (1997). An investigation of the validity of an empirical Bayes approach to Mantel-Haenszel DIF analysis. (ETS Research Report No. 97-21). Princeton, NJ: Educational Testing Service.

  • Zwick, R., Thayer, D. T., & Lewis, C. (1999). An empirical Bayes approach to Mantel-Haenszel DIF analysis. Journal of Educational Measurement, 36, 1–28.

  • Zwick, R., Thayer, D.T., & Lewis, C. (2000). Using loss functions for DIF detection: An empirical Bayes approach. Journal of Educational and Behavioral Statistics, 25, 225–247.

  • Zwick, R., Thayer, D.T., & Mazzeo, J. (1997). Descriptive and inferential procedures for assessing DIF in polytomous items. Applied Measurement in Education, 10, 321–344.

  • Zwick, R., Thayer, D. T., & Wingersky, M. (1993). A simulation study of methods for assessing differential item functioning in computer-adaptive tests. (ETS Research Report 93-11). Princeton, NJ: Educational Testing Service.

  • Zwick, R., Thayer, D. T., & Wingersky, M. (1994a). A simulation study of methods for assessing differential item functioning in computerized adaptive tests. Applied Psychological Measurement, 18, 121–140.

  • Zwick, R., Thayer, D.T., & Wingersky, M. (1995). Effect of Rasch calibration on ability and DIF estimation in computer-adaptive tests. Journal of Educational Measurement, 32, 341–363.

Copyright information

© 2000 Kluwer Academic Publishers

About this chapter

Cite this chapter

Zwick, R. (2000). The Assessment of Differential Item Functioning in Computer Adaptive Tests. In: van der Linden, W.J., Glas, G.A. (eds) Computerized Adaptive Testing: Theory and Practice. Springer, Dordrecht. https://doi.org/10.1007/0-306-47531-6_12

  • DOI: https://doi.org/10.1007/0-306-47531-6_12

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-0-7923-6425-2

  • Online ISBN: 978-0-306-47531-3

  • eBook Packages: Springer Book Archive
