A Comparison of Differential Item Functioning (DIF) Detection for Dichotomously Scored Items Using IRTPRO, BILOG-MG, and IRTLRDIF

Ong, Mei Ling; Kim, Seock-Ho; Cohen, Allan; Cramer, Stephen

doi:10.1007/978-3-319-19977-1_10

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 140)

Abstract

This study provides an empirical comparison of three IRT calibration programs, IRTPRO, BILOG-MG, and IRTLRDIF, each of which can be used to detect differential item functioning (DIF). The programs were compared under three dichotomous IRT models: the one-parameter, two-parameter, and three-parameter logistic models. Results from each program were examined using data from a test designed to predict high school graduation test results in a large Southeastern US state. The results suggested that the three programs differed in their DIF detection.
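For readers unfamiliar with the three models named in the abstract, the following is a minimal sketch of their item response functions in Python. It is an illustration only and is not taken from the chapter or from any of the three programs. The function names, the parameter values, and the choice to omit the scaling constant (often written D = 1.7) are assumptions made here. The last lines show what DIF on the difficulty parameter looks like in this parameterization: two groups with the same ability but different item difficulty have different probabilities of a correct response.

```python
import math


def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the three-parameter logistic model.

    theta: examinee ability
    a: item discrimination, b: item difficulty, c: lower asymptote (guessing)
    Assumption: logistic metric with no scaling constant D.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def irf_2pl(theta, a, b):
    """Two-parameter logistic model: the 3PL with the lower asymptote fixed at 0."""
    return irf_3pl(theta, a=a, b=b, c=0.0)


def irf_1pl(theta, b):
    """One-parameter logistic model: the 2PL with discrimination fixed at 1."""
    return irf_3pl(theta, a=1.0, b=b, c=0.0)


# Hypothetical item showing DIF on difficulty: the item is harder for the
# focal group (b = 0.5) than for the reference group (b = 0.0).
theta = 0.0
p_reference = irf_2pl(theta, a=1.0, b=0.0)
p_focal = irf_2pl(theta, a=1.0, b=0.5)
dif_gap = p_reference - p_focal  # positive: reference group is favored
print(f"reference={p_reference:.3f} focal={p_focal:.3f} gap={dif_gap:.3f}")
```

Nesting the 1PL and 2PL inside the 3PL keeps the three models consistent with one another, which mirrors the way the abstract treats them as a family of dichotomous models.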

Author information

Authors and Affiliations

Department of Education Psychology, University of Georgia, 126H Aderhold Hall, Athens, GA 30602, USA
Mei Ling Ong
Department of Education Psychology, University of Georgia, 325U Aderhold Hall, Athens, GA 30602, USA
Seock-Ho Kim
Department of Education Psychology, University of Georgia, 125M Aderhold Hall, Athens, GA 30602, USA
Allan Cohen
Department of Education Psychology, University of Georgia, 320A Aderhold Hall, Athens, GA 30602, USA
Stephen Cramer

Corresponding author

Correspondence to Mei Ling Ong.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
L. Andries van der Ark
University of Wisconsin, Madison, Wisconsin, USA
Daniel M. Bolt
The Hong Kong Institute of Education, Hong Kong, Hong Kong SAR
Wen-Chung Wang
University of Illinois, Champaign, Illinois, USA
Jeffrey A. Douglas
The Penn State University, University Park, Pennsylvania, USA
Sy-Miin Chow

About this paper

Cite this paper

Ong, M.L., Kim, SH., Cohen, A., Cramer, S. (2015). A Comparison of Differential Item Functioning (DIF) Detection for Dichotomously Scored Items Using IRTPRO, BILOG-MG, and IRTLRDIF. In: van der Ark, L., Bolt, D., Wang, WC., Douglas, J., Chow, SM. (eds) Quantitative Psychology Research. Springer Proceedings in Mathematics & Statistics, vol 140. Springer, Cham. https://doi.org/10.1007/978-3-319-19977-1_10

DOI: https://doi.org/10.1007/978-3-319-19977-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19976-4
Online ISBN: 978-3-319-19977-1
eBook Packages: Mathematics and Statistics (R0)
