This paper demonstrates that a simple modification to Li and Stout’s (Psychometrika 61(4):647–677, 1996) CSIBTEST statistic yields an improved variant of the statistic. It is shown that this modified version of CSIBTEST has a more direct association with the SIBTEST statistic presented by Shealy and Stout (Psychometrika 58(2):159–194, 1993). In particular, the asymptotic sampling distributions and general interpretation of the effect size estimates are the same for SIBTEST and the new CSIBTEST. Given the more natural connection to SIBTEST, it is shown that Li and Stout’s hypothesis testing approach is insufficient for CSIBTEST; thus, an improved hypothesis testing procedure is required. Based on the presented arguments, a new chi-squared-based hypothesis testing approach is proposed for the modified CSIBTEST statistic. Positive results from a modest Monte Carlo simulation study strongly suggest that the original CSIBTEST procedure and randomization hypothesis testing approach should be replaced by the modified statistic and hypothesis testing method.
The use of KR-20 and coefficient \(\alpha \) assumes that the items used to form X are monotonically related to the composite, and therefore should be ordered and interval in nature.
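As a point of reference for this note, coefficient \(\alpha \) can be computed from a respondent-by-item score matrix as in the minimal sketch below. The function name `coefficient_alpha` and the toy data are illustrative assumptions and do not come from the article. For dichotomous (0/1) items, the same quantity is KR-20.

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """Coefficient alpha for a list of respondent rows (one score per item).

    Hypothetical helper for illustration only. Population variances are used
    for both the items and the composite, so the choice of denominator
    cancels in the variance ratio.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Variance of each item (column) across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the composite score X (row sums).
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("composite score has zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: four respondents, three dichotomous items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = coefficient_alpha(toy)
```

The sketch only computes the coefficient; it does not check the assumption stated in the note, so items that are not monotonically related to the composite would still return a number.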
Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29. doi:10.18637/jss.v048.i06.
Chalmers, R. P. (2016). SimDesign: Structure for organizing Monte Carlo simulation designs. R package version 1.0. https://CRAN.R-project.org/package=SimDesign.
Chalmers, R. P., Counsell, A., & Flora, D. B. (2016). It might not make a big DIF: Improved differential test functioning statistics that account for sampling variability. Educational and Psychological Measurement, 76(1), 114–140. doi:10.1177/0013164415584576.
Chang, H.-H., Mazzeo, J., & Roussos, L. (1996). DIF for polytomously scored items: An adaptation of the SIBTEST procedure. Journal of Educational Measurement, 33(3), 333–353.
Dorans, N. J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. Journal of Educational Measurement, 23(4), 355–368.
Edgington, E. S. (1987). Randomization tests. New York, NY: Marcel Dekker.
Guttman, L. (1945). A basis for analyzing test–retest reliability. Psychometrika, 10, 255–282.
Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151–160.
Li, H.-H., & Stout, W. (1996). A new procedure for detection of crossing DIF. Psychometrika, 61(4), 647–677.
Lord, F. M., & Novick, M. R. (1968). Statistical theory of mental test scores. Reading, MA: Addison-Wesley.
Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58(2), 159–194.
Sigal, M. J., & Chalmers, R. P. (2016). Play it again: Teaching statistics with Monte Carlo simulation. Journal of Statistics Education, 24(3), 136–156. doi:10.1080/10691898.2016.1246953.
Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Lawrence Erlbaum.
Special thanks to two anonymous reviewers for providing insightful comments that improved the quality of this manuscript. Correspondence concerning this article should be addressed to R. Philip Chalmers.
Chalmers, R.P. Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF. Psychometrika 83, 376–386 (2018). https://doi.org/10.1007/s11336-017-9583-8
- non-uniform DIF
- bidirectional bias