Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF


This paper demonstrates that, after applying a simple modification to Li and Stout’s (Psychometrika 61(4):647–677, 1996) CSIBTEST statistic, an improved variant of the statistic could be realized. It is shown that this modified version of CSIBTEST has a more direct association with the SIBTEST statistic presented by Shealy and Stout (Psychometrika 58(2):159–194, 1993). In particular, the asymptotic sampling distributions and general interpretation of the effect size estimates are the same for SIBTEST and the new CSIBTEST. Given the more natural connection to SIBTEST, it is shown that Li and Stout’s hypothesis testing approach is insufficient for CSIBTEST; thus, an improved hypothesis testing procedure is required. Based on the presented arguments, a new chi-squared-based hypothesis testing approach is proposed for the modified CSIBTEST statistic. Positive results from a modest Monte Carlo simulation study strongly suggest the original CSIBTEST procedure and randomization hypothesis testing approach should be replaced by the modified statistic and hypothesis testing method.
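As a rough illustration of what a chi-squared-based test for crossing DIF could look like, the sketch below combines two regional effect estimates, one on each side of the crossing point. It assumes the two standardized estimates are independent and asymptotically standard normal under the null, so that the sum of their squares follows a chi-squared distribution with 2 degrees of freedom. The function name, its arguments, and the numeric inputs are hypothetical, and the abstract alone does not confirm that this matches the paper's exact construction.

```python
import math


def crossing_chi2_test(beta_lower, se_lower, beta_upper, se_upper):
    """Combine two regional DIF estimates into a single chi-squared test.

    Assumption (not taken from the abstract): the two standardized
    estimates are independent and asymptotically N(0, 1) under the null,
    so the sum of their squares is chi-squared with 2 degrees of freedom.
    """
    z_lower = beta_lower / se_lower
    z_upper = beta_upper / se_upper
    x2 = z_lower ** 2 + z_upper ** 2
    # For 2 degrees of freedom the chi-squared survival function
    # has the closed form exp(-x / 2), so no external library is needed.
    p_value = math.exp(-x2 / 2.0)
    return x2, p_value


# Hypothetical estimates with opposite signs, as in crossing DIF.
x2, p = crossing_chi2_test(-0.08, 0.03, 0.09, 0.03)
signed_sum = -0.08 + 0.09  # close to zero: the two effects nearly cancel
print(f"X2 = {x2:.3f}, p = {p:.5f}, signed sum = {signed_sum:.2f}")
```

Squaring each standardized estimate before summing keeps effects of opposite sign from cancelling, which is the situation a signed, SIBTEST-style sum would miss.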


  1. The use of KR-20 and coefficient \(\alpha\) assumes that the items used to form X are monotonically related to the composite, and therefore should be ordered and interval in nature.
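For readers unfamiliar with the reliability coefficients named in this note, here is a minimal sketch of coefficient \(\alpha\) computed from a respondents-by-items score matrix. The function name and the toy data are illustrative and do not come from the article. With dichotomous (0/1) items and population variances, the same formula reduces to KR-20.

```python
def coefficient_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])  # number of items

    def variance(values):
        # Population variance; for 0/1 items this equals p * (1 - p),
        # which is the item variance term used by KR-20.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: four respondents answering three dichotomous items.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = coefficient_alpha(toy_scores)
print(alpha)
```

The same variance definition is used for the items and for the total score, so the ratio inside the formula does not depend on whether population or sample variances are chosen.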


  1. Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29. doi:10.18637/jss.v048.i06.

  2. Chalmers, R. P. (2016). SimDesign: Structure for organizing Monte Carlo simulation designs. R package version 1.0.

  3. Chalmers, R. P., Counsell, A., & Flora, D. B. (2016). It might not make a big DIF: Improved differential test functioning statistics that account for sampling variability. Educational and Psychological Measurement, 76(1), 114–140. doi:10.1177/0013164415584576.

  4. Chang, H.-H., Mazzeo, J., & Roussos, L. (1996). DIF for polytomously scored items: An adaptation of the SIBTEST procedure. Journal of Educational Measurement, 33(3), 333–353.

  5. Dorans, N. J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. Journal of Educational Measurement, 23(4), 355–368.

  6. Edgington, E. S. (1987). Randomization tests. New York, NY: Marcel Dekker.

  7. Guttman, L. (1945). A basis for analyzing test–retest reliability. Psychometrika, 10, 255–282.

  8. Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151–160.

  9. Li, H.-H., & Stout, W. (1996). A new procedure for detection of crossing DIF. Psychometrika, 61(4), 647–677.

  10. Lord, F. M., & Novick, M. R. (1968). Statistical theory of mental test scores. Reading, MA: Addison-Wesley.

  11. Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58(2), 159–194.

  12. Sigal, M. J., & Chalmers, R. P. (2016). Play it again: Teaching statistics with Monte Carlo simulation. Journal of Statistics Education, 24(3), 136–156. doi:10.1080/10691898.2016.1246953.

  13. Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Lawrence Erlbaum.

Author information



Corresponding author

Correspondence to R. Philip Chalmers.

Additional information

Special thanks to two anonymous reviewers for providing insightful comments that improved the quality of this manuscript. Correspondence concerning this article should be addressed to R. Philip Chalmers.

About this article

Cite this article

Chalmers, R.P. Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF. Psychometrika 83, 376–386 (2018).

Keywords


  • DIF
  • non-uniform DIF
  • bidirectional bias
  • Crossing-SIBTEST