Bifactor MIRT as an Appealing and Related Alternative to CDMs in the Presence of Skill Attribute Continuity
For virtually all tests analyzed using CDMs, low-dimensional compensatory item response theory (IRT) models with continuous abilities appear to provide an equivalent or better statistical fit, as noted in a recent commentary by von Davier and Haberman (Psychometrika, 79:340–346, 2014). We examine these issues using both simulation and real data analyses. We suggest that the results motivate consideration of bifactor MIRT models as an attractive alternative for diagnostic measurement, particularly in cases where skill attribute continuity is suspected or can be confirmed. The potential usefulness of bifactor MIRT for diagnostic scoring also rests on other considerations. For example, bifactor MIRT reflects a tendency for items to primarily measure one of the required conjunctively interacting skill attributes (the most difficult of the required attributes), and it also makes it possible to address the estimation limitations of high-dimensional MIRT models (Cai L, Psychometrika, 75(4):581–612, 2010). Additionally, the bifactor MIRT model uses orthogonal statistical dimensions, making it easier to quantify the incremental contribution of the specific factors that can provide the foundation for diagnosis.
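To make the bifactor structure concrete, the following is a minimal sketch of a bifactor item response function in the compensatory multidimensional 2PL form: each item loads on a general factor and on at most one orthogonal specific factor. All parameter values here are hypothetical, chosen purely for illustration; they do not come from the analyses described in this chapter.

```python
# Hypothetical illustration of a bifactor MIRT item response function
# (compensatory M2PL form): one general factor plus one orthogonal
# specific factor per item. Parameter values are illustrative only.
import math

def bifactor_irf(theta_g, theta_s, a_g, a_s, d):
    """P(X = 1) for an item with general-factor loading a_g,
    specific-factor loading a_s, and intercept d, evaluated at
    latent trait values theta_g (general) and theta_s (specific)."""
    z = a_g * theta_g + a_s * theta_s + d
    return 1.0 / (1.0 + math.exp(-z))

# A moderately discriminating item: the general and specific factors
# contribute additively (compensatory) on the logit scale.
p = bifactor_irf(theta_g=1.0, theta_s=0.5, a_g=1.2, a_s=0.8, d=-0.5)
print(round(p, 3))  # → 0.75
```

Because the statistical dimensions are orthogonal, the term `a_s * theta_s` isolates the incremental contribution of the specific factor beyond the general factor, which is what makes the attribute-level contribution straightforward to quantify.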
The author would like to thank Nana Kim and the two assigned reviewers for their review and comments on an earlier version of this chapter.