Assessing Growth in a Diagnostic Classification Model Framework

Abstract

A common assessment research design is the single-group pre-test/post-test design, in which examinees are administered an assessment before instruction and again after instruction. In this type of study, the primary objective is to measure growth in examinees, individually and collectively. In an item response theory (IRT) framework, longitudinal IRT models can be used to assess growth in examinee ability over time. In a diagnostic classification model (DCM) framework, assessing growth translates to measuring changes in attribute mastery status over time, thereby providing a categorical, criterion-referenced interpretation of growth. This study introduces the Transition Diagnostic Classification Model (TDCM), which combines latent transition analysis with the log-linear cognitive diagnosis model (LCDM) to provide a methodology for analyzing growth in a general DCM framework. Simulation study results indicate that the proposed model is flexible, provides accurate and reliable classifications, and is quite robust to violations of measurement invariance over time. The TDCM is used to analyze pre-test/post-test data from a diagnostic mathematics assessment.
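
In outline, the TDCM uses the LCDM as the measurement model at each occasion and latent transition analysis as the structural model linking occasions. The following sketch of the pre-test/post-test likelihood for examinee $e$ uses standard LCDM notation (Henson, Templin, & Willse, 2009); the rendering is illustrative shorthand, not excerpted from the article:

$$
P(\mathbf{x}_e) = \sum_{c_1=1}^{2^A} \sum_{c_2=1}^{2^A} \nu_{c_1}\, \tau_{c_2 \mid c_1} \prod_{i=1}^{I} \pi_{i c_1}^{x_{ei1}} \bigl(1-\pi_{i c_1}\bigr)^{1-x_{ei1}} \, \pi_{i c_2}^{x_{ei2}} \bigl(1-\pi_{i c_2}\bigr)^{1-x_{ei2}},
$$

where $A$ is the number of attributes, $\nu_{c_1}$ is the probability of holding attribute profile $\boldsymbol{\alpha}_{c_1}$ at pre-test, $\tau_{c_2 \mid c_1}$ is the probability of transitioning to profile $\boldsymbol{\alpha}_{c_2}$ at post-test, and each item response probability follows the LCDM,

$$
\pi_{ic} = P(X_{ei}=1 \mid \boldsymbol{\alpha}_c) = \frac{\exp\{\lambda_{i,0} + \boldsymbol{\lambda}_i^{\top}\mathbf{h}(\boldsymbol{\alpha}_c, \mathbf{q}_i)\}}{1 + \exp\{\lambda_{i,0} + \boldsymbol{\lambda}_i^{\top}\mathbf{h}(\boldsymbol{\alpha}_c, \mathbf{q}_i)\}},
$$

with $\mathbf{q}_i$ the Q-matrix row for item $i$ and $\mathbf{h}(\cdot)$ the vector of attribute main effects and interactions. Measurement invariance over time corresponds to equal item parameters $\lambda$ at both occasions.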

References

  1. Andersen, E. B. (1985). Estimating latent correlations between repeated testings. Psychometrika, 50, 3–16.

  2. Baum, L. E., & Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics, 37(6), 1554–1563.

  3. Bock, R. D., Muraki, E., & Pfeiffenberger, W. (1988). Item pool maintenance in the presence of item parameter drift. Journal of Educational Measurement, 25, 275–285.

  4. Bottge, B. A., Heinrichs, M., Chan, S., & Serlin, R. (2001). Anchoring adolescents’ understanding of math concepts in rich problem solving environments. Remedial and Special Education, 22, 299–314.

  5. Bottge, B. A., Heinrichs, M., Chan, S. Y., Mehta, Z. D., & Watson, E. (2003). Effects of video-based and applied problems on the procedural math skills of average- and low-achieving adolescents. Journal of Special Education Technology, 18(2), 5–22.

  6. Bottge, B. A., Ma, X., Gassaway, L., Toland, M., Butler, M., & Cho, S. J. (2014). Effects of blended instructional models on math performance. Exceptional Children, 80, 237–255.

  7. Bottge, B. A., Toland, M., Gassaway, L., Butler, M., Choo, S., Griffen, A. K., et al. (2015). Impact of enhanced anchored instruction in inclusive math classrooms. Exceptional Children, 81(2), 158–175.

  8. Bradshaw, L. (2016). Diagnostic classification models. In A. A. Rupp & J. P. Leighton (Eds.), The handbook of cognition and assessment: Frameworks, methodologies, and applications (pp. 297–327). West Sussex: Wiley-Blackwell.

  9. Bradshaw, L., & Madison, M. J. (2016). Invariance properties for general diagnostic classification models. International Journal of Testing, 16(2), 99–118. https://doi.org/10.1080/15305058.2015.1107076.

  10. Bradshaw, L., & Templin, J. (2014). The little model that couldn’t: How the DINA model misclassifies students and hides important effects. Paper presented at the annual meeting of the Northeastern Educational Research Association in Trumbull, CT.

  11. Collins, L. M., & Lanza, S. T. (2010). Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences. New York: Wiley.

  12. Collins, L. M., & Wugalter, S. E. (1992). Latent class models for stage-sequential dynamic latent variables. Multivariate Behavioral Research, 27, 131–157.

  13. Corbett, A. T., & Anderson, J. R. (1995). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4, 253–278.

  14. de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179–199.

  15. Embretson, S. E. (1991). A multidimensional latent trait model for measuring learning and change. Psychometrika, 56, 495–515.

  16. Fischer, G. H. (1976). Some probabilistic models for measuring change. In D. N. M. de Gruijter & L. J. T. van der Kamp (Eds.), Advances in psychological and educational measurement (pp. 97–110). New York: Wiley.

  17. Fischer, G. H. (1989). An IRT-based model for dichotomous longitudinal data. Psychometrika, 54, 599–624.

  18. George, A. C., & Robitzsch, A. (2014). Multiple group cognitive diagnosis models, with an emphasis on differential item functioning. Psychological Test and Assessment Modeling, 56(4), 405–432.

  19. Goldstein, H. (1983). Measuring changes in educational attainment over time: Problems and possibilities. Journal of Educational Measurement, 20, 369–377.

  20. Han, K. T., & Guo, F. (2011). Potential impact of item parameter drift due to practice and curriculum change on item calibration in computerized adaptive testing. GMAC Research Reports, RR-11-02.

  21. Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.

  22. Haertel, E. H. (1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26, 333–352.

  23. Henson, R., & Douglas, J. (2005). Test construction for cognitive diagnostics. Applied Psychological Measurement, 29, 262–277.

  24. Henson, R., Templin, J., & Willse, J. (2009). Defining a family of cognitive diagnosis models using log linear models with latent variables. Psychometrika, 74, 191–210.

  25. Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 75(4), 800–802.

  26. Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.

  27. Huff, K., & Goodman, D. P. (2007). The demand for cognitive diagnostic assessment. In J. P. Leighton & M. J. Gierl (Eds.), Cognitive diagnostic assessment for education: theory and applications (pp. 19–60). London: Cambridge University Press.

  28. Jang, E. E. (2005). A validity narrative: Effects of reading skills diagnosis on teaching and learning in the context of NG TOEFL (unpublished doctoral dissertation). Champaign, IL: University of Illinois.

  29. Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25, 258–272.

  30. Jurich, D. P., & Bradshaw, L. P. (2014). An illustration of diagnostic classification modeling in student learning outcomes assessment. International Journal of Testing, 14, 49–72.

  31. Kaya, Y., & Leite, W. (2017). Assessing change in latent skills across time with longitudinal cognitive diagnosis modeling: An evaluation of model performance. Educational and Psychological Measurement, 77(3), 369–388.

  32. Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2012). The impact of model misspecification on parameter estimation and item-fit assessment in log-linear diagnostic classification models. Journal of Educational Measurement, 49, 59–81.

  33. Langeheine, R., Pannekoek, J., & van de Pol, F. (1996). Bootstrapping goodness-of-fit measures in categorical data analysis. Sociological Methods and Research, 24, 492–516.

  34. Lanza, S. T., & Collins, L. M. (2008). A new SAS procedure for latent transition analysis: Transitions in dating and sexual risk behavior. Developmental Psychology, 44(2), 446–456.

  35. Lanza, S. T., Patrick, M. E., & Maggs, J. L. (2010). Latent transition analysis: Benefits of a latent variable approach to modeling transitions in substance use. Journal of Drug Issues, 40, 93–120.

  36. Lao, H., & Templin, J. (2016). Estimation of diagnostic classification models without constraints: Issues with class label switching. Paper presented at the annual meeting of the National Council on Measurement in Education, Washington, DC.

  37. Lee, W., & Cho, S. J. (2017). The consequences of ignoring item parameter drift in longitudinal item response models. Applied Measurement in Education, 30(2), 129–146.

  38. Li, F., Cohen, A., Bottge, B., & Templin, J. (2015). A latent transition analysis model for assessing change in cognitive skills. Educational and Psychological Measurement, 76(2), 181–204.

  39. Maydeu-Olivares, A. (2015). Evaluating the fit of IRT models. In S. P. Reise & D. A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 111–127). New York, NY: Taylor & Francis (Routledge).

  40. Meredith, W., & Millsap, R. E. (1992). On the misuse of manifest variables in the detection of measurement invariance. Psychometrika, 57(2), 289–311.

  41. Muthén, L. K., & Muthén, B. O. (1998–2012). Mplus user’s guide (7th ed.). Los Angeles, CA: Muthén & Muthén.

  42. Raju, N. S. (1988). The area between two item characteristic curves. Psychometrika, 53(4), 495–502.

  43. Roberts, T. J., & Ward, S. E. (2011). Using latent transition analysis in nursing research to explore change over time. Nursing Research, 60(1), 73–79.

  44. Rupp, A. A., Templin, J., & Henson, R. (2010). Diagnostic measurement: Theory, methods, and applications. New York: Guilford.

  45. Sinharay, S., & Haberman, S. J. (2014). How often is the misfit of item response theory models practically significant? Educational Measurement: Issues and Practice, 33(1), 23–35.

  46. Templin, J., & Bradshaw, L. (2013). Measuring the reliability of diagnostic classification model examinee estimates. Journal of Classification, 30, 251–275.

  47. Templin, J., & Henson, R. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11, 287–305.

  48. Templin, J., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32(3), 37–50.

  49. von Davier, M. (2005). A general diagnostic model applied to language testing data (Research Report No. RR-05-16). Princeton, NJ: Educational Testing Service.

  50. Wang, S., Yang, Y., Culpepper, S., & Douglas, J. (2018). Tracking skill acquisition with cognitive diagnosis models: A higher-order hidden Markov model with covariates. Journal of Educational and Behavioral Statistics, 43(1), 57–87.

  51. Wells, C. S., Subkoviak, M. J., & Serlin, R. C. (2002). The effect of item parameter drift on examinee ability estimates. Applied Psychological Measurement, 26, 77–87.

Acknowledgements

This work was supported by the Institute of Education Sciences (IES), Grant Number R324A150035. The opinions expressed are those of the authors and do not necessarily reflect the views of IES.

Author information

Correspondence to Matthew J. Madison.

Appendix

Abbreviated TDCM Mplus Syntax (two-attribute, pre-test/post-test example).

This example combines LCDM syntax from Templin and Hoffman (2013) and Mplus’s LTA capabilities to estimate the TDCM.

(The complete syntax is presented in the published article as three code figures, a through c, not reproduced here.)
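
As a minimal sketch of what such syntax can look like, the following hypothetical example assumes two attributes measured by four items per occasion (x1-x4 at pre-test, y1-y4 at post-test), with items 1-2 measuring attribute 1 and items 3-4 measuring attribute 2. The data file name, variable names, and parameter labels are illustrative assumptions, not the authors' code.

TITLE:    TDCM sketch: two attributes, pre-test/post-test (hypothetical);
DATA:     FILE = tdcm.dat;
VARIABLE: NAMES = x1-x4 y1-y4;       ! x = pre-test items, y = post-test items
          CATEGORICAL = x1-x4 y1-y4;
          CLASSES = c1(4) c2(4);     ! 4 profiles per occasion: (00) (01) (10) (11)
ANALYSIS: TYPE = MIXTURE;            ! latent class estimation
MODEL:
  %OVERALL%
  c2 ON c1;                          ! LTA structural part: profile transitions
MODEL c1:                            ! pre-test measurement model (LCDM)
  %c1#1%                             ! profile (0,0)
  [x1$1] (i1n); [x2$1] (i2n); [x3$1] (i3n); [x4$1] (i4n);
  %c1#2%                             ! profile (0,1): masters attribute 2 only
  [x1$1] (i1n); [x2$1] (i2n); [x3$1] (i3m); [x4$1] (i4m);
  %c1#3%                             ! profile (1,0): masters attribute 1 only
  [x1$1] (i1m); [x2$1] (i2m); [x3$1] (i3n); [x4$1] (i4n);
  %c1#4%                             ! profile (1,1)
  [x1$1] (i1m); [x2$1] (i2m); [x3$1] (i3m); [x4$1] (i4m);
MODEL c2:                            ! post-test: identical labels impose invariance
  %c2#1%
  [y1$1] (i1n); [y2$1] (i2n); [y3$1] (i3n); [y4$1] (i4n);
  %c2#2%
  [y1$1] (i1n); [y2$1] (i2n); [y3$1] (i3m); [y4$1] (i4m);
  %c2#3%
  [y1$1] (i1m); [y2$1] (i2m); [y3$1] (i3n); [y4$1] (i4n);
  %c2#4%
  [y1$1] (i1m); [y2$1] (i2m); [y3$1] (i3m); [y4$1] (i4m);
MODEL CONSTRAINT:                    ! express thresholds as LCDM parameters
  NEW(l1_0 l1_11 l2_0 l2_11 l3_0 l3_12 l4_0 l4_12);
  i1n = -(l1_0);   i1m = -(l1_0 + l1_11);  ! Mplus threshold = -(LCDM kernel)
  i2n = -(l2_0);   i2m = -(l2_0 + l2_11);
  i3n = -(l3_0);   i3m = -(l3_0 + l3_12);
  i4n = -(l4_0);   i4m = -(l4_0 + l4_12);
  l1_11 > 0; l2_11 > 0; l3_12 > 0; l4_12 > 0;  ! monotonicity: mastery cannot hurt

Because the pre-test and post-test thresholds share labels (i1n through i4m), item parameters are held equal over time; giving the post-test thresholds distinct labels relaxes this measurement invariance assumption. Note that without informative starting values, mixture class labels may not align with the intended attribute profiles across solutions (Lao & Templin, 2016).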

Cite this article

Madison, M.J., Bradshaw, L.P. Assessing Growth in a Diagnostic Classification Model Framework. Psychometrika 83, 963–990 (2018). https://doi.org/10.1007/s11336-018-9638-5

Keywords

  • diagnostic classification model
  • cognitive diagnosis model
  • latent transition analysis
  • item parameter drift
  • measurement invariance
  • growth
  • pre-test/post-test design