Volume 83, Issue 4, pp 963–990

Assessing Growth in a Diagnostic Classification Model Framework

  • Matthew J. Madison
  • Laine P. Bradshaw


A common assessment research design is the single-group pre-test/post-test design, in which examinees are administered one assessment before instruction and another after instruction. In this type of study, the primary objective is to measure growth in examinees, individually and collectively. In an item response theory (IRT) framework, longitudinal IRT models can be used to assess growth in examinee ability over time. In a diagnostic classification model (DCM) framework, assessing growth translates to measuring changes in attribute mastery status over time, thereby providing a categorical, criterion-referenced interpretation of growth. This study introduces the Transition Diagnostic Classification Model (TDCM), which combines latent transition analysis with the log-linear cognitive diagnosis model to provide a methodology for analyzing growth in a general DCM framework. Simulation study results indicate that the proposed model is flexible, provides accurate and reliable classifications, and is quite robust to violations of measurement invariance over time. The TDCM is used to analyze pre-test/post-test data from a diagnostic mathematics assessment.
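To make "changes in attribute mastery status over time" concrete, the following is a minimal descriptive sketch, not the TDCM itself. It assumes each examinee has already been classified as a non-master (0) or master (1) of a single attribute at pre-test and at post-test, and it tabulates the proportion moving between statuses. The function name `transition_matrix` and the data are hypothetical; the TDCM estimates transition probabilities inside the latent model rather than from hard classifications such as these.

```python
import numpy as np


def transition_matrix(pre, post):
    """Row-normalised 2x2 table of status changes for one attribute.

    Rows index pre-test status, columns index post-test status
    (0 = non-master, 1 = master). Hypothetical helper for illustration.
    """
    pre = np.asarray(pre, dtype=int)
    post = np.asarray(post, dtype=int)
    counts = np.zeros((2, 2))
    # Count examinees in each (pre, post) cell.
    np.add.at(counts, (pre, post), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Leave a row at zero if no examinee starts in that status.
    return np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)


# Made-up classifications for eight examinees (assumption, not study data).
pre = [0, 0, 0, 0, 0, 1, 1, 1]
post = [0, 1, 1, 1, 0, 1, 1, 0]
tau = transition_matrix(pre, post)
print(tau)
```

Read this way, `tau[0, 1]` is the share of pre-test non-masters classified as masters at post-test, which is the categorical notion of growth the abstract describes.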


Keywords: diagnostic classification model, cognitive diagnosis model, latent transition analysis, item parameter drift, measurement invariance, growth, pre-test/post-test design



This work was supported by the Institute of Education Sciences (IES) Grant Number R324A150035. The opinions expressed are those of the authors and do not necessarily reflect the views of IES.



Copyright information

© The Psychometric Society 2018

Authors and Affiliations

  1. Department of Education and Human Development, Clemson University, Clemson, USA
  2. Department of Educational Psychology, University of Georgia, Athens, USA
