Skip to main content

Assessing Growth in a Diagnostic Classification Model Framework

Abstract

A common assessment research design is the single-group pre-test/post-test design in which examinees are administered an assessment before instruction and then another assessment after instruction. In this type of study, the primary objective is to measure growth in examinees, individually and collectively. In an item response theory (IRT) framework, longitudinal IRT models can be used to assess growth in examinee ability over time. In a diagnostic classification model (DCM) framework, assessing growth translates to measuring changes in attribute mastery status over time, thereby providing a categorical, criterion-referenced interpretation of growth. This study introduces the Transition Diagnostic Classification Model (TDCM), which combines latent transition analysis with the log-linear cognitive diagnosis model to provide methodology for analyzing growth in a general DCM framework. Simulation study results indicate that the proposed model is flexible, provides accurate and reliable classifications, and is quite robust to violations to measurement invariance over time. The TDCM is used to analyze pre-test/post-test data from a diagnostic mathematics assessment.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4

References

  • Andersen, E. B. (1985). Estimating latent correlations between repeated testings. Psychometrika, 50, 3–16.

    Article  Google Scholar 

  • Baum, L. E., & Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics, 37(6), 1554–1563.

    Article  Google Scholar 

  • Bock, R. D., Muraki, E., & Pfeiffenberger, W. (1988). Item pool maintenance in the presence of item parameter drift. Journal of Educational Measurement, 25, 275–285.

    Article  Google Scholar 

  • Bottge, B. A., Heinrichs, M., Chan, S., & Serlin, R. (2001). Anchoring adolescents’ understanding of math concepts in rich problem solving environments. Remedial and Special Education, 22, 299–314.

    Article  Google Scholar 

  • Bottge, B. A., Heinrichs, M., Chan, S. Y., Mehta, Z. D., & Watson, E. (2003). Effects of video-based and applied problems on the procedural Math skills of average- and low achieving adolescents. Journal of Special Education Technology, 18(2), 5–22.

    Article  Google Scholar 

  • Bottge, B. A., Ma, X., Gassaway, L., Toland, M., Butler, M., & Cho, S. J. (2014). Effects of blended instructional models on math performance. Exceptional Children, 80, 237–255.

    Article  Google Scholar 

  • Bottge, B. A., Toland, M., Gassaway, L., Butler, M., Choo, S., Griffen, A. K., et al. (2015). Impact of enhanced anchored instruction in inclusive math classrooms. Exceptional Children, 81(2), 158–175.

    Article  Google Scholar 

  • Bradshaw, L. (2016). Diagnostic classification models. In A. A. Rupp & J. P. Leighton (Eds.), The handbook of cognition and assessment: Frameworks, methodologies, and applications (pp. 297–327). West Sussex: Wiley-Blackwell.

    Chapter  Google Scholar 

  • Bradshaw, L., & Madison, M. J. (2016). Invariance properties for general diagnostic classification models. International Journal of Testing, 16(2), 99–118. https://doi.org/10.1080/15305058.2015.1107076.

    Article  Google Scholar 

  • Bradshaw, L., & Templin, J. (2014). The little model that couldn’t: How the DINA model misclassifies students and hides important effects. Paper presented at the annual meeting of the Northeastern Educational Research Association in Trumbull, CT.

  • Collins, L. M., & Lanza, S. T. (2010). Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences. New York: Wiley.

    Google Scholar 

  • Collins, L. M., & Wugalter, S. E. (1992). Latent class models for stage-sequential dynamic latent variables. Multivariate Behavioral Research, 27, 131–157.

    Article  Google Scholar 

  • Corbett, A. T., & Anderson, J. R. (1995). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-adapted Interaction, 4, 253–278.

    Article  Google Scholar 

  • de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179–199.

    Article  Google Scholar 

  • Embretson, S. E. (1991). A multidimensional latent trait model for measuring learning and change. Psychometrika, 56, 495–515.

    Article  Google Scholar 

  • Fischer, G. H. (1976). Some probabilistic models for measuring change. In D. N. M. de Gruijter & L. J. T. van der Kamp (Eds.), Advances in psychological and educational measurement (pp. 97–110). New York: Wiley.

    Google Scholar 

  • Fischer, G. H. (1989). An IRT-based model for dichotomous longitudinal data. Psychometrika, 54, 599–624.

    Article  Google Scholar 

  • George, A. C., & Robitzsch, A. (2014). Multiple group cognitive diagnosis models, with an emphasis on differential item functioning. Psychological Test and Assessment Modeling, 56(4), 405–432.

    Google Scholar 

  • Goldstein, H. (1983). Measuring changes in educational attainment over time: Problems and possibilities. Journal of Educational Measurement, 33, 315–332.

    Google Scholar 

  • Han, K. T., & Guo, F. (2011). Potential impact of item parameter drift due to practice and curriculum change on item calibration in computerized adaptive testing. GMAC Research Reports, RR-11-02.

  • Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.

  • Haertel, E. H. (1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26, 333–352.

    Article  Google Scholar 

  • Henson, R., & Douglas, J. (2005). Test construction for cognitive diagnostics. Applied Psychological Measurement, 29, 262–277.

    Article  Google Scholar 

  • Henson, R., Templin, J., & Willse, J. (2009). Defining a family of cognitive diagnosis models using log linear models with latent variables. Psychometrika, 74, 191–210.

    Article  Google Scholar 

  • Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 75(4), 800–802.

    Article  Google Scholar 

  • Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.

    Google Scholar 

  • Huff, K., & Goodman, D. P. (2007). The demand for cognitive diagnostic assessment. In J. P. Leighton & M. J. Gierl (Eds.), Cognitive diagnostic assessment for education: theory and applications (pp. 19–60). London: Cambridge University Press.

    Chapter  Google Scholar 

  • Jang, E. E. (2005). A validity narrative: Effects of reading skills diagnosis on teaching and learning in the context of NG TOEFL (unpublished doctoral dissertation). Champaign, IL: University of Illinois.

    Google Scholar 

  • Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25, 258–272.

    Article  Google Scholar 

  • Jurich, D. P., & Bradshaw, L. P. (2014). An illustration of diagnostic classification modeling in student learning outcomes assessment. International Journal of Testing, 14, 49–72.

    Article  Google Scholar 

  • Kaya, Y., & Leite, W. (2017). Assessing change in latent skills across time with longitudinal cognitive diagnosis modeling: An evaluation of model performance. Educational and Psychological Measurement, 77(3), 369–388.

    Article  Google Scholar 

  • Kunina-Habenicht, O., Rupp, A. A., & Wilhem, O. (2012). The impact of model misspecification on parameter estimation and item-fit assessment in log-linear diagnostic classification models. Journal of Educational Measurement, 49, 59–81.

    Article  Google Scholar 

  • Langeheine, R., Pannekoek, J., & van de Pol, F. (1996). Bootstrapping goodness-of-fit measures in categorical data analysis. Sociological Methods and Research, 24, 492–516.

    Article  Google Scholar 

  • Lanza, S. T., & Collins, L. M. (2008). A new SAS procedure for latent transition analysis: Transitions in dating and sexual behavior. Developmental Psychology, 42(2), 446–456.

    Article  Google Scholar 

  • Lanza, S. T., Patrick, M. E., & Maggs, J. L. (2010). Latent transition analysis: Benefits of a latent variable approach to modeling transitions in substance use. Journal of Drug Issues, 40, 93–120.

    Article  Google Scholar 

  • Lao, H., & Templin, J. (2016). Estimation of diagnostic classification models without constraints: Issues with class label switching. Paper presented at the annual meeting of the National Council on measurement in education in Washington, DC.

  • Lee, W., & Cho, S. J. (2017). The consequences of ignoring item parameter drift in longitudinal item response models. Applied Measurement in Education, 30(2), 129–146.

    Article  Google Scholar 

  • Li, F., Cohen, A., Bottge, B., & Templin, J. (2015). A latent transition analysis model for assessing change in cognitive skills. Educational and Psychological Measurement, 76(2), 181–204.

    Article  Google Scholar 

  • Maydeu-Olivares, A. (2015). Evaluating the fit of IRT models. In S. P. Reise & D. A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 111–127). New York, NY: Taylor & Francis (Routledge).

    Google Scholar 

  • Meredith, W., & Millsap, R. E. (1992). On the misuse of manifest variables in the detection of measurement invariance. Psychometrika, 57(2), 289–311.

    Article  Google Scholar 

  • Muthén, L. K., & Muthén, B. O. (1998–2012). Mplus user’s guide (7th ed.). Los Angeles, CA: Muthén & Muthén.

  • Raju, N. S. (1988). The area between two item characteristic curves. Psychometrika, 53(4), 495–502.

    Article  Google Scholar 

  • Roberts, T. J., & Ward, S. E. (2011). Using latent transition analysis in nursing research to explore change over time. Nursing Research, 60(1), 73–79.

    Article  Google Scholar 

  • Rupp, A. A., Templin, J., & Henson, R. (2010). Diagnostic measurement: Theory, methods, and applications. New York: Guilford.

    Google Scholar 

  • Sinharay, S., & Haberman, S. J. (2014). How often is the misfit of item response theory models practically significant? Educational Measurement: Issues and Practice, 33(1), 23–35.

    Article  Google Scholar 

  • Templin, J., & Bradshaw, L. (2013). Measuring the reliability of diagnostic classification model examinee estimates. Journal of Classification, 30, 251–275.

    Article  Google Scholar 

  • Templin, J., & Henson, R. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11, 287–305.

    Article  Google Scholar 

  • Templin, J., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32(3), 37–50.

    Article  Google Scholar 

  • von Davier, M. (2005). A general diagnostic model applied to language testing data (Research Report No. RR-05–16). Princeton, NJ: Educational Testing Service.

  • Wang, S., Yang, Y., Culpepper, S., & Douglas, J. (2018). Tracking skill acquisition with cognitive diagnosis models: A higher-order hidden Markov model with covariates. Journal of Educational and Behavioral Statistics, 47, 57–87.

    Article  Google Scholar 

  • Wells, C. S., Subkoviak, M. J., & Serlin, R. C. (2002). The effect of item parameter drift on examinee ability estimates. Applied Psychological Measurement, 26, 77–87.

    Article  Google Scholar 

Download references

Acknowledgements

This work was supported by the Institute of Educational Sciences (IES) Grant Number R324A150035. The opinions expressed are those of the authors and do not necessarily reflect the views of IES.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Matthew J. Madison.

Appendix

Appendix

Abbreviated TDCM Mplus Syntax (two-attribute, pre-test/post-test example).

This example combines LCDM syntax from Templin and Hoffman (2013) and Mplus’s LTA capabilities to estimate the TDCM.

figure a
figure b
figure c

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Madison, M.J., Bradshaw, L.P. Assessing Growth in a Diagnostic Classification Model Framework. Psychometrika 83, 963–990 (2018). https://doi.org/10.1007/s11336-018-9638-5

Download citation

  • Received:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11336-018-9638-5

Keywords

  • diagnostic classification model
  • cognitive diagnosis model
  • latent transition analysis
  • item parameter drift
  • measurement invariance
  • growth
  • pre-test/post-test design