Journal of Classification, Volume 30, Issue 2, pp. 251–275

Measuring the Reliability of Diagnostic Classification Model Examinee Estimates

  • Jonathan Templin
  • Laine Bradshaw


Over the past decade, diagnostic classification models (DCMs) have become an active area of psychometric research. Despite their use, the reliability of examinee estimates in DCM applications has seldom been reported. In this paper, a reliability measure for the categorical latent variables of DCMs is defined. Using theory- and simulation-based results, we show how DCMs uniformly provide greater examinee estimate reliability than IRT models for tests of the same length, a result that is a consequence of the smaller range of values the latent variables of DCMs can take. We demonstrate this result by comparing DCM and IRT reliability for a series of models estimated with data from an end-of-grade test. We conclude with a discussion of how DCMs can be used to change the character of large-scale testing, either by shortening tests that measure examinees unidimensionally or by providing more reliable multidimensional measurement for tests of the same length.
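The abstract's central intuition, that a categorical latent variable with only a few possible values can be classified consistently from a short test, can be illustrated with a small simulation. The sketch below uses a single binary mastery attribute with DINA-style slip and guess parameters; the parameter values, the prior, and the setup are illustrative assumptions, not the authors' model or their proposed reliability measure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-attribute test: 10 items with DINA-style
# slip/guess parameters (assumed values, for illustration only).
n_items, n_examinees = 10, 20000
slip, guess = 0.15, 0.20
prior = 0.5  # assumed prior probability of mastery

# True mastery status and simulated item responses
alpha = rng.random(n_examinees) < prior
p_correct = np.where(alpha[:, None], 1 - slip, guess)
x = rng.random((n_examinees, n_items)) < p_correct

def loglik(x, p):
    """Log-likelihood of a response pattern given per-item success probs."""
    return (x * np.log(p) + (~x) * np.log(1 - p)).sum(axis=1)

# Posterior P(mastery | responses) via Bayes' rule
ll1 = loglik(x, np.full(n_items, 1 - slip))  # likelihood if master
ll0 = loglik(x, np.full(n_items, guess))     # likelihood if non-master
post = 1 / (1 + (1 - prior) / prior * np.exp(ll0 - ll1))

# Agreement of MAP classification with the true status: even 10 items
# classify a binary attribute far more consistently than they could
# pin down a continuous trait.
acc = np.mean((post > 0.5) == alpha)
print(f"classification accuracy \u2248 {acc:.3f}")
```

Because the examinee estimate can take only two values here, a modest number of informative items already yields high classification consistency, which is the mechanism behind the DCM-versus-IRT comparison described in the abstract.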


Keywords: Diagnostic classification models · Cognitive diagnosis · Reliability · Classification · Psychometrics





Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Educational Psychology, The University of Georgia, Athens, USA
