Journal of Classification, Volume 30, Issue 2, pp 195–224

Incorporating Student Covariates in Cognitive Diagnosis Models

  • Elizabeth Ayers
  • Sophia Rabe-Hesketh
  • Rebecca Nugent


In educational measurement, cognitive diagnosis models have been developed to allow assessment of specific skills that are needed to perform tasks. Skill knowledge is characterized as present or absent and represented by a vector of binary indicators, or the skill set profile. After determining which skills are needed for each assessment item, a model is specified for the relationship between item responses and skill set profiles. Cognitive diagnosis models are often used for diagnosis, that is, for classifying students into the different skill set profiles. Generally, cognitive diagnosis models do not exploit student covariate information. However, investigating the effects of student covariates, such as gender, SES, or educational interventions, on skill knowledge mastery is important in education research, and covariate information may improve classification of students to skill set profiles. We extend a common cognitive diagnosis model, the DINA model, by modeling the relationship between the latent skill knowledge indicators and covariates. The probability of skill mastery is modeled as a logistic regression model, possibly with a student-level random intercept, giving a higher-order DINA model with a latent regression. Simulations show that parameter recovery is good for these models and that inclusion of covariates can improve skill diagnosis. When applying our methods to data from an online tutor, we obtain reasonable and interpretable parameter estimates that allow more detailed characterization of groups of students who differ in their predicted skill set profiles.
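As a rough illustration of the model class the abstract describes, the sketch below simulates item responses from a DINA model in which each skill-mastery probability follows a logistic regression with a student-level random intercept. All dimensions, coefficients, and the Q-matrix here are invented for illustration; this is a minimal sketch of the data-generating structure, not the authors' estimation code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; all numbers below are hypothetical, not paper estimates.
n_students, n_items, n_skills = 500, 10, 3

# Q-matrix: which skills each item requires (made up for this sketch)
Q = rng.integers(0, 2, size=(n_items, n_skills))
Q[Q.sum(axis=1) == 0, 0] = 1  # ensure every item requires at least one skill

# Higher-order latent regression for skill mastery:
# logit P(skill mastered) = beta0 + beta1 * x + u,
# with covariate x and a student-level random intercept u
x = rng.integers(0, 2, size=n_students)      # binary covariate, e.g. intervention
u = rng.normal(0.0, 1.0, size=n_students)    # random intercept
beta0, beta1 = -0.5, 1.0                     # hypothetical coefficients
p_mastery = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x + u)))

# Latent skill profile: each skill mastered independently given p_mastery
alpha = rng.random((n_students, n_skills)) < p_mastery[:, None]

# DINA response model: eta_ij = 1 iff student i has all skills item j needs;
# P(correct) = 1 - slip when eta = 1, and guess otherwise
slip, guess = 0.1, 0.2
eta = (alpha[:, None, :] >= Q[None, :, :]).all(axis=2)
p_correct = np.where(eta, 1.0 - slip, guess)
Y = (rng.random((n_students, n_items)) < p_correct).astype(int)
```

Under this setup, students with the covariate present (`x = 1`) have higher mastery probabilities, so covariate information is informative about the latent skill set profile — the mechanism the paper exploits to improve classification.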


Keywords: Cognitive diagnosis model · Collateral information · Concomitant variables · Covariates · DIF · DINA · Higher-order model · Random effect · Skill diagnosis





Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Elizabeth Ayers (1)
  • Sophia Rabe-Hesketh (2)
  • Rebecca Nugent (3)

  1. American Institutes for Research, Washington DC, USA
  2. Graduate School of Education, 1501 Tolman Hall, University of California, Berkeley, USA
  3. Department of Statistics, Baker Hall, Carnegie Mellon University, Pittsburgh, USA
