The General Diagnostic Model

  • Matthias von Davier
Part of the Methodology of Educational Measurement and Assessment book series (MEMA)


The general diagnostic model (GDM) models dichotomous and polytomous item responses under the assumption that respondents differ with respect to multiple latent skills or attributes, and that these skills may be distributed differently across populations. Item responses can be of mixed format (dichotomous, polytomous, or both), and skills or attributes can be binary, polytomous ordinal, or continuous. Variables that define populations can be observed, latent as in discrete mixture models, or partially missing. Unobserved grouping variables can be predicted using hierarchical extensions of the GDM. It has been shown that, through reparameterization, the GDM contains as special cases the DINA model as well as the logistic G-DINA model, which is the same as the log-linear cognitive diagnostic model (LCDM), and hence can fit all models that can be specified in these frameworks. Taken together, the GDM includes a wide range of diagnostic models as well as item response theory (IRT) models, multidimensional IRT (MIRT) models, latent class models, located latent class models, multiple-group and mixture versions of these models, and their multilevel and longitudinal extensions. This chapter introduces the GDM through a formal description of the basic model assumptions and their generalizations, and describes how models can be estimated in the GDM framework using the mdltm software. The software is free for research purposes, can handle very large databases of up to millions of respondents and thousands of items, and estimates models efficiently using massively parallel estimation algorithms. The software was used operationally for scaling the PISA 2015, PISA 2018, and PIAAC 2012 main study databases, which include hundreds of populations, grouping variables, and weights, and hundreds of test forms collected over five assessment cycles, with a combined size of over two million respondents.
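As a rough illustration of how a model of this kind links skills to item responses, the sketch below computes response-category probabilities for a single item. It assumes the partial-credit-style logistic parameterization commonly associated with the GDM, in which category x has logit beta_x + x * sum_k(gamma_k * q_k * a_k) and category 0 is the reference. The function name, argument names, and parameter values are hypothetical; this is not the mdltm interface, and the chapter's own formal description may use different notation.

```python
import math


def gdm_category_probs(beta, gamma, q_row, skills):
    """Category probabilities for one item (illustrative sketch only).

    beta   -- intercepts for categories 1..m (category 0 is the reference)
    gamma  -- slope for each skill on this item
    q_row  -- 0/1 Q-matrix entries: which skills the item requires
    skills -- the respondent's skill levels (binary, ordinal, or continuous)
    """
    # Only skills flagged in the Q-matrix row contribute to the linear term.
    lin = sum(g * q * a for g, q, a in zip(gamma, q_row, skills))
    # Logit of category x is beta_x + x * lin; category 0 has logit 0.
    logits = [0.0] + [b + (x + 1) * lin for x, b in enumerate(beta)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical dichotomous item that requires only the first of two binary skills.
p_master = gdm_category_probs([-1.0], [2.0, 1.0], [1, 0], [1, 1])
p_nonmaster = gdm_category_probs([-1.0], [2.0, 1.0], [1, 0], [0, 1])

# Hypothetical three-category item that requires both skills.
p_poly = gdm_category_probs([0.5, -0.5], [1.0, 1.5], [1, 1], [1, 0])
```

Because the Q-matrix entry for the second skill is 0 in the dichotomous example, changing that skill leaves the probabilities unchanged; only mastery of the first skill raises the probability of a correct response.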



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. National Board of Medical Examiners (NBME), Philadelphia, USA
