GDM Software mdltm Including Parallel EM Algorithm

  • Lale Khorramdel
  • Hyo Jeong Shin
  • Matthias von Davier
Part of the Methodology of Educational Measurement and Assessment book series (MEMA)


This chapter illustrates the use of the software mdltm (von Davier, 2005) for multidimensional discrete latent trait models. mdltm was designed to handle large data sets as well as complex test and sampling designs, which makes it flexible enough for operational analyses. It can estimate many different latent variable models, supports a range of constraints on parameter estimation, and provides several model and item fit statistics as well as multiple methods for proficiency estimation. The software uses a computationally efficient parallel EM algorithm (von Davier, 2017) that allows estimation of high-dimensional diagnostic models for very large data sets. The software is illustrated by applying diagnostic models to data from the Programme for International Student Assessment (PISA).


  1. Adams, R. J., Wilson, M., & Wang, W. C. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1–23.
  2. Agresti, A. (2002). Categorical data analysis. Hoboken, NJ: Wiley.
  3. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
  4. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
  5. Cai, L. (2010a). High-dimensional exploratory item factor analysis by a Metropolis–Hastings Robbins–Monro algorithm. Psychometrika, 75, 33–57.
  6. Cai, L. (2010b). Metropolis–Hastings Robbins–Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35, 307–335.
  7. Gibbons, R. D., & Hedeker, D. (1992). Full-information item bi-factor analysis. Psychometrika, 57, 423–436.
  8. Gilula, Z., & Haberman, S. J. (1994). Models for analyzing categorical panel data. Journal of the American Statistical Association, 89, 645–656.
  9. Haberman, S. J., von Davier, M., & Lee, Y. (2008). Comparison of multidimensional item response models: Multivariate normal ability distributions versus multivariate polytomous ability distributions. ETS Research Report Series (pp. 1–25).
  10. Jeon, M., & Rijmen, F. (2014). Recent developments in maximum likelihood estimation of MTMM models for categorical data. Frontiers in Psychology, 5, 269.
  11. Jeon, M., Rijmen, F., & Rabe-Hesketh, S. (2013). Modeling differential item functioning using the multiple-group bifactor model. Journal of Educational and Behavioral Statistics, 38, 32–60.
  12. Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.
  13. Mazzeo, J., & von Davier, M. (2008). Review of the Programme for International Student Assessment (PISA) test design: Recommendations for fostering stability in assessment results (OECD Working Paper EDU/PISA/GB (2008) 28). Paris, France: OECD. Retrieved from
  14. Mazzeo, J., & von Davier, M. (2013). Linking scales in international large-scale assessments. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), Handbook of international large-scale assessment: Background, technical issues, and methods of data analysis. Boca Raton, FL: CRC Press.
  15. Mislevy, R. J., & Sheehan, K. M. (1987). Marginal estimation procedures. In A. E. Beaton (Ed.), Implementing the new design: The NAEP 1983–84 technical report (Report No. 15-TR-20). Princeton, NJ: Educational Testing Service.
  16. Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159–177.
  17. OECD. (2016). PISA 2015 Assessment and analytical framework: Science, reading, mathematic and financial Literacy. Paris, France: PISA, OECD Publishing.
  18. Organisation for Economic Co-Operation and Development. (2013). Chapter 17: Technical report of the Survey of Adult Skills (PIAAC) (pp. 406–438). Retrieved from the OECD website:
  19. Organisation for Economic Co-Operation and Development. (2017). PISA 2015 technical report. Paris, France: OECD Publishing.
  20. Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Nielsen & Lydiche (Expanded Edition, Chicago, University of Chicago Press, 1980).
  21. Rijmen, F., & Jeon, M. (2013). Fitting an item response theory model with random item effects across groups by a variational approximation method. Annals of Operations Research, 206, 647–662.
  22. Rijmen, F., Jeon, M., Rabe-Hesketh, S., & von Davier, M. (2014). A third order item response theory model for modeling the effects of domains and subdomains in large-scale educational assessment surveys. Journal of Educational and Behavioral Statistics, 38, 32–60.
  23. Rutkowski, L., Gonzalez, E., Joncas, M., & von Davier, M. (2010). International large-scale assessment data: Issues in secondary analysis and reporting. Educational Researcher, 39(2), 142–151.
  24. Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
  25. von Davier, M. (2005). A general diagnostic model applied to language testing data (ETS Research Report No. RR-05-16). Princeton, NJ: Educational Testing Service.
  26. von Davier, M. (2008). The mixture general diagnostic model. In G. R. Hancock & K. M. Samuelson (Eds.), Advances in latent variable mixture models. Information Age Publishing.
  27. von Davier, M. (2010). Hierarchical mixtures of diagnostic models. Psychological Test and Assessment Modeling, 52, 8–28.
  28. von Davier, M. (2013). The DINA model as a constrained general diagnostic model – Two variants of a model equivalency. British Journal of Mathematical and Statistical Psychology, 67(1), 49–71.
  29. von Davier, M. (2014). The log-linear cognitive diagnostic model (LCDM) as a special case of the general diagnostic model (GDM) (Research Report No. ETS RR-14-40). Princeton, NJ: Educational Testing Service.
  30. von Davier, M. (2016). High-performance psychometrics: The Parallel-E Parallel-M algorithm for generalized latent variable models. ETS Research Report Series, 2016 (ISSN 2330-8516).
  31. von Davier, M. (2017). New results on an improved parallel EM algorithm for estimating generalized latent variable models. In L. van der Ark, M. Wiberg, S. Culpepper, J. Douglas, & W. C. Wang (Eds.), Quantitative Psychology. IMPS 2016. Springer Proceedings in Mathematics & Statistics (Vol. 196). New York, NY: Springer.
  32. von Davier, M., & Carstensen, C. H. (2006). Multivariate and mixture distribution Rasch models: Extensions and applications. New York, NY: Springer.
  33. von Davier, M., Gonzalez, E., & Mislevy, R. (2009). What are plausible values and why are they useful? In IERI Monograph series: Issues and methodologies in large scale Assessments (Vol. 2). Retrieved from:
  34. von Davier, M., González, J. B., & von Davier, A. A. (2013). Local equating using the Rasch Model, the OPLM, and the 2PL IRT Model—or—What is it anyway if the model captures everything there is to know about the test takers? Journal of Educational Measurement, 50(3), 295–303.
  35. von Davier, M., & Rost, J. (2016). Logistic mixture-distribution response models. In W. van der Linden (Ed.), Handbook of item response theory (Vol. 1, 2nd ed., pp. 393–406). Boca Raton, FL: CRC Press.
  36. von Davier, M., Sinharay, S., Oranje, A., & Beaton, A. (2006). Statistical procedures used in the National Assessment of Educational Progress (NAEP): Recent developments and future directions. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics (Vol. 26): Psychometrics. Amsterdam, The Netherlands: Elsevier.
  37. von Davier, M., & von Davier, A. (2007). A unified approach to IRT scale linking and scale transformations. Methodology, 3(3), 115–124.
  38. Xu, X., & von Davier, M. (2008a). Linking with the general diagnostic model (ETS Research Report No. RR-08-08). Princeton, NJ: Educational Testing Service.
  39. Xu, X., & von Davier, M. (2008b). Fitting the structured general diagnostic model to NAEP data (ETS Research Report No. RR-08-27). Princeton, NJ: Educational Testing Service.
  40. Xu, X., & von Davier, M. (2008c). Comparing multiple-group multinomial loglinear models for multidimensional skill distributions in the general diagnostic model (ETS Research Report No. RR-08-35). Princeton, NJ: Educational Testing Service.
  41. Yamamoto, K., Khorramdel, L., & von Davier, M. (2013, updated 2016). Chapter 17: Scaling PIAAC cognitive data. In OECD (2013), Technical Report of the Survey of Adult Skills (PIAAC) (pp. 406–438), PIAAC, OECD Publishing. Retrieved from

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Lale Khorramdel (1)
  • Hyo Jeong Shin (2)
  • Matthias von Davier (1)

  1. National Board of Medical Examiners (NBME), Philadelphia, USA
  2. Educational Testing Service, Princeton, USA
