Using Hamiltonian Monte Carlo to estimate the log-linear cognitive diagnosis model via Stan
The Bayesian literature has shown that the Hamiltonian Monte Carlo (HMC) algorithm is powerful and efficient for statistical model estimation, especially for complicated models. Stan, a software program built upon HMC, has been introduced as a means of estimating psychometric models. However, there are no systematic guidelines for implementing Stan with the log-linear cognitive diagnosis model (LCDM), which is the saturated version of many cognitive diagnostic model (CDM) variants. This article bridges the gap between Stan application and Bayesian LCDM estimation: Both the modeling procedures and the Stan code are demonstrated in detail, so that the strategy can be extended straightforwardly to other CDMs.
Keywords: Markov chain Monte Carlo (MCMC); Bayesian; Cognitive diagnostic model; LCDM; Stan; Hamiltonian Monte Carlo (HMC)
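For readers unfamiliar with the LCDM, the following minimal Python sketch illustrates the standard item response function for an item measuring at most two attributes: the logit of a correct response is an intercept plus a main effect for each mastered attribute, plus a two-way interaction when both are mastered. It is not the article's Stan code; the function name `lcdm_prob`, its arguments, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def lcdm_prob(alpha, q_row, intercept, main_effects, interaction=0.0):
    """Return P(X = 1 | alpha) for one item under a two-attribute LCDM kernel.

    alpha        : tuple of 0/1 mastery indicators, one per attribute
    q_row        : tuple of 0/1 Q-matrix entries for this item
    intercept    : log-odds of success when no measured attribute is mastered
    main_effects : per-attribute main effects (same length as alpha)
    interaction  : two-way interaction, used only if the item measures two attributes
    """
    # Only attributes flagged in the Q-matrix row enter the kernel.
    measured = [k for k, q in enumerate(q_row) if q == 1]
    if len(measured) > 2:
        raise ValueError("this sketch handles at most two measured attributes")

    logit = intercept
    for k in measured:
        logit += main_effects[k] * alpha[k]
    if len(measured) == 2:
        a, b = measured
        logit += interaction * alpha[a] * alpha[b]

    # Inverse-logit link.
    return 1.0 / (1.0 + math.exp(-logit))


# Illustrative (made-up) parameters for an item measuring both attributes.
q_row = (1, 1)
probs = {
    alpha: lcdm_prob(alpha, q_row, intercept=-2.0,
                     main_effects=[1.5, 1.5], interaction=1.0)
    for alpha in product((0, 1), repeat=2)
}
for alpha, p in probs.items():
    print(alpha, round(p, 4))
```

With positive main effects and a nonnegative interaction, the success probability increases as more of the measured attributes are mastered. In Bayesian estimation, monotonicity constraints of this kind are typically imposed on the parameters.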