Abstract
Bayesian methods are now routinely used to estimate the parameters of item response theory (IRT) models. However, marginal likelihoods are still rarely used to compare IRT models, because of their computational complexity and the relatively high dimension of the model parameters. In this paper, we review Monte Carlo (MC) methods developed in recent years for computing marginal likelihoods and describe in detail how they are applied to IRT models, focusing on the “best possible” implementation of each method. We use these MC methods to compute the marginal likelihoods of the one-parameter logistic IRT model (1PL model) and the two-parameter logistic IRT model (2PL model) for a real English Examination dataset. We also compare the 1PL and 2PL models using the widely applicable information criterion (WAIC) and the deviance information criterion (DIC). All three Bayesian model comparison criteria favor the 2PL model for the English Examination data.
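To make the quantities named in the abstract concrete, the following is a minimal Python sketch of the 1PL and 2PL log-likelihoods and of the simplest possible Monte Carlo estimate of a marginal likelihood, obtained by averaging the likelihood over draws from the prior. It is not one of the methods reviewed in the paper. The parameterization a_j(θ_i − b_j), the standard normal priors, and all function names are assumptions made for illustration only.

```python
import math
import random


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def log_likelihood(responses, thetas, bs, a_s=None):
    """Log-likelihood of a persons-by-items 0/1 response matrix.

    Assumed parameterization: logit P(y_ij = 1) = a_j * (theta_i - b_j).
    Leaving a_s as None fixes every a_j at 1, which gives the 1PL model.
    """
    if a_s is None:
        a_s = [1.0] * len(bs)
    total = 0.0
    for i, theta in enumerate(thetas):
        for j, (b, a) in enumerate(zip(bs, a_s)):
            z = a * (theta - b)
            # log P(y=1) = log_sigmoid(z); log P(y=0) = log_sigmoid(-z)
            total += log_sigmoid(z) if responses[i][j] == 1 else log_sigmoid(-z)
    return total


def log_mean_exp(values):
    """log of the average of exp(values), computed without overflow."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def naive_log_marginal_1pl(responses, n_draws=2000, seed=0):
    """Prior-sampling MC estimate of the 1PL log marginal likelihood.

    Assumes independent N(0, 1) priors on abilities and difficulties
    (an illustrative choice, not taken from the paper).
    """
    rng = random.Random(seed)
    n_persons, n_items = len(responses), len(responses[0])
    log_liks = []
    for _ in range(n_draws):
        thetas = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
        bs = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
        log_liks.append(log_likelihood(responses, thetas, bs))
    return log_mean_exp(log_liks)
```

For example, `naive_log_marginal_1pl([[1, 0], [1, 1]])` returns an estimate for a toy dataset with two persons and two items. Because the prior rarely proposes parameter values with high likelihood once there are many persons and items, this estimator becomes very noisy in realistic settings, which is one reason more refined MC methods are needed for IRT models.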
Cite this article
Liu, Y., Hu, G., Cao, L. et al. A comparison of Monte Carlo methods for computing marginal likelihoods of item response theory models. J. Korean Stat. Soc. 48, 503–512 (2019). https://doi.org/10.1016/j.jkss.2019.04.001
DOI: https://doi.org/10.1016/j.jkss.2019.04.001