A comparison of Monte Carlo methods for computing marginal likelihoods of item response theory models

Liu, Yang; Hu, Guanyu; Cao, Lei; Wang, Xiaojing; Chen, Ming-Hui

doi:10.1016/j.jkss.2019.04.001

Published: 17 May 2019

Journal of the Korean Statistical Society, volume 48, pages 503–512 (2019)

Yang Liu¹, Guanyu Hu¹, Lei Cao¹,², Xiaojing Wang¹ & Ming-Hui Chen¹

Abstract

Bayesian methods are now routinely used to estimate the parameters of item response theory (IRT) models. However, marginal likelihoods are still rarely used for comparing IRT models, owing to their complexity and the relatively high dimension of the model parameters. In this paper, we review Monte Carlo (MC) methods developed in the literature in recent years and describe in detail how these methods are applied to IRT models. In particular, we focus on the “best possible” implementation of these MC methods for IRT models. The MC methods are used to compute the marginal likelihoods under the one-parameter IRT model with the logistic link (1PL model) and the two-parameter logistic IRT model (2PL model) for a real English Examination dataset. We further use the widely applicable information criterion (WAIC) and the deviance information criterion (DIC) to compare the 1PL and 2PL models. All three Bayesian model comparison criteria favor the 2PL model for the English Examination data.
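
To make the quantities named in the abstract concrete, the following is a minimal sketch of the 1PL and 2PL likelihoods and of a marginal likelihood estimate. It rests on assumptions not stated in the abstract: the parametrization a_j(θ_i − b_j) for the logit, standard normal priors on abilities and difficulties, and a log-normal prior on discriminations. The estimator simply averages the likelihood over prior draws; it is used here only for illustration and should not be read as one of the MC methods the paper compares. All function names are hypothetical.

```python
import math
import random


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_lik(y, theta, b, a):
    """Bernoulli log-likelihood of a persons-by-items 0/1 matrix y.

    Assumed parametrization: logit P(y_ij = 1) = a_j * (theta_i - b_j).
    Setting every a_j to 1 gives the 1PL model.
    """
    total = 0.0
    for i, row in enumerate(y):
        for j, y_ij in enumerate(row):
            z = a[j] * (theta[i] - b[j])
            # log P(y=1) = log_sigmoid(z); log P(y=0) = log_sigmoid(-z)
            total += log_sigmoid(z) if y_ij == 1 else log_sigmoid(-z)
    return total


def log_mean_exp(xs):
    """log of the average of exp(x), computed without overflow."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs) / len(xs))


def naive_log_marginal(y, two_pl, draws=2000, seed=0):
    """Prior-sampling estimate of the log marginal likelihood.

    Assumed priors (illustrative only): theta_i, b_j ~ N(0, 1) and,
    for the 2PL model, a_j ~ LogNormal(0, 0.5).
    """
    rng = random.Random(seed)
    n_persons, n_items = len(y), len(y[0])
    log_liks = []
    for _ in range(draws):
        theta = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
        if two_pl:
            a = [rng.lognormvariate(0.0, 0.5) for _ in range(n_items)]
        else:
            a = [1.0] * n_items
        log_liks.append(log_lik(y, theta, b, a))
    return log_mean_exp(log_liks)


if __name__ == "__main__":
    # Tiny made-up response matrix: 3 persons by 4 items.
    y = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 0]]
    print("1PL:", naive_log_marginal(y, two_pl=False))
    print("2PL:", naive_log_marginal(y, two_pl=True))
```

The difference between the two log marginal likelihoods is the log Bayes factor between the models. Averaging over prior draws becomes very noisy as the number of persons and items grows, because few prior draws land where the likelihood is large.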

Author information

Authors and Affiliations

Department of Statistics, University of Connecticut, Storrs, CT, USA
Yang Liu, Guanyu Hu, Lei Cao, Xiaojing Wang & Ming-Hui Chen
School of Basic Science, Changchun University of Technology, Changchun, China
Lei Cao

Corresponding author

Correspondence to Ming-Hui Chen.

About this article

Cite this article

Liu, Y., Hu, G., Cao, L. et al. A comparison of Monte Carlo methods for computing marginal likelihoods of item response theory models. J. Korean Stat. Soc. 48, 503–512 (2019). https://doi.org/10.1016/j.jkss.2019.04.001

Received: 16 February 2019
Accepted: 11 April 2019
Published: 17 May 2019
Issue Date: December 2019
DOI: https://doi.org/10.1016/j.jkss.2019.04.001
