Comparison of Hyperpriors for Modeling the Intertrait Correlation in a Multidimensional IRT Model

  • Meng-I Chang
  • Yanyan Sheng
Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 265)

Abstract

Markov chain Monte Carlo (MCMC) algorithms have made it possible to estimate multidimensional item response theory (MIRT) models under a fully Bayesian framework. An important goal in fitting a MIRT model is to accurately estimate the interrelationship among multiple latent traits. In Bayesian hierarchical modeling, this is done by modeling the covariance matrix, typically with an inverse Wishart prior distribution because of its conjugacy. Studies in the Bayesian literature have pointed out limitations of such specifications. The purpose of this study is to compare the inverse Wishart prior with alternatives, namely the scaled inverse Wishart, the hierarchical half-t, and the LKJ priors, on parameter estimation and model adequacy for one form of the MIRT model through Monte Carlo simulations. Results suggest that the inverse Wishart prior performs worse than the other priors on parameter recovery and model-data adequacy across most of the simulation conditions when the variance of the person parameters is small. Findings from this study provide a set of guidelines on using these priors in estimating Bayesian MIRT models.
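
As a rough illustration of the contrast drawn in the abstract, the following sketch draws 2 × 2 covariance matrices from an inverse Wishart prior and from a separation-style prior, then splits each draw into standard deviations and a correlation. It is not the authors' simulation code. The function names, the identity scale matrix, the degrees of freedom, the half-t(3) standard deviations, and the two-trait restriction are all assumptions made for illustration. For two dimensions, an LKJ(eta) correlation can be drawn as 2 * Beta(eta, eta) - 1, which is what the sketch relies on.

```python
import numpy as np
from scipy.stats import invwishart


def cov_to_sd_corr(cov):
    """Split a stack of covariance matrices into SDs and correlation matrices."""
    sd = np.sqrt(np.diagonal(cov, axis1=-2, axis2=-1))
    corr = cov / (sd[..., :, None] * sd[..., None, :])
    return sd, corr


def draw_inverse_wishart(n_draws, dim=2, df=None, seed=0):
    """Prior draws from IW(df, I); df = dim + 1 is an illustrative default."""
    df = dim + 1 if df is None else df
    return invwishart.rvs(df=df, scale=np.eye(dim), size=n_draws,
                          random_state=seed)


def draw_separation_2d(n_draws, eta=1.0, sd_scale=1.0, seed=0):
    """Prior draws with independent half-t(3) SDs and an LKJ(eta) correlation.

    Hypothetical separation-style prior for two traits only; the SDs and the
    correlation are drawn independently of each other by construction.
    """
    rng = np.random.default_rng(seed)
    # In two dimensions, (r + 1) / 2 ~ Beta(eta, eta) under LKJ(eta).
    r = 2.0 * rng.beta(eta, eta, size=n_draws) - 1.0
    sd = np.abs(sd_scale * rng.standard_t(df=3, size=(n_draws, 2)))
    cov = np.empty((n_draws, 2, 2))
    cov[:, 0, 0] = sd[:, 0] ** 2
    cov[:, 1, 1] = sd[:, 1] ** 2
    cov[:, 0, 1] = cov[:, 1, 0] = r * sd[:, 0] * sd[:, 1]
    return cov


if __name__ == "__main__":
    for name, draws in [("inverse Wishart", draw_inverse_wishart(5000)),
                        ("separation (half-t SDs, LKJ)", draw_separation_2d(5000))]:
        sd, corr = cov_to_sd_corr(draws)
        print(name, "median SD:", np.median(sd[:, 0]),
              "mean |r|:", np.mean(np.abs(corr[:, 0, 1])))
```

Plotting the drawn standard deviations against the drawn correlations for each prior is one way to inspect how strongly a given specification ties the two together before any data are observed.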

Keywords

Multidimensional item response theory; Fully Bayesian model; Markov chain Monte Carlo

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Psychology, Philander Smith College, Little Rock, USA
  2. Department of Counseling, Quantitative Methods, and Special Education, Southern Illinois University Carbondale, Carbondale, USA
