Bayesian algorithm based on auxiliary variables for estimating item response theory models with non-ignorable missing response data

Zhang, Jiwei; Zhang, Zhaoyuan; Tao, Jian

doi:10.1007/s42952-020-00100-6

Research Article
Published: 08 January 2021

Journal of the Korean Statistical Society, Volume 50, pages 955–996 (2021)

Jiwei Zhang¹,
Zhaoyuan Zhang² &
Jian Tao³

Abstract

Missing responses are common in educational and psychological assessments. Statistical inference can be seriously biased if missing responses that arise from a non-ignorable missing mechanism are not properly modeled. This study examines whether models based on different missing mechanisms (ignorable and non-ignorable) are appropriate for analyzing missing response data, from the perspectives of parameter estimation and model assessment. In addition, a highly effective Bayesian sampling algorithm based on auxiliary variables is used to estimate the complex models. The advantages of the new algorithm over the traditional marginal likelihood method and other Bayesian algorithms are discussed in detail. Based on Markov chain Monte Carlo samples from the posterior distributions, the deviance information criterion (DIC) and the logarithm of the pseudomarginal likelihood (LPML) are employed to compare the different missing mechanism models. Four simulation studies are conducted, and a detailed analysis of PISA science data is carried out to further illustrate the proposed methodology.
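As an illustration of how DIC and LPML can be computed from posterior draws, the following is a minimal sketch using the standard textbook definitions of the two criteria. It is not the authors' implementation. The function names and the input layout are assumptions made for this example: `loglik[m][i]` is taken to be the log-likelihood of observation i under posterior draw m, and `loglik_at_mean[i]` the log-likelihood of observation i evaluated at the posterior mean of the parameters.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def lpml(loglik):
    """Log pseudomarginal likelihood from pointwise log-likelihoods.

    loglik[m][i] is the log-likelihood of observation i under draw m
    (hypothetical layout chosen for this sketch).
    """
    n_draws = len(loglik)
    n_obs = len(loglik[0])
    total = 0.0
    for i in range(n_obs):
        # CPO_i is the harmonic mean of the likelihood over the draws, so
        # log CPO_i = log(M) - logsumexp over m of (-loglik[m][i]).
        negated = [-loglik[m][i] for m in range(n_draws)]
        total += math.log(n_draws) - logsumexp(negated)
    return total


def dic(loglik, loglik_at_mean):
    """Return (DIC, p_D) with DIC = mean deviance + p_D."""
    n_draws = len(loglik)
    # Posterior mean of the deviance, -2 * total log-likelihood per draw.
    mean_deviance = -2.0 * sum(sum(row) for row in loglik) / n_draws
    # Deviance evaluated at the posterior mean of the parameters.
    deviance_at_mean = -2.0 * sum(loglik_at_mean)
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return mean_deviance + p_d, p_d


if __name__ == "__main__":
    # Toy input: 3 posterior draws, 2 observations (made-up numbers).
    toy_loglik = [[-0.6, -1.1], [-0.5, -1.0], [-0.7, -0.9]]
    toy_at_mean = [-0.58, -0.98]
    print("LPML:", lpml(toy_loglik))
    print("DIC, p_D:", dic(toy_loglik, toy_at_mean))
```

Under these definitions, a smaller DIC and a larger LPML indicate a better-fitting model, so competing missing mechanism models fitted to the same data can be ranked on either criterion.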

Author information

Authors and Affiliations

Key Lab of Statistical Modeling and Data Analysis of Yunnan Province, School of Mathematics and Statistics, Yunnan University, 2 North Cuihu Road, Wuhua District, Kunming, 650091, Yunnan, China
Jiwei Zhang
School of Mathematics and Statistics, Yili Normal University, 448 Jie Fang Road, Yili, Xinjiang, 835000, China
Zhaoyuan Zhang
Key Laboratory of Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, 5268 Ren Min Street, Nanguan District, Changchun, 130024, Jilin, China
Jian Tao

Corresponding author

Correspondence to Zhaoyuan Zhang.

About this article

Cite this article

Zhang, J., Zhang, Z. & Tao, J. Bayesian algorithm based on auxiliary variables for estimating item response theory models with non-ignorable missing response data. J. Korean Stat. Soc. 50, 955–996 (2021). https://doi.org/10.1007/s42952-020-00100-6

Received: 05 July 2020
Accepted: 07 December 2020
Published: 08 January 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s42952-020-00100-6
