Abstract
Missing responses are common in educational and psychological assessments. Statistical inference can be seriously biased if non-ignorable missing responses are not modeled properly. This study examines whether models based on different missing mechanisms (ignorable and non-ignorable) are appropriate for analyzing missing response data, from the perspectives of parameter estimation and model assessment. In addition, a highly effective Bayesian sampling algorithm based on auxiliary variables is used to estimate the complex models. The advantages of the new algorithm over the traditional marginal likelihood method and other Bayesian algorithms are discussed in detail. Based on the Markov chain Monte Carlo samples from the posterior distributions, the deviance information criterion (DIC) and the logarithm of the pseudomarginal likelihood (LPML) are employed to compare the different missing mechanism models. Four simulation studies are conducted, and a detailed analysis of PISA science data is carried out to further illustrate the proposed methodology.
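As an illustration of the model comparison step, the following minimal sketch shows one standard way to compute DIC and LPML from posterior draws. It is not the authors' implementation. The function names `log_mean_exp`, `lpml`, and `dic`, and the input layout (`loglik[s][i]` holding the log-likelihood of observation `i` under draw `s`), are assumptions made for this example.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def lpml(loglik):
    """LPML = sum_i log CPO_i, where CPO_i is the harmonic mean over draws
    of p(y_i | theta^(s)). loglik[s][i] = log p(y_i | theta^(s)) (assumed layout)."""
    n_obs = len(loglik[0])
    total = 0.0
    for i in range(n_obs):
        neg = [-draw[i] for draw in loglik]
        # log CPO_i = -log( mean_s exp(-loglik[s][i]) )
        total += -log_mean_exp(neg)
    return total


def dic(loglik, loglik_at_posterior_mean):
    """DIC = D_bar + p_D, with p_D = D_bar - D(theta_bar).
    loglik_at_posterior_mean is the total log-likelihood evaluated at the
    posterior mean of the parameters, supplied by the caller."""
    deviances = [-2.0 * sum(draw) for draw in loglik]
    d_bar = sum(deviances) / len(deviances)
    p_d = d_bar - (-2.0 * loglik_at_posterior_mean)
    return d_bar + p_d


# Toy input: 4 draws, 3 observations, every likelihood value equal to 0.5.
toy = [[math.log(0.5)] * 3 for _ in range(4)]
print(lpml(toy), dic(toy, 3 * math.log(0.5)))
```

Under these definitions, a larger LPML and a smaller DIC indicate the better-supported model.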
Cite this article
Zhang, J., Zhang, Z. & Tao, J. Bayesian algorithm based on auxiliary variables for estimating item response theory models with non-ignorable missing response data. J. Korean Stat. Soc. 50, 955–996 (2021). https://doi.org/10.1007/s42952-020-00100-6
DOI: https://doi.org/10.1007/s42952-020-00100-6