Bayesian modeling of measurement error in predictor variables using item response theory

Psychometrika

Abstract

This article shows that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, which may be defined at any level of a hierarchical regression model, are treated as latent variables. The normal ogive model describes the relation between the latent variables and dichotomous observed variables, which may be responses to tests or questionnaires. The resulting multilevel model with measurement error in the observed predictor variables can be estimated in a Bayesian framework using Gibbs sampling. Handling measurement error via the normal ogive model is compared with alternative approaches based on the classical true score model. Examples using real data are given.
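
To make the motivation concrete, the following minimal sketch simulates why an error-prone observed score is a poor stand-in for a latent predictor. It is not the authors' Gibbs sampler. It assumes a common two-parameter normal ogive form, P(Y = 1 | θ) = Φ(aθ − b), which this preview does not spell out, and the item parameters, sample size, regression coefficient, and function names are all hypothetical choices made for illustration.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF, Phi(x), computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def response_probability(theta, a, b):
    """P(Y = 1 | theta) = Phi(a * theta - b); a and b are assumed item parameters."""
    return normal_cdf(a * theta - b)


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def attenuation_demo(n_persons=5000, seed=1):
    """Return (slope on latent theta, slope on standardized sum score)."""
    rng = random.Random(seed)
    # Hypothetical items: equal discrimination, spread-out difficulties.
    items = [(1.0, b) for b in (-1.0, -0.5, 0.0, 0.5, 1.0)]
    thetas, scores, outcomes = [], [], []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)  # latent predictor
        # Dichotomous responses drawn from the normal ogive model, then summed.
        score = sum(
            1 for a, b in items
            if rng.random() < response_probability(theta, a, b)
        )
        thetas.append(theta)
        scores.append(score)
        # Outcome depends on the latent predictor with true coefficient 1.0.
        outcomes.append(1.0 * theta + rng.gauss(0.0, 0.5))
    return ols_slope(thetas, outcomes), ols_slope(standardize(scores), outcomes)


if __name__ == "__main__":
    latent, observed = attenuation_demo()
    print(f"slope on latent theta:           {latent:.3f}")
    print(f"slope on standardized sum score: {observed:.3f}")
```

Under these assumptions the slope estimated from the standardized sum score is smaller than the slope on the latent variable, which is the attenuation that motivates treating the predictor as latent rather than plugging in an observed score.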

References

  • Albert, J.H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251–269.

  • Bock, R.D., & Zimowski, M.F. (1997). Multiple group IRT. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York, NY: Springer.

  • Béguin, A.A. (2000). Robustness of equating high-stakes tests. Unpublished doctoral dissertation, Twente University, Enschede, Netherlands.

  • Béguin, A.A., & Glas, C.A.W. (2001). MCMC estimation of multidimensional IRT models. Psychometrika, 66, 541–562.

  • Bernardo, J.M., & Smith, A.F.M. (1994). Bayesian theory. New York, NY: John Wiley & Sons.

  • Best, N.G., Cowles, M.K., & Vines, S.K. (1995). CODA Convergence diagnosis and output analysis software for Gibbs sampler output: Version 0.3 [Computer software and manual]. University of Cambridge: MRC Biostatistics Unit.

  • Bollen, K.A. (1989). Structural equations with latent variables. New York, NY: John Wiley & Sons.

  • Bosker, R.J., Blatchford, P., & Meijnen, G.W. (1999). Enhancing educational excellence, equity and efficiency. In R.J. Bosker, B.P.M. Creemers, & S. Stringfield (Eds.), Evidence from evaluations of systems and schools in change (pp. 89–112). Dordrecht/Boston/London: Kluwer Academic Publishers.

  • Box, G.E.P., & Tiao, G.C. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley Publishing.

  • Bryk, A.S., & Raudenbush, S.W. (1992). Hierarchical linear models. Newbury Park, CA: Sage Publications.

  • Carlin, B.P., & Louis, T.A. (1996). Bayes and empirical Bayes methods for data analysis. London: Chapman & Hall.

  • Carroll, R., Ruppert, D., & Stefanski, L.A. (1995). Measurement error in nonlinear models. London: Chapman & Hall.

  • Chen, M.-H., & Shao, Q.-M. (1999). Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics, 8, 69–92.

  • Chib, S., & Greenberg, E. (1995). Understanding the Metropolis-Hastings Algorithm. The American Statistician, 49, 327–335.

  • Cook, T.D., & Campbell, D.T. (1979). Quasi-experimentation, design & analysis issues for field settings. Chicago, IL: Rand McNally College Publishing.

  • de Leeuw, J., & Kreft, I.G.G. (1986). Random coefficient models for multilevel analysis. Journal of Educational and Behavioral Statistics, 11, 57–86.

  • Fox, J.-P. (2001). Multilevel IRT: A Bayesian perspective on estimating parameters and testing statistical hypotheses. Unpublished doctoral dissertation, Twente University, Enschede, Netherlands.

  • Fox, J.-P., & Glas, C.A.W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 269–286.

  • Fuller, W.A. (1987). Measurement error models. New York, NY: John Wiley & Sons.

  • Gelfand, A.E., & Smith, A.F.M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.

  • Gelfand, A.E., Hills, S.E., Racine-Poon, A., & Smith, A.F.M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. Journal of the American Statistical Association, 85, 972–985.

  • Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (1995). Bayesian data analysis. London: Chapman & Hall.

  • Gelman, A., Meng, X.-L., & Stern, H.S. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733–807.

  • Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.

  • Gilks, W.R., & Roberts, G.O. (1996). Strategies for improving MCMC. In W.R. Gilks, S. Richardson, & D.J. Spiegelhalter (Eds.), Markov Chain Monte Carlo in practice (pp. 89–114). London: Chapman & Hall.

  • Goldstein, H. (1995). Multilevel statistical models (2nd ed.). London: Edward Arnold.

  • Gruber, M.H.J. (1998). Improving efficiency by shrinkage. New York, NY: Marcel Dekker.

  • Hoijtink, H., & Boomsma, A. (1995). On person parameter estimation in the dichotomous Rasch model. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 53–68). New York, NY: Springer.

  • Johnson, V.E., & Albert, J.H. (1999). Ordinal data modeling. New York, NY: Springer-Verlag.

  • Lindley, D.V., & Smith, A.F.M. (1972). Bayes estimates for the linear model. Journal of the Royal Statistical Society, Series B, 34, 1–41.

  • Liu, J.S., Wong, W.H., & Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika, 81, 27–40.

  • Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

  • Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

  • MacEachern, S.N., & Berliner, L.M. (1994). Subsampling the Gibbs sampler. The American Statistician, 48, 188–190.

  • McDonald, R.P. (1967). Nonlinear factor analysis. Psychometrika Monograph Number 15.

  • McDonald, R.P. (1982). Linear versus nonlinear models in latent trait theory. Applied Psychological Measurement, 6, 379–396.

  • McDonald, R.P. (1997). Normal-ogive multidimensional model. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 257–269). New York, NY: Springer.

  • Muthén, B.O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54, 557–585.

  • Patz, R.J., & Junker, B.W. (1999). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.

  • Raudenbush, S.W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13, 85–116.

  • Raudenbush, S.W., Bryk, A.S., Cheong, Y.F., & Congdon, R.T., Jr. (2000). HLM 5. Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International.

  • Richardson, S. (1996). Measurement error. In W.R. Gilks, S. Richardson, & D.J. Spiegelhalter (Eds.), Markov Chain Monte Carlo in practice (pp. 401–417). London: Chapman & Hall.

  • Robert, C.P., & Casella, G. (1999). Monte Carlo statistical methods. New York, NY: Springer.

  • Roberts, G.O., & Sahu, S.K. (1997). Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. Journal of the Royal Statistical Society, Series B, 59, 291–317.

  • Seltzer, M.H. (1993). Sensitivity analysis for fixed effects in the hierarchical model: A Gibbs sampling approach. Journal of Educational Statistics, 18, 207–235.

  • Seltzer, M.H., Wong, W.H., & Bryk, A.S. (1996). Bayesian analysis in applications of hierarchical models: Issues and methods. Journal of Educational and Behavioral Statistics, 21, 131–167.

  • Snijders, T.A.B., & Bosker, R.J. (1999). Multilevel analysis. London: Sage Publications.

  • Tanner, M.A., & Wong, W.H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528–550.

  • Tierney, L. (1994). Markov chains for exploring posterior distributions. The Annals of Statistics, 22, 1701–1762.

  • van der Linden, W.J. (1998). Optimal assembly of psychological and educational tests. Applied Psychological Measurement, 22, 195–211.

  • Zellner, A. (1971). An introduction to Bayesian inference in econometrics. New York, NY: John Wiley & Sons.

  • Zimowski, M.F., Muraki, E., Mislevy, R.J., & Bock, R.D. (1996). Bilog MG, Multiple-group IRT analysis and test maintenance for binary items. Chicago, IL: Scientific Software International.

Author information

Corresponding author

Correspondence to Jean-Paul Fox.

Additional information

This paper is part of the dissertation by Fox (2001) that won the 2002 Psychometric Society Dissertation Award.

About this article

Cite this article

Fox, J.-P., Glas, C.A.W. Bayesian modeling of measurement error in predictor variables using item response theory. Psychometrika 68, 169–191 (2003). https://doi.org/10.1007/BF02294796

  • DOI: https://doi.org/10.1007/BF02294796
