Abstract
Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of various Markov chain Monte Carlo (MCMC) estimation routines to estimate parameters of a wide variety of confirmatory item factor analysis models. Further, I show that these methods can be implemented in a flexible way that requires minimal technical sophistication on the part of the end user. After an overview of item factor analysis and MCMC, results from several examples (simulated and real) are discussed. Most of these examples focus on models that are problematic for current “gold-standard” estimators. The results demonstrate that it is possible to obtain accurate parameter estimates using MCMC in a relatively user-friendly package.
Additional information
I would like to thank Li Cai, David Thissen, and R.J. Wirth for comments on earlier versions of this draft. I would like to thank Roger Millsap and the reviewers for their guidance on revisions. The resulting paper is better for all of your efforts. Any remaining faults are my own.
Cite this article
Edwards, M.C. A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis. Psychometrika 75, 474–497 (2010). https://doi.org/10.1007/s11336-010-9161-9
DOI: https://doi.org/10.1007/s11336-010-9161-9