A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis

Edwards, Michael C.

doi:10.1007/s11336-010-9161-9

Published: 02 April 2010

Psychometrika, Volume 75, pages 474–497 (2010)

Michael C. Edwards¹

Abstract

Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of Markov chain Monte Carlo (MCMC) estimation routines for estimating the parameters of a wide variety of confirmatory item factor analysis models. Further, I show that these methods can be implemented in a flexible way that requires minimal technical sophistication on the part of the end user. After an overview of item factor analysis and MCMC, results from several examples (simulated and real) are discussed. Most of these examples focus on models that are problematic for current “gold-standard” estimators. The results demonstrate that it is possible to obtain accurate parameter estimates using MCMC in a relatively user-friendly package.
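To give a rough sense of what an MCMC routine looks like in this setting, the sketch below runs a random-walk Metropolis sampler for a single respondent's latent trait under a one-factor normal ogive model. It is not the algorithm developed in the paper: the model form P(y = 1 | theta) = Phi(a * theta + c), the standard normal prior, the function names, and the item parameter values are all assumptions chosen for illustration, and the item parameters are treated as known rather than estimated.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the ogive link


def log_posterior(theta, responses, slopes, intercepts):
    """Unnormalised log posterior of one latent trait value theta.

    Assumed model (illustrative only):
    P(y_j = 1 | theta) = Phi(a_j * theta + c_j), with a N(0, 1) prior on theta.
    """
    log_p = -0.5 * theta * theta  # log N(0, 1) prior, up to a constant
    for y, a, c in zip(responses, slopes, intercepts):
        p = STD_NORMAL.cdf(a * theta + c)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        log_p += math.log(p) if y == 1 else math.log(1.0 - p)
    return log_p


def metropolis_theta(responses, slopes, intercepts,
                     n_iter=5000, burn_in=1000, step=1.0, seed=1):
    """Random-walk Metropolis draws from the posterior of theta."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, responses, slopes, intercepts)
    draws = []
    for it in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, responses, slopes, intercepts)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if it >= burn_in:
            draws.append(theta)  # keep post-burn-in draws only
    return draws


# Hypothetical item parameters for a four-item test.
slopes = [1.0, 1.2, 0.8, 1.5]
intercepts = [0.0, -0.5, 0.5, 0.0]

high = metropolis_theta([1, 1, 1, 1], slopes, intercepts)
low = metropolis_theta([0, 0, 0, 0], slopes, intercepts)
print(sum(high) / len(high), sum(low) / len(low))
```

The two printed values are posterior mean estimates of the latent trait for an all-correct and an all-incorrect response pattern. A full confirmatory item factor analysis would also sample the item parameters and handle several correlated factors, which this sketch deliberately leaves out.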

Author information

Authors and Affiliations

1827 Neil Avenue, Columbus, OH, 43210, USA
Michael C. Edwards

Corresponding author

Correspondence to Michael C. Edwards.

Additional information

I would like to thank Li Cai, David Thissen, and R.J. Wirth for comments on earlier versions of this draft. I would like to thank Roger Millsap and the reviewers for their guidance on revisions. The resulting paper is better for all of your efforts. Any remaining faults are my own.

About this article

Cite this article

Edwards, M.C. A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis. Psychometrika 75, 474–497 (2010). https://doi.org/10.1007/s11336-010-9161-9

Received: 19 October 2009
Revised: 30 January 2010
Published: 02 April 2010
Issue Date: September 2010
DOI: https://doi.org/10.1007/s11336-010-9161-9
