Abstract
This paper considers the reflection unidentifiability problem in confirmatory factor analysis (CFA) and the associated implications for Bayesian estimation. We note a direct analogy between the multimodality in CFA models that is due to all possible column sign changes in the matrix of loadings and the multimodality in finite mixture models that is due to all possible relabelings of the mixture components. Drawing on this analogy, we derive and present a simple approach for dealing with reflection invariance in Bayesian factor analysis. We recommend fitting Bayesian factor analysis models without rotational constraints on the loadings, which allows Markov chain Monte Carlo algorithms to explore the full posterior distribution, and then using a relabeling algorithm to pick a factor solution that corresponds to one mode. We demonstrate our approach on the case of a bifactor model; however, the relabeling algorithm is straightforward to generalize for handling multimodalities due to sign invariance in the likelihood in other factor analysis models.
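The sign invariance described above can be illustrated with a minimal sketch. It is not the relabeling algorithm developed in the paper or implemented in the relabeLoadings package; it is a simplified stand-in that aligns the column signs of each simulated draw with a fixed reference draw. The loading values, the noise level, and the helper names `implied_cov`, `flip_columns`, and `relabel` are all hypothetical and chosen only for illustration.

```python
import random


def implied_cov(loadings, uniq):
    """Model-implied covariance: Lambda Lambda^T + diag(uniq)."""
    p = len(loadings)
    return [
        [
            sum(a * b for a, b in zip(loadings[i], loadings[j]))
            + (uniq[i] if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]


def flip_columns(loadings, signs):
    """Multiply column j of the loading matrix by signs[j] (+1 or -1)."""
    return [[s * x for s, x in zip(signs, row)] for row in loadings]


def relabel(draws, reference):
    """Flip each column of each draw so that it agrees in sign with the
    reference (simplified alignment rule, assumed for illustration)."""
    k = len(reference[0])
    aligned = []
    for lam in draws:
        signs = []
        for j in range(k):
            # Inner product of column j with the reference column j.
            dot = sum(row[j] * ref[j] for row, ref in zip(lam, reference))
            signs.append(1 if dot >= 0 else -1)
        aligned.append(flip_columns(lam, signs))
    return aligned


def column_means(draws):
    """Elementwise average of the loading matrices over draws."""
    n, p, k = len(draws), len(draws[0]), len(draws[0][0])
    return [[sum(d[i][j] for d in draws) / n for j in range(k)] for i in range(p)]


# Simulated "posterior draws": noisy copies of a made-up loading matrix
# with random column sign flips, mimicking a sampler that visits all modes.
random.seed(0)
true_loadings = [[0.8, 0.5], [0.7, 0.4], [0.6, -0.3], [0.9, -0.6]]
draws = []
for _ in range(200):
    noisy = [[x + random.gauss(0.0, 0.05) for x in row] for row in true_loadings]
    signs = [random.choice([-1, 1]) for _ in range(2)]
    draws.append(flip_columns(noisy, signs))

aligned = relabel(draws, reference=draws[0])

# Averaging the raw draws mixes modes; averaging the aligned draws does not.
print("raw mean:    ", column_means(draws)[0])
print("aligned mean:", column_means(aligned)[0])
```

Because the covariance implied by the loadings is unchanged by any column sign flip, every sign pattern is an equally valid mode; the fixed reference used here only selects which of those modes is reported.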
Notes
To reduce clutter, we plot only selected factor loadings that illustrate our points.
We present two chains out of three to reduce clutter.
Note that the label-switching problem does not apply to maximum a posteriori or ML estimation methods that are not MCMC-based (e.g., Celeux, Forbes, Robert, & Titterington, 2006, p. 656).
References
Anderson, T. W., & Rubin, H. (1956). Statistical inference in factor analysis. In J. Neyman (Ed.), Proceedings of the third Berkeley symposium on mathematical statistics and probability (pp. 111–150). Oakland: University of California Press.
Bafumi, J., Gelman, A., Park, D. K., & Kaplan, N. (2005). Practical issues in implementing and understanding Bayesian ideal point estimation. Political Analysis, 13, 171–187.
Bishop, Y., Fienberg, S. E., & Holland, P. (1975). Discrete multivariate analysis: Theory and practice. Cambridge: The MIT Press.
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Celeux, G., Forbes, F., Robert, C. P., & Titterington, D. M. (2006). Deviance information criteria for missing data models. Bayesian Analysis, 1, 651–673.
Celeux, G., Hurn, M., & Robert, C. P. (2000). Computational and inferential difficulties with mixture posterior distributions. Journal of the American Statistical Association, 95, 957–970.
Clarkson, D. B. (1979). Estimating the standard errors of rotated factor loadings by jackknifing. Psychometrika, 44, 297–314.
Congdon, P. (2003). Applied Bayesian modelling. New York: Wiley.
Congdon, P. (2006). Bayesian statistical modelling. New York: Wiley.
Cressie, N., & Read, T. (1989). Pearson’s \(\chi ^2\) and the loglikelihood ratio statistic \(G^2\): A comparative review. International Statistical Review, 57, 19–43.
Curtis, S. M., & Erosheva, E. A. (2016). relabeLoadings: Relabel loadings from MCMC output for confirmatory factor analysis. R package version 1.0.
Dolan, C. V., & Molenaar, P. C. (1991). A comparison of four methods of calculating standard errors of maximum-likelihood estimates in the analysis of covariance structure. British Journal of Mathematical and Statistical Psychology, 44, 359–368.
Drton, M. (2009). Likelihood ratio tests and singularities. The Annals of Statistics, 37, 979–1012.
Erosheva, E. A., & Curtis, S. M. (2011). Dealing with rotational invariance in Bayesian confirmatory factor analysis. Technical Report 589, University of Washington.
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1, 515–533.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis. New York: Chapman & Hall/CRC.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel and hierarchical models. Cambridge: Cambridge University Press.
Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics 4 (pp. 169–194). Oxford: Clarendon Press.
Geweke, J., & Zhou, G. (1996). Measuring the pricing error of the arbitrage pricing theory. Review of Financial Studies, 9, 557–587.
Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B (Methodological), 54, 657–699.
Ghosh, J., & Dunson, D. (2008). Bayesian model selection in factor analytic models. In D. B. Dunson (Ed.), Random effect and latent variable model selection (pp. 151–163). Berlin: Springer.
Ghosh, J., & Dunson, D. (2009). Default prior distributions and efficient posterior computation in Bayesian factor analysis. Journal of Computational and Graphical Statistics, 18, 306–320.
Gibbons, R. D., & Hedeker, D. R. (1992). Full-information item bi-factor analysis. Psychometrika, 57, 423–436.
Gruhl, J., Erosheva, E. A., Crane, P. K., et al. (2013). A semiparametric approach to mixed outcome latent variable models: Estimating the association between cognition and regional brain volumes. The Annals of Applied Statistics, 7, 2361–2383.
Holzinger, K. J., & Swineford, F. (1937). The bi-factor method. Psychometrika, 2, 41–54.
Holzinger, K. J., & Swineford, F. (1939). A study in factor analysis: The stability of a bi-factor solution, no. 48 in Supplementary Educational Monographs, University of Chicago.
Jackman, S. (2001). Multidimensional analysis of roll call data via Bayesian simulation: Identification, estimation, inference, and model checking. Political Analysis, 9, 227–241.
Jasra, A., Holmes, C., & Stephens, D. (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statistical Science, 20, 50–67.
Jennrich, R. I. (1978). Rotational equivalence of factor loading matrices with specified values. Psychometrika, 43, 421–426.
Lee, S.-Y. (1981). A Bayesian approach to confirmatory factor analysis. Psychometrika, 46, 153–160. doi:10.1007/BF02293896.
Lee, S.-Y. (2007). Structural equation modeling: A Bayesian approach. West Sussex: Wiley.
Loken, E. (2005). Identification constraints and inference in factor analysis models. Structural Equation Modeling, 12, 232–244.
Lopes, H. F., & West, M. (2004). Bayesian model assessment in factor analysis. Statistica Sinica, 14, 41–67.
Lunn, D. J., Thomas, A., Best, N., & Spiegelhalter, D. (2000). WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing, 10, 325–337.
Martin, J., & McDonald, R. (1975). Bayesian estimation in unrestricted factor analysis: A treatment for Heywood cases. Psychometrika, 40, 505–517. doi:10.1007/BF02291552.
Matlosz, K. (2013). Bayesian multidimensional scaling model for ordinal preference data. Ph.D. thesis, Columbia University.
Millsap, R. E. (2001). When trivial constraints are not trivial: The choice of uniqueness constraints in confirmatory factor analysis. Structural Equation Modeling, 8, 1–17.
Muirhead, R. J. (1982). Aspects of multivariate statistical theory. New York: Wiley.
Muthén, L. K., & Muthén, B. O. (2005). Mplus: Statistical analysis with latent variables: User’s guide. Los Angeles: Muthén & Muthén.
Nishihara, R., Minka, T., & Tarlow, D. (2013). Detecting parameter symmetries in probabilistic models. arXiv preprint arXiv:1312.5386.
Peeters, C. F. (2012). Bayesian exploratory and confirmatory factor analysis: Perspectives on constrained-model selection. Ph.D. thesis, Utrecht University.
Pennell, R. (1972). Routinely computable confidence intervals for factor loadings using the ’Jack-knife’. British Journal of Mathematical and Statistical Psychology, 25, 107–114.
Quinn, K. M. (2004). Bayesian factor analysis for mixed ordinal and continuous responses. Political Analysis, 12, 338–353.
R Development Core Team. (2010). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Rabe-Hesketh, S., & Skrondal, A. (2008). Multilevel and longitudinal modeling using Stata. College Station: Stata Press.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36.
Rowe, D. B. (2001). A model for Bayesian factor analysis with jointly distributed means and loadings. Social Science Working Paper, 1108, 1–16.
Savitsky, T. D., & McCaffrey, D. F. (2014). Bayesian hierarchical multivariate formulation with factor analysis for nested ordinal data. Psychometrika, 79, 275–302.
Scheines, R., Hoijtink, H., & Boomsma, A. (1999). Bayesian estimation and testing of structural equation models. Psychometrika, 64, 37–52.
Schervish, M. J. (1995). Theory of statistics. New York: Springer.
Stephens, M. (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society, Series B, 62, 795–809.
Tanaka, K. (2013). A Bayesian multidimensional scaling model for partial rank preference data. Ph.D. thesis, Columbia University.
Vermunt, J. K., & Magidson, J. (2005). Technical guide for Latent GOLD 4.0: Basic and advanced. Belmont: Statistical Innovations Inc.
Acknowledgements
This research was supported by Grant R01 AG029672-01A1 from the National Institutes of Health. The authors are grateful to Thomas Richardson, Adrian Raftery, Peter Hoff, Jonathan Gruhl and Y. Samuel Wang for helpful discussions, and to Terrance Savitsky for useful comments on an earlier draft of the paper and on the R code.
Cite this article
Erosheva, E.A., Curtis, S.M. Dealing with Reflection Invariance in Bayesian Factor Analysis. Psychometrika 82, 295–307 (2017). https://doi.org/10.1007/s11336-017-9564-y
DOI: https://doi.org/10.1007/s11336-017-9564-y