Abstract
In this paper, various types of finite mixtures of confirmatory factor-analysis models are proposed for handling data heterogeneity. Under the proposed mixture approach, observations are assumed to be drawn from mixtures of distinct confirmatory factor-analysis models, but each observation need not be assigned to a particular model before model fitting. Several classes of mixture models are proposed, which differ in how they represent data heterogeneity. Three different sampling schemes for these mixture models are distinguished, and a mixed type of these three sampling schemes is considered throughout this article. The proposed mixture approach reduces to regular multiple-group confirmatory factor analysis under a restrictive sampling scheme, in which the structural equation model for each observation is assumed to be known. Assuming a mixture of multivariate normals for the data, maximum likelihood estimation is developed using both the EM (Expectation-Maximization) algorithm and the AS (Approximate-Scoring) method. Some mixture models were fitted to a real data set to illustrate the application of the theory. Although the EM algorithm and the AS method gave similar sets of parameter estimates, the AS method was found to be computationally more efficient than the EM algorithm. Some comments on applying the mixture approach to structural equation modeling are made.
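As a rough sketch of the model class described above, the display below writes a finite mixture of multivariate normals in which each component's mean and covariance follow a confirmatory factor-analysis structure, together with the posterior membership probabilities that an EM iteration would compute. All symbols ($K$, $\pi_k$, $\nu_k$, $\Lambda_k$, $\kappa_k$, $\Phi_k$, $\Psi_k$, $\tau_{ik}$) are notation assumed here for illustration and may differ from the article's own. The mixing-proportion update in the last line assumes unconstrained proportions and no observations with known component labels.

```latex
\begin{align}
  % Mixture density for a p-variate observation x_i
  f(x_i;\theta) &= \sum_{k=1}^{K} \pi_k\,\phi_p(x_i;\mu_k,\Sigma_k),
    \qquad \pi_k > 0,\quad \sum_{k=1}^{K}\pi_k = 1, \\
  % Confirmatory factor-analysis structure within component k
  \mu_k &= \nu_k + \Lambda_k\kappa_k,
    \qquad \Sigma_k = \Lambda_k\Phi_k\Lambda_k^{\top} + \Psi_k, \\
  % E-step: posterior probability that x_i belongs to component k
  \tau_{ik} &= \frac{\pi_k\,\phi_p(x_i;\mu_k,\Sigma_k)}
                    {\sum_{j=1}^{K}\pi_j\,\phi_p(x_i;\mu_j,\Sigma_j)}, \\
  % M-step update for the mixing proportions (N observations)
  \pi_k^{\text{new}} &= \frac{1}{N}\sum_{i=1}^{N}\tau_{ik}.
\end{align}
```

Here $\phi_p$ denotes the $p$-variate normal density. If the component of every observation were known, each $\tau_{ik}$ would be fixed at 0 or 1 and the likelihood would separate by group, which is consistent with the reduction to multiple-group analysis noted in the abstract.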
Additional information
Note: This paper is one of the Psychometric Society's 1995 Dissertation Award papers.—Editor
This article is based on the author's dissertation. The author would like to thank Peter Bentler, who was the dissertation chair, for his guidance and encouragement of this work. Eric Holman, Robert Jennrich, Bengt Muthén, and Thomas Wickens, who served as committee members for the dissertation, were very supportive and helpful. The author also thanks Michael Browne for discussing some important points about the use of the approximate information in the dissertation. Thanks also go to an anonymous associate editor, whose comments were very useful in revising an earlier version of this article.
Cite this article
Yung, YF. Finite mixtures in confirmatory factor-analysis models. Psychometrika 62, 297–330 (1997). https://doi.org/10.1007/BF02294554
DOI: https://doi.org/10.1007/BF02294554