Maximum likelihood estimation in constrained parameter spaces for mixtures of factor analyzers

Statistics and Computing

Abstract

Mixtures of factor analyzers are increasingly popular in model-based clustering of multivariate data. Under the likelihood approach to data modeling, it is well known that the unconstrained likelihood function may present spurious maxima and singularities. To reduce these drawbacks, in this paper we introduce a procedure for parameter estimation of mixtures of factor analyzers, which maximizes the likelihood function under the mild requirement that the eigenvalues of the covariance matrices lie in some interval [a,b]. Moreover, we give a recipe for selecting appropriate bounds for the constrained EM algorithm directly from the data at hand. We then analyze and measure its performance, compared with the usual unconstrained approach and with other constrained models in the literature. Results show that the data-driven constraints improve both the estimation and the subsequent classification.
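The eigenvalue requirement can be illustrated with a minimal sketch. This is not the authors' constrained EM algorithm, only an assumed, simplified projection step: it takes a covariance estimate, computes its eigendecomposition, and forces each eigenvalue into [a,b]. The function name `clip_eigenvalues`, the example matrix, and the bound values are hypothetical and chosen for illustration.

```python
import numpy as np


def clip_eigenvalues(sigma, a, b):
    """Return a symmetric matrix with the eigenvectors of sigma and
    eigenvalues forced into [a, b] (hypothetical helper, not from the paper)."""
    if not 0 < a <= b:
        raise ValueError("bounds must satisfy 0 < a <= b")
    # Symmetrize to guard against small numerical asymmetries.
    sym = (sigma + sigma.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    clipped = np.clip(eigvals, a, b)
    # Rebuild V diag(clipped) V^T; broadcasting scales each eigenvector column.
    return (eigvecs * clipped) @ eigvecs.T


# A nearly singular covariance estimate, the kind of degenerate solution
# that an unconstrained likelihood can drift towards.
sigma = np.array([[4.0, 0.0],
                  [0.0, 1e-8]])
constrained = clip_eigenvalues(sigma, a=0.01, b=2.0)
```

Bounding the eigenvalues away from zero keeps each component covariance invertible, which is what rules out the singularities mentioned above; the upper bound limits how diffuse a component can become.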

Acknowledgements

The authors sincerely thank the Associate Editor and the referees for their very helpful comments and valuable suggestions. Their pertinent comments also helped us fix some not-so-minor details in the exposition, which greatly improved the quality of the final version.

Author information

Corresponding author

Correspondence to Francesca Greselin.

About this article

Cite this article

Greselin, F., Ingrassia, S. Maximum likelihood estimation in constrained parameter spaces for mixtures of factor analyzers. Stat Comput 25, 215–226 (2015). https://doi.org/10.1007/s11222-013-9427-z

  • DOI: https://doi.org/10.1007/s11222-013-9427-z
