A data driven equivariant approach to constrained Gaussian mixture modeling

Rocci, Roberto; Gattone, Stefano Antonio; Di Mari, Roberto

doi:10.1007/s11634-016-0279-1

Regular Article
Published: 06 January 2017

Volume 12, pages 235–260 (2018)

Advances in Data Analysis and Classification

Roberto Rocci¹,
Stefano Antonio Gattone² &
Roberto Di Mari ORCID: orcid.org/0000-0001-5498-009X³

Abstract

Maximum likelihood estimation of Gaussian mixture models with different class-specific covariance matrices is known to be problematic. This is due to the unboundedness of the likelihood, together with the presence of spurious maximizers. Existing methods to bypass this obstacle rely on the fact that unboundedness is avoided if the eigenvalues of the covariance matrices are bounded away from zero. This can be done by imposing constraints on the covariance matrices, i.e. by incorporating a priori information on the covariance structure of the mixture components. The present work introduces a constrained approach in which the class-conditional covariance matrices are shrunk towards a pre-specified target matrix \(\boldsymbol{\Psi}\). Data-driven choices of the matrix \(\boldsymbol{\Psi}\), for use when a priori information is not available, and of the optimal amount of shrinkage are investigated. Constraints based on a data-driven \(\boldsymbol{\Psi}\) are then shown to be equivariant with respect to linear affine transformations, provided that the method used to select the target matrix is also equivariant. The effectiveness of the proposal is evaluated through a simulation study and an empirical example.
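
To make the idea of constraining a covariance matrix relative to a target more concrete, the following is a minimal sketch, not the authors' algorithm. It assumes one possible form of the constraint: the eigenvalues of a covariance matrix, measured relative to the target, are clipped into the interval [1/sqrt(c), sqrt(c)]. The constant `c`, that interval, the function names and the example matrices are all assumptions introduced for illustration; the abstract does not specify them. The sketch also checks numerically that this particular constraint is equivariant when the data are transformed by a nonsingular matrix `A` and the target is transformed in the same way.

```python
import numpy as np


def sym_sqrt_and_inv(psi):
    """Return the symmetric square root of an SPD matrix and its inverse."""
    vals, vecs = np.linalg.eigh(psi)
    root = (vecs * np.sqrt(vals)) @ vecs.T
    inv_root = (vecs / np.sqrt(vals)) @ vecs.T
    return root, inv_root


def constrain_to_target(sigma, psi, c):
    """Clip the eigenvalues of sigma, relative to psi, into [1/sqrt(c), sqrt(c)].

    Hypothetical helper: the interval is an illustrative assumption.
    """
    root, inv_root = sym_sqrt_and_inv(psi)
    # Express sigma in coordinates where the target psi is the identity.
    whitened = inv_root @ sigma @ inv_root
    whitened = (whitened + whitened.T) / 2.0  # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(whitened)
    clipped = np.clip(vals, 1.0 / np.sqrt(c), np.sqrt(c))
    # Map the constrained matrix back to the original coordinates.
    return root @ (vecs * clipped) @ vecs.T @ root


# Illustrative inputs (assumed values, not taken from the paper).
sigma = np.array([[1e-10, 0.0], [0.0, 1.0]])  # nearly singular covariance
psi = np.array([[2.0, 0.5], [0.5, 1.0]])      # target matrix
c = 4.0                                       # relative eigenvalues kept in [0.5, 2]

constrained = constrain_to_target(sigma, psi, c)

# Equivariance check: transform both sigma and psi by a nonsingular A.
A = np.array([[3.0, 1.0], [0.0, 0.5]])
constrained_t = constrain_to_target(A @ sigma @ A.T, A @ psi @ A.T, c)
equivariant = np.allclose(constrained_t, A @ constrained @ A.T)
```

In this sketch the nearly singular `sigma` is moved away from singularity because its smallest relative eigenvalue is raised to 1/sqrt(c). The equivariance check relies on the target being transformed together with the data, which mirrors the condition stated in the abstract that the method used to select the target matrix must itself be equivariant.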

Acknowledgements

The authors are grateful to the associate editor and the two anonymous referees for their useful comments, which have led to a considerable improvement of the paper.

Author information

Authors and Affiliations

Department of Economics and Finance (DEF), University of Rome Tor Vergata, Rome, Italy
Roberto Rocci
DiSFPEQ, Università G. d’Annunzio Chieti-Pescara, Chieti, Italy
Stefano Antonio Gattone
Department of Economics and Business, University of Catania, Catania, Italy
Roberto Di Mari

Corresponding author

Correspondence to Roberto Di Mari.

About this article

Cite this article

Rocci, R., Gattone, S.A. & Di Mari, R. A data driven equivariant approach to constrained Gaussian mixture modeling. Adv Data Anal Classif 12, 235–260 (2018). https://doi.org/10.1007/s11634-016-0279-1

Received: 14 April 2016
Revised: 18 November 2016
Accepted: 26 December 2016
Published: 06 January 2017
Issue Date: June 2018
DOI: https://doi.org/10.1007/s11634-016-0279-1
