A data driven equivariant approach to constrained Gaussian mixture modeling

  • Regular Article
Advances in Data Analysis and Classification

Abstract

Maximum likelihood estimation of Gaussian mixture models with different class-specific covariance matrices is known to be problematic. This is due to the unboundedness of the likelihood, together with the presence of spurious maximizers. Existing methods to bypass this obstacle rely on the fact that unboundedness is avoided if the eigenvalues of the covariance matrices are bounded away from zero. This can be achieved by imposing constraints on the covariance matrices, i.e. by incorporating a priori information on the covariance structure of the mixture components. The present work introduces a constrained approach in which the class-conditional covariance matrices are shrunk towards a pre-specified target matrix \(\boldsymbol{\Psi}\). Data-driven choices of the matrix \(\boldsymbol{\Psi}\), for use when a priori information is not available, and of the optimal amount of shrinkage are investigated. Constraints based on a data-driven \(\boldsymbol{\Psi}\) are then shown to be equivariant with respect to linear affine transformations, provided that the method used to select the target matrix is also equivariant. The effectiveness of the proposal is evaluated through a simulation study and an empirical example.
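
The unboundedness mentioned in the abstract can be illustrated with a minimal univariate sketch. This is not the constrained method of the paper: the data grid, the helper names (`mixture_loglik`, `degenerate_loglik`) and the fixed lower bound `sigma_floor` are assumptions chosen for illustration only. One component is centred exactly on an observation and its standard deviation is sent towards zero, so the log-likelihood grows without limit; bounding the standard deviation away from zero keeps it finite.

```python
import numpy as np

def mixture_loglik(x, weights, means, sigmas):
    """Log-likelihood of a univariate Gaussian mixture evaluated at data x."""
    x = np.asarray(x, dtype=float)[:, None]        # shape (n, 1)
    w = np.asarray(weights, dtype=float)[None, :]  # shape (1, G)
    mu = np.asarray(means, dtype=float)[None, :]
    sd = np.asarray(sigmas, dtype=float)[None, :]
    # Log of weight times component density, one column per component.
    log_comp = (np.log(w) - np.log(sd) - 0.5 * np.log(2.0 * np.pi)
                - 0.5 * ((x - mu) / sd) ** 2)
    # Log-sum-exp over components for numerical stability.
    m = log_comp.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))))

# Illustrative data (an assumption): an evenly spaced grid on [-2, 2].
x = np.linspace(-2.0, 2.0, 21)

def degenerate_loglik(sigma1):
    """Component 1 sits exactly on x[0]; component 2 is a standard normal."""
    return mixture_loglik(x, [0.5, 0.5], [x[0], 0.0], [sigma1, 1.0])

shrinking_sigmas = (1e-2, 1e-4, 1e-8, 1e-16)

# Unconstrained: the log-likelihood keeps growing as sigma1 shrinks.
unconstrained = [degenerate_loglik(s) for s in shrinking_sigmas]

# Constrained (hypothetical fixed bound): sigma1 may not fall below sigma_floor.
sigma_floor = 0.1
constrained = [degenerate_loglik(max(s, sigma_floor)) for s in shrinking_sigmas]

print("unconstrained:", unconstrained)
print("constrained:  ", constrained)
```

In the multivariate case the same role is played by the eigenvalues of the class-specific covariance matrices, which is why bounding them away from zero removes the divergence.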

Acknowledgements

The authors are grateful to the associate editor and the two anonymous referees for their useful comments, which have led to a considerable improvement of the paper.

Author information

Corresponding author

Correspondence to Roberto Di Mari.

About this article

Cite this article

Rocci, R., Gattone, S.A. & Di Mari, R. A data driven equivariant approach to constrained Gaussian mixture modeling. Adv Data Anal Classif 12, 235–260 (2018). https://doi.org/10.1007/s11634-016-0279-1

  • DOI: https://doi.org/10.1007/s11634-016-0279-1
