Statistics and Computing, Volume 20, Issue 1, pp 9–22

Constrained monotone EM algorithms for mixtures of multivariate t distributions

  • F. Greselin
  • S. Ingrassia
Article

Abstract

Mixtures of multivariate t distributions provide a robust parametric extension of normal mixtures for fitting data. In the presence of a noise component, potential outliers, or data with longer-than-normal tails, the model can be broadened by considering t distributions. In this framework, the degrees of freedom act as a robustness parameter, tuning the heaviness of the tails and downweighting the effect of outliers on parameter estimation. The aim of this paper is to extend to mixtures of multivariate elliptical distributions some theoretical results about likelihood maximization on constrained parameter spaces. Further, a constrained monotone algorithm implementing maximum likelihood mixture decomposition of multivariate t distributions is proposed, to achieve improved convergence capabilities and robustness. Monte Carlo numerical simulations and a real data study illustrate the better performance of the algorithm in comparison with earlier proposals.
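
To make the downweighting role of the degrees of freedom concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the weight form commonly used in EM for t distributions, u = (ν + p) / (ν + d²), where p is the data dimension and d² is the squared Mahalanobis distance of an observation from a component's location. The function name `t_weight` and the sample distances are hypothetical.

```python
def t_weight(d2, nu, p):
    """Weight given to one observation when updating a t component.

    Assumed form (standard in EM for t distributions, not quoted
    from the abstract): u = (nu + p) / (nu + d2).
    d2: squared Mahalanobis distance from the component location
    nu: degrees of freedom (small nu means heavier tails)
    p:  dimension of the data
    """
    return (nu + p) / (nu + d2)


# Hypothetical squared distances: a central point, a typical point, an outlier.
d2_values = [0.0, 1.0, 100.0]

# Small nu: the outlier receives a much smaller weight than the other points.
heavy_tails = [t_weight(d2, nu=3.0, p=2) for d2 in d2_values]

# Very large nu: all weights approach 1, as in a normal mixture.
near_normal = [t_weight(d2, nu=1e6, p=2) for d2 in d2_values]
```

With a small ν the weight falls quickly as d² grows, so distant observations contribute little to the location and scale updates; as ν grows the weights flatten toward 1 and the robustness effect disappears.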

Keywords

Finite mixture models, EM algorithm, t Distribution, Clustering

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Dipartimento di Metodi Quantitativi per le Scienze Economiche e Aziendali, Università di Milano Bicocca, Milano, Italy
  2. Dipartimento di Economia e Metodi Quantitativi, Università di Catania, Catania, Italy
