Clustering with EM: Complex Models vs. Robust Estimation

  • C. Saint-Jean
  • C. Frélicot
  • B. Vachon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1876)


Clustering multivariate data that are contaminated by noise is a complex issue, particularly in the framework of mixture model estimation, because noisy data can significantly affect the parameter estimates. This paper addresses this problem with respect to likelihood maximization using the Expectation-Maximization (EM) algorithm. Two different approaches are compared. The first defines mixture models that account for noise. The second is based on robust estimation of the model parameters in the maximization step of EM. Both have been tested separately, then jointly. Finally, a hybrid model is proposed. Results on artificial data are given and discussed.
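To make the second approach concrete, the following is a minimal sketch of EM for a Gaussian mixture in which the maximization step down-weights points that lie far from a component. It is an illustration only and is not taken from the paper: the Huber-type weight function, the threshold `c = 2.5`, the small ridge added to each covariance, and the function names `huber_weights`, `gaussian_pdf` and `robust_em` are all assumptions chosen for this example.

```python
import numpy as np

def huber_weights(d, c=2.5):
    """Huber-type weights: 1 up to the threshold c, then c/d beyond it.
    The threshold value is an assumption made for this sketch."""
    d = np.maximum(d, 1e-12)  # avoid division by zero
    return np.minimum(1.0, c / d)

def gaussian_pdf(X, mean, cov):
    """Return the normal density and the Mahalanobis distance of each row of X."""
    p = X.shape[1]
    diff = X - mean
    sol = np.linalg.solve(cov, diff.T).T
    maha2 = np.maximum(np.sum(diff * sol, axis=1), 0.0)
    norm = np.sqrt((2.0 * np.pi) ** p * np.linalg.det(cov))
    return np.exp(-0.5 * maha2) / norm, np.sqrt(maha2)

def robust_em(X, init_means, n_iter=50, c=2.5):
    """EM for a Gaussian mixture with a Huber-weighted M-step (illustrative)."""
    n, p = X.shape
    means = np.array(init_means, dtype=float)
    k = len(means)
    # Start every component from the overall sample covariance.
    covs = np.array([np.atleast_2d(np.cov(X.T)) for _ in range(k)])
    props = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        dens = np.empty((n, k))
        dist = np.empty((n, k))
        for j in range(k):
            dens[:, j], dist[:, j] = gaussian_pdf(X, means[j], covs[j])
        joint = dens * props
        resp = joint / np.maximum(joint.sum(axis=1, keepdims=True), 1e-300)
        # M-step: posteriors are multiplied by robustness weights, so that
        # distant points contribute less to the mean and covariance.
        props = resp.mean(axis=0)
        for j in range(k):
            w = resp[:, j] * huber_weights(dist[:, j], c)
            means[j] = w @ X / w.sum()
            diff = X - means[j]
            covs[j] = (w[:, None] * diff).T @ diff / w.sum()
            covs[j] += 1e-6 * np.eye(p)  # small ridge keeps the matrix invertible
    return props, means, covs
```

Setting `c` to a very large value makes every weight equal to 1, which recovers the standard (non-robust) EM updates and gives a simple baseline for comparison.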


Keywords: Clustering, Expectation-Maximization, Robustness, M-estimation



Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • C. Saint-Jean (1)
  • C. Frélicot (1)
  • B. Vachon (1)
  1. L3I - UPRES EA, La Rochelle Cedex 1
