Advances in Data Analysis and Classification, Volume 4, Issue 2, pp 111–135

A simulation study to compare robust clustering methods based on mixtures

Regular Article

DOI: 10.1007/s11634-010-0065-4

Cite this article as:
Coretto, P. & Hennig, C. Adv Data Anal Classif (2010) 4: 111. doi:10.1007/s11634-010-0065-4


The following mixture model-based clustering methods are compared in a simulation study with one-dimensional data, fixed number of clusters and a focus on outliers and uniform “noise”: an ML-estimator (MLE) for Gaussian mixtures, an MLE for a mixture of Gaussians and a uniform distribution (interpreted as “noise component” to catch outliers), an MLE for a mixture of Gaussian distributions where a uniform distribution over the range of the data is fixed (Fraley and Raftery in Comput J 41:578–588, 1998), a pseudo-MLE for a Gaussian mixture with improper fixed constant over the real line to catch “noise” (RIMLE; Hennig in Ann Stat 32(4): 1313–1340, 2004), and MLEs for mixtures of t-distributions with and without estimation of the degrees of freedom (McLachlan and Peel in Stat Comput 10(4):339–348, 2000). The RIMLE (using a method to choose the fixed constant first proposed in Coretto, The noise component in model-based clustering. Ph.D thesis, Department of Statistical Science, University College London, 2008) is the best method in some, and acceptable in all, simulation setups, and can therefore be recommended.
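To make the idea of a fixed improper noise constant more concrete, the following is a minimal sketch of EM-type iterations for a one-dimensional mixture of Gaussians plus a constant "noise" density. It is not the authors' implementation: the function names, the constant `c`, the variance floor `min_sigma`, the starting values and the toy data are all assumptions made for illustration, and the method for choosing the constant referred to in the abstract is not reproduced here.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a univariate Gaussian at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_noise_mixture(data, start_means, c, n_iter=200, min_sigma=0.05):
    """EM-type iterations for k Gaussians plus a constant noise density c.

    c is held fixed at a value supplied by the caller (an assumption here).
    min_sigma is an illustrative floor that keeps a component from
    collapsing onto a single point.
    """
    k = len(start_means)
    n = len(data)
    means = list(start_means)
    sigmas = [1.0] * k
    # weights[0] belongs to the noise component, weights[1:] to the Gaussians
    weights = [1.0 / (k + 1)] * (k + 1)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for every point
        resp = []
        for x in data:
            parts = [weights[0] * c]
            parts += [weights[j + 1] * normal_pdf(x, means[j], sigmas[j])
                      for j in range(k)]
            total = sum(parts)
            resp.append([p / total for p in parts])
        # M-step: update proportions, then each Gaussian's mean and scale
        weights = [sum(r[j] for r in resp) / n for j in range(k + 1)]
        for j in range(k):
            mass = sum(r[j + 1] for r in resp)
            if mass <= 1e-12:
                continue  # component is empty; leave its parameters as they are
            means[j] = sum(r[j + 1] * x for r, x in zip(resp, data)) / mass
            var = sum(r[j + 1] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / mass
            sigmas[j] = max(math.sqrt(var), min_sigma)
    # resp holds the memberships computed in the final E-step
    return weights, means, sigmas, resp


# Toy data (hypothetical): two Gaussian clusters plus three gross outliers.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100)]
data += [random.gauss(8.0, 1.0) for _ in range(100)]
data += [30.0, -25.0, 45.0]

weights, means, sigmas, resp = fit_noise_mixture(data, [1.0, 7.0], c=0.01)
```

In this sketch a point is attributed to the noise component when the constant term `weights[0] * c` dominates every weighted Gaussian density at that point, which is what lets gross outliers be absorbed without distorting the estimated cluster means. How large `c` should be is exactly the tuning question the abstract mentions, so the value `0.01` above should be read as a placeholder only.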


Model-based clustering · Gaussian mixture · Mixture of t-distributions · Noise component

Mathematics Subject Classification (2000)


Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. Dipartimento di Scienze Economiche e Statistiche, Università degli Studi di Salerno, Fisciano, Italy
  2. Department of Statistical Sciences, University College London, London, UK