A Robust Methodology for Comparing Performances of Clustering Validity Criteria

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5249)

Abstract

Many clustering validity measures exist and are useful in practice as quantitative criteria for evaluating the quality of data partitions. However, choosing a specific measure is difficult for a user faced with such a variety of possibilities. The present paper introduces an alternative, robust methodology for comparing clustering validity measures, designed specifically to get around some conceptual flaws of the comparison paradigm traditionally adopted in the literature. An illustrative example is presented that compares the performance of four well-known validity measures over a collection of 7776 data partitions of 324 different data sets.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vendramin, L., Campello, R.J.G.B., Hruschka, E.R. (2008). A Robust Methodology for Comparing Performances of Clustering Validity Criteria. In: Zaverucha, G., da Costa, A.L. (eds) Advances in Artificial Intelligence - SBIA 2008. SBIA 2008. Lecture Notes in Computer Science, vol 5249. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88190-2_29

  • DOI: https://doi.org/10.1007/978-3-540-88190-2_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88189-6

  • Online ISBN: 978-3-540-88190-2

  • eBook Packages: Computer Science, Computer Science (R0)
