Information Retrieval

, Volume 12, Issue 4, pp 461–486 | Cite as

A comparison of extrinsic clustering evaluation metrics based on formal constraints

  • Enrique Amigó
  • Julio Gonzalo
  • Javier Artiles
  • Felisa Verdejo
Article

Abstract

There is a wide set of evaluation metrics available to compare the quality of text clustering algorithms. In this article, we define a few intuitive formal constraints on such metrics which shed light on which aspects of the quality of a clustering are captured by different metric families. These formal constraints are validated in an experiment involving human assessments, and compared with other constraints proposed in the literature. Our analysis of a wide range of metrics shows that only BCubed satisfies all formal constraints. We also extend the analysis to the problem of overlapping clustering, where items can simultaneously belong to more than one cluster. As Bcubed cannot be directly applied to this task, we propose a modified version of Bcubed that avoids the problems found with other metrics.

Keywords

Clustering Evaluation metrics Formal constraints 

References

  1. Artiles, J., Gonzalo, J., & Sekine, S. (2007). The Semeval-2007 Weps evaluation: Establishing a benchmark for the web people search task. In Proceedings of the 4th International Workshop on Semantic Evaluations (Semeval-2007), June 23–24 (pp. 64–69). Prague.Google Scholar
  2. Bagga, A., & Baldwin, B. (1998). Entity-based cross-document coreferencing using the vector space model. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics (COLING-ACL’98) (pp. 79–85). Montreal.Google Scholar
  3. Bakus, J., Hussin, M. F., & Kamel, M. (2002). A SOM-based document clustering using phrases. In Proceedings of the 9th International Conference on Neural Information Procesing (ICONIP’02) (pp. 2212–2216). Singapore.Google Scholar
  4. Dom, B. (2001). An information-theoretic external cluster-validity measure. IBM Research Report.Google Scholar
  5. Ghosh, J. (2003). Scalable clustering methods for data mining. In N. Ye (Ed.), Handbook of data mining. NJ: Lawrence Erlbaum.Google Scholar
  6. Gonzalo, J., & Peters, C. (2005). The impact of evaluation on multilingual text retrieval. In Proceedings of SIGIR 2005 (pp. 603–604). Salvador de Bahia.Google Scholar
  7. Halkidi, M., Batistakis, Y., & Vazirgiannis, M. (2001). On clustering validation techniques. Journal of Intelligent Information Systems, 17(2–3), 107–145.MATHCrossRefGoogle Scholar
  8. Larsen, B., & Aone, C. (1999). Fast and effective text mining using linear-time document clustering. In Knowledge Discovery and Data Mining (pp. 16–22). San Diego, CA.Google Scholar
  9. Meila, M. (2003). Comparing clusterings. In Proceedings of COLT 03. Washington, DC.Google Scholar
  10. Pantel, P., & Lin, D. (2002). Efficiently clustering documents with committees. In Proceedings of the PRICAI 2002 7th Pacific Rim International Conference on Artificial Intelligence (pp. 18–22). Tokyo, Japan.Google Scholar
  11. Rosenberg, A., & Hirschberg, J. (2007). V-measure: A conditional entropy-based external cluster evaluation measure. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) (pp. 410–420). Prague.Google Scholar
  12. Steinbach, M., Karypis, G., & Kumar, V. (2000). A comparison of document clustering techniques, KDD 2000 (pp. 109–110). Boston, MA.Google Scholar
  13. Strehl, A. (2002). Relationship-based clustering and cluster ensembles for high-dimensional data mining. PhD thesis, The University of Texas at Austin.Google Scholar
  14. Van Rijsbergen, C. (1974). Foundation of evaluation. Journal of Documentation, 30(4), 365–373.CrossRefGoogle Scholar
  15. Xu, W., Liu, X., & Gong, Y. (2003). Document clustering based on non-negative matrix factorization. In SIGIR ’03: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 267–273). NY: ACM Press.Google Scholar
  16. Zhao, Y., & Karypis, G. (2001). Criterion functions for document clustering: Experiments and analysis. Technical Report TR 01-40. Department of Computer Science, University of Minnesota, Minneapolis, MN.Google Scholar

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Enrique Amigó
    • 1
  • Julio Gonzalo
    • 1
  • Javier Artiles
    • 1
  • Felisa Verdejo
    • 1
  1. 1.Departamento de Lenguajes y Sistemas InformáticosUNEDMadridSpain

Personalised recommendations