Clustering Similarity Comparison Using Density Profiles

Bae, Eric; Bailey, James; Dong, Guozhu

doi:10.1007/11941439_38

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4304)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

The unsupervised nature of cluster analysis means that objects can be clustered in many ways, so different clustering algorithms can generate vastly different results. Clustering comparison methods have traditionally been used to quantify the degree of similarity between such alternative clusterings. However, existing techniques use only point memberships to calculate the similarity, which can lead to unintuitive results. They also cannot be applied to clusterings that only partially share points, as can happen in stream clustering. In this paper we introduce a new measure named ADCO, which takes into account density profiles for each attribute and aims to address these problems. We provide experiments demonstrating that this new measure can often provide a more reasonable similarity comparison between different clusterings than existing methods.
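
The following is a minimal sketch of what a density-profile comparison could look like. It is not the definition of ADCO from the paper. The function names (shared_edges, density_profile, profile_similarity), the equal-width binning, the dot-product score, the search over cluster permutations, and the normalisation are all assumptions made for illustration. It also assumes both clusterings have the same number of clusters, labelled 0 to k-1.

```python
from itertools import permutations

import numpy as np


def shared_edges(datasets, n_bins):
    """Equal-width bin edges per attribute, spanning all given point sets."""
    stacked = np.vstack(datasets)
    return [np.linspace(stacked[:, a].min(), stacked[:, a].max(), n_bins + 1)
            for a in range(stacked.shape[1])]


def density_profile(points, labels, n_clusters, edges):
    """Per cluster, count how many member points fall in each attribute bin."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    profile = []
    for k in range(n_clusters):
        members = points[labels == k]
        counts = [np.histogram(members[:, a], bins=edges[a])[0]
                  for a in range(points.shape[1])]
        profile.append(np.concatenate(counts))
    return np.array(profile)  # shape: (n_clusters, total number of bins)


def profile_similarity(p1, p2):
    """Best one-to-one cluster matching of two profiles, scaled to [0, 1]."""
    # Brute force over permutations: only sensible for a handful of clusters.
    best = max(sum(float(p1[i] @ p2[j]) for i, j in enumerate(perm))
               for perm in permutations(range(len(p2))))
    norm = max(float((p1 * p1).sum()), float((p2 * p2).sum()))
    return best / norm if norm > 0 else 0.0


# Hypothetical toy data: four 2-D points, two clusters.
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 0.8]]
edges = shared_edges([X], n_bins=2)
p_a = density_profile(X, [0, 0, 1, 1], 2, edges)
p_b = density_profile(X, [1, 1, 0, 0], 2, edges)  # same grouping, relabelled
p_c = density_profile(X, [0, 1, 0, 1], 2, edges)  # a different grouping
sim_ab = profile_similarity(p_a, p_b)
sim_ac = profile_similarity(p_a, p_c)
```

Because the profiles are built from bin counts and not from point identities, density_profile can be called on two different point sets as long as both use the same edges. That is how a sketch like this can accommodate clusterings that only partially share points. In the toy data, the relabelled clustering scores 1.0 against the original and the different grouping scores 0.5.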

Author information

Authors and Affiliations

NICTA Victoria Laboratory, Department of Computer Science and Software Engineering, University of Melbourne, Australia
Eric Bae & James Bailey
Department of Computer Science and Engineering, Wright State University, USA
Guozhu Dong

Editor information

Editors and Affiliations

DisPRR, National ICT Australia Ltd, QLD, Australia
Abdul Sattar
School of Computing, University of Tasmania, Sandy Bay, 7005, Tasmania, Australia
Byeong-ho Kang

About this paper

Cite this paper

Bae, E., Bailey, J., Dong, G. (2006). Clustering Similarity Comparison Using Density Profiles. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_38

DOI: https://doi.org/10.1007/11941439_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)
