Abstract
K-Means is by far the most widely used method for discovering clusters in data. It performs well on data with compact, hyperspherical distributions but tends to fail on data organized in more complex, unknown shapes. In this paper, we analyze in detail the characteristic properties of data clustering and propose a novel dissimilarity measure, named the density-sensitive distance metric, which can describe the distribution characteristics of clustered data. Using this dissimilarity measure, we give a density-sensitive K-Means clustering algorithm that, unlike the original K-Means algorithm, can identify complex non-convex clusters. Experimental results on both artificial data sets and real-world problems support the validity of the algorithm.
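The limitation of the original K-Means on non-convex shapes can be illustrated with a small sketch. The code below is not the algorithm proposed in the paper: it is a plain Lloyd-style K-Means with Euclidean distance, run on two concentric rings. The data set, the helper names `ring` and `kmeans`, and the choice of initial centres are all assumptions made for illustration.

```python
import math


def ring(radius, n):
    """Return n evenly spaced points on a circle centred at the origin."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]


def kmeans(points, init_centers, max_iter=100):
    """Plain Lloyd iteration using squared Euclidean distance (illustrative)."""
    centers = list(init_centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_labels = [
            min(range(len(centers)),
                key=lambda k: (p[0] - centers[k][0]) ** 2
                              + (p[1] - centers[k][1]) ** 2)
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centre if a cluster becomes empty
                centers[k] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return labels, centers


# Hypothetical toy data: an inner ring (radius 1) and an outer ring (radius 4).
inner = ring(1.0, 40)
outer = ring(4.0, 40)
points = inner + outer

# Start with one centre on each ring, which is favourable to K-Means.
labels, centers = kmeans(points, [inner[0], outer[0]])

inner_labels = set(labels[:40])
outer_labels = set(labels[40:])
print("labels found on the inner ring:", inner_labels)
print("labels found on the outer ring:", outer_labels)
```

Even with one initial centre on each ring, the outer ring ends up divided between both clusters, because a Euclidean nearest-centre rule can only separate the plane with a straight boundary. A dissimilarity measure that follows the density of the data, as the abstract proposes, is intended to address this case.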
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, L., Bo, L., Jiao, L. (2006). A Modified K-Means Clustering with a Density-Sensitive Distance Metric. In: Wang, GY., Peters, J.F., Skowron, A., Yao, Y. (eds) Rough Sets and Knowledge Technology. RSKT 2006. Lecture Notes in Computer Science, vol 4062. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11795131_79
DOI: https://doi.org/10.1007/11795131_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36297-5
Online ISBN: 978-3-540-36299-9
eBook Packages: Computer Science, Computer Science (R0)