A Study on the Hierarchical Data Clustering Algorithm Based on Gravity Theory

Oyang, Yen-Jen; Chen, Chien-Yu; Yang, Tsui-Wei

doi:10.1007/3-540-44794-6_29


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2168)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery


Abstract

This paper discusses the clustering quality and the complexities of the hierarchical data clustering algorithm based on gravity theory. The gravity-based clustering algorithm simulates how N given nodes in a K-dimensional continuous vector space cluster under the gravity force, provided that each node is associated with a mass. One of the main issues studied in this paper is how the order of the distance term in the denominator of the gravity force formula affects clustering quality. The study reveals that, among the hierarchical clustering algorithms invoked for comparison, only the gravity-based algorithm with a high order of the distance term neither has a bias towards spherical clusters nor suffers from the well-known chaining effect. Since bias towards spherical clusters and the chaining effect are two major problems with respect to clustering quality, eliminating both implies that high clustering quality is achieved. With respect to time and space complexity, the gravity-based algorithm has either lower time complexity or lower space complexity than the most well-known hierarchical data clustering algorithms, with the exception of single-link.
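The role of the distance order can be illustrated with a small sketch. The abstract does not state the force formula, so the code below assumes the generic form force = g * m1 * m2 / distance**order; the function names, the constant `g`, and the sample data are all hypothetical and are not taken from the paper. The sketch is not the paper's clustering algorithm. It only shows that raising the order shifts the strongest attraction from the heaviest pair to the closest pair.

```python
import math


def gravity_force(m1, m2, p1, p2, order=2, g=1.0):
    """Magnitude of an assumed generalized gravity force between two nodes.

    Assumed form (not quoted from the paper):
        force = g * m1 * m2 / distance**order
    """
    distance = math.dist(p1, p2)
    if distance == 0.0:
        raise ValueError("coincident nodes: the force is undefined")
    return g * m1 * m2 / distance ** order


def strongest_pair(points, masses, order=2):
    """Return the index pair (i, j), i < j, with the largest mutual force."""
    best = None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            force = gravity_force(masses[i], masses[j],
                                  points[i], points[j], order)
            if best is None or force > best[0]:
                best = (force, i, j)
    return best[1], best[2]


# Hypothetical data: two light nodes 1 unit apart, two heavy nodes 2 units apart.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
masses = [1.0, 1.0, 10.0, 10.0]

low_order_pair = strongest_pair(points, masses, order=2)   # heavy pair wins
high_order_pair = strongest_pair(points, masses, order=8)  # close pair wins
print(low_order_pair, high_order_pair)
```

With order 2 the heavy pair attracts with force 100 / 2**2 = 25, against 1 for the light pair. With order 8 the heavy pair drops to 100 / 2**8, about 0.39, so the light but closer pair dominates.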


Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Yen-Jen Oyang, Chien-Yu Chen & Tsui-Wei Yang


Editor information

Editors and Affiliations

Department of Computer Science, Albert-Ludwigs University Freiburg, Georges Köhler-Allee, Geb. 079, 79110, Freiburg, Germany
Luc De Raedt
Inst. of Information and Computing Sciences, Dept. of Mathematics and Computer Science, University of Utrecht, Padualaan 14, de Uithof, 3508 TB Utrecht, The Netherlands
Arno Siebes


About this paper

Cite this paper

Oyang, YJ., Chen, CY., Yang, TW. (2001). A Study on the Hierarchical Data Clustering Algorithm Based on Gravity Theory. In: De Raedt, L., Siebes, A. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2001. Lecture Notes in Computer Science, vol 2168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44794-6_29


DOI: https://doi.org/10.1007/3-540-44794-6_29
Published: 28 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42534-2
Online ISBN: 978-3-540-44794-8
eBook Packages: Springer Book Archive
