Abstract
Data clustering is a popular data mining technique for discovering the structure of a data set. However, the usefulness of the results depends on the nature of the cluster prototypes generated by the clustering technique. Some clustering algorithms only label the data points, so the prototype of a cluster is the full set of data points belonging to it. Other techniques produce a single 'abstract' data point as the model for the whole cluster, losing information about its shape, size, and structure. This paper proposes an on-line cluster prototype generation mechanism for the Gravitational Clustering algorithm. The idea is to use the dynamics of the gravitational system and the inherent hierarchical property of the gravitational algorithm to determine summarized cluster prototypes at the same time as the algorithm finds the clusters. In this way, a cluster is represented by several different 'abstract' data points, which allows the algorithm to find an appropriate representation of the clusters it finds. The performance of the proposed mechanism is evaluated experimentally on two types of synthetic data sets: data sets with Gaussian clusters and data sets with non-parametric clusters. Our results show that the proposed mechanism is able to deal with noise, finds the appropriate number of clusters, and finds an appropriate set of cluster prototypes.
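The contrast between a single 'abstract' point and several prototypes per cluster can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the ring-shaped data, the function names, and in particular the angular-sector grouping, which is a simple stand-in and not the gravitational mechanism proposed in the paper.

```python
import math


def ring_points(n=360, radius=1.0):
    """Return n evenly spaced points on a circle (a non-Gaussian cluster)."""
    return [
        (radius * math.cos(2 * math.pi * i / n),
         radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def mean_point(points):
    """Arithmetic mean of a list of 2-D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def sector_prototypes(points, k=8):
    """Hypothetical stand-in for multi-prototype generation:
    one mean per angular sector around the origin."""
    sectors = [[] for _ in range(k)]
    for x, y in points:
        angle = math.atan2(y, x) % (2 * math.pi)
        # Clamp guards against a rounding edge case at angle == 2*pi.
        idx = min(int(angle / (2 * math.pi) * k), k - 1)
        sectors[idx].append((x, y))
    return [mean_point(s) for s in sectors if s]


def nearest_distance(p, points):
    """Distance from p to the closest data point."""
    return min(math.dist(p, q) for q in points)


points = ring_points()

# Single-prototype summary: the centroid of the whole cluster.
single = mean_point(points)
single_gap = nearest_distance(single, points)

# Multi-prototype summary: several 'abstract' points for the same cluster.
multi = sector_prototypes(points)
multi_gap = max(nearest_distance(p, points) for p in multi)

print(f"single prototype, distance to nearest data point: {single_gap:.3f}")
print(f"{len(multi)} prototypes, worst distance to nearest data point: {multi_gap:.3f}")
```

For the ring, the single centroid falls in the empty middle, about one radius away from every data point, whereas each sector mean stays close to the ring. This is the loss of shape information that a multi-prototype representation is meant to avoid.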
Keywords
- clustering
- gravitational
- hierarchical
- prototype
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
León, E., Gómez, J., Giraldo, F. (2012). Online Cluster Prototype Generation for the Gravitational Clustering Algorithm. In: Pavón, J., Duque-Méndez, N.D., Fuentes-Fernández, R. (eds) Advances in Artificial Intelligence – IBERAMIA 2012. IBERAMIA 2012. Lecture Notes in Computer Science, vol 7637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34654-5_7
DOI: https://doi.org/10.1007/978-3-642-34654-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34653-8
Online ISBN: 978-3-642-34654-5
eBook Packages: Computer Science, Computer Science (R0)
