Abstract
Kernel-based clustering methods allow unsupervised partitioning of samples in feature space, but their computation time is quadratic, O(n²), where n is the number of samples. These methods are therefore generally unsuitable for large datasets. In this paper we propose a meta-algorithm that performs parallelized clusterings of subsets of the samples and merges them repeatedly. The algorithm can use many kernel-based clustering methods; we mainly focus on Kernel Fuzzy C-Means and Relational Neural Gas. We show that the computation time of this algorithm is basically linear, i.e. O(n). Further, we statistically evaluate the performance of this meta-algorithm on a real-life dataset, namely the Enron Emails.
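The contrast between quadratic and roughly linear cost can be illustrated with a simple counting sketch. It is not the paper's analysis: it assumes the samples are split into patches of a fixed size, counts only the pairwise kernel evaluations inside each patch, and ignores the cost of the merge steps. The function names and the patch size of 500 are hypothetical choices made for this illustration.

```python
def full_kernel_evals(n):
    """Pairwise kernel evaluations for one clustering over all n samples."""
    return n * n


def patch_kernel_evals(n, patch_size):
    """Pairwise kernel evaluations when n samples are split into patches.

    Assumption of this sketch: each patch is clustered on its own and the
    merge steps are not counted.
    """
    full_patches, remainder = divmod(n, patch_size)
    return full_patches * patch_size ** 2 + remainder ** 2


if __name__ == "__main__":
    # Hypothetical patch size; any fixed value gives the same growth pattern.
    patch_size = 500
    for n in (1_000, 10_000, 100_000):
        print(n, full_kernel_evals(n), patch_kernel_evals(n, patch_size))
```

With a fixed patch size m, the patched count grows as roughly n·m, so doubling n doubles the work, whereas the full count quadruples.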
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Faußer, S., Schwenker, F. (2010). Parallelized Kernel Patch Clustering. In: Schwenker, F., El Gayar, N. (eds) Artificial Neural Networks in Pattern Recognition. ANNPR 2010. Lecture Notes in Computer Science, vol 5998. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12159-3_12
DOI: https://doi.org/10.1007/978-3-642-12159-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12158-6
Online ISBN: 978-3-642-12159-3
eBook Packages: Computer Science (R0)