Abstract
This paper describes a parallel implementation of the k/h-means clustering algorithm, one of the basic algorithms used in a wide range of data mining tasks. We show how a database can be distributed and how the algorithm can be applied to the distributed database. Tests conducted on a network of 32 PCs showed nearly ideal speedup for large data sets.
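The idea of clustering over a distributed database can be illustrated with a minimal sketch. It is not the authors' k/h-means implementation, whose details are not given in this abstract. It is standard k-means (Lloyd's iteration) over a partitioned data set, assuming each worker holds one partition and reports only per-cluster sums and counts. The function names (`partition`, `local_step`, `kmeans_partitioned`) and the round-robin split are hypothetical, and the workers are simulated sequentially in one process.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def partition(data: Sequence[Point], n_workers: int) -> List[List[Point]]:
    """Split the data set into n_workers chunks (round-robin; an assumption)."""
    return [list(data[i::n_workers]) for i in range(n_workers)]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def local_step(chunk: Sequence[Point], centroids: Sequence[Point]):
    """Work of one hypothetical worker: per-cluster coordinate sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        j = nearest(point, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts


def kmeans_partitioned(data, centroids, n_workers=4, max_iter=100):
    """Standard k-means where only sums and counts cross partition boundaries."""
    chunks = partition(data, n_workers)
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # In a real deployment each call would run on a separate machine.
        results = [local_step(chunk, centroids) for chunk in chunks]
        new_centroids = []
        for j, old in enumerate(centroids):
            count = sum(counts[j] for _, counts in results)
            if count == 0:
                new_centroids.append(old)  # keep an empty cluster's centroid
                continue
            total = [sum(sums[j][d] for sums, _ in results) for d in range(len(old))]
            new_centroids.append(tuple(t / count for t in total))
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids


data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
result = kmeans_partitioned(data, [(0.0, 0.0), (10.0, 10.0)], n_workers=2)
print(result)
```

Because each worker returns only k sums and k counts per iteration, the data exchanged is independent of the partition size. That property is one plausible reason a scheme of this kind can scale well on large data sets, though the abstract does not state the mechanism behind the reported speedup.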
Keywords
- Execution Time
- Parallel Version
- Data Mining Task
- Distribute Computing Environment
- Machine Learn Database
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stoffel, K., Belkoniene, A. (1999). Parallel k/h-Means Clustering for Large Data Sets. In: Amestoy, P., et al. Euro-Par’99 Parallel Processing. Euro-Par 1999. Lecture Notes in Computer Science, vol 1685. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48311-X_205
DOI: https://doi.org/10.1007/3-540-48311-X_205
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66443-7
Online ISBN: 978-3-540-48311-3