HPCC 2007: High Performance Computing and Communications, pp 97-107
An Adaptive Parallel Hierarchical Clustering Algorithm
Abstract
Clustering of data has numerous applications and has been studied extensively. It is particularly important in bioinformatics and data mining. Although many parallel algorithms have been designed, most of them use the CRCW-PRAM or CREW-PRAM models of computing. This paper proposes a deterministic parallel EREW algorithm for hierarchical clustering. Based on algorithms for the complete graph and the Euclidean minimum spanning tree, the proposed algorithms can cluster \(n\) objects with \(O(p)\) processors in \(O(n^2/p)\) time, where \(1 \le p \le \frac{n}{\log n}\). Performance comparisons show that our algorithm is the first that is both free of memory conflicts and adaptive.
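To illustrate the link between hierarchical clustering and the Euclidean minimum spanning tree mentioned in the abstract, here is a minimal sequential sketch. It is not the paper's parallel EREW algorithm: it assumes the single-linkage criterion (the abstract does not state the linkage), builds the spanning tree with Prim's algorithm on the complete graph in \(O(n^2)\) time, and the names `euclidean_mst` and `single_linkage` are hypothetical.

```python
import math

def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph; O(n^2) time.

    Returns a list of (distance, u, v) tree edges. Sequential sketch only.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n    # cheapest known distance from each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        # Relax distances of the remaining vertices against the new tree vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def single_linkage(points, k):
    """Form k clusters by keeping only the n - k shortest MST edges."""
    n = len(points)
    root = list(range(n))    # union-find parent array

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]   # path halving
            x = root[x]
        return x

    for _, u, v in sorted(euclidean_mst(points))[: max(n - k, 0)]:
        root[find(u)] = find(v)
    return [find(i) for i in range(n)]
```

Calling `single_linkage(points, k)` returns one representative index per point, and points that share a representative belong to the same cluster. Under these assumptions the sequential cost is \(O(n^2)\), which matches the processor-time product \(p \cdot O(n^2/p)\) stated in the abstract.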