Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing

Takizawa, Hiroyuki; Kobayashi, Hiroaki

doi:10.1007/s11227-006-8294-1


Published: June 2006

Volume 36, pages 219–234 (2006)

The Journal of Supercomputing

Hiroyuki Takizawa¹ &
Hiroaki Kobayashi²


Abstract

This paper presents an effective scheme for clustering a huge data set using a PC cluster system in which each PC is equipped with a commodity programmable graphics processing unit (GPU). The proposed scheme is devised to achieve three-level hierarchical parallel processing of massive data clustering. The divide-and-conquer approach to parallel data clustering is employed to perform coarse-grain parallel processing by multiple PCs with a message passing mechanism. Moreover, by taking advantage of the GPU’s parallel processing capability, the proposed scheme can exploit two types of fine-grain data parallelism at different levels in the nearest neighbor search, which is the most computationally intensive part of the data-clustering process. The performance of our scheme is discussed in comparison with that of an implementation running entirely on the CPU. Experimental results clearly show that the proposed hierarchical parallel processing can remarkably accelerate the data clustering task. In particular, GPU co-processing is quite effective in improving the computational efficiency of parallel data clustering on a PC cluster. Although data transfer from GPU to CPU is generally costly, the acceleration from GPU co-processing still significantly reduces the total execution time of data clustering.
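The three levels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it runs sequentially on the CPU in plain Python, and every function name (`squared_distance`, `nearest_neighbor`, `split`, `assign`) is hypothetical. The abstract does not say which two fine-grain levels the GPU exploits, so the mapping in the comments (across vector components and across cluster centers) is an assumption made only to mark where independent work exists.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(x: Vector, c: Vector) -> float:
    # Assumed fine-grain level 1: each component difference is independent,
    # so the terms could be evaluated in parallel before the final sum.
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def nearest_neighbor(x: Vector, centroids: Sequence[Vector]) -> int:
    # Assumed fine-grain level 2: the distance to each cluster center is
    # independent of the others and could be computed in parallel.
    distances = [squared_distance(x, c) for c in centroids]
    return min(range(len(centroids)), key=distances.__getitem__)


def split(data: Sequence[Vector], workers: int) -> List[List[Vector]]:
    # Coarse-grain level: divide the data set into one contiguous chunk
    # per worker (one PC of the cluster in the paper's setting).
    size = (len(data) + workers - 1) // workers
    return [list(data[i:i + size]) for i in range(0, len(data), size)]


def assign(data: Sequence[Vector], centroids: Sequence[Vector]) -> List[int]:
    # Nearest neighbor search over one chunk: label each input vector
    # with the index of its closest cluster center.
    return [nearest_neighbor(x, centroids) for x in data]


data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (4.9, 5.2)]
centroids = [(0.0, 0.0), (5.0, 5.0)]

# Each chunk is processed on its own; here the "workers" run one after another.
labels = [label for chunk in split(data, 2) for label in assign(chunk, centroids)]
print(labels)  # prints [0, 0, 1, 1]
```

In a real deployment the loop over chunks would be replaced by message passing between PCs, and the two inner loops by GPU kernels or shader passes; the sketch only shows that the work at each level is independent.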



Author information

Authors and Affiliations

Graduate School of Information Sciences, Tohoku University, Aoba, Aramaki-aza, Aoba-ku, Sendai, 980-8578, Japan
Hiroyuki Takizawa
Information Synergy Center, Tohoku University, Aoba, Aramaki-aza, Aoba-ku, Sendai, 980-8578, Japan
Hiroaki Kobayashi

Corresponding author

Correspondence to Hiroyuki Takizawa.

About this article

Cite this article

Takizawa, H., Kobayashi, H. Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing. J Supercomput 36, 219–234 (2006). https://doi.org/10.1007/s11227-006-8294-1

Issue Date: June 2006
DOI: https://doi.org/10.1007/s11227-006-8294-1
