Skip to main content

Scalable Clustering Using Graphics Processors

  • Conference paper
Book cover Advances in Web-Age Information Management (WAIM 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4016))

Included in the following conference series:

Abstract

We present new algorithms for scalable clustering using graphics processors. Our basic approach is based on k-means. By changing the order of determining object labels, and exploiting the high computational power and pipeline of graphics processing units (GPUs) for distance computing and comparison, we speed up the k-means algorithm substantially. We introduce two strategies for retrieving data from the GPU, taking into account the low bandwidth from the GPU back to the main memory. We also extend our GPU-based approach to data stream clustering. We implement our algorithms in a PC with a Pentium IV 3.4G CPU and a NVIDIA GeForce 6800 GT graphics card. Our comprehensive performance study shows that the common GPU in desktop computers could be an efficient co-processor of CPU in traditional and data stream clustering.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: A framework for clustering evolving data streams. In: Proc. of VLDB (2003)

    Google Scholar 

  2. Babcock, B., Datar, M., Motwani, R., O’Callaghan, L.: Maintaining variance and k-medians over data stream windows. In: Proc. of PODS (2003)

    Google Scholar 

  3. Baciu, G., Wong, S., Sun, H.: Recode: An image-based collision detection algorithm. Visualization and Computer Animation 10(4), 181–192 (1999)

    Article  Google Scholar 

  4. Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm fordiscovering clusters in large spatial databases with noise. In: Proc. of KDD (1996)

    Google Scholar 

  5. Govindaraju, N.K., Lloyd, B., Wang, W., Lin, M., et al.: Fast computation of database operations using graphics processors. In: Proc. of SIGMOD (2004)

    Google Scholar 

  6. Govindaraju, N.K., Raghuvanshi, N., Manocha, D.: Fast and approximate stream mining of quantiles and frequencies using graphics processors. In: Proc. Of SIGMOD (2005)

    Google Scholar 

  7. Guha, S., Meyerson, A., Mishra, N., Motwani, R., O’Callaghan, L.: Clustering data streams:theory and practice. In: IEEE TKDE, pp. 515–528 (2003)

    Google Scholar 

  8. Guha, S., Rastogi, R., Shim, K.: Cure: An efficient clustering algorithm for large databases. In: Proc. of SIGMOD, pp. 73–84 (1998)

    Google Scholar 

  9. Hall, J.D., Hart, J.C.: Gpu acceleration of iterative clustering. In: Proc. Of SIGGRAPH poster (2004)

    Google Scholar 

  10. Hoff III, K.E., Keyser, J., Lin, M., Manocha, D., Culver, T.: Fast computation of generalized voronoi diagrams using graphics hardware. In: Proc. of SIGGRAPH, pp. 277–286 (1999)

    Google Scholar 

  11. Jain, A., Dubes, R.: Algorithms for clustering data. New Jersey (1998)

    Google Scholar 

  12. Larsen, E.S., McAllister, D.K.: Fast matrix multiplies using graphics hardware. In: Proc. of IEEE Supercomputing (2001)

    Google Scholar 

  13. Sun, C., Agrawal, D., Abbadi, A.E.: Hardware acceleration for spatial selections and joins. In: Proc. of SIGMOD, pp. 455–466 (2003)

    Google Scholar 

  14. Venkatasubramanian, S.: The graphics card as a stream computer. In: SIGMOD Workshop on Management and Processing of Data Streams (2003)

    Google Scholar 

  15. Thompson, C.J., Hahn, S., Oskin, M.: Using modern graphics architectures for general-purpose computing: A framework and analysis. In: Proc. of IEEE/ACM International Symposium on Microarchitectures, pp. 306–317 (2002)

    Google Scholar 

  16. Zhang, T., Ramakrishnan, R., Livny, M.: Birch: An efficient data clustering method for very large databases. In: Proc. of SIGMOD, pp. 103–114 (1996)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cao, F., Tung, A.K.H., Zhou, A. (2006). Scalable Clustering Using Graphics Processors. In: Yu, J.X., Kitsuregawa, M., Leong, H.V. (eds) Advances in Web-Age Information Management. WAIM 2006. Lecture Notes in Computer Science, vol 4016. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11775300_32

Download citation

  • DOI: https://doi.org/10.1007/11775300_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35225-9

  • Online ISBN: 978-3-540-35226-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics