Parallel data mining techniques on Graphics Processing Unit with Compute Unified Device Architecture (CUDA)

The Journal of Supercomputing

Abstract

Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high-performance computing for general-purpose applications. The Compute Unified Device Architecture (CUDA) programming model provides programmers with C-like APIs for exploiting the parallel power of the GPU. Data mining is widely used and has significant applications in various domains. However, current data mining toolkits are not fast enough for applications with large-scale databases. In this paper, we propose three techniques to speed up fundamental problems in data mining algorithms on the CUDA platform: a scalable thread scheduling scheme for irregular patterns, a parallel distributed top-k scheme, and a parallel high-dimension reduction scheme. They play a key role in our CUDA-based implementations of three representative data mining algorithms: CU-Apriori, CU-KNN, and CU-K-means. These parallel implementations significantly outperform other state-of-the-art implementations on an HP xw8600 workstation with a Tesla C1060 GPU and a quad-core Intel Xeon CPU. Our results show that the GPU + CUDA parallel architecture is feasible and promising for data mining applications.
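The abstract names a parallel distributed top-k scheme without describing it. As a rough illustration of the general idea only, and not the authors' CUDA implementation, the sketch below splits a list of values (for example, distances in a k-nearest-neighbor search) into chunks, selects the k smallest from each chunk independently, much as separate GPU thread blocks might, and then merges the partial results. The function name `distributed_top_k` and the `num_chunks` parameter are hypothetical and chosen for illustration.

```python
import heapq
from typing import List, Sequence


def distributed_top_k(values: Sequence[float], k: int, num_chunks: int) -> List[float]:
    """Return the k smallest values via per-chunk top-k followed by a merge.

    Illustrative sketch only: a sequential stand-in for a scheme in which
    each chunk would be handled by its own group of parallel workers.
    """
    if k <= 0 or num_chunks <= 0:
        raise ValueError("k and num_chunks must be positive")
    # Split the input into roughly equal contiguous chunks (ceiling division).
    chunk_size = max(1, -(-len(values) // num_chunks))
    partial: List[float] = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        # Chunks are independent, so this step could run in parallel.
        partial.extend(heapq.nsmallest(k, chunk))
    # Final merge: the global k smallest are always among the local winners.
    return heapq.nsmallest(k, partial)
```

The merge step only has to examine at most k candidates per chunk, which is why local selection followed by a small final reduction is a common pattern for top-k on parallel hardware.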

Author information

Corresponding author

Correspondence to Ying Liu.

About this article

Cite this article

Jian, L., Wang, C., Liu, Y. et al. Parallel data mining techniques on Graphics Processing Unit with Compute Unified Device Architecture (CUDA). J Supercomput 64, 942–967 (2013). https://doi.org/10.1007/s11227-011-0672-7

  • DOI: https://doi.org/10.1007/s11227-011-0672-7
