The Journal of Supercomputing, Volume 66, Issue 3, pp 1438–1461

High performance evaluation of evolutionary-mined association rules on GPUs

  • Alberto Cano
  • José María Luna
  • Sebastián Ventura
Article

Abstract

Association rule mining is a well-known data mining task, but it requires considerable computational time and memory when applied to large, high-dimensional data sets. The main cost lies in the evaluation process, in which the antecedent and consequent of every mined rule are evaluated against every record. This paper presents a novel methodology for evaluating association rules on graphics processing units (GPUs). The evaluation model can be applied to any association rule mining algorithm. Using GPUs and the compute unified device architecture (CUDA) programming model, the mined rules are evaluated in a massively parallel way, which reduces the computational time required. The proposal takes advantage of concurrent kernel execution and asynchronous data transfers, which improve the efficiency of the model. In an experimental study, we evaluate interpreter performance and compare the execution time of the proposed model across single-threaded, multi-threaded, and GPU implementations. The results show an interpreter performance above 67 billion operations per second and a speed-up of up to 454 times over the single-threaded CPU model when using two NVIDIA GTX 480 GPUs. The evaluation model proves efficient and scalable with respect to problem complexity and the number of instances, rules, and GPU devices.
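To make the evaluation step concrete, the following is a minimal sequential sketch in Python, not the authors' GPU implementation. It checks the antecedent and consequent of one rule against every record and derives support and confidence, two common rule quality measures that are assumed here for illustration. The function name `evaluate_rule`, the record layout, and the sample data are all hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical record layout: attribute name -> numeric value.
Record = Dict[str, float]
Condition = Callable[[Record], bool]


def evaluate_rule(antecedent: Condition,
                  consequent: Condition,
                  records: Sequence[Record]) -> Dict[str, float]:
    """Evaluate one rule 'IF antecedent THEN consequent' over all records."""
    n = len(records)
    # Each record is checked independently of every other record.
    covers_a = [antecedent(r) for r in records]
    covers_c = [consequent(r) for r in records]

    # Reduce the per-record results into counts.
    n_a = sum(covers_a)
    n_ac = sum(a and c for a, c in zip(covers_a, covers_c))

    # Assumed measures: support = P(A and C), confidence = P(C | A).
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    return {"support": support, "confidence": confidence}


# Made-up data, for illustration only.
records = [
    {"age": 25, "income": 30},
    {"age": 40, "income": 60},
    {"age": 45, "income": 80},
    {"age": 50, "income": 20},
]

# Hypothetical rule: IF age > 35 THEN income > 50.
result = evaluate_rule(lambda r: r["age"] > 35,
                       lambda r: r["income"] > 50,
                       records)
print(result)
```

Because each per-record check is independent, the two list comprehensions are the part that lends itself to one-thread-per-record execution, and the two sums correspond to a parallel reduction. How the paper actually maps this work onto CUDA kernels is not specified in the abstract.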

Keywords

Performance evaluation; Association rules; Parallel computing; GPU

Notes

Acknowledgements

This work was supported by the Regional Government of Andalusia and the Ministry of Science and Technology, projects P08-TIC-3720 and TIN-2011-22408, and FEDER funds. This research was also supported by the Spanish Ministry of Education under FPU grants AP2010-0042 and AP2010-0041.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Alberto Cano (1)
  • José María Luna (1)
  • Sebastián Ventura (1)
  1. Department of Computer Science and Numerical Analysis, University of Cordoba, Cordoba, Spain
