High performance evaluation of evolutionary-mined association rules on GPUs

Abstract

Association rule mining is a well-known data mining task, but it requires considerable computational time and memory when mining large-scale data sets of high dimensionality. This is mainly due to the evaluation process, in which the antecedent and consequent of each mined rule are evaluated against every record. This paper presents a novel methodology for evaluating association rules on graphics processing units (GPUs). The evaluation model may be applied to any association rule mining algorithm. The use of GPUs and the compute unified device architecture (CUDA) programming model enables the mined rules to be evaluated in a massively parallel way, thus reducing the computational time required. The proposal takes advantage of concurrent kernel execution and asynchronous data transfers, which improve the efficiency of the model. In an experimental study, we evaluate interpreter performance and compare the execution time of the proposed model across single-threaded, multi-threaded, and GPU implementations. The results obtained show an interpreter performance above 67 billion giga operations per second, and a speed-up by a factor of up to 454 over the single-threaded CPU model when using two NVIDIA GTX 480 GPUs. The evaluation model demonstrates its efficiency and scalability with respect to problem complexity and the number of instances, rules, and GPU devices.
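
To make the cost of the evaluation process concrete, the following is a minimal sequential sketch in Python, not the paper's CUDA implementation. It assumes a rule is given as two predicates over a record and that support and confidence are the quality measures of interest; the function name `evaluate_rule` and the sample attributes `age` and `income` are hypothetical. Each record is checked independently of the others, which is the property that would allow the per-record checks to be distributed across GPU threads.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
Predicate = Callable[[Record], bool]


def evaluate_rule(records: List[Record],
                  antecedent: Predicate,
                  consequent: Predicate) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent.

    Hypothetical helper for illustration: both sides of the rule are
    evaluated once per record, so the cost grows with the number of records.
    """
    n = len(records)
    count_a = 0    # records covered by the antecedent
    count_ac = 0   # records covered by both antecedent and consequent
    for record in records:  # each iteration is independent of the others
        if antecedent(record):
            count_a += 1
            if consequent(record):
                count_ac += 1
    support = count_ac / n if n else 0.0
    confidence = count_ac / count_a if count_a else 0.0
    return support, confidence


# Illustrative data only; the attribute names and values are made up.
sample_records = [
    {"age": 25, "income": 30},
    {"age": 45, "income": 80},
    {"age": 50, "income": 40},
    {"age": 30, "income": 90},
]

support, confidence = evaluate_rule(
    sample_records,
    antecedent=lambda r: r["age"] >= 40,
    consequent=lambda r: r["income"] >= 50,
)
print(support, confidence)
```

With many rules and many records, this loop runs once per rule over the whole data set, which is why the abstract identifies evaluation as the dominant cost.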

Acknowledgements

This work was supported by the Regional Government of Andalusia and the Ministry of Science and Technology, projects P08-TIC-3720 and TIN-2011-22408, and FEDER funds. This research was also supported by the Spanish Ministry of Education under FPU grants AP2010-0042 and AP2010-0041.

Author information

Correspondence to Sebastián Ventura.

About this article

Cite this article

Cano, A., Luna, J.M. & Ventura, S. High performance evaluation of evolutionary-mined association rules on GPUs. J Supercomput 66, 1438–1461 (2013). https://doi.org/10.1007/s11227-013-0937-4

Keywords

  • Performance evaluation
  • Association rules
  • Parallel computing
  • GPU