Abstract
Association rule mining is a well-known data mining task, but it requires considerable computational time and memory when mining large-scale data sets of high dimensionality. This is mainly due to the evaluation process, in which the antecedent and consequent of each mined rule are evaluated against every record. This paper presents a novel methodology for evaluating association rules on graphics processing units (GPUs). The evaluation model may be applied to any association rule mining algorithm. The use of GPUs and the compute unified device architecture (CUDA) programming model enables the mined rules to be evaluated in a massively parallel way, thus reducing the computational time required. The proposal takes advantage of concurrent kernel execution and asynchronous data transfers, which improve the efficiency of the model. In an experimental study, we evaluate interpreter performance and compare the execution time of the proposed model against single-threaded, multi-threaded, and GPU implementations. The results show an interpreter performance above 67 billion operations per second and a speed-up by a factor of up to 454 over the single-threaded CPU model when using two NVIDIA GTX 480 GPUs. The evaluation model demonstrates its efficiency and scalability with respect to problem complexity and the number of instances, rules, and GPU devices.
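To make the cost of the evaluation process concrete, the following is a minimal sequential sketch in Python of evaluating one rule against every record. It is not the GPU model described above; the function name `evaluate_rule`, the predicate representation, and the toy records are assumptions introduced only for illustration. Support and confidence are used as the rule quality measures, following their standard definitions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a record is a dict of attribute values,
# and each side of a rule is a boolean predicate over a record.
Record = Dict[str, float]
Predicate = Callable[[Record], bool]


def evaluate_rule(antecedent: Predicate, consequent: Predicate,
                  records: List[Record]) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n_antecedent = 0  # records covered by the antecedent
    n_both = 0        # records covered by both antecedent and consequent
    # Each iteration is independent of the others, which is what makes the
    # evaluation a candidate for data-parallel execution (assumption: a
    # parallel version could evaluate records concurrently and then reduce
    # the two counters).
    for record in records:
        a = antecedent(record)
        c = consequent(record)
        n_antecedent += int(a)
        n_both += int(a and c)
    support = n_both / len(records) if records else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy data set (illustrative values only).
records = [
    {"age": 25, "income": 30},
    {"age": 40, "income": 80},
    {"age": 52, "income": 95},
    {"age": 61, "income": 20},
]

# Rule: IF age >= 40 THEN income >= 50
support, confidence = evaluate_rule(
    lambda r: r["age"] >= 40,
    lambda r: r["income"] >= 50,
    records,
)
print(support, confidence)
```

The work grows with the number of rules multiplied by the number of records, so repeating this loop for every candidate rule quickly dominates the run time on large data sets.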
Acknowledgements
This work was supported by the Regional Government of Andalusia and the Ministry of Science and Technology, projects P08-TIC-3720 and TIN-2011-22408, and FEDER funds. This research was also supported by the Spanish Ministry of Education under FPU grants AP2010-0042 and AP2010-0041.
Cite this article
Cano, A., Luna, J.M. & Ventura, S. High performance evaluation of evolutionary-mined association rules on GPUs. J Supercomput 66, 1438–1461 (2013). https://doi.org/10.1007/s11227-013-0937-4
DOI: https://doi.org/10.1007/s11227-013-0937-4