Parallel frequent itemsets mining using distributed graphic processing units

Multimedia Tools and Applications

Abstract

Data mining is an essential technique in knowledge discovery and is widely used for pattern extraction and information classification. Association rule mining (ARM) is an important data mining technique for data analysis that extracts useful rules and knowledge from the relationships and associations within the data. Extracting frequent patterns and association rules requires several scans of the dataset, which makes the process time-consuming. Discovering frequent patterns is the major phase of the ARM process and is very expensive in terms of execution time. Powerful parallel systems with multiple graphics processing units (GPUs) and general-purpose graphics processing units (GPGPUs) are appropriate choices for reducing the execution time. Although GPU architectures can speed up the mining process, a single GPU usually cannot handle the large amount of data needed to extract frequent patterns. It is therefore necessary to use multiple GPUs in one system, or to distribute them across a network, to improve the efficiency of parallelization. In this paper, a new framework called GPApbmp is proposed that parallelizes the Apriori algorithm, a well-known level-wise frequent pattern mining method, across multiple GPUs for faster extraction of association rules. The proposed framework distributes the dataset over multiple GPUs and uses a vertical approach to reduce both the execution time and the number of database scans of the Apriori method. Experimental results on standard datasets show that the proposed method reduces the execution time and speeds up the mining process. The results were obtained with two and four parallelized NVIDIA GeForce 710 GPUs programmed in CUDA.
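
To make the idea of a vertical, partitioned Apriori concrete, the following is a minimal CPU-only sketch in Python. It is not the GPApbmp implementation, whose details are not given in the abstract. It assumes that each partition stands in for one GPU, that each item's transaction list is stored as a bitmap, and that the support of a candidate is the sum of per-partition bit counts. All function names (`build_vertical`, `partition_support`, `apriori_vertical`) and the `num_partitions` parameter are hypothetical.

```python
from itertools import combinations


def build_vertical(transactions):
    """Vertical layout: map each item to an int whose bit t is set
    when transaction t of this partition contains the item."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def partition_support(bitmaps, itemset):
    """Support of a non-empty itemset inside one partition:
    AND the item bitmaps together and count the set bits."""
    acc = bitmaps.get(itemset[0], 0)
    for item in itemset[1:]:
        acc &= bitmaps.get(item, 0)
    return bin(acc).count("1")


def apriori_vertical(transactions, min_support, num_partitions=2):
    """Level-wise Apriori over a partitioned vertical layout.
    Each partition is a stand-in for one GPU (assumption of this sketch)."""
    parts = [transactions[i::num_partitions] for i in range(num_partitions)]
    # The horizontal data is read once here; later levels use bitmaps only.
    verticals = [build_vertical(p) for p in parts]

    def support(itemset):
        # Global support is the sum of the partial supports.
        return sum(partition_support(v, itemset) for v in verticals)

    items = sorted({item for t in transactions for item in t})
    level = [(i,) for i in items if support((i,)) >= min_support]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        prev = set(level)
        candidates = set()
        # Join step: merge k-itemsets that share their first k-1 items.
        for a in level:
            for b in level:
                if a[:-1] == b[:-1] and a[-1] < b[-1]:
                    cand = a + (b[-1],)
                    # Prune step: every k-subset must itself be frequent.
                    if all(s in prev for s in combinations(cand, len(cand) - 1)):
                        candidates.add(cand)
        level = sorted(c for c in candidates if support(c) >= min_support)
    return frequent


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in apriori_vertical(data, min_support=3).items():
        print(itemset, count)
```

In a real multi-GPU setting, the per-partition AND and bit count would presumably run as kernels on each device, with only the partial counts exchanged. The sketch keeps everything on the CPU so that the level-wise logic stays visible.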

Author information

Corresponding author

Correspondence to Mohammad Karim Sohrabi.

Ethics declarations

Funding

There is no funding for this paper.

Conflict of interest

The authors declare that they have no conflicts of interest.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zoraghchian, A.A., Sohrabi, M.K. & Yaghmaee, F. Parallel frequent itemsets mining using distributed graphic processing units. Multimed Tools Appl 81, 43873–43895 (2022). https://doi.org/10.1007/s11042-022-13225-z

  • DOI: https://doi.org/10.1007/s11042-022-13225-z
