Accelerating frequent itemset mining on graphics processing units

Zhang, Fan; Zhang, Yan; Bakos, Jason D.

doi:10.1007/s11227-013-0887-x

Published: 02 February 2013

Volume 66, pages 94–117 (2013)
The Journal of Supercomputing

Fan Zhang¹,
Yan Zhang¹ &
Jason D. Bakos¹

Abstract

In this paper we describe a new parallel Frequent Itemset Mining algorithm called “Frontier Expansion.” The implementation is optimized for high performance on a heterogeneous platform consisting of a shared-memory multiprocessor and multiple Graphics Processing Unit (GPU) coprocessors. Frontier Expansion is an improved data-parallel algorithm derived from the Equivalence Class Clustering (Eclat) method, in which a partial breadth-first search is used to expose as much parallelism as the available memory capacity allows. In our approach, the vertical transaction lists are stored as bitsets and processed with wide bitwise operations across multiple threads on a GPU. We evaluate our approach using four NVIDIA Tesla GPUs and observe a 6–30× speedup relative to state-of-the-art sequential Eclat and FPGrowth implementations executed on a multicore CPU.
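
The two ideas named in the abstract, bitset transaction lists and a memory-bounded partial breadth-first search, can be illustrated with a small CPU-only sketch. This is not the authors' implementation: the function names, the `batch_size` parameter, and the batching policy are assumptions made for illustration, and Python integers stand in for the wide bitsets that the paper processes with GPU threads. Each item maps to an integer whose bit t is set when transaction t contains the item, so the support of an itemset is the population count of the AND of its items' bitsets.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_bitsets(transactions: Iterable[Set[str]]) -> Dict[str, int]:
    """Vertical layout: bit t of bitsets[item] is set if transaction t has item."""
    bitsets: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitsets[item] = bitsets.get(item, 0) | (1 << tid)
    return bitsets


def popcount(bits: int) -> int:
    """Number of set bits, i.e. the support of the itemset behind `bits`."""
    return bin(bits).count("1")


def mine_frequent_itemsets(
    transactions: List[Set[str]], min_support: int, batch_size: int = 4
) -> Dict[Tuple[str, ...], int]:
    """Return {itemset: support} for all itemsets with support >= min_support.

    Hypothetical scheduling: at most `batch_size` (a positive integer) pending
    itemsets are expanded per step. Work inside a batch is breadth-first (and
    would be the data-parallel part on a GPU); batches are taken from the end
    of the pending list, which keeps the search depth-first overall and so
    limits how many bitsets are alive at once.
    """
    bitsets = build_bitsets(transactions)
    frequent_items = sorted(
        item for item, bits in bitsets.items() if popcount(bits) >= min_support
    )
    pending = [((item,), bitsets[item]) for item in frequent_items]
    result: Dict[Tuple[str, ...], int] = {}

    while pending:
        batch = pending[-batch_size:]
        del pending[-batch_size:]
        for itemset, bits in batch:
            result[itemset] = popcount(bits)
            # Extend only with larger items so each itemset is generated once.
            for item in frequent_items:
                if item <= itemset[-1]:
                    continue
                joined = bits & bitsets[item]  # intersect transaction lists
                if popcount(joined) >= min_support:
                    pending.append((itemset + (item,), joined))
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(mine_frequent_itemsets(data, 3).items()):
        print(itemset, count)
```

Changing `batch_size` alters only the order of exploration and the peak number of pending bitsets, not the set of itemsets returned; in this sketch that parameter is where the trade-off between parallelism and memory capacity would be tuned.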

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant Nos. CCF-0844951 and CCF-091560.

Author information

Authors and Affiliations

University of South Carolina, Columbia, USA
Fan Zhang, Yan Zhang & Jason D. Bakos

Corresponding author

Correspondence to Jason D. Bakos.

About this article

Cite this article

Zhang, F., Zhang, Y. & Bakos, J.D. Accelerating frequent itemset mining on graphics processing units. J Supercomput 66, 94–117 (2013). https://doi.org/10.1007/s11227-013-0887-x

Issue Date: October 2013
