A Highly Parallel Algorithm for Frequent Itemset Mining

Mesa, Alejandro; Feregrino-Uribe, Claudia; Cumplido, René; Hernández-Palancar, José

doi:10.1007/978-3-642-15992-3_31

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6256)

Included in the following conference series:

Mexican Conference on Pattern Recognition

Abstract

Mining frequent itemsets in large databases is a widely used technique in data mining. Several sequential and parallel algorithms have been developed, but on high data volumes their execution takes more time and resources than expected. Finding ways to speed up these algorithms is therefore an active research topic. Previous attempts at acceleration using custom architectures have been limited because the algorithms were conceived sequentially and do not exploit the intrinsic parallelism that the hardware provides. The contribution of this paper is a highly parallel algorithm that uses a vertical bit vector (VBV) data layout and exploits its suitability for support counting. Our results show that, for dense databases, a custom architecture for this algorithm can be one order of magnitude faster than the fastest architecture reported in previous work.
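
As a rough illustration of why a vertical bit vector layout suits support counting, the following Python sketch stores one bit vector per item (bit t is set when transaction t contains the item) and counts the support of an itemset as the number of set bits in the AND of its item vectors. This is a generic software sketch of the VBV idea, not the authors' algorithm or hardware architecture; the function names `build_vbv` and `support` and the toy transactions are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List


def build_vbv(transactions: List[Iterable[str]]) -> Dict[str, int]:
    """Build a vertical bit vector per item (hypothetical helper).

    Each item maps to a Python int used as a bit vector: bit `tid` is 1
    when transaction number `tid` contains the item.
    """
    vbv: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vbv[item] = vbv.get(item, 0) | (1 << tid)
    return vbv


def support(vbv: Dict[str, int], itemset: Iterable[str], n_transactions: int) -> int:
    """Count the transactions that contain every item in `itemset`.

    The bit vectors of the items are ANDed together; the number of bits
    left set is the support. Each bit position is independent of the
    others, which is the property a parallel implementation can exploit.
    """
    mask = (1 << n_transactions) - 1  # start with every transaction selected
    for item in itemset:
        mask &= vbv.get(item, 0)  # an unknown item occurs in no transaction
    return bin(mask).count("1")


# Toy database, made up for this example.
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
VBV = build_vbv(TRANSACTIONS)
```

With this toy database, `support(VBV, {"a", "c"}, len(TRANSACTIONS))` counts the three transactions that contain both `a` and `c`. Because the AND and the bit count operate on every transaction position independently, the same computation maps naturally onto wide hardware datapaths.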

Author information

Authors and Affiliations

Advanced Technologies Application Center, CENATAV, La Habana, Cuba
Alejandro Mesa & José Hernández-Palancar
National Institute for Astrophysics, Optics and Electronics, INAOE, Puebla, México
Alejandro Mesa, Claudia Feregrino-Uribe & René Cumplido

Editor information

Editors and Affiliations

Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840, Puebla, Sta. Maria Tonantzintla, Mexico
José Francisco Martínez-Trinidad
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840, Puebla, Sta. Maria Tonantzintla, Mexico
Jesús Ariel Carrasco-Ochoa
Centre for Vision, Speech and Signal Processing, School of Electronics and Physical Sciences, University of Surrey, GU2 7XH, Guildford, Surrey, UK
Josef Kittler

Cite this paper

Mesa, A., Feregrino-Uribe, C., Cumplido, R., Hernández-Palancar, J. (2010). A Highly Parallel Algorithm for Frequent Itemset Mining. In: Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A., Kittler, J. (eds) Advances in Pattern Recognition. MCPR 2010. Lecture Notes in Computer Science, vol 6256. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15992-3_31

DOI: https://doi.org/10.1007/978-3-642-15992-3_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15991-6
Online ISBN: 978-3-642-15992-3
eBook Packages: Computer Science, Computer Science (R0)
