Asynchronous and Anticipatory Filter-Stream Based Parallel Algorithm for Frequent Itemset Mining

Veloso, Adriano; Meira, Wagner; Ferreira, Renato; Neto, Dorgival Guedes; Parthasarathy, Srinivasan

doi:10.1007/978-3-540-30116-5_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3202)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery

Abstract

In this paper we propose a novel parallel algorithm for frequent itemset mining. The algorithm is based on the filter-stream programming model, in which the mining process is represented as a data flow driven by a series of producer and consumer components (called filters) that communicate through streams. When the production rate matches the consumption rate and the communication overhead between producer and consumer filters is minimized, a high degree of asynchrony is achieved. Following this strategy, our algorithm employs asynchronous candidate generation and minimizes communication between filters by transferring only the necessary aggregated information. The algorithm also uses a look-forward approach that accelerates frequent itemset determination. An extensive evaluation demonstrates the parallel performance and scalability of our algorithm.
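
As a rough illustration of the filter-stream idea described in the abstract, the following Python sketch connects several producer filters to one consumer filter through a shared stream. It is not the authors' algorithm: the names `count_filter`, `merge_filter`, and `frequent_itemsets`, the use of threads with `queue.Queue` as the stream, and the fixed itemset size `k` are all assumptions made for illustration, and the look-forward step is not modeled. Each producer sends only its aggregated local counts, and the consumer merges them in whatever order they arrive.

```python
import queue
import threading
from collections import Counter
from itertools import combinations

_DONE = object()  # hypothetical sentinel marking the end of one producer's output


def count_filter(partition, k, out_stream):
    """Producer filter: count k-itemsets in one data partition.

    Only the aggregated counts are written to the stream, not the transactions.
    """
    local = Counter()
    for transaction in partition:
        for itemset in combinations(sorted(set(transaction)), k):
            local[itemset] += 1
    out_stream.put(local)
    out_stream.put(_DONE)


def merge_filter(in_stream, n_producers, min_support, result):
    """Consumer filter: merge partial counts as they arrive, in any order."""
    total = Counter()
    finished = 0
    while finished < n_producers:
        message = in_stream.get()
        if message is _DONE:
            finished += 1
        else:
            total.update(message)  # Counter.update adds counts
    result.update({s: c for s, c in total.items() if c >= min_support})


def frequent_itemsets(partitions, k, min_support):
    """Run one producer per partition and a single merging consumer."""
    stream = queue.Queue()
    result = {}
    consumer = threading.Thread(
        target=merge_filter,
        args=(stream, len(partitions), min_support, result),
    )
    producers = [
        threading.Thread(target=count_filter, args=(p, k, stream))
        for p in partitions
    ]
    consumer.start()
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    consumer.join()
    return result
```

With two small partitions, `frequent_itemsets([[["a", "b", "c"], ["a", "b"]], [["a", "c"], ["b", "c"], ["a", "b"]]], 2, 3)` keeps only the pair `("a", "b")`, which appears in three transactions overall. The consumer never waits for a particular producer, which is the asynchrony the sketch is meant to show.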

This work has been partially supported by CNPq-Brazil and by CNPq / CT-INFO / PTACS.

Author information

Authors and Affiliations

Computer Science Department, Universidade Federal de Minas Gerais, Brazil
Adriano Veloso, Wagner Meira Jr., Renato Ferreira & Dorgival Guedes Neto
Department of Computer and Information Science, The Ohio State University, USA
Srinivasan Parthasarathy

Editor information

Editors and Affiliations

INSA-Lyon, LIRIS CNRS UMR5205, F-69621, Villeurbanne, France
Jean-François Boulicaut
Dipartimento di Informatica, Università degli Studi di Bari,
Floriana Esposito
Pisa KDD Laboratory, ISTI - CNR, Area della Ricerca di Pisa, Via Giuseppe Moruzzi 1, Pisa, Italy
Fosca Giannotti
Dipartimento di Informatica, Via F. Buonarroti 2, 56127, Pisa, Italy
Dino Pedreschi

About this paper

Cite this paper

Veloso, A., Meira, W., Ferreira, R., Neto, D.G., Parthasarathy, S. (2004). Asynchronous and Anticipatory Filter-Stream Based Parallel Algorithm for Frequent Itemset Mining. In: Boulicaut, JF., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Knowledge Discovery in Databases: PKDD 2004. Lecture Notes in Computer Science, vol 3202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30116-5_39

DOI: https://doi.org/10.1007/978-3-540-30116-5_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23108-0
Online ISBN: 978-3-540-30116-5
eBook Packages: Springer Book Archive
