A fast and distributed algorithm for mining frequent patterns in congested networks

Lin, Kawuu W.; Chung, Sheng-Hao; Lin, Chun-Cheng

doi:10.1007/s00607-015-0457-6

Published: 30 April 2015

Computing, Volume 98, pages 235–256 (2016)

Kawuu W. Lin¹,
Sheng-Hao Chung² &
Chun-Cheng Lin²

Abstract

With advances in technology, frequent pattern mining has come to be widely used in daily life. It yields interesting or useful information that supports decision making and judgment. For example, marketplace managers mine transaction data to improve services, understand customer buying habits, and determine a placement scheme for goods that increases profits; the technique is also applied in medicine and biotechnology. However, data is now generated so rapidly that it raises Big Data problems. Many researchers have therefore studied distributed, parallel, and cloud computing technologies to find the most suitable among them. Mining with multiple computing nodes, however, requires transmitting a considerable amount of data over the network. The available bandwidth is limited when many different tasks transmit at the same time and many servers operate in the same network segment. This results in poor transmission and severe transfer delays, whether internal or external to the network. We therefore propose the fast and distributed mining algorithm for discovering frequent patterns in congested networks (FDMCN), which is based on CARM. Its main purpose is to reduce FP-tree transmission so that computing nodes need only a portion of the information for mining. Empirical evaluation under various simulation conditions shows that FDMCN delivers excellent performance in execution efficiency and scalability compared with the PSWS algorithm.
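The abstract's idea that a computing node needs only a portion of the FP-tree can be illustrated with the standard FP-growth notion of a conditional pattern base. The sketch below is not the FDMCN algorithm, whose details are not given in this abstract; it is a minimal illustration, assuming a plain in-memory FP-tree, and the names `Node`, `build_fp_tree`, and `conditional_pattern_base` are hypothetical helpers introduced here.

```python
from collections import Counter, defaultdict


class Node:
    """One FP-tree node: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree and a header table mapping item -> list of its nodes."""
    # Count in how many transactions each item appears.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for t in transactions:
        # Keep frequent items only, ordered by descending support
        # (ties broken by item name so the tree is deterministic).
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(header, item):
    """Return the prefix paths (with counts) that lead to `item`.

    In a distributed setting, this is the kind of partial information
    a node could receive instead of the whole tree (an assumption made
    for illustration, not a description of FDMCN).
    """
    base = []
    for node in header.get(item, []):
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        base.append((list(reversed(path)), node.count))
    return base


# Hypothetical toy data, not taken from the paper.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"],
                ["b", "c"], ["a", "b", "c", "d"]]
root, header = build_fp_tree(transactions, min_support=2)
print(conditional_pattern_base(header, "c"))
```

On this toy input, the node responsible for item `c` would need only three short prefix paths rather than the full tree, which is the sort of saving in transmitted data that the abstract points to.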

Acknowledgments

Part of this work was supported by the Ministry of Science and Technology of Taiwan, R.O.C., under grant No. 103-2221-E-151-033-.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan, ROC
Kawuu W. Lin
Department of Industrial Engineering and Management, National Chiao Tung University, Hsinchu, Taiwan, ROC
Sheng-Hao Chung & Chun-Cheng Lin

Corresponding author

Correspondence to Kawuu W. Lin.

About this article

Cite this article

Lin, K.W., Chung, SH. & Lin, CC. A fast and distributed algorithm for mining frequent patterns in congested networks. Computing 98, 235–256 (2016). https://doi.org/10.1007/s00607-015-0457-6

Received: 01 October 2014
Accepted: 07 April 2015
Published: 30 April 2015
Issue Date: March 2016
DOI: https://doi.org/10.1007/s00607-015-0457-6
