A Novel Parallel Algorithm for Frequent Itemsets Mining in Large Transactional Databases
With the explosive growth of data, mining large transactional databases has become increasingly important. Among the many data mining techniques, association rule mining is one of the most important and best studied. Frequent itemset mining is a fundamental but time-consuming step in association rule mining. Most algorithms in the literature search for frequent itemsets among items that meet a minimum support threshold (minsup), and their intermediate results are not reused for subsequent mining. To reduce execution time, several parallel algorithms for mining frequent itemsets have been proposed. However, these algorithms merely parallelize the Apriori and FP-Growth algorithms. To address this limitation, this paper proposes several parallel NPA-FI algorithms as a new approach for quickly detecting frequent itemsets in large transactional databases, using an array that records, for each kernel item, the items that co-occur with it and the items that occur with it in at least one transaction. The parallel NPA-FI algorithms can be readily deployed on distributed platforms such as Hadoop and Spark. Experimental results show that the proposed algorithms outperform existing algorithms.
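To make the terms above concrete, the following is a minimal, sequential sketch of support counting against minsup and of a per-kernel-item array. It is not the NPA-FI algorithm itself: the names `kernel_array`, `always`, `sometimes`, and `frequent_itemsets` are hypothetical, the reading of "co-occurrence" as "present in every transaction that contains the kernel item" is an assumption, and the enumeration is brute force for illustration only.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item in `itemset`."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= t)


def kernel_array(transactions):
    """Build one entry per item (the 'kernel' item).

    Assumed layout, not taken from the paper:
      sup       - number of transactions containing the kernel item
      always    - items present in every transaction with the kernel item
      sometimes - items present in at least one transaction with it
    """
    array = {}
    for t in transactions:
        for kernel in t:
            others = t - {kernel}
            entry = array.get(kernel)
            if entry is None:
                array[kernel] = {
                    "sup": 1,
                    "always": set(others),
                    "sometimes": set(others),
                }
            else:
                entry["sup"] += 1
                entry["always"] &= others
                entry["sometimes"] |= others
    return array


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for all itemsets with support >= minsup.

    Brute-force enumeration per kernel item; exponential in the worst case.
    """
    result = {}
    for kernel, entry in kernel_array(transactions).items():
        if entry["sup"] < minsup:
            continue  # an infrequent kernel cannot be part of a frequent itemset
        candidates = sorted(entry["sometimes"])
        for size in range(len(candidates) + 1):
            for combo in combinations(candidates, size):
                itemset = frozenset((kernel,) + combo)
                if itemset in result:
                    continue  # already found via another kernel item
                sup = support(itemset, transactions)
                if sup >= minsup:
                    result[itemset] = sup
    return result


# Hypothetical toy database: each transaction is a set of item labels.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
frequent = frequent_itemsets(transactions, minsup=3)
```

Because each kernel item's entry can be processed without reference to the others, the outer loop in `frequent_itemsets` is the natural unit to distribute across workers, which is one plausible reading of why such an array suits parallel execution.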
Keywords: Association rules; Co-occurrence items; Frequent itemsets; Parallel algorithm
This work was supported by University of Social Sciences and Humanities; University of Science, VNU-HCM, Vietnam.