An Efficient Map-Reduce Framework to Mine Periodic Frequent Patterns
Periodic-frequent patterns (PFPs) are an important class of regularities that exist in a transactional database. In the literature, pattern-growth-based approaches to mine PFPs have been proposed for a single machine. In this paper, we propose a Map-Reduce framework to mine PFPs across multiple machines. We propose a parallel algorithm that includes a step for distributing transactional identifiers among the machines. Further, we introduce the notion of a partition summary to reduce the amount of data shuffled among the machines. Experiments on Apache Spark’s distributed environment show that the proposed approach speeds up as the number of machines increases and that the partition summary significantly reduces the amount of data shuffled among the machines.
Keywords: Data mining, Periodic frequent pattern mining, Map-Reduce
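The abstract does not define the partition summary, so the following is only a sketch of one plausible reading, not the paper's algorithm. It assumes the usual PFP measures (support is the number of transactions containing a pattern; periodicity is the largest gap between consecutive occurrences, counting the gaps from 0 to the first occurrence and from the last occurrence to the database size). It also assumes each machine holds a contiguous, non-overlapping range of transaction identifiers. Under those assumptions, a machine can send four numbers per pattern instead of its full identifier list. The names `PartitionSummary`, `summarize`, and `merge` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PartitionSummary:
    """Hypothetical compact summary of one pattern's tids in one partition."""
    first_tid: int  # smallest tid in the partition
    last_tid: int   # largest tid in the partition
    max_gap: int    # largest gap between consecutive tids inside the partition
    count: int      # number of tids, i.e. this partition's share of the support


def summarize(tids: Iterable[int]) -> PartitionSummary:
    """Collapse a partition's tid list into four integers."""
    ordered = sorted(tids)
    if not ordered:
        raise ValueError("a partition summary needs at least one tid")
    max_gap = max((b - a for a, b in zip(ordered, ordered[1:])), default=0)
    return PartitionSummary(ordered[0], ordered[-1], max_gap, len(ordered))


def merge(summaries: Iterable[PartitionSummary], db_size: int) -> Tuple[int, int]:
    """Combine per-partition summaries into (support, periodicity).

    Assumes the partitions cover disjoint, contiguous tid ranges, so the only
    gaps missing from the summaries are the ones between adjacent partitions.
    """
    ordered = sorted(summaries, key=lambda s: s.first_tid)
    if not ordered:
        raise ValueError("at least one summary is required")
    support = sum(s.count for s in ordered)
    periodicity = ordered[0].first_tid  # gap from 0 to the first occurrence
    prev_last = None
    for s in ordered:
        if prev_last is not None:
            # gap that straddles two neighbouring partitions
            periodicity = max(periodicity, s.first_tid - prev_last)
        periodicity = max(periodicity, s.max_gap)
        prev_last = s.last_tid
    # gap from the last occurrence to the end of the database
    periodicity = max(periodicity, db_size - prev_last)
    return support, periodicity


def is_periodic_frequent(support: int, periodicity: int,
                         min_sup: int, max_per: int) -> bool:
    """A pattern qualifies when it is frequent enough and never absent too long."""
    return support >= min_sup and periodicity <= max_per
```

For example, a pattern occurring at identifiers 1, 3, 4 on one machine and 7, 9, 10 on another would be summarized separately with `summarize` on each machine, and `merge` would then recover the global support and periodicity from the two summaries alone.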
This research was partly supported by the program "Research and Development on Real World Big Data Integration and Analysis" of the Ministry of Education, Culture, Sports, Science and Technology, and RIKEN, Japan. We acknowledge K. Amulya for her contribution to the implementation of the idea.