Discovering Frequent Patterns in Very Large Transactional Databases

Jose M. Luna

doi:10.1007/978-981-16-3964-7_2

Abstract

Finding frequent patterns in very large transactional databases is a challenging problem of great concern in many real-world applications. In this chapter, we first introduce the model of frequent patterns. Second, we describe the search space for finding the desired patterns. Third, we present four popular algorithms to find the patterns. Finally, we present the extensions of frequent patterns.

Notes

1. The set enumeration tree is a high-performance data representation technique that resembles a depth-first search on the itemset lattice.
2. Other names for this property are the apriori property and the downward closure property.
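
The two notes can be illustrated together with a minimal sketch. It assumes that the property referred to in note 2 is the usual anti-monotonicity of support (no superset of an infrequent itemset can be frequent), which this preview does not state explicitly. The function names, the toy database, and the threshold are hypothetical and are not taken from the chapter; the traversal is a plain depth-first walk over a set enumeration tree, not any specific algorithm the chapter presents.

```python
from typing import Dict, FrozenSet, Sequence, Set


def support(itemset: FrozenSet[str], transactions: Sequence[Set[str]]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine_frequent(
    transactions: Sequence[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Depth-first walk of a set enumeration tree with downward-closure pruning."""
    # A fixed item order defines the tree: each node is extended only with
    # items that come later in this order.
    items = sorted({item for t in transactions for item in t})
    frequent: Dict[FrozenSet[str], int] = {}

    def expand(prefix: FrozenSet[str], start: int) -> None:
        for i in range(start, len(items)):
            candidate = prefix | {items[i]}
            count = support(candidate, transactions)
            if count >= min_support:
                frequent[candidate] = count
                # Descend before visiting siblings (depth-first).
                expand(candidate, i + 1)
            # Otherwise the whole subtree is skipped: under the assumed
            # downward closure property, no superset can be frequent.

    expand(frozenset(), 0)
    return frequent


# Hypothetical toy database of five transactions.
toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = mine_frequent(toy_db, min_support=3)
```

With this toy database and a minimum support of 3, every single item and every pair is frequent, while {a, b, c} appears in only two transactions and is rejected. Recounting support by scanning all transactions at every node keeps the sketch short; it would be too slow for the very large databases the chapter targets.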

Author information

Authors and Affiliations

Department of Computer Science and Numerical Analysis, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Cordoba, 14071, Córdoba, Spain
Jose M. Luna

Corresponding author

Correspondence to Jose M. Luna.

Editor information

Editors and Affiliations

Division of Information Systems, University of Aizu, Aizu-Wakamatsu, Fukushima, Japan
R. Uday Kiran
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, China
Philippe Fournier-Viger
Department of Computer Science and Numerical Analysis, University of Córdoba, Córdoba, Spain
Jose M. Luna
Department of Computer Science, Electrical Engineering, and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway
Jerry Chun-Wei Lin
Department of Computer Science, Ashoka University, Sonepat, Haryana, India
Anirban Mondal

About this chapter

Cite this chapter

Luna, J.M. (2021). Discovering Frequent Patterns in Very Large Transactional Databases. In: Kiran, R.U., Fournier-Viger, P., Luna, J.M., Lin, J.CW., Mondal, A. (eds) Periodic Pattern Mining. Springer, Singapore. https://doi.org/10.1007/978-981-16-3964-7_2

DOI: https://doi.org/10.1007/978-981-16-3964-7_2
Published: 30 October 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3963-0
Online ISBN: 978-981-16-3964-7
eBook Packages: Computer Science, Computer Science (R0)
