Discovering Frequent Patterns in Very Large Transactional Databases

  • Chapter
Periodic Pattern Mining

Abstract

Finding frequent patterns in very large transactional databases is a challenging problem of great concern in many real-world applications. In this chapter, we first introduce the model of frequent patterns. Second, we describe the search space for finding the desired patterns. Third, we present four popular algorithms to find the patterns. Finally, we present the extensions of frequent patterns.

Notes

  1. The set enumeration tree is a high-performance data representation technique that resembles a depth-first search on the itemset lattice.

  2. Other names for this property are the apriori property and the downward closure property.
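The downward closure property named in note 2 can be illustrated with a small level-wise search. The sketch below is not taken from the chapter: the function name `frequent_itemsets`, the toy transactions, and the use of an absolute support count (the number of transactions containing an itemset) are all assumptions made for illustration. It relies only on the standard fact that every subset of a frequent itemset is itself frequent, so a candidate with an infrequent subset can be discarded without counting it.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose count
    is at least min_support (hypothetical helper, Apriori-style)."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain the whole itemset.
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        count = support(single)
        if count >= min_support:
            current[single] = count

    result = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (downward closure): keep a candidate only if all of
        # its (k-1)-item subsets were frequent at the previous level.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the surviving candidates and keep the frequent ones.
        current = {}
        for c in candidates:
            count = support(c)
            if count >= min_support:
                current[c] = count
        result.update(current)
        k += 1
    return result


# Toy database (assumed data, not from the chapter).
toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets(toy_db, min_support=3)
```

With this toy database and a minimum support of 3, every single item and every pair is frequent, while the triple `{a, b, c}` appears in only two transactions and is rejected after counting. The prune step pays off on larger data, where many candidates contain an infrequent subset and never need a database scan.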

Author information

Corresponding author

Correspondence to Jose M. Luna.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Luna, J.M. (2021). Discovering Frequent Patterns in Very Large Transactional Databases. In: Kiran, R.U., Fournier-Viger, P., Luna, J.M., Lin, J.CW., Mondal, A. (eds) Periodic Pattern Mining. Springer, Singapore. https://doi.org/10.1007/978-981-16-3964-7_2

  • DOI: https://doi.org/10.1007/978-981-16-3964-7_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-3963-0

  • Online ISBN: 978-981-16-3964-7

  • eBook Packages: Computer Science, Computer Science (R0)
