Efficiently mining frequent itemsets with weight and recency constraints

Applied Intelligence

Abstract

A framework named recent weighted frequent itemset mining (RWFIM) and two projection-based algorithms, RWFIM-P and RWFIM-PE, were previously proposed to consider both the relative importance of items (item weights) and the recency of patterns. However, the projection-and-test mechanism that these algorithms use to recursively discover recent weighted frequent itemsets (RWFIs) may perform poorly when the database is dense or contains long transactions. To address this issue, this paper proposes an efficient tree-based algorithm, RWFI-Mine, for mining RWFIs. A novel set-enumeration tree called the recent weighted frequent (RWF)-tree and a sorted downward closure property of RWFIs for the RWF-tree are proposed. Moreover, two data structures, the element (E)-table and the recent weighted frequent (RWF)-table, are designed to store the information needed for discovering RWFIs. RWFI-Mine discovers RWFIs recursively without candidate generation, thus reducing the computational cost and memory requirements of mining RWFIs. An improved algorithm, RWFI-EMine, is further proposed; it adopts the Estimated Weight of 2-itemset Pruning (EW2P) strategy to avoid building E-tables and RWF-tables for unpromising itemsets and their child nodes. Extensive experiments are conducted on several real-world and synthetic datasets to evaluate the performance of the two proposed algorithms and the ratio between the number of generated RWFIs and WFIs. Results show that the proposed algorithms outperform not only the traditional PWA algorithm for WFIM, but also the state-of-the-art RWFIM-P and RWFIM-PE algorithms for RWFIM, in terms of runtime, memory usage, and scalability.
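
To make the notion of an RWFI concrete, the following is a minimal brute-force sketch in Python. It is not RWFI-Mine or RWFI-EMine, and it uses no RWF-tree, E-table, RWF-table, or EW2P pruning. The abstract does not state the formal definitions, so everything here is an assumption made for illustration: the weight of an itemset is taken as the mean of its item weights, the recency of a transaction is taken as `(1 - decay) ** (t_now - t)`, and an itemset qualifies when both its summed recency and its weighted support reach the thresholds `min_recency` and `min_wsup`. All function and parameter names are hypothetical.

```python
from itertools import combinations


def recent_weighted_frequent_itemsets(db, weights, decay, min_recency, min_wsup):
    """Enumerate every itemset and keep those passing both assumed thresholds.

    db is a list of (timestamp, set_of_items) pairs; weights maps item -> weight.
    """
    n = len(db)
    t_now = max(t for t, _ in db)  # the newest timestamp counts as "now"
    items = sorted({i for _, tx in db for i in tx})
    result = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            cset = set(cand)
            # Assumed itemset weight: mean of the weights of its items.
            w = sum(weights[i] for i in cand) / k
            # Timestamps of the transactions that contain the candidate.
            covering = [t for t, tx in db if cset <= tx]
            # Assumed recency: sum of decayed scores of covering transactions.
            recency = sum((1 - decay) ** (t_now - t) for t in covering)
            # Assumed weighted support: itemset weight times relative frequency.
            wsup = w * len(covering) / n
            if covering and recency >= min_recency and wsup >= min_wsup:
                result[cand] = (recency, wsup)
    return result


# Toy database, invented for this sketch.
db = [(1, {"a", "b"}), (2, {"a", "c"}), (3, {"a", "b", "c"})]
weights = {"a": 0.6, "b": 0.8, "c": 0.4}
found = recent_weighted_frequent_itemsets(db, weights, decay=0.1,
                                          min_recency=1.5, min_wsup=0.3)
```

Under these assumed definitions, the itemset `("a", "c")` qualifies on the toy data while its subset `("c",)` does not, because averaging in the heavier item `a` raises the itemset weight. This suggests why a plain downward closure property cannot be relied on for pruning, which is plausibly the motivation for the sorted downward closure property mentioned in the abstract. The exhaustive enumeration is exponential in the number of items and serves only to illustrate the problem the proposed algorithms address.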

References

  1. Frequent itemset mining dataset repository. Available: http://fimi.ua.ac.be/data/ (2012)

  2. Agrawal R, Srikant R (1994) Quest synthetic data generator. Available: http://www.Almaden.ibm.com/cs/quest/syndata.html

  3. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases, International Conference on Very Large Data Bases, pp 487–499

  4. Agrawal R, Srikant R (1995) Mining sequential patterns, International Conference on Data Engineering, pp 3–14

  5. Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases, The ACM SIGMOD International Conference on Management of Data, pp 207–216

  6. Cai CH, Fu AWC, Cheng CH, Kwong WW (1998) Mining association rules with weighted items, International Database Engineering and Applications Symposium, pp 68–77

  7. Chen MS, Han J, Yu PS (1996) Data mining: An overview from a database perspective. IEEE Trans Knowl Data Eng 8(6):866–883

  8. Geng L, Hamilton HJ (2006) Interestingness measures for data mining: A survey. ACM Comput Surv 38(3):9

  9. Han J, Lakshmanan L, Ng RT (1999) Constraint-based, multidimensional data mining. Computer 32(8):46–50

  10. Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Min Knowl Disc 8(1):53–87

  11. Han J, Cheng H, Xin D, Yan X (2007) Frequent pattern mining: Current status and future directions. Data Min Knowl Disc 15(1):55–86

  12. Hong TP, Wu YY, Wang SL (2009) An effective mining approach for up-to-date patterns. Expert Systems with Applications 36(6):9747–9752

  13. Lin JCW, Gan W, Fournier-Viger P, Hong TP (2015) RWFIM: Recent weighted-frequent itemsets mining. Eng Appl Artif Intell 45:18–32

  14. Lin JCW, Gan W, Hong TP, Tseng VS (2015) HEWIM: High expected weighted itemset mining in uncertain databases, International Conference on Machine Learning and Cybernetics, pp 439–444

  15. Lan GC, Hong TP, Lee HY (2014) An efficient approach for finding weighted sequential patterns from sequence databases. Appl Intell 41(2):439–452

  16. Lan GC, Hong TP, Lee HY, Lin CW (2013) Mining weighted frequent itemsets, The 30th workshop on Combinatorial Mathematics and Computation Theory, pp 85–89

  17. Lee G, Yun U, Ryu KH (2014) Sliding window based weighted maximal frequent pattern mining over data streams. Expert Systems with Applications 41(2):694–708

  18. Lin JCW, Gan W, Hong TP, Zhang B (2015) An incremental high-utility mining algorithm with transaction insertion, The Scientific World Journal

  19. Lin JCW, Gan W, Fournier-Viger P, Hong TP (2015) Mining weighted frequent itemsets with the recency constraint, Asia-Pacific Web Conference, pp 635–646

  20. Lin JCW, Gan W, Fournier-Viger P, Hong TP (2016) Efficient mining of weighted frequent itemsets in uncertain databases, Machine Learning and Data Mining in Pattern Recognition, pp 236–250

  21. Lin JCW, Gan W, Fournier-Viger P, Hong TP (2016) Efficient algorithms for mining recent weighted frequent itemsets in temporal transactional databases, The 31st Annual ACM Symposium on Applied Computing, pp 861–866

  22. Microsoft. Example database foodmart of microsoft analysis services. Available: http://msdn.microsoft.com/en-us/library/aa217032(SQL.80).aspx

  23. Pasquier N, Bastide Y, Taouil R, Lakhal L (1998) Pruning closed itemset lattices for association rules, International Conference on Advanced Databases, pp 177–196

  24. Ng RT, Lakshmanan L, Han J, Pang A (1998) Exploratory mining and pruning optimizations of constrained association rules. ACM SIGMOD Rec 27(2):13–24

  25. Fournier-Viger P, Nkambou R, Tseng VS (2011) RuleGrowth: Mining sequential rules common to several sequences by pattern-growth, ACM Symposium on Applied Computing, pp 956–961

  26. Fournier-Viger P, Faghihi U, Nkambou R, Nguifo EM (2012) CMRules: Mining sequential rules common to several sequences. Knowl-Based Syst 25(1):63–76

  27. Pei J, Han J (2002) Constrained frequent pattern mining: A pattern-growth view. ACM SIGKDD Explorations Newsletter 4(1):31–39

  28. Rymon R (1992) Search through systematic set enumeration, International Conference Principles of Knowledge Representation and Reasoning, pp 539–550

  29. Srikant R, Agrawal R (1996) Mining sequential patterns: Generalizations and performance improvements, The International Conference on Extending Database Technology: Advances in Database Technology, pp 3–17

  30. Sun K, Bai F (2008) Mining weighted association rules without preassigned weights. IEEE Trans Knowl Data Eng 20(4):489–495

  31. Tao F, Murtagh F, Farid M (2003) Weighted association rule mining using weighted support and significance framework, The 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 661–666

  32. Tseng VS, Wu CW, Shie BE, Yu PS (2010) UP-Growth: An efficient algorithm for high utility itemset mining, The 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 253–262

  33. Vo B, Coenen F, Le B (2013) A new method for mining frequent weighted itemsets based on wit-trees. Expert Systems with Applications 40(4):1256–1264

  34. Wang W, Yang J, Yu PS (2000) Efficient mining of weighted association rules (WAR), The 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 270–274

  35. Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases, The SIAM International Conference on Data Mining, pp 211–225

  36. Yun U, Leggett J (2005) WFIM: Weighted frequent itemset mining with a weight range and a minimum weight, SIAM International Conference on Data Mining, pp 636–640

  37. Yun U, Leggett J (2006) WSpan: Weighted sequential pattern mining in large sequential database, IEEE International Conference on Intelligent Systems, pp 512–517

Acknowledgments

This research was partially supported by the National Natural Science Foundation of China (NSFC) under grant No. 61503092 and by the Tencent Project under grant CCF-Tencent IAGR20160115.

Author information

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

About this article

Cite this article

Lin, J.CW., Gan, W., Fournier-Viger, P. et al. Efficiently mining frequent itemsets with weight and recency constraints. Appl Intell 47, 769–792 (2017). https://doi.org/10.1007/s10489-017-0915-2

  • DOI: https://doi.org/10.1007/s10489-017-0915-2
