Efficiently mining frequent itemsets with weight and recency constraints

Lin, Jerry Chun-Wei; Gan, Wensheng; Fournier-Viger, Philippe; Chao, Han-Chieh; Hong, Tzung-Pei

doi:10.1007/s10489-017-0915-2

Published: 22 April 2017

Volume 47, pages 769–792, (2017)

Applied Intelligence

Jerry Chun-Wei Lin¹, Wensheng Gan¹, Philippe Fournier-Viger², Han-Chieh Chao¹,³ & Tzung-Pei Hong⁴,⁵

Abstract

A framework named recent weighted frequent itemset mining (RWFIM) and two projection-based algorithms, RWFIM-P and RWFIM-PE, were previously proposed to consider both the relative importance of items (item weights) and the recency of patterns. However, the projection-and-test mechanism that these algorithms use to recursively discover recent weighted frequent itemsets (RWFIs) may perform poorly when the database is dense or contains long transactions. To address this issue, this paper proposes an efficient tree-based algorithm, RWFI-Mine, for mining RWFIs, which considers both the weight and the recency of patterns. A novel set-enumeration tree called the recent weighted frequent (RWF)-tree and a sorted downward closure property of RWFIs for the RWF-tree are proposed. Moreover, two data structures, the element (E)-table and the recent weighted frequent (RWF)-table, are designed to store the information needed for discovering RWFIs. RWFI-Mine discovers RWFIs recursively without candidate generation, thus reducing the computational cost and memory requirements of mining RWFIs. A second, improved algorithm named RWFI-EMine is further proposed; by adopting the Estimated Weight of 2-itemset Pruning (EW2P) strategy, it avoids building E-tables and RWF-tables for unpromising itemsets and their child nodes. Extensive experiments are conducted on several real-world and synthetic datasets to evaluate the performance of the two proposed algorithms and the ratio between the numbers of generated RWFIs and weighted frequent itemsets (WFIs). The results show that the proposed algorithms outperform not only the traditional PWA algorithm for weighted frequent itemset mining (WFIM), but also the state-of-the-art RWFIM-P and RWFIM-PE algorithms for RWFIM, in terms of runtime, memory usage and scalability.
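
To make the two constraints concrete, the following is a minimal brute-force sketch in Python. It is not RWFI-Mine or RWFI-EMine: it uses no RWF-tree, E-table, RWF-table or EW2P pruning, and simply enumerates every itemset. The abstract does not state the formal definitions, so the sketch assumes ones commonly used in the RWFIM literature: the weight of an itemset is the mean of its item weights, the recency of a transaction decays as (1 - decay) raised to its age, and an itemset qualifies when its summed recency and its weighted support both reach user thresholds. The function name, parameter names and toy data are all hypothetical.

```python
from itertools import combinations


def mine_rwfis(transactions, weights, decay, current_time, min_recency, min_wsup):
    """Brute-force baseline under ASSUMED definitions (not the paper's algorithm).

    transactions: list of (timestamp, set_of_items)
    weights:      dict mapping item -> weight in (0, 1]
    Returns a dict mapping itemset (tuple) -> (recency, weighted support).
    """
    items = sorted({i for _, t in transactions for i in t})
    n = len(transactions)
    found = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            x = set(itemset)
            # Assumption: itemset weight is the mean of its item weights.
            w = sum(weights[i] for i in itemset) / k
            recency = 0.0
            wsup = 0.0
            for ts, t in transactions:
                if x <= t:
                    # Assumption: recency decays exponentially with age.
                    recency += (1 - decay) ** (current_time - ts)
                    wsup += w
            wsup /= n
            if recency >= min_recency and wsup >= min_wsup:
                found[itemset] = (recency, wsup)
    return found


# Hypothetical toy database: three timestamped transactions.
toy_db = [(1, {"a", "b"}), (2, {"a", "c"}), (3, {"a", "b", "c"})]
toy_weights = {"a": 0.8, "b": 0.4, "c": 0.6}
result = mine_rwfis(toy_db, toy_weights, decay=0.5, current_time=3,
                    min_recency=1.2, min_wsup=0.35)
for itemset in sorted(result, key=lambda s: (len(s), s)):
    print(itemset, result[itemset])
```

On this toy data the itemset {a, b} qualifies while its subset {b} does not, because averaging in the heavier item a raises the weight. Under the assumed definitions, the ordinary downward closure property therefore does not hold, which is consistent with the abstract introducing a sorted downward closure property for the RWF-tree.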



Acknowledgments

This research was partially supported by the National Natural Science Foundation of China (NSFC) under grant No. 61503092 and by the Tencent Project under grant CCF-Tencent IAGR20160115.

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, HIT Campus, Shenzhen University Town Xili, Shenzhen, China
Jerry Chun-Wei Lin, Wensheng Gan & Han-Chieh Chao
School of Natural Sciences and Humanities, Harbin Institute of Technology, Shenzhen, HIT Campus, Shenzhen University Town Xili, Shenzhen, China
Philippe Fournier-Viger
Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan
Han-Chieh Chao
Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan
Tzung-Pei Hong

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

About this article

Cite this article

Lin, J.CW., Gan, W., Fournier-Viger, P. et al. Efficiently mining frequent itemsets with weight and recency constraints. Appl Intell 47, 769–792 (2017). https://doi.org/10.1007/s10489-017-0915-2

Published: 22 April 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s10489-017-0915-2
