FDHUP: Fast algorithm for mining discriminative high utility patterns

Lin, Jerry Chun-Wei; Gan, Wensheng; Fournier-Viger, Philippe; Hong, Tzung-Pei; Chao, Han-Chieh

doi:10.1007/s10115-016-0991-3

Regular Paper
Published: 20 September 2016

Knowledge and Information Systems, Volume 51, pages 873–909 (2017)

Jerry Chun-Wei Lin¹,
Wensheng Gan¹,
Philippe Fournier-Viger²,
Tzung-Pei Hong³,⁴ &
Han-Chieh Chao¹,⁵

677 Accesses
58 Citations

Abstract

High utility pattern mining (HUPM) has been extensively studied in recent years, but most approaches mine high utility patterns (HUPs) without considering their frequency. The major drawback is that any combination of a low utility item with a very high utility pattern is regarded as a HUP, even if the combination has low affinity and contains items that rarely co-occur. Frequency should therefore be a key criterion for selecting HUPs. The HUIPM algorithm was proposed to address this issue and derive high utility interesting patterns (HUIPs) with strong frequency affinity. However, it recursively constructs a series of conditional trees to produce candidates and then derives the HUIPs from them. This procedure is time-consuming and may lead to a combinatorial explosion when the minimum utility threshold is set relatively low. This paper proposes FDHUP, a fast algorithm for mining discriminative high utility patterns (DHUPs) with strong frequency affinity, which discovers DHUPs efficiently by considering both the utility and frequency affinity constraints. Two compact structures, the EI-table and the FU-tree, and three pruning strategies are introduced to reduce the search space and discover DHUPs efficiently and effectively. An extensive experimental study shows that FDHUP considerably outperforms the state-of-the-art HUIPM algorithm in terms of execution time, memory consumption, and scalability.
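
The drawback described in the abstract can be illustrated with a minimal sketch. It uses only the standard definitions of itemset utility (purchase quantity times unit profit, summed over the transactions that contain the whole itemset) and support (the number of such transactions). The toy database, the unit profits, the threshold, and the names `utility` and `support` are hypothetical and chosen for illustration; the sketch does not reproduce FDHUP, its EI-table or FU-tree structures, or the paper's frequency affinity measure.

```python
from typing import Dict, Iterable, List

# A transaction maps each purchased item to its quantity (hypothetical data).
Transaction = Dict[str, int]

# Hypothetical unit profits: one very profitable item, two cheap ones.
UNIT_PROFIT: Dict[str, int] = {"diamond": 500, "bread": 1, "milk": 2}

DATABASE: List[Transaction] = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 3, "milk": 1},
    {"diamond": 1, "bread": 1},
    {"diamond": 1},
]


def contains(transaction: Transaction, itemset: Iterable[str]) -> bool:
    """Return True if every item of the itemset occurs in the transaction."""
    return all(item in transaction for item in itemset)


def utility(itemset: Iterable[str], db: List[Transaction],
            profit: Dict[str, int]) -> int:
    """Sum quantity * unit profit over all transactions containing the itemset."""
    items = list(itemset)
    return sum(
        sum(t[item] * profit[item] for item in items)
        for t in db
        if contains(t, items)
    )


def support(itemset: Iterable[str], db: List[Transaction]) -> int:
    """Count the transactions that contain the whole itemset."""
    items = list(itemset)
    return sum(1 for t in db if contains(t, items))


if __name__ == "__main__":
    min_utility = 400  # hypothetical minimum utility threshold
    for pattern in (["diamond", "bread"], ["bread", "milk"]):
        u = utility(pattern, DATABASE, UNIT_PROFIT)
        s = support(pattern, DATABASE)
        print(pattern, "utility:", u, "support:", s, "HUP:", u >= min_utility)
```

In this toy data, the pattern of diamond and bread passes the utility threshold on the strength of a single transaction, while bread and milk co-occur in three of five transactions yet fall far below it. A utility-only criterion therefore reports the rarely co-occurring pair, which is the situation that motivates adding a frequency affinity constraint.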

Acknowledgments

This research was partially supported by the Tencent Project under grant CCF-TencentRAGR20140114 and by the National Natural Science Foundation of China (NSFC) under grant No. 61503092.

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen University Town, Xili, Shenzhen, China
Jerry Chun-Wei Lin, Wensheng Gan & Han-Chieh Chao
School of Natural Sciences and Humanities, Harbin Institute of Technology (Shenzhen), Shenzhen University Town, Xili, Shenzhen, China
Philippe Fournier-Viger
Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan
Tzung-Pei Hong
Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan
Han-Chieh Chao

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

About this article

Cite this article

Lin, J.CW., Gan, W., Fournier-Viger, P. et al. FDHUP: Fast algorithm for mining discriminative high utility patterns. Knowl Inf Syst 51, 873–909 (2017). https://doi.org/10.1007/s10115-016-0991-3

Received: 17 May 2015
Accepted: 08 September 2016
Published: 20 September 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s10115-016-0991-3
