
Efficient mining of high utility pattern with considering of rarity and length

Applied Intelligence

Abstract

Techniques for mining rare patterns have been researched in the association rule mining area because traditional frequent pattern mining methods must generate a large number of unnecessary patterns in order to find rare patterns in large databases. One such technique, the multiple minimum support threshold framework, was devised to extract rare patterns by using a different minimum item support threshold for each item in a database. Nevertheless, this framework cannot sufficiently reflect real-world environments because its mining process does not consider item weights, such as market prices of products and fatality rates of diseases. Therefore, an algorithm has been proposed to mine rare patterns with utilities exceeding a user-specified minimum utility by considering both the rarity and the utility information of items. However, since this algorithm employs the concept of traditional high utility pattern mining, the lengths of patterns are not considered when determining their utilities. If a pattern is sufficiently long, it is more likely to have enough utility to become a high utility pattern regardless of the utilities of the items within it. Therefore, the algorithm cannot guarantee that all items in a mined pattern have high utilities. In this paper, we propose a novel algorithm that effectively reduces this dependency of patterns on their lengths by considering pattern length in the mining process, in order to mine more meaningful rare patterns than those mined by previous algorithms. Experimental results demonstrate that our algorithm extracts fewer but more meaningful patterns and consumes fewer computational resources than state-of-the-art algorithms.
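The length dependency described in the abstract can be illustrated with a small sketch. The abstract does not state which length-aware measure the proposed algorithm uses, so the code below assumes the average utility (total utility divided by pattern length) known from the high average-utility itemset literature. The transactions, item weights, threshold, and function names are all hypothetical, and the rarity (multiple minimum support) side of the problem is left out.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy database: each transaction maps an item to its quantity.
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 1, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1, "e": 1},
    {"e": 2},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]

# Hypothetical per-unit item weights (e.g. profit); only "e" is valuable.
UNIT_WEIGHT: Dict[str, int] = {"a": 2, "b": 2, "c": 2, "d": 2, "e": 10}


def utility(pattern: FrozenSet[str]) -> int:
    """Traditional utility: sum of quantity * weight of the pattern's items
    over every transaction that contains the whole pattern."""
    total = 0
    for transaction in TRANSACTIONS:
        if all(item in transaction for item in pattern):
            total += sum(transaction[i] * UNIT_WEIGHT[i] for i in pattern)
    return total


def average_utility(pattern: FrozenSet[str]) -> float:
    """Assumed length-aware measure: utility divided by pattern length."""
    return utility(pattern) / len(pattern)


MIN_UTILITY = 20  # hypothetical user-specified threshold

long_cheap = frozenset({"a", "b", "c", "d"})  # four low-weight items
short_valuable = frozenset({"e"})             # one high-weight item

for pattern in (long_cheap, short_valuable):
    print(
        sorted(pattern),
        "utility:", utility(pattern),
        "average utility:", average_utility(pattern),
        "passes by utility:", utility(pattern) >= MIN_UTILITY,
        "passes by average utility:", average_utility(pattern) >= MIN_UTILITY,
    )
```

In this toy data the four-item pattern clears the threshold on total utility only because it is long, while every item in it has a low weight. Under the assumed average-utility measure it is rejected, and the single high-weight item is still kept.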





Acknowledgments

This research was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF No. 20152062051, NRF No. 20155054624 and NRF No. 20135005682) and by the Business for Academic-industrial Cooperative establishments funded by the Korea Small and Medium Business Administration in 2015 (Grant No. C0261068).

Author information


Corresponding author

Correspondence to Unil Yun.


About this article


Cite this article

Kim, D., Yun, U. Efficient mining of high utility pattern with considering of rarity and length. Appl Intell 45, 152–173 (2016). https://doi.org/10.1007/s10489-015-0750-2


  • DOI: https://doi.org/10.1007/s10489-015-0750-2
