High utility pattern mining algorithm over data streams using ext-list.

Applied Intelligence

Abstract

High utility patterns have received considerable research attention because of their wide range of application scenarios, and mining them efficiently over data streams has become an important issue in data mining. Two problems motivate this work. First, the traditional utility list structure requires too many join operations, and each join is inefficient, which lowers the time and space efficiency of the algorithm. Second, the sliding window model repeatedly generates the same result set. To address both, a new algorithm for high utility pattern mining over data streams, named HUPM_Stream, is proposed. A location-indexed list structure, Ext-list, is designed to reduce the time complexity of the utility list join operation. An improved remaining utility pruning strategy, IRS, is proposed to reduce the number of utility list join operations. A hash-table-based result set maintenance strategy, HRS, is designed to reduce the search space of the algorithm and to avoid regenerating the same result set as the window slides. Extensive experiments show that the proposed algorithm performs better on dense datasets.
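For readers unfamiliar with the operation the abstract refers to, the following is a minimal sketch of a conventional utility list join and the standard remaining utility upper bound, in the style of list-based miners such as HUI-Miner. It is not the Ext-list structure, the IRS strategy, or any part of HUPM_Stream, whose details the abstract does not give. The names `Entry`, `join`, `utility`, and `upper_bound`, the three-transaction example, and the use of a dictionary lookup by transaction id are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    tid: int    # transaction identifier
    iutil: int  # utility of the itemset in this transaction
    rutil: int  # utility of the items that follow the itemset in the item order


def join(px: List[Entry], py: List[Entry],
         prefix: Optional[List[Entry]] = None) -> List[Entry]:
    """Join the utility lists of Px and Py (x precedes y in the item order).

    `prefix` is the utility list of the shared prefix P, or None when
    Px and Py are single items. Hypothetical helper, for illustration only.
    """
    y_by_tid = {e.tid: e for e in py}
    p_by_tid = {e.tid: e for e in prefix} if prefix is not None else None
    out = []
    for ex in px:
        ey = y_by_tid.get(ex.tid)
        if ey is None:
            continue  # the joined itemset does not occur in this transaction
        iutil = ex.iutil + ey.iutil
        if p_by_tid is not None:
            iutil -= p_by_tid[ex.tid].iutil  # the prefix was counted twice
        out.append(Entry(ex.tid, iutil, ey.rutil))
    return out


def utility(ul: List[Entry]) -> int:
    """Exact utility of the itemset over all transactions in the list."""
    return sum(e.iutil for e in ul)


def upper_bound(ul: List[Entry]) -> int:
    """Remaining utility bound: no extension of the itemset can exceed it."""
    return sum(e.iutil + e.rutil for e in ul)


# Made-up data with item order a < b < c:
# T1: a=5, b=2, c=1   T2: a=10, c=6   T3: b=4, c=3
ul_a = [Entry(1, 5, 3), Entry(2, 10, 6)]
ul_b = [Entry(1, 2, 1), Entry(3, 4, 3)]
ul_c = [Entry(1, 1, 0), Entry(2, 6, 0), Entry(3, 3, 0)]

ul_ab = join(ul_a, ul_b)
ul_ac = join(ul_a, ul_c)
ul_abc = join(ul_ab, ul_ac, prefix=ul_a)
```

With a minimum utility threshold of 10, `upper_bound(ul_ab)` is 8, so the itemset {a, b} and all of its extensions could be skipped without further joins. Reducing the number and cost of such joins is the kind of saving the abstract attributes to IRS and Ext-list.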

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (62062004), the Natural Science Foundation of Ningxia Province (2022AAC03279), and the Graduate Innovation Project of North Minzu University (YCX22195). We would also like to thank Dr. Bijay Prasad Jaysawal for providing the executable file of the SOHUPDS algorithm.

Author information

Corresponding author

Correspondence to Meng Han.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Han, M., Li, M., Chen, Z. et al. High utility pattern mining algorithm over data streams using ext-list. Appl Intell 53, 27072–27095 (2023). https://doi.org/10.1007/s10489-023-04925-6

  • DOI: https://doi.org/10.1007/s10489-023-04925-6
