HAUOPM: High Average Utility Occupancy Pattern Mining

  • Research Article-Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Considerable research has been devoted to data mining. Among its techniques, frequent pattern mining reveals patterns that occur frequently in the data. The field has since expanded to high-utility pattern mining, which extracts patterns that are beneficial to organizations. To improve the quality of these patterns, recent research has introduced occupancy as an interestingness measure. The resulting approach, high-utility occupancy pattern mining (HUOPM), identifies patterns that contribute significantly to their supporting transactions, as measured by a utility occupancy metric. A high-utility occupancy pattern is a pattern whose utility occupancy exceeds a user-specified threshold. In transactional databases, the conventional definition of the utility occupancy of an itemset involves adding up its utility occupancies in each of its supporting transactions. However, a key drawback of traditional HUOPM is that it does not account for the length of an itemset. Consequently, some items within an itemset may have higher utility occupancies than others, and the resulting representation may not accurately reflect the relative importance of each item within the itemset.
To overcome this limitation, a novel high average utility occupancy pattern mining (HAUOPM) algorithm is proposed. It uses a new measure, average utility occupancy, to mine high average utility occupancy patterns. Because average utility occupancy is not anti-monotonic, a novel average utility occupancy upper bound is used to prune unpromising itemsets. When two itemsets are compared, the utility of one may be greater than that of the other while its average utility occupancy is lower, so some high-quality patterns may go undiscovered. To address this second limitation, we introduce a novel measure called transaction utility occupancy to discover high-quality patterns. A novel average utility occupancy tree structure is used to enumerate the search space, and a modified average utility occupancy list structure stores information about items and itemsets. Extensive experiments on real and synthetic datasets test the effectiveness of the proposed HAUOPM algorithm.
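
The effect of itemset length on the ranking of patterns can be illustrated with a minimal sketch. The abstract does not state the formulas, so everything here is an assumption: utility occupancy is taken as the mean share of transaction utility that an itemset contributes across its supporting transactions (the form commonly used in the HUOPM literature), and average utility occupancy is taken as that value divided by the itemset length. The toy database and the function names are hypothetical and are not the authors' implementation.

```python
from fractions import Fraction

# Hypothetical toy database: each transaction maps item -> utility of that
# item in the transaction. These values are invented for illustration.
TRANSACTIONS = [
    {"a": 4, "b": 6},
    {"a": 2, "b": 2, "c": 4},
    {"b": 5, "c": 5},
]


def utility_occupancy(itemset, transactions):
    """Mean share of transaction utility contributed by `itemset`.

    ASSUMPTION: the share is u(X, T) / tu(T), averaged over the
    transactions that contain every item of the itemset.
    """
    items = set(itemset)
    shares = [
        Fraction(sum(t[i] for i in items), sum(t.values()))
        for t in transactions
        if items.issubset(t)  # t supports the itemset
    ]
    if not shares:
        return Fraction(0)  # no supporting transaction
    return sum(shares) / len(shares)


def average_utility_occupancy(itemset, transactions):
    """ASSUMPTION: utility occupancy normalised by itemset length."""
    return utility_occupancy(itemset, transactions) / len(set(itemset))


if __name__ == "__main__":
    for pattern in ({"a", "b"}, {"c"}):
        print(
            sorted(pattern),
            utility_occupancy(pattern, TRANSACTIONS),
            average_utility_occupancy(pattern, TRANSACTIONS),
        )
```

In this toy data, {a, b} has a utility occupancy of 3/4 and {c} has 1/2, but after dividing by length the order reverses (3/8 versus 1/2). Under the stated assumptions, this is the kind of ranking change that a length-aware measure is meant to capture.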

Author information

Corresponding author

Correspondence to Mathe John Kenny Kumar.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, M.J.K., Rana, D. HAUOPM: High Average Utility Occupancy Pattern Mining. Arab J Sci Eng 49, 3397–3416 (2024). https://doi.org/10.1007/s13369-023-07971-x

  • DOI: https://doi.org/10.1007/s13369-023-07971-x
