
An efficient join operations for utility list-based high-utility mining approaches using hybrid search technique

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

Frequent itemset mining (FIM) is an established tool in business analytics for discovering patterns and association rules in large datasets. However, FIM has a limitation: it considers only whether an item is present or absent in a transaction. High-utility itemset mining (HUIM) addresses this limitation by taking into account both the quantity of each item and its importance. Several HUIM algorithms exist, but their major disadvantage is that they generate too many candidate itemsets. Utility list-based HUIM algorithms tackle this problem and are faster and more efficient, yet they still suffer from time-consuming join operations, which perform unnecessary comparisons when merging two lists of items. In this study, we introduce the static increment ratio (SIR), a technique for estimating the probability of an item's appearance. We also propose a hybrid search technique that compares two item lists using a combination of linear and binary search. These approaches use the SIR value to decide which comparisons can be skipped. We evaluated the method on real-world datasets and compared it with state-of-the-art algorithms. The results show that the method is reliable and substantially reduces the execution time of HUIM algorithms, improving operational efficiency by up to 31%.
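To make the idea of a hybrid join concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not define how SIR is computed, so the sketch substitutes a plain length ratio between the two lists as the switching criterion. The names `linear_join`, `binary_join`, `hybrid_join`, and `ratio_threshold` are hypothetical, and the lists hold only sorted transaction IDs, with utility values omitted for brevity.

```python
from bisect import bisect_left


def linear_join(a, b):
    """Two-pointer intersection of two sorted transaction-ID lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def binary_join(short, long):
    """Look up each ID of the shorter list in the longer one by binary search."""
    out = []
    lo = 0  # IDs are sorted, so each search can start where the last one ended
    for tid in short:
        lo = bisect_left(long, tid, lo)
        if lo == len(long):
            break  # every remaining ID in `short` is larger than all of `long`
        if long[lo] == tid:
            out.append(tid)
            lo += 1
    return out


def hybrid_join(a, b, ratio_threshold=8):
    """Choose linear or binary search from the length ratio of the two lists.

    The ratio is an illustrative stand-in for the paper's SIR value, whose
    definition is not given in the abstract.
    """
    short, long = (a, b) if len(a) <= len(b) else (b, a)
    if not short:
        return []
    if len(long) / len(short) >= ratio_threshold:
        return binary_join(short, long)
    return linear_join(short, long)
```

When one list is much longer than the other, binary search skips most of the longer list; when the lists are of similar length, the two-pointer scan avoids the logarithmic lookup cost per element. The threshold of 8 is an arbitrary illustrative value, not one taken from the paper.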




Code Repository

https://github.com/Rashmin-Gajera/HUIM-SIR-HybridSearch.


Author information

Corresponding author

Correspondence to Rashmin Gajera.


About this article

Cite this article

Gajera, R., Patel, S., Madhani, K. et al. An efficient join operations for utility list-based high-utility mining approaches using hybrid search technique. Int J Data Sci Anal (2024). https://doi.org/10.1007/s41060-024-00538-5

  • DOI: https://doi.org/10.1007/s41060-024-00538-5
