Abstract
Frequent itemset mining (FIM) is an established tool in business analytics for discovering patterns and association rules in large datasets. However, FIM considers only whether an item is present or absent in a transaction. High-utility itemset mining (HUIM) addresses this limitation by taking into account both the quantity of each item and its importance. Several HUIM algorithms exist, but their major disadvantage is that they generate too many candidate itemsets. Utility list-based HUIM algorithms tackle this problem and are faster and more efficient, yet they still rely on time-consuming join operations that perform unnecessary comparisons when merging two lists of items. In this study, we introduce a novel technique termed the static increment ratio (SIR) for estimating the probability of an item’s appearance. We also propose a hybrid search technique that compares two item lists using a blend of linear and binary search. These approaches use the SIR value to decide which comparisons can be skipped. We evaluated the method on real-world datasets and compared it with state-of-the-art algorithms. The results show that the method is reliable and substantially reduces the execution time of HUIM algorithms, improving operational efficiency by up to 31%.
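To make the idea of a hybrid join concrete, the following is a minimal Python sketch of intersecting two sorted transaction-identifier lists with either linear or binary search. It is an illustration under stated assumptions, not the authors' method: the abstract does not define how SIR is computed, so the hypothetical parameter `ratio_threshold` (a simple length-ratio heuristic) stands in for the SIR-based decision, and the function name `hybrid_join` is likewise invented here. Real utility lists also carry utility values per entry, which this sketch omits.

```python
from bisect import bisect_left


def hybrid_join(tids_a, tids_b, ratio_threshold=8):
    """Return the transaction ids present in both sorted lists.

    ratio_threshold is a hypothetical stand-in for the SIR-based decision
    described in the abstract: if one list is much longer than the other,
    binary search is used to skip ahead; otherwise a linear merge is used.
    """
    # Always iterate over the shorter list and probe the longer one.
    if len(tids_a) > len(tids_b):
        tids_a, tids_b = tids_b, tids_a

    use_binary = len(tids_b) >= ratio_threshold * max(len(tids_a), 1)
    common = []
    j = 0  # current position in the longer list
    for tid in tids_a:
        if use_binary:
            # Jump directly to the first entry >= tid, starting from j.
            j = bisect_left(tids_b, tid, j)
        else:
            # Step forward one entry at a time.
            while j < len(tids_b) and tids_b[j] < tid:
                j += 1
        if j == len(tids_b):
            break  # longer list exhausted, no further matches possible
        if tids_b[j] == tid:
            common.append(tid)
            j += 1
    return common
```

With lists of similar length the function behaves like an ordinary merge; with a very short list against a very long one, each probe costs a logarithmic number of comparisons instead of a scan over the skipped entries.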
About this article
Cite this article
Gajera, R., Patel, S., Madhani, K. et al. An efficient join operations for utility list-based high-utility mining approaches using hybrid search technique. Int J Data Sci Anal (2024). https://doi.org/10.1007/s41060-024-00538-5
DOI: https://doi.org/10.1007/s41060-024-00538-5