Abstract
High utility itemset mining (HUIM) is an extension of frequent itemset mining (FIM). Both are techniques for finding interesting patterns in a database. FIM identifies patterns based on how frequently items occur. This approach is not always effective at identifying the desired patterns, because it considers only the presence or absence of items in the database and ignores their utility. Patterns are more meaningful to the user when utility is taken into account. Utility can be quantity, profit, cost, risk, or any other factor of interest to the user. HUIM finds interesting patterns by considering the utility of items along with their frequency. It uses a minimum utility threshold to determine whether an itemset is a high utility itemset (HUI). Incorporating utility into traditional pattern mining raises several challenges, and many recent research contributions have proposed algorithms to address them. This review explores various HUIM techniques and analyzes in detail the strategies behind them: Apriori-based, tree-based, utility-list-based, and hybrid. These strategies are used to implement HUIM techniques that mine patterns effectively. The observations and analytical findings of this review, made with respect to various parameters, can inform further research in pattern mining.
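To make the threshold test concrete, the following is a minimal brute-force sketch, not one of the algorithms surveyed in the chapter. It assumes the common formulation in which the utility of an itemset in a transaction is the sum of quantity times unit profit over its items; the toy transactions, the profit table, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical unit profit per item (assumed utility factor).
profit = {"a": 5, "b": 2, "c": 1}


def support(itemset, transactions):
    """Number of transactions containing every item of the itemset (the FIM view)."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def itemset_utility(itemset, transactions, profit):
    """Sum of quantity * profit over all transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_util):
    """Enumerate every itemset and keep those whose utility reaches min_util."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = itemset_utility(itemset, transactions, profit)
            if u >= min_util:
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_itemsets(transactions, profit, 15).items():
        print(itemset, "utility:", u, "support:", support(itemset, transactions))
```

In this toy data the single item `b` occurs in three transactions but has a utility of only 8, while the less frequent itemset `(a, b)` has a utility of 16 and passes a threshold of 15. A superset can therefore have higher utility than its subset, so frequency-style pruning cannot be applied to raw utility directly. Exhaustive enumeration grows exponentially with the number of items and is shown here only to illustrate the definitions.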
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Atmaja, E.H.S., Sonawane, K. (2022). A Review of High Utility Itemset Mining for Transactional Database. In: Gupta, D., Goswami, R.S., Banerjee, S., Tanveer, M., Pachori, R.B. (eds) Pattern Recognition and Data Analysis with Applications. Lecture Notes in Electrical Engineering, vol 888. Springer, Singapore. https://doi.org/10.1007/978-981-19-1520-8_2
DOI: https://doi.org/10.1007/978-981-19-1520-8_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1519-2
Online ISBN: 978-981-19-1520-8
eBook Packages: Computer Science, Computer Science (R0)