Abstract
The goal of high-utility itemset mining (HUIM) is to discover combinations of items that yield high profits in transactional databases. HUIM is a useful tool for retail stores to analyze customer behavior. However, it ignores the categorization of items. To address this issue, the ML-HUI Miner algorithm was proposed. It combines an item taxonomy with the HUIM task and can discover insightful itemsets that traditional HUIM approaches do not find. Although ML-HUI Miner is efficient at discovering itemsets from multiple abstraction levels, it is a sequential algorithm and therefore cannot exploit the multi-core processors that are now widely available. This paper addresses this limitation by extending the algorithm into a multi-core version, called MCML-Miner (Multi-Core Multi-Level high-utility itemset Miner), to significantly reduce the mining time. Each level in the taxonomy is assigned to a separate processor core so that the levels are explored concurrently. Experiments on real-world datasets show that MCML-Miner is up to several times faster than the original algorithm.
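The level-per-core idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MCML-Miner implementation: the toy transactions, profits, two-level taxonomy, and the names project, mine_level, and mine_all_levels are all hypothetical, and a brute-force enumeration stands in for the actual mining procedure. It assumes the usual HUIM definition in which an item's utility in a transaction is its quantity times its unit profit, and that a generalized item's utility is the sum over its descendants.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"cola": 2, "chips": 1},
    {"cola": 1, "juice": 3},
    {"chips": 2, "cookies": 1, "juice": 1},
]
PROFIT = {"cola": 3, "juice": 2, "chips": 4, "cookies": 5}  # unit profits

# Hypothetical two-level taxonomy: level 0 keeps the items, level 1 uses categories.
TAXONOMY = [
    {item: item for item in PROFIT},
    {"cola": "drinks", "juice": "drinks", "chips": "snacks", "cookies": "snacks"},
]


def project(level_map):
    """Rewrite each transaction as {generalized item: utility} for one level."""
    rows = []
    for transaction in TRANSACTIONS:
        row = {}
        for item, qty in transaction.items():
            general = level_map[item]
            row[general] = row.get(general, 0) + qty * PROFIT[item]
        rows.append(row)
    return rows


def mine_level(level, min_util):
    """Brute-force search at one abstraction level (stand-in for a real miner)."""
    rows = project(TAXONOMY[level])
    items = sorted({item for row in rows for item in row})
    found = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # Sum the itemset's utility over transactions that contain all of it.
            util = sum(
                sum(row[i] for i in itemset)
                for row in rows
                if all(i in row for i in itemset)
            )
            if util >= min_util:
                found[itemset] = util
    return level, found


def mine_all_levels(min_util, executor_cls=ThreadPoolExecutor):
    """Submit one task per taxonomy level so the levels are explored concurrently."""
    with executor_cls(max_workers=len(TAXONOMY)) as pool:
        futures = [
            pool.submit(mine_level, level, min_util)
            for level in range(len(TAXONOMY))
        ]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    for level, itemsets in sorted(mine_all_levels(15).items()):
        print(level, itemsets)
```

With a minimum utility of 15, only one itemset qualifies at the item level, while the category level also yields drinks, snacks, and their combination, which shows how a taxonomy can surface itemsets that item-level mining misses. The sketch uses a thread pool to stay easy to run; in CPython, CPU-bound work only spreads across cores with a process pool, so passing concurrent.futures.ProcessPoolExecutor as executor_cls (from a script guarded by the __main__ check) would be the closer analogue of a multi-core run.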
Acknowledgements
This research is funded by Vietnam National University HoChiMinh City (VNU-HCM) under grant number C2020-28-04.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, T.D.D., Nguyen, L.T.T., Kozierkiewicz, A., Pham, T., Vo, B. (2021). An Efficient Approach for Mining High-Utility Itemsets from Multiple Abstraction Levels. In: Nguyen, N.T., Chittayasothorn, S., Niyato, D., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2021. Lecture Notes in Computer Science, vol 12672. Springer, Cham. https://doi.org/10.1007/978-3-030-73280-6_8
DOI: https://doi.org/10.1007/978-3-030-73280-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73279-0
Online ISBN: 978-3-030-73280-6
eBook Packages: Computer Science, Computer Science (R0)