Abstract
High-utility itemset mining (HUIM) is an emerging data mining topic. It aims to find high-utility itemsets by considering both the internal utility (i.e., quantity) and the external utility (i.e., profit) of items. High-average-utility itemset mining (HAUIM) is an extension of HUIM that provides a fairer measure, named average-utility, by taking into account the length of itemsets in addition to their utilities. Several algorithms have been introduced in the literature for mining high-average-utility itemsets (HAUIs). However, these algorithms assume that databases contain only positive utilities, whereas databases in some real-world applications may also contain negative utilities. On such databases, existing HAUIM algorithms may fail to discover the complete set of HAUIs because they are designed for positive utilities only. In this study, an algorithm named MHAUIPNU (mining high-average-utility itemsets with positive and negative utilities) is proposed to discover the correct and complete set of HAUIs in databases with both positive and negative utilities. MHAUIPNU introduces an upper-bound model, three pruning strategies, and a data structure. Experimental results show that MHAUIPNU is efficient in reducing the size of the search space and thus in mining HAUIs with negative utilities.
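To make the average-utility measure concrete, the following is a minimal brute-force sketch and not the MHAUIPNU algorithm: it uses no upper bound, pruning strategy, or dedicated data structure. It assumes the common definition in which an itemset's utility (quantity times unit profit, summed over the transactions that contain the itemset) is divided by the itemset's length. The toy database, the profit table, the absolute threshold, and the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical external utilities (unit profits); item "c" is sold at a loss.
profit = {"a": 5, "b": 2, "c": -3}

# Hypothetical transactions: item -> internal utility (purchased quantity).
database = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 2},
]


def average_utility(itemset, database, profit):
    """Utility of the itemset over all transactions containing it, divided by its length."""
    total = 0
    for transaction in database:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total / len(itemset)


def mine_hauis(database, profit, min_avg_utility):
    """Enumerate every itemset exhaustively; feasible only for tiny examples."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            au = average_utility(itemset, database, profit)
            if au >= min_avg_utility:
                result[itemset] = au
    return result


if __name__ == "__main__":
    for itemset, au in mine_hauis(database, profit, min_avg_utility=6).items():
        print(itemset, au)
```

With a threshold of 6, this toy run reports `("a",)` with average utility 20.0, `("b",)` with 6.0, and `("a", "b")` with 8.0. The itemset `("a", "b")` scores higher than its subset `("b",)`, so the measure is not anti-monotone, which is why HAUIM algorithms rely on upper bounds to prune the search space. Itemsets containing the negative-profit item `"c"` score lower than their counterparts without it.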
About this article
Cite this article
Yildirim, I., Celik, M. Mining High-Average Utility Itemsets with Positive and Negative External Utilities. New Gener. Comput. 38, 153–186 (2020). https://doi.org/10.1007/s00354-019-00078-8
DOI: https://doi.org/10.1007/s00354-019-00078-8