Efficiently Finding High Utility-Frequent Itemsets Using Cutoff and Suffix Utility
High utility itemset mining is an important model with many real-world applications. However, its widespread adoption and successful industrial application have been hindered by two limitations: (i) the model is computationally expensive, and (ii) infrequent itemsets may be output as high utility itemsets. This paper addresses both limitations. A generic high utility-frequent itemset model is introduced to find all itemsets in the data that satisfy user-specified minimum support and minimum utility constraints. Two new pruning measures, named cutoff utility and suffix utility, are introduced to reduce the computational cost of finding the desired itemsets. A fast single-phase algorithm, called High Utility Frequent Itemset Miner (HU-FIMi), is introduced to discover the itemsets efficiently. Experimental results demonstrate that the proposed algorithm is efficient.
Keywords: Data mining, Itemset mining, Utility itemset
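The high utility-frequent itemset model described in the abstract can be illustrated with a small brute-force sketch. This is not HU-FIMi and it does not use the cutoff utility or suffix utility measures, which the abstract names but does not define. It assumes the conventional definitions: the support of an itemset is the number of transactions containing all its items, and its utility is the sum of its items' utilities over those transactions. The function name `mine_high_utility_frequent` and the toy database are hypothetical.

```python
from itertools import combinations


def mine_high_utility_frequent(transactions, min_support, min_utility):
    """Exhaustively enumerate itemsets meeting both thresholds.

    transactions: list of dicts mapping item -> utility of that item
                  in that transaction (assumed representation).
    min_support:  minimum number of containing transactions (absolute count).
    min_utility:  minimum total utility across containing transactions.
    Returns a dict mapping itemset (tuple) -> (support, utility).
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    # Brute force: exponential in the number of items, for illustration only.
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            support = 0
            utility = 0
            for t in transactions:
                if all(i in t for i in itemset):
                    support += 1
                    utility += sum(t[i] for i in itemset)
            # An itemset is reported only if it satisfies BOTH constraints.
            if support >= min_support and utility >= min_utility:
                result[itemset] = (support, utility)
    return result


# Hypothetical toy database: item "d" has high utility but occurs only once.
toy_db = [
    {"a": 5, "b": 2},
    {"a": 5, "c": 1},
    {"a": 10, "b": 4, "c": 1},
    {"d": 50},
]

found = mine_high_utility_frequent(toy_db, min_support=2, min_utility=12)
for itemset, (support, utility) in found.items():
    print(itemset, "support =", support, "utility =", utility)
```

In this toy data the itemset containing only "d" has utility 50 but support 1, so a pure high utility model with the same utility threshold would report it, while the combined model filters it out. The exhaustive enumeration also shows why pruning measures matter: the number of candidate itemsets grows exponentially with the number of items.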
We would like to thank Yahoo Japan Corporation for providing the retail transaction data.