
Efficiently Finding High Utility-Frequent Itemsets Using Cutoff and Suffix Utility

  • R. Uday Kiran
  • T. Yashwanth Reddy
  • Philippe Fournier-Viger
  • Masashi Toyoda
  • P. Krishna Reddy
  • Masaru Kitsuregawa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11440)

Abstract

High utility itemset mining is an important model with many real-world applications. However, its widespread adoption and successful industrial application have been hindered by two limitations: (i) the model is computationally expensive, and (ii) infrequent itemsets may be output as high utility itemsets. This paper addresses both limitations. A generic high utility-frequent itemset model is introduced to find all itemsets in the data that satisfy user-specified minimum support and minimum utility constraints. Two new pruning measures, cutoff utility and suffix utility, are introduced to reduce the computational cost of finding these itemsets. A fast single-phase algorithm, High Utility Frequent Itemset Miner (HU-FIMi), is introduced to discover the itemsets. Experimental results demonstrate that the proposed algorithm is efficient.
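To make the two constraints concrete, the following is a minimal brute-force sketch of the high utility-frequent itemset model. It is not HU-FIMi and uses neither cutoff utility nor suffix utility, which the abstract does not define. The function name, the toy data, the use of an absolute transaction count for minimum support, and the representation of each transaction as a map from item to its utility in that transaction are all assumptions made for illustration.

```python
from itertools import combinations


def mine_high_utility_frequent(transactions, min_sup, min_util):
    """Naive enumeration of itemsets meeting both constraints.

    transactions: list of dicts mapping item -> utility of that item
                  in that transaction (assumed representation).
    min_sup:      minimum number of transactions containing the itemset
                  (assumed to be an absolute count).
    min_util:     minimum total utility of the itemset in the data.
    """
    items = sorted({item for t in transactions for item in t})
    results = {}
    # Exhaustive search over all non-empty itemsets: exponential cost,
    # which is the expense that pruning measures aim to reduce.
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            containing = [t for t in transactions
                          if all(i in t for i in itemset)]
            support = len(containing)
            utility = sum(t[i] for t in containing for i in itemset)
            if support >= min_sup and utility >= min_util:
                results[itemset] = (support, utility)
    return results


# Hypothetical toy data, not taken from the paper.
transactions = [
    {"a": 5, "b": 2},
    {"a": 5, "c": 1},
    {"a": 10, "b": 4, "c": 1},
    {"d": 40},
]
found = mine_high_utility_frequent(transactions, min_sup=2, min_util=15)
for itemset, (support, utility) in found.items():
    print(itemset, support, utility)
```

In this toy data, item d has the highest utility (40) but appears in only one transaction, so the minimum support constraint filters it out. This is the second limitation named in the abstract: a utility threshold alone would report it.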

Keywords

Data mining, Itemset mining, Utility itemset

Notes

Acknowledgements

We would like to thank Yahoo Japan Corporation for providing the retail transaction data.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • R. Uday Kiran (1, 2)
  • T. Yashwanth Reddy (3)
  • Philippe Fournier-Viger (4)
  • Masashi Toyoda (2)
  • P. Krishna Reddy (3)
  • Masaru Kitsuregawa (2, 5)

  1. National Institute of Information and Communications Technology, Tokyo, Japan
  2. The University of Tokyo, Tokyo, Japan
  3. International Institute of Information Technology-Hyderabad, Hyderabad, India
  4. Harbin Institute of Technology (Shenzhen), Shenzhen, China
  5. National Institute of Informatics, Tokyo, Japan
