Mining Minimal High-Utility Itemsets

  • Philippe Fournier-Viger
  • Jerry Chun-Wei Lin
  • Cheng-Wei Wu
  • Vincent S. Tseng
  • Usef Faghihi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9827)

Abstract

Mining high-utility itemsets (HUIs) is a key data mining task. It consists of discovering groups of items that yield a high profit in transaction databases. A major drawback of traditional high-utility itemset mining algorithms is that they can return a large number of HUIs, and analyzing such a large result set can be very time-consuming for users. To address this issue, concise representations of high-utility itemsets have been proposed, such as closed HUIs, maximal HUIs, and generators of HUIs. In this paper, we explore a novel representation called minimal high-utility itemsets (MinHUIs), defined as the smallest sets of items that generate a high profit. We study the properties of MinHUIs and design an efficient algorithm named MinFHM to discover them. An extensive experimental study with real-life datasets shows that mining MinHUIs can be much faster than mining other concise representations or all HUIs, and that it can greatly reduce the size of the result set presented to the user.
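
To make the definition concrete, the following is a minimal brute-force sketch in Python. It is not MinFHM, whose details are not given in the abstract. It assumes the standard notion of utility (purchase quantity times unit profit, summed over the transactions that contain the whole itemset) and reads "smallest" as "no proper subset is itself high-utility". The toy database, the profit table, and the names `utility`, `minimal_huis`, and `minutil` are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchase quantity.
transactions = [
    {"a": 1, "b": 5, "c": 1},
    {"b": 4, "c": 3},
    {"a": 1, "c": 1},
    {"a": 2, "b": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def utility(itemset, transactions, unit_profit):
    """Profit of `itemset`, summed over transactions containing all its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def minimal_huis(transactions, unit_profit, minutil):
    """Naive level-wise enumeration of minimal high-utility itemsets.

    Assumed reading of the definition: an itemset is minimal when its
    utility is at least `minutil` and no proper subset reaches `minutil`.
    """
    items = sorted({item for t in transactions for item in t})
    found = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Any high-utility proper subset contains an already-found
            # minimal one, because smaller sizes were processed first.
            if any(m < cand for m in found):
                continue
            # Utility is not monotone, so low-utility itemsets cannot be
            # used to prune their supersets; every remaining candidate
            # has to be evaluated.
            if utility(candidate, transactions, unit_profit) >= minutil:
                found.append(cand)
    return found
```

On this toy data, `minimal_huis(transactions, unit_profit, 22)` returns the itemsets {a, b} and {b, c}: no single item reaches a utility of 22, while these two pairs do (27 and 22 respectively). The enumeration is exponential in the number of items and is meant only to illustrate the definition.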

Keywords

Utility mining, High-utility itemsets, Minimal itemsets

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Philippe Fournier-Viger (1)
  • Jerry Chun-Wei Lin (2)
  • Cheng-Wei Wu (3)
  • Vincent S. Tseng (3)
  • Usef Faghihi (4)

  1. School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
  2. School of Computer Science and Technology, Shenzhen Graduate School, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
  3. Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
  4. Department of Computer Science and Mathematics, University of Indianapolis, Indianapolis, USA
