
EFIM: A Highly Efficient Algorithm for High-Utility Itemset Mining

  • Conference paper
Advances in Artificial Intelligence and Soft Computing (MICAI 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9413)

Abstract

High-utility itemset mining (HUIM) is an important data mining task with wide applications. In this paper, we propose a novel algorithm named EFIM (EFficient high-utility Itemset Mining), which introduces several new ideas to discover high-utility itemsets more efficiently in terms of both execution time and memory. EFIM relies on two upper bounds, named sub-tree utility and local utility, to prune the search space more effectively. It also introduces a novel array-based utility counting technique named Fast Utility Counting to calculate these upper bounds in linear time and space. Moreover, to reduce the cost of database scans, EFIM proposes efficient database projection and transaction merging techniques. An extensive experimental study on various datasets shows that EFIM is in general two to three orders of magnitude faster and consumes up to eight times less memory than the state-of-the-art algorithms d²HUP, HUI-Miner, HUP-Miner, FHM and UP-Growth+.
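
The abstract does not define the two upper bounds, so the following is only a minimal sketch of the general idea behind array-based utility counting, not the EFIM implementation. It assumes the standard HUIM setting, in which each transaction assigns a utility to each of its items, and it uses the well-known transaction-weighted utilization (TWU) as a stand-in upper bound. The toy database, the function names and the threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy database: each transaction maps item -> utility of that
# item in the transaction (e.g. quantity * unit profit). Illustrative only.
DATABASE = [
    {"a": 5, "c": 1, "d": 2},
    {"a": 10, "c": 6, "e": 6},
    {"b": 4, "c": 3, "d": 3, "e": 3},
]


def itemset_utility(itemset, database):
    """Sum the utility of `itemset` over the transactions that contain it."""
    items = set(itemset)
    total = 0
    for transaction in database:
        if items.issubset(transaction):
            total += sum(transaction[item] for item in items)
    return total


def utility_bins(database):
    """Fill one accumulator ("bin") per item in a single database pass.

    Each bin receives the full utility of every transaction containing the
    item. The result is the TWU of the item, which is an upper bound on the
    utility of any itemset that includes it (assumption: TWU replaces the
    paper's sub-tree utility and local utility, which the abstract does not
    define).
    """
    bins = defaultdict(int)
    for transaction in database:
        transaction_utility = sum(transaction.values())
        for item in transaction:
            bins[item] += transaction_utility
    return dict(bins)


def promising_items(database, min_util):
    """Keep only items whose upper bound reaches the utility threshold."""
    bins = utility_bins(database)
    return sorted(item for item, bound in bins.items() if bound >= min_util)
```

On this toy database with a threshold of 25, `promising_items(DATABASE, 25)` keeps `a`, `c` and `e` and discards `b` and `d`, because no itemset containing a discarded item can reach the threshold. The pass costs time proportional to the total number of item occurrences and one accumulator per item, which is the sense in which such counting is linear in time and space.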

Acknowledgement

This work is financed by a Natural Sciences and Engineering Research Council (NSERC) of Canada research grant.

Corresponding author

Correspondence to Philippe Fournier-Viger.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zida, S., Fournier-Viger, P., Lin, J.C.-W., Wu, C.-W., Tseng, V.S. (2015). EFIM: A Highly Efficient Algorithm for High-Utility Itemset Mining. In: Sidorov, G., Galicia-Haro, S. (eds) Advances in Artificial Intelligence and Soft Computing. MICAI 2015. Lecture Notes in Computer Science, vol 9413. Springer, Cham. https://doi.org/10.1007/978-3-319-27060-9_44

  • DOI: https://doi.org/10.1007/978-3-319-27060-9_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27059-3

  • Online ISBN: 978-3-319-27060-9

  • eBook Packages: Computer Science, Computer Science (R0)
