Skip to main content

TKU-CE: Cross-Entropy Method for Mining Top-K High Utility Itemsets

  • Conference paper
  • First Online:
Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices (IEA/AIE 2020)

Abstract

Mining high utility itemsets (HUIs) is one of the most important research topics in data mining because HUIs consider non-binary frequency values of items in transactions and different profit values for each item. However, setting appropriate minimum utility thresholds by trial and error is a tedious process for users. Thus, mining the top-k high utility itemsets (top-k HUIs) without setting a utility threshold is becoming an alternative to determining all of the HUIs. In this paper, we propose a novel algorithm, named TKU-CE (Top-K high Utility mining based on Cross-Entropy method), for mining top-k HUIs. The TKU-CE algorithm follows the roadmap of cross entropy and tackles top-k HUI mining using combinatorial optimization. The main idea of TKU-CE is to generate the top-k HUIs by gradually updating the probabilities of itemsets with high utility values. Compared with the state-of-the-art algorithms, TKU-CE is not only easy to implement, but also saves computational costs incurred by additional data structures, threshold raising strategies, and pruning strategies. Extensive experimental results show that the TKU-CE algorithm is efficient, memory-saving, and can discover most actual top-k HUIs.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487–499 (1994)

    Google Scholar 

  2. de Boer, P.-T., Kroese, D.P., Mannor, S., Rubinstein, R.Y.: A tutorial on the cross-entropy method. Ann. Oper. Res. 134(1), 19–67 (2005). https://doi.org/10.1007/s10479-005-5724-z

    Article  MathSciNet  MATH  Google Scholar 

  3. Duong, Q.H., Liao, B., Fournier-Viger, P., Dam, T.-L.: An efficient algorithm for mining the top-k high utility itemsets, using novel threshold raising and pruning strategies. Knowl. Based Syst. 104, 106–122 (2016)

    Article  Google Scholar 

  4. Fournier-Viger, P., et al.: The SPMF open-source data mining library version 2. In: Berendt, B., et al. (eds.) ECML PKDD 2016. LNCS (LNAI), vol. 9853, pp. 36–40. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46131-1_8

    Chapter  Google Scholar 

  5. Kannimuthu, S., Premalatha, K.: Discovery of high utility itemsets using genetic algorithm with ranked mutation. Appl. Artif. Intell. 28(4), 337–359 (2014)

    Article  Google Scholar 

  6. Lin, J.C.-W., Yang, L., Fournier-Viger, P., Hong, T.-P., Voznak, M.: A binary PSO approach to mine high-utility itemsets. Soft. Comput. 21(17), 5103–5121 (2016). https://doi.org/10.1007/s00500-016-2106-1

    Article  Google Scholar 

  7. Liu, Y., Liao, W.-K., Choudhary, A.N.: A two-phase algorithm for fast discovery of high utility itemsets. In: Ho, T.B., Cheung, D., Liu, H. (eds.) PAKDD 2005. LNCS (LNAI), vol. 3518, pp. 689–695. Springer, Heidelberg (2005). https://doi.org/10.1007/11430919_79

    Chapter  Google Scholar 

  8. Ryang, H., Yun, U.: Top-k high utility pattern mining with effective threshold raising strategies. Knowl.-Based Syst. 76, 109–126 (2015)

    Google Scholar 

  9. Singh, K., Singh, S.S., Kumar, A., Biswas, B.: TKEH: an efficient algorithm for mining top-k high utility itemsets. Appl. Intell. 49(3), 1078–1097 (2018). https://doi.org/10.1007/s10489-018-1316-x

    Article  Google Scholar 

  10. Song, W., Huang, C.: Discovering high utility itemsets based on the artificial bee colony algorithm. In: Phung, D., Tseng, V.S., Webb, G.I., Ho, B., Ganji, M., Rashidi, L. (eds.) PAKDD 2018. LNCS (LNAI), vol. 10939, pp. 3–14. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93040-4_1

    Chapter  Google Scholar 

  11. Tseng, V.S., Wu, C.-W., Fournier-Viger, P., Yu, P.S.: Efficient algorithms for mining top-k high utility itemsets. IEEE Trans. Knowl. Data Eng. 28(1), 54–67 (2016)

    Article  Google Scholar 

  12. Wang, J., Han, J., Lu, Y., Tzvetkov, P.: TFP: an efficient algorithm for mining top-k frequent closed itemsets. IEEE Trans. Knowl. Data Eng. 17(5), 652–664 (2005)

    Article  Google Scholar 

  13. Wu, C.-W., Shie, B.-E., Tseng, V.S., Yu, P.S.: Mining top-k high utility itemsets. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 78–86 (2012)

    Google Scholar 

  14. Wu, J.M.T., Zhan, J., Lin, J.C.W.: An ACO-based approach to mine high-utility itemsets. Knowl.-Based Syst. 116, 102–113 (2017)

    Google Scholar 

Download references

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (61977001), the Great Wall Scholar Program (CIT&TCD20190305), and Beijing Urban Governance Research Center.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wei Song .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Song, W., Liu, L., Huang, C. (2020). TKU-CE: Cross-Entropy Method for Mining Top-K High Utility Itemsets. In: Fujita, H., Fournier-Viger, P., Ali, M., Sasaki, J. (eds) Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices. IEA/AIE 2020. Lecture Notes in Computer Science(), vol 12144. Springer, Cham. https://doi.org/10.1007/978-3-030-55789-8_72

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-55789-8_72

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-55788-1

  • Online ISBN: 978-3-030-55789-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics