Mining Minimal High-Utility Itemsets

Fournier-Viger, Philippe; Lin, Jerry Chun-Wei; Wu, Cheng-Wei; Tseng, Vincent S.; Faghihi, Usef

doi:10.1007/978-3-319-44403-1_6

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9827)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

Mining high-utility itemsets (HUIs) is a key data mining task. It consists of discovering groups of items that yield a high profit in transaction databases. A major drawback of traditional high-utility itemset mining algorithms is that they can return a large number of HUIs, and analyzing a large result set can be very time-consuming for users. To address this issue, concise representations of high-utility itemsets have been proposed, such as closed HUIs, maximal HUIs and generators of HUIs. In this paper, we explore a novel representation called the minimal high-utility itemsets (MinHUIs), defined as the smallest sets of items that generate a high profit, study its properties, and design an efficient algorithm named MinFHM to discover them. An extensive experimental study with real-life datasets shows that mining MinHUIs can be much faster than mining other concise representations or all HUIs, and that it can greatly reduce the size of the result set presented to the user.
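
To make the notion concrete, the following is a minimal brute-force sketch in Python, not the MinFHM algorithm described in the paper. It assumes that "smallest" means a high-utility itemset none of whose proper subsets is also high-utility, and that the utility of an itemset is the sum of quantity times unit profit over the transactions containing all of its items. The toy database, the profit table, the threshold `minutil` and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its purchase quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def utility(itemset, transactions, unit_profit):
    """Total profit of `itemset` over the transactions that contain all its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, minutil):
    """Enumerate every non-empty itemset and keep those with utility >= minutil."""
    items = sorted(unit_profit)
    huis = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo, transactions, unit_profit)
            if u >= minutil:
                huis[frozenset(combo)] = u
    return huis


def minimal_huis(huis):
    """Keep only the HUIs that have no proper subset that is also a HUI."""
    return {x: u for x, u in huis.items() if not any(y < x for y in huis)}


huis = high_utility_itemsets(transactions, unit_profit, minutil=13)
min_huis = minimal_huis(huis)

for itemset, u in sorted(min_huis.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), u)
```

On this toy data, four itemsets reach the threshold ({a}, {a, b}, {a, c} and {b, d}), but only {a} and {b, d} are minimal, which illustrates how the representation can shrink the result set. The exhaustive enumeration is exponential in the number of items and is only meant to clarify the definition.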

Notes

1. Recall that for an itemset X, the extensions of X are the itemsets that can be obtained by appending an item y to X such that \(y \succ i\), \(\forall i \in X\).
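
As a small illustration of this note, the sketch below lists the extensions of an itemset in Python. The total order \(\succ\) is not given on this page, so the code assumes plain alphabetical order on item names; the function name `extensions` and the item list are hypothetical.

```python
def extensions(x, items):
    """Itemsets obtained by appending one item y to x, where y follows every item of x.

    Assumption: the total order on items is alphabetical order of their names.
    """
    last = max(x) if x else None  # largest item of x under the assumed order
    return [x + (y,) for y in sorted(items) if last is None or y > last]


items = ["a", "b", "c", "d"]
exts = extensions(("a", "b"), items)
print(exts)  # [('a', 'b', 'c'), ('a', 'b', 'd')]
```

Because each appended item must follow all items already in X, every itemset is reached along exactly one path, which is what lets a depth-first search enumerate itemsets without duplicates.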

Author information

Authors and Affiliations

School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
Philippe Fournier-Viger
School of Computer Science and Technology, Shenzhen Graduate School, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
Jerry Chun-Wei Lin
Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
Cheng-Wei Wu & Vincent S. Tseng
Department of Computer Science and Mathematics, University of Indianapolis, Indianapolis, USA
Usef Faghihi

Corresponding author

Correspondence to Philippe Fournier-Viger.

Editor information

Editors and Affiliations

Clausthal University of Technology, Clausthal-Zellerfeld, Germany
Sven Hartmann
Victoria University of Wellington, Wellington, New Zealand
Hui Ma

About this paper

Cite this paper

Fournier-Viger, P., Lin, J.C.-W., Wu, C.-W., Tseng, V.S., Faghihi, U. (2016). Mining Minimal High-Utility Itemsets. In: Hartmann, S., Ma, H. (eds) Database and Expert Systems Applications. DEXA 2016. Lecture Notes in Computer Science, vol 9827. Springer, Cham. https://doi.org/10.1007/978-3-319-44403-1_6

DOI: https://doi.org/10.1007/978-3-319-44403-1_6
Published: 06 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44402-4
Online ISBN: 978-3-319-44403-1
eBook Packages: Computer Science, Computer Science (R0)
