Direct Candidates Generation: A Novel Algorithm for Discovering Complete Share-Frequent Itemsets

Li, Yu-Chiang; Yeh, Jieh-Shan; Chang, Chin-Chen

doi:10.1007/11540007_67

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3614)

Included in the following conference series:

International Conference on Fuzzy Systems and Knowledge Discovery

Abstract

The itemset share is one way of evaluating the magnitude of an itemset. From a business perspective, itemset share values more fully reflect the significance of itemsets for mining association rules in a database. The Share-counted FSM (ShFSM) algorithm is one of the best algorithms for discovering all share-frequent itemsets efficiently. However, ShFSM spends computation time on the join and prune steps of candidate generation in each pass, and it generates too many useless candidates. This study therefore proposes the Direct Candidates Generation (DCG) algorithm, which generates candidates directly, without the join and prune steps, in each pass. DCG also generates fewer candidates than ShFSM. Experimental results show that the proposed method performs significantly better than ShFSM.
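
To make the share measure concrete, the following is a minimal sketch that assumes the usual definition from the share-measure literature: the share of an itemset is the total quantity of its items in the transactions that contain the whole itemset, divided by the total quantity of all items in the database. The toy transactions, the function names, and the `min_share` threshold are hypothetical illustrations. The enumeration is plain brute force over all item combinations; it is neither the DCG nor the ShFSM algorithm, whose details are not given in the abstract.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its quantity.
transactions = [
    {"A": 1, "B": 2},
    {"A": 3, "C": 1},
    {"B": 1, "C": 4, "D": 1},
    {"A": 2, "B": 1, "C": 1},
]


def total_measure_value(db):
    """Sum of all item quantities in the database."""
    return sum(sum(t.values()) for t in db)


def local_measure_value(itemset, db):
    """Sum of the quantities of the itemset's items, counted only in
    transactions that contain every item of the itemset."""
    return sum(
        sum(t[item] for item in itemset)
        for t in db
        if all(item in t for item in itemset)
    )


def itemset_share(itemset, db):
    """Assumed definition: local measure value / total measure value."""
    return local_measure_value(itemset, db) / total_measure_value(db)


def share_frequent_itemsets(db, min_share):
    """Brute-force enumeration (not DCG, not ShFSM): test every non-empty
    combination of items against the minimum share threshold."""
    items = sorted({item for t in db for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            share = itemset_share(candidate, db)
            if share >= min_share:
                result[candidate] = share
    return result


frequent = share_frequent_itemsets(transactions, min_share=0.35)
for itemset, share in sorted(frequent.items()):
    print(itemset, round(share, 3))
```

In this toy data the itemset ("A", "B") meets the threshold while its subset ("B",) does not, so a share-frequent itemset can have a subset that is not share-frequent. This is one reason Apriori-style subset pruning cannot be applied to share-frequent itemsets without modification.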

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, 621, Taiwan
Yu-Chiang Li & Chin-Chen Chang
Department of Computer Science and Information Management, Providence University, Taichung, 433, Taiwan
Jieh-Shan Yeh
Department of Information Engineering and Computer Science, Feng Chia University, Taichung, 407, Taiwan
Chin-Chen Chang

Editor information

Editors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, Block S1, Nanyang Avenue, 639798, Singapore
Lipo Wang
Honda Research Institute Europe GmbH, Offenbach/Main, Germany
Yaochu Jin

About this paper

Cite this paper

Li, YC., Yeh, JS., Chang, CC. (2005). Direct Candidates Generation: A Novel Algorithm for Discovering Complete Share-Frequent Itemsets. In: Wang, L., Jin, Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11540007_67

DOI: https://doi.org/10.1007/11540007_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28331-7
Online ISBN: 978-3-540-31828-6
eBook Packages: Computer Science, Computer Science (R0)
