Efficient Mining of Frequent Patterns Using Ascending Frequency Ordered Prefix-Tree

Liu, Guimei; Lu, Hongjun; Lou, Wenwu; Xu, Yabo; Yu, Jeffrey Xu

doi:10.1023/B:DAMI.0000041128.59011.53

Published: November 2004

Volume 9, pages 249–274, (2004)

Data Mining and Knowledge Discovery

Guimei Liu¹, Hongjun Lu¹, Wenwu Lou¹, Yabo Xu² & Jeffrey Xu Yu²

Abstract

Mining frequent patterns, including frequent closed patterns and maximal patterns, is a fundamental problem in data mining. Many algorithms adopt the pattern growth approach, which has been shown to be superior to the candidate generate-and-test approach, especially when long patterns exist in the datasets. In this paper, we identify the key factors that influence the performance of the pattern growth approach and optimize them to further improve performance. Our algorithm uses a simple yet compact data structure, the ascending frequency ordered prefix-tree (AFOPT), to store the conditional databases, and uses arrays to store single branches to save further space. The AFOPT structure is traversed in top-down depth-first order. Our analysis and experimental results show that combining the top-down traversal strategy with the ascending frequency order achieves a significant performance improvement over previous work.
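
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names Node, build_afopt, paths, mine and frequent_patterns are hypothetical, the array storage for single branches is omitted, and conditional trees are rebuilt from path lists rather than reused in place. It only shows transactions being stored in a prefix tree in ascending frequency order, and the tree being traversed top-down with each finished subtree merged back into its right siblings.

```python
from collections import Counter


class Node:
    """Prefix-tree node: an item, its count, and children keyed by item."""
    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, items, weight):
    """Add one (already ordered) transaction with the given weight."""
    node = root
    for it in items:
        node = node.children.setdefault(it, Node(it))
        node.count += weight


def build_afopt(weighted_txns, min_sup):
    """Build a prefix tree whose items are sorted by ascending support.

    Returns (root, order); infrequent items are dropped.
    """
    freq = Counter()
    for items, w in weighted_txns:
        for it in set(items):
            freq[it] += w
    order = sorted((it for it in freq if freq[it] >= min_sup),
                   key=lambda it: (freq[it], it))
    rank = {it: r for r, it in enumerate(order)}
    root = Node()
    for items, w in weighted_txns:
        kept = sorted({it for it in items if it in rank}, key=rank.__getitem__)
        insert(root, kept, w)
    return root, order


def paths(node, prefix=()):
    """Yield (items, weight) for every transaction stored below node."""
    below = 0
    for child in node.children.values():
        below += child.count
        yield from paths(child, prefix + (child.item,))
    if prefix and node.count > below:
        yield prefix, node.count - below


def mine(root, order, prefix, min_sup, out):
    """Top-down depth-first traversal in ascending frequency order."""
    for item in order:
        child = root.children.pop(item, None)
        if child is None:
            continue
        pattern = prefix + (item,)
        out[frozenset(pattern)] = child.count
        # The subtree under `item` is its conditional database.
        cond = list(paths(child))
        sub_root, sub_order = build_afopt(cond, min_sup)
        mine(sub_root, sub_order, pattern, min_sup, out)
        # Merge the finished subtree into the remaining (right) siblings.
        for items, w in cond:
            insert(root, items, w)


def frequent_patterns(transactions, min_sup):
    """Return {frozenset(pattern): support} for all frequent patterns."""
    root, order = build_afopt([(t, 1) for t in transactions], min_sup)
    out = {}
    mine(root, order, (), min_sup, out)
    return out
```

For example, frequent_patterns([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]], 2) reports the three single items and the three pairs as frequent, while {a, b, c} (support 1) is not. Because the least frequent items are processed first, their conditional databases tend to be small, which is the intuition behind the ascending order in this sketch.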

Author information

Authors and Affiliations

Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, China
Guimei Liu, Hongjun Lu & Wenwu Lou
The Chinese University of Hong Kong, Hong Kong, China
Yabo Xu & Jeffrey Xu Yu

About this article

Cite this article

Liu, G., Lu, H., Lou, W. et al. Efficient Mining of Frequent Patterns Using Ascending Frequency Ordered Prefix-Tree. Data Mining and Knowledge Discovery 9, 249–274 (2004). https://doi.org/10.1023/B:DAMI.0000041128.59011.53

Issue Date: November 2004
DOI: https://doi.org/10.1023/B:DAMI.0000041128.59011.53
