Efficient Mining of Frequent Patterns Using Ascending Frequency Ordered Prefix-Tree

Data Mining and Knowledge Discovery

Abstract

Mining frequent patterns, including frequent closed patterns or maximal patterns, is a fundamental problem in data mining. Many algorithms adopt the pattern growth approach, which has been shown to be superior to the candidate generate-and-test approach, especially when long patterns exist in the datasets. In this paper, we identify the key factors that influence the performance of the pattern growth approach and optimize them to further improve performance. Our algorithm uses a simple yet compact data structure, the ascending frequency ordered prefix-tree (AFOPT), to store the conditional databases, and uses arrays to store single branches to save further space. The AFOPT structure is traversed in top-down depth-first order. Our analysis and experimental results show that the combination of the top-down traversal strategy and the ascending frequency order achieves a significant performance improvement over previous work.
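The idea summarized in the abstract can be illustrated with a small sketch. The Python code below is not the authors' implementation; it is a simplified illustration, assuming a dictionary-based prefix tree, a single global item order (no per-conditional-database reordering), and subtree copying in place of the paper's space optimizations such as array-stored single branches. All names (`Node`, `insert`, `merge`, `mine`, `frequent_patterns`) are hypothetical. Items in each transaction are sorted by ascending frequency, and the tree is traversed top-down and depth-first: the subtree under a child of the current node serves as the conditional database of that child's item, and it is merged into its right siblings before the next item is processed.

```python
from collections import Counter


class Node:
    """Prefix-tree node: a support count plus children keyed by item."""

    def __init__(self):
        self.count = 0
        self.children = {}


def insert(root, items, count=1):
    """Insert one transaction whose items are already sorted by rank."""
    node = root
    for item in items:
        node = node.children.setdefault(item, Node())
        node.count += count


def merge(dst, src):
    """Copy the children of src into dst, summing the counts of shared paths."""
    for item, child in src.children.items():
        target = dst.children.setdefault(item, Node())
        target.count += child.count
        merge(target, child)


def mine(node, prefix, min_support, rank, out):
    """Top-down depth-first traversal; consumes node.children as it goes."""
    pending = node.children
    while pending:
        # Always take the remaining child with the lowest global frequency.
        item = min(pending, key=rank.get)
        child = pending.pop(item)
        # Push the subtree to the right siblings first, because mining
        # the child below destroys its children.
        merge(node, child)
        if child.count >= min_support:
            pattern = prefix + (item,)
            out[frozenset(pattern)] = child.count
            mine(child, pattern, min_support, rank, out)


def frequent_patterns(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    freq = Counter(i for t in transactions for i in set(t))
    # Ascending frequency order over frequent items; ties broken by item value.
    order = sorted((i for i in freq if freq[i] >= min_support),
                   key=lambda i: (freq[i], i))
    rank = {item: r for r, item in enumerate(order)}
    root = Node()
    for t in transactions:
        insert(root, sorted((i for i in set(t) if i in rank), key=rank.get))
    out = {}
    mine(root, (), min_support, rank, out)
    return out
```

Calling `frequent_patterns([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}], 3)` returns a dictionary mapping each frequent itemset (as a `frozenset`) to its support count. Copying subtrees in `merge` keeps the sketch short; it does not reflect the memory behavior the paper aims for.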



Cite this article

Liu, G., Lu, H., Lou, W. et al. Efficient Mining of Frequent Patterns Using Ascending Frequency Ordered Prefix-Tree. Data Mining and Knowledge Discovery 9, 249–274 (2004). https://doi.org/10.1023/B:DAMI.0000041128.59011.53
