Abstract
This paper presents a new, efficient algorithm for mining frequent closed itemsets. It enumerates the set of frequent closed itemsets using a novel compound frequent itemset tree that supports fast growth and efficient pruning of the search space. It also employs a hybrid approach that adapts the search strategy, the representation of projected transaction subsets, and the projection method to the characteristics of the dataset. Efficient local pruning, global subsumption checking, and fast hashing methods are described in detail. The principle used to balance the overhead of search space growth against the overhead of pruning is also discussed. Extensive experimental evaluations on real-world and artificial datasets show that our algorithm outperforms CHARM by a factor of five and is one to three orders of magnitude more efficient than CLOSET and MAFIA.
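To make the mining target concrete, the following is a minimal brute-force sketch of what "frequent closed itemset" means: an itemset is frequent if its support (the number of transactions containing it) reaches a threshold, and closed if no proper superset has the same support. This is only an illustration of the definition. It is not the algorithm presented in the paper, it uses none of the tree, projection, pruning, or hashing techniques mentioned above, and the function name, parameter names, and sample data are assumptions chosen for this example.

```python
from itertools import combinations


def frequent_closed_itemsets(transactions, min_support):
    """Return {itemset: support} for all frequent closed itemsets.

    Hypothetical helper for illustration only: it enumerates every
    candidate itemset, so it is exponential in the number of items.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    # Step 1: count the support of every candidate and keep the frequent ones.
    support = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                support[candidate] = count

    # Step 2: keep an itemset only if no proper superset has equal support.
    # A superset with equal support is itself frequent, so it is enough
    # to compare against the frequent itemsets collected in step 1.
    closed = {}
    for itemset, count in support.items():
        subsumed = any(
            itemset < other and support[other] == count for other in support
        )
        if not subsumed:
            closed[itemset] = count
    return closed


# Made-up sample data: four transactions over the items a, b, c.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
result = frequent_closed_itemsets(sample, min_support=2)
```

With this sample and a minimum support of 2, the itemsets {b} and {c} are frequent but not closed, because {a, b} and {a, c} occur in exactly the same transactions. Reporting only closed itemsets therefore gives a smaller result without losing support information.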
References
Agarwal, R., Aggarwal, C. and Prasad, V., 2000. Depth First Generation of Long Patterns. In: The 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA.
Agrawal, R. and Srikant, R., 1994. Fast Algorithms for Mining Association Rules. In: VLDB'94, Santiago, Chile, p. 487–499.
Burdick, D., Calimlim, M. and Gehrke, J., 2001. MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases. In: The 17th International Conference on Data Engineering, Heidelberg, Germany.
Liu, J., Pan, Y., Wang, K. and Han, J., 2002. Mining Frequent Itemsets by Opportunistic Projection. In: The 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Alberta, Canada, p. 229–238.
Pasquier, N., Bastide, Y., Taouil, R. and Lakhal, L., 1998. Pruning Closed Itemset Lattices for Association Rules. In: The BDA French Conference on Advanced Databases, France.
Pasquier, N., Bastide, Y., Taouil, R. and Lakhal, L., 1999. Discovering Frequent Closed Itemsets for Association Rules. In: ICDT'99, Jerusalem, Israel, p. 398–416.
Pei, J., Han, J. and Mao, R., 2000. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. In: The ACM-SIGMOD International Workshop on Data Mining and Knowledge Discovery, Dallas, TX.
Wang, K., Liu, T., Han, J. and Liu, J., 2002. Top Down FP-Growth for Association Rule Mining. In: The 6th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Taipei, p. 334–340.
Zaki, M. J. and Hsiao, C. J., 2002. CHARM: An Efficient Algorithm for Closed Itemset Mining. In: The 2nd SIAM International Conference on Data Mining, Arlington, VA, USA.
Additional information
Project supported by the Ministry of Education (No. 111101-G10110) and the Zhejiang Province Natural Science Foundation (No. 602140), China
About this article
Cite this article
Jun-qiang, L., Yun-he, P. An efficient algorithm for mining closed itemsets. J. Zhejiang Univ.-Sci. 5, 8–15 (2004). https://doi.org/10.1631/BF02839306