LCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets Using a Compact Graph-Based Representation

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

Frequent itemset mining is one of the fundamental techniques for data mining and knowledge discovery. In the last decade, a number of efficient algorithms have been presented for frequent itemset mining, but most of them focus only on enumerating the itemsets that satisfy the given conditions. How to store and index the mining results so that they can be analyzed efficiently is a separate matter.
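To make the enumeration task concrete, here is a minimal brute-force sketch in Python. It is not the LCM algorithm and makes no attempt at efficiency; the function name `frequent_itemsets`, the toy transactions, and the support threshold are assumptions chosen for illustration only.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Map each itemset found in at least `min_support` transactions to its support.

    Brute-force illustration only (hypothetical helper, not the LCM algorithm).
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                result[cset] = support
                found_any = True
        if not found_any:
            # Every subset of a frequent itemset is frequent, so if no
            # itemset of this size is frequent, no larger one can be.
            break
    return result

# Toy database of four transactions over the items a, b, c (assumed data).
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]
freq = frequent_itemsets(transactions, min_support=2)
```

With a minimum support of 2, the frequent itemsets of this toy database are {a}, {b}, {c}, {a, b} and {a, c}. The result here is a flat dictionary, which is the kind of plain enumeration whose storage and indexing the abstract identifies as a separate matter.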

In this paper, we propose a fast algorithm for generating very large-scale all/closed/maximal frequent itemsets using Zero-suppressed BDDs (ZBDDs), a compact graph-based data structure. Our method, “LCM over ZBDDs,” is based on LCM, one of the most efficient state-of-the-art algorithms proposed thus far. It not only enumerates the itemsets but also generates a compact output data structure in main memory. The result can be efficiently postprocessed by using algebraic ZBDD operations. The original LCM is known as an output-linear-time algorithm, but our new method requires time sub-linear in the number of frequent patterns when the ZBDD-based data compression works well. Our method will greatly accelerate the data mining process, and this will lead to a new style of on-memory processing for dealing with knowledge discovery problems.
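The following Python sketch illustrates how a ZBDD can store a family of itemsets compactly. It is a toy under stated assumptions, not the authors' implementation: the class `ZBDD` and its methods are hypothetical names, items are assumed to be integers ordered by value, and union stands in for the wider set of algebraic ZBDD operations.

```python
class ZBDD:
    """Toy ZBDD manager (illustrative sketch, not the paper's code).

    Node id 0 is the empty family, node id 1 is the family holding only the
    empty set; every other id indexes a (var, lo, hi) triple in self.nodes.
    """

    def __init__(self):
        self.nodes = [None, None]      # placeholders for the two terminals
        self.unique = {}               # (var, lo, hi) -> node id, for sharing
        self.union_cache = {}
        self.count_cache = {0: 0, 1: 1}

    def node(self, var, lo, hi):
        if hi == 0:                    # zero-suppression rule
            return lo
        key = (var, lo, hi)
        if key not in self.unique:     # share structurally equal nodes
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def top(self, f):
        # Terminals sort after every real variable.
        return self.nodes[f][0] if f > 1 else float("inf")

    def itemset(self, items):
        """Family containing exactly one itemset."""
        f = 1
        for var in sorted(items, reverse=True):
            f = self.node(var, 0, f)
        return f

    def union(self, f, g):
        if f == 0 or f == g:
            return g
        if g == 0:
            return f
        key = (min(f, g), max(f, g))
        if key not in self.union_cache:
            fv, gv = self.top(f), self.top(g)
            if fv < gv:
                _, lo, hi = self.nodes[f]
                r = self.node(fv, self.union(lo, g), hi)
            elif gv < fv:
                _, lo, hi = self.nodes[g]
                r = self.node(gv, self.union(f, lo), hi)
            else:
                _, flo, fhi = self.nodes[f]
                _, glo, ghi = self.nodes[g]
                r = self.node(fv, self.union(flo, glo), self.union(fhi, ghi))
            self.union_cache[key] = r
        return self.union_cache[key]

    def count(self, f):
        """Number of itemsets in the family rooted at f."""
        if f not in self.count_cache:
            _, lo, hi = self.nodes[f]
            self.count_cache[f] = self.count(lo) + self.count(hi)
        return self.count_cache[f]


# A small family of itemsets over items 1, 2, 3 (assumed toy data).
z = ZBDD()
family = 0
for s in [{1}, {2}, {3}, {1, 2}, {1, 3}]:
    family = z.union(family, z.itemset(s))

# All 2**20 subsets of 20 items, built directly with one node per item.
z2 = ZBDD()
power = 1
for var in reversed(range(20)):
    power = z2.node(var, power, power)
```

In the second example, `z2.count(power)` is 2**20 while the manager holds only 20 non-terminal nodes, and the count itself is computed by visiting each node once. This is the favorable case the abstract refers to: when sharing is this effective, work proportional to the ZBDD size can be far smaller than the number of itemsets represented. Real mining results will generally compress less well than a full power set.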

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Minato, Si., Uno, T., Arimura, H. (2008). LCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets Using a Compact Graph-Based Representation. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_22

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science, Computer Science (R0)
