Discovering Frequent Closed Itemsets for Association Rules

  • Nicolas Pasquier
  • Yves Bastide
  • Rafik Taouil
  • Lotfi Lakhal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1540)

Abstract

In this paper, we address the problem of finding frequent itemsets in a database. Using the closed itemset lattice framework, we show that this problem can be reduced to the problem of finding frequent closed itemsets. Based on this result, efficient data mining algorithms can be constructed by limiting the search space to the closed itemset lattice rather than the subset lattice. Moreover, we show that the set of all frequent closed itemsets suffices to determine a reduced set of association rules, thus addressing another important data mining problem: limiting the number of rules produced without information loss. We propose a new algorithm, called A-Close, that uses a closure mechanism to find frequent closed itemsets. We conducted experiments to compare our approach to the commonly used frequent itemset search approach. These experiments showed that our approach is very valuable for dense and/or correlated data, which represent an important part of existing databases.
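The notion of a closed itemset can be illustrated with a small sketch. The code below is not the A-Close algorithm from the paper; it is a brute-force illustration that assumes the usual definition of closure (the intersection of all transactions containing an itemset) and uses a toy dataset chosen only for demonstration. All function names (`support_count`, `closure`, `frequent_closed_itemsets`, `support_from_closed`) are hypothetical.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Closure of `itemset`: the items shared by all transactions containing it."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # Assumed convention: an itemset found in no transaction closes to all items.
        return frozenset().union(*transactions)
    return frozenset.intersection(*covering)


def frequent_closed_itemsets(transactions, min_count):
    """Brute-force enumeration (exponential; for illustration only).

    Returns a dict mapping each frequent closed itemset to its support count.
    """
    items = sorted(frozenset().union(*transactions))
    closed = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support_count(candidate, transactions)
            if count >= min_count:
                # An itemset and its closure occur in exactly the same transactions,
                # so they share the same support count.
                closed[closure(candidate, transactions)] = count
    return closed


def support_from_closed(itemset, closed):
    """Recover the support of a frequent itemset from the closed itemsets alone.

    The support of an itemset equals that of its smallest closed superset,
    which is the largest support among all closed supersets.
    Returns None when no frequent closed superset exists (itemset is infrequent).
    """
    counts = [count for cset, count in closed.items() if itemset <= cset]
    return max(counts) if counts else None


# Toy data (an assumption for illustration): five transactions over items A..E.
transactions = [frozenset(t) for t in ("ACD", "BCE", "ABCE", "BE", "ABCE")]
closed = frequent_closed_itemsets(transactions, min_count=2)
```

On this toy data there are fifteen frequent itemsets but only five frequent closed itemsets, and `support_from_closed` recovers the support of any frequent itemset from those five. This is the sense in which restricting the search to closed itemsets loses no information.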

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Nicolas Pasquier (1)
  • Yves Bastide (1)
  • Rafik Taouil (1)
  • Lotfi Lakhal (1)
  1. Laboratoire d’Informatique (LIMOS), Université Blaise Pascal - Clermont-Ferrand II, Complexe Scientifique des Cézeaux, Aubière Cedex, France
