Skip to main content
Log in

Generating a Condensed Representation for Association Rules

  • Published:
Journal of Intelligent Information Systems Aims and scope Submit manuscript

Abstract

Association rule extraction from operational datasets often produces several tens of thousands, and even millions, of association rules. Moreover, many of these rules are redundant and thus useless. Using a semantic based on the closure of the Galois connection, we define a condensed representation for association rules. This representation is characterized by frequent closed itemsets and their generators. It contains the non-redundant association rules having minimal antecedent and maximal consequent, called min-max association rules. We think that these rules are the most relevant since they are the most general non-redundant association rules. Furthermore, this representation is a basis, i.e., a generating set for all association rules, their supports and their confidences, and all of them can be retrieved needless accessing the data. We introduce algorithms for extracting this basis and for reconstructing all association rules. Results of experiments carried out on real datasets show the usefulness of this approach. In order to generate this basis when an algorithm for extracting frequent itemsets—such as Apriori for instance—is used, we also present an algorithm for deriving frequent closed itemsets and their generators from frequent itemsets without using the dataset.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining Association Rules Between Sets of Items in Large Databases. In Proceedings of the SIGMOD Conference, pp. 207–216.

  • Agrawal, R. and Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the VLDB Conference, pp. 478–499.

  • Armstrong, W.W. (1974). Dependency Structures of Data Base Relationships. In Proceedings of the IFIP Congress, pp. 580–583.

  • Baralis, E. and Psaila, G. (1997). Designing Templates for Mining Association Rules. Journal of Intelligent Information Systems, 9(1), 7–32; L. Kerschberg, Z. Ras, and M. Zemankova (Eds.), Kluwer Academic Publishers.

    Google Scholar 

  • Bastide, Y., Taouil, Y., Pasquier, N., Stumme, G., and Lakhal, L. (2000). Mining Frequent Patterns with Counting Inference. SIGKDD Explorations, 2(2), 66–75; U. Fayyad, S. Sarawagi and P. Bradley (Eds.), ACMComputer Press.

  • Bayardo, R.J. (1998). Efficiently Mining Long Patterns from Databases. In Proceedings of the SIGMOD Conference, pp. 85–93.

  • Bayardo, R.J. and Agrawal, R. (1999). Mining the Most Interesting Rules.In Proceedings of the KDD Conference, pp. 145–154.

  • Bayardo, R.J., Agrawal, R., and Gunopulos, D. (2000).Constraint-Based Rule Mining in Large, Dense Databases. Data Mining and Knowledge Discovery, 4(2/3), 217–240; S. Chaudhuri (Ed.), Kluwer Academic Publishers.

  • Beeri, C. and Bernstein, P.A. (1979). Computational Problems Related to the Design of Normal form Relational Schemas. Transactions on Database Systems, 4(1), 30–59.

    Google Scholar 

  • Blake, C.L. and Merz, C.J. (1998). UCI Machine Learning Databases Repository. University of California, Irvine, Department of Information and Computer Science, http://www.ics.uci.edu/~mlearn/MLRepository.html.

  • Brin, S., Motwani, R., and Silverstein, C. (1997.) Beyond Market Baskets: Generalizing Association Rules to Correlation. In Proceedings of the SIGMOD Conference, pp. 265–276.

  • Brin, S., Motwani, R., Ullman, J.D., and Tsur, S. (1997). Dynamic Itemset Counting and Implication Rules for Market Basket Data. In Proceedings of the SIGMOD Conference, pp. 255–264.

  • Duquenne, V. and Guigues, J.-L. (1986). Famille Minimale d’implications Informatives résultant d’un Tableau de données Binaires.Mathématiques et Sciences Humaines, 24(95), 5–18.

    Google Scholar 

  • Ganter, B. and Wille, R. (1999.) Formal Concept Analysis: Mathematical Foundations. Springer-Verlag.

  • Han, J. and Fu, Y. (1999). Mining Multiple-Level Association Rules in Large Databases. Transactions on Knowledge and Data Engineering, 11(5), 798–804; P.-S. Yu (Ed.), IEEE Computer Science

  • Hettich, S. and Bay, S.D. (1999.) UCI Knowledge Discovery in Databases Archive. University of California, Irvine, Department of Information and Computer Science,http://kdd.ics.uci.edu

  • Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., and Verkamo, A.I. (1994). Finding interesting Rules from Large Sets of Discovered Association Rules. In Proceedings of the CIKM conference, pp. 401–407.

  • Lin, D. and Kedem, Z.M. (1998). Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set. In Proceedings of the EDBT Conference, pp. 105–119.

  • Liu, B., Hsu, W., and Ma, Y. (1999). Pruning and Summarizing the Discovered Association Rules. In Proceedings of the KDD Conference, pp. 125–134.

  • Luxenburger, M. (1991). Implications Partielles Dans un Contexte. Mathématiques, Informatique et Sciences Humaines, 29(113), 35–55.

    Google Scholar 

  • Maier, D. (1980). Minimum Covers in Relational Database Model. Journal of the ACM, 27(4), 664–674; ACM Computer Press.

    Google Scholar 

  • Mannila, H. and Toivonen, H. (1996). Multiple Uses of Frequent Sets and Condensed Representations. In Proceedings of the KDD Conference, pp. 189–194.

  • Mannila, H. and Toivonen, H. (1997). Levelwise Search and Borders of Theories in Knowledge Discovery. Data Mining and Knowledge Discovery, 1(3), 241–258; U. Fayyad (Ed.), Kluwer Academic Publishers.

  • Mannila, H., Toivonen, H., and Verkamo, A.I. (1994). Efficient Algorithms for Discovering Association Rules. In Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, pp. 181–192.

  • Meo, R., Psaila, G., and Ceri, S. (1998). An Extension to SQL for MiningAssociation Rules. Data Mining and Knowledge Discovery, 2(2), 195–224, U. Fayyad (Ed.), Kluwer Academic Publishers.

  • Morimoto, Y., Fukuda, T., Matsuzawa, H., Tokuyama, T., and Yoda, K. (1998). Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Proceedings of the VLDB Conference, pp. 380–391.

  • Ng, R.T., Lakshmanan, V.S., Han, J., and Pang, A. (1998). Exploratory Mining and Pruning Optimizations of Constrained Association Rules. In Proceedings ofthe SIGMOD Conference, pp. 13–24.

  • Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1998). Pruning Closed Itemset Lattices for Association Rules. In Proceedings of the BDA Conference, pp. 177–196.

  • Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Discovering Frequent Closed Itemsets for Association Rules. In Proceedings of the ICDT Conference, LNCS 1540, pp. 398–416.

  • Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Efficient Mining of Association Rules Using Closed Itemset Lattices. Information Systems, 24(1), 25–46; M. Jarke and D. Shasha (Eds.), Elsevier Science.

    Google Scholar 

  • Pasquier, N., Bastide, Y., Taouil, R., and Lakhal, L. (1999). Closed SetBased Discovery of Small Covers for Association Rules. In Proceedings of the BDA Conference, pp. 361–381.

  • Pei, J., Han, J., and Mao, R. (2000). Closet: An Efficient Algorithm for Mining Frequent Closed Itemsets. In Proceedings of the DMKD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21–30.

  • Piatetsky-Shapiro, G. and Matheus, C.J. (1994). The Interestingness of Deviations. In Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, pp. 25–36.

  • Silberschatz, A. and Tuzhilin, A. (1996). What Makes Patterns Interesting in Knowledge Discovery Systems. Transactions on Knowledge and Data Engineering, 8(6), 970–974; IEEE Computer Science.

    Google Scholar 

  • Silverstein, C., Brin, S., and Motwani, R. (1998). Beyond Market Baskets: Generalizing Association Rules to Dependence Rules. Data Miningand Knowledge Discovery, 2(1), 39–68; U. Fayyad (Ed.), Kluwer AcademicPublishers.

  • Srikant, R. and Agrawal, R. (1995). Mining Generalized Association Rules. In Proceedings of the VLDB Conference, pp. 407–419.

  • Srikant, R. and Agrawal, R. (1996). Mining Quantitative Association Rules in Large Relational Tables. In Proceedings of the SIGMOD Conference, pp. 1–12.

  • Srikant, R., Vu, Q., and Agrawal, R. (1997). Mining Association Rules with Item Constraints. In Proceedings of the KDD Conference, pp. 67–73.

  • Taouil, R., Pasquier, N., Bastide, Y., and Lakhal, L. (2000). Mining Bases for Association Rules Using Closed Sets. In Proceedings of theICDE Conference, p. 307.

  • Toivonen, H., Klemettinen, M., Ronkainen, P., Hätönen, K.,and Mannila, H. (1995). Pruning and Grouping Discovered Association Rules. In Proceedings of the ECML Workshop, pp. 47–52.

  • Zaki, M.J. (2000). Generating Non-Redundant Association Rules. In Proceedings of the KDD Conference, pp. 34–43.

  • Zaki, M.J. and Hsiao, C.-J. (1999). ChARM: An Efficient Algorithm for Closed Association Rule Mining. Technical Report 99-10.

  • Zaki, M.J. and Ogihara, M. (1998). Theoretical Foundations of Association Rules. In Proceedings of the DMKD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 7:1–7:8.

  • Zaki, M.J., Parthasarathy, S., Ogihara, M., and Li, W. (1997). New Algorithms for Fast Discovery of Association Rules. In Proceedings of the KDD Conference, pp. 283–286.

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Nicolas Pasquier.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Pasquier, N., Taouil, R., Bastide, Y. et al. Generating a Condensed Representation for Association Rules. J Intell Inf Syst 24, 29–60 (2005). https://doi.org/10.1007/s10844-005-0266-z

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10844-005-0266-z

Keywords

Navigation