Efficient Mining of Association Rules Based on Formal Concept Analysis

  • Lotfi Lakhal
  • Gerd Stumme
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3626)

Abstract

Association rules are a popular knowledge discovery technique for warehouse basket analysis. They indicate which items of the warehouse are frequently bought together. The problem of association rule mining was first stated in 1993. Five years later, several research groups discovered that this problem has a strong connection to Formal Concept Analysis (FCA). In this survey, we first introduce some basic ideas of this connection by means of a specific algorithm, Titanic, and show how FCA helps to reduce the number of resulting rules without loss of information. We then give a general overview of the history and state of the art of applying FCA to association rule mining.
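The link between frequent itemsets and FCA mentioned in the abstract can be illustrated with a small Python sketch. It is not the Titanic algorithm: it is a brute-force illustration under assumed definitions (support as the fraction of transactions containing an itemset, confidence as a ratio of supports, and the closure of an itemset as the intersection of all transactions that contain it). The toy transactions and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of items bought together.
TRANSACTIONS = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"beer"},
]

def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(premise, conclusion, transactions=TRANSACTIONS):
    """Support of (premise united with conclusion) divided by support of premise."""
    joint = set(premise) | set(conclusion)
    return support(joint, transactions) / support(premise, transactions)

def closure(itemset, transactions=TRANSACTIONS):
    """Items shared by all transactions containing `itemset` (assumed FCA-style closure)."""
    covering = [t for t in transactions if set(itemset) <= t]
    if not covering:
        # No transaction contains the itemset: its closure is the set of all items.
        return frozenset().union(*transactions)
    return frozenset(set.intersection(*covering))

def frequent_closed_itemsets(min_support, transactions=TRANSACTIONS):
    """Brute force: close every frequent itemset and collect the distinct results."""
    items = sorted(set().union(*transactions))
    closed = set()
    for size in range(len(items) + 1):
        for candidate in combinations(items, size):
            if support(candidate, transactions) >= min_support:
                closed.add(closure(candidate, transactions))
    return closed
```

On this toy data, with a minimum support of 0.5, the six frequent itemsets collapse to four closed ones, because {butter} and {bread, butter} occur in exactly the same transactions, as do {milk} and {bread, milk}. An itemset and its closure always have the same support, which is why rules can be derived from the closed itemsets alone without losing information.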

Keywords

Association Rule; Minimum Support; Frequent Itemsets; Association Rule Mining; Formal Concept Analysis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Lotfi Lakhal (1)
  • Gerd Stumme (2)
  1. Département d’Informatique, IUT d’Aix-en-Provence, Aix-en-Provence cedex, France
  2. Chair of Knowledge & Data Engineering, Department of Mathematics and Computer Science, University of Kassel, Kassel, Germany
