
Data Mining of Association Rules and the Process of Knowledge Discovery in Databases

  • Jochen Hipp
  • Ulrich Güntzer
  • Gholamreza Nakhaeizadeh
Chapter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2394)

Abstract

In this paper we deal with association rule mining in the context of a complex, interactive and iterative knowledge discovery process. After a general introduction covering the basics of association rule mining and of the knowledge discovery process in databases, we draw attention to the problematic aspects of integrating the two. We come to the conclusion that, with regard to human involvement and interactivity, the current situation is far from satisfactory. We tackle this problem from three sides. First, there is the algorithmic complexity: although today’s algorithms prune the immense search space efficiently, the achieved run times do not allow true interactivity. We therefore present a rule caching schema that significantly reduces the number of mining runs and helps to gain interactivity even when the mining algorithms have extreme run times. Second, the data to be mined is today typically stored in a relational database management system. We present an efficient integration with modern database systems, which is one of the key factors in practical mining applications. Third, interesting rules must be picked from the set of generated rules. This can be quite costly, because the generated rule sets are normally large whereas useful rules typically make up only a very small fraction of them. We enhance the traditional association rule mining framework in order to cope with this situation.

Keywords

Association Rule, Mining Algorithm, Frequent Itemsets, Association Rule Mining, Generalize Association Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Jochen Hipp (1)
  • Ulrich Güntzer (2)
  • Gholamreza Nakhaeizadeh (1)
  1. DaimlerChrysler AG, Research & Technology, Ulm, Germany
  2. Wilhelm Schickard-Institute, University of Tübingen, Tübingen, Germany
