Data Mining of Association Rules and the Process of Knowledge Discovery in Databases

Advances in Data Mining

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2394)

Abstract

In this paper we deal with association rule mining in the context of a complex, interactive, and iterative knowledge discovery process. After a general introduction covering the basics of association rule mining and of the knowledge discovery process in databases, we draw attention to the problematic aspects of integrating the two. We conclude that, with regard to human involvement and interactivity, the current situation is far from satisfactory. We tackle this problem on three fronts. First, there is the algorithmic complexity: although today’s algorithms prune the immense search space efficiently, the achieved run times do not allow true interactivity. We therefore present a rule caching schema that significantly reduces the number of mining runs and helps to achieve interactivity even when the mining algorithms have extreme run times. Second, the data to be mined is today typically stored in a relational database management system. We present an efficient integration with modern database systems, which is one of the key factors in practical mining applications. Third, interesting rules must be picked from the set of generated rules. This can be quite costly because the generated rule sets are normally large, whereas useful rules typically make up only a very small fraction of them. We enhance the traditional association rule mining framework to cope with this situation.
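
The abstract does not spell out how the rule caching schema works, so the following is only a minimal sketch of the general idea under stated assumptions: mine once with permissive thresholds, keep the resulting rules, and answer later requests with stricter thresholds by filtering the cached rules instead of starting a new mining run. The toy transactions, the brute-force miner, and the names `mine_rules` and `RuleCache` are hypothetical and are not taken from the chapter.

```python
from itertools import combinations

# Toy transactions (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force miner: try every itemset and every body/head split.

    Exponential in the number of items; it stands in for a real,
    expensive mining algorithm.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            supp = support(itemset, transactions)
            if supp == 0 or supp < min_support:
                continue
            for k in range(1, size):
                for body_items in combinations(combo, k):
                    body = frozenset(body_items)
                    # confidence(body -> head) = supp(body + head) / supp(body)
                    conf = supp / support(body, transactions)
                    if conf >= min_confidence:
                        rules.append((body, itemset - body, supp, conf))
    return rules


class RuleCache:
    """Assumed caching scheme: one permissive mining run, then filtering."""

    def __init__(self, transactions, min_support, min_confidence):
        self.min_support = min_support
        self.min_confidence = min_confidence
        self.mining_runs = 1
        self.rules = mine_rules(transactions, min_support, min_confidence)

    def query(self, min_support, min_confidence):
        """Serve stricter thresholds from the cache without re-mining."""
        if min_support < self.min_support or min_confidence < self.min_confidence:
            raise ValueError("thresholds below the cached run; a new mining run is needed")
        return [
            rule for rule in self.rules
            if rule[2] >= min_support and rule[3] >= min_confidence
        ]


cache = RuleCache(TRANSACTIONS, min_support=0.25, min_confidence=0.5)
strict_rules = cache.query(min_support=0.5, min_confidence=0.9)
```

In this sketch, any number of calls to `query` with thresholds at or above the cached ones costs only a scan over the stored rules. Only a request below the cached thresholds would force another mining run.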

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hipp, J., Güntzer, U., Nakhaeizadeh, G. (2002). Data Mining of Association Rules and the Process of Knowledge Discovery in Databases. In: Perner, P. (eds) Advances in Data Mining. Lecture Notes in Computer Science, vol 2394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46131-0_2

  • DOI: https://doi.org/10.1007/3-540-46131-0_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44116-8

  • Online ISBN: 978-3-540-46131-9

  • eBook Packages: Springer Book Archive
