A new and versatile method for association generation
Applications that require associations with very small support have prohibitively large running times.
Existing algorithms assume a static database. Some applications require generating associations in real time from a dynamic database, where transactions are constantly being added and deleted. No existing algorithm accommodates such applications.
Existing algorithms can find only associations in which a conjunction of attributes implies a conjunction of different attributes. In many cases, however, a conjunction of attributes implies another conjunction only when certain other attributes are excluded. To our knowledge, no current algorithm can generate such excluding associations.
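To make the notion of an excluding association concrete, the following is a minimal sketch that measures support and confidence by a naive scan over transactions with binary attributes. It is not the algorithm presented in this paper; the function names `support` and `confidence`, the `exclude` parameter, and the sample documents are all hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]],
            include: Iterable[str],
            exclude: Iterable[str] = ()) -> int:
    """Count transactions that contain every attribute in `include`
    and none of the attributes in `exclude` (naive linear scan)."""
    inc, exc = frozenset(include), frozenset(exclude)
    return sum(1 for t in transactions if inc <= t and not (exc & t))


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str],
               exclude: Iterable[str] = ()) -> float:
    """Confidence of `antecedent => consequent`, restricted to
    transactions that contain none of the `exclude` attributes."""
    ante = frozenset(antecedent)
    base = support(transactions, ante, exclude)
    if base == 0:
        return 0.0
    both = support(transactions, ante | frozenset(consequent), exclude)
    return both / base


# Hypothetical keyword sets for five documents.
docs = [
    frozenset({"apple", "fruit"}),
    frozenset({"apple", "fruit", "juice"}),
    frozenset({"apple", "computer"}),
    frozenset({"apple", "computer", "software"}),
    frozenset({"banana", "fruit"}),
]

plain = confidence(docs, {"apple"}, {"fruit"})
excluding = confidence(docs, {"apple"}, {"fruit"}, exclude={"computer"})
```

In this toy data the plain association "apple implies fruit" holds in only half of the documents that mention "apple", whereas the same association holds in all of them once documents mentioning "computer" are excluded. Because each query rescans the whole list, adding or deleting a transaction needs no extra bookkeeping, but the cost per query grows with the size of the database.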
We present a novel method for association generation that addresses all three of the above desiderata. Our method is inherently different from all existing algorithms and is especially suitable for textual databases with binary attributes. At the heart of our algorithm lies the use of subword trees for quick indexing into the required database statistics. We tested our algorithm on the Reuters-22173 database with satisfactory results.