Attribute selection strategies for attribute-oriented generalization
We describe and compare attribute-selection strategies for attribute-oriented generalization (AOG). AOG summarizes the information in a relational database by repeatedly replacing specific attribute values with more general concepts. Several strategies for selecting the next attribute to generalize have been suggested in the literature, but their relative merits have not previously been assessed. Here, we evaluate the usefulness and efficiency of previously proposed and new strategies.
We implemented and tested ten attribute-selection strategies and compared their performance using criteria that consider their ability to produce interesting results efficiently. Our measures of interestingness consider the structure of the concept hierarchies, defined by domain experts, that guide generalization. In the experimental comparison, a strategy that considers the complexity of the concept hierarchies provided efficient and effective guidance towards interesting results.
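To make the generalization loop concrete, the following is a minimal Python sketch of AOG with a pluggable attribute-selection step. It is an illustration under stated assumptions, not the implementation evaluated here: the hierarchies, the attribute names, the tuple-count threshold `max_tuples`, and the "most distinct values" selection rule are all hypothetical and chosen only to show where a selection strategy fits in the loop.

```python
from collections import Counter

# Hypothetical concept hierarchies: each maps a value to its parent concept.
# Values with no entry are treated as the root of their hierarchy.
HIERARCHIES = {
    "city": {"Regina": "Saskatchewan", "Saskatoon": "Saskatchewan",
             "Calgary": "Alberta", "Edmonton": "Alberta",
             "Saskatchewan": "Canada", "Alberta": "Canada"},
    "degree": {"BSc": "undergraduate", "BA": "undergraduate",
               "MSc": "graduate", "PhD": "graduate",
               "undergraduate": "ANY", "graduate": "ANY"},
}


def generalize(relation, attr_index, hierarchy):
    """Replace each value of one attribute by its parent and merge duplicates."""
    merged = Counter()
    for row, count in relation.items():
        value = row[attr_index]
        parent = hierarchy.get(value, value)  # root values stay unchanged
        new_row = row[:attr_index] + (parent,) + row[attr_index + 1:]
        merged[new_row] += count  # identical generalized tuples are combined
    return merged


def distinct_values(relation, attr_index):
    """Return the set of values currently present for one attribute."""
    return {row[attr_index] for row in relation}


def select_most_distinct(relation, attrs):
    """Assumed strategy: pick the generalizable attribute with most distinct values."""
    candidates = [
        i for i, name in enumerate(attrs)
        if any(v in HIERARCHIES[name] for v in distinct_values(relation, i))
    ]
    if not candidates:
        return None  # every attribute is already at the top of its hierarchy
    return max(candidates, key=lambda i: len(distinct_values(relation, i)))


def aog(rows, attrs, max_tuples, select=select_most_distinct):
    """Generalize one attribute at a time until at most max_tuples remain."""
    relation = Counter(rows)
    while len(relation) > max_tuples:
        i = select(relation, attrs)
        if i is None:
            break
        relation = generalize(relation, i, HIERARCHIES[attrs[i]])
    return relation
```

The `select` parameter is the point of variation: a different strategy, for example one that inspects the shape of each hierarchy rather than the current value counts, could be passed in without changing the rest of the loop.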
Keywords: knowledge acquisition, learning, knowledge representation, applications