An Appropriate Abstraction for an Attribute-Oriented Induction
- Cite this paper as:
- Kudoh Y., Haraguchi M. (1999) An Appropriate Abstraction for an Attribute-Oriented Induction. In: Arikawa S., Furukawa K. (eds) Discovery Science. DS 1999. Lecture Notes in Computer Science, vol 1721. Springer, Berlin, Heidelberg
Attribute-oriented induction is a useful data mining method that generalizes databases under an appropriate abstraction hierarchy to extract meaningful knowledge. The hierarchy is designed to exclude rules that are meaningless from a particular point of view. However, there may be several ways of generalizing a database, depending on the user’s intention. It is therefore important to provide a multi-layered abstraction hierarchy under which several generalizations are possible and well controlled. Indeed, databases that are too general or too specific are inappropriate inputs for mining algorithms that extract significant rules. From this viewpoint, this paper proposes a generalization method that uses an information-theoretical measure to select an appropriate abstraction hierarchy. Furthermore, we present a system, called ITA (Information Theoretical Abstraction), based on our method and attribute-oriented induction. We perform practical experiments in which ITA discovers meaningful rules from a census database from the US Census Bureau, and we discuss the validity of ITA based on the experimental results.
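The idea of choosing among several generalizations with an information-theoretical measure can be illustrated with a minimal sketch. The abstract does not state which measure ITA uses, so the code below assumes mutual information between the generalized attribute and a target attribute as a stand-in. All function names, the candidate abstractions, and the four-row toy table are hypothetical and are not taken from the paper or from the census data.

```python
from collections import Counter
from math import log2


def generalize(rows, attr, mapping):
    """Lift each value of `attr` to its abstract concept and merge identical tuples.

    Returns a Counter mapping each generalized tuple to the number of
    original rows it covers (the usual attribute-oriented induction step).
    """
    counts = Counter()
    for row in rows:
        lifted = dict(row)
        lifted[attr] = mapping.get(row[attr], row[attr])
        counts[tuple(sorted(lifted.items()))] += 1
    return counts


def entropy(counter):
    """Shannon entropy (in bits) of the empirical distribution in `counter`."""
    total = sum(counter.values())
    return -sum((c / total) * log2(c / total) for c in counter.values() if c)


def mutual_information(rows, attr, target, mapping):
    """I(A'; T) = H(A') + H(T) - H(A', T), with A' the generalized attribute.

    Assumption: this measure is a stand-in, not the measure defined in the paper.
    """
    a = Counter(mapping.get(r[attr], r[attr]) for r in rows)
    t = Counter(r[target] for r in rows)
    joint = Counter((mapping.get(r[attr], r[attr]), r[target]) for r in rows)
    return entropy(a) + entropy(t) - entropy(joint)


def select_abstraction(rows, attr, target, candidates):
    """Return the name of the candidate abstraction that keeps the most information."""
    return max(
        candidates,
        key=lambda name: mutual_information(rows, attr, target, candidates[name]),
    )


# Hypothetical toy table (illustration only).
ROWS = [
    {"occupation": "clerk", "income": "low"},
    {"occupation": "cashier", "income": "low"},
    {"occupation": "engineer", "income": "high"},
    {"occupation": "doctor", "income": "high"},
]

# Two hypothetical ways of abstracting the same attribute.
CANDIDATES = {
    "by_skill": {
        "clerk": "service", "cashier": "service",
        "engineer": "professional", "doctor": "professional",
    },
    "by_place": {
        "clerk": "office", "engineer": "office",
        "cashier": "site", "doctor": "site",
    },
}

best = select_abstraction(ROWS, "occupation", "income", CANDIDATES)
generalized = generalize(ROWS, "occupation", CANDIDATES[best])
```

In this toy setting, the `by_skill` grouping preserves the full relationship between occupation and income, while `by_place` destroys it, so the selection step prefers `by_skill`. The generalized table then holds two merged tuples, each covering two original rows.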