Ontology-Driven Induction of Decision Trees at Multiple Levels of Abstraction
Most learning algorithms for data-driven induction of pattern classifiers (e.g., decision tree algorithms) represent input patterns at a single level of abstraction, usually as an ordered tuple of attribute values. However, in many applications of inductive learning (e.g., scientific discovery), users often need to explore a data set at multiple levels of abstraction and from different points of view. Each point of view corresponds to a set of ontological (and representational) commitments regarding the domain of interest. The choice of an ontology induces a set of representations of the data and a set of transformations of the hypothesis space. This paper formalizes the problem of inductive learning using ontologies and data; describes an ontology-driven decision tree learning algorithm that learns classification rules at multiple levels of abstraction; and presents preliminary results that demonstrate the feasibility of the proposed approach.
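To make the idea of multiple levels of abstraction concrete, the following is a minimal sketch, not the algorithm described in the paper. It assumes the ontology is a simple is-a taxonomy over the values of one attribute, and it uses plain information gain to compare the same attribute at different levels. The taxonomy, the toy data, and the names `TAXONOMY`, `lift`, and `gain_by_level` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2

# Hypothetical is-a taxonomy over attribute values (child -> parent).
# This is an illustrative assumption, not taken from the paper.
TAXONOMY = {
    "red": "warm", "orange": "warm",
    "blue": "cool", "green": "cool",
    "warm": "color", "cool": "color",
}


def lift(value, levels):
    """Replace a value by its ancestor `levels` steps up the taxonomy.

    Values with no parent (the root) are left unchanged.
    """
    for _ in range(levels):
        value = TAXONOMY.get(value, value)
    return value


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of splitting `labels` by the attribute `values`."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_by_level(values, labels, max_levels):
    """Information gain of the attribute after lifting it 0..max_levels steps."""
    return {
        k: information_gain([lift(v, k) for v in values], labels)
        for k in range(max_levels + 1)
    }


# Toy data (hypothetical): one attribute and a binary class label.
VALUES = ["red", "orange", "blue", "green", "red", "blue"]
LABELS = ["pos", "pos", "neg", "neg", "pos", "neg"]
GAINS = gain_by_level(VALUES, LABELS, 2)
```

In this toy data the attribute separates the classes equally well at the leaf level and one level up (`warm` vs. `cool`), so a learner could prefer the more abstract split and produce a shorter rule. At the root level every value collapses to `color` and the attribute carries no information.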
Keywords: Decision Tree, Information Gain, Candidate Attribute, Decision Tree Algorithm, Hypothesis Space