Date: 26 Feb 1999

Learning Logical Descriptions for Document Understanding: A Rough Sets-Based Approach

Abstract

Inductive learning systems in a logical framework run into difficulties when dealing with huge amounts of information. In particular, the learning cost increases greatly, and it becomes difficult to find descriptions of concepts in a reasonable time. In this paper, we present a learning approach based on Rough Set Theory (RST), and more specifically on its basic notion of concept approximation. In accordance with RST, the learning process is split into three steps: (1) partitioning of knowledge, (2) approximation of the target concept, and (3) induction of a logical description of this concept. The second step, approximation, reduces the volume of the learning data by computing well-chosen portions of the background knowledge that represent approximations of the concept to be learned. Only one of these portions is then used during the induction of the description, which reduces the learning cost. In the first part of this paper, we report how RST’s basic notions, namely indiscernibility and the lower and upper approximations of a concept, have been adapted to a logical framework. In the remainder of the paper, we give empirical results obtained with a concrete implementation of the approach, the EAGLE system. These results show the relevance of the approach, in terms of learning cost gain, on a learning problem related to document understanding.
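
To make the first two steps concrete, the following is a minimal sketch of partitioning and concept approximation in classical, attribute-value Rough Set Theory. It is not the logical adaptation described in the paper, nor the EAGLE system itself. The function names, the toy layout blocks, and the "title" concept are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition(objects, attributes):
    """Step 1: group object ids into indiscernibility classes.

    Two objects fall into the same class when they have identical
    values on every attribute in `attributes`.
    """
    classes = defaultdict(set)
    for obj_id, description in objects.items():
        key = tuple(description[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(classes, concept):
    """Step 2: compute the lower and upper approximations of `concept`.

    Lower: union of the classes entirely contained in the concept.
    Upper: union of the classes that share at least one object with it.
    """
    lower, upper = set(), set()
    for cls in classes:
        if cls <= concept:
            lower |= cls
        if cls & concept:
            upper |= cls
    return lower, upper


# Hypothetical toy data: layout blocks of a document page.
blocks = {
    "b1": {"position": "top", "font": "large"},
    "b2": {"position": "top", "font": "large"},
    "b3": {"position": "top", "font": "small"},
    "b4": {"position": "bottom", "font": "small"},
}
titles = {"b1", "b3"}  # hypothetical target concept

classes = partition(blocks, ["position", "font"])
lower, upper = approximations(classes, titles)
print(sorted(lower))  # blocks that certainly belong to the concept
print(sorted(upper))  # blocks that possibly belong to the concept
```

In this toy setting, b1 and b2 are indiscernible, so b1 cannot be placed in the lower approximation even though it belongs to the concept. Under the approach outlined in the abstract, the induction step would then work on only one such portion of the data rather than on all of the background knowledge.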