Principles of Data Mining and Knowledge Discovery
Volume 1510 of the series Lecture Notes in Computer Science, pp 441-449
Data transformation and rough sets
 Jaroslaw Stepaniuk, Institute of Computer Science, Bialystok University of Technology
 Marcin Maj, Institute of Computer Science, Bialystok University of Technology
Abstract
Knowledge discovery and data mining systems have to face several difficulties, in particular those related to the huge amount of input data. This problem is especially acute for inductive logic programming (ILP) systems, which employ computationally complex algorithms. Learning time can be reduced by feeding the ILP algorithm only a well-chosen portion of the original input data. Such a transformation of the input data should discard unimportant clauses but keep those that are potentially necessary to obtain proper results. In this paper, two approaches to the data reduction problem are proposed. Both are based on rough set theory. Rough set techniques serve as data reduction tools that reduce the size of the input data fed to more time-expensive (search-intensive) ILP techniques. The first approach transforms input clauses into decision table form, then uses reducts to select only meaningful data. The second approach introduces a special kind of approximation space. When properly used, iterated lower and upper approximations of the target concept preferentially select facts that are more relevant to the problem, while discarding facts that are totally unimportant.
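The rough set notions named in the abstract (decision tables, reducts, lower and upper approximations) can be illustrated with a minimal sketch. It uses the standard textbook definitions, not the authors' transformation of ILP clauses or their special approximation space, neither of which is described on this page. The toy table, the attribute names `a`, `b`, `c`, `d`, and all function names are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
rows = [
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
    {"a": 0, "b": 1, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
]

def partition(rows, attrs):
    """Group row indices into indiscernibility classes over attrs."""
    blocks = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, set()).add(i)
    return list(blocks.values())

def approximations(rows, attrs, target):
    """Return (lower, upper) approximations of a target set of row indices."""
    lower, upper = set(), set()
    for block in partition(rows, attrs):
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper

def positive_region(rows, attrs, decision):
    """Rows whose indiscernibility class carries a single decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos |= block
    return pos

def reducts(rows, cond_attrs, decision):
    """Brute-force search for minimal attribute subsets that preserve the
    positive region of the full attribute set (exponential; toy sizes only)."""
    full = positive_region(rows, cond_attrs, decision)
    found = []
    for k in range(1, len(cond_attrs) + 1):
        for subset in combinations(cond_attrs, k):
            if any(set(r) <= set(subset) for r in found):
                continue  # a smaller reduct is contained in it: not minimal
            if positive_region(rows, subset, decision) == full:
                found.append(subset)
    return found

if __name__ == "__main__":
    target = {i for i, r in enumerate(rows) if r["d"] == "yes"}
    print(reducts(rows, ["a", "b", "c"], "d"))
    print(approximations(rows, ["a"], target))
    print(approximations(rows, ["b"], target))
```

In this toy table the decision depends only on `b`, so keeping that single column loses no discriminating information. This is the sense in which reducts can shrink the input before a more expensive learner runs. The brute-force search is chosen for readability; practical reduct computation relies on heuristics.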
 Title
 Data transformation and rough sets
 Book Title
 Principles of Data Mining and Knowledge Discovery
 Book Subtitle
 Second European Symposium, PKDD ’98 Nantes, France, September 23–26, 1998 Proceedings
 Pages
 pp 441-449
 Copyright
 1998
 DOI
 10.1007/BFb0094848
 Print ISBN
 978-3-540-65068-3
 Online ISBN
 978-3-540-49687-8
 Series Title
 Lecture Notes in Computer Science
 Series Volume
 1510
 Series ISSN
 0302-9743
 Publisher
 Springer Berlin Heidelberg
 Copyright Holder
 Springer-Verlag
 Authors

 Jaroslaw Stepaniuk ^{(1)}
 Marcin Maj ^{(1)}
 Author Affiliations

 1. Institute of Computer Science, Bialystok University of Technology, Wiejska 45A, 15-351 Bialystok, Poland