Abstract
The problem of feature subset selection can be defined as selecting a relevant subset of features that allows a learning algorithm to induce small, high-accuracy concepts. Two main approaches have been developed to achieve this goal: the filter approach, which is algorithm-independent and considers only the data, and the wrapper approach, which takes into account both the data and a given learning algorithm. Recent work has studied the usefulness of rough set theory, and more particularly its notions of reduct and core, for feature subset selection. Different methods have been proposed to select features using both the core and the reduct concepts, whereas other research shows that a useful feature subset does not necessarily contain all features in the core. In this paper, we underline the fact that rough set theory is concerned with the deterministic analysis of attribute dependencies, which underlie the two notions of reduct and core. We extend the notion of dependency so that both deterministic and non-deterministic dependencies can be found. A new notion of strong reducts is then introduced and leads to the definition of strong feature subsets (SFS). The value of SFS is illustrated by the improved accuracy of our learning system, called Alpha, on real-world datasets.
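To make the notions of dependency, reduct, and core concrete, the following is a minimal sketch of the classical (deterministic) rough set definitions only; it does not implement the extended dependency or the strong reducts introduced in the paper. The decision table is invented toy data, and the names `dependency`, `reducts`, and `core` are hypothetical helpers chosen for illustration. The reduct search is brute force and exponential in the number of features.

```python
from itertools import combinations

# Hypothetical toy decision table: (condition attribute values, decision).
# Attribute "c" duplicates "b", so two different reducts exist.
ROWS = [
    ({"a": 0, "b": 0, "c": 0}, "no"),
    ({"a": 0, "b": 1, "c": 1}, "yes"),
    ({"a": 1, "b": 0, "c": 0}, "yes"),
    ({"a": 1, "b": 1, "c": 1}, "yes"),
]
FEATURES = ("a", "b", "c")


def positive_region_size(subset, rows=ROWS):
    """Count rows whose indiscernibility class (w.r.t. subset) has one decision."""
    classes = {}
    for values, decision in rows:
        key = tuple(values[f] for f in subset)
        classes.setdefault(key, []).append(decision)
    return sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)


def dependency(subset, rows=ROWS):
    """Classical dependency degree: share of rows in the positive region."""
    return positive_region_size(subset, rows) / len(rows)


def reducts(features=FEATURES, rows=ROWS):
    """Minimal subsets that preserve the positive region of the full feature set."""
    target = positive_region_size(features, rows)
    found = []
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            preserves = positive_region_size(subset, rows) == target
            minimal = not any(set(r) <= set(subset) for r in found)
            if preserves and minimal:
                found.append(subset)
    return found


def core(features=FEATURES, rows=ROWS):
    """Core: the features common to every reduct."""
    return set.intersection(*(set(r) for r in reducts(features, rows)))


if __name__ == "__main__":
    print(dependency(("a",)))  # 0.5
    print(reducts())           # [('a', 'b'), ('a', 'c')]
    print(core())              # {'a'}
```

On this toy table, feature `a` alone determines the decision for only half of the rows, while either `b` or `c` completes it, so the core contains `a` only. A subset that leaves some rows undetermined, as `("a",)` does here, is the kind of partial dependency that a purely deterministic analysis discards.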
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Quafafou, M., Boussouf, M. (1997). Induction of strong feature subsets. In: Komorowski, J., Zytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1997. Lecture Notes in Computer Science, vol 1263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63223-9_138
DOI: https://doi.org/10.1007/3-540-63223-9_138
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63223-8
Online ISBN: 978-3-540-69236-2
eBook Packages: Springer Book Archive