Using Rough Sets with Heuristics for Feature Selection

Zhong, Ning; Dong, Juzhen; Ohsuga, Setsuo

doi:10.1023/A:1011219601502

Published: August 2001

Volume 16, pages 199–214, (2001)

Journal of Intelligent Information Systems

Abstract

Practical machine learning algorithms are known to degrade in performance (prediction accuracy) when faced with many features (the term attribute is sometimes used instead of feature) that are not necessary for rule discovery. To cope with this problem, many methods for selecting a subset of features have been proposed. Two typical ones are the filter approach, which selects a feature subset in a preprocessing step, and the wrapper approach, which selects an optimal feature subset from the space of possible feature subsets using the induction algorithm itself as part of the evaluation function. Although the filter approach is faster, it is somewhat blind in that it does not consider the performance of induction. The wrapper approach, on the other hand, can obtain optimal feature subsets, but it is not easy to use because of its time and space complexity. In this paper, we propose an algorithm that uses rough set theory with greedy heuristics for feature selection. The selection of features is similar to the filter approach, but the evaluation criterion is related to the performance of induction. That is, we select the features that do not damage the performance of induction.
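
To make the idea of greedy, rough-set-based selection concrete, the following is a minimal sketch and not the algorithm proposed in the paper, whose details are not given in the abstract. It assumes a standard rough-set measure, the dependency degree (the fraction of objects whose equivalence class under the chosen features maps to a single decision value), as the evaluation criterion, and adds one feature at a time until the selected subset discerns decisions as well as the full feature set. The function names, the toy table, and its column names (`fever`, `cough`, `ward`, `flu`) are hypothetical.

```python
from collections import defaultdict


def positive_region_size(rows, features, decision):
    """Count rows whose equivalence class under `features` has one decision value."""
    classes = defaultdict(list)
    for row in rows:
        # Rows that agree on every chosen feature fall into the same class.
        key = tuple(row[f] for f in features)
        classes[key].append(row[decision])
    return sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)


def dependency(rows, features, decision):
    """Dependency degree: share of rows classified consistently by `features`."""
    return positive_region_size(rows, features, decision) / len(rows)


def greedy_select(rows, candidates, decision):
    """Greedy forward selection (assumed heuristic, not the paper's criterion)."""
    target = dependency(rows, candidates, decision)
    selected = []
    remaining = list(candidates)
    while remaining and dependency(rows, selected, decision) < target:
        # Pick the feature that raises the dependency degree the most.
        best = max(remaining,
                   key=lambda f: dependency(rows, selected + [f], decision))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy decision table: `flu` is fully determined by `fever`.
rows = [
    {"fever": 0, "cough": 0, "ward": 0, "flu": "no"},
    {"fever": 0, "cough": 1, "ward": 1, "flu": "no"},
    {"fever": 1, "cough": 0, "ward": 1, "flu": "yes"},
    {"fever": 1, "cough": 1, "ward": 0, "flu": "yes"},
]

print(greedy_select(rows, ["cough", "fever", "ward"], "flu"))
```

Like a filter method, this sketch never runs an induction algorithm; the dependency degree only approximates "not damaging the performance of induction" by requiring that the selected features keep the decision table as consistent as the full feature set does.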

Author information

Authors and Affiliations

Department of Information Engineering, Maebashi Institute of Technology, 460-1 Kamisadori-Cho, Maebashi-City, 371-0816, Japan
Ning Zhong & Juzhen Dong
Department of Information and Computer Science, Waseda University, 3-4-1 Okubo Shinjuku-Ku, Tokyo, 169, Japan
Setsuo Ohsuga

About this article

Cite this article

Zhong, N., Dong, J. & Ohsuga, S. Using Rough Sets with Heuristics for Feature Selection. Journal of Intelligent Information Systems 16, 199–214 (2001). https://doi.org/10.1023/A:1011219601502

Issue Date: August 2001
DOI: https://doi.org/10.1023/A:1011219601502
