Feature Selection Based on Relative Attribute Dependency: An Experimental Study

Han, Jianchao; Sanchez, Ricardo; Hu, Xiaohua

doi:10.1007/11548669_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3641)

Included in the following conference series:

International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing

Abstract

Most existing rough set-based feature selection algorithms rely on computationally intensive evaluation of either discernibility functions or positive regions to find attribute reducts. In this paper, we develop a new computation model based on relative attribute dependency, defined as the ratio of the size of the projection of the decision table onto a subset of condition attributes to the size of its projection onto the union of that subset and the set of decision attributes. To find an optimal reduct, we use the information entropy conveyed by the attributes as the heuristic. A novel algorithm that finds optimal reducts of condition attributes based on relative attribute dependency is implemented in Java and evaluated on 10 data sets from the UCI Machine Learning Repository. We compare C4.5 classification results on the original data sets with those on their reducts. The experimental results demonstrate the usefulness of our algorithm.
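
The relative attribute dependency described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the function names, the list-of-dictionaries table layout, and the toy weather table are all assumptions made for illustration. A projection is taken here to be the set of distinct tuples that remain when the table is restricted to the chosen attributes, and the table is assumed to be non-empty.

```python
from typing import Dict, Hashable, Sequence

# Hypothetical representation: one dict per row of the decision table.
Row = Dict[str, Hashable]


def projection_size(table: Sequence[Row], attributes: Sequence[str]) -> int:
    """Count the distinct tuples left after projecting the table onto `attributes`."""
    return len({tuple(row[a] for a in attributes) for row in table})


def relative_dependency(
    table: Sequence[Row],
    condition_subset: Sequence[str],
    decision_attributes: Sequence[str],
) -> float:
    """Ratio of the projection size on the condition subset to the projection
    size on that subset together with the decision attributes."""
    combined = list(condition_subset) + [
        d for d in decision_attributes if d not in condition_subset
    ]
    return projection_size(table, condition_subset) / projection_size(table, combined)


# Toy decision table (invented data): two condition attributes, one decision.
TOY_TABLE = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

if __name__ == "__main__":
    for subset in (["outlook", "windy"], ["windy"], ["outlook"]):
        print(subset, relative_dependency(TOY_TABLE, subset, ["play"]))
```

In this toy table the ratio equals 1 for `["windy"]`, because every value of `windy` maps to exactly one value of `play`, and it drops below 1 for `["outlook"]`, because the same `outlook` value occurs with both decisions. Only distinct tuples are counted, so the sketch needs neither a discernibility function nor a positive region.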

Author information

Authors and Affiliations

Department of Computer Science, California State University Dominguez Hills, 1000 E. Victoria Street, Carson, CA, 90747
Jianchao Han & Ricardo Sanchez
College of Information Science and Technology, Drexel University, 3141 Chestnut Street, Philadelphia, PA, 19104
Xiaohua Hu

Editor information

Editors and Affiliations

Department of Computer Science, University of Regina, Regina, SK, S4S 0A2, Canada; Polish-Japanese Institute of Information Technology, Koszykowa 86, 02-008 Warsaw, P.O. Box, Poland
Dominik Ślęzak
School of Information Science and Technology, Southwest Jiaotong University, 610031, Chengdu, P.R. China
Guoyin Wang
Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Marcin Szczuka
Department of Computer Science, Brock University, St. Catharines, L2S 3A1, Ontario, Canada
Ivo Düntsch
Department of Computer Science, University of Regina, S4S 0A2, Regina, Saskatchewan, Canada
Yiyu Yao

About this paper

Cite this paper

Han, J., Sanchez, R., Hu, X. (2005). Feature Selection Based on Relative Attribute Dependency: An Experimental Study. In: Ślęzak, D., Wang, G., Szczuka, M., Düntsch, I., Yao, Y. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2005. Lecture Notes in Computer Science, vol 3641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11548669_23

DOI: https://doi.org/10.1007/11548669_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28653-0
Online ISBN: 978-3-540-31825-5
eBook Packages: Computer Science, Computer Science (R0)
