Abstract
This paper proposes a heuristic reduct-computation algorithm for feature selection. The method is based on the discernibility matrix and aims to reduce the number of irrelevant and redundant features in data mining. It uses both the significance of attributes and the information in the discernibility matrix to define the necessity that guides heuristic feature selection. The advantage of the algorithm is that it finds an optimal reduct for feature selection in most cases. Experimental results confirm this claim and also show that the proposed algorithm is more time-efficient than other similar computation methods.
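The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind discernibility-matrix based feature selection, not the authors' method. It assumes a small decision table with discrete attribute values, takes singleton matrix entries as the core, and then greedily adds the attribute that occurs in the most uncovered entries. That frequency count is an illustrative stand-in for the paper's necessity measure, which is not given here. The names `discernibility_entries` and `greedy_reduct` are hypothetical.

```python
from itertools import combinations


def discernibility_entries(rows, decisions):
    """Collect the non-empty entries of the discernibility matrix.

    Each entry is the set of attribute indices on which two objects
    with different decision values disagree.
    """
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue  # pairs with the same decision need not be discerned
        diff = frozenset(
            a for a, (x, y) in enumerate(zip(rows[i], rows[j])) if x != y
        )
        if diff:
            entries.append(diff)
    return entries


def greedy_reduct(rows, decisions):
    """Greedy attribute selection (assumed frequency heuristic).

    Returns a sorted list of attribute indices that intersects every
    discernibility entry. The result is not guaranteed to be minimal.
    """
    entries = discernibility_entries(rows, decisions)
    # Core attributes: those that appear alone in some entry.
    selected = {next(iter(e)) for e in entries if len(e) == 1}
    remaining = [e for e in entries if not (e & selected)]
    while remaining:
        counts = {}
        for e in remaining:
            for a in e:
                counts[a] = counts.get(a, 0) + 1
        # Most frequent attribute; ties go to the lowest index.
        best = max(sorted(counts), key=counts.get)
        selected.add(best)
        remaining = [e for e in remaining if best not in e]
    return sorted(selected)


# Toy decision table: three condition attributes, one decision column.
rows = [(1, 0, 0), (1, 1, 0), (0, 1, 1), (0, 0, 1)]
decisions = [0, 1, 1, 0]
print(greedy_reduct(rows, decisions))  # → [1]
```

In this toy table the decision depends only on the second attribute (index 1), so it shows up as a singleton entry and is selected as the core without any greedy step.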
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, F., Lu, S. (2007). A Feature Selection Algorithm Based on Discernibility Matrix. In: Wang, Y., Cheung, Ym., Liu, H. (eds) Computational Intelligence and Security. CIS 2006. Lecture Notes in Computer Science, vol 4456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74377-4_28
DOI: https://doi.org/10.1007/978-3-540-74377-4_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74376-7
Online ISBN: 978-3-540-74377-4
eBook Packages: Computer Science, Computer Science (R0)