Study on Feature Selection Based on Fuzzy Clustering Algorithm
Abstract
Considering the complementarity between classification and clustering algorithms, we propose a new feature selection method based on the fuzzy Iterative Self-Organizing Data Analysis Technique (ISODATA). A formula for computing each feature's contribution to class separability in feature space is first defined on the basis of fuzzy ISODATA. Candidate feature subsets are then generated according to these contributions during recursive feature elimination, and the candidate subset with the lowest objective function value, defined as the number of misclassified and misclustered samples, is selected as optimal. The proposed method is applied to the acute leukemia gene expression profile dataset. The experimental results show that the selected features perform well in terms of both classification and clustering measures, which demonstrates that the algorithm is effective for selecting informative features from high-dimensional datasets.
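The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the contribution formula, so `feature_contribution` below is a hypothetical stand-in (a membership-weighted between-cluster to within-cluster scatter ratio per feature), and the nearest-class-mean classifier used to count misclassified samples is likewise an assumption. Only the clustering step follows the standard fuzzy ISODATA (fuzzy c-means) updates. All function names are illustrative.

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means: returns centers (c, d) and memberships U (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def feature_contribution(X, centers, U, m=2.0):
    """ASSUMED score (not the paper's formula): per-feature ratio of
    membership-weighted between-cluster scatter to within-cluster scatter."""
    W = U ** m
    within = np.einsum('nc,ncd->d', W, (X[:, None, :] - centers[None]) ** 2)
    between = ((centers - X.mean(axis=0)) ** 2 * W.sum(axis=0)[:, None]).sum(axis=0)
    return between / (within + 1e-12)

def misclustered(U, y):
    """Samples whose hard cluster disagrees with that cluster's majority class."""
    hard = U.argmax(axis=1)
    errors = 0
    for k in np.unique(hard):
        labels = y[hard == k]
        errors += len(labels) - np.bincount(labels).max()
    return int(errors)

def misclassified(X, y):
    """ASSUMED classifier: nearest class mean, evaluated on the training data."""
    classes = np.unique(y)
    means = np.stack([X[y == k].mean(axis=0) for k in classes])
    d2 = ((X[:, None, :] - means[None]) ** 2).sum(axis=2)
    return int((classes[d2.argmin(axis=1)] != y).sum())

def select_features(X, y, min_features=1):
    """Recursive feature elimination: drop the lowest-scoring feature each round
    and keep the candidate subset with the lowest objective value."""
    active = list(range(X.shape[1]))
    n_clusters = len(np.unique(y))
    best_subset, best_cost = None, np.inf
    while True:
        Xs = X[:, active]
        centers, U = fuzzy_cmeans(Xs, n_clusters)
        cost = misclustered(U, y) + misclassified(Xs, y)
        if cost <= best_cost:                  # ties favour the smaller subset
            best_subset, best_cost = list(active), cost
        if len(active) <= min_features:
            break
        scores = feature_contribution(Xs, centers, U)
        del active[int(scores.argmin())]
    return best_subset, best_cost
```

Calling `select_features(X, y)` with a samples-by-features array `X` and non-negative integer class labels `y` returns the indices of the retained features together with the objective value of that subset. For gene expression data with thousands of features, removing one feature per round would be slow, so a practical variant would likely eliminate features in batches.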
Keywords
Feature Selection; Feature Subset; Feature Selection Method; Membership Degree; Fuse Classification