Study on Feature Selection Based on Fuzzy Clustering Algorithm
Considering the complementarity between classification and clustering algorithms, we propose a new feature selection method based on the fuzzy Iterative Self-Organizing Data Analysis Technique (ISODATA). First, a formula for computing each feature's contribution to class separability in feature space is defined on the basis of fuzzy ISODATA. Candidate feature subsets are then generated by recursive feature elimination according to these contributions, and the candidate subset with the lowest objective function value, defined as the number of misclassified and misclustered samples, is selected as optimal. The proposed method is applied to the acute leukemia gene expression profile dataset. The experimental results show that the selected features perform well on both classification and clustering measures, which demonstrates that the algorithm is effective for selecting informative features from high-dimensional datasets.
Keywords: Feature Selection; Feature Subset; Feature Selection Method; Membership Degree; Fuse Classification
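The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's contribution formula, classifier, or parameter settings, so several parts here are assumptions rather than the authors' method: `contributions` uses a hypothetical per-feature ratio of fuzzy between-cluster to within-cluster scatter, the misclassification count comes from a stand-in nearest-class-centroid classifier, and all function names, the fuzzifier `m = 2`, and the synthetic data are illustrative only.

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, tol=1e-5, seed=0):
    """Fuzzy c-means (fuzzy ISODATA): memberships U (c x n), centers V (c x d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                      # memberships of each sample sum to 1
    for _ in range(iters):
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # squared distances to each center, floored to avoid division by zero
        D = np.maximum(((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2), 1e-12)
        inv = D ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

def contributions(X, U, V, m=2.0):
    """ASSUMED score (not the paper's formula): between / within fuzzy scatter."""
    W = U ** m
    within = (W[:, :, None] * (X[None] - V[:, None]) ** 2).sum(axis=(0, 1))
    between = (W.sum(axis=1)[:, None] * (V - X.mean(axis=0)) ** 2).sum(axis=0)
    return between / np.maximum(within, 1e-12)

def objective(X, y, c, seed=0):
    """Number of misclustered plus misclassified samples on a feature subset."""
    U, _ = fuzzy_cmeans(X, c=c, seed=seed)
    hard = U.argmax(axis=0)
    # misclustered: samples outside the majority class of their cluster
    misclustered = sum(
        (hard == k).sum() - np.bincount(y[hard == k]).max()
        for k in range(c) if (hard == k).any()
    )
    # misclassified: nearest class-centroid classifier (a stand-in classifier)
    classes = np.unique(y)
    cents = np.stack([X[y == k].mean(axis=0) for k in classes])
    pred = classes[((X[:, None] - cents[None]) ** 2).sum(axis=2).argmin(axis=1)]
    return int(misclustered + (pred != y).sum())

def select_features(X, y, c=2):
    """Recursive feature elimination; keep the candidate with the lowest objective."""
    remaining = list(range(X.shape[1]))
    best, best_err = list(remaining), objective(X, y, c)
    while len(remaining) > 1:
        Xs = X[:, remaining]
        U, V = fuzzy_cmeans(Xs, c=c)
        # drop the feature with the smallest contribution score
        remaining.pop(int(contributions(Xs, U, V).argmin()))
        err = objective(X[:, remaining], y, c)
        if err <= best_err:                 # ties favour the smaller subset
            best, best_err = list(remaining), err
    return best, best_err

# Synthetic data (hypothetical): only features 0 and 1 carry class information.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 6))
X[:, :2] += 8.0 * y[:, None]
best, err = select_features(X, y)
print("selected features:", best, "objective:", err)
```

In this sketch the objective is evaluated on the same samples used for clustering; a study on real expression data would normally estimate the misclassification count with held-out samples or cross-validation.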