Study on Feature Selection Based on Fuzzy Clustering Algorithm

  • Quanjin Liu
  • Zhimin Zhao
  • Yong Wang
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 124)

Abstract

Considering the complementarity between classification and clustering algorithms, we propose a new feature selection method based on the fuzzy Iterative Self-Organizing Data Analysis Technique (ISODATA). A formula for computing each feature's contribution to class separability in feature space is first defined on the basis of fuzzy ISODATA. Candidate feature subsets are then generated by recursive feature elimination according to these contributions, and the candidate subset with the lowest value of the objective function, defined as the number of misclassified plus misclustered samples, is selected as optimal. The proposed method is applied to the acute leukemia gene expression profile dataset. The experimental results show that the selected features perform well in terms of both classification and clustering measurements. This demonstrates that the algorithm is effective for selecting informative features from high-dimensional datasets.
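The selection loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the contribution formula or name the classifier, so the `contribution` score (a membership-weighted between/within scatter ratio), the leave-one-out nearest-class-mean classifier, and all function names are assumptions introduced here. Fuzzy ISODATA is approximated by the standard fuzzy c-means updates, and class labels are assumed to be integers 0..c-1.

```python
import numpy as np
from itertools import permutations

def fuzzy_cmeans(X, c, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means updates; returns memberships U and centers V."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each sample's memberships sum to 1
    for _ in range(n_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted centers
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, V

def contribution(X, U, V, m=2.0):
    """Hypothetical per-feature score: between-cluster over within-cluster scatter."""
    W = U ** m
    within = (W[:, :, None] * (X[:, None, :] - V[None, :, :]) ** 2).sum(axis=(0, 1))
    between = (W.sum(axis=0)[:, None] * (V - X.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)

def misclustered(U, y, c):
    """Errors of the hard clustering under the best cluster-to-class matching."""
    hard = U.argmax(axis=1)
    return min(int((np.array(p)[hard] != y).sum()) for p in permutations(range(c)))

def misclassified(X, y):
    """Leave-one-out errors of a nearest-class-mean classifier (an assumed choice)."""
    errors = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        classes = np.unique(y[keep])
        means = np.array([X[keep][y[keep] == k].mean(axis=0) for k in classes])
        pred = classes[((means - X[i]) ** 2).sum(axis=1).argmin()]
        errors += int(pred != y[i])
    return errors

def select_features(X, y, c=2):
    """Recursive elimination; keeps the subset with the lowest combined error."""
    active = list(range(X.shape[1]))
    best_cost, best_subset = None, None
    while active:
        Xa = X[:, active]
        U, V = fuzzy_cmeans(Xa, c)
        cost = misclassified(Xa, y) + misclustered(U, y, c)
        if best_cost is None or cost <= best_cost:   # ties favor the smaller subset
            best_cost, best_subset = cost, list(active)
        if len(active) == 1:
            break
        # Drop the feature with the lowest contribution score.
        active.pop(int(contribution(Xa, U, V).argmin()))
    return best_subset, best_cost

if __name__ == "__main__":
    # Synthetic demo: features 0 and 1 separate the classes, the rest are noise.
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 20)
    X = rng.normal(size=(40, 5))
    X[y == 1, :2] += 8.0
    print(select_features(X, y))
```

Breaking ties toward the smaller subset is a design choice made for this sketch; the abstract only states that the subset with the lowest objective value is chosen.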

Keywords

Feature Selection; Feature Subset; Feature Selection Method; Membership Degree; Fuse Classification
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Quanjin Liu (1, 2)
  • Zhimin Zhao (1)
  • Yong Wang (3)
  1. College of Science, University of Aeronautics and Astronautics, Nanjing, China
  2. Anqing Normal College, Anqing, China
  3. The Second Affiliated Hospital, Anhui Medical University, Hefei, China