Application of Feature Subset Selection Methods on Classifiers Comprehensibility for Bio-Medical Datasets

  • Syed Imran Ali
  • Byeong Ho Kang
  • Sungyoung Lee
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10069)

Abstract

Feature subset selection is an important data reduction technique. The effect of feature selection on classifier accuracy has been studied extensively, yet the comprehensibility of the resulting model has received less attention. We show that a weak feature selection method may significantly increase the complexity of a classification model. We also propose an extendable feature selection methodology based on our preliminary results. Insights from the study can be used for developing clinical decision support systems.
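The claim that a weak feature selection may increase model complexity can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic four-row dataset, the feature names a, b and c, the ID3-style tree learner, and the use of tree node count as a stand-in for comprehensibility. Two feature subsets give the same training accuracy, but the subset that omits the single informative feature forces a larger tree.

```python
from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a list of class labels.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, features):
    # Grow an ID3-style tree. A leaf is a class label;
    # an internal node is a tuple (feature, {value: subtree}).
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    def gain(f):
        remainder = 0.0
        for v in set(r[f] for r in rows):
            subset = [l for r, l in zip(rows, labels) if r[f] == v]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(features, key=gain)  # ties go to the first feature listed
    rest = [f for f in features if f != best]
    children = {}
    for v in sorted(set(r[best] for r in rows)):
        idx = [i for i, r in enumerate(rows) if r[best] == v]
        children[v] = build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rest)
    return (best, children)

def count_nodes(tree):
    # Model size (internal nodes plus leaves), used here as a
    # hypothetical proxy for comprehensibility.
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(count_nodes(c) for c in tree[1].values())

def predict(tree, row):
    # Follow the branches matching the row until a leaf is reached.
    while isinstance(tree, tuple):
        tree = tree[1][row[tree[0]]]
    return tree

# Synthetic data (an assumption): feature "a" alone determines the class,
# while "b" and "c" determine it only jointly (a = b XOR c).
rows = [{"a": b ^ c, "b": b, "c": c} for b in (0, 1) for c in (0, 1)]
labels = ["pos" if r["a"] else "neg" for r in rows]

strong_tree = build_tree(rows, labels, ["a"])      # subset keeps "a"
weak_tree = build_tree(rows, labels, ["b", "c"])   # subset drops "a"

print(count_nodes(strong_tree))  # 3
print(count_nodes(weak_tree))    # 7
```

Both trees classify all four training rows correctly, so accuracy alone would not distinguish the two subsets; only the size of the model does.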

Keywords

Feature subset selection; Model comprehensibility; Data classification; Data mining; Clinical decision support system

Notes

Acknowledgments

This work was supported by the Industrial Core Technology Development Program (10049079, Develop of mining core technology exploiting personal big data) funded by the Ministry of Trade, Industry and Energy (MOTIE, Korea), and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2014R1A2A2A01003914).

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Yongin-si, Republic of Korea
  2. Department of Engineering and Technology, Information and Communication Technology, University of Tasmania, Hobart, Australia