Application of Feature Subset Selection Methods on Classifiers Comprehensibility for Bio-Medical Datasets
Feature subset selection is an important data reduction technique. The effects of feature selection on classifier accuracy have been studied extensively, yet the comprehensibility of the resulting model has received less attention. We show that a weak feature selection method may significantly increase the complexity of a classification model. We also propose an extendable feature selection methodology based on our preliminary results. Insights from the study can be used for developing clinical decision support systems.
Keywords: Feature subset selection, Model comprehensibility, Data classification, Data mining, Clinical decision support system
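To make the abstract's claim about model complexity concrete, the following is a minimal sketch and not the authors' method. It assumes that the node count of a decision tree is a reasonable proxy for comprehensibility, and it uses an invented four-row toy dataset with hypothetical features `a`, `b`, and `c`. Both feature subsets classify the toy data perfectly, but the weaker subset needs a larger tree.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def build_tree(rows, labels, features):
    """Grow an ID3-style tree over categorical features.

    Returns a class label for a leaf, or a dict for an internal node.
    """
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    def split_entropy(feature):
        # Weighted entropy of the labels after splitting on `feature`.
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[feature], []).append(y)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    remaining = [f for f in features if f != best]
    children = {}
    for value in sorted({row[best] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        children[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return {"feature": best, "children": children}


def count_nodes(tree):
    """Total number of nodes (internal nodes plus leaves)."""
    if not isinstance(tree, dict):
        return 1
    return 1 + sum(count_nodes(child) for child in tree["children"].values())


# Invented toy data: the label equals feature "a", and also equals b XOR c.
rows = [{"a": b ^ c, "b": b, "c": c} for b in (0, 1) for c in (0, 1)]
labels = [row["a"] for row in rows]

compact = build_tree(rows, labels, ["a"])        # one directly predictive feature
larger = build_tree(rows, labels, ["b", "c"])    # two features needed jointly

print(count_nodes(compact), count_nodes(larger))
```

In this sketch the two subsets are equally accurate on the toy data, so an accuracy-only comparison would not distinguish them, while the node counts do. Other comprehensibility measures, such as rule count or tree depth, could be substituted for `count_nodes`.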
This work was supported by the Industrial Core Technology Development Program (10049079, Develop of mining core technology exploiting personal big data) funded by the Ministry of Trade, Industry and Energy (MOTIE, Korea), and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2014R1A2A2A01003914).