Abstract
Feature selection is widely studied as an important preprocessing step in machine learning and data mining. In this paper, a new feature selection evaluation criterion based on low-loss learning vector quantization (LVQ) classification is proposed. Based on this criterion, a feature selection algorithm is presented that optimizes the hypothesis margin of LVQ classification by minimizing its loss function. The algorithm is compared experimentally with the well-known SVM-RFE and Relief methods on four UCI data sets, using Naive Bayes and RBF Network classifiers. The results show that the new algorithm achieves performance similar to or higher than that of Relief on all training data, and performance better than or comparable to that of SVM-RFE.
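To make the notion of a hypothesis margin concrete, the sketch below computes it in the form commonly used in the margin-based feature selection literature: half the difference between a sample's distance to its nearest different-class prototype and its distance to its nearest same-class prototype, under a feature-weighted distance. It is only an illustration under stated assumptions, not the algorithm of this paper: the prototypes are fixed by hand rather than trained with LVQ, the weights act as a simple feature mask, and the names `weighted_distance`, `hypothesis_margin`, and `mean_margin` are hypothetical.

```python
import math

def weighted_distance(a, b, weights):
    # Euclidean distance after scaling each feature difference by its weight.
    return math.sqrt(sum((w * (x - y)) ** 2 for x, y, w in zip(a, b, weights)))

def hypothesis_margin(x, label, prototypes, weights):
    # prototypes is a list of (vector, class_label) pairs (assumed given here;
    # in LVQ they would be learned).
    hit = min(weighted_distance(x, p, weights) for p, c in prototypes if c == label)
    miss = min(weighted_distance(x, p, weights) for p, c in prototypes if c != label)
    # Positive margin: the nearest same-class prototype is closer than the
    # nearest different-class prototype.
    return 0.5 * (miss - hit)

def mean_margin(samples, prototypes, weights):
    # Average margin over labeled samples; used here as an illustrative
    # score for a feature weighting, not as the paper's criterion.
    total = sum(hypothesis_margin(x, y, prototypes, weights) for x, y in samples)
    return total / len(samples)

# Toy data (made up for illustration): feature 0 separates the classes,
# feature 1 is noise.
prototypes = [((0.0, 0.0), "a"), ((4.0, 0.0), "b")]
samples = [((1.0, 5.0), "a"), ((3.0, -5.0), "b")]

informative_only = mean_margin(samples, prototypes, (1.0, 0.0))
noise_only = mean_margin(samples, prototypes, (0.0, 1.0))
both = mean_margin(samples, prototypes, (1.0, 1.0))
```

On this toy data, keeping only the informative feature gives a larger mean margin than keeping both features, and keeping only the noise feature gives a margin of zero. This is the intuition behind using the margin as a feature evaluation criterion.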
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant No. 51074097.
Cite this article
Hu, Y., Liu, W. A novel feature selection algorithm based on LVQ hypothesis margin. Neural Comput & Applic 24, 1431–1439 (2014). https://doi.org/10.1007/s00521-013-1366-2
DOI: https://doi.org/10.1007/s00521-013-1366-2