Abstract
Feature selection is a critical preprocessing step for constructing models in computer vision and machine learning, yet it is difficult to reduce the number of features and maintain classification accuracy at the same time. To address this problem, we propose a dividing-based many-objective evolutionary algorithm for large-scale feature selection (DMEA-FS). First, four novel objectives are established for exploring optimal feature subsets, and DMEA-FS is designed with two structures: a wrapper for high accuracy and a filter for low computational cost. Second, two new recombination methods are presented for rapid convergence, together with a mapping-based variable dividing method for precisely identifying related variables. Third, based on the minimum Manhattan distance, a triangle-approximating decision-making method is proposed to assist users' decisions with or without preference information. Numerical experiments against several state-of-the-art feature selection algorithms demonstrate that the proposed DMEA-FS outperforms its competitors in terms of both classification accuracy and metrics on the number of features.
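The minimum Manhattan distance idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the triangle-approximating method of DMEA-FS, whose details are not given here; it only shows the generic step of picking, from a set of non-dominated solutions, the one closest in L1 distance to the ideal point after normalization. The function name `min_manhattan_choice`, the assumption that all objectives are minimized, and the two-objective example data are hypothetical.

```python
from typing import Sequence


def min_manhattan_choice(front: Sequence[Sequence[float]]) -> int:
    """Return the index of the solution with the smallest normalized
    Manhattan (L1) distance to the ideal point.

    Assumption: every objective is to be minimized.
    """
    if not front:
        raise ValueError("front must contain at least one solution")
    n_obj = len(front[0])
    # Ideal and nadir points are estimated from the front itself.
    ideal = [min(p[j] for p in front) for j in range(n_obj)]
    nadir = [max(p[j] for p in front) for j in range(n_obj)]

    def distance(p: Sequence[float]) -> float:
        total = 0.0
        for j in range(n_obj):
            span = nadir[j] - ideal[j]
            # An objective that is constant over the front contributes nothing.
            if span > 0:
                total += (p[j] - ideal[j]) / span
        return total

    return min(range(len(front)), key=lambda i: distance(front[i]))


# Hypothetical front: (classification error, fraction of features kept).
front = [(0.10, 0.90), (0.12, 0.40), (0.30, 0.10)]
best = min_manhattan_choice(front)
print(best, front[best])
```

Normalizing each objective by its range over the front keeps one objective from dominating the sum purely because of its scale; without preference information, all objectives are weighted equally in this sketch.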
Acknowledgements
This study was funded by the National Science Foundation of China (Grant No. 61472289).
Ethics declarations
Conflict of interest
Haoran Li declares that he has no conflict of interest. Fazhi He declares that he has no conflict of interest. Yaqian Liang declares that she has no conflict of interest. Quan Quan declares that he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Li, H., He, F., Liang, Y. et al. A dividing-based many-objective evolutionary algorithm for large-scale feature selection. Soft Comput 24, 6851–6870 (2020). https://doi.org/10.1007/s00500-019-04324-5
DOI: https://doi.org/10.1007/s00500-019-04324-5