A dividing-based many-objective evolutionary algorithm for large-scale feature selection

  • Methodologies and Application
Soft Computing

Abstract

Feature selection is a critical preprocessing step for building models in computer vision and machine learning, yet it is difficult to reduce the number of features and maintain classification accuracy at the same time. To address this problem, we propose a dividing-based many-objective evolutionary algorithm for large-scale feature selection (DMEA-FS). First, four novel objectives are established for exploring optimal feature subsets, and two structures are designed in DMEA-FS: a wrapper for high accuracy and a filter for low computational cost. Second, two new recombination methods are presented for rapid convergence, and a mapping-based variable dividing method is presented for precisely identifying related variables. Third, based on the minimum Manhattan distance, a triangle-approximating decision-making method is proposed to assist users in choosing a solution with or without preference information. Numerical experiments against several state-of-the-art feature selection algorithms demonstrate that the proposed DMEA-FS outperforms its competitors in both classification accuracy and metrics on the number of features.
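The decision-making step mentioned in the abstract builds on the minimum Manhattan distance. The sketch below illustrates only that generic idea, not the authors' triangle-approximating method: every objective on a set of non-dominated solutions is scaled to [0, 1], and the solution with the smallest L1 distance to the ideal point is chosen. The function names, the two-objective front (classification error, fraction of features kept), and the optional preference weights are all assumptions made for illustration; the abstract does not specify the four objectives used in DMEA-FS.

```python
from typing import List, Optional, Sequence


def normalize(front: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each objective (column) to [0, 1]; constant columns map to 0."""
    n_obj = len(front[0])
    lows = [min(p[j] for p in front) for j in range(n_obj)]
    highs = [max(p[j] for p in front) for j in range(n_obj)]
    return [
        [
            (p[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_obj)
        ]
        for p in front
    ]


def min_manhattan_index(
    front: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Return the index of the solution closest (in L1 distance) to the ideal point.

    All objectives are assumed to be minimized. After normalization the ideal
    point is the origin, so the L1 distance is the (weighted) sum of the
    scaled objective values. `weights` is a hypothetical way to express
    user preference; with no weights all objectives count equally.
    """
    scaled = normalize(front)
    n_obj = len(scaled[0])
    w = list(weights) if weights is not None else [1.0] * n_obj
    distances = [sum(wj * v for wj, v in zip(w, p)) for p in scaled]
    return distances.index(min(distances))


# Hypothetical front: (classification error, fraction of features kept).
front = [(0.10, 0.90), (0.12, 0.40), (0.20, 0.15), (0.35, 0.05)]

no_preference = min_manhattan_index(front)
prefer_fewer_features = min_manhattan_index(front, weights=(1.0, 3.0))
print(front[no_preference], front[prefer_fewer_features])
```

Without weights the sketch picks a compromise between the two objectives; putting more weight on the second objective shifts the choice toward a smaller feature subset. This mirrors the with/without preference distinction in the abstract only loosely.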

Acknowledgements

This study was funded by the National Science Foundation of China (Grant No. 61472289).

Author information

Corresponding author

Correspondence to Fazhi He.

Ethics declarations

Conflict of interest

Haoran Li declares that he has no conflict of interest. Fazhi He declares that he has no conflict of interest. Yaqian Liang declares that she has no conflict of interest. Quan Quan declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, H., He, F., Liang, Y. et al. A dividing-based many-objective evolutionary algorithm for large-scale feature selection. Soft Comput 24, 6851–6870 (2020). https://doi.org/10.1007/s00500-019-04324-5

  • DOI: https://doi.org/10.1007/s00500-019-04324-5
