Abstract
Feature selection (FS) is a major task in the data-preprocessing stage of machine learning. Multi-objective FS is particularly challenging because it must optimize two conflicting objectives: minimizing the size of the feature subset and minimizing the classification error. Evolutionary algorithms are promising tools for obtaining reliable Pareto fronts, but they are time-consuming because they must explore a large search space. A further issue in multi-objective FS is the correlation between features, since selecting correlated features degrades classification performance. To address these challenges, we introduce a multi-objective FS approach that makes several significant contributions. First, the proposed method handles the correlation between features through a novel probability structure. Second, it builds on the Pareto Archived Evolution Strategy (PAES), which offers several advantages, including simplicity and the ability to explore the solution space at an acceptable speed. We enhance the PAES structure so that offspring are generated intelligently: the proposed approach exploits the introduced probability structure to produce more promising offspring. Finally, it incorporates a novel strategy that guides the algorithm toward the optimal subset throughout the evolutionary process. Results on real-world datasets show a substantial improvement in the quality of the final solutions.
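The abstract does not reproduce the paper's probability structure, so the following Python sketch only illustrates the general scheme it describes, under stated assumptions: a (1+1) PAES-style loop for bi-objective feature selection in which mutation flip probabilities are biased by feature–label correlation (a stand-in for the authors' probability structure), and the wrapped classifier's error is replaced by a toy correlation-based proxy. All names here (`paes_fs`, `evaluate`, `dominates`) and the biasing rule are illustrative assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

def dominates(a, b):
    # Pareto dominance for minimization of both objectives.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def evaluate(mask, X, y):
    # Objectives: (fraction of selected features, error proxy).
    # A real wrapper would train a classifier and use its error instead.
    if not mask.any():
        return (1.0, 1.0)
    score = abs(np.corrcoef(X[:, mask].mean(axis=1), y)[0, 1])
    return (mask.mean(), 1.0 - score)

def paes_fs(X, y, iters=200, archive_size=20):
    n = X.shape[1]
    # Bias flip probabilities by feature-label correlation: features that
    # look less informative are flipped more often (an assumed stand-in
    # for the paper's probability structure).
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(n)])
    p_flip = 0.5 * (1 - corr / (corr.max() + 1e-12)) + 0.05
    current = rng.random(n) < 0.5
    f_cur = evaluate(current, X, y)
    archive = [(current.copy(), f_cur)]
    for _ in range(iters):
        # (1+1) step: mutate the single parent into one child.
        child = current.copy()
        flips = rng.random(n) < p_flip / n * 5  # sparse, biased mutation
        child[flips] = ~child[flips]
        f_child = evaluate(child, X, y)
        if dominates(f_cur, f_child):
            continue  # parent dominates: discard the child
        # Accept the child if no archive member dominates it, then prune
        # archive members the child dominates.
        if not any(dominates(f, f_child) for _, f in archive):
            archive = [(m, f) for m, f in archive if not dominates(f_child, f)]
            archive.append((child.copy(), f_child))
            archive = archive[-archive_size:]
            current, f_cur = child, f_child
    return archive
```

On synthetic data where only the first two features carry signal, the returned archive approximates a Pareto front trading subset size against the error proxy; by construction its members are mutually non-dominated.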
Data Availability
The datasets analyzed during the current study are available in: https://archive.ics.uci.edu/ml/datasets.php.
Ethics declarations
Conflict of Interest
The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Namakin, M., Rouhani, M. & Sabzekar, M. A Multi-objective Feature Selection Method Considering the Interaction Between Features. Inf Syst Front (2024). https://doi.org/10.1007/s10796-024-10481-2