Abstract
Feature selection (FS) aims to remove redundant or irrelevant features from data and plays an important role in data mining and machine learning tasks. Recent studies integrate data discretization into the FS process to achieve better classification performance. In this paper, we propose an improved discretization-based FS method via particle swarm optimization (PSO) to obtain higher classification accuracy with a smaller feature subset. Our approach uses a novel encoding and decoding scheme for PSO that can efficiently select multiple cut-points for discretization. In addition, a new updating strategy and a local search procedure are proposed to strengthen the search ability and avoid being trapped in local optima. Experimental results on benchmark datasets demonstrate the efficacy of the proposed method in terms of both classification accuracy and feature subset size.
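The abstract does not spell out the encoding, so the following is only a minimal sketch of the general idea of discretization-based FS with multiple cut-points, not the method proposed in the paper. It assumes that each feature has a precomputed list of candidate cut-points, that a particle holds one real value in [0, 1] per candidate, and that a candidate is kept when its value exceeds 0.5. The names `decode` and `discretize`, the threshold, and the toy data are all hypothetical.

```python
from bisect import bisect_right


def decode(position, candidate_cuts, threshold=0.5):
    """Map a particle position to the chosen cut-points of each feature.

    position[j][k] is a real value tied to the k-th candidate cut-point of
    feature j. A candidate is kept when its value exceeds `threshold`
    (an assumed rule). A feature with no kept cut-point is discarded,
    which is how discretization and feature selection are combined here.
    """
    selected = {}
    for j, (values, cuts) in enumerate(zip(position, candidate_cuts)):
        kept = sorted(c for v, c in zip(values, cuts) if v > threshold)
        if kept:
            selected[j] = kept
    return selected


def discretize(row, selected):
    """Replace each selected feature value by the index of its interval."""
    return [bisect_right(selected[j], row[j]) for j in sorted(selected)]


# Toy example: three features with hypothetical candidate cut-points.
candidate_cuts = [[2.5, 5.0], [0.3], [10.0, 20.0, 30.0]]
position = [[0.9, 0.2], [0.1], [0.7, 0.4, 0.8]]

selected = decode(position, candidate_cuts)
codes_a = discretize([3.0, 0.9, 35.0], selected)
codes_b = discretize([1.0, 0.0, 15.0], selected)
```

In this toy run the second feature keeps no cut-point and is dropped, while the third keeps two cut-points and is mapped to three intervals. In a wrapper setting, the discretized rows would then be passed to a classifier whose accuracy serves as the particle's fitness.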
Acknowledgment
This work is supported in part by the Natural Science Foundation of China under Grant 61702336, in part by Shenzhen Emerging Industries of the Strategic Basic Research Project JCYJ20170302154254147, and in part by Natural Science Foundation of SZU (grant No. 2018068).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, J., Zhou, Y., Kang, J. (2019). An Improved Discretization-Based Feature Selection via Particle Swarm Optimization. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science, vol 11776. Springer, Cham. https://doi.org/10.1007/978-3-030-29563-9_27
DOI: https://doi.org/10.1007/978-3-030-29563-9_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29562-2
Online ISBN: 978-3-030-29563-9
eBook Packages: Computer Science, Computer Science (R0)