Abstract
Feature selection plays an important role in high-dimensional data analysis and has drawn increasing attention recently. Finding the most relevant features for classification is one of the most important tasks in data mining and machine learning, since datasets commonly contain irrelevant features that reduce the accuracy rate and slow down the classifier. Feature selection is an optimization process that aims to improve the accuracy rate of data classification while reducing the number of selected features. Using too many features requires a large memory capacity and leads to slow execution. Feature selection algorithms decide which features a classification algorithm should use. Traditional algorithms have proved inefficient because of the high dimensionality of the problem, so evolutionary algorithms have been used to improve the search. The algorithm proposed in this paper, the chaotic cuckoo optimization algorithm with Lévy flight, disruption operator and opposition-based learning (CCOALFDO), is applied to select the optimal feature subspace for classification. It reduces the randomness in selecting features and avoids getting stuck in local optima, which leads to a better feature subset. Extensive experiments are conducted on 20 high-dimensional datasets to demonstrate the effectiveness and efficiency of the proposed method. The results show that the proposed method outperforms state-of-the-art methods in terms of classification accuracy rate. In addition, they demonstrate the ability of CCOALFDO to select the most relevant features for classification tasks. Thus, it is a reasonable solution for handling noise and avoiding serious negative impacts on the classification accuracy rate in real-world datasets.
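The building blocks named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the chaotic map, the Lévy step generator, the binarization rule or the fitness weights, so the logistic map, Mantegna's algorithm, the 0.5 threshold, the parameters `r`, `beta` and `alpha`, and all function names below are assumptions chosen for illustration only.

```python
import math
import random


def logistic_map(x0, n, r=4.0):
    """Return n values of the logistic map x_{k+1} = r * x_k * (1 - x_k).

    Assumption: the logistic map stands in for whichever chaotic map
    the paper actually uses.
    """
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def opposite(position, lower=0.0, upper=1.0):
    """Opposition-based learning: reflect each coordinate within [lower, upper]."""
    return [lower + upper - p for p in position]


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length with Mantegna's algorithm (assumed choice)."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def to_mask(position, threshold=0.5):
    """Turn a continuous position into a feature mask (True = feature kept)."""
    return [p > threshold for p in position]


def fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classification error and fraction of features kept.

    Lower is better. The weight alpha is a hypothetical value.
    """
    return alpha * error_rate + (1 - alpha) * sum(mask) / len(mask)


if __name__ == "__main__":
    rng = random.Random(0)
    candidate = logistic_map(0.3, 5)   # chaotic initial position
    mirrored = opposite(candidate)     # its opposition-based counterpart
    print(to_mask(candidate), to_mask(mirrored))
    print(levy_step(rng))              # one Lévy-distributed step length
```

In a wrapper setting, `error_rate` would come from training a classifier on the columns selected by the mask; the sketch leaves that step out so that it stays self-contained.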
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with animals performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Kelidari, M., Hamidzadeh, J. Feature selection by using chaotic cuckoo optimization algorithm with levy flight, opposition-based learning and disruption operator. Soft Comput 25, 2911–2933 (2021). https://doi.org/10.1007/s00500-020-05349-x
DOI: https://doi.org/10.1007/s00500-020-05349-x