Abstract
In biomedical research, feature selection plays a predominant role in disease prediction. The main objective of this paper is to predict cancer from microarray gene expression data by proposing two feature selection algorithms that solve multi-objective optimization problems: (1) differential evolution with fuzzy rough set feature selection and (2) ant colony optimization with fuzzy rough set feature selection. The first algorithm hybridizes differential evolution with the fuzzy rough set and aims to select globally optimal features by applying the fuzzy rough evaluation function as the fitness function. The second algorithm hybridizes ant colony optimization with the fuzzy rough set and likewise selects globally optimal features using the fuzzy rough evaluation function as the fitness function. The performance of the two proposed feature selection algorithms is evaluated with various classification metrics computed from a decision tree classifier using tenfold cross-validation. Five datasets are used to analyze the performance of the feature selection algorithms. Four are cancer datasets: diffuse large B cell lymphoma, breast cancer, leukemia and small round blue-cell tumors. In addition, a non-medical dataset, Gisette, is used to demonstrate the generalization capability of the proposed algorithms. The metrics used for comparison are accuracy, precision, recall, F-measure, specificity, processing time and receiver operating characteristics. The performance comparison showed improved performance for the proposed algorithms. As future work, particle swarm optimization could be hybridized with the fuzzy rough set in the same way as differential evolution and ant colony optimization.
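To make the first hybrid concrete, the following is a minimal sketch of differential evolution searching over feature subsets with a fuzzy rough dependency degree as the fitness function. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact evaluation function, so the sketch assumes a commonly used formulation (range-scaled similarity combined with the minimum, and a lower approximation taken over samples of other classes). The function names, the 0.5 decoding threshold, and the parameters `F`, `CR`, `pop_size` and `generations` are all hypothetical. The sketch maximizes dependency only and omits any subset-size objective, which a multi-objective formulation would presumably include.

```python
import random


def fuzzy_rough_dependency(X, y, subset):
    """Assumed fitness: fuzzy rough dependency of labels y on the features in subset."""
    if not subset:
        return 0.0
    n = len(X)
    # Per-feature value range, used to scale differences into [0, 1].
    ranges = {}
    for a in subset:
        col = [row[a] for row in X]
        ranges[a] = (max(col) - min(col)) or 1.0
    total = 0.0
    for i in range(n):
        lower = 1.0  # membership of sample i in the fuzzy positive region
        for j in range(n):
            if y[i] == y[j]:
                continue  # only samples of other classes constrain the lower approximation
            sim = min(1.0 - abs(X[i][a] - X[j][a]) / ranges[a] for a in subset)
            lower = min(lower, 1.0 - sim)
        total += lower
    return total / n


def de_feature_selection(X, y, pop_size=10, generations=20, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over vectors in [0, 1]^d; a feature is selected if its gene > 0.5."""
    rng = random.Random(seed)
    d = len(X[0])

    def decode(vec):
        return [a for a in range(d) if vec[a] > 0.5]

    def fitness(vec):
        return fuzzy_rough_dependency(X, y, decode(vec))

    pop = [[rng.random() for _ in range(d)] for _ in range(pop_size)]
    scores = [fitness(vec) for vec in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(d)  # guarantees at least one mutated gene
            trial = []
            for a in range(d):
                if rng.random() < CR or a == j_rand:
                    gene = pop[r1][a] + F * (pop[r2][a] - pop[r3][a])
                    trial.append(min(1.0, max(0.0, gene)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][a])
            trial_score = fitness(trial)
            if trial_score >= scores[i]:  # greedy one-to-one replacement
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return decode(pop[best]), scores[best]


# Toy data (made up for illustration): feature 0 separates the classes, feature 1 does not.
X_toy = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.4], [1.0, 0.8]]
y_toy = [0, 0, 1, 1]
subset, score = de_feature_selection(X_toy, y_toy)
```

On the toy data, the dependency of the single informative feature is higher than that of the uninformative one, which is the property the search exploits. Because this dependency measure never decreases when features are added, a practical fitness would also need a term that rewards smaller subsets.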
Ethics declarations
Conflict of interest
The authors declare no conflict of interest, financially or otherwise.
Ethical approval
This article does not contain any studies with human participants or animals.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Meenachi, L., Ramakrishnan, S. Differential evolution and ACO based global optimal feature selection with fuzzy rough set for cancer data classification. Soft Comput 24, 18463–18475 (2020). https://doi.org/10.1007/s00500-020-05070-9
DOI: https://doi.org/10.1007/s00500-020-05070-9