Feature Selection in Gene Expression Profile Employing Relevancy and Redundancy Measures and Binary Whale Optimization Algorithm (BWOA)

  • Conference paper
Advanced Data Mining and Applications (ADMA 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13087)

Abstract

The large number of genes in gene expression profiles poses a computational challenge for cancer classification. To handle this high-dimensional feature space, we present RRO (Relevancy-Redundancy-Optimization), a three-step feature selection framework. In the first step, RRO identifies the top-ranked class-relevant genes using the analysis of variance (ANOVA) and the F-test. In the second step, genes that are class-correlated but redundant are removed using the Kendall rank correlation coefficient (Kendall’s \(\tau\)). Finally, a metaheuristic optimization algorithm, the binary whale optimization algorithm (BWOA), is combined with a support vector machine (SVM) classifier to select an optimal gene subset. Comparisons with thirteen state-of-the-art methods on ten gene expression datasets show that RRO yields better or comparable accuracy.
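The three steps can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no parameter values, so the function names (`relevance_filter`, `redundancy_filter`, `fitness`, `bwoa`, `rro`), the thresholds `top_k` and `tau_max`, the fitness weighting `alpha`, the linear kernel with 3-fold cross-validation, the swarm size, the sigmoid (S-shaped) transfer function, and the synthetic data are all assumptions chosen for illustration. It relies on NumPy, SciPy, and scikit-learn.

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def relevance_filter(X, y, top_k):
    """Step 1: indices of the top_k genes ranked by the ANOVA F-statistic."""
    f_scores, _ = f_classif(X, y)
    f_scores = np.nan_to_num(f_scores)  # treat undefined scores as 0
    return np.argsort(f_scores)[::-1][:top_k]


def redundancy_filter(X, ranked, tau_max):
    """Step 2: walk genes in relevance order and keep a gene only if its
    |Kendall tau| with every already kept gene is at most tau_max."""
    kept = []
    for g in ranked:
        taus = [kendalltau(X[:, g], X[:, k])[0] for k in kept]
        if all(abs(t) <= tau_max for t in taus):  # a NaN tau fails this test
            kept.append(g)
    return np.array(kept)


def fitness(X, y, mask, alpha=0.99):
    """Assumed fitness: SVM cross-validation accuracy plus a small
    reward for selecting fewer genes."""
    if not mask.any():
        return 0.0
    acc = cross_val_score(SVC(kernel="linear"), X[:, mask], y, cv=3).mean()
    return alpha * acc + (1 - alpha) * (1 - mask.mean())


def bwoa(X, y, n_whales=6, n_iter=5, seed=0):
    """Step 3: simplified binary WOA; continuous WOA moves are mapped
    to bits through a sigmoid transfer function."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    pos = rng.random((n_whales, dim)) < 0.5  # each row is a gene mask
    fit = np.array([fitness(X, y, p) for p in pos])
    best, best_fit = pos[fit.argmax()].copy(), fit.max()
    for t in range(n_iter):
        a = 2 - 2 * t / n_iter  # shrinks linearly from 2 towards 0
        for i in range(n_whales):
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            x = pos[i].astype(float)
            b = best.astype(float)
            if rng.random() < 0.5:
                # encircle the best whale if |A| < 1, else follow a random whale
                ref = b if abs(A) < 1 else pos[rng.integers(n_whales)].astype(float)
                new = ref - A * np.abs(C * ref - x)
            else:
                # spiral (bubble-net) move around the best whale
                l = rng.uniform(-1, 1)
                new = np.abs(b - x) * np.exp(l) * np.cos(2 * np.pi * l) + b
            # sigmoid gives the probability that each bit is set to 1
            pos[i] = rng.random(dim) < 1 / (1 + np.exp(-new))
            f = fitness(X, y, pos[i])
            if f > best_fit:
                best, best_fit = pos[i].copy(), f
    return best, best_fit


def rro(X, y, top_k=30, tau_max=0.8, **bwoa_kwargs):
    """Run relevance ranking, redundancy removal, then the BWOA wrapper."""
    ranked = relevance_filter(X, y, top_k)
    kept = redundancy_filter(X, ranked, tau_max)
    mask, score = bwoa(X[:, kept], y, **bwoa_kwargs)
    return kept[mask], score


# Synthetic stand-in for an expression matrix: 60 samples, 200 genes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 200))
X[:, :5] += 2.0 * y[:, None]                      # genes 0-4 carry class signal
X[:, 5] = X[:, 0] + 0.01 * rng.normal(size=60)    # gene 5 nearly duplicates gene 0

selected, score = rro(X, y)
print(sorted(selected.tolist()), round(float(score), 3))
```

In this sketch the filter steps shrink the search space before the wrapper runs, so each whale's fitness evaluation trains the SVM on at most `top_k` genes rather than on the full profile.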

Notes

  1. https://github.com/kivancguckiran/microarray-data

  2. http://csse.szu.edu.cn/staff/zhuzx/Datasets.html

Author information

Corresponding author

Correspondence to Salim Sazzed.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Sazzed, S. (2022). Feature Selection in Gene Expression Profile Employing Relevancy and Redundancy Measures and Binary Whale Optimization Algorithm (BWOA). In: Li, B., et al. Advanced Data Mining and Applications. ADMA 2022. Lecture Notes in Computer Science, vol 13087. Springer, Cham. https://doi.org/10.1007/978-3-030-95405-5_4

  • DOI: https://doi.org/10.1007/978-3-030-95405-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95404-8

  • Online ISBN: 978-3-030-95405-5

  • eBook Packages: Computer Science, Computer Science (R0)
