Gene Selection for Microarray Cancer Classification based on Manta Rays Foraging Optimization and Support Vector Machines

  • Research Article-Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

In DNA microarray applications, many techniques have been proposed for cancer classification, either to distinguish normal from cancerous samples or to classify different types of cancer. Gene selection is usually required as a preliminary step in a cancer classification problem. This step aims to select the most informative genes from a very large number of genes, which is an important problem in its own right. Although many studies have addressed this problem, they fall short of finding the smallest set of the most informative genes with the highest accuracy and little effort, given the high dimensionality of microarray datasets. Manta ray foraging optimization (MRFO) is a new meta-heuristic algorithm that mimics the food-foraging behavior of manta rays. MRFO has achieved promising results in other fields, such as solar generating units. Because of its high accuracy, the support vector machine (SVM) is the most commonly used classification algorithm in cancer studies, especially with microarray data. To exploit the strengths of both algorithms (i.e., MRFO and SVM), this paper proposes a hybrid algorithm that selects the most predictive and informative genes for cancer classification. Binary-class microarray datasets (colon and leukemia1) and multi-class microarray datasets (SRBCT, lymphoma, and leukemia2) are used to evaluate the accuracy of the proposed technique. Like other optimization techniques, MRFO suffers from problems related to the high dimensionality and complexity of microarray data. To address these problems and improve performance, the minimum redundancy maximum relevance (mRMR) method is used as a preprocessing stage. The proposed technique has been evaluated against the most common cancer classification algorithms. The experimental results show that the proposed technique achieves the highest accuracy with the fewest informative genes and little effort.
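The abstract names mRMR as the preprocessing stage but does not give its details on this page. As an illustration only, the sketch below shows one common greedy form of mRMR, assuming discretized expression values and the mutual-information difference criterion (relevance to the class minus mean redundancy with genes already chosen). The function names `mutual_information` and `mrmr_select` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(genes, labels, k):
    """Greedily pick up to k gene indices by relevance minus mean redundancy.

    Assumption: this is the "difference" variant of mRMR; the paper's exact
    variant is not stated in the abstract.
    """
    relevance = [mutual_information(g, labels) for g in genes]
    # Start with the single most relevant gene.
    selected = [max(range(len(genes)), key=relevance.__getitem__)]
    while len(selected) < min(k, len(genes)):
        def score(j):
            redundancy = sum(
                mutual_information(genes[j], genes[s]) for s in selected
            )
            return relevance[j] - redundancy / len(selected)

        candidates = [j for j in range(len(genes)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data (made up): rows are genes, columns are samples,
# values are discretized expression levels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
genes = [
    [0, 0, 0, 1, 1, 1, 1, 1],  # gene 0: tracks the class with one error
    [0, 0, 0, 1, 1, 1, 1, 1],  # gene 1: exact copy of gene 0 (redundant)
    [1, 0, 0, 0, 1, 1, 1, 0],  # gene 2: weaker, but errs on other samples
    [0, 1, 0, 1, 0, 1, 0, 1],  # gene 3: unrelated to the class
]
top = mrmr_select(genes, labels, 2)
print(top)
```

On this toy input the filter keeps gene 0 and then gene 2, skipping the duplicate gene 1 even though it is individually more relevant than gene 2. In a pipeline like the one the abstract describes, a reduced gene list of this kind would be what the MRFO search and the SVM classifier then operate on.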



Author information

Corresponding author

Correspondence to Mustafa M. Al-Sayed.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Houssein, E.H., Hassan, H.N., Al-Sayed, M.M. et al. Gene Selection for Microarray Cancer Classification based on Manta Rays Foraging Optimization and Support Vector Machines. Arab J Sci Eng 47, 2555–2572 (2022). https://doi.org/10.1007/s13369-021-06102-8

  • DOI: https://doi.org/10.1007/s13369-021-06102-8
