Gene selection using hybrid particle swarm optimization and genetic algorithm

  • Original Paper
  • Soft Computing

Abstract

Selecting highly discriminative genes from gene expression data has become an important research topic. Filtering out the large number of noisy, redundant genes can improve the performance of cancer classification and can also cut the cost of medical diagnosis. In this paper, a hybrid Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) method is used for gene selection, and a Support Vector Machine (SVM) is adopted as the classifier. The proposed approach is tested on three benchmark gene expression datasets: leukemia, colon, and breast cancer. Experimental results show that the proposed method can reduce the dimensionality of the dataset, identify the most informative gene subset, and improve classification accuracy.
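The abstract does not give the details of the hybrid scheme, so the following is only a minimal sketch of one way a binary PSO step can be combined with GA crossover and mutation for gene-subset search. Everything specific here is an assumption made for illustration: the synthetic data from `make_toy_data`, the inertia and acceleration constants, the mutation rate, the per-gene `penalty`, and the order of the PSO and GA steps. To keep the block dependency-free, a leave-one-out nearest-centroid classifier stands in for the SVM used in the paper.

```python
import math
import random

def make_toy_data(n_samples=20, n_genes=12, informative=(0, 1), seed=0):
    """Synthetic two-class expression matrix; only `informative` genes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label  # shift informative genes for class 1
        X.append(row)
        y.append(label)
    return X, y

def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    genes = [g for g, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]

        def dist(label):
            return sum((X[i][g] - m) ** 2 for g, m in zip(genes, centroids[label]))

        correct += min(centroids, key=dist) == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small cost per selected gene (the penalty weight is assumed)."""
    return loo_accuracy(X, y, mask) - penalty * sum(mask)

def hybrid_pso_ga(X, y, n_particles=10, n_iter=15, seed=1):
    """Search for a gene mask; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    best = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]
    for _ in range(n_iter):
        for k in range(n_particles):
            # Binary PSO step: update velocity, then sample each bit via a sigmoid.
            for d in range(n_genes):
                v = (0.7 * vel[k][d]
                     + 1.5 * rng.random() * (pbest[k][d] - pos[k][d])
                     + 1.5 * rng.random() * (gbest[d] - pos[k][d]))
                vel[k][d] = max(-4.0, min(4.0, v))
                prob = 1.0 / (1.0 + math.exp(-vel[k][d]))
                pos[k][d] = 1 if rng.random() < prob else 0
            # GA step: one-point crossover with a random mate, then bit-flip mutation.
            mate = pos[rng.randrange(n_particles)]
            cut = rng.randrange(1, n_genes)
            child = pos[k][:cut] + mate[cut:]
            child = [1 - b if rng.random() < 0.05 else b for b in child]
            if fitness(X, y, child) > fitness(X, y, pos[k]):
                pos[k] = child  # keep the offspring only if it improves the particle
            f = fitness(X, y, pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit

if __name__ == "__main__":
    X, y = make_toy_data()
    mask, score = hybrid_pso_ga(X, y)
    print("selected genes:", [g for g, bit in enumerate(mask) if bit])
    print("fitness:", round(score, 3))
```

Each particle is a 0/1 mask over genes, so the subset size is not fixed in advance; the per-gene penalty is what pushes the search toward small subsets. On real expression data the stand-in classifier would be replaced by an SVM evaluated with cross-validation.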


Author information

Corresponding author

Correspondence to Shutao Li.

About this article

Cite this article

Li, S., Wu, X. & Tan, M. Gene selection using hybrid particle swarm optimization and genetic algorithm. Soft Comput 12, 1039–1048 (2008). https://doi.org/10.1007/s00500-007-0272-x

  • DOI: https://doi.org/10.1007/s00500-007-0272-x