A Hybrid Feature Selection Approach for Microarray Gene Expression Data

  • Feng Tan
  • Xuezheng Fu
  • Hao Wang
  • Yanqing Zhang
  • Anu Bourgeois
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3992)


Microarray gene expression data contain a huge number of genes but comparatively few samples, which makes accurate disease classification challenging. Feature selection techniques can improve classification accuracy by removing irrelevant and redundant genes. However, feature selection algorithms rest on different theoretical arguments, and their performance varies even when they are applied to the same data set. In this paper, we propose a hybrid approach that uses a genetic algorithm to combine the useful outcomes of different feature selection methods. The experimental results demonstrate that our approach achieves better classification accuracy with a smaller gene subset than each individual feature selection algorithm does.
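The idea of combining filter outputs through a genetic algorithm can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' published method: the synthetic data, the two filter scores (`mean_diff_score`, `signal_to_noise_score`), the nearest-centroid classifier, the size-penalized fitness, and all parameter values are hypothetical. Each filter proposes its top-ranked genes, the union forms a candidate pool, and the genetic algorithm searches bit masks over that pool.

```python
import random
import statistics

def make_data(rng, n_samples=30, n_genes=40, informative=(0, 1, 2)):
    """Toy two-class data: only the `informative` genes shift with the label."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

def split_by_class(X, y, g):
    a = [row[g] for row, lab in zip(X, y) if lab == 0]
    b = [row[g] for row, lab in zip(X, y) if lab == 1]
    return a, b

def mean_diff_score(X, y, g):
    """Filter 1 (assumed): absolute difference of class means."""
    a, b = split_by_class(X, y, g)
    return abs(sum(a) / len(a) - sum(b) / len(b))

def signal_to_noise_score(X, y, g):
    """Filter 2 (assumed): mean gap divided by the summed class deviations."""
    a, b = split_by_class(X, y, g)
    gap = abs(sum(a) / len(a) - sum(b) / len(b))
    return gap / (statistics.stdev(a) + statistics.stdev(b))

def top_k(X, y, score, k):
    return sorted(range(len(X[0])), key=lambda g: score(X, y, g), reverse=True)[:k]

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on `genes`."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dist = {}
        for lab in (0, 1):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == lab]
            dist[lab] = sum((X[i][g] - sum(r[g] for r in rows) / len(rows)) ** 2
                            for g in genes)
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)

def fitness(X, y, pool, mask, penalty=0.01):
    """Accuracy minus a small per-gene penalty, so smaller subsets are favored."""
    genes = [g for g, bit in zip(pool, mask) if bit]
    return loo_accuracy(X, y, genes) - penalty * len(genes)

def genetic_search(X, y, pool, rankings, rng, pop_size=20, generations=10, p_mut=0.1):
    score = lambda m: fitness(X, y, pool, m)
    # Seed the population with each filter's own selection, then random masks.
    pop = [[int(g in r) for g in pool] for r in rankings]
    while len(pop) < pop_size:
        pop.append([rng.randint(0, 1) for _ in pool])
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(pool))  # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ int(rng.random() < p_mut) for bit in child])
        pop = children
    return max(pop, key=score)

rng = random.Random(0)
X, y = make_data(rng)
rankings = [top_k(X, y, f, k=5) for f in (mean_diff_score, signal_to_noise_score)]
pool = sorted(set().union(*rankings))  # union of the filters' top genes
best_mask = genetic_search(X, y, pool, rankings, rng)
selected = [g for g, bit in zip(pool, best_mask) if bit]
print("candidate pool:", pool)
print("GA-selected genes:", selected)
```

Because the initial population in this sketch contains each filter's own selection and the best mask is carried over unchanged every generation, the returned subset scores at least as well as either filter alone under this particular fitness. That property belongs to the sketch's design and says nothing about the paper's reported results.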


Keywords (machine-generated, not supplied by the authors): Genetic Algorithm; Feature Selection; Feature Subset; Feature Selection Method; Feature Selection Algorithm



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Feng Tan (1)
  • Xuezheng Fu (1)
  • Hao Wang (1)
  • Yanqing Zhang (1)
  • Anu Bourgeois (1)

  1. Department of Computer Science, Georgia State University, Atlanta, USA
