Combining Support Vector Machines and the t-statistic for Gene Selection in DNA Microarray Data Analysis

  • Tao Yang
  • Vojislave Kecman
  • Longbing Cao
  • Chengqi Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6119)


This paper proposes a new gene selection (feature selection) method for DNA microarray data analysis that efficiently combines the t-statistic with support vector machines. The resulting method measures the relevance of a gene in a DNA microarray using both data-intrinsic information and the performance of the learning algorithm. We explain why and how the proposed method works well. Experimental results on two benchmark microarray data sets show that the proposed method is competitive with previous methods. The proposed method can also be applied to other feature selection problems.
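
The abstract does not state the exact ranking criterion, so the following is only a minimal sketch of how a t-statistic and a linear SVM could be combined for gene ranking. It assumes, purely for illustration, that each gene is scored by the product |t_j| · w_j² (the absolute t-statistic times the squared SVM weight) and that the lowest-scoring gene is removed recursively. The function names, the toy data, and the hand-written subgradient SVM trainer are all hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def t_statistics(X, y):
    """Welch two-sample t-statistic of each column (gene); labels are +1 / -1."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == -1]
        se = math.sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
        scores.append((mean(pos) - mean(neg)) / se if se > 0 else 0.0)
    return scores


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Linear SVM fitted by full-batch subgradient descent on the
    L2-regularised hinge loss. Returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for row, label in zip(X, y):
            margin = label * (sum(wj * xj for wj, xj in zip(w, row)) + b)
            if margin < 1:  # sample lies inside or on the wrong side of the margin
                for j in range(d):
                    grad_w[j] -= label * row[j] / n
                grad_b -= label / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b


def rank_genes(X, y):
    """Recursively drop the gene with the smallest |t_j| * w_j**2
    (an assumed criterion). Returns gene indices, most relevant first."""
    t_all = t_statistics(X, y)  # data-intrinsic part, computed once
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)  # learner-dependent part
        scores = [abs(t_all[j]) * wj ** 2 for j, wj in zip(remaining, w)]
        worst = scores.index(min(scores))
        eliminated.append(remaining.pop(worst))
    return remaining + eliminated[::-1]


# Toy expression matrix: 6 samples x 4 genes; only gene 0 separates the classes.
X = [
    [5.0, 1.0, 0.2, 3.0],
    [5.5, 0.8, 0.1, 3.1],
    [4.8, 1.2, 0.3, 2.9],
    [1.0, 1.1, 0.2, 3.0],
    [1.2, 0.9, 0.3, 3.2],
    [0.9, 1.3, 0.1, 2.8],
]
y = [1, 1, 1, -1, -1, -1]

ranking = rank_genes(X, y)
print("genes, most relevant first:", ranking)
```

In this sketch the t-statistic supplies the data-intrinsic evidence and the SVM weight supplies the learner-dependent evidence, so a gene must score well on both to survive elimination. On real microarray data one would typically remove genes in chunks rather than one at a time and use an established SVM solver.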


Support Vector Machine; Gene Selection; Linear Support Vector Machine; Ranking Criterion; Recursive Feature Elimination
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Tao Yang (1)
  • Vojislave Kecman (2)
  • Longbing Cao (1)
  • Chengqi Zhang (1)

  1. Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia
  2. Department of Computer Science, Virginia Commonwealth University, Richmond, USA
