An Efficient Alternative to SVM Based Recursive Feature Elimination with Applications in Natural Language Processing and Bioinformatics

  • Justin Bedo
  • Conrad Sanderson
  • Adam Kowalczyk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)


The SVM-based Recursive Feature Elimination (RFE-SVM) algorithm is a popular technique for feature selection, used in natural language processing and bioinformatics. Recently it was demonstrated that a small regularisation constant C can considerably improve the performance of RFE-SVM on microarray datasets. In this paper we show that further improvements are possible if the explicitly computable limit C → 0 is used. We prove that in this limit most forms of SVM and ridge regression classifiers, scaled by the factor \(\frac{1}{C}\), converge to a centroid classifier. As this classifier can be used directly for feature ranking, in the limit we can avoid the computationally demanding recursion and convex optimisation in RFE-SVM. Comparisons on two text-based author verification tasks and on three genomic microarray classification tasks indicate that this straightforward method obtains comparable (at times superior) performance and is about an order of magnitude faster.
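The feature ranking described above can be illustrated with a minimal sketch. It assumes the centroid classifier takes the common form in which the weight vector is the difference between the two class means, and that features are ranked by the absolute value of their weight; the abstract does not state either detail, so both are assumptions. The function name `centroid_feature_ranking` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def centroid_feature_ranking(X, y):
    """Rank features using a centroid classifier (illustrative sketch).

    Assumes labels are +1 / -1, the weight vector is the difference of
    the class means, and ranking is by absolute weight (an assumption).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos_centroid = X[y == 1].mean(axis=0)    # mean of the positive class
    neg_centroid = X[y == -1].mean(axis=0)   # mean of the negative class
    w = pos_centroid - neg_centroid          # centroid classifier weights
    # Bias places the decision boundary midway between the two centroids.
    b = -0.5 * w @ (pos_centroid + neg_centroid)
    # Most important feature first; no recursion or optimisation needed.
    ranking = np.argsort(-np.abs(w), kind="stable")
    return w, b, ranking

# Toy data: feature 0 separates the classes, features 1 and 2 barely do.
X = np.array([[ 2.0, 0.1, 1.0],
              [ 3.0, 0.0, 1.2],
              [-2.0, 0.2, 0.8],
              [-3.0, 0.1, 1.0]])
y = np.array([1, 1, -1, -1])

w, b, ranking = centroid_feature_ranking(X, y)
predictions = np.sign(X @ w + b)
print("weights:", w)
print("ranking:", ranking)
```

The whole ranking is obtained from two class means in a single pass over the data, which is the reason no recursive retraining is required in this sketch.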


Support Vector Machine, Feature Selection, Linear Discriminant Analysis, Natural Language Processing, Microarray Dataset
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Justin Bedo (1, 2)
  • Conrad Sanderson (1, 2)
  • Adam Kowalczyk (1, 2, 3)
  1. Australian National University, Australia
  2. National ICT Australia (NICTA), Australia
  3. Dept. Electrical & Electronic Eng., University of Melbourne, Australia
