Training Data Selection for Support Vector Machines

  • Jigang Wang
  • Predrag Neskovic
  • Leon N. Cooper
Conference paper

DOI: 10.1007/11539087_71

Part of the Lecture Notes in Computer Science book series (LNCS, volume 3610)
Cite this paper as:
Wang J., Neskovic P., Cooper L.N. (2005) Training Data Selection for Support Vector Machines. In: Wang L., Chen K., Ong Y.S. (eds) Advances in Natural Computation. ICNC 2005. Lecture Notes in Computer Science, vol 3610. Springer, Berlin, Heidelberg

Abstract

In recent years, support vector machines (SVMs) have become a popular tool for pattern recognition and machine learning. Training an SVM involves solving a constrained quadratic programming problem, which requires large amounts of memory and training time for large-scale problems. In contrast, the SVM decision function is fully determined by a small subset of the training data, called support vectors. Therefore, it is desirable to remove from the training set the data that is irrelevant to the final decision function. In this paper we propose two new methods that select a subset of data for SVM training. Using real-world datasets, we compare the effectiveness of the proposed data selection strategies in terms of their ability to reduce the training set size while maintaining the generalization performance of the resulting SVM classifiers. Our experimental results show that a significant amount of training data can be removed by our proposed methods without degrading the performance of the resulting SVM classifiers.
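
For reference, the two observations in the abstract can be made concrete with the standard soft-margin SVM formulation. The notation below is the usual textbook one and is an assumption here, not taken from the paper: training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, kernel $K$, penalty parameter $C$, dual variables $\alpha_i$, and bias $b$.

```latex
% Dual quadratic program solved during training (n = number of training points).
% The kernel matrix K(x_i, x_j) has n^2 entries, which is the source of the
% memory and time cost on large training sets.
\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j) \\
\text{subject to}\quad & 0 \le \alpha_i \le C, \quad i = 1, \dots, n, \\
 & \sum_{i=1}^{n} \alpha_i y_i = 0 .
\end{aligned}

% Resulting decision function. Only points with \alpha_i > 0 (the support
% vectors, index set SV) contribute to the sum.
f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right)
     = \operatorname{sgn}\!\left( \sum_{i \in SV} \alpha_i y_i K(x_i, x) + b \right),
\qquad SV = \{\, i : \alpha_i > 0 \,\}.
```

Any training point with $\alpha_i = 0$ drops out of $f(x)$, which is why removing such points before training leaves the decision function unchanged, provided they can be identified in advance.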

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jigang Wang (1)
  • Predrag Neskovic (1)
  • Leon N. Cooper (1)
  1. Institute for Brain and Neural Systems, Physics Department, Brown University, Providence, USA
