A Hybrid Method for Speeding SVM Training

  • Zhi-Qiang Zeng
  • Ji Gao
  • Hang Guo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4032)

Abstract

The support vector machine (SVM) is a well-known method for pattern recognition and machine learning. However, training an SVM is very costly in time and memory when the data set is large. In contrast, the SVM decision function is fully determined by a small subset of the training data, called support vectors. Therefore, removing training samples that are not relevant to the support vectors might have no effect on building the proper decision function. In this paper, an effective hybrid method is proposed to remove from the training set the data that are irrelevant to the final decision function, so that the number of vectors used for SVM training becomes small and the training time can be decreased greatly. Experimental results show that a significant amount of training data can be discarded by our method without compromising the generalization capability of the SVM.
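The observation that the decision function depends only on the support vectors can be illustrated with a toy case. The sketch below is not the hybrid method of the paper, whose details are not given in this abstract. It assumes a one-dimensional, linearly separable problem in which every negative sample lies to the left of every positive sample, so the hard-margin boundary is simply the midpoint between the two closest opposite-class points. The function names `max_margin_threshold` and `drop_far_points`, the `keep_per_class` parameter, and the data are hypothetical and chosen only for illustration.

```python
def max_margin_threshold(samples):
    """Hard-margin linear SVM in 1-D, assuming negatives lie left of positives.

    Returns the decision threshold and the two support vectors that define it.
    """
    neg = [x for x, y in samples if y == -1]
    pos = [x for x, y in samples if y == +1]
    left, right = max(neg), min(pos)  # closest opposite-class points
    if left >= right:
        raise ValueError("classes are not separable in the assumed order")
    return (left + right) / 2.0, (left, right)


def drop_far_points(samples, keep_per_class=2):
    """Hypothetical pre-filter: keep only the points of each class that lie
    nearest to the other class, and discard the rest before training."""
    neg = sorted((x for x, y in samples if y == -1), reverse=True)[:keep_per_class]
    pos = sorted(x for x, y in samples if y == +1)[:keep_per_class]
    return [(x, -1) for x in neg] + [(x, +1) for x in pos]


# Illustrative data: (feature value, class label)
data = [(-5.0, -1), (-3.0, -1), (-1.0, -1), (-0.5, -1),
        (1.5, +1), (2.0, +1), (4.0, +1), (6.0, +1)]

full_threshold, full_svs = max_margin_threshold(data)
reduced = drop_far_points(data)
reduced_threshold, reduced_svs = max_margin_threshold(reduced)

print(len(data), len(reduced))            # training set size before and after
print(full_threshold, reduced_threshold)  # the boundary is unchanged
```

Here half of the samples are discarded and the boundary stays the same, because the discarded points were never support vectors. In higher dimensions and with kernels, deciding which points are safe to drop is the hard part, which is the problem the paper addresses.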

Keywords

Support Vector Machine; Support Vector; Decision Function; Decision Boundary; Circle Region
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Zhi-Qiang Zeng (1)
  • Ji Gao (1)
  • Hang Guo (1)
  1. Department of Computer Science and Engineering, Zhejiang University, Hangzhou, China
