A Hybrid Method for Speeding SVM Training
The support vector machine (SVM) is a well-known method for pattern recognition and machine learning. However, training an SVM is very costly in time and memory when the data set is large. At the same time, the SVM decision function is fully determined by a small subset of the training data, called support vectors. Therefore, removing training samples that are not relevant to the support vectors may have no effect on building the proper decision function. In this paper, an effective hybrid method is proposed to remove from the training set the data that are irrelevant to the final decision function, so that the number of vectors used for SVM training becomes small and the training time decreases greatly. Experimental results show that a significant amount of training data can be discarded by our method without compromising the generalization capability of the SVM.
Keywords: Support Vector Machine, Support Vector, Decision Function, Decision Boundary, Circle Region
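The observation that the decision function depends only on the support vectors can be illustrated with a toy case. The sketch below is not the hybrid method proposed in the paper; it assumes one-dimensional, linearly separable data with the negative class to the left of the positive class, where the hard-margin SVM threshold is simply the midpoint between the two closest opposite-class points. The function name `max_margin_threshold` and the sample data are hypothetical, chosen only for illustration.

```python
def max_margin_threshold(xs, ys):
    """Hard-margin SVM threshold for 1-D separable data (illustrative assumption:
    every sample labelled -1 lies to the left of every sample labelled +1).

    Returns the decision threshold and the two support vectors.
    """
    neg = max(x for x, y in zip(xs, ys) if y == -1)  # rightmost negative sample
    pos = min(x for x, y in zip(xs, ys) if y == +1)  # leftmost positive sample
    if neg >= pos:
        raise ValueError("classes are not separable in this orientation")
    return (neg + pos) / 2.0, (neg, pos)


# Hypothetical training set: four negatives followed by four positives.
xs = [0.0, 0.5, 1.0, 2.0, 4.0, 5.0, 6.5, 7.0]
ys = [-1, -1, -1, -1, 1, 1, 1, 1]

full_threshold, support_vectors = max_margin_threshold(xs, ys)

# Discard every sample that is not a support vector, then retrain.
reduced = [(x, y) for x, y in zip(xs, ys) if x in support_vectors]
reduced_threshold, _ = max_margin_threshold(
    [x for x, _ in reduced], [y for _, y in reduced]
)

print(full_threshold, reduced_threshold, len(reduced))
```

In this toy setting, six of the eight samples can be dropped without moving the threshold. In practice the support vectors are not known before training, which is why a data-reduction step has to estimate which samples lie near the decision boundary.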