A Fast Parallel Optimization for Training Support Vector Machine
A fast SVM training algorithm for multi-class problems, consisting of parallel and sequential optimizations, is presented. The main advantage of the parallel optimization step is that it quickly removes most non-support vectors, which dramatically reduces the training time in the sequential optimization stage. In addition, strategies such as kernel caching, shrinking, and calls to BLAS functions are integrated into the algorithm to speed up training. Experiments on the MNIST handwritten digit database show that, without sacrificing generalization performance, the proposed algorithm achieves a speed-up factor of 110 compared with Keerthi et al.'s modified SMO. Moreover, for the first time, we investigated the training performance of SVM on the handwritten Chinese character database ETL9B, which has more than 3000 categories and about 500,000 training samples. The total training time is just 5.1 hours, and a raw error rate of 1.1% on ETL9B has been achieved.
Keywords: Support Vector Machine, Kernel Matrix, Radial Basis Function Kernel, Sequential Minimal Optimization, Training Support Vector Machine
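As an illustration of the kernel caching strategy mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes an RBF kernel and a least-recently-used eviction policy; the names `rbf_kernel`, `KernelRowCache`, `gamma`, and `capacity` are hypothetical and chosen only for this example.

```python
import math
from collections import OrderedDict


def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


class KernelRowCache:
    """Holds at most `capacity` kernel rows (assumed LRU eviction policy)."""

    def __init__(self, samples, gamma, capacity):
        self.samples = samples
        self.gamma = gamma
        self.capacity = capacity
        self.rows = OrderedDict()  # sample index -> list of kernel values
        self.misses = 0            # number of rows that had to be computed

    def row(self, i):
        """Return the kernel values K(x_i, x_j) for all j, reusing cached rows."""
        if i in self.rows:
            self.rows.move_to_end(i)       # mark row i as most recently used
            return self.rows[i]
        self.misses += 1
        xi = self.samples[i]
        values = [rbf_kernel(xi, xj, self.gamma) for xj in self.samples]
        self.rows[i] = values
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)  # evict the least recently used row
        return values
```

With three samples and `capacity=2`, requesting rows 0, 1, 0, 2 in that order computes three rows and evicts row 1, since row 0 was touched more recently. An SMO-style solver repeatedly needs the kernel rows of the samples it is currently updating, so reusing rows in this way avoids recomputing the most frequently accessed kernel values.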