An Efficient Support Vector Machine Learning Method with Second-Order Cone Programming for Large-Scale Problems
Abstract
In this paper we propose a new fast learning algorithm for the support vector machine (SVM). The method is based on second-order cone programming (SOCP): we reformulate the SVM's quadratic programming (QP) problem as an SOCP problem. The reformulation requires decomposing (factorizing) the kernel matrix of the SVM's optimization problem, and the decomposed matrix is then used in the new optimization problem. Since the kernel matrix is positive semidefinite, the dimension of the decomposed matrix can be reduced by the decomposition (factorization) method. The performance of the proposed method depends on this dimension. Experimental results show that the proposed method is much faster than the quadratic programming solver LOQO when the dimension of the decomposed matrix is small compared to that of the kernel matrix. The proposed method is also faster than the method proposed in (S. Fine and K. Scheinberg, 2001) for both low-rank and full-rank kernel matrices. Working set selection is an important issue in the SVM decomposition (chunking) method, so we also modify Hsu and Lin's working set selection approach to handle large working sets. The modified approach leads to faster convergence.
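To make the role of the decomposed matrix concrete, the following is a minimal sketch and not the paper's implementation. It assumes the standard SVM dual, minimize ½ αᵀQα − eᵀα subject to yᵀα = 0 and 0 ≤ α ≤ C, with Q = diag(y) K diag(y). If the kernel matrix is factorized as K ≈ VVᵀ with V of size n × r, then αᵀQα = ‖Vᵀ(y∘α)‖², so the quadratic term can be bounded by an auxiliary variable t through a cone constraint such as ‖Vᵀ(y∘α)‖² ≤ 2t while minimizing t − eᵀα. The exact formulation in the paper may differ. The function names, the RBF kernel, the synthetic data, and the tolerance `rel_tol` are all illustrative assumptions; the factorization here uses an eigenvalue decomposition, one of the methods named in the keywords.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def low_rank_factor(K, rel_tol=1e-10):
    """Return V (n x r) with K ~= V @ V.T via eigenvalue decomposition.

    Eigenvalues at or below rel_tol * (largest eigenvalue) are dropped, so r <= n.
    rel_tol is an illustrative choice, not a value taken from the paper.
    """
    w, U = np.linalg.eigh(K)          # eigenvalues in ascending order
    keep = w > rel_tol * w[-1]
    return U[:, keep] * np.sqrt(w[keep])


def dual_quadratic_full(K, y, alpha):
    """alpha^T Q alpha with Q = diag(y) K diag(y), using the full n x n matrix."""
    z = y * alpha
    return float(z @ K @ z)


def dual_quadratic_factored(V, y, alpha):
    """||V^T (y * alpha)||^2, the quantity a cone constraint would bound."""
    u = V.T @ (y * alpha)
    return float(u @ u)


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
alpha = rng.uniform(0.0, 1.0, size=n)   # arbitrary point, not an optimal solution

K = rbf_kernel(X, gamma=0.05)
V = low_rank_factor(K)
print("n =", n, "rank kept r =", V.shape[1])
print("full     :", dual_quadratic_full(K, y, alpha))
print("factored :", dual_quadratic_factored(V, y, alpha))
```

The two printed values should agree up to the truncation tolerance. The factored form works with an r-dimensional vector instead of the n × n matrix, which is consistent with the abstract's remark that performance depends on the dimension of the decomposed matrix.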
Keywords
second-order cone programming, quadratic programming, Cholesky factorization, eigenvalue decomposition, support vector machine
References
- S. Fine and K. Scheinberg, “Efficient SVM training using low-rank kernel representations,” Journal of Machine Learning Research, vol. 2, pp. 243–264, 2001.
- C. Campbell and N. Cristianini, “Simple learning algorithms for training support vector machines,” Technical report, University of Bristol, 1998.
- V.N. Vapnik, Statistical Learning Theory, Wiley: New York, 1998.
- R.J. Vanderbei, “LOQO: An interior point code for quadratic programming,” Technical report SOR 94-15, Princeton University, 1994.
- T. Joachims, “Making large-scale support vector machine learning practical,” in Advances in Kernel Methods: Support Vector Learning, edited by B. Schölkopf, C. Burges, and A. Smola, MIT Press: Cambridge, MA, 1998, pp. 169–184.
- E. Osuna, R. Freund, and F. Girosi, “An improved training algorithm for support vector machines,” in Proc. of IEEE'97, FL, 1997.
- J. Platt, “Fast training of support vector machines using sequential minimal optimization,” in Advances in Kernel Methods: Support Vector Learning, edited by B. Schölkopf, C. Burges, and A. Smola, MIT Press: Cambridge, MA, 1998, pp. 185–208.
- C.-W. Hsu and C.-J. Lin, “A simple decomposition method for support vector machines,” Machine Learning, vol. 46, pp. 291–314, 2002.
- P. Laskov, “An improved decomposition algorithm for regression support vector machines,” Machine Learning, vol. 46, pp. 315–350, 2002.
- S.S. Keerthi, S. Shevade, C. Bhattacharyya, and K. Murthy, “Improvements to Platt's SMO algorithm for SVM classifier design,” Neural Computation, vol. 13, no. 3, pp. 637–649, 2001.
- C.-C. Chang and C.-J. Lin, “Training ν-support vector classifiers: Theory and algorithm,” Neural Computation, vol. 13, no. 9, pp. 2119–2147, 2001.
- R. Collobert and S. Bengio, “SVMTorch: A support vector machine for large-scale regression and classification problems,” Journal of Machine Learning Research, vol. 1, pp. 143–160, 2001. Available at http://www.idiap.ch/learning/SVMTorch.html
- C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
- C.-J. Lin, “On the convergence of the decomposition method for support vector machines,” IEEE Trans. Neural Networks, vol. 12, pp. 1288–1298, 2001.
- G.R.G. Lanckriet, N. Cristianini, P.L. Bartlett, L. El Ghaoui, and M.I. Jordan, “Learning the kernel matrix with semidefinite programming,” Journal of Machine Learning Research, vol. 5, pp. 27–72, 2004.
- R.D.C. Monteiro and T. Tsuchiya, “Polynomial convergence of primal-dual algorithms for the second-order cone programming based on the MZ-family of directions,” Math. Program., vol. 88, pp. 61–83, 2000.
- A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. MPS-SIAM Series on Optimization: Philadelphia, 2001.
- M. Muramatsu, “On a commutative class of search directions for linear programming over symmetric cones,” Journal of Optimization Theory and Applications, vol. 112, no. 3, pp. 595–625, 2002.
- R. Debnath, M. Muramatsu, and H. Takahashi, “The support vector machine learning using second order cone programming,” in Proc. IEEE Int. Joint Conference on Neural Networks, Budapest, Hungary, 25–29 July, 2004, pp. 2991–2996.
- R.D.C. Monteiro, “Primal-dual path following algorithms for semidefinite programming,” SIAM Journal on Optimization, vol. 7, pp. 663–678, 1997.
- C. Helmberg, F. Rendl, R.J. Vanderbei, and H. Wolkowicz, “An interior-point method for semidefinite programming,” SIAM Journal on Optimization, vol. 6, pp. 342–361, 1996.
- M. Kojima, S. Shindoh, and S. Hara, “Interior-point methods for the monotone linear complementarity problem in symmetric matrices,” SIAM Journal on Optimization, vol. 7, pp. 86–125, 1997.
- E.D. Andersen, C. Roos, and T. Terlaky, “On implementing a primal-dual interior-point method for conic quadratic optimization,” Math. Programming Ser. B, vol. 95, pp. 249–277, 2003.
- Z. Cai and K.-C. Toh, “Solving second order cone programming via the augmented systems.” [online] Available: http://www.optimization-online.org/DB_HTML/2002/08/517.html
- S. Mehrotra, “On implementation of a primal-dual interior point method,” SIAM Journal on Optimization, vol. 2, no. 4, pp. 575–601, 1992.
- C.L. Blake and C.J. Merz, “UCI repository of machine learning databases,” Univ. California, Dept. Inform. Comp. Sc., Irvine, CA 1998. [online] Available: http://www.ics.uci.edu/~mlearn/MLRepository.html.
- G. Rätsch, Benchmark data sets. Available at http://www.first.gmd.de/~raetsch/data/benchmarks.htm.
- R. Debnath and H. Takahashi, “An improved working set selection method for SVM decomposition method,” in Proc. IEEE Int. Conference Intelligence Systems, Varna, Bulgaria, 21–24, 2004, pp. 520–523.
- C. Saunders, M.O. Stitson, J. Weston, L. Bottou, B. Schölkopf, and A. Smola, “Support vector machine reference manual,” Technical Report CSD-TR-98-03, Royal Holloway, University of London, Egham, UK, 1998.
- T. Joachims, Department of Computer Science, Cornell University, personal communication, 2003.
- W. Press, W. Vetterling, S. Teukolsky, and B. Flannery, Numerical Recipes in C (The Art of Scientific Computing), 2nd ed. Cambridge University Press, 1992.
- G.H. Golub and C.F. Van Loan, Matrix Computations, 2nd ed. Johns Hopkins University Press, 1989.
- M.S. Bazaraa and C.M. Shetty, Nonlinear Programming: Theory and Algorithms, Wiley: New York, 1979.
- J. Werner, Optimization-Theory and Applications, Vieweg, 1984.