
Exploiting separability in large-scale linear support vector machine training

Computational Optimization and Applications

Abstract

Linear support vector machine training can be represented as a large quadratic program. We present an efficient and numerically stable interior point algorithm for this problem, requiring only \(\mathcal{O}(n)\) operations per iteration. By exploiting the separability of the Hessian, we provide a unified approach, from an optimization perspective, to 1-norm classification, 2-norm classification, universum classification, ordinal regression and ε-insensitive regression. Our approach has the added advantage of obtaining the hyperplane weights and bias directly from the solver. Numerical experiments indicate that, in contrast to existing methods, the algorithm is largely unaffected by noisy data, and show that the training times of our implementation are consistent and highly competitive. We also discuss the effect of using multiple correctors, and of monitoring the angle of the normal to the hyperplane as a termination criterion.
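To make the separability claim concrete, the following is a minimal sketch, in our own notation, of the kind of reformulation the abstract refers to for the 1-norm (hinge-loss) case; the notation and the elimination argument are our illustration under standard linear-SVM assumptions, not a quotation of the paper's formulation.

% X in R^{n x m}: n training points with m features; Y = diag(y) with
% labels y_i in {-1,+1}; e: vector of ones; C > 0: penalty parameter.
\begin{align*}
\text{(dense dual)} \quad
  & \min_{z} \; \tfrac{1}{2}\, z^{\top} Y X X^{\top} Y z - e^{\top} z
  \quad \text{s.t. } y^{\top} z = 0, \; 0 \le z \le C e, \\
\text{(separable)} \quad
  & \min_{z,\, w} \; \tfrac{1}{2}\, w^{\top} w - e^{\top} z
  \quad \text{s.t. } w = X^{\top} Y z, \; y^{\top} z = 0, \; 0 \le z \le C e.
\end{align*}
% The first problem's Hessian Y X X^T Y is dense and n x n; the second
% problem's Hessian is diag(0_n, I_m), i.e. separable. In an interior
% point method the n positive diagonal entries contributed by the barrier
% terms can be eliminated cheaply, leaving a system of dimension roughly
% m + 1 per iteration, so the per-iteration cost is O(n m^2): linear in n
% for fixed feature dimension m. The hyperplane normal w appears as a
% variable, and the bias is the multiplier of y^T z = 0, so both are read
% directly from the solver, consistent with the abstract's claim.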



Author information

Corresponding author

Correspondence to Kristian Woodsend.


About this article

Cite this article

Woodsend, K., Gondzio, J. Exploiting separability in large-scale linear support vector machine training. Comput Optim Appl 49, 241–269 (2011). https://doi.org/10.1007/s10589-009-9296-8
