
Machine Learning, Volume 46, Issue 1–3, pp 315–349

Feasible Direction Decomposition Algorithms for Training Support Vector Machines

  • Pavel Laskov

Abstract

The article presents a general view of a class of decomposition algorithms for training Support Vector Machines (SVM) that are motivated by the method of feasible directions. The first such algorithm for the pattern recognition SVM was proposed by Joachims (1999, in Schölkopf et al. (Eds.), Advances in kernel methods-Support vector learning (pp. 169–184), MIT Press). Its extension to the regression SVM, the maximal inconsistency algorithm, was recently presented by the author (Laskov, 2000, in Solla, Leen, & Müller (Eds.), Advances in neural information processing systems 12 (pp. 484–490), MIT Press). A detailed account of both algorithms is given, complemented by a theoretical investigation of the relationship between them. It is proved that the two algorithms are equivalent for the pattern recognition SVM, and a feasible direction interpretation of the maximal inconsistency algorithm is given for the regression SVM. The experimental results demonstrate an order of magnitude decrease in training time in comparison with training without decomposition and, most importantly, provide experimental evidence of the linear convergence rate of the feasible direction decomposition algorithms.

support vector machines, training, decomposition algorithms, methods of feasible directions, working set selection
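
To make the working set selection discussed in the abstract concrete, the following is a minimal sketch, not the paper's implementation, of feasible-direction working set selection for the pattern recognition SVM with a working set of size two (the maximal violating pair rule). The function name, the sign convention for the gradient, and the tolerance are assumptions introduced here for illustration.

import numpy as np

def select_working_set(alpha, grad, y, C, tol=1e-3):
    """Sketch of feasible-direction working set selection for q = 2.

    Assumed (hypothetical) problem form:
        min  f(alpha) = 0.5 * alpha' Q alpha - sum(alpha)
        s.t. y' alpha = 0,  0 <= alpha_i <= C,
    with grad = Q @ alpha - 1 and labels y in {-1, +1}.
    Returns the pair (i, j) defining the steepest feasible descent
    direction with two nonzero components, or None at (approximate)
    optimality.
    """
    # Variables free to move up (+) or down (-) without leaving the box.
    up = ((y > 0) & (alpha < C)) | ((y < 0) & (alpha > 0))
    down = ((y > 0) & (alpha > 0)) | ((y < 0) & (alpha < C))

    v = -y * grad                          # per-variable optimality score
    i = np.where(up)[0][np.argmax(v[up])]      # most violated "up" variable
    j = np.where(down)[0][np.argmin(v[down])]  # most violated "down" variable

    if v[i] - v[j] < tol:                  # no violating pair: KKT conditions hold
        return None
    return i, j

For larger working sets the same idea generalizes by sorting the scores and taking the extreme feasible entries, which is how decomposition methods of this type choose the variables to optimize at each iteration.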

References

  1. Chang, C.-C., Hsu, C.-W., & Lin, C.-J. (1999). The analysis of decomposition methods for support vector machines. In Proceedings of the International Joint Conference on Artificial Intelligence, SVM Workshop.
  2. Collobert, R. & Bengio, S. (2000). Support vector machines for large-scale regression problems. Technical Report IDIAP-RR 00-17, IDIAP.
  3. Cristianini, N. & Shawe-Taylor, J. (2000). An introduction to support vector machines. Cambridge, UK: Cambridge University Press.
  4. Hsu, C.-W. & Lin, C.-J. (1999). A simple decomposition method for support vector machines. Technical report, National Taiwan University.
  5. Joachims, T. (1999). Making large-scale support vector machine learning practical. In Schölkopf et al. (Eds.), Advances in kernel methods-Support vector learning (pp. 169-184). Cambridge, MA: MIT Press.
  6. Kaufmann, L. (1999). Solving the quadratic problem arising in support vector classification. In Schölkopf et al. (Eds.), Advances in kernel methods-Support vector learning (pp. 147-168). Cambridge, MA: MIT Press.
  7. Keerthi, S. S., Shevade, S. K., Bhattacharyya, C., & Murthy, K. R. K. (1999). Improvements to Platt's SMO algorithm for SVM classifier design. Technical Report CD-99-14, National University of Singapore.
  8. Laskov, P. (2000). An improved decomposition algorithm for regression support vector machines. In S. Solla, T. Leen, & K.-R. Müller (Eds.), Advances in neural information processing systems 12 (pp. 484-490). Cambridge, MA: MIT Press.
  9. Mangasarian, O. L. (1969). Nonlinear programming. New York: McGraw-Hill.
  10. Mangasarian, O. L. & Musicant, D. R. (1999a). Massive support vector regression. Technical Report 99-02, Data Mining Institute, University of Wisconsin.
  11. Mangasarian, O. L. & Musicant, D. R. (1999b). Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks, 10:5, 100-106.
  12. Nocedal, J. & Wright, S. J. (1999). Numerical optimization. New York, NY: Springer-Verlag.
  13. Osuna, E. (1998). Support vector machines: Training and applications. Ph.D. Thesis, Massachusetts Institute of Technology.
  14. Osuna, E., Freund, R., & Girosi, F. (1997). An improved training algorithm for support vector machines. In Proceedings of the 1997 IEEE Workshop on Neural Networks for Signal Processing.
  15. Platt, J. C. (1999). Fast training of support vector machines using sequential minimal optimization. In Schölkopf et al. (Eds.), Advances in kernel methods-Support vector learning (pp. 185-208). Cambridge, MA: MIT Press.
  16. Schölkopf, B. (1997). Support vector learning. Ph.D. Thesis, Universität Tübingen.
  17. Schölkopf, B., Burges, C., & Smola, A. (Eds.) (1999). Advances in kernel methods-Support vector learning. Cambridge, MA: MIT Press.
  18. Shevade, S. K., Keerthi, S. S., Bhattacharyya, C., & Murthy, K. R. K. (1999). Improvements to SMO algorithm for SVM regression. Technical Report CD-99-16, National University of Singapore.
  19. Smola, A. (1998). Learning with kernels. Ph.D. Thesis, Technical University of Berlin.
  20. Smola, A. & Schölkopf, B. (1998). A tutorial on support vector regression. Technical Report NC2-TR-1998-030, NeuroCOLT2.
  21. Vapnik, V. N. (1982). Estimation of dependences based on empirical data. New York, NY: Springer-Verlag.
  22. Vapnik, V. N. (1995). The nature of statistical learning theory. New York, NY: Springer-Verlag.
  23. Vapnik, V. N. (1999). Statistical learning theory. New York, NY: J. Wiley and Sons.
  24. Zoutendijk, G. (1960). Methods of feasible directions. Amsterdam: Elsevier.

Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Pavel Laskov
  1. Department of Computer and Information Sciences, University of Delaware, Newark, USA
