Part of the book series: Advances in Pattern Recognition (ACVPR)

Abstract

In training an L1 or L2 support vector machine, we need to solve a quadratic programming problem whose number of variables equals the number of training data. The computational complexity is of the order of \(M^3\), where M is the number of training data, so when M is large, training takes a long time. Numerous methods have been proposed to speed up training. One is to extract support vector candidates from the training data and then train the support vector machine using only these data. Another is to accelerate training by decomposing the variables into a working set and a fixed set and by repeatedly solving the subproblem associated with the working set until convergence.
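
To make the second idea concrete, here is a minimal sketch of decomposition-style training, not the implementation described in this chapter. It assumes the L1 support vector machine dual without a bias term, so each working set can be a single variable whose subproblem has a closed-form, clipped update (the kernel-Adatron / SMO-without-bias variant); the function name, the RBF kernel parameter gamma, and the synthetic data below are illustrative only.

    import numpy as np

    def train_l1_svm_no_bias(X, y, C=1.0, gamma=0.5, max_epochs=100, tol=1e-4):
        """Decomposition with working sets of size one for the bias-free L1 SVM dual."""
        n = X.shape[0]
        # Precompute the RBF kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2).
        sq = np.sum(X ** 2, axis=1)
        K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
        alpha = np.zeros(n)
        for _ in range(max_epochs):
            max_change = 0.0
            for i in range(n):                # working set = {i}; all other variables stay fixed
                f_i = (alpha * y) @ K[:, i]   # current decision value for sample i
                # One-variable Newton step on the dual objective, clipped to the box [0, C].
                new_alpha = np.clip(alpha[i] + (1.0 - y[i] * f_i) / K[i, i], 0.0, C)
                max_change = max(max_change, abs(new_alpha - alpha[i]))
                alpha[i] = new_alpha
            if max_change < tol:              # stop when no subproblem moves its variable much
                break
        return alpha

    # Tiny usage example on synthetic two-class data.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1.0, 0.5, (20, 2)), rng.normal(1.0, 0.5, (20, 2))])
    y = np.hstack([-np.ones(20), np.ones(20)])
    alpha = train_l1_svm_no_bias(X, y)
    print("support vectors:", int(np.sum(alpha > 1e-6)))

With the bias term retained, the equality constraint of the dual forces working sets of at least two variables, which is the idea behind the SMO-type decomposition of [26].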

Notes

  1. In contrast to active set training, active learning tries to minimize the effort of labeling unlabeled data by labeling only the data that are necessary for generating a classifier. During the learning process, the learning machine asks for the labels of the unlabeled data that are crucial for generating a classifier; in a support vector machine setting, these are the data that change the optimal hyperplane the most [11] (a small selection sketch follows these notes). Active learning applied to already labeled data can be considered one of the working set selection methods.

  2. See the discussions in Section 2.3.4.1 on p. 44.

  3. If the inequality constraint \(A \textbf{x} \geq \textbf{b}\) is used, \(\textbf{y}\) needs to be a nonnegative vector (see the duality sketch following these notes).

  4. We have changed the steepest ascent methods used in the first edition to Newton's methods [64], following common usage.

  5. http://svm.cs.rhbnc.ac.uk/

  6. Here, the margin is not measured from the separating hyperplane. Thus, to measure the margin from it, we need to add 1.

  7. The satimage and USPS data sets will be evaluated in the following section.
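
The following two sketches supplement notes 1 and 3; they are illustrative additions rather than material from the chapter.

For note 1, margin-based selection can be sketched as follows, assuming scikit-learn's SVC as the classifier (the helper name query_next_index is hypothetical, and the criterion is only a simplified version of the strategy in [11]): at each round, the unlabeled sample closest to the current hyperplane is queried, since such samples are the ones most likely to change it.

    import numpy as np
    from sklearn.svm import SVC

    def query_next_index(clf, X_unlabeled):
        # decision_function gives signed distances (up to scaling) to the hyperplane;
        # the smallest absolute value marks the sample most likely to move it.
        scores = np.abs(clf.decision_function(X_unlabeled))
        return int(np.argmin(scores))

    # Usage: retrain after each query; labels come from an oracle (e.g., a human annotator).
    # clf = SVC(kernel="linear", C=1.0).fit(X_labeled, y_labeled)
    # i = query_next_index(clf, X_unlabeled)

For note 3, take \(\textbf{y}\) to be the vector of Lagrange multipliers for the constraint \(A \textbf{x} \geq \textbf{b}\) (an assumption on notation). For a primal objective \(f(\textbf{x})\) minimized subject to \(A \textbf{x} \geq \textbf{b}\), the Lagrangian is

\[ L(\textbf{x}, \textbf{y}) = f(\textbf{x}) - \textbf{y}^{\top} (A \textbf{x} - \textbf{b}). \]

For \(L(\textbf{x}, \textbf{y}) \leq f(\textbf{x})\) to hold at every feasible point, that is, wherever \(A \textbf{x} - \textbf{b} \geq \textbf{0}\), the term \(\textbf{y}^{\top} (A \textbf{x} - \textbf{b})\) must be nonnegative for all such \(\textbf{x}\), which requires \(\textbf{y} \geq \textbf{0}\). With an equality constraint this restriction disappears, which is why nonnegativity is tied to the inequality form.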

References

  1. M.-H. Yang and N. Ahuja. A geometric approach to train support vector machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 430–437, Hilton Head Island, SC, 2000.

  2. M. B. de Almeida, A. de Pádua Braga, and J. P. Braga. SVM-KM: Speeding SVMs learning with a priori cluster selection and k-means. In Proceedings of the Sixth Brazilian Symposium on Neural Networks (SBRN 2000), pages 162–167, Rio de Janeiro, Brazil, 2000.

  3. S. Sohn and C. H. Dagli. Advantages of using fuzzy class memberships in self-organizing map and support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’01), volume 3, pages 1886–1890, Washington, DC, 2001.

  4. S. Abe and T. Inoue. Fast training of support vector machines by extracting boundary data. In G. Dorffner, H. Bischof, and K. Hornik, editors, Artificial Neural Networks (ICANN 2001) – Proceedings of International Conference, Vienna, Austria, pages 308–313. Springer-Verlag, Berlin, Germany, 2001.

  5. W. Zhang and I. King. Locating support vectors via β-skeleton technique. In Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP ’02), volume 3, pages 1423–1427, Singapore, 2002.

  6. H. Shin and S. Cho. How many neighbors to consider in pattern pre-selection for support vector classifiers? In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 1, pages 565–570, Portland, OR, 2003.

  7. S.-Y. Sun, C. L. Tseng, Y. H. Chen, S. C. Chuang, and H. C. Fu. Cluster-based support vector machines in text-independent speaker identification. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2004), volume 1, pages 729–734, Budapest, Hungary, 2004.

  8. B. Li, Q. Wang, and J. Hu. A fast SVM training method for very large datasets. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 1784–1789, Atlanta, GA, 2009.

  9. C. Saunders, M. O. Stitson, J. Weston, L. Bottou, B. Schölkopf, and A. Smola. Support vector machine: Reference manual. Technical Report CSD-TR-98-03, Royal Holloway, University of London, London, UK, 1998.

  10. E. Osuna, R. Freund, and F. Girosi. An improved training algorithm for support vector machines. In Neural Networks for Signal Processing VII – Proceedings of the 1997 IEEE Signal Processing Society Workshop, pages 276–285, 1997.

  11. G. Schohn and D. Cohn. Less is more: Active learning with support vector machines. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML-2000), pages 839–846, Stanford, CA, 2000.

  12. C.-J. Lin. On the convergence of the decomposition method for support vector machines. IEEE Transactions on Neural Networks, 12(6):1288–1298, 2001.

  13. C.-J. Lin. Asymptotic convergence of an SMO algorithm without any assumptions. IEEE Transactions on Neural Networks, 13(1):248–250, 2002.

  14. S. S. Keerthi and E. G. Gilbert. Convergence of a generalized SMO algorithm for SVM classifier design. Machine Learning, 46(1–3):351–360, 2002.

  15. R.-E. Fan, P.-H. Chen, and C.-J. Lin. Working set selection using second order information for training support vector machines. Journal of Machine Learning Research, 6:1889–1918, 2005.

  16. M. Rychetsky, S. Ortmann, M. Ullmann, and M. Glesner. Accelerated training of support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’99), volume 2, pages 998–1003, Washington, DC, 1999.

  17. Y. Koshiba. Acceleration of training of support vector machines. Master's thesis (in Japanese), Graduate School of Science and Technology, Kobe University, Japan, 2004.

  18. C. Campbell, T.-T. Frieß, and N. Cristianini. Maximal margin classification using the KA algorithm. In Proceedings of the First International Symposium on Intelligent Data Engineering and Learning (IDEAL ’98), pages 355–362, Hong Kong, China, 1998.

  19. T.-T. Frieß, N. Cristianini, and C. Campbell. The Kernel-Adatron algorithm: A fast and simple learning procedure for support vector machines. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML '98), pages 188–196, Madison, WI, 1998.

  20. Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.

  21. I. Guyon and D. G. Stork. Linear discriminant and support vector classifiers. In A. J. Smola, P. L. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 147–169. MIT Press, Cambridge, MA, 2000.

  22. J. Xu, X. Zhang, and Y. Li. Large margin kernel pocket algorithm. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’01), volume 2, pages 1480–1485, Washington, DC, 2001.

  23. J. K. Anlauf and M. Biehl. The Adatron: An adaptive perceptron algorithm. Europhysics Letters, 10:687–692, 1989.

  24. N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge, UK, 2000.

  25. O. L. Mangasarian and D. R. Musicant. Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks, 10(5):1032–1037, 1999.

  26. J. C. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pages 185–208. MIT Press, Cambridge, MA, 1999.

  27. J.-X. Dong, A. Krzyżak, and C. Y. Suen. Fast SVM training algorithm with decomposition on very large data sets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(4):603–618, 2005.

  28. L. Bottou and C.-J. Lin. Support vector machine solvers. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 1–27. MIT Press, Cambridge, MA, 2007.

  29. S. Abe, Y. Hirokawa, and S. Ozawa. Steepest ascent training of support vector machines. In E. Damiani, L. C. Jain, R. J. Howlett, and N. Ichalkaranje, editors, Knowledge-Based Intelligent Engineering Systems and Allied Technologies (KES 2002), volume Part 2, pages 1301–1305, IOS Press, Amsterdam, The Netherlands, 2002.

  30. M. Vogt. SMO algorithms for support vector machines without bias term. Technical report, Institute of Automatic Control, TU Darmstadt, Germany, 2002.

  31. V. Kecman, M. Vogt, and T. M. Huang. On the equality of kernel AdaTron and Sequential Minimal Optimization in classification and regression tasks and alike algorithms for kernel machines. In Proceedings of the Eleventh European Symposium on Artificial Neural Networks (ESANN 2003), pages 215–222, Bruges, Belgium, 2003.

  32. C. Sentelle, M. Georgiopoulos, G. C. Anagnostopoulos, and C. Young. On extending the SMO algorithm sub-problem. In Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN 2007), pages 886–891, Orlando, FL, 2007.

  33. R. A. Hernandez, M. Strum, J. C. Wang, and J. A. Q. Gonzalez. The multiple pairs SMO: A modified SMO algorithm for the acceleration of the SVM training. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 1221–1228, Atlanta, GA, 2009.

  34. G. Cauwenberghs and T. Poggio. Incremental and decremental support vector machine learning. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 409–415. MIT Press, Cambridge, MA, 2001.

  35. A. Shilton, M. Palaniswami, D. Ralph, and A. C. Tsoi. Incremental training of support vector machines. IEEE Transactions on Neural Networks, 16(1):114–131, 2005.

  36. K. Scheinberg. An efficient implementation of an active set method for SVMs. Journal of Machine Learning Research, 7:2237–2257, 2006.

  37. S. Abe. Batch support vector training based on exact incremental training. In V. Kůrková, R. Neruda, and J. Koutnik, editors, Artificial Neural Networks (ICANN 2008) – Proceedings of the Eighteenth International Conference, Prague, Czech Republic, Part I, pages 527–536. Springer-Verlag, Berlin, Germany, 2008.

  38. H. Gâlmeanu and R. Andonie. Implementation issues of an incremental and decremental SVM. In V. Kůrková, R. Neruda, and J. Koutnik, editors, Artificial Neural Networks (ICANN 2008) – Proceedings of the Eighteenth International Conference, Prague, Czech Republic, Part I, pages 325–335. Springer-Verlag, Berlin, Germany, 2008.

  39. C. Sentelle, G. C. Anagnostopoulos, and M. Georgiopoulos. An efficient active set method for SVM training without singular inner problems. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 2875–2882, Atlanta, GA, 2009.

  40. O. Chapelle. Training a support vector machine in the primal. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 29–50. MIT Press, Cambridge, MA, 2007.

  41. S. Abe. Is primal better than dual. In C. Alippi, M. Polycarpou, C. Panayiotou, and G. Ellinas, editors, Artificial Neural Networks (ICANN 2009) – Proceedings of the Nineteenth International Conference, Limassol, Cyprus, Part I, pages 854–863. Springer-Verlag, Berlin, Germany, 2009.

  42. D. Roobaert. DirectSVM: A fast and simple support vector machine perceptron. In Neural Networks for Signal Processing X – Proceedings of the 2000 IEEE Signal Processing Society Workshop, volume 1, pages 356–365, 2000.

  43. S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy. A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Transactions on Neural Networks, 11(1):124–136, 2000.

  44. T. Raicharoen and C. Lursinsap. Critical support vector machine without kernel function. In Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP ’02), volume 5, pages 2532–2536, Singapore, 2002.

  45. S. V. N. Vishwanathan and M. N. Murty. SSVM: A simple SVM algorithm. In Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN'02), volume 3, pages 2393–2398, Honolulu, Hawaii, 2002.

  46. K. P. Bennett and E. J. Bredensteiner. Geometry in learning. In C. A. Gorini, editor, Geometry at Work, pages 132–145. Mathematical Association of America, Washington, DC, 2000.

  47. D. J. Crisp and C. J. C. Burges. A geometric interpretation of ν-SVM classifiers. In S. A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems 12, pages 244–250. MIT Press, Cambridge, MA, 2000.

  48. Q. Tao, G. Wu, and J. Wang. A generalized S-K algorithm for learning ν-SVM classifiers. Pattern Recognition Letters, 25(10):1165–1171, 2004.

  49. M. E. Mavroforakis and S. Theodoridis. A geometric approach to support vector machine (SVM) classification. IEEE Transactions on Neural Networks, 17(3):671–682, 2006.

  50. M. E. Mavroforakis, M. Sdralis, and S. Theodoridis. A geometric nearest point algorithm for the efficient solution of the SVM classification task. IEEE Transactions on Neural Networks, 18(5):1545–1549, 2007.

  51. A. Navia-Vázquez, F. Pérez-Cruz, A. Artés-Rodríguez, and A. R. Figueiras-Vidal. Weighted least squares training of support vector classifiers leading to compact and adaptive schemes. IEEE Transactions on Neural Networks, 12(5):1047–1059, 2001.

  52. G. Zanghirati and L. Zanni. A parallel solver for large quadratic programs in training support vector machines. Parallel Computing, 29(4):535–551, 2003.

  53. L. Ferreira, E. Kaszkurewicz, and A. Bhaya. Parallel implementation of gradient-based neural networks for SVM training. In Proceedings of the 2006 International Joint Conference on Neural Networks (IJCNN 2006), pages 731–738, Vancouver, Canada, 2006.

  54. I. Durdanovic, E. Cosatto, and H.-P. Graf. Large-scale parallel SVM implementation. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 105–138. MIT Press, Cambridge, MA, 2007.

  55. I. W. Tsang, J. T. Kwok, and P.-M. Cheung. Core vector machines: Fast SVM training on very large data sets. Journal of Machine Learning Research, 6:363–392, 2005.

  56. I. W.-H. Tsang, J. T.-Y. Kwok, and J. M. Zurada. Generalized core vector machines. IEEE Transactions on Neural Networks, 17(5):1126–1140, 2006.

  57. G. Loosli and S. Canu. Comments on the “Core vector machines: Fast SVM training on very large data sets”. Journal of Machine Learning Research, 8:291–301, 2007.

  58. L. Bo, L. Wang, and L. Jiao. Training hard-margin support vector machines using greedy stepwise algorithm. IEEE Transactions on Neural Networks, 19(8):1446–1455, 2008.

  59. R. J. Vanderbei. Linear Programming: Foundations and Extensions, Second Edition. Kluwer Academic Publishers, Norwell, MA, 2001.

  60. S. J. Wright. Primal-Dual Interior-Point Methods. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1997.

  61. V. Chvátal. Linear Programming. W. H. Freeman and Company, New York, NY, 1983.

  62. R. J. Vanderbei. LOQO: An interior point code for quadratic programming. Technical Report SOR-94-15, Princeton University, 1998.

  63. Y. Koshiba and S. Abe. Comparison of L1 and L2 support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 3, pages 2054–2059, Portland, OR, 2003.

  64. D. P. Bertsekas. Nonlinear Programming, Second Edition. Athena Scientific, Belmont, MA, 1999.

  65. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004.

  66. P. Laskov, C. Gehl, S. Krüger, and K.-R. Müller. Incremental support vector learning: Analysis, implementation and applications. Journal of Machine Learning Research, 7:1909–1936, 2006.

  67. C. P. Diehl and G. Cauwenberghs. SVM incremental learning, adaptation and optimization. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 4, pages 2685–2690, Portland, OR, 2003.

  68. T. Hastie, S. Rosset, R. Tibshirani, and J. Zhu. The entire regularization path for the support vector machine. Journal of Machine Learning Research, 5:1391–1415, 2004.

  69. Y. Torii and S. Abe. Decomposition techniques for training linear programming support vector machines. Neurocomputing, 72(4–6):973–984, 2009.

  70. P. S. Bradley and O. L. Mangasarian. Massive data discrimination via linear support vector machines. Optimization Methods and Software, 13(1):1–10, 2000.

  71. Y. Torii and S. Abe. Fast training of linear programming support vector machines using decomposition techniques. In F. Schwenker and S. Marinai, editors, Artificial Neural Networks in Pattern Recognition: Proceedings of Second IAPR Workshop, ANNPR 2006, Ulm, Germany, pages 165–176. Springer-Verlag, Berlin, Germany, 2006.

  72. P. S. Bradley and O. L. Mangasarian. Feature selection via concave minimization and support vector machines. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML'98), pages 82–90, Madison, WI, 1998.

  73. T. Joachims. Making large-scale support vector machine learning practical. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pages 169–184. MIT Press, Cambridge, MA, 1999.

  74. C.-W. Hsu and C.-J. Lin. A simple decomposition method for support vector machines. Machine Learning, 46(1–3):291–314, 2002.

  75. P. Laskov. Feasible direction decomposition algorithms for training support vector machines. Machine Learning, 46(1–3):315–349, 2002.

  76. D. Hush and C. Scovel. Polynomial-time decomposition algorithms for support vector machines. Machine Learning, 51(1):51–71, 2003.

  77. B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.

  78. V. Kecman and I. Hadzic. Support vectors selection by linear programming. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000), volume 5, pages 193–198, Como, Italy, 2000.

  79. W. Zhou, L. Zhang, and L. Jiao. Linear programming support vector machines. Pattern Recognition, 35(12):2927–2936, 2002.

  80. B. Schölkopf, P. Simard, A. Smola, and V. Vapnik. Prior knowledge in support vector kernels. In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in Neural Information Processing Systems 10, pages 640–646. MIT Press, Cambridge, MA, 1998.

Author information

Correspondence to Shigeo Abe.

Copyright information

© 2010 Springer-Verlag London Limited

About this chapter

Cite this chapter

Abe, S. (2010). Training Methods. In: Support Vector Machines for Pattern Classification. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84996-098-4_5

  • DOI: https://doi.org/10.1007/978-1-84996-098-4_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84996-097-7

  • Online ISBN: 978-1-84996-098-4
