Abstract
In training an L1 or L2 support vector machine, we need to solve a quadratic programming problem whose number of variables equals the number of training data. The computational complexity is of the order of \(M^3\), where M is the number of training data. Thus, when M is large, training takes a long time. Numerous methods have been proposed to speed up training. One is to extract support vector candidates from the training data and then train the support vector machine using these data. Another is to accelerate training by decomposing the variables into a working set and a fixed set and by repeatedly solving the subproblem associated with the working set until convergence.
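The decomposition idea above can be illustrated in its most extreme form, where each working set contains a single variable, as in SMO-type algorithms without a bias term [50] and the Kernel-Adatron [39]. The following sketch is not from this chapter; it is a minimal illustrative implementation of coordinate-wise dual ascent for an L1 SVM without bias term, with illustrative parameter choices (C, the RBF parameter gamma, and the number of sweeps are assumptions).

```python
import numpy as np

def train_no_bias_svm(X, y, C=10.0, gamma=1.0, sweeps=200):
    """Coordinate-ascent training of an L1 SVM without a bias term.

    Each step solves the dual subproblem for a one-variable working
    set exactly, then clips the result to the box [0, C].
    """
    n = len(y)
    # RBF kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq_dist)
    alpha = np.zeros(n)
    for _ in range(sweeps):
        for i in range(n):
            # Gradient of the dual objective with respect to alpha_i
            g = 1.0 - y[i] * ((alpha * y) @ K[:, i])
            # Exact one-variable maximizer, clipped to the box constraint
            alpha[i] = np.clip(alpha[i] + g / K[i, i], 0.0, C)
    return alpha, K

def decide(alpha, y, K):
    # Decision values f(x_i) on the training points (no bias term)
    return (alpha * y) @ K

# Toy separable data: two points per class
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
alpha, K = train_no_bias_svm(X, y)
print(np.sign(decide(alpha, y, K)))  # signs should match the labels y
```

Larger working sets (pairs in SMO, tens to hundreds of variables in general decomposition) replace the one-variable update with a small QP solve, but the outer loop structure is the same.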
Notes
- 1.
In contrast to active set training, active learning tries to minimize the labeling effort for unlabeled data by labeling only the data that are necessary for generating a classifier. During the learning process, the learning machine asks for the labels of the unlabeled data that are crucial for generating a classifier; in a support vector machine environment, these are the data that change the optimal hyperplane the most [11]. Active learning applied to labeled data can be considered one of the working set selection methods.
- 2.
See the discussions in Section 2.3.4.1 on p. 44.
- 3.
If the inequality constraint \(A \textbf{x} \geq \textbf{b}\) is used, \(\textbf{y}\) needs to be a nonnegative vector.
- 4.
We have changed the steepest ascent methods used in the first edition to Newton's methods [64] to follow common usage.
- 5.
- 6.
Here, the margin is not measured from the separating hyperplane. Thus, to measure the margin from the separating hyperplane, we need to add 1.
- 7.
The satimage and USPS data sets will be evaluated in the following section.
References
M.-H. Yang and N. Ahuja. A geometric approach to train support vector machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 430–437, Hilton Head Island, SC, 2000.
M. B. de Almeida, A. de Pádua Braga, and J. P. Braga. SVM-KM: Speeding SVMs learning with a priori cluster selection and k-means. In Proceedings of the Sixth Brazilian Symposium on Neural Networks (SBRN 2000), pages 162–167, Rio de Janeiro, Brazil, 2000.
S. Sohn and C. H. Dagli. Advantages of using fuzzy class memberships in self-organizing map and support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’01), volume 3, pages 1886–1890, Washington, DC, 2001.
S. Abe and T. Inoue. Fast training of support vector machines by extracting boundary data. In G. Dorffner, H. Bischof, and K. Hornik, editors, Artificial Neural Networks (ICANN 2001) – Proceedings of International Conference, Vienna, Austria, pages 308–313. Springer-Verlag, Berlin, Germany, 2001.
W. Zhang and I. King. Locating support vectors via β-skeleton technique. In Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP ’02), volume 3, pages 1423–1427, Singapore, 2002.
H. Shin and S. Cho. How many neighbors to consider in pattern pre-selection for support vector classifiers? In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 1, pages 565–570, Portland, OR, 2003.
S.-Y. Sun, C. L. Tseng, Y. H. Chen, S. C. Chuang, and H. C. Fu. Cluster-based support vector machines in text-independent speaker identification. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2004), volume 1, pages 729–734, Budapest, Hungary, 2004.
B. Li, Q. Wang, and J. Hu. A fast SVM training method for very large datasets. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 1784–1789, Atlanta, GA, 2009.
C. Saunders, M. O. Stitson, J. Weston, L. Bottou, B. Schölkopf, and A. Smola. Support vector machine: Reference manual. Technical Report CSD-TR-98-03, Royal Holloway, University of London, London, UK, 1998.
E. Osuna, R. Freund, and F. Girosi. An improved training algorithm for support vector machines. In Neural Networks for Signal Processing VII – Proceedings of the 1997 IEEE Signal Processing Society Workshop, pages 276–285, 1997.
G. Schohn and D. Cohn. Less is more: Active learning with support vector machines. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML-2000), pages 839–846, Stanford, CA, 2000.
C.-J. Lin. On the convergence of the decomposition method for support vector machines. IEEE Transactions on Neural Networks, 12(6):1288–1298, 2001.
C.-J. Lin. Asymptotic convergence of an SMO algorithm without any assumptions. IEEE Transactions on Neural Networks, 13(1):248–250, 2002.
S. S. Keerthi and E. G. Gilbert. Convergence of a generalized SMO algorithm for SVM classifier design. Machine Learning, 46(1–3):351–360, 2002.
R.-E. Fan, P.-H. Chen, and C.-J. Lin. Working set selection using second order information for training support vector machines. Journal of Machine Learning Research, 6:1889–1918, 2005.
M. Rychetsky, S. Ortmann, M. Ullmann, and M. Glesner. Accelerated training of support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’99), volume 2, pages 998–1003, Washington, DC, 1999.
Y. Koshiba. Acceleration of training of support vector machines. Master's thesis (in Japanese), Graduate School of Science and Technology, Kobe University, Japan, 2004.
C. Campbell, T.-T. Frieß, and N. Cristianini. Maximal margin classification using the KA algorithm. In Proceedings of the First International Symposium on Intelligent Data Engineering and Learning (IDEAL ’98), pages 355–362, Hong Kong, China, 1998.
T.-T. Frieß, N. Cristianini, and C. Campbell. The Kernel-Adatron algorithm: A fast and simple learning procedure for support vector machines. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML '98), pages 188–196, Madison, WI, 1998.
Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.
I. Guyon and D. G. Stork. Linear discriminant and support vector classifiers. In A. J. Smola, P. L. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 147–169. MIT Press, Cambridge, MA, 2000.
J. Xu, X. Zhang, and Y. Li. Large margin kernel pocket algorithm. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’01), volume 2, pages 1480–1485, Washington, DC, 2001.
J. K. Anlauf and M. Biehl. The Adatron: An adaptive perceptron algorithm. Europhysics Letters, 10:687–692, 1989.
N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge, UK, 2000.
O. L. Mangasarian and D. R. Musicant. Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks, 10(5):1032–1037, 1999.
J. C. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pages 185–208. MIT Press, Cambridge, MA, 1999.
J.-X. Dong, A. Krzyżak, and C. Y. Suen. Fast SVM training algorithm with decomposition on very large data sets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(4):603–618, 2005.
L. Bottou and C.-J. Lin. Support vector machine solvers. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 1–27. MIT Press, Cambridge, MA, 2007.
S. Abe, Y. Hirokawa, and S. Ozawa. Steepest ascent training of support vector machines. In E. Damiani, L. C. Jain, R. J. Howlett, and N. Ichalkaranje, editors, Knowledge-Based Intelligent Engineering Systems and Allied Technologies (KES 2002), Part 2, pages 1301–1305. IOS Press, Amsterdam, The Netherlands, 2002.
M. Vogt. SMO algorithms for support vector machines without bias term. Technical report, Institute of Automatic Control, TU Darmstadt, Germany, 2002.
V. Kecman, M. Vogt, and T. M. Huang. On the equality of kernel AdaTron and Sequential Minimal Optimization in classification and regression tasks and alike algorithms for kernel machines. In Proceedings of the Eleventh European Symposium on Artificial Neural Networks (ESANN 2003), pages 215–222, Bruges, Belgium, 2003.
C. Sentelle, M. Georgiopoulos, G. C. Anagnostopoulos, and C. Young. On extending the SMO algorithm sub-problem. In Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN 2007), pages 886–891, Orlando, FL, 2007.
R. A. Hernandez, M. Strum, J. C. Wang, and J. A. Q. Gonzalez. The multiple pairs SMO: A modified SMO algorithm for the acceleration of the SVM training. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 1221–1228, Atlanta, GA, 2009.
G. Cauwenberghs and T. Poggio. Incremental and decremental support vector machine learning. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 409–415. MIT Press, Cambridge, MA, 2001.
A. Shilton, M. Palaniswami, D. Ralph, and A. C. Tsoi. Incremental training of support vector machines. IEEE Transactions on Neural Networks, 16(1):114–131, 2005.
K. Scheinberg. An efficient implementation of an active set method for SVMs. Journal of Machine Learning Research, 7:2237–2257, 2006.
S. Abe. Batch support vector training based on exact incremental training. In V. Kůrková, R. Neruda, and J. Koutnik, editors, Artificial Neural Networks (ICANN 2008) – Proceedings of the Eighteenth International Conference, Prague, Czech Republic, Part I, pages 527–536. Springer-Verlag, Berlin, Germany, 2008.
H. Gâlmeanu and R. Andonie. Implementation issues of an incremental and decremental SVM. In V. Kůrková, R. Neruda, and J. Koutnik, editors, Artificial Neural Networks (ICANN 2008) – Proceedings of the Eighteenth International Conference, Prague, Czech Republic, Part I, pages 325–335. Springer-Verlag, Berlin, Germany, 2008.
C. Sentelle, G. C. Anagnostopoulos, and M. Georgiopoulos. An efficient active set method for SVM training without singular inner problems. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 2875–2882, Atlanta, GA, 2009.
O. Chapelle. Training a support vector machine in the primal. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 29–50. MIT Press, Cambridge, MA, 2007.
S. Abe. Is primal better than dual. In C. Alippi, M. Polycarpou, C. Panayiotou, and G. Ellinas, editors, Artificial Neural Networks (ICANN 2009) – Proceedings of the Nineteenth International Conference, Limassol, Cyprus, Part I, pages 854–863. Springer-Verlag, Berlin, Germany, 2009.
D. Roobaert. DirectSVM: A fast and simple support vector machine perceptron. In Neural Networks for Signal Processing X – Proceedings of the 2000 IEEE Signal Processing Society Workshop, volume 1, pages 356–365, 2000.
S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy. A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Transactions on Neural Networks, 11(1):124–136, 2000.
T. Raicharoen and C. Lursinsap. Critical support vector machine without kernel function. In Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP ’02), volume 5, pages 2532–2536, Singapore, 2002.
S. V. N. Vishwanathan and M. N. Murty. SSVM: A simple SVM algorithm. In Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN '02), volume 3, pages 2393–2398, Honolulu, HI, 2002.
K. P. Bennett and E. J. Bredensteiner. Geometry in learning. In C. A. Gorini, editor, Geometry at Work, pages 132–145. Mathematical Association of America, Washington, DC, 2000.
D. J. Crisp and C. J. C. Burges. A geometric interpretation of ν-SVM classifiers. In S. A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems 12, pages 244–250. MIT Press, Cambridge, MA, 2000.
Q. Tao, G. Wu, and J. Wang. A generalized S-K algorithm for learning ν-SVM classifiers. Pattern Recognition Letters, 25(10):1165–1171, 2004.
M. E. Mavroforakis and S. Theodoridis. A geometric approach to support vector machine (SVM) classification. IEEE Transactions on Neural Networks, 17(3):671–682, 2006.
M. E. Mavroforakis, M. Sdralis, and S. Theodoridis. A geometric nearest point algorithm for the efficient solution of the SVM classification task. IEEE Transactions on Neural Networks, 18(5):1545–1549, 2007.
A. Navia-Vázquez, F. Pérez-Cruz, A. Artés-Rodríguez, and A. R. Figueiras-Vidal. Weighted least squares training of support vector classifiers leading to compact and adaptive schemes. IEEE Transactions on Neural Networks, 12(5):1047–1059, 2001.
G. Zanghirati and L. Zanni. A parallel solver for large quadratic programs in training support vector machines. Parallel Computing, 29(4):535–551, 2003.
L. Ferreira, E. Kaszkurewicz, and A. Bhaya. Parallel implementation of gradient-based neural networks for SVM training. In Proceedings of the 2006 International Joint Conference on Neural Networks (IJCNN 2006), pages 731–738, Vancouver, Canada, 2006.
I. Durdanovic, E. Cosatto, and H.-P. Graf. Large-scale parallel SVM implementation. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 105–138. MIT Press, Cambridge, MA, 2007.
I. W. Tsang, J. T. Kwok, and P.-M. Cheung. Core vector machines: Fast SVM training on very large data sets. Journal of Machine Learning Research, 6:363–392, 2005.
I. W.-H. Tsang, J. T.-Y. Kwok, and J. M. Zurada. Generalized core vector machines. IEEE Transactions on Neural Networks, 17(5):1126–1140, 2006.
G. Loosli and S. Canu. Comments on the “Core vector machines: Fast SVM training on very large data sets”. Journal of Machine Learning Research, 8:291–301, 2007.
L. Bo, L. Wang, and L. Jiao. Training hard-margin support vector machines using greedy stepwise algorithm. IEEE Transactions on Neural Networks, 19(8):1446–1455, 2008.
R. J. Vanderbei. Linear Programming: Foundations and Extensions, Second Edition. Kluwer Academic Publishers, Norwell, MA, 2001.
S. J. Wright. Primal-Dual Interior-Point Methods. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1997.
V. Chvátal. Linear Programming. W. H. Freeman and Company, New York, NY, 1983.
R. J. Vanderbei. LOQO: An interior point code for quadratic programming. Technical Report SOR-94-15, Princeton University, 1998.
Y. Koshiba and S. Abe. Comparison of L1 and L2 support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 3, pages 2054–2059, Portland, OR, 2003.
D. P. Bertsekas. Nonlinear Programming, Second Edition. Athena Scientific, Belmont, MA, 1999.
S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004.
P. Laskov, C. Gehl, S. Krüger, and K.-R. Müller. Incremental support vector learning: Analysis, implementation and applications. Journal of Machine Learning Research, 7:1909–1936, 2006.
C. P. Diehl and G. Cauwenberghs. SVM incremental learning, adaptation and optimization. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 4, pages 2685–2690, Portland, OR, 2003.
T. Hastie, S. Rosset, R. Tibshirani, and J. Zhu. The entire regularization path for the support vector machine. Journal of Machine Learning Research, 5:1391–1415, 2004.
Y. Torii and S. Abe. Decomposition techniques for training linear programming support vector machines. Neurocomputing, 72(4–6):973–984, 2009.
P. S. Bradley and O. L. Mangasarian. Massive data discrimination via linear support vector machines. Optimization Methods and Software, 13(1):1–10, 2000.
Y. Torii and S. Abe. Fast training of linear programming support vector machines using decomposition techniques. In F. Schwenker and S. Marinai, editors, Artificial Neural Networks in Pattern Recognition: Proceedings of Second IAPR Workshop, ANNPR 2006, Ulm, Germany, pages 165–176. Springer-Verlag, Berlin, Germany, 2006.
P. S. Bradley and O. L. Mangasarian. Feature selection via concave minimization and support vector machines. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML'98), pages 82–90, Madison, WI, 1998.
T. Joachims. Making large-scale support vector machine learning practical. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pages 169–184. MIT Press, Cambridge, MA, 1999.
C.-W. Hsu and C.-J. Lin. A simple decomposition method for support vector machines. Machine Learning, 46(1–3):291–314, 2002.
P. Laskov. Feasible direction decomposition algorithms for training support vector machines. Machine Learning, 46(1–3):315–349, 2002.
D. Hush and C. Scovel. Polynomial-time decomposition algorithms for support vector machines. Machine Learning, 51(1):51–71, 2003.
B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.
V. Kecman and I. Hadzic. Support vectors selection by linear programming. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000), volume 5, pages 193–198, Como, Italy, 2000.
W. Zhou, L. Zhang, and L. Jiao. Linear programming support vector machines. Pattern Recognition, 35(12):2927–2936, 2002.
B. Schölkopf, P. Simard, A. Smola, and V. Vapnik. Prior knowledge in support vector kernels. In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in Neural Information Processing Systems 10, pages 640–646. MIT Press, Cambridge, MA, 1998.
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Abe, S. (2010). Training Methods. In: Support Vector Machines for Pattern Classification. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84996-098-4_5
Publisher Name: Springer, London
Print ISBN: 978-1-84996-097-7
Online ISBN: 978-1-84996-098-4