Abstract
In training an L1 or L2 support vector machine, we need to solve a quadratic programming problem whose number of variables equals the number of training data. The computational complexity is of the order of \(M^3\), where M is the number of training data. Thus, when M is large, training takes a long time. Numerous methods have been proposed to speed up training. One is to extract support vector candidates from the training data and then train the support vector machine using these data. Another is to accelerate training by decomposing the variables into a working set and a fixed set and by repeatedly solving the subproblem associated with the working set until convergence.
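The decomposition idea above can be illustrated in its most extreme form, where each working set contains a single variable, as in SMO-type algorithms without a bias term [50] and the Kernel-Adatron [39]. The following sketch is not from this chapter; it is a minimal illustrative implementation of coordinate-wise dual ascent for an L1 SVM without bias term, with illustrative parameter choices (C, the RBF parameter gamma, and the number of sweeps are assumptions).

```python
import numpy as np

def train_no_bias_svm(X, y, C=10.0, gamma=1.0, sweeps=200):
    """Coordinate-ascent training of an L1 SVM without a bias term.

    Each step solves the dual subproblem for a one-variable working
    set exactly, then clips the result to the box [0, C].
    """
    n = len(y)
    # RBF kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq_dist)
    alpha = np.zeros(n)
    for _ in range(sweeps):
        for i in range(n):
            # Gradient of the dual objective with respect to alpha_i
            g = 1.0 - y[i] * ((alpha * y) @ K[:, i])
            # Exact one-variable maximizer, clipped to the box constraint
            alpha[i] = np.clip(alpha[i] + g / K[i, i], 0.0, C)
    return alpha, K

def decide(alpha, y, K):
    # Decision values f(x_i) on the training points (no bias term)
    return (alpha * y) @ K

# Toy separable data: two points per class
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
alpha, K = train_no_bias_svm(X, y)
print(np.sign(decide(alpha, y, K)))  # signs should match the labels y
```

Larger working sets (pairs in SMO, tens to hundreds of variables in general decomposition) replace the one-variable update with a small QP solve, but the outer loop structure is the same.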
Notes
- 1.
In contrast to active set training, active learning tries to minimize the labeling effort for unlabeled data by labeling only the data that are necessary for generating a classifier. During the learning process, the learning machine asks for the labels of the unlabeled data that are crucial for generating a classifier; in a support vector machine environment, these are the data that change the optimal hyperplane the most [11]. Active learning applied to labeled data can be considered one of the working set selection methods.
- 2.
See the discussions in Section 2.3.4.1 on p. 44.
- 3.
If the inequality constraint \(A \textbf{x} \geq \textbf{b}\) is used, \(\textbf{y}\) needs to be a nonnegative vector.
- 4.
We have changed the steepest ascent methods used in the first edition to Newton's methods [64] to follow common usage.
- 5.
- 6.
Here, the margin is not measured from the separating hyperplane. Thus, to measure the margin from the separating hyperplane, we need to add 1.
- 7.
The satimage and USPS data sets will be evaluated in the following section.
References
M.-H. Yang and N. Ahuja. A geometric approach to train support vector machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 430–437, Hilton Head Island, SC, 2000.
M. B. de Almeida, A. de Pádua Braga, and J. P. Braga. SVM-KM: Speeding SVMs learning with a priori cluster selection and k-means. In Proceedings of the Sixth Brazilian Symposium on Neural Networks (SBRN 2000), pages 162–167, Rio de Janeiro, Brazil, 2000.
S. Sohn and C. H. Dagli. Advantages of using fuzzy class memberships in self-organizing map and support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’01), volume 3, pages 1886–1890, Washington, DC, 2001.
S. Abe and T. Inoue. Fast training of support vector machines by extracting boundary data. In G. Dorffner, H. Bischof, and K. Hornik, editors, Artificial Neural Networks (ICANN 2001) – Proceedings of International Conference, Vienna, Austria, pages 308–313. Springer-Verlag, Berlin, Germany, 2001.
W. Zhang and I. King. Locating support vectors via β-skeleton technique. In Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP ’02), volume 3, pages 1423–1427, Singapore, 2002.
H. Shin and S. Cho. How many neighbors to consider in pattern pre-selection for support vector classifiers? In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 1, pages 565–570, Portland, OR, 2003.
S.-Y. Sun, C. L. Tseng, Y. H. Chen, S. C. Chuang, and H. C. Fu. Cluster-based support vector machines in text-independent speaker identification. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2004), volume 1, pages 729–734, Budapest, Hungary, 2004.
B. Li, Q. Wang, and J. Hu. A fast SVM training method for very large datasets. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 1784–1789, Atlanta, GA, 2009.
C. Saunders, M. O. Stitson, J. Weston, L. Bottou, B. Schölkopf, and A. Smola. Support vector machine: Reference manual. Technical Report CSD-TR-98-03, Royal Holloway, University of London, London, UK, 1998.
E. Osuna, R. Freund, and F. Girosi. An improved training algorithm for support vector machines. In Neural Networks for Signal Processing VII – Proceedings of the 1997 IEEE Signal Processing Society Workshop, pages 276–285, 1997.
G. Schohn and D. Cohn. Less is more: Active learning with support vector machines. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML-2000), pages 839–846, Stanford, CA, 2000.
C.-J. Lin. On the convergence of the decomposition method for support vector machines. IEEE Transactions on Neural Networks, 12(6):1288–1298, 2001.
C.-J. Lin. Asymptotic convergence of an SMO algorithm without any assumptions. IEEE Transactions on Neural Networks, 13(1):248–250, 2002.
S. S. Keerthi and E. G. Gilbert. Convergence of a generalized SMO algorithm for SVM classifier design. Machine Learning, 46(1–3):351–360, 2002.
R.-E. Fan, P.-H. Chen, and C.-J. Lin. Working set selection using second order information for training support vector machines. Journal of Machine Learning Research, 6:1889–1918, 2005.
M. Rychetsky, S. Ortmann, M. Ullmann, and M. Glesner. Accelerated training of support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’99), volume 2, pages 998–1003, Washington, DC, 1999.
Y. Koshiba. Acceleration of training of support vector machines. Master's thesis (in Japanese), Graduate School of Science and Technology, Kobe University, Japan, 2004.
C. Campbell, T.-T. Frieß, and N. Cristianini. Maximal margin classification using the KA algorithm. In Proceedings of the First International Symposium on Intelligent Data Engineering and Learning (IDEAL ’98), pages 355–362, Hong Kong, China, 1998.
T.-T. Frieß, N. Cristianini, and C. Campbell. The Kernel-Adatron algorithm: A fast and simple learning procedure for support vector machines. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML '98), pages 188–196, Madison, WI, 1998.
Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.
I. Guyon and D. G. Stork. Linear discriminant and support vector classifiers. In A. J. Smola, P. L. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 147–169. MIT Press, Cambridge, MA, 2000.
J. Xu, X. Zhang, and Y. Li. Large margin kernel pocket algorithm. In Proceedings of International Joint Conference on Neural Networks (IJCNN ’01), volume 2, pages 1480–1485, Washington, DC, 2001.
J. K. Anlauf and M. Biehl. The Adatron: An adaptive perceptron algorithm. Europhysics Letters, 10:687–692, 1989.
N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge, UK, 2000.
O. L. Mangasarian and D. R. Musicant. Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks, 10(5):1032–1037, 1999.
J. C. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pages 185–208. MIT Press, Cambridge, MA, 1999.
J.-X. Dong, A. Krzyżak, and C. Y. Suen. Fast SVM training algorithm with decomposition on very large data sets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(4):603–618, 2005.
L. Bottou and C.-J. Lin. Support vector machine solvers. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 1–27. MIT Press, Cambridge, MA, 2007.
S. Abe, Y. Hirokawa, and S. Ozawa. Steepest ascent training of support vector machines. In E. Damiani, L. C. Jain, R. J. Howlett, and N. Ichalkaranje, editors, Knowledge-Based Intelligent Engineering Systems and Allied Technologies (KES 2002), Part 2, pages 1301–1305. IOS Press, Amsterdam, The Netherlands, 2002.
M. Vogt. SMO algorithms for support vector machines without bias term. Technical report, Institute of Automatic Control, TU Darmstadt, Germany, 2002.
V. Kecman, M. Vogt, and T. M. Huang. On the equality of kernel AdaTron and Sequential Minimal Optimization in classification and regression tasks and alike algorithms for kernel machines. In Proceedings of the Eleventh European Symposium on Artificial Neural Networks (ESANN 2003), pages 215–222, Bruges, Belgium, 2003.
C. Sentelle, M. Georgiopoulos, G. C. Anagnostopoulos, and C. Young. On extending the SMO algorithm sub-problem. In Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN 2007), pages 886–891, Orlando, FL, 2007.
R. A. Hernandez, M. Strum, J. C. Wang, and J. A. Q. Gonzalez. The multiple pairs SMO: A modified SMO algorithm for the acceleration of the SVM training. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 1221–1228, Atlanta, GA, 2009.
G. Cauwenberghs and T. Poggio. Incremental and decremental support vector machine learning. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 409–415. MIT Press, Cambridge, MA, 2001.
A. Shilton, M. Palaniswami, D. Ralph, and A. C. Tsoi. Incremental training of support vector machines. IEEE Transactions on Neural Networks, 16(1):114–131, 2005.
K. Scheinberg. An efficient implementation of an active set method for SVMs. Journal of Machine Learning Research, 7:2237–2257, 2006.
S. Abe. Batch support vector training based on exact incremental training. In V. Kůrková, R. Neruda, and J. Koutnik, editors, Artificial Neural Networks (ICANN 2008) – Proceedings of the Eighteenth International Conference, Prague, Czech Republic, Part I, pages 527–536. Springer-Verlag, Berlin, Germany, 2008.
H. Gâlmeanu and R. Andonie. Implementation issues of an incremental and decremental SVM. In V. Kůrková, R. Neruda, and J. Koutnik, editors, Artificial Neural Networks (ICANN 2008) – Proceedings of the Eighteenth International Conference, Prague, Czech Republic, Part I, pages 325–335. Springer-Verlag, Berlin, Germany, 2008.
C. Sentelle, G. C. Anagnostopoulos, and M. Georgiopoulos. An efficient active set method for SVM training without singular inner problems. In Proceedings of the 2009 International Joint Conference on Neural Networks (IJCNN 2009), pages 2875–2882, Atlanta, GA, 2009.
O. Chapelle. Training a support vector machine in the primal. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 29–50. MIT Press, Cambridge, MA, 2007.
S. Abe. Is primal better than dual. In C. Alippi, M. Polycarpou, C. Panayiotou, and G. Ellinas, editors, Artificial Neural Networks (ICANN 2009) – Proceedings of the Nineteenth International Conference, Limassol, Cyprus, Part I, pages 854–863. Springer-Verlag, Berlin, Germany, 2009.
D. Roobaert. DirectSVM: A fast and simple support vector machine perceptron. In Neural Networks for Signal Processing X – Proceedings of the 2000 IEEE Signal Processing Society Workshop, volume 1, pages 356–365, 2000.
S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy. A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Transactions on Neural Networks, 11(1):124–136, 2000.
T. Raicharoen and C. Lursinsap. Critical support vector machine without kernel function. In Proceedings of the Ninth International Conference on Neural Information Processing (ICONIP ’02), volume 5, pages 2532–2536, Singapore, 2002.
S. V. N. Vishwanathan and M. N. Murty. SSVM: A simple SVM algorithm. In Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN '02), volume 3, pages 2393–2398, Honolulu, HI, 2002.
K. P. Bennett and E. J. Bredensteiner. Geometry in learning. In C. A. Gorini, editor, Geometry at Work, pages 132–145. Mathematical Association of America, Washington, DC, 2000.
D. J. Crisp and C. J. C. Burges. A geometric interpretation of ν-SVM classifiers. In S. A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems 12, pages 244–250. MIT Press, Cambridge, MA, 2000.
Q. Tao, G. Wu, and J. Wang. A generalized S-K algorithm for learning ν-SVM classifiers. Pattern Recognition Letters, 25(10):1165–1171, 2004.
M. E. Mavroforakis and S. Theodoridis. A geometric approach to support vector machine (SVM) classification. IEEE Transactions on Neural Networks, 17(3):671–682, 2006.
M. E. Mavroforakis, M. Sdralis, and S. Theodoridis. A geometric nearest point algorithm for the efficient solution of the SVM classification task. IEEE Transactions on Neural Networks, 18(5):1545–1549, 2007.
A. Navia-Vázquez, F. Pérez-Cruz, A. Artés-Rodríguez, and A. R. Figueiras-Vidal. Weighted least squares training of support vector classifiers leading to compact and adaptive schemes. IEEE Transactions on Neural Networks, 12(5):1047–1059, 2001.
G. Zanghirati and L. Zanni. A parallel solver for large quadratic programs in training support vector machines. Parallel Computing, 29(4):535–551, 2003.
L. Ferreira, E. Kaszkurewicz, and A. Bhaya. Parallel implementation of gradient-based neural networks for SVM training. In Proceedings of the 2006 International Joint Conference on Neural Networks (IJCNN 2006), pages 731–738, Vancouver, Canada, 2006.
I. Durdanovic, E. Cosatto, and H.-P. Graf. Large-scale parallel SVM implementation. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines, pages 105–138. MIT Press, Cambridge, MA, 2007.
I. W. Tsang, J. T. Kwok, and P.-M. Cheung. Core vector machines: Fast SVM training on very large data sets. Journal of Machine Learning Research, 6:363–392, 2005.
I. W.-H. Tsang, J. T.-Y. Kwok, and J. M. Zurada. Generalized core vector machines. IEEE Transactions on Neural Networks, 17(5):1126–1140, 2006.
G. Loosli and S. Canu. Comments on the “Core vector machines: Fast SVM training on very large data sets”. Journal of Machine Learning Research, 8:291–301, 2007.
L. Bo, L. Wang, and L. Jiao. Training hard-margin support vector machines using greedy stepwise algorithm. IEEE Transactions on Neural Networks, 19(8):1446–1455, 2008.
R. J. Vanderbei. Linear Programming: Foundations and Extensions, Second Edition. Kluwer Academic Publishers, Norwell, MA, 2001.
S. J. Wright. Primal-Dual Interior-Point Methods. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1997.
V. Chvátal. Linear Programming. W. H. Freeman and Company, New York, NY, 1983.
R. J. Vanderbei. LOQO: An interior point code for quadratic programming. Technical Report SOR-94-15, Princeton University, 1998.
Y. Koshiba and S. Abe. Comparison of L1 and L2 support vector machines. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 3, pages 2054–2059, Portland, OR, 2003.
D. P. Bertsekas. Nonlinear Programming, Second Edition. Athena Scientific, Belmont, MA, 1999.
S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004.
P. Laskov, C. Gehl, S. Krüger, and K.-R. Müller. Incremental support vector learning: Analysis, implementation and applications. Journal of Machine Learning Research, 7:1909–1936, 2006.
C. P. Diehl and G. Cauwenberghs. SVM incremental learning, adaptation and optimization. In Proceedings of International Joint Conference on Neural Networks (IJCNN 2003), volume 4, pages 2685–2690, Portland, OR, 2003.
T. Hastie, S. Rosset, R. Tibshirani, and J. Zhu. The entire regularization path for the support vector machine. Journal of Machine Learning Research, 5:1391–1415, 2004.
Y. Torii and S. Abe. Decomposition techniques for training linear programming support vector machines. Neurocomputing, 72(4–6):973–984, 2009.
P. S. Bradley and O. L. Mangasarian. Massive data discrimination via linear support vector machines. Optimization Methods and Software, 13(1):1–10, 2000.
Y. Torii and S. Abe. Fast training of linear programming support vector machines using decomposition techniques. In F. Schwenker and S. Marinai, editors, Artificial Neural Networks in Pattern Recognition: Proceedings of Second IAPR Workshop, ANNPR 2006, Ulm, Germany, pages 165–176. Springer-Verlag, Berlin, Germany, 2006.
P. S. Bradley and O. L. Mangasarian. Feature selection via concave minimization and support vector machines. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML'98), pages 82–90, Madison, WI, 1998.
T. Joachims. Making large-scale support vector machine learning practical. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pages 169–184. MIT Press, Cambridge, MA, 1999.
C.-W. Hsu and C.-J. Lin. A simple decomposition method for support vector machines. Machine Learning, 46(1–3):291–314, 2002.
P. Laskov. Feasible direction decomposition algorithms for training support vector machines. Machine Learning, 46(1–3):315–349, 2002.
D. Hush and C. Scovel. Polynomial-time decomposition algorithms for support vector machines. Machine Learning, 51(1):51–71, 2003.
B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.
V. Kecman and I. Hadzic. Support vectors selection by linear programming. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000), volume 5, pages 193–198, Como, Italy, 2000.
W. Zhou, L. Zhang, and L. Jiao. Linear programming support vector machines. Pattern Recognition, 35(12):2927–2936, 2002.
B. Schölkopf, P. Simard, A. Smola, and V. Vapnik. Prior knowledge in support vector kernels. In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in Neural Information Processing Systems 10, pages 640–646. MIT Press, Cambridge, MA, 1998.
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Abe, S. (2010). Training Methods. In: Support Vector Machines for Pattern Classification. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84996-098-4_5
Publisher Name: Springer, London
Print ISBN: 978-1-84996-097-7
Online ISBN: 978-1-84996-098-4