Abstract
Support vector machine (SVM) models are usually trained by solving the dual quadratic programming problem, which is time consuming. Using the penalty function method from optimization theory, this paper folds the constraints of the dual problem into its objective, yielding an unconstrained optimization problem that can be solved by a generalized Newton method to obtain an approximate solution to the original model. Extensive pattern classification experiments show that, compared to quadratic programming-based models, the proposed approach is much more computationally efficient (tens to hundreds of times faster) while yielding similar performance in terms of the receiver operating characteristic curve. Furthermore, the proposed method and the quadratic programming-based models extract almost the same set of support vectors.
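The paper's exact penalty formulation and Newton step are not reproduced on this page. As a rough illustration of the idea in the abstract, the sketch below penalizes the standard C-SVM dual constraints quadratically and minimizes the resulting unconstrained objective with a generalized Newton iteration; the function names, the penalty weight `rho`, and the Armijo-style backtracking are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def penalty_value(a, Q, y, C, rho):
    """Unconstrained penalized dual objective P(a)."""
    lo = np.minimum(a, 0.0)          # violation of a_i >= 0
    hi = np.maximum(a - C, 0.0)      # violation of a_i <= C
    return (0.5 * a @ Q @ a - a.sum()
            + 0.5 * rho * ((y @ a) ** 2 + lo @ lo + hi @ hi))

def svm_dual_penalty(K, y, C=1.0, rho=1e4, tol=1e-6, max_iter=100):
    """Approximately solve the SVM dual
        min 0.5 a'Qa - 1'a  s.t.  y'a = 0, 0 <= a_i <= C,
    by minimizing the quadratic-penalty objective P(a) with a
    generalized Newton method (a sketch, not the paper's algorithm)."""
    n = len(y)
    Q = np.outer(y, y) * K           # Q_ij = y_i y_j K(x_i, x_j)
    a = np.zeros(n)
    for _ in range(max_iter):
        lo = np.minimum(a, 0.0)
        hi = np.maximum(a - C, 0.0)
        g = Q @ a - 1.0 + rho * ((y @ a) * y + lo + hi)
        if np.linalg.norm(g) < tol:
            break
        # Generalized Hessian: the max(0, .)^2 penalty terms are only C^1,
        # so their "Hessian" is an indicator diagonal on violated coordinates.
        D = np.diag(((a < 0.0) | (a > C)).astype(float))
        H = Q + rho * (np.outer(y, y) + D) + 1e-8 * np.eye(n)
        d = np.linalg.solve(H, -g)
        # Armijo-style backtracking to guarantee monotone descent
        t, f0 = 1.0, penalty_value(a, Q, y, C, rho)
        while (penalty_value(a + t * d, Q, y, C, rho)
               > f0 + 1e-4 * t * (g @ d)) and t > 1e-10:
            t *= 0.5
        a = a + t * d
    return a
```

The returned multipliers satisfy the box and equality constraints only approximately; the violation shrinks as `rho` grows, which is the penalty method's usual trade-off between constraint accuracy and conditioning.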
Acknowledgements
The author would like to extend his sincere gratitude to the anonymous reviewers for their constructive suggestions and comments, which have greatly helped improve the quality of this paper.
Funding
This work was supported by a Summer Faculty Fellowship from Missouri State University.
Ethics declarations
Conflict of interest
The author certifies that there are no conflicts of interest or competing interests.
About this article
Cite this article
Zheng, S. A support vector approach based on penalty function method. Adv. in Comp. Int. 2, 9 (2022). https://doi.org/10.1007/s43674-021-00026-4