SVM-Maj: a majorization approach to linear support vector machines with different hinge errors
Support vector machines (SVMs) are becoming increasingly popular for predicting a binary dependent variable, and they perform very well compared with competing techniques. Often, the solution of an SVM is obtained by switching to the dual problem. In this paper, we stay with the primal support vector machine problem, study its effective aspects, and propose varieties of convex loss functions: the standard absolute hinge error of the SVM as well as the quadratic hinge and Huber hinge errors. We present an iterative majorization algorithm that minimizes each of these adaptations. In addition, we show that many of the features of an SVM are also obtained by an optimal scaling approach to regression. We illustrate this with an example from the literature and compare the different methods on several empirical data sets.
Keywords: Support vector machines · Iterative majorization · Absolute hinge error · Quadratic hinge error · Huber hinge error · Optimal scaling
Mathematics Subject Classification (2000): 90C30 · 62H30 · 68T05
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
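As a point of reference, the three hinge errors compared in the abstract can be sketched as functions of the margin value q = y(w'x + b) of an observation. This is a minimal illustrative sketch: the width parameter `delta` and the exact quadratic-to-linear transition of the Huber hinge shown here are common textbook choices, not necessarily the paper's parameterization.

```python
def absolute_hinge(q):
    """Standard SVM hinge error: linear penalty for margin violations (q < 1)."""
    return max(0.0, 1.0 - q)

def quadratic_hinge(q):
    """Quadratic hinge error: squared penalty for margin violations."""
    return max(0.0, 1.0 - q) ** 2

def huber_hinge(q, delta=1.0):
    """Huber-type hinge error: quadratic for small margin violations,
    linear for large ones (robust to outliers). The transition width
    `delta` is an assumed illustrative parameter."""
    u = 1.0 - q  # size of the margin violation
    if u <= 0.0:
        return 0.0            # correctly classified beyond the margin
    if u <= delta:
        return u ** 2 / (2.0 * delta)   # smooth quadratic zone
    return u - delta / 2.0    # linear zone, continuous at u = delta
```

All three errors are zero for q ≥ 1 and convex in q, which is what makes each variant amenable to a majorization algorithm with a quadratic majorizing function.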