Abstract
For a support vector machine, model selection consists of choosing the kernel function, the values of its parameters, and the amount of regularization. To set the value of the regularization parameter, one can minimize an appropriate objective function over the regularization path. A priori, this requires two elements: the objective function and an algorithm that computes the regularization path at a reduced cost. The literature provides several upper bounds and estimates for the leave-one-out cross-validation error of the ℓ2-SVM; however, no algorithm was available so far for fitting the entire regularization path of this machine. In this article, we introduce the first algorithm of this kind. Building on it, we specify new methods for tuning the corresponding penalization coefficient, whose objective function is a leave-one-out error bound or estimate. From a computational point of view, these methods appear especially appropriate when the Gram matrix is of low rank. A comparative study involving state-of-the-art alternatives provides empirical confirmation of this advantage.
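As a minimal sketch of the selection principle described in the abstract, the Python fragment below tunes the penalization coefficient C of an ℓ2-SVM (squared hinge loss) by minimizing a leave-one-out error criterion. Two simplifications are assumptions of this sketch, not the chapter's method: a grid over C stands in for the exact path-following algorithm the chapter introduces, and an exact leave-one-out computation stands in for the closed-form bounds and estimates it discusses. The data set, grid, and linear kernel are likewise illustrative.

```python
# Hedged sketch: select C for an l2-SVM (squared hinge loss) by minimizing
# a leave-one-out (LOO) error criterion. A grid over C replaces the exact
# regularization-path algorithm of the chapter; exact LOO replaces its
# closed-form bounds/estimates. Data and grid are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=60, n_features=5, random_state=0)

grid = np.logspace(-3, 3, 13)  # candidate values of the penalization coefficient C
loo_errors = []
for C in grid:
    # squared_hinge is the quadratic loss of the l2-SVM
    clf = LinearSVC(loss="squared_hinge", C=C, max_iter=10000)
    acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
    loo_errors.append(1.0 - acc)

best_C = grid[int(np.argmin(loo_errors))]
print(f"selected C = {best_C:g}, LOO error = {min(loo_errors):.3f}")
```

The chapter's contribution can be read against this sketch: instead of refitting at grid points, a path-following algorithm tracks the solution as C varies and evaluates the leave-one-out bound or estimate along the entire path, which is where the low-rank structure of the Gram matrix yields the computational advantage.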
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
Cite this chapter
Bonidal, R., Tindel, S., Guermeur, Y. (2014). Model Selection for the ℓ2-SVM by Following the Regularization Path. In: Nguyen, N.T., Le-Thi, H.A. (eds.) Transactions on Computational Intelligence XIII. Lecture Notes in Computer Science, vol. 8342. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54455-2_4
Print ISBN: 978-3-642-54454-5
Online ISBN: 978-3-642-54455-2