Selection of Basis Functions Guided by the L2 Soft Margin
Support Vector Machines (SVMs) for classification produce sparse models by maximizing the margin. This work addresses two limitations of the technique: first, the number of support vectors can be large, and second, the model requires the use of (Mercer) kernel functions. Recent works have proposed maximizing the margin while controlling the sparsity, but they too require kernels. We propose a search process that selects a subset of basis functions maximizing the margin, without requiring them to be kernel functions, and the sparsity of the model can be controlled explicitly. Experimental results show that accuracy close to that of SVMs can be achieved with much higher sparsity; moreover, at the same level of sparsity, more powerful search strategies tend to achieve better generalization than simpler ones.
Keywords: Basis Function, Forward Selection, Relevance Vector Machine, Radial Basis Function, Sparse Model
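The selection process summarized in the abstract, greedily adding basis functions that most improve the L2 soft-margin objective, can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' algorithm: the candidate pool here is RBFs centered at training points, the L2-SVM is fit by plain gradient descent, and all function names are hypothetical.

```python
import numpy as np

def rbf_features(X, centers, gamma=0.5):
    """Gaussian RBF design matrix: phi[i, j] = exp(-gamma * ||x_i - c_j||^2)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def fit_l2_svm(Phi, y, C=1.0, lr=0.1, n_iter=500):
    """Minimize 0.5*||w||^2 + C * sum_i max(0, 1 - y_i*f(x_i))^2 by gradient descent
    (a simple stand-in solver; the paper's actual optimizer may differ)."""
    n, d = Phi.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        m = 1.0 - y * (Phi @ w + b)      # margin slack per example
        ym = y * np.maximum(m, 0.0)      # only margin violators contribute
        gw = w - 2.0 * C * Phi.T @ ym    # gradient w.r.t. weights
        gb = -2.0 * C * ym.sum()         # gradient w.r.t. bias
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def l2_margin_objective(Phi, y, w, b, C=1.0):
    m = np.maximum(0.0, 1.0 - y * (Phi @ w + b))
    return 0.5 * (w @ w) + C * (m ** 2).sum()

def forward_select(Phi_pool, y, k, C=1.0):
    """Greedy forward selection: at each step add the candidate basis function
    that most decreases the L2 soft-margin objective after refitting."""
    selected, remaining = [], list(range(Phi_pool.shape[1]))
    for _ in range(k):
        best_j, best_obj = None, np.inf
        for j in remaining:
            cols = selected + [j]
            w, b = fit_l2_svm(Phi_pool[:, cols], y, C)
            obj = l2_margin_objective(Phi_pool[:, cols], y, w, b, C)
            if obj < best_obj:
                best_j, best_obj = j, obj
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Toy problem: two well-separated Gaussian blobs; every training point is a candidate center.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-2, 0], 0.5, (20, 2)), rng.normal([2, 0], 0.5, (20, 2))])
y = np.array([-1.0] * 20 + [1.0] * 20)
Phi_pool = rbf_features(X, X)
selected = forward_select(Phi_pool, y, k=3)   # sparsity is controlled explicitly via k
w, b = fit_l2_svm(Phi_pool[:, selected], y)
acc = np.mean(np.sign(Phi_pool[:, selected] @ w + b) == y)
```

Note that the candidate pool need not consist of kernel evaluations at training points; any feature functions could populate `Phi_pool`, which is the flexibility the abstract emphasizes.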