Computational Management Science, Volume 6, Issue 1, pp 41–51

Self-adaptive support vector machines: modelling and experiments

Original Paper

Abstract

Method

In this paper, we introduce a bi-level optimization formulation for the model and feature selection problems of support vector machines (SVMs). The bi-level model selects the best model at the master level, with the standard convex quadratic optimization problem of SVM training cast as the subproblem.
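As an illustration only (the implementation details below are not taken from the paper), the following sketch casts the bi-level structure in Python: an RBF kernel parameterized by gamma, scikit-learn's SVC as the inner QP solver, and a bounded scalar search from SciPy at the master level.

```python
# Minimal sketch of the bi-level idea: the inner level solves the standard
# SVM training QP for a fixed kernel parameter; the master level minimizes
# the QP's optimal objective value over a feasible range of that parameter.
# The RBF kernel, SVC solver, and search method are assumptions.
import numpy as np
from scipy.optimize import minimize_scalar
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.datasets import make_classification

# Toy binary classification data standing in for a real training set.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

def inner_qp_value(gamma, C=1.0):
    """Solve the SVM training QP for a fixed RBF parameter `gamma` and
    return its optimal objective value. By strong duality the dual
    maximum, sum(alpha) - 0.5 * a' K a with a_i = y_i * alpha_i,
    equals the primal minimum."""
    clf = SVC(C=C, kernel="rbf", gamma=gamma).fit(X, y)
    a = clf.dual_coef_.ravel()          # signed coefficients y_i * alpha_i
    K = rbf_kernel(clf.support_vectors_, gamma=gamma)
    return np.abs(a).sum() - 0.5 * a @ K @ a

# Master level: search the kernel parameter over a feasible interval,
# minimizing the implicitly defined optimal-value function.
res = minimize_scalar(inner_qp_value, bounds=(1e-3, 10.0), method="bounded")
print("selected gamma:", res.x, "inner QP value:", res.fun)
```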

Feasibility

The optimal objective value of the SVM quadratic problem is minimized over a feasible range of the kernel parameters at the master level of the bi-level model. Since the optimal objective value of the subproblem is a continuous function of the kernel parameters, though only implicitly defined over a certain region, a solution of this bi-level problem always exists. The problem of feature selection can be handled in a similar manner.
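A corresponding sketch for feature selection, under the same assumed setup: the single kernel parameter is replaced by one non-negative scaling weight per feature, and the master level optimizes the weight vector; weights driven to (near) zero mark features the kernel effectively ignores. The weighted kernel and the choice of L-BFGS-B are illustrative assumptions, not the paper's formulation.

```python
# Feature-selection analogue: one non-negative scaling weight per feature
# inside the kernel, optimized at the master level. Scaling feature k by
# sqrt(w_k) makes the isotropic RBF kernel equal to
# exp(-sum_k w_k (x_k - z_k)^2). All names here are illustrative.
import numpy as np
from scipy.optimize import minimize
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

def inner_qp_value_weighted(w, C=1.0):
    """Optimal SVM QP value for per-feature kernel weights `w`."""
    Xw = X * np.sqrt(np.clip(w, 0.0, None))   # per-feature scaling in the kernel
    clf = SVC(C=C, kernel="rbf", gamma=1.0).fit(Xw, y)
    a = clf.dual_coef_.ravel()
    K = rbf_kernel(clf.support_vectors_, gamma=1.0)
    return np.abs(a).sum() - 0.5 * a @ K @ a

# Master level: optimize the weight vector over its feasible box; weights
# pushed to (near) zero indicate features that can be dropped.
w0 = np.full(X.shape[1], 0.1)
res = minimize(inner_qp_value_weighted, w0,
               bounds=[(0.0, 10.0)] * X.shape[1], method="L-BFGS-B")
print("feature weights:", np.round(res.x, 3))
```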

Experiments and results

Two approaches for solving the bi-level problem of model and feature selection are also considered. Experimental results show that the bi-level formulation provides a plausible tool for model selection.
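The abstract does not name the two solution approaches. As one plausible derivative-free alternative to the bounded search above (again an assumption, not the paper's method), the master level can simply be a grid search over the kernel parameter, reusing `inner_qp_value` from the first sketch:

```python
# Derivative-free master level: evaluate the inner QP's optimal value on a
# grid of kernel parameters and keep the best. This sidesteps having to
# differentiate the implicitly defined value function. The grid and its
# range are illustrative.
import numpy as np

gammas = np.logspace(-3, 1, 25)            # assumed feasible range for gamma
values = [inner_qp_value(g) for g in gammas]
print("grid-selected gamma:", gammas[int(np.argmin(values))])
```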

Keywords

Support vector machines (SVMs) · Machine learning · Model selection · Feature selection · Bi-level programming



Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  1. Corporate Services, Lake Simcoe Region Conservation Authority, Newmarket, Canada
  2. Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana, USA
  3. Department of Computing and Software, School of Computational Engineering and Science, McMaster University, Hamilton, Canada
