Support vector machine classification with indefinite kernels

Abstract

We propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our algorithm simultaneously computes support vectors and a proxy kernel matrix used in forming the loss. This can be interpreted as a penalized kernel learning problem where indefinite kernel matrices are treated as noisy observations of a true Mercer kernel. Our formulation keeps the problem convex, and relatively large problems can be solved efficiently using the projected gradient or analytic center cutting plane methods. We compare the performance of our technique with other methods on several standard data sets.
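The proxy-kernel idea can be illustrated with a small sketch. It assumes, as an illustration rather than a statement of the paper's exact formulation, that the inner kernel-learning step minimizes -1/2 Tr(K (Yα)(Yα)^T) + ρ ||K - K0||_F^2 over positive semidefinite K, where K0 is the given indefinite kernel matrix, α the current dual variables, Y = diag(y) the labels, and ρ a penalty weight. Completing the square shows that, under this assumption, the minimizer is the projection of K0 + (Yα)(Yα)^T / (4ρ) onto the positive semidefinite cone. The names `psd_projection`, `proxy_kernel`, `K0`, `alpha`, and `rho` are hypothetical.

```python
import numpy as np

def psd_projection(M):
    """Project a symmetric matrix onto the PSD cone by zeroing negative eigenvalues."""
    M = (M + M.T) / 2.0                      # symmetrize against round-off
    w, V = np.linalg.eigh(M)                 # eigendecomposition of a symmetric matrix
    return (V * np.clip(w, 0.0, None)) @ V.T

def proxy_kernel(K0, y, alpha, rho):
    """Assumed closed form: PSD projection of K0 + (Y alpha)(Y alpha)^T / (4 rho)."""
    v = y * alpha                            # elementwise product, i.e. Y alpha
    return psd_projection(K0 + np.outer(v, v) / (4.0 * rho))

# Toy indefinite "kernel": eigenvalues are 3 and -1.
K0 = np.array([[1.0, 2.0],
               [2.0, 1.0]])
y = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])                 # placeholder dual variables

K = proxy_kernel(K0, y, alpha, rho=10.0)
K_clipped = psd_projection(K0)               # limit of the proxy as rho grows
```

In this sketch a large `rho` keeps the proxy close to the spectrum-clipped `K0`, while a small `rho` lets the current support vectors reshape the kernel more strongly. A full method would alternate or jointly optimize over `alpha` as well, which is omitted here.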

Author information

Corresponding author

Correspondence to Ronny Luss.

About this article

Cite this article

Luss, R., d’Aspremont, A. Support vector machine classification with indefinite kernels. Math. Prog. Comp. 1, 97–118 (2009). https://doi.org/10.1007/s12532-009-0005-5

Mathematics Subject Classification (2000)

  • Primary 62H30
  • Secondary 90C25
  • 68T05