
Relaxing support vectors for classification

Abstract

We introduce a novel modification to standard support vector machine (SVM) formulations based on a limited amount of penalty-free slack, which reduces the influence of misclassified samples or outliers. We show that free slack relaxes support vectors and pushes them towards their respective classes; hence we call our method relaxed support vector machines (RSVM). We present theoretical properties of the RSVM formulation and develop its dual formulation for nonlinear classification via kernels. We show the connection between the dual RSVM and the dual of the standard SVM formulations. We provide error bounds for RSVM, show that the method is stable and universally consistent, and show that its error bounds are tighter than those for standard SVM. We also introduce a linear programming version of RSVM, which we call RSVMLP. We apply RSVM and RSVMLP to synthetic data and benchmark binary classification problems, and compare our results with standard SVM classification results. We show that relaxed influential support vectors may lead to better classification results. We develop a two-phase method called RSVM2 for multiple instance classification (MIC) problems, where RSVM formulations are used as classifiers. We extend the two-phase method to the linear programming case and develop RSVMLP2. We demonstrate the classification characteristics of RSVM2 and RSVMLP2, and compare our classification results with those obtained by other SVM-based MIC methods on public benchmark datasets. We show that both RSVM2 and RSVMLP2 are faster and produce more accurate classification results than these methods.
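
The abstract does not state the formulation itself, so the following is only an illustrative sketch of how a soft-margin SVM might be given a limited budget of penalty-free slack. The symbols are assumptions introduced here rather than taken from the article: $\xi_i$ is the usual penalized slack, $\upsilon_i$ is the free slack, $\Upsilon \ge 0$ is the total free-slack budget, and $C$ is the usual trade-off parameter. The article's actual formulation may differ, for example in the penalty term or in how the budget is scaled.

```latex
% Illustrative sketch only: soft-margin SVM with a budget of penalty-free slack.
% \upsilon_i (free slack) and \Upsilon (budget) are assumed notation.
\begin{aligned}
\min_{w,\,b,\,\xi,\,\upsilon}\quad
  & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad
  & y_i\bigl(\langle w, x_i\rangle + b\bigr) \;\ge\; 1 - \xi_i - \upsilon_i,
    \qquad i = 1,\dots,n, \\
  & \sum_{i=1}^{n}\upsilon_i \;\le\; \Upsilon, \\
  & \xi_i \ge 0,\quad \upsilon_i \ge 0,
    \qquad i = 1,\dots,n.
\end{aligned}
```

In this sketch, setting $\Upsilon = 0$ forces every $\upsilon_i$ to zero and recovers the standard soft-margin SVM. With $\Upsilon > 0$, the free slack can absorb part of the margin violation of a few samples at no cost in the objective, which is one way to read the abstract's description of reducing the influence of misclassified samples or outliers.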




Acknowledgements

This research is supported in part by NASA Award NNX09AR44A and NIH-NIAID Award UH3AI08326-01.

Author information

Corresponding author

Correspondence to Onur Şeref.

About this article

Cite this article

Şeref, O., Chaovalitwongse, W.A. & Brooks, J.P. Relaxing support vectors for classification. Ann Oper Res 216, 229–255 (2014). https://doi.org/10.1007/s10479-012-1193-3


  • DOI: https://doi.org/10.1007/s10479-012-1193-3

Keywords

  • Classification
  • Support vector machines
  • Error bounds
  • Multiple instance classification