
Feature uncertainty bounds for explicit feature maps and large robust nonlinear SVM classifiers

Published in: Annals of Mathematics and Artificial Intelligence

Abstract

We consider the binary classification problem when data are large and subject to unknown but bounded uncertainties. We address the problem by formulating the nonlinear support vector machine training problem with robust optimization. To do so, we analyze and propose two bounding schemes for uncertainties associated with random approximate features in low-dimensional spaces. The proposed bound calculations are based on the Random Fourier Features and Nyström methods. Numerical experiments are conducted to illustrate the benefit of the technique. We also emphasize the decomposable structure of the proposed robust nonlinear formulation, which allows the use of efficient stochastic approximation techniques when datasets are large.
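To make the notion of "random approximate features in low-dimensional spaces" concrete, the following is a minimal NumPy sketch of the Random Fourier Features idea for the Gaussian (RBF) kernel: sampling frequencies from the kernel's spectral density yields an explicit map z such that the inner product z(x)·z(y) approximates k(x, y). All variable names and parameter values here are illustrative, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, sigma = 5, 2000, 1.0  # input dim, feature dim, kernel bandwidth

# Sample frequencies from the spectral density of the Gaussian kernel
# and random phase shifts, as in Rahimi & Recht's construction.
W = rng.normal(0.0, 1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def rff(x):
    """Explicit low-dimensional feature map with z(x).z(y) ~ k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * sigma ** 2))
approx = rff(x) @ rff(y)
print(abs(exact - approx))  # approximation error shrinks as D grows
```

Because the feature map is explicit, the nonlinear SVM can then be trained as a linear one in the approximate feature space, which is what makes robust formulations and stochastic approximation tractable at scale; the article's contribution is bounding how input uncertainty propagates through such maps.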



Author information


Correspondence to Nicolas Couellan.


About this article


Cite this article

Couellan, N., Jan, S. Feature uncertainty bounds for explicit feature maps and large robust nonlinear SVM classifiers. Ann Math Artif Intell 88, 269–289 (2020). https://doi.org/10.1007/s10472-019-09676-0
