In-depth analysis of SVM kernel learning and its components

  • Original Article
Neural Computing and Applications

Abstract

The performance of support vector machines in nonlinearly separable classification problems depends strongly on the kernel function. In pursuit of an automatic machine learning approach for this technique, many studies have addressed the challenge of automatically learning well-performing kernels for support vector machines. However, these works have been carried out without a thorough analysis of the set of components that influence the behavior of support vector machines and their interaction with the kernel. These components are intricately related, and it is difficult to provide a comprehensible analysis of their joint effect. In this paper, we try to fill this gap by introducing the steps necessary to understand these interactions, and we provide clues that help the research community decide where to place the emphasis. First, we identify all the factors that affect the final performance of support vector machines in relation to the elicitation of kernels. Next, we analyze the factors independently or in pairs and study the influence of each component on the final classification performance, providing recommendations and insights into kernel setting for support vector machines.
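As a rough illustration of how strongly a kernel machine can depend on its kernel in a nonlinearly separable problem, the following minimal sketch trains the same learner twice on the XOR problem, once with a linear kernel and once with an RBF kernel. It is not the method studied in the article: for brevity it uses a kernel perceptron instead of a support vector machine, and the function names, the toy data, and the value `gamma=1.0` are all assumptions chosen for this example.

```python
import math


def linear_kernel(x, z):
    # Dot product plus 1, so the implied linear model includes a bias term.
    return sum(a * b for a, b in zip(x, z)) + 1.0


def rbf_kernel(x, z, gamma=1.0):
    # Gaussian (RBF) kernel; gamma=1.0 is an arbitrary illustrative value.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision_value(x, X, y, alpha, kernel):
    # Kernel expansion f(x) = sum_j alpha_j * y_j * k(x_j, x).
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y))


def train_kernel_perceptron(X, y, kernel, max_epochs=50):
    # alpha[i] counts how often training point i was misclassified.
    alpha = [0] * len(X)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_value(xi, X, y, alpha, kernel) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alpha


def accuracy(X, y, alpha, kernel):
    correct = 0
    for xi, yi in zip(X, y):
        pred = 1 if decision_value(xi, X, y, alpha, kernel) > 0 else -1
        correct += int(pred == yi)
    return correct / len(X)


# XOR: the classic example of a problem that is not linearly separable.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1, 1, 1, -1]

alpha_linear = train_kernel_perceptron(X, y, linear_kernel)
alpha_rbf = train_kernel_perceptron(X, y, rbf_kernel)

acc_linear = accuracy(X, y, alpha_linear, linear_kernel)
acc_rbf = accuracy(X, y, alpha_rbf, rbf_kernel)

print("linear kernel training accuracy:", acc_linear)
print("RBF kernel training accuracy:", acc_rbf)
```

In this toy setting the learner and the data are identical in both runs, so any difference in training accuracy comes from the kernel alone. No linear decision boundary can separate XOR, whereas the RBF kernel makes the four points separable in its feature space.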

Notes

  1. https://deap.readthedocs.io.

  2. https://pypi.org/project/evocov/.

  3. https://pypi.org/project/ksvmlib/.

Acknowledgements

This work has been supported by the Spanish Ministry of Science and Innovation (projects TIN2016-78365-R and PID2019-104966GB-I00) and the Basque Government (projects KK-2020/00049 and IT1244-19, and the ELKARTEK program). Jose A. Lozano is also supported by BERC 2018-2021 (Basque Government) and BCAM Severo Ochoa accreditation SEV-2017-0718 (Spanish Ministry of Science and Innovation).

Author information

Corresponding author

Correspondence to Ibai Roman.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Roman, I., Santana, R., Mendiburu, A. et al. In-depth analysis of SVM kernel learning and its components. Neural Comput & Applic 33, 6575–6594 (2021). https://doi.org/10.1007/s00521-020-05419-z

  • DOI: https://doi.org/10.1007/s00521-020-05419-z
