Abstract
Support vector machines (SVM) are strong classifiers, but large datasets can lead to prohibitively long computation times and high memory requirements. SVM ensembles, in which each individual SVM sees only a fraction of the data, are one approach to overcoming this barrier. Continuing related work in this field, we construct SVM ensembles with Bagging and Boosting. As a new idea, we analyze SVM ensembles in which different kernel types (linear, polynomial, RBF) are combined within the ensemble. The goal is to train one strong SVM ensemble classifier for large datasets with less time and memory than a single SVM trained on all the data. Our experiments provide evidence for the following findings: Combining different kernel types can yield an ensemble classifier stronger than each individual SVM trained on all training data and stronger than ensembles built from a single kernel type alone. Boosting is only productive if each individual SVM is made sufficiently weak; otherwise we observe overfitting. Even for very small training sample sizes, and thus greatly reduced time and memory requirements, the ensemble approach often delivers accuracies similar or close to those of a single SVM trained on all data.
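The ensemble idea described in the abstract can be sketched as follows: train each member SVM on a small bootstrap sample, cycle through the kernel types within the ensemble, and combine predictions by majority vote. This is a minimal illustration assuming scikit-learn's `SVC` (a LIBSVM wrapper); the dataset, member count, and sample fraction are illustrative choices, not the paper's experimental setup.

```python
# Sketch: mixed-kernel SVM ensemble via bagging with majority vote.
# All parameter choices here are illustrative, not taken from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

kernels = ["linear", "poly", "rbf"]  # mix kernel types inside the ensemble
n_members, frac = 9, 0.2             # each SVM sees only a fraction of the data

members = []
for i in range(n_members):
    # Bootstrap sample: a small fraction of the training set, drawn with replacement
    idx = rng.choice(len(X_tr), size=int(frac * len(X_tr)), replace=True)
    clf = SVC(kernel=kernels[i % len(kernels)], gamma="scale")
    clf.fit(X_tr[idx], y_tr[idx])
    members.append(clf)

# Majority vote over the binary predictions of all ensemble members
votes = np.stack([m.predict(X_te) for m in members])
y_hat = (votes.mean(axis=0) > 0.5).astype(int)
acc = (y_hat == y_te).mean()
print(f"ensemble accuracy: {acc:.3f}")
```

Because each member trains on only 20% of the data, the per-member quadratic-programming problem is far smaller than a single SVM on the full set, which is the source of the time and memory savings the abstract refers to.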
References
Breiman, L. (1996): Bagging predictors. In: Machine Learning, 24(2), 123–140.
Caputo, B., Sim, K., Furesjo, F., & Smola, A. (2002). Appearance-based object recognition using SVMs: Which kernel should I use? In Proceedings of the NIPS Workshop on Statistical Methods for Computational Experiments in Visual Processing and Computer Vision, Whistler.
Chang, C., & Lin, C. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3), 27.
Chang, E. Y., Zhu, K., et al. (2008). PSVM: Parallelizing support vector machines on distributed computers. Advances in Neural Information Processing Systems, 20, 16.
Cortes, C., Mohri, M., & Rostamizadeh, A. (2012). Ensembles of kernel predictors. arXiv preprint arXiv:1202.3712.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
Crammer, K., & Singer, Y. (2002). On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2, 265–292.
Cristianini, N., & Shawe-Taylor, J. (2000). Support vector machines. Cambridge: Cambridge University Press.
Dimitriadou, E., Hornik, K., Leisch, F., Meyer, D., & Weingessel, A. (2008). Misc functions of the department of statistics (e1071), TU Wien. R package, version 1.5-18. http://CRAN.R-project.org/package=e1071.
Freund, Y., & Schapire, R. E. (1995). A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the Second European Conference on Computational Learning Theory (EuroCOLT) (pp. 23–37).
Kim, H.-C., Pang, S., Je, H.-M., Kim, D., & Bang, S. Y. (2003). Constructing support vector machine ensemble. Pattern Recognition, 36(12), 2757–2767.
Lin, H.-T., & Li, L. (2008). Support vector machinery for infinite ensemble learning. The Journal of Machine Learning Research, 9, 285–312.
Mercer, J. (1909). Functions of positive and negative type, and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London, 209, 415–446.
Meyer, O., Bischl, B., & Weihs, C. (2013). Support vector machines on large data sets: Simple parallel approaches. In M. Spiliopoulou, et al. (Eds.), Data analysis, machine learning and knowledge discovery. New York: Springer.
Pavlov, D., Mao, J., & Dom, B. (2000). Scaling-up Support Vector Machines using boosting algorithm. In Proceedings of the 15th International Conference on Pattern Recognition (Vol. 2, pp. 219–222). IEEE, Barcelona.
Schölkopf, B., & Smola, A. (2002). Learning with kernels: Support vector machines, regularization, optimization and beyond. Cambridge, MA: MIT Press.
Wang, S., Mathew, A., Chen, Y., Xi, L., Ma, L., & Lee, J. (2009). Empirical analysis of support vector machine ensemble classifiers. Expert Systems with Applications, 36(3), 6466–6476.
Weston, J., & Watkins, C. (1999). Support Vector Machines for multi-class pattern recognition. In Proceedings of the 7th European Symposium on Artificial Neural Networks (ESANN) (Vol. 99, pp. 61–72).
Wickramaratna, J., Holden, S. B., & Buxton, B. (2001). Performance degradation in boosting. In J. Kittler & F. Roli (Eds.), Proceedings of the 2nd International Workshop on Multiple Classifier Systems (pp. 11–21). Cambridge: Cambridge University Press.
Yu, H., Yang, J., Han, J., & Li, X. (2005). Making SVMs scalable to large data sets using hierarchical cluster indexing. Data Mining and Knowledge Discovery, 11(3), 295–321.
Acknowledgements
This work has been partially supported by the Bundesministerium für Bildung und Forschung (BMBF) under the grant SOMA (AiF FKZ 17N1009) and by the Cologne University of Applied Sciences under the research focus grant COSA.
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stork, J., Ramos, R., Koch, P., Konen, W. (2015). SVM Ensembles Are Better When Different Kernel Types Are Combined. In: Lausen, B., Krolak-Schwerdt, S., Böhmer, M. (eds) Data Science, Learning by Latent Structures, and Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44983-7_17
DOI: https://doi.org/10.1007/978-3-662-44983-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44982-0
Online ISBN: 978-3-662-44983-7
eBook Packages: Mathematics and Statistics (R0)