An Alternating Genetic Algorithm for Selecting SVM Model and Training Set
Support vector machines (SVMs) have proved highly effective in numerous pattern recognition tasks. Although training SVMs on large data sets is challenging, this obstacle may be mitigated by selecting a small, yet representative, subset of the entire training set. Another crucial and extensively investigated problem is selecting the SVM model. Many methods have been proposed to deal with these two problems independently; however, to the best of our knowledge, how to combine the two processes effectively has not been explored. Notably, a different SVM model may be optimal depending on the subset selected for training, so performing the two operations simultaneously is potentially beneficial. In this paper, we propose a new method to select both the training set and the SVM model, using a genetic algorithm which alternately optimizes two different populations. We demonstrate that our approach is competitive with sequential optimization of the hyperparameters followed by selecting the training set. We report the results obtained for several benchmark data sets and we visualize the results obtained for artificial sets of 2D points.
Keywords: Support vector machines, Model selection, Training set selection, Genetic algorithms
This work was supported by the National Centre for Research and Development under the grant: POIR.01.02.00-00-0030/15.
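The abstract describes a genetic algorithm that alternately optimizes two populations, one of training subsets and one of SVM models, but gives no implementation details. The following is only a minimal sketch of that alternating idea and not the authors' algorithm. All names (`evolve`, `alternating_ga`, `fitness`) are hypothetical, the selection and mutation operators are simple placeholders, and a model is assumed to be a pair of log-scaled hyperparameters such as (log10 C, log10 gamma). The `fitness(subset, model)` callable is an assumption standing in for "train an SVM with this model on this subset and score it on validation data".

```python
import random


def evolve(population, score, mutate, rng):
    """One generation: keep the better half, refill with mutated survivors."""
    ranked = sorted(population, key=score, reverse=True)
    survivors = ranked[: max(1, len(ranked) // 2)]
    children = [mutate(rng.choice(survivors), rng)
                for _ in range(len(ranked) - len(survivors))]
    return survivors + children


def alternating_ga(n_samples, subset_size, fitness,
                   rounds=10, pop_size=8, seed=0):
    """Alternately evolve training subsets and models (hypothetical sketch)."""
    rng = random.Random(seed)

    # Population 1: candidate training subsets, as sorted tuples of indices.
    subsets = [tuple(sorted(rng.sample(range(n_samples), subset_size)))
               for _ in range(pop_size)]
    # Population 2: candidate models, assumed to be (log10 C, log10 gamma).
    models = [(rng.uniform(-2, 3), rng.uniform(-3, 2))
              for _ in range(pop_size)]

    def mutate_subset(subset, rng):
        # Swap one selected index for one that is currently not selected.
        chosen = list(subset)
        outside = [i for i in range(n_samples) if i not in subset]
        if outside:
            chosen[rng.randrange(len(chosen))] = rng.choice(outside)
        return tuple(sorted(chosen))

    def mutate_model(model, rng):
        # Small Gaussian perturbation of each log-scaled hyperparameter.
        return tuple(v + rng.gauss(0, 0.3) for v in model)

    best_subset, best_model = subsets[0], models[0]
    history = []
    for _ in range(rounds):
        # Phase A: the model is fixed, the subset population evolves.
        def subset_score(s):
            return fitness(s, best_model)
        subsets = evolve(subsets, subset_score, mutate_subset, rng)
        best_subset = max(subsets, key=subset_score)

        # Phase B: the subset is fixed, the model population evolves.
        def model_score(m):
            return fitness(best_subset, m)
        models = evolve(models, model_score, mutate_model, rng)
        best_model = max(models, key=model_score)

        history.append(fitness(best_subset, best_model))
    return best_subset, best_model, history
```

Because each phase keeps its best individuals and the current best subset and model always remain in their populations, the recorded joint fitness cannot decrease from round to round when `fitness` is deterministic. With a stochastic fitness (for example, one based on random validation splits) this guarantee no longer holds.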