This paper presents a new method for selecting valuable training data for support vector machines (SVMs) from large, noisy sets using a genetic algorithm (GA). SVM training data selection is a known but not extensively investigated problem. Existing methods rely mainly on analyzing the geometric properties of the data or adopt randomized selection, and, to the best of our knowledge, GA-based approaches have not yet been applied for this purpose. Our work was inspired by problems encountered when using SVMs for skin segmentation. Because the training set is very large, the existing methods are too time-consuming, and random selection is not effective because the set is noisy. In the work reported here, we demonstrate how a GA can be used to optimize the training set, and we present extensive experimental results which confirm that the new method is highly effective for real-world data.
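
The abstract does not specify the chromosome encoding, the genetic operators, or the fitness function, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a chromosome is a fixed-size set of indices into the training pool, and that fitness is the accuracy, on a separate validation set, of a classifier trained on the selected samples. To stay dependency-free, a nearest-centroid classifier on one-dimensional features stands in for the SVM; all function names and parameter values (`select_training_subset`, `centroid_accuracy`, `make_noisy_data`, `pop_size`, `mutation_rate`, and so on) are hypothetical.

```python
import random


def centroid_accuracy(train, validation):
    """Stand-in fitness: nearest-centroid classifier fitted on `train`,
    scored on `validation`. An SVM would be trained here instead."""
    sums = {}
    for x, y in train:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    if len(sums) < 2:
        return 0.0  # a subset containing one class cannot separate two
    centroids = {y: total / count for y, (total, count) in sums.items()}
    correct = 0
    for x, y in validation:
        predicted = min(centroids, key=lambda c: abs(x - centroids[c]))
        correct += predicted == y
    return correct / len(validation)


def select_training_subset(data, validation, subset_size, fitness,
                           pop_size=20, generations=30,
                           mutation_rate=0.1, seed=0):
    """Evolve index subsets of `data` that maximize `fitness` (assumed design)."""
    rng = random.Random(seed)

    def score(chromosome):
        return fitness([data[i] for i in chromosome], validation)

    # Initial population: random subsets of distinct indices.
    population = [rng.sample(range(len(data)), subset_size)
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:pop_size // 2]       # truncation selection
        children = []
        while len(children) < pop_size - 1:    # one slot is kept for the elite
            a, b = rng.sample(parents, 2)
            pool = list(set(a) | set(b))       # crossover: draw from the union
            child = rng.sample(pool, subset_size)
            for k in range(subset_size):       # mutation: swap in a new index
                if rng.random() < mutation_rate:
                    candidate = rng.randrange(len(data))
                    if candidate not in child:
                        child[k] = candidate
            children.append(child)
        population = [ranked[0]] + children    # elitism
    return max(population, key=score)


def make_noisy_data(n, noise, rng):
    """Two 1-D Gaussian classes; a fraction `noise` of labels is flipped."""
    samples = []
    for _ in range(n):
        label = rng.randrange(2)
        x = rng.gauss(2.0 * label, 0.5)
        if rng.random() < noise:
            label = 1 - label
        samples.append((x, label))
    return samples


data_rng = random.Random(1)
train_pool = make_noisy_data(300, 0.2, data_rng)   # large, noisy pool
validation = make_noisy_data(100, 0.0, data_rng)   # clean validation set
chosen = select_training_subset(train_pool, validation,
                                subset_size=10, fitness=centroid_accuracy)
print(len(chosen),
      centroid_accuracy([train_pool[i] for i in chosen], validation))
```

Because the best chromosome is carried over unchanged in every generation, the best fitness in the population never decreases. Replacing `centroid_accuracy` with a function that trains an SVM on the selected samples would bring the sketch closer to the setting described above, at the cost of one SVM training run per fitness evaluation.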


Keywords: Support Vector Machine; Support Vector Machine Training; Genetic Algorithm Process; Skin Segmentation; Genetic Algorithm Strategy



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Michal Kawulok (1)
  • Jakub Nalepa (1)

  1. Institute of Informatics, Silesian University of Technology, Gliwice, Poland
