Efficient Support Vector Machine Classification Using Prototype Selection and Generation

  • Stefanos Ougiaroglou
  • Konstantinos I. Diamantaras
  • Georgios Evangelidis
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 475)


Although Support Vector Machines (SVMs) are considered effective supervised learning methods, their training procedure is time-consuming and has high memory requirements, which makes them inappropriate for large datasets. Many Data Reduction Techniques have been proposed to address the analogous drawbacks of k-Nearest Neighbor classification. This paper adopts the concept of data reduction in order to cope with the high computational cost and memory requirements of the SVM training process. Experimental results illustrate that Data Reduction Techniques, applied as a preprocessing step on the training data, can effectively improve the performance of SVMs.
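As a rough illustration of the preprocessing idea, the sketch below implements Hart's Condensed Nearest Neighbor (CNN) rule, a classic prototype-selection technique: it keeps only the training instances needed to classify the rest of the training set correctly, and the condensed set would then be fed to any SVM trainer. This is a minimal pure-Python sketch; the toy dataset and helper names are illustrative assumptions, not taken from the paper, and the paper's actual choice of reduction techniques and SVM configuration is not reproduced here.

```python
def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nn_label(point, prototypes):
    """1-NN classification of `point` against the current prototype store."""
    return min(prototypes, key=lambda p: dist2(point, p[0]))[1]

def cnn_condense(data):
    """Hart's CNN rule: data is a list of (features, label) pairs.

    Seed the store with the first instance, then repeatedly pass over
    the training set, absorbing every instance the store misclassifies,
    until a full pass adds nothing.
    """
    store = [data[0]]
    changed = True
    while changed:
        changed = False
        for x, y in data:
            if nn_label(x, store) != y:   # misclassified -> absorb it
                store.append((x, y))
                changed = True
    return store

# Two well-separated classes: condensing retains only a few instances,
# yet the reduced set still classifies the full training set correctly.
train = [((i, 0.0), 0) for i in range(10)] + [((i, 5.0), 1) for i in range(10)]
reduced = cnn_condense(train)
print(len(reduced), "of", len(train), "instances retained")
```

The reduced set plays the role of the full training set in the subsequent, now much cheaper, SVM training step; prototype-generation methods differ only in that the stored points are newly created (e.g., cluster centroids) rather than selected originals.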


Keywords: Support Vector Machines · k-NN classification · Data reduction · Prototype abstraction · Prototype generation · Condensing

Copyright information

© IFIP International Federation for Information Processing 2016

Authors and Affiliations

  • Stefanos Ougiaroglou (1)
  • Konstantinos I. Diamantaras (1)
  • Georgios Evangelidis (2)
  1. Department of Information Technology, Alexander TEI of Thessaloniki, Sindos, Greece
  2. Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece