Fast Local Support Vector Machines for Large Datasets

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5632)

Abstract

Local SVM is a classification approach that combines instance-based learning and statistical machine learning. It builds an SVM on the feature-space neighborhood of the query point in the training set and uses that SVM to predict the query's class. There is both empirical and theoretical evidence that Local SVM can improve on SVM and kNN in terms of classification accuracy, but its computational cost restricts the method to small datasets. Here we propose FastLSVM, a classifier based on Local SVM that reduces the number of SVMs that must be built, making the approach suitable for large datasets. FastLSVM precomputes a set of local SVMs on the training set and assigns to each model all the points lying in the central neighborhood of the k points on which it is trained. Prediction applies to the query point the model corresponding to its nearest neighbor in the training set. Our empirical evaluation indicates that FastLSVM is a good approximation of Local SVM and that, on big datasets (a large artificial problem with 100,000 samples and a very large real problem with more than 500,000 samples), its computational performance is dramatically better than that of SVM and its existing fast approximations, while also improving generalization accuracy.
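
The precompute-then-look-up scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how model centers are chosen, which kernel is used, or how the local SVMs are solved. The sketch assumes greedy center selection in index order, a hypothetical parameter `k_center` for the size of the central neighborhood, brute-force nearest-neighbor search, and a plain linear SVM trained by subgradient descent as a stand-in solver. All function names are illustrative.

```python
import math

def knn(train_x, query, k):
    """Indices of the k training points closest to `query` (Euclidean, brute force)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return order[:k]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=200):
    """Stand-in local solver (assumption): subgradient descent on the
    L2-regularized hinge loss of a linear SVM with labels in {-1, +1}."""
    dim = len(xs[0])
    if len(set(ys)) == 1:
        # Single-class neighborhood: fall back to a constant model.
        return [0.0] * dim, float(ys[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            violated = y * (dot(w, x) + b) < 1
            w = [wj * (1 - lr * lam) for wj in w]  # regularization shrinkage
            if violated:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def decide(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1

def fit_fast_lsvm(train_x, train_y, k, k_center):
    """Train one local model per uncovered center on its k nearest neighbors and
    assign the k_center nearest ones (the 'central neighborhood') to that model."""
    models, model_of = [], [None] * len(train_x)
    for c in range(len(train_x)):  # greedy index order is an assumption
        if model_of[c] is not None:
            continue  # already covered by an earlier model
        nbrs = knn(train_x, train_x[c], k)
        models.append(train_linear_svm([train_x[i] for i in nbrs],
                                       [train_y[i] for i in nbrs]))
        model_of[c] = len(models) - 1
        for i in nbrs[:k_center]:
            if model_of[i] is None:
                model_of[i] = len(models) - 1
    return models, model_of

def predict_fast_lsvm(models, model_of, train_x, query):
    """Apply the model assigned to the query's nearest training neighbor."""
    nearest = knn(train_x, query, 1)[0]
    return decide(models[model_of[nearest]], query)

def make_xor_clusters():
    """Toy XOR-style data: four 3x3 grids, not linearly separable globally."""
    centers = [((0, 0), 1), ((10, 10), 1), ((0, 10), -1), ((10, 0), -1)]
    xs, ys = [], []
    for (cx, cy), label in centers:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                xs.append((cx + dx, cy + dy))
                ys.append(label)
    return xs, ys

if __name__ == "__main__":
    xs, ys = make_xor_clusters()
    models, model_of = fit_fast_lsvm(xs, ys, k=5, k_center=3)
    print(len(models), "local models for", len(xs), "training points")
    for q in [(0.2, 0.3), (9.5, 0.4)]:
        print(q, predict_fast_lsvm(models, model_of, xs, q))
```

In this sketch, a larger `k_center` means fewer local models are trained, while each model is applied further from the center of the neighborhood it was fitted on. A practical version would presumably replace the brute-force search with an indexed nearest-neighbor structure and the stand-in solver with a kernel SVM.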

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Segata, N., Blanzieri, E. (2009). Fast Local Support Vector Machines for Large Datasets. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_22

  • DOI: https://doi.org/10.1007/978-3-642-03070-3_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03069-7

  • Online ISBN: 978-3-642-03070-3

  • eBook Packages: Computer Science, Computer Science (R0)