Abstract
Local SVM is a classification approach that combines instance-based learning and statistical machine learning. For each query point, it builds an SVM on the query's feature-space neighborhood in the training set and uses that model to predict the query's class. There is both empirical and theoretical evidence that Local SVM can improve on SVM and kNN in classification accuracy, but its computational cost restricts its application to small datasets. Here we propose FastLSVM, a classifier based on Local SVM that reduces the number of SVMs that must be built, making the approach suitable for large datasets. FastLSVM precomputes a set of local SVMs on the training set and assigns to each model all the points lying in the central neighborhood of the k points on which it is trained. Prediction applies to the query point the model assigned to its nearest neighbor in the training set. Our empirical evaluation indicates that FastLSVM is a good approximation of Local SVM and that, on large datasets (a large artificial problem with 100000 samples and a very large real problem with more than 500000 samples), its computational performance is dramatically better than that of SVM and its existing fast approximations, while generalization accuracy also improves.
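The precompute-then-lookup scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the local solver is a plain subgradient-descent linear SVM standing in for a kernel SVM, neighbors are found by brute force, centers are chosen as the first training point not yet assigned to a model, and all names (`fit_fast_local_svm`, `k_prime`, `xor_clusters`, and so on) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(X, q, k):
    """Indices of the k points of X closest to q (brute force)."""
    return sorted(range(len(X)), key=lambda i: sq_dist(X[i], q))[:k]

def train_linear_svm(X, y, lam=0.001, lr=0.1, epochs=300):
    """Stand-in local solver (assumption): subgradient descent on the
    L2-regularised hinge loss. Labels must be +1/-1. Returns (w, b)."""
    if len(set(y)) == 1:
        # Pure neighborhood: fall back to a constant classifier.
        return [0.0] * len(X[0]), float(y[0])
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularisation shrinkage
            if margin < 1.0:                          # hinge loss is active
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def fit_fast_local_svm(X, y, k, k_prime):
    """Precompute local models. Each model is trained on the k nearest
    neighbors of a center and assigned to the k_prime <= k nearest ones
    (the 'central neighborhood'). Returns (models, owner), where
    owner[i] is the index of the model assigned to training point i."""
    models, owner = [], {}
    for c in range(len(X)):
        if c in owner:          # already covered by an earlier model
            continue
        neigh = nearest(X, X[c], k)
        models.append(train_linear_svm([X[i] for i in neigh],
                                       [y[i] for i in neigh]))
        owner[c] = len(models) - 1
        for i in neigh[:k_prime]:
            owner.setdefault(i, len(models) - 1)
    return models, owner

def predict(X, models, owner, q):
    """Apply to q the model assigned to its nearest training point."""
    w, b = models[owner[nearest(X, q, 1)[0]]]
    score = sum(wj * qj for wj, qj in zip(w, q)) + b
    return 1 if score >= 0 else -1

def xor_clusters():
    """Toy data (hypothetical): four tight clusters in an XOR layout."""
    offsets = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
    corners = [((0.0, 0.0), 1), ((1.0, 1.0), 1),
               ((0.0, 1.0), -1), ((1.0, 0.0), -1)]
    X, y = [], []
    for (cx, cy), label in corners:
        for dx, dy in offsets:
            X.append((cx + dx, cy + dy))
            y.append(label)
    return X, y
```

Calling `fit_fast_local_svm(X, y, k=8, k_prime=4)` on the output of `xor_clusters()` trains four models for sixteen training points rather than one model per query, which is the saving the abstract refers to. The XOR layout is used because no single linear classifier can separate it, whereas each local neighborhood is easy to separate.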
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Segata, N., Blanzieri, E. (2009). Fast Local Support Vector Machines for Large Datasets. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_22
DOI: https://doi.org/10.1007/978-3-642-03070-3_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3
eBook Packages: Computer Science, Computer Science (R0)