Speed Up SVM Algorithm for Massive Classification Tasks

Do, Thanh-Nghi; Nguyen, Van-Hoa; Poulet, François

doi:10.1007/978-3-540-88192-6_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5139)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

We present a new parallel and incremental Support Vector Machine (SVM) algorithm for classifying very large datasets on graphics processing units (GPUs). SVMs and related kernel methods have been shown to build accurate models, but training usually requires solving a quadratic program, so for large datasets it demands a large amount of memory and a long training time. We extend the Least Squares SVM (LS-SVM) proposed by Suykens and Vandewalle to build an incremental and parallel algorithm. The new algorithm uses graphics processors to achieve high performance at low cost. Numerical tests on datasets from the UCI and Delve repositories show that our parallel incremental algorithm on GPUs is about 70 times faster than a CPU implementation and often significantly faster (over 1000 times) than state-of-the-art algorithms such as LibSVM, SVM-perf and CB-SVM.
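The abstract does not give the formulation, so the following is only a minimal sketch of how an incremental linear least-squares SVM could work, not the authors' implementation. It assumes a linear classifier with labels in {-1, +1} and a regularized bias term (a proximal-style variant of LS-SVM), for which training reduces to solving the small linear system (I/c + Σ z zᵀ) u = Σ y z with z = [x, 1] and u = [w, b]. The two sums can be accumulated block by block, so only one block of rows has to be in memory at a time, and each block's contribution is independent of the others, which is what makes the computation suitable for parallel hardware. All function names and the parameter `c` are hypothetical, and the sketch runs on the CPU in plain Python rather than on a GPU.

```python
def accumulate_block(G, h, X_block, y_block):
    """Add one block's contribution to G = sum(z z^T) and h = sum(y z), z = [x, 1]."""
    for x, y in zip(X_block, y_block):
        z = list(x) + [1.0]          # append 1 so the bias is learned with w
        for i, zi in enumerate(z):
            h[i] += y * zi
            for j, zj in enumerate(z):
                G[i][j] += zi * zj


def solve_linear_system(M, v):
    """Solve M u = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(M)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for k in range(col, n + 1):
                A[r][k] -= f * A[col][k]
    u = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = sum(A[i][k] * u[k] for k in range(i + 1, n))
        u[i] = (A[i][n] - s) / A[i][i]
    return u


def train_incremental(blocks, n_features, c=1.0):
    """Train on an iterable of (X_block, y_block) pairs; returns (w, b).

    Assumed model: minimize 0.5*(|w|^2 + b^2) + 0.5*c*sum((y - (w.x + b))^2).
    """
    d = n_features + 1
    G = [[0.0] * d for _ in range(d)]
    h = [0.0] * d
    for X_block, y_block in blocks:   # memory use depends on d, not on the row count
        accumulate_block(G, h, X_block, y_block)
    for i in range(d):
        G[i][i] += 1.0 / c            # regularization term I/c
    u = solve_linear_system(G, h)
    return u[:-1], u[-1]


def predict(w, b, x):
    """Return +1 or -1 from the sign of w.x + b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy usage: two blocks of two points each, separable on the first feature.
blocks = [
    ([[2.0, 1.0], [3.0, -1.0]], [1, 1]),
    ([[-2.0, 1.0], [-3.0, -1.0]], [-1, -1]),
]
w, b = train_incremental(blocks, n_features=2, c=1.0)
```

In this sketch the system being solved has size (n + 1) × (n + 1) for n features, regardless of how many rows are streamed through. On a GPU, the per-block accumulation would typically be expressed as a matrix product, which is an assumption about the implementation rather than something the abstract states.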

References

Asuncion, A., Newman, D.J.: UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/
Boser, B., Guyon, I., Vapnik, V.: A Training Algorithm for Optimal Margin Classifiers. In: Proc. of 5th ACM Annual Workshop on Computational Learning Theory, Pittsburgh, Pennsylvania, pp. 144–152 (1992)
Cauwenberghs, G., Poggio, T.: Incremental and Decremental Support Vector Machine Learning. In: Advances in Neural Information Processing Systems, vol. 13, pp. 409–415. MIT Press, Cambridge (2001)
Chang, C.-C., Lin, C.-J.: LIBSVM – A Library for Support Vector Machines (2003), http://www.csie.ntu.edu.tw/~cjlin/libsvm/
Delve, Data for evaluating learning in valid experiments (1996), http://www.cs.toronto.edu/~delve
Do, T.-N., Fekete, J.-D.: Large Scale Classification with Support Vector Machine Algorithms. In: Proc. of ICMLA 2007, 6th International Conference on Machine Learning and Applications, pp. 7–12. IEEE Press, Ohio (2007)
Do, T.-N., Poulet, F.: Classifying one Billion Data with a New Distributed SVM Algorithm. In: Proc. of RIVF 2006, 4th IEEE International Conference on Computer Science, Research, Innovation and Vision for the Future, Ho Chi Minh, Vietnam, pp. 59–66 (2006)
Domingos, P., Hulten, G.: A General Framework for Mining Massive Data Streams. Journal of Computational and Graphical Statistics 12(4), 945–949 (2003)
Dongarra, J., Pozo, R., Walker, D.: LAPACK++: a design overview of object-oriented extensions for high performance linear algebra. In: Proc. of Supercomputing 1993, pp. 162–171. IEEE Press, Los Alamitos (1993)
Fayyad, U., Piatetsky-Shapiro, G., Uthurusamy, R.: Summary from the KDD-03 Panel – Data Mining: The Next 10 Years. In: SIGKDD Explorations, vol. 5(2), pp. 191–196 (2004)
Fung, G., Mangasarian, O.: Incremental Support Vector Machine Classification. In: Proc. of the 2nd SIAM Int. Conf. on Data Mining SDM 2002 Arlington, Virginia, USA (2002)
Guyon, I.: Web Page on SVM Applications (1999), http://www.clopinet.com/isabelle/Projects/SVM/app-list.html
Joachims, T.: Training Linear SVMs in Linear Time. In: Proc. of the ACM SIGKDD Intl Conf. on KDD, pp. 217–226 (2006)
Mangasarian, O.: A Finite Newton Method for Classification Problems. Data Mining Institute Technical Report 01-11, Computer Sciences Department, University of Wisconsin (2001)
Mangasarian, O., Musicant, D.: Lagrangian Support Vector Machines. Journal of Machine Learning Research 1, 161–177 (2001)
NVIDIA® CUDA™, CUDA Programming Guide 1.1 (2007)
NVIDIA® CUDA™, CUDA CUBLAS Library 1.1 (2007)
Platt, J.: Fast Training of Support Vector Machines Using Sequential Minimal Optimization. In: Schoelkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods – Support Vector Learning, pp. 185–208 (1999)
Poulet, F., Do, T.-N.: Mining Very Large Datasets with Support Vector Machine Algorithms. In: Camp, O., Filipe, J., Hammoudi, S. (eds.) Enterprise Information Systems V, pp. 177–184. Kluwer Academic Publishers, Dordrecht (2004)
Suykens, J., Vandewalle, J.: Least Squares Support Vector Machines Classifiers. Neural Processing Letters 9(3), 293–300 (1999)
Syed, N., Liu, H., Sung, K.: Incremental Learning with Support Vector Machines. In: Proc. of the 6th ACM SIGKDD Int. Conf. on KDD 1999, San Diego, USA (1999)
Tong, S., Koller, D.: Support Vector Machine Active Learning with Applications to Text Classification. In: Proc. of ICML 2000, the 17th Int. Conf. on Machine Learning, Stanford, USA, pp. 999–1006 (2000)
Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)
Wasson: Nvidia’s GeForce 8800 graphics processor, Technical report, PC Hardware Explored (2006)
Yu, H., Yang, J., Han, J.: Classifying Large Data Sets Using SVMs with Hierarchical Clusters. In: Proc. of the ACM SIGKDD Intl Conf. on KDD, pp. 306–315 (2003)

Author information

Authors and Affiliations

College of Information Technology, Can Tho University, 1 Ly Tu Trong street, Can Tho city, Vietnam
Thanh-Nghi Do
IRISA Symbiose, Campus de Beaulieu, 35042, Rennes Cedex, France
Van-Hoa Nguyen
IRISA Texmex, Campus de Beaulieu, 35042, Rennes Cedex, France
François Poulet

Editor information

Editors and Affiliations

School of Computer Science, Sichuan University, 610065, Chengdu, China
Changjie Tang
Department of Computer Science, The University of Western Ontario, Canada
Charles X. Ling
School of ITEE, The University of Queensland, Australia
Xiaofang Zhou
Faculty of Science & Engineering, York University, 355 Lumbers Building, M3J 1P3, Toronto, Ontario, Canada
Nick J. Cercone
School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, 4072, Queensland, Australia
Xue Li

About this paper

Cite this paper

Do, TN., Nguyen, VH., Poulet, F. (2008). Speed Up SVM Algorithm for Massive Classification Tasks. In: Tang, C., Ling, C.X., Zhou, X., Cercone, N.J., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2008. Lecture Notes in Computer Science, vol 5139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88192-6_15

DOI: https://doi.org/10.1007/978-3-540-88192-6_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88191-9
Online ISBN: 978-3-540-88192-6
eBook Packages: Computer Science, Computer Science (R0)
