World Wide Web, Volume 21, Issue 2, pp 261–287

Scalable and fast SVM regression using modern hardware

  • Zeyi Wen
  • Rui Zhang
  • Kotagiri Ramamohanarao
  • Li Yang


Support Vector Machine (SVM) regression is an important technique in data mining. SVM training is expensive, and its cost is dominated by (i) kernel value computation and (ii) a search operation that finds extreme training data points for adjusting the regression function in every training iteration. Existing training algorithms for SVM regression do not scale to large datasets because (i) each training iteration repeatedly performs expensive kernel value computations, which is inefficient and requires holding the whole training dataset in memory, and (ii) the search operation in each training iteration considers the whole search space, which is very expensive. In this article, we significantly improve the scalability and efficiency of SVM regression by exploiting the high performance of Graphics Processing Units (GPUs) and solid state drives (SSDs). Our key ideas are as follows. (i) To reduce the cost of repeated kernel value computations and avoid holding the whole training dataset in GPU memory, we precompute all the kernel values and store them in CPU memory extended by the SSD; combined with an efficient strategy for reading the precomputed kernel values, reusing them is much faster than computing them on-the-fly. This also removes the restriction that the training dataset must fit into GPU memory, making our algorithm scalable to large datasets, especially those with very high dimensionality. (ii) To enhance the performance of the frequently used search operation, we design an algorithm that minimizes the search space and the number of accesses to GPU global memory; this optimized search algorithm also avoids branch divergence (a common cause of poor performance) among GPU threads, achieving high utilization of GPU resources. Together, our proposed techniques form a scalable solution to SVM regression, which we call SIGMA.
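The precomputation idea in (i) can be illustrated with a minimal NumPy sketch. This is not SIGMA itself (which runs on GPUs with SSD-backed storage); the function name and the RBF kernel choice are illustrative assumptions. The point is that the kernel matrix is computed once, so every training iteration reduces to a cheap row lookup instead of recomputing kernel values against all training points:

```python
import numpy as np

def precompute_rbf_kernel(X, gamma=0.5):
    """Compute the full RBF kernel matrix once, so training
    iterations can reuse rows instead of recomputing them."""
    sq = np.sum(X ** 2, axis=1)
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 * x_i . x_j
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clamp tiny negative values caused by floating-point error.
    return np.exp(-gamma * np.maximum(d2, 0.0))

# One-off precomputation; iterations then fetch rows by index.
X = np.random.default_rng(0).normal(size=(100, 5))
K = precompute_rbf_kernel(X)
row_i = K[3]  # kernel values of training point 3 vs. all points
```

In the paper's setting this matrix is too large for GPU memory, which is why it is staged in CPU memory extended by the SSD and streamed to the GPU row by row.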
Our extensive experimental results show that SIGMA is highly efficient and can handle very large datasets that the state-of-the-art GPU-based algorithm cannot. On datasets that the state-of-the-art algorithm can handle, SIGMA consistently outperforms it by an order of magnitude, achieving up to 86 times speedup.


Keywords: Regression · Support vector machines · GPUs · SSDs



Rui Zhang is supported by ARC Future Fellowship project FT120100832. This work is partially supported by the National Natural Science Foundation of China (No. 61402155). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Department of Computing and Information Systems, The University of Melbourne, Melbourne, Australia
  2. Department of Computer Science, HuBei University of Education, Wuhan, People’s Republic of China
