Parallel Column Subset Selection of Kernel Matrix for Scaling up Support Vector Machines

  • Jiangang Wu
  • Chang Feng
  • Peihuan Gao
  • Shizhong Liao (corresponding author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9530)


The Nyström method and low-rank linearized Support Vector Machines (SVMs) are two widely used methods for scaling up kernel SVMs. Both sample a subset of the columns of the kernel matrix to reduce its size. However, existing non-uniform sampling methods have at least quadratic time complexity in the number of training examples, which limits the scalability of kernel SVMs. In this paper, we propose a parallel sampling method called parallel column subset selection (PCSS), based on the divide-and-conquer strategy, which divides the kernel matrix into several small submatrices and then selects columns from them in parallel. We prove that PCSS has a \((1+\epsilon)\) relative-error upper bound with respect to the kernel matrix. Further, we present two approaches to scaling up kernel SVMs by combining PCSS with the Nyström method and with low-rank linearized SVMs. Comparison experiments demonstrate the effectiveness, efficiency and scalability of our approaches.
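The divide-and-conquer idea can be illustrated with a minimal Python sketch. It is not the PCSS algorithm of the paper, whose selection rule and error analysis are not given in this abstract. The sketch assumes an RBF kernel, splits the column indices into contiguous blocks, and uses squared-column-norm sampling inside each block as a stand-in selection rule. A thread pool stands in for whatever parallel runtime is actually used. All function names and parameters (rbf_kernel, select_in_block, parallel_column_selection, nystrom, n_blocks, per_block, gamma) are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def rbf_kernel(X, Y, gamma=0.5):
    """RBF kernel matrix between rows of X and rows of Y (assumed kernel)."""
    d2 = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def select_in_block(X, block, per_block, gamma, seed):
    """Pick columns inside one block of the kernel matrix.

    Assumption: columns are sampled without replacement with probability
    proportional to their squared norm; the paper's rule may differ.
    """
    rng = np.random.default_rng(seed)
    K_block = rbf_kernel(X, X[block], gamma)      # n x |block| submatrix
    p = (K_block ** 2).sum(axis=0)
    p = p / p.sum()
    size = min(per_block, len(block))
    chosen = rng.choice(len(block), size=size, replace=False, p=p)
    return block[chosen]


def parallel_column_selection(X, n_blocks=4, per_block=5, gamma=0.5, seed=0):
    """Divide the column indices into blocks and select from each in parallel."""
    blocks = np.array_split(np.arange(X.shape[0]), n_blocks)

    def work(item):
        i, block = item
        return select_in_block(X, block, per_block, gamma, seed + i)

    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(work, enumerate(blocks)))
    return np.sort(np.concatenate(parts))


def nystrom(X, idx, gamma=0.5):
    """Nystrom approximation K ~ C W^+ C^T built from the selected columns."""
    C = rbf_kernel(X, X[idx], gamma)              # n x m sampled columns
    W = C[idx]                                    # m x m intersection block
    return C @ np.linalg.pinv(W) @ C.T


if __name__ == "__main__":
    X = np.random.default_rng(0).normal(size=(200, 5))
    idx = parallel_column_selection(X, n_blocks=4, per_block=10)
    K = rbf_kernel(X, X)
    rel_err = np.linalg.norm(K - nystrom(X, idx)) / np.linalg.norm(K)
    print("selected columns:", len(idx), "relative error:", rel_err)
```

Each block only forms an n x |block| slice of the kernel matrix, so no worker needs the full matrix in memory; this is the property that makes the block-wise selection easy to distribute.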


Support Vector Machines (SVMs); Nyström method; Low-rank linearized SVMs; Column subset selection; Parallel sampling



This work was supported in part by Natural Science Foundation of China under Grant No. 61170019.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Jiangang Wu (1)
  • Chang Feng (1)
  • Peihuan Gao (1)
  • Shizhong Liao (1, corresponding author)
  1. School of Computer Science and Technology, Tianjin University, Tianjin, China
