
Parallel Column Subset Selection of Kernel Matrix for Scaling up Support Vector Machines

  • Conference paper
  • First Online:
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9530)

Abstract

The Nyström method and low-rank linearized Support Vector Machines (SVMs) are two widely used approaches to scaling up kernel SVMs; both need to sample a subset of the columns of the kernel matrix to reduce its size. However, existing non-uniform sampling methods have at least quadratic time complexity in the number of training examples, which limits the scalability of kernel SVMs. In this paper, we propose a parallel sampling method called parallel column subset selection (PCSS) based on the divide-and-conquer strategy: the kernel matrix is divided into several small submatrices, and columns are then selected from them in parallel. We prove that PCSS achieves a \((1+\epsilon)\) relative-error upper bound with respect to the kernel matrix. Further, we present two approaches to scaling up kernel SVMs by combining PCSS with the Nyström method and with low-rank linearized SVMs. The results of comparative experiments demonstrate the effectiveness, efficiency and scalability of our approaches.
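A minimal sketch of the divide-and-conquer idea, together with the standard Nyström reconstruction \(\mathbf{K} \approx \mathbf{C}\mathbf{W}^{+}\mathbf{C}^{\top}\), is shown below. The within-block sampling rule (squared column norms) and the function names pcss_select and nystrom_approx are our assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def pcss_select(K, l, t, rng=None):
    """Divide-and-conquer column selection (a hypothetical sketch of PCSS).

    Splits the n columns of K into t blocks and samples l/t columns from
    each block with probability proportional to squared column norms.
    The per-block selections are independent, so they can run in parallel.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = K.shape[1]
    blocks = np.array_split(np.arange(n), t)   # floor(n/t) or ceil(n/t) columns per block
    per_block = l // t                         # assumes t divides l (cf. footnote 2)
    selected = []
    for idx in blocks:                         # embarrassingly parallel across blocks
        p = np.sum(K[:, idx] ** 2, axis=0)
        p = p / p.sum()
        selected.extend(rng.choice(idx, size=per_block, replace=False, p=p))
    return np.sort(np.asarray(selected))

def nystrom_approx(K, cols):
    """Rank-(at most l) Nystrom approximation K ~= C W^+ C^T from selected columns."""
    C = K[:, cols]                             # n x l block of sampled columns
    W = K[np.ix_(cols, cols)]                  # l x l intersection block
    return C @ np.linalg.pinv(W) @ C.T
```

Because the per-block selections touch disjoint column groups, the loop body can be dispatched to t workers; for example, nystrom_approx(K, pcss_select(K, l=64, t=8)) yields a low-rank surrogate of the kernel matrix.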


Notes

  1. We assume \(n \bmod t = 0\) for simplicity, so that each submatrix contains \(n/t\) columns. In the general case we can still partition \(\mathbf{K}\) into t submatrices, each containing \(\lfloor n/t \rfloor\) or \(\lceil n/t \rceil\) columns.

  2. Here we also assume \(l \bmod t = 0\) for simplicity. In the general case we can select \(\lfloor l/t \rfloor\) or \(\lceil l/t \rceil\) columns from each submatrix to ensure that the total number of selected columns is \(l\); see the sketch after these notes.

  3. http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.
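As noted in footnote 2, when t does not divide l the per-block budgets must be rounded so that they still sum to l. A tiny helper (hypothetical, for illustration only) that produces such a split:

```python
def per_block_counts(l, t):
    # The first (l mod t) blocks take ceil(l/t) columns, the rest take
    # floor(l/t), so the counts always sum to exactly l (cf. footnote 2).
    q, r = divmod(l, t)
    return [q + 1] * r + [q] * (t - r)

# Example: per_block_counts(10, 4) -> [3, 3, 2, 2], which sums to 10.
```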


Acknowledgements

This work was supported in part by the Natural Science Foundation of China under Grant No. 61170019.

Author information


Corresponding author

Correspondence to Shizhong Liao.



Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wu, J., Feng, C., Gao, P., Liao, S. (2015). Parallel Column Subset Selection of Kernel Matrix for Scaling up Support Vector Machines. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9530. Springer, Cham. https://doi.org/10.1007/978-3-319-27137-8_18

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-27137-8_18

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27136-1

  • Online ISBN: 978-3-319-27137-8

  • eBook Packages: Computer Science, Computer Science (R0)
