Abstract
Kernel selection is critical to kernel methods. Approximate kernel selection is an emerging approach that alleviates the computational burden of kernel selection by introducing kernel matrix approximation. The theoretical problems faced by approximate kernel selection are how kernel matrix approximation impacts kernel selection and whether this impact can be ignored when the number of examples is large enough. In this paper, we introduce the notion of approximate consistency for kernel matrix approximation algorithms to tackle these problems and establish preliminary foundations for approximate kernel selection. By analyzing the approximate consistency of kernel matrix approximation algorithms, we can answer the question of under what conditions, and how, the approximate kernel selection criterion converges to the accurate one. Taking two kernel selection criteria as examples, we analyze the approximate consistency of the Nyström approximation and the multilevel circulant matrix approximation. Finally, we empirically verify our theoretical findings.
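To make the setting concrete, the following is a minimal NumPy sketch (not the authors' code) of the idea the abstract describes: build an RBF kernel matrix, form its Nyström approximation from m sampled landmark columns, and compare an accurate kernel selection criterion with its approximate counterpart as m grows. The particular criterion used here (the regularized least-squares training objective) and all function and variable names are illustrative assumptions, not the criteria analyzed in the paper.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2).
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom(X, m, gamma, rng):
    # Standard Nystrom approximation: sample m landmarks uniformly, then
    # K ~ C W^+ C^T with C = K(X, landmarks) and W = K(landmarks, landmarks).
    idx = rng.choice(len(X), size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)
    W = rbf_kernel(X[idx], X[idx], gamma)
    return C @ np.linalg.pinv(W) @ C.T

def criterion(K, y, lam=1e-2):
    # An illustrative selection criterion (assumption, not the paper's):
    # the kernel ridge regression training objective for kernel matrix K.
    n = len(y)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    resid = y - K @ alpha
    return resid @ resid / n + lam * alpha @ K @ alpha

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)

K = rbf_kernel(X, X, gamma=0.5)        # accurate kernel matrix
exact = criterion(K, y)                # accurate criterion value
for m in (20, 50, 100, 200):
    approx = criterion(nystrom(X, m, 0.5, rng), y)
    print(f"m={m:3d}  |approx - exact| = {abs(approx - exact):.3e}")
```

In this sketch, approximate consistency would manifest as the gap between the approximate and accurate criterion values shrinking as the approximation quality improves; the paper's analysis characterizes when and how such convergence holds.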
Keywords
- Least Squares Support Vector Machine
- Kernel Matrix
- Reproducing Kernel Hilbert Space
- Circulant Matrix
- Machine Learning Research
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ding, L., Liao, S. (2014). Approximate Consistency: Towards Foundations of Approximate Kernel Selection. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science, vol 8724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44848-9_23
DOI: https://doi.org/10.1007/978-3-662-44848-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44847-2
Online ISBN: 978-3-662-44848-9