Abstract
This paper focuses on developing fast numerical algorithms for selection of a kernel optimal for a given training data set. The optimal kernel is obtained by minimizing a cost functional over a prescribed set of kernels. The cost functional is defined in terms of a positive semi-definite matrix determined completely by a given kernel and the given sampled input data. Fast computational algorithms are developed by approximating the positive semi-definite matrix by a related circulant matrix so that the fast Fourier transform can apply to achieve a linear or quasi-linear computational complexity for finding the optimal kernel. We establish convergence of the approximation method. Numerical examples are presented to demonstrate the approximation accuracy and computational efficiency of the proposed methods.
Similar content being viewed by others
References
Argyriou A, Micchelli C A, Pontil M, Ying Y. A spectral regularization framework for multi-task structure learning. NIPS, 2008, 20: 25–32
Aronszajn N. Theory of reproducing kernels. Trans Amer Math Soc, 1950, 68: 337–404
Bhatia R. Matrix Analysis. New York: Springer, 1997
Bochner S. Lectures on Fourier Integrals. Princeton: Princeton University Press, 1959
de Boor C. A Practical Guide to Splines. New York: Springer-Verlag, 1978
Bousquet O, Herrmann D J L. On the complexity of learning the kernel matrix. NIPS, 2003, 15: 399–406
Chapelle O, Vapnik V, Bousquet O, Mukherjee S. Choosing multiple parameters for support vector machines. Machine Learning, 2002, 46: 131–159
Cormen T H, Leiserson C E, Rivest R L, Stein C. Introduction to Algorithms. Boston: MIT Press, 2001
Cristianini N, Shawe-Taylor J, Elisseeff A, Kandola J. On kernel-target alignment. NIPS, 2002, 14: 367–373
Cucker F, Smale S. On the mathematical foundations of learning. Bull Amer Math Soc, 2002, 39: 1–49
Davis P J. Circulant Matrices. New York: John Wiley & Sons Inc, 1979
Demko S, Moss W, Smith P. Decay rates for inverses of band matrices. Math Comput, 1984, 43: 491–499
Drucker H, Wu D, Vapnik V N. Support vector machines for span categorization. IEEE Trans Neural Networks, 1999, 10: 1048–1054
Durrett R. Probability: Theory and Examples. Belmont: Duxbury Press, 1996
Gasquet C, Witomski P. Fourier Analysis and Applications. New York: Springer-Verlag, 1999
Gelfand I, Raikov D, Shilov G. Commutative Normed Rings. Bronx: Chelsea Publishing Company, New York, 1964
Gray R M. Toeplitz and Circulant Matrices: A Review. Boston: Now Publishers Inc, 2006
Grenander U, Szegö G. Toeplitz Forms and Their Applications. Berkeley and Los Angeles: University of Calif Press, 1958
Horn R A, Johnson C R. Matrix Analysis. Cambridge: Cambridge University Press, 1986
Kimeldorf G, Wahba G. Some results on Tchebycheffian spline functions. J Math Anal Appl, 1971, 33: 82–95
Lanckriet G R G, Cristianini N, Bartlett P, El Ghaoui L, Jordan M I. Learning the kernel matrix with semi-definite programming. J Mach Learn Res, 2004, 5: 27–72
Lin Y, Brown L D. Statistical properties of the method of regularization with periodic Gaussian reproducing kernel. Ann Statis, 2004, 32: 1723–1743
Micchelli C A, Pontil M. Learning the kernel function via regularization. J Mach Learn Res, 2005, 6: 1099–1125
Micchelli C A, Pontil M. On learning vector-valued functions. Neural Comput, 2005, 17: 177–204
Müller K -R, Smola A J, Rätsch G, Schölkopf B, Kohlmorgen J, Vapnik V N. Predicting time series with support vector machines. Lecture Notes in Computer Science, 1997, 1327: 999–1004
Schölkopf B, Smola A J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge: MIT Press, 2004
Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T. Robust object recognition with cortex-like mechanisms. IEEE Trans Pattern Analysis and Machine Intelligence, 2007, 29: 411–426
Shawe-Taylor J, Cristianini N. Kernel Methods for Pattern Analysis. Cambridge: Cambridge University Press, 2004
Smola A J, Schölkopf B. On a kernel-based method for pattern recognition, regression, approximation, and operator inversion. Algorithmica, 1998, 22: 211–231
Sung K K, Poggio T. Example-based learning for view-based human face detection. IEEE Trans Pattern Analysis and Machine Intelligence, 1998, 20: 39–51
Vapnik V N. Statistical Learning Theory. New York: Wiley, 1998
Xu Y, Zhang H. Refinable kernels. J Mach Learn Res, 2007, 8: 2083–2120
Xu Y, Zhang H. Refinement of reproducing kernels. J Mach Learn Res, 2009, 10: 137–170
Ying Y, Zhou D X. Learnability of Gaussians with flexible variances. J Mach Learn Res, 2007, 8: 249–276
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Song, G., Xu, Y. Approximation of kernel matrices by circulant matrices and its application in kernel selection methods. Front. Math. China 5, 123–160 (2010). https://doi.org/10.1007/s11464-009-0054-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11464-009-0054-0
Keywords
- Optimal kernel
- reproducing kernel
- reproducing kernel Hilbert space
- learning with kernel
- circulant matrix
- B-spline kernel