Approximation of kernel matrices by circulant matrices and its application in kernel selection methods
- 109 Downloads
- 2 Citations
Abstract
This paper focuses on developing fast numerical algorithms for selection of a kernel optimal for a given training data set. The optimal kernel is obtained by minimizing a cost functional over a prescribed set of kernels. The cost functional is defined in terms of a positive semi-definite matrix determined completely by a given kernel and the given sampled input data. Fast computational algorithms are developed by approximating the positive semi-definite matrix by a related circulant matrix so that the fast Fourier transform can apply to achieve a linear or quasi-linear computational complexity for finding the optimal kernel. We establish convergence of the approximation method. Numerical examples are presented to demonstrate the approximation accuracy and computational efficiency of the proposed methods.
Keywords
Optimal kernel reproducing kernel reproducing kernel Hilbert space learning with kernel circulant matrix B-spline kernelMSC
46E22 68Q32Preview
Unable to display preview. Download preview PDF.
References
- 1.Argyriou A, Micchelli C A, Pontil M, Ying Y. A spectral regularization framework for multi-task structure learning. NIPS, 2008, 20: 25–32Google Scholar
- 2.Aronszajn N. Theory of reproducing kernels. Trans Amer Math Soc, 1950, 68: 337–404MATHCrossRefMathSciNetGoogle Scholar
- 3.Bhatia R. Matrix Analysis. New York: Springer, 1997Google Scholar
- 4.Bochner S. Lectures on Fourier Integrals. Princeton: Princeton University Press, 1959MATHGoogle Scholar
- 5.de Boor C. A Practical Guide to Splines. New York: Springer-Verlag, 1978MATHGoogle Scholar
- 6.Bousquet O, Herrmann D J L. On the complexity of learning the kernel matrix. NIPS, 2003, 15: 399–406Google Scholar
- 7.Chapelle O, Vapnik V, Bousquet O, Mukherjee S. Choosing multiple parameters for support vector machines. Machine Learning, 2002, 46: 131–159MATHCrossRefGoogle Scholar
- 8.Cormen T H, Leiserson C E, Rivest R L, Stein C. Introduction to Algorithms. Boston: MIT Press, 2001MATHGoogle Scholar
- 9.Cristianini N, Shawe-Taylor J, Elisseeff A, Kandola J. On kernel-target alignment. NIPS, 2002, 14: 367–373Google Scholar
- 10.Cucker F, Smale S. On the mathematical foundations of learning. Bull Amer Math Soc, 2002, 39: 1–49MATHCrossRefMathSciNetGoogle Scholar
- 11.Davis P J. Circulant Matrices. New York: John Wiley & Sons Inc, 1979MATHGoogle Scholar
- 12.Demko S, Moss W, Smith P. Decay rates for inverses of band matrices. Math Comput, 1984, 43: 491–499MATHCrossRefMathSciNetGoogle Scholar
- 13.Drucker H, Wu D, Vapnik V N. Support vector machines for span categorization. IEEE Trans Neural Networks, 1999, 10: 1048–1054CrossRefGoogle Scholar
- 14.Durrett R. Probability: Theory and Examples. Belmont: Duxbury Press, 1996Google Scholar
- 15.Gasquet C, Witomski P. Fourier Analysis and Applications. New York: Springer-Verlag, 1999MATHGoogle Scholar
- 16.Gelfand I, Raikov D, Shilov G. Commutative Normed Rings. Bronx: Chelsea Publishing Company, New York, 1964Google Scholar
- 17.Gray R M. Toeplitz and Circulant Matrices: A Review. Boston: Now Publishers Inc, 2006MATHGoogle Scholar
- 18.Grenander U, Szegö G. Toeplitz Forms and Their Applications. Berkeley and Los Angeles: University of Calif Press, 1958MATHGoogle Scholar
- 19.Horn R A, Johnson C R. Matrix Analysis. Cambridge: Cambridge University Press, 1986Google Scholar
- 20.Kimeldorf G, Wahba G. Some results on Tchebycheffian spline functions. J Math Anal Appl, 1971, 33: 82–95MATHCrossRefMathSciNetGoogle Scholar
- 21.Lanckriet G R G, Cristianini N, Bartlett P, El Ghaoui L, Jordan M I. Learning the kernel matrix with semi-definite programming. J Mach Learn Res, 2004, 5: 27–72Google Scholar
- 22.Lin Y, Brown L D. Statistical properties of the method of regularization with periodic Gaussian reproducing kernel. Ann Statis, 2004, 32: 1723–1743MATHCrossRefMathSciNetGoogle Scholar
- 23.Micchelli C A, Pontil M. Learning the kernel function via regularization. J Mach Learn Res, 2005, 6: 1099–1125MathSciNetGoogle Scholar
- 24.Micchelli C A, Pontil M. On learning vector-valued functions. Neural Comput, 2005, 17: 177–204MATHCrossRefMathSciNetGoogle Scholar
- 25.Müller K -R, Smola A J, Rätsch G, Schölkopf B, Kohlmorgen J, Vapnik V N. Predicting time series with support vector machines. Lecture Notes in Computer Science, 1997, 1327: 999–1004CrossRefGoogle Scholar
- 26.Schölkopf B, Smola A J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge: MIT Press, 2004Google Scholar
- 27.Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T. Robust object recognition with cortex-like mechanisms. IEEE Trans Pattern Analysis and Machine Intelligence, 2007, 29: 411–426CrossRefGoogle Scholar
- 28.Shawe-Taylor J, Cristianini N. Kernel Methods for Pattern Analysis. Cambridge: Cambridge University Press, 2004Google Scholar
- 29.Smola A J, Schölkopf B. On a kernel-based method for pattern recognition, regression, approximation, and operator inversion. Algorithmica, 1998, 22: 211–231MATHCrossRefMathSciNetGoogle Scholar
- 30.Sung K K, Poggio T. Example-based learning for view-based human face detection. IEEE Trans Pattern Analysis and Machine Intelligence, 1998, 20: 39–51CrossRefGoogle Scholar
- 31.Vapnik V N. Statistical Learning Theory. New York: Wiley, 1998MATHGoogle Scholar
- 32.Xu Y, Zhang H. Refinable kernels. J Mach Learn Res, 2007, 8: 2083–2120MathSciNetGoogle Scholar
- 33.Xu Y, Zhang H. Refinement of reproducing kernels. J Mach Learn Res, 2009, 10: 137–170Google Scholar
- 34.Ying Y, Zhou D X. Learnability of Gaussians with flexible variances. J Mach Learn Res, 2007, 8: 249–276MathSciNetGoogle Scholar