Abstract
Regularized classifiers are kernel-based classification methods generated by Tikhonov regularization schemes, and trigonometric polynomial kernels are among the most important kernels, playing a key role in signal processing. The main goal of this paper is to provide convergence rates for classification algorithms generated by regularization schemes with trigonometric polynomial kernels. As a special case, an error analysis for the support vector machine (SVM) soft margin classifier is presented. The norm of the Fejér operator in the reproducing kernel Hilbert space, together with the approximation properties of this operator in the L^1 space of periodic functions, plays a key role in the analysis of the regularization error. New bounds on the learning rate of regularization algorithms, based on covering-number estimates for normalized loss functions, are established. Combined with the analysis of the sample error, explicit learning rates for the SVM are derived.
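To illustrate the kind of algorithm the paper analyzes, the following is a minimal sketch of a Tikhonov-regularized classifier with a trigonometric polynomial kernel. The specific kernel used here, the Dirichlet-type kernel K(x, y) = 1 + 2 · Σ_{k=1}^{n} cos(k(x − y)), the degree n = 3, the regularization parameter lam, and the toy data are all illustrative assumptions, not the paper's exact construction (the paper's regularization scheme uses a general loss; this sketch uses regularized least squares for simplicity).

```python
import numpy as np

def trig_kernel(x, y, n=3):
    # Dirichlet-type trigonometric polynomial kernel of degree n:
    #   K(x, y) = 1 + 2 * sum_{k=1}^{n} cos(k * (x - y))
    # It is positive semi-definite because its Fourier coefficients
    # are nonnegative (Bochner's theorem on the circle).
    t = x - y
    return 1.0 + 2.0 * sum(np.cos(k * t) for k in range(1, n + 1))

def fit(X, y, lam=0.1, n=3):
    # Tikhonov regularization in the RKHS reduces to a linear system:
    #   (K + lam * m * I) c = y,  where K is the kernel Gram matrix.
    m = len(X)
    K = trig_kernel(X[:, None], X[None, :], n)
    return np.linalg.solve(K + lam * m * np.eye(m), y)

def predict(X_train, c, X_new, n=3):
    # The classifier is the sign of f(x) = sum_i c_i K(x, x_i).
    K = trig_kernel(X_new[:, None], X_train[None, :], n)
    return np.sign(K @ c)

# Toy problem on the circle: labels given by the sign of cos(x).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0 * np.pi, 200)
y = np.sign(np.cos(X))
c = fit(X, y)
acc = np.mean(predict(X, c, X) == y)
```

Because cos(x) lies in the span of the degree-3 trigonometric kernel's feature space, the regularized solution separates this toy problem with high training accuracy.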
Cao, F., Wu, D. & Lee, J. Learning Rates for Regularized Classifiers Using Trigonometric Polynomial Kernels. Neural Process Lett 35, 265–281 (2012). https://doi.org/10.1007/s11063-012-9217-1