Support Vector Machines and the Bayes Rule in Classification
 Yi Lin
Abstract
The Bayes rule is the optimal classification rule when the underlying distribution of the data is known. In practice the underlying distribution is unknown, and classification rules must be "learned" from the data. One way to derive classification rules in practice is to implement the Bayes rule approximately by estimating an appropriate classification function. Traditional statistical methods use the estimated log odds ratio as the classification function. Support vector machines (SVMs) are a type of large margin classifier, and the relationship between SVMs and the Bayes rule had not been clear. In this paper, it is shown that the asymptotic target of SVMs is a class of interesting classification functions that are directly related to the Bayes rule. The rate of convergence of the SVM solutions to their corresponding target functions is established explicitly for SVMs with quadratic or higher order loss functions and spline kernels. Simulations illustrate the relation between SVMs and the Bayes rule in other cases. These results help explain the success of SVMs in many classification studies and make it easier to compare SVMs with traditional statistical methods.
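The link between SVMs and the Bayes rule can be illustrated numerically. At a point x with P(Y = +1 | x) = p, the population minimizer of the standard hinge loss E[(1 - Yf)+] is sign(2p - 1), which classifies exactly as the Bayes rule. The sketch below (an illustration, not code from the paper) finds the minimizing scalar f by grid search:

```python
import numpy as np

def expected_hinge(f, p):
    # Population hinge risk at a point with P(Y = +1 | x) = p:
    # E[(1 - Y f)_+] = p*(1 - f)_+ + (1 - p)*(1 + f)_+
    return p * np.maximum(0.0, 1.0 - f) + (1 - p) * np.maximum(0.0, 1.0 + f)

# Grid search over candidate values of the classification function f(x).
grid = np.linspace(-3.0, 3.0, 6001)
for p in [0.1, 0.3, 0.7, 0.9]:
    f_star = grid[np.argmin(expected_hinge(grid, p))]
    bayes = np.sign(2 * p - 1)
    print(f"p = {p:.1f}  argmin f = {f_star:+.2f}  sign(2p - 1) = {bayes:+.0f}")
```

For every p ≠ 1/2 the minimizer is ±1 with the same sign as p - 1/2, so the sign of the SVM target function reproduces the Bayes classification at each point.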
 Title
 Support Vector Machines and the Bayes Rule in Classification
 Journal

Data Mining and Knowledge Discovery
Volume 6, Issue 3, pp. 259–275
 Cover Date
 2002-07-01
 DOI
 10.1023/A:1015469627679
 Print ISSN
 1384-5810
 Online ISSN
 1573-756X
 Publisher
 Kluwer Academic Publishers
 Keywords

 support vector machine
 classification
 the Bayes rule
 reproducing kernel
 reproducing kernel Hilbert space
 regularization methods
 Authors

 Yi Lin (1)
 Author Affiliations

 1. Department of Statistics, University of Wisconsin, Madison, 1210 West Dayton Street, Madison, WI, 53706-1685, USA