Support Vector Machines and the Bayes Rule in Classification
- Cite this article as:
- Lin, Y. Data Mining and Knowledge Discovery (2002) 6: 259. doi:10.1023/A:1015469627679
- 541 Downloads
The Bayes rule is the optimal classification rule if the underlying distribution of the data is known. In practice we do not know the underlying distribution, and need to “learn” classification rules from the data. One way to derive classification rules in practice is to implement the Bayes rule approximately by estimating an appropriate classification function. Traditional statistical methods use estimated log odds ratio as the classification function. Support vector machines (SVMs) are one type of large margin classifier, and the relationship between SVMs and the Bayes rule was not clear. In this paper, it is shown that the asymptotic target of SVMs are some interesting classification functions that are directly related to the Bayes rule. The rate of convergence of the solutions of SVMs to their corresponding target functions is explicitly established in the case of SVMs with quadratic or higher order loss functions and spline kernels. Simulations are given to illustrate the relation between SVMs and the Bayes rule in other cases. This helps understand the success of SVMs in many classification studies, and makes it easier to compare SVMs and traditional statistical methods.