
Machine Learning, Volume 46, Issue 1–3, pp. 191–202

Support Vector Machines for Classification in Nonstandard Situations

  • Yi Lin
  • Yoonkyung Lee
  • Grace Wahba

Abstract

The majority of classification algorithms are developed for the standard situation, in which it is assumed that the examples in the training set come from the same distribution as that of the target population, and that the costs of misclassification into different classes are the same. However, these assumptions are often violated in real-world settings. For some classification methods this can be handled simply by changing the classification threshold; for others, additional effort is required. In this paper, we explain why the standard support vector machine is not suitable for the nonstandard situation, and introduce a simple procedure for adapting the support vector machine methodology to it. Theoretical justification for the procedure is provided. A simulation study illustrates that the modified support vector machine significantly improves upon the standard support vector machine in the nonstandard situation. The computational load of the proposed procedure is the same as that of the standard support vector machine, and the procedure reduces to the standard support vector machine in the standard situation.
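To make the modification concrete, here is a minimal sketch contrasting a standard SVM with a class-weighted one on imbalanced synthetic data. It uses scikit-learn's per-class weighting of the hinge-loss penalty with hypothetical misclassification costs c_plus and c_minus; it illustrates the spirit of the proposed procedure rather than reproducing the paper's exact weighting scheme.

```python
# A minimal sketch of cost-sensitive SVM training, assuming scikit-learn.
# The costs c_plus and c_minus are hypothetical, for illustration only.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Imbalanced synthetic training set: the positive class is rare.
n_pos, n_neg = 30, 300
X = np.vstack([
    rng.normal(loc=1.0, scale=1.0, size=(n_pos, 2)),   # positive examples
    rng.normal(loc=-1.0, scale=1.0, size=(n_neg, 2)),  # negative examples
])
y = np.concatenate([np.ones(n_pos, dtype=int), -np.ones(n_neg, dtype=int)])

# Suppose misclassifying a positive example costs five times as much
# as misclassifying a negative one.
c_plus, c_minus = 5.0, 1.0

# Standard SVM: every misclassification is penalized equally, so the
# solution targets sign(p(x) - 1/2) regardless of the true costs.
standard_svm = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)

# Modified SVM: weight the slack penalty of each class by its cost,
# shifting the solution toward the cost-adjusted Bayes rule.
modified_svm = SVC(kernel="rbf", gamma=0.5, C=1.0,
                   class_weight={1: c_plus, -1: c_minus}).fit(X, y)

# Near the boundary the two rules can disagree: the modified SVM
# leans toward the rare, costly positive class.
x0 = np.array([[0.0, 0.0]])
print(standard_svm.predict(x0), modified_svm.predict(x0))
```

Weighting the penalties by class leaves the form of the quadratic program, and hence the computational load, unchanged; with equal costs and matched sampling the two fits coincide, mirroring the reduction to the standard support vector machine noted above.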

Keywords: support vector machine, classification, Bayes rule, GCKL, GACV


Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Yi Lin (1)
  • Yoonkyung Lee (1)
  • Grace Wahba (1)

  1. Department of Statistics, University of Wisconsin-Madison, Madison, USA
