
One-Class Support Vector Machines with a Conformal Kernel. A Case Study in Handling Class Imbalance

  • Gilles Cohen
  • Mélanie Hilario
  • Christian Pellegrini
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3138)

Abstract

Class imbalance is a widespread problem in many classification tasks such as medical diagnosis and text categorization. To overcome this problem, we investigate one-class SVMs, which can be trained to differentiate two classes on the basis of examples from a single class. We propose an improvement of one-class SVMs via a conformal kernel transformation, as described in the context of binary SVM classifiers by [2,3]. We tested this improved one-class SVM on a health care problem that involves discriminating the 11% of patients with a nosocomial infection from the 89% of patients without one. The results obtained are encouraging: compared with three other SVM-based approaches to coping with class imbalance, one-class SVMs achieved the highest sensitivity recorded so far on the nosocomial infection dataset. However, the price to pay is a concomitant decrease in specificity, and it is for domain experts to decide the proportion of false positive cases they are willing to accept in order to ensure treatment of all infected patients.
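
As a rough illustration of the conformal kernel transformation mentioned above, the sketch below rescales a base kernel K(x, y) by a positive factor, giving c(x) c(y) K(x, y). This is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian base kernel, the sum-of-Gaussian-bumps form of c, and the names `gamma`, `tau`, and `anchors` are all chosen here for illustration only.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) base kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def conformal_factor(x, anchors, tau=1.0):
    """Positive scaling c(x): larger near the anchor points, smaller far away.

    Assumption: c is a sum of Gaussian bumps centred on the anchor points
    (for example, support vectors found in a first training pass).
    """
    return sum(
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * tau ** 2))
        for p in anchors
    )


def conformal_kernel(x, y, anchors, gamma=0.5, tau=1.0):
    """Conformally transformed kernel: c(x) * c(y) * K(x, y)."""
    return (
        conformal_factor(x, anchors, tau)
        * conformal_factor(y, anchors, tau)
        * rbf_kernel(x, y, gamma)
    )


# Hypothetical anchor points and inputs, used only to exercise the functions.
anchors = [(0.0, 0.0), (1.0, 1.0)]
x, y = (0.2, 0.1), (0.9, 1.2)
k_xy = conformal_kernel(x, y, anchors)
k_yx = conformal_kernel(y, x, anchors)
k_xx = conformal_kernel(x, x, anchors)
```

Multiplying a valid kernel by c(x) c(y) leaves it symmetric and positive semidefinite, so the transformed kernel can stand in for the base kernel in an SVM solver that accepts a user-supplied kernel function.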

Keywords

Support Vector Machine; Support Vector; Nosocomial Infection; Minority Class; Class Imbalance

References

  1. Ali, K., Manganaris, S., Srikant, R.: Partial classification using association rules. In: Proc. 3rd International Conference on Knowledge Discovery in Databases and Data Mining (1997)
  2. Amari, S., Wu, S.: Improving support vector machine classifiers by modifying kernel functions. Neural Networks 12(6), 783–789 (1999)
  3. Amari, S., Wu, S.: An information-geometrical method for improving the performance of support vector machine classifiers. In: ICANN 1999, pp. 85–90 (1999)
  4. Bishop, C.: Novelty detection and neural network validation. IEEE Proceedings on Vision, Image and Signal Processing 141(4), 217–222 (1994)
  5. Boothby, W.M.: An introduction to differential manifolds and Riemannian geometry. Academic Press, Orlando (1986)
  6. Burges, C.: Geometry and invariance in kernel based methods. In: MIT Press, editor, Adv. in kernel methods: Support vector learning (1999)
  7. Cohen, G., Hilario, M., Sax, H., Hugonnet, S.: Asymmetrical margin approach to surveillance of nosocomial infections using support vector classification. In: Intelligent Data Analysis in Medicine and Pharmacology (2003)
  8. Cortes, C., Vapnik, V.: Support vector networks. Machine Learning 20(3), 273–297 (1995)
  9. Cristianini, N., Taylor, J.S.: An Introduction to Support Vector Machines. Cambridge University Press, Cambridge (2000)
  10. Domingos, P.: A general method for making classifiers cost-sensitive. In: Proc. 5th International Conference on Knowledge Discovery and Data Mining, pp. 155–164 (1999)
  11. Fletcher, R.: Practical Methods of Optimization. John Wiley and Sons, Chichester (1987)
  12. French, G.G., Cheng, A.F., Wong, S.L., Donnan, S.: Repeated prevalence surveys for monitoring effectiveness of hospital infection control. Lancet 2, 1021–1023 (1983)
  13. Harbarth, S., Ruef, C., Francioli, P., Widmer, A., Pittet, D., Network, S.-N.: Nosocomial infections in Swiss university hospitals: a multi-centre survey and review of the published experience. Schweiz Med Wochenschr 129, 1521–1528 (1999)
  14. Japkowicz, N.: The class imbalance problem: A systematic study. Intelligent Data Analysis Journal 6(5) (2002)
  15. Kubat, M., Matwin, S.: Addressing the curse of imbalanced data sets: One-sided sampling. In: Proc. of the Fourteenth International Conference on Machine Learning, pp. 179–186 (1997)
  16. Schölkopf, B., Williamson, R.C., Smola, A.J., Shawe-Taylor, J., Platt, J.: Estimating the support of a high-dimensional distribution. In: Neural Computation, vol. 13, pp. 1443–1471. MIT Press, Cambridge (1999)
  17. Tarassenko, L., Hayton, P., Cerneaz, N., Brady, M.: Novelty detection for the identification of masses in mammograms. In: Proceedings of the 4th IEE International Conference on Artificial Neural Networks (ICANN 1995), pp. 442–447 (1995)
  18. Vapnik, V.: Statistical Learning Theory. Wiley, Chichester (1998)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Gilles Cohen (1, 2)
  • Mélanie Hilario (2)
  • Christian Pellegrini (2)
  1. Medical Informatics Service, University Hospital of Geneva, Geneva, Switzerland
  2. Artificial Intelligence Laboratory, University of Geneva, Geneva, Switzerland
