
EUS SVMs: Ensemble of Under-Sampled SVMs for Data Imbalance Problems

  • Pilsung Kang
  • Sungzoon Cho
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4232)

Abstract

Data imbalance occurs when the number of patterns from one class is much larger than that from the other class, and it often degrades classification performance. In this paper, we propose an Ensemble of Under-Sampled SVMs (EUS SVMs). We applied the proposed method to two synthetic and six real data sets and found that it outperformed other methods, especially when the number of patterns belonging to the minority class is very small.
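As an illustration of the idea described above, the sketch below shows one plausible form of an ensemble of under-sampled SVMs: each member SVM is trained on all minority-class patterns together with an equally sized random under-sample of the majority class, and the members' predictions are combined by majority voting. This is a minimal sketch, not the authors' exact procedure; it assumes scikit-learn's SVC with an RBF kernel, non-negative integer class labels, and the hypothetical helper names train_eus_svms and predict_majority_vote.

```python
# Minimal sketch (assumptions noted above), not the paper's reference implementation.
import numpy as np
from sklearn.svm import SVC

def train_eus_svms(X, y, n_members=10, minority_label=1, random_state=0):
    rng = np.random.default_rng(random_state)
    X_min = X[y == minority_label]
    X_maj = X[y != minority_label]
    maj_label = y[y != minority_label][0]
    members = []
    for _ in range(n_members):
        # Under-sample the majority class down to the minority-class size.
        idx = rng.choice(len(X_maj), size=len(X_min), replace=False)
        X_bal = np.vstack([X_min, X_maj[idx]])
        y_bal = np.concatenate([np.full(len(X_min), minority_label),
                                np.full(len(X_min), maj_label)])
        members.append(SVC(kernel="rbf").fit(X_bal, y_bal))
    return members

def predict_majority_vote(members, X):
    # Combine individual SVM predictions by simple majority voting
    # (assumes non-negative integer labels).
    votes = np.stack([m.predict(X) for m in members]).astype(int)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```

Because every member sees all minority patterns but a different balanced subsample of the majority class, the ensemble uses the full majority-class data without letting any single classifier be dominated by it.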

Keywords

Majority Class; Minority Class; Class Boundary; Class Pattern; Weight Vote



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Pilsung Kang (1)
  • Sungzoon Cho (1)
  1. Seoul National University, Seoul, Korea
