GOS-IL: A Generalized Over-Sampling Based Online Imbalanced Learning Framework

  • Sukarna Barua
  • Md. Monirul Islam
  • Kazuyuki Murase
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9489)

Abstract

Online imbalanced learning has two defining characteristics: samples of one class (the minority class) are under-represented in the data set, and samples arrive at the learner incrementally. Such a data set poses several problems for the learner. First, the minority class cannot be determined beforehand, because the learner never has a complete view of the data. Second, the imbalance status may change over time. To handle such data efficiently, we present a dynamic and adaptive algorithm, the Generalized Over-Sampling based Online Imbalanced Learning (GOS-IL) framework. The proposed algorithm updates a base learner incrementally. An update is triggered when the number of errors made by the learner crosses a threshold value. This deferred update shields the learner from the immediate harm of noisy samples and helps it generalize better in the long run. In addition, to avoid over-fitting, the algorithm does not use correctly classified samples to update the learner. Simulation results on several artificial and real-world data sets show the effectiveness of the proposed method on two performance metrics: recall and g-mean.
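The deferred, error-triggered update and the g-mean metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the over-sampling step, so it is omitted here. The names `Perceptron`, `DeferredUpdateLearner`, `error_threshold`, and `g_mean` are hypothetical, the base learner is a stand-in binary perceptron, and "crosses a threshold" is assumed to mean "reaches or exceeds".

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]


class Perceptron:
    """Hypothetical stand-in base learner: binary perceptron, labels in {0, 1}."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x: Sequence[float]) -> int:
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0

    def partial_fit(self, batch: List[Sample]) -> None:
        # Standard perceptron rule, applied sample by sample.
        for x, y in batch:
            err = y - self.predict(x)
            if err != 0:
                self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
                self.b += self.lr * err


class DeferredUpdateLearner:
    """Buffers misclassified samples; updates the base learner only when the
    number of buffered errors reaches `error_threshold` (an assumed reading
    of "crosses a threshold"). Correctly classified samples are discarded."""

    def __init__(self, base: Perceptron, error_threshold: int = 5) -> None:
        self.base = base
        self.error_threshold = error_threshold
        self.error_buffer: List[Sample] = []
        self.n_updates = 0

    def process(self, x: Sequence[float], y: int) -> int:
        """Predict first, then (maybe) learn. Returns the prediction."""
        y_hat = self.base.predict(x)
        if y_hat != y:
            self.error_buffer.append((x, y))
        if len(self.error_buffer) >= self.error_threshold:
            # Deferred update: train on the accumulated errors only.
            self.base.partial_fit(self.error_buffer)
            self.error_buffer = []
            self.n_updates += 1
        return y_hat


def g_mean(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Geometric mean of per-class recalls over the classes in y_true
    (y_true must be non-empty)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return math.prod(recalls) ** (1.0 / len(recalls))
```

In a stream setting, each arriving sample would be passed to `process`, and the returned predictions collected to compute recall and g-mean over the stream. A larger `error_threshold` delays updates further, which is the lever the abstract associates with robustness to noisy samples.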

Keywords

Imbalanced learning; Online learning; Oversampling

Notes

Acknowledgments

This research was carried out in the Department of Computer Science & Engineering of Bangladesh University of Engineering and Technology (BUET). The authors acknowledge BUET for its generous support.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Sukarna Barua (1, email author)
  • Md. Monirul Islam (1)
  • Kazuyuki Murase (2)

  1. Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh
  2. University of Fukui, Fukui, Japan
