Towards Deeper Insights into Deep Learning from Imbalanced Data

  • Jie Song
  • Yun Shen
  • Yongcheng Jing
  • Mingli SongEmail author
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 771)


Imbalanced performance usually happens to those classifiers (including deep neural networks) trained on imbalanced training data. These classifiers are more likely to make mistakes on minority class instances than on those majority class ones. Existing explanations attribute the imbalanced performance to the imbalanced training data. In this paper, using deep neural networks, we strive for deeper insights into the imbalanced performance. We find that imbalanced data is a neither sufficient nor necessary condition for imbalanced performance in deep neural networks, and another important factor for imbalanced performance is the distance between the majority class instances and the decision boundary. Based on our observations, we propose a new under-sampling method (named Moderate Negative Mining) which is easy to implement, state-of-the-art in performance and suitable for deep neural networks, to solve the imbalanced classification problem. Various experiments validate our insights and demonstrate the superiority of the proposed under-sampling method.


Imbalanced classification Imbalanced performance Deep neural networks Under-sampling 



This work was supported in part by the National Natural Science Foundation of China (61572428, U1509206), National Key Research and Development Program (2016YFB1200203), Program of International Science and Technology Cooperation (2013DFG12840), Fundamental Research Funds for the Central Universities (2017FZA5014) and National High-tech Technology R&D Program of China (2014AA015205).


  1. 1.
    Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)zbMATHGoogle Scholar
  2. 2.
    Kubat, M., Matwin, S.: Addressing the curse of imbalanced training sets: one-sided selection. In: ICML, vol. 97, pp. 179–186, July 1997Google Scholar
  3. 3.
    Laurikkala, J.: Improving identification of difficult small classes by balancing class distribution. In: Quaglini, S., Barahona, P., Andreassen, S. (eds.) AIME 2001. LNCS (LNAI), vol. 2101, pp. 63–66. Springer, Heidelberg (2001). CrossRefGoogle Scholar
  4. 4.
    He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)Google Scholar
  5. 5.
    Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)Google Scholar
  6. 6.
    Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
  7. 7.
    Pazzani, M., Merz, C., Murphy, P., Ali, K., Hume, T., Brunk, C.: Reducing misclassification costs. In: Proceedings of the Eleventh International Conference on Machine Learning, pp. 217–225 (1994)Google Scholar
  8. 8.
    Tang, Y., Zhang, Y.Q., Chawla, N.V., Krasser, S.: SVMs modeling for highly imbalanced classification. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 39(1), 281–288 (2009)CrossRefGoogle Scholar
  9. 9.
    Estabrooks, A., Jo, T., Japkowicz, N.: A multiple resampling method for learning from imbalanced data sets. Comput. Intell. 20(1), 18–36 (2004)MathSciNetCrossRefGoogle Scholar
  10. 10.
    He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)CrossRefGoogle Scholar
  11. 11.
    Tomek, I.: Two modifications of CNN. IEEE Trans. Syst. Man Cybern. 6, 769–772 (1976)MathSciNetzbMATHGoogle Scholar
  12. 12.
    Jeatrakul, P., Wong, K.W., Fung, C.C.: Classification of imbalanced data by combining the complementary neural network and SMOTE algorithm. In: Wong, K.W., Mendis, B.S.U., Bouzerdoum, A. (eds.) ICONIP 2010 Part II. LNCS, vol. 6444, pp. 152–159. Springer, Heidelberg (2010). CrossRefGoogle Scholar
  13. 13.
    Zhou, Z.H., Liu, X.Y.: Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans. Knowl. Data Eng. 18(1), 63–77 (2006)MathSciNetCrossRefGoogle Scholar
  14. 14.
    Huang, C., Li, Y., Change Loy, C., Tang, X.: Learning deep representation for imbalanced classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5375–5384 (2016)Google Scholar
  15. 15.
    Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images (2009)Google Scholar
  16. 16.
    Lewis, D.D., Catlett, J.: Heterogeneous uncertainty sampling for supervised learning. In: Proceedings of the Eleventh International Conference on Machine Learning, pp. 148–156 (1994)Google Scholar
  17. 17.
    Saito, T., Rehmsmeier, M.: The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 10(3), e0118432 (2015)CrossRefGoogle Scholar
  18. 18.
    Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)Google Scholar

Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  • Jie Song
    • 1
    • 2
  • Yun Shen
    • 1
    • 2
  • Yongcheng Jing
    • 1
    • 2
  • Mingli Song
    • 1
    • 2
    Email author
  1. 1.College of Computer ScienceZhejiang UniversityHangzhouChina
  2. 2.Alibaba-Zhejiang University Joint Institute of Frontier TechnologiesHangzhouChina

Personalised recommendations