On the Impact of Imbalanced Data in Convolutional Neural Networks Performance

  • Francisco J. Pulgar (email author)
  • Antonio J. Rivera
  • Francisco Charte
  • María J. del Jesus
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10334)


In recent years, new proposals based on Deep Learning (DL) techniques have emerged for tackling the classification problem. These proposals have shown good results in certain fields, such as image recognition. However, the factors that influence the results obtained by these new algorithms still need to be analyzed. This paper analyzes the classification of imbalanced data with convolutional neural networks (CNNs). To this end, a series of tests is performed in which CNNs classify real images of traffic signs drawn from data with different imbalance levels.
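The abstract does not say how the different imbalance levels were produced, so the following is only an illustrative sketch of one common way to build such test sets: randomly undersampling one class until it holds roughly 1/ratio as many samples as the largest remaining class. The function name `subsample_to_imbalance`, its parameters, and the toy labels are assumptions made for this example and do not come from the paper.

```python
import random
from collections import Counter


def subsample_to_imbalance(labels, minority_class, ratio, seed=0):
    """Return sorted indices of a subset of `labels` in which
    `minority_class` keeps about 1/ratio as many samples as the
    largest other class (hypothetical helper, not the authors' code)."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_class]
    other_idx = [i for i, y in enumerate(labels) if y != minority_class]

    # Size of the largest class among the ones left untouched.
    largest_other = max(Counter(labels[i] for i in other_idx).values())

    # Keep at least one minority sample, and never more than exist.
    target = min(len(minority_idx), max(1, int(largest_other / ratio)))

    kept_minority = rng.sample(minority_idx, target)
    return sorted(other_idx + kept_minority)


# Toy stand-in for image labels: three classes with 100 samples each.
labels = [0] * 100 + [1] * 100 + [2] * 100
for ratio in (1, 5, 20):
    idx = subsample_to_imbalance(labels, minority_class=2, ratio=ratio)
    counts = Counter(labels[i] for i in idx)
    print(ratio, [counts[c] for c in (0, 1, 2)])
# 1 [100, 100, 100]
# 5 [100, 100, 20]
# 20 [100, 100, 5]
```

The returned indices would then select the matching images for training, so the same CNN can be trained once per imbalance level and the results compared.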


Deep learning, Convolutional neural network, Image recognition, Imbalanced dataset


The work of F. Pulgar was supported by the University of Jaén under the Action 15: Predoctoral aids for the encouragement of the doctorate. This work was partially supported by the Spanish Ministry of Science and Technology under project TIN2015-68454-R.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Francisco J. Pulgar (1), email author
  • Antonio J. Rivera (1)
  • Francisco Charte (1)
  • María J. del Jesus (1)
  1. Department of Computer Science, University of Jaén, Jaén, Spain
