Performance of Deep Learning Algorithms vs. Shallow Models, in Extreme Conditions - Some Empirical Studies

  • Samik Banerjee
  • Prateep Bhattacharjee
  • Sukhendu Das
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10597)


Deep convolutional neural networks (DCNNs) exhibit exceptionally good classification performance despite their massive size. We first discuss the effect of a large noise term, which appears as the irreducible error in the Expected Prediction Error (EPE). Through extensive systematic experiments, we then show that in extreme conditions traditional approaches fare on par with large neural networks, which otherwise generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks trained for classification barely fit a random labeling of the training data, which serves as an extreme learning condition. This phenomenon is quantitatively unaffected even if we train the CNNs with completely inseparable data. Such inseparability can be due to a large degree of corruption of the entire data by random noise, or to random labels associated with the data due to observation error. We corroborate these experimental findings by showing that a depth-six CNN (VGG-6) fails to overcome large noise in image signals.
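For readers unfamiliar with the term, the role of the noise term in the EPE can be illustrated with the standard squared-error decomposition found in textbooks such as The Elements of Statistical Learning. This is a sketch under assumed notation (the additive-noise model, the symbols $f$, $\hat{f}$, $x_0$ and $\sigma_\varepsilon^2$ are not given in the abstract), and the paper's own formulation for classification may differ:

```latex
% Assumed setting (not stated in the abstract): Y = f(X) + \varepsilon,
% with E[\varepsilon] = 0, Var(\varepsilon) = \sigma_\varepsilon^2, and squared-error loss.
\begin{align}
\mathrm{EPE}(x_0)
  &= \mathbb{E}\big[(Y - \hat{f}(x_0))^2 \mid X = x_0\big] \\
  &= \underbrace{\sigma_\varepsilon^2}_{\text{irreducible error}}
   + \underbrace{\big(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\big)^2}_{\text{bias}^2}
   + \underbrace{\mathbb{E}\big[(\hat{f}(x_0) - \mathbb{E}[\hat{f}(x_0)])^2\big]}_{\text{variance}}
\end{align}
```

The first term does not depend on the fitted model $\hat{f}$, so no choice of model, deep or shallow, can reduce it. When $\sigma_\varepsilon^2$ is large, it dominates the total error regardless of how the bias and variance terms differ between models.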


Keywords: Convolutional neural networks, Noise, Classification, SVM, EPE



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Indian Institute of Technology Madras, Chennai, India
