ConvNets and ImageNet Beyond Accuracy: Understanding Mistakes and Uncovering Biases

  • Pierre Stock
  • Moustapha Cisse
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11210)


ConvNets and ImageNet have driven the recent success of deep learning for image classification. However, the marked slowdown in performance improvement, combined with the lack of robustness of neural networks to adversarial examples and their tendency to exhibit undesirable biases, calls into question the reliability of these methods. This work investigates these questions from the perspective of the end-user, using human subject studies and explanations. The contribution of this study is threefold. We first demonstrate experimentally that the accuracy and robustness of ConvNets measured on ImageNet are vastly underestimated. Next, we show that explanations can mitigate the impact of misclassified adversarial examples from the perspective of the end-user. We finally introduce a novel tool for uncovering the undesirable biases learned by a model. Together, these contributions show that explanations are a valuable tool both for improving our understanding of ConvNets’ predictions and for designing more reliable models.
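The abstract refers to adversarial examples: inputs altered by small, targeted perturbations that flip a model’s prediction. As a minimal illustration (not the paper’s own method), the sketch below applies a one-step fast-gradient-sign perturbation to a toy logistic classifier; the weights `w`, input `x`, and step size `eps` are hypothetical values chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, eps):
    """One-step fast gradient sign perturbation for a linear
    logistic classifier with score s = w @ x and label y in {0, 1}."""
    s = w @ x
    grad = (sigmoid(s) - y) * w      # gradient of the logistic loss w.r.t. x
    return x + eps * np.sign(grad)   # move each coordinate by +/- eps

w = np.array([1.0, -1.0])            # hypothetical trained weights
x = np.array([0.5, 0.4])             # clean input, true label 1
x_adv = fgsm(x, y=1.0, w=w, eps=0.1)

# A perturbation of only 0.1 per coordinate flips the predicted class.
print(int(w @ x > 0), int(w @ x_adv > 0))  # prints: 1 0
```

The perturbation is small in every coordinate, yet it is aligned with the loss gradient, which is what makes such examples hard for a model while remaining near-imperceptible to a human observer.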


Keywords: Bias detection · Interpretability · Adversarial examples



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Facebook AI Research, Paris, France
