Generating Post-Hoc Rationales of Deep Visual Classification Decisions

  • Zeynep Akata
  • Lisa Anne Hendricks
  • Stephan Alaniz
  • Trevor Darrell
Part of the Springer Series on Challenges in Machine Learning book series (SSCML)


Clearly explaining the rationale for a classification decision to an end-user can be as important as the decision itself. Existing approaches to deep visual recognition are generally opaque and output no justification text, while contemporary vision-language models can describe image content but fail to account for the class-discriminative image aspects that justify a visual prediction. Our model focuses on the discriminating properties of the visible object, jointly predicts a class label, and explains why the predicted label is appropriate for the image. A loss function based on sampling and reinforcement learning trains the model to generate sentences that realize a global sentence property, such as class specificity. Our results on a fine-grained bird species classification dataset show that this model generates explanations which are not only consistent with an image but also more discriminative than descriptions produced by existing captioning methods. In this work, we emphasize the importance of producing an explanation for an observed action, which could be applied to a black-box decision agent, akin to the explanation one human produces when asked to account for the actions of another.
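The sampling-based reinforcement learning loss described above can be illustrated with a minimal sketch. Assuming a sentence has been sampled from the caption model and scored by a separate sentence-level reward (for example, how class-discriminative the sentence is according to a sentence classifier), a REINFORCE-style loss scales the sentence's negative log-likelihood by the (baseline-adjusted) reward. The function name and signature below are illustrative, not part of the authors' released code.

```python
import math

def reinforce_loss(token_probs, reward, baseline=0.0):
    """REINFORCE-style loss for one sampled sentence.

    token_probs: probabilities the model assigned to each sampled token.
    reward: a global, sentence-level score (e.g. class specificity of the
            whole sentence), which need not be differentiable.
    baseline: subtracted from the reward to reduce gradient variance.
    """
    # Log-likelihood of the sampled sentence under the model.
    log_prob = sum(math.log(p) for p in token_probs)
    # Minimizing this loss raises the likelihood of sentences that
    # receive above-baseline rewards and lowers it otherwise.
    return -(reward - baseline) * log_prob
```

Because the reward multiplies the log-likelihood rather than entering the computation graph, the sentence-level score itself does not need to be differentiable; this is what lets a non-differentiable property like class specificity shape generation.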


Keywords: Explainable AI · Rationalizations · Fine-grained classification



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Zeynep Akata (1)
  • Lisa Anne Hendricks (2)
  • Stephan Alaniz (1)
  • Trevor Darrell (2)
  1. AMLAB, University of Amsterdam, Amsterdam, The Netherlands
  2. EECS, University of California, Berkeley, Berkeley, USA
