
On the Inherent Explainability of Pattern Theory-Based Video Event Interpretations

Chapter in: Explainable and Interpretable Models in Computer Vision and Machine Learning

Abstract

The ability of artificial intelligence systems to explain their decisions is central to building user confidence and structuring smart human-machine interactions. Expressing the rationale behind a system's output becomes increasingly important as AI enters general, everyday use. In this chapter, we introduce a novel framework that integrates Grenander's pattern theory structures to produce inherently explainable, symbolic representations of activity interpretations. These representations provide semantically rich, coherent interpretations of video activity using connected structures of detected (grounded) concepts, such as objects and actions, that are bound by semantics through background concepts not directly observed, i.e., contextualization cues. We use contextualization cues to establish semantic relationships among concepts and thereby infer a deeper interpretation of events than can be directly sensed. We propose six questions that can be used to gain insight into the model's ability to justify its decisions and to enhance its interactions with humans. The six questions are designed to (1) build an understanding of how the model infers interpretations, (2) enable us to walk through its decision-making process, and (3) expose its drawbacks so they can be addressed. We demonstrate the viability of this idea on video data using a dialog model that draws on interpretations to generate explanations grounded in both video data and semantics.
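To make the core idea concrete, the following is a minimal, hypothetical sketch, not the chapter's actual formulation: detected concepts and unobserved contextualization cues are modeled as pattern-theoretic generators, semantic bonds between concept labels stand in for a knowledge base such as ConceptNet, and a configuration is scored by its grounded support plus the strength of its bonds. All labels, bond weights, and the scoring function here are illustrative assumptions.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical generator: a grounded concept (detected object/action) or a
# background contextualization cue that is inferred rather than observed.
@dataclass(frozen=True)
class Generator:
    label: str
    grounded: bool           # True if directly detected in the video
    confidence: float = 1.0  # detector confidence for grounded generators

# Hypothetical semantic-bond table standing in for a knowledge base such as
# ConceptNet: strength of the relationship between two concept labels.
SEMANTIC_BOND = {
    frozenset({"knife", "cut"}): 0.9,
    frozenset({"cut", "cucumber"}): 0.8,
    frozenset({"cucumber", "salad"}): 0.7,
    frozenset({"salad", "cut"}): 0.6,
}

def bond_strength(a: Generator, b: Generator) -> float:
    """Look up the semantic bond between two generators' labels (0 if none)."""
    return SEMANTIC_BOND.get(frozenset({a.label, b.label}), 0.0)

def interpretation_score(generators: list[Generator]) -> float:
    """Score a configuration: grounded detector support plus pairwise bonds."""
    support = sum(g.confidence for g in generators if g.grounded)
    bonds = sum(bond_strength(a, b) for a, b in combinations(generators, 2))
    return support + bonds

detected = [Generator("knife", True, 0.95), Generator("cut", True, 0.80),
            Generator("cucumber", True, 0.85)]
context = Generator("salad", False)  # contextualization cue, never observed

# Adding the unobserved "salad" cue raises the score, because it binds the
# detected concepts into a semantically deeper interpretation of the event.
print(interpretation_score(detected))
print(interpretation_score(detected + [context]))
```

Because every contribution to the score is a named generator or an explicit bond, the configuration itself serves as the explanation: one can point to exactly which detections and which background concepts support the interpretation.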



Acknowledgements

This research was supported in part by NSF grants IIS 1217676 and CNS-1513126. The authors would also like to thank Daniel Sawyer for his invaluable insights during discussion.

Author information

Correspondence to Sathyanarayanan N. Aakur.


Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Aakur, S.N., de Souza, F.D.M., Sarkar, S. (2018). On the Inherent Explainability of Pattern Theory-Based Video Event Interpretations. In: Escalante, H., et al. Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-98131-4_11

Download citation


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98130-7

  • Online ISBN: 978-3-319-98131-4

  • eBook Packages: Computer Science (R0)
