ConvXAI: a System for Multimodal Interaction with Any Black-box Explainer

Cognitive Computation

Abstract

Several studies have addressed the importance of context and users’ knowledge and experience in quantifying the usability and effectiveness of the explanations generated by explainable artificial intelligence (XAI) systems. However, to the best of our knowledge, no component-agnostic system that accounts for this need has yet been built. This paper describes an approach called ConvXAI, which can create a dialogical multimodal interface for any black-box explainer by taking the knowledge and experience of the user into account. First, we formally extend the state-of-the-art conversational explanation framework by introducing clarification dialogue as an additional dialogue type. We then implement our approach as an off-the-shelf Python tool. To evaluate our framework, we performed a user study involving 45 participants, divided into three groups based on their level of technology use and job function. Experimental results show that (i) different groups perceive explanations differently; (ii) all groups prefer textual explanations over graphical ones; and (iii) ConvXAI provides clarifications that enhance the usefulness of the original explanations.
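To make the idea concrete, the sketch below shows, in Python, how a conversational layer might wrap an arbitrary black-box explainer and support both an explanation dialogue ("why this prediction?") and the clarification dialogue type introduced above. This is a minimal illustrative sketch only: every name (ConversationalWrapper, explain, clarify, toy_explainer, the glossary argument) is hypothetical and is not the actual ConvXAI API; the only assumption is that the wrapped explainer returns feature–weight pairs, as LIME- or SHAP-style explainers typically do.

from typing import Callable, Dict, List, Tuple

# A "black-box explainer" here is any callable mapping an instance to
# (feature, weight) pairs, e.g. a wrapped LIME or SHAP explanation.
Explainer = Callable[[dict], List[Tuple[str, float]]]


class ConversationalWrapper:
    """Hypothetical wrapper exposing two dialogue types over any explainer:
    an explanation dialogue and a clarification dialogue."""

    def __init__(self, explainer: Explainer, glossary: Dict[str, str]):
        self.explainer = explainer
        self.glossary = glossary  # plain-language definitions used for clarifications
        self.last_explanation: List[Tuple[str, float]] = []

    def explain(self, instance: dict) -> str:
        # Explanation dialogue: summarise the top contributing features in text.
        self.last_explanation = self.explainer(instance)
        top = sorted(self.last_explanation, key=lambda fw: abs(fw[1]), reverse=True)[:3]
        return "The prediction is mostly driven by: " + ", ".join(
            f"{feat} ({weight:+.2f})" for feat, weight in top)

    def clarify(self, term: str) -> str:
        # Clarification dialogue: restate part of the previous explanation
        # in terms suited to the current user's knowledge and experience.
        return self.glossary.get(
            term.lower(),
            f"Sorry, I have no simpler description of '{term}' yet.")


# Toy usage with a fake explainer standing in for a real LIME/SHAP/anchors call.
def toy_explainer(instance: dict) -> List[Tuple[str, float]]:
    return [("income", 0.42), ("age", -0.17), ("tenure", 0.08)]


bot = ConversationalWrapper(toy_explainer,
                            glossary={"income": "the applicant's yearly earnings"})
print(bot.explain({"income": 54000, "age": 31, "tenure": 2}))
print(bot.clarify("income"))

Running the sketch prints a short textual explanation of the three most influential features, followed by a plain-language clarification of "income" — the kind of follow-up exchange the clarification dialogue type is intended to support.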


Data Availability

The datasets analysed during the current study are available at https://bit.ly/3u4hFLM.

Notes

  1. https://bit.ly/3haAXM3

  2. https://rasa.com/

  3. https://www.luis.ai/

  4. All datasets can be accessed from https://github.com/sebischair/NLU-Evaluation-Corpora

  5. https://cloud.google.com/dialogflow

  6. https://bit.ly/3haAXM3

  7. https://forms.gle/JArrvVVYJR1jdpUs8

  8. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai

Abbreviations

XAI: Explainable Artificial Intelligence

NL: Natural Language

NLP: Natural Language Processing

NLG: Natural Language Generation

NLU: Natural Language Understanding

ADF: Agent Dialogue Framework

SLU: Spoken Language Understanding

NN: Neural Network

RecNN: Recursive Neural Network

CNN: Convolutional Neural Network

CRF: Conditional Random Field

TriCRF: Triangular CRF

GRU: Gated Recurrent Unit

RNN: Recurrent Neural Network

BLSTM: Bidirectional Long Short-Term Memory

BERT: Bidirectional Encoder Representations from Transformers

DST: Dialogue State Tracking

Seq2Seq: Sequence-to-Sequence

ASR: Automatic Speech Recognition

DP: Dialogue Policy

EDM: Explanation Dialogue Model

UML: Unified Modeling Language

ML: Machine Learning

CI: Conversation Initialiser

SVM: Support Vector Machine

EG: Explanation Generator

JSON: JavaScript Object Notation


Author information

Corresponding author

Correspondence to Navid Nobani.

Ethics declarations

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 1010 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Malandri, L., Mercorio, F., Mezzanzanica, M. et al. ConvXAI: a System for Multimodal Interaction with Any Black-box Explainer. Cogn Comput 15, 613–644 (2023). https://doi.org/10.1007/s12559-022-10067-7

