Abstract
As Transformer models grow in complexity, their interpretability becomes a significant challenge in Natural Language Processing (NLP). Visual explanation is a compelling way to tackle this issue: by visualizing the path a model takes to a specific output, it sheds light on the features or components that influence the outcome. A central objective of visual explanation in NLP is to highlight the portions of the input text that exert the most significant impact on the model’s output. Numerous visual explanation techniques for NLP models have appeared in recent years, but evaluating and comparing their performance remains a major hurdle. Conventional classification accuracy measures are inadequate for assessing visualization quality; rigorous criteria are needed to gauge how useful the extracted insights are for explaining the models. Visualizing discrepancies in the knowledge extracted by different models is also crucial for effective ranking, and this is an area of research with very few available results. In this work, we investigate how to evaluate explanations/visualizations resulting from Machine Learning (ML) models for text classification. We describe and apply several methods for evaluating the quality of text visualizations, including both automated techniques based on quantifiable measures and subjective techniques based on human judgements.
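One family of automated, quantifiable measures mentioned above compares the token-importance rankings that two models (or two explanation methods) assign to the same input. As a minimal illustrative sketch, the snippet below computes a simple top-k overlap score between two saliency rankings; the per-token scores are hypothetical stand-ins for values a real explainer (e.g. a gradient-based attribution method) would produce, and the function name is our own, not from the chapter.

```python
# Sketch: quantifying agreement between two models' token-saliency
# rankings with a simple top-k overlap score (1.0 = identical top-k sets).

def top_k_overlap(saliency_a, saliency_b, k=3):
    """Fraction of tokens shared by the top-k of both rankings (0..1)."""
    top_a = {tok for tok, _ in sorted(saliency_a.items(), key=lambda x: -x[1])[:k]}
    top_b = {tok for tok, _ in sorted(saliency_b.items(), key=lambda x: -x[1])[:k]}
    return len(top_a & top_b) / k

# Hypothetical per-token importance scores from two text classifiers.
model_a = {"the": 0.02, "movie": 0.30, "was": 0.15, "terrible": 0.90, "today": 0.05}
model_b = {"the": 0.02, "movie": 0.45, "was": 0.08, "terrible": 0.85, "today": 0.40}

print(top_k_overlap(model_a, model_b, k=3))  # both agree on "terrible", "movie"
```

Set-overlap measures like this ignore the ordering within the top-k; rank-aware similarity measures for indefinite rankings are a natural refinement when ordering matters.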
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dunn, A., Inkpen, D., Andonie, R. (2024). Designing and Evaluating Context-Sensitive Visualization Models for Deep Learning Text Classifiers. In: Kovalerchuk, B., Nazemi, K., Andonie, R., Datia, N., Bannissi, E. (eds) Artificial Intelligence and Visualization: Advancing Visual Knowledge Discovery. Studies in Computational Intelligence, vol 1126. Springer, Cham. https://doi.org/10.1007/978-3-031-46549-9_14
DOI: https://doi.org/10.1007/978-3-031-46549-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46548-2
Online ISBN: 978-3-031-46549-9
eBook Packages: Intelligent Technologies and Robotics (R0)