Predicting Dialogue Breakdown in Conversational Pedagogical Agents with Multimodal LSTMs

Min, Wookhee; Park, Kyungjin; Wiggins, Joseph; Mott, Bradford; Wiebe, Eric; Boyer, Kristy Elizabeth; Lester, James

doi:10.1007/978-3-030-23207-8_37

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11626)

Included in the following conference series:

International Conference on Artificial Intelligence in Education

Abstract

Recent years have seen a growing interest in conversational pedagogical agents. However, creating robust dialogue managers for conversational pedagogical agents poses significant challenges. Agents’ misunderstandings and inappropriate responses may cause breakdowns in conversational flow, lead to breaches of trust in agent-student relationships, and negatively impact student learning. Dialogue breakdown detection (DBD) is the task of predicting whether an agent’s utterance will cause a breakdown in an ongoing conversation. A robust DBD framework can support enhanced user experiences by choosing more appropriate responses, while also offering a method to conduct error analyses and improve dialogue managers. This paper presents a multimodal deep learning-based DBD framework to predict breakdowns in student-agent conversations. We investigate this framework with dialogues between middle school students and a conversational pedagogical agent in a game-based learning environment. Results from a study with 92 middle school students demonstrate that multimodal long short-term memory network (LSTM)-based dialogue breakdown detectors incorporating eye gaze features achieve high predictive accuracies and recall rates, suggesting that multimodal detectors can play an important role in designing conversational pedagogical agents that effectively engage students in dialogue.

Acknowledgments

This research was funded by the National Science Foundation under grants CHS-1409639 and DRL-1640141. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Authors and Affiliations

Center for Educational Informatics, North Carolina State University, Raleigh, NC, 27606, USA
Wookhee Min, Kyungjin Park, Bradford Mott, Eric Wiebe & James Lester
Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, 32601, USA
Joseph Wiggins & Kristy Elizabeth Boyer

Corresponding author

Correspondence to Wookhee Min.

Editor information

Editors and Affiliations

University of Sao Paulo, Sao Paulo, Brazil
Seiji Isotani
University of Malaga, Málaga, Spain
Eva Millán
Carnegie Mellon University, Pittsburgh, PA, USA
Amy Ogan
DePaul University, Chicago, IL, USA
Peter Hastings
Carnegie Mellon University, Pittsburgh, PA, USA
Bruce McLaren
University College London, London, UK
Rose Luckin

About this paper

Cite this paper

Min, W. et al. (2019). Predicting Dialogue Breakdown in Conversational Pedagogical Agents with Multimodal LSTMs. In: Isotani, S., Millán, E., Ogan, A., Hastings, P., McLaren, B., Luckin, R. (eds) Artificial Intelligence in Education. AIED 2019. Lecture Notes in Computer Science, vol 11626. Springer, Cham. https://doi.org/10.1007/978-3-030-23207-8_37

DOI: https://doi.org/10.1007/978-3-030-23207-8_37
Published: 21 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23206-1
Online ISBN: 978-3-030-23207-8
eBook Packages: Computer Science; Computer Science (R0)