Predicting Dialogue Breakdown in Conversational Pedagogical Agents with Multimodal LSTMs

  • Conference paper
Artificial Intelligence in Education (AIED 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11626)

Abstract

Recent years have seen a growing interest in conversational pedagogical agents. However, creating robust dialogue managers for conversational pedagogical agents poses significant challenges. Agents’ misunderstandings and inappropriate responses may cause breakdowns in conversational flow, lead to breaches of trust in agent-student relationships, and negatively impact student learning. Dialogue breakdown detection (DBD) is the task of predicting whether an agent’s utterance will cause a breakdown in an ongoing conversation. A robust DBD framework can support enhanced user experiences by choosing more appropriate responses, while also offering a method to conduct error analyses and improve dialogue managers. This paper presents a multimodal deep learning-based DBD framework to predict breakdowns in student-agent conversations. We investigate this framework with dialogues between middle school students and a conversational pedagogical agent in a game-based learning environment. Results from a study with 92 middle school students demonstrate that multimodal long short-term memory network (LSTM)-based dialogue breakdown detectors incorporating eye gaze features achieve high predictive accuracies and recall rates, suggesting that multimodal detectors can play an important role in designing conversational pedagogical agents that effectively engage students in dialogue.

Acknowledgments

This research was funded by the National Science Foundation under grants CHS-1409639 and DRL-1640141. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Wookhee Min.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Min, W. et al. (2019). Predicting Dialogue Breakdown in Conversational Pedagogical Agents with Multimodal LSTMs. In: Isotani, S., Millán, E., Ogan, A., Hastings, P., McLaren, B., Luckin, R. (eds) Artificial Intelligence in Education. AIED 2019. Lecture Notes in Computer Science, vol 11626. Springer, Cham. https://doi.org/10.1007/978-3-030-23207-8_37

  • DOI: https://doi.org/10.1007/978-3-030-23207-8_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23206-1

  • Online ISBN: 978-3-030-23207-8

  • eBook Packages: Computer Science, Computer Science (R0)
