Learning Smooth, Human-Like Turntaking in Realtime Dialogue

  • Gudny Ragna Jonsdottir
  • Kristinn R. Thorisson
  • Eric Nivel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5208)

Abstract

Giving synthetic agents human-like realtime turntaking skills is a challenging task. Attempts have been made to manually construct such skills, with systematic categorization of silences, prosody and other candidate turn-giving signals, and to use analysis of corpora to produce static decision trees for this purpose. However, for general-purpose turntaking skills, which vary between individuals and cultures, a system that can learn them on-the-job would be best. We are exploring ways to use machine learning to have an agent learn proper turntaking during interaction. We have implemented a talking agent that continuously adjusts its turntaking behavior to its interlocutors based on realtime analysis of the other party’s prosody. Initial results from experiments on collaborative, content-free dialogue show that, for a given subset of turntaking conditions, our modular reinforcement learning techniques allow the system to learn to take turns in an efficient, human-like manner.

Keywords

Turntaking, Machine Learning, Prosody, End-of-utterance detection

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Gudny Ragna Jonsdottir (1)
  • Kristinn R. Thorisson (1)
  • Eric Nivel (1)
  1. Center for Analysis & Design of Intelligent Agents & School of Computer Science, Reykjavik University, Reykjavik, Iceland
