A model for incremental grounding in spoken dialogue systems

  • Original Paper
Journal on Multimodal User Interfaces

Abstract

We present a computational model of incremental grounding, including state updates and action selection. The model is inspired by corpus-based examples of overlapping utterances of several sorts, including backchannels and completions. The model has also been partially implemented within a virtual human system that includes incremental understanding, and can be used to track grounding and provide overlapping verbal and non-verbal behaviors from a listener, before a speaker has completed her utterance.

Notes

  1. It is sometimes useful to distinguish further between the explicit or predicted surface form, as opposed to the explicit or predicted meaning.

  2. Sometimes, an utterance that includes an acknowledgment will also proceed to initiate a new CGU (as in “okay, so let’s talk about the other matter”).

Acknowledgments

Some of the effort described here has been sponsored by the US Army. Any opinions, content, or information presented do not necessarily reflect the position or the policy of the United States Government, and no official endorsement should be inferred.

Author information

Corresponding author

Correspondence to David Traum.

About this article

Cite this article

Visser, T., Traum, D., DeVault, D. et al. A model for incremental grounding in spoken dialogue systems. J Multimodal User Interfaces 8, 61–73 (2014). https://doi.org/10.1007/s12193-013-0147-7

  • DOI: https://doi.org/10.1007/s12193-013-0147-7
