Incremental Dialogue Understanding and Feedback for Multiparty, Multimodal Conversation

  • David Traum
  • David DeVault
  • Jina Lee
  • Zhiyang Wang
  • Stacy Marsella
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7502)


In order to provide comprehensive listening behavior, virtual humans engaged in dialogue need to incrementally listen, interpret, understand, and react to what someone is saying, in real time, as they are saying it. In this paper, we describe an implemented system for engaging in multiparty dialogue, including incremental understanding and a range of feedback. We present an FML message extension for feedback in multiparty dialogue that can be connected to a feedback realizer. We also describe how the important aspects of that message are calculated by different modules involved in partial input processing as a speaker is talking in a multiparty dialogue.
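The full paper is not reproduced here, but the kind of FML feedback message the abstract describes might be sketched as follows. All element and attribute names in this snippet are hypothetical illustrations, not the authors' actual schema; it only shows the general shape of an XML feedback message that a feedback realizer could consume.

```python
# Hypothetical sketch of an FML-style feedback message for multiparty
# dialogue. Element and attribute names are illustrative assumptions,
# not the schema presented in the paper.
import xml.etree.ElementTree as ET

def make_feedback_message(participant, addressee, feedback_type, polarity):
    """Build a minimal FML-like feedback message as an XML string."""
    fml = ET.Element("fml")
    ET.SubElement(fml, "feedback", {
        "type": feedback_type,       # e.g. perception, understanding, acceptance
        "participant": participant,  # the listening agent giving feedback
        "addressee": addressee,      # the participant currently speaking
        "polarity": polarity,        # positive or negative feedback
    })
    return ET.tostring(fml, encoding="unicode")

msg = make_feedback_message("agent1", "speaker1", "understanding", "positive")
print(msg)
```

In a running system, a message like this would be emitted incrementally as partial speech recognition and interpretation results arrive, and handed to a behavior realizer that maps it to nods, gaze, or verbal backchannels.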


Keywords: Automatic Speech Recognition · Virtual Human · Feedback Message · Conversational Agent · Dialogue Manager




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • David Traum¹
  • David DeVault¹
  • Jina Lee¹
  • Zhiyang Wang¹
  • Stacy Marsella¹

  1. Institute for Creative Technologies, University of Southern California, Playa Vista, USA
