Modeling Embodied Feedback with Virtual Humans

  • Stefan Kopp
  • Jens Allwood
  • Karl Grammer
  • Elisabeth Ahlsen
  • Thorsten Stocksmeier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4930)


In natural communication, both speakers and listeners are active most of the time. While a speaker contributes new information, a listener gives feedback by producing unobtrusive (usually short) vocal or non-vocal bodily expressions that indicate whether he or she is able and willing to communicate, perceive, and understand the information, and what emotions and attitudes the information triggers. Simulating such feedback behavior in artificial conversational agents poses major challenges, such as the concurrent and integrated perception and production of multi-modal and multi-functional expressions. We present an approach to modeling feedback for and with virtual humans, grounded in a framework that studies “embodied feedback” as a special case of a more general theoretical account of embodied communication. A realization of this approach with the virtual human Max is described and results are presented.
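To make the perception–production coupling concrete, the following is a minimal, hypothetical sketch of a rule-based listener-feedback trigger of the kind such agents employ: it watches the speaker's prosody and fires a short backchannel when a pause follows a region of low pitch. All class names, thresholds, and the specific rule are illustrative assumptions, not the model described in the paper.

```python
# Hypothetical sketch: prosody-based backchannel trigger for a listening agent.
# Names and thresholds are illustrative, not the paper's actual model.

from dataclasses import dataclass


@dataclass
class SpeechFrame:
    """One analysis frame of the speaker's audio."""
    energy: float    # loudness (arbitrary units); ~0.0 means silence
    pitch_hz: float  # fundamental frequency; 0.0 means unvoiced/silent


def backchannel_due(frames, pause_frames=5, low_pitch_hz=120.0):
    """Return True if the most recent frames form a short pause that was
    preceded by a low-pitch region -- a classic cue for producing a short
    listener backchannel such as 'mhm' or 'ja'."""
    if len(frames) < pause_frames + 1:
        return False
    recent = frames[-pause_frames:]
    # Condition 1: the last pause_frames frames are (near-)silent.
    if any(f.energy > 0.05 for f in recent):
        return False
    # Condition 2: the last voiced frame before the pause had low pitch.
    for f in reversed(frames[:-pause_frames]):
        if f.pitch_hz > 0.0:
            return f.pitch_hz < low_pitch_hz
    return False


# Usage: ten frames of normal speech, one low-pitch frame, then a pause.
speech = ([SpeechFrame(0.8, 180.0)] * 10
          + [SpeechFrame(0.7, 110.0)]
          + [SpeechFrame(0.0, 0.0)] * 5)
print(backchannel_due(speech))  # low pitch followed by a pause -> True
```

In a full agent this decision would run concurrently with the agent's own production pipeline, so that the backchannel can be emitted without interrupting the speaker's turn.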


Keywords: Feedback · Virtual Humans · Embodied Conversational Agent





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Stefan Kopp (1)
  • Jens Allwood (2)
  • Karl Grammer (3)
  • Elisabeth Ahlsen (2)
  • Thorsten Stocksmeier (1)

  1. A.I. Group, Bielefeld University, Bielefeld, Germany
  2. Dep. of Linguistics, Göteborg University, Göteborg, Sweden
  3. Ludwig Boltzmann Inst. for Urban Ethology, Vienna, Austria