International Journal of Social Robotics, Volume 5, Issue 3, pp 345–356

Using Embodied Multimodal Fusion to Perform Supportive and Instructive Robot Roles in Human-Robot Interaction

  • Manuel Giuliani
  • Alois Knoll


Abstract

We present a robot that works with humans on a joint construction task. In this kind of interaction, it is important that the robot can perform different roles in order to achieve an efficient collaboration. To this end, we introduce embodied multimodal fusion, a new approach for processing data from the robot's input modalities. Using this method, we implemented two robot roles: in the instructive role, the robot mainly instructs the user how to proceed with the construction; in the supportive role, the robot hands over assembly pieces to the human that fit the current step of the assembly plan. We present a user evaluation that investigates how humans react to the two roles of the robot. Its main finding is that users do not prefer one role over the other; instead, they take the role complementary to the robot's and adjust their own behaviour to the robot's actions. The most influential factors for user satisfaction in this kind of interaction are the number of times the users picked up a building piece without an explicit instruction from the robot, which had a positive influence, and the number of utterances the users made themselves, which had a negative influence.


Keywords: Human-robot interaction · Embodied multimodal fusion · Robot roles
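To make the role distinction in the abstract concrete, the following is a purely illustrative Python sketch (not the authors' actual system) of how fused speech and gesture input might drive role-dependent action selection; all names, data structures, and behaviours here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FusedInput:
    """Hypothetical result of fusing one speech and one gesture observation."""
    utterance: Optional[str]      # recognized speech, if any
    pointed_piece: Optional[str]  # piece indicated by a pointing gesture, if any

def fuse(speech: Optional[str], gesture: Optional[str]) -> FusedInput:
    """Naive fusion: pair the latest speech event with the latest gesture."""
    return FusedInput(utterance=speech, pointed_piece=gesture)

def next_action(role: str, fused: FusedInput,
                plan: list, done: set) -> str:
    """Select the robot's next action depending on its role.

    Instructive role: tell the user which piece to attach next.
    Supportive role: hand over the piece that fits the current plan step.
    """
    remaining = [p for p in plan if p not in done]
    if not remaining:
        return "announce: assembly complete"
    # A pointed-at piece overrides the default plan order for instructions.
    target = fused.pointed_piece or remaining[0]
    if role == "instructive":
        return f"instruct: attach {target}"
    return f"handover: {remaining[0]}"

# Usage: same assembly state, different roles yield different behaviour.
plan = ["slat", "cube", "bolt"]
print(next_action("instructive", fuse("what now?", None), plan, {"slat"}))
# → instruct: attach cube
print(next_action("supportive", fuse(None, "cube"), plan, {"slat"}))
# → handover: cube
```

The point of the sketch is only the structural split: fusion produces a single interpretation of the user's multimodal input, and the configured role then maps that interpretation to either an instruction or a handover.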



Acknowledgements

This research was supported by the European Commission through the projects JAST (FP6-003747-IP) and JAMES (FP7-270435-STREP). Thanks to Sören Jentzsch for help in annotating the video data.



Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  1. fortiss GmbH, Munich, Germany
  2. Robotics and Embedded Systems, Technical University Munich, Garching bei München, Germany
