Integrating Language, Vision and Action for Human Robot Dialog Systems

  • Markus Rickert
  • Mary Ellen Foster
  • Manuel Giuliani
  • Tomas By
  • Giorgio Panin
  • Alois Knoll
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4555)


Developing a robot system that can interact directly with a human instructor in a natural way requires not only highly skilled sensorimotor coordination and action planning on the part of the robot, but also the ability to understand and communicate with a human being in many modalities. A typical application of such a system is interactive assembly for construction tasks. A human communicator who shares a common view of the work area with the robot system instructs it by speaking to it in the same way that he or she would communicate with a human partner.


Multiagent System; Humanoid Robot; Robot System; Categorial Grammar; Construction Task
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Markus Rickert
    • 1
  • Mary Ellen Foster
    • 1
  • Manuel Giuliani
    • 1
  • Tomas By
    • 1
  • Giorgio Panin
    • 1
  • Alois Knoll
    • 1
  1. Robotics and Embedded Systems Group, Department of Informatics, Technische Universität München, Boltzmannstraße 3, D-85748 Garching bei München, Germany
