Spatial Knowledge Representation for Human-Robot Interaction

  • Reinhard Moratz
  • Thora Tenbrink
  • John Bateman
  • Kerstin Fischer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2685)


Non-intuitive styles of interaction between humans and mobile robots still constitute a major barrier to the wider application and acceptance of mobile robot technology. More natural interaction can only be achieved if ways are found of bridging the gap between the forms of spatial knowledge maintained by such robots and the forms of language used by humans to communicate such knowledge. In this paper, we present the beginnings of a computational model for representing spatial knowledge that is appropriate for interaction between humans and mobile robots. Work on spatial reference in human-human communication has established a range of reference systems adopted when referring to objects; we show the extent to which these strategies transfer to the human-robot situation and touch upon the problem of differing perceptual systems. Our results were obtained within an implemented kernel system which permitted the performance of experiments with human test subjects interacting with the system. We show how the results of the experiments can be used to improve the adequacy and the coverage of the system, and highlight necessary directions for future research.


Natural human-robot interaction, computational modeling of spatial knowledge, reference systems





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Reinhard Moratz (1)
  • Thora Tenbrink (2)
  • John Bateman (3)
  • Kerstin Fischer (3)

  1. Center for Computer Studies, University of Bremen, Bremen, Germany
  2. Department for Informatics, University of Hamburg, Hamburg
  3. FB10: Linguistics and Literary Studies, University of Bremen, Bremen, Germany
