From vision to multimodal communication: Incremental route descriptions

Artificial Intelligence Review

Abstract

In recent years, cognitive science has shown growing interest in the connection between vision and natural language. The central question is: how can we talk about what we see? With this question in mind, we examine incremental route descriptions, in which a speaker presents the relevant route information step by step in a 3D environment. The speaker must adapt his or her descriptions to the objects that are currently visible. Two major questions arise in this context: (1) How is visually obtained information used in natural language generation? (2) How are these modalities coordinated? We present a computational framework for the interaction of vision and natural language description that integrates several processes and representations. In particular, we discuss the interaction between the spatial representation and the presentation representation used for natural language descriptions. We have implemented a prototype of the proposed model, called MOSES.

Additional information

This work is funded by the cognitive science program Graduiertenkolleg ‘Kognitionswissenschaft’ of the German Research Community (DFG). I am grateful to Jörg Bau, Joachim Paul, Roger Knop, Bernd Andes, and Amy Norton.

Cite this article

Maaß, W. From vision to multimodal communication: Incremental route descriptions. Artif Intell Rev 8, 159–174 (1994). https://doi.org/10.1007/BF00849072

  • DOI: https://doi.org/10.1007/BF00849072

Key words

Navigation