An Intelligent Multimodal Interface


Part of the book series: TEUBNER-TEXTE zur Informatik (TTZI, volume 1)

Abstract

In face-to-face conversation, humans frequently use deictic gestures in parallel with verbal descriptions for referent identification. Such a multimodal mode of communication is of great importance for intelligent interfaces, as it simplifies and speeds up reference to objects in a visualized application domain. Natural pointing behavior is very flexible, but it can also be ambiguous or vague, so that without a careful analysis of the discourse context of a gesture there would be a high risk of reference failure. The subject of this paper is how the user and discourse models of an intelligent interface influence the comprehension and production of natural language with coordinated pointing, and conversely how multimodal communication influences those models. We briefly describe the deixis analyzer of our XTRA system, which handles a variety of tactile gestures, including different granularities, inexact pointing gestures, and pars-pro-toto deixis. We show how gestures can be used to shift focus and how focus can be used to disambiguate gestures. Finally, we discuss the impact of the user model on the presentation planning component's decision whether to use a pointing gesture, a verbal description, or both, for referent identification.
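The idea that focus can disambiguate an inexact pointing gesture can be illustrated with a minimal sketch. It is not the XTRA deixis analyzer: the `Region` type, the circular imprecision `radius`, the `focus` set of region names, and the nearest-candidate rule are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from math import hypot
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Region:
    """Hypothetical on-screen object, reduced to a name and a centre point."""
    name: str
    x: float
    y: float


def resolve_pointing(
    regions: Iterable[Region],
    px: float,
    py: float,
    radius: float,
    focus: Set[str],
) -> Optional[Region]:
    """Pick a referent for an inexact pointing gesture at (px, py).

    Assumed rule: every region whose centre lies within `radius` of the
    pointed-at position is a candidate. Candidates that are currently in
    discourse focus are preferred; distance only breaks ties inside the
    preferred pool.
    """
    def distance(r: Region) -> float:
        return hypot(r.x - px, r.y - py)

    candidates = [r for r in regions if distance(r) <= radius]
    in_focus = [r for r in candidates if r.name in focus]
    pool = in_focus or candidates  # fall back to all candidates if none is focused
    if not pool:
        return None  # the gesture hit nothing; the caller would have to ask back
    return min(pool, key=distance)


# Two neighbouring fields; the gesture lands between them, slightly nearer to field_a.
fields = [Region("field_a", 0.0, 0.0), Region("field_b", 2.0, 0.0)]
nearest = resolve_pointing(fields, 0.8, 0.0, radius=2.0, focus=set())
focused = resolve_pointing(fields, 0.8, 0.0, radius=2.0, focus={"field_b"})
missed = resolve_pointing(fields, 0.8, 0.0, radius=0.5, focus=set())
```

With an empty focus set the nearer region is chosen; once `field_b` is in focus it is chosen instead, even though the gesture is geometrically closer to `field_a`. A real system would also update the focus after each resolved gesture, which this sketch leaves out.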

This is a condensed and revised version of my paper ‘User and Discourse Models for Multimodal Communication’, which appears in ‘Sullivan, J.W., Tyler, S.W. (eds.) Architectures for Intelligent Interfaces: Elements and Prototypes. Reading: Addison-Wesley 1991.’ The research was partially supported by the German Science Foundation (DFG) in its Special Collaborative Programme on AI and Knowledge-Based Systems (SFB 314).



Copyright information

© 1992 B. G. Teubner Verlagsgesellschaft, Leipzig

About this chapter

Cite this chapter

Wahlster, W. (1992). An Intelligent Multimodal Interface. In: Buchmann, J., Ganzinger, H., Paul, W.J. (eds) Informatik. TEUBNER-TEXTE zur Informatik, vol 1. Vieweg+Teubner Verlag, Wiesbaden. https://doi.org/10.1007/978-3-322-95233-2_29

  • DOI: https://doi.org/10.1007/978-3-322-95233-2_29

  • Publisher Name: Vieweg+Teubner Verlag, Wiesbaden

  • Print ISBN: 978-3-8154-2033-1

  • Online ISBN: 978-3-322-95233-2

  • eBook Packages: Springer Book Archive
