Multimodal Interfaces

  • Reference work entry
Encyclopedia of Multimedia

Definition

Multimodal interfaces process two or more combined user input modes, such as speech, pen, touch, manual gestures, and gaze, in a coordinated manner with multimedia system output.

They are an emerging class of systems that aim to recognize naturally occurring forms of human language and behavior by incorporating one or more recognition-based technologies (e.g., speech, pen, vision). Multimodal interfaces represent a paradigm shift away from conventional graphical user interfaces (Fig. 1). They are being developed largely because they offer a relatively expressive, transparent, efficient, robust, and highly mobile form of human–computer interaction. They represent users’ preferred interaction style, and they let users flexibly combine modalities or switch from one input mode to another that is better suited to a particular task or setting.
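As a rough illustration of what processing combined input modes "in a coordinated manner" can look like, the sketch below pairs a recognized speech phrase with the pen event closest to it in time, and passes a spoken command through on its own when no pen input accompanies it. The entry does not describe any particular fusion method; the `InputEvent` type, the `fuse` function, the two-second window, and the product of confidences are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class InputEvent:
    """One recognizer result (hypothetical structure for this sketch)."""
    mode: str          # "speech" or "pen" (illustrative mode labels)
    content: str       # recognizer output, e.g. a phrase or a location
    timestamp: float   # seconds since the start of the session
    confidence: float  # recognizer confidence in [0, 1]


def fuse(events: List[InputEvent],
         window: float = 2.0) -> List[Tuple[str, Optional[str], float]]:
    """Pair each speech event with the nearest unused pen event.

    A pen event is eligible only if it lies within `window` seconds of the
    speech event. Speech with no eligible pen event is kept as a unimodal
    command. Pen events that are never paired are ignored in this sketch.
    """
    speech = sorted((e for e in events if e.mode == "speech"),
                    key=lambda e: e.timestamp)
    pens = [e for e in events if e.mode == "pen"]
    used = set()  # indices of pen events already paired
    fused = []
    for s in speech:
        best = None  # (time gap, pen index) of the closest eligible pen event
        for i, p in enumerate(pens):
            if i in used:
                continue
            gap = abs(p.timestamp - s.timestamp)
            if gap <= window and (best is None or gap < best[0]):
                best = (gap, i)
        if best is None:
            # Unimodal input: the user relied on speech alone.
            fused.append((s.content, None, s.confidence))
        else:
            used.add(best[1])
            p = pens[best[1]]
            # Assumed joint score: product of the two confidences.
            fused.append((s.content, p.content, s.confidence * p.confidence))
    return fused


events = [
    InputEvent("speech", "put a marker here", 1.0, 0.9),
    InputEvent("pen", "(120, 45)", 1.4, 0.8),
    InputEvent("speech", "zoom in", 10.0, 0.95),
]
result = fuse(events)
```

Here the first spoken phrase is combined with the pen location made 0.4 seconds later, while "zoom in" has no pen event within the window and is kept as a speech-only command, mirroring the ability to switch between combined and single-mode input.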

Figure 1. Multimodal interfaces for field and mobile use.

References

  1. S.L. Oviatt, P.R. Cohen, L. Wu, J. Vergo, L. Duncan, B. Suhm, J. Bers, T. Holzman, T. Winograd, J. Landay, J. Larson, and D. Ferro, “Designing the User Interface for Multimodal Speech and Gesture Applications: State-of-the-Art Systems and Research Directions,” J. Carroll (Ed.), “Human Computer Interaction,” Vol. 15, No. 4, 2000, pp. 263–322 (also in Human-Computer Interaction in the New Millennium, Reading, MA.: Addison-Wesley, 2001).

  2. G. Potamianos, C. Neti, J. Luettin, and I. Matthews, “Audio-Visual Automatic Speech Recognition: An Overview,” G. Bailly, E. Vatikiotis-Bateson, and P. Perrier (Eds.), “Issues in Visual and Audio-Visual Speech Processing,” MIT, Cambridge, MA, 2004.

  3. S.L. Oviatt, “Multimodal Interfaces,” J. Jacko and A. Sears (Eds.), “Handbook of Human–Computer Interaction,” Lawrence Erlbaum, Mahwah, New Jersey, 2003, Chapter 14, pp. 286–304.

© 2008 Springer-Verlag

(2008). Multimodal Interfaces. In: Furht, B. (eds) Encyclopedia of Multimedia. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78414-4_159