Abstract
A multimodal system is a system equipped with a multimodal interface through which a user can interact using natural communication modalities such as speech, gesture, and eye gaze. To understand a user’s intention, multimodal input fusion, a critical component of a multimodal interface, integrates the user’s multimodal inputs and derives their combined semantic interpretation. As powerful yet affordable input and output technologies, such as speech recognition and eye tracking, become available, it becomes possible to attach recognition technologies to existing applications through a multimodal input fusion module, and thus to build a practical multimodal system. This paper documents our experience in building a practical multimodal system with our multimodal input fusion technology. A pilot study has been conducted on the multimodal system. By outlining observations from the pilot study, implications for multimodal interface design are laid out.
© 2009 Springer-Verlag Berlin Heidelberg
Sun, Y., Shi, Y., Chen, F., Chung, V. (2009). Building a Practical Multimodal System with a Multimodal Fusion Module. In: Jacko, J.A. (ed.) Human-Computer Interaction. Novel Interaction Methods and Techniques. HCI 2009. Lecture Notes in Computer Science, vol. 5611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02577-8_11
DOI: https://doi.org/10.1007/978-3-642-02577-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02576-1
Online ISBN: 978-3-642-02577-8