Abstract
A major concern of Human-Computer Interaction is to improve communication between people and computer applications. One way of improving such communication is to capitalise on the complementary manner in which human beings use speech and gesture, exploiting the redundancy of information between these modes. Redundant data input via multiple modalities gives considerable scope for the resolution of error and ambiguity. This paper describes the implementation of a simple, inexpensive tri-modal input system accepting touch, two-dimensional gesture and speech input. Currently the speech and gesture recognition systems operate separately. Truth maintenance and blackboard system architectures are proposed for a multimodal interpreter that handles the integration between the modes and task knowledge. Rule induction is used to analyse data from the two-dimensional gesture recognition system, and preliminary classification results are presented. Current implementations and future work on redundancy are also discussed.
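The abstract proposes truth maintenance and blackboard architectures for integrating the modes. As a minimal illustrative sketch, and not the authors' implementation, the Python below shows one way a blackboard could accumulate scored hypotheses from the speech and gesture recognisers so that agreement between redundant inputs resolves an ambiguity neither mode can settle alone; all class names, fields and confidence values are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """A scored interpretation posted by one modality's recogniser."""
    modality: str   # "speech", "gesture" or "touch"
    command: str    # e.g. "delete", "select"
    target: str     # object the command applies to
    score: float    # recogniser confidence in [0, 1]

class Blackboard:
    """Shared store on which the recognisers (knowledge sources) post
    hypotheses and from which an interpreter reads the fused result."""

    def __init__(self):
        self.hypotheses = []

    def post(self, hyp: Hypothesis) -> None:
        self.hypotheses.append(hyp)

    def resolve(self):
        """Sum the confidence of hypotheses that agree across modes:
        interpretations confirmed by more than one modality accumulate
        evidence, so redundancy resolves recognition ambiguity."""
        evidence = {}
        for h in self.hypotheses:
            key = (h.command, h.target)
            evidence[key] = evidence.get(key, 0.0) + h.score
        return max(evidence, key=evidence.get)

# The speech recogniser cannot decide between two similar-sounding
# commands; a redundant crossing-out gesture over the same object
# tips the fused evidence towards "delete".
bb = Blackboard()
bb.post(Hypothesis("speech",  "delete", "circle_3", 0.55))
bb.post(Hypothesis("speech",  "select", "circle_3", 0.50))
bb.post(Hypothesis("gesture", "delete", "circle_3", 0.70))
print(bb.resolve())   # ('delete', 'circle_3')
```

A truth maintenance system would extend this by recording a justification for each fused interpretation, so that an interpretation can be retracted automatically if one recogniser later withdraws its supporting hypothesis.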
© 1998 Springer-Verlag
Cite this paper
Mills, K.M.; Alty, J.L.: Investigating the role of redundancy in multimodal input systems. In: Wachsmuth, I.; Fröhlich, M. (eds.): Gesture and Sign Language in Human-Computer Interaction. GW 1997. Lecture Notes in Computer Science, vol. 1371. Springer, Berlin, Heidelberg, 1998. DOI: https://doi.org/10.1007/BFb0052997
Print ISBN 978-3-540-64424-8; Online ISBN 978-3-540-69782-4