Visible speech improves human language understanding: Implications for speech processing systems

Thompson, Laura A.; Ogden, William C.

doi:10.1007/BF00849044

Published: October 1995

Artificial Intelligence Review, Volume 9, pages 347–358 (1995)

Abstract

Evidence from the study of human language understanding is presented suggesting that our ability to perceive visible speech can greatly influence our ability to understand and remember spoken language. A view of the speaker's face can greatly aid in the perception of ambiguous or noisy speech and can aid cognitive processing of speech, leading to better understanding and recall. Some of these effects have been replicated using computer-synthesized visual and auditory speech. Thus, it appears that when giving an interface a voice, it may be best to give it a face too.

Author information

Authors and Affiliations

Psychology Department and Computing Research Laboratory, New Mexico State University, Las Cruces, NM 88003, USA
Laura A. Thompson & William C. Ogden

About this article

Cite this article

Thompson, L.A., Ogden, W.C. Visible speech improves human language understanding: Implications for speech processing systems. Artif Intell Rev 9, 347–358 (1995). https://doi.org/10.1007/BF00849044
