Abstract
We examined whether the orientation of the face influences speech perception in face-to-face communication. Participants identified auditory syllables, visible syllables, and bimodal syllables presented in an expanded factorial design. The syllables were /ba/, /va/, /ða/, or /da/. The auditory syllables were taken from natural speech, whereas the visible syllables were produced by computer animation of a realistic talking face. The animated face was presented either in its normal upright orientation or in an inverted orientation (180° frontal rotation). The central intent of the study was to determine whether an inverted view of the face would change the nature of the processing of bimodal speech or simply influence the information available in visible speech. The results with both the upright and inverted face views were adequately described by the fuzzy logical model of perception (FLMP). The observed differences in the FLMP's parameter values corresponding to the visual information indicate that inverting the view of the face influences the amount of visible information but does not change the nature of the information processing in bimodal speech perception.
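As a brief sketch of the model under test: in the FLMP, each modality is first evaluated independently, yielding fuzzy truth values for each response alternative; these values are multiplied at integration, and a relative goodness rule converts the products into response probabilities. The notation below is a generic rendering of that rule, not taken from this article:

```latex
% FLMP decision rule (relative goodness rule).
% a_ik: degree to which auditory level A_i supports response alternative r_k
% v_jk: degree to which visual level V_j supports response alternative r_k
\[
  P(r_k \mid A_i, V_j)
  \;=\;
  \frac{a_{ik}\, v_{jk}}{\displaystyle\sum_{m} a_{im}\, v_{jm}}
\]
```

On this account, inverting the face should change the values of the visual parameters \(v_{jk}\) (less visible information) while leaving the multiplicative integration rule itself intact, which is the pattern the abstract reports.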
Additional information
This research was supported, in part, by grants from the Public Health Service (PHS R01 DC 00236), the National Science Foundation (BNS 8812728), and the University of California, Santa Cruz.
Massaro, D.W., Cohen, M.M. Perceiving speech from inverted faces. Perception & Psychophysics 58, 1047–1065 (1996). https://doi.org/10.3758/BF03206832