Abstract
We examined whether the orientation of the face influences speech perception in face-to-face communication. Participants identified auditory syllables, visible syllables, and bimodal syllables presented in an expanded factorial design. The syllables were /ba/, /va/, /ða/, or /da/. The auditory syllables were taken from natural speech, whereas the visible syllables were produced by computer animation of a realistic talking face. The animated face was presented either in its normal upright orientation or in an inverted orientation (180° frontal rotation). The central intent of the study was to determine whether an inverted view of the face would change the nature of the processing of bimodal speech or simply influence the information available in visible speech. The results with both the upright and inverted face views were adequately described by the fuzzy logical model of perception (FLMP). The observed differences in the FLMP's parameter values corresponding to the visual information indicate that inverting the view of the face influences the amount of visible information but does not change the nature of the information processing in bimodal speech perception.
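As a brief sketch of the model under test: in the FLMP, each modality is first evaluated independently, yielding fuzzy truth values for each response alternative; these values are multiplied at integration, and a relative goodness rule converts the products into response probabilities. The notation below is a generic rendering of that rule, not taken from this article:

```latex
% FLMP decision rule (relative goodness rule).
% a_ik: degree to which auditory level A_i supports response alternative r_k
% v_jk: degree to which visual level V_j supports response alternative r_k
\[
  P(r_k \mid A_i, V_j)
  \;=\;
  \frac{a_{ik}\, v_{jk}}{\displaystyle\sum_{m} a_{im}\, v_{jm}}
\]
```

On this account, inverting the face should change the values of the visual parameters \(v_{jk}\) (less visible information) while leaving the multiplicative integration rule itself intact, which is the pattern the abstract reports.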
Additional information
This research was supported, in part, by grants from the Public Health Service (PHS R01 DC 00236), the National Science Foundation (BNS 8812728), and the University of California, Santa Cruz.
Massaro, D.W., Cohen, M.M. Perceiving speech from inverted faces. Perception & Psychophysics 58, 1047–1065 (1996). https://doi.org/10.3758/BF03206832