Mandarin speech perception by ear and eye follows a universal principle

Chen, Trevor H.; Massaro, Dominic W.

doi:10.3758/BF03194976

Published: July 2004

Volume 66, pages 820–836, (2004)

Perception & Psychophysics


Abstract

In this study, speech perception by native Mandarin Chinese speakers was compared with that of American English speakers, using synthetic visual and auditory continua (from /ba/ to /da/) in an expanded factorial design. In Experiment 1, speakers identified synthetic unimodal and bimodal speech syllables as either /ba/ or /da/. In Experiment 2, Mandarin speakers were given nine possible response alternatives. Syllable identification was influenced by both visual and auditory sources of information for both Mandarin and English speakers. Performance was better described by the fuzzy logical model of perception than by an auditory dominance model or a weighted-averaging model. Overall, the results are consistent with the idea that, although there may be differences in information (reflecting differences in phonemic repertoires, phonetic realizations of the syllables, and the phonotactic constraints of languages), the underlying nature of audiovisual speech processing is similar across languages.
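The abstract names the models but does not state their equations. As a minimal sketch, the Python below uses the standard two-alternative forms of the fuzzy logical model of perception (multiplicative integration followed by a relative goodness rule) and of a weighted-averaging model, and enumerates the conditions of an expanded factorial design. The function names, the support values, and the choice of five levels per continuum are illustrative assumptions, not details taken from this page; the auditory dominance model is omitted because its form cannot be inferred from the abstract.

```python
from itertools import product


def flmp(a, v):
    """Two-alternative FLMP prediction for P(/da/).

    a, v: degrees of auditory and visual support for /da/ (assumed to lie
    strictly between 0 and 1; a = 1 with v = 0 would divide by zero).
    Support for /ba/ is taken as the complement of each value.
    """
    support_da = a * v
    support_ba = (1 - a) * (1 - v)
    return support_da / (support_da + support_ba)


def weighted_average(a, v, w):
    """Weighted-averaging prediction; w is the weight on the auditory source."""
    return w * a + (1 - w) * v


def expanded_factorial(n_auditory, n_visual):
    """List all conditions: every bimodal pairing plus each unimodal level.

    None marks the absent modality in a unimodal condition.
    """
    bimodal = list(product(range(n_auditory), range(n_visual)))
    auditory_only = [(i, None) for i in range(n_auditory)]
    visual_only = [(None, j) for j in range(n_visual)]
    return bimodal + auditory_only + visual_only


if __name__ == "__main__":
    # Hypothetical support values, chosen only to show how the models differ.
    print(flmp(0.9, 0.9), weighted_average(0.9, 0.9, 0.5))
    print(len(expanded_factorial(5, 5)))  # assumed 5 levels per continuum
```

Under these forms the two models diverge most when both sources agree: the multiplicative rule yields a response more extreme than either source alone, whereas an average can never exceed the stronger source. When one source is fully ambiguous (support of 0.5), the FLMP prediction reduces to the support from the other source.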



Author information

Authors and Affiliations

Department of Psychology, University of California, Santa Cruz, CA 95064
Trevor H. Chen & Dominic W. Massaro

Corresponding author

Correspondence to Dominic W. Massaro.

Additional information

The research and writing of this article were supported by Grants CDA-9726363, BCS-9905176, and IIS-0086107 from the National Science Foundation, Public Health Service Grant PHS R01 DC00236, a Cure Autism Now Foundation Innovative Technology Award, and the University of California, Santa Cruz (Cota-Robles Fellowship).

Electronic supplementary material

Supplementary material, approximately 340 KB.


About this article

Cite this article

Chen, T.H., Massaro, D.W. Mandarin speech perception by ear and eye follows a universal principle. Perception & Psychophysics 66, 820–836 (2004). https://doi.org/10.3758/BF03194976


Received: 01 January 2003
Accepted: 14 October 2003
Issue Date: July 2004
DOI: https://doi.org/10.3758/BF03194976
