Abstract
The acoustic structure of the speech signal is extremely variable due to a variety of contextual factors, including talker characteristics and speaking rate. To account for the listener’s ability to adjust to this variability, speech researchers have posited the existence of talker and rate normalization processes. The current study examined how the perceptual system encoded information about talker and speaking rate during phonetic perception. Experiments 1–3 examined this question, using a speeded classification paradigm developed by Garner (1974). The results of these experiments indicated that decisions about phonemic identity were affected by both talker and rate information: irrelevant variation in either dimension interfered with phonemic classification. While rate classification was also affected by phoneme variation, talker classification was not. Experiment 4 examined the impact of talker and rate variation on the voicing boundary under different blocking conditions. The results indicated that talker characteristics influenced the voicing boundary when talker variation occurred within a block of trials only under certain conditions. Rate variation, however, influenced the voicing boundary regardless of whether or not there was rate variation within a block of trials. The findings from these experiments indicate that phoneme and rate information are encoded in an integral manner during speech perception, while talker characteristics are encoded separately.
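In Garner's speeded classification paradigm, interference is conventionally measured as the difference between mean reaction time when the irrelevant dimension varies (the orthogonal condition) and mean reaction time when it is held constant (the baseline condition). The following is a minimal sketch of that computation only. The function name and the reaction times are hypothetical and illustrative; they are not data or analysis code from the study.

```python
from statistics import mean


def garner_interference(baseline_rts, orthogonal_rts):
    """Return mean orthogonal-condition RT minus mean baseline RT (ms).

    A positive value means irrelevant variation slowed classification.
    """
    return mean(orthogonal_rts) - mean(baseline_rts)


# Hypothetical reaction times in milliseconds (illustrative only).
baseline = [520, 540, 510, 530]      # irrelevant dimension held constant
orthogonal = [580, 600, 570, 590]    # irrelevant dimension varies

interference = garner_interference(baseline, orthogonal)
print(f"Interference: {interference:.1f} ms")
```

A reliably positive difference for one dimension but not for the other is the kind of asymmetry the abstract reports for talker classification relative to phoneme classification.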
References
Biederman, I., & Checkosky, S. F. (1970). Processing redundant information. Journal of Experimental Psychology, 83, 486–490.
Darwin, C. J., McKeown, J. D., & Kirby, D. (1989). Perceptual compensation for transmission channel and speaker effects on vowel quality. Speech Communication, 8, 221–234.
Diehl, R. L., Kluender, K. R., Foss, D. J., Parker, E. M., & Gernsbacher, M. A. (1987). Vowels as islands of reliability. Journal of Memory & Language, 26, 564–573.
Diehl, R. L., & Walsh, M. A. (1989). An auditory basis for the stimulus-length effect in the perception of stops and glides. Journal of the Acoustical Society of America, 85, 2154–2164.
Eimas, P. D., & Miller, J. L. (1980). Contextual effects in infant speech perception. Science, 209, 1140–1141.
Eimas, P. D., Tartter, V. C., & Miller, J. L. (1981). Dependency relations during the processing of speech. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech (pp. 283–309). Hillsdale, NJ: Erlbaum.
Eimas, P. D., Tartter, V. C., Miller, J. L., & Keuthen, N. J. (1978). Asymmetric dependencies in processing phonetic features. Perception & Psychophysics, 23, 12–20.
Garner, W. R. (1974). The processing of information and structure. Potomac, MD: Erlbaum.
Gordon, P. C., Eberhardt, J. L., & Rueckl, J. G. (1993). Attentional modulation of the phonetic significance of acoustic cues. Cognitive Psychology, 25, 1–42.
Green, K. P., & Miller, J. L. (1985). On the role of visual rate information in phonetic perception. Perception & Psychophysics, 38, 269–276.
Green, K. P., Stevens, E. B., & Kuhl, P. K. (1994). Talker continuity and the use of rate information during phonetic perception. Perception & Psychophysics, 55, 249–260.
Haggard, M. P., Ambler, S., & Callow, M. (1970). Pitch as a voicing cue. Journal of the Acoustical Society of America, 31, 613–617.
Haggard, M. P., Summerfield, A. Q., & Roberts, M. (1981). Psychoacoustical and cultural determinants of phoneme boundaries: Evidence from trading F0 cues in the voiced-voiceless distinction. Journal of Phonetics, 9, 49–62.
Hillenbrand, J., & Houde, R. A. (1995). Vowel recognition: Formants, spectral peaks, and spectral shape. Journal of the Acoustical Society of America, 98, 2949.
Johnson, K. (1990a). Compensation for talker variability and vowel variability in the perception of fricatives. Journal of the Acoustical Society of America, 87, S118.
Johnson, K. (1990b). Contrast and normalization in vowel perception. Journal of Phonetics, 18, 229–254.
Johnson, K. (1990c). The role of perceived speaker identity in F0 normalization of vowels. Journal of the Acoustical Society of America, 88, 642–654.
Jongman, A., & Miller, J. D. (1990). Method of location of burst-onset spectra in the auditory-perceptual space: A study of place of articulation in voiceless stop consonants. Journal of the Acoustical Society of America, 89, 867–873.
Klatt, D. H. (1980). Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America, 67, 971–995.
Kuhl, P. K. (1979). Speech perception in early infancy: Perceptual constancy for spectrally dissimilar vowel categories. Journal of the Acoustical Society of America, 66, 1668–1679.
Kuhl, P. K. (1983). Perception of auditory equivalence classes for speech in early infancy. Infant Behavior & Development, 6, 263–285.
Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255, 606–608.
Ladefoged, P. (1967). Three areas of experimental phonetics. London: Oxford University Press.
Lisker, L. (1975). Is it VOT or a first-formant transition detector? Journal of the Acoustical Society of America, 57, 1547–1551.
Logan, R. J., & Pastore, R. E. (1990). Talker normalization and speaker recognition by humans: One mechanism or two? Journal of the Acoustical Society of America, 87, S70.
Luce, P. A., Feustel, T. C., & Pisoni, D. B. (1983). Capacity demands in short-term memory for synthetic and natural speech. Human Factors, 25, 17–32.
Melara, R. D. (1989). Dimensional interaction between color and pitch. Journal of Experimental Psychology: Human Perception & Performance, 15, 69–79.
Melara, R. D., & Marks, L. E. (1990). Dimensional interactions in language processing: Investigating directions and levels of crosstalk. Journal of Experimental Psychology: Learning, Memory, & Cognition, 16, 539–554.
Miller, J. L. (1981a). Effects of speaking rate on segmental distinctions. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech (pp. 39–74). Hillsdale, NJ: Erlbaum.
Miller, J. L. (1981b). Some effects of speaking rate on phonetic perception. Phonetica, 38, 159–180.
Miller, J. L. (1987a). Mandatory processing in speech perception: A case study. In J. L. Garfield (Ed.), Modularity in knowledge representation and natural-language understanding (pp. 309–322). Cambridge, MA: MIT Press.
Miller, J. L. (1987b). Rate-dependent processing in speech perception. In A. W. Ellis (Ed.), Progress in the psychology of language (pp. 119–157). Hillsdale, NJ: Erlbaum.
Miller, J. L., Aibel, I. L., & Green, K. (1984). On the nature of rate-dependent processing during phonetic perception. Perception & Psychophysics, 35, 5–15.
Miller, J. L., & Baer, T. (1983). Some effects of speaking rate on the production of /b/ & /w/. Journal of the Acoustical Society of America, 73, 1751–1755.
Miller, J. L., & Dexter, E. R. (1988). Effects of speaking rate and lexical status on phonetic perception. Journal of Experimental Psychology: Human Perception & Performance, 14, 369–378.
Miller, J. L., Green, K. P., & Reeves, A. (1986). Speaking rate and segments: A look at the relation between speech production and speech perception for the voicing contrast. Phonetica, 43, 106–115.
Miller, J. L., & Liberman, A. M. (1979). Some effects of later-occurring information on the perception of stop consonant and semi-vowel. Perception & Psychophysics, 25, 457–465.
Miller, J. L., & Volaitis, L. E. (1989). Effect of speaking rate on the perceptual structure of a phonetic category. Perception & Psychophysics, 45, 506–512.
Miller, J. L., & Wayland, S. C. (1993). Limits on the limitations of context conditioned effects in the perception of [b] and [w]. Perception & Psychophysics, 54, 205–210.
Mullennix, J. W., & Pisoni, D. B. (1990). Stimulus variability and processing dependencies in speech perception. Perception & Psychophysics, 47, 379–390.
Mullennix, J. W., Pisoni, D. B., & Martin, C. S. (1988). Some effects of talker variability on spoken word recognition. Journal of the Acoustical Society of America, 85, 365–378.
Nearey, T. (1989). Static, dynamic, and relational properties in vowel perception. Journal of the Acoustical Society of America, 85, 2088–2113.
Nosofsky, R. M. (1986). Attention, similarity and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57.
Nosofsky, R. M., Clark, S. E., & Shin, H. J. (1989). Rules and exemplars in categorization, identification, and recognition. Journal of Experimental Psychology: Learning, Memory, & Cognition, 15, 282–304.
Nusbaum, H. C., & Morin, T. M. (1992). Paying attention to differences among talkers. In Y. Tohkura, Y. Sagisaka, & E. Vatikiotis-Bateson (Eds.), Speech perception, speech production, and linguistic structure (pp. 113–134). Tokyo: OHM.
Nygaard, L. C., Sommers, M. S., & Pisoni, D. B. (1994). Speech perception as a talker-contingent process. Psychological Science, 5, 42–46.
Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of vowels. Journal of the Acoustical Society of America, 24, 175–184.
Pomerantz, J. R., & Garner, W. R. (1973). Stimulus configuration in selective attention tasks. Perception & Psychophysics, 14, 565–569.
Remez, R., Rubin, P., Nygaard, L., & Howell, W. (1987). Perceptual normalization of vowels produced by sinusoidal voices. Journal of Experimental Psychology: Human Perception & Performance, 13, 40–61.
Shinn, P. C., Blumstein, S. E., & Jongman, A. (1985). Limitations of context conditioned effects in the perception of [b] and [w]. Perception & Psychophysics, 38, 397–407.
Sommers, M. S., Nygaard, L. C., & Pisoni, D. B. (1994). The effects of speaking rate and amplitude variability on perceptual identification. Journal of the Acoustical Society of America, 96, 1314–1324.
Summerfield, Q. (1981). On articulatory rate and perceptual constancy in phonetic perception. Journal of Experimental Psychology: Human Perception & Performance, 7, 1074–1095.
Tomiak, G. R., Green, K. P., & Kuhl, P. K. (1991). Phonetic coding and its relationship to talker and rate normalization. Journal of the Acoustical Society of America, 90, S2363.
Tomiak, G. R., Mullennix, J. W., & Sawusch, J. R. (1987). Integral processing of phonemes: Evidence for a phonetic mode of perception. Journal of the Acoustical Society of America, 81, 755–764.
Turvey, M. T. (1973). On peripheral and central processes in vision: Inferences from an information-processing analysis of masking with patterned stimuli. Psychological Review, 80, 1–52.
Volaitis, L. E., & Miller, J. L. (1992). Phonetic prototypes: Influence of place of articulation and speaking rate on the internal structure of voicing categories. Journal of the Acoustical Society of America, 92, 723–735.
Wood, C. C. (1974). Parallel processing of auditory and phonetic information in speech discrimination. Perception & Psychophysics, 15, 501–508.
Additional information
This research was supported in part by Research and Training Grant 1 P60 DC-01409 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health Grant NS-26475 to K.P.G., and National Institutes of Health Grant HD-18286 to P.K.K. We would like to thank Kathryn Fohr, Lisa Kupnis, and Erica Stevens for their help in collecting and analyzing the data. We would also like to thank Joanne Miller and two reviewers for their helpful comments on an earlier version of the manuscript. Please address correspondence to K. P. Green, Psychology Department,
Cite this article
Green, K.P., Tomiak, G.R. & Kuhl, P.K. The encoding of rate and talker information during phonetic perception. Perception & Psychophysics 59, 675–692 (1997). https://doi.org/10.3758/BF03206015