Abstract
Even when the speaker, context, and speaking style are held fixed, the physical properties of naturally spoken utterances of the same speech sound vary considerably. This variability imposes limits on our ability to distinguish between different speech sounds. We present a conceptual framework for relating the ability to distinguish between speech sounds in single-token experiments (in which each speech sound is represented by a single waveform) to resolution in multiple-token experiments. Experimental results indicate that this ability is substantially reduced by an increase in the number of tokens from 1 to 4, but that there is little further reduction when the number of tokens increases to 16. Furthermore, although there is little relation between the ability to distinguish between a given pair of tokens in the multiple-token and the 1-token experiments, there is a modest correlation between the ability to distinguish specific vowel tokens in the 4- and 16-token experiments. These results suggest that while listeners use a multiplicity of cues to distinguish between single tokens of a pair of vowel sounds, so that performance is highly variable both across tokens and listeners, they use a smaller set when distinguishing between populations of naturally produced vowel tokens, so that variability is reduced. The effectiveness of the cues used in the latter case is limited more by internal noise than by the variability of the cues themselves.
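The closing claim about internal noise can be illustrated with a textbook signal-detection sketch. It is not the framework developed in the article; it assumes a single Gaussian cue with equal variance for both vowels, and it assumes that internal noise and token-to-token variability are independent so that their variances add. The function name `d_prime`, its parameters, and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def d_prime(mean_separation, internal_sd, token_sd=0.0):
    """Sensitivity index for two equal-variance Gaussian cue distributions.

    Assumption (not from the article): internal noise and token-to-token
    variability are independent, so their variances add.
    """
    total_sd = math.sqrt(internal_sd ** 2 + token_sd ** 2)
    return mean_separation / total_sd


# Hypothetical values, for illustration only.
# Single-token case: only internal noise limits resolution.
single_token = d_prime(mean_separation=2.0, internal_sd=1.0)

# Multiple-token case: token variability adds to the internal noise.
multi_token = d_prime(mean_separation=2.0, internal_sd=1.0, token_sd=0.5)

print(f"single-token d' = {single_token:.3f}")
print(f"multiple-token d' = {multi_token:.3f}")
```

Under these assumptions, token variability that is small relative to internal noise lowers sensitivity only slightly, which is one way to read "limited more by internal noise than by the variability of the cues themselves." The sketch does not capture the abstract's other finding, that listeners appear to rely on different sets of cues in single-token and multiple-token conditions.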
Additional information
This work was supported by NIH Grants R01-DC00117 and P01-DC00361.
Cite this article
Uchanski, R.M., Braida, L.D. Effects of token variability on our ability to distinguish between vowels. Perception & Psychophysics 60, 533–543 (1998). https://doi.org/10.3758/BF03206044