Abstract
The personal attributes of a talker perceived via acoustic properties of speech are commonly considered to be an extralinguistic message of an utterance. Accordingly, accounts of the perception of talker attributes have emphasized a causal role for aspects of the fundamental frequency and coarse-grain acoustic spectra, as distinct from the detailed acoustic correlates of phonemes. To test this view, we estimated in four experiments the ability of listeners to ascertain the sex or the identity of 5 male and 5 female talkers from sinusoidal replicas of natural utterances, which lack fundamental frequency and natural vocal spectra. Given such radically reduced signals, listeners appeared to identify a talker’s sex according to the central spectral tendencies of the sinusoidal constituents. Under acoustic conditions that prevented listeners from determining the sex of a talker, individual identification from sinewave signals was often successful. These results reveal that perception of a talker’s sex and perception of a talker’s identity are not contingent, and that fine-grain aspects of a talker’s phonetic production can elicit individual identification under conditions that block the perception of voice quality.
Additional information
This research was supported by Grants DC00308 (to R.E.R.) and HD01994 (to Haskins Laboratories) from the National Institutes of Health.
Cite this article
Fellowes, J.M., Remez, R.E. & Rubin, P.E. Perceiving the sex and identity of a talker without natural vocal timbre. Perception & Psychophysics 59, 839–849 (1997). https://doi.org/10.3758/BF03205502
DOI: https://doi.org/10.3758/BF03205502