Talker-specific learning in speech perception

Nygaard, Lynne C.; Pisoni, David B.

doi:10.3758/BF03206860

Published: January 1998

Volume 60, pages 355–376, (1998)

Perception & Psychophysics

Abstract

The effects of perceptual learning of talker identity on the recognition of spoken words and sentences were investigated in three experiments. In each experiment, listeners were trained to learn a set of 10 talkers’ voices and were then given an intelligibility test to assess the influence of learning the voices on the processing of the linguistic content of speech. In the first experiment, listeners learned voices from isolated words and were then tested with novel isolated words mixed in noise. The results showed that listeners who were given words produced by familiar talkers at test showed better identification performance than did listeners who were given words produced by unfamiliar talkers. In the second experiment, listeners learned novel voices from sentence-length utterances and were then presented with isolated words. The results showed that learning a talker’s voice from sentences did not generalize well to identification of novel isolated words. In the third experiment, listeners learned voices from sentence-length utterances and were then given sentence-length utterances produced by familiar and unfamiliar talkers at test. We found that perceptual learning of novel voices from sentence-length utterances improved speech intelligibility for words in sentences. Generalization and transfer from voice learning to linguistic processing was found to be sensitive to the talker-specific information available during learning and test. These findings demonstrate that increased sensitivity to talker-specific information affects the perception of the linguistic properties of speech in isolated words and sentences.

Author information

Authors and Affiliations

Department of Psychology, Emory University, 532 N. Kilgo Circle, Atlanta, GA 30322
Lynne C. Nygaard
Indiana University, Bloomington, Indiana
David B. Pisoni

Corresponding author

Correspondence to Lynne C. Nygaard.

Additional information

This research was supported by NIDCD Research Grant DC-00111 and NIDCD Research Training Grant DC-00012 to Indiana University. Portions of this research were presented at the 125th meeting of the Acoustical Society of America in Ottawa and at the XIIIth International Congress of Phonetic Sciences in Stockholm.

About this article

Cite this article

Nygaard, L.C., Pisoni, D.B. Talker-specific learning in speech perception. Perception & Psychophysics 60, 355–376 (1998). https://doi.org/10.3758/BF03206860

Received: 14 October 1996
Accepted: 04 May 1997
Issue Date: January 1998
DOI: https://doi.org/10.3758/BF03206860
