Talker adaptation or “talker” adaptation? Musical instrument variability impedes pitch perception

Shorey, Anya E.; King, Caleb J.; Theodore, Rachel M.; Stilp, Christian E.

doi:10.3758/s13414-023-02722-4

Published: 31 May 2023

Attention, Perception, & Psychophysics, Volume 85, pages 2488–2501 (2023)

Abstract

Listeners show perceptual benefits (faster and/or more accurate responses) when perceiving speech spoken by a single talker versus multiple talkers, known as talker adaptation. While near-exclusively studied in speech and with talkers, some aspects of talker adaptation might reflect domain-general processes. Music, like speech, is a sound class replete with acoustic variation, such as a multitude of pitch and instrument possibilities. Thus, it was hypothesized that perceptual benefits from structure in the acoustic signal (i.e., hearing the same sound source on every trial) are not specific to speech but rather a general auditory response. Forty nonmusician participants completed a simple musical task that mirrored talker adaptation paradigms. Low- or high-pitched notes were presented in single- and mixed-instrument blocks. Reflecting both music research on pitch and timbre interdependence and mirroring traditional “talker” adaptation paradigms, listeners were faster to make their pitch judgments when presented with a single instrument timbre relative to when the timbre was selected from one of four instruments from trial to trial. A second experiment ruled out the possibility that participants were responding faster to the specific instrument chosen as the single-instrument timbre. Consistent with general theoretical approaches to perception, perceptual benefits from signal structure are not limited to speech.
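The blocked design described in the abstract can be sketched in a few lines of Python. This is only an illustration of the single- versus mixed-instrument structure, not the authors' experiment code: the instrument labels, the number of trials per block, and the function name `make_block` are all hypothetical placeholders, since the abstract does not specify them.

```python
import random

# Placeholder labels: the abstract says four instruments were used but does not name them.
INSTRUMENTS = ["instrument_A", "instrument_B", "instrument_C", "instrument_D"]
PITCHES = ["low", "high"]  # the two response categories in the pitch judgment


def make_block(kind, n_trials, rng, single_instrument=INSTRUMENTS[0]):
    """Build a list of trials for a 'single' or 'mixed' instrument block.

    In a single block every trial uses the same timbre; in a mixed block
    the timbre is drawn from the four instruments on each trial.
    """
    if kind not in ("single", "mixed"):
        raise ValueError(f"unknown block kind: {kind!r}")
    trials = []
    for _ in range(n_trials):
        if kind == "single":
            instrument = single_instrument
        else:
            instrument = rng.choice(INSTRUMENTS)
        trials.append({"instrument": instrument, "pitch": rng.choice(PITCHES)})
    return trials


rng = random.Random(0)  # fixed seed so the sketch is reproducible
single_block = make_block("single", 40, rng)  # 40 trials is an assumed value
mixed_block = make_block("mixed", 40, rng)
print(len({t["instrument"] for t in single_block}), "timbre in the single block")
```

The only difference between the two block types is whether the timbre is held constant, which is the manipulation the reported reaction-time benefit depends on.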

Notes

Because participants in Experiment 1 were recruited without regard to their musical training, an alternative analysis was conducted that included the participants who did not meet the 90% accuracy criterion in the single-instrument block. Participants were still more accurate and faster in the single-instrument block (M_acc = 77.6%, 95% CI [69.9%, 85.3%]; M_RT = 909 ms, 95% CI [833, 985]) than in the mixed-instrument block (M_acc = 71.7%, 95% CI [67.1%, 76.4%]; M_RT = 1,161 ms, 95% CI [1,067, 1,254]). Mixed-effects models with the same architecture as described in the main text showed that the differences between blocks were significant (accuracy model: \(\hat{\beta}\) = −1.25, 95% CI [−2.12, −0.42], Z = −3.17; RT model: \(\hat{\beta}\) = 0.28, 95% CI [0.18, 0.37], t = 6.08). Thus, the same pattern of results was observed with or without the single-instrument block performance criterion.
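As a rough reading aid for the numbers in this note, the short Python sketch below computes the descriptive cost of the mixed block from the reported means and exponentiates the accuracy coefficient. The second step assumes the accuracy model used a logit link (suggested by the reported Z statistic but not stated here); the contrast coding and sign convention are described in the main text, which is not reproduced on this page, so the odds ratio should be read as illustrative only. The function names are hypothetical.

```python
import math

# Descriptive statistics reported in the note (all participants included).
single = {"acc": 77.6, "rt_ms": 909}
mixed = {"acc": 71.7, "rt_ms": 1161}


def block_differences(single, mixed):
    """Return (accuracy cost in percentage points, RT cost in ms) of the mixed block."""
    acc_cost = single["acc"] - mixed["acc"]
    rt_cost = mixed["rt_ms"] - single["rt_ms"]
    return acc_cost, rt_cost


def odds_ratio(beta, ci_low, ci_high):
    """Exponentiate a log-odds coefficient and its CI bounds.

    Assumption: the coefficient comes from a model with a logit link.
    """
    return math.exp(beta), math.exp(ci_low), math.exp(ci_high)


acc_cost, rt_cost = block_differences(single, mixed)
print(f"accuracy cost: {acc_cost:.1f} points, RT cost: {rt_cost} ms")

or_est, or_low, or_high = odds_ratio(-1.25, -2.12, -0.42)
print(f"odds ratio: {or_est:.2f} [{or_low:.2f}, {or_high:.2f}]")
```

Because the whole interval for the exponentiated coefficient lies below 1, it agrees with the note's statement that the accuracy difference between blocks was significant.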

Acknowledgments

The authors thank Lauren Girouard-Hallam, Raina Isaacs, Vitor Neves Guimaraes, and Carolyn Mervis for feedback on an earlier version of this manuscript, and Aidan Shorey, Micki Shorey, and Ralph Shorey for providing instrument photos for Fig. 1. The authors declare no financial support and no conflicts of interest pertaining to this manuscript.

Author information

Authors and Affiliations

Department of Psychological and Brain Sciences, University of Louisville, 317 Life Sciences Building, Louisville, KY, 40272, USA
Anya E. Shorey, Caleb J. King & Christian E. Stilp
Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Unit 1085, Storrs, CT, 06269-1085, USA
Rachel M. Theodore
Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, 337 Mansfield Road, Unit 1272, Storrs, CT, 06269-1272, USA
Rachel M. Theodore

Corresponding author

Correspondence to Anya E. Shorey.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shorey, A.E., King, C.J., Theodore, R.M. et al. Talker adaptation or “talker” adaptation? Musical instrument variability impedes pitch perception. Atten Percept Psychophys 85, 2488–2501 (2023). https://doi.org/10.3758/s13414-023-02722-4

Accepted: 26 April 2023
Published: 31 May 2023
Issue Date: October 2023
DOI: https://doi.org/10.3758/s13414-023-02722-4
