Timbre Recognition and Sound Source Identification

Agus, Trevor R.; Suied, Clara; Pressnitzer, Daniel

doi:10.1007/978-3-030-14832-4_3

Part of the book series: Springer Handbook of Auditory Research (SHAR, volume 69)

Abstract

The ability to recognize many sounds in everyday soundscapes is a useful and impressive feature of auditory perception in which timbre likely plays a key role. This chapter discusses what is known of timbre in the context of sound source recognition. It first surveys the methodologies that have been used to characterize a listener’s ability to recognize sounds and then examines the types of acoustic cues that could underlie the behavioral findings. In some studies, listeners were directly asked to recognize familiar sounds or versions of them that were truncated, filtered, or distorted by other resynthesis methods that preserved some cues but not others. In other studies, listeners were exposed to novel sounds, and the build-up of cues over time or the learning of new cues was tracked. The evidence currently available raises an interesting debate that can be articulated around two qualitatively different hypotheses: Are sounds recognized through distinctive features unique to each sound category (but of which there would need to be many to cover all recognized categories) or rather, are sounds recognized through a relatively small number of perceptual dimensions in which different sounds have their own recognizable position?

Notes

1. Examples of acoustic and auditory sketches are available from https://hal.archives-ouvertes.fr/hal-01250175 (Isnard et al. 2016).

References

Agus TR, Thorpe SJ, Pressnitzer D (2010) Rapid formation of robust auditory memories: insights from noise. Neuron 66:610–618
Agus TR, Suied C, Thorpe SJ, Pressnitzer D (2012) Fast recognition of musical sounds based on timbre. J Acoust Soc Am 131:4124–4133. https://doi.org/10.1121/1.3701865
Agus TR, Paquette S, Suied C et al (2017) Voice selectivity in the temporal voice area despite matched low-level acoustic cues. Sci Rep 7:11526. https://doi.org/10.1038/s41598-017-11684-1
Andrillon T, Kouider S, Agus T, Pressnitzer D (2015) Perceptual learning of acoustic noise generates memory-evoked potentials. Curr Biol 25. https://doi.org/10.1016/j.cub.2015.09.027
Backhaus VH (1932) Über die Bedeutung der Ausgleichsvorgänge in der Akustik. Z Tech Phys 13:31–46
Beauchamp JW (1975) Analysis and synthesis of cornet tones using nonlinear interharmonic relationships. J Aud Eng Soc 23:778–795
Belin P, Zatorre RJ, Lafaille P et al (2000) Voice-selective areas in human auditory cortex. Nature 403:309–312. https://doi.org/10.1038/35002078
Benade AH (1990) Fundamentals of musical acoustics. Dover Publications, New York
Berger KW (1964) Some factors in the recognition of timbre. J Acoust Soc Am 36:1888–1891. https://doi.org/10.1121/1.1919287
Bigand E, Delbé C, Gérard Y, Tillmann B (2011) Categorization of extremely brief auditory stimuli: domain-specific or domain-general processes? PLoS One 6:e27024. https://doi.org/10.1371/journal.pone.0027024
Brimijoin OW, Akeroyd MA, Tilbury E, Porr B (2013) The internal representation of vowel spectra investigated using behavioral response-triggered averaging. J Acoust Soc Am 133:EL118–EL122. https://doi.org/10.1121/1.4778264
Cavanagh P (2005) The artist as neuroscientist. Nature 434:301–307
Chi T, Ru P, Shamma SA (2005) Multiresolution spectrotemporal analysis of complex sounds. J Acoust Soc Am 118:887–906
Clark M, Luce D, Abrams R et al (1963) Preliminary experiments on the aural significance of parts of tones of orchestral instruments and on choral tones. J Aud Eng Soc 11:45–54
Coath M, Denham SL (2005) Robust sound classification through the representation of similarity using response fields derived from stimuli during early experience. Biol Cybern 93:22–30. https://doi.org/10.1007/s00422-005-0560-4
Delorme A, Richard G, Fabre-Thorpe M (2000) Ultra-rapid categorisation of natural scenes does not rely on colour cues: a study in monkeys and humans. Vis Res 40:2187–2200. https://doi.org/10.1016/S0042-6989(00)00083-3
Elliott CA (1975) Attacks and releases as factors in instrument identification. J Res Mus Ed 23:35–40. https://doi.org/10.2307/3345201
Elliott TM, Hamilton LS, Theunissen FE (2013) Acoustic structure of the five perceptual dimensions of timbre in orchestral instrument tones. J Acoust Soc Am 133:389–404. https://doi.org/10.1121/1.4770244
Formisano E, De Martino F, Bonte M, Goebel R (2008) “Who” is saying “what”? Brain-based decoding of human voice and speech. Science 322:970–973
Fritz J, Shamma S, Elhilali M, Klein D (2003) Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nat Neurosci 6:1216–1223
Giordano BL, McAdams S (2006) Material identification of real impact sounds: effects of size variation in steel, glass, wood, and plexiglass plates. J Acoust Soc Am 119:1171–1181. https://doi.org/10.1121/1.2149839
Giordano BL, McAdams S (2010) Sound source mechanics and musical timbre perception: evidence from previous studies. Music Percept 28:155–168
Gosselin F, Schyns PG (2001) Bubbles: a technique to reveal the use of information in recognition tasks. Vis Res 41:2261–2271. https://doi.org/10.1016/S0042-6989(01)00097-9
Gray GW (1942) Phonemic microtomy: the minimum duration of perceptible speech sounds. Commun Monogr 9:75–90
Grey JM (1977) Multidimensional perceptual scaling of musical timbres. J Acoust Soc Am 61:1270–1277
Grey JM, Moorer JA (1977) Perceptual evaluations of synthesised musical instrument tones. J Acoust Soc Am 62:454–462
Gygi B, Kidd GR, Watson CS (2004) Spectral-temporal factors in the identification of environmental sounds. J Acoust Soc Am 115:1252–1265
Gygi B, Kidd GR, Watson CS (2007) Similarity and categorization of environmental sounds. Percept Psychophys 69:839–855
Gygi B, Shafiro V (2011) The incongruency advantage for environmental sounds presented in natural auditory scenes. J Exp Psychol Hum Percept Perform 37:551–565. https://doi.org/10.1037/a0020671
Helmholtz H (1877) Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik, 4th edn. F. Vieweg und Sohn, Braunschweig. English edition: Helmholtz H (1954) On the sensations of tone as a physiological basis for the theory of music (trans: Ellis AJ), 2nd edn. Dover, New York
Hjortkjær J, McAdams S (2016) Spectral and temporal cues for perception of material and action categories in impacted sound sources. J Acoust Soc Am 140:409–420. https://doi.org/10.1121/1.4955181
Hjortkjær J, Kassuba T, Madsen KH et al (2018) Task-modulated cortical representations of natural sound source categories. Cereb Cortex 28:295–306. https://doi.org/10.1093/cercor/bhx263
Hromádka T, Zador AM (2009) Representations in auditory cortex. Curr Opin Neurobiol 19:430–433. https://doi.org/10.1016/S0959-4388(09)00096-8
Isnard V, Taffou M, Viaud-Delmon I, Suied C (2016) Auditory sketches: very sparse representations of sounds are still recognizable. PLoS One 11. https://doi.org/10.1371/journal.pone.0150313
Iverson P, Krumhansl CL (1993) Isolating the dynamic attributes of musical timbre. J Acoust Soc Am 94:2595–2603. https://doi.org/10.1121/1.407371
Kang H, Agus TR, Pressnitzer D (2017) Auditory memory for random time patterns. J Acoust Soc Am 142:2219–2232. https://doi.org/10.1121/1.5007730
Kell AJE, Yamins DLK, Shook EN et al (2018) A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy. Neuron 98:630–644. https://doi.org/10.1016/j.neuron.2018.03.044
Kumar S, Bonnici HM, Teki S et al (2014) Representations of specific acoustic patterns in the auditory cortex and hippocampus. Proc R Soc B Biol Sci 281:20141000. https://doi.org/10.1098/rspb.2014.1000
Lakatos S, McAdams S, Causse R (1997) The representation of auditory source characteristics: simple geometric form. Percept Psychophys 59:1180–1190
Leaver AM, Rauschecker JP (2010) Cortical representation of natural complex sounds: effects of acoustic features and auditory object category. J Neurosci 30:7604–7612. https://doi.org/10.1523/JNEUROSCI.0296-10.2010
Lewis JW, Wightman FL, Brefczynski JA et al (2004) Human brain regions involved in recognizing environmental sounds. Cereb Cortex 14:1008–1021. https://doi.org/10.1093/cercor/bhh061
Luo S, Zhu L, Althoefer K, Liu H (2017) Knock-knock: acoustic object recognition by using stacked denoising autoencoders. Neurocomputing 267:18–24. https://doi.org/10.1016/j.neucom.2017.03.014
McAdams S (1993) Recognition of sound sources and events. In: McAdams S, Bigand E (eds) Thinking in sound: the cognitive psychology of human audition. Oxford University Press, Oxford, pp 146–198
McAdams S, Winsberg S, Donnadieu S et al (1995) Perceptual scaling of synthesized musical timbres: common dimensions, specificities, and latent subject classes. Psychol Res 58:177–192. https://doi.org/10.1007/BF00419633
McAdams S, Beauchamp JW, Meneguzzi S (1999) Discrimination of musical instrument sounds resynthesized with simplified spectrotemporal parameters. J Acoust Soc Am 105:882–897. https://doi.org/10.1121/1.426277
McDermott JH, Simoncelli EP (2011) Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis. Neuron 71:926–940. https://doi.org/10.1016/j.neuron.2011.06.032
McDermott JH, Schemitsch M, Simoncelli EP (2013) Summary statistics in auditory perception. Nat Neurosci 16:493–498. https://doi.org/10.1038/nn.3347
McPherson MJ, McDermott JH (2018) Diversity in pitch perception revealed by task dependence. Nat Hum Behav 2:52–66. https://doi.org/10.1038/s41562-017-0261-8
Newton MJ, Smith LS (2012) A neurally inspired musical instrument classification system based upon the sound onset. J Acoust Soc Am 131:4785–4798. https://doi.org/10.1121/1.4707535
Norman-Haignere S, Kanwisher NG, McDermott JH (2015) Distinct cortical pathways for music and speech revealed by hypothesis-free voxel decomposition. Neuron 88:1281–1296. https://doi.org/10.1016/j.neuron.2015.11.035
Occelli F, Suied C, Pressnitzer D et al (2016) A neural substrate for rapid timbre recognition? Neural and behavioral discrimination of very brief acoustic vowels. Cereb Cortex 26:2483–2496. https://doi.org/10.1093/cercor/bhv071
Ogg M, Slevc LR, Idsardi WJ (2017) The time course of sound category identification: insights from acoustic features. J Acoust Soc Am 142:3459–3473
Patil K, Pressnitzer D, Shamma S, Elhilali M (2012) Music in our ears: the biological bases of musical timbre perception. PLoS Comput Biol 8:e1002759. https://doi.org/10.1371/journal.pcbi.1002759
Pressnitzer D, Agus T, Suied C (2015) Acoustic timbre recognition. In: Jaeger D, Jung R (eds) Encyclopedia of computational neuroscience. Springer, New York, pp 128–133
Risset J-C, Wessel DL (1999) Exploration of timbre by analysis and synthesis. In: The psychology of music, pp 113–169. https://doi.org/10.1016/B978-012213564-4/50006-8
Robinson K, Patterson RD (1995) The duration required to identify the instrument, the octave, or the pitch chroma of a musical note. Music Percept 13:1–15. https://doi.org/10.2307/40285682
Rosch E, Mervis CB (1975) Family resemblances: studies in the internal structure of categories. Cogn Psychol 7:573–605. https://doi.org/10.1016/0010-0285(75)90024-9
Saldanha EL, Corso JF (1964) Timbre cues and the identification of musical instruments. J Acoust Soc Am 36:2021–2026
Shannon RV, Zeng FG, Kamath V et al (1995) Speech recognition with primarily temporal cues. Science 270:303–304
Sidwell A, Summerfield Q (1985) The effect of enhanced spectral contrast on the internal representation of vowel-shaped noise. J Acoust Soc Am 78:495–506
Siedenburg K, Fujinaga I, McAdams S (2016) A comparison of approaches to timbre descriptors in music information retrieval and music psychology. J New Mus Res 45:27–41. https://doi.org/10.1080/09298215.2015.1132737
Siedenburg K, McAdams S (2017) Four distinctions for the auditory “wastebasket” of timbre. Front Psychol 8:1747. https://doi.org/10.3389/fpsyg.2017.01747
Suied C, Agus TR, Thorpe SJ, Pressnitzer D (2013a) Processing of short auditory stimuli: the rapid audio sequential presentation paradigm (RASP). In: Basic aspects of hearing. Springer, New York, pp 443–451
Suied C, Drémeau A, Pressnitzer D, Daudet L (2013b) Auditory sketches: sparse representations of sounds based on perceptual models. In: Aramaki M, Barthet M, Kronland-Martinet R, Ystad S (eds) From sounds to music and emotions. CMMR 2012. Lecture Notes in Computer Science. Springer, Berlin/Heidelberg
Suied C, Agus TR, Thorpe SJ et al (2014) Auditory gist: recognition of very short sounds from timbre cues. J Acoust Soc Am 135:1380–1391. https://doi.org/10.1121/1.4863659
Thoret E, Depalle P, McAdams S (2016) Perceptually salient spectrotemporal modulations for recognition of sustained musical instruments. J Acoust Soc Am 140:EL478–EL483. https://doi.org/10.1121/1.4971204
Thoret E, Depalle P, McAdams S (2017) Perceptually salient regions of the modulation power spectrum for musical instrument identification. Front Psychol 8:587. https://doi.org/10.3389/fpsyg.2017.00587
Tomasino B, Canderan C, Marin D et al (2015) Identifying environmental sounds: a multimodal mapping study. Front Hum Neurosci 9:567. https://doi.org/10.3389/fnhum.2015.00567
Venezia JH, Hickok G, Richards VM (2016) Auditory “bubbles”: efficient classification of the spectrotemporal modulations essential for speech intelligibility. J Acoust Soc Am 140:1072–1088. https://doi.org/10.1121/1.4960544
Viemeister NF, Wakefield GH (1991) Temporal integration and multiple looks. J Acoust Soc Am 90:858–865
Warren WH, Verbrugge RR (1984) Auditory perception of breaking and bouncing events: a case study in ecological acoustics. J Exp Psychol Hum Percept Perform 10:704–712. https://doi.org/10.1037/0096-1523.10.5.704
Wedin L, Goude G (1972) Dimension analysis of the perception of instrumental timbre. Scand J Psychol 13:228–240. https://doi.org/10.1111/j.1467-9450.1972.tb00071.x
Wun S, Horner A, Wu B (2014) Effect of spectral centroid manipulation on discrimination and identification of instrument timbres. J Aud Eng Soc 62:575–583. https://doi.org/10.17743/jaes.2014.0035
Yonelinas AP (2002) The nature of recollection and familiarity: a review of 30 years of research. J Mem Lang 46:441–517. https://doi.org/10.1006/jmla.2002.2864

Acknowledgements

DP was supported by the ANR grants ANR-10-LABX-0087 and ANR-10-IDEX-0001-02, and by the European Research Council (ERC ADAM No. 295603).

Compliance with Ethics Requirements

Trevor Agus declares that he has no conflict of interest.

Clara Suied declares that she has no conflict of interest.

Daniel Pressnitzer declares that he has no conflict of interest.

Author information

Authors and Affiliations

School of Arts, English and Languages, Queen’s University Belfast, Belfast, UK
Trevor R. Agus
Département de Neurosciences et Sciences Cognitives, Institut de recherche biomédicale des armées, Brétigny-sur-Orge, France
Clara Suied
Laboratoire des Systèmes Perceptifs, Département d’études Cognitives, Paris Science & Lettres – PSL University, Centre national de la recherche scientifique, Paris, France
Daniel Pressnitzer

Corresponding author

Correspondence to Trevor R. Agus.

Editor information

Editors and Affiliations

Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
Kai Siedenburg
Audio Communication Group, Technische Universität Berlin, Berlin, Germany
Charalampos Saitis
Schulich School of Music, McGill University, Montreal, QC, Canada
Stephen McAdams
Department of Biology, University of Maryland, College Park, MD, USA
Arthur N. Popper
Department of Psychology, Loyola University Chicago, Chicago, IL, USA
Richard R. Fay

About this chapter

Cite this chapter

Agus, T.R., Suied, C., Pressnitzer, D. (2019). Timbre Recognition and Sound Source Identification. In: Siedenburg, K., Saitis, C., McAdams, S., Popper, A., Fay, R. (eds) Timbre: Acoustics, Perception, and Cognition. Springer Handbook of Auditory Research, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-030-14832-4_3

DOI: https://doi.org/10.1007/978-3-030-14832-4_3
Published: 08 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14831-7
Online ISBN: 978-3-030-14832-4
eBook Packages: Biomedical and Life Sciences (R0)