Scale-Space Theory for Auditory Signals

Lindeberg, Tony; Friberg, Anders

doi:10.1007/978-3-319-18461-6_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9087)

Included in the following conference series:

International Conference on Scale Space and Variational Methods in Computer Vision

Abstract

We show how the axiomatic structure of scale-space theory can be applied to the auditory domain and be used for deriving idealized models of auditory receptive fields via scale-space principles. For defining a time-frequency transformation of a purely temporal signal, it is shown that the scale-space framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal window functions. Applied to the definition of a second layer of receptive fields from the spectrogram, it is shown that the scale-space framework leads to two canonical families of spectro-temporal receptive fields, using a combination of Gaussian filters over the logspectral domain with either Gaussian filters or a cascade of first-order integrators over the temporal domain. These spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Such idealized models of auditory receptive fields respect auditory invariances, can be used for computing basic auditory features for audio processing and lead to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields in the inferior colliculus (ICC) and the primary auditory cortex (A1).
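As a rough illustration of the temporal window functions named in the abstract, the following minimal sketch evaluates a Gabor kernel (Gaussian window times a complex sinusoid) and a Gammatone kernel (the time-causal window obtained by coupling n first-order integrators with equal time constants, times a complex sinusoid). The function names, the parameter names (`omega`, `sigma`, `mu`, `n`, `dt`), the unit-area normalization, the sign of the complex exponent and the backward-Euler discretization are all assumptions made for this example and are not taken from the paper.

```python
import cmath
import math


def gabor_kernel(t, omega, sigma):
    """Gaussian window (std. dev. sigma) modulated at angular frequency omega."""
    window = math.exp(-t * t / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)
    return window * cmath.exp(-1j * omega * t)


def gammatone_kernel(t, omega, mu, n):
    """Time-causal window of n coupled first-order integrators (time constant mu),
    modulated at angular frequency omega. Zero for t < 0."""
    if t < 0.0:
        return 0j
    # Convolving n truncated exponentials exp(-t/mu)/mu gives a Gamma-type envelope.
    window = t ** (n - 1) * math.exp(-t / mu) / (mu ** n * math.gamma(n))
    return window * cmath.exp(-1j * omega * t)


def integrator_cascade(signal, mu, n, dt):
    """Pass a sampled signal through n discrete first-order recursive filters.

    Each stage is a backward-Euler discretization of mu * y'(t) = x(t) - y(t)
    (an assumed discretization, chosen here for simplicity)."""
    a = dt / (mu + dt)
    out = list(signal)
    for _ in range(n):
        state = 0.0
        for i, x in enumerate(out):
            state += a * (x - state)  # recursive update, uses past samples only
            out[i] = state
    return out
```

Feeding a unit impulse such as `[1.0] + [0.0] * 999` to `integrator_cascade` with `mu=0.01`, `n=4` and `dt=0.001` yields a discrete approximation of the Gammatone envelope, which makes the trade-off mentioned in the abstract visible: a larger `n` or `mu` gives a smoother, more frequency-selective window at the cost of a longer temporal delay.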

Support from the Swedish Research Council contracts 2010-4766, 2012-4685 and 2014-4083, a KTH CSC Small Visionary Project and the EU project SkAT-VG FET-Open grant 618067 is gratefully acknowledged.

Author information

Authors and Affiliations

Department of Computational Biology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
Tony Lindeberg
Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden
Anders Friberg

Corresponding author

Correspondence to Tony Lindeberg.

Editor information

Editors and Affiliations

University of Bordeaux, Talence, France
Jean-François Aujol
ENS Cachan, Cachan, France
Mila Nikolova
University of Bordeaux, Talence, France
Nicolas Papadakis

About this paper

Cite this paper

Lindeberg, T., Friberg, A. (2015). Scale-Space Theory for Auditory Signals. In: Aujol, J.-F., Nikolova, M., Papadakis, N. (eds) Scale Space and Variational Methods in Computer Vision. SSVM 2015. Lecture Notes in Computer Science, vol 9087. Springer, Cham. https://doi.org/10.1007/978-3-319-18461-6_1

DOI: https://doi.org/10.1007/978-3-319-18461-6_1
Published: 28 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18460-9
Online ISBN: 978-3-319-18461-6
eBook Packages: Computer Science, Computer Science (R0)