Abstract
Models of auditory processing, particularly of speech, face many difficulties. These include variability among speakers, variability in speech rate, and the need for robustness to moderate distortions such as time compression. We constructed a system based on ensembles of feature detectors derived from fragments of an onset-sensitive sound representation. The method is based on the idea of ‘spectro-temporal response fields’ and uses convolution to measure the similarity through time between each feature detector and the stimulus. The output of the ensemble was used to derive segmentation cues and patterns of response, which were then used to train an artificial neural network (ANN) classifier. This allowed us to estimate a lower bound for the mutual information between the class of the input and the class of the output. Our results suggest that the output of our system carries significant information, and that this is robust with respect to the exact choice of feature set, time compression of the stimulus, and speaker variation. In addition, the robustness to time compression has features in common with human psychophysics. Similar experiments using feature detectors derived from fragments of non-speech sounds performed less well. This result is interesting in light of findings of aberrant cortical development in animals exposed to impoverished auditory environments during the developmental phase.
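The two computations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the list-of-channels data layout, and the toy numbers are all assumptions made for illustration. The first function slides a spectro-temporal fragment along the time axis of a stimulus and takes an inner product at each offset, which is equivalent to convolution with the time-reversed fragment. The second estimates the mutual information between true and predicted class from a classifier's confusion matrix. Because the classifier's decision is computed from the system's output, this value cannot exceed the information in that output, which is the sense in which it can serve as a lower bound.

```python
import math


def similarity_through_time(stimulus, detector):
    """Slide a spectro-temporal fragment along the time axis of a stimulus.

    Both arguments are lists of frequency channels, each channel a list of
    samples through time (an assumed layout). Returns one similarity value
    per valid time offset.
    """
    n_time = len(stimulus[0])
    width = len(detector[0])
    out = []
    for t in range(n_time - width + 1):
        total = 0.0
        # Sum the products over every channel and every sample of the fragment.
        for stim_chan, det_chan in zip(stimulus, detector):
            for k in range(width):
                total += stim_chan[t + k] * det_chan[k]
        out.append(total)
    return out


def mutual_information_bits(confusion):
    """Mutual information in bits between true and predicted class.

    `confusion` is a matrix of counts with one row per true class and one
    column per predicted class.
    """
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    mi = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count > 0:
                p_joint = count / total
                # Ratio of the joint probability to the product of marginals.
                ratio = count * total / (row_sums[i] * col_sums[j])
                mi += p_joint * math.log2(ratio)
    return mi


# Toy usage with made-up numbers: a one-channel stimulus and fragment.
response = similarity_through_time([[0, 1, 2, 0]], [[1, 2]])
perfect = mutual_information_bits([[5, 0], [0, 5]])
chance = mutual_information_bits([[5, 5], [5, 5]])
```

In the toy example the response peaks at the offset where the fragment matches the stimulus, a perfectly classified two-class problem yields one bit, and a classifier at chance yields zero bits.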
Cite this article
Coath, M., Denham, S. Robust sound classification through the representation of similarity using response fields derived from stimuli during early experience. Biol Cybern 93, 22–30 (2005). https://doi.org/10.1007/s00422-005-0560-4