Abstract
This paper presents a spiking neural network (SNN) model built from leaky integrate-and-fire (LIF) neurons for sound recognition, offering a way to simulate brain processes. Neural coding, which converts external stimuli into spikes, and learning, which recognizes different patterns, are the two key components of an SNN model. Using features extracted from the time-frequency representation of sound, we propose a time-frequency encoding method that retains sufficient information about the original sound and generates spikes from the extracted features. The generated spikes are then used to train the SNN with a biologically plausible supervised synaptic learning rule so that it can efficiently perform various classification tasks. Experiments on the RWCP database demonstrate that the proposed encoding and learning methods achieve robust sound recognition performance across a variety of noise conditions.
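As a rough illustration of the LIF neuron mentioned above, the following minimal sketch integrates a single neuron with a forward-Euler step. It is not the paper's implementation: the function name `simulate_lif`, the parameter values (`tau_m`, `v_thresh`, `v_reset`, `dt`) and the constant input currents are all assumptions chosen only to show the leak, threshold and reset behaviour.

```python
def simulate_lif(input_current, dt=1.0, tau_m=20.0,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Euler-integrate one LIF neuron (illustrative parameters, not from the paper).

    Returns the list of time-step indices at which the neuron spiked and
    the membrane-potential trace recorded after each step.
    """
    v = v_rest
    spikes = []
    trace = []
    for step, i_in in enumerate(input_current):
        # Leak toward rest plus input drive: tau_m * dv/dt = -(v - v_rest) + I
        v += dt / tau_m * (-(v - v_rest) + i_in)
        if v >= v_thresh:
            # Threshold crossed: record a spike and reset the potential.
            spikes.append(step)
            v = v_reset
        trace.append(v)
    return spikes, trace


# Hypothetical inputs: a supra-threshold and a sub-threshold constant current.
spikes_strong, trace_strong = simulate_lif([1.5] * 200)
spikes_weak, trace_weak = simulate_lif([0.5] * 200)
print("strong input spike count:", len(spikes_strong))
print("weak input spike count:", len(spikes_weak))
```

With a constant drive whose steady-state potential lies above the threshold the neuron fires regularly, while a drive that settles below the threshold never produces a spike. This is the sense in which spike timing, rather than a continuous activation, carries the information in such a model.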
This work was supported by the National Natural Science Foundation of China under grant number 61673283.
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xiao, R., Yan, R., Tang, H., Tan, K.C. (2017). A Spiking Neural Network Model for Sound Recognition. In: Sun, F., Liu, H., Hu, D. (eds) Cognitive Systems and Signal Processing. ICCSIP 2016. Communications in Computer and Information Science, vol 710. Springer, Singapore. https://doi.org/10.1007/978-981-10-5230-9_57
DOI: https://doi.org/10.1007/978-981-10-5230-9_57
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5229-3
Online ISBN: 978-981-10-5230-9
eBook Packages: Computer Science, Computer Science (R0)