Bio-inspired Multi-layer Spiking Neural Network Extracts Discriminative Features from Speech Signals

Tavanaei, Amirhossein; Maida, Anthony

doi:10.1007/978-3-319-70136-3_95

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10639)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Spiking neural networks (SNNs) enable power-efficient implementations because of their sparse, spike-based coding scheme. This paper develops a bio-inspired SNN that uses unsupervised learning to extract discriminative features from speech signals; these features can subsequently be used in a classifier. The architecture consists of a spiking convolutional/pooling layer followed by a fully connected spiking layer for feature discovery. The convolutional layer, built from leaky integrate-and-fire (LIF) neurons, represents primary acoustic features. The fully connected layer is equipped with a probabilistic spike-timing-dependent plasticity (STDP) learning rule and represents the discriminative features through probabilistic LIF neurons. To assess the discriminative power of the learned features, they are used in a hidden Markov model (HMM) for spoken digit recognition. The experimental results show performance above 96%, which compares favorably with popular statistical feature extraction methods. Our results provide a novel demonstration of unsupervised feature acquisition in an SNN.
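
For readers unfamiliar with the LIF neurons named in the abstract, the following is a minimal, generic sketch of a discrete-time leaky integrate-and-fire unit. It is not the authors' model: the function name `simulate_lif`, the parameter values (`tau`, `v_thresh`, `v_reset`, `dt`), and the simple Euler update are all illustrative assumptions, and the probabilistic STDP rule and the convolutional/pooling structure are not represented.

```python
import numpy as np

def simulate_lif(input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Simulate one generic LIF neuron and return its 0/1 spike train.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    v = v_reset
    spikes = np.zeros(len(input_current), dtype=int)
    for t, i_t in enumerate(input_current):
        # Euler step: the membrane potential leaks toward 0 and integrates the input.
        v += dt * (-v / tau + i_t)
        if v >= v_thresh:
            # Threshold crossing: emit a spike and reset the potential.
            spikes[t] = 1
            v = v_reset
    return spikes

# A strong constant input drives periodic spiking; a weak one never reaches threshold.
strong = simulate_lif(np.full(100, 0.2))
weak = simulate_lif(np.full(100, 0.01))
print(strong.sum(), weak.sum())
```

The sparse output (mostly zeros, with occasional ones) is the kind of spike-based code that the abstract associates with power efficiency.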

Author information

Authors and Affiliations

The Center for Advanced Computer Studies, Bio-inspired AI Lab, University of Louisiana at Lafayette, Lafayette, LA, 70503, USA
Amirhossein Tavanaei & Anthony Maida

Corresponding author

Correspondence to Amirhossein Tavanaei.

Editor information

Editors and Affiliations

Guangdong University of Technology, Guangzhou, China
Derong Liu
Guangdong University of Technology, Guangzhou, China
Shengli Xie
South China University of Technology, Guangzhou, China
Yuanqing Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Dongbin Zhao
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
El-Sayed M. El-Alfy

About this paper

Cite this paper

Tavanaei, A., Maida, A. (2017). Bio-inspired Multi-layer Spiking Neural Network Extracts Discriminative Features from Speech Signals. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_95

DOI: https://doi.org/10.1007/978-3-319-70136-3_95
Published: 26 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70135-6
Online ISBN: 978-3-319-70136-3
eBook Packages: Computer Science, Computer Science (R0)
