Bio-inspired Multi-layer Spiking Neural Network Extracts Discriminative Features from Speech Signals

  • Conference paper
Neural Information Processing (ICONIP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10639)

Abstract

Spiking neural networks (SNNs) enable power-efficient implementations due to their sparse, spike-based coding scheme. This paper develops a bio-inspired SNN that uses unsupervised learning to extract discriminative features from speech signals, which can subsequently be used in a classifier. The architecture consists of a spiking convolutional/pooling layer followed by a fully connected spiking layer for feature discovery. The convolutional layer of leaky integrate-and-fire (LIF) neurons represents primary acoustic features. The fully connected layer is equipped with a probabilistic spike-timing-dependent plasticity (STDP) learning rule and represents the discriminative features through probabilistic LIF neurons. To assess the discriminative power of the learned features, they are used in a hidden Markov model (HMM) for spoken digit recognition. The experimental results show performance above 96%, which compares favorably with popular statistical feature extraction methods. Our results provide a novel demonstration of unsupervised feature acquisition in an SNN.
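
As a rough illustration of the leaky integrate-and-fire behavior the abstract refers to, the following is a minimal discrete-time sketch. It is not the paper's model: the function name `simulate_lif`, the multiplicative `leak` factor, the `threshold`, and the reset-to-`v_reset` rule are all assumptions chosen for brevity, and the probabilistic firing and STDP learning used in the fully connected layer are not shown.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, v_reset=0.0):
    """Simulate one discrete-time leaky integrate-and-fire neuron.

    All parameter names and default values are illustrative assumptions,
    not values taken from the paper.
    Returns a list with 1 at time steps where the neuron spikes, else 0.
    """
    v = v_reset  # membrane potential
    spikes = []
    for current in input_current:
        # Leak: the potential decays toward zero; integrate: add the input.
        v = leak * v + current
        if v >= threshold:
            # Fire: emit a spike and reset the membrane potential.
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    # A constant input strong enough to overcome the leak yields sparse,
    # regularly spaced spikes.
    print(simulate_lif([0.4] * 6))
    # A weak input settles below threshold, so the neuron stays silent.
    print(sum(simulate_lif([0.05] * 200)))
```

With the weak input, the potential settles at 0.05 / (1 - 0.9) = 0.5, which is below the threshold, so the leak alone prevents firing. The output stays sparse because the neuron is silent unless its input is strong enough to outpace the leak.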

Corresponding author

Correspondence to Amirhossein Tavanaei.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tavanaei, A., Maida, A. (2017). Bio-inspired Multi-layer Spiking Neural Network Extracts Discriminative Features from Speech Signals. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_95

  • DOI: https://doi.org/10.1007/978-3-319-70136-3_95

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70135-6

  • Online ISBN: 978-3-319-70136-3

  • eBook Packages: Computer Science, Computer Science (R0)
