Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks

Peralta, Ivan; Odetti, Nanci; Rufiner, Hugo L.

doi:10.1007/978-3-031-48309-7_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14338)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

Speaker-independent automatic digit recognition requires high accuracy and robustness to noise and variability. Spiking neural networks (SNNs) are a promising model for this task, as they can mimic the temporal dynamics and energy efficiency of the human auditory system. However, SNNs are difficult to train and often require complex learning algorithms. Spoken digits provide a useful benchmark task for evaluating new SNN architectures, and performance on small-vocabulary tasks is an important first step before scaling up to more complex recognition scenarios. In this paper, we propose using an extreme learning layer (ELL) as a simple and effective way to improve the learning of SNNs for spoken digit recognition. The ELL is a randomly generated layer that maps the input features to the next layer without any further adjustment. The output layer is then trained by entropy minimization. We show that the ELL can boost the performance of the SNN on the TIDIGITS benchmark data set. We also compare our approach with several state-of-the-art methods, achieving competitive results with lower computational cost and complexity. The proposed approach also shows good robustness to additive noise.
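
As a rough illustration of the fixed random layer described in the abstract, the following Python sketch draws weights once and never updates them. It is a simplified analogue under stated assumptions, not the authors' method: it uses rate-based tanh units instead of spiking neurons, it omits the entropy-minimization training of the output layer, and the names `make_random_layer` and `project`, the layer sizes, and the uniform weight range are hypothetical choices made only for this example.

```python
import math
import random


def make_random_layer(n_inputs, n_hidden, seed=0):
    """Draw weights and biases once; they are never trained afterwards.

    Assumption: uniform weights in [-1, 1]; the paper's actual
    initialization is not given in the abstract.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    return weights, biases


def project(features, layer):
    """Map one feature vector through the fixed random layer.

    Assumption: tanh units stand in for the spiking neurons of the paper.
    """
    weights, biases = layer
    return [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
            for row, b in zip(weights, biases)]


# Hypothetical sizes: 4 input features mapped to 8 hidden units.
layer = make_random_layer(n_inputs=4, n_hidden=8, seed=42)
hidden = project([0.1, -0.3, 0.7, 0.0], layer)
print(len(hidden))
```

Only a layer placed after `project` would be trained in such a setup, which is what keeps this kind of scheme cheap compared with training every layer.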

Notes

1. The internal summation with the upper limit \(\infty\) indicates that the summation is performed until the entire spike encoding of the digit is completed.

Acknowledgements

We would like to express our gratitude to the National University of Entre Ríos (UNER) for the support and resources provided within the framework of the research and development project PID6187, and to the National University of Litoral for its support through project CAID 50620190100151LI, which enabled us to conduct this investigation.

Author information

Authors and Affiliations

Laboratorio de Cibernética, Facultad de Ingeniería UNER, Oro Verde, Argentina
Ivan Peralta, Nanci Odetti & Hugo L. Rufiner
Instituto Señales, Sistemas e Inteligencia Computacional, sinc(i) UNL-CONICET, Santa Fe, Argentina
Hugo L. Rufiner

Corresponding author

Correspondence to Hugo L. Rufiner.

Editor information

Editors and Affiliations

St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Koneru Lakshmaiah Education Foundation, Vaddeswaram, India
K. Samudravijaya
Indian Institute of Information Technology Dharwad, Dharwad, India
K. T. Deepak
Indian Institute of Technology Dharwad, Dharwad, India
Rajesh M. Hegde
KIIT Group of Colleges, Gurugram, India
Shyam S. Agrawal
Indian Institute of Technology Dharwad, Dharwad, India
S. R. Mahadeva Prasanna

About this paper

Cite this paper

Peralta, I., Odetti, N., Rufiner, H.L. (2023). Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14338. Springer, Cham. https://doi.org/10.1007/978-3-031-48309-7_1

DOI: https://doi.org/10.1007/978-3-031-48309-7_1
Published: 22 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48308-0
Online ISBN: 978-3-031-48309-7
eBook Packages: Computer Science, Computer Science (R0)
