
Assessment of dysarthric speech using Elman back propagation network (recurrent network) for speech recognition

Published in: International Journal of Speech Technology

Abstract

In this paper, the role of a speech recognition system in the assessment of dysarthric speech is studied, based on a method called the Elman back propagation network (EBN). Dysarthria is a neurological disability that impairs control of the motor speech articulators. The speech intelligibility of persons with dysarthria can range from low (2 %) to high (95 %). The EBN is a recurrent network: a fully connected neural network is built in which the speech characteristics are represented simultaneously by neuron activation states, and it is trained with an efficient self-supervised algorithm. For parametric representation of the speech signal, glottal features are used along with mel-frequency cepstral coefficients (MFCCs). The outputs for the two feature sets are then compared after evaluation using different neural networks and modeling methods. The proposed method is evaluated on a subset of the Universal Access Research database consisting of 9 dysarthric speakers out of 19, each uttering 100 words repeated 3 times. The promising performance of the proposed system suggests it can help professionals who work with persons with voice disorders.
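The paper does not publish its implementation, but the core idea of an Elman network, a hidden layer whose previous activations are fed back as "context units" at the next time step, can be sketched as follows. This is a minimal illustrative forward pass only, assuming 13-dimensional MFCC frames as input; all dimensions, weights, and the `elman_forward` function are hypothetical choices for the example, not the authors' configuration.

```python
import numpy as np

def elman_forward(x_seq, W_xh, W_hh, W_hy, b_h, b_y):
    """Forward pass of a minimal Elman (simple recurrent) network.

    At each time step the hidden state from the previous step is fed
    back (the "context units") alongside the current input frame,
    which is what lets the network model temporal context in the
    sequence of speech feature vectors.
    """
    h = np.zeros(W_hh.shape[0])          # context units start at zero
    outputs = []
    for x in x_seq:
        # hidden activation combines the current frame and previous context
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        # softmax output over target classes (e.g. word labels)
        z = W_hy @ h + b_y
        e = np.exp(z - z.max())
        outputs.append(e / e.sum())
    return np.array(outputs)

# toy dimensions: 13 MFCCs per frame, 8 hidden units, 3 output classes
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 13, 8, 3
params = (rng.normal(scale=0.1, size=(n_hid, n_in)),
          rng.normal(scale=0.1, size=(n_hid, n_hid)),
          rng.normal(scale=0.1, size=(n_out, n_hid)),
          np.zeros(n_hid), np.zeros(n_out))
frames = rng.normal(size=(5, n_in))      # 5 feature frames
probs = elman_forward(frames, *params)
print(probs.shape)                       # (5, 3): one distribution per frame
```

Training such a network with back propagation (through time, or truncated to one step as in the classic Elman formulation) would adjust the weight matrices; only the forward recurrence is shown here.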




Author information


Corresponding author

Correspondence to V. Shenbagalakshmi.

About this article

Cite this article

Selva Nidhyananthan, S., Shantha Selva Kumari, R., & Shenbagalakshmi, V. Assessment of dysarthric speech using Elman back propagation network (recurrent network) for speech recognition. Int J Speech Technol 19, 577–583 (2016). https://doi.org/10.1007/s10772-016-9349-1
