
Assessment of dysarthric speech using Elman back propagation network (recurrent network) for speech recognition

Published in: International Journal of Speech Technology

Abstract

In this paper, the role of a speech recognition system in the assessment of dysarthric speech is studied, based on a method called the Elman back propagation network (EBN). Dysarthria is a neurological disability that impairs control of the motor speech articulators. The speech intelligibility of persons with dysarthria can range from low (2 %) to high (95 %). The EBN is a recurrent network: a fully connected neural network is built in which the speech characteristics are represented simultaneously by neuron activation states, and it is trained with an efficient self-supervised algorithm. For parametric representation of the speech signal, glottal features are used along with mel-frequency cepstral coefficients (MFCCs). The outputs for the two feature sets are then compared after evaluation using different neural networks and modeling methods. The proposed method is evaluated on a subset of the Universal Access Research database consisting of 9 dysarthric speakers out of 19, each uttering 100 words repeated 3 times. The promising performance of the proposed system suggests it can help professionals who work with persons with voice disorders.
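The paper does not publish its implementation, but the core idea of an Elman network, a hidden layer whose previous activations are fed back as "context units" at the next time step, can be sketched as follows. This is a minimal illustrative forward pass only, assuming 13-dimensional MFCC frames as input; all dimensions, weights, and the `elman_forward` function are hypothetical choices for the example, not the authors' configuration.

```python
import numpy as np

def elman_forward(x_seq, W_xh, W_hh, W_hy, b_h, b_y):
    """Forward pass of a minimal Elman (simple recurrent) network.

    At each time step the hidden state from the previous step is fed
    back (the "context units") alongside the current input frame,
    which is what lets the network model temporal context in the
    sequence of speech feature vectors.
    """
    h = np.zeros(W_hh.shape[0])          # context units start at zero
    outputs = []
    for x in x_seq:
        # hidden activation combines the current frame and previous context
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        # softmax output over target classes (e.g. word labels)
        z = W_hy @ h + b_y
        e = np.exp(z - z.max())
        outputs.append(e / e.sum())
    return np.array(outputs)

# toy dimensions: 13 MFCCs per frame, 8 hidden units, 3 output classes
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 13, 8, 3
params = (rng.normal(scale=0.1, size=(n_hid, n_in)),
          rng.normal(scale=0.1, size=(n_hid, n_hid)),
          rng.normal(scale=0.1, size=(n_out, n_hid)),
          np.zeros(n_hid), np.zeros(n_out))
frames = rng.normal(size=(5, n_in))      # 5 feature frames
probs = elman_forward(frames, *params)
print(probs.shape)                       # (5, 3): one distribution per frame
```

Training such a network with back propagation (through time, or truncated to one step as in the classic Elman formulation) would adjust the weight matrices; only the forward recurrence is shown here.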




Author information


Corresponding author

Correspondence to V. Shenbagalakshmi.

About this article

Cite this article

Selva Nidhyananthan, S., Shantha Selva Kumari, R., & Shenbagalakshmi, V. Assessment of dysarthric speech using Elman back propagation network (recurrent network) for speech recognition. Int J Speech Technol 19, 577–583 (2016). https://doi.org/10.1007/s10772-016-9349-1
