Recognizing Connected Digit Strings Using Neural Networks
This paper discusses the use of feed-forward and recurrent Artificial Neural Networks (ANNs) in whole-word speech recognition. A Long Short-Term Memory (LSTM) network was trained for speaker-independent recognition of arbitrary sequences of connected digits in the Polish language, using only acoustic features extracted from speech. It is also shown how to effectively convert the analog network output into binary information about the recognized words. The parameters of this conversion are fine-tuned using artificial evolution.
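The conversion from analog network output to binary word decisions can be illustrated with a minimal sketch. The following Python code is a hypothetical illustration, not the paper's actual method: it assumes the network emits one activation per word class per frame, declares a word recognized when its activation stays above a threshold for a minimum number of frames, and tunes those two conversion parameters (`threshold`, `min_frames`) with a simple evolutionary search against a known target sequence.

```python
import random

def outputs_to_words(activations, threshold, min_frames):
    """Convert per-frame analog outputs (one value per word class per frame)
    into a binary word sequence: a word is emitted when its activation stays
    above `threshold` for at least `min_frames` consecutive frames.
    (Hypothetical conversion rule, for illustration only.)"""
    words = []
    active_class, run = None, 0
    for frame in activations:
        best = max(range(len(frame)), key=lambda i: frame[i])
        if frame[best] >= threshold:
            if best == active_class:
                run += 1
            else:
                active_class, run = best, 1
            if run == min_frames:  # emit once per contiguous activation
                words.append(best)
        else:
            active_class, run = None, 0
    return words

def evolve(activations, target, generations=50, pop=20, seed=0):
    """Tune (threshold, min_frames) with a simple elitist evolutionary loop,
    maximizing a (hypothetical) fitness that penalizes word-sequence errors."""
    rng = random.Random(seed)

    def fitness(t, m):
        got = outputs_to_words(activations, t, m)
        return -(abs(len(got) - len(target))
                 + sum(a != b for a, b in zip(got, target)))

    best = (0.5, 3)  # initial guess for (threshold, min_frames)
    for _ in range(generations):
        # Mutate the current best; keep it in the pool (elitism).
        cand = [(min(0.99, max(0.01, best[0] + rng.gauss(0, 0.1))),
                 max(1, best[1] + rng.choice([-1, 0, 1])))
                for _ in range(pop)] + [best]
        best = max(cand, key=lambda p: fitness(*p))
    return best
```

On synthetic activations where class 0 is active for a few frames and then class 1, `outputs_to_words` yields the sequence `[0, 1]`, and `evolve` keeps or finds parameters that preserve that decoding.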
Keywords: Hidden Markov Model, Word Recognition, Speech Recognition, Memory Cell, Recurrent Neural Network