Application of LSTM Neural Networks in Language Modelling

  • Daniel Soutner
  • Luděk Müller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8082)

Abstract

Artificial neural networks have become the state of the art for language modelling on small corpora. While feed-forward networks can take into account only a fixed context length when predicting the next word, recurrent neural networks (RNNs) can exploit all previous words. Because RNNs are difficult to train, the Long Short-Term Memory (LSTM) neural network architecture offers a way forward.
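
For readers unfamiliar with the architecture, a standard formulation of the LSTM cell follows (with the forget gate introduced in Gers's thesis; the symbols W, U and b denote generic input weights, recurrent weights and biases, and are notational assumptions rather than the paper's own):

    i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)    (input gate)
    f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)    (forget gate)
    o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)    (output gate)
    c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)
    h_t = o_t \odot \tanh(c_t)

Here \sigma is the logistic sigmoid and \odot denotes element-wise multiplication. The additive update of the cell state c_t lets the error signal survive across long time lags, which is what alleviates the vanishing-gradient problem that hampers plain RNN training.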

In this work, we apply an LSTM network with extensions to a language modelling task on Czech spontaneous phone calls. Experiments show considerable improvements in perplexity and in the word error rate (WER) of the recognition system over an n-gram baseline.
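
For context, perplexity (PPL) is the standard intrinsic measure used in such comparisons; over a test corpus w_1, \ldots, w_N it is conventionally defined as

    \mathrm{PPL} = \exp\Bigl( -\frac{1}{N} \sum_{i=1}^{N} \ln P(w_i \mid w_1, \ldots, w_{i-1}) \Bigr)

so a lower value means the model assigns higher probability to the held-out text, while WER measures the recognition errors of the speech recogniser itself. (This is the conventional definition, not a formula quoted from the paper.)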

Keywords

language modelling · recurrent neural networks · LSTM neural networks

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Daniel Soutner (1)
  • Luděk Müller (1)

  1. Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia, Plzeň, Czech Republic
