Abstract
Conventional n-gram language models for automatic speech recognition fail to capture long-distance dependencies and are brittle with respect to changes in the input domain. We propose a k-component recurrent neural network language model (karnnlm) that addresses these limitations by exploiting the long-distance modeling ability of recurrent neural networks and by making use of k different sub-models trained on different contextual domains. Our approach uses Latent Dirichlet Allocation to automatically discover k subsets of the training data, which are then used to train k component models. Our experiments first use a Dutch-language corpus to confirm that karnnlm automatically chooses the appropriate component. We then use a standard benchmark set (Wall Street Journal) to perform N-best list rescoring experiments. Results show that karnnlm outperforms the rnnlm baseline; the best performance is achieved when karnnlm is combined with the general model using a novel iterative alternating N-best rescoring strategy.
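The pipeline described in the abstract has two stages: partition the training data into k topical subsets with Latent Dirichlet Allocation, then train one component rnnlm per subset and select the best-matching component when rescoring. Below is a minimal sketch of the partitioning stage, assuming gensim as the LDA implementation; the paper's actual toolkit, preprocessing, and hyperparameters are not given on this page.

```python
# Sketch: LDA-based partitioning of a training corpus into k topical subsets,
# one per component language model. Assumes gensim; hyperparameters here
# (passes, hard one-best topic assignment) are illustrative, not the paper's.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

def partition_corpus(docs, k):
    """docs: list of tokenized documents (lists of word strings).
    Returns k lists of documents, one per discovered topic."""
    dictionary = Dictionary(docs)
    bow_corpus = [dictionary.doc2bow(d) for d in docs]
    lda = LdaModel(bow_corpus, num_topics=k, id2word=dictionary, passes=5)

    subsets = [[] for _ in range(k)]
    for doc, bow in zip(docs, bow_corpus):
        # Assign each document to its single most probable topic.
        topics = lda.get_document_topics(bow, minimum_probability=0.0)
        best_topic = max(topics, key=lambda t: t[1])[0]
        subsets[best_topic].append(doc)
    return subsets
```

At rescoring time, each N-best hypothesis would be scored by the component model whose topic best matches it and, per the abstract, combined with the general model; the iterative alternating N-best rescoring strategy itself is not detailed on this page and is beyond this sketch.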
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Shi, Y., Larson, M., Wiggers, P., Jonker, C.M. (2013). K-Component Adaptive Recurrent Neural Network Language Models. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol. 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3