Adaptive Language Modeling with a Set of Domain Dependent Models

Shi, Yangyang; Wiggers, Pascal; Jonker, Catholijn M.

doi:10.1007/978-3-642-32790-2_57

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7499)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper proposes an adaptive language modeling method. Instead of using one static model for all situations, it applies a set of specific models to adapt dynamically to the discourse. We present the general structure of the model and the training procedure. In our experiments, we instantiated the method with a set of domain-dependent models trained on different socio-situational settings (ALMOSD). We compare it with previous topic-dependent and socio-situational-setting-dependent adaptive language models and with a smoothed n-gram model in terms of perplexity and word prediction accuracy. Our experiments show that ALMOSD achieves perplexity reductions of up to almost 12% compared with the other models.
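
The abstract does not say how the domain-specific models are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes add-alpha smoothed unigram models, a mixture whose weights are updated by Bayes' rule after every word, and invented toy data; all function and variable names are hypothetical. It compares the perplexity of the adaptive mixture with that of a single static model trained on the pooled data.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def static_perplexity(model, tokens):
    """Perplexity of one fixed model on a token sequence."""
    log_prob = sum(math.log(model[w]) for w in tokens)
    return math.exp(-log_prob / len(tokens))


def adaptive_perplexity(models, tokens):
    """Perplexity of a mixture whose weights adapt to the word history.

    Weights start uniform and are re-estimated after each word with
    Bayes' rule, so models that predicted the history well gain influence.
    """
    weights = [1.0 / len(models)] * len(models)
    log_prob = 0.0
    for w in tokens:
        # Mixture prediction for the current word.
        p = sum(wt * m[w] for wt, m in zip(weights, models))
        log_prob += math.log(p)
        # Posterior update of the model weights given the observed word.
        weights = [wt * m[w] / p for wt, m in zip(weights, models)]
    return math.exp(-log_prob / len(tokens))


# Toy "domains", invented for illustration only.
news = "the minister said the budget will rise".split()
chat = "yeah well i think you know it is fine".split()
vocab = sorted(set(news) | set(chat))

domain_models = [train_unigram(news, vocab), train_unigram(chat, vocab)]
background = train_unigram(news + chat, vocab)

test_tokens = "you know i think it is fine".split()
ppl_adaptive = adaptive_perplexity(domain_models, test_tokens)
ppl_static = static_perplexity(background, test_tokens)
print(f"static: {ppl_static:.2f}  adaptive: {ppl_adaptive:.2f}")
```

On this toy input the adaptive mixture shifts its weight toward the conversational model within a few words, which is why its perplexity comes out lower than that of the pooled static model. The numbers say nothing about the results reported in the paper.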

Author information

Authors and Affiliations

Interactive Intelligence Group, Delft University of Technology, Mekelweg 4, 2628CD, The Netherlands
Yangyang Shi, Pascal Wiggers & Catholijn M. Jonker

Editor information

Editors and Affiliations

Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Department of Information Technologies, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Aleš Horák, Ivan Kopeček & Karel Pala

Cite this paper

Shi, Y., Wiggers, P., Jonker, C.M. (2012). Adaptive Language Modeling with a Set of Domain Dependent Models. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_57

DOI: https://doi.org/10.1007/978-3-642-32790-2_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32789-6
Online ISBN: 978-3-642-32790-2
eBook Packages: Computer Science, Computer Science (R0)
