Combining Topic Specific Language Models

  • Yangyang Shi
  • Pascal Wiggers
  • Catholijn M. Jonker
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6836)


In this paper we investigate whether a combination of topic-specific language models can outperform a general-purpose language model, using a trigram model as our baseline. We show that in the ideal case, in which it is known beforehand which model to use, the specific models perform considerably better than the baseline. We test two methods for combining specific models and show that these combinations outperform the general-purpose model, particularly when the data is diverse in terms of topics and vocabulary. Inspired by these findings, we propose a new model that combines a decision tree with a set of dynamic Bayesian networks. The new model uses context information to dynamically select an appropriate specific model.
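The comparison described in the abstract can be illustrated with a small sketch. The abstract does not say which two combination methods are tested, so the example assumes linear interpolation with fixed weights as one plausible stand-in. It also uses toy unigram distributions instead of trigram models, and every name, probability, and weight in it is hypothetical.

```python
import math


def perplexity(prob_fn, words):
    """Perplexity of a word sequence under a model given as word -> probability."""
    log_sum = sum(math.log(prob_fn(w)) for w in words)
    return math.exp(-log_sum / len(words))


def interpolate(models, weights):
    """Linear interpolation: P(w) = sum_k weights[k] * P_k(w)."""
    def prob(w):
        return sum(lam * m(w) for lam, m in zip(weights, models))
    return prob


# Hypothetical toy unigram models over a four-word vocabulary.
# A real system would use trigram models estimated from topic-labelled data.
VOCAB = ["goal", "match", "vote", "law"]
sports = {"goal": 0.45, "match": 0.45, "vote": 0.05, "law": 0.05}
politics = {"goal": 0.05, "match": 0.05, "vote": 0.45, "law": 0.45}
general = {w: 0.25 for w in VOCAB}  # topic-agnostic baseline

sports_text = ["goal", "match", "goal", "match"]

# Baseline: one general-purpose model for all text.
ppl_general = perplexity(general.get, sports_text)

# Ideal case: the correct topic model is known beforehand.
ppl_oracle = perplexity(sports.get, sports_text)

# Combination: fixed interpolation weights (assumed, not estimated).
mixture = interpolate([sports.get, politics.get], [0.8, 0.2])
ppl_mixture = perplexity(mixture, sports_text)

print(ppl_general, ppl_oracle, ppl_mixture)
```

On this toy data the oracle model gives the lowest perplexity, the interpolated mixture comes next, and the general model is worst, which mirrors the ordering the abstract reports. The weights are fixed by hand here; selecting or weighting models dynamically from context is what the proposed decision tree is meant to do.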


Keywords (machine-generated, not supplied by the authors): Bayesian Network; Language Model; Dynamic Bayesian Network; Bayesian Network Model; Dedicated Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yangyang Shi (1)
  • Pascal Wiggers (1)
  • Catholijn M. Jonker (1)
  1. Man-Machine Interaction Group, Delft University of Technology, Netherlands
