Perplexity of n-Gram and Dependency Language Models

  • Martin Popel
  • David Mareček
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6231)


Language models (LMs) are essential components of many applications, such as speech recognition and machine translation. LMs factorize the probability of a string of words into a product of P(w_i | h_i), where h_i is the context (history) of word w_i. Most LMs use the previous words as the context. The paper presents two alternative approaches: post-ngram LMs (which use the following words as context) and dependency LMs (which exploit the dependency structure of a sentence and can use, e.g., the governing word as context). Dependency LMs could be useful whenever the topology of a dependency tree is available but its lexical labels are unknown, e.g. in tree-to-tree machine translation. Compared with a baseline interpolated trigram LM, both approaches achieve significantly lower perplexity for all seven tested languages (Arabic, Catalan, Czech, English, Hungarian, Italian, Turkish).
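The three kinds of context named in the abstract, and the perplexity used to compare them, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the padding symbols, the two-word window, and the `parents` array encoding of the dependency tree are all assumptions made for illustration, and the paper's actual models may define their contexts differently.

```python
import math


def ngram_context(words, i):
    """Classic trigram history: the two words preceding position i."""
    padded = ["<s>", "<s>"] + words  # hypothetical start-of-sentence padding
    return tuple(padded[i:i + 2])


def post_ngram_context(words, i):
    """Post-ngram history: the two words following position i."""
    padded = words + ["</s>", "</s>"]  # hypothetical end-of-sentence padding
    return tuple(padded[i + 1:i + 3])


def dependency_context(words, i, parents):
    """Dependency history: the governing word of position i.

    parents[i] is the index of the governor of word i, or -1 for the root
    (an assumed encoding of the tree topology).
    """
    p = parents[i]
    return ("<root>",) if p < 0 else (words[p],)


def perplexity(word_probs):
    """Perplexity of a text given the model probabilities P(w_i | h_i).

    Computed as exp of the negative mean log-probability per word.
    """
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


words = ["the", "dog", "barks"]
parents = [1, 2, -1]  # "the" <- "dog" <- "barks" (root)

for i, w in enumerate(words):
    print(w, ngram_context(words, i), post_ngram_context(words, i),
          dependency_context(words, i, parents))
```

Whatever the choice of h_i, the model is scored the same way: the probabilities P(w_i | h_i) assigned to held-out text are passed to `perplexity`, and a lower value means the context was more informative.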


  • Speech Recognition
  • Directed Acyclic Graph
  • Machine Translation
  • Word Form
  • Preceding Word
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Martin Popel (1)
  • David Mareček (1)
  1. Institute of Formal and Applied Linguistics, Charles University in Prague
