A Generative Language Modeling Approach for Ranking Entities

  • Wouter Weerkamp
  • Krisztian Balog
  • Edgar Meij
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5631)


We describe our participation in the INEX 2008 Entity Ranking track. We develop a generative language modeling approach for the entity ranking and list completion tasks. Our framework comprises the following components: (i) entity language models, (ii) query language models, (iii) an entity prior, (iv) the probability of an entity given a category, and (v) the probability of an entity given another entity. We explore various ways of estimating these components and report on our results. We find that improving the estimation of these components has a very positive effect on performance, yet there is room for further improvement.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Wouter Weerkamp (1)
  • Krisztian Balog (1)
  • Edgar Meij (1)
  1. ISLA, University of Amsterdam, Amsterdam, The Netherlands
