Utilizing Passage-Based Language Models for Document Retrieval

  • Michael Bendersky
  • Oren Kurland
Conference paper

DOI: 10.1007/978-3-540-78646-7_17

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4956)
Cite this paper as:
Bendersky M., Kurland O. (2008) Utilizing Passage-Based Language Models for Document Retrieval. In: Macdonald C., Ounis I., Plachouras V., Ruthven I., White R.W. (eds) Advances in Information Retrieval. ECIR 2008. Lecture Notes in Computer Science, vol 4956. Springer, Berlin, Heidelberg

Abstract

We show that several previously proposed passage-based document ranking principles, along with some new ones, can be derived from the same probabilistic model. We use language models to instantiate specific algorithms, and propose a passage language model that integrates information from the ambient document to an extent controlled by the estimated document homogeneity. Several document-homogeneity measures that we propose yield passage language models that are more effective than the standard passage model for basic document retrieval and for constructing and utilizing passage-based relevance models; the latter outperform a document-based relevance model. We also show that the homogeneity measures are effective means for integrating document-query and passage-query similarity information for document retrieval.

Keywords

passage-based document retrieval, document homogeneity, passage language model, passage-based relevance model

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Michael Bendersky (1)
  • Oren Kurland (2)
  1. Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts, Amherst
  2. Faculty of Industrial Eng. & Mgmt., Technion, Israel