Language Resources and Evaluation, 43:385

Chinese document re-ranking based on automatically acquired term resource

  • Donghong Ji (corresponding author), Department of Computer Science, Center for Study of Language Information, Wuhan University
  • Shiju Zhao, Department of Chinese Language and Literature, Wuhan University
  • Guozheng Xiao, Center for Study of Language Information, Wuhan University

Abstract

In this paper, we address the problem of document re-ranking in information retrieval, which is usually performed after initial retrieval to improve the rankings of relevant documents. We propose a method that automatically constructs a term resource specific to the document collection and then applies that resource to document re-ranking. The term resource consists of a list of terms extracted from the documents, together with their weights and correlations, which are computed after initial retrieval. The term weighting, based on local and global distribution, makes the re-ranking insensitive to different choices of pseudo relevance, while the term correlation helps avoid bias toward any one specific concept embedded in the queries. Experiments with NTCIR3 data show that the approach not only improves the performance of initial retrieval but also contributes significantly to standard query expansion.
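As an illustration of the general idea of re-ranking while avoiding bias toward a single concept, the following is a minimal sketch of generic maximal marginal relevance (MMR) re-ranking. It is not the method of the paper: the function name `mmr_rerank`, the trade-off parameter `lam`, and the precomputed `relevance` and `similarity` inputs are all assumptions made for this example, and the paper's term weighting and term correlation are not modelled here.

```python
from typing import Dict, List


def mmr_rerank(relevance: Dict[str, float],
               similarity: Dict[str, Dict[str, float]],
               lam: float = 0.7) -> List[str]:
    """Greedy MMR re-ranking (illustrative sketch, hypothetical inputs).

    relevance:  document id -> score from the initial retrieval (assumed given)
    similarity: document id -> {other document id -> similarity} (assumed given
                for every pair of distinct documents)
    lam:        trade-off between relevance (1.0) and novelty (0.0)
    """
    remaining = set(relevance)
    selected: List[str] = []
    while remaining:
        def mmr_score(doc: str) -> float:
            # Redundancy is the highest similarity to any document already chosen.
            redundancy = max((similarity[doc][s] for s in selected), default=0.0)
            return lam * relevance[doc] - (1.0 - lam) * redundancy

        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam = 1.0` the sketch reproduces the initial ranking by relevance; lower values push near-duplicates of already selected documents down the list.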

Keywords

Term extraction; Term weighting; Maximal marginal relevance; Document re-ranking; Information retrieval