Chinese document re-ranking based on automatically acquired term resource



In this paper, we address the problem of document re-ranking in information retrieval, which is usually conducted after initial retrieval to improve the rankings of relevant documents. To deal with this problem, we propose a method that automatically constructs a term resource specific to the document collection and then applies the resource to document re-ranking. The term resource includes a list of terms extracted from the documents, together with their weights and correlations computed after initial retrieval. Term weighting based on local and global distribution makes the re-ranking insensitive to different choices of pseudo relevance, while term correlation helps avoid bias toward any specific concept embedded in the queries. Experiments with NTCIR3 data show that the approach not only improves the performance of initial retrieval but also makes a significant contribution to standard query expansion.
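
As a rough illustration of the re-ranking idea, the sketch below applies generic maximal marginal relevance (MMR) to documents represented as sparse term-weight vectors. It is not the method of the paper: the function names `cosine` and `mmr_rerank`, the parameter `lam`, and the toy term weights are all assumptions made for this example, and the paper's own term extraction, weighting, and correlation formulas are not reproduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(query, docs, lam=0.7):
    """Greedy MMR over an initial result set (hypothetical helper).

    query: dict mapping term -> weight
    docs:  dict mapping doc id -> dict of term -> weight
    lam:   trade-off between relevance (1.0) and novelty (0.0)
    Returns the doc ids in re-ranked order.
    """
    remaining = dict(docs)
    ranked = []
    while remaining:
        def score(doc_id):
            relevance = cosine(remaining[doc_id], query)
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(remaining[doc_id], docs[s]) for s in ranked),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        # sorted() makes tie-breaking deterministic (lowest id wins).
        best = max(sorted(remaining), key=score)
        ranked.append(best)
        del remaining[best]
    return ranked


# Toy data (assumed weights): d2 duplicates d1, d3 covers a different term.
query = {"re-ranking": 1.0, "term": 1.0}
docs = {
    "d1": {"re-ranking": 1.0, "term": 1.0},
    "d2": {"re-ranking": 1.0, "term": 1.0},
    "d3": {"term": 1.0, "weighting": 1.0},
}
diverse_order = mmr_rerank(query, docs, lam=0.3)
relevance_order = mmr_rerank(query, docs, lam=1.0)
```

With a low `lam`, the duplicate `d2` is pushed below `d3` because it adds nothing beyond `d1`; with `lam=1.0` the ordering falls back to pure query similarity.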


Term extraction, Term weighting, Maximal marginal relevance, Document re-ranking, Information retrieval

Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  1. Department of Computer Science, Center for Study of Language Information, Wuhan University, Wuhan, China
  2. Department of Chinese Language and Literature, Wuhan University, Wuhan, China
  3. Center for Study of Language Information, Wuhan University, Wuhan, China
