Language Resources and Evaluation, 43:385

Chinese document re-ranking based on automatically acquired term resource

Authors

  • D. Ji
    • Department of Computer Science, Center for Study of Language Information, Wuhan University
  • Shiju Zhao
    • Department of Chinese Language and Literature, Wuhan University
  • Guozheng Xiao
    • Center for Study of Language Information, Wuhan University

DOI: 10.1007/s10579-009-9106-z

Cite this article as:
Ji, D., Zhao, S. & Xiao, G. Lang Resources & Evaluation (2009) 43: 385. doi:10.1007/s10579-009-9106-z

Abstract

In this paper, we address the problem of document re-ranking in information retrieval, which is usually conducted after initial retrieval to improve the rankings of relevant documents. We propose a method that automatically constructs a term resource specific to the document collection and then applies the resource to document re-ranking. The term resource comprises a list of terms extracted from the documents, together with their weights and correlations computed after initial retrieval. Term weighting based on local and global distribution makes the re-ranking insensitive to different choices of pseudo relevance, while term correlation helps avoid bias toward any specific concept embedded in the queries. Experiments with NTCIR3 data show that the approach not only improves the performance of initial retrieval but also makes a significant contribution to standard query expansion.
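
The abstract does not give the re-ranking formulas, so the following is only a minimal sketch of generic maximal marginal relevance (MMR) re-ranking, the technique named in the keywords, and not the authors' method. It assumes documents and the query are already represented as sparse term-weight vectors (plain dicts) and uses cosine similarity; the names `cosine`, `mmr_rerank`, and the trade-off parameter `lam` are hypothetical and chosen for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_rerank(query_vec, doc_vecs, lam=0.7):
    """Greedy MMR re-ranking (illustrative assumption, not the paper's formula).

    query_vec: dict mapping term -> weight
    doc_vecs:  dict mapping document id -> term-weight dict
    lam:       1.0 ranks by relevance only; lower values penalise
               similarity to documents that are already selected.
    Returns the document ids in re-ranked order.
    """
    remaining = dict(doc_vecs)
    selected = []
    while remaining:
        def score(doc_id):
            relevance = cosine(query_vec, remaining[doc_id])
            redundancy = max(
                (cosine(remaining[doc_id], doc_vecs[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        # Iterate ids in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With `lam` close to 1 the order follows query similarity alone; with smaller values a near-duplicate of an already selected document is pushed down, which is one generic way to keep a ranking from concentrating on a single concept in the query.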

Keywords

Term extraction, Term weighting, Maximal marginal relevance, Document re-ranking, Information retrieval

Copyright information

© Springer Science+Business Media B.V. 2009