Chinese document re-ranking based on automatically acquired term resource
In this paper, we address the problem of document re-ranking in information retrieval, which is usually performed after initial retrieval to improve the ranking of relevant documents. We propose a method that automatically constructs a term resource specific to the document collection and then applies that resource to document re-ranking. The term resource includes a list of terms extracted from the documents, together with their weights and correlations, which are computed after initial retrieval. Term weighting based on local and global distribution makes the re-ranking insensitive to the choice of pseudo-relevant documents, while term correlation helps avoid bias toward any specific concept embedded in the query. Experiments with NTCIR3 data show that the approach not only improves the performance of initial retrieval but also contributes significantly to standard query expansion.
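The abstract does not give the weighting formula, so the following is only a rough sketch of the general idea of combining local and global term distributions, not the authors' method. It assumes documents are already segmented into tokens, uses term frequency in the top-k retrieved documents as the local signal and a smoothed inverse document frequency as the global signal, and leaves out term correlation entirely. The function names, the parameters `k` and `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def term_weights(top_docs, all_docs):
    """Assumed weighting: local frequency in the top documents times a smoothed global IDF."""
    n = len(all_docs)
    df = Counter()
    for tokens in all_docs:
        df.update(set(tokens))   # global: number of documents containing each term
    tf = Counter()
    for tokens in top_docs:
        tf.update(tokens)        # local: frequency within the pseudo-relevant set
    return {t: f * math.log((n + 1) / (df[t] + 1)) for t, f in tf.items()}


def rerank(initial, docs, k=2, alpha=0.5):
    """initial: list of (doc_id, score), best first; docs: doc_id -> list of tokens."""
    top = [docs[d] for d, _ in initial[:k]]
    weights = term_weights(top, list(docs.values()))
    rescored = [
        (d, s + alpha * sum(weights.get(t, 0.0) for t in set(docs[d])))
        for d, s in initial
    ]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return [d for d, _ in rescored]


# Toy data (hypothetical): d4 shares weighted terms with the top documents.
docs = {
    "d1": ["term", "weighting", "retrieval"],
    "d2": ["term", "extraction", "retrieval"],
    "d3": ["football", "retrieval"],
    "d4": ["term", "weighting", "extraction"],
}
initial = [("d1", 3.0), ("d2", 2.5), ("d3", 2.4), ("d4", 2.0)]
print(rerank(initial, docs))  # ['d1', 'd2', 'd4', 'd3']
```

In this toy run, d4 moves above d3 because it shares several highly weighted terms with the top-ranked documents, while d3 shares only one.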
Language Resources and Evaluation
Volume 43, Issue 4, pp 385-406
- Springer Netherlands
- Term extraction
- Term weighting
- Maximal marginal relevance
- Document re-ranking
- Information retrieval
- Author Affiliations
- 1. Department of Computer Science, Center for Study of Language Information, Wuhan University, 430072, Wuhan, China
- 2. Department of Chinese Language and Literature, Wuhan University, 430072, Wuhan, China
- 3. Center for Study of Language Information, Wuhan University, 430072, Wuhan, China