Chinese document re-ranking based on automatically acquired term resource

Ji, Donghong; Zhao, Shiju; Xiao, Guozheng

doi:10.1007/s10579-009-9106-z

Published: 13 November 2009

Language Resources and Evaluation, Volume 43, pages 385–406 (2009)

Donghong Ji¹,
Shiju Zhao² &
Guozheng Xiao³

Abstract

In this paper, we address the problem of document re-ranking in information retrieval, a step usually performed after initial retrieval to improve the rankings of relevant documents. We propose a method that automatically constructs a term resource specific to the document collection and then applies that resource to document re-ranking. The term resource comprises a list of terms extracted from the documents, together with their weights and correlations, which are computed after initial retrieval. Term weighting based on local and global distribution makes the re-ranking insensitive to how the pseudo-relevant documents are chosen, while term correlation helps avoid bias toward any single concept embedded in a query. Experiments with NTCIR3 data show that the approach not only improves the performance of initial retrieval but also contributes significantly to standard query expansion.
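
The abstract does not give the paper's weighting or correlation formulas, so the following is only a minimal sketch of the general idea of re-ranking with locally and globally weighted terms. Everything in it is an assumption made for illustration: the function names `term_weights` and `rerank`, the parameters `top_k` and `alpha`, and the weighting formula itself (local document frequency among the top-ranked documents, contrasted with document frequency over the whole retrieved list). Documents are assumed to be already segmented into term lists, and the term-correlation component mentioned in the abstract is omitted.

```python
import math
from collections import Counter


def term_weights(docs, top_k):
    """Hypothetical weighting: contrast a term's document frequency in the
    top_k initially retrieved documents (local) with its document frequency
    over the whole retrieved list (global)."""
    n = len(docs)
    local_df = Counter(t for d in docs[:top_k] for t in set(d))
    global_df = Counter(t for d in docs for t in set(d))
    weights = {}
    for term, ldf in local_df.items():
        local_ratio = ldf / top_k
        global_ratio = global_df[term] / n
        # Terms that are frequent locally but rarer globally score higher.
        weights[term] = local_ratio * math.log(1.0 + local_ratio / global_ratio)
    return weights


def rerank(docs, initial_scores, top_k=2, alpha=0.5):
    """Return document indices ordered by a linear mix of the initial
    retrieval score and the summed weights of the terms each document contains.
    docs must be given in initial-retrieval order."""
    weights = term_weights(docs, top_k)
    combined = []
    for idx, (doc, score) in enumerate(zip(docs, initial_scores)):
        term_score = sum(weights.get(t, 0.0) for t in set(doc))
        combined.append((alpha * score + (1 - alpha) * term_score, idx))
    # Highest combined score first; ties keep the initial order.
    combined.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in combined]
```

In this sketch, a document that ranked low initially can move up if it contains many terms that are concentrated in the top-ranked documents; `alpha` controls how much of the initial score is retained.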

Notes

http://research.nii.ac.jp/ntcir-ws3/work-en.html.

Author information

Authors and Affiliations

Department of Computer Science, Center for Study of Language Information, Wuhan University, 430072, Wuhan, China
Donghong Ji
Department of Chinese Language and Literature, Wuhan University, 430072, Wuhan, China
Shiju Zhao
Center for Study of Language Information, Wuhan University, 430072, Wuhan, China
Guozheng Xiao

Corresponding author

Correspondence to Donghong Ji.

Additional information

The first author is supported by NSF (60773011) and NSF (90820005); the first two authors are supported by the Wuhan University 985 Project (985yk004).

About this article

Cite this article

Ji, D., Zhao, S. & Xiao, G. Chinese document re-ranking based on automatically acquired term resource. Lang Resources & Evaluation 43, 385–406 (2009). https://doi.org/10.1007/s10579-009-9106-z

Issue Date: December 2009
DOI: https://doi.org/10.1007/s10579-009-9106-z
