Abstract
In the context of biomedical information retrieval (IR), this paper explores the relationship between a document’s global context and a query’s local context in order to overcome the term mismatch problem between user queries and the documents in a collection. Most solutions to this problem focus on expanding the query by discovering its context, either global or local. A global strategy uses all documents in the collection to examine word occurrences and relationships in the corpus as a whole, and then uses this information to expand the original query. A local strategy examines the top-ranked documents retrieved for a given query to determine terms for query expansion. We propose combining the document’s global context and the query’s local context to increase the term overlap between the user query and the documents in the collection via document expansion (DE) and query expansion (QE). The DE technique uses a statistical, IR-based method to extract the most appropriate concepts (global context) from each document. The QE technique uses a blind feedback approach based on the top-ranked documents (local context) obtained in the first retrieval stage. A comparative experiment on the TREC 2004 Genomics collection shows that combining the document’s global context with the query’s local context yields a significant improvement over the baseline: MAP rises from 0.4097 to 0.4532, a relative improvement of +10.62%. In terms of MAP, the combined method also outperforms the official runs submitted to the TREC 2004 Genomics track and is comparable to the best run (0.4075).
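The local strategy described in the abstract can be illustrated with a minimal sketch of blind (pseudo-relevance) feedback. It is not the authors' implementation: the function name `expand_query`, the parameters `top_k` and `n_terms`, the raw term-frequency scoring, and the toy token lists are all assumptions made for illustration, and the paper may weight candidate terms quite differently.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_k=3, n_terms=2):
    """Append the most frequent new terms found in the top_k ranked documents.

    Hypothetical helper: raw frequency stands in for whatever term-weighting
    model a real system would use.
    """
    feedback = Counter()
    for doc in ranked_docs[:top_k]:
        # Count only terms that are not already part of the query.
        feedback.update(t for t in doc if t not in query_terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    candidates = sorted(feedback.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in candidates[:n_terms]]


# Made-up token lists standing in for a first-stage ranking (best first).
ranked = [
    ["brca1", "gene", "mutation", "breast", "cancer"],
    ["brca1", "tumor", "suppressor", "breast", "cancer"],
    ["brca1", "protein", "cancer", "mutation"],
    ["unrelated", "document"],
]

expanded = expand_query(["brca1"], ranked)
print(expanded)  # ['brca1', 'cancer', 'breast']
```

The expanded query would then be submitted for a second retrieval pass. Because the feedback documents are assumed relevant without user judgment, the quality of the first-stage ranking directly affects which terms are added.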
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dinh, D., Tamine, L. (2011). Combining Global and Local Semantic Contexts for Improving Biomedical Information Retrieval. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_38
DOI: https://doi.org/10.1007/978-3-642-20161-5_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20160-8
Online ISBN: 978-3-642-20161-5
eBook Packages: Computer Science (R0)