Co-occurrence and Semantic Similarity Based Hybrid Approach for Improving Automatic Query Expansion in Information Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8956)

Abstract

Pseudo-relevance feedback (PRF) based query expansion approaches assume that the top-ranked retrieved documents are relevant. This assumption does not always hold: a PRF document may cover several topics, some of which are not relevant to the query terms even when the document itself is judged relevant. This paper addresses this limitation of PRF-based query expansion and proposes a hybrid method that improves its performance by combining corpus-based term co-occurrence information with semantic information about terms. First, we use a corpus-based term co-occurrence approach to select an optimal combination of query terms from a pool of terms obtained using PRF-based query expansion. Second, we use a semantic similarity approach to rank the query expansion terms obtained from the top feedback documents. Third, we combine the two approaches: the query expansion terms obtained in the first step are ranked on the basis of semantic similarity. Experiments were performed on the FIRE ad hoc and TREC-3 benchmark datasets for information retrieval. The results show significant improvement in precision, recall, and mean average precision (MAP). These experiments show that combining the two techniques judiciously yields the benefits of both. As this is the first attempt in this direction, there is considerable scope for improving these techniques.
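The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which co-occurrence or similarity measures are used, so the sketch assumes document-level Jaccard co-occurrence and a hand-written toy similarity table. All function names, parameters (`pool_size`, `keep`, `top_k`), and the toy corpus are hypothetical.

```python
from collections import Counter

def candidate_terms(feedback_docs, query_terms, pool_size=10):
    """Pool the most frequent non-query terms from the top-ranked (pseudo-relevant) docs."""
    counts = Counter(t for doc in feedback_docs for t in doc if t not in query_terms)
    return [term for term, _ in counts.most_common(pool_size)]

def jaccard(term_a, term_b, corpus):
    """Assumed co-occurrence measure: |docs with both terms| / |docs with either term|."""
    docs_a = {i for i, doc in enumerate(corpus) if term_a in doc}
    docs_b = {i for i, doc in enumerate(corpus) if term_b in doc}
    union = docs_a | docs_b
    return len(docs_a & docs_b) / len(union) if union else 0.0

def cooccurrence_score(term, query_terms, corpus):
    """Average co-occurrence of a candidate term with all query terms."""
    return sum(jaccard(term, q, corpus) for q in query_terms) / len(query_terms)

def hybrid_rank(feedback_docs, query_terms, corpus, semantic_sim, keep=4, top_k=2):
    """Filter the PRF pool by co-occurrence, then rank the survivors semantically."""
    pool = candidate_terms(feedback_docs, query_terms)
    # Step 1: keep the candidates that co-occur most strongly with the query.
    by_cooc = sorted(pool, key=lambda t: cooccurrence_score(t, query_terms, corpus),
                     reverse=True)[:keep]
    # Steps 2 and 3: rank the surviving candidates by semantic similarity to the query.
    def best_sim(term):
        return max(semantic_sim(term, q) for q in query_terms)
    return sorted(by_cooc, key=best_sim, reverse=True)[:top_k]

# Toy stand-in for a semantic similarity measure (values are made up for illustration).
TOY_SIM = {("engine", "car"): 0.8, ("fuel", "car"): 0.5, ("repair", "car"): 0.3,
           ("insurance", "car"): 0.2, ("price", "car"): 0.9}

def toy_semantic_sim(term, query_term):
    return TOY_SIM.get((term, query_term), 0.0)

# Toy corpus: each document is a list of tokens; the first three play the
# role of the top-ranked feedback documents for the query "car".
corpus = [
    ["car", "engine", "fuel"],
    ["car", "engine", "repair"],
    ["car", "insurance", "price"],
    ["banana", "fruit", "price"],
]
query = ["car"]
expansion = hybrid_rank(corpus[:3], query, corpus, toy_semantic_sim)
print(expansion)
```

In this toy run, "price" is given the highest similarity value on purpose, yet it is discarded in the co-occurrence step because it also appears in an off-topic document. That is the kind of noisy feedback term the hybrid filtering is meant to suppress.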


© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Singh, J., Sharan, A. (2015). Co-occurrence and Semantic Similarity Based Hybrid Approach for Improving Automatic Query Expansion in Information Retrieval. In: Natarajan, R., Barua, G., Patra, M.R. (eds) Distributed Computing and Internet Technology. ICDCIT 2015. Lecture Notes in Computer Science, vol 8956. Springer, Cham. https://doi.org/10.1007/978-3-319-14977-6_45

  • DOI: https://doi.org/10.1007/978-3-319-14977-6_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14976-9

  • Online ISBN: 978-3-319-14977-6

  • eBook Packages: Computer Science, Computer Science (R0)