Skip to main content

Relation Based Term Weighting Regularization

  • Conference paper
Advances in Information Retrieval (ECIR 2012)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7224))

Included in the following conference series:

Abstract

Traditional retrieval models compute term weights based on only the information related to individual terms such as TF and IDF. However, query terms are related. Intuitively, these relations could provide useful information about the importance of a term in the context of other query terms. For example, query “perl tutorial” specifies that a user look for information relevant to both perl and tutorial. Thus, a document containing both terms should have higher relevance score than the ones with only one of them. However, if the IDF value of “tutorial” is much smaller than “perl”, existing retrieval models may assign the document lower score than those containing multiple occurrences of “perl”. It is clear that the importance of a term should be dependent on not only collection statistics but also the relations with other query terms. In this work, we study how to utilize semantic relations among query terms to regularize term weighting. Experiment results over TREC collections show that the proposed strategy is effective to improve the retrieval performance.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Croft, W., Turtle, H., Lewis, D.: The use of phrases and structured queries in information retrieval. In: Proceedings of SIGIR 1991 (1991)

    Google Scholar 

  2. Fang, H., Tao, T., Zhai, C.: A formal study of information retrieval heuristics. In: Proceedings of SIGIR 2004 (2004)

    Google Scholar 

  3. Fang, H., Zhai, C.: An exploration of axiomatic approaches to information retrieval. In: Proceedings of SIGIR 2005 (2005)

    Google Scholar 

  4. Fang, H., Zhai, C.: Semantic term matching in axiomatic approaches to information retrieval. In: Proceedings of SIGIR 2006 (2006)

    Google Scholar 

  5. Fuhr, N.: Probabilistic models in information retrieval. The Computer Journal 35(3), 243–255 (1992)

    Article  MATH  Google Scholar 

  6. Grieff, W.R.: A theory of term weighting based on exploratory data analysis. In: Proceedings of SIGIR 1998 (1998)

    Google Scholar 

  7. Hartiwig, F., Dearing, B.E.: Exploratory Data Analysis. Sage Publications (1979)

    Google Scholar 

  8. Liu, S., Liu, F., Yu, C., Meng, W.: An effective approach to document retrieval via utilizing wordnet and recognizing phrases. In: Proceedings of SIGIR 2004 (2004)

    Google Scholar 

  9. Metzler, D., Croft, W.B.: A markov random field model for term dependencies. In: Proceedings of SIGIR 2005 (2005)

    Google Scholar 

  10. Mitra, M., Buckley, C., Singhal, A., Cardie, C.: An analysis of statistical and syntactic phrases. In: Proceedings of RIAO (1997)

    Google Scholar 

  11. Ponte, J., Croft, W.B.: A language modeling approach to information retrieval. In: Proceedings of the ACM SIGIR 1998, pp. 275–281 (1998)

    Google Scholar 

  12. Robertson, S.E., Walker, S., Jones, S., Hancock-Beaulieu, M.M., Gatford, M.: Okapi at TREC-3. In: Proceedings of TREC-3 (1995)

    Google Scholar 

  13. Salton, G.: Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer. Addison-Wesley (1989)

    Google Scholar 

  14. Singhal, A., Buckley, C., Mitra, M.: Pivoted document length normalization. In: Proceedings of SIGIR 1996 (1996)

    Google Scholar 

  15. Tao, T., Zhai, C.: An exploration of proximity measures in information retrieval. In: Proceedings of SIGIR 2007 (2007)

    Google Scholar 

  16. van Rijbergen, C.J.: A theoretical basis for theuse of co-occurrence data in information retrieval. Journal of Documentation, 106–119 (1977)

    Google Scholar 

  17. van Rijsbergen, C.J.: Information Retrieval. Butterworths (1979)

    Google Scholar 

  18. Voorhees, E.M.: Overview of the trec 2005 robust retrieval track. In: Proceedings of the Fourteenth Text REtrieval Conference, TREC 2005 (2006)

    Google Scholar 

  19. Zhai, C., Lafferty, J.: A study of smoothing methods for language models applied to ad hoc information retrieval. In: Proceedings of SIGIR 2001 (2001)

    Google Scholar 

  20. Zheng, W., Fang, H.: Query Aspect Based Term Weighting Regularization in Information Retrieval. In: Gurrin, C., He, Y., Kazai, G., Kruschwitz, U., Little, S., Roelleke, T., Rüger, S., van Rijsbergen, K. (eds.) ECIR 2010. LNCS, vol. 5993, pp. 344–356. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, H., Fang, H. (2012). Relation Based Term Weighting Regularization. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-28997-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28996-5

  • Online ISBN: 978-3-642-28997-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics