Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval

  • Conference paper
SIGIR ’94

Abstract

The 2-Poisson model for term frequencies is used to suggest ways of incorporating certain variables in probabilistic models for information retrieval. The variables concerned are within-document term frequency, document length, and within-query term frequency. Simple weighting functions are developed, and tested on the TREC test collection. Considerable performance improvements (over simple inverse collection frequency weighting) are demonstrated.

Copyright information

© 1994 Springer-Verlag London Limited

About this paper

Cite this paper

Robertson, S.E., Walker, S. (1994). Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_24

  • DOI: https://doi.org/10.1007/978-1-4471-2099-5_24

  • Publisher Name: Springer, London

  • Print ISBN: 978-3-540-19889-5

  • Online ISBN: 978-1-4471-2099-5

  • eBook Packages: Springer Book Archive
