Abstract
The 2-Poisson model for term frequencies is used to suggest ways of incorporating certain variables in probabilistic models for information retrieval. The variables concerned are within-document term frequency, document length, and within-query term frequency. Simple weighting functions are developed, and tested on the TREC test collection. Considerable performance improvements (over simple inverse collection frequency weighting) are demonstrated.
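As a rough illustration of the kind of weighting function the abstract describes, the Python sketch below combines the three variables with an inverse collection frequency weight. The abstract itself gives no formulas, so everything specific here is an assumption made for illustration: the saturating forms tf / (k1 * dl / avdl + tf) and qtf / (k3 + qtf), the 0.5-smoothed collection frequency weight, the function names, and the default constants k1 and k3.

```python
import math


def icf_weight(n_docs: int, doc_freq: int) -> float:
    """Inverse collection frequency weight for one term.

    Assumed form: log((N - n + 0.5) / (n + 0.5)), where N is the number
    of documents in the collection and n the number containing the term.
    """
    return math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def term_weight(
    tf: int,
    qtf: int,
    doc_len: float,
    avg_doc_len: float,
    n_docs: int,
    doc_freq: int,
    k1: float = 1.0,  # hypothetical default; controls tf saturation
    k3: float = 1.0,  # hypothetical default; controls qtf saturation
) -> float:
    """Weight of one query term in one document (illustrative only)."""
    if tf == 0 or qtf == 0:
        return 0.0
    # Within-document term frequency, normalised by relative document length:
    # the factor rises with tf but saturates below 1, and falls as the
    # document gets longer than average.
    tf_part = tf / (k1 * doc_len / avg_doc_len + tf)
    # Within-query term frequency, with the same saturating shape.
    qtf_part = qtf / (k3 + qtf)
    return tf_part * qtf_part * icf_weight(n_docs, doc_freq)


def score(query_tf: dict, doc_tf: dict, doc_len: float, avg_doc_len: float,
          n_docs: int, doc_freqs: dict) -> float:
    """Sum the term weights over the terms shared by query and document."""
    return sum(
        term_weight(doc_tf[t], qtf, doc_len, avg_doc_len, n_docs, doc_freqs[t])
        for t, qtf in query_tf.items()
        if t in doc_tf
    )
```

With tf = 1 in a document of average length and qtf = 1, both saturating factors equal 0.5 when k1 = k3 = 1, so the term weight is one quarter of the collection frequency weight. Further occurrences of the term raise the weight, but never above the collection frequency weight itself.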
Copyright information
© 1994 Springer-Verlag London Limited
About this paper
Cite this paper
Robertson, S.E., Walker, S. (1994). Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_24
DOI: https://doi.org/10.1007/978-1-4471-2099-5_24
Publisher Name: Springer, London
Print ISBN: 978-3-540-19889-5
Online ISBN: 978-1-4471-2099-5
eBook Packages: Springer Book Archive