Information Retrieval, Volume 5, Issue 2–3, pp 239–256

Threshold Setting and Performance Optimization in Adaptive Filtering

  • Stephen Robertson
Article

Abstract

An experimental adaptive filtering system, built on the Okapi search engine, is described. In addition to the regular text retrieval functions, the system requires a complex set of procedures for setting score thresholds and adapting them following feedback. These procedures need to be closely related to the evaluation measures to be used. A mixture of quantitative methods relating a threshold to the number of documents expected to be retrieved in a time period, and qualitative methods relating to the probability of relevance, is defined. Experiments under the TREC-9 Adaptive Filtering Track rules are reported. The system is seen to perform reasonably well in comparison with other systems at TREC. Some of the variables that may affect performance are investigated.
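The two kinds of threshold method named in the abstract can be illustrated with a minimal sketch. It is not the Okapi implementation: the function names, the use of an empirical quantile of past scores, and the logistic calibration with `intercept` and `slope` parameters are all assumptions made for illustration, and how the system combines the two thresholds is not specified here.

```python
import math


def threshold_for_target_count(past_scores, docs_per_period, target_retrieved):
    """Quantitative sketch (hypothetical helper): choose a score threshold so
    that about `target_retrieved` of the next `docs_per_period` documents are
    expected to score at or above it, assuming future scores are distributed
    like `past_scores`."""
    if not past_scores or docs_per_period <= 0:
        raise ValueError("need past scores and a positive period size")
    n = len(past_scores)
    # Number of past documents that should sit at or above the threshold
    # (integer ceiling of n * target / period), clamped to the range 1..n.
    k = (target_retrieved * n + docs_per_period - 1) // docs_per_period
    k = max(1, min(n, k))
    ranked = sorted(past_scores, reverse=True)
    return ranked[k - 1]


def threshold_for_probability(min_probability, intercept, slope):
    """Qualitative sketch (hypothetical helper): assuming the probability of
    relevance follows a logistic calibration
        p = 1 / (1 + exp(-(intercept + slope * score)))
    with slope > 0, return the score at which p equals `min_probability`."""
    if not 0.0 < min_probability < 1.0 or slope <= 0:
        raise ValueError("need 0 < min_probability < 1 and slope > 0")
    log_odds = math.log(min_probability / (1.0 - min_probability))
    return (log_odds - intercept) / slope


# Example: aim for about 20 retrieved documents out of the next 100,
# using ten previously observed scores.
scores = [3, 7, 1, 9, 4, 10, 2, 8, 6, 5]
count_threshold = threshold_for_target_count(scores, 100, 20)
prob_threshold = threshold_for_probability(0.5, intercept=-2.0, slope=0.5)
```

In a sketch like this, the count-based threshold depends only on the observed score distribution, while the probability-based threshold depends on relevance feedback through the calibration parameters, which would need re-estimating as judgments arrive.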

Keywords: filtering, thresholds, optimization, adaptation


Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Stephen Robertson (1)
  1. Microsoft Research, Cambridge, UK
